Dipartimento di Ingegneria

New Perspectives for Dynamic Traffic Demand Estimation and Prediction Adopting Big Data

Ernesto Cipriani, Marialisa Nigro

Department of Engineering, Roma Tre University

Junior Consulting

SIDT Conference

Outline

Dynamic OD estimation and prediction: literature review and main issues;

Dynamic OD estimation: the bi-level formulation:

Main algorithmic enhancements;

Exploiting Big Data for Dynamic OD estimation:

Floating Car Data (FCD) and the case study of Rome (Italy):

Spatial and temporal features of FCD;

Path information from FCD;

Experimental phase

Dynamic OD prediction:

A proposal based on advanced Kalman Filter (KF) and experimental results

Conclusions and further developments

Dynamic OD estimation and prediction: literature review and main issues

Non-linearity of the relation between OD flows and traffic measurements (Zhou & Mahmassani, 2006; Balakrishna et al., 2007; Flötteröd and Bierlaire, 2009; Frederix et al., 2011; …)

High indeterminateness (dimension of the unknowns) (Djukic et al., 2012; Cascetta et al., 2013; Cantelmo et al., 2014, 2015; …)

Selection of the traffic measurements used for the estimation (Ashok and Ben-Akiva, 2000; Tavana, 2001; Dixon and Rilett, 2002; Barceló et al., 2013; …)

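Dynamic OD estimation: bi-level formulation

Objective function

\[
(d^{*}_{1},\dots,d^{*}_{n_h}) = \arg\min_{x_{1},\dots,x_{n_h}} \Big[
f_{1}(x_{1},\dots,x_{n_h};\ \hat{d}_{1},\dots,\hat{d}_{n_h})
+ f_{2}(y^{l}_{1},\dots,y^{l}_{n_h};\ \hat{y}^{l}_{1},\dots,\hat{y}^{l}_{n_h})
+ f_{3}(z^{t}_{1},\dots,z^{t}_{n_h};\ \hat{z}^{t}_{1},\dots,\hat{z}^{t}_{n_h})
+ f_{4}(w^{p}_{1},\dots,w^{p}_{n_h};\ \hat{w}^{p}_{1},\dots,\hat{w}^{p}_{n_h})
\Big]
\]

where each $f_{i}(X, \hat{X})$ is an error index: $f_{1}$ on the seed matrix, $f_{2}$ on link information, $f_{3}$ on node information, $f_{4}$ on route information.

Constraints

\[
\begin{aligned}
y^{l}_{1},\dots,y^{l}_{n_h} &= F(x_{1},\dots,x_{n_h}) \\
z^{t}_{1},\dots,z^{t}_{n_h} &= F(x_{1},\dots,x_{n_h}) \\
w^{p}_{1},\dots,w^{p}_{n_h} &= F(x_{1},\dots,x_{n_h})
\end{aligned}
\qquad F = \text{DTA}
\]

\[
x^{BL}_{1},\dots,x^{BL}_{n_h} \;\le\; x_{1},\dots,x_{n_h} \;\le\; x^{BU}_{1},\dots,x^{BU}_{n_h}
\]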
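A minimal sketch of how the upper-level objective could be evaluated, given that it sums error indices on the seed matrix and on the simulated measurements returned by the DTA. The RMSE error index, the clipping used for the bounds, and the names `upper_level_objective` and `toy_dta` are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def rmse(sim, obs):
    """Error index, assumed here to be an RMSE (the slides do not fix its form)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def upper_level_objective(x, seed, observed, dta, lower, upper):
    """Sum of error indices for a candidate OD vector x.

    `dta` is a hypothetical callable standing in for the lower level (F = DTA):
    it maps x to a dict of simulated measurements keyed like `observed`.
    """
    x = np.clip(np.asarray(x, float), lower, upper)  # enforce the bound constraints
    simulated = dta(x)                               # one assignment per evaluation
    value = rmse(x, seed)                            # f1: deviation from the seed matrix
    for key, obs in observed.items():                # f2..f4: links, nodes, routes
        value += rmse(simulated[key], obs)
    return value

def toy_dta(x):
    """Toy linear stand-in for an assignment: 2 link flows from 3 OD flows."""
    A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    return {"links": A @ x}
```

In a real application `dta` would wrap a dynamic traffic assignment run, which is why the number of objective evaluations dominates the computing time.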

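First-order SPSA AD-PI (Cipriani et al., 2010-2011)

The approximated gradient (Spall, 1998), with $x^{i} = (x^{i}_{1},\dots,x^{i}_{n_h})$; each evaluation of z implies an assignment:

\[
\hat{g}(x^{i}) = \frac{z(x^{i} + c^{i}\Delta^{i}) - z(x^{i})}{c^{i}}
\left[ (\Delta^{i}_{1})^{-1}, (\Delta^{i}_{2})^{-1}, \dots, (\Delta^{i}_{n_h})^{-1} \right]^{T}
\]

The average approximated gradient (inside iteration i):

\[
\bar{g}(x^{i}) = \frac{1}{grad\_rep} \sum_{j=1}^{grad\_rep} \hat{g}_{j}(x^{i})
\]

The average approximated gradient between iterations:

\[
\bar{g}^{\,i} = \frac{i}{i+1}\,\bar{g}^{\,i-1} + \frac{1}{i+1}\,\hat{g}^{\,i}
\]

To update the solution: a third-degree polynomial interpolation of the objective function (O.F.) over the points $d_{k}$, $d_{k1}$, $d_{k2}$, $d_{k3}$.

[Figure: O.F. as a function of d, with the interpolation points $d_{k}$, $d_{k1}$, $d_{k2}$, $d_{k3}$]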
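The gradient approximation of this slide can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes Bernoulli ±1 perturbations (the usual SPSA choice), a one-sided (asymmetric) difference, and a simple running average between iterations; `spsa_ad_gradient` and `smooth` are hypothetical names.

```python
import numpy as np

def spsa_ad_gradient(z, x, c, grad_rep, rng):
    """Average of grad_rep one-sided (asymmetric design) SPSA gradient estimates."""
    x = np.asarray(x, float)
    z0 = z(x)                                   # one evaluation at the current point
    g_sum = np.zeros_like(x)
    for _ in range(grad_rep):
        delta = rng.choice([-1.0, 1.0], size=x.shape)    # simultaneous +/-1 perturbation
        g_sum += (z(x + c * delta) - z0) / (c * delta)   # one extra evaluation per replication
    return g_sum / grad_rep

def smooth(g_bar_prev, g_hat, i):
    """Running average between iterations: i/(i+1) * previous + 1/(i+1) * current."""
    return (i * g_bar_prev + g_hat) / (i + 1)
```

With the asymmetric design each replication costs one objective evaluation (one assignment) instead of two, since the value at the current point is reused.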

Sensitivity analysis results (Cantelmo et al., 2014)

The step ck depends on the overall noisiness of the objective function: it is suggested to analyse the adopted objective function in the neighborhood of the seed matrix in order to properly define the parameter value.

A high value of the grad_rep parameter makes the optimization efficient in terms of iterations, but when total computational time is considered, lower values can also be adopted: values of 10÷15% can be a good compromise between reliability of the solution and computational time;

Current gradient information is the most effective option

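Second-order SPSA AD-PI (Cantelmo et al., 2014)

Update the solution: the approximated gradient is weighted by the inverse of the Hessian

\[
x^{i+1} = x^{i} - a^{i}\,\big(\bar{\bar{H}}^{i}\big)^{-1}\,\hat{g}(x^{i}),
\qquad x^{i} = (x^{i}_{1},\dots,x^{i}_{n_h})
\]

Mapping to cope with possible non-positive-definiteness

\[
\bar{\bar{H}}^{i} = f(\bar{H}^{i})
\]

Average Hessian during iterations

\[
\bar{H}^{i} = \frac{i}{i+1}\,\bar{H}^{i-1} + \frac{1}{i+1}\,\hat{H}^{i}
\]

\[
\hat{H}^{i} = \frac{1}{2}\left\{
\frac{\delta g^{i}}{2c^{i}} \left[ (\Delta^{i}_{1})^{-1},\dots,(\Delta^{i}_{n_h})^{-1} \right]
+ \left( \frac{\delta g^{i}}{2c^{i}} \left[ (\Delta^{i}_{1})^{-1},\dots,(\Delta^{i}_{n_h})^{-1} \right] \right)^{T}
\right\}
\]

\[
\delta g^{i} = \hat{g}(x^{i} + c^{i}\Delta^{i}) - \hat{g}(x^{i} - c^{i}\Delta^{i})
\]

(Spall, 2000; Spall, 2003)

Second-order SPSA AD-PI: polynomial interpolation to update the solution along the gradient direction weighted by the inverse of the Hessian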
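A hedged sketch of the second-order ingredients, following the general structure in Spall (2000): a simultaneous-perturbation Hessian estimate, a mapping to a positive-definite matrix, and a gradient step weighted by the inverse Hessian. The eigenvalue-flooring mapping is only one possible choice (the slides do not say which mapping is used), and all function names are hypothetical.

```python
import numpy as np

def spsa_hessian(grad, x, c, rng):
    """One simultaneous-perturbation Hessian estimate from two gradient evaluations."""
    x = np.asarray(x, float)
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    dg = grad(x + c * delta) - grad(x - c * delta)     # gradient difference
    half = np.outer(dg / (2.0 * c), 1.0 / delta)
    return 0.5 * (half + half.T)                       # symmetrised estimate

def make_positive_definite(h, eps=1e-6):
    """Assumed mapping: floor the eigenvalues of the symmetric matrix h at eps."""
    vals, vecs = np.linalg.eigh(h)
    return (vecs * np.maximum(vals, eps)) @ vecs.T

def newton_like_step(x, g, h_bar, a):
    """x_next = x - a * H^{-1} g, solved without forming the inverse explicitly."""
    return x - a * np.linalg.solve(make_positive_definite(h_bar), g)
```

A single Hessian estimate is very noisy, which is why the estimates are averaged over iterations before being used in the update.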

Adaptive SPSA AD-PI (Cantelmo et al., 2014)

Taking inspiration from the structure of the second-order SPSA AD-PI, the gradient is weighted according to the relevance of each OD pair in explaining deviations from observations;

The current direction of the average approximated gradient is modified depending on the influence of the OD pairs on the observed traffic phenomena.

We define the weights of the gradient $w^{i}$ as a vector ($n_h \times 1$), with $w^{i} \in [0,1]$, so that:

\[
\hat{g}(x^{i}) = \frac{z(x^{i} + c^{i}\Delta^{i}) - z(x^{i})}{c^{i}}
\left[ (\Delta^{i}_{1})^{-1}, (\Delta^{i}_{2})^{-1}, \dots, (\Delta^{i}_{n_h})^{-1} \right]^{T}
\]

\[
\hat{g}_{w}(x^{i}) = w^{i} \circ \hat{g}(x^{i})
\]

The methods to compute wi

Method 1 (M1): based on the knowledge of the path proportion values of each OD on each sensor;

Method 2 (M2): based on the knowledge of the simulated flows on the sensors (SimFlow), their differences with respect to counts (RealData) and the influence of each OD on each sensor. The influence of each OD on each detector is weighted with respect to the flow value on the detector;

Method 3 (M3): as method 2, but the influence of each OD on each sensor is weighted with respect to the single OD “magnitude”;

Method 4 (M4): as method 2, but the influence of each OD on each detector is weighted with respect to the global ODs “magnitude”.

Results of the four methods

10 sensors and 90 OD pairs;

starting demand (from 8:00 am to 9:00 am) divided into four time slices of 15 min each;

Measurements of flow, speed, density and occupancy are available from sensors every 5 min.

(COST ACTION TU0903, MULTITUDE)

Weighting SPSA methods

W-SPSA (Lu, Xu, Antoniou, Ben-Akiva, 2015): it takes into account spatial and temporal correlations between parameters and measurements to compute a weighting matrix that reduces the noise in the gradient approximation;

Cluster-based SPSA (Tympakianaki, Koutsopoulos, Jenelius, 2015): it groups the variables into a small number of homogeneous clusters to reduce the bias. The gradient is estimated by simultaneously perturbing the sub-vector of variables belonging to the same cluster, one cluster at a time.

FCD for the metropolitan area of Rome

Octo Telematics fleet, May 2010

Map matching on the TeleAtlas 2010 graph (VI release: 300,000 links; 160,000 nodes)

Fleet: 103,000 floating vehicles equipped with GPS on-board units

104 million records (positions and speeds)

Traffic: 9 million trips in one month

Polling: every 2 km, or every 30 s (on freeways)

Analysed sub-network: Eur network

130,000 monitored trips, corresponding to 10,400,000 transmitted signals, for the weekdays of May 2010

[Figure: map of the Eur network]

Temporal trend and day-to-day variations

[Figure: FCD trips (veh/15 min, 0 to 700) by hour of the day (0 to 24) for Monday, Tuesday, Wednesday, Thursday, Friday and their average]

Morning peak: 7:45-8:45

Correlation of spatial demand features of FCD vs static model demand

Highlights (1)

Spatial information from FCD is strongly dependent on several factors (penetration rate, representativeness of the fleet for the whole population, …):

spatial information from FCD cannot be directly adopted in demand estimation at distribution level;

information on the shares of generated trips can still be derived;

Temporal distribution of FCD can be used to profile initial OD matrices, to investigate the day-to-day variation of the demand, and to assume similar behaviour for different classes of users

Main ODs and Path choices from FCD

FCD information is collected at network level, while the probe is immersed in the traffic stream (e.g. path travel times and path choices)

Main ODs and Path choices from FCD

[Figure: example OD pair (origin, destination) with its observed paths]

Path choices of the three most used paths: 55%, 19%, 4%

Main ODs and Path choices from FCD

Users moving from the same origin and in the same time interval: multiple routes with different observed travel times.

Issues on:

Realism of the behavioral assumptions (Nigro et al., 2015, Transp Res Proc 10);

Modeling the actual users’ route choice mechanism (Cipriani et al., 2015, IEEE Conf on ITS).

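Solving dynamic demand estimation including path information

Synthetic experiments on the Eur network

Objective function:

\[
(d^{*}_{1},\dots,d^{*}_{n_h}) = \arg\min_{x_{1},\dots,x_{n_h}} \Big[
f_{1}(x_{1},\dots,x_{n_h};\ \hat{d}_{1},\dots,\hat{d}_{n_h})
+ f_{2}(y^{l}_{1},\dots,y^{l}_{n_h};\ \hat{y}^{l}_{1},\dots,\hat{y}^{l}_{n_h})
+ f_{4}(w^{p}_{1},\dots,w^{p}_{n_h};\ \hat{w}^{p}_{1},\dots,\hat{w}^{p}_{n_h})
\Big]
\]

Solving procedure: SPSA AD-PI

Seed OD demand

Link flows on 32 count sections

12 ODs: travel times and route choice probabilities collected for each 15-minute departure interval

Data sets (Set 1 to Set 5) combine four data sources: seed OD matrix, link flows, OD travel times, route choice probabilities. Sets 1, 2 and 3 each use two of the sources, Set 4 uses three, and Set 5 uses all four.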

Improvements on ODs reproduction

Improvements measured in terms of Euclidean distances;

Route choice probabilities alone are not sufficient (only 6% improvement).

ODs coverage

Traffic count sections intercept 67% of the demand

Average OD travel times and route choice probabilities relate to only 12 ODs, covering 10% of the demand

Highlights (2)

OD travel times and route choice probabilities are effective in improving the estimation on the covered ODs;

Combining network data with link data also provides information on unmonitored ODs.

Moving to on-line demand estimation and prediction

On-line applications (Okutani and Stephanedes, 1984): sequential update of ODs and predictions for future time slices, taking into account the real-time variability of traffic conditions;

Kalman Filter (KF, Kalman, 1960) algorithm (Ashok and Ben-Akiva, 1993; Chang and Wu, 1994; Van der Zijpp and Hamerslag, 1994; Ashok, 1996):

“State-space” model:

Transition equation, which follows the evolution of the state variables (OD flows) over time;

Measurement equation, mapping the state variables to the traffic measurements;

Analysis equation, correcting the estimate of the state variables (derived by the transition equation) with the results of the measurement equation and a Kalman gain.
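The three equations of the state-space model can be sketched with a textbook linear Kalman filter step. This is a generic illustration rather than the formulation of any of the cited papers; the matrix names (`Phi` for the transition, `H` for the measurement mapping, `Q` and `R` for the error covariances) are assumptions.

```python
import numpy as np

def kf_step(x, P, y, Phi, Q, H, R):
    """One linear Kalman filter step on a state vector x (e.g. OD flows)."""
    # Transition equation: propagate the state and its covariance in time
    x_b = Phi @ x
    P_b = Phi @ P @ Phi.T + Q
    # Measurement equation: map the predicted state to the measurements
    y_b = H @ x_b
    # Kalman gain
    K = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)
    # Analysis equation: correct the prediction with the measurement residual
    x_a = x_b + K @ (y - y_b)
    P_a = (np.eye(len(x)) - K @ H) @ P_b
    return x_a, P_a
```

For example, with a single OD flow of 100, prior variance 100, measurement variance 100 and an observed value of 120, the gain is 0.5 and the analysis is 110 with variance 50.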

Kalman Filter for on-line ODs estimation and prediction

Main drawbacks of KF:

linearity hypothesis between OD flows and measurements:

Several extensions (Antoniou et al., 2007; Marzano et al., 2015):

Extended Kalman filter (EKF);

Unscented Kalman filter (UKF);

Limiting EKF (LimEKF).

intensive linear algebra computations:

LSQR algorithm (Bierlaire and Crittin, 2004)

Local Ensemble Transform Kalman Filter (LETKF)

Local Ensemble Transform Kalman Filter (LETKF) (Hunt et al., 2007):

Proposed in meteorological sciences as a “refinement” of the Ensemble approach (EnKF);

Deals with nonlinear problems, large-scale models and data sets;

The two strengths of LETKF:

Solution of the problem in the space of the ensemble (“transform”): explicit knowledge of the nonlinear map between OD flows and traffic measurements is no longer required

“Local” implementation:

Exploitation of the concept of spatial localization:

the modelled system has a “correlation distance”;

the analysis should ignore ensemble correlations over larger distances.

Main principles of LETKF

Given an ensemble Ea of the state variables for time interval n-1:

each element of Ea is propagated to n (Transition Equation);

then the Kalman filter cost function is solved in the space S.

Terms of the cost function:

Average of the background state

Covariance of the background ensemble

Traffic measurements

Covariance of traffic measurements

Non-linear function mapping OD flows into traffic measurements (Measurement Equation)
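For reference, the terms listed on this slide match the usual ensemble Kalman filter cost function as written in Hunt et al. (2007). The block below is that standard form, not a transcription of the slide's equation, and the symbols are assumptions chosen for illustration: $\bar{x}^{b}$ is the average of the background state, $P^{b}$ the covariance of the background ensemble of $k$ members, $y^{o}$ the traffic measurements, $R$ their covariance and $H(\cdot)$ the non-linear measurement equation.

```latex
\[
J(x) = (x - \bar{x}^{b})^{T} (P^{b})^{-1} (x - \bar{x}^{b})
     + \big[ y^{o} - H(x) \big]^{T} R^{-1} \big[ y^{o} - H(x) \big],
\qquad
P^{b} = \frac{1}{k-1}\, X^{b} (X^{b})^{T}
\]
```

Here the columns of $X^{b}$ are the deviations of the background ensemble members from their mean $\bar{x}^{b}$.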

Kalman Gain in LETKF

The analysis equation corrects the estimate of the state variables (derived by the transition equations) with the results of the measurement equations and a Kalman gain;

The analysis equations in LETKF are based on a change of coordinate system (from S to S̃). Quantities involved:

covariance matrix for the analysis state in the k-dimensional space

deviation with respect to the observed data

Kalman gain

average of the analysis state in the ensemble space

average of the analysis state in the nOD-dimensional space
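A compact sketch of the ensemble-space analysis step, following the equations of Hunt et al. (2007) without localization and without covariance inflation. It is meant only to show how the quantities named above relate to each other; `etkf_analysis` and its argument names are hypothetical. Note that the measurement equation enters only through the simulated measurements of each ensemble member (`Y_b`), so no explicit linearization is needed.

```python
import numpy as np

def etkf_analysis(E_b, Y_b, y_obs, R):
    """Global ensemble-transform analysis step.

    E_b : (n_od, k) background ensemble of OD flows
    Y_b : (m, k) simulated measurements, one column per ensemble member
    y_obs : (m,) observed traffic measurements, R : (m, m) their covariance
    """
    k = E_b.shape[1]
    x_mean = E_b.mean(axis=1, keepdims=True)
    y_mean = Y_b.mean(axis=1, keepdims=True)
    X = E_b - x_mean                      # background deviations, state space
    Y = Y_b - y_mean                      # background deviations, measurement space
    C = Y.T @ np.linalg.inv(R)
    # Analysis covariance in the k-dimensional ensemble space
    P_tilde = np.linalg.inv((k - 1) * np.eye(k) + C @ Y)
    # Mean analysis weights: gain applied to the deviation from the observed data
    w_mean = P_tilde @ C @ (np.asarray(y_obs, float).reshape(-1, 1) - y_mean)
    # Symmetric square root gives the analysis deviations in ensemble space
    vals, vecs = np.linalg.eigh((k - 1) * P_tilde)
    W = (vecs * np.sqrt(vals)) @ vecs.T
    # Back to the n_od-dimensional space: analysis ensemble, one column per member
    return x_mean + X @ (w_mean + W)
```

For a linear measurement mapping the mean of the returned ensemble coincides with the classical Kalman filter analysis computed with the sample covariance of the background ensemble. A local version would repeat this step for each OD pair (or group of pairs) using only nearby measurements.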

LETKF vs EnKF (stationary conditions)

[Figure: network with the monitoring sections collecting traffic counts]

Reduction of MAE [%] with respect to Ref. (Tot; Ref. = 158 veh/h):

EnKF: -38%

LETKF: -55%

Ref: starting error in reproducing traffic counts given the historical demand

Travel demand for three time slices, with stationary conditions within each time slice;

Traffic counts collected in real time (LETKF can also adopt other measurements, while this is not possible for EnKF);

LETKF outperforms common nonlinear KF (Nigro et al., 2016, TRB 95th Compendium of Papers; Carrese et al., 2017, Transportation Research Part C, in press)

LETKF: sensitivity analysis in dynamic conditions

A first assessment of the LETKF applied to on-line OD flows estimation and prediction (35 minutes of simulation; 5 minutes estimation and prediction);

Laboratory experiments:

Different starting matrices (seed matrices);

Different number k of elements in the ensemble;

Different levels of error ε between the elements in the ensemble (from a uniform distribution between [-ε, +ε])
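One possible way to build such an ensemble is sketched below. It assumes that ε is a relative error applied independently to each OD flow of the seed matrix, which is an interpretation suggested by the percentage values of ε and not stated explicitly; `make_ensemble` is a hypothetical name.

```python
import numpy as np

def make_ensemble(seed_od, k, eps, rng):
    """k perturbed copies of the seed OD vector.

    Each flow is scaled by (1 + u), with u drawn uniformly in [-eps, +eps].
    Returns an array of shape (n_od, k), one ensemble member per column.
    """
    seed_od = np.asarray(seed_od, float)
    u = rng.uniform(-eps, eps, size=(seed_od.size, k))
    return seed_od[:, None] * (1.0 + u)
```

For instance, `make_ensemble(seed, k=20, eps=0.3, rng=np.random.default_rng(0))` returns 20 members whose flows stay within 30% of the seed values.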

Results (1)

Reduction (on all time slices) of MAE with respect to the reference starting values.

If a reliable seed matrix is adopted:

satisfactory results if the ensemble is in a neighborhood of the starting matrix (ε=10%), regardless of the number of elements k;

if the ensemble is generated in a wider space (ε=30% or ε=50%), the number of elements in the ensemble must be increased: the MAE reduction stabilizes with increasing k;

If a bad seed matrix is adopted:

good results (MAE reduction higher than 20%) can be obtained on traffic counts reproduction, but:

indeterminateness of the problem: good reproduction of the measurements, poor reproduction of the OD flows;

need to start the on-line process with a reliable off-line demand estimate.

Results (2)

Further developments

Issues related to the concept of equilibrium in the dynamic traffic assignment phase:

realism of the behavioural assumptions underlying the dynamic assignment models;

equilibrium concept and convergence of assignment procedures;

route choice mechanism based on instantaneous or on experienced travel times; several classes of vehicles.

Further investigations of LETKF:

exploit several types of traffic measurements, rather than traffic counts only;

localization strategy for large-scale road networks.
