testing and benchmarking of microscopic traffic flow simulation models

DLR-Institute of Transport Research Testing and benchmarking of microscopic traffic flow simulation models Elmar Brockfeld , Peter Wagner [email protected], [email protected] Institute of Transport Research German Aerospace Center (DLR) Rutherfordstrasse 2 12489 Berlin, Germany 10th WCTR, Istanbul, 06.07.2004

Page 1: Testing and benchmarking of microscopic traffic flow simulation models

DLR-Institute of Transport Research

Testing and benchmarking of microscopic traffic flow simulation models

Elmar Brockfeld, Peter Wagner

[email protected], [email protected]

Institute of Transport Research

German Aerospace Center (DLR)

Rutherfordstrasse 2

12489 Berlin, Germany

10th WCTR, Istanbul, 06.07.2004

Page 2: Testing and benchmarking of microscopic traffic flow simulation models

“State of the art”

The situation in microscopic traffic flow modelling today:

» A very large number of models exist that describe traffic flow.

» If they are tested at all, each is tested separately with its own special data sets.

» So far, the microscopic models cannot be compared quantitatively.

Page 3: Testing and benchmarking of microscopic traffic flow simulation models

Motivation

Idea

» Calibrate and validate microscopic traffic flow models with the same data sets (quantitative comparability; is a benchmark possible?).

» Calibration and validation in a microscopic way by analysing any time-series produced by single cars.

In the following

» Calibration and validation of ten car-following models with data recorded on a test track in Hokkaido, Japan.

» Comparison with results of other approaches.

Page 4: Testing and benchmarking of microscopic traffic flow simulation models

Test track Hokkaido, Japan

[Figure: layout of the test track, with the labels "1200 m" and "curve 300 m"]

» 10 cars equipped with DGPS driving on a 3 km test track

» Delivery of positions in intervals of 0.1 second
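Positions sampled every 0.1 s are enough to derive the quantities a car-following model needs. The following is a minimal sketch, not the processing used in the talk: it assumes the DGPS positions have already been projected onto a one-dimensional along-track coordinate in metres, and the function names, the vehicle length and the sample values are made up for illustration.

```python
DT = 0.1  # sampling interval in seconds, as stated on the slide


def speeds(positions, dt=DT):
    """Forward-difference speeds (m/s) from along-track positions (m)."""
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


def gaps(leader_pos, follower_pos, leader_length=0.0):
    """Net gap (m): leader position minus leader length minus follower position."""
    return [xl - leader_length - xf for xl, xf in zip(leader_pos, follower_pos)]


# Made-up sample: leader at a constant 10 m/s, follower at 9 m/s, starting 20 m behind.
leader = [20.0 + 10.0 * DT * k for k in range(5)]
follower = [9.0 * DT * k for k in range(5)]

print(speeds(leader))
print(gaps(leader, follower, leader_length=4.5))  # 4.5 m is an assumed vehicle length
```

Differencing raw positions amplifies measurement noise, so real data would probably need smoothing before the speeds are used.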

Page 5: Testing and benchmarking of microscopic traffic flow simulation models

Test track Hokkaido, Japan: Impressions

Page 6: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan – The data

» Data recorded by Nakatsuji et al. in 2001

» Data from 4 out of 8 experiments are used for the analyses:

| Experiment | Duration [min] | Full loops |
|---|---|---|
| “11” | 26 | 6 |
| “12” | 25 | 7 |
| “13” | 18 | 6 |
| “21” | 14 | 4 |

» Exchange of drivers between the cars after each experiment

» Leading car performed certain “driving patterns” on the straight sections:
   » driving with constant speeds of 20, 40, 60 and 80 km/h
   » driving in waves varying from about 30 to 70 km/h

Page 7: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan – Speed development

Speed development of the leading car in all four experiments

Page 8: Testing and benchmarking of microscopic traffic flow simulation models

The models

The following existing models have been analysed:

» 4 parameters, CA0.1 (“cellular automaton model”)

» 4 p, OVM (“Optimal Velocity Model” by Bando)

» 6 p, GIPPSLIKE (basic model by P.G. Gipps)

» 6 p, AERDE (used in the software INTEGRATION)

» 6 p, IDM (“Intelligent Driver Model” by D. Helbing)

» 7 p, IDMM (“Intelligent Driver Model with Memory”)

» 7 p, SK_STAR (based on the model by S. Krauss)

» 7 p, NEWELL (CA-variant of the model with more variable acceleration and deceleration by G. Newell)

» 13 p, FRITZSCHE (used in the British software PARAMICS; similar to what is used in the German software VISSIM by PTV)

» 15 p, MITSIM (used in the software MitSim)

[Figure: a follower and its leader, annotated with their speeds and the gap g between them]
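To illustrate what such a car-following rule looks like, here is a minimal sketch of the IDM in its commonly published form, driven by a given leader trajectory. The talk does not state which parameterisation or integration scheme was used, so the default parameter values, the Euler update, the vehicle length and the function names below are all assumptions.

```python
import math


def idm_acceleration(v, v_lead, gap, v_max=30.0, T=1.5, s0=2.0, a=1.0, b=1.5, delta=4.0):
    """IDM acceleration (m/s^2) for follower speed v, leader speed v_lead and net gap."""
    dv = v - v_lead  # approach rate towards the leader
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a * b))  # desired gap
    return a * (1.0 - (v / v_max) ** delta - (s_star / gap) ** 2)


def simulate_follower(leader_pos, leader_speed, x0, v0, dt=0.1, length=4.5, **idm_params):
    """Explicit Euler simulation of one follower; returns the simulated net gaps."""
    x, v = x0, v0
    gaps = []
    for x_lead, v_lead in zip(leader_pos, leader_speed):
        gap = x_lead - length - x
        gaps.append(gap)
        # Guard against division by zero if the follower gets (unrealistically) close.
        acc = idm_acceleration(v, v_lead, max(gap, 0.1), **idm_params)
        v = max(v + acc * dt, 0.0)  # no driving backwards
        x += v * dt
    return gaps


# Made-up leader: constant 15 m/s for 60 s, starting 30 m ahead of the follower.
steps = 600
leader_speed = [15.0] * steps
leader_pos = [30.0 + 15.0 * 0.1 * k for k in range(steps)]
sim_gaps = simulate_follower(leader_pos, leader_speed, x0=0.0, v0=15.0)
print(round(sim_gaps[0], 2), round(sim_gaps[-1], 2))
```

The other models in the list would plug into the same loop by swapping out the acceleration function.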

Page 9: Testing and benchmarking of microscopic traffic flow simulation models

The models' parameters

Parameters used by all models:

V_max Maximum velocity

l Vehicle length

a acceleration

Most models:

b deceleration

tau reaction time

Models with different driving regimes:

MITSIM and FRITZSCHE

Java Applet for testing the models

Page 10: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan - Simulation setup

» For each simulation run one vehicle pair is under consideration

» Movement of leading car: as recorded in the data

» Movement of following car: following the rules of a traffic model

» Error measurement:

$$ e = \frac{\frac{1}{T}\sum_{t=1}^{T}\left| g^{(sim)}(t) - g^{(obs)}(t) \right|}{\frac{1}{T}\sum_{t=1}^{T} g^{(obs)}(t)} $$

   » e: percentage error
   » T: time series of experiment
   » g^(obs): observed gaps/headways
   » g^(sim): simulated gaps/headways

» Objective of calibration: Minimize the error e!

[Figure: leader and follower separated by the gap, with the speed labels V_data and V_sim]
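The error measure on this slide relates the mean deviation between simulated and observed gaps to the mean observed gap. A minimal sketch follows, assuming the deviation is taken as an absolute difference; the function name and the sample numbers are illustrative only.

```python
def percentage_error(g_sim, g_obs):
    """Mean absolute gap deviation relative to the mean observed gap, in percent."""
    if len(g_sim) != len(g_obs) or not g_obs:
        raise ValueError("need two non-empty series of equal length")
    T = len(g_obs)
    mean_abs_dev = sum(abs(s - o) for s, o in zip(g_sim, g_obs)) / T
    mean_obs = sum(g_obs) / T
    return 100.0 * mean_abs_dev / mean_obs


# Made-up gaps in metres: deviations of 1 m and 2 m against a mean observed gap of 15 m.
print(percentage_error([11.0, 18.0], [10.0, 20.0]))
```

Normalising by the mean observed gap makes the errors of different vehicle pairs and experiments comparable on one percentage scale.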

Page 11: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan – gaps time series

Page 12: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan – Calibration and Validation

Calibration (“Adjust parameters of a model to real data”)

» Find the optimal parameter sets for each vehicle pair in each experiment (9*4 = 36 calibrations for each model):

   » Minimize the error e as defined before
   » Minimization with a gradient-free (direct search) optimisation algorithm (“downhill simplex” or “Nelder-Mead”)
   » To avoid local minima: about 100 simulations with random initializations
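The calibration procedure can be sketched as a multi-start direct search. The code below is a simplified pure-Python downhill simplex (reflection, expansion, inside contraction, shrink), not the implementation used by the authors; the function names, the step size, the stopping rule and the toy objective standing in for the gap error e are all assumptions.

```python
import random


def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-10):
    """Simplified downhill-simplex minimiser; f is re-evaluated freely for brevity."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid (t=0) and the worst vertex (t=1).
            return [c + t * (w - c) for c, w in zip(centroid, worst)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex halfway towards the best one
                simplex = [best] + [
                    [b + 0.5 * (pj - b) for b, pj in zip(best, p)] for p in simplex[1:]
                ]
    return min(simplex, key=f)


def calibrate(error, bounds, n_starts=100, seed=0):
    """Restart the search from random points inside bounds and keep the best result."""
    rng = random.Random(seed)
    best_x, best_e = None, float("inf")
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x = nelder_mead(error, x0)
        e = error(x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


def toy_error(p):
    """Stand-in objective with its minimum at (1, -2); a real run would simulate gaps."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


params, err = calibrate(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)], n_starts=20)
print(params, err)
```

In a real calibration, `error` would run the follower simulation with the candidate parameters and return the percentage gap error. The search itself is unconstrained, so physically meaningless parameter values would have to be penalised inside that function.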

Validation (“Apply calibrated model to other real data sets”)

» For each model all optimal parameter results are transferred to data sets of three other driver pairs (in total 108 validations for each model)
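The validation bookkeeping can be sketched as follows. The talk does not say how the three other driver pairs are chosen, so the rule used here (the next three pairs of the same experiment, wrapping around) is purely hypothetical, as are the function names.

```python
def validation_plan(experiments, n_pairs=9, n_targets=3):
    """List (calibrated data set, validation data set) combinations.

    Hypothetical rule: each calibrated pair is validated on the next
    n_targets pairs of the same experiment, wrapping around after the last.
    """
    plan = []
    for exp in experiments:
        for pair in range(1, n_pairs + 1):
            for k in range(1, n_targets + 1):
                target = (pair - 1 + k) % n_pairs + 1
                plan.append(((exp, pair), (exp, target)))
    return plan


def validate(plan, params, error_fn):
    """Apply each calibrated parameter set, unchanged, to its validation data sets."""
    return {(src, tgt): error_fn(params[src], tgt) for src, tgt in plan}


plan = validation_plan(["11", "12", "13", "21"])
print(len(plan))  # 4 experiments * 9 pairs * 3 targets = 108
```

The essential point is that no parameter is re-tuned during validation: `error_fn` only simulates the target data set with the parameters found for the source data set.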

Page 13: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan - Calibration Results (1/2)

Results of the first experiment “11”

» Errors between 9 and 19 %, mostly between 13 and 17 %

» No model appears to be the best

[Figure: calibration error [%] (axis 0 to 25) per trajectory 11_1 to 11_9 (driver pairs D1-D2, D2-D3, ..., D9-D10) for the models OVM, CA0.1, Newell, FRITZSCHE, GIPPS_LIKE, IDM, IDMM, Aerde, MitSim and SK_STAR]

Page 14: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan - Calibration Results (2/2)

» Calibration error: mostly 12 - 18 % (range 9 - 23 %)

» All models share the same problems with the same data sets -> Which is the best model?

» Average error of best model: 15.14 %

» Average error of worst model: 16.20 %

» Models with more parameters do not produce better results

» Average difference of the models per data set is 2.5 percentage points

Calibration of 10 models with 36 data sets (4 experiments “11”, “12”, “13” and “21”, each with 9 driver pairs):

[Figure: four panels, one per experiment, showing the calibration error [%] (axis 0 to 25) per trajectory (11_1 to 11_9, 12_1 to 12_9, 13_1 to 13_9, 21_1 to 21_9, each labelled with its driver pair) for the models OVM, CA0.1, Newell, FRITZSCHE, GIPPS_LIKE, IDM, IDMM, Aerde, MitSim and SK_STAR]

Diversity in driver behaviour (6 %) > diversity of models (2.5 %)

Page 15: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan - Validation Results (1/2)

» Validation error: mostly 17 - 27 %

» “Overfitting 1”: Special driver behaviour may produce high errors up to 40 or 50 % for all models

» “Overfitting 2”: For some driver pairs some models produce singular high errors of more than 100 %

Validation of each calibration result with three other driver pairs (->108 validations for each model)

Sample plots for 2*9 validations

Page 16: Testing and benchmarking of microscopic traffic flow simulation models

Hokkaido, Japan - Validation Results (2/2)

» Calibration errors: 12 - 18 % (peak 15 - 17)

» Validation errors: 17 - 27 % (peak 21 - 23)

» Additional validation error to calibration: 6 percentage points

» Calibration: MEDIAN best/worst model: 14.84 % / 16.04 %

» Validation: MEDIAN best/worst model: 21.60 % / 22.58 %

» Additional validation error to calibration: 5.66 pp / 7.23 pp

Distribution functions of the errors
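Medians and distribution functions of this kind can be computed from the per-data-set errors as sketched below. The error values are invented placeholders, not results from the talk, and the function name is an assumption.

```python
from statistics import median


def ecdf(errors):
    """Empirical distribution function as (error, fraction of values <= error) pairs."""
    xs = sorted(errors)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Hypothetical percentage errors of one model, NOT taken from the talk.
calibration = [12.0, 14.5, 15.0, 16.5, 18.0]
validation = [17.0, 21.0, 22.0, 23.5, 27.0]

print(median(calibration), median(validation))
print(ecdf(calibration))
```

Comparing the two curves shows how far the whole error distribution shifts when calibrated parameters are transferred to other data sets, which a single average would hide.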

Page 17: Testing and benchmarking of microscopic traffic flow simulation models

Comparison with other calibration approaches

| Approach | Recorded on; device | Data volume | Calibration errors (headways unless stated otherwise) |
|---|---|---|---|
| Hokkaido (Brockfeld et al) | Single lane test track; DGPS | 40 traces, 15-30 minutes | Main range: 13-19 %; 10 models: 15-16 % (Validation: 21.5-22.5 %) |
| Hokkaido (Ranjitkar, Japan) | Single lane test track; DGPS | 47 traces, 1-2 minutes | Main range: 9-17 %; 6 models: 12-13 %; 15 %; 18 %; 21 % |
| ICC FOT (Schober, USA) | Multilane highway; radar | 300 traces, one to a few minutes | 3 models: 18-20 %; speed class < 35 mph: 9-10 %; speed class 35-55 mph: 13-16 %; speed class > 55 mph: 16-17 % |
| San Pablo Dam (Brockfeld et al) | Single lane rural road; humans | Passing times of 2300 vehicles at 8 positions on two days | Travel times; 10 models: 15-17 % (6), 23 % (Validation: 17-27 %) |
| I-80, Berkeley (Wagner et al) | Multilane highway; loop detectors | 24 hours highway data | Speed, flow; 2 models: 18 % |

Page 18: Testing and benchmarking of microscopic traffic flow simulation models

Conclusions

Essential results:

» Minimum reachable levels for calibration:
   » Short traces or special situations: 9 to 11 %
   » Simulating more than a few minutes: 15 to 20 %

» Minimum reachable levels for validation:
   » > 20 %; about 3 to 7 percentage points higher than in the calibration case.

» The analysed models do not differ much.

» The diversity in driver behaviour is bigger than the diversity of the models.

» Models with more parameters do not necessarily produce better results than simple ones.

» Preliminary advice: Take the simplest model or the one you know best!

Page 19: Testing and benchmarking of microscopic traffic flow simulation models

Perspectives and future research

» Testing more models and more data sets

» Test some other calibration techniques and measurements (speeds, accelerations,…)

» Sensitivity analyses of the parameters (robustness of the models)

» What are the problems of the models? Analysis of parameter results. Development of better models.

» Finally, development of a benchmark for microscopic traffic flow models.

Page 20: Testing and benchmarking of microscopic traffic flow simulation models

THANK YOU FOR YOUR ATTENTION!