Extraction of bicycle commuter trips from day-long GPS trajectories
Cycling Data Challenge 2013, Leuven, Belgium
Workshop presentation
Gerald Richter (1), Christian Rudloff (1), Anita Graser (1)
(1) Austrian Institute of Technology, Austria; Mobility Dept.; Dynamic Transportation Systems
G. Richter | AIT | mobility | DTS May 14, 2013 1 / 19
The Austrian Institute of Technology
AIT: who we are and what we do
Austria’s largest non-university research institute
AIT: 5 departments focussing on applied research topics
• Energy
• Mobility
  business units:
  • Transportation Infrastructure Technologies
  • Dynamic Transportation Systems
  • Electric Drive Technologies
  • Light Metals Technologies Ranshofen
• Safety & Security
• Health & Environment
• Foresight & Policy Development
Dynamic Transportation Systems
“develop efficient, safe and cost-effective multimodal transportation solutions for transportation networks, hubs and services”
• Airports / Train Stations
• Shopping Centres / Events
• Multi-Modal Transportation Networks
• Transport Logistics
• Crowd Dynamics
• Traffic Flow Modelling
• Dynamic Vehicle Routing
• Optimisation
• Simulation / Prediction
• Data Analysis
• Data Collection
GPS measurements
and some peculiarities
A proper GPS measurement requires at least 4 satellites to be visible to the device.
Measurement is a stochastic process by nature.
Positional error is Gaussian distributed under clear-view conditions.
Additional effects arise from an obstructed view (signal shadowing, reflection by obstacles):
• outliers: sudden change in signal reception conditions
• drift: longer phases of signal impairment, with the receiver-internal error correction following a misguided path
Figure labels: snap-back, true path
The input data
... hence this initial situation
some points not out of this world
some tracks far outside the region of interest
most likely due to the GPS initialisation phase
(fixable by bounding box clipping)
Figure: detail UK
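The bounding box clipping mentioned above can be sketched as follows. This is a minimal illustration, not the code used for the presentation: the `Fix` record, the function name `clip_to_bbox` and the box limits are assumptions chosen for the example.

```python
from typing import NamedTuple


class Fix(NamedTuple):
    """One GPS fix (hypothetical record layout for this sketch)."""
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def clip_to_bbox(fixes, lat_min, lat_max, lon_min, lon_max):
    """Keep only the fixes that lie inside the bounding box."""
    return [
        f for f in fixes
        if lat_min <= f.lat <= lat_max and lon_min <= f.lon <= lon_max
    ]


# Illustrative box only; the limits used by the authors are not given.
BBOX = dict(lat_min=49.5, lat_max=51.5, lon_min=2.5, lon_max=6.5)

track = [
    Fix(0, 0.0, 0.0),     # typical initialisation artifact far from the region
    Fix(1, 50.88, 4.70),
    Fix(2, 51.5, -0.1),   # a point far to the west of the box
    Fix(3, 50.89, 4.71),
]
clipped = clip_to_bbox(track, **BBOX)
```

Only the fixes inside the box survive; everything else, including initialisation artifacts, is dropped before any further processing.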
A simple yet efficient approach
stages of processing
Cleaning
• Outliers and unlikely points in the data are removed, i.e. some trajectory smoothness is ensured.
• Data is split into trip trajectories between stops or activities, i.e. a journey’s segments are identified.
Mode Detection
• A training set of data is used to identify decision criteria within a manually chosen set of variables (trip parameters).
• With those criteria, the modes of trips are detected to separate bike trips from other trips.
Details can be found in [1, 2, 3]
Cleaning the data
Steps of the data cleaning algorithm
Outliers are removed according to
• geographic location: within a bounding box around the area of interest
• accessibility: reachable at realistic speeds (here ≤ 50 m/s)
• GPS drifts: points before trajectory snap-backs are deleted until the remaining trajectory contains only realistic speeds
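A minimal sketch of the accessibility criterion only, assuming points are `(t, lat, lon)` tuples with `t` in seconds. The function names and the forward-scan strategy are assumptions for illustration; the drift and snap-back handling described above is not covered here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drop_unreachable(points, v_max=50.0):
    """Drop points that cannot be reached from the last kept point at <= v_max m/s."""
    kept = []
    for t, lat, lon in points:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            # Skip non-increasing timestamps and unrealistically fast jumps.
            if dt <= 0 or haversine_m(lat0, lon0, lat, lon) / dt > v_max:
                continue
        kept.append((t, lat, lon))
    return kept


raw = [
    (0, 50.0000, 4.7),
    (10, 50.0005, 4.7),   # about 56 m in 10 s: plausible
    (20, 50.1000, 4.7),   # about 11 km in 10 s: outlier
    (30, 50.0015, 4.7),   # about 111 m from the last kept point in 20 s
]
cleaned = drop_unreachable(raw)
```

A forward scan like this assumes the first point is trustworthy; in practice it would run after the bounding box clipping.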
Stop detection and trip separation
• A stop is detected when the trajectory does not leave a circle of radius 30 m for at least 5 minutes.
• GPS trajectories are cut into trips at stop points (removal of tumbleweed).
• The next trip starts when the trajectory leaves the circle.
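The stop rule above can be sketched like this, assuming time-ordered points `(t, x, y)` in seconds and projected metres. Centring the circle on the first point of a candidate stop is an assumption of this sketch; the slides do not say how the circle is placed.

```python
import math


def split_at_stops(points, radius_m=30.0, min_stop_s=300.0):
    """Cut a trajectory into trips wherever it dwells inside a small circle."""
    trips, current = [], []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        # Extend j while the following points stay inside the circle around point i.
        j = i
        while j + 1 < n and math.hypot(points[j + 1][1] - x0,
                                       points[j + 1][2] - y0) <= radius_m:
            j += 1
        if points[j][0] - t0 >= min_stop_s:
            # Stop found: point i ends the current trip, the dwell points
            # (tumbleweed) are discarded, the next trip starts after them.
            current.append(points[i])
            trips.append(current)
            current = []
            i = j + 1
        else:
            current.append(points[i])
            i += 1
    if current:
        trips.append(current)
    # A single point is not a trip.
    return [trip for trip in trips if len(trip) >= 2]


move1 = [(t, 10.0 * t, 0.0) for t in range(0, 60, 10)]
dwell = [(t, 500.0 + (t % 3), 0.0) for t in range(60, 420, 20)]
move2 = [(t, 500.0 + 10.0 * (t - 400), 0.0) for t in range(420, 480, 10)]
trips = split_at_stops(move1 + dwell + move2)
```

In the example, the dwell of more than five minutes near x = 500 m separates the trajectory into two trips.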
Unlikely points
Tumbleweed is also found at shorter stops (e.g. traffic lights).
It is removed by loop detection: look ahead 3 minutes and find very low effective velocities needed to reach a successive trajectory point in the given time interval.
All points in a loop are replaced by one middle point between the start and end of the loop.
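One possible reading of the loop detection, as a sketch over time-ordered points `(t, x, y)` in seconds and projected metres. The effective-velocity threshold `v_eff_max` and the minimum loop duration `min_loop_s` are hypothetical parameters; the slides give only the 3-minute look-ahead.

```python
import math


def collapse_loops(points, lookahead_s=180.0, v_eff_max=0.5, min_loop_s=30.0):
    """Replace low-effective-velocity loops by one midpoint of loop start and end."""
    out = []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        end = None
        j = i + 1
        while j < n and points[j][0] - t0 <= lookahead_s:
            dt = points[j][0] - t0
            dist = math.hypot(points[j][1] - x0, points[j][2] - y0)
            # Effective velocity: straight-line distance over elapsed time.
            if dt >= min_loop_s and dist / dt <= v_eff_max:
                end = j  # remember the latest point that closes a loop
            j += 1
        if end is None:
            out.append(points[i])
            i += 1
        else:
            t1, x1, y1 = points[end]
            out.append(((t0 + t1) / 2, (x0 + x1) / 2, (y0 + y1) / 2))
            i = end + 1
    return out


track = [
    (0, 0.0, 0.0), (10, 50.0, 0.0), (20, 100.0, 0.0),
    # jitter while waiting, e.g. at a traffic light
    (30, 150.0, 0.0), (40, 155.0, 5.0), (50, 148.0, -4.0),
    (60, 152.0, 3.0), (70, 150.0, 1.0),
    (80, 200.0, 0.0), (90, 250.0, 0.0),
]
smoothed = collapse_loops(track)
```

The five jittering points collapse into a single point halfway between the first and last point of the loop, while the moving parts of the track are untouched.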
Modal Decision
principle
Classification of cycling tracks using a decision tree
Other methodologies (logistic regression, support vector machines, neural networks) show similar out-of-sample performance
Decision trees are easy to use and interpret
Exemplary diagram: (2-dimensional feature space)
Training data from the Vienna region with 8 different modes
Mode Detection
algorithmic choices
For the CDC data set, a distinction was made between 3 modes:
Walking
Cycling
Other
Algorithmic separability optimisation left 3 separation variables:
maximum velocity
percentage of time over 16 km/h
maximum acceleration
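A sketch of how the three separation variables could be computed for one trip, given time-ordered points `(t, x, y)` in seconds and projected metres. The exact feature definitions and every threshold in `classify` are hypothetical placeholders; the fitted decision tree and its split values are not given in the slides.

```python
import math


def trip_features(points):
    """Compute the three separation variables for one trip."""
    speeds, durations = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt * 3.6)  # km/h
        durations.append(dt)
    if not speeds:
        raise ValueError("need at least two points with increasing timestamps")
    fast = sum(d for v, d in zip(speeds, durations) if v > 16.0)
    # Acceleration between consecutive segments, in m/s^2 (assumed definition).
    accels = [(v1 - v0) / 3.6 / d1
              for v0, v1, d1 in zip(speeds, speeds[1:], durations[1:])]
    return {
        "v_max_kmh": max(speeds),
        "pct_time_over_16": 100.0 * fast / sum(durations),
        "a_max_ms2": max(accels, default=0.0),
    }


def classify(f, walk_v_max=10.0, cycle_v_max=45.0,
             cycle_min_pct=20.0, cycle_a_max=3.0):
    """Toy decision rule; all thresholds are hypothetical placeholders."""
    if f["v_max_kmh"] <= walk_v_max:
        return "walking"
    if (f["v_max_kmh"] <= cycle_v_max
            and f["pct_time_over_16"] >= cycle_min_pct
            and f["a_max_ms2"] <= cycle_a_max):
        return "cycling"
    return "other"


def straight_trip(speed_ms, n=10, step_s=10):
    """Helper for the example: constant speed along the x-axis."""
    return [(k * step_s, speed_ms * k * step_s, 0.0) for k in range(n)]


modes = [classify(trip_features(straight_trip(v))) for v in (1.4, 5.0, 20.0)]
```

In practice the thresholds would come from a tree trained on labelled trips rather than being set by hand.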
Processing outcome
visually
Figure: black: refined tracks; green: processed and detected cycling tracks
Bird’s eye comparison
in numbers
Comparison of no. cycle trips and trip length

                     refined   all modes   cycling
No. cycle trips          941       1,734       749
Total trip [km]        4,483       6,800     3,014
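The figures in the table can be put in relation with some simple arithmetic. The sketch below only divides the numbers shown above; the variable names are chosen for illustration, and the ratios compare totals, so they do not say which individual trips overlap.

```python
# Values copied from the comparison table: (number of trips, total length in km).
refined = (941, 4483)
all_modes = (1734, 6800)
cycling = (749, 3014)

# Detected cycling trips relative to the refined reference data.
trip_ratio = cycling[0] / refined[0]
km_ratio = cycling[1] / refined[1]

# Share of cycling among all processed trips.
cycling_share = cycling[0] / all_modes[0]

# Average trip length in km per data set.
avg_km = {
    "refined": refined[1] / refined[0],
    "all modes": all_modes[1] / all_modes[0],
    "cycling": cycling[1] / cycling[0],
}

print(f"trips detected vs. refined: {trip_ratio:.1%}")
print(f"km detected vs. refined:    {km_ratio:.1%}")
print(f"cycling share of all trips: {cycling_share:.1%}")
for name, km in avg_km.items():
    print(f"average trip length, {name}: {km:.2f} km")
```

The detected cycling set is smaller than the refined set in both trip count and distance, with the distance ratio being the lower of the two.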
Figure: trips per day comparison wrt. total time (x-axis: date, Oct 12 2011 to Nov 23 2011; y-axis: total trip time [s]; series: diary, processed)
Figure: trips per day comparison wrt. number of trips (x-axis: date, Oct 12 2011 to Nov 23 2011; y-axis: total number of trips; series: diary, processed)
Comparing track densities
principle
fewer trips were detected than in the refined data
the algorithm is unlikely to falsely qualify tracks as cycling
coordinate shift in the initial data along the backslash diagonal
Figure: (processed cycling trips) minus (refined trips)
Different cyclists
Figure: processed trip scatter for all cyclists (x-axis: avg. number of pts per trip; y-axis: number of trips)
quite different profiles
by cycling habit or trajectory cleaning?
⇒ look at the associated velocity profiles
Figure: speed distribution, cyclist 101 (high number of trips) (x-axis: speed [km/h]; y-axis: # GPS points)
Figure: speed distribution, cyclist 113 (high avg. number of points per trip) (x-axis: speed [km/h]; y-axis: # GPS points)
Cyclist differences on map
high number of points per track: cyclist 113
high number of tracks: cyclist 101
Big visual
Summary & conclusions
The applied methods successfully separate useful GPS tracking data from technological artifacts.
The methods are not overly complex, yet classify the cycling transport mode well.
Results display periodic features of the recorded travel activity with respect to number of trips and travel times.
The algorithm cannot identify all cycling tracks of the reference data.
Differences are most likely due to a dissimilar training set.
Low rate of false modal identification for cycling, while retaining the substantial part of the usable tracking data.
Compared to the reference data, erratic GPS measurement errors are removed with appreciable reliability.
TODO: Use of homologous training data (road network topology and traffic densities) is expected to yield consistently better results.
Remarks
Thanks to:
CDC2013 organisers
The other contributors and colleagues who I work with
... a patient audience
Questions & comments to: [email protected]@[email protected]
References
[1] D. Bauer et al. “On Extracting Commuter Information from GPS Motion Data”. In: Proceedings International Workshop on Computational Transportation Science (IWCTS08). 2008.
[2] R. Hariharan and K. Toyama. “Project Lachesis: Parsing and Modeling Location Histories”. In: Proceedings of the Third International Conference on GIScience. Adelphi, MD, USA, 2004.
[3] C. Rudloff and M. Ray. “Detecting Travel Modes and Profiling Commuter Routes Solely Based on GPS Data”. In: TRB 89th Annual Meeting. 2010.