zheng wang yan liu jieping ye · smart transportation brain control analysis data collection signal...

Post on 29-May-2020

7 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Zheng WANG Yan LIU Jieping YEDiDiAI Labs

Didi Chuxing

DiDiAI Labs

Univ. of Southern California

DiDiAI Labs

Univ. of Michigan, Ann Arbor

Outlinen ChallengesandopportunitiesintransportationAI (20min)• Overviewofurbantransportation• TheemergingchallengesintransportationAI

n AIapplicationsintransportation (165min+Break)• MapservicesI:mapmatching,routeplanning,estimatedtimeofarrival(ETA) (60min)• Break(30min)• MapservicesII: trafficestimation,trafficforecast(45min)• Decisionmakingservices:dispatching(30min)• AIapplicationsandAIforsocialgood(30min)

n DataandtoolsfortransportationAI (15min)n Q&A(10min)

Part 1:ChallengesandOpportunitiesinTransportationAI

HistoryofUrbanTransportation

HistoryofTrafficLight

1860sBefore

NoTrafficLight Hand-operated TrafficLight

1910s

Electric TrafficLight

1920s

Weigh inMotion

1960s1980s

Camera RadarLoop Detector

Transportation:AMulti-DisciplinaryIndustry

Planning

Management

Engineering PolicyMaking

Design

Science

Transportation

……

Cutting-EdgeIssues

7

CooperativeVehicle-Highway Systems RideSharing Multimodal

Transportation

SmartTransportationSystemSmartTravelers

Smart InfrastructureSmartVehicles Cloud

BigData

TransportationEngineeringAI

FromDriveAlonetoRideSharing

FromHumanDriving toAutonomousDriving

FromIndependent SystemstoCooperativeVehicle-Highway Systems

BuildingtheBrainsofSmartTransportation

Intelligent andConnectedVehicles FutureIntersection

AdaptiveControlasaService

AIandMachineLearning

NeuralNetworks

MachineLearning:supervised,unsupervised

DeepLearning

ReinforcementLearning

?

DataResource

• Locationdata,Trajectorydata

•Transaction data

•Profile data

• Sensors:multimedia data

•Cross-platform identification

GPS Data

LocationDataandFloating-CarTrajectory

Sensors

Loopdetector,camera,microphone,mobilesensors …

TransportationAI

BigdatamakesAIpossiblefortransportation.

SmartTransportationBrain

Control

Analysis

DataCollection

SignalControl

FreewayControl

TrafficGuidance

IncidentManagement

AIDispatch

PerformanceMeasures

CongestionDiagnosis

NetworkDesign

TrafficSimulation

AccidentAnalysis

DiDi Data

GovernmentData

Collaborators’Data

CrowdSourcedData

Supply

Demand

??

? ?

Ride-sharingServices

PlatformOptimization

MapServices

Taxi

Express

Car Pool

Premiere

……

Demand-Supply Prediction

Order Dispatch

CarPooling

ResourceAllocation

Multi-modal

RoutePlanningETA

Pick-uplocations

VRNavigation

RoutePlanning

Outlinen ChallengesandopportunitiesintransportationAI (20min)• Overviewofurbantransportation• TheemergingchallengesintransportationAI

n AIapplicationsintransportation (165min+Break)• MapservicesI:mapmatching,routeplanning,estimatedtimeofarrival(ETA) (60min)• Break(30min)• MapservicesII: trafficestimation,trafficforecast(45min)• Decisionmakingservices:dispatching(30min)• AIapplicationsandAIforsocialgood(30min)

n DataandtoolsfortransportationAI (15min)n Q&A(10min)

Part 2:AIApplicationsinTransportation

FutureTransportation

SmartInfrastructure

SmartVehicle

SharedMobility

ElectricalVehicle,AutonomousVehicle,…

Highway,Road,SmartTrafficLight,…

MapService,DecisionService,…

KeyComponents

•BasicLayer:mapserviceandLBS•UpperLayer:decisionserviceandmarketplace

MapService I

SmartMapforModernTransportation

Navigation

RideHailing TransportationSystemEfficiency

AutonomousDriving

CoreMapService

MapMatching

RoutePlanning ETA

TrafficPositioning

MapMatching

MapMatching

•ProblemDefinition

• Solutions toMapMatching

•Challenges

MapMatching

GPS Points

Road Network

Naive Approach:NearestNeighbor

GPS Points

Road Network

SequencetoSequence:HMM

•ModelthisprocessusingHiddenMarkovModel.

Hidden state (Road

Segment)

Observation(GPS)

TransitionProbability

Emission Probability

zi

xi

x1 x2x0 x3 x4

z4z1 z2 z3

*PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009

EmissionProbability

0

0.02

0.04

0.06

0.08

0.1

0.12

0 5 10 15 20

DistanceBetweenGPSandMatchedPoint(meters)

GPSDeviationFollowsGaussianDistribution

Data Histogram

Gaussian Distribution

*Partofthisslideisborrowedfrom:PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009

TransitionProbability

01234567

0 0.5 1 1.5 2

Pro

bab

ility

abs(great circle distance - route distance) (meters)

DistanceDifferenceDistribution

Data Histogram

Exponential Distribution

*Partofthisslideisborrowedfrom:PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009

ParameterEstimation

• Inreality:heuristicestimation• Emissionprobability parameter(noiseinlocationmeasurements):

• Transitionprobabilityparameter(toleranceofnon-directroutes):

• Parameterrefinement

• In literature:parameterlearning

ParameterLearningwithIRL

•Map Matching with Inverse Reinforcement Learning• Transition probability variant, HMM

• 𝑃"#$%& = )*exp(−

|#∗234|* )

• Conventionally, 𝑟∗ = 𝑑) + 𝑑9 + 𝑑: + 𝑑; − 𝑑2• Here 𝑟∗ = 𝑑) + 𝑑9 + 𝑑: + 𝑑; − 𝑑2 +𝑤"=#%(𝑢?) + 𝑢)9 + 𝑢9:)

• Weight estimation 𝑤"=#% with IRL

*T.Osogami etal.MapMatchingwithInverseReinforcementLearning,IJCAI2013.

StateEstimationusingViterbiAlgorithm

•Map-matchingasstateestimation• Input:asequenceofGPSpoints,HMM• Output:themostlikelystatesequence,i.e.,asequenceofedgesintheroadnetwork.

•ViterbiAlgorithm• DynamicProgrammingbasedmethodtoidentifythestatesequencewiththehighestprobability.

Challenges

• Large-scaleGPSdata

• Low-qualityGPSdataformobiledevice

• Limited amountoflabeleddata• Unsupervised learning:EMforHMM• Semi-supervised learning

Reference• SotirisBrakatsoula etal.Onmap-matchingvehicletrackingdata,VLDB2005• PaulNewsonetal.HiddenMarkovmapmatchingthroughnoiseandsparseness,ACMSIGSPATIAL2009• YinLouetal.Map-matching forlow-sampling-rateGPStrajectories,ACMSIGSPATIAL2009• T.Osogami etal.MapMatchingwithInverseReinforcementLearning,IJCAI 2013

RoutePlanning

RoutePlanning

RoutePlanning

•Challenges:• Large-scaletransportationnetwork• Highqueryspeed• Accurate result

•Classical problem:shortest pathalgorithm• Dijkstra’salgorithmanditsextensions:(high)query speed,lessrobust• A*-algorithm:robust,lowqueryspeed• Customizablerouting:robust,relativehighqueryspeed

•Data-drivenapproaches• Whatistheproperedgeweightforthegraph?• Howcanwetakeadvantageofthebigdata?

Shortest Path

•Buildagraphbasedonmapandtrafficinformation.•Graphedge weightfortravelcost• Findaroutewiththeminimumtravelcost.

A

B

C

D

E

F

G1

1

2

3

4

2

6

2

Dijkstra'salgorithm:agreedyapproach

•Given a weightedgraphwithnonnegativeedgeweights,Dijkstra'salgorithmfindstheshortestpathbetweennodesinthegraph.

Thecomplexity ofDijkstra’s algorithmis𝑂 𝑒 + 𝑣 log|𝑣| .

A*-Algorithm:aheuristicapproach

• Search withaheuristicguidance

*PicturefromHannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.

Forward Search Bidirectional Search A* Search

Speedup

• SpeedupDijkstra’sAlgorithm

Dijkstra’s Algorithm BidirectionalDijkstra’sAlgorithm GraphContraction

FastShortest Path

ContractionHierarchiesmethodisatechniquetospeedupshortest path routingbyfirstcreatingprecomputed"contracted"versionsoftheconnection graph.Itcanberegardedasaspecialcaseof"highway-noderouting".

•ContractionHierarchies(CH)• Preprocessing:sortingthenodes,addingshortcut,layer-wisecontraction• Query:bidirectionalDijkstra's algorithm,unfolding theshortcut

ComponentReduction

shortcut

•Remove nodes andedgesw/oshortcut.

E C

A

BD

E C

A

BD E C

BD

E C

BD

IsAintheshortestpath?N

Y

OrderforContraction

NodeOrder

• Setting the node orderforcontraction:edgedifference,costofcontraction,uniformity,costofqueries…•Orderforcontrationandshortestpathsearch.

4

2

3

5

1

1

2

3

4

5

Graph Contraction

•Multi-levelcontraction

4

2

3

5

11

2

3

4

5

Bidirectional Search

•BidirectionalDijkstra's algorithm:forwardsearchgoesfromlowordernodetohighordernode,andviceversa.•Unpacktheshortestpath.

Node Order

backwardforward

Guarantee

•Provedbycontradiction

MultipleMetric

•CRP:customizablerouteplanning• Metricindependentprocessing(partition)• Metriccustomization• Query

Partition

•Multi-layerpartitiononunweighted graph• Overlaygraph:distancepreservingsubgraph

OverlayGraph

•Threepossiblewaysofpreservingdistanceswithintheoverlaygraph:fullclique,arcreductionandskeleton.

*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.

Customization

Pruning

*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.

Query

• Shortestpathinasubgraph:overlaygraph+subgraphincludingODnodes.•Unpacktheshortestpath:Dijkstra’sAlgorithmforeachclique.

*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.

SummaryofShortest Path Search

*HannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.

ResultsonroadnetworkofWestEurope,usingtraveltimesasedgeweights.(18Mnodesand42.5Medges)

Challenges

TravelDistance TravelSpeed Traffic condition Driver/PassengerPreference

It is verydifficulttopresetproperpenalty weighttorepresentall thosefactorsCanwelearnhumandrivingpatterns?

DataDrivenvsHeuristic

Data Driven:decisionprocess

Approximately,RoutePlanningproblemcanbeseenasadeterministicMDPproblem.Ourgoalistomaximize/minimize customer’sreward/cost

decision

Time cost,distance,

user preferenceetc.

Agent

Environment

Data Driven

Classification/Reinforcementlearning(drivingpolicy)problem

S

S

S

S

V(S)

a

a

a

1. Selection

Data Driven

How to guaranteethe long time planningeffectiveness?

S

S

S

S

S

V(S)

V(S)

a

a

a

a

2. Expand & Evaluate

a

a

Data Driven

S

S

S

S

S

V(S)

V(S)a

a

3. Backtrack

a

a

Q

Q

Data Driven

S

S

a

4. Execute

Data Driven

Learning to Plan the Route

•ModelingTrajectorieswithRecurrentNeuralNetworks• H.Wu,Z.Chen,W.Sun,etal.ModelingTrajectorieswithRecurrentNeuralNetworks.IJCAI2017.

•Route Planning with ReinforcementLearning• T.Weber,S.Racanière,D.P.Reichert,etal.Imagination-AugmentedAgentsforDeepReinforcementLearning.NIPS 2017.

ModelingTrajectorieswithRecurrentNeuralNetworks

•RNN:capturevariablelengthsequence• Similarity&differenceinlanguage/trajectorymodelingusingRNN• Similar:generatewords/edgesstepbystep,dependingonthepresentandpastwords/edges.• Different:thetransitionfromonewordtoanyotherwordisfree,whileonlythetransitionfromoneedgetoitsadjacentedgesispossible.

ModelingTrajectorieswithRecurrentNeuralNetworks

•Goal:useRNNtolearnthetopologicalconstraints.

• Solution:modificationofstate-constrainedsoftmax function.

ModelingTrajectorieswithRecurrentNeuralNetworks

HiddenState

HiddenState

ModelingTrajectorieswithRecurrentNeuralNetworks

Route Planning with Reinforcement Learning

Imagination-Augmented Agent (I2A), which incorporating model-free RL and model-

based RL, improvesdataefficiency,performance,androbustness.

Route Planning with Reinforcement Learning

Observed State

Model-FreeApproaches

Disadvantages• Requireslarge amountsoftraining

data• Resultingpoliciesdonotreadily

generalize tonoveltasksinthesameenvironment

Model-BasedApproaches

Advantages• Endowingagentswitha modelof

theworld• Supportgeneralization tostatesnot

previouslyexperienced

Disadvantages

• Complexdomains hard to buildenvironmentmodels.

• Performancesuffersfrommodelerrorsresulting.

Route Planning with Reinforcement Learning

Environment Model Construction Options• Make assumptions aboutthestructureoftheenvironmentmodelwith domain

knowledge• Trained directlyonlow-levelobservationswithlittledomainknowledge,similarly

torecentmodel-freesuccesses.(e.g., I2A)• PerfectWorldModel

Input:Observed State,Chosen Action à Output:Next State,Rewarde.g.

Left

EM

Down

EMState

Action

Input InputOutput Output

10 + (−1)Reward −1Box reach target 10

Step cost -1Step cost -1

Key of Model-Based method —— EnvironmentModel

Route Planning with Reinforcement Learning

Rollout EncoderRNN - GRU

“Last-mile”Value Presentation

Output Layer of Model-based: Path EncoderRole:EM generate path after taking a specific action via imagination.

Encoder transform the path into the evaluation of selected action.

e.g.

Imagination PathO"𝑂I";), �̂�";)𝑂I";9, �̂�";9

Action: Left

. . .

Realized by RNN

EM EM

Input

Output

Imagination

Route Planning with Reinforcement Learning

• I2A performs better than others • Performance ∝ Imagination Steps

• Diminishingreturnswithmorerollout steps

Route Planning with Reinforcement Learning

Address the challenge• RNN Encodercaptures thesequentialinformation.• Work well even with imperfect EM (learntoignorethelatteraserrorsaccumulate)

Alternative Solution• Learningtonavigateincitieswithoutamap,fromdeepmind.

*PiotrMirowski,etal.Learningtonavigateincitieswithoutamap.Arxiv 2018.

Reference• R.Geisberger etal.Contractionhierarchies:Fasterandsimplerhierarchicalroutinginroadnetworks.In7thWorkshoponExperimentalAlgorithms.LNCSSeries,vol.5038.Springer,pp319-333,2008• Daniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience.vol.51(2),pp395-789,2017.• HannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.• HWuetal.ModelingTrajectorieswithRecurrentNeuralNetworks.IJCAI2017.• T.Weberetal.Imagination-AugmentedAgentsforDeepReinforcementLearning.NIPS 2017.• PiotrMirowski etal.Learningtonavigateincitieswithoutamap.Arxiv 2018.

ETA(EstimatedTimeofArrival)

ETA(EstimatedTimeofArrival)

ETAApplications

Route Planning Ride SharingOrder Dispatching Pricing

MainStreamETAModels

• Additivemodels:• Rule-basedadditivemodels:explicitlymodelingthesegmentsinapath• Aggregatingthetimeofsub-paths• Machinelearningmodelsforthesub-pathproblem.

• Globalmodels:• Formulating ETAasaregressionproblem:Learning toEstimatetheTravelTimes(L2ETT)• Simple regression model and deeplearningmodel

• Path-freemodels:• Pathisnotavailable

SimpleAdditive Model

• Simple rules based on physical structure of road network•Popular solutionindigitalmapindustry•Challenges: precisespeedestimationanderror accumulation

AggregatingSub-Paths•Thepathcanbedecomposedintosub-paths.•Timeofeachsub-pathisinferencedfromthehistoricaltrajectories.•ETAisthesummationofthetimeofsub-paths.

*HongjianWangetal.Asimplebaselinefortraveltimeestimationusinglarge-scaletripdata.SIGSPATIAL2016.

TensorDecomposition•Build a 3D tensor (driver, road segment, time slot) andusetensordecompositiontoestimatethetraveltimeofeachsegment.•Use dynamic programing to find the optimal concatenations.

*YilunWang etal.Traveltimeestimationofapathusingsparsetrajectories.KDD2014.

RegressionModelonGPSPoints•DeepregressionmodelonGPSdata• GPSpointswillnotbeavailablebeforetrip inreal-worldcase.

*DongWang,etal.WhenWillYouArrive?EstimatingTravelTimeBasedonDeepNeuralNetworks.AAAI2018.

Time Series Prediction•Conventional regressionmodel:support vector regression•Deeplearningmodel:recurrent neural network

*Chun-Hsin etal.Travel-timepredictionwithsupportvectorregression.IEEETrans.ITS2004.*Yanjie Duan etal.TraveltimepredictionwithLSTMneuralnetwork.ITSC2016.

selected routes

f(t-n),..,f(t-1) à f(t)

f(t-n),..,f(t-m) à f(t)

ProbabilisticETA•Output a distributionof ETA•Consider the variance of travel time•Assume Gaussian distribution for linktraveltime

*MohammadAsghari etal.Probabilisticestimationoflinktraveltimesindynamicroadnetworks.SIGSPATIAL2015.

AdditiveModelvsGlobalModel

Additive model?

Pure learning?

1Heuristic rule

2 Indirective objective

3 Less robust

4 Insufficient use of data

Rethinking ETA

Historical &real-timetraffic data

Segment-wisestatistic andaccumulation

Directestimation byregressionmodel

Statisticalmachinelearning

Learning to Estimate the Travel Time

•Big data + machine learning• high accuracy• datadriven• robustness

Feature extraction

Static/dynamic featureDense featureSparse feature

Raw data

Machine learningmodel

ETAservice

ProblemFormulation• Formulation: ETA as a regression problem on sequential input

•Data:acollectionoftrips𝑿 = 𝑥O OP)Q

• 𝑥O denotesfeaturevectoronthetrajectoryalongtheroutepath 𝑝O• morethan20milliontripsaday

•Objective:forMAPEloss,minV∑ |XY2V(ZY)|

XYQOP)

•Robust:differenttrainingdatafordifferentscenarios• pickingthepassenger• deliveringthepassenger• orderdispatch• carpooling

RichFeature System

Roadnetwork Order GPS

trajectoryTrafficrecord Weather User

profile

Staticfeature

Dynamicfeature

Statistical densefeature

Large scalesparse feature

Personalizedfeature

Spatialpattern

Temporalpattern

Historicalpattern

Real-timepattern

Personalizedpattern

Data

Basic Feature

FeatureEngineering

Output

Twofaces:globalfeaturevssequentialfeature

Raw data

Raw data

……

……

feature

labelTod

RecurrentModel

•ModelsequentialfeatureswithRecurrentNeuralNetwork(RNN).

Tree-BasedModel->Wide&DeepModel

• Features• Spatialinformation• Temporalinformation• Trafficinformation• Personalized information• Augmentedinformation

•W&Dmodelprocessestheglobalinformation

Wide-Deep-RecurrentNetwork(WDR)

*Zheng Wangetal.LearningtoEstimatetheTravelTime,KDD2018.

OfflineEvaluationDataset

OfflineEvaluationResults

OnlineEvaluationResults

WDRachievesthebestperformanceforallmetrics.

Progress

RegressionModel

DeepLearningModel

Low risk of overfitting

Easy to implement

Fast training and inference

Better interpretability

Better performance

sequential information

ModelCompression

High inferencespeed

More portable

More Practical:OriginDestinationETA

Routeisnotavailableuntiltheendofthetrip.

OriginDestinationETA

•Origin-destinationETA(pathfree)

•Challenges• Limitedinformation:noactualpath• Complicatedspatiotemporaldependencies

MURAT:Multi-taskRepresentationLearningforETA

RoadNetworkEmbedding

SpatiotemporalSmoothness

MURAT:Multi-taskRepresentationLearningforETA

TravelDistance

#RoadSegments

#Trafficlights

#Left/rightTurns

HistoricalPaths AuxiliaryTasks

MURAT:Multi-taskRepresentationLearningforETA

*Yaguang Lietal.Multi-taskRepresentationLearningforTravelTimeEstimation,KDD,2018

Evaluation

Beijing60M+trips

NewYorkCity20M+trips

LearnedRepresentation

Timeinday

CircularShape

Smoothtransition

Reference• Corrado deFabritiis etal.TrafficEstimationAndPredictionBasedOnRealTimeFloatingCarData,IEEETITS,2008.• ErikJenelius etal.Traveltimeestimationforurban roadnetworksusinglowfrequencyprobevehicledata,TransportationResearchB,2013• Yilun Wangetal.Traveltimeestimationofapathusingsparsetrajectories,KDD2014• ZhengWangetal.Learningtoestimatethetraveltime,KDD2018.• Yaguang Lietal.Multi-taskRepresentationLearningforTravelTimeEstimation,KDD2018.

MapServiceII

TrafficEstimationandForecasting

Introduction

• Trafficcongestingiswastefuloftime,moneyandenergy• TrafficcongestioncostsAmericans$124billion+ direct/indirectlossin2013.

• Accuratetrafficforecastingcouldsubstantiallyimproverouteplanningandmitigatetrafficcongestion.

DataSource

•Traffic estimation and forecasting• Data source: loop detector, GPS trajectory and camera• Target: traffic speed in real time and for the future

TrafficEstimation

•Calculate theground-truthtrafficdata:multiplesourcedatafusion,spatial-temporalspline.

TrafficPrediction

• Input:roadnetworkandpastT’trafficspeedobservedatsensors•Output:trafficspeedforthenextTsteps

7:00AM 8:00AM

Input:ObservationsOutput:Predictions

...

... 8:10AM,8:20AM,…,9:00AM

ChallengesforTrafficForecasting

Page 112

12/08/2017

ComplexSpatial Dependency

Spee

d (m

ile/h

)

Non-linear, non-stationary Temporal Dynamic

ChallengesforTrafficForecasting

• Spatialrelationshipamongtrafficflowis non-Euclideananddirected

KNN-basedApproaches

• Findsimilarhistoricaltraffictimeseries• Extractvariousfeaturesfromtraffictimeseries• Definesimilaritymetricsbetweentraffictimeseries

•Calculatethepredictionbyaggregatingnexttrafficreadingsofnearestneighbors.

*PicturefromWikipedia.https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

TimeSeriesMethods

• SeasonalAutoregressiveIntegratedMovingAverage(SARIMA)

• SupportVectorRegression(SVR)

* Picture from http://www.saedsayad.com/support_vector_machine_reg.htm

TrafficPredictionwithLatentSpaceModel(LSM)

•Modeltheroadnetworkasagraph•ModelcorrelationbetweenroadsegmentwithLatentSpacemodel

*DingxiongDengetal,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016

Defineamodelthatcangeneratethetrafficconditionofroadnetwork

LatentAttributes

•Eachvertexhaslatent attributes• Vertexi haslatentattributevectorui

highway arterial business resident intersection non-inter

i 0.9 0.1 0.8 0.2 1 0

j 0.8 0.1 0.7 0.3 0 1

NodeattributematrixUnxk(nisthenumberofvertices)

AttributeInteraction

highway arterial business resident intersection non-inter

highway 40 20 20 40 10 20

arterial …

business …

residential …

intersection …

non-inter …

Attributeinteractionmatrix𝑩 ∈ 𝑹𝒌×𝒌

• InteractionmatrixB betweendifferentattributes

SpeedfromLatentAttribute

•Trafficspeedbetweenvertices𝑖 and 𝑗 (𝑖 → 𝑗) isalinearcombinationofthecorrespondingtrafficpatterns

1

1

40 0

0 101 1 = 50XX

highway business

ui

ujTB

i j?

BasicGraphModel

Non-negativeGraphMatrixFactorization(NMF)[1]

120

U

Graphmatrix:𝑮 ∈ 𝑹𝑵×𝑵 Latentproperties:𝑼 ∈ 𝑹𝒏×𝒌 and𝑩 ∈ 𝑹𝒌×𝒌

UTB

[1]Wang,F.,Li,T.,Wang,X.,Zhu,S.andDing,C.,2011.Communitydiscoveryusingnonnegativematrixfactorization.DataMiningandKnowledgeDiscovery,22(3),pp.493-521.

travelspeed

OverviewofLSMforRoadNetwork

Time

Latent attributes evolve with different timestamps

12

G1

Predict

U1 U2 Ut

Gt G2

Learn

Gt+1G3

LSMforRoadNetwork

122

BasicGraphModel

TransitionEffect

TemporalEffect

OvercometheSparsity

*DingxiongDengetal,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016

ExperimentalSettings

•Dataset• MarchandApril,2014sensordatawithmorethan60millionrecords• TwosubgraphsofLosAngelesroadnetwork

•Baselines:• LSM-Naive[Zhangetal.KDD’12]• ARIMA[Williamsetal.TRB’98],ARIMA-SP• SVR[Wuetal.ITS’04],SVR-SP

•Measurement:MAPE,RMSE

ExperimentalResults

5% 10% 15% 20% 25% 30% 35%

MAP

E(%

)

Time (pm)

LSM-Global LSM-Inc ARIMA ARIMA-SP

Non-Rush hour

5% 10% 15% 20% 25% 30% 35%

MA

PE (

%)

Time (am)

LSM-Global LSM-Inc ARIMA ARIMA-SP

Rush hour

Latent space models (LSM) outperform other baselines.

DeepLearningforTrafficForecasting

•TrafficPredictionusingStackedAutoencoder(SAE)togetherwithlogisticregressionontopofthenetworkforsupervisedtrafficflowprediction

* Yisheng Lv et al. Traffic flow prediction with big data: A deep learning approach, IEEE TITS, 2015

DeepLearningforExtremeTrafficPrediction

•MainGoal:forecastingtrafficforextremeincludingrush-hourandpost-accident

Y. Qi et al, Deep Learning: A Generic Approach for Extreme Condition Traffic Forecasting. SDM 2016

DeepMixtureLSTM

•DeepMixtureLSTMisamixtureofLSTMandAutoencoder• LSTM:normalconditiontraffic• Autoencoder:accidentspecificfeatures(time,severity)• Merge:linearregression

Experiments

•Baselines:• ARIMA:AutoRegressiveIntegratedMovingAverage• RandomWalk:constanttimeserieswithrandomnoise.• HistoricalAverage:weightedaverageofpreviousseasons• SupportVectorRegression(SVR):regressionusingSupportVectorMachine

•Data:• Traffic:2,018sensorsfromMay19,2012toJune30,2012• Accident:6,811incidents spreadacross1,650sensors

Rush-HourForecasting

•Weobservealmost50%improvementforthepeakhourtrafficforecast

Post-trafficForecasting

• DeepMixtureLSTMisroughly30%betterthanthebaselinemethods

ConvolutionNeuralNetworkforTrafficForecasting

•Modeltrafficspeedsindifferentlocationsasamatrix(image).•Applyconvolutiontomodelthespatiotemporaldependency.

* Xiaolei Ma et al. Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction, Sensors, 2017

TrafficForecastingwithConvolutiononGraph

• Modelspatialdependencywithproposed diffusionconvolutionongraph

• Modeltemporaldependencywithaugmentedrecurrentneuralnetwork

*YaguangLietal,DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting.ICLR,2018

SpatialDependencyinTrafficPrediction• Spatialdependencyamongtrafficflow

Sensor1 Sensor2

Sensor3

CloseinEuclideanspace

Similartrafficspeed

is non-Euclideananddirected

𝑑𝑖𝑠𝑡%i" 𝑣O → 𝑣j ≠ 𝑑𝑖𝑠𝑡%i" 𝑣O → 𝑣j

SpatialDependencyModeling

•Modelthenetworkoftrafficsensors,i.e.,loopdetectors,asadirected graph• Graph𝓖 = (𝐕,𝑨)• Vertices𝑽:o sensors• Adjacencymatrix𝑨:→ weightbetweenvertices

𝐴Oj = exp −disttuv 𝑣O, 𝑣j

9

𝜎9 ifdisttuv 𝑣O, 𝑣j ≤ 𝜅

disttuv 𝑣O, 𝑣j :roadnetworkdistancefrom𝑣O to𝑣j,𝜅:thresholdtoensuresparsity,𝜎9 varianceofallpairwiseroadnetworkdistances

ProblemStatement•Graphsignal:𝑿𝐭 ∈ ℝ|}|×~,observationon𝓖 attime𝑡• 𝑽 :numberofvertices• 𝑃 :featuredimensionofeachvertex.

•ProblemStatement:Learnafunction𝑔(·) tomap𝑇�historicalgraphsignalstofuture𝑇 graphsignals

… …

𝑿"2��;) 𝑿" 𝑿";) 𝑿";�𝑔 .

SpatialDependencyModeling

•ConvolutionNeuralNetworks*(CNN)learnmeaningfulspatialpatterns• State-of-the-artresultsonimagerelatedtasks

Image

Convolutional Filter

* Y LeCun et al. Gradient-based learning applied to document recognition. Proc. IEEE 1998

GeneralizeConvolutiontoGraph•Diffusionconvolutionfilter:combinationofdiffusionprocesseswithdifferentstepsonthegraph.

MaxMinFilterweight

=𝜃? +𝜃) +𝜃9 +…+𝜃�

0StepDiffusion

1StepDiffusion

2StepDiffusion

KStepDiffusion

ExamplediffusionfilterCenteredat

𝑿:,� ⋆𝒢 𝑓� = � 𝜃� 𝑫𝑶2𝟏𝑨�𝑿:,�

�2)

�P?

Transitionmatricesofthediffusionprocess

Learningcomplexity:𝑂 𝐾

⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix.

GeneralizeConvolutiontoGraph•Diffusionconvolutionfilter:combinationofdiffusionprocesseswithdifferentstepsonthegraph.

= 𝜃? + 𝜃) + 𝜃9 + … + 𝜃�

0StepDiffusion

1StepDiffusion

2StepDiffusion

KStepDiffusion

ExamplediffusionfilterCenteredat

𝑿:,� ⋆𝒢 𝑓� = � 𝜃�,) 𝑫𝑶2𝟏𝑨�+ 𝜃�,9 𝑫𝑰2𝟏𝑨⊺

�𝑿:,�

�2)

�P?

Dualdirectionaldiffusiontomodelupstreamanddownstreamseparately

⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix,𝐷�:diagonalin-degreematrix

MaxMinweight

AdvantageofDiffusionConvolution

•Efficient• Learningcomplexity:𝑂 𝐾• Timecomplexity:𝑂 𝐾 𝐸 , 𝐸 numberofedges

•Expressive• Manypopularconvolutionoperations,including theChebNet [Defferrardetal.,NIPS’16],canbeseenasspecialcasesofthediffusionconvolution[Lietal.ICLR’18].

𝑿:,� ⋆𝒢 𝑓� = � 𝜃�,) 𝑫𝑶2𝟏𝑨�+ 𝜃�,9 𝑫𝑰2𝟏𝑨⊺

�𝑿:,�

�2)

�P?

* Defferrard,Metal,ConvolutionalNeuralNetworksonGraphswithFastLocalizedSpectralFiltering,NIPS,2016*YaguangLietal.DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting,ICLR,2018

⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix,𝐷�:diagonalin-degreematrix

Multi-stepaheadpredictionwithRNN

x��

DCGRU

x��

DCGRU

x��

DCGRU

Previousmodeloutput isfedintothenetwork

x�� x��

ErrorPropagation

x�

x

Modelprediction

Observationorgroundtruth

DCGRU

x)

DCGRU

x9

DCGRU

x:

Teachthemodeltodealwithitsownerror.CurrentTime

ModelTemporalDynamicsusingRecurrentNeuralNetwork

ImproveMulti-stepaheadForecasting•Trafficpredictionasasequencetosequencelearningproblem• Encoder-decoderframework

DCGRU

𝑥)

DCGRU

x9

DCGRU

𝑥:

Encoder

x��

DCGRU

x��

DCGRU

x��

DCGRU

Decoder

<GO> 𝑥� 𝑥�

𝑥�

𝑥

Modelprediction

Observationorgroundtruth

CurrentTime

𝑥� 𝑥� 𝑥�

Backproperrorsfrommultiplesteps.

*Sutskever etal.Sequencetosequencelearningwithneuralnetworks,NIPS2014

Groundtruthbecomesunavailableintesting.

𝛿� 𝛿� 𝛿�

𝑥),𝑥9, 𝑥: → 𝑥�

𝑥), 𝑥9,𝑥: → 𝑥�, 𝑥�,𝑥�

ImproveMulti-stepaheadForecasting• Improvemulti-stepaheadforecastingwithscheduledsampling

DCGRU

x)

DCGRU

x9

DCGRU

x:

x��

DCGRU

x��

DCGRU

x��

DCGRU

<GO>

x�x�� x�x��

Scheduledsampling:Choosetousethepreviousgroundtruthormodelpredictionbyflippingacoin

𝑥�

𝑥

Modelprediction

Observationorgroundtruth

Encoder Decoder

CurrentTime

*Bengio,Samy etal.Scheduledsamplingforsequencepredictionwithrecurrentneuralnetworks.NIPS2015

DiffusionConvolutionalRecurrentNeuralNetwork• DiffusionConvolutionalRecurrentNeuralNetwork(DCRNN)• Modelspatialdependencywithdiffusionconvolution• Sequencetosequencelearningwithencoder-decoderframework• Improvemulti-stepaheadforecastingwithscheduledsampling

* Yaguang Li et al. Diffusion Convolutional Recurrent Neural Network: Data-driven Traffic Forecasting, ICLR 2018

Experiment- Datasets

•METR-LA:• 207trafficsensorsinLosAngeles• 4monthsin2012• 6.5Mobservations

•PEMS-BAY:• 345trafficsensorsinBayArea• 6monthsin2017• 17Mobservations

Experiments

•Baselines• HistoricalAverage(HA)• AutoregressiveIntegratedMovingAverage(ARIMA)• SupportVectorRegression(SVR)• VectorAuto-Regression(VAR)• FeedforwardNeuralnetwork(FNN)• FullyconnectedLSTMwithSequencetoSequenceframework(FC-LSTM)

•Task• Multi-stepaheadtrafficspeedforecasting

ExperimentalResults

•DCRNNachievesthebestperformanceforallforecastinghorizonsforbothdatasets

1.00

2.00

3.00

4.00

5.00

6.00

7.00

15Min 30Min 1Hour

MeanAb

soluteError(M

AE)

METR-LA

HA ARIMA VAR SVR FNN FC-LSTM DCRNN

1.00

1.50

2.00

2.50

3.00

3.50

15Min 30Min 1Hour

MeanAb

soluteError(M

AE)

PEMS-BAY

HA ARIMA VAR SVR FNN FC-LSTM DCRNN

EffectsofSpatiotemporalDependencyModeling

•w/otemporal:removingsequencetosequencelearning.•w/ospatial:removethediffusionconvolution.

1.5

2

2.5

3

3.5

4

4.5

5

15Min 30Min 1Hour

MeanAb

soluteError(M

AE)

DCRNNw/oTemporal DCRNNw/oSpatial DCRNN

Removing either spatial or temporal modeling results in significantly worse results.

METR-LA

RelatedWork

•TrafficPredictionwithout spatialdependencymodeling• Simulationandqueuing theory[Drew1968]• KalmanFilter:[Okutani etal.TRB’83][Wangetal.TRB’05]• ARIMA: [Williamsetal.TRB’98][Panetal.ICDM’12]• SupportVectorRegression(SVR):[Mulleretal,ICANN'97][Wuetal.ITS‘04]• Gaussianprocess[Xie etal.TRB’10][Zhouetal.SIGMOD’15]• Recurrentneuralnetworksanddeeplearning: [Lv etalITS’15][Maetal.TRC’15][YuandLietalSDM’17]

RelatedWork

•TrafficPredictionwith spatialdependencymodeling• VectorARIMA[WilliamsandHoel JTE’03],[Chandraetal.ITS’09]• SpatiotemporalARIMA[Kamarianakis etal.,TRB’03][MinandWynter,TRC’11]• k-NearestNeighbor[Lietal.ITS’12][Riceetal.ITS’13]• LatentSpaceModel[Dengetal.KDD’16]• ConvolutionalRecurrentNeuralNetwork[Maetal.Sensors’17]• DiffusionConvolutionalRecurrentNeuralNetwork[Lietal.ICLR’18]

Reference• [Chandraetal.ITS’09]Chandra,S.R.andAl-Deek,H.,Predictionsoffreewaytrafficspeedsandvolumesusingvectorautoregressivemodels.ITS,2009

• [Dengetal.KDD’16]Deng,D.,Shahabi,C.,Demiryurek,U.,Zhu,L.,Yu,R.andLiu,Y.,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016.

• [Drew1968]DonaldRDrew.Trafficflowtheoryandcontrol.Technicalreport,1968.• [Lietal.ITS’12]Li,S.,Shen,Z.andXiong,G.,Ak-nearestneighborlocallyweightedregressionmethodforshort-termtrafficflowforecasting,ITS,2012

• [YuandLietal.SDM’17]Li,Y.,Yu,R.,ShahabiC.,Demiryurek,U.,Liu,Y.,DeepLearning:AGenericApproachforExtremeConditionTrafficForecasting,SDM,2017

• [Lietal.ICLR’18]Li,Y.,Yu,R.,ShahabiC.,Liu,Y.,DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting.ICLR,2018

• [Lv etalITS‘15]Lv,Y.,DuanY.,KangW.,Li,Z.,Wang,F.,TrafficFlowPredictionWithBigData:ADeepLearningApproach,ITS,2015.

• [Maetal.TRC’15]Ma, X., Tao Z.,Wang,Y.,Yu, Y.,Longshort-termmemoryneuralnetworkfortrafficspeedpredictionusingremotemicrowavesensordata,TRC2015

• [Maetal.Sensors’17]Ma,X.,Dai,Z.,LearningTrafficasImages:ADeepConvolutionalNeuralNetworkforLarge-ScaleTransportationNetworkSpeedPrediction,Sensors2017

• [Mulleretal,ICANN'97]Müller,K.R.,Smola,A.J.,Rätsch,G.,Schölkopf,B.,Kohlmorgen,J.andVapnik,V.,Predictingtimeserieswithsupportvectormachines.ICANN,1991

Reference• [MinandWynter,TRC’11]Min,W.,Wynter,L.,Real-timeroadtrafficpredictionwithspatio-temporalcorrelations,RTC,2011

• [Okutani etal.TRB’83]Okutani,I,andY.J.Stephanides.DynamicPredictionofTrafficVolumeThroughKalmanFilteringTheory.TransportationResearchB,1984

• [Panetal.ICDM’12]Pan,B.,Demiryurek,U.andShahabi,C.,Utilizingreal-worldtransportationdataforaccuratetrafficprediction.ICDM,2012

• [Wangetal.TRB’05]Wang,Y.,Papageorgiou,M.,Real-timefreewaytrafficstateestimationbasedonextendedKalmanfilter:ageneralapproach.TransportationResearchPartB:Methodological,2015

• [WilliamsandHoel JTE’03]Williams,B.M.,Hoel,L.A.,ModelingandforecastingvehiculartrafficflowasaseasonalARIMAprocess:Theoreticalbasisandempiricalresults.Journaloftransportationengineering,2003

• [Wuetal.ITS‘04]Wu,C.H.,Ho,J.M.andLee,D.T.,Travel-timepredictionwithsupportvectorregression.IEEEtransactionsonintelligenttransportationsystems,2004

• [Xie etal.TRB’10]Xie,Y.,Zhao,K.,Sun,Y.andChen,D.,Gaussianprocessesforshort-termtrafficvolumeforecasting.TransportationResearchRecord,2010

• [Zhouetal.SIGMOD’15]Zhou,J.andTung,A.K.,2015,May.Smiler:Asemi-lazytimeseriespredictionsystemforsensors.InProceedingsofthe,ACMSIGMOD,2015

DecisionService

IntelligentOrder-DispatchingSystem

OrderandDriverForecasting ALarge-ScaleDistributedComputing

PlatformEfficiencyandCustomerExperienceOptimization

MapService

RoutePlanningTheCoreofDispatching

TominimizecostTomaximizedriverefficiencyTooptimizetransportationefficiency

ETA(EstimatedTimeofArrival)

Toestimatethetravelingtimeandthewaitingtimeofeachride

Order-DispatchingMatrix

DiDiPremier

DiDiExpress

DiDiTaxi

OrdersforTaxiOrdersforPremier OrdersforExpress

DiversifiedOrders

TransportCapacity

BipartiteGraphMatching

Hungarianmatchingalgorithm (alsocalledtheKuhn-Munkres algorithm)

MDPFormulationforDispatching

ReinforcementLearningDispatching•Everydispatchingdecisionaffectsfuturesupply(drivers)distribution•Maximizedrivers’collectiveincomethroughoptimizeddispatching,whileensuringgoodcustomerexperience

Reinforcement Learning•Focusonlong-termreward(e.g.oneday)•Considerfutureimpactofcurrentdecision•Valuefunctionisa keyquantitytolearn

Q(s,a)

Method

CombiningReinforcementLearningandCombinatorialOptimization

Online Planning Step Offline Learning Step

Matching Value

Instant Reward

Future State-Value+

=

Value Functions

Historical data

Xuetal.,KDD2018

MDPDefinition

EachorderdispatchaccordstoadecisionmadebytheplatformFormulateorderdispatchintoaMarkovDecisionProcess(MDP)Optimalpolicygeneratestheoptimalrevenueoftheplatform–bysatisfyingmorerequestsandmaximizingdriver’sincome

• State�Time&location

• Action�order-driverpair• S:Driver’scurrentstate• S’:Order’sactual

finishing state,distributedneartheestimatedfinishingstateby!""#$

• R:Order’sprice�distributedneartheestimatedpriceby%"$

agent (driver)

order start

order finish

&''#(

pickup

delivery

Agent:ADriver

Episode:Awholedayfrom04:00to04:00(+1)

State:Time&Spacelocation

Action:Pickupanorderordonothing

Reward:Driver’sIncome(orderprice)

ValueFunction:TheexpectedfutureincomeofadriverinastateS

Motivation

max𝑉¢ 𝑠 = 𝐸¢[𝑅";)+ 𝛾𝑅";9+ ⋯|𝑆" = 𝑠]

Learning– PolicyEvaluation

𝑇?

𝑺𝟎(𝑻𝟎,𝑿)

𝑇)

𝑺𝟏(𝑻𝟏,𝑿)

𝑇9 𝑇:

𝑺𝟐(𝑻𝟑,𝒀)

Vacant:V¢ 𝑆? ← 𝑉¢ 𝑆? + 𝛼(0 + 𝛾𝑉¢ 𝑆) − 𝑉¢(𝑆?))Serving:V¢ 𝑆) ← 𝑉¢ 𝑆) + 𝛼(𝑅 + 𝛾9𝑉¢ 𝑆9 − 𝑉¢(𝑆)))

Dynamic Programming

Planning– AdvantageFunction

Linkweight:

Q s, a→ A(s, a)

= R.+V s’ – V(s)

Order’svalue

Valuefunctionofdriver’scurrentstate

Valuefunctionofdriver’sexpectedfinishingstate

DeepRLforDispatching

DeepReinforcementLearning

moreadaptivetoreal-timesupply-demand contextchanges

weightssharingamonginputs:location,time,destination,context- bettergeneralization

。。。

。。。

facilitateslearningfrommultiplecitiesandtimes

Wangetal.,ICDM2018

DeepQ-networkwithactionsearch

• Usehistoricaltripsdataastrainingtransitions.

• Eachtripxdefinesatransitionoftheagent’sstate

(s0,a,r,s1):• Currentstates0:=(l0,t0,f0),location,time,and

contextualfeatures• Action:theassignedtrip;• Next-states1:=(l1,t1,f1)• Reward:thetotalfeecollectedforthetrip.

• Implicitlyconsiderpick-uptime,tripdurationrelativetoreward

q Trainingdata

q Actionsearch

q Expandedactionsearch

DeepQ-networkwithactionsearch

q Trainingdata

q Actionsearch

q Expandedactionsearch

• Constructanapproximatefeasiblespacefortheactionsfor

computingmax·�∈¸

Q(s), a�) forthetargets

• Insteadofsearchingthroughallvalidactions,wesearchwithin

thehistoricaltripsoriginating fromthevicinityofs:

• Thesamesearchprocedureisusedforevaluation,wherewe

simulatethedriver’strajectoryduringthedayusinghistorical

tripsdata.

trip. Random sampling from the action space is thus not appropriate. Hence, we developed anapproximation scheme for computing this term by constructing an approximate feasible space for theactions, ˜A(s). This notation makes explicit the dependency of the action space to the state s wherethe search starts. Instead of searching through all valid actions, we search within the historical tripsoriginating from the vicinity of s:

˜A(s) := {xs1 |x 2 X , B(xs0) = B(s)}, (3)

where X is the set of all trips, and B(s) is discretized spatio-temporal bin that s falls into. For spatialdiscretization, we use the hexagon bin system, where in our case here, a hexagon bin is representedby its center point coordinates. xs0 is the s

0

component of the trip x. Certainly, the larger the searchspace, the more computation is required for evaluating the value network at each action point. Wehave made the number of actions allowed in the action search space a tuning parameter and dorandom sampling without replacement if necessary. The same search procedure is used for policyevaluation, where we simulate the driver’s trajectory during the day using historical trips data.

3.3 Expanded action search

Due to training data sparsity in certain spatio-temporal regions, e.g. some remote area in earlymorning, the above action search may return an empty set. In this case, we perform an expandedaction search in both spatial and temporal spaces. The first search direction is to stay at the lastdrop-off location and wait for a period of time, which corresponds to keeping the l-component, slconstant and advancing st, till one of the following happens (s0 is the searched state.): 1) ˜A(s

0) is

non-empty. We return ˜A(s

0). 2) A terminal state is reached. We return the terminal state. 3) s0t

exceeds the wait-time limit. We return s

0. The second search direction is through spatial expansionby searching the neighboring hexagon bins of s in a layered manner. See Figure 2 for an illustration.For each layer L of hexagon bins, we search within the appropriate time interval to take into accountthe travel time required to reach the target hexagon bin from s. The travel time estimation can beobtained from a map service, for example, and is beyond the scope of this paper. We denote the layerL neighboring spatio-temporal bins of s by B

0(s, L) and the set of historical trips originating from

any of the bins in B

0(s, L) by

˜A0(s, L) := {xs1 |x 2 X , B(xs0) 2 B

0(s, L)}. (4)

We stop increasing L when ˜A0(s, L) is non-empty and return ˜A0

(s, L). Otherwise, we returnB

0(s, L

max

), i.e. the hexagon bins’ center points and their associated time components. Algorithm3.1 summarizes the full action search.

3.4 Terminal state values

From the definition of a terminal state in Section 2, it is clear that Q(s, a) with st near the endof the episode horizon should be close to zero regardless sl. Following the idea of the dynamicprogramming algorithm, we add transitions with s

1

being a terminal state to the replay buffer at thevery beginning of training. We find that it helps getting the terminal state-action values right earlyin training. This is important because the target values for the states s

0

’s in the mini-batch updatesof DQN are computed through bootstrapping on the values of states that are temporally after them.Since the training samples with a terminal state form a very small percentage of the entire data set, auniform sampling would cause the values of many states far away from the terminals to be supervisedwith the incorrect targets, hence slowing down the learning process.

3.5 Experience augmentation

The original training data is the experience generated by the given historical policy, which may notexplore the trajectory space well and, in particular, contains very few transitions that require longwaiting time or repositioning without a passenger before a trip starts. Such transitions are typicalexperiences when the driver is at a state where historically there were few trips originating fromthere. If the agent is only trained on the original trips data, it would not learn to make good decisionsshould the driver go into a rare state in the future. The way we mitigate this problem is to supplementthe original training experience with transitions obtained through action search. Specifically, foreach time bin, we randomly sample a set of locations within the geographical boundary under

3

DeepQ-networkwithactionsearch

q Trainingdata

q Actionsearch

q Expandedactionsearch

• Duetotrainingdatasparsityincertainspatio-temporal

regions,weperformanexpandedactionsearchinboth

spatialandtemporalspaces.

Algorithm 3.1 Spatio-temporal action search1: Given s = (l

0

, t

0

); t0 is search time limit.2: A {}3: T

max

min(T, t

0

+ t

0)

4: if ˜A(s) 6= ; then5: A A+

˜A(s)

6: else7: for t = t

0

, t

0

+ 1, · · · , Tmax

do8: s

0 (l

0

, t)

9: if ˜A(s

0) 6= ; then

10: A A+

˜A(s

0)

11: break12: end if13: end for14: if A did not change from line 6 then15: A A+ s

0

16: end if17: for L = 1, · · · , L

max

do18: if B0

(s, L) 6= ; then19: A A+

˜A0(s, L)

20: break21: end if22: end for23: if A did not change from line 17 then24: A A+B

0(s, L

max

)

25: end if26: end if27: return A

Figure 2: Action search. The red circle lines cover the first two layers of neighboring hexagon binsof l

0

. The arrows represent searches to hexagon bin centered at l1

(first layer) at time t

1

> t

0

andhexagon bin centered at l

2

at time t

2

> t

2

. The spatio-temporal bins covered by the inner red circleare B

0((l

0

, t

0

), 1)..

4

Trainingformultiplecities

Dispatchingsystemsupportsalargenumberofcities§ Computationallyintensive§ Diversesupply-demandsettings

Knowledgetransfer§ Leveragecommonproperties:e.g.rush-hourtrafficpattern§ Improvelearningefficiency

Wangetal.,ICDM2018

TransferLearning

Idea§ Re-useweights§ Betterinitialsolution

ExistingMethods§ Fine-tuning§ Progressive:lateralconnections

betweensourcecitynodesandtargetcitynodes

CorrelatedFeatureProgressiveTransfer(CFPT)

Spatio-temporaldisplacement§ Involverelativetransitionpatternlikedistanceandtimecost

Generalonlinefeatures§ Numberofidledrivers§ Numberoforderscreated§ Averagepick-uptimeforpassengers§ …

ExperimentsTrainingdata:onemonthofExpressCartripdataSingle-drivertestenvironment

DQNv.s.policyevaluation§ Policyevaluation:max-Q->mean-Qinmini-batchupdates

Transferlearningv.s.no-priorDQNSpatialtransfer

§ SourcecitytotargetcityTemporaltransfer

§ Samecity:onetimeperiodtoanother City Size RegionA Large NorthernB Small SouthernC Medium SouthernD Large Western

Results

DQNv.s.policyevaluation§ Optimizationhelps

§ Smallercitieswithsimplerpatternand

fewertransitionshavesmaller

advantages

ResultsNotransferv.s.transfer

Spatialtransfer§ Robustness§ Betterinitialsolution§ Higherconvergedcumulativerewards

ResultsTemporaltransfer

DynamicRideSharing

DynamicRideSharingrequest Aroute request B

𝑣9(0,2)

𝑣º(2,4)

𝑣:(3,0)

𝑣½(5,2)

𝑣�(6,5)

𝑣�(10,2)

𝑣�(8,4)5

5

7 8

4

5

5 3

35

𝑣)(0,3)

1

𝐵

𝐴

Pick A

Pick B

Drop A

Drop BDrop B

• Limitations• targetonsingleobjective• relyoninefficient insertion,O(n2)orO(n3)

• Contributions• aunifiedcostfunction,generalizethreemainobjectives• dynamicprogrammingbasedinsertion,lineartime

Drop A

*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.

UnifiedProblemDefinition• Givenasetofdrivers,asetofrequestsdynamicallyarrived,theplatformaimstoplanrouteofeachdriverforservingtherequeststominimizetheunifiedcost.• UC W,R = α Ç ∑ D(SÊ)�

Ê∈Ì + ∑ pÍ�Í∈ÎÏ

• α:weight,pÍ:penaltyforrejectingtherequest

• Ð α = 1pÍ = ∞ ⟹

• Ðα = 0pÍ = 1 ⟹

• Ð α = unitpaymentpÍ = fareoftherequest⟹

� 最大化平台收益

MinimizeTotalDistance

MaximizeServedRequest

MaximizeTotalRevenue

*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.

CoreOperationinRidesharing:Insertion

• Givenaworkerwithcurrentroute,anewrequestisinserted intocurrentroutewithminimalincreaseddistance,whileordersofoldrequestsremainthesame.

PickA DropAPickBC

DropBDropC

PickD

DropD

oldroute newrequest

BeforeInsertion AfterInsertion

PickA DropAPickBC

DropBDropC

PickD

DropD

oldroute newroute

*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.

DynamicProgrammingbasedInsertion

• EventhoughthereareO(n2)pairs,insertion canbeimplementedbydynamicprogramminginlineartime.

Dio j =∞, ifpicked j − 1 > KÊ −KÍDio j − 1 , if det lÚ2), oÍ, lÚ > slack[j − 1]min Dio j − 1 , det lÚ2), oÍ, lÚ , otherwise

Plc[j] =

NIL, ifpicked j − 1 > KÊ − KÍPlc j − 1 , if det lÚ2), oÍ, lÚ > slack[j − 1]Plc j − 1 , ifDio j − 1 < det(lÚ2), oÍ, lÚ)j − 1, ifDio j − 1 ≥ det(lÚ2), oÍ, lÚ)

DynamicProgramming

*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.

O(n2)pairs oldroutenewrequest

Reference• [Tong et al. VLDB’18] Tong, Y., Zeng, Y., Zhou, Z., Chen, L., Ye, J., Xu, K., A Unified Approach

to Route Planning for Shared Mobility, VLDB, 2018.• [Tong et al. SIGMOD’18] Tong, Y., Wang, L., Zhou, Z., Chen, L., Du, B., Ye, J., Dynamic

Pricing in Spatial Crowdsourcing: A Matching-Based Approach, SIGMOD, 2018.• [Tong et al. VLDB’17] Tong, Y., Wang L., Zhou Z.,Ding. B., Chen L., Ye J., Xu K., Flexible

Dynamic Task Assignment in Real-Time Spatial Data, VLDB, 2017.

SupplyandDemandForecasting

SupplyandDemandForecasting

Tongetal.,KDD2017

SupplyandDemandForecasting

DeepST or ST-ResNet

• Spatial proximity: extract features from spatially proximate neighborhoods

• Temporal feature: temporal proximity and periodicity (n-weeks ago, n-days ago, n-hours ago)

Zhang,etal, AAAI2017

SupplyandDemandForecasting

Yao,etal, AAAI2018

Deep Multi-View ST Network

• Spatial proximity: extract features from spatially proximate neighborhoods

• Temporal feature: Recurrent Neural network

• Semantic similarity: build semantic graph followed by embedding

SmartTransportation

DataCollection,StateEstimation,andPrediction

10

20

30

40

0

Time

2018-4-18Time MsgType Msg16:47:39 TrafficSign NoParking16:47:39 TrafficSign SpeedLimit60km/h16:47:39 TrafficSign BusLane16:47:51 TrafficSign NoLeftTurn16:48:02 Congestion StopinQueue

Shutong St

Shuhan Rd

Tongde St

0 25 50 75 100

Space

Time-spacediagram

……

PerformanceEvaluationandDiagnosisImbalance

Spillover

Underutilization

Everymovementattheintersection

Imbalance UnderutilizationSpillover Normal

RealTimeTrafficControlReal-timedataprocessingwithdistributedcomputing

Trafficstatesreconstruction

Integratedrampmeteringandtrafficsignalcontrol

Datafusion-mobilesensing+fixedlocationdetection

Machinelearningbasedcontroller

IntegratedSolution- SmartTransportationBrain

Path GuidanceLane Control Signal TimingSmart Dispatch Network Optimization

GovernmentData

IndustryData

CrowdsourceData

DiDi Data

Data Fusion

Simulation

CloudComputation

AI

MeasureSystem

ControlCenter

AnalysisCenterDataCenter

SmartTransportationBrain

Improvingtrafficconditionsinover20cities

WuhanSmartTrafficLights 170+VariableMessageSigns:11

ChengduSmartTrafficLights 120+

ShenzhenSmartTrafficLights 10VariableMessageSigns:2

GuangzhouSmartTrafficLights 110+

SuzhouSmartTrafficLights 160+

NanjingSmartTrafficLights 18

JinanSmartTrafficLights 340+VariableMessageSigns:82

1300+

10�20%

ElectricVehicle

Motivations

• 400,000 electriccarsoperatingonDiDi’s platform

LessEmissionBrighterFuture!

>1.7millionnewenergy vehicles inChina(theworld'slargest)

• Chargingstation• Batterymanagingsystem(BMS)• Dispatchstrategy

Supportingthenewenergyvehicles

ChargingStation

RichTraceData

ChargingStationLocations

StationCapacities

InteractiveMap

SpatialClustering

DemandPrediction

Product

Supply-demand Pilesneeded

PotentialChargingStations

BMS(ConsumptionPrediction)

Driving StyleVehicleinformation Weather

RoadCondition

1. Power output2. Weight3. …

1. Sunny/rainy2. Temperature3. …

1. Acc/break freq2. Max speed3. …

1. Even?2. Up/down hill3. …

Unified Scale

Difference Model

PredictionMAPE distribution

Counts

truthpredicted

Asampletracefitting

DispatchingStrategy

Mydestinationis10kmaway

Ihavetochargenow

Iam100%full

Ihave30%battery

Iamgoingtoaplacenearchargingstation

Iamgoing50kmaway

Adaptingtheuniquecharacteristicofnewenergyvehicles

DiDi Brain

AIforSocialGood

AIforSocialGoodIntelligentMedical Hardware

Barrier-Free

Disease Warning

EnvironmentalSustainability

UrbanAirQuality Monitoring

Health

NeuralNetworks

Reinforcement Learning

CallforParticipant

DeepLearning

EVChargingPilesLocation

MachineLearning

Part 3:DataandToolsforTransportationAI

OpenDataset

KDDCup2017HighwayTollgatesTraffic

FlowPrediction

GAIAOpenDatasetTrajectoryData

Uber Movement

FederalHighwayAdministrationNextGenerationSimulation (NGSIM)Program

Public Data

TheGAIAInitiativeaimstoestablishstronglinksbetweenDiDi andacademia,andfacilitatedata-drivenresearchintransportation.Itprovidesaplatformfortheacademiatoaccessanonymizeddatafromreal-lifeurbanscenarios.

Open InnovativeCollaborative

Introduction

DailyRides

30 Million

DataProcessed

4875TB+

RoutingRequests

40 Billion 100TB+

New Data

Every Day

BigData

Real

High Accuracy

Full Sample

Vehicle Trajectory DataFrom: Chengdu andXi’an, China

Sharing AnonymizedData

01 02 03

Register

VisittheGAIAInitiativewebsite:GAIA.didichuxing.com/en andclick”ApplyNow”toregister.

Verification &Review

IntheDataCenter,click“DataAccessApplication”.Readtheuseragreementcarefullyandfillouttheapplicationformwithyourinfo,thensubmittheformforverification.

Access Data

Onceyourapplicationisapproved,DiDi willsendyouanemailwithinstructionsfordataaccess.Pleasecheckyouremail.

Process

Data

DiDi Academia

Scenarios

ToRedefinetheFutureofMobility

Outreach

GAIA.didichuxing.com/en

top related