Scalable Network Distance Browsing in Spatial Databases
Samet, H., Sankaranarayanan, J., and Alborzi, H.
Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
Presented by: Don Eagan, Chintan Patel
http://www-users.cs.umn.edu/~cpatel/8715.html



Outline• Motivation• Problem Statement• Proposed Approach• Other Approaches• Evaluation• Our Comments• Questions

Motivation• Growing Popularity of Online Mapping Services

Motivation• Real Time Shortest Paths

Motivation• Static Network, Variable Queries• Find Gas Stations, Hotels, Markets etc.

Problem Statement• Input: spatial network S, query node q from S• Output: k-nearest neighbors of q• Objective: facilitate "fast" shortest path queries based on different search criteria• Constraints/Assumptions: static spatial network; contiguous (connected) regions

Challenges• Real-time response• Calculating all-pairs shortest paths is costly• Storing pre-computed paths naïvely doesn't solve the problem• Scalability

Contribution• Efficient path encoding• Efficient retrieval• Abstracting shortest path calculation from domain queries

Key Concepts• Spatial Networks• Nearest Network Neighbor• Quad Tree• Morton Blocks• Decoupling• Scalability• Pre-computing
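Morton blocks are named above but not defined in the slides. As a rough illustration, a Morton (Z-order) code interleaves the bits of a cell's x and y coordinates, so that every aligned 2^k x 2^k quadtree block maps to one contiguous run of 4^k codes. The function name `morton_encode` and the 16-bit default are assumptions made for this sketch, not something taken from the paper.

```python
def morton_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y (x in even positions, y in odd)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # bit i of x -> bit 2i
        code |= ((y >> i) & 1) << (2 * i + 1)    # bit i of y -> bit 2i + 1
    return code


# Hypothetical check: the aligned 2 x 2 block with lower corner (2, 2)
# occupies one contiguous run of four Morton codes.
block_codes = sorted(morton_encode(x, y) for x in (2, 3) for y in (2, 3))
print(block_codes)
```

Because a block is a contiguous code range, a set of blocks can be kept in a sorted one-dimensional index and searched by code, which is one common reason for using this ordering.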

Spatial Networks• Graph with spatial components represented as nodes/edges• Most transportation networks are modeled as graphs• Intersection – Node/vertex• Road – Edge• Time/Distance – Edge weight

K-Nearest Neighbors

Shortest Path• Dijkstra's algorithm• Too slow for real-time queries when run from scratch for each query• Computationally expensive on large networks
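For reference, a minimal sketch of Dijkstra's algorithm with a binary heap is shown here. The adjacency-list layout and the toy network `roads` are assumptions made for illustration. Every query has to expand the network outward from the source, which is the cost the slides refer to.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest network distance from source to each reachable vertex.

    graph maps a vertex to a list of (neighbor, edge_weight) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical toy road network (undirected, so each edge is listed twice).
roads = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0), ("d", 1.0)],
    "d": [("b", 4.0), ("c", 1.0)],
}
distances = dijkstra(roads, "a")
```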

Proposed Approach• Pre-compute shortest paths• Store and Retrieve Efficiently

N = number of vertices, M = number of edges, s = length of the shortest path

Method     Space      Retrieval Time
Explicit   O(N^3)     O(1)
Dijkstra   O(N + M)   O(M + N log N)
SILC       O(N √N)    O(s log N)

Path Encoding• Path coherence• Vertices in close proximity share portions of the shortest paths to them from distant sources

Path Encoding• Quadtree: Decompose until all vertices in a block have the same color (the color of a vertex identifies the first edge on the shortest path to it from the source)
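The decomposition rule can be sketched as a short recursive function. This is an illustrative simplification, not the paper's implementation: it assumes vertices sit at distinct integer coordinates inside a square whose side is a power of two, that each vertex's color names the first hop out of the source, and it returns a flat list of leaf blocks rather than a pointer-based tree. The name `build_color_quadtree` and the sample points are hypothetical.

```python
def build_color_quadtree(points, x0, y0, size):
    """Split the square [x0, x0+size) x [y0, y0+size) until each block is one color.

    points is a list of (x, y, color) tuples lying inside the square.
    Returns leaf blocks as (x0, y0, size, color); empty blocks get color None.
    """
    colors = {c for _, _, c in points}
    if len(colors) <= 1:
        return [(x0, y0, size, next(iter(colors), None))]
    half = size // 2
    leaves = []
    for qx, qy in ((x0, y0), (x0 + half, y0), (x0, y0 + half), (x0 + half, y0 + half)):
        inside = [p for p in points
                  if qx <= p[0] < qx + half and qy <= p[1] < qy + half]
        leaves += build_color_quadtree(inside, qx, qy, half)
    return leaves


# Hypothetical colored vertices on a 4 x 4 grid, as seen from one source vertex.
colored = [(0, 0, "r"), (1, 0, "r"), (0, 1, "r"),
           (3, 0, "g"), (2, 3, "g"), (3, 3, "b")]
leaves = build_color_quadtree(colored, 0, 0, 4)
```

In this example the three "r" vertices collapse into a single 2 x 2 block, while the quadrant that mixes "g" and "b" is split once more, so blocks only get small where colors meet.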

How is space reduced?• Capturing boundaries!

Path Retrieval• Retrieve quadtree corresponding to s

Path Retrieval• Find the neighbor t of s whose colored block in the quadtree contains d

Path Retrieval• Repeat the process from t until d is reached
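The retrieval loop can be sketched as follows, under simplifying assumptions: each vertex's quadtree is given as a flat list of leaf blocks `(x0, y0, size, color)`, the color of a block is taken to be the next vertex on the shortest path, and the block containing d is found by a linear scan instead of the logarithmic search a real index would use. The names `locate_color` and `retrieve_path` and the three-vertex network are hypothetical.

```python
def locate_color(blocks, x, y):
    """Return the color of the leaf block that contains point (x, y)."""
    for bx, by, size, color in blocks:
        if bx <= x < bx + size and by <= y < by + size:
            return color
    raise KeyError((x, y))


def retrieve_path(quadtrees, position, s, d):
    """Follow pre-computed next hops from s until d is reached."""
    path = [s]
    while path[-1] != d:
        blocks = quadtrees[path[-1]]               # quadtree of the current vertex
        nxt = locate_color(blocks, *position[d])   # block color = next hop toward d
        if nxt is None:
            raise ValueError("no vertex stored at the destination position")
        path.append(nxt)
    return path


# Hypothetical network a - b - c laid out on a 4 x 4 grid.
position = {"a": (0, 0), "b": (1, 0), "c": (3, 0)}
quadtrees = {
    "a": [(0, 0, 4, "b")],                      # everything is reached via b
    "b": [(0, 0, 2, "a"), (2, 0, 2, "c"),
          (0, 2, 2, None), (2, 2, 2, None)],    # a to the west, c to the east
    "c": [(0, 0, 4, "b")],
}
path = retrieve_path(quadtrees, position, "a", "c")
```

Each iteration performs one lookup, so a path of s edges costs s lookups, which matches the O(s log N) retrieval bound quoted in the slides when each lookup takes O(log N).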

K-nearest Neighbor• Set of objects• Pre-computed paths (quadtree)

K-nearest Neighbor• K = 2

K-nearest Neighbor• Queue1: m a b• Queue2: a b

K-nearest Neighbor• Queue1: a g e b f• Queue2: a g

K-nearest Neighbor• Queue1: a g e• Queue2: a g

K-nearest Neighbor• Return a and g

Other Approaches

• IER: Incremental Euclidean Restriction• Based on Euclidean distance• Dijkstra's algorithm to get network distance

• INE: Incremental Network Expansion• Dijkstra's algorithm with a buffer L containing the k nearest neighbors seen so far in terms of network distance
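To make the INE idea concrete, here is a simplified sketch: the network is expanded from q in order of network distance, and expansion stops once k objects have been collected. It assumes objects sit exactly on vertices (the published method also handles objects on edges), and the names `ine_knn`, `roads`, and `stations` are hypothetical.

```python
import heapq


def ine_knn(graph, objects, q, k):
    """Return up to k (distance, vertex) pairs for the objects nearest to q.

    graph maps a vertex to a list of (neighbor, edge_weight) pairs;
    objects is the set of vertices that host an object of interest.
    """
    found = []                 # buffer L, filled in ascending network distance
    dist = {q: 0.0}
    heap = [(0.0, q)]
    settled = set()
    while heap and len(found) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue           # stale entry for an already finalized vertex
        settled.add(u)
        if u in objects:
            found.append((d, u))
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return found


# Hypothetical toy network with objects (e.g. gas stations) at c and d.
roads = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0), ("d", 1.0)],
    "d": [("b", 4.0), ("c", 1.0)],
}
stations = {"c", "d"}
nearest_two = ine_knn(roads, stations, "a", 2)
```

Because vertices are settled in nondecreasing distance order, the first k objects encountered are the k nearest, but the work still grows with the amount of network between q and those objects.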

Evaluation• Micro benchmark• Synthetic Data

Evaluation• Real Data Set: Major Roads of the USA

Our Comments• We liked:• Decoupling shortest path and neighbor calculation• Space reduction approach• Scalable• Correctness proofs• Detailed discussion about KNN variants

Our Comments• What we didn't like:• Experiments:• No comparison with other approaches (e.g. hierarchical, dynamic, etc.)• No performance graphs/discussion with the real dataset

Discussion• Other use cases?• Real Application: How to overcome assumptions?

Questions
