

iSEE: Efficient k-Nearest-Neighbor Monitoring over Moving Objects

[SSDBM 2007]

Wei Wu, Kian-Lee Tan

National University of Singapore

Problem Settings

Given a query point q, continuously report the k nearest objects of q

Objects and query points move in an unpredictable fashion

Objects and query points can be indexed in main memory

Related Work

• Divide space into conceptual rectangles

• Initialize heap with first-level rectangles and cq

• For top entry e in heap, unless mindist(e,q) > best_dist

• If e is a rectangle, en-heap all cells in e

• If e is a cell, check the objects in it

Conceptual Partitioning: An Efficient Method for Continuous Nearest Neighbor Monitoring

Kyriakos Mouratidis, Marios Hadjieleftheriou, Dimitris Papadias

[SIGMOD 2005]

Motivations

CPM visits unnecessary cells during update

If some nearest neighbor moves away from the query, the new answer cannot lie in any cell c for which maxdist(c,q) < best_dist, where best_dist is the distance of the kth nearest neighbor from q

All the shaded cells are visited

Ideally, the update should start from the cells that intersect the circle
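
The two distance bounds used above can be sketched as follows. This is a minimal illustration, assuming axis-aligned rectangular cells given as (x_lo, y_lo, x_hi, y_hi) tuples; the function names min_dist, max_dist and intersects_circle are hypothetical and do not come from the paper.

```python
import math

def min_dist(cell, q):
    """Smallest distance from point q to the axis-aligned cell (x_lo, y_lo, x_hi, y_hi)."""
    x_lo, y_lo, x_hi, y_hi = cell
    dx = max(x_lo - q[0], 0.0, q[0] - x_hi)   # 0 when q.x lies within the cell's x-range
    dy = max(y_lo - q[1], 0.0, q[1] - y_hi)   # 0 when q.y lies within the cell's y-range
    return math.hypot(dx, dy)

def max_dist(cell, q):
    """Largest distance from q to any point of the cell (reached at its farthest corner)."""
    x_lo, y_lo, x_hi, y_hi = cell
    dx = max(abs(q[0] - x_lo), abs(q[0] - x_hi))
    dy = max(abs(q[1] - y_lo), abs(q[1] - y_hi))
    return math.hypot(dx, dy)

def intersects_circle(cell, q, best_dist):
    """True if the circle of radius best_dist around q crosses the cell,
    i.e. the cell is neither wholly outside nor wholly inside the circle."""
    return min_dist(cell, q) <= best_dist <= max_dist(cell, q)
```

A cell with max_dist below best_dist lies wholly inside the circle, which is the case the slide says an update can skip.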

Motivations

Heap size is large because rectangles do not approximate circles very well

All the cells in any rectangle Rec are inserted in the heap when r becomes greater than mindist(Rec,q)

All cells in the four rectangles are en-heaped. Ideally, only the cells that intersect the circle should be en-heaped

How to alleviate these problems?

• CircularTrip

• VOB

Visit Order Build (VOB)

Each group LiGj has either 4 or 8 cells

The cells in any group LiGj have similar min-dist from q

The min-dist of a group LiGj from q is smaller than min-dist of Li+1Gj

The min-dist of a group LiGj from q is smaller than min-dist of LiGj+1

Min-dist of LiGj is the minimum of the min-dists from q of all the cells in LiGj.

Initial Computation

• Initially en-heap cq and group L1G1 in heap with its min-dist

• For each de-heaped entry

• If it is a cell c

• Look in c for potential NNs

• store c in visit_list

• If it is a group LiGj

• en-heap all cells in LiGj with their min-distance

• en-heap next level group Li+1Gj with its min-dist

• if i=j

• en-heap next group of same level LiGj+1
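
The group-expansion rule in the steps above can be traced with a small sketch. It models only the (i, j) indices of the groups, not their geometry, which the slides do not spell out; the helper names successors and reachable_groups are hypothetical.

```python
from collections import deque

def successors(i, j):
    """Groups en-heaped when group LiGj is de-heaped, following the rule above."""
    nxt = [(i + 1, j)]            # next-level group with the same group index
    if i == j:
        nxt.append((i, j + 1))    # next group of the same level, only when i = j
    return nxt

def reachable_groups(max_level):
    """Count how often each group (i, j) with i <= max_level is generated from L1G1."""
    seen = {}
    queue = deque([(1, 1)])
    while queue:
        i, j = queue.popleft()
        seen[(i, j)] = seen.get((i, j), 0) + 1
        for group in successors(i, j):
            if group[0] <= max_level:
                queue.append(group)
    return seen
```

Under this reading of the rule, every group is generated exactly once, so no duplicate check is needed when en-heaping groups.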

Data Structure

• Each cell in the grid stores an object list and an influence list

• best_NN stores the nearest neighbors among the visited cells

• search heap H contains the cells and groups that were en-heaped but not de-heaped (Enables quick updates)

• visit_list stores min and max-distances of all the cells that were de-heaped
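
How these four structures interact can be sketched with a simplified search. This is not the iSEE algorithm itself: for brevity every cell is pushed onto the heap up front, whereas iSEE en-heaps LiGj groups lazily. The unit-square grid, the Cell class and the names cell_dists and initial_knn are assumptions made for illustration.

```python
import heapq
import math
from dataclasses import dataclass, field

@dataclass
class Cell:
    objects: dict = field(default_factory=dict)    # object id -> (x, y)
    influence: set = field(default_factory=set)    # ids of queries that visited this cell

def cell_dists(cx, cy, q, size=1.0):
    """(min-dist, max-dist) from q to the square cell whose lower-left corner is (cx*size, cy*size)."""
    x_lo, y_lo = cx * size, cy * size
    x_hi, y_hi = x_lo + size, y_lo + size
    near = math.hypot(max(x_lo - q[0], 0.0, q[0] - x_hi), max(y_lo - q[1], 0.0, q[1] - y_hi))
    far = math.hypot(max(abs(q[0] - x_lo), abs(q[0] - x_hi)), max(abs(q[1] - y_lo), abs(q[1] - y_hi)))
    return near, far

def initial_knn(grid, q_id, q, k):
    """Return (best_nn, visit_list, heap) for query q over grid: dict (cx, cy) -> Cell."""
    # Simplification (assumption): all cells are en-heaped at once instead of via groups.
    heap = [(cell_dists(cx, cy, q)[0], (cx, cy)) for (cx, cy) in grid]
    heapq.heapify(heap)
    best_nn = []       # sorted (distance, object id) pairs, at most k entries
    visit_list = []    # (min-dist, max-dist, cell key) in de-heap order
    while heap:
        best_dist = best_nn[-1][0] if len(best_nn) == k else math.inf
        if heap[0][0] > best_dist:
            break      # remaining entries stay in the heap for later updates
        d_min, key = heapq.heappop(heap)
        cell = grid[key]
        cell.influence.add(q_id)
        visit_list.append((d_min, cell_dists(*key, q)[1], key))
        for oid, pos in cell.objects.items():
            best_nn.append((math.dist(pos, q), oid))
        best_nn = sorted(best_nn)[:k]
    return best_nn, visit_list, heap
```

Because cells are de-heaped in ascending min-dist order, visit_list comes out sorted by min-dist, which is what the backward and forward scans during updates rely on.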

Update Handling

• If an object x moves inside the circle

• include x in result and delete current kth-NN

• (Update influence List) go backward (descending order) in visit_list deleting q from all the cells c for which min-dist(c,q) > new best_dist

• If a result object x moves outside the circle

• delete x from the result set

• start from the beginning of visit_list and skip the cells for which max-dist(c,q)< best_dist
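
Both scans over visit_list can be sketched as below. The sketch assumes visit_list holds (min-dist, max-dist, cell) tuples in ascending min-dist order, that influence is a dict mapping each cell to a set of query ids, and that best_dist in the first function is the value from before the object left; all names are hypothetical.

```python
def cells_to_recheck(visit_list, best_dist):
    """Cells to re-examine after a result object leaves the circle.
    Cells with max-dist < best_dist lie wholly inside the circle and are skipped."""
    return [cell for (d_min, d_max, cell) in visit_list if d_max >= best_dist]

def shrink_influence(visit_list, influence, q_id, new_best_dist):
    """After best_dist shrinks, walk visit_list backward and remove q_id from the
    influence list of every cell whose min-dist now exceeds new_best_dist."""
    while visit_list and visit_list[-1][0] > new_best_dist:
        _, _, cell = visit_list.pop()
        influence[cell].discard(q_id)
    return visit_list
```

The backward walk can stop at the first cell whose min-dist is within the new best_dist, since all earlier entries have smaller or equal min-dist.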

Experiments


Memory Comparison with CPM

• Data structure of iSEE is same as CPM but

• CPM has a larger search heap

• visit_list of both CPM and iSEE contains same number of cells but

• iSEE also stores max-dist for cells

• Let r be the distance of kth NN from q

• CPM memory

• search heap: 2(4r² − πr²) ≈ 1.72r²

• visit_list: 2(πr²)

• Total: 8r²

• iSEE memory:

• visit_list: 3(πr²)

• Total: 3(πr²) + search heap
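
The arithmetic behind these totals can be written out as follows. Reading the coefficients 2 and 3 as the number of values stored per entry (cell and min-dist for CPM, plus max-dist for iSEE) is an assumption; the slides only state the formulas.

```latex
\begin{align*}
\text{CPM search heap} &= 2\,(4r^2 - \pi r^2) = (8 - 2\pi)\,r^2 \approx 1.72\,r^2 \\
\text{CPM visit\_list} &= 2\,\pi r^2 \approx 6.28\,r^2 \\
\text{CPM total}       &= (8 - 2\pi)\,r^2 + 2\pi r^2 = 8\,r^2 \\
\text{iSEE visit\_list} &= 3\,\pi r^2 \approx 9.42\,r^2
\end{align*}
```

The iSEE visit_list alone is therefore larger than the CPM total, so any memory advantage would have to come from the other structures.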

iSEE vs CircularTrip

• CircularTrip does not need any visit_list or search heap

• When a query moves, iSEE computes from scratch whereas CircularTrip uses previous information

• VOB is almost as expensive as CircularTrip (two consecutive CircularTrips visit some cells, at most 27%, twice, while VOB computes max-dist for each cell)

• Let C be the number of cells that need to be visited during computation: CircularTrip computes distances for at most 1.27C cells whereas VOB computes distances for 2C cells

• Update of the influence list by iSEE is faster than in CircularTrip because iSEE stores visit_list (but a lazy update approach can be used in both algorithms)

• CircularTrip uses less memory (50%-85% of iSEE) and the running time of both algorithms is estimated to be similar

CircularTrip is more flexible

• ArcTrip returns all the cells that intersect a circle and lie in a specified angle range <θ1,θ2>

• ArcTrip can be used to continuously monitor constrained nearest neighbor queries optimally (visiting the minimal set of cells)

• Extending iSEE to constrained NN queries makes it inefficient

• Recall that continuous monitoring of RNN queries requires six constrained NN queries to be continuously monitored

CircularTrip is more flexible

• ArcTrip can be used to monitor constrained NN queries over irregular regions efficiently

• CircularTrip can also be used to efficiently monitor farthest neighbor queries

• Farthest neighbors in constrained regions can also be monitored

• For all algorithms, CircularTrip preserves its property that it needs no book-keeping information and visits the minimum number of cells