Post on 21-Dec-2015
![Page 1: Ad-hoc Distributed Spatial Joins on Mobile Devices Panos Kalnis, Xiaochen Li National University of Singapore Nikos Mamoulis The University of Hong Kong](https://reader030.vdocuments.site/reader030/viewer/2022032522/56649d6d5503460f94a4d55c/html5/thumbnails/1.jpg)
Ad-hoc Distributed Spatial Joins on Mobile Devices
Panos Kalnis, Xiaochen Li, National University of Singapore
Nikos Mamoulis, The University of Hong Kong
Spiridon Bakiras, Hong Kong University of Science and Technology
Motivation
Users are equipped with a mobile device (e.g., a PDA)
Ad-hoc spatial queries
Combine data from remote servers (e.g., Hotels, Restaurants)
“Find hotels which are within 500m of a seafood restaurant”
Servers do not collaborate with each other
The query is executed on the mobile device
Mediators?
Services may only allow end-user connections (e.g., subscribers only)
Access through mediators may be more expensive
Requests are ad-hoc; existing mediators may not support them
[Figure: Hotels and Restaurants servers accessed through a Mediator]
Cost
Telecommunication companies typically charge by the volume of transferred data (e.g., GPRS) rather than by connection time.
Goal: Minimize the amount of transferred data.
Solution
Ask aggregate queries to estimate the data distribution (i.e., statistics)
Partition the space recursively to achieve sub-linear transfer cost
Choose the physical operator independently for each partition
Related Work
Hash-based methods (e.g., PBSM): require all data to be transferred
R-tree based methods (e.g., [Tan et al., TKDE, 2000]): require access to the internal index
Mediators:
HERMES: statistics from previous queries
DISCO, Garlic: statistics during initialization
Tukwila: optimize parts of the execution tree
Operators
WINDOW query: return all objects intersecting a window w
COUNT query: return the number of objects intersecting w
ε-RANGE query: return all objects within range ε from a point p
NO access to the internal indices!
[Figure: a window w; a point p with range ε]
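The three operators can be read as the only interface a server exposes. A minimal sketch, assuming point objects held in memory: the `SpatialServer` class, its method names and the `Window` type are hypothetical stand-ins for a remote service, not an API taken from the slides.

```python
from math import hypot
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Window(NamedTuple):
    """Axis-aligned query window (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, p: Point) -> bool:
        # A point intersects the window if it lies inside or on its border.
        return self.xmin <= p[0] <= self.xmax and self.ymin <= p[1] <= self.ymax


class SpatialServer:
    """In-memory stand-in for a remote server; its index is never exposed."""

    def __init__(self, points: List[Point]) -> None:
        self._points = list(points)

    def window(self, w: Window) -> List[Point]:
        """WINDOW query: all objects intersecting w."""
        return [p for p in self._points if w.intersects(p)]

    def count(self, w: Window) -> int:
        """COUNT query: number of objects intersecting w."""
        return sum(1 for p in self._points if w.intersects(p))

    def eps_range(self, p: Point, eps: float) -> List[Point]:
        """epsilon-RANGE query: all objects within distance eps of point p."""
        return [q for q in self._points
                if hypot(q[0] - p[0], q[1] - p[1]) <= eps]


restaurants = SpatialServer([(1.0, 1.0), (2.0, 2.0), (8.0, 8.0)])
print(restaurants.count(Window(0, 0, 3, 3)))
print(restaurants.eps_range((1.0, 1.0), 1.5))
```

COUNT returns a single number, so it is the cheap operator for gathering statistics before any objects are downloaded.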
Query Types
Intersection Join: find hotels which are inside parks
ε-Range Join: find restaurants which are within 500m of a hotel
Iceberg Semi-join: find hotels which are close to at least 3 restaurants
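The semantics of two of these query types can be pinned down with a brute-force reference. This is only a sketch of what the queries return when all points are available in one place; it is not the distributed algorithm, and the function names and sample coordinates are made up for illustration. The intersection join is omitted because it needs extended objects (e.g., polygons) rather than points.

```python
from math import hypot


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def eps_range_join(R, S, eps):
    """All pairs (r, s) whose distance is at most eps."""
    return [(r, s) for r in R for s in S if dist(r, s) <= eps]


def iceberg_semijoin(R, S, eps, m):
    """Objects of R that have at least m objects of S within distance eps."""
    return [r for r in R
            if sum(1 for s in S if dist(r, s) <= eps) >= m]


hotels = [(0.0, 0.0), (10.0, 10.0)]
restaurants = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5), (10.0, 11.0)]

print(eps_range_join(hotels, restaurants, eps=1.5))
print(iceberg_semijoin(hotels, restaurants, eps=1.5, m=3))
```

The semi-join returns objects of one dataset only, so its answer can be much smaller than the corresponding join.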
Hash Based Spatial Join
Each partition must fit in memory
Recursive evaluation
Retrieve statistics for each subpart
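A rough sketch of the recursive idea: ask for counts per window, stop splitting once both datasets fit in memory, and otherwise split into four quadrants. Several things here are assumptions of the sketch rather than details from the slides: the `capacity` parameter, the quadtree-style split, the half-open windows, and the rule that a window is pruned when either count is zero. The pruning rule is only safe when matching objects must fall in the same window; an ε-range join would need windows extended by ε.

```python
def count(points, w):
    """Stand-in for a COUNT query; w = (xmin, ymin, xmax, ymax), half-open."""
    xmin, ymin, xmax, ymax = w
    return sum(1 for x, y in points if xmin <= x < xmax and ymin <= y < ymax)


def quadrants(w):
    """Split a window into four equal quadrants."""
    xmin, ymin, xmax, ymax = w
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    return [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
            (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]


def partitions_to_join(R, S, w, capacity, max_depth=8):
    """Return the windows whose contents should be downloaded and joined."""
    n_r, n_s = count(R, w), count(S, w)
    if n_r == 0 or n_s == 0:
        return []                      # nothing to join here, transfer nothing
    if n_r + n_s <= capacity or max_depth == 0:
        return [w]                     # both parts fit in memory
    result = []
    for q in quadrants(w):
        result.extend(partitions_to_join(R, S, q, capacity, max_depth - 1))
    return result


R = [(1, 1), (2, 2), (3, 1), (9, 9)]
S = [(1, 2), (2, 1), (3, 3), (12, 12)]
windows = partitions_to_join(R, S, (0, 0, 16, 16), capacity=4)
print(windows)
```

Every level of recursion costs extra COUNT queries, which is why retrieving statistics is not always worthwhile.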
Inefficient HBSJ
Nested Loop Spatial Join
Recursive HBSJ : 4 QRY + 2 RCV + 5 RCV
NLSJ : 2 RCV + 2 SND + 2 RES
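The NLSJ tally can be illustrated with a small sketch. It assumes the following reading of the abbreviations, which the slides do not spell out: RCV counts objects received from the outer dataset, SND counts queries sent to the other server, and RES counts result objects received back. The function name and sample data are hypothetical.

```python
from math import hypot


def nlsj(outer, inner, w, eps):
    """Nested-loop spatial join inside window w, with a transfer tally."""
    xmin, ymin, xmax, ymax = w
    tally = {"RCV": 0, "SND": 0, "RES": 0}

    # Step 1: WINDOW query on the outer dataset; every object is received.
    downloaded = [p for p in outer
                  if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]
    tally["RCV"] = len(downloaded)

    # Step 2: one range query per downloaded object against the inner dataset.
    pairs = []
    for p in downloaded:
        tally["SND"] += 1
        matches = [q for q in inner
                   if hypot(q[0] - p[0], q[1] - p[1]) <= eps]
        tally["RES"] += len(matches)
        pairs.extend((p, q) for q in matches)
    return pairs, tally


hotels = [(1.0, 1.0), (5.0, 5.0)]
restaurants = [(1.0, 1.5), (5.0, 5.2), (9.0, 9.0)]
pairs, tally = nlsj(hotels, restaurants, (0, 0, 10, 10), eps=0.6)
print(tally)
```

NLSJ never downloads the inner dataset in full, which pays off when the outer dataset is small relative to the inner one.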
Inefficient NLSJ
Cost Model
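TCP/IP: MTU = MSS + B_H
$T_B(B_D) = B_D + B_H \left\lceil B_D / \mathrm{MSS} \right\rceil$, where $B_D$ is the number of data bytes and $B_H$ the header bytes per packet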
c1: download |R_w| objects from R and |S_w| objects from S and join them on the PDA
c2, c3: download |R_w| objects from R, send them as window queries to S and retrieve the results (and symmetrically with the roles of R and S swapped)
c4: repartition w, retrieve detailed statistics and apply the algorithm recursively
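To make the trade-off between the options concrete, here is a hedged sketch of a per-packet cost function. It assumes the transferred bytes follow T_B(B_D) = B_D + B_H * ceil(B_D / MSS), with MSS = 1460 and B_H = 40 (typical values for TCP/IP over a 1500-byte MTU, not figures from the slides). The object size, query size and the expected number of matches per query are hypothetical parameters, as are the function names.

```python
MSS = 1460   # payload bytes per packet (assumed)
B_H = 40     # header bytes per packet: 20 IP + 20 TCP (assumed)


def transfer_bytes(b_d, mss=MSS, b_h=B_H):
    """T_B(B_D): payload plus one header for every packet needed."""
    packets = -(-b_d // mss)          # integer ceiling of b_d / mss
    return b_d + b_h * packets


def cost_hbsj(n_r, n_s, obj_bytes):
    """c1: download both sides of the window and join on the device."""
    return transfer_bytes(n_r * obj_bytes) + transfer_bytes(n_s * obj_bytes)


def cost_nlsj(n_outer, matches_per_query, obj_bytes, query_bytes):
    """c2 or c3: download the outer side, then one query per outer object."""
    download = transfer_bytes(n_outer * obj_bytes)
    queries = n_outer * transfer_bytes(query_bytes)
    results = n_outer * transfer_bytes(matches_per_query * obj_bytes)
    return download + queries + results


# Two hotels against a thousand restaurants, 16 bytes per object or query.
print(cost_hbsj(2, 1000, 16))
print(cost_nlsj(2, 1, 16, 16))
```

With a very small outer dataset the nested-loop option transfers far fewer bytes, even though every tiny query pays a full packet header.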
UpJoin (Uniform Partition Join)
Decide if datasets are uniform
If HBSJ is cheaper and both datasets are uniform then perform HBSJ
If NLSJ is cheaper and the largest dataset is uniform then perform NLSJ
Else repartition
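The decision rule above can be written as a small function. This is a sketch under assumptions: the costs and uniformity flags are taken as given inputs, ties in cost go to HBSJ, and ties in dataset size treat R as the larger one. The function name and return labels are illustrative only.

```python
def upjoin_decision(cost_hbsj, cost_nlsj, r_uniform, s_uniform, n_r, n_s):
    """Pick the operator for one window, or ask for repartitioning."""
    if cost_hbsj <= cost_nlsj:
        # HBSJ is cheaper: it is chosen only if both datasets look uniform.
        if r_uniform and s_uniform:
            return "HBSJ"
    else:
        # NLSJ is cheaper: only the larger dataset has to look uniform.
        larger_is_uniform = r_uniform if n_r >= n_s else s_uniform
        if larger_is_uniform:
            return "NLSJ"
    return "REPARTITION"


print(upjoin_decision(10, 20, True, True, 50, 50))
print(upjoin_decision(30, 20, False, True, 2, 100))
print(upjoin_decision(30, 20, True, False, 2, 100))
```

The asymmetry reflects that NLSJ downloads only one side in full, so skew in the small dataset matters less.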
Uniformity check
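$|D_{w'_i} - D_w / 4| \le \alpha \cdot D_w$ for every quadrant $w'_i$ of $w$, where α is the % variation from uniform distribution
[Figure: window w containing D_w objects, split into four quadrants containing D_{w'_0}, D_{w'_1}, D_{w'_2} and D_{w'_3} objects]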
Note: UpJoin will not repartition if the cost for retrieving statistics is larger than the cost of joining
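A possible reading of the uniformity check, sketched in code. It assumes the test compares each quadrant count with a quarter of the window's count and tolerates a deviation of at most α times the window's count; the exact inequality is an assumption, as is the function name. The four counts would come from COUNT queries on the quadrants.

```python
def is_uniform(quadrant_counts, alpha):
    """True if every quadrant holds roughly a quarter of the objects."""
    if len(quadrant_counts) != 4:
        raise ValueError("expected the counts of exactly four quadrants")
    total = sum(quadrant_counts)       # D_w, assuming the quadrants tile w
    expected = total / 4               # count per quadrant if uniform
    return all(abs(c - expected) <= alpha * total for c in quadrant_counts)


print(is_uniform([25, 25, 25, 25], alpha=0.05))
print(is_uniform([28, 24, 24, 24], alpha=0.05))
print(is_uniform([70, 10, 10, 10], alpha=0.05))
```

A larger α accepts more skew as "uniform" and therefore stops repartitioning earlier.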
Inefficient UpJoin
SR-Join (Similarity Related Join)
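[Equation: compares the density $D_{w_i} / A_{w_i}$ of each quadrant $w_i$ with the density $D_w / A_w$ of $w$, where $A$ is the area and ρ is the % variation of density]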
Identify dense and sparse quadrants
If the distribution is similar then apply HBSJ or NLSJ
Else repartition
Experimental setup
Implementation
Server: Unix
Client: HP iPAQ PDA (WiFi network, 400MHz RISC CPU, 64MB RAM, Windows Pocket PC)
Datasets
Synthetic: 1K – 10K points, varying skew
Real: Roads and railways of Germany
Setting the parameters
[Figure: two plots, one for α (for UpJoin) and one for ρ (for SR-Join), each labeled "Uniform"]
Real Dataset
[Figure: plot labeled "Uniform"]
Comparison with SemiJoin
• SemiJoin: uses intermediate levels of the R-tree index
• We cannot use it in practice, because we cannot access the index
[Figure: plot labeled "Uniform"]
Conclusions
Distributed spatial joins on mobile devices
No mediator, non-collaborative servers, limited set of supported operators
Two algorithms: UpJoin and SR-Join
Both estimate the datasets’ distribution
Future work
Support multi-way spatial joins
Improve the accuracy of the cost model
Questions?