![Page 1: Spatial Index Structures › csci585 › Spring2010 › Docs › Session14.pdf · • Given a point (x, y) and the precision K find the pixel for the point and then compute the z-value](https://reader033.vdocuments.site/reader033/viewer/2022060504/5f1d43feaf3b787a555ae1c6/html5/thumbnails/1.jpg)
CSCI585
Spatial Index Structures
Instructor: Cyrus Shahabi
Outline
• Grid File
• Z-ordering
• Hilbert Curve
• Quad Tree
– PM
– PR
• R Tree (next session)
– R* Tree
– R+ Tree
Grid File
• Hashing method for multidimensional points (an extension of extensible hashing)
• Idea: use a grid to partition the space; each cell is associated with one page
• Two-disk-access principle: an exact-match query needs at most two disk accesses
Grid File
• Start with one bucket for the whole space.
• Select dividers along each dimension. Partition space into cells
• Dividers cut all the way across the space.
Grid File
• Each cell corresponds to 1 disk page.
• Many cells can point to the same page.
• The cell directory is potentially exponential in the number of dimensions.
Grid File Implementation
• Dynamic structure using a grid directory
– Grid array: a 2-dimensional array of pointers to buckets (this array can be large and disk resident): G(0, …, nx-1, 0, …, ny-1)
– Linear scales: two 1-dimensional arrays that are used to access the grid array (kept in main memory): X(0, …, nx-1), Y(0, …, ny-1)
Example
[Figure: linear scale X and linear scale Y index into the grid directory, whose entries point to buckets (disk blocks).]
The scales do not need to be uniform
[Figure: non-uniform scales X1 to X4 and Y1 to Y3 partition the space into cells that map to grid-file buckets.]
Grid File Search
• Exact Match Search: at most 2 I/Os, assuming the linear scales fit in memory.
– First use the linear scales to determine the index into the cell directory.
– Access the cell directory to retrieve the bucket address (may cause 1 I/O if the cell directory does not fit in memory).
– Access the appropriate bucket (1 I/O).
• Range Queries:
– Use the linear scales to determine the indexes into the cell directory.
– Access the cell directory to retrieve the addresses of the buckets to visit.
– Access the buckets.
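The search steps above can be sketched as follows. This is a minimal in-memory illustration, not the structure's actual on-disk layout: the names `x_scale`, `y_scale`, `directory`, `buckets`, and the sample points are all hypothetical, and the buckets stand in for disk pages.

```python
from bisect import bisect_right

# Hypothetical toy grid file (all names and data are illustrative).
# Linear scales hold the interior divider positions along each axis.
x_scale = [25, 50, 75]   # 4 columns: x<25, 25<=x<50, 50<=x<75, x>=75
y_scale = [50]           # 2 rows:    y<50, y>=50

# Grid directory: directory[i][j] -> bucket id. Several cells share a bucket.
directory = [
    ["b0", "b1"],
    ["b0", "b1"],
    ["b2", "b3"],
    ["b2", "b3"],
]

# Buckets stand in for disk pages.
buckets = {
    "b0": [(10, 20), (40, 45)],
    "b1": [(30, 80)],
    "b2": [(60, 10), (90, 30)],
    "b3": [(70, 70)],
}

def cell_of(x, y):
    """Use the linear scales (in memory) to find the directory index."""
    return bisect_right(x_scale, x), bisect_right(y_scale, y)

def exact_match(x, y):
    i, j = cell_of(x, y)
    bucket_id = directory[i][j]            # directory access (at most 1 I/O)
    return (x, y) in buckets[bucket_id]    # bucket access (1 I/O)

def range_query(x_lo, x_hi, y_lo, y_hi):
    i_lo, j_lo = cell_of(x_lo, y_lo)
    i_hi, j_hi = cell_of(x_hi, y_hi)
    # Collect each bucket once, even if many cells point to it.
    bucket_ids = {directory[i][j]
                  for i in range(i_lo, i_hi + 1)
                  for j in range(j_lo, j_hi + 1)}
    return sorted(p for b in bucket_ids for p in buckets[b]
                  if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi)
```

Collecting bucket ids in a set reflects the point that many cells can share one page, so a range query should read each page only once.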
Grid File Insertions
• Determine the bucket into which the insertion must occur.
• If there is space in the bucket, insert.
• Else, split the bucket.
– How to choose a good dimension to split?
– Answer: create convex regions for buckets.
• If the bucket split causes the cell directory to split, do so and adjust the linear scales.
• Inserting these new directory entries potentially requires a complete reorganization of the cell directory, which is expensive.
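One way to picture the insertion procedure is the toy class below. It is a sketch under several assumptions that the slides do not specify: a bucket capacity of 2, a split axis chosen by the larger coordinate spread, a new divider placed at a median coordinate, and a Python dict standing in for the grid array. The class and method names are hypothetical.

```python
from bisect import bisect_right

class GridFile:
    """Toy in-memory grid file. Capacity and split policy are assumptions."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.scales = [[], []]          # linear scales: interior dividers for x, y
        self.directory = {(0, 0): 0}    # cell (i, j) -> bucket id
        self.buckets = {0: []}          # bucket id -> points (one "page" each)

    def cell_of(self, p):
        return tuple(bisect_right(self.scales[a], p[a]) for a in (0, 1))

    def insert(self, p):
        b = self.directory[self.cell_of(p)]
        self.buckets[b].append(p)
        if len(self.buckets[b]) > self.capacity:
            self._split(b)

    def _split(self, b):
        cells = [c for c, v in self.directory.items() if v == b]
        for axis in (0, 1):
            idx = sorted({c[axis] for c in cells})
            if len(idx) > 1:            # bucket spans several cells: reuse a divider
                cut = idx[len(idx) // 2]
                break
        else:                           # single cell: the directory itself must grow
            axis, cut = self._add_divider(b, cells[0])
            cells = [c for c, v in self.directory.items() if v == b]
        new = max(self.buckets) + 1
        for c in cells:                 # both halves stay rectangular (convex)
            if c[axis] >= cut:
                self.directory[c] = new
        pts = self.buckets[b]
        self.buckets[b] = [q for q in pts if self.cell_of(q)[axis] < cut]
        self.buckets[new] = [q for q in pts if self.cell_of(q)[axis] >= cut]
        for bid in (b, new):            # a reused divider may leave one side overfull
            if len(self.buckets[bid]) > self.capacity:
                self._split(bid)

    def _add_divider(self, b, cell):
        pts = self.buckets[b]
        axis = max((0, 1), key=lambda a: max(q[a] for q in pts) - min(q[a] for q in pts))
        values = sorted({q[axis] for q in pts})
        if len(values) < 2:
            raise ValueError("cannot separate identical points")
        k = cell[axis]
        self.scales[axis].insert(k, values[len(values) // 2])
        grown = {}
        for c, v in self.directory.items():
            lo = list(c)
            if c[axis] > k:
                lo[axis] += 1           # cells beyond the new divider shift by one
            grown[tuple(lo)] = v
            if c[axis] == k:            # the cut row or column is duplicated
                hi = list(c)
                hi[axis] += 1
                grown[tuple(hi)] = v
        self.directory = grown
        return axis, k + 1
```

Note how `_add_divider` rebuilds the whole directory: that is the reorganization cost the last bullet warns about, since a single new divider touches a full row or column of cells.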
Grid File Deletions
• Deletions may decrease the space utilization, so buckets are merged.
• We need to decide which cells to merge and a merging threshold.
• Two merging policies: buddy system and neighbor system
– Buddy system: a bucket can merge with only one buddy in each dimension
– Neighbor system: merge adjacent regions if the result is a rectangle
One dimensional embedding
• Example: Z-ordering, or bit-interleaving
• Basic assumption: finite precision in the representation of each coordinate, $K$ bits ($2^K$ values)
• The address space is a square (image), represented as a $2^K \times 2^K$ array
• Each element is called a pixel
Z-Ordering
– Find a linear order for the cells of the grid while maintaining “locality” (i.e., cells close to each other in space are also close to each other in the linear order)
– Define this order recursively for a grid that is obtained by hierarchical subdivision of space
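[Figure: recursive Z-order numbering of the grid cells, with 1-bit labels (0, 1) at the first level of subdivision and 2-bit labels (00, 01, 10, 11) at the second.]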
Z-ordering
• Impose a linear ordering on the pixels of the image → a 1-dimensional problem
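[Figure: 4 x 4 pixel grid with both axes labeled 00, 01, 10, 11, showing pixels A and B.]
$Z_A = \mathrm{shuffle}(x_A, y_A) = \mathrm{shuffle}(\text{"01"}, \text{"11"}) = 0111_2 = 7_{10}$
$Z_B = \mathrm{shuffle}(\text{"01"}, \text{"01"}) = 0011_2$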
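The shuffle operation can be sketched as plain bit interleaving. The function below assumes, as the two worked values above imply, that the x bit comes first in each pair; the function body itself is an illustration, not code from the lecture.

```python
def shuffle(x_bits: str, y_bits: str) -> str:
    """Interleave two equal-length bit strings, taking the x bit first."""
    if len(x_bits) != len(y_bits):
        raise ValueError("both coordinates must use the same precision K")
    return "".join(xb + yb for xb, yb in zip(x_bits, y_bits))

z_a = shuffle("01", "11")   # pixel A
z_b = shuffle("01", "01")   # pixel B
print(z_a, int(z_a, 2))     # 0111 7
print(z_b, int(z_b, 2))     # 0011 3
```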
Z-ordering
• Given a point (x, y) and the precision K, find the pixel for the point and then compute the z-value
• Given a set of points, use a B+-tree to index the z-values
• A range (rectangular) query in 2-d is mapped to a set of ranges in 1-d
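A rough sketch of this pipeline follows. It assumes points lie in the unit square and uses a sorted Python list with binary search as a stand-in for the B+-tree (both support ordered range scans on the key); `pixel_of`, `z_value`, `scan`, and the sample points are hypothetical names and data.

```python
from bisect import bisect_left, bisect_right

K = 2  # bits of precision per coordinate (2^K x 2^K pixels)

def pixel_of(x, y, k, extent=1.0):
    """Map a point of [0, extent) x [0, extent) to integer pixel coordinates."""
    n = 1 << k
    return (min(int(x / extent * n), n - 1),
            min(int(y / extent * n), n - 1))

def z_value(px, py, k):
    """Interleave the k bits of px and py, x bit first."""
    z = 0
    for bit in range(k - 1, -1, -1):
        z = (z << 2) | (((px >> bit) & 1) << 1) | ((py >> bit) & 1)
    return z

# Index the points by z-value; the sorted list plays the role of the B+-tree.
points = [(0.30, 0.80), (0.30, 0.30), (0.90, 0.10)]
index = sorted((z_value(*pixel_of(x, y, K), K), (x, y)) for x, y in points)
keys = [z for z, _ in index]

def scan(z_lo, z_hi):
    """Return the points whose z-value lies in the 1-d range [z_lo, z_hi]."""
    lo, hi = bisect_left(keys, z_lo), bisect_right(keys, z_hi)
    return [p for _, p in index[lo:hi]]
```

A 2-d rectangular query would then call `scan` once per 1-d range it maps to.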
Queries
• Find the z-values that are contained in the query, then group them into ranges
[Figure: 4 x 4 pixel grid with both axes labeled 00, 01, 10, 11, showing query rectangles QA and QB.]
QA → range [4, 7]
QB → ranges [2, 3] and [8, 9]
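The mapping from a query rectangle to 1-d ranges can be sketched by enumerating the covered pixels and merging consecutive z-values. The pixel coordinates used for QA and QB below are inferred from the ranges on the slide under the x-bit-first shuffle convention, so treat them as an assumption; enumerating every pixel is also only practical for small queries.

```python
def z_value(px, py, k):
    """Interleave the k bits of px and py, x bit first."""
    z = 0
    for bit in range(k - 1, -1, -1):
        z = (z << 2) | (((px >> bit) & 1) << 1) | ((py >> bit) & 1)
    return z

def z_ranges(x_lo, x_hi, y_lo, y_hi, k):
    """Map an inclusive pixel rectangle to a list of inclusive z-value ranges."""
    zs = sorted(z_value(x, y, k)
                for x in range(x_lo, x_hi + 1)
                for y in range(y_lo, y_hi + 1))
    ranges = []
    for z in zs:
        if ranges and z == ranges[-1][1] + 1:
            ranges[-1][1] = z           # extend the current run
        else:
            ranges.append([z, z])       # start a new run
    return [tuple(r) for r in ranges]

print(z_ranges(0, 1, 2, 3, 2))  # QA (assumed x in 0..1, y in 2..3): [(4, 7)]
print(z_ranges(1, 2, 0, 1, 2))  # QB (assumed x in 1..2, y in 0..1): [(2, 3), (8, 9)]
```

QA is aligned with a quadrant and collapses to one range, while QB straddles a quadrant boundary and needs two.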
Hilbert Curve
• We want points that are close in 2-d to be close in 1-d
• Note that in 2-d each point has 4 neighbors, whereas in 1-d it has only 2
• The Z-curve has some “jumps” that we would like to avoid
• The Hilbert curve avoids the jumps
Hilbert Curve: example
• It has been shown that, in general, the Hilbert curve is better than the other space-filling curves for retrieval *
• $H_i$: the order-$i$ Hilbert curve for a $2^i \times 2^i$ array
* H. V. Jagadish: Linear Clustering of Objects with Multiple Attributes. ACM SIGMOD Conference 1990: 332-342
[Figure: Hilbert curves for N = 1, 2, 3, 4, with cell numbers 0 to 15 shown along the curve.]
$H: [0, 2^N - 1]^d \to [0, 2^{Nd} - 1]$
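For d = 2, the mapping H can be sketched with the widely used iterative quadrant-rotation algorithm. The orientation it produces (which corner the curve starts in) is a convention of this particular algorithm and may differ from the figure; `hilbert_index` is a hypothetical name.

```python
def hilbert_index(x, y, order):
    """Position of pixel (x, y) along a Hilbert curve over a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)    # which quadrant, in curve order
        if ry == 0:                     # rotate/flip so the sub-curve lines up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

# Walk the order-2 curve: sort the 16 pixels by their Hilbert index.
order = 2
n = 1 << order
path = [xy for _, xy in sorted((hilbert_index(x, y, order), (x, y))
                               for x in range(n) for y in range(n))]
print(path[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Every pair of consecutive pixels on `path` is a pair of grid neighbors, which is the "no jumps" property that distinguishes this curve from the Z-curve.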
Quad Trees
• Region Quadtree
– The blocks are required to be disjoint
– Blocks have standard sizes (squares whose sides are powers of two)
– Blocks are at standard locations
– Based on successive subdivision of the image array into four equal-size quadrants
– If the region does not cover the entire array, subdivide into quadrants, sub-quadrants, etc.
– A variable-resolution data structure
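The successive subdivision can be sketched on a small binary image. This is an illustrative recursion, assuming a square image whose side is a power of two, row 0 at the top (north), and a nested-dict representation chosen only for readability; `build` and `count_leaves` are hypothetical names.

```python
def build(image, x=0, y=0, size=None):
    """Return 0 or 1 for a uniform block, else a dict of four sub-quadrants."""
    if size is None:
        size = len(image)
    first = image[y][x]
    if all(image[y + dy][x + dx] == first
           for dy in range(size) for dx in range(size)):
        return first                    # uniform block: stop subdividing
    h = size // 2
    return {
        "NW": build(image, x, y, h),
        "NE": build(image, x + h, y, h),
        "SW": build(image, x, y + h, h),
        "SE": build(image, x + h, y + h, h),
    }

def count_leaves(node):
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
tree = build(image)
print(count_leaves(tree))  # 7 blocks instead of 16 pixels
```

Only the SW quadrant is mixed, so only it is subdivided; this is what makes the structure variable resolution.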
Example of Region Quadtree
[Figure: a region over an image array and its region quadtree. Leaf blocks are numbered 1 to 19, internal nodes are labeled A to F, and the children of each node are ordered NW, NE, SW, SE.]
PR Quadtree
• PR (Point-Region) quadtree
• Regular decomposition (similar to Region quadtree)
• Independent of the order in which data points are inserted into it
• Drawback: if two points are very close, the decomposition can be very deep
Example of PR Quadtree
[Figure: PR quadtree over the square with corners (0,0), (100,0), (100,100), (0,100); internal nodes are labeled A to F.]
Data points:
• Seattle (1,55)
• Toronto (62,77)
• Buffalo (82,65)
• Denver (5,45)
• Chicago (35,42)
• Omaha (27,35)
• Mobile (52,10)
• Atlanta (85,15)
• Miami (90,5)
Subdivide into quadrants until the two points are located in different regions
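The subdivision rule can be sketched with the city data from this example. The class below is a minimal illustration, assuming half-open quadrants (a point on a split line goes to the upper or right side) and rejecting duplicate coordinates; the class and method names are hypothetical.

```python
class PRQuadtree:
    """Leaf holds at most one point; a full leaf subdivides into four quadrants."""

    def __init__(self, x0, y0, size):
        self.x0, self.y0, self.size = x0, y0, size
        self.point = None       # (name, x, y) when this leaf stores a point
        self.children = None    # four subtrees once the node is subdivided

    def _child_for(self, x, y):
        h = self.size / 2
        col = 1 if x >= self.x0 + h else 0
        row = 1 if y >= self.y0 + h else 0
        return self.children[2 * row + col]

    def insert(self, name, x, y):
        if self.children is None:
            if self.point is None:
                self.point = (name, x, y)
                return
            if self.point[1:] == (x, y):
                raise ValueError("duplicate coordinates cannot be separated")
            # Leaf already holds a point: subdivide and push the old point down.
            old, self.point = self.point, None
            h = self.size / 2
            self.children = [PRQuadtree(self.x0 + cx * h, self.y0 + cy * h, h)
                             for cy in (0, 1) for cx in (0, 1)]
            self._child_for(old[1], old[2]).insert(*old)
        self._child_for(x, y).insert(name, x, y)

    def depth(self):
        if self.children is None:
            return 0
        return 1 + max(c.depth() for c in self.children)

    def shape(self):
        """Nested tuples describing the decomposition, for comparing trees."""
        if self.children is None:
            return self.point
        return tuple(c.shape() for c in self.children)

cities = [("Seattle", 1, 55), ("Toronto", 62, 77), ("Buffalo", 82, 65),
          ("Denver", 5, 45), ("Chicago", 35, 42), ("Omaha", 27, 35),
          ("Mobile", 52, 10), ("Atlanta", 85, 15), ("Miami", 90, 5)]

tree = PRQuadtree(0, 0, 100)
for city in cities:
    tree.insert(*city)
print(tree.depth())  # 3
```

Inserting the same cities in reverse order yields the same `shape()`, which illustrates the order-independence property; two points that are very close together would instead force many levels of subdivision.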
PM Quadtree
• PM (Polygonal-Map) quadtree family
– PM1 quadtree, PM2 quadtree, PM3 quadtree, PMR quadtree, etc.
• PM1 quadtree
– Based on regular decomposition of space
– Vertex-based implementation
– Criteria
• At most one vertex can lie in a region represented by a quadtree leaf
• If a region contains a vertex, it can contain no partial edge that does not include that vertex
• If a region contains no vertices, it can contain at most one partial edge
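The three PM1 criteria can be read as a test that decides whether a leaf is acceptable or must be subdivided further. The sketch below assumes the caller has already determined which vertices lie in the leaf and which edges pass through it (the geometric clipping is omitted); `pm1_leaf_ok` and its argument layout are hypothetical.

```python
def pm1_leaf_ok(vertices, edges):
    """Check the PM1 criteria for one leaf region.

    vertices: set of (x, y) map vertices lying inside the leaf.
    edges: list of (p, q) endpoint pairs, one per edge that has a
           partial edge inside the leaf.
    """
    if len(vertices) > 1:
        return False                        # at most one vertex per leaf
    if len(vertices) == 1:
        (v,) = vertices
        # Every partial edge in the leaf must be incident to that vertex.
        return all(v in edge for edge in edges)
    return len(edges) <= 1                  # no vertex: at most one partial edge

a, b, c = (10, 10), (40, 60), (70, 20)
print(pm1_leaf_ok({a}, [(a, b), (a, c)]))   # True
print(pm1_leaf_ok({a}, [(b, c)]))           # False
```

A quadtree builder would subdivide any leaf for which this check fails.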
PM Quadtree
[Figure: examples of a PM1 quadtree, a PM2 quadtree, and a PM3 quadtree.]
Example of PM1 Quadtree
[Figure: PM1 quadtree over the square with corners (0,0), (100,0), (100,100), (0,100).]
• Each node in a PM quadtree is a collection of partial edges (and a vertex)
• Each point record has two fields (x, y)
• Each partial edge has four fields (starting_point, ending_point, left region, right region)
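The two record layouts could be written down as follows. The field names follow the bullets above; the field types, and the use of string identifiers for regions, are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float

@dataclass(frozen=True)
class PartialEdge:
    starting_point: Point
    ending_point: Point
    left_region: str     # assumed: identifier of the polygon to the left
    right_region: str    # assumed: identifier of the polygon to the right

edge = PartialEdge(Point(0, 0), Point(100, 100), "R1", "R2")
```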
References
• National Technical University of Athens, Theoretical Computer Science II: Advanced Data Structures
• Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik: The Grid File: An Adaptable, Symmetric Multikey File Structure. ACM Trans. Database Syst. 9(1): 38-71 (1984)
• H. V. Jagadish: Linear Clustering of Objects with Multiple Attributes. ACM SIGMOD Conference 1990: 332-342