R-Tree

Upload: clarissa-walsh

Post on 04-Jan-2016


Page 1: R-Tree. 2 Spatial Database (Ia) Consider: Given a city map, ‘index’ all university buildings in an efficient structure for quick topological search

R-Tree


Spatial Database (Ia)

• Consider: Given a city map, ‘index’ all university buildings in an efficient structure for quick topological search.


Spatial Database (Ib)

• Consider: Given a city map, ‘index’ all university buildings in an efficient structure for quick topological search.

Spatial object: Contour (outline) of the area around the building(s).

Minimum bounding region (MBR) of the object.


Spatial Database (Ic)

• Consider: Given a city map, ‘index’ all university buildings in an efficient structure for quick relational-topological search.

MBR of the city neighbourhoods.

MBR of the city defining the overall search region.


Spatial Database (II)

Notion: To retrieve data items quickly and efficiently according to their spatial locations.
• Involves 2D regions.
• Need to support 2D range queries.
• Multiple return values desired: Answering a query region by reporting all spatial objects that are fully-contained-in or overlapping the query region (Spatial-Access Method – SAM).

In general:
• Spatial data objects often cover areas in multidimensional spaces.
• Spatial data objects are not well-represented by point-location.
• An ‘index’ based on an object’s spatial location is desirable.


The Indexing Approach

• A B-Tree (Rosenberg & Snyder, 1981) is an ordered, dynamic, multi-way structure of order m (i.e. each node has at most m children).
• The keys and the subtrees are arranged in the fashion of a search tree.
• Each node may contain a large number of keys, and the number of subtrees in each node, then, may also be large.
• The B-Tree is designed (among other objectives):
– to branch out in this large number of directions, and
– to contain a lot of keys in each node so that the height of the tree is relatively short.

[Figure: example B-Tree with several letter keys per node.]


The R-Tree Index Structure

• An R-Tree is a height-balanced tree, similar to a B-Tree.
• Index records in the leaf nodes contain pointers to the actual spatial objects they represent.
• Leaves in the structure all appear on the same level.
• Spatial searching requires visiting only a small number of nodes.
• The index is completely dynamic: inserts and deletes can be intermixed with searches.
• No periodic reorganisation is required.


The R-Tree Index Structure

• A spatial database consists of a collection of tuples representing spatial objects, known as Entries.
• Each Entry has a unique identifier that points to one spatial object, and its MBR; i.e. Entry = (MBR, pointer).


R-Tree Index Structure – Leaf Entries

• An entry E in a leaf node is defined as (Guttman, 1984):
  E = (I, tuple-identifier)
• Where I refers to the smallest bounding n-dimensional region (MBR) that encompasses the spatial data pointed to by its tuple-identifier.
• I is a series of closed intervals, one for each dimension of the bounding region.

Example. In 2D, I = (Ix, Iy), where Ix = [xa, xb], and Iy = [ya, yb].


R-Tree Index Structure – Leaf Entries

• In general, I = (I0, I1, …, In-1) for n dimensions, where Ik = [ka, kb].
• If either ka or kb (or both) is equal to ∞, the spatial object extends outward indefinitely along that dimension.
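The leaf-entry definition can be sketched in Python as follows. This is only an illustration: the class name `LeafEntry`, the field names, and the sample coordinates are assumptions, not part of the slides. Each interval is a closed pair [ka, kb], and `math.inf` stands in for an unbounded endpoint.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # closed interval [ka, kb]


@dataclass(frozen=True)
class LeafEntry:
    """Hypothetical leaf entry E = (I, tuple-identifier)."""
    intervals: Tuple[Interval, ...]  # I = (I0, I1, ..., In-1)
    tuple_id: int                    # stands in for the tuple-identifier

    def contains_point(self, point: Sequence[float]) -> bool:
        # A point lies in I when every coordinate falls in its closed interval.
        return all(ka <= x <= kb for (ka, kb), x in zip(self.intervals, point))


# 2D example: Ix = [0, 4], Iy = [1, 3].
bounded = LeafEntry(((0.0, 4.0), (1.0, 3.0)), tuple_id=1)

# Same Ix, but the object extends outward indefinitely along y.
unbounded = LeafEntry(((0.0, 4.0), (-math.inf, math.inf)), tuple_id=2)
```

Using infinite endpoints keeps the containment test unchanged; no special case is needed for unbounded objects.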


R-Tree Index Structure – Non-Leaf Entries

• An entry E in a non-leaf node is defined as:
  E = (I, child-pointer)
• Where the child-pointer points to a child of this node, and I is the MBR that encompasses all the regions in the entries of that child node.

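[Figure: a non-leaf node with entries I(A), I(B), …, I(M); the entry for B points to a child node with entries I(a), I(b), I(c), I(d), drawn as rectangles a, b, c, d inside rectangle B.]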
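As a minimal sketch of the non-leaf rule, the MBR stored in a parent entry can be computed as the union of the MBRs in the child node. The tuple layout `(xa, ya, xb, yb)`, the function name `union_mbr`, and the sample rectangles are assumptions made for illustration.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xa, ya, xb, yb)


def union_mbr(rects: List[Rect]) -> Rect:
    """Smallest rectangle that encloses every rectangle in `rects`."""
    xas, yas, xbs, ybs = zip(*rects)
    return (min(xas), min(yas), max(xbs), max(ybs))


# Hypothetical child-node entries a, b, c, d.
children = [(0, 0, 2, 1), (1, 2, 3, 4), (5, 1, 6, 2), (2, 3, 4, 6)]

# The I stored in the parent entry that points to this child node.
parent_mbr = union_mbr(children)
```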


Properties

Let M be the maximum number of entries that will fit in one node.
Let m ≤ M/2 be a parameter specifying the minimum number of entries in one node.

Then an R-Tree must satisfy the following properties:
1. Every leaf node contains between m and M index records, unless it is the root.
2. For each index-record Entry (I, tuple-identifier) in a leaf node, I is the MBR that spatially contains the n-dimensional data object represented by the tuple-identifier.
3. Every non-leaf node has between m and M children, unless it is the root.
4. For each Entry (I, child-pointer) in a non-leaf node, I is the MBR that spatially contains the regions in the child node.
5. The root has at least two children unless it is a leaf.
6. All leaves appear on the same level.
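The structural properties can be checked mechanically. The sketch below is an assumption-laden illustration, not Guttman's code: `Node`, `contains`, `leaf_depths`, and `is_valid` are hypothetical names, rectangles are `(xa, ya, xb, yb)` tuples, property 4 is checked only as containment (not minimality), and property 2 is skipped because it needs the exact geometries.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xa, ya, xb, yb)


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries: (MBR, tuple-identifier); non-leaf entries: (MBR, child Node).
    entries: List[tuple] = field(default_factory=list)


def contains(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def leaf_depths(node: Node, depth: int = 0) -> Set[int]:
    if node.is_leaf:
        return {depth}
    depths: Set[int] = set()
    for _, child in node.entries:
        depths |= leaf_depths(child, depth + 1)
    return depths


def check(node: Node, m: int, M: int, is_root: bool = True) -> bool:
    n = len(node.entries)
    if is_root:
        # Property 5: a non-leaf root has at least two children.
        if n > M or (not node.is_leaf and n < 2):
            return False
    elif not m <= n <= M:
        # Properties 1 and 3: fill bounds for every non-root node.
        return False
    if node.is_leaf:
        return True
    for mbr, child in node.entries:
        # Property 4 (containment part only).
        if not all(contains(mbr, e[0]) for e in child.entries):
            return False
        if not check(child, m, M, is_root=False):
            return False
    return True


def is_valid(root: Node, m: int, M: int) -> bool:
    # Property 6: all leaves appear on the same level.
    return check(root, m, M) and len(leaf_depths(root)) == 1


leaf_a = Node(True, [((0, 0, 1, 1), 1), ((1, 1, 2, 2), 2)])
leaf_b = Node(True, [((5, 5, 6, 6), 3), ((6, 6, 7, 8), 4)])
good_root = Node(False, [((0, 0, 2, 2), leaf_a), ((5, 5, 7, 8), leaf_b)])

underfull = Node(True, [((5, 5, 6, 6), 3)])  # one entry, below m = 2
bad_root = Node(False, [((0, 0, 2, 2), leaf_a), ((5, 5, 6, 6), underfull)])
```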


Node Overflow and Underflow

• A Node-Overflow happens when a new Entry is added to a fully packed node, causing the resulting number of entries in the node to exceed the upper-bound M.
• The ‘overflow’ node must be split, and all its current entries, as well as the new one, consolidated for local optimum arrangement.
• A Node-Underflow happens when one or more Entries are removed from a node, causing the remaining number of entries in that node to fall below the lower-bound m.
• The underflow node must be condensed, and its entries dispersed for global optimum arrangement.
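To make the overflow case concrete, here is a deliberately naive sketch. It is not the split heuristic of any particular R-Tree variant: on overflow it simply sorts the M + 1 entries by their lower x bound and cuts the list in half, which keeps both halves at or above m whenever m ≤ M/2. The names `add_entry` and `is_underflow` and the sample entries are assumptions.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xa, ya, xb, yb)
Entry = Tuple[Rect, int]                  # (MBR, pointer)


def is_underflow(entries: List[Entry], m: int) -> bool:
    """A non-root node underflows when it holds fewer than m entries."""
    return len(entries) < m


def add_entry(entries: List[Entry], new_entry: Entry, M: int) -> List[List[Entry]]:
    """Return the entry lists of the resulting node(s): one, or two after a split."""
    combined = entries + [new_entry]
    if len(combined) <= M:
        return [combined]
    # Overflow: naive split, ordered by lower x bound and cut in the middle.
    ordered = sorted(combined, key=lambda e: e[0][0])
    half = len(ordered) // 2
    return [ordered[:half], ordered[half:]]


# A fully packed node for M = 3, and a new entry that overflows it.
full = [((4, 0, 5, 1), 1), ((0, 0, 1, 1), 2), ((2, 0, 3, 1), 3)]
new = ((6, 0, 7, 1), 4)
nodes = add_entry(full, new, M=3)
```

A production split would choose the two groups to minimise area or overlap; the halving above only shows where the split is triggered and that both resulting nodes stay within the fill bounds.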


Spatial Indexes

Used to speed up spatial queries

Example:

Point query: return the geometric object that contains a given query point

Sequentially scanning all objects of a large collection to check whether they contain the query point requires many disk accesses and repeated evaluation of computationally expensive geometric predicates (e.g., containment, intersection, etc.)

Reducing the set of objects to be processed is highly desirable


Indexes for object-based and space-based representations

Indexes for raster data: based on recursive subdivision of the space
Example: quadtrees

Indexes for vector data: differ depending on the type of data
(extensions of quadtrees are used also for vector data)


Vector Data Indexing

• Different indexing methods are used for point, linear and polygonal data

• In the case of collections of polygons, instead of indexing the object geometries themselves, whose shapes might be complex, we consider an approximation of the geometry and index it instead

• Most commonly used approximation: minimum bounding rectangle (MBR) also called minimum bounding box (MBB)


MBRs

• By using the MBR as the geometric key for building the spatial index, we save the cost of evaluating expensive geometric predicates during index traversal (a geometric test against an MBR takes constant time)

• Example: point-in-polygon test

• In addition, the space required to store a rectangle is constant (2 points)

[Figure: a rectangle stored as two corner points, (x,y) and (x,y).]
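A small sketch of why the MBR is a cheap key: it is stored as two corner points, and testing a point against it takes four comparisons regardless of how many vertices the polygon has. The function names and the L-shaped sample polygon are assumptions for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
MBR = Tuple[Point, Point]  # (lower-left corner, upper-right corner)


def mbr_of(polygon: List[Point]) -> MBR:
    """Minimum bounding rectangle of a polygon given as a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys)), (max(xs), max(ys))


def point_in_mbr(point: Point, mbr: MBR) -> bool:
    """Constant-time test: four comparisons."""
    (xa, ya), (xb, yb) = mbr
    x, y = point
    return xa <= x <= xb and ya <= y <= yb


# Hypothetical L-shaped polygon with six vertices.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
l_mbr = mbr_of(l_shape)
```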


MBRs (cont'd)

An operation involving a spatial predicate on a collection of objects indexed on their MBRs is performed in two steps:

1. Filter step: selects the objects whose MBR satisfies the spatial predicate (by traversing the spatial index and applying the predicate to the MBRs)

2. Refinement step: the objects that pass the filter step are a superset of the solution. An MBR might satisfy the predicate but the corresponding object might not

[Figure: an object obj, its MBR, and a query point P.]


Refinement step

The objects that pass the filter step are a superset of the solution. An MBR might satisfy the predicate but the corresponding object might not.

Therefore, in this step the spatial predicate is applied to the actual geometry of the object.

[Figure: an object obj, its MBR, and a query point P.]
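The two steps can be sketched for a point query as follows. This is an illustrative sketch under assumptions: the filter here scans a plain dictionary of MBRs instead of traversing a spatial index, the exact test is a standard ray-casting point-in-polygon routine, and the object names and coordinates are made up.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def mbr_of(polygon: Polygon) -> Tuple[float, float, float, float]:
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_query(objects: Dict[str, Polygon], p: Point):
    # 1. Filter step: cheap test against each MBR.
    candidates = []
    for name, poly in objects.items():
        xa, ya, xb, yb = mbr_of(poly)
        if xa <= p[0] <= xb and ya <= p[1] <= yb:
            candidates.append(name)
    # 2. Refinement step: exact test on the actual geometry.
    results = [name for name in candidates if point_in_polygon(p, objects[name])]
    return candidates, results


objects = {
    "L": [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)],
    "square": [(5, 5), (7, 5), (7, 7), (5, 7)],
}
```

With these sample shapes, the point (3, 2) passes the filter for "L" (it lies inside the MBR) but is rejected by the refinement step, which is exactly the false-positive case the slide describes.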


Oracle Spatial Query Model

[Figure: query pipeline, in order]

1. Spatial Layer Data: table where coordinates are stored
2. Primary Filter (Spatial Index): index retrieves area of interest
3. Reduced Data Set
4. Secondary Filter (Spatial Functions): procedures that determine exact relationship
5. Exact Result Set


Oracle Spatial Indexing Methods

Two types of indexes are implemented in Oracle Spatial:

• R-trees

• Quadtrees


R-trees

Based on MBRs (minimum bounding rectangles)

Defined for indexing 2D objects (can be extended to higher dimensions but implemented only for 2D in Oracle Spatial)

MBRs of geometric objects form the leaves of the index tree

Multiple MBRs are grouped into larger rectangles (MBRs) to form intermediate nodes in the tree

Repeat until one rectangle is left that contains everything
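The grouping idea can be sketched as a simple bottom-up build. This is one possible packing strategy chosen for brevity, not the construction used by Oracle Spatial or by dynamic insertion: objects are sorted by lower x bound, packed M at a time into nodes, and the process repeats on the resulting MBRs until a single rectangle remains. It ignores the minimum fill m, so the last node on a level may be underfull. All names and sample data are assumptions.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (xa, ya, xb, yb)


def union_mbr(rects: List[Rect]) -> Rect:
    xas, yas, xbs, ybs = zip(*rects)
    return (min(xas), min(yas), max(xbs), max(ybs))


def build(object_mbrs: Dict[int, Rect], M: int):
    """Return (root, height); a node is a pair (MBR, list of entries)."""
    if not object_mbrs:
        raise ValueError("need at least one object")
    # Leaf-level entries: (MBR of one object, pointer to its geometry).
    level = [(mbr, oid) for oid, mbr in
             sorted(object_mbrs.items(), key=lambda kv: kv[1][0])]
    height = 0
    while True:
        # Group up to M entries per node and bound each group by a larger MBR.
        groups = [level[i:i + M] for i in range(0, len(level), M)]
        nodes = [(union_mbr([mbr for mbr, _ in g]), g) for g in groups]
        height += 1
        if len(nodes) == 1:  # one rectangle left that contains everything
            return nodes[0], height
        level = nodes


# Nine hypothetical unit squares laid out along the x axis.
objects = {i: (i, 0, i + 1, 1) for i in range(9)}
root, height = build(objects, M=3)
```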


R-trees

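[Figure: objects 1–9 enclosed by MBRs a, b, c, d, which are grouped into rectangles R and S under the root; next to it, the corresponding R-tree with the root at the top, entries R and S below it, leaf nodes a, b, c, d, and pointers to geometries 1–9.]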


Remark: nodes

• Intermediate nodes store:
– MBRs of collections of objects

• Leaf nodes store:
– MBRs of individual objects
– Pointers to storage location of the exact geometry


Building R-trees

An R-tree is a depth-balanced tree in which each node corresponds to a disk page (i.e., the number of entries in each node is limited)

The structure satisfies the following properties:

1. For all nodes in the tree (except the root) the number of entries is between m and M

2. The root has at least two children (unless it is a leaf)

3. All leaves are at the same level


Example (1)

[Figure: objects 1–9 enclosed by MBRs a, b, c, d, grouped into R and S under the root, with the corresponding R-tree and pointers to geometries 1–9.]

m = 2; M = 3


Example (2)

m = 2; M = 4

[Figure: R-tree whose root has entries R1, R2, R3; the subtrees below them are elided (…).]


Searching R-trees

We consider two types of queries:

1. point query: “what object contains the query point”

2. window query: “what objects intersect the query window”


Basic spatial queries (1)

Containment Query: Given a spatial object O, find all objects in the collection that completely contain O. When O is a point, the query is called Point Query

[Figure: two panels. Containment Query, with query object O; Point Query, with query point P (also Point-in-polygon, or Point Location).]


Basic spatial queries (2)

Region Query: Given a region R, find all objects in the collection that intersect R. When R is a rectangle, the query is called Window Query

[Figure: two panels. Region Query, with query region R; Window Query, with rectangular query window R.]


Searching R-trees: window query

Starting at the root node, compare the search window with the MBRs stored at each node, and descend only into entries whose MBR intersects the window

Stop at the leaf nodes and compare the contained geometries with the search window
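A window query over an R-tree can be sketched as a recursive descent. This is a minimal illustration under assumptions: `Node`, `intersects`, and `window_query` are hypothetical names, rectangles are `(xa, ya, xb, yb)` tuples, and the function returns only the candidate pointers found at the leaves; the exact comparison against the geometries is left to a separate refinement step.

```python
from dataclasses import dataclass
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xa, ya, xb, yb)


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries: (MBR, pointer to geometry); non-leaf entries: (MBR, child Node).
    entries: List[tuple]


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def window_query(node: Node, window: Rect) -> List[int]:
    hits: List[int] = []
    for mbr, ref in node.entries:
        if not intersects(mbr, window):
            continue  # prune: nothing below this entry can intersect the window
        if node.is_leaf:
            hits.append(ref)  # candidate for the exact geometry test
        else:
            hits.extend(window_query(ref, window))
    return hits


# Hypothetical two-level tree with four objects.
leaf_a = Node(True, [((0, 0, 1, 1), 1), ((2, 0, 3, 1), 2)])
leaf_b = Node(True, [((10, 10, 11, 11), 3), ((12, 10, 13, 12), 4)])
root = Node(False, [((0, 0, 3, 1), leaf_a), ((10, 10, 13, 12), leaf_b)])
```

For the window (0.5, 0.5, 2.5, 2) only `leaf_a` is visited, because the MBR of `leaf_b` does not intersect the window.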


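Searching R-trees: window query

Example:

[Figure: a window query on the example R-tree (root, R and S, leaf nodes a, b, c, d, objects 1–9), showing the search window over the rectangles and the pointers to geometries reached in the tree.]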


Example: remarks

If no MBRs are used: check the query window against all geometries for intersection (computationally expensive)

In some cases, using R-trees to structure the set of MBRs can cause more tests (against MBRs) to be done. In general, this is not the case


Searching R-trees: point query

Starting at the root node, test the query point for inclusion in the MBRs stored at each node, and descend only into entries whose MBR contains the point

Stop at the leaf nodes and test the query point for inclusion in the exact geometries
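The point query follows the same descent, with point-in-rectangle as the test. The sketch below uses a hypothetical two-level tree (the names `Node` and `point_query` and all coordinates are assumptions) in which the two root entries overlap, so a query point inside the overlap forces the search down both branches. The returned pointers are candidates only; the test against the exact geometries would follow.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries: (MBR, pointer to geometry); non-leaf entries: (MBR, child Node).
    entries: List[tuple]


def point_query(node: Node, point: Point) -> List[int]:
    x, y = point
    candidates: List[int] = []
    for (xa, ya, xb, yb), ref in node.entries:
        if xa <= x <= xb and ya <= y <= yb:
            if node.is_leaf:
                candidates.append(ref)
            else:
                candidates.extend(point_query(ref, point))
    return candidates


# Root entries R and S overlap in the region [3.5, 5] x [3.5, 5].
leaf_r = Node(True, [((0, 0, 2, 2), 1), ((3, 3, 5, 5), 2)])
leaf_s = Node(True, [((3.5, 3.5, 4.5, 6), 3), ((6, 6, 8, 8), 4)])
root = Node(False, [((0, 0, 5, 5), leaf_r), ((3.5, 3.5, 8, 8), leaf_s)])
```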


Exercise: point query

[Figure: the example R-tree (root, R and S, leaf nodes a, b, c, d, objects 1–9, pointers to geometries) with a query point P marked on the map.]


Searching R-trees: point query

Example:

[Figure: the point query for P on the example R-tree, showing the root, entries R and S, the leaf nodes a and b, and the pointers to geometries reached in the tree.]


Summary

• Indexing Vector Spatial Data

• R-trees:

• Based on MBRs (leaves)

• Root: whole dataset

• Intermediate nodes: groups of MBRs (objects) – not a partition of the underlying space!


Important remarks

• Note that the MBRs (at all levels) can overlap

• A rectangle is stored as child of a bigger rectangle only if completely contained in it

Example: