database systems ii index structures
DESCRIPTION
Database Systems II Index Structures. Introduction. We have discussed the organization of records in secondary storage blocks. Records have an address, either logical or physical. But SQL queries reference attribute values, not record addresses. SELECT * FROM R WHERE a=10; - PowerPoint PPT PresentationTRANSCRIPT
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 1
Database Systems II
Index Structures
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 2
IntroductionWe have discussed the organization of records in secondary storage blocks.Records have an address, either logical or physical.But SQL queries reference attribute values, not record addresses.
SELECT * FROM R WHERE a=10;How to find the records that have certain specified attribute values?
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 3
Introduction
recordID1
recordID2. . .
value
value
value matchingrecords
indexblocksholdingrecords
value
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 4
Index-Structure Basics
Storage structures consist of files.Data files store, e.g., the records of a relation.Search key: one or more attributes for which we want to be able to search efficiently.Index file over a data file for some search key associates search key values with pointers to (recordID = rid) data file records that have this value.Sequential file: records sorted according to their primary key.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 5
Index-Structure Basics
Sequential File
2010
4030
6050
8070
10090
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 6
Index-Structure Basics
Three alternatives for data entries k*:- record with key value k- <k, rid of record with search key value k>- <k, list of rids of records with search key k>
Choice is orthogonal to the indexing technique used to locate entries k*
Two major indexing techniques:- tree-structures- hash tables.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 7
Index-Structure BasicsDense index: one index entry for every record in the data file.Sparse index: index entries only for some of the record in the data file. Typically, one entry per block of the data file.Primary index: determines the location of data file records, i.e. order of index entries same as order of data records.Secondary index does not determine data location.Can only have one primary index, but multiple secondary indexes.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 8
Index-Structure BasicsSequential File
2010
4030
6050
8070
10090
Dense Index
10203040
50607080
90100110120
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 9
Index-Structure BasicsSequential File
2010
4030
6050
8070
10090
Sparse Index
10305070
90110130150
170190210230
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 10
Index-Structure Basics
1010
2010
3020
3030
4540
10203040
Duplicate key values– sparse index– data entry for first new key from block
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 11
Index-Structure Basics
Sparse index: - requires less index space per record, - can keep more of index in memory,- needed for secondary indexes.
Dense index: - can tell if any record exists without accessing data file,- better for insertions.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 12
Index-Structure Basics
Index file can become very large, e.g. at least one tenth of data file size for records with ten attributes of same length.
To speed-up index access, add a second index level on top of the first index level, a third level on top of the second one, . . .
First level can be dense, other levels are sparse.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 13
Index-Structure BasicsSequential File
2010
4030
6050
8070
10090
Sparse 2nd level
10305070
90110130150
170190210230
1090
170250
330410490570
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 14
Index-Structure Basics
Index structure needs to support Equality Queries and Range Queries.
Equality query: one attribute value specified, e.g. docID = 100, or age = 18.
Range query: attribute range specified, e.g. 30 <= age <= 40.
Index structures must also support DB modifications, i.e. insertions, deletions and updates.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 15
ISAM
ISAM = Index Sequential Access Method
Hierarchy of index files (tree structure)
Non-leaf
blocks
blocks
Overflow block
Primary blocks
Leaf
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 16
ISAM
Leaf blocks contain data entries.
Non-leaf blocks contain pairs (ki,pi), where
ki is a search key value and pi a pointer to
the (first of the) records with that search key value.
P0
K1 P
1K 2 P
2K m
P m
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 17
ISAM
File Creation
Leaf (data) blocks allocated sequentially, sorted by search key.
Then non-leaf blocks allocated, then space for overflow blocks.
Index entries: <search key value, block id>; they ‘direct’ search for data entries, which are in leaf blocks.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 18
ISAM
10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*
20 33 51 63
40
Root
Example
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 19
ISAM
Index Operations SearchStart at root; use key comparisons to go to leaf.Insert Find leaf data entry belongs to, and put it there.DeleteFind and remove from leaf; if empty overflow block, de-allocate.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 20
ISAMExample
After inserting 23*, 48*, 41*, 42*
10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*
20 33 51 63
40
Root
23* 48* 41*
42*
Overflow
blocks
Leaf
Index
blocks
blocks
Primary
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 21
ISAM
Example
After deleting 42*, 51*, 97*.51* appears in index level, but not in leaf level.
10* 15* 20* 27* 33* 37* 40* 46* 55* 63*
20 33 51 63
40
Root
23* 48* 41*
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 22
ISAM
Discussion Inserts / deletes affect only leaf pages.
static tree structureTree can degenerate into a linear list of overflow blocks.In this case, ISAM looses all advantages compared to a simple, non-hierarchical index file.Can we maintain a balanced tree structure dynamically under insertions / deletions?
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 23
B-Trees
IntroductionTree node corresponds to block.B-trees are balanced, i.e. all leaves at same level. This guarantees efficient access.B-trees guarantee minimum space utilization.n (order): maximum number of keys per node, minimum number of keys is roughly n/2.Exception: root may have one key only.m + 1 pointers in node, m actual number of keys.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 24
B-TreesIntroduction
Index Entries(inner nodes)
Data Entries
(leaf nodes)
leaf nodes are linked in sequential order
this B-tree variant is normally referred to as B+-tree
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 25
B-TreesIntroduction
Node format: (p1,k1, . . ., pn,kn,pn+1)pi: pointer, ki: search key
Node with m pointers has m children and corresponding sub-trees.n+1-th index entry has only pointer. At leaf level, this pointer references the next leaf node.Search key property: i-th subtree contains data entries with search key k <ki, i+1-th subtree contains data entries with search key k >= ki.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 26
B-TreesExample
Root n = 3
10
0
120
150
180
30
3 5 11
30
35
100
101
110
120
130
150
156
179
180
200
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 27
B-TreesExample
to keys to keys to keys to keys
< 57 57 k<81 81k<95 95
57
81
95
Non-leaf (inner) node
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 28
B-TreesExample
From non-leaf node
to next leafin sequence
57
81
95
To r
eco
rd
wit
h k
ey 5
7
To r
eco
rd
wit
h k
ey 8
1
To r
eco
rd
wit
h k
ey 8
5
Leaf node
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 29
B-TreesSpace utilization
full node min. node
Non-leaf
Leaf
120
150
180
30
3 5 11
30
35
counts
even if
null
n = 3
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 30
B-TreesSpace utilization
Number of pointers/keys for B-tree
Non-leaf(non-root) n+1 n (n+1)/2 (n+1)/2- 1
Leaf(non-root) n+1 n
Root n+1 n 1 1
Max Max Min Min ptrs keys ptrsdata keys
(n+1)/2 (n+1)/2
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 31
B-Trees
)(log 2/ NO n
Equality Queries
To search for key k, start from root.At a given node, find “nearest key” ki and follow left (pi) or right (pi+1) pointer depending on comparison of k and ki.
Continue, until leaf node reached.Explores one path from root to leaf node.Height of B-tree is where N: number of records indexed
runtime complexity
)(log NO
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 32
B-TreesInsertions
Always insert in corresponding leaf.Tree grows bottom-up.Four different cases:
space available in leaf,leaf overflow,non-leaf overflow,new root.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 33
B-TreesInsertions
Space available in leaf3 5 11
30
31
30
100
32
Insert key 32
n = 3
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 34
B-TreesInsertions
Leaf overflowsplit overflowing node intotwo of (almost) same sizeand copy middle (separating) key to father node3 5 11
30
31
30
100
3 5
7
7
n = 3
Insert key 7
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 35
B-TreesInsertions
Non-leaf overflow split overflowing
node and push middle
key up to father
node
10
0
120
150
180
15
01
56
17
9
180
200
160
180
16
01
79
Insert key 160
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 36
B-TreesInsertions
New root split can
propagate up to the root
and result in new
root 10
20
30
1 2 3 10
12
20
25
30
32
40
40
45
40
30new root
Insert key 45
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 37
B-TreesDeletions
Locate corresponding leaf node.Delete specified entry.Four different cases:
Leaf node has still enough entries,Coalesce with neighbor (sibling),Re-distribute keys, Coalesce or re-distribute at non-leaf.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 38
B-TreesDeletions
Coalesce with neighbor (sibling)if node underflowsand sibling has enoughspace, coalescethe two nodes
10
40
100
10
20
30
40
50
40
Delete key 50
n=4
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 39
B-TreesDeletions
Redistribute keys if node underflowsand sibling has extraentry, re-distributeentries of the two nodes
Delete key 50
n=410
40
100
10
20
30
35
40
5035
35
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 40
B-TreesDeletions
Non-leaf coalesce Delete key 37
n=4
40
45
30
37
25
26
20
22
10
141 3
10
20
30
4040
30
25
25
new root
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 41
B-TreesB-Trees in Practice
Often, coalescing is not implemented.It is too hard and typically does not gain a lot of performance.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 42
B-TreesB-Trees in Practice
Typical order: 200, typical space utilization: 67%. I.e., average fanout = 133.Typical capacities:
Height 4: 1334 = 312,900,700 records,Height 3: 1333 = 2,352,637 records.
Can often hold top levels in buffer pool:Level 1 = 1 blocks = 8 Kbytes,
Level 2 = 133 blocks = 1 Mbyte,
Level 3 = 17,689 blocks = 133 Mbytes.
O(1) processing for equality queries
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 43
B-TreesB-Trees in Practice
Order (n) concept replaced by physical space criterion in practice (‘at least half-full’).Inner nodes can typically hold many more entries than leaf nodes.Variable sized records and search keys mean different nodes will contain different numbers of entries.Even with fixed length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 44
Hash TablesIntroduction
Tree-based index structures map search key values to record addresses via a tree structure.Hash tables perform the same mapping via a hash function, which computes the record address.Search key KHash function h
B: number of buckets.
}1,...,0{)( BKh
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 45
Hash TablesIntroduction
Good hash function should have the following property: expected number of keys the same (similar) for all buckets.This is difficult to accomplish for search keys that have a highly skewed distribution, e.g. names.Common hash function K = ‘x1 x2 … xn’ n byte character string
B often chosen as prime number
BMODxxKh n )...()( 1
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 46
Hash TablesSecondary-Storage Hash Tables
Bucket: collection of blocks.Initially, bucket consists of one block.Records hashed to b are stored in bucket b.If bucket capacity exceeded, link chain of overflow buckets.Assume that address of first block of bucket i can be computed given i.E.g., main memory array of pointers to blocks.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 47
Hash TablesSecondary-Storage Hash Tables
Hash tables can perform their mapping directly or indirectly.
.
.
.
records
.
.
.
(1) key h(key)
(2) key h(key)
Index
recordkey 1
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 48
Hash TablesInsertions
To insert record with search key K.
Compute h(K) = i.
Insert record into first block of bucket i that has enough space.
If none of the current blocks has space, add a new block to the overflow chain, and store new record there.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 49
Hash TablesInsertions
INSERT:h(a) = 1h(b) = 2h(c) = 1h(d) = 0
0
1
2
3
d
acb
h(e) = 1
e
bucket capacity:2 records
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 50
Hash Tables
Deletions
To delete record with search key K.
Compute h(K) = i.
Locate record(s) with search key K in bucket i.
If possible, move up remaining records within block.
If possible, move remaining records from overflow chain to the previous block and de-allocate block.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 51
Hash TablesDeletions
0
1
2
3
a
bc
e
d
Delete:ef
fg
maybe move“g” up
cd
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 52
Hash Tables
Queries
To find record(s) with search key K.
Compute h(K) = i.
Locate record(s) with search key K in bucket i, following the overflow chain.
In the absence of overflow blocks, only one block I/O necessary, i.e. O(1) runtime.
This is (much) better than B-trees.
But hash tables do not support range queries!
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 53
Hash Tables
QueriesIn order to keep overflow chains short, keep space utilization between 50% and 80%.
If space utilization < 50%: waste space.If space utilization > 80%: overflow chains become significant. Depends on hash function and on bucket capacity.
b
bbnutilizatiospace
in fit that keys#
bucket in keys# )(
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 54
Hash Tables
So far, only static hash tables, i.e. the number B of buckets never changes. With growing number of records, space utilization cannot be kept in the desired range. Dynamic hash tables adapt B to the actual number of records stored.Goal: approximately one block per bucket.Two dynamic methods:
Extensible Hashing, andLinear Hashing.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 55
Extensible Hash Tables
IntroductionAdd a level of indirection for the buckets, a directory containing pointers to blocks, one for each value of the hash function.
Size of directory doubles in each growth step.
.
.
.
.
h(K)to bucket
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 56
Extensible Hash Tables
IntroductionSeveral buckets can share a data block, if they do not contain too many records.Hash function computes sequences of k bits, but bucket numbers use only the i first of these bits. i is the level of the hash table.
00110101
k
i, grows over time
0initially
,2 directory of size
i
i
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 57
Extensible Hash TablesInsertions
To insert record with search key K.
Compute h(K) and take its first i bits. Global level i is part of the data structure.
Retrieve the corresponding directory entry.
Follow that pointer leading to block b. b has a local level j <= i.
If b has enough space, insert record there.
Otherwise, split b into two blocks.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 58
Extensible Hash Tables
Insertions
If j < i, distribute records in b based on (j+1)st bit of h(K): if 0, old block b, if 1 new block b’.
Increment the local level of b and b’ by one.
Adjust the pointer in the directory that pointed to b but must now point to b’.
If j = i, first increment i by one. Double the directory size and duplicate all entries. Proceed as in case j < i.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 59
Extensible Hash Tables
Example
i = 1
1
1
0001
1001
1100
Insert 101011100
1010
New directory
200
01
10
11
i =
2
2
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 60
Extensible Hash Tables
10001
21001
1010
21100
Insert:
0111
0000
00
01
10
11
2i =
0111
0000
0111
0001
2
2
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 61
Extensible Hash Tables
00
01
10
11
2i =
21001
1010
21100
20111
20000
0001
Insert:
1001
1001
1001
1010
000
001
010
011
100
101
110
111
3i =
3
3
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 62
Extensible Hash Tables
Overflow Chains
May still needoverflow chainsin the presence of too manyduplicates ofhash values.
Split does not help if allentries belong to sameof two resulting blocks!
11101
1100
2
21100
insert 1100
1100
if we split:
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 63
Extensible Hash Tables
Overflow Chains
11101
1100
11100
insert 1100 add overflow block:
1101
1101
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 64
Extensible Hash Tables
Deletions
To delete record with search key K.
Using the directory, locate corresponding block b and delete record from there.
If possible, merge block b with “buddy” block b’ and adjust the directory pointers to b and b’.
If possible, halve the directory. reverse insertion procedure
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 65
Extensible Hash TablesDiscussion
Can manage growing number of buckets without wasting too much space.
Assume that directory fits into main memory.
Never need to access more than one data block (as long as there are no overflow chains) for a query.
Doubling the directory is a very expensive operation. Interrupts other operations and may require secondary storage.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 66
Linear Hash TablesIntroduction
No directory.
Hash function computes sequences of k bits. Take only the i last of these bits and interpret them as bucket number m.
n: number of last bucket, first number is 0.
00110101
k
i, grows over time
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 67
Linear Hash Tables
records)(number capacity bucket and
records ofnumber total where,)1(
c
rcn
rnutilizatiospace
Insertions
If m <= n, store record in bucket m. Otherwise, store it in bucket number
If bucket overflows, add overflow block.
If space utilization becomes too high, add one bucket at the end and increment n by 1.
file grows linearly
12 im
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 68
Linear Hash TablesInsertions
Bucket we add is usually not in the range of hash keys where an overflow occurred.
When n > 2i, increment i by 1.
i is the number of rounds of doubling the size of the Linear Hash table.
No need to move entries.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 69
Linear Hash TablesExample
00 01 1011
0101
11111010
n = 01
Futuregrowthbuckets
0101• can have overflow chains!
• insert 0101
If h(k)[i ] n, then look at bucket h(k)[i ]else, look at bucket h(k)[i ] – 2i -1
k = 4, i = 2
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 70
Linear Hash TablesExample
00 01 1011
0101
11111010
n = 01
Futuregrowthbuckets
10
1010
0101 • insert 0101
11
11110101
k = 4, i = 2
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 71
Linear Hash TablesExample
k = 4
00 01 1011
111110100101
0101
n = 11
i = 2
0 0 0 0100 101 110 111
3
. . .
100
100
101
101
0101
0101
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 72
Linear Hash TablesDiscussion
Can manage growing number of buckets
without wasting too much space.
No directory, i.e. no indirection in access
and no expensive doubling operation.
Significant need for overflow chains,
even if no duplicates among last i bits of
hash values.