CS 255: Database System Principles slides: B-trees By:- Arunesh Joshi Id:-006538558

Post on 22-Dec-2015


CS 255: Database System Principles

slides: B-trees

By:- Arunesh Joshi Id:-006538558

Agenda

• The features and different functionalities of the B-Tree as an index structure

• The Structure of B-Trees

• Applications of B-Trees

• Lookup in B-Trees

• Range Queries

• Insertion into B-Trees

• Deletion from a B-Tree

• Efficiency of B-Trees

B-Trees

A B-tree organizes its blocks into a tree. The tree is balanced, meaning that all paths from the root to a leaf have the same length. Typically, there are three layers in a B-tree: the root, an intermediate layer, and leaves, but any number of layers is possible.

Features of B-Trees

• B-Trees automatically maintain as many levels of index as is appropriate for the size of the file being indexed.

• B-Trees manage the space on the blocks they use so that every block is between half used and completely full. No overflow blocks are needed.

Structure of B-Trees

• Typically there are three layers in a B-tree: the root, an intermediate layer, and leaves

• In a B-tree, each block has space for n search-key values and n+1 pointers
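
The block layout can be sketched in Python as follows. This is only an illustration, not code from the course: the names `Node`, `fits_in_block`, and `next_leaf` are hypothetical, record pointers are stood in for by strings, and a leaf's (n+1)-th pointer is modeled as a separate `next_leaf` field.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

N = 3  # keys per block, matching the n=3 examples in these slides


@dataclass
class Node:
    """One B-tree block: up to N keys and up to N + 1 pointers."""
    is_leaf: bool
    keys: List[int] = field(default_factory=list)
    # Non-leaf: child nodes. Leaf: record pointers (plain strings here).
    pointers: List[Union["Node", str]] = field(default_factory=list)
    # A leaf's extra pointer links to the next leaf in key order.
    next_leaf: Optional["Node"] = None

    def fits_in_block(self) -> bool:
        # A non-leaf has one more pointer than keys; a leaf has one
        # record pointer per key (its extra pointer is next_leaf).
        extra = 0 if self.is_leaf else 1
        return len(self.keys) <= N and len(self.pointers) == len(self.keys) + extra


leaf = Node(is_leaf=True, keys=[3, 5, 11], pointers=["rec3", "rec5", "rec11"])
right = Node(is_leaf=True, keys=[30, 35], pointers=["rec30", "rec35"])
leaf.next_leaf = right
root = Node(is_leaf=False, keys=[30], pointers=[leaf, right])
```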

[next slide explains the structure of a B-Tree]

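B-Tree Example n=3

[Figure: example B-tree. Root: (100). Non-leaf nodes: (30) and (120, 150, 180). Leaves, left to right: (3, 5, 11), (30, 35), (100, 101, 110), (120, 130), (150, 156, 179), (180, 200).]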

Sample non-leaf

Keys: 57 81 95

Pointers, left to right: to keys k < 57; to keys 57 ≤ k < 81; to keys 81 ≤ k < 95; to keys k ≥ 95

Sample leaf node:

Keys: 57 81 95 (the leaf is reached by a pointer from a non-leaf node)

Pointers, left to right: to record with key 57; to record with key 81; to record with key 95; to next leaf in sequence

In textbook’s notation n=3

[Figure: a leaf holding keys 30 and 35, and a non-leaf holding key 30, each drawn in the textbook’s notation.]

Size of nodes: n+1 pointers, n keys (fixed)

Don’t want nodes to be too empty

• Use at least

Non-leaf: ⌈(n+1)/2⌉ pointers

Leaf: ⌊(n+1)/2⌋ pointers to data

n=3         Full node      Min. node
Non-leaf    120 150 180    30
Leaf        3 5 11         30 35

[Figure annotation: counts even if null]

B-tree rules (tree of order n)

(1) All leaves at same lowest level (balanced tree)

(2) Pointers in leaves point to records, except for the “sequence pointer”

Number of pointers/keys for B+tree

                       Max ptrs   Max keys   Min ptrs to data   Min keys
Non-leaf (non-root)    n+1        n          ⌈(n+1)/2⌉          ⌈(n+1)/2⌉ - 1
Leaf (non-root)        n+1        n          ⌊(n+1)/2⌋          ⌊(n+1)/2⌋
Root                   n+1        n          1                  1
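
To check a particular n against these limits, the table can be written as a small function. This is a sketch only: the function name `occupancy_limits` and the dictionary layout are hypothetical, and the minimums assume the usual rounding (up for non-leaves, down for leaves).

```python
def occupancy_limits(n: int) -> dict:
    """Max/min pointers and keys per node type for a B-tree of order n."""
    ceil_half = (n + 2) // 2   # ceil((n + 1) / 2) in integer arithmetic
    floor_half = (n + 1) // 2  # floor((n + 1) / 2)
    return {
        "non-leaf": {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": ceil_half, "min_keys": ceil_half - 1},
        "leaf": {"max_ptrs": n + 1, "max_keys": n,
                 "min_ptrs": floor_half, "min_keys": floor_half},
        "root": {"max_ptrs": n + 1, "max_keys": n,
                 "min_ptrs": 1, "min_keys": 1},
    }


limits = occupancy_limits(3)
print(limits["non-leaf"]["min_keys"], limits["leaf"]["min_keys"])  # prints: 1 2
```

For n=3 this agrees with the full-node/min-node slide: a minimal non-leaf holds one key and a minimal leaf holds two.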

Applications of B-trees

1. The search key of the B-tree is the primary key for the data file, and the index is dense. That is, there is one key-pointer pair in a leaf for every record of the data file. The data file may or may not be sorted by primary key.

2. The data file is sorted by its primary key, and the B-tree is a sparse index with one key-pointer pair at a leaf for each block of the data file.

3. The data file is sorted by an attribute that is not a key, and this attribute is the search key for the B-tree. For each key value K that appears in the data file there is one key-pointer pair at a leaf. That pointer goes to the first of the records that have K as their sort-key value.

Lookup in B-Trees

• Suppose we want to find a record with search key 40.

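• We start at the root, whose only key is 13. Since 40 ≥ 13, the record (if it exists) lies in the right subtree, so we follow the right pointer.

• We repeat the same comparison at each level until we reach a leaf, then check whether 40 is among the leaf's keys.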

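Looking for block “40” (not present)

[Figure: B-tree used for the lookup. Root key: 13. Second-level keys: 7 (left of the root) and 23, 31, 43 (right of the root). Leaf keys: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47.]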
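
The lookup procedure can be sketched as below. The tree is an assumed reconstruction from the keys on this slide: the grouping of keys into leaves is my reading of the figure, and the dictionary layout and the names `leaf`, `inner`, and `lookup` are hypothetical. At each non-leaf node, keys equal to a separator go to the right of it.

```python
from bisect import bisect_right


def leaf(*keys):
    return {"keys": list(keys), "children": None}


def inner(keys, children):
    return {"keys": keys, "children": children}


# Assumed layout of the example tree (leaf grouping inferred from the keys).
TREE = inner([13], [
    inner([7], [leaf(2, 3, 5), leaf(7, 11)]),
    inner([23, 31, 43], [leaf(13, 17, 19), leaf(23, 29),
                         leaf(31, 37, 41), leaf(43, 47)]),
])


def lookup(node, key):
    """Return (found, path), where path lists the keys of each visited node."""
    path = []
    while node["children"] is not None:
        path.append(node["keys"])
        # bisect_right counts the separators <= key, which is exactly
        # the index of the child pointer to follow.
        node = node["children"][bisect_right(node["keys"], key)]
    path.append(node["keys"])
    return key in node["keys"], path


print(lookup(TREE, 40))  # (False, [[13], [23, 31, 43], [31, 37, 41]])
```

The search for 40 visits one node per level and ends in the leaf (31, 37, 41), where 40 is absent.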

Range Queries

• B-trees also handle queries that ask for a range of values, for example:

SELECT * FROM R WHERE R.k >= 10 AND R.k <= 25;
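
A range query like this one can be answered by descending to the leaf where the lower bound would sit and then walking the leaf chain. The sketch below assumes unique keys, hypothetical classes `Leaf` and `Inner`, and an example tree whose leaf grouping is my own assumption; it returns keys rather than records.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # sequence pointer to the next leaf


class Inner:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


def build_example():
    leaves = [Leaf(ks) for ks in ([2, 3, 5], [7, 11], [13, 17, 19],
                                  [23, 29], [31, 37, 41], [43, 47])]
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return Inner([13], [Inner([7], leaves[:2]),
                        Inner([23, 31, 43], leaves[2:])])


def range_query(root, lo, hi):
    """Return all keys k with lo <= k <= hi, in sorted order."""
    node = root
    while isinstance(node, Inner):  # descend to the leaf where lo belongs
        node = node.children[bisect_right(node.keys, lo)]
    result = []
    while node is not None:  # follow sequence pointers
        for k in node.keys:
            if k > hi:
                return result
            if k >= lo:
                result.append(k)
        node = node.next
    return result


print(range_query(build_example(), 10, 25))  # [11, 13, 17, 19, 23]
```

Only one root-to-leaf descent is needed; the rest of the work is a sequential scan over neighboring leaves.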

Insert into B-tree

(a) simple case: space available in leaf

(b) leaf overflow

(c) non-leaf overflow

(d) new root

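(a) Insert key = 32, n=3

[Figure: leaves (3, 5, 11) and (30, 31), with keys 30 and 100 in the nodes above them. Leaf (30, 31) has a free slot, so 32 is added there.]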

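(b) Insert key = 7, n=3

[Figure: leaves (3, 5, 11) and (30, 31), with keys 30 and 100 in the nodes above them. Leaf (3, 5, 11) is full, so it splits into (3, 5) and (7, 11), and key 7 is added to the parent.]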
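
Cases (a) and (b) can be sketched for a single leaf as follows. The function name `insert_into_leaf` and its return convention are hypothetical, and only keys are tracked (record pointers and the parent update are left out).

```python
from bisect import insort


def insert_into_leaf(keys, new_key, n):
    """Insert new_key into a leaf holding at most n keys.

    Returns (left, right, separator). When the key fits, right and
    separator are None. On overflow the n + 1 keys are split so the left
    leaf keeps ceil((n + 1) / 2) of them; the separator is the smallest
    key of the new right leaf and must be inserted into the parent.
    """
    keys = list(keys)
    insort(keys, new_key)
    if len(keys) <= n:
        return keys, None, None
    mid = (len(keys) + 1) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]


print(insert_into_leaf([30, 31], 32, 3))   # ([30, 31, 32], None, None)
print(insert_into_leaf([3, 5, 11], 7, 3))  # ([3, 5], [7, 11], 7)
```

The second call reproduces the leaf-overflow slide: leaf (3, 5, 11) becomes (3, 5) and (7, 11), and 7 goes up to the parent.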

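(c) Insert key = 160, n=3

[Figure: root key 100; non-leaf (120, 150, 180); leaves (150, 156, 179) and (180, 200). Leaf (150, 156, 179) splits into (150, 156) and (160, 179). The parent cannot hold a fourth key, so it splits too: 180 goes to a new non-leaf and 160 moves up next to 100.]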
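
A non-leaf split differs from a leaf split in that the middle key moves up instead of being copied. The sketch below uses a hypothetical function `split_non_leaf` and placeholder child names ("L1" to "L5"); it takes a node that temporarily holds n+1 keys and n+2 child pointers.

```python
def split_non_leaf(keys, children):
    """Split an overfull non-leaf node.

    keys has n + 1 entries and children has n + 2 entries. The middle
    key is removed from this level and returned so the caller can insert
    it into the parent (or into a new root if there is no parent).
    """
    mid = len(keys) // 2
    up = keys[mid]
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, up, right


left, up, right = split_non_leaf([120, 150, 160, 180],
                                 ["L1", "L2", "L3", "L4", "L5"])
print(left, up, right)
# ([120, 150], ['L1', 'L2', 'L3']) 160 ([180], ['L4', 'L5'])
```

Applied to the keys (10, 20, 30, 40) the same function moves 30 up, which is how a new root is created when the node being split is the old root.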

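(d) New root, insert 45, n=3

[Figure: root (10, 20, 30) over leaves (1, 2, 3), (10, 12), (20, 25), (30, 32, 40). Leaf (30, 32, 40) splits into (30, 32) and (40, 45); the full root then splits, 40 goes to a new non-leaf, and 30 becomes the key of the new root.]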

Deletion from B-tree

(a) Simple case - no example

(b) Coalesce with neighbor (sibling)

(c) Re-distribute keys

(d) Cases (b) or (c) at non-leaf

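(b) Coalesce with sibling: Delete 50, n=4

[Figure: parent (10, 40, 100) over leaves (10, 20, 30) and (40, 50). After 50 is deleted, 40 moves into the left sibling, giving (10, 20, 30, 40), and key 40 is removed from the parent.]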

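(c) Redistribute keys: Delete 50, n=4

[Figure: parent (10, 40, 100) over leaves (10, 20, 30, 35) and (40, 50). After 50 is deleted, 35 moves to the right leaf, giving (10, 20, 30) and (35, 40), and the parent key 40 becomes 35.]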
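
The two leaf-level repairs can be sketched together. One assumption to flag: the function below coalesces whenever the two leaves fit into one block and redistributes otherwise, which reproduces both slide examples, but real systems may prefer the opposite order. The name `fix_leaf_underflow` is hypothetical and only keys are tracked.

```python
def fix_leaf_underflow(left, right, n):
    """Repair adjacent leaves after a deletion left `right` too empty.

    Returns (left, right, separator). After a coalesce, right and
    separator are None and the parent must drop one key and pointer.
    After a redistribution, separator is the new parent key for right.
    """
    if len(left) + len(right) <= n:
        return left + right, None, None      # coalesce into one leaf
    min_keys = (n + 1) // 2                   # floor((n + 1) / 2)
    while len(right) < min_keys:              # borrow the largest keys
        right = [left[-1]] + right
        left = left[:-1]
    return left, right, right[0]


# Slide (b): leaves (10, 20, 30) and (40, 50); delete 50.
print(fix_leaf_underflow([10, 20, 30], [40], 4))
# ([10, 20, 30, 40], None, None)

# Slide (c): leaves (10, 20, 30, 35) and (40, 50); delete 50.
print(fix_leaf_underflow([10, 20, 30, 35], [40], 4))
# ([10, 20, 30], [35, 40], 35)
```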

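(d) Non-leaf coalesce: Delete 37, n=4

[Figure: root (25); non-leaves (10, 20) and (30, 40); leaves (1, 3), (10, 14), (20, 22), (25, 26), (30, 37), (40, 45). After 37 is deleted, leaf (30) coalesces with (40, 45), key 40 leaves its parent, and the two non-leaves coalesce with key 25 into a new root.]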

B-tree deletions in practice

• Often, coalescing is not implemented

• Too hard and not worth it!

Why do we take 3 as the number of levels of a B-tree?

Suppose our blocks are 4096 bytes. Also let keys be integers of 4 bytes and let pointers be 8 bytes. If there is no header information kept on the blocks, then we want to find the largest integer value of n such that

4n + 8(n + 1) ≤ 4096.

That value is n = 340, so 340 key-pointer pairs fit in one block for our example data. Suppose that the average block has an occupancy midway between the minimum and maximum, i.e., a typical block has 255 pointers. With a root, 255 children, and 255 × 255 = 65,025 leaves, we shall have among those leaves 255 cubed, or about 16.6 million, pointers to records. That is, files with up to 16.6 million records can be accommodated by a 3-level B-tree.
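
The arithmetic can be checked with a few lines of Python. The variable names are my own, and taking the midpoint over key-pointer pairs (170 to 340) is my assumption about how the figure of 255 arises.

```python
BLOCK_BYTES, KEY_BYTES, PTR_BYTES = 4096, 4, 8

# Largest n with KEY_BYTES * n + PTR_BYTES * (n + 1) <= BLOCK_BYTES.
n = (BLOCK_BYTES - PTR_BYTES) // (KEY_BYTES + PTR_BYTES)

min_pairs = (n + 1) // 2               # emptiest allowed leaf: 170 pairs
max_pairs = n                          # full leaf: 340 pairs
typical = (min_pairs + max_pairs) // 2

leaves = typical ** 2                  # root -> 255 children -> 255 * 255 leaves
records = typical ** 3                 # each leaf points to 255 records

print(n, typical, leaves, records)     # 340 255 65025 16581375
```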

Thank you for bearing with me.