b-trees - 123seminarsonly.com€¦ · avl trees: sometimes called hb[1] trees. invented by...

B-Trees

Post on 16-Jun-2020

Page 1: B-Trees - 123seminarsonly.com€¦ · AVL Trees: Sometimes called HB[1] trees. Invented by Adel’son-Vel’skii and Landis ~early 1960s. B-Trees: Proposed by R. Bayer & E.M. Creight

B-Trees

Why B-Trees?

• Trees studied so far are for storing data in memory.
• B-Trees are better suited for storing data both in memory and on secondary storage.
• They are better suited for balancing data than some other tree ADTs.

The Problem With Unbalanced Trees

[Figure: an unbalanced binary tree holding the keys 1, 2, 3, 4, 5.]

The levels are sparsely filled, resulting in deep paths. This defeats the purpose of binary trees.
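
A minimal Python sketch of the problem, assuming the tree in the figure arose from inserting the keys 1 to 5 in sorted order into a plain (unbalanced) binary search tree. The names `BSTNode`, `bst_insert`, `build` and `height` are illustrative and do not come from the slides.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into an unbalanced BST and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def build(keys):
    """Insert the keys one by one, in the order given."""
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root


def height(node):
    """Number of levels in the tree rooted at node."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


sorted_height = height(build([1, 2, 3, 4, 5]))    # every node has only a right child
shuffled_height = height(build([3, 1, 4, 2, 5]))  # same keys, friendlier order
print(sorted_height, shuffled_height)  # prints "5 3"
```

With sorted input the tree degenerates into a chain with one key per level, so a search costs as much as scanning a linked list.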

Possible Solutions To Unbalanced Trees

• Periodically balance the tree.
• Don’t let a tree get too unbalanced when inserting or deleting:
  • AVL Trees: Sometimes called HB[1] trees. Invented by Adel’son-Vel’skii and Landis in the early 1960s.
  • B-Trees: Proposed by R. Bayer & E. M. McCreight.

What Is A B-Tree?

• It is a type of “multiway” tree.
• It is NOT a binary search tree, nor is it a binary tree.
• It provides a fast way to index into a multi-level set of nodes.
• Each node in the B-Tree contains a sorted array of key values.

Motivation For Multiway Tree

• Secondary storage (e.g., disks) is typically divided into equal-sized blocks (e.g., 512, 1024, …, 4096, … bytes).
• The basic I/O operation transfers whole blocks, rather than single bytes, between secondary storage and memory.
• The goal is to devise a multiway search tree that minimizes file accesses by making full use of each block read.
• Each access to secondary storage costs roughly as much as executing 250K instructions, depending on the speed of the CPU.

Multiway Search Tree (order m)

• A generalization of binary search trees.
• Each node has at most m children.
  • If k <= m is the number of children, then the node has exactly k-1 keys.
• The tree is ordered.

Multiway Search Tree (cont.)

[Figure: nodes in a multiway tree. A node holds the keys k1 k2 k3 k4 k5; the subtrees drawn below it are labelled keys < k1, k2 < keys < k3, and k5 < keys.]

Definition Of A B-Tree

• A B-Tree of order m is an m-way tree such that:
  • All leaves are on the same level.
  • All internal nodes except the root node have at most m non-empty children and at least ⌈m/2⌉ (m/2 rounded up) non-empty children.
  • The root node has at most m non-empty children.
  • A leaf node must contain at least ⌈m/2⌉ - 1 keys.
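
As a rough illustration, the limits in this definition can be computed for a given order. The sketch assumes that m/2 is rounded up, as in the usual textbook definition, and that a node with k children holds k-1 keys. The function name `btree_limits` is made up for this example.

```python
def btree_limits(m):
    """Child and key limits for a non-root node of a B-Tree of order m."""
    min_children = (m + 1) // 2  # m/2 rounded up, using integer arithmetic
    return {
        "max_children": m,
        "min_children": min_children,
        "max_keys": m - 1,             # a node with k children holds k-1 keys
        "min_keys": min_children - 1,
    }


limits = btree_limits(5)
print(limits)
```

For m = 5 this gives between 2 and 4 keys per node, which agrees with the MIN = 2, MAX = 4 setting used in the deletion examples.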

Three Important Properties Of B-Trees

• All nodes in the B-Tree are at least half-full (the root node is an exception at times).
• The B-Tree is always balanced. That is, an identical number of nodes must be read into memory in order to locate all keys at any given level in the tree.
• A well-organized B-Tree will have just a small number of levels relative to the number of nodes.
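
To get a feel for the third property, the sketch below counts the fewest levels an m-way tree needs to hold a given number of keys, assuming every node is completely full (the best case). The order 256 and the function name `min_levels` are arbitrary choices for illustration.

```python
def min_levels(n_keys, order):
    """Fewest levels an order-`order` multiway tree needs to hold n_keys keys."""
    levels = 0
    capacity = 0        # keys that fit in the levels counted so far
    nodes_on_level = 1  # the root level has a single node
    while capacity < n_keys:
        capacity += nodes_on_level * (order - 1)  # each node holds order-1 keys
        nodes_on_level *= order                   # each node has `order` children
        levels += 1
    return levels


binary_levels = min_levels(1_000_000, 2)     # a perfectly balanced binary tree
wide_levels = min_levels(1_000_000, 256)     # hypothetical order-256 tree
print(binary_levels, wide_levels)  # prints "20 3"
```

If each node occupies one disk block, the wider tree reaches any key in about 3 block reads where the binary tree would need up to 20.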

Where Are B-Trees Used?

• B-Trees are commonly found in databases and file systems.
• B-Trees allow insertions and deletions in logarithmic time.
• They generally grow from the bottom upwards as elements are inserted, whereas most binary trees grow downward.

The Six Rules Governing B-Trees

• R1: A B-Tree may be empty. If it is not, every node holds at least some specified MINIMUM number of entries (the root may be an exception).
• R2: The MAXIMUM number of entries in a node is twice the MINIMUM.

The Six Rules Governing B-Trees (cont.)

• R3: The entries of each B-Tree node are stored in a partially filled array, sorted from the smallest entry (at index 0) to the largest entry (at the final used position of the array).

[Figure: a B-Tree node drawn as an array with positions 0 to n-1 holding sorted entries; one entry is marked with *.]

The data in such an array can be stored in a block on a disk.

* B-Trees can support duplicate keys.

The Six Rules Governing B-Trees (cont.)

• R4: The number of subtrees below a non-leaf node is always one more than the number of entries in the node.

[Figure: a non-leaf node with 4 entries, 45 55 67 82 (indices 0 to 3), and 5 subtrees.]

subtree 0: keys < 45
subtree 1: keys > 45 and < 55
subtree 2: keys > 55 and < 67
subtree 3: keys > 67 and < 82
subtree 4: keys > 82

The Six Rules Governing B-Trees (cont.)

• R5: For any non-leaf node:
  • An entry at index i is greater than all the entries in subtree i of the node, and
  • An entry at index i is less than all the entries in subtree i+1 of the node.
• R6: Every leaf node in a B-Tree has the same depth (i.e., all leaves are at the same level).

Example B-Tree

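MIN = 1, MAX = 2

[Figure: example B-Tree, listed here level by level from left to right.]

root: 30 80
level 1: 20 | 50 60 | 90
leaves: 10 | 25 | 35 40 | 55 | 72 | 82 85 | 95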
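
The rules R1 to R6 can be checked mechanically. The following is a sketch only: the `Node` class and `leaf_depth` function are invented for this illustration, the tree literal is one reading of the example figure, and R5 is enforced with strict comparisons, so a tree with duplicate keys spread across nodes would be rejected.

```python
MIN, MAX = 1, 2  # limits used by the example tree


class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)          # entries, expected in sorted order (R3)
        self.children = list(children)  # empty for a leaf


def leaf_depth(node, lo=None, hi=None, is_root=True):
    """Return the common depth of the leaves under node, or None if a rule fails."""
    n = len(node.keys)
    if n > MAX or (n < MIN and not is_root):                  # R1, R2
        return None
    if any(a > b for a, b in zip(node.keys, node.keys[1:])):  # R3
        return None
    for k in node.keys:                                       # R5, bounds come from ancestors
        if (lo is not None and k <= lo) or (hi is not None and k >= hi):
            return None
    if not node.children:
        return 0
    if len(node.children) != n + 1:                           # R4
        return None
    bounds = [lo, *node.keys, hi]
    depths = {
        leaf_depth(child, bounds[i], bounds[i + 1], is_root=False)
        for i, child in enumerate(node.children)
    }
    if None in depths or len(depths) != 1:                    # R6
        return None
    return depths.pop() + 1


example = Node([30, 80], [
    Node([20], [Node([10]), Node([25])]),
    Node([50, 60], [Node([35, 40]), Node([55]), Node([72])]),
    Node([90], [Node([82, 85]), Node([95])]),
])
broken = Node([30], [Node([10]), Node([20, 40])])  # 20 sits on the wrong side of 30

print(leaf_depth(example), leaf_depth(broken))  # prints "2 None"
```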

Searching For A Target In B-Trees

• Start with the root node and search for the target in the array at that node. If it is found, then done: return success.
• If the target is not in the root and there are no children, then also done, but return failure.
• If the target is not in the root node and there are children, then the target, if it exists, can only be in one subtree.
• Search key_array from left to right, up to data_count entries, and descend into the first subtree i for which target < key_array[i]. If the target is greater than every key, descend into the last subtree (subtree data_count).

Repeat the process with the root of that subtree as the new root node.
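
The search steps above can be sketched as a short loop. This is an illustration under assumptions: the `Node` class and `search` function are invented here (the slides only name `key_array` and `data_count`), and the tree literal is one reading of the example B-Tree figure.

```python
class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)          # sorted entries (key_array in the slides)
        self.children = list(children)  # empty for a leaf


def search(node, target):
    """Return True if target is stored in the tree rooted at node."""
    while True:
        # Scan left to right for the first key that is not smaller than target.
        i = 0
        while i < len(node.keys) and target > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == target:
            return True            # found in this node
        if not node.children:
            return False           # reached a leaf without finding it
        node = node.children[i]    # the only subtree that can hold target


root = Node([30, 80], [
    Node([20], [Node([10]), Node([25])]),
    Node([50, 60], [Node([35, 40]), Node([55]), Node([72])]),
    Node([90], [Node([82, 85]), Node([95])]),
])

print(search(root, 55), search(root, 57))  # prints "True False"
```

Each iteration touches one node, so the number of nodes read equals at most the number of levels in the tree.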

Inserting Into A B-Tree

[Flowchart]

Add the new key to the appropriate leaf node.
Overflow?
  No: done.
  Yes: split the node into two nodes on the same level, and promote the median key to the parent; then repeat the overflow check on the parent.

Example

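MIN = 1, MAX = 2

Before the insertion:
root: 6 17
leaves: 4 | 12 | 19 22

Insert 18

After adding 18 to its leaf:
root: 6 17
leaves: 4 | 12 | 18 19 22

The leaf 18 19 22 now has an excess entry (problem child).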

Example (cont.)

MIN = 1, MAX = 2

Split the problem child, and promote the middle key to the parent node. Still have excess.

root: 6 17 19
leaves: 4 | 12 | 18 | 22

Fix the excess by repeating the process. Split the node and promote the middle key to a new root node.

root: 17
level 1: 6 | 19
leaves: 4 | 12 | 18 | 22
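
A minimal sketch of insertion by "add to the leaf, then fix any excess on the way back up", replaying the Insert 18 example. It assumes distinct keys; `Node`, `split`, `loose_insert` and `insert` are names chosen for this illustration, not taken from the slides.

```python
import bisect

MAX = 2  # maximum entries per node, as in the example (MIN = 1)


class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)  # empty for a leaf


def split(node):
    """Split an over-full node into (left, median key, right)."""
    mid = len(node.keys) // 2
    left = Node(node.keys[:mid], node.children[:mid + 1])
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    return left, node.keys[mid], right


def loose_insert(node, key):
    """Insert key below node; node itself may end up with MAX + 1 keys."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:
        node.keys.insert(i, key)        # add the key to the appropriate leaf
        return
    child = node.children[i]
    loose_insert(child, key)
    if len(child.keys) > MAX:           # the child has an excess entry
        left, median, right = split(child)
        node.keys.insert(i, median)     # promote the median key
        node.children[i:i + 1] = [left, right]


def insert(root, key):
    """Insert key and return the root, which changes if the old root splits."""
    loose_insert(root, key)
    if len(root.keys) > MAX:
        left, median, right = split(root)
        root = Node([median], [left, right])  # the tree grows at the top
    return root


root = Node([6, 17], [Node([4]), Node([12]), Node([19, 22])])
root = insert(root, 18)
print(root.keys, [c.keys for c in root.children])  # prints "[17] [[6], [19]]"
```

The new level is created at the root, which is why B-Trees are said to grow from the bottom upwards.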

Insert In Class Exercise

MIN = 1, MAX = 2

root: 6 17
leaves: 4 | 12 | 19 22

• Insert 5, then insert 7 and 15.

Deleting From A B-Tree

Deletion (cont.)

• Case 1: The key is in a leaf that has more than the minimum number of keys. If subset[i] has extra entries, then just delete the data.

MIN = 2, MAX = 4

• Delete 21

Before:
root: 6 17
leaves: 2 4 | 10 13 | 19 21 22

After:
root: 6 17
leaves: 2 4 | 10 13 | 19 22

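Deletion (cont.)

• Case 2: The key is in a leaf that has just the minimum number of keys. If subset[i-1] has extra entries, then transfer an entry to subset[i]: the separating key in the parent moves down into subset[i], and the largest entry of subset[i-1] moves up to take its place.

MIN = 2, MAX = 4

• Delete 22

Before:
root: 6 17
leaves: 2 4 | 10 12 15 | 19 22

After:
root: 6 15
leaves: 2 4 | 10 12 | 17 19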
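
A small sketch of the Case 2 transfer, replaying the Delete 22 example. For brevity the parent is modelled as a plain list of keys and its children as a list of leaf lists; `borrow_from_left` is a name invented for this illustration.

```python
MIN = 2  # minimum keys per leaf, as in the example


def borrow_from_left(parent_keys, leaves, i):
    """Leaf i is one key short: move a key into it from leaf i-1 via the parent."""
    left, leaf = leaves[i - 1], leaves[i]
    if len(left) <= MIN:
        raise ValueError("left sibling has no extra entry")
    leaf.insert(0, parent_keys[i - 1])  # separating key moves down
    parent_keys[i - 1] = left.pop()     # largest key of the left sibling moves up


parent_keys = [6, 17]
leaves = [[2, 4], [10, 12, 15], [19, 22]]

leaves[2].remove(22)                    # leaf 2 now has fewer than MIN keys
borrow_from_left(parent_keys, leaves, 2)
print(parent_keys, leaves)  # prints "[6, 15] [[2, 4], [10, 12], [17, 19]]"
```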

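Deletion (cont.)

• Case 3: The key is in a leaf that has just the minimum number of keys. If subset[i+1] has extra entries, then transfer an entry to subset[i] (similar to Case 2): the separating key in the parent moves down into subset[i], and the smallest entry of subset[i+1] moves up to take its place.

MIN = 2, MAX = 4

• Delete 13

Before:
root: 6 17
leaves: 2 4 | 10 13 | 19 21 22

After:
root: 6 19
leaves: 2 4 | 10 17 | 21 22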
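
The mirror image of Case 2 can be sketched the same way, replaying the Delete 13 example. As before, the parent is modelled as a list of keys and the children as a list of leaf lists; `borrow_from_right` is an invented name.

```python
MIN = 2  # minimum keys per leaf, as in the example


def borrow_from_right(parent_keys, leaves, i):
    """Leaf i is one key short: move a key into it from leaf i+1 via the parent."""
    leaf, right = leaves[i], leaves[i + 1]
    if len(right) <= MIN:
        raise ValueError("right sibling has no extra entry")
    leaf.append(parent_keys[i])     # separating key moves down
    parent_keys[i] = right.pop(0)   # smallest key of the right sibling moves up


parent_keys = [6, 17]
leaves = [[2, 4], [10, 13], [19, 21, 22]]

leaves[1].remove(13)                # leaf 1 now has fewer than MIN keys
borrow_from_right(parent_keys, leaves, 1)
print(parent_keys, leaves)  # prints "[6, 19] [[2, 4], [10, 17], [21, 22]]"
```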

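Deletion (cont.)

• Case 4: The key is in a leaf, and the leaf and its siblings have just the minimum number of keys. Combine subset[i] with subset[i-1]; the separating key in the parent moves down into the combined node.

• Delete 22

Before:
root: 6 17
leaves: 2 4 | 10 12 | 19 22

After:
root: 6
leaves: 2 4 | 10 12 17 19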
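
A sketch of the Case 4 merge, replaying the Delete 22 example with the same simplified model (a list of parent keys plus a list of leaf lists). `merge_with_left` is an invented name, and the sketch does not handle the follow-up situation in which the parent itself drops below its minimum.

```python
def merge_with_left(parent_keys, leaves, i):
    """Combine leaf i-1, the separating parent key, and leaf i into one leaf."""
    merged = leaves[i - 1] + [parent_keys[i - 1]] + leaves[i]
    leaves[i - 1:i + 1] = [merged]  # two leaves become one
    del parent_keys[i - 1]          # the parent loses the separating key


parent_keys = [6, 17]
leaves = [[2, 4], [10, 12], [19, 22]]

leaves[2].remove(22)                # leaf 2 is short and no sibling can lend a key
merge_with_left(parent_keys, leaves, 2)
print(parent_keys, leaves)  # prints "[6] [[2, 4], [10, 12, 17, 19]]"
```

The merged leaf holds 2 * MIN keys at most (here 4), so it never exceeds MAX.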

Deletion (cont.)

• Case 5: The key is in an internal node. The child node that holds the successor of the key is located; if this node has more than the minimum number of entries, then the key to be deleted is replaced by the successor, and that value is deleted from the leaf.

Delete 95

Before:
root: 62
level 1: 25 45 | 85 95
leaves: 15 20 | 30 40 | 50 54 | 75 80 | 90 92 | 97 100 150

Deletion (cont.): Case 5

After:
root: 62
level 1: 25 45 | 85 97
leaves: 15 20 | 30 40 | 50 54 | 75 80 | 90 92 | 100 150
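
A sketch of Case 5, replaying the Delete 95 example. The `Node` class and `delete_internal_key` are invented for this illustration, MIN = 2 is an assumption (the slide does not state it for this tree), and the tree literal is one reading of the figure.

```python
MIN = 2  # assumed minimum keys per node for this example


class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)  # empty for a leaf


def delete_internal_key(node, i):
    """Replace node.keys[i] with its successor, taken from a leaf with spare keys."""
    leaf = node.children[i + 1]         # the successor lives in subtree i+1 ...
    while leaf.children:
        leaf = leaf.children[0]         # ... in its leftmost leaf
    if len(leaf.keys) <= MIN:
        raise ValueError("successor leaf has no extra entry; another case applies")
    node.keys[i] = leaf.keys.pop(0)     # the successor replaces the deleted key


right = Node([85, 95], [Node([75, 80]), Node([90, 92]), Node([97, 100, 150])])
left = Node([25, 45], [Node([15, 20]), Node([30, 40]), Node([50, 54])])
root = Node([62], [left, right])

delete_internal_key(right, 1)           # delete 95, stored at index 1 of its node
print(right.keys, right.children[2].keys)  # prints "[85, 97] [100, 150]"
```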