![Page 1: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/1.jpg)
Group Project
B-Tree
Student: Yongsheng Ma
CS632 – Algorithm
Professor: G. Gibson
![Page 2: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/2.jpg)
B-Tree
Introduction
Operations
Complexities
Applications
Summary
![Page 3: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/3.jpg)
B-Tree Properties
A B-tree is an m-way search tree.
The root node may have as few as two children, or none if the tree is empty.
The root may be a leaf.
Internal nodes have at least ceiling(m/2) and at most m non-null sub-trees.
![Page 4: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/4.jpg)
B-Tree Properties
All leaf nodes are at the same level; that is, the tree is perfectly balanced.
A leaf node has at least ceiling(m/2)-1 entries (keys) and at most m-1 entries (keys).
![Page 5: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/5.jpg)
B-Tree Properties
The branching factor can be quite large.
Each node may have many children, from a handful to thousands.
The keys in each node are in non-decreasing order.
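The occupancy rules on these slides can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` layout and a `check_node` helper that are not part of the slides; it also assumes the usual rule that an internal node has exactly one more child than it has keys.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical node layout used only for this illustration.
    keys: list
    children: list = field(default_factory=list)  # empty list means leaf

def check_node(node, m, is_root=False):
    """Return True if a single node respects the order-m rules listed above."""
    if node.keys != sorted(node.keys):            # keys in non-decreasing order
        return False
    if len(node.keys) > m - 1:                    # at most m-1 keys in any node
        return False
    if not node.children:                         # leaf node
        return is_root or len(node.keys) >= math.ceil(m / 2) - 1
    min_children = 2 if is_root else math.ceil(m / 2)
    # Internal node: one more child than keys, child count within bounds.
    return (len(node.children) == len(node.keys) + 1
            and min_children <= len(node.children) <= m)

leaf = Node([10, 20])
print(check_node(leaf, m=5))        # True: ceiling(5/2) - 1 = 2 keys is enough
print(check_node(Node([10]), m=5))  # False: a non-root leaf needs at least 2 keys
```

The check looks at one node only; validating a whole tree would also require that all leaves sit at the same depth.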
![Page 6: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/6.jpg)
Operations
Searching a key
Inserting a key
Splitting a node
Deleting a key
![Page 7: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/7.jpg)
Searching a key
Searching a B-tree is much like searching a binary search tree.
Instead of a two-way decision, a multi-way branching decision is made at each node.
The nodes encountered form a path downward from the root.
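The search just described can be sketched as follows. This is an illustration under stated assumptions, not the presenter's code: the `Node` class and the `search` function are hypothetical names, the tree is held in memory rather than on disk pages, and each node is scanned linearly.

```python
class Node:
    """Hypothetical in-memory node; a real B-tree would read each node from a disk page."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # empty list means the node is a leaf

def search(root, key):
    """Return (found, path), where path lists the keys of every node visited."""
    node, path = root, []
    while True:
        path.append(node.keys)
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1                        # linear scan inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return True, path
        if not node.children:
            return False, path            # reached a leaf without finding the key
        node = node.children[i]           # multi-way branch into the i-th subtree

# Small hand-built example tree with three levels.
root = Node("M", [
    Node("DH", [Node("BC"), Node("FG"), Node("JKL")]),
    Node("QTX", [Node("NP"), Node("RS"), Node("VW"), Node("YZ")]),
])
print(search(root, "R"))   # (True, [['M'], ['Q', 'T', 'X'], ['R', 'S']])
print(search(root, "E"))   # (False, [['M'], ['D', 'H'], ['F', 'G']])
```

The returned path makes the "path downward from the root" visible: one node per level, so its length is at most the height plus one.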
![Page 8: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/8.jpg)
Searching a key
The number of disk pages accessed is $O(h) = O(\log_t n)$, where h is the height of the tree and n is the number of keys.
The CPU time is $O(th) = O(t \log_t n)$.
Note: t is the minimum degree of the B-tree, so each node has at most 2t children and at most 2t-1 entries (keys).
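The logarithmic height behind these bounds comes from counting the fewest keys a tree of height h can hold. A short derivation, assuming n ≥ 1 keys, minimum degree t ≥ 2, and the root at depth 0 (this convention is an assumption, since the slides do not state one): the root holds at least one key, and at each depth i ≥ 1 there are at least $2t^{i-1}$ nodes, each with at least t-1 keys.

```latex
\begin{align*}
n &\ge 1 + (t-1)\sum_{i=1}^{h} 2t^{\,i-1}
   = 1 + 2(t-1)\cdot\frac{t^{h}-1}{t-1}
   = 2t^{h} - 1, \\
h &\le \log_t \frac{n+1}{2}.
\end{align*}
```

So the height, and with it the number of pages read on any root-to-leaf path, grows as $O(\log_t n)$.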
![Page 9: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/9.jpg)
Searching a key
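Figure: example B-tree used for searching.
Root: M
Internal nodes: D H | Q T X
Leaves: B C | F G | J K L | N P | R S | V W | Y Z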
![Page 10: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/10.jpg)
Creating an empty tree
Creating an empty tree requires no disk reads.
It allocates one disk page to be used as a new node, in O(1) time.
![Page 11: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/11.jpg)
Splitting a node
Splitting is a fundamental operation used during insertion.
A full node is split around its median key, which moves up into the parent node; the parent must be non-full.
If the node has no parent (it is the root), a new root is created and the tree grows in height by one.
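A minimal sketch of the split, assuming the minimum-degree convention used elsewhere in the slides (a full node has 2t-1 keys). The `Node`, `split_child`, and `split_root` names are hypothetical, and the nodes with keys K and Y are placeholders for subtrees the example does not care about.

```python
class Node:
    """Hypothetical in-memory node used only for this illustration."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # empty list means leaf

def split_child(parent, i, t):
    """Split the full child parent.children[i] (2t-1 keys) around its median key."""
    full = parent.children[i]
    assert len(full.keys) == 2 * t - 1, "child must be full"
    assert len(parent.keys) < 2 * t - 1, "parent must be non-full"
    median = full.keys[t - 1]
    right = Node(full.keys[t:], full.children[t:])   # upper t-1 keys
    full.keys = full.keys[:t - 1]                     # lower t-1 keys stay
    full.children = full.children[:t]
    parent.keys.insert(i, median)                     # median moves up
    parent.children.insert(i + 1, right)

def split_root(root, t):
    """Split a full root: a new root is created, so the tree grows by one level."""
    new_root = Node([], [root])
    split_child(new_root, 0, t)
    return new_root

parent = Node("NW", [Node("K"), Node("PQRSTUV"), Node("Y")])
split_child(parent, 1, t=4)
print(parent.keys)                        # ['N', 'S', 'W']
print([c.keys for c in parent.children])  # [['K'], ['P', 'Q', 'R'], ['T', 'U', 'V'], ['Y']]

new_root = split_root(Node("ADFHLNP"), t=4)
print(new_root.keys)                      # ['H']
```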
![Page 12: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/12.jpg)
Splitting a node
Figure: splitting a full node, t=4.
Before: parent … N W …; full child P Q R S T U V.
After: parent … N S W …; children P Q R and T U V.
![Page 13: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/13.jpg)
Splitting a node
Figure: splitting a full root, t=4.
Before: root A D F H L N P.
After: new root H; children A D F and L N P.
![Page 14: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/14.jpg)
Inserting a key
Insertion requires O(h) disk accesses.
The CPU time is $O(th) = O(t \log_t n)$.
![Page 15: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/15.jpg)
Inserting a key
Splitting the root is the only way to increase the height of a B-tree.
Unlike a binary search tree, a B-tree increases in height at the top instead of at the bottom.
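One common way to implement insertion is a single downward pass that splits every full node it meets, so the parent of a split is always non-full. The sketch below assumes that approach; the slides do not say which variant the presenter had in mind, and `Node`, `split_child`, `insert_nonfull`, and `insert` are hypothetical names. The starting tree is built by hand with t=3.

```python
class Node:
    """Hypothetical in-memory node used only for this illustration."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # empty list means leaf

def split_child(parent, i, t):
    """Split the full child parent.children[i]; its median key moves up."""
    full = parent.children[i]
    right = Node(full.keys[t:], full.children[t:])
    parent.keys.insert(i, full.keys[t - 1])
    parent.children.insert(i + 1, right)
    full.keys, full.children = full.keys[:t - 1], full.children[:t]

def insert_nonfull(node, key, t):
    """Insert key into the subtree rooted at node, which is known to be non-full."""
    i = len(node.keys)
    while i > 0 and key < node.keys[i - 1]:
        i -= 1
    if not node.children:                        # leaf: place the key directly
        node.keys.insert(i, key)
        return
    if len(node.children[i].keys) == 2 * t - 1:  # split a full child before descending
        split_child(node, i, t)
        if key > node.keys[i]:
            i += 1
    insert_nonfull(node.children[i], key, t)

def insert(root, key, t):
    """Insert key and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:              # full root: the tree grows at the top
        root = Node([], [root])
        split_child(root, 0, t)
    insert_nonfull(root, key, t)
    return root

root = Node("GMPX", [Node("ACDE"), Node("JK"), Node("NO"), Node("RSTUV"), Node("YZ")])
for key in "BQLF":
    root = insert(root, key, t=3)
print(root.keys)                                           # ['P']
print(["".join(c.keys) for c in root.children])            # ['CGM', 'TX']
print(["".join(g.keys) for c in root.children for g in c.children])
# ['AB', 'DEF', 'JKL', 'NO', 'QRS', 'UV', 'YZ']
```

Inserting L finds the root full, so a new root holding P is created above it; this is the only step in the run where the height changes.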
![Page 16: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/16.jpg)
Inserting a key
Figure (a): initial tree, t=3.
Root: G M P X
Leaves: A C D E | J K | N O | R S T U V | Y Z
![Page 17: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/17.jpg)
Inserting a key
Figure (b): B inserted, t=3.
Root: G M P X
Leaves: A B C D E | J K | N O | R S T U V | Y Z
![Page 18: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/18.jpg)
Inserting a key
Figure (c): Q inserted, t=3.
Root: G M P T X
Leaves: A B C D E | J K | N O | Q R S | U V | Y Z
![Page 19: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/19.jpg)
Inserting a key
Figure (d): L inserted, t=3.
Root: P
Internal nodes: G M | T X
Leaves: A B C D E | J K L | N O | Q R S | U V | Y Z
![Page 20: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/20.jpg)
Inserting a key
Figure (e): F inserted, t=3.
Root: P
Internal nodes: C G M | T X
Leaves: A B | D E F | J K L | N O | Q R S | U V | Y Z
![Page 21: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/21.jpg)
Deleting a key
Deletion is analogous to insertion but a little more complicated.
There are several cases to handle when deleting a key from a B-tree.
![Page 22: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/22.jpg)
Deleting a key
Which case applies depends on whether the key is in a leaf or an internal node, and on how many keys the nodes involved hold.
In practice, deletion operations are most often used to delete keys from leaves.
![Page 23: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/23.jpg)
Deleting a key
When deleting a key from an internal node, however, the procedure makes a downward pass through the tree but may have to return to the node from which the key was deleted to replace the key with its predecessor or successor.
![Page 24: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/24.jpg)
Deleting a key
Although this procedure seems complicated, it involves only O(h) disk operations for a B-tree of height h.
The CPU time required is $O(th) = O(t \log_t n)$.
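The deletion cases can be sketched in one recursive procedure. This is an illustrative sketch that follows the common textbook case numbering (1, 2a to 2c, 3a, 3b), which matches the labels on the figure slides; the `Node` class and all function names are hypothetical, and the starting tree is built by hand with t=3.

```python
class Node:
    """Hypothetical in-memory node used only for this illustration."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # empty list means leaf

def _merge(x, i):
    """Merge x.children[i], the key x.keys[i], and x.children[i+1] into one node."""
    left, right = x.children[i], x.children.pop(i + 1)
    left.keys += [x.keys.pop(i)] + right.keys
    left.children += right.children

def _extreme(node, last):
    """Largest (last=True) or smallest key in the subtree rooted at node."""
    while node.children:
        node = node.children[-1 if last else 0]
    return node.keys[-1 if last else 0]

def delete(x, k, t):
    """Delete k below x; x has at least t keys unless it is the root."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and x.keys[i] == k:
        if not x.children:                            # case 1: key in a leaf
            x.keys.pop(i)
        elif len(x.children[i].keys) >= t:            # case 2a: use predecessor
            x.keys[i] = _extreme(x.children[i], last=True)
            delete(x.children[i], x.keys[i], t)
        elif len(x.children[i + 1].keys) >= t:        # case 2b: use successor
            x.keys[i] = _extreme(x.children[i + 1], last=False)
            delete(x.children[i + 1], x.keys[i], t)
        else:                                         # case 2c: merge, then recurse
            _merge(x, i)
            delete(x.children[i], k, t)
        return
    if not x.children:
        return                                        # key is not in the tree
    child = x.children[i]
    if len(child.keys) == t - 1:                      # child too small to descend into
        left = x.children[i - 1] if i > 0 else None
        right = x.children[i + 1] if i + 1 < len(x.children) else None
        if left is not None and len(left.keys) >= t:      # case 3a: borrow from left
            child.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if left.children:
                child.children.insert(0, left.children.pop())
        elif right is not None and len(right.keys) >= t:  # case 3a: borrow from right
            child.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if right.children:
                child.children.append(right.children.pop(0))
        elif left is not None:                            # case 3b: merge with left
            _merge(x, i - 1)
            child = left
        else:                                             # case 3b: merge with right
            _merge(x, i)
    delete(child, k, t)

def delete_key(root, k, t):
    """Delete k and return the root; an emptied root is dropped (height shrinks)."""
    delete(root, k, t)
    return root.children[0] if not root.keys and root.children else root

root = Node("P", [
    Node("CGM", [Node("AB"), Node("DEF"), Node("JKL"), Node("NO")]),
    Node("TX", [Node("QRS"), Node("UV"), Node("YZ")]),
])
for key in "FMGDB":
    root = delete_key(root, key, t=3)
print("".join(root.keys))                        # ELPTX
print(["".join(c.keys) for c in root.children])  # ['AC', 'JK', 'NO', 'QRS', 'UV', 'YZ']
```

Each call descends one level and touches at most a node and its immediate siblings, which is consistent with the O(h) disk-operation bound stated above.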
![Page 25: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/25.jpg)
Deleting a key
Figure (a): initial tree, t=3.
Root: P
Internal nodes: C G M | T X
Leaves: A B | D E F | J K L | N O | Q R S | U V | Y Z
![Page 26: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/26.jpg)
Deleting a key
Figure (b): F deleted (case 1), t=3.
Root: P
Internal nodes: C G M | T X
Leaves: A B | D E | J K L | N O | Q R S | U V | Y Z
![Page 27: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/27.jpg)
Deleting a key
Figure (c): M deleted (case 2a), t=3.
Root: P
Internal nodes: C G L | T X
Leaves: A B | D E | J K | N O | Q R S | U V | Y Z
![Page 28: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/28.jpg)
Deleting a key
Figure (d): G deleted (case 2c), t=3.
Root: P
Internal nodes: C L | T X
Leaves: A B | D E J K | N O | Q R S | U V | Y Z
![Page 29: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/29.jpg)
Deleting a key
Figure (e): D deleted (case 3b), t=3.
Merged node: C L P T X
Leaves: A B | E J K | N O | Q R S | U V | Y Z
![Page 30: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/30.jpg)
Deleting a key
Figure (e'): the tree shrinks in height, t=3.
Root: C L P T X
Leaves: A B | E J K | N O | Q R S | U V | Y Z
![Page 31: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/31.jpg)
Deleting a key
Figure (f): B deleted (case 3a), t=3.
Root: E L P T X
Leaves: A C | J K | N O | Q R S | U V | Y Z
![Page 32: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/32.jpg)
Complexities
A large branching factor reduces the number of disk accesses required to find a key.
When the root node resides in memory, a tree of height 2 requires at most 2 disk accesses to find any key. For a tree of fixed height, the number of disk accesses is therefore constant, O(1).
![Page 33: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/33.jpg)
Complexities
Running time comprises the number of disk accesses and the CPU time.
During a disk read or write, an entire page of information is accessed.
The number of disk accesses is measured in pages read from or written to the disk.
![Page 34: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/34.jpg)
Complexities
The number of disk pages accessed is $O(h) = O(\log_t n)$.
The CPU time to traverse the keys within each node is $O(t)$.
The total time is $O(th) = O(t \log_t n)$, which is $O(\log n)$ when t is treated as a constant.
This bound is the same for every basic operation.
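The step from $O(t \log_t n)$ to $O(\log n)$ is a change of logarithm base. Written out, using base 2 as an arbitrary choice:

```latex
t \log_t n \;=\; t \cdot \frac{\log_2 n}{\log_2 t} \;=\; \frac{t}{\log_2 t}\,\log_2 n
```

For a fixed minimum degree t, the factor $t / \log_2 t$ is a constant, so the total is $O(\log n)$. As a side note that goes beyond the slides, replacing the linear scan inside each node with a binary search would cost $O(\log t)$ per node and give $O(\log t \cdot \log_t n) = O(\log n)$ regardless of t.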
![Page 35: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/35.jpg)
Applications
Databases cannot typically be maintained entirely in memory.
Secondary storage is usually used.
A B-tree is often used to index the data and to provide fast access.
![Page 36: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/36.jpg)
Applications
Searching an un-indexed, unsorted database containing n key values has a worst-case running time of O(n).
Indexed with a B-tree, the same search runs in O(log n).
![Page 37: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/37.jpg)
Applications – an example
To search for a single key in a set of one million (1,000,000) keys, a linear search requires at most 1,000,000 comparisons.
If the same data is indexed with a B-tree of order 10 with 9 levels, at most 9 × 9 = 81 comparisons are required in the worst case: the search visits 9 nodes, each holding at most 9 keys.
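One way to arrive at the figure of 81 is sketched below. It assumes that "order 10" means at most 10 children (so at most 9 keys) per node, that non-root nodes have at least ceiling(10/2) = 5 children, and that each node is scanned linearly; this reading is an assumption, not something the slides spell out, and the function names are hypothetical.

```python
import math

def worst_case_levels(n_keys, order):
    """Most levels a B-tree of the given order can have while holding n_keys keys."""
    t = math.ceil(order / 2)      # fewest children of a non-root internal node
    height = 0
    # A tree of height h holds at least 2 * t**h - 1 keys.
    while 2 * t ** (height + 1) - 1 <= n_keys:
        height += 1
    return height + 1             # number of levels = height + 1

def worst_case_comparisons(n_keys, order):
    """Levels times keys per node, assuming a linear scan inside each node."""
    return worst_case_levels(n_keys, order) * (order - 1)

print(worst_case_levels(1_000_000, 10))       # 9
print(worst_case_comparisons(1_000_000, 10))  # 81
```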
![Page 38: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/38.jpg)
Summary
A B-tree is a balanced, multi-way tree used for file organization.
Search, insert, and delete operations retain desirable logarithmic costs.
B-tree schemes guarantee at least 50% storage utilization, because every node other than the root is at least half full.
![Page 39: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/39.jpg)
Extra
B-tree variants: the B+ tree and the B* tree.
Branching factors are improved.
![Page 40: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/40.jpg)
Extra
B+ tree
Combines features of ISAM and the B-tree.
Contains index pages and data pages.
Data pages always appear as leaf nodes.
Root and intermediate nodes are index pages.
![Page 41: Group Project B- Tree Student: Yongsheng Ma](https://reader035.vdocuments.site/reader035/viewer/2022070412/5681492f550346895db66dba/html5/thumbnails/41.jpg)
Extra
B+ tree
Saves more space (but who cares).
Non-leaf and leaf nodes hold different numbers of entries.
Deletion is more complicated.
Lookup is faster because the height of the tree is smaller (items are stored more compactly).