Making B+-Trees Cache Conscious in Main Memory
Authors: Jun Rao, Kenneth A. Ross
Members: Iris Zhang, Grace Yung, Kara Kwon, Jessica Wong
Outline
1. Introduction
2. Related Work
3. Cache Sensitive B+-Trees
4. Conclusion
Motivation
1. A significant portion of execution time is spent on:
• second-level data cache misses
• first-level instruction cache misses
[Figure: System Hierarchy]
Motivation (Cont’d)
2. CPU speeds have been increasing at a much faster rate than memory speeds
• Conclusion: improving cache behavior is becoming imperative for main-memory data processing
• Resolution: use cache-conscious main-memory index structures
Cache Memories
Cache memories are small, fast static RAM memories that improve performance by holding recently referenced data.
Parameters:
• Capacity
• Block size (cache line)
• Associativity
Memory reference:
• Hit
• Miss
Cache Optimization on Index Structures—B+-Trees
• Height-balanced tree
• Minimum 50% occupancy (except for the root): each node contains m entries, where d <= m <= 2d. The parameter d is called the order of the tree, so a node holds at most n = 2d entries.
• Each node occupies one cache line (cache-line based)
• Full pointers
[Figure: B+-Tree (n = 2)]
Cache Optimization on Index Structures—CSS-Trees
• Similar to a B+-tree
• Eliminates child pointers
– Child nodes are stored in a fixed-size array.
– Nodes are numbered and stored level by level, left to right.
– The position of a child node can be calculated via arithmetic.
– No pointers
[Figure: CSS-Tree]
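The arithmetic that replaces child pointers can be sketched as follows. This is a minimal illustration, assuming nodes are numbered from 0 in level order and every node holds the same number of keys m (so each node has m + 1 children); the function name child_index is hypothetical and not taken from the slides.

```python
def child_index(node: int, branch: int, m: int) -> int:
    """Array index of child number `branch` (0-based) of node `node`.

    Assumes nodes are numbered from 0 in level order and every node
    has m keys, hence m + 1 children.
    """
    return node * (m + 1) + 1 + branch


# Assumption: a 12-byte cache line filled only with 4-byte keys gives m = 3.
m = 3
root_children = [child_index(0, b, m) for b in range(m + 1)]
first_grandchild = child_index(root_children[0], 0, m)
print(root_children, first_grandchild)
```

Because the position is computed rather than stored, the whole cache line is available for keys, which is the source of the space saving.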
Comparison between B+-Trees and CSS-Trees
• Cache line size = 12 bytes, key size = pointer size = 4 bytes
• Search key = 3
[Figure: B+-Tree and CSS-Tree]
Comparison between B+-Trees and CSS-Trees (cont’d)
B+-Tree
• full pointers
• more cache accesses and more cache misses
• efficient for update operations, e.g. insertion and deletion
CSS-Tree
• no pointers
• fewer cache accesses and fewer cache misses
• acceptable for static data updated in batches
Conclusion: use partial pointer elimination
Cache Sensitive B+-Trees
1. Cache Sensitive B+-Trees with One Child Pointer
2. Segmented CSB+-Trees
3. Full CSB+-Trees
Cache Sensitive B+-Trees with One Pointer
• Similar to a B+-tree
• All the child nodes of any given node are put into a node group, which the parent references with one pointer
• Nodes within a node group are stored contiguously and can be accessed using an offset from the first node in the group
Cache Sensitive B+-Trees with One Pointer (cont’d)
• Cache misses are reduced because a cache line can hold more keys than in a B+-Tree and can therefore satisfy one more level of comparisons.
• CSB+-Trees support incremental updates in a way similar to B+-Trees
Example: cache line size = 64 bytes, key size = pointer size = 4 bytes
• B+-Tree: 7 keys per node
• CSB+-Tree: 14 keys per node
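The 7 versus 14 figures can be reproduced with a short calculation. This sketch assumes each node also stores a 4-byte key count (COUNT_FIELD below); that field is an assumption not stated on the slide, chosen because it makes the arithmetic match the numbers given.

```python
CACHE_LINE = 64    # bytes, from the slide
KEY_SIZE = 4       # bytes, from the slide
POINTER_SIZE = 4   # bytes, from the slide
COUNT_FIELD = 4    # bytes, ASSUMED per-node key counter


def bplus_keys_per_node(line: int = CACHE_LINE) -> int:
    # n keys + (n + 1) child pointers + count field must fit in one line.
    return (line - COUNT_FIELD - POINTER_SIZE) // (KEY_SIZE + POINTER_SIZE)


def csb_keys_per_node(line: int = CACHE_LINE) -> int:
    # n keys + a single first-child pointer + count field must fit in one line.
    return (line - COUNT_FIELD - POINTER_SIZE) // KEY_SIZE


print(bplus_keys_per_node(), csb_keys_per_node())
```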
Operations on CSB+-Tree—Bulkload
2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 36|39
3| 13|19 25| 33|
7| 30|
22|
Operations on CSB+-Tree—Insertion
1. Search for the leaf node n into which the new entry should be inserted
2. If n is not full, insert the new entry in the appropriate place
3. Otherwise, split n. Let p be n's parent node, f be the first-child pointer in p, and g be the node group pointed to by f
a. If p is not full, copy g to a new node group g' in which n is split into two nodes. Let f point to g'
b. If p is full, p itself must be split: copy half of g to a new node group g' and let the first-child pointer of the new node point to g'. Then split the node group that contains p as in step a
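Step 3a of the insertion procedure can be sketched as below. This is an illustration only: the Node class and split_child function are hypothetical names, Python lists stand in for contiguous node groups, and the separator convention (each parent key is the largest key of the child to its left) is inferred from the example tree on the slides. A real implementation would physically copy the nodes into a newly allocated contiguous block.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    # Stands in for the first-child pointer: the node group holding all children.
    group: Optional[List["Node"]] = None


def split_child(parent: Node, idx: int, new_key: int) -> None:
    """Step 3a: the parent has room, and full leaf number idx must be split."""
    old_group = parent.group
    n = old_group[idx]
    keys = sorted(n.keys + [new_key])
    mid = (len(keys) + 1) // 2
    left, right = Node(keys[:mid]), Node(keys[mid:])
    # Build g': a copy of g in which n is replaced by its two halves.
    new_group = old_group[:idx] + [left, right] + old_group[idx + 1:]
    # The new separator is the largest key of the left half.
    parent.keys.insert(idx, left.keys[-1])
    # Let the first-child pointer refer to g'; g can then be freed.
    parent.group = new_group


# The rightmost part of the order-1 example: parent 33| over leaves 31|33 and 36|39.
p = Node([33], [Node([31, 33]), Node([36, 39])])
split_child(p, 1, 34)
print(p.keys, [c.keys for c in p.group])
```

Running this on the example reproduces the parent 33|36 over the leaves 31|33, 34|36 and 39| shown for key = 34.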
Operations on CSB+-Tree— Insertion (cont’d)
2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 36|39
3| 13|19 25| 33|
7| 30|
22|
key = 34
A CSB+-Tree of order 1
Operations on CSB+-Tree— Insertion (cont’d)
2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 34|36 39|
3| 13|19 25| 33|36
7| 30|
22|
key = 34
Operations on CSB+-Tree—Search
• Determine the rightmost key K in the node that is smaller than the search key
• Use the position of K as an offset from the first-child pointer to get the address of the child node
• Repeat from the first step on the child node until the search key is found or there are no more nodes to check
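The search steps can be sketched on the order-1 example tree from the insertion slides. The Node class and search function are hypothetical names, a Python list stands in for each contiguous node group, and bisect_left is used to count the keys smaller than the search key, which gives the offset into the child group.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    # Stands in for the first-child pointer; None marks a leaf.
    group: Optional[List["Node"]] = None


def search(root: Node, key: int) -> bool:
    node = root
    while node.group is not None:
        # Number of keys smaller than the search key = offset of the child to visit.
        offset = bisect_left(node.keys, key)
        node = node.group[offset]
    return key in node.keys


def leaf(*keys: int) -> Node:
    return Node(list(keys))


# The example tree before key 34 is inserted.
n3 = Node([3], [leaf(2, 3), leaf(5, 7)])
n13 = Node([13, 19], [leaf(12, 13), leaf(16, 19), leaf(20, 22)])
n25 = Node([25], [leaf(24, 25), leaf(27, 30)])
n33 = Node([33], [leaf(31, 33), leaf(36, 39)])
root = Node([22], [Node([7], [n3, n13]), Node([30], [n25, n33])])

print(search(root, 19), search(root, 34))
```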
Search method in a node
• basic approach
• uniform approach
• variable approach
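The slide only names the three approaches, so the following contrast is a sketch under assumptions: "basic" is taken to mean a conventional loop-based binary search, and "uniform" is taken to mean the same search unrolled into hard-coded comparisons on the assumption that every key slot is in use (unused slots padded with a value larger than any key). The function names and the 3-key node size are illustrative only.

```python
INF = float("inf")  # padding for unused key slots (assumption)


def basic_search(keys, n, key):
    """Loop-based binary search over the first n keys.

    Returns the number of keys smaller than `key`.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo


def uniform_search3(keys, key):
    """The same search unrolled for a node with exactly 3 key slots."""
    if keys[1] < key:
        return 3 if keys[2] < key else 2
    return 1 if keys[0] < key else 0


full = [10, 20, 30]
padded = [10, INF, INF]  # a node with only one key in use
print(basic_search(full, 3, 25), uniform_search3(full, 25))
print(basic_search(padded, 1, 50), uniform_search3(padded, 50))
```

Under the same reading, a "variable" approach would keep one unrolled routine per possible key count instead of padding.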
Segmented Cache Sensitive B+-Trees
• Problem: splitting a node group is time-consuming
• Resolution: SCSB+-Tree
– method: divide each node group into two segments, with one child pointer per segment
– result: better split performance, but worse search performance
Full CSB+-Tree
• Motivation: reduce the split cost
• Method:
– pre-allocate space for a full node group
– shift part of the node group along by one node when a node splits
• Result:
– reduces the split cost, but increases space usage
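The shifting step can be sketched as below, assuming a node group is a pre-allocated, fixed-capacity array and that a count of used slots is kept alongside it. The function name, the `used` counter and the use of plain key lists as nodes are all illustrative assumptions.

```python
def split_in_full_group(group, used, idx, left, right):
    """Replace node idx by its two halves inside a pre-allocated group.

    group: fixed-capacity list, with None in the unused slots
    used:  number of slots currently occupied
    Returns the new number of occupied slots.
    """
    if used >= len(group):
        raise ValueError("node group is full; the parent must be split")
    # Shift the nodes after idx one slot to the right, starting from the end.
    for i in range(used, idx + 1, -1):
        group[i] = group[i - 1]
    group[idx] = left
    group[idx + 1] = right
    return used + 1


# Capacity 4, three nodes in use; the middle node [16, 19] splits on key 17.
group = [[12, 13], [16, 19], [20, 22], None]
used = split_in_full_group(group, 3, 1, [16, 17], [19])
print(group, used)
```

Only the nodes to the right of the split position move, and no new group is allocated, which is where the saving over copying the entire node group comes from.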
Conclusion
• CSB+-Trees are more cache conscious than B+-Trees because of partial pointer elimination
• CSB+-Trees support efficient incremental updates, but CSS-Trees do not
• Partial pointer elimination is a general technique which can be applied to other memory structures