hashing & indexing

INDEX (CHAPTER 12)

9/23/20071

TOPICS

• Basic concepts

• Hashing

• B+-tree

INTRODUCTION

• Review

Conceptual: E-R data model

Logical: Relational data model, SQL

Physical:
– relation = a file
– Organization of records on a disk page
– Organization of attributes within a record
– Index files

Software Architecture of a DBMS

Query Parser

Query Interpreter

Query Optimizer

Relational Algebra operators: ∏, σ, ρ, δ, ←, ∪, ∩, ÷, −

Abstraction of records

Index structures

File System

Buffer Pool Manager


Implementation of σ

• Emp table:

SS#  Name    Age  Salary  dno
1    Joe     24   20000   2
2    Mary    20   25000   3
3    Bob     22   27000   4
4    Kathy   30   30000   5
5    Shideh  4    4000    1

• σ_{Salary=30,000}(Employee) returns:

SS#  Name    Age  Salary  dno
4    Kathy   30   30000   5

• Process the select operator using a file scan (linear scan):

F1 = Open the file corresponding to Employee
P = read first page of F1
While P is not null
    For each record in P, if the record satisfies the selection predicate then produce it as output
    P = read next page of F1 /* P becomes null when EoF is reached */
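The linear scan above can be sketched in Python. This is only an illustration: the `HeapFile` class, its `read_page` method, and the page size of two records per page are assumptions made for the example, not part of any real DBMS interface.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Record = Tuple[int, str, int, int, int]  # (SS#, Name, Age, Salary, dno)
Page = List[Record]


class HeapFile:
    """Hypothetical stand-in for a file made of pages."""

    def __init__(self, pages: List[Page]) -> None:
        self.pages = pages

    def read_page(self, page_no: int) -> Optional[Page]:
        # Returns None once the end of the file is reached.
        return self.pages[page_no] if page_no < len(self.pages) else None


def file_scan(f: HeapFile, predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Read every page in order and output the records that satisfy the predicate."""
    page_no = 0
    page = f.read_page(page_no)
    while page is not None:
        for record in page:
            if predicate(record):
                yield record
        page_no += 1
        page = f.read_page(page_no)


# Emp table from the slide, stored two records per page (assumed page size).
employee = HeapFile([
    [(1, "Joe", 24, 20000, 2), (2, "Mary", 20, 25000, 3)],
    [(3, "Bob", 22, 27000, 4), (4, "Kathy", 30, 30000, 5)],
    [(5, "Shideh", 4, 4000, 1)],
])

result = list(file_scan(employee, lambda r: r[3] == 30000))
print(result)
```

Every page is read regardless of how many records qualify, which is the cost an index tries to avoid.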


• The file scan reads each page of F1 through the buffer pool: fetch the page from disk if it is not in the buffer pool.


[Figure: the same Emp table and file scan, with a part of the figure labeled "Header".]

TERMINOLOGY

• An exact match selection predicate: σ_{Salary=30,000}(Employee), σ_{FirstName=“Shideh”}(Employee)

• A range selection predicate: σ_{Salary>30,000}(Employee), σ_{Salary<30,000}(Employee), σ_{Salary>30,000 and Salary<32,000}(Employee)

INTRODUCTION (Cont…)

• Motivation: Speed up those queries that reference only a small portion of the records in a file.

• Analogy: Catalog cards in the library (more than one index).

• Evaluation:
1. Access time (find)
2. Insertion time (find + add)
3. Deletion time (find + delete)
4. Space overhead

• Search-key: The attribute (or set of attributes) used to look up records in a file.

• Primary index: The index whose search key specifies the sequential order of the records within a file.

• Secondary index: The index whose search key does not specify the sequential order of the records within a file.


• Example:

Index on State:
Alaska, Alaska, Arizona, California, California
Florida, Florida, Indiana, Ohio, Ohio

Data file:
State       Name     Age  Other
Alaska      Bob      12   ...
Alaska      George   28
Arizona     David    48
California  Hellen   20
California  Jack     37
Florida     Frank    10
Florida     Charles  4
Indiana     Joe      12
Ohio        Alice    23
Ohio        David    36

Index on Name:
Alice, Bob, Charles, David, David
Frank, George, Hellen, Jack, Joe

• Assume size of disk page = 2 data records = 5 index records.

• Indexing or not indexing?

SELECT age FROM personnel WHERE name = “Alice”
SELECT age FROM personnel WHERE name = “Don”


• Example: the same personnel file and its two indexes.

• Assume size of disk page = 2 data records = 5 index records.

• Primary vs. Secondary:

SELECT name FROM personnel WHERE state = “Ohio”
SELECT age FROM personnel WHERE name = “David”


• Example: the same personnel file and its two indexes (page = 2 data records = 5 index records).

• Exact match vs. Range:

SELECT name FROM personnel WHERE state = “California”
SELECT name FROM personnel WHERE state >= “Alaska” and state <= “Florida”

• Speedup by employing binary search (is it possible?)
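As a rough illustration of the binary-search question, the sketch below uses Python's standard `bisect` module on the State column, which is sorted because the file is ordered on State. The two parallel lists are a simplification assumed for the example; a real file would be searched page by page.

```python
from bisect import bisect_left, bisect_right

# The personnel file, ordered on State (the search key of the primary index).
states = ["Alaska", "Alaska", "Arizona", "California", "California",
          "Florida", "Florida", "Indiana", "Ohio", "Ohio"]
names = ["Bob", "George", "David", "Hellen", "Jack",
         "Frank", "Charles", "Joe", "Alice", "David"]


def exact_match(state):
    """Names of all records whose State equals the given value."""
    lo = bisect_left(states, state)
    hi = bisect_right(states, state)
    return names[lo:hi]


def range_match(low, high):
    """Names of all records with low <= State <= high."""
    lo = bisect_left(states, low)
    hi = bisect_right(states, high)
    return names[lo:hi]


print(exact_match("California"))
print(range_match("Alaska", "Florida"))
```

The same trick does not work directly for predicates on Name, because the data file is not ordered on Name; that case needs the sorted entries of the index on Name instead.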

Dense Index Files

• Dense index — Index record appears for every search-key value in the file.

Example of Sparse Index Files

Multilevel Index

HASHING

Hash function:
• K: the set of all search-key values
• V: the set of all bucket addresses
• h: K → V
• K is large (perhaps infinite), but the set of search-key values actually stored in the database is much smaller than K.
• Fast lookup: To find Ki, search the bucket with address h(Ki).

HASHING (Cont…)

• Example:
– K = salary (set of all 6-digit integers)
– V = 1000 buckets addressed from 0 to 999
– h(k) = k mod 1000

SELECT name FROM personnel WHERE salary = “120,100”

• To find a 120,100 salary, we should search bucket number 100.
• Hashing is only appropriate for exact match queries.
• A bad hash function maps the values to a subset of (or a few) buckets (e.g., h(k) = k mod 10).
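The following sketch compares h(k) = k mod 1000 with the bad choice h(k) = k mod 10. The salary values other than 120,100 are made up for the illustration.

```python
from collections import Counter


def h(k, buckets=1000):
    """Hash a salary to a bucket address in the range 0 .. buckets-1."""
    return k % buckets


# Hypothetical 6-digit salaries; only 120100 comes from the slide.
salaries = [120100, 195250, 287530, 301470, 464990, 520860]

good = Counter(h(s) for s in salaries)      # k mod 1000
bad = Counter(h(s, 10) for s in salaries)   # k mod 10

print(h(120100))  # bucket searched for salary 120,100
print(len(good), "buckets used with mod 1000")
print(len(bad), "bucket(s) used with mod 10")
```

Because these salaries all end in 0, k mod 10 sends every record to bucket 0, while k mod 1000 spreads them over six different buckets.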


• Clustered Hash Index:
– The index structure and its buckets are represented as a file (say file.hash).
– The relation is stored in file.hash (i.e., each entry in file.hash corresponds to a record in the relation).
– Assuming no duplicates: the record can be accessed in 1 IO.

• Non-clustered Hash Index:
– The index structure and its buckets are represented as a file (say file.hash).
– The relation remains intact.
– Each entry in file.hash has the following format: (search-key value, RID).
– Assuming no duplicates: the record can be accessed in 2 IOs.

HEAP FILE ORGANIZATION

• Assume a student table: Student(name, age, gpa, major)
t(Student) = 16
P(Student) = 4

Page 1              Page 2              Page 3               Page 4
Bob, 21, 3.7, CS    Kane, 19, 3.8, ME   Louis, 32, 4, LS     Chris, 22, 3.9, CS
Mary, 24, 3, ECE    Lam, 22, 2.8, ME    Martha, 29, 3.8, CS  Chad, 28, 2.3, LS
Tom, 20, 3.2, EE    Chang, 18, 2.5, CS  James, 24, 3.1, ME   Leila, 20, 3.5, LS
Kathy, 18, 3.8, LS  Vera, 17, 3.9, EE   Pat, 19, 2.8, EE     Shideh, 16, 4, CS

Non-Clustered Hash Index
• A non-clustered hash index on the age attribute with 4 buckets,
• h(age) = age % B

Each entry is (age, RID), where the RID is (page number, position of the record within the page).

Bucket 0: (24, (1,2)), (32, (3,1)), (20, (1,3)), (20, (4,3)), (16, (4,4)), (24, (3,3)), (28, (4,2))
Bucket 1: (21, (1,1)), (17, (2,4)), (29, (3,2))
Bucket 2: (18, (1,4)), (22, (2,2)), (22, (4,1)), (18, (2,3))
Bucket 3: (19, (2,1)), (19, (3,4))

The heap file of Student records is unchanged.
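A minimal sketch of how such a non-clustered hash index could be built over the Student heap file. The dictionary-of-pages representation and the function names `build_index` and `lookup` are assumptions made for the illustration.

```python
from collections import defaultdict

# Heap file from the slides: page number -> records (name, age, gpa, major).
heap_file = {
    1: [("Bob", 21, 3.7, "CS"), ("Mary", 24, 3.0, "ECE"),
        ("Tom", 20, 3.2, "EE"), ("Kathy", 18, 3.8, "LS")],
    2: [("Kane", 19, 3.8, "ME"), ("Lam", 22, 2.8, "ME"),
        ("Chang", 18, 2.5, "CS"), ("Vera", 17, 3.9, "EE")],
    3: [("Louis", 32, 4.0, "LS"), ("Martha", 29, 3.8, "CS"),
        ("James", 24, 3.1, "ME"), ("Pat", 19, 2.8, "EE")],
    4: [("Chris", 22, 3.9, "CS"), ("Chad", 28, 2.3, "LS"),
        ("Leila", 20, 3.5, "LS"), ("Shideh", 16, 4.0, "CS")],
}
B = 4  # number of buckets


def build_index(heap):
    """Create (age, RID) entries, RID = (page number, position in page)."""
    buckets = defaultdict(list)
    for page_no in sorted(heap):
        for slot, record in enumerate(heap[page_no], start=1):
            age = record[1]
            buckets[age % B].append((age, (page_no, slot)))
    return buckets


def lookup(buckets, heap, age):
    """Search bucket h(age), then follow each matching RID into the heap file."""
    rids = [rid for a, rid in buckets[age % B] if a == age]
    return [heap[page_no][slot - 1] for page_no, slot in rids]


index = build_index(heap_file)
print(index[3])
print(lookup(index, heap_file, 19))
```

The lookup touches the index bucket first and the data pages second, which is why the non-clustered case costs one more IO than the clustered one when there are no duplicates.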

Clustered Hash Index
• A clustered hash index on the age attribute with 4 buckets,
• h(age) = age % B

The buckets hold the records themselves:

Bucket 0: Mary, 24, 3, ECE; Tom, 20, 3.2, EE; Louis, 32, 4, LS; James, 24, 3.1, ME; Leila, 20, 3.5, LS; Shideh, 16, 4, CS; Chad, 28, 2.3, LS
Bucket 1: Bob, 21, 3.7, CS; Vera, 17, 3.9, EE; Martha, 29, 3.8, CS
Bucket 2: Kathy, 18, 3.8, LS; Lam, 22, 2.8, ME; Chang, 18, 2.5, CS; Chris, 22, 3.9, CS
Bucket 3: Kane, 19, 3.8, ME; Pat, 19, 2.8, EE

Non-Clustered Hash Index
• A non-clustered hash index on the age attribute with 4 buckets,
• h(age) = age % B
• Pointers are page-ids

[Figure: the same non-clustered index and heap file; the hash directory entries for buckets 0 to 3 hold page-ids of the bucket pages (the page-ids 500, 1001, 706, and 101 appear in the figure).]

Clustered Hash Index (SEQUENTIAL LAYOUT)
• A clustered hash index on the age attribute with 4 buckets,
• h(age) = age % 4
• When the number of buckets is known in advance, the system may assume a sequentially laid out file to eliminate the need for the hash directory.

[Figure: the clustered buckets 0 to 3, holding the Student records, stored one after another in a single file.]

Clustered Hash Index (SEQUENTIAL LAYOUT) (Cont…)
• Offset 0 is for bucket 0.
• Offset (page size) is for bucket 1.
• In general, with buckets numbered from 0, offset (bucket-id × page size) is for bucket bucket-id.
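The directory-free addressing can be sketched as follows, assuming buckets are numbered from 0, each bucket occupies exactly one page, and bucket b therefore starts at byte offset b × page size (consistent with offset 0 for bucket 0 and one page size for bucket 1). The page size of 4096 bytes and the in-memory file are assumptions for the example.

```python
import io

PAGE_SIZE = 4096   # assumed page size in bytes
NUM_BUCKETS = 4    # known in advance, so no hash directory is needed


def bucket_offset(bucket_id):
    """Byte offset of a bucket in a sequentially laid out file."""
    return bucket_id * PAGE_SIZE


def read_bucket(f, age):
    """Seek straight to bucket h(age) = age % 4 and read its page."""
    f.seek(bucket_offset(age % NUM_BUCKETS))
    return f.read(PAGE_SIZE)


# Stand-in file: page b is filled with the byte value b so pages are easy to tell apart.
data_file = io.BytesIO(b"".join(bytes([b]) * PAGE_SIZE for b in range(NUM_BUCKETS)))

page = read_bucket(data_file, 21)   # 21 % 4 == 1
print(bucket_offset(1), page[0])
```

A bucket that needs more than one page (bucket 0 in the Student example holds seven records) would still need some overflow mechanism on top of this layout.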

[Figure: a table mapping each bucket number (0, 1, 2, …, M-2, M-1) to a block address on disk.]

Example of Non-Clustered Hash Index

[Figure: main buckets (numbered 0, 1, 2, …, 9) and overflow buckets, linked by record pointers; keys visible in the figure include 340, 460, 981, 181, 321, 761, 91, 551, 22, 72, and 522.]


Example of Hash Index

[Figure: the same layout of main buckets and overflow buckets linked by record pointers; here the overflow buckets show the keys 981, 182, and 552.]


• Loading factor
– B = # of buckets, S = # of records per bucket, R = # of records in the relation
– loading factor = R / (B × S)
– The loading factor should not exceed 80%; if that happens, double B and re-hash.

• Why might a bucket overflow?
– Heavy loading of the file
– Poor hash functions
– Statistical peculiarities

• If a bucket overflows?
– Chaining: chain an empty bucket to the bucket that overflows.
– Open addressing: If bucket h(k) is full, store the record in h(k) + 1; if that is also full, try h(k) + 2, and so on.
– Two hash functions: If bucket h(k) is full, store the record in h’(k).
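A small sketch of the loading factor rule combined with chaining. The class `ChainedHashFile`, the hash function key % B, and the choice to check the threshold before each insert are assumptions for the illustration; the other overflow strategies are not shown.

```python
def loading_factor(R, B, S):
    """R records stored in B buckets that hold S records each."""
    return R / (B * S)


class ChainedHashFile:
    """Static hashing with chaining; doubles B when the load would exceed 80%."""

    def __init__(self, B, S):
        self.B, self.S, self.R = B, S, 0
        # Each bucket is a chain of pages; the first page is the main bucket.
        self.buckets = [[[]] for _ in range(B)]

    def insert(self, key):
        if loading_factor(self.R + 1, self.B, self.S) > 0.8:
            self._rehash(2 * self.B)
        self._place(key)
        self.R += 1

    def _place(self, key):
        chain = self.buckets[key % self.B]
        if len(chain[-1]) == self.S:
            chain.append([])          # chaining: link an empty overflow page
        chain[-1].append(key)

    def _rehash(self, new_B):
        keys = [k for chain in self.buckets for page in chain for k in page]
        self.B = new_B
        self.buckets = [[[]] for _ in range(new_B)]
        for k in keys:
            self._place(k)


f = ChainedHashFile(B=2, S=2)
for key in (0, 2, 4):
    f.insert(key)
print(f.buckets)   # bucket 0 has an overflow page although the file is only 75% full
f.insert(6)        # load would reach 100%, so B is doubled and the file re-hashed
print(f.B, f.buckets)
```

The first print shows the "statistical peculiarities" case: all keys land in one bucket, so it overflows even though the overall loading factor is still below 80%.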


• Problem: The file grows and shrinks over time. Hence, how should one choose the hash function?
1. Based on current file size: performance degradation as the DB grows
2. Based on anticipated file size: wasted space initially (and reduced buffer hits)
3. Periodic reorganization: time consuming
3.1. Choose new hash function
3.2. Recompute hash value on every record
3.3. Generate new bucket assignments

• Solution:
– Dynamic hash functions: dynamic modification of h to accommodate growth and shrinkage of the DB (e.g., extendible hashing).


Extendible hashing
• Choose a hash function (h) such that it results in a b-bit (b = 32) binary number.
• The directory has a header that contains its depth, d.
• Each directory entry points to a hash bucket.
• Buckets are created on demand, as records are inserted.
• Each bucket contains a local depth used to find data.

[Figure: a directory with directory depth 2 and entries 00, 01, 10, 11 pointing to buckets; sibling entries are marked.]


Extendible hashing (continued):
• Every time a bucket overflows, its local depth is increased.
• If the local depth is greater than the depth of the directory, then the directory’s depth is increased, causing the directory to double in size.
• Each directory entry has one sibling or buddy. Two entries are buddies if they have identical bit patterns except for the dth bit.
• A bucket can overflow at any desired loading factor. That is, a split might happen every time a bucket is 80% full.


• Retrieval with Extendible hashing:

Retrieve (K0)
1. Calculate h’ = h(K0)
2. Read depth d of the directory
3. Interpret the d initial bits of h’ as an integer base 2, term this r.
4. Retrieve the bucket pointed to by the rth entry
5. Find the record in this bucket
5.1. If a hashing technique is used to organize the records in a bucket, use the d bits defined on that bucket
5.2. If necessary, follow the collision resolution scheme within this bucket.


• Insertion with Extendible hashing:

Insert (K0)
1. Apply the first four steps of Retrieve (K0) to find bucket b.
2. If the insertion of K0 into b results in no overflow, then insert K0 into b and return.
3. Otherwise, obtain a new bucket b’.
4. Set the local depth of b’ and b to equal (local depth of b + 1).
5. If the new depth is NOT greater than the depth of the directory
5.1. Distinguish between b and b’ using their new d and set the appropriate entry(ies) of the directory to point to each
5.2. Rehash the entries in bucket b and assign each individual entry to the appropriate bucket b or b’
5.3. Insert (K0)
6. If the new depth is greater than the depth of the directory
6.1. Increase the depth of the directory, doubling its size
6.2. Set each entry and its buddy to point to the old bucket that it was pointing to
6.3. Insert (K0)
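The retrieval and insertion steps can be sketched compactly in Python. This is one possible reading of the algorithm, not a definitive implementation: it doubles the directory before splitting rather than after raising the local depth, it uses a 3-bit hash h(x) = x mod 8 with buckets of three records as assumed demo parameters, and the class and method names are invented for the example.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth
        self.keys = []


class ExtendibleHash:
    def __init__(self, b=3, bucket_size=3, h=lambda k: k % 8):
        self.b, self.bucket_size, self.h = b, bucket_size, h
        self.d = 0                        # directory depth
        self.directory = [Bucket(0)]      # 2**d entries

    def _index(self, key):
        # Interpret the first d bits of the b-bit hash value as an integer r.
        return self.h(key) >> (self.b - self.d)

    def lookup(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket.keys) < self.bucket_size:
            bucket.keys.append(key)
            return
        if bucket.depth == self.b:
            raise OverflowError("all b hash bits are in use; an overflow chain is needed")
        if bucket.depth == self.d:
            # Double the directory: old entry i becomes entries 2i and 2i+1.
            self.directory = [bk for bk in self.directory for _ in (0, 1)]
            self.d += 1
        self._split(bucket)
        self.insert(key)                  # retry; may trigger another split

    def _split(self, bucket):
        bucket.depth += 1
        new_bucket = Bucket(bucket.depth)
        shift = self.d - bucket.depth
        for i, bk in enumerate(self.directory):
            # Entries whose newly significant bit is 1 move to the new bucket.
            if bk is bucket and (i >> shift) & 1:
                self.directory[i] = new_bucket
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:                # rehash the entries of the split bucket
            self.directory[self._index(k)].keys.append(k)


eh = ExtendibleHash()
for key in (2, 3, 5, 7, 11, 17, 19, 23, 29, 31):
    eh.insert(key)
print(eh.d, [bk.keys for bk in eh.directory])
```

Deletion with bucket merging and directory shrinking is left out to keep the sketch short.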


• Deletion with Extendible hashing:

Delete (K0)
1. Apply the first four steps of Retrieve (K0) to find bucket b.
2. If K0 is not in b then return with value not found
3. Otherwise, delete the entry corresponding to K0
4. If the sum of the number of entries on this page and its sibling page is below the size of a bucket then:
4.1. Copy the entries in the two buckets into one bucket b’
4.2. Depth of b’ = (depth of b - 1)
4.3. Free bucket b and its sibling
4.4. Locate the two hash directory entries pointing to b and its buddy. Set these two pointers to b’
4.5. If every pointer in the directory equals its sibling pointer then decrease the depth of the directory by one and set each entry in an obvious manner.

Use of Extendable Hash Structure: Example

• Initial hash structure, bucket size = 2

Example (Cont.)

• Hash structure after insertion of one Brighton and two Downtown records

Example (Cont.)

• Hash structure after insertion of Mianus record

Example (Cont.)

• Hash structure after insertion of three Perryridge records

Example (Cont.)

• Hash structure after insertion of Redwood and Round Hill records


• Extendible hashing:
The insertion algorithm of extendible hashing might crash when


Hashing vs. Indexing
• Hashing is appropriate for exact match queries (cannot support range queries):

SELECT A1, A2, …
FROM r
WHERE (Ai = c)

• Indexing is appropriate for both range and exact match queries:

SELECT A1, A2, …
FROM r
WHERE (Ai <= c1) and (Ai > c2)

Example

• Suppose that we are using extendable hashing on a file that contains records with the following search-key values:

2, 3, 5, 7, 11, 17, 19, 23, 29, 31

Show the extendable hash structure for this file if the hash function is h(x) = x mod 8 and buckets can hold three records.

B+-TREE

• B+-tree is a multi-level tree-structured directory

[Figure: the root at the top, internal nodes below it, leaf nodes at the bottom level, and the data file beneath the leaf nodes.]

• Clustered: Leaf nodes contain the records themselves.

B+-TREE (Cont…)

• Non-clustered: Leaf nodes contain the pairs (P, K), where P is a pointer to the record in the file and K is a search-key.


• Leaf nodes

P1 K1 P2 . . . Pn-1 Kn-1 Pn

– Maintain between ⌈(n-1)/2⌉ and n-1 values per leaf.
– If i < j then Ki < Kj

Example leaf (n = 4): 5 7 10

– Every search-key value in the file appears in some leaf node.
– Suppose Li and Lj are two leaves and i < j; then every search value in Li is less than every search value in Lj.

Example of two adjacent leaves: (5 7 10) (15 17 18)


• Internal nodes
– Maintain between ⌈n/2⌉ and n pointers per internal node
– root is an exception: It must have more than one pointer.
– Suppose a node with m pointers and 2 <= i < m:
1. Pi points to subtree containing search-key values < Ki and >= Ki-1.
2. Pm points to subtree containing search-key values >= Km-1.
3. P1 points to subtree containing search-key values < K1.

[Figure: an internal node with keys 5, 7, 10 above its leaf nodes.]

To calculate the order n of a B+-tree

• Suppose that the search key field is V = 9 bytes long, the block size is B = 512 bytes, a record pointer is Pr = 7 bytes, and a block pointer is P = 6 bytes.
1. Calculate order of the internal nodes
2. Calculate order of the leaf nodes

1. Order of the internal nodes
• An internal node of a B+-tree can have up to n tree pointers and n-1 search key values
• (n * P) + ((n-1) * V) <= B
• (n * 6) + ((n-1) * 9) <= 512
• (15 * n) <= 521
• n = 34

2. Order of the leaf nodes
• The leaf nodes of the B+-tree will have the same number of values and pointers, except that the pointers are data pointers and a next pointer.
• (nleaf * (Pr + V)) + P <= B
• (nleaf * (7 + 9)) + 6 <= 512
• (16 * nleaf) <= 506
• nleaf = 31
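The two inequalities can be solved mechanically. The helper names below are hypothetical; the formulas are just the rearranged inequalities from the calculation above.

```python
def internal_order(B, P, V):
    """Largest n with n*P + (n-1)*V <= B, i.e. n <= (B + V) / (P + V)."""
    return (B + V) // (P + V)


def leaf_order(B, P, Pr, V):
    """Largest nleaf with nleaf*(Pr + V) + P <= B (one next pointer per leaf)."""
    return (B - P) // (Pr + V)


n = internal_order(B=512, P=6, V=9)
n_leaf = leaf_order(B=512, P=6, Pr=7, V=9)
print(n, n_leaf)
```

Integer division performs the rounding down, since a node cannot hold a fraction of a pointer.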


• Lookup

[Figure: a B+-tree with root (30); internal nodes (8) and (41 50); leaf nodes (4 7), (10 20), (30 40), (41 47), (50 52); the leaf entries point to the records in the data file.]

– Find 7: 4 IOs
– Find 4-20: 4 IOs (assuming primary index), 8 IOs (assuming secondary index)
– More than 10% selection: it is more efficient to do a sequential scan (do not use the secondary index).
– Example: 10,000 records, select 1000 of them, 1000 records per disk page: (Sequential search: 10 IOs, Secondary index: potentially 1000+ IOs)
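A lookup on the tree from the figure can be sketched as below. The `Node` class and the returned read counter are assumptions for the illustration; each node visited is counted as one IO, and one more IO is needed to fetch the record itself when the index is non-clustered.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children   # None marks a leaf node


def lookup(root, key):
    """Return (found, number of nodes read) for an exact-match search."""
    node, reads = root, 1
    while node.children is not None:
        # Follow Pi, the subtree with Ki-1 <= values < Ki.
        node = node.children[bisect_right(node.keys, key)]
        reads += 1
    return key in node.keys, reads


# The tree from the figure.
root = Node([30], [
    Node([8], [Node([4, 7]), Node([10, 20])]),
    Node([41, 50], [Node([30, 40]), Node([41, 47]), Node([50, 52])]),
])

found, node_reads = lookup(root, 7)
print(found, node_reads + 1)   # 3 node reads + 1 data page read = 4 IOs
```

Every search reads the same number of nodes because all leaves are at the same depth.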


• Analysis
– “B” in B+-tree stands for Balanced, i.e., the length of every path from the root to a leaf node is the same.
– Hence, good performance for lookup, insertion, and deletion.
– K: number of search-key values in a file; then the path is no longer than ⌈log_{⌈n/2⌉}(K)⌉.
– #K = 1,000,000 and 10 <= n <= 100: then at most 4 (for n = 100) to 9 (for n = 10) nodes are accessed.
– Insertion and Deletion should not destroy the balance of the tree.
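As a quick check of the path-length figures, the sketch below evaluates the bound ⌈log_{⌈n/2⌉}(K)⌉, assuming every node is at least half full. The function name is made up for the example.

```python
import math


def max_path_length(K, n):
    """Upper bound on the root-to-leaf path length for K search-key values
    when every node has at least ceil(n/2) pointers."""
    return math.ceil(math.log(K) / math.log(math.ceil(n / 2)))


for n in (10, 100):
    print(n, max_path_length(1_000_000, n))
```

With K = 1,000,000 this gives 9 nodes for n = 10 and 4 nodes for n = 100, so a larger fan-out shortens the path considerably.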


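n = 4; Internal nodes: 2 to 4 pointers; Leaf nodes: 2 to 3 values

Initial tree: root (8 25); leaves (4 7), (10 20), (30 40)

Insert 41: root (8 25); leaves (4 7), (10 20), (30 40 41)

Insert 47: the leaf (30 40 41 47) overflows and is split into (30 40) and (41 47); 41 is added to the root.
Result: root (8 25 41); leaves (4 7), (10 20), (30 40), (41 47)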


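Insert 50: root (8 25 41); leaves (4 7), (10 20), (30 40), (41 47 50)

Insert 52: the leaf (41 47 50 52) overflows and is split into (41 47) and (50 52); the root (8 25 41 50) then overflows and is split as well.
Result: root (41); internal nodes (8 25) and (50); leaves (4 7), (10 20), (30 40), (41 47), (50 52)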

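B+-TREE (Cont…)

Before: root (30); internal nodes (8) and (41 50); leaves (4 7), (10 20), (30 40), (41 47), (50 52)

Delete 20: the leaf is left with the single value 10, which is below the minimum of 2 values.

After: root (41); internal nodes (30) and (50); leaves (4 7 10), (30 40), (41 47), (50 52)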

Example

• Construct a B+-tree for the following set of values:

2, 3, 5, 7, 11, 17, 19, 23, 29, 31

• Assume n = 4 (number of pointers)
– Inner nodes: 2 to 4 children
– Leaf nodes: 2 to 3 values