CS 4604: Introduction to Database Management Systems
B. Aditya Prakash
Lecture #8: Indexes and Hashing


Source: courses.cs.vt.edu/~cs4604/Spring16/lectures/lecture-8.pdf


Announcements

§ Check the office hours schedule on Piazza.
– Several extra ones
– My Wed office hour this week (Feb 24) is canceled
§ On Wed Feb 24:
– Shamimul and Sorour will give the lecture on (maybe) some hashing and sorting.

Prakash 2016, VT CS 4604

STORING DATA

DBMS Layers:

Query Optimization and Execution

Relational Operators

Files and Access Methods

Buffer Management

Disk Space Management

DB

Queries

TODAY →

Leverage OS for disk/file management?

§ Layers of abstraction are good … but:
– Unfortunately, the OS often gets in the way of the DBMS
§ DBMS wants/needs to do things "its own way"
– Specialized prefetching
– Control over buffer replacement policy
• LRU not always best (sometimes worst!!)
– Control over thread/process scheduling
• "Convoy problem": arises when OS scheduling conflicts with DBMS locking
– Control over flushing data to disk
• WAL protocol requires flushing log entries to disk
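The remark that LRU is sometimes the worst policy can be seen with repeated sequential scans of a file that is slightly larger than the buffer pool (often called sequential flooding). The sketch below is only an illustration: the `simulate` helper, the 3-frame pool, and the 4-page file are assumptions, not part of the lecture.

```python
from collections import OrderedDict

def simulate(policy, num_frames, accesses):
    """Count buffer hits for a page-access trace under 'LRU' or 'MRU' eviction."""
    pool = OrderedDict()  # pages ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in pool:
            hits += 1
            pool.move_to_end(page)  # mark as most recently used
            continue
        if len(pool) >= num_frames:
            # LRU evicts the oldest entry (front); MRU evicts the newest (back).
            pool.popitem(last=(policy == "MRU"))
        pool[page] = None
    return hits

# Hypothetical workload: scan a 4-page file five times with only 3 frames.
trace = [1, 2, 3, 4] * 5
lru_hits = simulate("LRU", 3, trace)
mru_hits = simulate("MRU", 3, trace)
print("LRU hits:", lru_hits, "MRU hits:", mru_hits)
```

Under LRU every access in this trace misses, because each page is evicted just before it is needed again; MRU keeps most of the file resident. A DBMS that knows it is running a scan can pick the better policy, which a general-purpose OS cache cannot.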

Disks and Files

§ DBMS stores information on disks.
– but: disks are (relatively) VERY slow!
§ Major implications for DBMS design:
– READ: disk -> main memory (RAM).
– WRITE: reverse
– Both are high-cost operations, relative to in-memory operations, so must be planned carefully!

Why Not Store It All in Main Memory?

§ Costs too much.
– disk: ~$1/GB; memory: ~$100/GB
– High-end databases today are in the 10-100 TB range.
– Approx. 60% of the cost of a production system is in the disks.
§ Main memory is volatile.
§ Note: some specialized systems do store the entire database in main memory.
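To make the cost argument concrete, the slide's per-gigabyte prices can be applied to the quoted database sizes. This is a back-of-the-envelope sketch; the helper name and the 1 TB = 1024 GB conversion are assumptions.

```python
DISK_USD_PER_GB = 1      # ~$1/GB, from the slide
MEMORY_USD_PER_GB = 100  # ~$100/GB, from the slide

def storage_cost_usd(size_tb, usd_per_gb):
    """Cost of holding size_tb terabytes at the given price (1 TB = 1024 GB)."""
    return size_tb * 1024 * usd_per_gb

for size_tb in (10, 100):
    disk = storage_cost_usd(size_tb, DISK_USD_PER_GB)
    memory = storage_cost_usd(size_tb, MEMORY_USD_PER_GB)
    print(f"{size_tb} TB: disk ${disk:,}, memory ${memory:,}")
```

At these prices the same data costs two orders of magnitude more in RAM than on disk, before volatility is even considered.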

The Storage Hierarchy

– Main memory (RAM) for currently used data.
– Disk for the main database (secondary storage).
– Tapes for archiving older versions of the data (tertiary storage).

[Figure: the storage hierarchy, from smaller and faster to bigger and slower: Registers, L1 Cache, Main Memory, Magnetic Disk, Magnetic Tape, ...]

Jim Gray's Storage Latency Analogy: How Far Away is the Data?

Storage level     Relative latency   Analogy         Travel time
Registers         1                  My Head         1 min
On Chip Cache     2                  This Room
On Board Cache    10                 This Building   10 min
Memory            100                Boston          1.5 hr
Disk              10^6               Pluto           2 Years
Tape              10^9               Andromeda       2,000 Years

Disks

§ Secondary storage device of choice.
§ Main advantage over tapes: random access vs. sequential.
§ Data is stored and retrieved in units called disk blocks or pages.
§ Unlike RAM, time to retrieve a disk page varies depending upon location on disk.
– relative placement of pages on disk is important!

Anatomy of a Disk

• Sector
• Track
• Cylinder
• Platter
• Block size = multiple of sector size (which is fixed)

[Figure: disk anatomy, labeling the spindle, platters, tracks, a sector, the disk head, the arm assembly, and the direction of arm movement.]

Accessing a Disk Page

§ Time to access (read/write) a disk block:
– seek time: moving arms to position disk head on track
– rotational delay: waiting for block to rotate under head
– transfer time: actually moving data to/from disk surface
§ Relative times?
– seek time: about 1 to 20 msec
– rotational delay: 0 to 10 msec
– transfer time: < 1 msec per 4 KB page

[Figure: timeline of one disk access: seek, then rotate, then transfer.]

Seek time & rotational delay dominate

§ Key to lower I/O cost: reduce seek/rotation delays!
§ Also note: for shared disks, much time is spent waiting in queue for access to arm/controller

[Figure: timeline of one disk access; the seek and rotate segments are much longer than the transfer segment.]
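A rough cost model shows why seek and rotation dominate. The sketch below assumes mid-range values picked from the ranges quoted above (10 ms seek, 5 ms rotation, 0.5 ms transfer per page); the function names and the idealized sequential case (one seek, then pure transfer) are illustrative assumptions.

```python
def access_time_ms(seek_ms, rotation_ms, transfer_ms):
    """Time for one page access = seek + rotational delay + transfer."""
    return seek_ms + rotation_ms + transfer_ms

def read_cost_ms(num_pages, sequential, seek_ms=10.0, rotation_ms=5.0, transfer_ms=0.5):
    """Rough cost of reading num_pages pages under the assumed timings."""
    if sequential:
        # Idealized: one seek and one rotational wait, then pages stream past the head.
        return seek_ms + rotation_ms + num_pages * transfer_ms
    # Random: every page pays the full seek + rotation + transfer.
    return num_pages * access_time_ms(seek_ms, rotation_ms, transfer_ms)

random_ms = read_cost_ms(1000, sequential=False)
sequential_ms = read_cost_ms(1000, sequential=True)
print(f"random: {random_ms:.0f} ms, sequential: {sequential_ms:.0f} ms")
```

The idealized sequential case ignores track and cylinder switches, so it overstates the gap compared with real disks, but the direction of the result is the point: laying out pages so that the "next" page is nearby avoids most of the per-page cost.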

Arranging Pages on Disk

§ "Next" block concept:
– blocks on same track, followed by
– blocks on same cylinder, followed by
– blocks on adjacent cylinder
§ Accessing 'next' block is cheap
§ A useful optimization: pre-fetching
– See textbook page 323

Rules of thumb…

1. Memory access much faster than disk I/O (~1000x)
2. "Sequential" I/O faster than "random" I/O (~10x)

Conclusions: Storing

§ Memory hierarchy
§ Disks: (>1000x slower) - thus
– pack info in blocks
– try to fetch nearby blocks (sequentially)

TREE INDEXES

Declaring Indexes

§ No standard!
§ Typical syntax:
CREATE INDEX StudentsInd ON Students(ID);
CREATE INDEX CoursesInd ON Courses(Number, DeptName);

Types of Indexes

§ Primary: index on a key
– Used to enforce constraints
§ Secondary: index on non-key attribute
§ Clustering: order of the rows in the data pages corresponds to the order of the rows in the index
– Only one clustered index can exist in a given table
– Useful for range predicates
§ Non-clustering: physical order not the same as index order

Using Indexes (1): Equality Searches

§ Given a value v, the index takes us to only those tuples that have v in the attribute(s) of the index.
§ E.g. (use CoursesInd index)
SELECT Enrollment FROM Courses WHERE Number = '4604' AND DeptName = 'CS';
§ Can use hashes, but see next
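One way to check that an equality search actually uses the index is to ask the database for its plan. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; the column types and sample rows are made up, and `EXPLAIN QUERY PLAN` is SQLite-specific syntax (other systems spell it differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (Number TEXT, DeptName TEXT, Enrollment INTEGER)")
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("4604", "CS", 60), ("2114", "CS", 120), ("4604", "ECE", 30)],  # made-up rows
)
conn.execute("CREATE INDEX CoursesInd ON Courses(Number, DeptName)")

query = "SELECT Enrollment FROM Courses WHERE Number = '4604' AND DeptName = 'CS'"
rows = conn.execute(query).fetchall()

# The last column of each plan row describes the chosen access path.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
print(rows)
print(plan)
```

If the plan text names CoursesInd, the lookup goes through the index rather than scanning every row of Courses.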

Using Indexes (2): Range Searches

§ ``Find all students with gpa > 3.0''
§ may be slow, even on sorted file
§ Hashes not a good idea!
§ What to do?

[Figure: data file as a sequence of pages: Page 1, Page 2, Page 3, ..., Page N.]

Range Searches

§ Solution: create an `index' file.

[Figure: index file with entries k1, k2, ..., kN, each pointing to the corresponding page of the data file.]

§ More details:
§ if index file is small, do binary search there
§ Otherwise??
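The index-file idea can be sketched in a few lines: keep the first key of every data page in a small sorted list, binary search that list, then scan the data file from the page it points to. The page contents, the `range_search` name, and the choice of "first key on the page" as k_i are assumptions for illustration.

```python
from bisect import bisect_right

# Hypothetical sorted data file: each page holds a few (gpa, name) records.
data_pages = [
    [(2.1, "Ann"), (2.5, "Bob")],
    [(2.9, "Cid"), (3.0, "Dee")],
    [(3.2, "Eve"), (3.6, "Fay")],
    [(3.8, "Gus"), (4.0, "Hal")],
]
# Index file: k_i is the smallest key on page i.
index_keys = [page[0][0] for page in data_pages]

def range_search(low):
    """Return all records with gpa > low, starting at the page the index points to."""
    # Binary search the index for the last page whose first key is <= low.
    start = max(bisect_right(index_keys, low) - 1, 0)
    result = []
    for page in data_pages[start:]:  # sequential scan from that page onward
        result.extend(rec for rec in page if rec[0] > low)
    return result

print(range_search(3.0))
```

When the index file itself grows too large to binary search cheaply, the same trick can be applied again to the index, which leads to the tree structures that follow.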

B-trees

§ the most successful family of index schemes (B-trees, B+-trees, B*-trees)
§ Can be used for primary/secondary, clustering/non-clustering index.
§ balanced "n-way" search trees
§ Original paper: Rudolf Bayer and E. M. McCreight. Organization and Maintenance of Large Ordered Indexes. Acta Informatica 1, 173-189, 1972.

B-trees

§ E.g., B-tree of order d = 1:

[Figure: B-tree of order d = 1. The root holds keys 6 and 9; its three children hold 1, 3 (keys < 6), 7 (keys > 6 and < 9), and 13 (keys > 9).]

B-tree properties:

§ each node, in a B-tree of order d:
– Key order
– at most n = 2d keys
– at least d keys (except root, which may have just 1 key)
– all leaves at the same level
– if number of pointers is k, then node has exactly k-1 keys
– (leaves are empty)

[Figure: node layout with keys v1, v2, …, vn-1 and pointers p1, …, pn.]

Properties

§ "block aware" nodes: each node is a disk page
§ O(log(N)) for everything! (ins/del/search)
§ typically, if d = 50-100, then 2-3 levels
§ utilization >= 50%, guaranteed; on average 69%

Queries

§ Algorithm for exact match query? (e.g., ssn = 8?)

[Figure: the B-tree of order d = 1 (root 6, 9; children 1, 3 / 7 / 13), stepped through one level at a time.]

§ H steps (= disk accesses)

Java animation

§ http://slady.net/java/bt/
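The exact-match algorithm is a single root-to-leaf descent, one node (one disk page) per level. A minimal sketch on the order d = 1 example tree follows; the `Node` class and `search` function are illustrative names, not code from the course.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1 subtrees

def search(node, key):
    """Return (found, nodes_visited) for an exact-match lookup in a B-tree."""
    visited = 0
    while node is not None:
        visited += 1                     # one node = one disk page access
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the subtree between keys[i-1] and keys[i], if any.
        node = node.children[i] if node.children else None
    return False, visited

# The order d = 1 example tree from the slides.
root = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(root, 8))   # 8 is absent: the search ends in the leaf holding 7
print(search(root, 13))
```

In a plain B-tree a search can stop early when the key is found in an inner node, so H is an upper bound on the number of page accesses.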

Queries

§ what about range queries? (e.g., 5 < salary < 8)
§ Proximity/nearest neighbor searches? (e.g., salary ~ 8)

[Figure: the B-tree of order d = 1 (root 6, 9; children 1, 3 / 7 / 13), stepped through one level at a time.]

Variations

§ How could we do even better than the B-trees above?

B+ trees - Motivation

§ B-tree – print keys in sorted order:
§ B-tree needs back-tracking – how to avoid it?
§ Stronger reason: for clustering index, data records are scattered:

[Figure: the B-tree of order d = 1 (root 6, 9; children 1, 3 / 7 / 13).]

Solution: B+-trees

§ facilitate sequential ops
§ They string all leaf nodes together
§ AND
§ replicate keys from non-leaf nodes, to make sure every key appears at the leaf level
§ (vital, for clustering index!)

B+ trees

[Figure: B+-tree. Index pages: the root holds keys 6 and 9. Data pages (leaves): 1, 3 (keys < 6); 6, 7 (keys >= 6 and < 9); 9, 13 (keys >= 9).]
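With every key present at the leaf level and the leaves strung together, a range query needs one descent followed by a walk along the leaves, with no back-tracking. The sketch below rebuilds the small example tree; the class names and the `next` sibling pointer are illustrative assumptions.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # sibling pointer that strings the leaves together

class Index:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def range_scan(root, low, high):
    """Return all keys k with low < k < high: one descent plus a leaf walk."""
    node = root
    while isinstance(node, Index):
        # Keys equal to a separator live in the right subtree (the >= rule).
        node = node.children[bisect_right(node.keys, low)]
    out = []
    while node is not None:
        for k in node.keys:
            if k >= high:
                return out
            if k > low:
                out.append(k)
        node = node.next  # follow the leaf chain instead of going back up
    return out

# The example B+-tree: index page (6, 9) over three linked data pages.
l1, l2, l3 = Leaf([1, 3]), Leaf([6, 7]), Leaf([9, 13])
l1.next, l2.next = l2, l3
root = Index([6, 9], [l1, l2, l3])
print(range_scan(root, 5, 8))
```

The same walk prints all keys in sorted order when started from the leftmost leaf.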

B+ trees

§ More details: next (and textbook)
§ In short: on split
– at leaf level: COPY middle key upstairs
– at non-leaf level: push middle key upstairs (as in plain B-tree)

Example B+ Tree

§ Search begins at root, and key comparisons direct it to a leaf
§ Search for 5*, 15*, all data entries >= 24* ...

[Figure: B+ tree. Root: 13, 17, 24, 30. Leaves: (2* 3* 5* 7*), (14* 16*), (19* 20* 22*), (24* 27* 29*), (33* 34* 38* 39*).]

Based on the search for 15*, we know it is not in the tree!

Inserting a Data Entry into a B+ Tree

§ Find correct leaf L.
§ Put data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2)
• Redistribute entries evenly, copy up middle key.
§ parent node may overflow
– but then: push up middle key. Splits "grow" tree; root split increases height.
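The insertion steps above can be sketched as a short recursive routine. This is a simplified in-memory version, assuming order d = 2 (at most 4 keys per node, as in the lecture's examples) and omitting sibling pointers and the deferred-split optimization; `Node`, `insert`, and `insert_root` are illustrative names.

```python
from bisect import bisect_right, insort

D = 2  # assumed order: every node holds at most 2 * D keys

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None for a leaf

def insert(node, key):
    """Insert key below node; return (separator, new_right_node) if node split."""
    if node.children is None:                   # leaf: put the entry here
        insort(node.keys, key)
        if len(node.keys) <= 2 * D:
            return None
        mid = len(node.keys) // 2
        right = Node(node.keys[mid:])           # middle key stays in the new leaf...
        node.keys = node.keys[:mid]
        return right.keys[0], right             # ...and is COPIED up
    i = bisect_right(node.keys, key)            # child whose range contains key
    split = insert(node.children[i], key)
    if split is None:
        return None
    sep, right_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right_child)
    if len(node.keys) <= 2 * D:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]                         # middle key is PUSHED up...
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, right                            # ...and removed from this level

def insert_root(root, key):
    """Insert into the tree and return the (possibly new) root."""
    split = insert(root, key)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])           # root split: height grows by one

# The example tree used for the insertions of 8* and 21*.
leaves = [Node([2, 3, 5, 7]), Node([14, 16]), Node([19, 20, 22, 23]), Node([24, 27, 29])]
root = Node([13, 17, 24], leaves)
root = insert_root(root, 8)
keys_after_8 = list(root.keys)
root = insert_root(root, 21)
print(keys_after_8, root.keys, [child.keys for child in root.children])
```

Inserting 8 splits a leaf and copies 5 into the root; inserting 21 then overflows the root, which splits and pushes 17 into a new root.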

Example B+ Tree – Inserting 30*

[Figure, before: root 13, 17, 24; leaves (2* 3* 5* 7*), (14* 16*), (19* 20* 22* 23*), (24* 27* 29*).]

[Figure, after: 30* fits in the last leaf, which becomes (24* 27* 29* 30*); nothing else changes.]

Example B+ Tree - Inserting 8*

[Figure, before: root 13, 17, 24; leaves (2* 3* 5* 7*), (14* 16*), (19* 20* 22* 23*), (24* 27* 29*).]

§ No space in leaf (2* 3* 5* 7*)
§ So split!
§ And then push the middle key UP (copied, since this is a leaf split)

[Figure, final state: root 5, 13, 17, 24; leaves (2* 3*) for keys < 5, (5* 7* 8*) for keys >= 5, then (14* 16*), (19* 20* 22* 23*), (24* 27* 29*).]

Example B+ Tree - Inserting 21*

[Figure, before: root 5, 13, 17, 24; leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20* 22* 23*), (24* 27* 29*).]

[Figure, after the leaf split: leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20*), (21* 22* 23*), (24* 27* 29*); the root would have to hold 5, 13, 17, 21, 24.]

§ Root is full, so split recursively

Example B+ Tree: Recursive split

• Notice that root was also split, increasing height.

[Figure: new root 17; index nodes (5, 13) and (21, 24); leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20*), (21* 22* 23*), (24* 27* 29*).]

Example: Data vs. Index Page Split

§ leaf: 'copy'
§ non-leaf: 'push'
§ why not 'copy' @ non-leaves?

[Figure, data page split: leaf (2* 3* 5* 7*) plus 8* splits into (2* 3*) and (5* 7* 8*); 5 is copied up and still appears in the leaf.]

[Figure, index page split: index node (5, 13, 17, 21, 24) splits into (5, 13) and (21, 24); 17 is pushed up and appears only once.]

Same Inserting 21*: The Deferred Split

[Figure, before: root 5, 13, 17, 24; leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20* 22* 23*), (24* 27* 29*).]

§ Note the sibling leaf (24* 27* 29*) has free space. So…
§ LEND keys to sibling, through PARENT!

[Figure, after: root 5, 13, 17, 23; leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20* 21* 22*), (23* 24* 27* 29*).]

§ Shorter, more packed, faster tree

Insertion examples for you to try

Insert the following data entries (in order): 28*, 6*, 25*

[Figure: root 30 (right subtree not shown); index node (5, 13, 20); leaves (2* 3*), (5* 7* 8* 11*), (14* 16*), (21* 22* 23*).]

Answer…

After inserting 28*, 6*

[Figure: root 30 (right subtree not shown); index node (5, 7, 13, 20); leaves (2* 3*), (5* 6*), (7* 8* 11*), (14* 16*), (21* 22* 23* 28*).]

After inserting 25*

[Figure: root 13, 30; index nodes (5, 7) and (20, 23); leaves (2* 3*), (5* 6*), (7* 8* 11*), (14* 16*), (21* 22*), (23* 25* 28*).]

Deleting a Data Entry from a B+ Tree

§ Start at root, find leaf L where entry belongs.
§ Remove the entry.
– If L is at least half-full, done!
– If L underflows
• Try to re-distribute, borrowing from sibling (adjacent node with same parent as L).
• If re-distribution fails, merge L and sibling.
– update parent
– and possibly merge, recursively
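The leaf-level part of the deletion steps can be sketched as follows. This is a deliberately small model, assuming order d = 2 and representing one parent node as a list of separator keys plus a list of leaves (each a sorted list); it prefers the right sibling and does not propagate underflow to the parent, which a full implementation would do recursively. All names are illustrative.

```python
D = 2  # assumed order: a non-root leaf must keep at least D entries

def delete_from_leaf(parent_keys, leaves, i, key):
    """Delete key from leaves[i]; repair underflow by borrowing or merging.

    parent_keys[k] separates leaves[k] and leaves[k + 1]. Assumes the parent
    has at least two leaves. Returns what happened as a string.
    """
    leaf = leaves[i]
    leaf.remove(key)
    if len(leaf) >= D:
        return "done"
    # Pick an adjacent sibling under the same parent (right one if it exists).
    j = i + 1 if i + 1 < len(leaves) else i - 1
    left, right = min(i, j), max(i, j)
    sibling = leaves[j]
    if len(sibling) > D:
        # Re-distribute: move one entry across, then copy the new boundary key up.
        if j > i:
            leaf.append(sibling.pop(0))
        else:
            leaf.insert(0, sibling.pop())
        parent_keys[left] = leaves[right][0]
        return "redistributed"
    # Merge: pull everything into the left node and drop the separator.
    leaves[left].extend(leaves[right])
    del leaves[right]
    del parent_keys[left]
    return "merged"  # the caller must now check the parent for underflow

# Right-hand index node of the textbook example: separators (24, 30).
parent_keys = [24, 30]
leaves = [[19, 20, 22], [24, 27, 29], [33, 34, 38, 39]]
steps = [delete_from_leaf(parent_keys, leaves, 0, k) for k in (19, 20, 24)]
print(steps, parent_keys, leaves)
```

Run on the leaves under the index node (24, 30), deleting 19, 20, and 24 in turn exercises all three cases: no underflow, re-distribution (27 is copied up), and merge (the separator is dropped, leaving the parent itself under-full).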

Deletion from B+ Tree

[Figure: root 17; index nodes (5, 13) and (21, 24); leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20*), (21* 22* 23*), (24* 27* 29*).]

Example: Delete 19* & 20*

• Deleting 19* is easy:

[Figure, step 1 (before): root 17; index nodes (5, 13) and (24, 30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (19* 20* 22*), (24* 27* 29*), (33* 34* 38* 39*).]

[Figure, step 2 (after deleting 19*): the leaf becomes (20* 22*); nothing else changes.]

• Deleting 20* -> re-distribution (notice: 27 copied up)

[Figure, step 3 (after deleting 20*): root 17; index nodes (5, 13) and (27, 30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (22* 24*), (27* 29*), (33* 34* 38* 39*).]

... And Then Deleting 24*

[Figure, step 3 (before): root 17; index nodes (5, 13) and (27, 30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (22* 24*), (27* 29*), (33* 34* 38* 39*).]

• Must merge leaves: OPPOSITE of insert

[Figure, step 4 (after): root 17; index nodes (5, 13) and (30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (22* 27* 29*), (33* 34* 38* 39*).]

… but are we done??

... Merge Non-Leaf Nodes, Shrink Tree

[Figure, step 4 (before): root 17; index nodes (5, 13) and (30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (22* 27* 29*), (33* 34* 38* 39*).]

[Figure, step 5 (after): root 5, 13, 17, 30; leaves (2* 3*), (5* 7* 8*), (14* 16*), (22* 27* 29*), (33* 34* 38* 39*).]

Example of Non-leaf Re-distribution

§ Tree is shown below during deletion of 24*.
§ Now, we can re-distribute keys

[Figure: root 22; index nodes (5, 13, 17, 20) and (30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (17* 18*), (20* 21*), (22* 27* 29*), (33* 34* 38* 39*).]

After Re-distribution

§ need only re-distribute '20'; did '17', too
§ why would we want to re-distribute more keys?

[Figure: root 17; index nodes (5, 13) and (20, 22, 30); leaves (2* 3*), (5* 7* 8*), (14* 16*), (17* 18*), (20* 21*), (22* 27* 29*), (33* 34* 38* 39*).]

Main observations for deletion

§ If a key value appears twice (leaf + nonleaf), the above algorithms delete it from the leaf, only
§ why not non-leaf, too?
§ 'lazy deletions' - in fact, some vendors just mark entries as deleted (~ underflow),
– and reorganize/compact later

Recap: main ideas

§ on overflow, split (and 'push', or 'copy')
– or consider deferred split
§ on underflow, borrow keys; or merge
– or let it underflow...

B+ Trees in Practice

§  Typical order: 100. Typical fill-factor: 67%.
– average fanout = 2*100*0.67 = 134
§  Typical capacities:
– Height 4: 134^4 = 322,417,936 entries
– Height 3: 134^3 = 2,406,104 entries

B+ Trees in Practice

§  Can often keep top levels in buffer pool:
– Level 1 = 1 page = 8 KB
– Level 2 = 134 pages ≈ 1 MB
– Level 3 = 17,956 pages ≈ 140 MB
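The arithmetic behind these two slides can be reproduced in a few lines. This is a sketch assuming every node has the average fanout and every page is 8 KB; the function names are illustrative.

```python
def average_fanout(order, fill_factor):
    # A node of order d holds up to 2d entries; the fill factor scales that.
    return round(2 * order * fill_factor)


def capacity(fanout, height):
    # Entries reachable when every one of `height` levels fans out fully.
    return fanout ** height


def level_sizes(fanout, levels, page_kb=8):
    # (level number, pages on that level, size in KB); the root is level 1.
    return [(lvl, fanout ** (lvl - 1), fanout ** (lvl - 1) * page_kb)
            for lvl in range(1, levels + 1)]


fanout = average_fanout(order=100, fill_factor=0.67)
for lvl, pages, kb in level_sizes(fanout, levels=3):
    print(f"Level {lvl}: {pages} pages, {kb} KB")
```

Caching the top three levels costs roughly 140 MB of buffer pool, which is why a lookup usually needs only one or two real disk accesses.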

Bulk Loading of a B+ Tree

§  In an empty tree, insert many keys
§  Why not one-at-a-time?
– Too slow!

Bulk Loading of a B+ Tree

§  Initialization: Sort all data entries
§  Scan list; whenever enough for a page, pack
§  <repeat for upper level>

[Figure: an empty root above sorted pages of data entries, not yet in the B+ tree: 3* 4* | 6* 9* | 10* 11* | 12* 13* | 20* 22* | 23* 31* | 35* 36* | 38* 41* | 44*.]

Bulk Loading of a B+ Tree

[Figure: two snapshots of bulk loading over the sorted data entry pages 3* 4* | 6* 9* | 10* 11* | 12* 13* | 20* 22* | 23* 31* | 35* 36* | 38* 41* | 44*. First snapshot: root (10, 20) over index nodes (6), (12) and (23, 35). Second snapshot: root (20) over index nodes (10) and (35), which sit over (6), (12), (23) and (38). In both, the remaining data entry pages are not yet in the B+ tree.]
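The three bullets (sort, pack pages, repeat for the upper level) can be sketched as a level-by-level build. This is a simplified sketch, assuming nodes are plain lists and using a hypothetical `bulk_load` function; it packs each level completely before starting the next, rather than growing the rightmost path as the figure does, and it may leave the last node of a level underfull.

```python
def pack(items, size):
    # Cut a list into consecutive chunks of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


def bulk_load(keys, leaf_capacity, fanout):
    """Return the tree as a list of levels, leaves first, root level last."""
    leaves = pack(sorted(keys), leaf_capacity)
    levels = [leaves]
    # lows[i] is the smallest key reachable under node i of the current level.
    lows = [leaf[0] for leaf in leaves]
    while len(lows) > 1:
        groups = pack(lows, fanout)
        # An index node stores the low key of every child except the first.
        levels.append([group[1:] for group in groups])
        lows = [group[0] for group in groups]
    return levels


entries = [3, 4, 6, 9, 10, 11, 12, 13, 20, 22, 23, 31, 35, 36, 38, 41, 44]
levels = bulk_load(entries, leaf_capacity=2, fanout=3)
```

Each page is written once, in key order, instead of being found and possibly split once per inserted key.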

A Note on ‘Order’

§  Order (d) concept replaced by physical space criterion in practice (‘at least half-full’).
§  Many real systems are even sloppier than this: they allow underflow, and only reclaim space when a page is completely empty.
§  (What are the benefits of such ‘sloppiness’?)

Conclusions

§  B+ tree is the prevailing indexing method
§  Excellent, O(log N) worst-case performance for ins/del/search (~3-4 disk accesses in practice)
§  Guaranteed 50% space utilization; avg 69%

Conclusions

§  Can be used for any type of index: primary/secondary, sparse (clustering), or dense (non-clustering)
§  Several fine extensions on the basic algorithm
– deferred split;
– bulk-loading

HASHING

(Static) Hashing

§  Problem: “find EMP record with ssn=123”
§  What if disk space was free, and time was at premium?

Hashing

§  A: Brilliant idea: key-to-address transformation:

[Figure: pages #0 ... #999,999,999; the record “123; Smith; Main str” is stored on page #123.]

Hashing

§  Since space is NOT free:
§  use M, instead of 999,999,999 slots
§  hash function: h(key) = slot-id

[Figure: pages #0 ... #999,999,999; the record “123; Smith; Main str” is stored on page #123.]

Hashing

§  Typically: each hash bucket is a page, holding many records:

[Figure: bucket pages #0 ... M; the record “123; Smith; Main str” is stored on page #h(123).]
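A minimal sketch of a static hash file with M page-sized buckets. Two assumptions are not on the slides: h(key) = key mod M, and collisions handled by chaining overflow pages behind the primary page. The class name is hypothetical.

```python
class StaticHashFile:
    def __init__(self, num_buckets, page_capacity):
        self.m = num_buckets
        self.page_capacity = page_capacity
        # Each bucket is a chain of pages; the first one is the primary page.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def h(self, key):
        # Assumed hash function: key-to-slot by modulo.
        return key % self.m

    def insert(self, key, record):
        chain = self.buckets[self.h(key)]
        if len(chain[-1]) >= self.page_capacity:
            chain.append([])  # last page is full: add an overflow page
        chain[-1].append((key, record))

    def search(self, key):
        # Reads the primary page, then any overflow pages, in order.
        for page in self.buckets[self.h(key)]:
            for k, record in page:
                if k == key:
                    return record
        return None


emp = StaticHashFile(num_buckets=10, page_capacity=2)
emp.insert(123, "Smith; Main str")
print(emp.search(123))
```

A search costs one page read as long as the bucket has no overflow pages; long chains are what a fixed M cannot prevent as the file grows.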

Hashing

§  Notice: could have clustering, or non-clustering versions:

[Figure: clustering version: bucket page #h(123) holds the record “123; Smith; Main str.” itself.]

Hashing

§  Notice: could have clustering, or non-clustering versions:

[Figure: non-clustering version: bucket page #h(123) holds an entry “123 ...” that points into the EMP file, whose records include “123; Smith; Main str.”, “234; Johnson; Forbes ave” and “345; Tompson; Fifth ave”.]

Design decisions

§  1) formula h() for hashing function
§  2) size of hash table M
§  3) collision resolution method

Problem with static hashing

§  problem: overflow?
§  problem: underflow? (underutilization)

Solution: Dynamic/extendible hashing

§  idea: shrink/expand hash table on demand..
§  ..dynamic hashing
§  Details: how to grow gracefully, on overflow?
§  Many solutions - One of them: ‘extendible hashing’ [Fagin et al]

Extendible hashing

[Figure: bucket pages #0 ... M; the record “123; Smith; Main str.” is stored on page #h(123).]

solution: split the bucket in two

Extendible hashing

in detail:
§  keep a directory, with ptrs to hash-buckets
§  Q: how to divide contents of bucket in two?
§  A: hash each key into a very long bit string; keep only as many bits as needed

Eventually:

Extendible hashing

[Figure: directory with entries 00..., 01..., 10..., 11... pointing to buckets of hashed keys: 0001..., 0111... | 10011..., 10101..., 10110... | 1101...; a new key 101001... arrives.]

split on 3-rd bit
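Splitting on the 3-rd bit can be sketched as below, assuming hashed keys are strings of ‘0’/‘1’ and that the overflowing bucket is the one reached through directory entry 10... (a reading of the figure; the function name is hypothetical).

```python
def split_on_bit(hashed_keys, bit_position):
    """Divide a bucket by the bit at `bit_position` (1 = leftmost bit)."""
    i = bit_position - 1
    stay = [h for h in hashed_keys if h[i] == "0"]   # remain in the old bucket
    move = [h for h in hashed_keys if h[i] == "1"]   # go to the new bucket
    return stay, move


# The bucket for prefix 10..., plus the newly arriving key 101001.
bucket_10 = ["10011", "10101", "10110", "101001"]
old_bucket, new_bucket = split_on_bit(bucket_10, bit_position=3)
print(old_bucket)   # keys with prefix 100
print(new_bucket)   # keys with prefix 101
```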

Extendible hashing

[Figure: the same directory (00..., 01..., 10..., 11...) after the split: buckets 0001..., 0111... | 10011... | 1101..., plus a new page/bucket holding 101001..., 10101..., 10110...]

Extendible hashing

[Figure: directory (doubled), with entries 000..., 001..., 010..., 011..., 100..., 101..., 110..., 111..., over buckets 0001..., 0111... | 10011... | 1101... and the new page/bucket holding 101001..., 10101..., 10110...]

Extendible hashing

[Figure: BEFORE and AFTER, side by side. BEFORE: directory 00..., 01..., 10..., 11... over buckets 0001..., 0111... | 10011..., 10101..., 10110... | 1101..., with 101001... arriving. AFTER: directory 000..., 001..., 010..., 011..., 100..., 101..., 110..., 111... over buckets 0001..., 0111... | 10011... | 101001..., 10101..., 10110... | 1101...]

Extendible hashing

§  Summary: directory doubles on demand
§  or halves, on shrinking files
§  needs ‘local’ and ‘global’ depth
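Putting the pieces together, here is a minimal sketch of extendible hashing with a directory indexed by the leading bits of the hash, as in the figures. Assumptions not taken from the slides: the "hash" is just the key written as a fixed-width bit string, buckets hold 3 keys, and shrinking (halving the directory) is left out. All names are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth  # number of leading bits this bucket uses
        self.keys = []


class ExtendibleHash:
    def __init__(self, capacity, hash_bits=6):
        self.capacity = capacity
        self.hash_bits = hash_bits
        self.global_depth = 0           # number of leading bits the directory uses
        self.directory = [Bucket(0)]

    def _bits(self, key):
        # Assumed hash: the key itself as a fixed-width bit string.
        return format(key % (1 << self.hash_bits), f"0{self.hash_bits}b")

    def _slot(self, key):
        if self.global_depth == 0:
            return 0
        return int(self._bits(key)[:self.global_depth], 2)

    def search(self, key):
        return key in self.directory[self._slot(key)].keys

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        if bucket.local_depth == self.hash_bits:
            raise OverflowError("no hash bits left to split on")
        if bucket.local_depth == self.global_depth:
            # Directory doubles: slot i becomes slots 2i and 2i+1.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.global_depth += 1
        # Split the bucket on its next bit.
        bucket.local_depth += 1
        new_bucket = Bucket(bucket.local_depth)
        old_keys, bucket.keys = bucket.keys, []
        shift = self.global_depth - bucket.local_depth
        for i, b in enumerate(self.directory):
            if b is bucket and (i >> shift) & 1:
                self.directory[i] = new_bucket
        for k in old_keys:
            self.directory[self._slot(k)].keys.append(k)
        self.insert(key)  # retry; may split again if all keys landed together


# Keys chosen so their 6-bit strings match the figure's prefixes:
# 000100, 011100, 100110, 101010, 101100, 110100, then 101001.
table = ExtendibleHash(capacity=3)
for key in (4, 28, 38, 42, 44, 52, 41):
    table.insert(key)
```

Only the overflowing bucket is split; buckets with a smaller local depth stay shared by several directory entries, which is why both depths have to be tracked.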