Database design and implementation CMPSCI 645 Lecture 08: Storage and Indexing

Post on 03-Oct-2020


Source: avid.cs.umass.edu/courses/645/s2017/lectures/08-StorageIndexing.pdf

Where is the data and how to get to it?

[Figure: the database (DB).]

DBMS architecture

[Figure: DBMS architecture. Components shown: Query Parser, Query Rewriter, Query Optimizer, Query Executor, Access Methods, Buffer Manager, Disk Space Manager, Lock Manager, Log Manager, and the database (DB).]

Memory hierarchy

• Main memory: random access, fast, volatile
• Magnetic disk: random access, relatively slow, non-volatile
• Tape: sequential scan, non-volatile, long-term archiving

Disks and DBMS design

Databases are stored on disks.

[Figure: data moves between the DB on disk and RAM by read and write, both expensive operations.]

Why not store everything in memory?

• Volatility
• Cost

Basics of disks

[Figure: disk anatomy, labeling the platters, spindle, disk head, arm assembly, and arm movement.]

• Platters spin under the head.
• Only one head reads and writes at any one time.
• Retrieval time varies: seek time + rotational delay + transfer time.
• Platters have tracks; the tracks at one arm position form an (imaginary) cylinder.
• Each track has sectors. Blocks (pages) are multiples of sectors.

Accessing a disk page

• Time to access (read/write) a disk block:
  1. seek time (moving arms to position a disk head on a track)
  2. rotational delay (waiting for a block to rotate under the head)
  3. transfer time (actually moving data to/from the disk surface)
• Seek time and rotational delay dominate.
• Placement of pages on disk has a major impact on DBMS performance.
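The three components of access time can be added up for a concrete drive. The sketch below is only illustrative: the drive parameters (9 ms average seek, 7200 RPM, 100 MB/s transfer rate, 8 KB blocks) are assumptions chosen for the example, not figures from the lecture, and the function name is hypothetical.

```python
def access_time_ms(seek_ms, rpm, transfer_mb_per_s, block_kb):
    """Estimate the time to read or write one disk block, in milliseconds."""
    # Average rotational delay: on average the block is half a revolution away.
    rotation_ms = 0.5 * 60_000 / rpm
    # Transfer time: block size divided by the transfer rate.
    transfer_ms = (block_kb / 1024) / transfer_mb_per_s * 1000
    total_ms = seek_ms + rotation_ms + transfer_ms
    return total_ms, rotation_ms, transfer_ms


# Assumed drive parameters (not from the slides).
total, rotation, transfer = access_time_ms(
    seek_ms=9.0, rpm=7200, transfer_mb_per_s=100.0, block_kb=8
)
print(f"rotational delay {rotation:.2f} ms, transfer {transfer:.3f} ms, total {total:.2f} ms")
```

Under these assumed numbers the transfer itself is well under a tenth of a millisecond, so almost all of the cost is seek time plus rotational delay, which is why page placement matters.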

Arranging pages on disk

• Sequential page storage:
  • blocks on the same track, followed by
  • blocks on the same cylinder, followed by
  • blocks on an adjacent cylinder
• Pages in a file should be arranged sequentially on disk, to minimize seek and rotational delay.
  • Scan of the file is a sequential scan.

Files of records

• Fields are organized in a record.
• A collection of records is organized in a page.
• A collection of pages makes a file.

Unordered (Heap) Files

• Simplest file structure; contains records in no particular order.
• As the file grows and shrinks, disk pages are allocated and de-allocated.
• To support record-level operations, we must:
  • keep track of the pages in a file
  • keep track of free space on pages
  • keep track of the records on a page

Heap File Using a Page Directory

[Figure: a header page leads to a DIRECTORY whose entries point to Data Page 1, Data Page 2, ..., Data Page N.]

• The entry for a page can include the number of free bytes on the page.
• The directory is a collection of pages; a linked list implementation is just one alternative.
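A page directory with per-page free-byte counts might be sketched as follows. This is a minimal in-memory illustration, not the lecture's implementation: the 4096-byte page size, the class names, and the first-fit placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class DirectoryEntry:
    page_id: int
    free_bytes: int


@dataclass
class HeapFile:
    directory: list = field(default_factory=list)  # one entry per data page
    pages: dict = field(default_factory=dict)      # page_id -> list of records

    def _allocate_page(self):
        entry = DirectoryEntry(page_id=len(self.directory), free_bytes=PAGE_SIZE)
        self.directory.append(entry)
        self.pages[entry.page_id] = []
        return entry

    def insert(self, record: bytes):
        """Place the record on the first page with room; return (page_id, slot)."""
        if len(record) > PAGE_SIZE:
            raise ValueError("record larger than a page")
        # Only the directory is consulted to find free space, not the data pages.
        entry = next((e for e in self.directory if e.free_bytes >= len(record)), None)
        if entry is None:
            entry = self._allocate_page()
        self.pages[entry.page_id].append(record)
        entry.free_bytes -= len(record)
        return entry.page_id, len(self.pages[entry.page_id]) - 1
```

The point of the free-byte count is that an insert can pick a page by reading directory entries alone, without fetching each data page to check for space.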

Page format

• How to store records on a page
• Consider a page as a collection of slots, one for each record
• A record is identified by rid = <page id, slot #>
• Record ids (rids) are used in indexes

Page formats: fixed length records

[Figure: two page layouts for fixed-length records.
 PACKED: Slot 1, Slot 2, ..., Slot N hold the records contiguously, followed by free space; the page stores N, the number of records.
 UNPACKED, BITMAP: Slot 1, Slot 2, ..., Slot M, with a bitmap (bits M ... 3 2 1) marking which slots are in use; the page stores M, the number of slots.]

Moving records for free space management changes rid; may not be acceptable.
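The unpacked, bitmap layout avoids the rid problem because a delete only clears a bit. The sketch below illustrates that under stated assumptions: the class and method names are hypothetical, slots are numbered from 0 rather than 1, and records are kept as Python objects instead of fixed-size byte ranges.

```python
class BitmapPage:
    """Unpacked page of fixed-length records with an occupancy bitmap."""

    def __init__(self, page_id, num_slots):
        self.page_id = page_id
        self.slots = [None] * num_slots       # M fixed-size slots
        self.occupied = [False] * num_slots   # the bitmap, one bit per slot

    def insert(self, record):
        """Store the record in the first free slot and return its rid."""
        for slot, used in enumerate(self.occupied):
            if not used:
                self.slots[slot] = record
                self.occupied[slot] = True
                return (self.page_id, slot)
        raise RuntimeError("page is full")

    def delete(self, rid):
        """Clear the slot's bit; no other record moves, so no rid changes."""
        _, slot = rid
        self.slots[slot] = None
        self.occupied[slot] = False

    def get(self, rid):
        _, slot = rid
        return self.slots[slot] if self.occupied[slot] else None
```

In a packed page, by contrast, deleting a record would require moving another record into the hole to keep the records contiguous, which changes that record's rid.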

Page formats: variable length records

[Figure: page i holds records with Rid = (i,1), Rid = (i,2), ..., Rid = (i,N). A SLOT DIRECTORY has one entry per slot (N ... 2 1; the example entries shown are 20, 16, 24), together with the number of slots N and a pointer to the start of free space.]

Can move records on the page without changing rid; so, attractive for fixed-length records too.
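A slot directory lets records move within the page while their rids stay fixed, because the rid names a slot and only the slot's stored offset changes. The following is a simplified sketch: the names are hypothetical, the 64-byte page size is an assumption, and the space taken by the slot directory itself is not counted against the page.

```python
class SlottedPage:
    """Page of variable-length records addressed through a slot directory."""

    def __init__(self, page_id, size=64):
        self.page_id = page_id
        self.data = bytearray(size)
        self.free_start = 0   # pointer to the start of free space
        self.slots = []       # slot directory: [offset, length] per slot

    def insert(self, record: bytes):
        if self.free_start + len(record) > len(self.data):
            raise RuntimeError("page is full")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        self.slots.append([offset, len(record)])
        return (self.page_id, len(self.slots) - 1)

    def get(self, rid):
        offset, length = self.slots[rid[1]]
        if offset < 0:
            return None  # slot was deleted
        return bytes(self.data[offset:offset + length])

    def delete(self, rid):
        # Keep the slot entry so the slot numbers of other records stay valid.
        self.slots[rid[1]] = [-1, 0]

    def compact(self):
        """Slide live records together; slot offsets change, rids do not."""
        new_data = bytearray(len(self.data))
        pos = 0
        for slot in self.slots:
            offset, length = slot
            if offset < 0:
                continue
            new_data[pos:pos + length] = self.data[offset:offset + length]
            slot[0] = pos
            pos += length
        self.data = new_data
        self.free_start = pos
```

After a delete followed by `compact()`, a surviving record sits at a new offset but is still fetched with the same rid.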

Record formats: fixed length

[Figure: a record at base address B with fields F1, F2, F3, F4 of lengths L1, L2, L3, L4; the address of F3 is B + L1 + L2.]

• Number of fields and their types are stored in system catalogs.
• Finding the i-th field does not require a scan of the record.
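The address computation Address = B + L1 + L2 generalizes to any field: add the lengths of the fields before it to the base address. The sketch below assumes a hypothetical four-field record with lengths 4, 20, 8, and 2 bytes (an int, a 20-byte string, a double, and a short); these types are illustrative, not from the lecture.

```python
import struct
from itertools import accumulate

# Assumed field lengths L1..L4 in bytes, as a system catalog might record them.
LENGTHS = [4, 20, 8, 2]


def field_offsets(lengths):
    """Offset of each field from the start of the record."""
    return [0] + list(accumulate(lengths))[:-1]


def field_address(base, lengths, i):
    """Address of field i (1-based, as in F1..F4): base plus earlier lengths."""
    return base + sum(lengths[:i - 1])


# Pack one record with no padding ("<"), then read F3 directly at its offset.
record = struct.pack("<i20sdh", 7, b"sensor-a", 21.5, 3)
f3_value = struct.unpack_from("<d", record, field_address(0, LENGTHS, 3))[0]
```

Because the lengths are the same for every record in the file, the offsets can be computed once from the catalog and reused for every record.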

Record formats: variable length

[Figure: two alternative layouts for fields F1 F2 F3 F4.
 1. Fields delimited by special symbols: each field is followed by a delimiter ($); finding a field requires a scan.
 2. Array of field offsets: a header S1 S2 S3 S4 E4 precedes the fields.]

The 2nd choice (array of field offsets) offers direct access to the i-th field with small directory overhead.
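The difference between the two layouts can be seen in how a single field is fetched. This sketch is an illustration only: the two-byte little-endian offsets, the function names, and the sample field values are assumptions, not details from the slides.

```python
import struct


def encode_with_offsets(fields):
    """Header of len(fields) + 1 two-byte offsets, followed by the field bytes."""
    header_size = 2 * (len(fields) + 1)
    offsets, pos = [], header_size
    for f in fields:
        offsets.append(pos)   # start of this field
        pos += len(f)
    offsets.append(pos)       # end of the last field
    return struct.pack(f"<{len(offsets)}H", *offsets) + b"".join(fields)


def get_field(record, i):
    """Fetch field i (0-based) by reading two header offsets; no scan needed."""
    start, end = struct.unpack_from("<2H", record, 2 * i)
    return record[start:end]


def get_field_delimited(record, i, delim=b"$"):
    """Fetch field i (0-based) from a delimited record by scanning it."""
    return record.split(delim)[i]


fields = [b"42", b"Amherst", b"", b"21.5"]
offset_record = encode_with_offsets(fields)
delimited_record = b"$".join(fields)
```

The offset array costs a few bytes per record, but it also handles empty fields and field values that happen to contain the delimiter character, which the delimited layout cannot do without escaping.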

Question

• Consider the following query:

    SELECT S1.temp, S2.pressure
    FROM   TempSensor S1, PressureSensor S2
    WHERE  S1.location = S2.location
      AND  S1.time = S2.time

• How can the DBMS execute this query given
  • 1 GB of memory
  • 100 GB TempSensor and 10 GB PressureSensor

Buffer manager

[Figure: page requests from higher-level code (files and access methods) go to the buffer pool manager, which sits above the disk space manager. The buffer pool in main memory is made of frames; each frame holds a disk page or is free, and each carries a pin count and a dirty flag. Arrows are labeled READ/WRITE and INPUT/OUTPUT. 1 page corresponds to 1 disk block; disk = collection of blocks. The choice of frame is dictated by the replacement policy.]

• Data must be in RAM for the DBMS to operate on it!
• Buffer pool = table of <frame#, pageid> pairs

When a page is requested...

• If the requested page is not in the pool (and the buffer is full):
  • Choose a frame for replacement
  • If the frame is dirty, write it to disk
  • Read the requested page into the chosen frame
• Pin the page and return its address.

If requests can be predicted (e.g., sequential scans), pages can be pre-fetched several pages at a time!
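The request-handling steps above can be sketched as a small buffer pool. This is a simplified illustration under several assumptions: the "disk" is a plain dictionary from page id to bytes, the class and method names are hypothetical, and the replacement policy is LRU over unpinned frames (the lecture leaves the policy open).

```python
from collections import OrderedDict


class Frame:
    def __init__(self, data):
        self.data = data
        self.pin_count = 0
        self.dirty = False


class BufferPool:
    def __init__(self, disk, num_frames):
        self.disk = disk              # assumed: dict of page_id -> bytes
        self.num_frames = num_frames
        self.frames = OrderedDict()   # page_id -> Frame, least recently used first
        self.reads = 0
        self.writes = 0

    def pin(self, page_id):
        """Return the frame holding page_id, reading it from disk if needed."""
        frame = self.frames.get(page_id)
        if frame is None:
            if len(self.frames) >= self.num_frames:
                self._evict()
            frame = Frame(bytearray(self.disk[page_id]))
            self.reads += 1
            self.frames[page_id] = frame
        self.frames.move_to_end(page_id)   # mark as most recently used
        frame.pin_count += 1
        return frame

    def unpin(self, page_id, dirty=False):
        """Release one pin; the caller reports whether it modified the page."""
        frame = self.frames[page_id]
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty

    def _evict(self):
        # Choose the least recently used frame that nobody has pinned.
        for victim_id, frame in self.frames.items():
            if frame.pin_count == 0:
                break
        else:
            raise RuntimeError("all frames are pinned")
        if frame.dirty:
            self.disk[victim_id] = bytes(frame.data)   # write back before reuse
            self.writes += 1
        del self.frames[victim_id]
```

A caller pins a page, works on `frame.data`, and then unpins it with `dirty=True` if it changed anything; a pinned frame is never chosen for replacement.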

Buffer replacement policy

• Frame is chosen for replacement by a replacement policy:
  • Least-recently-used (LRU), Clock, MRU, etc.
• Policy can have a big impact on # of I/Os; depends on the access pattern.
• Sequential flooding: nasty situation caused by LRU + repeated sequential scans.
  • # buffer frames < # pages in file means each page request causes an I/O. MRU is much better in this situation (but not in all situations, of course).
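Sequential flooding is easy to reproduce in a small simulation. The sketch below counts page I/Os for repeated scans of a file that is one page larger than the pool; the function name and the sizes (4 frames, 5 pages, 3 scans) are assumptions for the example, and every page is treated as unpinned as soon as it has been read.

```python
def count_ios(policy, num_frames, num_pages, num_scans):
    """Count page reads for repeated sequential scans under LRU or MRU."""
    pool = []   # page ids in the pool, least recently used first
    ios = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.remove(page)          # hit: will be re-appended as most recent
            else:
                ios += 1                   # miss: the page must be read from disk
                if len(pool) == num_frames:
                    # LRU evicts the oldest entry, MRU the most recently used one.
                    pool.pop(0 if policy == "LRU" else -1)
            pool.append(page)
    return ios


lru_ios = count_ios("LRU", num_frames=4, num_pages=5, num_scans=3)
mru_ios = count_ios("MRU", num_frames=4, num_pages=5, num_scans=3)
print(f"LRU: {lru_ios} I/Os, MRU: {mru_ios} I/Os")   # LRU: 15 I/Os, MRU: 7 I/Os
```

With LRU, each page is evicted just before the scan comes back to it, so all 15 requests miss. MRU keeps most of the file resident and misses only once per scan after the first pass.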

DBMS vs. OS file system

OS does disk space & buffer mgmt: why not let it manage these tasks?

• Reason 1: Correctness
  • DBMS needs fine-grained control for transactions
  • Needs to force pages to disk for recovery purposes
• Reason 2: Performance
  • DBMS may be able to anticipate access patterns
  • Hence, may also be able to perform prefetching
  • May select a better page replacement policy

Database file types

The data file can be one of:
• Heap file
  • Set of records, partitioned into blocks
  • Unsorted
• Sequential file
  • Sorted according to some attribute(s) called the (sort) key (different from "key"!)

Index

• A (possibly separate) file that allows fast access to records in the data file given a search key (again different from "key"!)
• The index contains (key, value) pairs:
  • The key = an attribute value
  • The value = either a pointer to the record, or the record itself

High-level overview: Indexes

Data file = index file: clustered (primary) index

id   age  salary  other
006  19   50k     ...
005  20   55k     ...
004  25   50k     ...
007  30   80k     ...
002  35   75k     ...
003  35   70k     ...
001  40   65k     ...

Index file: unclustered (secondary) index

id   age  salary  other
006  19   50k     ...
004  25   50k     ...
005  20   55k     ...
001  40   65k     ...
003  35   70k     ...
002  35   75k     ...
007  30   80k     ...
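The two tables can be reproduced from one data file, which makes the clustered/unclustered distinction concrete. The sketch below reads the first table as the data file sorted on age and the second as an index ordered on salary; that reading, and the helper names, are assumptions made for the example.

```python
# (id, age, salary in thousands) in data-file order: the file is sorted on age.
data_file = [
    ("006", 19, 50), ("005", 20, 55), ("004", 25, 50), ("007", 30, 80),
    ("002", 35, 75), ("003", 35, 70), ("001", 40, 65),
]
AGE, SALARY = 1, 2

# Each index lists data-file positions in search-key order.
age_index = sorted(range(len(data_file)), key=lambda pos: data_file[pos][AGE])
salary_index = sorted(range(len(data_file)), key=lambda pos: data_file[pos][SALARY])


def positions_in_range(index, column, low, high):
    """Data-file positions visited, in index order, for low <= key <= high."""
    return [pos for pos in index if low <= data_file[pos][column] <= high]


def is_contiguous(positions):
    return sorted(positions) == list(range(min(positions), max(positions) + 1))


age_hits = positions_in_range(age_index, AGE, 25, 35)
salary_hits = positions_in_range(salary_index, SALARY, 50, 65)
```

A range query on age touches neighboring positions in the data file, while a range query on salary jumps around it, which is why an unclustered range scan can cost close to one I/O per matching record.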

Index classification

• Clustered/unclustered
  • Clustered = records close in index are close in data
  • Unclustered = records close in index may be far in data
• Primary/secondary
  • Primary = is over attributes that include the primary key
  • Secondary = otherwise
• Organization: B+ tree or hash table

Clustered/Unclustered

• Clustered
  • Index determines the location of indexed records
  • Typically, a clustered index is one where values are data records (but not necessary)
• Unclustered
  • Index cannot reorder data, does not determine data location
  • In these indexes: value = pointer to data record

Clustered index

• File is sorted on the index attribute
• Only one per table

[Figure: an Index File with entries 10, 20, 30, 40, 50, 60, 70, 80 next to a Data File whose records are stored in the same order: 10, 20, 30, 40, 50, 60, 70, 80.]

Unclustered index

• Several per table

[Figure: an Index File with entries 10, 10, 20, 20, 20, 30, 30, 30 next to a Data File whose records are stored in the order 20, 30, 30, 20, 10, 20, 10, 30.]

Clustered vs. unclustered index

[Figure: two B+ trees, CLUSTERED and UNCLUSTERED. In each, the tree leads to data entries (index file), which point to data records (data file).]

Alternatives for data entry k* in index

• In a data entry k*, we can store:
  • Alternative 1: <k, data record with search key value k>
  • Alternative 2: <k, rid of a record with search key value k>
  • Alternative 3: <k, list of rids of records with search key k>
• Choice of an alternative for data entries is orthogonal to the indexing technique used.
  • Indexing techniques: B+ tree, hashing, …
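The three alternatives differ only in what each data entry carries. The sketch below builds all three over the same tiny heap file; the rids, the records, and the choice of age as the search key are hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Assumed heap file: rid (page id, slot #) -> record (id, age, salary).
heap = {
    (1, 0): ("002", 35, "75k"),
    (1, 1): ("006", 19, "50k"),
    (2, 0): ("003", 35, "70k"),
}
AGE = 1

# Alternative 1: the data entry is the record itself, stored under its key.
alt1 = sorted((rec[AGE], rec) for rec in heap.values())

# Alternative 2: one <k, rid> entry per record.
alt2 = sorted((rec[AGE], rid) for rid, rec in heap.items())

# Alternative 3: one <k, list of rids> entry per distinct key value.
alt3 = defaultdict(list)
for k, rid in alt2:
    alt3[k].append(rid)
```

With duplicate keys (age 35 here), Alternative 2 repeats the key once per record while Alternative 3 stores it once, at the price of variable-length entries.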

Cost model

We ignore CPU costs, for simplicity:
• B: The number of data pages
• R: Number of records per page
• D: (Average) time to read or write a disk page
• Measuring number of page I/Os ignores gains of pre-fetching a sequence of pages; thus, even I/O cost is only approximated.
• Average-case analysis; based on several simplistic assumptions.

Comparing file organizations

• Heap files (random order)
• Sorted files, sorted on <age, sal>
• Clustered B+ tree file, Alternative (1), search key <age, sal>
• Heap file with unclustered B+ tree index on search key <age, sal>
• Heap file with unclustered hash index on search key <age, sal>

Operations to compare

• Scan: Fetch all records from disk
• Equality search
• Range selection
• Insert a record
• Delete a record

Assumptions

• Heap files:
  • Equality selection on key; exactly one match.
• Sorted files:
  • Files compacted after deletions.
• Indexes:
  • Alt (2), (3): data entry size = 10% size of record
  • Hash: No overflow buckets.
    • 80% page occupancy => file size = 1.25 data size
  • Tree: 67% occupancy (this is typical).
    • Implies file size = 1.5 data size

Assumptions (contd.)

• Scans:
  • Leaf levels of a tree index are chained.
  • Index data entries plus actual file scanned for unclustered indexes.
• Range searches:
  • We use tree indexes to restrict the set of data records fetched, but ignore hash indexes.

Cost of operations

File organization        Scan            Equality              Range
Heap file                BD              0.5 BD                BD
Sorted file              BD              D log_2 B             D (log_2 B + #match recs)
Clustered tree index     1.5 BD          D log_F 1.5B          D (log_F 1.5B + #pages with matched recs)
Unclustered tree index   BD (R + 0.15)   D (1 + log_F 0.15B)   D (log_F 0.15B + #pages with matched recs)
Unclustered hash index   BD (R + 0.125)  2D                    BD

Several assumptions underlie these (rough) estimates!
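Plugging numbers into the scan and equality columns shows how far apart the organizations are. This is a rough sketch: the values B = 1024 pages, R = 100 records per page, and D = 15 ms are assumptions for the example, and F is treated here as the tree fan-out with an assumed value of 100, since the table uses F without defining it.

```python
import math


def operation_costs(B, R, D, F):
    """Scan and equality-search costs (same time unit as D), per the table's formulas."""
    return {
        "heap file": {"scan": B * D, "equality": 0.5 * B * D},
        "sorted file": {"scan": B * D, "equality": D * math.log2(B)},
        "clustered tree index": {
            "scan": 1.5 * B * D,
            "equality": D * math.log(1.5 * B, F),
        },
        "unclustered tree index": {
            "scan": B * D * (R + 0.15),
            "equality": D * (1 + math.log(0.15 * B, F)),
        },
        "unclustered hash index": {"scan": B * D * (R + 0.125), "equality": 2 * D},
    }


# Assumed parameters: 1024 pages, 100 records/page, 15 ms per page I/O, fan-out 100.
costs = operation_costs(B=1024, R=100, D=0.015, F=100)
for name, ops in costs.items():
    print(f"{name:24s} scan {ops['scan']:10.2f} s   equality {ops['equality']:8.4f} s")
```

Under these assumptions an equality search drops from several seconds on a heap file to a small fraction of a second with any of the sorted or indexed organizations, while a full scan through an unclustered index is far more expensive than scanning the heap file directly.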