1 disk storage, basic file structures, and hashing

1

Disk Storage, Basic File Structures, and

Hashing

2

Introduction

In a computerized database, the data is stored on computer storage medium, which includes:

Primary Storage can be processed directly by the CPU e.g., the main memory, cache fast, expensive, but of limited capacity

Secondary Storage cannot be processed directly by the CPU magnetic disks, optical disks, tapes slow, cost less, but have a large capacity.

3

Storage Hierarchy

Magnetic Tape

Optical Disk

Magnetic Disk

Flash Memory

Cache

$$

unit priceMemory

Volatile

Non-volatile speed

Secondary storage

Tertiary storage

Primary storage

4

Storage of Databases

For the following reasons, most databases are stored permanently on secondary storage:

They are too large to fit entirely in main memory

They must persist over long period of times, but the main memory is a volatile storage

Secondary storage costs less

5

Secondary Storage

Magnetic-disk: cannot be directly processed by the CPU; it must be brought to the main memory first.

Data is stored on spinning disk, and read/written magnetically

Primary medium for the long-term storage of data; typically stores entire database.

Non-volatile

slow access to data

large storage capacity (on the order of gigabytes)

6

Disk Storage Devices

Preferred secondary storage device for high storage capacity and low cost.

Data stored as magnetized areas on magnetic disk surfaces.

A disk pack contains several magnetic disks connected to a rotating spindle.

Disks are divided into concentric circular tracks on each disk surface. Track capacities vary typically from 4 to 50

Kbytes or more

7

Disk Storage Devices (contd.)

A track is divided into smaller blocks or sectors because it usually contains a large amount of

information The division of a track into sectors is hard-coded on the

disk surface and cannot be changed. One type of sector organization calls a portion of a

track that subtends (faces) a fixed angle at the center as a sector.

A track is divided into blocks. The block size B is fixed for each system.

Typical block sizes range from B=512 bytes to B=4096 bytes. Whole blocks are transferred between disk and main

memory for processing.

8

Disk Storage Devices (contd.)

A read-write head moves to the track that contains the block to be transferred. Disk rotation moves the block under the read-write

head for reading or writing. A physical disk block (hardware) address consists of:

a cylinder number (imaginary collection of tracks of same radius from all recorded surfaces)

the track number or surface number (within the cylinder)

and block number (within track). Reading or writing a disk block is time consuming because

of the seek time s and rotational delay (latency) rd. Double buffering can be used to speed up the transfer of

contiguous disk blocks.

9

Physical Characteristics of Disks

10

Components of a Disk

The platters spin (say, 90rps).

The arm assembly is moved in or out to position a head on a desired track.

Read-write head Positioned very close to the platter surface

(almost touching it) Reads or writes magnetically encoded

information. Only one head reads/writes at any one time.

Surface of platter divided into circular tracks

11


Track an information storage circle on the surface of a

disk. Over 16,000 tracks per platter each track can store between 4KB and 50KB of

data. Each track is divided into sectors. Tracks under heads make a cylinder (imaginary!)

Cylinder the tracks with the same diameter on all

surfaces of a disk pack. Cylinder i consists of i-th track of all the

platters

12


Sector a part of a track with fixed size separated by fixed-size interblock gaps Typical sectors per track

200 (on inner tracks) to 400 (on outer tracks)

13

Sectors

15

Disk I/O Model of Computation

Disk I/O is equivalent to one read or write operation of a single block

It is very expensive compared with what is likely to be done once the block gets in main memory one random disk I/O ~ about 1,000,000 machine

instructions in terms of time

Cost for computation that requires secondary storage is computed only by disk I/Os.

16

Pages and Blocks

Data files decomposed into pages (blocks) fixed size piece of contiguous information in

the file sizes range from 512 bytes to several kilobytes

block is the smallest unit for transferring data between the main memory and the disk.

Address of a page (block): (cylinder#, track# (within cylinder), sector#

(within track)

17

Pages and Blocks

Track

SectorGap

One track ...1 2 3 4

1 page/block = 4 Sectors

18

Page I/O

Page I/O --- one page I/O is the cost (or time needed) to transfer one page of data between the memory and the disk.

The cost of a (random) page I/O = seek time + rotational delay + block transfer time

Seek time time needed to position read/write head on correct

track. Rotational delay (latency)

time needed to rotate the beginning of page under read/write head.

Block transfer time time needed to transfer data in the page/block.

19

Page I/O

Average rotational delay (rd) rd = ½ * (1/p) min = (60*1000)/(2*p) msec OR rd = ½ * cost of 1 revolution = ½ * (60*1000/p) msec where

p is speed of disk rotation (how many revolutions per minute - rpm)

Example Speed of disk rotatioon is p = 3600 rpm 60 revolutions/sec 1 rev. = 16.66 msec. (1 second = 1000 msec) rd = 8.33 ms

20

Page I/O

Transfer rate (tr) tr = track size / cost of one revolution = track size / (60*1000/p) in msec

Bulk transfer rate (btr) btr = (B/(B+G)) * tr bytes/msec Where B is the block size in bytes G is interblock gap size in bytes

Block transfer time (btt) btt = B / tr not taking into acount G btt = B / btr taking into acount G

21

Page I/O

Example: Track size = 50 KB and p = 3600 rpm Block size B = 3KB = 3000 bytes

tr = (50*1000)/(60*1000/3600) = 3000 bytes/msec

btt = B / tr = 3000/3000 = 1 msec

22

Page I/O

Average time for reading/writing n consecutive pages that are in the same track or cylinder = s + rd + n * btt

Average time for reading/writing consecutively n noncontigues pages/blocks that are in the same cylinder = s + n * (rd + btt)

23

An Example

A hard disk specifications: 4 platters, 8 Surfaces, 3.5 Inch diameter 213 = 8192 tracks/surface 28 = 256 sectors/track 29 = 512 bytes/sector Average seek time s = 25 ms Rotation rate rd = 3600 rpm = 60 rps 1 rev. = 16.66 msec Transfer rate tr = 1 KB in 0.117 ms tr = 1 KB in 0.130 ms with gap

24

An Example

What is the total capacity of this disk 8 GB (8*213*28*29=233)

How many bytes does one track hold? 256 sectors/track*512 bytes/sector = 128KB

How many blocks per track? one block = 4096 bytes = 8 sectors (4096/512) 256/8 = 32 blocks/track

25

An Example

How long does it take to access one block?

One block = 4096 bytes 8 sectors = 4096/512

Rotation rate r 1 rev. = 16.66 msec.

Time to access 1 sector (s + r/2 + tr/(secters/KB) 25 + (16.66/2) + .117/2 = 33.3885 ms.

time to access 1 block time to access the first sector of the block +

time to access the subsequent 7 sectors.

26

An Example

T = 25 + (16.66/2) + (0.117/2) * 1 + (0.13/2) *7 = 33.3885 + 0.455 ms = 33.8435ms

Compare to one sector access time: 33.3885 ms

...1 2 3 8

1 block = 8 Sectors

27

Buffering

A buffer is a contiguous reserved area in main memory

available for storage of copies of disk blocks. to speed up the processes.

For a read command the block from disk is copied into the buffer.

For a write command the contents of the buffer are copied into the

disk.

28

Accessing Data Through RAM Buffer

Buffer

RAM

Application

Page frames

Block transfer

blockRecordtransfer

29

Buffer Manager

Programs call on the buffer manager when they need a block from disk. If the block is already in the buffer,

the requesting program is given the address of the block in main memory

If the block is not in the buffer, the buffer manager allocates space in the buffer for

the block, replacing (throwing out) some other block, if required, to make space for the new block.

The block that is thrown out is written back to disk only if it was modified since the most recent time that it was written to/fetched from the disk.

30

Buffer Manager

Once space is allocated in the buffer, the buffer manager reads the block from the disk to the buffer, and passes the address of the block in main memory to requester.

Buffer Replacement Policy: Frame is chosen for replacement by a

replacement policy: Least-recently-used (LRU), MRU, FIFO, etc.

Policy can have big impact on # of I/O’s; depends on the access pattern.

31

File Organization

The database is stored as a collection of files.

Each file is a sequence of records.

A record is a sequence of fields.

Records are stored on disk blocks.

A file can have fixed-length records or variable-length records.

32

File Organization

Fixed length records Each record is of fixed length. Pad with spaces.

Variable length records different records in the file have different sizes. Arise in database systems in several ways:

different record types in a file. same record type with (variable-length fields,

repeating field, or optional fields)

33

File Organization

34

Fixed-Length Records

Insertion: Store record i starting from byte

n (i – 1), where n is the size of each record.

Deletion of record i: Packed format:

move records i + 1, . . ., n to i, . . . , n – 1

ORmove record n to i

Unpacked format (do not move records, but) link all free records on a free listORUse bitmap vector

35

Free Lists

Store the address of the first deleted record in the file header.

Use this first record to store the address of the second deleted record, and so on.

36

Page Formats: Fixed Length Records

Record id = <page id, slot #>.

Slot 1Slot 2

Slot N

. . . . . .

N M10. . .

M ... 3 2 1PACKED UNPACKED, BITMAP

Slot 1Slot 2

Slot N

FreeSpace

Slot M

11

number of records

numberof slots

37

Variable-Length Records Represenation

Byte-String representation Attach an end-of-record () control character to

the end of each record Difficulty with deletion and growth

Slotted-page header contains: number of record entries location and size of each record end of free space in the block

38

Slotted Page Structure

Records can be moved around within a page to keep them contiguous with no empty space between them entry in the header must be updated. Pointers should not point directly to record -

instead they should point to the entry for the record in header.

39

Fixed-Length Representation

Reserved Space can use fixed-length records of a known

maximum length unused space in shorter records filled with a

null or end-of-record symbol.

40


List Representation by Pointers A variable-length record is represented by a list

of fixed-length records, chained together via pointers.

Can be used even if the maximum record length is not known

41


Disadvantage: space is wasted in all records except the first in a a chain. Solution is to allow two kinds of block in file:

Anchor block: contains the first records of chain Overflow block: contains records other than those

that are the first records of chairs.

42

Blocking Factor

Blocking Factor (bfr) - the number of records that can fit into a single block. bfr = ⌊B/R ⌋

B : Block size in bytes R: Record size in bytes

Example: Record size R = 100 bytes Block Size B = 2,000 bytes Thus the blocking factor bfr = 2000/100 = 20

The number of blocks b needed to store a file of r records: b = r/bfr blocks

43

Spanned & Unspanned Records

A block is the unit of data transfer between disk and memory.

Unspanned records: A record is found in one and only one block.

records do not span across block boundaries. Used with fixed-length records having B R

Spanned records: Records are allowed to span across block

boundaries. Used with variable-length records having R B

In variable-length records, either organization can be used.

44

Placing File Records on Disk

A file header or file descriptor contains information about a file (e.g., the disk address, record format descriptions, etc.)

45

The physical disk blocks that are allocated to hold the records of a file can be contiguous, linked, or indexed.

In contiguous allocation, the file blocks are allocated to consecutive disk blocks.

In linked allocation, each file block contains a pointer to the next file block.

In indexed allocation, one or more index blocks contain pointers to the actual file blocks.

Allocating File Blocks on Disk

46

Organization of Records in Files

Heap File Organization a record can be placed anywhere in the file where there is

space, or at the end for full file scans or frequent updates Data unordered (unsorted)

Sorted/Ordered File Organization store records sorted in order, based on the value of the

search key of each record Need external sort or an index to keep sorted

Hashing File Organization a hash function computed on some attribute of each record the result specifies in which block of the file the record

should be placed

47

Heap File Organization

Records are placed in the file in the order in which they are inserted. Such an organization is called a heap file. Insertion is at the end

takes constant time O(1) (very efficient) Searching

requires a linear search (expensive) Deleting

requires a search, then delete

Select, Update and Delete take b/2 time (linear time) in average b is the number of blocks

48


For a file of unordered fixed-length records using unspanned blocks and contiguous allocation, it is straightforward to access any record by its position in the file. If the records are numbered 0,1,2, …, r-1 and The records in each block are numbered 0,1,2,

…, f-1, where f is the blocking factor The the i-th record of the file is located in

Block i/f and in the (i mod f)-th record in that block

49


A Heap file allows us to retrieve records: by specifying the rid, or by scanning all records sequentially

Accessing a record by its position does not help locate a record based on a search condition.

50

File Stored as a Heap File

666666 MGT123 F1994 4.0123456 CS305 S1996 4.0 page 0987654 CS305 F1995 2.0

717171 CS315 S1997 4.0666666 EE101 S1998 3.0 page 1765432 MAT123 S1996 2.0515151 EE101 F1995 3.0

234567 CS305 S1999 4.0 page 2

878787 MGT123 S1996 3.0

51

Sequential File Organization

Suitable for applications that require sequential processing of the entire file

The records in the file are ordered by a search-key

52

Files of Ordered Records

Some blocks of an ordered (sequential) file of EMPLOYEE records with NAME as the ordering key field.

53

File Stored as a Sorted File

111111 MGT123 F1994 4.0111111 CS305 S1996 4.0 page 0123456 CS305 F1995 2.0

123456 CS315 S1997 4.0123456 EE101 S1998 3.0 page 1232323 MAT123 S1996 2.0234567 EE101 F1995 3.0

234567 CS305 S1999 4.0 page 2

313131 MGT123 S1996 3.0

54


Insertion is expensive records must be inserted in the correct order

locate the position where the record is to be inserted if there is free space insert there if no free space insert the record in an overflow block In either case, pointer chain must be updated

Insert takes lg(b) plus the time to re-organize records. b is the number of blocks

Deletion use pointer chains

Searching very efficient (Binary search) This requires lg(b) on the average

55


56

A hash function maps the hash field of a record into the address of the storage media in which the record is stored.

Hashing provides very fast access to records, where the search condition is an equality condition on the hash field.

For internal files, hashing is implemented as a hash table. The mapping that assigns each element of the data a cell of the hash table is called a hash function.

Hashing Techniques

57

Two records that yield the same hash value are said to collide.

A good hash function must be easy to compute and generate a low number of collisions.

The process of finding another position (for colliding data) is called collision resolution.

There are several methods for collision resolution, including open addressing, chaining, and multiple hashing.

Hashing Techniques

58

Open addressing: Proceeding from the occupied position specified by

the hash function, check the subsequent positions in order until an unused position is found.

Chaining: Associate an overflow area (or a linked list) to any

cell (hashing address) and then simply store the data in this medium.

Multiple hashing: Apply a second hash function if the first results in a

collision. If another collision results, use open addressing, or

apply a third hash function, and then use open addressing.

Hashing Techniques

59

Hashing Techniques

60

Hashing Techniques

Hashing for disk files is called external hashing.

The target address space in external hashing is made of buckets (which holds a disk block or a cluster of contiguous blocks).

The collision problem is less severe, because as many records as will fit in a bucket can hash to the same bucket without causing collision problem.

A table maintained in the file header converts the bucket number into the corresponding disk block address.

61

Hashing Techniques

Matching bucket numbers to disk block addresses.

62

Hashing Techniques

To reduce overflow records, a hash file is typically kept 70-80% full.

The hash function h should distribute the records uniformly among the buckets Otherwise, search time will be increased

because many overflow records will exist.

63

Hashed Files - Overflow handling

64

Hashing Techniques

The hashing scheme is called static hashing if a fixed number of buckets is allocated.

Main disadvantage of static external hashing: The number of buckets must be chosen large

enough that can handle large files. That is, it is difficult to expand or shrink the file dynamically.

Three solutions to the above problem are: Dynamic hashing, Extendible hashing Linear hashing

1 disk storage, basic file structures, and hashing

Documents

disk storage devices

storage of databases

spinning disk

disk rotation

disk pack

secondary storage device

longterm storage of

magnetic disk surfaces