Post on 03-Jan-2016
Chapter 2 1
Chapter 2. Data Storage
Chapter 2 2
Outline
• Memory hierarchy
• Hardware: Disks
• Access Times
• Example - Megatron 747
• Optimizations
• Disk failure
• RAIDs
Chapter 2 3
Operating Systems
DBMS’s
Hardware - Data Storage
Users
Chapter 2 4
The Memory Hierarchy
[Figure: the memory hierarchy. Levels: tertiary storage, disk (as virtual memory; file system), main memory, cache. Users of the hierarchy: DBMS; programs and main-memory DBMS’s]
Chapter 2 5
Cache
– The cache is an integrated circuit or part of the processor’s chip
  • Holds data or machine instructions
  • Contents are copies of main-memory locations
– If data being expelled from the cache has been modified, the new value must be copied into main memory.
– Typical performance
  • Capacities up to a megabyte
  • Access time: 10 nanoseconds (10^-8 seconds)
  • Moving data between cache and main memory: 100 nanoseconds (10^-7 seconds)
Chapter 2 6
Main Memory
• Everything that happens in the computer is resident in main memory
• Capacity: around 100 Mbytes to 10 Gbytes
• Random access
– Typical access time is 10-100 nanoseconds
Chapter 2 7
Virtual Memory
• Is a part of the disk
• On a machine with 32-bit addresses
– Virtual memory grows up to 2^32 bytes (4 Gbytes)
• Data is moved between disk and main memory in entire blocks, which are also called pages in main memory
• Main-memory database systems
Chapter 2 8
Secondary Storage (1)
• Slower, more capacious than main memory
• Random access
• Magnetic, optical, magneto-optical disks
• Disk reads and writes move chunks of bytes called blocks (or pages)
[Figure: a file and a buffer]
Chapter 2 9
Secondary Storage (2)
• Accessing a block: 10-30 milliseconds
• Recently, one disk unit can store data
ranging from 10 to 32 Gbytes
• A machine can have several disk units
Chapter 2 10
Tertiary Storage (1)
• Have been developed to hold data
volumes measured in terabytes
• Compared with secondary storage, it
offers
– Higher read/write times
– Larger capacities and smaller cost per byte
• Not random access in general
Chapter 2 11
Tertiary Storage (2)
• Kinds of tertiary storage devices
– Ad-hoc tape storage
– Optical-disk juke boxes: CD-ROMs
– Tape silo: an automated version of ad-hoc tape storage
• Capacities
– CD: 2/3 Gbytes, 2.3 Gbytes
– Tapes: 50 Gbytes
• Access time: about 1000 times slower than secondary memory
Chapter 2 12
Volatile and Nonvolatile
• Volatile vs. nonvolatile storage
• Flash memory
– A form of main memory
– Nonvolatile
– Becoming economical
• RAM disk
– A battery-backed main memory
Chapter 2 13
Access Time vs. Capacity
[Figure: access time vs. capacity. X-axis: access time of 10^x seconds, x from 2 down to -9. Y-axis: capacity of 10^y bytes, y from 5 to 13. Plotted: floppy disk, zip disk, tertiary, secondary, main memory, cache.]
Chapter 2 14
Moore’s Law
• Gordon Moore observed that the following improve by a factor of 2 about every 18 months
– The speed of processors, i.e., the number of instructions executed per second, and the ratio of speed to cost of a processor
– The cost of main memory per bit and the number of bits that can be put on one chip
– The cost of disk per bit and the number of bytes that a disk can hold
• Not applicable to
– Main memory access time, disk access time
Chapter 2 15
Disks
Terms: Platter, Head, Actuator, Cylinder, Track, Sector, Block, Gap
[Figure: a typical disk]
Chapter 2 16
Disks: A Top View
• Cylinder, Track, Sector, Gap
• Gaps often represent about 10% of the total track
• An entire sector cannot be used if a portion of it is destroyed
• Typically a block consists of one or more sectors.
[Figure: top view of a disk surface]
Chapter 2 17
The Disk Controller
[Figure: the processor, main memory, and disk controller are attached to a bus; the disk controller connects to the disks]
• Controls one or more disk drives
– Controlling the mechanical actuator
– Selecting a surface or a sector on that surface
– Transferring bits via a data bus
Chapter 2 18
Disk Storage Characteristics (as of 1999)
• Rotation speed of the disk assembly
– 5400 RPM (one rotation every 11 milliseconds)
• Number of platters per unit
– Typical disk drive: 5 platters (10 surfaces)
– Floppy/zip disk: 1 platter (2 surfaces)
• Number of tracks per surface
– As many as 10,000 tracks
– 3.5 inch diskette: 40 tracks
• Number of bytes per track
– Common disk: 10^5 or more bytes
– 3.5 inch diskette: 150K
Chapter 2 19
Megatron 747 Disk (1)
• Characteristics
– 4 platters (8 surfaces)
– 8192 (2^13) tracks per surface
– On average 256 (2^8) sectors per track
– 512 (2^9) bytes per sector
– Diameters of tracks
  • outermost track is 3.5 inches
  • innermost track is 1.5 inches
– A track consists of two parts
  • gaps: 10%
  • data: 90%
Chapter 2 20
Megatron 747 Disk (2)
• The capacity of the disk
– 8 surfaces * 8192 tracks * 256 sectors * 512 bytes = 8 Gbytes
• A single track on average
– 256 sectors * 512 bytes = 128 Kbytes = 1 Mbit
• A cylinder holds 1 Mbyte on average
• If a block is 4096 bytes (2^12)
– A block uses 8 sectors (= 4096 bytes / 512 bytes)
– A track consists of 32 blocks (= 256 sectors / 8)
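The capacity arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and variable names are chosen here for illustration, and the values are the Megatron 747 parameters from the slides.

```python
# Megatron 747 parameters as stated on the slides.
SURFACES = 8
TRACKS_PER_SURFACE = 2**13   # 8192
SECTORS_PER_TRACK = 2**8     # 256 on average
BYTES_PER_SECTOR = 2**9      # 512
BLOCK_BYTES = 2**12          # 4096

track_bytes = SECTORS_PER_TRACK * BYTES_PER_SECTOR      # one track
cylinder_bytes = SURFACES * track_bytes                 # same track on all surfaces
disk_bytes = TRACKS_PER_SURFACE * cylinder_bytes        # whole disk
sectors_per_block = BLOCK_BYTES // BYTES_PER_SECTOR
blocks_per_track = SECTORS_PER_TRACK // sectors_per_block

print(disk_bytes // 2**30, "Gbytes per disk")
print(track_bytes // 2**10, "Kbytes per track")
print(cylinder_bytes // 2**20, "Mbyte per cylinder")
print(sectors_per_block, "sectors per block,", blocks_per_track, "blocks per track")
```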
Chapter 2 21
Megatron 747 Disk (3)
• If each track had the same number of sectors (i.e., 256), the density of bits would be much greater on the inner tracks than on the outer tracks
– Length of the outermost track (data portion): 0.9 * 3.5 * π ≒ 9.9 inches
  • 1 megabit / 9.9 ≒ 100,000 bits per inch
– Length of the innermost track (data portion): 0.9 * 1.5 * π ≒ 4.2 inches
  • 1 megabit / 4.2 ≒ 250,000 bits per inch
• So tracks in the Megatron 747 have different numbers of sectors
– outer: 320 sectors
– middle: 256 sectors
– inner: 192 sectors
• The outermost track
– 320 sectors * 512 bytes * 8 bits = 1,310,720 bits; 1,310,720 / 9.9 ≒ 132,000 bpi
• The innermost track
– 192 sectors * 512 bytes * 8 bits = 786,432 bits; 786,432 / 4.2 ≒ 187,000 bpi
Chapter 2 22
The Latency of The Disk
• Disk access time
– seek time
– rotational delay
– transfer time
– others
[Figure: disk access time is the delay between the request "I want block X" and block X being in memory]
Chapter 2 23
Seek Time
• The time to position the head assembly at the proper cylinder
– 0 (zero): the heads are already at the proper cylinder
– Otherwise: the heads must move to the proper cylinder
[Figure: seek time vs. cylinders traveled (1 to Max); the time is x for one cylinder and in the range of 3x or 20x for Max]
Chapter 2 24
Rotational Latency Time
• The time for the disk to rotate until the first of the sectors containing the block reaches the head
• One rotation takes 10 ms, so rotational latency is 5 ms on average.
[Figure: "Head here" and "Block I want" marked on a track]
Chapter 2 25
Transfer Time/Other delays
• Transfer time
– The time to read or write the data on the appropriate disk surface
– 10 Mbytes per second
• Other delays (neglected here)
– Time taken by the processor and disk controller
– Delays due to contention for the disk controller
– Other delays due to contention
Chapter 2 26
Modifying Blocks
• Not possible to modify a block on disk directly
• Sequence of procedures
– Read block (time: rt)
– Modify in memory (time: mt)
– Write block (time: wt)
– Verify (time: vt) if appropriate
• Total time
– rt + mt + wt + vt
Chapter 2 27
Example 2.3 (1)
• Let us examine the time to read a 4096-byte block from the Megatron 747 disk
• Characteristics
– 4 platters (8 surfaces), 1 surface = 8192 tracks
– 1 track = 256 sectors, 1 sector = 512 bytes
– Disk rotates at 3840 RPM, one rotation = 1/64 of a second
– To move the head assembly
  • 1 ms (to start and stop) + 1 ms for every 500 cylinders
– Heads move one track in 1.002 ms
– To move heads from the innermost to the outermost track
  • 1 + (8192 / 500) = 17.4 ms
Chapter 2 28
Example 2.3 (2)
• Minimum time (the best case)
– No seek time, no rotational latency, only transfer time
– Note: 1 track = 256 sectors, 1 sector = 512 bytes
– 4096 bytes / 512 bytes = 8 sectors (with the 7 gaps between them)
– Gaps/sectors occupy 10%/90% of a track, i.e., 36 and 324 degrees
– A track has 256 gaps and 256 sectors
– 36 * 7/256 + 324 * 8/256 = 11.109 degrees
– (11.109/360)/64 = 4.8e-4 seconds = 0.5 ms
Chapter 2 29
Example 2.3 (3)
• Maximum time (the worst case)
– Full seek time and rotational latency, plus transfer time
– Full seek time: 17.4 ms
– Full rotational time: 1/64 of a second = 15.6 ms
– Transfer time: 0.5 ms
– 17.4 + 15.6 + 0.5 = 33.5 ms
Chapter 2 30
Example 2.3 (4)
• Average time
– Transfer time: 0.5 ms
– Average rotational time: half of the full rotation = 7.8 ms
– Average seek time
  • average distance traveled = 1/3 of the disk = 2730 cylinders
  • 1 + 2730/500 = 6.5 ms
– 0.5 + 7.8 + 6.5 = 14.8 ms
[Figure: average travel distance (0 to 4096 cylinders) as a function of starting track (0 to 8192)]
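The best, worst, and average cases of Example 2.3 can be recomputed as follows. This is a sketch under the slide's own assumptions; the function and variable names are made up for illustration, and the results are rounded to one decimal as on the slides.

```python
ROTATION_MS = 1000 / 64      # 3840 RPM = 64 rotations per second
CYLINDERS = 8192
SECTORS_PER_TRACK = 256

def seek_ms(cylinders):
    """1 ms to start and stop, plus 1 ms per 500 cylinders traveled."""
    return 0.0 if cylinders == 0 else 1 + cylinders / 500

# A 4096-byte block covers 8 sectors and the 7 gaps between them.
# Gaps take 36 degrees of the circle in total, sectors take 324 degrees.
degrees = 36 * 7 / SECTORS_PER_TRACK + 324 * 8 / SECTORS_PER_TRACK
transfer_ms = degrees / 360 * ROTATION_MS

best_ms = transfer_ms
worst_ms = seek_ms(CYLINDERS) + ROTATION_MS + transfer_ms
average_ms = seek_ms(CYLINDERS / 3) + ROTATION_MS / 2 + transfer_ms

print(round(best_ms, 1), round(worst_ms, 1), round(average_ms, 1))
```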
Chapter 2 31
RAM model vs. I/O model computation
• I/O model computation
– Dominance of I/O cost
• Remember, 10^5 to 10^6 in-memory operations take the same time as one disk I/O
• Should minimize the number of block accesses
• Data Structure vs. File Processing
Chapter 2 32
Using Secondary Storage Effectively
• In databases in general
– Whole databases are much too large to fit in main memory
– Key parts of databases are buffered in main memory
– Disk I/O’s occur frequently
• Main-memory sorts (such as Quicksort) are inadequate
Chapter 2 33
Merge Sort
Step    List 1        List 2        Output
start   1, 3, 4, 9    2, 5, 7, 8    none
1)      3, 4, 9       2, 5, 7, 8    1
2)      3, 4, 9       5, 7, 8       1, 2
3)      4, 9          5, 7, 8       1, 2, 3
4)      9             5, 7, 8       1, 2, 3, 4
5)      9             7, 8          1, 2, 3, 4, 5
6)      9             8             1, 2, 3, 4, 5, 7
7)      9             none          1, 2, 3, 4, 5, 7, 8
8)      none          none          1, 2, 3, 4, 5, 7, 8, 9
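The trace above corresponds to the standard two-list merge. A minimal Python sketch (the function name `merge` is chosen here for illustration) is:

```python
def merge(list1, list2):
    """Merge two sorted lists into one sorted list."""
    output = []
    i = j = 0
    # Repeatedly move the smaller of the two front elements to the output.
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            output.append(list1[i])
            i += 1
        else:
            output.append(list2[j])
            j += 1
    # One list is exhausted; copy the remainder of the other.
    output.extend(list1[i:])
    output.extend(list2[j:])
    return output

print(merge([1, 3, 4, 9], [2, 5, 7, 8]))
```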
Chapter 2 34
Two-Phase, Multiway Merge-Sort (1)
• Phase 1
– Sort main-memory-sized pieces of the data
• Fill all available main memory with blocks
• Sort the records in main memory
• Write the sorted records
Chapter 2 35
Two-Phase, Multiway Merge-Sort (2)
• Phase 2
– Merge all the sorted sublists into a single sorted list
  • Find the smallest key among the first remaining elements of all the lists
  • Move the smallest element to the first available position of the output block
  • If the output block is full, write it to disk and reinitialize the same buffer
  • Repeat until all input blocks are exhausted.
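The two phases can be sketched in memory as below. This is an illustration only: lists stand in for disk blocks, `memory_records` is a hypothetical parameter for how many records fit in main memory, and a heap is used to find the smallest first element, which is a substitution for the linear scan described on the slide.

```python
import heapq

def two_phase_sort(records, memory_records):
    # Phase 1: cut the input into memory-sized pieces and sort each piece.
    sublists = [sorted(records[i:i + memory_records])
                for i in range(0, len(records), memory_records)]

    # Phase 2: merge all sorted sublists.
    # The heap holds (key, sublist index, position) for the first
    # unchosen record of every sublist.
    heap = [(s[0], k, 0) for k, s in enumerate(sublists)]
    heapq.heapify(heap)
    output = []
    while heap:
        key, k, pos = heapq.heappop(heap)   # smallest unchosen record
        output.append(key)
        if pos + 1 < len(sublists[k]):      # advance within that sublist
            heapq.heappush(heap, (sublists[k][pos + 1], k, pos + 1))
    return output

print(two_phase_sort([9, 3, 7, 1, 8, 2, 5, 4, 6, 0], memory_records=3))
```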
Chapter 2 36
Main-memory Organization
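[Figure: main-memory organization for the merge. Input buffers, one for each sorted list, each with a pointer to its first unchosen record; the smallest unchosen record is selected for output and placed in the output buffer]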
Chapter 2 37
Merge Sort Example (1)
• Assumptions
– 10,000,000 tuples, 1 tuple = 100 bytes
– So, 1 Gbyte of data
– 50 Mbytes of memory available
– 4096-byte blocks, so each block contains 40 records
– Total # of blocks: 250,000
– # of blocks in main memory: 12,800 (= 50 * 2^20 / 2^12)
– Number of sublists
  • 19 sublists (12,800 blocks each) + 1 sublist (6,800 blocks)
– Each block read or write: 15 ms
Chapter 2 38
Merge Sort Example (2)
• Computation
– First phase
  • Read each of the 250,000 blocks once
  • Write 250,000 new blocks
  • Total time: (250,000 * 15 ms) * 2 = 7500 seconds = 125 minutes
– Second phase
  • Similar to the first phase
  • Total time: 125 minutes
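The figures in this example follow directly from the stated assumptions. The sketch below recomputes them; the variable names are illustrative, not taken from the source.

```python
import math

TUPLES = 10_000_000
TUPLE_BYTES = 100
BLOCK_BYTES = 4096
MEMORY_BYTES = 50 * 2**20    # 50 Mbytes
IO_MS = 15                   # one block read or write

records_per_block = BLOCK_BYTES // TUPLE_BYTES            # whole records only
total_blocks = math.ceil(TUPLES / records_per_block)
memory_blocks = MEMORY_BYTES // BLOCK_BYTES
sublists = math.ceil(total_blocks / memory_blocks)
last_sublist_blocks = total_blocks - (sublists - 1) * memory_blocks

# Each phase reads every block once and writes every block once.
phase_minutes = total_blocks * 2 * IO_MS / 1000 / 60

print(total_blocks, memory_blocks, sublists, last_sublist_blocks, phase_minutes)
```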
Chapter 2 39
Improving the Access Time of Secondary Storage
• Place blocks on the same cylinder
• Divide the data among several small
disks
• Mirroring disks
• Use a disk-scheduling algorithm
• Prefetch blocks to main memory in
anticipation of their later use
Chapter 2 40
Organizing Data by Cylinders
• Use several adjacent cylinders
• Read all the blocks on a single track or
on a cylinder consecutively
• Neglect all but the first seek time and
the first rotational latency
Chapter 2 41
Example 2.9 (1)
• Recall Examples 2.3 and 2.7
• The original data may be stored on consecutive cylinders
• Total # of cylinders: 1000 (= 1 Gbyte / 1 Mbyte)
• Main memory can hold 50 cylinders (i.e., 50 Mbytes)
• To read 50 cylinders of data into main memory
– 6.5 ms for the average seek time
– 49 ms for 49 one-cylinder seeks (1 ms each)
– 6.4 seconds for the transfer of 12,800 blocks
  • (12,800 * 0.5 ms) / 1000 = 6.4 seconds
– So, 6.5 + 49 + 6,400 = 6455.5 ms
Chapter 2 42
Example 2.9 (2)
• First phase
– Read
  • (6.5 ms + 49 ms + 6.4 seconds) * 20 times = 2.15 minutes
– Write: the same as reading
– Total time: 4.3 minutes
• Second phase
– Still takes about 125 minutes (WHY?)
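The first-phase timing with cylinder-organized data can be recomputed as follows. This is a sketch using the rounded per-operation times from the slides; the names are illustrative.

```python
AVG_SEEK_MS = 6.5          # one average seek to reach the first cylinder
ONE_CYL_SEEK_MS = 1.0      # seek to the adjacent cylinder
TRANSFER_MS = 0.5          # per 4096-byte block
CYLINDERS_PER_FILL = 50    # main memory holds 50 cylinders
BLOCKS_PER_FILL = 12_800
FILLS = 20                 # number of sublists

fill_ms = (AVG_SEEK_MS
           + (CYLINDERS_PER_FILL - 1) * ONE_CYL_SEEK_MS
           + BLOCKS_PER_FILL * TRANSFER_MS)
read_minutes = fill_ms * FILLS / 1000 / 60
phase1_minutes = 2 * read_minutes   # writing costs the same as reading

print(fill_ms, round(read_minutes, 2), round(phase1_minutes, 1))
```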
Chapter 2 43
Using Multiple Disks in place of One
• Use several disks with their independent
heads
• Transfer data at a higher rate
• Roughly speaking, total time could be
divided by the number of disks
Chapter 2 44
Example 2.10 (1)
• Replace one 747 by four 737’s, each of which has one platter and two surfaces
• Assumptions
– Divide the given records among the four disks
– The data occupies 1000 adjacent cylinders on each disk
– Each disk fills 1/4 of main memory
– Recall previous examples
  • Average seek time and rotational latency: 0
  • Number of blocks in a full memory: 12,800
– 1/4 of memory: 3,200 blocks
Chapter 2 45
Example 2.10 (2)
• Computation
– First phase
  • Transfer time: 3200 * 0.5 ms = 1.6 seconds
  • Read: (6.5 ms + 49 ms + 1.6 seconds) * 20 = 33 seconds
  • Write: similar to reading
  • Total time: about 1 minute
Chapter 2 46
Example 2.10 (3)
• Second phase
– Apply careful techniques (?) to reduce disk I/O time
  • Start comparisons among the 20 lists as soon as the first element of each block appears in main memory
  • Use four output buffers
  • …
– Total time: about 1 hour (?)
Chapter 2 47
Mirroring Disks
• Two or more disks hold identical copies of data
• Data survives a head crash on either disk
• If we make n copies of a disk, we can read any n blocks in parallel.
• Using mirror disks does not speed up writing, but neither does it slow writing down to any great extent
Chapter 2 48
Scheduling Requests by the Elevator Algorithm
• The disk controller chooses which of several requests to execute first, to increase throughput
• Elevator Algorithm
– Proceed in the same direction until the next cylinder with blocks to access is encountered
– When there are no requests ahead in the direction of travel, reverse direction
Chapter 2 49
Example 2.11
Arrival times for six block-access requests

Cylinder of request   First time available
1000                  0
3000                  0
7000                  0
2000                  20
8000                  30
5000                  40

Finishing times for block accesses using the elevator algorithm

Cylinder of request   Time completed
1000                  8.3
3000                  21.6
7000                  38.9
8000                  50.2
5000                  65.5
2000                  80.8

Finishing times for block accesses using the first-come-first-served algorithm

Cylinder of request   Time completed
1000                  8.3
3000                  21.6
7000                  38.9
2000                  58.2
8000                  79.5
5000                  94.8
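Both finishing-time tables can be reproduced with a small simulation. The sketch below makes several assumptions that are not stated on this slide: the heads start at cylinder 1000 moving toward higher cylinders, a seek costs 1 ms plus 1 ms per 500 cylinders (as in Example 2.3), and every access adds 8.3 ms for average rotational latency plus transfer. The function names are illustrative.

```python
SERVICE_MS = 8.3   # assumed: 7.8 ms average rotational latency + 0.5 ms transfer

def seek_ms(distance):
    return 0.0 if distance == 0 else 1 + distance / 500

# (cylinder, time the request becomes available in ms)
REQUESTS = [(1000, 0), (3000, 0), (7000, 0), (2000, 20), (8000, 30), (5000, 40)]

def simulate(requests, policy, head=1000):
    pending = list(requests)
    clock, direction, finished = 0.0, +1, []
    while pending:
        ready = [r for r in pending if r[1] <= clock]
        if not ready:                        # idle until the next arrival
            clock = min(r[1] for r in pending)
            continue
        if policy == "fcfs":
            choice = min(ready, key=lambda r: r[1])
        else:                                # elevator
            ahead = [r for r in ready if (r[0] - head) * direction >= 0]
            if not ahead:                    # nothing ahead: reverse direction
                direction = -direction
                ahead = ready
            choice = min(ahead, key=lambda r: abs(r[0] - head))
        clock += seek_ms(abs(choice[0] - head)) + SERVICE_MS
        head = choice[0]
        pending.remove(choice)
        finished.append((choice[0], round(clock, 1)))
    return finished

print(simulate(REQUESTS, "elevator"))
print(simulate(REQUESTS, "fcfs"))
```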
Chapter 2 50
Prefetching Data on Track- or Cylinder-sized Chunks
• Can we predict the order in which blocks will be requested from disk ?
• For example,
– Devote two block buffers to each list being merged (when there is plenty of memory)
– When a buffer is exhausted, switch to the other buffer for the same list
Chapter 2 51
Single Buffering
• Single buffering
1) Read B1 into the buffer
2) Process the data in the buffer
3) Read B2 into the buffer
4) Process the data in the buffer
...
• Computation
– P = time to process one block
– R = time to read in one block
– n = # of blocks
– Single buffer time = n(P+R)
Chapter 2 52
Single Buffering vs. Double Buffering
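[Figure: single buffering vs. double buffering. The disk holds blocks A B C D E F G; memory holds two buffers. While A is processed, B is read; when A is done, C is read into A's buffer while B is processed]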
Chapter 2 53
Double Buffering
• Computation
– P = processing time per block
– R = I/O time per block
– n = # of blocks
– Double buffering time: R + nP (assuming P ≥ R, so each read finishes while the previous block is being processed)
– Single buffering time: n(R+P)
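The two formulas can be compared numerically. In the sketch below, `double_buffer_time` uses a slightly more general expression than the slide, R + (n-1)·max(P, R) + P, which is an assumption added here; it reduces to R + nP whenever P ≥ R. The function names are illustrative.

```python
def single_buffer_time(n, P, R):
    """Read and process strictly alternate."""
    return n * (P + R)

def double_buffer_time(n, P, R):
    """Read block i+1 while block i is processed.

    First read (R), then n-1 overlapped steps each limited by the slower
    of reading and processing, then the final block is processed (P).
    """
    return R + (n - 1) * max(P, R) + P

n, P, R = 10, 3, 2   # example values with P >= R
print(single_buffer_time(n, P, R), double_buffer_time(n, P, R))
```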
Chapter 2 54
Prefetching
• Combine prefetching with the cylinder-based strategy
– Store the sorted sublists on whole, consecutive cylinders
– Read whole tracks or cylinders whenever we need some records from a given list
Chapter 2 55
Example 2.14 (1)
• Consider the second phase of the sort
• Have in main memory two track-sized buffers per sorted sublist
– A track: 128 KB
– Total space requirement: 128 KB * 20 lists * 2 = 5 Mbytes
– Read all the blocks on 1000 cylinders (8000 tracks)
– Computation
  • average seek time: 6.5 ms
  • the time for the disk to rotate once: 15.6 ms
  • total time (for reading): (6.5 + 15.6) * 8000 = 2.95 minutes
Chapter 2 56
Example 2.14 (2)
• Have in main memory two cylinder-sized buffers per sorted sublist
– 1 cylinder = 8 tracks = 128K * 8 = 1M
– Use 40 buffers of a megabyte each
– 50 megabytes of main memory are available
– Need to do a seek only once per cylinder
– Read all the blocks on 1000 cylinders (8000 tracks)
– Total time (for reading)
  • (6.5 + 8 * 15.6) * 1000 cylinders = 2.19 minutes
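The two reading times in Example 2.14 can be recomputed side by side. This is a sketch using the rounded times from the slides; the variable names are illustrative.

```python
AVG_SEEK_MS = 6.5
ROTATION_MS = 15.6
TRACKS = 8000
CYLINDERS = 1000
TRACKS_PER_CYLINDER = 8

# Track-sized buffers: one seek and one full rotation per track.
track_minutes = (AVG_SEEK_MS + ROTATION_MS) * TRACKS / 1000 / 60

# Cylinder-sized buffers: one seek per cylinder, then 8 rotations.
cylinder_minutes = (AVG_SEEK_MS + TRACKS_PER_CYLINDER * ROTATION_MS) * CYLINDERS / 1000 / 60

print(round(track_minutes, 2), round(cylinder_minutes, 2))
```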
Chapter 2 57
Block Size Selection
• Big blocks amortize I/O cost
• Big blocks read in more useless data and take longer to read
• As memory prices drop, blocks get bigger…
Chapter 2 58
Disk Failures
• Intermittent failure
– An attempt to read or write a sector is unsuccessful, but with repeated tries we are able to read or write successfully.
• Media decay
– A bit or bits are permanently corrupted, and the sector becomes unreadable.
• Write failure
– We can neither write successfully nor retrieve the previously written sector.
• Disk crash
– The disk becomes permanently unreadable.
Chapter 2 59
Checksums (1)
• Each sector has additional bits, called the checksum, used to check reading or writing operations
• A read returns a pair (w, s)
– w: the data that is read
– s: a status bit
• A simple form of checksum: parity
Chapter 2 60
Checksums (2)
• Example 1 (even parity)
– The sequence of bits in a sector: 01101000
– The parity bit is 1
– Data becomes 011010001
• Example 2 (even parity)
– The sequence of bits in a sector: 11101110
– The parity bit is 0
– Data becomes 111011100
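Even parity is easy to express in code. The sketch below works on bit strings as written on the slide; the function names are illustrative.

```python
def even_parity_bit(bits: str) -> str:
    """Return the bit that makes the total number of 1's even."""
    return str(bits.count("1") % 2)

def with_parity(bits: str) -> str:
    return bits + even_parity_bit(bits)

def status_good(stored: str) -> bool:
    """A stored sector (data plus parity bit) passes if its 1's count is even."""
    return stored.count("1") % 2 == 0

print(with_parity("01101000"))     # Example 1
print(with_parity("11101110"))     # Example 2
print(status_good("011010001"))    # intact sector
print(status_good("111010001"))    # one bit flipped: detected
print(status_good("101010001"))    # two bits flipped: not detected
```

The last line illustrates why a single parity bit can miss errors that corrupt more than one bit.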
Chapter 2 61
Checksums (3)
• We may fail to detect an error if more than one bit of the sector is corrupted
• If we use n independent bits as a checksum, then the chance of missing an error is only 1/2^n (WHY?)
Chapter 2 62
Stable Storage (1)
• How to correct errors ?
• Stable storage is a technique for organizing a disk so that media decay or failed writes do not result in permanent loss.
– The general idea is that sectors are paired, and each pair represents one sector-contents X
– The two copies are called the left copy (XL) and the right copy (XR)
Chapter 2 63
Stable Storage (2)
• Writing policy
– Write the value of X into XL
  • if the status is good, the write is done
  • if the status is bad, repeat the write
  • if the write still fails after a number of tries, assume a media failure in the sector
– Repeat the above scheme for XR
• Reading policy (to obtain the value of X)
– Read XL
  • if status bad is returned, repeat the read
  • if status good is returned, take that value as X
– If we cannot read XL, repeat the above with XR
Chapter 2 64
Recovery from Disk Crashes
• Disk crash is fatal in mission-critical applications
• RAID (redundant arrays of independent disks)
– Here, we discuss levels 1, 4, 5, and 6
– These RAID schemes also handle the failures discussed previously
Chapter 2 65
The Failure Model of Disks
• Mean time to failure is the length of time by which 50% of a population of disks will have failed catastrophically.
– For modern disks, it is about 10 years
[Figure: fraction of disks surviving as a function of time]
Chapter 2 66
RAID Level 1
• To protect against data loss– Use mirroring disks
• The only way data can be lost is if there is a second disk crash while the first crash is being repaired.
Chapter 2 67
How often will a data loss occur?
• Assume
– The process of replacing the failed disk
  • takes 3 hours = 1/8 day = 1/2920 year
– A failure rate of 5% per year for each disk
• Probability that the mirror disk will fail during copying
– (1/20) * (1/2920) = 1/58,400
• Mean time to a failure involving data loss
– With a 5% failure rate per disk, one of the two disks will fail once in 10 years on average
  • 10 * 58,400 = 584,000 years
Chapter 2 68
RAID Level 4 (1)
• Use one redundant disk no matter how many data disks there are
• In the redundant disk, the ith block consists of parity checks for the ith blocks of all the data disks
• Use the modulo-2 sum: even parity

Data disks
disk 1: 11110000
disk 2: 10101010
disk 3: 00111000

Redundant disk
disk 4: 01100010
Chapter 2 69
The Algebra of Modulo-2 Sums
• The commutative law
– x ⊕ y = y ⊕ x
• The associative law
– x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
• The all-0 vector of the appropriate length, written 0, is the identity for ⊕
– x ⊕ 0 = 0 ⊕ x = x
• ⊕ is its own inverse
– x ⊕ x = 0
– If x ⊕ y = z, then y = x ⊕ z
Chapter 2 70
RAID Level 4: Reading (2)
• Read the data disks normally.
• We can also make use of the redundant disk when reading!
– Example
  • read disks 2, 3, and 4, and obtain the contents of disk 1 as their modulo-2 sum

disk 2: 10101010
disk 3: 00111000
disk 4: 01100010

disk 1: 11110000
Chapter 2 71
RAID Level 4: Writing (3)
• When a block is written, we need to change the redundant disk
• Naïve approach
– N-1 reads of the blocks not being rewritten
– One write of the new block
– One write of the new redundant block
– In total, N+1 disk I/O’s
• There is a better way to do that!
Chapter 2 72
Writing Example (4)
• When disk 2 changes from 10101010 to 11001100

Before the write
disk 1: 11110000
disk 2: 10101010
disk 3: 00111000
disk 4: 01100010

Modulo-2 sum of the old and new bits of disk 2: 01100110

After the write
disk 1: 11110000
disk 2: 11001100
disk 3: 00111000
disk 4: 00000100

The new disk 4 is the modulo-2 sum of the old redundant disk and the modulo-2 sum of disk 2’s old and new bits: 01100010 ⊕ 01100110 = 00000100
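The RAID level 4 operations on these example blocks can be sketched with Python's integer XOR operator, which computes the bitwise modulo-2 sum. The helper name `xor_all` and the variable names are illustrative.

```python
from functools import reduce

def xor_all(blocks):
    """Bitwise modulo-2 sum of a list of blocks (stored as integers)."""
    return reduce(lambda a, b: a ^ b, blocks)

data = [0b11110000, 0b10101010, 0b00111000]   # disks 1, 2, 3
parity = xor_all(data)                        # redundant disk 4

# Reading or recovery: disk 1 is the sum of disks 2, 3, and 4.
disk1 = xor_all([data[1], data[2], parity])

# Writing disk 2 with only 4 disk I/O's: read old disk 2 and old parity,
# write new disk 2 and new parity.
new_disk2 = 0b11001100
delta = data[1] ^ new_disk2      # sum of old and new bits of disk 2
new_parity = parity ^ delta

print(format(parity, "08b"), format(disk1, "08b"),
      format(delta, "08b"), format(new_parity, "08b"))
```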
Chapter 2 73
RAID Level 4: Failure Recovery (5)
• Recomputing any missing data is simple, and does not depend on which disk (data or redundant) has failed.
Chapter 2 74
RAID Level 5
• We could treat each disk as the
redundant disk for some of the blocks
– That is, do not have to treat one disk as the
redundant disk and the others as data disks
• When there are n+1 disks (disk 0 to disk n)
– If i mod (n+1) = j, then we treat the ith cylinder of disk j as redundant
Chapter 2 75
Example 2.21 (1)
• Which blocks are redundant on each disk, for 4 disks (n = 3)?
– Disk 0: redundant for blocks 4, 8, 12, …
– Disk 1: redundant for blocks 1, 5, 9, …
– Disk 2: redundant for blocks 2, 6, 10, …
– Disk 3: redundant for blocks 3, 7, 11, …
Chapter 2 76
Example 2.21 (2)
• The reading and writing load for each disk is the same
– If all blocks are equally likely to be written, each disk has a 1/4 chance of holding the block being written
– If it does not hold that block, it has a 1/3 chance of being the redundant disk for it
– So each of the four disks is involved in 1/2 of the writes
  • 1/4 + 3/4 * 1/3 = 1/2
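The RAID level 5 placement rule and the write-load arithmetic can be checked as follows. This is a sketch; `redundant_disk` is a name chosen here, and it simply applies the rule i mod (n+1) = j from the slides.

```python
from fractions import Fraction

def redundant_disk(i, num_disks):
    """Disk that holds the redundant copy for block (or cylinder) i."""
    return i % num_disks

NUM_DISKS = 4   # n = 3
for disk in range(NUM_DISKS):
    blocks = [i for i in range(1, 13) if redundant_disk(i, NUM_DISKS) == disk]
    print("disk", disk, "is redundant for blocks", blocks)

# A disk takes part in a write if it holds the written block (1/4),
# or if it does not (3/4) but is the redundant disk for it (1/3).
write_share = Fraction(1, 4) + Fraction(3, 4) * Fraction(1, 3)
print(write_share)
```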
Chapter 2 77
RAID Level 6 (1)
• Can handle any number of disk crashes, data or redundant, if enough redundant disks are used
• Here, we focus on a simple example in which two simultaneous crashes are correctable; the strategy is based on a simple error-correcting code, the Hamming code
• Consider a system with seven disks
– data disks: disks 1-4
– redundant disks: disks 5-7
Chapter 2 78
RAID Level 6 (2)
• The relationship between data and redundant disks
– Note
  • the columns are every possible column of three 0’s and 1’s, except for the all-0 column
  • the columns for the redundant disks have a single 1
  • the columns for the data disks each have at least two 1’s

              Data       Redundant
Disk number   1 2 3 4    5 6 7
              1 1 1 0    1 0 0
              1 1 0 1    0 1 0
              1 0 1 1    0 0 1
Chapter 2 79
RAID Level 6 (3)
• The disks with 1 in a row are treated as if they were the entire set of disks in a RAID level 4 scheme.
– The bits of disk 5
  • are the modulo-2 sum of the bits of disks 1, 2, and 3
– The bits of disk 6
  • are the modulo-2 sum of the bits of disks 1, 2, and 4
– The bits of disk 7
  • are the modulo-2 sum of the bits of disks 1, 3, and 4

              Data       Redundant
Disk number   1 2 3 4    5 6 7
              1 1 1 0    1 0 0
              1 1 0 1    0 1 0
              1 0 1 1    0 0 1
Chapter 2 80
RAID Level 6 – Read/Write
• Reading: Just read data from any data disk normally
• Writing– Need to recalculate several redundant disks
Chapter 2 81
A Writing Example (1)
• Writing
– Disk 2 is changed to 00001111
– Corresponding redundant disks
  • disks 5 and 6
– Using modulo-2 sums
  • take the sum of the old and new contents of disk 2
  • add that sum to disk 5
  • add that sum to disk 6

Disk   Contents
1      11110000
2      10101010
3      00111000
4      01000001
5      01100010
6      00011011
7      10001001
Chapter 2 82
A Writing Example (2)
Disk   Contents
1      11110000
2      00001111
3      00111000
4      01000001
5      11000111
6      10111110
7      10001001

10101010 (old disk 2)
00001111 (new disk 2)
10100101 (modulo-2 sum)

10100101 (modulo-2 sum)
01100010 (disk 5)
11000111 (new disk 5)

10100101 (modulo-2 sum)
00011011 (disk 6)
10111110 (new disk 6)
Chapter 2 83
RAID Level 6 – Failure Recovery
• Assume that disks a and b fail simultaneously
• Find a row r in which the columns of a and b are different
– For example, a has 0 in row r, b has 1 in row r
• Compute the correct b by taking the modulo-2 sum of corresponding bits from all the disks other than b that have 1 in row r.
• Then, compute the correct a
Chapter 2 84
A Recovery Example
• Disks 2 and 5 have failed
– Pick the second row
– Disk 2:
  • modulo-2 sum of disks 1, 4, and 6
  • 00001111
– Disk 5:
  • modulo-2 sum of disks 1, 2, and 3
  • 11000111

Disk   Contents
1      11110000
2      ????????
3      00111000
4      01000001
5      ????????
6      10111110
7      10001001
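The two-disk recovery procedure can be sketched in Python using the 3-row matrix from the RAID level 6 slides. This is an illustration under the slide's seven-disk layout; the names `ROWS`, `xor_row_except`, and `recover_two` are made up here, and failed disks are represented by `None`.

```python
# Rows of the Hamming-code matrix; column k-1 belongs to disk k.
ROWS = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def xor_row_except(disks, row, missing):
    """Modulo-2 sum of all disks with 1 in this row, other than `missing`."""
    value = 0
    for disk, used in enumerate(row, start=1):
        if used and disk != missing:
            value ^= disks[disk]
    return value

def recover_two(disks, a, b):
    """Return a copy of `disks` with failed disks a and b recomputed."""
    disks = dict(disks)
    # Find a row where the columns of a and b differ; the disk with the 1
    # can be rebuilt from that row without touching the other failed disk.
    row = next(r for r in ROWS if r[a - 1] != r[b - 1])
    first, second = (a, b) if row[a - 1] else (b, a)
    disks[first] = xor_row_except(disks, row, first)
    # With one disk restored, any row containing the other disk works.
    row = next(r for r in ROWS if r[second - 1])
    disks[second] = xor_row_except(disks, row, second)
    return disks

failed = {1: 0b11110000, 2: None, 3: 0b00111000, 4: 0b01000001,
          5: None, 6: 0b10111110, 7: 0b10001001}
fixed = recover_two(failed, 2, 5)
print(format(fixed[2], "08b"), format(fixed[5], "08b"))
```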