CSE521: Introduction to Computer Architecture
Mazin Yousif
I/O Subsystem
RAID (Redundant Array of Independent Disks)
MSY F02 2
RAID
• Improvements in microprocessor performance (~50%/year) far exceed those in disk access time (~10%/year), which depends on a mechanical system
• Improvements in magnetic media densities have also been slow (~20%/year)
• Solution: Disk Arrays: Uses Parallelism between Multiple Disks to Improve Aggregate I/O Performance
– Disk Arrays stripe data across multiple disks and access them in parallel
• Capacity Penalty to store redundant data
• Bandwidth Penalty to update it
• Positive Aspects of Disk Arrays:
– Higher data transfer rate on large data accesses
– Higher I/O rates on small data accesses
– Uniform load balancing across all the disks - no hot spots (Hopefully)
• Negative Aspects of Disk Arrays:
– Higher vulnerability to disk failures - need to employ redundancy in the form of error-correcting codes to tolerate failures
• Several Data Striping and Redundancy Schemes
• Sequential accesses generate the highest data transfer rates with minimal head positioning
• Random accesses generate high I/O rates with lots of head positioning
• Data is Striped for improved performance
– Distributes data over multiple disks to make them appear as a single fast, large disk
– Allows multiple I/Os to be serviced in parallel
• Multiple independent requests are serviced in parallel
• A single block request may be serviced in parallel by multiple disks
• Data is Redundant for improved reliability
– The large number of disks in an array lowers the reliability of the array
• Reliability of N disks = Reliability of 1 disk / N
• Example:
– 50,000 hours / 70 disks = ~700 hours
– Disk system MTTF drops from ~6 years to ~1 month
– Arrays without redundancy are too unreliable to be useful
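The arithmetic above can be sketched directly. This is a minimal model assuming independent disk failures, so the array MTTF simply divides by the number of disks:

```python
# Simple reliability model: with N independent disks and no redundancy,
# the array fails as soon as any one disk fails, so MTTF divides by N.
def array_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    return disk_mttf_hours / n_disks

print(round(array_mttf(50_000, 70)))  # ~714 hours, roughly one month
```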
• RAID 0 (Non-redundant)
– Stripes data but does not employ redundancy
– Lowest cost of any RAID
– Best Write performance - no redundant information
– Any single disk failure is catastrophic
– Used in environments where performance is more important than reliability.
RAID 0 data layout - each row is one stripe; each cell is one stripe unit:

Disk 1   Disk 2   Disk 3   Disk 4
D0       D1       D2       D3      <- Stripe
D4       D5       D6       D7
D8       D9       D10      D11
D12      D13      D14      D15
D16      D17      D18      D19
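The layout above implies a simple address mapping from logical block number to physical location. A minimal sketch (the function name is illustrative; stripe unit = one block is assumed):

```python
def raid0_map(block: int, n_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, stripe number), 0-based."""
    return block % n_disks, block // n_disks

# In the 4-disk layout above, D5 lands on the second disk, second stripe
print(raid0_map(5, 4))  # (1, 1)
```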
• RAID 1 (Mirrored)
– Uses twice as many disks as non-redundant arrays - 100% capacity overhead - two copies of data are maintained
– Data is simultaneously written to both copies
– Data is read from the copy with the shorter queuing, seek and rotation delays - best read performance
– When a disk fails, mirrored copy is still available
– Used in environments where availability and performance (I/O rate) are more important than storage efficiency.
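The read-selection policy above can be sketched with queue length as a simple stand-in for total positioning delay (a simplification; real controllers also model seek and rotation):

```python
def pick_mirror(queue_lengths: list[int]) -> int:
    """Choose the replica with the shortest request queue - a proxy for
    'shorter queuing, seek and rotation delays' in a mirrored pair."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)

print(pick_mirror([3, 1]))  # read from mirror 1, which has less work queued
```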
• RAID 2 (Memory-Style ECC)
– Uses a Hamming code - parity for distinct overlapping subsets of data
– # of redundant disks is proportional to the log of the total # of disks - better for a large # of disks - e.g., 4 data disks require 3 redundant disks
– If disk fails, other data in subset is used to regenerate lost data
– Multiple redundant disks are needed to identify faulty disk
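The redundant-disk count follows the standard single-error-correcting Hamming bound: the smallest k with 2^k >= N + k + 1. A quick sketch:

```python
def hamming_check_disks(n_data: int) -> int:
    """Smallest k such that 2**k >= n_data + k + 1, i.e. enough check
    disks for a single-error-correcting Hamming code over the array."""
    k = 0
    while 2 ** k < n_data + k + 1:
        k += 1
    return k

print(hamming_check_disks(4))  # 3, matching the slide's example
```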
• RAID 3 (Bit-Interleaved Parity)
– Data is bit-wise interleaved over the data disks
– Uses a single parity disk to tolerate disk failures - overhead is 1/N
– Logically a single high-capacity, high-transfer-rate disk
– Reads access data disks only; writes access both data and parity disks
– Used in environments that require high BW (scientific computing, image processing, etc.), not high I/O rates
(Figure: bit-interleaved data spread across the data disks, with the bit-wise parity stored on a dedicated parity disk)
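The parity disk holds the XOR across corresponding bits of the data disks, which is what makes single-disk reconstruction possible. A minimal sketch:

```python
def parity(bits):
    """XOR across one bit position of all data disks."""
    p = 0
    for b in bits:
        p ^= b
    return p

data = [1, 0, 1, 1]                  # one bit from each of four data disks
p = parity(data)                     # stored on the parity disk
# disk 2 fails: regenerate its bit from the survivors plus parity
rebuilt = parity([data[0], data[1], data[3], p])
print(rebuilt == data[2])  # True
```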
• RAID 4 (Block-Interleaved Parity)
– Similar to the bit-interleaved parity disk array, except data is block-interleaved (striping units)
– Read requests smaller than one striping unit access only one striping unit
– Write requests update the data block and the parity block
– Generating parity requires 4 I/O accesses (RMW)
– The parity disk is updated on every write - a bottleneck
• RAID 5 (Block-Interleaved Distributed Parity)
– Eliminates the parity disk bottleneck of RAID 4 - distributes parity among all the disks
– Data is distributed among all disks
– All disks participate in read requests - better performance than RAID 4
– Write requests update the data block and the parity block
– Generating parity requires 4 I/O accesses (RMW)
– Left-symmetric vs. right-symmetric parity placement - allows each disk to be traversed once before any disk is traversed twice
RAID 5 data layout - parity rotates across the disks, one row per stripe:

Disk 1   Disk 2   Disk 3   Disk 4   Disk 5
D0       D1       D2       D3       P       <- Stripe
D4       D5       D6       P        D7
D8       D9       P        D10      D11
D12      P        D13      D14      D15
P        D16      D17      D18      D19
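The rotation shown above can be generated programmatically. This sketch reproduces one rotation scheme (parity shifts one disk to the left each stripe, data filling the remaining slots in order); real controllers offer several left/right, symmetric/asymmetric variants:

```python
def raid5_layout(n_stripes: int, n_disks: int):
    """Build a stripe map: parity moves one disk to the left each stripe;
    data blocks fill the remaining slots in order."""
    rows, block = [], 0
    for s in range(n_stripes):
        pdisk = (n_disks - 1 - s) % n_disks   # parity disk for this stripe
        row = []
        for d in range(n_disks):
            if d == pdisk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows

for row in raid5_layout(5, 5):
    print(" ".join(row))
```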
RAID 5 small-write (read-modify-write) sequence for updating D0 to D0':

1. Read old data D0
2. Read old parity P
3. Write new data D0'
4. Write new parity P' (the old data and new data are XOR'd into the old parity)
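The parity update rule behind these four I/Os is a pure XOR identity: the new parity equals the old parity with the old data removed and the new data folded in. A minimal sketch:

```python
def raid5_small_write(old_data: int, old_parity: int, new_data: int) -> int:
    """RMW parity rule behind the 4 I/Os (2 reads + 2 writes):
    new parity = old parity XOR old data XOR new data."""
    return old_parity ^ old_data ^ new_data

stripe = [0x1A, 0x2B, 0x3C]                    # three data blocks (one byte each)
p = stripe[0] ^ stripe[1] ^ stripe[2]          # full-stripe parity
new_p = raid5_small_write(stripe[1], p, 0x5D)  # update the middle block
print(new_p == stripe[0] ^ 0x5D ^ stripe[2])   # True: matches recomputed parity
```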
• RAID 6 (P + Q Redundancy)
– Uses Reed-Solomon codes to protect against up to 2 disk failures
– Data is distributed among all disks
– Two sets of parity, P & Q
– Write requests update the data block and both parity blocks
– Generating parity requires 6 I/O accesses (RMW) - must update both P & Q
– Used in environments with stringent reliability requirements
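A minimal per-byte sketch of P + Q redundancy, in the style used by common RAID 6 implementations over GF(2^8). The generator g = 2 and polynomial 0x11D are assumptions for illustration; the slide's Reed-Solomon scheme may differ in detail:

```python
def gf_mul2(x: int) -> int:
    """Multiply by the generator g = 2 in GF(2^8) (polynomial 0x11D assumed)."""
    x <<= 1
    return (x ^ 0x11D) & 0xFF if x & 0x100 else x

def pq(blocks):
    """P = XOR of the data; Q = sum of g**i * D[i] over GF(2^8),
    evaluated per byte with Horner's rule."""
    p = q = 0
    for d in reversed(blocks):
        p ^= d
        q = gf_mul2(q) ^ d
    return p, q

data = [0x10, 0x22, 0x34]
p, q = pq(data)
# P alone recovers any single lost block; Q carries independent
# information so two simultaneous failures remain solvable
print(p ^ data[0] ^ data[2] == data[1])  # True
```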
• Comparisons
– Read/Write Performance
• RAID 0 provides the best write performance
• RAID 1 provides the best read performance
– Cost - Total # of Disks
• RAID 1 is most expensive - 100% capacity overhead - 2N disks
• RAID 0 is least expensive - N disks - no redundancy
• RAID 2 needs N + ceiling(log2 N) + 1 disks
• RAID 3, RAID 4 & RAID 5 need N + 1 disks
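The disk counts above can be collected into one small sketch (RAID 6's N + 2 is added from the P + Q slide; the function name is illustrative):

```python
import math

def total_disks(level: int, n_data: int) -> int:
    """Total disks needed for N data disks, per the counts above."""
    if level == 0:
        return n_data                                      # no redundancy
    if level == 1:
        return 2 * n_data                                  # full mirror
    if level == 2:
        return n_data + math.ceil(math.log2(n_data)) + 1   # Hamming check disks
    if level in (3, 4, 5):
        return n_data + 1                                  # one parity disk
    if level == 6:
        return n_data + 2                                  # P and Q
    raise ValueError(level)

print([total_disks(lvl, 4) for lvl in (0, 1, 2, 5, 6)])  # [4, 8, 7, 5, 6]
```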
Comparisons
• Preferred Environments
– RAID 0: Performance & capacity are more important than reliability
– RAID 1: High I/O rate, high availability environments
– RAID 2: Large I/O Data Transfer
– RAID 3: High BW Applications (Scientific, Image Processing…)
– RAID 4: High bit BW Applications
– RAID 5 & RAID 6: Mixed Applications
• Performance:
– What metric?
• IOPS ?
• Byte/sec ?
• Response Time ?
• IOPS per $$ ?
• Hybrid ?
– Application Dependent
• Transaction Processing: IOPS per $$
• Scientific Applications: Bytes/sec per $$
• File Servers: Both IOPS and Bytes/sec
• Time-Sharing Applications: User Capacity per $$
The table below shows throughput per $$ relative to RAID 0, assuming G drives in an error-correcting group:

RAID Level | Small Reads | Small Writes  | Large Reads | Large Writes | Storage Efficiency
RAID 0     | 1           | 1             | 1           | 1            | 1
RAID 1     | 1           | 1/2           | 1           | 1/2          | 1/2
RAID 3     | 1/G         | 1/G           | (G-1)/G     | (G-1)/G      | (G-1)/G
RAID 5     | 1           | max(1/G, 1/4) | 1           | (G-1)/G      | (G-1)/G
RAID 6     | 1           | max(1/G, 1/6) | 1           | (G-2)/G      | (G-2)/G

* RAID 3 performance/cost is always <= RAID 5 performance/cost
Performance Issues
• Improving Small Write Performance for RAID 5:
– Writes need 4 I/O accesses; the overhead is emphasized for small writes
• Response time increases by a factor of 2; throughput decreases by a factor of 4
• In contrast, RAID 1 writes require two concurrent writes - latency may increase; throughput decreases by a factor of 2
• Three techniques improve RAID 5 small-write performance
• Buffering & Caching:
– The disk cache (write buffering) acknowledges the host before data is written to disk
– Under high load, write-backs increase & response time goes back to 4 times RAID 0
– During write-back, group sequential writes together
– Keep a copy of the old data before writing ==> 3 I/O accesses
– Keep the new parity & new data in cache; any later updates will require only 2 I/O accesses
• Floating Parity:
– Shortens the RMW of small writes to an average of 1 I/O access
– Clusters parity into cylinders, each containing a track of free blocks
– When parity needs updating, the new parity is written on the closest unallocated block following the old parity
• The parity update then costs approximately one read plus 1 ms
– Overhead: directories for unallocated blocks and parity blocks, kept in a cache in the RAID adapter - requires Mbytes of memory
– Floating Data??
• Larger directories
• Sequential data may become discontinuous on disk
• Parity Logging:
– Delay writing the new parity
– Create an "update image" - the difference between the old & new parity - and store it in a log file in the RAID adapter
– Hopefully, several parity blocks can be grouped together when writing back
– The log file is stored in NVRAM - can extend NVRAM to disk space
– May require more I/Os, but efficient since large chunks of data are processed
– Logging reduces I/O accesses for small writes from 4 to possibly 2+
– Overhead: NVRAM, extra disk space, and memory when applying the parity update image to the old parity
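The logging idea above can be sketched as follows. The class and method names are illustrative; the key property is that XOR update images compose, so several small writes against the same parity block collapse into one delta applied in a single batched pass:

```python
class ParityLog:
    """Sketch of parity logging: buffer XOR 'update images' per parity
    block, then apply them to on-disk parity in one batch."""
    def __init__(self):
        self.pending = {}                       # parity block id -> accumulated XOR

    def record(self, pblock: int, old_data: int, new_data: int) -> None:
        # update image = old_data XOR new_data; XORs compose, so repeated
        # small writes to one stripe merge into a single delta
        self.pending[pblock] = self.pending.get(pblock, 0) ^ old_data ^ new_data

    def apply(self, parity: dict) -> None:
        for pb, delta in self.pending.items():  # one large sequential write-back
            parity[pb] ^= delta
        self.pending.clear()

parity = {0: 0b1010}
log = ParityLog()
log.record(0, 0b0011, 0b0101)   # small write: block changes 0011 -> 0101
log.apply(parity)
print(bin(parity[0]))  # 0b1100
```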
Hardware vs. Software RAID
• RAID can be implemented in the OS
– In RAID 1, hardware RAID allows 100% mirroring; OS-implemented mirroring must distinguish between master & slave drives
• Only the master drive has the boot code; if it fails, you can continue working, but no booting is possible
• Hardware mirroring does not have this drawback
– Since software RAIDs implement standard SCSI, repair functions such as support for spare drives and hot plugging have not been implemented; in contrast, hardware RAID implements various repair functions
– Hardware RAID improves system performance with its caching system, especially during high-load situations, and with synchronization
– Microsoft Windows NT implements RAID 0 and RAID 1
• What RAID for which application
– Fast Workstation:
• Caching is important to improve the I/O rate
• If large files are installed, then RAID 0 may be necessary
• It is preferable to put the OS and swap files on drives separate from user drives, to minimize head movement between the swap file area & the user area
– Small Server:
• RAID 1 is preferred
– Mid-Size Server:
• If more capacity is needed, then RAID 5 is recommended
– Large Server (e.g., Database Servers):
• RAID 5 is preferred
• Separate different I/Os into mechanically independent arrays; place database index & data files in different arrays