Parity Lost and Parity Regained. Andrew Krioukov, Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau (University of Wisconsin - Madison); Garth R. Goodson, Kiran Srinivasan, Randy Thelen

Post on 21-Dec-2015


Parity Lost and Parity Regained

Andrew Krioukov, Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
University of Wisconsin - Madison

Garth R. Goodson, Kiran Srinivasan, Randy Thelen


Bare-bones RAID

• Stripe data across multiple drives
• Store redundant parity data
• Can reconstruct data with any single disk failure
• Will RAID protect data in all single failure cases?

[Figure: RAID stripe. Data 1 holds A, Data 2 holds B, Data 3 holds C, and Parity holds P(ABC)]
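The reconstruction property on this slide can be illustrated with XOR parity. This is a minimal sketch, assuming single parity computed as a byte-wise XOR; the helper name `xor_blocks` and the block contents are made up for illustration.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Three data blocks and their parity, as in the stripe on the slide.
a, b, c = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(a, b, c)

# If the disk holding C fails completely, XOR of the survivors rebuilds it.
rebuilt_c = xor_blocks(a, b, parity)
```

Reconstruction only helps once the RAID layer knows which block to distrust; XOR by itself cannot tell which block of an inconsistent stripe is wrong.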


Bare-bones RAID Problems

• Stripe contains file ABC consisting of 3 blocks
• RAID has redundancy to recover data
• RAID does not detect corruption

[Figure: RAID stripe with A, B, C, and P(ABC). Block C on Data 3 is corrupted; a read of file ABC returns the corrupt file]


Bare-bones RAID Problems

• RAID cannot detect partial disk failures:
– Corruptions
– Torn writes
– Lost writes
– Misdirected writes

• RAID only protects against
– Complete disk failures
– Errors reported by the disk (e.g. Latent Sector Errors)


Data Protection Techniques

• Need improvements to bare-bones RAID

– Techniques needed to help detect errors

• Checksums are common

– Many kinds: block, sector, parent checksums

• Which types of checksums are used?

• We examined real systems to determine protection schemes


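Enterprise RAID Systems

• Mixed bag of protections

Protections compared: Scrub, Sector Cksum, Block Cksum, Parent Cksum, Write Verify, Phys Ident, Logical Ident, Write Stamp

[Table: check marks per system; the column each mark belongs to is not recoverable here]
Dell Powervault: √ √ √
Hitachi Thunder: √ √ √
NetApp ONTAP: √ √ √ √ √
Sun ZFS: √ √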


Question

• Which errors do these systems protect against?

• How can we ensure complete data protection?

• Need method to identify all corruption & data loss scenarios in a design


Model Checking Solution

• Create a model of storage system design using primitives

• Checker exhaustively searches space of all possible states
– Start with clean RAID stripe
– Apply single disk error
– Apply any number of disk operations (e.g. write)

• Identifies all possible data loss scenarios


Results Summary

• Applied model checking on enterprise RAID system designs
• For all designs, a single error can cause data loss
• Identified a common problem, parity pollution
– Partial disk failure goes undetected
– The erroneous data is used to compute parity
– Recovery is no longer possible
• Presented a design that protects against all single failures


Outline

• Introduction
• Background: Storage Errors
• Model Checking Approach
• Data Protection Design & Analysis
• Conclusion


Storage Errors

• Latent Sector Errors
– Data is inaccessible
– Explicit error code returned
– Affect 19% of nearline, 2% of enterprise disks in 2 years [Bairavasundaram et al. SIGMETRICS’07]

• Corruptions
– Data is silently corrupted
– Affect 0.6% of nearline and 0.06% of enterprise disks in 17 months [Bairavasundaram et al. FAST’08]

• Reality: Partial disk failures happen


Storage Errors (Cont’d)

• Torn Write
– Only part of a block is written
– Some sectors are lost
– Write returns success code

• Lost Writes
– Write returns success code
– Data not reflected on disk

[Figure: in both cases a block initially holds A, "Write B" is issued, and the disk reports Success]


Storage Errors (Cont’d)

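• Misdirected Writes
– Write goes to wrong location (either wrong block or wrong disk)
– Combination of lost write and corruption

[Figure: an overwrite of A with A’ reports Success, but A’ lands on the block that held B while the block holding A is unchanged]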
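The three silent write failures described on these slides can be imitated with a small fault-injecting disk. This is a sketch under stated assumptions: the class `FaultyDisk`, its `fault` parameter, and the eight-sector block are hypothetical and not taken from the talk.

```python
SECTORS_PER_BLOCK = 8  # assumed block size, for illustration only

class FaultyDisk:
    """Toy disk: each block is a list of sectors; every write reports success."""

    def __init__(self, num_blocks: int):
        self.blocks = [["old"] * SECTORS_PER_BLOCK for _ in range(num_blocks)]

    def write(self, block_no, sectors, fault=None, wrong_block=None):
        if fault == "lost":
            pass                                      # nothing reaches the media
        elif fault == "torn":
            half = SECTORS_PER_BLOCK // 2             # only the first half lands
            self.blocks[block_no][:half] = sectors[:half]
        elif fault == "misdirected":
            self.blocks[wrong_block] = list(sectors)  # lands on another block
        else:
            self.blocks[block_no] = list(sectors)
        return "success"                              # the fault is never reported

disk = FaultyDisk(num_blocks=2)
new = ["new"] * SECTORS_PER_BLOCK
status = disk.write(0, new, fault="misdirected", wrong_block=1)
```

After the misdirected write, block 0 still holds its old contents (the lost-write half) and block 1 holds data it was never meant to hold (the corruption half), which matches the "combination" bullet above.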


Outline

• Introduction
• Background: Storage Errors
• Model Checking Approach
• Data Protection Design & Analysis
• Conclusion


Modeling Storage System

• Use primitives to describe:
– On disk layout in terms of sectors
– Data protections

• Checker uses built-in models:
– Storage errors
– Disk operations (e.g. Read/Write)
– Basic RAID functionality


Model Checking

• Assumptions
– Single RAID stripe
– Single storage error
– Single parity protection
– Data disks are interchangeable

• Apply error followed by any number of disk operations

• Generate state diagram with all data loss states
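The search described above can be sketched as a breadth-first exploration of a toy stripe. This is not the authors' checker: the state encoding (two one-bit data blocks plus parity, paired with the values the user last wrote), the operation labels, and the function names are assumptions made for illustration, and only corruption errors on bare-bones RAID without reconstruction are modeled.

```python
from collections import deque
from itertools import product

def successors(state):
    """Yield (operation label, next state) for every full-block write."""
    disk, truth = state
    for i, v in product(range(2), range(2)):
        t = list(truth)
        t[i] = v
        # Subtractive update: new parity = old parity ^ old data ^ new data.
        d = list(disk)
        d[2] = disk[2] ^ disk[i] ^ v
        d[i] = v
        yield f"Wsub({i},{v})", (tuple(d), tuple(t))
        # Additive update: parity recomputed from the other data block.
        d2 = list(disk)
        d2[i] = v
        d2[2] = v ^ disk[1 - i]
        yield f"Wadd({i},{v})", (tuple(d2), tuple(t))

def find_data_loss():
    """Search from every single-error state; return traces ending in a bad read."""
    clean = ((0, 0, 0), (0, 0))          # (d0, d1, parity), (expected d0, d1)
    bad = []
    for x in range(3):                   # corrupt data block 0, 1, or the parity
        disk = list(clean[0])
        disk[x] ^= 1
        start = (tuple(disk), clean[1])
        seen, queue = {start}, deque([(start, [f"Corrupt({x})"])])
        while queue:
            (disk, truth), trace = queue.popleft()
            for i in range(2):
                if disk[i] != truth[i]:  # a read would return wrong data
                    bad.append(trace + [f"R({i})"])
            for label, nxt in successors((disk, truth)):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, trace + [label]))
    return bad

traces = find_data_loss()
```

Because the state space is finite and visited states are remembered, the search terminates. Every returned trace is an error followed by zero or more writes and a read that returns wrong data.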


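State Diagram Example

• Bare-bones RAID state diagram

[Figure: state diagram with states Clean, Parity Error, Disk x Error, Corrupt Data, and Polluted Parity. The errors Corrupt, Torn, Lost, and Misdir applied to the parity disk (p) lead from Clean to Parity Error; the same errors applied to a data disk (x) lead from Clean to Disk x Error. The remaining edges are labeled with the operations R(x), W(x+), Wadd(x+), Wadd(!x), Wadd(), and Wsub(x+)]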


Outline

• Introduction
• Background: Storage Errors
• Model Checking Approach
• Data Protection Design & Analysis
• Conclusion


Data Protection Design

• Need fault tolerance for all partial failures

• Bare-bones RAID handles latent sector errors and complete disk failures

• Corruption is next most common failure

• Add protections cumulatively until design has complete protection


Protections

Protections in red will be discussed in the talk

• Scrubbing
• Sector checksums
• Block checksums
• Parental checksums
• Write verify
• Physical identity
• Logical identity
• Version mirroring


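Checksums

• Checksum per data block

• Checksum per sector

• Parent checksum
– Checksum stored in parent inode

[Figure: block A stored with cksum(A); sectors a1 and a2 stored with ck(a1) and ck(a2); block A with its checksum held by the parent]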


Checksum Example

• Corruption scenario is now fixed

[Figure: stripe with A, B, C, and P(ABC), each stored with its checksum. Block C on Data 3 is corrupted. When the user reads file ABC, the contents of Data 3 fail the check against cksum(C), reconstruction is performed from A, B, and P(ABC), and the file returned is valid]
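The read path in this example (verify the checksum, reconstruct from parity on a mismatch) can be sketched as follows. This is a minimal sketch, assuming CRC32 as the checksum and a checksum list kept apart from the data for brevity; the names `read_block` and `xor_blocks` are illustrative.

```python
import zlib
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))

def read_block(stripe, checksums, index):
    """Return block `index`, rebuilding it from the others if its checksum fails."""
    data = stripe[index]
    if zlib.crc32(data) == checksums[index]:
        return data
    others = [blk for i, blk in enumerate(stripe) if i != index]
    rebuilt = xor_blocks(*others)
    if zlib.crc32(rebuilt) != checksums[index]:
        raise IOError("data loss: reconstruction does not match checksum")
    return rebuilt

a, b, c = b"AAAA", b"BBBB", b"CCCC"
stripe = [a, b, c, xor_blocks(a, b, c)]          # last entry is the parity block
checksums = [zlib.crc32(blk) for blk in stripe]
stripe[2] = b"@#$%"                               # silent corruption of C
recovered = read_block(stripe, checksums, 2)
```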

Checksum Problems

• Great for protecting against corruption errors
• Fails to protect when data and checksum are lost together:
– Lost write (with any type of checksums)
– Torn write (only with sector checksums)

• Parity pollution can occur
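The torn-write case can be made concrete by comparing per-sector checksums with a single block checksum. This is a sketch under assumptions: CRC32 stands in for the checksum, a block has four sectors, and the function names are hypothetical.

```python
import zlib

def sector_checksums_ok(sectors):
    """Each sector is (payload, crc); it is valid if its own crc matches."""
    return all(zlib.crc32(payload) == crc for payload, crc in sectors)

def block_checksum_ok(payloads, block_crc):
    """One crc covers every sector of the block."""
    return zlib.crc32(b"".join(payloads)) == block_crc

old = [b"old0", b"old1", b"old2", b"old3"]
new = [b"new0", b"new1", b"new2", b"new3"]
old_sectors = [(p, zlib.crc32(p)) for p in old]
new_sectors = [(p, zlib.crc32(p)) for p in new]

# Torn write: only the first two sectors of the new block reach the disk.
torn_sectors = new_sectors[:2] + old_sectors[2:]
torn_payloads = [p for p, _ in torn_sectors]

# Every sector still carries a checksum that matches its own payload.
sector_view = sector_checksums_ok(torn_sectors)
# A checksum over the whole new block does not match the mixed contents.
block_view = block_checksum_ok(torn_payloads, zlib.crc32(b"".join(new)))
```

The stale sectors travel with their stale checksums, so the per-sector check passes; the block-level check fails whether it is compared against the old or the new block checksum.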


Checksum Problems – Lost Write

• Block checksums

[Figure: stripe with A, B, C, and P(ABC), each stored with its checksum. An overwrite C→C’ updates the parity to P(ABC’), but the write to Data 3 is lost, so the disk still holds C with the matching cksum(C). A later read of file ABC’ returns data ABC, which is corrupt data (C instead of C’)]
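Why the block checksum cannot catch this is easy to show in a few lines. This is a minimal sketch, assuming the checksum is stored together with the block it covers and CRC32 as the checksum; `make_block` and `verify` are illustrative names.

```python
import zlib

def make_block(data: bytes):
    """A block as stored on disk: payload plus its own checksum."""
    return {"data": data, "cksum": zlib.crc32(data)}

def verify(block) -> bool:
    return zlib.crc32(block["data"]) == block["cksum"]

on_disk = make_block(b"C")      # state before the overwrite
intended = make_block(b"C'")    # what the overwrite should have stored

# Lost write: the disk reports success but the media is unchanged,
# so a later read returns the old block with its old, matching checksum.
returned = on_disk
```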


Write Verify

• Attempt to solve lost write problem
• Costly solution, expect good protection
• Procedure:
1. Write data to disk
2. Read back to verify
3. If lost write detected, write again or remap to new location

[Figure: the overwrite C→C’ is lost, so the disk still holds C with cksum(C). The read back returns C, the lost write is detected, and C’ is written again, this time successfully, with cksum(C’)]
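The three-step procedure can be sketched against a toy disk that silently drops writes. The class `LossyDisk`, the `drops` counter, and the retry limit are assumptions made for illustration.

```python
class LossyDisk:
    """Toy disk that silently drops the first `drops` writes."""

    def __init__(self, drops: int):
        self.blocks = {}
        self.drops = drops

    def write(self, block_no, data):
        if self.drops > 0:
            self.drops -= 1           # lost write: success reported, nothing stored
        else:
            self.blocks[block_no] = data
        return True

    def read(self, block_no):
        return self.blocks.get(block_no)

def write_verify(disk, block_no, data, max_attempts=3):
    """Write, read back, and retry until the read-back matches."""
    for attempt in range(1, max_attempts + 1):
        disk.write(block_no, data)
        if disk.read(block_no) == data:
            return attempt            # number of writes it took
    raise IOError("write could not be verified; remap to a new location")

disk = LossyDisk(drops=1)
attempts = write_verify(disk, 3, b"C'")
```

Every write now pays for an extra read, even when nothing goes wrong.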


Write Verify Problems

• Protects against lost writes
• Susceptible to misdirected writes
– Cannot detect/recover the overwritten data


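Write Verify – Misdirected Write

[Figure: two stripes, A, B, C with P(ABC) and X, Y, Z with P(XYZ), every block stored with its checksum.
Initially… an overwrite X→X’ updates the parity to P(X’YZ), but the data write is misdirected onto the block holding A. The read back returns X, the write is treated as lost, and X’ is re-written to the right place.
Later… a read of file ABC finds X’ with a matching cksum(X’) where A should be and returns corrupt data (A has been corrupted)]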


Physical Identity

• Protection against misdirected writes
• Store disk & block number of destination in each block

[Figure: each block stores (data, block number). An overwrite of block 1, A→A’, is misdirected onto block 2, which held (B, 2). A read of block 2 returns (A’, 1); the block number does not match (1≠2), so the misdirected write is detected]
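The check on this slide compares the identity stored in the block with the location it was read from. This is a minimal sketch; the `Block` record, the exception name, and the nested-dict disk layout are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Block:
    data: bytes
    disk_no: int     # physical identity: where this block was meant to go
    block_no: int

class MisdirectedWrite(Exception):
    pass

def read_checked(disks, disk_no, block_no):
    """Read a block and confirm its stored identity matches the location read."""
    blk = disks[disk_no][block_no]
    if (blk.disk_no, blk.block_no) != (disk_no, block_no):
        raise MisdirectedWrite(
            f"found block addressed to {(blk.disk_no, blk.block_no)} "
            f"at {(disk_no, block_no)}")
    return blk.data

disks = {0: {1: Block(b"A", 0, 1), 2: Block(b"B", 0, 2)}}
# The overwrite of block 1 with A' is misdirected and lands on block 2.
disks[0][2] = Block(b"A'", 0, 1)
```

In this sketch the victim (block 2) is caught, while block 1 still holds stale data with a correct identity; that half of the failure is a lost write and needs a separate protection.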


Problem Solved?

• Write verify with block checksums and physical identity offers complete protection

• But… twice the I/O cost!
• Need a more efficient solution


Logical Identity

• Less expensive protection against lost writes
• Store file identifier (e.g. inode number) in each data block
• Test that file identifier matches on a read

[Figure: a block holds A with its checksum and the identifier File 0. An overwrite of File 0 with File 1 (X) is lost, so the block still holds A tagged File 0. A read of File 1 finds that the logical ID does not match, and the lost write is detected]
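The logical identity check can be sketched as a comparison on the file-system read path. The `Block` record, the exception name, and `read_for_file` are hypothetical names used for illustration.

```python
from dataclasses import dataclass

@dataclass
class Block:
    data: bytes
    file_id: int     # logical identity, e.g. the owning inode number

class LostWriteDetected(Exception):
    pass

def read_for_file(block: Block, expected_file_id: int) -> bytes:
    """File-system read path: the caller knows which file it expects."""
    if block.file_id != expected_file_id:
        raise LostWriteDetected(
            f"block belongs to file {block.file_id}, expected {expected_file_id}")
    return block.data

on_disk = Block(b"A", file_id=0)
# The overwrite with File 1's data X is lost, so on_disk is unchanged.
```

As written, this sketch only notices a lost write that changes the owning file; a lost overwrite within the same file would carry the same identifier and pass the check.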


Logical Identity Problem

• Cannot be verified when re-computing parity
– Not reading a file

• Parity pollution may occur


Parity Pollution Example

[Figure: stripe with A, B, C, and P(ABC); every data block carries a checksum and the logical ID File 0.
Write File 1: overwrite C→C’ with parity P(ABC’). The write to Data 3 is lost, so the disk still holds C tagged File 0.
Later… Write File 2: overwrite AB→A’B’, tagged File 2. Data 3 is read to compute the new parity, giving P(A’B’C): parity consistent with invalid data.
Later… Read File 1: logical ID mismatch (File 0 ≠ File 1). Reconstruct… Data is consistent! Report data loss.
What should be on the disk: A’ File2, B’ File2, C’ File1, P(A’B’C’)]
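The sequence on this slide can be replayed with small integers standing in for blocks and XOR as parity. This is a sketch under assumptions: the values, the dictionary layout, and the additive parity update for the second write are chosen for illustration.

```python
# Each data block is (value, file id); parity is the XOR of the values.
disk = {"d1": (0xA, 0), "d2": (0xB, 0), "d3": (0xC, 0)}
parity = 0xA ^ 0xB ^ 0xC

# Write File 1: C -> C'. Parity is updated, but the data write is lost,
# so disk["d3"] stays (0xC, 0).
c_new = 0xD
parity = parity ^ 0xC ^ c_new                   # parity is now P(ABC')

# At this point C' is still recoverable from the other blocks and parity.
recoverable_before = disk["d1"][0] ^ disk["d2"][0] ^ parity

# Write File 2: A,B -> A',B'. Parity is recomputed additively, which reads
# the stale Data 3. No file is being read, so its file id is not checked.
disk["d1"], disk["d2"] = (0x1, 2), (0x2, 2)
parity = disk["d1"][0] ^ disk["d2"][0] ^ disk["d3"][0]   # P(A'B'C): polluted

# Read File 1: the logical id exposes the lost write ...
mismatch = disk["d3"][1] != 1
# ... but reconstruction now yields the stale C, not C'.
recoverable_after = disk["d1"][0] ^ disk["d2"][0] ^ parity
```

Before the second write the stripe still encodes C’; afterwards the only copy of that information has been overwritten by parity computed from the stale block.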


Version Mirroring

• Lost write protection
• Verifiable at RAID level
• Store a version number in each data block
• Mirror the version numbers on parity disk
• Version numbers verified on read

[Figure: blocks A, B, and C each carry a checksum and the version Ver0; the parity block P(ABC) carries its checksum and the mirrored versions 0,0,0]


Parity Pollution Solved

[Figure: stripe with A, B, C (all Ver0), and P(ABC) with mirrored versions 0,0,0.
Write File 1: overwrite C→C’ with parity P(ABC’) and mirrored versions 0,0,1. The write to Data 3 is lost, so the disk still holds C at Ver0.
Later… Write File 2: overwrite AB→A’B’ at Ver1. Data 3 is read to compute the new parity; its Ver0 does not match the mirrored version 1 (version mismatch), so Data 3 is reconstructed from A, B, and P(ABC’), giving C’ with cksum(C’) at Ver1. The new parity is P(A’B’C’) with mirrored versions 1,1,1.
What should be on the disk: A’ Ver1, B’ Ver1, C’ Ver1, P(A’B’C’)]
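The same write sequence can be replayed with version mirroring in place. This is a minimal sketch, assuming integer blocks, XOR parity, and one version counter per data block; `read_for_parity` and the data layout are illustrative, not the design's actual on-disk format.

```python
def xor(*vals):
    out = 0
    for v in vals:
        out ^= v
    return out

# Data blocks hold (value, version); the parity block mirrors all versions.
data = [(0xA, 0), (0xB, 0), (0xC, 0)]
parity = {"value": xor(0xA, 0xB, 0xC), "versions": [0, 0, 0]}

def read_for_parity(index):
    """RAID-level read: verify the version, reconstruct the block on mismatch."""
    value, version = data[index]
    if version != parity["versions"][index]:
        others = [v for i, (v, _) in enumerate(data) if i != index]
        value = xor(parity["value"], *others)
        data[index] = (value, parity["versions"][index])   # repair the block
    return value

# Write File 1: C -> C' bumps the mirrored version, but the data write is lost.
c_new = 0xD
parity["value"] ^= 0xC ^ c_new
parity["versions"][2] = 1

# Write File 2: Data 3 is read through the version check before the
# parity is recomputed, so the stale C never reaches the new parity.
c_seen = read_for_parity(2)
data[0], data[1] = (0x1, 1), (0x2, 1)
parity["value"] = xor(0x1, 0x2, c_seen)
parity["versions"][0] = parity["versions"][1] = 1
```

Unlike the logical identity, the version check runs at the RAID level, so it also applies to reads issued only to recompute parity.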


Problem Solved… Efficiently

• Version mirroring with block checksums and physical identity provides complete protection

• Use with logical identity for efficiency
• More efficient than write verify


Conclusion

• Applied model checking on real system designs
– For all designs, a single error can cause data loss
– Parity pollution is a common problem
– Version mirroring is a key technique to offering complete and efficient data protection

• Partial failures are complex, no obvious data protection solution
– Model checking is useful


ADvanced Systems Laboratory: www.cs.wisc.edu/adsl

Advanced Technology Group: http://www.netapp.com/company/research/