big data processing technologieswuct/bdpt/slides/lec3.pdferasure codes —reed solomon (1) •given...

99
Big Data Processing Technologies Chentao Wu Associate Professor Dept. of Computer Science and Engineering [email protected]

Upload: others

Post on 09-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Big Data Processing Technologies

Chentao WuAssociate Professor

Dept. of Computer Science and [email protected]

Page 2: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Schedule

• lec1: Introduction on big data and cloud computing

• Iec2: Introduction on data storage

• lec3: Data reliability (Replication/Archive/EC)

• lec4: Data consistency problem

• lec5: Block level storage and file storage

• lec6: Object-based storage

• lec7: Distributed file system

• lec8: Metadata management

Page 3: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Collaborators

Page 4: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Data Reliability Problem (1)Google – Disk Annual Failure Rate

Page 5: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Data Reliability Problem (2)Facebook-- Failure nodes in a 3000 nodes cluster

Page 6: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Contents

Introduction on Replication1

Page 7: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

What is Replication?

• Replication can be classified as• Local replication

• Replicating data within the same array or data center

• Remote replication• Replicating data at remote site

It is a process of creating an exact copy (replica) of data.

Replication

Source Replica (Target)

REPLICATION

Page 8: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

File System Consistency: Flushing Host Buffer

File System

Application

Memory Buffers

Logical Volume Manager

Physical Disk Driver

Data

Flush Buffer

Source Replica

Page 9: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Database Consistency: Dependent Write I/O Principle

D InconsistentC Consistent

Source Replica

4 4

3 3

2 2

1 1

4 4

3 3

2

1

C

Source Replica

Page 10: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Host-based Replication: LVM-based Mirroring

CCHost

Logical Volume

PhysicalVolume 1

PhysicalVolume 2

• LVM: Logical Volume Manager

Page 11: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Host-based Replication: File System Snapshot

CC

• Pointer-based replication

• Uses Copy on First Write (CoFW) principle

• Uses bitmap and block map

• Requires a fraction of the space used by the production FS

Metadata

Production FS

Metadata

1 Data a

2 Data b

FS Snapshot

3 no data

4 no data

BLKBit

1-0 1-0

2-0 2-0

N Data N

3 Data C

2 Data c

3-1

4 Data D 1 Data d

4-1

3-2

4-1

Page 12: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Storage Array-based Local Replication

CC

• Replication performed by the array operating environment

• Source and replica are on the same array

• Types of array-based replication• Full-volume mirroring

• Pointer-based full-volume replication

• Pointer-based virtual replication

BC HostStorage Array

ReplicaSource

Production Host

Page 13: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Full-Volume Mirroring

Source

Attached

Storage Array

Read/Write Not Ready

Production Host BC Host

Target

Detached – Point In Time

Read/Write Read/Write

Source

Storage ArrayProduction Host BC Host

Target

Page 14: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Copy on First Access: Write to the Source

Source

C’

Target

• When a write is issued to the source for the first time after replicationsession activation:

Original data at that address is copied to the target

Then the new data is updated on the source

This ensures that original data at the point-in-time of activation is preserved on the target

Production Host BC Host

C

Write to Source

A

B

C’ C

Page 15: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Copy on First Access: Write to the Target

• When a write is issued to the target for the first time after replication session activation:

The original data is copied from the source to the target

Then the new data is updated on the target

Source

B’

Target

Production Host BC Host

B

Write to Target

A

B

C’ C

B’

Page 16: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Copy on First Access: Read from Target

• When a read is issued to the target for the first time after replication session activation:

The original data is copied from the source to the target and is made available to the BC host

Source

A

Target

Production Host BC Host

A

Readrequest for

data “A”

A

B

C’ C

B’

A

Page 17: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Tracking Changes to Source and Target

Source

Target

0 unchanged changed

Logical OR

At PIT

Target

SourceAfter PIT…

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

1 0 0 1 0 1 0 0

0 0 1 1 0 0 0 1

1 0 1 1 0 1 0 1

1

For resynchronization/restore

Page 18: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Contents

Introduction to Erasure Codes2

Page 19: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (1)• You've got some data • And a collection of storage

nodes.

• And you want to store the data on the storage nodes so that you can get the data back, even when the nodes fail..

Page 20: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (2)• More concrete: You have k

disks worth of data• And n total disks.

• The erasure code tells you how to create n disks worth of data+coding so that when disks fail, you can still get the data

Page 21: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (3)• You have k disks worth of

data• And n total disks.

• n = k + m

• A systematic erasure code stores the data in the clear on k of the n disks. There are k data disks, and m coding or “parity” disks. Horizontal Code

Page 22: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (4)• You have k disks worth of

data• And n total disks.

• n = k + m

• A non-systematic erasure code stores only coding information, but we still use k, m, and n to describe the code. Vertical Code

Page 23: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (5)• You have k disks worth of

data• And n total disks.

• n = k + m

• When disks fail, their contents become unusable, and the storage system detects this. This failure mode is called an erasure.

Page 24: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Coding Basis (6)• You have k disks worth of

data• And n total disks.

• n = k + m

• An MDS (“Maximum Distance Separable”) code can reconstruct the data from any m failures. Optimal

• Can reconstruct any f failures (f < m) non-MDS code

Page 25: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Two Views of a Stripe (1)• The Theoretical View:

– The minimum collection of bits that encode and decode together.

– r rows of w-bit symbols from each of n disks:

Page 26: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Two Views of a Stripe (2)• The Systems View:

– The minimum partition of the system that encodes and decodes together.

– Groups together theoretical stripes for performance.

Page 27: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Horizontal & Vertical Codes• Horizontal Code

• Vertical Code

Page 28: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Expressing Code with Generator Matrix (1)

Page 29: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Expressing Code with Generator Matrix (2)

Page 30: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Expressing Code with Generator Matrix (3)

Page 31: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— Linux RAID-6 (1)

Page 32: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— Linux RAID-6 (2)

Page 33: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— Linux RAID-6 (3)

Page 34: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Accelerate Encoding— Linux RAID-6

Page 35: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (1)

Page 36: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (2)

Page 37: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (3)

Page 38: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (4)

Page 39: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (5)

Page 40: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (6)

Horizontal Parity Diagonal Parity Data

0 1 2 3 4 5 6 7

0

1

2

3

4

5

• Horizontal parity layout (p=7, n=8)

Page 41: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Encoding— RDP (7)• Diagonal parity layout (p=7, n=8)

Horizontal Parity Diagonal Parity Data

0 1 2 3 4 5 6 7

0

1

2

3

4

5

Page 42: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Arithmetic for Erasure Codes• When w = 1: XOR's only.

• Otherwise, Galois Field Arithmetic GF(2w)

– w is 2, 4, 8, 16, 32, 64, 128 so that words fit evenly into computer words.

– Addition is equal to XOR.Nice because addition equals subtraction.

– Multiplication is more complicated:Gets more expensive as w grows.

Buffer-constant different from a * b.

Buffer * 2 can be done really fast.

Open source library support.

Page 43: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Decoding with Generator Matrices (1)

Page 44: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Decoding with Generator Matrices (2)

Page 45: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Decoding with Generator Matrices (3)

Page 46: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Decoding with Generator Matrices (4)

Page 47: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Decoding with Generator Matrices (5)

Page 48: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes — Reed Solomon (1)• Given in 1960.

• MDS Erasure codes for any n and k.

– That means any m = (n-k) failures can be tolerated without data loss.

• r = 1

(Theoretical): One word per disk per stripe.

• w constrained so that n ≤ 2w.

• Systematic and non-systematic forms.

Page 49: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —Reed Solomon (2)Systematic RS -- Cauchy generator matrix

Page 50: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —Reed Solomon (3)Non-Systematic RS -- Vandermonde generator matrix

Page 51: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —Reed Solomon (4)Non-Systematic RS -- Vandermonde generator matrix

Page 52: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —EVENODD 1995(7 disks, tolerating 2 disk failures)• Horizontal Parity Coding

• Calculated by the data elements in the same row

• E.g. 𝐶0,5 = 𝐶0,0 ⊕𝐶0,1 ⊕𝐶0,2 ⊕𝐶0,3⊕𝐶0,4

• Diagonal Parity Coding

• Calculated by the data elements and S

• E.g. 𝐶0,6 = 𝐶0,0 ⊕𝐶3,2 ⊕𝐶2,3 ⊕𝐶1,4 ⊕𝑆

Page 53: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —X-Code 1999 (1)• Diagonal parity layout (p=7, n=7)

Diagonal Parity Anti-diagonal Parity Data

0 1 2 3 4 5 6

0

1

2

3

4

5

6

Page 54: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —X-Code 1999 (2)• Anti-diagonal parity layout (p=7, n=7)

Diagonal Parity Anti-diagonal Parity Data

0 1 2 3 4 5 6

0

1

2

3

4

5

6

Page 55: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —H-Code (1)• Horizontal parity layout (p=7, n=8)

Horizontal Parity Anti-diagonal Parity Data

0 1 2 3 4 5 6 7

0

1

2

3

4

5

Page 56: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —H-Code (2)• Anti-diagonal parity layout (p=7, n=8)

Horizontal Parity Anti-diagonal Parity Data

0 1 2 3 4 5 6 7

0

1

2

3

4

5

Page 57: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —H-Code (3)• Recover double disk failure by single recovery chain

Horizontal Parity Anti-diagonal Parity Data Lost Data and Parity

Recovery

Chain

1

23

45

67

89

1011

12X

F

H

J

L

B

D

K

E

A

I

G

C

0 1 2 3 4 5 6 7

0

1

2

3

4

5

Page 58: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —H-Code (4)• Recover double disk failure by two recovery chains

5

Horizontal Parity Anti-diagonal Parity Data Lost Data and Parity

Recovery

Chain

1

23

45

6X

DE

L

J

K

I

H

F

G

A

BC

0 1 2 3 4 5 6 7

0

1

2

3

4

5

1

2 3

4

6X

Page 59: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —HDP Code (1)• Diagonal parity layout (p=7, n=6)

0 1 2 3 4 5

0

1

2

3

4

5

HDP ADPData

Page 60: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —HDP Code (2)• Diagonal parity layout (p=7, n=6)

0 1 2 3 4 5

0

1

2

3

4

5

HDP ADPData

Page 61: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Codes —HDP Code (3)• HDP reduces more than 30% average recovery time.

0 1 2 3 4 5

0

1

2

3

4

5

HDP ADPData Lost Data and Parity

A1B

CD

EF

K L

I J

G H

F

F

F

F

2

34

56

1 2

3 4

5 6

Page 62: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Contents

Replication and EC in Cloud3

Page 63: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Three Dimensions in Cloud Storage

Page 64: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Replication vs Erasure Coding (RS)

Page 65: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Fundamental Tradeoff

Page 66: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (1)

Page 67: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (2)

Page 68: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (3) Multiple Hierachies

Page 69: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (4) Multiple Hierachies

Page 70: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (5) Multiple Hierachies

Page 71: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Pyramid Codes (6)

Page 72: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Google GFS II – Based on RS

Page 73: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (1)How to Reduce Cost?

Page 74: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (2)Recovery becomes expensive

Page 75: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (3)Best of both worlds?

Page 76: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (4)Local Reconstruction Code (LRC)

Page 77: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (5)Analysis LRC vs RS

Page 78: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Microsoft Azure (6)Analysis LRC vs RS

Page 79: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Recovery problem in Cloud

• Recovery I/Os from 6 disks (high network bandwidth)

Page 80: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Optimizing Recovery Network I/O (1)

Page 81: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Optimizing Recovery Network I/O (1)

• Establish recovery relationships among disks

Page 82: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Optimizing Recovery I/O (3)

• ~20+% savings in general

Page 83: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Regenerating Codes (1)

• Data = {a,b,c}

Page 84: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Regenerating Codes (2)

• Optimal Repair

Page 85: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Regenerating Codes (3)

• Optimal Repair

Page 86: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Regenerating Codes (4)

• Optimal Repair

Page 87: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Regenerating Codes (4)Analysis -- Regenerating vs RS

Page 88: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Facebook Xorbas HadoopLocally Repairable Codes

Page 89: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (1)Recovery Cost vs. Storage Overhead

Page 90: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (2)Fast Code and Compact Code

Page 91: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (3)Analysis

Page 92: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (4)Analysis

Page 93: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (5)Analysis

Page 94: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (6)Conversion• Horizontal parities require no re-computation

• Vertical parities require no data block transfer

• All parity updates can be done in parallel and in a distributedmanner

Page 95: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Combination of Two ECs (7)Results

Page 96: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Contents

Project 14

Page 97: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Code in Hadoop (1)

• Implement an erasure code into Hadoop system

• Hadoop Version: 2.7 or higher

• Erasure Code: you can select one, but not RS

• Test the storage efficiency of your proposed code

• Report and Source Code are required

• Source Code should be checked by TA

• Deadline: June 30th

Page 98: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Erasure Code in Hadoop (2)

• References

• Jerasurehttp://web.eecs.utk.edu/~plank/plank/www/software.html

• HDFS-Xorbashttp://smahesh.com/HadoopUSC/

Page 99: Big Data Processing Technologieswuct/bdpt/slides/lec3.pdfErasure Codes —Reed Solomon (1) •Given in 1960. •MDS Erasure codes for any n and k. –That means any m = (n-k) failures

Thank you!