Witold Litwin, Riad Mokadem (Riad.Mokadem@dauphine.fr), Thomas Schwartz

Disk Backup Through Algebraic Signatures for a Scalable Distributed Data Structure in SDDS-2002 System

Post on 21-Dec-2015



2

Plan

• Introduction

• The SDDS-2002 Backup Scheme

• Experimental Performance Analysis

• Conclusion

3

Introduction

Need to store RAM-resident SDDS files to disk:

• File backup: protection against the failure of a server

• File eviction: sharing of RAM among different SDDS files and with other applications

4

Introduction

• Write to disk only the parts (pages) changed since the last backup

• The "dirty bit" approach is inapplicable

• Page signature calculus is a possibility, provided that it is:

  • Fast

  • Precise

  • Scalable: shorter signatures may become longer without total recalculation

• This is not the case for SHA-1, nor for any other previously proposed scheme

5

The SDDS-2002 Backup Scheme: File Backup

[Figure: Distributed storing. The client sends a Store command (multicast) to the servers; each server's RAM bucket is written to that server's disk.]

6

The SDDS-2002 Backup Scheme: File Load

[Figure: Distributed loading. The client sends a Load command (multicast) to the servers; each server's RAM bucket is loaded from that server's disk.]

7

Internal Organization of a Bucket in SDDS

[Figure: bucket layout. Header; SDDS index (B+-tree); data pages (data file).]

• Index: a few KB up to MB

• Data file: dozens of MB up to GB

8

Page Granularity

The page size needs a careful choice:

• Smaller pages: more individual writes if there are many random updates; less data transferred if there are only a few updates

• Larger pages: vice versa

• Optimal size? Good question

Our choice:

• 16 KB for data pages, although 64 KB pages proved best for data page signature calculus speed

• 256 B for index pages
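The trade-off above can be illustrated with a small, hedged sketch. It assumes a hypothetical 64 MB file and 500 single-byte updates at random offsets (neither figure comes from the slides), and counts how many pages become dirty and how many bytes a page-level backup would then write for several page sizes.

```python
import random

def dirty_pages(offsets, page_size):
    """Return the set of page numbers touched by the updated byte offsets."""
    return {off // page_size for off in offsets}

def backup_cost(offsets, page_size):
    """Return (individual page writes, bytes written) for a page-level backup."""
    n = len(dirty_pages(offsets, page_size))
    return n, n * page_size

FILE_SIZE = 64 * 1024 * 1024   # hypothetical file size, not from the slides
rng = random.Random(0)         # fixed seed so the run is repeatable
offsets = [rng.randrange(FILE_SIZE) for _ in range(500)]

for size in (4 * 1024, 16 * 1024, 64 * 1024):
    writes, nbytes = backup_cost(offsets, size)
    print(f"page size {size // 1024:>2} KB: {writes} writes, {nbytes // 1024} KB written")
```

For the same set of updates, a smaller page size can only keep or increase the number of page writes and keep or decrease the volume written, which is the tension the slide describes.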

9

Page Signature: Algebraic Signatures

• Galois field $GF(2^{16})$

• Log / antilog multiplication

• Page $P$ has 2-byte symbols $p_1, p_2, \dots, p_n$

• The signature formula is:

  • for each $p_i$, $i = 1, 2, \dots, n$: $p'_i = \operatorname{antilog} p_i$

  • for each $\beta = \alpha, \alpha^2, \alpha^3, \dots$:

$$\mathrm{Sign}_{\beta}(P) = \sum_{i=1}^{n} p'_i \, \beta^{i}$$

$$\mathrm{Sign}(P) = \left( \mathrm{Sign}_{\alpha}(P), \mathrm{Sign}_{\alpha^{2}}(P), \dots, \mathrm{Sign}_{\alpha^{m}}(P) \right)$$

• We put $m = 2$ in SDDS-2002
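A minimal sketch of such a signature follows, under stated assumptions rather than as the SDDS-2002 code. It assumes the primitive polynomial x^16 + x^12 + x^3 + x + 1 (0x1100B), takes α to be the element x (integer 2), reads symbols as big-endian 2-byte words, and, for simplicity, uses the page symbols p_i directly instead of the p'_i = antilog p_i variant on the slide. All function names are hypothetical.

```python
PRIM_POLY = 0x1100B        # assumed primitive polynomial for GF(2^16)
ORDER = (1 << 16) - 1      # number of nonzero field elements

def build_tables():
    """Build log/antilog tables for alpha = x (integer 2)."""
    antilog = [0] * ORDER
    log = [0] * (1 << 16)
    x = 1
    for e in range(ORDER):
        antilog[e] = x
        log[x] = e
        x <<= 1                 # multiply by alpha
        if x & 0x10000:
            x ^= PRIM_POLY      # reduce modulo the field polynomial
    return log, antilog

LOG, ANTILOG = build_tables()

def symbols(page: bytes):
    """Split a page into 2-byte symbols, zero-padding an odd trailing byte."""
    if len(page) % 2:
        page += b"\x00"
    return [int.from_bytes(page[i:i + 2], "big") for i in range(0, len(page), 2)]

def sign_component(syms, k):
    """Sum over i = 1..n of p_i * (alpha^k)^i; field addition is XOR."""
    sig = 0
    for i, p in enumerate(syms, start=1):
        if p:  # zero symbols contribute nothing
            sig ^= ANTILOG[(LOG[p] + k * i) % ORDER]
    return sig

def signature(page: bytes, m: int = 2):
    """m-symbol signature (Sign_alpha, Sign_alpha^2, ..., Sign_alpha^m)."""
    syms = symbols(page)
    return tuple(sign_component(syms, k) for k in range(1, m + 1))

page = bytes(range(256)) * 64          # one 16 KB page of sample data
changed = bytearray(page)
changed[100] ^= 1                      # flip one bit in one symbol
print(signature(page) != signature(bytes(changed)))
```

Each multiplication costs one table lookup for the logarithm, an integer addition, and one lookup for the antilogarithm, which is what the log / antilog bullet refers to.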

10

Experimental Performance Analysis: Hardware Configuration

• 1.8 GHz P4 servers

• 800 MHz P3 client

• 500 MHz P3 name server

• 1 Gb/s Ethernet

• Windows 2000 Server OS

11

Experimental Performance of SDDS-2002: Initial File Store Time (No Signature Calculus)

[Figure: store time (s), axis from 20 to 120, versus number of file servers (1 to 4). File size: 393 MB; 25,000 records.]

12

Initial File Store Time (Time Series)

[Figure: storage time (ms), axis from 0 to 140,000, versus number of records (100, 150, 1,000, 10,000, 25,000), with one series each for one, two, and three servers.]

13

File Load Time

[Figure: load time (s), axis from 20 to 120, versus number of servers (1 to 4). File size: 393 MB.]

Practically the same as the first backup time.

14

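File Storage Performance Analysis

Bucket size (MB) | Number of records | Signature calculus (ms) | Signature calculus per MB (ms) | Total store time (ms) | Store time for 0 % change (ms) | Gain (%) | Store time for 5 % change (ms) | Gain (%)
-----------------|-------------------|-------------------------|--------------------------------|-----------------------|--------------------------------|----------|--------------------------------|---------
1.88             | 100               | 46                      | 24.46                          | 562                   | 50                             | 91.1     | 65                             | 88.43
2.7              | 150               | 78                      | 28.8                           | 781                   | 82                             | 89.51    | 95                             | 87.83
17.6             | 1000              | 438                     | 24.88                          | 5078                  | 438                            | 91.38    | 453                            | 91.07
158              | 10000             | 4068                    | 25.74                          | 46406                 | 4071                           | 91.23    | 4085                           | 91.19
393              | 25000             | 11003                   | 27.9                           | 117859                | 11003                          | 91.33    | 11018                          | 90.65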
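The gain columns are not defined on the slide. A plausible reading, offered here as an assumption, is the percentage of the total store time saved by the signature-based store: 100 × (1 − store time / total store time). The sketch below recomputes it from the table's own figures; the function name `gain_percent` is hypothetical, and recomputed values may differ from the slide's in the last digits.

```python
def gain_percent(total_store_ms: float, incremental_store_ms: float) -> float:
    """Percentage of the total store time saved by the incremental store.

    Assumed formula, inferred from the table rather than stated in the slides.
    """
    return 100.0 * (1.0 - incremental_store_ms / total_store_ms)

# (bucket MB, total store ms, store ms at 0 % change, store ms at 5 % change)
rows = [
    (1.88, 562, 50, 65),
    (2.7, 781, 82, 95),
    (17.6, 5078, 438, 453),
    (158, 46406, 4071, 4085),
    (393, 117859, 11003, 11018),
]

for size_mb, total, t0, t5 in rows:
    print(f"{size_mb:>6} MB: gain at 0 % = {gain_percent(total, t0):.2f} %, "
          f"gain at 5 % = {gain_percent(total, t5):.2f} %")
```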

15

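SHA-1 / Algebraic Signatures

Bucket size (MB) | Number of records | Algebraic signature calculus (ms) | SHA-1 calculus (ms) | Initial store time with SHA-1 (ms) | Initial store time with alg. sign. (ms) | SHA-1 store time for 5 % change (ms) | Alg. sign. store time for 5 % change (ms) | Gain (%)
-----------------|-------------------|-----------------------------------|---------------------|------------------------------------|-----------------------------------------|--------------------------------------|-------------------------------------------|---------
1.88             | 100               | 46                                | 70                  | 602                                | 562                                     | 85                                   | 65                                        | 30
2.7              | 150               | 78                                | 103                 | 799                                | 781                                     | 119                                  | 95                                        | 25
17.6             | 1000              | 438                               | 680                 | 5278                               | 5078                                    | 697                                  | 453                                       | 53
158              | 10000             | 4068                              | 6088                | 47906                              | 46406                                   | 6102                                 | 4085                                      | 49
393              | 25000             | 11003                             | 15403               | 119342                             | 117859                                  | 15418                                | 11018                                     | 40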
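The slide does not say how this Gain column is computed. The figures appear consistent with the extra time SHA-1 needs relative to the algebraic signature for a 5 % change, that is 100 × (SHA-1 time − algebraic time) / algebraic time. The sketch below checks that reading against the table; the formula is an inference and `relative_gain` is a hypothetical name.

```python
def relative_gain(sha1_ms: float, algebraic_ms: float) -> float:
    """Extra time of the SHA-1 store relative to the algebraic store, in percent.

    Assumed formula, inferred from the table rather than stated in the slides.
    """
    return 100.0 * (sha1_ms - algebraic_ms) / algebraic_ms

# (bucket MB, SHA-1 store ms at 5 % change, algebraic store ms at 5 % change)
rows_5pct = [
    (1.88, 85, 65),
    (2.7, 119, 95),
    (17.6, 697, 453),
    (158, 6102, 4085),
    (393, 15418, 11018),
]

for size_mb, sha1_ms, alg_ms in rows_5pct:
    print(f"{size_mb:>6} MB: {relative_gain(sha1_ms, alg_ms):.1f} %")
```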

16

Algebraic / SHA-1 Signature Calculus Time

[Figure: signature calculus time, axis from 0 to 18,000, versus bucket size (MB), with one series for the algebraic signature and one for the cryptographic signature.]

17

Implementation in SDDS-2002: Interactive Client Interface

[Figure: user interface.]

18

Implementation in SDDS-2002: Execution Listing at the Server

• 1st request for storage: new file. Signature calculus (375 ms); disk write of all pages (4922 ms)

• 2nd request for storage: no changes found (375 ms)

• 3rd request for storage: 1 page changed (375 + 16 ms)
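The three requests in the listing above can be mimicked with a small sketch of a signature-driven store. It is an illustration under assumptions, not the SDDS-2002 server code: the disk and the signature map are plain dictionaries, the names `store`, `disk`, and `sig_map` are hypothetical, and SHA-1 from the standard library stands in for the page signature only because it is readily available. The scheme presented in these slides uses algebraic signatures instead.

```python
import hashlib

PAGE_SIZE = 16 * 1024  # data page size chosen in the slides

def page_signature(page: bytes) -> bytes:
    """Stand-in signature (SHA-1); replace with an algebraic signature."""
    return hashlib.sha1(page).digest()

def store(bucket: bytes, disk: dict, sig_map: dict, page_size: int = PAGE_SIZE) -> int:
    """Write to `disk` only the pages whose signature changed; return the count."""
    written = 0
    n_pages = (len(bucket) + page_size - 1) // page_size
    for no in range(n_pages):
        page = bucket[no * page_size:(no + 1) * page_size]
        sig = page_signature(page)
        if sig_map.get(no) != sig:      # new page or changed content
            disk[no] = page
            sig_map[no] = sig
            written += 1
    for no in [k for k in sig_map if k >= n_pages]:
        del sig_map[no]                 # bucket shrank: forget trailing pages
        disk.pop(no, None)
    return written

disk, sig_map = {}, {}
bucket = bytearray(bytes(range(256)) * 256)    # 64 KB of sample data = 4 pages
print(store(bytes(bucket), disk, sig_map))     # new file: every page is written
print(store(bytes(bucket), disk, sig_map))     # no changes: nothing is written
bucket[20000] ^= 0xFF                          # modify one byte in page 1
print(store(bytes(bucket), disk, sig_map))     # only the changed page is written
```

Every request pays the signature calculus over the whole bucket, while the disk write cost depends only on the number of changed pages, which matches the shape of the timings in the listing.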

19

Conclusion

• The algebraic signature based file backup works

• Present in the SDDS-2002 prototype

• Offers advantages over the traditional approach

  • No change to existing code

  • No run-time overhead

• Future work

  • Signatures: calculus, algebraic properties, applications…

  • Automatic SDDS file eviction

Thank You for Your Attention