Transcript
Page 1: Towards Weakly Consistent Local Storage Systems (jyshin/talks/socc16-poster-shin.pdf)

Ji-Yong Shin1,2, Mahesh Balakrishnan2, Tudor Marian3, Jakub Szefer2 and Hakim Weatherspoon1

1 Cornell University, 2 Yale University, 3 Google

StaleStore

Performance: Accessing Blocks and K-V Pairs

•  Primary/Backup setting
•  Primary performs the worst due to network delays (100ms)
•  Yogurt performs better than local latest by using the trade-off

Server Trends

•  Modern servers are as powerful as distributed systems in the past
   ✓ CPU and storage devices are parallel, similar to distributed nodes
•  Goal is to trade off consistency and performance in a local store
   ✓ Use of stale data in different storage devices for better performance

GetCost Overhead

Yogurt: A Block Level StaleStore

Summary

•  Modern servers are similar to distributed systems

•  Local storage systems can adopt weak consistency
   ✓ We define them as StaleStores

•  Yogurt, a block level StaleStore
   ✓ Effectively trades off consistency and performance
   ✓ Supports high level multi-block data constructs

Year                2006                  2016                     Comparisons
Model (4U)          Dell PowerEdge 6850   Dell PowerEdge R930
CPU [# of cores]    4 × 2 core Xeon [8]   4 × 24 core Xeon [96]    12X
Memory              64GB                  6TB                      96X
Network bandwidth   2 × 1GigE             2 × 1GigE, 2 × 10GigE    11X
Storage             8 × SCSI/SAS HDD      24 × SAS HDD/SSD,        # of devices: 4.2X
                                          10 × PCIe SSD            Capacity: 175.3X
                                                                   Use of SSDs
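Most of the ratios in the Comparisons column can be re-derived from the table entries. The short Python check below is only an illustration: the variable names are ours, and 1 TB is assumed to be 1024 GB. The capacity ratio (175.3X) depends on per-device capacities that the table does not list, so it is not recomputed here.

```python
# Re-derive the "Comparisons" ratios from the 2006 and 2016 table entries.
cores_2006 = 4 * 2            # 4 x 2 core Xeon
cores_2016 = 4 * 24           # 4 x 24 core Xeon
mem_gb_2006 = 64
mem_gb_2016 = 6 * 1024        # assumption: 1 TB = 1024 GB
net_gbps_2006 = 2 * 1         # 2 x 1GigE
net_gbps_2016 = 2 * 1 + 2 * 10  # 2 x 1GigE plus 2 x 10GigE
devices_2006 = 8              # 8 x SCSI/SAS HDD
devices_2016 = 24 + 10        # 24 x SAS HDD/SSD plus 10 x PCIe SSD

ratios = {
    "CPU cores": cores_2016 / cores_2006,
    "Memory": mem_gb_2016 / mem_gb_2006,
    "Network bandwidth": net_gbps_2016 / net_gbps_2006,
    "# of devices": devices_2016 / devices_2006,
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}X")
```

The device ratio works out to 4.25, which the poster reports as 4.2X.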

Distributed vs Modern Server

Distributed Systems
•  Different versions of data exist in different servers due to network delays during replication
•  Older versions are faster to access when the network overhead is low

Modern Servers
•  Different versions of data exist in different storage media due to logging, caching, copy-on-write, deduplication, etc.
•  Older versions are faster to access when they are on faster storage media

Reasons for different access speeds
   ✓ RAM, SSD, HDD, hybrid-drives, etc.
   ✓ Disk with arm contention or SSD under garbage collection
   ✓ RAID under degraded mode

•  Local storage systems in any form that can trade off consistency and performance (e.g. KV-store, file system, block store, DB, etc.)

Requirements:
1.  Maintain multiple versions of data
    - Should have interface to access older versions
2.  Aware of consistency semantics
    - Bounded staleness, monotonic-reads, read-my-writes, etc.
3.  Can give cost estimates for accessing each version
    - Considerations for data locations and storage conditions
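To make the second requirement concrete, the sketch below shows one way the named consistency semantics could translate into the oldest version a read is allowed to return. It is a minimal illustration, not the poster's implementation: the function name, its parameters, and the convention that versions are integers starting at 0 are all assumptions, and the semantics follow their usual textbook definitions.

```python
from typing import Optional


def oldest_acceptable_version(
    semantics: str,
    latest: int,
    last_read: Optional[int] = None,
    last_written: Optional[int] = None,
    staleness_bound: int = 0,
) -> int:
    """Return the oldest version a read may return (hypothetical helper)."""
    if semantics == "bounded-staleness":
        # At most `staleness_bound` versions behind the latest one.
        return max(0, latest - staleness_bound)
    if semantics == "monotonic-reads":
        # Never older than what this client has already read.
        return last_read if last_read is not None else 0
    if semantics == "read-my-writes":
        # Never older than this client's own most recent write.
        return last_written if last_written is not None else 0
    raise ValueError(f"unknown semantics: {semantics}")


# Any version between the returned bound and `latest` is acceptable.
print(oldest_acceptable_version("bounded-staleness", latest=6, staleness_bound=3))
print(oldest_acceptable_version("monotonic-reads", latest=6, last_read=3))
print(oldest_acceptable_version("read-my-writes", latest=6, last_written=5))
```

The wider the gap between this bound and the latest version, the more candidate versions a StaleStore has to choose from when looking for a cheap one.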

1.  Issue GetCost() for block 1 between versions 3 and 6 (N queries with uniform distance)
2.  Read the cheapest: e.g. 1(5): Read(1,5)
3.  Record the selected version for block 1

[Figure: Cache and Log, with block(version) entries 3(3), 1(4), 2(4), 1(5), 3(5), 1(6)]
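The three read steps for block 1 can be sketched as follows. This is an illustration under stated assumptions rather than Yogurt's code: `candidate_versions` and `monotonic_read` are hypothetical names, the cost and read functions are passed in as plain callables standing in for GetCost(blk,ver) and Read(blk,ver), and ties in cost are broken in favor of the newer version, a choice the poster does not specify.

```python
from typing import Callable, Dict, List, Tuple


def candidate_versions(lo: int, hi: int, n: int) -> List[int]:
    """Pick up to n versions in [lo, hi], evenly spaced, including both ends."""
    if n <= 1 or hi <= lo:
        return [hi]
    picks = {lo + round(i * (hi - lo) / (n - 1)) for i in range(n)}
    return sorted(picks)


def monotonic_read(
    blk: int,
    latest: int,
    last_read: Dict[int, int],
    get_cost: Callable[[int, int], float],
    read: Callable[[int, int], str],
    n: int = 4,
) -> Tuple[int, str]:
    """Read blk at the cheapest version that is not older than the last one read."""
    lo = last_read.get(blk, 0)
    # Step 1: query the cost of n uniformly spaced versions.
    versions = candidate_versions(lo, latest, n)
    # Step 2: read the cheapest one (newer version wins a tie; our assumption).
    best = min(versions, key=lambda v: (get_cost(blk, v), -v))
    data = read(blk, best)
    # Step 3: record the selected version so later reads never go backwards.
    last_read[blk] = best
    return best, data


# Toy setup mirroring the example: version 5 of block 1 is the cheapest.
costs = {3: 10.0, 4: 10.0, 5: 1.0, 6: 10.0}
last_read = {1: 3}
version, data = monotonic_read(
    1, 6, last_read,
    get_cost=lambda b, v: costs[v],
    read=lambda b, v: f"blk{b}@v{v}",
)
print(version, data, last_read)
```

After the call, the recorded version for block 1 is 5, so a later monotonic read of block 1 would only consider versions 5 and 6.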

I/O                         Write(blk,data,ver), Read(blk,ver)   Versioned writes to snapshots; versioned reads from snapshots
Cost                        GetCost(blk,ver)                     cache << disk, # of queued I/O (read << write)
Multi-block object access   GetVersionRange(blk,ver)             Returns the version range for which a block is valid
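The four calls can be mocked in memory to show how they relate to one another. The class below is a toy stand-in, not Yogurt itself: the class name, the snake_case method names, the `mark_cached` helper, and the two fixed cost values are assumptions for illustration, and the cost model covers only the cache versus disk distinction (queued I/O is left out).

```python
import bisect
from typing import Dict, List, Set, Tuple


class ToyStaleStore:
    """In-memory stand-in for a versioned block store (illustration only)."""

    def __init__(self, cache_cost: float = 1.0, disk_cost: float = 100.0) -> None:
        self.cache_cost = cache_cost
        self.disk_cost = disk_cost
        self.latest = 0  # newest snapshot version written so far
        self._versions: Dict[int, List[int]] = {}      # blk -> sorted write versions
        self._data: Dict[Tuple[int, int], bytes] = {}  # (blk, version) -> data
        self._cached: Set[Tuple[int, int]] = set()     # entries assumed to be in cache

    def _resolve(self, blk: int, ver: int) -> Tuple[int, List[int]]:
        """Locate the newest write to blk at or before snapshot ver."""
        versions = self._versions.get(blk, [])
        i = bisect.bisect_right(versions, ver)
        if i == 0:
            raise KeyError(f"block {blk} has no data at snapshot {ver}")
        return i - 1, versions

    def write(self, blk: int, data: bytes, ver: int) -> None:
        """Write(blk,data,ver): versioned write to a snapshot."""
        versions = self._versions.setdefault(blk, [])
        if ver not in versions:
            bisect.insort(versions, ver)
        self._data[(blk, ver)] = data
        self.latest = max(self.latest, ver)

    def read(self, blk: int, ver: int) -> bytes:
        """Read(blk,ver): versioned read from a snapshot."""
        i, versions = self._resolve(blk, ver)
        return self._data[(blk, versions[i])]

    def mark_cached(self, blk: int, ver: int) -> None:
        """Hypothetical helper: pretend this block version sits in the cache."""
        i, versions = self._resolve(blk, ver)
        self._cached.add((blk, versions[i]))

    def get_cost(self, blk: int, ver: int) -> float:
        """GetCost(blk,ver): cache is much cheaper than disk."""
        i, versions = self._resolve(blk, ver)
        return self.cache_cost if (blk, versions[i]) in self._cached else self.disk_cost

    def get_version_range(self, blk: int, ver: int) -> Tuple[int, int]:
        """GetVersionRange(blk,ver): snapshots for which this block content is valid."""
        i, versions = self._resolve(blk, ver)
        hi = versions[i + 1] - 1 if i + 1 < len(versions) else self.latest
        return versions[i], hi


store = ToyStaleStore()
for blk, ver in [(3, 3), (1, 4), (2, 4), (1, 5), (3, 5), (1, 6)]:
    store.write(blk, f"blk{blk}-v{ver}".encode(), ver)
store.mark_cached(1, 5)
print(store.read(3, 4), store.get_cost(1, 5), store.get_version_range(3, 4))
```

The writes replay the block(version) pairs 3(3), 1(4), 2(4), 1(5), 3(5), 1(6); reading block 3 at snapshot 4 returns the data written at version 3, which stays valid through snapshot 4.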

Reading block 1 (monotonic-reads)

Multi-Block Object Access in Yogurt

•  Key-value stores and file systems can store an object over multiple blocks
•  Read should be served from a persistent snapshot: GetVersionRange()
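One way to use version ranges for a multi-block object is to intersect the ranges of the block versions chosen so far: if the intersection is non-empty, all of them were valid in a single snapshot. The sketch below assumes that reading of GetVersionRange(); the function name `common_snapshot_range` and the dictionary-backed stub are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

VersionRange = Tuple[int, int]


def common_snapshot_range(
    chosen: Dict[int, int],
    get_version_range: Callable[[int, int], VersionRange],
) -> Optional[VersionRange]:
    """Intersect the validity ranges of the chosen block versions.

    `chosen` maps block address -> version selected for that block.
    Returns the snapshot range shared by all blocks, or None if they
    never coexisted in one snapshot.
    """
    lo: Optional[int] = None
    hi: Optional[int] = None
    for blk, ver in chosen.items():
        b_lo, b_hi = get_version_range(blk, ver)
        lo = b_lo if lo is None else max(lo, b_lo)
        hi = b_hi if hi is None else min(hi, b_hi)
    if lo is None or hi is None or lo > hi:
        return None
    return lo, hi


# Stub standing in for GetVersionRange(blk, ver); the ranges are made up.
ranges: Dict[Tuple[int, int], VersionRange] = {
    (1, 4): (4, 4),
    (1, 5): (5, 5),
    (2, 4): (4, 6),
    (3, 5): (5, 6),
}


def lookup(blk: int, ver: int) -> VersionRange:
    return ranges[(blk, ver)]


consistent = common_snapshot_range({1: 5, 2: 4, 3: 5}, lookup)
inconsistent = common_snapshot_range({1: 4, 3: 5}, lookup)
print(consistent, inconsistent)
```

In the first call the three blocks share snapshot 5, so the object can be assembled from them; in the second call block 1 at version 4 and block 3 at version 5 never coexist, so a different version would have to be chosen for one of them.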

[Figure: blocks stored across a hard disk drive and solid state disks. Axes: Block Addr (0 to 2) and Timestamp (Snapshot #) (0 to 3)]

[Figure: Average Read Latency (us), 0 to 200000, for Primary, Local latest, Yogurt MR, and Yogurt RMW. X-axes: # of Stale Versions @ start time (1 to 8) and Key-Value Pair Size (4KB to 20KB)]

[Figure: Average Latency (us), 0 to 7, vs. GetCost Query Size (# of queries): 32B (3), 64B (7), 128B (15), 256B (31), 512B (63), 1024B (127)]

•  Cost querying overhead is negligible compared to disk and SSD access latencies

Other Possible StaleStores
•  Single disk log-structured store
•  SSD flash translation layers
•  Log-structured arrays
•  Durable write caches that are fast for writes but slow for reads
•  Deduplicated systems with read caches
•  Fine-grained logging over a block-grained cache
•  Systems storing differences from previous versions
