CS782 Presentation (Group 7)
TRANSCRIPT
Consistent Hashing and the Dynamo Model
Ai Ren, Yina Du, and Mingliang Sun (Group 7)
Outline
Motivation & Objective
Key Ideas in Dynamo
Simulation Method & Result
Conclusion
Motivation
It is all about $! Massive-scale data across hundreds of nodes
Commodity hardware infrastructure
Failure is the norm, not the exception
Motivation - Availability
'Always-on' experience for end users
How to handle failures transparently?
Parity checking or replication?
Strongly consistent or eventually consistent?
Conflict resolution: who and when?
Motivation - Scalability
$ matters!
Poor performance means losing customers and money
Increase capacity easily and incrementally
Over-provisioning means unnecessary cost
Decrease capacity easily and incrementally
Objective
The service is always available to customers, with a guaranteed response time no matter what, and this is achieved with as little $ as possible
Key Ideas
A fully decentralized DHT (Distributed Hash Table)
Consistent hashing
Natural partitioning and load balancing (division of labor)
Minimal data migration when a node joins or leaves
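The partitioning and migration properties above can be illustrated with a minimal consistent-hash ring in Python (all names here are hypothetical, not Dynamo's actual code): keys and nodes hash onto the same circular space, each key is owned by the first node clockwise from it, and when a node joins, only the keys that now fall to the new node change owner.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    """Map a string to a point on the ring (0 .. 2^32 - 1)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []    # sorted list of (point, node) pairs
        self._points = []  # sorted points only, for bisect

    def add_node(self, node: str):
        # Each physical node owns many virtual points, which evens out the load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            idx = bisect.bisect(self._points, point)
            self._points.insert(idx, point)
            self._ring.insert(idx, (point, node))

    def remove_node(self, node: str):
        kept = [(p, n) for p, n in self._ring if n != node]
        self._ring = kept
        self._points = [p for p, _ in kept]

    def get_node(self, key: str) -> str:
        """Walk clockwise from the key's point to the first node."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

When a fourth node joins a three-node ring, only the keys whose clockwise successor is now the new node migrate; every other key keeps its owner, which is the "minimal data migration" property on the slide.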
Replication for fault tolerance
Quorum techniques: R + W > N
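The quorum condition R + W > N works because any read set of R replicas must overlap any write set of W replicas, so every read quorum contains at least one replica that saw the latest write. A brute-force check of this overlap property (an illustrative sketch, not Dynamo code):

```python
from itertools import combinations

def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True iff every read set of size r intersects every write set of size w
    over n replicas."""
    replicas = range(n)
    return all(set(rs) & set(ws)
               for rs in combinations(replicas, r)
               for ws in combinations(replicas, w))

assert quorums_overlap(3, 2, 2)      # R=2, W=2, N=3: R+W > N, always overlaps
assert not quorums_overlap(3, 1, 2)  # R=1, W=2, N=3: R+W = N, stale reads possible
```

Both settings used later in the simulation satisfy the condition: (R=2, W=2, N=3) and (R=1, W=3, N=3).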
Eventual (weak) consistency model
Conflict resolution
By the application, not Dynamo
At read time, not write time
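This read-time, application-driven resolution can be sketched as follows (hypothetical names; the real client API differs): a read may return several causally conflicting versions, and the application, which knows its own semantics, merges them. Amazon's shopping cart, for example, unions the item sets so that no added item is ever lost.

```python
def read_and_resolve(versions, merge):
    """The store hands back all causally-conflicting versions on read;
    the application supplies the merge policy (illustrative sketch)."""
    if len(versions) == 1:
        return versions[0]          # no conflict, nothing to resolve
    resolved = versions[0]
    for v in versions[1:]:
        resolved = merge(resolved, v)
    return resolved

# Shopping-cart policy: union the items, so no add is lost.
cart_merge = lambda a, b: sorted(set(a) | set(b))
conflicting = [["milk", "eggs"], ["milk", "bread"]]
print(read_and_resolve(conflicting, cart_merge))  # ['bread', 'eggs', 'milk']
```

The store stays simple and always writable; the cost is that each application must decide what "merge" means for its own data.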
Simulation - Overview
Performance test tool for concurrent requests
Dynamo applications
Gather and record results
A ring of services as Dynamo nodes
Replication and fault tolerance
A proxy sits between the PT tool and the ring
A simple service interface
Request randomization
Membership discovery
Simulation - Availability
When a node leaves, the coordinating node uses the next available node on the ring
With node replacement, right after a node leaves the ring (fails), a new node will join the ring, keeping the number of nodes unchanged
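This failover rule, skipping the failed node and taking the next available node clockwise, can be sketched as a walk over the ring (hypothetical helper, not the simulator's actual code):

```python
import bisect
import hashlib

def ring_point(key: str) -> int:
    """Map a string to a point on the ring (0 .. 2^32 - 1)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

def preference_list(ring, key, n, down=frozenset()):
    """ring: sorted list of (point, node) pairs. Walk clockwise from the
    key's position, skipping failed nodes, until n healthy replicas are found."""
    points = [p for p, _ in ring]
    idx = bisect.bisect(points, ring_point(key))
    chosen = []
    for step in range(len(ring)):
        node = ring[(idx + step) % len(ring)][1]
        if node not in down and node not in chosen:
            chosen.append(node)
            if len(chosen) == n:
                break
    return chosen
```

If the first replica in the list fails, the remaining replicas shift up and the next healthy node on the ring is drafted in, so reads and writes keep succeeding as long as enough nodes survive.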
System load increases gradually (from 100 to 200 requests / second)
4 simulation cases
W=2, N=3 (R=2)
With node replacement (15 nodes)
Without node replacement (15 → 10 nodes)
W=3, N=3 (R=1)
With node replacement (15 nodes)
Without node replacement (15 → 10 nodes)
Simulation - Availability
No failed requests were recorded in any case; the service remains available when a node leaves (and joins)
With replacement nodes, the service level (throughput) is maintained
A W=2 setting gives better performance, while a W=3 setting provides better fault tolerance
Simulation - Scalability
Scalability: more nodes → larger capacity
Incremental & dynamic scalability: no service interruption
System load increases gradually (from 100 to 200 requests / second)
6 simulation cases
W=2, N=3 (R=2)
10 nodes
From 10 to 15 nodes
15 nodes
W=3, N=3 (R=1)
10 nodes
From 10 to 15 nodes
15 nodes
Simulation - Scalability
A ring with more nodes provides greater capacity (throughput) than a ring with fewer nodes
Moreover, capacity (throughput) increased incrementally (dynamically) as more nodes joined the ring, without service interruption
The higher the W setting, the better the fault tolerance, but the worse the write performance
Conclusion
With consistent hashing, the Dynamo model provides great scalability and availability
Massive-scale data storage on a large cluster of commodity hardware is possible
A real application: the shopping cart on www.amazon.com