Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam Overton

Highly Available: The Cassandra Distribution Model Sam Overton Cassandra Europe 2012

Upload: acunu

Post on 15-Jan-2015

DESCRIPTION

Sam Overton's talk from Cassandra Europe on March 28th 2012

TRANSCRIPT

Page 1: Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam Overton

Highly Available: The Cassandra Distribution Model

Sam Overton

Cassandra Europe 2012

Page 2:

Cassandra is:
● built for scalability
● built to tolerate failure

In this talk:
● Cassandra distribution overview
● Partitioning and placement
● Replication
● Consistency

Page 4:

Overview

● High availability
● Partition tolerant
● Tunable consistency
● Scalable
● Replication
● No single point of failure

Page 6:

Partitioning and placement

Should:
● Assign data to hosts
● Have no single point of failure (SPOF) for routing clients to data
● Balance load
● Allow scaling without moving too much data

Page 7:

Consistent Hashing

Page 8:

Consistent Hashing

[Diagram. Labels: (k1, v1), (k2, v2), (k3, v3)]

Page 9:

Consistent Hashing

● The partitioner maps each key to a token on the ring
● Hosts' tokens determine the placement of keys
● and the proportion of data assigned to each host
● Each row is stored on one host
● Wide rows can cause hot-spotting!
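
The placement rule in the bullets above can be sketched in a few lines of Python. This is an illustration, not Cassandra's code: the `Ring` class, the host names and the evenly spaced tokens are hypothetical, and MD5 reduced to a 2^127 token space is an assumed stand-in for the partitioner.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key):
    """Partitioner stand-in: hash the row key to a token on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    """Maps tokens to hosts; a host owns the range that ends at its token."""

    def __init__(self, host_tokens):
        # host_tokens: {host_name: token}
        self._entries = sorted((tok, host) for host, tok in host_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token):
        # First host token >= the key's token, wrapping around the ring.
        i = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._entries[i][1]

    def owner(self, key):
        return self.owner_of_token(token_for(key))


ring = Ring({"A": 0, "B": RING_SIZE // 4, "C": RING_SIZE // 2, "D": 3 * RING_SIZE // 4})
print(ring.owner("k1"))
```

Because the hosts' tokens are the only input to `owner_of_token`, moving a token changes both where keys land and how much of the ring that host owns.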

So how does it scale?

Page 11:

Consistent Hashing

Bootstrapping a new node

Page 12:

Consistent Hashing

Range is transferred from old host to new host
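
A small experiment can make the range transfer concrete. The sketch below is a toy model, not Cassandra's bootstrap code: host names, tokens and the `moved_keys` helper are hypothetical, and MD5 is an assumed stand-in for the partitioner. It compares key ownership before and after a new host E takes a token between B and C.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key):
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(token_to_host, token):
    """Host whose token is the first one >= token, wrapping around."""
    tokens = sorted(token_to_host)
    i = bisect.bisect_left(tokens, token) % len(tokens)
    return token_to_host[tokens[i]]


def moved_keys(before, after, keys):
    """Return {key: (old_host, new_host)} for keys whose owner changed."""
    moves = {}
    for key in keys:
        tok = token_for(key)
        old, new = owner(before, tok), owner(after, tok)
        if old != new:
            moves[key] = (old, new)
    return moves


before = {0: "A", RING_SIZE // 4: "B", RING_SIZE // 2: "C", 3 * RING_SIZE // 4: "D"}
after = dict(before)
after[3 * RING_SIZE // 8] = "E"  # new node bootstraps between B and C

keys = ["user:%d" % i for i in range(1000)]
moves = moved_keys(before, after, keys)
print(len(moves), "of", len(keys), "keys moved:", set(moves.values()))
```

Only keys in the range the new host takes over change owner, and they all come from a single old host; every other key stays where it was.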

Page 16:

Consistent Hashing

Decommission is the reverse process

Page 18:

Consistent Hashing

● Tokens can be assigned manually, automatically or randomly
● Every node has full knowledge of placement
● A client can connect to any node; the data is at most one hop away
● Node status is gossiped

Page 19:

Partitioners

● A partitioner converts a row key (from client data) into a token on the ring
● RandomPartitioner
● Order Preserving Partitioner

Page 20:

Partitioners

Random Partitioner
● token = hash(key)
● good load balancing
● no range queries across row keys

Page 21:

Partitioners

Order Preserving Partitioner
● token = key
● requires manual load balancing
● careful selection of tokens around the ring
● allows range queries across row keys
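
The difference between the two partitioners can be seen by sorting a handful of keys by token. This is a minimal sketch with made-up keys; `random_token` and `order_preserving_token` are hypothetical helper names, with MD5 assumed as the hash.

```python
import hashlib


def random_token(key):
    """token = hash(key): spreads keys evenly, discards key order."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def order_preserving_token(key):
    """token = key: neighbouring keys stay neighbours on the ring."""
    return key


keys = ["cherry", "apple", "damson", "banana"]

by_opp = sorted(keys, key=order_preserving_token)
by_random = sorted(keys, key=random_token)

print("order preserving:", by_opp)
print("random:          ", by_random)
```

With token = key, a range of row keys maps to one contiguous arc of the ring, which is what makes range queries possible; it also means skewed keys produce skewed load, hence the manual balancing.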

Page 22:

Partitioners

● Get it right first time!
● Design the data model for RandomPartitioner (RP)
● Custom partitioners are possible if necessary

Page 24:

Replication

● For availability
● For redundancy
● Can increase read bandwidth

Page 25:

Replication

● Replication Factor (RF) is the number of copies of the data
● Defined per keyspace
● Can be changed (e.g. if data becomes more or less valuable)
● Determines how many failures can be tolerated

Page 26:

Replication Strategy

● Determines how replicas are assigned to hosts
● Defined per keyspace (like RF)
● SimpleStrategy
● NetworkTopologyStrategy
● Custom strategies can be written

Page 27:

Replication Strategy : Simple Strategy

[Diagram, e.g. RF=3. Labels: (k1, v1), (k2, v2)]
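
SimpleStrategy is commonly described as placing the first replica on the host that owns the key's token and the remaining RF-1 replicas on the next hosts clockwise around the ring. A minimal sketch under that assumption follows; the tiny integer tokens, host names and the `simple_strategy_replicas` helper are hypothetical, and each host is assumed to hold exactly one token.

```python
import bisect


def simple_strategy_replicas(ring, token, rf):
    """ring: sorted list of (token, host). Returns rf hosts, walking clockwise."""
    tokens = [tok for tok, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    rf = min(rf, len(ring))  # cannot place more replicas than there are hosts
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]

print(simple_strategy_replicas(ring, 30, 3))  # ['C', 'D', 'A']
print(simple_strategy_replicas(ring, 80, 3))  # ['A', 'B', 'C']
```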

Page 28:

Replication Strategy : Network Topology Strategy

Page 29:

Replication Strategy : Network Topology Strategy

Multi-datacentre support

[Diagram: two datacentres, DC1 and DC2]
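
One way to picture multi-datacentre placement is a walk around the ring that keeps taking hosts until every datacentre has its own replica count. The sketch below is a simplification under stated assumptions: it ignores racks entirely, and the host names, the `dc_of` mapping and the `nts_replicas` helper are hypothetical.

```python
def nts_replicas(ring, dc_of, start_index, rf_per_dc):
    """Walk the ring clockwise from start_index, filling each DC's quota.

    ring: hosts in token order; dc_of: {host: dc}; rf_per_dc: {dc: replicas}.
    """
    chosen = []
    counts = {dc: 0 for dc in rf_per_dc}
    for step in range(len(ring)):
        host = ring[(start_index + step) % len(ring)]
        dc = dc_of[host]
        if counts.get(dc, 0) < rf_per_dc.get(dc, 0):
            chosen.append(host)
            counts[dc] += 1
    return chosen


ring = ["a1", "b1", "a2", "b2", "a3", "b3"]
dc_of = {h: ("DC1" if h.startswith("a") else "DC2") for h in ring}

print(nts_replicas(ring, dc_of, 1, {"DC1": 2, "DC2": 1}))  # ['b1', 'a2', 'a3']
```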

Page 31:

Snitches

● A snitch enables routing of requests according to node proximity
● Used by the replication strategy to determine rack and DC membership
● Custom snitches can be written

Page 32:

Simple Snitch

● Every host is in the same rack & DC with equal proximity

RackInferringSnitch

● Infers the rack & DC from the IP address of the host
● e.g. 123.8.2.100: second octet (8) = DC, third octet (2) = rack, fourth octet (100) = host
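
RackInferringSnitch is usually documented as reading the datacentre from the second octet of a host's IPv4 address and the rack from the third. A minimal sketch of that rule, using the address from the slide; `infer_location` is a hypothetical helper, not a Cassandra API.

```python
def infer_location(ip):
    """Split a dotted IPv4 address into DC, rack and host parts."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address, got %r" % ip)
    return {"dc": octets[1], "rack": octets[2], "host": octets[3]}


print(infer_location("123.8.2.100"))
```

Two hosts then count as being in the same rack when their first three octets agree, which only works if the network was numbered with that in mind.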

Page 33:

EC2Snitch

● DC = EC2 region
● Rack = EC2 availability zone

Property file snitch

● Rack and DC membership read from configuration file

Page 34:

DynamicSnitch

● Wraps each of the other snitches
● Records latency stats from read operations
● Avoids routing to slow hosts
● Configurable update intervals

Page 36:

Consistency

● Replication and failures/partitions cause inconsistency
● Old versions of data can be returned

Timestamps:
● Chosen by the client
● Can be used to avoid read-modify-write
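
Client-chosen timestamps let replicas' answers be reconciled without reading first: the version with the highest timestamp wins. A minimal sketch of that rule; `reconcile` is a hypothetical helper, the timestamps are made up, and tie-breaking between equal timestamps is ignored here.

```python
def reconcile(versions):
    """versions: iterable of (timestamp, value). The newest timestamp wins."""
    return max(versions, key=lambda version: version[0])


# One replica has seen the newer write, two still hold the old version.
replica_responses = [(1000, "v1"), (1005, "v2"), (1000, "v1")]

print(reconcile(replica_responses))  # (1005, 'v2')
```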

Page 37:

Consistency

● Cassandra allows a trade-off between partition-tolerance and consistency
● For strong consistency: R + W > N (R = replicas read from, W = replicas written to, N = replication factor)
● E.g. with 5 replicas (RF = N = 5): write to 3, read from 3

[Diagram: the five replicas]

Page 38:

Consistency

[Diagram: write. Three of the five replica labels change; two are unchanged]

Page 39:

Consistency

[Diagram: read. Replica labels as on the previous page]
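
The R + W > N rule works because any set of W written replicas and any set of R read replicas must then share at least one member, so every read sees the latest write. The brute-force check below illustrates this for small N; `always_overlap` is a hypothetical helper written for this sketch.

```python
from itertools import combinations


def always_overlap(n, w, r):
    """True if every W-sized write set intersects every R-sized read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(always_overlap(5, 3, 3))  # True:  3 + 3 > 5
print(always_overlap(5, 2, 3))  # False: 2 + 3 = 5, a read can miss the write
```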

Page 40:

Consistency Level

● ANY (only for writes)
● ONE, TWO, THREE
● QUORUM (N/2 + 1, rounded down)
● LOCAL_QUORUM
● ALL

● Relax strong consistency for partition tolerance
● To tolerate 1 node failure with strong consistency, use RF=3 with CL=QUORUM
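
The quorum arithmetic behind the RF=3 recommendation can be worked through directly. In this sketch `quorum` and `failures_tolerated` are hypothetical helper names; the formula is the N/2 + 1 from the slide, using integer division.

```python
def quorum(rf):
    """Replicas that must respond at CL=QUORUM."""
    return rf // 2 + 1


def failures_tolerated(rf):
    """Replicas that may be down while QUORUM reads and writes still succeed."""
    return rf - quorum(rf)


for rf in (1, 2, 3, 5):
    print("RF=%d quorum=%d tolerates=%d" % (rf, quorum(rf), failures_tolerated(rf)))
```

RF=2 gives a quorum of 2 and so tolerates no failures, which is why RF=3 is the smallest setting that survives one node loss at QUORUM.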

Page 41:

Increasing Consistency

● Read repair
● Hinted hand-off
● Anti-entropy repair

Page 42:

Read Repair

Page 46:

Hinted Hand-off

[Diagram, e.g. RF=2. Labels: (k1, v1), (k1, v1)]

Page 47:

Hinted Hand-off

[Diagram, e.g. RF=2. Labels: (k1, v1), (k1, v1), Write (k1, v2)]

Page 50:

Hinted Hand-off

[Diagram, e.g. RF=2. Labels: (k1, v1), (k1, v1), Write (k1, v2), (k1, v2)]

Page 51:

Hinted Hand-off

[Diagram, e.g. RF=2. Labels: (k1, v2), (k1, v1), (k1, v2)]

Page 52:

Hinted Hand-off

[Diagram, e.g. RF=2. Labels: (k1, v2), (k1, v2), (k1, v2)]

Page 54:

Hinted Hand-off

● Hinted writes do not count towards the chosen consistency level
● ... except with CL=ANY, which succeeds even if all replicas are down
● Don't rely on hints: hints cannot be read!
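
A toy model may help show why hints cannot be relied on for reads. This is a sketch under stated assumptions, not Cassandra's implementation: the `Coordinator` class folds the coordinator and replica state into one object, and all names are hypothetical.

```python
class Coordinator:
    """Toy coordinator: writes to live replicas, keeps hints for dead ones."""

    def __init__(self, replicas):
        self.data = {name: {} for name in replicas}
        self.up = {name: True for name in replicas}
        self.hints = []  # (target replica, key, value)

    def write(self, key, value):
        """Return the number of real replica acks; hints are not counted."""
        acks = 0
        for name in self.data:
            if self.up[name]:
                self.data[name][key] = value
                acks += 1
            else:
                self.hints.append((name, key, value))
        return acks

    def read(self, name, key):
        # Reads only ever see replica data, never the hint store.
        return self.data[name].get(key) if self.up[name] else None

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up."""
        remaining = []
        for name, key, value in self.hints:
            if self.up[name]:
                self.data[name][key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining


c = Coordinator(["r1", "r2"])
c.write("k1", "v1")
c.up["r2"] = False
acks = c.write("k1", "v2")      # only r1 acknowledges
c.up["r2"] = True
stale = c.read("r2", "k1")      # still v1 until the hint is replayed
c.replay_hints()
fresh = c.read("r2", "k1")
print(acks, stale, fresh)       # 1 v1 v2
```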

Page 55:

Anti-entropy repair

● Manual maintenance process
● Compares all data stored on a host with the replicas
● Differences are streamed to restore consistency
● Must be run at least every 10 days (the default gc_grace_seconds) to ensure tombstones are replicated before they are purged
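
The compare-then-stream idea can be sketched with two in-memory replicas. This is a deliberate simplification: Cassandra is generally described as comparing Merkle trees per token range, whereas this sketch uses one digest for the whole data set, and the `digest` and `repair` helpers are hypothetical.

```python
import hashlib


def digest(rows):
    """Hash of all rows in key order; equal digests mean equal data."""
    h = hashlib.sha256()
    for key in sorted(rows):
        h.update(repr((key, rows[key])).encode("utf-8"))
    return h.hexdigest()


def repair(local, remote):
    """rows: {key: (timestamp, value)}. Returns the number of rows streamed."""
    if digest(local) == digest(remote):
        return 0  # nothing to do
    streamed = 0
    for key in set(local) | set(remote):
        candidates = [rows[key] for rows in (local, remote) if key in rows]
        newest = max(candidates, key=lambda version: version[0])
        for rows in (local, remote):
            if rows.get(key) != newest:
                rows[key] = newest
                streamed += 1
    return streamed


a = {"k1": (1, "v1"), "k2": (5, "tombstone")}
b = {"k1": (2, "v2")}
first = repair(a, b)
second = repair(a, b)
print(first, second, a == b)  # 2 0 True
```

The tombstone row in the example is the case the 10-day rule protects: a delete that reached only one replica has to be copied to the others before it is discarded.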

Page 56:

Cassandra is:
● built for scalability
● built to tolerate failure

In this talk:
● Cassandra distribution overview
● Partitioning and placement
● Replication
● Consistency

fin.