Cassandra - Deep Dive

Cassandra: A Decentralized Structured Storage System, by Sameera Nelson

Upload: sameiralk

Post on 15-Jan-2015



DESCRIPTION

Presentation of internal architecture and features of Cassandra based on the version 1.2

TRANSCRIPT

Page 1: Cassandra - Deep Dive

Cassandra: A Decentralized Structured Storage System

By Sameera Nelson

Page 2: Cassandra - Deep Dive

Outline …

Introduction

Data Model

System Architecture

Failure Detection & Recovery

Local Persistence

Performance

Statistics

Page 3: Cassandra - Deep Dive

What is Cassandra?

Distributed Storage System

Manages Structured Data

Highly available, no single point of failure (SPoF)

Not a Relational Data Model

Handles high write throughput

◦ Without sacrificing read efficiency

Page 4: Cassandra - Deep Dive

Motivation

Operational Requirements at Facebook

◦ Performance

◦ Reliability/ Dealing with Failures

◦ Efficiency

◦ Continuous Growth

Application

◦ Inbox Search Problem, Facebook

Page 5: Cassandra - Deep Dive

Similar Work

Google File System

◦ Distributed FS, Single Master/Slave

Ficus/ Coda

◦ Distributed FS

Farsite

◦ Distributed FS, No centralized server

Bayou

◦ Distributed Relational DB System

Dynamo

◦ Distributed Storage system

Page 6: Cassandra - Deep Dive

Data Model

Page 7: Cassandra - Deep Dive

Data Model

Figure from Eben Hewitt’s slides.

Page 8: Cassandra - Deep Dive

Supported Operations

insert(table, key, rowMutation)

get(table, key, columnName)

delete(table, key, columnName)

Page 9: Cassandra - Deep Dive

Query Language

CREATE TABLE users

( user_id int PRIMARY KEY,

fname text,

lname text );

INSERT INTO users

(user_id, fname, lname) VALUES (1745, 'john', 'smith');

SELECT * FROM users;

Page 10: Cassandra - Deep Dive

Data Structure

Log-Structured Merge Tree

Page 11: Cassandra - Deep Dive

System Architecture

Page 12: Cassandra - Deep Dive

Architecture

Page 13: Cassandra - Deep Dive

Fully Distributed … No Single Point of Failure

Page 14: Cassandra - Deep Dive

Cassandra Architecture

Partitioning

◦ Data distribution across nodes

Replication

◦ Data duplication across nodes

Cluster Membership

◦ Node management in cluster (adding/ deleting)

Page 15: Cassandra - Deep Dive

Partitioning

The Token Ring

Page 16: Cassandra - Deep Dive

Partitioning

Partitions using consistent hashing

Page 17: Cassandra - Deep Dive

Partitioning

Assignment into the relevant partition
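The two partitioning slides above can be illustrated with a minimal sketch of a token ring. It assumes a node with token T owns every key whose token falls between the previous node's token (exclusive) and T (inclusive). The names `TokenRing`, `token_for`, `owner_of_token` and the 32-bit `RING_SIZE` are hypothetical and chosen for illustration; they are not Cassandra's partitioner API.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # illustrative token space, not Cassandra's real token range


def token_for(key: str) -> int:
    """Hash a row key onto the ring (MD5 is used here purely for illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class TokenRing:
    def __init__(self, node_tokens):
        # Sort nodes by token so the ring can be walked clockwise.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner_of_token(self, token: int) -> str:
        """First node whose token is >= the given token, wrapping past the end."""
        idx = bisect.bisect_left(self._tokens, token)
        return self._ring[idx % len(self._ring)][1]

    def owner(self, key: str) -> str:
        return self.owner_of_token(token_for(key))


ring = TokenRing({
    "node-a": 0,
    "node-b": RING_SIZE // 4,
    "node-c": RING_SIZE // 2,
    "node-d": 3 * RING_SIZE // 4,
})
print(ring.owner("user:1745"))
```

Because only the token order matters, adding or removing one node moves only the keys in the neighbouring range, which is the property that makes consistent hashing attractive here.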

Page 18: Cassandra - Deep Dive

Partitioning, Vnodes

Page 19: Cassandra - Deep Dive

Replication

Based on configured replication factor
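As a rough sketch of replica placement driven by a replication factor, the function below picks the node that owns a token and then walks the ring clockwise until enough distinct nodes are collected. It ignores racks and data centers, and the name `replicas_for` and the tiny integer tokens are assumptions made for illustration.

```python
import bisect


def replicas_for(token, ring, replication_factor):
    """Return the nodes holding a token.

    ring is a list of (token, node) pairs sorted by token.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    replicas = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]  # walk clockwise, wrapping around
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
print(replicas_for(60, ring, 3))  # ['node-d', 'node-a', 'node-b']
```

If the replication factor exceeds the number of nodes, this sketch simply returns every node once.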

Page 20: Cassandra - Deep Dive

Replication

Different Replication Policies

◦ Rack Unaware

◦ Rack Aware

◦ Data Center Aware

Page 21: Cassandra - Deep Dive

Cluster Membership

Based on Scuttlebutt

Efficient gossip-based mechanism

Inspired by real-life rumor spreading.

Anti-entropy protocol

◦ Repair replicated data by comparing & reconciling differences
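The rumor-spreading idea can be sketched as a toy simulation: every node keeps a versioned view of the cluster, and in each round it exchanges views with one random peer, keeping the newer entry per endpoint. This is a simplified illustration under assumed names (`merge`, `gossip_round`, the `(version, status)` tuples), not the Scuttlebutt protocol or Cassandra's Gossiper.

```python
import random


def merge(local, remote):
    """Keep, per endpoint, whichever entry carries the higher version number."""
    for endpoint, (version, status) in remote.items():
        if endpoint not in local or local[endpoint][0] < version:
            local[endpoint] = (version, status)


def gossip_round(views, rng):
    """Every node exchanges state with one randomly chosen peer (push and pull)."""
    nodes = list(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        merge(views[peer], views[node])
        merge(views[node], views[peer])


# Each node starts out knowing only about itself.
views = {n: {n: (1, "NORMAL")} for n in ["a", "b", "c", "d"]}
rng = random.Random(0)
rounds = 0
# The round cap is only a safety net for the simulation.
while any(len(v) < len(views) for v in views.values()) and rounds < 100:
    gossip_round(views, rng)
    rounds += 1
print(f"all nodes know the full membership after {rounds} round(s)")
```

The version number is what lets a node discard stale rumors: an older entry never overwrites a newer one, no matter which peer delivers it.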

Page 22: Cassandra - Deep Dive

Cluster Membership

Gossip Based

Page 23: Cassandra - Deep Dive

Failure Detection & Recovery

Page 24: Cassandra - Deep Dive

Failure Detection

Track state

◦ Directly, Indirectly

Accrual Detection mechanism

Permanent Node change

◦ Admin should explicitly add or remove

Hints

◦ Writes to be replayed to a replica that was unavailable

◦ Saved in system.hints table

Page 25: Cassandra - Deep Dive

Accrual Failure Detector

• Node is faulty: suspicion level monotonically increases.

• Φ(t) → k

• k - threshold variable

• Node is correct

• Φ(t) = 0
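A minimal sketch of how a suspicion level Φ might be computed, assuming heartbeat inter-arrival times are exponentially distributed, so that Φ is the negative base-10 logarithm of the probability that the next heartbeat is still on its way. The class name, the sliding-window size and the threshold of 8 are illustrative assumptions rather than Cassandra's implementation.

```python
import math
from collections import deque


class AccrualFailureDetector:
    def __init__(self, threshold=8.0, window=1000):
        self.threshold = threshold             # the "k" from the slide (assumed value)
        self.intervals = deque(maxlen=window)  # recent heartbeat inter-arrival times
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: grows steadily while no heartbeat arrives."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # P(heartbeat arrives later than `elapsed`) = exp(-elapsed / mean)
        # phi = -log10(P) = elapsed / (mean * ln 10)
        return elapsed / (mean * math.log(10))

    def is_suspected(self, now):
        return self.phi(now) > self.threshold


detector = AccrualFailureDetector()
for t in (0.0, 1.0, 2.0, 3.0):   # heartbeats one second apart
    detector.heartbeat(t)
print(detector.phi(3.0), detector.is_suspected(4.0), detector.is_suspected(23.0))
```

Unlike a fixed timeout, the output is a continuous value, so the threshold can be tuned to trade detection speed against false suspicions.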

Page 26: Cassandra - Deep Dive

Local Persistence

Page 27: Cassandra - Deep Dive

Write Request

Page 28: Cassandra - Deep Dive

Write Operation

Page 29: Cassandra - Deep Dive

Write Operation

Logging data in commit log/ memtable

Flushing data from the memtable

◦ Flushing data on threshold

Storing data on disk in SSTables

Mark with tombstone

Compaction

◦ Remove deletes, sorts, merges data, consolidation
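The write steps listed above can be sketched with a toy in-memory store: every write is appended to a commit log, applied to a memtable, and the memtable is flushed to an immutable sorted segment once it reaches a size threshold. `ToyStore`, `memtable_limit` and the list-based "disk" are hypothetical simplifications for illustration only.

```python
class ToyStore:
    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory table, latest value per key
        self.sstables = []     # immutable, sorted segments standing in for disk files
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush once the threshold is hit

    def flush(self):
        # Write the memtable out as one sorted, immutable segment.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.sstables):  # newest segment wins
            for k, v in segment:
                if k == key:
                    return v
        return None


store = ToyStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)   # reaches the threshold and triggers a flush
store.write("a", 3)   # newer value lives only in the memtable for now
print(store.read("a"), store.read("b"), len(store.sstables))  # 3 2 1
```

Nothing on "disk" is ever modified in place: an update is just a newer entry that shadows the older one until compaction merges them.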

Page 30: Cassandra - Deep Dive

Write Operation

Compaction

Page 31: Cassandra - Deep Dive

Read Request

Direct/ Background (Read repair)

Page 32: Cassandra - Deep Dive

Read Operation

Page 33: Cassandra - Deep Dive

Delete Operation

Data not removed immediately

Only Tombstone is written

Deleted during the compaction process
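A small sketch of the delete path described above: a delete only writes a tombstone marker, and the data disappears when segments are later merged. The `TOMBSTONE` sentinel and the `compact` function are assumed names, and the sketch drops tombstones immediately, whereas a real system would typically keep them for a grace period so that every replica learns about the delete.

```python
TOMBSTONE = object()  # sentinel written in place of a value when a key is deleted


def compact(sstables):
    """Merge segments (oldest first) into one sorted segment without deleted keys."""
    merged = {}
    for segment in sstables:   # later segments overwrite earlier ones
        merged.update(segment)
    return {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}


older = {"alice": "a@example.com", "bob": "b@example.com"}
newer = {"bob": TOMBSTONE, "carol": "c@example.com"}  # "bob" was deleted later
print(compact([older, newer]))
# {'alice': 'a@example.com', 'carol': 'c@example.com'}
```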

Page 34: Cassandra - Deep Dive

Additional Features

Adding compression

◦ Snappy compression

Secondary index support

SSL support

◦ Client/ Node

◦ Node/ Node

Rolling commit logs

SSTable data file merging

Page 35: Cassandra - Deep Dive

Performance

Page 36: Cassandra - Deep Dive

Performance

High Throughput & Low Latency

◦ Eliminating on-disk data modification

◦ Eliminating erase-block cycles

◦ No Locking for concurrency control

◦ Maintaining integrity not required

High Availability

Linear Scalability

Fault Tolerant

Page 37: Cassandra - Deep Dive

Statistics

Page 38: Cassandra - Deep Dive

Stats from Netflix

Linear scalability

Page 39: Cassandra - Deep Dive

Stats from Netflix

Page 40: Cassandra - Deep Dive

Some users

Page 41: Cassandra - Deep Dive

Thank you

Page 42: Cassandra - Deep Dive

Read Detailed Structure