Introduction to Cassandra (June 2010)
Presented to the Silicon Valley Cloud Computing Group, 17 June 2010.
![Page 1: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/1.jpg)
Gary Dusbabek, Rackspace
Apache
Silicon Valley Cloud Computing Group • 17 June 2010
![Page 2: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/2.jpg)
Outline
• History
• Scaling
• Replication Model
• Data Model
• Tuning
• Write Path
• Read Path
• Client Access
• Practical Considerations
![Page 4: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/4.jpg)
Why Cassandra?
• 2006: 161 EB (322 million 500 GB drives)
• 2010: 988 EB (1.98 billion 500 GB drives)
• Six-fold growth in four years
Source: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
![Page 5: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/5.jpg)
Why Cassandra?
![Page 6: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/6.jpg)
SQL
• Specialized data structures (think B-trees)
  – Shines with complicated queries
• Focus on fast query and analysis
  – Not necessarily on large datasets
![Page 7: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/7.jpg)
Ever tried scaling an RDBMS?
• For reads?
  – Memcache etc.
• For writes?
  – Oh noes!
![Page 8: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/8.jpg)
Vertical scaling is hard
credit: janetmck via flickr
![Page 9: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/9.jpg)
Vertical scaling is hard
No, really:
![Page 10: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/10.jpg)
Enter Cassandra
• Amazon Dynamo
  – Consistent hashing
  – Partitioning
  – Replication
  – One-hop routing
• Google BigTable
  – Column Families
  – Memtables
  – SSTables
![Page 11: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/11.jpg)
Origins
Pre-2008
![Page 12: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/12.jpg)
Moving Along
2008
![Page 13: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/13.jpg)
Landed
2009
![Page 15: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/15.jpg)
Distributed and Scalable
• Horizontal!
• All nodes are identical
  – No master or SPOF
  – Adding is simple
• Automatic cluster maintenance
![Page 17: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/17.jpg)
Replication
• Replication factor
  – How many nodes data is replicated on
• Consistency level
  – Zero, One, Quorum, All
  – Sync or async for writes
  – Reliability of reads
  – Read repair
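A rough sketch of how the consistency levels above map to the number of replica acknowledgements a request waits for. The function name is hypothetical and not part of any Cassandra client; the quorum rule used is the usual strict majority of the replication factor.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses a request blocks for (illustrative only)."""
    levels = {
        "ZERO": 0,                               # fire and forget (writes only)
        "ONE": 1,                                # first replica to answer
        "QUORUM": replication_factor // 2 + 1,   # strict majority of replicas
        "ALL": replication_factor,               # every replica
    }
    return levels[consistency_level]


for level in ("ZERO", "ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, replication_factor=3))
```

Lower levels return sooner but leave more work to read repair; higher levels trade latency and availability for reliability.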
![Page 18: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/18.jpg)
Ring Topology
Conceptual ring: nodes a, d, g, j (RF=3)
• One token per node
• Multiple ranges per node
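The ring can be sketched in a few lines, assuming the simplest placement rule: the first node whose token is at or past the key's token is the primary, and the next RF - 1 nodes clockwise hold the other replicas. Tokens are single letters as in the diagram, and `replicas_for` is a hypothetical helper, not Cassandra code.

```python
from bisect import bisect_left


def replicas_for(key_token: str, node_tokens: list[str], rf: int) -> list[str]:
    """Return the rf nodes that store key_token, walking the ring clockwise."""
    ring = sorted(node_tokens)
    # Primary replica: first node whose token is >= the key's token (wrapping).
    start = bisect_left(ring, key_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(rf)]


ring = ["a", "d", "g", "j"]
print(replicas_for("e", ring, rf=3))   # primary is g, then j and a
print(replicas_for("k", ring, rf=3))   # past the last token, wraps to a
```

With RF=3 every node ends up storing three ranges: its own and those of its two predecessors, which is what "multiple ranges per node" refers to.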
![Page 19: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/19.jpg)
Ring Topology
Conceptual ring: nodes a, d, g, j (RF=2)
• One token per node
• Multiple ranges per node
![Page 20: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/20.jpg)
New Node
Ring: nodes a, d, g, j, plus new node m (RF=3)
• Token assignment
• Range adjustment
• Bootstrap
• Arrival only affects immediate neighbors
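A small sketch of the range adjustment when node m joins, looking only at primary ranges. It assumes each node is primary for the range from its predecessor's token up to its own token; `primary_ranges` is a hypothetical helper.

```python
def primary_ranges(tokens):
    """Map each node token to the (start, end] range it is primary for."""
    ring = sorted(tokens)
    # Each node owns the range from its predecessor's token up to its own;
    # index -1 makes the first node wrap around to the last one.
    return {tok: (ring[i - 1], tok) for i, tok in enumerate(ring)}


before = primary_ranges(["a", "d", "g", "j"])
after = primary_ranges(["a", "d", "g", "j", "m"])

changed = [node for node in before if before[node] != after[node]]
print("new node m owns", after["m"])
print("nodes with a changed primary range:", changed)
```

Only the new node's successor gives up part of its primary range; the other nodes keep theirs, although with RF greater than 1 the replica sets of nearby nodes shift as well.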
![Page 21: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/21.jpg)
Ring Partition
Ring: nodes a, d, g, j (RF=3)
• Node dies
• Available?
  – Hinting
  – Handoff
• Achtung! Plan for this
![Page 23: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/23.jpg)
Schema-free Sparse-table
• Flexible column naming
• You define the sort order
• Not required to have a specific column just because another row does
![Page 24: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/24.jpg)
Data Model
• Keyspace
  – ColumnFamily
    – Row (indexed)
      – Key
      – Columns
        – Name (sorted)
        – Value
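One way to picture the hierarchy is as nested maps. This is only a mental model with made-up data (the `Users` column family and its rows are hypothetical), not how Cassandra stores anything on disk.

```python
# keyspace -> column family -> row key -> {column name: value}
keyspace = {
    "Users": {
        "alice": {"email": "alice@example.com", "city": "Austin"},
        "bob": {"twitter": "@bob", "email": "bob@example.com"},
    }
}


def columns_in_order(column_family, row_key):
    """Return a row's (name, value) pairs sorted by column name."""
    return sorted(keyspace[column_family][row_key].items())


print(columns_in_order("Users", "alice"))
print(columns_in_order("Users", "bob"))
```

The two rows carry different column names, which is the sparse-table property: a row only has the columns that were written to it.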
![Page 25: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/25.jpg)
Easier to show from the bottom up
![Page 26: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/26.jpg)
Data Model
A single column
![Page 27: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/27.jpg)
Data Model
A single row
![Page 28: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/28.jpg)
Data Model
![Page 30: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/30.jpg)
Eventually Consistent
• CAP Theorem
  – Consistency
  – Availability
  – Partition Tolerance
• Choose two
• Cassandra chooses A and P
But…
![Page 31: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/31.jpg)
Eventually Consistent
I got a fever! And the only prescription is MORE CONSISTENCY!
![Page 32: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/32.jpg)
Tunable Consistency
• Give up a little A and P to get more C
• Ratchet up the consistency level
• R + W > N gives strong consistency (R and W are the replicas consulted per read and per write; N is the replication factor)
• More to come
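The R + W > N rule is easy to check by hand; the sketch below does it for a few combinations with N = 3. The function name is hypothetical.

```python
def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set must overlap every write set."""
    return r + w > n


n = 3
quorum = n // 2 + 1
print(is_strongly_consistent(quorum, quorum, n))  # QUORUM reads, QUORUM writes
print(is_strongly_consistent(1, 1, n))            # ONE reads, ONE writes
print(is_strongly_consistent(1, n, n))            # ONE reads, ALL writes
```

When the inequality holds, at least one replica in any read set has seen the latest acknowledged write, so the read can return it.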
![Page 34: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/34.jpg)
Inserting: Overview
• Simple: put(key, col, value)
• Complex: put(key, [col:value, …, col:value])
• Batch: multiple keys
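The three insert shapes can be illustrated with an in-memory stand-in for a single column family. `ToyStore` and its method names are hypothetical and only mirror the slide's pseudocode; they are not a real client API.

```python
from collections import defaultdict


class ToyStore:
    """In-memory stand-in for one column family (illustrative only)."""

    def __init__(self):
        self.rows = defaultdict(dict)   # row key -> {column name: value}

    def put(self, key, col, value):
        # Simple: one column in one row.
        self.rows[key][col] = value

    def put_columns(self, key, columns):
        # Complex: several columns in one row.
        self.rows[key].update(columns)

    def batch_put(self, mutations):
        # Batch: several rows, each with its own columns.
        for key, columns in mutations.items():
            self.put_columns(key, columns)


store = ToyStore()
store.put("a", "bar", 1)
store.put_columns("a", {"cat": 2, "kite": 3})
store.batch_put({"b": {"bar": 4}, "c": {"llama": 5}})
print(dict(store.rows))
```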
![Page 35: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/35.jpg)
Inserting: Writes
• Commit log for durability
  – Configurable fsync
  – Sequential writes only
• Memtable: no disk access (no reads or seeks)
• SSTables are final (become read only)
  – Indexes
  – Bloom filter
  – Raw data
• Bottom line: FAST!!!
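A toy version of the write path, assuming a memtable that flushes after a fixed number of entries. The class and its `memtable_limit` parameter are hypothetical, and the sketch leaves out indexes, bloom filters and commit log cleanup.

```python
class ToyWritePath:
    """Illustrative write path: commit log, memtable, immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only, stands in for sequential disk writes
        self.memtable = {}        # in memory, (key, col) -> value
        self.sstables = []        # flushed tables, never modified afterwards
        self.memtable_limit = memtable_limit

    def write(self, key, col, value):
        self.commit_log.append((key, col, value))   # durability first
        self.memtable[(key, col)] = value           # no disk read or seek
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable as a sorted, read-only table and start a new one.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


wp = ToyWritePath(memtable_limit=2)
wp.write("a", "bar", 1)
wp.write("a", "cat", 2)   # memtable reaches its limit and is flushed
wp.write("b", "bar", 3)
print(len(wp.commit_log), len(wp.sstables), wp.memtable)
```

Nothing on this path reads existing data before writing, which is why writes stay fast as the data set grows.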
![Page 37: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/37.jpg)
Querying: Overview
• You need a key or keys:
  – Single: key=‘a’
  – Range: key=‘a’ through ‘f’
• And columns to retrieve:
  – Slice: cols={bar through kite}
  – By name: key=‘b’ cols={bar, cat, llama}
• Nothing like SQL “WHERE col=‘faz’”
  – But secondary indices are being worked on (see CASSANDRA-749)
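The two column selectors can be sketched against a single row held as a dict. The helper names and the sample row are hypothetical, and the slice is assumed to include both end points.

```python
from bisect import bisect_left, bisect_right

row = {"ant": 1, "bar": 2, "cat": 3, "kite": 4, "llama": 5}


def get_slice(row, start, finish):
    """Columns whose names fall in [start, finish], in sorted name order."""
    names = sorted(row)
    lo, hi = bisect_left(names, start), bisect_right(names, finish)
    return [(name, row[name]) for name in names[lo:hi]]


def get_by_name(row, names):
    """Only the named columns that actually exist in the row."""
    return [(name, row[name]) for name in names if name in row]


print(get_slice(row, "bar", "kite"))
print(get_by_name(row, ["bar", "cat", "llama"]))
```

Both lookups work on column names only. Nothing here filters on a column's value, which is the gap the secondary index work is meant to close.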
![Page 38: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/38.jpg)
Querying: Reads
• Practically lock free
• SSTable proliferation
• New in 0.6:
  – Row cache (avoid SSTable lookup, not write-through)
  – Key cache (avoid index scan)
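SSTable proliferation matters because a read may have to consult the memtable and every SSTable that could hold the column, then keep the newest version. A minimal sketch, assuming each stored value carries a timestamp; the function and the sample data are hypothetical, and caches and bloom filters are left out.

```python
def read_column(key, col, memtable, sstables):
    """Return the newest value for (key, col) across memtable and SSTables."""
    candidates = []
    for table in [memtable, *sstables]:
        if (key, col) in table:
            candidates.append(table[(key, col)])    # (timestamp, value)
    if not candidates:
        return None
    # Highest timestamp wins, regardless of which table held it.
    return max(candidates, key=lambda tv: tv[0])[1]


memtable = {("a", "bar"): (30, "newest")}
sstables = [
    {("a", "bar"): (10, "old"), ("a", "cat"): (12, "meow")},
    {("a", "bar"): (20, "newer")},
]
print(read_column("a", "bar", memtable, sstables))
print(read_column("a", "cat", memtable, sstables))
```

The more SSTables accumulate, the more places each read has to look, which is what the row and key caches help avoid.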
![Page 40: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/40.jpg)
Client API (Low Level)
• Fat Client
  – Live non-storage node
  – Reduced RPC overhead
• Thrift (12 language bindings!)
  – http://incubator.apache.org/thrift/
  – No streaming
• Avro
  – Work in progress
![Page 41: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/41.jpg)
Client API (High Level)
• http://wiki.apache.org/cassandra/ClientOptions
• Feature rich
• Connection pooling
• Load balancing/failover
• Simplified APIs
• Version opaque
![Page 43: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/43.jpg)
Practical Considerations
• Partitioner: Random or Order Preserving
  – Range queries
• Provisioning
  – Virtual or bare metal
  – Cluster size
• Data model
  – Think in terms of access
  – Giving up transactions, ad-hoc queries, arbitrary indexes and joins
    • (you may already do this with an RDBMS!)
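The partitioner choice comes down to how a row key becomes a ring token. A sketch of the two styles, assuming an MD5 hash for the random case and the key itself as the token for the order-preserving case; the function names are hypothetical.

```python
import hashlib


def random_token(key: str) -> int:
    """Random-style partitioning: the token is a hash of the key (MD5 here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_preserving_token(key: str) -> str:
    """Order-preserving partitioning: the key itself is the token."""
    return key


keys = ["apple", "banana", "cherry"]
by_op = sorted(keys, key=order_preserving_token)
by_random = sorted(keys, key=random_token)
print(by_op)        # key order is kept, so key range scans are meaningful
print(by_random)    # hash order, unrelated to key order in general
```

Hashing spreads keys evenly around the ring but scatters neighbouring keys, so key range queries need the order-preserving option, at the cost of possible hot spots.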
![Page 44: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/44.jpg)
Practical Considerations
• Wide rows
• Data life-span
• Cluster planning
  – Bootstrapping
![Page 45: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/45.jpg)
Future Direction
• Vector clocks (server side conflict resolution)
• Alter keyspace/column families on a live cluster
• Compression
• Multi-tenant features
• Fewer memory restrictions
![Page 46: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/46.jpg)
Wrapping Up
• Use Cassandra if you want/need
  – High write throughput
  – Near-linear scalability
  – Automated replication/fault tolerance
  – And can tolerate missing RDBMS features
![Page 47: Introduction to Cassandra (June 2010)](https://reader036.vdocuments.site/reader036/viewer/2022062708/5587de29d8b42a04638b46b5/html5/thumbnails/47.jpg)
Questions?
Linkage
• wiki.apache.org/cassandra
• cassandra.apache.org
• [email protected]
• gdusbabek on twitter and just about everything else.