Devops KC presentation on Apache Cassandra
©2014 DataStax. Do not distribute without consent.
DataStax
Philip Thompson, Software Engineer
Apache Cassandra
Who I am
• Philip Thompson
• Software Engineer at DataStax
• Contributor to Apache Cassandra
• A maintainer of CCM, the Cassandra Cluster Manager
Apache Cassandra™
• Apache Cassandra™ is a massively scalable, open source, NoSQL, distributed database built for modern, mission-critical online applications.
• Written in Java; a hybrid of Amazon Dynamo and Google BigTable
• Masterless with no single point of failure
• Distributed and data centre aware
• 100% uptime
• Predictable scaling
©2012 DataStax
http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html
Cluster Architecture
Data Distribution
[Figure: token ring with nodes at tokens 0, 25, 50, and 75. Murmur3_Hash_Function(Partition Key) >> Token]
Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
• Each node owns a number of tokens
• Tokens denote a range of keys
• 4 nodes? -> Key range/4
• Each node owns 1/4 of the data
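The token ownership described above can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: the 0..99 token space mirrors the toy ring on the slide (the real Murmur3 partitioner uses a 64-bit range), MD5 stands in for Murmur3 because Murmur3 is not in the Python standard library, and the names `token_for` and `owner_of` are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 100  # toy token space 0..99, matching the 0/25/50/75 ring


def token_for(partition_key: str) -> int:
    """Map a partition key to a token. MD5 is a stand-in for Murmur3 here."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


def owner_of(token: int, node_tokens: list[int]) -> int:
    """Return the token of the node that owns `token`.

    A node owns the range (previous node's token, its own token], so the
    owner is the first node token >= `token`, wrapping around the ring.
    """
    ordered = sorted(node_tokens)
    index = bisect.bisect_left(ordered, token)
    return ordered[index % len(ordered)]


if __name__ == "__main__":
    nodes = [0, 25, 50, 75]
    for key in ("alice", "bob", "carol"):
        token = token_for(key)
        print(f"{key!r} -> token {token} -> node at token {owner_of(token, nodes)}")
```

With four evenly spaced tokens, each node ends up owning a quarter of the token space.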
Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor (RF): How many copies of your data?
• RF = 3 here
Each node stores 3/4 of the cluster's total data.
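To see where the 3/4 figure comes from, here is a rough sketch that places each token's replicas on the owning node and the next nodes clockwise around the ring, then counts how much of the ring one node holds. The clockwise placement is a simplifying assumption (it ignores racks and data centers), and the function names and 0..99 token space are hypothetical.

```python
from fractions import Fraction


def replicas_for(token: int, node_tokens: list[int], rf: int) -> list[int]:
    """Return the nodes holding `token`: the owner plus the next rf-1 clockwise."""
    ordered = sorted(node_tokens)
    # The owner is the first node whose token is >= the data token (wrapping to 0).
    start = next((i for i, t in enumerate(ordered) if t >= token), 0)
    copies = min(rf, len(ordered))
    return [ordered[(start + step) % len(ordered)] for step in range(copies)]


def fraction_stored(node: int, node_tokens: list[int], rf: int,
                    ring_size: int = 100) -> Fraction:
    """Fraction of the toy token space for which `node` holds a replica."""
    held = sum(1 for token in range(ring_size)
               if node in replicas_for(token, node_tokens, rf))
    return Fraction(held, ring_size)


if __name__ == "__main__":
    nodes = [0, 25, 50, 75]
    print(replicas_for(10, nodes, rf=3))
    print(fraction_stored(25, nodes, rf=3))
```

In general each node stores roughly RF / N of the data when tokens are evenly spread, which is 3/4 for RF = 3 on four nodes.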
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC
Single coordinator
Cassandra - Replication Factor
• Replication factor (RF): How many copies of your data?
• Replication Factor is set per keyspace
• Can be altered by operator
RF = 3
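Because the replication factor is a keyspace-level setting, it is expressed in the CQL that creates or alters the keyspace. The helper below only builds those statements as strings so they can be pasted into cqlsh; the keyspace name `demo` and data center names `dc1` and `dc2` are placeholders, and the use of NetworkTopologyStrategy is an assumption about the deployment.

```python
def replication_clause(rf_per_dc: dict[str, int]) -> str:
    """Render the CQL replication map for NetworkTopologyStrategy."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += [f"'{dc}': {rf}" for dc, rf in sorted(rf_per_dc.items())]
    return "{" + ", ".join(options) + "}"


def create_keyspace(name: str, rf_per_dc: dict[str, int]) -> str:
    """CQL to create a keyspace with a replication factor per data center."""
    return f"CREATE KEYSPACE {name} WITH replication = {replication_clause(rf_per_dc)};"


def alter_keyspace(name: str, rf_per_dc: dict[str, int]) -> str:
    """CQL an operator could run later to change the replication factor."""
    return f"ALTER KEYSPACE {name} WITH replication = {replication_clause(rf_per_dc)};"


if __name__ == "__main__":
    print(create_keyspace("demo", {"dc1": 3}))
    print(alter_keyspace("demo", {"dc1": 3, "dc2": 2}))
```

After raising the replication factor on a live keyspace, existing data still has to be streamed to the new replicas, which is what repair is for.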
Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write
• ALL = All replicas ack
• QUORUM = A majority (more than 50%) of replicas ack
• LOCAL_QUORUM = A majority (more than 50%) of replicas in the local DC ack
• ONE = Only one replica acks
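The consistency levels above translate into a number of acknowledgements the coordinator waits for. A small sketch of that arithmetic, assuming replication factors are given per data center; the function names and the `dc1`/`dc2` labels are illustrative only.

```python
def quorum(replica_count: int) -> int:
    """A majority of replicas: floor(n / 2) + 1."""
    return replica_count // 2 + 1


def required_acks(level: str, rf_per_dc: dict[str, int], local_dc: str) -> int:
    """Number of replica acks needed before the request is reported successful."""
    total = sum(rf_per_dc.values())
    if level == "ALL":
        return total
    if level == "QUORUM":
        return quorum(total)
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    if level == "ONE":
        return 1
    raise ValueError(f"unsupported consistency level: {level}")


if __name__ == "__main__":
    rf = {"dc1": 3, "dc2": 3}
    for level in ("ALL", "QUORUM", "LOCAL_QUORUM", "ONE"):
        print(level, required_acks(level, rf, local_dc="dc1"))
```

Note that QUORUM counts replicas across all data centers, while LOCAL_QUORUM avoids waiting on the WAN by counting only local replicas.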
Cassandra - Transparent to the application
• A single node failure shouldn’t cause requests to fail
• Replication Factor + Consistency Level = Success
• This example:
• RF = 3
• CL = QUORUM
A majority of replicas acked, so we are good!
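The RF = 3, CL = QUORUM example can be played through with a toy coordinator that counts acks from whichever replicas are up. This is a simplified sketch (no timeouts, no hints), and `coordinate_write` and the node names are hypothetical.

```python
def coordinate_write(replica_is_up: dict[str, bool]) -> tuple[bool, int]:
    """Return (succeeded, acks) for a QUORUM write sent to every replica."""
    required = len(replica_is_up) // 2 + 1  # majority of the replicas
    acks = sum(1 for is_up in replica_is_up.values() if is_up)
    return acks >= required, acks


if __name__ == "__main__":
    # One replica down: 2 of 3 ack, which is still a majority.
    print(coordinate_write({"n1": True, "n2": False, "n3": True}))
    # Two replicas down: only 1 of 3 acks, so the write is rejected.
    print(coordinate_write({"n1": True, "n2": False, "n3": False}))
```

The application only sees the overall success or failure, which is why a single node failure stays transparent at this replication factor and consistency level.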
Cassandra - Scaling
• Take a cluster of four nodes
• Where does the fifth node go?
• Rebalancing is costly
[Figure: token ring with nodes at tokens 0, 25, 50, and 75]
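A quick calculation shows why the fifth node is awkward. The sketch below assumes one token per node on a toy 0..99 ring and a hypothetical `ownership` helper; it compares dropping the new node between two existing ones with re-spacing every token evenly.

```python
def ownership(node_tokens: list[int], ring_size: int = 100) -> dict[int, int]:
    """Number of tokens each node owns: the range (previous token, own token]."""
    ordered = sorted(node_tokens)
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index -1 wraps to the last node
        shares[token] = (token - previous) % ring_size or ring_size
    return shares


if __name__ == "__main__":
    print(ownership([0, 25, 50, 75]))      # balanced four-node ring
    print(ownership([0, 12, 25, 50, 75]))  # fifth node splits one range only
    print(ownership([0, 20, 40, 60, 80]))  # balanced again, but most tokens moved
```

Inserting the node at token 12 relieves only one neighbor, while an even layout requires changing the tokens of three of the four original nodes and streaming data between them.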
Gossip
• Manages cluster state
• Nodes up/down
• Nodes joining/leaving
• Decentralized
• “Heartbeat” every second
• Every node contacts 1-3 other nodes
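The gossip behaviour in these bullets can be approximated with a small simulation. It is a loose sketch rather than Cassandra's protocol: each round stands for one second, every node bumps its own heartbeat and exchanges its view with 1 to 3 randomly chosen peers, and the highest heartbeat version wins. All names are hypothetical.

```python
import random


def merge(mine: dict[str, int], theirs: dict[str, int]) -> dict[str, int]:
    """Keep the highest heartbeat version seen for every node."""
    merged = dict(mine)
    for node, version in theirs.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_round(views: dict[str, dict[str, int]], rng: random.Random) -> None:
    """One round: every node bumps its heartbeat and syncs with 1-3 peers."""
    nodes = list(views)
    for node in nodes:
        views[node][node] = views[node].get(node, 0) + 1
        peers = [peer for peer in nodes if peer != node]
        fanout = min(len(peers), rng.randint(1, 3))
        for peer in rng.sample(peers, k=fanout):
            combined = merge(views[node], views[peer])
            views[node] = dict(combined)
            views[peer] = dict(combined)


def rounds_until_everyone_known(node_count: int, seed: int = 0,
                                max_rounds: int = 100):
    """Rounds until every node has heard of every other node, or None."""
    rng = random.Random(seed)
    names = [f"node{i}" for i in range(node_count)]
    views = {name: {name: 0} for name in names}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(set(view) == set(names) for view in views.values()):
            return round_number
    return None


if __name__ == "__main__":
    print(rounds_until_everyone_known(8))
```

No node acts as a central registry here; cluster state spreads because every exchange carries everything both sides already know.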
Snitch
• Responsible for determining cluster topology
• Datacenter awareness
• Tracks node responsiveness
• Many snitches provided out of the box
• SimpleSnitch
• GossipingPropertyFileSnitch (recommended for production)
• Ec2Snitch and Ec2MultiRegionSnitch
• For use with AWS
• Comparable GCE snitch has just been added
• Custom snitches can be added
Anti-Entropy - Read Repair
Anti-Entropy - Hinted Handoff
• Three-hour window by default: hints stop being stored for a node that stays down longer
• Hints are replayed when node is restored
• Stored in system.hints table on coordinator
• Cassandra does not copy Dynamo’s “sloppy quorum”
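Hinted handoff can be pictured as the coordinator keeping a to-do list for replicas that are briefly down. The sketch below is a simplified model under stated assumptions: replicas are plain dictionaries, the `hints` list stands in for the system.hints table, time is passed in as seconds, and the class and method names are invented for illustration.

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the three-hour window from the slide


class Coordinator:
    def __init__(self):
        self.hints = []  # (replica name, mutation); stand-in for system.hints

    def write(self, mutation, replicas, down_since, now):
        """Apply `mutation` to live replicas and store hints for recently downed ones.

        replicas: name -> dict acting as that replica's storage
        down_since: name -> time (seconds) at which that replica went down
        """
        delivered = []
        for name, store in replicas.items():
            if name not in down_since:
                store.update(mutation)
                delivered.append(name)
            elif now - down_since[name] <= HINT_WINDOW_SECONDS:
                self.hints.append((name, dict(mutation)))
            # Past the window no hint is kept; repair has to fix that replica.
        return delivered

    def replay(self, name, store):
        """Replay stored hints once replica `name` is back; return how many."""
        remaining, replayed = [], 0
        for target, mutation in self.hints:
            if target == name:
                store.update(mutation)
                replayed += 1
            else:
                remaining.append((target, mutation))
        self.hints = remaining
        return replayed


if __name__ == "__main__":
    replicas = {"a": {}, "b": {}, "c": {}}
    coordinator = Coordinator()
    print(coordinator.write({"k": "v"}, replicas, down_since={"b": 0}, now=60))
    print(coordinator.replay("b", replicas["b"]), replicas["b"])
```

A stored hint does not count as an ack toward the consistency level, which is the practical meaning of not copying Dynamo's sloppy quorum.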
Anti-Entropy - Repair
• Run with nodetool repair
• Uses Merkle trees for data comparison
• Should be run weekly
• Cassandra 2.1 has drastically improved repair times, thanks to incremental repair
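The Merkle tree comparison behind repair can be illustrated with a tiny hash tree. This is a sketch under simplifying assumptions: each leaf stands for the data in one token range, the leaf count is a power of two, SHA-256 is an arbitrary hash choice, and the function names are hypothetical.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return hash levels from the leaves (level 0) up to the root.

    The number of leaves is assumed to be a power of two.
    """
    level = [_hash(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b) -> list[int]:
    """Walk down from the root, following only subtrees whose hashes differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # identical roots: the replicas agree, nothing to stream
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        suspects = [child
                    for parent in suspects
                    for child in (2 * parent, 2 * parent + 1)
                    if tree_a[depth][child] != tree_b[depth][child]]
    return suspects


if __name__ == "__main__":
    replica_a = [b"range0", b"range1", b"range2", b"range3"]
    replica_b = [b"range0", b"range1", b"stale", b"range3"]
    print(differing_leaves(build_tree(replica_a), build_tree(replica_b)))
```

Only the ranges whose hashes differ need to be streamed between replicas, so two nodes that mostly agree exchange little more than the tree itself.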
Node Architecture
Write Path
[Figure: write path. Write -> commit log (disk) and memtable (memory); memtable -> SSTable (disk)]
• By default the commit log is fsynced to disk every 10s
• This can be configured in cassandra.yaml
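The write path in the diagram can be modeled with three containers. This is a minimal sketch, not Cassandra's storage engine: the flush threshold `memtable_limit` is a made-up parameter, the commit log is a plain list rather than an fsynced file, and cells are stored as (value, timestamp) pairs.

```python
class Node:
    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []  # on disk: append-only, used for crash recovery
        self.memtable = {}    # in memory: key -> (value, timestamp)
        self.sstables = []    # on disk: immutable, sorted by key
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        """Append to the commit log, then update the memtable."""
        self.commit_log.append((key, value, timestamp))
        self.memtable[key] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a new SSTable and start a fresh one."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed data is now durable in an SSTable, so its log entries can go.
        self.commit_log.clear()


if __name__ == "__main__":
    node = Node()
    node.write("b", "2", timestamp=1)
    node.write("a", "1", timestamp=2)  # reaches the limit and triggers a flush
    node.write("c", "3", timestamp=3)
    print(node.sstables, node.memtable, node.commit_log)
```

Writes never modify an existing SSTable; they only append to the log and update memory, which is a large part of why writes are cheap.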
Read Path
[Figure: read path. Read <- Memtable (memory) and SSTables (disk)]
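On the read side, a value may live in the memtable, in one or more SSTables, or in several of them at once. A rough sketch of how the copies are reconciled, assuming each cell is a (value, timestamp) pair and the newest timestamp wins; it leaves out bloom filters, caches, and tombstones, and `read` is a hypothetical helper.

```python
def read(key, memtable: dict, sstables: list[dict]):
    """Return the newest value for `key` across the memtable and all SSTables."""
    candidates = [table[key] for table in [memtable, *sstables] if key in table]
    if not candidates:
        return None
    value, _timestamp = max(candidates, key=lambda cell: cell[1])
    return value


if __name__ == "__main__":
    memtable = {"k": ("new", 3)}
    sstables = [{"k": ("old", 1)}, {"j": ("x", 2)}]
    print(read("k", memtable, sstables))
    print(read("j", memtable, sstables))
    print(read("missing", memtable, sstables))
```

The more SSTables a key is spread across, the more work a read has to do, which is the problem compaction addresses by merging SSTables.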
Compaction
Debugging your data model
• Tracing
cqlsh> tracing on;
Now tracing requests.
cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
 activity                            | timestamp    | source    | source_elapsed
-------------------------------------+--------------+-----------+----------------
 execute_cql3_query                  | 00:02:37,015 | 127.0.0.1 |              0
 Parsing statement                   | 00:02:37,015 | 127.0.0.1 |             81
 Preparing statement                 | 00:02:37,015 | 127.0.0.1 |            273
 Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |            540
 Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |            779
 Messsage received from /127.0.0.1   | 00:02:37,016 | 127.0.0.2 |             63
 Applying mutation                   | 00:02:37,016 | 127.0.0.2 |            220
 Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |            250
 Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |            277
 Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |            378
 Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |            710
 Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |            888
 Messsage received from /127.0.0.2   | 00:02:37,017 | 127.0.0.1 |           2334
 Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 |           2550
 Request complete                    | 00:02:37,017 | 127.0.0.1 |           2581
Nodetool
• Command line interface for monitoring Cassandra and performing routine database operations
• Commands for viewing detailed metrics for tables, server metrics, and compaction statistics:
• cfstats: statistics for each table and keyspace
• cfhistograms: statistics about a table, including read/write latency, row size, column count, and number of SSTables
• netstats: statistics about network operations and connections
• tpstats: statistics about the number of active, pending, and completed tasks for each stage of Cassandra operations by thread pool
Try it out
Cassandra
• Download from source:
• git clone git://git.apache.org/cassandra.git
• Packaged install and tarballs available:
• http://www.datastax.com/documentation/cassandra/2.1/cassandra/install/install_cassandraTOC.html
CCM
• CCM - Cassandra Cluster Manager
• https://github.com/pcmanus/ccm
• Warning: not lightweight
• Example:
• ccm create test -v 2.0.1
• ccm populate -n 3
• ccm start
Clients
• cqlsh
• Bundled with Cassandra
• Drivers
• java: https://github.com/datastax/java-driver
• python: https://github.com/datastax/python-driver
• .net: https://github.com/datastax/csharp-driver
• and more: http://www.datastax.com/download/clientdrivers
• Ruby, C/C++, NodeJS
Get Help
• IRC: #cassandra on freenode
• Mailing Lists
• Subscribe at cassandra.apache.org
• Stack Overflow
• DataStax Docs
• http://www.datastax.com/docs
Questions?
©2014 DataStax Confidential. Do not distribute without consent.