Continuous Deployment with Cassandra

Continuous Deployment with C*: Treating C* as First-Class Code Michael Kjellman @mkjellman Software Engineer, Barracuda Networks

Upload: planet-cassandra

Post on 20-Jun-2015


DESCRIPTION

Michael Kjellman, Software Engineer at Barracuda Networks, has offered to present on his experiences with Apache Cassandra. Come learn about:
• Continuous Deployments with Cassandra
• Upgrading Cassandra
• When Upgrades Go Wrong
• Coding Complexity Moved to Operations (How to Prepare and Plan)
• Why 'Apt-get/Yum Install Cassandra' is a bad idea
• Why You Should Treat Cassandra’s Code like it's Your Own

TRANSCRIPT

Page 1: Continuous Deployment with Cassandra

Continuous Deployment with C*: Treating C* as First-Class Code

Michael Kjellman @mkjellman

Software Engineer, Barracuda Networks

Page 2: Continuous Deployment with Cassandra
Page 3: Continuous Deployment with Cassandra

C* At Barracuda
• Powers 100% of our Spam and Webfilter Backend
• 48 Node Cluster
• 2 Datacenters
• Requests: 20k writes/sec, 30k reads/sec
• Latency: 1 ms/write, 1.6 ms/read
• > 30TB of Data
• Almost entirely native protocol/CQL3

Page 4: Continuous Deployment with Cassandra

Hardware Configuration
• 32GB of RAM
• 1x SSD
• 2x Spinning Disks
• 2x 6 Core AMD

Page 5: Continuous Deployment with Cassandra

Key Configuration Options
• key_cache_size_in_mb: 1024
• row_cache_size_in_mb: 0
• memtable_total_space_in_mb: 2048
• HEAP_NEWSIZE="1200M" (-Xmn)
• MAX_HEAP_SIZE="8G" (-Xmx)
• -XX:SurvivorRatio=6

• Sidenote: Java 7u40 is out!

Page 6: Continuous Deployment with Cassandra

How do I keep my graphs pretty during a C* upgrade?

September 18th 2013

Page 7: Continuous Deployment with Cassandra

Make a C* Build
$> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
$> cd cassandra
$> git checkout -t origin/cassandra-1.2
$> git log
$> vim build.xml (change version number every time you make a build!)
$> ant clean release
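The "change version number every time" step is easy to forget when done by hand in vim. A minimal sketch of scripting it follows, assuming build.xml declares the version in a property named base.version (check this against your own checkout). The helper name stamp_version and the suffix format are hypothetical.

```python
import re

# Assumption: build.xml contains a line such as
#   <property name="base.version" value="1.2.9"/>
VERSION_RE = re.compile(r'(<property\s+name="base\.version"\s+value=")([^"]+)(")')


def stamp_version(build_xml: str, suffix: str) -> str:
    """Return the build.xml text with `-suffix` appended to the base version."""

    def _replace(match):
        # Drop any suffix left over from an earlier build before adding the new one.
        base = match.group(2).split("-")[0]
        return f"{match.group(1)}{base}-{suffix}{match.group(3)}"

    stamped, count = VERSION_RE.subn(_replace, build_xml, count=1)
    if count == 0:
        raise ValueError("no base.version property found in build.xml")
    return stamped
```

A build script could read build.xml, call stamp_version(text, "bn42") with its own build counter, write the result back, and then run ant clean release, so every deployed jar carries a version that maps to a known set of commits.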

Page 8: Continuous Deployment with Cassandra

Deployment
• Make release
• Test release with CCM
• Push release to Puppet (deals with config, etc)
• Run controlled and scripted rolling restart one datacenter at a time
  – flush
  – stop
  – start
  – validate node
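The flush, stop, start, validate loop above can be sketched as a small script. This is an illustration under stated assumptions, not the tooling used at Barracuda: the `run` callable (for example an ssh wrapper that returns stdout) is hypothetical, and so are the service commands, which depend on how Cassandra is installed. `nodetool flush` and `nodetool status` are standard nodetool subcommands.

```python
import time

# Commands issued on each node, in order. The two `service` lines are
# assumptions about the init system; adjust them to your install.
STEPS = (
    "nodetool flush",
    "service cassandra stop",
    "service cassandra start",
)


def node_is_up(status_output, address):
    """True if `nodetool status` output lists `address` as Up/Normal (UN)."""
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "UN" and fields[1] == address:
            return True
    return False


def rolling_restart(nodes, run, attempts=30, delay=10.0, sleep=time.sleep):
    """Restart nodes one at a time, waiting for each to report UN.

    `run(address, command)` is a hypothetical callable that executes the
    command on the node and returns its stdout ("" if the command fails,
    e.g. while JMX is still starting).
    """
    for address in nodes:
        for step in STEPS:
            run(address, step)
        for _ in range(attempts):
            if node_is_up(run(address, "nodetool status"), address):
                break
            sleep(delay)
        else:
            # Stop the whole rollout rather than take down a second node.
            raise RuntimeError(f"{address} did not come back up")
```

Calling rolling_restart once per datacenter, and only moving to the next datacenter after the first call returns, keeps the "one datacenter at a time" rule. Raising on the first node that fails validation stops the rollout before a bad build reaches the rest of the cluster.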

Page 9: Continuous Deployment with Cassandra

Automate, Automate, Automate

Page 10: Continuous Deployment with Cassandra

So, why not just apt-get install cassandra?

• Makes running a custom release in the future a complete nightmare
• Lost visibility into changes in the release
• WHY are you upgrading?
• Treat a C* build just as if it were a release of your code. What commits did you put into your own release?

Page 11: Continuous Deployment with Cassandra

Simply Put:

MY CODE DOESN’T WORK WITHOUT A STABLE C* CLUSTER

Page 12: Continuous Deployment with Cassandra

When things go wrong
• Every commit (those by C* committers or my own) comes with potential bugs and regressions
• Gossip Bugs Can Bite Hard:
  – CASSANDRA-5665: Gossiper.handleMajorStateChange can lose existing node ApplicationState
• At 48 nodes, even small mistakes are massive

Page 13: Continuous Deployment with Cassandra

Writing your code to deal with node failure

• Upgrading a C* cluster means constant node failures for the duration of the rolling restart

• How does your code deal with read latency and retries?
  – CASSANDRA-4705: Eager Retries for reads for 2.0+

• The mythical “constantly failing” code != stability.
  – Handle exceptions (and node/read failures) gracefully!
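One way to handle read failures gracefully during a rolling restart is a bounded retry with exponential backoff and jitter on the client side. The sketch below is generic and does not use any Cassandra driver API: TransientReadError is a hypothetical stand-in for whatever timeout or unavailable exceptions your driver raises, and `read` is any zero-argument callable that performs the query.

```python
import random
import time


class TransientReadError(Exception):
    """Hypothetical stand-in for a driver's timeout/unavailable errors."""


def read_with_retries(read, retries=3, base_delay=0.05, sleep=time.sleep):
    """Call `read()` and retry transient failures with jittered backoff.

    Makes at most `retries + 1` attempts; the last failure is re-raised so
    the caller can degrade gracefully instead of hanging.
    """
    for attempt in range(retries + 1):
        try:
            return read()
        except TransientReadError:
            if attempt == retries:
                raise
            # Exponential backoff with jitter so clients do not retry in lockstep.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
```

Keeping the retry count small matters here: during a rolling restart a node is expected to be down for a while, so the goal is to ride out a brief blip and then fail fast, not to retry until the node returns.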

Page 14: Continuous Deployment with Cassandra

Why treat C* like your own code
• Using C* will move much of your own application logic to C*
• The bugs have to go somewhere!
• Data replication at database layer or at application layer

Page 15: Continuous Deployment with Cassandra

QUESTIONS?

Thanks for Listening!