C* Summit 2013: Time-Series Metrics with Cassandra by Mike Heffner
DESCRIPTION
Librato's Metrics platform relies on Cassandra as its sole data storage platform for time-series data. This session will discuss how we have scaled from a single six-node Cassandra ring two years ago to the multiple storage rings that handle over 150,000 writes/second today. We'll cover the steps we have taken to scale the platform, including the evolution of our underlying schema, operational tricks, and client-library improvements. The session will finish with our suggestions on how we believe Cassandra as a project and its community can be improved.
TRANSCRIPT
#CASSANDRA13
Time-Series Metrics with Cassandra
Mike Heffner
What we do.
October 2011
- Decision: all measurements in Cassandra
- Single EC2 ring: 6 * m1.large
- Cassandra 0.8.x
- How does this work?
Today
- Multiple sharded rings
- ~250,000 writes / second
- EC2: m1.xlarge and m2.4xlarge
- Cassandra 1.1.x
- Read load: < 1%
Talk Highlights
- Matching Schema to Storage
- Optimally Expiring Data
- Monitor Everything
Matching Schema to Storage
What is a Measurement?
(Metric ID, Source)
(X, Y) => (Timestamp, Value)
Measurement CF
Example: Select measurements between times [T1, T2]:
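The wide-row layout and range select described above can be sketched with a plain in-memory model; the metric names, sources, and values below are purely illustrative, not Librato's actual schema:

```python
# Illustrative sketch of the Measurement CF: each (metric ID, source)
# pair keys a wide row whose columns map timestamps to values.
# All names and numbers here are made up for illustration.
measurements = {
    ("cpu.user", "web-1"): {       # row key: (metric ID, source)
        100: 0.42,                 # column: timestamp -> value
        160: 0.57,
        220: 0.31,
    },
}

# Select measurements between times [T1, T2]:
T1, T2 = 100, 200
row = measurements[("cpu.user", "web-1")]
selected = {t: v for t, v in sorted(row.items()) if T1 <= t <= T2}
print(selected)  # {100: 0.42, 160: 0.57}
```

Because columns within a row are sorted by timestamp, the real query is a contiguous column slice rather than a scan.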
Locating Rows
Let us calculate the maximum row size:
- 1-minute records
- 1-week TTL
- 7 days * 24 hours * 60 minutes => 10,080 columns
- 3 longs * 8 bytes * 10,080 => ~240 KB (not bad)
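The arithmetic above checks out directly; the three-longs-per-column layout is taken from the slide:

```python
# Verify the slide's max-row-size arithmetic: 1-minute records kept for
# a 1-week TTL, with each column assumed to hold 3 longs (per the slide).
columns_per_row = 7 * 24 * 60        # one week of 1-minute records
bytes_per_column = 3 * 8             # 3 longs * 8 bytes each
row_bytes = columns_per_row * bytes_per_column
print(columns_per_row)   # 10080
print(row_bytes)         # 241920 bytes, i.e. ~240 KB
```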
Row Storage Over Time
Seek All The SStables
Examining CF SSTables

nodetool cfhistograms Metrics metric_id_epochs_60

Metrics/metric_id_epochs_60 SSTables-per-read histogram:

Offset  SSTables
1          28821
2          58859
3         201198
4         178326
5         223016
6         154952
7          83289
8          21552
10         81104
Splitting the Rows
Retrieve time bases for times 31->45 for metric ID 12:

mget(Rows: [12, EBase_30], [12, EBase_40], Columns: {31->45})
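A hypothetical sketch of how a query range maps onto the split rows: rows are keyed by (metric ID, epoch base), and a time range expands to the set of bucketed row keys it spans. The bucket width of 10 is assumed only to match the slide's EBase_30/EBase_40 example:

```python
# Hypothetical row-splitting scheme: each row key is (metric_id, epoch_base),
# where epoch_base buckets time into fixed-width windows. The width of 10
# is an assumption chosen to reproduce the slide's example.
def row_keys_for_range(metric_id, t_start, t_end, base_width=10):
    """Return the (metric_id, epoch_base) row keys covering [t_start, t_end]."""
    first = (t_start // base_width) * base_width
    last = (t_end // base_width) * base_width
    return [(metric_id, base) for base in range(first, last + 1, base_width)]

print(row_keys_for_range(12, 31, 45))
# [(12, 30), (12, 40)] -> the EBase_30 and EBase_40 rows from the slide
```

Splitting rows this way bounds row size and lets reads touch only the SSTables holding the relevant time buckets.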
Examining CF SSTables

nodetool cfhistograms Metrics metric_id_epochs_60

Before:

Offset  SSTables
1          28821
2          58859
3         201198
4         178326
5         223016
6         154952
7          83289
8          21552
10         81104

After:

Offset  SSTables
1        3491820
2        5389762
3        4095760
4        1310741
5           9976
/graph me
Optimally Expiring Data
TTL Expiration
- Churn of about 750 GB / day
- 12 TB total
- ~6% of the data set per day
- gc_grace = 0
- STC (size-tiered compaction)
Synchronized Compactions
nodetool compact
* http://hight3ch.com/garbage-truck-crushing-a-car/
nodetool cleanup
Cleanup
- Not just for topology changes
- Tombstoned rows (no longer referenced)
- Rotated row keys decrease references
- Cons: must process every sstable
Immutable SStables
Leverage SStable Mod Time
- If now - mtime > TTL => all data in the sstable has expired
- We can quickly eliminate entire sstables:

  find -mtime +<TTL> -name '*.db' | xargs rm

- Fast and low overhead
- Cons: requires a rolling restart

26G 2013-05-17 09:44 Metrics-metrics_60-hf-7209-Data.db
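Because SSTables are immutable, the file's modification time bounds the age of its newest column; once that exceeds the TTL, every column inside has expired. A minimal Python sketch of that check, assuming a flat data directory of `*-Data.db` files and the 1-week TTL from the earlier example:

```python
# Hedged sketch of the mtime-based expiry check: an SSTable whose last
# modification is older than the TTL contains only expired data, so the
# whole file can be dropped. Directory layout and TTL are assumptions.
import os
import time

TTL_SECONDS = 7 * 24 * 3600  # 1-week TTL, as in the earlier example

def fully_expired_sstables(data_dir, ttl=TTL_SECONDS, now=None):
    """Yield paths of *-Data.db files whose mtime is older than the TTL."""
    now = time.time() if now is None else now
    for name in os.listdir(data_dir):
        if not name.endswith("-Data.db"):
            continue  # skip index/filter/statistics components
        path = os.path.join(data_dir, name)
        if now - os.path.getmtime(path) > ttl:
            yield path
```

As the slide notes, deleting the files out from under a running node still requires a rolling restart so Cassandra drops its references to them.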
nodetool setcompactionthreshold
Increasing minor compactions
- By default, STC requires a minimum of 4 sstables per compaction
- Leads to large non-compacted sstables
- Dropping the minimum to 2 can flatten storage growth:

  nodetool setcompactionthreshold <ks> <cf> 2

- Cons: CPU/IO increase
Result
Effective Monitoring
Ring Dashboards
Disk Errors => Throw Away
- If you ever see this, replace the node!

  end_request: I/O error, dev xvdb, sector 467940617

- Mark the node down, bootstrap a new one
- No metric for this?
Cassandra Log Volume
- Count log lines seen every 10 minutes
- Track over time
- Can identify:
  - Unbalanced workloads
  - Schema disagreements
  - Phantom gossip nodes
  - GC activity
- grep -v '.java' => exceptions
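The counting step above can be sketched in a few lines: bucket log lines into 10-minute windows and count them, so a spike in volume flags trouble even before you read the messages. The timestamp layout assumed here is a generic log4j-style format, not necessarily Librato's exact configuration:

```python
# Sketch of the log-volume counter: count Cassandra log lines per
# 10-minute window. The assumed line layout is log4j-style:
#   LEVEL [ThreadName] YYYY-MM-DD HH:MM:SS,ms message...
from collections import Counter
from datetime import datetime

def count_per_window(lines, window_minutes=10):
    """Return a Counter mapping window-start datetimes to line counts."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        try:
            ts = datetime.strptime(parts[2] + " " + parts[3].split(",")[0],
                                   "%Y-%m-%d %H:%M:%S")
        except (IndexError, ValueError):
            continue  # skip lines without a parseable timestamp
        bucket = ts.replace(minute=(ts.minute // window_minutes) * window_minutes,
                            second=0)
        counts[bucket] += 1
    return counts
```

Graphing these counts per node over time is what surfaces the unbalanced workloads and gossip anomalies listed above.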
Q & A
Mike Heffner
/mheffner