c* summit 2013: (re)-building the social grid for global telcos @ 1/10th the market cost by darshan...
DESCRIPTION
Darshan Rawal is leading the development of hybrid cloud-based messaging products for global Tier 1 telcos. Darshan has been working in Silicon Valley since 2000, building nimble, cost-effective products and services that handle millions of users and billions of transactions per day. Prior to Openwave Messaging, Darshan held engineering positions at SS8 Networks, Yahoo, DE Shaw, and yp.com, and has an M.S. in Software Engineering from Carnegie Mellon University.
TRANSCRIPT
Pushing Cassandra’s Boundaries
Darshan Rawal VP Engineering, Openwave Messaging Inc.
2 © 2013 Openwave Messaging | Confidential #Cassandra13
Agenda
• Introduction
• Our Cassandra Journey
• Spectrum of BIG Data challenges
• Cassandra Pivots
• Typical Cassandra Instance YoY change
• Cassandra Insights
• Conclusion
Openwave Messaging Customers
Universal Messaging Suite
Our Cassandra Journey – 3.5 years
Cassandra Under Fire - A Story
• Customer Emergency
  • Where: Major North American OWM customer
  • When: Q4 2012
  • What: File system corruption in legacy platform
  • Impact: All (~800K) accounts without mail access
• Resolution: A lab system goes live
• Metrics:
  • 20 minutes to upgrade RAM per Cassandra node
  • Runaway maintenance/compaction; solved via SSDs
  • 100% uptime
Spectrum of BIG Data Challenges
Cassandra Pivots
Atomic Batches – Client Side Impact
getConnection() batch_mutate(…) freeConnection()
getConnection() batch_mutate(…) freeConnection()
getConnection() batch_mutate(…) freeConnection()

getConnection() batch_mutate(…) batch_mutate(…) batch_mutate(…) freeConnection()

prepare_batch() getConnection() atomic_batch_mutate(…) freeConnection()

Cassandra 1.1.x
Cassandra 1.2.x
Application Optimization
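The three call sequences on this slide can be sketched as follows. This is a minimal illustration, not the Openwave client: `FakePool` and `FakeConnection` are hypothetical stand-ins that only count calls, and their method names mirror the slide rather than any real driver signature. The mapping of flows to labels is also an assumption: the first flow is read as the original Cassandra 1.1.x client, the second as the application-level optimization, and the third as the Cassandra 1.2.x atomic batch.

```python
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical connection; records calls instead of contacting a cluster."""

    def __init__(self, log):
        self._log = log

    def batch_mutate(self, mutations):
        self._log.append(("batch_mutate", len(mutations)))

    def atomic_batch_mutate(self, mutations):
        self._log.append(("atomic_batch_mutate", len(mutations)))


class FakePool:
    """Hypothetical pool that counts connection checkouts and server calls."""

    def __init__(self):
        self.checkouts = 0
        self.calls = []

    @contextmanager
    def connection(self):
        self.checkouts += 1              # stands in for getConnection()
        try:
            yield FakeConnection(self.calls)
        finally:
            pass                         # stands in for freeConnection()


def per_batch_connection(pool, batches):
    """Flow 1: one connection checkout per batch_mutate call."""
    for batch in batches:
        with pool.connection() as conn:
            conn.batch_mutate(batch)


def shared_connection(pool, batches):
    """Flow 2: one checkout, several batch_mutate calls on it."""
    with pool.connection() as conn:
        for batch in batches:
            conn.batch_mutate(batch)


def single_atomic_batch(pool, batches):
    """Flow 3: merge mutations client-side, then send one atomic batch."""
    merged = [m for batch in batches for m in batch]   # stands in for prepare_batch()
    with pool.connection() as conn:
        conn.atomic_batch_mutate(merged)


batches = [["m1", "m2"], ["m3"], ["m4", "m5"]]
for flow in (per_batch_connection, shared_connection, single_atomic_batch):
    pool = FakePool()
    flow(pool, batches)
    print(flow.__name__, "checkouts:", pool.checkouts, "calls:", len(pool.calls))
```

With three batches, the first flow performs three checkouts and three calls, the second one checkout and three calls, and the third one checkout and one call. The third flow moves the merging work to the client before the connection is taken.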
Typical Cassandra Instance - YoY Change
Cassandra Journey Insights
• It’s a new paradigm; it will take time and investment
• There is no free lunch; cool features have a price
• Sizing is all about IOPS; not all IOPS are equal
• Eventual consistency is a double-edged sword
• Adapt paradigms that don’t fit upfront
Cassandra Insights
Aspect               Insight
Replication Factor   Ratio of RF / ring size plays a crucial role in throughput. Linear growth as the ratio shrinks.
Tombstones           Needs effective tuning for delete-heavy applications. Refactor application-level soft deletes.
Sizing               Plan for the perfect storm: Compaction + N Failures + Recovery (especially for dense deployments).
Reliable Counters    Utilize client-side affinity.
Super Columns        Best avoided.
Client Interaction   Thundering herd issues due to backend GC.
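The replication-factor insight can be illustrated with some back-of-the-envelope arithmetic. This is a sketch under an idealized model that is not stated on the slide: writes are spread evenly across the ring, every write lands on RF replicas, and each node sustains a fixed write rate. The function names and the 10,000 writes/s figure are hypothetical.

```python
def per_node_write_share(replication_factor, ring_size):
    """Fraction of all cluster writes that each node must absorb (RF / N)."""
    if not 1 <= replication_factor <= ring_size:
        raise ValueError("expected 1 <= replication_factor <= ring_size")
    return replication_factor / ring_size


def cluster_write_capacity(per_node_writes_per_sec, replication_factor, ring_size):
    """Idealized client-visible write throughput: N * per-node rate / RF."""
    share = per_node_write_share(replication_factor, ring_size)
    return per_node_writes_per_sec / share


# Hypothetical per-node rate; RF held at 3 while the ring grows.
for nodes in (3, 6, 12, 24):
    share = per_node_write_share(3, nodes)
    capacity = cluster_write_capacity(10_000, 3, nodes)
    print(f"N={nodes:2d}  RF/N={share:.3f}  capacity={capacity:,.0f} writes/s")
```

Under these assumptions, halving RF / N (for example by doubling the ring at a fixed RF) doubles the write capacity, which matches the linear growth the table describes. Real clusters deviate from this because of compaction, repair, and uneven token ownership.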
In Retrospect
Current challenges @ Openwave Messaging