Transcript
Page 1: Cassandra@Coursera: AWS deploy and MySQL transition

Cassandra @ Coursera: Deploying in AWS, MySQL Transition

Daniel Chia @DanielJHChia

Software Engineer, Infrastructure

Page 2: Cassandra@Coursera: AWS deploy and MySQL transition

Overview

• Why Cassandra

• What goes into a good deployment

• MySQL → Cassandra transition experience

Page 3: Cassandra@Coursera: AWS deploy and MySQL transition
Page 4: Cassandra@Coursera: AWS deploy and MySQL transition

110 partners!

698 courses!

8.5 million learners

Page 5: Cassandra@Coursera: AWS deploy and MySQL transition

A Coursera Course

Page 6: Cassandra@Coursera: AWS deploy and MySQL transition
Page 7: Cassandra@Coursera: AWS deploy and MySQL transition
Page 8: Cassandra@Coursera: AWS deploy and MySQL transition

Your Final Project

This is your chance to apply the course concepts to real-world situations

Page 9: Cassandra@Coursera: AWS deploy and MySQL transition
Page 10: Cassandra@Coursera: AWS deploy and MySQL transition

Identity Verified Certificates

Page 11: Cassandra@Coursera: AWS deploy and MySQL transition

Technical

• 100% hosted on AWS

• Service-oriented architecture

• Mix of MySQL and Cassandra for persistence

Page 12: Cassandra@Coursera: AWS deploy and MySQL transition

What do we care about?

Page 13: Cassandra@Coursera: AWS deploy and MySQL transition

We care about…

• Availability

• Scalability

• Operational Ease

• Latency

• (Bonus) Multi-region writes

Page 14: Cassandra@Coursera: AWS deploy and MySQL transition

Availability matters

Page 15: Cassandra@Coursera: AWS deploy and MySQL transition
Page 16: Cassandra@Coursera: AWS deploy and MySQL transition

EBS Outage (2012)

Master us-east-1a

Slave us-east-1c

Page 17: Cassandra@Coursera: AWS deploy and MySQL transition

Scalability

Page 19: Cassandra@Coursera: AWS deploy and MySQL transition

Sharded by class

Machine 1: class1, class2, class3, class4, class5

Machine 2: class6, class7, class8, class9, class10

Machine 3: class11, class12, class13, class14, class15

Page 20: Cassandra@Coursera: AWS deploy and MySQL transition

New use-case

Uh-oh… doesn’t fit in existing sharding
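The shard-by-class layout can be sketched as a lookup from class to machine. This is only an illustration, not Coursera's routing code: `SHARDS`, `machine_for_class`, and `machines_for_cross_class_query` are hypothetical names, and the assignment simply mirrors the diagram. It shows why a query scoped to one class hits a single machine, while a use case that is not keyed by class has nothing to route on and must fan out to every shard.

```python
# Hypothetical shard map mirroring the diagram: five classes per machine.
SHARDS = {
    "machine1": ["class1", "class2", "class3", "class4", "class5"],
    "machine2": ["class6", "class7", "class8", "class9", "class10"],
    "machine3": ["class11", "class12", "class13", "class14", "class15"],
}

# Invert the map once so a class can be routed to its machine directly.
CLASS_TO_MACHINE = {
    cls: machine for machine, classes in SHARDS.items() for cls in classes
}


def machine_for_class(cls: str) -> str:
    """A query scoped to one class touches exactly one machine."""
    return CLASS_TO_MACHINE[cls]


def machines_for_cross_class_query() -> list[str]:
    """A query that is not scoped to a class has no shard key to route on,
    so it must be sent to every machine and merged by the caller."""
    return sorted(SHARDS)


print(machine_for_class("class7"))
print(machines_for_cross_class_query())
```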

Page 21: Cassandra@Coursera: AWS deploy and MySQL transition

We care about…

• Availability

• Scalability

• Operational Ease

• Performance

• (Bonus) Multi-region

Page 22: Cassandra@Coursera: AWS deploy and MySQL transition

So we decided to…

Try Cassandra!

Page 23: Cassandra@Coursera: AWS deploy and MySQL transition

Cassandra ≠ [database XYZ]

Page 24: Cassandra@Coursera: AWS deploy and MySQL transition

“But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.”

–Albert Einstein

Page 25: Cassandra@Coursera: AWS deploy and MySQL transition

Time to deploy Cassandra!

sudo apt-get install dse-full

Page 26: Cassandra@Coursera: AWS deploy and MySQL transition

A good deployment

Machine-level

Cluster-level

Page 27: Cassandra@Coursera: AWS deploy and MySQL transition

Picking a machine

• Disk

• IOPS… IOPS… IOPS

• Latency

Author: D-Kuru/Wikimedia Commons Licence: CC-BY-SA-3.0-AT

Page 28: Cassandra@Coursera: AWS deploy and MySQL transition

Picking a machine

• CPU

Author: Mark Sze Licence: CC BY-NC-ND 2.0

Page 29: Cassandra@Coursera: AWS deploy and MySQL transition

Picking a machine

• Memory

• Save some for page cache!

Author: brutalSoCal Licence: CC BY-NC-ND 2.0

Page 30: Cassandra@Coursera: AWS deploy and MySQL transition

On AWS

• Ephemeral disks.

• Please don’t use EBS. Really.

• IOPS usually the problem

• Instance sizes:

• spinning disk: m1.large, m1.xlarge, m2.4xlarge

• ssd: m3.xlarge, c3.2xlarge, i2.*

Page 31: Cassandra@Coursera: AWS deploy and MySQL transition

Set up the machine

• Lots of documentation / talks about this

• Recommended reading: DataStax guide [1]

[1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

Page 32: Cassandra@Coursera: AWS deploy and MySQL transition

Cluster configuration

A

C B

Page 33: Cassandra@Coursera: AWS deploy and MySQL transition

Priam

care and feeding of Cassandra on AWS

https://github.com/Netflix/Priam

Page 34: Cassandra@Coursera: AWS deploy and MySQL transition

Cluster Topology

• We use RF=3

• Ring balanced within datacenter

• Nodes alternate racks (or AZs)

Page 35: Cassandra@Coursera: AWS deploy and MySQL transition

Cluster Topology (Priam)

• Token assignments stored in a database

• Can take over a token in the event of node failure

Page 36: Cassandra@Coursera: AWS deploy and MySQL transition

Cluster Topology (Priam)

• Priam assigns tokens evenly per region

• Alternates AZs within region

Ring (diagram): az1, az2, az3, az1, az2, az3
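The two bullets above (even tokens, alternating AZs) can be sketched as follows. This is a simplified illustration, not Priam's actual algorithm. It assumes the Murmur3Partitioner token range, and `even_tokens`, `assign_azs`, and `replica_azs` are hypothetical helper names. The last helper takes RF consecutive ring positions, which is what rack-aware replica placement reduces to when racks alternate around the ring.

```python
# Murmur3Partitioner tokens span [-2**63, 2**63 - 1] (assumed partitioner).
MIN_TOKEN = -(2**63)
RING_SIZE = 2**64


def even_tokens(node_count: int) -> list[int]:
    """Space node_count tokens evenly around the ring."""
    step = RING_SIZE // node_count
    return [MIN_TOKEN + i * step for i in range(node_count)]


def assign_azs(node_count: int, azs: list[str]) -> list[str]:
    """Alternate AZs in ring order: az1, az2, az3, az1, ..."""
    return [azs[i % len(azs)] for i in range(node_count)]


def replica_azs(ring_azs: list[str], start: int, rf: int) -> list[str]:
    """AZs of the rf consecutive nodes starting at ring position start,
    wrapping around the end of the ring."""
    return [ring_azs[(start + k) % len(ring_azs)] for k in range(rf)]


tokens = even_tokens(6)
ring = assign_azs(6, ["az1", "az2", "az3"])
for token, az in zip(tokens, ring):
    print(token, az)

# With RF=3 and three alternating AZs, every replica set covers all AZs.
print(all(len(set(replica_azs(ring, s, 3))) == 3 for s in range(6)))
```

The property only holds when the node count is a multiple of the number of AZs, which is one reason to grow such a cluster in steps of one node per AZ.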

Page 37: Cassandra@Coursera: AWS deploy and MySQL transition

Autoscaling groups

• Recover from lost instance

• We don't use them for scaling with traffic

Page 38: Cassandra@Coursera: AWS deploy and MySQL transition

Important: Need one ASG per AZ

ASG size: 9

east-1a east-1a east-1a

east-1b east-1b east-1b

east-1c east-1c east-1c

Page 39: Cassandra@Coursera: AWS deploy and MySQL transition

Important: Need one ASG per AZ

ASG size: 9

east-1a east-1a east-1a

east-1b east-1b east-1b

east-1c east-1c

east-1b

Page 40: Cassandra@Coursera: AWS deploy and MySQL transition

Important: Need one ASG per AZ

ASG-1a (size: 3): east-1a east-1a east-1a

ASG-1b (size: 3): east-1b east-1b east-1b

ASG-1c (size: 3): east-1c east-1c east-1c
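The three diagrams can be read as a small model. This sketch does not call AWS. `replace_node` is a hypothetical helper that removes a lost node and records where its replacement is launched. The assumption, taken from the slides, is that a single group spanning all AZs only maintains a total instance count, so the replacement may land in a different AZ, whereas a per-AZ group can only launch in its own AZ.

```python
from collections import Counter


def replace_node(nodes: list[str], lost_az: str, launch_az: str) -> list[str]:
    """Remove one node in lost_az and launch a replacement in launch_az."""
    remaining = list(nodes)
    remaining.remove(lost_az)
    return remaining + [launch_az]


cluster = ["east-1a"] * 3 + ["east-1b"] * 3 + ["east-1c"] * 3

# One ASG of size 9 across all AZs: the replacement for a lost east-1c
# node may be launched in east-1b, leaving the ring unbalanced.
single_asg = replace_node(cluster, lost_az="east-1c", launch_az="east-1b")

# One ASG of size 3 per AZ: the east-1c group can only launch in east-1c.
per_az_asg = replace_node(cluster, lost_az="east-1c", launch_az="east-1c")

print(Counter(single_asg))
print(Counter(per_az_asg))
```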

Page 41: Cassandra@Coursera: AWS deploy and MySQL transition

Backups

• Data on ephemeral disks

• Guard against application errors

• SSTables immutable -> ship to S3

• Priam does this
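Because SSTables are immutable, a backup pass only has to ship files it has not shipped before. The sketch below shows that idea with plain sets. It is not Priam's implementation, the S3 upload itself is left out, and the function names and file names are made up for illustration.

```python
def sstables_to_upload(local: set[str], uploaded: set[str]) -> set[str]:
    """SSTable files are never modified after they are written, so a file
    name that is already in the backup never needs to be uploaded again."""
    return local - uploaded


def backup_pass(local: set[str], uploaded: set[str]) -> set[str]:
    """Return the backup contents after one pass. Files that compaction
    has since removed locally stay in the backup."""
    return uploaded | sstables_to_upload(local, uploaded)


# Illustrative file names only.
backup = backup_pass({"sstable-1", "sstable-2"}, set())
# Compaction merged 1 and 2 into 3, and a flush produced 4.
new_files = sstables_to_upload({"sstable-3", "sstable-4"}, backup)
print(sorted(new_files))
```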

Page 42: Cassandra@Coursera: AWS deploy and MySQL transition

Restore

• Have to be able to use your backup

• Also useful for QA / test

• Priam handles this rather nicely

Page 43: Cassandra@Coursera: AWS deploy and MySQL transition

Deployed!

Time to chill?

https://www.flickr.com/photos/spunkinator/2394514059 Creative Commons

Page 44: Cassandra@Coursera: AWS deploy and MySQL transition

Monitoring

working / not working doesn’t count.

Page 45: Cassandra@Coursera: AWS deploy and MySQL transition

We have our own custom reporter agent for Datadog.

There’s pluggable reporter support in 2.0.2 now.

Page 46: Cassandra@Coursera: AWS deploy and MySQL transition

JVM GC woes

Page 47: Cassandra@Coursera: AWS deploy and MySQL transition

JVM GC woes

All happy now

Page 48: Cassandra@Coursera: AWS deploy and MySQL transition

SSTables Read Histogram

Page 49: Cassandra@Coursera: AWS deploy and MySQL transition

Questions?

before we carry on

Page 50: Cassandra@Coursera: AWS deploy and MySQL transition
Page 51: Cassandra@Coursera: AWS deploy and MySQL transition

Transition takes

• time

• mindset shift

• expertise

• (some) risk

Page 52: Cassandra@Coursera: AWS deploy and MySQL transition

Our experience

• Pick one feature first

• Mindset shift

• Data modeling consulting

• Libraries / Patterns / Data-as-a-service

Page 53: Cassandra@Coursera: AWS deploy and MySQL transition

Pick one feature

• Don’t go all in on Cassandra with something important right away

• Work closely with that team

Page 54: Cassandra@Coursera: AWS deploy and MySQL transition

You probably will make mistakes

Oops!

Page 55: Cassandra@Coursera: AWS deploy and MySQL transition

Mindset shift

• Everyone knows SQL

• Not everyone knows Cassandra / NoSQL

• Need to know queries beforehand

Page 56: Cassandra@Coursera: AWS deploy and MySQL transition

Enrollment Example

• Learners enroll into a course

• learner (many-to-many) course

• Need to keep track of this membership

Page 57: Cassandra@Coursera: AWS deploy and MySQL transition

MySQL Model

CREATE TABLE `courses_learners` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `course_id` INT(11) NOT NULL,
  `learner_id` INT(11) NOT NULL,
  PRIMARY KEY (`id`),
  -- One row per (learner, course) pair
  UNIQUE KEY `c_l` (`learner_id`, `course_id`),
  -- The REFERENCES targets of both foreign keys are omitted on the slide
  CONSTRAINT `ref1` FOREIGN KEY (`course_id`),
  CONSTRAINT `ref2` FOREIGN KEY (`learner_id`)
);

Page 61: Cassandra@Coursera: AWS deploy and MySQL transition

Cassandra Style

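CREATE TABLE courses_by_learner (
  learner_id uuid,  -- partition key: one partition per learner
  course_id uuid,   -- clustering column: courses sorted within the partition
  PRIMARY KEY (learner_id, course_id)
);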
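"Need to know queries beforehand" can be made concrete with an in-memory model of this table. The sketch below is an assumption-laden illustration, not Coursera's schema. Each table is a dict from partition key to its clustering keys, strings stand in for the uuid columns, and `learners_by_course` is a hypothetical second table added to answer the reverse query. The point is that each query gets its own table and a write goes to both.

```python
from collections import defaultdict

# Each "table" maps a partition key to the clustering keys stored under it.
courses_by_learner: dict[str, set[str]] = defaultdict(set)
learners_by_course: dict[str, set[str]] = defaultdict(set)  # hypothetical


def enroll(learner_id: str, course_id: str) -> None:
    """Denormalized write: one insert per query table. Re-enrolling is a
    no-op, much like re-inserting the same primary key."""
    courses_by_learner[learner_id].add(course_id)
    learners_by_course[course_id].add(learner_id)


def courses_for(learner_id: str) -> list[str]:
    """Single-partition read, keyed by learner."""
    return sorted(courses_by_learner.get(learner_id, set()))


def learners_in(course_id: str) -> list[str]:
    """Single-partition read, keyed by course, served by the second table."""
    return sorted(learners_by_course.get(course_id, set()))


enroll("alice", "ml")
enroll("alice", "algorithms")
enroll("bob", "ml")
print(courses_for("alice"))
print(learners_in("ml"))
```

Without the second table, "who is in this course?" would have to scan every learner's partition, which is the kind of query the MySQL unique key and joins made easy to overlook.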

Page 62: Cassandra@Coursera: AWS deploy and MySQL transition

Data modeling consulting

• Build core team proficient at C* data modeling

• Available to consult for trickier use cases

Page 63: Cassandra@Coursera: AWS deploy and MySQL transition

Libraries / Patterns

• Abstract away simple (but common) use-cases

• Key-value storage

• Simple time series

• Maybe every developer won’t need deep C* knowledge?

• More radical: data as a service (e.g. STAASH)

STAASH: https://github.com/Netflix/staash
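As one example of what a "simple time series" helper might hide from application developers, the sketch below buckets a series by UTC day so that no single partition grows without bound. The slides do not describe how Coursera's library works. Day bucketing is a common pattern assumed here, and `day_bucket`, `partition_key`, and `partitions_for_range` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def day_bucket(ts: datetime) -> str:
    """UTC calendar day of a timezone-aware timestamp, e.g. '2014-06-01'."""
    return ts.astimezone(timezone.utc).date().isoformat()


def partition_key(series_id: str, ts: datetime) -> tuple[str, str]:
    """Writes for one series on one day share a partition."""
    return (series_id, day_bucket(ts))


def partitions_for_range(
    series_id: str, start: datetime, end: datetime
) -> list[tuple[str, str]]:
    """Partitions a reader must query to cover [start, end], in time order."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    keys = []
    while day <= last:
        keys.append((series_id, day.isoformat()))
        day += timedelta(days=1)
    return keys


t = datetime(2014, 6, 1, 23, 30, tzinfo=timezone.utc)
print(partition_key("cpu", t))
print(partitions_for_range("cpu", t, t + timedelta(hours=1)))
```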

Page 64: Cassandra@Coursera: AWS deploy and MySQL transition

It’s a long road

but we’ll get there…

Author: Carissa Rogers License: CC BY 2.0

Page 65: Cassandra@Coursera: AWS deploy and MySQL transition

Conclusion

• Know Cassandra

• Know what makes a good deployment

• Know that new skills have to be acquired

Page 66: Cassandra@Coursera: AWS deploy and MySQL transition

Questions?

We’re hiring! coursera.org/jobs

