DataStax: Relational Scaling and the Temple of Gloom

Luke Tillman Technical Evangelist at DataStax (@LukeTillman)

Upload: datastax-academy

Post on 22-Jan-2018

Luke Tillman, Technical Evangelist at DataStax (@LukeTillman)

• Evangelist with a focus on Developers

• Long-time Developer on RDBMS (lots of .NET)

Who are you?!

The Good ol' Relational Database

© 2015. All Rights Reserved. 3

First proposed in 1970

The Relational Database Makes us Feel Good

SQL is ubiquitous and allows flexible querying

Data modeling is well understood (3NF or higher)

ACID guarantees make us feel good

Building and Scaling our Applications

I'm getting too old for this…

Scaling Up

All these JOINs are Killing Us

SELECT array_agg(players), player_teams
FROM (
    SELECT DISTINCT t1.t1player AS players, t1.player_teams
    FROM (
        SELECT p.playerid AS t1id,
               concat(p.playerid, ':', p.playername, ' ') AS t1player,
               array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
        FROM player p
        LEFT JOIN plays pl ON p.playerid = pl.playerid
        GROUP BY p.playerid, p.playername
    ) t1
    INNER JOIN (
        SELECT p.playerid AS t2id,
               array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
        FROM player p
        LEFT JOIN plays pl ON p.playerid = pl.playerid
        GROUP BY p.playerid, p.playername
    ) t2 ON t1.player_teams = t2.player_teams AND t1.t1id <> t2.t2id
) innerQuery
GROUP BY player_teams

SELECT * FROM denormalized_view

Let's Denormalize!

But I thought data modeling was 3NF or higher?! There can be only one!

Read Replication

[Diagram labels: Client, Users Data, Primary, Replica 1, Replica 2, Write Requests, Read Requests]

Replication Lag

Consistent results? Nope, now eventually consistent

Replication speed is limited by the speed of light
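
The stale read that replication lag produces can be sketched in a few lines. This is a toy model under stated assumptions, not any particular database's replication protocol: the `Primary` and `Replica` classes and the `apply_pending` method are hypothetical names invented for illustration.

```python
class Replica:
    """Toy read replica that applies changes only when told to."""

    def __init__(self):
        self.data = {}
        self.pending = []  # changes shipped by the primary but not yet applied

    def apply_pending(self):
        # Simulates the replica catching up after some lag.
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Toy primary that accepts writes and ships them asynchronously."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            replica.pending.append((key, value))  # shipped, not yet applied


replica = Replica()
primary = Primary([replica])
primary.write("user:1", "Luke")

stale = replica.read("user:1")  # replica has not caught up yet
replica.apply_pending()
fresh = replica.read("user:1")  # replica has now caught up
print(stale, fresh)
```

Between the write and the call to `apply_pending`, a client reading from the replica sees the old state, which is the "eventually consistent" behavior the slide warns about.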

Sharding

[Diagram labels: Client, Router, Users Data shards (A-F, G-M, N-T, U-Z)]

Queries that aren't on the shard key require scatter-gather

Resharding can be a painful, manual process
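
A minimal sketch of the router idea, assuming the shard key is the first letter of a username and using the A-F, G-M, N-T, U-Z ranges from the slide. The function names, the `city` field, and the in-memory dictionaries standing in for separate database servers are all hypothetical.

```python
import bisect

# Inclusive upper bound of each shard's letter range: A-F, G-M, N-T, U-Z.
SHARD_UPPER_BOUNDS = ["F", "M", "T", "Z"]
SHARD_NAMES = ["A-F", "G-M", "N-T", "U-Z"]

# One in-memory dict per shard stands in for a separate database server.
shards = {name: {} for name in SHARD_NAMES}


def shard_for(username):
    """Route on the shard key: the first letter of the username."""
    first = username[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"no shard for username {username!r}")
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, first)]


def insert_user(username, city):
    shards[shard_for(username)][username] = {"name": username, "city": city}


def find_by_name(username):
    """Query on the shard key: exactly one shard is consulted."""
    return shards[shard_for(username)].get(username)


def find_by_city(city):
    """Query off the shard key: scatter to every shard, then gather."""
    matches = []
    for shard in shards.values():  # every shard must be asked
        matches.extend(u["name"] for u in shard.values() if u["city"] == city)
    return sorted(matches)


insert_user("alice", "Austin")
insert_user("hank", "Boston")
insert_user("tina", "Austin")
insert_user("zed", "Austin")

print(shard_for("hank"))
print(find_by_name("tina"))
print(find_by_city("Austin"))
```

Changing the ranges in `SHARD_UPPER_BOUNDS` would send existing users to different shards, so their rows would have to be moved first, which hints at why resharding is painful.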

Replication for Availability

[Diagram labels: Client, Users Data, Failover Process, Monitor, Failover]

Failover takes time. How long are you offline while it's happening?

And while you're offline...

Putting it All Together

[Diagram labels: Client, Router, Users Data shards (A-F, G-M, N-T, U-Z), Failover Process, Monitor, Failover, Replication Lag]

Sharding and Replication (and probably Denormalization)

What is Cassandra?

A linearly scaling and fault tolerant distributed database

• Data spread over many nodes

• All nodes participate in a cluster

• All nodes are equal

• No SPOF (shared nothing)

• Run on commodity hardware

• Have more data? Add more nodes.

• Need more throughput? Add more nodes.

• Nodes down != Database Down

• Datacenter down != Database Down

• No middle of the night phone calls

Multi Datacenter with Cassandra

[Diagram labels: Client, America, Zamunda]

Fault Tolerance in Cassandra

You have the power to control fault tolerance in Cassandra

Replication Factor

How many copies of the data should exist?

[Diagram: Client writes "Beetlejuice" with RF=3; "Beetlejuice" appears on three nodes]
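
To make "how many copies" concrete, here is a toy sketch of placing RF copies of a key on distinct nodes by hashing the key and walking around a ring of nodes. It is an illustration only: the node names and the `token` and `replicas_for` helpers are hypothetical, and real Cassandra placement depends on the configured partitioner and replication strategy.

```python
import hashlib

# Hypothetical six-node cluster, listed in ring order.
NODES = ["node1", "node2", "node3", "node4", "node5", "node6"]


def token(key):
    """Map a key to a large integer deterministically (toy partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, nodes, rf):
    """Pick rf distinct nodes: the key's home node plus the next ones on the ring."""
    if rf > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    start = token(key) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


owners = replicas_for("Beetlejuice", NODES, rf=3)
print(owners)  # three distinct nodes, the same three on every run
```

With RF=3, any two of the three owning nodes can be down and a copy of the data still exists.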

Consistency Level

How many replicas do we need to hear from before we acknowledge?

[Diagram labels: CL=ONE; Client; Copy #1, Copy #2, Copy #3]

[Diagram labels: CL=QUORUM; Client; Copy #1, Copy #2, Copy #3]

Consistency Levels and Speed

Use a lower consistency level like ONE to get faster reads and writes

Consistency Levels and Eventual Consistency

Use a higher consistency level like QUORUM if you don’t want to be surprised by data from the past (stale data)
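
The trade-off between ONE and QUORUM comes down to simple arithmetic. A quorum is a majority of the replicas (RF // 2 + 1), and a read is guaranteed to overlap the latest acknowledged write when the replicas written plus the replicas read exceed RF. The sketch below only does that arithmetic; the helper names are hypothetical and nothing here talks to a real cluster.

```python
def quorum(rf):
    """Majority of the replicas: 2 of 3, 3 of 5, and so on."""
    return rf // 2 + 1


def acks_required(consistency_level, rf):
    """How many replicas must respond before the coordinator acknowledges."""
    levels = {"ONE": 1, "QUORUM": quorum(rf)}
    return levels[consistency_level]


def read_sees_latest_write(write_cl, read_cl, rf):
    """True when the write set and read set must share at least one replica."""
    return acks_required(write_cl, rf) + acks_required(read_cl, rf) > rf


RF = 3
for write_cl, read_cl in [("ONE", "ONE"), ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]:
    overlap = read_sees_latest_write(write_cl, read_cl, RF)
    print(f"write={write_cl:<6} read={read_cl:<6} overlap guaranteed: {overlap}")
```

With RF=3, only the QUORUM/QUORUM pairing guarantees overlap (2 + 2 > 3). ONE/ONE waits on fewer replicas, so it is faster, but a read may land on a replica that has not yet received the write.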

Before you get too excited...

Cassandra is not...

• A Data Ocean, Lake, or Pond

• An In-Memory Database

• A Queue

• A magical database luck dragon that will solve all your database use cases and problems

How bad of an idea?

Actually a 90's movie

Actually a 70's movie

Why Arnold?! Why?

Cassandra is good when...

• Uptime is a top priority

• You have unpredictable or high scaling requirements

• The workload is transactional (i.e. OLTP not OLAP)

• You are willing to put the time and effort into understanding how it works and how to use it

Movie References (in order of appearance)

Leap of Faith (1992)

Patton (1970)

The Aristocats (1970)

When Harry Met Sally (1989)

Beverly Hills Cop (1984)

Lethal Weapon (1987)

Big (1988)

Trading Places (1983)

Highlander (1986)

Spaceballs (1987)

Rain Man (1988)

Ghostbusters (1984)

Gremlins (1984)

Star Trek II: The Wrath of Khan (1982)

Star Wars Episode VI: Return of the Jedi (1983)

Weekend at Bernie's (1989)

Coming to America (1988)

Masters of the Universe (1987)

Beetlejuice (1988)

The Goonies (1985)

Top Gun (1986)

Back to the Future (1985)

Footloose (1984)

The NeverEnding Story (1984)

Batman and Robin (1997)

Star Wars (1977)

Twins (1988)

The Karate Kid (1984)

Find me on Twitter: @LukeTillman

Thank You