DotNetToscana: NoSQL Revolution - Scalability


DESCRIPTION

http://www.dotnettoscana.org/nosql-revolution.aspx

TRANSCRIPT

Page 1: DotNetToscana: NoSQL Revolution - Scalability

Scalability

Luigi Berrettini – http://it.linkedin.com/in/luigiberrettini
Nicola Baldi – http://it.linkedin.com/in/nicolabaldi

15/12/2012

Page 2: DotNetToscana: NoSQL Revolution - Scalability

The need for speed


Page 3: DotNetToscana: NoSQL Revolution - Scalability

An increasing demand for performance

Companies continuously grow:
• more and more data and traffic
• more and more computing resources needed

SOLUTION: SCALING


Page 4: DotNetToscana: NoSQL Revolution - Scalability

Scaling strategies

Vertical scalability = scale up
• single server
• performance ⇒ more resources (CPUs, storage, memory)
• volumes increase ⇒ more difficult and expensive to scale
• not reliable: individual machine failures are common

Horizontal scalability = scale out
• cluster of servers
• performance ⇒ more servers
• cheaper hardware (more likely to fail)
• volumes increase ⇒ complexity ~ constant, costs ~ linear
• reliability: CAN operate despite failures
• complex: use only if benefits are compelling


Page 5: DotNetToscana: NoSQL Revolution - Scalability

Vertical scalability


Page 6: DotNetToscana: NoSQL Revolution - Scalability

Single server

All data on a single node

Use cases
• data usage = mostly processing aggregates
• many graph databases

Pros/Cons
• RDBMSs or NoSQL databases
• simplest and most often recommended option
• only vertical scalability


Page 7: DotNetToscana: NoSQL Revolution - Scalability

Architectures and distribution models

Horizontal scalability


Page 8: DotNetToscana: NoSQL Revolution - Scalability

Scale out architectures (1)

Shared everything
• every node has access to all data
• all nodes share memory and disk storage
• used on some RDBMSs


Page 9: DotNetToscana: NoSQL Revolution - Scalability


Scale out architectures (2)


Shared disk
• every node has access to all data
• all nodes share disk storage
• used on some RDBMSs

Page 10: DotNetToscana: NoSQL Revolution - Scalability


Scale out architectures (3)


Shared nothing
• nodes are independent and self-sufficient
• no shared memory or disk storage
• used on some RDBMSs and all NoSQL databases

Page 11: DotNetToscana: NoSQL Revolution - Scalability

Shared nothing distribution models

Sharding: different data put on different nodes

Replication: same data copied over multiple nodes

Sharding + replication: the two orthogonal techniques combined


Page 12: DotNetToscana: NoSQL Revolution - Scalability

Sharding (1)

Different parts of the data onto different nodes
• data accessed together (aggregates) are on the same node
• clumps arranged by physical location, to keep load even, or according to any domain-specific access rule

[Diagram: three shards, each handling reads and writes for its own subset of the data (A F H, B E G, C D I)]
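As a rough illustration of the idea, a client (or a routing layer) can map each key to a shard, for example by hashing; the node names and the hash-based rule below are purely hypothetical, and real stores usually handle this routing for you:

    import hashlib

    # Hypothetical shard layout: three nodes, each owning part of the key space
    SHARDS = ["shard-0", "shard-1", "shard-2"]

    def shard_for(key: str) -> str:
        """Route a key to a shard by hashing it; choosing the shard key along
        aggregate boundaries keeps data accessed together on the same node."""
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return SHARDS[int(digest, 16) % len(SHARDS)]

    # Every record belonging to customer 42 lands on the same node
    print(shard_for("customer:42"))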

Page 13: DotNetToscana: NoSQL Revolution - Scalability

Sharding (2)

Use cases
• different people access different parts of the dataset
• need to horizontally scale writes

Pros/Cons
• “manual” sharding is possible with every RDBMS or NoSQL store
• better read performance
• better write performance
• low resilience: when a node fails, its data becomes unavailable (all other shards stay available)
• high licensing costs for RDBMSs
• difficult or impossible cluster-level operations (querying, transactions, consistency controls)


Page 14: DotNetToscana: NoSQL Revolution - Scalability

Master-slave replication (1)

Data replicated across multiple nodes

One designated master (primary) node
• contains the original
• processes writes and passes them on

All other nodes are slaves (secondary)
• contain the copies
• are synchronized with the master during a replication process


Page 15: DotNetToscana: NoSQL Revolution - Scalability

Master-slave replication (2)


[Diagram: master-slave replication – one master handling reads and writes, two slaves handling reads only, all holding the same data (A B C)]

Page 16: DotNetToscana: NoSQL Revolution - Scalability

Master-slave replication (3)

Use cases
• load balancing cluster: data usage mostly read-intensive
• failover cluster: single server with hot backup

Pros/Cons
• better read performance
• worse write performance (write management)
• high read (slave) resilience: master failure ⇒ slaves can still handle read requests
• low write (master) resilience: master failure ⇒ no writes until the old/new master is up
• read inconsistencies: update not propagated to all slaves
• master = bottleneck and single point of failure
• high licensing costs for RDBMSs

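A minimal sketch of how a client might route operations under master-slave replication; the connection strings below are hypothetical, and most drivers expose an equivalent read-preference setting:

    import random

    # Hypothetical cluster: one master accepts writes, slaves serve reads
    MASTER = "db-master:5432"
    SLAVES = ["db-slave-1:5432", "db-slave-2:5432"]

    def node_for(operation: str) -> str:
        """Send writes to the master and spread reads across the slaves;
        a read from a slave may return data not yet replicated (stale)."""
        if operation == "write":
            return MASTER
        return random.choice(SLAVES)

    print(node_for("write"))  # always the master
    print(node_for("read"))   # one of the slaves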

Page 17: DotNetToscana: NoSQL Revolution - Scalability


Peer-to-peer / multi-master replication (1)


Data replicated across multiple nodes

All nodes are peers (equal weight): no master, no slaves

All nodes can both read and write

Page 18: DotNetToscana: NoSQL Revolution - Scalability

Peer-to-peer / multi-master replication (2)


[Diagram: peer-to-peer replication – three peers, each handling reads and writes, all holding the same data (A B C)]

Page 19: DotNetToscana: NoSQL Revolution - Scalability

Peer-to-peer / multi-master replication (3)

Use cases
• load balancing cluster: data usage read/write-intensive
• need to scale out more easily

Pros/Cons
• better read performance
• better write performance
• high resilience: node failure ⇒ reads/writes handled by other nodes
• read inconsistencies: update not propagated to all nodes
• write inconsistencies: same record written on different nodes at the same time
• high licensing costs for RDBMSs

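A toy in-memory illustration (not a real store) of the write-inconsistency risk listed above: two peers accept concurrent writes to the same record and diverge until replication reconciles them:

    # Each peer holds its own copy of the same record
    peer_1 = {"user:1": "Alice"}
    peer_2 = {"user:1": "Alice"}

    peer_1["user:1"] = "Alicia"   # client A updates the record on peer 1
    peer_2["user:1"] = "Alyce"    # client B updates it on peer 2 at the same time

    # Until the peers exchange updates and resolve the conflict, they disagree:
    # this is a write-write conflict
    print(peer_1["user:1"], peer_2["user:1"])  # Alicia Alyce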

Page 20: DotNetToscana: NoSQL Revolution - Scalability

Sharding + replication

Sharding + master-slave replication
• multiple masters, but each data item has a single master
• node configurations:
  • master
  • slave
  • master for some data / slave for other data

Sharding + peer-to-peer replication


Page 21: DotNetToscana: NoSQL Revolution - Scalability

Sharding + master-slave replication


[Diagram: sharding + master-slave replication – each shard (A F H, B E G, C D I) has one master handling reads and writes and one slave handling reads only; a single node can act as master for one shard and slave for another]

Page 22: DotNetToscana: NoSQL Revolution - Scalability

Sharding + peer-to-peer replication


[Diagram: sharding + peer-to-peer replication – each shard (A F H, B E G, C D I) is replicated across several peers, every peer handling reads and writes for the shards it hosts]

Page 23: DotNetToscana: NoSQL Revolution - Scalability

Scaling out on RDBMSs (1)


Oracle Database
• Oracle RAC: shared everything

Microsoft SQL Server
• all editions: shared nothing, master-slave replication

IBM DB2
• DB2 pureScale: shared disk
• DB2 HADR: shared nothing, master-slave replication (failover cluster)

Page 24: DotNetToscana: NoSQL Revolution - Scalability

Scaling out on RDBMSs (2)


Oracle MySQL
• MySQL Cluster: shared nothing, sharding, replication, sharding + replication

PostgreSQL (The PostgreSQL Global Development Group)
• PGCluster-II: shared disk
• Postgres-XC: shared nothing, sharding, replication, sharding + replication

Page 25: DotNetToscana: NoSQL Revolution - Scalability

Consistency

Horizontal scalability


Page 26: DotNetToscana: NoSQL Revolution - Scalability


Inconsistencies due to concurrency

Inconsistent write = write-write conflict: multiple writes of the same data at the same time (highly likely with peer-to-peer replication)

Inconsistent read = read-write conflict: a read in the middle of someone else’s write


Page 27: DotNetToscana: NoSQL Revolution - Scalability


Write consistency

Pessimistic approach: prevent conflicts from occurring

Optimistic approach: detect conflicts and fix them


Page 28: DotNetToscana: NoSQL Revolution - Scalability

Pessimistic approach

Implementation
• write locks ⇒ acquire a lock before updating a value (only one lock at a time can be taken; see the sketch below)

Pros/Cons
• often severely degrades system responsiveness
• often leads to deadlocks (hard to prevent/debug)
• relies on a consistent serialization of the updates*

* sequential consistency: ensuring that all nodes apply operations in the same order

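A minimal sketch of the pessimistic approach on a toy in-memory store (per-key locks, not any particular database API):

    import threading

    # Toy in-memory store with one write lock per key
    store = {"account:1": 100}
    locks = {"account:1": threading.Lock()}

    def locked_update(key: str, delta: int) -> None:
        """Acquire the write lock before updating, so only one writer at a
        time can touch the value; other writers block until it is released."""
        with locks[key]:
            store[key] += delta

    locked_update("account:1", -50)
    print(store["account:1"])  # 50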

Page 29: DotNetToscana: NoSQL Revolution - Scalability

Optimistic approach


Implementation
• conditional updates ⇒ test a value before updating it, to see whether it has changed since the last read (see the sketch below)
• merged updates ⇒ save the conflicting updates, record the conflict and merge them somehow

Pros/Cons
• conditional updates rely on a consistent serialization of the updates*

* sequential consistency: ensuring that all nodes apply operations in the same order
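A minimal sketch of a conditional update on a toy in-memory store that keeps a version number next to each value (real stores expose the same idea as compare-and-set or check-and-set operations):

    # Toy store: key -> (value, version)
    store = {"account:1": (100, 0)}

    def conditional_update(key, new_value, expected_version):
        """Apply the write only if the version is unchanged since the last
        read; otherwise report a conflict so the caller can retry or merge."""
        value, version = store[key]
        if version != expected_version:
            return False                       # someone else wrote in between
        store[key] = (new_value, version + 1)
        return True

    value, version = store["account:1"]
    print(conditional_update("account:1", value - 50, version))  # True
    print(conditional_update("account:1", value - 10, version))  # False: stale version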

Page 30: DotNetToscana: NoSQL Revolution - Scalability

Read consistency


Logical consistency: different data make sense together

Replication consistency: same data ⇒ same value on different replicas

Read-your-writes consistency: users continue seeing their updates

Page 31: DotNetToscana: NoSQL Revolution - Scalability

Logical consistency

ACID transactions ⇒ aggregate-ignorant DBs

Partially atomic updates ⇒ aggregate-oriented DBs
• atomic updates within an aggregate
• no atomic updates between aggregates
• updates of multiple aggregates: inconsistency window
• replication can lengthen inconsistency windows


Page 32: DotNetToscana: NoSQL Revolution - Scalability

Replication consistency

Eventual consistency

• nodes may have replication inconsistencies: stale (out of date) data
• eventually all nodes will be synchronized


Page 33: DotNetToscana: NoSQL Revolution - Scalability

Read-your-writes consistency

Session consistency
• within a user’s session there is read-your-writes consistency (no stale data is read from a node after an update on another one)
• consistency is lost if
  • the session ends
  • the system is accessed simultaneously from different PCs
• implementations
  • sticky session / session affinity = sessions tied to one node: affects load balancing and is quite intricate with master-slave replication
  • version stamps: track the latest version stamp seen by a session and ensure that all interactions with the data store include it (see the sketch below)

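A toy sketch of the version-stamp idea: the session remembers the highest stamp it has seen and rejects reads from replicas that are still behind it (node names and the in-memory replicas are purely illustrative):

    # Each replica holds (value, version stamp) for the same key
    replicas = {"node-a": ("v1", 1), "node-b": ("v1", 1)}
    session_last_seen = 0   # highest stamp this session has written or read

    def write(node: str, value: str, stamp: int) -> None:
        global session_last_seen
        replicas[node] = (value, stamp)
        session_last_seen = stamp

    def read(node: str) -> str:
        """Read-your-writes: refuse a replica older than what the session saw."""
        value, stamp = replicas[node]
        if stamp < session_last_seen:
            raise RuntimeError("replica is stale for this session, read elsewhere")
        return value

    write("node-a", "v2", 2)   # the update has reached node-a only
    print(read("node-a"))      # v2
    print(read("node-b"))      # raises: node-b has not caught up yet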

Page 34: DotNetToscana: NoSQL Revolution - Scalability

CAP theorem

Horizontal scalability


Page 35: DotNetToscana: NoSQL Revolution - Scalability


Definitions

Consistency
all nodes see the same data at the same time

Latency
the response time in interactions between nodes

Availability
• every nonfailing node must reply to requests
• the limit of latency that we are prepared to tolerate: once latency gets too high, we give up and treat data as unavailable

Partition tolerance
the cluster can survive communication breakages (separating it into partitions unable to communicate with each other)


Page 36: DotNetToscana: NoSQL Revolution - Scalability


ACID (1)

Transaction to transfer $50 from account A to account B:
1) read(A)
2) A = A – 50
3) write(A)
4) read(B)
5) B = B + 50
6) write(B)

Atomicity
• transaction fails after 3 and before 6 ⇒ the system should ensure that its updates are not reflected in the database

Consistency
• A + B is unchanged by the execution of the transaction

Page 37: DotNetToscana: NoSQL Revolution - Scalability


ACID (2)

Transaction to transfer $50 from account A to account B:
1) read(A)
2) A = A – 50
3) write(A)
4) read(B)
5) B = B + 50
6) write(B)

Isolation
• another transaction will see inconsistent data between 3 and 6 (A + B will be less than it should be)
• isolation can be ensured trivially by running transactions serially ⇒ performance issue

Durability
• user notified that the transaction completed ($50 transferred) ⇒ transaction updates must persist despite failures
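A minimal runnable version of the same transfer as a single ACID transaction, using SQLite purely as a convenient example engine:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])

    try:
        # The context manager commits the transfer as a unit, or rolls it back
        # entirely if any statement fails (atomicity)
        with conn:
            conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 'A'")
            conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 'B'")
    except sqlite3.Error:
        pass  # after a rollback neither update is visible

    # A + B is unchanged by the transaction (consistency)
    print(conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0])  # 100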

Page 38: DotNetToscana: NoSQL Revolution - Scalability

BASE


Basically Available
Soft state
Eventually consistent

Soft state and eventual consistency are techniques that work well in the presence of partitions and thus promote availability

Page 39: DotNetToscana: NoSQL Revolution - Scalability


CAP theorem (Brewer, Gilbert, Lynch)

Given the three properties of Consistency, Availability and Partition tolerance, you can only get two


Page 40: DotNetToscana: NoSQL Revolution - Scalability


Single server system: CA

C: being up and keeping consistency is reasonable

A: one node, if it’s up it’s available

P: a single machine can’t partition


Page 41: DotNetToscana: NoSQL Revolution - Scalability


Two-node cluster: AP


AP (no C): a partition ⇒ an update on one node = inconsistency

Page 42: DotNetToscana: NoSQL Revolution - Scalability


Two-node cluster: CP


CP (no A): a partition ⇒ consistency only if one nonfailing node stops replying to requests

Page 43: DotNetToscana: NoSQL Revolution - Scalability


Two-node cluster: CA


CA (no P)
• while the nodes can communicate ⇒ C and A can be preserved
• on a partition ⇒ all nodes in one partition must be turned off (failing nodes preserve A): difficult and expensive

Page 44: DotNetToscana: NoSQL Revolution - Scalability


It is all about trading off (1)

ACID databases: focus on consistency first and availability second

BASE databases: focus on availability first and consistency second


Page 45: DotNetToscana: NoSQL Revolution - Scalability

It is all about trading off (2)

Single server
• no partitions
• consistency versus performance: relaxed isolation levels or no transactions

Cluster
• consistency versus latency/availability
• durability versus performance (e.g. in-memory DBs)
• durability versus latency (e.g. the master acknowledges the update to the client only after having been acknowledged by some slaves)


Page 46: DotNetToscana: NoSQL Revolution - Scalability


Master-slave replication and strong consistency

strong write consistency ⇒ write to the master

strong read consistency ⇒ read from the master


Page 47: DotNetToscana: NoSQL Revolution - Scalability

Peer-to-peer replication and strong consistency

N = replication factor (nodes involved in replication, NOT nodes in the cluster)

W = nodes confirming a write
R = nodes needed for a consistent read

write quorum: W > N/2
read quorum: R + W > N

Consistency is on a per-operation basis

Choose the most appropriate combination of problems and advantages
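A small sketch of the two quorum conditions as plain arithmetic checks, with W and R chosen per operation as the slide notes:

    def write_quorum_ok(w: int, n: int) -> bool:
        """A write is safe from conflicting concurrent writes if W > N/2."""
        return w > n / 2

    def read_quorum_ok(r: int, w: int, n: int) -> bool:
        """A read is consistent if it must overlap every write quorum: R + W > N."""
        return r + w > n

    N = 3  # replication factor
    # W = 2, R = 2: both quorums hold, so reads and writes are strongly consistent
    print(write_quorum_ok(2, N), read_quorum_ok(2, 2, N))  # True True

    # W = 1 makes writes faster, but a consistent read then needs R = 3
    print(write_quorum_ok(1, N), read_quorum_ok(3, 1, N))  # False True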