conflict-free replicated data types in practice...con ict-free replicated data types in practice...

38
Conflict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11 th January, 2017 HASLab/INESC TEC & University of Minho InfoBlender

Upload: others

Post on 16-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Conflict-free Replicated Data Types

in Practice

Georges Younes Vitor Enes

Wednesday 11th January, 2017

HASLab/INESC TEC & University of Minho

InfoBlender

Page 2: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Motivation

Page 3: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background

Page 4: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: CAP Theorem

Brewer’s Conjecture

2000 Eric Brewer, PoDC Conference Keynote

2002 Seth Gilbert and Nancy Lynch, ACM SIGACT News 33(2)

Of three properties of shared-data

system - data Consistency, system

Availability and tolerance to network

Partitions - only two can be achievedat any given moment in time.

1

Page 5: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: CAP Theorem

source: Lourenco et al. Journal of Big Data (2015) 2:18 DOI 10.1186/s40537-015-0025-0

2

Page 6: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

“ The best thing about being me... There are

so many ’me’s ”

-Agent Smith3

Page 7: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: Data Replication

Data Replication: Maintaining multiple data copies on separate machines.

• Improves Availability

• Allows access when some replicas are not available

• Improves Performance

• reduced latency: let users access nearby replicas

• increased throughput: let multiple machines serve the data

Pessimistic

• Provide single-copy consistency

• Block access to a replica unless it

is provably up to date

• Perform well in LANs (small

latencies, uncommon failures)

• Not in Wide Area Networks

Optimistic

• Provide eventual consistency

• Let access to a replica without a

priori synchronization

• Updates propagated in bg,

occasional conflicts resolved later

• Offer many advantages over their

pessimistic counterparts 4

Page 8: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: Eventual Consistency (EC)

5

Page 9: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: Eventual Consistency (EC)

6

Page 10: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: Eventual Consistency (EC)

7

Page 11: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Background: Eventual Consistency (EC)

At the same moment, two users can see the sametweet with different number of favorites 8

Page 12: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Conflict-free Replicated Data

Types

Page 13: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Conflict-free Replicated Data Types

CRDTs

• Conflict-free: resolves conflicts automatically converging to the same

state

• Replicated: multiple independent copies

• Data Types: Registers, Counters, Maps, Sequences, Sets,

Graphs..etc

CRDT Flavors

There are two flavors of CRDTs:

• Operation-based: broadcasts update operations to other replicas

• State-based: propagates the local state to other replicas

9

Page 14: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Op-based CRDTs

Figure 1: Operation based replication

For each update operation do:

• Prepare: Calculate the downstream (the effect observed locally)

• Apply the effect locally

• Disseminate (reliable causal broadcast) the calculated effect to be

applied on all other replicas10

Page 15: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Op-based (Counter :: N)

replica A

SA = 0

SA = update(inc,SA) = 0 + 1 = 1

m1 = inc

sendB(m1)

on receiveR(m2):

SA = update(dec ,SA) = 1− 1 = 0

Op-based (Counter :: N)

replica B

SB = 0

SB = update(dec,SB) = SB − 1 = −1

m2 = inc

sendA(m2)

on receiveR(m1):

SB = update(inc,SB) = −1 + 1 = 0

What if m1 is lost?

What if m2 is duplicated?

11

Page 16: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

How do solve these problems?

TRCB: Tagged Reliable Causal Broadcast

• Tags each operation with a unique timestamp (VC)

• Delivery of operations respecting causal order

• Reliable bcast: at-least-once to prevent message loss

• Reliable bcast: at-most-once to prevent message duplication

Are all data types commutative?

• Counters are commutative: incr ; decr ; == decr ; incr ;

• Not all data types are (In fact most are NOT)

• A Set S1 with add(element,Set) and rmv(element,Set) operation is

not

• add(a,S1); rmv(a,S1); value(S1) = {}• rmv(a,S1); add(a,S1); value(S1) = {a}• How do we solve concurrent operations? 12

Page 17: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

13

Page 18: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

14

Page 19: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

15

Page 20: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

16

Page 21: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

17

Page 22: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

POLog: Partially Ordered Log

Querying the Add-Wins-Set (AWSet)

• An element v is in the set if there is an addv operation in the set

that is not succeeded by rmvv operation

• v |(t, addv ) ∈ POLog ∧ @(t ′, rmvv ) ∈ POLog .t → t ′

So does the POLog keep growing?

• In the op-based CRDT model, the POLog keeps growing

• Pure op-based CRDT model was introduced to:

• Apply GC on the POLog

• Reduces the message size

• Provides a more generic API

18

Page 23: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Pure op-based in Redis

Figure 2: Pure op-based Architecture in Redis

19

Page 24: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Operation-based (GCounter :: N)

A 1inc // 2

m1=inc

��

inc // 3

m2=inc

��B 1 2 3

State-based (GCounter :: I ↪→ N)

A {B1}inc // {A1,B1}

m1={A1,B1}

$$

inc // {A2,B1}

m2={A2,B1}

$$B {B1} {A1,B1} {A2,B1}

What if m1 is lost? (monotonicity)

What if m2 is duplicated? (idempotence)

What if m1 and m2 are reordered? (commutativity)

20

Page 25: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Why are state-based CRDTs so cool?

State-based CRDT

A state-based CRDT is a join-semilattice (S ,v,t) where S is a poset,

v its partial order, and t a binary join operator that derives the least

upper bound for every two elements of S .

∀s, t, u ∈ S :

• s t s = s (idempotence)

• s t t = t t s (commutativity)

• s t (t t u) = (s t t) t u (associativity)

$ $ full state transmission $ $

21

Page 26: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Avoiding full state transmission

How we can we avoid replica A sending the full state to

replica B?

Two fundamental problems

A knows something about B:

• B has Sold

• A has Snew and knows that B has Sold

• Goal: compute delta d of minimum size s.t. Snew = Sold t d

A knows nothing about B:

• B has Sold

• A has Snew

• Goal: protocol that minimizes communication cost

22

Page 27: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Solving the first problem

23

Page 28: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Solving the first problem: Delta-CRDTs

State-based (GSet :: P(E ))

A {} add x // {x}

{x}��

add y // {x , y}{x,y}

{x , y , z}

B {} {x} {x , y} add z // {x , y , z}

{x,y ,z}

;;

Delta-state-based

A {} add x // {x}

{x}��

add y // {x , y}{y}

{x , y , z}

B {} {x} {x , y} add z // {x , y , z}

{z}

;;

24

Page 29: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Solving the second problem

25

Page 30: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Solving the second problem: Join Decompositions

State-driven

A {a}add x,y// {a, x , y} {a, x , y , z}

{x,y}

$$B {a} add z // {a, z}

{a,z}

::

{a, x , y , z}

Digest-driven

A {a}add x,y// {a, x , y} {a, x , y}

({x,y},dB )

$$

{a, x , y , z}

B {a} add z // {a, z}

dB

<<

{a, x , y , z}

{z}

::

26

Page 31: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Who uses that?

Page 32: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

27

Page 33: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Lasp

• Lasp is a language for distributed, eventually consistent

computations ⇒ CRDTs

• from basho/riak dt to lasp/types

• lasp/types:

• State-based CRDTs

• Delta-based CRDTs

• Pure-op-based CRDTs

28

Page 34: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

GCounter state transmission

0

5

10

15

20

25

30

35

40

45

state delta state delta state delta state delta

GB

Tra

nsm

itte

d

(Client Number)

Advertisement Impression Counter

State

1024512256128

29

Page 35: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

LDB & LSim

LDB

• benchmarking platform for CRDTs

• lasp/types + replication

• State-based CRDTs

• Delta-based CRDTs

• Delta-based CRDTs + Join Decompositions

• Pure-op-based CRDTs

LSim

• LDB + Peer Services + Workloads

• experiments in DC/OS (Apache Mesos + Marathon)

30

Page 36: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

GSet state transmission

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0 20 40 60 80 100 120 140 160 180 200

MB

Tra

nsm

itte

d

Time in Seconds

State - RingState - HyParView

State - Erdos RenyiDecompositions - HyParView

Decompositions - Erdos RenyiDecompositions - Ring

31

Page 37: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Challenges

1. Log aggregation

2. Time

• Time-based graphs (e.g. x axis is time)

• Total-effort graphs demands equal total-time across all runs

3. Bugs

• LDFI

• IronFleet

• Jepsen

• . . .

4. $$ (On-Demand vs Spot Instances)

32

Page 38: Conflict-free Replicated Data Types in Practice...Con ict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11th January, 2017 HASLab/INESC TEC & University

Questions?

bit.ly/crdts-infoblender

32