Logic and Lattices for Distributed Programming

Neil Conway
UC Berkeley

Joint work with: Peter Alvaro, Peter Bailis, David Maier, Bill Marczak, Joe Hellerstein, Sriram Srinivasan

Basho Chats #004
June 27, 2012

Programming

Distributed Programming

Dealing with Disorder

Introduce order
– Paxos, ZooKeeper, Two-Phase Commit, …
– “Strong Consistency”

Tolerate disorder
– Correct behavior in the face of many possible network orders
– Typical goal: replicas converge to the same final state
  • “Eventual Consistency”

Eventual Consistency

Popular
Hard to program

Help developers build reliable programs on top of eventual consistency

This Talk

1. Theory
– CRDTs, Lattices, and CALM

2. Practice
– Programming with Lattices
– Case Study: KVS

[Figure: the Students set is stored at two replicas, both initially {Alice, Bob}. Client0 and Client1 each read {Alice, Bob}; one client then writes {Alice, Bob, Carol} and the other writes {Alice, Bob, Dave}, so the replicas now disagree.]

How to resolve?

Problem: Replicas perceive different event orders

Goal: Same final state at all replicas

Solution: Commutative operations (“merge functions”)

[Figure: Client0 and Client1 write to different replicas; after merging, both replicas hold Students = {Alice, Bob, Carol, Dave}.]

Merge = Set Union
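A minimal plain-Ruby sketch of the merge shown on this slide. It does not use Bloom; the method name `merge` and the variable names are illustrative assumptions, and the replica contents are the two conflicting writes from the earlier example.

```ruby
require 'set'

# Merge function for a replicated set: plain set union.
def merge(a, b)
  a | b
end

replica0 = Set['Alice', 'Bob', 'Carol']  # state after one client's write
replica1 = Set['Alice', 'Bob', 'Dave']   # state after the other client's write

# Union is commutative, so both replicas reach the same state
# no matter which one receives the other's update first.
merged01 = merge(replica0, replica1)
merged10 = merge(replica1, replica0)

puts merged01.to_a.sort.inspect
puts merged01 == merged10
```

Neither write is lost, and neither replica needs to know the order in which the writes were issued.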

Commutative Operations

• Used by Dynamo, Riak, Bayou, etc.
• Formalized as CRDTs: Convergent and Commutative Replicated Data Types
  – Shapiro et al., INRIA (2009-2012)
  – Based on join semilattices
  – Commutative, associative, idempotent
• Practical libraries: Statebox, Knockbox
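The three properties listed above can be spot-checked for simple merge functions. The sketch below is plain Ruby, not part of any CRDT library; the names `MERGES`, `SAMPLES`, and `semilattice_laws_hold?` and the sample values are assumptions made for illustration. Checking a handful of samples is a sanity test, not a proof.

```ruby
require 'set'

# Each merge function is the join of a simple lattice.
MERGES = {
  set_union:   ->(a, b) { a | b },
  integer_max: ->(a, b) { [a, b].max },
  boolean_or:  ->(a, b) { a || b }
}

# Hypothetical sample values, chosen only for illustration.
SAMPLES = {
  set_union:   [Set[1], Set[2, 3], Set[1, 3]],
  integer_max: [3, 7, 5],
  boolean_or:  [false, true, false]
}

# Checks the three semilattice laws over all sample triples.
def semilattice_laws_hold?(merge, values)
  values.product(values, values).all? do |a, b, c|
    merge.(a, b) == merge.(b, a) &&                          # commutative
      merge.(merge.(a, b), c) == merge.(a, merge.(b, c)) &&  # associative
      merge.(a, a) == a                                      # idempotent
  end
end

RESULTS = MERGES.map { |name, merge| [name, semilattice_laws_hold?(merge, SAMPLES[name])] }.to_h
RESULTS.each { |name, ok| puts "#{name}: #{ok}" }
```

An operation such as integer addition fails the idempotence check, which is why a replica that may receive the same update twice cannot use it as a merge function directly.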

[Figure: three lattices whose values grow over time]
– Set (Union): “growth” means larger sets
– Integer (Max): “growth” means larger numbers
– Boolean (Or): “growth” means false → true

[Figure: two replicas, each holding Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}.
– One client reads Students = {Alice, Bob, Carol, Dave}, reads Teams = {<Alice, Bob>}, and writes Teams = {<Alice, Bob>, <Carol, Dave>}.
– The other client issues Remove: {Dave}, leaving Students = {Alice, Bob, Carol} at its replica.
– After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>, <Carol, Dave>}.]

[Figure: the same starting state, with a different schedule.
– One client issues Remove: {Dave}, and replica synchronization happens first: both replicas hold Students = {Alice, Bob, Carol}.
– The other client then reads Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}, so no new team is written.
– Both replicas end with Teams = {<Alice, Bob>}.]

Nondeterministic Outcome!
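The two schedules above can be replayed in a few lines of plain Ruby. This is a simplified sketch against a single logical store rather than two replicas; `form_teams` and `run` are hypothetical helpers, and pairing students in sorted order is an assumption standing in for whatever the client actually does.

```ruby
require 'set'

# Pairs up students in sorted order; stands in for the client that
# reads Students and then writes Teams.
def form_teams(students)
  students.to_a.sort.each_slice(2).select { |t| t.size == 2 }.to_set
end

# Replays the two events in the given order against one logical store.
def run(order)
  students = Set['Alice', 'Bob', 'Carol', 'Dave']
  teams = Set[%w[Alice Bob]]
  order.each do |event|
    case event
    when :remove_dave then students -= ['Dave']   # non-monotone: the set shrinks
    when :form_teams  then teams |= form_teams(students)
    end
  end
  teams
end

teams_a = run(%i[form_teams remove_dave])
teams_b = run(%i[remove_dave form_teams])
puts teams_a.to_a.inspect
puts teams_b.to_a.inspect
```

The final Teams value depends on the order of the two events even though each collection is merged with set union; the removal is what breaks order independence.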

Possible Solution: Wrap both replicated values in a single complex CRDT

Goal: Compose larger application using “safe” mappings between simple lattices

[Figure: lattice values growing over time, connected by monotone functions]
Set (merge = Union) → size() → Integer (merge = Max) → >= 5 → Boolean (merge = Or)
– size(): monotone function from set → max
– >= 5: monotone function from max → boolean
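A plain-Ruby sketch of this chain of monotone functions, without Bloom. The names `THRESHOLD`, `size_of`, and `at_least?` and the voter IDs are illustrative assumptions; only the threshold value 5 comes from the slide.

```ruby
require 'set'

THRESHOLD = 5  # taken from the "size() >= 5" label on the slide

# set -> max: the size of a growing set never decreases.
def size_of(set)
  set.size
end

# max -> boolean: once the threshold is crossed, it stays crossed.
def at_least?(count, threshold)
  count >= threshold
end

votes = Set.new
history = []
%w[a b c b d e f].each do |voter|
  votes |= [voter]                                 # merge = union
  history << at_least?(size_of(votes), THRESHOLD)
end

puts history.inspect
```

The recorded outputs move from false to true once and never back, which is the “growth” direction of the boolean lattice.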

Monotonicity in Practice

“The more you know, the more you know”

Never retract previous outputs (“mistake-free”)

Typical patterns:
• immutable data
• accumulate knowledge over time
• threshold tests (“if” w/o “else”)

Monotonicity and Determinism

Agents strictly learn more knowledge over time

Monotone: different learning order, same final outcome

Result: Program is deterministic!
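The claim can be checked by brute force on a small case. In this plain-Ruby sketch, `deliver` is a hypothetical helper that accumulates messages into a set and applies a threshold test; the message names and the threshold of 3 are assumptions chosen for illustration.

```ruby
require 'set'

# Monotone pipeline: accumulate messages into a set (merge = union),
# then apply a threshold test to the set's size.
def deliver(messages, threshold)
  seen = Set.new
  messages.each { |m| seen |= [m] }
  [seen.to_a.sort, seen.size >= threshold]
end

messages = %w[v1 v2 v3 v2]   # the duplicate models message redelivery

# Try every delivery order and collect the distinct outcomes.
outcomes = messages.permutation.map { |order| deliver(order, 3) }.uniq
puts outcomes.size
```

All 24 delivery orders collapse to a single outcome, including the orders in which the duplicate arrives first.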

A program is confluent if it produces the same results regardless of network nondeterminism

Consistency As Logical Monotonicity

CALM Analysis

1. All monotone programs are confluent
2. Simple syntactic test for monotonicity

Result: Simple static analysis for eventual consistency

Handling Non-Monotonicity

… is not the focus of this talk

Basic choices:
1. Nodes agree on an event order using a coordination protocol (e.g., Paxos)
2. Allow non-deterministic outcomes
   • If needed, compensate and apologize

Putting It Into Practice

What we’d like:
• Collection of agents
• No shared state (→ message passing)
• Computation over arbitrary lattices

Bloom

Organization  | Collection of agents
Communication | Message passing
State         | Relations (sets)
Computation   | Relational rules over sets (Datalog, SQL)

              | Bloom                                     | BloomL
Organization  | Collection of agents                      | Collection of agents
Communication | Message passing                           | Message passing
State         | Relations (sets)                          | Lattices
Computation   | Relational rules over sets (Datalog, SQL) | Functions over lattices

Quorum Vote in BloomL

require 'bud'

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

# Annotated Ruby class
class QuorumVote
  include Bud

  # Program state
  state do
    # Communication interfaces
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Lattice state declarations
    lset  :votes
    lmax  :vote_cnt
    lbool :got_quorum
  end

  # Program logic
  bloom do
    # Accumulate votes into set (<= applies the merge function for the set lattice)
    votes      <= vote_chn {|v| v.voter_id}
    # Map set → max
    vote_cnt   <= votes.size
    # Map max → bool
    got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
    # Threshold test on bool
    result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
  end
end

Builtin Lattices

Name  | Description                     | ⊥         | a ⊔ b     | Sample Monotone Functions
lbool | Threshold test                  | false     | a ∨ b     | when_true() → v
lmax  | Increasing number               | −∞        | max(a, b) | gt(n) → lbool; +(n) → lmax; -(n) → lmax
lmin  | Decreasing number               | ∞         | min(a, b) | lt(n) → lbool
lset  | Set of values                   | ∅         | a ∪ b     | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax
lpset | Non-negative set                | ∅         | a ∪ b     | sum() → lmax
lbag  | Multiset of values              | ∅         | a ∪ b     | mult(v) → lmax; +(lbag) → lbag
lmap  | Map from keys to lattice values | empty map |           | at(v) → any-lat; intersect(lmap) → lmap

Case Study

Goal: Provably eventually consistent key-value store (KVS)

Assumption: Map keys to lattice values (i.e., values do not decrease)

Solution: Use a map lattice
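One way to picture a map lattice is as a hash whose merge takes the union of the keys and merges the values key by key. The plain-Ruby sketch below is an illustration, not Bloom's `lmap` implementation; `map_merge`, the key names, and the choice of sets as the nested lattice are assumptions.

```ruby
require 'set'

# Merge for a map lattice: take the union of the keys and merge the
# values of keys present on both sides with the value lattice's own
# merge function (passed in as a block).
def map_merge(a, b, &value_merge)
  a.merge(b) { |_key, x, y| value_merge.call(x, y) }
end

union = ->(x, y) { x | y }

replica1 = { 'cart:1' => Set['apple'], 'cart:2' => Set['pear'] }
replica2 = { 'cart:1' => Set['apple', 'kiwi'], 'cart:3' => Set['plum'] }

synced = map_merge(replica1, replica2, &union)
puts synced.keys.sort.inspect
puts synced['cart:1'].to_a.sort.inspect
```

Adding a key and growing the value under an existing key both move the map upward in its lattice order, so replica synchronization is a single merge.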

[Figure sequence: the state of Replica 1 and Replica 2 over time]
1. Nested lattice value
2. Add new K/V pair
3. “Grow” value in extant K/V pair
4. Replica Synchronization

Goal: Provably eventually consistent KVS that stores arbitrary values

Solution: Assign a version to each key-value pair

Each replica stores increasing versions, not increasing values

Object Versions in Dynamo/Riak

1. Each KV pair has a vector clock version

2. Given two versions of a KV pair, prefer the one with the strictly greater version

3. If versions are incomparable, invoke user-defined merge function

Vector Clock: Map from node IDs → logical clocks
Solution: Use a map lattice

Logical Clock: Increasing counter
Solution: Use an increasing-int lattice
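Read this way, a vector clock is a map lattice whose values are increasing integers. The following plain-Ruby sketch assumes a Hash from node ID to counter; `vc_merge`, `vc_leq?`, and the node IDs are hypothetical names, and treating a missing entry as 0 is an assumption.

```ruby
# Vector clock as a map lattice whose values are increasing integers:
# merge takes the key-wise maximum.
def vc_merge(a, b)
  a.merge(b) { |_node, x, y| [x, y].max }
end

# a <= b in the lattice order iff every entry of a is <= the matching
# entry of b (a missing entry counts as 0).
def vc_leq?(a, b)
  a.all? { |node, count| count <= b.fetch(node, 0) }
end

vc1 = { 'r1' => 2, 'r2' => 0 }
vc2 = { 'r1' => 1, 'r2' => 1 }
merged = vc_merge(vc1, vc2)

puts vc_leq?(vc1, vc2)   # false
puts vc_leq?(vc2, vc1)   # false: the two clocks are incomparable
puts vc_leq?(vc1, merged) && vc_leq?(vc2, merged)
```

The merged clock is an upper bound of both inputs, which is what lets incomparable versions be reconciled.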

Version-Value Pairs

Pair = <fst, snd>

Pair merge(Pair o) {
  if self.fst > o.fst:
    self      # our version is strictly greater: keep our pair
  elsif self.fst < o.fst:
    o         # their version is strictly greater: take their pair
  else
    # equal or incomparable versions: merge both components
    new Pair(self.fst.merge(o.fst), self.snd.merge(o.snd))
}
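A runnable plain-Ruby rendering of the pseudocode above, as a sketch under stated assumptions: the version (`fst`) is a vector clock stored as a Hash, the value (`snd`) is a Set merged by union, and the helpers `leq?` and `dominates?` are hypothetical names that do not come from Bloom.

```ruby
require 'set'

# Version-value pair. The version is a vector clock (Hash of node ID
# to counter); the value is a Set, standing in for any lattice value.
Pair = Struct.new(:fst, :snd) do
  def merge(other)
    if dominates?(fst, other.fst)
      self
    elsif dominates?(other.fst, fst)
      other
    else
      # Equal or incomparable versions: merge both components.
      Pair.new(fst.merge(other.fst) { |_n, x, y| [x, y].max }, snd | other.snd)
    end
  end

  private

  # a <= b entry by entry (a missing entry counts as 0).
  def leq?(a, b)
    a.all? { |node, count| count <= b.fetch(node, 0) }
  end

  # True iff version a is strictly greater than version b.
  def dominates?(a, b)
    leq?(b, a) && !leq?(a, b)
  end
end

a = Pair.new({ 'r1' => 2 }, Set['x'])
b = Pair.new({ 'r1' => 1 }, Set['y'])
c = Pair.new({ 'r1' => 1, 'r2' => 1 }, Set['z'])

newer  = a.merge(b)   # a's version dominates, so a wins outright
merged = a.merge(c)   # concurrent versions: merge clock and value
puts newer.snd.to_a.inspect
puts merged.snd.to_a.sort.inspect
```

A dominated value is discarded rather than merged, which is how the pair can store values that do not themselves grow over time.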

[Figure sequence: the state of Replica 1 and Replica 2 over time, starting from an uncaptioned initial state]
1. Version increase; NOT value increase
2. R1’s version replaces R2’s version
3. New version @ R2
4. Concurrent writes!
5. Merge VC (automatically), value merge via user’s lattice (as in Dynamo)

Lattice Composition in KVS

Conclusion

Dealing with EC: Many event orders → order-independent (disorderly) programs

Lattices → Disorderly state
Monotone Functions → Disorderly computation
Monotone Bloom → Lattices + monotone functions for safe distributed programming

Questions Welcome

Please try Bloom!

http://www.bloom-lang.org

Or:

gem install bud

Backup Slides

Lattices

⟨S, ⊔, ⊥⟩ is a bounded join semi-lattice iff:
– S is a partially ordered set
– ⊔ is a binary operator (“least upper bound”)
  • For all x, y ∈ S, x ⊔ y = z where x ≤S z, y ≤S z, and there is no other upper bound z′ ≠ z of x and y in S such that z′ ≤S z
  • Associative, commutative, and idempotent
– ⊥ is the “least” element in S (∀x ∈ S: ⊥ ⊔ x = x)

Example: increasing integers
– S = ℤ, ⊔ = max, ⊥ = −∞

Monotone Functions

f : S → T is a monotone function iff
∀a, b ∈ S : a ≤S b ⇒ f(a) ≤T f(b)

Example: size(Set) → Increasing-Int
size({A, B}) = 2
size({A, B, C}) = 3
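The definition can be spot-checked on sample sets ordered by containment. In this plain-Ruby sketch, `monotone_on?` is a hypothetical helper and the sample sets are assumptions; passing on a finite sample is evidence, not a proof of monotonicity.

```ruby
require 'set'

# Checks f(a) <= f(b) for every sample pair with a <= b
# (here the order on sets is "a is a subset of b").
def monotone_on?(sets)
  sets.product(sets).all? do |a, b|
    !a.subset?(b) || yield(a) <= yield(b)
  end
end

samples = [Set[], Set['A'], Set['A', 'B'], Set['A', 'B', 'C']]

# size: Set -> Increasing-Int grows as the set grows.
size_ok = monotone_on?(samples) { |s| s.size }

# An emptiness indicator drops from 1 to 0 when an element arrives.
empty_ok = monotone_on?(samples) { |s| s.empty? ? 1 : 0 }

puts size_ok
puts empty_ok
```

The second function fails because its output has to be retracted once the set gains an element, which is the behavior monotone programs rule out.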

From Datalog → Lattices

                     | Datalog (Bloom)                          | BloomL
State                | Relations                                | Lattices
Example Values       | [[“red”, 1], [“green”, 2]]               | set: [“red”, “green”]; map: {“red” => 1, “green” => 2}; counter: 5; condition: false
Computation          | Rules over relations                     | Functions over lattices
Monotone Computation | Monotone rules                           | Monotone functions
Program Semantics    | Fixpoint of rules (stratified semantics) | Fixpoint of functions (stratified semantics)

Bloom Operational Model

Quorum Vote in Bloom

require 'bud'

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

class QuorumVote
  include Bud

  state do
    # Communication
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Persistent Storage
    table :votes, [:voter_id]
    # Transient Storage
    scratch :cnt, [] => [:cnt]
  end

  bloom do
    # Accumulate votes
    votes      <= vote_chn {|v| [v.voter_id]}
    # Not (set) monotonic! The count aggregate replaces its earlier output.
    cnt        <= votes.group(nil, count(:voter_id))
    # Send message when quorum reached (compare the tuple's cnt column)
    result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
  end
end
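Why the count is not set-monotonic can be shown without Bloom. In this plain-Ruby sketch, `cnt_relation` is a hypothetical helper that models the `cnt` collection as a set holding one tuple; the vote IDs are assumptions.

```ruby
require 'set'

# Models the `cnt` relation from the program above: a set holding
# one tuple with the current number of votes.
def cnt_relation(votes)
  Set[[votes.size]]
end

votes = Set['v1', 'v2']
before = cnt_relation(votes)      # contains the tuple [2]
votes |= ['v3']
after = cnt_relation(votes)       # contains the tuple [3]

# Under set containment, a monotone output would only gain tuples.
puts before.subset?(after)                      # false: the tuple [2] was retracted

# Viewed as a max lattice instead, the same count only grows.
puts before.first.first <= after.first.first    # true
```

This is the gap the lattice version of the program closes: the same count is monotone once it is treated as an increasing integer rather than as a relation.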

Current Status

Writeups
• BloomL: UCB Tech Report
• Bloom/CALM: CIDR’11, website

Lattice Runtime
• Available as a git branch
• To be merged soon-ish

Examples, Case Studies
• KVS
• Shopping carts
• Causal delivery
Under development:
• MDCC, concurrent editing