Post on 25-Feb-2016

Page 1: Logic and Lattices for Distributed Programming

Logic and Lattices for Distributed Programming

Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein
UC Berkeley

David Maier
Portland State University

Page 2: Logic and Lattices for Distributed Programming

Distributed Programming: Key Challenges

• Asynchrony
• Partial Failure

Page 3: Logic and Lattices for Distributed Programming
Page 4: Logic and Lattices for Distributed Programming

Dealing with Disorder

Enforce global order
– Paxos, Two-Phase Commit, GCS, …
– “Strong Consistency”

Tolerate disorder
– Programmer must ensure correct behavior for many possible network orders
– “Eventual Consistency”
  • Typical goal: replicas converge to same final state

Page 6: Logic and Lattices for Distributed Programming

Goal: Make it easier to write programs on top of eventual consistency

Page 7: Logic and Lattices for Distributed Programming

This Talk

1. Prior Work
   – Convergent Modules (CRDTs)
   – Monotonic Logic (CALM)
2. BloomL
3. Case Study

Page 8: Logic and Lattices for Distributed Programming

[Diagram: two replicas each hold Students = {Alice, Bob}. Client0 and Client1 each read {Alice, Bob}. One client then writes {Alice, Bob, Carol} and the other writes {Alice, Bob, Dave}, so one replica holds Students = {Alice, Bob, Carol} and the other holds Students = {Alice, Bob, Dave}.]

How to resolve?

Page 9: Logic and Lattices for Distributed Programming

Problem: Replicas perceive different event orders

Goal: Same final state at all replicas

Solution: Use commutative operations (“merge functions”)

Page 10: Logic and Lattices for Distributed Programming

[Diagram: after merging the writes from Client0 and Client1, both replicas hold Students = {Alice, Bob, Carol, Dave}.]

Merge = Set Union
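As a minimal sketch of why set union works as a merge function, the plain-Ruby snippet below applies the two writes in both orders and checks that the replicas end in the same state. The `merge` helper and the variable names are illustrative assumptions, not part of the talk.

```ruby
require 'set'

# Hypothetical helper: merge a replica's state with an incoming write.
def merge(state, update)
  state | update   # set union: associative, commutative, idempotent
end

initial = Set["Alice", "Bob"]
write_a = Set["Alice", "Bob", "Carol"]
write_b = Set["Alice", "Bob", "Dave"]

# Each replica sees the two writes in a different order.
replica0 = merge(merge(initial, write_a), write_b)
replica1 = merge(merge(initial, write_b), write_a)

# Redelivering a write changes nothing (idempotence).
replica1 = merge(replica1, write_a)

puts replica0 == replica1          # true
puts replica0.to_a.sort.inspect    # ["Alice", "Bob", "Carol", "Dave"]
```

Because union is commutative and idempotent, neither the delivery order nor duplicate delivery affects the final state.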

Page 11: Logic and Lattices for Distributed Programming

Commutative Operations

• Common design pattern
• Formalized as CRDTs: Convergent and Commutative Replicated Data Types
  – Shapiro et al., INRIA (2009-2012)
  – Based on join semilattices

Page 12: Logic and Lattices for Distributed Programming

Lattices

$\langle S, \sqcup, \bot \rangle$ is a bounded join semilattice iff:
– $S$ is a set
– $\sqcup$ is a binary operator (“least upper bound”)
  • Associative, commutative, and idempotent
  • Induces a partial order on $S$: $x \le_S y$ if $x \sqcup y = y$
  • Informally, “merge function” for elements of $S$
– $\bot$ is the “least” element in $S$
  • $\forall x \in S: \bot \sqcup x = x$
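The definition can be checked mechanically on small samples. The sketch below is an illustration only: the lambdas stand in for three familiar lattices (set union, integer max, boolean or) and are not BloomL's implementation. It brute-forces the four laws over a handful of elements.

```ruby
require 'set'

# Each lattice is described by its merge (least upper bound) and bottom element.
# These entries are illustrative stand-ins chosen for this sketch.
LATTICES = {
  set:  { lub: ->(a, b) { a | b },      bottom: Set.new,          samples: [Set[1], Set[1, 2], Set[3]] },
  max:  { lub: ->(a, b) { [a, b].max }, bottom: -Float::INFINITY, samples: [1, 5, 3] },
  bool: { lub: ->(a, b) { a || b },     bottom: false,            samples: [false, true] }
}

def lattice_laws_hold?(lub, bottom, samples)
  samples.product(samples, samples).all? do |x, y, z|
    lub.(x, y) == lub.(y, x) &&                      # commutative
      lub.(lub.(x, y), z) == lub.(x, lub.(y, z)) &&  # associative
      lub.(x, x) == x &&                             # idempotent
      lub.(bottom, x) == x                           # bottom is the least element
  end
end

RESULTS = LATTICES.transform_values do |l|
  lattice_laws_hold?(l[:lub], l[:bottom], l[:samples])
end
puts RESULTS.inspect
```

A finite check like this is not a proof, but it catches a merge function that is accidentally order-sensitive, such as list concatenation.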

Page 13: Logic and Lattices for Distributed Programming

[Figure: three example lattices, each growing over time: Set (LUB = Union), Increasing Integer (LUB = Max), Boolean (LUB = Or).]

Page 14: Logic and Lattices for Distributed Programming

[Diagram: Client0 and Client1 work against two replicas, each initially holding Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}. One client reads Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}, then writes Teams = {<Alice, Bob>, <Carol, Dave>}. The other client removes Dave, leaving Students = {Alice, Bob, Carol}. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>, <Carol, Dave>}.]

Page 15: Logic and Lattices for Distributed Programming

[Diagram: the same setup, but the removal of Dave is visible before the read. The client reads Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}, and Teams is left unchanged. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}.]

Nondeterministic Outcome!

Page 16: Logic and Lattices for Distributed Programming

Problem: Composition of CRDTs can result in non-determinism

Page 17: Logic and Lattices for Distributed Programming

Possible Solution: Encapsulate all distributed state in a single CRDT

• Hard to design, verify, and test
• Doesn’t scale with application size

Page 18: Logic and Lattices for Distributed Programming

Goal: Design a language that allows safe composition of CRDTs

Page 19: Logic and Lattices for Distributed Programming

Solution: … Datalog?

• Concurrent work: distributed programming using Datalog
  – P2 (2006-2010)
  – Bloom (2010-2012)
• Monotonic logic: building block for convergent distributed programs

Page 20: Logic and Lattices for Distributed Programming

Monotonic Logic
• As input set grows, output set does not shrink
  – “Retraction-free”
• Order independent
• e.g., map, filter, join, union, intersection

Non-Monotonic Logic
• New inputs might retract previous outputs
• Order sensitive
• e.g., aggregation, negation
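To make the contrast concrete, here is a small plain-Ruby sketch. The grade data and the `passing` and `ungraded` helpers are invented for illustration. A filter over a growing input never retracts an output, while a negation ("students with no grade yet") does.

```ruby
require 'set'

# Monotone: a filter over a growing input set never retracts an output.
def passing(grades)
  grades.select { |_name, score| score >= 60 }.map(&:first).to_set
end

# Non-monotone: negation. "Students with no grade yet" can shrink as grades arrive.
def ungraded(students, grades)
  students - grades.map(&:first).to_set
end

students = Set["Alice", "Bob", "Carol"]
early    = Set[["Alice", 90]]
later    = early | Set[["Bob", 40], ["Carol", 75]]   # the input only grows

puts passing(early).subset?(passing(later))                        # true: output only grows
puts ungraded(students, early).subset?(ungraded(students, later))  # false: earlier outputs were retracted
```

An agent that acted on the early answer from `ungraded` would have acted on facts that later inputs contradict, which is why non-monotonic logic is order sensitive.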

Page 21: Logic and Lattices for Distributed Programming

Monotonicity and Determinism

• Agents learn strictly more knowledge over time
• Different learning order, same final outcome

Result: Program is deterministic!

Page 22: Logic and Lattices for Distributed Programming

CALM Analysis (Consistency As Logical Monotonicity)

1. All monotone programs are deterministic
2. Simple syntactic test for monotonicity

Result: Whole-program static analysis for eventual consistency

Page 23: Logic and Lattices for Distributed Programming

Problem: CALM only applies to programs over growing sets

• Version Numbers
• Timestamps
• Threshold Tests

Page 24: Logic and Lattices for Distributed Programming

Quorum Vote

• A coordinator accepts votes from agents
• Count # of votes
  – When Count(Votes) > k, send “success” message

Page 25: Logic and Lattices for Distributed Programming

Aggregation is non-monotonic!

Page 26: Logic and Lattices for Distributed Programming

CRDTs
• Limited scope (single object)
• Flexible types (any lattice)

CALM
• Whole program analysis
• Limited types (only sets)

BloomL
• Whole program analysis
• Flexible types (any lattice)

Page 27: Logic and Lattices for Distributed Programming

BloomL Constructs

| Concern       | Construct               |
|---------------|-------------------------|
| Organization  | Collection of agents    |
| Communication | Message passing         |
| State         | Lattices                |
| Computation   | Functions over lattices |

Page 28: Logic and Lattices for Distributed Programming

Monotone Functions

$f : S \to T$ is a monotone function iff

$\forall a, b \in S : a \le_S b \Rightarrow f(a) \le_T f(b)$
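The definition can be tested by brute force on a finite sample. In this sketch the names `SET_LEQ`, `INT_LEQ`, `monotone_on?`, and `SAMPLE_SETS` are assumptions made for illustration. It confirms that set size is monotone from sets (ordered by inclusion) to integers (ordered by `<=`), and that negated size is not.

```ruby
require 'set'

# Partial orders for two lattices: sets ordered by inclusion, integers by <=.
SET_LEQ = ->(a, b) { a.subset?(b) }
INT_LEQ = ->(a, b) { a <= b }

# Brute-force check of the definition over a finite sample of inputs:
# whenever a <= b in the source order, f(a) <= f(b) must hold in the target order.
def monotone_on?(samples, leq_s, leq_t)
  samples.product(samples).all? do |a, b|
    !leq_s.(a, b) || leq_t.(yield(a), yield(b))
  end
end

SAMPLE_SETS = [Set[], Set[1], Set[2], Set[1, 2], Set[1, 2, 3]]

puts monotone_on?(SAMPLE_SETS, SET_LEQ, INT_LEQ) { |s| s.size }    # true
puts monotone_on?(SAMPLE_SETS, SET_LEQ, INT_LEQ) { |s| -s.size }   # false
```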

Page 29: Logic and Lattices for Distributed Programming

[Figure: the three example lattices over time, Set (LUB = Union), Increasing Integer (LUB = Max), and Boolean (LUB = Or), connected by monotone functions.]

• size(): monotone function from set → increase-int
• >= 5: monotone function from increase-int → boolean

Page 30: Logic and Lattices for Distributed Programming

Quorum Vote in BloomL

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

# Annotated Ruby class
class QuorumVote
  include Bud

  # Program state
  state do
    # Communication interfaces
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Lattice state declarations
    lset  :votes
    lmax  :vote_cnt
    lbool :got_quorum
  end

  # Program logic
  bloom do
    # Accumulate votes into set (<= is the merge function for the set lattice)
    votes      <= vote_chn {|v| v.voter_id}
    # Monotone function: set → max
    vote_cnt   <= votes.size
    # Monotone function: max → bool
    got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
    # Threshold test on bool (monotone)
    result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
  end
end

Monotonic CALM
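For readers without Bud installed, the same dataflow can be approximated in plain Ruby. This is a sketch under stated assumptions, not the Bud API: the `QuorumState` class, the `run` helper, and the threshold `QUORUM = 3` are invented here. Each delivery merges into the state with a least upper bound, so reordered and duplicated messages yield the same result.

```ruby
require 'set'

QUORUM = 3  # hypothetical threshold for this sketch (the slide uses 5)

# Plain-Ruby stand-in for the coordinator's lattice state; this is not the Bud API.
class QuorumState
  attr_reader :votes, :vote_cnt, :got_quorum

  def initialize
    @votes = Set.new      # set lattice, merge = union
    @vote_cnt = 0         # max lattice, merge = max
    @got_quorum = false   # bool lattice, merge = or
  end

  # Deliver one vote message; every step merges with a least upper bound.
  def deliver(voter_id)
    @votes |= Set[voter_id]
    @vote_cnt = [@vote_cnt, @votes.size].max
    @got_quorum ||= (@vote_cnt >= QUORUM)
    self
  end
end

def run(deliveries)
  deliveries.reduce(QuorumState.new) { |state, voter| state.deliver(voter) }
end

# Same votes, different network orders, with a duplicate delivery in one of them.
a = run(%w[v1 v2 v3])
b = run(%w[v3 v1 v3 v2])
puts a.got_quorum == b.got_quorum && a.votes == b.votes   # true
```

Note that `got_quorum` can only move from false to true, which is what makes it safe to send the result message as soon as the threshold is crossed.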

Page 31: Logic and Lattices for Distributed Programming

BloomL Features

• Generalizes logic programming to lattices
  – Integration of relational-style queries and functions over lattices
  – Efficient incremental evaluation scheme
• Library of built-in lattices
  – Booleans, increasing/decreasing integers, sets, multisets, maps, …
• API for defining custom lattices

Page 32: Logic and Lattices for Distributed Programming

Case Studies

Key-Value Store
– Object versioning via vector clocks
– Quorum replication

Replicated Shopping Cart
– Using custom lattice types to encode domain-specific knowledge
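One way to see why vector clocks fit this setting is to treat a clock as a map from node id to counter, merged by pointwise maximum. The sketch below is a plain-Ruby illustration with hypothetical helper names (`vc_merge`, `vc_leq?`); it is not the key-value store from the case study.

```ruby
# A vector clock as a map from node id to counter. Merge takes the
# pointwise maximum, so it is associative, commutative, and idempotent.
def vc_merge(a, b)
  a.merge(b) { |_node, x, y| [x, y].max }
end

# a <= b in the induced partial order iff merging a into b changes nothing.
def vc_leq?(a, b)
  vc_merge(a, b) == b
end

v1 = { "n1" => 2, "n2" => 0 }
v2 = { "n1" => 1, "n2" => 3 }

puts vc_merge(v1, v2) == vc_merge(v2, v1)          # true: commutative
puts vc_merge(v1, v2) == { "n1" => 2, "n2" => 3 }  # true: pointwise max
puts vc_leq?(v1, v2) || vc_leq?(v2, v1)            # false: the two versions are concurrent
```

When neither clock is below the other, the versions are concurrent, and a store has to keep both or merge their values.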

Page 34: Logic and Lattices for Distributed Programming

Case Study: Shopping Carts

Page 38: Logic and Lattices for Distributed Programming

Perspectives on Shopping

• CRDTs
  – Individual server replicas converge
• Bloom
  – Checkout is non-monotonic → requires distributed coordination
• Built-in BloomL lattice types
  – Checkout is not a monotone function of any of the built-in lattices

Page 39: Logic and Lattices for Distributed Programming

Observation: Once a checkout occurs, no more shopping actions can be performed

Page 40: Logic and Lattices for Distributed Programming

Observation: Each client knows when a checkout can be processed “safely”

Page 41: Logic and Lattices for Distributed Programming

Monotone Checkout

[Diagram: lattice of cart states ordered by the set of operations received.]

• OPS = [1]: Incomplete
• OPS = [2]: Incomplete
• OPS = [3]: Incomplete
• OPS = [1,2]: Incomplete
• OPS = [2,3]: Incomplete
• OPS = [1,2,3]: Complete
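The diagram suggests a lattice whose elements are sets of received operations, with a monotone "complete" test. The sketch below is one hypothetical encoding in plain Ruby: the `Cart` struct, its `expected` field, and the `replay` helper are assumptions for illustration, not the custom lattice from the talk. The checkout message declares how many operations the session contains, and the cart becomes complete once all of them have arrived.

```ruby
require 'set'

# Hypothetical cart lattice: a set of operation ids seen so far, plus the
# number of operations the client says the session contains (nil until the
# checkout message arrives). Names and fields are assumptions for this sketch.
Cart = Struct.new(:ops, :expected) do
  def merge(other)
    # Union the ops; keep whichever side already knows the expected count.
    # (Assumes at most one checkout per cart, so non-nil counts agree.)
    Cart.new(ops | other.ops, expected || other.expected)
  end

  # Monotone threshold: once true, merging in more information keeps it true.
  def complete?
    !expected.nil? && (1..expected).all? { |id| ops.include?(id) }
  end
end

EMPTY = Cart.new(Set.new, nil)

def replay(messages)
  messages.reduce(EMPTY) { |cart, msg| cart.merge(msg) }
end

op = ->(id) { Cart.new(Set[id], nil) }
checkout = Cart.new(Set[3], 3)   # op 3 is the checkout; it declares three ops in total

puts replay([op.(1), checkout]).complete?          # false: op 2 has not arrived
puts replay([checkout, op.(2), op.(1)]).complete?  # true, regardless of arrival order
```

The incomplete states are never exposed as a final answer; only the transition to complete triggers processing, which mirrors the idea of hiding the "unsafe" state behind an interface.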

Page 46: Logic and Lattices for Distributed Programming

Shopping Takeaways

• Checkout summary is a monotone function of client’s activities
• Custom lattice type captures application-specific notion of “forward progress”
  – “Unsafe” state hidden behind ADT interface

Page 47: Logic and Lattices for Distributed Programming

Recap

1. How to build eventually consistent systems
   – Write disorderly programs
2. Disorderly state
   – Lattices
3. Disorderly computation
   – Monotone functions over lattices
4. BloomL
   – Type system for deterministic behavior
   – Support for custom lattice types

Page 48: Logic and Lattices for Distributed Programming

Thank You!

http://www.bloom-lang.net

Page 49: Logic and Lattices for Distributed Programming

Backup Slides

Page 50: Logic and Lattices for Distributed Programming

Strong Consistency in Industry

“… there was a single overarching theme within the keynote talks… strong synchronization of the sort provided by a locking service must be avoided like the plague… [the key] challenge is to find ways of transforming services that might seem to need locking into versions that … can operate correctly without locking.”

-- Birman et al., “Toward a Cloud Computing Research Agenda” (LADIS, 2009)

Page 51: Logic and Lattices for Distributed Programming

Bloom Operational Model

Page 52: Logic and Lattices for Distributed Programming

Quorum Vote in Bloom

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

# Annotated Ruby class
class QuorumVote
  include Bud

  # Program state
  state do
    # Communication
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Persistent storage
    table   :votes, [:voter_id]
    # Transient storage
    scratch :cnt, [] => [:cnt]
  end

  # Program logic
  bloom do
    # Accumulate votes
    votes      <= vote_chn {|v| [v.voter_id]}
    # Count votes. Not (set) monotonic!
    cnt        <= votes.group(nil, count(:voter_id))
    # Send message when quorum reached
    # (c is a tuple of the cnt collection, so compare its cnt column)
    result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
  end
end

Page 53: Logic and Lattices for Distributed Programming

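Built-in Lattices

| Name  | Description                     | $\bot$     | $a \sqcup b$ | Sample Monotone Functions                                                      |
|-------|---------------------------------|------------|--------------|--------------------------------------------------------------------------------|
| lbool | Threshold test                  | false      | $a \lor b$   | when_true() → v                                                                |
| lmax  | Increasing number               | $-\infty$  | max(a, b)    | gt(n) → lbool; +(n) → lmax; -(n) → lmax                                        |
| lmin  | Decreasing number               | $\infty$   | min(a, b)    | lt(n) → lbool                                                                  |
| lset  | Set of values                   | $\emptyset$ | $a \cup b$  | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax |
| lpset | Non-negative set                | $\emptyset$ | $a \cup b$  | sum() → lmax                                                                   |
| lbag  | Multiset of values              | $\emptyset$ | $a \cup b$  | mult(v) → lmax; +(lbag) → lbag                                                 |
| lmap  | Map from keys to lattice values | empty map  |              | at(v) → any-lat; intersect(lmap) → lmap                                        |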

Page 54: Logic and Lattices for Distributed Programming

Failure Handling

Great question!

1. Monotone programs handle transient faults very well
   – Deterministic → simple logging
   – Commutative, idempotent → simple recovery
2. Future work: “controlled non-determinism”
   – Timeout code is fundamentally non-deterministic
   – But we still want mostly deterministic programs

Page 55: Logic and Lattices for Distributed Programming

Handling Non-Monotonicity

… is not the focus of this talk

Basic alternatives:
1. Nodes agree on an event order using distributed coordination (e.g., Paxos)
2. Allow non-deterministic outcomes
   • If needed, compensate and apologize