Logic and Lattices for Distributed Programming
Neil Conway, UC Berkeley
Joint work with: Peter Alvaro, Peter Bailis, David Maier, Bill Marczak, Joe Hellerstein, Sriram Srinivasan
Basho Chats #004, June 27, 2012
Programming
Distributed Programming
Dealing with Disorder
Introduce order
– Paxos, Zookeeper, Two-Phase Commit, …
– “Strong Consistency”
Tolerate disorder
– Correct behavior in the face of many possible network orders
– Typical goal: replicas converge to same final state
• “Eventual Consistency”
Eventual Consistency
Popular
Hard to program
Help developers build reliable programs on top of eventual consistency
This Talk
1. Theory
– CRDTs, Lattices, and CALM
2. Practice
– Programming with Lattices
– Case Study: KVS
[Figure: two replicas of Students, each initially {Alice, Bob}. Both clients (Client0, Client1) read {Alice, Bob}; one writes {Alice, Bob, Carol}, the other writes {Alice, Bob, Dave}, leaving the replicas with different values.]
How to resolve?
Problem: Replicas perceive different event orders
Goal: Same final state at all replicas
Solution: Commutative operations (“merge functions”)
[Figure: after merging, both replicas (Client0 and Client1) hold Students {Alice, Bob, Carol, Dave}.]
Merge = Set Union
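A minimal Ruby sketch of the set-union merge for the Students example. The helper name `merge_students` and the variable names are assumptions made for illustration; they do not come from Dynamo, Riak, or Bloom. The point is only that merging in either order gives the same state.

```ruby
require 'set'

# Hypothetical helper: merge two replica states by set union.
def merge_students(a, b)
  a | b
end

initial  = Set["Alice", "Bob"]
replica0 = initial | Set["Carol"]   # one client's write
replica1 = initial | Set["Dave"]    # the other client's write

# Merge in both directions, as two replicas exchanging state would.
merged01 = merge_students(replica0, replica1)
merged10 = merge_students(replica1, replica0)

puts merged01 == merged10           # merge order does not matter
puts merged01.to_a.sort.inspect
```

Because union is also associative and idempotent, re-delivering or re-ordering the same updates leaves the merged state unchanged.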
Commutative Operations
• Used by Dynamo, Riak, Bayou, etc.
• Formalized as CRDTs: Convergent and Commutative Replicated Data Types
– Shapiro et al., INRIA (2009-2012)
– Based on join semilattices
– Commutative, associative, idempotent
• Practical libraries: Statebox, Knockbox
[Figure: three lattices growing over time]
Set (Union): “growth” means larger sets
Integer (Max): “growth” means larger numbers
Boolean (Or): “growth” means false → true
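The three lattices above can be sketched in plain Ruby as merge functions, together with a brute-force check of the commutative, associative, and idempotent laws over a few sample values. The names `MERGES`, `SAMPLES`, and `lattice_laws_hold?` are hypothetical, and the check only covers the listed samples; it is not a proof.

```ruby
require 'set'

# Hypothetical table of merge functions, one per lattice.
MERGES = {
  set:     ->(a, b) { a | b },        # Set (Union)
  integer: ->(a, b) { [a, b].max },   # Integer (Max)
  boolean: ->(a, b) { a || b }        # Boolean (Or)
}

# A few sample values per lattice (chosen arbitrarily).
SAMPLES = {
  set:     [Set[], Set[1], Set[1, 2], Set[3]],
  integer: [0, 1, 5, 7],
  boolean: [false, true]
}

# Check idempotence, commutativity, and associativity over the samples.
def lattice_laws_hold?(merge, values)
  values.all? do |a|
    merge.call(a, a) == a &&
      values.all? do |b|
        merge.call(a, b) == merge.call(b, a) &&
          values.all? do |c|
            merge.call(merge.call(a, b), c) == merge.call(a, merge.call(b, c))
          end
      end
  end
end

LAWS_OK = MERGES.map { |name, m| [name, lattice_laws_hold?(m, SAMPLES[name])] }.to_h
LAWS_OK.each { |name, ok| puts "#{name}: #{ok}" }
```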
[Figure: Client0 and Client1 with two replicated sets, Students {Alice, Bob, Carol, Dave} and Teams {<Alice, Bob>}. One client reads Students {Alice, Bob, Carol, Dave} and Teams {<Alice, Bob>}, then writes Teams {<Alice, Bob>, <Carol, Dave>}. The other client removes Dave, giving Students {Alice, Bob, Carol}. After replica synchronization: Students {Alice, Bob, Carol}, Teams {<Alice, Bob>, <Carol, Dave>}.]
[Figure: same setup, different event order. The removal of Dave is seen first, so the read of Students returns {Alice, Bob, Carol} and Teams stays {<Alice, Bob>}. After replica synchronization: Students {Alice, Bob, Carol}, Teams {<Alice, Bob>}.]
Nondeterministic Outcome!
Possible Solution: Wrap both replicated values in a single complex CRDT
Goal: Compose larger application using “safe” mappings between simple lattices
[Figure: composing lattices over time]
Set (merge = Union) → size() → Integer (merge = Max) → >= 5 → Boolean (merge = Or)
size(): monotone function from set → max
>= 5: monotone function from max → boolean
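A small Ruby sketch of this set → max → boolean pipeline, assuming plain Ruby values stand in for the lattices. The names `QUORUM`, `size_of`, and `reached?` and the voter IDs are made up for illustration. As the set grows, the size never decreases and the threshold result never flips back to false.

```ruby
require 'set'

QUORUM = 5  # threshold taken from the "size() >= 5" example

# set -> max: the size of a growing set never decreases.
def size_of(set)
  set.size
end

# max -> boolean: once the threshold is crossed, it stays crossed.
def reached?(count)
  count >= QUORUM
end

votes   = Set[]
history = []
%w[a b c b d e f].each do |voter|   # "b" arrives twice; union absorbs the duplicate
  votes |= Set[voter]               # merge = Union
  history << [size_of(votes), reached?(size_of(votes))]
end

history.each { |size, ok| puts "size=#{size} quorum=#{ok}" }
```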
Monotonicity in Practice
“The more you know, the more you know”
Never retract previous outputs (“mistake-free”)
Typical patterns:
• immutable data
• accumulate knowledge over time
• threshold tests (“if” w/o “else”)
Monotonicity and Determinism
Agents strictly learn more knowledge over time
Monotone: different learning order, same final outcome
Result: Program is deterministic!
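To make the "different learning order, same final outcome" claim concrete, here is a hedged Ruby sketch that replays the same messages in every possible delivery order. The agent functions `run_monotone` and `run_last_writer` and the message contents are hypothetical. The monotone agent accumulates with set union; the second agent is a deliberately non-monotone contrast that keeps only the last message it saw.

```ruby
require 'set'

MESSAGES = [Set["Alice"], Set["Bob"], Set["Carol", "Dave"]]

# Monotone agent: folds every delivered message into its state with set union.
def run_monotone(deliveries)
  deliveries.reduce(Set[]) { |state, msg| state | msg }
end

# Non-monotone agent (for contrast): overwrites its state with each message.
def run_last_writer(deliveries)
  deliveries.last
end

# Collect the distinct final states across all delivery orders.
monotone_outcomes    = MESSAGES.permutation.map { |order| run_monotone(order) }.uniq
last_writer_outcomes = MESSAGES.permutation.map { |order| run_last_writer(order) }.uniq

puts "monotone outcomes:    #{monotone_outcomes.size}"
puts "last-writer outcomes: #{last_writer_outcomes.size}"
```

The monotone agent ends in a single state regardless of order, while the overwriting agent's final state depends on which message happened to arrive last.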
A program is confluent if it produces the same results regardless of network nondeterminism
Consistency As Logical Monotonicity (CALM)
CALM Analysis
1. All monotone programs are confluent
2. Simple syntactic test for monotonicity
Result: Simple static analysis for eventual consistency
Handling Non-Monotonicity
… is not the focus of this talk
Basic choices:
1. Nodes agree on an event order using a coordination protocol (e.g., Paxos)
2. Allow non-deterministic outcomes
• If needed, compensate and apologize
Putting It Into Practice
What we’d like:
• Collection of agents
• No shared state (→ message passing)
• Computation over arbitrary lattices
Bloom
Organization: Collection of agents
Communication: Message passing
State: Relations (sets)
Computation: Relational rules over sets (Datalog, SQL)
 | Bloom | BloomL
Organization | Collection of agents | Collection of agents
Communication | Message passing | Message passing
State | Relations (sets) | Lattices
Computation | Relational rules over sets (Datalog, SQL) | Functions over lattices
Quorum Vote in BloomL
require 'bud'

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

# Annotated Ruby class
class QuorumVote
  include Bud

  # Program state
  state do
    # Communication interfaces
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Lattice state declarations
    lset  :votes
    lmax  :vote_cnt
    lbool :got_quorum
  end

  # Program logic
  bloom do
    # Accumulate votes into set (<= applies the merge function for the set lattice)
    votes      <= vote_chn {|v| v.voter_id}
    # Map set → max
    vote_cnt   <= votes.size
    # Map max → bool
    got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
    # Threshold test on bool
    result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
  end
end
Builtin Lattices
Name | Description | ⊥ | a ⊔ b | Sample Monotone Functions
lbool | Threshold test | false | a ∨ b | when_true() → v
lmax | Increasing number | −∞ | max(a, b) | gt(n) → lbool; +(n) → lmax; -(n) → lmax
lmin | Decreasing number | +∞ | min(a, b) | lt(n) → lbool
lset | Set of values | ∅ | a ∪ b | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax
lpset | Non-negative set | ∅ | a ∪ b | sum() → lmax
lbag | Multiset of values | ∅ | a ∪ b | mult(v) → lmax; +(lbag) → lbag
lmap | Map from keys to lattice values | empty map | | at(v) → any-lat; intersect(lmap) → lmap
Case Study
Goal: Provably eventually consistent key-value store (KVS)
Assumption: Map keys to lattice values (i.e., values do not decrease)
Solution: Use a map lattice
[Figure: timelines for Replica 1 and Replica 2, each holding a map lattice. Callouts: nested lattice value; add new K/V pair; “grow” value in extant K/V pair; replica synchronization.]
Goal: Provably eventually consistent KVS that stores arbitrary values
Solution: Assign a version to each key-value pair
Each replica stores increasing versions, not increasing values
Object Versions in Dynamo/Riak
1. Each KV pair has a vector clock version
2. Given two versions of a KV pair, prefer the one with the strictly greater version
3. If versions are incomparable, invoke user-defined merge function
Vector Clock: Map from node IDs → logical clocks
Solution: Use a map lattice
Logical Clock: Increasing counter
Solution: Use an increasing-int lattice
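A hedged Ruby sketch of a vector clock built the way these slides describe: a map from node IDs to increasing counters, merged key by key with max. The function names `vc_merge`, `vc_leq?`, and `vc_compare` and the node IDs are assumptions for illustration, and missing entries are treated as 0.

```ruby
# Hypothetical vector clock: a Hash from node ID to an increasing counter.

# Map-lattice merge: key-wise max of the two clocks.
def vc_merge(a, b)
  a.merge(b) { |_node, x, y| [x, y].max }
end

# Partial order: a <= b when every counter in a is <= the matching counter in b.
def vc_leq?(a, b)
  a.all? { |node, count| count <= b.fetch(node, 0) }
end

# Classify two clocks as :equal, :before, :after, or :concurrent.
def vc_compare(a, b)
  le = vc_leq?(a, b)
  ge = vc_leq?(b, a)
  return :equal  if le && ge
  return :before if le
  return :after  if ge
  :concurrent
end

v1 = { "r1" => 2, "r2" => 1 }
v2 = { "r1" => 1, "r2" => 3 }
v3 = vc_merge(v1, v2)

puts vc_compare(v1, v2)   # neither clock dominates the other
puts vc_compare(v1, v3)   # the merged clock dominates both inputs
```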
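Version-Value Pairs
Pair = <fst, snd>

# fst is the version lattice, snd is the value lattice
def merge(o)
  if fst > o.fst        # self has the strictly greater version
    self
  elsif fst < o.fst     # o has the strictly greater version
    o
  else                  # equal or incomparable versions: merge both parts
    Pair.new(fst.merge(o.fst), snd.merge(o.snd))
  end
end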
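The merge rule above can be exercised end to end with a self-contained Ruby sketch. The classes `VClock` and `SetLattice` are hypothetical stand-ins for the version and value lattices (they are not Bud classes); only the `Pair#merge` logic follows the slide.

```ruby
require 'set'

# Hypothetical version lattice: vector clock with a partial order and max-merge.
class VClock
  attr_reader :counts

  def initialize(counts)
    @counts = counts
  end

  def <=(other)
    counts.all? { |node, c| c <= other.counts.fetch(node, 0) }
  end

  def <(other)
    self <= other && !(other <= self)
  end

  def >(other)
    other < self
  end

  def merge(other)
    VClock.new(counts.merge(other.counts) { |_n, x, y| [x, y].max })
  end
end

# Hypothetical value lattice: a set merged by union (returns a new object).
class SetLattice
  attr_reader :items

  def initialize(items)
    @items = Set.new(items)
  end

  def merge(other)
    SetLattice.new(items | other.items)
  end
end

# Version-value pair using the merge rule from the slide.
Pair = Struct.new(:fst, :snd) do
  def merge(o)
    if fst > o.fst
      self
    elsif fst < o.fst
      o
    else
      Pair.new(fst.merge(o.fst), snd.merge(o.snd))
    end
  end
end

old   = Pair.new(VClock.new({ "r1" => 1 }), SetLattice.new(["x"]))
newer = Pair.new(VClock.new({ "r1" => 2 }), SetLattice.new(["y"]))
left  = Pair.new(VClock.new({ "r1" => 2, "r2" => 1 }), SetLattice.new(["a"]))
right = Pair.new(VClock.new({ "r1" => 1, "r2" => 2 }), SetLattice.new(["b"]))

replaced = old.merge(newer)    # newer version wins; the values are not merged
resolved = left.merge(right)   # concurrent versions: merge clock and values

puts replaced.snd.items.to_a.inspect
puts resolved.fst.counts.inspect
puts resolved.snd.items.to_a.sort.inspect
```

The first merge shows a version increase that is not a value increase: the newer pair simply replaces the older one. The second shows concurrent writes, where the clocks are merged automatically and the values are merged by the user's lattice.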
[Figure: timelines for Replica 1 and Replica 2 storing version-value pairs. Callouts, in order:]
• Version increase; NOT value increase
• R1’s version replaces R2’s version
• New version @ R2
• Concurrent writes!
• Merge VC (automatically), value merge via user’s lattice (as in Dynamo)
Lattice Composition in KVS
Conclusion
Dealing with EC: Many event orders → order-independent (disorderly) programs
Lattices: Disorderly state
Monotone Functions: Disorderly computation
Monotone Bloom: Lattices + monotone functions for safe distributed programming
Questions Welcome
Please try Bloom!
http://www.bloom-lang.org
Or:
gem install bud
Backup Slides
Lattices
⟨S, ⊔, ⊥⟩ is a bounded join semi-lattice iff:
– S is a partially ordered set
– ⊔ is a binary operator (“least upper bound”)
• For all x, y ∈ S, x ⊔ y = z where x ≤S z, y ≤S z, and there is no z′ ≠ z ∈ S such that x ≤S z′, y ≤S z′, and z′ ≤S z.
• Associative, commutative, and idempotent
– ⊥ is the “least” element in S (∀x ∈ S: ⊥ ⊔ x = x)
Example: increasing integers
– S = ℤ, ⊔ = max, ⊥ = −∞
Monotone Functions
f : S → T is a monotone function iff
∀a, b ∈ S : a ≤S b ⇒ f(a) ≤T f(b)
Example: size(Set) → Increasing-Int
size({A, B}) = 2
size({A, B, C}) = 3
From Datalog → Lattices
 | Datalog (Bloom) | BloomL
State | Relations | Lattices
Example Values | [[“red”, 1], [“green”, 2]] | set: [“red”, “green”]; map: {“red” => 1, “green” => 2}; counter: 5; condition: false
Computation | Rules over relations | Functions over lattices
Monotone Computation | Monotone rules | Monotone functions
Program Semantics | Fixpoint of rules (stratified semantics) | Fixpoint of functions (stratified semantics)
Bloom Operational Model
Quorum Vote in Bloom

require 'bud'

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

class QuorumVote
  include Bud

  state do
    # Communication
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Persistent storage
    table :votes, [:voter_id]
    # Transient storage
    scratch :cnt, [] => [:cnt]
  end

  bloom do
    # Accumulate votes
    votes      <= vote_chn {|v| [v.voter_id]}
    # Not (set) monotonic!
    cnt        <= votes.group(nil, count(:voter_id))
    # Send message when quorum reached
    result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
  end
end
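One way to see why the count is not monotone under set semantics, sketched in plain Ruby rather than Bud. The helper `count_relation` is hypothetical: it models the aggregate as a one-tuple relation, the way a set-oriented count would appear. When a vote arrives, the old tuple has to be retracted, so the earlier output is not a subset of the later one; viewed as a max-lattice value, the same count only grows.

```ruby
require 'set'

# Hypothetical set-oriented view: the aggregate is a relation holding
# a single tuple with the current count.
def count_relation(votes)
  Set[[votes.size]]
end

before = count_relation(Set["a", "b", "c"])        # relation {[3]}
after  = count_relation(Set["a", "b", "c", "d"])   # relation {[4]}

# Set monotonicity would require every earlier tuple to still be present.
set_monotone = before.subset?(after)

# Lattice view: compare the counts themselves under <=.
max_monotone = before.first.first <= after.first.first

puts "monotone as a set of tuples: #{set_monotone}"
puts "monotone as a max lattice:   #{max_monotone}"
```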
Current Status
Writeups
• BloomL: UCB Tech Report
• Bloom/CALM: CIDR’11, website
Lattice Runtime
• Available as a git branch
• To be merged soon-ish
Examples, Case Studies
• KVS
• Shopping carts
• Causal delivery
Under development:
• MDCC, concurrent editing