
Post on 06-Jan-2018

Blazes: coordination analysis for distributed programs

Peter Alvaro, Neil Conway, Joseph M. Hellerstein (UC Berkeley)
David Maier (Portland State)

Distributed systems are hard

• Asynchrony
• Partial failure

Asynchrony isn’t that hard

Amelioration:
• Logical timestamps
• Deterministic interleaving

Partial failure isn’t that hard

Amelioration:
• Replication
• Replay

Asynchrony × partial failure is hard²

• Logical timestamps
• Deterministic interleaving
• Replication
• Replay

Today:

Consistency criteria for fault-tolerant distributed systems

Blazes: analysis and enforcement

This talk is all setup

Frame of mind:
1. Dataflow: a model of distributed computation
2. Anomalies: what can go wrong?
3. Remediation strategies
   1. Component properties
   2. Delivery mechanisms

Framework:
Blazes – coordination analysis and synthesis

Little boxes: the dataflow model

Generalization of distributed services

Components interact via asynchronous calls (streams)

Components
• Input interfaces
• Output interface

Streams
• Nondeterministic order

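Example: a join operator

[Figure: component with input streams R and S and output stream T]

Example: a key/value store

[Figure: component with input streams put and get and output stream response]

Example: a pub/sub service

[Figure: component with input streams publish and subscribe and output stream deliver]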

Logical dataflow

“Software architecture”

[Figure: Data source, Service X, filter, cache, client; streams a, b, c]

Dataflow is compositional

Components are recursively defined

[Figure: Data source, Service X, filter, aggregator, client]

Dataflow exhibits self-similarity

[Figure: DB, HDFS, Hadoop, Index, Combine, Static HTTP, App1, App2, Buy, Content; streams: user requests, App1 answers, App2 answers]

Physical dataflow

“System architecture”

[Figure: Data source, Service X, filter, aggregator, client; streams a, b, c]

What could go wrong?

Cross-run nondeterminism

[Figure: Data source, Service X, filter, aggregator, client; streams a, b, c; shown for Run 1 and Run 2]

Nondeterministic replays
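A minimal Ruby sketch of the replay anomaly, under stated assumptions: the `LastSeen` component and the `replay` helper are hypothetical names, not part of Blazes. The component is order-sensitive because it keeps only the most recent message, so two runs that deliver the same messages a, b, c in different orders end in different states.

```ruby
# Hypothetical order-sensitive component: its state is the last message seen.
class LastSeen
  attr_reader :state

  def initialize
    @state = nil
  end

  def deliver(msg)
    @state = msg
  end
end

# Replay one delivery order against a fresh component and return its final state.
def replay(order)
  component = LastSeen.new
  order.each { |msg| component.deliver(msg) }
  component.state
end

run1 = replay([:a, :b, :c])   # one possible interleaving
run2 = replay([:c, :b, :a])   # same messages, different interleaving

puts "run 1 ends in #{run1.inspect}, run 2 ends in #{run2.inspect}"
```

Both runs see identical message contents; only the delivery order differs, which is enough to make a replay disagree with the original run.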

Cross-instance nondeterminism

[Figure: Data source, Service X (replicas), client]

Transient replica disagreement

Divergence

[Figure: Data source, Service X (replicas), client]

Permanent replica disagreement

Hazards

[Figure: Data source, Service X, filter, aggregator, client; streams a, b, c]

Order ⇒ Contents?

Preventing the anomalies
1. Understand component semantics (and disallow certain compositions)

Component properties

• Convergence
– Component replicas receiving the same messages reach the same state
– Rules out divergence

[Figure: Insert and Read operations on a convergent data structure (e.g., Set CRDT)]

Convergence

Property         Tolerant to
Commutativity    Reordering
Associativity    Batching
Idempotence      Retry/duplication
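A small sketch of a convergent structure in plain Ruby, assuming a grow-only set as the simplest Set CRDT. The class name `GSet` and its methods are illustrative, not an API from the talk. Set union is commutative, associative and idempotent, so replicas that see the same inserts reordered, batched or duplicated read the same state.

```ruby
require 'set'

# Illustrative grow-only set: inserts only, merged by set union.
class GSet
  def initialize
    @items = Set.new
  end

  def insert(x)
    @items.add(x)
    self
  end

  # Apply a whole batch of inserts (or another replica's contents) at once.
  def merge(batch)
    @items.merge(batch)
    self
  end

  def read
    @items.to_a.sort
  end
end

msgs = [3, 1, 2]

in_order   = GSet.new
msgs.each { |m| in_order.insert(m) }

reordered  = GSet.new
msgs.reverse.each { |m| reordered.insert(m) }     # commutativity

batched    = GSet.new
batched.merge([1, 2]).merge([3])                  # associativity

duplicated = GSet.new
(msgs + msgs).each { |m| duplicated.insert(m) }   # idempotence

reads = [in_order, reordered, batched, duplicated].map(&:read)
puts reads.uniq.length == 1 ? "replicas agree" : "replicas diverge"
```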

Convergence isn’t compositional

Data source

client

Convergent (identical input contents ⇒ identical state)

Component properties

• Convergence
– Component replicas receiving the same messages reach the same state
– Rules out divergence

• Confluence
– Output streams have deterministic contents
– Rules out all stream anomalies

Confluent ⇒ convergent

Confluence

output set = f(input set)

[Figure: two equal output sets, { } = { }]

Confluence is compositional

output set = f(g(input set))
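One way to picture the composition claim, as a sketch only: model each confluent component as a function from a set of inputs to a set of outputs. The lambdas `g` (a filter) and `f` (a per-element map) are hypothetical stand-ins. Because each one depends only on the contents of its input, so does their composition, whatever order or duplication the messages arrive with.

```ruby
require 'set'

# Hypothetical confluent components: output set depends only on input set.
g = ->(input) { input.select(&:even?).to_set }       # filter
f = ->(input) { input.map { |x| x * 10 }.to_set }    # per-element transform

# Composition f(g(input)): still a function of the input contents only.
pipeline = ->(msgs) { f.call(g.call(msgs)) }

out_a = pipeline.call([1, 2, 3, 4])      # one delivery order
out_b = pipeline.call([4, 3, 2, 1, 2])   # reordered, with a duplicate

puts out_a == out_b ? "same output set" : "different output sets"
```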

Preventing the anomalies
1. Understand component semantics (and disallow certain compositions)
2. Constrain message delivery orders
   1. Ordering

Ordering – global coordination

[Figure: Data source, order-sensitive component, client; deterministic outputs]

The first principle of successful scalability is to batter the consistency mechanisms down to a minimum. – James Hamilton

Preventing the anomalies
1. Understand component semantics (and disallow certain compositions)
2. Constrain message delivery orders
   1. Ordering
   2. Barriers and sealing

Barriers – local coordination

[Figure: Data source, order-sensitive component, client; deterministic outputs]

Sealing – continuous barriers

Do partitions of (infinite) input streams “end”?

Can components produce deterministic results given “complete” input partitions?

Sealing: partition barriers for infinite streams

Finite partitions of infinite inputs are common…

…in distributed systems
– Sessions
– Transactions
– Epochs / views

…and applications
– Auctions
– Chats
– Shopping carts
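A rough Ruby sketch of sealing, using the auction case from the list above. The `SealedMax` class and its methods are hypothetical and not taken from Blazes. The component buffers bids per auction (the partition key) and emits a result only once that partition has been sealed, so the output does not depend on the order in which bids arrived. The sketch assumes the seal is delivered after every bid for that auction; guaranteeing that is the job of the delivery mechanism.

```ruby
# Hypothetical sealed aggregator: one partition per auction.
class SealedMax
  def initialize
    @open = Hash.new { |h, k| h[k] = [] }   # bids buffered per open auction
    @results = {}                           # winning bid per sealed auction
  end

  def bid(auction, amount)
    raise ArgumentError, "auction #{auction} is sealed" if @results.key?(auction)
    @open[auction] << amount
  end

  # Seal a partition: no more bids will arrive, so the result is final.
  def seal(auction)
    return @results[auction] if @results.key?(auction)
    @results[auction] = @open.delete(auction)&.max
  end

  # nil until the auction has been sealed.
  def result(auction)
    @results[auction]
  end
end

bids = [[:a1, 5], [:a1, 9], [:a2, 7], [:a1, 3]]

replica1 = SealedMax.new
bids.each { |auction, amount| replica1.bid(auction, amount) }

replica2 = SealedMax.new
bids.reverse.each { |auction, amount| replica2.bid(auction, amount) }

[replica1, replica2].each { |r| r.seal(:a1) }

puts "a1 winner: #{replica1.result(:a1)} / #{replica2.result(:a1)}"
puts "a2 still open: #{replica1.result(:a2).inspect}"
```

Coordination here is local to one auction: sealing `:a1` says nothing about `:a2`, which stays open and emits nothing.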

Blazes:

consistency analysis

+ coordination selection

Blazes:

Mode 1: Grey boxes

Grey boxes

Example: pub/sub

x = publish
y = subscribe
z = deliver

[Figure: component with inputs x and y and output z]

Deterministic but unordered

Severity  Label   Confluent  Stateless
1         CR      X          X
2         CW      X
3         ORgate             X
4         OWgate

x->z : CW
y->z : CWT

Grey boxes

Example: key/value store

x = put; y = get; z = response

[Figure: component with inputs x and y and output z]

Deterministic but unordered

Severity  Label   Confluent  Stateless
1         CR      X          X
2         CW      X
3         ORgate             X
4         OWgate

x->z : OWkey
y->z : ORT

Label propagation – confluent composition

[Figure: dataflow with labels CW, CR, CR, CR, CR, CW; deterministic outputs]

Label propagation – unsafe composition

[Figure: dataflow with labels OW, CR, CR, CR, CR; tainted outputs; interposition point marked]

Label propagation – sealing

[Figure: dataflow with labels OWkey, CR, CR, CR, CR, OWkey; inputs carry Seal(key=x); deterministic outputs]
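The three label-propagation cases can be approximated with a much-simplified sketch. This is not the Blazes analysis itself: the `Label` struct, the `analyze` function and the single `seal_key` argument are assumptions made for illustration. The sketch walks one path of component labels, treats CR and CW as safe, and treats OR/OW as safe only when the input is sealed on the component's gate key; the first unsafe component is reported as the point where coordination would be interposed.

```ruby
# kind: :CR, :CW, :OR or :OW; gate: partition key for OR/OW labels, else nil.
Label = Struct.new(:kind, :gate)

# Walk a path of labels and report whether its output is deterministic.
def analyze(path, seal_key: nil)
  path.each_with_index do |label, i|
    next if [:CR, :CW].include?(label.kind)         # confluent: order-insensitive
    next if label.gate && label.gate == seal_key    # order-sensitive, but sealed on its gate
    return { verdict: :tainted, interpose_at: i }   # needs ordering before component i
  end
  { verdict: :deterministic, interpose_at: nil }
end

confluent = analyze([Label.new(:CW), Label.new(:CR), Label.new(:CR)])
unsafe    = analyze([Label.new(:OW), Label.new(:CR), Label.new(:CR)])
sealed    = analyze([Label.new(:OW, :key), Label.new(:CR)], seal_key: :key)

puts "confluent composition: #{confluent[:verdict]}"
puts "unsafe composition:    #{unsafe[:verdict]} (interpose at #{unsafe[:interpose_at]})"
puts "sealed composition:    #{sealed[:verdict]}"
```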

Blazes:

Mode 2: White boxes

White boxes

    module KVS
      state do
        interface input, :put, [:key, :val]
        interface input, :get, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
      end

      bloom do
        log <+ put
        log <- (put * log).rights(:key => :key)
        response <= (log * get).pairs(:key => :key) do |s, l|
          [l.ident, s.key, s.val]
        end
      end
    end

put → response: OWkey

get → response: ORkey

Negation (⇒ order sensitive)
Partitioned by :key

White boxes

    module PubSub
      state do
        interface input, :publish, [:key, :val]
        interface input, :subscribe, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
        table :sub_log, [:ident, :key]
      end

      bloom do
        log <= publish
        sub_log <= subscribe
        response <= (log * sub_log).pairs(:key => :key) do |s, l|
          [l.ident, s.key, s.val]
        end
      end
    end

publish → response: CW
subscribe → response: CR

The Blazes frame of mind:

• Asynchronous dataflow model
• Focus on consistency of data in motion
– Component semantics
– Delivery mechanisms and costs
• Automatic, minimal coordination

Queries?
