distributed systems - github · defines the “computational unit ... part of a distributed system...

154
Distributed Systems with ZeroMQ and gevent Jeff Lindsay @progrium

Upload: others

Post on 22-May-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Distributed Systemswith ZeroMQ and gevent

Jeff Lindsay@progrium

Page 2: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Why distributed systems?

Harness more CPUs and resources

Run faster in parallel

Tolerance of individual failures

Better separation of concerns

Page 3: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Most web apps evolve into distributed systems

Page 4: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 5: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 6: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 7: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 8: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 9: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 10: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

OpenStack

Page 11: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Web

API

Client

Amazon AWS

TwiML

Provider

Provider

Provider

Page 12: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

ZeroMQ + geventTwo powerful and misunderstood tools

Page 13: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

ConcurrencyHeart of Distributed Systems

Page 14: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Distributed computing is just another flavor of

local concurrency

Page 15: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Multithreading

Distributed system

Shared Memory

Thread ThreadThread

Shared Database

App AppApp

Page 16: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Concurrency models

Execution modelDefines the “computational unit”

Communication modelMeans of sharing and coordination

Page 17: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Concurrency models

Traditional multithreadingOS threadsShared memory, locks, etc

Async or Evented I/OI/O loop + callback chainsShared memory, futures

Actor modelShared nothing “processes”Built-in messaging

Page 18: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Examples

ErlangActor model

ScalaActor model

GoChannels, Goroutines

Everything else (Ruby, Python, PHP, Perl, C/C++, Java)

ThreadingEvented

Page 19: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Normally, the networking of distributed systems is tacked on to the

local concurrency model.

Erlang is special.

MQ, RPC, REST, ...

Page 20: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Why not always use Erlang?

Page 21: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Why not always use Erlang?

Half reasonsWeird/ugly languageLimited library ecosystemVM requires operational expertiseFunctional programming isn’t mainstream

Page 22: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Why not always use Erlang?

Half reasonsWeird/ugly languageLimited library ecosystemVM requires operational expertiseFunctional programming isn’t mainstream

Biggest reasonIt’s not always the right tool for the job

Page 23: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Web

API

Client

Amazon AWS

TwiML

Provider

Provider

Provider

Page 24: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Service Oriented Architecture

Multiple languagesHeterogeneous cluster

Page 25: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

RPC

Page 26: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Client / server

RPC

Page 27: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Client / serverMapping to functions

RPC

Page 28: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Client / serverMapping to functionsMessage serialization

RPC

Page 29: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Client / serverMapping to functionsMessage serialization

RPC

Poor abstraction of what you really want

Page 30: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

What you want are tools to help you get distributed actor model concurrency like Erlang ... without Erlang.

Even better if they're decoupled and optional.

Page 31: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Rarely will you build an application as part of a distributed system that does

not also need local concurrency.

Page 32: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Communication model

How do we unify communications in local concurrency and distributed systems across languages?

Page 33: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Execution model

How do we get Erlang-style local concurrency without interfering with the language's idiomatic paradigm?

Page 34: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

ZeroMQCommunication model

Page 35: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

Page 36: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right?

Page 37: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Page 38: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Page 39: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Oh, it’s just sockets, right?

Page 40: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Oh, it’s just sockets, right? Not really.

Page 41: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Oh, it’s just sockets, right? Not really.

Page 42: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Oh, it’s just sockets, right? Not really.

Wait, isn’t messaging a solved problem?

Page 43: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Misconceptions

It’s just another MQ, right? Not really.

Oh, it’s just sockets, right? Not really.

Wait, isn’t messaging a solved problem?*sigh* ... maybe.

Page 44: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Page 45: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Point to point

Page 46: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Point to pointStream of bytes

Page 47: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Point to pointStream of bytesBuffering

Page 48: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Point to pointStream of bytesBufferingStandard API

Page 49: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Regular Sockets

Point to pointStream of bytesBufferingStandard APITCP/IP or UDP, IPC

Page 50: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Messaging

Page 51: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic

Page 52: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic

Page 53: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic

Page 54: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic

Page 55: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Page 56: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Page 57: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Page 58: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 59: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 60: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 61: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 62: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 63: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around

Page 64: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around Messages are delivered

Page 65: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

MessagingMessages are atomic Messages can be routed

Messages may sit around Messages are delivered

Page 66: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Rise of the Big MQ

Page 67: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AppApp

AppApp

ReliableMessage Broker

PersistentQueues

Page 68: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AppApp

AppApp

Page 69: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP

Producer Consumer

MQ

Page 70: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP

Producer Consumer

MQ

Exchange Queue

BindingX

Page 71: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP

Producer Consumer

MQ

Exchange Queue

X

Page 72: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP Recipes

Page 73: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP RecipesWork queuesDistributing tasks among workers

Page 74: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP RecipesWork queuesDistributing tasks among workers

Publish/SubscribeSending to many consumers at once

X

Page 75: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP RecipesWork queuesDistributing tasks among workers

Publish/SubscribeSending to many consumers at once

X

RoutingReceiving messages selectively

X

foo

bar

baz

Page 76: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

AMQP RecipesWork queuesDistributing tasks among workers

Publish/SubscribeSending to many consumers at once

X

RPCRemote procedure call implementation

RoutingReceiving messages selectively

X

foo

bar

baz

Page 77: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Drawbacks of Big MQ

Lots of complexity

Queues are heavyweight

HA is a challenge

Poor primitives

Page 78: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Enter ZeroMQ“Float like a butterfly, sting like a bee”

Page 79: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Echo in Python

Server Client

import zmqcontext = zmq.Context()socket = context.socket(zmq.REP)socket.bind("tcp://127.0.0.1:5000")

while True: msg = socket.recv() print "Received", msg socket.send(msg)

123456789

import zmqcontext = zmq.Context()socket = context.socket(zmq.REQ)socket.connect("tcp://127.0.0.1:5000")

for i in range(10): msg = "msg %s" % i socket.send(msg) print "Sending", msg reply = socket.recv()

123456789

10

Page 80: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Echo in Ruby

Server Client

require "zmq"context = ZMQ::Context.new(1)socket = context.socket(ZMQ::REP)socket.bind("tcp://127.0.0.1:5000")

loop do msg = socket.recv puts "Received #{msg}" socket.send(msg)end

123456789

10

require "zmq"context = ZMQ::Context.new(1)socket = context.socket(ZMQ::REQ)socket.connect("tcp://127.0.0.1:5000")

(0...10).each do |i| msg = "msg #{i}" socket.send(msg) puts "Sending #{msg}" reply = socket.recvend

123456789

1011

Page 81: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Echo in PHP

Server Client

<?php$context = new ZMQContext();$socket = $context->getSocket(ZMQ::SOCKET_REP);$socket->bind("tcp://127.0.0.1:5000");

while (true) { $msg = $socket->recv(); echo "Received {$msg}"; $socket->send($msg);}?>

123456789

1011

<?php$context = new ZMQContext();$socket = $context->getSocket(ZMQ::SOCKET_REQ);$socket->connect("tcp://127.0.0.1:5000");

foreach (range(0, 9) as $i) { $msg = "msg {$i}"; $socket->send($msg); echo "Sending {$msg}"; $reply = $socket->recv();}?>

123456789

101112

Page 82: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Bindings

ActionScript, Ada, Bash, Basic, C, Chicken Scheme, Common Lisp, C#, C++, D, Erlang,

F#, Go, Guile, Haskell, Haxe, Java, JavaScript, Lua, Node.js, Objective-C, Objective Caml, ooc, Perl, PHP, Python, Racket, REBOL,

Red, Ruby, Smalltalk

Page 83: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

Page 84: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

Page 85: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

Page 86: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

Page 87: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

Page 88: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

inprocipctcpmulticast

Page 89: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

inprocipctcpmulticast

socket.bind("tcp://localhost:5560")socket.bind("ipc:///tmp/this-socket")socket.connect("tcp://10.0.0.100:9000")socket.connect("ipc:///tmp/another-socket")socket.connect("inproc://another-socket")

Page 90: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

inprocipctcpmulticast

socket.bind("tcp://localhost:5560")socket.bind("ipc:///tmp/this-socket")socket.connect("tcp://10.0.0.100:9000")socket.connect("ipc:///tmp/another-socket")socket.connect("inproc://another-socket")

Page 91: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Plumbing

inprocipctcpmulticast

socket.bind("tcp://localhost:5560")socket.bind("ipc:///tmp/this-socket")socket.connect("tcp://10.0.0.100:9000")socket.connect("ipc:///tmp/another-socket")socket.connect("inproc://another-socket")

Page 92: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message Patterns

Page 93: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply

REQ REP

Page 94: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply

REQ REP

REP

REP

Page 95: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply

REQ REP

REP

REP

Page 96: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply

REQ REP

REP

REP

Page 97: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply

REQ REP

REP

REP

Page 98: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

REQ REP

REP

REP

PUB

SUB

SUB

SUB

Page 99: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

Push-Pull (Pipelining)

REQ REP

REP

REP

PUB

SUB

SUB

SUB

PUSH

PULL

PULL

PULL

Page 100: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

Push-Pull (Pipelining)

REQ REP

REP

REP

PUB

SUB

SUB

SUB

PUSH

PULL

PULL

PULL

Page 101: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

Push-Pull (Pipelining)

REQ REP

REP

REP

PUB

SUB

SUB

SUB

PUSH

PULL

PULL

PULL

Page 102: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

Push-Pull (Pipelining)

REQ REP

REP

REP

PUB

SUB

SUB

SUB

PUSH

PULL

PULL

PULL

Page 103: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Message PatternsRequest-Reply Publish-Subscribe

Push-Pull (Pipelining) Pair

REQ REP

REP

REP

PUB

SUB

SUB

SUB

PUSH

PULL

PULL

PULL

PAIR PAIR

Page 104: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Devices

Queue Forwarder Streamer

Design architectures around devices.

Page 105: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Devices

Queue Forwarder Streamer

Design architectures around devices.

REQ REP

Page 106: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Devices

Queue Forwarder Streamer

Design architectures around devices.

PUB SUB

Page 107: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Devices

Queue Forwarder Streamer

Design architectures around devices.

PUSH PULL

Page 108: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

Page 109: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

Orders of magnitude faster than most MQs

Page 110: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

Orders of magnitude faster than most MQsHigher throughput than raw sockets

Page 111: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

Orders of magnitude faster than most MQsHigher throughput than raw sockets

Intelligent message batching

Page 112: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

Orders of magnitude faster than most MQsHigher throughput than raw sockets

Intelligent message batchingEdge case optimizations

Page 113: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Concurrency?"Come for the messaging, stay for the easy concurrency"

Page 114: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Hintjens’ Law of Concurrency

e = mc2

E is effort, the pain that it takesM is mass, the size of the codeC is conflict, when C threads collide

Page 115: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Hintjens’ Law of Concurrency

Page 116: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Hintjens’ Law of Concurrency

Page 117: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Hintjens’ Law of Concurrency

e=mc2, for c=1ZeroMQ:

Page 118: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

ZeroMQ

EasyCheapFastExpressive

Messaging toolkit for concurrency and distributed systems.

... familiar socket API

... lightweight queues in a library

... higher throughput than raw TCP

... maps to your architecture

Page 119: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

geventExecution model

Page 120: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Threading vs EventedEvented seems to be preferred for scalable I/O applications

Page 121: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Evented Stack

Non-blocking Code

Flow Control

I/O Abstraction

Reactor

Event Poller

I/OLoop

Page 122: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

def lookup(country, search_term): main_d = defer.Deferred()

def first_step(): query = "http://www.google.%s/search?q=%s" % (country,search_term) d = getPage(query) d.addCallback(second_step, country) d.addErrback(failure, country)

def second_step(content, country): m = re.search('<div id="?res.*?href="(?P<url>http://[^"]+)"', content, re.DOTALL) if not m: main_d.callback(None) return url = m.group('url') d = getPage(url) d.addCallback(third_step, country, url) d.addErrback(failure, country)

def third_step(content, country, url): m = re.search("<title>(.*?)</title>", content) if m: title = m.group(1) main_d.callback(dict(url = url, title = title)) else: main_d.callback(dict(url=url, title="{not-specified}"))

def failure(e, country): print ".%s FAILED: %s" % (country, str(e)) main_d.callback(None)

first_step() return main_d

123456789

10111213141516171819202122232425262728293031323334

Page 123: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

gevent

Reactor / Event Poller

Greenlets Monkey patching

“Regular” Python

Page 124: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Green threads“Threads” implemented in user space (VM, library)

Page 125: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Monkey patchingsocket, ssl, threading, time

Page 126: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Twisted

Page 127: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Twisted~400 modules

Page 128: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

gevent25 modules

Page 129: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

http://nichol.as

Page 130: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

http://nichol.as

Page 131: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Performance

http://nichol.as

Page 132: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Building a Networking App#===# 1. Basic gevent TCP server

from gevent.server import StreamServer

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

tcp_server = StreamServer(('127.0.0.1', 1234), handle_tcp)tcp_server.serve_forever()

123456789

10111213

Page 133: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

#===# 2. Basic gevent TCP server and WSGI server

from gevent.pywsgi import WSGIServerfrom gevent.server import StreamServer

def handle_http(env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

tcp_server = StreamServer(('127.0.0.1', 1234), handle_tcp)tcp_server.start()

http_server = WSGIServer(('127.0.0.1', 8080), handle_http)http_server.serve_forever()

123456789

10111213141516171819202122

Page 134: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gevent.pywsgi import WSGIServerfrom gevent.server import StreamServerfrom gevent.socket import create_connection

def handle_http(env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

def client_connect(address): sockfile = create_connection(address).makefile() while True: line = sockfile.readline() # returns None on EOF if line is not None: print "<<<", line, else: break

tcp_server = StreamServer(('127.0.0.1', 1234), handle_tcp)tcp_server.start()

gevent.spawn(client_connect, ('127.0.0.1', 1234))

http_server = WSGIServer(('127.0.0.1', 8080), handle_http)http_server.serve_forever()

123456789

10111213141516171819202122232425262728293031

Page 135: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gevent.pywsgi import WSGIServerfrom gevent.server import StreamServerfrom gevent.socket import create_connection

def handle_http(env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

def client_connect(address): sockfile = create_connection(address).makefile() while True: line = sockfile.readline() # returns None on EOF if line is not None: print "<<<", line, else: break

tcp_server = StreamServer(('127.0.0.1', 1234), handle_tcp)http_server = WSGIServer(('127.0.0.1', 8080), handle_http)greenlets = [ gevent.spawn(tcp_server.serve_forever), gevent.spawn(http_server.serve_forever), gevent.spawn(client_connect, ('127.0.0.1', 1234)),]gevent.joinall(greenlets)

123456789

1011121314151617181920212223242526272829303132

Page 136: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

ZeroMQ in gevent?

Page 137: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 138: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gevent import spawnfrom gevent_zeromq import zmq

context = zmq.Context()

def serve(): socket = context.socket(zmq.REP) socket.bind("tcp://localhost:5559") while True: message = socket.recv() print "Received request: ", message socket.send("World")

server = spawn(serve)

def client(): socket = context.socket(zmq.REQ) socket.connect("tcp://localhost:5559") for request in range(10): socket.send("Hello") message = socket.recv() print "Received reply ", request, "[", message, "]"

spawn(client).join()

123456789

101112131415161718192021222324

Page 139: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Actor model?Easy to implement, in whole or in part,

optionally with ZeroMQ

Page 140: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency
Page 141: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

What is gevent missing?

Page 142: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

What is gevent missing?

Documentation

Page 143: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

What is gevent missing?

Documentation

Application framework

Page 144: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

gserviceApplication framework for gevent

Page 145: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gevent.pywsgi import WSGIServerfrom gevent.server import StreamServerfrom gevent.socket import create_connection

def handle_http(env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

def client_connect(address): sockfile = create_connection(address).makefile() while True: line = sockfile.readline() # returns None on EOF if line is not None: print "<<<", line, else: break

tcp_server = StreamServer(('127.0.0.1', 1234), handle_tcp)http_server = WSGIServer(('127.0.0.1', 8080), handle_http)greenlets = [ gevent.spawn(tcp_server.serve_forever), gevent.spawn(http_server.serve_forever), gevent.spawn(client_connect, ('127.0.0.1', 1234)),]gevent.joinall(greenlets)

123456789

1011121314151617181920212223242526272829303132

Page 146: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gevent.pywsgi import WSGIServerfrom gevent.server import StreamServerfrom gevent.socket import create_connection

from gservice.core import Service

def handle_http(env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

def handle_tcp(socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

def client_connect(address): sockfile = create_connection(address).makefile() while True: line = sockfile.readline() # returns None on EOF if line is not None: print "<<<", line, else: break

app = Service()app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))app.add_service(TcpClient(('127.0.0.1', 1234), client_connect))app.serve_forever()

123456789

10111213141516171819202122232425262728293031

Page 147: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

from gservice.core import Servicefrom gservice.config import Setting

class MyApplication(Service): http_port = Setting('http_port') tcp_port = Setting('tcp_port') connect_address = Setting('connect_address') def __init__(self): self.add_service(WSGIServer(('127.0.0.1', self.http_port), self.handle_http)) self.add_service(StreamServer(('127.0.0.1', self.tcp_port), self.handle_tcp)) self.add_service(TcpClient(self.connect_address, self.client_connect)) def client_connect(self, address): sockfile = create_connection(address).makefile() while True: line = sockfile.readline() # returns None on EOF if line is not None: print "<<<", line, else: break def handle_tcp(self, socket, address): print 'new tcp connection!' while True: socket.send('hello\n') gevent.sleep(1)

def handle_http(self, env, start_response): start_response('200 OK', [('Content-Type', 'text/html')]) print 'new http request!' return ["hello world"]

123456789

1011121314151617181920212223242526272829303132

Page 148: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

# example.conf.py

pidfile = 'example.pid'logfile = 'example.log'http_port = 8080tcp_port = 1234connect_address = ('127.0.0.1', 1234)

def service(): from example import MyApplication return MyApplication()

123456789

1011

# Run in the foregroundgservice -C example.conf.py

# Start service as daemongservice -C example.conf.py start

# Control servicegservice -C example.conf.py restartgservice -C example.conf.py reloadgservice -C example.conf.py stop

# Run with overriding configurationgservice -C example.conf.py -X 'http_port = 7070'

Page 149: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Generalizinggevent proves a model that can be implemented in almost

any language that can implement an evented stack

Page 150: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

gevent

EasySmallFastCompatible

Futuristic evented platform for network applications.

... just normal Python

... only 25 modules

... top performing server

... works with most libraries

Page 151: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

RaidenLightning fast, scalable messaging

https://github.com/progrium/raiden

Page 152: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Concurrency models

Traditional multithreading

Async or Evented I/O

Actor model

Page 153: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Conclusion

Two very simple, but very powerful tools for distributed / concurrent systems

Page 154: Distributed Systems - GitHub · Defines the “computational unit ... part of a distributed system that does not also need local concurrency. ... Messaging toolkit for concurrency

Thanks@progrium