Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Asynchronous Architectures for Implementing Scalable Cloud Services: Designing for Graceful Degradation. Evan Cooke, Co-Founder & CTO, Twilio Cloud Communications

Upload: twilio

Post on 08-May-2015


DESCRIPTION

Cloud services power the apps that are becoming the backbone of modern society. The workload of cloud APIs is typically driven by external customers and can fluctuate dramatically minute by minute. Rapid spikes in load can result in request failures as load increases beyond backend capacity and the size of web worker pools. This talk explores the use of asynchronous frameworks like Python's Twisted and gevent to implement services that can dynamically keep socket connections open and increase request latency in order to avoid request failures. We explore how that architectural approach helps Twilio provide high-availability Voice and SMS APIs.

TRANSCRIPT

Page 1: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Asynchronous Architectures for Implementing Scalable Cloud Services

Designing for Graceful Degradation

EVAN COOKE

CO-FOUNDER & CTO, twilio CLOUD COMMUNICATIONS

Page 3:

Cloud services power the apps that are the backbone of modern society: how we work, play, and communicate.

Page 4:

Cloud Workloads Can Be Unpredictable

Page 5:

[Figure: SMS API Usage over time, annotated "6x spike in 5 mins"]

Page 6:

[Figure: Request Latency and Load plotted against Time, with a region marked FAIL]

Danger! Load higher than instantaneous throughput

Page 7:

Don’t Fail Requests

Page 8:

[Diagram: Incoming Requests -> Load Balancer -> App Servers. Labels on the servers: AAA, Throttling, and a Worker Pool of individual workers (W)]

Page 9:

Worker Pools (e.g., Apache/Nginx)

[Figure: Failed Requests plotted against Time, with stages labeled 10%, 70%, and 100%+]

Page 10:

Problem Summary

• Cloud services often use worker pools to handle incoming requests

• When load goes beyond the size of the worker pool, requests fail
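To make the failure mode concrete, here is a minimal sketch, assuming a server that rejects a request immediately when no worker is free. The names `WorkerPool`, `try_handle`, and `finish`, and the pool size of 4, are hypothetical and chosen only for illustration; they do not come from the talk.

```python
import threading


class WorkerPool:
    """Hypothetical fixed-size pool: one slot per worker."""

    def __init__(self, size):
        self._slots = threading.Semaphore(size)

    def try_handle(self):
        # Non-blocking acquire: if every worker is busy, the request fails.
        return self._slots.acquire(blocking=False)

    def finish(self):
        # A worker completes its request and becomes available again.
        self._slots.release()


pool = WorkerPool(size=4)

# A spike of 6 simultaneous requests arrives before any worker finishes.
outcomes = [pool.try_handle() for _ in range(6)]
accepted = outcomes.count(True)
failed = outcomes.count(False)
print(accepted, failed)
```

Every request beyond the pool size is turned away at once, even though the same request might have succeeded a moment later.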

Page 11:

What next?

A few observations based on work implementing and scaling the Twilio API over the past 4 years...

• Twilio Voice/SMS Cloud APIs

• 100,000 Twilio Developers

• 100+ employees

Page 12:

Observation 1

For many APIs, taking more time to service a request is better than failing that request

Implication: in many cases, it is better to service a request with some delay rather than failing it

Page 13:

Observation 2

Matching the amount of available resources precisely to the size of incoming request worker pools is challenging

Implication: under load, it may be possible to delay or drop only those requests that truly impact resources

Page 14:

What are we going to do?

Suggestion: if request concurrency were very cheap, we could implement delay and finer-grained resource controls much more easily...

Page 15:

Event-driven programming and the Reactor Pattern

Page 16:

Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req);
    resp = socket.read();
    print(resp);

[Figure: worker timeline over Time, with relative cost per statement: 1, 1, 10000x, 10000000x, 10]

Page 17:

Event-driven programming and the Reactor Pattern

Huge IO latency blocks worker

Page 18:

Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req, fn() {
        socket.read(fn(resp) {
            print(resp);
        });
    });

Make IO operations async and “callback” when done

Page 19:

Event-driven programming and the Reactor Pattern

    reactor.run_forever();

Central dispatch to coordinate event callbacks

Page 20:

Event-driven programming and the Reactor Pattern

[Figure: worker timeline over Time for the async version, with only small per-statement costs (labels of 1 and 10)]

Result: we don’t block the worker
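The pseudocode on these slides can be approximated with Python's standard `selectors` module. This is a toy sketch rather than any framework's implementation: the `Reactor` class and its `on_readable`, `cancel`, and `run_until_idle` methods are hypothetical names, and `socket.socketpair()` stands in for a real client/server connection.

```python
import selectors
import socket


class Reactor:
    """Toy reactor: waits for ready sockets and dispatches their callbacks."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def on_readable(self, sock, callback):
        self._selector.register(sock, selectors.EVENT_READ, callback)

    def cancel(self, sock):
        self._selector.unregister(sock)

    def run_until_idle(self):
        # Central dispatch loop: block until some socket is ready, then run
        # its callback. Stop once no sockets remain registered.
        while self._selector.get_map():
            for key, _events in self._selector.select():
                key.data(key.fileobj)


reactor = Reactor()
responses = []

client, server = socket.socketpair()
client.setblocking(False)
server.setblocking(False)


def on_request(sock):
    # Runs only when the request bytes have arrived; nothing blocks.
    request = sock.recv(1024)
    reactor.cancel(sock)
    sock.sendall(b"echo:" + request)


def on_response(sock):
    responses.append(sock.recv(1024))
    reactor.cancel(sock)


reactor.on_readable(server, on_request)
reactor.on_readable(client, on_response)
client.sendall(b"GET /\r\n\r\n")

reactor.run_until_idle()
client.close()
server.close()
print(responses)
```

A single thread services both ends of the connection here because no callback ever waits on IO; the loop only calls code whose data is already available.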

Page 21:

(Some) Reactor Pattern Frameworks

js/node.js

python/twisted

python/gevent

c/libevent

c/libev

ruby/eventmachine

java/nio/netty

Page 22:

The Callback Mess

Python Twisted

    req = 'GET /'
    req += '\r\n\r\n'

    def r(resp):
        print(resp)

    def w(_):
        socket.read().addCallback(r)

    socket.write(req).addCallback(w)

Page 23:

The Callback Mess

Python Twisted

    req = 'GET /'
    req += '\r\n\r\n'

    yield socket.write(req)
    resp = yield socket.read()
    print(resp)

Use deferred generators and inline callbacks

Page 24:

The Callback Mess

Easy sequential programming with mostly implicit async IO
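The idea behind deferred generators can be shown in a few lines of plain Python. The sketch below is not Twisted's implementation: the `Deferred` class and the `inline_callbacks` decorator are simplified, hypothetical stand-ins that only illustrate how a `yield` can suspend a function until a pending result fires.

```python
class Deferred:
    """Toy stand-in for a pending result (illustration only)."""

    def __init__(self):
        self._callbacks = []
        self.fired = False
        self.result = None

    def add_callback(self, fn):
        if self.fired:
            fn(self.result)
        else:
            self._callbacks.append(fn)
        return self

    def fire(self, result):
        self.fired = True
        self.result = result
        callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(result)


def inline_callbacks(gen_func):
    """Drive a generator: each yielded Deferred resumes it when fired."""

    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        done = Deferred()

        def step(value):
            try:
                pending = gen.send(value)
            except StopIteration as stop:
                done.fire(stop.value)
                return
            pending.add_callback(step)

        step(None)
        return done

    return wrapper


pending_write = Deferred()
pending_read = Deferred()
log = []


@inline_callbacks
def fetch():
    yield pending_write           # reads sequentially, but suspends here
    resp = yield pending_read
    log.append(resp)
    return resp


result = fetch()
pending_write.fire(None)          # the "write" completes
pending_read.fire("HTTP/1.1 200 OK")
print(log)
```

The body of `fetch` reads top to bottom like blocking code, yet control returns to the caller at each `yield` until the corresponding result is available.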

Page 25:

Enter gevent

“gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.”

    socket.write()
    resp = socket.read()
    print(resp)

Natively Async

Page 26:

Enter gevent

Simple Echo Server

    from gevent.server import StreamServer

    def echo(socket, address):
        print('New connection from %s:%s' % address)
        socket.sendall(b'Welcome to the echo server!\r\n')
        # File-like wrapper around the socket for line-oriented IO
        fileobj = socket.makefile(mode='rwb')
        line = fileobj.readline()
        fileobj.write(line)
        fileobj.flush()
        print("echoed %r" % line)

    if __name__ == '__main__':
        server = StreamServer(('0.0.0.0', 6000), echo)
        server.serve_forever()

Easy sequential model

Fully async
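For readers without gevent installed, a rough standard-library analogue of the echo server can be written with `asyncio`. This is a sketch under that substitution, not the gevent API: the suspension points are explicit (`await`) rather than implicit, and the `demo` coroutine, the loopback address, and the ephemeral port are assumptions made so the example can exercise itself.

```python
import asyncio


async def echo(reader, writer):
    # Each connection runs as its own cheap coroutine, not a worker thread.
    writer.write(b"Welcome to the echo server!\r\n")
    await writer.drain()
    line = await reader.readline()
    writer.write(line)
    await writer.drain()
    writer.close()


async def demo():
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    welcome = await reader.readline()
    writer.write(b"ping\n")
    await writer.drain()
    echoed = await reader.readline()

    writer.close()
    server.close()
    return welcome, echoed


welcome, echoed = asyncio.run(demo())
print(welcome, echoed)
```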

Page 27:

Async Services with Ginkgo

Ginkgo is a simple framework for composing async gevent services with common configuration, logging, daemonizing, etc.

https://github.com/progrium/ginkgo

Let’s look at a simple example that implements a TCP and HTTP server...

Page 28:

Async Services with Ginkgo

    import gevent
    from gevent.pywsgi import WSGIServer
    from gevent.server import StreamServer

    from ginkgo.core import Service

    def handle_http(env, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        print('new http request!')
        return ["hello world"]

    def handle_tcp(socket, address):
        print('new tcp connection!')
        while True:
            socket.send('hello\n')
            gevent.sleep(1)

    app = Service()
    app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))
    app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))
    app.serve_forever()

Pages 29-32:

Async Services with Ginkgo

The same listing is repeated with one part called out per page:

• Page 29: Import WSGI/TCP Servers

• Page 30: HTTP Handler

• Page 31: TCP Handler

• Page 32: Service Composition

Page 33:

Using our async reactor-based approach, let’s redesign our serving infrastructure

[Diagram: Incoming Requests -> Load Balancer -> Async Servers]

Page 34:

[Diagram: Incoming Requests -> Load Balancer -> Async Servers, each with an AAA layer]

Step 1: define an authentication and authorization layer that will identify the user and the resource being requested

Page 35:

[Diagram: Incoming Requests -> Load Balancer -> Async Servers, each with AAA and Throttling layers, plus a Concurrency Manager]

Step 2: add a throttling layer and concurrency manager

Page 36:

Concurrency Admission Control

• Goal: limit concurrency by delaying or selectively failing requests

• Common metrics
  - By Account
  - By Resource Type
  - By Availability of Dependent Resources

• What we’ve found useful
  - By (Account, Resource Type)
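Admission control keyed by (Account, Resource Type) might look roughly like the sketch below. It is an illustration only, not Twilio's implementation: the `ConcurrencyManager` class here, its `try_admit` and `release` methods, the account ID `AC123`, and the limits are all hypothetical.

```python
from collections import defaultdict


class ConcurrencyManager:
    """Hypothetical admission control keyed by (account, resource type)."""

    def __init__(self, limits, default_limit=1):
        self._limits = limits                 # {(account, resource): max in flight}
        self._default_limit = default_limit
        self._in_flight = defaultdict(int)

    def try_admit(self, account, resource):
        key = (account, resource)
        limit = self._limits.get(key, self._default_limit)
        if self._in_flight[key] >= limit:
            return False                      # over the limit: delay or deny
        self._in_flight[key] += 1
        return True

    def release(self, account, resource):
        key = (account, resource)
        if self._in_flight[key] > 0:
            self._in_flight[key] -= 1


manager = ConcurrencyManager({("AC123", "sms"): 2})

first = manager.try_admit("AC123", "sms")
second = manager.try_admit("AC123", "sms")
third = manager.try_admit("AC123", "sms")      # third concurrent SMS request
voice = manager.try_admit("AC123", "voice")    # different resource, own budget
manager.release("AC123", "sms")                # one SMS request completes
retry = manager.try_admit("AC123", "sms")
print(first, second, third, voice, retry)
```

Keying on the pair means a burst of SMS traffic from one account uses up only that account's SMS budget, leaving its voice requests and every other account unaffected.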

Page 37:

Delay - delay responses without failing requests

[Figure: Latency plotted against Load]
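Delaying instead of failing can be sketched with a semaphore in front of the constrained resource. The example uses the standard library's `asyncio` rather than gevent, and the capacity of 2, the 50 ms of simulated work, and the names `handle` and `backend_slots` are assumptions for illustration.

```python
import asyncio
import time


async def handle(request_id, backend_slots, latencies):
    start = time.monotonic()
    # Waiting here keeps the connection open instead of failing the request.
    async with backend_slots:
        await asyncio.sleep(0.05)             # stand-in for backend work
    latencies[request_id] = time.monotonic() - start


async def main():
    backend_slots = asyncio.Semaphore(2)      # assumed backend capacity
    latencies = {}
    # Six requests arrive at once, three times what the backend can take.
    await asyncio.gather(
        *(handle(i, backend_slots, latencies) for i in range(6))
    )
    return latencies


latencies = asyncio.run(main())
completed = len(latencies)
fastest = min(latencies.values())
slowest = max(latencies.values())
print(completed, round(fastest, 2), round(slowest, 2))
```

All six requests complete; the later ones simply take longer because they wait for a slot, which is the latency-for-availability trade the slide describes.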

Page 38:

Deny - deny requests based on resource usage

[Figure: Latency plotted against Load for two resources, /x (marked Fail) and /*]

Page 39:

[Diagram: Incoming Requests -> Load Balancer -> App Servers, each with AAA and Throttling layers, plus a Concurrency Manager and Dependent Services that have their own Throttling]

Step 3: allow backend resources to throttle requests
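One way to picture Step 3 is a limiter whose ceiling is set by the dependent service rather than fixed in the front end. The sketch below is hypothetical throughout: `BackendThrottle`, `report_capacity`, and the numbers are invented for illustration, and the talk does not specify how backends signal their capacity.

```python
class BackendThrottle:
    """Hypothetical limiter whose ceiling is set by a dependent service."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def report_capacity(self, available):
        # Called when the backend signals how much work it can take.
        self.max_in_flight = max(0, available)

    def try_admit(self):
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight = max(0, self.in_flight - 1)


throttle = BackendThrottle(max_in_flight=3)
before = [throttle.try_admit() for _ in range(3)]   # all three admitted

throttle.report_capacity(1)      # backend is struggling and asks for less load
throttle.release()
throttle.release()               # two requests finish; one is still in flight
during = throttle.try_admit()    # denied: one in flight, ceiling is now one

throttle.release()
after = throttle.try_admit()     # admitted again once the slot frees up
print(before, during, after)
```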

Page 40:

Summary

Async frameworks like gevent allow you to easily decouple a request from access to constrained resources

[Figure: Request Latency plotted against Time, with a region marked Service-wide Failure]

Page 41:

Don’t Fail Requests

Decrease Performance

Page 43:

CONTENTS CONFIDENTIAL & COPYRIGHT © TWILIO INC. 2012

Evan Cooke @emcooke

twilio