State of the real-time web with Django

Aymeric Augustin - @aymericaugustin

DjangoCon US - September 5th, 2013

Upload: others

Post on 26-May-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

State of the real-time web with DjangoAymeric Augustin - @aymericaugustin

DjangoCon US - September 5th, 2013

1

What are we talking about?

Real-time

1. Systems responding within deadlines

2. Simulations running at wall clock time

3. Processing events without perceivable delay

http://en.wikipedia.org/wiki/Real-time_computing

Real-time web

“The real-time web is a set of technologies and practices that enable users to receive information as soon as it is published by its authors, rather than requiring that they or their software check a source periodically for updates.”

http://en.wikipedia.org/wiki/Real-time_web

Use cases

Games

Chat

Live data

Notifications

Collaboration

VoIP

Social feeds

All these use cases require servers to push events to clients.

Problem

Client -> Server: GET / HTTP/1.1

Server -> Client: HTTP/1.1 200 OK

Server: information published!

Client: ???

The request-response model doesn’t allow servers to push events to clients.

From hacks to protocols

Early solutions

• Java applets (1996)
    • Not quite the web
• “Pushlets” (2000)
    • Call back from Java applications into DHTML
• “Comet” (2006)
    • Long-lived HTTP connections to reduce latency
    • A revolution in browser-based user interfaces

http://www.pushlets.com/ – http://infrequently.org/2006/03/comet-low-latency-data-for-the-browser/

HTTP long polling

• Server keeps the request on hold and only sends a response when there is an event to deliver

• Client resends a request after each response

Client -> Server: request
                  (event)
Server -> Client: response
Client -> Server: request
                  (event)
Server -> Client: response
Client -> Server: request

HTTP streaming

• Server sends a series of events in a single HTTP response
    • Chunked
    • EOF-terminated

• Client processes each incoming event

Client -> Server: request
Server -> Client: response ...
                  (event)
Server -> Client: response ...
                  (event)
Server -> Client: response ...

Chunked response

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    25
    This is the data in the first chunk

    1C
    and this is the second one

    0
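The chunked framing shown on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not code from the talk: the function names `encode_chunked` and `decode_chunked` are made up for illustration, and it assumes no chunk extensions or trailers are needed. Each chunk is preceded by its size in hexadecimal, and a zero-length chunk ends the body.

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would terminate the body early
        # Chunk size in hexadecimal, CRLF, chunk data, CRLF.
        out += format(len(chunk), "X").encode("ascii") + b"\r\n"
        out += chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk marks the end of the body
    return bytes(out)


def decode_chunked(body):
    """Decode a chunked body back into a list of chunks (trailers unsupported)."""
    chunks, pos = [], 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol].split(b";")[0], 16)  # ignore chunk extensions
        if size == 0:
            return chunks
        start = eol + 2
        chunks.append(body[start:start + size])
        pos = start + size + 2  # skip the CRLF that follows the chunk data


# The two chunks from the slide; each one includes its own trailing line break,
# which is why their sizes are 0x25 (37) and 0x1C (28) bytes.
slide_chunks = [
    b"This is the data in the first chunk\r\n",
    b"and this is the second one\r\n",
]
encoded = encode_chunked(slide_chunks)
```

In a streaming setup the server would write each encoded chunk as soon as an event occurs, instead of building the whole body up front as this sketch does.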

RFC 6202

“The authors acknowledge that both the HTTP long polling and HTTP streaming mechanisms stretch the original semantic of HTTP and that the HTTP protocol was not designed for bidirectional communication.”

http://tools.ietf.org/html/rfc6202

Server-sent events

• HTTP stream of events

• Format: text/event-stream

• JavaScript API: EventSource interface and events

http://www.w3.org/TR/eventsource/

    : The ‘data’ field is mandatory.

    data: This is the first message.

    : ‘event’ and ‘id’ are optional.

    event: message
    data: This is another message
    data: over several lines.

    event: flash
    data: This is a flash event!
    id: 0042
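To make the text/event-stream format concrete, here is a small parser sketch in Python. It is an illustration rather than a full implementation of the specification: `parse_event_stream` is a hypothetical name, and the sketch ignores the `retry` field and does not carry the last event ID over from one event to the next. Lines starting with a colon are comments, consecutive `data` lines are joined with a newline, and a blank line dispatches the event.

```python
def parse_event_stream(text):
    """Parse text/event-stream content into a list of event dicts (simplified)."""
    events = []
    event_type, data_lines, event_id = "message", [], None
    for line in text.splitlines():
        if line == "":
            # A blank line dispatches the pending event, if it has any data.
            if data_lines:
                events.append({
                    "event": event_type,
                    "data": "\n".join(data_lines),
                    "id": event_id,
                })
            event_type, data_lines, event_id = "message", [], None
            continue
        if line.startswith(":"):
            continue  # comment line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
        elif field == "id":
            event_id = value
    return events


# The stream from the slide, with plain ASCII quotes in the comments.
STREAM = (
    ": The 'data' field is mandatory.\n"
    "\n"
    "data: This is the first message.\n"
    "\n"
    ": 'event' and 'id' are optional.\n"
    "\n"
    "event: message\n"
    "data: This is another message\n"
    "data: over several lines.\n"
    "\n"
    "event: flash\n"
    "data: This is a flash event!\n"
    "id: 0042\n"
    "\n"
)
events = parse_event_stream(STREAM)
```

In a browser this parsing is done by the EventSource interface; the sketch only shows what happens to the bytes on the wire.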

Server-sent events

http://caniuse.com/#feat=eventsource

WebSocket

• Provides bidirectional communication in the context of the existing HTTP infrastructure

• RFC 6455 (supersedes hybi-xx and hixie-xx)

• Opening handshake to upgrade from HTTP

• Framing protocol and closing handshake

• Provisions for extensions and subprotocols

• JavaScript API: WebSocket interface and events

http://tools.ietf.org/html/rfc6455 – http://www.w3.org/TR/websockets/

WebSocket

    > GET /endpoint HTTP/1.1
    > Host: ws.example.com
    > Connection: Upgrade
    > Upgrade: websocket
    > Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    > Sec-WebSocket-Version: 13

    < HTTP/1.1 101 Switching Protocols
    < Connection: Upgrade
    < Upgrade: websocket
    < Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
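The relationship between the two Sec-WebSocket headers in this handshake can be checked with the standard library. The sketch below follows the rule from RFC 6455: the server appends a fixed GUID to the client's key, hashes the result with SHA-1, and returns the digest encoded in base64. The function name `accept_key` is an assumption made for this example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Key taken from the request on the slide.
computed = accept_key("dGhlIHNhbXBsZSBub25jZQ==")
```

This exchange does not authenticate anyone; it only lets the client verify that the server actually understood the WebSocket upgrade request.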

WebSocket

    > 81 83 ec a0 1e 0c 81 f9 75
    # TEXT  [ M A S K ]  m  Y  k

    < 81 0a 48 65 6c 6c 6f 20 6d 59 6b 21
    # TEXT  H  e  l  l  o     m  Y  k  !

    < 88 02 03 e8
    # CLOSE 1000

    > 88 82 73 7a 29 aa 70 92
    # CLOSE [ M A S K ] 1000
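The hex dumps on this slide can be decoded by hand, or with a short Python sketch like the one below. It is deliberately limited: `decode_frame` and `close_code` are hypothetical helpers, only payloads up to 125 bytes are handled, and fragmentation is ignored. The first byte carries the FIN bit and the opcode, the second byte carries the MASK bit and the payload length, and frames sent by the client are XOR-masked with a 4-byte key.

```python
OPCODES = {0x1: "TEXT", 0x2: "BINARY", 0x8: "CLOSE", 0x9: "PING", 0xA: "PONG"}


def decode_frame(hex_dump):
    """Decode a short WebSocket frame given as space-separated hex bytes."""
    frame = bytes.fromhex(hex_dump)
    opcode = frame[0] & 0x0F
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if length > 125:
        raise ValueError("extended payload lengths are not handled in this sketch")
    pos = 2
    if masked:
        mask = frame[pos:pos + 4]
        pos += 4
    payload = frame[pos:pos + length]
    if masked:
        # Unmask by XOR-ing each byte with the masking key, cycling over 4 bytes.
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return OPCODES.get(opcode, hex(opcode)), masked, payload


def close_code(payload):
    """Extract the status code from a CLOSE frame payload (first two bytes)."""
    return int.from_bytes(payload[:2], "big")


client_text = decode_frame("81 83 ec a0 1e 0c 81 f9 75")
server_text = decode_frame("81 0a 48 65 6c 6c 6f 20 6d 59 6b 21")
server_close = decode_frame("88 02 03 e8")
client_close = decode_frame("88 82 73 7a 29 aa 70 92")
```

Running `close_code` on the payloads of the two CLOSE frames gives the status code that each side sent.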

WebSocket (2009)

http://caniuse.com/#feat=websockets

Transport frameworks

SockJS

LightStreamer (proprietary)

Socket.IO

Let’s try it!

https://github.com/aaugustin/dcus13rt

An example of long polling

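[Architecture diagram: events from any source are sent to Redis with PUBLISH; the application (Django) receives them from Redis pub/sub with SUBSCRIBE; the web page (JS) fetches them from the application with long-polling GET requests.]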

Helpers

    # demo/models.py (1/2)

    import redis

    CHANNEL = 'demo'

    def send_message(message):
        client = redis.StrictRedis()
        message = message.encode('utf-8')
        return client.publish(CHANNEL, message)

Helpers

    # demo/models.py (2/2)

    def recv_message():
        client = redis.StrictRedis()
        pubsub = client.pubsub()

        pubsub.subscribe(CHANNEL)
        for event in pubsub.listen():
            if event['type'] == 'message':
                message = event['data'].decode('utf-8')
                break
        pubsub.unsubscribe()

        return message

Publisher

    # demo/send_msg.py

    #!/usr/bin/env python

    import sys

    from demo.models import send_message

    message = " ".join(sys.argv[1:])
    num = send_message(message)

    if num == 1:
        log = "Sent to one subscriber"
    else:
        log = "Sent to {} subscribers".format(num)
    print("{}: {}".format(log, message))

Subscriber

    # demo/views.py

    from django.http import HttpResponse

    from demo.models import recv_message

    def long_polling_endpoint(request):
        message = recv_message()
        return HttpResponse(message.encode('utf-8'),
                            content_type='text/plain; charset=utf-8')

HTML

    # demo/templates/demo/long_polling.html

    <!DOCTYPE html>
    {% load static %}
    <html>
      <head>
        <title>Long polling demo</title>
      </head>
      <body>
        <ul><!-- messages will be inserted here --></ul>
        <script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
        <script src="{% static 'demo/long_polling.js' %}"></script>
      </body>
    </html>

JavaScript

    # demo/static/demo/long_polling.js

    $(function () {
        function show_next_message() {
            $('<li><i>Loading…</i></li>')
                .appendTo($('ul'))
                .load('endpoint/', show_next_message);
        }
        show_next_message();
    });

“Demo”

Deployment

    % gunicorn --timeout 10 --workers 2 dcus13rt.wsgi
    [68271] [INFO] Starting gunicorn 17.5
    [68271] [INFO] Listening at: http://127.0.0.1:8000
    [68271] [INFO] Using worker: sync
    [68274] [INFO] Booting worker with pid: 68274
    [68275] [INFO] Booting worker with pid: 68275

    % ab -c 20 -n 20 \
        http://127.0.0.1:8000/long_polling/endpoint/
    Benchmarking 127.0.0.1 (be patient)...

    % PYTHONPATH=. demo/send_msg.py spam
    Sent to 2 subscribers: spam

    % ab -c 20 -n 20 \
        http://127.0.0.1:8000/long_polling/endpoint/
    ...

    Server Software:        gunicorn/17.5
    Server Hostname:        127.0.0.1
    Server Port:            8000

    Document Path:          /long_polling/endpoint/
    Document Length:        4 bytes

    Concurrency Level:      20
    Time taken for tests:   93.476 seconds
    Complete requests:      20
    Failed requests:        18
       (Connect: 0, Receive: 0, Length: 18, Exceptions: 0)

Let’s try again?

https://github.com/aaugustin/dcus13rt

An example of WebSocket

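[Architecture diagram: events from any source are sent to Redis with PUBLISH; the real-time server (Tulip) receives them from Redis pub/sub with SUBSCRIBE and pushes them to the web page (JS) over a WebSocket connection.]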

Subscriber

    # demo/handlers.py

    import tulip

    from demo.models import recv_message

    @tulip.coroutine
    def simple_endpoint(websocket, uri):
        # Doesn't work! recv_message isn't a coroutine.
        message = yield from recv_message()
        websocket.send(message)

Subscriber

    # demo/handlers.py (1/2)

    import tulip
    import websockets

    from demo.models import recv_message

    subscribers = set()

    @tulip.coroutine
    def endpoint(websocket, uri):
        global subscribers
        subscribers.add(websocket)
        yield from websocket.recv()
        subscribers.remove(websocket)

Subscriber

    # demo/handlers.py (2/2)

    def relay_messages():
        while True:
            message = recv_message()
            for websocket in subscribers:
                if websocket.open:
                    websocket.send(message)

    if __name__ == '__main__':
        websockets.serve(endpoint, 'localhost', 7999)
        loop = tulip.get_event_loop()
        loop.run_in_executor(None, relay_messages)
        loop.run_forever()
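Tulip later became the asyncio module of the standard library, so the idea of pushing a blocking call into an executor can be sketched with today's syntax. The example below is a variation on the slide, not a port of it. It assumes a hypothetical in-memory `incoming` queue in place of Redis and asyncio queues in place of WebSocket connections, and it awaits the blocking call from a coroutine instead of running the whole relay loop in a worker thread.

```python
import asyncio
import queue

incoming = queue.Queue()  # hypothetical stand-in for the Redis channel


def recv_message():
    """Blocking call, playing the role of the Redis-backed helper."""
    return incoming.get()


async def relay_messages(subscribers, count):
    """Relay `count` messages to every subscriber without blocking the loop."""
    loop = asyncio.get_running_loop()
    for _ in range(count):
        # The blocking call runs in a worker thread; the event loop stays free
        # to serve other connections while this coroutine is suspended.
        message = await loop.run_in_executor(None, recv_message)
        for subscriber in subscribers:
            subscriber.put_nowait(message)


async def main():
    # Each asyncio.Queue stands in for one connected WebSocket client.
    subscribers = [asyncio.Queue() for _ in range(3)]
    for text in ("spam", "eggs"):
        incoming.put(text)
    await relay_messages(subscribers, count=2)
    return [[q.get_nowait() for _ in range(2)] for q in subscribers]


result = asyncio.run(main())
```

Keeping the fan-out inside the event loop thread avoids touching connection objects from a worker thread, which is a design choice of this sketch rather than something the talk prescribes.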

HTML

    # demo/templates/demo/websocket.html

    <!DOCTYPE html>
    {% load static %}
    <html>
      <head>
        <title>WebSocket demo</title>
      </head>
      <body>
        <ul><!-- messages will be inserted here --></ul>
        <script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
        <script src="{% static 'demo/websocket.js' %}"></script>
      </body>
    </html>

JavaScript

    # demo/static/demo/websocket.js

    $(function () {
        var ws = new WebSocket("ws://localhost:7999/");
        ws.onmessage = function (event) {
            $('<li>' + event.data + '</li>')
                .appendTo($('ul'));
        }
    });

By the way...

https://github.com/aaugustin/dcus13rt

Deployment with gevent

    % gunicorn --timeout 10 --workers 2 \
        --worker-class gevent dcus13rt.wsgi
    [68261] [INFO] Starting gunicorn 17.5
    [68261] [INFO] Listening at: http://127.0.0.1:8000
    [68261] [INFO] Using worker: gevent
    [68264] [INFO] Booting worker with pid: 68264
    [68265] [INFO] Booting worker with pid: 68265

    % ab -c 20 -n 20 \
        http://127.0.0.1:8000/long_polling/endpoint/
    Benchmarking 127.0.0.1 (be patient)...

    % PYTHONPATH=. demo/send_msg.py spam
    Sent to 20 subscribers: spam

    % ab -c 20 -n 20 \
        http://127.0.0.1:8000/long_polling/endpoint/
    ...

    Server Software:        gunicorn/17.5
    Server Hostname:        127.0.0.1
    Server Port:            8000

    Document Path:          /long_polling/endpoint/
    Document Length:        4 bytes

    Concurrency Level:      20
    Time taken for tests:   1.740 seconds
    Complete requests:      20
    Failed requests:        0

Asynchronous I/O

Execution model

• Based on an event loop

• Handle many socket connections in a single thread

• epoll (Linux), kqueue (BSD), IOCP (Windows)

• More efficient than one thread per connection

• Suitable for network programming

http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
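The execution model described above can be illustrated with the standard `selectors` module, which picks epoll or kqueue where the platform provides them. This is a toy sketch: the helpers `run_once` and `read_message` are invented for the example, and local socket pairs stand in for network connections. A single thread waits for readiness on all sockets at once and runs a callback for each one that has data.

```python
import selectors
import socket


def run_once(selector, timeout=1.0):
    """One iteration of an event loop: wait for readiness, then run callbacks."""
    results = []
    for key, _events in selector.select(timeout):
        callback = key.data  # the callback registered along with the socket
        results.append(callback(key.fileobj))
    return results


def read_message(conn):
    """Callback invoked when a socket is readable."""
    return conn.recv(1024)


selector = selectors.DefaultSelector()
pairs = [socket.socketpair() for _ in range(3)]
for index, (ours, theirs) in enumerate(pairs):
    ours.setblocking(False)
    selector.register(ours, selectors.EVENT_READ, read_message)
    # The peer sends one small message, as a remote client would.
    theirs.sendall("event {}".format(index).encode("ascii"))

# A single thread serves all three connections.
received = []
while len(received) < len(pairs):
    received.extend(run_once(selector))

for ours, theirs in pairs:
    selector.unregister(ours)
    ours.close()
    theirs.close()
selector.close()
```

No thread is parked per connection: an idle socket costs only its registration with the selector.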

Programming model

• Based on explicit cooperative multi-threading

• Callbacks

• Coroutines

• In Python: yield (from)

• Suitable for concurrent applications

http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
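Explicit cooperative multi-threading with `yield` and `yield from` can be shown without any framework. The sketch below uses plain generators and a hypothetical round-robin scheduler named `run`; the task names and the shared `log` list are assumptions made for the illustration. Each `yield` is a visible point where a task hands control back, and `yield from` delegates to another coroutine and collects its return value.

```python
from collections import deque


def fetch(name, steps, log):
    """A coroutine that suspends itself once per step of simulated I/O."""
    for step in range(steps):
        log.append("{} step {}".format(name, step))
        yield  # explicit suspension point: control returns to the scheduler
    return "{} done".format(name)


def handler(name, steps, log):
    """Delegate to another coroutine and resume with its result."""
    result = yield from fetch(name, steps, log)
    log.append(result)


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule it
        ready.append(task)


log = []
run([handler("a", 2, log), handler("b", 1, log)])
# log == ["a step 0", "b step 0", "a step 1", "b done", "a done"]
```

Because a task can only be interrupted where it yields, the interleaving is deterministic and easy to reason about, which is the benefit of the programming model.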

The sad truth

“Converting [a synchronous operation] to asynchronous requires modifying every point that calls it to yield control appropriately.”

http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html

What about gevent?

It modifies every function that performs I/O by monkey-patching the standard library so you don’t have to change your own code.

You get the benefits of the execution model for free! But you lose the benefits of the programming model.

Major Python frameworks

• gevent is “a coroutine-based networking library”

• Twisted is “an event-driven networking engine”

• Tornado is a “web framework and asynchronous networking library”

PEP 3156

• Pluggable event loop API for interoperability

• Callbacks, transports, protocols, futures

• High-level scheduler based on coroutines

• Suspend execution with yield from ...

• Reference implementation code-named Tulip

• Effort led by Guido van Rossum himself

It’s too complicated!

django-c10k-demo

• Define WebSocket handlers – @websocket

• Wire them in URLconfs

• Use runserver in dev and gunicorn in prod

• Tulip under the hood

• Real-time made easy!

https://github.com/aaugustin/django-c10k-demo

    from c10ktools.http import websocket

    @websocket
    def worker(ws):
        # ... initial synchronization (not shown) ...
        while True:
            msg = yield from ws.recv()
            if msg is None:
                break
            step, row, col, state = msg.split()
            for subscriber in subscribers[row][col]:
                if subscriber.open:
                    subscriber.send(msg)

        for row, col in subscriptions:
            subscribers[row][col].remove(ws)

That was a lie.

Markus Raetz, Silhouette (1993).

Django isn’t asynchronous

    # Synchronous cache access

    def sync(request, key):
        value = cache.get(key)
        return HttpResponse(value)

    # Asynchronous cache access

    @websocket
    def async(ws):
        value = yield from cache.get(key)
        ws.send(value)

    # Synchronous ORM access

    def sync(request, user_id):
        user = User.objects.get(id=int(user_id))
        bio_html = user.bio.html
        return HttpResponse(bio_html)

    # Asynchronous ORM access

    @websocket
    def async(ws):
        user_id = yield from ws.recv()
        user = yield from User.objects.get(id=int(user_id))
        bio_html = # ??? (no “yield from” attribute access!)
        ws.send(bio_html)

Several lies, actually.

Piotr Kowalski, Identité 4 (1993).

HTTP != real-time

• Execution – threads vs. events

• Programming – preemptive vs. cooperative

• Protocol – request / response vs. message streams

• Scalability – stateless vs. stateful

• Workload – CPU vs. I/O bound

Key learnings

• PEP 3156 will improve support and resources for asynchronous I/O in Python ≥ 3.4.

• Django isn’t designed for explicit cooperative multi-threading and that’s unlikely to change.

• Robust client and server stacks are emerging, as well as deployment best practices.

• Simplified development setups are possible at the cost of losing parity with production.

Charles Ross, Brûlures solaires, 1993 (detail).

95% use case?

Charles Ross, Brûlures solaires, 1993.

We’re still learning.

Photos were taken by the author at the Château d’Oiron, France, on June 22nd.

The “Curios & Mirabilia” exhibition is a modern tribute to the cabinet of curiosities gathered by Claude Gouffier circa 1550.

Thank you! Questions?