scaling django dc09

Scaling Django Web Apps Mike Malone djangocon 2009 Thursday, September 10, 2009

DESCRIPTION

Django scaling from Mike Malone http://immike.net/blog/about/

TRANSCRIPT

Page 1: Scaling Django Dc09

Scaling Django Web Apps

Mike Malone

djangocon 2009

Thursday, September 10, 2009


Page 4: Scaling Django Dc09

http://www.flickr.com/photos/kveton/2910536252/

Page 6: Scaling Django Dc09

Pownce

• Large scale

• Hundreds of requests/sec

• Thousands of DB operations/sec

• Millions of user relationships

• Millions of notes

• Terabytes of static data

Page 7: Scaling Django Dc09

Pownce

• Encountered and eliminated many common scaling bottlenecks

• Real world example of scaling a Django app

• Django provides a lot for free

• I’ll be focusing on what you have to build yourself, and the rare places where Django got in the way

Page 8: Scaling Django Dc09

Scalability

Page 9: Scaling Django Dc09

Scalability

Scalability is NOT:

• Speed / Performance

• Generally affected by language choice

• Achieved by adopting a particular technology

Page 10: Scaling Django Dc09

A Scalable Application

    import time

    def application(environ, start_response):
        time.sleep(10)
        start_response('200 OK', [('content-type', 'text/plain')])
        return ('Hello, world!',)

Page 11: Scaling Django Dc09

A High Performance Application

    def application(environ, start_response):
        remote_addr = environ['REMOTE_ADDR']
        f = open('access-log', 'a+')
        f.write(remote_addr + "\n")
        f.flush()
        f.seek(0)
        hits = sum(1 for l in f.xreadlines()
                   if l.strip() == remote_addr)
        f.close()
        start_response('200 OK', [('content-type', 'text/plain')])
        return (str(hits),)

Page 12: Scaling Django Dc09

Scalability

A scalable system doesn’t need to change when the size of the problem changes.

Page 13: Scaling Django Dc09

Scalability

• Accommodate increased usage

• Accommodate increased data

• Maintainable

Page 14: Scaling Django Dc09

Scalability

• Two kinds of scalability

• Vertical scalability: buying more powerful hardware, replacing what you already own

• Horizontal scalability: buying additional hardware, supplementing what you already own

Page 15: Scaling Django Dc09

Vertical Scalability

• Costs don’t scale linearly (a server that’s twice as fast costs more than twice as much)

• Inherently limited by current technology

• But it’s easy! If you can get away with it, good for you.

Page 16: Scaling Django Dc09

Vertical Scalability

Skyscrapers are special. Normal buildings don’t need 10-floor foundations. Just build!

- Cal Henderson

Page 17: Scaling Django Dc09

Horizontal Scalability

The ability to increase a system’s capacity by adding more processing units (servers)

Page 18: Scaling Django Dc09

Horizontal Scalability

It’s how large apps are scaled.

Page 19: Scaling Django Dc09

Horizontal Scalability

• A lot more work to design, build, and maintain

• Requires some planning, but you don’t have to do all the work up front

• You can scale progressively...

• The rest of the presentation follows roughly that progression

Page 20: Scaling Django Dc09

Caching

Page 21: Scaling Django Dc09

Caching

• Several levels of caching available in Django

• Per-site cache: caches every page that doesn’t have GET or POST parameters

• Per-view cache: caches output of an individual view

• Template fragment cache: caches fragments of a template

• None of these are that useful if pages are heavily personalized

Page 22: Scaling Django Dc09

Caching

• Low-level Cache API

• Much more flexible, allows you to cache at any granularity

• At Pownce we typically cached

• Individual objects

• Lists of object IDs

• Hard part is invalidation

Page 23: Scaling Django Dc09

Caching

• Cache backends:

• Memcached

• Database caching

• Filesystem caching

Page 24: Scaling Django Dc09

Caching

Use Memcache.

Page 25: Scaling Django Dc09

Sessions

Use Memcache.

Page 26: Scaling Django Dc09

Sessions

Or Tokyo Cabinet: http://github.com/ericflo/django-tokyo-sessions/

Thanks @ericflo

Page 27: Scaling Django Dc09

Caching

Basic caching comes free with Django:

    from django.core.cache import cache
    from django.db import models

    class UserProfile(models.Model):
        ...
        def get_social_network_profiles(self):
            cache_key = 'networks_for_%s' % self.user.id
            profiles = cache.get(cache_key)
            if profiles is None:
                # Cache miss: hit the database, then store the result
                profiles = self.user.social_network_profiles.all()
                cache.set(cache_key, profiles)
            return profiles

Page 28: Scaling Django Dc09

Caching

Invalidate when a model is saved or deleted:

    from django.core.cache import cache
    from django.db.models import signals

    def nuke_social_network_cache(sender, instance, **kwargs):
        # Signal receivers get the saved or deleted object as `instance`
        cache_key = 'networks_for_%s' % instance.user_id
        cache.delete(cache_key)

    signals.post_save.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
    signals.post_delete.connect(nuke_social_network_cache, sender=SocialNetworkProfile)

Page 29: Scaling Django Dc09

Caching

• Invalidate post_save, not pre_save

• Still a small race condition

• Simple solution, worked for Pownce:

• Instead of deleting, set the cache key to None for a short period of time

• Instead of using set to cache objects, use add, which fails if there’s already something stored for the key
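The two bullets above can be sketched without a memcached server. This is only an illustration in modern Python: `TinyCache` is a hypothetical in-memory stand-in that mimics the `get` / `set` / `add` semantics, and `invalidate`, `get_or_load`, and the timeout values are assumptions, not code from Pownce.

```python
import time


class TinyCache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self, clock=time.monotonic):
        self._data = {}          # key -> (value, expires_at)
        self._clock = clock

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        if entry[1] <= self._clock():
            del self._data[key]  # expired
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def set(self, key, value, timeout):
        self._data[key] = (value, self._clock() + timeout)

    def add(self, key, value, timeout):
        """Store only if the key is absent, like memcached's add."""
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


def invalidate(cache, key, hold_seconds=5):
    # Instead of deleting, park a None marker on the key for a short time.
    cache.set(key, None, hold_seconds)


def get_or_load(cache, key, loader, timeout=300):
    value = cache.get(key)
    if value is None:
        value = loader()
        # add() does nothing while the marker (or any value) is present,
        # so a reader holding stale data cannot undo a fresh invalidation.
        cache.add(key, value, timeout)
    return value
```

While the marker is live, every read falls through to the loader, which is the price paid for closing the race window.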

Page 30: Scaling Django Dc09

Advanced Caching

• Memcached’s atomic increment and decrement operations are useful for maintaining counts

• They were added to the Django cache API in Django 1.1

Page 31: Scaling Django Dc09

Advanced Caching

• On older versions you can still use them if you poke at the internals of the cache object a bit

• cache._cache is the underlying cache object

    try:
        result = cache._cache.incr(cache_key, delta)
    except ValueError:  # nonexistent key raises ValueError
        # Do it the hard way, store the result.
    return result

Page 32: Scaling Django Dc09

Advanced Caching

• Other memcached operations missing from the cache API

• delete_multi & set_multi

• append: add data to existing key after existing data

• prepend: add data to existing key before existing data

• cas: store this data, but only if no one has edited it since I fetched it

Page 33: Scaling Django Dc09

Advanced Caching

• It’s often useful to cache objects ‘forever’ (i.e., until you explicitly invalidate them)

• User and UserProfile

• fetched almost every request

• rarely change

• But Django won’t let you

• IMO, this is a bug :(

Page 34: Scaling Django Dc09

The Memcache Backend

    class CacheClass(BaseCache):
        def __init__(self, server, params):
            BaseCache.__init__(self, params)
            self._cache = memcache.Client(server.split(';'))

        def add(self, key, value, timeout=0):
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            # timeout=0 is falsy, so it is always replaced by the default
            return self._cache.add(smart_str(key), value, timeout or self.default_timeout)

Page 35: Scaling Django Dc09

The Memcache Backend

    class CacheClass(BaseCache):
        def __init__(self, server, params):
            BaseCache.__init__(self, params)
            self._cache = memcache.Client(server.split(';'))

        def add(self, key, value, timeout=None):
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            # Fall back to the default only when no timeout was given,
            # so an explicit 0 reaches memcached unchanged
            if timeout is None:
                timeout = self.default_timeout
            return self._cache.add(smart_str(key), value, timeout)
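The difference between the two versions of `add` comes down to how the timeout falls back to the default. A minimal sketch of just that logic, in modern Python; `DEFAULT_TIMEOUT` and both function names are made up for illustration:

```python
DEFAULT_TIMEOUT = 300  # assumed default, in seconds


def resolve_timeout_buggy(timeout=0):
    # 0 is falsy, so a request to cache "forever" silently
    # becomes the default timeout.
    return timeout or DEFAULT_TIMEOUT


def resolve_timeout_fixed(timeout=None):
    # Only fall back when the caller passed nothing at all,
    # which lets an explicit 0 through.
    if timeout is None:
        timeout = DEFAULT_TIMEOUT
    return timeout
```

Memcached treats an expiration of 0 as "never expire", which is why letting 0 through matters for objects such as User and UserProfile.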

Page 36: Scaling Django Dc09

Advanced Caching

• Typical setup has memcached running on web servers

• Pownce web servers were I/O and memory bound, not CPU bound

• Since we had some spare CPU cycles, we compressed large objects before caching them

• The Python memcache library can do this automatically, but the API is not exposed
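To show the idea of trading spare CPU for memory, here is a rough standard-library sketch of compressing only large values before they are stored. It is not how the memcache client does it internally; the one-byte prefix and the names `pack` / `unpack` are assumptions for illustration.

```python
import pickle
import zlib

MIN_COMPRESS_LEN = 150000  # bytes; values smaller than this are stored as-is


def pack(value, min_compress_len=MIN_COMPRESS_LEN):
    """Serialize a value, compressing it only when it is large."""
    payload = pickle.dumps(value)
    if len(payload) >= min_compress_len:
        return b'Z' + zlib.compress(payload)   # 'Z' marks compressed data
    return b'R' + payload                      # 'R' marks raw data


def unpack(blob):
    """Reverse pack(): decompress if needed, then deserialize."""
    flag, payload = blob[:1], blob[1:]
    if flag == b'Z':
        payload = zlib.decompress(payload)
    return pickle.loads(payload)
```

The threshold keeps small, frequently read objects cheap to fetch while shrinking the few large ones that dominate memory use.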

Page 37: Scaling Django Dc09

Monkey Patching core.cache

    from django.core.cache import cache
    from django.utils.encoding import smart_str
    import inspect as i

    if 'min_compress_len' in i.getargspec(cache._cache.set)[0]:
        class CacheClass(cache.__class__):
            def set(self, key, value, timeout=None, min_compress_len=150000):
                if isinstance(value, unicode):
                    value = value.encode('utf-8')
                if timeout is None:
                    timeout = self.default_timeout
                return self._cache.set(smart_str(key), value, timeout, min_compress_len)
        cache.__class__ = CacheClass

Page 38: Scaling Django Dc09

Advanced Caching

• Useful tool: automagic single object cache

• Use a manager to check the cache prior to any single object get by pk

• Invalidate assets on save and delete

• Eliminated several hundred QPS at Pownce
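The single object cache can be sketched independently of Django. In this illustration a plain dict stands in for memcached and `load_by_pk` stands in for the database query; `CachedLookup` is a hypothetical name, not the manager from the talk.

```python
class CachedLookup:
    """Check a cache before loading a single object by primary key."""

    def __init__(self, cache, load_by_pk, prefix):
        self.cache = cache              # any dict-like store (assumption)
        self.load_by_pk = load_by_pk    # hypothetical loader, e.g. a DB query
        self.prefix = prefix

    def _key(self, pk):
        return '%s:%s' % (self.prefix, pk)

    def get(self, pk):
        key = self._key(pk)
        if key in self.cache:
            return self.cache[key]      # cache hit: no database query
        obj = self.load_by_pk(pk)
        self.cache[key] = obj
        return obj

    def invalidate(self, pk):
        # Call this from save and delete hooks.
        self.cache.pop(self._key(pk), None)
```

In a Django project the same logic would live in a custom manager's `get`, with `invalidate` wired to the `post_save` and `post_delete` signals.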

Page 39: Scaling Django Dc09

Advanced Caching

All this and more at:

http://github.com/mmalone/django-caching/

Page 40: Scaling Django Dc09

Caching

Now that you’ve made life easier for your DB server, the next thing to fall over is your app server.

Page 41: Scaling Django Dc09

Load Balancing

Page 42: Scaling Django Dc09

Load Balancing

• Out of the box, Django uses a shared nothing architecture

• App servers have no single point of contention

• Responsibility pushed down the stack (to DB)

• This makes scaling the app layer trivial: just add another server

Page 43: Scaling Django Dc09

Load Balancing

Spread work between multiple nodes in a cluster using a load balancer.

• Hardware or software

• Layer 7 or Layer 4

[Diagram: Load Balancer → App Servers → Database]

Page 44: Scaling Django Dc09

Load Balancing

• Hardware load balancers

• Expensive, like $35,000 each, plus maintenance contracts

• Need two for failover / high availability

• Software load balancers

• Cheap and easy, but more difficult to eliminate as a single point of failure

• Lots of options: Perlbal, Pound, HAProxy, Varnish, Nginx

Page 45: Scaling Django Dc09

Load Balancing

• Most of these are layer 7 proxies, and some software balancers do cool things

• Caching

• Re-proxying

• Authentication

• URL rewriting

Page 46: Scaling Django Dc09

Load Balancing

A common setup for large operations is to use redundant layer 4 hardware balancers in front of a pool of layer 7 software balancers.

[Diagram: Hardware Balancers → Software Balancers → App Servers]

Page 47: Scaling Django Dc09

Load Balancing

• At Pownce, we used a single Perlbal balancer

• Easily handled all of our traffic (hundreds of simultaneous connections)

• A SPOF, but we didn’t have $100,000 for black box solutions, and weren’t worried about service guarantees beyond three or four nines

• Plus there were some neat features that we took advantage of

Page 48: Scaling Django Dc09

Perlbal Reproxying

Perlbal reproxying is a really cool, and really poorly documented, feature.

Page 49: Scaling Django Dc09

Perlbal Reproxying

1. Perlbal receives request

2. Redirects to App Server

3. App server checks auth (etc.)

4. Returns HTTP 200 with X-Reproxy-URL header set to internal file server URL

5. File served from file server via Perlbal

Page 50: Scaling Django Dc09

Perlbal Reproxying

• Completely transparent to end user

• Doesn’t keep large app server instance around to serve file

• Users can’t access files directly (like they could with a 302)

Page 51: Scaling Django Dc09

Perlbal Reproxying

Plus, it’s really easy:

    from django.http import HttpResponse

    def download(request, filename):
        # Check auth, do your thing
        response = HttpResponse()
        # FILE_SERVER is the base URL of the internal file server
        response['X-REPROXY-URL'] = '%s/%s' % (FILE_SERVER, filename)
        return response

Page 52: Scaling Django Dc09

Load Balancing

Best way to reduce load on your app servers: don’t use them to do hard stuff.

Page 53: Scaling Django Dc09

Queuing

Page 54: Scaling Django Dc09

Queuing

• A queue is simply a bucket that holds messages until they are removed for processing by clients

• Many expensive operations can be queued and performed asynchronously

• User experience doesn’t have to suffer

• Tell the user that you’re running the job in the background (e.g., transcoding)

• Make it look like the job was done real-time (e.g., note distribution)
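As a toy illustration of the pattern, the sketch below uses Python's in-process `queue.Queue` and a worker thread. A real deployment would use one of the external queue servers so that jobs outlive the web process; `process`, `consume`, and `run_demo` are made-up names.

```python
import queue
import threading


def process(job):
    # Stand-in for an expensive operation such as transcoding.
    return job.upper()


def consume(jobs, results):
    """Worker loop: pull jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:      # sentinel: no more work
                return
            results.append(process(job))
        finally:
            jobs.task_done()


def run_demo(items):
    jobs = queue.Queue()
    results = []
    worker = threading.Thread(target=consume, args=(jobs, results))
    worker.start()
    for item in items:
        jobs.put(item)           # enqueueing is cheap; the caller moves on
    jobs.put(None)
    jobs.join()                  # wait here only so the demo can show results
    worker.join()
    return results
```

The request handler's only cost is the `put`; everything slow happens in the consumer.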

Page 55: Scaling Django Dc09

Queuing

• Lots of open source options for queuing

• Ghetto Queue (MySQL + Cron)

• this is the official name.

• Gearman

• TheSchwartz

• RabbitMQ

• Apache ActiveMQ

• ZeroMQ

Page 56: Scaling Django Dc09

Queuing

• Lots of fancy features: brokers, exchanges, routing keys, bindings...

• Don’t let that crap get you down, this is really simple stuff

• Biggest decision: persistence

• Does your queue need to be durable and persistent, able to survive a crash?

• This requires logging to disk which slows things down, so don’t do it unless you have to

Page 57: Scaling Django Dc09

Queuing

• Pownce used a simple ghetto queue built on MySQL / cron

• Problematic if you have multiple consumers pulling jobs from the queue

• No point in reinventing the wheel, there are dozens of battle-tested open source queues to choose from
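The multiple-consumer problem with a table-backed queue is that two workers can select the same row. One common way around it, sketched here with SQLite standing in for MySQL, is to claim a job with a conditional UPDATE and check the affected row count. The table layout and function names are assumptions, not the Pownce schema.

```python
import sqlite3


def make_queue():
    db = sqlite3.connect(':memory:')
    db.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " claimed_by TEXT)"
    )
    return db


def enqueue(db, payload):
    with db:
        db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(db, consumer):
    """Return (id, payload) for a job now owned by consumer, or None."""
    while True:
        row = db.execute(
            "SELECT id, payload FROM jobs"
            " WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with db:
            cur = db.execute(
                "UPDATE jobs SET claimed_by = ?"
                " WHERE id = ? AND claimed_by IS NULL",
                (consumer, row[0]),
            )
        if cur.rowcount == 1:    # we won the race for this row
            return row
        # Another consumer claimed it first; look for the next job.
```

A cron-driven consumer would call `claim_next` in a loop until it returns `None`.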

Page 58: Scaling Django Dc09

Django Standalone Scripts

Consumers need to set up the Django environment:

    from django.core.management import setup_environ
    from mysite import settings

    setup_environ(settings)

Page 59: Scaling Django Dc09

THE DATABASE!

Page 60: Scaling Django Dc09

The Database

• Until now we’ve been talking about

• Shared nothing

• Pushing problems down the stack

• But we have to store a persistent and consistent view of our application’s state somewhere

• Enter, the database...

Page 61: Scaling Django Dc09

CAP Theorem

• Three properties of a shared-data system

• Consistency: all clients see the same data

• Availability: all clients can see some version of the data

• Partition Tolerance: system properties hold even when the system is partitioned & messages are lost

• But you can only have two

Page 62: Scaling Django Dc09

CAP Theorem

• Big long proof... here’s my version.

• Empirically, seems to make sense.

• Eric Brewer

• Professor at University of California, Berkeley

• Co-founder and Chief Scientist of Inktomi

• Probably smarter than me

Page 63: Scaling Django Dc09

CAP Theorem

• The relational database systems we all use were built with consistency as their primary goal

• But at scale our system needs to have high availability and must be partitionable

• The RDBMS’s consistency requirements get in our way

• Most sharding / federation schemes are kludges that trade consistency for availability & partition tolerance

Page 64: Scaling Django Dc09

The Database

• There are lots of non-relational databases coming onto the scene

• CouchDB

• Cassandra

• Tokyo Cabinet

• But they’re not that mature, and they aren’t easy to use with Django

Page 65: Scaling Django Dc09

Denormalization

Page 66: Scaling Django Dc09

Denormalization

• Django encourages normalized data, which is usually good

• But at scale you need to denormalize

• Corollary: joins are evil

• Django makes it really easy to do joins using the ORM, so pay attention

Page 67: Scaling Django Dc09

Denormalization

• Start with a normalized database

• Selectively denormalize things as they become bottlenecks

• Denormalized counts, copied fields, etc. can be updated in signal handlers

Page 68: Scaling Django Dc09

Replication

Page 69: Scaling Django Dc09

Replication

• Typical web app is 80 to 90% reads

• Adding read capacity will get you a long way

• MySQL Master-Slave replication

[Diagram: the master handles reads and writes; slaves are read only]

Page 70: Scaling Django Dc09

Replication

• Django doesn’t make it easy to use multiple database connections, but it is possible

• Some caveats

• Slave lag interacts with caching in weird ways

• You can only save to your primary DB (the one you configure in settings.py)

• Unless you get really clever...

Page 71: Scaling Django Dc09

Replication

1. Create a custom database wrapper by subclassing DatabaseWrapper

    class SlaveDatabaseWrapper(DatabaseWrapper):
        def _cursor(self, settings):
            if not self._valid_connection():
                kwargs = {
                    'conv': django_conversions,
                    'charset': 'utf8',
                    'use_unicode': True,
                }
                # Assumes pick_random_slave returns a dict of connection
                # arguments; merge it rather than overwrite the defaults
                kwargs.update(pick_random_slave(settings.SLAVE_DATABASES))
                self.connection = Database.connect(**kwargs)
                ...
            cursor = CursorWrapper(self.connection.cursor())
            return cursor

Page 72: Scaling Django Dc09

Replication

2. Custom QuerySet that uses primary DB for writes

    class MultiDBQuerySet(QuerySet):
        ...
        def update(self, **kwargs):
            slave_conn = self.query.connection
            self.query.connection = default_connection
            try:
                return super(MultiDBQuerySet, self).update(**kwargs)
            finally:
                # Switch back to the slave even if the write fails
                self.query.connection = slave_conn

Page 73: Scaling Django Dc09

Replication

3. Custom Manager that uses your custom QuerySet

    class SlaveDatabaseManager(db.models.Manager):
        def get_query_set(self):
            return MultiDBQuerySet(self.model, query=self.create_query())

        def create_query(self):
            return db.models.sql.Query(self.model, connection)

Page 74: Scaling Django Dc09

Replication

Example on GitHub:

http://github.com/mmalone/django-multidb/

Page 75: Scaling Django Dc09

http://bit.ly/multidb

Page 76: Scaling Django Dc09

Replication

• Goal:

• Read-what-you-write consistency for writer

• Eventual consistency for everyone else

• Slave lag screws things up
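One way to approximate that goal is to pin a user's reads to the master for a short window after they write. This is a sketch of the general idea, not the approach from the talk; `ReplicaRouter`, the five-second window, and the `'master'` / `'slave'` labels are all assumptions.

```python
import time


class ReplicaRouter:
    """Route a user's reads to the master shortly after they write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds   # assumed upper bound on slave lag
        self.clock = clock
        self._last_write = {}            # user id -> time of last write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def db_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return 'master'   # the writer sees their own change
        return 'slave'        # everyone else tolerates lag
```

In a multi-server setup the last-write timestamp would have to live somewhere shared, such as the session or memcached, rather than in process memory.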

Page 77: Scaling Django Dc09

Replication

What happens when you become write saturated?

Page 78: Scaling Django Dc09

Federation

Page 79: Scaling Django Dc09

Federation

• Start with Vertical Partitioning: split tables that aren’t joined across database servers

• Actually pretty easy

• Except not with Django

Page 80: Scaling Django Dc09

Federation

django.db.models.base

FAIL!

Page 81: Scaling Django Dc09

Federation

• At some point you’ll need to split a single table across databases (e.g., user table)

• Auto-increment PKs won’t work

• It’d be nice to have a UUIDField for PKs

• You can probably build this yourself
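The reason UUIDs help is that each database can mint keys without coordinating with the others. A small standard-library sketch of generating such keys and mapping them to a shard; `NUM_SHARDS`, `new_pk`, and `shard_for` are hypothetical names, and modulo sharding is just one possible scheme.

```python
import uuid

NUM_SHARDS = 4  # hypothetical shard count


def new_pk():
    # Random 128-bit id as 32 hex characters; no central counter needed.
    return uuid.uuid4().hex


def shard_for(pk, num_shards=NUM_SHARDS):
    # Map the id to a shard deterministically.
    return int(pk, 16) % num_shards
```

Note that plain modulo sharding forces data to move whenever the shard count changes, so a lookup table or consistent hashing is often used instead.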

Page 82: Scaling Django Dc09

Profiling, Monitoring & Measuring

Page 83: Scaling Django Dc09

Know your SQL

    >>> Article.objects.filter(pk=3).query.as_sql()
    ('SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article" WHERE "app_article"."id" = %s ', (3,))

Page 84: Scaling Django Dc09

Know your SQL

    >>> import sqlparse
    >>> def pp_query(qs):
    ...     t = qs.query.as_sql()
    ...     sql = t[0] % t[1]
    ...     print sqlparse.format(sql, reindent=True, keyword_case='upper')
    ...
    >>> pp_query(Article.objects.filter(pk=3))
    SELECT "app_article"."id", "app_article"."name", "app_article"."author_id"
    FROM "app_article"
    WHERE "app_article"."id" = 3

Page 85: Scaling Django Dc09

Know your SQL

    >>> from django.db import connection
    >>> connection.queries
    [{'time': '0.001', 'sql': u'SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article"'}]

Page 86: Scaling Django Dc09

Know your SQL

• It’d be nice if a lightweight stacktrace could be done in QuerySet.__init__

• Stick the result in connection.queries

• Now we know where the query originated
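The standard `traceback` module is enough to sketch what such a hook might record. Here `query_log` stands in for `connection.queries` and `record_query` for the hook; neither is Django API, and a real version would sit in the ORM rather than be called by hand.

```python
import traceback

query_log = []  # stand-in for connection.queries


def record_query(sql):
    """Log the SQL together with the code location that issued it."""
    stack = traceback.extract_stack()
    caller = stack[-2]   # stack[-1] is record_query itself
    query_log.append({
        'sql': sql,
        'origin': '%s:%d in %s' % (caller.filename, caller.lineno, caller.name),
    })


def load_articles():
    # Hypothetical view code that triggers a query.
    record_query('SELECT * FROM app_article')
```

Capturing the full stack on every query has a cost, so this belongs in a debug setting rather than in production.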

Page 87: Scaling Django Dc09

Measuring

Django Debug Toolbar

http://github.com/robhudson/django-debug-toolbar/

Page 88: Scaling Django Dc09

Monitoring

• Ganglia

• Munin

You can’t improve what you don’t measure.

Page 89: Scaling Django Dc09

Measuring & Monitoring

• Measure

• Server load, CPU usage, I/O

• Database QPS

• Memcache QPS, hit rate, evictions

• Queue lengths

• Anything else interesting
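As one concrete measurement, the memcached hit rate can be derived from the `get_hits` and `get_misses` counters that the server's `stats` command reports. A minimal sketch, assuming the counters have already been fetched into a dict (they typically arrive as strings); `hit_rate` is a made-up helper name.

```python
def hit_rate(stats):
    """Fraction of gets served from cache, from memcached stats counters."""
    hits = int(stats['get_hits'])
    misses = int(stats['get_misses'])
    total = hits + misses
    # Avoid dividing by zero on a freshly started server.
    return hits / total if total else 0.0
```

Graphing this value over time alongside evictions makes it easier to tell whether a falling hit rate comes from memory pressure or from a change in access patterns.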

Page 90: Scaling Django Dc09

All done... Questions?

Contact me at [email protected] or @mjmalone