David Cramer: Building to Scale

Post on 17-May-2015

Page 1: David Cramer: Building to scale

BUILDING TO SCALE

David Cramer
twitter.com/zeeg

Tuesday, February 26, 13

Page 2: David Cramer: Building to scale

The things we build will not and cannot last

Page 3: David Cramer: Building to scale

Who am I?

Page 7: David Cramer: Building to scale

What do we mean by scale?

Page 8: David Cramer: Building to scale

DISQUS
Massive traffic with a long tail

Sentry
Counters and event aggregation

tenXer
More stats than we can count

Page 9: David Cramer: Building to scale

Does one size fit all?

Page 10: David Cramer: Building to scale

Practical Storage

Page 11: David Cramer: Building to scale

Postgres is the foundation of DISQUS

Page 12: David Cramer: Building to scale

MySQL powers the tenXer graph store

Page 13: David Cramer: Building to scale

Sentry is built on SQL

Page 14: David Cramer: Building to scale

Databases are not the problem

Page 15: David Cramer: Building to scale

Compromise

Page 16: David Cramer: Building to scale

Scaling is about Predictability

Page 17: David Cramer: Building to scale

Augment SQL with [technology]

Page 19: David Cramer: Building to scale

Simple solutions using Redis
(I like Redis)

Page 20: David Cramer: Building to scale

Counters

Page 21: David Cramer: Building to scale

Counters are everywhere

Page 22: David Cramer: Building to scale

Counters in SQL

UPDATE table SET counter = counter + 1;
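
As a minimal sketch of the SQL approach, the snippet below uses Python's built-in sqlite3 module with a hypothetical `events` table and `times_seen` column (neither name comes from the talk). Each increment is a single UPDATE, so the database does the arithmetic itself, but every hit on the same row still has to wait its turn for that row.

```python
import sqlite3

# Hypothetical schema: one row per event with a running counter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, times_seen INTEGER NOT NULL)")
conn.execute("INSERT INTO events (id, times_seen) VALUES ('ad93a', 0)")


def incr_in_sql(event_id, amount=1):
    """Increment inside the database so Python never does read-modify-write."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE events SET times_seen = times_seen + ? WHERE id = ?",
            (amount, event_id),
        )


def times_seen(event_id):
    row = conn.execute(
        "SELECT times_seen FROM events WHERE id = ?", (event_id,)
    ).fetchone()
    return row[0]


for _ in range(3):
    incr_in_sql("ad93a")
```

Under heavy write load, each of those UPDATEs contends for the same row, which is the problem the Redis-backed counters on the following slides are meant to relieve.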

Page 23: David Cramer: Building to scale

Counters in Redis

INCR counter

>>> redis.incr('counter')

Page 24: David Cramer: Building to scale

Counters in Sentry

event ID 1 event ID 2 event ID 3

Redis INCR Redis INCR Redis INCR

SQL Update

Page 25: David Cramer: Building to scale

Counters in Sentry

‣ INCR event_id in Redis
‣ Queue buffer incr task
  ‣ 5 - 10s explicit delay
‣ Task does atomic GET event_id and DEL event_id (Redis pipeline)
‣ No-op if GET is not > 0
‣ One SQL UPDATE per unique event per delay
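
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Sentry's actual code: `InMemoryRedis` is a tiny stand-in so the example runs without a server, `task_queue` replaces the delayed task queue, and `sql_updates` records the UPDATE statements that would be issued.

```python
from collections import defaultdict


class InMemoryRedis:
    """Tiny stand-in for the few Redis commands used here (not a real client)."""

    def __init__(self):
        self.data = defaultdict(int)

    def incr(self, key):
        self.data[key] += 1
        return self.data[key]

    def get_and_delete(self, key):
        # Stands in for GET + DEL sent together in one atomic pipeline.
        return self.data.pop(key, 0)


redis = InMemoryRedis()
sql_updates = []   # records the UPDATEs that would be issued
task_queue = []    # stands in for the delayed task queue


def flush_counter(event_id):
    count = redis.get_and_delete(event_id)
    if count <= 0:
        return  # no-op: an earlier task already flushed this key
    sql_updates.append((event_id, count))


def on_event(event_id):
    redis.incr(event_id)
    # In production this task would be queued with a 5-10s delay.
    task_queue.append(event_id)


for event_id in ["ad93a", "ad93a", "d2ow3", "ad93a"]:
    on_event(event_id)

# Simulate the delay expiring: run every queued task in order.
while task_queue:
    flush_counter(task_queue.pop(0))
```

Four events produce four tasks but only two SQL updates; the other two tasks find nothing to flush and do no work.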

Page 26: David Cramer: Building to scale

Counters in Sentry (cont.)

Pros
‣ Solves database row lock contention
‣ Redis nodes are horizontally scalable
‣ Easy to implement

Cons
‣ Too many dummy (no-op) tasks

Page 27: David Cramer: Building to scale

Alternative Counters

event ID 1 event ID 2 event ID 3

Redis ZINCRBY Redis ZINCRBY Redis ZINCRBY

SQL Update

Page 28: David Cramer: Building to scale

Sorted Sets in Redis

> ZINCRBY events 1 ad93a
{ad93a: 1}

> ZINCRBY events 1 ad93a
{ad93a: 2}

> ZINCRBY events 1 d2ow3
{ad93a: 2, d2ow3: 1}

Page 29: David Cramer: Building to scale

Alternative Counters

‣ ZINCRBY events 1 event_id in Redis
‣ Cron buffer flush
  ‣ ZRANGE events to get pending updates
  ‣ Fire individual task per update
    ‣ Atomic ZSCORE events event_id and ZREM events event_id to get and flush count
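
A rough sketch of this flow, under the same caveats as any illustration: `InMemorySortedSet` is a hypothetical stand-in for a single Redis sorted set so the example runs without a server, and `flush_pending` plays the role of the cron job plus the per-update tasks.

```python
class InMemorySortedSet:
    """Stand-in for one Redis sorted set (member -> score); not a real client."""

    def __init__(self):
        self.scores = {}

    def zincrby(self, member, amount=1):
        self.scores[member] = self.scores.get(member, 0) + amount
        return self.scores[member]

    def zrange_all(self):
        # Like ZRANGE key 0 -1: members ordered by score, ties by member.
        return sorted(self.scores, key=lambda m: (self.scores[m], m))

    def zscore_and_zrem(self, member):
        # Stands in for ZSCORE + ZREM sent together in one atomic pipeline.
        return self.scores.pop(member, None)


events = InMemorySortedSet()
sql_updates = []  # records the UPDATEs that would be issued


def flush_pending():
    """What the cron job would do: one flush per member with a pending count."""
    for event_id in events.zrange_all():
        count = events.zscore_and_zrem(event_id)
        if count:  # None if another worker flushed it first
            sql_updates.append((event_id, count))


for event_id in ["ad93a", "ad93a", "d2ow3"]:
    events.zincrby(event_id)

flush_pending()
```

Because the set itself lists which events have pending counts, a flush is only attempted for events that actually need one.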

Page 30: David Cramer: Building to scale

Alternative Counters (cont.)

Pros
‣ Removes (most) no-op tasks
‣ Works without a complex queue due to no required delay on jobs

Cons
‣ Single Redis key stores all pending updates

Page 31: David Cramer: Building to scale

Activity Streams

Page 32: David Cramer: Building to scale

Streams are everywhere

Page 33: David Cramer: Building to scale

Streams in SQL

class Activity:
    SET_RESOLVED = 1
    SET_REGRESSION = 6

    TYPE = (
        (SET_RESOLVED, 'set_resolved'),
        (SET_REGRESSION, 'set_regression'),
    )

    event = ForeignKey(Event)
    type = IntegerField(choices=TYPE)
    user = ForeignKey(User, null=True)
    datetime = DateTimeField()
    data = JSONField(null=True)

Page 34: David Cramer: Building to scale

Streams in SQL (cont.)

>>> Activity(event, SET_RESOLVED, user, now)
"David marked this event as resolved."

>>> Activity(event, SET_REGRESSION, datetime=now)
"The system marked this event as a regression."

>>> Activity(type=DEPLOY_START, datetime=now)
"A deploy started."

>>> Activity(type=SET_RESOLVED, datetime=now)
"All events were marked as resolved"
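
One way to read the examples above is that the message is derived from which fields are set: no user means the system acted, no event means the action applied to everything. The sketch below illustrates that idea with a plain dataclass; the `describe` helper and its wording rules are assumptions for illustration, not code from the talk, and the deploy type is left out.

```python
from dataclasses import dataclass
from typing import Optional

SET_RESOLVED = 1
SET_REGRESSION = 6

# Hypothetical mapping from activity type to the phrase used in the message.
PHRASES = {SET_RESOLVED: "resolved", SET_REGRESSION: "a regression"}


@dataclass
class Activity:
    type: int
    event: Optional[str] = None   # None: applies to all events
    user: Optional[str] = None    # None: performed by the system


def describe(activity):
    phrase = PHRASES[activity.type]
    if activity.event is None:
        return "All events were marked as {}.".format(phrase)
    actor = activity.user or "The system"
    return "{} marked this event as {}.".format(actor, phrase)
```

For example, `describe(Activity(SET_RESOLVED, event="ad93a", user="David"))` yields "David marked this event as resolved."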

Page 35: David Cramer: Building to scale

Stream == View == Cache

Page 36: David Cramer: Building to scale

Views as a Cache

TIMELINE = []
MAX = 500

def on_event_creation(event):
    global TIMELINE
    TIMELINE.insert(0, event)
    TIMELINE = TIMELINE[:MAX]

def get_latest_events(num=100):
    return TIMELINE[:num]

Page 37: David Cramer: Building to scale

Views in Redis

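class Timeline(object):
    def __init__(self):
        self.db = Redis()

    def add(self, event):
        # score by timestamp; %f is microseconds (%m would be the month)
        score = float(event.date.strftime('%s.%f'))
        # redis-py 3.0+ takes a {member: score} mapping
        self.db.zadd('timeline', {event.id: score})

    def list(self, offset=0, limit=-1):
        # ZREVRANGE takes inclusive start/stop indexes, not a count
        stop = -1 if limit < 0 else offset + limit - 1
        return self.db.zrevrange('timeline', offset, stop)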

Page 38: David Cramer: Building to scale

Views in Redis (cont.)

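MAX_SIZE = 10000

def add(self, event):
    # score by timestamp; %f is microseconds (%m would be the month)
    score = float(event.date.strftime('%s.%f'))

    # add the event and trim the oldest entries to avoid
    # data bloat in a single key
    with self.db.pipeline() as pipe:
        pipe.zadd(self.key, {event.id: score})
        # rank 0 is the oldest entry; keep only the newest MAX_SIZE
        pipe.zremrangebyrank(self.key, 0, -MAX_SIZE - 1)
        pipe.execute()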

Page 39: David Cramer: Building to scale

Queuing

Page 40: David Cramer: Building to scale

Introducing Celery

Page 41: David Cramer: Building to scale

RabbitMQ or Redis

Page 42: David Cramer: Building to scale

Asynchronous Tasks

# Register the task
@task(exchange="event_creation")
def on_event_creation(event_id):
    counter.incr('events', event_id)

# Delay execution
on_event_creation.delay(event.id)

Page 43: David Cramer: Building to scale

Fanout

@task(exchange="counters")
def incr_counter(key, id=None):
    counter.incr(key, id)

@task(exchange="event_creation")
def on_event_creation(event_id):
    incr_counter.delay('events', event_id)
    incr_counter.delay('global')

# Delay execution
on_event_creation.delay(event.id)
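
To make the fanout pattern concrete without a broker, the sketch below replaces Celery with a hypothetical in-process `task` decorator whose `.delay()` simply appends to a deque. It is not Celery's API or implementation; it only shows how one queued task can enqueue several smaller ones that a worker picks up later.

```python
from collections import Counter, deque

queue = deque()      # stand-in for the broker (RabbitMQ or Redis)
counts = Counter()   # stand-in for the counter store


def task(func):
    """Hypothetical decorator: adds .delay(), which enqueues instead of calling."""
    def delay(*args):
        queue.append((func, args))
    func.delay = delay
    return func


@task
def incr_counter(key, id=None):
    counts[(key, id)] += 1


@task
def on_event_creation(event_id):
    # Fan out: one incoming event becomes two independent counter tasks.
    incr_counter.delay("events", event_id)
    incr_counter.delay("global")


def run_worker():
    """Drain the queue, including tasks enqueued by other tasks."""
    processed = 0
    while queue:
        func, args = queue.popleft()
        func(*args)
        processed += 1
    return processed


on_event_creation.delay("ad93a")
on_event_creation.delay("d2ow3")
processed = run_worker()
```

Two events result in six processed tasks: the two `on_event_creation` calls plus the four counter increments they fan out to.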

Page 44: David Cramer: Building to scale

Object Caching

Page 45: David Cramer: Building to scale

Object Cache Prerequisites

‣ Your database can't handle the read-load

‣ Your data changes infrequently

‣ You can handle slightly worse performance

Page 46: David Cramer: Building to scale

Distributing Load with Memcache

Memcache 1     Memcache 2     Memcache 3

Event ID 01    Event ID 02    Event ID 03
Event ID 04    Event ID 05    Event ID 06
Event ID 07    Event ID 08    Event ID 09
Event ID 10    Event ID 11    Event ID 12
Event ID 13    Event ID 14    Event ID 15
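
The slide does not say how IDs are assigned to servers; a plain modulo over the ID happens to reproduce the layout shown, so the sketch below assumes that. Real memcache clients usually hash the key string rather than use the numeric ID directly, and `server_for` is a hypothetical helper name.

```python
SERVERS = ["Memcache 1", "Memcache 2", "Memcache 3"]


def server_for(event_id):
    # IDs start at 1, so shift by one before taking the remainder.
    return SERVERS[(event_id - 1) % len(SERVERS)]


# Rebuild the layout from the slide for event IDs 1 through 15.
layout = {name: [] for name in SERVERS}
for event_id in range(1, 16):
    layout[server_for(event_id)].append(event_id)
```

One caveat with plain modulo: changing the number of servers changes the remainder for most keys, so most cached objects would be looked up on a different server afterwards.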

Page 47: David Cramer: Building to scale

Querying the Object Cache

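def make_key(model, id):
    return '{}:{}'.format(model.__name__, id)

def get_by_ids(model, id_list):
    # map each cache key back to the id it represents
    key_to_id = {make_key(model, id): id for id in id_list}

    # only keys found in the cache come back from get_multi
    cached = cache.get_multi(key_to_id.keys())
    res = {key_to_id[key]: value for key, value in cached.items()}

    # ids missing from the cache are fetched from the database
    pending = set(id_list) - set(res)
    if pending:
        mres = model.objects.in_bulk(pending)
        cache.set_multi({make_key(model, o.id): o for o in mres.values()})
        res.update(mres)

    return res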

Page 48: David Cramer: Building to scale

Pushing State

def save(self):
    cache.set(make_key(type(self), self.id), self)

def delete(self):
    cache.delete(make_key(type(self), self.id))

Page 49: David Cramer: Building to scale

Redis for Persistence

Redis 1        Redis 2        Redis 3

Event ID 01    Event ID 02    Event ID 03
Event ID 04    Event ID 05    Event ID 06
Event ID 07    Event ID 08    Event ID 09
Event ID 10    Event ID 11    Event ID 12
Event ID 13    Event ID 14    Event ID 15

Page 50: David Cramer: Building to scale

Routing with Nydus

# create a cluster of Redis connections which
# partition reads/writes by (hash(key) % size)

from nydus.db import create_cluster

redis = create_cluster({
    'engine': 'nydus.db.backends.redis.Redis',
    'router': 'nydus.db...redis.PartitionRouter',
    # one host entry per Redis database, keyed by index
    'hosts': {n: {'db': n} for n in xrange(10)},
})

github.com/disqus/nydus

Page 51: David Cramer: Building to scale

Planning for the Future

Page 52: David Cramer: Building to scale

One of the largest problems for Disqus is network-wide moderation

Page 53: David Cramer: Building to scale

Be Mindful of Features

Page 54: David Cramer: Building to scale

Sentry's Team Dashboard

‣ Data limited to a single team
‣ Simple views which could be materialized
‣ Only entry point for "data for team"

Page 55: David Cramer: Building to scale

Sentry's Stream View

‣ Data limited to a single project
‣ Each project could map to a different DB

Page 56: David Cramer: Building to scale

Preallocate Shards

Page 57: David Cramer: Building to scale

redis-1

DB0 DB1 DB2 DB3 DB4
DB5 DB6 DB7 DB8 DB9

Page 58: David Cramer: Building to scale

redis-1

DB0 DB1 DB2 DB3 DB4

redis-2

DB5 DB6 DB7 DB8 DB9

When a physical machine becomes overloaded, migrate a chunk of shards to another machine.
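
The idea can be sketched as two separate mappings: keys map to a fixed number of logical shards, and shards map to machines. Only the second mapping changes during a migration. The names `shard_for`, `host_for` and `migrate` are hypothetical, and crc32 is an assumed choice of hash, picked because it is stable across processes (Python's built-in `hash` of a string is not).

```python
import zlib

NUM_SHARDS = 10  # fixed up front; never changes

# Logical shard -> physical host. Initially everything lives on redis-1.
shard_hosts = {shard: "redis-1" for shard in range(NUM_SHARDS)}


def shard_for(key):
    """Key -> logical shard. Depends only on the key and NUM_SHARDS."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def host_for(key):
    """Key -> machine currently holding that key's shard."""
    return shard_hosts[shard_for(key)]


def migrate(shards, new_host):
    """Point a chunk of shards at another machine (the data copy is not shown)."""
    for shard in shards:
        shard_hosts[shard] = new_host


before = shard_for("event:ad93a")
migrate(range(5, 10), "redis-2")
after = shard_for("event:ad93a")
```

Since the shard count never changes, a key always lands in the same shard; moving DB5 through DB9 to redis-2 only changes which machine answers for those shards.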

Page 59: David Cramer: Building to scale

Takeaways

Page 60: David Cramer: Building to scale

Enhance your database
Don't replace it

Page 61: David Cramer: Building to scale

Queue Everything

Page 62: David Cramer: Building to scale

Learn to say no
(to features)

Page 63: David Cramer: Building to scale

Complex problems do not require complex solutions

Page 64: David Cramer: Building to scale

QUESTIONS?
