David Cramer: Building to Scale
TRANSCRIPT
BUILDING TO SCALE
David Cramer (twitter.com/zeeg)
Tuesday, February 26, 13
The things we build will not and cannot last
Who am I?
What do we mean by scale?
DISQUS: Massive traffic with a long tail
Sentry: Counters and event aggregation
tenXer: More stats than we can count
Does one size fit all?
Practical Storage
Postgres is the foundation of DISQUS
MySQL powers the tenXer graph store
Sentry is built on SQL
Databases are not the problem
Compromise
Scaling is about Predictability
Augment SQL with [technology]
Simple solutions using Redis (I like Redis)
Counters
Counters are everywhere
Counters in SQL

    UPDATE table SET counter = counter + 1;
Counters in Redis

    > INCR counter
    >>> redis.incr('counter')
Counters in Sentry

    event ID 1 → Redis INCR
    event ID 2 → Redis INCR
    event ID 3 → Redis INCR
                    ↓
               SQL UPDATE
Counters in Sentry

‣ INCR event_id in Redis
‣ Queue buffer incr task
‣ 5 - 10s explicit delay
‣ Task does atomic GET event_id and DEL event_id (Redis pipeline)
‣ No-op if GET is not > 0
‣ One SQL UPDATE per unique event per delay
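The steps above can be sketched in a few lines. This is illustrative, not Sentry's actual code: `FakeRedis` is an in-memory stand-in for a Redis client, `get_and_delete` stands in for the pipelined (atomic) GET + DEL, and `flush_event` plays the role of the queued buffer task.

```python
class FakeRedis:
    """In-memory stand-in for a Redis client (illustrative only)."""
    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def get_and_delete(self, key):
        # stands in for a pipelined GET + DEL, which Redis runs atomically
        return self.data.pop(key, 0)


def flush_event(redis, sql_counts, event_id):
    """The delayed buffer task: one SQL UPDATE per unique event per delay."""
    count = redis.get_and_delete('events:%s' % event_id)
    if count <= 0:
        return  # no-op task: counter was already flushed
    # stands in for: UPDATE table SET counter = counter + %s WHERE id = %s
    sql_counts[event_id] = sql_counts.get(event_id, 0) + count


redis = FakeRedis()
sql_counts = {}

# three events arrive for the same ID before the delay expires
for _ in range(3):
    redis.incr('events:ad93a')

flush_event(redis, sql_counts, 'ad93a')  # one UPDATE applying +3
flush_event(redis, sql_counts, 'ad93a')  # nothing pending: no-op
```

Three increments collapse into a single database write; a second queued task for the same ID becomes the no-op case the next slide complains about.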
Counters in Sentry (cont.)

Pros
‣ Solves database row lock contention
‣ Redis nodes are horizontally scalable
‣ Easy to implement

Cons
‣ Too many dummy (no-op) tasks
Alternative Counters

    event ID 1 → Redis ZINCRBY
    event ID 2 → Redis ZINCRBY
    event ID 3 → Redis ZINCRBY
                    ↓
               SQL UPDATE
Sorted Sets in Redis

    > ZINCRBY events 1 ad93a
    {ad93a: 1}
    > ZINCRBY events 1 ad93a
    {ad93a: 2}
    > ZINCRBY events 1 d2ow3
    {ad93a: 2, d2ow3: 1}
Alternative Counters

‣ ZINCRBY events event_id in Redis
‣ Cron buffer flush
‣ ZRANGE events to get pending updates
‣ Fire individual task per update
‣ Atomic ZSCORE events event_id and ZREM events event_id to get and flush count
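The cron flush above can be sketched the same way. `FakeZSet` is an illustrative in-memory stand-in for the sorted-set commands named on the slide; `zscore_and_zrem` stands in for the atomic pipelined ZSCORE + ZREM.

```python
class FakeZSet:
    """In-memory stand-in for a Redis sorted set (illustrative only)."""
    def __init__(self):
        self.members = {}

    def zincrby(self, member, amount=1):
        self.members[member] = self.members.get(member, 0) + amount

    def zrange(self):
        return list(self.members)  # all pending members

    def zscore_and_zrem(self, member):
        # stands in for an atomic ZSCORE + ZREM pipeline
        return self.members.pop(member, 0)


def flush_pending(zset, sql_counts):
    """The cron job: read pending members, fire one update per member."""
    for event_id in zset.zrange():
        count = zset.zscore_and_zrem(event_id)
        if count > 0:
            sql_counts[event_id] = sql_counts.get(event_id, 0) + count


events = FakeZSet()
events.zincrby('ad93a')
events.zincrby('ad93a')
events.zincrby('d2ow3')

sql_counts = {}
flush_pending(events, sql_counts)
```

Because the cron only ever sees members that actually have pending counts, the dummy-task problem of the first design mostly disappears.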
Alternative Counters (cont.)

Pros
‣ Removes (most) no-op tasks
‣ Works without a complex queue due to no required delay on jobs

Cons
‣ Single Redis key stores all pending updates
Activity Streams
Streams are everywhere
Streams in SQL

    class Activity:
        SET_RESOLVED = 1
        SET_REGRESSION = 6

        TYPE = (
            (SET_RESOLVED, 'set_resolved'),
            (SET_REGRESSION, 'set_regression'),
        )

        event = ForeignKey(Event)
        type = IntegerField(choices=TYPE)
        user = ForeignKey(User, null=True)
        datetime = DateTimeField()
        data = JSONField(null=True)
Streams in SQL (cont.)

    >>> Activity(event, SET_RESOLVED, user, now)
    "David marked this event as resolved."

    >>> Activity(event, SET_REGRESSION, datetime=now)
    "The system marked this event as a regression."

    >>> Activity(type=DEPLOY_START, datetime=now)
    "A deploy started."

    >>> Activity(type=SET_RESOLVED, datetime=now)
    "All events were marked as resolved."
Stream == View == Cache
Views as a Cache

    TIMELINE = []
    MAX = 500

    def on_event_creation(event):
        global TIMELINE
        TIMELINE.insert(0, event)
        TIMELINE = TIMELINE[:MAX]

    def get_latest_events(num=100):
        return TIMELINE[:num]
Views in Redis

    class Timeline(object):
        def __init__(self):
            self.db = Redis()

        def add(self, event):
            score = float(event.date.strftime('%s.%m'))
            self.db.zadd('timeline', event.id, score)

        def list(self, offset=0, limit=-1):
            return self.db.zrevrange('timeline', offset, limit)
Views in Redis (cont.)

    MAX_SIZE = 10000

    def add(self, event):
        score = float(event.date.strftime('%s.%m'))

        # add the member and trim the sorted set to avoid
        # data bloat in a single key
        with self.db.pipeline() as pipe:
            pipe.zadd(self.key, event.id, score)
            pipe.zremrangebyrank(self.key, MAX_SIZE, -1)
Queuing
Introducing Celery
RabbitMQ or Redis
Asynchronous Tasks

    # Register the task
    @task(exchange="event_creation")
    def on_event_creation(event_id):
        counter.incr('events', event_id)

    # Delay execution
    on_event_creation.delay(event.id)
Fanout

    @task(exchange="counters")
    def incr_counter(key, id=None):
        counter.incr(key, id)

    @task(exchange="event_creation")
    def on_event_creation(event_id):
        incr_counter.delay('events', event_id)
        incr_counter.delay('global')

    # Delay execution
    on_event_creation.delay(event.id)
Object Caching
Object Cache Prerequisites
‣ Your database can't handle the read-load
‣ Your data changes infrequently
‣ You can handle slightly worse performance
Distributing Load with Memcache
Memcache 1: Event IDs 01, 04, 07, 10, 13
Memcache 2: Event IDs 02, 05, 08, 11, 14
Memcache 3: Event IDs 03, 06, 09, 12, 15
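The layout in that diagram can be sketched as a simple modulo partition, assuming keys are spread by `hash(key) % number_of_nodes`; the node names are illustrative.

```python
# Partition cache keys across nodes: node = hash(key) % number_of_nodes.
NODES = ['memcache-1', 'memcache-2', 'memcache-3']

def node_for(event_id):
    # event IDs are integers, so (id - 1) % 3 reproduces the diagram's
    # layout exactly; a real client would hash the string cache key
    return NODES[(event_id - 1) % len(NODES)]

# distribute event IDs 1..15 across the three nodes
assignment = {}
for event_id in range(1, 16):
    assignment.setdefault(node_for(event_id), []).append(event_id)
```

Each node holds an independent slice of the keyspace, so read load scales with the number of nodes, at the cost of one extra network hop per lookup.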
Querying the Object Cache
    def make_key(model, id):
        return '{}:{}'.format(model.__name__, id)

    def get_by_ids(model, id_list):
        keys = [make_key(model, id) for id in id_list]

        res = cache.get_multi(keys)

        pending = set()
        for id, value in res.iteritems():
            if value is None:
                pending.add(id)

        if pending:
            mres = model.objects.in_bulk(pending)
            cache.set_multi({make_key(model, id): obj
                             for id, obj in mres.iteritems()})
            res.update(mres)

        return res
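The read-through pattern in `get_by_ids` can be shown self-contained, with plain dicts standing in for memcache and the SQL table; all names here are illustrative, and `DB_HITS` exists only to show which lookups reach the database.

```python
CACHE = {}                                      # stands in for memcache
DB = {1: 'event-1', 2: 'event-2', 3: 'event-3'}  # stands in for the SQL table
DB_HITS = []                                    # records each trip to the DB

def get_by_ids(id_list):
    # first, try the cache for every requested ID
    res = {id: CACHE.get(id) for id in id_list}

    # anything the cache missed is fetched from the database in bulk
    pending = {id for id, value in res.items() if value is None}
    if pending:
        DB_HITS.append(sorted(pending))
        fetched = {id: DB[id] for id in pending}  # stands in for in_bulk()
        CACHE.update(fetched)                     # warm the cache
        res.update(fetched)
    return res

get_by_ids([1, 2])     # cold cache: both IDs hit the database
get_by_ids([1, 2, 3])  # warm cache: only ID 3 hits the database
```

The second call only pays for the one key the first call did not warm, which is exactly the read-load relief the prerequisites slide asks for.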
Pushing State
    def save(self):
        cache.set(make_key(type(self), self.id), self)

    def delete(self):
        cache.delete(make_key(type(self), self.id))
Redis for Persistence
Redis 1: Event IDs 01, 04, 07, 10, 13
Redis 2: Event IDs 02, 05, 08, 11, 14
Redis 3: Event IDs 03, 06, 09, 12, 15
Routing with Nydus
    # create a cluster of Redis connections which
    # partition reads/writes by (hash(key) % size)
    from nydus.db import create_cluster

    redis = create_cluster({
        'engine': 'nydus.db.backends.redis.Redis',
        'router': 'nydus.db...redis.PartitionRouter',
        'hosts': {
            n: {'db': n} for n in xrange(10)
        },
    })

github.com/disqus/nydus
Planning for the Future
One of the largest problems for Disqus is network-wide moderation
Be Mindful of Features
Sentry's Team Dashboard
‣ Data limited to a single team
‣ Simple views which could be materialized
‣ Only entry point for "data for team"
Sentry's Stream View
‣ Data limited to a single project
‣ Each project could map to a different DB
Preallocate Shards
redis-1: DB0 DB1 DB2 DB3 DB4 DB5 DB6 DB7 DB8 DB9

redis-1: DB0 DB1 DB2 DB3 DB4
redis-2: DB5 DB6 DB7 DB8 DB9

When a physical machine becomes overloaded, migrate a chunk of shards to another machine.
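The migration in the diagram can be sketched as follows, assuming ten preallocated logical shards; `shard_map` and the machine names are illustrative. The point is that the key→shard hash never changes, so migrating means only remapping shards to machines, never resharding keys.

```python
SHARD_COUNT = 10  # preallocate more logical shards than machines

# initially every logical shard (Redis DB) lives on one physical box
shard_map = {db: 'redis-1' for db in range(SHARD_COUNT)}

def shard_for(key):
    # stable: this mapping never changes, even after migration
    # (integer keys here so hash() is deterministic for the example)
    return hash(key) % SHARD_COUNT

def machine_for(key):
    return shard_map[shard_for(key)]

# redis-1 becomes overloaded: move DB5..DB9 to a new machine by
# updating only the shard -> machine map
for db in range(5, SHARD_COUNT):
    shard_map[db] = 'redis-2'
```

Clients keep hashing keys the same way; only the small shard-to-machine table changes, which is why preallocating shards up front makes the later migration cheap.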
Takeaways
Enhance your database. Don't replace it.
Queue Everything
Learn to say no (to features)
Complex problems do not require complex solutions
QUESTIONS?