Scaling MongoDB for Real-Time Analytics


SHORTCUTS AROUND THE MISTAKES WE’VE MADE SCALING MONGODB

David Tollmyr, Platform lead

@effata
slideshare.net/tollmyr

What we do: we want to make digital advertising an amazing user experience. There is more to metrics than clicks.

Ads ➔ Data. Assembling sessions: an exposure, a stream of pings and events is combined into a single session.

Information ➔ Crunching: many sessions are crunched down into Metrics and Reports.

What we do: track ads, make pretty reports.

That doesn’t sound so hard? We don’t know when sessions end, there’s a lot of data, and it’s all done in (close to) real time.

Numbers: 200 GB of logs and 100 million data points per day, ~300 metrics per data point = 6000 updates/s at peak.

How we use(d) MongoDB: as “virtual memory” to offload data while we wait for sessions to finish, as short-term storage (<48 hours) for batch jobs, replays and manual analysis, and as metrics storage.

Why we use MongoDB: schemalessness makes things so much easier, since the data we collect changes as we come up with new ideas; sharding makes it possible to scale writes; secondary indexes and a rich query language are great features (for the metrics store); it’s just… nice.
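As a rough illustration of those points, a minimal mongo shell sketch; the collection name, field names and values are made up for the example, not taken from the talk:

// documents in the same collection can carry different fields,
// so adding a new metric needs no migration
db.metrics.insert({ad_id: "ad-1", day: "2011-11-24", impressions: 10, clicks: 1})
db.metrics.insert({ad_id: "ad-2", day: "2011-11-24", impressions: 7, engagement_seconds: 42})

// a secondary index plus the query language covers the typical report lookup
db.metrics.ensureIndex({ad_id: 1, day: 1})
db.metrics.find({ad_id: "ad-1", day: {$gte: "2011-11-01", $lte: "2011-11-30"}})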

Btw: we use JRuby, it’s awesome.

STANDING ON THE SHOULDERS OF GIANTS WITH JRUBY

slideshare.net/iconara

A story in 9 iterations

1st iteration: secondary indexes and updates

One document per session, updated as new data comes along. Outcome: 1000% write lock.

#1 Everything is about working around the GLOBAL WRITE LOCK

MongoDB 1.8.1

db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)

db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)

MongoDB 2.0.0

db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)

db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)

2nd iteration: using scans for two-step assembling

Instead of updating, save each fragment, then scan over _id to assemble sessions.

Outcome: not as much lock, but still not great performance. We also realised we couldn’t remove data fast enough.
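The slides don’t show the key scheme, but the two-step idea can be sketched roughly like this, assuming a string _id that starts with the session id so all fragments of one session sort together (names and key format are illustrative):

// step 1: write every fragment as its own small document, never update
db.fragments.insert({_id: "session-42:1322130000001:a", type: "ping"})
db.fragments.insert({_id: "session-42:1322130000452:b", type: "event"})

// step 2: assemble a session later with a range scan over _id
// (";" is the character right after ":", so the range covers every "session-42:..." key)
db.fragments.find({_id: {$gte: "session-42:", $lt: "session-42;"}}).sort({_id: 1})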

#2 Everything is about working around the GLOBAL WRITE LOCK

#3 Give a lot of thought to your PRIMARY KEY

3rd iteration: partitioning

Partitioning the data by writing to a new collection every hour. Outcome: complicated, fragmented database.
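A minimal sketch of that kind of time-based partitioning in the shell; the naming convention is an assumption, not something shown in the slides:

// write to a collection named after the current hour
var coll = db.getCollection("fragments_2011112413")
coll.insert({session: "session-42", type: "ping"})

// expiring old data becomes a cheap drop() instead of millions of remove()s,
// but every reader now has to know which collections to look in
db.getCollection("fragments_2011112313").drop()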

#4 Make sure you can REMOVE OLD DATA

4th iteration: sharding

To get around the global write lock and get higher write performance we moved to a sharded cluster. Outcome: higher write performance, lots of problems, lots of ops time spent debugging.
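The slides don’t show the cluster setup, but turning on sharding looks roughly like this when run against a mongos; the database name, collection name and shard key are assumptions for the example:

db.adminCommand({enableSharding: "analytics"})
// shard key choice is what tips #3 and #14 are about: a monotonically
// increasing key funnels all inserts into a single shard
db.adminCommand({shardCollection: "analytics.fragments", key: {_id: 1}})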

#5 Everything is about working around the GLOBAL WRITE LOCK

#6 SHARDING IS NOT A SILVER BULLET, and it’s complex; if you can, avoid it

#7 IT WILL FAIL, design for it

5th iteration: moving things to separate clusters

We saw very different loads on the shards and realised we had databases with very different usage patterns, some of which made autosharding not work. We moved these off the cluster. Outcome: a more balanced and stable cluster.

#8 Everything is about working around the GLOBAL WRITE LOCK

#9 ONE DATABASE with one usage pattern PER CLUSTER

#10 MONITOR EVERYTHING, look at your health graphs daily
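mongostat and the health graphs do most of this, but a few of the underlying numbers can be pulled straight from the shell; which fields to graph and alert on is of course a judgement call:

// lock and operation counters behind most of the tips above
var status = db.serverStatus()
printjson(status.globalLock)
printjson(status.opcounters)

// data and index sizes, to keep an eye on how much of the working set fits in RAM
printjson(db.stats())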

6th iteration: monster machines

We got new problems removing data and needed some room to breathe and think. Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese).


#11 Don’t try to scale up, SCALE OUT

#12 When you’re out of ideas, CALL THE EXPERTS

7th iteration: partitioning (again) and pre-chunking

We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot. Outcome: no more problems removing data.
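Pre-creating chunks can be done with the split and moveChunk commands against mongos before any writes arrive; a rough sketch in which the daily database name, the split points and the shard name are all made up for the example:

// split the empty collection into chunks along known _id boundaries
db.adminCommand({split: "analytics_20111124.fragments", middle: {_id: "session-3000000"}})
db.adminCommand({split: "analytics_20111124.fragments", middle: {_id: "session-6000000"}})

// optionally place the pre-made chunks on specific shards up front,
// so the balancer doesn't have to migrate data during peak writes
db.adminCommand({moveChunk: "analytics_20111124.fragments", find: {_id: "session-3000000"}, to: "shard0001"})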

#13 Smaller objects mean a smaller database, and a smaller database means LESS RAM NEEDED
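One common way to shrink documents (not necessarily what was done here) is shortening field names, since BSON stores every field name inside every single document:

// every document repeats its field names, which adds up over 100 million data points a day
db.metrics.insert({impressions: 10, clicks: 1, engagement_seconds: 42})

// the same data with one-letter keys is noticeably smaller on disk and in RAM
db.metrics.insert({i: 10, c: 1, e: 42})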

#14 Give a lot of thought to your PRIMARY KEY

#15 Everything is about working around the GLOBAL WRITE LOCK

8th iteration: realize when you have the wrong tool

Transient data might not need all the bells and whistles.

Outcome: Redis gave us 100x performance in the assembling step

#16 When all you have is a HAMMER, everything looks like a NAIL

9th iteration: rinse and repeat

We now have the same scaling issues later in the chain.

Outcome: an upcoming rewrite to make writes/updates more effective. Redis was actually slower for this step.

#17 Everything is about working around the GLOBAL WRITE LOCK

Thank you

@effata
slideshare.net/tollmyr

engineering.burtcorp.com
burtcorp.com

richmetrics.com

Since we’ve got time…

Tips: EC2

You have three copies of your data, do you really need EBS? Instance store disks are included in the price and they have predictable performance. m1.xlarge comes with 1.7 TB of storage.

Tips: avoid bulk inserts

Very dangerous if there’s a possibility of duplicate key errors

It’s not fixed in 2.0 even though the driver has a flag for it.
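The failure mode is easy to reproduce in the shell: an ordered batch insert stops at the first duplicate key error, so everything after it is silently dropped (exact behaviour varies with version and driver flags):

// a batch with a duplicate _id in the middle
db.things.insert([{_id: 1}, {_id: 1}, {_id: 2}])

// the batch aborts at the duplicate, so {_id: 2} never makes it in
db.things.count()   // 1, not 3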

Tips: safe mode

Run every Nth insert in safe mode. This will give you warnings when bad things happen, like failovers.
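In the shell, “safe mode” amounts to asking for getLastError after a write; a rough sketch of the every-Nth-write idea, with the counter and threshold invented for the example:

var written = 0

function saveFragment(doc) {
  db.fragments.insert(doc)
  written += 1
  if (written % 1000 === 0) {
    // blocks until the server acknowledges the write and reports any error,
    // which is how you notice things like failovers
    var err = db.getLastErrorObj()
    if (err.err != null) print("write problem: " + err.err)
  }
}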
