Page 1: An Evolving MongoDB Implementation Vinayak Javaly vinayak@everyscreenmedia.com November 15, 2011

An Evolving MongoDB Implementation

Vinayak Javaly
vinayak@everyscreenmedia.com
November 15, 2011


Agenda
- Background
- System Requirements
- Platform
- Architecture Evolution
- Why MongoDB?
- Recommendations
- Wrap-up


Background
- Me
  - Worked at IBM, Merrill Lynch, and several startups
  - 20+ years of DB experience with Sybase, Oracle, MySQL, MongoDB
  - Adjunct professor at New York Institute of Technology
- EveryScreen Media (everyscreenmedia.com)
  - 10-person startup in SoHo
  - Developed a real-time mobile advertising technology platform
  - Building a data science practice
  - We're hiring!


System Requirements
- Goal: process billions of requests per day
  - Currently: 100 million per day
- Response time: <100 milliseconds per request
- Near real-time reporting & analytics
- Generate data sets (weekly, monthly) for data science analysis
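To put these targets in per-second terms, the arithmetic below converts daily volumes into average request rates. It assumes traffic is spread evenly over the day, which real traffic rarely is, so peak rates would be higher. One billion per day is used as the low end of "billions"; the function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(requests_per_day: int) -> float:
    """Average requests per second, assuming perfectly even traffic."""
    return requests_per_day / SECONDS_PER_DAY


current = average_rate_per_second(100_000_000)   # current volume from the slide
goal = average_rate_per_second(1_000_000_000)    # assumed low end of "billions"

print(f"current: ~{current:,.0f} requests/second")  # current: ~1,157 requests/second
print(f"goal:    ~{goal:,.0f} requests/second")     # goal:    ~11,574 requests/second
```

Even the current volume averages over a thousand requests per second, each of which has to be answered within the 100 millisecond budget.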


Platform
- Amazon EC2 servers & EBS volumes
  - Moving a subset to hosted servers & RAID (near future)
- C++: Pion-net, Boost libraries
- PHP: Lithium
- MongoDB 1.8
- Redis 2.4 (near future)
- Queuing (TBD)


Architecture Evolution
- Phase 1 (Spring): Islands of data
  - 4 standalone mongod servers, each with a 500 GB EBS volume
- Phase 2 (Summer): Sharded replica sets across different availability zones
  - 3 mongod config servers
  - 10 mongod shard servers (5 replica sets), each with a 1 TB EBS volume
  - 15 application servers running C++ bidders, each with a mongos router


Architecture Evolution
- Major system re-design due to increasing traffic, better understanding of data usage, getting smarter, etc.
  - Separated read and write traffic
    - At high write volumes (>1000 inserts/second), reads are effectively stalled due to IO saturation in our setup
  - Converted some collections to capped collections
  - Implemented ongoing rollups
- Current (Fall): 2 MongoDB replica sets
  - One for real-time data written into capped collections
  - One for all other data, including real-time rollups
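The combination of capped collections and ongoing rollups can be sketched in plain Python. This is an in-memory stand-in, not MongoDB code: a bounded `deque` plays the role of the capped collection (the oldest entries are discarded as new ones arrive), and a `Counter` plays the role of the rollup collection. The event fields (`campaign`, `ts`) and the hourly granularity are assumptions for illustration; the slides do not describe the actual schema.

```python
from collections import Counter, deque
from datetime import datetime, timezone

# Stand-in for a capped collection: keeps only the newest 3 raw events.
raw_events = deque(maxlen=3)

# Stand-in for the rollup collection: (campaign, hour) -> request count.
hourly_rollup = Counter()


def record_event(campaign: str, ts: datetime) -> None:
    """Store the raw event and update the running hourly rollup."""
    raw_events.append({"campaign": campaign, "ts": ts})
    hour_key = ts.strftime("%Y%m%d%H")
    hourly_rollup[(campaign, hour_key)] += 1


# Four events in the 15:00 hour, then one in the 16:00 hour.
for minute in (1, 2, 3, 4):
    record_event("c1", datetime(2011, 11, 14, 15, minute, tzinfo=timezone.utc))
record_event("c1", datetime(2011, 11, 14, 16, 0, tzinfo=timezone.utc))

print(len(raw_events))                      # 3: raw data stays bounded
print(hourly_rollup[("c1", "2011111415")])  # 4: the rollup keeps the full count
```

Because the rollup is updated as each event arrives, reports can read the small rollup data while raw events age out of the bounded store.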


Architecture Evolution
- Future (Winter?)
  - Move MongoDB servers to a dedicated hosted environment
    - Set up hardware RAID 1+0 for MongoDB data
  - C++ application servers to remain in EC2 for flexible auto-scaling
  - Implement Redis (with tiered replication) for fast lookups
  - Implement queuing (TBD)


Architecture Diagram (Next)

[Diagram not reproduced. Labeled components: RTB Network A; RTB Network B; Load balancer; Bidder (x2); Redis DB [slave, secondary] (x2); Queue (sender); Queue (receiver); Redis DB [master]; Redis DB [slave, primary]; MongoDB [Campaign master]; MongoDB [Campaign slave]; MongoDB [Raw master]; MongoDB [Raw slave]; MongoDB [DataScience standalone]. Region labels: Amazon EC2; Hosted environment.]


Why MongoDB?
- Scalability
- Flexible schema
  - Though, highly recommend a controlled, structured schema
- Fire & forget writes
- Update-in-place: $inc, upsert
- Looking forward to "expiration" functionality
- MongoDB community
- 10gen Open Office Hours
  - Great resource
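As a rough illustration of update-in-place, the sketch below mimics the effect of a `$inc` update issued with upsert enabled: increment the counters if the document exists, otherwise create it first. It runs against a plain dictionary rather than a MongoDB server, and the helper `inc_upsert`, the document key, and the field names are hypothetical.

```python
from typing import Dict


def inc_upsert(store: Dict[str, dict], doc_id: str, increments: Dict[str, int]) -> dict:
    """Mimic an update of {"$inc": increments} with upsert enabled, in memory."""
    # Upsert: create the document if no match exists.
    doc = store.setdefault(doc_id, {"_id": doc_id})
    # $inc: add to each field, treating a missing field as 0.
    for field, amount in increments.items():
        doc[field] = doc.get(field, 0) + amount
    return doc


stats: Dict[str, dict] = {}
inc_upsert(stats, "campaign-42", {"bids": 1})             # creates the document
inc_upsert(stats, "campaign-42", {"bids": 1, "wins": 1})  # updates it in place
print(stats["campaign-42"])  # {'_id': 'campaign-42', 'bids': 2, 'wins': 1}
```

With a real driver the same idea is a single update call with the upsert option set, so the application does not need a separate read before each write.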


Recommendations
- Spend a lot of time data modeling
  - Embed related data, fetch data in a single query, etc.
- Separate read and write traffic on different servers
- Prepopulate "growing" fields, e.g. counts by hour
- Store dates as strings: speeds up range queries

      "created" : {
          "full" : ISODate("2011-11-14T15:27:01Z"),
          "simple" : "20111114152701"
      }

- Get "50 Tips & Tricks for MongoDB Developers" by Kristina Chodorow
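The two schema tips above (prepopulated counters and string dates) can be illustrated as follows. The document layout, the `new_daily_stats` helper, and every field name other than `created.full` and `created.simple` are assumptions for illustration. The example also checks the property that makes the string form usable in range queries: fixed-width `YYYYMMDDHHMMSS` strings sort in the same order as the timestamps they encode.

```python
from datetime import datetime, timezone


def simple_date(ts: datetime) -> str:
    """Fixed-width YYYYMMDDHHMMSS string, matching the "simple" field."""
    return ts.strftime("%Y%m%d%H%M%S")


def new_daily_stats(campaign: str, created: datetime) -> dict:
    """Build a document with all 24 hourly counters already present."""
    return {
        "campaign": campaign,
        "created": {"full": created, "simple": simple_date(created)},
        # Prepopulated, so later increments update existing fields
        # instead of adding new ones.
        "hourly": {f"{hour:02d}": 0 for hour in range(24)},
    }


created = datetime(2011, 11, 14, 15, 27, 1, tzinfo=timezone.utc)
doc = new_daily_stats("campaign-42", created)
doc["hourly"]["15"] += 1  # in-place increment of an existing counter

earlier = simple_date(datetime(2011, 11, 14, 9, 0, 0, tzinfo=timezone.utc))
print(doc["created"]["simple"])            # 20111114152701
print(earlier < doc["created"]["simple"])  # True: string order matches time order
```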


Wrap-up
- We're hiring!
  - everyscreenmedia.com/careers.php
- Questions?
