SF MongoDB User Group: Using MongoDB for IGN's Social Platform

Using MongoDB for IGN’s Social Platform, SF Bay Area MongoDB User Group, Tuesday Feb 15th, 2011

Post on 19-Oct-2014


DESCRIPTION

My presentation from the San Francisco Bay Area MongoDB User Group Meetup on 02/15/2011 on the topic 'Using MongoDB for IGN's Social Platform'

TRANSCRIPT

Page 1: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Using MongoDB for IGN’s Social Platform

SF Bay Area MongoDB User Group
Tuesday Feb 15th, 2011

Page 2: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

About Me

Manish Pandit

@lobster1234
http://about.me/mpandit

Page 3: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

About IGN’s Social Platform

• An API to connect the gamer community with editors, games, and other gamers, and to help lay the foundation for premium content discovery as well as UGC
• In beta since Sept 2010
• 5M+ activities
• 20K UVs a day, ~100K PVs a day

Page 4: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Architecture

• REST-based API, built in Java
• Entities are People, MediaItems, Activities, Comments, Notifications, Status
• Interfaces across IGN.com as well as other social networks
• Caching tier based on memcached
• MySQL and MongoDB as persistence
• PHP/Zend front end

Page 5: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

MongoDB Usage

• Activity Streams : ActivityStrea.ms standard
• Activity Caching : (more on this later!)
• Activity Commenting
• Points : Also extend to badges
• Block lists, Ban lists
• Notifications : System notifications
• Analytics : Activity snapshot for a user

Page 6: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Alternatives

• MySQL
  – Obvious alternative, being used for storing person data, game data, relationships
  – Did not work for activities
    – Massive joins to filter newsfeeds, i.e. activities from friends
    – Fairly normalized schema for activities
    – Too many changes to the schema as requirements changed and new types of activities came into the picture. ALTER TABLE started to take hours.
    – Optimization led to a large number of indexes, slowing down the writes

Page 7: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Alternatives

• Voldemort
  – Used for the initial release, Sept 2010
    • Fast and simple implementation of Amazon Dynamo
  – Did not work out for long
    • We needed the ability to query the data
    • Needed more than key-value pairs
    • No in-place updates out of the box; had to write custom code to handle concurrent update conflicts (read-repair)
    • Not a lot of developer velocity when compared to MongoDB

Page 8: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Other alternatives

• Cassandra
  • Learning curve, lack of querying
  • Did not want to bite off more than we could chew
• CouchDB
  • Map-reduce queries, views
  • REST-based API is good, but performance gets affected by a chatty HTTP interface for a database

Page 9: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Configuration

• Server:
  • 1 Master, 2 Slaves (load balanced through NetScaler)
  • 2 extra slaves which are not queried (replicate!!)
  • Version 1.6.1
• Client:
  • Java Driver (2.1)
  • Ruby Driver (1.2)
• Mappers:
  • Morphia for Java
• Connections per host: 200, #hosts = 4
• Oplog size: 1GB, about 2.5 hours
• Syncdelay: 60s (default)
• Hardware: 2 core, 6 GB virtualized machine
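The oplog figure above implies an average write rate, which helps when deciding how large the oplog should be for a longer replication window. A back-of-the-envelope sketch in Python, assuming 1GB means 1024 MB and that the oplog fills at a roughly constant rate; the function names and the 24-hour target are illustrative, not from the talk:

```python
def oplog_fill_rate_mb_per_hour(oplog_size_mb, window_hours):
    """Average rate at which the oplog is written, in MB per hour."""
    return oplog_size_mb / window_hours


def oplog_size_for_window_mb(rate_mb_per_hour, desired_window_hours):
    """Oplog size needed to hold `desired_window_hours` of writes at this rate."""
    return rate_mb_per_hour * desired_window_hours


# Figures from the slide: a 1GB oplog holds about 2.5 hours of operations.
rate = oplog_fill_rate_mb_per_hour(1024, 2.5)

# Hypothetical target: keep 24 hours of operations so a slave can be offline
# for a day and still catch up without a full resync.
size_for_a_day = oplog_size_for_window_mb(rate, 24)
```

With the slide's numbers this works out to roughly 410 MB per hour, or a bit under 10GB for a 24-hour window.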

Page 10: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Maintenance

• Data defragmentation
  • Slaves – by running it on a different port
  • Master – by having a downtime
• Collection trimming
  • The scripts block during remove
  • Bulk removes kill the slaves, spiking CPU to 100%
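One way to soften the impact of trimming is to remove old documents in small batches with a pause in between, so the slaves can replicate each batch before the next one arrives. This is a possible mitigation, not what the IGN scripts actually did. In this sketch the `remove_batch` callback, the batch size and the pause are all hypothetical; in a real script the callback would issue a remove for the given ids.

```python
import time


def trim_in_batches(ids, remove_batch, batch_size=500, pause_seconds=0.5):
    """Remove `ids` in chunks of `batch_size`, sleeping between chunks.

    `remove_batch` is a hypothetical callback that deletes one chunk of ids
    from the collection. Returns the number of batches issued.
    """
    batches = 0
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        remove_batch(chunk)
        batches += 1
        # Pause only if more work remains, to give replication time to catch up.
        if start + batch_size < len(ids):
            time.sleep(pause_seconds)
    return batches


# Stand-in for a collection: a set of document ids held in memory.
collection = set(range(1200))
expired = [i for i in range(1200) if i < 1000]

batches = trim_in_batches(expired, collection.difference_update,
                          batch_size=400, pause_seconds=0)
```

Smaller batches spread the work over a longer period, so batch size and pause would need tuning against the observed slave lag.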

Page 11: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Monitoring

• Nagios
  • TCP port monitoring
  • Disk space monitoring
  • CPU monitoring
• Munin
  • Mongo connections
  • Memory usage
  • Ops/second
  • Write lock %
  • Collection sizes (in terms of # of documents)

Page 12: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Backup or prepping for O Shit!

• NetApp Filer based snapshots
  • Make sure to do {fsync:1} and {lock:1} on one slave
• Hourly dumps via cron job
  • Using mongodump
• Incremental backup via the oplog
  • Replay the oplog instead of relying on a snapshot
• Delayed slaves
  • Not recommended as it almost guarantees data loss proportional to the delay, which is inversely proportional to the time-to-react

Page 13: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Tools to be familiar with

• mongostat
  • Look at queue lengths, memory, connections and operation mix
• db.serverStatus()
  • Server status with sync, page faults, locks, index misses
• atop
• iostat
• db.stats()
  • Overall info at the database level
• db.<coll_name>.stats()
  • Overall info at the collection level
• db.printReplicationInfo()
  • Info about the oplog size and time
• db.printSlaveReplicationInfo()
  • Info about the master, the last sync timestamp, and how far behind the master the slave is

Page 14: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Challenges with ActivityStreams

• Lots of data!
  • Large amount of data coming out as a result
• Reverse sorting
  • The data has to be sorted in reverse natural order ($natural : -1), and we do not use capped collections
• Aggregation of similar activities
  • Impacts pagination
• Fetching self activities (profile), and newsfeed (self + others)
• Filtering based on the activity type
  • People want to see Game Updates or Blog updates from their friends
• Hydration of activities for dynamic data
  • The thumbnail and level of the actor may change
• Comments
  • When an activity is rendered, the initial comments and count have to be pulled ($slice)

TODO: Rant about missing $size operator
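The newsfeed requirements above can be sketched in plain Python, with a list of dicts standing in for the activities collection. The field names (`actor`, `type`, `created`, `comments`), the type values and the function names are illustrative assumptions, not IGN's schema. The comment handling mimics what a `$slice` projection returns, and the count is computed in application code.

```python
def newsfeed(activities, user, friends, activity_type=None, page=0, page_size=2):
    """Activities by `user` and `friends`, newest first, optionally filtered by type."""
    actors = {user} | set(friends)
    feed = [a for a in activities
            if a["actor"] in actors
            and (activity_type is None or a["type"] == activity_type)]
    # Newest first: the in-memory equivalent of a descending sort.
    feed.sort(key=lambda a: a["created"], reverse=True)
    start = page * page_size
    return feed[start:start + page_size]


def render(activity, initial_comments=2):
    """Return the activity with only its first few comments plus the total count."""
    comments = activity.get("comments", [])
    rendered = dict(activity)
    rendered["comments"] = comments[:initial_comments]  # like {$slice: N}
    rendered["comment_count"] = len(comments)           # counted in application code
    return rendered


activities = [
    {"actor": "A", "type": "GAME_UPDATE", "created": 1, "comments": ["c1", "c2", "c3"]},
    {"actor": "B", "type": "BLOG_UPDATE", "created": 2, "comments": []},
    {"actor": "C", "type": "GAME_UPDATE", "created": 3, "comments": ["c4"]},
    {"actor": "B", "type": "GAME_UPDATE", "created": 4, "comments": []},
]

# A's newsfeed: own activities plus those of friend B, game updates only.
feed = newsfeed(activities, user="A", friends=["B"], activity_type="GAME_UPDATE")
first_rendered = render(activities[0])
```

Aggregating similar activities would change how many entries each page holds, which is why the slide notes that aggregation impacts pagination; this sketch does not attempt it.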

Page 15: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

ActivityStreams

Each activity has an ACTOR
Each actor has a TYPE
Each actor performs an action, that action is called a VERB
Each VERB can act upon many Objects, called ACTIVITYOBJECTS
Some VERBs may involve a Target, called ACTIVITYTARGET
Every entity (Actor, ActivityObject, ActivityTarget) has links to define it

Examples :

A writes ‘Hello!’ on B’s wall
Actor => A, ActivityObject => ‘Hello!’ of type WALL_POST, ActivityTarget => B, VERB => POST

A follows a game B
Actor => A, ActivityObject => B of type MEDIA_ITEM, ActivityTarget => null, VERB => FOLLOW

………and it gets complicated as we go down the rabbit hole!
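To make the vocabulary concrete, the two examples can be written as plain data. This is a sketch using Python dataclasses; the class and field names, and the PERSON actor type, are assumptions chosen to mirror the slide, not the documents IGN actually stores. Links are left out for brevity.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Entity:
    """An actor, activity object, or activity target."""
    id: str
    type: str


@dataclass
class Activity:
    actor: Entity
    verb: str
    objects: List[Entity]            # a VERB can act upon many objects
    target: Optional[Entity] = None  # only some VERBs involve a target


# A writes 'Hello!' on B's wall.
wall_post = Activity(
    actor=Entity("A", "PERSON"),
    verb="POST",
    objects=[Entity("Hello!", "WALL_POST")],
    target=Entity("B", "PERSON"),
)

# A follows a game B: no target.
follow = Activity(
    actor=Entity("A", "PERSON"),
    verb="FOLLOW",
    objects=[Entity("B", "MEDIA_ITEM")],
)

# asdict() gives a nested dict, the shape a document store would hold.
follow_doc = asdict(follow)
```

Because new activity types only add new `verb` and `type` values, a document of this shape can absorb them without a schema change.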

Page 16: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Caching using MongoDB

• Caching the entire streams
  • A bad idea (or bad implementation?)
  • The expired objects sat in the db, bloating the database
  • The removal did not free up space, so we ran out
• Use Mongo as a cache-key-index
  • Cache the streams in Memcached
  • For invalidation, keep the index of the memcached keys in MongoDB.
  • Works!
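The cache-key-index pattern can be sketched as follows. Two in-memory dicts stand in for Memcached and for the MongoDB index collection, and the class, method and key names are hypothetical. The point is only that invalidation looks up the keys in the index instead of scanning the cache.

```python
class StreamCache:
    """Cached streams live in `cache`; `key_index` records which keys belong to a user."""

    def __init__(self):
        self.cache = {}      # stand-in for Memcached: cache key -> stream
        self.key_index = {}  # stand-in for MongoDB: user id -> set of cache keys

    def put(self, user_id, view, stream):
        key = "stream:%s:%s" % (user_id, view)
        self.cache[key] = stream
        # Record the key so it can be found again at invalidation time.
        self.key_index.setdefault(user_id, set()).add(key)
        return key

    def get(self, user_id, view):
        return self.cache.get("stream:%s:%s" % (user_id, view))

    def invalidate(self, user_id):
        """Drop every cached stream for the user; returns how many keys were removed."""
        keys = self.key_index.pop(user_id, set())
        for key in keys:
            self.cache.pop(key, None)  # the key may already have expired
        return len(keys)


cache = StreamCache()
cache.put("A", "profile", ["a1", "a2"])
cache.put("A", "newsfeed", ["a1", "b1"])
cache.put("B", "profile", ["b1"])

removed = cache.invalidate("A")  # e.g. after A posts a new activity
```

Only small key records are kept in the database under this scheme, while the bulky stream payloads expire in the cache, which avoids the bloat described in the first approach.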

Page 17: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

What we’ve learned

• Keep an eye on
  • Page faults
  • Index misses
  • Queue lengths
  • Database sizes on disk due to reuse vs. release
• Use .explain()
  • Watch for nscanned and indexBounds
• Use limit() when using find
• While updating, try to load that object in memory so that it's in the working set (findAndModify)
• Try to keep the fields being selected at a minimum
• Replicate and denormalize instead of using writeconcerns

Page 18: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Near term Plans

• Move to replica sets
• Move relationship graphs to MongoDB
• Shard the relationships based on the userId
• Run multiple mongo processes, splitting out collections among multiple databases

Page 19: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Wishlist

• Respect indexes in $or queries
• A $size operator for arrays
• $inc when doing $addToSet
• Defragmentation when removing data
• Concurrency – too many write lock conditions
• A decent start/stop script
• Load balancing in the driver (round robin) for reads

Page 20: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

We are hiring

• Software Engineers to help us with exciting initiatives at IGN

• Technologies we use
  • RoR, Java (no J2EE!), Spring, PHP/Zend, jQuery
  • HTML5, CSS3, Sencha Touch, PhoneGap
  • MongoDB, memcached, Solr

http://corp.ign.com

Page 21: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

Questions

Page 22: SF MongoDB User Group : Using MongoDB for IGN's Social Platform

References

• IGN’s Social Platform
  • http://my.ign.com
  • http://people.ign.com/ign-labs
• Mongo Munin Plugins
  • https://github.com/erh/mongo-munin
  • https://github.com/lobster1234/munin-mongo-collections
• Morphia
  • http://code.google.com/p/morphia/