Scalability Boot Camp SXSW 2008
DESCRIPTION
SXSW Presentation by Jakob Heuser (Gaia Online), Alan Kasindorf (Six Apart), Sandy Jen (Meebo), and Blaine Cook (Twitter). Topics include scaling out using caching, file storage solutions, parallel processing, and database solutions.
TRANSCRIPT
Scalability Boot Camp
SXSW 2008
Jakob Heuser
Alan Kasindorf
Blaine Cook
Sandy Jen
Kerry Miller
Briefing
Panelists
• Kerry Miller - BusinessWeek
• Alan Kasindorf (aka dormando) - Six Apart
• Jakob Heuser - Gaia Online
• Blaine Cook - Twitter
• Sandy Jen - Meebo
Kerry Miller - BusinessWeek
Why Scale
• Good problem
• Doesn’t have to cost
• It’s an “everybody” thing
Kerry Miller - BusinessWeek
The Regimen
• Problem
• Concepts
• Abstract Solutions
• Google Time
• Conversation
Kerry Miller - BusinessWeek
Monitoring
avoid working in the dark
Sandy Jen - Meebo
Get it on the radar
• Understand the “pain points”
• Live and die by monitoring
• Monitor EVERYTHING
Sandy Jen - Meebo
“Everything”?
• Disk I/O
• Memory
• Bandwidth
• Page Load Times
• The list goes on...
Sandy Jen - Meebo
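To make "monitor everything" concrete, here is a minimal sketch of sampling a few of the numbers listed above with the Python standard library. It is an illustration only, not a replacement for Ganglia, Hyperic, or sar. The names sample_metrics and timed are hypothetical, and the load-average call assumes a Unix-like host.

```python
import os
import shutil
import time


def sample_metrics(path="/"):
    """Take one sample of a few host-level numbers."""
    try:
        load_1min = os.getloadavg()[0]  # Unix only
    except (OSError, AttributeError):
        load_1min = None  # not available on this platform
    disk = shutil.disk_usage(path)
    return {
        "timestamp": time.time(),
        "load_1min": load_1min,
        "disk_used_fraction": disk.used / disk.total,
    }


def timed(fetch):
    """Run fetch() and return (result, elapsed_seconds).

    Wrap whatever produces a page to get a rough page load time.
    """
    start = time.perf_counter()
    result = fetch()
    return result, time.perf_counter() - start
```

Calling sample_metrics() on a schedule and storing the results gives a trend line, which is usually more useful than any single reading.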
Google Time
• Ganglia
• Hyperic
• sar and sysstat - simple, you already have it
• Know your tool, whatever it is
Sandy Jen - Meebo
Content Delivery Network
images, static files, and the like
left out - n/a
Friendly fire
Bandwidth Usage During Digg Effect
left out - n/a
Down in the trenches
YSlow Output On Retrieving a Page
left out - n/a
“Better you than me”
• Put the right content on it
• Do not “bolt it on” later
• Use many subdomains
• Cross domain situations
left out - n/a
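One way to read "use many subdomains" is to spread static assets across several hostnames, since browsers limit parallel connections per hostname. The sketch below assumes three hypothetical hosts (static1 to static3 under example.com) and a hypothetical helper, cdn_url. It hashes the path so a given asset always maps to the same host and stays cacheable.

```python
import zlib

# Hypothetical hostnames; substitute the subdomains pointed at your CDN.
CDN_HOSTS = [
    "static1.example.com",
    "static2.example.com",
    "static3.example.com",
]


def cdn_url(path, hosts=CDN_HOSTS):
    """Map an asset path to one host, deterministically.

    The same path always hashes to the same host, so browser and
    CDN caches are not split across several URLs for one file.
    """
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return "http://%s/%s" % (hosts[index], path.lstrip("/"))
```

Choosing the host at random per request would also spread load, but it would give the same file several URLs and defeat caching.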
Google Time (2)
• CDN - Wikipedia
• Akamai
• Panther Express
• Coral CDN
left out - n/a
File System Solutions
users like making tons of lolcats and storing them on your website
Jakob Heuser - Gaia Online
Never gonna be a hero
Disk IO Graph
Jakob Heuser - Gaia Online
Use what you have
• Don’t waste capacity
• Use someone else’s space
• Avoid a single “authority” on a file
Jakob Heuser - Gaia Online
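Avoiding a single "authority" on a file can be sketched as keeping each file on more than one storage node and reading from whichever copy is available. The example below is a toy under stated assumptions: local directories stand in for storage nodes, and pick_nodes, store, and fetch are hypothetical names. It is not the MogileFS or S3 API.

```python
import hashlib
import os


def pick_nodes(key, nodes, copies=2):
    """Choose `copies` distinct nodes for a key, deterministically."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    count = min(copies, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def store(key, data, nodes, copies=2):
    """Write the same bytes to every node chosen for this key."""
    for node in pick_nodes(key, nodes, copies):
        with open(os.path.join(node, key), "wb") as f:
            f.write(data)


def fetch(key, nodes, copies=2):
    """Return the first readable copy; skip nodes that lost the file."""
    for node in pick_nodes(key, nodes, copies):
        try:
            with open(os.path.join(node, key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            continue
    raise KeyError(key)
```

With two copies, losing any one node still leaves every file readable. A real system would also re-replicate the lost copy in the background.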
Google Time
• DRBD + OCFS
• Amazon S3
• MogileFS (Danga Software)
• Lustre
Jakob Heuser - Gaia Online
The Database Layer
your most common, but hardest to solve bottleneck
Alan Kasindorf - Six Apart
Enemy diversion
Disk IO CPU Usage
Alan Kasindorf - Six Apart
The real problem
Show Processlist Output
Alan Kasindorf - Six Apart
Another war zone
• Make “:) SQL” not “:( SQL”
• Horizontal Partitioning
• Caching Layer
Alan Kasindorf - Six Apart
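Horizontal partitioning and a caching layer can be sketched together. The example below makes several assumptions: two in-memory SQLite databases stand in for separate database servers, a plain dict stands in for a shared cache such as memcached, and the users table and all function names are hypothetical.

```python
import sqlite3

# Two in-memory databases stand in for two database servers (shards).
shards = [sqlite3.connect(":memory:") for _ in range(2)]
for db in shards:
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# A dict stands in for a shared cache such as memcached.
cache = {}


def shard_for(user_id):
    """Horizontal partitioning: each user lives on exactly one shard."""
    return shards[user_id % len(shards)]


def save_user(user_id, name):
    """Write to the owning shard, then drop the stale cache entry."""
    db = shard_for(user_id)
    db.execute(
        "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
        (user_id, name),
    )
    db.commit()
    cache.pop(("user", user_id), None)


def load_user(user_id):
    """Check the cache first; query the shard only on a miss."""
    key = ("user", user_id)
    if key in cache:
        return cache[key]
    row = shard_for(user_id).execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    cache[key] = row[0]
    return row[0]
```

Modulo sharding is the simplest scheme, but changing the number of shards moves most keys. A lookup table or consistent hashing is a common alternative when shards are added often.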
Google Time
• memcached
• HiveDB
• CouchDB / Hypertable
• MySQL Consultant
Alan Kasindorf - Six Apart
Parallel Processing
you don’t have to do it all right now
Blaine Cook - Twitter
Under siege
Slow Query Log
Blaine Cook - Twitter
Smarter, not stronger
• Consistent for current user, not everyone
• Design code for parallel steps
• cronjobs
Blaine Cook - Twitter
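"You don't have to do it all right now" usually means handing slow work to a queue and returning to the user immediately. The sketch below uses an in-process queue and one worker thread from the Python standard library as a stand-in for a real job system such as Starling, Gearman, or TheSchwartz. The names enqueue, start_worker, and results are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
results = []  # only the worker thread appends here


def enqueue(func, *args):
    """Record the work and return at once; a worker runs it later."""
    jobs.put((func, args))


def worker():
    """Pull jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            func, args = job
            results.append(func(*args))
        finally:
            jobs.task_done()


def start_worker():
    """Start one background worker thread and return it."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

A request handler would call enqueue(...) and respond right away. The current user can be shown their own change immediately while everyone else sees it once the worker catches up.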
Google Time
• Starling
• Gearman
• TheSchwartz
Blaine Cook - Twitter
Regroup
Jakob Heuser - Gaia Online
Regroup
• All these technologies are built to be asynchronous
• An amazing amount of your app can be asynchronous too
Jakob Heuser - Gaia Online
http://www.slideshare.net/Jakobo
Resources
Monitoring
• http://ganglia.sourceforge.net
• http://www.hyperic.com
• http://www.nagios.org
• http://pagesperso-orange.fr/sebastien.godard/
• http://www.cacti.net
CDN
• http://en.wikipedia.org/wiki/Content_Delivery_Network
• http://en.wikipedia.org/wiki/Akamai_Technologies
Resources
File Systems
• http://www.danga.com/mogilefs/
• http://www.lustre.org
• http://www.drbd.org
• http://oss.oracle.com/projects/ocfs/
Database
• http://www.planetmysql.com
• http://datacharmer.blogspot.com/
• http://www.danga.com/memcached/
Job and Queue Systems
• http://www.danga.com/gearman/
• http://rubyforge.org/projects/starling/
Image Credit
• http://www.history.noaa.gov/stories_tales/radar.html (radar)
• http://www.thatpoliticalblog.com/serendipity/plugin/tag/TPB+Information (bar chart)
• http://www.digitalearth.com.au/category/general/ (disk on fire)
• http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=160210671355 (dump truck)
• http://420.thrashbarg.net/ (marching penguins)