where is my scalable api?
DESCRIPTION
Presentation corresponding to the talk I gave at LARubyConf 2013.
TRANSCRIPT
Where is my scalable API?
Helping your customers to cut costs in their web APIs.
About me…
16 years in Software Development
From S/390 to Android
In love with Ruby since 2006
Working in
@eljuanchosf
Love to dance Tango and play Blues guitar
What is an API?
Set of programming components and standards.
Open up your app to the world. Integration!
(Real life) Case scenario
Mobile social game (iOS & Android)
Video upload & encoding
JSON API
AWS -> EC2/S3
Existing architecture
Autoscale with Scalr.com
Customer requirements
Find a way to cut costs and improve performance.
Provide a very easy way to scale the new solution.
Maintain all the RoR application functionality, focusing on APIs for the mobile clients.
Tight, TIGHT budget.
Provided solution
Inspiration: Draw Something – http://goo.gl/hi7a6
Goliath
Beanstalk
Couchbase
HAProxy as load balancer.
Varnish
What is Goliath?
Asynchronous (non-blocking) web server framework.
Based on EventMachine
Lightweight
Rack API & middleware support
Very simple yet powerful configuration
Fully async processing
Websockets out of the box
No callbacks!!
Low memory footprint (only 65 KB!)
0.3 ms from top -> bottom!
http://postrank-labs.github.com/goliath/
What is Beanstalk?
Very simple, very fast work queue.
Saves memory (lots of it).
Multiple queues.
Generic interface.
Several Ruby clients to choose from.
Send your Ruby object as JSON.
Parallel and asynchronous.
Scales VERY easily.
http://kr.github.com/beanstalkd/
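A minimal sketch of the "send your Ruby object as JSON" idea, using only the standard library. FakeTube is a hypothetical in-memory stand-in for a beanstalkd queue, and enqueue_encoding, next_job and the job fields are illustrative names, not the ones used in the project. A real client would send the same JSON string to the beanstalkd server, and its reserve would block until a job is available.

```ruby
require 'json'

# Hypothetical in-memory stand-in for a beanstalkd queue; a real client
# would send the same string over the network instead.
class FakeTube
  def initialize
    @jobs = []
  end

  # Add a job body (a plain string) to the end of the queue.
  def put(body)
    @jobs.push(body)
  end

  # Take the oldest job body, or nil when the queue is empty.
  def reserve
    @jobs.shift
  end
end

# Producer side: the API process describes the work as JSON and returns at once.
def enqueue_encoding(tube, video_id, format)
  tube.put(JSON.generate('video_id' => video_id, 'format' => format))
end

# Consumer side: a worker process pulls the next job and decodes it.
def next_job(tube)
  body = tube.reserve
  body && JSON.parse(body)
end

tube = FakeTube.new
enqueue_encoding(tube, 42, 'mp4')
job = next_job(tube)
```

Because the job body is just a string, the producer and the worker share nothing but the JSON layout, which is what lets them run and scale as separate processes.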
What is EventMachine?
Ruby implementation of the Reactor Pattern
Highly scalable
Performance optimized
Mature & stable
Eliminates the complexities of threaded network programming.
Active community
Examples: Thin & Goliath.
What is the Reactor Pattern?
October 1995 by Douglas Schmidt
AKA Dispatcher or Notifier
Handle requests delivered to an application by one or more clients.
Single threaded by definition
Separates application logic from the reactor implementation
Task switching = no multithreading!
How does the RP work?
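A toy illustration of the pattern in plain Ruby, assuming nothing beyond the standard library. TinyReactor is a hypothetical name and this is not how EventMachine is implemented internally. It only shows the core loop: one thread waits until some IO is ready, then dispatches to the handler registered for it. Two pipes stand in for two client connections.

```ruby
# A toy reactor: one thread, one loop, handlers dispatched when IO is ready.
class TinyReactor
  def initialize
    @handlers = {} # IO object => block to call when it becomes readable
  end

  # Register interest in an IO object.
  def register(io, &handler)
    @handlers[io] = handler
  end

  # Stop watching an IO object.
  def unregister(io)
    @handlers.delete(io)
  end

  # The event loop: block until something is readable, then dispatch.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      ready.each { |io| @handlers[io].call(io) }
    end
  end
end

reactor = TinyReactor.new
received = []

# Two pipes stand in for two client connections.
2.times do |i|
  reader, writer = IO.pipe
  writer.write("request #{i}")
  writer.close
  reactor.register(reader) do |io|
    received << io.read    # application logic lives in the handler
    reactor.unregister(io) # done with this "client"
    io.close
  end
end

reactor.run
```

The application logic (the blocks) never touches the loop, and the loop knows nothing about what the handlers do, which is the separation the slide above refers to.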
Using EventMachine
require 'eventmachine'
require 'em-http' # provided by the em-http-request gem
EM.run {
  EM::HttpRequest.new('http://www.example.com').get.callback { |http|
    puts http.response
  }
}
The Spaghetti Callback Incident
EM::HttpRequest.new(first_url).get.callback { |http|
  second_url = extract_next_url(http.response)
  EM::HttpRequest.new(second_url).get.callback { |http2|
    puts http2.response
  }
}
Goliath’s way
require 'em-synchrony/em-http'
http = EM::HttpRequest.new(first_url).get
second_url = extract_next_url(http.response)
http2 = EM::HttpRequest.new(second_url).get
No callbacks and still asynchronous!!!
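How can code look synchronous and still be asynchronous? em-synchrony builds on Ruby Fibers, and the trick can be sketched without any gems. Everything below is a simplified stand-in: PENDING plays the role of the event loop, and async_get and sync_get are hypothetical names, not em-synchrony's API.

```ruby
# Fiber.current is core on recent Rubies; older ones need the require.
require 'fiber' unless Fiber.respond_to?(:current)

# Hypothetical stand-in for an event loop: callbacks queued to run "later".
PENDING = []

# Callback-style API, in the spirit of a plain EventMachine client.
def async_get(url, &callback)
  PENDING << -> { callback.call("body of #{url}") }
end

# Synchronous-looking wrapper: park the current fiber until the callback fires.
def sync_get(url)
  fiber = Fiber.current
  async_get(url) { |response| fiber.resume(response) }
  Fiber.yield # hand control back to the loop; returns what resume passes in
end

results = []

# Each request handler would run inside its own fiber.
Fiber.new do
  first = sync_get('http://www.example.com/first')
  second = sync_get('http://www.example.com/second')
  results << first << second
end.resume

# Drain the fake event loop, firing each queued callback in turn.
PENDING.shift.call until PENDING.empty?
```

The callbacks are still there, but they are written once inside the wrapper. While a fiber is parked in Fiber.yield, the loop is free to serve other requests.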
Look for appropriate libraries!
https://github.com/igrigorik/em-synchrony
https://github.com/eventmachine/eventmachine/wiki/protocol-implementations
Back to Goliath: routing
Latest version has no built-in routing system.
Ilya Grigorik (Goliath’s creator) suggests starting multiple Goliath servers, each with a single endpoint, and using HAProxy or any other reverse proxy to route the requests.
That’s kind of cumbersome, don’t you think?
How we implemented routing
Routing was done through convention over configuration, with a little of Ruby’s reflection abilities mixed with some inheritance:
http://server/api/game/CreateGame was redirected to the api/game/create_game.rb controller:
class CreateGame < APIController
  ....
end
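One way such convention-based routing could look, as a sketch only. The talk does not show the real code, so the interface of APIController (a call method), the helpers underscore and resolve, and the sample parameters are all assumptions made for illustration.

```ruby
# Hypothetical base class; the talk names APIController but not its interface.
class APIController
  def call(params)
    raise NotImplementedError
  end
end

# In the layout from the slide this class would live in api/game/create_game.rb.
class CreateGame < APIController
  def call(params)
    { 'created' => params['name'] }
  end
end

# "CreateGame" -> "create_game", the file name the convention implies.
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Turn "/api/game/CreateGame" into a controller file path and a class.
# The class is nil unless the name resolves to an APIController subclass.
def resolve(path)
  *dirs, action = path.split('/').reject(&:empty?)
  file = File.join(*dirs, "#{underscore(action)}.rb")
  # Reflection: only look up names that are valid, defined constants.
  return [file, nil] unless action =~ /\A[A-Z]\w*\z/ && Object.const_defined?(action)
  klass = Object.const_get(action)
  # Inheritance check keeps arbitrary classes from being exposed as endpoints.
  [file, klass.is_a?(Class) && klass < APIController ? klass : nil]
end

file, controller = resolve('/api/game/CreateGame')
response = controller.new.call({ 'name' => 'tango' })
```

With this shape, adding an endpoint means adding one file with one class and no route table. The inheritance check matters because the class name comes straight from the URL.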
Scaling
Goliath: add processes or servers and configure them in HAProxy.
Couchbase: add servers to the cluster.
Done!
(we used Scalr to automate this, too)
Results
From ~450 req/s to ~1300 req/s.
From 4 to 1 EC2 application servers.
Triple performance while reducing costs.
Video upload and processing fast and reliable: ~250 jobs/s
Your turn!
?/!