Just Do It Later...Procrastination in Ruby
Philly.Rb August 9, 2011
Hi I'm Rob
Twitter: iotr
Github: https://github.com/robdimarco/
Email: rob @ 416 software . com
More: http://www.innovationontherun.com
This Presentation: https://github.com/robdimarco/deferred_processing_talk
Out of Stream Processing
Always Looking to Speed Up User Experience
One Way to Get Faster..Do Less
Targets for Deferral
Sending email
Image resizing
HTTP/FTP downloads
Updating indexes
Batch import / exports
Analytics
Thread.fork {do_stuff}
Using Threads
Easy... But:
Process cannot exit before thread
Work kept in same process
Need to control thread count
Use when:
Runtime makes sense within current process
Shared access to data is desired
Want communication with other threads
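A minimal sketch of the thread points above, using only the Ruby standard library. The helper name run_in_threads and its thread_count parameter are made up for illustration. It caps the number of threads and joins them so the process does not exit before the work is done.

```ruby
require 'thread'

# Hypothetical helper: run callable jobs on a fixed number of threads.
def run_in_threads(jobs, thread_count = 4)
  queue = Queue.new
  jobs.each { |job| queue << job }
  results = Queue.new

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true)      # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                # nothing left to do, let this thread finish
        end
        results << job.call
      end
    end
  end

  workers.each(&:join)         # wait, so the process cannot exit before the threads
  Array.new(results.size) { results.pop }
end

squares = run_in_threads((1..5).map { |n| -> { n * n } }, 2)
puts squares.sort.inspect      # prints [1, 4, 9, 16, 25]
```

Results arrive in completion order, not submission order, which is why the usage line sorts them.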
pid = fork {do_stuff}
Process.detach pid # Maybe??
Using Forks
Easy... But:
Need to control processes
Need to control output
Use if:
Separate resources desired
Runtime outside of main process but on same machine
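A sketch of the two fork concerns above, controlling the process and controlling its output. The helper name run_in_fork and the log file name are assumptions for illustration. The child sends its output to a log file, and the parent either waits for it or detaches it.

```ruby
require 'tmpdir'

# Hypothetical helper: run the block in a child process, logging to log_path.
def run_in_fork(log_path)
  $stdout.flush                      # avoid duplicating buffered parent output
  fork do
    $stdout.reopen(log_path, "a")    # child writes to the log, not the parent's stdout
    $stderr.reopen($stdout)
    yield
    $stdout.flush
    exit!(0)                         # skip at_exit handlers inherited from the parent
  end
end

log = File.join(Dir.tmpdir, "deferred_work_#{Process.pid}.log")
pid = run_in_fork(log) { puts "working in child #{Process.pid}" }

# Either wait for the child and check its status...
Process.wait(pid)
puts "child ok? #{$?.success?}"
# ...or call Process.detach(pid) instead, which reaps the child in a
# background thread so it does not linger as a zombie.
```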
Let's Take It Up A Notch
Queueing!!!
Queues
[Diagram: Request -> Controller -> Response; the Controller pushes onto the Queue; Another Process pops from the Queue]
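The push/pop flow can be sketched in a single process with Ruby's built-in Queue. Here a thread stands in for the other process, and handle_request, start_worker, and the :stop marker are made-up names. The point is only that the "controller" returns as soon as the job is pushed.

```ruby
require 'thread'

# Stand-in for "another process": pops jobs until it sees :stop.
def start_worker(queue, done)
  Thread.new do
    while (job = queue.pop) != :stop   # pop blocks until something is pushed
      done << "sent email to #{job}"
    end
  end
end

# Stand-in for the controller: push the job and respond immediately.
def handle_request(queue, email)
  queue << email
  "202 Accepted"
end

jobs = Queue.new
done = Queue.new
worker = start_worker(jobs, done)

puts handle_request(jobs, "rob@example.com")   # prints 202 Accepted
jobs << :stop
worker.join
puts done.pop                                   # prints sent email to rob@example.com
```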
Delayed::Job
Brought to you by Shopify
Your code creates a Delayed::Job persisted object
Worker will go through and work off jobs
Installation
Gemfile:
gem 'delayed_job'
script/rails generate delayed_job
Creates migration
Adds script/delayed_job
rake db:migrate
Using delay
Any method can be converted into a delayed job by calling delay
User.find(1).follow!(User.find(2))
becomes
User.find(1).delay.follow!(User.find(2))
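To show the idea behind delay, here is a toy proxy that records a method call instead of running it. This is not the delayed_job implementation (which persists the call to a data store). DelayProxy, Delayable, PENDING_JOBS, and work_off are invented for this sketch, and the User class is a plain Ruby stand-in for a model.

```ruby
PENDING_JOBS = []

# Records any method call made on it instead of executing the call.
class DelayProxy < BasicObject
  def initialize(target, queue)
    @target = target
    @queue = queue
  end

  def method_missing(name, *args)
    @queue << [@target, name, args]   # remember the call for later
    nil
  end
end

module Delayable
  def delay(queue = PENDING_JOBS)
    DelayProxy.new(self, queue)
  end
end

class User
  include Delayable
  attr_reader :name, :following

  def initialize(name)
    @name = name
    @following = []
  end

  def follow!(other)
    @following << other.name
  end
end

# Stand-in for a worker: replay every recorded call.
def work_off(queue)
  until queue.empty?
    target, name, args = queue.shift
    target.public_send(name, *args)
  end
end

rob = User.new("rob")
amy = User.new("amy")
rob.delay.follow!(amy)
puts rob.following.size    # prints 0, nothing has run yet
work_off(PENDING_JOBS)
puts rob.following.first   # prints amy
```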
Using enqueue
Can create an object that defines a perform method
Enqueue with
Delayed::Job.enqueue(MyJob.new)
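A sketch of a job object with a perform method. FollowJob and the in-memory FOLLOWS hash (standing in for a database) are assumptions for illustration. The Delayed::Job.enqueue call is left in a comment so the snippet runs without the gem.

```ruby
# Stand-in for a database table: follower id => list of followed ids.
FOLLOWS = Hash.new { |hash, id| hash[id] = [] }

# Any object that responds to #perform can be enqueued.
FollowJob = Struct.new(:follower_id, :followed_id) do
  def perform
    FOLLOWS[follower_id] << followed_id
  end
end

# In an app with delayed_job installed, you would enqueue with:
#   Delayed::Job.enqueue(FollowJob.new(1, 2))
# A worker later loads the job and calls #perform, simulated by hand here:
FollowJob.new(1, 2).perform
puts FOLLOWS[1].inspect   # prints [2]
```

Keeping only ids in the job, rather than whole model objects, keeps the stored job small and avoids working on stale copies.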
Delayed::Job Configuration
On delay/enqueue:
:priority a number
:run_at a time
As defaults:
sleep_delay
max_attempts
max_run_time
destroy_failed_jobs
Persistence
Saved to a data store
By default, uses ActiveRecord, other frameworks available
Stores the job as YAML
Warning: model objects may have changed between enqueuing and processing
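The staleness problem can be reproduced with plain YAML and a hash standing in for the database. USERS, enqueue_snapshot, and enqueue_id are invented names, and this does not use delayed_job itself. A snapshot taken at enqueue time keeps the old data, while a job that stores only the id sees the current record when it runs.

```ruby
require 'yaml'

# Stand-in for a users table.
USERS = { 1 => { "id" => 1, "email" => "old@example.com" } }

# Option A: serialize the whole record when enqueuing.
def enqueue_snapshot(user)
  YAML.dump(user)
end

# Option B: serialize only the id and reload when processing.
def enqueue_id(user)
  YAML.dump("user_id" => user["id"])
end

snapshot_job = enqueue_snapshot(USERS[1])
id_job       = enqueue_id(USERS[1])

USERS[1]["email"] = "new@example.com"   # record changes before the worker runs

stale = YAML.load(snapshot_job)
fresh = USERS[YAML.load(id_job)["user_id"]]

puts stale["email"]   # prints old@example.com
puts fresh["email"]   # prints new@example.com
```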
Processing The Jobs
Start the script/delayed_job
Can pass in # of workers and priority
Uses daemons library
Workers lock jobs by editing the object in the database
Errors are stored alongside the job
Operating Delayed::Job
Workers need the same code base that enqueued the job in order to process it
One solitary queue with priorities, cannot create individual queues
Have had problems with workers that process long jobs not dying
Big queues can lead to database contention
What's To Like
Simple installation
Minimally invasive to object model
Any object can be queued
Numeric priorities
Maybe Not Right For You If...
Want multiple queues
Want out of the box GUI
Big queues
Lots of workers
Want to process outside your code base
Resque
Brought to you by GitHub
Uses Redis to persist queues
Rake task for starting a worker
Sinatra app for monitoring queues, jobs, and workers.
Installation
Install Redis
Gemfile:
gem 'resque'
require 'resque'
Configuration:
config/initializers/resque.rb
config/resque.yml
Enqueue
Class that has class methods:
perform: method to do the work
queue: the name of the queue to use (symbol/string)
Enqueue with Resque.enqueue(class, *args)
More on Methods
Arguments to enqueue are serialized to JSON
Pro tip... only send in objects that can be encoded/decoded to JSON
queue can be class attribute or class method
perform takes in arguments sent to enqueue
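Why the JSON pro tip matters can be seen with the standard json library alone. The through_json helper is a made-up name, and it only approximates what happens to arguments between enqueue and perform. Symbols come back as strings, so perform should not expect the exact Ruby objects that were passed in.

```ruby
require 'json'

# Approximates the trip the arguments take: encoded on enqueue, decoded for perform.
def through_json(*args)
  JSON.parse(JSON.generate(args))
end

id, template, options = through_json(1, :welcome, { :locale => :en })

puts id.inspect                  # prints 1
puts template.inspect            # prints "welcome" (a String, no longer a Symbol)
puts options.keys.first.class    # prints String
puts options["locale"].inspect   # prints "en"
```

Passing record ids and plain strings, then reloading inside perform, sidesteps the issue.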
Resque Admin
Sinatra application
Standalone or mounted in your application
Good snapshot of what is happening
About Redis
Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.
In-memory data set
Can persist to disk
Easy master-slave replication
Resque Uses Redis For...
Storing list of queues in a set
Creating a list of jobs for each queue
Tracking workers
Maintaining stats
Enqueuing a User Follow
Add queue and perform methods to the User class
Enqueue the job
Resque.enqueue User, 2, 1
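A sketch of what adding queue and perform to the User class could look like. The queue name :follows, the argument order (follower first), and the in-memory follows store are assumptions, and the class is plain Ruby rather than a real model. The Resque call stays in a comment so the snippet runs without the gem or Redis.

```ruby
class User
  @queue = :follows                                  # queue name, read by Resque
  @follows = Hash.new { |hash, id| hash[id] = [] }   # stand-in for a database table

  class << self
    attr_reader :queue, :follows

    # Receives the same arguments that were passed to Resque.enqueue,
    # after their round trip through JSON.
    def perform(follower_id, followed_id)
      follows[follower_id] << followed_id
    end
  end
end

# With Resque loaded:
#   Resque.enqueue User, 2, 1
# A worker on the :follows queue then ends up calling:
User.perform(2, 1)
puts User.follows[2].inspect   # prints [1]
```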
Process
echo "require 'resque/tasks'" > lib/tasks/resque.rake
QUEUE=* rake environment resque:work
Forks off child to do actual work
Makes it easier to stop/start
Looks to see if you are using Ruby EE
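The fork-per-job idea can be sketched with plain Ruby. This is not Resque's worker code, and work_with_forks is a made-up name. The parent forks a child for each job and waits for it, so a job that dies takes down only its child while the parent moves on to the next job.

```ruby
# Runs each callable in its own child process; returns true/false per job.
def work_with_forks(jobs)
  jobs.map do |job|
    $stdout.flush
    pid = fork do
      job.call
      exit!(0)                 # child leaves without running parent at_exit hooks
    end
    _, status = Process.wait2(pid)
    status.success?
  end
end

results = work_with_forks([
  -> { 1 + 1 },
  -> { exit!(1) },             # simulates a job that kills its process
  -> { 2 + 2 }
])
puts results.inspect           # prints [true, false, true]
```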
What's To Like
Fast on/off queue
Large queues
Multiple queues
Nice admin tool
Workers resistant to job quirks
Maybe Not Right For You If...
Want priorities
Do not want Redis
Do not want queues in RAM
Do not want JSON serialization
Want to manually query the job queue
Amazon SQS
What is SQS
Web-based Message Queue
Language agnostic
On Amazon's Cloud
High Availability
Scalable
Queue level security
Using SQS
Requires AWS Account
Currently 100,000 messages / month free
$0.01 / 10,000 messages after that
All interactions over SOAP queries
Using RightAws gem to abstract this away
How to Queue Up Jobs
sqs = RightAws::SqsGen2.new( ENV['ACCESS_KEY'], ENV['SECRET_KEY'])
q = sqs.queue('my_queue_name')
q.push('my data')
Processing
sqs = RightAws::SqsGen2.new( ENV['ACCESS_KEY'], ENV['SECRET_KEY'])
q = sqs.queue('my_queue_name')
loop { process(q.pop) }
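A slightly more defensive polling loop, as a sketch. It assumes q.pop returns nil when the queue is empty and that the message's to_s gives its body (check the RightAws docs for your version). The drain helper, its parameters, and FakeQueue are made up so the example runs without AWS credentials.

```ruby
# Polls anything that responds to #pop; sleeps when the queue is empty.
# max_idle_polls is only there so the example can stop; a real worker
# would normally pass nil and loop forever.
def drain(queue, idle_sleep = 1, max_idle_polls = nil)
  idle = 0
  loop do
    message = queue.pop
    if message
      idle = 0
      yield message.to_s
    else
      idle += 1
      break if max_idle_polls && idle >= max_idle_polls
      sleep idle_sleep         # do not hammer the service with empty polls
    end
  end
end

# Stand-in for the object returned by sqs.queue('my_queue_name').
class FakeQueue
  def initialize(bodies)
    @bodies = bodies.dup
  end

  def pop
    @bodies.shift
  end
end

drain(FakeQueue.new(["my data", "more data"]), 0, 1) do |body|
  puts "processing #{body}"
end
```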
What's To Like
Easy to process out of your application code
E.g. processing needs to happen on Windows
Easy to process out of your data center
Easy to add/remove queues
No additional services to manage
Maybe Not Right For You If...
You do not want to figure out the serialization / de-serialization
You do not want to handle worker process management
You want to minimize processing time / job.
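On the serialization point, one simple convention is a small JSON envelope in the message body, which non-Ruby consumers can read as well. The names encode_job and decode_job and the "type"/"payload" keys are assumptions for this sketch, not part of SQS or RightAws.

```ruby
require 'json'

# Build the string that would be handed to q.push.
def encode_job(type, payload)
  JSON.generate("type" => type, "payload" => payload)
end

# Turn a popped message body back into a job type and its payload.
def decode_job(body)
  job = JSON.parse(body)
  [job.fetch("type"), job.fetch("payload")]
end

body = encode_job("resize_image", "image_id" => 42, "width" => 200)
type, payload = decode_job(body)

puts type                  # prints resize_image
puts payload["image_id"]   # prints 42
```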
AMQP Messaging - The Big Gun
AMQP
Advanced Message Queuing Protocol
Protocol defining how messaging clients and brokers will interact
Open, cross-vendor specification
ruby-amqp
Requires EventMachine
Connects to an AMQP-compliant server
RabbitMQ
Apache Qpid
Will need to specify host, channel, queue, and message
Look Into AMQP-Based System If
Needs extend beyond just a simple queue
Publish/subscribe
Transactional messaging
High volume / fast processing
Cross-systems
Other Options
Beanstalkd: standalone server
Starling: from Twitter, speaks memcached
SimpleWorker: SaaS job processing, http://www.simpleworker.com/
CloudCrowd: map-reduce framework
Key Links
Source Code: https://github.com/robdimarco/deferred_processing_talk
Delayed::Job: https://github.com/collectiveidea/delayed_job
Resque: https://github.com/defunkt/resque
https://github.com/blog/542-introducing-resque
More Links
SQS: http://aws.amazon.com/sqs/
https://github.com/rightscale/right_aws
AMQP: http://rdoc.info/github/ruby-amqp/amqp/master/file/docs/GettingStarted.textile
http://www.amqp.org/
More: http://ruby-toolbox.com/categories/queueing.html
http://kr.github.com/beanstalkd/