Deferred Processing in Ruby - Philly.rb - August 2011

Upload: robdimarco

Post on 16-Apr-2017

Just Do It Later...Procrastination in Ruby

Philly.Rb August 9, 2011

Hi I'm Rob

Twitter: iotr

Github: https://github.com/robdimarco/

Email: rob @ 416 software . com

More: http://www.innovationontherun.com

This Presentation: https://github.com/robdimarco/deferred_processing_talk

Out of Stream Processing

Always Looking to Speed Up User Experience

One Way to Get Faster... Do Less

Targets for Deferral

Sending email

Image resizing

HTTP/FTP downloads

Updating indexes

Batch import / exports

Analytics

Thread.fork {do_stuff}

Using Threads

Easy... But

Process cannot exit before the thread finishes

Work kept in same process

Need to control thread count

Use when

Runtime makes sense within current process

Shared access to data is desired

Want communication with other threads
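A minimal sketch of the thread trade-offs above, using only Ruby's standard library. A fixed number of worker threads pull from a shared Queue (which caps the thread count), and the main thread joins them so the process does not exit early. The `send_email` method, the addresses, and `WORKER_COUNT` are hypothetical stand-ins, not part of the original talk.

```ruby
require 'thread'

# Hypothetical stand-in for slow work such as sending an email.
def send_email(address)
  "sent to #{address}"
end

WORKER_COUNT = 2       # cap on concurrent threads
jobs    = Queue.new    # thread-safe queue shared by all workers
results = Queue.new

%w[a@example.com b@example.com c@example.com].each { |addr| jobs << addr }
WORKER_COUNT.times { jobs << :done }   # one stop marker per worker

workers = Array.new(WORKER_COUNT) do
  Thread.new do
    # Each worker pops jobs until it sees a stop marker.
    while (addr = jobs.pop) != :done
      results << send_email(addr)
    end
  end
end

# Without join, the process could exit while threads are still working.
workers.each(&:join)

sent = []
sent << results.pop until results.empty?
```

The order of `sent` depends on thread scheduling, so sort it before comparing.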

pid = fork {do_stuff}

Process.detach pid # Maybe??

Using Forks

Easy... But

Need to control processes

Need to control output

Use if

Separate resources desired

Runtime outside of main process but on same machine
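One way the fork points above might look in practice, as a sketch for Unix-like systems (`fork` is not available on Windows). The child redirects its own stdout and reports back over a pipe, and the parent reaps it. `resize_image` and the file name are hypothetical.

```ruby
# Hypothetical stand-in for work we want out of the parent process.
def resize_image(path)
  "resized #{path}"
end

reader, writer = IO.pipe

pid = fork do
  reader.close
  # Control output: point the child's stdout somewhere deliberate
  # instead of sharing the parent's terminal.
  $stdout.reopen(File::NULL, 'w')
  writer.write(resize_image('photo.jpg'))
  writer.close
  exit!(0)   # skip at_exit handlers inherited from the parent
end

writer.close
child_output = reader.read   # blocks until the child closes its end
reader.close

# Control processes: reap the child so it does not linger as a zombie.
# Process.detach(pid) is the alternative when the exit status is not needed.
_, status = Process.wait2(pid)
```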

Let's Take It Up A Notch

Queueing!!!

Queues

[Diagram: a request reaches the controller, which pushes a job onto the queue and returns the response; another process pops jobs off the queue]

Delayed::Job

Brought to you by Shopify

Your code creates a Delayed::Job persisted object

Worker will go through and work off jobs

Installation

Gemfile

gem 'delayed_job'

script/rails generate delayed_job

Creates migration

Adds script/delayed_job

rake db:migrate

Using delay

Any method can be converted into a delayed job by calling delay

User.find(1).follow!(User.find(2))

becomes

User.find(1).delay.follow!(User.find(2))
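To illustrate the idea behind `delay`, here is a deliberately simplified sketch of a proxy that records a method call instead of running it, so a worker can replay it later. This is not delayed_job's implementation; `DelayProxy`, `RecordedCall`, `JOBS`, and the `later` method are all hypothetical names, and this `User` is a plain Ruby class rather than an ActiveRecord model.

```ruby
# Hypothetical illustration of the record-now, run-later idea.
RecordedCall = Struct.new(:target, :method_name, :args) do
  def perform
    target.public_send(method_name, *args)
  end
end

JOBS = []   # stand-in for the jobs table

class DelayProxy
  def initialize(target)
    @target = target
  end

  # Calls the target understands are captured instead of executed.
  def method_missing(name, *args)
    return super unless @target.respond_to?(name)
    JOBS << RecordedCall.new(@target, name, args)
    nil
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

class User
  attr_reader :name, :following

  def initialize(name)
    @name = name
    @following = []
  end

  def follow!(other)
    @following << other.name
  end

  # Named `later` to avoid confusion with the real `delay` method.
  def later
    DelayProxy.new(self)
  end
end

rob = User.new('rob')
amy = User.new('amy')

rob.later.follow!(amy)             # returns immediately; nothing has run yet
queued_before = rob.following.dup

JOBS.shift.perform while JOBS.any? # roughly what a worker would do
```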

Using enqueue

Can create an object that defines a perform method

Enqueue with

Delayed::Job.enqueue(MyJob.new)
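A sketch of what such a job object might look like. `NewsletterJob` and its fields are hypothetical; the only requirement taken from the slide is that the object defines `perform`. The `defined?` guard is there only so the snippet also runs outside an application that has delayed_job loaded.

```ruby
# A job is any object that responds to #perform.
# NewsletterJob, its fields, and the message format are hypothetical.
NewsletterJob = Struct.new(:subject, :recipient_ids) do
  def perform
    recipient_ids.map { |id| "Sent '#{subject}' to user #{id}" }
  end
end

job = NewsletterJob.new('August meetup', [1, 2])

if defined?(Delayed::Job)
  # Inside an app with delayed_job installed, hand the job to a worker.
  Delayed::Job.enqueue(job)
else
  # Outside such an app, run it inline so the sketch still executes.
  outcome = job.perform
end
```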

Delayed::Job Configuration

On delay/enqueue

:priority a number

:run_at a time

As defaults

sleep_delay

max_attempts

max_run_time

destroy_failed_jobs

Persistence

Saved to a data store

By default, uses ActiveRecord, other frameworks available

Stores the job using YAML

!! Model objects may have been changed between enqueuing and processing !!
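A generic illustration of that hazard using only Ruby's YAML library. It does not reproduce how delayed_job serializes ActiveRecord objects (that depends on the version); it only contrasts storing a snapshot of a record with storing its id and re-reading at processing time. The `USERS` hash is a hypothetical stand-in for a database table.

```ruby
require 'yaml'

# Hypothetical in-memory "table" standing in for a database.
USERS = { 1 => { 'name' => 'rob', 'email' => 'old@example.com' } }

# Enqueue time: option A snapshots the record, option B stores only the id.
snapshot_payload = YAML.dump(USERS[1])
id_payload       = YAML.dump({ 'user_id' => 1 })

# The record changes while the job waits in the queue.
USERS[1]['email'] = 'new@example.com'

# Processing time: the snapshot is stale, the id lookup is current.
stale_email = YAML.load(snapshot_payload)['email']
fresh_email = USERS[YAML.load(id_payload)['user_id']]['email']
```

Passing ids and re-fetching inside the job is one common way to sidestep the problem, at the cost of a lookup that can fail if the record has been deleted.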

Processing The Jobs

Start the script/delayed_job

Can pass in # of workers and priority

Uses daemons library

Workers lock jobs by editing the object in the database

Errors are stored alongside the job

Operating Delayed::Job

Need same code that you used to enqueue to process

One solitary queue with priorities, cannot create individual queues

Have had problems with workers processing long jobs not dying

Big queues can lead to database contention

What's To Like

Simple installation

Minimally invasive to object model

Any object can be queued

Numeric priorities

Maybe Not Right For You If...

Want multiple queues

Want out of the box GUI

Big queues

Lots of workers

Want to process outside your code base

Resque

Brought to you by GitHub

Uses Redis to persist queues

Rake task for starting a worker

Sinatra app for monitoring queues, jobs, and workers.

Installation

Install Redis

Gemfile

gem 'resque'

require 'resque'

Configuration

config/initializers/resque.rb

config/resque.yml

Enqueue

Class that has class methods

perform Method to do the work

queue The name of the queue to use (symbol/string)

Enqueue with Resque.enqueue(class, *args)

More on Methods

Arguments to enqueue are serialized to JSON

Pro tip... only send in objects that can be encoded/decoded to JSON

queue can be class attribute or class method

perform takes in arguments sent to enqueue
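The JSON point is easiest to see with a round trip. The sketch below uses only Ruby's json library to approximate what happens to arguments between enqueue and perform; `ResizeImage`, the `:images` queue name, and the arguments are hypothetical.

```ruby
require 'json'

# Hypothetical job class following the shape described above.
class ResizeImage
  @queue = :images   # queue name as a class-level instance variable

  def self.perform(image_id, options)
    "resize image #{image_id} to #{options['width']}px"
  end
end

# What you pass to enqueue...
args = [42, { :width => 200 }]

# ...and roughly what the worker receives after a JSON round trip:
# symbol keys come back as strings.
received = JSON.parse(JSON.generate(args))

message = ResizeImage.perform(*received)
```

This is why `perform` reads `options['width']` rather than `options[:width]`: the symbol key does not survive the trip.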

Resque Admin

Sinatra application

Standalone or mounted in your application

Good snapshot of what is happening

About Redis

Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.

In-memory data set

Can persist to disk

Easy master-slave replication

Resque Uses Redis For...

Storing list of queues in a set

Creating a list of jobs for each queue

Tracking workers

Maintaining stats

Enqueuing a User Follow

Add queue and perform methods to the User class

Enqueue the job

Resque.enqueue User, 2, 1
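A sketch of what adding `queue` and `perform` to `User` might look like. This `User` is a plain Ruby stand-in for the real model, the `:follows` queue name is hypothetical, and the argument order (follower id first, then followed id) is an assumption, since the slide does not say which is which. The `defined?` guard only lets the snippet run without Resque and Redis available.

```ruby
# Sketch only: a plain Ruby stand-in for an ActiveRecord User model.
class User
  REGISTRY = {}
  attr_reader :id, :following_ids

  def initialize(id)
    @id = id
    @following_ids = []
    REGISTRY[id] = self
  end

  def self.find(id)
    REGISTRY.fetch(id)
  end

  def follow!(other)
    @following_ids << other.id
  end

  # Queue name for this job class (hypothetical name).
  def self.queue
    :follows
  end

  # Receives ids, not objects, so the arguments survive JSON.
  # Assumed argument order: (follower_id, followed_id).
  def self.perform(follower_id, followed_id)
    find(follower_id).follow!(find(followed_id))
  end
end

User.new(1)
User.new(2)

if defined?(Resque)
  Resque.enqueue(User, 2, 1)
else
  User.perform(2, 1)   # what a worker would call after popping the job
end
```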

Process

echo "require 'resque/tasks'" > lib/tasks/resque.rake

QUEUE=* rake environment resque:work

Forks off child to do actual work

Makes it easier to stop/start

Looks to see if you are using Ruby EE

What's To Like

Fast on/off queue

Large queues

Multiple queues

Nice admin tool

Workers resistant to job quirks

Maybe Not Right For You If...

Want priorities

Do not want Redis

Do not want queues in RAM

Do not want JSON serialization

Want to manually query the job queue

Amazon SQS

What is SQS

Web-based Message Queue

Language agnostic

On Amazon's Cloud

High Availability

Scalable

Queue level security

Using SQS

Requires AWS Account

Currently 100,000 messages / month free

$.01 / 10,000 messages after that

All interactions over SOAP queries

Using RightAws gem to abstract this away

How to Queue Up Jobs

sqs = RightAws::SqsGen2.new( ENV['ACCESS_KEY'], ENV['SECRET_KEY'])

q = sqs.queue('my_queue_name')

q.push('my data')

Processing

sqs = RightAws::SqsGen2.new( ENV['ACCESS_KEY'], ENV['SECRET_KEY'])

q = sqs.queue('my_queue_name')

loop { process(q.pop) }

What's To Like

Easy to process out of your application code

E.g. processing needs to happen on Windows

Easy to process out of your data center

Easy to add/remove queues

No additional services to manage

Maybe Not Right For You If...

You do not want to figure out the serialization / de-serialization

You do not want to handle worker process management

You want to minimize processing time / job.
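To give a feel for the first two points, here is a sketch of the serialization and worker loop you would have to write yourself. It runs against a hypothetical in-memory `FakeQueue` that exposes the same `push` and `pop` calls used earlier, so no AWS account is needed; the JSON envelope format and all method names are assumptions, not part of RightAws.

```ruby
require 'json'

# Hypothetical in-memory stand-in with the same push/pop calls used above.
class FakeQueue
  def initialize
    @messages = []
  end

  def push(body)
    @messages << body
  end

  def pop
    @messages.shift   # nil when the queue is empty
  end
end

# Serialization is up to you: here, a JSON envelope with a job type and args.
def encode_job(type, args)
  JSON.generate({ 'type' => type, 'args' => args })
end

# to_s is used on the assumption that a message renders as its body.
def decode_job(message)
  job = JSON.parse(message.to_s)
  [job.fetch('type'), job.fetch('args')]
end

# Works off everything currently queued, then returns.
def work_off(queue)
  handled = []
  while (message = queue.pop)
    type, args = decode_job(message)
    handled << "#{type}(#{args.join(', ')})"
  end
  handled
end

q = FakeQueue.new
q.push(encode_job('follow', [2, 1]))
q.push(encode_job('resize', [42]))
handled = work_off(q)
```

A long-running worker would wrap `work_off` in a loop with a sleep between polls, and something outside the script would need to start, stop, and restart it.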

AMQP Messaging - The Big Gun

AMQP

Advanced Message Queuing Protocol

Protocol defining how messaging clients and brokers will interact

Open, cross-vendor specification

ruby-amqp

Requires EventMachine

Connects to an AMQP-compliant server

RabbitMQ

Apache Qpid

Will need to specify host, channel, queue, and message

Look Into AMQP-Based System If

Needs extend beyond just simple queue

Publish/subscribe

Transactional messaging

High volume / fast processing

Cross-systems

Other Options

Beanstalkd

Standalone server

Starling

From Twitter, speaks memcached

SimpleWorker

SaaS job processing http://www.simpleworker.com/

CloudCrowd

Map-reduce framework

Key Links

Source Code

https://github.com/robdimarco/deferred_processing_talk

Delayed::Job

https://github.com/collectiveidea/delayed_job

Resque

https://github.com/defunkt/resque

https://github.com/blog/542-introducing-resque

More Links

SQS

http://aws.amazon.com/sqs/

https://github.com/rightscale/right_aws

AMQP

http://rdoc.info/github/ruby-amqp/amqp/master/file/docs/GettingStarted.textile

http://www.amqp.org/

More

http://ruby-toolbox.com/categories/queueing.html

http://kr.github.com/beanstalkd/