Ruby on Redis
DESCRIPTION
Making an application horizontally scalable in 30 minutes. This presentation describes how a linear processing application (mail merge) can be converted into a horizontally scalable one using Redis, and gives some context on why a multi-process approach is preferable to a multi-threaded approach.
TRANSCRIPT
![Page 1: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/1.jpg)
Ruby on Redis Pascal Weemaels Koen Handekyn Oct 2013
![Page 2: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/2.jpg)
Target
Create a Zip file of PDFs based on a CSV data file
‣ Linear version
‣ Making it scale with Redis
[Diagram: parse csv, then create pdf (repeated), then zip]
![Page 3: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/3.jpg)
Step 1: linear
‣ Parse CSV
• std lib: require 'csv'
• docs = CSV.read("#{DATA}.csv")
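As a rough illustration of what `CSV.read` hands back, the sketch below parses an inline string with `CSV.parse` instead of reading a file. The sample rows and the column order (invoice number, name, street, zip, city) are assumptions chosen to mirror the `create_pdf` signature used later in the deck.

```ruby
require 'csv'

# Hypothetical sample data; the deck reads "#{DATA}.csv" from disk instead.
SAMPLE_CSV = <<~DATA
  1001,Pascal,Main Street 1,9820,Merelbeke
  1002,Koen,Station Road 2,9000,Gent
DATA

# CSV.parse, like CSV.read, returns an array of arrays of strings.
docs = CSV.parse(SAMPLE_CSV)

docs.each do |invoice_nr, name, street, zip, city|
  puts "#{invoice_nr}: #{name}, #{street}, #{zip} #{city}"
end
```

Every field arrives as a String, including the invoice number and zip code, so no numeric conversion happens unless it is requested explicitly.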
![Page 4: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/4.jpg)
Simple Templating with String Interpolation
‣ Merge data into HTML
• template = File.new('invoice.html').read
• html = eval("<<QQQ\n#{template}\nQQQ")
invoice.html:
<<Q
<div class="title">
INVOICE #{invoice_nr}
</div>
<div class="address">
#{name}<br/>
#{street}<br/>
#{zip} #{city}<br/>
</div>
Q
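A minimal sketch of the heredoc-plus-`eval` trick, assuming the template is trusted: `eval` runs any Ruby embedded in the template, so this is not something to use on user-supplied input. The method name `merge_invoice` and the inline `TEMPLATE` are illustrative only; the deck reads the template from `invoice.html`.

```ruby
# Merge values into a template by evaluating it as the body of a heredoc.
# The variables named in the template must exist in the scope where eval runs;
# here they are the method parameters.
def merge_invoice(template, invoice_nr, name, street, zip, city)
  eval("<<QQQ\n#{template}\nQQQ")
end

# A single-quoted heredoc keeps the #{...} placeholders literal until eval.
TEMPLATE = <<~'HTML'
  <div class="title">INVOICE #{invoice_nr}</div>
  <div class="address">#{name}<br/>#{street}<br/>#{zip} #{city}</div>
HTML

html = merge_invoice(TEMPLATE, 1001, 'Pascal', 'Main Street 1', '9820', 'Merelbeke')
puts html
```

The appeal is that the template needs no templating library at all; the cost is that template and code share one namespace and one trust level.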
![Page 5: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/5.jpg)
Step 1: linear
‣ Create PDF
• Prince XML using the princely gem
• http://www.princexml.com
• p = Princely.new
  p.add_style_sheets('invoice.css')
  p.pdf_from_string(html)
![Page 6: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/6.jpg)
Step 1: linear
‣ Create ZIP
• Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files.each do |file, content|
      zos.put_next_entry(file)
      zos.puts content
    end
  end
![Page 7: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/7.jpg)
Full Code
require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all PDFs from the hash
create_zip files_h
![Page 8: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/8.jpg)
DEMO
![Page 9: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/9.jpg)
Step 2: from linear ...
[Diagram: parse csv, then create pdf (repeated, one after another), then zip]
![Page 10: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/10.jpg)
Step 2: ...to parallel
[Diagram: parse csv, then several create pdf steps side by side, then zip]
Threads?
![Page 11: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/11.jpg)
Multi Threaded
‣ Advantage
• Lightweight (minimal overhead)
‣ Challenges (or why it is hard)
• Hard to code: most data structures are not thread-safe by default; they need synchronized access
• Hard to test: different execution paths and timings
• Hard to maintain
‣ Limitation
• Single machine: not a solution for horizontal scalability beyond the multi-core CPU
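To make the "synchronized access" point concrete, here is a small sketch in which several threads write into one shared Hash and a Mutex serialises the writes. The worker count, key names and the `upcase` stand-in for real work are all invented for the example. On MRI the global interpreter lock often hides such races in trivial code, so treat the Mutex as the portable habit rather than proof that the unsynchronised version would fail.

```ruby
# Several threads write into one shared Hash; a Mutex serialises the writes.
results = {}
lock = Mutex.new

threads = (1..4).map do |worker_id|
  Thread.new do
    25.times do |i|
      key = "doc-#{worker_id}-#{i}"
      value = key.upcase # stand-in for real work such as PDF creation
      # Only one thread at a time may touch the shared Hash.
      lock.synchronize { results[key] = value }
    end
  end
end

# Wait for every worker thread to finish before reading the results.
threads.each(&:join)
puts results.size
```

Every access to `results` has to remember the lock, which is exactly the kind of discipline the slide calls hard to code and hard to maintain.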
![Page 12: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/12.jpg)
Step 2: ...to parallel
[Diagram: parse csv, then several create pdf steps side by side, then zip]
?
![Page 13: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/13.jpg)
Multi Process
• Scales across machines
• Advanced support for debugging and monitoring at the OS level
• Simpler (code, testing, debugging, ...)
• Slightly more overhead
BUT
![Page 14: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/14.jpg)
But
[Diagram: parse csv, several create pdf steps and zip, connected through shared state]
All this assumes “shared state across processes”.
Options for shared state: SQL? Memcached, file system, Terracotta
… OR …
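Why processes need an external place for shared state can be shown in a few lines. In this sketch (Unix-only, since it relies on `fork`) a child process increments a variable and reports it through a pipe; the parent's copy stays untouched. The variable names are invented for the example.

```ruby
counter = 0

# A pipe is one OS-level way to pass a value from child to parent.
reader, writer = IO.pipe

pid = fork do
  reader.close
  counter += 1          # changes the child's private copy only
  writer.puts counter
  writer.close
  exit!(0)              # leave at once, skipping at_exit handlers inherited from the parent
end

writer.close
child_value = reader.gets.to_i
reader.close
Process.wait(pid)

puts "child saw #{child_value}, parent still has #{counter}"
```

Pipes work for a parent and its children on one machine; a store that any process on any host can reach is what the rest of the deck builds on.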
![Page 15: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/15.jpg)
Hello Redis
‣ Shared Memory Key Value Store with High Level Data Structure support
• String (String, Int, Float)
• Hash (Map, Dictionary)
• List (Queue)
• Set
• ZSet (sorted set: ordered by score, ties ordered by member)
![Page 16: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/16.jpg)
About Redis
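• Single threaded: 1 thread to serve them all
• Everything in memory (so the data has to fit)
• “Transactions” (MULTI/EXEC)
• Expiring keys
• Lua scripting
• Publisher-Subscriber
• Auto create and destroy (a key appears on first write and disappears when its list, set or hash becomes empty)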
• Pipelining
• But … full clustering (master-master) is not available (yet)
![Page 17: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/17.jpg)
Hello Redis
‣ redis-cli
• set name "pascal" => OK
• incr counter => 1
• incr counter => 2
• hset pascal name "pascal"
• hset pascal address "merelbeke"
• sadd persons pascal
• smembers persons => [pascal]
• keys *
• type pascal => hash
• lpush todo "read" => 1
• lpush todo "eat" => 2
• lpop todo => "eat"
• rpoplpush todo done => "read"
• lrange done 0 -1 => "read"
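The list commands in the session are easy to misread, so here is a local model of them using plain Ruby arrays, with index 0 standing for the head (left end) of a Redis list. No Redis client is involved; this is only a way to check the expected replies by hand.

```ruby
# Model Redis lists with plain Ruby arrays (index 0 is the head / left end).
todo = []
done = []

todo.unshift('read')   # lpush todo "read" -> list length 1
todo.unshift('eat')    # lpush todo "eat"  -> list length 2
first = todo.shift     # lpop todo         -> takes the head
moved = todo.pop       # rpoplpush todo done: pop the tail of todo ...
done.unshift(moved)    # ... and push it onto the head of done

puts first
puts moved
puts done.inspect
```

Because `lpush` adds at the head, `lpop` returns the most recently pushed item, while `rpoplpush` takes the oldest one from the other end.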
![Page 18: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/18.jpg)
Let Redis Distribute
[Diagram: parse csv, create pdf ... and zip, each marked as a separate process]
![Page 19: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/19.jpg)
Spread the Work
[Diagram, step 1: parse csv fills a queue with data and a counter; create pdf ... and zip run as separate processes]
![Page 20: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/20.jpg)
Ruby on Redis
‣ Put PDF creation input data on a queue and do the counter bookkeeping
docs.each do |doc|
  data = YAML::dump(doc)
  r.lpush 'pdf:queue', data
  r.incr "ctr" # bookkeeping
end
![Page 21: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/21.jpg)
Create PDFs
[Diagram, steps 1 and 2: parse csv fills the queue with data and the counter; the create pdf processes write into a hash with PDFs; zip is a separate process]
![Page 22: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/22.jpg)
Ruby on Redis
‣ Read PDF input data from the queue, do the counter bookkeeping, put each created PDF in a Redis hash, and signal when ready
while (true)
  _, msg = r.brpop 'pdf:queue'
  doc = YAML::load(msg)
  # name of hash, key = docname, value = pdf
  r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
  ctr = r.decr 'ctr'
  r.rpush "ready", "done" if ctr == 0
end
![Page 23: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/23.jpg)
Zip When Done
[Diagram, step 3: when ready is signalled, zip reads the hash with PDFs]
![Page 24: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/24.jpg)
Ruby on Redis
‣ Wait for the ready signal, fetch all PDFs, and zip them
r.brpop "ready" # wait for signal
pdfs = r.hgetall 'pdf:pdfs' # fetch hash
create_zip pdfs # zip it
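The three steps (queue the data, create the PDFs, zip when done) can be traced end to end in one process. The sketch below is an approximation under loud assumptions: `MemoryStore` is a hypothetical in-memory stand-in for the handful of Redis commands involved, so it runs without a server and without the redis gem; `fake_pdf` replaces `create_pdf`; and the non-blocking `rpop` replaces `brpop`. It also sets the counter before queueing. With the interleaved `lpush`/`incr` shown on the slide, a fast worker could in principle bring the counter to 0 before the last document is queued; that is an observation about the code as shown, not something the deck discusses.

```ruby
require 'yaml'

# Hypothetical in-memory stand-in for the few Redis commands used here.
# It is NOT the redis gem's client; it only mimics the calls in this sketch.
class MemoryStore
  def initialize
    @lists = Hash.new { |h, k| h[k] = [] }
    @hashes = Hash.new { |h, k| h[k] = {} }
    @counters = Hash.new(0)
  end

  def lpush(key, value)
    @lists[key].unshift(value).size
  end

  def rpush(key, value)
    @lists[key].push(value).size
  end

  # Non-blocking pop from the tail; real BRPOP blocks and returns [key, value].
  def rpop(key)
    @lists[key].pop
  end

  def incrby(key, amount)
    @counters[key] += amount
  end

  def decr(key)
    @counters[key] -= 1
  end

  def hset(key, field, value)
    @hashes[key][field] = value
  end

  def hgetall(key)
    @hashes[key].dup
  end
end

# Placeholder for create_pdf from the slides; returns a string, not a PDF.
def fake_pdf(invoice_nr, name, *_rest)
  "PDF for invoice #{invoice_nr} (#{name})"
end

r = MemoryStore.new
docs = [%w[1001 Pascal], %w[1002 Koen]]

# Main: set the counter up front, then queue every document.
r.incrby('ctr', docs.size)
docs.each { |doc| r.lpush('pdf:queue', YAML.dump(doc)) }

# Worker: drain the queue, store each result, signal when the counter hits 0.
while (msg = r.rpop('pdf:queue'))
  doc = YAML.load(msg)
  r.hset('pdf:pdfs', doc[0], fake_pdf(*doc))
  r.rpush('ready', 'done') if r.decr('ctr').zero?
end

# Main again: pick up the ready signal and collect the results.
signal = r.rpop('ready')
pdfs = r.hgetall('pdf:pdfs')
puts signal
puts pdfs.keys.inspect
```

With a real server the worker loop would sit in its own process and block on `brpop`, and `r` would be a `Redis.new` client; the bookkeeping itself stays the same.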
![Page 25: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/25.jpg)
More Parallelism
[Diagram: several parse csv / zip jobs share one queue with data; each job has its own counter, its own hash with PDFs and its own ready signal]
![Page 26: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/26.jpg)
Ruby on Redis
‣ Put PDF creation input data on a queue and do the counter bookkeeping
# unique id for this input file
UUID = SecureRandom.uuid
docs.each do |doc|
  data = YAML::dump([UUID, doc])
  r.lpush 'pdf:queue', data
  r.incr "ctr:#{UUID}" # bookkeeping
end
![Page 27: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/27.jpg)
Ruby on Redis
‣ Read PDF input data from the queue, do the counter bookkeeping, and put each created PDF in a Redis hash
while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", "done" if ctr == 0
end
![Page 28: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/28.jpg)
Ruby on Redis
‣ Wait for the ready signal, fetch all PDFs, and zip them
r.brpop "ready:#{UUID}" # wait for signal
pdfs = r.hgetall(UUID) # fetch this job's hash (the worker stores PDFs under the uuid)
create_zip(pdfs) # zip it
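The per-job naming scheme is what lets several input files share one queue. As a small sketch, the hypothetical helper `job_keys` below (not part of the deck) collects the three key names derived from one job id, following the `ctr:<uuid>`, `ready:<uuid>` and bare-uuid hash names used in the slides.

```ruby
require 'securerandom'

# Build the per-job key names used in the slides from one job id.
# job_keys is an illustrative helper, not something the deck defines.
def job_keys(uuid)
  {
    counter: "ctr:#{uuid}",   # atomic counter of outstanding documents
    ready:   "ready:#{uuid}", # list used as the completion signal
    results: uuid             # hash of generated PDFs, stored under the bare uuid
  }
end

job_a = job_keys(SecureRandom.uuid)
job_b = job_keys(SecureRandom.uuid)

puts job_a[:counter]
puts job_b[:counter]
```

Only the queue name `pdf:queue` is shared; every other key carries the job id, so two jobs running at once never touch each other's counter, results or ready signal.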
![Page 29: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/29.jpg)
Full Code
Key functions (create_pdf and create_zip) remain unchanged. The distribution code was highlighted on the original slide.
LINEAR
(identical to the Full Code listing of Step 1)
MAIN
require 'csv'
require 'zip/zip'
require 'redis'
require 'yaml'
require 'securerandom'

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
UUID = SecureRandom.uuid

r = Redis.new
my_counter = "ctr:#{UUID}"

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

docs.each do |doc| # distribute
  r.lpush 'pdf:queue', YAML::dump([UUID, doc])
  r.incr my_counter
end

r.brpop "ready:#{UUID}" # collect

create_zip(r.hgetall(UUID))

# clean up
r.del my_counter
r.del UUID
puts "All done!"
WORKER
require 'redis'
require 'princely'
require 'yaml'

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

r = Redis.new
while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", "done" if ctr == 0
end
![Page 30: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/30.jpg)
DEMO 2
![Page 31: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/31.jpg)
Multi Language Participants
[Diagram: parse csv, create pdf ... and zip around a shared queue with data, with a counter and a hash with PDFs per job]
![Page 32: Ruby on Redis](https://reader033.vdocuments.site/reader033/viewer/2022052411/55643effd8b42adb258b5505/html5/thumbnails/32.jpg)
Conclusions
Going from linear to multi-process distributed is easy with Redis shared-memory high-level data structures:
• Atomic counter for bookkeeping
• Queue for work distribution
• Queue as signal
• Hash for result sets