garbage collection and the ruby heap

Garbage Collection and the Ruby Heap. Joe Damato, @joedamato, timetobleed.com, ice799 on github/irc. Tuesday, June 8, 2010

Page 1: Garbage Collection and the Ruby Heap

Garbage Collection and the Ruby Heap

Joe Damato
@joedamato
timetobleed.com
ice799 on github/irc

Tuesday, June 8, 2010

Page 2: Garbage Collection and the Ruby Heap

Ruby developers know...


Page 3: Garbage Collection and the Ruby Heap

Ruby is

[photo: fatboyke (flickr)]

Page 4: Garbage Collection and the Ruby Heap

Ruby loves eating RAM

[photo: 37prime (flickr)]

Page 5: Garbage Collection and the Ruby Heap

this talk is about what ruby does with your RAM

let’s take a look inside the VM

Page 6: Garbage Collection and the Ruby Heap

ruby allocates memory from the OS

memory is broken up into slots

each slot holds one ruby object


Page 7: Garbage Collection and the Ruby Heap

a linked list called the ‘freelist’ points to all the empty slots on the ruby heap

when you need an object, it’s pulled off the freelist

Page 10: Garbage Collection and the Ruby Heap

a linked list called the ‘freelist’ points to all the empty slots on the ruby heap

when you need an object, it’s pulled off the freelist

if the freelist is empty, GC is run

GC finds non-reachable objects and adds them to the freelist

Page 12: Garbage Collection and the Ruby Heap

a linked list called the ‘freelist’ points to all the empty slots on the ruby heap

when you need an object, it’s pulled off the freelist

if the freelist is empty, GC is run

GC finds non-reachable objects and adds them to the freelist

if the freelist is still empty (all slots were in use)

Page 13: Garbage Collection and the Ruby Heap

a linked list called the ‘freelist’ points to all the empty slots on the ruby heap

when you need an object, it’s pulled off the freelist

if the freelist is empty, GC is run

GC finds non-reachable objects and adds them to the freelist

if the freelist is still empty (all slots were in use)

another heap is allocated

all the slots on the new heap are added to the freelist
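The allocation path on this slide can be sketched as a toy model in plain Ruby. This is not MRI's code: ToyHeap, SLOTS_PER_HEAP and the explicit roots argument are made up for illustration, and "reachable" is simplified to "listed in roots".

```ruby
# Toy model of the freelist allocation path. NOT MRI's implementation:
# ToyHeap, SLOTS_PER_HEAP and the roots argument are hypothetical.
class ToyHeap
  SLOTS_PER_HEAP = 4 # tiny on purpose; MRI 1.8 starts with 10k slots

  attr_reader :heaps, :gc_runs

  def initialize
    @heaps = []     # each heap is an array of slots
    @freelist = []  # [heap_index, slot_index] pairs for empty slots
    @gc_runs = 0
    add_heap
  end

  # Stores obj in a free slot and returns its [heap, slot] position.
  def allocate(obj, roots)
    if @freelist.empty?
      garbage_collect(roots)        # freelist empty: run GC first
      add_heap if @freelist.empty?  # still empty: all slots were in use
    end
    h, s = @freelist.shift
    @heaps[h][s] = obj
    [h, s]
  end

  def free_slots
    @freelist.size
  end

  private

  def add_heap
    @heaps << Array.new(SLOTS_PER_HEAP)
    h = @heaps.size - 1
    SLOTS_PER_HEAP.times { |s| @freelist << [h, s] }
  end

  # Simplification: an object is reachable only if it is listed in roots.
  def garbage_collect(roots)
    @gc_runs += 1
    @heaps.each_with_index do |heap, h|
      heap.each_with_index do |obj, s|
        next if obj.nil? || roots.any? { |r| r.equal?(obj) }
        heap[s] = nil
        @freelist << [h, s]
      end
    end
  end
end

heap = ToyHeap.new
live = []
4.times { |i| live << "obj#{i}"; heap.allocate(live.last, live) }

# All 4 slots are full and every object is still referenced, so the next
# allocation runs GC, frees nothing, and falls back to adding a heap.
live << "obj4"
heap.allocate(live.last, live)
puts "heaps: #{heap.heaps.size}, GC runs: #{heap.gc_runs}, free slots: #{heap.free_slots}"
```

Emptying live before the fifth allocation would let the GC pass refill the freelist instead, and no second heap would be needed.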

Page 14: Garbage Collection and the Ruby Heap

turns out, Ruby’s GC is also one of the reasons it can be so slow

[photo: antphotos (flickr)]

Page 15: Garbage Collection and the Ruby Heap

Matz’ Ruby Interpreter (MRI 1.8) has a...

[photo: john_lam (flickr)]

Page 16: Garbage Collection and the Ruby Heap

Conservative

[photo: lifeisaprayer (flickr)]

Page 17: Garbage Collection and the Ruby Heap

Stop the World

[photo: benimoto (flickr)]

Page 18: Garbage Collection and the Ruby Heap

Mark and Sweep

[photo: michaelgoodin (flickr)]

Page 19: Garbage Collection and the Ruby Heap

Garbage Collector

[photo: kiksbalayon (flickr)]

Page 20: Garbage Collection and the Ruby Heap

• conservative: the VM hands out raw pointers to ruby objects, so the GC must treat anything that looks like a pointer as a live reference

• stop the world: no ruby code can execute during GC

• mark and sweep: mark all objects in use, sweep away unmarked objects
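A minimal sketch of the mark and sweep idea, using an explicit toy object graph rather than the real heap. ToyObject, mark and sweep are illustrative names, not MRI functions, and the conservative stack scan is left out entirely.

```ruby
# Toy mark-and-sweep over an explicit object graph. ToyObject, mark and
# sweep are hypothetical names for illustration only.
class ToyObject
  attr_reader :name, :refs
  attr_accessor :marked

  def initialize(name)
    @name = name
    @refs = []
    @marked = false
  end
end

# Mark phase: follow references outward from the roots.
# An explicit stack avoids deep recursion on long chains.
def mark(roots)
  stack = roots.dup
  until stack.empty?
    obj = stack.pop
    next if obj.marked
    obj.marked = true
    stack.concat(obj.refs)
  end
end

# Sweep phase: unmarked objects are garbage; survivors get their mark cleared
# so the next collection starts fresh.
def sweep(heap)
  live, garbage = heap.partition(&:marked)
  live.each { |o| o.marked = false }
  [live, garbage]
end

a, b, c, d = %w[a b c d].map { |n| ToyObject.new(n) }
a.refs << b   # a -> b
b.refs << c   # b -> c
d.refs << a   # d points into the live graph, but nothing points at d

heap = [a, b, c, d]
mark([a])     # a is the only root
live, garbage = sweep(heap)
puts "live: #{live.map(&:name).join(', ')}"
puts "garbage: #{garbage.map(&:name).join(', ')}"
```

Because the mark phase follows references transitively, one reference to a keeps b and c alive as well; d is swept even though it holds a reference itself.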

Page 21: Garbage Collection and the Ruby Heap

more objects = longer GC

Page 22: Garbage Collection and the Ruby Heap

longer GC = less time to run your ruby code

Page 23: Garbage Collection and the Ruby Heap

fewer objects = better performance

Page 24: Garbage Collection and the Ruby Heap

improve performance

1. remove unnecessary object allocations

   object allocations are not free

2. avoid leaked references

   not really memory ‘leaks’

   you’re holding a reference to an object you no longer need. GC sees the reference, so it keeps the object around
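One way to see point 1 on a modern interpreter is to count allocations around a block. The sketch below assumes Ruby 2.2 or later, where GC.stat(:total_allocated_objects) is available; it does not exist on the 1.8 interpreters this talk targets, and the allocations helper is a made-up name.

```ruby
# Count objects allocated while a block runs.
# Assumes Ruby 2.2+ (GC.stat(:total_allocated_objects)); `allocations` is a
# hypothetical helper, not a standard method.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

words = Array.new(1_000) { |i| "word#{i}" }

# += builds a brand new String on every iteration
plus = allocations do
  s = ""
  words.each { |w| s += w }
end

# << appends to the same String in place
append = allocations do
  s = String.new
  words.each { |w| s << w }
end

puts "+= allocated #{plus} objects, << allocated #{append}"
```

Exact counts vary by Ruby version, but the in-place version should allocate far fewer objects, which means fewer slots for the GC to mark and sweep.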

Page 25: Garbage Collection and the Ruby Heap

the GC follows references recursively, so a reference to classA will ‘leak’ all these objects

Page 26: Garbage Collection and the Ruby Heap

useful tools

• ltrace

• GC tuning

• ObjectSpace.each_object

• gdb.rb

• bleak_house

• heap dumping patches

• memprof

Page 27: Garbage Collection and the Ruby Heap

ltrace

• can use system ltrace

• mine is cooler

• http://github.com/ice799/ltrace/tree/libdl

• can trace GC, mysql queries, and more.

• linux only


Page 28: Garbage Collection and the Ruby Heap

ltrace

    ltrace -F ltrace.conf -ttTg -x garbage_collect ruby gc.rb

    15:39:22.637185 garbage_collect() = <void> <0.002420>
    15:39:22.650797 garbage_collect() = <void> <0.005480>
    15:39:22.677607 garbage_collect() = <void> <0.012134>
    15:39:22.729645 garbage_collect() = <void> <0.024849>
    15:39:22.828402 garbage_collect() = <void> <0.048067>
    15:39:23.007304 garbage_collect() = <void> <0.089344>
    15:39:23.339801 garbage_collect() = <void> <0.163595>
    15:39:23.929944 garbage_collect() = <void> <0.297686>

Page 30: Garbage Collection and the Ruby Heap

GC tuning

Ruby Enterprise Edition contains a GC tuning patch

We use:

    RUBY_GC_MALLOC_LIMIT=60000000
    RUBY_HEAP_MIN_SLOTS=500000
    RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
    RUBY_HEAP_SLOTS_INCREMENT=1

Page 31: Garbage Collection and the Ruby Heap

malloc_limit = 60MB

force garbage collection after every malloc_limit bytes worth of calls to malloc or realloc

defaults to 8MB

high traffic ruby servers can easily allocate and free more than 8MB in a single request

gc.c’s ruby_xmalloc wrapper is used by internal objects such as String, Array and Hash

    void *
    ruby_xmalloc(size)
        long size;
    {
        void *mem;

        if (malloced > malloc_limit)
            garbage_collect();

        mem = malloc(size);
        malloced += size;

        return mem;
    }
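To get a feel for what raising the limit buys, the check in ruby_xmalloc can be replayed over a list of allocation sizes. This is a rough sketch: forced_gcs is a made-up helper, the request profile is hypothetical, and resetting the counter after each collection is an assumption that the excerpt above does not show.

```ruby
# Replay the `malloced > malloc_limit` check over a run of allocations and
# count how many collections it would force.
# Assumption: the counter resets to zero after each GC (not shown in the
# ruby_xmalloc excerpt). forced_gcs is a hypothetical helper.
def forced_gcs(allocation_sizes, malloc_limit)
  malloced = 0
  gcs = 0
  allocation_sizes.each do |size|
    if malloced > malloc_limit
      gcs += 1
      malloced = 0
    end
    malloced += size
  end
  gcs
end

# Hypothetical request: 1,250 allocations of 100,000 bytes each (125MB total)
request = Array.new(1_250) { 100_000 }

default_gcs = forced_gcs(request, 8_000_000)   # default limit
tuned_gcs   = forced_gcs(request, 60_000_000)  # tuned limit from the slide

puts "default limit: #{default_gcs} forced GCs"
puts "tuned limit:   #{tuned_gcs} forced GCs"
```

Each forced collection is a full stop-the-world pass, so cutting their number matters more as the heap grows.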

Page 32: Garbage Collection and the Ruby Heap

HEAP_MIN_SLOTS = 500k

defaults to 10k

number of slots in the first slab

a new rails app boots up with almost 500k objects on the heap (mostly code)

    (gdb) ruby objects nodes
        20996 NODE_CONST
        21620 NODE_SCOPE
        26329 NODE_LASGN
        26747 NODE_STR
        33178 NODE_METHOD
        40678 NODE_LIT
        79046 NODE_LVAR
        90646 NODE_NEWLINE
        95758 NODE_BLOCK
       107357 NODE_CALL
       150298 NODE_ARRAY

Page 33: Garbage Collection and the Ruby Heap

HEAP_SLOTS_GROWTH = 1

defaults to 1.8x

each new slab is almost twice as big as the last

normal growth:

10k

10k + 18k = 28k

10k + 18k + 32.4k = 60.4k

tuned growth:

500k

500k + 500k = 1M
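The growth arithmetic can be checked with a few lines of Ruby. slab_totals is an illustrative helper, not part of any library; its arguments mirror the HEAP_MIN_SLOTS and growth factor settings. Applying the 1.8 factor exactly gives a third slab of 32,400 slots.

```ruby
# Cumulative slot counts as slabs are added, each one `factor` times the size
# of the previous slab. slab_totals is a hypothetical helper.
def slab_totals(first_slab, factor, slabs)
  size = first_slab.to_f
  total = 0.0
  Array.new(slabs) do
    total += size
    size *= factor
    total.round
  end
end

default_growth = slab_totals(10_000, 1.8, 3)
tuned_growth   = slab_totals(500_000, 1, 2)

puts "default: #{default_growth.inspect}"
puts "tuned:   #{tuned_growth.inspect}"
```

With a factor of 1 every new slab is the same size as the first, so the heap grows in predictable 500k steps instead of ever larger jumps.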


Page 35: Garbage Collection and the Ruby Heap

    require 'pp'

    # count live objects by class
    types = Hash.new(0)
    ObjectSpace.each_object do |obj|
      types[obj.class] += 1
    end
    pp types.sort_by { |klass, num| num }

    [ ..., [Module, 18], [Class, 158], [String, 1725] ]

* on Ruby 1.9, use ObjectSpace.count_objects
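For Ruby 1.9 and later, a short sketch of ObjectSpace.count_objects. It returns a Hash keyed by internal type (:T_STRING, :T_ARRAY and so on) plus :TOTAL and :FREE, so it reports slot types rather than Ruby classes.

```ruby
# ObjectSpace.count_objects (Ruby 1.9+) returns slot counts keyed by internal
# type, plus :TOTAL and :FREE.
counts = ObjectSpace.count_objects

puts "total slots: #{counts[:TOTAL]}, free: #{counts[:FREE]}"

# Five most common internal types, largest first
counts
  .reject { |key, _| [:TOTAL, :FREE].include?(key) }
  .sort_by { |_, n| -n }
  .first(5)
  .each { |type, n| puts format("%-12s %d", type, n) }
```

Unlike each_object, this does not yield every object to Ruby code, so it is cheap enough to call periodically.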

Page 37: Garbage Collection and the Ruby Heap

gdb.rb: gdb hooks for REE

• http://github.com/tmm1/gdb.rb

• attach to a running REE process and inspect the heap

  • number of nodes by type
  • number of objects by class
  • number of strings by content
  • number of arrays/hash by size

• uses gdb7 + python scripting

• linux only

    (gdb) ruby objects strings
          140 u 'lib'
          158 u '0'
          294 u '\n'
          619 u ''

        30503 unique strings
      3187435 bytes

    (gdb) ruby objects
      HEAPS            8
      SLOTS      1686252
      LIVE        893327 (52.98%)
      FREE        792925 (47.02%)

      scope         1641 (0.18%)
      regexp        2255 (0.25%)
      data          3539 (0.40%)
      class         3680 (0.41%)
      hash          6196 (0.69%)
      object        8785 (0.98%)
      array        13850 (1.55%)
      string      105350 (11.79%)
      node        742346 (83.10%)

Page 38: Garbage Collection and the Ruby Heap

fixing a leak in rails_warden

    (gdb) ruby objects classes
        1197 MIME::Type
        2657 NewRelic::MetricSpec
        2719 TZInfo::TimezoneTransitionInfo
        4124 Warden::Manager
        4124 MethodOverrideForAll
        4124 AccountMiddleware
        4124 Rack::Cookies
        4125 ActiveRecord::ConnectionManagement
        4125 ActionController::Session::CookieStore
        4125 ActionController::Failsafe
        4125 ActionController::ParamsParser
        4125 Rack::Lock
        4125 ActionController::Dispatcher
        4125 ActiveRecord::QueryCache

middleware chain leaking per request

Page 39: Garbage Collection and the Ruby Heap

god memory leaks

    (gdb) ruby objects arrays
      elements instances
         94310 3
         94311 3
         94314 2
         94316 1

          5369 arrays
       2863364 elements

arrays with 94k+ elements!

    (gdb) ruby objects classes
        43 God::Process
        43 God::Watch
        43 God::Driver
        43 God::DriverEventQueue
        43 God::Conditions::MemoryUsage
        43 God::Conditions::ProcessRunning
        43 God::Behaviors::CleanPidFile
        45 Process::Status
        86 God::Metric
       327 God::System::SlashProcPoller
       327 God::System::Process
       406 God::DriverEvent

Page 41: Garbage Collection and the Ruby Heap

bleak_house

• http://github.com/fauna/bleak_house

• installs a custom patched version of ruby

• tells you what is leaking (like gdb.rb and ObjectSpace), but also where the leak is happening

    191691 total objects
    Final heap size 191691 filled, 220961 free
    Displaying top 20 most common line/class pairs

    89513 __null__:__null__:__node__
    41438 __null__:__null__:String
     2348 site_ruby/1.8/rubygems/specification.rb:557:Array
     1508 gems/specifications/gettext-1.9.gemspec:14:String

Page 43: Garbage Collection and the Ruby Heap

100000 file.rb:123:String

useful, 100k strings on this line

but sometimes it’s not enough

what is actually inside these strings?

Page 44: Garbage Collection and the Ruby Heap

heap dumping patch

• simple patch to ruby VM (300 lines of C)

• http://gist.github.com/73674

• simple text based output format

    0x154750 @ -e:1 is OBJECT of type: T
    0x15476c @ -e:1 is HASH which has data
    0x154788 @ -e:1 is ARRAY of len: 0
    0x1547dc @ -e:1 is STRING len: 1 and val: T
    0x154814 @ -e:1 is CLASS named: T inherits from Object
    0x154a98 @ -e:1 is STRING len: 2 and val: hi
    0x154b40 @ -e:1 is OBJECT of type: Range

Page 45: Garbage Collection and the Ruby Heap

    $ cat /tmp/ruby.heap | awk '{ print $3 }' | sort | uniq -c | sort -g | tail -1
     236840 memcached/memcached.rb:316

    $ grep "memcached.rb:316" /tmp/ruby.heap | awk '{ print $5 }' | sort | uniq -c | sort -g | tail -2
      64952 HASH
     123290 STRING

    $ wc -l /tmp/ruby.heap
     1571529 /tmp/ruby.heap

    $ grep "memcached.rb:316" /tmp/ruby.heap | grep STRING | awk '{ print $10 }' | sort | uniq -c | sort -g | tail -2
      72095 int(11)
      79979 varchar(255)
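The same tallies can be done in one pass with a short Ruby script. This is a sketch that assumes the text format shown on the heap dumping patch slide (address, "@", file:line, "is", TYPE); tally_heap_dump is a made-up helper, and the sample lines stand in for /tmp/ruby.heap.

```ruby
# Tally heap-dump lines by allocation site, then by type within each site.
# Assumed line format: <address> @ <file>:<line> is <TYPE> ...
# tally_heap_dump is a hypothetical helper.
def tally_heap_dump(lines)
  by_site = Hash.new { |h, site| h[site] = Hash.new(0) }
  lines.each do |line|
    fields = line.split
    next unless fields.size >= 5 && fields[1] == "@"
    # fields[2] is awk's $3 (file:line), fields[4] is awk's $5 (type)
    by_site[fields[2]][fields[4]] += 1
  end
  by_site
end

# Sample lines in the dump format; a real run would pass
# File.foreach("/tmp/ruby.heap") instead.
sample = <<~DUMP
  0x154750 @ -e:1 is OBJECT of type: T
  0x15476c @ -e:1 is HASH which has data
  0x154788 @ -e:1 is ARRAY of len: 0
  0x1547dc @ -e:1 is STRING len: 1 and val: T
  0x154a98 @ -e:1 is STRING len: 2 and val: hi
DUMP

by_site = tally_heap_dump(sample.each_line)
site, types = by_site.max_by { |_, t| t.values.sum }
puts "#{types.values.sum} objects from #{site}"
types.sort_by { |_, n| -n }.each { |type, n| puts "  #{n} #{type}" }
```

Grouping by site first and type second answers both of the first two shell pipelines at once.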

Page 47: Garbage Collection and the Ruby Heap

memprof goals

• easy to use: no patching the VM, just require the gem

• detailed: include file/line (bleak_house), object contents (heap dumping patch), but also information about references between objects

• simple analysis: allow processing via various languages and databases using simple JSON data format


Page 48: Garbage Collection and the Ruby Heap

memprof

• http://github.com/ice799/memprof

• gem install memprof

• under active development on github

• works on x86_64 linux and x86_64 osx

• for best results, use an RVM built 1.8.x or REE

• ruby 1.9 support in the works

• 32bit support in the works


Page 49: Garbage Collection and the Ruby Heap

memprof under the hood

• rewrites your Ruby binary in memory

• injects short trampolines for all calls to internal VM functions to do tracking

• uses libdwarf and libelf to access VM internals like the ruby heap slabs

• uses libyajl to dump out ruby objects as json

http://timetobleed.com/string-together-global-offset-tables-to-build-a-ruby-memory-profiler/
http://timetobleed.com/memprof-a-ruby-level-memory-profiler/
http://timetobleed.com/what-is-a-ruby-object-introducing-memprof-dump/
http://timetobleed.com/hot-patching-inlined-functions-with-x86_64-asm-metaprogramming/
http://timetobleed.com/rewrite-your-ruby-vm-at-runtime-to-hot-patch-useful-features/

Page 51: Garbage Collection and the Ruby Heap

memprof features

• memprof.track

• memprof.dump

• memprof.dump_all

• memprof.com

Page 52: Garbage Collection and the Ruby Heap

    Memprof.track {
      100.times { "abc" }
      100.times { 1.23 + 1 }
      100.times { Module.new }
    }

    100 file.rb:2:String
    100 file.rb:3:Float
    100 file.rb:4:Module

• like bleak_house, but for a given block of code

• use Memprof::Middleware in your webapps to run track per request

Page 55: Garbage Collection and the Ruby Heap

strings

    Memprof.dump {
      "hello" + "world"
    }

    {
      "_id": "0x19c610",
      "file": "file.rb",
      "line": 2,
      "type": "string",
      "class": "0x1ba7f0",
      "class_name": "String",
      "length": 10,
      "data": "helloworld"
    }

• _id: memory address of object

• file, line: file and line where string was created

• class: address of the class “String”

• length, data: length and contents of this string instance

Page 56: Garbage Collection and the Ruby Heap

arrays

    Memprof.dump {
      [ 1, :b, 2.2, "d" ]
    }

    {
      "_id": "0x19c5c0",
      "class": "0x1b0d18",
      "class_name": "Array",
      "length": 4,
      "data": [
        1,
        ":b",
        "0x19c750",
        "0x19c598"
      ]
    }

• 1, ":b": integers and symbols are stored in the array itself

• "0x19c750", "0x19c598": floats and strings are separate ruby objects

Page 57: Garbage Collection and the Ruby Heap

hashes

    Memprof.dump {
      { :a => 1, "b" => 2.2 }
    }

    {
      "_id": "0x19c598",
      "type": "hash",
      "class": "0x1af170",
      "class_name": "Hash",
      "default": null,
      "length": 2,
      "data": [
        [ ":a", 1 ],
        [ "0xc728", "0xc750" ]
      ]
    }

• default: no default proc

• data: hash entries as key/value pairs

Page 58: Garbage Collection and the Ruby Heap

classes

    Memprof.dump {
      class Hello
        @@var = 1
        Const = 2
        def world() end
      end
    }

    {
      "_id": "0x19c408",
      "type": "class",
      "name": "Hello",
      "super": "0x1bfa48",
      "super_name": "Object",
      "ivars": {
        "@@var": 1,
        "Const": 2
      },
      "methods": {
        "world": "0x19c318"
      }
    }

• super: superclass object reference

• ivars: class variables and constants are stored in the instance variable table

• methods: references to method objects

Page 63: Garbage Collection and the Ruby Heap

Memprof.dump_all("myapp_heap.json")

• dump out every single live object as json

• one per line to specified file

• analyze via

• jsawk/grep

• mongodb/couchdb

• custom ruby scripts

• libyajl + Boost Graph Library
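As an example of the "custom ruby scripts" option, here is a sketch that reads a one-object-per-line dump and counts objects by allocation site and class. The field names (file, line, class_name, type) come from the Memprof.dump examples earlier in the talk; top_allocation_sites and the three sample records are made up for illustration.

```ruby
require "json"
require "stringio"

# Count dumped objects per "file:line:class". Objects without a file are
# grouped under "(unknown)". top_allocation_sites is a hypothetical helper.
def top_allocation_sites(io, limit = 10)
  counts = Hash.new(0)
  io.each_line do |line|
    next if line.strip.empty?
    obj = JSON.parse(line)
    site = obj["file"] ? "#{obj['file']}:#{obj['line']}" : "(unknown)"
    counts["#{site}:#{obj['class_name'] || obj['type']}"] += 1
  end
  counts.sort_by { |_, n| -n }.first(limit)
end

# Stand-in for File.open("myapp_heap.json"); these records are invented.
dump = StringIO.new(<<~JSONL)
  {"_id":"0x19c610","file":"file.rb","line":2,"type":"string","class_name":"String"}
  {"_id":"0x19c620","file":"file.rb","line":2,"type":"string","class_name":"String"}
  {"_id":"0x19c5c0","file":"file.rb","line":7,"type":"array","class_name":"Array"}
JSONL

top_allocation_sites(dump).each { |site, n| puts "#{n} #{site}" }
```

Reading line by line keeps memory use flat even for multi-gigabyte dumps, which is the main reason a one-object-per-line format is convenient.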


Page 65: Garbage Collection and the Ruby Heap

memprof.com

a web-based heap visualizer and leak analyzer

Page 73: Garbage Collection and the Ruby Heap

plugging a leak in rails3

• in dev mode, rails3 is leaking 10mb per request

let’s use memprof to find it!

    # in environment.rb
    require `gem which memprof/signal`.strip

Page 74: Garbage Collection and the Ruby Heap

plugging a leak in rails3

tell memprof to dump out the entire heap to json

$ memprof --pid <pid> --name <dump name> --key <api key>

send the app some requests so it leaks

$ ab -c 1 -n 30 http://localhost:3000/


Page 75: Garbage Collection and the Ruby Heap

2519 classes

30 copies of TestController

mongo query for all TestController classes

details for one copy of TestController


Page 76: Garbage Collection and the Ruby Heap

find references to object

holding references to all controllers

“leak” is on line 178


Page 77: Garbage Collection and the Ruby Heap

• In development mode, Rails reloads all your application code on every request

• ActionView::Partials::PartialRenderer is caching partials used by each controller as an optimization

• But it ends up holding a reference to every single reloaded version of those controllers
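The shape of this leak can be reproduced without Rails. The sketch below is a stand-in, not the actual ActionView code: partial_cache and reload_controller are hypothetical names, and Class.new simulates development-mode reloading by building a fresh class object on every "request".

```ruby
# Stand-in for the leak pattern: a long-lived cache keyed by a class object
# that is rebuilt on every request. Not Rails code; partial_cache and
# reload_controller are hypothetical.
partial_cache = {} # lives for the whole process, like a class-level cache

# Simulates dev-mode reloading: each call builds a brand new class object
# that happens to report the same name.
reload_controller = lambda do
  Class.new do
    def self.name
      "TestController"
    end
  end
end

30.times do
  controller = reload_controller.call
  partial_cache[controller] ||= "rendered partial for #{controller.name}"
end

# Every reloaded copy is still a key in the cache, so none can be collected.
puts partial_cache.size
puts partial_cache.keys.map(&:name).uniq.inspect
```

Thirty requests leave thirty distinct class objects with the same name in the cache, and everything reachable from each of them stays alive too.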

Page 78: Garbage Collection and the Ruby Heap

more* memprof features

• memprof.trace

• memprof::tracer

* currently under development


Page 80: Garbage Collection and the Ruby Heap

config.middleware.use(Memprof::Tracer)

    {
      "time": 4.3442,
      "rails": { "controller": "test", "action": "index" },
      "request": { "REQUEST_PATH": "/test", "REQUEST_METHOD": "GET" },

• time: total time for request

• rails: rails controller/action

• request: request env info

Page 81: Garbage Collection and the Ruby Heap

config.middleware.use(Memprof::Tracer)

      "mysql": { "queries": 3, "time": 0.00109302 },
      "gc": { "calls": 8, "time": 2.04925 },

• mysql: 3 mysql queries

• gc: 8 calls to GC, 2 secs spent in GC

Page 82: Garbage Collection and the Ruby Heap

config.middleware.use(Memprof::Tracer)

      "objects": {
        "created": 3911103,
        "types": {
          "none": 1168831,
          "object": 1127,
          "float": 627,
          "string": 1334637,
          "array": 609313,
          "hash": 3676,
          "match": 70211
        }
      }
    }

• created: 3.9 million objs created

• none: 1 million method calls

• object: object instances

• string: lots of strings

• array: lots of arrays

• match: regexp matches
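A quick bit of arithmetic on the tracer numbers shows how much of this request went to GC. The values are copied from the Memprof::Tracer output above; only the fields used here are included.

```ruby
require "json"

# Values copied from the tracer output shown on the slides.
trace = JSON.parse(<<~TRACE)
  {
    "time": 4.3442,
    "mysql": { "queries": 3, "time": 0.00109302 },
    "gc": { "calls": 8, "time": 2.04925 }
  }
TRACE

gc_share    = trace["gc"]["time"] / trace["time"]
mysql_share = trace["mysql"]["time"] / trace["time"]
per_gc_ms   = trace["gc"]["time"] / trace["gc"]["calls"] * 1000

puts format("GC: %.1f%% of the request, %.0f ms per collection", gc_share * 100, per_gc_ms)
puts format("MySQL: %.3f%% of the request", mysql_share * 100)
```

Nearly half of the request is spent inside the collector, while the database accounts for a small fraction of a percent.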

Page 83: Garbage Collection and the Ruby Heap

more objects = longer GC

Page 84: Garbage Collection and the Ruby Heap

longer GC = less time to run your ruby code

Page 85: Garbage Collection and the Ruby Heap

fewer objects = better performance

Page 86: Garbage Collection and the Ruby Heap

Use these tools.

Page 87: Garbage Collection and the Ruby Heap

Questions?

Joe Damato
@joedamato
timetobleed.com
ice799 on github/irc