Tracing memory leaks in Ruby

Sunday, June 27th, 2010

A project I’m working on at the moment has a lot of long-lived objects, and we want to make sure that everything gets garbage collected correctly.

While trying to track down some issues, I wished I had Squeak’s PointerExplorer, which makes it easy to examine the references to an object—and figure out who’s causing problems by hanging on to one.

After a quick look at a few existing tools for tracing leaks, I couldn’t find one that does this. So I took some code from Joe Damato, updated it a bit, and implemented two things: weak arrays and GC.heap_find().

Weak arrays work pretty much as you’d expect:

>> a = []
=> []
>> a.weak = true
=> true
>> 1000.times { a << Object.new }
=> 1000
>> GC.start
=> nil
>> a[200]
=> nil
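For readers without the patch, roughly the same behaviour can be approximated in stock Ruby with the standard weakref library. This is only a sketch: WeakList is a hypothetical name, not part of the patch, and it wraps each element in a WeakRef rather than flipping a flag on a real Array the way `a.weak = true` does.

```ruby
require 'weakref'

# Hypothetical WeakList (not part of the patch): it stores WeakRef wrappers,
# so the list itself does not keep its elements alive.
class WeakList
  def initialize
    @refs = []
  end

  # Append an object without holding a strong reference to it.
  def <<(obj)
    @refs << WeakRef.new(obj)
    self
  end

  # Returns the element, or nil if the index is out of range or the
  # element has already been garbage collected.
  def [](index)
    ref = @refs[index]
    return nil if ref.nil?
    ref.__getobj__
  rescue WeakRef::RefError
    nil
  end

  def size
    @refs.size
  end
end

# Usage: an element is only guaranteed to survive while something else
# still holds a strong reference to it.
keeper = Object.new
list = WeakList.new
list << keeper
1000.times { list << Object.new }
GC.start
list[0].equal?(keeper)   # keeper is still strongly referenced here
```

Whether any particular unreferenced element has been collected after GC.start depends on the interpreter, so the sketch makes no promise about which slots come back as nil.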

Like the PointerExplorer, GC.heap_find scans the heap to find references to an object. Since it uses the same algorithms that the garbage collector itself uses, it won’t miss anything. (If Ruby’s mark-and-sweep collector can’t find your object, it’s not going to be kept.)

>> obj = Object.new
=> #<Object:0x1007f6900>
>> a = [obj]
=> [#<Object:0x1007f6900>]
>> h = {:foo => 2, :bar => obj}
=> {:bar=>#<Object:0x1007f6900>, :foo=>2}
>> refs = GC.heap_find(obj)
=> [{:bar=>#<Object:0x1007f6900>, :foo=>2}, [#<Object:0x1007f6900>]]
>> refs[1].object_id
=> 2151609120
>> a.object_id
=> 2151609120
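The idea can be roughly imitated in unpatched Ruby by walking ObjectSpace. The sketch below is an assumption-laden approximation, not the patch's implementation: find_referrers and references? are hypothetical names, and they only look at array elements, hash keys and values, and instance variables. Unlike GC.heap_find, which reuses the collector's own marking logic, this can miss references held by closures, C extensions, or interpreter internals, and it returns an ordinary (strong) array.

```ruby
# Hypothetical helper: does +holder+ directly reference +target+?
# Only instance variables, Array elements and Hash keys/values are checked.
def references?(holder, target)
  return true if holder.instance_variables.any? do |name|
    target.equal?(holder.instance_variable_get(name))
  end

  case holder
  when Array then holder.any? { |element| target.equal?(element) }
  when Hash  then holder.any? { |key, value| target.equal?(key) || target.equal?(value) }
  else false
  end
end

# Hypothetical, approximate stand-in for GC.heap_find: scan every live
# Object and collect the ones that directly reference +target+.
def find_referrers(target)
  referrers = []
  ObjectSpace.each_object(Object) do |candidate|
    next if referrers.equal?(candidate) # skip the result array itself
    referrers << candidate if references?(candidate, target)
  end
  referrers
end

# Usage, mirroring the session above.
obj = Object.new
a = [obj]
h = { :foo => 2, :bar => obj }
refs = find_referrers(obj)
refs.any? { |r| r.equal?(a) }   # the array is reported as a referrer
refs.any? { |r| r.equal?(h) }   # so is the hash
```

Because the result here is a strong array, it will itself show up as a referrer of its contents on later scans, which is exactly the problem the patch avoids by returning weak arrays.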

One slight issue is that heap_find can’t return some objects that are traversed by Ruby’s GC but aren’t first-class Ruby objects, such as SCOPEs and VARMAPs. heap_find could indicate their presence somehow, but since they’re not first-class, they can’t be returned directly. This isn’t a big deal: memory issues here are unlikely, and any that do occur probably indicate a bug in Ruby itself.

Because the arrays returned by heap_find are weak, and weak arrays don’t count as references, you can keep exploring the object graph as far as you like without earlier results showing up as confusing extra references.

>> refs.weak?
=> true
>> obj2 = Object.new
=> #<Object:0x1013ec9c0>
>> foo = [1, 2, obj2]
=> [1, 2, #<Object:0x1013ec9c0>]
>> GC.heap_find(obj2)
=> [[1, 2, #<Object:0x1013ec9c0>]]
>> foo.weak = true
=> true
>> GC.heap_find(obj2)
=> []

Here’s the patch for Ruby 1.8.7-p174 (shipped with OS X 10.6).