In it we learned that our GC algorithm is flawed and were prescribed some rather drastic and dangerous workarounds.

At the core it had one big demonstration:

Run this on Ruby 2.1.1 and you will be out of memory soon:

while true
  "a" * (1024 ** 2)
end

Malloc limits, Ruby and you

From very early versions of Ruby we always tracked memory allocation. This is why I found FUD comments such as this troubling:

the issue is that the Ruby GC is triggered on total number of objects, and not total amount of used memory

This is a clear misunderstanding of Ruby. In fact, the aforementioned article never even mentions that memory allocation may trigger a GC.

Historically, Ruby was quite conservative about issuing GCs based on the amount of memory allocated. Ruby keeps track of all memory allocated (using malloc) outside of the Ruby heaps between GCs. In Ruby 2.0, out of the box, every 8MB of allocations will result in a full GC. This number is way too small for almost any Rails app, which is why increasing RUBY_GC_MALLOC_LIMIT is one of the most cargo-culted settings out in the wild.

Matz picked this tiny number years ago, when it was a reasonable default; however, it was not revised until Ruby 2.1 landed.

For Ruby 2.1 Koichi decided to revamp this sub-system. The goal was to have defaults that work well for both scripts and web apps.

Instead of having a single malloc limit for our app, we now have a starting-point malloc limit that grows dynamically every time we trigger a GC by exceeding the limit. To stop unbounded growth of the limit, we have max values set.
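This dynamic behaviour can be watched from a script (a sketch; the GC.stat key names below are the Ruby 2.2+ spellings, Ruby 2.1's differed slightly):

```ruby
# Sketch: allocation-triggered GCs and the dynamic malloc limit.
gcs_before   = GC.stat[:count]
limit_before = GC.stat[:malloc_increase_bytes_limit]

# Churn through ~200MB of off-heap string allocations; crossing the
# current malloc limit triggers GCs along the way.
200.times { "a" * (1024 ** 2) }

gcs_after   = GC.stat[:count]
limit_after = GC.stat[:malloc_increase_bytes_limit]

puts "GCs triggered: #{gcs_after - gcs_before}"
puts "malloc limit:  #{limit_before} -> #{limit_after}"
```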

We track memory allocations from 2 points in time:

memory allocated outside Ruby heaps since the last minor GC

memory allocated outside Ruby heaps since the last major GC

At any point in time we can get a snapshot of the current situation with GC.stat:
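For example (a sketch; these key names are the Ruby 2.2+ spellings, Ruby 2.1's were slightly different):

```ruby
stat = GC.stat
# Bytes malloced outside the Ruby heaps since the last minor GC,
# and the dynamic limit that will trigger the next one:
puts stat[:malloc_increase_bytes]
puts stat[:malloc_increase_bytes_limit]
# The same pair, tracked since the last major GC:
puts stat[:oldmalloc_increase_bytes]
puts stat[:oldmalloc_increase_bytes_limit]
```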

So, in theory, the unbounded memory growth seen in the script above should not be possible. The two MAX values should cap the limit growth and force GCs.

However, this is not the case in Ruby 2.1.1.

Investigating the issue

We spent a lot of time ensuring we had extensive instrumentation built into Ruby 2.1: we added memory profiling hooks, we added GC hooks, and we exposed a large amount of internal information. This has certainly paid off.

Analyzing the issue raised by this mini script is trivial using the gc_tracer gem. This gem allows us to get a very detailed snapshot of the system every time a GC is triggered and store it in a text file that is easily consumable by a spreadsheet.

In the snippet above we can see minor GCs triggered by exceeding malloc limits (where major_by is 0) and major GCs triggered by exceeding malloc limits. We can see our malloc limit and oldmalloc limit growing. We can see when each GC starts and ends, and lots more.

Trouble is, our limits for both oldmalloc and malloc grow well beyond the max values we have defined:

Are you affected by this bug?

It is possible your production app on Ruby 2.1.1 is impacted by this. The simplest way to find out is to issue a GC.stat call as soon as memory usage is really high.

The script above is very aggressive and triggers the pathological issue; it is quite possible you are not even pushing against malloc limits. The only way to find out is to measure.
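A rough check might look like this (a sketch: the key names are the Ruby 2.2+ spellings, and the 32MB/128MB fallbacks are assumptions based on Ruby 2.1's documented defaults for RUBY_GC_MALLOC_LIMIT_MAX and RUBY_GC_OLDMALLOC_LIMIT_MAX):

```ruby
# If the dynamic limits have grown past their configured maximums,
# you are likely hitting the 2.1.1 bug described above.
malloc_limit_max    = Integer(ENV.fetch("RUBY_GC_MALLOC_LIMIT_MAX", 32 * 1024 * 1024))
oldmalloc_limit_max = Integer(ENV.fetch("RUBY_GC_OLDMALLOC_LIMIT_MAX", 128 * 1024 * 1024))

stat = GC.stat
if stat[:malloc_increase_bytes_limit] > malloc_limit_max ||
   stat[:oldmalloc_increase_bytes_limit] > oldmalloc_limit_max
  puts "malloc limits grew past their configured max"
else
  puts "malloc limits are within bounds"
end
```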

General memory growth under Ruby 2.1.1

A more complicated issue we need to tackle is the more common “memory doubling” issue under Ruby 2.1.1. The general complaint goes something along the lines of “I just upgraded Ruby and now my RSS has doubled”.

Memory usage growth is partly unavoidable when employing a generational GC. A certain section of the heap is getting scanned far less often. It’s a performance/memory trade-off. That said, the algorithm used in 2.1 is a bit too simplistic.

If an object survives a minor GC it is flagged as oldgen; these objects will only be scanned during a major GC. This algorithm is particularly problematic for web applications.

Web applications perform a large amount of “medium” lived memory allocations. A large number of objects are needed for the lifetime of a web request. If a minor GC hits in the middle of a web request we will “promote” a bunch of objects to the “long lived” oldgen even though they will no longer be needed at the end of the request.

This has a few bad side effects:

It forces major GC to run more often (growth of oldgen is a trigger for running a major GC)

It forces the oldgen heaps to grow beyond what we need.

A bunch of memory is retained when it is clearly not needed.
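The mid-request promotion described above is easy to observe (a sketch; `old_objects` is the Ruby 2.2+ key name, and current MRI promotes an object after it survives a few minor GCs rather than just one):

```ruby
GC.start # start from a clean slate
before = GC.stat[:old_objects]

# Simulate a request: lots of objects that are only needed until the
# response is rendered.
request_data = Array.new(100_000) { "payload" * 10 }

# Minor GCs landing mid-request; survivors age and get promoted to oldgen.
4.times { GC.start(full_mark: false, immediate_sweep: true) }

promoted = GC.stat[:old_objects] - before
puts "objects promoted to oldgen: #{promoted}"

request_data = nil # request over, but the promoted objects linger until a major GC
```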

.NET and Java employ 3 generations to overcome this issue. Survivors in Gen 0 collections are promoted to Gen 1 and so on.

Koichi is planning on refining the current algorithm to employ a somewhat similar technique of deferred promotion. Instead of promoting objects to oldgen on the first minor GC, an object will have to survive two minor GCs to be promoted. This means that if no more than one minor GC runs during a request, our heaps will be able to stay at optimal sizes. This work is already prototyped in Ruby 2.1; see RGENGC_THREEGEN in gc.c (note: the name is likely to change). It is slotted to be released in Ruby 2.2.

We can see this problem in action using this somewhat simplistic test:
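The original test is not reproduced here; a minimal sketch in the same spirit (simulated requests allocating medium-lived objects, then a look at GC.stat; key names are the Ruby 2.2+ spellings) might be:

```ruby
# Hypothetical stand-in for the article's test: each "request" allocates
# objects it only needs briefly, yet minor GCs landing mid-request
# promote some of them to oldgen anyway.
def simulate_request
  Array.new(10_000) { "x" * 50 }
  nil # nothing from the request is retained
end

50.times { simulate_request }

stat = GC.stat
puts "minor GCs:   #{stat[:minor_gc_count]}"
puts "major GCs:   #{stat[:major_gc_count]}"
puts "old objects: #{stat[:old_objects]}"
```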

Which is nice, since we are back to Ruby 2.0 memory numbers, but we lost a pile of performance in the process.

Ruby 2.1 is ready for production

Ruby 2.1 has been running in production at GitHub for a few months with great success. The 2.1.0 release was a little rough; 2.1.1 addresses the majority of the big issues it had. 2.1.2 will address the malloc issue, which may or may not affect you.

If you are considering deploying Ruby 2.1 I would strongly urge giving GitHub Ruby a go, since it contains a fairly drastic performance boost due to funny-falcon's excellent method cache patch.

I was referring to the “General memory growth under Ruby 2.1.1” part of your article.

This should be solved with 2.2 if I’ve understood your article correctly.

I am deploying a Rails API app under a 1X dyno on Heroku and have 1 big endpoint, generating many objects, that's being hit too often; my memory goes over the 512MB allowed, triggering many R14 Heroku errors and perf issues.

I’ve set RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR to 0.9 as you advised and this seems to be better, but I still outgrow the 512MB limit.

My tests with jemalloc on my Mac are pretty encouraging, but using jemalloc on Heroku requires some setup.

It is fairly common to need 1.5x to 2x the amount of memory when moving from 2.0 to 2.1.3. We are hoping 2.2 will reduce memory usage a bit, but we are not sure we will reach 2.0 levels by then.

David Sanderson
over 3 years ago

The memory footprint doesn’t bother me as much as it behaving like a memory leak. I lowered the number of Unicorn workers to offset the amount of memory it needs now, but the memory still grows out of control for some unknown reason. Maybe it’s Heroku + 2.1.3 related? Once I downgrade to 2.0.0, it’s fine.

Interesting, I have not heard of leaks on 2.1.3. Do you have any out-of-band GC thing going? Any more info you can share?

David Sanderson
over 3 years ago

Using out of band GC only slowed down the growth of memory. I’m not sure what other information I could give, other than it’s Rails 4.1.6, Ruby 2.1.3, Heroku, 1 dyno, Unicorn with 3 workers, fairly small app. The memory just starts to grow. This originally started happening with Ruby 2.1.0 and I was forced to downgrade to 2.0.0. I was excited with 2.1.3 because it looked like the memory leak issue was resolved, but I guess in some cases it’s not. What else can I do to diagnose the problem?