Concurrency in Rails 5.0

jrochkind

1 year ago

Advertisements

My previous posts on concurrency in ActiveRecord have been some of the most popular on this blog (which I’d like to think means concurrency is getting more popular in Rails-land), so I’m going to share what I know about some new concurrency architecture in Rails5 — which is no longer limited to ActiveRecord.

(update: Hours before I started writing this unawares, matthewd submitted a rails PR for a Rails Guide, with some really good stuff; have only skimmed it now, but you might wanna go there either before, after, or in lieu of this).

I don’t fully understand the new stuff, but since it’s relatively undocumented at present, and has some definite gotchas, as well as definite potentially powerful improvements — sharing what I got seems helpful. This will be one of my usual “lots of words” posts, get ready!

Why you need to know this

If you create any threads in a Rails app yourself — beyond the per-request threads a multi-threaded app server like Puma will do for you. Rails takes care of multi-threaded request dispatch for you (with the right app server), but if you’re doing any kind of what I’ll call “manual concurrency” yourself —Thread.new, any invocations of anything in concurrent-ruby (recommended), or probably any celluloid (not sure), etc. — you got to pay attention to be using the new architecture to be doing what Rails wants — and to avoid deadlocks if dev-mode-style class-reloading is happening.

If you’re getting apparent deadlocks in a Rails5 app that does multi-threaded concurrency, it’s probably about this.

If you are willing to turn off dev-mode class-reloading and auto-loading altogether, you can probably ignore this.

What I mean by “dev-mode class-reloading”

Rails5 by default generates your environments/development.rb with with config.cache_classes==false, config.eager_load==false. Classes are auto-loaded only on demand (eager_load == false), and are also sometimes unloaded to be reloaded on next access (cache_classes == false). (The details of when/how/which/if they are unloaded is outside the scope of this blog post, but has also changed in Rails5).

You can turn off all auto-loading with config.cache_classes==true, config.eager_load==true — the Rails5 default production. All classes are loaded/require’d en masse on boot, and are never unloaded. This is what I mean by ‘turn off dev-mode class-reloading and auto-loading altogether’.

The default Rails5 generated environments/test.rb has config.cache_classes==true, config.eager_load==false. Only load classes on demand with auto-loading (eager_load == false), but never unload them.

I am not sure if there’s any rational purpose for having config.cache_classes = false, config.eager_load = true, probably not.

I think there was a poorly documented config.autoload in previous Rails versions, with confusing interactions with the above two config; I don’t think it exists (or at least does anything) in Rails 5.

Good News

Previously to Rails 5, Rails dev-mode class-reloading and auto-loading were entirely un-thread-safe. If you were using any kind of manual concurrency, then you pretty much had to turn off dev-mode class-reloading and auto-loading. Which was too bad, cause they’re convenient and make dev more efficient. If you didn’t, it might sometimes work, but in development (or possibly test) you’d often see those pesky exceptions involving something like “constant is missing”, “class has been redefined”, or “is not missing constant” — I’m afraid I can’t find the exact errors, but perhaps some of these seem familiar.

Rails 5, for the first time, has an architecture which theoretically lets you do manual concurrency in the presence of class reloading/autoloading, thread-safely. Hooray! This is something I had previously thought was pretty much infeasible, but it’s been (theoretically) pulled off, hooray. This for instance theoretically makes it possible for Sidekiq to do dev-mode-style class-reloading — although I’m not sure if latest Sidekiq release actually still has this feature, or they had to back it out.

The architecture is based on some clever concurrency patterns, so it theoretically doesn’t impact performance or concurrency measuribly in production — or even, for the most part, significantly in development.

While the new architecture most immediately effects class-reloading, the new API is, for the most part, not written in terms of reloading, but is higher level API in terms of signaling things you are doing about concurrency: “I’m doing some concurrency here” in various ways. This is great, and should be a good for future of Just Works concurrency in Rails in other ways than class reloading too. If you are using the new architecture correctly, it theoretically makes ActiveRecord Just Work too, with less risk of leaked connections without having to pay lots of attention to it. Great!

I think matthewd is behind much of the new architecture, so thanks matthewd for trying to help move Rails toward a more concurrency-friendly future.

Less Good News

While the failure mode for concurrency used improperly with class-reloading in Rails 4 (which was pretty much any concurrency with class-reloading, in Rails 4) was occasional hard-to-reprodue mysterious exceptions — the failure mode for concurrency used improperly with class-reloading in Rails5 can be a reproduces-every-time deadlock. Where your app just hangs, and it’s pretty tricky to debug why, especially if you aren’t even considering “class-reloading and new Rails 5 concurrency architecture”, which, why would you?

And all the new stuff is, at this point, completely undocumented. (update some docs in rails/rails #27494, hadn’t seen that before I wrote this). So it’s hard to know how to use it right. (I would quite like to encourage an engineering culture where significant changes without docs is considered just as problematic to merge/release as significant changes without tests… but we’re not there yet). (The docs Autoloading and Reloading Constants Guide, to which this is very relevant, have not been updated for this ActiveSupport::Reloader stuff, and I think are probably no longer entirely accurate. That would be a good place for some overview docs…).

The new code is a bit tricky and abstract, a bit hard to follow. Some anonymous modules at some points made it hard for me to use my usual already grimace-inducing methods of code archeology reverse-engineering, where i normally count on inspecting class names of objects to figure out what they are and where they’re implemented.

The new architecture may still be buggy. Which would not be surprising for the kind of code it is: pretty sophisticated, concurrency-related, every rails request will touch it somehow, trying to make auto-loading/class-reloading thread-safe when even ordinary ruby require is not (I think this is still true?). See for instance all the mentions of the “Rails Reloader” in the Sidekiq changelog, going back and forth trying to make it work right — not sure if they ended up giving up for now.

The problem with maybe buggy combined with lack of any docs whatsoever — when you run into a problem, it’s very difficult to tell if it’s because of a bug in the Rails code, or because you are not using the new architecture the way it’s intended (a bug in your code). Because knowing the way it’s intended to work and be used is a bit of a guessing game, or code archeology project.

We really need docs explaining exactly what it’s meant to do how, on an overall architectural level and a method-by-method level. And I know matthewd knows docs are needed. But there are few people qualified to write those docs (maybe only matthewd), cause in order to write docs you’ve got to know the stuff that’s hard to figure out without any docs. And meanwhile, if you’re using Rails5 and concurrency, you’ve got to deal with this stuff now.

So: The New Architecture

I’m sorry this is so scattered and unconfident, I don’t entirely understand it, but sharing what I got to try to save you time getting to where I am, and help us all collaboratively build some understanding (and eventually docs?!) here. Beware, there may be mistakes.

The basic idea is that if you are running any code in a manually created thread, that might use Rails stuff (or do any autoloading of constants), you need to wrap your “unit of work” in eitherRails.application.reloader.wrap { work } or Rails.application.executor.wrap { work }. This signals “I am doing Rails-y things, including maybe auto-loading”, and lets the framework enforce thread-safety for those Rails-y things when you are manually creating some concurrency — mainly making auto-loading thread-safe again.

When do you pick reloader vs executor? Not entirely sure, but if you are completely outside the Rails request-response cycle (not in a Rails action method, but instead something like a background job), manually creating your own threaded concurrency, you should probably use Rails.application.reloader. That will allow code in the block to properly pick up new source under dev-mode class-reloading. It’s what Sidekiqdid to add proper dev-mode reloading for sidekiq (not sure what current master Sidekiq is doing, if anything).

On the other hand, if you are in a Rails action method (which is already probably wrapped in a Rails.application.reloader.wrap, I believe you can’t use a (now nested) Rails.application.reloader.wrap without deadlocking things up. So there you use Rails.application.executor.wrap.

What about in a rake task, or rails runner executed script? Not sure. Rails.application.executor.wrap is probably the safer one — it just won’t get dev-mode class-reloading happening reliably within it (won’t necessarily immediately, or even ever, pick up changes), which is probablyfine.

But to be clear, even if you don’t care about picking up dev-mode class-reloading immediately — unless you turn off dev-mode class-reloading and auto-loading for your entire app — you still need to wrap with a reloader/executor to avoid deadlock — if anything inside the block possibly might trigger an auto-load, and how could you be sure it won’t?

Let’s move to some example code, which demonstrates not just the executor.wrap, but some necessary use of ActiveSupport::Dependencies.interlock.permit_concurrent_loads too.

An actual use case I have — I have to make a handful of network requests in a Rails action method, I can’t really push it off to a bg job, or at any rate I need the results before I return a response. But since I’m making several of them, I really want to do them in parallel. Here’s how I might do it in Rails4:

In Rails4, that would work… mostly. With dev-mode class-reloading/autoloading on, you’d get occasional weird exceptions. Or of course you can turn dev-mode class-reloading off.

In Rails5, you can still turn dev-mode class-reloading/autoloading and it will still work. But if you have autoload/class-reload on, instead of an occasional weird exception, you’ll get a nearly(?) universal deadlock. Here’s what you need to do instead:

And it should actually work reliably, without intermittent mysterious “class unloaded” type errors like in Rails4.

ActiveRecord?

I think that if your concurrent work is wrapped in Rails.application.reloader.wrap doorRails.application.executor.wrap do, this is no longer a problem — they’ll take care of returning any pending checked out AR db connections to the pool at end of block.

So you theoretically don’t need to be so careful about wrapping every single concurrent use of AR in a ActiveRecord::Base.connection_pool.with_connection to avoid leaked connections.

But I think you still can, and it won’t hurt — and it should sometimes lead to shorter finer grained checkouts of db connections from the pool, which matters if you potentially have more threads than you have pool size in your AR connection. I am still wrapping in ActiveRecord::Base.connection_pool.with_connection , out of superstition if nothing else.

Under Test with Capybara?

I think this new architecture could theoretically pave the way to making this all a lot more intentional and reliable, but I’m not entirely sure, not sure if it helps at all already just by existing, or would instead require Capybara to make use of the relevant API hooks (which nobody’s prob gonna write until there are more people who understand what’s going on).

Note though that Rails 4 generated a comment in config/environments/test.rb that says “If you are using a tool that preloads Rails for running tests [which I think means Capybara feature testing], you may have to set [config.eager_load] to true.” I’m not really sure how true this was in even past versions Rails (whether it was neccessary or sufficient). This comment is no longer generated in Rails 5, and eager_load is still generated to be true … so maybe something improved?

Frankly, that’s a lot of inferences, and I have been still leaving eager_load = true under test in my Capybara-feature-test-using apps, because the last thing I need is more fighting with a Capybara suite that is the closest to reliable I’ve gotten it.

Debugging?

The biggest headache is that a bug in the use of the reloader/executor architecture manifests as a deadlock — and I’m not talking the kind that gives you a ruby ‘deadlock’ exception, but the kind where your app just hangs forever doing nothing. This is painful to debug.

These deadlocks in my experience are sometimes not entirely reproducible, you might get one in one run and not another, but they tend to manifest fairly frequently when a problem exists, and are sometimes entirely reproducible.

First step is experimentally turning off dev-mode class-reloading and auto-loading altogether (config.eager_load = true,config.cache_classes = true), and see if your deadlock goes away. If it does, it’s probably got something to do with not properly using the new Reloader architecture. In desperation, you could just give up on dev-mode class-reloading, but that’d be sad.

Added new ActionDispatch::DebugLocks middleware that can be used to diagnose deadlocks in the autoload interlock. To use it, insert it near the top of the middleware stack, using config/application.rb:

Rails.application.executor and Rails.application.reloader are initialized here, I think.

Not sure the design intent of: Executor being an empty subclass of ExecutionWrapper; Rails.application.executor being an anonymous sub-class of Exeuctor (which doesn’t seem to add any behavior either? Rails.application.reloader does the same thing fwiw); or if further configuration of the Executor is done in other parts of the code.

Sidekiq PR #2457 Enable code reloading in development mode with Rails 5 using the Rails.application.reloader, I believe code may have been written by matthewd. This is aood intro example of a model of using the architecture as intended (since matthewd wrote/signed off on it), but beware churn in Sidekiq code around this stuff dealing with issues and problems after this commit as well — not sure if Sidekiq later backed out of this whole feature? But the Sidekiq source is probably a good one to track.

A dialog in Rails Github Issue #686 between me and matthewd, where he kindly leads me through some of the figuring out how to do things right with the new arch. See also several other issues linked from there, and links into Rails source code from matthewd.

Conclusion

If I got anything wrong, or you have any more information you think useful, please feel free to comment here — and/or write a blog post of your own. Collaboratively, maybe we can identify if not fix any outstanding bugs, write docs, maybe even improve the API a bit.

While the new architecture holds the promise to make concurrent programming in Rails a lot more reliable — making dev-mode class-reloading at least theoretically possible to do thread-safely, when it wasn’t at all possible before — in the short term, I’m afraid it’s making concurrent programming in Rails a bit harder for me. But I bet docs will go a long way there.