TLDR: Rails Live Streaming allows Rails to compete with Node.js in the streaming arena. Streaming requires application servers to support either multi-threaded or evented I/O. Most Ruby application servers are not up for the job. Phusion Passenger Enterprise 4.0 (a Ruby app server) is to become hybrid multi-processed, multi-threaded and evented. This allows seamless support for streaming, provides excellent backwards compatibility, and allows future support for more use cases than streaming alone.

Several days ago Rails introduced Live Streaming: the ability to send partial responses to the client immediately. This is a big deal because it opens up a huge number of use cases that Rails simply wasn’t suitable for. Rails was and still is an excellent choice for “traditional” web apps where the user sends a request and expects a full response back. It was a bad choice for anything that works with response streams, for example:

Progress responses that continuously inform the user about the progress. Imagine a web application that performs heavy calculations that can take several minutes. Before Live Streaming, you had to split this system up into multiple pages that must respond immediately. The main page would offload the actual work to a background worker, and return a response informing the user that the work is now in progress. The user must then poll a status page at a regular interval to look up the progress of the work. With Live Streaming, you can not only simplify the code by streaming progress information in a single request, but also push progress information to the user much more quickly and without polling.
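With Live Streaming, the progress example above reduces to a single controller action. The following is a sketch against Rails 4's ActionController::Live API; WorkModel and its methods are hypothetical placeholders for the actual calculation:

```ruby
# Sketch only: WorkModel, #done?, #do_some_calculations and #progress
# are illustrative stand-ins for a real long-running computation.
class ProgressController < ApplicationController
  include ActionController::Live

  def index
    response.headers["Content-Type"] = "text/plain"
    work = WorkModel.new
    until work.done?
      work.do_some_calculations
      # Each write is pushed to the client immediately, not buffered
      # until the end of the request.
      response.stream.write("Progress: #{work.progress}%\n")
    end
  ensure
    response.stream.close   # always release the stream
  end
end
```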

Chat servers. Or, more generally, web apps that involve a large number of mostly idle but persistent connections. Until today this has largely been the domain of evented systems such as Node.js and Erlang.

And as Aaron Patterson has already explained, this feature is different from Rails 3.2’s template streaming.

Just “possible” is not enough

The same functionality was actually already technically possible in Ruby. According to the Rack spec, Rack app objects must return a tuple:

[status_code, headers, body]

Here, body must respond to the each method. You can implement live streaming by yourself, with raw Rack, by returning a body object that yields partial responses in its each method.

class StreamingBody
  def each
    work = WorkModel.new
    while !work.done?
      work.do_some_calculations
      yield "Progress: #{work.progress}%\n"
    end
  end
end

Notice that the syntax is nearly identical to the Rails controller example code. With this, it is possible to implement anything.
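To make the contract concrete, here is a self-contained sketch (the class name and messages are illustrative) showing that any object responding to each works as a streaming body; a Rack-compliant server would flush each yielded chunk to the client as it arrives:

```ruby
# A hypothetical streaming body: the server calls #each and sends
# every yielded string to the client as a partial response.
class CountdownBody
  def each
    3.downto(1) { |n| yield "#{n}...\n" }
    yield "Liftoff!\n"
  end
end

# Here we play the role of the server and simply collect the chunks.
chunks = []
CountdownBody.new.each { |chunk| chunks << chunk }
puts chunks.join
```

Wiring this up in a rackup file is a one-liner: `run lambda { |env| [200, { "Content-Type" => "text/plain" }, CountdownBody.new] }`.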

However, streaming in Ruby has never gained much traction compared to systems such as Node.js. The latter is much more popular for these kinds of use cases. I believe this inequality in popularity is caused by a few things:

Awareness. Not everybody knew this was possible in Ruby. Indeed, it is not widely documented.

Ease and support. Some realized this was possible, but chose not to use Ruby because many frameworks did not provide easy support for streaming. It was possible to stream responses in pre-4.0 Rails, but the framework code generally does not take streaming into account, so if you try to do anything fancy you run the risk of breaking things.

With Live Streaming, streaming is now easy to use as well as officially supported.

Can Rails compete with Node.js?

Node.js is gaining more and more momentum nowadays. As I see it there are several reasons for this:

Love for JavaScript. Some developers prefer JavaScript over Ruby, for whatever reasons. Some like the idea of using the same language for both frontend and backend (although whether code can be easily shared between frontend and backend remains a controversial topic among developers). Others like the V8 engine for its speed. Indeed, V8 is a very well-optimized engine, much more so than Ruby 1.9’s YARV engine.

Excellent support for high I/O concurrency use cases. Node.js is an evented I/O system, and evented systems can handle a massive amount of concurrent connections. All libraries in the Node.js ecosystem are designed for evented use cases, because there’s no other choice. In other languages you have to specifically look for evented libraries, so the signal-to-noise ratio is much lower.

I have to be careful here: the phrases “high I/O concurrency” and “massive amount of concurrent connections” deserve more explanation because it’s easy to confuse them with “uber fast” or “massively scalable”. That is not what I meant. What I meant is, a single Node.js process is capable of handling a lot of client sockets, assuming that any work you perform does not saturate memory, CPU or bandwidth. In contrast, Ruby systems traditionally could only handle 1 concurrent request per process, even if you don’t do much work inside a request. We call this a multi-process I/O model because the amount of concurrent users (I/O) the system can handle scales only with the number of processes.

In traditional web apps that send back full responses, this is not a problem because the web server queues all requests, the processes respond as quickly as possible (usually saturating the CPU) and the web server buffers all responses and relieves the processes immediately. In streaming use cases, you have long-running requests so the aforementioned mechanism of letting the web server buffer responses is simply not going to work. You need more I/O concurrency: either you must have more processes, or processes must be able to handle more than 1 request simultaneously. Node.js processes can effectively handle an unlimited number of requests simultaneously, when not considering any constraints posed by CPU, memory or bandwidth.

Node.js is more than HTTP. It allows arbitrary networking with TCP and UDP. Rails is pretty much only for HTTP and even support for WebSockets is dubious, even in raw Rack. It cannot (and I believe, should not) compete with Node.js on everything, but still… Now suddenly, Rails can compete with Node.js for a large number of use cases.

Two sides of the coin

Reality is actually a bit more complicated than this. Although Rails can handle streaming responses now, not all Ruby application servers can. Ilya Grigorik described this problem in his article Rails Performance Needs an Overhaul and criticized Phusion Passenger, Mongrel and Unicorn for being purely multi-process, and thus not able to support high concurrency I/O use cases. (Side note: I believe the article’s title was poorly chosen; it criticizes I/O concurrency support, not performance.)

Mongrel’s current maintenance status appears to be in limbo. Unicorn is well-maintained, but its author Eric Wong has explicitly stated in his philosophy that Unicorn is to remain a purely multi-processed application server, with no intention to ever become multithreaded or evented. Unicorn is explicitly designed to handle fast responses only (so no streaming responses).

At the time Ilya Grigorik’s article was written, Thin was the only application server that was able to support high I/O concurrency use cases. Built on EventMachine, Thin is evented, just like Node.js. Since then, another evented application server called Goliath has appeared, also built on EventMachine. However, evented servers require evented application code, and Rails is clearly not evented.

There have been attempts to make serial-looking code evented through the use of Ruby 1.9 fibers, e.g. through the em-synchrony gem, but in my opinion fibers cause more problems than they solve. Ruby 1.8’s green threading model was essentially already like fibers: there was only one OS thread, and the Ruby green thread scheduler switched context upon encountering a blocking I/O operation. Fibers also operate within a single OS thread, but you can only context switch with explicit calls. In other words, you have to go through each and every blocking I/O operation you perform and insert fiber context switching logic, which Ruby 1.8 already did for you. Worse, fibers give the illusion of thread safety, while in reality you can run into the same concurrency problems as with threading. But this time, you cannot easily apply locks to prevent unwanted context switching. Unless the entire ecosystem is designed around fibers, I believe evented servers + fibers remain useful only for a small number of use cases where you have tight control over the application code environment.
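The “explicit calls” point can be seen in a few lines of plain Ruby: a fiber never switches on its own, only at Fiber.yield and resume, so every blocking operation must be wrapped by hand:

```ruby
# Minimal sketch of explicit fiber scheduling: control transfers only
# at Fiber.yield and #resume, never implicitly on blocking I/O.
fiber = Fiber.new do
  Fiber.yield "first chunk"   # suspend and hand control back to the caller
  "second chunk"              # returned by the final #resume
end

first  = fiber.resume   # runs until Fiber.yield
second = fiber.resume   # resumes after the yield, runs to completion
puts first
puts second
```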

There is another way to support high I/O concurrency though: multi-threading, with 1 thread per connection. Multi-threaded systems generally do not support as much concurrent I/O as evented systems, but are still quite formidable. Multi-threaded systems are limited by things such as the thread stack size, the available virtual memory address space and the quality of the kernel scheduler. But with the right tweaking they can approach the scalability of evented systems.
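As an illustration of the 1-thread-per-connection idea, here is a minimal sketch in plain Ruby (not production code; the port and message are arbitrary). Each accepted socket gets its own thread, so a slow or idle client does not block the others:

```ruby
require 'socket'

server = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick a free port
port   = server.addr[1]

acceptor = Thread.new do
  client = server.accept
  # One thread per connection: this thread can block on writes to this
  # client without affecting any other connection.
  Thread.new(client) do |sock|
    sock.write("Progress: 100%\n")
    sock.close
  end.join
end

client = TCPSocket.new("127.0.0.1", port)
reply  = client.read
client.close
acceptor.join
server.close
puts reply
```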

And so this leaves multithreaded servers as the only serious options for handling streaming support in Rails apps. It’s very easy to make Rails and most other apps work on them. Puma has recently appeared as a server in this category. Like most other Ruby application servers, you have to start Puma at least once for every web app, and each Puma instance is to be attached to a frontend web server in a reverse proxy setup. Because Ruby 1.9 has a Global Interpreter Lock, you should start more than 1 Puma process if you want to take advantage of multiple cores. Or you can use Rubinius, which does not have a Global Interpreter Lock.

The trade-offs of the three I/O models can be summarized as follows.

Multi-process

Pros:

Simple and easy to understand. If one process crashes, the others are not affected.

Can utilize multiple cores.

Cons:

Supports very low I/O concurrency.

Uses a lot of memory.

Multi-threaded

Considerations:

Not as compatible as multi-process, although still quite good. Many libraries and frameworks support threaded environments these days. In web apps, it’s generally not too hard to make your own code thread-safe because web apps tend to be inherently embarrassingly parallel.

Can normally utilize multiple cores in a single process, but not in MRI Ruby. You can get around this by using JRuby or Rubinius.

Pros:

Supports high I/O concurrency.

Threads use less memory than processes.

Cons:

If a thread crashes, the entire process goes down.

Good luck debugging concurrency bugs.

Evented

Pros:

Extremely high I/O concurrency.

Uses even less memory than threads.

Cons:

Bad application compatibility. Most libraries are not designed for evented systems at all. Your application itself has to be aware of events for this to work properly.

If your app/libraries are evented, then you can still run into concurrency bugs like race conditions. It’s easier to avoid them in an evented system than in a threaded system, but when they do occur they are very difficult to debug.

Cannot utilize multiple cores in a single process.

As mentioned before, Phusion Passenger is currently a purely multi-processed application server. If we want to change its I/O model, which one should we choose? We believe the best answer is: all of them. We can give users a choice, and let them choose – on a per-application basis – which I/O model they want.

Phusion Passenger Enterprise 4.x (which we introduced earlier) is to become a hybrid multi-processed, multi-threaded and evented application server. You can choose with a single configuration option whether you want to stay with the traditional multi-processed I/O model, whether you want multiple threads in a single process, or whether you want processes to be evented. In the latter two cases, you even control how many processes you want, in order to take advantage of multiple cores and for resistance against crashes. We believe a combination of processes and threads/events is best.
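As a sketch of what such a per-application option could look like in an Apache setup (the directive names here are illustrative, not a final API):

```
<VirtualHost *:80>
    ServerName app.example.com
    DocumentRoot /webapps/app/public
    # Illustrative: select the I/O model and concurrency per application
    PassengerConcurrencyModel thread
    PassengerThreadCount 16
</VirtualHost>
```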

Being a hybrid server with a configurable I/O model allows Phusion Passenger to support more than just streaming. Suddenly, the possibilities become endless. We could, for example, support arbitrary TCP protocols in the future with no limits on traffic workloads.

Code has just landed in the Phusion Passenger Enterprise 4.0 branch to support multithreading. Note that the current Phusion Passenger Enterprise release is of the 3.0.x series and does not support this yet. As you can see in our roadmap, Phusion Passenger Enterprise 4.0 beta will follow 3.0.x very soon.

Subscribe to the Phusion newsletter and we’ll keep you up to date about the latest Phusion Passenger features, licensing updates and more.

Brian

I would argue that the existing method is superior in many ways. First, it doesn’t depend on threading support in your app. Second, you don’t have to worry about accidentally leaving the stream open forever. Lastly, the only real difference is calling yield instead of response.stream.write. Is that really worth adding a bunch of new code and further obscuring an already poorly documented feature that works just fine?

Brian, if by the existing method you mean Rack yield + multi-process, then it *doesn’t* work just fine. Rack yield works fine, but multi-process doesn’t. To support high I/O concurrency the application server must be either multithreaded, or evented.

Rafael Rosa Fu

Hi,

Is there any particular reason for not including JRuby in the comparison?

Tks

jrochkind

If multi-threaded app server support is only available in the “Enterprise”, it’s going to be REALLY disappointing. I believe Passenger is still by far the easiest app server to set up and maintain for rails, it Just Works, for newbies and experienced people alike. But more and more apps require better concurrency than the multi-process model —

— with Rails 4.0 going config.threadsafe! by default, IMO no app server will be sufficient for more than toy/demo use unless it supports a multi-threaded request concurrency model. If the free Passenger doesn’t… then, sure, some people will pay for Enterprise, but a whole lot of people are going to start looking for a free alternative.

Passenger has provided an _enormous_ service to the Rails community in providing a super simple to set up app server that just works and is totally powerful enough for most use cases; that’ll stop being the case if free passenger doesn’t support threading, as threading becomes more important for more Rails apps.

Michael Cohen

What is the general consensus in the Ruby community as to the viability of running Rails applications on JRuby and a Java servlet container like Tomcat? I read somewhere (a slide deck maybe) from Erica Kwan of Square where I believe she was saying that Square had a surprising number of headaches running Rails apps on JRuby (she didn’t say what container they were using, but I assumed Tomcat).

Brian

So, the current Rack yield method is not thread safe if multi-threading is enabled?

Nope, you don’t understand. Have you done much multi-threaded coding? No code is simply “thread safe” or “not thread safe”; it’s always thread safe under certain uses or contracts. Here’s a post from way back on thread-safety in rails that is a useful complement to this one: http://m.onkey.org/thread-safety-for-your-rails/

The interesting thing (meaning something different depending on your point of view) is that the situation with Rails and its supporting deployment stack hasn’t really changed all that much in the past 4 years since that blog post.

The issue with Rack yield specifically that I’m aware of is less about thread-safety than streaming: the fact that Rack middleware may (and currently, popular/standard middleware often does) break streaming by buffering the entire response before sending it on. Here’s a piece about that one: https://gist.github.com/11c3491561802e573a47 That particular problem is not so much something an app server can do something about.

“thread safety” and dealing with streaming are both fairly complicated (and inter-related) topics, with lots of integration issues among all the moving parts in a typical rails stack these days.

As the first post I quoted mentioned, the Rails framework itself has theoretically been able to handle multi-threaded concurrent request dispatch since Rails 2.2 (even before Rack, I think?). This is different from “it is thread safe”: it is safe for multi-threaded concurrent request dispatch specifically. That is, the framework code itself is (or is claimed to be); your app code or other gem code can still break this with global state that is not properly synchronized.

Because few app servers (with the notable exception of anything Java with jruby) were actually capable of dispatching multi-threaded concurrent requests to a Rails app, even though Rails claimed to be capable of it — there were bugs, which took a long time to be discovered because few were doing it. ActiveRecord has had a long and dirty history with figuring out exactly what it’s concurrency contract even _is_, let alone actually fulfilling that contract (and then there’s performance). But things have been getting gradually better in Rails internals.

What the OP doesn’t mention is that Rails 4.0, currently according to Rails core team — will have `config.threadsafe!` on by _default_ — that’s the setting that tells Rails indeed to allow multi-threaded concurrent requests. Hopefully this is a sign that the rest of the supporting deployment infrastructure will start coming around to supporting concurrency (and streaming, not the same thing) better. We’ll see. If people do start deploying like this, I suspect even more bugs in Rails and popular Rack middleware will be found.

Phew. That was long for a blog comment. But I’ve been trying to figure this stuff out for a while, and there’s a _lot_ of misunderstanding out there.

Brian

I understand the surrounding code can cause the request handling to not be thread-safe, but if the underlying Rack code was not thread-safe, then it wouldn’t matter what you did with the controller. If thread-safety is not affected by the Rack interface, then it is indeed thread-safe and there should be nothing stopping you from writing a thread-safe streaming class with Rack yield.

“Phusion” and “Phusion Passenger” are registered trademarks of Phusion. “Rails”, “Ruby on Rails” and the Rails logo are registered trademarks of David Heinemeier Hansson. All other trademarks are property of their respective owners.