In one of the exploitable scenarios, the attacker must know the session secret key. This is not so problematic for proprietary apps where the session secret is kept hidden, but is problematic for many open source Rails apps because the session secret is stored in version control for all to see. Commenter Dan Kaminsky and some other people were of the opinion that this is a bug in Rails. I disagreed because I saw it as the responsibility of the developer to omit secret keys upon public distribution.

However, regardless of whose fault it is, can this problem be prevented? Rails allows storing the session secret key in version control by default for the following reasons:

It is not a problem for the majority of the users, who write proprietary apps. The number of open source Rails apps is quite small in comparison.

It is convenient. Requiring the user (which in production environments is the system administrator, and in development environments is the developer) to set a secret key adds one more installation step which is not user friendly.

Is there a way to prevent the secret key from leaking out upon public distribution, without making things less convenient? Let’s explore a few possibilities.

Omit the secret key from version control, and require the user to manually generate one.

This works but is inconvenient. You can make it more convenient by adding a hypothetical rake save_secret task to automate it, but that’s still a manual step. It also raises the learning curve for new users. You can have rails new automatically run it for you, but that doesn’t solve things in production. You will have to generate a secret key on the production server for every app you deploy, and modify your Capistrano script to symlink to that key on every new deploy. As authors of Phusion Passenger we dislike any kind of installation-time sit-ups.

Omit the secret key from version control, but auto-generate a random key if missing.

The problem with this is that every time you redeploy with Capistrano, the secret key would be regenerated. This would invalidate all previous session cookies, but not in a nice way. Rails would detect invalid signatures on previous cookies and throw an exception. There are two solutions to this:

Make Rails silently discard invalid cookies instead of throwing an exception. Users would still be logged out on every deploy, but I believe this is a minor problem for the people who can’t be bothered to do (2).

Create the secret key on the production server once and symlink to it on every deploy, but you have to know about it and it makes your Capistrano scripts slightly larger.

A reader said that Capistrano can be modified to generate this key and store it in the ‘shared’ directory if it doesn’t exist.

Omit the secret key from version control, but auto-generate a non-random key if missing.

Instead of generating a random key, the key would depend on something that is unique to the system, so that the key changes across different machines but not on the same machine. However, secret keys are supposed to have high entropy, so you will have to choose your “something unique” very carefully. What options do we have? Off the top of my head, this is what I’ve come up with:

Host name – low entropy and can be guessed.

MAC address – it’s not inconceivable that it can be guessed.

IP address – this is public information, so not a good idea.

Modification time of the root filesystem – low entropy. There’s a high chance that the server was installed in the past 5 years.

SHA-512 of all file contents in /etc – slow, and changes your key every time you modify something in /etc.

None of these are very good sources. But maybe we can generate a machine-unique property that has high entropy. The property would be stored in /etc/machine-uuid and would be world-readable because it’s only used for deriving secret keys. During startup, Rails would check for this file and tell you to run rake secret | sudo tee /etc/machine-uuid if it doesn’t exist. This only has to be done once per machine. machine-uuid is never supposed to be distributed outside the local machine.

You don’t want all apps on the same machine to have the same HMAC key, so Rails would derive the HMAC key as follows:

hash("#{machine_uuid}-#{hostname}-#{app_name}")

where hash is a cryptographically secure hashing function.
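A minimal sketch of such a derivation in Ruby. The helper name and inputs are assumptions for illustration, not actual Rails code; SHA-512 stands in for the cryptographically secure hashing function:

```ruby
require 'digest'

# Hypothetical helper: derive a per-application HMAC key from the
# machine-wide UUID, the host name and the application name.
def derive_hmac_key(machine_uuid, hostname, app_name)
  Digest::SHA512.hexdigest("#{machine_uuid}-#{hostname}-#{app_name}")
end
```

Two apps on the same machine get different keys because app_name differs, while redeploying the same app on the same machine keeps the key stable.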

But let’s be paranoid for a bit. If the attacker has gained access to a random local user account then he can very easily derive the HMAC key. But in this case you probably have more things to worry about than whether the attacker can tamper with session data.

Allow storing the secret key in version control, but use the secret key and unique machine properties to sign HMACs.

Like with (3), machine properties must have sufficient entropy. The HMAC key would be derived as follows:

hash("#{machine_uuid}-#{hostname}-#{secret_key}")

This is more secure than (3) if you’ve set proper permissions on your application files. An attacker that has obtained access to a local user account, but not the one that your app is running on, knows machine_uuid and hostname but not secret_key. Of course he can still do other nasty things so I’m not sure whether one should worry about this scenario.

Store a default secret key in version control, but allow customizing it through an environment variable.

Unfortunately this does not solve the convenience problem. It requires the administrator to know about the secret token and it requires the administrator to perform some setup.

I prefer (2) in combination with patching Rails to discard invalid cookies instead of throwing an exception. What do you think? Can you think of any other ways to make it more secure without sacrificing convenience? Are my security analyses correct? Please leave a comment here, on Hacker News or on Reddit.

Update 2: a statement from Michael Koziarski of the Rails security team regarding the severity of this bug has been added. He urges people to upgrade immediately. Please scroll to the “Conclusion” section for details.

Update 3: new advisories (CVE-2013-0155 and CVE-2013-0156) have been published. These vulnerabilities are unrelated to the one reported in this blog post, but are extremely critical. Upgrade immediately.

Here are the facts, along with a clear explanation for non-Rails people.

Summary: what is this vulnerability?

The bug allows SQL injection through dynamic finder methods (e.g. find_by_foo(params[:foo])). I will explain dynamic finders in a bit.

The bug affects all Ruby on Rails versions.

A known exploitable scenario is when all of the following apply:

You’re using Authlogic (a third party but popular authentication library).

The attacker must know the session secret token.

There are other exploitable scenarios, but it really depends on what your app is doing. Since it is impossible to prove that something isn’t insecure, you should take the vulnerability seriously and upgrade anyway even if you think you aren’t affected.

What is this vulnerability NOT?

For those who know Rails:

The bug does not affect normal finder methods (e.g. find(params[:id])).

The bug is not exploitable through request parameters.

The bug is not in Authlogic. It’s in Rails. It just so happens that Authlogic triggers it.

Devise (another third-party authentication library) does not trigger the bug.

Point 6, as described in the Rails Security Digest’s ‘params’ case, is a totally different and unrelated issue. The issue described there is quite severe and deserves serious attention, so please keep an eye out for any new advisories.

For those who do not know Rails:

It does not mean all unupgraded Rails apps are suddenly widely vulnerable.

It does not mean Rails doesn’t escape SQL inputs.

It does not mean Rails doesn’t provide parameterized SQL APIs.

It does not mean Rails encourages code that is inherently prone to SQL injection. The code should have been safe but, due to a subtlety, was not. This has been fixed.

The main exploitable scenario

Rails provides finder methods for all ActiveRecord (database) models. For example, to look up a user using a primary key that was provided through the “id” request parameter, one would usually write:

User.find(params[:id])

Rails also provides so-called “dynamic finder methods”. It generates a “find_by_*” method for every database column in the model. If your “users” table has the “id”, “name” and “phone” columns, then it will generate methods so you can write things like this:

User.find_by_id(42)
User.find_by_name("kotori")
User.find_by_phone("555-1234")

But ActiveRecord also defines ways for the programmer to inject SQL fragments into the query so that the programmer can customize the query when necessary. The injection interfaces are documented and the programmer is not supposed to pass user input to those interfaces. Normally, the strings passed to the injection interfaces are constant strings that never change. One of those injection interfaces is the options parameter (normally second parameter) for the “find_by_*” methods:

# Fetches a user record by name, but only fetch the 'id' and 'name' fields.
User.find_by_name("kotori", :select => "id, name")
# => SELECT id, name FROM users WHERE name = 'kotori' LIMIT 1
# You can inject arbitrary SQL if you wish:
User.find_by_name("kotori", :select => "id, name FROM users; DROP TABLE users; --")
# => SELECT id, name FROM users; DROP TABLE users; -- FROM users WHERE name = 'kotori' LIMIT 1

The vulnerability lies in the fact that “find_by_*” also accepted calls in which only the options parameter is given. In that case, it thinks that the value parameter is nil.
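In other words, a call of the following form was interpreted as options-only. This is a hypothetical illustration of the pre-patch behavior; the exact generated SQL may differ:

```ruby
# The hash is treated as the options parameter, and the name value is
# assumed to be nil:
User.find_by_name(:select => "id, name FROM users; DROP TABLE users; --")
# The generated query would be along the lines of:
#   SELECT id, name FROM users; DROP TABLE users; -- FROM users WHERE name IS NULL LIMIT 1
```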

Not many people ever use the second parameter, but code of the following form is quite common:

User.find_by_name(params[:name])

params[:name] is normally a string. Can an attacker somehow ensure that params[:name] is an options hash? Yes. Rails converts request parameters of a certain form into hashes. Suppose you call the controller method like this:

GET /lookup?name[foo]=bar

Rails will then set params[:name] to { "foo" => "bar" }.

However, this is not exploitable. Ruby has two relevant datatypes here: strings and symbols. Symbols are kind of like string constants. You’ve seen them before in this article: :select is a symbol. The vulnerability can only be triggered when the keys are symbols, but the Rails-generated request parameter hashes all have string keys thanks to the way HashWithIndifferentAccess works.

An attacker can only exploit this if the application somehow passes an arbitrary hash to “find_by_*”, yet with symbol keys. We now bring in the second part of the puzzle: Authlogic. This exploit is described here and works as follows.

Authlogic accepts authentication credentials through multiple ways: cookies, Rails session data, HTTP basic authentication, etc. All user accounts have a so-called persistence token, and the user must provide this persistence token through one of the authentication methods in order to authenticate himself. Authlogic looks up the user associated with the persistence token using roughly the following call:

User.find_by_persistence_token(the_token)

Can an attacker ensure that the_token is an options hash? Yes, but only through the Rails session data authentication method. In all the other methods, the_token is always a string.

The Rails session mechanism allows storing arbitrary Ruby objects, including hashes with symbol keys. Rails provides a variety of session stores, the default being the cookie store which stores session data in a cookie on the client. The cookie data is not encrypted, but is signed with an HMAC to prevent tampering. The cookie store is fast, does not require any server-side maintenance, and is only meant for session data that do not contain sensitive information such as credit card numbers. Apps that store sensitive information in the session should use the database session store instead. Nevertheless, it turned out that 95% of all Rails apps only ever store the user authentication credentials in the session, so the cookie store was made the default.
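The signing scheme can be sketched in a few lines of Ruby. This is a simplified stand-in, in the spirit of ActiveSupport::MessageVerifier, not the actual Rails implementation:

```ruby
require 'base64'
require 'openssl'

# Serialize the session data, then append an HMAC over the serialized
# payload. The client can read the payload, but cannot alter it without
# invalidating the signature.
def sign_session(data, secret)
  payload = Base64.strict_encode64(Marshal.dump(data))
  digest  = OpenSSL::HMAC.hexdigest('SHA1', secret, payload)
  "#{payload}--#{digest}"
end

# Recompute the HMAC and reject the cookie if it does not match.
def verify_session(cookie, secret)
  payload, digest = cookie.split('--', 2)
  expected = OpenSSL::HMAC.hexdigest('SHA1', secret, payload)
  raise 'invalid signature' unless digest == expected
  Marshal.load(Base64.strict_decode64(payload))
end
```

Without knowing the secret, an attacker cannot produce a payload and digest pair that passes verification.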

So to inject arbitrary SQL, you need to tamper with the cookie, which requires the HMAC key. The HMAC key is the so-called session secret. As the name implies, it is supposed to be secret. Rails generates a random 512-bit secret upon project creation. This is why most Rails apps that are running Authlogic are not exploitable: the attacker does not know the secret. Open source Rails apps, however, can pose a problem. Many of them come with a default session secret which the user never customizes, so all those instances end up using the same HMAC key, making them very easily exploitable. Of course, in this case the operator has to worry about more than just SQL injection. If the HMAC key is known, then anybody can send fake credentials to the app.

Other exploitable scenarios

Your code is vulnerable if you call Foo.find_by_whatever(bar), where bar can be an arbitrary user-specified hash with symbol keys. HashWithIndifferentAccess stores keys as strings, not symbols, so that does not trigger the vulnerability.

Mitigation

There are several ways to mitigate this.

Upgrade to the latest Rails version. This solves everything; you don’t need to do anything else. “find_by_*” has been patched so that the first parameter may not be an options hash.

Ensure that you only pass strings or integers to “find_by_*”, e.g. find_by_name(params[:name].to_s). This requires changing all code, including third party code. I do not recommend this as you’ll likely overlook things. If you upgrade Rails, you don’t need to do this.

Keep your session secret secret! If you write open source Rails apps, make sure the user generates a different session secret upon installation. Don’t let them use the default one.

Demo app

We’ve put together a demo app which shows that the bug is not exploitable through request parameters. Set up this app, run it, and try to attack it as follows:

curl 'http://127.0.0.1:3000/?id\[limit\]=1'

You will see that this attack does not succeed in injecting SQL.

Conclusion

So here you have it. Some folks on Hacker News asked “how can this giant bug be overlooked”? As you can see, it is not a “giant bug”, it is much more subtle than that and requires a specific combination of code and circumstances to work. Most apps are not vulnerable.

Update: Michael Koziarski of the Rails security team said the following:

“When we told people they should upgrade immediately we meant it. It *is* exploitable under some circumstances, so people should be upgrading immediately to avoid the risk.”


TLDR: Rails Live Streaming allows Rails to compete with Node.js in the streaming arena. Streaming requires application servers to support either multi-threaded or evented I/O. Most Ruby application servers are not up for the job. Phusion Passenger Enterprise 4.0 (a Ruby app server) is to become hybrid multi-processed, multi-threaded and evented. This allows seamless support for streaming, provides excellent backwards compatibility, and allows future support for more use cases than streaming alone.

Several days ago Rails introduced Live Streaming: the ability to send partial responses to the client immediately. This is a big deal because it opens up a huge number of use cases that Rails simply wasn’t suitable for. Rails was and still is an excellent choice for “traditional” web apps where the user sends a request and expects a full response back. It was a bad choice for anything that works with response streams, for example:

Progress responses that continuously inform the user about the progress. Imagine a web application that performs heavy calculations that can take several minutes. Before Live Streaming, you had to split this system up into multiple pages that must respond immediately. The main page would offload the actual work into a background worker, and return a response informing the user that the work is now in progress. The user must poll a status page at a regular interval to look up the progress of the work. With Live Streaming, you can not only simplify the code by streaming progress information in a single request, but also push progress information to the user much more quickly and without polling.

Chat servers. Or, more generally, web apps that involve a large number of mostly idle but persistent connections. Until today this has largely been the domain of evented systems such as Node.js and Erlang.

And as Aaron Patterson has already explained, this feature is different from Rails 3.2’s template streaming.
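To illustrate the progress use case above, a minimal Live Streaming controller might look like this. WorkModel is hypothetical; ActionController::Live and response.stream are the actual Rails 4 APIs:

```ruby
class ProgressController < ApplicationController
  include ActionController::Live

  def show
    response.headers['Content-Type'] = 'text/plain'
    work = WorkModel.new
    # Each chunk is pushed to the client as soon as it is written,
    # instead of being buffered until the action finishes.
    until work.done?
      work.do_some_calculations
      response.stream.write("Progress: #{work.progress}%\n")
    end
  ensure
    response.stream.close
  end
end
```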

Just “possible” is not enough

The same functionality was actually already technically possible in Ruby. According to the Rack spec, Rack app objects must return a three-element array:

[status_code, headers, body]

Here, body must respond to the each method. You can implement live streaming by yourself, with raw Rack, by returning a body object that yields partial responses in its each method.

class StreamingBody
  def each
    work = WorkModel.new
    while !work.done?
      work.do_some_calculations
      yield "Progress: #{work.progress}%\n"
    end
  end
end

Notice that the syntax is nearly identical to the Rails controller example code. With this, it is possible to implement anything.
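To make this concrete, here is the same body driven the way a Rack server would drive it, with a stand-in WorkModel (hypothetical) that finishes in four steps:

```ruby
# Stand-in WorkModel so the streaming body can be exercised without a
# web server; a real model would perform actual calculations.
class WorkModel
  attr_reader :progress

  def initialize
    @progress = 0
  end

  def done?
    @progress >= 100
  end

  def do_some_calculations
    @progress += 25
  end
end

class StreamingBody
  def each
    work = WorkModel.new
    until work.done?
      work.do_some_calculations
      yield "Progress: #{work.progress}%\n"
    end
  end
end

# A Rack server would call #each and flush every chunk to the client as
# it is yielded; here we simply collect the chunks.
chunks = []
StreamingBody.new.each { |chunk| chunks << chunk }
```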

However, streaming in Ruby has never gained much traction compared to systems such as Node.js. The latter is much more popular for these kinds of use cases. I believe this inequality in popularity is caused by a few things:

Awareness. Not everybody knew this was possible in Ruby. Indeed, it is not widely documented.

Ease and support. Some realize this is possible, but choose not to use Ruby because many frameworks do not provide easy support for streaming. It was possible to stream responses in pre-4.0 Rails, but the framework code generally does not take streaming into account, so if you try to do anything fancy you run the risk of breaking things.

With Live Streaming, streaming is now easy to use as well as officially supported.

Can Rails compete with Node.js?

Node.js is gaining more and more momentum nowadays. As I see it there are several reasons for this:

Love for JavaScript. Some developers prefer JavaScript over Ruby, for whatever reasons. Some like the idea of using the same language for both frontend and backend (although whether code can be easily shared between frontend and backend remains a controversial topic among developers). Others like the V8 engine for its speed. Indeed, V8 is a very well-optimized engine, much more so than Ruby 1.9’s YARV engine.

Excellent support for high I/O concurrency use cases. Node.js is an evented I/O system, and evented systems can handle a massive amount of concurrent connections. All libraries in the Node.js ecosystem are designed for evented use cases, because there’s no other choice. In other languages you have to specifically look for evented libraries, so the signal-to-noise ratio is much lower.

I have to be careful here: the phrases “high I/O concurrency” and “massive amount of concurrent connections” deserve more explanation, because it’s easy to confuse them with “uber fast” or “massively scalable”. That is not what I meant. What I meant is: a single Node.js process is capable of handling a lot of client sockets, assuming that any work you perform does not saturate memory, CPU or bandwidth. In contrast, Ruby systems traditionally could only handle 1 concurrent request per process, even if you don’t do much work inside a request. We call this a multi-process I/O model because the number of concurrent users (I/O) the system can handle scales only with the number of processes.

In traditional web apps that send back full responses, this is not a problem because the web server queues all requests, the processes respond as quickly as possible (usually saturating the CPU) and the web server buffers all responses and relieves the processes immediately. In streaming use cases, you have long-running requests so the aforementioned mechanism of letting the web server buffer responses is simply not going to work. You need more I/O concurrency: either you must have more processes, or processes must be able to handle more than 1 request simultaneously. Node.js processes can effectively handle an unlimited number of requests simultaneously, when not considering any constraints posed by CPU, memory or bandwidth.

Node.js is more than HTTP. It allows arbitrary networking with TCP and UDP. Rails is pretty much only for HTTP and even support for WebSockets is dubious, even in raw Rack. It cannot (and I believe, should not) compete with Node.js on everything, but still… Now suddenly, Rails can compete with Node.js for a large number of use cases.

Two sides of the coin

Reality is actually a bit more complicated than this. Although Rails can handle streaming responses now, not all Ruby application servers can. Ilya Grigorik described this problem in his article Rails Performance Needs an Overhaul and criticized Phusion Passenger, Mongrel and Unicorn for being purely multi-process, and thus not able to support high concurrency I/O use cases. (Side note: I believe the article’s title was poorly chosen; it criticizes I/O concurrency support, not performance.)

Mongrel’s current maintenance status appears to be in limbo. Unicorn is well-maintained, but its author Eric Wong has explicitly stated in his philosophy that Unicorn is to remain a purely multi-processed application server, with no intention to ever become multithreaded or evented. Unicorn is explicitly designed to handle fast responses only (so no streaming responses).

At the time Ilya Grigorik’s article was written, Thin was the only application server that was able to support high I/O concurrency use cases. Built on EventMachine, Thin is evented, just like Node.js. Since then, another evented application server called Goliath has appeared, also built on EventMachine. However, evented servers require evented application code, and Rails is clearly not evented.

There have been attempts to make serial-looking code evented through the use of Ruby 1.9 fibers, e.g. through the em-synchrony gem, but in my opinion fibers cause more problems than they solve. Ruby 1.8’s green threading model was essentially already like fibers: there was only one OS thread, and the Ruby green thread scheduler switched context upon encountering a blocking I/O operation. Fibers also operate within a single OS thread, but you can only context switch with explicit calls. In other words, you have to go through each and every blocking I/O operation you perform and insert fiber context switching logic, which Ruby 1.8 already did for you. Worse, fibers give the illusion of thread safety, while in reality you can run into the same concurrency problems as with threading. But this time, you cannot easily apply locks to prevent unwanted context switching. Unless the entire ecosystem is designed around fibers, I believe evented servers + fibers only remain useful for a small number of use cases where you have tight control over the application code environment.

There is another way to support high I/O concurrency though: multi-threading, with 1 thread per connection. Multi-threaded systems generally do not support as much concurrent I/O as evented systems, but are still quite formidable. Multi-threaded systems are limited by things such as the thread stack size, the available virtual memory address space and the quality of the kernel scheduler. But with the right tweaking they can approach the scalability of evented systems.

And so this leaves multithreaded servers as the only serious options for handling streaming support in Rails apps. It’s very easy to make Rails and most other apps work on them. Puma has recently appeared as a server in this category. Like most other Ruby application servers, you have to start Puma at least once for every web app, and each Puma instance is to be attached to a frontend web server in a reverse proxy setup. Because Ruby 1.9 has a Global Interpreter Lock, you should start more than 1 Puma process if you want to take advantage of multiple cores. Or you can use Rubinius, which does not have a Global Interpreter Lock.

Multi-process

Pros:

Simple and easy to understand. If one process crashes, the others are not affected.

Can utilize multiple cores.

Cons:

Supports very low I/O concurrency.

Uses a lot of memory.

Multi-threaded

Considerations:

Not as compatible as multi-process, although still quite good. Many libraries and frameworks support threaded environments these days. In web apps, it’s generally not too hard to make your own code thread-safe because web apps tend to be inherently embarrassingly parallel.

Can normally utilize multiple cores in a single process, but not in MRI Ruby. You can get around this by using JRuby or Rubinius.

Pros:

Supports high I/O concurrency.

Threads use less memory than processes.

Cons:

If a thread crashes, the entire process goes down.

Good luck debugging concurrency bugs.

Evented

Pros:

Extremely high I/O concurrency.

Uses even less memory than threads.

Cons:

Bad application compatibility. Most libraries are not designed for evented systems at all. Your application itself has to be aware of events for this to work properly.

If your app/libraries are evented, then you can still run into concurrency bugs like race conditions. It’s easier to avoid them in an evented system than in a threaded system, but when they do occur they are very difficult to debug.

Cannot utilize multiple cores in a single process.

As mentioned before, Phusion Passenger is currently a purely multi-processed application server. If we want to change its I/O model, which one should we choose? We believe the best answer is: all of them. We can give users a choice, and let them choose – on a per-application basis – which I/O model they want.

Phusion Passenger Enterprise 4.x (which we introduced earlier) is to become a hybrid multi-processed, multi-threaded and evented application server. You can choose with a single configuration option whether you want to stay with the traditional multi-processed I/O model, whether you want multiple threads in a single process, or whether you want processes to be evented. In the latter two cases, you even control how many processes you want, in order to take advantage of multiple cores and for resistance against crashes. We believe a combination of processes and threads/events is best.

Being a hybrid server with configurable I/O model allows Phusion Passenger to support more than just streaming. Suddenly, the possibilities become endless. We could for example support arbitrary TCP protocols in the future with no limits on traffic workloads.

Code has just landed in the Phusion Passenger Enterprise 4.0 branch to support multithreading. Note that the current Phusion Passenger Enterprise release is of the 3.0.x series and does not support this yet. As you can see in our roadmap, Phusion Passenger Enterprise 4.0 beta will follow 3.0.x very soon.

Subscribe to the Phusion newsletter and we’ll keep you up to date about the latest Phusion Passenger features, licensing updates and more.

I think Bundler is a great tool. Its strength lies not in its ability to install all the gems that you’ve specified, but in automatically figuring out a correct dependency graph so that no two gems conflict with each other, and in the fact that it gives you rock-solid guarantees that whatever gems you’re using in development are exactly what you get in production. No more weird gem version conflict errors.

This is awesome for most Ruby web apps that are meant to be used internally, e.g. things like Twitter, Basecamp, Union Station. Unfortunately, this strength also turns into a kind of weakness when it comes to public apps like Redmine and Juvia. These apps typically allow the user to choose their database driver through config/database.yml. However the driver must also be specified inside Gemfile, otherwise the app cannot load it. The result is that the user has to edit both database.yml and Gemfile, which introduces the following problems:

The user may not necessarily be a Ruby programmer. The Gemfile will confuse him.

The user is not able to use the Gemfile.lock that the developer has provided. This makes installing in deployment mode with the developer-provided Gemfile.lock impossible.

This can be worked around in a very messy way with groups. For example:

group :driver_sqlite do
  gem 'sqlite3'
end
group :driver_mysql do
  gem 'mysql'
end
group :driver_postgresql do
  gem 'pg'
end

And then, if the user chose to use MySQL:

bundle install --without='driver_postgresql driver_sqlite'

This is messy because you have to exclude all the things you don’t want. If the app supports 10 database drivers then the user has to put 9 drivers on the exclusion list.

How can we make this better? I propose supporting conditionals in the Gemfile language. For example:
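A sketch of what these conditionals could look like. This syntax is hypothetical, a proposal rather than an existing Bundler feature, mirroring the condition form used for the nil case below:

```ruby
condition :driver => 'sqlite' do
  gem 'sqlite3'
end
condition :driver => 'mysql' do
  gem 'mysql'
end
condition :driver => 'postgresql' do
  gem 'pg'
end
```

The user would then select a single driver at install time, e.g. with a hypothetical `bundle install --condition driver=mysql`, instead of having to exclude every driver they don’t want.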

Bundler should enforce that the driver condition is set: if it’s not set then it should raise an error. To allow for the driver condition to not be set, the developer must explicitly define that the condition may be nil:

condition :driver => nil do
  gem 'null-database-driver'
end

Here, bundle install will install null-database-driver.

With this proposal, user installation instructions can be reduced to these steps:

Edit config/database.yml and set your database driver of choice.

Run bundle install with the matching driver condition.

The upcoming Rails 3.1 will come with a powerful asset pipeline, which is a framework for allowing developers to preprocess, concatenate, minify and compress Javascripts and CSS files. Javascripts can be written in CoffeeScript, which is compiled on-the-fly to Javascript. Similarly, CSS can be written in Sass or SCSS, which are compiled on-the-fly to plain old CSS. Many people have already written about this topic.

Today we ran into a situation in which we wanted to render an SCSS file to a string so that we can put it in a JSON document. The way to do this is non-obvious. First try:

render :template => 'assets/stylesheets/api.css'

Rails complained that it cannot find this file in its search path, which only includes ‘app/views’.

Next try involved using the :file option because it also searches in Rails.root:

render :file => 'app/assets/stylesheets/api.css'

This time it successfully found the file, but couldn’t render it because there’s no :scss template engine. Apparently the asset pipeline does not integrate into the Rails template engine framework.

After some code digging, it turned out that asset serving is completely powered by Sprockets. The Rack middleware responsible for serving /assets is Sprockets::Server, which looks up asset information using a Sprockets::Environment object. This object also handles rendering. The Rails integration allows such an object to be accessed through the YourApp::Application.assets method.

So the correct way to render an asset to a string is as follows:

YourApp::Application.assets.find_asset('api.css').body

Here, YourApp is your application’s module name as found in config/application.rb.

Version 1.0.5 fixes support for Rails 3.0 and Rails 3.1. Unfortunately we had to remove one feature due to changes in ActiveRecord 3.1. It is no longer possible to access associations in the default_value_for block. The following is no longer possible:
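The example here can be reconstructed from the surrounding description; the model and attribute names below are assumptions, not the original code:

```ruby
class User < ActiveRecord::Base
  belongs_to :organization
  # Derive the default name from the associated organization.
  default_value_for :name do |user|
    "Drone for #{user.organization.name}"
  end
end

organization = Organization.create(:name => 'Evil Organization')
user = organization.users.build
user.name
```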

One would expect the last expression to evaluate to “Drone for Evil Organization”. On Rails 2 and Rails 3.0, it does, but on Rails 3.1 it does not. Because of the way the ‘organization’ attribute is assigned to the model by ActiveRecord, we are no longer able to support this behavior. Inside the default_value_for block, model.organization would evaluate to nil.

Igvita.com has recently published the article Rails Performance Needs an Overhaul. Rails performance… no, Ruby performance… no, Rails scalability… well, something is being criticized here. From my experience, talking about scalability and performance can be a bit confusing because the terms can mean different things to different people and/or in different situations, yet the meanings are used interchangeably all the time. In this post I will take a closer look at Igvita’s article.

Performance vs scalability

Let us first define performance and scalability. I define performance as throughput: the number of requests per second. I define scalability as the number of users a system can concurrently handle. There is a correlation between performance and scalability. Higher performance means each request takes less time, and so is more scalable, right? Sometimes yes, but not necessarily. It is entirely possible for a system to be scalable, yet have a lower throughput than a system that’s not as scalable, or for a system to be uber-fast yet not very scalable. Throughout this blog post I will show several examples that highlight the difference.

“Scalability” is an extremely loaded word and people often confuse it with “being able to handle tons and tons of traffic”. Let’s use a different term that better reflects what Igvita is actually criticizing: concurrency. Igvita claims that concurrency in Ruby is pathetic, referring to database drivers, Ruby application servers, etc. Some practical examples that demonstrate what he means are as follows.

Limited concurrency at the app server level

Mongrel, Phusion Passenger and Unicorn all use a “traditional” multi-process model in which multiple Ruby processes are spawned, each process handling a single request at a time. Thus, concurrency is (assuming that the load balancer has infinite concurrency) limited by the number of Ruby processes: having 5 processes allows you to handle 5 users concurrently.

Threaded servers, where the server spawns multiple threads, each handling 1 connection, allow more concurrency because it’s possible to spawn a whole lot more threads than processes. In the context of Ruby, each Ruby process needs to load its own copy of the application code and other resources, so memory usage increases very quickly as you spawn additional processes. Phusion Passenger with Ruby Enterprise Edition alleviates this problem somewhat by using copy-on-write optimizations which save memory, so you can spawn a few more processes, but not significantly (as in 10x) more. In contrast, a multi-threaded app server does not need as much memory because all threads share application code with each other, so you can comfortably spawn tens or hundreds of threads. At least, that is the theory. I will later explain why this does not necessarily hold for Ruby.

When it comes to performance, however, there’s no difference between processes and threads. If you compare a well-written multi-threaded app server with 5 threads to a well-written multi-process app server with 5 processes, you won’t find either being more performant than the other. Context switch overhead between processes and between threads is roughly the same. Each process can use a different CPU core, as can each thread, so there’s no difference in multi-core utilization either. This reflects back on the difference between scalability/concurrency and performance.

Multi-process Rails app servers have a concurrency level that can be counted on a single hand, or, if you have very beefy hardware, a concurrency level in the range of a couple of tens, thanks to the fact that Rails needs about 25 MB per process. Multi-threaded Rails app servers can in theory spawn a couple of hundred threads. After that it’s also game over: an operating system thread needs a couple of MB of stack space, so after a couple of hundred threads you’ll run out of virtual memory address space on 32-bit systems even if you don’t actually use that much memory.

There is another class of servers, the evented ones. These servers are actually single-threaded, but they use a reactor style I/O dispatch architecture for handling I/O concurrency. Examples include Node.js, Thin (built on EventMachine) and Tornado. These servers can easily have a concurrency level of a couple of thousand. But due to their single-threaded nature they cannot effectively utilize multiple CPU cores, so you need to run a couple of processes, one per CPU core, to fully utilize your CPU.

The limits of Ruby threads

Ruby 1.8 uses userspace threads, not operating system threads. This means that Ruby 1.8 can only utilize a single CPU core no matter how many Ruby threads you create. This is why one typically needs multiple Ruby processes to fully utilize one’s CPU cores. Ruby 1.9 finally uses operating system threads, but it has a global interpreter lock, which means that each time a Ruby 1.9 thread is running it will prevent other Ruby threads from running, effectively making it the same multicore-wise as 1.8. This is also explained in an earlier Igvita article, Concurrency is a Myth in Ruby.

On the bright side, not all is bad. Ruby 1.8 internally uses non-blocking I/O, while Ruby 1.9 releases the global interpreter lock while doing I/O. So if one Ruby thread is blocked on I/O, another Ruby thread can continue execution. Likewise, Ruby is smart enough to make things like sleep() and even waitpid() yield to other threads.

On the dark side however, Ruby internally uses the select() system call for multiplexing I/O. select() can only handle 1024 file descriptors on most systems so Ruby cannot handle more than this number of sockets per Ruby process, even if you are somehow able to spawn thousands of Ruby threads. EventMachine works around this problem by bypassing Ruby’s I/O code completely.

Naive native extensions and third party libraries

So just run a couple of multi-threaded Ruby processes, one process per core with multiple threads per process, and all is fine, and we should be able to have a concurrency level of up to a couple of hundred, right? Well, not quite; there are a number of issues hindering this approach:

Some third party libraries and Rails plugins are not thread-safe. Some aren’t even reentrant. For example Rails < 2.2 suffered from this problem. The app itself might not be thread-safe.

Although Ruby is smart enough not to let I/O block all threads, the same cannot be said of all native extensions. The MySQL extension is the most infamous example: when executing queries, other threads cannot run.

Mongrel is actually multi-threaded, but in practice everybody uses it in multi-process mode (mongrel_cluster), exactly because of these problems. This is also the reason why Phusion Passenger went the multi-process route.

And even though Thin is evented, a typical Ruby web application running on Thin still cannot handle thousands of concurrent users. This is because evented servers require a special evented programming style, such as the one seen in Node.js and EventMachine, while typical Ruby web apps are written in a traditional blocking style. A Ruby web app that is written in an evented style and runs on Thin can definitely handle a large number of concurrent users.

When is limited application server concurrency actually a problem?

Igvita is clearly disappointed at all the issues that hinder Ruby web apps from achieving high concurrency. However, for many web applications I would argue that limited concurrency is not a problem.

Web applications that are slow, as in CPU-heavy, max out CPU resources pretty quickly so increasing concurrency won’t help you.

Web applications that are fast are typically quick enough at handling the load that even large numbers of users won’t notice the limited concurrency of the server.

Having a concurrency level of 5 does not mean that the app server can only handle 5 requests per second; it’s not hard to serve hundreds of requests per second with only a couple of single-threaded processes.
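To make the arithmetic concrete, here is a back-of-the-envelope sketch (the numbers are purely illustrative assumptions):

```ruby
# Illustrative numbers only: 5 single-threaded processes, each taking
# 10 ms per request, yield 500 requests per second of throughput even
# though the concurrency level is just 5.
processes     = 5       # concurrency level
response_time = 0.010   # seconds per request

requests_per_second = processes * (1.0 / response_time)
requests_per_second  # => 500.0
```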

The problem becomes most evident for web applications that have to wait a lot on I/O (besides their own HTTP request/response cycle). Examples include:

Apps that have to spend a lot of time waiting on the database.

Apps that perform a lot of external HTTP calls that respond slowly.

Chat apps. These apps typically have thousands of users, most of them doing nothing most of the time, but they all require a connection (unless your app uses polling, but that’s a whole different discussion).

We at Phusion have developed a number of web applications for clients that fall in the second category, the most recent one being a Hyves gadget. Hyves is the most popular social network in the Netherlands and they get thousands of concurrent visitors during the day. The gadget that we’ve developed has to query external HTTP servers very often, and these servers can take 10 seconds to respond in extreme cases. The servers are running Phusion Passenger with maybe a couple tens of processes. If every request to our gadget also causes us to wait 10 seconds for the external HTTP call then we’d soon run out of concurrency.

But even supposing that our app and Phusion Passenger could have a concurrency level of a couple of thousand, all of those visitors would still have to wait 10 seconds for the external HTTP calls, which is obviously unacceptable. This is another example that illustrates the difference between scalability and performance. We solved this problem by aggressively caching the results of the HTTP calls, minimizing the number of external HTTP calls that are necessary. The result is that even though the application’s concurrency is fairly limited, it can still comfortably serve many concurrent users with a reasonable response time.
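The caching approach can be sketched as a tiny TTL cache; this is a framework-free illustration, not our production code, and the class and method names are made up:

```ruby
# Minimal sketch of the caching strategy described above: results of a
# slow external call are cached for a TTL, so most requests never wait.
class ExternalCallCache
  def initialize(ttl)
    @ttl   = ttl
    @cache = {}
  end

  def fetch(key)
    entry = @cache[key]
    if entry && Time.now - entry[:at] < @ttl
      entry[:value]                       # cache hit: no slow call made
    else
      value = yield                       # cache miss: perform the slow call
      @cache[key] = { :value => value, :at => Time.now }
      value
    end
  end
end

cache = ExternalCallCache.new(60)
calls = 0
3.times { cache.fetch(:gadget) { calls += 1; "result" } }
calls  # the slow call executed only once despite three requests
```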

This anecdote should explain why I believe that web apps can get very far despite having a limited concurrency level. That said, as Internet usage continues to increase and websites get more and more users, we may eventually reach a point where a much larger concurrency level is required than most of our current Ruby tools allow (assuming server capacity doesn’t scale quickly enough).

What was Igvita.com criticizing?

Igvita.com does not appear to be criticizing Ruby or Rails for being slow. It doesn’t even appear to be criticizing the lack of Ruby tools for achieving high concurrency. It appears to be criticizing these things:

Rails and most Ruby web application servers don’t allow high concurrency by default.

Many database drivers and libraries hinder concurrency.

Although alternatives exist that allow concurrency, you have to go out of your way to find them.

There appears to be little motivation in the Ruby community for making the entire stack of web framework + web app server + database drivers etc. scalable by default.

This is in contrast to Node.js where everything is scalable by default.

Do I understand Igvita’s frustration? Absolutely. Do I agree with it? Not entirely. The same thing that makes Node.js so scalable is also what makes it relatively hard to program for. Node.js enforces a callback style of programming, and this can eventually make your code look a lot more complicated and harder to read than regular code that uses blocking calls. Furthermore, Node.js is relatively young – of course you won’t find any Node.js libraries that don’t scale! But if people ever use Node.js for things other than high-concurrency server apps, then non-scalable libraries will eventually pop up. And then you will have to look harder to avoid these libraries. There is no silver bullet.

That said, all would be well if at least the preferred default stack could handle high concurrency by default. This means, for example, fixing the MySQL extension and having the fix published upstream. The mysqlplus extension fixes this, but for some reason its changes haven’t been accepted and published by the original author, and so people end up with a multi-thread-killing database driver by default.

Is Node.js innovative? Is Ruby lacking innovation?

A minor gripe that I have with the article is that Igvita calls Node.js innovative while seemingly implying that the Ruby stack isn’t innovating. Evented servers like Node.js have actually been around for years, and the evented pattern was well known long before Ruby or JavaScript became popular. Thin is also evented and predates Node.js by several years. Thin and EventMachine also allow Node.js-style evented programming. The only innovation that Node.js brings, in my opinion, is the fact that it’s JavaScript. The other “innovation” is the lack of non-scalable libraries.

Conclusion

Igvita appears to be criticizing something other than Rails performance, despite what his article’s title implies.

I don’t think the concurrency levels that the Rails stack provides by default are that bad in practice. But as a fellow programmer, it does intuitively bother me that our laptops, which are a million times more powerful than supercomputers from two decades ago, cannot comfortably handle a couple of thousand concurrent users. We can definitely work towards something better, but in the meantime let’s not forget that the current stack is more than capable of Getting Work Done(tm).

EncryptedCookieStore is similar to Ruby on Rails’s CookieStore (it saves session data in a cookie), but it uses encryption so that people can’t read what’s in the session data. This makes it possible to store sensitive data in the session.

EncryptedCookieStore is written for Rails 2.3. Other versions of Rails have not been tested.

Note: This is not ThinkRelevance’s EncryptedCookieStore. In the Rails 2.0 days they wrote an EncryptedCookieStore, but it seems their repository has gone defunct and their source code has been lost. This EncryptedCookieStore was written from scratch by Phusion.

Installation and usage

Edit config/initializers/session_store.rb and set your session store to EncryptedCookieStore:

ActionController::Base.session_store = EncryptedCookieStore

You need to set a few session options before EncryptedCookieStore is usable. You must set all options that CookieStore needs, plus an encryption key that EncryptedCookieStore needs. In session_store.rb:

ActionController::Base.session = {
  # CookieStore options...
  :key    => '_session',       # Name of the cookie which contains the session data.
  :secret => 'b4589cc9...',    # A secret string used to generate the checksum for
                               # the session data. Must be longer than 64 characters
                               # and be completely random.

  # EncryptedCookieStore options...
  :encryption_key => 'c306779f3...',  # The encryption key. See below for notes.
}

The encryption key must be a hexadecimal string of exactly 32 bytes. It must be entirely random; a predictable key weakens the encryption.

You can generate a new encryption key by running rake secret:encryption_key. This command will output a random encryption key that you can then copy and paste into your environment.rb.
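The task’s output can be approximated in plain Ruby; this is a sketch of what rake secret:encryption_key presumably does, assuming “exactly 32 bytes” refers to a 32-character hexadecimal string (i.e. 16 random bytes):

```ruby
require 'securerandom'

# Generate a random hexadecimal encryption key. SecureRandom.hex(16)
# draws 16 random bytes and encodes them as 32 hex characters.
encryption_key = SecureRandom.hex(16)
puts encryption_key   # e.g. "c306779f3..." (32 hex characters)
```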

Operational details

Upon generating cookie data, EncryptedCookieStore generates a new, random initialization vector for encrypting the session data. This initialization vector is itself encrypted with 128-bit AES in ECB mode. The session data is first protected with an HMAC to prevent tampering. The session data, along with the HMAC, is then encrypted using 256-bit AES in CFB mode with the generated initialization vector. This encrypted session data + HMAC is then stored, along with the encrypted initialization vector, in the cookie.

Upon unmarshalling the cookie data, EncryptedCookieStore decrypts the encrypted initialization vector and uses that to decrypt the encrypted session data + HMAC. The decrypted session data is then verified against the HMAC.

The reason why HMAC verification occurs after decryption instead of before decryption is because we want to be able to detect changes to the encryption key and changes to the HMAC secret key, as well as migrations from CookieStore. Verifying after decryption allows us to automatically invalidate such old session cookies.
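The marshal/unmarshal cycle described above can be sketched in plain Ruby with the openssl standard library. This is a hedged illustration, not EncryptedCookieStore’s actual code: the SHA1 HMAC, the key sizes, and the helper names (seal, open_cookie) are assumptions.

```ruby
require 'openssl'

SECRET   = OpenSSL::Random.random_bytes(64)  # HMAC secret key (assumed size)
DATA_KEY = OpenSSL::Random.random_bytes(32)  # 256-bit AES key for session data
IV_KEY   = OpenSSL::Random.random_bytes(16)  # 128-bit AES key for the IV

# Marshal: generate a random IV, HMAC the data, encrypt data + HMAC
# with AES-256-CFB, and encrypt the IV itself with AES-128-ECB.
def seal(data)
  iv = OpenSSL::Random.random_bytes(16)

  iv_cipher = OpenSSL::Cipher.new('aes-128-ecb').encrypt
  iv_cipher.key = IV_KEY
  encrypted_iv = iv_cipher.update(iv) << iv_cipher.final

  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA1'), SECRET, data)
  cipher = OpenSSL::Cipher.new('aes-256-cfb').encrypt
  cipher.key = DATA_KEY
  cipher.iv  = iv
  payload = cipher.update(data + hmac) << cipher.final

  [encrypted_iv, payload]
end

# Unmarshal: decrypt the IV, decrypt the payload, then verify the HMAC.
# Note that verification happens after decryption, as explained above.
def open_cookie(encrypted_iv, payload)
  iv_cipher = OpenSSL::Cipher.new('aes-128-ecb').decrypt
  iv_cipher.key = IV_KEY
  iv = iv_cipher.update(encrypted_iv) << iv_cipher.final

  cipher = OpenSSL::Cipher.new('aes-256-cfb').decrypt
  cipher.key = DATA_KEY
  cipher.iv  = iv
  plain = cipher.update(payload) << cipher.final

  data, hmac = plain[0...-20], plain[-20..-1]  # a SHA1 HMAC is 20 bytes
  expected = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA1'), SECRET, data)
  raise 'session data tampered with' unless hmac == expected
  data
end
```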

EncryptedCookieStore is quite fast: it is able to marshal and unmarshal a simple session object 5000 times in 8.7 seconds on a MacBook Pro with a 2.4 GHz Intel Core 2 Duo (in battery mode). This is about 1.74 ms per marshal+unmarshal action. See rake benchmark in the EncryptedCookieStore sources for details.

EncryptedCookieStore vs other session stores

EncryptedCookieStore inherits all the benefits of CookieStore:

It works out of the box without the need to set up a separate data store (e.g. a database table, daemon, etc).

It does not require any maintenance. Old, stale sessions do not need to be manually cleaned up, as is the case with PStore and ActiveRecordStore.

Compared to MemCacheStore, EncryptedCookieStore can “hold” an infinite number of sessions at any time.

It can be scaled across multiple servers without any additional setup.

It is fast.

It is more secure than CookieStore because it allows you to store sensitive data in the session.

There are of course drawbacks as well:

It is prone to session replay attacks. These kinds of attacks are explained in the Ruby on Rails Security Guide. Therefore you should never store anything along the lines of is_admin in the session.

You can store at most a little less than 4 KB of data in the session because that’s the size limit of a cookie. “A little less” because EncryptedCookieStore also stores a small amount of bookkeeping data in the cookie.

Although encryption makes it more secure than CookieStore, there’s still a chance that a bug in EncryptedCookieStore renders it insecure. We welcome everyone to audit this code. There’s also a chance that weaknesses in AES are found in the near future which render it insecure. If you are storing *really* sensitive information in the session, e.g. social security numbers, or plans for world domination, then you should consider using ActiveRecordStore or some other server-side store.

The default_value_for method

The default_value_for method is available in all ActiveRecord model classes.

The first argument is the name of the attribute for which a default value should be set. This may either be a Symbol or a String.

The default value itself may either be passed as the second argument:

default_value_for :age, 20

…or it may be passed as the return value of a block:

default_value_for :age do
  if today_is_sunday?
    20
  else
    30
  end
end

If you pass a value argument, then the default value is static and never changes. However, if you pass a block, then the default value is retrieved by calling the block. This block is called not once, but every time a new record is instantiated and default values need to be filled in.

The latter form is especially useful if your model has a UUID column. One can generate a new, random UUID for every newly instantiated record:
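The accompanying code sample did not survive in this archive; a plausible sketch, assuming the uuidtools gem (a common choice at the time) and a model with a uuid column, would look like this:

```ruby
# Sketch -- assumes the default_value_for plugin and the uuidtools gem.
class Document < ActiveRecord::Base
  default_value_for :uuid do
    UUIDTools::UUID.random_create.to_s
  end
end
```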

What about before_validation/before_save?

True, before_validation and before_save do what we want if we’re only interested in filling in a default before saving. However, if one wants to be able to access the default value even before saving, then be prepared to write a lot of code. Suppose that we want to be able to access a new record’s UUID even before it’s saved. We could end up with the following code:
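The original code sample was lost; a hedged reconstruction of the kind of boilerplate the paragraph describes (the class and helper names are assumptions, and it assumes the uuidtools gem) might look like this:

```ruby
# Reconstruction, not the post's original sample: making a default UUID
# readable before the record is saved requires a lazy getter plus a
# callback, instead of a single default_value_for declaration.
class Document < ActiveRecord::Base
  before_save :assign_default_uuid

  def uuid
    assign_default_uuid
    read_attribute(:uuid)
  end

  private

  def assign_default_uuid
    if read_attribute(:uuid).nil?
      write_attribute(:uuid, UUIDTools::UUID.random_create.to_s)
    end
  end
end
```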

What about other plugins?

Default Value appears to be unmaintained; its SVN link is broken. This leaves only ActiveRecord Defaults. However, its semantics are dubious, which leaves it wide open to corner cases. For example, it is not clearly specified what ActiveRecord Defaults will do when attributes are protected by attr_protected or attr_accessible. It is also not clearly specified what one is supposed to do if one needs a custom initialize method in the model.

I’ve taken my time to thoroughly document default_value_for’s behavior.

The development version of Ruby on Rails has cool new internationalization features. Although the framework itself doesn’t provide a lot of I18N functionality, it does provide the necessary hooks for plugins to implement I18N however they see fit. Simon Tokumine has written an I18N demo application to show you what Rails is capable of, when used in combination with the localized_dates plugin.

“Phusion” and “Phusion Passenger” are registered trademarks of Phusion. “Rails”, “Ruby on Rails” and the Rails logo are registered trademarks of David Heinemeier Hansson. All other trademarks are property of their respective owners.