If you use Sequel in a web app running in Apache with Passenger, here’s a stability tip: make sure you reconnect to the database when Passenger forks.

We’ve been running a Rails app that uses ActiveRecord and Sequel side by side for several months without noticing anything, but then we started getting strange and seemingly random errors. After a while it was obvious that they all had something to do with Sequel, because the errors always occurred in, or in code that used, Sequel model objects.

We tried a number of different things, but it was hard to know if anything made a difference since the errors occurred randomly and without warning. Restarting the MySQL server seemed to fix the problem for several days, but then it started again. Sometimes there were only one or two errors and then nothing for days; sometimes the errors would bring everything to a halt. Then we discovered that we could always reproduce the problems by opening around 15 tabs with different pages and reloading them over and over again, essentially stressing the hell out of the app.

At this point it was obvious that there was something going on with Sequel’s connection to MySQL, and that it was related to the number of Passenger workers — but we didn’t really know what to do.

Then one day I remembered something I had read in a thread on MongoDB and Rails: someone described problems similar to ours, and it turned out that his Passenger workers all shared the same database connection. I knew that this couldn’t be the whole story for us, because MySQL’s process list showed more than one connection from our app. Still, it couldn’t hurt to try.

I added an adapted version of the code to an initializer, pushed up the fix and restarted the server. Then I ran the 15-tab stress test and all pages loaded quickly, without fail, over and over again.

The fix is very simple, and taken more or less straight out of the Passenger documentation: when Passenger forks off a new worker process, that process needs to reopen any connections it inherited from its parent.
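Here is a minimal sketch of what such an initializer can look like. It is not necessarily the exact code we used. The helper name `reconnect_sequel_after_fork` is made up for illustration, and `db` is assumed to be the app's `Sequel::Database` object. `PhusionPassenger.on_event(:starting_worker_process)` is the hook from the Passenger documentation, and `Sequel::Database#disconnect` drops the pooled connections so that the worker opens fresh ones on its next query.

```ruby
# Hypothetical helper: `db` is assumed to be the app's Sequel::Database object.
# Returns false (and does nothing) when the app isn't running under Passenger.
def reconnect_sequel_after_fork(db)
  return false unless defined?(PhusionPassenger)

  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    # `forked` is true when the worker was forked from an already loaded
    # app (smart spawning), so it inherited the parent's connections.
    # Dropping them makes Sequel open new ones lazily in this process.
    db.disconnect if forked
  end
  true
end
```

In an initializer this would be called once, for example as `reconnect_sequel_after_fork(DB)`, assuming the database object lives in a constant called `DB`.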

It’s kind of obvious when you think about it, and the reason this wasn’t the first thing we tried was that we thought that Sequel already did this. Since we saw multiple connections in MySQL’s process list we assumed that it opened a new connection for every worker. Now, I assume that the multiple connections were from the workers that Passenger starts initially. It was the ones forked later that caused problems. The amount of traffic the app got was usually not more than the initial workers could handle, so there was no problem. It was only at times when Passenger needed to fork more workers (for example when we loaded up 15 tabs) that the problem occurred (and probably then only when two workers used the connection at the same time).
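To see why a forked worker ends up on its parent's connection, here is a small illustration that has nothing to do with Sequel or MySQL: a UNIX socket pair stands in for the database connection, and `shared_socket_demo` is a made-up name. The child process inherits the open socket, so whatever it writes arrives on the same connection as the parent's data.

```ruby
require "socket"

# Illustration only: a UNIX socket pair stands in for the client/server
# connection; no database is involved.
def shared_socket_demo
  client, server = UNIXSocket.pair

  pid = fork do
    # The child inherited `client` and writes to the very same socket.
    client.syswrite("child")
    exit!(0)
  end
  Process.wait(pid)

  client.syswrite("parent")
  client.close   # close our copy so the server side sees EOF
  server.read    # everything that arrived on the one shared connection
end

puts shared_socket_demo  # prints "childparent"
```

Here the two writes happen one after the other, so the result is tidy. With two workers talking to the database at the same time, their bytes would be interleaved on the one connection, which fits the kind of random errors we saw.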

Lesson learned: don’t trust your database connection library to be Passenger-aware; always make sure things like database connections aren’t shared between workers. When you see inexplicable and random problems in your database code, make sure each process has its own connection.

By the way, isn’t that RailsSequel code really, really ugly? I didn’t make it any better with my global variables, but RailsSequel is just a namespace for global variables to begin with.

This entry was posted on Sunday, June 27th, 2010 at 13:10 and is filed under Ruby.