We recently ran into an issue where restarting Rails within Passenger was taking longer and longer as the number of plugins and gems we use has grown. Each deploy left the site unavailable for 30-90 seconds while Passenger restarted the Rails application spawner across every machine. That's far too long if you like to deploy frequently, which we do.

If you can, the best answer might be to move to Unicorn, the latest in a long line of Rails deployment options. It handles the process of migrating requests from old workers to new workers transparently. Awesome.

Our first thought was that Rails was just loading too slowly. Both Robby and I spent some time profiling Rails boot time, and there wasn't a single culprit. Instead, the cost was spread across the 50+ gems and 30 plugins we use, which made it difficult to improve radically.

Next we took a different approach: have Capistrano remove instances from the HAProxy pool and restart the Passenger instances serially. To get this working there are a couple of steps that I hadn't seen documented elsewhere (hat tip to Matt Conway's rubber for the serial task trick):

Change haproxy.cfg to perform a file-based health check for each backend:
option httpchk GET /haproxy.txt
In the Rails app, make sure that same file lives in the public directory so it returns a 200 in the normal case.
I also needed to make nginx return a 404 when that file does not exist, since a failing health check is what prompts HAProxy to remove the instance from the pool. So in the nginx config:

if (!-f $document_root/haproxy.txt) {
    return 404;
}
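
To confirm the nginx rule behaves as intended before relying on it in a deploy, it can help to fetch the file yourself. The following is a minimal sketch, not something from our setup: `check_passes?` and `health_status` are hypothetical helper names, and the host, port, and timeouts are placeholders. It assumes the usual HAProxy behaviour of treating 2xx and 3xx responses as a passing check.

```ruby
require "net/http"

# HAProxy's httpchk treats 2xx and 3xx responses as passing; anything else
# (including the 404 nginx returns when the file is missing) fails the check.
def check_passes?(status)
  (200..399).cover?(status)
end

# Fetch the health-check file from one backend and return the status code.
# host and port are placeholders for one of your own app servers.
def health_status(host, port = 80)
  Net::HTTP.start(host, port, open_timeout: 2, read_timeout: 2) do |http|
    http.get("/haproxy.txt").code.to_i
  end
end

# Example (needs a reachable backend; "app1.example.com" is hypothetical):
#   status = health_status("app1.example.com")
#   puts check_passes?(status) ? "stays in pool" : "would be removed"
```

Running the check once with the file in place and once after deleting it should show the status flip from 200 to 404.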

Change the default "deploy:restart" Capistrano task to do the restart serially. Capistrano usually executes a task on all matching hosts concurrently, so it takes a little hack to force it to run one host at a time.
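
The serial restart can be sketched roughly as follows. This is an illustration under assumptions rather than our exact task: `serial_restart` is a hypothetical helper, the five-second drain pause and the example path are guesses you should tune, and the Capistrano wiring in the trailing comment assumes Capistrano 2's `find_servers_for_task` and the `:hosts` option to `run`.

```ruby
# Hypothetical helper, not part of Capistrano. It yields (host, command)
# pairs one host at a time; the caller decides how to execute each command.
def serial_restart(hosts, current_path, drain_seconds: 5)
  health_file  = File.join(current_path, "public", "haproxy.txt")
  restart_file = File.join(current_path, "tmp", "restart.txt")

  hosts.each do |host|
    # Health check starts failing, so HAProxy takes this host out of the pool.
    yield host, "rm -f #{health_file}"
    # Assumed pause: long enough for HAProxy to notice and requests to drain.
    sleep drain_seconds
    # Passenger's standard restart trigger.
    yield host, "touch #{restart_file}"
    # Health check passes again, so HAProxy puts the host back in the pool.
    yield host, "touch #{health_file}"
  end
end

# Dry run that only prints the plan (host names and path are made up):
serial_restart(%w[app1 app2], "/var/www/app/current", drain_seconds: 0) do |host, cmd|
  puts "#{host}: #{cmd}"
end

# Possible wiring in a Capistrano 2 deploy.rb (an assumption, untested here):
#   namespace :deploy do
#     task :restart, :roles => :app do
#       hosts = find_servers_for_task(current_task).map(&:host)
#       serial_restart(hosts, current_path) { |host, cmd| run cmd, :hosts => host }
#     end
#   end
```

Because Passenger only acts on restart.txt when the next request arrives, you may also want to send a warm-up request to each host before putting the health file back.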

During deploys you will see the HAProxy dashboard mark each instance as removed in turn while its requests go to the other instances. Be aware that this means not all instances are running the same code at the same instant, so database migrations and similar changes may require putting up a maintenance page or coding defensively.