Why would you want to do a rolling restart? Good question. If you are running a decent site with some traffic, you will find it harder and harder to find a maintenance window. Will you just pick the time with the fewest visitors and enter "cap deploy"? Will you put up a maintenance page, or edit that JavaScript file with the typo directly on the server, just this once (no DeeDee, nooo!)? If you recognize these situations, you will find this article a useful read.

The code

I'll start with the code example, to help out people who are just looking for something to copy and paste so they can get on with their lives. After the example I will try to explain each step as thoroughly as possible.

In order for this to function correctly, you will need at least two appservers, a loadbalancer setup and either client-side or database-backed session management, so that a visitor keeps their session when their requests move to another appserver. Ready? Then we are off!

Step 1 - Taking the appserver out of the loadbalancer rotation

Before we touch anything, we need to remove the first appserver from the loadbalancer rotation.

If you have a loadbalancer with an API, you probably want to use that API in order to remove this appserver gracefully from the appserver pool. Our loadbalancer does not have this functionality, so we start by rejecting the loadbalancer check requests with iptables.

Some people actually prefer to do it like this instead of using the API, because it tests your failover setup each time you perform a deploy. Because we are only blocking the loadbalancer's check requests, all current traffic continues to flow as normal. Our loadbalancer is set up to check the status of the appserver once a minute. If it does not get a response, it removes the appserver from the rotation pool. So if we sleep for 90 seconds after we block the loadbalancer, the appserver will have been gracefully taken out of rotation.
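As a rough illustration of this step, the sketch below builds the two shell commands involved: an iptables rule that rejects the health check, followed by the 90-second wait. The address `LB_CHECK_IP` and the port `CHECK_PORT` are hypothetical placeholders, and the sketch assumes the check can be told apart from regular traffic by its source address and port; if your loadbalancer proxies normal traffic from that same address, the rule needs a different match. The commands are only built and printed here, because iptables needs root on the appserver itself.

```ruby
# Hypothetical values: the address the loadbalancer sends its checks from,
# and the port it checks. Neither comes from the article.
LB_CHECK_IP = "10.0.0.10"
CHECK_PORT = 80

# The loadbalancer checks once a minute; 90 seconds leaves a margin.
DRAIN_SECONDS = 90

# Shell command that inserts a rule rejecting the health check.
def block_lb_command(ip = LB_CHECK_IP, port = CHECK_PORT)
  "iptables -I INPUT -s #{ip} -p tcp --dport #{port} -j REJECT"
end

# Ordered commands to take one appserver out of rotation.
def drain_commands
  [block_lb_command, "sleep #{DRAIN_SECONDS}"]
end

puts drain_commands
```

In a Capistrano recipe these strings would typically be passed to whatever you use to run remote commands, one appserver at a time.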

Step 2 - Restarting one appserver

Restarting Passenger is easy, right? Capistrano has done all the complex moving of code and symlinking for you, so you just touch tmp/restart.txt in your application directory and you are done!

Well, almost. One very important detail you must not overlook is that touching tmp/restart.txt does not actually restart Passenger. The next request that hits your Passenger instance makes Passenger check the timestamp of restart.txt, and only then does it trigger a restart of your app. This is why your first request after touching restart.txt will always be slow, even if you have things like PassengerPrestart or PassengerMinInstances configured. Because we want to restart now, we push a single request to Passenger using curl. Because we are sending this request from localhost, we need to explicitly specify our hostname in the Host header.
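A minimal sketch of the two commands for this step could look like the following. The hostname `www.example.com` and the path `/var/www/myapp/current` are hypothetical stand-ins for your own vhost name and Capistrano current path, and the request goes to the root URL on 127.0.0.1, which assumes the web server listens on port 80 locally.

```ruby
require "shellwords"

# Hypothetical values: the vhost name Passenger serves and the
# Capistrano "current" path of the app.
APP_HOST = "www.example.com"
CURRENT_PATH = "/var/www/myapp/current"

# Shell command that marks the app for a restart on its next request.
def touch_restart_command(current_path = CURRENT_PATH)
  restart_file = File.join(current_path, "tmp", "restart.txt")
  "touch #{Shellwords.escape(restart_file)}"
end

# Shell command that sends one local request with an explicit Host header,
# so the right vhost handles it and Passenger restarts the app right away.
def warm_up_command(host = APP_HOST)
  header = Shellwords.escape("Host: #{host}")
  "curl -s -o /dev/null -H #{header} http://127.0.0.1/"
end

puts touch_restart_command
puts warm_up_command
```

The `-s` and `-o /dev/null` flags only keep curl quiet; the response body is not interesting here, just the fact that a request reached the app.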

If we don't do this, the request will be handled by your default vhost, which might not be the correct one. To double check if everything went all right, you might want to run passenger-status here and see if everything is as you expect it to be.

Step 3 - Unblocking the loadbalancer

By dropping the iptables rules we start accepting loadbalancer checks again. We wait for another 90 seconds to make sure that the loadbalancer has done its once-a-minute check and knows that we are up and running.
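Sketched as commands, unblocking is the mirror image of blocking. The one detail worth showing is that `iptables -D` removes a rule by matching its full specification, so the delete has to repeat the inserted rule exactly. As before, the address and port are hypothetical placeholders, and the rule shape is an assumption about how the check was blocked in the first place.

```ruby
# Hypothetical values, mirroring whatever rule was inserted in step 1.
LB_CHECK_IP = "10.0.0.10"
CHECK_PORT = 80
REJOIN_SECONDS = 90

# The rule specification shared by the insert (-I) and delete (-D) commands.
def lb_rule_spec(ip = LB_CHECK_IP, port = CHECK_PORT)
  "INPUT -s #{ip} -p tcp --dport #{port} -j REJECT"
end

# Shell command that deletes the rule rejecting the health check.
def unblock_lb_command
  "iptables -D #{lb_rule_spec}"
end

# Ordered commands to put one appserver back into rotation.
def rejoin_commands
  [unblock_lb_command, "sleep #{REJOIN_SECONDS}"]
end

puts rejoin_commands
```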

After this is done, we can safely move to the next application server and repeat the process.
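Put together, the whole process is a loop that finishes all three steps on one appserver before it touches the next. The sketch below shows only that ordering. The step names are hypothetical labels, and `run` and `pause` are injected callables (assumptions of this sketch, not part of Capistrano) so the sequence can be dry-run without touching a server.

```ruby
# Runs the three steps on one appserver at a time. `run` is any callable that
# executes a named step on a server (for example over SSH); `pause` waits.
def rolling_restart(servers, run:, wait: 90, pause: ->(seconds) { sleep(seconds) })
  servers.each do |server|
    run.call(server, :block_lb_checks)    # step 1
    pause.call(wait)                      # loadbalancer notices we are gone
    run.call(server, :restart_passenger)  # step 2
    run.call(server, :warm_up_request)
    run.call(server, :unblock_lb_checks)  # step 3
    pause.call(wait)                      # loadbalancer notices we are back
  end
end

# Dry run: record what would happen instead of doing it.
log = []
rolling_restart(
  %w[app1 app2],
  run: ->(server, step) { log << "#{server}: #{step}" },
  pause: ->(seconds) { log << "wait #{seconds}s" }
)
puts log
```

The important property is that the second wait completes before the next server is blocked, so at least one appserver is always in rotation.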

Caveats

This will work very well if you are just pushing code updates, but you might still have some downtime with database migrations, as many relational databases lock the table while a schema change runs. Most people work around this problem by having their code handle both the old-style and the new-style database schemas and running the schema migration as a separate process, keeping the table lock time as short as possible. After that they perform their data migration, redeploy and restart the appservers, and then safely remove any old columns. There are a lot of examples of this on the internet.
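To make the "handle both schemas" idea a bit more concrete, here is a minimal sketch in plain Ruby. The scenario is hypothetical: a single `name` column being replaced by `first_name` and `last_name`. Rows are plain hashes to keep the example self-contained; in a real Rails app you would check the model's columns instead.

```ruby
# Hypothetical migration: `name` is being split into `first_name` and
# `last_name`. During the rollout a row may have either shape.
def display_name(row)
  if row.key?(:first_name)
    # New-style schema: build the name from its parts.
    [row[:first_name], row[:last_name]].compact.join(" ")
  else
    # Old-style schema: the single column is still there.
    row[:name]
  end
end

puts display_name({ name: "Dee Dee" })
puts display_name({ first_name: "Dee", last_name: "Dee" })
```

Code written this way can run on appservers before and after the schema migration, which is what lets the rolling restart and the migration happen independently.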

Also, if your deployment explodes halfway through, you might end up with iptables rules where you do not want them. These will probably have to be dealt with manually.
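One way to reduce (not eliminate) that risk is to make the unblock step run even when something in between raises. The sketch below uses Ruby's `ensure` for this; the step names and the `run` callable are hypothetical, and it only helps with failures the deploy script itself gets to see. A dropped connection or a killed process will still leave the rule behind.

```ruby
# Runs the given block with loadbalancer checks blocked and always tries to
# unblock afterwards, even when the block raises. `run` is any callable that
# executes a named step.
def with_lb_checks_blocked(run)
  run.call(:block_lb_checks)
  begin
    yield
  ensure
    run.call(:unblock_lb_checks)
  end
end

# Dry run with a failing restart.
log = []
begin
  with_lb_checks_blocked(->(step) { log << step }) do
    raise "restart failed"
  end
rescue RuntimeError => e
  log << "error: #{e.message}"
end
puts log.inspect
```

Whether you want a broken appserver back in rotation automatically is a judgment call; some setups would rather leave it blocked and alert a human.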
