One of the things I’ve needed to do with arcade.ly for a while is start running node as a service.

It’s easy enough to run background tasks directly from bash with a command like:

sudo nohup node index.js &> /dev/null &

And if you run that command and log out of your ssh session, node will carry on running just fine in the background. But what happens if node falls over? For arcade.ly this has never happened, but that’s likely because the site hasn’t been under much stress up to now.

Realistically, at some point the site is going to go down, and whilst Cloudflare will to some extent bail me out, I'd like it to come back up again without manual intervention, especially if it's due to something as simple as the node process crapping out.

Here you can see the effect of Cloudflare’s assistance. It’s not bad but that banner (basically an advert) at the top isn’t ideal. I suspect I could get rid of that by paying for a more expensive plan.

The great thing is that you can still play the games so if you need a fix of Asteroids or Star Castle you can go ahead and fire up Shoot the Rocks or Star Citadel and they’ll (probably) still run. (I’ll blog on this topic in due course as well.)

So how do we make sure it runs as a service and recovers without intervention?

I’m running the site on an instance spun up from a bog standard Ubuntu AMI in EC2, and deploying is just a case of cloning my git repo into a new instance and running a script. Nothing fancy either: just a shell script that installs a few things, runs gulp, and copies some files around. (That’s right: no Docker. I’ll post about why later.)

These days I'm much more familiar with Windows than Linux, so I had to do a bit of digging around on how to run a process as a service. It turns out to be quite straightforward, although there are a couple of gotchas, and there are two different ways of doing it depending upon which version of Ubuntu you're using.

Pre Ubuntu 15.04 - upstart

Google's ranking algorithm inevitably lags behind today's technological landscape, so the material you'll most likely run across first suggests going down the upstart route, which involves writing a .conf file into /etc/init. That's fine if you're running an older version of Ubuntu (and as of this writing I believe Ubuntu 14 is still widespread), but upstart was replaced by systemd as the default in 15.04, so if, like me, you're running 16.x, you're out of luck. If you've been keeping more up to date with Ubuntu developments than I have, you probably already know about this.

In theory you can still use apt-get to install upstart on later versions of Ubuntu but this isn’t something I’d recommend due to reports of serious problems after doing so.

I followed these instructions, and the only thing I did differently was to redirect output to /dev/null: all my logging is sent to Loggly, so I don't have to manage log files by hand. You can see this under the script section here:

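Note the respawn settings. These tell upstart to respawn the process whenever it dies. Strictly speaking, the two numbers in the respawn limit are a count and a time window rather than a retry interval: with 99 and 5, upstart only gives up if it has had to respawn the service more than 99 times within 5 seconds.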
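Since the original descriptor isn't reproduced here, the following is a minimal sketch of what such an upstart job might look like, not my exact file. The job name (arcadely.conf), the checkout path, the start/stop events and the chdir stanza are all assumptions for illustration. The script writes the file to a scratch directory so you can inspect it before copying it into /etc/init.

```shell
# Sketch only: APP_DIR, the job name and the start/stop events are assumptions.
APP_DIR="${APP_DIR:-/home/ubuntu/arcadely}"   # hypothetical checkout location
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"            # scratch directory for the generated file

cat > "$OUT_DIR/arcadely.conf" <<EOF
description "arcade.ly node server"

start on runlevel [2345]
stop on runlevel [016]

# Respawn on exit; give up if respawned more than 99 times within 5 seconds.
respawn
respawn limit 99 5

# Assumed: run from the app directory so relative paths resolve.
chdir $APP_DIR

script
    # Logging goes to an external service, so discard stdout and stderr.
    exec node index.js > /dev/null 2>&1
end script
EOF

echo "Wrote $OUT_DIR/arcadely.conf"
```

To install it you would copy the generated file into /etc/init with sudo and then start the job with sudo start arcadely.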

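Ubuntu 15.04 onwards - systemd

With systemd you write a service descriptor (a .service file) and manage it with systemctl, and this is where I ran into the two gotchas.

The first was simple: I needed to prefix all the systemctl commands with sudo, otherwise it would ask me for a password, which I don't have because I log in with ssh. Here's an example: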

sudo systemctl enable arcadely.service

The second was that when I first started the service, whilst all my 301 permanent redirects were working fine, none of the site content would load - everything was coming back with 404 - Not Found. This turned out to be because the working directory of the service process is set to / by default, and thus when express.js tried to serve up content relative to this folder it inevitably failed. This is easily fixed by setting the WorkingDirectory property in the descriptor, as you can see here:

Note also the restart settings - this is what tells systemd to bring the service back up again if it fails.
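For reference, here is a minimal sketch of a descriptor with those two properties set. It is an illustration under assumptions rather than my exact file: the checkout path, the node binary location (/usr/bin/node), the 5 second RestartSec and the null output settings are all placeholders you would adjust. As before, it writes to a scratch directory so you can check the result before installing it.

```shell
# Sketch only: APP_DIR, the node path and the restart delay are assumptions.
UNIT_NAME="arcadely.service"
APP_DIR="${APP_DIR:-/home/ubuntu/arcadely}"   # hypothetical checkout location
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"            # scratch directory for the generated file

cat > "$OUT_DIR/$UNIT_NAME" <<EOF
[Unit]
Description=arcade.ly node server
After=network.target

[Service]
# Without this the process starts in / and relative content paths return 404.
WorkingDirectory=$APP_DIR
ExecStart=/usr/bin/node index.js
# Bring the service back up if the node process exits for any reason.
Restart=always
RestartSec=5
# Logging goes to an external service, so discard local output.
StandardOutput=null
StandardError=null

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $OUT_DIR/$UNIT_NAME"
```

Once the file is copied into /etc/systemd/system, sudo systemctl enable arcadely.service followed by sudo systemctl start arcadely.service registers it to start at boot and starts it immediately.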

The following bash fragment shows how I’m installing the service.

You can see that after I've installed it once, I don't bother installing it again (which is just a file copy), but I do bounce it so that node picks up any changes that have just been deployed. I could use nodemon for this and not bother with the bounce, but I've found that nodemon can sometimes get stuck and require manual intervention if the app it's monitoring crashes because it restarted in the midst of a deployment. That can easily happen if the deployment takes longer than usual. I do still use nodemon during development, in concert with a gulp watch task.
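The install-once-then-bounce logic described above can be sketched roughly as follows. The function name install_or_bounce and the SUDO, SYSTEMCTL and UNIT_DIR variables are hypothetical, introduced only so the fragment can be dry-run without touching a real system; this is not the deployment script itself.

```shell
# Sketch only: the function and variable names here are illustrative.
SUDO="${SUDO-sudo}"                          # set SUDO="" for a dry run
SYSTEMCTL="${SYSTEMCTL-systemctl}"           # e.g. SYSTEMCTL="echo systemctl" to dry run
UNIT_DIR="${UNIT_DIR-/etc/systemd/system}"   # where systemd looks for local units

install_or_bounce() {
    local unit_file="$1"                     # path to the descriptor in the repo
    local unit_name
    unit_name="$(basename "$unit_file")"

    if [ ! -f "$UNIT_DIR/$unit_name" ]; then
        # First deployment: copy the descriptor, enable it at boot, start it.
        $SUDO cp "$unit_file" "$UNIT_DIR/$unit_name"
        $SUDO $SYSTEMCTL enable "$unit_name"
        $SUDO $SYSTEMCTL start "$unit_name"
    else
        # Already installed: just bounce it so node picks up the new code.
        $SUDO $SYSTEMCTL restart "$unit_name"
    fi
}
```

A deployment script would then call install_or_bounce ./arcadely.service as its final step.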

I might change this so that it copies the service descriptor every time, in case there are changes. This is likely to be infrequent but it means I never have to think about it again, which is a bonus.

The only thing is, when the service descriptor changes you generally need to tell the systemd daemon to reload its configuration:

sudo systemctl daemon-reload

Don’t worry - if this is necessary you’ll be prompted. There’s no guesswork. In any case, I’ll probably automate this away by adding it to the script.
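If I do automate it, it might look something like the sketch below: copy the descriptor only when it differs from the installed one, reload systemd in that case, and bounce the service either way. Again, sync_unit_and_restart and the SUDO, SYSTEMCTL and UNIT_DIR variables are hypothetical names chosen for illustration.

```shell
# Sketch only: the function and variable names here are illustrative.
SUDO="${SUDO-sudo}"                          # set SUDO="" for a dry run
SYSTEMCTL="${SYSTEMCTL-systemctl}"           # e.g. SYSTEMCTL="echo systemctl" to dry run
UNIT_DIR="${UNIT_DIR-/etc/systemd/system}"   # where systemd looks for local units

sync_unit_and_restart() {
    local unit_file="$1"                     # path to the descriptor in the repo
    local unit_name
    unit_name="$(basename "$unit_file")"
    local installed="$UNIT_DIR/$unit_name"

    # cmp -s exits non-zero if the files differ or the installed copy is missing.
    if ! cmp -s "$unit_file" "$installed" 2>/dev/null; then
        $SUDO cp "$unit_file" "$installed"
        # Make systemd re-read unit files so the new descriptor takes effect.
        $SUDO $SYSTEMCTL daemon-reload
        $SUDO $SYSTEMCTL enable "$unit_name"
    fi

    # Always bounce so node picks up freshly deployed code.
    $SUDO $SYSTEMCTL restart "$unit_name"
}
```

Because the copy and reload only happen when the file has actually changed, an ordinary deployment still costs just a single restart.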

Anyway, I’m now in a position where I can reliably deploy to a vanilla Ubuntu AMI with all dependencies in just a couple of minutes. This includes time to compile Python 2.7.13, which is the lengthiest operation.

I hope the above tips come in useful!

EDIT - 24 Jan 2017: systemd local root exploit in v228

There's been some noise today about a local root exploit in systemd v228. I first picked this up on Hacker News, and you can find the original report here.

This has been fixed in later releases (and note the controversy about the stealthy way that happened), but you can find out whether your systems are vulnerable by checking your systemd version with one of the following commands. Which one you use depends on your distro: