simple thinking for complex software

Init Scripts Considered Harmful

Nov 11th, 2007

Tired of PID files, needing root access, and writing init scripts just
to have your UNIX apps start when your server boots? Want a simpler,
better alternative that will also restart them if they crash? If so,
then read this quick-start introduction to process supervision with
runit/daemontools.

Background

Classic init scripts, e.g. /etc/init.d/apache, are widely used for
starting processes at system boot time, when they are executed by
init. Sadly, init scripts are cumbersome and error-prone to write,
they must typically be edited and run as root, and the processes they
launch do not get restarted automatically if they crash.

In an alternative scheme called “process supervision”, each important
process is looked after by a tiny supervising process, which deals
with starting and stopping the important process on request, and
re-starting it when it exits unexpectedly. Those supervising
processes are in turn reliably supervised by other supervising
processes, in a hierarchy extending up to the “master” UNIX process
with PID 1, typically init.

(The process supervision pattern is long-established, and is even
built into the programming language Erlang,
which is used for ultra-reliable telecoms applications.)

It’s possible to replace all init scripts with supervised processes,
but not necessary. As application developers we can get most of the
benefits simply by using process supervision to run our applications
as non-root users, leaving the init scripts in place for system
services.

Daemontools and runit

Dan Bernstein (of qmail fame) wrote the seminal process supervision
toolkit daemontools, which is a
beautifully-designed set of small, ultra-reliable and
highly-specialised programs that cooperate in the UNIX tradition to
manage process supervision trees.

Runit is a more conveniently licensed and
more actively maintained reimplementation of daemontools, written by
Gerrit Pape.

For the purposes of this article I’ll use runit. It is widely
available, and I personally use the packages for
Debian and
Homebrew.

Service directories and scripts

In runit parlance a “service” is simply a directory containing a ‘run’
script. More on those later.

There are just two key programs in runit. Firstly, runsv supervises
the process for an individual service. Service directories themselves
sit inside a containing directory, and the runsvdir program
supervises that directory, running one child runsv process for the
service in each subdirectory. Out of the box on Debian, for example,
an instance of runsvdir supervises services in subdirectories of
/var/service/.

Let’s add a simple play service, as root for now:

1

# mkdir /etc/runit/sleeper

Now we create a run script for the service; run scripts should not
fork, exit or maintain their own PID files. We’ll use a 5-minute
sleep command to simulate a long-running service that occasionally
crashes. We create /etc/runit/sleeper/run as follows:

12

#!/bin/sh -eexec sleep 300

And we make it executable:

1

# chmod +x /etc/runit/sleeper/run

Remember that runsvdir is actually watching /var/service/ – watch
what happens when we make our service definition appear there:

1

# ln -s /etc/runit/sleeper /var/service/sleeper

Now pstree shows us this:

12345

init-+-atd
|-cron
.........
|-runsvdir---runsv---sleep
`-sshd

Yes, runsvdir noticed the new service, and started running it with its own runsv.

To stop and start the service, we talk to that runsv process using the sv utility:

Our sleeper service will be started by runsvdir when the operating system starts up. (runsvdir itself is typically run from - and supervised reliably by - init.)

If we remove the sleeper symlink from /var/service/, then the
sleeper process will be stopped and removed from the supervision
tree.

Why the symlink?

Keeping our service definitions in one directory (/etc/runit/) lets us
start them manually in order to debug our run scripts, before
putting the services under the control of runit by adding symlinks
under /var/service.

Reliable services for unprivileged users

Using this machinery, we can arrange for non-root users to have
supervised services too. We simply give our application user (login
appuser) a ~/service directory, and then set up a service under
/var/service that will reliably run an instance of runsvdir for that
directory with appuser’s privileges:

Debugging service definitions

If runsvdir is unable to execute a run script for some reason, then
by default it will write log messages to its process title: look at
the output of ps aux for a clue (this is the reason for the long
...... line above). If possible, try starting your run script by
hand before symlinking the service directory into the location
runsvdir is watching.

Other capabilities

Runit can do much more than I have presented here; it can manage
reliable logging of stdout/stderr output as an alternative to (or
in addition to) syslog; take a look at svlogd. Setting up logging can
make it much easier to figure out why services don’t seem to be
starting correctly.

Simple mechanisms also exist for specifying dependencies between
services, such that one will not start before another is up and
running. That’s typically less an issue for application
administrators than it is for system administrators.

Under the covers

The design of runit is full of elegant touches; for example, runsv
creates a supervision subdirectory of the service directory it is
supervising. In that directory it writes a PID file and a named pipe
connected to itself. svthen simply writes rudimentary commands to
the pipe, to which runsv then reacts.

Monitoring tools such as monit can be easily configured to report on
the status of service processes by pointing them at the PID files.

Unprivileged processes, privileged ports

Sometimes people run applications as root simply because they need to
listen on privileged ports (ie. ports below 1024). If you need such a
port, you can still run your application as an unprivileged service;
consider using iptables or a tcp proxy like pen
to proxy that port through to your unprivileged runit services
listening on unprivileged ports.

More

If you found this article helpful, or want to read my forthcoming
article about reliably running
Ruby on Rails applications using runit,
please subscribe to my feed.