December 8, 2011

In the past three years of sysadvent, I've covered various process monitor tools like monit, supervisord, and upstart. A little while back I put Fedora 15 on my laptop and found a new one, systemd.

Since every tool seems to invent different terminology for the same things, for the purposes of this article when I say 'process' or 'service' I mean the same thing - a systemd service.

So let's dig in a bit.

A First Look

The first thing you'll need to know is how to interact with systemd: starting and stopping things - the usual business. The main tool for this is systemctl. Run it with no arguments, and it gives you a list of all services. You can also ask for status:
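As a small sketch of those two calls, here they are wrapped in a shell function. The function name is made up for illustration, and rsyslog.service is just the unit used in the rest of this article; `--no-pager` keeps systemctl from opening a pager.

```shell
# show_units_and_status UNIT: list all units, then show one unit's status.
# The function name is hypothetical; the systemctl calls are the real ones.
show_units_and_status() {
    systemctl --no-pager                 # no arguments: list units
    systemctl --no-pager status "$1"     # status of a single unit
}

# Usage: show_units_and_status rsyslog.service
```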

Most of the rsyslog unit file should be fairly straightforward, though the 'Unit' and 'Install' sections have strange names. I'll explain it from top to bottom.

The 'Unit' section is documented in the systemd.unit(5) manpage. This section seems to cover things like ordering and dependencies. There's a separate section for defining the rsyslog service itself because systemd supports many more things than simply services. According to systemd.unit(5):

A unit configuration file encodes information about a service, a socket, a device, a mount point, an automount point, a swap file or partition, a start-up target, a file system path or a timer controlled and supervised by systemd(1).

That's a lot of stuff, but I'm mainly interested in how to run things in systemd.

Controlling a Service

Let's stop it.

% sudo systemctl stop rsyslog.service

There's no output. If you run 'stop' again, it will again have no output and will still exit with success - a nice touch making scripted management easier. Check the status:
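One way to see the repeated-stop behavior for yourself is a sketch like the following. The function name is hypothetical; it stops the same unit twice, prints the exit status each time, and then asks for the status.

```shell
# stop_twice UNIT: stop a unit two times in a row and report each exit status.
# Hypothetical helper; requires sudo rights to stop services.
stop_twice() {
    for attempt in 1 2; do
        sudo systemctl stop "$1"
        echo "attempt $attempt: exit status $?"
    done
    systemctl --no-pager status "$1"
}

# Usage: stop_twice rsyslog.service
```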

Remember that stop and start only affect the current run-time and don't impact other events that might cause rsyslog to start (like the system booting). For that, you'll want enable and disable. You can disable things fairly intuitively:

It's an odd thing. The 'stop' and 'start' commands output nothing, but 'enable' and 'disable' output shell commands? Additionally 'disable' and 'enable' do those actions for you, so I don't know what I am supposed to do with the output. Is systemd trying to encourage me to use those commands myself instead of using systemctl? By the way, if you enable an already-enabled service, you get no output and success. Same for disabling. Confusing!
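To check what 'disable' and 'enable' actually changed, a sketch along these lines may help. The function name is made up; `systemctl is-enabled` is assumed to be available in your systemd version, and it prints the unit's enablement state.

```shell
# boot_cycle UNIT: disable a unit at boot, check it, re-enable it, check again.
# Hypothetical helper; none of this starts or stops the running service.
boot_cycle() {
    sudo systemctl disable "$1"
    systemctl is-enabled "$1"      # prints the enablement state
    sudo systemctl enable "$1"
    systemctl is-enabled "$1"
}

# Usage: boot_cycle rsyslog.service
```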

Add Your Own Service

Here's the config file I used, and I put it in /lib/systemd/system/fizzle.service:
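As a rough sketch of what such a unit file can look like, assume a service that simply runs for five seconds and then exits. The Description text and the ExecStart command are assumptions for illustration, not necessarily the exact file used here. The sketch writes to a scratch directory instead of /lib/systemd/system so it is safe to try.

```shell
# Write a hypothetical fizzle.service to a scratch directory.
unit_dir="${UNIT_DIR:-$(mktemp -d)}"

cat > "$unit_dir/fizzle.service" <<'EOF'
[Unit]
Description=Fizzle demo service (hypothetical)

[Service]
# Assumed command: run for five seconds, then exit
ExecStart=/bin/sleep 5

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_dir/fizzle.service"
```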

I don't want it to be dead. As described in past sysadvents covering process monitoring, "if it dies, restart it." What can be done? The systemd.service(5) manpage says to add 'Restart=always' to the 'Service' section.
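Here is a minimal sketch of that edit, using awk to print a unit file with 'Restart=always' placed right after the '[Service]' header. The helper name and the scratch file contents are assumptions for illustration; editing the file by hand works just as well.

```shell
# add_restart FILE: print FILE with Restart=always added right after [Service].
# Hypothetical helper; it writes to stdout and leaves FILE untouched.
add_restart() {
    awk '{ print } /^\[Service\]$/ { print "Restart=always" }' "$1"
}

# Demonstrate on a scratch file with assumed contents
scratch="$(mktemp)"
printf '%s\n' '[Service]' 'ExecStart=/bin/sleep 5' > "$scratch"
add_restart "$scratch"
```

The order of settings inside a section doesn't matter, so it's fine that the new line lands before ExecStart.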

Once that is added, starting 'fizzle.service' again will get it rolling. After 5 seconds it will die and be started by systemd:
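A sketch of that sequence, assuming the unit file was edited in place: reload systemd's view of the unit files, start the service, wait a bit longer than the five-second lifetime, and look at the status. The function name and the default six-second wait are my own choices.

```shell
# restart_demo UNIT [SECONDS]: reload unit files, start UNIT, wait, show status.
# Hypothetical helper; SECONDS defaults to 6, just past a five-second lifetime.
restart_demo() {
    sudo systemctl daemon-reload         # re-read unit files after the edit
    sudo systemctl start "$1"
    sleep "${2:-6}"
    systemctl --no-pager status "$1"
}

# Usage: restart_demo fizzle.service
```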

Supporting Odd Services

I originally wanted to use nagios as the example, not rsyslog as above, but when installing nagios on Fedora 15, I received the usual /etc/init.d/nagios startup script. However, when I ran it, I saw this output:

Huh? I thought that was strange, and when I dug into the script, I saw no mention of systemctl. It loads /etc/init.d/functions which, by default, seems to pass itself into systemctl. Asking systemctl what's up, it says:

Funky, though it shows two features of systemd. First, it supports old sysv init scripts. Second, having explicit 'ExecStart' and 'ExecStop' settings can be useful in putting together systemd and software that insists on being run with its own management tools to stop and start.
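To illustrate the second point, here is a sketch of a unit that hands start and stop over to a program's own control tool. Everything about 'myapp' (the name, the /opt/myapp/bin/myappctl path, its start and stop subcommands) is hypothetical. 'Type=forking' assumes the control tool launches a daemon in the background and then returns.

```shell
# Write a hypothetical unit that wraps an application's own control script.
unit_dir="${UNIT_DIR:-$(mktemp -d)}"

cat > "$unit_dir/myapp.service" <<'EOF'
[Unit]
Description=Hypothetical app managed by its own control script

[Service]
# The control script is assumed to fork the daemon and return
Type=forking
ExecStart=/opt/myapp/bin/myappctl start
ExecStop=/opt/myapp/bin/myappctl stop

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_dir/myapp.service"
```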

Concerns

This section is more sourced from my feelings on systemd than from facts, so take this with a grain of salt.

My first problem with systemd is the huge feature list. It looks to be trying to replace /sbin/init, SYSV init scripts and runlevels, inetd, udevd, and automount, while also supporting cgroups, inotify, and more. That's a pretty big feature space, and it is reflected in the size of the code base. At the time of writing, systemd is around 82000 lines of code. Of those, only about 1000 are tests. Only 1.2% of the code appears to be tests? Yikes.
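The percentage is just the rough line counts above divided out. A quick sketch of the arithmetic, with a made-up function name:

```shell
# test_share TEST_LINES TOTAL_LINES: print the share of lines that are tests.
test_share() {
    awk -v tests="$1" -v total="$2" \
        'BEGIN { printf "%.1f%%\n", tests / total * 100 }'
}

test_share 1000 82000    # prints 1.2%
```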

Further, systemd is a major consolidation of several components of the system. I don't really want software replacing major components (/sbin/init, cron, etc) with practically no tests and a fixation on problems I don't have. A bug in cron doesn't crash init, but now it just might.

Lastly, systemd relies on DBus. It's pretty rare for sysadmins to have work experience with DBus. It's another layer to debug when things break. Are there debugging tools? I hope I'm just failing to google for this, but I always come up empty when looking for a decent tracing tool for DBus messages - all the ones I run across are graphical.

The above problems are not terrible things on desktops, which tend to have much looser expectations on software reliability and more flexibility on outages. However, put these on a server, and what do you have? DBus usage, major software consolidation into a single binary, 82000 lines of code and basically no tests - all this adds up to great worry and concern. How long until systemd ships with RHEL or your preferred production Linux distribution?

Conclusion

As stated, please take my opinionated concerns detailed above for what they are - they are not facts. I'm not writing off systemd as a failure by any stretch.

Systemd itself has some fairly nice features for running services. Pretty much anything you'd want to configure for a service is available: cgroups, user, oom tuning, output logging, cpu and I/O tuning, etc. The command line tools and documentation are also pretty good. It's the default on all Fedora 15 and newer releases, and you can get it on many other Linux distributions, so go on and play with it!

Further Reading

2 comments:

enable and disable show you exactly what they're doing. the symlinks hold the information on what's enabled and what's not. so they don't get set a second time. you could alternatively do that manually.