Update on the systemd issue

I’d like to start with the things that are good. I found the systemd community to be one of the most helpful in Debian, and #debian-systemd IRC channel to be especially helpful. I was in there for quite some time yesterday, and appreciated the help from many people, especially Michael. This is a nontechnical factor, but is extremely important; this has significantly allayed my concerns about systemd right there.

There are things about the systemd design that impress. The dependency system and configuration system is a lot more flexible than sysvinit. It is also a lot more complicated, and difficult to figure out what’s happening. I am unconvinced of the utility of parallelization of boot to begin with; I rarely reboot any of my Linux systems, desktops or servers, and it seems to introduce needless complexity.

Anyhow, on to the filesystem problem, and a bit of a background. My laptop runs ZFS, which is somewhat similar to btrfs in that it’s a volume manager (like LVM), RAID manager (like md), and filesystem in one. My system runs LVM, and inside LVM, I have two ZFS “pools” (volume groups): one, called rpool, that is unencrypted and holds mainly the operating system; and the other, called crypt, that is stacked atop LUKS. ZFS on Linux doesn’t yet have built-in crypto, which is why LVM is even in the picture here (to separate out the SSD at a level above ZFS to permit parts of it to be encrypted). This is a bit of an antiquated setup for me; as more systems have AES-NI, I’m going to everything except /boot being encrypted.

Anyhow, inside rpool is the / filesystem, /var, and /usr. Inside /crypt is /tmp and /home.

Initially, I tried to just boot it, knowing that systemd is supposed to work with LSB init scripts, and ZFS has init scripts with carefully-planned dependencies. This was evidently not working, perhaps because /lib/systemd/systemd/ It turns out that systemd has a few assumptions that turn out to be less true with ZFS than otherwise. ZFS filesystems are normally not mounted via /etc/fstab; a ZFS pool has internal properties about which dataset gets mounted where (similar to LVM’s actions after a vgscan and vgchange -ay). Even though there are ordering constraints in the units, systemd is writing files to /var before /var gets mounted, resulting in the mount failing (unlike ext4, ZFS by default will reject an attempt to mount over a non-empty directory). Partly this due to the debian-fixup.service, and partly it is due to systemd reacting to udev items like backlight.

This problem was eventually worked around by doing zfs set mountpoint=legacy rpool/var, and then adding a line to fstab (“rpool/var /var zfs defaults 0 2”) for /var and its descendent filesystems.

This left the problem of /tmp; again, it wasn’t getting mounted soon enough. In this case, it required crypttab to be processed first, and there seem to be a lot of bugs in the crypttab processing in systemd (more on that below). I eventually worked around that by adding After=cryptsetup.target to the zfs-import-cache.service file. For /tmp, it did NOT work to put it in /etc/fstab, because then it tried to mount it before starting cryptsetup for some reason. It probably didn’t help that the system’s cryptdisks.service is a symlink to /dev/null, a fact I didn’t realize until after a lot of needless reboots.

Anyhow, one thing I stumbled across was poor console control with systemd. On numerous occasions, I had things like two cryptsetup processes trying to read a password, plus an emergency mode console trying to do so. I had this memorable line of text at one point:

And here we venture into unsatisfying territory with systemd. One answer to this in IRC was to install plymouth, which apparently serializes console I/O. However, plymouth is “an attractive boot animation in place of the text messages that normally get shown.” I don’t want an “attractive boot animation”. Nevertheless, neither systemd-sysv nor cryptsetup depends on plymouth, so by default, the prompt for a password at boot is obscured by various other text.

Worse, plymouth doesn’t support serial consoles, so at the moment booting a system that uses LUKS with systemd over a serial console is a matter of blind luck of typing the right password at the right time.

In the end, though, the system booted and after a few more tweaks, the backlight buttons do their thing again. Whew!

Update 2014-10-13: uau pointed out that Plymouth is more than a bootsplash, and can work with serial consoles, despite the description of the package. I stand corrected on that. (It is still the case, however, that packages don’t depend on it where they should, and the default experience for people using cryptsetup is not very good.)

Could you do me a favor? Simply try booting with “init=/lib/sysvinit/init cgroup_enable=memory” in your kernel command line and see if that fixes either of the issues you had with sysvinit. If it does not, please please please file a bug on systemd-shim about the issue(s).

I think boot parallellism and thus speed increase is not necessarily a goal in itself, but comes logically after you’ve defined which service depends on which and when a service is actually ‘ready’ so that it’s dependent services can be started. So it’s much more about correctness of boot ordering than speed. No more failing processes because a sleep(5) in an init script proved to be insufficient or because an init script exited but it took a bit longer for the database port to actually open…

I believe Plymouth has a text mode. I was observing Ubuntu 10.04 servers reboot over a serial console at one point. That’s with Upstart, not systemd, of course, but Upstart also requires Plymouth for the same reason – serialization of output from parallel jobs.