Implementing Offline System Updates

This is implemented starting with systemd 183.

Here are some guidelines on how to implement "offline" OS updates with systemd. By "offline" OS updates we mean package installations and updates that are run with the system booted into a special system update mode. This avoids problems caused by conflicts between the libraries and services that are currently running and the versions of them on disk. This document is inspired by this GNOME design whiteboard.

The logic:

The package manager prepares system updates by downloading all packages to be updated offline (RPM, DEB, or whatever format applies) into a special directory /var/lib/system-update (or another directory of the package/upgrade manager's choice).

Once the user has approved the update, a symlink /system-update is created that points to /var/lib/system-update (or wherever the directory holding the upgrade packages is located) and the system is rebooted. This symlink is in the root directory, since we need to check for it very early at boot, at a time when /var is not available yet.

Very early in the new boot a systemd generator checks whether /system-update exists. If so, it (temporarily and for this boot only) redirects (i.e. symlinks) default.target to system-update.target, a new target that is intended to pull in the base system (i.e. sysinit.target, so that all file systems are mounted but little else) and the system update script.

The system now continues to boot into default.target, and thus into system-update.target. This target pulls in the OS update script, which is executed after all file systems are mounted.

The system update script now creates a btrfs snapshot (if possible), then installs all packages. After completion (regardless of whether the update succeeded or failed) the /system-update symlink is removed. In addition, on failure the script reverts to the old btrfs state (modulo the aforementioned symlink); on success it leaves the newly made changes in place.

The system is now rebooted. Since the /system-update symlink is gone, the generator won't redirect default.target this time and we now boot into the full system again.
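
The staging step and the update script described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of any real package manager: the function names stage_update and run_offline_update are made up, and snapshotting, package installation and rollback are passed in as placeholder callables because they depend entirely on the distribution.

```python
import os
from pathlib import Path
from typing import Callable


def stage_update(root: Path, update_dir: Path) -> Path:
    """Create <root>/system-update as a symlink to update_dir."""
    link = root / "system-update"
    tmp = root / ".system-update.tmp"  # hypothetical temporary name
    tmp.unlink(missing_ok=True)
    # Create the symlink under a temporary name and rename it into place,
    # so an interrupted run never leaves a half-created symlink behind.
    os.symlink(update_dir, tmp)
    os.replace(tmp, link)
    return link


def run_offline_update(
    root: Path,
    take_snapshot: Callable[[], None],
    install_packages: Callable[[Path], None],
    revert_snapshot: Callable[[], None],
) -> bool:
    """Snapshot, install, revert on failure, and always drop the symlink."""
    link = root / "system-update"
    # Read the package directory from the symlink before it is removed.
    package_dir = Path(os.readlink(link))
    take_snapshot()
    try:
        install_packages(package_dir)
    except Exception:
        # On failure go back to the old state.
        revert_snapshot()
        return False
    else:
        return True
    finally:
        # Runs after success and after the revert alike, so the symlink is
        # gone even if the restored snapshot brought it back.
        link.unlink(missing_ok=True)
```

A real script would call stage_update(Path("/"), Path("/var/lib/system-update")) before rebooting, and run_offline_update(Path("/"), ...) from the unit pulled in by system-update.target, followed by a reboot.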

Advantages of this approach over in-system updates:

Only the components of the OS that are required to access the file systems are started and initialized. No other daemons/services are running, thus minimizing problems with conflicting versions of libraries/daemons/other files on disk and in memory.

The btrfs snapshot of the OS is fairly reliable and mostly atomic.

Disadvantages of this approach over in-system updates:

The system is rebooted twice.

Advantages of this approach over rescue partition based updates:

The exact same storage initialization configuration is used for the update process as for the main OS, thus avoiding problems in finding the OS hierarchy and mounting it properly.

Since the journal is available, all update logs are stored on disk the usual way and are available for later review next to the regular logs.

Disadvantages of this approach over rescue partition based updates:

The basic mounts are done with code that might be subject to the updating process. This means that while the problems with conflicts between processes/libraries in memory and on disk are minimized, they are not zero.

To clear up some confusion: generators are small binaries that live in /usr/lib/systemd/system-generators. systemd will execute those binaries very early at bootup (and at configuration reload time), before the unit files are loaded. The generators can dynamically generate unit files in /run, and override existing definitions freely. (For more information see Generators.)
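
To illustrate the mechanism, the check performed by the update generator can be sketched as below. This is a Python approximation for readability only; the real generator is a small binary shipped with systemd. The function name generate, the unit path in SYSTEM_UPDATE_TARGET, and the choice of the "early" output directory are assumptions of this sketch.

```python
import os
import sys
from pathlib import Path

# Assumption: location of the packaged system-update.target unit.
SYSTEM_UPDATE_TARGET = "/usr/lib/systemd/system/system-update.target"


def generate(early_dir: Path, flag: Path = Path("/system-update")) -> bool:
    """Redirect default.target to system-update.target if the flag exists."""
    # lexists() does not follow the symlink, so the flag counts even while
    # its target under /var is not mounted yet.
    if not os.path.lexists(flag):
        return False
    early_dir.mkdir(parents=True, exist_ok=True)
    # Units written here live under /run and therefore only affect this boot.
    os.symlink(SYSTEM_UPDATE_TARGET, early_dir / "default.target")
    return True


if __name__ == "__main__" and len(sys.argv) == 4:
    # Generators receive three output directories: normal, early, late.
    # The early directory overrides the regular unit files.
    generate(Path(sys.argv[2]))
```

Because the symlink is written to a directory below /run, the redirection disappears with the next boot, which matches the "for this boot only" behaviour described earlier.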

The generator code and the basic definition of system-update.target live within the systemd tarball/package itself. The system update script and the creation of /system-update should live in the distribution's package manager or in packages such as PackageKit.

A few more recommendations:

To make things a bit more robust we recommend hooking the update script into system-update.target via a .wants/ symlink in the distribution package, rather than depending on "systemctl enable" in the postinst scriptlets of your package. More specifically, for your update script create a .service file without an [Install] section, and then add a symlink like /usr/lib/systemd/system/system-update.target.wants/foobar.service → ../foobar.service to your package.

Make sure to remove the /system-update symlink early in the update script to avoid reboot loops in case the update fails.

Use OnFailure=reboot.target in the service file for your update script to ensure that a reboot is automatically triggered if the update fails. OnFailure= makes sure that the specified unit is activated if your script exits uncleanly (with a non-zero exit code, or due to a signal/coredump). If your script succeeds you should trigger the reboot in your own code, for example by invoking logind's Reboot() call. See On logind for details.
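
The packaging recommendations above can be sketched as a small helper that lays out the files in a package staging tree. Everything named here is illustrative: foobar.service is the placeholder from the text, the script path /usr/libexec/foobar-offline-update and the helper names are made up, and Type=oneshot is an assumption about how such a script would typically be run. The busctl command line is one possible way to reach logind's Reboot() call from a script.

```python
from pathlib import Path

# Hypothetical names, used only for this illustration.
SERVICE_NAME = "foobar.service"
SCRIPT_PATH = "/usr/libexec/foobar-offline-update"

# Deliberately no [Install] section: the unit is hooked in via .wants/ only.
UNIT_TEMPLATE = """\
[Unit]
Description=Offline system update (illustrative)
OnFailure=reboot.target

[Service]
Type=oneshot
ExecStart={script}
"""


def install_unit(pkg_root: Path, unit_dir: str = "usr/lib/systemd/system") -> Path:
    """Write the service file and its .wants/ symlink below pkg_root."""
    units = pkg_root / unit_dir
    wants = units / "system-update.target.wants"
    wants.mkdir(parents=True, exist_ok=True)
    (units / SERVICE_NAME).write_text(UNIT_TEMPLATE.format(script=SCRIPT_PATH))
    link = wants / SERVICE_NAME
    link.unlink(missing_ok=True)
    # Relative target, so the link stays valid wherever the tree is installed.
    link.symlink_to("../" + SERVICE_NAME)
    return link


def logind_reboot_argv() -> list:
    """Command line that asks logind to reboot (non-interactive call)."""
    return [
        "busctl", "call",
        "org.freedesktop.login1", "/org/freedesktop/login1",
        "org.freedesktop.login1.Manager", "Reboot", "b", "false",
    ]
```

A package build step could call install_unit(Path("/path/to/buildroot")), while the update script itself would run the command returned by logind_reboot_argv() (for instance via subprocess.run) once the update has succeeded.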