Thursday, August 16, 2018

NixOS in production

This is a short post summarizing what I wish I had known when I first started using NixOS in production. Hopefully other people will find this helpful to avoid the pitfalls that I ran into.

The main issue with NixOS is that the manual recommends a workflow that is not suitable for deployment to production. Specifically, the manual encourages you to:

keep Nix source code on the destination machine (i.e. /etc/nixos/{hardware-,}configuration.nix)

build on the destination machine

use Nix’s channel system to obtain nixpkgs code

This post will describe how you can instead:

build a source-free binary closure on a build machine

transfer and deploy that binary closure to a separate destination machine

This guide overlaps substantially with what the NixOps tool does and you can think of this guide as a way to transplant a limited subset of what NixOps does to work with other provisioning tools (such as Terraform).

You might also find this guide useful even when using NixOS as a desktop operating system for handling more exotic scenarios not covered by the nixos-rebuild command-line tool.

Building the closure

We’ll build up to the final solution by slowly changing the workflow recommended in the NixOS manual.

Suppose that you already have an /etc/nixos/configuration.nix file and you use nixos-rebuild switch to deploy your system. You can wean yourself off of nixos-rebuild by building the binary closure for the system yourself. In other words, you can reimplement the nixos-rebuild build command from scratch.
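Here is a minimal sketch of what that could look like. The helper function names are hypothetical, the "x86_64-linux" platform is an assumption you should adjust, and the nixos.nix file deliberately evaluates to the whole NixOS attribute set so that the build has to select the system attribute explicitly:

```shell
# Write a minimal nixos.nix (hypothetical helper; the default file name
# matches the one used throughout this post).
write_nixos_nix() {
  cat > "${1:-nixos.nix}" <<'EOF'
# Evaluate NixOS for the configuration that already lives on this machine.
import <nixpkgs/nixos> {
  system = "x86_64-linux";  # assumption: adjust to your platform
  configuration = import /etc/nixos/configuration.nix;
}
EOF
}

# Build the binary closure; nix-build leaves a ./result symlink pointing at it.
build_closure() {
  nix-build --attr system "${1:-nixos.nix}"
}

# Activate the closure that ./result points to and make it the boot default.
activate_closure() {
  sudo ./result/bin/switch-to-configuration switch
}
```

Running write_nixos_nix, build_closure and activate_closure in that order builds the system and then switches to it.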

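Once you have built the closure with nix-build and run its ./result/bin/switch-to-configuration script with the switch command, congratulations: you’ve just done the equivalent of nixos-rebuild switch!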

The closure contains a ./bin/switch-to-configuration script, which understands a subset of the commands that nixos-rebuild does. In particular, the script accepts these commands:

$ ./result/bin/switch-to-configuration --help
Usage: ../../result/bin/switch-to-configuration [switch|boot|test]
switch: make the configuration the boot default and activate now
boot: make the configuration the boot default
test: activate the configuration, but don't make it the boot default
dry-activate: show what would be done if this configuration were activated

Adding a profile

The nixos-rebuild command actually does one more thing in addition to building the binary closure and deploying the system. It also creates a symlink pointing to the current system configuration so that you can roll back to that configuration later. The symlink also acts as a garbage collection root, preventing the system from being garbage collected until you remove the symlink (either directly using rm or with a higher-level utility such as nix-collect-garbage).

You can record the system configuration in the same way as nixos-rebuild using the nix-env command:
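As a sketch, recording the closure could look like the following. The function name is hypothetical, and /nix/var/nix/profiles/system is the profile that nixos-rebuild itself maintains:

```shell
# Point the system profile at a closure (defaults to the ./result symlink).
# This creates a new generation that you can roll back to, and the profile
# link keeps the closure alive as a garbage collection root.
record_system_profile() {
  sudo nix-env --profile /nix/var/nix/profiles/system --set "${1:-./result}"
}

# List the generations recorded so far in the system profile.
list_system_generations() {
  nix-env --profile /nix/var/nix/profiles/system --list-generations
}
```

If you want to mirror nixos-rebuild closely, record the profile first and then run switch-to-configuration, so that the boot default and the profile agree.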

Querying system options

You can use the same nixos.nix file to query what options you’ve set for your system, just like the nixos-option utility. For example, if you want to compute the final value of the networking.firewall.allowedTCPPorts option then you run this command:
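One way to run such a query is sketched below. It assumes that nixos.nix evaluates to the full attribute set returned by <nixpkgs/nixos> (which includes a config attribute) rather than only the system derivation; the function name is hypothetical:

```shell
# Print the final value of a NixOS option, e.g.
#   query_option networking.firewall.allowedTCPPorts
# --strict forces full evaluation so that lists and sets print completely.
query_option() {
  nix-instantiate --eval --strict \
    --attr "config.${1:?usage: query_option OPTION [FILE]}" \
    "${2:-nixos.nix}"
}
```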

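Pinning nixpkgs

You can also stop obtaining nixpkgs through Nix’s channel system by pinning your build to a specific revision of Nixpkgs. In fact, this makes your build completely insensitive to the NIX_PATH, eliminating a potential source of non-determinism from the build.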
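A pinned nixos.nix might look roughly like this sketch. The revision and hash are placeholders that you must fill in yourself (nix-prefetch-url --unpack on the tarball URL prints a suitable hash), and the helper name is hypothetical:

```shell
# Write a nixos.nix that fetches a fixed Nixpkgs revision instead of
# resolving <nixpkgs> through NIX_PATH.
write_pinned_nixos_nix() {
  cat > "${1:-nixos.nix}" <<'EOF'
let
  # Placeholders: substitute a real Nixpkgs commit and its tarball hash.
  rev = "<nixpkgs-revision>";

  nixpkgs = builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz";
    sha256 = "<sha256-of-unpacked-tarball>";
  };

in
  import "${nixpkgs}/nixos" {
    system = "x86_64-linux";  # assumption: adjust to your platform
    configuration = import /etc/nixos/configuration.nix;
  }
EOF
}
```

The heredoc delimiter is quoted, so the shell leaves the Nix ${rev} and ${nixpkgs} interpolations untouched.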

Building remotely

Now that you’ve removed nixos-rebuild from the equation you can build the binary closure on a separate machine from the one that you deploy to. You can check your nixos.nix, configuration.nix and hardware-configuration.nix files into version control and nix-build the system on any machine that can check out your version controlled Nix configuration. All you have to do is change the import path to be a relative path to the configuration.nix file within the same repository instead of an absolute path:
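As a sketch of the build-machine side, the following writes a repository-local nixos.nix with a relative import, builds it, and exports the closure to /tmp/system as a file-based binary cache. The function names are hypothetical, and you could equally combine this with a pinned Nixpkgs:

```shell
# Write a nixos.nix that lives next to configuration.nix in the repository.
write_repo_nixos_nix() {
  cat > "${1:-nixos.nix}" <<'EOF'
import <nixpkgs/nixos> {
  system = "x86_64-linux";  # assumption: adjust to your platform
  # Relative path: resolved against the directory containing this file.
  configuration = import ./configuration.nix;
}
EOF
}

# Build the closure, then copy it (and everything it depends on) into a
# binary archive at /tmp/system that can be uploaded elsewhere.
build_and_export() {
  nix-build --attr system "${1:-nixos.nix}" &&
    nix copy --to file:///tmp/system ./result
}
```

The store path that ./result points to (readlink ./result prints it) is the path you will need later on the destination machine.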

Once the closure has been exported to a binary archive at /tmp/system, upload that archive to the destination machine using your upload method of choice. Then import the binary archive into the /nix/store on the destination machine using nix copy.
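On the destination machine, the import and activation steps could be sketched like this. The function name is hypothetical, it assumes the archive was uploaded to the same /tmp/system location, and depending on your Nix signature settings you may need to sign the paths or pass --no-check-sigs to nix copy:

```shell
# Import a closure from the uploaded archive, record it in the system
# profile, and switch to it. The argument is the closure's store path:
#   deploy_closure /nix/store/...
deploy_closure() {
  closure_path="${1:?usage: deploy_closure STORE_PATH}"

  # Copy the closure from the file-based archive into the local /nix/store.
  sudo nix copy --from file:///tmp/system "$closure_path" &&
    # Record it as a new generation (and garbage collection root).
    sudo nix-env --profile /nix/var/nix/profiles/system --set "$closure_path" &&
    # Activate it and make it the boot default.
    sudo "$closure_path/bin/switch-to-configuration" switch
}
```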

In the commands you run on the destination machine, replace /nix/store/... with the /nix/store path of your closure, since there is no result symlink on the destination machine.

Conclusion

That’s it! Now you should be able to store your NixOS configuration in version control, build a binary closure as part of continuous integration, and deploy that binary closure to a separate destination machine. You can also now pin your build to a specific revision of Nixpkgs so that your build is more deterministic.

I wanted to credit my teammate Parnell Springmeyer who taught me the ./result/bin/switch-to-configuration trick for deploying a NixOS system and who codified the trick into the nix-deploy command-line tool. I also wanted to credit Remy Goldschmidt who interned on our team over the previous summer and taught me how to reliably pin Nixpkgs.

2 comments:

I like how you break down the essential commands for remotely deploying NixOS into production. The only feature not covered by this that I use in NixOps is how it handles deploying secrets (like a SSL certificate, or private key file). If you use the system described in this post and you need to deploy secrets, what method do you use?

I tried to use the 'Pinning `nixpkgs`' example, but was met with "error: Module `/etc/nixos/configuration.nix' has an unsupported attribute `pkgs'. This is caused by assignments to the top-level attributes `config' or `options'" (https://github.com/NixOS/nixpkgs/blob/1a6af9f88ec2405334a9fd6a977ccbcb53472305/lib/modules.nix#L126). Not sure if this looks familiar to anyone who's tried this...