Switching the distribution behind the Windows Subsystem for Linux

Ever since the Windows Subsystem for Linux was announced, the one thing that has bugged me is that it runs an older version of Ubuntu. I wasn't alone: the eighth issue opened on the project's issue tracker asks for the ability to switch to an alternate distribution. Shortly after, a tutorial popped up for installing Fedora on WSL. It worked, but I only tried it in a virtual machine, as I could not afford to upgrade my main machine to the Insider Preview at the time. As a result, I did not end up experimenting as much as I would have liked.

With the Windows 10 Anniversary Update, the Linux subsystem finally shipped with the stable version of the operating system, and I could begin the experimentation.

After some research on WSL, it seems that the rootfs directory contains the distribution files, while the /home, /root and similar directories are stored separately. This setup is perfect for a scenario where I'd like to seamlessly switch between the distributions, persisting my dotfiles and miscellaneous files in the home directories.

This blog post presents my thought process while solving this problem and explains a few design decisions along the way. If you're only interested in how to use the final product, I suggest reading the project's readme instead.

Obtaining source tarballs

The aforementioned Fedora tutorial requires you to download a tarball from Fedora's build system. This works, but it's too Fedora-specific. Other distributions may have similar tarballs available for download, but I wanted a more universal approach, without scouring the web for tarballs. At least, not manually.

Enter Docker. The Docker Hub already has such tarballs available for quick consumption; maybe they can be used for WSL as well. After some prodding at the service, I determined that the best way to download a usable tarball is to find the original Dockerfile for the Docker image, and then use the files it packages into its / directory.

It seems the official OS images are stored in a git repository, available at github.com/docker-library/official-images. There is one file per image under the library directory, and this contains a list of tags along with their source. Going through the files, it seems there are two different versions in use. The older version seems to be:

Studying the structure of the repositories and the fields available, it quickly became apparent that enough information is available in both cases to build a direct link to the Dockerfile using something like https://github.com/$GitRepo/blob/$GitCommit/$Directory/Dockerfile. The two examples above would then translate to the following URLs:
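The URL template itself is simple enough to sketch in Python. The helper name and the sample values below are placeholders of mine, not entries taken from the official-images repository, and the sketch assumes $GitRepo has already been reduced to its owner/repository form:

```python
def dockerfile_url(git_repo, git_commit, directory):
    """Build a link to a Dockerfile from the three fields of an image entry.

    git_repo is assumed to be in "owner/repository" form (hypothetical input).
    """
    # Strip stray slashes so the pieces join cleanly.
    directory = directory.strip("/")
    return "https://github.com/%s/blob/%s/%s/Dockerfile" % (
        git_repo.strip("/"), git_commit, directory)


# Placeholder values, for illustration only.
print(dockerfile_url("example/docker-foo", "abc123", "foo/24"))
```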

Obtaining prebuilt images

The previous method works, but it's limited to official images only, which is quite restrictive considering how much bigger the selection could get if the script were extended to every image published on the Docker Hub.

Updating the script to include these was not an option, since the official images are a one-off special case with their own git repository with a well-defined structure. Instead I needed to look into what happens when you run docker pull.

Thankfully, the Docker Registry has an open API with documentation, namely the Docker Registry HTTP API V2. To get started, you must first request an auth token for the repository you're about to download:
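A minimal sketch of the token request in Python follows. The auth endpoint, the service name and the scope format are what the Docker Hub accepted as far as I know; treat them as assumptions rather than something this post guarantees. The function names are made up for the example:

```python
import json
import urllib.parse
import urllib.request

# Assumed Docker Hub token endpoint and service name.
AUTH_URL = "https://auth.docker.io/token"
SERVICE = "registry.docker.io"


def token_url(repository):
    """Build the URL that requests a pull-only token for a repository."""
    query = urllib.parse.urlencode({
        "service": SERVICE,
        "scope": "repository:%s:pull" % repository,
    })
    return AUTH_URL + "?" + query


def fetch_token(repository):
    """Perform the request and return the bearer token (needs network access)."""
    with urllib.request.urlopen(token_url(repository)) as response:
        return json.load(response)["token"]
```

The returned token would then be sent along with every later request in an Authorization: Bearer header.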

This response has its own separate documentation, called Image Manifest Version 2, Schema 1, but most of the returned values are not that interesting. The only interesting part is the fsLayers member, which lists the digests of the layers, in order of creation.

In order to download the layers, you just need to iterate through the list and request each digest:
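As a rough sketch, turning a manifest into a list of blob URLs could look like this in Python. The registry host is an assumption, the function name is hypothetical, and the digests in the sample manifest are invented. The blobSum key is how Schema 1 names the digest of each fsLayers entry:

```python
# Assumed registry host for the Docker Hub.
REGISTRY = "https://registry-1.docker.io/v2"


def layer_urls(repository, manifest):
    """Return one blob URL per fsLayers entry, in the order they are listed."""
    urls = []
    for layer in manifest["fsLayers"]:
        digest = layer["blobSum"]
        urls.append("%s/%s/blobs/%s" % (REGISTRY, repository, digest))
    return urls


# Invented digests, for illustration only.
sample = {"fsLayers": [{"blobSum": "sha256:aaaa"}, {"blobSum": "sha256:bbbb"}]}
for url in layer_urls("library/ubuntu", sample):
    print(url)
```

Each URL would be requested with the Authorization: Bearer header carrying the token obtained earlier.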

Note that the API itself will not serve the binary blobs; instead, it responds with an HTTP redirect to a CDN.

The next step was to determine what each layer actually is, and how I could merge them for installation. Thankfully, the documentation quickly revealed that they are application/vnd.docker.image.rootfs.diff.tar.gzip; simply put, just .tar.gz files.

However, this still leaves me with a bunch of archives, while the installer script only accepts a single tarball. Some research and experimentation later, I found that I can simply append each layer to the previous one, and tar will handle it just fine, as long as --ignore-zeros is specified:
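The same trick can be demonstrated from Python, whose tarfile module has an ignore_zeros option that plays the role of tar's --ignore-zeros. This is only an illustration with two tiny in-memory layers and made-up file names, not the code the installer uses:

```python
import io
import tarfile


def make_layer(files):
    """Build an in-memory .tar.gz from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_merged(layers):
    """Concatenate the raw layers and list every member of the result."""
    merged = b"".join(layers)
    # ignore_zeros skips the end-of-archive padding between the layers,
    # so reading continues into the next appended archive.
    with tarfile.open(fileobj=io.BytesIO(merged), mode="r:gz",
                      ignore_zeros=True) as tar:
        return tar.getnames()


base = make_layer({"etc/os-release": b"NAME=example\n"})
extra = make_layer({"usr/bin/tool": b"#!/bin/sh\n"})
print(list_merged([base, extra]))
```

Without ignore_zeros, reading would stop at the end-of-archive marker of the first layer and the second layer's files would be silently skipped.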

Installing the new rootfs

As a test, I downloaded one of the tarballs mentioned above within WSL, extracted it with permissions preserved (tar xfp), closed WSL, and moved the extracted directory in place of the old, rusty Ubuntu /rootfs from Windows Explorer. When I rushed back to the console, bash gave me an error:

Running the command does work, but not all the time. Some distributions I tried don't play nice, and you're just left with a broken installation, which either tells you to run the command above, or that the command failed:

However, I found this to be easily fixable: just copy the user's entries from /etc/passwd and /etc/shadow to the new /rootfs, and that's it. The /home/$USER directory will persist, as it is stored outside of /rootfs.
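A minimal sketch of that fix, operating on the text of the two files rather than on the files themselves. The function name and the sample entries are hypothetical, and the same helper would apply to both /etc/passwd and /etc/shadow, since both use colon-separated lines keyed by user name:

```python
def copy_user_entry(old_db, new_db, user):
    """Return new_db with the user's line from old_db added or replaced."""
    def owner(line):
        # The first colon-separated field is the user name.
        return line.split(":", 1)[0]

    entry = next((l for l in old_db.splitlines() if owner(l) == user), None)
    if entry is None:
        raise KeyError("no entry for %r in the old file" % user)

    # Drop any stale entry for the same user before appending the old one.
    kept = [l for l in new_db.splitlines() if owner(l) != user]
    return "\n".join(kept + [entry]) + "\n"


# Invented sample contents, for illustration only.
old_passwd = "root:x:0:0:root:/root:/bin/bash\nme:x:1000:1000::/home/me:/bin/bash\n"
new_passwd = "root:x:0:0:root:/root:/bin/sh\n"
print(copy_user_entry(old_passwd, new_passwd, "me"), end="")
```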

So with that, writing a script to automate all of this should be fairly trivial, right? Well, if WSL wasn't experimental, then yes. But in its current state, quite a few workarounds are required for the switch to be seamless and the new distribution to work correctly.

The WSL directory can be found under %LocalAppData%\lxss. The /rootfs directory is the one that has to be completely replaced with the contents of the new tarball. Since the Linux subsystem can't run while it's being replaced, I had to design the installer script to run under Windows, and occasionally launch a few bash commands through bash -c .... At least, that was the initial plan.

The first step, simply running bash.exe, proved to be difficult, as it was nowhere to be found in the %PATH%. After a bit of searching, %WinDir%\sysnative\bash.exe turned out to be the working path. However, launching this only resulted in a cryptic error message saying Error: 0x80070057. Searching for this only led to issues where the solution was to uncheck "Enable legacy console". Fine, but I'm not using a console. What now?

Turns out, stdout and stderr redirection is currently not supported. Thankfully, the exit status code is correctly returned, so I can check whether the commands finished successfully or not, at least.
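In Python, relying only on the exit status might look like the sketch below. The helper name is hypothetical; the important part is that stdout and stderr are left alone instead of being redirected into pipes:

```python
import os
import subprocess

# The path that ended up working above; expandvars resolves %WinDir% on Windows.
BASH = os.path.expandvars(r"%WinDir%\sysnative\bash.exe")


def run_in_wsl(command, bash=BASH):
    """Run a command through bash -c and return only its exit status."""
    # No stdout=/stderr= arguments: redirecting them is what triggers the error.
    return subprocess.call([bash, "-c", command])


if __name__ == "__main__" and os.path.exists(BASH):
    # Zero means success; anything else means the command failed.
    print(run_in_wsl("test -d /home") == 0)
```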

So now that I can somewhat launch commands, the next step was to copy the archive for extraction into the WSL /home directory, since other directories might not be writable by the logged-in user. Python has a few ways to copy a file; little did I know that this would also prove to be difficult.

Copying a file willy-nilly from the Windows filesystem to the WSL directories renders the copy unreadable. I wasn't able to track down the exact cause, since it's not a permission error, just a "general I/O error". Thinking that some metadata is probably attached to each file, and is missing if you just copy in an outside file, I tried creating a new file from within WSL (touch rootfs.tar.xz) and then opening the file for writing from the outside. This didn't work, as it reverted to the I/O error after writing.

Thankfully, it quickly clicked that the Windows partitions are mounted under /mnt. Translating the absolute path of the archive to this mounted UNIX equivalent, and then copying the file from within WSL solved the issue.
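The path translation can be sketched as follows. The function name and the sample path are made up, and the sketch assumes an absolute path with a drive letter, with each drive mounted under /mnt by its lower-case letter:

```python
import ntpath


def to_wsl_path(windows_path):
    """Translate an absolute Windows path into its /mnt equivalent."""
    drive, rest = ntpath.splitdrive(windows_path)
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError("expected an absolute path with a drive letter")
    # C:\Users\me becomes /mnt/c/Users/me
    return "/mnt/" + drive[0].lower() + rest.replace("\\", "/")


print(to_wsl_path(r"C:\Users\me\rootfs.tar.xz"))
```

The copy itself then becomes an ordinary cp from the translated path, launched through bash -c.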

After this, it was smooth sailing, and the install.py script was born:

Installing from SquashFS

As an alternative to Docker, I explored the downloadable ISO images Linux distributions offer, and found that most of them package their rootfs in a SquashFS archive.

These archives can be extracted easily using the unsquashfs tool. As such, adding SquashFS support to the installer script was relatively trivial. The script just needs to detect whether the specified file is .tar* or .sfs/.squashfs, and provide the according command for decompression.
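A sketch of that detection step is below. The function name and the destination argument are hypothetical; the tar and unsquashfs flags are standard ones, but they are my choice for the example and not necessarily the flags the script passes:

```python
import os


def extraction_command(archive, dest):
    """Pick the extraction command based on the archive's file name."""
    name = os.path.basename(archive).lower()
    if name.endswith((".sfs", ".squashfs")):
        # -d sets the directory unsquashfs extracts into.
        return ["unsquashfs", "-d", dest, archive]
    if name.endswith(".tar") or ".tar." in name:
        # -p preserves permissions; -C switches to the destination first.
        return ["tar", "-xpf", archive, "-C", dest]
    raise ValueError("unsupported archive type: %s" % archive)


print(extraction_command("AIROOTFS.SFS", "rootfs-temp"))
print(extraction_command("rootfs.tar.xz", "rootfs-temp"))
```

The extension check is case-insensitive because files pulled out of ISO images often have upper-case names.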

Since the script can't receive stdout/stderr due to a WSL limitation, it checks for the existence of the unsquashfs binary in root's $PATH beforehand, so it can warn the user to launch WSL and install the squashfs-tools package.

As an example, the downloadable archlinux-2016.08.01-dual.iso image contains a suitable SquashFS file at ARCH/X86_64/AIROOTFS.SFS. This file can be installed without any additional hassles, by simply running install.py AIROOTFS.SFS.

Switching between distributions

The installer stores the various rootfs archives under the rootfs_<image>_<tag> name. This makes switching between the Linux distributions very easy, as all one needs to do is rename folders to determine which distribution is active.
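The rename dance can be sketched like this. The function name is hypothetical, and the sketch assumes the caller knows the <image>_<tag> label of the currently active distribution, which is passed in rather than detected. WSL must not be running while the folders are renamed:

```python
import os


def switch_rootfs(lxss_dir, active_label, new_label):
    """Park the active rootfs under its label and activate another one."""
    active = os.path.join(lxss_dir, "rootfs")
    parked = os.path.join(lxss_dir, "rootfs_" + active_label)
    wanted = os.path.join(lxss_dir, "rootfs_" + new_label)

    # Check both preconditions first, so a failure leaves everything in place.
    if not os.path.isdir(wanted):
        raise FileNotFoundError(wanted)
    if os.path.exists(parked):
        raise FileExistsError(parked)

    os.rename(active, parked)
    os.rename(wanted, active)


# Example call with made-up labels:
# switch_rootfs(os.path.expandvars(r"%LocalAppData%\lxss"),
#               "ubuntu_trusty", "fedora_24")
```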