Categories

Meta

bubblewrap is a sandboxing tool that allows unprivileged users to run containers. I was recently working on a way to allow unprivileged users, to take advantage of bubblewrap to run regular system images that are using systemd. To do so, it was necessary to modify bubblewrap to keep some capabilities in the sandbox.

Capabilities are the way, since Linux 2.2, that the kernel uses to split the root power into a finer grained set of permissions that each thread can have. Together with Linux namespaces it is fine to leave unprivileged users the possibility to use some of them. To give an example, CAP_SETUID, which allows the calling process to make manipulations of process UIDs, is fine to be used in a new user namespace as the set of permitted UIDs is restricted to those UIDs that exist in the new user namespace.

The changes required in bubblewrap are not yet merged upstream. In the rest of post I will refer to the modified bubblewrap simply as bubblewrap.

The set of capabilities that bubblewrap leaves in the process is regulated with –cap-add, new namespaces are required to use these caps. The special value ALL, adds all the caps that are allowed by bubblewrap.

A development version of systemd is required to run in the modified bubblewrap. There are patches in systemd upstream that allows systemd to run without requiring CAP_AUDIT_* and to not fail when setgroups is disabled, as it is the case when running inside bubblewrap (to address CVE-2014-8989). The setgroups restriction may be lifted in future in some cases, this is still under discussion.

For my tests, I’ve used Docker to compose the container, in the following Dockerfile there are no metadata directives as anyway they are not used when exporting the rootfs.

To install the latest systemd, once you’ve cloned its repository, from the source directory you can simply do:

./autogen.sh
make -j $(nproc)
make install DESTDIR=$(rootfs)

to install it in the container rootfs.

If the files /etc/subuid and /etc/subgid are present, the first interval of additional UIDs and GIDs for the unprivileged user invoking bubblewrap is used to set the additional users and groups available in the container. This is required for the system users needed for systemd.

At this point, everything is in place and we can use bubblewrap to run the new container as an unprivileged user:

Every programmer at some point gets in touch with the Brainfuck programming language and how surprising is that very few instructions are needed to have a Turing complete language, 6 is the case of Brainfuck (plus other 2 for I/O operations).

I have recently found an old project of mine that I have used to learn how to write a GCC frontend, it took a while to adapt it to work with a newer GCC version. The code is available on github. The only positive side of this project, if any, is that it can be easily used as a starting point on how to add a frontend to GCC, or in this case, to compile a Brainfuck interpreter written in Brainfuck!

I don’t remember how I got to this code, except that I helped myself with some C preprocessor macros and and I remember one important spec: the input is NUL terminated. Looking at the code is not very helpful. This is probably one of the cases that the compiled version is more understandable than the code itself.

Assuming you already have compiled GCC with the brainfuck frontend (there are instructions on the github project page on how to do it) and that you are able to compile brainfuck files:

$ gcc brainfuck-interpreter.bf -o brainfuck-interpreter

You should have got an executable at this point in the current directory: brainfuck-interpreter. It can be used to interpret an easier program, let’s try with the usual “Hello World!” stuff. The code is short enough that we can feed it straight from stdin to the interpreter. The terminal NUL byte is very important or the interpreter will just crash. I/O for Brainfuck doesn’t handle errors and EOF 🙂

git rebase is one of my favorite git commands. It allows to update a set of local patches against another git branch and also to rework, trough the -i flag some previous patches.

The problem I had to deal with was quite simple, rename a function called notProperPythonCode to proper_python that was defined in the first patch and be sure that all other patches are using the correct name. The –exec flag allows to run a custom script after each patch is applied, so that I could run sed to process the Python files and replace the old function name with the new one. The process is quite simple, except that such changes would trigger a lot of merge conflicts, trivial to solve but quite annoying. Fortunately git rebase allows to choose what merge strategy must be adopted for solving conflicts, the theirs strategy in case of a conflict, will take the previous version of the patch and silently use it. That is fine for this simple substitution case, where we process each file ending by *.py in the repository.

The main reason behind system containers was the inability to run Flannel in a Docker container as Flannel is required by Docker itself. CoreOS solved this chicken and egg problem by using another instance of Docker (called early-docker) that is used to setup only Etcd and Flannel.

Differently, Atomic system containers will be managed by runc and systemd.

The container images, even though being served through the Docker v2 registry, are slighty different than a regular Docker container in order to be used by Atomic. The installer expects that some files are present in the container rootfs under /exports, the OCI spec file for running the Runc container and the unit file for Systemd. Both these files are templates and some values are replaced by the installer. The communication with the Docker registry is done through Skopeo, that is used internally by the atomic cli tool.

The demo shows the current status of the project, these features are not still part of any release and most likely there will be more changes before they will be released.

If everything works as expected, the image should be ready after that command and we can run it as:

sudo docker run --rm -ti emacs

Repeating the same command twice won’t have any effect, unless –force is specified, if there is no new OStree commit available, ostree-docker-builder stores this information in the image itself using a Docker LABEL.

Tagging

Another feature is the automatic tagging of images, when –tag is specified, the built image will be tagged as the name provided as argument to –tag and automatically pushed to the configured Docker registry.

Advantages of ostree-docker-builder

There are mainly two advantages in using ostree-docker-builder instead of a Dockerfile:

The same tool to generate both the OS image and the containers

Use OStree to track what files were changed, added or removed. If there are no differences then no image is created

Special thanks to Colin Walters for his suggestions while experimenting ostree-docker-builder and how to take advantage of the OStree checksum.

coming as a surprise, this year we have got 4 students to work full-time during the summer on wget. More than all the students who have ever worked for wget before during a Summer of Code!

The accepted projects cover different areas: security, testing, new protocols and some speed-up optimizations. Our hope is that we will be able to use the new pieces as soon as possible, this is why we ask students to keep their code always rebased on top of the current wget development version.

Improve Wget’s security

The project aims at adding HSTS support in wget and enhance FTP security through FTPS.

Speed up Wget’s Download Mechanism

Support two performance enhancements: conditional GET requests and TCP Fast Open.

At this point, modify ./fedora-cloud-atomic.ks to point to our OStree repository.

This is how I modified the file to point to the OStree repo accessible at http://192.168.125.225:37375/. Use the correct settings for your machine, and the port used to serve the OStree repository that we noted before.