CodeCube: Docker-powered Runnable Gists

Whether you’re soliciting programming advice on a mailing list, providing technical support over IRC, or saving a useful snippet of code for later, GitHub’s Gist is an indispensable tool.

I’ve been writing more and more code in Go recently, and something struck me about the Go community: they don’t use Gists. When Go developers share code, they do so almost exclusively through a tool called the Go Playground. The key advantage of the Go Playground over GitHub Gists is that you can run the code snippets and see the output live in your browser. This is extremely powerful. I’ve seen countless Gists that have pages of output dumped at the end of the file in code comments. Reproducing the output is often extremely difficult, as it can be hugely dependent on the environment the code was run in. The Go Playground fixes all this. Front-end developers have a similar (but even more powerful) tool in JSFiddle.

Remote Code Execution as a Service

So what would it take to build a Go Playground that works for every language? The obvious issue is that the entire service would effectively be one big remote code execution vulnerability. To run such a service, there are a few desirable properties:

Isolation: if multiple code examples are running concurrently, they shouldn’t be able to interfere with (or even be aware of) each other. This means isolating the filesystem, process tree, network, etc between running code examples.

Resource limiting: code examples shouldn’t be able to use an unreasonable proportion of the host’s resources (CPU, memory, disk space).

There are a number of ways of achieving each of these objectives (virtual machines, chroot, cgroups, etc), but Docker emerged as a pretty great solution that covers each of the points mentioned. Built on LXC and cgroups, it enables creation and teardown of relatively secure sandboxed environments in a fraction of a second. Each time a code snippet is run, a docker container can be created, started, used to run the untrusted code, then killed and destroyed. It’s incredible that all this can happen in a matter of milliseconds.

Making it happen with Docker

My implementation of this is up and running at codecube.io. Here’s a breakdown of how the system works:

1. A user types some code into a box on the website, and specifies the language the code is written in

2. They click “Run”, and the code is POSTed to the server

3. The server writes the code to a temporary directory, and boots a Docker container with the temporary directory mounted

4. The container runs the code in the mounted directory (how it does this varies according to the code’s language)

5. The server tails the logs of the running container, and pushes them down to the browser via server-sent events

6. The code finishes running (or is killed if it runs for too long), and the server destroys the container

The first step is to build a Docker image for the container that runs the code snippets. It includes everything necessary to run code in any of the supported languages (initially Python, Ruby, Go, and C). Here’s the corresponding Dockerfile:
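In outline, it looks something like this (the base image and package names below are an illustrative sketch, not the exact file):

```dockerfile
# Base image with the toolchains for the supported languages
FROM ubuntu

# Interpreters/compilers for Python, Ruby, Go, and C (illustrative package names)
RUN apt-get update && apt-get install -y python ruby golang gcc

# Scripts that set up the unprivileged user and build/run the code
ADD entrypoint.sh /entrypoint.sh
ADD run-code.sh /run-code.sh

ENTRYPOINT ["/entrypoint.sh"]
```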

Note there are two shell scripts added to the image. entrypoint.sh is a script that sets up an unprivileged user to run the code with, and run-code.sh detects the code’s language, and builds and runs it accordingly.
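The language detection in run-code.sh amounts to a case statement on the file extension. Roughly (the exact commands and interpreter names are a sketch, not the real script):

```shell
#!/bin/sh
# Sketch of run-code.sh's dispatch: choose a build/run command based on
# the snippet's file extension. Interpreter names are assumptions.
run_code() {
  case "$1" in
    *.py) python3 "$1" ;;
    *.rb) ruby "$1" ;;
    *.go) go run "$1" ;;
    *.c)  cc -o /tmp/prog "$1" && /tmp/prog ;;
    *)    echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

# When invoked with a file argument, run it.
if [ -n "$1" ]; then
  run_code "$1"
fi
```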

The server that accepts code examples from the web and orchestrates the Docker containers was written in Go. Go turned out to be a pretty good choice for this, as much of the server relied on concurrency (tailing logs to the browser, waiting for containers to die so cleanup could happen), which Go makes joyfully simple.

Resource Limiting

This is where the fun really begins. How do you stop someone from consuming 100% of the host’s CPU? What happens when someone runs while (1) { malloc(1024); }?

CPU

This is easy. Docker enables the allocation of CPU shares when starting a container. Note that this is a relative weighting that affects how the processes in the container are scheduled, rather than a hard cap. This functionality is provided by cgroups under the hood.

Memory

As with CPU, Docker exposes cgroups functionality that solves this problem. You simply specify the maximum amount of memory a container can use (in bytes) when you start the container.
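Putting both limits together, starting a container looks something like this (the share and byte values are illustrative, and the image name is a placeholder):

```shell
# 1 CPU share (a relative weight) and a 32 MB memory cap, in bytes
docker run -c=1 -m=33554432 codecube/runner
```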

Network

Fine-grained control is possible using tc, but for the initial version I opted to turn networking off. This is something I’d like to improve in a future version.

Disk

Limiting IO is currently an unsolved problem. There has been talk about using blkio to limit IOPS, but there are issues with this approach.

Disk quotas are also not easy - Docker doesn’t do much to help with this right now. Some people have worked around the problem by mounting a size-limited loopback device and storing variable data exclusively in the mounted directory. This works well if you’re running a database or image-hosting service, but I wanted to apply the quota to the entire filesystem, as users have free rein over it. In theory you could place each container on its own loopback device, but that would require a lot of moving parts.

I solved the quota problem as follows:

Assign quotas to a large pool of uids on the host machine

When starting a code-runner container, reserve one of the uids from the pool

The container creates an unprivileged user to run the code as, but sets the user’s uid to the one reserved from the pool of quota-restricted uids

When the code is run as this user, the quota is applied directly by the host

Once the code-runner container has finished, release the uid back to the pool
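The reserve/release cycle maps naturally onto a buffered channel in Go, which gives safe concurrent access for free. A minimal sketch (the uid range is arbitrary, and the actual quota assignment happens separately on the host):

```go
package main

import "fmt"

// uidPool hands out quota-restricted uids to container runs and takes
// them back when the container is destroyed. A buffered channel makes
// Reserve block automatically when every uid is in use.
type uidPool chan int

func newUIDPool(start, count int) uidPool {
	p := make(uidPool, count)
	for uid := start; uid < start+count; uid++ {
		p <- uid
	}
	return p
}

// Reserve blocks until a uid is free.
func (p uidPool) Reserve() int { return <-p }

// Release returns a uid to the pool once its container is gone.
func (p uidPool) Release(uid int) { p <- uid }

func main() {
	pool := newUIDPool(20000, 100)
	uid := pool.Reserve() // pass this uid into the container
	fmt.Println(uid)
	pool.Release(uid) // container destroyed: uid is free again
}
```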

This approach exposed something that was not entirely clear at first: it’s important to delete Docker containers after using them, as by default they stay around forever. Every time you use docker run, a new container is created, and it remains on disk even after it has finished running. During early development I found that the quotas weren’t being released after a code example ran; updating the code to delete the containers as soon as they stopped resolved this.
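The server does this through the Docker API, but in CLI terms the fix amounts to the following (the image name is a placeholder):

```shell
# Start detached, wait for the code to finish, then remove the
# container so its filesystem (and the reserved uid) are freed.
cid=$(docker run -d codecube/runner)
docker wait "$cid"
docker rm "$cid"
```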

Update: the newly-released Docker v0.6.4 adds a -rm flag to docker run, which auto-deletes containers once they finish running.

The End Result

Before pushing this live, I also added persistence. When you visit the application, you are redirected to a randomly-generated URL. Each time the code is run, it’s saved to Redis under the key that is present in the URL path. However, there’s still no proper authentication, so any snippet can currently be edited by anyone.
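Generating the random URL is straightforward. Something like this (the slug length and hex encoding are assumptions; the post's actual scheme may differ):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSlug returns a short random identifier used as the snippet's URL
// path and Redis key. Length and encoding here are assumptions.
func newSlug() string {
	b := make([]byte, 6)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b) // 12 hex characters
}

func main() {
	fmt.Println(newSlug())
}
```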

Things to Improve

Unfortunately, the server’s kernel doesn’t support limiting swap usage via control groups, so it is currently possible to cause the host to swap rather easily.

As mentioned above, networking is currently turned off. I’d like to implement some kind of network limiting, so I can turn it on.

The code snippets are currently executed as an unprivileged user. It would be nice to allow users to do whatever they wanted to the entire system, by giving them root access. LXC provides a good sandbox, but it isn’t perfect. For example, there are many parts of the kernel that do simple uid == 0 comparisons to check for root. As LXC containers use the host’s kernel, there are potentially serious implications to giving root access to untrusted users, even within a container (see here for more info).

The frontend is currently extremely simple, and could be greatly improved.

Adding more languages would be an obvious next step, and extremely simple. If you’re looking to contribute, this would be a great place to start.

The full code is up on GitHub - contributions are very welcome! If you’ve got any questions or suggestions, tweet me at @harrymarr.