Tuesday, October 07, 2014

Bake - Continuous Integration System

Summary: I've written a continuous integration system, in Haskell, designed for large projects. It works, but probably won't scale yet.

I've just released bake, a continuous integration system - an alternative to Jenkins, Travis, Buildbot etc. Bake eliminates the problem of "broken builds", a patch is never merged into the repo before it has passed all the tests.

Bake is designed for large, productive, semi-trusted teams:

Large teams where there are at least several contributors working full-time on a single code base.

Productive teams which are regularly pushing code, many times a day.

Semi-trusted teams where code does not go through manual code review, but code does need to pass a test suite and perhaps some static analysis. People are assumed to be fallible.

Current state: At the moment I have a rudimentary test suite, and it seems to mostly work, but Bake has never been deployed for real. Some optional functionality doesn't work, some of the web UI is a bit crude, the algorithms probably don't scale and all console output from all tests is kept in memory forever. I consider the design and API to be complete, and the scaling issues to be easily fixable - but it's easier to fix after it becomes clear where the bottleneck is. If you are interested, take a look, and then email me.

To give a flavour, the web GUI looks of a running Bake system looks like:

The Design

Bake is a Haskell library that can be used to put together a continuous integration server. To run Bake you start a single server for your project, which coordinates tasks, provides an HTTP API for submitting new patches, and a web-based GUI for viewing the progress of your patches. You also run some Bake clients which run the tests on behalf of the server. While Bake is written in Haskell, most of the tests are expected to just call some system command.

There are a few aspects that make Bake unique:

Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.

Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.

Bake allows multiple clients to run tests, even if some tests are only able to be run on some clients, allowing both parallelisation and specialisation (testing both Windows and Linux, for example).

Bake can detect that tests are no longer valid, for example because they access a server that is no longer running, and report the issue without blaming the submitted patches.

We have to declare allTests, then list of all tests that must pass, and execute, which gives information about a test. Note that the test type is (Platform,Action), so a test is a platform (where to run the test) and an Action (what to run). The run function gives an IO action to run, and require specifies dependencies. We use an auxiliary matchOS to detect whether a test is running on the right platform:

We define main = bake, then fill in some configuration. We first declare we are working with Git, and give a repo name and branch name. Next we declare what the tests are, passing the information about the tests. Finally we give a host/port for the server, which we can visit in a web browser or access via the HTTP API.

Using the Example

Now we have defined the example, we need to start up some servers and clients using the command line for our tool. Assuming we compiled as bake, we can write bake server and bake client (we'll need to launch at least one client per OS). We can view the state by visiting http://127.0.0.1:5000 in a web browser.

To add a patch we can run bake addpatch --name=cb3c2a71, using the SHA1 of the commit, which will try and integrate that patch into the master branch, after all the tests have passed.

9 comments:

> Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.

Doesn’t that hurt bisectability if there untested patches reach the repository?

Luca: I didn't know that. I suspect Bake has been used for several build systems over time (anything *ake seems to be quite popular), so I'll make sure to always write "Bake continuous integration" and hopefully that won't confuse people. I might also add a link to make it quite clear the projects are unrelated.

nometa: Yes, it hurts bitsectability a bit. But already most people only CI test on a push, not on each commit, so there are plenty of untested commits. Even with my small projects testing each commit would be infeasible. So it removes a few known good points, but hopefully it's not significantly worse in practice.

> Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.

This is how the CI from koalitycode.com works. However, they just got bought by docker and will shut down, so your claim to uniqueness may end up holding.

Travis CI integration with pull requests on github ends up providing a similar result. Which style is better depends on your particular use case.

Greg: I should probably tone down the word unique. I'll do that in the README, which I basically pasted here.

Travis CI + GitHub provides a similar result for untrusted teams, where the contributors have their code reviewed. Doing it with your own patches would be too much work. You want to automate the pressing of merge when it has passed, which Bake does.

I like the idea of committing after the build is proved to pass. This what I achieved with gerrit + Jenkins, with the additional benefits of having a code review gate, something that I found of great value for distributed (in space or time) teams for which pairing is hard. The nice thing with gerrit is that each commit triggers a build so bisectability is not hurt. My project was small enough (and I strived hard to keep it that way) that build time was not an issue.

But then you can combine gerrit + zuul (that's what mediawiki and openstack do) to 1/ parallelise builds and 2/ stop early builds that contain a failed commit.

I would not have been so categorical few years ago but now I think having each commit built and passed before merge is an absolute must (eg. bisectability is not an option IMHO), having wasted hours to track down bugs in stack of commits.

Arnaud: I've been looking at it, and it does seem I could do incremental building on each commit, but just not run the tests on each commit - that gives you most of bisectability, but saves you the time of running the tests (which can be hours).

I totally agree that code must pass before merging, in my case because if you have a multi-developer team you need the person who wrote the bad code to have the problem of fixing the bad code, not the other people who need to get their code included.

Of course! The entire system is parameterised over state, patches and tests. There is no inbuilt knowledge of anything about what any of the tests do, and things like Git integration is added around the outside. Polymorphism basically makes it impossible to cheat and shove such knowledge in the internals.