TFS11 Beta – Batching Gated Builds

In TFS 2010, we added gated builds. A gated build is like a pessimistic form of continuous integration (CI). CI builds are started after every check-in to make sure the developer didn’t break anything. Gated builds require the developer to check in through the build process itself: the check-in is blocked until the build succeeds. To guarantee that the build is never broken, gated definitions only allow one check-in at a time. But this leads to a problem for large builds. If your build takes 30 minutes, you can only get 48 check-ins through the system in a day.

The solution to this problem is not to allow parallel builds and check-ins, because that may lead to broken builds. Rather, the solution is to allow the build definition to build more than one queued build at a time (that is, to batch the builds). By batching the queued builds, you can increase the throughput almost as much as you want (until you have a build failure; more on that below).
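To make the throughput arithmetic concrete, here is a rough sketch. The function name is made up for illustration, and it assumes builds run back to back around the clock, every batch is full, and no build fails, so it gives an upper bound rather than a prediction.

```python
def max_checkins_per_day(build_minutes: int, batch_size: int = 1) -> int:
    """Upper bound on gated check-ins per day for one build definition.

    Assumes builds run back to back for 24 hours, every batch is full,
    and every build succeeds.
    """
    builds_per_day = (24 * 60) // build_minutes
    return builds_per_day * batch_size


print(max_checkins_per_day(30))                 # 48 (one check-in per build)
print(max_checkins_per_day(30, batch_size=5))   # 240 (five check-ins per build)
```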

Here’s how this works in TFS11…

On the build definition Trigger tab, you can specify to “Merge and build” gated builds and provide the maximum size for a batch.

When a build starts, the TFS server looks at this setting and starts the build with the next queued builds from the definition’s queue, up to the maximum batch size (5 in this example). The build process has been updated to handle multiple queued builds. This is what it does with the queued builds (also called build requests)…

The Sync Workspace activity unshelves the changes from each build request in the order that they were in the queue.

If one of the shelvesets cannot be unshelved because it conflicts with another shelveset, then that build request is removed from the build and placed back into the queue.

All of the changes are compiled together.

Tests run as normal.

If the compilation or tests fail the build (tests don’t have to fail the build), all of the build requests are returned to the queue and marked to build independently.

If the compilation and tests succeed, the changes are divided back up into their appropriate groups and checked in separately on behalf of the user who submitted each gated check-in. So, 5 build requests will generate 5 changesets even though they are built with one build.
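The steps above can be sketched roughly as follows. This is a simplified model and not the actual TFS build process template: BuildRequest, BatchResult, and run_batch are hypothetical names, a "conflict" is approximated as two shelvesets touching the same file, and the compile-and-test step is passed in as a function.

```python
from dataclasses import dataclass, field


@dataclass
class BuildRequest:
    user: str
    files: frozenset          # paths touched by the shelveset (simplification)
    build_alone: bool = False  # set when a batched build fails


@dataclass
class BatchResult:
    checked_in: list = field(default_factory=list)  # one changeset per request
    requeued: list = field(default_factory=list)    # returned to the queue
    rejected: list = field(default_factory=list)    # failed when built alone


def run_batch(batch, build_ok):
    """Run one gated build over a list of requests given in queue order."""
    result = BatchResult()
    merged, touched = [], set()
    for req in batch:
        if touched & req.files:
            # Shelveset conflicts with one already unshelved: back to the queue.
            result.requeued.append(req)
            continue
        touched |= req.files
        merged.append(req)

    if build_ok(merged):
        # Success: one changeset per request, on behalf of its submitter.
        result.checked_in = [(req.user, sorted(req.files)) for req in merged]
    elif len(merged) == 1:
        # A request that fails on its own is rejected.
        result.rejected.append(merged[0])
    else:
        # Batch failed: every request goes back, marked to build independently.
        for req in merged:
            req.build_alone = True
            result.requeued.append(req)
    return result


batch = [
    BuildRequest("alice", frozenset({"a.cs"})),
    BuildRequest("bob", frozenset({"a.cs"})),    # conflicts with alice
    BuildRequest("carol", frozenset({"c.cs"})),
]
result = run_batch(batch, build_ok=lambda reqs: True)
print([user for user, _ in result.checked_in])  # ['alice', 'carol']
print([req.user for req in result.requeued])    # ['bob']
```

Two changesets come out of the one successful build here, and the conflicting request waits in the queue for a later build.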

The steps above include an algorithm for handling build failures. This is how the default algorithm works:

Build 1 starts with 3 build requests

Compilation fails

The 3 build requests are returned to the queue to build again

Build 2 starts with just 1 of the build requests (request 1) and succeeds

Build 3 starts with just 1 of the build requests (request 2) and fails (this was the bad check-in)

Build 4 starts with just 1 of the build requests (request 3) and succeeds

Requests 1 and 3 were eventually checked in, but 2 was rejected
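Here is a small simulation of that walkthrough. It is a sketch under several assumptions, not the real scheduler: process_queue and is_bad are hypothetical names, a build fails exactly when it contains a bad request, failed requests go back to the front of the queue in their original order, and a request marked to build alone is never grouped with others.

```python
from collections import deque


def process_queue(requests, is_bad, batch_size=5):
    """Simulate gated builds; return (checked_in, rejected, number_of_builds)."""
    queue = deque((req, False) for req in requests)  # (request, build_alone)
    checked_in, rejected, builds = [], [], 0

    while queue:
        batch = [queue.popleft()]
        if not batch[0][1]:
            # Head is not marked "build alone": add more unmarked requests.
            while queue and len(batch) < batch_size and not queue[0][1]:
                batch.append(queue.popleft())

        builds += 1
        reqs = [req for req, _ in batch]
        if not any(is_bad(req) for req in reqs):
            checked_in.extend(reqs)   # build succeeded
        elif len(reqs) == 1:
            rejected.extend(reqs)     # failed on its own: reject
        else:
            # Batch failed: requeue everything, marked to build independently.
            queue.extendleft((req, True) for req in reversed(reqs))

    return checked_in, rejected, builds


# Three requests, the second one is the bad check-in.
print(process_queue([1, 2, 3], is_bad=lambda r: r == 2, batch_size=3))
# ([1, 3], [2], 4)

# The same requests without batching need only three builds.
print(process_queue([1, 2, 3], is_bad=lambda r: r == 2, batch_size=1))
# ([1, 3], [2], 3)
```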

Notice that when a batched build fails, you end up building more times than you would have if you didn’t batch at all (4 builds instead of 3 in the example above). This means that you should keep the batch size to a small number. For our builds internally, 5 works well. If you end up with a bad check-in in every batch, you should do one of two things: 1) stop using batched builds or 2) stop letting that dev check in.

Technically speaking, this algorithm could be changed, but it would be very difficult. We chose a pattern that we felt was simple, effective, and transparent.

Batching builds also leads to some strange relationships between builds and build requests. In the failure case, the first build has 3 build requests associated with it, and by the end of the building process, each build request has 2 different builds associated with it. So, we have a many-to-many relationship between build requests and builds. In Visual Studio, you can see this on the build details window and in the build explorer.
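As a rough illustration of that many-to-many relationship, the failure example can be written down as a list of (build, request) pairs and indexed in both directions. This is plain Python, not the TFS object model, and it assumes the three requests were rebuilt individually in queue order.

```python
from collections import defaultdict

# (build, build request) associations from the failure example.
associations = [
    ("Build 1", "Request 1"), ("Build 1", "Request 2"), ("Build 1", "Request 3"),
    ("Build 2", "Request 1"),
    ("Build 3", "Request 2"),
    ("Build 4", "Request 3"),
]

requests_by_build = defaultdict(list)
builds_by_request = defaultdict(list)
for build, request in associations:
    requests_by_build[build].append(request)
    builds_by_request[request].append(build)

print(requests_by_build["Build 1"])    # ['Request 1', 'Request 2', 'Request 3']
print(builds_by_request["Request 2"])  # ['Build 1', 'Build 3']
```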

There’s also a new view called the Build Request view. If you click on one of the Request links above, you will get to this new view, which looks a lot like the Build Details view.

Hopefully, this information will help you understand this new feature in TFS11.