Monday, January 16, 2006

Release driven by test coverage feedback

Tried something new when releasing XMMS2 0.2 DrBombay: a requirement to reach a "test goal" before releasing. This means that the release was postponed until enough test coverage feedback had been received.

XMMS2 includes a small tool called xmms2-et that, when enabled, sends basic test coverage data to a central server. xmms2-et sends one small UDP packet for each song played. The packet includes information about the operating system, the version of xmms2 and which plugins are being used to play the current song. By looking at this data it is possible to get a picture of how much testing the different components are getting. This is very helpful in the rc (release candidate) phase, because you know that a certain version has played at least a certain number of songs. That makes it much easier to declare a version good and make a release, while being reasonably confident that there are no serious bugs.
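The per-song report could be sketched like this. This is only an illustration of the idea, not the actual xmms2-et wire format: the field layout, server name and port below are my own assumptions.

```python
import socket

def build_report(os_name, version, plugins):
    # Hypothetical payload layout: key=value pairs joined by ';'.
    # The real xmms2-et packet format is not described in this post.
    fields = {
        "os": os_name,
        "version": version,
        "plugins": ",".join(plugins),
    }
    return ";".join(f"{k}={v}" for k, v in fields.items()).encode()

def send_report(payload, host="stats.example.org", port=9999):
    # One small UDP datagram per played song, fire-and-forget:
    # no reply is expected and a lost packet is no big deal.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()

payload = build_report("Linux", "0.2 DrBombay", ["mad", "alsa"])
```

UDP fits this job well: the client never blocks on the stats server, and losing the occasional packet only slightly undercounts the testing a component got.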

Of course, it all depends on the users running xmms2-et reporting any problems they find. Getting lots of feedback data doesn't help if users don't report the problems they encounter; in fact it has the opposite effect: you feel confident that a version works well because it has gotten so much testing, while it is actually broken and users simply aren't reporting the errors.

To try to mitigate this problem and to increase community involvement, a web page with test goals was created. It showed how much testing each component needed and how much it had received so far, so testers could easily see how the testing was going. The release was delayed until all components had reached 100% of their test goals. This way the users doing the testing were the ones with the power to "decide" when to release: they could easily see on the test goal page if some component was lagging behind, and give that one some extra testing to shorten the time left to release.
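The logic behind such a goal page is tiny. A minimal sketch, where the component names, goal numbers and play counts are made up for illustration:

```python
# Hypothetical per-component test goals and observed play counts;
# the real numbers on the xmms2 goal page are not given in the post.
goals = {"mad": 5000, "vorbis": 3000, "sid": 500}
played = {"mad": 5000, "vorbis": 2100, "sid": 500}

def progress(goals, played):
    # Percentage of each component's test goal reached, capped at 100%.
    return {c: min(100, 100 * played.get(c, 0) // goals[c]) for c in goals}

def ready_to_release(goals, played):
    # Release only when every component has hit 100% of its goal.
    return all(p == 100 for p in progress(goals, played).values())
```

With the numbers above, vorbis sits at 70% and holds the release back, which is exactly the signal that tells testers where to direct their listening.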

In xmms2 it is easy to identify the different components, and there is a natural event (playing the next song) at which feedback data can be submitted immediately. Another method, which might be better in other cases (and maybe in this one too), is to collect the testing data offline into a file which can be mailed or submitted in some other way. That way it could more easily be expanded to more fine-grained analysis, even including full call-graph analysis. Distributed gcov, when will someone write that, or does it already exist?
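The offline variant could look something like this. The file name and JSON format here are my own assumptions, purely to show the idea of spooling counts locally instead of sending a packet per song:

```python
import json
import os

STATS_FILE = "xmms2-et-stats.json"  # hypothetical local spool file

def record_play(plugin, path=STATS_FILE):
    # Instead of one UDP packet per song, bump a per-plugin play
    # counter in a local file that the user can mail in later.
    counts = {}
    if os.path.exists(path):
        with open(path) as f:
            counts = json.load(f)
    counts[plugin] = counts.get(plugin, 0) + 1
    with open(path, "w") as f:
        json.dump(counts, f)
```

Because the data sits in a plain file, richer payloads (per-song details, or even coverage dumps) can be added later without touching any network protocol.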

Using test coverage feedback data to drive releases was a very interesting experiment, and the next release of xmms2 will definitely try it again and hopefully refine it. Next time more people will know what it is all about, the community involvement will hopefully be larger, and it will be an even better-tested release.