Some thoughts on programmer productivity

by Ketil Malde; August 14, 2008

(Sorry about the misleading URL; time has passed the Biohaskell tutorial by, and you’re better off visiting biohaskell.org.)

Inspired by an article on programmer productivity at LWN, I ran darcs-graph on my code to see how I do. I consider myself a roughly average programmer, and it looks like I average about five commits a day when I’m working on a project. I occasionally touch 20 commits in a day, but that’s probably a built-up backlog of patches. Let’s see where this puts me:

If you accept the proposition that the best programmers outperform the poor ones by a factor of 50, an average programmer should outperform a poor one by a factor of about seven (roughly the square root of 50), and be outperformed by a top-notch one by the same amount. So if I’m about average, the best ones should beat me by a factor of about seven, which means 35 commits a day, or roughly ten minutes per commit throughout the day. Scaling up my peak of 20 commits a day would give 140 commits, about 20 per hour, or on average 3 minutes between each. Conversely, a lousy programmer should be struggling hard to make a single commit during his working day. Both ends of the spectrum seem a bit incredible, but I guess the lousy end is slightly more so, and I have to accept the fact that I’m officially a below-average programmer.

But before I get all depressed about it, I must point out that on the Haskell community overview, no project gets anywhere near 35 commits a day on average. The busiest is GHC, which touches 15 commits. The most likely explanation is that the 50:1 ratio is way off the mark, and that the old 10:1 ratio is closer to the truth.
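The arithmetic above is easy to replay. Here is a minimal sketch in Python, assuming that "average" means the geometric midpoint of the best-to-worst range and that a working day is seven hours. The seven hours is my assumption, inferred from 140 commits a day coming out at about 20 per hour; the function names are made up for illustration.

```python
import math


def midpoint_factor(best_to_worst_ratio):
    """Factor separating an average programmer from either extreme,
    assuming the average sits at the geometric midpoint of the range."""
    return math.sqrt(best_to_worst_ratio)


def minutes_per_commit(commits_per_day, hours_per_day=7.0):
    """Average spacing between commits; hours_per_day is an assumption."""
    return hours_per_day * 60.0 / commits_per_day


# Figures from the text: five commits a day on average, 20 at peak.
my_average, my_peak = 5, 20

factor = round(midpoint_factor(50))    # sqrt(50) is about 7.07, rounded to 7
best_average = my_average * factor     # 35 commits a day
best_peak = my_peak * factor           # 140 commits a day
worst_average = my_average / factor    # less than one commit a day

# The same calculation under the older 10:1 ratio.
best_average_10 = my_average * midpoint_factor(10)   # roughly 16 commits a day

print(best_average, minutes_per_commit(best_average))
print(best_peak, minutes_per_commit(best_peak))
print(round(worst_average, 2), round(best_average_10, 1))
```

With a seven-hour day, 35 commits works out to one every 12 minutes (the "ten minutes" above is a rounder figure), and 140 commits to one every 3 minutes. Under the 10:1 ratio the best programmers would land at roughly 16 commits a day, which is at least in the neighborhood of the GHC figure.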

As an aside: Productivity is often measured in source lines of code (SLOC), which is duly criticized for being imprecise. For instance, some of the most important and beneficial changes remove lines; what is that, negative productivity? I find that I really like the number-of-commits approach instead. A commit, at least for me, is a small modification, usually one to maybe ten lines of effective changes. As such, it represents a sort of minimal, atomic modification to the code, and encapsulates the smallest coherent unit of brain sweat. In the Wikipedia page linked to above, Bill Gates is quoted as saying “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” By comparison, measuring programming progress by commits is like measuring aircraft building progress by parts, which I think is much more sensible. Of course, other people may have different notions of the granularity of patches. Most of what I’ve seen seems to agree with mine, though.

(Even more) random notes

Here’s a rather entertaining read on productivity, although it uses time and quality as metrics. The relative standard error for the listed projects is about 0.5 (i.e. time use is about 20 hours ± 10 hours), and there is a factor of 10 between the worst and the best programmer averages. And, disregarding the span from worst to best, the shape of the distribution has certain ramifications, too.

One of the things I remember from Brooks is that he ascribed an order-of-magnitude productivity improvement to the move from assembly to a high-level language. Further, he thought that no such improvement was possible again from moving to yet higher-level languages. I tend to agree with Yannis’s law that this is too pessimistic, although I also think a doubling every six years is too optimistic.
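To put the two claims on the same scale, here is a small sketch of how long a steady doubling every six years would take to add up to another order of magnitude. It assumes smooth exponential growth, which is my simplification, and the function names are made up for illustration.

```python
import math


def years_to_gain(factor, doubling_period_years=6.0):
    """Years needed to reach a given productivity factor,
    assuming productivity doubles once per doubling period."""
    return doubling_period_years * math.log2(factor)


def gain_after(years, doubling_period_years=6.0):
    """Productivity factor reached after the given number of years."""
    return 2.0 ** (years / doubling_period_years)


print(round(years_to_gain(10), 1))   # years for another tenfold improvement
print(gain_after(12))                # two doublings
```

At that rate a second tenfold improvement would take roughly 20 years, so a slower doubling period stretches the wait accordingly.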