So, every year or so, someone launches Yet Another Fully Automated Statistical Tool Dashboard Thing that tries to show the skills & activity of a given open source contributor, or the “health” of a given open source project.

What happens next follows as surely as night follows day, or as unrequited nostalgia follows the latest James Bond release:

People try the tool, and say “Hmm, well, it’s wildly inaccurate about *me* (or about the projects I have expertise in), but maybe it’s useful to someone else.”

And maybe it is. But what’s really going on, I think, is that the developers of these tools are trying to solve too much of the problem.

Investigating the activity of a particular developer, or the health of an open source project, inevitably involves human judgement calls informed by experience and by out-of-band knowledge. These judgements can be tremendously improved by good tool support – one could imagine certain dashboard-like tools that would make such investigations orders of magnitude more convenient and accurate than they currently are.

But because that kind of tool deliberately doesn’t go the last few inches, it’s a lot harder to demo it. There’s no one-size-fits-all final screen displaying conclusions, because in reality that final screen can only be generated through a process of interaction with the human investigator, who tunes various queries leading up to the final screen, based on knowledge and on specific questions and concerns.
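To make that a little more concrete, here is a rough sketch of what "tuning the query" might look like in code. Everything in it is made up for illustration: the `Commit` and `ActivityQuery` names, the fields, and the sample data are my assumptions, not any real tool's API. The only point is that the investigator owns the knobs, and that the tool reports what it left out and why instead of handing back a single score.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical, heavily simplified commit record."""
    author: str
    day: date
    path: str
    lines_changed: int


@dataclass
class ActivityQuery:
    """Knobs the human investigator tunes; the tool decides none of them."""
    aliases: frozenset                       # identities the investigator knows are one person
    since: Optional[date] = None             # ignore history before this date
    exclude_prefixes: tuple = ()             # e.g. vendored code or generated files
    max_lines_changed: Optional[int] = None  # e.g. to skip bulk imports

    def reason_to_drop(self, c):
        """Return a human-readable reason for excluding c, or None to keep it."""
        if c.author not in self.aliases:
            return "author not in aliases"
        if self.since is not None and c.day < self.since:
            return "before 'since' date"
        if c.path.startswith(self.exclude_prefixes):
            return "path excluded"
        if self.max_lines_changed is not None and c.lines_changed > self.max_lines_changed:
            return "larger than max_lines_changed"
        return None

    def run(self, commits):
        """Split commits into kept ones and (commit, reason) pairs for dropped ones."""
        kept, dropped = [], []
        for c in commits:
            reason = self.reason_to_drop(c)
            if reason is None:
                kept.append(c)
            else:
                dropped.append((c, reason))
        return kept, dropped


# Invented sample data, purely for illustration.
COMMITS = [
    Commit("jrandom", date(2023, 1, 5), "src/parser.c", 40),
    Commit("jrandom", date(2023, 2, 1), "vendor/libfoo/foo.c", 12000),
    Commit("j.random@example.com", date(2023, 3, 9), "docs/intro.txt", 25),
    Commit("someone-else", date(2023, 3, 10), "src/parser.c", 8),
]

# First pass: the naive "enter a name and click Go" query.
first_kept, first_dropped = ActivityQuery(aliases=frozenset({"jrandom"})).run(COMMITS)

# Second pass: the investigator knows about a second identity and knows that
# vendor/ holds imported third-party code, so they reshape the query.
refined = ActivityQuery(
    aliases=frozenset({"jrandom", "j.random@example.com"}),
    exclude_prefixes=("vendor/",),
)
second_kept, second_dropped = refined.run(COMMITS)

for commit, reason in second_dropped:
    print(commit.path, "dropped:", reason)
```

The two passes give quite different pictures of the same person, and neither is "the" answer. The second one is better only because a human who knew about the extra identity and the vendored directory told the tool about them.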

And because it’s harder to demo, people are less likely to write it, because, hey, it’s not going to be as easy to post about it on Slashdot or Hacker News :-). Well, also because it’s a lot more work: interactive tools are simply more complex, in absolute terms, than single-algorithm full-service dashboards that load a known data query and treat it the same way every time.

So what I’m saying is: Any tool that tries to assess a contributor or a project, but where you just have to enter a person’s name or a project’s name and click “Go”, is going to be useless. If you didn’t shape the query in partnership with the tool, then whatever question the tool is answering is probably not the question you were interested in.

Work against the Oracle Effect by building software systems that do not provide conclusions, but arguments. Resist throwing some KPIs on a dashboard without working with your users to understand how this feeds into their decision-making. Keep in mind how your software is going to be used, and make sure its value, its reasoning is transparent, even if your code is proprietary. And make sure the software products you use can also answer the important question about every step of their process: Why?