Equilibrium in free software testing

When a bug is filed in a free software project’s bug tracker, a social exchange takes place. Bug reporters give their time and attention to describing, debugging and testing, in exchange for a fair chance that the problem will be fixed. Project representatives make the effort to listen and understand the problem, and apply their specialized knowledge, in exchange for real-world testing and feedback which drive improvements in their software. This feedback loop is one of the essential benefits of the free software development model.

Based on the belief that this exchange is of mutual benefit, the people involved form certain expectations of each other. When I report a bug, I expect that:

the bug will be reviewed by a project representative

they will make a decision about the relative importance of the bug

project developers will fix the most important bugs in future releases of the software

When I receive a bug report, I expect that:

if more information is needed, the bug reporter will supply it

if I can’t diagnose the problem independently, the bug reporter will help with the analysis

if I need help to test and verify a fix for the bug, the bug reporter will provide it

Naturally, everything works best when the system is in equilibrium: there is a steady flow of testing and bug reports, and users feel satisfied with the quality of the software. Everybody wins. Ideally, much of this activity takes place around pre-release snapshots of the software, so that early adopters experience the newest features and fixes, and developers can fix bugs before they release a new version for mainstream use. This state produces the best quality free software.

Unfortunately, that isn’t always the case. When our expectations aren’t met, or sufficient progress is not made, we feel misled. If a bug report languishes in the bug tracker without ever being looked at, the bug reporter’s time and effort have been wasted. If the report lacks sufficient detail, and a request for more information goes unanswered, the developer’s time and effort have been wasted. This feeling is magnified by the fact that both parties are usually volunteers, who are donating their time in good faith.

The imbalance can often be seen in the number of new (unreviewed) bug reports for a particular project. At one extreme (“left”) is a dead project which receives a flood of bug reports, which are never put to good use. At the other extreme is a very active project with no users (“right”), which suffers from a lack of feedback and testing. Most projects are somewhere in the middle, though a perfect balance is rare.

Ubuntu currently receives too many bug reports for its developers to effectively process, putting it well left of center. It has a large number of enthusiastic users willing to run unstable development code, and actively encourages its users to participate in its development by testing and reporting bugs, even to the point of being flooded with data. A similar distance to the right of center might be the Linux kernel, which receives comparatively few bug reports. Kernel developers struggle to encourage users to test their unstable development code, because it’s inconvenient to build and install, and a bug can easily crash the system and cost them time and work. There are a huge number of people who use the Linux kernel, but very few of them have relationships with its developers.

So, what can a project do to promote equilibrium? Users and developers need to receive good value for their efforts, and they need to keep pace with each other.

The Linux kernel seems to need more willing testers, which distributions like Fedora and Ubuntu are helping to provide by packaging and distributing snapshots of kernel development to their users. The loop isn’t quite closed, though, as bug reports don’t always make their way back to the kernel developers.

Ubuntu, perhaps, needs more developers, and so we’ve undertaken a number of projects to try to make it easier to contribute to Ubuntu as a developer, and to help our developers work more efficiently. Soon, it should be possible to commit patches directly into Launchpad without any special privileges, so that they can be easily reviewed and merged by project developers. This isn’t a fix, but we hope it will help move us closer to a balance.

What else could we try? I’m particularly interested in approaches which have worked well in other projects.

17 Responses

I kind of like that there’s a “how to write C code” thing in Full Circle. Theoretically, it should result in more programmers. Dunno if it will though. Dholbach’s packaging lesson thing should help too…

I think it’s great that you’re encouraging people to start writing code. ^.^ And yes, that series of articles in Full Circle is cool! I was reading them earlier.

From my point of view, though, the whole idea of “you can submit patches” is abstract. I mean, I’m studying XHTML / CSS and I want to learn Python and PHP, but even once I learn how to code I still have to find out exactly what I need to do for that code to benefit my favorite projects. And I don’t even know where to look!

We’ve made Ubuntu easy to use, and we’ve made filing bug reports easy enough that we’re getting flooded with them. You’re right that our next step needs to be making it easy to contribute to, and I look forward to seeing what comes of that! I’ve got some apps that I want to write. ^.^

XHTML and CSS aren’t likely to be of any use for most desktop software, but you can fix themes in things like Drupal or Joomla. With Firebug, it should be easier.

There really is no getting around the fact that finding out where in the code the bug lies and which functions are needed is going to take time. I usually go for bugs that I estimate will take <= 8 hours to read through the code, figure out how it works, find the right functions to use, and fix it. I figure if it’ll take longer than that it’s more suited to someone with a higher skill level than me.

If you pick one application whose bugs you want to attack though, you will eventually become familiar enough with that application’s code that you won’t spend 5 hours looking around to find which file contains whichever function is misbehaving.

Learning about the libraries in use is important too. Look at Devhelp. It’s a nice offline library API reference (with a search function). If you’re learning Python, you’ll probably want to integrate learning PyGTK into that.

Why not develop Launchpad so that a user’s bug karma predicts whether that user reliably provides the needed information in a proper way? As a developer you could look up the bug karma and have a kind of guarantee that the other side will be responsive. As a bug reporter you could check whether the developer has good response karma.
Right now bug karma in Launchpad is only calculated based on activity, not on quality.
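
To make the difference between activity and quality concrete, here is a rough sketch of what a quality-based score might look like. Everything in it is hypothetical: the `ReportOutcome` fields, the `quality_karma` function and the smoothing prior are assumptions made for illustration, not anything Launchpad actually provides.

```python
from dataclasses import dataclass


@dataclass
class ReportOutcome:
    """Hypothetical record of how one past bug report turned out."""
    answered_info_requests: bool  # reporter replied when asked for details
    confirmed: bool               # someone else could reproduce the bug


def quality_karma(history, prior_good=1.0, prior_total=2.0):
    """Return the smoothed share of 'good' reports in a reporter's history.

    The prior values are an assumption: with no history the score is 0.5,
    so a newcomer starts neutral instead of at zero.
    """
    good = sum(1 for r in history if r.answered_info_requests and r.confirmed)
    return (good + prior_good) / (len(history) + prior_total)
```

A score like this rewards reporters who follow up, and filing many low-quality reports pulls the score down rather than up.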

This is a good idea, and we’ve done some experiments in this area, trying to pull out bugs which have been filed by people who have filed good-quality bugs in the past. We need to be careful, though, to avoid problems like a chicken-and-egg situation (where it’s too hard to build a reputation) and missing the long tail.

I’d be very interested to hear if this has been done successfully in another large-scale project.

Wine has benefited a lot from its Application Database – it’s a very easy way of doing rather detailed testing without having to go through all the trouble of making a full-on bug report. A user can just go to our web site, download a beta package, run their program, and then submit the results. They then know immediately that their report is useful because it can be instantly seen by other testers.

I know Ubuntu (and the Canonical QA team) has been trying to improve the use of Test Cases, however the QA website is largely hidden from normal users. Why not consider opening it up like AppDB and actually inviting users to contribute? A user could download the latest nightly, go to the qa website, find the “youtube works out of the box” test case, and submit his experience. To simplify parsing the results, we could even ask for a summary in rating form, analogous to Wine’s “Platinum-Gold-Silver-Bronze-Garbage” ratings.
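
As a rough illustration of why a rating form would be easy to parse, the results for one test case could be reduced to a simple tally. This is only a sketch: the `summarize` function is hypothetical, and the only thing taken from AppDB is the list of rating names mentioned above.

```python
from collections import Counter

# Rating names, best to worst, as listed in the comment above.
RATINGS = ["Platinum", "Gold", "Silver", "Bronze", "Garbage"]


def summarize(results):
    """Count submitted ratings for one test case, in best-to-worst order."""
    counts = Counter(results)
    unknown = set(counts) - set(RATINGS)
    if unknown:
        raise ValueError(f"unknown ratings: {sorted(unknown)}")
    return [(name, counts[name]) for name in RATINGS]
```

For example, `summarize(["Gold", "Gold", "Garbage"])` gives a count of 2 for Gold, 1 for Garbage and 0 for the rest, which is much quicker to read than three free-text reports.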

This is interesting, and we’ve talked about systems like this before, but it’s aimed at a slightly different problem, i.e. getting more testing. At present, many parts of Ubuntu are tested very well by people who are using pre-release versions on a daily basis, but we receive so much feedback that it’s difficult to process effectively.

I like the general idea of making the feedback loop more visible, though.

Sadly Ubuntu is really bad in some cases. I created two separate bugs over two years ago and pointed to the fixes (both one liners) in Redhat’s bug database. Several other commenters manually applied the one line changes and confirmed they worked. To my knowledge the bugs are still outstanding (I unsubscribed after 3 consecutive releases with the bugs still present as it was obvious nothing was going to be done.)

It has now got to the point where I don’t report Ubuntu bugs any more since nothing is ever done to even acknowledge their existence. Sometimes after a release or two someone will change them back to unconfirmed and then ask if all the package updates have coincidentally fixed them.

Even the apport bugs in Jaunty get silly. Some random program crashes, apport uploads megabytes of stuff to Launchpad, which then wants me to log in and somehow guess whether it is the same problem as “Foo died with SIGSEGV”.

So we have a stalemate – Ubuntu/Canonical won’t do much about most reports and people like me won’t make new reports since it is a waste of time.

My suggestion as to how to fix this is to limit the number of open bugs (eg to one hundred times the number of people who work on bugs) and not accept new ones until old ones are resolved. That gives people a strong incentive to address the older ones and gives a quid pro quo to others (eg resolve an issue to make space to report your new issue).
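
The rule is simple enough to write down. This is just a sketch of the idea, with hypothetical names; the factor of one hundred is the figure suggested above, not a tested value.

```python
BUGS_PER_TRIAGER = 100  # the "one hundred times" figure from the suggestion


def can_accept_new_bug(open_bugs, triagers, per_triager=BUGS_PER_TRIAGER):
    """Return True while the open-bug count is still below the cap."""
    return open_bugs < triagers * per_triager
```

With 5 people working on bugs the cap would be 500 open reports: `can_accept_new_bug(499, 5)` is true, and `can_accept_new_bug(500, 5)` is false until something gets resolved.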

It is probably also worth making clear in Launchpad which programs Ubuntu/Canonical cares about and which are pretty much going to be ignored or punted upstream. For example if I report an issue with Pidgin’s handling of offline contacts after a hibernate, it will just be another issue sitting there.

I went ahead and linked that bug in launchpad. What happens in bugs like this is that someone finds the corresponding fix in upstream (or another distro) and then leaves a comment. What should be happening is that these bugs get linked. (I clicked “Also affects distro”, picked Fedora, and then pasted in the bugzilla URL).

This allows lp to track the status. Now when bug people go looking for “bugs fixed elsewhere but not yet in ubuntu” this bug should show up on that radar. I’ve also added some tags, “patch”, and “bitesize” which put it on other reports/lists that are regularly checked.

We are constantly checking documentation to ensure that all bug reporters know this, but unfortunately sometimes things slip through the cracks. We as a project are doing better at linking bug reports than we used to be but we have more places where we could improve things like this.

Things like adding patch tags go a long way to getting it on the right radar, and an actual link to an upstream bug tracker will most likely be handled quicker.

(I couldn’t find a newt bug with an upstream link but I found one with a patch and tagged it as such)

All the upstream linking stuff etc didn’t exist in lp way back then and obviously none of the other commenters knew anything about the various things you just did.

Here is another suggestion to make things better then in lp. Have some sort of button I can click labelled something like “I want to fix this bug” that then starts a wizard – ie something that offers a few options at a time and shows a new page based on choices and other information that is filled in.

For example for someone that isn’t the original poster it could try to confirm the problem. It can ask about looking in other bug trackers and link if necessary. It can ask about patches and guide the process of building a debdiff, doing a PPA and whatever else it takes.

From any starting point there are numerous things that could happen next and just having documentation won’t help since it will only get bigger over time and will turn into spaghetti code (“if confirmed, but no tags then blah blah else upstream blah blah”). Having a guide/wizard that follows the process means it can be far more intelligent about it based on what has already happened.
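
To show what I mean, the core of such a wizard could be a small function that looks at what has already happened and picks the next page. This is only a sketch under my own assumptions: `BugState`, its fields and the wording of each step are hypothetical, and a real version would need many more states.

```python
from dataclasses import dataclass


@dataclass
class BugState:
    """Hypothetical snapshot of what has already happened on a bug."""
    confirmed: bool = False
    upstream_checked: bool = False
    has_patch: bool = False


def next_step(bug):
    """Pick the next wizard page from the bug's current state."""
    if not bug.confirmed:
        return "Try to reproduce the problem and confirm it"
    if not bug.upstream_checked:
        return "Search other bug trackers and link any matching report"
    if not bug.has_patch:
        return "Attach a patch or build a debdiff"
    return "Publish a test build in a PPA and ask for verification"
```

The point is that the person clicking the button only ever sees the one step that applies, instead of having to read the whole decision tree in the documentation.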

1) Are you talking about main or are you talking about universe? Is it perhaps time to think about merging the two so that the external community has more direct responsibility for maintaining main?

2) Do you have a good estimate of the average package workload per maintainer for main or universe? If you brought the package/maintainer ratio down would that help? Is your package collection growing “too fast” compared to manpower to be sustainable? Maybe you actually need to slow down or cull your application offerings and find a way to grow them that is directly related to growing maintainership manpower.

3) Have you dropped the barrier to entry for bug reporting too far?

4) What’s the break down like between post-release bug reports and pre-release bug reports? Is that way out of balance? Maybe you need to find a way to put incentives in place for more people running the alpha and beta codes before releases and focus developer time on pre-release bugs and make it clear from a policy perspective that reports from pre-releases get a higher priority.

4) Have you tried to actually put metrics on where the bottlenecks are in developer efficiency?

Is lowering the bar to bzr commit really going to help? Is attaching a patch to a bug report into the web interface of lp blocking anything? If there is a problem with the patch you’ll still have to interface with the patch submitter via the bug ticket discussion.

1) This is about the sum total, and in point of fact we are moving toward a more unified way of organizing the package repository. However, note that (in contrast to some similar projects), Ubuntu main has *always* been open to the whole community, not just Canonical. There is a higher bar to gain privileges to work in main, but anyone who meets the criteria is welcome to develop in main.

2) Metrics like this don’t work very well for us because of the relationship we have with Debian. Primary maintenance for most of the packages in Ubuntu is provided by Debian, so there isn’t a meaningful package/ubuntu-maintainer ratio. Bug reports per developer is a more useful metric, and (notably) the number of bug reports scales with the number of users, while the developer base grows more slowly.

3) We started off with nearly zero barrier for bug reporting, so I don’t think we could possibly have dropped it lower. :-) For us, it’s a key part of being an open community that we accept bug reports from anyone. I wouldn’t want to sacrifice that for pragmatic reasons, and if we can filter and prioritize them appropriately, we shouldn’t need to.

4a) Yes. The bzr work will make this process more efficient, but even more importantly, it lowers the barrier to entry dramatically for new contributors. Patches attached to bugs require much more developer time to review and merge, and aren’t tracked as effectively as branches are. Bazaar branches have their own review mechanism which is more convenient than using bug comments.

In many cases the report lacks enough information, and a request for more information also goes unanswered. As a result, the effort put into testing is wasted. Does anyone have a practical suggestion for how we can overcome this?