With Many Eyeballs, All Bugs Are Shallow

In his seminal work The Cathedral and the Bazaar, Eric Raymond put forward the claim that “given enough eyeballs, all bugs are shallow.” He dubbed this Linus’ Law, in honor of Linux creator Linus Torvalds. It sounds like a fairly self-evident statement, but as the Wikipedia page points out, the notion has its detractors. Michael Howard and David LeBlanc, for example, claim in their 2003 book Writing Secure Code that “most people just don’t know what to look for.”

A new report from the Coverity Scan project today indicates that a great many people do know what to look for, and that open source software is at least on par with, if not better than, proprietary software with respect to software defects. The Coverity Scan project evaluated selected open source projects and a number of anonymous proprietary codebases to identify “hard-to-spot, yet potentially crash-causing defects.” The results reinforce Linus’ Law.

Coverity’s Scan project began in 2006, at the request of the U.S. Department of Homeland Security. The DHS noticed the increased adoption of open source projects, and wanted to measure the overall security and quality of the code. Every year since, Coverity has run its Scan project on different selections of open source projects, including the Linux kernel, and published a report of the findings.

The basic analysis of the Scan project is defect density, which is “computed by dividing the number of defects found by the size of the code base in lines of code. The advantage of using defect density is that it accounts for the differing size of software code, which makes defect density figures directly comparable among projects of differing sizes.” The defects identified were limited to the “high-impact” and “medium-impact” categories from Coverity’s Static Analysis scanning suite, which cover issues such as Null Pointer Dereferences, Uninitialized Variables, Memory Corruption, Error Handling, and Illegal Memory Access.
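The arithmetic behind the metric is simple enough to sketch in a few lines of Python. Note the figures below are illustrative: the report gives densities per thousand lines of code (the unit implied by results like 1.0 and .62), and the defect count for Linux 2.6 is back-calculated from its stated size and density rather than quoted directly.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Defects per thousand lines of code (KLOC), the unit used in the report."""
    return defects / (lines_of_code / 1000)

# Back-calculated illustration: ~7 million lines of code at a density of .62
# implies roughly 4,340 flagged defects.
print(round(defect_density(4340, 7_000_000), 2))  # → 0.62
```

Normalizing by code size is what makes a 7-million-line kernel directly comparable to a much smaller project like PHP or PostgreSQL.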

According to Coverity, within the software industry as a whole a defect density of 1.0 is the average. As you can see from Coverity’s findings, the Linux 2.6 kernel, PHP 5.3, and PostgreSQL 9.1 all have significantly smaller defect densities. The report provides some good analysis of the numbers, and specifically examines why the Linux kernel has a higher defect density than the other open source projects included in the scan:

Breaking down the defect density within each of the software components, the kernel has a higher defect density. This is likely because every fix has to be weighed against the risk of destabilizing existing code—it’s the “some fixes shouldn’t be made until you are changing that area of the code” principle. Also, kernel developers may be reluctant to change code that is known from experience to be stable in the field just to satisfy static analysis results. They may wait until the code is being altered for other purposes to incorporate defect fixes into the new code. On the other hand, the kernel has one of the fewest number of defects classified as high risk compared against other components such as drivers. This is likely due to the criticality and widespread usage of the kernel compared to device drivers, many of which are of interest to only a small portion of the Linux global user base.

It would have been easy to just run the scan and report on the numbers, but that would not have been the complete story. I’m glad to see that Coverity actually investigated the results.

Coverity’s 2011 report is the first time they have directly compared open source and proprietary software. Given that the proprietary code included in Coverity’s scan was from anonymous companies, I asked for some details about the industries from which these applications were pulled, as well as how mature and complex the projects may be. According to Zack Samocha, the Scan director, the applications came from “financial services, telecommunications, medical devices, aerospace and defense, industrial automation, automotive” and more. Most of the applications have been under development for 5-10 years, and “are mature applications that are embedded into some of the world’s most well known devices and critical systems. For example, software that runs MRI machines, software that runs critical power infrastructure/grid, software used in high frequency trading applications and stock exchanges, etc.”

From the report:

The average codebase size for proprietary codebases in our sample is 7.5 million lines of code, significantly larger than the average for open source software included in our analysis. Therefore, to make a more direct comparison we looked at the defect density of proprietary code against open source codebases of similar size. The average defect density for proprietary codebases of Coverity users is .64, which is better than the average defect density of 1.0 for the software industry. We also found that open source code quality is on par with proprietary code quality for codebases of similar size. For instance, Linux 2.6 has nearly 7 million lines of code and a defect density of .62, which is roughly identical to that of its proprietary codebase counterparts.

Open source quality is on par with proprietary code quality, particularly in cases where codebases are of similar size.

Organizations that make a commitment to software quality by adopting development testing as a part of their development workflow, as illustrated by the open source and proprietary codebases analyzed, reap the benefits of high code quality and continue to see quality improvements over time.

As I said, Linus’ Law in many ways seems self-evident: people who live and breathe open source software have known the truth of the statement for a long time. Coverity’s analysis provides some objective verification, which is a good thing. Now we just need someone to update the Wikipedia entry for Linus’ Law to cite this report as a counter-argument to its detractors!