DHS-sponsored audit: number of OSS code defects dropping

Two years ago, the Department of Homeland Security launched an initiative …

Coverity Inc., in partnership with the Department of Homeland Security, has announced (PDF, registration required) the results of a two-year, open-source code-scanning initiative begun in 2006. At the time, the DHS was concerned that while open-source software had been widely deployed in areas where computer security was paramount, the various programs themselves had never been systematically audited. The DHS authorized the expenditure of $1.24 million, split between Stanford, Coverity Inc., and Symantec, to facilitate the development of automated static analysis tools that would be specifically used for vetting open-source projects. The project did not include a mandate to measure the severity of a bug or the degree to which it might be practically exploited, and Coverity includes no such data here.

Coverity measured the number of defects in each program and reported the average number of defects per 1,000 lines of code, also known as the defect density. Comments and whitespace were not counted, and a single line of code containing multiple statements was still counted as one line. Although Coverity has continued to refine its static analysis tool (currently at version 3.10), the same version of Coverity Prevent, 2.4.0, was used for all code analysis. Coverity Prevent was configured identically on all systems, and any reported errors later classified as false positives were removed from the results.
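
As a rough illustration of that counting convention, here is a simplified sketch (not Coverity's actual implementation): blank lines don't count, a line holding several statements still counts once, and for brevity only lines that begin with a comment marker are skipped.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the line counts as code under the simplified convention:
   not blank, and not a comment-only line. Real tools handle comments
   far more thoroughly; this is only a sketch. */
static int line_is_code(const char *line) {
    while (*line && isspace((unsigned char)*line))
        line++;
    if (*line == '\0')
        return 0;                      /* blank line: not counted */
    if (strncmp(line, "//", 2) == 0 || strncmp(line, "/*", 2) == 0)
        return 0;                      /* comment-only line: not counted */
    return 1;
}

/* Count code lines in a NUL-terminated source buffer. A line with
   multiple statements ("int x = 1; int y = 2;") still counts once. */
int count_sloc(const char *src) {
    char line[1024];
    int count = 0, i;
    while (*src) {
        i = 0;
        while (*src && *src != '\n' && i < (int)sizeof line - 1)
            line[i++] = *src++;
        line[i] = '\0';
        if (*src == '\n')
            src++;
        if (line_is_code(line))
            count++;
    }
    return count;
}
```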

The results of the analysis suggest that regular static scanning lowered the defect density for the majority of the programs scanned. In 2006, Coverity's scan detected an average of 0.30 defects per 1,000 lines of code, or, put differently, one code defect for every 3,333 lines. The lower boundary, in this case, was 0.02 (one defect per 50,000 lines) and the upper boundary was 1.22 defects per thousand lines of code.

Two years later, the average defect density has fallen to 0.25, or one error per 4,000 lines of code. The upper boundary remains unchanged at 1.22, but the lower boundary has shrunk to 0, implying that repeated scanning has eliminated the errors from at least one program—at least all the errors that Coverity's 2006 static analysis program was able to detect.
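
The arithmetic behind these figures is straightforward. A minimal sketch (the densities are the averages from the report; the functions are plain arithmetic, not anything from Coverity's tooling):

```c
/* Convert defects-per-1,000-lines into lines-per-single-defect. */
double lines_per_defect(double density) {
    return 1000.0 / density;
}

/* Relative drop between two densities, as a percentage. */
double reduction_percent(double before, double after) {
    return (before - after) / before * 100.0;
}

/* lines_per_defect(0.30)  -> about 3,333 lines (the 2006 average)
   lines_per_defect(0.25)  -> 4,000 lines (two years later)
   reduction_percent(0.30, 0.25) -> about 16.7 */
```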

A 16 percent reduction in defect density over two years is a notable gain, and Coverity singled out certain participating projects as having an exceptionally low defect density:

Postfix

Perl

PHP

Python

Samba

NULL pointer dereference errors and resource leak errors were by far the two most common defect types; together they accounted for 53.68 percent of the top ten errors found. As previously mentioned, Coverity attempted no analysis of how exploitable any particular error was, and did not rate the severity of any resulting exploit, if one indeed existed. It is therefore impossible to draw conclusions about how much consistent static analysis improved the security of any specific application. Generally, however, the situation has improved; lower software defect densities translate into fewer potential attack points.
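
For readers unfamiliar with these two defect classes, the hypothetical function below (not code from any scanned project) marks the spots where each would appear if the checks were omitted:

```c
#include <stdio.h>

/* count_lines is an invented example. The comments mark where the two
   most common defect classes would arise if the guards were missing. */
int count_lines(const char *path) {
    FILE *f = fopen(path, "r");
    int lines = 0, c;

    /* Without this check, the fgetc(f) below would be a NULL pointer
       dereference whenever fopen fails. */
    if (f == NULL)
        return -1;

    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;

    /* Returning before this fclose (say, an early "return 0" above)
       would be a resource leak: the FILE handle is never released. */
    fclose(f);
    return lines;
}
```

Static analyzers flag both patterns by tracing every path from `fopen` to the end of the function: a path that dereferences `f` without the NULL check, or one that exits without reaching `fclose`, is reported as a defect.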

This type of long-term analysis is a step towards validating software projects as "secure," but it is not an all-encompassing solution. Static analysis tools are limited by definition: they examine code without executing it, and they are not infallible. Coverity reports that approximately 14 percent of its bug identifications turned out to be false positives. Even with these caveats, however, the Department of Homeland Security's original goal of hardening OSS applications and subjecting them to regular audits has apparently been achieved.