
Deploying Static Analysis

By Flash Sheridan, August 07, 2012

Static analysis is a cheap and easy way to find bugs, but it poses significant challenges that tend to be more political than technical.

It can be tempting to delegate the triage (that is, initial screening) of static analysis defects to someone other than the code owner. This is much more hazardous than it first appears. It's quite easy for unsupervised screeners to subtract much of the value you could get from static analysis. The conventional wisdom operates here: A smart programmer adds more value than several cheap programmers, and subtracts vastly less value. If an unqualified screener triages half of the genuine defects away as false false positives, and marks half the false positives as genuine, then the engineer responsible for fixing them won't see many of the genuine defects, and may use the erroneously marked false positives as an excuse to speed-mark the rest as false positives, too.
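To make the arithmetic concrete, here is a minimal sketch. The queue of 80 genuine defects and 20 false positives is a hypothetical figure, and the function and parameter names are illustrative assumptions rather than part of any tool.

```python
def screen(genuine, false_positives, miss_rate, wrong_accept_rate):
    """Model a screener who forwards only some defects to the code owner.

    miss_rate: fraction of genuine defects wrongly dismissed
               (the "false false positives").
    wrong_accept_rate: fraction of false positives wrongly marked genuine.
    All inputs are hypothetical; this illustrates the argument, not a tool.
    """
    genuine_forwarded = genuine * (1 - miss_rate)
    false_forwarded = false_positives * wrong_accept_rate
    total = genuine_forwarded + false_forwarded
    # Share of the forwarded queue that is actually worth fixing.
    precision = genuine_forwarded / total if total else 0.0
    return genuine_forwarded, false_forwarded, precision


# Hypothetical raw queue: 80 genuine defects, 20 false positives.
raw_precision = 80 / (80 + 20)
forwarded_genuine, forwarded_false, screened_precision = screen(80, 20, 0.5, 0.5)
```

Under these assumed numbers, the engineer's queue is no cleaner after screening than before (the share of genuine defects is unchanged), yet half of the genuine defects never reach the engineer at all.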

Even more importantly, the most valuable benefit from static analysis, greater even than fixing bugs, is preventing future bugs by educating the developer about his or her mistakes. If someone else is choosing the defects, the developer never sees the mistakes and cannot learn from them.

Not Our Problem

If your software includes third-party code, it's an unpleasant political reality that you may include it in your product without fixing it. Whether this is a good idea for your customers or your organization is outside the scope of this article; but if you're not going to fix it, then you shouldn't spend too much time investigating static analysis defects in it. On the other hand, you should nag your suppliers to use static analysis to find and fix their bugs.

Preparing for an Embarrassment of Riches

The Embarrassment of Riches problem means that a modern commercial static analysis tool generally finds more bugs than the user has resources, or willingness, to fix. Political resistance to static analysis bugs is sometimes warranted, sometimes mere laziness, but sometimes deeper and cultural: Avoiding the kinds of bugs that static analysis finds is largely a matter of discipline, which is sometimes unpopular among programmers. Fixing these bugs, and verifying that your organization has done so, will require adaptability and judgment. Attempts to design simple rules and metrics for this are, in my opinion, at best premature, and perhaps impossible.

Non-Goals and Metrics

Beware of two fallacies. The first is one of the standard risks of measurement, over-optimization of a single-factor measurement: The factor you measure may be all that gets optimized, displacing efforts towards your genuine goal. (See section 26.2 of Steve McConnell's Rapid Development for more information.) The second is what I call the "Management Metrics Fallacy":

If it can't be measured, it can't be managed. What's easy to measure is all that's important.

Stable metrics are the enemy of quality: If your tool must produce numbers that won't oscillate or upset people, then it can't change rapidly in order to catch real bugs or to stop reporting fake ones. This will bring static analysis into disrepute within your organization, and give engineers an excellent excuse for ignoring (or de-prioritizing) static analysis defects.

Conversely, it's easy to track a number that makes management look good, but doesn't get important bugs fixed; for example, the number of projects analyzed or the number of unexamined defects. And even though getting bugs fixed is the goal, simply tracking that number may still be misleading (I have been guilty of this myself). If you put too much emphasis on the number of defects fixed, the developers being measured may spend more time than is warranted on unimportant (or even obsolete) areas of the code, where static analysis defects may be more common. Some code should be excluded from analysis and ignored, rather than fixed. And, sadly, some shipping code will be excluded from being fixed.

What counts is a hypothetical, and hence impossible to measure with certainty: How many serious bugs did you prevent from reaching customers? This is related to a secondary consideration: How many bugs did you prevent from reaching manual testing, where bugs start to be expensive?

False Negatives

No static analysis tool will find all bugs in any significant codebase. Part of the revolution in static analysis was giving up on even limited attempts to do so, in favor of heuristics to find actionable bugs. Definite numbers are hard to come by, but Coverity's Analysis Architect estimates that it probably finds fewer than 20% of the bugs present. It seems unlikely that any current tool can do much better except at the cost of an unwieldy number of false positives. (Vendor claims of completeness are generally so restricted as to be impractical for normal development.) This is less of a disadvantage than it seems, since the Embarrassment of Riches problem means that a modern commercial static analysis tool generally finds more bugs than the user has time to fix. Thus, prioritization is crucial.

Living with False Positives

Conversely, no significant static analysis tool is immune to false positives; the proportion of these also tends to be, very roughly, 20%. This figure, however, is usually eclipsed by the number of technically correct defects that are of little interest for other reasons, such as being in a part of the code that no one cares about, or being (in the opinion of the code owner) unlikely to occur in the field. (You must also be prepared for a high false false positive ratio, though this varies widely among developers.) Even with a high false positive ratio, static analysis is still vastly more efficient than other forms of bug finding, since it takes little time for a code owner to dismiss an irrelevant defect. (Triaging false positives can be more time-consuming for people not familiar with the code, however.) It is important to manage expectations, nonetheless, so that someone tasked with examining static analysis defects is not discouraged by false positives from persevering in resolving real bugs.

Ranking and Prioritization

Most static analysis tools present defects in essentially random order (for example, alphabetical by checker name or by file). This is unwise. If the first defect in a given engineer's queue is unimpressive, you may have lost that engineer. This is particularly disappointing because there are techniques in the academic literature for ranking defects by reliability, relevance, and estimated importance. (Various presentations by Ted Kremenek dig into this topic more deeply.)
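As a rough illustration of the difference ordering makes, the sketch below sorts a queue by a naive score. It is not one of the published ranking techniques; the Defect fields, the checker names, the file paths, and the score (estimated likelihood times severity) are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    checker: str
    path: str
    likelihood: float  # hypothetical estimate in [0, 1] that the report is a real bug
    severity: int      # hypothetical weight: 1 (cosmetic) to 5 (crash or security)


def rank(defects):
    """Order defects so the most convincing, most serious ones come first."""
    return sorted(defects, key=lambda d: d.likelihood * d.severity, reverse=True)


# A made-up queue, listed alphabetically by checker name as many tools would.
queue = [
    Defect("DEAD_CODE", "legacy/report.c", 0.9, 1),
    Defect("NULL_DEREF", "net/session.c", 0.7, 5),
    Defect("RESOURCE_LEAK", "net/session.c", 0.5, 3),
]

ranked = rank(queue)
```

In alphabetical order, the engineer's first impression is a dead-code warning in legacy code; with even this crude score, the likely null dereference comes first. A real deployment would need likelihood estimates grounded in triage history rather than the guesses used here.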
