Tag Archives: verification bottleneck

While researching the recent P!=NP paper and the activity surrounding it, I found that claims of P!=NP, or its inverse, are being made all the time. One of the researchers posted a list of rules they use to quickly filter out bogus claims. Among these were some amusing ones (the paper was not created using LaTeX) and some quite simple and powerful ones (the proof does not exclude known counterexamples). Deolalikar’s paper managed to pass most of these rules, which is one of the reasons it was taken seriously.

I recently read a blog post about verification with some claims that I find quite dubious. This gave me the idea of posting some “rules” for reading papers/articles/blog posts on verification that I think are useful in judging the validity of these claims. Note also that I have addressed most of these rules in more detail in previous posts.

The most frequent claims I see are those involving the coming “verification bottleneck”. The claim is invariably that verification is 70% of the effort today (an oft-quoted but unproven statistic) and that tomorrow, because chips will have twice as many transistors, it will be much higher.

What seems to get lost in these claims is that the same claim could have been made two years ago for the current generation, and the effort for the previous generation was still 70% (the 70% number has been around for quite a long time). In fact, going back to the dawn of the RTL era, when there was a dramatic shift in methodology, one could have made the same claim for the last ten generations of chips. This is usually ignored, and no reason is given for why the 11th generation is somehow profoundly different from the last ten such that we need to buy these new tools (which is, inevitably, the other shoe dropping in these articles).

In short, be skeptical of claims of future bottlenecks that ignore the fact that the jump from this generation to the next is no different from the jump over the previous ten generations.

Be skeptical of claims that use the term “verification bottleneck”. Do a Google search on “verification bottleneck” and you will find that all the results are either research papers or vendors trying to sell a new product.

Be skeptical of claims using math. You will sometimes see something like: verifying N gates today requires 2^N effort, and tomorrow there will be double the number of gates, so it will take 2^(2N) effort. Therefore, effort is increasing exponentially, so buy our tool, which greatly increases capacity. This is basically a variant of the first rule, but, in general, if you see math, be skeptical.
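One back-of-the-envelope way to see why this arithmetic is suspect (my own illustration, not taken from any particular paper): the exponential blow-up only appears if you assume the doubled design must be verified as one flat, monolithic block. If it is partitioned and verified as two blocks of N gates each, the same 2^N assumption gives merely double the effort:

```latex
% Illustrative only: assumes effort scales as 2^N for an N-gate block.
\underbrace{2^{2N} = \left(2^{N}\right)^{2}}_{\text{flat verification of the doubled design}}
\qquad \text{vs.} \qquad
\underbrace{2 \cdot 2^{N}}_{\text{two $N$-gate blocks verified separately}}
```

The scary version of the claim silently picks the flat case, while real methodologies have always partitioned the problem.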

It is common (and almost a condition of acceptance) for verification research papers to conclude with something like “…and a bug was found that was missed by simulation”, or something along these lines. Remember, bugs are easy. The fact that an algorithm found new bugs does not mean that it is better than any existing algorithm.

The best way to read verification papers is to ignore the fact that bugs may have been found. If you still find value in the paper even when you assume it found no bugs, then it probably does have some value.

The last rule comes from the premise that bugs are easy, or, as I like to call it, the “pick axe and shovel” argument. This usually arises in relation to a claim about better results (more bugs) being found on some, usually large-scale, verification effort. The better results are then used to justify the proposed methodology as being superior to other methodologies.

The issue here is that such projects usually involve a large number of well-coordinated, smart people. The counter-argument is: I could take the same smart people and, using nothing more than pick axes and shovels, find lots of bugs. So why does this prove that your methodology is better?

What prompted me to write this post was this post by Harry Foster. It is posted on Mentor’s website, so Harry is obviously pitching the company party line. I won’t bother critiquing the post here. I suggest reading Harry’s post after reviewing the rules above. Let me know what you find.

I was listening to a talk by Daniel Kroening, a software verification researcher at Oxford University, who was explaining the certification process for safety-critical software. He mentioned that one of the requirements is that all test cases must be verified on the actual hardware. In the case of avionics software, that means one test flight for every test case. Since test flights are expensive, minimizing the number of test cases required to cover everything is extremely important.
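Selecting a minimal set of test cases that together cover every requirement is essentially the classic set-cover problem. Here is a minimal sketch (my own illustration, not from the talk; all names and data are hypothetical) of the standard greedy heuristic, which repeatedly picks the test covering the most still-uncovered requirements:

```python
# Illustrative sketch: test-case minimization as greedy set cover.
# Assumed model: each candidate test covers a known set of requirements.

def greedy_test_selection(tests, requirements):
    """tests: dict mapping test name -> set of requirements it covers.
    Returns a (not necessarily optimal) small list of tests covering
    as many of the given requirements as possible."""
    uncovered = set(requirements)
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered requirements.
        best = max(tests, key=lambda t: len(tests[t] & uncovered))
        if not tests[best] & uncovered:
            break  # no remaining test covers anything new
        selected.append(best)
        uncovered -= tests[best]
    return selected

# Hypothetical example: five requirements, four candidate test flights.
tests = {
    "flight_A": {"r1", "r2", "r3"},
    "flight_B": {"r2", "r4"},
    "flight_C": {"r4", "r5"},
    "flight_D": {"r1"},
}
print(greedy_test_selection(tests, {"r1", "r2", "r3", "r4", "r5"}))
# -> ['flight_A', 'flight_C']: two flights instead of four
```

The greedy heuristic is not guaranteed to be optimal (set cover is NP-hard), but it gives a logarithmic approximation, which is why minimizing expensive test cases is a hard optimization problem rather than a bookkeeping exercise.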

This is one of those cases where you get a dot, connect it with other seemingly random dots, and see that you have a line. In this case, what I realized is that the difficulty in verification is not really in finding bugs (bugs are easy, right?), but in how efficiently we find them. Recently I posted on how constrained random testing is essentially a (hard) optimization problem. I also posted on how the best verification methodology is to combine orthogonal methodologies in order to optimize bug-finding productivity. The criticality of optimizing safety-critical test cases was another data point that led me to this realization.

This is reflected in the fact that many of the most successful verification tools introduced over the last twenty years have succeeded by optimizing verification productivity. As we all know, faster simulation really does very little to improve the quality of a design, but it helps enormously in improving verification productivity. Hardware verification languages are probably the second most important development in verification in the last twenty years. But again, they don’t improve quality, they simply improve productivity in developing verification environments.

This is not to say that there have not been tools that improve quality. Formal proof clearly improves quality when it can be applied, although semi-formal verification, which focuses on bug hunting, is more of a productivity increase. In-circuit emulation allows you to find bugs that could not be found in simulation because the design runs against real hardware. However, emulation used simply as faster simulation is really just a productivity increase.

Is verification optimization related to the well-discussed verification bottleneck (you know, the old saw about verification consuming 70% of the effort)? Verification became a bottleneck when the methodology changed from being done predominantly post-silicon to being done predominantly pre-silicon. Many people saw the resulting dramatic increase in verification effort as being correlated with increased design size and complexity. If this were true, then verification would consume 98% of the effort today, since this switch occurred twenty years ago and there have been many generations of products since then. Since relative verification effort has not changed significantly over the last twenty years, I think it is safe to say that it is roughly constant with increasing design size, and that differences in relative effort reflect differences in methodology more than anything else.

The real question is: will verification optimization become more important in the future as designs become larger and will that result in relative verification effort rising? If there is no change to design methodology, we would expect verification optimization effort to remain constant. If high-level synthesis allows us to move up the abstraction ladder, this should improve the ability to optimize verification. In short, there does not seem to be a looming crisis in overall verification optimization.

However, if we look at the software side, we see that software content on hardware platforms is growing rapidly, which is putting enormous pressure on the ability to verify this software. Effectively, we have managed to forestall the hardware verification optimization crisis by moving it to software.