The Day the Universe Changed

If you have ever taped out a chip and gotten it back in the lab, you have inevitably had the experience of finding a bug that makes you say, “how in the world did we miss this?” When the chip taped out, it had been through weeks of regression and random tests without any bugs turning up. You reviewed everything and believed that the verification was comprehensive, so that if there were any holes, they were not big ones.

But then, when you get the chip back in the lab, it hits a bug within a second of being turned on. When you analyze it, you find that it is a very simple case: perhaps a certain bus request in conjunction with a particular configuration bit being set is sufficient to trigger the bug. You think, “how could we have missed something so simple?”

The answer is also simple: the Universe has changed. It is not unlike the day the Universe changed when Einstein published his theory of relativity. What changed that day is the same as what changed the day you got your chip back in the lab: you are looking at the problem with a different mindset.

The mindset you started off with came about when you wrote your initial verification test plan. To understand this, consider how a test plan is created. A test plan basically divides the entire space of possible behavior into different sets. For example, consider the creation of a test plan for a simple memory system. First, the set of all tests can be divided into two basic types: reads and writes. Next, each of these can be subdivided into cacheable and uncacheable. Next, each subclass can be divided based on address alignment. We also need to consider sequential behavior. Therefore, each class is further subdivided according to the second request that follows it, which can itself be subdivided, again, into read/write, cacheable/uncacheable, and so on.

This test plan can be represented by a tree in which each branch represents a division based on a choice such as read or write, cacheable or uncacheable. This tree grows very quickly, and it is up to the verification engineer to decide how deep is deep enough to be considered comprehensive. For anything of any complexity, the number of leaves of the tree is far more than can be tested in a reasonable amount of time. Directed and random testing are ways of sampling the leaves of the tree in a uniform way such that there are no major holes.

A Test Plan For a Simple Memory System
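
To get a feel for how quickly the tree grows, here is a minimal Python sketch that enumerates the leaves for the memory system described above. The dimension names and the `leaves` helper are illustrative assumptions, not part of any real test bench; each request is one choice per dimension, and a leaf is a sequence of such requests.

```python
from itertools import product

# Hypothetical dimensions for the simple memory system (names are assumptions).
DIMENSIONS = {
    "op": ["read", "write"],
    "cache": ["cacheable", "uncacheable"],
    "align": ["aligned", "unaligned"],
}

def leaves(dimensions, sequence_length=1):
    """Enumerate every leaf: one choice per dimension for each request in the sequence."""
    single_request = list(product(*dimensions.values()))
    return list(product(single_request, repeat=sequence_length))

# Count the leaves as we consider longer sequences of back-to-back requests.
for n in (1, 2, 3, 4):
    print(n, len(leaves(DIMENSIONS, n)))
```

With only three two-way choices, a single request has 8 leaves, and every additional request in the sequence multiplies that by 8 again (64, 512, 4096). A real design has far more dimensions, which is why the leaves can only ever be sampled.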

Clearly, bugs can slip through if they occur in one of the leaves that ends up untested. But, if tests are uniformly distributed across leaves, then there should be no major holes. So, for example, if reads don’t work at all, this will be detected because we have sampled on the read side of the tree.

Now suppose we reorder the tree. For example, instead of the order being read/write, cacheable/uncacheable, aligned/unaligned, we reverse the order.

A Test Plan For a Simple Memory System - Reordered

This reordering can cause a uniform distribution of tests to suddenly become non-uniform. Reordering can reveal large holes in the testing. Reordering in this case reveals that the combination of unaligned/uncacheable requests is not tested, something that was not obvious in the original tree.
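
The effect of reordering can be sketched in a few lines of Python. The test list below is a made-up example, assumed only for illustration: it contains every leaf except the ones that are both uncacheable and unaligned. The `subtree_counts` helper is likewise hypothetical; it counts how many tests fall under each branch at a given depth for a given ordering of the tree.

```python
from itertools import product

# Hypothetical dimensions and values (assumed names, for illustration only).
VALUES = {
    "op": ("read", "write"),
    "cache": ("cacheable", "uncacheable"),
    "align": ("aligned", "unaligned"),
}
ORIGINAL = ("op", "cache", "align")
REORDERED = ("align", "cache", "op")

# Made-up test list: every leaf except uncacheable + unaligned requests.
ALL_LEAVES = [dict(zip(VALUES, combo)) for combo in product(*VALUES.values())]
TESTS = [t for t in ALL_LEAVES
         if not (t["cache"] == "uncacheable" and t["align"] == "unaligned")]

def subtree_counts(tests, order, depth):
    """Count tests under each branch at the given depth of the tree built in `order`."""
    dims = order[:depth]
    counts = {prefix: 0 for prefix in product(*(VALUES[d] for d in dims))}
    for t in tests:
        counts[tuple(t[d] for d in dims)] += 1
    return counts

for name, order in (("original", ORIGINAL), ("reordered", REORDERED)):
    for depth in (1, 2):
        print(name, depth, subtree_counts(TESTS, order, depth))
```

In the original ordering, every branch at depth one and depth two has at least one test, and reads and writes get three tests each, so the sampling looks even. In the reordered tree, the aligned side has four tests and the unaligned side only two, and the unaligned/uncacheable branch at depth two has none at all.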

This is how the Universe changes when we get the chip back in the lab. We find a bug, but we are looking at it with a different ordering of the tree. We are imagining a tree in which the top of the tree is the bus request and the next branch is that particular configuration bit, and we see that this whole side of the tree is completely untested. When we think “how did we miss this?”, we are forgetting that back at the beginning, we were looking at a different tree, one in which that configuration bit was near the bottom. Because of this we didn’t notice that, despite the fact that we were sampling the leaf nodes comprehensively, we were never setting this configuration bit in conjunction with this particular bus request.

So, to avoid these kinds of bugs in the future, you might try, as you near tapeout, to change the Universe by reordering the tree you used to create your test plan.
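
One way to do this mechanically is to try every ordering of the tree and look for branches near the top that have no tests under them. The following Python sketch does that under stated assumptions: the dimensions, the `cfg_bit` field, the made-up test list, and the `find_holes` helper are all hypothetical and only illustrate the idea.

```python
from itertools import permutations, product

# Hypothetical dimensions; "cfg_bit" stands in for some configuration bit.
VALUES = {
    "op": ["read", "write"],
    "cache": ["cacheable", "uncacheable"],
    "align": ["aligned", "unaligned"],
    "cfg_bit": [0, 1],
}

# Made-up test list: every request type with cfg_bit clear,
# but cfg_bit set only ever exercised together with reads.
TESTS = [
    {"op": op, "cache": c, "align": a, "cfg_bit": bit}
    for op, c, a, bit in product(*VALUES.values())
    if bit == 0 or op == "read"
]

def find_holes(tests, values, max_depth=2):
    """Return the branches (up to max_depth) that are untested under some ordering."""
    holes = set()
    for order in permutations(values):
        for depth in range(1, max_depth + 1):
            dims = order[:depth]
            for combo in product(*(values[d] for d in dims)):
                wanted = dict(zip(dims, combo))
                hit = any(all(t[d] == v for d, v in wanted.items()) for t in tests)
                if not hit:
                    holes.add(frozenset(wanted.items()))
    return holes

for hole in find_holes(TESTS, VALUES):
    print(sorted(hole))
```

For this test list the only hole reported is the write request with the configuration bit set, which is exactly the kind of combination that sits near the bottom of one tree and at the top of another. The number of orderings grows factorially with the number of dimensions, so in practice the depth would have to be kept small or the orderings sampled.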
