Fabless Silicon Manufacture: How Much Quality Do You Need?

November 10, 2016, anysilicon

First off, we need to define what is meant by “quality” for this discussion. In generic terms it means something like “how well a product does its job”. For a chip, or rather for what comes out of the end of a chip supply chain, this isn’t simply a measure of its functionality, for example whether an adder reckons that 2+2=4. To a manufacturing guy, having all manufactured parts reckon that 2+2=5 is a success: they all do exactly the same thing, so they have been manufactured consistently. This demonstrates the meaning of quality for a population of manufactured chips: uniformity.

Quality is a characteristic of a bulk population, a large quantity of things that are intended to behave the same as each other. Have they all been manufactured correctly, according to the tooling that has been created and the process steps that have been defined?

But why wouldn’t they behave the same as each other? There’s the simple case of faults and defects: open traces, shorted gates, and so on. And there’s the increasing problem of variability as physical dimensions get closer to the size of the underlying “building blocks”, the atoms themselves. So the pursuit of quality is a hunt for exceptions: defects and out-of-spec variations.

How is quality measured? The metric we’ll use here is DPPM, Defective Parts Per Million: the number of shipped parts per million that do not work. For any population this is an intrinsic characteristic, but it is extremely difficult to measure accurately. In practice we have to approximate, judge, measure what we can and aim for it to be good enough.
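To give a feel for why DPPM is hard to measure, here is a minimal sketch. The function names and sample sizes are hypothetical, chosen only for illustration. The first function is the plain definition. The second assumes parts fail independently with the same probability and asks: if a sample shows no failures at all, how high could the true DPPM still be?

```python
def observed_dppm(failures, shipped):
    """Plain definition: defective parts per million shipped."""
    return 1e6 * failures / shipped


def dppm_upper_bound_zero_failures(sample_size, confidence=0.95):
    """Upper bound on the true DPPM when a sample shows zero failures.

    Assumes independent parts with a common defect probability p, so the
    chance of seeing no failures in n parts is (1 - p) ** n. Solving
    (1 - p) ** n = 1 - confidence gives the largest p still consistent
    with a clean sample at the chosen confidence level.
    """
    p = 1.0 - (1.0 - confidence) ** (1.0 / sample_size)
    return 1e6 * p


if __name__ == "__main__":
    # Hypothetical sample sizes, for illustration only.
    for n in (10, 1_000, 100_000):
        bound = dppm_upper_bound_zero_failures(n)
        print(f"{n} clean parts: true DPPM could still be up to about {bound:.0f}")
```

Under these assumptions, a thousand clean parts only shows that the true figure is probably below roughly 3000 DPPM, so demonstrating a low DPPM by direct measurement takes a very large sample.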

What quality level, in DPPM, should we aim for? What is “good enough”? Well, what does the customer want? Or rather, what would they tolerate? They are unlikely to start measuring it until it becomes a problem, and it’s a difficult subject to talk to them about, because there is little incentive for them to say anything other than “zero defects, please”. You probably have to make your own judgements and aim for “just enough quality”.

So who is the customer? To start with, it’s probably a chip R&D team doing “bring-up” on the first parts you make. There won’t be many of these guys; let’s say ten in the first week. They would probably tolerate an appalling quality level: they want something desperately soon and there are only ten of them, so one failure in ten would probably be tolerable. And as the internal customer base gets larger, the quality can be improved both proactively and reactively.

When we move to external customers, imagine they’re making a $25 system with your $5 part on it. Anything more than a few minutes of failure analysis effort will cost as much as the $25 system itself, which makes recovery uneconomic. And what other defect sources are there in their production line? Solder joints are easier to get to and address. After that, easy-to-test components are likely to be next in line: memories or power supplies. Random transistors in complex chips that perform widely varying functions are less likely to end up in a single category. Your 1000 DPPM is invisible.

But what if your product is going into a more expensive system, say $250? The dollar value of the bone pile quickly racks up. Their test is likely to be more thorough, perhaps ten times as thorough if it follows the 10x unit cost increase. And recovery becomes worth attempting. Your $5 part has killed a $250 machine, so you will get attention quite soon. This trend continues as costs get higher.
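The arithmetic behind the bone pile can be sketched in a few lines. This is an illustration only: the build volume of 100,000 systems is a made-up figure, and the sketch assumes one of your parts per system and that every defective part scraps the whole system.

```python
def expected_scrap_cost(systems_built, part_dppm, system_cost):
    """Expected dollar value of systems lost to defective parts.

    Assumes one part per system and that each defective part
    writes off the entire system (no rework or recovery).
    """
    failed_systems = systems_built * part_dppm / 1e6
    return failed_systems * system_cost


if __name__ == "__main__":
    volume = 100_000  # hypothetical build volume
    for system_cost in (25, 250):
        cost = expected_scrap_cost(volume, part_dppm=1000, system_cost=system_cost)
        print(f"${system_cost} system: about ${cost:,.0f} in the bone pile")
```

The failure count is the same in both cases; only the value at stake changes, and with it the customer's motivation to trace failures back to your part.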

So once a DPPM target strategy is defined, how do we go about managing it? As a product characteristic, it is determined by the overall effectiveness of the test steps within your manufacturing flow: wafer parametric test, wafer probe test, optical inspection, X-ray inspection, final packaged device test, etc. Each step can detect some proportion of the defects, and that proportion can be quoted to very high resolution (e.g. 98.67% single-stuck-at coverage). But beware: high resolution doesn’t mean high accuracy. DFT fault models are abstractions of the real world, and their modelling errors get passed straight up to the headline figure. 100% fault coverage is an easy target to define, but it is not real-world accurate.
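To see how a coverage figure might feed a DPPM estimate, here is a minimal sketch using the textbook Williams-Brown defect level model, DL = 1 - Y^(1 - T), where Y is yield and T is fault coverage. This model is not part of the discussion above; it is brought in purely as an illustration, and it inherits exactly the fault-model abstraction just warned about. The 90% yield is a made-up figure.

```python
def williams_brown_dppm(process_yield, fault_coverage):
    """Estimate shipped DPPM with the Williams-Brown model.

    process_yield  -- fraction of manufactured parts that are good (0..1)
    fault_coverage -- fraction of modelled faults the test detects (0..1)

    The model treats the fault model as a faithful picture of real
    defects, which is the very assumption that limits its accuracy.
    """
    defect_level = 1.0 - process_yield ** (1.0 - fault_coverage)
    return 1e6 * defect_level


if __name__ == "__main__":
    assumed_yield = 0.90  # hypothetical yield, for illustration only
    for coverage in (0.90, 0.9867, 1.0):
        dppm = williams_brown_dppm(assumed_yield, coverage)
        print(f"coverage {coverage:.2%}: about {dppm:.0f} DPPM")
```

Note that the model reports zero DPPM at 100% coverage, which is a good reminder to treat the output as a planning figure rather than a measurement.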

However, fault coverage does have a place in managing quality, which you can see is a game of probability, statistics and experience, played with test techniques that address the problem from different angles: structural tests, functional tests, parametric tests, physical inspection. When these are combined with evaluation and ongoing monitoring of their use, product quality can be managed throughout the product’s life.
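As a rough sketch of how several test steps combine, the function below multiplies the escape rates of each step. It rests on strong assumptions that real flows rarely satisfy: each step catches defects independently of the others, and no good parts are rejected. The incoming defect fraction and the per-step detection rates are hypothetical numbers.

```python
from math import prod


def shipped_dppm(incoming_defect_fraction, step_detection_rates):
    """Estimate shipped DPPM after a chain of test steps.

    incoming_defect_fraction -- fraction of parts defective before test (0..1)
    step_detection_rates     -- fraction of defective parts each step catches

    Assumes steps detect defects independently and never reject good parts.
    """
    # A defective part ships only if it escapes every step in turn.
    escape = prod(1.0 - rate for rate in step_detection_rates)
    bad_shipped = incoming_defect_fraction * escape
    good_shipped = 1.0 - incoming_defect_fraction
    return 1e6 * bad_shipped / (good_shipped + bad_shipped)


if __name__ == "__main__":
    # Hypothetical flow: 10% defective going in, then two test steps.
    for steps in ([], [0.90], [0.90, 0.9867]):
        print(f"steps {steps}: about {shipped_dppm(0.10, steps):.0f} DPPM")
```

In practice the steps overlap in what they catch, so the independence assumption tends to flatter the result; ongoing monitoring of returns is what corrects the estimate.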

Just like functional behaviour, quality is another characteristic of your product, something that customers will know about if it’s not good enough. Your product’s market will have a quality requirement and you need to know what this is so that you can manage how your manufacturing flow delivers it.