New Issues In Signoff

By Ed Sperling
Signoff has always been a challenge at every stage of an SoC design flow. No matter how good a design looks, or how well a prototype works, there are still problems that can crop up at any stage of the design flow all the way into manufacturing that can leave engineering teams shaking their heads.

Even at mainstream process nodes, respins are common. At advanced nodes—particularly 20nm back-end of line processes with 14nm finFETs—the immaturity of the processes, the difficulty of designing 3D transistors, and double patterning all make reliability of a design even less certain. So instead of one point of failure in the signoff chain, there may be several. Worse still, continued shrinking of features is leading to signoff issues in areas that until now have been largely ignored, such as physical and proximity effects, but which also contribute to the reliability of a device at advanced nodes.

One of the prime targets in all of this is 193nm immersion lithography, which was supposed to have been replaced by extreme ultraviolet (EUV) lithography four process nodes ago. Without EUV, at 10nm the industry will be forced to resort to triple and quadruple patterning, adding expense, uncertainty and unexpected errors into a finely honed design process. At that point engineering managers might wish signoff were done with disappearing ink.

Intertwined may be an understatement. The ability to shrink features at each new node and put more functionality onto a single die has made it possible to create smart phones with enormous compute capabilities and devices that can fit inside the human body. But it also has made it far more difficult to ensure reliability. In the past, the linear progression of a design flow made signoff at each level fairly reliable. At advanced nodes, with more pieces of the flow overlapping—hardware-software co-design, design for manufacturing, yield and test, and IP integration and test—when to actually sign off on a particular job such as place and route or timing or functional verification has become a much tougher call.

“One of the big issues is just the scope of the problem,” said Ruben Molena, director of product marketing for timing signoff at Cadence. “The design sizes are bigger. And as you move to smaller process nodes, you wind up with more signoff corners.”

Just to put it in perspective, a 28nm design commonly has 30 million to 50 million instances, and GPUs can have more than 100 million instances. It’s not unusual to see designs with dozens of corners, and complex designs such as SoCs for mobile devices can include hundreds of them. Add manufacturing to that, and the various levels of signoff look more like a wish list than a confident engineering statement.

“At 20nm with double patterning, you increase the number of corners because of the mask shift,” Molena said. “It’s difficult to predict which way the mask has shifted, so you use a min and max value. At 20nm we’re also seeing stress effects are more predominant. You need to start considering delay variation due to stress effects, and there are no tools that perform cell-level static timing analysis with modeling of delay due to layout-dependent effects. That also assumes the customers are not margining. As you approach gigahertz clock frequencies, there is not enough cycle time to enclose a 20% margin.”
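To make Molena's two points concrete, here is a rough sketch of the arithmetic. The corner axes, their values and the function names are illustrative assumptions, not taken from any signoff tool or from the companies quoted. It shows how bounding an unknown mask shift with a min and a max value doubles a fully crossed corner set, and how much of a clock period is left once a 20% margin is set aside.

```python
from itertools import product

# Hypothetical corner axes; names and values are illustrative assumptions.
base_axes = {
    "process": ["ss", "tt", "ff"],
    "voltage": ["vmin", "vmax"],
    "temperature": ["cold", "hot"],
    "rc_extraction": ["cworst", "cbest", "typical"],
}


def count_corners(axes):
    """Count the combinations when every axis value is crossed with every other."""
    return len(list(product(*axes.values())))


# Double patterning: the direction of the mask shift is not known in advance,
# so it is bounded with a min and a max value, which adds one two-valued axis.
dp_axes = dict(base_axes, mask_shift=["min", "max"])


def usable_cycle_ps(freq_ghz, margin_fraction):
    """Picoseconds left in one clock period after reserving a fractional margin."""
    period_ps = 1000.0 / freq_ghz
    return period_ps * (1.0 - margin_fraction)


print(count_corners(base_axes))  # 36
print(count_corners(dp_axes))    # 72
print(usable_cycle_ps(1.0, 0.20))
```

In practice teams prune the full cross product to a smaller dominant set, so real corner counts depend on the foundry's requirements and the design's operating modes. The point of the sketch is only that each new two-valued axis doubles the worst case, while at 1GHz a 20% margin removes roughly a fifth of a 1,000-picosecond period.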

Margin call
Margin historically has been the quick fix for process violations. Adding more cells into a design can fix problems and increase a design’s reliability. But at 28nm and beyond, those cells also decrease the efficiency of a design and its performance.

“At 28nm we saw margining stop,” said Christen Decoin, product marketing manager at Mentor Graphics. “But that also means that teams are spending more time than expected to reach their target. At 20nm the problem is even worse because there are so many more corners. So when you finally get to print, you do not get what you expect.”

In addition to functional errors, there are now performance and power effects.

“We’re finding that companies are being more cautious about their frequency targets,” Decoin said. “They will get to those targets, but it may take time. New tools don’t show the hotspots they may encounter, so it’s also going to be a longer cycle. One way to get over that is to use 2.5D on a silicon interposer. It may come down to reliability versus cost.”

At the very least, future designs will require more analysis at every stage.

“If you don’t want to overdesign, then you have to analyze more,” said Kevin Kranen, director of strategic alliances at Synopsys. “With electromigration you have to have a reasonable profile of what you’re doing with a device. So if your vias are not able to handle current density, then you trace it back. As we move forward, there will be a fine line between quantum effects and variability. If you have three layers of gate oxide, what happens if you’re missing an atom? That will have an effect on threshold variability. EDA tools need to figure out probabilities, and then account for those probabilities.”

One issue that has grown worse at advanced nodes is thinning wires. While transistors will scale well into the single-digit nanometers by adding more fins, RC delay in wires and interconnects and electromigration are becoming more problematic at each node. “As signal resistance goes up, you can see small variants in thin wires that can have a big impact,” said Kranen.
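A back-of-the-envelope sketch shows why small variations matter more in thin wires. It assumes the textbook relation for a rectangular wire, R = resistivity × length / (width × thickness), and uses made-up dimensions; the function names and numbers are illustrative and do not come from Kranen or from any process design kit. Real thin wires also have higher effective resistivity than bulk copper, which this sketch ignores.

```python
BULK_COPPER_RESISTIVITY_OHM_M = 1.7e-8  # approximate bulk value; thin wires run higher


def wire_resistance_ohms(length_m, width_m, thickness_m,
                         resistivity_ohm_m=BULK_COPPER_RESISTIVITY_OHM_M):
    """Resistance of a uniform rectangular wire: R = rho * L / (W * T)."""
    return resistivity_ohm_m * length_m / (width_m * thickness_m)


def resistance_rise_pct(width_nm, width_loss_nm):
    """Percent rise in resistance when a wire prints narrower by width_loss_nm.

    Length and thickness are held fixed, so R scales as 1 / width.
    """
    return (width_nm / (width_nm - width_loss_nm) - 1.0) * 100.0


# Same hypothetical 2nm loss of width applied to a wide wire and a narrow one.
for width_nm in (100.0, 30.0):
    rise = resistance_rise_pct(width_nm, 2.0)
    print(f"{width_nm:.0f}nm wire, 2nm narrower: resistance up {rise:.1f}%")
```

Under these assumptions the same 2nm loss raises resistance by about 2% on a 100nm-wide wire and by about 7% on a 30nm-wide wire, and that higher resistance feeds directly into RC delay.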

The big question at advanced nodes is just how long wires will last. Gary Patton, vice president of IBM’s Semiconductor Research and Development Center, said wire lifetime degrades as designs scale to future process nodes because of electromigration. So while signoff may appear to be complete, the question is for how long?

System-level issues
Nor is the signoff issue entirely at the SoC level. Integration of components in ever-denser packages means that physical effects from any one component can disrupt another.

“The question is how you define a system,” said Steve Pytel, signal integrity product manager at ANSYS. “Some people define a system as the PCIe bus, in which case you look at the individual buses. For things like EMI, you have to look at the power distribution network so you can show resonances in the PCB or the package. And then you need to do a risk assessment. If you find something that’s high risk, you address a large piece of the problem earlier.”

One piece of particular interest at the system level these days involves thermal issues. “You can get a pass-fail test done at a very high level,” said Pytel. “If it fails, or you’re using a very tight margin, you can dive into more detail. But a lot of this depends on what markets you’re serving. That dictates how much testing you do, or virtual testing before the design is complete. If it’s a mobile phone, it’s not that critical. If it involves nuclear energy, that requires a whole different approach.”

All of these changes also affect where signoff is actually done.

“We’re seeing a need everywhere for more electrical checks,” said Kiran Vittal, senior director of product marketing at Atrenta. “You need to capture power intent post-synthesis and post-layout. But it also becomes more of a design strategy where you check for different conditions. Body biasing and other new techniques require electrical checks. And for different low-power techniques, you need to verify for all of those techniques with dynamic and static checks.”

It only gets harder from here
IBM’s Patton says the issues that affect reliability—and signoff—only become more challenging after 20nm. In addition to wire degradation from electromigration, if the structures on a chip aren’t done exactly right that will also cause EM problems.

“Reliability from PVD breakdown gets worse,” Patton said, adding that lower k dielectrics can affect the cohesive strength of a chip and cause interaction with the package. He said the next big breakthrough will be silicon photonics, but that may not happen for several more process nodes.

Until then, anyone who signs off on a project, or a piece of a project, is signing off more on probabilities than the certainty that used to be required. At advanced nodes nothing is certain, and signoff may prove to be little better than a checklist.