Reducing Turnaround Time with Hierarchical Timing Analysis

The semiconductor industry accepts two facts: designs continue to grow in size and complexity, and time-to-market pressure is higher than ever. I’ll use the ‘smart phone’ as an example to make my point. On a smart phone you can now talk, text, IM, take pictures and videos, play games and perform a host of other tasks. Question: How is this possible? Answer: By integrating multiple functions capable of simultaneous interaction onto a single chip. Question: How do you make this happen in the same amount of time you were given for the last chip? Answer: Design reuse.

As designs evolve and pack ever more onto a single die,
‘design reuse’ has become a common technique. IPs, flows, and
methodologies are all reused, and as a result design size growth
sometimes surpasses Moore’s Law. However, designers are squeezed between
packing in more functionality on one end and facing no relaxation in
time-to-market requirements on the other. Market dynamics dictate that
if you snooze, you lose.

Figure 1: A hierarchical chip with IP design reuse (source: Synopsys)

Figure 2 demonstrates the classic design gap caused by the rapid growth
in design complexity and lagging productivity over the past two decades.
Increased IP usage in designs closes this gap to some degree. Later in
the article, I will explain how static timing analysis (STA) technology
can leapfrog this design gap.

STA is a key task during chip design that directly impacts design cycle time. With design sizes growing 2-3X every two years, new technologies like multicore processors have helped alleviate runtime pressures. While it’s necessary for STA software to make optimal use of the emerging hardware infrastructure, it’s by no means sufficient. Design size growth is not going to stop, whereas multicore scalability loses steam beyond a certain number of cores, as Amdahl’s law predicts. The law gives the theoretical maximum speedup from using multiple processors: the serial portion of a task limits the gain no matter how many cores are added.
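To make the Amdahl’s law point concrete, here is a minimal Python sketch of the formula S(N) = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of cores. The function names and the value p = 0.90 are illustrative assumptions, not measurements of any STA tool.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup from Amdahl's law for a given core count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def speedup_ceiling(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # 0.90 is an assumed, illustrative parallel fraction.
    p = 0.90
    for n in (1, 4, 16, 64):
        print(f"{n:>3} cores: {amdahl_speedup(p, n):.2f}x")
    print(f"ceiling: {speedup_ceiling(p):.2f}x")
```

With 90 percent of the work parallelizable, the speedup can never exceed about 10X however many cores are added, which is why multicore scaling alone cannot keep up with design growth.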

As designs continue to grow in size, hierarchical techniques are used to break down design complexity into manageable units of work. In STA, hierarchical timing models have been used to represent block timing in a compact manner to enable faster and more memory-efficient STA runs. However, these models are unable to capture the context of where the blocks reside in the full chip, thus causing differences between full-chip flat runs and hierarchical runs.

One example where a hierarchical timing model for a block does not capture the context accurately in traditional STA is the clock reconvergence pessimism removal (CRPR) effect. When two related clocks enter a block, it is difficult to model the CRPR properties of their top-level common source in the block-level analysis. When a full-chip flat analysis is done, the entire clock network is visible, so the CRPR effects can be applied accurately. With HyperScale technology, boundary clock CRPR relationships are captured as part of the updated context and used for accurate block-level analysis. This eliminates the need for a flat run. This is just one example of a global effect being captured more accurately through the HyperScale context. Other examples include advanced on-chip variation (AOCV), timing exceptions, signal integrity (SI) and noise effects. Because traditional hierarchical models miss this context, hierarchical STA has not been broadly adopted for signoff. Today, STA signoff for larger SoCs, like the ones used in smart phones, is done flat. As a result, runtime can extend into several days, requiring costly computer systems.
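The CRPR point can be illustrated with a small Python sketch. It assumes the usual simplified view of CRPR: the credit is the late-minus-early delay difference accumulated over the clock segments that the launch and capture paths share. The segment names and delay values are hypothetical, and matching segments by name is a simplification; this is not how any particular tool implements it.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ClockSegment:
    name: str        # hypothetical segment identifier
    early_ns: float  # delay under early (fast) derating
    late_ns: float   # delay under late (slow) derating


def crpr_credit(launch: Sequence[ClockSegment],
                capture: Sequence[ClockSegment]) -> float:
    """Sum (late - early) over the leading segments both clock paths share."""
    credit = 0.0
    for launch_seg, capture_seg in zip(launch, capture):
        if launch_seg.name != capture_seg.name:
            break  # paths diverge here; nothing after this point is common
        credit += launch_seg.late_ns - launch_seg.early_ns
    return credit


# Hypothetical clock network: two top-level segments feed both block clocks.
TOP_COMMON = [
    ClockSegment("top_pll_buf", 0.90, 1.00),
    ClockSegment("top_clk_trunk", 0.45, 0.50),
]
BLOCK_LAUNCH = [ClockSegment("blk_clk_a_buf", 0.27, 0.30)]
BLOCK_CAPTURE = [ClockSegment("blk_clk_b_buf", 0.18, 0.20)]

# Flat analysis sees the shared top-level segments, so it can remove pessimism.
flat_credit = crpr_credit(TOP_COMMON + BLOCK_LAUNCH, TOP_COMMON + BLOCK_CAPTURE)

# A block-only view starts at the block pins and sees no common segments.
block_only_credit = crpr_credit(BLOCK_LAUNCH, BLOCK_CAPTURE)

if __name__ == "__main__":
    print(f"flat credit:       {flat_credit:.3f} ns")
    print(f"block-only credit: {block_only_credit:.3f} ns")
```

In this toy network the block-only view leaves the full top-level pessimism in place, which is the kind of mismatch that boundary context is meant to correct.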

To help designers accommodate increased chip sizes, signoff tool providers can improve single-core performance, leverage distributed multicore capabilities to utilize the compute farm efficiently and use threaded multicore capabilities to utilize multicore compute boxes. These measures have helped the timing community immensely in managing runtimes for full-chip signoff runs. Today, an increasingly important consideration is addressing the scalability challenges of very large designs that no longer fit the current hardware and are growing at a faster pace than Moore’s Law.

Designers can now benefit from a new technology, PrimeTime HyperScale from Synopsys, that enables hierarchical STA by performing accurate block-level timing analysis in the top-level context. PrimeTime HyperScale technology eliminates the need for hierarchical timing model generation and validation. HyperScale provides a 5 to 10X boost in performance and capacity, leapfrogging the design gap (as demonstrated in Figure 2) while maintaining the accuracy of flat analysis. In addition, HyperScale enables faster top- and block-level timing convergence with an automated context update mechanism and offers the scalability to complete daily analysis on designs of up to 500M instances (impractical on large designs using the current full-chip approach).

During the implementation cycle, block designers perform STA for many reasons, including constraint cleanup, multi-scenario analysis and validating large violations. These block-level STA runs are localized runs that are not currently reused during the flow. Why throw away these analysis runs and duplicate effort with full-chip runs? HyperScale technology was developed with ‘design reuse’ in mind, the technique that today’s designs rely on heavily.

Because SoCs are built today with a mix of old and new IPs, integrating pieces that were born in different environments is an inherent challenge for designers. HyperScale is designed to handle such scenarios effectively and can quickly point designers to the timing impact of mixed-environment integration, with clear guidance to resolve it. It also allows designers to ‘reuse’ the results from block-level STA runs performed throughout the flow to drive the timing closure process, rather than discarding them and repeating the work at full-chip level.

Now more than ever, the task of creating clean ‘signoff’-level constraints for a block is an art form. Block design starts with initial estimated budgets that are refined over time, a manual and error-prone task. This can lead to inconsistent constraints between block and top, which can have two undesirable effects: blocks get conservatively bounded, leading to over-design, and/or some critical timing paths are missed, leading to chip failure. The cost of the already runtime- and memory-intensive full-chip analysis is compounded when timing ECOs (engineering change orders) are required, prolonging design closure and causing time-to-market goals to slip.

HyperScale completes a comprehensive timing analysis around the block boundaries to automatically capture the timing context, including signal integrity (SI), on-chip variation (OCV) and clock reconvergence pessimism removal (CRPR) effects. This context alignment between block and top goes beyond simple Synopsys Design Constraints (SDC) checking and ensures smooth and fast timing convergence. Block-level implementation teams can also use the updated context from top for optimization, leading to faster and more accurate timing closure.

Figure 3 shows a comparison of timing slack results, with reference to a flat run between an initial analysis with estimated budgets vs. a HyperScale context-corrected analysis. Initial block analysis with estimated budgets is typically pessimistic or inaccurate, resulting in false violations to fix and longer turnaround time (TAT). With accurate timing context information, the right violations can be fixed properly and in much less time.
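The difference between budget-based and context-corrected slack can be sketched in a few lines of Python. The example assumes a simple input-port-to-register setup check, where slack is the required time (clock period minus setup time) minus the arrival time (input delay plus path delay). All numbers and function names are hypothetical and are not taken from Figure 3.

```python
def setup_slack(period_ns: float, setup_ns: float,
                input_delay_ns: float, path_delay_ns: float) -> float:
    """Setup slack for an input-to-register path: required minus arrival."""
    required = period_ns - setup_ns
    arrival = input_delay_ns + path_delay_ns
    return required - arrival


def classify(budget_slack: float, context_slack: float) -> str:
    """Compare the budget-based verdict with the context-corrected verdict."""
    if budget_slack < 0 <= context_slack:
        return "false violation"   # pessimistic budget flags a passing path
    if context_slack < 0 <= budget_slack:
        return "missed violation"  # optimistic budget hides a failing path
    return "consistent"


# Hypothetical path: 1.0 ns clock, 0.05 ns setup, 0.45 ns of logic in the block.
PERIOD, SETUP, PATH = 1.0, 0.05, 0.45

# Estimated budget assumes 0.60 ns arrives from top; the real context is 0.40 ns.
pessimistic = classify(setup_slack(PERIOD, SETUP, 0.60, PATH),
                       setup_slack(PERIOD, SETUP, 0.40, PATH))

# Estimated budget assumes 0.30 ns; the real context is 0.55 ns.
optimistic = classify(setup_slack(PERIOD, SETUP, 0.30, PATH),
                      setup_slack(PERIOD, SETUP, 0.55, PATH))

if __name__ == "__main__":
    print("pessimistic budget:", pessimistic)
    print("optimistic budget: ", optimistic)
```

The first case corresponds to the over-design risk of conservative budgets, and the second to the risk of missing a critical path when the budget is too optimistic.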

PrimeTime HyperScale technology addresses the existing challenges of STA on hierarchical designs and enables seamless hierarchical STA signoff on multi-million-instance SoCs like those used in today’s smart phones. HyperScale leapfrogs the productivity gap in STA with a 5-10X runtime and memory boost, cutting design cycle time significantly. This heralds a new era in STA and represents a major milestone in timing analysis.

Author bio: Sunil Walia is a Senior Marketing Manager on the PrimeTime team at Synopsys. Prior to Synopsys, he was on the signoff marketing team at Magma Design Automation and worked on SPARC processors as a design engineer at Sun Microsystems.

If you found this article to be of interest, visit EDA Designline where you will find the latest and greatest
design, technology, product, and news articles with regard to all aspects of Electronic
Design Automation (EDA).
