Layer-aware optimization

Abstract

At advanced technology nodes, longer wire lengths and highly resistive metal layers have led to a dramatic increase in interconnect delays. Traditional buffering and upsizing techniques to reduce interconnect delay are no longer as effective due to the area and power impact. To minimize design costs and better predict system performance, upfront and accurate pre-route parasitic estimation of interconnects is necessary during the implementation flow.

As we move to 28nm and below, metal resistance varies significantly (~5X-100X) across routing layers, providing both challenges and opportunities for accurate interconnect delay estimation. In this article, we review techniques that take advantage of resistance variation to reduce buffering and provide tighter post-route correlation to enable better performance prediction. The different techniques are compared for their benefits and limitations, and an optimal solution is proposed. We end by highlighting some results that illustrate measurable benefits from using layer-aware optimization.

Growing interconnect dominance

The demand for ‘all-in-one’ devices has led to tremendous on-chip integration. With increasing functionality per mm², designs today are heavily interconnect dominated. Figure 1 below shows that total interconnect length in ASICs has been doubling with every other technology node (65nm to 32nm to 20nm).

Figure 1: Interconnect length trends

With more on-chip interconnects, wire delays begin to dominate gate delays. At smaller nodes, shrinking feature sizes reduce gate delays significantly but have the opposite impact on interconnect delays. When wires get thinner and the spacing between them decreases, parasitic effects become more pronounced, resulting in increased interconnect delays. Short, local interconnects shrink in length along with the logic they connect, so their delay increase is minimal. However, global interconnects that span the chip do not scale as well due to increased integration and nearly constant die sizes (see Figure 2 below). They remain a bottleneck, limiting system performance.

Figure 2: Increasing dominance of global interconnect delays

Traditionally, interconnect delays are reduced using techniques such as buffering and driver cell upsizing. These approaches are becoming expensive due to area and power overhead. To minimize design costs and better predict system performance, it is necessary to find smarter approaches during implementation to reduce global interconnect delays.

Changing profile of metal layer stacks

To meet the growing demands of integration, new process nodes come with an increasing number of routing metal layers. Studies show that the number of metal layers has approximately doubled every decade, reaching a current maximum of 12. The added layers also do not all share the same metal pitch (width). The pitch varies from narrower widths for the lower layers to wider widths for the upper layers. As a result, lower metal layers offer more routing resources at a higher resistance, while upper layers offer fewer resources at a much lower resistance. Leveraging this difference in metal layers can help improve interconnect performance.

By comparison, at 32/28nm the resistance variation across layers can be dramatic, resulting in the telescopic metal stack shown in Figure 4 below.

Figure 4: 32nm layer stack cross section

The example in Figure 5 below plots the unit resistance (R) per layer for a 40nm and a 28nm process metal stack, which shows the large variation (~5X-100X) across the metal layers.

Figure 5: Layer resistance variation in advanced nodes
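Much of this spread follows from wire geometry: for a rectangular wire, resistance per unit length is resistivity divided by cross-sectional area. The sketch below illustrates the relationship only. The widths and thicknesses are hypothetical, not taken from any process, and the resistivity is the approximate bulk value for copper (narrow wires in real processes typically show a higher effective resistivity).

```python
# Illustrative only: the widths and thicknesses below are assumed values
# for a sketch, not data from any real process.
RHO_OHM_UM = 0.017  # approximate bulk copper resistivity, in ohm*um


def unit_resistance(width_um, thickness_um, rho=RHO_OHM_UM):
    """Resistance per micron of wire length: R/L = rho / (width * thickness)."""
    return rho / (width_um * thickness_um)


# Hypothetical lower (thin, narrow) and upper (thick, wide) layers.
r_lower = unit_resistance(width_um=0.05, thickness_um=0.10)
r_upper = unit_resistance(width_um=0.40, thickness_um=0.80)
ratio = r_lower / r_upper

print(f"lower layer: {r_lower:.3f} ohm/um")
print(f"upper layer: {r_upper:.3f} ohm/um")
print(f"lower/upper ratio: {ratio:.0f}x")
```

With these assumed dimensions the lower layer is several tens of times more resistive than the upper layer, which is the same order of spread the article describes.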

R variation provides both challenges and opportunities for upfront and accurate parasitic estimation. Typical pre-route parasitic estimation uses the average R of all allowable routing layers. An average R works when all layers exhibit uniform resistance. The problem arises when R varies considerably across layers, and the effects are typically not seen until a net is detail routed. At 28nm, the high resistance of the lower layers skews the pre-route R calculation to very pessimistic values, which calls for increased buffering of global interconnects. Subsequently, during detail route, these nets get routed on upper layers that have much lower R, rendering the extra buffers inserted during pre-route unnecessary.
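The pessimism of an averaged R can be sketched numerically. The example below is a rough illustration, not a description of any tool's estimator: the per-layer unit resistances, the single unit capacitance shared by all layers, and the layer names are all assumptions. It uses the standard first-order Elmore delay of a distributed RC wire, 0.5·r·c·L², and ignores driver and load effects.

```python
# Hypothetical unit resistances (ohm/um) for an 8-layer stack. These values
# are invented for illustration and do not come from foundry data.
UNIT_R = {
    "M1": 8.0, "M2": 8.0, "M3": 4.0, "M4": 4.0,
    "M5": 1.0, "M6": 1.0, "M7": 0.2, "M8": 0.1,
}
UNIT_C = 0.2e-15  # F/um, assumed identical on every layer for simplicity


def average_r(layers):
    """Average unit resistance over the given routing layers."""
    return sum(UNIT_R[name] for name in layers) / len(layers)


def wire_delay(r_per_um, length_um, c_per_um=UNIT_C):
    """Elmore delay (seconds) of an unbuffered distributed RC wire."""
    return 0.5 * r_per_um * c_per_um * length_um ** 2


length_um = 1000.0                      # a 1 mm global net
r_avg = average_r(list(UNIT_R))         # layer-unaware pre-route estimate
r_upper = average_r(["M7", "M8"])       # layers the router is assumed to use

pre_route = wire_delay(r_avg, length_um)
layer_aware = wire_delay(r_upper, length_um)

print(f"average-R estimate:   {pre_route * 1e12:.1f} ps")
print(f"layer-aware estimate: {layer_aware * 1e12:.1f} ps")
print(f"pessimism factor:     {pre_route / layer_aware:.1f}x")
```

Because delay is linear in r for a fixed length, the pessimism factor equals the ratio of the averaged R to the R of the layers actually used. An optimizer working from the averaged figure would see a far slower net and insert buffers that the routed net does not need.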

To summarize, the metal layer(s) used to route a signal at 28nm can have a significant impact on its timing and buffering needs. Judicious use of upper layers can help reduce the impact of interconnect delay and improve system performance.