Monday, 27 January 2014

When you first start using dbGet, many of your queries branch off the "top" keyword and then traverse to "insts" or "nets". These searches return a list of all the instances or nets in the design. But sometimes it's necessary to query the available cell masters, some of which may or may not be instantiated. Common reasons for needing this include finding well taps, end caps, antenna diodes and filler cells. You have a hunch what these cells are called in the library you're working on, and you'd like to search through all of the cell masters currently loaded in the design.
Say, for example, you want to tell the tool which cells should be used as fillers. (Fillers are physical-only instances added after placement to fill gaps between standard cells and provide rail and well continuity.) You might have a hunch they're called FILL-"something". Here's how to use dbGet to find the names of all the cell masters available that match FILL*:

You can pass the output directly to setFillerMode, then call addFiller to add the instances to the design:

encounter 2> setFillerMode -core [dbGet head.allCells.name FILL*]

encounter 3> addFiller

Although "top" is by far the most common dbGet starting point, the "head" pointer provides a link to technology information like layers, vias and more. Give it a look next time you're seeking technology information rather than design-specific data.
For more information on dbGet check out this post on Getting Started with dbGet.
Hope this helps.

How do you add an additional pin to an existing one, so that the same net has I/O pins in different places? There is no menu or text command to do this, but you can do it by modifying a DEF or floorplan file in the following way (DEF shown):
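Here is a sketch of the DEF edit (the pin name, net, layer and coordinates are hypothetical): one pin entry in the PINS section carries two + PORT clauses, one per physical location, which DEF 5.7 and later supports.

PINS 1 ;
- data_out + NET data_out + DIRECTION OUTPUT + USE SIGNAL
  + PORT
    + LAYER M3 ( -140 0 ) ( 140 280 )
    + PLACED ( 100000 500 ) N
  + PORT
    + LAYER M3 ( -140 0 ) ( 140 280 )
    + PLACED ( 400000 500 ) N ;
END PINS

After editing, read the DEF back in (e.g. with defIn) so the tool picks up both pin locations.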

Monday, 7 October 2013

So you're about to start your first low power design. Or second, third, or fourth. As with many tapeouts, you know that with today's tight market windows, most likely the project will go off with a sprinting start (architectural planning), followed by an endurance test (designing and implementing), then a final mad dash towards the finish line (signoff closure and tapeout).

First, the bad news - given the complexities of today's design requirements and the swiftness in which the technology market moves, the project crunch noted above is still going to happen. The good news? If you're implementing a low power design, there are a few things you could do to reduce last-minute problems.

These tips apply mostly to power-domain based designs that use techniques such as power shutoff (PSO), multiple supply voltages (MSV), and dynamic voltage-frequency scaling (DVFS). However, some apply to non-power domain designs too.

1. Check your low power library availability! If you are well into the physical optimization stage of your design flow, counting on doing always-on net optimization, and then realize "uh oh, I have no always-on buffers," that translates to at least 1-2 weeks of schedule delay. Obviously you'd want to check for library requirements early in the project, but sometimes for low power designs the requirements aren't that obvious. So here's a short list of the priorities:

If you are doing an MSV or DVFS design, check for the availability of multiple supply voltage-characterized libraries. Sure, you could use k-factors to extrapolate delay characteristics based on different voltages, but that's a very risky practice due to inaccuracies.

Check for level shifters, isolation cells, power switches (headers or footers, depending on which one you are planning to use), and of course, always-on buffers and state retention cells for PSO designs if you plan on using them.

2. Plan to use at least RTL simulation vectors for your power analysis. Vector-less power analysis is okay for estimation purposes, but at some point you'll have to switch to using vectors. Now, getting gate-level activity vectors for your design might be a bit hard since that only comes after doing gate-level simulation. But, RTL simulation vectors are typically available much earlier.

The old saying "garbage in, garbage out" applies here. The quality of your power analysis is completely dependent on the quality of the activity vectors you are feeding it. If that doesn't scare you enough, think about where this information is used: besides determining whether your design will meet power consumption specs and also fit within the packaging selected for your design, this information is also used as a basis for measuring dynamic and static IR drop, electromigration and other electrical problems that might come back and bite you if not taken care of early.

3. Try to test out your clock trees before finalizing your floorplan. This is helpful especially for power domain based designs. As we know, power domain definitions place restrictions on your floorplan in terms of placement, optimization and other factors. If, for example, your clock tree root starts in a power domain that's physically far away from your PLL, you can be sure that there will be a lot of buffers added in between, which means a much higher latency.

Also, clocks that exit one power domain and enter another might be affected by the power domain layout in terms of skew and transition time. So, by doing at least a trial clock tree synthesis run before you finalize your floorplan, you should be able to catch problems like this early on and fix them before the floorplan is finalized.

4. Don't over-constrain (too much) on IR drop requirements. Let's face it: the reality is we always over-constrain our designs. We over-constrain on timing to leave ourselves some margin toward the signoff stage, and we over-constrain on IR drop so that we'll meet the IR drop requirements of the library even after some variation between implementation and signoff. The main reason for IR drop requirements is that library cell performance degrades with IR drop, so too much IR drop may lead to the design not meeting timing even though STA thinks it does.

Library providers usually build in a little margin when specifying IR drop requirements, and it's perfectly normal for designers to add another layer of margin to that when implementing. The problem comes when expectations are unrealistic for a given design. For power shutoff designs, power switches usually cause some additional IR drop to that power domain. One way to decrease IR drop is to increase the number of power switch cells, but that's a double-edged sword because additional power switches lead to more area and more leakage power, which will ultimately negate the effect of having power switches in the first place. So, you can see how we could potentially shoot ourselves in the foot if we specify an unrealistic IR drop constraint.
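As a back-of-the-envelope sketch of that double-edged sword (all numbers below are made up, not from any library): switches in parallel share the domain current, so the drop across the switch network falls roughly as 1/N, while the leakage the switches themselves add grows linearly with N.

```python
def pso_tradeoff(n_switches, i_domain_a, r_switch_ohm, leak_per_switch_w):
    """Crude model: N identical switches in parallel share the domain
    current, so the voltage drop across the switch network scales as
    1/N, while total switch leakage grows linearly with N."""
    ir_drop_v = i_domain_a * (r_switch_ohm / n_switches)
    leakage_w = n_switches * leak_per_switch_w
    return ir_drop_v, leakage_w

# Doubling the switch count halves the switch IR drop but doubles
# the leakage the switches themselves contribute.
for n in (50, 100, 200):
    ir_v, leak_w = pso_tradeoff(n, 0.5, 20.0, 1e-6)
    print(f"{n} switches: {ir_v * 1000:.0f} mV drop, {leak_w * 1e6:.0f} uW leakage")
```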

5. Plan out your high fanout always-on nets. Planning out high fanout nets in general is a good practice for any design, but this applies even more to power shutoff designs if they have always-on high-fanout nets (hint - they usually do). Power switch sleep enable nets, SRPG sleep nets, and others would fall into this category. If you are planning to tap from nearby always-on power supplies to power the secondary power pins of the buffers for those nets, it's best that there actually is a nearby always-on power net available.

With that said, I hope this has been useful to all the folks out there designing for low power. I'm aware that this is not an all-inclusive list. Would anyone else like to share any pointers on low power implementation? Voice your comment below!

Sunday, 21 July 2013

In a path-based analysis, the distance of a path is the diagonal of the bounding box that encompasses all of the arcs in the path. In a graph-based analysis, an arc can be both launching and capturing. As a result, there are launch and capture distances. Maintaining separate launch and capture distances for arcs in a graph-based analysis vastly improves the accuracy of the results and allows closer correlation between the graph-based and path-based analyses.

The distinction between launch and capture distances can be best described using an example. In the schematic shown below, the BUF cell arc is treated as a capture arc. The cells that contribute to the bounding box for the BUF cell arc are highlighted in green. The launch and capture paths are shown with arrows. Note that the capture path passes through the BUF cell arc.

Figure 1: BUF Cell Arc Treated as a Capture Arc

In the schematic shown below, the BUF cell arc is treated as a launch arc. The cells that contribute to the bounding box for the BUF cell arc are highlighted in red. The launch and capture paths are shown with arrows. Note that the launch path passes through the BUF cell arc.

Figure 2: BUF Cell Arc Treated as a Launch Arc

You can examine the launch and capture AOCV distances and depths using the report_aocvm command. For example:
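A minimal invocation sketch (the instance name buf_inst is hypothetical, and the exact report fields vary with the tool version):

report_aocvm [get_cells buf_inst]

The report shows the launch and capture distances and depths used to look up the stage-based derating factors.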

Friday, 5 July 2013

Everyone knows that the increasing speed and complexity of today's designs implies a significant increase in power consumption, which demands better optimization of your design for power. I am sure a lot of us are scratching our heads over how to achieve this, knowing that manual power optimization would be hopelessly slow and all too likely to contain errors.

Here are 8 Top Things you need to know to optimize your design for power using the Encounter Digital Implementation (EDI) System.

Given the importance of power usage of ICs at lower and lower technology nodes, it is necessary to optimize power at various stages in the flow. This blog post will focus on methods that can be used to reach an optimal solution using the EDI System in an automated and clearly defined fashion. It will give clear and concise details on what features are available within optimization, and how to use them to best reach the power goals of the design.

Please read through all of the information below before deciding on the right approach or strategy to take. It is highly dependent on the priority of low power and on what timing, runtime, area and signoff criteria were decided upon for your design. With the aid of some or all of the techniques described in this blog it is possible, depending on the design, to vastly reduce both the leakage and dynamic power consumed by the design.
This is a one-stop quick reference, not a substitute for reading the full documentation.

1) VT partitioning uses various heuristics to gather the cells into a particular partition. Depending on how the cells get placed in a particular bucket, the design leakage can vary a lot. The first thing is to ensure that the leakage power view is correctly specified using the "set_power_analysis_mode -view" command. The "reportVtInstCount -leakage" command is a useful check of how the cells and libraries are partitioned. Always ensure correct partitioning of cells.

2) In several designs, manually controlling certain leakage libraries in the flow might give much better results than the automated partitioning of cells. If the VT partitioning is not satisfactory, or the optimization flow is found to use more LVT cells than targeted, selectively turn off cells of certain libraries, particularly in the initial part of the flow (i.e. the preRoute flow). The user should selectively set the LVT libraries to "don't use" and run preCts/postCts optimization. Depending on final timing QOR, another incremental optimization with LVT cells enabled may be needed.

3) Depending on the importance of leakage/dynamic power in the flow, the leakage/dynamic power flow effort can be set to high or low:

setOptMode -leakagePowerEffort {low|high}
setOptMode -dynamicPowerEffort {low|high}

If timing is the first concern, but having somewhat better leakage/dynamic power is desired, then select low. If leakage/dynamic power is of utmost importance, use high.

4) PostRoute optimization typically works with all LVT cells enabled. In case of a large discrepancy between preRoute and postRoute timings, or if SI timing is much worse than base timing, postRoute optimization may overuse LVT cells. So it may be worthwhile experimenting with a two-pass optimization: once with LVT cells disabled, and then with LVT cells enabled.

5) In order to do a quick postRoute timing optimization that cleans up final violations without doing physical updates, use the following:

setOptMode -allowOnlyCellSwapping true
optDesign -postRoute

This will only do cell swapping to improve timing, without doing physical updates. This is specifically for timing optimization and will worsen leakage.

6) Leakage flows typically have a larger area footprint than non-leakage flows. This is because EDI trades area for power, as it uses more HVT cells to fix timing and reduce leakage. This sometimes necessitates reclaiming any extra area during postRoute optimization to get better convergence in timing. EDI has an option to turn on postRoute area reclaim that is also hold-aware and will not degrade hold timing:

setOptMode -postRouteAreaReclaim holdAndSetupAware

7) Run standalone leakage optimization to do extra leakage reclamation:

optLeakagePower

This may be needed if some of the settings have changed or if leakage flows are not being used.

8) PreRoute optimization works with an extra DRC margin of 0.2 in the flow. On some designs this is known to result in extra optimization, causing more runtime and worse leakage. The option below resets this extra margin in DRV fixing:

setOptMode -drcMargin -0.2

Remember to reset this margin to 0 for postRoute optimization, as postRoute doesn't work with this extra margin of 0.2. Note that the extra drcMargin is sometimes useful in reducing SI effects, so by removing the extra margin, more effort may be needed to fix SI later in the flow.
I hope these tips help you achieve the power goals of your designs!

Do you know about input vector controlled method of leakage reduction?

The leakage current of a gate depends on its inputs as well. Hence, find the set of inputs which gives the least leakage. By applying this minimum leakage vector to a circuit, it is possible to decrease its leakage current when it is in standby mode. This method is known as the input vector controlled method of leakage reduction.
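A toy sketch of the idea in Python (the leakage numbers are invented; real per-vector leakage comes from the library's characterization data):

```python
# Hypothetical standby leakage per input vector for a 2-input NAND
# gate, in nA. Real numbers come from library characterization
# (.lib leakage_power groups); these are made up for illustration.
leakage_na = {
    (0, 0): 10.2,
    (0, 1): 25.7,
    (1, 0): 21.4,
    (1, 1): 45.9,
}

def min_leakage_vector(table):
    """Return the input vector with the lowest standby leakage."""
    return min(table, key=table.get)

best = min_leakage_vector(leakage_na)
print(best)   # -> (0, 0): drive A=0, B=0 before entering standby
```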

How can you reduce dynamic power?

-Reduce switching activity by designing good RTL

-Clock gating

-Architectural improvements

-Reduce supply voltage

-Use multiple voltage domains (multi-Vdd)

What are the vectors of dynamic power?

Voltage and Current
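Those two combine in the familiar switching-power relation P = α·C·Vdd²·f, which is also why reducing the supply voltage pays off quadratically. A quick sketch with made-up numbers:

```python
def dynamic_power(alpha, cap_f, vdd, freq_hz):
    """Classic switching-power relation: P = alpha * C * Vdd^2 * f,
    where alpha is the switching activity factor."""
    return alpha * cap_f * vdd ** 2 * freq_hz

# Made-up numbers: 1 nF of total switched capacitance, 20% activity,
# 500 MHz clock. Dropping Vdd from 1.0 V to 0.8 V cuts power by 36%.
p_1v0 = dynamic_power(0.2, 1e-9, 1.0, 500e6)   # 0.1 W
p_0v8 = dynamic_power(0.2, 1e-9, 0.8, 500e6)   # 0.064 W
```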

If you have both IR drop and congestion how will you fix it?

-Spread macros

-Spread standard cells

-Increase strap width

-Increase number of straps

-Use proper blockage

Are increasing the power line width and adding more straps the only solutions to IR drop? No; you can also:

-Spread macros

-Spread standard cells

-Use proper blockage

In a reg-to-reg path with a setup problem, where will you insert a buffer: near the launching flop or near the capture flop? Why?

(Buffers are inserted for fixing fanout violations, and hence they reduce setup violations; otherwise we try to fix setup violations by sizing cells. Now just assume that you must insert a buffer!)

Near the capture flop.

Because there may be other paths passing through, or originating from, the flop nearer to the launch flop, buffer insertion may affect those other paths as well; it may improve or degrade them all. If all of those paths have violations, then you may insert the buffer nearer to the launch flop, provided it improves slack.

What is the most challenging task you handled? What is the most challenging job in the P&R flow?

-It may be power planning- because you found more IR drop

-It may be low power target-because you had more dynamic and leakage power

-It may be macro placement-because it had more connection with standard cells or macros

-It may be CTS-because you needed to handle multiple clocks and clock domain crossings

-It may be timing-because sizing cells in ECO flow is not meeting timing

-It may be library preparation-because you found some inconsistency in the libraries.

-It may be DRC-because you faced thousands of violations

How will you synthesize clock tree?

-Single clock-normal synthesis and optimization

-Multiple clocks-Synthesize each clock separately

-Multiple clocks with domain crossing-Synthesize each clock separately and balance the skew

Designing at the 20nm node is harder than at 28nm, mostly because of the
lithography and process variability challenges that in turn require changes to
EDA tools and mask making. The attraction of 20nm design is realizing SoCs with
20 billion transistors. Synopsys has re-tooled their EDA software to enable 20nm
design.

20nm Geometries with 193nm Wavelength

Using immersion lithography, clever process development engineers have figured out how to resolve 20nm geometries using 193nm wavelength light; however, making these geometries yield now requires two separate masks, a scheme called Double Patterning Technology (DPT).

Figure 1: Immersion Lithography

With DPT you have to split a single layer like Poly or Metal 1 onto two separate masks; the exposures from the two masks are then overlaid to produce that layer with 20nm geometries.

Figure 2: Double Patterning Technology (DPT)

Looking ahead to 14nm and smaller nodes, this trend will continue with three or more patterns per layer.

When a mask layer is split into two parts, the process is called coloring, and the trick is to make sure that two adjacent geometries are on different colors.
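Coloring is essentially two-coloring a conflict graph whose edges connect geometries that sit closer than the same-mask spacing rule allows. A toy sketch (shape indices and conflicts are invented) using breadth-first two-coloring; an odd cycle in the conflict graph means the layer cannot be split onto two masks without a layout change or a stitch:

```python
from collections import deque

def color_dpt(num_shapes, conflicts):
    """Assign each shape to mask 0 or 1 so that conflicting
    (too-close) shapes land on different masks. Returns None if the
    conflict graph has an odd cycle (an uncolorable DPT violation)."""
    adj = [[] for _ in range(num_shapes)]
    for a, b in conflicts:
        adj[a].append(b)
        adj[b].append(a)
    color = [None] * num_shapes
    for start in range(num_shapes):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None   # odd cycle: needs a stitch or a layout fix
    return color

# Three wires in a row, each too close to the next: alternate the masks.
print(color_dpt(3, [(0, 1), (1, 2)]))           # -> [0, 1, 0]
# A triangle of mutually close shapes cannot be split onto two masks.
print(color_dpt(3, [(0, 1), (1, 2), (0, 2)]))   # -> None
```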

Figure 3: DPT Coloring

With DPT you have to make sure that your cell library and Place & Route tool are both DPT-compliant.

Often in your IC layout the DPT process will have to use stitching to accommodate via arrays:

This stitching will cause issues with line-end effects that in turn can degrade yield:

The earlier you identify these issues, the sooner you can make engineering trade-offs.

Foundries create layout rules at 20nm to specify how to produce high yield, and there are some 5,000 rules at this node.

Using DPT techniques will also cause a variation in capacitance values between adjacent nets, caused by subtle shifts in the double masks.

Q: Where can I read more about 20nm design with Synopsys tools?
A: Achronix did a paper at the Synopsys User Group, and they fabricated at Intel's custom foundry using FinFET technology.

Q: How popular is your DRC and LVS tool, IC Validator?
A: There have been 100 tapeouts in the past year for the IC Validator tool.

Q: How many 20nm designs are there?
A: Test chips were done first last year, and now production designs are taping out with commercial foundries.

Q: How many mask layers require DPT in a 20nm design?
A: It depends on the foundry. First layer metal, maybe second layer of metal. As you relax the metal pitch, you don't need DPT. Poly needs DPT.

Q: What about mask costs at 20nm with DPT?
A: It adds to the costs. It's always a trade-off; the foundry can relax the pitches and avoid DPT usage.

Q: Which foundries have qualified 20nm with Synopsys tools?
A: TSMC, Samsung, and GLOBALFOUNDRIES have qualified and endorse the Synopsys flow for 20nm.

Q: Why should I visit Synopsys at DAC?
A: We'll have live product demos, talk about advanced nodes, show emerging nodes (14nm, 16nm), discuss new product features, and have special events. There is an IC Compiler luncheon where customers speak, and that's on Monday.