VXLIB Cell Characterisation Methodology

Each cell is characterised by running a total of 60 Spice simulations,
one for every combination of 10 input transition values and 6 output
load values. The input transitions range from 20ps to 1500ps.
For an x1 gate, the output loads range from 2.6fF to 338fF.
For cells with bigger or smaller drive strengths, the output loads
are scaled correspondingly. The characterisation conditions are nominal,
with Vdd=1.2V, T=27C and typical process parameters.
Derating to best and worst case conditions has not been done yet.

For each simulation, the rise delay tR is the time from the input
transition passing the 50% point to the rising output passing the 50% point,
and the fall delay tF is the corresponding delay when the output is falling.
The rise and fall transition times are the times the output takes to pass between
the 10% and 90% points, scaled to a full 0-100% transition. This means that the 10-90% time
is divided by (0.9-0.1), or 0.8, to give the equivalent 0-100% transition.
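
The measurements described above can be sketched in a few lines of Python. This is only an illustration, assuming the Spice waveforms are available as sampled time and voltage lists; the function names and the linear interpolation between samples are assumptions, not part of the library's characterisation scripts.

```python
def crossing_time(times, volts, level, rising):
    """First time the waveform crosses `level` in the given direction.

    Linear interpolation between neighbouring samples is assumed.
    """
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            t0, t1 = times[i], times[i + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def delay_50(times, vin, vout, vdd, in_rising, out_rising):
    """Delay from the input 50% point to the output 50% point."""
    t_in = crossing_time(times, vin, 0.5 * vdd, in_rising)
    t_out = crossing_time(times, vout, 0.5 * vdd, out_rising)
    return t_out - t_in


def full_swing_transition(times, vout, vdd, rising, lo=0.1, hi=0.9):
    """Time between the lo and hi thresholds, scaled to 0-100%."""
    if rising:
        t_a = crossing_time(times, vout, lo * vdd, True)
        t_b = crossing_time(times, vout, hi * vdd, True)
    else:
        t_a = crossing_time(times, vout, hi * vdd, False)
        t_b = crossing_time(times, vout, lo * vdd, False)
    return (t_b - t_a) / (hi - lo)
```

The lo and hi arguments default to the 10% and 90% points; other thresholds can be passed in for cells that need them.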

The transition times have been scaled to 0-100% because the Spice simulations
use full swing inputs for the characterisation. The 10%/90% thresholds were
chosen empirically to minimise the differences between the timing in the Spice
simulations and the logic timing obtained by interpolating the LookUp Tables (LUTs).
This determination has to be repeated for each technology; the values won't
be the same for each one.

For some gates, typically those where an N transistor strongly helps the output to rise,
the rise transition thresholds are different.
The N transistor assisting the pull up changes the shape of the output curve,
so in order to minimise timing discrepancies, the thresholds are
chosen on a per cell basis.

The iv1v0x2 is used as a reference to calculate the simplified Prop-Ramp model
timings which have been put into the .LIB file as comments.
The input transition time used for the Prop delays is that of an iv1v0x2
inverter from the vsclib driving 4 more iv1v0x2 inverters.

The iv1v0x2 has an input pin capacitance of 5.13fF, so a fanout of 4 gives a
load capacitance of 20.52fF. The iv1v0x2 timing LUT for the output rising transitions is:

Taking an input transition of 130ps, we interpolate between the 16fF and 48fF loads
to find that with a load of 20.52fF, the output transition is 141.7ps.
Repeating for the output falling transition, again with an input transition of 130ps,
the output transition is 111.2ps. The average output transition is then 126.5ps,
and from this value we confirm the choice of 130ps as the transition to use for the Prop-Ramp model.
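
The arithmetic behind this check can be sketched as follows. The pin capacitance, fanout, load breakpoints and the two quoted output transitions come from the text; the helper name and the two LUT entries used to exercise the interpolation are hypothetical placeholders, not values from the iv1v0x2 table.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation of y at x between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


PIN_CAP_FF = 5.13               # iv1v0x2 input pin capacitance (from the text)
FANOUT = 4
load_ff = FANOUT * PIN_CAP_FF   # 20.52 fF

# HYPOTHETICAL output transitions (ps) at the 16fF and 48fF breakpoints,
# used only to show the call; they are not the library's LUT entries.
example_at_16ff = 100.0
example_at_48ff = 200.0
example_transition = interpolate(load_ff, 16.0, example_at_16ff,
                                 48.0, example_at_48ff)

# Average of the rise and fall output transitions quoted in the text.
average_ps = (141.7 + 111.2) / 2   # 126.45 ps, close to the chosen 130ps
```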

The logical effort of each gate is referenced instead to an inverter whose P to N
transistor width ratio is 2.25, the standard value of mobility ratio used.
The mobility ratio expresses how much more conductive an N transistor is than a P transistor
of the same size, and varies between 2 and 3 depending on the process. The reference inverter
is the iv1_y2, which has a 36λ P transistor and a 16λ N transistor. Its Logical Effort
has been set to 1, and the other gates and their pins are compared to this value.
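
For orientation, the textbook Logical Effort estimates for simple gates can be computed from the same P/N ratio. This is a sketch of the standard first-order formulas only, assuming series stacks sized to match the drive of the reference inverter; it is not necessarily how the values in this library were derived, and the function names are made up for the example.

```python
# P/N width ratio of the reference inverter iv1_y2 (36λ P, 16λ N).
GAMMA = 36.0 / 16.0   # 2.25


def nand_effort(n_inputs, gamma=GAMMA):
    """First-order logical effort per input of an n-input NAND."""
    return (n_inputs + gamma) / (1.0 + gamma)


def nor_effort(n_inputs, gamma=GAMMA):
    """First-order logical effort per input of an n-input NOR."""
    return (1.0 + n_inputs * gamma) / (1.0 + gamma)


# A 1-input NAND or NOR is an inverter, so both formulas give 1 there.
inverter_effort = nand_effort(1)
nand2_effort = nand_effort(2)   # 4.25 / 3.25
nor2_effort = nor_effort(2)     # 5.5 / 3.25
```

With a ratio of 2.25 these formulas give about 1.31 for a 2-input NAND and about 1.69 for a 2-input NOR.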

According to Logical Effort theory, a circuit is optimally designed if all cell
transistor sizes are adjusted so that the transition times are all 130ps. This ignores the fact
that rise and fall transitions will be different, and only considers averages.
It also ignores the fact that fixed wire capacitances vary between different nets,
and are particularly low on the internal nets of non-inverting gates.

I recommend keeping the transition times below 1200ps, and below 600ps for signals which are
used as clocks. The LUT extends beyond this so that for any reasonable value of input
transition and output load, the timing is interpolated between values in the table.
If the timing has to be extrapolated beyond the table values, then significant timing
inaccuracies will occur. A max_transition of 1500ps has been set on each cell input so that
input transitions beyond the range of the LUT will generate a warning.
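
A table lookup with a range check might look like the sketch below. It assumes bilinear interpolation on a rectangular grid, which is a common scheme but not necessarily the one used by any particular timing tool; the function names and the layout table[i][j] (row i for transitions[i], column j for loads[j]) are assumptions for illustration.

```python
import bisect
import warnings


def _bracket(axis, x):
    """Index i of the segment axis[i]..axis[i+1] used for x (clamped at the ends)."""
    i = bisect.bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lut_lookup(transitions, loads, table, tr, load):
    """Bilinear lookup of table at (tr, load); warns when extrapolating."""
    inside = (transitions[0] <= tr <= transitions[-1]
              and loads[0] <= load <= loads[-1])
    if not inside:
        warnings.warn("operating point is outside the LUT; result is extrapolated")
    i = _bracket(transitions, tr)
    j = _bracket(loads, load)
    u = (tr - transitions[i]) / (transitions[i + 1] - transitions[i])
    v = (load - loads[j]) / (loads[j + 1] - loads[j])
    return ((1 - u) * (1 - v) * table[i][j]
            + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j]
            + u * v * table[i + 1][j + 1])
```

Outside the table the same formula extrapolates linearly from the nearest segment, which is where the inaccuracies mentioned above come from.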

It is possible to get negative delays in the LUT. This happens when the input transition
is very slow, the output load is very small and the switching threshold is below Vdd/2.
For the cell inputs where this occurs, the max_transition has been reduced to 1000ps.
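
A crude first-order model shows how a negative delay can arise. It assumes a linear rising input ramp, a gate that starts switching when the input reaches its switching threshold, and a fixed intrinsic delay after that point; the function name, the parameter names and the example numbers are hypothetical and are not taken from the characterisation data.

```python
def ramp_delay_ps(t_in_ps, v_switch, vdd, intrinsic_ps):
    """50%-to-50% delay for a rising input ramp of 0-100% duration t_in_ps.

    The input reaches v_switch at t_in_ps * v_switch / vdd and its 50%
    point at t_in_ps / 2, so a threshold below vdd/2 is reached early.
    """
    return t_in_ps * (v_switch / vdd - 0.5) + intrinsic_ps


# Hypothetical numbers: 0.5V threshold, Vdd=1.2V, 30ps intrinsic delay.
slow_input = ramp_delay_ps(1500.0, 0.5, 1.2, 30.0)   # negative
fast_input = ramp_delay_ps(100.0, 0.5, 1.2, 30.0)    # positive
```

In this model the output passes its 50% point before the input does once the input ramp is slow enough, which is the situation that limiting max_transition avoids.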

A 6x10 LUT is larger than that used by most standard cell libraries. For this reason,
one can consider the vsclib and vxlib timing to be more accurate than that of other libraries.