Descriptions

Synchronization is one of the important issues in digital system design. While
other approaches have been intriguing, up until now a globally clocked timing
discipline has been the dominant design philosophy. However, we have reached the
point, with advances in technology, where other options should be given serious
consideration. VLSI promises great processing power at low cost. This increase in
computation power has been obtained by scaling the digital IC process. But as this
scaling continues, it is doubtful that the advantages of faster devices can be fully
exploited. This is because the clock periods are getting much smaller in relation to the
interconnect propagation delays, even within a single chip and certainly at the board and
backplane level.
In this thesis, some alternative approaches to synchronization in digital system
design are described and developed. We owe these techniques to a long history of
effort in both digital computational system design and digital communication
system design. The latter field is relevant because large propagation delays have always
been a dominant consideration in its design methods.
Asynchronous design gives better performance than comparable synchronous
design in situations where global synchronization with a high-speed clock
becomes a constraint on system throughput. Asynchronous circuits with
unbounded gate delays, or self-timed digital circuits, can be designed by employing either
of two request-acknowledge protocols: 4-cycle and 2-cycle.
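As a rough illustration of how the two request-acknowledge protocols differ, the following sketch models one data transfer under each scheme and counts the handshake-wire transitions. It assumes the conventional definitions: 4-cycle signaling returns both wires to zero after every transfer, while 2-cycle signaling treats every transition as an event. The function names are hypothetical, and the sketch models neither gate delays nor transistor counts, so it says nothing by itself about the performance figures reported in the thesis.

```python
def four_cycle_transfer(req, ack):
    """One transfer under 4-cycle (return-to-zero) signaling.

    Returns the (req, ack) wire states after each transition.
    """
    assert (req, ack) == (0, 0), "4-cycle signaling idles with both wires low"
    trace = []
    req = 1
    trace.append((req, ack))  # sender raises request: data is valid
    ack = 1
    trace.append((req, ack))  # receiver raises acknowledge: data accepted
    req = 0
    trace.append((req, ack))  # sender returns request to zero
    ack = 0
    trace.append((req, ack))  # receiver returns acknowledge to zero
    return trace


def two_cycle_transfer(req, ack):
    """One transfer under 2-cycle (transition) signaling.

    Any edge, rising or falling, counts as an event.
    """
    assert req == ack, "2-cycle signaling idles when the wires agree"
    trace = []
    req ^= 1
    trace.append((req, ack))  # sender toggles request: data is valid
    ack ^= 1
    trace.append((req, ack))  # receiver toggles acknowledge: data accepted
    return trace


def count_transitions(protocol, n_transfers):
    """Total handshake-wire transitions needed for n_transfers transfers."""
    req = ack = 0
    total = 0
    for _ in range(n_transfers):
        trace = protocol(req, ack)
        total += len(trace)
        req, ack = trace[-1]  # the next transfer starts from the final state
    return total


if __name__ == "__main__":
    for name, protocol in [("4-cycle", four_cycle_transfer),
                           ("2-cycle", two_cycle_transfer)]:
        print(name, count_transitions(protocol, 3))
```

In this model each 4-cycle transfer takes four wire transitions and each 2-cycle transfer takes two. A 2-cycle circuit must respond to both rising and falling edges, which usually requires more elaborate control logic.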
We will also present an alternative approach to the problem of mapping
computation algorithms directly into asynchronous circuits. A data flow graph or
language is used to describe the computation algorithms. The data flow primitives have
been designed using both the 2-cycle and 4-cycle signaling schemes, which are
compared in terms of performance and transistor count. The 2-cycle implementations
prove to be better than their 4-cycle counterparts.
A promising application of self-timed design is in high-performance DSP
systems. Since there is no global constraint of clock distribution, localized forward-only
connections allow computation to be extended and sped up using pipelining. A
decimation filter was designed and simulated to check the system-level performance of
the two protocols. Simulations were carried out using VHDL for high-level definition
of the design. The simulation results will demonstrate not only the efficacy of our
synthesis procedure but also the improved efficiency of the 2-cycle scheme over the
4-cycle scheme.