Delay Insensitive Asynchronous Design


2 DELAY INSENSITIVE ASYNCHRONOUS DESIGN

Delay-insensitivity is based on the assumption that “a circuit should function correctly
irrespective of all gate and interconnect delays as if these delays are unbounded” [3]. This
is why delay-insensitive asynchronous circuits present a convenient alternative for
deep-submicron design, where interconnect delays affect circuit behavior nearly as much as
gate delays [29]. Delay-insensitive circuits offer robust self-timed operation with the least
timing analysis effort of any asynchronous design style: no global constraints are required
from the environment. Completion of each operation is acknowledged to allow the environment
to apply the next input, so the circuit can wait indefinitely for an input to arrive and,
once it arrives, can run as fast as the underlying silicon technology allows [3]. The circuit
can thus deliver average-case rather than worst-case performance.

2.1 Delay Insensitive Design Styles

Delay-insensitive design styles fall mainly into two categories according to the level of
abstraction applied [30]: transistor-level and gate-level.

Transistor-Level Delay-Insensitive Design Styles usually follow Martin's methods [36] for
designing at transistor level, building optimized and usually state-holding circuits
through formal transformations from logic descriptions. This design style produces
circuits with minimum transistor count [37, 38] and has a specific language and design tool
developed for it [36], but because its abstraction is at transistor level, it is not as
widely supported and automated as gate (logic) level design styles.

Gate-Level Delay-Insensitive Design Styles set the level of abstraction at the logic design
level, provided that a standard cell library composed of special logic gates is used for
circuit implementation, either totally or partially, alongside ordinary Boolean logic. Such
a library contains logic elements that resemble Muller C gates [31] in that they hold their
states when certain input conditions are not attained. These are called threshold-logic
gates, of which the best known, and the one incorporated into an automated CAD flow, is Null
Convention Logic (NCL) [16]. In gate-level delay-insensitive design, mutually exclusive
symbol representations are frequently used instead of the Boolean representation, even
though Boolean gates are still partially used. There is an increasing degree of automated
tool support for design and verification of gate-level delay-insensitive circuits, due to
their suitability for system-on-chip design constraints.

2.2 Dual-Rail Threshold Logic Gates

True delay-insensitive circuits are very hard to realize and therefore very rare. However,
being the closest approximation, Dual-rail Threshold Logic Gates are widely referred to as
building blocks for delay-insensitive circuits in the literature. These circuits are actually
quasi-delay-insensitive, meaning that their functionality relies on the isochronic fork
assumption, which states that all branches of a wire fork have equal delays, or at least
that this holds for forks on small circuit scales. Dual-rail Threshold Logic gates implement
a logic function when certain input conditions, namely the threshold, are met, and otherwise
hold their states. They have been developed concurrently under different names by different
parties for gate-level delay-insensitive design [33, 34, 35]. The best known is Null
Convention Logic (NCL), developed and commercialized by Theseus Logic Inc. in 1996 to
address the delay-insensitive asynchronous design space [16, 39, 40]. In NCL style,
completion information is not sent explicitly but is embedded in the data representation,
and circuits are constructed using only gates from an NCL-type cell library. The basic
principles characterizing Dual-rail Threshold Logic Gates are explained in the following
subsections.

2.2.1 Symbolic Completeness of Expression

Symbolic Completeness of Expression requires a logical expression to depend only on the
relationships of the symbols present, without reference to the evaluation time [30].
Dual-rail Threshold Logic circuits use Mutually Exclusive Assertion Groups (MEAGs), instead
of the Boolean representation, to achieve Symbolic Completeness of Expression. MEAGs such as
dual-rail signals eliminate the time reference by embedding control information into the
data representation: the symbol set contains a NULL or RESET value, which is asserted when
data is not valid.

A dual-rail signal has two mutually exclusive data paths, D0 and D1, and implements three
logic states {NULL, DATA0, DATA1}, as given in Table 2.1: state DATA1 (D0 = 0 and D1 = 1)
represents Boolean logic 1, state DATA0 (D0 = 1 and D1 = 0) represents Boolean logic 0, and
state NULL (D0 = 0 and D1 = 0) indicates that the result is not available yet. The validity
of the output can therefore be determined without a time reference. As the two rails are
mutually exclusive, (D0 = 1 and D1 = 1) is an illegal state.

Table 2.1 Dual Rail Signalling

                                    STATES
SIGNALS   DATA0               DATA1               NULL               -
          (Boolean Logic 0)   (Boolean Logic 1)   (Data not valid)   (undefined)
D0        1                   0                   0                  1
D1        0                   1                   0                  1
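
The encoding in Table 2.1 can be illustrated with a short Python sketch. The names `State`, `encode` and `decode` are hypothetical and are introduced here only for illustration; rails are passed in the order (D0, D1).

```python
from enum import Enum


class State(Enum):
    """The three legal dual-rail states of Table 2.1."""
    NULL = "NULL"
    DATA0 = "DATA0"
    DATA1 = "DATA1"


def encode(bit):
    """Map a Boolean value, or None for 'data not valid', to rails (D0, D1)."""
    if bit is None:
        return (0, 0)                      # NULL: neither rail asserted
    return (0, 1) if bit else (1, 0)       # DATA1 or DATA0


def decode(d0, d1):
    """Map rails (D0, D1) back to a state; (1, 1) is rejected as illegal."""
    if (d0, d1) == (1, 1):
        raise ValueError("illegal dual-rail state: both rails asserted")
    if (d0, d1) == (0, 0):
        return State.NULL
    return State.DATA1 if d1 else State.DATA0
```

Because NULL is a symbol of its own, a receiver can tell from the rails alone whether a value is valid, which is the point of embedding control information in the data representation.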

2.2.2 Two-Phase Operation

Dual-rail Threshold Logic circuits are constructed from primitive modules known as
threshold gates with hysteresis [41]. A typical thmn gate, with 1 ≤ m ≤ n, has n inputs, of
which at least m have to become DATA for the output to assert a DATA value. This is the
threshold behavior. Conversely, all n inputs have to return to NULL for the output to
assert NULL. Otherwise the threshold gate maintains its current state, displaying
hysteresis behavior. Specifically, a thnn gate functions like an n-input C-element, while a
th1n gate functions like an n-input OR gate. Two typical gates from the Dual-rail Threshold
Logic Library and their truth tables are given in Figure 2.1.

a b c | z (th33) | z (th13)
1 1 1 |    1     |    1
1 1 0 |    z     |    1
1 0 1 |    z     |    1
1 0 0 |    z     |    1
0 1 1 |    z     |    1
0 1 0 |    z     |    1
0 0 1 |    z     |    1
0 0 0 |    0     |    0

(An output entry of z means that the gate holds its previous output.)

a. th33 gate with hysteresis        b. th13 gate with hysteresis


Figure 2.1 Dual-rail Threshold logic style basic building gates
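
The behavior of a threshold gate with hysteresis can be sketched as a small behavioral model in Python. This is only an illustration: the class name `ThresholdGate` is hypothetical, the gate is assumed to start in the NULL state, and the reset condition (all inputs at 0) follows the truth tables of Figure 2.1.

```python
class ThresholdGate:
    """Behavioral model of a thmn gate with hysteresis."""

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("need 1 <= m <= n")
        self.m, self.n = m, n
        self.z = 0  # assumption: the gate powers up with its output at NULL

    def update(self, *inputs):
        if len(inputs) != self.n:
            raise ValueError("expected %d inputs" % self.n)
        ones = sum(inputs)
        if ones >= self.m:
            self.z = 1      # threshold reached: assert DATA
        elif ones == 0:
            self.z = 0      # every input back at NULL: assert NULL
        # in all other cases the previous output is held (hysteresis)
        return self.z


th33 = ThresholdGate(3, 3)
print(th33.update(1, 1, 0))  # 0: threshold not reached, output held
print(th33.update(1, 1, 1))  # 1: all three inputs asserted
print(th33.update(1, 0, 0))  # 1: held until every input returns to 0
print(th33.update(0, 0, 0))  # 0: reset to NULL
```

With m = n the model behaves like a C-element, and with m = 1 like an OR gate, matching the two columns of the truth table.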

The threshold gates partition the inputs into separate NULL and DATA wavefronts: a NULL
value must be applied to the circuit inputs between consecutive DATA values, so that the
circuit always alternates between NULL and DATA inputs, eliminating races and hazards
completely.

2.2.3 Logic Design using Dual Rail Threshold Logic Gates

The most basic approach for logic design using Dual Rail Threshold Logic Gates is
producing a sum of minterms for both rails of the dual-rail output in DIMS (Delay
Insensitive Minterms Summation) style [32, 33], to implement the logic functionality. A
DIMS style full-adder built from dual-rail threshold logic gates is illustrated in Figure 2.2.

[Figure: eight th33 gates generate the minterms a1b1c1, a1b0c0, a0b1c0, a0b0c1, a0b0c0,
a0b1c1, a1b0c1 and a1b1c0 from the dual-rail inputs a0/a1, b0/b1 and c0/c1; four th14 gates
sum them into the outputs Cout1, Sum1, Cout0 and Sum0.]

Figure 2.2 DIMS Adder Structure built with Dual-Rail Threshold Logic Gates
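
A behavioral sketch of such a DIMS full adder is given below in Python. It is not taken from the source: the class names `Th` and `DimsFullAdder` are hypothetical, and the assignment of minterms to the four th14 gates is derived from the ordinary full-adder truth table (Sum is the parity of the inputs, Cout the majority).

```python
from itertools import product


class Th:
    """m-of-n threshold gate with hysteresis; output assumed to start at 0."""

    def __init__(self, m):
        self.m, self.z = m, 0

    def __call__(self, *x):
        ones = sum(x)
        if ones >= self.m:
            self.z = 1
        elif ones == 0:
            self.z = 0
        return self.z


class DimsFullAdder:
    def __init__(self):
        # One th33 gate per minterm, keyed by the Boolean values (a, b, c).
        self.minterm = {k: Th(3) for k in product((0, 1), repeat=3)}
        # One th14 gate per output rail.
        self.out = {n: Th(1) for n in ("sum0", "sum1", "cout0", "cout1")}

    def step(self, a, b, c):
        """a, b, c are rail pairs (rail0, rail1); returns (sum, cout) pairs."""
        # Key bit 0 selects rail0 of an input, key bit 1 selects rail1.
        m = {k: g(a[k[0]], b[k[1]], c[k[2]]) for k, g in self.minterm.items()}

        def collect(name, pred):
            # Each predicate selects exactly four of the eight minterms.
            return self.out[name](*[v for k, v in m.items() if pred(sum(k))])

        s1 = collect("sum1", lambda t: t % 2 == 1)   # odd number of ones
        s0 = collect("sum0", lambda t: t % 2 == 0)
        c1 = collect("cout1", lambda t: t >= 2)      # majority of ones
        c0 = collect("cout0", lambda t: t < 2)
        return (s0, s1), (c0, c1)


fa = DimsFullAdder()
print(fa.step((0, 1), (0, 1), (1, 0)))  # a=1, b=1, c=0 -> ((1, 0), (0, 1))
print(fa.step((0, 0), (0, 0), (0, 0)))  # NULL wavefront -> ((0, 0), (0, 0))
```

Since every minterm gate needs all three inputs, no output rail can be asserted while any input is still NULL, which is what makes the full-minterm form attractive despite its gate count.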

There are other approaches which allow for some degree of boolean optimization and hence
do not require complete minterms but rely on C-gates to guarantee delay-insensitivity [34,
35].

2.2.4 Transistor Level Design of Dual Rail Threshold Logic Gates

For abstracting design layers in delay-insensitive circuit design, a design library is
constructed by designing Dual-rail Threshold Logic gates at transistor level in standard
CMOS technologies using custom CAD tools. Functional modules are then designed at logic
level using the threshold-logic gates in this design library. Among the various CMOS
circuit design techniques that could be employed in designing the Dual-rail Threshold Logic
gates, the Static Implementation Method for NCL Gates [41] is preferred for being the most
reliable method available. Figure 2.3.a gives the typical structure of a static M-of-N
threshold gate. Both the nMOS and the pMOS logic are constructed in two parts. The “Go to
NULL” part of the pMOS logic is ON only when all N inputs are at logic level 0. The
functionality of this block is complementary to that of the “Hold DATA” part of the nMOS
logic which, together with the feedback nMOS gate from the gate output Y, implements the
case when one or more of the N inputs are at logic level 1. Similarly, the functionalities
of the “Go to DATA” part of the nMOS logic and the “Hold NULL” part of the pMOS logic are
complementary to each other, but their structures depend on the values of M and N. In case
M = N, i.e. the gate is an N-of-N threshold gate, the “Go to DATA” part of the nMOS logic
implements the case when all N inputs are at logic level 1, and the “Hold NULL” part of the
pMOS logic, together with the feedback pMOS gate from the gate output Y, implements the
case when one or more of the N inputs are at logic level 0. Figure 2.3.b illustrates the
general structure of such a gate.

[Figure: transistor-level schematics showing VDD, the inputs A1, A2, ..., An, the output Y,
and the Go To NULL, Hold NULL, Go To DATA and Hold DATA blocks.]

a. Structure of M-of-N threshold gate [41] b. Structure of N-of-N threshold gate [41]
Figure 2.3 Static implementation of Dual-Rail threshold gates with hysteresis

After constructing a Dual-Rail Threshold gate according to the given Static Implementation
rules, further circuit optimizations could be employed to decrease the transistor count and
circuit area or to improve gate response times [30].

2.2.5 Registration and Pipelining

Each Dual-rail Threshold Logic circuit requires at least two registration stages: one at
the output to detect the completion of a DATA/NULL value and one at the input to request
the next NULL/DATA value. More registration stages could be introduced to divide the
functional blocks in pipelined fashion, as seen in Figure 2.4.

[Figure: pipeline stages STAGEn-1, STAGEn and STAGEn+1. DATAin passes through DI Latch
(n-2), DI Combinational Logic, DI Latch (n-1), DI Combinational Logic, DI Latch (n), DI
Combinational Logic and DI Latch (n+1) to DATAout; the latches carry the handshake lines
ACKn-2/REQn-2, ACKn-1/REQn-1, ACKn/REQn and ACKn+1/REQn+1.]

Figure 2.4 Delay-Insensitive (DI) Pipeline with Explicit Registration

In a Dual-rail threshold logic pipeline, the flow of DATA/NULL wavefronts between


adjacent stages is controlled by Dual-rail Threshold logic latches (registers) through use of
dedicated ACK and REQ lines [42]. The ACK output, generated by the completion detection
block of each pipeline stage is connected to the REQ input of the preceding stage to convey
a DATA Acknowledge/NULL Request or a NULL Acknowledge/DATA Request,
resembling closely the control flow in micropipelines [17]. As a result, Dual-rail
Threshold logic circuits continuously cycle between DATA and NULL states, where a
complete cycle, called a DATA-to-DATA cycle time (TDD), resembles a clock period in a
pipelined synchronous circuit except that the period TDD is not definite, but input-dependent;
and approximately half of the period is used for actual logic operation, while the other half is
used to generate the NULL marker between successive logic operations (see Figure 2.4).
This is a disadvantage in terms of throughput, but there are certain techniques addressing
compensation for this slow down [42, 43].

[Figure: the DATA-to-DATA cycle time (TDD) spans four consecutive phases: NULL Evaluation,
NULL Ack, DATA Evaluation and DATA Ack.]

Figure 2.5 TDD cycle of a Pipelined Dual-Rail Threshold Logic Circuit
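
The completion detection block that generates the ACK output of a pipeline stage can be pictured with the following Python sketch of word-wise completion detection. The class name `CompletionDetector` and the signal polarity (1 after a complete DATA wavefront, 0 after a complete NULL wavefront) are assumptions made for illustration and are not taken from the source.

```python
class CompletionDetector:
    """Word-wise completion detection over a dual-rail word."""

    def __init__(self):
        self.ack = 0  # assumption: the pipeline starts with a NULL wavefront

    def update(self, word):
        """word is a list of (d0, d1) rail pairs, one pair per bit."""
        valid = [d0 | d1 for d0, d1 in word]  # a bit is DATA if either rail is 1
        if all(valid):
            self.ack = 1    # complete DATA word: DATA acknowledge / NULL request
        elif not any(valid):
            self.ack = 0    # complete NULL word: NULL acknowledge / DATA request
        # a partially arrived wavefront leaves the previous value unchanged
        return self.ack


cd = CompletionDetector()
print(cd.update([(0, 1), (0, 0)]))  # 0: second bit has not arrived yet
print(cd.update([(0, 1), (1, 0)]))  # 1: complete DATA wavefront
print(cd.update([(0, 0), (1, 0)]))  # 1: NULL wavefront only partially arrived
print(cd.update([(0, 0), (0, 0)]))  # 0: complete NULL wavefront
```

The hold behavior in the partial cases is the same hysteresis that a thnn gate provides, so the detector never reports completion of a wavefront that is still in flight.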

In a Dual-rail Threshold Logic pipeline, the pipeline registration stages could be
completely eliminated by embedding the registration into the last level of combinational
logic. Since each Dual-rail Threshold Logic gate can inherently hold its state like a
register, the REQ input from the next stage could be fed into the last level of
combinational gates of each pipeline stage as an extra input, and the threshold level of
these gates could be increased by 1 to include the REQ input. Gate count and DATA-to-DATA
cycle time (TDD) are thereby reduced, and the throughput of the pipeline is improved.
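
As a minimal sketch of this idea, assume the last-level gate of a stage is a th22 gate with inputs a and b. Adding REQ as a third input and raising the threshold by 1 turns it into a th33 gate. The function names below are hypothetical, and the polarity (REQ = 1 requests DATA, REQ = 0 requests NULL) is an assumption.

```python
def threshold_gate(m, inputs, prev):
    """One evaluation of an m-of-n threshold gate with hysteresis."""
    ones = sum(inputs)
    if ones >= m:
        return 1
    if ones == 0:
        return 0
    return prev  # hold the previous output


def embedded_stage_output(a, b, req, prev):
    """Original gate th22(a, b) with REQ embedded: th33(a, b, req)."""
    return threshold_gate(3, (a, b, req), prev)


z = 0
z = embedded_stage_output(1, 1, 0, z)
print(z)  # 0: DATA is ready but the next stage has not requested it
z = embedded_stage_output(1, 1, 1, z)
print(z)  # 1: DATA is passed on once REQ is asserted
z = embedded_stage_output(0, 0, 1, z)
print(z)  # 1: DATA is held while the next stage still needs it
z = embedded_stage_output(0, 0, 0, z)
print(z)  # 0: NULL is passed on once REQ requests it
```

The gate therefore does the work of both the logic function and the register, which is why a separate registration stage is no longer needed in this case.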

2.3 Delay Insensitivity Criteria

Dual-rail Threshold logic circuits need to obey certain criteria for maintaining delay-
insensitivity. These can be summarized as follows:

(i) Completeness of Input requires that all outputs of a combinational circuit may not
transition from NULL to DATA until all inputs have transitioned from NULL to DATA, and
may not transition from DATA to NULL until all inputs have transitioned from DATA to
NULL. For circuits with multiple outputs, Seitz's Weak Conditions for Completeness of
Input [44] allow some outputs to transition without a complete input set, as long as not
all outputs can transition before all inputs arrive.

(ii) Observability requires that every input and internal wire transition in the circuit
should cause a transition in at least one of the outputs [30, 40]. Transitions that are not
used in the determination of the outputs, called orphans, are not allowed to propagate
through gate boundaries.

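
Criterion (i) can be checked by exhaustive enumeration for small single-output circuits, as in the Python sketch below. The two dual-rail AND implementations and all function names are hypothetical examples, not circuits from the source. Only the NULL-to-DATA phase is modeled, starting from an all-NULL circuit, so each threshold gate reduces to a plain threshold test.

```python
from itertools import product

NULL, DATA0, DATA1 = (0, 0), (1, 0), (0, 1)   # rail pairs (rail0, rail1)


def th(m, *x):
    """Threshold gate on the NULL-to-DATA phase, starting from output 0."""
    return int(sum(x) >= m)


def and_optimized(a, b):
    # Boolean-optimized AND: z1 = a1 AND b1, z0 = a0 OR b0.
    return (th(1, a[0], b[0]), th(2, a[1], b[1]))


def and_dims(a, b):
    # Full-minterm AND: every minterm needs one rail from each input.
    z1 = th(2, a[1], b[1])
    z0 = th(1, th(2, a[0], b[0]), th(2, a[0], b[1]), th(2, a[1], b[0]))
    return (z0, z1)


def is_input_complete(circuit, n_inputs):
    """True if the output stays NULL whenever at least one input is NULL."""
    for inputs in product((NULL, DATA0, DATA1), repeat=n_inputs):
        if NULL in inputs and circuit(*inputs) != NULL:
            return False
    return True


print(is_input_complete(and_optimized, 2))  # False
print(is_input_complete(and_dims, 2))       # True
```

The optimized version fails because a single DATA0 input already asserts the z0 rail, whereas the full-minterm version waits for both inputs; the DATA-to-NULL phase would need a stateful gate model and is not covered here.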
2.4 Pipelining Criteria

Dual-rail Threshold Logic circuits lend themselves easily to pipelining, but pipelining
requires an additional criterion to be obeyed for delay insensitivity. To maintain proper
control flow in a pipelined Dual-rail Threshold Logic circuit, so that NULL and DATA waves
do not interact within a pipeline stage and violate delay insensitivity, the evaluation
time of the ACK output of each pipeline stage should not be greater than the arrival time
of the REQ input to that stage, which is fed back from the next pipeline stage as its ACK
output, as formulated in (1):

$$\mathrm{Time}[\mathrm{input}, \mathrm{ACK}]_{n} \le \mathrm{Time}[\mathrm{input}, \mathrm{REQ}]_{n} = \mathrm{Time}[\mathrm{input}, \mathrm{ACK}]_{n+1} \qquad (1)$$
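
As a small worked illustration of criterion (1), the Python sketch below flags the stages whose ACK evaluation time exceeds the arrival time of their REQ input. The function name is hypothetical and the delay values are made up for the example; they do not come from the source.

```python
def stages_violating_criterion(stages):
    """Return the indices n where Time[input, ACK]_n > Time[input, REQ]_n.

    `stages` is a list of (t_ack, t_req) pairs in any consistent time unit.
    """
    return [n for n, (t_ack, t_req) in enumerate(stages) if t_ack > t_req]


# Made-up delays (not from the source): only stage 1 violates the criterion.
example = [(2.0, 3.5), (4.0, 3.0), (1.5, 1.5)]
print(stages_violating_criterion(example))  # prints [1]
```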

Due to their ease of pipelining, Dual-rail Threshold logic circuits could be intrinsically
transformed into systolic arrays for increased throughput in data processing. In systolic
arrays, data exchange is localized to adjacent systoles so global data paths are eliminated.
With asynchronous design, global control paths (clock signals) are also eliminated and
replaced with local handshaking signals. A delay-insensitive bit-level pipelined systolic
array with embedded registration is shown in Figure 2.6.

[Figure: three adjacent DI systoles (n-1), (n) and (n+1). Each systole has dual-rail (0,1)
inputs V1 to Vm, dual-rail signals H1 to Hj passed from systole to systole, dual-rail
outputs Y1 to Yk, and REQ/ACK handshake lines to its neighbours.]

Figure 2.6 DI systolic array with bit-level embedded pipelining

Bit-level pipelining in systolic arrays has the advantage of reducing the latency of the
circuit to the latency of a single systole, so that the speed of a single systole
determines the overall throughput of the systolic array circuit and the throughput can be
kept constant against increasing array dimensions. With bit-level pipelining, however, an
additional criterion for delay insensitivity, called Completion Completeness [45], is
introduced in case bit-wise completion is used at the registration stages and the
combinational parts of the circuit only conform to the Weak Condition for Completeness of
Input.

Completion Completeness is based on the fact that the dual-rail threshold logic
registration stage, which acknowledges either a DATA output or a NULL output, can only
assure the completeness of the output, not the completeness of the input [45]. This may
cause consecutive DATA/NULL wavefronts to interact and violate delay-insensitive operation
when bit-wise completion is adopted instead of word-wise completion to increase the
throughput of the dual-rail threshold logic pipeline and the combinational parts only
conform to the Weak Condition for Completeness of Input. The reason is that, in bit-wise
completion, the completion signal of each output bit is sent only to the dual-rail
threshold logic registers that took part in the calculation of that output bit, so an
output bit does not reflect all input transitions individually.

In case a dual-rail threshold logic registration stage is completion-incomplete, two
methods are proposed in [45] to ensure delay insensitivity: either the topology of the
combinational blocks is modified to make all output bits input-complete, or the completion
set of each register is modified to reflect input-completeness. However, these two methods
may conflict with logic-level optimizations introduced to decrease the gate count or
increase the evaluation speed. To preserve the advantages of logic-level optimizations
while realizing completion-completeness in order to ensure delay insensitivity,
alternative methods are required.
