Part 1


Bridging the gap between asynchronous design and designers

Thanks to Jordi Cortadella, Luciano Lavagno, Mike Kishinevsky and many others

1
Outline
1. Basic concepts on asynchronous circuit design
2. Logic synthesis from concurrent specifications
3. Design automation for asynchronous circuits

2
Basic concepts on asynchronous circuit design

3
Outline
What is an asynchronous circuit?
Asynchronous communication
Asynchronous design styles (Micropipelines)
Asynchronous logic building blocks
Control specification and implementation
Delay models and classes of async circuits
Why asynchronous circuits?

4
Synchronous circuit

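[Figure: pipeline of registers (R) alternating with combinational logic blocks (CL), all registers driven by a common clock CLK]

Implicit (global) synchronization between blocks

Clock period > Max Delay (CL + R)

Time is an independent physical variable (quantity)

5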
Asynchronous circuit
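[Figure: pipeline of registers (R) alternating with combinational logic blocks (CL), with Req and Ack handshake signals between the stages]

Explicit (local) synchronization: Req / Ack handshakes

Time = events + quantity

Time does not exist if nothing happens (Aristotle)

6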
Motivation for asynchronous
Asynchronous design is often unavoidable:

Asynchronous interfaces, arbiters etc.

Modern clocking is multi-phase and distributed, and virtually ‘asynchronous’ (cf. GALS, next slide):

Mesochronous (clock travels together with data)

Local (possibly stretchable) clock generation

A robust asynchronous design flow is coming (e.g. VLSI programming from Philips, NCL from Theseus Logic, fine-grain pipelining from Fulcrum)

7
Globally Async Locally Sync (GALS)

[Figure: a clocked domain (R, CL, R with a Local CLK) enclosed in an async-to-sync wrapper and connected to the asynchronous world through the handshake pairs Req1/Ack1, Req2/Ack2, Req3/Ack3 and Req4/Ack4]

8
Key Design Differences
Synchronous logic design:

Proceeds without taking timing correctness (hazards, signal ack-ing etc.) into account

Combinational logic and memory latches (registers) are built separately

Static timing analysis of CL is sufficient to determine the Max Delay (clock period)

Fixed set-up and hold conditions for latches

9
Key Design Differences
Asynchronous logic design:

Must ensure hazard-freedom, signal ack-ing, local timing constraints

Combinational logic and memory latches (registers) are often mixed in “complex gates”

Dynamic timing analysis of logic is needed to determine relative delays between paths

To avoid complex issues, circuits may be built as delay-insensitive and/or speed-independent (Muller’s theory vs. Huffman asynchronous automata)

10
Verification and Testing Differences

Synchronous logic verification and testing:

Only the functional correctness aspect is verified and tested

Testing can be done with standard ATE and at low speed

Asynchronous logic verification and testing:

In addition to functional correctness, the temporal aspect is crucial: e.g. causality and order, deadlock-freedom

Testing must cover faults in complex gates (logic + memory) and must proceed at normal operation rate

Delay fault testing may be needed
11
Synchronous communication

[Figure: clock and data waveforms carrying the bit sequence 1 1 0 0 1 0]

Clock edges determine the time instants where data must be sampled

Data wires may glitch between clock edges (set-up/hold times must be satisfied)

Data are transmitted at a fixed rate (clock frequency)

12
Dual rail
[Figure: waveforms of the two wires of a dual-rail bit, one labelled with 1s and the other with 0s]

Two wires with L (low) and H (high) per bit

“LL” = “spacer”, “LH” = “0”, “HL” = “1”

n-bit data communication requires 2n wires

Each bit is self-timed

Other delay-insensitive codes exist (e.g. k-of-n), as does event-based signalling (choice criteria: pin and power efficiency)
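
The dual-rail code can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names are hypothetical, the levels L/H are modelled as 0/1, the first wire of each pair is assumed to be the one that goes high for a logical 1, and the combination HH (which the slide does not define) is assumed to be illegal.

```python
# Assumed wire order: (t, f), where t is high for a logical 1.
SPACER = (0, 0)  # "LL"


def encode_bit(bit):
    """Return the wire pair for one logical bit: "HL" for 1, "LH" for 0."""
    return (1, 0) if bit else (0, 1)


def decode_bit(pair):
    """Return 0 or 1 for a valid codeword, None for the spacer."""
    if pair == SPACER:
        return None
    if pair == (1, 1):
        # Assumption: "HH" is not a codeword.
        raise ValueError("illegal dual-rail state HH")
    return 1 if pair == (1, 0) else 0


def encode_word(bits):
    """Encode n bits onto 2n wires."""
    return [encode_bit(b) for b in bits]


def is_complete(pairs):
    """A word has arrived once every bit has left the spacer state."""
    return all(p != SPACER for p in pairs)
```

Because each bit announces its own validity, a receiver needs no separate clock or validity wire: it waits until `is_complete` holds, reads the word, and then waits for all pairs to return to the spacer.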
13
Bundled data

[Figure: data waveform carrying the bit sequence 1 1 0 0 1 0, accompanied by a validity signal]

Validity signal

Similar to an aperiodic local clock

n-bit data communication requires n+1 wires

Data wires may glitch when not valid

Signaling protocols

level sensitive (latch)

transition sensitive (register): 2-phase / 4-phase
14
Example: memory read cycle
[Figure: timing diagram with a “Valid address” signal accompanying the Address bus (A) and a “Valid data” signal accompanying the Data bus (D)]

Transition signaling, 4-phase

15
Example: memory read cycle
[Figure: timing diagram with a “Valid address” signal accompanying the Address bus (A) and a “Valid data” signal accompanying the Data bus (D)]

Transition signaling, 2-phase
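
The difference between the two protocols can be sketched as event sequences. The Python below is an illustrative assumption, not taken from the slides: it uses generic `req` and `ack` wires (in the memory example these would roughly correspond to “valid address” and “valid data”), shows one common ordering of the events, and the function names are hypothetical.

```python
def four_phase(n_transfers):
    """Return-to-zero signalling: four transitions per transfer."""
    events = []
    for _ in range(n_transfers):
        # Both wires rise, then both return to zero before the next transfer.
        events += ["req+", "ack+", "req-", "ack-"]
    return events


def two_phase(n_transfers):
    """Non-return-to-zero signalling: every edge of req or ack is an event."""
    events = []
    req = ack = 0  # assumed initial state: both wires low
    for _ in range(n_transfers):
        req ^= 1  # any transition on req starts a transfer
        events.append("req+" if req else "req-")
        ack ^= 1  # any transition on ack completes it
        events.append("ack+" if ack else "ack-")
    return events
```

In this model, two 2-phase transfers produce exactly the transitions that a single 4-phase transfer needs, which is why 2-phase signalling halves the number of transitions per transfer at the cost of logic that must react to both edges.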

16
Asynchronous modules
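[Figure: a module made of a DATA PATH block (Data IN, Data OUT) and a CONTROL block (req in, ack in, req out, ack out); the two blocks are linked by the signals start and done]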

Signaling protocol:
reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
reqin- start- [reset] done- reqout- ackout- ackin-
(more concurrency is also possible)
17
Asynchronous latches: C element
[Figure: C-element symbol (inputs A and B, output Z) and a static logic implementation between Vdd and Gnd [van Berkel 91]]

A  B  Z+
0  0  0
0  1  Z
1  0  Z
1  1  1
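
The behaviour of the C-element can be modelled with the standard next-state equation z = ab + zb + za: the output follows the inputs when they agree and holds its value when they differ. The Python sketch below is an illustration only; the function names are hypothetical.

```python
def c_element(a, b, z):
    """Next output Z+ for inputs a, b and current output z (each 0 or 1)."""
    return (a & b) | (z & b) | (z & a)


def run(inputs, z=0):
    """Apply a sequence of (a, b) input pairs and record the output after each."""
    trace = []
    for a, b in inputs:
        z = c_element(a, b, z)
        trace.append(z)
    return trace
```

For example, `run([(1, 0), (1, 1), (0, 1), (0, 0)])` yields `[0, 1, 1, 0]`: the output rises only after both inputs have risen and falls only after both have fallen.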
18
C-element: Other implementations
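[Figure: two further transistor-level implementations between Vdd and Gnd, with inputs A, B and output Z: a dynamic C-element, and a quasi-static C-element that adds a weak inverter]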
19
Dual-rail logic
[Figure: dual-rail AND gate with inputs A.t, B.t, A.f, B.f and outputs C.t, C.f]

Valid behavior for monotonic environment
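
One common way to build a dual-rail AND gate is to drive the true rail with an AND of the true inputs and the false rail with an OR of the false inputs. The sketch below assumes that form (the gate structure in the slide's drawing is not recoverable from the text), represents each signal as a hypothetical `(t, f)` pair, and uses `(0, 0)` as the spacer.

```python
T, F, SPACER = (1, 0), (0, 1), (0, 0)


def dual_rail_and(a, b):
    """Dual-rail AND of two (t, f) pairs, in the assumed AND/OR form."""
    a_t, a_f = a
    b_t, b_f = b
    c_t = a_t & b_t  # C is true only when both inputs are true
    c_f = a_f | b_f  # C is false as soon as either input is false
    return (c_t, c_f)
```

If the environment is monotonic, meaning each input only moves from spacer to a valid value and back to spacer, the output rails also change monotonically and the gate produces no glitches. Note that in this form the output can become valid before both inputs have arrived, for example `dual_rail_and(F, SPACER)` is already `F`.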

20
Completion detection

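[Figure: the outputs of a dual-rail logic block feed a completion detection tree that ends in a C-element producing the signal done]

Completion detection tree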
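
A behavioural sketch of completion detection, under stated assumptions: each dual-rail bit is a `(t, f)` pair, a bit counts as valid when either rail is high, and the whole tree of C-elements is modelled as a single n-input C-element. The class name is hypothetical and the model ignores gate delays.

```python
class CompletionDetector:
    """done rises when every bit is valid and falls when every bit is a spacer."""

    def __init__(self):
        self.done = 0  # assumed initial state: all outputs at spacer

    def update(self, pairs):
        valid = [t | f for t, f in pairs]  # per-bit OR of the two rails
        if all(valid):
            self.done = 1
        elif not any(valid):
            self.done = 0
        # Otherwise the C-element holds its previous value.
        return self.done
```

The hold behaviour matters: `done` stays low while only some bits are valid, and stays high until all bits have returned to the spacer, so it can serve as the request or acknowledge of a handshake.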

21
Differential cascode voltage switch logic

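[Figure: differential cascode voltage switch logic gate: an N-type transistor network with inputs A.t, A.f, B.t, B.f, C.t, C.f, controlled by start, with outputs Z.t and Z.f and a done signal]

3-input AND/NAND gate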

22
Examples of dual-rail design
Asynchronous dual-rail ripple-carry adder (A. Martin, 1991)

Critical delay is proportional to log N (N = number of bits)

32-bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus 40 ns for synchronous

Async cell transistor count = 34 versus synchronous = 28
More recent success stories (modularity and automatic synthesis) of dual-rail logic: Null Convention Logic from Theseus Logic

23
Bundled-data logic blocks

[Figure: a single-rail logic block with a delay element connecting start to done]

Conventional logic + matched delay


24
Micropipelines (Sutherland 89)
Micropipeline (2-phase) control blocks

[Figure: symbols of the control blocks: Join (C), Merge, Select (sel, in, outf, outt), Toggle (in, out 0, out 1), Call (r1, a1, r2, a2, r, a) and Request-Grant-Done (RGD) Arbiter (r1, g1, d1, r2, g2, d2)]
25
Micropipelines (Sutherland 89)
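[Figure: micropipeline with latches (L) alternating with logic blocks; a control chain of C-elements and delay elements carries Rin and Aout at the input side and Rout and Ain at the output side]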

26
Data-path / Control

[Figure: data path of latches (L) alternating with logic blocks, governed by a CONTROL block with the handshake signals Rin, Aout, Rout and Ain]

Synthesis of control is a major challenge


27
Control specification

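[Figure: signal transition graph over the events A+, B+, A-, B-, with the corresponding circuit connecting A and B]

A: input
B: output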

28
Control specification

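[Figure: signal transition graph over the events A+, B-, A-, B+, with the corresponding circuit connecting A and B]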

29
Control specification

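[Figure: signal transition graph over the events A+, B+, C+, A-, B-, C-, with the corresponding circuit: a C-element with inputs A, B and output C]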

30
Control specification

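[Figure: a second signal transition graph over the events A+, B+, C+, A-, B-, C-, with the corresponding circuit with inputs A, B and output C]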

31
Control specification
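[Figure: FIFO cntrl block with the handshake signals Ri, Ao, Ro and Ai; its signal transition graph over the events Ri+, Ro+, Ao+, Ai+, Ri-, Ro-, Ao-, Ai-; and an implementation with two C-elements]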

32
Gate vs wire delay models
Gate delay model: delays in gates, no delays in wires

Wire delay model: delays in gates and wires

33
Delay models for async. circuits
Bounded delays (BD): realistic for gates and wires.

Technology mapping is easy, verification is difficult

Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires.

Technology mapping is more difficult, verification is easy

Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires.

DI class (built out of basic gates) is almost empty

Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks).

In practice it is the same as speed independent

[Figure: diagram relating the circuit classes BD, DI, SI and QDI]
34
Environment models
Slow enough environment = Fundamental mode
(Inputs change AFTER system has settled)

Reactive environment = I/O mode
(Inputs may change once the first output changes)

35
Correctness of a circuit wrt delay assumptions
C-element: z = ab + zb + za

[Figure: gate-level implementation of the C-element with inputs a, b and output z]

36
Motivation (designer’s view)
Modularity for system-on-chip design

Plug-and-play interconnectivity
Average-case performance

No worst-case delay synchronization
Many interfaces are asynchronous

Buses, networks, ...

37
Motivation (technology aspects)

Low power

Automatic clock gating
Electromagnetic compatibility

No peak currents around clock edges
Security

No ‘electro-magnetic difference’ between logical ‘0’ and ‘1’ in dual-rail code
Robustness

High immunity to technology and environment variations (temperature, power supply, ...)

38
Resistance
Concurrent models for specification

CSP, Petri nets, ...: no more FSMs
Difficult to design

Hazards, synchronization
Complex timing analysis

Difficult to estimate performance
Difficult to test

No way to stop the clock

39
But ... some success stories
Philips
AMULET microprocessors
Sharp
Intel (RAPPID)
Start-up companies:

Theseus Logic, Fulcrum, Self-Timed Solutions
Recent blurb: It's Time for Clockless Chips, by Claire Tristram (MIT Technology Review, v. 104, no. 8, October 2001: http://www.technologyreview.com/magazine/oct01/tristram.asp)
….
40
