Lecture09 Performance 01

1) Design for performance in integrated circuits involves minimizing capacitances through good layout, increasing transistor sizes up to the point where self-loading dominates, and increasing the supply voltage. 2) Device sizing involves balancing the intrinsic and extrinsic capacitances: making the transistors very large eliminates the impact of the external load but increases the input capacitance seen by the previous stage. 3) Optimally sizing an inverter chain to drive a given capacitive load involves tapering the transistor sizes so that each stage has the same effective fanout and delay, with the size of each stage being the geometric mean of its two neighbors.

Uploaded by

Bala Krishna

Design for Performance

Modified From "Digital Integrated Circuits", by J. Rabaey, A. Chandrakasan and B. Nikolic
Contents
o Device Sizing
o Inverter Chain Sizing
o CMOS gates transient considerations
o Logical Effort

Design for Performance
o Keep capacitances (internal, interconnect and fan-out) small by good layout
o Increase transistor sizes
  - Watch out for self-loading!
  - Once the intrinsic capacitance dominates the delay, increasing W/L no longer helps
o Increase $V_{DD}$
  - There are limits to the maximum supply voltage
  - Increases power consumption
Device Sizing
Assuming a symmetrical inverter, the load capacitance is composed of:

$$C_L = C_{int} + C_{ext}$$

$C_{int}$ is the self-load, associated with the diffusion and gate-drain (Miller) capacitances
$C_{ext}$ is the extrinsic load: fan-out, wiring, etc.

The propagation delay is:

$$t_p = 0.69\,R_{eq}\,(C_{int} + C_{ext}) = t_{p0}\,(1 + C_{ext}/C_{int})$$

where $t_{p0} = 0.69\,R_{eq}\,C_{int}$ is the intrinsic delay ($C_{ext} = 0$).

What are the consequences of scaling?
Device Sizing
$C_{int}$ scales with the size ratio S, and so does $R_{eq}$:

$$C_{int} = S\,C_{iref}, \qquad R_{eq} = R_{iref}/S$$

The delay is:

$$t_p = t_{p0}\,(1 + C_{ext}/(S\,C_{iref}))$$

The intrinsic delay is independent of sizing and is determined by technology and layout.
Making S infinitely large gives the maximum performance gain, eliminating the impact of the external load. Yet a sufficiently large S already gives nearly the same delay at a much smaller silicon area.
A big inverter has a big input capacitance, which slows down the previous stages!

[Figure: Device Sizing (for fixed load). Propagation delay $t_p$ (sec, scale ×10^-11, ticks from 2 to 3.8) versus sizing factor S (2 to 14). Annotation: self-loading effect, intrinsic capacitances dominate.]
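The diminishing return from sizing can be seen by evaluating $t_p = t_{p0}(1 + C_{ext}/(S\,C_{iref}))$ for a few values of S. The following is a minimal sketch; the function name and all numeric values are hypothetical and were chosen only for illustration, they are not read off the slide's plot.

```python
def sized_inverter_delay(s, tp0, c_ext, c_iref):
    """Delay of an inverter scaled by s: t_p = t_p0 * (1 + C_ext / (s * C_iref))."""
    if s <= 0:
        raise ValueError("sizing factor must be positive")
    return tp0 * (1.0 + c_ext / (s * c_iref))


# Hypothetical values (assumptions, not taken from the slides).
TP0 = 20e-12     # intrinsic delay: 20 ps
C_EXT = 10e-15   # external load: 10 fF
C_IREF = 10e-15  # intrinsic capacitance of the reference inverter: 10 fF

for s in (1, 2, 4, 8, 16):
    tp = sized_inverter_delay(s, TP0, C_EXT, C_IREF)
    print(f"S = {s:2d}: t_p = {tp * 1e12:.2f} ps")
```

Each doubling of S halves only the extrinsic part of the delay, so the result approaches the intrinsic delay without ever going below it.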
Inverter Chain Sizing
Inverter Chain
[Figure: chain of inverters between In and Out, driving the load $C_L$]

If $C_L$ is given:
- How many stages are needed to minimize the delay?
- How to size the inverters?

May need some additional constraints.
Inverter Delay
Minimum length devices, L
Assume that $W_P = 2W_N = 2W$:
- same pull-up and pull-down currents
- approx. equal resistances $R_N = R_P$
- approx. equal rise ($t_{pLH}$) and fall ($t_{pHL}$) delays

$$R_P = R_{unit}\left(\frac{W_P}{W_{unit}}\right)^{-1} \approx R_{unit}\left(\frac{W_N}{W_{unit}}\right)^{-1} = R_N = R_W$$

Delay (D):

$$t_{pHL} = (\ln 2)\,R_N C_L, \qquad t_{pLH} = (\ln 2)\,R_P C_L$$

Load for the next stage:

$$C_{gin} = 3\,\frac{W}{W_{unit}}\,C_{unit}$$

[Figure: inverter with PMOS width 2W and NMOS width W]
Delay Formula
$$\text{Delay} \sim R_W\,(C_{int} + C_L)$$

$$t_p = k\,R_W C_{int}\,(1 + C_L/C_{int}) = t_{p0}\,(1 + f/\gamma)$$

$C_{int} = \gamma\,C_{gin}$ with $\gamma \approx 1$
$f = C_L/C_{gin}$ - effective fanout

Delay is only a function of the ratio between the inverter's external load capacitance and its input capacitance
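A short sketch of the stage delay $t_p = t_{p0}(1 + f/\gamma)$ shows that only the ratio $f = C_L/C_{gin}$ matters, not the absolute capacitances. The function name, the default $t_{p0} = 1$ (delays expressed in units of the intrinsic delay) and the sample capacitances are assumptions made for illustration.

```python
def stage_delay(c_load, c_gin, tp0=1.0, gamma=1.0):
    """Inverter delay t_p = t_p0 * (1 + f / gamma), with f = C_L / C_gin."""
    f = c_load / c_gin  # effective fanout
    return tp0 * (1.0 + f / gamma)


# Two inverters of different size but with the same effective fanout f = 4.
small = stage_delay(c_load=4.0, c_gin=1.0)
large = stage_delay(c_load=40.0, c_gin=10.0)
print(small, large)  # both are 5.0 (in units of t_p0)
```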
Apply to Inverter Chain
[Figure: chain of N inverters (1, 2, ..., N) between In and Out, driving the load $C_L$]

$$t_p = t_{p1} + t_{p2} + \dots + t_{pN}$$

$$t_{pj} \sim R_{unit} C_{unit}\left(1 + \frac{C_{gin,j+1}}{\gamma\,C_{gin,j}}\right)$$

$$t_p = \sum_{j=1}^{N} t_{p,j} = t_{p0} \sum_{j=1}^{N}\left(1 + \frac{C_{gin,j+1}}{\gamma\,C_{gin,j}}\right), \qquad C_{gin,N+1} = C_L$$


Optimal Tapering for Given N
Delay equation has N - 1 unknowns, $C_{gin,2}, \dots, C_{gin,N}$

Minimize the delay, find N - 1 partial derivatives

Result: $C_{gin,j+1}/C_{gin,j} = C_{gin,j}/C_{gin,j-1}$

Each stage has the same effective fanout ($C_{out}/C_{in}$)
Each stage has the same delay
Size of each stage is the geometric mean of its two neighbors:

$$C_{gin,j} = \sqrt{C_{gin,j-1}\,C_{gin,j+1}}$$
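The tapering result follows in one step from the chain delay $t_p = t_{p0}\sum_{j=1}^{N}\left(1 + C_{gin,j+1}/(\gamma\,C_{gin,j})\right)$ with $C_{gin,N+1} = C_L$. The worked derivative below is a sketch of that step: an unknown $C_{gin,j}$ appears in only two terms of the sum, once as the load of stage $j-1$ and once as the input of stage $j$.

```latex
\[
\frac{\partial t_p}{\partial C_{gin,j}}
  = \frac{t_{p0}}{\gamma}
    \left( \frac{1}{C_{gin,j-1}} - \frac{C_{gin,j+1}}{C_{gin,j}^{2}} \right) = 0
\quad\Longrightarrow\quad
C_{gin,j}^{2} = C_{gin,j-1}\,C_{gin,j+1},
\qquad j = 2, \dots, N
\]
```

Dividing both sides by $C_{gin,j}\,C_{gin,j-1}$ gives the equal-ratio form of the same condition.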
Optimum Delay and Number of Stages
When each stage is sized by f and has same eff. fanout f:

$$f^N = F = C_L/C_{gin,1}$$

Effective fanout of each stage:

$$f = \sqrt[N]{F}$$

Minimum path delay:

$$t_p = N\,t_{p0}\,(1 + \sqrt[N]{F}/\gamma)$$
Example
[Figure: three-inverter chain between In and Out with stage sizes 1, f and $f^2$, input capacitance $C_1$ and load $C_L = 8\,C_1$]

$C_L/C_1$ has to be evenly distributed across N = 3 stages:

$$f = \sqrt[3]{8} = 2$$
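The example can be checked numerically with a small sizing helper. This is a minimal sketch: the function name `size_chain`, the choice $\gamma = 1$ and the normalization $t_{p0} = 1$ are assumptions made for illustration.

```python
def size_chain(c_in, c_load, n_stages, tp0=1.0, gamma=1.0):
    """Size an N-stage inverter chain with equal effective fanout per stage.

    Returns (f, input capacitance of each stage, total delay), using
    f = (C_L / C_in) ** (1 / N) and t_p = N * t_p0 * (1 + f / gamma).
    """
    overall_fanout = c_load / c_in                 # F
    f = overall_fanout ** (1.0 / n_stages)         # per-stage fanout
    sizes = [c_in * f ** j for j in range(n_stages)]
    tp = n_stages * tp0 * (1.0 + f / gamma)
    return f, sizes, tp


# C_L = 8 * C_1 spread over N = 3 stages, as in the example.
f, sizes, tp = size_chain(c_in=1.0, c_load=8.0, n_stages=3)
print(f, sizes, tp)
```

With these inputs the stage sizes come out as 1, 2 and 4 times $C_1$, and the total delay is 9 in units of $t_{p0}$.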
Optimum Number of Stages
For a given load $C_L$ and given input capacitance $C_{in}$, find the optimal sizing f

$$C_L = F\,C_{in} = f^N C_{in} \quad\text{and}\quad N = \frac{\ln F}{\ln f}$$

$$t_p = N\,t_{p0}\,(f/\gamma + 1) = \frac{t_{p0}\ln F}{\gamma}\left(\frac{f}{\ln f} + \frac{\gamma}{\ln f}\right)$$

$$\frac{\partial t_p}{\partial f} = \frac{t_{p0}\ln F}{\gamma}\cdot\frac{\ln f - 1 - \gamma/f}{\ln^2 f} = 0$$

f that minimizes total delay results from:

$$f = \exp(1 + \gamma/f)$$
Optimum Effective Fanout f
Optimum f for given process defined by $\gamma$:

$$f = \exp(1 + \gamma/f)$$

$f_{opt} = 3.6$ for $\gamma = 1$
For $\gamma = 0$: $f = e$, $N = \ln F$
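The condition $f = \exp(1 + \gamma/f)$ has no closed-form solution for $\gamma > 0$, but it can be solved numerically. The sketch below uses plain fixed-point iteration, which is one possible method and not necessarily the one used to produce the slide's value; the function name and tolerances are assumptions.

```python
import math


def optimum_fanout(gamma, tol=1e-12, max_iter=1000):
    """Solve f = exp(1 + gamma / f) by fixed-point iteration (gamma >= 0)."""
    f = math.e  # exact solution for gamma = 0, used as the starting guess
    for _ in range(max_iter):
        f_next = math.exp(1.0 + gamma / f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    raise RuntimeError("fixed-point iteration did not converge")


print(optimum_fanout(0.0))  # e, about 2.718
print(optimum_fanout(1.0))  # about 3.59, which the slide rounds to 3.6
```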
Trade-offs in the choice of N
If the number of stages is too large, the intrinsic delay dominates
If it is too small, the effective fan-out dominates

$$t_p = N\,t_{p0}\,(1 + f/\gamma), \qquad N = \frac{\log(F)}{\log(f)}$$

With more stages (smaller f), N grows exponentially while f decreases only linearly: $t_p$ increases

With fewer stages (bigger f), N reduces and f increases: $t_p$ remains roughly constant
Choice of N: Example
Example: $C_i$ = 1 fF, $C_{out}$ = 1 pF: F = 1000

$$t_p = N\,t_{p0}\,(1 + \sqrt[N]{F}/\gamma)$$

[Figure: $t_p$ (normalized delay) and N (number of stages), each plotted versus f]

Make f slightly larger than the optimum so that the number of stages rounds to an integer (typically f = 4)
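For F = 1000, the trade-off can be made concrete by sweeping integer values of N in $t_p = N\,t_{p0}\,(1 + \sqrt[N]{F}/\gamma)$. This is a minimal sketch assuming $\gamma = 1$ and delays normalized to $t_{p0}$; the function name and the sweep range 1 to 10 are arbitrary choices.

```python
def chain_delay(overall_fanout, n_stages, gamma=1.0):
    """Normalized chain delay t_p / t_p0 = N * (1 + F ** (1 / N) / gamma)."""
    f = overall_fanout ** (1.0 / n_stages)
    return n_stages * (1.0 + f / gamma)


F = 1000.0  # C_out / C_i = 1 pF / 1 fF
delays = {n: chain_delay(F, n) for n in range(1, 11)}
best_n = min(delays, key=delays.get)

for n, d in delays.items():
    marker = "  <- minimum" if n == best_n else ""
    print(f"N = {n:2d}  f = {F ** (1.0 / n):7.2f}  t_p/t_p0 = {d:8.2f}{marker}")
```

The minimum falls at N = 5 (f close to 4), and the delay changes only slightly for neighbouring values of N, while a single stage is far slower.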
Normalized delay function of F
Values of $t_p/t_{p0}$ (delay normalized to the intrinsic delay, $\gamma = 1$) for several designs

F        Unbuffered   Two Stage   Inverter chain
10       11           8.3         8.3
100      101          22          16.5
1000     1001         65          24.8
10,000   10,001       202         33.1
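The table entries are consistent with the delay formulas under the following assumptions, which are an interpretation rather than something the slide states: $\gamma = 1$, the unbuffered case is a single stage with f = F, the two-stage case uses $f = \sqrt{F}$, and the inverter chain uses f = 3.6 with a non-integer $N = \ln F/\ln f$. A minimal sketch:

```python
import math


def normalized_delays(overall_fanout, f_opt=3.6):
    """Return t_p / t_p0 for (unbuffered, two-stage, tapered chain), gamma = 1."""
    unbuffered = 1.0 + overall_fanout
    two_stage = 2.0 * (1.0 + math.sqrt(overall_fanout))
    n = math.log(overall_fanout) / math.log(f_opt)  # non-integer stage count
    chain = n * (1.0 + f_opt)
    return unbuffered, two_stage, chain


for big_f in (10, 100, 1000, 10000):
    u, t, c = normalized_delays(big_f)
    print(f"F = {big_f:6d}  unbuffered = {u:8.0f}  two-stage = {t:6.1f}  chain = {c:5.1f}")
```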
Buffer Design
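[Figure: four buffer designs driving a load of 64 from a unit-sized first inverter, with stage sizes (1), (1, 8), (1, 4, 16) and (1, 2.8, 8, 22.6)]

N    f     t_p
1    64    65
2    8     18
3    4     15
4    2.8   15.3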
CMOS Gates
Static Properties of gates
Delay characteristics
Fan-in and Fan-out considerations
Static Properties
Depend on input pattern

[Figure: voltage transfer curves, $V_{out}$ versus $V_{in}$, both axes from 0 V to 3 V, for three input patterns]
a) A=B=0→1
b) A=1, B=0→1
c) B=1, A=0→1

a) Two pull-up transistors in parallel are harder to turn off than one
b) One pull-up transistor, one pull-down. Dynamically, the internal node has to be discharged (slower)
c) $V_{DS1}$ produces a body (bulk) effect during discharge: the $V_t$ of transistor A is increased, so more $V_{in}$ is needed
Switch Delay Model
[Figure: switch-level RC delay models of INV, NAND2 and NOR2. Each transistor is replaced by a switch (controlled by input A or B) in series with its on-resistance $R_n$ or $R_p$; the output drives $C_L$, and the NAND2 and NOR2 models also include an internal node capacitance $C_{int}$.]
Input Pattern Effects on Delay
Delay is dependent on the pattern of inputs

Low to high transition
- both inputs go low: delay is $0.69\,(R_p/2)\,C_L$
- one input goes low: delay is $0.69\,R_p\,C_L$
  when N transistor A goes off, internal node has to be charged

High to low transition
- both inputs go high: delay is $0.69\,(2R_n)\,C_L$

[Figure: NAND2 switch model with inputs A and B, resistances $R_p$ and $R_n$, internal capacitance $C_{int}$ and load $C_L$]
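The three first-order NAND2 delays can be compared side by side with a few lines of code. This is a sketch of the switch model only: it ignores the internal node capacitance $C_{int}$, and the function name and the resistance and capacitance values are hypothetical.

```python
def nand2_delays(r_p, r_n, c_load):
    """First-order NAND2 delays from the switch model (C_int neglected)."""
    return {
        "low-to-high, both inputs go low": 0.69 * (r_p / 2.0) * c_load,
        "low-to-high, one input goes low": 0.69 * r_p * c_load,
        "high-to-low, both inputs go high": 0.69 * (2.0 * r_n) * c_load,
    }


# Hypothetical values (assumptions): R_p = 10 kOhm, R_n = 5 kOhm, C_L = 100 fF.
delays = nand2_delays(r_p=10e3, r_n=5e3, c_load=100e-15)
for pattern, tp in delays.items():
    print(f"{pattern}: {tp * 1e12:.0f} ps")
```

Two parallel pull-up devices halve the charging resistance, while the two series pull-down devices double the discharging resistance.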
Delay Dependence on Input Patterns
[Figure: output voltage [V] (-0.5 to 3) versus time [ps] (0 to 400) for the input patterns A=B=1→0; A=1, B=1→0; and A=1→0, B=1]

Input Data Pattern    Delay (psec)
A=B=0→1               67
A=1, B=0→1            64
A=0→1, B=1            61
A=B=1→0               45
A=1, B=1→0            80
A=1→0, B=1            81

NMOS = 0.5 µm/0.25 µm
PMOS = 0.75 µm/0.25 µm
$C_L$ = 100 fF

when N transistor A goes off (A=1), internal node has to be charged (slower)
Transistor Sizing
[Figure: NAND2 and NOR2 switch models (inputs A and B, $R_n$, $R_p$, $C_{int}$, $C_L$) annotated with transistor sizes; size labels: 2, 2, 2, 2 and 1, 1, 4, 4]

NAND based implementations are preferred over NOR
Transistor Sizing a Complex CMOS Gate
OUT = D + A (B + C)
[Figure: transistor-level schematic of the gate, with inputs A, B, C and D in both the pull-up and pull-down networks; transistor size labels: 1, 2, 2, 2, 4, 4, 8, 8, 6, 3, 6, 6]
Fan-In Considerations
[Figure: four-input gate with inputs A, B, C and D in series, internal node capacitances $C_1$, $C_2$, $C_3$ and load $C_L$]

Distributed RC model (Elmore delay)

$$t_{pHL} = 0.69\,(R_e C_1 + 2R_e C_2 + 3R_e C_3 + 4R_e C_L) = 0.69\,R_e\,(C_1 + 2C_2 + 3C_3 + 4C_L)$$

* Propagation delay deteriorates rapidly as a function of fan-in: quadratically in the worst case (it is proportional to RC)

* Internal nodes are important!
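The Elmore expression generalizes to any number of series transistors: the k-th node capacitance is weighted by the k equal resistances $R_e$ it discharges through. The sketch below assumes identical resistances and takes the node capacitances in the order $C_1, C_2, \dots$ followed by $C_L$; the function name and the unit values are hypothetical.

```python
def elmore_tphl(r_e, internal_caps, c_load):
    """t_pHL = 0.69 * R_e * (1*C_1 + 2*C_2 + ... + n*C_L) for n series devices."""
    caps = list(internal_caps) + [c_load]  # C_1, ..., C_(n-1), C_L
    weighted_sum = sum((k + 1) * c for k, c in enumerate(caps))
    return 0.69 * r_e * weighted_sum


# Normalized example with all capacitances equal to 1 (an assumption).
for fan_in in (2, 4, 8):
    tp = elmore_tphl(1.0, [1.0] * (fan_in - 1), 1.0)
    print(f"fan-in = {fan_in}: t_pHL = {tp:.2f}")
```

With equal capacitances the weighted sum is n(n+1)/2, which is the quadratic growth with fan-in noted above.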
$t_p$ as a Function of Fan-In

[Figure: $t_p$ (psec, 0 to 1250) versus fan-in (2 to 16), with curves $t_{pHL}$ (labelled quadratic) and $t_{pLH}$ (labelled linear)]

Gates with a fan-in greater than 4 should be avoided.
- Intrinsic C increases linearly
- Series transistors cause a double slowdown
- Parallel transistors increase C
$t_p$ as a Function of Fan-Out

[Figure: $t_p$ (psec) versus eff. fan-out (2 to 16) for NOR2, NAND2 and INV]

All gates have the same drive current.
Slope is a function of driving strength
$t_p$ as a Function of Fan-In and Fan-Out

Fan-in: quadratic due to increasing resistance and capacitance
Fan-out: each additional fan-out gate adds two gate capacitances to $C_L$

$$t_p = a_1\,FI + a_2\,FI^2 + a_3\,FO$$
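The fitted model $t_p = a_1 FI + a_2 FI^2 + a_3 FO$ is simple to evaluate once its coefficients are known. In the sketch below the coefficients are placeholders chosen only to show the shape of the dependence; they are assumptions, not values from the slides or from any process.

```python
def gate_delay(fan_in, fan_out, a1, a2, a3):
    """Empirical gate delay t_p = a1*FI + a2*FI**2 + a3*FO."""
    return a1 * fan_in + a2 * fan_in ** 2 + a3 * fan_out


# Placeholder coefficients (hypothetical, arbitrary time units).
A1, A2, A3 = 1.0, 0.5, 2.0

for fi in (2, 4, 8):
    print(f"FI = {fi}, FO = 1: t_p = {gate_delay(fi, 1, A1, A2, A3):.1f}")
for fo in (1, 2, 4):
    print(f"FI = 2, FO = {fo}: t_p = {gate_delay(2, fo, A1, A2, A3):.1f}")
```

Doubling the fan-in more than doubles its contribution to the delay, whereas each extra fan-out gate adds a fixed increment.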
