Power Aware Design of Nanometer MCML Tap


Feature

Massimo Alioto and Gaetano Palumbo

Abstract
In recent years, MOS Current-Mode Logic (MCML) circuits have gained remarkable interest in several VLSI applications, ranging from high-accuracy mixed-signal circuits to high-speed circuits for channel (de)multiplexing in optical-fiber and Radio Frequency (RF) telecommunication systems. However, their advantages over traditional CMOS logic are achieved at the cost of static power consumption, which must be kept as low as possible. Accordingly, a conscious management of the power-delay trade-off is essential in the design of such circuits.
This paper presents several recent ideas on the design of digital MCML circuits, organized in a comprehensive framework. The treatment reviews and extends previous results by incorporating Deep-Sub-Micron (DSM) effects from the beginning, with a strongly simplified analytical formulation that improves both the understanding and the design. Interesting properties and design criteria are derived from simple analytical models. These models give a deep insight into the design of MCML circuits, which is essential for both the efficient design of MCML cells and the development of an automated design flow. Numerical examples are presented for a 90-nm CMOS process.

40 IEEE CIRCUITS AND SYSTEMS MAGAZINE 1531-6364/06/$20.00©2006 IEEE FOURTH QUARTER 2006
I. Introduction

In the last decade, we have witnessed an increasing interest in MOS Current-Mode Logic (also named Source-Coupled Logic, SCL) circuits, which represent an alternative to traditional CMOS logic styles in several applications. Despite their recent adoption, MCML circuits actually have quite old ancestors in their family tree, as they directly descend from the bipolar Current-Mode Logic (CML), which has the same topology despite the different technology adopted [1].

The fundamental structure of an n-input MCML gate is depicted in Figure 1, where an NMOS network (consisting of properly stacked source-coupled pairs) steers the bias current $I_{SS}$ to one of the two output branches, according to the value of the differential inputs $v_{i1} = v_{i1,1} - v_{i1,2}$, ..., $v_{in} = v_{in,1} - v_{in,2}$. The steered current is then converted into a differential output voltage $v_o = v_{o,1} - v_{o,2}$ by the two resistances $R_D$ (drawn in red in Figure 1), which are often implemented by physical resistors, or alternatively by PMOS transistors (working in the triode region) acting as an active load. In contrast to previous works dealing with the power-delay trade-off management in MCML gates [2]–[5], in the following a physical resistor will be assumed. The current source $I_{SS}$ in Figure 1 is usually implemented by a simple current mirror, which is not shown for the sake of simplicity. The load capacitance $C_L$ represents the external capacitance due to the input capacitance of the following gates and the wiring capacitance.

Figure 1. Topology of a generic MCML gate.

The general topology in Figure 1 allows the implementation of both combinational and sequential gates, whose logic function depends only on the connection of the source-coupled pairs. The implemented function can also be modified by negating the inputs and the output, i.e., by simply swapping the corresponding pairs of differential signals. As an example of the simplest logic gate, the topology of an MCML inverter is depicted in Figure 2, where the NMOS network consists of only one source-coupled pair.

Figure 2. Topology of an MCML inverter gate.

As other examples, the NMOS network topology of a 2-input Multiplexer

Massimo Alioto is with the DII (Dipartimento di Ingegneria dell’Informazione), UNIVERSITÁ DI SIENA, v. Roma, 56, I-53100 SIENA – ITALY,
E-mail: [email protected]. Gaetano Palumbo is with the DIEES (Dipartimento di Ingegneria Elettrica Elettronica e dei Sistemi), UNIVERSITÁ
DI CATANIA, Viale Andrea Doria 6, I-95125 CATANIA – ITALY, E-mail: [email protected]



(MUX), XOR and D-latch gates are shown in Figures 3–5, respectively (for an overview of the design techniques to derive the topology of arbitrary MCML gates, the reader is directed to [2]). Their static operation is easily understood by considering the simplest case of the inverter gate in Figure 2: when the input voltage $v_i = v_{i,1} - v_{i,2}$ is high (low), the source-coupled pair M1-M2 completely steers the current $I_{SS}$ to the drain of M1 (M2), thus the output voltage is equal to the low (high) value $V_{OL} = (V_{DD} - R_D I_{SS}) - V_{DD} = -R_D I_{SS}$ ($V_{OH} = R_D I_{SS}$). The obtained logic swing is

$$ V_{SWING} = V_{OH} - V_{OL} = 2 R_D I_{SS} \qquad (1) $$

which is rather small, typically in the order of a few hundred millivolts. Due to the symmetry of the I-V transfer characteristics of the source-coupled pair and of the circuit, the logic threshold $V_{LT}$ is equal to zero (i.e., $v_o = 0$ when $v_i = 0$).

Figure 3. Topology of a 2:1 Multiplexer.

Figure 4. Topology of a XOR gate.

Figure 5. Topology of a D Latch.

Unfortunately, the power dissipated by the MCML gate is dominated by the static power consumption $V_{DD} \cdot I_{SS}$ due to the bias current source, since the dynamic contribution (associated with the capacitance charge during the gate switching) is rather small due to the reduced logic swing. For this reason, various techniques have been adopted to dynamically reduce the static power consumption [1]. The static power consumption is the fundamental weak point of MCML gates, thus in their design it must be kept as small as possible for a given required performance by consciously managing the power-delay trade-off, both to efficiently design MCML cells and to develop an automated design flow. In the following sections, power-aware design strategies will be derived to address this problem.

Compared to traditional CMOS logic, MCML gates exhibit various interesting features that make them suitable for an increasingly wide range of applications:

1) MCML gates are faster. The higher speed allows for implementing circuits for fast communication systems (e.g., multiplexing/demultiplexing ICs in the range of 10 Gb/s for SONET/SDH optical-fiber links and high-speed crosspoint switches) and RF circuits (e.g., PLLs, prescalers, circuits for clock recovery and VCOs), as well as high-speed current-mode buffers [6]–[13]. This speed improvement is due to the tremendous CMOS technology scaling, and allows for replacing previous bipolar CML logic [14]. However, in Section V it will be shown that the high speed performance is not due to the small logic swing, contrary to the common belief.

2) MCML gates have a better power efficiency at high frequencies. This reinforces the suitability of MCML gates for high-frequency applications, since in the last decade a low power consumption has also been required in high-speed circuits for reasons related to heat removal, as well as to the battery lifetime in portable devices [15], [16]. This has extended the range of applications of MCML gates to the implementation of high-speed low-power arithmetic and signal processing cores [15].

3) MCML gates generate a much lower noise during switching. Indeed, the power supply must provide a static power and thus a constant current to each gate. This avoids the typical current spikes of CMOS logic that determine large voltage variations on the supply voltage $V_{DD}$ [17]–[22], which in turn couple with any analog circuits sharing the same substrate (as occurs in current Systems-on-Chip) and degrade their resolution. In particular, the almost constant supply current leads to an almost zero voltage drop across the inductance of the bonding wires/supply rails due to current variations $di/dt$ [6], which will be increasingly important in the next technology nodes, in which both this inductance and the supply current variations are expected to grow dramatically because of the increased clock frequency [23], [24].
   The switching noise generated by MCML circuits is typically reduced by two orders of magnitude, thus this logic style is currently adopted in most high-speed high-resolution mixed-signal ICs for digital audio and video signal processing (such as sigma-delta A/D and D/A converters) [25]–[32].

4) MCML gates have a better signal integrity and a lower "delay noise". This is due to the much lower supply voltage noise (discussed in point 3) and the differential operation of MCML gates, which are insensitive to common-mode signals, including the supply noise. This greatly simplifies the design of the supply distribution network and reduces the size (and area) of the decoupling capacitors needed to ensure low $V_{DD}$ variations.
   Interestingly, MCML circuits can also be made insensitive to the noise arising from the (capacitive) coupling with other switching circuits. Indeed, this coupling noise becomes a common-mode signal if the cells and the interconnects are carefully designed with a symmetric layout. This also avoids the delay variations due to the capacitive coupling with other switching gates (often named delay noise [33]), which is a major source of delay uncertainty in current CMOS logic circuits.

5) MCML gates potentially have a lower sensitivity to process, supply and environmental variations. Simple techniques to significantly lower the effect of process tolerances have been developed for MCML gate circuits [1], [16], even though this aspect is not completely understood and is currently under investigation [34]. As an example, the variation in the logic threshold $V_{LT}$ due to process tolerances determines an uncertainty on the input switching time (at which $v_i = V_{LT}$) and thus on the delay. However, the $V_{LT}$ variations in MCML gates are mainly due to the mismatch of source-coupled NMOS transistors (or load resistances), whereas the CMOS variations are due to the poorer matching between a PMOS and an NMOS transistor [35]. Thus, a lower uncertainty in $V_{LT}$ and in the delay is expected in MCML circuits.
   Due to the essentially static power consumption, the chip temperature and the supply voltage also tend to be constant (according to point 3), thereby minimizing the delay variations associated with supply and environmental variations.
   The potentially lower delay uncertainty is an appealing property, since delay uncertainty is becoming an increasing fraction of the clock period [36], and thus a major limit to the speed improvement in current Deep-Sub-Micron (DSM) technologies [33].

6) MCML gates suffer from a lower degradation of the electrical transistor properties due to DSM effects. This is due to the lower logic swing, which reduces the voltages across the transistors' terminals, and thus the electric field under the transistor channel. This reduces DSM effects, such as the carrier mobility degradation and velocity saturation, when compared to standard CMOS logic.



   Moreover, the worse speed performance of PMOS transistors does not impose a speed limit in MCML gates, since the switching source-coupled pairs are made up of only NMOS transistors [7].

According to points 1–3, the range of applications in which MCML gates exhibit significant advantages has continuously broadened. This trend is expected to continue according to points 3–6, since MCML gates are less sensitive than CMOS logic to limitations arising in sub-100 nm technologies.

II. Static Analysis Through the Alpha-Power Law

The noise margin NM is the fundamental requirement on the static behavior of any logic style, and for a single-input gate (i.e., the inverter in Figure 2) it is defined as $V_{OH,min} - V_{IH,min}$ (or equivalently as $V_{IL,max} - V_{OL,max}$, due to the symmetry of the transfer characteristics) from the critical points of the DC voltage transfer characteristics $(V_{IL,max}, V_{OH,min})$ and $(V_{IH,min}, V_{OL,max})$ in Figure 6.

Figure 6. Typical DC transfer characteristics of an MCML gate. (Plot of $v_o(v_i)$ versus $v_i$: output levels $\pm R_D I_{SS}$, logic threshold $V_{LT} = 0$, critical points $V_{IL,max}$, $V_{IH,min}$, $V_{OH,min}$ and $V_{OL,max}$.)

In the more general case with multiple inputs, the noise margin is evaluated from the DC characteristics associated with a given input $v_i$ driving a source-coupled pair M1-M2, with the other inputs being preliminarily assigned [6]. The transistors driven by the latter constant inputs do not affect the static behavior of the gate, since they are switched off or can be assumed to be short-circuited, thus the DC behavior of multiple-input gates is equal to that of a simple inverter made up of the source-coupled pair M1–M2, according to Figure 7.

Figure 7. Noise margin: equivalence of multiple-input gates to an inverter gate.

In general, the noise margin NM might depend on the considered input $v_i$, but in practical MCML gates all source-coupled pairs are made identical to have the same noise margin for all inputs, hence the noise margin expression of the inverter is immediately extended to arbitrary logic gates.

II-A. Evaluation of the Noise Margin: A Simple Approach

In this subsection, a novel simplified approach is adopted to evaluate the noise margin of nanometer MCML circuits by assuming an I-V relationship of MOS transistors given by the well-known Alpha-Power law



[37]. The latter expresses the NMOS drain current $i_D$ as a function of the gate-source voltage $v_{GS}$ in the saturation region

$$ i_D = K \cdot W \cdot (v_{GS} - V_{TH})^{\alpha} \qquad (2) $$

$W$ being the effective channel width, $V_{TH}$ the transistor threshold voltage, $K$ and $\alpha$ technology-dependent coefficients (the channel length modulation effect was neglected, as usual). In particular, old long-channel technologies have a square I-V law, thus parameter $\alpha$ is equal to 2 and $K$ is equal to $\mu_n C_{OX}/2L$ [6], where $\mu_n$ is the electron mobility, $C_{OX}$ is the gate oxide capacitance per unit area and $L$ is the effective channel length. In this case, the MCML noise margin has been previously found to be given by [2], [38]

$$ NM = \frac{V_{SWING}}{2}\left(1 - \frac{\beta}{A_V}\right) \qquad (3) $$

where $\beta$ is a constant coefficient equal to $\sqrt{2}$, and $A_V$ is the magnitude of the small-signal voltage gain around the logic threshold given by [2]

$$ A_V = g_m R_D \qquad (4) $$

$g_m$ being the transistor transconductance around the logic threshold.

In the limit case of a very short-channel device with a completely saturated carrier velocity, the I-V relationship is linear, $\alpha = 1$, and $K = v_{sat} C_{OX}$ [6], [37]. In this case, by performing the simple calculations reported in Appendix I, the noise margin turns out to be still given by (3) but with a different value of $\beta$, which in this case is equal to 1. In actual nanometer devices, as shown for example by the data in Table 1 referring to a 90-nm technology, $\alpha$ is intermediate between 2 and 1, and thus $\beta$ is expected to range between $\sqrt{2} \approx 1.4$ and 1. As a reasonable approximation, $\beta$ can be set to the intermediate value 1.2, which leads to

$$ NM = \frac{V_{SWING}}{2}\left(1 - \frac{1.2}{A_V}\right) \qquad (5) $$

Extensive simulations were performed by varying the logic swing from 240 mV to 800 mV, with $A_V$ ranging from 1.6 to 2.5, adopting a 90 nm technology whose main parameters are reported in Table 1. The error of the analytical model (5) was found to be always lower than 14% and typically in the order of a few percent. Typical values of NM are in the order of 100 mV in current nanometer technologies.

Table 1. Alpha-Power law coefficients and main process parameters in a 90-nm technology.

    Parameter                                               Value
    α                                                       1.45
    K                                                       0.83 E3 µA/(m·V²)
    VTH                                                     0.35 V
    Wmin                                                    120 nm
    Lmin                                                    90 nm
    effective Lmin                                          65 nm
    COX                                                     18 fF/µm²
    maximum VDD                                             1 V
    resistance per unit length r (unsilicided p+ POLY)      1.23 kΩ/µm
    capacitance per unit length c (unsilicided p+ POLY)     0.07 fF/µm

II-B. Considerations on the Technology Scaling and Circuit Design

From (5), the noise margin is proportional to the logic swing, and roughly equal to half of it if $A_V$ is sufficiently high. Next, the comparison of (5) with (3) shows that the noise margin achieved with nanometer devices is greater than that with old long-channel transistors, for assigned values of the logic swing and the voltage gain. This is good news, since it means that DSM effects are beneficial in terms of the noise margin in MCML gates, and that the long-channel model in (3) is pessimistic for current technologies. However, the maximum logic swing which ensures the transistor operation in the saturation region is equal to $2V_{TH}$ [2], which slightly decreases when scaling the technology, thus the maximum noise margin tends to decrease slowly.

Now, let us derive simple design equations to size $R_D$ and the transistors in order to obtain assigned values of $V_{SWING}$ and $A_V$ satisfying the noise margin requirement (more detailed design guidelines to preliminarily assign these two parameters will be discussed in Section V). Solving (1), a given logic swing is achieved by properly setting the resistance $R_D$ to $V_{SWING}/2I_{SS}$, whereas from (4) an assigned value of $A_V$ is achieved by setting the NMOS transconductance $g_m$ in (6) to $A_V/R_D$

$$ g_m = \left.\frac{d i_D}{d v_{GS}}\right|_{i_D = I_{SS}/2} = \alpha \cdot (K \cdot W)^{\frac{1}{\alpha}} \left(\frac{I_{SS}}{2}\right)^{1 - \frac{1}{\alpha}} \qquad (6) $$

where $V_{GS}$ under the drain current $I_{SS}/2$ was evaluated from (2). By substituting (6) into (4) and solving for $W$, the transistor channel width needed to achieve a given $A_V$ is

$$ W = \frac{2^{2\alpha - 1}}{K}\left(\frac{A_V}{\alpha V_{SWING}}\right)^{\alpha} I_{SS}. \qquad (7) $$

From (7), the channel width of NMOS transistors must be set to a value which is proportional to the bias current, and which increases with the ratio $A_V/V_{SWING}$ (raised to the power $\alpha$). Of



APPENDIX I

In the limit case of a very short-channel device with a completely saturated carrier velocity, i.e. with $\alpha = 1$ [38] and $K = v_{sat} C_{OX}$ [6], the DC transfer characteristics of an MCML gate can easily be evaluated by solving the usual set of two equations encountered in the well-known analysis of a traditional source-coupled pair [46],

$$ \begin{cases} v_i = v_{GS1} - v_{GS2} & \text{(KVL at input loop)} \\ i_{D1} + i_{D2} = I_{SS} & \text{(KCL at the source node)} \end{cases} \qquad (A1.1) $$

By expressing $v_{GS}$ as a function of $i_D$ from (2) and substituting it into the first equation in (A1.1), the solution of the set of two equations easily gives the expression of the transistor currents as a function of the input voltage $v_i$

$$ i_{D1}(v_i) = \begin{cases} 0 & \text{if } v_i < -\dfrac{I_{SS}}{K \cdot W} \\[2mm] \dfrac{I_{SS}}{2} + K \cdot W \cdot \dfrac{v_i}{2} & \text{if } |v_i| \le \dfrac{I_{SS}}{K \cdot W} \\[2mm] I_{SS} & \text{if } v_i > \dfrac{I_{SS}}{K \cdot W} \end{cases} \qquad (A1.2a) $$

$$ i_{D2}(v_i) = I_{SS} - i_{D1}(v_i) \qquad (A1.2b) $$

from which, considering that $v_{o1} = V_{DD} - R_D i_{D1}$ and $v_{o2} = V_{DD} - R_D i_{D2}$, as well as substituting the voltage gain expression $A_V = K \cdot W \cdot R_D$ (achieved from (4), with $g_m$ equal to $di_D/dv_{GS} = K \cdot W$ from (2) with $\alpha = 1$) and $V_{SWING}$ by solving (1), the differential output voltage is equal to

$$ v_o(v_i) = \begin{cases} \dfrac{V_{SWING}}{2} & \text{if } v_i < -\dfrac{V_{SWING}}{2A_V} \\[2mm] -A_V v_i & \text{if } |v_i| \le \dfrac{V_{SWING}}{2A_V} \\[2mm] -\dfrac{V_{SWING}}{2} & \text{if } v_i > \dfrac{V_{SWING}}{2A_V} \end{cases} \qquad (A1.3) $$

which according to Figure 20 is a piece-wise linear curve, as expected due to the linear I-V relationship. From this figure, the critical points that define the noise margin are

$$ (V_{IL,max}, V_{OH,min}) = \left(-\frac{V_{SWING}}{2A_V}, \frac{V_{SWING}}{2}\right), \qquad (V_{IH,min}, V_{OL,max}) = \left(\frac{V_{SWING}}{2A_V}, -\frac{V_{SWING}}{2}\right) \qquad (A1.4) $$

Thus the noise margin is equal to

$$ NM = V_{OH,min} - V_{IH,min} = \frac{V_{SWING}}{2}\left(1 - \frac{1}{A_V}\right). \qquad (A1.5) $$
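As a quick numerical illustration of (A1.3)–(A1.5), the following Python sketch evaluates the piece-wise linear transfer characteristic and checks that the noise margin obtained from the critical points matches the closed form. The function names and the sample values of the logic swing and voltage gain are choices made here for illustration only and are not part of the original derivation.

```python
def transfer_characteristic(v_i, v_swing, a_v):
    """Differential output voltage v_o(v_i) of (A1.3), valid for alpha = 1."""
    v_knee = v_swing / (2.0 * a_v)  # |v_i| beyond which the current is fully steered
    if v_i < -v_knee:
        return v_swing / 2.0
    if v_i > v_knee:
        return -v_swing / 2.0
    return -a_v * v_i


def noise_margin(v_swing, a_v, beta=1.0):
    """Noise margin in the form of (3); beta = 1 reproduces (A1.5)."""
    return 0.5 * v_swing * (1.0 - beta / a_v)


# Illustrative values (assumed here, not prescribed by the appendix).
V_SWING = 0.7  # logic swing, in volts
A_V = 2.2      # small-signal voltage gain

v_ih_min = V_SWING / (2.0 * A_V)                             # from (A1.4)
v_oh_min = transfer_characteristic(-v_ih_min, V_SWING, A_V)  # output at V_IL,max
nm_from_points = v_oh_min - v_ih_min
nm_closed_form = noise_margin(V_SWING, A_V)

print(f"NM from critical points: {nm_from_points:.4f} V")
print(f"NM from (A1.5):          {nm_closed_form:.4f} V")
```

Passing `beta=1.2` to `noise_margin` gives the intermediate approximation (5) discussed in the main text.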

course, it is set to the minimum value allowed by the technology whenever (7) yields a lower value.

III. Gate Delay Modeling Methodology

The delay in MCML gates can be evaluated by resorting to the general approach in [2], [3], [37], where the circuit is first properly linearized around the logic threshold and possibly simplified by resorting to the half-circuit concept, exploiting the symmetry. The linearized (half) circuit is then approximated by a first-order circuit with a pole time constant $\tau$ and a zero time constant $\tau_z$ (respectively equal to the negative of the reciprocal of the pole and the zero). The propagation delay $\tau_{PD}$ of this first-order approximation is equal to [39]

$$ \tau_{PD} = 0.69 \cdot (\tau - \tau_z) \qquad (8) $$

where parameters $\tau$ and $\tau_z$ can easily be evaluated by applying the well-known open-circuit time-constant method [40], [41].

When linearizing the circuit, NMOS transistors cannot be rigorously modeled with the well-known small-signal MOS model in Figure 8 due to the strong non-linearity involved in logic gates. However, it is well known [2] that



the same topology in Figure 8 can be used to model NMOS transistors since they work in the saturation region most of the time, even though the linearized parameters (i.e., the transistor transconductance and capacitances) must be evaluated in a large-signal condition to account for the wide voltage variations during a switching transient. The large-signal transconductance $G_M$ can be evaluated as the ratio of the drain current variation $\Delta i_D$ and the gate-source voltage variation $\Delta v_{GS}$ during the gate switching (in place of the small-signal transconductance $di_D/dv_{GS}$). In a complete switching of the transistor, the current changes from 0 to $I_{SS}$ or vice versa (i.e., $\Delta i_D = I_{SS}$), and this change is determined by a gate-source voltage going from $V_{TH}$ to $[V_{TH} + (I_{SS}/KW)^{1/\alpha}]$ by solving (2) (thus $\Delta v_{GS} = (I_{SS}/KW)^{1/\alpha}$), hence $G_M$ is equal to

$$ G_M = \frac{\Delta i_D}{\Delta v_{GS}} = \frac{I_{SS}}{(I_{SS}/KW)^{\frac{1}{\alpha}}} = \frac{g_m}{\alpha / 2^{\,1 - \frac{1}{\alpha}}} \approx \frac{g_m}{0.6 + 0.4 \cdot \alpha} \qquad (9) $$

where the small-signal transconductance $g_m$ around the logic threshold (6) was substituted. In (9), $G_M$ is lower than $g_m$ by a factor equal to $\alpha/2^{(1-1/\alpha)}$, which is very well approximated by $(0.6 + 0.4\alpha)$ with an error of about 1% at most for $\alpha$ ranging from 1 to 2. For scaled processes having values of $\alpha$ closer to unity, the large-signal transconductance is only slightly lower than the small-signal value.

Figure 8. Equivalent linear model of NMOS transistors. (Elements between the gate G, drain D and source S terminals: $C_{gd}$, $C_{gs}$, controlled source $G_M v_{GS}$, $C_{db}$ and $C_{sb}$.)

In Figure 8, the source-bulk and drain-bulk capacitances $C_{sb}$ and $C_{db}$ can be linearized by multiplying their zero-bias value by a factor which depends on the junction built-in potential, the grading coefficient and the minimum/maximum direct voltage across the junction [2], [6]. The gate-drain and the gate-source capacitances $C_{gd}$ and $C_{gs}$ in the saturation region are approximately linear, thus no linearization is needed. It is worth noting that all these NMOS parasitic capacitances are proportional to the channel width $W$.

To model the load resistance $R_D$, observe that it is actually implemented by a strip of a highly-resistive layer (to reduce its area occupation) with length $L$ according to Figure 9, which also has a distributed parasitic capacitance to ground, with an overall value $C_{R,TOT}$. By following the analysis in Appendix II, this RC strip can be represented by a lumped RC circuit consisting of the resistance $R_D$ with a parallel capacitance $C_{RD}$, as shown in Figure 9. According to Appendix II, the capacitance $C_{RD}$ is equal to one third of the total parasitic capacitance, and is also proportional to $1/I_{SS}$

$$ C_{RD} = \frac{C_{R,TOT}}{3} = \frac{C_{R,unit}}{I_{SS}} \qquad (10a) $$

$$ C_{R,unit} = \frac{V_{SWING}}{6}\,\frac{c}{r} \qquad (10b) $$

$C_{R,unit}$ being the load parasitic capacitance for a unit bias current ($c$ and $r$ are the capacitance and resistance per unit length of the layer implementing the resistance, which are provided in the technology design kit). To validate this approximate first-order RC circuit, several physical resistances were simulated by extracting parasitics from the layout and applying a step current, in order to evaluate the equivalent time constant $\tau_{eq}$ of the corresponding voltage waveform. In particular, by considering an unsilicided p-doped polysilicon layer with the resistance $r$ and capacitance $c$ per unit length reported in Table 1 for the adopted 90-nm technology, results showed that (10) agrees very well with simulations, with an error always lower than 4%.

IV. Delay Versus Bias Current in Nanometer MCML Gates

In this section, the methodology and the circuit models of the transistors and the load resistances discussed in Section III are applied to an inverter gate (Subsection A) and to more complex MCML gates (Subsection B). Compared to [2], [3], a strongly simplified procedure is adopted to express the power-delay trade-off in a very simple manner.

IV-A. MCML Inverter Gate

Let us consider the inverter gate in Figure 2, in which transistors M1-M2 work in the saturation region most of the time, and their source voltage is the same for both input logic values (it is fixed by the NMOS transistor in



Figure 9. Physical implementation of the load resistance: derivation of its lumped circuit model. (Panels: physical implementation of the load resistance $R_D$ as a strip of length $L$ between terminals A and B; decomposition into n sections, each with resistance $R_D/n$ and two capacitances $C_{R,TOT}/2n$; RC ladder circuit model with resistances $R_1, R_2, \ldots, R_n$ and capacitances $C_0 = C_n = C_{R,TOT}/2n$, $C_1 = C_2 = \ldots = C_{R,TOT}/n$; simplified RC circuit model with $R_D$ in parallel with $C_{RD} = C_{R,TOT}/3$.)

the ON state). Thus, the circuit can be linearized around the logic threshold $v_i = 0$, and the half-circuit concept applies due to the symmetry and the differential signaling. As shown in Figure 10, where the transistor model in Figure 8 is substituted, the linearized half-circuit is a simple common-source circuit. By applying the time-constant method to this circuit (i.e., by evaluating the time constants associated with each capacitance when the others are open-circuited), the time constants $\tau$ and $\tau_z$ in (8) are easily found to be

$$ \tau = R_D \left[(C_{db} + C_{gd}) + C_{RD} + C_L\right] = R_D (C_{drain} + C_{RD} + C_L) \qquad (11a) $$

$$ \tau_z = -\frac{C_{gd}}{G_M} = -\frac{C_{gd}}{g_m}(0.6 + 0.4\alpha) \qquad (11b) $$



APPENDIX II

To model the effect of the distributed resistance and capacitance associated with the physical layer of the load resistance, we develop an equivalent lumped RC circuit which has approximately the same dynamic behavior. To this aim, divide the strip in Figure 9 into a high number n of small sections, each of which is represented by a lumped resistance $R_D/n$ and a capacitance $C_{R,TOT}/n$ (split into two symmetric contributions $C_{R,TOT}/2n$, according to Figure 9). Thus the distributed RC strip can be described by the ladder network in Figure 9 with $C_0 = C_n = C_{R,TOT}/2n$, $C_1 = C_2 = \ldots = C_{n-1} = C_{R,TOT}/n$, with $C_n$ being short-circuited to ground.

The equivalent impedance $Z_D$ of the RC ladder circuit in Figure 9 can be approximated by a first-order RC circuit with an equivalent time constant $\tau_{eq}$ [47], [48],

$$ Z_D(s) = R_D\,\frac{1 + b_1 s + b_2 s^2 + \ldots}{1 + a_1 s + a_2 s^2 + \ldots} \approx R_D\,\frac{1}{1 + s\tau_{eq}} \qquad (A2.1) $$

which consists of a resistance $R_D$ with a parallel equivalent capacitance $C_R$ such that $\tau_{eq} = R_D C_R$. In (A2.1), the equivalent time constant $\tau_{eq}$ is equal to $a_1 - b_1$ [39], which in turn is easily evaluated through the time-constant method [40], [41]. After simple but tedious calculations, for $n \to \infty$ we obtain

$$ a_1 = \frac{R_D C_{R,TOT}}{2} \qquad (A2.2) $$

$$ b_1 = \lim_{n\to\infty} \frac{R_D C_{R,TOT}}{n^2} \sum_{i=1}^{n-1}\left(i - \frac{i^2}{n}\right) = \lim_{n\to\infty} \frac{R_D C_{R,TOT}}{n^2}\left[\frac{n(n-1)}{2} - \frac{1}{n}\left(\frac{(n-1)^3}{3} + \frac{(n-1)^2}{2} + \frac{(n-1)}{6}\right)\right] = \frac{R_D C_{R,TOT}}{6} \qquad (A2.3) $$

therefore the equivalent time constant $\tau_{eq}$ is equal to $R_D C_{R,TOT}/3$ (thereby yielding $C_{RD} = C_{R,TOT}/3$).

The equivalent capacitance $C_R$ can be expressed as an explicit function of the bias current by observing that the resistance $R_D$ is equal to $r \cdot L$, $r$ being the resistance per unit length of the considered physical layer and $L$ the strip length. The same observation holds for $C_{R,TOT}$, equal to $c \cdot L$, $c$ being the capacitance per unit length of the considered layer. Accordingly, by expressing the strip length $L$ as $R_D/r$ and substituting the expression $R_D = V_{SWING}/2I_{SS}$, we get the relationships (10).
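The limit $\tau_{eq} = R_D C_{R,TOT}/3$ can also be checked numerically. The Python sketch below builds the n-section ladder described above (with $C_0$ equal to half a section and the far end B treated as an AC ground), evaluates its input impedance at a small real value of s, and estimates $\tau_{eq}$ from the first-order behavior $Z_D(s) \approx R_D(1 - s\tau_{eq})$. This is a different route from the time-constant calculation of the appendix; the function names and the component values are assumptions made only for the sake of the example.

```python
def ladder_impedance(s, r_d, c_tot, n):
    """Impedance seen at node A of the n-section RC ladder (node B at AC ground)."""
    r_seg = r_d / n
    z = 0.0  # impedance at node B: grounded, so C_n is short-circuited
    for k in range(n - 1, -1, -1):
        # C_0 (at node A) is half a section; inner nodes carry a full C_tot / n.
        c_k = c_tot / (2 * n) if k == 0 else c_tot / n
        z = 1.0 / (s * c_k + 1.0 / (r_seg + z))
    return z


def equivalent_time_constant(r_d, c_tot, n, rel_s=1e-4):
    """Estimate tau_eq from Z(s) ~ R_D * (1 - s * tau_eq) at a small real s."""
    s = rel_s / (r_d * c_tot)
    return (r_d - ladder_impedance(s, r_d, c_tot, n)) / (s * r_d)


# Hypothetical component values, used only to exercise the functions.
R_D = 3.5e3      # ohms
C_TOT = 2.0e-15  # farads

tau_eq = equivalent_time_constant(R_D, C_TOT, n=1000)
ratio = tau_eq / (R_D * C_TOT)
print(f"tau_eq / (R_D * C_R,TOT) = {ratio:.4f}")
```

The printed ratio should approach 1/3 as n grows and `rel_s` shrinks, in agreement with (A2.2)–(A2.3).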

where it was observed that all capacitances see the same resistance $R_D$ in the evaluation of $\tau$, and the sum of $C_{gd}$ and $C_{db}$ was interpreted as the transistor capacitive contribution $C_{drain}$ at the drain node. The (negative) zero time constant in (11b) is that of the well-known common-source circuit, and from (8) tends to increase the delay more significantly in down-scaled technologies.1 From (8) and (11a)–(11b), the delay $\tau_{PD}$ is equal to

$$ \tau_{PD} = 0.69\,R_D \left[ C_{drain} + \frac{C_{gd}(0.6 + 0.4\alpha)}{A_V} + C_{RD} + C_L \right]. \qquad (12) $$

Now, let us consider the explicit dependence of the delay (12) on the bias current $I_{SS}$, considering that in practical designs $R_D = V_{SWING}/2I_{SS}$ as discussed in Section II-B, and NMOS transistors are sized according to (7). Since all NMOS capacitances are proportional to $W$, as pointed out in Section III, from (7) the transistor capacitance $C_{drain}$ ($C_{gd}$) turns out to be proportional to $I_{SS}$ by a constant $C_{drain,N}$ ($C_{gd,N}$) which represents its value per unit current (i.e., $C_{drain} = C_{drain,N} \cdot I_{SS}$ and $C_{gd} = C_{gd,N} \cdot I_{SS}$). By substituting (10), the MCML inverter delay in (12) is equal to

$$ \tau_{PD} = 0.35 \cdot V_{SWING} \left( C_{MOSnet,N} + \frac{C_{R,unit}}{I_{SS}^2} + \frac{C_L}{I_{SS}} \right) \qquad (13) $$

where the NMOS network capacitive contributions per unit current were lumped into a single contribution $C_{MOSnet,N}$

$$ C_{MOSnet,N} = C_{drain,N} + \frac{C_{gd,N}(0.6 + 0.4\alpha)}{A_V}. \qquad (14) $$

1 This is because the (overlap) gate-drain capacitance scales more slowly than the other parasitic capacitances, since the direct overlap size cannot scale linearly with the minimum feature size. As another important aspect, the recent adoption of high-κ dielectrics tends to further increase this capacitance [35].
2 When $v_i$ is applied to the upper transistors, the capacitances of lower transistors (that have already switched) do not contribute to the overall delay.
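To make the dependence of the delay on the bias current in (13) concrete, the following Python sketch tabulates the delay and the static power $V_{DD} \cdot I_{SS}$ for a few bias currents. It is only an illustration under stated assumptions: the logic swing, $C_{R,unit}$ and $C_{drain,N}$ are the values listed in Table 2, the supply is the 1 V maximum of Table 1, the external load of 10 fF is hypothetical, and the $C_{gd,N}$ term of (14) is neglected because its value is not reported here, so the resulting numbers are indicative at best.

```python
def mcml_delay(i_ss, v_swing, c_mosnet_n, c_r_unit, c_load):
    """Gate delay from (13), in seconds (inputs in A, V, F/A, F*A and F)."""
    return 0.35 * v_swing * (c_mosnet_n + c_r_unit / i_ss**2 + c_load / i_ss)


def static_power(i_ss, v_dd):
    """Static power consumption V_DD * I_SS, in watts."""
    return v_dd * i_ss


V_DD = 1.0             # V, maximum supply voltage in Table 1
V_SWING = 0.7          # V, logic swing used for Table 2
C_R_UNIT = 1.88e-20    # F*A, from Table 2
C_MOSNET_N = 1.38e-11  # F/A: C_drain,N of Table 2 only (C_gd,N term neglected, assumption)
C_LOAD = 10e-15        # F, hypothetical external load

results = []
for i_ss in (25e-6, 50e-6, 100e-6, 200e-6):
    delay = mcml_delay(i_ss, V_SWING, C_MOSNET_N, C_R_UNIT, C_LOAD)
    power = static_power(i_ss, V_DD)
    results.append((i_ss, delay, power))
    print(f"I_SS = {i_ss * 1e6:6.1f} uA   delay = {delay * 1e12:7.2f} ps   "
          f"power = {power * 1e6:6.1f} uW")

# Lower bound on the delay reached for very large bias currents.
delay_floor = 0.35 * V_SWING * C_MOSNET_N
print(f"Asymptotic delay for large I_SS: {delay_floor * 1e12:.2f} ps")
```

In this sketch the delay decreases monotonically as the bias current (and hence the power) grows, and flattens out at the term set by the NMOS network capacitance per unit current.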



Table 2. Delay coefficients for MCML gates in a 90-nm technology (with $V_{SWING}$ = 700 mV, $A_V$ = 2.2).

| Parameter | Value |
|---|---|
| $C_{drain,N}$ | 1.38E-11 F/A |
| $C_{input,N}$ | 1.67E-11 F/A |
| $C_{R,unit}$ | 1.88E-20 F·A |
| $C_{source,N} = C_{drain,N} + C_{input,N}$ | 3.05E-11 F/A |

It is worth noting that relationship (13) analytically expresses the power-delay trade-off, since the delay is an explicit function of $I_{SS}$ (which defines the static power consumption $P = V_{DD} \cdot I_{SS}$).

IV-B. Complex MCML Gates and Input Capacitance
In [2], it was shown that the power-delay interdependence (13) actually holds for arbitrary MCML gates, as will be shown in the following for various MCML gates. First, let us consider the MCML MUX in Figure 3, whose worst-case delay $\tau_{PD,MUX}$ (see footnote 2) is obtained by applying the switching input $v_i$ to transistors M1-M2 and keeping inputs A and B constant. Without loss of generality, A and B can be assumed to be at the low and high level, respectively, so that M3 and M6 are in the saturation region, while M4 and M5 are in cut-off. Observe that the XOR gate has the same delay as the MUX, since its topology is obtained from the latter by setting $B = \bar{A}$; hence, in the following only the MUX gate will be considered. By applying the adopted modeling methodology, the MUX/XOR linearized half-circuit is depicted in Figure 11, whose delay (8) is easily found to be

$$ \tau_{PD,MUX} = 0.69 \left\{ R_D \left[ C_{drain,3} + C_{drain,5} + \frac{C_{gd}\,(0.6 + 0.4\alpha)}{A_V} + C_{RD} + C_L \right] + \frac{1}{G_M} \left( C_{drain,1} + C_{source,3} + C_{source,4} \right) \right\} \approx 0.69 \cdot R_D \left[ 2C_{drain} + \frac{C_{drain} + 2C_{source}}{A_V}\,(0.6 + 0.4\alpha) + C_{RD} + C_L \right] \tag{15} $$

where the sum of $C_{gs}$ and $C_{sb}$ is interpreted as the transistor capacitive contribution $C_{source}$ at the source node. It has been observed that all transistors have the same $C_{drain}$ ($C_{source}$), and the zero time constant $\tau_z$ (given by (11b)) is negligible when compared to the sum of the other capacitances, since the latter is much greater than in the case of the inverter. By following the same approach as the inverter, and remembering that NMOS parasitic capacitances are proportional to $I_{SS}$ (i.e., $C_{drain} = C_{drain,N} \cdot I_{SS}$, $C_{source} = C_{source,N} \cdot I_{SS}$), the delay is still given by (13) with an overall NMOS capacitance per unit current equal to

$$ C_{MOSnet,N} = 2C_{drain,N} + \frac{C_{drain,N} + 2C_{source,N}}{A_V}\,(0.6 + 0.4\alpha). \tag{16} $$

In regard to the D latch, whose worst-case delay is the clock-to-output delay occurring when input CLK switches, this gate differs from the MUX/XOR gate only for the source-coupled pair M5-M6, which stores the previous output value for CLK = 0 through its positive-feedback connection. Thus, the capacitive contributions of the D latch are the same as the MUX/XOR gate, except for the additional capacitance $C_{input}$ in (15) seen from the gate of M5 (M6). As a consequence, the D latch clock-to-output delay is still given by (13a), with an overall NMOS capacitance equal to

$$ C_{MOSnet,N} = 2C_{drain,N} + \frac{C_{drain,N} + 2C_{source,N}}{A_V}\,(0.6 + 0.4\alpha) + C_{input,N}. \tag{17} $$

where $C_{input,N}$ is obtained from the gate-source capacitance expression and (7)

$$ C_{input,N} = \frac{C_{input}}{I_{SS}} = \frac{2}{3}\,\frac{W}{I_{SS}}\,L \cdot C_{OX} = \frac{2}{3}\,\frac{2^{2\alpha-1}}{K} \left( \frac{A_V}{\alpha V_{SWING}} \right)^{\alpha} L \cdot C_{OX} \tag{18} $$

Observe that the generalization of (13) to arbitrary gates is easily justified by considering that in arbitrary MCML gates the parasitic capacitance $C_{RD}$ of the load resistance is always responsible for the delay term inversely proportional to $I_{SS}^2$, and the external load capacitance $C_L$ determines the term inversely proportional to $I_{SS}$. Analogously, the NMOS transistor capacitances have the same dependence on $I_{SS}$ and are responsible for the delay term independent of $I_{SS}$ in (13), which is given by the sum of capacitances at the output node and the other capacitances multiplied by $(0.6 + 0.4\alpha)/A_V$.

IV-C. Simulation Results and Numerical Examples
The delay model was compared to Cadence Spectre simulations with $I_{SS}$ widely ranging from 1 µA to 100 µA and loading each gate with a number of equal gates (i.e., the fan-out FO) ranging from 0 to 4, using the 90-nm CMOS technology previously described. The delay obtained for FO equal to 0 and 4 is plotted in Figure 12 versus $I_{SS}$ in logarithmic scale



(due to the wide considered range of the bias current), assuming $V_{SWING}$ equal to 700 mV and $A_V$ equal to 2.2. In the same figure, the predicted delay (13) with $C_L = C_{input} \cdot FO$ (with $C_{input}$ given by (18)) is plotted versus $I_{SS}$, where the numerical data reported in Table 2 were used. The error, which is plotted versus $I_{SS}$ in Figure 13, is always within 10% and is typically in the order of a few percent, with an average value of 4.7%. It is worth noting that the maximum error almost doubles (19%) when the zero effect $C_{gd,unit}\,(0.6 + 0.4\alpha)/A_V$ in (14) is neglected, thereby confirming that it is an increasingly important contribution in nanometer technologies.

Figure 10. Equivalent circuit of a MCML inverter.
Figure 11. Equivalent circuit of an MCML MUX gate.

In Figure 12, as expected, the delay does not depend on the fan-out for very low values of the bias current, since the dominant capacitive contribution is due to the parasitic capacitance associated with the load resistance. This confirms that the widely adopted assumption of an ideal load resistor is far from being realistic, since its parasitic capacitance in (10) must be accounted for. Similar curves are obtained for the other considered MCML gates, which are omitted for the sake of compactness, and the obtained numerical value of $C_{MOSnet,N}$ in (13) is reported in Table 3.
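As a rough numerical illustration of the fan-out behaviour described above, the following sketch evaluates the delay model for the inverter with FO = 0 and FO = 4. It assumes that (13) has the form $\tau_{PD} = 0.35 \cdot V_{SWING} \cdot (C_{MOSnet,N} + C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, which is inferred from (22) and (24) rather than quoted from (13) itself. The coefficients come from Tables 2 and 3, and the function name is purely illustrative.

```python
# Sketch only: assumes (13) reads
#   tau_PD = 0.35 * V_SWING * (C_MOSnet_N + C_R_unit / I_SS**2 + C_L / I_SS)
V_SWING = 0.7            # logic swing (V)
C_R_UNIT = 1.88e-20      # load-resistor coefficient, Table 2 (F*A)
C_INPUT_N = 1.67e-11     # input capacitance per unit current, Table 2 (F/A)
C_MOSNET_INV = 1.48e-11  # inverter value, Table 3 (F/A)


def tau_pd(i_ss, fan_out, c_mosnet_n=C_MOSNET_INV):
    """Predicted delay (s) of a gate driving `fan_out` identical gates."""
    c_load = C_INPUT_N * i_ss * fan_out  # C_L = C_input * FO
    return 0.35 * V_SWING * (c_mosnet_n + C_R_UNIT / i_ss**2 + c_load / i_ss)


for i_ss in (1e-6, 10e-6, 100e-6):
    d0, d4 = tau_pd(i_ss, 0), tau_pd(i_ss, 4)
    print(f"I_SS = {i_ss * 1e6:5.0f} uA: FO=0 {d0 * 1e12:8.1f} ps, "
          f"FO=4 {d4 * 1e12:8.1f} ps, ratio {d4 / d0:.2f}")
```

Under these assumptions the two curves nearly coincide at 1 µA, where the load-resistor term dominates, and separate by roughly a factor of five at 100 µA.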
Figure 12. Inverter delay versus $I_{SS}$ with a fan-out of 0 and 4 (simulated and predicted curves; $\tau_{PD}$ in ps versus $I_{SS}$ in µA).

V. Power-Delay Trade-Offs and Design Guidelines
From the general relationship (13), different power-delay trade-offs and several interesting properties of MCML gates can be derived, by measuring the efficiency in the power-delay trade-off with the Power-Delay Product PDP (i.e., the product of $P = V_{DD} \cdot I_{SS}$ and (13)) [6]. In any MCML gate with assigned values of $V_{SWING}$ and $A_V$, three different regions can be identified when varying the power consumption; see Figure 14, which plots the trend of (13) versus $I_{SS}$:
1) LOW POWER REGION: for low values of $I_{SS}$ such that the term $C_{R,unit}/I_{SS}^2$ dominates over the other two in (13), the parasitic capacitance associated with the load resistance dominates over the others, thus $\tau_{PD}$ is inversely proportional to $I_{SS}^2$. Accordingly, PDP is inversely proportional to $I_{SS}$, i.e., it greatly increases when reducing the power consumption. Thus, in low-power designs, a power saving is achieved at the cost of a much greater speed penalty. Moreover, the delay (13) does not depend on the NMOS network, thus it is the same for all MCML gates, regardless of the implemented logic function.
2) POWER-EFFICIENT REGION: for moderate values of $I_{SS}$ such that the term $C_L/I_{SS}$ dominates over the other two, $\tau_{PD}$ is inversely proportional to $I_{SS}$, hence PDP is roughly constant. A power saving is achieved at the cost of an equal speed penalty. In this case, the delay mainly depends on the load.
3) INEFFICIENT DESIGN REGION: for high values of $I_{SS}$, the MCML delay can no longer be lowered despite a power increase, since it asymptotically tends to a minimum value (obtained from (13) with $I_{SS} \to \infty$) set by the NMOS capacitances: the gate tends to be self-loaded due to the large transistor size (7), which determines large NMOS capacitances. In this case, MCML gates are very inefficient in terms of the power-delay trade-off, and the delay is mainly determined by the considered gate through its NMOS network.
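The three regions listed above can be told apart by checking which term of the delay expression is largest at a given bias current. The sketch below does this under the assumption that (13) is proportional to $C_{MOSnet,N} + C_{R,unit}/I_{SS}^2 + C_L/I_{SS}$ (inferred from (22) and (24)); the function name and the example load of 2 fF are illustrative choices, and the default coefficient is the Table 2 value.

```python
def dominant_region(i_ss, c_load, c_mosnet_n, c_r_unit=1.88e-20):
    """Name the operating region by the largest term of the assumed form of (13)."""
    terms = {
        "low-power": c_r_unit / i_ss**2,   # load-resistor parasitic term
        "power-efficient": c_load / i_ss,  # external load term
        "inefficient": c_mosnet_n,         # constant NMOS (self-loading) term
    }
    return max(terms, key=terms.get)


# Full Adder carry logic (C_MOSnet,N from Table 3) driving a hypothetical 2 fF load.
for i_ss in (1e-6, 15e-6, 100e-6):
    print(f"{i_ss * 1e6:5.0f} uA -> {dominant_region(i_ss, 2e-15, 1.21e-10)}")
```

A hard "largest term" rule is only a coarse classifier; near the region boundaries two terms are comparable and the transition is gradual.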
Figure 13. Error of the delay model versus $I_{SS}$ for an inverter gate with a fan-out of 0 and 4.

It is worth noting that all MCML gates with the same $V_{SWING}$ and $A_V$ have the same power-delay interdependence, with the only difference being the value of the constant term $C_{MOSnet,N}$ in (13). Hence, more complex gates have a greater $C_{MOSnet,N}$ and thus a greater asymptotic minimum delay $\tau_{PD,min}$. Therefore, the delay curves versus $I_{SS}$, which analytically describe the power-delay trade-off of two different MCML gates, differ only by an up/down shift equal to the difference of the two minimum delay values, as graphically reported in Figure 15.
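The constant shift between two gates can be written out explicitly. The short derivation below assumes that (13) has the form $\tau_{PD} = 0.35 \cdot V_{SWING} \cdot (C_{MOSnet,N} + C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, inferred from (22) and (24), and that both gates share the same logic swing and load; the numerical check uses the inverter and MUX/XOR values of Table 3.

```latex
% Same V_SWING, C_R,unit and C_L for both gates: the I_SS-dependent terms cancel.
\tau_{PD,2}(I_{SS}) - \tau_{PD,1}(I_{SS})
  = 0.35\, V_{SWING} \left( C_{MOSnet,N2} - C_{MOSnet,N1} \right)
  = \tau_{PD,min2} - \tau_{PD,min1}

% Inverter versus MUX/XOR (Table 3, V_SWING = 0.7 V):
0.35 \times 0.7\,\mathrm{V} \times \left( 6.76 - 1.48 \right) \times 10^{-11}\,\mathrm{F/A}
  \approx 12.9\,\mathrm{ps}
```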
According to the previous considerations on the power-delay trade-off, MCML gates will usually be designed in the power-efficient region, where power and speed performance are reasonably balanced, whereas the low-power region will be used only for non-critical paths. In the following subsections, simple design criteria will be derived from (13) in the three typical cases (power-efficient, high-speed and low-power design), and design considerations on the power supply voltage will be made.
Figure 14. General delay dependence on the bias current (or equivalently the power consumption) in MCML gates.

V-A. Power-Efficient Design
To achieve an optimum power-delay balance, it is necessary to minimize the power-delay product $PDP = V_{DD} \cdot I_{SS} \cdot \tau_{PD}$ (with $\tau_{PD}$ given by (13)), which is obtained by setting its derivative to zero and solving for $I_{SS}$. The obtained bias current which minimizes PDP is

$$ I_{SS,opt\_PDP} = \sqrt{\frac{C_{R,unit}}{C_{MOSnet,N}}}. \tag{19} $$

First, (19) yields $C_{MOSnet,N} \cdot I_{SS,opt\_PDP} = C_{R,unit}/I_{SS,opt\_PDP}$, which means that a power-efficient design leads to equal capacitive contributions of the NMOS network and the load resistance, as reported in Figure 16. Moreover, the optimum bias current (19) is independent of the load, and the minimum power-delay product (obtained by substituting (19) into PDP) turns out to be

$$ PDP_{opt,MCML} \approx 0.35 \cdot V_{DD} \cdot V_{SWING} \times \left( 2\sqrt{C_{MOSnet,N}\, C_{R,unit}} + C_L \right) \tag{20} $$

from which a PDP increase (i.e., a worse power efficiency) is observed when increasing the load capacitance $C_L$, as well as $C_{R,unit}$ and $C_{MOSnet,N}$. Observe that $C_{R,unit}$ is proportional to $V_{SWING}$ (according to Appendix II) and does not depend on $A_V$, whereas the NMOS contribution $C_{MOSnet,N}$ is proportional to $W$, which in turn is proportional to $(A_V/V_{SWING})^{\alpha}$ according to (7); hence, neglecting $C_L$, PDP in (20) is proportional to $V_{SWING}^{(3-\alpha)/2}$ and $A_V^{\alpha/2}$. As a general result, in MCML gates designed for power efficiency the logic swing and the voltage gain should be kept as low as possible within the range allowed by the noise margin requirement. These considerations are summarized in Figure 16, where it is considered that for $I_{SS} = I_{SS,opt\_PDP}$ the terms proportional to $1/I_{SS}^2$ and the constant one are equal, thus it lies at the boundary of the low-power and the power-efficient region.

V-B. High-Speed Design
When a high speed performance is the principal goal, two situations may occur. In the first one, a delay constraint $\tau_{PD}$ derived from considerations at the gate level has to be met by properly setting $I_{SS}$ to

$$ I_{SS} = 0.17 \cdot V_{SWING}\, \frac{C_L}{\tau_{PD} - \tau_{PD,min}} \left[ 1 + \sqrt{1 + 11.4 \cdot \frac{C_{R,unit}}{C_L^2} \cdot \frac{\tau_{PD} - \tau_{PD,min}}{V_{SWING}}} \right] \tag{21} $$

that was obtained by solving (13) for $I_{SS}$ and substituting its asymptotic minimum expression

$$ \tau_{PD,min} = \lim_{I_{SS} \to \infty} \tau_{PD} = 0.35 \cdot V_{SWING} \cdot C_{MOSnet,N} \tag{22} $$

In the second case the speed potential must be exploited as much as possible, thus $\tau_{PD}$ has to be close to (22) while keeping $I_{SS}$ within reasonable values, i.e., $I_{SS}$ should only be increased as long as a significant speed improvement is achieved. To this aim, observe that for sufficiently high values of $I_{SS}$ such that $C_{MOSnet,N} > (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, the constant term in (13) dominates over the other two (i.e., the gate is self-loaded), thus a high speed is achieved, but a further increase in the bias current does not lead to a significant speed advantage. In contrast, for lower values of $I_{SS}$ such that $C_{MOSnet,N} < (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, the terms depending on $I_{SS}$ dominate over the constant one, thus a worse speed performance is achieved, but $\tau_{PD}$ is highly sensitive to a bias current increase. As a compromise, a reasonable choice of $I_{SS}$ is achieved in the intermediate case $C_{MOSnet,N} = (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, which evidently makes $\tau_{PD}$ only twice the minimum achievable (i.e., $\tau_{PD} = 2\tau_{PD,min}$), as reported in Figure 16. Thus, the current $I_{SS,opt\_delay}$ needed for such high-speed criterion is at the boundary of the power-efficient and the inefficient region in Figure 16. Moreover, under this current $C_{MOSnet,N} \cdot I_{SS,opt\_delay}$ is equal to $(C_{R,unit}/I_{SS,opt\_delay} + C_L)$, thus under this design criterion the NMOS capacitance contribution equals the sum of $C_L$ and that of the load resistance.

The bias current $I_{SS,opt\_delay}$ is easily found from (21) by substituting $\tau_{PD} = 2\tau_{PD,min}$

$$ I_{SS,opt\_delay} = 0.17 \cdot V_{SWING}\, \frac{C_L}{\tau_{PD,min}} \left[ 1 + \sqrt{1 + 11.4 \cdot \frac{C_{R,unit}}{C_L^2} \cdot \frac{\tau_{PD,min}}{V_{SWING}}} \right] \tag{23} $$

which is easily found to be always greater than $I_{SS,opt\_PDP}$ in (19) (or equal to, in the limit case $C_L = 0$). This means that a high speed is achieved at the cost of a worse power efficiency, when compared to the case discussed in the previous subsection.

By reiterating the reasoning in Subsection V-A, $C_{MOSnet,N}$ is proportional to $(A_V/V_{SWING})^{\alpha}$, thus the delay $\tau_{PD} = 2\tau_{PD,min}$ is proportional to $A_V^{\alpha}/V_{SWING}^{\alpha-1}$ from (22); therefore, in high-speed designs the voltage gain should be kept low, whereas the logic swing should be set as high as possible, cf. Figure 16. Surprisingly, this is in contrast with the usual belief that the high-speed feature of MCML gates is due to the small logic swing [16], which is probably due to a superficial extension of well-known properties of CML bipolar gates [2]. This consideration can be intuitively justified by observing that an increase in the logic swing reduces the transistor size (7) needed to achieve a given $A_V$, thereby reducing the NMOS capacitances which are the dominant contribution in the high-speed region.
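The sizing rules (19), (21) and (23) are simple enough to evaluate directly. The following sketch implements them with illustrative function names, using the Full Adder carry logic coefficient of Table 3, $C_{R,unit}$ from Table 2 and the 2 fF load of the design example; it is a hedged numerical aid, not the authors' design flow.

```python
import math

V_SWING = 0.7           # logic swing (V)
C_R_UNIT = 1.88e-20     # Table 2 (F*A)
C_MOSNET_FA = 1.21e-10  # Full Adder carry logic, Table 3 (F/A)


def i_ss_opt_pdp(c_mosnet_n, c_r_unit=C_R_UNIT):
    """Bias current that minimizes the power-delay product, eq. (19)."""
    return math.sqrt(c_r_unit / c_mosnet_n)


def i_ss_for_delay(tau_pd, c_load, c_mosnet_n, c_r_unit=C_R_UNIT, v_swing=V_SWING):
    """Bias current that meets a delay target, eq. (21). Requires c_load > 0."""
    tau_min = 0.35 * v_swing * c_mosnet_n  # eq. (22)
    slack = tau_pd - tau_min
    if slack <= 0:
        raise ValueError("target delay must exceed tau_PD,min")
    root = math.sqrt(1 + 11.4 * c_r_unit / c_load**2 * slack / v_swing)
    return 0.17 * v_swing * c_load / slack * (1 + root)


def i_ss_opt_delay(c_load, c_mosnet_n, c_r_unit=C_R_UNIT, v_swing=V_SWING):
    """High-speed bias current, eq. (23): eq. (21) with tau_PD = 2 * tau_PD,min."""
    tau_min = 0.35 * v_swing * c_mosnet_n
    return i_ss_for_delay(2 * tau_min, c_load, c_mosnet_n, c_r_unit, v_swing)


print(f"I_SS,opt_PDP   = {i_ss_opt_pdp(C_MOSNET_FA) * 1e6:.1f} uA")
print(f"I_SS,opt_delay = {i_ss_opt_delay(2e-15, C_MOSNET_FA) * 1e6:.1f} uA")
```

With these inputs the two currents come out close to the 12 µA and 22 µA quoted in Section VI. Since (21) and (23) divide by $C_L$, the unloaded case should be handled with (19), which is their limit for $C_L = 0$.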



V-C. Low-Power Design
In low-power design, e.g., the design of non-critical paths, the power consumption per gate allowed is usually an assigned parameter that is derived from the requirements at the system level. Therefore, the only design parameter is the logic swing, whereas $I_{SS}$ is set to a very low value chosen from the system considerations, thus the gate works in the low-power region where the dominant term is $C_{R,unit}/I_{SS}^2$, and the delay is approximately

$$ \tau_{PD} \cong 0.35 \cdot V_{SWING} \cdot \frac{C_{R,unit}}{I_{SS}^2} \tag{24} $$

which shows that in low-power design the logic swing has to be set as low as possible, as in the case of power-efficient design, whereas the voltage gain does not affect the speed performance.
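To see how far down in bias current the approximation (24) can be trusted, the sketch below compares it with the full delay expression for an unloaded inverter. The full expression is an assumed form of (13), inferred from (22) and (24); coefficients are taken from Tables 2 and 3 and the function names are illustrative.

```python
V_SWING = 0.7            # logic swing (V)
C_R_UNIT = 1.88e-20      # Table 2 (F*A)
C_MOSNET_INV = 1.48e-11  # inverter, Table 3 (F/A)


def tau_full(i_ss, c_load=0.0, c_mosnet_n=C_MOSNET_INV):
    """Assumed full form of (13), in seconds."""
    return 0.35 * V_SWING * (c_mosnet_n + C_R_UNIT / i_ss**2 + c_load / i_ss)


def tau_low_power(i_ss):
    """Low-power approximation, eq. (24), in seconds."""
    return 0.35 * V_SWING * C_R_UNIT / i_ss**2


def relative_error(i_ss):
    """Fraction of the full delay that eq. (24) leaves out."""
    return 1.0 - tau_low_power(i_ss) / tau_full(i_ss)


for i_ss in (1e-6, 3e-6, 10e-6):
    print(f"I_SS = {i_ss * 1e6:4.0f} uA: eq. (24) underestimates by "
          f"{relative_error(i_ss) * 100:.2f} %")
```

For the unloaded inverter the approximation is essentially exact at 1 µA and is off by several percent at 10 µA; a nonzero load would make the discrepancy larger at the same current.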

Figure 15. Delay curves versus $I_{SS}$ for two different MCML gates with the same logic swing and load.

Figure 16. Summary of design criteria of MCML gates. Low-power design: low $V_{SWING}$, any $A_V$, $C_{RD} \gg C_{MOSnet} + C_L$. Power-efficient design: low $V_{SWING}$, low $A_V$, $C_{MOSnet} = C_{RD}$ (at $I_{SS,opt\_PDP}$). High-speed design: high $V_{SWING}$, low $A_V$, $C_{MOSnet} = C_{RD} + C_L$ (at $I_{SS,opt\_delay}$, where $\tau_{PD} = 2\tau_{PD,min}$).

V-D. Remarks on the Power Supply Voltage Sizing
Since the NMOS network in MCML gates consists of stacked source-coupled pairs associated with different levels, according to Figure 17, only the transistors at the first (upper) level can be directly driven by the output of an MCML gate, whereas the input voltages of transistor pairs at lower levels are progressively reduced through level shifter stages to ensure operation in the saturation region (for the reader interested in the design of level shifter stages, the subject is thoroughly addressed in [2]). Each level shifter stage is implemented with a common-drain stage as in Figure 17 [2], [18]. The minimum $V_{DD}$ is found by considering the input $v_{i,n}$ at the n-th lowest level in Figure 17, which is set by the output voltage of the preceding gate and the gate-source voltage drop $(n-1)V_{GS,shift}$ of $(n-1)$ level shifters, and is equal to $V_{DD} - (n-1)V_{GS,shift}$ in the case of a high input. According to Figure 17, this voltage must accommodate the gate-source voltage drop of the lowest transistor driven by $v_{i,n}$ and the minimum voltage drop across the bias current source $V_{ISS,min}$ (equal to a small $V_{DS,sat} \sim 100$ mV in the case of a simple current mirror implementation), thus



$$ V_{DD,min} = V_{GS} + (n-1)\,V_{GS,shift} + V_{ISS,min} \tag{25} $$

Equivalently, eq. (25) sets the maximum number of levels n for a given $V_{DD}$ (typically 2–3 [2], [18]). The level shifter voltage drop $V_{GS,shift}$ is usually kept very close to the transistor threshold voltage $V_{TH}$ by setting the bias current to a rather low value. In regard to $V_{GS}$, it is the gate-source voltage of a transistor in the ON state, i.e., with a current $I_{SS}$, thus it is obtained by solving (2)

$$ V_{GS} = V_{TH} + \left( \frac{I_{SS}}{K \cdot W} \right)^{1/\alpha} = V_{TH} + \frac{1}{2^{\,2-\frac{1}{\alpha}}}\,\frac{\alpha V_{SWING}}{A_V} \tag{26} $$

where (7) was substituted. From (26), in order to reduce the supply voltage, the voltage swing should be kept as low as possible, and the voltage gain should not be too low.

Figure 17. Level shifter stages to interface MCML gates.

VI. A Design Example
Let us apply the concepts presented until now to the carry logic of a Full Adder, which evaluates the carry output $C_{out} = A \cdot B + C_{in} \cdot (A \oplus B)$ as a function of the carry input $C_{in}$ and the two digit inputs A and B [6]. This block is of utmost importance in arithmetic blocks such as adders and multipliers, and its MCML topology is reported in Figure 18 [15]. Its worst-case delay is represented by the case when the maximum number of capacitances switch. From Figure 18, this occurs when the lowest level input B switches and the resulting current is steered to the source-coupled pairs M5-M6 and M9-M10 (or equivalently to M3-M4 and M7-M8), which occurs when A = 1 and $C_{in}$ = 0 (or A = 0 and $C_{in}$ = 1). The current path that defines the worst-case delay is depicted with a dashed line in Figure 18.

The delay of the circuit in Figure 18 is given by (13), where $C_{MOSnet,N}$ is easily found by inspection of the worst-case current path. Indeed, the capacitance at the output node (due to the drain connection of transistors M8, M5 and M10) is equal to $3C_{drain}$, the capacitance at node X (due to the source contribution of M9-M10 and the drain contribution of M6) is equal to $2C_{source} + C_{drain}$, and the capacitance at node Y (due to the source contribution of M5-M6 and the drain contribution of M2) is also equal to $2C_{source} + C_{drain}$. As discussed in Section IV-B, by multiplying the capacitances not connected to the output node by $(0.6 + 0.4\alpha)/A_V$, $C_{MOSnet,N}$ is equal to

$$ C_{MOSnet,N} = 3C_{drain,N} + \frac{4C_{source,N} + 2C_{drain,N}}{A_V}\,(0.6 + 0.4\alpha) \tag{27} $$

whose numerical value, obtained with the data of Table 2, is reported in Table 3.

Table 3. Overall NMOS capacitance per unit current of different MCML logic gates (with $V_{SWING}$ = 700 mV, $A_V$ = 2.2).

| Logic gate | $C_{MOSnet,N}$ (F/A) |
|---|---|
| Inverter | 1.48E-11 |
| MUX/XOR | 6.76E-11 |
| D latch | 8.44E-11 |
| Full Adder (carry logic) | 1.21E-10 |

If a high speed is targeted, according to Figure 16 the logic swing must be set to the maximum value $2V_{TH}$ (as discussed in Section II-A), which from Table 1 is equal to 700 mV. Moreover, assuming that a noise margin of 160 mV is required, from (5) a voltage gain $A_V$ equal to 2.2 is needed. Assuming a load capacitance $C_L$ equal to 2 fF (i.e., the rather high input capacitance of a gate with $I_{SS}$ = 120 µA), the obtained optimum bias current in (23) is 22 µA, and the delay is equal to about 60 ps. If an optimum power-delay balance is desired, under the same noise margin specification, the optimum bias current (19) must be 12 µA, and the delay is 100 ps. In a more practical case where the Full Adder is loaded by an equal one (i.e., $C_L = C_{input,N} \cdot I_{SS}$), the predicted and simulated delay are plotted versus $I_{SS}$ in Figure 19; the error is always lower than 13% and its average value is 5.5%. The results show that for a very high speed the optimum bias current is equal to 11.6 µA, giving a delay of about 67 ps.

Finally, let us observe that other input transitions lead to lower values of the delay, such as in the case of the carry input to carry output delay (with A = 0 and B = 1 or vice versa), which is particularly important when defining the speed performance of adder circuits [6]. This delay $\tau_{CARRY}$ is given by (13) with $C_{MOSnet,N}$ given by the contribution at the output node $3C_{drain}$, because the other capacitances have already switched during the carry input transition. In a high-speed design under the above conditions, the obtained optimum current and delay are respectively 18 µA and 28 ps.

Figure 18. Topology of the carry logic in an MCML Full Adder, with worst-case current path in dashed line.

VIII. CONCLUSIONS
In this paper, an overview of techniques to manage the power-delay trade-off in nanometer MCML circuits has been presented. Compared to previous works, a strongly simplified and comprehensive approach was adopted which also accounts for Deep-Submicron effects. As opposed to the previous works of the same authors, a physical resistance load was assumed, whose distributed parasitic capacitance was simply modeled as a lumped circuit. It was also shown that the usual assumption made in the previous papers of an ideal resistor (i.e., without parasitic capacitance) is strongly unrealistic, especially in low-power designs.
To better understand the design trade-offs, simple models of the noise margin and the delay have been discussed. Furthermore, a simple approach to write the delay by inspection of the gate



topology was extrapolated by generalizing the results of a few gates. Interesting properties on trade-offs and effect of scaling have been derived from these analytical models: for example, it is shown that the DSM effects are beneficial in terms of the noise margin. In particular, three design targets have been discussed (i.e., low-power, power-efficient and high-speed), and simple design criteria to size the bias current, the logic swing and the voltage gain have been found. These results, which are summarized in Figures 14 and 16, provide powerful information for decision making in the design process. Interestingly, it was shown that a high speed is achieved by increasing the logic swing, as opposed to the incorrect traditional belief that low logic swings make MCML circuits faster. The practical design of the carry logic of a Full Adder has been discussed, presenting numerical examples by considering a 90-nm CMOS process.
Several challenges must still be faced in the understanding of MCML circuits, which are an approach that is less mature than the traditional CMOS logic. First, the understanding of the interdependence of design parameters and the design criteria here derived should be exploited to implement automated design flows to effectively optimize complex MCML circuits with a reasonable computational effort.
Second, although MCML circuits were shown to be less sensitive to the problems related to the technology downscaling than traditional CMOS logic, further problems will arise due to the continuous reduction of the supply voltage. Indeed, the latter will increasingly limit the number of logic levels within a gate (according to (25)–(26)), and thus the complexity that can be implemented into a single gate. This will translate into a greater number of bias current sources (and thus a greater overall power consumption) and interconnects (which degrade the speed performance). To overcome this limit, novel circuit approaches will be needed, such as the low-voltage triple-tail cell approach that was previously adopted in bipolar integrated circuits [42]–[44]. Moreover, the logic swing reduction that will be forced by the supply voltage scaling will determine a decrease in the available noise margin, which will have to be recovered by increasing the voltage gain (according to (5)) by means of novel circuit techniques such as the introduction of positive feedback [45].
Third, efficient power-down techniques will be needed to reduce the static power consumption in MCML blocks that do not perform useful computations, while still keeping supply current variations within reasonable bounds, in order to maintain the advantages due to the almost constant supply current of MCML gates.

Figure 19. Carry logic delay versus $I_{SS}$ with a unity fan-out (simulated and predicted curves; delay in ps versus $I_{SS}$ in µA).

Figure 20. DC transfer characteristics of a MCML gate with a completely saturated carrier velocity.



References
[1] M. Mizuno et al., “A GHz MOS adaptive pipeline techniques using MOS current-mode logic,” IEEE J. Solid-State Circuits, vol. 31, no. 6, pp. 784–791, June 1996.
[2] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current-Mode Logic (CML, ECL and SCL Digital Circuits), Springer, 2005.
[3] M. Alioto and G. Palumbo, “Design strategies for source coupled logic gates,” IEEE Trans. on CAS, Part I, vol. 50, no. 5, pp. 640–654, May 2003.
[4] M. Alioto and G. Palumbo, “Power-delay optimization of D-Latch/MUX source coupled logic gates,” International Journal of Circuit Theory and Applications, vol. 33, no. 1, pp. 65–86, Jan./Feb. 2005.
[5] M. Alioto and G. Palumbo, “Oscillation frequency in CML and ESCL ring oscillators,” IEEE Trans. on CAS, Part I, vol. 48, no. 2, pp. 210–214, Feb. 2001.
[6] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits (A Design Perspective), Prentice Hall, 2003.
[7] B. Razavi, “Prospect of CMOS technology for high-speed optical communication circuits,” IEEE J. Solid-State Circuits, vol. 37, no. 9, pp. 1135–1145, Sep. 2002.
[8] B. Razavi (Ed.), Monolithic Phase-Locked Loops and Clock Recovery Circuits (Theory and Design), IEEE Press, 1996.
[9] C. Hung, B. Floyd, B. Park, and K. O, “Fully integrated 5.35-GHz CMOS VCOs and prescalers,” IEEE Trans. on Microwave Theory and Techniques, vol. 49, no. 1, Jan. 2001.
[10] C. Lam and B. Razavi, “A 2.6-GHz/5.2-GHz Frequency Synthesizer in 0.4-µm CMOS Technology,” IEEE J. Solid-State Circuits, vol. 35, no. 5, pp. 788–794, May 2000.
[11] H. Nosaka, K. Isshii, T. Enoki, and T. Shibata, “A 10-Gb/s data-pattern independent clock and data recovery with a two-mode phase comparator,” IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 192–197, Feb. 2003.
[12] S.-T. Yan and H. Luong, “A 3-V 1.3-to-1.8-GHz CMOS voltage-controlled oscillator with 0.3-ps Jitter,” IEEE Trans. on CAS, Part II, vol. 45, no. 7, pp. 876–880, July 1998.
[13] B. Razavi, Design of Integrated Circuits for Optical Communications, McGraw-Hill, 2003.
[14] T.H. Lee, The Design of CMOS Radio Frequency Integrated Circuits, Cambridge University Press, 2nd edition, 2003.
[15] J. Musicer and J. Rabaey, “MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environments,” Proc. of ISLPED 2000, pp. 102–107, 2000.
[16] A. Tanabe, M. Umetani, I. Fujiwara, T. Ogura, K. Kataoka, M. Okiara, H. Sakuraba, T. Endoh, and F. Masuoka, “0.18-µm CMOS 10-Gb/s Multiplexer/Demultiplexer ICs Using Current Mode Logic with Tolerance to Threshold Voltage Fluctuation,” IEEE J. Solid-State Circuits, vol. 36, no. 6, June 2001.
[17] R. Senthinatan and J. Prince, “Application specific CMOS output driver circuit design techniques to reduce simultaneous switching noise,” IEEE J. Solid-State Circuits, vol. 28, no. 12, pp. 1383–1388, Dec. 1993.
[18] S. Maskai, S. Kiaei, and D. Allstot, “Synthesis techniques for CMOS folded source-coupled logic circuits,” IEEE J. Solid-State Circuits, vol. 27, no. 8, pp. 1157–1167, Aug. 1992.
[19] D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, “Folded source-coupled logic vs. CMOS static logic for low-noise mixed-signal ICs,” IEEE Trans. on CAS, Part I, vol. 40, no. 9, pp. 553–563, Sep. 1993.
[20] S. Kiaei, S. Chee, and D. Allstot, “CMOS source-coupled logic for mixed-mode VLSI,” Proc. Int. Symp. Circuits Systems, pp. 1608–1611, 1990.
[21] H. Ng and D. Allstot, “CMOS current steering logic for low-voltage mixed-signal integrated circuits,” IEEE Trans. on VLSI Systems, vol. 5, no. 3, pp. 301–308, Sep. 1997.
[22] B. Stanistic, N. Verghese, R. Rutenbar, L. Carley, and D. Allstot, “Addressing substrate coupling in mixed-mode IC’s: simulation and power distribution synthesis,” IEEE J. Solid-State Circuits, vol. 29, pp. 226–238, Mar. 1994.
[23] International Technology Roadmap for Semiconductors, Available: http://public.itrs.net.
[24] R. Singh (Ed.), Signal Integrity Effects in Custom IC and ASIC Design, IEEE Press, 2002.
[25] B. Del Signore, D. Kerth, N. Sooch, and E. Swanson, “A monolithic 20-b delta-sigma A/D converter,” IEEE J. Solid-State Circuits, vol. 25, pp. 1311–1317, Dec. 1990.
[26] H. Leopold, G. Winkler, P. O’Leary, K. Ilzer, and J. Jernej, “A monolithic CMOS 20-b analog-to-digital converter,” IEEE J. Solid-State Circuits, vol. 26, pp. 910–916, July 1991.
[27] I. Fujimori et al., “A 5-V single chip delta-sigma audio A/D converter with 111 dB dynamic range,” IEEE J. Solid-State Circuits, vol. 32, pp. 329–336, Mar. 1997.
[28] S. Jantzi, K. Martin, and A. Sedra, “Quadrature bandpass ΔΣ modulator for digital radio,” IEEE J. Solid-State Circuits, vol. 32, pp. 1935–1949, 1997.
[29] B. Kup, E. Dijkmans, P. Naus, and J. Sneep, “A bit-stream digital-to-analog converter with 18-b resolution,” IEEE J. Solid-State Circuits, vol. 26, pp. 1757–1763, Dec. 1991.
[30] J. Kundan and S. Hasan, “Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits,” IEEE Trans. on CAS, Part II, vol. 47, no. 8, pp. 810–817, Aug. 2000.
[31] H. Lee, D. Hodges, and P. Gray, “A self-calibrating 15-bit CMOS A/D converter,” IEEE J. Solid-State Circuits, vol. 19, pp. 813–819, Dec. 1984.
[32] D. Su, M. Loinaz, S. Masui, and B. Wooley, “Experimental results and modeling techniques for substrate noise in mixed-signal integrated circuits,” IEEE J. Solid-State Circuits, vol. 28, pp. 420–430, Apr. 1993.
[33] K. Bernstein et al., High Speed CMOS Design Styles, Kluwer Academic Publishers, 1999.
[34] S. Bruma, “Impact of on-chip process variations on MCML performance,” Proc. IEEE International Systems-on-Chip Conference (SOCC’03), pp. 135–140, 2003.
[35] B.P. Wong, A. Mittal, U. Cao, and G. Starr, Nano-CMOS Circuit and Physical Design, John Wiley & Sons, 2005.
[36] D. Chinnery and K. Keutzer, Closing the Gap between ASIC & Custom, Kluwer Academic Publishers, 2002.
[37] T. Sakurai and A.R. Newton, “Alpha-Power law MOSFET model and its applications to CMOS inverter delay and other formulas,” IEEE J. Solid-State Circuits, vol. 25, no. 2, pp. 584–594, Apr. 1990.
[38] M. Alioto, G. Palumbo, and S. Pennisi, “Modeling of Source Coupled Logic Gates,” International Journal of Circuit Theory and Applications, vol. 30, no. 4, pp. 459–477, 2002.



[39] W. Elmore, “The transient response of damped linear networks,” J. Appl. Phys., vol. 19, pp. 55–63, Jan. 1948.
[40] B. Cochrun and A. Grabel, “A Method for the Determination of the Transfer Function of Electronic Circuits,” IEEE Trans. on Circuit Theory, vol. CT-20, no. 1, pp. 16–20, Jan. 1973.
[41] G. Palumbo and S. Pennisi, Feedback Amplifiers Theory and Design, Kluwer Academic Publishers, 2002.
[42] B. Razavi, Y. Ota, and R. Swartz, “Design techniques for low-voltage high speed digital bipolar circuits,” IEEE Jour. of Solid-State Circ., vol. 29, no. 2, pp. 332–339, Mar. 1994.
[43] G. Schuppener, C. Pala, and M. Mokhtari, “Investigation on low-voltage low-power silicon bipolar design topology for high-speed digital circuits,” IEEE Jour. of Solid-State Circ., vol. 35, no. 7, pp. 1051–1054, July 2000.
[44] M. Alioto, R. Mita, and G. Palumbo, “Performance evaluation of the low-voltage CML D-Latch topology,” Integration, the VLSI Journal, Special Issue in Analog and Mixed-Signal IC Design and Design Methodologies (edited by Francisco V. Fernandez), vol. 36, no. 4, pp. 191–209, Nov. 2003.
[45] M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and Evaluation of Positive-Feedback Source-Coupled Logic,” IEEE Trans. on CAS, Part I, vol. 51, no. 12, pp. 2345–2355, Dec. 2004.
[46] P. R. Gray and R. G. Meyer, Analysis and Design of Analog Integrated Circuits, John Wiley & Sons, 1977.
[47] J. L. Wyatt, Jr., “Signal propagation delay in RC models for interconnect,” Circuit Analysis, Simulation and Design, Part II: VLSI Circuit Analysis and Simulation, A. Ruehli (Ed.), vol. 3 in the series Advances in CAD for VLSI, North-Holland, 1987.
[48] M. Alioto, G. Palumbo, and M. Poli, “Evaluation of energy consumption in RC ladder circuits driven by a ramp input,” IEEE Trans. on VLSI Systems, vol. 12, no. 10, pp. 1094–1107, Oct. 2004.

Massimo Alioto (M’01) was born in Brescia, Italy, in 1972. He received the laurea degree in Electronics Engineering and the Ph.D. degree in Electrical Engineering from the University of Catania (Italy) in 1997 and 2001, respectively. In 2002, he joined the Engineering faculty of the University of Siena as a Research Associate and in the same year as an Assistant Professor. In 2006, he became Associate Professor in the same faculty.

Since 2001 he has been teaching undergraduate and graduate courses on basic electronics, microelectronics and advanced VLSI digital design. He has authored or co-authored over 80 journal and conference papers. He is co-author of the book Model and Design of Bipolar and MOS Current-Mode Logic: CML, ECL and SCL Digital Circuits (Springer, 2005). His primary research interests include: modeling and optimized design of CMOS high-performance digital circuits in terms of high-speed or low-power dissipation, transistor- and gate-level design of arithmetic circuits, design of circuits for cryptographic applications (e.g., random number generators, circuits resistant to Differential Power Analysis), and design for variability. His research was previously also focused on the modeling and the design of bipolar CML/ECL circuits, as well as adiabatic logic.

Gaetano Palumbo was born in Catania, Italy, in 1964. He received the laurea degree in Electrical Engineering in 1988 and the Ph.D. degree from the University of Catania in 1993. Since 1993 he has taught courses on Electronic Devices, Electronics for Digital Systems and basic Electronics. In 1994 he joined the DEES (Dipartimento Elettrico Elettronico e Sistemistico), now DIEES (Dipartimento di Ingegneria Elettrica Elettronica e dei Sistemi), at the University of Catania as a researcher, subsequently becoming associate professor in 1998. Since 2000 he has been a full professor in the same department.

His primary research interest has been analog circuits, with particular emphasis on feedback circuits, compensation techniques, the current-mode approach, and low-voltage circuits. His research has since also embraced digital circuits, with emphasis on bipolar and MOS current-mode digital circuits, adiabatic circuits, and high-performance building blocks focused on achieving optimum speed within the constraint of low power operation. In all these fields he is developing some of the research activities in collaboration with STMicroelectronics of Catania.

He was the co-author of three books, “CMOS Current Amplifiers”, “Feedback Amplifiers: theory and design” and “Model and Design of Bipolar and MOS Current-Mode Logic (CML, ECL and SCL Digital Circuits)”, all by Kluwer Academic Publishers, in 1999, 2001 and 2005, respectively, and a textbook on electronic devices in 2005. He is a contributor to the Wiley Encyclopedia of Electrical and Electronics Engineering. He is the author of almost 300 scientific papers in refereed international journals (over 110) and in conferences. Moreover, he is co-author of several patents.

From June 1999 to the end of 2001 and from 2004 to 2005 he served as an Associate Editor of the IEEE Transactions on Circuits and Systems part I for the topics “Analog Circuits and Filters” and “Digital Circuits and Systems”, respectively. Since 2006 he has been serving as an Associate Editor of the IEEE Transactions on Circuits and Systems part II.

In 2005 he was a panelist in the scientific-disciplinary area 09 (industrial and information engineering) of the CIVR (Committee for Evaluation of Italian Research), which aims to evaluate Italian research in the above area for the period 2001–2003.

In 2003 he received the Darlington Award. Prof. Palumbo is an IEEE Senior Member.
