Power Aware Design of Nanometer MCML Tap
Abstract
In recent years, MOS Current-Mode Logic (MCML) circuits have attracted remarkable interest in several VLSI applications, ranging from high-accuracy mixed-signal circuits to high-speed circuits for channel (de)multiplexing in optical-fiber and Radio Frequency (RF) telecommunication systems. However, their advantages over traditional CMOS logic are achieved at the cost of static power consumption, which must be kept as low as possible. Accordingly, a conscious management of the power-delay trade-off is essential in the design of such circuits.

This paper presents several recent ideas on the design of digital MCML circuits, organized in a comprehensive framework. The treatment reviews and extends previous results by incorporating Deep-Sub-Micron (DSM) effects from the beginning, with a strongly simplified analytical formulation to improve the understanding and the design. Interesting properties and design criteria are derived from simple analytical models. From these models, a deep insight into the design of MCML circuits is gained, which is essential for both the efficient design of MCML cells and the development of an automated design flow. Numerical examples are presented by considering a 90-nm CMOS process.
40 IEEE CIRCUITS AND SYSTEMS MAGAZINE 1531-6364/06/$20.00©2006 IEEE FOURTH QUARTER 2006
I. Introduction

In the last decade, we have witnessed an increasing interest in MOS Current-Mode Logic (also named Source-Coupled Logic, SCL) circuits, which represent an alternative to traditional CMOS logic styles in several applications. Despite their recent adoption, MCML circuits actually have quite old ancestors in their family tree, as they directly descend from the bipolar Current-Mode Logic (CML), which has the same topology despite the different technology adopted [1].

The fundamental structure of an n-input MCML gate is depicted in Figure 1, where an NMOS network (consisting of properly stacked source-coupled pairs) steers the bias current $I_{SS}$ to one of the two output branches, according to the value of the differential inputs $v_{i1} = v_{i1,1} - v_{i1,2}$, ..., $v_{in} = v_{in,1} - v_{in,2}$. The steered current is then converted into a differential output voltage $v_o = v_{o,1} - v_{o,2}$ by the two resistances $R_D$ (shown in red), which are often implemented by physical resistors or, alternatively, by active loads made of PMOS transistors working in the triode region. In contrast to previous works dealing with the power-delay trade-off management in MCML gates [2]–[5], in the following a physical resistor will be assumed. The current source $I_{SS}$ in Figure 1 is usually implemented by a simple current mirror, which is not shown for the sake of simplicity. The load capacitance $C_L$ represents the external capacitance due to the input capacitance of the following gates and the wiring capacitance.

[Figure 1. Topology of a generic MCML gate.]

The general topology in Figure 1 allows the implementation of both combinational and sequential gates, whose logic function depends only on the connection of the source-coupled pairs. The implemented function can also be modified by negating the inputs and the output, i.e., by simply swapping the corresponding pairs of differential signals. As an example of the simplest logic gate, the topology of an MCML inverter is depicted in Figure 2, where the NMOS network consists of only one source-coupled pair. As other examples, the NMOS network topology of a 2-input Multiplexer

[Figure 2. Topology of an MCML inverter gate.]
Massimo Alioto is with the DII (Dipartimento di Ingegneria dell’Informazione), UNIVERSITÀ DI SIENA, v. Roma, 56, I-53100 SIENA – ITALY, E-mail: [email protected]. Gaetano Palumbo is with the DIEES (Dipartimento di Ingegneria Elettrica Elettronica e dei Sistemi), UNIVERSITÀ DI CATANIA, Viale Andrea Doria 6, I-95125 CATANIA – ITALY, E-mail: [email protected]
$$NM = \frac{V_{SWING}}{2}\left(1 - \frac{\beta}{A_V}\right) \tag{3}$$

where $\beta$ is a constant coefficient equal to $\sqrt{2}$, and $A_V$ is the magnitude of the small-signal voltage gain around the logic threshold, given by [2]

$$A_V = g_m R_D \tag{4}$$

$g_m$ being the transistor transconductance around the logic threshold.

In the limit case of a very short-channel device with a completely saturated carrier velocity, the I-V relationship is linear, $\alpha = 1$, and $K = v_{sat} C_{OX}$ [6], [37]. In this case, by performing the simple calculations reported in Appendix I, the noise margin turns out to be still given by (3) but with a different value of $\beta$, which in this case is equal to 1. In actual nanometer devices, as shown for example by the data in Table 1 referring to a 90-nm technology, $\alpha$ is intermediate between 2 and 1, and thus $\beta$ is expected to range between $\sqrt{2} \approx 1.4$ and 1. As a reasonable approximation, $\beta$ can be set to the intermediate value 1.2, which leads to

$$NM = \frac{V_{SWING}}{2}\left(1 - \frac{1.2}{A_V}\right) \tag{5}$$

Extensive simulations were performed by varying the logic swing from 240 mV to 800 mV, with $A_V$ ranging from 1.6 to 2.5, adopting a 90-nm technology whose main parameters are reported in Table 1. The error of the analytical model (5) was found to be always lower than 14% and typically in the order of a few percent. Typical values of $NM$ are in the order of 100 mV in current nanometer technologies.

II-B. Considerations on the Technology Scaling and Circuit Design

From (5), the noise margin is proportional to the logic swing, and roughly equal to half of it if $A_V$ is sufficiently high. Next, the comparison of (5) with (3) shows that the noise margin achieved with nanometer devices is greater than that with old long-channel transistors, for assigned values of the logic swing and the voltage gain. This is good news, since it means that DSM effects are beneficial in terms of the noise margin in MCML gates, and that the long-channel model in (3) is pessimistic for current technologies. However, the maximum logic swing which ensures the transistor operation in the saturation region is equal to $2V_{TH}$ [2], which slightly decreases when scaling the technology; thus the maximum noise margin tends to decrease slowly.

Now, let us derive simple design equations to size $R_D$ and the transistors in order to obtain assigned values of $V_{SWING}$ and $A_V$ satisfying the noise margin requirement (more detailed design guidelines to preliminarily assign these two parameters will be discussed in Section V). Solving (1), a given logic swing is achieved by properly setting the resistance $R_D$ to $V_{SWING}/2I_{SS}$, whereas from (4) an assigned value of $A_V$ is achieved by setting the NMOS transconductance $g_m$ in (6) to $A_V/R_D$

$$g_m = \left.\frac{d i_D}{d v_{GS}}\right|_{i_D = I_{SS}/2} = \alpha \left(\frac{I_{SS}}{2}\right)^{1-\frac{1}{\alpha}} (K \cdot W)^{\frac{1}{\alpha}} \tag{6}$$

where $V_{GS}$ under the drain current $I_{SS}/2$ was evaluated from (2). By substituting (6) into (4) and solving for $W$, the transistor channel width needed to achieve a given $A_V$ is

$$W = \frac{2^{2\alpha-1}}{K}\left(\frac{A_V}{\alpha V_{SWING}}\right)^{\alpha} I_{SS}. \tag{7}$$

From (7), the channel width of NMOS transistors must be set to a value which is proportional to the bias current, and which increases with the ratio $A_V/V_{SWING}$. Of
In the limit case of a very short-channel device with a completely saturated carrier velocity, i.e. with $\alpha = 1$ [38] and $K = v_{sat} C_{OX}$ [6], the DC transfer characteristics of an MCML gate can easily be evaluated by solving the usual set of two equations encountered in the well-known analysis of a traditional source-coupled pair, [46],

By expressing $v_{GS}$ as a function of $i_D$ from (2) and substituting it into the first equation in (A1.1), the solution of the set of two equations easily gives the expression of the transistor currents as a function of the input voltage $v_i$

$$i_{D1}(v_i) = \begin{cases} 0 & \text{if } v_i < -\dfrac{I_{SS}}{K \cdot W} \\[2mm] \dfrac{I_{SS}}{2} + K \cdot W \cdot \dfrac{v_i}{2} & \text{if } |v_i| \le \dfrac{I_{SS}}{K \cdot W} \\[2mm] I_{SS} & \text{if } v_i > \dfrac{I_{SS}}{K \cdot W} \end{cases} \tag{A1.2a}$$

from which, considering that $v_{o1} = V_{DD} - R_D i_{D1}$ and $v_{o2} = V_{DD} - R_D i_{D2}$, as well as substituting the voltage gain expression $A_V = K \cdot W \cdot R_D$ (achieved from (4), with $g_m$ equal to $d i_D / d v_{GS} = K \cdot W$ from (2) with $\alpha = 1$) and $V_{SWING}$ by solving (1), the differential output voltage is equal to

$$v_o(v_i) = \begin{cases} \dfrac{V_{SWING}}{2} & \text{if } v_i < -\dfrac{V_{SWING}}{2A_V} \\[2mm] -A_V v_i & \text{if } |v_i| \le \dfrac{V_{SWING}}{2A_V} \\[2mm] -\dfrac{V_{SWING}}{2} & \text{if } v_i > \dfrac{V_{SWING}}{2A_V} \end{cases} \tag{A1.3}$$

which according to Figure 20 is a piece-wise linear curve, as expected due to the linear I-V relationship. From this figure, the critical points that define the noise margin are

$$\begin{aligned} (V_{IL,max}, V_{OH,min}) &= \left(-\frac{V_{SWING}}{2A_V},\ \frac{V_{SWING}}{2}\right) \\ (V_{IH,min}, V_{OL,max}) &= \left(\frac{V_{SWING}}{2A_V},\ -\frac{V_{SWING}}{2}\right) \end{aligned} \tag{A1.4}$$

$$NM = V_{OH,min} - V_{IH,min} = \frac{V_{SWING}}{2}\left(1 - \frac{1}{A_V}\right). \tag{A1.5}$$
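The noise-margin expressions (3), (5) and (A1.5), together with the sizing relations $R_D = V_{SWING}/2I_{SS}$, (6) and (7), can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the operating point and the values of $K$ and $\alpha$ are illustrative placeholders, not data from Table 1.

```python
import math


def noise_margin(v_swing, a_v, beta=1.2):
    """Noise margin NM = (V_SWING / 2) * (1 - beta / A_V), as in (3) and (5).

    beta = sqrt(2) is the long-channel case, beta = 1 the fully
    velocity-saturated case (A1.5), beta = 1.2 the intermediate value of (5).
    """
    return 0.5 * v_swing * (1.0 - beta / a_v)


def vo_short_channel(v_i, v_swing, a_v):
    """Piecewise-linear DC transfer characteristic (A1.3), valid for alpha = 1."""
    v_knee = v_swing / (2.0 * a_v)  # input level at which the current is fully steered
    if v_i < -v_knee:
        return v_swing / 2.0
    if v_i > v_knee:
        return -v_swing / 2.0
    return -a_v * v_i


def size_gate(i_ss, v_swing, a_v, k, alpha):
    """Return (R_D, W): R_D = V_SWING / (2 I_SS) and W from (7)."""
    r_d = v_swing / (2.0 * i_ss)
    w = (2.0 ** (2.0 * alpha - 1.0) / k) * (a_v / (alpha * v_swing)) ** alpha * i_ss
    return r_d, w


def gm_at_threshold(i_ss, w, k, alpha):
    """Transconductance (6), evaluated at i_D = I_SS / 2."""
    return alpha * (i_ss / 2.0) ** (1.0 - 1.0 / alpha) * (k * w) ** (1.0 / alpha)


# Placeholder operating point (illustrative values, not taken from Table 1).
V_SWING, A_V, I_SS = 0.4, 2.0, 50e-6   # V, dimensionless, A
K, ALPHA = 500.0, 1.3                  # A/(m*V^alpha), dimensionless

r_d, w = size_gate(I_SS, V_SWING, A_V, K, ALPHA)
print(f"R_D = {r_d:.0f} ohm, W = {w * 1e6:.2f} um")
print(f"NM  = {noise_margin(V_SWING, A_V) * 1e3:.0f} mV")
print(f"gain check: gm * R_D = {gm_at_threshold(I_SS, w, K, ALPHA) * r_d:.3f}")
```

The last line is a round-trip check: a width obtained from (7), inserted back into (6) and (4), should return the gain that was requested. In a real design the computed width would also be clamped to the minimum value allowed by the technology.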
course, it is set to the minimum value allowed by the technology in the cases where (7) is lower than it.

and the zero). The propagation delay $\tau_{PD}$ of this first-order approximation is equal to, [39],
[Figure 9. Physical implementation of the load resistance: derivation of its lumped circuit model.]
the ON state). Thus, the circuit can be linearized around the logic threshold $v_i = 0$, and the half-circuit concept applies due to the symmetry and the differential signaling. As shown in Figure 10, where the transistor model in Figure 8 is substituted, the linearized half-circuit is a simple common-source circuit. By applying the time-constant method to this circuit (i.e., by evaluating the time constants associated with each capacitance when the others are open-circuited), the time constants $\tau$ and $\tau_z$ in (8) are easily found to be

$$\tau = R_D \left[(C_{db} + C_{gd}) + C_{RD} + C_L\right] = R_D \left(C_{drain} + C_{RD} + C_L\right) \tag{11a}$$

$$\tau_z = -\frac{C_{gd}}{G_M} = -\frac{C_{gd}}{g_m}(0.6 + 0.4\alpha) \tag{11b}$$
To model the effect of the distributed resistance and capacitance associated with the physical layer of the load resistance, we develop an equivalent lumped RC circuit which has approximately the same dynamic behavior. To this aim, divide the strip in Figure 9 into a high number $n$ of small sections, each of which is represented by a lumped resistance $R_D/n$ and a capacitance $C_{R,TOT}/n$ (split into two symmetric contributions $C_{R,TOT}/2n$, according to Figure 9). Thus the distributed RC strip can be described by the ladder network in Figure 9 with $C_0 = C_n = C_{R,TOT}/2n$, $C_1 = C_2 = \ldots = C_{n-1} = C_{R,TOT}/n$, with $C_n$ being short-circuited to ground.

The equivalent impedance $Z_D$ of the RC ladder circuit in Figure 9 can be approximated to a first-order RC circuit with an equivalent time constant $\tau_{eq}$ [47], [48],

$$Z_D(s) = R_D \frac{1 + b_1 s + b_2 s^2 + \ldots}{1 + a_1 s + a_2 s^2 + \ldots} \approx R_D \frac{1}{1 + s\tau_{eq}} \tag{A2.1}$$

which apparently consists of a resistance $R_D$ with a parallel equivalent capacitance $C_R$ such that $\tau_{eq} = R_D C_R$. In (A2.1), the equivalent time constant $\tau_{eq}$ is equal to $a_1 - b_1$ [39], which in turn is easily evaluated through the time-constant method [40], [41]. After simple but tedious calculations, for $n \to \infty$ we obtain $a_1$ and $b_1$ as

$$a_1 = \frac{R_D C_{R,TOT}}{2} \tag{A2.2}$$

$$b_1 = \lim_{n \to \infty} \frac{R_D C_{R,TOT}}{n^2} \sum_{i=1}^{n-1}\left(i - \frac{i^2}{n}\right) = \lim_{n \to \infty} \frac{R_D C_{R,TOT}}{n^2}\left[\frac{n(n-1)}{2} - \frac{1}{n}\left(\frac{(n-1)^3}{3} + \frac{(n-1)^2}{2} + \frac{(n-1)}{6}\right)\right] = \frac{R_D C_{R,TOT}}{6} \tag{A2.3}$$

therefore the equivalent time constant $\tau_{eq}$ is equal to $R_D C_{R,TOT}/3$ (thereby yielding $C_{RD} = C_{R,TOT}/3$).

The equivalent capacitance $C_R$ can be expressed as an explicit function of the bias current by observing that the resistance $R_D$ is equal to $r \cdot L$, $r$ being the resistance per unit length of the considered physical layer and $L$ the strip length. The same observation holds for $C_{R,TOT}$, equal to $c \cdot L$, $c$ being the capacitance per unit length of the considered layer. Accordingly, by expressing the strip length $L$ as $R_D/r$ and substituting the expression $R_D = V_{SWING}/2I_{SS}$, we get the relationships (10).
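The limits in (A2.2) and (A2.3) can be checked by summing the time constants of a finite ladder directly. The sketch below is an illustration under stated assumptions, not the authors' procedure: the node indexing (0 at the AC-grounded end, $n$ at the port) and the function names are choices made here, and `c_rd_from_bias` simply follows the substitution $L = R_D/r$, $R_D = V_{SWING}/2I_{SS}$ described in the text, since the exact form of (10) is not reproduced in this excerpt.

```python
import math


def ladder_first_order(n, r_d=1.0, c_tot=1.0):
    """Return (a1, b1, tau_eq) for an n-section RC ladder model of the resistor.

    Nodes are indexed i = 0..n from the AC-grounded end (i = 0) to the
    port (i = n). Inner nodes carry c_tot / n, the two end nodes c_tot / (2 n);
    the capacitance at the grounded end is shorted and does not contribute.
    """
    c_inner = c_tot / n
    c_end = c_tot / (2.0 * n)
    # a1: port open, so node i sees the resistance r_d * i / n towards ground.
    a1 = sum(r_d * (i / n) * c_inner for i in range(1, n)) + r_d * c_end
    # b1: port shorted, so node i sees the two strip segments in parallel.
    b1 = sum(r_d * (i / n) * (1.0 - i / n) * c_inner for i in range(1, n))
    return a1, b1, a1 - b1


def c_rd_from_bias(i_ss, v_swing, r_per_len, c_per_len):
    """Lumped C_RD = C_R,TOT / 3, with L = R_D / r and R_D = V_SWING / (2 I_SS)."""
    r_d = v_swing / (2.0 * i_ss)
    length = r_d / r_per_len
    return c_per_len * length / 3.0


for n in (2, 10, 100, 1000):
    a1, b1, tau_eq = ladder_first_order(n)
    print(f"n = {n:4d}: a1 = {a1:.6f}, b1 = {b1:.6f}, tau_eq = {tau_eq:.6f}")
```

With $R_D = C_{R,TOT} = 1$ the three printed columns should approach 1/2, 1/6 and 1/3 as $n$ grows, which is the content of (A2.2), (A2.3) and the resulting $C_{RD} = C_{R,TOT}/3$.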
where it was observed that all capacitances see the same resistance $R_D$ in the evaluation of $\tau$, and the sum of $C_{gd}$ and $C_{db}$ was interpreted as the transistor capacitive contribution $C_{drain}$ at the drain node. The (negative) zero time constant in (11b) is that of the well-known common-source circuit, and from (8) tends to increase the delay more significantly in down-scaled technologies.¹ From (8) and (11a)–(11b), the delay $\tau_{PD}$ is equal to

$$\tau_{PD} = 0.69 R_D \left[C_{drain} + \frac{C_{gd}(0.6 + 0.4\alpha)}{A_V} + C_{RD} + C_L\right]. \tag{12}$$

Now, let us consider the explicit dependence of the delay (12) on the bias current $I_{SS}$, considering that in practical designs $R_D = V_{SWING}/2I_{SS}$ as discussed in Section II-B, and NMOS transistors are sized according to (7). Since all NMOS capacitances are proportional to $W$, as pointed out in Section III, from (7) the transistor capacitance $C_{drain}$ ($C_{gd}$) turns out to be proportional to $I_{SS}$ by a constant $C_{drain,N}$ ($C_{gd,N}$) which represents its value per unit current (i.e., $C_{drain} = C_{drain,N} \cdot I_{SS}$ and $C_{gd} = C_{gd,N} \cdot I_{SS}$). By substituting (10), the MCML inverter delay in (12) is equal to

$$\tau_{PD} = 0.35 \cdot V_{SWING}\left[C_{MOSnet,N} + \frac{C_{R,unit}}{I_{SS}^2} + \frac{C_L}{I_{SS}}\right] \tag{13}$$

where the NMOS network capacitive contributions per unit current were lumped into a single contribution $C_{MOSnet,N}$

$$C_{MOSnet,N} = C_{drain,N} + \frac{C_{gd,N}(0.6 + 0.4\alpha)}{A_V}. \tag{14}$$
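The step from (12) to (13) can be made concrete by evaluating both forms over a sweep of the bias current. The following is a minimal sketch under stated assumptions: the function names and every numerical value are hypothetical placeholders, and the relation $C_{RD} = C_{R,unit}/I_{SS}$ used in `delay_eq12` is inferred here from comparing (12) with (13), because (10) itself is not reproduced in this excerpt.

```python
import math


def c_mosnet_n(c_drain_n, c_gd_n, a_v, alpha):
    """NMOS network capacitance per unit current, as in (14)."""
    return c_drain_n + c_gd_n * (0.6 + 0.4 * alpha) / a_v


def delay_eq13(i_ss, v_swing, a_v, alpha, c_drain_n, c_gd_n, c_r_unit, c_l):
    """Inverter delay (13) as an explicit function of the bias current."""
    c_net = c_mosnet_n(c_drain_n, c_gd_n, a_v, alpha)
    return 0.35 * v_swing * (c_net + c_r_unit / i_ss ** 2 + c_l / i_ss)


def delay_eq12(i_ss, v_swing, a_v, alpha, c_drain_n, c_gd_n, c_r_unit, c_l):
    """Inverter delay (12), with R_D, C_drain, C_gd and C_RD rebuilt from I_SS."""
    r_d = v_swing / (2.0 * i_ss)
    c_drain = c_drain_n * i_ss
    c_gd = c_gd_n * i_ss
    c_rd = c_r_unit / i_ss  # assumed form of (10), inferred from (12) and (13)
    return 0.69 * r_d * (c_drain + c_gd * (0.6 + 0.4 * alpha) / a_v + c_rd + c_l)


# Placeholder parameters (illustrative only, not from the paper's tables).
PARAMS = dict(v_swing=0.4, a_v=2.0, alpha=1.3,
              c_drain_n=4e-11,   # F/A
              c_gd_n=1e-11,      # F/A
              c_r_unit=5e-20,    # F*A
              c_l=5e-15)         # F

for i_ss in (5e-6, 20e-6, 100e-6, 500e-6):
    t13 = delay_eq13(i_ss, **PARAMS)
    t12 = delay_eq12(i_ss, **PARAMS)
    print(f"I_SS = {i_ss * 1e6:6.1f} uA: (13) {t13 * 1e12:8.2f} ps, (12) {t12 * 1e12:8.2f} ps")
```

The two forms differ only by the rounding of $0.69/2$ to $0.35$. The sweep also shows the structure of (13): the delay falls as the bias current increases and flattens towards the floor $0.35 \cdot V_{SWING} \cdot C_{MOSnet,N}$, while at low currents the $C_{R,unit}/I_{SS}^2$ term dominates.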
¹ This is because the (overlap) gate-drain capacitance scales more slowly than the other parasitic capacitances, since the direct overlap size cannot scale linearly when reducing the minimum feature size. As another important aspect, the recent adoption of high-κ dielectrics tends to further increase this capacitance [35].
² When $v_i$ is applied to the upper transistors, the capacitances of lower transistors (that have already switched) do not contribute to the overall delay.
$$\tau_{PD} \cong 0.35 \cdot V_{SWING} \cdot \frac{C_{R,unit}}{I_{SS}^2} \tag{24}$$

which shows that in low-power design the logic swing has to be set as low as possible, as in the case of power-efficient design, whereas the voltage gain does not affect the speed performance.
level shifter voltage drop $V_{GS,shift}$ is usually kept very close to the transistor threshold voltage $V_{TH}$ by setting the bias current to a rather low value. In regard to $V_{GS}$, it is the gate-source voltage of a transistor in the ON state, i.e., with a current $I_{SS}$, thus it is obtained by solving (2)

$$V_{GS} = V_{TH} + \left(\frac{I_{SS}}{K \cdot W}\right)^{\frac{1}{\alpha}} = V_{TH} + \frac{1}{2^{2-\frac{1}{\alpha}}} \cdot \frac{\alpha V_{SWING}}{A_V} \tag{26}$$

where (7) was substituted. From (26), in order to reduce the supply voltage, the voltage swing should be kept as low as possible, and the voltage gain should not be too low.

input $C_{in}$ and the two digit inputs $A$ and $B$ [6]. This block is of utmost importance in arithmetic blocks such as adders and multipliers, and its MCML topology is reported in Figure 18 [15]. Its worst-case delay is represented by the case when the maximum number of capacitances switch. From Figure 18, this occurs when the lowest level input $B$ switches and the resulting current is steered to the source-coupled pairs M5-M6 and M9-M10 (or equivalently to M3-M4 and M7-M8), which occurs when $A = 1$ and $C_{in} = 0$ (or $A = 0$ and $C_{in} = 1$). This current path that defines the worst-case delay is depicted with a dashed line in Figure 18.

The delay of the circuit in Figure 18 is given by (13), where $C_{MOSnet,N}$ is easily found by inspection of the worst-case current path. Indeed, the capacitance at the
[1] M. Mizuno et al., “A GHz MOS adaptive pipeline techniques using MOS current-mode logic,” IEEE Journal of Solid-State Circuits, vol. 31, no. 6, pp. 784–791, June 1996.
[2] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current-Mode Logic (CML, ECL and SCL Digital Circuits), Springer, 2005.
[3] M. Alioto and G. Palumbo, “Design strategies for source coupled logic gates,” IEEE Trans. on CAS Part I, vol. 50, no. 5, pp. 640–654, May 2003.
[4] M. Alioto and G. Palumbo, “Power-delay optimization of D-Latch/MUX source coupled logic gates,” International Journal of Circuit Theory and Applications, vol. 33, no. 1, pp. 65–86, Jan./Feb. 2005.
[5] M. Alioto and G. Palumbo, “Oscillation frequency in CML and ESCL ring oscillators,” IEEE Trans. on CAS Part I, vol. 48, no. 2, pp. 210–214, Feb. 2001.
[6] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits (A Design Perspective), Prentice Hall, 2003.
[7] B. Razavi, “Prospect of CMOS technology for high-speed optical communication circuits,” IEEE Jour. of Solid-State Circ., vol. 37, no. 9, pp. 1135–1145, Sep. 2002.
[8] B. Razavi (Ed.), Monolithic Phase-Locked Loops and Clock Recovery Circuits (Theory and Design), IEEE Press, 1996.
[9] C. Hung, B. Floyd, B. Park, and K. O, “Fully integrated 5.35-GHz CMOS VCOs and prescalers,” IEEE Trans. on Microwave Theory and Techniques, vol. 49, no. 1, Jan. 2001.
[10] C. Lam and B. Razavi, “A 2.6-GHz/5.2-GHz Frequency Synthesizer in 0.4-µm CMOS Technology,” IEEE Jour. of Solid-State Circ., vol. 35, no. 5, pp. 788–794, May 2000.
[11] H. Nosaka, K. Isshii, T. Enoki, and T. Shibata, “A 10-Gb/s data-pattern independent clock and data recovery with a two-mode phase comparator,” IEEE Jour. of Solid-State Circuits, vol. 38, no. 2, pp. 192–197, Feb. 2003.
[12] S.-T. Yan and H. Luong, “A 3-V 1.3-to-1.8-GHz CMOS voltage-controlled oscillator with 0.3-ps Jitter,” IEEE Trans. on Circuits and Systems Part II, vol. 45, no. 7, pp. 876–880, July 1998.
[13] B. Razavi, Design of Integrated Circuits for Optical Communications, McGraw-Hill, 2003.
[14] T.H. Lee, The Design of CMOS Radio Frequency Integrated Circuits, Cambridge University Press, 2nd edition, 2003.
[15] J. Musicer and J. Rabaey, “MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environments,” Proc. of ISLPED 2000, pp. 102–107, 2000.
[16] A. Tanabe, M. Umetani, I. Fujiwara, T. Ogura, K. Kataoka, M. Okiara, H. Sakuraba, T. Endoh, and F. Masuoka, “0.18-µm CMOS 10-Gb/s Multiplexer/Demultiplexer ICs Using Current Mode Logic with Tolerance to Threshold Voltage Fluctuation,” IEEE J. of Solid-State Circuits, vol. 36, no. 6, June 2001.
[17] R. Senthinatan and J. Prince, “Application specific CMOS output driver circuit design techniques to reduce simultaneous switching noise,” IEEE Jour. of Solid-State Circuits, vol. 28, no. 12, pp. 1383–1388, Dec. 1993.
[18] S. Maskai, S. Kiaei, and D. Allstot, “Synthesis techniques for CMOS folded source-coupled logic circuits,” IEEE J. Of Solid State Circuits, vol. 27, no. 8, pp. 1157–1167, Aug. 1992.
[19] D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, “Folded source- ... Trans. on CAS Part I, vol. 40, no. 9, pp. 553–563, Sep. 1993.
[20] S. Kiaei, S. Chee, and D. Allstot, “CMOS source-coupled logic for mixed-mode VLSI,” Proc. Int. Symp. Circuits Systems, pp. 1608–1611, 1990.
[21] H. Ng and D. Allstot, “CMOS current steering logic for low-voltage mixed-signal integrated circuits,” IEEE Trans. on VLSI Systems, vol. 5, no. 3, pp. 301–308, Sep. 1997.
[22] B. Stanistic, N. Verghese, R. Rutenbar, L. Carley, and D. Allstot, “Addressing substrate coupling in mixed-mode IC’s: simulation and power distribution synthesis,” IEEE Jour. of Solid-State Circuits, vol. 29, pp. 226–238, Mar. 1994.
[23] International Technology Roadmap for Semiconductors, Available: http://public.itrs.net.
[24] R. Singh (Ed.), Signal Integrity Effects in Custom IC and ASIC Design, IEEE Press, 2002.
[25] B. Del Signore, D. Kerth, N. Sooch, and E. Swanson, “A monolithic 20-b delta-sigma A/D converter,” IEEE J. Solid-State Circuits, vol. 25, pp. 1311–1317, Dec. 1990.
[26] H. Leopold, G. Winkler, P. O’Leary, K. Ilzer, and J. Jernej, “A monolithic CMOS 20-b analog-to-digital converter,” IEEE J. Solid-State Circuits, vol. 26, pp. 910–916, July 1991.
[27] I. Fujimori et al., “A 5-V single chip delta-sigma audio A/D converter with 111 dB dynamic range,” IEEE J. Of Solid State Circuits, vol. 32, pp. 329–336, Mar. 1997.
[28] S. Jantzi and K. Martin, A. Sedra, “Quadrature bandpass modulator for digital radio,” IEEE J. Solid State Circuits, vol. 32, pp. 1935–1949, 1997.
[29] B. Kup, E. Dijkmans, P. Naus, and J. Sneep, “A bit-stream digital-to-analog converter with 18-b resolution,” IEEE J. Solid-State Circuits, vol. 26, pp. 1757–1763, Dec. 1991.
[30] J. Kundan and S. Hasan, “Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits,” IEEE Trans. on CAS Part II, vol. 47, no. 8, pp. 810–817, Aug. 2000.
[31] H. Lee, D. Hodges, and P. Gray, “A self-calibrating 15-bit CMOS A/D converter,” IEEE Jour. of Solid-State Circuits, vol. 19, pp. 813–819, Dec. 1984.
[32] D. Su, M. Loinaz, S. Masui, and B. Wooley, “Experimental results and modeling techniques for substrate noise in mixed-signal integrated circuits,” IEEE Jour. of Solid-State Circuits, vol. 28, pp. 420–430, Apr. 1993.
[33] K. Bernstein et al., High Speed CMOS Design Styles, Kluwer Academic Publishers, 1999.
[34] S. Bruma, “Impact of on-chip process variations on MCML performance,” Proc. IEEE International Systems-on-Chip Conference (SOCC’03), pp. 135–140, 2003.
[35] B.P. Wong, A. Mittal, U. Cao, and G. Starr, Nano-CMOS Circuit and Physical Design, John Wiley & Sons, 2005.
[36] D. Chinnery and K. Keutzer, Closing the Gap between ASIC & Custom, Kluwer Academic Publishers, 2002.
[37] T. Sakurai and A.R. Newton, “Alpha-Power law MOSFET model and its applications to CMOS inverter delay and other formulas,” IEEE Jour. on Solid-State Circuits, vol. 25, no. 2, pp. 584–594, Apr. 1990.
[38] M. Alioto, G. Palumbo, and S. Pennisi, “Modeling of Source Coupled Logic Gates,” International Journal of Circuit Theory and Applications, vol. 30, no. 4, pp. 459–477, 2002.