Transceiver Chip For USB


INTERNATIONAL JOURNAL OF CIRCUIT THEORY AND APPLICATIONS

Int. J. Circ. Theor. Appl. 2015; 43:900–916


Published online 28 February 2014 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cta.1982

A 5-Gbps USB3.0 transmitter and receiver linear equalizer

Nikolaos Terzopoulos, Costas Laoudias, Fotis Plessas*,†, George Souliotis,
Sotiris Koutsomitsos and Michael Birbas
Analogies S.A., Patras Science Park, Patras, Greece, 26504

SUMMARY
A USB3.0 compatible transmitter and the linear equalizer of the corresponding receiver are presented in this
paper. The architecture and circuit design techniques used to meet the strict requirements of the overall link
design are explored. Output voltage amplitude and de-emphasis levels are programmable, whereas the
output impedance is calibrated to 50Ω. A programmable receiver equalizer is also presented with its main
purpose being to compensate for the channel losses; this is employed together with a DC offset compensation
scheme. The 6.25-GHz equalizer provides a 10 dB overall gain equalization and 5.5-dB peaking at the
maximum gain setting. Designed using a mature and well established 65 nm complementary metal oxide
semiconductor process, the layout area is 400 μm × 210 μm for the transmitter core, and 140 μm × 70 μm
for the equalizer core. The power consumption is 55 and 4 mW, respectively, from a 1.2 V supply at a data
rate of 5 Gbps. The target application for such high-speed blocks is to implement the critical part of the
physical layer that defines the signaling technology of SuperSpeed USB3 PHY. However, identical iterations
of the circuitry discussed can be used for similar high-speed applications like the PCI express (PCIe).
Copyright © 2014 John Wiley & Sons, Ltd.

Received 1 August 2013; Revised 30 December 2013; Accepted 2 February 2014

KEY WORDS: serial interface; USB3.0; continuous time equalizer; high-speed transmitter; dc offset
compensation; SERDES

1. INTRODUCTION

SuperSpeed Universal Serial Bus (USB) (also known as USB 3.0) is the next generation of the popular
plug and play USB serial communication specification managed by the USB Implementers Forum [1–3].
USB 3.0 improves upon USB 2.0 in several respects: increased bandwidth (5.0 Gbps data rate), improved
communication (reduced CPU load), lower power consumption, and improved power delivery (7.5 W
of power on a single cable).
At multi-gigabit per second data rates, frequency-dependent losses from the interconnect channel
cause it to act like a lowpass filter, resulting in a reduced data eye opening at the receiver input. For
this reason, USB 3.0 specification [1] dictates the implementation of equalization schemes at both
transmitter and receiver in order to meet system timing and voltage margins. The transmitter
equalization is specified as 3.5 ± 0.5 dB de-emphasis, whereas the receiver equalization scheme can
be provided in the form of a continuous time linear equalizer (CTLE). Low power consumption and
small die area are equally important when silicon implementation issues are taken into account.
This paper describes the design and implementation of a 5 Gbps USB3.0 transmitter and receiver
linear equalizer. Only a small number of similar designs are available in the literature, probably
because of patent regulations and commercial reasons [4–8]. Regarding the transmitter, emphasis has
been given to low-voltage, low-power operation, minimum die area, and impedance calibration to
overcome termination resistor variations caused by different core voltages, temperature and process. To
that end, we have adopted a half-rate architecture, meaning that the only transmitter cell operating
at the maximum data rate of 5 Gbps is the final output driver stage. Compared with full-rate
architectures, the digital cells are far easier to realize, allowing the designer to use the
ready-to-use standard cell libraries for digital design that are provided by the foundries.

*Correspondence to: Fotis Plessas, Analogies S.A., Patras Science Park, Patras, Greece 26504.

On the receiver side, a number of equalizer architectures are available, namely decision-feedback
equalization (DFE) and CTLE with passive or active components, the latter being the simplest and the
most common [9–11]. Because the frequency-dependent losses of the interconnect channel are of low-pass
nature, most equalizers act as high-pass filters, so that the overall channel has a flat
frequency response. This extends the usable bandwidth of wires and circuits, and allows higher link data rates.
In the case of the DFE, in order to cancel the cursor intersymbol interference (ISI), the data
need to be pre-distorted prior to transmission by multiplying them with tap weights that will
be subtracted from the received signal. DFE does not, in general, amplify
high-frequency noise; its most important drawback is the additional difficulty in high-speed
applications, because the feedback loop delay always needs to be half the baud period.
Further merits of analog equalization on the receiver side are smaller die area and high-speed
performance.
In this paper, the receiver's CTLE is realized using a differential pair with resistive loads and an RC
degeneration network that generates a real zero. By modifying the values of the RC
network, the frequency characteristics of the equalizer are programmed, providing tunable frequency
boost for a great variety of backplane traces and target applications.
Another important issue in this kind of application is the DC voltage offsets present in the data
transmission path. These degrade the performance of the receiver by setting a lower bound on the
precision with which a data bit can be measured. To eliminate this problem, a digitally-assisted DC
offset cancellation scheme is employed. The DC offset calibration is performed automatically during
the time intervals in which the receiver is idle, and it occupies less chip area than traditional
offset cancellation techniques while leaving the nominal performance of the equalizer unaffected.
Section 2 presents the system architecture. The implemented circuits are described in Section 3,
whereas the postlayout performance of the design is given in Section 4. Finally, the conclusions are
drawn in Section 5.

2. SYSTEM ARCHITECTURE

Figure 1 shows the system architecture, including the transmitter (TX), the CTLE, the DC offset
compensation circuit, the bandgap voltage reference (BGR) and the Serial Peripheral
Interface (SPI).

Figure 1. System architecture.


The main features of the 5 Gbps transmitter are: low-power operation, owing to its operation from the
core voltage; a serializer architecture based on a standard cell library for easy porting to smaller
process nodes, low power consumption and minimum die area; an output driver with an impedance
calibration mechanism to overcome termination resistor variations caused by different core voltages,
temperature and process; and three levels of de-emphasis equalization to compensate for the channel losses.
The incoming signal, attenuated by the wire, is coupled to the receiver through off-chip capacitors in
order to remove its DC component. The input signal lines are terminated to a reference voltage, vterm,
via on-chip salicided poly resistors.
The CTLE employs a differential pair with resistive loads and an RC degeneration network to generate
a real zero. The RC network is programmable, meaning that it can adapt the frequency amplification
(boosting) to a range of different applications. The equalization circuit is used to cancel the
pre-cursor ISI, because in this channel it is not mandatory to use DFE to cancel
the post-cursor ISI.
The DC offset calibration circuit (DCOC) is coupled to the output of the CTLE in order to control its
DC offsets. The digitally-assisted DC offset cancellation is performed automatically during the time
intervals where the receiver is in idle state.

3. CIRCUIT DESCRIPTIONS

3.1. Transmitter
Designing a low-power, high-speed transmitter is not an easy task when using state-of-the-art deep
submicron process nodes. Given the need of all modern serial interfaces for higher
data rates, today's IC designer faces difficult challenges. In this paper, we address some of these
and provide a step-by-step approach to dealing with high-speed design issues.
Figure 2 shows the top-level block diagram of the proposed high-speed transmitter, which
consists of a 10b/8b frequency converter, a serializer, an equalizer, and an output driver. Each of
these cells is examined separately along with its merits, with emphasis on
ensuring high speed and low power. The extensive use of the standard cell library offers a significant
advantage in terms of power and die area and, of course, easy porting to smaller process geometries.
In the USB3.0 protocol, 10 bits normally arrive from the Physical Coding Sublayer
interface. This number of bits is not well suited to a 'tree' type of
serializer, where the main target is to use as many cells from the standard cell library as possible.
Therefore, we make use of a frequency converter cell in which the 10 bits at 500 Mbps are converted
to a stream of 8 bits at 625 Mbps. It is a synchronous design running at 625 MHz and can comfortably be
designed using the RTL-to-GDS methodology (Verilog).
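The width conversion described above can be illustrated with a short behavioural sketch. This is only a model of the bit regrouping, not the authors' Verilog; the function name `gearbox_10_to_8` and the LSB-first bit ordering are assumptions made for the example.

```python
from typing import Iterable, List


def gearbox_10_to_8(words10: Iterable[int]) -> List[int]:
    """Regroup a stream of 10-bit words into 8-bit words.

    Bits are taken LSB first (an assumption; the paper does not state the order).
    Any trailing bits that do not fill a complete 8-bit word are dropped.
    """
    bits: List[int] = []
    for word in words10:
        bits.extend((word >> i) & 1 for i in range(10))

    usable = len(bits) - len(bits) % 8
    out: List[int] = []
    for k in range(0, usable, 8):
        out.append(sum(bit << i for i, bit in enumerate(bits[k:k + 8])))
    return out


# The aggregate throughput is unchanged by the conversion:
# 10 bits x 500 MHz = 8 bits x 625 MHz = 5 Gbps.
throughput_in = 10 * 500e6
throughput_out = 8 * 625e6
```

Four 10-bit input words map onto exactly five 8-bit output words, which is why the output side must run at 5/4 of the input word rate.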
The 8-bit stream is fed into a tree-type 8:2 serializer as shown in Figure 3 [8]. This type of serializer
is implemented purely with digital blocks from the standard cell library. The maximum frequency of the
D-type flip-flops is equal to the data rate/4 (hence 5 Gbps/4 = 1.25 Gbps or 0.625 GHz in the case of USB
3.0), whereas its maximum operating speed is always half rate, meaning 2.5 Gbps in our case. The
output data stream consists of two lanes, each 5 bits long, where the input 10-bit word is grouped
into odd and even data streams (Deven and Dodd).
This sort of arrangement helps the preparation/equalization of the data prior to their transmission
via the 2:1 multiplexer. The demand for high data rate transmission over long cables with high losses
requires additional equalization at the transmitter end that acts in a complementary way
to the receiver equalization. The widely used de-emphasis compensation technique is
employed at the transmitter side in accordance with the USB3.0 specifications.

Figure 2. Transmitter's block diagram.

Figure 3. 8:2 serializer.

The de-emphasis topology uses a feed-forward equalizer (FFE) that compensates for
the channel losses. A schematic diagram of this type of compensation is illustrated in Figure 4 [8]. The
de-emphasis technique requires the co-existence of two data streams, called cursor and post-cursor.
These data streams are identical to each other; the only difference is a time delay of 1 UI (1 bit
delay, 200 ps for USB3.0 applications) that needs to be enforced between them at all times [1]. The
FFE used in this concept has two data inputs of 2.5 Gbps each (odd and even data
streams) coming from the serializer, which are synchronized with a negative-edge-triggered and a
positive-edge-triggered D-type flip-flop and fed into a 2:1 multiplexer in order to form the cursor
5 Gbps data stream. Simultaneously, the outputs of the first pair of D flip-flops are fed into a second,
identical stage of D flip-flops and a 2:1 MUX, and into another pair of flip-flops, this time with
opposite triggering edges, whose outputs are also driven into a 2:1 multiplexer to form the post-cursor
5 Gbps stream. The cursor and post-cursor data streams are fed into the final driving stage as shown in
Figure 5 [9], so that the 5 Gbps data stream can be transmitted via a 3 m USB cable.

Figure 4. Feed forward equalizer type Equalizer/Synchronizer.
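The cursor/post-cursor arrangement described above corresponds to a two-tap FIR filter. The following behavioural sketch shows how a 3.5 dB de-emphasis maps to tap weights. The ±1 symbol mapping, the normalization c0 + c1 = 1, the treatment of the first symbol, and all function names are assumptions made for illustration; they are not taken from the circuit of Figure 4.

```python
import math
from typing import List, Sequence, Tuple


def mux2to1(d_even: Sequence[int], d_odd: Sequence[int]) -> List[int]:
    """Interleave two half-rate streams (even first) into one full-rate stream."""
    out: List[int] = []
    for even_bit, odd_bit in zip(d_even, d_odd):
        out.extend((even_bit, odd_bit))
    return out


def tap_weights(deemph_db: float = 3.5) -> Tuple[float, float]:
    """Return (c0, c1) with c0 + c1 = 1 and (c0 - c1)/(c0 + c1) = 10^(-dB/20)."""
    ratio = 10 ** (-deemph_db / 20)
    c1 = (1 - ratio) / 2
    return 1 - c1, c1


def deemphasize(bits: Sequence[int], deemph_db: float = 3.5) -> List[float]:
    """Two-tap FFE: y[n] = c0*d[n] - c1*d[n-1], with d in {-1, +1}."""
    c0, c1 = tap_weights(deemph_db)
    cursor = [2 * b - 1 for b in bits]
    # Post-cursor stream = cursor stream delayed by 1 UI.
    # The first symbol is repeated to fill the delay line (assumption).
    post = [cursor[0]] + cursor[:-1]
    return [c0 * d - c1 * p for d, p in zip(cursor, post)]


tx = deemphasize(mux2to1([1, 1, 0], [1, 0, 0]))  # full-rate pattern 1 1 1 0 0 0
```

In this model a transition bit is sent at full amplitude (c0 + c1), while repeated bits are sent at the reduced amplitude (c0 - c1), which is the defining behaviour of de-emphasis.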
Between the two stages (FFE/synchronizer and CML output driver), there is also a
CMOS-to-CML converter stage accompanied by a number of CML pre-drivers/buffers to ensure
proper driving capability. This circuitry is not discussed further
because these are just supporting cells. The 5 Gbps cursor data stream is connected to the main
CML driver (M1 and M2), whereas the post-cursor data stream is connected to the driver that
implements the de-emphasis function (M4 and M5). The advantages of this approach to designing the
CML driver are that it requires only a VDD equal to the core voltage (1.2 V in our case), and that
the DC differential impedance is always constant at 50Ω across voltage, temperature, and process
variations thanks to the impedance calibration mechanism realized by the parallel resistor network
formed by MOS transistors.
Table I shows the simulation results of the DC differential resistance for process, temperature, and
voltage variations (corner simulation). In this type of simulation, the DC differential
resistance is simulated across the slow-slow (SS) and fast-fast (FF) process corners while varying
the power supply by ±10% of the core voltage and the temperature from −40 to +125 °C. The
specification for the output impedance according to [1] is 72–120 Ω. The calibration
mechanism could probably be avoided by using another type of resistor with less resistance
variation across process and temperature (e.g. silicide).

Figure 5. Current-mode logic driver with impedance calibration mechanism.

Table I. DC differential resistance (corner simulations).


Corner        Cnt[4:0]   R (Ω) differential   R (Ω) without calibration
FF_1.32_125   01001      100.6                102.5
FF_1.08_40    00001      101                  163
FF_1.32_40    00001      99.8                 82
FF_1.08_125   01010      103.8                100
SS_1.08_40    01011      99                   118
SS_1.32_40    01001      102                  85
SS_1.08_125   11000      101.6                90
SS_1.32_125   10100      98.28                60
Typ.          01010      102                  103
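
The effect of the calibration can be checked directly against the 72–120 Ω specification quoted above. The short sketch below simply re-enters the Table I values and tests each corner; the variable and function names are chosen for this example only.

```python
SPEC_MIN_OHM, SPEC_MAX_OHM = 72.0, 120.0

# corner: (Cnt[4:0], calibrated R in ohms, uncalibrated R in ohms), from Table I
TABLE_I = {
    "FF_1.32_125": ("01001", 100.6, 102.5),
    "FF_1.08_40": ("00001", 101.0, 163.0),
    "FF_1.32_40": ("00001", 99.8, 82.0),
    "FF_1.08_125": ("01010", 103.8, 100.0),
    "SS_1.08_40": ("01011", 99.0, 118.0),
    "SS_1.32_40": ("01001", 102.0, 85.0),
    "SS_1.08_125": ("11000", 101.6, 90.0),
    "SS_1.32_125": ("10100", 98.28, 60.0),
    "Typ.": ("01010", 102.0, 103.0),
}


def in_spec(r_ohm: float) -> bool:
    """True if the differential resistance lies inside the specified window."""
    return SPEC_MIN_OHM <= r_ohm <= SPEC_MAX_OHM


calibrated_ok = all(in_spec(r_cal) for _, r_cal, _ in TABLE_I.values())
uncalibrated_violations = sorted(
    corner for corner, (_, _, r_raw) in TABLE_I.items() if not in_spec(r_raw)
)
calibrated_spread = (
    min(r_cal for _, r_cal, _ in TABLE_I.values()),
    max(r_cal for _, r_cal, _ in TABLE_I.values()),
)
```

With the tabulated values, every calibrated corner falls inside the window, whereas two uncalibrated corners fall outside it.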


3.2. Continuous time equalizer


In order to cover the wide range of expected applications, two compliance channels are defined for
electrical compliance testing [3]. One of the reference channels is intended to represent a short
channel such as a front panel port, as shown in Figure 6, in which reflections play a stronger role in
determining the performance. Because of the short transmission lines, this configuration presents
a relatively low differential loss.
The second reference channel, also depicted in Figure 6, is intended to represent a long channel, such as a
back panel port in a desktop client, in which the performance is largely determined by channel loss. In
this case, the differential insertion loss of the channel is expected to be in the range of 17 to 18 dB
at the fundamental frequency of the signal (2.5 GHz). The characteristics of a differential backplane
channel, including models of the cable, device and backpanel, are usually represented by 4-port
S-parameters in Touchstone format and can be found at the USB website (www.usb.org). Figure 7
presents the frequency response of the two channel models used in this work for
SuperSpeed compliance testing.
The reference transfer function that is used for compliance tests is provided by the USB3.0 SuperSpeed
equalizer guidelines [1] in terms of a second-order continuous-time linear equalizer and is expressed by (1):

$$H(s) = \frac{A_{dc}\,\omega_{p1}\,\omega_{p2}}{\omega_z} \cdot \frac{s + \omega_z}{(s + \omega_{p1})(s + \omega_{p2})} \qquad (1)$$

Figure 6. Diagram of compliance channels.

Figure 7. Frequency response of compliance channels.


This equalization scheme is recommended in the long channel case, where the channel loss is about
18 dB at 2.5 GHz (Nyquist frequency for a 5 Gbps data transfer rate) and the data eye at the far end of the
link is closed. Thus, the equalizer filter provides adequate gain peaking around 5 GHz so as to
equalize the signal spectrum and recover the transmitted bits.
In the short channel case, where trace lengths are shorter and thus lower attenuation is
introduced, equalization with large gain peaking will result in over-equalization
and subsequently in higher jitter and performance degradation. For that reason, a reference equalizer
that has a lowpass filter response (or a smaller amount of peaking) with a cutoff frequency well
above the fundamental frequency is sufficient. The corresponding parameters for the
equalizer in both cases, along with the reference values, are given in Table II.
Based on Table II, the ideal frequency responses of the reference equalizers in the long and short
channel cases are given in Figure 8. From the phase response, it is apparent that the zero introduces a
phase distortion, though it is negligible in these systems. The adaptation of the equalizer is performed
by altering the time constant of the zero. In Figure 8(a), the peaking gain with respect to the DC gain is
about 7 dB and the slope is 10 dB per decade, whereas in the case where the zero frequency is lower
than the baud rate frequency, a gain increase at that particular baud rate frequency can be
attained.
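The reference response of (1) can be evaluated numerically with the Table II parameters. The sketch below is a direct transcription of the transfer function; the function name `ctle_gain_db` and the choice of evaluating at 2.5 GHz are made for this example.

```python
import math


def ctle_gain_db(f_hz: float, a_dc: float, fz: float, fp1: float, fp2: float) -> float:
    """Magnitude in dB of H(s) = A_dc*wp1*wp2/wz * (s + wz) / ((s + wp1)(s + wp2))."""
    s = 2j * math.pi * f_hz
    wz, wp1, wp2 = (2 * math.pi * f for f in (fz, fp1, fp2))
    h = a_dc * wp1 * wp2 / wz * (s + wz) / ((s + wp1) * (s + wp2))
    return 20 * math.log10(abs(h))


# Reference parameters from Table II
LONG = dict(a_dc=0.667, fz=650e6, fp1=1.95e9, fp2=5e9)
SHORT = dict(a_dc=1.0, fz=650e6, fp1=650e6, fp2=10e9)

dc_gain_long_db = ctle_gain_db(0.0, **LONG)
peaking_long_db = ctle_gain_db(2.5e9, **LONG) - dc_gain_long_db
gain_short_nyquist_db = ctle_gain_db(2.5e9, **SHORT)
```

For the long channel, the gain at 2.5 GHz relative to DC comes out close to the roughly 7 dB of peaking read from Figure 8(a). For the short channel, the zero and the first pole coincide, so the response reduces to a single-pole lowpass with its corner at 10 GHz.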
An equalizing filter has to provide high-frequency gain boosting to compensate for high-frequency
channel losses. It can be implemented with different structures, either by using standard analog filter
design techniques with passive and active components or by means of a digital topology. For
high-speed multi-Gbps data transmission, it becomes more and more difficult to implement a receiver in
the form of an FIR filter, except for DFE, because it must delay, multiply, and add
analog (or multilevel) signals within one bit period (UI). Therefore, continuous-time
equalizers using active components are mostly preferred in receiver implementations nowadays,
because they offer linear equalization, high-frequency operation, and gain greater than
unity, and can easily be integrated in silicon. Other attractive features of this equalizer scheme
are the provision of gain and equalization with low power and area overhead, especially in the case
of inductorless implementations, and the cancellation of both precursor and long-tail ISI. Also, its transfer
function can be made programmable by modifying its circuit characteristics, thus making it
adjustable to process, voltage, and temperature (PVT) variations and different channel losses.
A common method in the design of the analog equalizer [10–13], also followed in this
design, uses a differential pair with resistive loads and an RC degeneration network to generate
a real zero, as shown in Figure 9. In order to ensure a constant frequency response across the whole
signal range, the analog zero balances out the low-pass filter effect of the backplane by boosting
the high frequencies.
At high frequencies, the degeneration capacitor shorts the degeneration resistor and creates
peaking. The peaking and DC gain can be tuned by adjusting the degeneration resistor and
capacitor. In Figure 9, RL, RS, CS, and CL represent the load resistor, the degeneration resistor, the
degeneration capacitor, and the loading from the subsequent stage, respectively. The transfer function
can be derived as

$$H(s) = \frac{g_m}{C_L} \cdot \frac{s + \dfrac{1}{R_S C_S}}{\left(s + \dfrac{1 + g_m R_S/2}{R_S C_S}\right)\left(s + \dfrac{1}{R_L C_L}\right)}$$

Table II. Design parameters of reference continuous-time linear equalizers.

Parameter   Description      Value (long channel)   Value (short channel)
Adc         DC gain          0.667                  1
fz          Zero freq.       650 MHz                650 MHz
fp1         1st pole freq.   1.95 GHz               650 MHz
fp2         2nd pole freq.   5 GHz                  10 GHz


Figure 8. Gain and frequency response of equalizer used in (a) long and (b) short channel case.

The expressions for the zero and the poles of the circuit are given by

$$\omega_z = \frac{1}{R_S C_S}$$

$$\omega_{p1} = \frac{1 + g_m R_S/2}{R_S C_S}$$

$$\omega_{p2} = \frac{1}{R_L C_L}$$

Figure 9. The programmable degeneration RC network.

The gain at the peaking frequency is proportional to the ratio of pole and zero frequencies and can be
approximated by

$$A = A_0 \frac{\omega_{p1}}{\omega_z}$$

where A0 is the low-frequency gain of the CTLE given by

$$A_0 = \frac{g_m R_L}{1 + g_m R_S/2}$$
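
As a worked illustration of the expressions above, the following sketch computes the zero, the poles, the low-frequency gain and the approximate peak gain from a set of component values. The component values are hypothetical, chosen only so that the result lands near the long-channel reference of Table II; they are not the values used in the actual design.

```python
import math


def ctle_params(gm: float, rs: float, cs: float, rl: float, cl: float) -> dict:
    """Zero, poles (in Hz) and gains of the RC-degenerated differential pair."""
    wz = 1 / (rs * cs)
    wp1 = (1 + gm * rs / 2) / (rs * cs)
    wp2 = 1 / (rl * cl)
    a0 = gm * rl / (1 + gm * rs / 2)
    return {
        "fz_hz": wz / (2 * math.pi),
        "fp1_hz": wp1 / (2 * math.pi),
        "fp2_hz": wp2 / (2 * math.pi),
        "a0": a0,
        "a_peak": a0 * wp1 / wz,  # approximate gain at the peaking frequency
    }


# Hypothetical values: gm = 10 mS, RS = 400 ohm, CS = 600 fF, RL = 200 ohm, CL = 50 fF
params = ctle_params(gm=10e-3, rs=400.0, cs=600e-15, rl=200.0, cl=50e-15)
```

Substituting the expressions into each other shows that the approximate peak gain reduces to gm·RL, the gain of the undegenerated pair, while the pole-to-zero ratio 1 + gm·RS/2 sets how far the DC gain is pushed below it.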

In Figure 9, the degeneration RC network is made programmable by utilizing an array of unit
elements and switches controlled by Rs[n:0], Cs[n:0]. By changing the value of the capacitor, the
time constant of the zero can be modified so that the equalizer provides an adaptable frequency
boost for several types of backplanes. The advantage of changing the value of the
capacitor while keeping the resistor fixed is that the low-frequency gain of the equalizer remains
steady. In addition to this capacitor array, a varactor can be placed in parallel to provide
fine and/or coarse correction. The control of the DC gain of the equalizer is
provided by the resistor array. In a similar way, a fine adjustment of the resistor value can be
provided by adding an nMOS transistor in parallel with the resistor array and controlling its gate voltage.
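The statement that the capacitor code moves the zero without disturbing the low-frequency gain can be illustrated with a small sweep. The array model below (a fixed capacitor plus a 3-bit binary-weighted bank of unit capacitors) and all numerical values are assumptions for illustration; the paper does not give the unit sizes.

```python
import math


def cs_from_code(code: int, c_unit: float = 100e-15, c_fixed: float = 200e-15,
                 n_bits: int = 3) -> float:
    """Degeneration capacitance selected by a binary control word (hypothetical array)."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("control code out of range")
    return c_fixed + code * c_unit


def zero_freq_hz(rs: float, cs: float) -> float:
    """Zero frequency 1 / (2*pi*RS*CS)."""
    return 1 / (2 * math.pi * rs * cs)


def dc_gain(gm: float, rl: float, rs: float) -> float:
    """Low-frequency gain gm*RL / (1 + gm*RS/2); independent of CS."""
    return gm * rl / (1 + gm * rs / 2)


GM, RS, RL = 10e-3, 400.0, 200.0  # hypothetical bias point and resistors

sweep = [
    (code, zero_freq_hz(RS, cs_from_code(code)), dc_gain(GM, RL, RS))
    for code in range(8)
]
```

Each increment of the code lowers the zero frequency, while the third column of `sweep` stays constant because the resistor is untouched.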
The bias current and the transistor dimensions of the equalizer in Figure 9 have been chosen
by taking into account the noise, the bandwidth and the design parameters of each transfer
function provided by Table II. Thus, the typical values of the resistor and capacitor of the RC
degeneration network can easily be derived from (1)-(4). The number of elements and,
consequently, the corresponding control bits are chosen according to the PVT variations and the
losses introduced by different channels.
The frequency responses of the CTLE for different selections of control bits are given in Figure 10. The
DC gain and the peaking frequency ranges provided by the use of three control bits for each
element are 9 dB and 3 GHz, respectively. The analog equalizer provides a maximum gain of about
4.5 dB at the Nyquist frequency and 5 dB DC gain. If higher gain is required, a cascade
connection of two or more blocks can be employed. In this application, the equalizer block is
composed of a one-stage equalizing filter, because it is sufficient to cancel the losses at the 2.5 GHz
Nyquist frequency.

3.3. DC offset Calibration


A challenge arising in the design of this kind of application is the minimum detectable differential
signal of the stage following the CTLE, which, as mentioned previously, has quite a low amplitude
because of the channel losses. The subsequent stages are usually clocked circuits with high
sensitivity, such as latch-type sense amplifiers, which detect and amplify the equalized signal
to a level sufficient for reliable operation of the clock and data recovery unit. In this application,
where the channel insertion loss is about 18 dB at 5 Gbps, the amplitude of the differential signal
at the output of the CTLE is about 30 mV, which is almost equal to its input DC offset. Thus, any
DC offset introduced by the CTLE that would further reduce the amplitude of the detectable signal must
be removed.

Figure 10. The frequency responses of the continuous time linear equalizer for different selection of
control bits.

Several approaches have been presented for compensating the DC offset of the equalizer [14, 15]. In
[14], the DC offset cancellation is performed at the input of the equalizer, where offset currents are
dropped across resistors in order to generate the appropriate DC offset correction voltage. This
approach heavily increases the input capacitance of the equalizer, which in most high-speed serial
data communication interface systems must remain under a specified value. In [15], the offset
compensation is realized by injecting a positive or negative differential current at the output of
the CTLE. In order not to affect the nominal transfer function of the CTLE, additional sinking offset
currents are utilized. The drawbacks are the increased power consumption, the complexity and the
mismatch between the sourcing and sinking offset currents.
In the proposed method, the cancellation is performed by feeding offset currents into the differential
outputs of a buffer inserted after the CTLE, where a digital-based DC offset cancellation loop
(DCOC) is also designed to automatically cancel the DC offset of both stages [16, 17]. The
high-bandwidth unity gain buffer, at the cost of a small power consumption
overhead, serves two roles: it avoids loading the output of the CTLE with the additional
current sources, and it allows the gain of the equalizer to be trimmed. The former is very important
because in this way the nominal transfer function of the CTLE remains unaffected. The latter adjusts
the level of the equalizer's amplification and is easily obtained by modifying the load resistance of
the buffer. In order to reduce the input capacitance of the unity gain buffer, the well-known
fT-doubler topology [18] has been employed. Thus, the input capacitance is roughly halved
while the overall transconductance remains unchanged. The conceptual and actual schemes
for cancelling the offset of the CTLE are given in Figure 11.
As shown in Figure 11, the DCOC loop consists of a comparator (1-bit quantizer), the digital control
logic comprised of an up/down counter and the cross-zero detection and finally the digital-to-analog
converter (DAC). The architecture of a current-steering DAC was chosen to convert the digital
control signal to the balanced analog currents Ioffsetp,n. These are consequently injected at the
balanced outputs of the buffer and converted to the equivalent offset correction voltages by its
output resistor loads. Therefore, the balanced outputs of the buffer can be sensed by the comparator.
In order to reduce the offset introduced by the comparator itself, a two-stage open-loop amplifier
with long channel length transistors has been employed. Also, careful layout with common-centroid
technique and matching properties has been followed to minimize the random offsets.


Figure 11. (a) Conceptual scheme for offset cancellation; (b) Actual scheme for generation of the offset
current sources and digitally-assisted calibration loop.

The DC offset calibration range and resolution are related to the number of bits and the LSB current
of the DAC, respectively. The simulated worst-case input-referred total offset introduced by the CTLE
and the buffer,

$$\sigma_{tot} = \sqrt{\sigma_{CTLE}^{2} + \sigma_{BUF}^{2}},$$

is 14.01 mV. This is somewhat high because minimum channel lengths were selected
for both circuits to ensure high-speed operation. Thus, to obtain a
99.7% offset yield, a minimum variation of ±3σtot must be covered by the DC offset calibration
range. In the present design, a total offset cancellation of 114 mV has been selected, thereby covering
more DC offset than the minimum requirement. A 7-bit counter and a 7-bit DAC were used so
that the offset correction at each cycle of the counter remains lower than 1 mV (LSB), where

$$1\,\mathrm{LSB} = \frac{\text{Offset Correction Range}}{2^{\text{Resolution}} - 1} = \frac{114\,\mathrm{mV}}{2^{7} - 1} = 0.897\,\mathrm{mV}$$

Considering that the load resistors in the buffer were selected to be 350 Ω, the correction offset
current at each counter cycle is 0.897 mV / 350 Ω ≈ 2.56 μA.
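The sizing arithmetic above can be reproduced with a short script. This is only a sketch of the calculation: the numerical inputs are taken from the text, while the function names and the reading of the ±3σtot requirement as a total span of 6σtot are assumptions made for illustration.

```python
SIGMA_TOT_MV = 14.01   # worst-case input-referred offset (1 sigma), from the text
RANGE_MV = 114.0       # selected offset-correction range, from the text
N_BITS = 7             # counter and DAC resolution, from the text
R_LOAD_OHM = 350.0     # buffer load resistors, from the text


def lsb_voltage_mv(range_mv: float, n_bits: int) -> float:
    """Voltage step per counter code: range / (2^N - 1)."""
    return range_mv / (2 ** n_bits - 1)


def lsb_current_ua(lsb_mv: float, r_load_ohm: float) -> float:
    """Current step that produces one voltage LSB across the load resistor."""
    return lsb_mv / r_load_ohm * 1e3  # mV / ohm = mA, converted to uA


# Assumption: the +/-3 sigma requirement is read as a total span of 6 sigma.
required_span_mv = 6 * SIGMA_TOT_MV
lsb_mv = lsb_voltage_mv(RANGE_MV, N_BITS)
lsb_ua = lsb_current_ua(lsb_mv, R_LOAD_OHM)

print(f"required span : {required_span_mv:.2f} mV (selected {RANGE_MV:.0f} mV)")
print(f"1 LSB         : {lsb_mv:.3f} mV")
print(f"LSB current   : {lsb_ua:.2f} uA")
```

The exact quotient 114/127 is about 0.8976 mV, which the text quotes as 0.897 mV; dividing by 350 Ω gives the 2.56 μA step.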
The abstract view and the topology of the 7-bit binary-weighted current-steering DAC are given in
Figure 12. The reference current source used in the current-steering DAC is derived from a bandgap
reference voltage topology, and thus is constant over PVT variations.
Figure 13 shows an example of the DC-offset calibration procedure. Initially, the up/down counter is
set to half of full scale such that the DAC output currents are equal. If the output of the comparator is
low, which means that the DC level of the negative output is higher than that of the positive output, the
counter starts counting up, reducing the DC offset between the two branches in each cycle until the
total offset voltage is cancelled. At that point, the comparator output goes high, triggering the
zero-cross detection block, which stops the counter and generates a ‘stop-flag’ signal denoting the end
of the process. That signal is also used to shut down the comparator to save power. After this
procedure, the total DC offset is reduced to its minimum value, which is limited by the LSB
current of the DAC. In this example, a 40 MHz reference clock was used, the
initial DC offset was 88.49 mV, and the offset remaining after 1.459 μs of calibration is 0.84 mV.
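The calibration sequence described above can be sketched as a small behavioral model. This is an idealized illustration, not the authors' circuit: it assumes an offset-free comparator, a correction of exactly one LSB per clock cycle, and a hypothetical sign convention (offset = positive output minus negative output). The −40 mV starting offset is an arbitrary example and is not intended to reproduce the values of Figure 13.

```python
from dataclasses import dataclass

N_BITS = 7
LSB_MV = 114.0 / (2 ** N_BITS - 1)   # about 0.897 mV per count, from the text
F_CLK_HZ = 40e6                      # reference clock, from the text
MID_SCALE = 2 ** N_BITS // 2         # counter starts at half of full scale
MAX_CODE = 2 ** N_BITS - 1


@dataclass
class CalResult:
    code: int           # final counter value
    cycles: int         # clock cycles used
    residual_mv: float  # offset left after calibration
    time_us: float      # calibration time


def calibrate(initial_offset_mv: float) -> CalResult:
    """Step the counter until the comparator decision flips (zero cross)."""

    def offset_at(code: int) -> float:
        # Assumed model: each count shifts the differential output by one LSB.
        return initial_offset_mv + (code - MID_SCALE) * LSB_MV

    code = MID_SCALE
    first_decision = offset_at(code) > 0      # comparator high if positive
    step = -1 if first_decision else 1        # low output -> count up
    cycles = 0
    while 0 <= code + step <= MAX_CODE:       # stop if the counter saturates
        code += step
        cycles += 1
        if (offset_at(code) > 0) != first_decision:
            break                             # zero cross: raise the stop flag
    return CalResult(code, cycles, offset_at(code), cycles / F_CLK_HZ * 1e6)


result = calibrate(-40.0)
print(result)
```

In this model the residual offset is always smaller than one LSB when the loop terminates on a zero cross, which matches the statement that the final accuracy is limited by the LSB current of the DAC.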
Monte Carlo simulations have also been performed with and without the DCOC loop; the results
are shown in Figure 14.
The DCOC calibration is performed when the receiver is powered on and before the first symbol of
data is received. Because the voltages drift with temperature, the DC-offset calibration is
re-executed repeatedly during the time intervals in which the receiver is in an idle state. The power
overhead of the proposed offset cancellation scheme is 528 μW for the maximum DC offset, where
all the current sources of the DAC are needed. At the end of the calibration procedure, the
comparator is powered down to reduce the power consumption by 30%, whereas the DAC
remains powered on in order to keep the offset currents active.

Figure 12. (a) Abstract view and (b) topology of 7-bit binary-weighted current-steering digital-to-analog converter.

Figure 13. Post-layout simulation result of DC offset calibration circuit loop.

3.4. Bandgap Reference


The BGR creates a constant voltage of 500 mV. The circuit is similar to that in [19] but, owing to
a careful redesign, shows improved performance. The main difference is that
simple two-stage operational amplifiers have been used instead of those reported in [19]. This
makes the design more compact in terms of layout (die) area. The precise value of Vref for the
typical case at room temperature is 500.2 mV. The circuit has been simulated for a range of

Figure 14. Monte Carlo simulation results (a) before and (b) after the DC offset calibration circuit loop.

temperatures, supply voltages, and process variations. The range of temperature used in the
simulations is from −40 to 125°C, the supply voltage is 1.2 V with variations of ±10% and the
process parameters are for typical, SS, and FF models. The simulated results are shown in
Figure 15. The variation of Vref is less than ±0.06% for typical case models and temperatures
from −40 to 125°C. The total Vref variation for all corners is in the range of +0.6% to −1.8%.
Figure 15 shows that Vref deviates substantially from the average value in only one extreme
case, namely the SS model with a supply voltage of 1.08 V and a temperature less than
20°C. Unless all of these extreme conditions occur simultaneously,
the variation is much smaller. Compared with the results presented in [19], this design exhibits
better performance. This could be explained by the careful electrical and layout design, together
with the slightly higher available supply voltage.
A constant current reference is also designed as part of the BGR. The constant current reference Iref
is 100.5 μA. The Iref variation is less than ±0.3% for the typical models and ±2% over all
corners and the full temperature range, as shown in Figure 15.
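To put the quoted percentages into absolute terms, the tolerances can be converted to bounds around the nominal values. The sketch below uses only the figures stated in the text (Vref = 500.2 mV within ±0.06% for typical models, Iref = 100.5 μA within ±2% over all corners); the helper name `bounds` is hypothetical.

```python
def bounds(nominal: float, tol_percent: float) -> tuple[float, float]:
    """Return (low, high) limits for a symmetric percentage tolerance."""
    delta = nominal * tol_percent / 100.0
    return nominal - delta, nominal + delta


# Vref, typical models over temperature (upper limit of the stated variation)
vref_lo, vref_hi = bounds(500.2, 0.06)   # mV
# Iref, all corners and temperatures
iref_lo, iref_hi = bounds(100.5, 2.0)    # uA

print(f"Vref: {vref_lo:.2f} to {vref_hi:.2f} mV")
print(f"Iref: {iref_lo:.2f} to {iref_hi:.2f} uA")
```

For the typical models this corresponds to a Vref spread of roughly ±0.3 mV, and to an Iref spread of about ±2 μA across all corners.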

3.5. Serial Peripheral Interface


USB 3.0 features such as transmitter de-emphasis and low-power operation mode as well as RX
equalizer settings can be configured through the on-chip SPI interface on power-up. The same block

Figure 15. (a) Voltage and (b) current reference.

also controls the system power up sequence and the entry into the available test modes, which are also
configurable through SPI.

4. SYSTEM PERFORMANCE

The proposed USB3 transmitter and equalizer are designed in TSMC CMOS 65 nm technology. The
layout is shown in Figure 16 and occupies a total area of 1315 μm × 895 μm including the bandgap,
the SPI and the PADs. It consumes 59 mW under normal operation. The transmitter consumes

Figure 16. Layout diagram

Figure 17. Eye diagram at the output of the reference continuous time linear equalizer (typical case).

55 mW (26 mW in the main CML driver; 22 mW in the pre-drivers, level shifters and the de-emphasis
driver; and 7 mW in the purely digital blocks), whereas the CTLE consumes 4 mW.
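As a quick consistency check, the power figures quoted above can be summed and expressed as shares of the total. The numbers are taken from the text; the dictionary keys are descriptive labels chosen here for illustration.

```python
# Transmitter power breakdown in mW, as quoted in the text.
tx_breakdown_mw = {
    "main CML driver": 26,
    "pre-drivers, level shifters, de-emphasis driver": 22,
    "digital blocks": 7,
}
ctle_mw = 4

tx_total_mw = sum(tx_breakdown_mw.values())
total_mw = tx_total_mw + ctle_mw

for block, power in tx_breakdown_mw.items():
    print(f"{block:<50s} {power:3d} mW  ({100 * power / total_mw:.1f}% of total)")
print(f"{'CTLE':<50s} {ctle_mw:3d} mW  ({100 * ctle_mw / total_mw:.1f}% of total)")
print(f"transmitter total: {tx_total_mw} mW, system total: {total_mw} mW")
```

The three transmitter contributions add up to the stated 55 mW, and together with the CTLE they give the stated 59 mW.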
As required by the USB3 compliance methodology document [3], the proper functionality of
the transmitter at the 5 Gbps data rate has to be verified with a 3 m USB cable and an ideal CTLE
(reference CTLE) connected at the other end of the channel. As shown in the eye diagrams of
Figures 17 and 18, the eye height and the eye opening meet the minimum and maximum target
specifications defined in Table III.
The compliance test report for the RX is summarized in Table IV, whereas the eye diagrams with a
5 Gbps input signal after the cable and after the equalizer are shown in Figure 19.
Finally, the system performance summary is provided in Table V.

Figure 18. Eye diagrams at the output of the reference continuous time linear equalizer (corner cases).

Table III. Transmitter compliance report.

Parameter                               Actual value (min-max)^a   USB3.0 spec range
Output swing (V)                        0.9-1.15                   0.8-1.2
Eye height (mV)                         130.7-180                  100 (min)
Total jitter (ps)                       29-50                      0.66 UI or 132 ps
Differential output impedance (Ohm)     99-103.8                   72-120
De-emphasis (dB)                        3.0-4.0                    3.0-4.0

^a Slow-slow or fast-fast process corner, 1.08 to 1.32 V, −40°C to 125°C
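The entries of Table III can be checked against the specification limits programmatically. The sketch below is illustrative only: it assumes that the jitter figure of 0.66 UI is an upper limit and that the eye-height figure is a lower limit, and it derives the unit interval from the 5 Gbps data rate. The tuple layout and the helper `within` are hypothetical.

```python
BIT_RATE_BPS = 5e9
UI_PS = 1e12 / BIT_RATE_BPS          # unit interval: 200 ps at 5 Gbps
JITTER_LIMIT_PS = 0.66 * UI_PS       # 0.66 UI expressed in ps

# (parameter, simulated min, simulated max, spec min, spec max); None = no limit
table_iii = [
    ("Output swing (V)", 0.9, 1.15, 0.8, 1.2),
    ("Eye height (mV)", 130.7, 180.0, 100.0, None),
    ("Total jitter (ps)", 29.0, 50.0, None, JITTER_LIMIT_PS),
    ("Differential output impedance (Ohm)", 99.0, 103.8, 72.0, 120.0),
    ("De-emphasis (dB)", 3.0, 4.0, 3.0, 4.0),
]


def within(lo, hi, spec_min, spec_max) -> bool:
    """True if the simulated [lo, hi] interval lies inside the spec window."""
    ok_min = spec_min is None or lo >= spec_min
    ok_max = spec_max is None or hi <= spec_max
    return ok_min and ok_max


results = {name: within(lo, hi, smin, smax)
           for name, lo, hi, smin, smax in table_iii}

for name, passed in results.items():
    print(f"{name:<40s} {'PASS' if passed else 'FAIL'}")
```

With a 200 ps unit interval, 0.66 UI corresponds to the 132 ps quoted in the table.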

Table IV. Receiver compliance report.

Parameter                                     Value (min-max)^a   USB3.0 spec range
Receiver DC common mode impedance (Ohm)       21.02-26.1          18-30
DC differential impedance (Ohm)               84.1-104.4          72-120
Differential Rx peak-to-peak voltage (mV)     58.29-155.1         30 (min)
Eye width (ps)                                151.9-163.8

^a Slow-slow or fast-fast process corner, 1.08 to 1.32 V, −40°C to 125°C
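The common-mode and differential impedance values can be cross-checked against each other. The sketch below assumes a simple termination model that is not stated in the paper: two matched single-ended resistors R, so that the differential impedance is 2R and the common-mode impedance is R/2, giving a ratio of 4. It also assumes that the minimum values in Table IV belong to the same corner, and likewise the maximum values; the third pair uses the typical values of Table V.

```python
# (differential impedance, common-mode impedance) in ohms
pairs = {
    "Table IV minimum": (84.1, 21.02),
    "Table IV maximum": (104.4, 26.1),
    "Table V typical": (93.9, 23.4),
}


def single_ended_from_diff(z_diff: float) -> float:
    """Per-leg termination implied by the differential value (Zdiff = 2R)."""
    return z_diff / 2.0


def common_mode_from_single(r: float) -> float:
    """Common-mode impedance of two matched legs in parallel (Zcm = R/2)."""
    return r / 2.0


ratios = {}
for label, (z_diff, z_cm) in pairs.items():
    r = single_ended_from_diff(z_diff)
    predicted_cm = common_mode_from_single(r)
    ratios[label] = z_diff / z_cm
    print(f"{label}: R = {r:.2f} ohm, predicted Zcm = {predicted_cm:.2f} ohm, "
          f"reported Zcm = {z_cm:.2f} ohm, ratio = {ratios[label]:.3f}")
```

Under these assumptions the reported pairs all give a ratio close to 4, which suggests that the two sets of figures describe the same termination network.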

Figure 19. Eye diagrams with a 5 Gbps input signal (a) after cable (b) after equalizer.

Table V. System performance summary.

Parameter                                     Value (typical conditions)^a
Process technology                            65 nm CMOS
Power supplies                                1.2 and 1.8 V
Data rate                                     5 Gbps
Dimensions (μm)                               700 × 350 (without pads)
Receiver equalization                         CTLE
Transmitter equalization                      Cursor/post cursor
Output swing (V)                              1.093
Eye height (mV)                               149.6
Total jitter (ps)                             51.09
TX differential impedance (Ohm)               102
RX differential impedance (Ohm)               93.9
RX common mode impedance (Ohm)                23.4
Differential RX peak-to-peak voltage (mV)     145.7
Power consumption (mW)                        59
ESD                                           2 kV (HBM), 500 V (CDM)

^a Typical process corner, 1.2 V, 27°C

5. CONCLUSIONS

A 5 Gbps transmitter and receiver linear equalizer designed in TSMC 65 nm CMOS technology have been
presented. USB3.0 compliance tests for both the transmitter and the CTLE have been performed and
passed, as shown by the simulation results. The transmitter includes an impedance calibration mechanism
to overcome termination resistor variations due to core voltage, temperature and process, and
three levels of de-emphasis equalization to compensate for the channel losses. The DC output swing is
1.093 V, the eye height is 149.6 mV, and the total jitter is 51.09 ps. At the receiver, the CTLE is employed
together with a DC offset compensation scheme to cope with the channel losses. The DC differential
impedance is 93.9 Ω and the differential RX peak-to-peak voltage is 145.7 mV. The proposed architecture
occupies an area of 700 μm × 350 μm (without pads) and consumes 59 mW from a 1.2 V core supply.

REFERENCES
1. “Universal serial bus 3.0 Specification” Revision 1.0, 2011
2. “PHY Interface For PCI Express, SATA, and USB 3.0 Architectures”, Version 4.0, 2011, Intel Corporation.
http://www.intel.com/content/www/us/en/io/pci-express/phy-interface-pci-express-sata-usb30-architectures.html
3. “USB Super Speed Electrical Compliance Methodology”, Revision 0.5, 2009.
http://www.usb.org/developers/whitepapers/USB_3_0_e-Compliance_methodology_0p5_whitepaper.pdf
4. Lin M-S. et al. A 5Gb/s low-power PCI express/USB3.0 ready PHY in 40nm CMOS technology with high-jitter
immunity. IEEE Asian Solid-State Circuits Conference, 2009; 177–180.

5. Palermo S, Song Y-H. A 6-Gbit/s hybrid voltage-mode transmitter with current-mode equalization in 90-nm CMOS.
IEEE Transactions on Circuits and Systems II 2012; 59(8):491–495.
6. Higashi H, et al. 5-6.4 Gbps 12 channel transceiver with pre-emphasis and equalizer. Symposium on VLSI Circuits,
2004; 130–133.
7. Lin C-H, Wang C-H, Jou S-J. 5 Gbps serial link transmitter with pre-emphasis. Asia and South Pacific Design
Automation Conference, 2003; 795–800.
8. Stauffer D-R, Mechler J-T, Sorna M, Dramstad K, Ogilvie C-R, Mohammad A. High Speed Serdes Devices and
Applications. Springer: New York, 2008.
9. Yuan F. CMOS Current–Mode Circuits for Data Communications. Springer: New York, 2007.
10. Choi J-S, Hwang M-S, Jeong D-K. A 0.18-μm CMOS 3.5-Gb/s continuous-time adaptive cable equalizer using
enhanced low-frequency gain control method. IEEE J. Solid-State Circuits 2004; 39(3a):419–425.
11. Gondi S, Razavi B. Equalization and clock and data recovery techniques for 10-Gb/s CMOS serial-link receiver.
IEEE Journal of Solid State Circuits 2007; 42(9):1999–2011.
12. Hsieh C-L, Liu S-I. A 40 Gb/s decision feedback equalizer using back-gate feedback technique. Symposium on VLSI
Circuits Dig. Tech. Papers, 2009; 218–219.
13. Cheng K-H, Tsai Y-C, Wu Y-H, Lin Y-F. A 5Gb/s inductorless CMOS adaptive equalizer for PCI express generation
II applications. IEEE Transactions on Circuits and Systems II 2010; 57(5):324–328.
14. Chen J, Saibi F, Lin J, Azadet K. Electrical backplane equalization using programmable analog zeros and folded
active inductors. IEEE Transactions on Microwave Theory and Techniques 2007; 55(7):1459–1466.
15. Balamurugan G, Kennedy J, Banerjee G, Jaussi J, Mansuri M, O’Mahony F, Casper B, Mooney R. A scalable
5-15Gbps, 14-75mW low-power I/O transceiver in 65nm CMOS. IEEE Journal of Solid State Circuits 2008;
43(4):1010–1019.
16. Yu C-G, Geiger R. An automatic offset compensation scheme with ping-pong control for CMOS operational
amplifiers. IEEE Journal of Solid State Circuits 1994; 29(5):601–610.
17. Shih H-Y, Kuo C-N, Chen W-H, Yang T-Y, Juang K-C. A 250MHz 14 dB-NF 73 dB-gain 82dB-DR analog
baseband chain with digital-assisted DC-offset calibration for ultra-wideband. IEEE Journal of Solid State Circuits
2010; 45(2):338–350.
18. Razavi B. Design of Integrated Circuits for Optical Communications. McGraw-Hill: New York, 2002.
19. Tsitouras A, Plessas F, Birbas M, Kikidis J, Kalivas G. A sub-1V supply CMOS voltage reference generator.
International Journal of Circuit Theory and Applications 2012; 40(8):745–758.
