Clock and Data Recovery II: Properties, Specifications, and Considerations
With the fundamental CDR knowledge developed in Chapter 9, we are now ready for advanced features. We begin with the capture range of different PDs and then investigate frequency acquisition techniques. Three important jitter specifications, namely jitter transfer (JTRAN), jitter tolerance (JTOL), and jitter generation (JG), are studied thoroughly.
Similarly, by gradually tuning the data rate, the lock (tracking) range can be obtained. The frequency acquisition loop must be switched off while conducting these experiments.
[Figure: capture range and lock (tracking) range on the frequency axis f.]
We first look at the capture range of linear, PLL-based CDRs. Fig. 2(a) redraws the model for the sake of convenience. Recall that the linear transfer function is given by
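$$\frac{\Phi_{out}}{\Phi_{in}}(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}, \tag{10.1}$$
where
$$\omega_n = \sqrt{\frac{I_p K_{VCO}}{2\pi C_p}} \tag{10.2}$$
$$\zeta = \frac{R_p}{2}\sqrt{\frac{I_p C_p K_{VCO}}{2\pi}}. \tag{10.3}$$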
Suppose the loop is locked properly for t < 0, and the data rate (ωDR = 2πRD = 2π/Tb )
jumps abruptly from ωDR to ωDR + ∆ω at t = 0. The output phase tracks the sharper curve of
(ωDR + ∆ω)t immediately in order to minimize the phase error and get back to lock [Fig. 2(b)].
However, for the loop to relock, the maximum phase deviation ∆Φmax must not exceed 2π. Since
Φin (t) = (ωDR + ∆ω)t, we obtain Φout (t) by taking the inverse Laplace transform
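$$\Phi_{out}(t) = \frac{\Delta\omega}{2\omega_n\sqrt{\zeta^2-1}}\left(e^{k_1 t} - e^{k_2 t}\right) + (\omega_{DR} + \Delta\omega)t, \tag{10.4}$$
where $k_1 = -\omega_n(\zeta + \sqrt{\zeta^2-1})$ and $k_2 = -\omega_n(\zeta - \sqrt{\zeta^2-1})$.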
It can be clearly shown that ∆Φmax occurs at t1 , where dΦout /dt = ωDR + ∆ω.
It follows that
$$t_1 = \frac{1}{2\omega_n\sqrt{\zeta^2-1}}\ln\left[\frac{\zeta + \sqrt{\zeta^2-1}}{\zeta - \sqrt{\zeta^2-1}}\right]. \tag{10.5}$$
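To ensure relocking, we must have ∆Φmax = (ωDR + ∆ω)t1 − Φout(t1) < 2π, that is,
$$\frac{\Delta\omega}{\omega_n\left(\zeta + \sqrt{\zeta^2-1}\right)}\,\underline{\left[\frac{\zeta - \sqrt{\zeta^2-1}}{\zeta + \sqrt{\zeta^2-1}}\right]^{\frac{\zeta - \sqrt{\zeta^2-1}}{2\sqrt{\zeta^2-1}}}} < 2\pi. \tag{10.6}$$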
[Fig. 10.2: (a) Linear PLL-based CDR model (linear PD, CP, Rp, Cp, VCO), (b) input phase (slope ωDR + ∆ω) and output phase after a data-rate step, with the maximum deviation ∆Φmax occurring at t1.]
For regular wireline systems (especially long haul), we have ζ ≫ 1 and thus $\sqrt{\zeta^2-1} \approx \zeta$. The underlined part of the above inequality approaches unity. The capture range is therefore given by
$$|\Delta\omega| < 2\pi\cdot 2\zeta\omega_n = 2\pi\cdot\omega_{-3\mathrm{dB}}. \tag{10.7}$$
That is, the capture range of linear PLL-based CDRs is on the order of the loop bandwidth.
The absolute value accounts for both sides of deviation. In reality, the capture range would be
somewhat smaller than that as PDs can hardly reach ±2π operation region.
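As a numerical cross-check of Eqs. (10.4) to (10.7), the following sketch evaluates the peak phase error at t1 and compares the resulting capture range with the overdamped approximation 2π · 2ζωn. The values of ωn and ζ are hypothetical and chosen only for illustration.

```python
import math

def max_phase_error(delta_w, wn, zeta):
    """Peak of Phi_in - Phi_out after a frequency step delta_w (rad/s); needs zeta > 1."""
    root = math.sqrt(zeta ** 2 - 1.0)
    k1 = -wn * (zeta + root)
    k2 = -wn * (zeta - root)
    # Time of the peak deviation, Eq. (10.5)
    t1 = math.log((zeta + root) / (zeta - root)) / (2.0 * wn * root)
    # Phase error from Eq. (10.4): Phi_in(t1) - Phi_out(t1)
    return -delta_w / (2.0 * wn * root) * (math.exp(k1 * t1) - math.exp(k2 * t1))

def capture_range(wn, zeta):
    """Largest |delta_w| (rad/s) keeping the peak phase error below 2*pi."""
    # The peak error scales linearly with delta_w, so normalize to 1 rad/s.
    return 2.0 * math.pi / max_phase_error(1.0, wn, zeta)

wn = 2.0 * math.pi * 1e6      # hypothetical natural frequency, rad/s
zeta = 5.0                    # hypothetical damping factor
exact = capture_range(wn, zeta)
approx = 2.0 * math.pi * 2.0 * zeta * wn   # Eq. (10.7)
print(f"exact/approx = {exact / approx:.3f}")
```

For these values the two results differ by a few percent, and the gap shrinks as ζ grows.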
Example 10.1
How do we calculate the capture range of a bang-bang CDR? The nonlinear behavior of the loop prevents us from doing s-domain analysis. However, judging from the model in Fig. 3, we can still estimate the capture range. Suppose the data rate suddenly jumps by ∆ω. The edge sampling point no longer stays at transition point 1, but shifts to point 2 in the next bit [Fig. 3(b)]. The bang-bang PD and the loop react immediately, creating a temporary boost of Ip Rp on the VCO control line. Note that Cp for a bang-bang CDR is typically very large, so the voltage across Cp barely changes over a period as short as a few bits. The control-line voltage step Ip Rp translates to a VCO frequency step of KVCO Ip Rp. Whether the loop can return to lock depends on the relationship between ∆ω and KVCO Ip Rp. That is, for the loop to relock, we must have
$$|\Delta\omega| < K_{VCO} I_p R_p.$$
[Fig. 10.3: (a) Bang-bang CDR model (BBPD, CP, Rp, Cp, VCO) with a binary PD characteristic of ±Ip, (b) movement of the sampling point for the return-to-lock and loss-of-lock cases.]
As illustrated in Fig. 3(b), the sampling point would gradually move toward the right position (i.e., point 1). On the other hand, if the data rate deviation exceeds KVCO Ip Rp, the loop would eventually fall out of lock (and the sampling point keeps moving away from the right position). We conclude that the capture range of such a bang-bang CDR is given by KVCO Ip Rp. Recall from our discussion in Chapter 9 that this capture range is still commensurate with the loop bandwidth of a bang-bang CDR. Note that the effect of random data has been included in the model (e.g., in the BBPD's characteristic), which presents average results only. The actual case may vary to some extent depending on the data pattern.
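The relock condition can be illustrated with a minimal behavioral sketch. It assumes an idealized sign-only PD, a frozen voltage on Cp, and one decision per bit; the function name and all numerical values are hypothetical.

```python
import math

def relocks(delta_w, k_bb, tb, n_bits=20000, limit=math.pi):
    """Return True if the phase error stays within +/-limit after a frequency step.

    delta_w : data-rate step in rad/s
    k_bb    : K_VCO * Ip * Rp, the bang-bang frequency step in rad/s
    tb      : bit period in seconds
    """
    phase_err = 0.0  # Phi_in - Phi_out, in radians
    for _ in range(n_bits):
        # Sign-only PD: speed the VCO up when the clock lags, slow it down otherwise.
        correction = k_bb if phase_err > 0.0 else -k_bb
        phase_err += (delta_w - correction) * tb
        if abs(phase_err) > limit:
            return False  # sampling point has drifted away: loss of lock
    return True

tb = 100e-12                    # 10-Gb/s bit period
k_bb = 2.0 * math.pi * 5e6      # hypothetical K_VCO*Ip*Rp
print(relocks(0.8 * k_bb, k_bb, tb), relocks(1.2 * k_bb, k_bb, tb))
```

With a step smaller than KVCO Ip Rp the phase error dithers around zero, whereas a larger step makes it grow without bound.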
Example 10.2
Determine the capture range of the all-digital bang-bang CDR shown in Fig. 9.20.
Solution:
For an instantaneous data rate change, the loop creates an instantaneous frequency boost of 0.5 · K1 · KDCO, which must be greater than ∆ω in order to relock. The capture range is simply given by
$$|\Delta\omega| < \frac{1}{2}\cdot K_1\cdot K_{DCO}. \tag{10.11}$$
It could also be verified from Example 9.1, where Ip Rp of the analog design is replaced by 0.5 · K1 of the all-digital approach.
Other than the PLL-based CDRs, it is instructive to look at the frequency offset issue for PI-based CDRs. Take the all-digital PI-based CDR in Fig. 9.23 as an example. If a finite frequency offset exists between the data rate (RD) and the reference PLL clock, the phase interpolator keeps rotating either clockwise or counterclockwise so as to "track" the phase. Once again, the loop may go out of lock if the frequency offset exceeds a certain value. This phenomenon resembles the behavior of the capture range in PLL-based CDRs.
To analytically estimate the maximum tolerable frequency offset, we follow the notation of Fig. 9.23. In the presence of a frequency offset, the phase deviation in each bit would be 2π × ∆ω/ωDR, where ωDR = 2πRD = 2π/Tb. On the other hand, if the digital loop filter takes TDLF to update one output, the maximum phase difference it could pursue in each bit is 0.5 K2 KPI Tb/TDLF. To relock the loop, we need
$$\frac{\Delta\omega}{\omega_{DR}}\times 2\pi < \frac{0.5 K_2 K_{PI} T_b}{T_{DLF}}. \tag{10.12}$$
It follows that
$$|\Delta\omega| < \frac{K_2 K_{PI}}{2 T_{DLF}}. \tag{10.13}$$
This is of course an optimistic estimate. CDRs with crystal oscillators as references would present a few hundred ppm of frequency offset in practice, which is harmless in most applications. The reader can prove that the frequency offset issue in oversampling CDRs can be analyzed with a similar approach.
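To get a feel for the numbers, Eq. (10.13) can be evaluated and expressed in ppm of the data rate. The sketch below assumes that the product K2 KPI is a phase step in radians per loop-filter update; the example values (a 1/64-UI step and an 8-bit update interval at 10 Gb/s) are hypothetical.

```python
import math

def max_offset_ppm(k2_kpi, t_dlf, data_rate):
    """Maximum tolerable frequency offset from Eq. (10.13), in ppm of the data rate.

    k2_kpi    : product K2*K_PI, phase step in radians per update (assumed unit)
    t_dlf     : digital loop filter update interval in seconds
    data_rate : R_D in bit/s
    """
    max_dw = k2_kpi / (2.0 * t_dlf)          # rad/s, Eq. (10.13)
    w_dr = 2.0 * math.pi * data_rate         # omega_DR in rad/s
    return max_dw / w_dr * 1e6

data_rate = 10e9
tb = 1.0 / data_rate
ppm = max_offset_ppm(k2_kpi=2.0 * math.pi / 64.0, t_dlf=8.0 * tb, data_rate=data_rate)
print(f"max offset = {ppm:.1f} ppm")
```

Under these assumptions the limit is on the order of a thousand ppm, comfortably above the few hundred ppm expected from crystal references.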
Example 10.3
Explain why a second-order DLF improves the tolerance of frequency offset in PI-based CDRs.
Solution:
Fig. 10.4 (a) Linear, (b) parabolic phase tracking on occurrence of frequency offset.
Figure 4(a) illustrates the linear tracking behavior, which can tolerate frequency offset (between the
reference PLL clock and input data) up to K2 KP I /(2TDLF ). The PI-based CDR with a 2nd order
DLF, on the other hand, provides a parabolic phase tracking as depicted in Fig. 4(b). That allows
a much larger offset in frequency, given that the proportional and the integral terms are properly
chosen.
Now that we realize the limitation of phase detectors, we focus our study on frequency detectors
(FDs) in this section. There are several mainstream methods to acquire frequency (data rate) infor-
mation, namely, dual loop, Pottbacker, direct-dividing, and all-digital. We introduce their operation
as well as properties.
The most straightforward and perhaps the most popular way to capture the right data rate is to use a reference PLL, as we described in Chapter 9. To make it more specific, we redraw such a dual-loop approach in Fig. 5. The frequency tracking loop is actually unconditionally stable, even though the loop filter is as simple as a single capacitor. Thus, its corresponding charge pump (CP2) can drive the main capacitor C1 directly. In most cases, the FD loop is turned off once the proper frequency is obtained. This not only saves power but also minimizes disturbance. The frequency acquisition loop is usually accompanied by a lock detector, which monitors the loop state and activates the FD loop once the PD loop is out of lock.
[Fig. 10.5: Dual-loop CDR. PD loop: Din, PD, CP1, loop filter (C1, C2), VCO, CKout. FD loop: CKref, PFD, CP2, divide-by-M, with a lock detector.]
How do we implement a lock detector? A simple way to examine the frequency error between two clocks is to use a mixer. As illustrated in Fig. 6, CKref and CKout/M are mixed to obtain the beat frequency fb.¹ If the two clocks are not synchronous, a finite fb occurs. Using a counter and some control logic allows us to determine whether the two clock frequencies are close enough. If the beat frequency is within a preset threshold, the loop is considered "locked". The lock detector thus switches off the corresponding charge pump (i.e., CP2 in Fig. 5) to disable the FD loop. Otherwise, the FD loop remains active until frequency acquisition is achieved.
¹ A low-pass filter should be added after the mixer if the sum frequency is a concern.
It is not surprising that some tricks can be added to the lock detector to enhance its flexibility and reliability. For example, different thresholds can be imposed such that the "pull-in" and "out-of-lock alarm" have different ranges. The mixer could be replaced by digital circuits as well. Other calibration techniques may be included in an all-digital implementation.
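The counter-and-threshold idea, including separate pull-in and out-of-lock thresholds, can be sketched behaviorally as follows. The class name, the counting window, and the threshold values are hypothetical; the sketch only illustrates the hysteresis, not any particular circuit.

```python
class LockDetector:
    """Beat-count lock detector with hysteresis (behavioral sketch)."""

    def __init__(self, lock_thresh, unlock_thresh):
        # lock_thresh < unlock_thresh gives different "pull-in" and
        # "out-of-lock alarm" ranges.
        self.lock_thresh = lock_thresh
        self.unlock_thresh = unlock_thresh
        self.locked = False

    def update(self, beat_count):
        """beat_count: beat-frequency cycles counted in one observation window."""
        if self.locked and beat_count > self.unlock_thresh:
            self.locked = False      # raise the out-of-lock signal, enable FD loop
        elif not self.locked and beat_count <= self.lock_thresh:
            self.locked = True       # declare lock, disable FD loop (CP2 off)
        return self.locked


def count_beats(f_ref, f_div, window_ref_cycles):
    """Number of beat cycles of |f_ref - f_div| within a window of reference cycles."""
    return int(abs(f_ref - f_div) * window_ref_cycles / f_ref)


det = LockDetector(lock_thresh=2, unlock_thresh=8)
states = [det.update(c) for c in (20, 5, 1, 5, 9)]
print(states)
```

Once locked, the detector tolerates beat counts between the two thresholds without re-enabling the FD loop.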
[Fig. 10.6: Lock detector. CKref and CKout/M are mixed to produce the beat frequency fb, which a counter and control logic compare against a threshold to generate the out-of-lock signal.]
10.2.2 Pottbacker FD
The use of external crystal oscillators as reference requires at least one more pad on chip and extra
space and cost on board. Here, we introduce an elegant way to distill the frequency information
from random data without a reference. First proposed by Pottbacker [8], this type of FD mandates
quadrature clocks in full rate. Figure 7 reveals the idea of operation. Instead of sampling data with
clocks, here we use input data (Din ) to sample clocks (CKI and CKQ ). If the clock frequency
is less than the data rate, the sampling points would gradually shift to the left (i.e., forward).
Similarly, if it is greater than the data rate, we see the sampling points moving to the right (i.e.,
backward). As a result, we obtain two slow waves Q1 and Q2 roughly in quadrature. Whether Q1 is
leading or lagging depends on the polarity (sign) of the frequency error. In other words, frequency
detection is accomplished by examining the phase relationship between Q1 and Q2 , which can be
easily obtained with an additional flipflop. We get the final polarity result at Q3 .
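The operation can be illustrated with a small behavioral simulation. The sketch assumes ideal square quadrature clocks (CKI modeled as the sign of a sine, CKQ as the sign of a cosine), sampling on every data transition, and a third flip-flop clocked by the rising edge of Q2. The resulting polarity convention (Q3 high when the clock is slow) follows from these assumptions and is not necessarily that of the original circuit.

```python
import math
import random

def pottbacker_polarity(f_ck, data_rate, n_bits=20000, seed=1):
    """Average value of Q3 for random data; close to 1 or 0 depending on the sign of f_ck - R_D."""
    rng = random.Random(seed)
    tb = 1.0 / data_rate
    prev_bit = 0
    q2_prev = None
    q3_samples = []
    for n in range(n_bits):
        bit = rng.getrandbits(1)
        if bit != prev_bit:
            # A data transition at t = n*tb samples both clocks.
            theta = 2.0 * math.pi * f_ck * n * tb
            q1 = math.sin(theta) >= 0.0   # sampled CK_I
            q2 = math.cos(theta) >= 0.0   # sampled CK_Q
            if q2_prev is False and q2:
                # Rising edge of Q2 clocks the third flip-flop, which samples Q1.
                q3_samples.append(1 if q1 else 0)
            q2_prev = q2
        prev_bit = bit
    return sum(q3_samples) / len(q3_samples)

slow = pottbacker_polarity(0.995e10, 1e10)   # clock 0.5% below the data rate
fast = pottbacker_polarity(1.005e10, 1e10)   # clock 0.5% above the data rate
print(slow, fast)
```

The two beat waves Q1 and Q2 swap their lead/lag relationship when the sign of the frequency error changes, which is what Q3 reports.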
[Fig. 10.7: Pottbacker FD. Din samples CKI and CKQ to produce Q1 and Q2, and a third flip-flop produces Q3; waveforms are shown for the clock frequency below and above the data rate.]
The Pottbacker FD could be further modified to turn itself off upon lock. As illustrated in Fig. 8(a), the rising or falling edge of Din always aligns with the valley of CKQ when the PD loop is locked. That means Q2 stays low upon phase locking. As a result, we can apply Q2 to the corresponding charge pump (CPFD) directly, arriving at the circuit shown in Fig. 8(b). Here, the charge pump is active for only about 50% of the frequency tracking time, as CPFD turns on and off periodically. This has little influence on the overall performance, since frequency acquisition time in wireline systems is not that critical. A major advantage here is that we no longer need a lock detector. All functions are implemented in the analog domain.
Fig. 10.8 (a) Pottbacker FD under phase locking, (b) modified Pottbacker FD with automatic shutoff.
At high data rates, generating quadrature clocks may not be trivial. The purely linear CDR introduced in Chapter 9.2 provides an alternative solution. Recall from Fig. 9.10 that the input data are first applied to a series of buffers to create a 0.5-UI delay. The nominal 0.5-UI delay from VA to VE implies a 0.25-UI delay from VB to VD, which allows us to extract the frequency difference. Indeed, the 0.25-UI data delay corresponds to a 90° phase shift² of a full-rate clock, making it possible to realize a rotational frequency detector without using quadrature clocks. The proposed FD is shown in Fig. 9. Here, the clock is sampled by the PD's by-products VB and VD, producing two outputs Q1 and Q2, respectively. Similar to the Pottbacker FD, Q1 is further sampled by Q2 through another flip-flop. The polarity of the frequency error, Q3, is therefore obtained. As in the Pottbacker FD, the VB → VD delay need not be exactly 0.25 UI. Simulation shows that a delay variation of more than 25% is tolerable for the FD to function properly. The automatic switch-off function works in the same way as in the Pottbacker FD.
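[Fig. 10.9: Rotational FD without quadrature clocks. The clock CK is sampled by VB and VD (nominally 12.5 ps apart) in FF1 and FF2 to produce Q1 and Q2; FF3 produces Q3 (up/down), and Q2 switches the V/I converter on and off.]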
² As a matter of fact, a precise 90° separation between adjacent phases is not mandatory. A looser condition (such as 80° or 100°) would still allow an FD to achieve similar performance, given that the initial frequency deviation stays within a certain range.
It is instructive to examine the FD operation in detail and quantify the operation range. The states of Q1 and Q2 can be characterized as in Fig. 10(a), where the rotating direction indicates the sign of the beat frequency. For example, a clockwise rotation suggests the clock frequency (fCK) is less than the data rate (RD). Of course, the rotation rate represents the beat frequency. For such an FD to make a right decision on every sampling, we must require the states of Q1 and Q2 to jump no more than one step at a time. That is, the average output current Iav remains fixed (either positive or negative) for low frequency error, forming a binary characteristic. This situation continues until the above condition is violated. To determine the points where Iav begins to drop, we study one worst case as illustrated in Fig. 10(b). Here, without loss of generality, we assume fCK is less than RD and the transition of VB (and thus Q1) is already very close to the clock edge. Starting from (1, 1), the state either stays at (1, 1) or moves to (0, 1) in the next sampling. As we know, for a PRBS of length 2^N − 1, the longest run length between transitions is N bits. Since the longest run accumulates the most error, we can determine the largest beat frequency at which the average output current begins to degrade. That is, after N bits, the sampled Q2 remains high. The boundary condition gives
$$N\cdot\left|\frac{1}{f_{CK}} - \frac{1}{R_D}\right| = \frac{1}{4 f_{CK}}. \tag{10.14}$$
It follows that the deviation is given by
$$\Delta f_1 \triangleq |f_{CK} - R_D| = \frac{R_D}{4N}. \tag{10.15}$$
If N = 7, for example, the binary range is equal to ±3.6%. It can be easily proven that ∆f is symmetric with respect to the origin. Strictly speaking, the use of N bits as the longest period of error accumulation is not exactly correct because the flip-flops in the FD are single-edge triggered. The actual accumulation time would be longer than N · (1/RD). For example, the longest distance between two adjacent rising edges in a 2^7 − 1 PRBS is 13 bits, so the binary characteristic begins to roll off at around RD/(4 · 13).
The above analysis is based on the worst-case scenario. In practice, Q1 may stay far away from
the clock edge before the N-bit long run. The best-case scenario is also shown in Fig. 10(a), where
the phase error accumulated over N bits must be less than a half rather than a quarter of a clock
cycle in order to maintain a saturated IP 2,avg . Thus, the widest binary range would be twice as
large as that in Eq. (10.15):
$$\Delta f_2 = \frac{R_D}{2N}. \tag{10.16}$$
Depending on the initial phase relationship, the binary range in reality lies between the two ex-
tremes ∆f1 and ∆f2 .
The FD performance begins to degrade beyond the binary range, as the sequence of Q1 and Q2 becomes chaotic and erroneous samplings occur in FF3. We expect the average output to eventually approach zero as the sequential states of Q1 and Q2 become totally random, i.e., no reliable average on Q3 can be obtained. The vanishing point can be roughly estimated as follows. For random data, the expected interval between two adjacent transitions is two bits. Since FF1 and FF2 are single-edge triggered, VB and VD on average sample the clock every four bits. Now, if the frequency error is so significant that (Q1, Q2) steps more than one state in each sampling, the beat-frequency sequences become totally corrupted and the FD has no way to judge the polarity. Under such a circumstance, we have
$$4\cdot\left|\frac{1}{f_{CK}} - \frac{1}{R_D}\right| \ge \frac{1}{2 f_{CK}}. \tag{10.17}$$
It follows that
$$f_{CK,max} = \frac{9}{8}R_D \quad\text{and}\quad f_{CK,min} = \frac{7}{8}R_D. \tag{10.18}$$
In other words, the capture range of the FD is about ±12.5%. In fact, the vanishing point is slightly larger than the prediction of Eq. (10.18) because of the finite rise and fall times. Fig. 10(b) shows the simulated FD characteristic for a 2^7 − 1 input data sequence at 20 Gb/s.
Fig. 10.10 (a) Determining operation range of Pottbacker FD, (b) simulated FD characteristic (data rate = 20 Gb/s).
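The numbers quoted above follow directly from Eqs. (10.15), (10.16), and (10.18). A short sketch of the arithmetic, with all results expressed as fractions of RD (the function names are hypothetical):

```python
def binary_range(n_run):
    """Worst-case and best-case binary range, Eqs. (10.15) and (10.16), as fractions of R_D."""
    return 1.0 / (4 * n_run), 1.0 / (2 * n_run)

def vanishing_limits(avg_sample_bits=4):
    """Clock frequencies (relative to R_D) where the FD output vanishes.

    Eq. (10.17) at equality gives |R_D - f_CK| = R_D / (2 * avg_sample_bits).
    """
    d = 1.0 / (2 * avg_sample_bits)
    return 1.0 - d, 1.0 + d

df1, df2 = binary_range(7)          # longest run of a 2^7 - 1 PRBS
print(f"binary range: +/-{100 * df1:.1f}% to +/-{100 * df2:.1f}%")
print("single-edge roll-off:", binary_range(13)[0])   # R_D / (4 * 13)
print("vanishing points:", vanishing_limits())        # (7/8, 9/8) of R_D
```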
10.2.3 Direct-Dividing FD
Another interesting approach to distilling the frequency information from a data stream relies on the randomness of the bit sequence. Consider a purely random data stream. The chance for a transition to occur between two consecutive bits is 50%. As a result, the probability of run length = 1 bit is 1/2, that of run length = 2 bits is 1/4, and so on. The average run length is therefore given by
$$\text{Avg. Run} = \sum_{k=1}^{\infty} k\cdot\left(\frac{1}{2}\right)^k = 2\ \text{(bits)}, \tag{10.19}$$
and the variance of the run length is
$$\sigma^2 = \frac{1}{2}\cdot 1^2 + 0 + \frac{1}{8}\cdot 1^2 + \frac{1}{16}\cdot 2^2 + \frac{1}{32}\cdot 3^2 + \cdots \tag{10.20}$$
$$= 2. \tag{10.21}$$
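Both sums are easy to confirm numerically. The sketch below truncates the series at a hypothetical upper limit of 200 terms, beyond which the contributions are negligible.

```python
def run_length_stats(k_max=200):
    """Mean and variance of the run length of random data, P(run = k) = (1/2)^k."""
    probs = [(k, 0.5 ** k) for k in range(1, k_max + 1)]
    mean = sum(k * p for k, p in probs)                 # Eq. (10.19)
    var = sum((k - mean) ** 2 * p for k, p in probs)    # Eqs. (10.20) and (10.21)
    return mean, var

mean, var = run_length_stats()
print(mean, var)
```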
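[Fig. 10.11: Direct-dividing FD. Din passes through N divide-by-2 stages and a low-pass filter to produce CKout. The run-length PDF (1/2, 1/4, 1/8, 1/16, 1/32, ...) has an average run of 2Tb, so the average frequency is about RD/8 after the first stage, RD/16 after the second, and RD/2^(N+2) after the Nth, compared with the full-rate clock at RD.]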
It goes without saying, that the period of CKout varies from time to time. A robust and reliable
lock detector would be necessary to ensure smooth transition from frequency acquisition to phase
locking.
Starting from this section, we study the jitter specifications extensively adopted in different standards. We look at the jitter transfer (JTRAN) of linear CDRs in this section.
Jitter transfer is defined as the response of a CDR loop to input jitter, which is nothing more than the transfer function we derived in Chapter 9. This is because the s-domain analysis exactly reflects the loop's behavior in response to a sinusoidal variation of the input phase. For convenience, we redraw the linear CDR model in Fig. 12(a), which presents a closed-loop transfer function (equivalent to JTRAN) of
$$JTRAN = \frac{\Phi_{out}}{\Phi_{in}}(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}, \tag{10.22}$$
where
$$\omega_n = \sqrt{\frac{I_p K_{VCO}}{2\pi C_p}} \tag{10.23}$$
$$\zeta = \frac{R_p}{2}\sqrt{\frac{I_p C_p K_{VCO}}{2\pi}}. \tag{10.24}$$
For most long-haul applications, ζ ≫ 1, arriving at
$$JTRAN \approx \frac{2\zeta\omega_n}{s + 2\zeta\omega_n}. \tag{10.25}$$
Figure 12(b) illustrates the jitter transfer of overdamped CDRs. As we will see in section 10.5,
this is the shape we see on the phase noise plot of the recovered clock.
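[Fig. 10.12: (a) Linear CDR model (PD+CP, R, C, and VCO with gain KVCO/s), (b) jitter transfer of an overdamped CDR: 0 dB at low frequencies with a corner at 2ζωn = KVCO Ip R/(2π).]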
An important specification regarding JTRAN is the possible peaking around the −3-dB bandwidth f0. Shown in Fig. 13(a) is the example of SONET, which requires the peaking to be less than 0.1 dB. This is because tens or even hundreds of repeaters may be deployed, accumulating a huge peak at the far end even if each repeater contributes only 0.1 dB of peaking. Figure 13(b) illustrates the effect.
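The peaking of the second-order jitter transfer, Eq. (10.1), can be estimated numerically as a function of ζ. The sketch below scans a logarithmic frequency grid; the grid limits and the ζ values are hypothetical choices for illustration.

```python
import math

def jtran_mag(w, wn, zeta):
    """Magnitude of the linear jitter transfer at angular frequency w."""
    s = 1j * w
    return abs((2 * zeta * wn * s + wn ** 2) / (s ** 2 + 2 * zeta * wn * s + wn ** 2))

def peaking_db(zeta, wn=1.0, points=20000):
    """Peak of |JTRAN| in dB, scanned from 1e-4*wn to 1e2*wn on a log grid."""
    peak = max(jtran_mag(wn * 10 ** (-4 + 6 * i / points), wn, zeta)
               for i in range(points + 1))
    return 20 * math.log10(peak)

def cascade_peaking_db(per_repeater_db, n_repeaters):
    """Peaking of identical repeaters in cascade adds up in dB."""
    return per_repeater_db * n_repeaters

for zeta in (2.0, 5.0):
    print(f"zeta = {zeta}: peaking = {peaking_db(zeta):.3f} dB")
print("1000 repeaters at 0.1 dB each:", cascade_peaking_db(0.1, 1000), "dB")
```

In this model a damping factor of 5 keeps the peaking under 0.1 dB, while a damping factor of 2 does not.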
Fig. 10.13 (a) SONET jitter transfer specification, (b) peaking effect in long-haul systems.
The binary characteristic of bang-bang PDs in practice exhibits a finite slope across a narrow range of the input phase difference. That is, small phase errors lead to linear operation whereas large phase errors introduce "slewing" in the loop, as we discussed in Chapter 9. Two main phenomena cause such a smoothing of the characteristic. The first is the effect of metastability. When the zero-crossing points of the recovered clock fall in the vicinity of data transitions, the flipflops comprising the PD may experience metastability, thereby generating an output lower than the full level for some time.
To quantify the effect of metastability, we first consider a single latch consisting of a preamplifier and a regenerative pair (Fig. 14), assuming a gain of Apre for the former and a regeneration time constant of τreg for the latter.⁴ We also assume a slope of 2k for the input differential data and a sufficiently large bandwidth at X and Y so that VX − VY tracks Din with the same slope.
⁴ For the sake of brevity, the regenerative gain is included in τreg, allowing an expression of the form exp(t/τreg) for the positive feedback growth of the signal.
[Fig. 10.14: Latch consisting of a preamplifier (gain Apre) and a regenerative pair (M1, M2, RC, ISS); the input level sampled at a phase difference ∆T is 2k∆T.]
Fig. 15 illustrates distinct cases that determine certain points on the PD characteristic. If the phase difference between CK and Din, ∆T, is large enough, the output reaches the saturated level, VF = ISS RC, in the sampling mode [Fig. 15(a)], yielding an average approximately equal to VF. For the case 2k∆T Apre < VF, the circuit regeneratively amplifies the sampled level [Fig. 15(b)], providing VPD < VF. Finally, if ∆T is sufficiently small, the regeneration in half a clock period does not amplify 2k∆T Apre to VF [Fig. 15(c)], leading to an average output substantially less than VF. Since the current delivered to the loop filter is proportional to the area under VX − VY and since the waveform in this case begins with an initial condition equal to 2k∆T Apre, we have
$$V_{PD,meta}(\Delta T) \approx \frac{1}{T_b}\int_0^{T_b/2} 2k\Delta T A_{pre}\exp\left(\frac{t}{\tau_{reg}}\right)dt \tag{10.26}$$
$$\approx 2k\Delta T A_{pre}\,\frac{\tau_{reg}}{T_b}\exp\left(\frac{T_b}{2\tau_{reg}}\right). \tag{10.27}$$
Thus, the average output is indeed linearly proportional to ∆T. The linear regime holds so long as the final value at t = Tb/2 remains less than VF,⁵ and the maximum phase difference in this regime is given by
$$2k\Delta T_{lin} A_{pre}\exp\left(\frac{T_b}{2\tau_{reg}}\right) = V_F \tag{10.28}$$
and hence
$$\Delta T_{lin} = \frac{V_F}{2kA_{pre}\exp\left(\dfrac{T_b}{2\tau_{reg}}\right)}. \tag{10.29}$$
For phase differences greater than ∆Tlin, the slope of the characteristic begins to drop, approaching zero if the preamplified level reaches VF:
$$\Delta T_{sat} = \frac{V_F}{2kA_{pre}}. \tag{10.30}$$
Fig. 15(d) summarizes these concepts.
⁵ Since the regeneration time is in fact equal to Tb/2 − ∆T, the PD characteristic displays a slight nonlinearity in this regime.
Fig. 10.15 Average PD output for (a) complete switching, (b) partial switching, (c) incomplete regeneration, (d) typical bang-bang characteristic.
The binary PD characteristic is also smoothed out by the jitter inherent in the input data and the oscillator output. Even with abrupt data and clock transitions, the random phase difference resulting from jitter leads to an average output lower than the saturated levels. As illustrated in Fig. 16(a), for a phase difference of ∆T, it is possible that the tail of the jitter distribution shifts the clock edge to the left by more than ∆T, forcing the PD to sample a level of −V0 rather than +V0. To obtain the average output under this condition, we sum the positive and negative samples with a weighting given by the probability of their occurrences:
$$V_{PD}(\Delta T) = -V_0\int_{-\infty}^{-\Delta T} p(x)\,dx + V_0\int_{-\Delta T}^{+\infty} p(x)\,dx \tag{10.31}$$
where p(x) denotes the probability density function (PDF) of jitter. Since the PDF is typically even-symmetric, this result can be rewritten as
$$V_{PD}(\Delta T) = -V_0\int_{+\Delta T}^{+\infty} p(x)\,dx + V_0\int_{-\infty}^{+\Delta T} p(x)\,dx \tag{10.32}$$
which is equivalent to the convolution of the bang-bang characteristic and the PDF of jitter. Illus-
trated in Fig. 16(b), VP D exhibits a relatively linear range for |∆T | < 2σ if the PDF is Gaussian
with a standard deviation of σ.
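For a Gaussian PDF, Eq. (10.32) has the closed form V0 · erf(∆T/(√2 σ)), since the two integrals are the upper tail and the cumulative distribution of the jitter. The sketch below compares this closed form with a direct numerical evaluation of Eq. (10.31); the integration limits of ±8σ and the number of steps are hypothetical choices.

```python
import math

def pd_average(delta_t, sigma, v0=1.0):
    """Average bang-bang PD output for Gaussian jitter (closed form of Eq. 10.32)."""
    return v0 * math.erf(delta_t / (sigma * math.sqrt(2.0)))

def pd_average_numeric(delta_t, sigma, v0=1.0, n=20000):
    """Midpoint-rule evaluation of Eq. (10.31) over +/-8 sigma."""
    lo, hi = -8.0 * sigma, 8.0 * sigma
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        p = math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        # The PD samples -V0 when jitter moves the edge left by more than delta_t.
        total += (-v0 if x < -delta_t else v0) * p * dx
    return total

sigma = 1.0
for dt in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(dt, round(pd_average(dt, sigma), 4), round(pd_average_numeric(dt, sigma), 4))
```

The output reaches roughly 95% of V0 at ∆T = 2σ and flattens beyond that point.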
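[Fig. 10.16: (a) Sampling of −V0 instead of +V0 when the jitter, with PDF p(x), shifts the clock edge by more than ∆T, (b) resulting PD characteristic VPD(∆T), approximately linear for |∆T| < 2σ.]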
Combining the two effects, it is not difficult to obtain the resulting model in Fig. 17, where
the BBPD+CP presents a linear region of ±Φm and saturated pumping current ±Ip . Suppose
Φin (t) = Φin,p cos ωΦ t. If Φin,p < Φm then the PD operates in the linear region, yielding a standard
second-order system. On the other hand, as Φin,p exceeds Φm , the phase difference between the
input and output may also rise above Φm , leading to nonlinear operation. At low jitter frequencies,
Φout (t) still tracks Φin (t) closely, |∆Φ| < |Φm |, and |Φout /Φin | ≈ 1. As ωΦ increases, so does
∆Φ, demanding that the V/I converter pump a larger current into the loop filter. However, since
the available current beyond the linear PD region is constant, large and fast variation of Φin results
in “slewing”.
[Fig. 10.17: Bang-bang CDR model (BBPD, charge pump, Rp, Cp, VCO) whose PD characteristic has a linear region of ±Φm and a saturated pump current of ±Ip.]
To study this phenomenon, let us assume Φin,p ≫ Φm as an extreme case so that ∆Φ changes
polarity in every half cycle of ωΦ , requiring that I1 alternately jump between +Ip and −Ip (Fig. 18).
Since the loop filter capacitor is typically large, the oscillator control voltage tracks I1 Rp , leading
to binary modulation of the VCO frequency and hence triangular variation of the output phase.
The peak value of Φout occurs after integration of the control voltage for a duration of TΦ /4, where
TΦ = 2π/ωΦ ; that is,
$$\Phi_{out,p} = \frac{K_{VCO} I_p R_p T_\Phi}{4} \tag{10.33}$$
and
$$\left|\frac{\Phi_{out,p}}{\Phi_{in,p}}\right| = \frac{\pi K_{VCO} I_p R_p}{2\Phi_{in,p}\,\omega_\Phi}. \tag{10.34}$$
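[Fig. 10.18: Slewing in a bang-bang CDR: input phase Φin with peak Φin,p, pump current alternating between +Ip and −Ip, VCO frequency toggling between ω1 and ω2, and triangular output phase Φout reaching Φout,p after TΦ/4.]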
Expressing the dependence of the jitter transfer upon the jitter amplitude, Φin,p , this equation
also reveals a 20-dB/dec roll-off in terms of ωΦ . Of course, as ωΦ decreases, slewing eventually
vanishes, Eq. (10.34) is no longer valid, and the jitter transfer approaches unity. As depicted in
Fig. 19(a), extrapolation of linear and slewing regimes yields an approximate value for the −3-dB
bandwidth of the jitter transfer:
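$$\omega_{-3\mathrm{dB}} = \frac{\pi K_{VCO} I_p R_p}{2\Phi_{in,p}}. \tag{10.35}$$
The jitter transfer can then be approximated as
$$\frac{\Phi_{out,p}}{\Phi_{in,p}}(s) = \frac{1}{1 + \dfrac{s}{\omega_{-3\mathrm{dB}}}}. \tag{10.36}$$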
Fig. 19(b) plots the jitter transfer for different input jitter amplitudes. The transfer approaches
that of a linear loop as Φin,p decreases toward Φm .
It is interesting to note that the jitter transfer of slew-limited CDR loops exhibits negligible peaking. Due to the high gain in the linear regime, the loop operates with a relatively large damping factor in the vicinity of ω−3dB. In the slewing regime, as evident from the Φin and Φout waveforms in Fig. 18, Φout,p can only fall monotonically as ωΦ increases because the slew rate is constant. Because the bandwidth depends on Φin,p, a loop-bandwidth specification for a bang-bang CDR must state the input jitter level at which it applies.
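The amplitude dependence of Eqs. (10.35) and (10.36) can be illustrated numerically. The sketch assumes KVCO in rad/s/V and Φin,p in radians; all component values are hypothetical.

```python
import math

def bb_jtran_bandwidth(k_vco, ip, rp, phi_in_p):
    """Slew-limited -3-dB bandwidth in rad/s, Eq. (10.35)."""
    return math.pi * k_vco * ip * rp / (2.0 * phi_in_p)

def bb_jtran_mag(w, w3db):
    """Magnitude of the first-order approximation in Eq. (10.36)."""
    return 1.0 / math.sqrt(1.0 + (w / w3db) ** 2)

k_vco = 2.0 * math.pi * 1e9    # hypothetical VCO gain, rad/s/V
ip, rp = 500e-6, 100.0         # hypothetical pump current (A) and resistor (ohm)
for ui in (0.25, 0.5, 1.0):
    w3db = bb_jtran_bandwidth(k_vco, ip, rp, 2.0 * math.pi * ui)
    print(f"Phi_in,p = {ui} UI: f_-3dB = {w3db / (2.0 * math.pi) / 1e6:.1f} MHz")
```

Doubling the input jitter amplitude halves the bandwidth, which is why the test amplitude has to accompany any quoted bandwidth figure.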
Fig. 10.19 (a) Calculation of the −3-dB bandwidth, (b) jitter transfer of PLL-based bang-bang CDRs.
Example 10.5
Consider the measured JTRAN of a 10-Gb/s bang-bang CDR shown in Fig. 20, where three different input jitter magnitudes are tested. Estimate the linear region boundary Φm.
Fig. 10.20 Measured JTRAN of a 10-Gb/s bang-bang CDR for different Φin.
Solution:
The loop bandwidth is inversely proportional to Φin,p as Φin,p varies from 0.25 to 0.5 UI. It obviously saturates as Φin,p drops to 0.125 UI. Since all other parameters are fixed, we have two equations to predict Φm:
Example 10.6
With the same setup of Fig. 20, now we fix Φin,p = 0.5 UI and change Cp . The result is shown in
Fig. 21, where three cases give roughly the same curves of ω−3dB =2.75 MHz. Calculate what Rp
we use here.
Solution:
The above two examples are based on real measurement results of a 10-Gb/s CDR with a
standard Alexander PD realized in 90-nm CMOS technology.
Jitter tolerance (JTOL) is defined as the maximum input jitter that a CDR loop can tolerate without increasing the bit error rate at a given jitter frequency. As the phase error, Φin − Φout, approaches π = 0.5 UI, BER rises rapidly [Fig. 22(a)].
It is straightforward to derive JTOL from JTRAN for linear CDRs. In theory, an error would occur if Φin − Φout ≥ 0.5 UI. That is,
$$\Phi_{in}\left(1 - \frac{\Phi_{out}}{\Phi_{in}}\right) \ge 0.5. \tag{10.40}$$
Fig. 10.22 (a) Jitter tolerance calculation, (b) jitter tolerance mask (levels of 15 UI, 1.5 UI, and 0.15 UI joined by −20-dB/dec segments, with corner frequencies f1 to f4).
As illustrated in Fig. 24, the ITU defines the loop bandwidth on JTRAN to be 120 kHz, whereas the major corner f4 is as high as 4 MHz. A dilemma is created here, as a traditional linear CDR can never satisfy both specifications. A more sophisticated CDR architecture must be adopted to overcome this difficulty.
Fig. 10.23 Jitter transfer and jitter tolerance of PLL-based linear CDRs.
[Fig. 10.24: ITU corner frequencies. JTRAN: f0 = 120 kHz. JTOL: f1 = 2 kHz, f2 = 20 kHz, f3 = 400 kHz, f4 = 4 MHz.]
Example 10.7
A linear CDR combining DLL and PLL has been proposed to untie the coupling between JTRAN
and JTOL. As shown in Fig. 25(a), this structure uses a simple capacitor as the loop filter. Ana-
lyze the circuit and determine its JTRAN and JTOL. The voltage-controlled delay line (i.e., phase
shifter) presents a gain of Kps .
Solution:
Fig. 10.25 (a) D/PLL based linear CDR, (b) its JTRAN and JTOL.
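Since Din experiences a phase shift before entering the PD, we have
$$JTRAN = \frac{\Phi_{out}}{\Phi_{in}} = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}, \tag{10.45}$$
which can be rewritten as
$$JTRAN = \frac{\omega_n^2}{(s + \omega_{p1})(s + \omega_{p2})} \tag{10.48}$$
where
$$\omega_{p1} \cong \frac{\omega_n}{2\zeta} = \frac{K_{VCO}}{K_{ps}} \tag{10.51}$$
$$\omega_{p2} \cong 2\zeta\omega_n = \frac{K_{ps} I_p}{2\pi C}. \tag{10.52}$$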
Fig. 25(b) illustrates the JTRAN of such a design. JTOL can be derived with the same approach. Setting the critical condition
$$|\Phi_{in} - V_c K_{ps} - \Phi_{out}| = 0.5\ \text{(UI)}, \tag{10.53}$$
we arrive at
$$JTOL = \Phi_{in,max} \tag{10.54}$$
$$= \frac{0.5}{1 - \left(1 + \dfrac{K_{ps}}{K_{VCO}}\cdot s\right)\cdot JTRAN} \tag{10.55}$$
$$= \frac{0.5(s + \omega_{p2})}{s}. \tag{10.56}$$
That is, JTOL's corner point now moves to ωp2. The two specifications are now decoupled, as JTRAN and JTOL can be designed separately.
Now we look at JTOL of binary CDRs. As we described in 10.4, a bang-bang CDR loop slews if
it fails to follow the input phase modulation tightly.
It is important to recognize that a bang-bang loop must slew if it incurs errors. With no slewing,
the phase difference between the input and output falls below Φm (≪ π), and the data is sampled
correctly. Fig. 26(a) shows the case where Φout slews and Φin,p is chosen such that ∆Φmax = π.
It can be shown that ∆Φmax occurs at some point t1 , but ∆Φ at t0 is close to ∆Φmax and much
simpler to calculate. If Φout slews for most of the period, t0 is approximately equal to TΦ /4.
Fig. 10.26 JTOL calculation for bang-bang CDRs: (a) slewing, (b) nonlinear slewing.
As expected, JTOL falls at a rate of 20 dB/dec for low ωΦ , approaching π at high ωΦ . A corner
frequency, ω1 , can be defined by equating Eq. 10.61 to 0.7 UI
$$\omega_1 = \frac{K_{VCO} I_p R_p}{2}. \tag{10.62}$$
⁶ The angle δ is chosen such that the output peak occurs at t = 0, simplifying the algebra.
The above analysis has followed the same assumptions as before, namely, the change in the control voltage is due to I1 Rp and the voltage across Cp remains constant. At jitter frequencies below (Rp Cp)⁻¹, however, this condition is violated, leading to "nonlinear slewing" at the output. In fact, for a sufficiently low ωΦ, the (linear) voltage change across Cp far exceeds I1 Rp, yielding a parabolic shape for Φout [Fig. 26(b)]. Thus
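$$\Phi_{out}(t) = -\int \frac{I_p}{C_p} K_{VCO}\, t\, dt + \Phi_{out,p}, \qquad 0 < t < \frac{T_\Phi}{2} \tag{10.63}$$
$$= -\frac{1}{2}\frac{K_{VCO} I_p}{C_p} t^2 + \Phi_{out,p}. \tag{10.64}$$
At t = TΦ/2, we have
$$\Phi_{out}\left(\frac{T_\Phi}{2}\right) = -\Phi_{out,p} = -\frac{1}{2}\frac{K_{VCO} I_p}{C_p}\frac{T_\Phi^2}{4} + \Phi_{out,p} \tag{10.65}$$
and hence
$$\Phi_{out,p} = \Phi_{in,p}\cos\delta = \frac{K_{VCO} I_p \pi^2}{4 C_p \omega_\Phi^2}. \tag{10.66}$$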
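Note that the zero-crossing point of Φout occurs at $t = T_\Phi/(2\sqrt{2})$. Adopting the same technique used for the linear slewing case, we approximate ∆Φmax with $|\Phi_{in}(T_\Phi/(2\sqrt{2}))|$ and obtain
$$\Delta\Phi_{max} \approx \left|\Phi_{in,p}\cos\left(\omega_\Phi\frac{T_\Phi}{2\sqrt{2}} + \delta\right)\right| \tag{10.67}$$
$$= -\Phi_{in,p}\cos\frac{\pi}{\sqrt{2}}\cos\delta + \Phi_{in,p}\sin\frac{\pi}{\sqrt{2}}\sin\delta \tag{10.68}$$
$$= 0.61\,\frac{K_{VCO} I_p \pi^2}{4 C_p \omega_\Phi^2} + 0.8\,\frac{\sqrt{16 C_p^2\omega_\Phi^4\Phi_{in,p}^2 - K_{VCO}^2 I_p^2\pi^4}}{4 C_p \omega_\Phi^2}. \tag{10.69}$$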
Again, equating ∆Φmax to 0.5 UI yields the jitter tolerance, JTOL = Φin,p:
$$JTOL = 0.5\sqrt{\frac{\left(1 - 0.61\,\dfrac{K_{VCO} I_p \pi}{4 C_p \omega_\Phi^2}\right)^2}{0.64} + \frac{K_{VCO}^2 I_p^2 \pi^2}{16 C_p^2 \omega_\Phi^4}}, \tag{10.70}$$
which is too complicated to analyze. Fortunately, at very low jitter frequency, we have
$$0.61\,\frac{K_{VCO} I_p \pi}{4 C_p \omega_\Phi^2} \gg 1, \tag{10.71}$$
and Eq. 10.70 reduces to
$$JTOL \approx 0.63\,\frac{K_{VCO} I_p \pi}{4 C_p \omega_\Phi^2}. \tag{10.72}$$
In this region, JTOL falls at a rate of 40 dB/dec. Fig. 27 depicts the complete JTOL curve of bang-bang CDRs. The corner frequency ω2 between the two regions can be calculated by extrapolation. Assuming ω2 ≪ ω1, we have
$$\omega_2 = \frac{0.63\pi}{R_p C_p}. \tag{10.73}$$
The reader can also show that the above assumption is valid for most cases.
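The low-frequency behavior can be checked numerically. The sketch below evaluates Eq. 10.70, compares it with its large-argument limit, and confirms that the 40 dB/dec line meets a 20 dB/dec line at ω2 = 0.63π/(Rp Cp). Two things are assumptions rather than statements from the text: the 20 dB/dec asymptote is taken as 0.5·ω1/ωΦ (in UI) with ω1 from Eq. 10.62, and the component values (Rp = 80 Ω, Cp = 100 nF, KVCO converted to rad/s/V) are hypothetical.

```python
import math


def x_term(omega_phi, kvco, ip, cp):
    """K_VCO*Ip*pi / (4*Cp*omega_phi^2), the term that appears in Eq. 10.70."""
    return kvco * ip * math.pi / (4.0 * cp * omega_phi ** 2)


def jtol_nonlinear(omega_phi, kvco, ip, cp):
    """Eq. 10.70: JTOL in UI for the nonlinear-slewing regime."""
    x = x_term(omega_phi, kvco, ip, cp)
    return 0.5 * math.sqrt((1.0 - 0.61 * x) ** 2 / 0.64 + x ** 2)


# Limit of Eq. 10.70 when 0.61*x >> 1: JTOL -> LOW_FREQ_COEFF * x (about 0.63*x).
LOW_FREQ_COEFF = 0.5 * math.sqrt(0.61 ** 2 / 0.64 + 1.0)


def jtol_low_freq(omega_phi, kvco, ip, cp):
    """40 dB/dec asymptote of Eq. 10.70."""
    return LOW_FREQ_COEFF * x_term(omega_phi, kvco, ip, cp)


def jtol_slewing_asymptote(omega_phi, kvco, ip, rp):
    """ASSUMED 20 dB/dec asymptote: 0.5 UI * omega_1 / omega_phi (omega_1 per Eq. 10.62)."""
    omega_1 = kvco * ip * rp / 2.0
    return 0.5 * omega_1 / omega_phi


def omega_2(rp, cp):
    """Eq. 10.73."""
    return 0.63 * math.pi / (rp * cp)


# Hypothetical values, for illustration only.
KVCO = 2.0 * math.pi * 1.2e9  # rad/s/V
IP = 600e-6                   # A
RP = 80.0                     # ohm
CP = 100e-9                   # F

if __name__ == "__main__":
    w2 = omega_2(RP, CP)
    print(f"coefficient = {LOW_FREQ_COEFF:.4f}")
    print(f"omega_2 = {w2:.3e} rad/s")
    print(f"40 dB/dec line at omega_2: {jtol_low_freq(w2, KVCO, IP, CP):.1f} UI")
    print(f"20 dB/dec line at omega_2: {jtol_slewing_asymptote(w2, KVCO, IP, RP):.1f} UI")
```

The two asymptotes agree at ω2 to within the rounding of the 0.63 coefficient, independent of KVCO and Ip, which cancel in the ratio.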
[Fig. 27: JTOL versus ωΦ for bang-bang CDRs, falling at 40 dB/dec below ω2 and at 20 dB/dec between ω2 and ω1, then flattening at 0.5 UI.]
Example 10.8
For a certain 10-Gb/s long-haul data link, the JTRAN bandwidth corner is 8 MHz and the JTOL major corner (i.e., f4 in Fig. 24) is 4 MHz. Design a bang-bang CDR and determine Rp to satisfy both JTRAN and JTOL. KVCO = 1.2 GHz/V, Ip = 600 µA, and Φin,p = 2 UI.
Solution:
From the JTRAN and JTOL definitions we require
$$\frac{\pi K_{VCO} I_p R_p}{2\Phi_{in,p}} < 2\pi \times 8\ \text{MHz} \tag{10.74}$$
$$2\pi \times 4\ \text{MHz} < \frac{K_{VCO} I_p R_p}{2}. \tag{10.75}$$
It follows that
$$70\ \Omega < R_p < 89\ \Omega. \tag{10.76}$$
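The bounds of Example 10.8 can be reproduced with a few lines of arithmetic. This is a sketch under two stated assumptions about how the numbers are entered, neither of which is spelled out in the text: KVCO is used as the bare value 1.2 × 10^9 (no 2π conversion to rad/s/V), and Φin,p = 2 UI is entered as the number 2. With those conventions Eqs. 10.74 and 10.75 give the printed range.

```python
import math

K_VCO = 1.2e9     # as printed (1.2 GHz/V); ASSUMPTION: no 2*pi factor applied
I_P = 600e-6      # charge-pump current, A
PHI_IN_P = 2.0    # ASSUMPTION: "2 UI" entered as the bare number 2
F_JTRAN = 8e6     # JTRAN bandwidth corner, Hz
F_JTOL = 4e6      # JTOL major corner, Hz


def rp_bounds(kvco, ip, phi_in_p, f_jtran, f_jtol):
    """Solve Eq. 10.75 for the lower bound and Eq. 10.74 for the upper bound on Rp."""
    rp_min = 2.0 * (2.0 * math.pi * f_jtol) / (kvco * ip)
    rp_max = 2.0 * phi_in_p * (2.0 * math.pi * f_jtran) / (math.pi * kvco * ip)
    return rp_min, rp_max


RP_MIN, RP_MAX = rp_bounds(K_VCO, I_P, PHI_IN_P, F_JTRAN, F_JTOL)

if __name__ == "__main__":
    print(f"{RP_MIN:.1f} ohm < Rp < {RP_MAX:.1f} ohm")
```

Under these conventions the result is roughly 69.8 Ω < Rp < 88.9 Ω, which rounds to Eq. 10.76.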
It is worth noting that the JTOL of an ideal CDR approaches 0.5 UI as the phase modulation frequency ωΦ keeps increasing. In the presence of noise, jitter, offset, and/or other nonidealities, JTOL is further degraded. Thus, it is reasonable to set the mask boundary at 0.15 UI at high frequencies.
Jitter generation (JG) is defined as the jitter produced entirely by the CDR itself. The JG measurement is straightforward: apply clean input data to the CDR under test and collect the jitter distribution of the recovered clock. Using a clean clock synchronized with the input data as the trigger signal, the statistical jitter results can be obtained on most digital oscilloscopes. Such a time-domain measurement requires at least 10,000 samples to give meaningful results. Fig. 28 shows the required rms and peak-to-peak jitter for different Optical Carrier (OC) levels. For example, in OC-192 (data rate ≈ 10 Gb/s) the recovered clock jitter must be less than 1 ps,rms and 10 ps,pp, respectively.
[Fig. 28: Jitter generation requirements. Jitter-free Din drives the CDR; JGpp and JGrms are measured on CKout, with SΦ(f) integrated between f1 and f2.]

Level     f1       f2        JGrms     JGpp
OC-48     5 kHz    20 MHz    0.01 UI   0.1 UI
OC-192    20 kHz   80 MHz    0.01 UI   0.1 UI
OC-768    20 kHz   320 MHz   0.01 UI   0.1 UI
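To relate the UI limits to absolute time, the sketch below converts 0.01 UI rms and 0.1 UI peak-to-peak into picoseconds for each OC level. The line rate of 51.84 Mb/s per OC unit is the standard SONET base rate and is not stated in the text; the function name is illustrative.

```python
STS1_RATE_BPS = 51.84e6  # SONET base rate per OC unit (not stated in the text)

JG_RMS_UI = 0.01  # rms limit from Fig. 28
JG_PP_UI = 0.1    # peak-to-peak limit from Fig. 28


def jg_limits_ps(oc_level):
    """Return (rms, peak-to-peak) jitter limits in picoseconds for OC-<oc_level>."""
    bit_period_ps = 1e12 / (oc_level * STS1_RATE_BPS)
    return JG_RMS_UI * bit_period_ps, JG_PP_UI * bit_period_ps


if __name__ == "__main__":
    for level in (48, 192, 768):
        rms_ps, pp_ps = jg_limits_ps(level)
        print(f"OC-{level}: {rms_ps:.2f} ps rms, {pp_ps:.1f} ps pp")
```

For OC-192 this gives about 1 ps rms and 10 ps peak-to-peak, in line with the figures quoted above.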
A stricter definition of jitter generation can be found in the frequency domain. By integrating the phase noise of the recovered clock from dc to infinity, we would in theory obtain the same rms jitter. However, a completely jitter-free (noise-free) data stream does not exist: the phase noise of a clean data stream still depends ultimately on that of its clock source. Shown in Fig. 29 is a typical phase noise plot of the recovered clock from a 20-Gb/s PLL-based linear CDR. The output phase noise is governed by the input data profile at low frequency offsets and gradually migrates to that of the free-running VCO. Thus, the integration must be restricted to a finite band. The lower limit f1 excludes the low-frequency influence from the input data, and the upper limit f2 excludes undesired coupling at high offset frequencies.
[Figure: linear CDR model for VCO-noise analysis (PD, CP, Rp, Cp, VCO) with φin = 0, showing the VCO phase-noise spectrum Sφ,vco(ω) falling as 1/ω² and the noise transfer function φout/φvco ≈ s/(s + 2ξωn).]
Example 10.9
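The term inside the brackets is the noise power, which is exactly the integration of spectrum SΦ. Thus,
$$\Delta T_{rms} = \frac{T_b}{2\pi}\left[\int_{-\infty}^{\infty} S_\Phi(f)\,df\right]^{1/2} \tag{10.80}$$
$$= \frac{T_b}{2\pi}\left[2\int_{0}^{\infty} S_\Phi(f)\,df\right]^{1/2} \tag{10.81}$$
$$= \frac{T_b}{2\pi}\left[2\int_{0}^{\infty} 10^{L(f)/10}\,df\right]^{1/2}, \tag{10.82}$$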
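where L(f) denotes the phase noise in dBc/Hz. Jitter generation is obtained by changing the integration limits:
$$JG_{rms} \triangleq \frac{\Delta T_{rms}}{T_b} = \frac{1}{2\pi}\left[2\int_{f_1}^{f_2} S_\Phi(f)\,df\right]^{1/2}\ \text{(UI)} \tag{10.83}$$
$$= \frac{1}{2\pi}\left[2\int_{f_1}^{f_2} 10^{L(f)/10}\,df\right]^{1/2}\ \text{(UI)}. \tag{10.84}$$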
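In practice L(f) is available as sampled points from a phase-noise analyzer, so Eq. 10.84 is evaluated numerically. The following is a minimal sketch using trapezoidal integration on a logarithmic frequency grid. The test profile (a pure −20 dB/dec slope with −90 dBc/Hz at 1 MHz) is hypothetical and chosen only because its integral has a closed form to compare against; the integration band is the OC-192 band from Fig. 28.

```python
import math


def jg_rms_ui(freqs_hz, l_dbc_hz):
    """Eq. 10.84: rms jitter generation in UI from sampled phase noise L(f) in dBc/Hz."""
    if len(freqs_hz) != len(l_dbc_hz) or len(freqs_hz) < 2:
        raise ValueError("need two or more matching frequency/phase-noise samples")
    s_phi = [10.0 ** (l / 10.0) for l in l_dbc_hz]  # dBc/Hz -> linear, 1/Hz
    area = 0.0
    for i in range(len(freqs_hz) - 1):
        df = freqs_hz[i + 1] - freqs_hz[i]
        area += 0.5 * (s_phi[i] + s_phi[i + 1]) * df  # trapezoid rule
    return math.sqrt(2.0 * area) / (2.0 * math.pi)


def log_grid(f_start, f_stop, n):
    """n logarithmically spaced frequencies from f_start to f_stop."""
    ratio = (f_stop / f_start) ** (1.0 / (n - 1))
    return [f_start * ratio ** i for i in range(n)]


# Hypothetical -20 dB/dec profile: L(f) = L0 - 20*log10(f/F0).
L0_DBC = -90.0
F0 = 1e6
F1, F2 = 20e3, 80e6  # OC-192 integration limits from Fig. 28

FREQS = log_grid(F1, F2, 2000)
L_PROFILE = [L0_DBC - 20.0 * math.log10(f / F0) for f in FREQS]
JG_NUMERIC = jg_rms_ui(FREQS, L_PROFILE)

# Closed form for this profile: integral of S0*(F0/f)^2 from F1 to F2.
S0 = 10.0 ** (L0_DBC / 10.0)
JG_CLOSED = math.sqrt(2.0 * S0 * F0 ** 2 * (1.0 / F1 - 1.0 / F2)) / (2.0 * math.pi)

if __name__ == "__main__":
    print(f"numeric: {JG_NUMERIC:.5f} UI, closed form: {JG_CLOSED:.5f} UI")
```

A logarithmic grid is used because phase noise spans several decades; a linear grid would undersample the low-offset region where most of the power sits.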
To be more specific, let us derive JG. For a PLL-based linear CDR, we redraw its model in Fig. 30. As evidenced by Fig. 29, the input-referred noise of the PD/CP is negligible compared with the input data noise. Therefore, the only major noise source is the VCO.
Fig. 10.30 Typical phase noise of recovered clock (PLL-based linear CDR, data rate = 20 Gb/s).
For typical overdamped cases, the noise transfer function from the VCO to the output is given by
$$\frac{\Phi_{out}}{\Phi_{VCO}} \approx \frac{s}{s + 2\zeta\omega_n}, \tag{10.85}$$
where
$$\omega_n = \sqrt{\frac{K_{VCO} I_p}{2\pi C_p}} \tag{10.86}$$
$$\zeta = \frac{R_p}{2}\sqrt{\frac{K_{VCO} I_p C_p}{2\pi}} \tag{10.87}$$
and the loop bandwidth ωBW = 2ζωn = 2πfBW. Following the derivation of Chapter 8, we define the VCO's noise spectrum as
$$S_{\Phi,VCO}(\omega) = S_{\Phi,VCO}(\omega_0)\cdot\frac{\omega_0^2}{\omega^2}. \tag{10.88}$$
Again, ω0 = 2πf0 is an arbitrary frequency point along the −20 dB/dec spectrum. The output
noise now becomes (Fig. 31)
$$S_{\Phi,out}(\omega) = S_{\Phi,VCO}(\omega_0)\cdot\frac{\omega_0^2}{\omega^2}\cdot\frac{\omega^2}{\omega^2 + \omega_{BW}^2}. \tag{10.89}$$
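From the above example, we calculate jitter generation in UI directly:
$$JG_{rms} = \frac{1}{2\pi}\left[2\int_{f_1}^{f_2} S_{\Phi,VCO}(f_0)\,\frac{f_0^2}{f^2 + f_{BW}^2}\,df\right]^{1/2} \tag{10.90}$$
$$= \frac{f_0}{2\pi}\left[2\cdot\frac{S_{\Phi,VCO}(f_0)}{f_{BW}}\cdot\left(\tan^{-1}\frac{f_2}{f_{BW}} - \tan^{-1}\frac{f_1}{f_{BW}}\right)\right]^{1/2}. \tag{10.91}$$
In most cases, the finite integration limits can be removed (i.e., f2 → ∞, f1 → 0) to simplify the calculation:
$$JG_{rms} = \frac{f_0}{2}\sqrt{\frac{S_{\Phi,VCO}(f_0)}{\pi f_{BW}}}\ \text{(UI)}. \tag{10.92}$$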
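The effect of dropping the integration limits can be gauged by evaluating Eqs. 10.91 and 10.92 side by side. The sketch below does so; the VCO phase noise (−90 dBc/Hz at 1 MHz) and the 10-MHz loop bandwidth are hypothetical example values, and the limits f1 and f2 are the OC-192 entries of Fig. 28.

```python
import math


def jg_rms_finite(s_vco_f0, f0, f_bw, f1, f2):
    """Eq. 10.91: JG in UI with finite integration limits f1..f2 (all in Hz)."""
    span = math.atan(f2 / f_bw) - math.atan(f1 / f_bw)
    return (f0 / (2.0 * math.pi)) * math.sqrt(2.0 * s_vco_f0 / f_bw * span)


def jg_rms_unbounded(s_vco_f0, f0, f_bw):
    """Eq. 10.92: JG in UI with f1 -> 0 and f2 -> infinity."""
    return (f0 / 2.0) * math.sqrt(s_vco_f0 / (math.pi * f_bw))


# Hypothetical example values.
S_VCO = 10.0 ** (-90.0 / 10.0)  # -90 dBc/Hz at f0, converted to linear
F0 = 1e6                        # reference offset, Hz
F_BW = 10e6                     # loop bandwidth, Hz
F1, F2 = 20e3, 80e6             # OC-192 limits from Fig. 28

JG_FINITE = jg_rms_finite(S_VCO, F0, F_BW, F1, F2)
JG_UNBOUNDED = jg_rms_unbounded(S_VCO, F0, F_BW)

if __name__ == "__main__":
    print(f"finite limits: {JG_FINITE:.3e} UI")
    print(f"unbounded:     {JG_UNBOUNDED:.3e} UI")
```

The unbounded form is always the larger of the two, so using Eq. 10.92 errs on the conservative side when checking a JG budget.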
[Fig. 31: output phase-noise spectrum versus f, with f1, fBW, and f2 marked on the frequency axis.]
Example 10.10
Consider a 10-Gb/s linear CDR with fBW = 10 MHz. Determine the minimum required VCO phase noise of the CDR if it is to be used in an OC-192 system.
Solution:
Let's pick f0 = 1 MHz and require JGrms ≤ 0.01 UI; SΦ,VCO(f0) then follows from the JG expressions above. Equivalently, the VCO must present a phase noise L of less than −79 dBc/Hz at 1-MHz offset.
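The −79 dBc/Hz figure can be reproduced by inverting the JG expressions for SΦ,VCO(f0). The sketch below inverts both Eq. 10.92 and Eq. 10.91; which of the two the original solution used is not visible in the text, so both are shown, with the OC-192 limits f1 = 20 kHz and f2 = 80 MHz taken from Fig. 28 for the finite-limit case.

```python
import math


def required_s_vco_unbounded(jg_rms_ui, f0, f_bw):
    """Invert Eq. 10.92 for S_phi,VCO(f0)."""
    return (2.0 * jg_rms_ui / f0) ** 2 * math.pi * f_bw


def required_s_vco_finite(jg_rms_ui, f0, f_bw, f1, f2):
    """Invert Eq. 10.91 for S_phi,VCO(f0)."""
    span = math.atan(f2 / f_bw) - math.atan(f1 / f_bw)
    return (2.0 * math.pi * jg_rms_ui / f0) ** 2 * f_bw / (2.0 * span)


def to_dbc_per_hz(s_linear):
    """Linear phase-noise density (1/Hz) to dBc/Hz."""
    return 10.0 * math.log10(s_linear)


JG_TARGET = 0.01  # UI, OC-192 requirement
F0 = 1e6          # Hz
F_BW = 10e6       # Hz

L_UNBOUNDED = to_dbc_per_hz(required_s_vco_unbounded(JG_TARGET, F0, F_BW))
L_FINITE = to_dbc_per_hz(required_s_vco_finite(JG_TARGET, F0, F_BW, 20e3, 80e6))

if __name__ == "__main__":
    print(f"Eq. 10.92: {L_UNBOUNDED:.1f} dBc/Hz at 1 MHz")
    print(f"Eq. 10.91: {L_FINITE:.1f} dBc/Hz at 1 MHz")
```

The two results are about −79.0 and −78.6 dBc/Hz, so either route rounds to the value quoted in the example.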
How about the JG of bang-bang CDRs? The output jitter is still dominated by the VCO noise. Once we obtain the transfer function Φout/ΦVCO of a binary loop, JG becomes readily available. To do so, we need to know the operating mode of the bang-bang PD under locked condition in the presence of VCO noise. Does the BBPD stay in the linear region of ±Φm most of the time? Or does it slew from time to time, as in the JTRAN and JTOL cases?
To answer this question, we go back to the definition of JG. Recall that JGrms = 0.01 UI. Even with a very narrow linear region, say, ±0.03 UI (±3σ), the BBPD still finds about 99.7% of the sampled phase errors within the linear region, assuming a Gaussian distribution. In other words, it is fair to say that the VCO phase noise experiences linear operation around the loop. With the same notation as Fig. 10.29, we recalculate the noise transfer function. The transfer function is still given by
$$\frac{\Phi_{out}}{\Phi_{VCO}} \approx \frac{s}{s + 2\zeta\omega_n}, \tag{10.93}$$
but with 2π replaced by Φm in the expressions for ωn and ζ, simply because the equivalent PD+CP gain here is Ip/Φm rather than Ip/2π. The loop bandwidth is thus equal to
$$\omega_{BW,BB} = 2\pi f_{BW,BB} = \frac{K_{VCO} I_p R_p}{\Phi_m}. \tag{10.96}$$
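As a consistency check, Eq. 10.96 can be recovered from the linear-loop parameters. The following is a reconstruction under the assumption that Eqs. 10.86 and 10.87 carry over with the PD+CP gain Ip/2π replaced by Ip/Φm; the intermediate expressions are not quoted from the original text.

```latex
\begin{align*}
\omega_n &= \sqrt{\frac{K_{VCO} I_p}{\Phi_m C_p}}, \qquad
\zeta = \frac{R_p}{2}\sqrt{\frac{K_{VCO} I_p C_p}{\Phi_m}}, \\
2\zeta\omega_n &= R_p \sqrt{\frac{K_{VCO} I_p C_p}{\Phi_m}}\,
                  \sqrt{\frac{K_{VCO} I_p}{\Phi_m C_p}}
               = \frac{K_{VCO} I_p R_p}{\Phi_m} = \omega_{BW,BB}.
\end{align*}
```

Since Φm is much smaller than 2π, the bang-bang loop bandwidth under this small-signal view is correspondingly larger than that of a linear loop with the same Ip, Rp, and KVCO.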
By the same token, we can estimate the jitter generation for bang-bang CDRs as
$$JG_{rms,BB} = \frac{f_0}{2}\sqrt{\frac{S_{\Phi,VCO}(f_0)}{\pi f_{BW,BB}}}\ \text{(UI)}. \tag{10.97}$$
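The linear-region argument and Eqs. 10.96 and 10.97 can be sketched together. The code below assumes a zero-mean Gaussian phase error, takes KVCO in rad/s/V and Φm in radians so that Eq. 10.96 yields rad/s, and uses illustrative function names; none of these conventions are stated in the text.

```python
import math


def fraction_in_linear_region(phi_m_ui, jg_rms_ui):
    """Probability that a zero-mean Gaussian phase error (std jg_rms_ui) lies within +/-phi_m_ui."""
    return math.erf(phi_m_ui / (jg_rms_ui * math.sqrt(2.0)))


def f_bw_bb_hz(kvco_rad_per_s_per_v, ip, rp, phi_m_rad):
    """Eq. 10.96 converted to Hz: f_BW,BB = K_VCO*Ip*Rp / (2*pi*Phi_m)."""
    return kvco_rad_per_s_per_v * ip * rp / (2.0 * math.pi * phi_m_rad)


def jg_rms_bb(s_vco_f0, f0, f_bw_bb):
    """Eq. 10.97: JG in UI for a bang-bang CDR operating in its linear region."""
    return (f0 / 2.0) * math.sqrt(s_vco_f0 / (math.pi * f_bw_bb))


# +/-0.03 UI linear region against 0.01 UI rms jitter, as in the text.
FRACTION = fraction_in_linear_region(0.03, 0.01)

if __name__ == "__main__":
    print(f"fraction of samples inside the linear region: {FRACTION:.4f}")
```

Halving Φm doubles fBW,BB and lowers JGrms,BB by a factor of √2, as long as the phase error still stays inside the narrower linear region.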
It is worth noting that other sources also cause jitter on the recovered clock, for example, undesired coupling from the data and supply noise. Building a sophisticated model is necessary for designers to accurately estimate the overall jitter generation performance.