Dunwell Dustin T 201211 PHD Thesis (4302)
by
Dustin Dunwell
Abstract
This thesis examines the design of high-speed wireline receivers that can be adapted
signal strengths, channel losses and operating frequencies is explored. In order to achieve
this flexibility, this thesis examines several key components of such a receiver.
First, a 15 Gb/s preamplifier with 10-dB gain control for the input stage of an analog
front end (AFE) is presented that automatically adjusts its power consumption to suit
the gain and linearity requirements of the AFE for various received signal strengths. The
gain of this preamplifier and the amount of peaking delivered by a linear equalizer
in the AFE are controlled using a new adaptation technique, which adds only a small
amount of overhead to the receiver. This adaptation scheme is able to sense changes in
the received signal conditions and automatically adjust the equalization and gain of the
In addition, this thesis presents the first clock multiplier with both a wide operating
frequency range and the ability to transition between completely off and fully operational
modes in under 10 cycles of the reference clock. This multiplier relies on the careful use
of several injection-locked oscillators (ILOs) with an aggregate lock range of 55.7% of the
3.16-GHz centre frequency. The design of these ILOs was facilitated by the use of a new
method for modeling the injection locking behaviour of oscillators. This model differs
from existing techniques in the way that it relies on the simulated response of an oscillator
Acknowledgements
This section is to acknowledge and thank the many parties that generously assisted
Firstly, the interest and enthusiasm shown by Dr. Anthony Chan Carusone in his
teaching has played a large part in sparking my own interest in very high speed analog
circuit design. His continuing guidance and support, not to mention his exceptional
problem solving skills, have been instrumental in the successful completion of my graduate
studies.
Secondly, thanks are owed to the other graduate students in the electronics group at
the University of Toronto who were never too busy to lend a helping hand and whose
company during long hours at the lab was a comfort during the most stressful times. I
sincerely hope that the relationships that I have made with my peers will last well beyond
my graduation.
Finally, I would like to thank my wife and best friend for her patience and unwavering
support of my studies during the better part of the past decade. She is the best mother
in the world and with her by my side, I know that we can handle any challenge. I look
Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Ethernet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Analog Front-Ends 8
2.2 Equalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.1.1 Measuring the Received PDF . . . . . . . . . . . . . . . . . . . . 23
3.3.1 S-parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.3 Wide Lock Range ILO Design . . . . . . . . . . . . . . . . . . . . . . . . 72
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
7 Conclusion 112
Bibliography 119
List of Tables
3.1 Component values used to implement the EQ design shown in Fig. 2.7. . 26
ison. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Comparison of lock ranges calculated using the ISF model and the PTC
6.1 Comparison of measured lock range of the breakout MILO to those calcu-
A.1 Summary of contributions made along with related locations in the thesis. 116
List of Figures
2.1 Variable gain can be implemented by using source degeneration with ad-
2.4 Drawing bias current through Rf raises the CM level at the output and
2.8 Source degeneration reduces the gain of the amplifier. At high frequencies
2.9 A typical wireline receiver with an AFE including gain and equalization
2.10 EQ adaptation can be performed by analyzing the high-frequency content
of the EQ output but this does not necessarily maximize the resulting eye
opening. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.11 Adjusting the (a) sampling threshold, (b) sampling time or (c) both pro-
duces an indication of the eye opening that can be used to adapt the AFE
settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
signal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 A block diagram of the VGA and EQ adaptation performed in this work.
The low-speed ADC digitizes the DC (average) output of the offset slicer.
3.3 The DC output of the slicer drops as Vth is increased. The magnitude of
the slope of this decline yields the PDF of the received data. . . . . . . . 25
3.4 Simulated gain of the EQ using Cadence Spectre with full RC-extraction
that the output spends at logic high. The slope of this curve is then used
3.6 Measurement results show that (a) increasing EQ peaking narrows the
PDF and increases its peak value and that (b) increasing VGA gain moves
this peak towards the target maximum threshold level, in this case set to
100 mV. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.7 Flow diagram showing how the equalizer and gain settings are determined. 29
3.8 A preamplifier with gain control replaces a fixed-gain preamplifier and
3.12 Measured S11 shows broadband input matching for various preamplifier
gain settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.13 (a) 4-PAM, 146 mVpp, 1.1 GS/s test signal applied to the receiver input
and (b) corresponding receiver output shows little distortion of the 4-PAM
eyes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.14 Receiver output for PRBS inputs at (a) 1 Gb/s with an 8 mVpp eye opening
and speed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
equalizer output eyes to the corresponding PDF and show that by choosing
the peak PDF, the algorithm settles to the correctly equalized eye. . . . . 39
3.16 Measured results show that the adaptation algorithm automatically in-
ing channel losses across (a) coaxial cables and (b) PCB traces. . . . . . 40
3.17 Eye diagram of the receiver output after automatic adaptation when re-
4.1 Oscillation amplitude of an LC tank will decay due to resistive losses unless
4.2 Injection of an oscillating signal, Iinj, can cause the LC tank output to
4.3 Shorting the tank during narrow injection pulses makes application of the
4.4 Equivalence between (a) the physical model of the series resistive losses
develop the frequency domain model breaks down for low Q values. . . . 46
4.6 Example impulse sensitivity functions for oscillators with (a) sinusoidal
4.7 Spectre simulations of the I and Q states of a four-stage VCO show that (a)
state (solid line). Repeated impulses (b) can lock the VCO to a differ-
ent frequency but this requires a new ISF to model the oscillator’s new
4.8 Injected impulses result in step changes in the oscillator output, which
4.9 Dividing an injected signal into impulses that act immediately on the
ILO output phase also shift the corresponding Γ(t) function, allowing for
determines the ISF of the oscillator. This ISF can then be used to predict
the ILO’s sensitivity to pulses with (b) larger amplitudes or (c) wider pulse
5.1 Definitions of the ILO input and output signals that will be used to develop
5.2 By comparing the zero crossing times of the ILO output to that of an
unperturbed copy, the phase change created by one period of the injected
5.3 The PTC, P(φ), is determined through simulation by applying one period
of the injected signal, Vinj(t), at different phases, φn, relative to the oscillator’s
output signal, Vout, and observing the resulting change in output
phase, P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4 Model representing the nonlinear phase relationship between the injected
reference signal at ωlock shows that (b) the phase change produced by an
injection event is sufficient to cancel the phase drift resulting from the
difference between ωlock and ω0. This allows the ILO to settle to the
ωlock and ω0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.7 An ILO’s lock time depends on the initial phase difference, ∆φ, between
the injected signal φ0 and the steady-state phase difference, φss, required
5.8 Spectre simulations of the (a) PTC and (b) transient phase response of a
4-stage ring ILO. When injection begins far from φss at φ01 the lock time
5.9 Lock time varies greatly depending on the phase at which the injected
5.10 The jitter tracking bandwidth of an ILO can be determined by applying a
step change to the phase of the injected signal and observing the resulting
change in the output phase. The displayed results are from Simulink
simulations of the proposed model, performed using the PTC of the ILO
that the 3-dB jitter tracking bandwidth can be accurately predicted over
ILO is correctly locked (dotted line). If the injected signal is beyond the
ILO’s lock range then this can be identified by slipping of the output phase
(solid line), which may not become apparent until the simulation has been
5.13 One stage of the four-stage CML injection locked ring oscillator. Apply-
injection strength. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.14 The PTC is determined by measuring the output phase change created
5.15 Injection into multiple ILO locations can increase locking range if the
5.16 Spectre simulation results show that the peak-to-peak amplitude of the
5.17 Creating pulses at the reference clock edges emphasizes the desirable har-
monic of the input thereby improving the lock range of the MILO. . . . . 78
5.18 The addition of a second edge detector with wide pulse widths further
5.19 Spectre simulations determine the PTC for ILOs using (a) one edge de-
tector, (b) two edge detectors and (c) two edge detectors used to produce
6.1 The design of a frequency agile clock multiplier that is suitable for fast
power cycling can achieve link flexibility and power savings in DVFS ap-
plications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2 This chapter presents the first clock multiplier that is frequency-agile and
6.3 Four MILOs with adjacent lock ranges can cover an aggregate output
6.6 Latches at the output of each stage of the first ILO can be used to compare
6.7 Latch chains verify the multiplication factor by ensuring that there are
exactly 2 rising and 2 falling clock edges within one half period of the
reference clock. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.8 A MILO can be powered down immediately if the TDC detects an out-
multiplication ratio. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.9 If two MILOs are locked to the correct frequency, the power-down decision
frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.10 Power down of any unused MILOs is accomplished by blocking tail currents
6.11 Converting the power down signal to CMOS logic levels ensures successful
6.12 Timing diagram of the power-on sequence for a 1-GHz reference signal. . 91
6.13 Power down of individual MILOs is enabled after 8 cycles and power down
10 cycles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.14 Using four delay stages and an XOR gate as the building block for each
component in the MILO ensures good matching between pulse widths and
6.17 Measured values of (a) φss for various frequency offsets can be translated
to (b) the PTC of the ILO. These measurements show good agreement
6.18 Measurements show wide lock ranges for MILO1, MILO2 and the Breakout
MILO, but problems with the reference clock distribution cause a reduction
6.19 Two MILOs are able to increase the overall multiplier lock range to 55.7%.
as the average (avg) current drawn when the multiplier is active for 50%
6.20 Measured results show that the addition of ILO2 provides a reduction in
a 4-bit pattern due to the pulse widths of the injected signal. . . . . . . . 101
6.22 A histogram of the total measured jitter shows four distinct peaks repre-
6.23 Including package parasitics at the simulated clock outputs identifies the
6.24 Increasing driver power (a) increases output swing to approximately 100
6.25 Histogram (a) and oscilloscope capture (b) showing that the RJ of the
6.26 Measurements at the output of both ILOs show that the RJ remains near
6.27 Test setup used to capture the transient response to an applied Start signal.105
6.28 Repeated 50-ns bursts allow for power-on transient behaviour of the mul-
6.29 Test setup used to capture the transient response of two output clocks. . 106
6.30 Transient startup behaviour when (a) two and (b) three MILOs are enabled
6.31 Simulated transient power consumption during one power-up cycle of the
6.32 Energy consumed by the power-up sequence means that efficiency of the
6.33 Average power consumption of the multiplier scales linearly with the per-
B.1 Schematic of the variable threshold slicer used in the AFE adaptation
scheme. Transistor sizes are given in µm with all gate lengths set to
B.2 Schematic of the CML latch used in the TDC and power down logic of
the MILO. Transistor sizes are given in µm with all gate lengths set to
B.3 Schematic of the CML XOR gate used in the pulse generators and ring
oscillators of the MILO. Transistor sizes are given in µm with all gate
B.4 Schematic of the two-stage, 50 Ω output driver used to send the MILO
output off-chip. Transistor sizes are given in µm with all gate lengths set
Chapter 1
Introduction
1.1 Motivation
Wireline transceiver speeds have increased dramatically over the past decade. While
this increase in speed is a driving force in the industry, other considerations such as
many applications the need for transmitters and, to a greater extent, receivers that are
apparent. These applications include short range communications such as memory links
for mobile devices, medium-length transmissions such as Ethernet links, and long-distance
lines such as cable TV and internet links. The following sections outline the motivation
In cable TV and internet communication links, the downstream channel uses a point-
to-multipoint scheme, which means that a carrier always exists, even when there is no
packet being transmitted. As a result, the subscriber set-top box or modem has ample
since bursts from different subscribers arrive with different power levels, symbol timing,
carrier phase/frequency offsets, and channel distortions [2]. The ability to receive and
decode data in such a link requires a receiver that is flexible and able to adapt to a wide
1.1.2 Ethernet
As standards such as 100 Gigabit Ethernet continue to push the speeds of networks, the
power consumption that such links require is encouraging the development of smarter link
options. In this context, “smarter” means the ability to make the best use of available
power and bandwidth constraints. One way of achieving this is through the use of real-
time systems that monitor network congestion and adapt the video quality and bit-rate
of streaming video. Such strategies to make best use of available bandwidth have recently
Ethernet standards require that transmitters be able to provide enough power to send
a signal up to 100 m, this new standard recognizes that significant power savings can
be realized by determining channel length and reducing power for shorter cable lengths,
which are often used in home networks. Each of these techniques requires an intelligent
receiver that is able to adapt to different data rates, received power levels, or amounts
of channel attenuation.
The speed of next-generation mobile memory interfaces for consumer electronics is con-
tinuing to increase with aggregate rates projected to reach 12.8 Gb/s in the near future
[4]. One factor impeding this progress, however, is the fact that battery energy density
and thermal dissipation constraints are expected to remain essentially constant, which
places a premium on both average and peak interface power consumption [4].
Further complicating the situation is the fact that bandwidth utilization in mem-
ory links varies by orders of magnitude over time, not just as different applications are
executed, but also for short periods of time within an individual application. This, com-
bined with the fact that average memory utilization is typically only a fraction of the
peak bandwidth [4], is making it necessary to design an interface that is able to adapt
operating frequency range is becoming an attractive feature since this ensures interoper-
ability with legacy devices [5], [6]. In order to make this practical, however, the power
consumed by such a memory link must also scale with the operating frequency, which
In order to accommodate the wide range of requirements listed in the previous section,
modern wireline receivers are, out of necessity, becoming more flexible. While variable
data rates have become commonplace [7], [8], [9], recently reported receivers have gone a
step further and now offer compatibility with a variety of standards such as PCIe, SATA
and 1 to 10-Gb/s Ethernet [10]. A vital requirement of any such receiver is the ability to
intelligently adapt the gain of a receiver’s analog front-end (AFE) in order to function
The first stage in any receiver should be well matched to the characteristic impedance
of the channel. It should provide as much gain as possible in order to optimize the
receiver’s noise performance and to open the received eye enough for bit decisions to
be made with a sufficiently low error rate. However, a gain that is too large can introduce
links employing multi-level signaling [7]. As a result, the addition of a variable gain
amplifier (VGA) to the AFE can help improve the dynamic range of a receiver, thereby
provides 10 dB of gain control while maintaining a good broadband match to the channel
impedance across all gain settings. The topology is similar to the single-stage, fixed-gain
a system that monitors and automatically adjusts the power consumption in order to
provide a constant common mode level at the amplifier output. In doing so, this output
common mode control also helps to maintain a good input impedance match across all
gain settings and provides the added benefit of increasing power consumption in order
signal swing.
of this preamplifier up to at least 15 Gb/s and show a good impedance match with
S11 remaining below -8 dB from 1 to 24 GHz over all gain settings. These results were
published at the International Symposium on Circuits and Systems (ISCAS) in 2010 [13].
varying channel conditions is the equalizer (EQ). Even in links using fixed channels such
as PCB traces, channel losses can vary by several dB as changes in temperature or hu-
midity are encountered. Furthermore, these variations have been shown to increase as
mented using a decision feedback equalizer or a linear equalizer, its coefficients must be
As a result, the method used to implement this adaptation is extremely important and
has therefore received a great deal of attention in recent research publications. Currently,
EQ settings are usually controlled by either minimizing the difference in frequency content
between the EQ and slicer outputs [15], or monitoring the eye opening at the EQ output
and adjusting either the decision threshold [16], the sampling time [17], or both [18]
until a target bit-error rate (BER) is achieved. Unfortunately, the former method offers
no guarantee that the resulting output eye is optimized and the use of several filters
makes its implementation expensive in terms of chip area. Conversely, the latter method
optimizes the received eye effectively but the eye monitoring required adds a great
a method for simultaneous adaptation of both the EQ and the VGA by monitoring not
only the vertical eye opening, but also the statistical distribution of the received data
levels. This adaptation technique requires the addition of very little circuitry to the
receiver and its ability to adapt both the EQ and VGA either while the circuit is active,
or in an initial “set and forget” calibration sequence, makes its implementation practical,
economical and effective. This research was published at the Custom Integrated Circuits
As discussed previously, the ability to switch between different power modes and oper-
ating frequencies can help to improve the energy efficiency of a wireline receiver. Un-
the design of the receiver’s clocking systems. The long lock time required by traditional
cies often makes them unsuitable for systems that require these features. As a result, the
receiver clock usually remains active at all times, with recent approaches relying on the
use of a variety of power-modes and bursting techniques to reduce average power [4].
While such techniques are effective, the need to operate the clock generating circuit
at all times introduces a significant power penalty. In addition, the complexity involved
in using many different power states with varying degrees of activity can often make their
only two power states—either completely off or completely on—and on reducing the
fully operational in less than 8 ns. This MILO is designed to have a lock range equal to
7%, which is sufficient to allow it to lock to its 2.8-GHz free-running frequency despite
variations in voltage and temperature. This MILO was designed as part of a team while
on internship at Rambus Inc. and was published as part of a fully bidirectional link at
Chapter 6 extends this work to show that the lock range of a MILO can be increased
to over 40% of its free-running frequency through the use of frequency pre-conditioning.
It then shows how the lock range can be further extended, in theory to any desired value,
by using parallel MILOs with adjacent lock ranges. The resulting circuit minimizes the
associated power penalty through the use of several logic blocks that evaluate the output
signals of adjacent MILOs and power down all but the one providing the clock at the
desired frequency. These logic blocks operate quickly, allowing the circuit to transition
from zero power to operation at the desired frequency in less than 10 cycles of the
reference clock. The result is the first clock multiplier that is able to combine fast power-
cycling with the ability to adapt to a wide range of operating frequencies. This frequency
agility is achieved without the use of any external controls such as manual tuning of the
oscillator’s operating frequency. This work was developed at the University of Toronto,
not during the Rambus internship, and has been submitted for publication at CICC in
September 2012.
Although MILOs are typically only capable of achieving narrow lock ranges on the order
of a few percent of the free-running frequency [22], the lock range of the MILO described
above was increased significantly with the help of a new, simulation-based technique for
ILO modelling.
Traditionally, ILOs have been analyzed using the frequency domain model, first pro-
posed in [23]. Although originally limited to LC oscillators using small, sinusoidal injected
signals, variations of the frequency domain model have recently been introduced in or-
der to handle large injection strengths, ring oscillators and other injected signal shapes
[24], [25], [26]. Although attempts have been made to generalize the model to make it
applicable in a wide range of situations [26], the results remain complex. In addition, all
versions of the model use a variety of quasi-physical variables, which are not well defined
and whose values are difficult to determine before measured results are available. As a
result, this model is difficult to apply during the design phase of an ILO and is therefore
ILO’s behaviour. This method uses SPICE-level simulations to determine the change in
an oscillator’s output phase in response to any injected signal. This phase response at
the output is dependent on the phase at which the injected signal is applied, which gives
rise to the term “PTC”. Chapter 5 explains this concept in detail and shows how it can
be used to predict the lock time, jitter tracking behaviour and lock range of an ILO. This
technique is then demonstrated through the design of a MILO with wide lock range that
is suitable for fast power-on applications, as described in the previous section. This work
has been submitted for publication in IEEE Transactions on Circuits and Systems I (TCAS-I).
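The idea of using a phase response to predict locking can be illustrated with a minimal sketch. It assumes a hypothetical sinusoidal PTC (in the thesis the PTC is obtained from SPICE-level simulation, not from a formula), and the function names, sign conventions and numerical values are illustrative assumptions only. Once per injection event the phase advances by the drift caused by the frequency offset plus the phase change given by the PTC.

```python
import math

def ptc(phi, strength=0.3):
    """Hypothetical PTC: output phase change (rad) for injection at phase phi."""
    return -strength * math.sin(phi)

def cycles_to_lock(phi0, drift, tol=1e-3, max_cycles=1000):
    """Iterate the phase once per injection event.

    drift is the phase accumulated per injection period due to the
    difference between the injected and free-running frequencies.
    Returns the number of cycles until the per-cycle phase change falls
    below tol, or None if the phase keeps slipping (no lock).
    """
    phi = phi0
    for cycle in range(1, max_cycles + 1):
        step = drift + ptc(phi)  # net phase change over this cycle
        phi += step
        if abs(step) < tol:
            return cycle
    return None

print(cycles_to_lock(phi0=0.5, drift=0.1))  # starts near the steady-state phase
print(cycles_to_lock(phi0=2.5, drift=0.1))  # starts far away, takes longer
print(cycles_to_lock(phi0=0.5, drift=0.4))  # drift exceeds max |PTC|: no lock
```

Under these assumptions the oscillator locks only when the drift per cycle can be cancelled by some value of the PTC, and the number of cycles needed depends on the phase at which injection begins.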
Chapter 2
Analog Front-Ends
receivers incorporate an analog front-end (AFE) in order to compensate for these im-
pairments and improve the quality of the received data before it can be converted to
the digital domain. Although the elements contained in an AFE can vary depending
on the intended application, variable gain amplifiers (VGAs) and equalizers (EQs) are
two common circuits found in many front ends. This chapter presents a study of the
state-of-the-art of each of these two components in order to serve as a foundation for the
In addition to the AFE circuit blocks themselves, this chapter examines the important
issue of adaptation of their control signals. The ability to automatically adapt the control
signals of the AFE components has a direct impact on their effectiveness and has therefore
in this chapter, which will provide a basis for the adaptation method proposed in the
following chapter.
In DSP-based receivers, or any link employing multilevel signaling, the analog front end
(AFE) of the receiver must avoid introducing non-linear distortions to the signal before
passing it to the multilevel slicer (ADC). In order to achieve the best possible dynamic
range, the variable gain stage should be implemented as close to the receiver input as
possible, so that overall gain can be reduced before nonlinearities are introduced by the
input stages.
One method commonly used to implement a variable gain amplifier is to use a dif-
ferential pair with variable source degeneration, as shown in Fig. 2.1. The addition of
source degeneration resistor Rs helps to reduce the signal swing applied between the
gate and source of the input transistors, thereby making the input/output characteristic
$$ A_v = \frac{R_d}{R_s}. \tag{2.1} $$
By controlling the value of Rs the gain can therefore be controlled directly. If this stage
is used as the first stage in a receiver’s AFE then resistive components R1 and R2 can be
added to the input to achieve a broadband impedance match. Control of these resistance
values can also be implemented to achieve further gain control by attenuating the signal
$$ A_v = \frac{R_2}{R_1 + R_2} \cdot \frac{R_d}{R_s} \tag{2.2} $$
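As a rough numerical illustration of Eqs. (2.1) and (2.2), the sketch below evaluates the approximate gain for a few degeneration settings. The resistor values and function names are hypothetical, and the expressions are the first-order approximations above rather than a full small-signal model.

```python
import math

def vga_gain(rd, rs, r1=0.0, r2=None):
    """Approximate gain from Eq. (2.2): input divider R2/(R1+R2) times Rd/Rs.

    With r2=None the input divider is omitted, which reduces to Eq. (2.1).
    """
    divider = 1.0 if r2 is None else r2 / (r1 + r2)
    return divider * rd / rs

def to_db(gain):
    """Convert a voltage gain to decibels."""
    return 20.0 * math.log10(gain)

# Hypothetical values in ohms, not taken from the thesis.
RD = 200.0
for rs in (50.0, 100.0, 200.0):
    print(f"Rs = {rs:5.0f} ohm -> Av = {to_db(vga_gain(RD, rs)):6.2f} dB")
```

In this approximation each doubling of Rs lowers the gain by about 6 dB, and an input divider with R1 = R2 would lower it by a further 6 dB.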
Figure 2.1: Variable gain can be implemented by using source degeneration with adjustable
resistance. Resistors R1 and R2 can be added to increase gain control and provide a
broadband impedance match [29].

Figure 2.2: Variable gain can be achieved by activating a variable number of unit-sized
differential pairs, effectively changing the input transistor width [30].

A second, more recently proposed method for implementing a VGA is to vary the
effective size of the input transistors by turning unit-sized differential pairs on or off, as
shown in Fig. 2.2 [30]. In this topology, since the bias current is kept constant regardless
of the number of active differential pairs, N, the gain of the circuit is proportional to
√N as

$$ A_v = \sqrt{2 \mu_n C_{ox} \frac{N W_u}{L} I_d} \, R_d \tag{2.3} $$
where Wu is the width of a transistor in a unit-sized differential pair. Therefore, for low-
gain settings the current density in each active transistor is increased, helping to reduce
While this technique has been shown to achieve good dynamic range, with gain vari-
ation of over 6 dB per stage [30], it is difficult to implement as the first stage of an AFE
since creating a broadband impedance match between the variable-sized input devices
and the channel is challenging. As a result a resistive matching network is often used,
which can be detrimental to the bandwidth and noise performance of the preamplifier.
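The √N dependence in Eq. (2.3) can be made concrete with a short sketch. The number of unit pairs is a hypothetical value chosen for illustration, and the calculation assumes the square-law expression of Eq. (2.3) holds at every setting.

```python
import math

def relative_gain_db(n_active, n_total):
    """Gain with n_active of n_total unit pairs on, relative to all pairs on.

    Follows the sqrt(N) dependence of Eq. (2.3) at constant bias current.
    """
    return 20.0 * math.log10(math.sqrt(n_active / n_total))

N_TOTAL = 8  # hypothetical number of unit-sized differential pairs
for n in (8, 4, 2, 1):
    print(f"N = {n}: {relative_gain_db(n, N_TOTAL):6.2f} dB")
```

Under this model, halving the number of active pairs reduces the gain by about 3 dB, so a 6 dB range needs a four-to-one change in N.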
matching network through the use of the shunt-feedback configuration commonly associ-
ated with optical-input transimpedance amplifiers. This topology, displayed in Fig. 2.3,
is analyzed for electrical wireline applications in silicon technologies in [1] and is shown
to outperform the preamplifiers in Fig. 2.1 and Fig. 2.2 in terms of bandwidth and power
dissipation.
The linearity of this input stage can be improved by drawing current through feedback
resistor Rf with the use of transistor M1 as shown in Fig. 2.4 [12]. This raises the CM
level at Vout and obviates the need for a common-source level-shifting stage following
the preamplifier, which in turn increases the voltage swing that can be achieved at Vout
without driving the common source transistor into triode operation. While the addition
of M1 to the receiver input can degrade noise performance, this can be mitigated by
by adjusting the value of Rf [31] since this resistor determines the low-frequency
transimpedance

$$ R_T = R_f \frac{A}{1 - A} \tag{2.4} $$

where A is the open-loop voltage gain of the amplifier [32]. Changing Rf, however, also
has the effect of changing the input impedance of the preamplifier, approximately given
by

$$ R_{in} = R_f \frac{1}{1 - A} \tag{2.5} $$

Figure 2.4: Drawing bias current through Rf raises the CM level at the output and
increases the voltage swing headroom for linear operation.
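The coupling between gain and input match expressed by Eqs. (2.4) and (2.5) can be seen in a small numerical sketch. The open-loop gain and resistor values are hypothetical, and A is assumed to be negative (an inverting amplifier) so that 1 − A is positive.

```python
def transimpedance(rf, a):
    """Low-frequency transimpedance from Eq. (2.4): RT = Rf * A / (1 - A)."""
    return rf * a / (1.0 - a)

def input_resistance(rf, a):
    """Approximate input resistance from Eq. (2.5): Rin = Rf / (1 - A)."""
    return rf / (1.0 - a)

A = -10.0  # hypothetical open-loop gain; negative sign assumed (inverting)
for rf in (275.0, 550.0, 1100.0):  # hypothetical feedback resistances in ohms
    print(f"Rf = {rf:6.0f}: RT = {transimpedance(rf, A):8.1f} ohm, "
          f"Rin = {input_resistance(rf, A):6.1f} ohm")
```

With these assumed numbers only Rf = 550 Ω yields a 50 Ω input resistance, which illustrates why gain control through Rf alone conflicts with the input match.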
Since the preamplifier input must be matched to the impedance of the channel to avoid
gain control in the second stage of the receiver chain, maintaining a fixed preamplifier
gain and impedance match [33]. However, the fixed gain of the input stage using this
2.2 Equalization
The frequency-dependent losses introduced by the lossy channels used in most wireline
communications links can severely degrade the quality of the received signal. This can
in turn limit the operating speeds unless some method can be introduced to equalize
the amount of loss encountered at all relevant frequencies. This can be accomplished
through the use of either decision feedback equalizers (DFEs) [34], continuous time linear
Of these two options, DFEs typically outperform CTLEs due to their ability to adapt to
a wide variety of channel conditions and to compensate for post-cursor intersymbol in-
typically limited to lower speeds than CTLEs, they have recently been shown to operate
design. Instead, the EQ used in the AFE presented in the following chapter is included
to illustrate the operation of the AFE adaptation algorithm. As a result, a CTLE was
chosen for its simplicity and ease of implementation. Therefore the discussion of EQ
The measured S21 of three lengths of 75 Ω coaxial cable is illustrated in Fig.
2.5. These measurements clearly show that the loss of these channels increases as the
frequency content of the transmitted signal increases. In order to equalize these losses,
CTLEs aim to introduce some combination of low-frequency gain (or loss) and high-
frequency gain to the AFE. This concept is captured by the conceptual diagram shown
in Fig. 2.6.
[Figure 2.5 plot: measured S21 [dB], -10 to -100 dB, versus frequency [Hz], 10^7 to 10^11 Hz, for 10 m, 30 m and 50 m cables]
Figure 2.5: Measured losses of three coaxial cables illustrate frequency-dependent losses.
[Figure 2.6 diagram labels: Vin, high-pass path, low-pass path, Veq, Vout]
Figure 2.6: Channel losses can be equalized by introducing a combination of high fre-
quency gain and low frequency gain/loss to the receiver.
Chapter 2. Analog Front-Ends 15
[Figure 2.7 schematic labels: inputs V+ and V-, output Vout, transistors M1 to M8, inductor LD, resistors RHP, current sources IBIAS, ILF and IHF]
Figure 2.7: By combining the outputs of high-pass and low-pass amplification stages and
adjusting the weights of their contributions, channel losses can be equalized across all
frequencies of interest.
ferential pairs as shown in Fig. 2.7 [37]. By shorting the drains of transistors M3 and M4
at low frequencies, inductor LD enables high frequency amplification of the input data
while attenuating low frequencies. By controlling the current sources ILF and IHF the
contributions from the high pass and low pass paths can be adjusted according to the
While this technique offers an intuitive and straightforward way to implement
equalization, a similar effect can also be achieved by introducing a low-frequency zero into a
single amplification stage. One popular method of achieving this is through the addition
topology the source resistance Rs provides a reduction in gain, as described in the pre-
vious section. However, at high frequencies capacitor Cs becomes a short circuit and the
[Figure 2.8 schematic labels: load resistors Rd, inputs Vinp and Vinn, output Vout, degeneration network Rs and Cs]
Figure 2.8: Source degeneration reduces the gain of the amplifier. At high frequencies
this reduction is eliminated by the short circuit created by Cs resulting in higher gain at
high frequencies and equalization of the channel losses.
$$A_v = \frac{R_d}{R_s \parallel \frac{1}{sC_s}} \qquad (2.6)$$
$$= \frac{R_d (1 + sC_s R_s)}{R_s} \qquad (2.7)$$
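The frequency dependence of Equation (2.7) can be checked numerically with the short sketch below. The values of Rd, Rs and Cs are hypothetical and chosen only for illustration. Equation (2.7) is a simplified expression, so the sketch omits the poles that limit the gain of a real stage at high frequencies.

```python
import math


def degenerated_gain(freq_hz, rd, rs, cs):
    """Evaluate Equation (2.7), Av = Rd * (1 + s*Cs*Rs) / Rs, at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * freq_hz
    return rd * (1.0 + s * cs * rs) / rs


def zero_frequency_hz(rs, cs):
    """Frequency of the zero, 1 / (2*pi*Rs*Cs), where the gain starts to rise."""
    return 1.0 / (2.0 * math.pi * rs * cs)


# Hypothetical component values, chosen only for illustration.
RD, RS, CS = 200.0, 100.0, 1e-12
f_z = zero_frequency_hz(RS, CS)
for f in (0.1 * f_z, f_z, 10.0 * f_z):
    gain_db = 20.0 * math.log10(abs(degenerated_gain(f, RD, RS, CS)))
    print(f"f = {f / 1e9:6.2f} GHz -> |Av| = {gain_db:5.2f} dB")
```

Well below the zero the gain sits at Rd/Rs, at the zero frequency it is 3 dB higher, and above the zero it rises at 20 dB per decade.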
This allows for larger gain at high frequencies, thereby equalizing the high frequency
channel losses. The frequency at which the gain begins to increase can be controlled by
The ability to intelligently control the AFE blocks in a way that optimizes receiver speed,
previous section, channel losses can change dramatically when the length of the channel
is changed. Even for fixed channels, variations in loss can occur due to factors such as
temperature or humidity. As a result, control signals for the VGA and EQ, as shown in
Fig. 2.9, should be generated automatically and should be able to adapt to a variety of
[Figure 2.9 block diagram labels: channel, ADC, gain control, EQ control, automatic adaptation]
Figure 2.9: A typical wireline receiver with an AFE including gain and equalization
control signals that should be generated automatically and intelligently.
channel conditions.
One way that the EQ control signal can be generated is by minimizing the difference
between the high-frequency content of the EQ and slicer outputs [38] as shown in Fig.
2.10. While this technique can also include low-frequency gain control by adding a second,
similar loop containing low-pass filters [15], the necessary analog filters can be difficult to
design accurately and can consume a great deal of chip area. Furthermore, this technique
offers no guarantee that the resulting EQ control achieves optimal bit error rate (BER)
performance for the given channel conditions. Instead, this section focuses on digital
adaptation schemes that monitor the opening of the received eye in the time domain in
In order to optimize the BER, it is necessary to develop a picture of the eye opening at the
slicer (or ADC) input. This requires moving the sampling point from its ideal location
at the center of the eye until errors are detected by comparing this output to the actual
data obtained by sampling at the ideal point. This movement away from the ideal point
can be performed by using additional slicers with either a variable decision threshold for
[Figure 2.10 block diagram labels: EQ, driver, high-pass filters, rectifiers, summing node, EQ control]
[Figure 2.11 panels (a), (b) and (c); labels: thresholds v1, v2 and vth; sampling times ts, t1 and t2]
Figure 2.11: Adjusting the (a) sampling threshold, (b) sampling time or (c) both produces
an indication of the eye opening that can be used to adapt the AFE settings.
vertical eye monitoring [16], variable sampling time for horizontal eye monitoring [17],
or both variable threshold and sampling time for two-dimensional eye monitoring [18].
These three types of monitors are illustrated in Fig. 2.11 where the traditional goal for
[Figure 2.12 block diagram labels: channel output, threshold control, BER-based adaptation, clk1, clk2, delay elements τ, CDR, Rx Data]
Figure 2.12: An example of a BER-based adaptation used to optimize DFE tap settings
[39].
If the threshold levels and sampling phases can be varied independently and with
a fine resolution then this technique can be extended further to obtain complete two-
dimensional eye opening data and, by counting the frequency of errors at various
sampling points, BER contours can therefore be developed. An example of a circuit used
to implement this approach for adaptation of a DFE is shown in Fig. 2.12 [39]. By
comparing the resulting BER to some target value, the adaptation algorithm is able to
This technique can guarantee optimal receiver performance by minimizing the BER
and represents the state of the art in adaptation techniques. It can, however, be very
high-speed XOR gate and error counter as well as multiple clocks with finely tunable
phases. It also does not incorporate any gain control into the AFE, limiting the dynamic
range of the receiver. As a result, the development of simpler adaptation schemes that
can achieve similar performance presents an attractive area for further work on this topic.
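To make the error-counting step of the BER-based approach concrete, the following minimal sketch models a reference slicer at the ideal threshold and an auxiliary slicer at an offset threshold, and counts the bits on which their outputs differ (the XOR operation). The received samples are synthetic (two levels plus Gaussian noise standing in for ISI and noise) and all names are hypothetical. It is a behavioural model under those assumptions, not a description of the hardware in [39].

```python
import random


def xor_error_rate(samples, vth_offset, vth_ref=0.0):
    """Fraction of samples on which the offset slicer disagrees with the reference slicer."""
    errors = sum((v > vth_ref) != (v > vth_offset) for v in samples)
    return errors / len(samples)


# Synthetic received samples: +/-100 mV levels plus Gaussian noise (assumed values).
rng = random.Random(0)
samples = [rng.choice((-0.1, 0.1)) + rng.gauss(0.0, 0.02) for _ in range(20000)]

# Sweep the auxiliary threshold from the eye center towards the logic-high level.
thresholds = [i * 0.01 for i in range(13)]
contour = [(vth, xor_error_rate(samples, vth)) for vth in thresholds]
for vth, rate in contour:
    print(f"vth = {vth * 1e3:5.1f} mV -> error rate = {rate:.4f}")
```

Each point requires a bit-by-bit comparison at the full data rate, which is the hardware cost noted in the text. Repeating the sweep at several sampling phases would give a two-dimensional contour.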
2.4 Summary
Variable gain and equalization are two important functions performed by many AFEs
used in wireline communication links. Adding equalization at some stage of the AFE can
help to mitigate frequency-dependent channel losses and help to improve the eye opening.
Adding variable gain to the input stage of the AFE can help to maximize the dynamic
range of the receiver. The use of shunt-feedback in the input stage can provide low-
noise, broadband functionality and presents the opportunity to implement gain control
by varying the resistance of the feedback path. This implementation, however, can
degrade the impedance match with the channel, creating undesirable reflections.
Both gain and equalization controls must be generated automatically and should be
intelligently adapted in order for the receiver to function correctly over a range of channel
conditions. While two dimensional monitoring of the equalized eye diagram can adjust
This chapter introduces a technique for the automatic adaptation of both the gain and
equalizer control signals in an AFE. Rather than targeting a specific minimum BER
as in [39], this algorithm tightens the distribution of the received signal amplitude and
centers it at a specific, pre-determined optimal level. This ensures that the vertical
dimension of the received eye is as open as possible for the given channel conditions,
while avoiding excessive gain, which can compromise the receiver’s dynamic range. This
technique requires only a single sampling phase at the center of the received eye and is
It also requires only minimal hardware overhead as it does not require a variety of
clock phases or a high-speed XOR gate to count errors but instead uses only the DC
output voltage of one additional comparator to determine the appropriate EQ and VGA
control signals. These factors make this adaptation strategy attractive not only as an
efficient means of optimizing AFE settings, but also as a way of quickly and easily detecting
faults and predicting circuit performance in a production testing environment, which can
Chapter 3. Analog Front-End Adaptation 22
[Figure 3.1 diagram labels: levels VL and VRL, sampling phases τ and τ′, PDFs fXτ(x) and fYτ′(x)]
Figure 3.1: A noiseless transmitted signal encounters noise and ISI in a frequency-
dependent channel, which results in spreading of the PDF of the received signal.
data is generated by an ideal, noise-free binary transmitter and sampled at the baud
rate, Tb , and at phase τ , corresponding to the midpoint of each bit then each sample is
Xτ (k) will be constrained to two possible states: logic high, VH , and logic low, VL , which
and whose values are defined by the voltage swing constraints inherent to the transmitter.
The resulting transmitted eye diagram and the probability density function (PDF)
of Xτ, denoted by fXτ(x), are shown on the left side of Fig. 3.1. Here the PDF is confined
and part of the AFE, the eye at the EQ input will show intersymbol interference (ISI) in
the form of jitter and a spreading of the received signal amplitudes due to the frequency-
dependent losses of the channel, reflections at connection points, and bandlimited stages
at the input of the receiver’s AFE. A new random variable, Yτ′(k), is then obtained by
sampling the received data at a phase corresponding to the midpoint of each received
bit, given by τ′. The PDF of this variable is given by fYτ′(x) and is illustrated on the
The received data is no longer exclusively constrained to two discrete voltage levels but
is instead spread over a range of values by the introduced ISI and noise. Hence, the PDF
of the received data is not represented by impulse functions, but is instead distributed
around new high and low levels, VRH and VRL. A detailed derivation of fYτ′(x) in terms
of the transmitter and channel characteristics can be found in [41]. Although the PDF
has been used previously as an efficient way to detect the BER of a received signal [32],
it has not been used as a metric for the adaptation of AFE components. The premise of this
work is to monitor the PDF of the received data and adjust gain and high-frequency
peaking controls in the AFE in an effort to return to the ideal PDF of the transmitted
signal.
If the threshold voltage of the slicer at the AFE output is moved far enough away from
the center of the received eye, the slicer will begin to generate errors. If this output is
available at the same time as the error-free output from a slicer with the correct threshold
level, then these errors can easily be detected by passing both slicer outputs to an XOR
gate. Counting the number of errors generated by each threshold step can be used to
generate BER contours, which can in turn be used to select an optimum slicer threshold
In this work this process is simplified by observing that as the threshold voltage is
swept, the DC output of the slicer, Vout , is proportional to the cumulative distribution
[Figure 3.2 block diagram labels: prototype I.C. (channel output, VGA, EQ, slicers, data drivers), National Instruments PCI-6024E data acquisition card (DACs, low-speed ADC), PC running adaptation algorithm in Matlab]
Figure 3.2: A block diagram of the VGA and EQ adaptation performed in this work. The
low-speed ADC digitizes the DC (average) output of the offset slicer. This information
is used by the adaptation algorithm to minimize the spreading of the received PDF.
function (CDF) of the input signal. Since the CDF of the received signal yields informa-
tion equivalent to a BER contour, it is possible to replace the high-speed XOR gate and
error counter with a simple low-pass filter. The prototype chip along with the off-chip
hardware used to demonstrate this operation is shown in Fig. 3.2. The fabricated chip is
slicer can be found in Appendix B, Fig. B.1. The output of this variable-threshold slicer
is taken off-chip and passed to a low-speed ADC (i.e. an ADC preceded by a low-pass
filter), which produces the DC (average) voltage of the receiver output. The ADC and
DACs are provided by a National Instruments PCI-6024E data acquisition card, which
runs at 200 kS/s and provides 12 bits of resolution. This card allows for the adaptation
algorithm to be run using a PC where simple Matlab control logic is used to generate
[Figure 3.3 plot labels: Vout and CDF versus threshold, threshold sweep, levels VRH, VCM and VRL]
Figure 3.3: The DC output of the slicer drops as Vth is increased. The magnitude of the
slope of this decline yields the PDF of the received data.
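The relationship between the slicer's DC output and the PDF can be illustrated with a short behavioural sketch. The received samples below are synthetic (two levels with Gaussian spreading) and the function names are hypothetical. The fraction of time the slicer output is high falls as the threshold rises, tracking one minus the CDF, and the magnitude of its slope approximates the PDF.

```python
import random


def slicer_high_fraction(samples, vth):
    """Fraction of time a slicer with threshold vth outputs logic high (its normalized DC output)."""
    return sum(v > vth for v in samples) / len(samples)


def pdf_from_sweep(thresholds, high_fraction):
    """Finite-difference slope magnitude of the swept DC output, as (midpoint, density) pairs."""
    pdf = []
    for i in range(len(thresholds) - 1):
        dv = thresholds[i + 1] - thresholds[i]
        midpoint = 0.5 * (thresholds[i] + thresholds[i + 1])
        pdf.append((midpoint, abs(high_fraction[i + 1] - high_fraction[i]) / dv))
    return pdf


# Synthetic received samples: +/-200 mV levels with 10 mV Gaussian spreading (assumed values).
rng = random.Random(1)
samples = [rng.choice((-0.2, 0.2)) + rng.gauss(0.0, 0.01) for _ in range(20000)]

# Sweep the threshold across the logic-high level in 5 mV steps.
thresholds = [0.150 + 0.005 * i for i in range(21)]
high_fraction = [slicer_high_fraction(samples, vth) for vth in thresholds]
pdf = pdf_from_sweep(thresholds, high_fraction)
peak_threshold, peak_density = max(pdf, key=lambda point: point[1])
print(f"PDF peak of {peak_density:.1f} per volt near {peak_threshold * 1e3:.1f} mV")
```

In hardware the averaging is performed by the low-pass filter ahead of the low-speed ADC rather than by counting samples, so only one DC measurement per threshold step is needed.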
The PDF of the received signal is equal to the slope of the CDF obtained by sweeping
the threshold of the auxiliary slicer, as illustrated in Fig. 3.3. The equalizer peaking is
then chosen to maximize the slope of the CDF (i.e. the peak value of the PDF) since
this corresponds to the narrowest possible spread of the PDF and therefore the lowest
The EQ implemented in this receiver is a continuous time linear equalizer, which creates
same as that introduced in [37], as shown previously in Fig. 2.7. This equalizer was im-
plemented in 65-nm CMOS and Spectre simulation results using post-layout extraction,
shown in Fig. 3.4, illustrate that the equalizer is capable of producing high-frequency
peaking of up to 8 dB. It should be noted that this peaking occurs at a frequency close
to 20 GHz, which was accidentally overdesigned for the target receiver speed. All tran-
sistors were implemented with minimum gate lengths and with widths and bias currents
[Figure 3.4 plot: gain [dB] versus frequency [GHz], 10^-1 to 10^2 GHz, for VEQ = 0 V, 0.4 V, 0.6 V and 1 V]
Figure 3.4: Simulated gain of the EQ using Cadence Spectre with full RC-extraction of
the circuit shows up to 8 dB of high-frequency peaking.
Table 3.1: Component values used to implement the EQ design shown in Fig. 2.7.
To illustrate the adaptation of this EQ, Fig. 3.5 shows measured results taken from a
prototype binary receiver fabricated in 65-nm CMOS. In Fig. 3.5(a) the DC slicer output
voltage is used to infer the percentage of time that the slicer spends at logic high (VRH ),
which is equivalent to the CDF of the received random variable. As the threshold level
is increased, this percentage increases from 50% when the threshold is in the center of
the received eye, to 100% when the threshold is outside the eye. When alternating data
is transmitted, a sharp rise in the CDF is observed near a threshold level of 200 mV
because there is no ISI present. Conversely, the presence of ISI when 2^7-1 PRBS data
is transmitted means that a more gradual change is observed in the CDF. This effect is
[Figure 3.5 plots: panels (a) and (b) versus threshold level (mV), 150 to 300 mV, for an alternating pattern and PRBS data]
Figure 3.5: Measured DC output voltage is proportional to (a) the percentage of time
that the output spends at logic high. The slope of this curve is then used to create (b)
the PDF of the received signal.
apparent in Fig. 3.5(b) where the PDFs of the two received patterns are compared.
The adaptation algorithm was tested when receiving a 2^7-1, 2 Gb/s PRBS signal,
transmitted across a 10 m BNC coaxial cable. Fig. 3.6(a) shows the PDF of the logic high
level of the received data for five different equalizer peaking settings. As the EQ control
voltage Veq is increased, the low-frequency content of the received signal is attenuated,
meaning that the peak slope of Vout occurs at incrementally lower threshold levels. At
the same time, the resulting emphasis of the high-frequency content helps to reduce ISI,
narrowing the PDF of the received data and increasing its peak value. In this case, the
adaptation algorithm selects Veq = 0.65 V to be the best equalizer setting because it
produces the highest peak value of the PDF (i.e. the largest slope of the CDF). Since
adjusting the settings of a peaking EQ affects the signal amplitude, this adaptation is
performed before the VGA adaptation, which is described in the following section.
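The selection rule described above (choose the Veq that gives the largest PDF peak) can be sketched as follows. The sweep data are synthetic: each candidate Veq is assigned an assumed Gaussian spread and level, which stands in for the measured DC outputs and is not the measured data of Fig. 3.6. The helper names are hypothetical.

```python
import math


def pdf_peak(thresholds, dc_levels):
    """Return (peak_slope, threshold_at_peak) of |d(dc)/d(vth)| using finite differences."""
    best = (0.0, thresholds[0])
    for i in range(len(thresholds) - 1):
        slope = abs(dc_levels[i + 1] - dc_levels[i]) / (thresholds[i + 1] - thresholds[i])
        if slope > best[0]:
            best = (slope, 0.5 * (thresholds[i] + thresholds[i + 1]))
    return best


def select_veq(sweeps):
    """sweeps maps each candidate Veq to (thresholds, dc_levels); pick the largest PDF peak."""
    return max(sweeps, key=lambda veq: pdf_peak(*sweeps[veq])[0])


def synthetic_sweep(thresholds, level_v, sigma_v):
    """Assumed stand-in for a measurement: normalized DC output for a Gaussian-spread high level."""
    return [0.25 * math.erfc((vth - level_v) / (sigma_v * math.sqrt(2.0))) for vth in thresholds]


# Synthetic sweeps: (logic-high level, spread) per Veq are invented for illustration.
thresholds = [0.100 + 0.005 * i for i in range(41)]
sweeps = {
    0.27: (thresholds, synthetic_sweep(thresholds, 0.220, 0.030)),
    0.46: (thresholds, synthetic_sweep(thresholds, 0.200, 0.020)),
    0.65: (thresholds, synthetic_sweep(thresholds, 0.180, 0.010)),
    0.84: (thresholds, synthetic_sweep(thresholds, 0.160, 0.025)),
}
best_veq = select_veq(sweeps)
print(f"Selected Veq = {best_veq} V")
```

The narrowest assumed spread produces the steepest slope and is therefore selected. The threshold at which that peak occurs is also returned by `pdf_peak`, so it can be reused as the amplitude estimate for gain adaptation.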
The VGA can help keep the signal amplitude within some specified range throughout
the receiver signal path in order to ensure the best possible dynamic range in the AFE.
[Figure 3.6 plots: panels (a) and (b); recovered legend entries Veq = 0.27 V, 0.36 V, 0.46 V and 0.55 V, and Vgain = 1.2 V and 1.6 V]
Figure 3.6: Measurement results show that (a) increasing EQ peaking narrows the PDF
and increases its peak value and that (b) increasing VGA gain moves this peak towards
the target maximum threshold level, in this case set to 100 mV.
Fortunately, the amplitude of the signal at the equalizer output can be readily observed
using the additional slicer. As the gain of the AFE is increased or decreased, the received
signal PDF is observed as described in the previous section and its peak can be used
to indicate the signal amplitude. For the 65-nm CMOS receiver in this work, it was
determined that the voltage swing of the equalizer output eye should not exceed 100 mV
As will be explained in the following section, the gain of the VGA used in this work
decreases as control signal Vgain is increased. From the measured results in Fig. 3.6(b)
it is apparent that the peak of the PDF shifts to higher voltages for higher preamplifier
gain settings, as expected. In this case the adaptation algorithm chooses a Vgain setting
of 1.2 V since this places the peak of the PDF as close as possible to the target value of
100 mV.
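The gain selection step reduces to choosing the setting whose PDF peak lies closest to the target level. A minimal sketch is given below. The peak locations are hypothetical numbers, chosen only so that the gain falls as Vgain rises, and the function name is an assumption.

```python
def select_vgain(peak_location_v, target_v=0.100):
    """Return the Vgain whose PDF peak lies closest to target_v.

    peak_location_v maps each candidate Vgain (V) to the threshold voltage (V)
    at which the PDF obtained for that setting peaks.
    """
    return min(peak_location_v, key=lambda vgain: abs(peak_location_v[vgain] - target_v))


# Hypothetical peak locations (not measured data); gain decreases as Vgain increases.
peaks = {0.8: 0.135, 1.2: 0.102, 1.6: 0.081, 2.0: 0.060}
chosen = select_vgain(peaks)
print(f"Selected Vgain = {chosen} V")
```

This sketch treats overshoot and undershoot of the target symmetrically. If the target is a hard upper limit on the swing, candidates above the target would need to be excluded first.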
open eye in a “set and forget” initial calibration. The adaptation algorithm used in
such an implementation is illustrated in Fig. 3.7. If a set and forget adaptation is not
appropriate for the intended application then it is possible to maintain optimal settings
[Figure 3.7 flow diagram: 1. Set Veq and determine the PDF for all Veq; 2. Set Vgain and determine the PDF for all Vgain]
Figure 3.7: Flow diagram showing how the equalizer and gain settings are determined.
for the AFE as channel conditions vary over time by continuously repeating the EQ and
VGA adaptation algorithms in the background. In this case, once the initial calibration
algorithm is run, small changes to the gain and peaking settings can be tested to see if
they improve the location and size of the peak PDF. Since both the equalization and
gain control settings are determined from the same set of measured DC output voltages,
requiring that only limited overhead be added to the receiver, the power penalty incurred
Variable gain is achieved in the AFE presented in this work by incorporating gain control
into the input preamplifier stage, as illustrated in Fig. 3.8. This obviates the need for
[Figure 3.8 block diagrams: conventional chain (channel, AMP, VGA, EQ, DSP) and this work (channel, AMP, EQ, DSP)]
Figure 3.8: A preamplifier with gain control replaces a fixed-gain preamplifier and VGA
in applications requiring high linearity and wide dynamic range.
a separate VGA stage, which can help to minimize power consumption and area while
on the fixed-gain topology shown in Fig. 2.4 [12]. In addition to implementing gain con-
trol to maximize linearity and dynamic range, the circuit presented in this section also
(CM) level across all gain settings in order to simplify the design of the following dif-
ferential stages in the receiver. This output CM control also automatically adjusts the
maintain a broadband input match across all gain settings. Fig. 3.9 shows a detailed
schematic of the circuit and illustrates how the CM control is able to perform these tasks
without loading the signal path of the TIA by operating on a copy of the preamplifier.
This circuit, and the measured results presented in the remainder of this chapter, were
[Figure 3.9 schematic labels: Vchannel, Vgain, Vref, Vout, VDout, M1, M2, MD1, MD2, Rf, Rfixed, 200 pH inductors]
A single-ended signal arrives at the preamplifier at Vchannel where two inductors are used
to resonate with the input capacitances to extend the bandwidth of the input impedance
match. As discussed in the previous chapter, both the transimpedance gain and the
input impedance of this amplifier are set by the value of the feedback resistance, Rf , and
the open loop gain, |A|. A common solution to this problem is to separate the
variable gain stage from the input stage; however, in this approach non-linear distortions
The approach introduced in this work is to mitigate the impact of input impedance
variations by decreasing |A| along with Rf and by designing the input stage to be well
matched to the channel when the received signal is smallest and therefore most sensitive
to unwanted reflections. The gain of the RC-extracted preamplifier was simulated using
Spectre and is shown for four values of Vgain in Fig. 3.10. These simulations were
conducted using post-layout, RC extraction of the preamplifier and typical corners for
transistor models.
Another concern with the variable gain implementation in this topology is that bias
[Figure 3.10 plot: gain (dB), -10 to 8 dB, versus frequency (Hz), 1e8 to 1e11 Hz, for Vgain = 0.8 V, 1.2 V, 1.6 V and 2 V]
Figure 3.10: Simulated gain of the preamplifier as Vgain is varied. Simulations were
performed using Spectre with RC-extraction of the entire preamplifier.
levels at the input and output of the amplifying transistor are set by transistor M3 , which
pulls a DC current through Rf . This means that varying Rf will also change the CM
levels of the preamplifier, which necessitates the CM regulation presented
A copy of the preamplifier (M1D - M3D) is used to replicate the DC biasing. The resulting
pseudo-differential signal ($V_{out} - V_{out}^{D}$) is passed to subsequent differential stages providing
power supply and common-mode noise rejection. However, in this topology changing Rf
will impact the CM levels of the preamplifier. To control this, the op amp shown in
Fig. 3.9 monitors the DC output level at $V_{out}^{D}$ and fixes both $V_{out}$ and $V_{out}^{D}$ to Vref by
By reducing the effective impedance of M2 and M2D at low gain settings the control
loop also has the effect of reducing the magnitude of the open-loop gain, |A|, of the
preamplifier. This helps to mitigate the change in input impedance according to Equation
(2.5) and also further reduces the transimpedance gain according to Equation (2.4). Since
this reduction in gain occurs without impacting the dominant pole of the amplifier, it
also results in an increase in bandwidth for these low-gain settings. Simulated results in a
standard 65-nm GP CMOS process show that this effect leads to an increase in bandwidth
The CM stabilization also adjusts the current density of M1 . For low gain settings
(used when the received signal swing is large) the DC bias voltage at its gate is increased,
providing a large overdrive voltage and, hence, high linearity. Simulated results show that
this setting results in a total harmonic distortion (THD) of less than -34 dB. For high
gain settings (used when the received signal swing is small) the gate voltage of M1 is
decreased, sacrificing linearity for an improvement in its noise performance. This setting
helps the preamplifier achieve good measured sensitivity of 8 mV. In total, the measured
current drawn by 60 µm wide transistor M1 ranges from 4.9 mA at the high-gain setting,
The op amp used in this control loop is a simple single stage amplifier with NMOS
inputs and an active, current mirror load. Simulated results show that the compensation
provided by the gate capacitances of transistors M2 and M2D is enough to ensure stability
The fabricated receiver occupies approximately 0.23 mm^2 (excluding pad frame) and
draws a measured maximum current of 252 mA from a 1.2 V supply. The
AFE accounts for as much as 67.1 mA of this total (with the preamplifier at the low-
gain setting), with the rest being used in the output drivers and current-mode logic
of the digital back end. A die photo of the fabricated receiver is shown in Fig. 3.11.
All measurements were made on-wafer and alignment of the clock and data signals was
Figure 3.11: Die photo of the receiver fabricated in 65-nm CMOS.
3.3.1 S-parameters
S-parameter measurements were taken to evaluate the input match of the receiver. The
measured S11 results are shown in Fig. 3.12 for various gain settings. For the highest
preamplifier gain settings (Vgain = 0.8 to 1.2 V), which are used when the received signal
is smallest and a good input match is critical, S11 remains below -12 dB to well beyond
20 GHz. For the lowest preamplifier gain settings (Vgain = 1.6 to 2 V) the input match
remains below -8 dB across the measurable range. For prototype testing, a dedicated,
off-chip source was used to generate the preamplifier Vgain settings. High voltage devices
are unnecessary as the gate-source and gate-drain voltages of the feedback transistor do
not exceed 1.2 V at any gain setting. In later iterations a high-voltage generator might
need to be used, or it might be possible to replace the NMOS feedback transistor with
an equivalent resistance PMOS device and use low control voltages instead.
The gain variation of the AFE was measured using a network analyzer with very small
input signals to keep the digital CML logic operating linearly. These measurements show
that the preamplifier gain can be adjusted by 10 dB, while the low-frequency gain of the
[Figure 3.12 plot: curves for Vgain = 0.8 V, 1.2 V, 1.6 V and 2 V, annotated Min Gain and Max Gain, versus frequency (GHz), 0 to 25 GHz]
Figure 3.12: Measured S11 shows broadband input matching for various preamplifier gain
settings.
equalizer can be adjusted by 9 dB for a total low-frequency gain variation of more than
19 dB.
In order to verify that the AFE can avoid distorting large signals, multilevel signaling
tests were performed. Using the 4-PAM transmitter reported in [43], a 1.1 GS/s, 146
mVpp, single-ended, 2^7-1 length 4-PAM signal, shown in Fig. 3.13(a), was applied directly
to the receiver input. Note that although the receiver is designed to operate at higher
speeds than 1.1 GS/s, test setup limitations prevented the generation of 4-PAM signals
faster than this. With no clock applied to the slicers, and with the preamplifier set to
minimum gain, these signals should pass through the receiver to arrive at the chip output
showing a small amount of signal distortion as can be seen in Fig. 3.13(b). Since these
results include the digital back end logic and output drivers, it is likely that the distortion
seen in the 4-PAM output is caused by the number of gain stages present after the AFE.
As a result, the AFE alone is likely able to accept even larger input signals without
introducing signal distortion. There was, however, no test point to permit measured
Figure 3.13: (a) 4-PAM, 146 mVpp , 1.1 GS/s test signal applied to the receiver input and
(b) corresponding receiver output shows little distortion of the 4-PAM eyes.
To test the receiver’s sensitivity, a 1-Gb/s, 2^7-1 length PRBS signal with an amplitude of
8 mV peak-to-peak was applied to the receiver input. The corresponding receiver output
was found to be error-free and is displayed in Fig. 3.14(a), indicating that the preamplifier
has a sensitivity of at least 8 mV and therefore has a dynamic range of 25 dB. It should
be noted however, that bit error rate testing was not performed and the absence of errors
was determined by manually examining the input and output bit patterns using Matlab
code. To test the overall receiver speed, a 15 Gb/s, 2-PAM PRBS signal of pattern length
2^7-1 was sent to the receiver across a 10-m coaxial cable channel with a loss of 9 dB at
7.5 GHz, as shown previously in Fig. 2.5. This loss was compensated for by the EQ and
The measured and simulated results of the preamplifier are summarized in Table 6.2,
This helps to illustrate the preamplifier’s ability to provide a moderate amount of gain
that the bandwidth of the preamplifier itself could not be measured directly but simu-
lation results indicate that it should be capable of operating at speeds well beyond 15
Figure 3.14: Receiver output for PRBS inputs at (a) 1 Gb/s with an 8 mVpp eye opening
and (b) 15 Gb/s with a 50 mV eye opening demonstrate receiver sensitivity and speed.
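The text does not state how the 25 dB dynamic range was computed. One reading that is consistent with the reported numbers takes the ratio of the largest tested input swing (146 mVpp, from the 4-PAM test) to the smallest (8 mVpp, from the sensitivity test), expressed as a voltage ratio in decibels. The sketch below checks that arithmetic under this assumption.

```python
import math


def dynamic_range_db(v_max, v_min):
    """Voltage dynamic range in dB, 20*log10(v_max / v_min)."""
    return 20.0 * math.log10(v_max / v_min)


# Assumed inputs: 146 mVpp largest tested signal, 8 mVpp smallest tested signal.
dr_db = dynamic_range_db(0.146, 0.008)
print(f"Dynamic range = {dr_db:.1f} dB")
```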
Ref.       Technology  3-dB Bandwidth  THD      Gain Control  Elec. Sensitivity  Preamp Power
[29]       90 nm       7 GHz*          -45 dB*  31 dB*
[30]       65 nm       5 GHz           -38 dB*  23 dB         -7 dBm
[33]       90 nm       22 GHz                   2 kΩ**        -20 dBm            75 mW
[44]       0.18 µm     3.9-7.6 GHz              52 dBΩ**      -19 dBm            34 mW
This Work  65 nm       30-36 GHz*      -34 dB*  10 dB         -29 dBm            20-68.1 mW
* simulated result
** fixed gain input stage
Gb/s. Instead it is likely that this limitation is due to the cascade of additional stages
in the receiver.
With the proposed preamplifier in place at the input of the AFE, the robustness of the
PDF-based adaptation algorithm was tested both through simulation and measurement.
Fig. 3.15 shows the eye diagrams at the output of the equalizer, as simulated in Spectre
using schematic level transistors, for a variety of Veq settings. Fig. 3.15 also shows the
corresponding PDFs determined by taking the slope of the average value of the output
signal as Veq is varied. By selecting the PDF with the largest peak value, the algorithm
settles to the correctly equalized eye. These simulations also show that the peak of the
PDF corresponds to the amplitude of the eye opening, allowing for correct optimization
of Vgain as well.
Fig. 3.16 compares the measured loss of each channel to the equalizer peaking and
preamplifier gain settings chosen by the adaptation algorithm. All tests were performed
using length 2^7-1 PRBS data at a speed of 4 Gb/s for the PCB tests and 10 Gb/s for the
coaxial cable tests (with the exception of the 30 m cable, where the speed was reduced
In both the PCB and coax cases, the adaptation algorithm responds to the increase in
channel losses by increasing equalizer peaking and preamplifier gain. This intuitive result
was further verified by examining the receiver’s output eye diagram after adaptation had
taken place. In all cases the resulting eyes showed error free operation. One example of
such an eye is shown in Fig. 3.17 for a 10 Gb/s signal sent across a 10 m coaxial cable.
3.5 Conclusion
In order to avoid non-linear distortions, receiver front-ends require gain control, which
should begin in the first stage in the receiver chain but must avoid adversely affecting
the impedance match with the channel. This chapter has introduced a preamplifier that
is suitable for this task with 10 dB of gain control and automated regulation of its output
of up to 19 dB of gain control.
The preamplifier provides a broadband match with a measured S11 of less than -8 dB
up to 25 GHz and across all gain settings. Simulation results show a bandwidth of at
least 30 GHz and high linearity with a THD of -34 dB. Its fabrication in 65-nm CMOS
as part of a complete receiver design was used to verify its ability to avoid non-linear
In addition, the ability to generate and automatically adapt the control signals of
[Figure 3.15 labels: channel output, VGA, EQ, driver, slicer; Veq = 0.84 V, 0.46 V and 0 V]
[Figure 3.16 plots: EQ peaking (dB) and preamp gain versus (a) channel length (m), 0 to 30 m, and (b) channel length (inches), 0 to 10 inches]
Figure 3.16: Measured results show that the adaptation algorithm automatically increases
equalizer peaking and preamplifier gain to compensate for increasing channel losses across
(a) coaxial cables and (b) PCB traces.
Figure 3.17: Eye diagram of the receiver output after automatic adaptation when receiv-
ing 10 Gb/s data sent across a 10 m coaxial cable.
maintain optimal receiver operation. The adaptation method presented in this work is
able to generate these signals by adding only minimal hardware overhead to the receiver.
By observing the DC output of a single additional slicer with variable decision threshold,
a PDF of the received data is obtained, which contains the information necessary to intelligently
adapt the control signals to a variety of channel conditions. By minimizing the
spreading of the PDF caused by ISI and maximizing the amplitude of the received signal
within the limits of linear operation, the adaptation scheme ensures that the vertical
To demonstrate its effectiveness, the technique was used to adapt the control settings
of a binary receiver fabricated in 65-nm CMOS technology. Measured results show that
the adaptation scheme operates correctly when used with a variety of channel types and
In order for a communications link to achieve flexibility by adapting its data rate in
with the ability to function over a wide range of frequencies. In addition, the ability
to transition between not only different operating frequencies but also levels of power
Although traditionally not well suited for frequency agile applications due to their
narrow lock ranges, injection-locked oscillators (ILOs) are becoming increasingly common
[24]. This is due in large part to their small power and area requirements as well as their
ability to operate at high speeds [45] and to quickly transition between operating states
[21]. Due to their potential for ubiquitous use, a great deal of attention has recently
oscillators. Despite this, an ILO model that is accurate, intuitive and applicable for all
types of oscillators under any strength of injection signal has yet to be developed.
This chapter first presents an introduction to injection-locked oscillators and then examines
the phenomenon of injection locking, through which an oscillator will synchronize
its output to an external signal. This phenomenon is best understood through analysis of
Chapter 4. Injection Locked Oscillators 43
[Schematic: parallel LC tank (L, C) with loss resistance Rp and negative transconductance −Gm; output Vout.]
Figure 4.1: Oscillation amplitude of an LC tank will decay due to resistive losses unless
compensated for by the addition of a negative resistance.
existing ILO models, including the strengths and weaknesses in their abilities to predict
If a charged capacitor is connected across an inductor, this charge will flow back and forth
between the inductor and capacitor causing the voltage across the capacitor to oscillate
at a frequency of [28]
$$\omega_0 = \frac{1}{\sqrt{LC}} \tag{4.1}$$
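As a quick numerical illustration of Equation (4.1), the sketch below evaluates the resonant frequency of an ideal LC tank. The function name and the component values (1 nH, 1 pF) are assumptions chosen only for illustration and do not come from the thesis.

```python
import math

def lc_resonance(inductance_h, capacitance_f):
    """Return (omega0 in rad/s, f0 in Hz) for an ideal LC tank, per Equation (4.1)."""
    omega0 = 1.0 / math.sqrt(inductance_h * capacitance_f)
    return omega0, omega0 / (2.0 * math.pi)

# Hypothetical tank values, chosen only for illustration.
omega0, f0 = lc_resonance(1e-9, 1e-12)
print(f"omega0 = {omega0:.3e} rad/s, f0 = {f0 / 1e9:.2f} GHz")
```

With these assumed values the tank resonates at roughly 5 GHz.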
Due to the parasitic resistances associated with any real inductance and capacitance used
to create this “LC tank” circuit, some of this charge will be lost in the form of heat in
these resistances in every cycle. As a result, the oscillation amplitude will decay with
time unless this charge can be replaced. This charge replacement can be accomplished
using the transconductance from an active device, denoted as −Gm in Fig. 4.1.
injected into this LC tank, as shown in Fig. 4.2, the tank output will be perturbed. If
the frequency of the injected signal is close enough to ω0 and the strength of the injected
signal is large enough then the frequency of the tank output will lock to ωinj. Similarly, if
the Nth harmonic of ωinj is close enough to ω0 then the tank output will lock to Nωinj.
As a result, any oscillator that is subjected to a signal that satisfies these criteria can be
[Schematic: the LC tank (L, C, Rp, −Gm) with an injected current Iinj applied to it; output Vout.]
Figure 4.2: Injection of an oscillating signal, Iinj , can cause the LC tank output to lock
to the frequency, ωinj , of this injected signal.
it was recognized that an ILO was suitable for this application since it is capable of
signal.
• rejecting received signals in adjacent frequency bands that are far from ω0 .
However, no model had yet been presented to account for these observed behaviours or
to address ambiguities such as how “strong” an injected signal must be, or how “close”
to ω0 it must be, in order to achieve injection locking. Such models were proposed in
locking phenomena observed in LC oscillators [23]. In this model the instantaneous phase
difference between the injected and free running oscillator signals, ∆φ(t), was defined as
$$\frac{d\Delta\phi(t)}{dt} = \Delta\omega - \frac{I_{inj}}{I_{osc}} \frac{\omega_0}{2Q} \sin(2\pi\Delta\phi(t)) \tag{4.2}$$
where ω0 is the free running oscillator frequency, ∆ω is the difference between ω0 and the
injected signal frequency, Q is the quality factor of the tank, Iinj is the injected signal
When the injection locked oscillator has settled to a steady state then
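$$\frac{d\Delta\phi(t)}{dt} = 0 \tag{4.3}$$
and Equation (4.2) reduces to
$$\Delta\omega = \frac{I_{inj}}{I_{osc}} \frac{\omega_0}{2Q} \sin(2\pi\Delta\phi_0) \tag{4.4}$$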
From this, the lock range, ∆ωmax , of the injection locked oscillator can be found to be
$$\Delta\omega_{max} = \frac{I_{inj}}{I_{osc}} \frac{\omega_0}{2Q} \tag{4.5}$$
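The following minimal sketch evaluates Equations (4.4) and (4.5) numerically. The function names and the example values (a 5-GHz oscillator, Q = 5, Iinj/Iosc = 0.1) are assumptions for illustration only; the steady-state phase is returned in the units implied by the 2π factor that appears inside the sine of Equation (4.4).

```python
import math

def lock_range(omega0, q, inj_ratio):
    """One-sided lock range of Equation (4.5): (Iinj/Iosc) * omega0 / (2Q)."""
    return inj_ratio * omega0 / (2.0 * q)

def steady_state_phase(delta_omega, omega0, q, inj_ratio):
    """Solve Equation (4.4) for delta_phi0; return None outside the lock range."""
    ratio = delta_omega / lock_range(omega0, q, inj_ratio)
    if abs(ratio) > 1.0:
        return None  # no steady-state solution, so the oscillator cannot lock
    return math.asin(ratio) / (2.0 * math.pi)

# Hypothetical example values (not taken from the thesis).
w0 = 2.0 * math.pi * 5e9
lr = lock_range(w0, 5.0, 0.1)
print(f"lock range = +/- {lr / (2.0 * math.pi) / 1e6:.1f} MHz")
print("phase at half the lock range:", steady_state_phase(0.5 * lr, w0, 5.0, 0.1))
print("phase outside the lock range:", steady_state_phase(2.0 * lr, w0, 5.0, 0.1))
```

The `None` return makes the ambiguity discussed in the text explicit: for a given injection strength there is a frequency offset beyond which Equation (4.4) has no solution.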
Although this original analysis proved accurate for the case studied in [23], it relied
• The strength of the injected signal is much smaller than that of the free running
oscillator.
Since these assumptions are not always valid for ILOs, many later publications have since
As one example of a situation in which these assumptions are invalid, injection locking
can be achieved through the use of narrow pulses, in place of sinusoidal injection. This
type of injection is often preferable in ILOs used as frequency multipliers where narrow
pulses can reduce jitter and duty cycle distortion in the output of a ring oscillator [21]
or an LC oscillator [48]. Not only does this type of injection violate the sinusoidal
assumption, but it also makes determining the relative strength of the injected signal
difficult since it is not clear if Iinj in this case should be taken as an average over time or
[Schematic: LC tank (L, C, Rp, −Gm) with injection input Vinj and output Vout.]
Figure 4.3: Shorting the tank during narrow injection pulses makes application of the
frequency domain model difficult.
[Schematic: (a) LC tank with series inductor resistance Rs; (b) parallel model with Rp = (Q^2 + 1)Rs, annotated for low Q values.]
Figure 4.4: Equivalence between (a) the physical model of the series resistive losses in the
inductance of an LC tank and (b) the parallel resistance used to develop the frequency
domain model breaks down for low Q values.
simply as the pulse amplitude. Furthermore, if the injection signal is applied by shorting
the tank during injected pulses [48], as shown in Fig. 4.3, then the resulting effect on
Iinj is unclear. As a result, a modified version of the frequency domain model must be
As for the assumption that the injected signal strength is much smaller than that
of the free running oscillator, this condition is often intentionally violated depending
on the injection scheme used and the intended application of the ILO. For example, in
that stronger injection strengths provide a wider range of achievable phase shifts as well
as improved linearity and resolution of the phase steps [50]. Further complicating this
situation is the fact that when strong injection is applied to an oscillator with a low
Q factor, the accuracy of the frequency domain model deteriorates as the equivalence
between the two tank models shown in Fig. 4.4 is lost [25]. This leads to asymmetry in
an ILO’s lock range, which is a phenomenon that cannot be predicted without significant
As a result of these and similar issues, there now exist a wide variety of different
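[Sketch: impulses applied at phases φ1, φ2 and φ3 of the oscillator output produce different phase changes ∆φ, including ∆φ = 0.]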
Figure 4.5: Impulses applied to an oscillator have varying impacts on the oscillator output
depending on the relative phase at which they are applied.
versions of the frequency domain model, each of which is suitable for some injection
locking cases, but not others. Therefore, although these models can accurately reflect measured
results of the lock range, transient phase step response, jitter tracking bandwidth and
phase noise [9] for certain cases, this accuracy comes at the cost of complexity or a loss
of generality. While attempts have been made to generalize the frequency domain model
[51] to make it applicable under all conditions, the results remain complex and difficult to
apply during oscillator design. As a result, the frequency domain model is often relegated
The impulse sensitivity function (ISF) was developed in [27] to describe phase noise
in oscillators by observing that when small noise current impulses are applied to an
oscillator, their impact on the output phase of the oscillator depends on the relative
phase at which they are applied. Fig. 4.5 illustrates this concept, showing that for
current impulses applied at phases φ1 , φ2 and φ3 the resulting output phase change ∆φ
all possible applied impulse phases. Typical examples of the resulting impulse sensitivity
function, denoted by Γ, are shown in Fig. 4.6. Note that different injection techniques can
produce different Γ functions for a single ILO. Hence, in common practice, simulations
of the oscillator in the presence of very small impulsive injections are used to obtain Γ.
[Plots: output waveform Vout(t) and the corresponding Γ(ω0t) versus t for cases (a) and (b).]
Figure 4.6: Example impulse sensitivity functions for oscillators with (a) sinusoidal and
(b) square wave outputs.
The ISF has traditionally been used to analyze oscillator phase noise and is shown
in [27] to have advantages over the Leeson model [52] in its ability to predict 1/f^2 and
1/f^3 noise as well as the influence of cyclostationary noise sources. It also offers circuit
designers insight into how the shape of the oscillator’s output waveform, as well as the
method of applying the negative resistance to restore energy to the oscillator, can affect
Although well suited to modeling phase noise, the ISF model cannot be directly applied
to model the injection locking behaviour of an oscillator [53] without some modification.
Unlike the noise sources for which the ISF model was developed, the injection waveforms
in ILOs are deterministic. They therefore cause the ISF to change significantly, especially
under strong injection. For example, straightforward application of the ISF model cannot
account for locking an oscillator to a frequency other than its free-running frequency [53].
This is because, according to the ISF model, the phase at the output of an oscillator can
be calculated as
$$\phi(t) = \int_0^t \Gamma(\tau)\, b(\tau)\, d\tau \tag{4.6}$$
where τ is the time of injection and b(τ ) is an injected signal with a period close, but not
Since Γ(τ ) has the same frequency as ω0 , this means that the frequencies of b(τ ) and
Γ(τ ) will not be equal and that the integral of their product in Equation (4.6) will contain
no DC component. This contradicts the known result for an oscillator that is injection
locked to ωlock ≠ ω0. In this case the output phase of the ILO should increase linearly
with time relative to the phase of the free-running oscillator, which should be represented
This idea can be represented graphically by examining the I and Q state variables of a
four-stage ring oscillator. These state variables can be observed most easily by examining
the output voltages of two non-adjacent stages of such a ring oscillator. Simulation results
of this oscillator, performed using Spectre and reported in Fig. 4.7(a), show that the
injection of an impulse causes a temporary deviation (dotted line) from the steady-
state oscillator’s trajectory through state-space (solid line), where the magnitude of this
deviation is related to the ISF of the oscillator and the strength of the injected impulse.
Conventional wisdom dictates that, in order for the ISF to be successfully applied to any
future impulses, the transient response of the oscillator must first settle back to its steady-
state trajectory. This implies that the oscillator’s frequency must remain unchanged and
that the ISF is therefore unsuitable for use in the presence of a series of injected impulses
designed to lock the ILO to a frequency other than ω0 , since this would result in shifts in
the I and Q states as shown in Fig. 4.7(b) and thereby continuously require new ISFs.
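The limitation of Equation (4.6) described above can be checked numerically. The sketch below integrates the product of Γ and b for the simplest possible case, where both are assumed to be unit-amplitude cosines (an illustrative assumption, not the waveforms of any real oscillator), in normalized units where the free-running period is 1. When the two frequencies are equal the integral grows steadily, but when the injected frequency is offset by 5% the integral stays bounded, so a fixed ISF predicts no sustained phase drift.

```python
import math

def isf_phase(omega0, omega_inj, t_end, dt=1e-3):
    """Integrate Equation (4.6) with Gamma = cos(omega0*t) and b = cos(omega_inj*t).

    Both waveforms are illustrative assumptions. Returns the final phase and
    the largest magnitude reached along the way.
    """
    phi, peak = 0.0, 0.0
    steps = int(round(t_end / dt))
    for n in range(steps):
        tau = (n + 0.5) * dt  # midpoint rule
        phi += math.cos(omega0 * tau) * math.cos(omega_inj * tau) * dt
        peak = max(peak, abs(phi))
    return phi, peak

w0 = 2.0 * math.pi  # normalized free-running frequency (period = 1)
phi_same, _ = isf_phase(w0, w0, 200.0)
phi_off, peak_off = isf_phase(w0, 1.05 * w0, 200.0)
print(f"equal frequencies: phi = {phi_same:.2f}")
print(f"5% offset: phi = {phi_off:.3f}, peak |phi| = {peak_off:.2f}")
```

The bounded result in the offset case is the missing DC component referred to in the text.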
There have been attempts to extend ISF analysis to accommodate injection locking.
For example, in [54] it is assumed that with each injected impulse, the ILO’s Γ function
undergoes a change in phase equivalent to that of the ILO output and any future injected
impulses will be applied to the new, phase-shifted ISF as shown in Fig. 4.8. However, the
model still fails to account for changes in the amplitude or shape of the ISF that inevitably
arise when the oscillator’s trajectory through state space deviates significantly from its
[Plots: Q voltage (V) versus I voltage (V) for cases (a) and (b).]
Figure 4.7: Spectre simulations of the I and Q states of a four-stage VCO show that (a)
an injected impulse causes a perturbation (dotted line) from the steady state (solid line).
Repeated impulses (b) can lock the VCO to a different frequency but this requires a new
ISF to model the oscillator’s new trajectory through state-space.
free-running trajectory, as can result from strong injection. Moreover, the analysis is
The challenge of modeling oscillators under strong injection is best illustrated by way
of example. In Fig. 4.9 an injected signal is divided into impulses of area b(τ )h. The
first impulse produces a shift in oscillator output phase, φ, obtained by multiplying the
[Sketch: output Vout(t) and Γ(ω0t) with injection events inj1 and inj2.]
Figure 4.8: Injected impulses result in step changes in the oscillator output, which must
be accounted for by introducing step changes to the ISF in order to model injection
locking.
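[Sketch: injected signal b(t) divided into impulses of width h at times τ1, τ2 and τ3, with the ISF Γ(t) evaluated at Γ(τ1), Γ(τ2+φ(τ2)) and Γ(τ3+φ(τ3)).]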
Figure 4.9: Dividing an injected signal into impulses that act immediately on the ILO
output phase, and that also shift the corresponding Γ(t) function, allows injection locking to
be accurately modeled by the ISF.
pulse area b(τ1 )h by Γ(τ1 ). In [54], this same phase shift is applied to the ISF so that the
The application of this technique can be not only cumbersome and time consuming,
but also inaccurate if the injected signal is large. For example, Fig. 4.10(a) shows how the
an impulse (in this case a 5-mV, 10-ps pulse) into an oscillator at various phases, φ, in
relation to the oscillator output. Once this ISF has been determined, it is possible to use
the method described by Fig. 4.9 to predict the ILO’s sensitivity to other injected signals.
In Fig. 4.10(b) the amplitude of the applied signal has been increased by a factor of 10.
Using the ISF, one would expect the resulting oscillator phase shift to also increase by a
factor of 10. Similar predictions can be made for an increase in pulse width, as shown in
Fig. 4.10(c). Simulations of the ILO show that these predictions are relatively accurate,
so long as the resulting phase shifts remain small. Unfortunately, larger phase shifts
are often required in order to implement an ILO with a wide lock range or a fast lock
time. Fig. 4.10(d) shows an injected pulse with both large amplitude and width such as
would be required in a fast-locking ILO. The ISF prediction method in this case greatly
4.4 Summary
The frequency domain model has been shown to accurately model measured ILO behaviour
for both LC and ring oscillators, using sinusoidal or pulse train injection, with
either weak or strong injection strengths. However, each of these situations requires its
own variation of this model. Such variations add significant complexity to the model and
the resulting loss of generality limits its usefulness. In addition, in many situations it
is difficult to determine the values to be used for the model parameters until measured
results are available, thereby limiting its applicability during the design and simulation
stages of an ILO.
In contrast, the ISF-based ILO model can predict the phase behaviour of any oscillator
so long as the injected signal consists of small impulses. Its inability to be easily used
with large injected signals and to be extended to predict other important ILO behaviour
such as lock range, lock time and jitter tracking bandwidth has thus far limited its
[Figure 4.10: injected pulses Vinj with (a) 5 mV amplitude and 10 ps width, (b) 50 mV and 10 ps, (c) 5 mV and 100 ps, and (d) 50 mV and 100 ps.]
As described in the previous chapter, a behavioural ILO model that can be universally
applied to any type of oscillator and under any type or strength of injected signal has
yet to be presented. This chapter introduces such a model by using the proposed phase
transfer characteristic (PTC) of an ILO. The parameters of this model can be extracted
from a relatively short set of transient simulations of the oscillator in question. Once
extracted, the model can be used to infer a great deal of information about the oscillator,
such as lock range, lock time, input and output phase relationships and jitter tracking
This chapter will show that the use of this model presents a significant saving in
time and computing resources when compared with determining this information directly
through traditional simulations. Moreover, the ILO model can be incorporated into
larger system-level behavioural models, such as phase-locked loops and clock distribution
networks. The utility of this new model is then demonstrated by using it to develop an
Chapter 5. Phase Transfer ILO Model 55
Instead of simulating the ILO under impulsive injection and feeding the result into complex,
and in some cases inaccurate, expressions, we simulate the ILO’s phase
transient with the actual injected pulse shape being studied. Such simulations more directly
provide an intuitive understanding of how ILO design goals can be translated to
circuit topology. Moreover, the resulting model is accurate even under strong injection,
To define the model we begin by defining the relationship between the injected signal
and the ILO output. First, we assume that when the ILO is locked by some injected
where f is some periodic function with period 2π describing the oscillating waveshape
(e.g. square, sinusoidal, etc.) and θ(t) represents the phase of the signal.1 Similarly, the
$$V_{inj}(t) = b\left(\frac{\omega_{lock}}{N} t + \theta_{inj}(t)\right) \tag{5.2}$$
where b is some periodic function (e.g. a sinusoid or a pulse train) with period 2π,
θinj (t) is the phase of the injected signal and N is an integer that represents some mul-
In the case where the injected signal is no longer present, the ILO will free-run at a
1
Note that although voltage state variables V (t) are used in this work, any of the relevant signals may be
branch currents instead of voltages.
[Waveform sketch: outputs of the free-running ILO and the locked ILO.]
Figure 5.1: Definitions of the ILO input and output signals that will be used to develop
the ILO model in the remainder of this chapter.
Here ∆ω is the difference between the frequency of the locked ILO and its free-running
frequency,
∆ω = ωlock − ω0 . (5.4)
In this work, the ILO’s response to one full period of the injected signal, Vinj (t), is
simulated. If the ILO includes any peripheral circuitry such as narrow pulse generators
to condition the injected signal [48] then this can be included in the simulation to ensure
that the effects of this circuitry are accurately captured. Each period of this injected
signal changes the phase of the ILO’s output, θ(t), by an angle, P . The procedure used
two copies of the ILO (including any peripheral circuitry) are simulated over a small
number of cycles of the output clock. By injecting one period of the intended signal
into the ILO and comparing the times of the resulting zero crossings to those of the
This phase change depends upon φ(t), defined as the phase of the ILO output signal
θ(t) subtracted from the phase of the injected signal, θinj (t), such that
Hence, we define the phase transfer characteristic (PTC), P (φ), as the ILO’s phase
change for each injection of one period of Vinj (t) as a function of the relative phase of
[Waveform sketch: ILO output after an injection event compared with a free-running oscillator.]
Figure 5.2: By comparing the zero crossing times of the ILO output to those of an unperturbed
copy, the phase change created by one period of the injected signal can be
determined through transistor-level simulation in Spectre.
this injection, φ(t). The PTC is readily extracted from a series of transient simulations,
as demonstrated by Fig. 5.3 where two samples of P (φ) are determined by applying one
Since the PTC is specific to the injected signal, Vinj (t), it can have a wide variety of
possible shapes depending on the amplitude, shape and frequency of the injected signal
and the injection scheme used. While this technique means that PTC simulations must be
redone if the injected pulse shape is changed, these simulations can be run in a short time
and the results are accurate and provide insight. The time required to simulate the PTC
will be analyzed and compared to direct SPICE-level simulation of the ILO characteristics
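The extraction procedure can be sketched as a small post-processing routine. Everything named here is hypothetical: `simulate` stands in for whatever runs the pair of transient simulations and returns zero-crossing times, and `toy_simulate` is a made-up stand-in used only so that the sketch runs. The sign convention (a later crossing counts as a positive phase change) is also an assumption and should be matched to the definition of P in use.

```python
import math

def phase_change(t_free, t_injected, period):
    """Phase change (rad) from the last pair of matching zero crossings.

    Assumed sign convention: a crossing that arrives later than in the
    free-running copy counts as a positive phase change.
    """
    shift = (t_injected[-1] - t_free[-1]) / period
    # Wrap into [-0.5, 0.5) cycles so a full-cycle slip is not reported.
    shift = (shift + 0.5) % 1.0 - 0.5
    return 2.0 * math.pi * shift

def extract_ptc(simulate, period, num_points):
    """Sample the PTC at evenly spaced injection phases.

    `simulate(phi)` is a hypothetical hook that runs the two transient
    simulations for an injection at relative phase `phi` and returns
    (t_free, t_injected), two lists of zero-crossing times.
    """
    samples = []
    for n in range(num_points):
        phi = 2.0 * math.pi * n / num_points
        t_free, t_injected = simulate(phi)
        samples.append((phi, phase_change(t_free, t_injected, period)))
    return samples

def toy_simulate(phi, period=1.0, cycles=5):
    """Made-up stand-in for a circuit simulator: delays crossings by 0.02*sin(phi) cycles."""
    delay = 0.02 * math.sin(phi) * period
    t_free = [k * period for k in range(1, cycles + 1)]
    return t_free, [t + delay for t in t_free]

ptc_samples = extract_ptc(toy_simulate, 1.0, 8)
for phi, p in ptc_samples:
    print(f"phi = {math.degrees(phi):6.1f} deg  ->  P = {math.degrees(p):6.2f} deg")
```

Using the last zero crossing rather than the first gives the oscillator a few cycles to settle after the injection event, which is consistent with simulating over a small number of output cycles.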
With the PTC in hand, a behavioural model for the ILO is formed under the assumption
[Waveform sketch: injected signal Vinj(t) and the free-running and perturbed Vout(t), showing the phase changes P1 and P2.]
Figure 5.3: The PTC, P (φ), is determined through simulation by applying one period
of the injected signal, Vinj (t), at different phases, φn , relative to the oscillator’s output
signal, Vout , and observing the resulting change in output phase, P .
the ILO’s output phase equal to P (φ).2 Moreover, it is assumed that between injection
events, the ILO operates at its free-running frequency, ω0 , causing its phase relative to
If we treat each period of the injected signal as a discrete event then we are interested
in the phase difference at the start of the kth injection and Equation (5.5) becomes
$$\phi_k = \theta_{inj,k} - \theta_k \tag{5.6}$$
While an ILO is locking, the difference between the injected signal phase and ILO output
phase evolves along the sequence φ1 , φ2 , ... φk . This means that the phase shift introduced
2
This is an approximation since, in fact, it may generally take some time for the ILO’s output phase to
react to the injected input. However, the accuracy of this approximation is borne out by later comparison
of the model with transistor-level simulations and measurements.
The negative sign is included because an increase in ILO output phase results in future
In the event that ωlock ≠ ω0 (i.e. ∆ω ≠ 0) an additional phase shift of −2πN∆ω/ω0
is added to Equation (5.7). The resulting expression for the phase difference between the
$$\phi_{k+1} = \phi_k - P(\phi_k) - \frac{2\pi N \Delta\omega}{\omega_0} \tag{5.8}$$
Finally, since any perturbations of the phase of the injected signal phase (i.e. cycle-to-
$$\phi_{k+1} = \phi_k - P(\phi_k) - \frac{2\pi N \Delta\omega}{\omega_0} + \Delta\theta_{inj,k} \tag{5.10}$$
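Equation (5.10) is a one-line recurrence, so it can be iterated directly. The sketch below does this with a hypothetical sinusoidal PTC of 0.3 rad amplitude, which is an assumption made only so that the example runs; a real PTC would come from simulation or measurement. The frequencies are the 4.7 GHz and 4.72 GHz values used as an example elsewhere in this chapter, with N = 4.

```python
import math

def ptc(phi, amplitude=0.3):
    """Hypothetical sinusoidal PTC (rad); a real one comes from simulation."""
    return amplitude * math.sin(phi)

def step(phi_k, n_mult, delta_omega, omega0, dtheta_inj=0.0):
    """One injection event of Equation (5.10)."""
    drift = 2.0 * math.pi * n_mult * delta_omega / omega0
    return phi_k - ptc(phi_k) - drift + dtheta_inj

def run(phi0, n_mult, delta_omega, omega0, events):
    """Iterate Equation (5.10) with no injected-phase perturbation."""
    phis = [phi0]
    for _ in range(events):
        phis.append(step(phis[-1], n_mult, delta_omega, omega0))
    return phis

# Only the ratio delta_omega / omega0 matters, so the units cancel.
phis = run(phi0=1.0, n_mult=4, delta_omega=0.02e9, omega0=4.7e9, events=200)
print(f"settled phase difference = {phis[-1]:.4f} rad")
print(f"P at the settled phase   = {ptc(phis[-1]):.4f} rad")
```

Once the sequence settles, the PTC value at the settled phase equals −2πN∆ω/ω0, so each injection exactly cancels the drift accumulated between injections.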
One may also wish to consider the ILO’s behaviour in terms of an absolute phase reference
in order to model external phase perturbations and to make this model applicable
in other, larger systems. This can be done by substituting Equation (5.6) into Equation
(5.10), resulting in
$$\theta_{k+1} = \theta_k + P(\phi_k) + \frac{2\pi N \Delta\omega}{\omega_0} \tag{5.11}$$
This relationship incorporates the nonlinear PTC, P (φk ), and can be represented by the
system drawn in Fig. 5.4. The absolute phase reference of this model means that it can
Equations (5.11) and (5.6), together with Fig. 5.4, comprise a general nonlinear behavioural
model of an ILO. The nonlinear PTC function, P(·), may be extracted from a relatively
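[Block diagram: φk = θinj,k − θk is applied to P(·); its output is summed with 2πN∆ω/ω0 and θk to form θk+1, which is fed back through a unit delay z^-1 as θk.]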
Figure 5.4: Model representing the nonlinear phase relationship between the injected
signal and the ILO output.
the next section, it will be shown how P () may also be extracted from measurements of
an ILO. The following sections will show how the model may be used to very quickly and
accurately find the phase relationship, lock range, lock time and tracking bandwidth of an
ILO. Each of these ILO performance metrics would otherwise require extensive transistor-
level simulations; hence, the model greatly accelerates design iterations, affording the
designer insight. The model may also be integrated into larger behavioural system-level
When locked, the oscillator and injected pulses will settle to some steady-state phase
relationship, φss , where each injected period causes a phase shift P (φss ) that is just
sufficient to cancel the phase drift resulting from ∆ω. This concept is illustrated by the
Spectre simulation results shown in Fig. 5.5 for a ring oscillator with ω0 = 4.7 GHz, which
is being injection locked to ωlock = 4.72 GHz by pulse injection that occurs at a frequency
of 1.18 GHz (i.e. N = 4). In part (a) the ILO output signal is compared to an ideal
reference signal oscillating at 4.7 GHz. Due to the 20 MHz difference between ωlock and
ω0 the phase difference between these signals, as measured by the difference between their
zero crossing times as shown in part (b), shrinks during cycles 74, 75 and 76. When an
injection event then occurs during cycle 77 the resulting phase change, given by P (φss ), is
sufficient to offset the phase difference that accumulated during the previous four cycles.
This allows the oscillator to settle to the steady-state phase relationship shown in part
(c).
In order for an ILO with a free-running frequency of ω0 to lock to ωlock, the phase
change produced by each period of the injected signal at steady-state, P(φss), must be
sufficient to eliminate the phase drift accumulated over N cycles of the oscillator output
such that
$$P(\phi_{ss}) = -\frac{2\pi N \Delta\omega}{\omega_0} \tag{5.12}$$
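As a worked instance of Equation (5.12), the sketch below plugs in the values from the ring oscillator example above (ω0 = 4.7 GHz, ωlock = 4.72 GHz, N = 4). The function name is an assumption; the numbers are those quoted in the text.

```python
import math

def required_phase_change(f_lock, f0, n_mult):
    """P(phi_ss) in radians from Equation (5.12)."""
    return -2.0 * math.pi * n_mult * (f_lock - f0) / f0

p_ss = required_phase_change(4.72e9, 4.7e9, 4)
print(f"P(phi_ss) = {p_ss:.4f} rad = {math.degrees(p_ss):.2f} deg")
```

For this example each injection event must supply a phase change of roughly −0.107 rad (about −6.1 degrees), which is the drift accumulated over the four output cycles between injections.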
Since N, ∆ω, ω0 and φss of a physical oscillator can all be directly observed, Equation
(5.12) provides a way to determine the PTC of a fabricated ILO and compare it
Equation (5.12) shows that the steady-state phase relationship, φss , between the
injected signal and the ILO output is determined by the frequency difference, ∆ω. This
observation is intuitive since the steady-state phase relationship between the injected
and output signals, φss , of an ILO has previously been exploited in applications such
as clock deskew, where ILO output phase can be adjusted by tuning the free-running
frequency of a VCO [24]. Fig. 5.6 illustrates this concept by showing φss for a variety
signal has no need to influence the oscillator’s output and therefore settle to a steady-
state relationship where, according to the PTC, they will have no effect on the output
phase (i.e. P (φ) = 0). When ωlock < ω0 each injected pulse must decrease the oscillation
frequency, meaning that the pulses settle to a steady-state relationship where they will
each create a positive change in oscillator phase, given in this example by P+ . These
steady-state relationships are reached, after some settling time, regardless of the phase
Figure 5.5: Transistor-level simulation results (a) comparing ILO output to an ideal
reference signal at ωlock shows that (b) the phase change produced by an injection event
is sufficient to cancel the phase drift resulting from the difference between ωlock and ω0 .
This allows the ILO to settle to the steady-state phase behaviour plotted in (c).
[Plot: PTC P(φ) versus φ with extremes Pmax and Pmin, marking the steady-state phases φss for ωlock = ωmin, ωlock < ω0 (at P+), ωlock = ω0 and ωlock = ωmax.]
Figure 5.6: Steady-state phase relationships are determined by the difference between
ωlock and ω0 .
The analysis of an ILO’s lock range under subharmonic injection is of particular interest
[24], [21]. Fig. 5.6 shows that a natural extension of the steady-state phase shift modeling
is that the lock range of an ILO can be determined directly from its PTC since the ILO
can only successfully lock to an injected signal that produces a large enough phase change
in the oscillator output to compensate for the difference in their frequencies. In other
words, the maximum value of ∆ω, which we define to be ∆ωhigh , can be found using
Equation (5.12) to be
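$$\Delta\omega_{high} = -\frac{\omega_0 P_{min}}{2\pi N} \tag{5.13}$$
Similarly, the minimum value of ∆ω, ∆ωlow, is
$$\Delta\omega_{low} = -\frac{\omega_0 P_{max}}{2\pi N} \tag{5.14}$$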
where Ppp = Pmax − Pmin has been defined as the peak-to-peak value of the PTC. Simulation
results of an example oscillator topology comparing these lock range equations to
Treating the maximum and minimum PTC values separately, as in Equations (5.13)
and (5.14), identifies cases where the lock range is not centered equally about the free-
running frequency. This effect can be present in ILOs for a variety of reasons, especially
during strong injection and, although it has previously been reported in [25], it is often
ignored in ILO models. Furthermore, it should be noted that the lock range calculation
given by Equation (5.17) does not require that the circuit designer determine an effective
Q, injection strength, or any other oscillator parameter. Instead it relies only on the
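The lock range calculation of Equations (5.13) and (5.14) needs only the extreme values of the PTC. The sketch below applies them to a short list of hypothetical PTC samples, deliberately asymmetric, to show how an off-centre lock range arises. The sample values, the function name and the choice of f0 = 4.7 GHz with N = 4 are assumptions for illustration.

```python
import math

def lock_range_from_ptc(ptc_samples, f0, n_mult):
    """Return (f_low, f_high) lock limits in Hz from Equations (5.13) and (5.14).

    ptc_samples holds PTC values in radians. Frequencies are in Hz, so the
    2*pi relating omega and f cancels on both sides of each equation.
    """
    p_min, p_max = min(ptc_samples), max(ptc_samples)
    df_high = -f0 * p_min / (2.0 * math.pi * n_mult)
    df_low = -f0 * p_max / (2.0 * math.pi * n_mult)
    return f0 + df_low, f0 + df_high

# Hypothetical, asymmetric PTC samples in radians.
samples = [-0.2, -0.1, 0.0, 0.15, 0.3]
f_low, f_high = lock_range_from_ptc(samples, 4.7e9, 4)
print(f"lock range: {f_low / 1e9:.4f} GHz to {f_high / 1e9:.4f} GHz")
```

Because Pmax is larger in magnitude than Pmin in this made-up data, the lock range extends further below the free-running frequency than above it.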
When an injected signal is applied, the time that it takes for an ILO’s output to settle
to a steady-state phase relationship with this injection is known as the lock time. This
transient relationship can be useful in determining the ILO’s jitter tracking capabilities
[9] and can also be important in systems that require fast locking, such as frequency
The PTC allows us to predict lock time variations that are not obvious in the
frequency-domain ILO model. Although it has been shown that the frequency domain
model can be manipulated to predict these lock time variations [57], the complexity in-
usually suggested that lock time depends only upon ∆ω and injection strength, the PTC
model indicates that there is also a strong dependence on the initial phase relationship,
For the analysis of lock time we assume that no phase perturbations are introduced
by the injected signal (∆θinj,k = 0). In this case, the model given by Equation (5.10)
shows that the ILO phase will settle to its steady-state condition, φss , when
and therefore
$$P(\phi_k) = P(\phi_{ss}) = -\frac{2\pi N \Delta\omega}{\omega_0} \tag{5.19}$$
This means that an injected signal that begins at a phase that is far from the desired
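[Plot: PTC P(φ) versus φ, marking the level P(φss), the steady-state phase φss, the initial phases φ01 and φ02, the unstable point φu, and the phase differences ∆φ1 and ∆φ2.]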
Figure 5.7: An ILO’s lock time depends on the initial phase difference, ∆φ, between the
injected signal φ0 and the steady-state phase difference, φss , required by the frequency
of the injected signal.
steady-state relationship, φss , will require more injection events, and therefore a longer
time, to reach φss . Fig. 5.7 demonstrates this relationship for an example case where
two identical injected signals are applied individually to an ILO at initial phases φ01 and
φ02. Since φ01 is much closer to φss than φ02, ∆φ1 < ∆φ2, resulting in a shorter
lock time for the injected signal that begins at φ01 . Note that although P (φu ) = P (φss ),
the steady-state ILO output phase cannot settle to this point. A small deviation to the
left of φu , resulting from noise or a slight frequency difference between ωlock and ω0 , will
produce a small positive phase shift, which will then shift the phase difference further
to the left of φu , in turn producing a larger positive phase shift, and so on until φss is
reached. A similar effect occurs in the opposite direction if the shift occurs to the right of
φu . Due to the small steps that begin this settling, the increase in lock time that occurs
when φ0 ≈ φu is significant.
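The dependence of lock time on initial phase can be reproduced by iterating Equation (5.10) and counting injection events. The sketch below again assumes a hypothetical sinusoidal PTC with ∆ω = 0, so that the stable point is at φ = 0 and the unstable point φu is at π. The 1-degree settling tolerance is likewise an assumption.

```python
import math

def ptc(phi, amplitude=0.3):
    """Hypothetical sinusoidal PTC: stable point at 0, unstable point at pi."""
    return amplitude * math.sin(phi)

def events_to_lock(phi0, tol=math.radians(1.0), max_events=10_000):
    """Count injection events until phi is within tol of phi_ss = 0.

    Uses Equation (5.10) with delta_omega = 0 and no injected-phase jitter.
    Returns None if the phase has not settled after max_events.
    """
    phi = phi0
    for k in range(max_events):
        if abs(phi) < tol:
            return k
        phi = phi - ptc(phi)
    return None

for phi0 in (0.5, math.pi - 1e-3, math.pi - 1e-6):
    print(f"phi0 = {phi0:.6f} rad -> {events_to_lock(phi0)} injection events")
```

Starting closer to φu increases the event count because the first corrections are tiny. In this noise-free sketch the count keeps growing as the starting phase approaches φu.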
To illustrate this effect, Fig. 5.8(a) shows the PTC of a 4-stage ring oscillator obtained
width of 70 ps. When this injected signal is at a frequency close to ω0 then φss is
where P (φss ) = 0 on the rising edge of the PTC, as indicated. If the injected signal
begins its injection at a phase that is close to φss it will therefore settle quickly, following
a simple, first order exponential settling step response. Indeed, the time constant of
this exponential settling is expressed in the following section as a function of the jitter
[Plots: (a) PTC (deg) versus φ (deg), marking φss, φu, φ01 and φ02; (b) phase (deg) versus time (ns) for the SPICE simulation and the PTC model, for injection starting at φ01 and φ02.]
Figure 5.8: Spectre simulations of the (a) PTC and (b) transient phase response of a
4-stage ring ILO. When injection begins far from φss at φ01 the lock time is significantly
longer than when it begins at φ02 .
approaches. If the injected signal begins farther from φss , especially if it begins near
the unstable operating point φu , it will require a much longer lock time. This effect is
captured by the PTC model as shown in Fig. 5.8(b) where two identical signals are injected
into the 4-stage ring oscillator, beginning at initial phases φ01 and φ02, respectively.
This can lead to large variations in the lock time of an ILO for a given injection frequency,
A more complete picture of the lock time of an ILO as a function of its initial phase
is shown in Fig. 5.9 for both SPICE simulations and as predicted using the PTC model.
In these simulations the lock time is defined as the time taken for the oscillator’s output
phase to settle to within 1° of its steady-state phase. Although the lock time reaches a
maximum value near 12 ns, it should be noted that there is no fundamental limit to this
and it is possible to observe very long settling times in an ideal, noise-free simulation
environment. In practice noise will push the oscillator phase away from φu , thereby
[Figure 5.9 plot: lock time versus initial injected signal phase, φ0 (deg), for SPICE sim. and PTC model.]
Figure 5.9: Lock time varies greatly depending on the phase at which the injected signal
begins. This effect is seen in both SPICE-level simulation and the PTC-based
lock time model.
Although a strength of the PTC-based model is that it captures the nonlinear phase
response of the ILO during large phase transients, it can also be used to find linear
performance metrics in the presence of small phase deviations such as phase tracking
bandwidth. Specifically, consider an ILO that has reached steady-state at a lock point
with a relative phase shift φss defined by Equation 5.12. Small perturbations around this
lock point due to phase changes (i.e. jitter) in the injected signal, θinj,k , result in restoring
phase shifts that are proportional to the phase error. The constant of proportionality
is the slope, m, of the PTC around φss . Hence, under small phase perturbations, a
\[ \mathrm{JTF} = \frac{1}{1 + \dfrac{\omega_j}{\omega_{TB}}} \tag{5.20} \]
where ωj is the frequency of the jitter and the 3-dB tracking bandwidth is given by ωT B .
When the phase at the input of the ILO is perturbed by an amount, θinj,k , then the
Chapter 5. Phase Transfer ILO Model 68
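[Block diagram of the discrete-time phase model: input θinj,k, phase difference φk, PTC block P(·) with offset P(φss), delay z⁻¹, states θk and θk+1.]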
Figure 5.10: The jitter tracking bandwidth of an ILO can be determined by applying
a step change to the phase of the injected signal and observing the resulting change in
the output phase. The displayed results are from Simulink simulations of the proposed
model, performed using the PTC of the ILO as determined from Spectre simulations.
This equation shows that the rate of change of the output phase of the ILO in response
we apply a small step change to θinj,k and observe the system response as illustrated in
Fig. 5.10. When this step is applied, the phase difference between the injected signal
and the ILO output, φk , jumps by the value of the applied step, which then causes the
Since the output phase follows a first order exponential settling, as shown in the
previous section, the time constant τT B of this response can be found from the slope of
θk+1 . This can be determined by taking the derivative of Equation (5.21), resulting in
\[ \frac{d\theta_{k+1}}{dk} = \frac{dP(\phi_k)}{dk}. \tag{5.22} \]
If we assume that for small perturbations in θinj the slope of P is a constant given by m
Figure 5.11: Comparison with direct transistor-level simulation using Spectre shows that
the 3-dB jitter tracking bandwidth can be accurately predicted over a range of injected
frequencies using the PTC model.
then
\[ m = \left. \frac{dP(\phi_k)}{dk} \right|_{\phi_k = \phi_{ss}} \tag{5.23} \]
\[ \tau_{TB} = \frac{1}{m f_{inj}} \tag{5.24} \]
where finj is included to convert τT B from injection cycles to seconds. This then means
that the jitter tracking bandwidth of the first-order phase tracking model is given by
\[ f_{TB} = \frac{1}{2\pi\tau_{TB}} = \frac{m f_{inj}}{2\pi} \tag{5.25} \]
where the injected frequency, finj , is related to ∆f and therefore φss through Equation
(5.12).
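As a numerical illustration of Equations (5.24) and (5.25), the sketch below evaluates the time constant and tracking bandwidth for a given PTC slope and injection frequency. It also checks the first-order behaviour under the assumption, taken from the description above, that each injection removes a fraction m of the remaining small phase error. The values of m and finj are illustrative and are not taken from the simulated oscillator.

```python
import math


def tracking_time_constant(m, f_inj_hz):
    """Equation (5.24): tau_TB = 1 / (m * f_inj), in seconds."""
    return 1.0 / (m * f_inj_hz)


def tracking_bandwidth(m, f_inj_hz):
    """Equation (5.25): f_TB = m * f_inj / (2*pi), in hertz."""
    return m * f_inj_hz / (2.0 * math.pi)


def residual_after(m, cycles):
    """Fraction of a small phase step left uncorrected after `cycles` injections,
    assuming each injection removes a fraction m of the remaining error."""
    return (1.0 - m) ** cycles


m = 0.05        # illustrative PTC slope at phi_ss (dimensionless)
f_inj = 1.0e9   # illustrative injection frequency in Hz

tau = tracking_time_constant(m, f_inj)
f_tb = tracking_bandwidth(m, f_inj)
residual = residual_after(m, round(1.0 / m))

print(f"tau_TB = {tau * 1e9:.1f} ns")
print(f"f_TB   = {f_tb / 1e6:.2f} MHz")
print(f"residual after 1/m cycles = {residual:.3f} (compare exp(-1) = {math.exp(-1):.3f})")
```

For small m the residual after 1/m injection cycles is close to exp(−1), which is what a time constant of 1/m cycles, or 1/(m finj) seconds, implies.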
The accuracy of this model is illustrated by Fig. 5.11 where the fT B of the 4-stage
ring oscillator discussed in the previous section is calculated using Equation (5.25) and
frequencies.
The time required for any SPICE-level transient simulation varies depending on a number
of factors including the processing power of the machine used, the size of the circuit
being simulated, the time step used by the simulator and the level of accuracy required
by the designer. Nonetheless, by making all of these factors as constant as possible, this
section attempts to quantify the reduction in simulation time achieved by using the PTC
To this end, simulations of a four-stage, 4.7-GHz ring oscillator with narrow pulse in-
jection into each stage of the oscillator [21] were performed using Cadence Spectre. With
the accuracy set to “conservative” for small step sizes, transient simulations were per-
formed to determine the PTC, as described previously. By using initial conditions to help
the free-running oscillations begin, the ILO settled to its steady state in approximately
1 ns and the injected signal was applied 2 ns after beginning the transient simulation.
By delaying the application of this pulse from 2 ns to 2.22 ns (in increments of 10 ps),
the P (φ) of the ILO was determined for a range of φ spanning 360° of the 213-ps output
signal period. For this comparison, the peak-to-peak value of the PTC was then trans-
lated to a lock range using Equation 5.17. Total simulation time required to determine
simulations. For example, the ability to lock to an injected signal can be determined by
applying the injected signal to the ILO for an extended period of time and observing the
output phase of the oscillator (relative to a signal oscillating at the ideal output frequency)
to see if the phase settles to some steady state value. By repeating this simulation over a
range of injected frequencies, the lock range of the ILO can be determined. Unfortunately,
in order to obtain a reasonable level of accuracy in the lock range when it is determined in
this way, a large number of injected frequencies must be attempted. Further complicating
the situation is the fact that each transient must be run for a large number of periods of
Figure 5.12: Spectre transient simulations of an ILO settle to a constant phase if the ILO
is correctly locked (dotted line). If the injected signal is beyond the ILO’s lock range
then this can be identified by slipping of the output phase (solid line), which may not
become apparent until the simulation has been run for many output clock cycles.
the output clock since the phase slipping that identifies an unlocked ILO may take many
cycles to present itself for frequencies close to the edge of the lock range. This case is
illustrated by the transistor-level simulation results in Fig. 5.12 where the phase of the
4.86-GHz injected signal (solid line) appears to settle to a steady state but is revealed to
slip only after the simulation has been running for over 60 cycles of the output clock.
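The slow onset of phase slipping near the edge of the lock range can be reproduced with a simple cycle-by-cycle model. The sketch below is an illustration under stated assumptions, not the transistor-level simulation: it uses a hypothetical sinusoidal PTC of amplitude `amp_deg` and represents the frequency offset as a fixed phase drift per injection cycle, `drift_deg`. In this toy model a lock point exists only when the drift does not exceed the PTC amplitude.

```python
import math


def cycles_until_slip(drift_deg, amp_deg=15.0, max_cycles=5000):
    """Return the cycle at which the relative phase has slipped by 360 degrees,
    or None if no slip is observed within max_cycles (the ILO appears locked)."""
    phi = 0.0
    for k in range(1, max_cycles + 1):
        # Per-cycle drift from the frequency offset, minus the PTC correction.
        phi += drift_deg - amp_deg * math.sin(math.radians(phi))
        if abs(phi) >= 360.0:
            return k
    return None


inside = cycles_until_slip(14.0)       # drift smaller than the PTC amplitude
just_outside = cycles_until_slip(15.05)  # drift slightly larger than the amplitude

print("inside lock range:", inside)
print("just outside lock range, cycles to first slip:", just_outside)
```

When the drift only slightly exceeds the PTC amplitude, the phase lingers near the point of maximum correction for many cycles before slipping, so a short transient can be mistaken for a locked condition.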
Therefore, in order to determine the lock range of the 4.7-GHz oscillator discussed
above, 20-ns transient simulations were run for frequencies from 3.9 GHz to 5.1 GHz, in
10 MHz steps. Although the actual lock range of the ILO is from 4.15 GHz to 4.85 GHz,
this information is not known before the simulations are performed and a wider range of
comparable accuracy to that obtained using the PTC, as will be shown later in Table
5.2.
Due to the length of each transient, and the fact that 121 frequencies need to be
simulated to cover 3.9 to 5.1 GHz in 10 MHz steps, this simulation takes considerably
longer than the PTC simulation. This is apparent from the comparison shown in Table
5.1. Since the PTC can be used to quickly determine not only the lock range of an ILO
but also its steady-state phase, lock time and jitter tracking bandwidth, as was shown in
the previous sections, the advantage of using the PTC model is apparent.
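The difference in workload can be made concrete by counting the transient runs involved. The short sketch below uses only the sweep limits quoted in this section. The length of each PTC transient is not stated here, so only the number of runs is compared; the helper name is illustrative.

```python
def sweep_points(start, stop, step):
    """Number of points in an inclusive sweep with integer limits and step."""
    return (stop - start) // step + 1


# Direct lock-range search: 3.9 GHz to 5.1 GHz in 10 MHz steps, 20 ns per run.
n_freqs = sweep_points(3900, 5100, 10)   # limits in MHz
total_sim_ns = n_freqs * 20

# PTC extraction: pulse delay swept from 2 ns to 2.22 ns in 10 ps steps.
n_ptc_runs = sweep_points(2000, 2220, 10)  # limits in ps

print(f"direct search: {n_freqs} runs, {total_sim_ns} ns of simulated time")
print(f"PTC extraction: {n_ptc_runs} runs")
```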
Table 5.1: Comparison of simulation time required to determine the lock range of an ILO
to that required to determine the PTC, which can be used to determine lock range in
addition to other ILO parameters.
This section presents a design example applying the proposed model to a multiplying
ILO (MILO) that generates a 4-GHz output signal from a 1-GHz reference clock. In
order to demonstrate the usefulness of the PTC model, the MILO is designed to have a
very wide lock range, which is difficult to model using other methods. Wide lock range
is typically difficult to achieve for ILOs, with reported lock ranges commonly less than
is required that does not fit conventional ILO models, but the PTC-based model can be
A ring oscillator topology was chosen as it generally provides wider lock range than
LC-based ILOs [58]. The frequency domain model presented in [24] states that the lock
\[ \Delta\omega_{max} = \frac{2\omega_0}{n \sin\frac{2\pi}{n}} \cdot \frac{K}{\sqrt{1 - K^2}} \tag{5.26} \]
where n is the number of stages in the ring and K is the relative injection strength given
as Iinj /Iosc . Although this model indicates that the number of oscillator stages should
be decreased and that the injection strength should be increased in order to maximize
∆ωmax , the model provides very little insight into what the injected signal should look
like and how it should be applied to the MILO. Further complicating the application of
the frequency domain model is the fact that Equation (5.26) must be modified once the
[Schematic annotations: 522 Ω load; Cvar = 45 - 130 fF; output Vout; inputs Vosc and Vinj; transistor sizes 10, 10, 2, 2; tail currents Itail and Itail/5.]
Figure 5.13: One stage of the four-stage CML injection locked ring oscillator. Applying
the injected signal to a secondary differential pair provides a strong injection strength.
The PTC model, specifically Equation (5.17), indicates that the lock range, ∆fmax ,
can be increased by maximizing the peak-to-peak phase transfer characteristic, Ppp . Since
Ppp can be efficiently determined through simulation, the lock range of different MILO
topologies can be quickly evaluated and compared. In this design a lock range of 1 GHz,
or 25% of the 4-GHz f0 was targeted, which translates to a target Ppp of 360°.
To serve as a starting point in the design, a four-stage CML ring oscillator was
strength, Vinj was applied to a secondary input differential pair with the drain nodes
connected to those of the original CML stage as shown in Fig. 5.13. Transistor sizes in
µm are shown, with all gate lengths implemented as minimum sizes. The tail current of
the main pair was set to 744 µA, while that of the injection pair was chosen to be 1/5 of
this in order to ensure that the ILO continues to oscillate when there is no injected signal
present. Varactor load capacitances are used to tune the MILO’s free-running frequency
if necessary.
The PTC of the four-stage MILO is then determined by simulation using injected
pulses with an amplitude of 300 mVpp (differential) and a width equal to approximately
half of one period of a 4-GHz clock signal. These pulses were applied to the first stage
of the oscillator, as shown in Fig. 5.14. In order to create a realistic pulse shape in the
simulation environment, an ideal pulse is first applied to a CML differential pair before
it is applied to the MILO. The secondary differential pair shown in Fig. 5.13 is included
Figure 5.14: The PTC is determined by measuring the output phase change created
when a single pulse is applied to the MILO at different phases relative to the oscillator’s
output signal.
in each stage of the oscillator to provide a consistent load at the output of each stage.
Where these injection pairs are unused, their gates have been grounded.
By applying this pulse at various times spanning one period of the clock signal and
observing the resulting change in the output phase of the MILO, the PTC was determined
and is plotted as the “1 inj” curve in Fig. 5.16. It exhibits a Ppp of 26°, which corresponds
to a lock range of 54 MHz and highlights the difficulty of achieving a wide lock range
for an ILO. Although the strength of the injected signal has been maximized relative to
the available headroom in the 65-nm CMOS process, other strategies that attempt to
increase the MILO’s sensitivity to injected signals are required in order to increase lock
range.
Injection into multiple locations of an oscillator has been shown to increase the lock range
of injection-locked frequency dividers by applying the injected signal to the tail currents
of two [26] or three different stages of an n-stage ring oscillator [58]. In both cases off-
chip controls were used to modify the phase relationship between the injected signals,
demonstrating that injected signals should be applied with successive phase delays of
π/n in order to achieve the widest lock range. In other words, the injected signal should
experience the same delay as is created by one oscillator stage before being injected into
[Block diagram: ring oscillator stages R1 to R4 with injection delay stages I1 to I4.]
Figure 5.15: Injection into multiple ILO locations can increase locking range if the in-
jected signal experiences a delay that is equal to that created by each stage of the ring
oscillator.
This means that the required phase shift in the injected signal can be easily created
on-chip by passing the injecting signal through delay elements that are identical to those
that make up the ring oscillator. This concept is illustrated in Fig. 5.15 where injection
delay stages I1 to I4 are identical to ring oscillator stages R1 to R4 with their unused
injection input ports grounded. Small buffer stages are also added to the output of each
stage of the ring oscillator in order to ensure that the load seen at the outputs of R1 to
The addition of each new injection site increases the ILO’s sensitivity to an injected
pulse. This is illustrated by the Spectre simulation results in Fig. 5.16 where the peak-
to-peak value of the PTC is increased as the number of injection sites increases from
single injection into stage R1 (“1 inj”) to injection into all four ring oscillator stages
(“4 inj”). Although the effects of this multi-stage injection would be difficult to predict
using conventional ILO models, the PTC is readily determined through schematic-level
simulation and the results can easily be translated into the resulting lock range, lock
It should be noted, however, that the limited bandwidth of each element in the delay
line created by I1 to I4 results in the loss of some high frequency content of the injected
pulse as it travels through each successive stage. This means that the pulse, which began
[Figure 5.16 plot: PTC (deg.) versus phase (deg.) for the 1 inj, 2 inj, 3 inj and 4 inj cases.]
Figure 5.16: Spectre simulation results show that the peak-to-peak amplitude of the PTC
increases as more injection sites are added.
with a width equal to half the bit period of the output clock will become wider by the
time it reaches R4 . These wide pulses are therefore able to produce a larger positive
phase change in the MILO output, which corresponds to improved locking to frequencies
lower than that of the free-running oscillator. They are, however, not able to produce
a more negative phase in the MILO output, meaning that there is no improvement in
locking to higher frequencies. Although this effect is typically not addressed by existing
ILO models, it is clearly visible in the difference between positive and negative peak
The peak PTC values were translated to lock ranges using Equation (5.17) and are
reported in Table 5.2. These results are compared to lock ranges obtained using the ISF
method [54] and to those obtained directly from SPICE-level simulation using Virtuoso
Spectre. To obtain the lock range in this way the transient response of the oscillator was
simulated over a range of frequencies and a locked condition was identified by the settling
of the MILO’s output phase (relative to a reference signal) to some steady-state value.
In order to ensure that the MILO is locked and that there is no eventual slipping in
Method            # inj. sites   1         2         3         4
ISF theory [54]   ∆flow          101 MHz   249 MHz   466 MHz   475 MHz
                  ∆fhigh         87 MHz    174 MHz   226 MHz   242 MHz
PTC model         Pmax           14.7°     30.4°     44.4°     56.5°
                  Pmin           -11.6°    -20.61°   -22.69°   -21.64°
                  ∆flow          32 MHz    67 MHz    97 MHz    124 MHz
                  ∆fhigh         25 MHz    45 MHz    50 MHz    47 MHz
Spectre sim.*     ∆flow          30 MHz    60 MHz    90 MHz    120 MHz
                  ∆fhigh         30 MHz    40 MHz    50 MHz    50 MHz
*performed with 10 MHz resolution
Table 5.2: Comparison of lock ranges calculated using the ISF model [54] and the PTC
model to those obtained directly using extensive SPICE-level simulations.
the output phase, the transient simulation must be run for several hundred clock cycles.
This, combined with the fact that the step size of the frequency sweep must be small
in order to accurately determine the lock range, results in simulations that consume a
significant amount of time and resources. This highlights the usefulness of determining
frequency pre-conditioning circuit that emphasizes the input’s desirable harmonic. For
example, an edge detector comprising a delay and XOR gate is often used for this purpose
[56], and results in the MILO design shown in Fig. 5.17. This topology was developed
in collaboration with a team of engineers while on internship at Rambus Inc. and was
published as part of the bidirectional link presented in [21]. By incorporating a lock range
of approximately 7%, this MILO is able to lock to the desired 2.8-GHz clock frequency
across changes in voltage and temperature. It should be noted, however, that the PTC
model presented in this chapter was developed separately and involved no collaboration
with Rambus.
While the impact of adding an edge detector to the MILO would be difficult to include
[Block diagram annotations: fref = 1 GHz, finj = 2 GHz, fclk = 4 GHz.]
Figure 5.17: Creating pulses at the reference clock edges emphasizes the desirable har-
monic of the input thereby improving the lock range of the MILO.
[Block diagram annotations: fref = 1 GHz, fpulse = 2 GHz, finj = 4 GHz, fclk = 4 GHz.]
Figure 5.18: The addition of a second edge detector with wide pulse widths further
emphasizes the desired harmonic of the input signal.
in existing ILO models, its inclusion in the PTC simulations is trivial. First, the delays
and XORs shown in Fig. 5.17 were included prior to the ILO. Then a DC offset was added
to the first amplification stage in the delay chain in order to create the return-to-zero
pulses shown in this figure. Simulations of the MILO using this injection technique show
that it increases Ppp to 114°. Further increases are then achieved by adding a second
edge detector set to create pulse widths equal to twice that of the original edge detector,
resulting in the MILO topology shown in Fig. 5.18. Using this technique increases Ppp
to 204°.
Furthermore, with the injected pulses now arriving at the same frequency as the
output clock, it becomes unnecessary for the injected signal to return to zero between
injected pulses and a full swing sinusoid can now be used as the injection signal instead.
When this is done Ppp increases beyond the target value of 360 degrees. The PTC for
[Figure 5.19 plot: PTC (deg) versus phase of injected signal, φ (deg).]
Figure 5.19: Spectre simulations determine the PTC for ILOs using (a) one edge detector,
(b) two edge detectors and (c) two edge detectors used to produce a sinusoidal (NRZ)
injection signal.
each of these options was determined through transistor-level simulation in Spectre and
is shown in Fig. 5.19. The PTC values reported for the case of two edge detectors with
sinusoidal injection indicate that a 4-GHz ILO using this topology should be able to
achieve a lock range that extends 0.7 GHz above f0 (using Pmin = -250.9° in Equation
5.13) and 1.06 GHz below f0 (using Pmax = 383.3° in Equation 5.14).
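The arithmetic behind these figures can be checked with a short sketch. Equations (5.13), (5.14) and (5.17) are not reproduced in this section, so the code assumes a linear relation, ∆f = |P| f0 / (360° N) with multiplication factor N = 4. This relation is an inference that is consistent with the numbers quoted in this chapter (a Ppp of 360° for a 1-GHz range at f0 = 4 GHz); it should not be read as the exact form of those equations. The function name is illustrative.

```python
def lock_range_hz(p_deg, f0_hz, n_mult):
    """Assumed linear mapping from a PTC extreme (degrees) to a frequency range."""
    return abs(p_deg) * f0_hz / (360.0 * n_mult)


F0_HZ = 4.0e9   # free-running frequency quoted in the text
N_MULT = 4      # multiplication factor of the MILO

P_MIN_DEG = -250.9
P_MAX_DEG = 383.3

above = lock_range_hz(P_MIN_DEG, F0_HZ, N_MULT)
below = lock_range_hz(P_MAX_DEG, F0_HZ, N_MULT)
total = above + below
p_pp = P_MAX_DEG - P_MIN_DEG

print(f"Ppp = {p_pp:.1f} deg")
print(f"range above f0 = {above / 1e9:.2f} GHz, below f0 = {below / 1e9:.2f} GHz")
print(f"total = {total / 1e9:.2f} GHz = {100.0 * total / F0_HZ:.0f}% of f0")
```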
Using this method Ppp was found to be 634.2°, which translates to a lock range of
1.76 GHz or 44% of f0 . This compares well with both direct simulation of the lock range
(40% of f0 ) and the measured lock range of this MILO in a prototype chip (42.5% of f0 ),
5.4 Summary
Modeling ILO behaviour by using conventional frequency domain models requires the use
of several parameters which are difficult to define. Modeling using the ISF-based model
is accurate only when the injected signal strength is low. As an alternative to these
options, the proposed PTC model of an oscillator can be used in conjunction with simple
transistor-level simulations to accurately predict the behaviour of any ILO under any
injected signal. This makes the PTC model useful during the design of an ILO, helping
to optimize the circuit for a given set of requirements. This model has been submitted
Using this PTC model a MILO was designed to multiply a 1-GHz reference clock by 4
to produce a 4-GHz output clock. By simulating the PTC of the MILO at various stages
throughout the design process, it was possible to quickly evaluate the impact that each
change in topology had on the lock range. This in turn made it possible to achieve a
wide lock range in a logical progression of design steps, resulting in a MILO with a lock
Clock Multiplier
Many wireline communication links require peak bandwidth operation for only a small
fraction of their operating time [20]. The power efficiency of many electronic devices
has therefore been improved by varying the interface baud rate as bandwidth
requirements change [4] and by scaling the supply voltage of the link [59]. Such systems
are referred to as dynamic voltage and frequency scaling (DVFS) and are commonplace
More recently, work has focused on improving power efficiency by completely powering
down the link during periods of inactivity [35], [21]. While these techniques have proven
effective at significantly reducing standby power consumption, they have so far been
unable to combine this advantage with the ability to also scale the frequency of the links
when they are powered on. Greater link flexibility and thus further power savings can be
achieved by combining both techniques in applications such as the DVFS system shown
in Fig. 6.1. In this system, the digital logic operates at a frequency and supply voltage
that varies according to demand. Any communication that must occur with other chips
must first pass through a serializer/deserializer (SerDes) block, which increases the data
Chapter 6. A Wide Lock Range, Fast Power-On Clock Multiplier 82
[Block diagram: processors with logic and SerDes blocks communicating over a channel; a clock multiplier generates the SerDes clock from a low-speed clock.]
Figure 6.1: The design of a frequency agile clock multiplier that is suitable for fast power
cycling can achieve link flexibility and power savings in DVFS applications.
rate of each link and thereby limits the number of I/O pins and interconnect traces
that are required. The high-speed clock required for the SerDes operation is typically
produced by a clock multiplier, as shown in Fig. 6.1 (note that only the serializer portion
of the SerDes block is shown for simplicity). If this multiplier can be included in the
blocks that are powered down when the link is inactive, further power savings can be
achieved. However, the implementation of a multiplier that is both frequency agile and
capable of fast power cycling presents significant design challenges, which will be explored
in this chapter.
narrow lock ranges, typically less than 10% of the free-running frequency, even in ap-
plications where attempts have been made to maximize the frequency range [21]. PLLs
[60], or MDLLs [61] can be tuned to accommodate a wide range of input frequencies, but
their slow settling time makes them unsuitable for fast power-on architectures. While it
is sometimes possible to adapt these loops for fast power-on applications by initializing
their control voltages [35], this requires constant link speed between consecutive power-
on cycles, limiting the usefulness of this technique for frequency agility. This chapter
presents the first clock generator that is both frequency-agile and capable of fast power-
[Timing diagram: Power On signal and reference clock (ref).]
Figure 6.2: This chapter presents the first clock multiplier that is frequency-agile and
has the ability to be powered on in under 10 cycles of the reference signal.
on. Measured results demonstrate an aggregate lock range of 55.7% and show that a
valid clock output is available within 10 cycles of the reference clock. The result is the first
clock generator that is capable of performing the frequency shifting operation illustrated
in Fig. 6.2 with no adjustments or tuning of any kind between power-up sequences.
As discussed in the previous chapter, the lock range of a MILO can be increased by either
increasing the effective strength of the injected signal, or by increasing the sensitivity of
the oscillator to this signal. Both of these effects can be captured by simulating the
phase transfer characteristic of a MILO in response to one period of the injected signal.
In the previous chapter, a MILO was designed using a ring oscillator with multiple
injection points in combination with edge detectors to achieve a lock range of 45% of
the free-running frequency. This MILO will serve as a starting point for the multiplier
architecture presented in this chapter. While the lock range achieved by this design is
Since this can be difficult for a single MILO to achieve, a possible alternative is to
employ multiple MILOs with adjacent lock ranges along with some control circuitry that
is able to switch between MILO outputs when necessary. This idea is illustrated in
Fig. 6.3 for the case of a clock multiplier with multiplication factor N = 4, designed to
[Block diagram: MILOs with adjacent lock ranges spanning 2 GHz to 4 GHz, control logic selecting the output clock, fref = 0.5 - 1 GHz.]
Figure 6.3: Four MILOs with adjacent lock ranges can cover an aggregate output fre-
quency range from 2 to 4 GHz.
produce output clock frequencies ranging from 2 to 4 GHz. Although this example uses
four MILOs to cover the desired frequency range, this number can easily be changed
ranges.
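To give a feel for the lock range each MILO would need, the sketch below divides the 2-GHz to 4-GHz band among a chosen number of MILOs. It assumes, purely for illustration, that every MILO covers the same frequency ratio and that adjacent lock ranges meet exactly with no overlap; a practical design would need some overlap and need not use equal ratios. The function names are hypothetical.

```python
def band_edges(f_lo_hz, f_hi_hz, n_milos):
    """Edges of n_milos adjacent sub-bands, each spanning the same frequency ratio."""
    ratio = (f_hi_hz / f_lo_hz) ** (1.0 / n_milos)
    return [f_lo_hz * ratio ** i for i in range(n_milos + 1)]


def fractional_lock_range(f_low_edge, f_high_edge):
    """Width of a sub-band relative to its arithmetic centre frequency."""
    centre = 0.5 * (f_low_edge + f_high_edge)
    return (f_high_edge - f_low_edge) / centre


edges = band_edges(2.0e9, 4.0e9, 4)
for lo, hi in zip(edges[:-1], edges[1:]):
    frac = fractional_lock_range(lo, hi)
    print(f"{lo / 1e9:.3f} GHz to {hi / 1e9:.3f} GHz: {100.0 * frac:.1f}% of centre")
```

Under these assumptions each of four MILOs needs a lock range of roughly 17% of its centre frequency, and using more MILOs reduces the range required of each one.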
A critical component of the architecture shown in Fig. 6.3 is the control logic, which
must monitor the output of each MILO in order to perform the following tasks:
input frequency, which may not be the case if the MILO has locked to a different
3. If two adjacent MILOs are locked to the correct frequency, then a single output
4. After making a decision based on these criteria, the three unused MILOs should be
powered down.
Further complicating these tasks is the fact that this monitoring and decision making
must happen within only a few cycles of the reference clock in order to meet the fast
[Block diagram: ILO1 followed by ILO2, which provides the clock output.]
Figure 6.4: The addition of a second, identical ring oscillator compensates for DJ intro-
duced by unequal pulse widths created by the edge detectors.
power-on requirement. The techniques used to perform these tasks in the allotted time
Simulations of a single MILO show that, since the width of the pulses created by the
edge detectors is fixed, the pulse widths of the injected signal can vary greatly as the
frequency of the injected signal strays from the centre of the MILO’s lock range. This can
create a significant amount of deterministic jitter (DJ) at the MILO output. To address
this, a second ring oscillator was added to the output of the first, as shown in Fig. 6.4.
Making the second oscillator identical to the first and injecting signals into each
oscillator stage ensures that the lock range will not be limited by the addition of the
second ILO. The resulting improvement in DJ was simulated for this MILO structure in
In addition to its jitter filtering properties, the addition of a second ILO can also be
used to verify that both oscillators are locked to an injected signal and also to measure
their distance from the free-running frequency. As was illustrated in the previous chap-
ter, the steady-state phase relationship between an ILO output and its injected signal is
proportional to the frequency difference between this injected signal and the oscillator’s
free-running frequency (Equation (5.12)). This phase relationship can be quickly deter-
mined by using the outputs of each stage of the first ILO to trigger latches reading one
output of the second ILO, as shown in Fig. 6.6. Details of the CML latches used in this
circuit are given in a schematic diagram included in Appendix B, Fig. B.2. The inclusion
These latches perform the function of a coarse time-to-digital converter (TDC). When
the injection frequency is close to f0 the phase of the highlighted injection signal (inj)
will be approximately equal to that of the highlighted clock signal (clk), resulting in the
latch outputs shown for case (b) in Fig. 6.6. As the injection frequency increases, the
phase of “inj” will begin to lead that of “clk”, resulting in case (a) and vice versa as the
injection frequency decreases, resulting in case (c). This means that bits D1 to D6 can
be used to estimate how far the ILOs are from their shared free-running frequency. Since
an ILO operating close to f0 should produce less jitter and have good tolerance to any
subsequent variations in voltage and temperature, this information can then be used to
select a clock signal from two different MILOs which have locked to the correct multiple
The addition of a sixth latch, which is triggered by the inverted output of stage 1 of
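[Block diagram: stage outputs of the first ILO (inj) trigger latches that sample the ILO2 output (clk), forming the TDC.]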
Figure 6.6: Latches at the output of each stage of the first ILO can be used to compare
the phase relationship between the two ILOs.
ILO1, ensures that the latches are triggered over a range from 0 to 180° in steps of 30°,
which is determined by the number of stages in the oscillator. This guarantees that D1
and D6 will have opposite values so long as each ILO is locked to the same frequency.
Therefore the outputs of D1 and D6 can be used to indicate when the ILOs are in a locked
condition and can trigger a power-down of any MILOs when this is not the case.
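The reason D1 and D6 must differ when both ILOs run at the same frequency can be seen with an idealized model. The sketch below assumes a perfect 50% duty-cycle square wave for the sampled clock and treats each latch as an ideal sampler. Only the first and last trigger phases (180° apart) matter for the lock check; the intermediate trigger phases in `TRIGGERS_DEG` are placeholders and do not describe the actual circuit.

```python
def sample_square(phase_deg):
    """Value of an ideal 50% duty-cycle square wave at the given phase."""
    return 1 if (phase_deg % 360.0) < 180.0 else 0


def tdc_bits(clk_phase_deg, trigger_phases_deg):
    """Latch outputs when the clock is sampled at each trigger phase."""
    return [sample_square(clk_phase_deg + t) for t in trigger_phases_deg]


def is_locked(bits):
    """Lock indication: first and last latch outputs must be opposite."""
    return bits[0] != bits[-1]


# Placeholder trigger phases; only the 0 and 180 degree end points are essential.
TRIGGERS_DEG = [0.0, 36.0, 72.0, 108.0, 144.0, 180.0]

for offset in (0, 45, 90, 200, 315):
    bits = tdc_bits(offset, TRIGGERS_DEG)
    print(f"clk phase offset {offset:3d} deg -> bits {bits}, locked = {is_locked(bits)}")
```

Because the two end samples are taken half a clock period apart, they always land on opposite halves of the square wave for any fixed phase offset. When the two oscillators run at different frequencies this relationship is no longer guaranteed.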
Since the multiplier covers a broad range of frequencies, it is possible for two MILOs to be
locked to frequencies at different multiples of the reference signal frequency. This effect
is not captured by the frequency lock detection discussed in the previous section and, as
a result, some additional logic is required to detect which MILOs are correctly locked to
N = 4 times the reference frequency. This can be accomplished using the modified ripple
In this structure the chain of latches exits the reset state when the reference clock signal
goes high at t1 and propagates a “logic high” signal down the chain as each successive
output clock edge occurs. By the time the reference clock signal goes low at time t2 , latch
outputs Q1 to Q6 should correspond to the values shown in the table in Fig. 6.7 provided
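[Schematic: two chains of resettable latches (D inputs, reset R), one clocked on rising edges with outputs Q1, Q3, Q5 and one on falling edges with outputs Q2, Q4, Q6, active between t1 and t2 of the reference clock. Expected outputs: rising edges Q1 = 1, Q3 = 1, Q5 = 0; falling edges Q2 = 1, Q4 = 1, Q6 = 0.]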
Figure 6.7: Latch chains verify the multiplication factor by ensuring that there are exactly
2 rising and 2 falling clock edges within one half period of the reference clock.
that the output clock frequency is 4 times that of the reference. Any deviation from
this multiplication factor will result in different latch outputs, which will then trigger
a power-down of the incorrectly locked MILO. It should be noted that, since the phase
relationship between the output clock and the reference signal is unknown, it is impossible
to know whether a rising or falling clock edge will arrive first after the reference signal
goes high at t1 , making it necessary to count rising and falling edges in separate counter
The power required to operate several MILOs continuously would likely outweigh any
savings gained through frequency agility and the ability to power down the system during
possible. If the edge counter is able to detect frequency lock to an incorrect multiple of
the reference frequency or the TDC is able to detect an out-of-lock condition, as described
previously, then the corresponding MILO can be powered down immediately. A block
diagram of the system used to implement this strategy is illustrated in Fig. 6.8.
If neither of these conditions is found then it becomes necessary to compare the
MILO’s clock output to that of any adjacent MILO that may also be locked to the
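[Figure 6.8 block diagram: the reference (ref) and output clock feed a TDC, which reports the distance from f0, and an edge counter; both drive the power-down logic.]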
Figure 6.8: A MILO can be powered down immediately if the TDC detects an out-of-lock
condition or if the edge counter detects an incorrect frequency multiplication ratio.
correct frequency. This system is pictured in Fig. 6.9 where data about the distance
of each ILO’s output from its free-running frequency is passed to a “frequency offset
compare” logic block that is external to the MILOs. This logic uses the output bits from
the TDC to determine which multiplier is operating closest to its free-running frequency
and sends a signal to the other MILO instructing it to power down. In the case where
both TDC outputs are identical, the MILO with the lower free-running frequency is
powered down. The reference clock signal is also applied to the power down logic where
it is used to latch each of the TDC outputs for half cycles of the reference clock. This
gives the frequency offset compare logic enough time to make its decision and power
down a MILO without being susceptible to sudden changes in the TDC outputs, which
can occur due to the deterministic jitter present in the ILO outputs.
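The decision rule of the frequency offset compare logic can be summarised in a few lines. This is only a behavioural sketch: the function name is hypothetical, and it assumes that each latched TDC output is available as an integer code in which a larger value means the ILO is operating further from its free-running frequency.

```python
def select_power_down(offset_a, offset_b, f0_a, f0_b):
    """Return "a" or "b": which of two correctly locked MILOs to power down.

    offset_a, offset_b: latched TDC codes (assumed encoding: larger means
        further from the free-running frequency).
    f0_a, f0_b: free-running frequencies of the two MILOs, used as tie-break.
    """
    if offset_a != offset_b:
        # Keep the MILO operating closest to its free-running frequency.
        return "a" if offset_a > offset_b else "b"
    # Identical TDC outputs: power down the MILO with the lower f0.
    return "a" if f0_a < f0_b else "b"


if __name__ == "__main__":
    print(select_power_down(3, 1, 2.6, 3.4))
    print(select_power_down(2, 2, 2.6, 3.4))
```

Latching the TDC codes before calling such a comparison mirrors the role of the reference clock in the text: the inputs are held constant while the decision is made.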
If a signal from the TDC, edge counter, or frequency offset comparison logic triggers a
power-down, then the circuits within the unused multiplier are turned off by applying a
logic signal to switches in the tail of each differential pair in the MILO, as
shown in Fig. 6.10. Since all parts of the MILO employ CML signaling, this ensures that
the power drawn by the multiplier goes to almost zero when powered down. It should
be noted that this idea can be easily extended to cut power to the monitoring circuitry
(TDC, edge counter, external power-down logic) of the MILO that remains active in
order to achieve further power savings. However, this technique was not implemented in
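[Figure 6.9 block diagram: MILO 1 and MILO 2 each contain a TDC, an edge counter and power-down logic driven by the reference (ref) and output clock; a shared "Freq. Offset Compare" block connects the two MILOs.]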
Figure 6.9: If two MILOs are locked to the correct frequency, the power-down decision
is made by determining which MILO is operating closest to its free-running frequency.
Figure 6.10: Power down of any unused MILOs is accomplished by blocking tail currents
in all CML stages.
To ensure that the CML stages are powered down effectively and that any leakage
current present after power-down remains as small as possible, the power-down logic
signal is converted to full-swing CMOS logic levels by the circuit shown in Fig. 6.11. This
circuit is based on the CML-to-CMOS converter presented in [62] with the modification
that the width of transistor M2 is made twice that of transistor M1. This ensures that
if both VCML inputs go to Vdd, which occurs when the preceding CML stages are turned
off, the VCMOS output remains low, keeping the MILO circuits powered down.
The decision to power down a multiplier can only be made after all bias voltages
and oscillator outputs have settled to their steady-state operating conditions. This is
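[Figure 6.11 schematic: CML-to-CMOS converter with input VCML, output VCMOS, supply VDD, tail current Itail and transistors M1 and M2.]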
Figure 6.11: Converting the power down signal to CMOS logic levels ensures successful
power-down and minimizes leakage power.
[Figure 6.12 timing diagram: signals start, ref (1 GHz), Pdown1, Pdown2 and clk1 to clk4 plotted from 0 to 11 ns, with annotations "settling", "wrong frequency", "offset" and "not locked".]
Figure 6.12: Timing diagram of the power-on sequence for a 1-GHz reference signal.
accomplished by adding an enable signal to the power down logic in Fig. 6.10. This
signal can be created automatically by counting some number of reference clock cycles
using a chain of latches similar to that used to measure the frequency multiplication
ratio. By waiting for 8 reference clock cycles before enabling power down from the
edge counter and TDC logic, and an additional 2 clock cycles before enabling power
down from frequency offset comparison with adjacent multipliers, it is possible to avoid
making premature power-down decisions. This sequence is shown in Fig. 6.12 for a
Fig. 6.13. Upon receiving a “Start” signal from an off-chip signal source, the latches
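[Figure 6.13 schematic: a chain of ten resettable D latches clocked by ref and released by Start passes Pdown_en down the chain; the output of latch 8 provides the MILO Pdown enable and the output of latch 10 provides the frequency offset compare Pdown enable.]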
Figure 6.13: Power down of individual MILOs is enabled after 8 cycles and power down
resulting from comparison of two correctly locked MILOs is enabled after 10 cycles.
exit the reset state and begin to pass the value of “Pdown_en” down the latch chain.
Pdown_en is set off-chip in order to provide the ability to manually disable the power
The task of creating two ILOs with identical free-running frequencies, along with two
edge detectors—one with a pulse width equal to 1/f0 and the other with a pulse width
of 2/f0 —is non-trivial. While it is possible to design some method of tuning these blocks
so that calibration can be used to individually tune each ILO to a desired frequency
and each edge detector to a desired pulse width, this calibration can quickly become
To avoid this problem it is possible to take the four delay stages and an XOR gate
used in the edge detector structure and use this circuit as the basis for the second edge
detector and both ILOs in the MILO as shown in Fig. 6.14. This technique ensures that
the pulse widths and ILO operating frequencies will be well matched regardless of PVT
the CML XOR gate used in each of these blocks is included in Appendix B, Fig. B.3. It
should be noted that buffers were used at the output of each ILO stage to ensure that
the load seen by each delay element in the multiplier is identical, but these were omitted
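[Figure 6.14 schematic: ILO1, ILO2, an edge detector producing wide pulses and an edge detector producing narrow pulses, each built from four delay stages and an XOR gate.]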
Figure 6.14: Using four delay stages and an XOR gate as the building block for each
component in the MILO ensures good matching between pulse widths and ILO free-
running frequencies.
Creating MILO lock ranges that are adjacent to each other with enough overlap to
ensure that no gaps are present in the overall lock range, while also keeping this overlap
small enough not to compromise the achievable lock range, can be challenging. In order to
accomplish this, capacitances were added to the load of each delay element shown
in Fig. 6.14. Then, by increasing the physical size of these capacitances from one MILO
to the next, the desired overall lock range and amount of overlap can be achieved. In
order to account for any discrepancies between simulated and measured behaviour, these
capacitances were implemented using NMOS varactors to provide some tunability. Since
the delay stages and their associated varactors are identical in each stage of the MILO, it
varactor control voltage, which is applied to both the ILOs and the edge detectors. Once
this initial calibration is performed to set each MILO to a desired operating frequency,
A prototype of the clock multiplier described in this chapter was fabricated in a 65-nm GP
CMOS process. The multiplier contains four parallel MILOs with all of their associated
control logic, as well as a breakout MILO that provides off-chip output from each ILO
and is isolated from any frequency offset power-down logic. Besides the absence of this
logic, the breakout is a replica of MILO3. A die photo of the 1 mm × 1 mm chip is shown
in Fig. 6.15. Of special note is the fact that each MILO is powered by a dedicated supply
voltage in order to provide isolation from variations in power supply which can be created
when one or more MILOs are quickly powered down. This also improves testability of
the circuit by making it possible to test MILO lanes either individually or in any desired
combination.
With no injected signal applied and all power-down logic disabled, the free-running fre-
quency of each MILO was measured over a range of varactor voltages. The results are
plotted in Fig. 6.16 and show that each MILO can be tuned to a range of frequencies in
[Figure 6.16 plot: free-running frequency (GHz, 0 to 6) versus varactor voltage (V, 0 to 1.2) for MILO1, MILO2, MILO3, MILO4 and the Breakout MILO.]
Figure 6.16: Measured free-running frequency of each MILO shows reasonable spacing
between adjacent lanes and, if necessary, these free-running frequencies can be adjusted
using the varactor control voltages.
order to achieve the appropriate amount of overlap between adjacent lock ranges. This
figure also shows that if all MILOs are operated using the nominal setting of 0.9 V for all
varactors, which is the desired case for normal operation, the MILO operating frequen-
cies should be close enough to avoid gaps in the aggregate lock range. Furthermore, this
Direct measurement of the PTC of an ILO is impractical since the application of a single
period of the injected signal is non-trivial, as is the ability to measure the real-time phase
response of the MILO output to such a stimulus. Instead, it is possible to obtain part
of the PTC using the relationship shown in Equation (5.12), which is repeated here for
convenience.
$$P(\phi_{ss}) = 360^{\circ} \, \frac{N \, \Delta\omega}{\omega_0} \qquad (5.12)$$
Since the fabricated ILO was designed to have a multiplication factor of N = 4 and was
[Figure 6.17 plots: (a) φss (deg) versus ∆f (GHz), measured and simulated; (b) PTC (deg) versus φss (deg), measured and simulated.]
Figure 6.17: Measured values of (a) φss for various frequency offsets can be translated to
(b) the PTC of the ILO. These measurements show good agreement with SPICE-level
simulations.
various frequency offsets, ∆f , are applied. However, since changes to the frequency of
the injected signal also introduce an unknown amount of phase shift in the signal that
achieve the same effect by keeping finj , and therefore the phase of the injected signal,
constant and sweeping the varactor control voltage, thereby effectively sweeping f0 .
This technique is able to reproduce a portion of the PTC between the peaks of the
P function. Because the varactors in the breakout MILO are only able
to achieve free-running frequencies ranging from 3.4 to 4.7 GHz, the ∆ω that can be
applied is limited, which in turn limits the PSF values that can be observed. Despite this,
Fig. 6.17(a) compares the φss values measured in this way to those obtained from SPICE-
level simulations. Fig. 6.17(b) then translates these measurements to the PTC using
Equation (5.12) and compares them to the PTC that is obtained through simulation
as discussed in the previous chapter. In both cases there is good agreement between
[Figure 6.18 plot: measured lock range of MILO1, MILO2, MILO3, MILO4 and the Breakout MILO versus output frequency (GHz, 2 to 6).]
Figure 6.18: Measurements show wide lock ranges for MILO1, MILO2 and the Breakout
MILO, but problems with the reference clock distribution cause a reduction in the lock
ranges of MILO3 and MILO4.
With all power down logic disabled and varactor voltages set to their mid-range values,
the lock range of each MILO was individually measured and the results are shown in
Fig. 6.18. MILO1 and MILO2 show reasonable overlap with wide lock ranges that increase
with f0 , resulting in ranges of 32.2% and 36.3% respectively. This is not true, however, for
MILO3 and MILO4 whose lock ranges measure only 18.7% and 8.6% of f0 , respectively.
Since the only designed difference between the MILOs is the size of the varactor load
capacitance implemented in the CML delay stages, it is likely that this drop in lock range
for MILO3 and MILO4 is due to factors external to the MILOs themselves.
The likely cause of this problem is the clock distribution network which is used to
apply the reference clock signal to each MILO as well as the startup circuit and the
frequency offset compare logic. Although a 50 Ω termination is placed near its application
to the chip at the “ref” pads in Fig. 6.15, the remainder of the distribution network
gates. These are essentially open-circuit terminations, which can cause reflections that
Source       Quantity   Value
PTC model    Pmax       383.3°
PTC model    Pmin       -250.9°
PTC model    ∆flow      1.06 GHz
PTC model    ∆fhigh     0.7 GHz
Measured     ∆flow      1 GHz
Measured     ∆fhigh     0.7 GHz
Table 6.1: Comparison of measured lock range of the breakout MILO to those calculated
using PSF simulations.
may create undesired signals at the MILO inputs. This conclusion is reinforced by the
fact that the Breakout MILO, which is a copy of MILO3 but is attached to a different
point of the clock distribution network, shows no such lock range limitation, achieving a
range of 42.5% around a free-running frequency of 4 GHz. This lock range shows excellent
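The entries of Table 6.1 can be cross-checked with Equation (5.12), rearranged as ∆f = |P| f0 / (360° N). The sketch below is a consistency check only. It assumes that the 4 GHz free-running frequency quoted for the Breakout MILO is also the f0 used in the model, and that Pmax pairs with ∆flow and Pmin with ∆fhigh; the function name is hypothetical.

```python
def lock_limit_ghz(p_peak_deg, f0_ghz, n_mult):
    """Frequency offset (GHz) at a PTC extreme, from Eq. (5.12)."""
    return abs(p_peak_deg) * f0_ghz / (360.0 * n_mult)


F0_GHZ = 4.0   # Breakout MILO free-running frequency (assumed model value)
N_MULT = 4     # multiplication factor

df_low = lock_limit_ghz(383.3, F0_GHZ, N_MULT)     # from Pmax (assumed pairing)
df_high = lock_limit_ghz(-250.9, F0_GHZ, N_MULT)   # from Pmin (assumed pairing)
model_range_pct = 100.0 * (df_low + df_high) / F0_GHZ
measured_range_pct = 100.0 * (1.0 + 0.7) / F0_GHZ  # measured limits in Table 6.1

if __name__ == "__main__":
    print(f"model:    {df_low:.2f} GHz, {df_high:.2f} GHz, {model_range_pct:.1f}%")
    print(f"measured: 1.00 GHz, 0.70 GHz, {measured_range_pct:.1f}%")
```

With these assumptions the model limits round to the 1.06 GHz and 0.7 GHz listed in the table, and the measured limits reproduce the 42.5% lock range quoted in the text.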
Despite this problem, correct operation of MILO1 and MILO2 is sufficient to demon-
power-on clock multiplier. With all power-down logic enabled, the top portion of Fig. 6.19
shows the overall lock range of a multiplier consisting of MILO1 and MILO2. By com-
bining the 0.84 GHz lock range of MILO1 with the 1.24 GHz lock range of MILO2 and
incorporating a 320 MHz overlap region, a 1.76 GHz lock range is created, which is
equivalent to 55.7% of the 3.16 GHz f0 . The overlap region can be initially calibrated to
other desired values in order to tolerate variations in voltage and temperature by using
the varactor controls. Also, although not shown in this plot, tests were conducted with
reference clock frequencies equal to 1/16, 1/8, 1/2 and 1 times a frequency within the valid lock
range. In every case the resulting incorrect frequency multiplication factor was detected
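The aggregate lock range quoted above follows from adding the individual lock ranges and subtracting the shared overlap. A minimal check of that arithmetic, using only the figures stated in the text (the helper name is hypothetical):

```python
def aggregate_lock_range(ranges_ghz, overlaps_ghz):
    """Total lock range of adjacent MILOs: sum of ranges minus overlaps."""
    return sum(ranges_ghz) - sum(overlaps_ghz)


total_ghz = aggregate_lock_range([0.84, 1.24], [0.32])  # MILO1, MILO2, overlap
percent_of_f0 = 100.0 * total_ghz / 3.16                # relative to 3.16 GHz f0

if __name__ == "__main__":
    print(f"{total_ghz:.2f} GHz, {percent_of_f0:.1f}% of f0")
```

The same helper shows the trade-off discussed earlier in the chapter: a larger overlap makes gaps less likely but subtracts directly from the aggregate range.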
The lower portion of Fig. 6.19 shows the steady-state (SS) current measured from the
dedicated MILO supplies, illustrating the point at which the Frequency Offset Compare
logic switches the multiplier output from MILO1 to MILO2 as the reference frequency
is increased. This transition occurs well within the overlap region although not quite
[Figure 6.19 plot: overall lock range (top) and current (mA, 0 to 50) versus output frequency (GHz, 2 to 4.5) for MILO1 and MILO2, showing steady-state (SS) and average (avg) values.]
Figure 6.19: Two MILOs are able to increase the overall multiplier lock range to 55.7%.
The point at which logic switches between MILOs is illustrated by measurements of the
steady-state (SS) current drawn by each MILO, as well as the average (avg) current
drawn when the multiplier is active for 50% of the time in 50-ns bursts.
in its center because of the relatively coarse resolution of the TDCs in each MILO. To
highlight the power savings that are realized by using the fast power on/off capability, the
average (avg) current is also plotted showing a reduction to 32.2 mA when the multiplier
circuit is active for 50% of the time for 50-ns bursts. Although not shown in this plot,
the average continues to shrink as active time decreases or as burst length increases.
This is in contrast to traditional clock synthesizers which must either be left on at all
times [60] or sacrifice their frequency agility in order to achieve fast power-on [21]. Power
The outputs of both ILOs in the Breakout MILO structure (“ILO1” and “ILO2” in
Fig. 6.15) were taken off-chip in order to allow for analysis of their jitter behaviour.
The peak-to-peak DJ was obtained by using oscilloscope cursors to measure clock half-
periods and then taking the difference between the maximum and minimum values. The
results of this measurement are displayed in Fig. 6.20 and show that the shape of the
Figure 6.20: Measured results show that the addition of ILO2 provides a reduction in
DJ created by unequal pulse widths in the injected signal.
curves agrees well with the simulated results in Fig. 6.5. The resulting improvement in
DJ achieved through the addition of ILO2 ranges from 2.5 to 30.5 ps across the lock
range.
DJ at the output of both ILOs is minimized when the reference clock frequency is
well-matched to the fixed pulse widths generated by the edge detectors, resulting in a
low-jitter signal being injected into the ring oscillators. At frequencies far from this
point the periodic variations in pulse widths create significant DJ. Due to the dual edge
detector structure, this DJ occurs in repeating 4-bit patterns that can be detected and
isolated from the random jitter (RJ) using the jitter decomposition functionality of the
Agilent DCA-J oscilloscope. Fig. 6.21 shows this repeating pattern by plotting the DJ
per bit for a 4-bit pattern. This pattern is further illustrated by Fig. 6.22, which shows
a histogram of the total measured jitter, showing four distinct peaks due to the DJ.
It should be noted that the DJ values reported in Fig. 6.20 are higher than the
simulated values (shown previously in Fig. 6.5) due to the fact that inadequate models
of the QFN package parasitics were used when designing the 50 Ω output buffers used
to send the multiplier’s clock outputs off-chip. Although these buffers were designed to
Figure 6.21: Decomposition of the measured jitter shows a DJ pattern that repeats in a
4-bit pattern due to the pulse widths of the injected signal.
Figure 6.22: A histogram of the total measured jitter shows four distinct peaks repre-
senting the DJ.
[Figure 6.23 schematic: package parasitic model from chip to PCB with Lwire = 4 nH, Cpad = 200 fF and Cpack = 2 pF.]
Figure 6.23: Including package parasitics at the simulated clock outputs identifies the
cause of small measured clock amplitude.
supply a 350 mV peak-to-peak per side differential signal to a 50 Ω load, output clock
amplitudes measured only approximately 50 mV peak-to-peak per side. This effect was
duplicated in simulation by adding the package parasitic structure shown in Fig. 6.23 to
the chip outputs. These parasitics attenuate the high-frequency output to such a large
degree that the subharmonics due to the injected reference signal become relatively more
The effects of these parasitics can be partially overcome by increasing the power
drivers used in this application is shown in Appendix B, Fig. B.4. When the power
delivered to the output drivers is increased by 50%, Fig. 6.24(a) shows that the resulting
output swing is increased to approximately 100 mV peak-to-peak per side. This improved
output swing then translates to a reduction in DJ at the output of the MILO as shown
in Fig. 6.24(b). Unfortunately, since driver power could not be controlled independently
from the rest of the circuit in the prototype chip, increasing driver power by 50% also
results in a similar increase in the power consumption of the entire circuit. As a result,
this high power setting was not used to measure any other circuit performance metrics.
Although the DJ suffers from the reduced output signal amplitude, RJ measurements
show no similar degradation due to this effect. Fig. 6.25(a) shows a histogram of the
random jitter obtained using the jitter decomposition function of the oscilloscope. This
[Figure 6.24 panels (a) and (b); oscilloscope scale X = 2 ns/div, Y = 20 mV/div.]
Figure 6.24: Increasing driver power (a) increases output swing to approximately 100
mVpp per side and (b) results in a decrease in DJ.
Figure 6.25: Histogram (a) and oscilloscope capture (b) showing that the RJ of the clock
signal is approximately 1.4 ps-rms.
shows a measured RJ of 1.36 ps-rms, which agrees well with the 1.42 ps-rms and 12.44 ps-
measurement was performed by triggering the oscilloscope with the injected reference
signal at f0 /4, making it possible to examine clock edges in isolation from the DJ.
as the measured RJ of both ILOs, as shown in Fig. 6.26, remains at approximately 1.4
Figure 6.26: Measurements at the output of both ILOs show that the RJ remains near
1.4 ps-rms across the entire lock range.
signal is non-trivial since any delays experienced by this signal between its source and
its application to the chip must be deembedded from the measurement results. The test
setup used to accomplish this is shown in Fig. 6.27. After being applied to the PCB, the
Start signal is passed through a series of inverters in order to shorten its transition time
before being applied to the prototype chip (DUT). The last stage of this inverter chain,
along with its associated PCB trace length is duplicated and sent off the PCB to be
measured by an oscilloscope. Using identical SMA cables to connect both this signal and
the clock output to the oscilloscope ensures that their delays are reasonably well matched
and that the two signals received by the scope present an accurate representation of the
chip’s behaviour.
By using a 10 MHz signal source to repeatedly apply a Start signal to the circuit
in 50-ns bursts, the power-on transient behaviour of the multiplier can be captured
Fig. 6.28(a) where the Start signal is shown in the lower curve and the MILO output is shown
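[Figure 6.27 test setup diagram, labelled blocks: signal sources (Agilent E4422B, Agilent 83712B) providing a 1 GHz reference and a 10 MHz Start signal with trigger sync; hybrid coupler with 0° and 180° outputs (refp, refn); PCB with DUT, clkp and Start_out outputs on SMA1 and SMA2; spectrum analyzer; Agilent DCA-J sampling scope (Ch1, Ch2, trigger).]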
Figure 6.27: Test setup used to capture the transient response to an applied Start signal.
in the upper curve. The power-up behaviour of the circuit is examined more closely in
Fig. 6.28(b), which shows that the delay between application of the Start signal and the
To measure the transient behaviour of multiple MILO outputs simultaneously, the test
setup is modified as shown in Fig. 6.29. Since the prototype chip is capable of sending
either MILO1 or MILO2 and either MILO3 or MILO4 off-chip, it is possible
to apply the outputs from MILO2 and MILO3 to the oscilloscope simultaneously.
By again applying a Start signal to the circuit in 50-ns bursts, the transient behaviour
of both of these clock outputs was measured and is shown in Fig. 6.30(a). This plot shows
how the lower, unlocked clock (MILO3) is correctly identified by the power-down logic
in 8.5 ns, or approximately 8 cycles of the 0.95 GHz reference signal used in this test. In
Fig. 6.30(b) MILO1 is also enabled, providing two correctly locked clock outputs in the
upper signal, and an unlocked clock output in the lower signal. In this case the unlocked
MILO is identified and powered down in 9 ns. Then, after approximately 10 reference
clock cycles (11 ns) the locked MILO that is furthest from its free-running frequency is
Figure 6.28: Repeated 50-ns bursts allow for power-on transient behaviour of the multi-
plier to be captured in (a). Zooming in on the start of these bursts (b) shows 3 ns delay
between Start signal and output oscillations.
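[Figure 6.29 test setup diagram, labelled blocks: signal sources (Agilent E4422B, Agilent 83712B) providing a 1 GHz reference and a 10 MHz Start signal with trigger sync; hybrid coupler with 0° and 180° outputs (refp, refn); PCB with DUT, outputs clk1, clk2 and Start_out; Agilent DCA-J sampling scope (Ch1, Ch2, trigger).]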
Figure 6.29: Test setup used to capture the transient response of two output clocks.
[Figure 6.30 oscilloscope captures (a) and (b), with time annotations at 2 ns, 6.5 ns, 9 ns and 11 ns.]
Figure 6.30: Transient startup behaviour when (a) two and (b) three MILOs are enabled
shows identification and power down of unlocked clock(s) in 10 reference clock cycles.
identified and powered down, leaving only a single valid output clock.
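The measured power-down times can be compared with the nominal enable delays of 8 and 10 reference clock cycles described earlier in the chapter. The sketch below only converts cycle counts to time for a given reference frequency; it ignores the start-up delay of the oscillators and any logic delay, and its function and parameter names are hypothetical.

```python
def enable_times_ns(f_ref_ghz, first_cycles=8, extra_cycles=2):
    """Nominal power-down enable times (ns) after the Start signal.

    first_cycles: cycles before the edge counter and TDC may power down a MILO.
    extra_cycles: additional cycles before the frequency offset comparison
        may power down a correctly locked MILO.
    """
    period_ns = 1.0 / f_ref_ghz
    t_individual = first_cycles * period_ns
    t_compare = (first_cycles + extra_cycles) * period_ns
    return t_individual, t_compare


if __name__ == "__main__":
    t_individual, t_compare = enable_times_ns(0.95)  # reference used in this test
    print(f"{t_individual:.2f} ns, {t_compare:.2f} ns")
```

For the 0.95 GHz reference used in this test the nominal values are roughly 8.4 ns and 10.5 ns, which is in line with the measured 8.5 ns to 9 ns for the unlocked MILO and 11 ns for the frequency offset decision.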
It should be noted that the output of MILO3, which is shown at the bottom of these
figures, experiences an additional 1 ns delay in the start of its oscillations when compared
to the output of MILO2. This is the result of the increased physical separation between
each MILO and the startup circuitry, as shown by the die photo in Fig. 6.15, and indicates
Since this architecture involves rapid powering on and off of several blocks, the power
that is consumed from the 1.1 V supply cannot be adequately described by a simple,
steady-state metric. During power up, each MILO that is turned on consumes 55 mA.
For the MILOs that are powered down, this current is only drawn for a maximum of
10 reference clock cycles before it returns to 0 mA, leaving only 1 MILO consuming 60.5
by the startup circuit as well as the frequency offset comparison logic block, the output
[Figure 6.31 plot: power (mW, -40 to 100) versus time (ns, 0 to 60) for the active MILO, an inactive MILO and the logic/driver circuits, simulated and measured.]
Figure 6.31: Simulated transient power consumption during one power-up cycle of the
multiplier is compared to measured steady-state power consumption.
power consumption during power-up sequences could not be measured due to the short
duration of these events. Instead, simulation results of a power-up event are shown in
Fig. 6.31 along with the measured steady-state values stated above in order to lend
confidence to these simulated results. The ringing seen around the steady-state levels at
power-on or power-off is caused by the package parasitics at the chip supply pads.
Since the output clock cannot be used until the power-up sequence is complete, power
consumed during these 10 cycles of the reference clock penalizes the efficiency of the
multiplier. The extent of this penalty can be calculated using the simulated transient
power consumption values. For a 1-GHz reference clock, which corresponds to a power-
while the logic circuits consume approximately 430 pJ. As a result, the efficiency of the
multiplier depends strongly on the number of bits that are transmitted with each power-
up sequence. This effect is captured by Fig. 6.32, which shows how total energy consumed
remains below 25 pJ/bit for burst lengths of at least 250 bits but increases for shorter
burst lengths.
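The dependence of efficiency on burst length can be sketched with a simple amortisation model. It assumes that every burst pays a fixed start-up energy and then runs at a constant steady-state power; the function name is hypothetical, and the numerical inputs in the usage lines are placeholders chosen only for illustration, not values measured or simulated in this work.

```python
def energy_per_bit_pj(startup_energy_pj, steady_power_mw, bit_rate_gbps, burst_bits):
    """Energy per transmitted bit (pJ/bit) for a single burst.

    startup_energy_pj: energy spent during the power-up sequence.
    steady_power_mw: power drawn while the burst is being transmitted.
    bit_rate_gbps: data rate during the burst.
    burst_bits: number of bits sent per power-up sequence.
    """
    if burst_bits <= 0:
        raise ValueError("burst_bits must be positive")
    steady_pj_per_bit = steady_power_mw / bit_rate_gbps  # mW / (Gb/s) = pJ/bit
    return steady_pj_per_bit + startup_energy_pj / burst_bits


if __name__ == "__main__":
    # Placeholder inputs (hypothetical): 1000 pJ start-up, 50 mW, 5 Gb/s.
    for bits in (10, 50, 250, 400):
        print(bits, round(energy_per_bit_pj(1000.0, 50.0, 5.0, bits), 1))
```

In this model the start-up term falls as 1/burst length, so the energy per bit approaches the steady-state value for long bursts and rises sharply for short ones, which is the qualitative shape described for Fig. 6.32.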
Regardless of the burst length, however, the average power consumed by the multi-
[Figure 6.32 plot: energy (pJ/bit, 0 to 350) versus burst length (bits, 0 to 450).]
Figure 6.32: Energy consumed by the power-up sequence means that efficiency of the
multiplier suffers when used in short bursts.
plier scales nearly linearly with the percentage of active time. This is illustrated using
the simulated data shown in Fig. 6.33 for burst lengths of 400 bits. This dependence
highlights the fact that the anticipated percentage of active time of the multiplier is a
In order to compare the performance of the multiplier to the state of the art, Fig. 6.34
reports the steady-state power consumption in relation to the power-on time and
frequency range of other reported clock multipliers. Since it was not reported, the power-on
time of [22] and [60] was estimated from their PLL settling times, τs, according to the
equation
$$\tau_s = \frac{10}{0.1 \, f_{ref}} \qquad (6.1)$$
Further details, such as the technology and area used as well as the jitter performance
Figure 6.33: Average power consumption of the multiplier scales linearly with the per-
centage of active time.
[Figure 6.34 comparison plot against power-on time (ns, 0 to 300): this work (30.4 mW/GHz), [20] (16.2 mW/GHz), [21]* (5.25 mW/GHz), [49]* (4.16 mW/GHz) and [50]; * denotes estimated power-on time.]
Figure 6.34: Comparison to other clock multipliers shows the improvement in power-on
time and frequency range achieved by this work.
6.3 Summary
Fast power cycling and frequency agility can increase the flexibility of communication
circuits, allowing them to improve their power efficiency. However, the design of a clock
generating circuit that is capable of meeting both of these requirements presents a sig-
nificant challenge. The clock multiplier presented in this chapter introduces a MILO
structure that addresses these challenges and shows how the frequency agility of the
multiplier can be further improved by using parallel MILOs with adjacent lock ranges.
The fabricated multiplier is able to use two MILOs to achieve a frequency lock range
of 55.7% of the 3.16 GHz centre frequency. Although the pulse injection scheme used
introduces some DJ, this effect is mitigated by the addition of a second ILO, which
improves DJ by anywhere from 2.5 to 30.5 ps, depending on the injected frequency.
Transient tests of the power-on behaviour of the multiplier verify that the control logic
is able to identify the correct output clock and power down the other MILOs within 10
reference clock cycles of the arrival of a start signal. The active MILO and its associated
logic consume a total of 96 mW during steady-state operation, but this power scales with
the amount of time during which the link is active. This multiplier has been submitted
Conclusion
In order to operate efficiently under a variety of channel conditions and data rates during
both active and idle periods, wireline receivers must incorporate a great deal of flexibility
into their design. This thesis has examined several aspects of adaptive receiver design in
stage of an AFE while simultaneously maintaining a good impedance match to the wire-
line channel over a wide range of frequencies. This impedance match is achieved without
the use of a purely resistive matching network, which helps to improve the noise perfor-
mance of the receiver. In addition, this circuit responds to varying requirements of noise
to changes in the received signal strength, which are seen as changes to the preamplifier
gain settings. This helps to improve both the dynamic range and the efficiency of the
receiver.
Automatic adaptation of both this gain setting and the EQ peaking is performed
using the scheme presented in Chapter 2. This method shows that both of these AFE
channels by adding only a modest amount of overhead to the receiver. Using this method,
Chapter 7. Conclusion 113
one slicer, a low-speed ADC, and the circuitry required by the optimization algorithm
are all that are required to optimize the vertical eye opening.
data rates and power levels as needed. This requirement can be particularly challenging
when it comes to the design of the clock circuitry, which traditionally operates over a
power levels. These challenges were addressed by the frequency agile clock multiplier
reported in Chapter 6. The use of MILOs in this design gives the multiplier the ability to
transition from being completely inactive to fully operational in 10 cycles of the reference
clock. Furthermore, the use of several MILOs with adjacent lock ranges was demonstrated
as a way to increase the frequency range of the multiplier. By monitoring the outputs of
these MILOs, it is possible to quickly identify the optimal output clock signal and then
power down the unused MILOs in order to maintain receiver power efficiency.
In order to achieve frequency agility, this clock multiplier design also requires the
optimization of individual MILO lock ranges. This requires extensive circuit simulation,
which was made possible using the PTC modeling described in Chapter 5. By relating
the lock time and lock range to the change in output phase caused by a single period of
the injected signal, it becomes possible to efficiently simulate an ILO’s behaviour under a
variety of injected signal shapes, strengths and frequencies. Using this technique, Chapter
5 describes how changes to the way in which an injected signal is applied to an ILO can
have a significant impact on its lock range and shows how this can in turn be used to
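The idea of relating lock range and lock time to the per-period phase change can be illustrated with a simple iterated phase map. The Python sketch below is not the model of Chapter 3: the sinusoidal PTC shape, its strength and all function names are assumptions made for illustration. In practice the PTC would be taken from the simulated oscillator response.

```python
import math


def ptc(phi, strength=0.2):
    """Hypothetical PTC: output phase change (rad) caused by one injected
    period arriving at relative phase phi."""
    return -strength * math.sin(phi)


def cycles_to_lock(drift, phi0=1.0, tol=1e-6, max_cycles=10_000, strength=0.2):
    """Iterate phi[n+1] = phi[n] + drift + PTC(phi[n]).

    drift is the phase (rad) the free-running oscillator gains on the
    injected signal in one injection period. Returns the number of periods
    until the phase stops moving, or None if it never settles.
    """
    phi = phi0
    for n in range(1, max_cycles + 1):
        step = drift + ptc(phi, strength)
        phi += step
        if abs(step) < tol:
            return n
    return None


def lock_range(strength=0.2, samples=3600):
    """Per-period drift values that some point on the PTC can cancel."""
    values = [ptc(2 * math.pi * k / samples, strength) for k in range(samples)]
    return -max(values), -min(values)
```

With these assumed values, a drift inside the interval returned by lock_range settles after a finite number of periods, while a drift outside it never does. Changing the shape or strength of the PTC changes both results without another transient simulation of the oscillator.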
The ability of a wireline receiver to sense and adapt to a variety of operating conditions
was explored in this work. For these techniques to find widespread use
in industrial applications, there is still some work that can be done in order to minimize
In particular, the clock multiplier presented here relies on a clock-forwarded
architecture along with an additional control signal needed to implement the “wake up”
transition from zero power to fully operational. Future work could obtain
either the clock signal or this “wake up” signal from an existing data channel, thereby
reducing the number of pins required and maximizing the aggregate data rate.
Furthermore, although the clock multiplier quickly identifies and powers down unused
MILO lanes, the power drawn by the MILO that remains active is still quite high. This is
due in large part to the fact that the monitoring circuitry within the active MILO remains
on even after the other MILOs have been powered down and no further decisions need
this circuitry when all decisions have been made after 10 cycles of the reference clock.
Along these same lines, the monitoring circuitry used in this clock multiplier uses CML
almost exclusively. This decision was made in order to ensure correct operation during
variations in the supply voltage caused by sudden changes in the power consumption as
parts of the multiplier are powered on and off quickly. If these supply voltage variations
could be limited through the use of voltage regulators or through staggered progression
of the power on/off transitions, then it may be possible to replace several of these CML
blocks with CMOS logic, which would translate to a significant reduction in static power.
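A rough sense of the static power involved can be obtained from the bias current of a CML block multiplied by the supply voltage. The short Python sketch below uses the 0.8 mA bias current labelled in Figures B.2 and B.3. The 1.2 V supply and the count of ten blocks are assumptions chosen only for illustration, and the static power of the replacement CMOS logic is taken to be negligible.

```python
def cml_static_power_mw(n_blocks, i_bias_ma=0.8, vdd_v=1.2):
    """Static power in mW of n_blocks CML blocks.

    Each block is assumed to draw i_bias_ma continuously from a supply of
    vdd_v volts (the supply value is an assumption), so mA times V gives mW.
    """
    return n_blocks * i_bias_ma * vdd_v


# Static power removed if ten CML blocks were replaced with CMOS logic
# whose static power is assumed to be negligible.
saved_mw = cml_static_power_mw(10)
```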
Future work could also be done to expand the PTC-based ILO model presented in
Chapter 3. Although this model has been shown to accurately predict the lock range, lock
time and jitter tracking bandwidth of an ILO, the PTC should also be able to determine
the amount of deterministic jitter present at the ILO output, as well as the overall phase
noise. Obtaining all of this information about the behaviour of an ILO from a single set
of simulation results would increase the usefulness of the PTC model but was outside the
Summary of Contributions
3.4    Fast power-on MILO with multiple injection points to create    [21]
       a moderate lock range.
4      Multi-MILO clock multiplier with wide lock range and fast      accepted at CICC
       power-on.
Table A.1: Summary of contributions made along with related locations in the thesis.
Appendix B
Schematic Diagrams
[Schematic omitted. Labels: Vin, Vout, VTH, Vclk; transistor widths 6, 12 and 20 µm.]
Figure B.1: Schematic of the variable threshold slicer used in the AFE adaptation scheme.
Transistor sizes are given in µm with all gate lengths set to minimum size.
[Schematic omitted. Labels: Vin, Vout, Vclk, EN; 522 Ω resistors; 0.8 mA current sources; transistor widths 10 and 20 µm.]
Figure B.2: Schematic of the CML latch used in the TDC and power down logic of the
MILO. Transistor sizes are given in µm with all gate lengths set to minimum size.
[Schematic omitted. Labels: A, B, Vout, EN; 522 Ω resistor; 45 to 130 fF capacitance; 0.8 mA current source; transistor widths 10 and 20 µm.]
Figure B.3: Schematic of the CML XOR gate used in the pulse generators and ring
oscillators of the MILO. Transistor sizes are given in µm with all gate lengths set to
minimum size.
[Schematic omitted. Labels: Vin, Vout, EN; 169 Ω and 51 Ω resistors; 2.4 mA and 7.8 mA current sources; transistor widths 30 and 60 µm.]
Figure B.4: Schematic of the two-stage, 50 Ω output driver used to send the MILO
output off-chip. Transistor sizes are given in µm with all gate lengths set to minimum
size.
Bibliography
Blocks,” IEEE J. Solid-State Circuits, vol. 41, no. 8, pp. 1830–1845, Aug. 2006.
[2] F. Lu, J. Min, S. Liu, K. Cameron, C. Jones, O. Lee, J. Li, A. Buchwald, S. Jantzi,
[3] L. Lehikoinen and T. Räty, “Adaptive Real-Time Video Streaming System for Best-
J. Eyles, M. Aleksic, T. Greer, and N. Nguyen, “A 4.3 GB/s Mobile Memory Inter-
face With Power-Efficient Bandwidth Scaling,” IEEE J. Solid-State Circuits, vol. 45,
[5] T. Ebuchi, Y. Komatsu, M. Miura, T. Chiba, and T. Iwata, “An Ultra-Wide Range
VCO Gain Calibration,” IEEE J. Solid-State Circuits, vol. 46, no. 4, pp. 986–991,
2011.
terface,” IEEE J. Solid-State Circuits, vol. 47, no. 4, pp. 926–937, 2012.
Receiver Analog Front End with Dynamic Sampler Swapping Capability for Back-
376–379.
[9] M. Hossain and A. Chan Carusone, “5–10 Gb/s 70 mW Burst Mode AC Coupled
Receiver in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 3, pp. 524–
537, 2010.
C. Holdenried, J. Pham, E. So, D. Cassan, and S. Sadr, “An 8.4 mW/Gb/s 4-lane
[12] P. Park and A. Chan Carusone, “A 20-Gb/s Coaxial Cable Receiver Analog Front-
[13] D. Dunwell and A. Chan Carusone, “A 15-Gb/s Preamplifier with 10-dB Gain Con-
[16] F. Buchali, S. Lanne, J. Thiery, W. Baumert, and H. Bulow, “Fast Eye Monitor for
10 Gbit/s and its Application for Optical PMD Compensation,” in Optical Fiber
an Optical Receiver,” IEEE J. Solid-State Circuits, vol. 35, no. 12, pp. 1958–1963,
2000.
State Circuits, vol. 40, no. 12, pp. 2689–2699, Dec. 2005.
[19] D. Dunwell and A. Chan Carusone, “Gain and Equalization Adaptation to Opti-
mize the Vertical Eye Opening in a Wireline Receiver,” in IEEE Custom Integrated
[20] L. A. Barroso and U. Hölzle, “The Case for Energy-Proportional Computing,” IEEE
[24] M. Hossain and A. Chan Carusone, “CMOS Oscillators for Clock Distribution and
Injection-Locked Deskew,” IEEE J. Solid-State Circuits, vol. 44, no. 8, pp. 2138–
Frequency Dividers based on Ring Oscillators with Optimum Injection for Wide
[27] A. Hajimiri and T. Lee, “A General Theory of Phase Noise in Electrical Oscillators,”
at 10 Gb/s,” IEEE J. Solid-State Circuits, vol. 43, no. 12, pp. 2939–2957, 2008.
brated AFE in 65nm CMOS for 10Gb/s Serial Links over Backplane and Multimode
Transimpedance Amplifier,” IEEE J. Solid-State Circuits, vol. 29, no. 6, pp. 701–
706, 1994.
[32] E. Sackinger, Broadband Circuits for Optical Fiber Communication. Hoboken, NJ,
[33] C. Liao and S. Liu, “40 Gb/s Transimpedance-AGC Amplifier and CDR Circuit for
[34] S. Ibrahim and B. Razavi, “Low-Power CMOS Equalizer Design for 20-Gb/s Sys-
tems,” IEEE J. Solid-State Circuits, vol. 46, no. 6, pp. 1321–1336, June 2011.
Interface in 45 nm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 12, pp. 2828–
2837, 2010.
19Gb/s Serial Link Receiver with Both 4-Tap FFE and 5-Tap DFE Functions in
pp. 134–136.
[37] G. Zhang and M. Green, “A 10 Gb/s BiCMOS Adaptive Cable Equalizer,” IEEE J.
Solid-State Circuits, vol. 40, no. 11, pp. 2132–2140, Nov. 2005.
[38] A. Baker, “An Adaptive Cable Equalizer for Serial Digital Video Rates to 400 Mb/s,”
J. Zerbe, and C. Yang, “Near-Optimal Equalizer and Timing Adaptation for I/O
Links Using a BER-Based Metric,” IEEE J. Solid-State Circuits, vol. 43, no. 9, pp.
2144–2156, 2008.
[40] K. Cheng and H. Chang, “Test Strategies for Adaptive Equalizers,” IEEE Custom
[41] F. Musa and A. Chan Carusone, “Modeling and Design of Multilevel Bang-Bang
CDRs in the Presence of ISI and Noise,” IEEE Trans. Circuits Syst. I, vol. 54,
Monitor Feedback,” IEEE J. Solid-State Circuits, vol. 43, no. 12, pp. 2929–2938,
2008.
[43] H. Cheng and A. Chan Carusone, “A 32/16 Gb/s 4/2-PAM Transmitter with PWM
Pre-Emphasis and 1.2 Vpp per side Output Swing in 0.13-µm CMOS,” in IEEE
Amplifier,” IEEE Microwave and Wireless Components Letters, vol. 16, no. 12, pp.
[45] I.-T. Lee and S.-I. Liu, “G-Band Injection-Locked Frequency Dividers Using pi-type
LC Networks,” IEEE Trans. Circuits Syst. I, vol. 59, no. 2, pp. 315–323, 2012.
OOK Transmitter and 8.4uW Power-Gated Receiver Front-End for Wireless Ad Hoc
Network in 40nm CMOS,” in IEEE Symposium on VLSI Circuits, 2011, pp. 278–279.
ceiver*,” Proceedings of the IRE, vol. 32, no. 12, pp. 730–737, 1944.
[48] B. Helal, C. Hsu, K. Johnson, and M. Perrott, “A Low Jitter Programmable Clock
ing Loop,” IEEE J. Solid-State Circuits, vol. 44, no. 5, pp. 1391–1400, May 2009.
[49] L. Paciorek, “Injection Locking of Oscillators,” Proc. IEEE, vol. 53, no. 11, pp.
[52] D. Leeson, “A simple model of feedback oscillator noise spectrum,” Proc. IEEE,
[53] X. Lai and J. Roychowdhury, “Capturing Oscillator Injection Locking via Nonlin-
Impulse-Response,” IEEE Trans. Circuits Syst. I, vol. 55, no. 5, pp. 1297–1305,
June 2008.
[55] R. Harjani and N. Lanka, “High Speed Frequency Hopping Using Injection Locked
[56] J. Lee and M. Liu, “A 20-Gb/s Burst-Mode Clock and Data Recovery Circuit Using
Injection-Locking Technique,” IEEE J. Solid-State Circuits, vol. 43, no. 3, pp. 619–
pp. 667–670.
[58] J. Chien and L. Lu, “Analysis and Design of Wideband Injection-Locked Ring Oscil-
lators With Multiple-Input Injection,” IEEE J. Solid-State Circuits, vol. 42, no. 9,
[59] J. Luo, N. K. Jha, and L.-s. Peh, “Simultaneous Dynamic Voltage Scaling of Pro-
IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 15, no. 4, pp.
427–437, 2007.
[61] F. Lin, R. A. Royer, B. Johnson, and B. Keeth, “A Wide-Range Mixed-Mode DLL for
a Combination 512 Mb 2.0 Gb/s/pin GDDR3 and 2.5 Gb/s/pin GDDR4 SDRAM,”
IEEE J. Solid-State Circuits, vol. 43, no. 3, pp. 631–641, Mar. 2008.
“A Multiphase PLL for 10 Gb/s Links in SOI CMOS Technology,” in IEEE Radio