Lec2 Timing Up


EE7605 Signal Integrity in High-Speed Digital Systems

Lecture 2: Timing and Clocking Issues

Note: some of the figures in this slide set are adapted from the slide set
of “Digital Integrated Circuits” by Rabaey et al., Copyright 2002
1 EE7605 Lecture 2
Outline
• Motivations and definitions
• Hold and setup time constraints
• Clock non-idealities
– Clock skew problem (positive and negative skew)
– Worst-case delay and contamination delay evaluation
– Clock jitter
• Clock distribution networks:
– H-tree network, grid, etc.
• Some industrial examples:
– Strategies of clock distribution in complex digital systems and microprocessors (µP)
• Conclusion and discussion (how to counter skew and jitter)

System Timing
• Clocking is very important to ensure that improper values are never stored.
• Flip-flop-based pipeline system:
[Figure: register A, combinational logic, and register B sharing one clock; Tq is marked at register A, Td at the combinational logic, and Ts at register B.]
Tc = Tq + Td + Ts (minimum clock period)
• Primary inputs change after the clock (φ) edge.
• Primary inputs must stabilize before the next clock edge.
• These rules allow changes to propagate through the combinational logic for the next cycle.
• Flip-flop outputs hold current-state values for the next-state computation.

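As a rough illustration of the relation Tc = Tq + Td + Ts, the following Python sketch computes the minimum clock period and the matching maximum clock frequency. The function name and the delay values are assumptions chosen for illustration; they do not come from the lecture.

```python
def min_clock_period(t_q, t_d, t_s):
    """Minimum clock period Tc = Tq + Td + Ts (all values in the same unit)."""
    return t_q + t_d + t_s


# Hypothetical delays in nanoseconds (illustrative values only).
t_c = min_clock_period(t_q=0.2, t_d=1.5, t_s=0.1)
f_max_mhz = 1e3 / t_c  # a period in ns corresponds to 1000 / period in MHz
print(f"Tc = {t_c:.2f} ns, f_max = {f_max_mhz:.1f} MHz")
```

Any clock period shorter than this value leaves too little time for the data to leave the first register, cross the logic, and set up at the second register.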
Chip to Chip Timing

Timing Definition-Latch Parameters

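[Figure: latch symbol (D, Q, Clk) and timing diagram marking the clock period T, the minimum clock pulse width PWm, the setup time tsu, the hold time thold, and the delays tc-q and td-q.]

Delays can be different for rising and falling data transitions.
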
Register Parameters
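[Figure: register symbol (D, Q, Clk) and timing diagram around the positive clock edge marking the clock period T, the setup time tsu, the hold time thold, and the delay tc-q.]

Delays can be different for rising and falling data transitions.
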
Clock period
• For each clock cycle, the period must be longer than the sum of:
– the combinational delay;
– the memory element propagation delay.
• The period depends on the longest path.
• Unbalanced delays
– Logic with unbalanced delays leads to inefficient use of logic.

[Figure: two logic stages with unbalanced delays, one compatible with a short clock period and the other requiring a long clock period.]

Latch-based design

[Figure: latch A, combinational logic A (Tda), latch B, combinational logic B (Tdb), and latch C in series on one clock, with Tq and Ts marked.]

• Latch-based machines must use multiple ranks of latches.
• Multiple ranks require multiple phases of clock.

DFF Implementation (falling edge triggered)
Master/Slave latch arrangement

[Figure: master D latch (input D, gate C) feeding a slave D latch (input Ds, gate Cs) that drives the outputs Q and Q’.]

DFF Internal Operation
[Figure: waveforms of D, C, Ds, Cs, and Q, alternating between intervals labelled "master sampling" and "transfer to slave".]

Edge-triggered Flip Flop using Latches

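[Figure: master and slave multiplexer-based latches in series, both controlled by CLK; D enters the master, QM connects the master to the slave, and Q is the output.]

Two opposite latches trigger on the edge (positive edge).

Also called a master-slave latch pair.
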
Master-Slave Register

Multiplexer-based latch pair (positive edge)

[Figure: schematic with inverters I1 to I6 and transmission gates T1 to T4; input D, internal node QM, output Q, clocked by CLK.]

Flip-Flop: Timing Definitions
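[Figure: clock φ, input In, and output Out versus time t; In must be stable (DATA STABLE) from tsetup before the clock edge until thold after it, and Out becomes stable tpFF after the edge.]
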

Setup time: time before clock during which data input must be stable.
Hold time: time after clock event for which data input must remain
stable.
Clock-to-Q delay = TPFF

Clk-Q Delay

[Figure: simulated CLK, D, and Q waveforms (volts versus time in nsec) marking the clock-to-Q delays tc-q(lh) and tc-q(hl).]

Clock Race

• In a synchronous system, if the data input to a register does not obey the setup and hold-time constraints, clock race problems may occur.
• A clock race results in erroneous data being stored in registers.
• Assuming a perfectly synchronous system with perfect clocks, zero hold-time registers, and a clock-to-Q time greater than the setup time, no clock race problem should occur.
• However, at the chip level this might be hard to ensure.

The setup time race

• Setup time represents the race for new data to propagate around the feedback loop before the clock closes the input gate.

• If data arrives too close to the clock edge, it will not set up the feedback loop before the clock closes the input transmission gate (TG).

The hold time race

• Hold time represents the race for the clock to close the input gate before the next cycle’s data disturbs the stored value.

• If data changes too soon after the clock edge, the clock might not have had time to switch off the input gate, and the new data will corrupt the feedback loop.

Setup Time

[Figure: simulated waveforms (volts versus time in nsec) of D, CLK, QM, Q, and I2-T2 for (a) Tsetup = 0.21 nsec and (b) Tsetup = 0.20 nsec.]

More Precise Setup Time

[Figure: (a) Clk, D, and Q waveforms versus time t; (b) clock-to-Q delay versus data-to-clock time tD-C, marking tSu, tH, tC-Q, and 1.05 tC-Q.]

Setup/Hold Time Illustrations

Circuit before clock arrival (Setup-1 case)

[Figure, shown over four animation frames: latch stage with transmission gate TG1, inverters Inv1 and Inv2, clock phases CN and CP, and nodes D, D1, SM, and QM, beside a plot of the Clk-Q delay TClk-Q against the data-to-clock time TSetup-1, with the clock edge at t=0.]

Setup/Hold Time Illustrations

Hold-1 case

[Figure, shown over five animation frames: latch stage with transmission gate TG1, inverters Inv1 and Inv2, clock phases CN and CP, and nodes D, D1, SM, and QM, beside a plot of the Clk-Q delay TClk-Q against the clock-to-data time THold-1, with the clock edge at t=0.]

Hold time violation
[Figure: register M1 (clock delay Tc1), logic, and register M2 (clock delay Tc2); the waveforms show the data at M2 (Td2) changing from old data to new data before the delayed clock edge Tc2, which is a hold time violation.]

Register M2, clocked at Tc2, samples the new data when it is supposed to sample the old data. This happens when Tc2 lags behind the data Td2, which is more likely with a long delay on clk and short delays through the registers and logic. The worst case corresponds to the minimum delay of the logic.

Hold time condition

• We need to make sure that data are properly held and avoid a race between data and clock.

Hold time constraint:

tc-q + tlogic,min > thold

• The minimum delay is also called the contamination delay.
• tc-q + tlogic,min must be higher than a threshold defined by the hold time of the FF.

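The hold time constraint tc-q + tlogic,min > thold can be checked with a few lines of Python. This is a minimal sketch: the function names and the picosecond values are assumptions made for illustration, not part of the lecture.

```python
def hold_margin(t_cq, t_logic_min, t_hold):
    """Hold margin: positive when tc-q + tlogic,min > thold is satisfied."""
    return t_cq + t_logic_min - t_hold


def meets_hold(t_cq, t_logic_min, t_hold):
    """True when the hold time constraint is met."""
    return hold_margin(t_cq, t_logic_min, t_hold) > 0


# Hypothetical values in picoseconds (illustrative only).
print(hold_margin(t_cq=100, t_logic_min=50, t_hold=200))   # negative: violation
print(meets_hold(t_cq=100, t_logic_min=150, t_hold=200))   # margin of 50 ps
```

A negative margin cannot be repaired by slowing the clock, because the clock period does not appear in the constraint; the usual remedy is to add delay to the short path.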
How fast can we run
[Figure: register M1 (clock delay Tc1), logic, and register M2 (clock delay Tc2); the clock waveforms compare a case where Tq1 + Tlmax + Tsetup2 fits in the period with margin to a case where it does not, which is a setup time violation.]

Setup time requirement (minimum cycle time):

T = tc-q + tsu + tlogic

Hold and setup time violations
• The earliest that data appears at the input of register M2 is
at time Tc1+Tq1, assuming zero delay in the logic block.
• The clock appears at the register M2 at time Tc2.
• Assuming zero setup and hold times, if Tc2 lags the data change (Tc2 > (Tc1 + Tq1)), the module M2 will store the data from the current cycle rather than the previous cycle. This is a hold-time violation and may be caused in practice by Tc1 and Tq1 being close to zero while a delay is introduced into the Tc2 clock line.
• If the delay (Tc1 + Tq1) - Tc2 is larger than the cycle time Tc, then the data will arrive late at M2. This causes a setup-time violation, which occurs when the circuit is too slow for the clock cycle used. While Tc2 may be artificially increased to allow more time for the data to set up, the constraint Tc2 < (Tc1 + Tq1) then becomes harder to meet, and data delays may have to be artificially added to meet it.

Clock Non-idealities
• Clock skew
– Spatial variation in temporally equivalent clock
edges; deterministic + random, tSK
• Clock jitter
– Temporal variations in consecutive edges of the
clock signal; modulation + random noise
– Cycle-to-cycle (short-term) tJS
– Long term tJL
• Variation of the pulse width
– Important for level sensitive clocking

Clock Skew and Jitter

[Figure: two clock waveforms, one marking the skew tSK between temporally equivalent edges and one marking the jitter tJS on an edge.]

• Both skew and jitter affect the effective cycle time.
• SKEW: no clock period variation, only a phase shift.
• JITTER: zero-mean random variable; absolute jitter refers to the worst-case variation of a clock edge at a given location with respect to an ideally periodic reference clock edge.
• Only skew affects the race margin (we will see why).

Clock Uncertainties

Sources of clock uncertainty:

1 Clock Generation
2 Devices
3 Interconnect
4 Power Supply
5 Temperature
6 Capacitive Load
7 Coupling to Adjacent Lines

Sources of skew and Jitter
• Systematic errors are nominally identical from chip to chip and
are predictable while random errors are due to manufacturing
variations that are difficult to model.
• Clock-signal generation: achieved by generating a high
frequency signal from a low frequency one (VCO): sensitive to
device noise, power supply variations, substrate coupling.
• Manufacturing Device variations: matching of devices in the
buffers along multiple clock paths is critical.
• Interconnect variations: vertical and lateral dimension variations cause the interconnect capacitance and resistance to vary. Source of the problem: inter-layer dielectric (ILD) thickness variations.
• Environmental variations: temperature and power supply. Temperature gradients across the chip can be large as a consequence of clock gating. Device parameters (Vth and µ) depend on temperature, so the clock delay can vary from path to path. Does temperature contribute to skew or jitter?
• Capacitive coupling: any coupling between a clock wire and an adjacent signal results in timing uncertainty.

The Clock Skew Problem
Clock rates as high as 2 GHz in CMOS! (T = 0.5 ns)

[Figure: pipeline In, CL1, R1, CL2, R2, CL3, R3, Out with clock φ arriving at the registers at times tφ', tφ'', and tφ'''; the delays tl,min, tl,max, tr,min, and tr,max are marked, and waveforms clk1 and clk2 are shown.]

Clock edge timing depends upon position.

Positive skew: data and clock routed in the same direction.

Positive Skew
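[Figure: registers R1 and R2 separated by combinational logic, clocked at tCLK1 and tCLK2; CLK2 is delayed by δ relative to CLK1, with edges 1 and 3 on CLK1, edges 2 and 4 on CLK2, and the intervals TCLK, TCLK + δ, and δ + th marked. Register parameters: tc-q, tc-q,cd, tsu, thold; logic parameters: tlogic, tlogic,cd.]

Launching edge arrives before the receiving edge.
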
Positive Skew
[Figure: CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4), with CLK2 delayed by δ; the intervals TCLK, TCLK + δ, and δ + th are marked.]

• The output of the combinational circuit must be valid one setup time before the rising edge of CLK2 (point 4):
T + δ ≥ tc-q + tsu + tlogic,max, or T ≥ tc-q + tsu + tlogic,max − δ
• This equation suggests that clock skew has the potential to improve the performance of the circuit. This is indeed true, but increasing skew makes the circuit susceptible to race conditions.
• A problem arises if the new value at the output of R1 propagates through the logic and becomes valid at the input of R2 before R2 has safely captured the old value at edge 2, that is, earlier than δ + thold after edge 1.
• To avoid this we have to ensure that:
δ + thold < tc-q + tlogic,min, or δ < tc-q + tlogic,min − thold

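The two positive-skew relations above can be tried out numerically. The sketch below is only an illustration: the function names and the picosecond values are assumptions, and the same tc-q is used in both relations, as on the slide.

```python
def min_period(t_cq, t_su, t_logic_max, skew):
    """Smallest period allowed by T >= tc-q + tsu + tlogic,max - skew."""
    return t_cq + t_su + t_logic_max - skew


def max_skew(t_cq, t_logic_min, t_hold):
    """Upper bound on skew from skew < tc-q + tlogic,min - thold."""
    return t_cq + t_logic_min - t_hold


# Hypothetical values in picoseconds (illustrative only).
params = dict(t_cq=100, t_su=50, t_logic_max=800)
for skew in (0, 100):
    print(f"skew = {skew} ps -> minimum period = {min_period(skew=skew, **params)} ps")

bound = max_skew(t_cq=100, t_logic_min=150, t_hold=60)
print(f"race-free only while skew < {bound} ps")
```

With these numbers, 100 ps of positive skew shortens the minimum period by 100 ps, but the skew must stay below the hold-derived bound regardless of the clock period.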
Negative Skew
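[Figure: pipeline In, CL1, R1, CL2, R2, CL3, R3, Out with clock φ routed against the data flow, arriving at the registers at times tφ', tφ'', and tφ'''; waveforms clk1 and clk2 are shown.]

Clock edge timing depends upon position.

Negative skew: data and clock routed in opposite directions.
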
Negative Skew

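[Figure: registers R1 and R2 separated by combinational logic, clocked at tCLK1 and tCLK2; CLK2 leads CLK1 by δ, with edges 1 and 3 on CLK1, edges 2 and 4 on CLK2, and the intervals TCLK and TCLK + δ marked. Register parameters: tc-q, tc-q,cd, tsu, thold; logic parameters: tlogic, tlogic,cd.]

Receiving edge arrives before the launching edge.
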
Negative Skew

[Figure: CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4), with CLK2 leading CLK1 by δ; the intervals TCLK and TCLK + δ are marked.]

• Negative skew degrades performance, as the effective period (from edge 1 to edge 4) is made shorter by |δ|:
T − |δ| ≥ tc-q + tsu + tlogic,max, or T ≥ tc-q + tsu + tlogic,max + |δ|
• However, with negative skew the system never fails because of a race, since edge 2 happens before edge 1.

Positive and Negative Skew
(a) Positive skew: the clock is routed in the same direction as the data flow.

[Figure: data path through CL, R, CL, R, CL, R with clock φ routed along the data flow.]

• Skew has to be strictly controlled and must stay below its maximum allowed value; otherwise the circuit will malfunction. Reducing the clock frequency does not help.

(b) Negative skew: the clock is routed in the opposite direction to the data flow.

[Figure: data path through CL, R, CL, R, CL, R with clock φ routed against the data flow.]

• When the skew is negative, the race condition never happens. The circuit operates correctly independent of the skew.
• However, negative skew reduces throughput: it reduces the time available for the actual computation, so the clock period has to be increased by |δ|.

How to counter Clock Skew?
• Routing the clock in the opposite direction to the data can relieve the race problem caused by clock skew, but it hampers performance. Also, the data flow of a circuit is sometimes not unidirectional.

[Figure: registers (REG) and logic between In and Out with data flowing in both directions and a single clock distribution φ, so that some register pairs see negative skew and others positive skew.]

• The best solution is to ensure that the clock skew between communicating registers is bounded.

Example of Clock skew

[Figure: a register (REG) driving logic gates and a MUX that feed a second register (REG), both clocked by φ.]

tg = gate delay, tm = mux delay, ts = setup time, tq = register clock-to-q delay, T = clock period

We need to evaluate tlogic,min and tlogic,max.

Assuming input signals arrive early enough, the maximum bound on the skew is

tq + tg + tm − thold > δ

The equilibrium requirement at the time of latching imposes another constraint on the skew:

tq + 5tg + tm + ts < T + δ

Combining these constraints, we have

tq + tg + tm − thold > δ > tq + 5tg + tm + ts − T

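The combined constraint tq + tg + tm − thold > δ > tq + 5tg + tm + ts − T defines a window of allowed skew. The following Python sketch evaluates that window; the function name, the parameter names for the gate counts, and all numeric values are assumptions for illustration only.

```python
def skew_window(t_q, t_g, t_m, t_s, t_hold, period, short_gates=1, long_gates=5):
    """Return (lower, upper) such that lower < skew < upper.

    upper comes from the shortest path (1 gate + mux) and the hold time,
    lower from the longest path (5 gates + mux), the setup time, and T.
    """
    upper = t_q + short_gates * t_g + t_m - t_hold
    lower = t_q + long_gates * t_g + t_m + t_s - period
    return lower, upper


# Hypothetical values in picoseconds (illustrative only).
lower, upper = skew_window(t_q=100, t_g=80, t_m=60, t_s=50, t_hold=40, period=700)
print(f"{lower} ps < skew < {upper} ps")
print("feasible" if lower < upper else "no skew value satisfies both constraints")
```

Shortening the period raises the lower bound while the upper bound stays fixed, so the window shrinks until no skew value satisfies both constraints.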
Example – Propagation and contamination delay evaluation
• Propagation and contamination delays are not always easy to evaluate because of false paths.

[Figure: gate network with inputs A, B, C, and D, elements labelled In1, OR1, OR2, AND1, AND2, and AND3, and output Out feeding a register (REG); PATH1 and PATH2 are marked.]

• The contamination delay is 2 tgate (through OR1 and OR2).
• It would appear that the worst case is path 1, 5 tgate, but this is a false path (the output does not even depend on C and D):
– If A=1, the critical path (CP) is through OR1 and OR2.
– If A=0, B=0, the CP is through I1, OR1, and OR2.
– If A=0, B=1, the CP is through I1, OR1, AND3, and OR2, which is 4 tgate.
• Because of false paths, the worst-case delay cannot be obtained just by adding propagation delays.

Static Timing Analysis

• 0->1 and 1->0 delays are generally different.
• The simplest delay problem to analyze is to change the value at only one input and determine how long it takes for the effect to propagate to a single output (provided there is a path from the selected input to the output).
• A logic simulator can be used, but all possible transition values have to be simulated.
• Static timing analysis is value-independent. It builds a graph that models delays through the network and identifies the longest (shortest) delay path.

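The graph-based idea behind static timing analysis can be sketched in Python as a longest-path and shortest-path computation over a directed acyclic graph. This is a simplified illustration, not the algorithm of any particular tool: the function name, the edge-delay representation, and the example network are all assumptions, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def arrival_times(edges):
    """Value-independent earliest and latest arrival times.

    edges maps (source, destination) to a delay. Nodes without
    predecessors are treated as primary inputs arriving at time 0.
    """
    preds = {}
    for (src, dst), delay in edges.items():
        preds.setdefault(src, [])
        preds.setdefault(dst, []).append((src, delay))
    graph = {node: [src for src, _ in fanin] for node, fanin in preds.items()}
    earliest, latest = {}, {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(graph).static_order():
        fanin = preds[node]
        if not fanin:
            earliest[node] = latest[node] = 0
        else:
            earliest[node] = min(earliest[s] + d for s, d in fanin)
            latest[node] = max(latest[s] + d for s, d in fanin)
    return earliest, latest


# Hypothetical network with unit gate delays (illustrative only).
edges = {("a", "n1"): 1, ("b", "n1"): 1, ("n1", "n2"): 1,
         ("n2", "out"): 1, ("b", "out"): 1}
earliest, latest = arrival_times(edges)
print("shortest delay to out:", earliest["out"])
print("longest delay to out:", latest["out"])
```

Because the traversal ignores logic values, the longest path it reports may be a false path; the result is a safe upper bound rather than the exact worst-case delay.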
Critical Path
• The longest delay path is known as the critical path, since that path limits the system performance.
• The critical path not only tells us the system cycle time, it also points out which part of the combinational logic must be changed to improve system performance.
• Speed up gates on the critical path by increasing transistor sizes, reducing wiring capacitance, or redesigning the logic along the critical path to use a faster gate configuration.
• Speeding up the system may require modifying several sections of logic, since the critical path can have multiple branches. Identify the critical path and the cutset of the graph that represents it, then determine the edge (gate) to speed up.

False Path

• False paths are critical paths that can never be exercised during normal circuit operation. In this case the actual critical path is shorter than what would be predicted from the first-order analysis.
• Detecting false paths is not easy, since it requires an understanding of the logic functionality of the network.
• Determining whether a path is false is an NP-complete problem; however, CAD tools and algorithms are now available to find false paths in practical networks.

Example of False Path

[Figure: example network with signals a, b, c, d, and e and outputs y and z.]

V a -> V c -> V d -> V e -> V z is a false path

Impact of Jitter

[Figure: a clock CLK of period TCLK whose edges can move between −tjitter and +tjitter, driving registers (REGS) around combinational logic. Register parameters: tc-q, tc-q,cd, tsu, thold, tjitter; logic parameters: tlogic, tlogic,cd.]

Temporal variation in the clock edge.

Longest Logic Path in Edge-Triggered Systems

[Figure: setup time condition. From the latest point of launching, the data takes TClk-Q and TLM and must arrive TSU before the earliest arrival of the next cycle's clock edge; TJI + δ marks the clock uncertainty within the period T.]

If the launching edge is late and the receiving edge is early, the data will not be too late if:

Tc-q + TLM + TSU < T – TJI,1 – TJI,2 – δ

The minimum cycle time is determined by the maximum delays through the logic:

Tc-q + TLM + TSU + δ + 2TJI < T

Skew can be either positive or negative.

Clock Constraints in Edge-Triggered Systems – Shortest Path

[Figure: hold time condition. From the earliest point of launching, the data takes TClk-Q and TLm; it must not arrive earlier than TH after the receiving clock edge, shown relative to the nominal clock edge.]

If the launching edge is early and the receiving edge is late:

Tc-q + TLm – TJI,1 > TH + TJI,2 + δ

Minimum logic delay:

Tc-q + TLm > TH + 2TJI + δ

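The longest-path and shortest-path constraints with skew and jitter can be evaluated together. The sketch below follows the sign convention of these two slides, where δ counts against both constraints; the function names and all picosecond values are assumptions for illustration.

```python
def min_cycle_time(t_cq, t_lm_max, t_su, skew, t_jitter):
    """Bound on T from Tc-q + TLM + TSU + skew + 2*TJI < T."""
    return t_cq + t_lm_max + t_su + skew + 2 * t_jitter


def hold_slack(t_cq, t_lm_min, t_hold, skew, t_jitter):
    """Slack of Tc-q + TLm > TH + 2*TJI + skew (positive means satisfied)."""
    return t_cq + t_lm_min - (t_hold + 2 * t_jitter + skew)


# Hypothetical values in picoseconds (illustrative only).
print(min_cycle_time(t_cq=100, t_lm_max=800, t_su=50, skew=30, t_jitter=20))
print(hold_slack(t_cq=100, t_lm_min=60, t_hold=40, skew=30, t_jitter=20))
```

Jitter enters both expressions twice because the launching and the receiving edge can each move by TJI in the unfavourable direction.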
Clock-distribution network design parameters

• Interconnect material used for the clock network
• Shape of the clock-distribution network
• Clock driver and the buffer scheme used
• Load on the clock lines (i.e., the clock fan-out)
• Rise and fall time of the clock

Clock Distribution to bound skew

[Figure: H-tree network distributing CLOCK to the leaf nodes.]

H-tree network: very attractive for regular structures.

Observe: only relative skew is important.

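One way to see why an H-tree bounds relative skew is to note that, in an idealized tree, every leaf sits at the same wire length from the root. The Python sketch below builds such a tree recursively and reports the path lengths; the function name, the geometry (segment length halved at each level), and the assumption of identical ideal wires are illustrative choices, not details from the lecture.

```python
def h_tree_leaves(levels, x=0.0, y=0.0, half=1.0, length=0.0):
    """Return (x, y, wire_length) for every leaf of an idealized H-tree.

    Each level routes half a bar horizontally and half a bar vertically
    from the current centre to the four corners of an "H".
    """
    if levels == 0:
        return [(x, y, length)]
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            leaves += h_tree_leaves(levels - 1, x + dx, y + dy,
                                    half / 2, length + 2 * half)
    return leaves


leaves = h_tree_leaves(levels=2)
lengths = {wire_length for _, _, wire_length in leaves}
print(len(leaves), "leaves, distinct root-to-leaf wire lengths:", lengths)
```

In a real layout, buffer mismatch, load differences, and process variation still introduce skew, so equal wire length gives only a nominally balanced starting point.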
More realistic H-tree

[Restle98]

Clock Network with Distributed Buffering
[Figure: a main clock driver distributing CLOCK to secondary clock drivers, each serving the modules in its local area.]

Equalizing the local clock delay through careful routing of the clock signals, combined with a hierarchical clock-buffering scheme.

Reduces absolute delay, and makes power-down easier.

The Grid System
[Figure: clock grid driven by GCLK drivers on all four sides.]

• No RC matching
• Large power

Example: DEC Alpha Evolution

• Clock driver placement

21064 21164 21264

• Gronowski, JSSC 5/98

Example: DEC Alpha 21164

Clock Drivers

Example: DEC Alpha 21164
Uses a clock grid instead of a clock tree
Clock frequency: 300 MHz, 9.3 million transistors
Total clock load: 3.75 nF
Power in clock distribution network: 20 W (out of 50 W)

Uses two-level clock distribution:
• Single 6-stage driver at the center of the chip
• Secondary buffers drive the left and right side clock grid in Metal3 and Metal4

Total driver size: 58 cm!

Example: DEC Alpha 21264
600 MHz, 0.35 micron CMOS

[Figure: global clock waveform with tcycle = 1.67 ns, trise = 0.35 ns, and tskew = 50 ps; clock distribution fed from the PLL.]

• 2-phase, with multiple conditional buffered clocks
– 2.8 nF clock load
– 40 cm final driver width
• Local clocks can be gated “off” to save power
• Reduced load/skew
• Reduced thermal issues
• Multiple clocks complicate race checking

Example: DEC Alpha 21264
[Figure: maps of GCLK skew (at Vdd/2 crossings), with a scale from 5 to 50 ps, and GCLK rise times (20% to 80%, extrapolated to 0% to 100%), with a scale from 300 to 345 ps.]

Hybrid Grid

• DEC Alpha 21264, Bailey JSSC 11/98

Example : Intel IA-64 Itanium

• Use of deskew buffers
• 3-level hierarchy
– Global distribution
• On-die phase-locked loop
• Deskew buffer (DSK)
– Regional distribution
• From deskew buffers to 30 clock regions (regional clock drivers, RCD, driving the regional clock grid)
– Local distribution
• Local clock buffer (LCB)
• Opportunity-time-borrowing (OTB) delay clock generation

Intel IA-64 Itanium clock distribution topology

Global Clock Distribution
• Distribute two clocks
– Core clock and reference clock
– Using two identical and balanced H-trees on the top two metal layers
• To reduce capacitive noise coupling and to ensure a good inductive return path, the H-tree is fully shielded laterally with Vcc/Vss.

Regional clock distribution

• Distributed array of deskew buffers (DSK) to reduce the skew caused by within-die process variations
• Regional clock grid driven by modular Regional Clock Drivers
– 30 clock regions
– M4 for x-direction, M5 for y-direction
– Full support for scan and clock gating

Local Clock distribution

• Local clock buffer
• Delay clocks that are needed for the opportunity-time-borrowing (OTB) delay clock generation, i.e., intentional skew buffers

Take-away message – Dealing with skew and jitter
• Balance clock paths from a central distribution source to individual clocking elements: the effective load of each path, which includes wiring and transistors, must be equalized.
• The use of a local grid can reduce skew, but at increased power.
• Be very careful with gated clocks, as gating makes the clock data dependent, which would increase jitter.
• If data flows in one direction, route the clock in the opposite direction. This eliminates races at the cost of performance.
• Shielding clock wires from adjacent signals helps reduce noise.
• Variation in temperature across the die causes variations in clock buffer delay. Temperature compensation techniques are needed.
• Add on-chip decoupling capacitors to reduce high-frequency power supply fluctuations.
• Extensive simulation should be performed to check circuit operation at the corners (δT, δV, δW, δL, δC, δµn, δVth).
