Timing Issues in Digital Circuits
Figure (a)-(b) Illustration of a static hazard: gates G1, G2, G3 (and inverter I1 in one panel), inputs x1, x2, x3, internal signals y1, y2, output z(x1, x2, x3).
Figure (c) Illustration of a static hazard (cont'd): timing diagram of x1, x2, x3, y1, y2 over times t1 to t6, with delays Δt.
Figure (d) Illustration of a static hazard (cont'd): timing diagram of x1, x2, x3, y1, y2.
Figure: K-maps of z over x1, x2, x3, panels (a) and (b).
Figure: Hazard-free network.
Figure (a)-(b) Example of a static-0 hazard: gates G1 to G4, inputs A, B, C, D, output z(A, B, C, D), with its K-map.
Figure (c)-(d) Example of a static-0 hazard (cont'd): the network with gate G5 added, with its K-map.
Figure (a)-(b) Dynamic hazards.
◼ Timing in combinational circuits
❑ Propagation delay and contamination delay
❑ Glitches
◼ Circuit Verification
❑ How to make sure a circuit works correctly
❑ Functional verification
❑ Timing verification
Tradeoffs in Circuit Design
Circuit Design is a Tradeoff Between:
◼ Area
❑ Circuit area is proportional to the cost of the device
◼ Speed / Throughput
❑ We want faster, more capable circuits
◼ Power / Energy
❑ Mobile devices need to work with a limited power supply
❑ High-performance devices dissipate more than 100 W/cm²
◼ Design Time
❑ Designers are expensive in time and money
❑ The competition will not wait for you
Requirements and Goals Depend On Application
Circuit Timing
◼ Until now, we investigated logical functionality
Part 1:
Combinational Circuit Timing
Digital Logic Abstraction
◼ “Digital logic” is a convenient abstraction
❑ Output changes immediately with the input
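[Figure: idealized waveforms in which input A falls from 1 to 0 and output Y rises from 0 to 1 at the same instant.]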
Combinational Circuit Delay
◼ In reality, outputs are delayed from inputs
❑ Transistors take a finite amount of time to switch
[Figure: waveforms of A and Y versus time; Y changes only after a delay following the change on A.]
Real Inverter Delay Example
Image source: Sandoval-Ibarra, F., and E. S. Hernández-Bernal. "Ring CMOS NOT-based oscillators:
Analysis and design." Journal of applied research and technology, 2008.
Circuit Delay and Its Variation
◼ Delay is fundamentally caused by
❑ Capacitance and resistance in a circuit
❑ Finite speed of light (not so fast on a nanosecond scale!)
Delays from Input to Output Y
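◼ Contamination delay (tcd): minimum delay from an input change until Y starts changing
◼ Propagation delay (tpd): maximum delay from an input change until Y finishes changing
[Figure: timing diagram of an input and Y with tcd and tpd marked; cross-hatching means the value is changing.]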
Calculating Longest & Shortest Delay Paths
◼ We care about both the longest and shortest delay
paths in a circuit
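[Figure: circuit with inputs A, B, C, D, internal nodes n1 and n2, and output Y. The critical path passes through n1 and n2 (two AND gates and one OR gate); the short path passes through a single AND gate.]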
◼ Critical (Longest) Path: tpd = 2 tpd_AND + tpd_OR
◼ Shortest Path: tcd = tcd_AND
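The two path delays above can be found mechanically by walking the gate network from the output back to the inputs. The following is a minimal sketch, assuming the network is Y = (A·B + C)·D (one circuit consistent with the node names and path expressions on this slide) and using made-up per-gate delays; `NETLIST`, `GATE_DELAY`, and `path_delays` are hypothetical names.

```python
# Hypothetical per-gate delays in ps: (contamination, propagation).
GATE_DELAY = {"AND": (60, 100), "OR": (80, 120)}

# Assumed netlist: node -> (gate type, input nodes).
# Names that do not appear as keys are primary inputs.
NETLIST = {
    "n1": ("AND", ["A", "B"]),
    "n2": ("OR", ["n1", "C"]),
    "Y": ("AND", ["n2", "D"]),
}


def path_delays(node):
    """Return (tcd, tpd) from the primary inputs to `node`."""
    if node not in NETLIST:  # primary input: no delay accumulated yet
        return 0, 0
    gate, inputs = NETLIST[node]
    gate_tcd, gate_tpd = GATE_DELAY[gate]
    arrivals = [path_delays(i) for i in inputs]
    tcd = gate_tcd + min(a[0] for a in arrivals)  # shortest path
    tpd = gate_tpd + max(a[1] for a in arrivals)  # longest (critical) path
    return tcd, tpd


tcd, tpd = path_delays("Y")
print(f"tcd = {tcd} ps, tpd = {tpd} ps")  # tcd = 60 ps, tpd = 320 ps
```

With these placeholder numbers the result matches tpd = 2 tpd_AND + tpd_OR and tcd = tcd_AND. The walk is purely topological: it does not check whether a given input transition can actually propagate along the path.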
Calculating Longest Delay Path (Critical Path)
Disclaimer: Calculating Long/Short Paths
◼ It’s not always this easy to determine the long/short paths!
❑ Not all input transitions affect the output
❑ Can have multiple different paths from an input to output
Combinational Timing Summary
◼ Circuit outputs change some time after the inputs change
❑ Caused by capacitance and resistance in the circuit and by the finite speed of light (not so fast on a ns scale!)
❑ Delay is dependent on inputs, environmental state, etc.
Output Glitches
Glitches
◼ Glitch: one input transition causes multiple output transitions
[Figure: circuit with inputs A = 0, B falling 1 -> 0, and C = 1, internal nodes n1 and n2, and output Y. B reaches Y over a slow path (3 gates) and a fast path (2 gates), so Y glitches 1 -> 0 -> 1.]
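The glitch can be reproduced with a tiny unit-delay simulation. This is a sketch under two assumptions that are not stated on the slide: the function is Y = A'B' + BC (which gives a 3-gate path through the inverter and a 2-gate path without it), and every gate has the same delay of one time step. The name `simulate_b_falling` is hypothetical.

```python
def simulate_b_falling(steps=5):
    """Unit-delay simulation of Y = A'B' + BC with A = 0, C = 1, B: 1 -> 0."""
    a, c = 0, 1
    # Settled internal values while B = 1.
    not_b, n1, n2, y = 0, 0, 1, 1
    b = 0  # B falls at time 0 and stays low
    trace = [y]
    for _ in range(steps):
        # Each gate output at time t+1 depends on its inputs at time t;
        # the tuple assignment evaluates the right-hand side with old values.
        not_b, n1, n2, y = 1 - b, (1 - a) & not_b, b & c, n1 | n2
        trace.append(y)
    return trace


print(simulate_b_falling())  # [1, 1, 0, 1, 1, 1]
```

Y drops two steps after B falls (fast path) and recovers one step later, when the slow path through the inverter catches up.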
Optional: Avoiding Glitches Using K-Maps
◼ Glitches are visible in K-maps
❑ Recall: K-maps show the results of a change in a single input
❑ A glitch occurs when moving between prime implicants
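[Figure: K-map of Y with the prime implicants A'B' and BC circled. With A = 0 and C = 1, the transition B: 1 -> 0 moves from BC to A'B', and Y glitches 1 -> 0 -> 1.]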
Optional: Avoiding Glitches Using K-Maps
◼ We can fix the issue by adding in the consensus term
❑ Ensures no transition between different prime implicants
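[Figure: K-map of Y with the added consensus term A'C circled; it overlaps both A'B' and BC. With A = 0 and C = 1, Y stays at 1 (1 -> 1) while B falls 1 -> 0.]
❑ Consensus theorem (generic form): AB + A'C + BC = AB + A'C. The consensus (or resolvent) of the terms AB and A'C is BC: the conjunction of all the unique literals of the terms, excluding the literal that appears unnegated in one term and negated in the other.
❑ Here the terms A'B' and BC differ in B, so their consensus is A'C
❑ A'C has no dependence on B => No glitch!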
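The K-map rule can be checked by brute force: a single-input change is hazardous when the output is 1 on both sides but no single product term covers both cells. The sketch below assumes the example function is Y = A'B' + BC with consensus term A'C; `covers` and `static1_hazards` are hypothetical helper names.

```python
from itertools import product

VARS = "ABC"


def covers(term, point):
    """True if the product term (dict var -> required value) is 1 at point."""
    return all(point[v] == val for v, val in term.items())


def static1_hazards(terms):
    """List (variable, starting point) pairs with a potential static-1 hazard."""
    hazards = []
    for values in product((0, 1), repeat=len(VARS)):
        point = dict(zip(VARS, values))
        for v in VARS:
            if point[v] == 0:
                continue  # visit each adjacent pair of cells only once
            neighbor = {**point, v: 0}
            both_one = (any(covers(t, point) for t in terms)
                        and any(covers(t, neighbor) for t in terms))
            shared = any(covers(t, point) and covers(t, neighbor)
                         for t in terms)
            if both_one and not shared:
                hazards.append((v, point))
    return hazards


original = [{"A": 0, "B": 0}, {"B": 1, "C": 1}]  # A'B' + BC
fixed = original + [{"A": 0, "C": 1}]            # + consensus term A'C

print(static1_hazards(original))  # [('B', {'A': 0, 'B': 1, 'C': 1})]
print(static1_hazards(fixed))     # []
```

The only hazardous transition found is the one on the slide (B changing with A = 0, C = 1), and it disappears once the consensus term is added.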
Avoiding Glitches
◼ Q: Do we always care about glitches?
❑ Fixing glitches is undesirable
◼ More chip area
◼ More power consumption
◼ More design effort
❑ The circuit is eventually guaranteed to converge to the right
value regardless of glitchiness
Recall: D Flip-Flop
◼ Flip-flop samples D at the active clock edge
❑ It outputs the sampled value to Q
❑ It “stores” the sampled value until the next active clock edge
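[Figure: D flip-flop symbol (inputs CLK and D, output Q) and a timing diagram of CLK and D marking tsetup before and thold after the clock edge; together they span ta.]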
◼ Setup time (tsetup): time before the clock edge that data
must be stable (i.e. not changing)
◼ Hold time (thold): time after the clock edge that data must
be stable
◼ Aperture time (ta): time around clock edge that data
must be stable (ta = tsetup + thold)
Violating Input Timing: Metastability
◼ If D is changing when sampled, metastability can occur
❑ Flip-flop output is stuck somewhere between ‘1’ and ‘0’
❑ Output eventually settles non-deterministically
[Figure: example timing violations (NAND RS latch): after the CLK edge, Q lingers in metastability and then converges non-deterministically.]
Source: W. J. Dally, Lecture notes for EE108A, Lecture 13: Metastability and Synchronization Failure (When Good Flip-Flops go Bad), 11/9/2005.
Flip-Flop Output Timing
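[Figure: flip-flop output timing relative to the CLK edge: Q may start changing after tccq and has finished changing after tpcq.]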
Recall: Sequential System Design
Ensuring Correct Sequential Operation
◼ Need to ensure correct input timing on R2
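[Figure: CLK timing diagram marking tsetup before and thold after the clock edge at R2; together they span ta.]
◼ This means there is both a minimum and maximum
delay between two flip-flops
❑ CL too fast -> R2 thold violation
❑ CL too slow -> R2 tsetup violation
[Figure: waveforms of Q1 and D2 with the tHOLD and tSETUP windows of R2 marked as potential violations.]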
Setup Time Constraint
◼ Safe timing depends on the maximum delay from R1 to R2
◼ The input to R2 must be stable at least tsetup before the clock edge.
[Figure: R1 drives Q1 into combinational logic CL, which drives D2 into R2; both registers share CLK. Within one clock period Tc, the timing diagram shows tpcq, then tpd, then tsetup.]
◼ Tc > tpcq + tpd + tsetup
❑ Useful work: tpd
❑ Wasted work: tpcq + tsetup
◼ Sequencing overhead: amount of time wasted each cycle due to sequencing element timing requirements
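As a quick numeric illustration of the setup constraint, the sketch below computes the bound on Tc, the corresponding maximum frequency, and the share of the cycle lost to sequencing overhead. The delay values are made-up placeholders in picoseconds, and the function names are hypothetical.

```python
def min_clock_period_ps(t_pcq, t_pd, t_setup):
    """Lower bound on Tc from Tc > tpcq + tpd + tsetup (all in ps)."""
    return t_pcq + t_pd + t_setup


def sequencing_overhead_fraction(t_pcq, t_pd, t_setup):
    """Fraction of the minimum cycle spent on tpcq + tsetup, not on logic."""
    return (t_pcq + t_setup) / min_clock_period_ps(t_pcq, t_pd, t_setup)


# Placeholder values, not taken from any real flip-flop or gate library.
T_PCQ, T_PD, T_SETUP = 40, 300, 60

tc_min = min_clock_period_ps(T_PCQ, T_PD, T_SETUP)
fmax_ghz = 1000 / tc_min  # 1 / (tc_min ps) expressed in GHz
overhead = sequencing_overhead_fraction(T_PCQ, T_PD, T_SETUP)

print(f"Tc > {tc_min} ps, fmax = {fmax_ghz} GHz, overhead = {overhead:.0%}")
# Tc > 400 ps, fmax = 2.5 GHz, overhead = 25%
```

Shortening the logic raises fmax, but the overhead fraction grows because tpcq and tsetup stay fixed.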
tsetup Constraint and Design Performance
Hold Time Constraint
◼ Safe timing depends on the minimum delay from R1 to R2
◼ D2 (i.e., R2 input) must be stable for at least thold after the clock edge
❑ D2 must not change until thold after the clock edge
[Figure: R1 drives Q1 into combinational logic CL, which drives D2 into R2; both registers share CLK. After the clock edge, the timing diagram shows tccq, then tcd, compared against thold.]
◼ tccq + tcd > thold
◼ tcd > thold - tccq
❑ We need to have a minimum combinational delay!
❑ Does NOT depend on Tc!
❑ Very hard to fix thold violations after manufacturing: must modify circuits!
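To see why slowing the clock does not help, the sketch below sweeps Tc over a few values and evaluates both checks. The numbers are made-up placeholders in picoseconds and the function names are hypothetical.

```python
def setup_ok(tc, t_pcq, t_pd, t_setup):
    """Setup check: Tc > tpcq + tpd + tsetup."""
    return tc > t_pcq + t_pd + t_setup


def hold_ok(t_ccq, t_cd, t_hold):
    """Hold check: tccq + tcd > thold. Note that Tc is not an argument."""
    return t_ccq + t_cd > t_hold


# Placeholder values (ps).
T_PCQ, T_PD, T_SETUP = 50, 100, 50
T_CCQ, T_CD, T_HOLD = 20, 10, 40

results = [
    (tc, setup_ok(tc, T_PCQ, T_PD, T_SETUP), hold_ok(T_CCQ, T_CD, T_HOLD))
    for tc in (150, 300, 600)
]
for tc, setup, hold in results:
    print(f"Tc = {tc} ps: setup ok = {setup}, hold ok = {hold}")

# Minimum combinational contamination delay needed to meet hold.
print(f"tcd must exceed {T_HOLD - T_CCQ} ps")  # tcd must exceed 20 ps
```

A longer clock period fixes the setup check at 300 ps and 600 ps, while the hold check fails at every period because it only involves tccq, tcd, and thold.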
Sequential Timing Summary
tccq / tpcq clock-to-q delay (contamination/propagation)
tcd / tpd combinational logic delay (contamination/propagation)
tsetup time that FF inputs must be stable before next clock edge
thold time that FF inputs must be stable after a clock edge
Tc clock period
[Figure: two R1 -> R2 timing diagrams with CLK, Q1, and D2. The hold check compares tccq + tcd against thold; the setup check fits tpcq + tpd + tsetup within Tc.]
Example: Timing Analysis
[Figure: registered inputs A, B, C, D feed combinational gates whose outputs X' and Y' are captured by registers producing X and Y; all registers share CLK. The longest path passes through three gates and the shortest through one.]
Timing Characteristics
tccq = 30 ps
tpcq = 50 ps
tsetup = 60 ps
thold = 70 ps
per gate: tpd = 35 ps, tcd = 25 ps
Path delays:
tpd = 3 x 35 ps = 105 ps
tcd = 25 ps
Check setup time constraints:
Tc > tpcq + tpd + tsetup
Tc > (50 + 105 + 60) ps = 215 ps
fmax = 1/Tc = 4.65 GHz
Check hold time constraints:
tccq + tcd > thold ?
(30 + 25) ps > 70 ps ? No: 55 ps < 70 ps, so the hold time constraint is violated.
Example: Fixing Hold Time Violation
Add buffers to the short paths:
[Figure: the same circuit (registered inputs A, B, C, D; outputs X' and Y' captured as X and Y) with buffers added on the short paths, so the shortest path now passes through two gates.]
Timing Characteristics
tccq = 30 ps
tpcq = 50 ps
tsetup = 60 ps
thold = 70 ps
per gate: tpd = 35 ps, tcd = 25 ps
Path delays:
tpd = 3 x 35 ps = 105 ps
tcd = 2 x 25 ps = 50 ps
Check setup time constraints:
Tc > tpcq + tpd + tsetup
Tc > (50 + 105 + 60) ps = 215 ps
fmax = 1/Tc = 4.65 GHz
Note: no change to max frequency!
Check hold time constraints:
tccq + tcd > thold ?
(30 + 50) ps > 70 ps ? Yes: 80 ps > 70 ps, so the hold time constraint is met.
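The arithmetic of this example, before and after adding buffers, can be replayed in a few lines. The timing values are the ones on the slide; the function name `analyze` and the result keys are hypothetical, and the gate counts (3 on the longest path, 1 or 2 on the shortest) are read off the tpd and tcd lines above.

```python
# Timing characteristics from the slide, in ps.
T_CCQ, T_PCQ, T_SETUP, T_HOLD = 30, 50, 60, 70
GATE_TPD, GATE_TCD = 35, 25  # per gate


def analyze(longest_gates, shortest_gates):
    """Return the setup bound, fmax, and hold slack for the given path lengths."""
    t_pd = longest_gates * GATE_TPD
    t_cd = shortest_gates * GATE_TCD
    tc_min = T_PCQ + t_pd + T_SETUP
    return {
        "tc_min_ps": tc_min,
        "fmax_ghz": round(1000 / tc_min, 2),
        # Positive slack means tccq + tcd > thold holds.
        "hold_slack_ps": T_CCQ + t_cd - T_HOLD,
    }


before = analyze(longest_gates=3, shortest_gates=1)
after = analyze(longest_gates=3, shortest_gates=2)
print(before)  # {'tc_min_ps': 215, 'fmax_ghz': 4.65, 'hold_slack_ps': -15}
print(after)   # {'tc_min_ps': 215, 'fmax_ghz': 4.65, 'hold_slack_ps': 10}
```

The buffers change only the hold slack (from -15 ps to +10 ps); the setup bound and fmax are untouched because the critical path is unchanged.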
Clock Skew
◼ To make matters worse, clocks have delay too!
❑ The clock does not reach all parts of the chip at the same time!
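◼ Clock skew: difference in arrival time of the same clock edge at two different points on the chip
[Figure: the clock source reaches point A directly and point B over a long, slow clock path; the edge at point B lags the edge at point A by the clock skew.]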
Clock Skew Example
◼ Example of the Alpha 21264 clock skew spatial distribution
Source: Abdelhadi, Ameer, et al. "Timing-driven variation-aware nonuniform clock mesh synthesis." GLSVLSI’10.
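Review: Synchronous Timing Basics
[Figure: register R1 (clocked at tclk1) feeds combinational logic, which feeds register R2 (clocked at tclk2); both clocks come from clk. Register parameters: tc-q, tsu, thold, tcdreg. Logic parameters: tplogic, tcdlogic.]
❑ Under ideal conditions (i.e., when tclk1 = tclk2)
T ≥ tc-q + tplogic + tsu
thold ≤ tcdlogic + tcdreg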
❑ Under real conditions, the clock signal can have both
spatial (clock skew) and temporal (clock jitter) variations
⚫ skew is constant from cycle to cycle (by definition); skew can be
positive (clock and data flowing in the same direction) or negative
(clock and data flowing in opposite directions)
⚫ jitter causes T to change on a cycle-by-cycle basis
Sources of Clock Skew and Jitter in Clock Network
[Figure: clock network from the PLL through the clock drivers and interconnect to the clocked loads, with numbered sources of variation:]
1 clock generation (PLL)
2 clock drivers
3 interconnect
4 power supply
5 temperature
6 capacitive load
7 capacitive coupling
❑ Skew
⚫ manufacturing device variations in clock drivers
⚫ interconnect variations
⚫ environmental variations (power supply and temperature)
❑ Jitter
⚫ clock generation
⚫ capacitive loading and coupling
⚫ environmental variations (power supply and temperature)
Positive Clock Skew
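❑ Clock and data flow in the same direction
[Figure: R1 feeds combinational logic, which feeds R2; the clk line has a delay between tclk1 and tclk2. In the timing diagram, edges 1 and 3 (tclk1) are followed by edges 2 and 4 (tclk2) after a skew δ > 0; the setup window spans T + δ and the hold window spans δ + thold.]
T: T + δ ≥ tc-q + tplogic + tsu so T ≥ tc-q + tplogic + tsu - δ
thold: thold + δ ≤ tcdlogic + tcdreg so thold ≤ tcdlogic + tcdreg - δ
❑ δ > 0: Improves performance, but makes thold harder to
meet. If thold is not met (race conditions), the circuit
malfunctions independent of the clock period!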
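The effect of the skew term on both constraints can be tabulated directly. The sketch below uses made-up delay values in picoseconds; the function names are hypothetical, and `skew` stands for δ.

```python
def min_period_with_skew(t_cq, t_plogic, t_su, skew):
    """Smallest T satisfying T >= tc-q + tplogic + tsu - skew."""
    return t_cq + t_plogic + t_su - skew


def max_hold_with_skew(t_cdlogic, t_cdreg, skew):
    """Largest thold satisfying thold <= tcdlogic + tcdreg - skew."""
    return t_cdlogic + t_cdreg - skew


# Placeholder values (ps).
T_CQ, T_PLOGIC, T_SU = 50, 105, 60
T_CDLOGIC, T_CDREG = 25, 30

table = [
    (skew,
     min_period_with_skew(T_CQ, T_PLOGIC, T_SU, skew),
     max_hold_with_skew(T_CDLOGIC, T_CDREG, skew))
    for skew in (-20, 0, 20)
]
for skew, t_min, hold_max in table:
    print(f"skew = {skew:+d} ps: T >= {t_min} ps, thold <= {hold_max} ps")
```

Positive skew lowers the minimum period but shrinks the hold time the path can tolerate; negative skew does the opposite.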
Negative Clock Skew
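❑ Clock and data flow in opposite directions
[Figure: R1 feeds combinational logic, which feeds R2; the clk line has a delay between tclk2 and tclk1. In the timing diagram, edges 2 and 4 (tclk2) arrive before edges 1 and 3 (tclk1), so the skew δ < 0 and the setup window spans T + δ.]
❑ δ < 0: the bound T ≥ tc-q + tplogic + tsu - δ grows, so performance degrades, but thold ≤ tcdlogic + tcdreg - δ becomes easier to meet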
Clock Jitter
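❑ Jitter causes T to vary on a cycle-by-cycle basis
[Figure: register R1 with input In feeding combinational logic, clocked by tclk derived from clk. Each clock edge may arrive anywhere between -tjitter and +tjitter around its nominal time, so the period T varies from cycle to cycle.]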
How Do You Know That A Circuit Works?
◼ You have designed a circuit
❑ Is it functionally correct?
❑ Even if it is logically correct, does the hardware meet all
timing constraints?
Testing Large Digital Designs
◼ Testing can be the most time consuming design stage
❑ Functional correctness of all logic paths
❑ Timing, power, etc. of all circuit elements
Adapted from “CMOS VLSI Design 4e”, Neil H. E. Weste and David Money Harris ©2011 Pearson
Part 4:
Functional Verification
Functional Verification
◼ Goal: check logical correctness of the design
Testbench-Based Functional Testing
◼ Testbench: a module created specifically to test a design
❑ Tested design is called the “device under test (DUT)”
[Figure: a testbench containing a test pattern generator that drives the DUT inputs and output checking logic that reads the DUT outputs.]
Common Verilog Testbench Types
Testbench       Input/Output Generation   Error Checking
Simple          Manual                    Manual
Self-Checking   Manual                    Automatic
Automatic       Automatic                 Automatic
Example DUT
◼ We will walk through different types of testbenches to test
a module that implements the logic function:
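y = (b' ∙ c') + (a ∙ b')
// performs y = ~b & ~c | a & ~b
module sillyfunction(input a, b, c, output y);
  wire b_n, c_n;            // inverted inputs
  wire m1, m2;              // product terms
  assign b_n = ~b;
  assign c_n = ~c;
  assign m1 = b_n & c_n;    // ~b & ~c
  assign m2 = a & b_n;      // a & ~b
  assign y = m1 | m2;
endmodule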
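Before writing any testbench, it can help to have a software reference for what the DUT should output. The sketch below is in Python rather than Verilog and is only an illustration: it mirrors the wires b_n, c_n, m1, m2 in a gate-level model and compares it with the behavioral expression for all eight input combinations. The function names are hypothetical.

```python
from itertools import product


def sillyfunction_gates(a, b, c):
    """Gate-level model mirroring the wires b_n, c_n, m1, m2."""
    b_n, c_n = 1 - b, 1 - c
    m1 = b_n & c_n  # ~b & ~c
    m2 = a & b_n    # a & ~b
    return m1 | m2


def sillyfunction_spec(a, b, c):
    """Behavioral specification: y = ~b & ~c | a & ~b."""
    return int((not b and not c) or (a and not b))


all_inputs = list(product((0, 1), repeat=3))
mismatches = [bits for bits in all_inputs
              if sillyfunction_gates(*bits) != sillyfunction_spec(*bits)]
ones = [bits for bits in all_inputs if sillyfunction_spec(*bits)]

print("mismatches:", mismatches)  # mismatches: []
print("y = 1 for:", ones)         # y = 1 for: [(0, 0, 0), (1, 0, 0), (1, 0, 1)]
```

The list of inputs with y = 1 is the set of expected values a testbench would check the DUT against.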
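Simple Testbench
module testbench1();              // No inputs, outputs
  reg  a, b, c;                   // Manually assigned
  wire y;                         // Manually checked
  sillyfunction dut(a, b, c, y);  // Instantiate the device under test
  initial begin a = 0; b = 0; c = 0; #10; c = 1; #10; end  // Example stimulus only
endmodule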
Simple Testbench: Output Checking
◼ Most common method is to look at waveform diagrams
❑ Thousands of signals over millions of clock cycles
◼ Manually check that output is correct at all times
Simple Testbench
◼ Pros:
❑ Easy to design
❑ Can easily test a few, specific inputs (e.g., corner cases)
◼ Cons:
❑ Not scalable to many test cases
❑ Outputs must be checked manually outside of the simulation
◼ E.g., inspecting dumped waveform signals
◼ E.g., printf() style debugging
Part 5:
Timing Verification
Timing Verification Approaches
◼ High-level simulation (e.g., C, Verilog)
❑ Can model timing using “#x” statements in the DUT
❑ Useful for hierarchical modeling
◼ Insert delays in FF’s, basic gates, memories, etc.
◼ High level design will have some notion of timing
❑ Usually not as accurate as real circuit timing
The Good News
◼ Tools will try to meet timing for you!
❑ Setup times, hold times
❑ Clock skews
❑ …
The Bad News
◼ The tool can fail to find a solution
❑ Desired clock frequency is too aggressive
◼ Can result in setup time violation on a particularly long path
❑ Too much logic on clock paths
◼ Introduces excessive clock skew
❑ Timing issues with asynchronous logic
Meeting Timing Constraints
◼ Unfortunately, this is often a manual, iterative process
❑ Meeting strict timing constraints (e.g., high performance
designs) can be tedious
Meeting Timing Constraints: Principles
◼ Let’s go back to the fundamentals