Libraries Corners STA TimingPaths 07 23
Exercise 1
Which cells are preferred (size and flavor) for the following applications?
- Heart pacemaker.
- Datacenter processor.
Compare the following cells in terms of speed, area, and leakage power.
- Compare using a complex “AOI” cell against implementing the same logic with equivalent NAND2 cells, with respect to overall delay and performance.
Standard cell Layout
• At the top of the standard cell there is a VDD rail, and at the bottom there is a VSS rail.
• An nwell region near the VDD rail, where pMOS transistors are built.
• A gap between the nwell and the pwell, usually dedicated to wiring.
• A pwell region near the VSS rail, where nMOS transistors are built.
Standard cell library Characterization
1. Liberty file (.lib/.db file)
• The .lib file is a human-readable ASCII format that characterizes the standard cell library cells in terms of timing, area, power, and other parameters; the .db file is its compiled binary form.
• Each cell is characterized using SPICE simulation; timing and power results are obtained under a variety of conditions.
1. Liberty file (.lib or .db file)
Why use .lib?
- To know whether the design meets timing:
• Running SPICE on the whole design would consume a lot of time and computing resources.
• Instead, we use a timing model that abstracts cell behavior and simplifies calculations.
For each signoff corner, we use the .lib file provided for that corner to perform static timing analysis (STA) and power analysis.
1. Liberty file (.lib or .db file)
Timing data of standard cells is provided in the Liberty
format.
• Library
  • General information common to all cells in the library.
  • For example: operating conditions, wire load models, look-up tables.
• Cell
  • Specific information about cell characterization.
  • For example: function, area, leakage power.
• Pin
  • Timing, power, capacitance, leakage, functionality, design rules, and other characteristics of each pin in each cell.
Parasitic Estimation: WLM
For critical timing paths, which cells should logic synthesis tools use (size, flavor, complex or simple cells)?
To calculate the propagation delay of a specific cell, how can DC/ICC determine its input transition and load capacitance?
Path Delay Calculation
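[Figure: three-stage path from inputs a, b, c through internal nodes d and e to output y. Transitions are labeled t2a, t2b, t2c at the inputs, t2d and t2e at the internal nodes, and tfy at the output; the stage loads are C1, C2, C3.]
t1 = f(t2, C1)
t2 = f(t2d, t2c, C2)
t3 = f(t2e, C3)
ttotal = f(t2a, t2b, t2c, C1, C2, C3)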
Exercise 3
If the input transition falls outside the range of the cell's delay characterization LUT, how will the STA engine calculate the propagation delay?
Cell Types
• Combinational
[Figure: cell symbols; labels include pins A, B, C, and Reset marked as an asynchronous input.]
Cell Timing Data
library(){
  lu_table_template ("del_1_7_7") {
    variable_1 : "input_net_transition";
    index_1("1, 2, 3, 4, 5, 6, 7");
    variable_2 : "total_output_net_capacitance";
    index_2("1, 2, 3, 4, 5, 6, 7");
  }
  cell (INVX1) {
    pin (Y) {
      timing () {
        related_pin : "A";
        timing_type : "combinational";
        timing_sense : "negative_unate";
        /* rows follow index_1 (input transition), columns follow index_2 (output load) */
        cell_rise ("del_1_7_7") {
          index_1("0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024");
          index_2("0.1, 0.25, 0.5, 1, 2, 4, 8");
          values("0.0168610, 0.0179019, 0.0195185, 0.0229259, 0.0296588, 0.0431451, 0.0702328", \
                 "0.0239648, 0.0255491, 0.0279298, 0.0319930, 0.0387540, 0.0520896, 0.0790211", \
                 "0.0342118, 0.0366966, 0.0402223, 0.0462823, 0.0558327, 0.0705154, 0.0967339", \
                 "0.0491695, 0.0524727, 0.0576512, 0.0665647, 0.0810999, 0.1027237, 0.1342571", \
                 "0.0721332, 0.0765389, 0.0836775, 0.0960890, 0.1171612, 0.1497265, 0.1957640", \
                 "0.1111560, 0.1164417, 0.1252609, 0.1422002, 0.1712097, 0.2171862, 0.2847010", \
                 "0.1841131, 0.1901881, 0.2010298, 0.2194395, 0.2555983, 0.3182710, 0.4139452");
        }
      }
    }
  }
}
Combinational Timing Arc Syntax
• Combinational timing arc between input A and output Y with negative unateness, i.e., when A rises Y falls, and vice versa.
cell (INVX1) {
  pin (Y) {
    timing () {
      related_pin : "A";
      timing_type : "combinational";
      timing_sense : "negative_unate";
      cell_rise ("del_1_7_7") {
        index_1("0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024");
        index_2("0.1, 0.25, 0.5, 1, 2, 4, 8");
        values("0.0168610, 0.0179019, 0.0195185, 0.0229259, 0.0296588, 0.0431451, 0.0702328", \
               "0.0239648, 0.0255491, 0.0279298, 0.0319930, 0.0387540, 0.0520896, 0.0790211", \
               "0.0342118, 0.0366966, 0.0402223, 0.0462823, 0.0558327, 0.0705154, 0.0967339", \
               "0.0491695, 0.0524727, 0.0576512, 0.0665647, 0.0810999, 0.1027237, 0.1342571", \
               "0.0721332, 0.0765389, 0.0836775, 0.0960890, 0.1171612, 0.1497265, 0.1957640", \
               "0.1111560, 0.1164417, 0.1252609, 0.1422002, 0.1712097, 0.2171862, 0.2847010", \
               "0.1841131, 0.1901881, 0.2010298, 0.2194395, 0.2555983, 0.3182710, 0.4139452");
      }
    }
  }
}
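A minimal Python sketch of how a delay could be read from the cell_rise table on this slide, assuming rows follow index_1 (input transition) and columns follow index_2 (output load), and assuming plain bilinear interpolation between the four surrounding table points. The function names are hypothetical, and reusing the edge segment for out-of-range inputs (linear extrapolation) is an assumption of this sketch, not a statement about any particular tool.

```python
from bisect import bisect_right

# Values copied from the INVX1 cell_rise group on this slide.
# The excerpt does not show the library units, so none are assumed here.
SLEW_INDEX = [0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024]  # index_1
LOAD_INDEX = [0.1, 0.25, 0.5, 1, 2, 4, 8]                       # index_2
CELL_RISE = [
    [0.0168610, 0.0179019, 0.0195185, 0.0229259, 0.0296588, 0.0431451, 0.0702328],
    [0.0239648, 0.0255491, 0.0279298, 0.0319930, 0.0387540, 0.0520896, 0.0790211],
    [0.0342118, 0.0366966, 0.0402223, 0.0462823, 0.0558327, 0.0705154, 0.0967339],
    [0.0491695, 0.0524727, 0.0576512, 0.0665647, 0.0810999, 0.1027237, 0.1342571],
    [0.0721332, 0.0765389, 0.0836775, 0.0960890, 0.1171612, 0.1497265, 0.1957640],
    [0.1111560, 0.1164417, 0.1252609, 0.1422002, 0.1712097, 0.2171862, 0.2847010],
    [0.1841131, 0.1901881, 0.2010298, 0.2194395, 0.2555983, 0.3182710, 0.4139452],
]


def _segment(axis, x):
    """Index i of the axis segment [axis[i], axis[i+1]] used for x.

    Values outside the axis reuse the first or last segment, which turns
    the interpolation below into linear extrapolation (sketch assumption).
    """
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)


def lookup_delay(table, slew_axis, load_axis, slew, load):
    """Bilinear interpolation of a two-dimensional delay table."""
    i = _segment(slew_axis, slew)
    j = _segment(load_axis, load)
    u = (slew - slew_axis[i]) / (slew_axis[i + 1] - slew_axis[i])
    v = (load - load_axis[j]) / (load_axis[j + 1] - load_axis[j])
    return ((1 - u) * (1 - v) * table[i][j]
            + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j]
            + u * v * table[i + 1][j + 1])
```

Calling `lookup_delay(CELL_RISE, SLEW_INDEX, LOAD_INDEX, 0.064, 0.5)` lands exactly on a table point and returns that entry; any other input blends the four neighboring entries.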
Delay Analysis
• Calculation of each timing arc's value: a cell delay or a net delay.
• A positive unate timing arc combines incoming rise delays with local rise delays, and incoming fall delays with local fall delays.
• A negative unate timing arc combines incoming rise delays with local fall delays, and vice versa.
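The unate rules above can be sketched in Python as follows. Arrival times are kept as a (rise, fall) pair at each pin; the function name and the delay numbers in the usage note are hypothetical and only illustrate how the pairs are combined.

```python
def propagate_arc(arrival, cell_rise, cell_fall, timing_sense):
    """Propagate a (rise, fall) arrival pair through one timing arc.

    cell_rise / cell_fall are the arc delays for a rising / falling output.
    """
    in_rise, in_fall = arrival
    if timing_sense == "positive_unate":
        # Rising input causes rising output, falling causes falling.
        return (in_rise + cell_rise, in_fall + cell_fall)
    if timing_sense == "negative_unate":
        # Rising input causes falling output, and vice versa.
        return (in_fall + cell_rise, in_rise + cell_fall)
    raise ValueError(f"unsupported timing_sense: {timing_sense}")


# Hypothetical two-stage example: an inverter followed by a buffer.
after_inverter = propagate_arc((0.0, 0.0), 0.03, 0.02, "negative_unate")
after_buffer = propagate_arc(after_inverter, 0.05, 0.01, "positive_unate")
```

With these made-up delays, the rising output of the buffer accumulates the inverter's rise delay plus the buffer's rise delay, because the inverter's rising output was caused by a falling input.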
Timing Group Names
1. Rise transition time (rise_transition). Unit: ns. Symbol: tR.
   Definition: the time it takes a driving pin to make a transition from kVDD to (1-k)VDD. Usually k = 0.1 (k = 0.2, 0.3, etc. are also possible).
   [Figure: rising edge from VSS to VDD, with tR measured between 0.1VDD and 0.9VDD.]
2. Fall transition time (fall_transition). Unit: ns. Symbol: tF.
   Definition: the time it takes a driving pin to make a transition from (1-k)VDD to kVDD. Usually k = 0.1 (k = 0.2, 0.3, etc. are also possible).
   [Figure: falling edge from VDD to VSS, with tF measured between 0.9VDD and 0.1VDD.]
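As an illustration of the rise transition definition, here is a small Python sketch that measures tR on a sampled waveform. It assumes a monotonic rising edge given as parallel lists of times and voltages, and it locates each threshold crossing by linear interpolation between samples; both function names are hypothetical.

```python
def crossing_time(times, volts, level):
    """Time of the first upward crossing of `level`, linearly interpolated."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 <= level <= v1 and v1 != v0:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def rise_transition(times, volts, vdd, k=0.1):
    """Rise transition time measured from k*VDD to (1-k)*VDD."""
    start = crossing_time(times, volts, k * vdd)
    end = crossing_time(times, volts, (1 - k) * vdd)
    return end - start
```

For an ideal linear ramp from 0 to VDD over 1 ns, the default k = 0.1 gives 0.8 ns and k = 0.2 gives 0.6 ns, which shows why the threshold choice must match the one used during library characterization.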
Timing Constraints: Timing Types
Setup/Hold, Recovery/Removal Constraints
LEF: Library Exchange Format (.lef file)
• It's a readable ASCII format that contains detailed pin information, which is used later by PnR tools to guide routing.
Geometry LEF
Technology LEF
• The technology .lef contains simplified information about the technology, for use by the PnR (physical synthesis) tool:
  • Layers
  • Via definitions
  • Design rules
  • Antenna data
Spice and GDS
• The SPICE netlist is the netlist of the cell in SPICE format; it is used for simulation.
• Typically used in digital implementation for LVS checking.
.def: Design Exchange Format
• The .def file holds both physical and logical information about the design.
• It is used for exchanging information between tools, enabling interoperability within the ASIC flow: for example, doing floorplanning and placement with one tool, CTS with a second, and parasitic extraction with a third.
Contents
• Standard cell libraries
A corner is defined as a PVT (process, voltage, temperature) condition. It is provided to the analysis and optimization tools as logic libraries per PVT, together with parasitics data.
Corners do not come from functional settings; they result from process variations during manufacturing and from voltage and temperature variations in the environment in which the chip will operate.
Each standard cell library is characterized for a set of signoff corners, chosen according to the signoff corners required by the design(s) that will later use the library.
The Multiple Analysis Corners
Supply voltage variations [nominal, 1.1*nominal, 0.9*nominal]
• Supply noise due to parasitic inductance.
• A DC source or voltage regulator whose output voltage changes over time. It can go above or below the expected voltage, which changes the current and makes the circuit slower or faster than expected.
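To make the idea of multiple analysis corners concrete, the sketch below enumerates PVT combinations in Python. Only the supply scaling (0.9, 1.0, 1.1 times nominal) comes from the slide; the process labels, the temperature values, and the function name are hypothetical placeholders, and a real signoff list is normally a selected subset rather than the full cross product.

```python
from itertools import product

PROCESS = ["ss", "tt", "ff"]       # hypothetical labels: slow, typical, fast
VOLTAGE_SCALE = [0.9, 1.0, 1.1]    # from the slide: 0.9*nominal, nominal, 1.1*nominal
TEMPERATURE_C = [-40, 25, 125]     # hypothetical temperature points


def enumerate_corners(nominal_voltage):
    """Return every process/voltage/temperature combination as a dict."""
    return [
        {
            "process": process,
            "voltage": round(nominal_voltage * scale, 3),
            "temperature_c": temperature,
        }
        for process, scale, temperature in product(PROCESS, VOLTAGE_SCALE, TEMPERATURE_C)
    ]


corners = enumerate_corners(1.0)
```

Each resulting entry would correspond to one characterized .lib file, which is why the number of corners directly affects characterization and analysis effort.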
For automotive chips, which corners do you expect to need beyond the usual set, to ensure the chip operates reliably in a car's harsh environment?
What do you expect in the following scenarios for your design's area and power:
Basics of Static Timing analysis
• Static timing analysis (STA) is a method for determining whether a circuit meets its timing constraints without having to simulate it.
• Simulation (dynamic analysis) determines the circuit response for a specified set of input patterns.
Static Timing Analysis vs. Gate-Level Timing Simulation
Usage
• STA: checks timing requirements (setup, hold, recovery, removal) and logical DRCs. Much faster than timing-driven simulation. Exhaustive: checks every possible constrained timing path. No vector generation is required. The signal at the input is propagated through the gates at each level until it reaches the output.
• Gate-level timing simulation: functional and timing simulation that checks functionality by comparing the output against the expected output. More accurate. Can catch issues like glitches.
Limitations
• STA: only useful for synchronous digital circuits; cannot analyze asynchronous systems. Less accurate. Timing requirements, false paths, etc. must be defined.
• Gate-level timing simulation: analysis quality can depend on the stimulus vectors. Takes a lot of time and computational power. Non-exhaustive.
Required inputs
• STA: gate-level netlist, .lib files, .sdc, derates, .spef.
• Gate-level timing simulation: gate-level netlist, library .v, .sdf, test vectors, expected output.
Timing paths
- STA tools (e.g., PrimeTime) follow three main steps:
1. The circuit is broken down into sets of timing paths.
2. The delay of each path is calculated.
3. Path delays are checked to see whether timing constraints have been met.
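The three steps can be sketched on a toy netlist in Python. The graph, its delay values, the single required time, and all function names are hypothetical. The sketch enumerates paths explicitly for clarity, which does not scale to real designs, so it should be read as an illustration of the steps rather than of how a production tool is built.

```python
def enumerate_paths(graph, node, endpoints, prefix=()):
    """Step 1: yield every path from `node` to any endpoint."""
    prefix = prefix + (node,)
    if node in endpoints:
        yield prefix
        return
    for successor in graph.get(node, {}):
        yield from enumerate_paths(graph, successor, endpoints, prefix)


def path_delay(graph, path):
    """Step 2: sum the edge delays along one path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def check_paths(graph, startpoints, endpoints, required):
    """Step 3: compare each path delay with the required time."""
    report = []
    for start in startpoints:
        for path in enumerate_paths(graph, start, endpoints):
            delay = path_delay(graph, path)
            report.append((path, delay, required - delay))
    return sorted(report, key=lambda row: row[2])  # worst slack first


# Hypothetical netlist: graph[a][b] is the delay from a to b.
GRAPH = {
    "FF1/Q": {"U1": 0.5},
    "IN": {"U1": 0.3},
    "U1": {"U2": 0.2, "FF2/D": 0.4},
    "U2": {"FF2/D": 0.6},
}
report = check_paths(GRAPH, ["FF1/Q", "IN"], {"FF2/D"}, required=1.2)
```

The first row of `report` is the path with the least slack; in this made-up example it is the only path that misses the required time.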
Timing Path Types
There are four types of paths in any synchronous circuit:
• input-to-register (in2reg)
• register-to-register (reg2reg)
• register-to-output (reg2out)
• input-to-output (in2out)
[Figure: two DFFs sharing a clock, with combinational logic, illustrating the four path types.]
Timing paths
- A path is a route from a Startpoint to an Endpoint
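A short Python sketch of how a path could be labeled from its startpoint and endpoint. It assumes the usual convention that startpoints are input ports or register clock pins and endpoints are register data pins or output ports; the string labels and the function name are hypothetical.

```python
def classify_path(startpoint_kind, endpoint_kind):
    """Map (startpoint kind, endpoint kind) to one of the four path types."""
    starts = {"input_port": "in", "register_clock_pin": "reg"}
    ends = {"register_data_pin": "reg", "output_port": "out"}
    if startpoint_kind not in starts:
        raise ValueError(f"not a valid startpoint: {startpoint_kind}")
    if endpoint_kind not in ends:
        raise ValueError(f"not a valid endpoint: {endpoint_kind}")
    return f"{starts[startpoint_kind]}2{ends[endpoint_kind]}"
```

For example, a path that starts at a register clock pin and ends at a register data pin is classified as reg2reg.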
Required Time
• Required time specifies the time point (interval) at which data is required to arrive at the endpoint (data is required to be stable after arrival).
• The time point after which data may become unstable (change) is called the earliest required time.
• The time point after which data must not become unstable (change) is called the latest required time.
[Figure: clock waveform over cycle 1 and cycle 2.]
Arrival Time
• Arrival time defines the time interval during which a data signal will arrive at a path endpoint (after the arrival time the signal will be stable).
• Data arrival depends on circuit delay, which varies with temperature, supply voltage, etc.
[Figure: data signal waveform with the min and max (latest) arrival times marked.]
Slack and Critical Path
[Figure: two panels of clock and data waveforms with the setup window marked.]
Early and Latest Analysis
• The STA tool calculates the slack of each logic path in order to find the critical path.
[Figure: two panels of clock and data waveforms with the setup window marked.]
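A minimal Python sketch of early and late analysis at one endpoint, assuming the latest arrival is checked against the latest required time (setup) and the earliest arrival against the earliest required time (hold). The function names and all numbers in the usage note are hypothetical.

```python
def endpoint_slacks(arrivals, latest_required, earliest_required):
    """Setup and hold slack for one endpoint from all arriving paths."""
    late_arrival = max(arrivals)    # slowest path matters for setup
    early_arrival = min(arrivals)   # fastest path matters for hold
    return {
        "setup_slack": latest_required - late_arrival,
        "hold_slack": early_arrival - earliest_required,
    }


def critical_endpoint(slacks_by_endpoint):
    """Name of the endpoint with the smallest setup slack."""
    return min(slacks_by_endpoint,
               key=lambda name: slacks_by_endpoint[name]["setup_slack"])


slacks = {
    "FF2/D": endpoint_slacks([1.2, 1.87, 0.6], 4.0, 0.3),
    "FF3/D": endpoint_slacks([3.9, 4.2], 4.0, 0.3),
}
worst = critical_endpoint(slacks)
```

A negative slack means the corresponding check is violated; the endpoint returned by `critical_endpoint` is where the critical path ends in this toy data.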
Clocked Storage Elements
Transparent latch, level sensitive
– Data passes through when the clock is high and is latched when the clock is low.
Flip-Flops: Review of Internal Operation
• The data has to arrive at point B Tsetup before the active clock edge.
• The data has to stay stable at point B for Thold after the active clock edge.
[Figure: FF1 and FF2 connected from Q to D and clocked by Clk; clock waveform with time marks 0, 2, 4.]
T[clk-to-q]
• The amount of time from a change at the flip-flop clock input (e.g., a rising edge) until the resulting stable change at the flip-flop output (Q).
T[setup]
• Setup time is the minimum amount of time the data input should be held
steady before the clock event, so that the data is reliably sampled by the clock.
T[hold]
• Hold time is the minimum amount of time the data input should be held
steady after the clock event, so that the data is reliably sampled by the clock.
• It’s not dependent on clock period!
The requirements of Setup and Hold on timing paths
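The usual textbook form of these requirements for a register-to-register path can be sketched in Python as below. This is an assumption-labeled illustration: skew is taken as capture clock arrival minus launch clock arrival, the function names are hypothetical, and the numbers in the usage lines are made up.

```python
def setup_margin(period, tc2q, tcomb_max, tsetup, skew=0.0):
    """Setup check: tc2q + tcomb_max + tsetup <= period + skew."""
    return (period + skew) - (tc2q + tcomb_max + tsetup)


def hold_margin(tc2q, tcomb_min, thold, skew=0.0):
    """Hold check: tc2q + tcomb_min >= thold + skew (no clock period term)."""
    return (tc2q + tcomb_min) - (thold + skew)


def min_clock_period(tc2q, tcomb_max, tsetup, skew=0.0):
    """Smallest period that still gives a non-negative setup margin."""
    return tc2q + tcomb_max + tsetup - skew


# Hypothetical values, all in ns.
setup = setup_margin(period=2.0, tc2q=0.2, tcomb_max=1.0, tsetup=0.1)
hold = hold_margin(tc2q=0.2, tcomb_min=0.3, thold=0.05)
fastest = min_clock_period(tc2q=0.2, tcomb_max=1.0, tsetup=0.1)
```

Note that `hold_margin` takes no period argument, in line with the earlier remark that hold time does not depend on the clock period, and that positive skew increases the setup margin while reducing the hold margin.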
Exercise 5
To verify proper chip operation, can we rely on simulation alone or on STA alone?
Where do optimization and analysis tools obtain the characteristics (tc2q/tsetup/thold) of a specific library FF?
Exercise 6
Should we seek or avoid positive skew?
For next-generation datacenter processors, which kind of STA violation is more concerning?
Back to Timing Paths
Four Sections in a Timing Report
report_timing
Header:
Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
Path Group: Clk
Path Type: max
[Figure: FF1 driving FF2 through U2 and U3, with both flip-flops clocked by Clk.]
Data Arrival Section
Point                                  Incr      Path
-----------------------------------------------------------
clock Clk (rise edge)                  0.00      0.00
clock network delay (propagated)       1.10 *    1.10
FF1/CLK (fdef1a15)                     0.00      1.10 r
FF1/Q (fdef1a15)                       0.50 *    1.60 r
U2/Y (buf1a27)                         0.11 *    1.71 r
U3/Y (buf1a27)                         0.11 *    1.82 r
FF2/D (fdef1a15)                       0.05 *    1.87 r
data arrival time                                1.87
Annotations: "calculated latency" marks the clock network delay row; "library reference names" marks the names in parentheses (fdef1a15, buf1a27).
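The Path column is the running sum of the Incr column. A small Python sketch that reproduces it from the increments in the report above (the function name is hypothetical; the numbers are copied from the report):

```python
# (point, incr) pairs copied from the data arrival section of the report.
INCREMENTS = [
    ("clock Clk (rise edge)", 0.00),
    ("clock network delay (propagated)", 1.10),
    ("FF1/CLK", 0.00),
    ("FF1/Q", 0.50),
    ("U2/Y", 0.11),
    ("U3/Y", 0.11),
    ("FF2/D", 0.05),
]


def accumulate(increments):
    """Return (point, incr, path) rows, where path is the running total."""
    rows = []
    total = 0.0
    for point, incr in increments:
        total += incr
        rows.append((point, incr, round(total, 2)))  # report shows 2 decimals
    return rows


rows = accumulate(INCREMENTS)
data_arrival_time = rows[-1][2]
```

The last running total is the data arrival time at FF2/D reported at the bottom of the section.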
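[Figure: the FF1 to FF2 path annotated with the increments from the table: 1.1ns clock network delay to FF1/CLK, .50ns from FF1/CLK to FF1/Q, .11ns through each of U2 and U3, and .05ns to FF2/D; all transitions are rising (r); clock waveform with time marks 0, 2, 4.]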
Data Required Section
Point Incr Path
-----------------------------------------------------------
clock Clk (rise edge) 0.00 0.00
clock network delay (propagated) 1.10 * 1.10
FF1/CLK (fdef1a15) 0.00 1.10 r
FF1/Q (fdef1a15) 0.50 * 1.60 r
U2/Y (buf1a27) 0.11 * 1.71 r
U3/Y (buf1a27) 0.11 * 1.82 r
FF2/D (fdef1a15) 0.05 * 1.87 r
data arrival time 1.87
[Figure: FF1 driving FF2 through U2 and U3; annotations show 1.0ns on the clock path to FF2/CLK and 0.21ns at FF2/D; clock waveform with time marks 0, 2, 4.]
Summary - Slack
report_timing
Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
Path Group: Clk
Path Type: max
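Slack is the difference between data arrival and data required.
[Figure: waveforms for FF1/clk, FF2/D, and FF2/clk with the data arrival, data required, and hold required times marked; edge labels are 1.1ns and 5.1ns on FF1/clk, and 1ns and 5ns on FF2/clk.]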
Example Hold Timing Report
Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
Path Group: Clk
Path Type: min
[Figure: circuit with input In1, flip-flops FF1, FF2, and FF3, gates A, B, C, D, and E, and outputs out1 and out2.]
Exercise 8
[Figure: circuit annotated with delays of 0.1ns, 0.5ns, and 0.1ns.]
For both FFs:
Tc2q = 0.05ns
Tsetup = 0.03ns
Thold = 0.025ns
Exercise 9
1. Identify all timing paths [assume input delay constraint = 0.3ns].
2. Calculate the maximum clock frequency.
Exercise 10: Is there any setup or hold violation in this circuit?
• Whenever a flip-flop sees a setup or hold time violation, it enters a state where its output is unpredictable; this is known as the metastable state (quasi-stable state).
• At the end of the metastable state, the flip-flop settles to either '1' or '0'.
• Whenever the input signal D does not meet the Tsetup and Thold of the given D flip-flop, metastability can occur.