VLSI Static Timing Analysis
Part 4: Timing Constraints
/amradelm
Introduction
In part 1 we went through the basic principles that are needed to understand all VLSI timing checks.
In part 2 we looked into setup and hold checks and how to fix the violations.
In part 3 we discussed other timing checks such as max transition, max capacitance, skew, etc.
In this part we will learn how to apply timing constraints on the design.
Clock Constraints
The most important timing constraint is the clock and its period.
The command to define a clock is create_clock. For each clock in the design we define the following:
o Period: The only required argument. The others are optional.
create_clock -period <arg>
o Waveform: The rise and fall edges over one cycle.
create_clock -period 4 -waveform {1 2}
# The first value (arg1) is the time of the first rising edge, and the second value (arg2) is the time of the falling edge.
o Object: The source the clock comes from. It can be a port or a pin (for example, a PLL output).
create_clock -period <arg> [get_ports my_clock_port]
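As a minimal sketch, the three arguments above can be combined into one definition. The clock name sys_clk is a hypothetical name chosen for illustration; the port name and the numbers are reused from the examples above.

```tcl
# Sketch only: sys_clk is a hypothetical clock name.
# Period of 4 time units, first rising edge at 1, falling edge at 2.
create_clock -name sys_clk -period 4 -waveform {1 2} [get_ports my_clock_port]
```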
Source Latency
o Source latency:
The delay from the external clock source until the clock reaches the IP/module clock port.
This becomes very important in a hierarchical flow [1], as it defines the local skew between the modules.
set_clock_latency [-clock <args>] [-rise] [-fall] [-late] [-early] [-source] <latency> <objects>
[-clock <args>]: The clock that will get the latency.
[-rise] [-fall]: Whether the latency applies to the rising edge or the falling edge.
[-late] [-early]: Whether the latency applies to late or early analysis.
[-source]: Marks the value as a source latency rather than a network latency. Network latencies are explained later in this part.
<objects>: The clock source object (pin or port).
(Figure: Source Latency. A top level containing IP 1 and IP 2, each with its own source latency.)
[1] : Hierarchical flow refers to when the modules of the design are synthesized and hardened separately and then
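A short sketch of how the source latency options might be used together. The clock name clk_1 and both latency values are assumptions made for illustration only.

```tcl
# Sketch only: clk_1 and the latency values are hypothetical.
# Earliest and latest expected arrival of the clock at the block's clock port.
set_clock_latency -source -early 0.8 [get_clocks clk_1]
set_clock_latency -source -late  1.2 [get_clocks clk_1]
```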
o Clock Uncertainty :
As its name suggests, clock uncertainty defines worst-case values for the things we are not certain about regarding the clock.
For example, what is the clock skew between 2 flip-flops during synthesis? We
(Figure: Skew Is Unknown In Synthesis)
a constant period. Instead, the generated clock has an error margin, meaning that the clock period changes over time.
Imagine you have a drum that beats regularly, like a clock.
Normally, it goes "beat, beat, beat" at the same time intervals.
[1] : Jitter Part 3: C2C Jitter and Long Term Jitter (youtube.com)
How to Apply Clock Uncertainty
that will happen after we route the other signal nets. So we need to account for this uncertainty.
The network jitter is the jitter caused by the varying delays of the cells and routes as temperature and voltage change over time. This value is very small and can range from 1% to 5% of the clock period for advanced tech nodes [2].
Clock Uncertainty
The PLL jitter happens from one cycle (one edge) to another. Therefore, the PLL jitter is applied only to setup paths, where the launch and capture edges are different, and not to full-cycle hold paths, where the launch and capture edges are the same.
We still apply some uncertainty on full-cycle hold paths as a safety margin.
set_clock_uncertainty -setup [expr {$pll_jitter + $clock_period*0.03}] [get_clocks myClk] ;# Setup
set_clock_uncertainty -hold [expr {$clock_period*0.03}] [get_clocks myClk] ;# Hold
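To make the arithmetic concrete, here is a small plain-Tcl sketch that evaluates the two expressions. The 4 ns period and 0.1 ns PLL jitter are assumed values chosen for illustration; only the 3% margin comes from the commands above.

```tcl
# Assumed example values in ns; replace with the project's own numbers.
set clock_period 4.0
set pll_jitter   0.1
set margin_ratio 0.03 ;# the 3% margin used in the commands above

# Setup gets PLL jitter plus the margin; hold gets the margin only.
set setup_unc [expr {$pll_jitter + $clock_period * $margin_ratio}]
set hold_unc  [expr {$clock_period * $margin_ratio}]

puts [format "setup uncertainty = %.2f ns" $setup_unc]
puts [format "hold uncertainty  = %.2f ns" $hold_unc]
```

With these assumed inputs the setup uncertainty is 0.22 ns and the hold uncertainty is 0.12 ns.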
Clock-To-Clock Uncertainty
o The uncertainty between two different clocks can be defined using the set_clock_uncertainty command
set_clock_uncertainty 2.25 -from [get_clocks clk_1] -to [get_clocks clk_2]
set_clock_uncertainty 1.50 -from [get_clocks clk_2] -to [get_clocks clk_1]
Network Latency
o The network latency defines the clock delay within the block while the source latency defines the delay outside the block.
o Before CTS, network latency can model the expected latency of the clock tree.
o This helps the tool optimize the timing of the design especially for the IO paths and the paths going from and to different clocks.
o In the diagram on the right, we can apply a network latency of 2 ns to clk_1 and 5 ns to clk_2, which is equivalent to applying a skew of 3 ns between the two clocks.
set_clock_latency 2.00 [get_ports clk_1]
set_clock_latency 5.00 [get_ports clk_2]
Generated Clock
The generated clocks can be identified automatically by the STA tools or defined manually by the designer.
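A manual definition might look like the following sketch. The clock name clk_div2 and the divider pin div_reg/Q are hypothetical, and the divide-by-2 relationship is an assumed example rather than something taken from the slides.

```tcl
# Sketch only: clk_div2 and div_reg/Q are hypothetical names.
# A divide-by-2 clock defined at a divider flip-flop output,
# derived from the master clock that enters at my_clock_port.
create_generated_clock -name clk_div2 -source [get_ports my_clock_port] -divide_by 2 [get_pins div_reg/Q]
```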
I/O Budgeting
The current block communicates with other blocks through the I/O ports.
To fully run STA on the block, we need to inform the tool about the delays in the other blocks so that we meet timing when we integrate the blocks.
This is called I/O budgeting or I/O delays.
The delay coming from input ports is called input delay and it consists of:
Clock-to-Q delay of the launching FF.
Combinational path delay in the launching block.
Net delay between the two blocks.
The delay coming from output ports is called output delay and it consists of:
Setup time of the capturing FF.
Combinational path delay in the capturing block.
Net delay between the two blocks.
Most designers set 50% of the clock period as the IO delay, but this can change from one project to another.
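The 50% rule of thumb could be scripted roughly as follows. The 4 ns period is an assumed value, and CLK1 is the clock name used in the later examples of this part.

```tcl
# Sketch only: the 4 ns period is an assumed value.
set clock_period 4.0
set io_delay [expr {$clock_period * 0.5}] ;# 50% of the clock period

set_input_delay  $io_delay -clock CLK1 [all_inputs]
set_output_delay $io_delay -clock CLK1 [all_outputs]
```

Note that all_inputs also returns the clock port itself, so many flows exclude it from the collection before applying the input delay.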
I/O Budgeting Effect of Clock Latency
The only terms missing are the clock latencies to the launching and capturing flip-flops.
There are different ways to handle the latency when dealing with IOs:
o Apply zero latencies and include the latency in the IO delay value (as in the reports below).
o Specify the latency on the ports manually.
o Apply the median clock latency
I/O Budgeting Effect of Clock Latency Zero Latency
In the case of zero latency, you have to account for the latency inside the IO delay value:
o For Input delay: latency is added to the combinational delay value
o For output delay: latency is subtracted from the combinational delay value.
The command to apply IO delay while including latency:
o set_input_delay 6.5 -clock CLK1 -network_latency_included [all_inputs]
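The add/subtract rule can be checked with a few lines of plain Tcl. All three numbers are assumed example values in ns; the 1.5 and 5 happen to reproduce the 6.5 used in the command above, but the slides do not state how that value was derived.

```tcl
# Assumed example values in ns.
set clock_latency 5.0 ;# latency of the clock
set input_comb    1.5 ;# input delay without latency
set output_comb   2.0 ;# output delay without latency

# Input delay: latency is added. Output delay: latency is subtracted.
set input_delay_incl  [expr {$input_comb + $clock_latency}]
set output_delay_incl [expr {$output_comb - $clock_latency}]

puts "input delay with latency included  = $input_delay_incl"
puts "output delay with latency included = $output_delay_incl"
```

With these inputs the input delay becomes 6.5 and the output delay becomes -3.0.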
I/O Budgeting Effect of Clock Latency Manual Latency
The other approach is to apply a fixed value for the clock latency.
To apply the IO delay without including latency and then apply the fixed latency:
o set_input_delay 1.5 -clock CLK1 [get_ports my_port]
o set_output_delay 2 -clock CLK1 [get_ports my_port]
o set_clock_latency 5 [get_clocks CLK1] [1]
(Figure: the ideal latency is applied on ports; the propagated latency is applied on clock pins.)
[1] : The tools will use this value for ports while the rest of the design will use the actual clock tree values (if the clock is
The other approach is to use the median latency of all the registers in the design after building the clock tree.
The commands to apply the IO delay without including latency and then apply the median:
o set_input_delay 1.5 -clock CLK1 [all_inputs]
o set_output_delay 2 -clock CLK1 [all_outputs]
o Then after CTS:
o compute_clock_latency [1]
[1] : The command works for SNPS ICC2 and Fusion Compiler.
I/O Budgeting Input Transition And Load Capacitance
The calculation of the delay of the cells just at the input port needs the input transition time. Unless specified, the tool will assume ideal (zero) transition time
which might cause timing violation when the blocks are integrated.
There are 2 ways to manually define the input transition:
o Using a fixed numeric value
set_input_transition 0.75 [get_ports DATA_IN*]
o Using the drive strength of a cell (for example, assume the cell that drives the port is a buffer of size 4)
set_driving_cell -lib_cell BUF_4 [get_ports DATA_IN*]
Similarly the cell that drives the output port needs the load capacitance value to calculate its delay.
There are 2 ways to define the load cap:
o Using a fixed cap value
set_load 3 [all_outputs]
o Using the load cap of a cell (for example, assume the load cell is a buffer of size 8)
set pin_cap [get_attribute [get_lib_pins tech_lib/BUF_8/A] pin_capacitance]
set_load $pin_cap [all_outputs]
Path Based Constraints
False Paths
Applying false path constraints requires a good understanding of the functional operation of the circuit.
set_false_path -from FF1/CP -through {MUX_1/D0} -through {MUX_2/D1} -to FF3/D
set_false_path -from FF2/CP -through {MUX_1/D1} -through {MUX_2/D0} -to FF3/D
False Paths Hold on IO Ports
Scenario 1
Some designers apply false path for the hold analysis on IO ports for Setup Violation
Multi-Cycle Path
Multi-cycle paths are timing paths that take more than one clock cycle.
To set a multi-cycle path:
set_multicycle_path -setup <MULTIPLIER> -from {FF1/CLK} -to {FF2/D} ;# MULTIPLIER = 3 in the diagram below
By default, STA tools select the hold capture edge to be the edge one cycle before the setup edge.
We fix this by instructing the tool to move the hold edge back by N-1 cycles, where N is the number of multi-cycles.
set_multicycle_path -hold <MULTIPLIER-1> -from {FF1/CLK} -to {FF2/D} ;# MULTIPLIER-1 = 2 in the diagram below
(Figure: FF1/CLK and FF2/CLK waveforms.)
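The edge selection described above can be worked through with a small plain-Tcl helper. The proc name mcp_edges and the 4 ns period are hypothetical; the sketch assumes that the launch and capture clocks share the same period and phase and that the launch edge is at time 0.

```tcl
# Returns the capture-clock edge times {setup_edge hold_edge} for a launch edge at time 0,
# assuming launch and capture clocks share the same period and phase.
proc mcp_edges {period setup_mult hold_mult} {
    set setup_edge [expr {$setup_mult * $period}]
    # By default the hold edge is one cycle before the setup edge;
    # each hold multiplier step moves it one more cycle earlier.
    set hold_edge [expr {($setup_mult - 1 - $hold_mult) * $period}]
    return [list $setup_edge $hold_edge]
}

puts [mcp_edges 4 3 0] ;# setup multiplier 3, no hold multiplier
puts [mcp_edges 4 3 2] ;# setup multiplier 3, hold multiplier 2
```

With a setup multiplier of 3 and no hold multiplier, the hold check sits at 8 ns, two cycles after the launch edge. Adding a hold multiplier of 2 brings it back to 0 ns, the launch edge itself.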
Max/Min/Skew Delays
As discussed in the previous part of this document, we may have max, min, or skew constraints on the design paths.
To apply these constraints:
set_max_delay -from FF/CLK -to Instance/A 30
set_min_delay -from FF/CLK -to Instance/A 10
There is no command to apply a skew constraint on the bus [1]. You need to use TCL commands to report whether there is a skew violation.
The next slide shows an example TCL script that performs a skew check.
[1] : That is in Primetime. Other tools may have commands for skew constraints
Max/Min/Skew Delays Skew Check Script
# Define skew constraint
set skew_constraint 3 ;# in ns
# Initialize min and max arrival times
set max 0
set min 999999 ;# Use a large number as an initial "infinity" value
# Get the collection of pins
set my_pins [get_pins {DATA[*]}]
foreach_in_collection pin $my_pins {
    # Get the timing path to the pin
    set path [get_timing_paths -to $pin]
    # Get the arrival time of the timing path
    set arrival [get_attribute $path arrival]
    # Update min and max arrival times
    if { $arrival < $min } {
        set min $arrival
    }
    if { $arrival > $max } {
        set max $arrival
    }
}
# Compare the arrival-time spread against the constraint
if { ($max - $min) > $skew_constraint } {
    puts "ERROR: Skew Violation"
} else {
    puts "INFO: Skew Passed"
}
Modes, Corners, And Scenarios
Modes
A mode in STA refers to a specific functional condition under which the timing analysis is
performed.
Mode Examples: Functional mode, Scan test mode, Low power mode.
Each mode represents a distinct set of constraints and conditions that affect the timing behavior of
the design.
For example, two modes could differ in the clock frequency.
create_mode high_freq_mode
create_clock -period 3.5 [get_ports my_clock_port]
create_mode low_freq_mode
create_clock -period 7.5 [get_ports my_clock_port]
If the two clocks enter from different ports and go through a clock MUX, we have to use the set_case_analysis command.
set_case_analysis sets specific conditions for signal values during the analysis. It fixes certain signals to specified logic values (0 or 1) to model specific operational modes.
In the diagram we need to set CLK_SEL to 0 in the high-frequency mode and to 1 in the low-frequency mode.
create_mode high_freq_mode
create_clock -period 3.5 [get_ports my_clock_port]
set_case_analysis 0 [get_ports CLK_SEL]
create_mode low_freq_mode
create_clock -period 7.5 [get_ports my_clock_port]
set_case_analysis 1 [get_ports CLK_SEL]
Modes Scan Modes
Another use of set_case_analysis is to enable or disable scan modes.
In functional modes we want the FF to receive input through the D pin.
In scan mode we want the FF to receive input through the SI (scan input) pin.
This behavior can be selected by setting the scan enable port/pin to the correct value.
create_mode functional_mode
create_clock -period 3.5 [get_ports my_func_clk_port]
set_case_analysis 0 [get_ports scan_enable]
create_mode scan_mode
create_clock -period 10 [get_ports my_scan_clk_port]
set_case_analysis 1 [get_ports scan_enable]
PVT Corners
PVT stands for process, voltage, and temperature variations and describes the different fabrication, operating, and environmental conditions affecting the chip.
Process: Systematic and large variations such as doping, oxide thickness, etc. can affect entire parts of a wafer, resulting in all instances within the chip being faster or slower than average.
Voltage: Difference in the supply voltage provided to the entire chip. For example, a chip can have two operating voltages: 3V for high performance mode and 2V for low power mode. The voltage affects the performance of the cells.
Temperature: Ambient temperature where the chip will be operated. For example, the chip may be operated in a much cooler environment or next to a hot car engine (120 °C). The temperature also affects the performance of the cells.
We run STA on all the different PVT corners to ensure the chip can operate correctly under
all conditions.
[1] : Reference : New analytical model for nanoscale tri-Gate SOI MOSFETs including quantum effects
Scenarios
Additional Topics
Path Groups
The design consists of several modules and blocks that communicate with each other.
paths.
For example, create a group for all timing paths going from module_1 to module_3
group_path -name FROM_MD1_TO_MD3 -from [get_pins -hier module_1/*CK] -to [get_pins -hier module_3/*D]
Benefits of Path Grouping:
Simplified Analysis: Easier and better analysis of the design, allowing identification of which module is
causing more violations and needs redesign.
Team Collaboration: Enables multiple engineers to analyze the same design by assigning each engineer
specific groups to analyze.
Giving Higher Priority For Optimizations: Timing paths within a group are optimized together by the tool.
You can apply a higher weight/priority to a specific group so that the tool focuses on it more. The higher the weight, the higher the effort applied to the group.
group_path -name INOUT -weight 5 -from [all_inputs] -to [all_outputs]
Graph-Based Analysis (GBA) vs Path-Based Analysis (PBA)
Consider the example shown: we have 2 timing paths going through the same AND gate and then a FF.
The upper path:
has a long combinational delay (3 ns) but has a strong driver at the end.
The strong driver provides a better transition time for the AND gate, resulting in a delay of 1 ns within the AND.
Thank You!