
VLSI Static Timing Analysis Part 4 1719604589


Static Timing Analysis
Part 4: Timing Constraints

Amr Adel Mohammady


/amradelm

Introduction

In part 1 we went through the basic principles that are needed to understand all VLSI timing checks.
In part 2 we looked into setup and hold checks and how to fix the violations.
In part 3 we discussed other timing checks such as max transition, max capacitance, skew, etc.
In this part we will learn how to apply timing constraints on the design.

Clock Constraints


The most important timing constraint is the clock and its period.
The command to define a clock is create_clock. We have to define several things for each clock in the design:
o Period: The only required argument; the others are optional.
create_clock -period <arg>
o Waveform: The rise and fall edge times over one cycle.
create_clock -period 4 -waveform {1 2}
# The first value (1) is the time of the first rising edge and the second value (2) is the time of the falling edge.
o Object: The source the clock comes from. It can be a port or a pin (for example, a PLL output).
create_clock -period <arg> [get_ports my_clock_port]

[Figure: clock waveforms for Period 4 with Waveform {0 2} (the default) and Period 4 with Waveform {1 2}]
[Figure: Clock Objects (a port and a pin)]
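
As a rough illustration of how the period and waveform values translate into edge times, here is a small Python sketch. The function name `clock_edges` and the `cycles` parameter are made up for this example. The default of rising at 0 and falling at half the period follows the Waveform {0 2} default shown for a period of 4.

```python
def clock_edges(period, waveform=None, cycles=2):
    """Return (rise_times, fall_times) for the first `cycles` clock cycles."""
    # Assumed default: rise at 0, fall at half the period.
    rise, fall = waveform if waveform is not None else (0.0, period / 2)
    rises = [rise + n * period for n in range(cycles)]
    falls = [fall + n * period for n in range(cycles)]
    return rises, falls


if __name__ == "__main__":
    print(clock_edges(4))          # default waveform {0 2}
    print(clock_edges(4, (1, 2)))  # waveform {1 2}
```

With a period of 4 and waveform {1 2}, the clock is high for only one time unit per cycle, so the waveform also sets the duty cycle.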

Source Latency
o Source latency:
The latency (delay) from the external clock source until it reaches the clock port of the IP/module.
This becomes very important in a hierarchical flow [1] because it defines the local skew between the modules.
set_clock_latency [-clock <args>] [-rise] [-fall] [-late] [-early] [-source] <latency> <objects>
[-clock <args>]: The clock that the latency applies to.
[-rise] [-fall]: Whether the latency applies to the rising edge or the falling edge.
[-late] [-early]: Whether the latency applies to late or early analysis.
[-source]: Marks the value as a source latency; without it, the value is a network latency. Network latencies are explained later.
<objects>: The clock source object (pin or port).
[Figure: Source Latency, showing a top level with a separate source latency for IP 1 and for IP 2]

[1] : Hierarchical flow refers to when the modules of the design are synthesized and hardened separately and then integrated later in a top module.
Clock Uncertainty

o Clock Uncertainty:
As its name suggests, clock uncertainty defines worst-case values for things we are not certain about regarding the clock.
For example, what is the clock skew between two flip-flops during synthesis? We cannot know yet, because the clock tree has not been built.
Therefore we assume a clock tree skew to be used during synthesis.
Another source of uncertainty is jitter: a real clock source does not generate a clock with a constant period. Instead, the generated clock has an error margin, meaning that the clock period changes over time.
Imagine you have a drum that beats regularly, like a clock.
Normally, it goes "beat, beat, beat" at the same time intervals.
If a beat comes slightly earlier or later than it should, that's jitter [1].
Each time you measure the time between beats (the period), it's a little different.
[Figure: Skew Is Unknown In Synthesis]
[Figure: Ideal Clock vs. Clock with jitter]

[1] : Jitter Part 3: C2C Jitter and Long Term Jitter (youtube.com)
How to Apply Clock Uncertainty

During synthesis the clock tree has not been built, so the uncertainty covers the PLL jitter plus an assumed clock tree skew. The example uses 20% of the clock period, but this value can change depending on the project.
set_clock_uncertainty [expr {$pll_jitter + $clock_period*0.20}] [get_clocks myClk] ;# Uncertainty in synthesis
After CTS the clock tree is built, so we can relax the clock uncertainty and leave a term to account for the routes [1].
set_clock_uncertainty [expr {$pll_jitter + $clock_period*0.10}] [get_clocks myClk] ;# Uncertainty after CTS
After the route stage we know everything about the clock tree, but two sources of uncertainty remain: the PLL jitter and the network jitter [2].
set_clock_uncertainty [expr {$pll_jitter + $clock_period*0.03}] [get_clocks myClk] ;# Uncertainty after route
[Figure: clock uncertainty During Synthesis, After CTS, and After Route]

[1] : After CTS we still do not know the crosstalk and coupling capacitance that will appear after we route the other signal nets, so we need to account for this uncertainty.
[2] : The network jitter is the jitter caused by the varying delays of the cells and routes due to temperature and voltage variations over time. This value is very small and can range from 1% to 5% of the clock period for advanced tech nodes.
Clock Uncertainty

The PLL jitter occurs from one cycle (one edge) to the next. Therefore the PLL jitter is applied only to setup paths, where the launch and capture edges are different, and not to full-cycle hold paths, where the launch and capture edges are the same.
We still apply some uncertainty on full-cycle hold paths as a safety margin.
set_clock_uncertainty -setup [expr {$pll_jitter + $clock_period*0.03}] [get_clocks myClk] ;# Setup
set_clock_uncertainty -hold [expr {$clock_period*0.03}] [get_clocks myClk] ;# Hold
[Figure: Setup Edges for Full Cycle Paths and Hold Edges for Full Cycle Paths]
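
The arithmetic behind these uncertainty values can be sketched in a few lines of Python. This is only an illustration: the function name, the stage keys, and the 4 ns period with 0.1 ns of PLL jitter are hypothetical, and the 20%/10%/3% fractions are simply the ones used in the example commands. The text shows the setup/hold split only with the post-route margin; reusing it at the earlier stages is an assumption of this sketch.

```python
# Margin as a fraction of the clock period, taken from the example commands.
STAGE_MARGIN = {"synthesis": 0.20, "post_cts": 0.10, "post_route": 0.03}


def clock_uncertainty(stage, clock_period, pll_jitter, check="setup"):
    """Return the uncertainty for one design stage and one check type."""
    margin = clock_period * STAGE_MARGIN[stage]
    if check == "setup":
        # Launch and capture edges differ, so the PLL jitter is included.
        return pll_jitter + margin
    if check == "hold":
        # Full-cycle hold: the same edge launches and captures, so no PLL jitter.
        return margin
    raise ValueError(f"unknown check type: {check}")


if __name__ == "__main__":
    period, jitter = 4.0, 0.1  # hypothetical values in ns
    for stage in STAGE_MARGIN:
        print(stage, round(clock_uncertainty(stage, period, jitter), 3))
    print("post_route hold",
          round(clock_uncertainty("post_route", period, jitter, "hold"), 3))
```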
Clock-To-Clock Uncertainty
o The uncertainty between two different clocks can be defined with the -from and -to options of the set_clock_uncertainty command:
set_clock_uncertainty 2.25 -from [get_clocks clk_1] -to [get_clocks clk_2]
set_clock_uncertainty 1.50 -from [get_clocks clk_2] -to [get_clocks clk_1]
Network Latency

o The network latency defines the clock delay within the block, while the source latency defines the delay outside the block.
o Before CTS, network latency can model the expected latency of the clock tree.
o This helps the tool optimize the timing of the design, especially for the I/O paths and the paths between different clocks.
o In the diagram, we can apply a network latency of 2 ns to clk_1 and 5 ns to clk_2, which is equivalent to applying a skew of 3 ns between the two clocks.
set_clock_latency 2.00 [get_ports clk_1] ;# add -source to define a source latency instead
set_clock_latency 5.00 [get_ports clk_2]
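
To make the relationship between the two latencies and skew concrete, here is a minimal Python sketch. It assumes the usual model in which the clock arrival at a register is the source latency plus the network latency. The function names are hypothetical, and equal (zero) source latencies are assumed for both clocks.

```python
def total_clock_latency(source_latency, network_latency):
    """Clock arrival at a register: delay outside the block plus delay inside it."""
    return source_latency + network_latency


def skew_between(latency_a, latency_b):
    """Skew seen between two clock arrivals (positive if b arrives later)."""
    return latency_b - latency_a


if __name__ == "__main__":
    # Network latencies from the example: 2 ns for clk_1, 5 ns for clk_2.
    clk_1 = total_clock_latency(source_latency=0.0, network_latency=2.0)
    clk_2 = total_clock_latency(source_latency=0.0, network_latency=5.0)
    print(skew_between(clk_1, clk_2))
```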
Generated Clock

Generated clocks can be identified automatically by the STA tools or defined manually by the designer.
To define a generated clock manually:
o create_generated_clock -source [get_ports clk_1] -divide_by 2 -add -name CLK2 [get_pins clock_div/Q]
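
What -divide_by 2 means for the waveform can be sketched in Python. This assumes an ideal divider with no delay, a master clock whose first rising edge is at time 0, and a 50% duty cycle on the divided clock. The function name and the 4 ns master period are hypothetical.

```python
def generated_clock_edges(master_period, divide_by, cycles=2):
    """Return (period, rise_times, fall_times) of an ideal divided clock."""
    period = master_period * divide_by
    rises = [n * period for n in range(cycles)]
    falls = [n * period + period / 2 for n in range(cycles)]
    return period, rises, falls


if __name__ == "__main__":
    # A 4 ns master clock divided by 2 gives an 8 ns generated clock.
    print(generated_clock_edges(master_period=4, divide_by=2))
```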
I/O Budgeting

The current block communicates with other blocks through the I/O ports.
To fully run STA on the block, we need to inform the tool about the delays in the other blocks so that we still meet timing when the blocks are integrated.
This is called I/O budgeting or I/O delays.
The delay coming into the input ports is called the input delay and it consists of:
o Clock-to-Q delay of the launching FF.
o Combinational path delay in the launching block.
o Net delay between the two blocks.
The delay going out of the output ports is called the output delay and it consists of:
o Setup time of the capturing FF.
o Combinational path delay in the capturing block.
o Net delay between the two blocks.
Most designers set 50% of the clock period as the I/O delay, but this can change from one project to another.
[Figure: Input Delay and Output Delay]
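
The budget arithmetic can be sketched as follows. This Python example uses the conventional definitions: input delay as clock-to-Q plus external logic plus net delay, and output delay as external logic plus net delay plus the capturing flop's setup time. All function names and numbers are hypothetical.

```python
def input_delay(clk_to_q, comb_delay, net_delay):
    """Delay outside the block before data reaches an input port."""
    return clk_to_q + comb_delay + net_delay


def output_delay(setup_time, comb_delay, net_delay):
    """Delay outside the block after data leaves an output port."""
    return setup_time + comb_delay + net_delay


def time_left_in_block(clock_period, io_delay):
    """Part of the cycle that remains for logic inside the current block."""
    return clock_period - io_delay


if __name__ == "__main__":
    period = 4.0  # hypothetical clock period in ns
    in_dly = input_delay(clk_to_q=0.2, comb_delay=1.0, net_delay=0.3)
    print(round(in_dly, 3), round(time_left_in_block(period, in_dly), 3))
    # Rule of thumb from the text: budget 50% of the period for the I/O delay.
    print(period * 0.5)
```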
I/O Budgeting: Effect of Clock Latency

The only terms missing are the clock latencies to the launching and capturing flip-flops.
There are different ways to handle the latency when dealing with I/Os:
o Apply zero latency and include the latency in the I/O delay value [1].
o Specify the latency on the ports manually.
o Apply the median clock latency.

[1] : Reference - https://cdrdv2-public.intel.com/655075/an554.pdf
I/O Budgeting: Effect of Clock Latency (Zero Latency)

In the case of zero latency, you have to account for the latency inside the I/O delay value:
o For input delay: the latency is added to the combinational delay value.
o For output delay: the latency is subtracted from the combinational delay value.
The command to apply the I/O delay while including the latency:
o set_input_delay 6.5 -clock CLK1 -network_latency_included [all_inputs]
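
The add/subtract rule above can be checked with a small Python sketch. The function names are hypothetical. The numbers (1.5 ns input delay, 2 ns output delay, 5 ns latency) are borrowed from the example commands in this document, on the assumption that the 6.5 in the command above is 1.5 + 5.

```python
def input_delay_with_latency(comb_delay, clock_latency):
    """Zero-latency approach: fold the clock latency into the input delay."""
    return comb_delay + clock_latency


def output_delay_with_latency(comb_delay, clock_latency):
    """Zero-latency approach: the latency is subtracted for output delays."""
    return comb_delay - clock_latency


if __name__ == "__main__":
    print(input_delay_with_latency(1.5, 5.0))
    print(output_delay_with_latency(2.0, 5.0))
```

Under this rule an output delay can become negative when the latency is larger than the external delay.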
I/O Budgeting: Effect of Clock Latency (Manual Latency)

The second approach is to apply a fixed value for the clock latency.
To apply the I/O delay without including the latency and then apply the fixed latency:
o set_input_delay 1.5 -clock CLK1 [get_ports my_port]
o set_output_delay 2 -clock CLK1 [get_ports my_port]
o set_clock_latency 5 [get_clocks CLK1]    (see note [1])
To apply the latency on a specific port:
o set_clock_latency 5 -clock [get_clocks CLK1] [get_ports my_port]
[Figure: SNPS Timing Report : Fixed Ideal Latency, with ideal latency applied on ports and propagated latency applied on clock pins]

[1] : The tools will use this value for ports, while the rest of the design will use the actual clock tree values (if the clock is set as propagated and not ideal).
I/O Budgeting: Effect of Clock Latency (Median Latency)

The third approach is to use the median latency of all the registers in the design after building the clock tree.
To apply the I/O delay without including the latency and then apply the median:
o set_input_delay 1.5 -clock CLK1 [all_inputs]
o set_output_delay 2 -clock CLK1 [all_outputs]
o Then, after CTS:
o compute_clock_latency    (see note [1])
[Figure: SNPS Timing Report : After Latency Update]

[1] : The command works for SNPS ICC2 and Fusion Compiler.
I/O Budgeting: Input Transition And Load Capacitance

Calculating the delay of the cells right at the input port requires the input transition time. Unless specified, the tool assumes an ideal (zero) transition time, which might cause timing violations when the blocks are integrated.
There are 2 ways to manually define the input transition:
o Using a fixed numeric value
set_input_transition 0.75 [get_ports DATA_IN*]
o Using the drive strength of a cell (for example, assume the cell that drives the port is a buffer of size 4)
set_driving_cell -lib_cell BUF_4 [get_ports DATA_IN*]

Similarly, the cell that drives the output port needs the load capacitance value to calculate its delay.
There are 2 ways to define the load capacitance:
o Using a fixed capacitance value
set_load 3 [all_outputs]
o Using the input pin capacitance of a cell (for example, assume the load cell is a buffer of size 8)
set pin_cap [get_attribute [get_lib_pins tech_lib/BUF_8/A] pin_capacitance]
set_load $pin_cap [all_outputs]
Path Based Constraints

False Paths

Applying false path constraints requires a good understanding of the functional operation of the circuit.
set_false_path -from FF1/CP -through {MUX_1/D0} -through {MUX_2/D1} -to FF3/D
set_false_path -from FF2/CP -through {MUX_1/D1} -through {MUX_2/D0} -to FF3/D
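
One way to see why such paths are false is to enumerate the select values. The Python sketch below assumes, hypothetically, that MUX_1 and MUX_2 share a single select signal and that input D0 is selected when the select is 0 and D1 when it is 1. The original diagram is not available here, so treat this as an illustration of the reasoning and not a description of the actual circuit.

```python
def sensitizable(mux1_input, mux2_input):
    """True if one value of the shared select passes both chosen MUX inputs."""
    # The path needs sel == mux1_input at MUX_1 and sel == mux2_input at MUX_2.
    return any(sel == mux1_input and sel == mux2_input for sel in (0, 1))


if __name__ == "__main__":
    for m1 in (0, 1):
        for m2 in (0, 1):
            status = "real" if sensitizable(m1, m2) else "false path"
            print(f"MUX_1/D{m1} -> MUX_2/D{m2}: {status}")
```

Under these assumptions the D0-to-D1 and D1-to-D0 combinations can never be active together, which matches the two -through pairs in the commands above.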

False Paths: Hold on I/O Ports

Some designers apply a false path for the hold analysis on the I/O ports of sub-blocks during a hierarchical flow:
o set_false_path -hold -from [all_inputs]
o set_false_path -hold -to [all_outputs]
This is because hold can be easily fixed in the top module by adding buffers.
To understand this, let's consider two scenarios in which two engineers are each working on their own block:
Scenario 1:
1. They added buffers on the I/O ports to fix hold violations.
2. After integration in the top module, a setup violation was found.
3. Each engineer had to reopen the block, remove the buffers, reroute the nets, fix any DRCs that appeared, then write the GDS and create new design libs.
4. The top module engineer waited until both engineers finished, then integrated the design again.
Scenario 2:
1. Hold was ignored on the I/O ports.
2. After integration, a hold violation was found.
3. The top module engineer simply added a few buffers in the top module. No redesign was needed for the sub-blocks.
As you can see, ignoring hold on the I/O ports saves a lot of work and time.
[Figure: Scenario 1 (Setup Violation) and Scenario 2 (Hold Violation)]
Multi-Cycle Path

Multi-cycle paths are timing paths that take more than one clock cycle.
To set a multi-cycle path:
set_multicycle_path -setup <MULTIPLIER> -from {FF1/CLK} -to {FF2/D} ;# MULTIPLIER = 3 in the diagram
By default, STA tools place the hold check edge one cycle before the setup capture edge.
We fix this by instructing the tool to move the hold check back by N-1 cycles, where N is the setup multiplier, so that hold is checked at the launch edge again.
set_multicycle_path -hold <MULTIPLIER-1> -from {FF1/CLK} -to {FF2/D} ;# MULTIPLIER-1 = 2 in the diagram
[Figure: FF1/CLK and FF2/CLK waveforms showing the setup and hold edges]
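
The edge arithmetic can be sketched in Python. This assumes a single clock with the launch edge at time 0, and models the default hold check as one cycle before the setup capture edge, with the -hold multiplier moving it back by that many cycles. The function name and the 4 ns period are hypothetical.

```python
def multicycle_edges(period, setup_mult, hold_mult=0):
    """Return (setup_capture_edge, hold_check_edge) for a launch edge at time 0."""
    setup_edge = setup_mult * period
    # Default hold edge is one cycle before the setup edge; -hold moves it back.
    hold_edge = (setup_mult - 1 - hold_mult) * period
    return setup_edge, hold_edge


if __name__ == "__main__":
    print(multicycle_edges(4, 3))     # -setup 3 only
    print(multicycle_edges(4, 3, 2))  # -setup 3 and -hold 2
```

With only -setup 3, the hold check sits two cycles after the launch edge, which forces data to be held for two extra cycles. Adding -hold 2 brings the check back to time 0.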
Max/Min/Skew Delays

As discussed in the previous part of this document, we may have max, min, or skew constraints on the design paths.
To apply these constraints:
set_max_delay -from FF/CLK -to Instance/A 30
set_min_delay -from FF/CLK -to Instance/A 10
There is no command to apply a skew constraint on a bus [1]. You need to use TCL commands to report whether there is a skew violation.
The next slide shows an example TCL script that performs the skew check.
[Figure: Max/Min Constraints and Skew Constraints]

[1] : That is in PrimeTime. Other tools may have commands for skew constraints.
Max/Min/Skew Delays: Skew Check Script
# Define skew constraint
set skew_constraint 3 ;# in ns

# Initialize min and max arrival times
set max 0
set min 999999 ;# Use a large number as an initial "infinity" value

# Get the collection of pins (braces stop Tcl from treating [*] as a command)
set my_pins [get_pins {DATA[*]}]

# Iterate over each pin in the collection
foreach_in_collection pin $my_pins {
    # Get the timing path to the pin
    set path [get_timing_paths -to $pin]
    # Get the arrival time of the timing path
    set arrival [get_attribute $path arrival]
    # Update min and max arrival times
    if { $arrival < $min } {
        set min $arrival
    }
    if { $arrival > $max } {
        set max $arrival
    }
}

# Check if the skew constraint is violated
if { ($max - $min) > $skew_constraint } {
    puts "ERROR: Skew Violation"
} else {
    puts "INFO: Skew Passed"
}
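
The logic of the skew check can also be tried outside the STA tool. The Python sketch below applies the same min/max comparison to a plain list of arrival times. The function name and the arrival values are hypothetical, and no tool API is involved.

```python
def check_skew(arrivals, skew_constraint):
    """Return (skew, passed) for a list of per-pin arrival times."""
    skew = max(arrivals) - min(arrivals)
    return skew, skew <= skew_constraint


if __name__ == "__main__":
    bus_arrivals = [10.0, 11.5, 12.0]  # hypothetical arrival times in ns
    skew, passed = check_skew(bus_arrivals, skew_constraint=3)
    print("INFO: Skew Passed" if passed else "ERROR: Skew Violation", skew)
```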
Modes, Corners, And Scenarios

Modes
A mode in STA refers to a specific functional condition under which the timing analysis is performed.
Mode examples: functional mode, scan test mode, low-power mode.
Each mode represents a distinct set of constraints and conditions that affect the timing behavior of the design.
For example, two modes could differ in the clock frequency.
create_mode high_freq_mode
create_clock -period 3.5 [get_ports my_clock_port]
create_mode low_freq_mode
create_clock -period 7.5 [get_ports my_clock_port]
If the two clocks enter from different ports and go through a clock MUX, we have to use the set_case_analysis command.
set_case_analysis sets specific conditions for signal values during the analysis. It fixes certain signals to specified logic values (0 or 1) to model specific operational modes.
In the diagram, we need to set CLK_SEL to 0 in the high-frequency mode and to 1 in the low-frequency mode.
create_mode high_freq_mode
create_clock -period 3.5 [get_ports my_clock_port]
set_case_analysis 0 [get_ports CLK_SEL]

create_mode low_freq_mode
create_clock -period 7.5 [get_ports my_clock_port]
set_case_analysis 1 [get_ports CLK_SEL]
[Figure: clock MUX with inputs 0 and 1 selected by CLK_SEL]
Modes: Scan Modes
Another use of set_case_analysis is to enable or disable scan modes.
In functional mode we want the FF to receive its input through the D pin.
In scan mode we want the FF to receive its input through the SI (scan input) pin.
This behavior is selected by setting the scan enable port/pin to the correct value:
create_mode functional_mode
create_clock -period 3.5 [get_ports my_func_clk_port]
set_case_analysis 0 [get_ports scan_enable]

create_mode scan_mode
create_clock -period 10 [get_ports my_scan_clk_port]
set_case_analysis 1 [get_ports scan_enable]
[Figure: scan flip-flop input MUX with D on input 0 and SI on input 1]
PVT Corners

PVT stands for process, voltage, and temperature. PVT corners describe the different fabrication, operating, and environmental conditions that affect the chip.
Process: Systematic and large variations in parameters such as doping and oxide thickness can affect entire parts of a wafer, resulting in all instances within the chip being faster or slower than average.
Voltage: Differences in the supply voltage provided to the entire chip. For example, a chip can have two operating voltages: 3 V for high-performance mode and 2 V for low-power mode. The voltage affects the performance of the cells.
Temperature: The ambient temperature where the chip will be operated. For example, the chip may operate in a cold environment or in a hot car engine (120 °C). The temperature also affects the performance of the cells.
We run STA on all the different PVT corners to ensure the chip can operate correctly under all conditions.
[Figure: MOSFET Current vs Voltage and Temp [1]]

[1] : Reference : New analytical model for nanoscale tri-Gate SOI MOSFETs including quantum effects
Scenarios

An STA scenario is a combination of a corner and a mode.
In general we run all the possible combinations. So, if there are 3 modes and 4 corners, we run 12 scenarios.
Some combinations can be excluded. For example, if scan testing is done only in a controlled lab environment with low temperature, then the high ambient temperature scenarios can be excluded for scan modes.
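
The scenario count is just the product of modes and corners, minus any combinations that are excluded. A minimal Python sketch follows. The mode names come from the examples earlier in this document, while the four corner names and the helper function are hypothetical.

```python
from itertools import product

modes = ["functional", "scan", "low_power"]
corners = ["slow_cold", "slow_hot", "fast_cold", "fast_hot"]  # hypothetical names


def build_scenarios(modes, corners, exclude=lambda mode, corner: False):
    """Return every (mode, corner) pair that is not excluded."""
    return [(m, c) for m, c in product(modes, corners) if not exclude(m, c)]


all_scenarios = build_scenarios(modes, corners)
# Drop hot corners for scan mode, as in the lab-testing example.
pruned = build_scenarios(
    modes, corners, exclude=lambda m, c: m == "scan" and c.endswith("_hot")
)

if __name__ == "__main__":
    print(len(all_scenarios), len(pruned))
```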
Additional Topics

Path Groups

The design consists of several modules and blocks that communicate with each other.
We can organize the timing paths between them into named path groups.
For example, create a group for all timing paths going from module_1 to module_3:
group_path -name FROM_MD1_TO_MD3 -from [get_pins -hier module_1/*CK] -to [get_pins -hier module_3/*D]
Benefits of Path Grouping:
Simplified Analysis: Easier and better analysis of the design, allowing identification of which module is causing more violations and needs redesign.
Team Collaboration: Enables multiple engineers to analyze the same design by assigning each engineer specific groups to analyze.
Giving Higher Priority For Optimizations: Timing paths within a group are optimized together by the tool. You can apply a higher weight/priority to a specific group so that the tool focuses more on it. The higher the weight, the higher the effort applied to the group.
group_path -name INOUT -weight 5 -from [all_inputs] -to [all_outputs]
Graph-Based Analysis (GBA) vs Path-Based Analysis (PBA)

Consider the example shown: we have 2 timing paths going through the same AND gate and then a FF.
The upper path:
o has a long combinational delay (3 ns) but has a strong driver at the end.
o The strong driver provides a better transition time for the AND gate, resulting in a delay of 1 ns within the AND.
The lower path:
o has a shorter combinational delay (2 ns) but has a weak driver at the end.
o The weak driver provides a bad transition time for the AND gate, resulting in a delay of 2 ns within the AND.
To save runtime, the tool does not track a separate transition time for each path through the AND gate. It uses a single transition time for both paths: the worst one (the one resulting in the 2 ns delay).
This means the upper path will, falsely, have a delay of (3 ns + 2 ns) = 5 ns, which violates setup.
When the tool uses a single worst transition time we call this graph-based analysis (GBA), and when it uses the actual transition time of each path we call this path-based analysis (PBA).
GBA is used in most of the flow to reduce runtime. During signoff, we run STA with PBA to remove the pessimism and avoid fixing false violations.
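
The pessimism in this example can be reproduced with a few lines of Python. This is a simplified model, not how an STA engine is implemented: each path is reduced to its combinational delay plus the AND gate delay caused by its own driver, and GBA is modeled as applying the worst AND delay to every path. The dictionary layout and function names are hypothetical.

```python
# Delays in ns, taken from the example in the text.
paths = {
    "upper": {"comb": 3, "and_delay": 1},  # strong driver, good transition
    "lower": {"comb": 2, "and_delay": 2},  # weak driver, bad transition
}


def pba_delays(paths):
    """Path-based: each path uses the AND delay from its own transition."""
    return {name: p["comb"] + p["and_delay"] for name, p in paths.items()}


def gba_delays(paths):
    """Graph-based: every path uses the worst AND delay seen at the gate."""
    worst = max(p["and_delay"] for p in paths.values())
    return {name: p["comb"] + worst for name, p in paths.items()}


if __name__ == "__main__":
    print("GBA:", gba_delays(paths))
    print("PBA:", pba_delays(paths))
```

In this model the upper path reads 5 ns under GBA but 4 ns under PBA, while the lower path is 4 ns in both.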
Thank You!
