Clock Tree Synthesis (CTS)
Prerequisites for CTS
● The placement of standard cells and optimization are done.
● Power and Ground nets are pre-routed.
● Acceptable congestion.
● Acceptable timing.
● Estimated max tran and max cap values without violations.
● High fanout nets other than clock such as scan-enable, reset are synthesized.
● Remove the don’t use attribute from clock buffers & inverters.
● Check whether all pre-existing cells in clock path are clock cells.
Difference between clock Buffer and normal Buffer
● A normal buffer has unequal rise and fall times, whereas a clock buffer has equal rise and fall times.
● The rise and fall times of a normal buffer differ because nmos is faster than pmos (the mobility of electrons is higher than the mobility of holes).
● CMOS is a combination of pmos and nmos, where pmos is the pull-up device and nmos is the pull-down device.
● In a normal buffer the rise time is longer than the fall time, which can result in min pulse width violations.
● To avoid this, a clock buffer connects two pmos transistors in parallel, which increases the size of the buffer.
● As a result, a clock buffer consumes more power than a normal buffer.

[Figure: unequal rise and fall times (normal buffer) versus equal rise and fall times (clock buffer)]
Why CTS?
● Every sequential cell in a design is operated/synchronized by the clock. CTS is performed to balance the skew and minimize insertion delay.
● Before CTS, all the clock pins are driven by the same clock source, which therefore has a high fanout and a high load.
● In CTS, buffers/inverters are inserted in the clock path to balance the clock delay to all clock pins.
● The clock nets are synthesized.
Difference between HFNS and CTS
● In HFNS buffering is done to meet the DRV constraints while in CTS buffering is done
in clock path to meet DRV constraints, minimum insertion delay and balanced skew.
● HFNS uses normal buffers and inverters whereas CTS uses clock buffers and
inverters.
● HFNS is used mostly for reset, scan enable and other static signals having high fanouts
whereas CTS is used for high switching signals with high fanouts such as clock.
● Clock cells consume more power than normal cells.
● NDR rules are applied in CTS but not in HFNS.
Inputs for CTS
● Placement DEF
● Netlist(.v)
● Libraries(.lib & .lef)
● SDC
● TLU+
● UPF
● Multi-Mode and Multi-Corner(MMMC) file
● SPEC file
○ Target latency & skew
○ Skew groups
○ Reference cells list
○ NDR rules & routing layers
○ Clock tree exceptions
○ Max transition & max capacitance

Note:

Different types of Vt cells have different process variations, which cause over- or under-pessimism in the buffer delays used in the clock tree. This may lead to issues such as setup violations and clock frequency reduction. To avoid this, we use cells with the same Vt (lvt) in the clock tree.
Goals of CTS

Balanced skew and minimum latency
CTS Flow

Read SDC → Generate CTS SPEC → Compile CTS using SPEC → Place Clock Tree Cells → Route Clock Tree
Setting Clock Tree Constraints
● Setting Target Values
set_clock_tree_options \
-target_skew value \
-target_latency value
● Setting Reference Cells
set_lib_cell_purpose -include cts \
{tech_lib/clk_buf* tech_lib/clk_inv*}
● Setting clock cell spacing rules
set_clock_cell_spacing \
-x_spacing value \
-y_spacing value \
-name
● Setting DRV’s
set_max_transition value \
-clock_path [get_clocks clk_name]
set_max_capacitance value \
-clock_path [get_clocks clk_name]
Clock Latency
● The time taken by the clock to reach the clock pin from its source.
● The clock latency is the sum of source latency and network latency.
● Source latency is the time taken by the clock from its source to the clock definition
point.
● Network latency is the time taken by the clock from its definition point to the clock pin.
● Based on the clock arrival to flip flop in reg2reg path, latency is divided as:
○ Launch clock latency
○ Capture clock latency
● Capture clock latency is the time taken by the clock to reach the clock pin of capture flip
flop.
● Launch clock latency is the time taken by the clock to reach the clock pin of launch flip
flop.
Clock Latency
If each buffer offers 10ps delay,
Source latency=40ps
Network latency=10ps for FF1
Network latency=30ps for FF2
Capture clock latency=40ps+30ps=70ps
Launch clock latency=40ps+10ps=50ps
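
The latency arithmetic above can be reproduced with a short script. This is only a sketch: the function name `clock_latency` is hypothetical, and the buffer counts (4 buffers before the clock definition point, 1 buffer to FF1 and 3 buffers to FF2) are inferred from the 10ps-per-buffer figures on this slide, not read from any tool.

```python
BUFFER_DELAY_PS = 10  # each buffer is assumed to add 10ps, as stated above


def clock_latency(source_buffers, network_buffers, buffer_delay_ps=BUFFER_DELAY_PS):
    """Return (source, network, total) latency in ps for chains of identical buffers."""
    source = source_buffers * buffer_delay_ps
    network = network_buffers * buffer_delay_ps
    return source, network, source + network


launch = clock_latency(4, 1)   # FF1 is the launch flip flop
capture = clock_latency(4, 3)  # FF2 is the capture flip flop
print("Launch clock latency:", launch[2], "ps")
print("Capture clock latency:", capture[2], "ps")
```

Total latency is always the sum of the source part and the network part, so changing either buffer count shifts the launch or capture latency independently.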

Insertion Delay:
● Delay obtained after inserting buffers or inverters in the clock path.
● Insertion delay is the time the clock signal takes to travel from the clock definition point to the sink pin.
Clock Skew
The difference in arrival times of clock signal to different sequential cells.

● Global Skew: Difference between shortest clock path delay and longest clock path
delay irrespective of the communication between them.
● Local Skew: Difference between the arrival of clock to two communicating sequential
cells.
● Positive Skew: Capture clock arrives later than the launch clock.
● Negative Skew: Capture clock arrives earlier than the launch clock.

Skew = Capture clock latency - Launch clock latency

More positive skew ⇒ Improves setup & degrades hold timing

More negative skew ⇒ Improves hold & degrades setup timing

Skew balancing is needed to meet both setup and hold timing.
Clock Skew
If each buffer offers 10ps delay.

● Local skew: FF1 and FF2
Clock arrival to FF2 - clock arrival to FF1
60ps - 30ps = 30ps

● Local skew: FF3 and FF4
Clock arrival to FF4 - clock arrival to FF3
70ps - 50ps = 20ps

● Global Skew:
Longest clock path - Shortest clock path
Clock arrival to FF4 - clock arrival to FF1
70ps - 30ps = 40ps

Global skew is the maximum skew for a particular clock.
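
The skew numbers above follow directly from the four clock arrival times. The sketch below recomputes them; the dictionary `arrival_ps` and the helper names are illustrative only, with the arrival values taken from this slide.

```python
# Clock arrival times at each flip flop, in ps (values from the example above)
arrival_ps = {"FF1": 30, "FF2": 60, "FF3": 50, "FF4": 70}


def local_skew(arrivals, launch, capture):
    """Skew between two communicating flops: capture arrival minus launch arrival."""
    return arrivals[capture] - arrivals[launch]


def global_skew(arrivals):
    """Longest clock path minus shortest clock path, regardless of connectivity."""
    return max(arrivals.values()) - min(arrivals.values())


print("Local skew FF1->FF2:", local_skew(arrival_ps, "FF1", "FF2"), "ps")
print("Local skew FF3->FF4:", local_skew(arrival_ps, "FF3", "FF4"), "ps")
print("Global skew:", global_skew(arrival_ps), "ps")
```

A positive result from `local_skew` corresponds to positive skew (capture clock later than launch clock), a negative result to negative skew.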


Clock Skew
● Capture clock latency=50ps
Launch clock latency=30ps
Skew=50ps-30ps=20ps, which is a positive skew.

● Capture clock latency=30ps
Launch clock latency=40ps
Skew=30ps-40ps=-10ps, which is a negative skew.


Clock Tree Exceptions
● Sink Pin
○ These are the endpoints of the clock tree which
are used for delay balancing.
○ Tool uses sink pins in calculation & optimization
for both DRC & clock tree timing.
Examples:
○ Clock pins of a flip flop.
○ Clock pin of macro/IP.
● Through pin/ Non-stop pin
○ Non-stop pins trace through the endpoints that
are normally considered as endpoints of the
clock tree.
Examples:
○ Clock pin of a flip flop which is driving the
generated clock.
○ Clock pin of an integrated clock gating (ICG) cell.
Clock Tree Exceptions
● Ignore Pin
○ These are pins on the clock net that will not be considered as sinks.
○ The clock net is buffered up to the ignore pin but not beyond it.
○ The tool isolates the pin from the clock tree by adding a guide buffer before it.

Examples:
○ Source pins of clock trees in the fanout of another clock.
○ Non-clock input pins of sequential cells.
○ Output ports
● Exclude Pin
○ These are similar to ignore pins, but the clock net will not be buffered up to an exclude pin.
Clock Tree Exceptions
● Float Pin
○ These are the clock pins that have special
insertion delay requirements and balancing is
done according to the delay.

Example:
○ Clock pin of a macro with internal sink.
● Preserve Pin
○ Specifies that the clock tree built beyond that pin must not be disturbed, or prevents the tool from creating additional ports on a cell instance.
○ Sets that part of the clock tree as don’t touch.
○ Freezes the ports/pins of a cell instance.
CTS Exception Summary
Defining clock Tree Exceptions
These are user-defined changes to the default endpoints derived by the tool for a specific clock.

● Defining sink pins


● Insertion delay
● Ignore pins

Command:

set_clock_balance_points \
-clock clk_name \
-consider_for_balancing {true|false} \
-delay {value} \
-balance_points {pins}
Controlling Clock Propagation
● During CTS, tool uses the is_clock_used_as_clock attribute to identify pins of the clock network.
● If this attribute is true, tool considers the pin to be part of the clock network during CTS.
● If this attribute is false, the tool adds a guide buffer that isolates this pin from the rest of the clock network and marks the input of the guide buffer as an explicit ignore pin.
● If the clock network ends at pins that have an is_clock_used_as_clock attribute setting of false, the tool isolates these pins from the clock network and from CTS optimization, but still optimizes them to fix DRC or other constraint violations during datapath optimization.
● To specify the unateness at a pin with respect to the clock source and stop the propagation of clock in
clock network or data network:
set_sense
[-type clock | data]
[-non_unate]
[-primary]
[-generated]
[-positive]
[-negative]
[-stop_propagation]
[-clock_leaf]
[-clocks clock_list] object_list
Controlling clock propagation
● set_sense command restricts/controls the propagation of clock forward from the specified pins.
○ -type specifies whether the sense applied to clock or data network.
○ -non_unate launches both positive and negative unate sense clock from the clock source.
○ -primary specifies primary/master clock signal type which is valid for clock definition point.
○ -generated specifies the generated clock signal type which is valid for clock definition point.
○ -positive applies positive unate to all the pins in the object_list with respect to clock source.
○ -negative applies negative unate to the pins in the object_list with respect to clock source.
○ -stop_propagation stops the propagation of the clocks specified in clock_list from the pins specified in the object_list. This sets the is_clock_used_as_clock attribute to false for that particular pin and ignores clock gating checks on the clock network leading up to the pin.
○ -clock_leaf stops the propagation of the specified clocks beyond the pins specified in object_list and considers these pins as explicit sink pins. This sets the is_clock_used_as_clock attribute to true for that particular pin and performs clock gating checks on the clock network leading up to the pin.
○ -clocks clock_list specifies the list of clock objects to be applied with given unateness and
propagation restriction.
○ object_list specifies the list of pins, ports or cell timing arcs with specified unateness to propagate.
Skew Groups
● A skew group is a group of sinks that need to be balanced against each other, whether they have the same or different clock sources.
● By default, the tool groups the sinks of the same clock for balancing.
● Sinks that are driven by different clocks must be grouped together using a skew group in order to be balanced.
● If some logic has a large insertion delay compared to the rest of the same clock domain, we define that logic as a separate skew group to avoid inserting clock cells in the shorter paths.

Command:

create_clock_skew_group -name sg1 \
-objects {list_of_clock_pins}
Skew Groups
create_clock_skew_group -name SG1 \
-objects {clk1 clk2}
[Figure: clock trees of clk1, clk2 and clk3]
● By default, the tool balances clk1, clk2 and clk3 individually, each among its own sinks, which is intra-skew balancing.
● We create a separate skew group SG1 for the clk1 and clk2 logic because they have communicating registers, which is inter-skew balancing.
● As clk1 and clk2 belong to one skew group, the tool tries to balance the skew between them.
Skew Balancing
[Figure: flip flops FF1, FF2 and FF3 with their D, Q and CLK pins, driven from clk]
● Skew will be balanced globally within each clock domain across all clock pins for both master and generated clock.
● If the divided clock domain is independent of the master domain, then skew balancing may not be important.

Example:
If there is a path from FF2 to FF3, skew should be balanced between FF2 and FF3.

If there is no path between FF2 and FF3, we explicitly define the CLK pin of FF1 as an ignore pin, which ignores the skew balancing between FF2 and FF3.
Clock Tree Routing Rules
As clock is a high switching signal and drives all the sequential cells, it needs to be
routed carefully to avoid crosstalk and electro-migration(EM).
● Shielding
Routing ground lines alongside the signal line to minimize signal interference (crosstalk) between nets.
● Command:
create_shields -with_ground Vss \
-shielding_mode
-nets {list_of_nets}
Clock Tree Routing Rules
● Non Default Rules(NDR)
Routing the nets with a multiple of the default width and spacing (for example double width and double spacing) to avoid crosstalk and EM.
● Command:
create_routing_rule -name RR1\
-multiplier_spacing 2 \
-multiplier_width 2
set_routing_rule -rule RR1 \
-min_routing_layer M6\
-max_routing_layer M10\
[get_nets Clk]
CTS Algorithms
● H-Tree based algorithm
● RC tree based algorithm
● Geometric Matching algorithm(GMA)
● PI configuration
Comparison between CTS algorithms
● RC based Algorithm
○ Routes an individual clock net to each sink and balances the RC delay.
○ This leads to excessive power consumption and a large RC on each net.
● H-Tree
○ Clock routing takes the shape of the letter H.
○ In this approach the distance from the clock source point to each sink pin is the same.
○ The initial trunk is thicker than the rest of the clock tree to reduce clock delay.
● Geometric Matching Algorithm
○ If the physical locations of the sub-modules are not symmetrical, implementing an H-tree is complicated.
○ Instead, H-like structures are implemented: the sub-modules are grouped based on distance and then balanced.
○ Here the distance between the clock source point and the sink pins is not the same.
Comparison between CTS algorithms
● Pi configuration
○ In this configuration the total number of buffers inserted along the clock path at each level is a multiple of the number at the previous level.
○ Uses the same number of buffers and the same wire geometry on each branch, and relies on matching the delay components at each level of the clock.

Based on the block complexity/criticality, the clock is routed with a suitable algorithm.
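
The equal-distance property of the H-tree can be illustrated with a small geometric model. This is a sketch of an idealized H-tree only (symmetric block, no blockages, wire length used as a stand-in for delay); the function `h_tree_sinks` and the dimensions are made up for illustration and do not represent how a CTS tool builds its tree.

```python
def h_tree_sinks(x, y, half_width, half_height, levels, path_len=0.0):
    """Return [(sink_x, sink_y, wire_length_from_root)] for an ideal H-tree.

    Each level draws one "H": from the centre, go left/right by half_width
    and then up/down by half_height, reaching four new centres. The next
    level repeats the pattern with halved dimensions.
    """
    if levels == 0:
        return [(x, y, path_len)]
    sinks = []
    for dx in (-half_width, half_width):        # horizontal bar of the H
        for dy in (-half_height, half_height):  # vertical legs of the H
            sinks += h_tree_sinks(
                x + dx, y + dy,
                half_width / 2, half_height / 2,
                levels - 1,
                path_len + abs(dx) + abs(dy),
            )
    return sinks


sinks = h_tree_sinks(0.0, 0.0, 8.0, 8.0, levels=3)
lengths = {length for _, _, length in sinks}
print(len(sinks), "sinks, distinct source-to-sink wire lengths:", lengths)
```

Because every branch repeats the same geometry, the set of source-to-sink lengths collapses to a single value, which is why an ideal H-tree has zero skew by construction.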
Pre CTS Checks
Before synthesizing the clock trees, verify that the clock trees are properly defined using:
check_clock_tree -clocks [get_clocks clk_name]
It verifies:
● Clock with no sinks
● Loops in the clock network
● Multiple clocks reach the same register
● Ignored clock tree exceptions
● Buffers with multiple timing arcs used in clock tree references
● Situations that cause empty buffer list
● Generated clock without valid master clock source
○ Specified master clock does not exist
○ Specified master clock does not drive the generated clock
○ Source pin of generated clock is driven by multiple clocks, but some of the master clocks are not specified in the generated clock definition.
Clock Tree Synthesis
● synthesize_clock_trees
○ Clock Tree Synthesis

Performs virtual routing of the clock nets and uses RC estimation to determine clock net timing
and then clock tree synthesis.

○ Clock Tree Optimization

Performs global routing on clock nets and uses RC extraction to determine clock net timing. During optimization, the tool performs incremental global routing on clock nets.

It does not detail route the clock trees.


Clock Tree Synthesis
● clock_opt
○ build_clock
Synthesizes and optimizes all clocks in all modes of all active scenarios and sets the
synthesized clocks as propagated.

○ route_clock
During this stage, the tool detail routes the clock. Usually clock nets are routed in the middle layers for better delays.

○ final_opto
Performs further optimization, timing-driven placement and legalization. It then global routes the block and performs extensive global-route-based optimization, which includes incremental legalization and route patching.
Command: clock_opt -from {stage_name} -to {stage_name}
Optimization During CTS
● Concurrent Clock and Data Optimization(CCDO)
● Clock Tree Optimization(CTO)
○ Buffer and Gate Sizing
○ Buffer Relocation
○ Level adjustment
○ HFN synthesis
○ Delay insertion
Concurrent Clock and Data Optimization(CCDO)
● Applying useful skew techniques during datapath optimization to improve the timing
QoR by taking advantage of positive slack and adjusting the clock arrival times of the
registers.
● It performs mainly clock pulling and clock pushing by considering the margins to meet
the timing.
● Gives better QoR but latency and skew values may change.
● As clock buffers and inverters are large in size, area utilization increases.
● As the number of clock cells increases, the power consumption increases.
Concurrent Clock and Data Optimization(CCDO)
Useful Skew:
Borrowing the skew from adjacent registers to meet the timing/ intentionally creating the
skew.
Example
For the above given picture, write setup time equations.
Timing Path from FF1 to FF2:
Slack = Required Time - Arrival Time > 0
Required Time = TCCL + Tcp - Tsu = 2ns + 10ns - 1ns = 11ns
Arrival Time = TLCL + Tcq + Tcomb = 2ns + 1ns + 9ns = 12ns
Slack = 11ns - 12ns = -1ns (Violated)

Timing Path from FF2 to FF3:
Required Time = 2ns + 10ns - 1ns = 11ns
Arrival Time = 2ns + 1ns + 5ns = 8ns
Slack = 11ns - 8ns = 3ns (Met)

Let us introduce skew between FF1 and FF2 by adding 2ns of extra delay to the clock of FF2, and then check the timing of the next path. This pushes the capture clock of the FF1 to FF2 path, which is also the launch clock of the FF2 to FF3 path.

Hold equation:

Arrival time = TLCL + Tcq + Tcomb
Required time = TCCL + Th
Slack = Arrival time - Required time


Example

Timing path from FF1 to FF2:
Arrival time = 2ns + 1ns + 9ns = 12ns
Required Time = 4ns + 10ns - 1ns = 13ns
Slack = 13ns - 12ns = 1ns (Met)

Timing path from FF2 to FF3:
Arrival time = 4ns + 1ns + 5ns = 10ns
Required time = 2ns + 10ns - 1ns = 11ns
Slack = 11ns - 10ns = 1ns (Met)
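
The useful skew example can be replayed numerically. The sketch below encodes the setup equation used above (Required = TCCL + Tcp - Tsu, Arrival = TLCL + Tcq + Tcomb); the function name `setup_slack` and its parameter names are illustrative, and the values are the ones from this example.

```python
def setup_slack(launch_latency, capture_latency, t_cq, t_comb, t_period, t_setup):
    """Setup slack in ns: required time minus arrival time."""
    arrival = launch_latency + t_cq + t_comb
    required = capture_latency + t_period - t_setup
    return required - arrival


# Before useful skew: every flop sees a 2ns clock latency.
before = (
    setup_slack(2, 2, t_cq=1, t_comb=9, t_period=10, t_setup=1),  # FF1 -> FF2
    setup_slack(2, 2, t_cq=1, t_comb=5, t_period=10, t_setup=1),  # FF2 -> FF3
)

# After pushing the FF2 clock by 2ns (its latency becomes 4ns).
after = (
    setup_slack(2, 4, t_cq=1, t_comb=9, t_period=10, t_setup=1),  # FF1 -> FF2
    setup_slack(4, 2, t_cq=1, t_comb=5, t_period=10, t_setup=1),  # FF2 -> FF3
)

print("Slack before (FF1->FF2, FF2->FF3):", before)
print("Slack after  (FF1->FF2, FF2->FF3):", after)
```

The 2ns push moves slack from the FF2 to FF3 path, which had 3ns to spare, onto the violating FF1 to FF2 path. The hold slack of both paths would need to be rechecked in the same way with the hold equation.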
Clock Tree Optimization Techniques
● Buffer and Gate Sizing
Increase or decrease the drive strength of the gates and buffers to improve both skew and insertion
delay.

● Gate and Buffer Relocation
Physical location of buffer or gate is moved to reduce the skew and insertion delay.

● Level Adjustment
Adjust the level of clock pins to a higher or lower part of the clock tree hierarchy.

● HFN synthesis
Reducing the fanout using splitting or cloning.

● Delay Insertion
Delay is inserted in the shortest paths to minimize the skew.
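
As a rough illustration of delay insertion, the sketch below computes how much delay each short path would need so that every sink lands within a target skew of the longest path. The function `delay_padding`, the sink latencies and the 10ps target are all hypothetical; a real tool also has to map the padding onto discrete buffer delays.

```python
def delay_padding(latencies_ps, target_skew_ps=0):
    """Return the extra delay (ps) to add per sink to meet the target skew.

    Sinks already within target_skew_ps of the longest path get no padding.
    """
    longest = max(latencies_ps.values())
    return {
        sink: max(0, longest - target_skew_ps - latency)
        for sink, latency in latencies_ps.items()
    }


latencies = {"FF1": 30, "FF2": 60, "FF3": 50, "FF4": 70}  # illustrative values
padding = delay_padding(latencies, target_skew_ps=10)
balanced = {sink: latencies[sink] + padding[sink] for sink in latencies}
print("Padding per sink:", padding)
print("Latency after padding:", balanced)
```

Only the short paths are padded, so the longest latency (and therefore the insertion delay of the tree) is left unchanged.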
Analyzing the Clock Tree
● Generate clock tree QoR reports

report_clock_qor \
-type {latency|drc_violators|local_skew|robustness|balance_groups}

● Report clock tree power

report_clock_power -clocks [get_clocks clk_name]

● Analyzing clock timing

report_clock_timing -type {skew|latency|inter_clock_skew|transition}


Post CTS Optimization
● Once the CTS optimizations are done, the clock tree is fixed and routed.
● Further optimizations cannot be done in clock path except buffer or gate sizing.
● Only datapath optimization is possible in post CTS stage.
● Post CTS optimizations include:
○ Meeting DRV’s
■ Max Tran
■ Max Cap
■ Max Fanout
○ Meeting Setup Timing
○ Meeting Hold Timing
○ Area optimization
○ Power optimization
○ Congestion reduction
Post CTS Optimizations
● Max Tran

Causes:
● Driver with low drive strength
● High Vt cells
● High Fanout
● Net length is large
● Load is more
Fixes:
● Replace HVT cells with LVT cells
● Upsize the driver
● Reduce net length by adding buffers
● Reduce load by reducing fanout using splitting or cloning.
Post CTS Optimizations
● Max Cap
Causes:

● Weak driver
● High Vt cells
● High Fanout
● Net length is large
● Load is more

Fixes:
● Upsize the driver
● Reduce net length by adding buffers
● Reduce load by reducing fanout using splitting or cloning.
Post CTS Optimizations
● Max Fanout

Causes:

● Weak driver
● High Vt cells
● High Fanout
● Net length is large
● Load is more
[Figure: fanout of a driver before fixing, after cloning, and after splitting]

Fixes:

● Reduce the fanout by load splitting.
● Reduce the fanout by cloning.
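
The effect of cloning on fanout can be sketched as a simple partitioning problem. In the hypothetical model below, `clone_driver` duplicates the driver as many times as needed and gives each copy at most `max_fanout` sinks; sinks are assigned in list order, whereas a real tool would group them by physical location and load.

```python
def clone_driver(sinks, max_fanout):
    """Partition sinks into groups, one per cloned driver, each within max_fanout."""
    if max_fanout < 1:
        raise ValueError("max_fanout must be at least 1")
    return [sinks[i:i + max_fanout] for i in range(0, len(sinks), max_fanout)]


sinks = [f"ff{i}" for i in range(10)]   # a driver with a fanout of 10
groups = clone_driver(sinks, max_fanout=4)
for index, group in enumerate(groups):
    print(f"driver_clone_{index} drives {len(group)} sinks: {group}")
```

Load splitting would use the same partition, but each group would be driven by a new buffer placed after the original driver instead of by a copy of the driver itself.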
Post CTS Optimizations
● Setup Timing

Causes:
● Combinational delay in datapath is more due to long net, weak driver and high Vt cells.
● Tsetup is more for capture flop.
● More negative skew.
● Crosstalk delay(Signals switching in opposite direction).

Fixes:
● Swapping HVT cells with LVT/ULVT cells.
● Upsize drivers in datapath.
● Add buffers on long net.
● Decrease the fanout.
● Layer optimization.
● Fix crosstalk using NDR rules during routing stage.
Post CTS Optimizations
● Hold Timing
Causes:
● Combinational delay in datapath is less due to strong drivers and low Vt cells.
● Thold is more for capture flop.
● More positive skew.
● Crosstalk delay(Signals switching in same direction)

Fixes:
● Add buffers at D pin in datapath.
● Swapping LVT cells with HVT cells.
● Downsize drivers in datapath.
● Detour the nets.
● Layer optimization.
● Fix crosstalk using NDR rules during routing stage.
Up to the placement stage the tool optimizes setup timing, so post-CTS optimization (PCO) concentrates more on hold timing.
Post CTS Optimizations
● Area and Power Optimization
Causes:
● Clock cells occupy more area and consume more power.
● LVT cells are used in the clock path, and they have more subthreshold leakage power.
● Clock is high switching net which has more switching power.

Fixes:
Area optimization:
○ Downsize clock buffers if smaller size buffer can drive the same load.

Power optimization:
○ Downsize clock buffers.
○ Replace LVTs/ULVTs with HVT in datapath.
Post CTS Optimizations
● Congestion

Causes:

● Adding extra buffers during CTS to minimize skew and insertion delay can cause
congestion.
● Applying NDR rules for clock nets reduces number of available tracks.

Fixes:

● Cell Padding
● Perform congestion driven placement in the initial stages as we can’t move clock
cells in post CTS.
Checks After CTS
● Balanced Skew ➔ report_clock_timing -type skew
● Minimum Latency ➔ report_clock_qor -type latency
● Functionality ➔ (FV_check)
● Timing ➔ report_timing -delay_type {max|min}
● DRV’s ➔ report_constraints -all_violators
● Congestion ➔ report_congestion -rerun_global_route
● Vt percentages ➔ report_threshold_voltage_group
● Non-clock cells in clock path ➔ Manual Check
● Legalization ➔ check_legality
● Utilization ➔ report_utilization
● Power ➔ report_power
Why don’t we do routing before CTS?
● If we perform signal routing first, most of the tracks may already be used, which makes skew balancing difficult.
● CTS is done to achieve a balanced skew and high fanout synthesis of the clock signal.
● If we do not perform CTS before routing, it may lead to congestion and consequently to a larger number of shorts in the design.
ROUTING
Prerequisites For Routing
● Legalization of standard cells and macros should be done(to avoid shorts).
● Power and ground nets have been routed before placement.
● Clock tree synthesis and optimization have been performed.
● Estimated congestion is acceptable.
● Estimated timing is acceptable.
● Estimated maximum transition and capacitance have no violations.
● All design rules must be defined in technology file.

For better routing, check whether the block is ready for detail routing using check_routability
command after placement.
check_routability checks for...
● Blocked standard cell ports(Due to overlaps)
● Blocked top level or macro cell ports
● Out-of-boundary pins
● Minimum grid violations
● Incorrect via definitions
● Minimum width settings
What is Routing?
● Making physical connections between signal pins using metal layers.
● Exact paths for the interconnection of standard cells,macros and I/O pins are
determined.
● Based on the logical connection in netlist electrical connections are created using
metals and vias.
Inputs for Routing
● CTS Database
● Libraries
● SDC
● NDRs
● Routing Blockages
● Technology data
○ Metal layers
○ DRC rules
○ Via creation rules
○ Grid rules(pitch)
● Zroute is the router in IC Compiler II
Goals of Routing
● Minimize the total interconnect/wire length
● Minimize the critical path delay
● Minimize the number of layer changes that the connections have to make
● Complete the connections without increasing total area of the block
● Meeting Timing DRC’s and obtaining a good timing QoR.
● Minimize congestion
● Signal Integrity driven routing(Crosstalk)
Routing with Variation in Technology
● Routing has become increasingly challenging with each technology node due to the ever-
increasing metal layers, distinct layer thickness, design rule volume and design complexity.

M→1x width layers
C→1.3x width layers
B→2x width layers
E→4x width layers
U→10x width layers
W→16x width layers
Crosstalk
● The unintentional transfer of voltage from one net to another due to the parasitic capacitance formed between them.
● Occurs when two nets with little spacing between them share a large common area and carry signals of different strength.
● Aggressor is the net with the higher drive strength.
● Victim is the net with the lower drive strength.

Types:

● Crosstalk Delay
● Crosstalk Noise
Crosstalk
● Crosstalk Delay

Delay added to or removed from the victim signal transition when both aggressor and victim are switching.

● Positive Crosstalk
Aggressor and victim switch in opposite directions, which increases the delay of the victim signal.
In Clock Path (capture clock): Setup improves and Hold degrades.
In Data Path: Setup degrades and Hold improves.

● Negative Crosstalk
Aggressor and victim switch in the same direction, which reduces the delay of the victim signal.
In Clock Path (capture clock): Setup degrades and Hold improves.
In Data Path: Setup improves and Hold degrades.


Crosstalk
● Crosstalk Noise
Victim is constant and aggressor is switching.

● Positive Glitch
When victim is constant at logic 0 and aggressor is switching from logic 0 to logic 1.
● Negative Glitch
When victim is constant at logic 1 and aggressor is switching from logic 1 to logic 0.
● Overshoot
When victim is constant at logic 1 and aggressor is switching from logic 0 to logic 1.
● Undershoot
When victim is constant at logic 0 and aggressor is switching from logic 1 to logic 0.
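
The four crosstalk noise cases above form a small lookup table keyed by the victim's steady level and the aggressor's transition. The sketch below encodes exactly that table; the function name `classify_glitch` and the "rise"/"fall" labels are illustrative choices.

```python
def classify_glitch(victim_level, aggressor_transition):
    """Name the noise type for a quiet victim (0 or 1) and a switching aggressor.

    aggressor_transition is "rise" (logic 0 to logic 1) or "fall" (logic 1 to logic 0).
    """
    table = {
        (0, "rise"): "positive glitch",  # victim at 0 is pulled upward
        (1, "fall"): "negative glitch",  # victim at 1 is pulled downward
        (1, "rise"): "overshoot",        # victim at 1 is pushed above logic 1
        (0, "fall"): "undershoot",       # victim at 0 is pushed below logic 0
    }
    return table[(victim_level, aggressor_transition)]


for level in (0, 1):
    for transition in ("rise", "fall"):
        print(f"victim={level}, aggressor {transition}: {classify_glitch(level, transition)}")
```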
Crosstalk
● Crosstalk delay affects the timing.
● Crosstalk noise affects the functionality.

Fixes:

● Shielding
● NDRs
Routing Constraints
● Set constraints on the number of layers to be used during routing.
● Set constraints for minimum and maximum routing layer.
● Setting limits on routing to specific regions.
● Setting the maximum length for the routing wires.
● Blocking routing in specific regions(Routing blockage).
● Set guidelines for minimum width and spacing.
● Set preferred routing directions to specific metal layers during routing.
● Constraining routing density.
● Constraining pin connections.
Controlling Routing
● Routing Blockage: It defines a region where routing is not allowed on specific layers.
create_routing_blockage -name RB \
-boundary {llx lly urx ury} \
-layers {list_of_layers}
● Routing Guide: Provides routing directives for specific areas of a block.
create_routing_guide \
-boundary {llx lly urx ury} \
-layers {layers_list} \
-horizontal_track_utilization \
-vertical_track_utilization
Controlling Routing
● Routing Corridors: Restrict the global routing of specific nets to the specified region. Used to route critical nets before routing signal nets.

create_routing_corridor -name corridor_1 \
-boundary {llx lly urx ury} \
-min_layer_name \
-max_layer_name
Routing Critical Nets
● Before proceeding to signal net routing, critical nets such as clock nets and feedthrough nets are routed in higher layers to reduce delay.
● Critical nets can be routed using the route_group command by specifying the particular nets.
● route_group performs global routing, track assignment and detail routing.

Command:

route_group -nets {collection_of_nets} \

-all_clock_nets
Routing Signal Nets
Signal routing can be done in two ways:
● Perform standalone routing tasks
○ route_global
○ route_track
○ route_detail
● Perform automatic routing
○ route_auto

It performs global routing, track assignment and detail routing. It performs better when congestion QoR is the main goal than timing QoR.

[Figure: routing flow: Global Routing → Track Assignment → Detail Routing → Search and Repair]
Global Routing(route_global)
● It divides a block into global routing cells which have the height same as standard cell.
● Routing capacity is calculated for each cell according to blockages,pins and routing
tracks inside the cell.
● Calculates the available tracks and required tracks and reports overflows if the cell
needs more tracks.
● Nets are not assigned to the actual wire tracks.
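
The overflow calculation described above reduces to comparing required tracks against available tracks in each global routing cell. The sketch below is a toy model with made-up numbers: `gcell_overflow` and the 2x2 grid of capacities and demands are hypothetical, and real capacities depend on blockages, pins and layers.

```python
def gcell_overflow(capacity, demand):
    """Return per-cell overflow: tracks required beyond the tracks available."""
    return {
        cell: max(0, demand.get(cell, 0) - available)
        for cell, available in capacity.items()
    }


# Available and required tracks per global routing cell, keyed by (row, column)
capacity = {(0, 0): 10, (0, 1): 8, (1, 0): 10, (1, 1): 6}
demand = {(0, 0): 7, (0, 1): 11, (1, 0): 10, (1, 1): 9}

overflow = gcell_overflow(capacity, demand)
congested = [cell for cell, extra in overflow.items() if extra > 0]
print("Overflow per cell:", overflow)
print("Cells with overflow:", congested, "total overflow:", sum(overflow.values()))
```

Cells with non-zero overflow are the ones the global router tries to relieve by rerouting nets through neighbouring cells.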
Global Routing
● Perform several rerouting phases to reduce congestion by rerouting the nets around
global routing cells with overflows.
● Based on -effort_level, tool performs number of rerouting phases.
● By default, tool performs congestion driven routing.
● To enable timing-driven global routing, set
route.global.timing_driven true

● To enable crosstalk-driven global routing, set
route.global.crosstalk_driven true
time.si_enable_analysis true

● These need to be enabled before running route_global.


Track Assignment(route_track)
● To assign routing tracks for each global route.
○ Assigns tracks in horizontal partitions.
○ Assigns tracks in vertical partitions.
○ Re-routes overlapping wires.
● Replaces global routes with actual metal shapes.
● All nets are routed in track assignment but not carefully.
● DRC check is not done.
● Chance of having more shorts & DRC violations.
● To enable timing-driven mode, set
route.track.timing_driven true

● To enable crosstalk-driven mode, set
route.track.crosstalk_driven true
Detail Routing(route_detail)
● Performs detail routing on the whole block while fixing the DRC violations by dividing the
block into partitions.
● Based on DRC violations distribution in the block the partitions are divided.
○ Uniform partitioning if DRC violations are evenly distributed.
○ Non uniform partitioning if DRC violations are located in specific regions.
● Route guides are used to reduce shorts.
● Incremental detail routing can’t fix open nets.
● It runs for a maximum of 40 iterations.
● To perform additional detail routing on the block, we can use incremental detail routing.

route_detail -incremental
Search and Repair
● Search and Repair is performed during detail routing after the first iteration.
● Shorts and spacing violations are detected, and the affected areas are rerouted.
● It is also known as incremental detail routing.
Virtual Routing

● It routes the signal based on the shortest Manhattan distance between source and sink without considering routing blockages or congestion (not congestion aware).
● If the floorplan of the design is complex and only virtual routing is used for estimation, we may see many congestion and timing issues during actual routing.
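
The Manhattan distance used by virtual routing is simply the sum of the horizontal and vertical offsets between two pins. The sketch below shows the estimate for one source and two sinks; the coordinates and names are invented for illustration, and no blockage or congestion is modelled, which is exactly the limitation noted above.

```python
def manhattan(a, b):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def virtual_route_lengths(source, sinks):
    """Estimated wire length from the source pin to each sink pin."""
    return {name: manhattan(source, location) for name, location in sinks.items()}


source = (0, 0)
sinks = {"A": (30, 40), "B": (-10, 5)}
lengths = virtual_route_lengths(source, sinks)
print("Estimated wire lengths (um):", lengths)
```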

Global Routing

● It is congestion aware routing and estimates how the actual routing takes place with current
floorplan and placement.
● It is nearly equivalent to detail routing.
● It does not consider DRC.

Detail Routing

● It is an actual routing performed on the design which is congestion driven, crosstalk driven and
timing driven.
● It also checks for DRC violations.
Post Route Optimization
● Logic Optimization
● Routability Optimization
● To fix Signal Integrity
○ Apply NDRs
○ Apply Shielding for sensitive nets.
● Fill Insertion

Inserted to avoid DRC violations such as well-to-well spacing, metal-to-metal spacing and min width.

Types of fill:
○ Base Fill (Filler Cell)
○ Metal Fill
Post Route Optimization
● Logic Optimization
○ Performs different optimization techniques to meet timing, DRVs, area and power constraints.
○ Improves timing, area and power QoR.
○ Fixes logical DRC violations, performs legalization and Engineering Change Order(ECO) routing.
○ It is performed on route db after removing fill.
● Routability Optimization
○ Increases spacing between the cells to fix routing DRC violations caused by pin access issues.
○ These types of DRC violations can be fixed by using keepout margins to increase the spacing
between the cells.
○ To increase the spacing between cells where violations occur due to pin accessing issue:

Run optimize_routability command.


Post Route Optimization
● Base Fill(Filler Cell)
○ Have no logical functionality.
○ To fill the gaps in the standard cell rows after routing.
○ To maintain well continuity by connecting the n-wells & p-substrates in the cell rows.
○ Tool throws DRC violation, if minimum spacing is not maintained between wells.
○ Even if minimum spacing is maintained, every cell needs tie-cell to connect VDD/VSS which
consumes more area.
○ Width of smallest filler cell is the placement grid.
○ Avoids latch-up and masking issues.
● Metal Fill
○ Filling up empty metal tracks with metal shapes to meet metal
density rules.
○ Metal density rule helps to avoid over etching/ metal erosion.
Engineering Change Order(ECO)
● It is a phase where we perform additional fixes in the design which are not closed in PnR.
● An ECO file which is generated in signoff stage contains series of changes required in the form of
PnR tool command for fixing the issue.
● Types of ECOs:
○ Timing ECO
■ To fix setup, hold, max_tran, max_cap, max_fanout, min_pulse_width,min_period
■ Metal only Timing ECOs
○ Functional ECO
■ Inserts a logic directly into gate-level netlist corresponding to a change that occurs in the
RTL.
■ Add /remove the logic with minimum modifications in the design.
ECO Flow
[Figure: ECO flow chart]
Main flow: Floorplan → Placement → CTS → Route → Fill → Signoff → END (when there are no violations)
When signoff reports violations: Write ECO files → Remove Fill → Source ECO files → Check for errors and warnings → Legalize (check_legality & legalize_placement -incremental) → Global Net Connectivity (check_pg_connectivity) → Route
Command to route in ECO phase:
route_eco
Manual ECO Fixes
● size_cell instance_name -lib_cell {ref_name}
Changes the drive strength and flavor of the cell to the new library cell specified using -lib_cell.
● insert_buffer pin_name -lib_cell {ref_name}
Inserts the specified buffer on a pin but does not reconnect the net.
● add_buffer pin_name -lib_cell {ref_name}
Inserts the specified buffer on the pin and reconnects the net.
● add_buffer_on_route net_name -repeater_distance {L} -repeater_distance_length_ratio {R}
With -repeater_distance, adds buffers on the net at every L microns.
With -repeater_distance_length_ratio, adds buffers on the net based on the R value. If R=0.5, it adds the buffer in the middle of the net.
● split_fanout -driver {driving_cell_name} -max_fanout {value}
Splits the fanout of the specified cell using buffers based on the load.
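
The two repeater placement modes described above can be mimicked with simple arithmetic on the net length. The sketch below is a model, not the tool's algorithm: `repeater_positions` is a hypothetical helper, and it assumes that a ratio R places buffers at every R fraction of the net length, which matches the R=0.5 case above but is otherwise an assumption.

```python
def repeater_positions(net_length_um, distance_um=None, length_ratio=None):
    """Return buffer positions (um from the driver) strictly inside the net.

    Exactly one of distance_um (fixed spacing L) or length_ratio (fraction R
    of the net length) must be given.
    """
    if (distance_um is None) == (length_ratio is None):
        raise ValueError("give exactly one of distance_um or length_ratio")
    step = distance_um if distance_um is not None else net_length_um * length_ratio
    if step <= 0:
        raise ValueError("spacing must be positive")
    positions = []
    k = 1
    while k * step < net_length_um:  # stop before reaching the far end of the net
        positions.append(k * step)
        k += 1
    return positions


print(repeater_positions(100, distance_um=30))
print(repeater_positions(100, length_ratio=0.5))
```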
Checks after Routing
● Shorts and opens ➔ check_lvs
● Crosstalk ➔ report_crosstalk_delta & report_noise
● Timing ➔ report_timing -delay_type{max|min}
● DRV’s ➔ report_constraints -all_violators
● Utilization ➔ report_utilization
● Vt percentages ➔ report_threshold_voltage_group
● Power ➔ report_power