PnR-II-CTS Routing Chip Finishing

PNR #2:
CTS, Routing, Crosstalk and Chip finishing

Islam Ahmed, Hossam Hassan
Clock Tree Synthesis

Clock Parameters
• Skew
  • Difference in clock arrival time at two different registers (two spatially distinct points).
• Jitter
  • Difference in clock period between different cycles.
• Slew
  • Transition time (trise/tfall) of the clock signal.
• Insertion Delay
  • Delay from the clock source to the registers.
[Figure: clock skew and clock jitter waveforms]
How do clock skew and jitter arise?
• Clock generation
• Distribution network
  • Number of buffers
  • Device variation
  • Wire length and variation
  • Coupling
  • Load
• Environment variation
  • Temperature
  • Power supply
[Figure: clock distribution network (Intel 1998, 0.25um): clock generator (PLL), central clock driver, local clock buffers. Annotated variation sources: length, metal width, ...; local clock load, local power supply, local gate length and threshold, local temperature.]

• Capture Clock Edge
  The clock edge at which data is captured at a flip-flop is known as the capture edge.
• Launch Clock Edge
  The clock edge at which data is launched from the preceding flip-flop, to be captured at this flip-flop.

• Local skew
  Local skew is the difference in clock arrival time at the clock pins of related flops (flops that share a timing path).
• Global skew
  Global skew is the difference in clock arrival time at the clock pins of unrelated flops. It is also defined as the difference between the shortest and the longest clock path delay reaching any two sequential elements.
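
As a rough illustration of the two definitions above, the following Python sketch computes both skews from a table of clock arrival times. The arrival times, flop names and the list of related pairs are made-up values, not taken from any real design.

```python
def global_skew(arrival):
    """Largest difference in clock arrival time over all sinks."""
    times = list(arrival.values())
    return max(times) - min(times)


def local_skew(arrival, related_pairs):
    """Largest arrival difference over flop pairs that share a timing path."""
    return max(abs(arrival[a] - arrival[b]) for a, b in related_pairs)


# Hypothetical clock arrival times (ns) at four flop clock pins.
arrival = {"FF1": 0.50, "FF2": 0.55, "FF3": 0.62, "FF4": 0.48}
# Hypothetical related pairs: a data path exists between the two flops.
related_pairs = [("FF1", "FF2"), ("FF3", "FF2")]

print(f"global skew = {global_skew(arrival):.2f} ns")
print(f"local skew  = {local_skew(arrival, related_pairs):.2f} ns")
```

With these numbers the global skew (0.14 ns) is set by FF3 and FF4, which have no path between them, while the local skew (0.07 ns) is what actually enters the setup and hold checks.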

• Positive Skew
  If the capture clock arrives later than the launch clock, the skew is positive (+ve).
  +ve skew relaxes setup timing.
  +ve skew can lead to hold violations.
• Negative Skew
  If the capture clock arrives earlier than the launch clock, the skew is negative (-ve).
  -ve skew relaxes hold timing.
  -ve skew can lead to setup violations.

• Source Delay or Source Latency
  The time a clock signal takes to propagate from its ideal waveform origin point (the clock source) to the clock definition point in the design, i.e. the beginning of the clock tree.
• Network Delay (Latency) or Insertion Delay
  The time a clock signal (rise or fall) takes to propagate from the clock definition point to a register clock pin.

• Uncertainty
  Clock uncertainty is the time difference between the arrivals of clock signals at registers in one clock domain or between domains.
• Clock latency
  Clock latency is the sum of the source latency and the network latency.
Pre-CTS (ideal clock: network latency and skew are estimated)
  Uncertainty = estimated skew + jitter + margin
Post-CTS (propagated clock: network latency and skew are calculated from the built tree)
  Uncertainty = jitter + margin

Introduction
• CTS builds a buffer/inverter network to balance the relative clock delays of the FFs belonging to a clock domain (i.e. triggered by the same clock).
  → [Global skew of each clock domain ≈ 0]
• Question:
  • Why not just route the clock net to all sequential elements, just like any other net?
• Answer:
  • Timing
  • Power
  • Area
  • Signal Integrity
[Figure: a single clock source driving a large array of FFs]

CTS vs HFNS
• CTS uses clock buffers and clock inverters with equal rise and fall times (symmetric buffers/inverters), whereas high fanout net synthesis (HFNS) uses buffers and inverters with relaxed rise and fall times.
• HFNS is used mostly for reset, scan enable and other static signals with high fanout. There is no stringent requirement for balancing or power reduction.
• Clock tree power is given special attention because the clock is a constantly switching signal. HFNS mostly targets static signals, so power needs less attention.
• NDR (non-default routing) rules are used for clock tree routing.

Where does the Clock Tree Begin and End?
[Figure: the clock tree begins at the clock source (create_clock …) and ends at the clock sinks (stop, float or exclude pins), e.g. the CLK pins of the FFs, including those behind the clock gate (GATED).]

Clock Trees
• Naïve approach (RC-tree):
  • Route an individual clock net to each sink and balance the RC delay.
  • However, this would burn excessive power, and the large RC of each net would cause signal integrity issues.
• Instead, use a buffered tree:
  • Short nets mean lower RC values
  • Buffers restore the signal for better slew rates
  • Lower total insertion delay
  • Less total switching capacitance
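
The insertion-delay benefit of buffering can be illustrated with a first-order Elmore estimate. This is only a sketch under simplifying assumptions: the wire is a uniform distributed RC line with delay 0.5 * R * C, buffer input capacitance and output resistance are ignored, and the per-micron values and the buffer delay are invented for the example.

```python
def wire_delay(r_per_um, c_per_um, length_um):
    """Elmore delay of a uniform distributed RC wire: 0.5 * R_total * C_total."""
    return 0.5 * (r_per_um * length_um) * (c_per_um * length_um)


def buffered_delay(r_per_um, c_per_um, length_um, n_segments, t_buf):
    """Same wire split into equal segments, each driven by one buffer.

    Simplification: buffer input capacitance and drive resistance are ignored.
    """
    segment = wire_delay(r_per_um, c_per_um, length_um / n_segments)
    return n_segments * (segment + t_buf)


# Assumed values: 1 ohm/um, 0.2 fF/um, a 2 mm net, 20 ps per buffer.
r, c, length, t_buf = 1.0, 0.2e-15, 2000.0, 20e-12

unbuffered = wire_delay(r, c, length)
buffered = buffered_delay(r, c, length, n_segments=4, t_buf=t_buf)

print(f"unbuffered: {unbuffered * 1e12:.0f} ps")
print(f"buffered  : {buffered * 1e12:.0f} ps")
```

Because wire delay grows with the square of the length, four short segments plus four buffer delays (180 ps here) still beat the single long wire (400 ps).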

The requirements of Setup and Hold on timing paths
For Setup: T[clk-to-Q] + T[comb] + T[setup] ≤ T[clk] + T[skew]
For Hold: T[clk-to-Q] + T[comb] ≥ T[hold] + T[skew]
where T[skew] is the capture clock arrival time minus the launch clock arrival time.
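
The two inequalities can be turned into slack calculations, as in the Python sketch below. It takes skew as capture clock arrival minus launch clock arrival, so that positive skew helps setup and hurts hold; all delay values are illustrative.

```python
def setup_slack(t_clk, t_clk2q, t_comb, t_setup, t_skew):
    """Setup slack: (T[clk] + T[skew]) - (T[clk-to-Q] + T[comb] + T[setup])."""
    return (t_clk + t_skew) - (t_clk2q + t_comb + t_setup)


def hold_slack(t_clk2q, t_comb, t_hold, t_skew):
    """Hold slack: (T[clk-to-Q] + T[comb]) - (T[hold] + T[skew])."""
    return (t_clk2q + t_comb) - (t_hold + t_skew)


# Illustrative numbers in ns. Negative slack means a violation.
# A long path fails setup without skew and passes with +0.1 ns skew.
print(round(setup_slack(2.0, 0.2, 1.7, 0.15, 0.0), 3))
print(round(setup_slack(2.0, 0.2, 1.7, 0.15, 0.1), 3))
# A short path passes hold without skew and fails with +0.2 ns skew.
print(round(hold_slack(0.2, 0.05, 0.1, 0.0), 3))
print(round(hold_slack(0.2, 0.05, 0.1, 0.2), 3))
```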

Design Status, Start of CTS Phase
• Placement: completed
• Power and ground nets: prerouted
• Estimated congestion: acceptable
• Estimated timing: acceptable (~0ns slack)
• Estimated max cap/transition: no violations
• High fanout nets:
  • Reset, Scan Enable synthesized with buffers
  • Clocks are still not buffered
Starting Point before CTS
All clock pins are driven by a single clock source.


CTS Goals
• Meet the clock tree Design Rule Constraints (DRC):
  • Maximum transition delay
  • Maximum load capacitance
  • Maximum fanout
  • Maximum buffer levels
  Constraints are upper bound goals. If constraints are not met, violations will be reported.
• Meet the clock tree targets:
  • Maximum skew
  • Min/Max insertion delay
  Targets are "nice to have" goals. If targets are not met, no violations will be reported.

What are the pros and cons of setting a tight constraint for target_skew?
What are the pros and cons of setting a relaxed constraint for max_transition?

Clock buffer vs normal buffer
• Clock buffers have equal rise and fall times, which avoids pulse width violations.
• In clock buffers the beta ratio is adjusted so that rise and fall times are matched. This may increase the size of a clock buffer compared to a normal buffer.
• Normal buffers may not have equal rise and fall times.
• Clock buffers are usually designed so that an input signal with a 50% duty cycle produces an output with a 50% duty cycle.

Clock Tree constraints
Parameter                                  | Description
Clock tree buffers/inverters               | Have to be symmetric
Clock gating cells, logic cells [ex. MUXs] |
Routing layers and NDR                     | Usually used to avoid crosstalk
Target skew                                | Target global skew
Max transition                             | Target max transition of clock signal
Max capacitance                            | Target max capacitance of clock signal
CTS cell spacing                           | Extra cell spacing for clock cells to avoid congestion and IR hotspots
Clock Tree Synthesis (CTS) (1/2)
A buffer tree is built to balance the loads and minimize the skew.

Is the Design Ready for CTS?
• check_physical_design -stage pre_clock_opt checks that:
  • Design is placed
  • Clocks have been defined
  • Clock roots are not hierarchical pins
• check_clock_tree checks and warns if:
  • A clock source pin is a hierarchical pin
  • A generated clock has an improperly specified master clock
  • A clock tree has no synchronous pins
  • There are multiple clocks per register

Clock Tree Synthesis
[Figure: CTS flow. Inputs: control, DRC (max tran/cap/fanout), exceptions, physical constraints → clock_opt → analysis with report_clock_tree, report_clock_timing and report_timing]

Define Clock Root Attributes
• When the clock root is a primary port of a block:
  • Ensure that an appropriate driving cell is defined with set_driving_cell
  • The synthesis constraints may include a weak driving cell for all inputs, including the clock port
  • Because the clock is ideal during synthesis, this has no effect on design QoR
  • But a weak driver on the clock port affects clock tree QoR during CTS
[Figure: clock root defined on the primary clock port CLK, with an external driving cell specified for the clock port]

Stop, Float and Exclude Pins
• STOP pins:
  • CTS optimizes for DRC and clock tree targets (skew, insertion delay)
• FLOAT pins:
  • Like stop pins, but with delays on the clock pin
• EXCLUDE pins:
  • CTS ignores targets
  • CTS fixes clock tree DRCs
[Figure: clock tree from CLOCK through GATED to the FF CLK pins, IP_CLK and CLK_OUT. Implicit STOP or FLOAT pins: skew and insertion delay are optimized. Implicit EXCLUDE pins: skew and insertion delay are ignored.]

Defining an Explicit Stop Pin
• Defining an explicit stop pin allows CTS to optimize the pin for skew and insertion delay targets.
• CTS has no knowledge of the IP-internal clock delay; it can only “see” up to the stop pin!
[Figure: CLOCK drives FF/CLK (0.42) and IP/IP_CLK (0.43), where the explicit stop pin is defined; skew and insertion delay are now optimized. Inside the IP, the delay to FF/CLKn is 0.17.]
set_clock_tree_exceptions -stop_pins [get_pins IP/IP_CLK]

Defining an Explicit Float Pin
• Defining an explicit float pin allows CTS to adjust the insertion delays based on the specification.
[Figure: CLOCK drives FF/CLK (0.42) and IP/IP_CLK (0.27), where the explicit float pin is defined; skew and insertion delay are now optimized. Inside the IP, the delay to the FF CLKn pins is 0.15.]
set_clock_tree_exceptions \
    -float_pins IP/IP_CLK \
    -float_pin_max_delay_rise 0.15

Generated and Gated Clocks
[Figure: CLOCK (create_clock) drives FF1 through a clock gate (GATED), FF2 and FF3 directly, and the divider FFD, whose output (create_generated_clock) drives FF4 and FF5. The annotated insertion delays (0.64, 0.65, 0.63) show that all insertion delays are matched.]
Skew will be balanced ‘globally’, within each clock domain, across all clock pins of both master and generated clock.

Skew Balancing not Required?
If the divided clock domain is independent of the master domain (no paths between them), skew balancing may not be important.
[Figure: CLOCK drives FFs directly (0.42) and the divider FFD, whose QN output drives the FFs of the divided domain (0.67). An explicit exclude pin is defined at FFD/CLK.]
set_clock_tree_exceptions -exclude_pins [get_pins FFD/CLK]

Non-Default Clock Routing
• IC Compiler can route the clocks using non-default routing rules, e.g. double-spacing, double-width, shielding
• Non-default rules are often used to “harden” the clock, e.g. to make the clock routes less sensitive to crosstalk or EM effects
[Figure: Clk routed between Sig1 and Sig2 with the default routing rule, compared with the effect of an NDR route on Clk]

NDR Recommendations
• Always route clock on metal 3 and above
• Avoid NDR on clock sinks:
  set_clock_tree_options -use_default_routing_for_sinks 1
• Avoid NDR on Metal 1
  • May have trouble accessing metal 1 pins on buffers and gates
• Put NDR on pitch; try to avoid blind double spacing
  • Preserves routing resources and keeps preroute RC estimation accurate
• Consider double width to reduce resistance

Invoke CTS: Core Command
[Figure: CTS flow with control, DRC, exceptions and physical constraints all set (✓), feeding clock_opt]
• clock_opt: single command for CTS, optimization and clock tree routing

clock_opt use recommendation
• Using clock_opt in the following manner has been found to be more flexible across designs and flows:
clock_opt -only_cts -no_clock_route
analyze…
clock_opt -only_psyn -no_clock_route
analyze…
route_group -all_clock_nets

Effects of Clock Tree Synthesis
• Clock buffers are added
• Congestion may increase
• Non-clock cells may have been moved to less ideal locations
• CTS can introduce new timing and max tran/cap violations
How do you handle new violations?

(Embedded) Clock Tree Optimization

Clock Tree Optimizations
• Gate relocation
• Buffer relocation
• Buffer sizing
• Gate sizing
• Delay insertion
• Useful skew

Routing

Design Status, Start of Routing Phase
• Placement: completed
• CTS: completed
• Power and ground nets: routed
• Estimated congestion: acceptable
• Estimated timing: acceptable (~0ns slack)
• Estimated max cap/transition: no violations

Routing Fundamentals: Goal
Routing creates physical connections to all clock and signal pins through metal interconnects.
• Routed paths must meet setup and hold timing, max cap/trans, and clock skew requirements
• Metal traces must meet physical DRC requirements

Grid-Based Routing System
• Metal traces (routes) are built along, and centered upon, routing tracks based on a grid
• Each metal layer has its own grid and preferred routing direction:
  • M1: Horizontal
  • M2: Vertical, etc…
• Report by:
[Figure: M1 and M2 routing grid showing track, trace, grid point, pitch and unitTile]

Routing Operations
◼ IC Compiler performs:
  ⚫ Global Routing
  ⚫ Track Assignment
  ⚫ Detail Routing
  ⚫ Search and Repair

Route Operations: Global Route
◼ GR assigns nets to specific metal layers and global routing cells (Gcells)
◼ GR tries to avoid congested Gcells while minimizing detours:
  ⚫ Congestion exists when more tracks are needed than available
  ⚫ Detours increase wire length (delay)
◼ GR also avoids:
  ⚫ P/G (rings/straps/rails)
  ⚫ Routing blockages
Metal traces exist after Global Route. True or False?

Global Routing
• Global routing is a coarse-grain assignment of routes. It first partitions the routing region into tiles/rectangles called global routing cells (gcells), then decides tile-to-tile paths for all nets while attempting to optimize a given objective function (e.g. total wire length and circuit timing). It does not make actual connections or assign nets to specific paths within the routing regions.
• By default, the width of a gcell is the same as the height of a standard cell, and gcells are aligned with the standard cell rows.
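
The congestion notion used by global routing (more tracks needed than available in a gcell) can be sketched as a simple overflow map. The gcell coordinates and the demand and capacity numbers below are hypothetical; a real router derives them from the track grid, blockages and P/G routes.

```python
def gcell_overflow(demand, capacity):
    """Return {gcell: overflow} for every gcell that needs more tracks than it has."""
    return {
        gcell: demand[gcell] - capacity[gcell]
        for gcell in demand
        if demand[gcell] > capacity[gcell]
    }


# Hypothetical 2x2 gcell grid: tracks demanded by global routes vs. tracks available.
demand = {(0, 0): 8, (0, 1): 12, (1, 0): 10, (1, 1): 5}
capacity = {(0, 0): 10, (0, 1): 10, (1, 0): 10, (1, 1): 10}

congested = gcell_overflow(demand, capacity)
print(congested)
```

Only gcell (0, 1) is reported here, with an overflow of 2 tracks; a global router would try to detour some of its nets through the neighbors that still have spare capacity.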

Route Operations: Track Assignment
• Track Assignment (TA):
  • Assigns each net to a specific track and lays down the actual metal traces
• It also attempts to:
  • Make long, straight traces
  • Reduce the number of vias
• TA does not check or follow physical DRC rules
[Figure: TA metal traces next to a preroute; a jog reduces the via count]

Track Assignment
• Track assignment is the stage in which routing tracks are assigned to each global route. The tasks performed during this stage are:
  • Assigning tracks in horizontal and vertical partitions.
  • Rerouting all overlapped wires.
• Track assignment replaces all global routes with actual metal traces. Although all nets are routed (not very carefully), there will be many DRC, SI and timing violations, especially in regions where the routing connects to the pins. These violations are fixed in the succeeding stages.

Detail Routing
• The detail router uses the routing plan laid out during global routing and track assignment, and lays down actual metal to connect the pins of each net.
• The violations created during track assignment are fixed through multiple iterations in this stage.
• The main goal of detail routing is to complete all of the required interconnect without leaving shorts or spacing violations.
• Detail routing starts with the router dividing the block into specific areas called switch boxes (Sboxes), which are generally expressed in terms of gcells. These boxes align with the gcell boundaries. For example, a 3x3 Sbox is a box that encompasses 9 gcells.

Route Operations: Detail Routing
◼ Detail route attempts to clear DRC violations using a fixed size Sbox
◼ Due to the fixed Sbox size, detail route may not be able to clear all DRC violations
[Figure: detail route SBoxes with example violations: notch spacing, thin & fat spacing, min spacing]

Route Operations: Search&Repair
• Search&Repair fixes remaining DRC violations through multiple loops using progressively larger SBox sizes
[Figure: SBox size growing from Loop1 to Loop4]
Note: Even if the design is DRC clean after S&R, you must still run a sign-off DRC checker (Hercules/ICV/Calibre or open-source alternatives).
• Routing DRC rules are a subset of the complete technology DRC rules
• IC Compiler works on the FRAM view, not the detailed transistor-level (CEL) view

Pre-Route Checks
• Check the design for routing stage readiness
• There should not be:
  • Ideal nets
  • High fanout nets with a fanout greater than 500
• Use check_routeability to check a design’s prerequisites for detail routing and report a list of violations
check_physical_design -stage pre_route_opt
all_ideal_nets
all_high_fanout -nets -threshold 501
check_routeability

What is Crosstalk?
Crosstalk is the transfer of a voltage transition from one switching net (aggressor) to another static or switching net (victim) through a coupling capacitance (Cc).
[Figure: net 1 (aggressor) coupled to net 2 (victim) through Cc, shown for a static victim and for a switching victim]

Crosstalk Definition
• Mechanisms that cause crosstalk:
  • Mutual capacitance, Cm
  • Mutual inductance, Lm
$I_{noise,C_m} = C_m \frac{dV_{driver}}{dt}$
$V_{noise,L_m} = L_m \frac{dI_{driver}}{dt}$
[Figure: aggressor and victim lines coupled through Cm and through Lm]

• Crosstalk is the electrical interaction between two long, neighboring nets.
• The causes of crosstalk are long parallel nets, coupling capacitance or inductance, and high frequency switching.
• If signals on two neighboring parallel nets change in the same direction, speed-up occurs; if they change in opposite directions, slow-down occurs.
• Coupling not only changes the delay on a wire but can also induce coupling noise. Crosstalk analysis requires defining a “victim” net and an “aggressor” net.

• Mutual capacitance is one of the two mechanisms that cause crosstalk.
  • Mutual capacitance is the coupling of two conductors via the electric field. It injects a current onto the victim line proportional to the rate of change of voltage on the driven line. Since the induced noise is proportional to the rate of change, mutual capacitance becomes very significant in high-speed digital applications.
• Mutual inductance is the other mechanism that causes crosstalk.
  • Mutual inductance couples a driven line to a quiet line by means of the magnetic field. It injects a voltage noise onto the victim proportional to the rate of change of the current on the driven line. Since the induced noise is proportional to the rate of change, mutual inductance becomes very significant in high-speed digital applications.
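
A small numeric sketch of the two coupling relations, I_noise = Cm * dV/dt and V_noise = Lm * dI/dt, shows why faster edges mean more noise. The coupling values, voltage swing, current step and edge times are assumed purely for illustration.

```python
def capacitive_noise_current(c_m, dv, dt):
    """Current injected onto the victim through mutual capacitance: Cm * dV/dt."""
    return c_m * dv / dt


def inductive_noise_voltage(l_m, di, dt):
    """Voltage induced on the victim through mutual inductance: Lm * dI/dt."""
    return l_m * di / dt


# Assumed values: 10 fF and 1 nH of coupling, a 1.2 V / 5 mA aggressor edge.
c_m, l_m = 10e-15, 1e-9
for edge_time in (200e-12, 50e-12):
    i_noise = capacitive_noise_current(c_m, 1.2, edge_time)
    v_noise = inductive_noise_voltage(l_m, 5e-3, edge_time)
    print(f"edge {edge_time * 1e12:.0f} ps: I_noise = {i_noise * 1e6:.0f} uA, "
          f"V_noise = {v_noise * 1e3:.0f} mV")
```

Shortening the edge from 200 ps to 50 ps multiplies both noise terms by four, which is the rate-of-change dependence described above.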

Capacitive Coupling
• The main cause of SI problems is the parasitic capacitive coupling between parallel interconnects located near each other.
• With scaling to deep sub-micron processes, capacitive coupling has become a major concern: routing complexity has increased dramatically, and as a result more metal layers are being added.
• The width of metal wires keeps decreasing, whereas their height is being increased to keep the metal resistance low.
• Furthermore, metal wire lengths are now longer than ever. All this leads to a considerable increase in capacitive coupling.
[Figure: capacitive coupling between Wire A and Wire B]

Noise Effect Types
• Delay
  • Caused by coupling between the switching activity of the victim and the switching activity of the aggressors
• Glitch
  • Noise induced on a steady victim signal

Crosstalk Delay
With the move to sub-micron technologies, timing problems become more severe because each new generation brings shrinking feature sizes, wire widths and wire spacings. The reduction in wire width means a decrease in total wire capacitance. However, it also means a dramatic increase in the fraction of wire capacitance resulting from lateral coupling. Improved performance translates to higher clock frequencies, with much faster signal slew rates. As signal slew rates increase, more noise couples onto neighboring nets.
[Figure: 0V to 1.5V waveforms showing a crosstalk delay on a victim net driving Cload, leading to a timing error]

Crosstalk Glitch
• A steady signal net can have a glitch due to charge transferred by the switching aggressors through the coupling capacitances
[Figure: switching aggressor net coupled through Cc to a victim net held at 0, producing a glitch]
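
To get a feel for the glitch magnitude, a common back-of-the-envelope bound is the capacitive divider Vdd * Cc / (Cc + Cg). This is only a sketch: it treats the victim as undriven during the aggressor edge, so it ignores the victim driver strength and the aggressor slew, and the capacitance values are assumed.

```python
def glitch_upper_bound(vdd, c_c, c_g):
    """Charge-sharing bound on the glitch height for an undriven victim.

    Vdd * Cc / (Cc + Cg): grows with coupling capacitance Cc and shrinks
    with the victim's grounded capacitance Cg.
    """
    return vdd * c_c / (c_c + c_g)


vdd = 1.2
# Assumed capacitances in farads.
print(round(glitch_upper_bound(vdd, c_c=20e-15, c_g=60e-15), 3))
print(round(glitch_upper_bound(vdd, c_c=20e-15, c_g=20e-15), 3))
```

With the grounded capacitance reduced from 60 fF to 20 fF, the bound doubles from 0.3 V to 0.6 V.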

Factors Affecting Glitch
• Factors for a large glitch magnitude:
  • Large coupling capacitance
  • Fast slew on the aggressor
  • Small victim net grounded capacitance
  • Weak victim net driving strength

Crosstalk Delay Analysis
• Capacitance extraction for a net yields different capacitances:
  • Capacitance Cg to ground
  • Coupling capacitance Cc to a neighboring net
[Figure: aggressor (0 to 1.2V) coupled through Cc to a victim modeled as a distributed RC with ground capacitance Cg]

Crosstalk Scenarios
• Aggressor net stable
  • The victim driver provides the charge for Cg and Cc to be charged to Vdd
  • Total charge = (Cg + Cc) * Vdd
• Aggressor switching in the same direction
  • If the aggressor slew is similar → total charge from the driving cell is (Cg * Vdd)
  • If the aggressor slew is faster → charge is smaller than (Cg * Vdd)
• Aggressor switching in the opposite direction
  • Cc is charged from -Vdd to Vdd
  • Charge on Cc changes by (2 * Cc * Vdd) between before and after the transitions
[Figure: aggressor (0 to 1.2V) coupled through Cc to a victim with ground capacitance Cg]
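
The three scenarios can be compared by the charge the victim driver has to supply for a rising transition. In the sketch below, the opposite-direction case adds the 2 * Cc * Vdd change on Cc to the Cg * Vdd needed for the ground capacitance, and the same-direction case assumes similar slews; the capacitance values are made up.

```python
def victim_driver_charge(c_g, c_c, vdd, aggressor):
    """Charge supplied by the victim driver, per aggressor behavior."""
    if aggressor == "stable":
        return (c_g + c_c) * vdd          # Cg and Cc both charged to Vdd
    if aggressor == "same":
        return c_g * vdd                  # voltage across Cc does not change
    if aggressor == "opposite":
        return (c_g + 2 * c_c) * vdd      # Cc swings from -Vdd to +Vdd
    raise ValueError(f"unknown aggressor behavior: {aggressor}")


# Assumed values: Cg = 50 fF, Cc = 25 fF, Vdd = 1.2 V.
for case in ("same", "stable", "opposite"):
    q = victim_driver_charge(50e-15, 25e-15, 1.2, case)
    print(f"{case:8s}: {q * 1e15:.0f} fC")
```

The driver sees an effective load of Cg, Cg + Cc or Cg + 2*Cc respectively, which is why same-direction switching speeds the victim up and opposite-direction switching slows it down.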

Positive Crosstalk
• Aggressor and victim are switching in opposite directions
• Increases the amount of charge required from the victim driver
• Victim delay increases
[Figure: aggressor and victim (0 to 1.2V) switching in opposite directions; the crosstalk delay on the victim causes a timing error]
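
Negative Crosstalk
• Aggressor and victim are switching in the same direction
• Charge on Cc remains the same before and after the transitions
• Victim delay is reduced
[Figure: aggressor and victim (0 to 1.2V) switching in the same direction; the victim transition is sped up by the crosstalk delay, which can also cause a timing error]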

Crosstalk Timing Verification
• Positive rise delay
  • Rise edge moves forward in time (arrives later)
• Negative rise delay
  • Rise edge moves backward in time (arrives earlier)
• Positive fall delay
  • Fall edge moves forward in time (arrives later)
• Negative fall delay
  • Fall edge moves backward in time (arrives earlier)

Setup Analysis
• Consider the logic shown below, where crosstalk can occur on various nets along the data path and along the clock paths.
[Figure: launch flip-flop UFF0 and capture flip-flop UFF1 connected by the data path; a common clock path runs up to the common point, where it splits into the launch clock path and the capture clock path]
• The setup (or max path) analysis assumes that:
  • The launch clock path sees positive crosstalk delay, so that the data is launched late.
  • The data path sees positive crosstalk delay, so that it takes longer for the data to reach the destination.
  • The capture clock path sees negative crosstalk delay, so that the data is captured by the capture flip-flop early.

Hold Analysis
• The worst condition for the hold check occurs when both the launch clock path and the data path have negative crosstalk and the capture clock path has positive crosstalk. There is one important difference between the hold and setup analyses, related to crosstalk on the common portion of the clock path: a hold check launches and captures on the same clock edge, so the common path cannot see different crosstalk for launch and capture and is excluded.
• The worst-case hold (or min path) analysis for STA with crosstalk assumes:
  • The launch clock path (not including the common path) sees negative crosstalk delay, so that the data is launched early.
  • The data path sees negative crosstalk delay, so that it reaches the destination early.
  • The capture clock path (not including the common path) sees positive crosstalk delay, so that the data is captured by the capture flip-flop late.
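
The worst-case assumptions for setup and hold with crosstalk can be sketched numerically as below. All crosstalk deltas are passed as non-negative magnitudes and applied with the pessimistic sign for each check; for the hold check the deltas are meant to exclude the common clock path. The function names and all numbers are illustrative, not a tool API.

```python
def setup_slack_si(period, launch_clk, data, capture_clk, t_setup,
                   d_launch, d_data, d_capture):
    """Setup: launch clock and data slowed down, capture clock sped up."""
    arrival = (launch_clk + d_launch) + (data + d_data)
    required = period + (capture_clk - d_capture) - t_setup
    return required - arrival


def hold_slack_si(launch_clk, data, capture_clk, t_hold,
                  d_launch, d_data, d_capture):
    """Hold: launch clock and data sped up, capture clock slowed down.

    Deltas here should not include crosstalk on the common clock path.
    """
    arrival = (launch_clk - d_launch) + (data - d_data)
    required = (capture_clk + d_capture) + t_hold
    return arrival - required


# Illustrative delays and crosstalk deltas in ns.
s = setup_slack_si(2.0, 0.5, 1.2, 0.5, 0.1, d_launch=0.05, d_data=0.08, d_capture=0.04)
h = hold_slack_si(0.5, 0.2, 0.5, 0.05, d_launch=0.02, d_data=0.03, d_capture=0.02)
print(round(s, 3), round(h, 3))
```

Setting all deltas to zero recovers the slack without crosstalk, so the difference between the two runs shows how much margin the crosstalk assumptions consume.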

Crosstalk Correction in IC Compiler
• IC Compiler addresses both XDD (crosstalk delta delay) and static noise
• Crosstalk correction performs both cell-based and route-based optimization (i.e. it optimizes gate placement, gate logic and routing traces)

Design Finishing

Antenna Violations
• Metal wires (antennae) placed in an EM field generate voltage gradients
• During the metal etch stage, strong EM fields are used to ionize the plasma etchant
• The resulting voltage gradients at MOSFET gates can damage the thin oxide
[Figure: oscillating charges in the plasma etch collect on Metal 1 (under a protective coating) and reach the poly gate, damaging the gate oxide]

Antenna Rules
• As the total area (length) of wire increases during processing, the voltage stressing the gate oxide increases
• Antenna rules define the acceptable total area of wires
Antenna ratio:
(Area of metal connected to gate) ⁄ (Combined area of gate) = (Antenna area) ⁄ (Gate area) < (Max antenna ratio)
[Figure: gate formed by poly over diffusion]
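
The antenna check itself is a simple ratio test, sketched below. The areas and the maximum ratio are invented numbers; real limits come from the technology's antenna rules and are usually defined per layer.

```python
def antenna_ratio(metal_area, gate_area):
    """Area of metal connected to the gate divided by the gate area."""
    return metal_area / gate_area


def violates_antenna_rule(metal_area, gate_area, max_ratio):
    """True when the ratio is not below the allowed maximum."""
    return antenna_ratio(metal_area, gate_area) >= max_ratio


# Hypothetical numbers (um^2): a long metal 1 wire on a small gate.
gate_area, max_ratio = 0.5, 400
print(violates_antenna_rule(300.0, gate_area, max_ratio))  # ratio 600
# The same net after the metal 1 run connected to the gate is shortened.
print(violates_antenna_rule(80.0, gate_area, max_ratio))   # ratio 160
```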

Solution 1: Splitting Metal or Layer Jumping
• Before layer jumping: a long metal 1 wire connects the driver to the gate, giving an unacceptable antenna area.
• After layer jumping: M1 is split by jumping to M3 and back, so the metal 1 area connected to the gate is acceptable and meets the antenna rules.
[Figure: driver (diffusion) to gate (poly) connection routed on metal 1 and metal 3 around M1 and M3 blockages, before and after layer jumping]

Solution 2: Inserting Diodes
• A diode inhibits large voltage swings on metal tracks
• During the etch phase, the diode clamps the voltage swings.
[Figure: the net before inserting diodes]

Redundant Via Insertion
• Voids in vias are a serious issue in manufacturing
• Two solutions are available:
  • Reduce via count: via optimization techniques are employed in route_opt
  • Add backup vias, known as redundant vias
[Figure: with a single via, the connection fails if the contact is defective; with a redundant via, the connection is okay even if one contact is defective]
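
A rough probability argument shows why backup vias help. The sketch assumes each via fails independently with the same probability, which is a simplification, and the failure rate and connection count are made up.

```python
def expected_open_connections(n_connections, p_via_fail, vias_per_connection):
    """Expected number of open connections when a connection fails only if
    all of its (independent) vias are defective."""
    return n_connections * p_via_fail ** vias_per_connection


# Assumed: one million via connections, 1e-6 failure probability per via.
single = expected_open_connections(1_000_000, 1e-6, 1)
double = expected_open_connections(1_000_000, 1e-6, 2)
print(f"single vias   : {single:.6f} expected opens")
print(f"redundant vias: {double:.6f} expected opens")
```

Under these assumptions, doubling the vias lowers the expected number of open connections from about one per chip to about one per million chips.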

Insert Redundant Vias
• Replaces single vias with multiple vias on all nets
• Excludes timing critical nets identified by the thresholds
[Figure: 1X2 and 2X1 via arrays]

Why Filler Cell Insertion?
• To keep continuity of Nwell and Psub.
• For better yield, the density of the chip needs to be uniform.
• To keep continuity of standard cell power rails.
• DCAP cells are also used along with regular filler cells to stabilize the voltage and support the instantaneous current demand on the power delivery network.
• How?
  • Some placement sites remain empty on some rows
  • ICC can fill such empty sites with filler standard cells

Problem: Metal Over-Etching
• A metal wire in a low metal density region receives a higher ratio of etchant and can get over-etched
• Minimum metal density rules are used to control this
[Figure: plasma etchant etches away unprotected metal. Dense region: less etchant per um2 of metal. Sparse region: over-etching due to high etchant density.]

ECO

The Two Types of ECO Flows
[Flowchart: ECO netlist → Placement fixed?
  Yes → Freeze Silicon ECO: requires that no cells are moved or added; uses spare cells to perform the ECO (spare cells are required).
  No → Non-freeze Silicon ECO: allows newly added cells; does not require spare cells; ECO placement derives the location for the newly added cell instances.
  Both branches continue with ECO routing.]

Functional ECO Flows
1. Non-freeze silicon ECO
  • Pre-tapeout, no restriction on placement or routing
  • Minimal disturbance to the existing layout
  • ECO cells are placed close to their optimal locations
2. Freeze silicon ECO
  • Post-tapeout; only metal masks change, using previously inserted spare cells
  • Cell placement remains unchanged
  • ECO cells are mapped to the spare cells closest to the optimal location
  • Deleted cells become spare cells

Spare Cells
• Spare cells generally consist of a group of standard cells, mainly inverter, buffer, NAND, NOR, AND, OR, XOR and MUX.
• The inputs of spare cells are tied to either VDD or VSS through tie cells, and the outputs are left floating.
• Spare cells make it possible to modify or improve the functionality of a chip with minimal mask changes: already placed spare cells near the required location are used, and only the metal interconnect is modified.
• There is no need to change the base layers. A metal ECO rewires the interconnect to make use of the spare cells, so only some metal masks change, not the base layer masks.
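
The idea of mapping each ECO cell to a nearby spare cell can be sketched with a greedy nearest-match search. This is not the algorithm used by IC Compiler, only an illustration: the cell names, types and coordinates are hypothetical, distance is Manhattan distance, and each spare cell is used at most once.

```python
def map_to_spares(eco_cells, spare_cells):
    """Greedily map each ECO cell to the closest unused spare cell of the same type.

    Both arguments are dicts: name -> (cell_type, x, y). Returns
    {eco_name: spare_name or None}.
    """
    free = dict(spare_cells)
    mapping = {}
    for name, (cell_type, x, y) in eco_cells.items():
        candidates = [
            (abs(x - sx) + abs(y - sy), spare)
            for spare, (spare_type, sx, sy) in free.items()
            if spare_type == cell_type
        ]
        if not candidates:
            mapping[name] = None      # no spare cell of this type is left
            continue
        _, best = min(candidates)
        mapping[name] = best
        del free[best]                # a spare cell can be used only once
    return mapping


# Hypothetical ECO cells (at their optimal locations) and placed spare cells.
eco_cells = {"U_eco1": ("NAND2", 10, 10), "U_eco2": ("INV", 50, 20)}
spare_cells = {
    "SP1": ("NAND2", 40, 40),
    "SP2": ("NAND2", 12, 14),
    "SP3": ("INV", 55, 20),
}
mapping = map_to_spares(eco_cells, spare_cells)
print(mapping)
```

Here U_eco1 is mapped to SP2 and U_eco2 to SP3; only the metal connections to those spare cells would then need to change.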

Physical Only Cells

DCAP cells
• Decap cells are charge storage devices built from capacitors.
• They are used to fill empty spaces at the chip-finishing stage, with the benefit of stabilizing the supply voltage when there is a sudden current draw and voltage drop.
• Decap cells work as charge reservoirs; they support the power delivery network and make it more robust.
• How to build one:
  • The source and drain of a pMOS transistor are shorted together and connected to VDD, and the gate is connected to VSS.
  • Similarly, the source and drain of an nMOS transistor are shorted together and connected to VSS, and the gate is connected to VDD.
• However, you have to be careful about leakage!

EndCap cells
• The end cap cell (or boundary cell) is placed at both ends of each placement row to terminate the row.
• End cap cells are also placed in the top and bottom rows at block level to ease integration with other blocks.
• Why do we need endcap cells?
  • To protect the gates of standard cells placed near the boundary from damage during manufacturing.
  • To avoid base layer DRC violations (Nwell and implant layers) at the boundary.
  • Some standard cell libraries have end cap cells that also serve as decap cells.

Well Tap cells
• Well tap cells (or tap cells) are used to prevent latch-up in CMOS designs by connecting the nwell to VDD and the p-substrate to VSS.
• Well tap cells have no logical function; they have only two connections:
  • nwell to the power supply (VDD)
  • p-substrate to ground (VSS)

Well Tap cells
• In the early days, standard cells were designed so that each cell contained its own nwell-to-VDD and p-substrate-to-VSS connections.
• To save area, the concept of tapless cells later evolved: there is no well tapping inside the standard cell, and well tapping is provided by a separate standard cell called a well tap cell.

Well Tap placement
• Well tap cells are generally placed in straight columns in alternate rows, as shown in the figure. This pattern is called a checkerboard pattern and provides maximum well tap coverage.
• If a macro lies in the path of a vertical column, the column placement is shifted alongside the macro, as shown in the figure.