
PHYSICAL DESIGN

BASIC NOTE FOR NEWBIES

A.B.M TAFSIRUL ISLAM


ASIC PHYSICAL DESIGN ENGINEER
linkedin.com/in/a-b-m-tafsirul-islam-a76b9a183
PHYSICAL DESIGN INTRODUCTION
During Physical Design, all design components are instantiated with their geometric
representations. All macros, standard cells, gates etc., with fixed shapes and sizes per fabrication
layer, are assigned spatial locations (placement) and then connected (routing). The result of
physical design is a set of manufacturing specifications. The goal of the physical design process
is to create a physical layout for the chip that meets all of the requirements of the design,
including performance, power consumption, and reliability, while also optimizing for factors
such as chip size and manufacturing cost.
- Physical layout: A physical layout is the physical arrangement of the various
components and interconnections on a chip. The physical layout is created
during the physical design process, which follows the logic design process.
The physical layout is a two-dimensional representation of the chip, with each
component and interconnect represented as a geometric shape. The physical
layout specifies the exact position and orientation of each component on the
chip, as well as the routing of the interconnects between the components. The
physical layout is used to guide the fabrication process for creating the chip,
and it determines the final physical characteristics of the chip, such as its size
and shape.
● Partitioning: This step breaks up a circuit into smaller sub-circuits or modules, for which
physical design can be done individually.

● Floor-planning: This step determines the shapes and arrangement of modules, as well as
the locations of external ports and IP or macro blocks.

● Placement: This step assigns the spatial locations of all cells within each block.

● Clock Tree Synthesis (CTS): This step does the buffering, gating and routing of the
clock signal to meet prescribed skew and delay requirements.

● Routing: This step allocates routing resources that are used for design connections.
● Timing Analysis and Closure: This step uses timing analysis to optimize the
placement and routing of the design.

● Physical Verification: This step verifies the layout to ensure correct electrical and
logical functionality. The checks like Design Rule Check (DRC), Layout Versus
Schematic (LVS) checking, Parasitic Extraction (to verify the electrical characteristics of
the circuit), Antenna Rule Checking (ARC) & Electrical Rule Checking (ERC) are
performed in this step.

PARTITIONING
The design complexity of modern integrated circuits is increasing at a rapid pace. This increased
complexity makes it very difficult to do physical design on the full chip independently. A
common strategy is to partition or divide the design into smaller portions, each of which can be
processed with some degree of independence and parallelism. In physical design each of these
portions is known as a partition. The partitions are laid out individually and reassembled
at the top level.
A popular approach is to partition by modules or design functionality. These modules can range
from a small set of electrical components to fully functional integrated circuit (ICs).
If each block is implemented independently, i.e., without considering other partitions, then
connections between these partitions may negatively affect the overall design performance, such
as increased circuit delay or decreased reliability. Therefore, the primary goal of partitioning is to
divide the circuit such that the number of connections between subcircuits is minimized. So,
sometimes dividing the circuit by modules or functionality may not be optimal. In this case there
is a difference between how design is divided in front-end versus back-end and physical
partitions can be very different from design functionality. The number of external connections of
a partition may also be limited e.g., by the number of I/O pins in the chip package.

Each partition must also meet all design constraints. For example, the amount of logic in a
partition can be limited by the size of an FPGA chip.

FLOOR-PLANNING
Before diving deep into the topic, let's take an example of building a house. If we are going to
build a house, we need to specify the area of the rooms, balconies, kitchens, washrooms etc.
Similarly in terms of building a chip we also need to specify the place where the pins, pads, std.
cell, macros, power pads will be placed.
Now we need the size/coordinates for these elements to be placed. For placing the elements such
as pins, pads, std.cells, macros etc. and determining the design size we need to take the core and
die as reference. Here we need to specify the aspect ratio, cell utilization, core utilization,
dimension, the distance between the core and IO boundary, IO box calculation, floorplan origin.
Floor-planning deals with large modules such as caches, embedded memories and intellectual
properties (IP) cores that have known areas, fixed or changeable shapes, and possibly fixed
locations. In floor-planning these blocks are arranged and assigned shapes and locations. This
step enables early estimates of interconnect length, circuit delay and chip performance.

Import Design:
Just before the floorplan we need to import the design. Importing the design means feeding
all the required inputs to the physical design implementation tools to perform all the steps. At the
end of the floorplan summary we will mention the compulsory inputs that we need for
floorplanning.
The library (.lib) and tech file are used to create a delay model. With this delay
model and the SDC constraints we create analysis views for MCMM (multi-corner multi-mode)
analysis. In these views we use slow, fast and typical corners with best, typical and worst modes
for timing analysis. Using these data the tool decides how each element should be placed.

Two types of design are possible. Block level designs will be rectilinear and chip level designs
will be rectangular in shape.

● Rectilinear – To define this size more coordinates are required

● Rectangular – To define this only height and width of the die is required

Floor-planning involves the following steps. We will discuss preplaced cells in this chapter.
Before we dig deeper into floor-planning let’s go through a few terminologies.
● Wafer: A wafer is a thin slice of semiconductor, such as a crystalline silicon used for the
fabrication of integrated circuits.

● Die: A die is a small block of semiconductor material on which a given functional circuit
is fabricated.

● Core: A core is a section of the chip where fundamental logic of the design is placed.
● Rows: Rows are the locations where cells get placed. These are individual rows, and
the row area is utilized by the standard cells as shown in the figure below. The
utilization factor is decided by the channel area also. If the channel area is reduced, better
utilization can be achieved.

But reducing the channel area can lead to a short between Vss and Vdd. To avoid this,
every alternate row is flipped so that the Vdd rails of two rows can be joined together and the Vss
rails of two rows can be connected together, and there is no chance of a short between Vdd and Vss, as
shown in the figure below. The utilization then depends only on the row area, not the channel
area.
● Site: Site is the minimum unit of placement. Rows are multiples of site definition. We
can also say that the smallest unit of placement where the smallest cell can be placed is
called a site. This is technology dependent and obviously changes with respect to
technology changes.

Let’s understand the basic concept of floor-planning using the design we had partitioned in
the last chapter.

Assume Block A and Block B are 1 unit and 2 units in height, respectively (shown below).
These blocks are arranged in the core area of the die as shown below. These blocks
are also known as pre-placed cells. The blocks have user-defined locations and are placed
in the chip before automated place and route steps.

Utilization Factor and Aspect Ratio:


● Utilization Factor: This is the ratio of “area of the design” divided by the “total area of
the core”. Typically, this is in the 50-60% range so that the remaining area can be used for
optimization later.
● Aspect Ratio: This is the ratio of “height of the core” divided by “width of the core”.
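As a quick sketch of these two ratios, the snippet below uses made-up dimensions and cell areas purely for illustration, not data from any real design:

# Illustrative utilization factor and aspect ratio for a core (made-up numbers).
core_width = 100.0    # um
core_height = 80.0    # um

std_cell_area = 3200.0   # total standard-cell area, um^2
macro_area = 1500.0      # total macro area, um^2

core_area = core_width * core_height
utilization = (std_cell_area + macro_area) / core_area
aspect_ratio = core_height / core_width   # height / width, as defined above

print(f"Core area   : {core_area:.0f} um^2")
print(f"Utilization : {utilization:.1%}")   # about 59%, inside the 50-60% guideline
print(f"Aspect ratio: {aspect_ratio:.2f}")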

CELL Orientation:
Macro placement is done manually based on the connectivity with other macros and also with
I/O pads. Macros are moved on the basis of connectivity, and the possible orientations of macros are as
shown in the figure below.
a –> R0 — No rotation

b –> MX — Mirror through X axis

c –> MY — Mirror through Y axis

d –> R180 — Rotate counter-clockwise 180 degrees

e –> MX90 — Mirror through X axis and rotate counter-clockwise 90 degrees

f –> R90 — Rotate counter-clockwise 90 degrees

g –> R270 — Rotate counter-clockwise 270 degrees

h –> MY90 — Mirror through Y axis and rotate counter-clockwise 90 degrees

By using the flight/fly lines the exact connectivity of macros to other macros or I/O pads can be
seen and with the help of these orientations, the cell’s physical orientation can be changed and as
a result of this routing resources can be reduced. The concept of flight/fly lines is given below.

Fly Lines

Fly/flight lines are virtual connections between macros and also macros to I/O pads. This helps the
designer to get an idea about the logical connections between macros and pads. Fly/flight lines act
as guidelines to the designer to reduce the routing resources to be used. Based on the connectivity
they show, flight lines are of three types.

1. Macro to macro fly lines


2. pin to pin fly lines
3. macro to I/O fly lines

Macro to macro fly lines


This shows the total number of connections between two macros. This gives an idea to the designer
about which two macros should be placed closer together.

Pin to pin fly lines

If two macros are selected for pin to pin fly lines, the virtual connections are shown down to the
exact pin-to-pin level. This guides the designer to choose an
appropriate cell orientation (fig-3) for the macros, and as a result routing becomes more efficient.

Macro to I/O fly lines

Macro to I/O flight lines show the exact connections between the macro pins and the I/O ports or
pins. This helps the designer to identify the macros to be kept at the corners of the die or block.

Pictorial representation of these flight lines and how these lines act as guidelines to the designer is
shown below in figure.

From the figure

i) macro to macro fly lines

ii) pin to pin fly lines

iii) for B macro “R90” is applied (B is rotated 90 degrees in the anti-clockwise direction)

iv) for A macro “R180” is applied (A is rotated 180 degrees in the anti-clockwise direction)

This is how the fly lines act as guidelines for macro placement. In a similar way macro to I/O fly
lines also helps the designer to identify the macro to be placed in the corners.
Core to IO Clearance: Distances from core to I/O pads are specified manually. Figure-5 shows
the core to I/O clearances. This distance from core to I/O purely depends on the width of Vdd and
Vss metal layers.

fig-5: core to I/O clearance

Pad Limited Design

● If pad area is more than core area


● Large number of I/O pads and relatively little core logic
● Here the pad area decides the size of the die

Core Limited design

● If core area is more than pad area


● Core area decides the die size
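A rough way to see which regime a design falls into is to compare the die edge needed by the pad ring with the edge needed by the core logic. The sketch below assumes a square die, and every number (pad count, pad pitch, core area, utilization) is an illustrative placeholder:

import math

# Rough pad-limited vs core-limited check (illustrative numbers, square die assumed).
num_io_pads = 400        # total pads on the periphery
pad_pitch_um = 60.0      # minimum pad pitch along the die edge
core_area_um2 = 4.0e6    # area needed by core logic (std cells + macros)
utilization = 0.6        # target core utilization

# Edge length needed just to fit the pads on four sides.
pad_limited_edge = (num_io_pads / 4.0) * pad_pitch_um

# Edge length needed by the core at the target utilization.
core_limited_edge = math.sqrt(core_area_um2 / utilization)

if pad_limited_edge > core_limited_edge:
    print(f"Pad limited: pads need a {pad_limited_edge:.0f} um edge "
          f"vs {core_limited_edge:.0f} um for the core")
else:
    print(f"Core limited: core needs a {core_limited_edge:.0f} um edge "
          f"vs {pad_limited_edge:.0f} um for the pads")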

Bond Pads

Pad placements are of two types.

Inline Bonding

● All pads are of the same height


● Generally used in most of the designs
● Can be used when design is not pad limited

Staggered Bonding

● Pads are of different heights


● Can be used when design is pad limited

Corner Cells

● I/O pads are not placed in the corners of the chip


● To fill the gap and to provide I/O pad power ring connectivity corner cells are used
● They don’t perform any logic and don’t contain any CMOS circuitry; they are
added to maintain the connectivity of the I/O pad power rings

Filler Cells

● Similar to corner cells


● Used for continuity of pad power ring
DIE SIZE ESTIMATION

Technology Inputs:
● Gate density per sq. mm = D
● Number of horizontal layers = H
● Number of vertical layers = V
Design Inputs:
● Gate count (excluding memories, macro & sub-chips) = G
● IO area in sq. mm = I
● (Memory + Macros + Sub-chips) area in sq. mm = M
● Target Utilization in percentage = U%
● Additional gate count for CTS, timing closure etc., in percentage = T%
● Additional gate count for ECOs, in percentage = E%
DIE Area Calculation:
● Die Area in sq. mm = {[(Gate Count + Additional Gate Count for CTS & ECO) /
Gate Density] + IO Area + Memory/Macro Area} / Target Utilization
● Die Area = {[G × (1 + T + E) / D] + I + M} / U   (with T and E expressed as fractions of G)

Aspect Ratio, Width, Height Calculation:

● Aspect Ratio
AR = width / height
   = number of horizontal routing resources / number of vertical routing resources
AR = H / V   (H, V = number of horizontal and vertical layers)
● Height (below, W and H denote die width and height)
AR = W / H, so W = H × AR ----- (1)
Area = W × H = H × H × AR   (expressing W in terms of H from (1))
H² = Area / AR
H = SQRT(Die Area / AR)
● Width
W = H × AR
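The formulas above can be put together in a short script. All technology and design numbers below are illustrative placeholders (not from any real process), and T and E are treated as fractions of the gate count as noted above:

import math

# Worked example of the die-area and width/height formulas above (made-up numbers).
D = 100_000     # gate density per sq. mm
H_layers = 4    # number of horizontal routing layers
V_layers = 4    # number of vertical routing layers

G = 2_000_000   # gate count (excluding memories, macros, sub-chips)
I = 1.5         # IO area in sq. mm
M = 3.0         # memory + macro + sub-chip area in sq. mm
U = 0.70        # target utilization (70%)
T = 0.10        # extra gates for CTS / timing closure (10% of G)
E = 0.05        # extra gates for ECOs (5% of G)

# Die Area = {[G * (1 + T + E) / D] + I + M} / U
die_area = ((G * (1 + T + E) / D) + I + M) / U

# AR = width / height, estimated from routing resources.
AR = H_layers / V_layers
height = math.sqrt(die_area / AR)
width = height * AR

print(f"Die area: {die_area:.2f} sq. mm")
print(f"Height  : {height:.2f} mm, Width: {width:.2f} mm")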
Typically, there are two kinds of block:
● Hard Blocks: The dimensions and areas of hard blocks are fixed.

● Soft Blocks: For a soft block, the area is fixed but the aspect ratio can be changed, either
continuously or in discrete steps.
The floor-planning stage ensures that
1. Every chip module is assigned a shape and a location, so as to facilitate gate
placement.
2. Every pin that has an external connection is assigned a location, so that internal
and external nets can be routed.
The floor-planning stage determines the external characteristics (fixed dimensions and
external pin locations) of each module. These characteristics are necessary for the
subsequent placement and routing steps, which determine the internal characteristics of
the blocks. Floor-plan optimization involves multiple degrees of freedom: while it
includes some aspects of placement (finding locations) and connection routing (pin
assignment), module shape optimization is unique to floor-planning.

Floor Plan design optimizes both the locations and the aspect ratios of the individual
blocks, so as to meet the following objectives:

● Area: Area of the core impacts circuit performance, yield, and manufacturing
cost. So one of the goals of floor-planning is to optimize area.
● Total Wire Length: Long connections between floor-plan blocks may increase
signal propagation delays in the design. Therefore, layout of high-performance
circuits seeks to shorten such interconnects.
SUMMARY:
We know that the floor planning involves the following steps:

Objective of floor planning:


● Minimize the area
● Minimize timing violations
● Reduce the wire length
● Making routing easy
● Reduce the IR drop
Inputs for floor planning:
● Gate level netlist (.v)
● Physical libraries (.lef) and logical/timing libraries (.lib)
● Synopsys design constraints (.sdc)
● TLU+ files
● Technology files (.tf)
● Physical partitioning information of the design
● Floor planning parameters like height, width, aspect ratio etc.
Output of floor planning:
● Die/core area
● I/O pads information
● Placed macros information
● Standard cell placement area
● Power grid design
● Blockages defined

POWERPLANNING
Power planning involves determining the layout of the power-ground distribution
network and the placement of the supply (power and ground) I/O pads, as shown in the image below.

The supply nets, VDD and GND, connect each cell in the design to a power source. As VDD and
ground must be distributed to each cell in the design, these nets (network source) are large and
span across the entire chip. These are usually routed first before any signal routing. Routing of
supply nets is different from routing of signals. Routing of signal means making physical
connections between signal pins using metal layers. Power and ground nets should have
dedicated metal layers to avoid consuming signal routing resources. In addition, supply nets
prefer thick metal layers, typically the top two layers of the back-end-of-line (BEOL) stack, due to their
low resistance.

Fig. Power supplies using a dedicated metal layer.


When the power ground network travels through multiple layers, there must be sufficient vias to
carry current while avoiding electromigration and other reliability issues. Since supply nets have
high current loads, they are often much wider than standard signal route.

Even though the metal layers are low resistance, there is still some voltage drop along the supply
nets. If the voltage drop becomes too large, the circuit will not function properly.

In order to avoid this problem a mesh topology is used so each block can tap the supply net from
the nearest point shown in the figure below.
A MESH TOPOLOGY THAT IS CREATED THROUGH THE FOLLOWING FIVE
STEPS:
1. Create Power Ring: This carries VDD and VSS around the
chip. For low resistance, these connections and the rings are
on many layers. For example, a ring might use metal layers Metal
2 through Metal 8 (every layer except Metal 1).

2. Connect I/O pads to the ring: These should be maximally


connected to the power ring in order to minimize the resistance
and maximize the ability to carry current to the core.
3. Create a Mesh: A power mesh consists of a set of stripes
at defined pitches on two or more layers. The width or pitch
of the stripes are determined from the estimated power
consumption as well as layout design rules. The stripes are
laid out in pairs, alternating as VDD-GND, GND-VDD and so
on. The power mesh uses the uppermost and thickest layers,
and is sparser on any lower layers to avoid signal routing
congestion. Stripes on adjacent layers are typically connected
with as many vias as possible, again to minimize resistance.

4. Create Power Rails: The metal 1 layer is where the


power-ground distribution network meets the logic gate of
the design.

5. Connect Rails to the Mesh: Finally, the rails are
connected to the mesh with stacked vias. A key consideration is the proper size of (number of
vias in) the via stack.
DECAP CELLS:
In addition to that, decoupling capacitor (decap) cells can also be used. Decap cells are basically
charge-storing devices made of capacitors, used to support the instantaneous current requirement
in the power delivery network. There are many reasons for the instant large current requirement
in the circuit and if there are no adequate measures taken to handle this requirement, power
droop or ground bounce may occur. These power droop or ground bounce will affect the constant
power supply and ultimately the delay of standard cells may get affected. To support the power
delivery network from such sudden power requirements, decap cells are inserted throughout the
design.

Decap cells are placed generally after power planning and before the standard cell placement,
that is, in the pre-placement stage. These cells are placed uniformly throughout the design in this
stage. Decap cells can also be placed in the post-route stage if required. The only problem with
decap cells is that they are leaky and increase the leakage power of the design, so they must be used
judiciously.

POWER RING:

Inputs given to this EDA tool or code:

1. A text file in DEF format with clear definitions of core/die width, pad
placement and other unplaced cells
2. Industry grade 180nm PDK’s (standard cells, memories, pads) LEF formats
Expected output from this EDA tool:

A text file in standard DEF format which has all information about inputs which were provided +
information about pre-placed cells and power rings (shown in the image below)
PIN ASSIGNMENT
During pin assignment, all nets (signals) are assigned to unique pin locations such that the
overall design performance is optimized. This step aims at maximizing routability and
minimizing electrical parasitics both inside and outside of the block. The concentric circle method is
one of the optimized methods of pin placement (demonstrated in the image below).


The goal of external pin assignment is to connect each incoming or outgoing signal to a unique
I/O pin. Once the necessary nets have each been assigned a unique pin, they must be connected
such that wirelength and electrical parasitics, e.g. coupling or reduced signal integrity, are
minimized. Consider the design in the following image.
● Here is one way of assigning the pins and pin assignment is NOT optimum here

● Here is another way of pin assignment which is optimum here.


● Once pin arrangement and pin assignment are done, we add blockages for the
logical cell placement so Place and Route tool does not place any logic there.

PLACEMENT
Pre-Placement
Before the placement step, some pre-placement steps are done to avoid further problems
or to have better results in later steps. These steps help to improve routability, timing
performance and power goals of the design.

Adding physical only cells

1. End Caps

● These end cap cells are pre-placed physical-only cells.
-Inserted to meet certain design rules.
-Inserted at the ends of the site rows, at the top and bottom of the block boundary, and
around every macro boundary.
-This also helps to reduce DRCs near macros, since standard cells are not placed
right next to the macros.
-These cells only have VDD and GND pins and no other logical pins.

2. Well Tap

● Well taps tie the substrate to ground and the N-well to VDD, and thus prevent latch-up.
Latch-up is a condition where a low impedance path is created between supply pins
and ground. This condition is caused by a trigger (current injection or over voltage) ,
but once activated, the low impedance path remains even after the trigger is no
longer present. This low impedance may cause system upset or catastrophic damage
due to excessive current levels. Well tap cells are placed after macro placement or
power rails creation. This stage is called the pre-placement stage. Well tap cells are
placed at a regular interval in each row of placement. The maximum distance
between well tap cells is as per the DRC rule of that particular technology library.
3. Spare cells: After silicon is out and tests complete, it might become necessary to have some
changes to the design. There might be a bug, or a very easy feature that will make the chip more
valuable. This is where we try to use the spare cells. For example, if we need a logic change that
requires the addition of an AND gate, we can use an existing spare AND cell to make this change. This
way, we ensure that the base layer masks need no regeneration. The only change here is in
the metal connections, and only the metal masks are regenerated for the next fabrication.

4. Decap Cells: It is for avoiding instantaneous voltage drop. Place decap cells closer to power
pads or any larger drivers.
Adding Cell padding and placement Blockages:
● Cell padding is done to reserve spaces for routing congestion. It adds hard
constraints to placement. The constraints are honored by cell legalization, CTS, and
timing optimization.
● Placement blockage halos are the areas where the tools should not place any cells in
these areas.
● Types of blockages:
-Hard Blockage: No standard cells, macros, buffer, inverter can be placed
inside the blockage during global placement, legalization and optimization.
# Command for hard blockage: create_placement_blockages -boundary
{{10 20} {100 200}} -name PB1 -type hard

-Soft Blockage: Standard cells are not placed here during coarse placement, but
buffers or inverters may be placed here during optimization.
# Command for soft blockage: create_placement_blockages -
boundary {{10 20} {100 200}} -name PB1 -type soft

-Partial Blockage: An area with lower utilization. In this blockage we can use
some part of the blockage and the rest remains unused or unchanged.
# Command for partial blockage: create_placement_blockages -boundary
{{10 20} {100 200}} -type partial -blocked_percentage 40
Here the partial blockage allows a maximum cell density of 60%; the remaining
40% is blocked.

● Halo (padding): An area outside a macro that should be kept clear of standard cells.
PLACEMENT
After partitioning the circuit into smaller modules and floor-planning the layout to determine
block outlines and pin locations, placement seeks to determine the location of standard cells or
logic elements within each block. Before placement there are some pre-placement items which
are usually done.
During placement, each standard cell’s location is determined by the tool and the cells are placed
in the layout. Placement also optimizes the design and determines its routability.

The placement stage involves the following steps:


● Netlist binding: In this step, standard cells present in the netlist are mapped with
appropriate dimensions (usually rectangular) of the standard cells and placed in the
core.
A library will have multiple representations of a cell with the same functionality. These
representations differ in their physical properties which impact speed and area. Each
representation has its own information about its:
-Dimension (height and width)
-Delay function
-Logic function (i.e., under what conditions the output changes)

● Global/Coarse Placement: It is a kind of placement to get approximate initial


location of cells. Cells are not legally placed and they can be overlapping. Let’s
understand it using the example below.

After library mapping, the cells look as follows.


These cells are then placed on the floorplan created earlier, as shown below. (Note: This
may not be optimized placement)

● Detailed/ Legal Placement: This step improves the coarse placement, and ensures
legal placement of cells and macros. It avoids overlapping of the cells.

● Placement Optimization: Placement must produce a layout where all nets of the
design can be routed simultaneously, i.e., the placement must be routable. In
addition, electrical effects such as signal delay or crosstalk must be taken into
consideration. As detailed routing information is not available during placement, the
placer optimizes estimates of routing quality metrics such as:

● Total wirelength
This can be optimized by appropriate placement of the cells, for example
following image shows how the same set of cells can be placed better for reducing
wirelength.

● Wire congestion: Congestion occurs when the number of required routing tracks
is greater than the number of available tracks. Congestion can be estimated from the
results of a quick global route. Global bins with routing overflow can be
identified.
● Signal Delays (Timing Optimization):

Placement tries to place critical-path cells close together to reduce net RCs and to
meet setup timing. If cells can be placed close together, delays are optimized. (Some
examples are mentioned below.)
- Resizing (upsizing) cells reduces their delay.
- Adding buffer improves the slew and reduces the delay
- Cloning reduces the fanout and reduces the delay
- Redesigning fan-in-tree optimizes the critical path length and reduces
the delay
Placement also optimizes the assignment of I/O pads with respect to the
logic gates connected to them. It also tries to optimize the placement of actively switching (and
heat-generating) circuit elements, to achieve uniform temperature across the chip.

Inputs of Placement:
● Technology file (.tf)
● Netlist
● SDC
● Library files (.lib & .lef)
● TLU+ files
● Floorplan & Power plan DEF file
Output of Placement:
● Physical layout information
● Cell Placement Location
CLOCK TREE SYNTHESIS (CTS)

Pre-CTS Optimization: Before CTS is done, some pre-CTS steps are done. This involves High
Fanout Net Synthesis (HFNS). High fanout nets are nets with a large number of fanout loads (e.g., >1000).
Some examples of high fanout nets are scan enable and reset signals. These are implemented
before CTS. Buffering of high fanout nets is done to ensure timing performance and slew
rates.
● Slew: Transition delay or slew is defined as the time taken by a signal to rise from 10%
(or 20%) to 90% (or 80%) of its maximum value. This is known as “rise time”. Similarly,
“fall time” can be defined as the time taken by a signal to fall from 90% (or 80%) to 10%
(or 20%) of its maximum value.

Clock Tree Synthesis (CTS): CTS is the process of connecting clocks to all clock pins of
sequential circuits by using inverters or buffers in order to balance the skew and to minimize the
insertion delay.
● Skew: Skew refers to the difference in arrival time between two or more signals at a
given point in a circuit. Skew can be caused by various factors such as differences in the
lengths of the interconnects between the signals, differences in the loading of the
interconnects, and differences in the propagation delays of the various components in the
circuit. Skew can have a significant impact on the performance of a VLSI circuit because
it can cause a signal to arrive at an incorrect time, leading to errors in the operation of the
circuit. If the skew is significant (refers to a skew that is large enough to have a
meaningful impact on the operations of the circuits), it can cause error in the operation of
the circuit. To minimize skew, VLSI designers use techniques such as careful layout of
the circuit and the use of matched pairs of components. In general, small amounts of
skew are preferable because they can lead to more accurate and reliable operations of the
circuit. In ideal scenarios we expect zero skew during CTS but in practical scenarios
this is not possible, that's why we provide a constraint of maximum skew.
Clk_skew = (clock arrival time at the capture sequential element) - (clock arrival time at the
launch sequential element)

There are two types of clock skew:


1. Positive Clock Skew: Capture flop latency is greater than launch flop
latency.
2. Negative Clock Skew: Launch flop latency is greater than capture flop
latency
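As a tiny illustration of the definition above, the arrival times below are made-up numbers in nanoseconds:

launch_clk_arrival = 1.20    # clock arrival time at the launch flop (ns)
capture_clk_arrival = 1.35   # clock arrival time at the capture flop (ns)

skew = capture_clk_arrival - launch_clk_arrival
kind = "positive" if skew > 0 else ("negative" if skew < 0 else "zero")
print(f"Clock skew = {skew:+.2f} ns ({kind} skew)")
# Positive skew (capture clock arrives later than launch clock) generally helps setup
# timing and hurts hold timing; negative skew does the opposite.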

● Clock Jitter: Clock jitter is deviation of a clock edge from its ideal location.

● Insertion delay/ Propagation delay: Propagation delay is the time it takes for a signal to
travel through a circuit or component. Propagation delay is an important factor to consider
because it determines the speed at which a circuit can operate. Propagation delay is typically
measured in units of time, such as nanoseconds (ns). Propagation delay can be affected by various
factors, including the type of component or circuit being used, the length and width of
interconnects, and the type of material used in the interconnects. In VLSI design, minimizing
propagation delay is often a key goal because it can help to improve the overall performance of a
circuit. This can be achieved through techniques such as optimizing the layout of a circuit and
using components with low propagation delays.

When the skew is not balanced, it can make meeting setup and hold timing difficult.
Clocks are generally routed as an H-tree, as shown below:

Goals of CTS:
Crosstalk Noise: Crosstalk noise refers to undesired or unintentional effect between two or more
signals that are going to affect the proper functionality of the chip. It is caused by the capacitive
coupling between neighboring signals on the die. There are several types of crosstalk that occur
in VLSI circuits such as Near End CrossTalk (NEXT), Far End CrossTalk (FEXT) and
Simultaneous Switching Noise (SSN). Crosstalk can be minimized through various techniques
such as careful layout of the circuit, the use of shielding or isolation techniques, the selection of
appropriate materials and design techniques for interconnects.
- Isolation Techniques: Isolation techniques refer to methods used to
reduce the coupling of signals between two or more interconnects or
components in a circuit. These techniques are often used to mitigate
crosstalk, which can cause distortion or errors in transmitted signals.
There are several types of isolation techniques that can be used in
VLSI design, including:

1. Shielding: This involves the use of physical barriers, such as
metal layers or shields, to reduce the coupling of signals between
interconnects.
2. Staggering: This involves the use of non-uniform spacing
between interconnects to reduce crosstalk.
3. Termination: This involves the use of resistors or other types of
impedance-matching elements at the ends of interconnects to
reduce reflections and crosstalk.
In deep submicron technologies, noise plays an important role in terms of functionality or timing
of device due to several reasons:
● Increasing the number of metal layers. For example, 28 nm technology has 7 or 8 metal
layers and in 7 nm technology there are around 15 metal layers.
● Vertically dominant metal aspect ratio: in lower (newer) technology nodes wires are thin
and tall, whereas in older (larger) nodes wires are wide and thin. Thus a greater proportion of
the capacitance is sidewall capacitance, which maps into coupling capacitance between neighboring wires.
● Higher routing density due to finer geometry means more metal layers are packed in
close physical proximity.
● A large number of interacting devices and interconnect
● Faster waveforms due to higher frequencies. Fast edge rates cause more current spikes
as well as greater coupling impact on the neighboring cells.
● Lower supply voltage, because the supply voltage is reduced it leaves a small margin for
noise.
● The switching activity on one net can affect the coupled signal. The affected signal is
called the victim and the affecting signals are termed aggressors.

How to reduce crosstalk:


● By wire spacing (NDR rules) we can reduce the coupling capacitance between two nets
● Increasing the driving strength of the victim net and decreasing the driving strength of
aggressor net.
● Jumping to higher layer because higher layers have more width
● Insert buffer to split long nets
● Use multiple vias: less resistance and hence less RC delay
● Shielding: High frequency noise is coupled to VDD or VSS since the shield wires are
connected to either VDD or VSS. The coupling capacitance is then to a constant
(non-switching) VDD or VSS net.
Inputs of CTS:
● Technology file(.tf)
● Netlist (.v)
● SDC (.sdc)
● Library files (.lib & .lef)
● TLU+ file
● Placement DEF file
● Clock specification file which contains insertion delay, skew, clock transition, clock
cells, NDR, CTS tree type, CTS exceptions, list of buffers/inverters etc.
Output of CTS:
● Timing report
● Congestion report
● Skew report
● Insertion delay report
● CTS DEF file

ROUTING
Making physical connections between signal pins using metal layers is called Routing. Routing
is the stage after CTS and optimization where exact paths for the interconnection of standard
cells, macros, I/O pins are determined. Electrical connections using metal and vias are created in
the layout, defined by the logical connection present in the netlist (i.e., logical connectivity
converted into physical connectivity).
- Optimization: optimization refers to the process of improving the
performance, power consumption, and/or area utilization of a VLSI
circuit or system.
After CTS, we have information of all the cells, blockages, clock tree buffers/inverters
and I/O pins. The tool relies on this information to electrically complete all connections
defined in the netlist such that:
● There are minimal DRC violations while routing
● The design is 100% routed with minimal LVS violations
● There are minimal SI related violations

- SI (Signal Integrity) violations: SI (Signal Integrity) violation in


VLSI (Very Large-Scale Integration) refers to a situation where the
integrity of an electrical signal is compromised as it travels through a
VLSI circuit or system. This can occur due to various factors, such as
crosstalk, reflections, and noise. These factors can cause the signal to
become distorted or degraded, leading to errors in the system or
reduced performance. There are various techniques that can be used to
minimize the impact of SI violations in VLSI circuits, including the
use of proper layout techniques, adding shielding and grounding, and
using buffers and repeaters to amplify the signal. It is important to
carefully design and test VLSI circuits to ensure that they do not suffer
from SI violations, as this can have significant impacts on the
performance and reliability of the system.

● There must be no or minimal congestion hot spots


● The timing DRCs are met and the timing QoR is good.
Goals of Routing:
● Minimize the total interconnects/wire length.
● Minimize the critical path delay
● Minimize the number of layer changes that the connection has to make (minimizing the
number of Vias)
● Complete the connections without increasing the total area of the block.
● Meeting the timing DRCs and obtaining a good timing QoR
● Minimizing the congestion hotspots
● SI (signal integrity) Driven: reduction of crosstalk noise and delta delays.

Routing Constraints:
● Set constraint to number of layers to be used during routing
● Setting limits on routing to specific regions
● Setting the maximum length for the routing wires
● Blocking routing in specific regions
● Set stringent guidelines for minimum width and minimum spacing means establishing
strict limits on the dimensions of the various elements in the circuit. Setting stringent
guidelines for minimum width and minimum spacing can help to ensure that the circuit
meets the required performance and reliability standards, as well as to reduce the risk of
defects during fabrication.
- Minimum width refers to the smallest allowable width for a
conductor, such as a metal line or via
- Minimum spacing refers to the smallest allowable distance between
two conductors, such as two metal lines or a metal line and a VIA.
● Set preferred routing directions to specific metal layers during routing
● Constraining the routing density. Constraining the routing density refers to setting limits
on the number of interconnects that can be routed in a given area of the circuit. This is
typically done to control congestion, which can occur when there are too many
interconnects in a small area, leading to increased parasitic and reduced performance
● Constraining the pin connections. It refers to establishing limits on the number and
location of the input/output (I/O) pins that can be used to connect the circuit to other
components or systems.

Routing is usually done in multiple stages:


1. Global Routing: Coarse-grain assignment of routes to routing regions refers to the
process of dividing the circuit into larger blocks or regions and assigning each block to a
specific routing resource, such as a metal layer or a routing channel. This is typically
done as part of the global routing process, after the placement of the various components
has been determined. The goal of coarse grain assignment is to improve the efficiency of
the routing process by reducing the number of routing resources that need to be
considered at each step. It can also help to reduce congestion and improve the
performance of the final circuit by allowing the routing resources to be used more evenly
across the circuit.
● Identifies a routable path between each net's driving/driven pins over the shortest
distance.
● Does not consider DRC rules and gives an overall view of routing and
congested nets
● Assign layer to the nets
● Identify and assign net segments over the specific routable window called
Global Route Cell (GRC)
● Avoid congested areas and also long detours (a detour refers to a route
taken by an interconnect that deviates from the most direct path between
its source and destination).
● Avoid routing over blockages.
● Avoid routing over pre-routed nets such as rings/stripes/rails
● Uses Steiner Tree and Maze algorithm
- Steiner Trees are commonly used to minimize the total
wirelength of interconnects between components in a circuit,
which can help to improve the performance and reduce the
power consumption of the circuit. They can also be used to
optimize the routing of power and ground connections in a
circuit, which can help to reduce noise and improve the
reliability of the circuit
- A maze algorithm is a type of routing algorithm that is used to
find a path between two points in a circuit. The algorithm
works by "growing" a path through the circuit by adding new
segments to the path one at a time, similar to the way a mouse
might search for a way out of a maze. Maze algorithms are
commonly used in VLSI design to route interconnects between
components in a circuit, and they can be used at various stages
of the design process, from high-level synthesis to detailed
routing. They are particularly well-suited to circuits with
complex routing constraints or with a large number of
components, as they can find a solution quickly and efficiently.
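A minimal, purely illustrative sketch of the grid-expansion (Lee-style) maze idea described above is shown here, on a tiny hand-made grid with a blockage; real routers work on far larger routing graphs with layers, costs, and rip-up and re-route:

from collections import deque

# 0 = free routing cell, 1 = blocked cell (existing route or blockage).
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

def maze_route(grid, src, dst):
    """Breadth-first wave expansion from src; backtrace from dst to recover a shortest path."""
    rows, cols = len(grid), len(grid[0])
    dist = {src: 0}
    parent = {}
    q = deque([src])
    while q:
        r, c = q.popleft()
        if (r, c) == dst:
            break
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                parent[(nr, nc)] = (r, c)
                q.append((nr, nc))
    if dst not in dist:
        return None                       # no route: a real tool would rip up and re-route
    path, node = [dst], dst
    while node != src:
        node = parent[node]
        path.append(node)
    return path[::-1]

print(maze_route(grid, (0, 0), (4, 4)))   # shortest unblocked path around the blockage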

Detailed Routing: Fine grain assignment of routes to routing tracks.


● Detailed routing follows up with the track routed net segments and performs the complete
DRC aware and timing driven routing.
● It is the final routing for the design, built after CTS once the timing is frozen
● Filler cells are added before detailed routing
● Detailed routing is done after analyzing the causes of congestion in the design (adding density
screens or changing the floorplan, etc.)
● Timing-driven routing: Net topology optimization and resource allocation to critical
nets
● Post-routing optimization:
o Signal Integrity (SI) Optimization by NDRs and Shielding for the sensitive
nets
o Types of Shielding for sensitive nets: 1. Same Layer Shielding
2. Adjacent Layer/ Coaxial Shielding

Inputs of Routing:
● Netlist
● All cells and ports should be legally placed with clock tree structure and CTS DEF file
● NDRs
● Routing Blockages
● Technology data (metal layer information, e.g., .lef and tech file), DRC rules, via creation rules, grid
rules (metal pitch etc.)

Outputs of Routing:
● Routing .db file or DEF file with no opens and shorts
- .db file is a file that contains database information about the
design of a circuit. The .db file typically contains a variety of
data about the circuit, including:
o Component information: This may include the
type, location, and connectivity of each component
in the circuit.
o Netlist information: This may include a list of the
interconnects in the circuit and the components that
they connect.
o Constraint information: This may include design
constraints such as minimum spacing requirements
or routing guidelines.
o Layout information: This may include the physical
layout of the circuit, including the placement and
routing of the various components and
interconnects.

● Timing report
● Congestion report
● Skew and Insertion delay report
● Geometric layouts of all nets

Metal Layers:
Metal layers connect points between two ends. To route any PG/signal/clock net we need metal
layers, and many metal layers are used to complete the routing. The number of
metal layers to be used depends on the foundry and technology node: 14 metal layers are
used for the 7 nm technology of TSMC and 13 metal layers are used for the 7 nm Samsung
technology node. Having more metal layers helps the design cope better
with congestion. The metal layers are drawn from M0 to M14, where the
structure is built with alternating vertical and horizontal layers. M0, M2, M4, M6, M8, M10, M12, M14 are
horizontal layers and M1, M3, M5, M7, M9, M11, M13 are vertical layers. The reason behind this
alternating vertical and horizontal routing is to avoid routing congestion and crosstalk between metal
layers. To connect these horizontal and vertical layers we need vias, which connect two metal
layers. Resistance decreases as we move up the metal stack because the metal
cross-section increases at higher levels. Generally M1 has about 1.5 times more resistance than the M2
metal layer, M2 has about 1.5 times more resistance than M3, and so on. In the figure below
we can see that the two points are being connected with different metal layers. The
highest metal layers are usually used for supply/ground distribution and for long
connections.
Layer and its use in routing:

● Diffusion: Never used for routing; shared only when multiple transistors sit on a single strip.

● Poly: Used for in-cell routing. Used outside the cell for same-row local routing only in
processes with fewer than 3 metal layers.

● Metal 1: Supply, ground, in-cell routing, horizontal routing channels.

● Metal 2: Local in-row connections when more than 3 metals are available; vertical
inter-row connections.

● Metal 3: Generalized routing, usually long range. The higher the metal layer, the longer the
range.

● Metal N (top): Supply.

For cell-to-cell connections within the same row, basically poly and metal 1 are used within
the same row. This is a little bit complicated because these two layers are also used for routing
inside the cells themselves, so we have to use them carefully to avoid unwanted intersections.
On the other hand, if we have more than one metal layer available we can use them for
routing within the same row of cells. The inputs and outputs of standard cells use the metal 1 layer
for connectivity, and then a via to reach metal 2 for further routing. The following
figure shows this inter-cell routing.
When we are routing the cells that are not in the same row we are going to use the vertical track
of metals as well as the horizontal track which are available between the cell rows. Between the
cell rows we have tracks in metal 1. Shown in figure below.

However, rows are not all the same length, and for a longer row of standard cells the tool
has to thicken the supply and ground rails of that row. The longer the row, the thicker the supply and
ground rails need to be, because the metal line has a resistive drop and the last cell of the
row would see a lower voltage than the first cell of the row unless we make the rail
wider.
Looking at the following figure, there are rows of
standard cells arranged in a block. There is VDD at the top and GND at the bottom,
placed horizontally in a higher metal layer; for VDD and GND we usually use the
higher metal layers. Metal 1 (light pink in color) feeds the ground voltage to the
standard cells. Many vias are used to connect the vertical layer (light blue in color) and the
horizontal layers. They distribute the supply and ground vertically, providing the VDD and
GND voltages to the standard cells, while the horizontal connections (royal blue in
color) are maintained.
The problem is that whenever we provide the VDD and GND voltage, it may be lower at the
bottom of the distribution, giving a lower voltage at the far end of a standard cell row. To
improve this, it is better to provide both supply and ground voltage at the top and the
bottom of the design, and here we short the horizontal top VDD with the bottom VDD through
the metal layer (light pink in color). This helps to reduce the voltage drop to
almost half, and the maximum drop now occurs at the middle of the block of standard
cell rows. Still, the supply will be worst near the right end of the block and ground will be
worst near the left end of the block. See the following figure.

To solve the above problem the best way is to create a power ring and distribute the supply and
ground voltage by creating a power mesh. Here the maximum drop is observed at the middle
of the block.
[source: https://www.youtube.com/watch?v=9RLP8_WZNX8]
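The IR-drop behaviour described above can be sketched with a simple resistive-ladder model of a single standard-cell rail. The segment resistance and per-cell current below are made-up, illustrative values:

# Illustrative resistive-ladder model of IR drop along one standard-cell power rail.
n_cells = 50        # cells along the row, each drawing current
i_cell = 50e-6      # current drawn per cell (A)
r_segment = 0.2     # rail resistance between adjacent cells (ohm)

def worst_drop_single_feed(n, i, r):
    # Rail fed from one end: the segment nearest the feed carries the current of all n cells,
    # the next carries n-1 cells' current, and so on; the drop accumulates toward the far end.
    drop = 0.0
    for k in range(n):
        drop += (n - k) * i * r
    return drop

def worst_drop_double_feed(n, i, r):
    # Rail fed from both ends: by symmetry each half is fed from its nearer end,
    # so the worst drop sits near the middle of the row and is much smaller.
    return worst_drop_single_feed(n // 2, i, r)

print(f"Single-end feed, worst IR drop: {1e3 * worst_drop_single_feed(n_cells, i_cell, r_segment):.2f} mV")
print(f"Both-ends feed, worst IR drop:  {1e3 * worst_drop_double_feed(n_cells, i_cell, r_segment):.2f} mV")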

GBoxes or Gcells:
This topic is basically discussing global routing congestion. Gcells (Global Cells) are needed to
find out the routing congestion of layers in the global routing step. So now let's discuss how the
congestion is calculated.
First of all the tool divides the design into a specific number of gcells. For each gcell the tool will
calculate the number of cells present there, and after that the tool will calculate how many nets
are there in each gcell.
let,
total number of nets = n
After calculation the tool now knows how many nets are passing through each gcell. Now the
tool will see how many metal layers are available there which are defined in the technology
library.
let,
metal layers = 6
horizontal = 3 [m1,m3,m5]
vertical =3 [m2,m4,m6]
lets,
There are 25 nets in one of the gcells, connecting to the cells of that particular gcell, and
there are 6 metal layers on which these 25 nets can be routed. Among these, 17 nets are going vertically and
8 nets are going horizontally when routed through the metal layers. Just keep in mind that
m1, m3, m5 route horizontally and m2, m4, m6 route vertically. While doing the global
routing the tool assigns layers to the different nets.
Let's say we have 7 nets for m2, 6 nets for m4 and 4 nets for m6 which are vertical in direction.
This is our layer assignment that happened in the placement stage. This is done by the tool itself
while doing the layer assignment so we can say that this is what the tool is demanding. So our
demand for a perfect routing without any shorts and issues is,
demand of m2 = 7 nets but we can supply m2 = 4 nets; overflow = demand - supply = 7-4 = 3
demand of m4 = 6 nets but we can supply m4 = 3 nets; overflow = demand - supply = 6-3 = 3
demand of m6 = 4 nets but we can supply m6 = 3 nets ; overflow = demand - supply = 4-3 = 1
so the total overflow = (3+3+1) = 7
N.B: Higher overflow means higher routing congestion. Correct Signal routes are identified
by a sophisticated tool algorithm called Maze routing algorithm to find routes between
GBoxes.
To know more about Maze Algorithm: https://www.vlsisystemdesign.com/maze-routing-lees-algorithm/
source : https://www.youtube.com/watch?v=5FjnJnVQrc4&t=330s
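The overflow arithmetic above can also be written as a small script; the layer names and net counts are taken directly from the example, and are illustrative only:

# GCell overflow calculation following the example above.
demand = {"m2": 7, "m4": 6, "m6": 4}   # nets the global router wants on each vertical layer
supply = {"m2": 4, "m4": 3, "m6": 3}   # tracks actually available in this gcell

total_overflow = 0
for layer in demand:
    overflow = max(0, demand[layer] - supply[layer])
    total_overflow += overflow
    print(f"{layer}: demand {demand[layer]}, supply {supply[layer]}, overflow {overflow}")

print(f"total overflow = {total_overflow}")   # 3 + 3 + 1 = 7; higher overflow means higher congestion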

Why is one metal layer routed vertically and another is horizontally?


Ans: To reduce routing congestion and coupling (crosstalk) between adjacent metal layers.

Switch Boxes or Scell:


Detailed route follows the same pathways and tracks identified in the global routing stage. It
divides each GCell into smaller squares called switch boxes or SCells. The detailed routing
algorithm is similar to the global routing algorithm, but it uses the assigned tracks and global routes
(Groutes) from the global routing stage.

Routing Issues:
1. Cross-talk: Switching of the signal in one net can interfere with the neighboring net due
to cross coupling capacitance known as cross-talk. Cross-talk may lead to setup and hold
violation. Cross talk is the effect of the neighboring signal nets when the signal on a
particular net toggles fast.The effect could be a change in logic level, signal rise/fall
times, and frequency of the signal. The signal which affects the neighboring net is called
aggressor net and the signal which gets affected is called victim net. The techniques to
avoid crosstalk effects are:
● spacing the signals apart so that they do not affect each other
● shielding the nets which carried high-frequency signals
● increasing the net width so that cross talk effect is negligible

2. IR: IR issue is caused when the nets carrying signals, especially power signals are long
enough and have narrow tracks. The signal strength Vdd reduces due to the resistance of
long tracks and hence by the time they reach the destination nets, the signal amplitude
drops considerably resulting in functional failures. Major causes of IR issues in SoC
design are the following:
● Improper placement of power and ground pads
● Smaller than required width of nets to carry Vdd and Gnd signals
● Insufficient width of core ring, power straps, and rails
● Small number of power straps
● Disconnects in the power signal nets due to missing vias
IR issue is addressed by checking the above-listed probable root causes and fixing them. It is
necessary to set the safe route lengths, correct vias, widths of power ground rings, straps, rails,
and nets for the power and ground signal tracks during physical design.
3. Electromigration (EM): Electromigration is a long-term process and can result in failures
after the SoC has been working for many years. It arises when a continuously large current flows in
one direction, displacing the metal ions and causing physical damage to the tracks, pins, pads,
and ports. Design rules such as track widths, current limits, and signal direction are to be considered to avoid
EM issues in the long run.

TIMING CLOSURE
Timing closure is the process of satisfying timing constraints through layout optimization and
netlist modifications. Following are multiple iterations a Place and Route tool does for timing
closure.
● Timing Driven Placement: Timing driven placement tries to place cells along the timing-critical
path close to each other to reduce net delays and meet setup timing.
● Timing Driven Routing: In modern chips, interconnect delays contribute a significant
portion of the path delay. Timing driven routing seeks to minimize this by efficient routing.
● Physical Synthesis: Physical synthesis optimizes the timing using multiple techniques
- Gate Sizing
- Buffering
- Netlist restructuring
PHYSICAL DESIGN VERIFICATION

Design Rule Check (DRC)


● Design Rule Check (DRC) is the process of checking physical layout data against
fabrication specific rules specified by the foundry to ensure successful fabrication.
● Process specific design rules must be followed when drawing layouts to avoid any
manufacturing defects during the fabrication of an IC.
● Process design rules are the minimum allowable drawing dimensions which
affects the X and Y dimensions of layout and not the depth/vertical dimensions.
● As Technology Shrinks
o Number of design rules are increasing
o Complexity of routing rules is increasing
o Increasing the number of objects involved
o More design rules depending on width, halo, parallel length.

● Violating a design rule might result in a non-functional circuit or low yield.

Design Rules Example:


Layout Versus Schematic (LVS):
DRC only verifies that the given layout satisfies the design rules provided by the fabrication unit.
It does not ensure the functionality of the layout. Because of this, the idea of LVS originated.

LVS Flow:
● LVS verifies the connectivity of a Verilog Netlist and Layout Netlist (Extracted
Netlist from GDS)
● Tool extracts circuit devices and interconnects from the layout and saved as
Layout Netlist (SPICE Format)
● As LVS performs a comparison between the 2 netlists, it doesn’t compare the
functionalities of the netlists

Input Requirements:
● LVS rule deck: It is a set of code written in Standard Verification Rule
Format (SVRF) or TCL Verification Format (TVF). It guides the tool to
extract the devices and connectivity of IC’s. It contains the layer definition
to identify the layers used in layout file and to match it with the location of
layer in GDS. It also contains device structure definitions.
● Verilog Netlist
● Physical layout database (GDS)
● Spice Netlist (Extracted by the tool from GDS)

LVS Check Examples:


● Open Net Error

● Short Net Error

● Extract Error
● Compare Errors
Electrical Rule Check (ERC)
● ERC is used to analyze or confirm the electrical connectivity of an IC design.
● ERC checks are run to find out the following errors in layout:
o To locate devices connected directly between Power and Ground
o To locate floating devices, substrates and wells
o To locate devices which are shorted
o To locate devices with missing connections
● Well tap connection error: The well taps should bias the wells as specified in the
schematics.

● Well tap density error: If there are not enough taps for a given area then this error is
flagged.
● Taps need to be placed regularly to bias the well and prevent latch-up. E.g., in a typical
90 nm process the well tap density rules require well taps to be placed every 50 microns.
ISSUES IN PHYSICAL DESIGN

1. Latch-Up: Latch-up in standard cells is a condition which results in large leakage
current from power supply (VDD) to Ground leading to circuit failures.
Following are some popular design techniques used to prevent latch-up in chip design:
a. Adding Guard ring by additional implants
b. Adding well tap cells
c. Creating isolation trench around the device structures
2. Antenna Effect: During CMOS device fabrication processes such as plasma etching,
there is a chance that a large amount of charge gets accumulated in the gate region of the
transistor if there is extra metal connected. During these processes, there is a possibility
of gate oxide getting easily damaged by electrostatic discharge. The static charge
collected during the multilayer metallization process can damage the device leading to
chip failure. The charge accumulated is conducted by the diodes formed in source and
drain diffusion regions, which will change transistor behavior. This is called charge
collecting antenna effect or antenna problem in VLSI chips. The antenna effect affects transistors in
the following ways:
• Reduces threshold voltage.
• Changes the IV characteristics.
• Reduces life expectancy of the SoC.
• Increases gate leakage
Antenna rules are provided in the design rule manual (DRM) document to avoid the antenna
effect. They specify the maximum allowable ratio of metal area to gate area for each
interconnect layer, also known as the antenna ratio.
● antenna ratio = exposed metal area / gate area
antenna ratio < max. antenna ratio (specified by the foundry)
To fix the problem we can do the following things:
1. Change routing order: We can use a higher metal layer instead of the same metal layer
to build a connection with the gate, which means splitting the metal interconnect wire
connected to the gate into two segments that are connected to each other through a
jumper on a higher metal layer.
2. Antenna diode mode: The second way is to connect a reverse-biased diode to the long
wire connected to the gate. This is called diode protection mode.
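A simple check of the rule above might look like the following sketch; the metal area, gate area, and maximum antenna ratio are made-up values, since the real limits come from the foundry's DRM:

# Illustrative antenna-ratio check (all values are placeholders, not foundry data).
exposed_metal_area = 12.0   # um^2 of metal connected to the gate during this etch step
gate_area = 0.05            # um^2 of gate oxide area
max_antenna_ratio = 400.0   # maximum ratio allowed by the (hypothetical) DRM

antenna_ratio = exposed_metal_area / gate_area
if antenna_ratio < max_antenna_ratio:
    print(f"Antenna ratio {antenna_ratio:.0f} < {max_antenna_ratio:.0f}: OK")
else:
    print(f"Antenna ratio {antenna_ratio:.0f} >= {max_antenna_ratio:.0f}: violation -> "
          "break the route with a layer jump or add an antenna diode")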

References:
1. VLSI Deep Dive
2. iVLSI
3. VLSI Backend Adventure
