21EC71 Advanced VLSI Module 1 - Module
Advanced VLSI
(21EC71)
SEMESTER – VII
Module 1
Introduction to ASICs: Full custom, Semi-custom and Programmable ASICs, ASIC
Design flow, ASIC cell libraries. CMOS Logic: Data path Logic Cells: Data Path
Elements, Adders: Carry skip, Carry bypass, Carry save, Carry select, Conditional
sum, Multiplier (Booth encoding), Data path Operators, I/O cells, Cell Compilers.
Textbook 1
21EC71 Advanced VLSI
CONTENTS
Introduction to ASICs
Types of ASICs
Full-Custom ASICs
Semi-Custom ASICs
Standard-Cell Based ASICs
Gate Array Based ASICs
Programmable ASICs
Programmable Logic Devices
Field-Programmable Gate Arrays
Design Flow
ASIC Cell Libraries
CMOS Logic Cells
Datapath Logic Cells
Datapath Elements
Adders
Multipliers
Other Datapath Operators
IO Cells
Cell Compilers
INTRODUCTION TO ASICS
An ASIC is an application-specific integrated circuit. Figure 1.0(a) shows an IC
package (this is a pin-grid array, or PGA, shown upside down; the pins will go through holes
in a printed-circuit board). A PGA package is usually made from a ceramic material, but
plastic packages are also common.
FIGURE 1.0 An integrated circuit (IC). (a) A pin-grid array (PGA) package. (b)
The silicon die or chip is under the package lid.
The earliest ICs used bipolar technology and the majority of logic ICs used either
transistor-transistor logic (TTL) or emitter-coupled logic (ECL).
Although invented before the bipolar transistor, the metal-oxide-silicon (MOS) transistor
was initially difficult to manufacture because of problems with the oxide interface.
By the early 1980s the aluminum gates of the transistors were replaced by polysilicon
gates.
The introduction of polysilicon as a gate material was a major improvement in CMOS
technology.
The principal advantage of CMOS over NMOS technology is lower power consumption.
Another advantage of a polysilicon gate was a simplification of the fabrication process,
allowing devices to be scaled down in size.
As different types of custom ICs began to evolve for different types of applications, these
new ICs gave rise to a new term: application-specific IC, or ASIC.
Examples of ICs that are ASICs include:
1. a chip for a toy bear that talks;
2. a chip for a satellite;
3. a chip designed to handle the interface between memory and a microprocessor for a
workstation CPU;
4. a chip containing a microprocessor as a cell together with other logic.
Examples of ICs that are not ASICs include standard parts such as:
1. memory chips sold as a commodity item: ROMs, DRAM, and SRAM;
2. microprocessors;
3. TTL or TTL-equivalent ICs at SSI, MSI, and LSI levels.
TYPES OF ASICS
ICs are made on a thin (a few hundred microns thick), circular silicon wafer, with
each wafer holding hundreds of die (sometimes people use dies or dice for the plural of die).
The transistors and wiring are made from many layers (usually between 10 and 15
distinct layers) built on top of one another. Each successive mask layer has a pattern that is
defined using a mask similar to a glass photographic slide. The first half-dozen or so layers
define the transistors. The last half-dozen or so layers define the metal wires between the
transistors (the interconnect).
ASICs
    Standard Cell based ASICs
    Gate Array based ASICs
        Channeled gate arrays
        Channelless gate arrays
        Structured gate arrays
    Programmable Logic Devices (PLDs)
        Programmable Array Logic (PALs)
        Programmable Logic Array (PLAs)
    Field Programmable Gate Arrays (FPGAs)
FULL-CUSTOM ASICS
In a full-custom ASIC an engineer designs some or all of the logic cells, circuits, or
layout specifically for one ASIC.
This means the designer abandons the approach of using pretested and precharacterized
cells for all or part of that design.
It makes sense to take this approach only if there are no suitable existing cell libraries
available that can be used for the entire design. This might be because existing cell
libraries are not fast enough, or the logic cells are not small enough or consume too
much power.
There is one growing member of this family, the mixed analog/digital ASIC.
FIGURE 1.2: A cell-based ASIC (CBIC) die with a single standard-cell area (a flexible
block) together with four fixed blocks. The flexible block contains rows of standard
cells.
FIGURE 1.3 Standard cells are stacked like bricks in a wall; the abutment box (AB)
defines the edges of the brick. The difference between the bounding box (BB) and the
AB is the area of overlap between the bricks. Power supplies (labeled VDD and GND)
run horizontally inside a standard cell on a metal layer that lies above the transistor
layers. This standard cell has center connectors (the three squares, labeled A1, B1, and
Z) that allow the cell to connect to others.
FIGURE 1.4 Routing the CBIC (cell-based IC). The use of regularly shaped standard
cells from a library allows ASICs like this to be designed automatically. This ASIC
uses two separate layers of metal interconnect (metal1 and metal2) running at right
angles to each other.
PROGRAMMABLE ASICS
PROGRAMMABLE LOGIC DEVICES
Programmable logic devices (PLDs) are standard ICs that are available in standard
configurations from a catalog of parts and are sold in very high volume to many
different customers.
However, PLDs may be configured or programmed to create a part customized to a
specific application, and so they also belong to the family of ASICs.
● No customized mask layers or logic cells
● Fast design turnaround
FIGURE 1.8 A programmable logic device (PLD) die. The macrocells typically
consist of programmable array logic followed by a flip-flop or latch. The macrocells
are connected using a large programmable interconnect block.
The simplest type of programmable IC is a read-only memory (ROM). The most common
types of ROM use a metal fuse that can be blown permanently (a programmable ROM or
PROM).
An electrically programmable ROM, or EPROM, uses programmable MOS transistors
whose characteristics are altered by applying a high voltage. You can erase an EPROM
either by using another high voltage (an electrically erasable PROM, or EEPROM) or
by exposing the device to ultraviolet light (UV-erasable PROM, or UVPROM).
There is another type of ROM that can be placed on any ASIC: a mask-programmable
ROM (mask-programmed ROM or masked ROM). A masked ROM is a regular array of
transistors permanently programmed using custom mask patterns. An embedded
masked ROM is thus a large, specialized logic cell.
FIGURE 1.9 A field-programmable gate array (FPGA) die. The exact type, size, and number of the
programmable basic logic cells vary tremendously.
DESIGN FLOW
Figure 1.10 shows the sequence of steps to design an ASIC; we call this a design flow.
1. Design entry. Enter the design into an ASIC design system, either using a hardware
description language (HDL) or schematic entry.
2. Logic synthesis. Use an HDL (VHDL or Verilog) and a logic synthesis tool to
produce a netlist, a description of the logic cells and their connections.
3. System partitioning. Divide a large system into ASIC-sized pieces.
4. Prelayout simulation. Check to see if the design functions correctly.
5. Floorplanning. Arrange the blocks of the netlist on the chip.
6. Placement. Decide the locations of cells in a block.
7. Routing. Make the connections between cells and blocks.
8. Extraction. Determine the resistance and capacitance of the interconnect.
9. Postlayout simulation. Check to see that the design still works with the added loads of the
interconnect.
The first choice, using an ASIC-vendor library, requires you to use a set of design tools
approved by the ASIC vendor to enter and simulate your design. You have to buy the
tools, and the cost of the cell library is folded into the NRE.
The second and third choices require you to make a buy-or-build decision. If you
complete an ASIC design using a cell library that you bought, you also own the masks
(the tooling) that are used to manufacture your ASIC. This is called customer-owned
tooling (COT). A library vendor normally develops a cell library using information about
a process supplied by an ASIC foundry.
The third choice is to develop a cell library in-house. Many large computer and
electronics companies make this choice. Most of the cell libraries designed today are still
developed in-house despite the fact that the process of library development is complex
and very expensive.
However created, each cell in an ASIC cell library must contain the following:
● A physical layout
● A behavioral model
● A Verilog/VHDL model
● A detailed timing model
● A test strategy
● A circuit schematic
● A cell icon
● A wire-load model
● A routing model
1. The ASIC designer may not actually see the layout if it is hidden inside a phantom, but the
layout will be needed eventually.
2. The ASIC designer needs a high-level, behavioral model for each cell because simulation
at the detailed timing level takes too long for a complete ASIC design.
3. The designer may require Verilog and VHDL models in addition to the models for a
particular logic simulator.
4. ASIC designers also need a detailed timing model for each cell to determine the
performance of the critical pieces of an ASIC. Library engineers simulate the delay of
each cell, a process known as characterization.
5. All ASICs need to be production tested (programmable ASICs may be tested by the
manufacturer before they are customized, but they still need to be tested). Simple cells in
small or medium-size blocks can be tested using automated techniques, but large blocks
such as RAM or multipliers need a planned strategy.
6. The cell schematic (a netlist description) describes each cell so that the cell designer can
perform simulation for complex cells.
7. A cell icon helps in identifying the library cell.
8. In order to estimate the parasitic capacitance of wires before we actually complete any
routing, we need a statistical estimate of the capacitance for a net in a given size circuit
block. This usually takes the form of a look-up table known as a wire-load model.
9. We also need a routing model for each cell. Large cells are too complex for the physical
design or layout tools to handle directly and we need a simpler representation. The
phantom may include information that tells the automated routing tool where it can and
cannot place wires over the cell, as well as the location and types of the connections to
the cell.
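The wire-load model in item 8 is essentially a look-up table indexed by fanout. The sketch below is a minimal illustration of that idea only: the table values, the function name estimate_net_capacitance, and the choice of linear interpolation between entries are all assumptions made for the example, not data or behavior taken from any real cell library.

```python
from bisect import bisect_left

# Hypothetical wire-load table: fanout -> estimated net capacitance in pF.
# The values are illustrative only and do not come from a real library.
WIRE_LOAD_TABLE = {1: 0.010, 2: 0.018, 4: 0.035, 8: 0.070}


def estimate_net_capacitance(fanout, table=WIRE_LOAD_TABLE):
    """Estimate the capacitance (pF) of a net before routing.

    Exact table entries are returned directly. Other fanouts are linearly
    interpolated (an assumption of this sketch); fanouts outside the table
    are extrapolated using the slope of the nearest segment.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    points = sorted(table.items())
    fanouts = [f for f, _ in points]
    i = bisect_left(fanouts, fanout)
    if i < len(fanouts) and fanouts[i] == fanout:
        return points[i][1]
    # Clamp the segment index so we always have two neighbouring points.
    hi = min(max(i, 1), len(points) - 1)
    (f0, c0), (f1, c1) = points[hi - 1], points[hi]
    return c0 + (c1 - c0) * (fanout - f0) / (f1 - f0)
```

A real library would hold one such table per circuit-block size, since the estimate depends on how large the block containing the net is.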
LIB files (.lib)
https://teamvlsi.com/2020/08/standard-cell-library-in-asic-design.html
(a) A full-adder (FA) cell with inputs (b) A 4-bit adder. (c) The layout, using two-level
metal, with data in m1 and control in m2. (d) The datapath layout.
What is the difference between using a datapath, standard cells, or gate arrays? Cells
are placed together in rows on a CBIC or an MGA, but there is generally no regularity to
the arrangement of the cells within the rows; we let software arrange the cells and complete
the interconnect.
Datapath layout automatically takes care of most of the interconnect between the cells,
with the following advantage:
● Regular layout produces predictable and equal delay for each bit.
There are also some disadvantages of using a datapath:
● The overhead (buffering and routing the control signals, for example) can make a
narrow (small number of bits) datapath larger and slower than a standard-cell (or even
gate-array) implementation.
● Datapath cells have to be predesigned (otherwise we are using full-custom design) for
use in a wide range of datapath sizes. Datapath cell design can be harder than designing
gate-array macros or standard cells.
● Software to assemble a datapath is more complex and not as widely used as software
for assembling standard cells or gate arrays.
DATAPATH ELEMENTS
Figure 2.21 shows some typical datapath symbols for an adder.
ADDERS
FIGURE 2.22 The carry-save adder (CSA). (a) A CSA cell. (b) A 4-bit CSA. (c) Symbol for a
CSA. (d) A four-input CSA. (e) The datapath for a four-input, 4-bit adder using CSAs with a
ripple-carry adder (RCA) as the final stage. (f) A pipelined adder. (g) The datapath for the
pipelined version showing the pipeline registers as well as the clock control lines that use m2.
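As a behavioral illustration of parts (d) and (e) of the figure, the sketch below reduces four operands with two carry-save stages and then performs a single carry-propagate addition (the job of the RCA in the figure). The function names and the use of Python integers as bit vectors are choices made for this example; it models the arithmetic only, not the cell layout or the pipelining.

```python
def carry_save_add(a, b, c, width=4):
    """One CSA stage: compress three operands into a sum word and a carry word.

    Each bit position is an independent full adder, so no carry ripples
    along the word. The invariant is a + b + c == sum_word + carry_word.
    """
    mask = (1 << width) - 1
    a, b, c = a & mask, b & mask, c & mask
    sum_word = a ^ b ^ c                          # per-bit sum
    carry_word = ((a & b) | (b & c) | (a & c)) << 1  # per-bit carry, weight x2
    return sum_word, carry_word


def add_four(a, b, c, d, width=4):
    """Add four width-bit operands using two CSA stages and one final add."""
    s1, c1 = carry_save_add(a, b, c, width)
    # The first-stage carry word can be one bit wider than the operands.
    s2, c2 = carry_save_add(s1, c1, d, width + 1)
    # Final carry-propagate addition (a ripple-carry adder in the figure).
    return s2 + c2
```

Only the last step has a carry chain, which is why CSAs are attractive when many operands must be summed.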
FIGURE 2.24 The conditional-sum adder. (a) A 1-bit conditional adder that calculates the sum
and carry out assuming the carry in is either '1' or '0'. (b) The multiplexer that selects between
sums and carries. (c) A 4-bit conditional-sum adder with carry input, C[0].
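The idea in the caption (compute each result twice, once for carry in '0' and once for carry in '1', then select with a multiplexer) can be sketched recursively. This is a behavioral model written for these notes: the function names and the LSB-first bit-list representation are assumptions, and the recursion stands in for the levels of multiplexers in part (c).

```python
def conditional_sum(a_bits, b_bits):
    """Return ((sum_bits, cout) for cin=0, (sum_bits, cout) for cin=1).

    a_bits and b_bits are equal-length lists of 0/1 values, LSB first.
    """
    n = len(a_bits)
    if n == 1:
        a, b = a_bits[0], b_bits[0]
        # 1-bit conditional cell: both possible answers are produced.
        return ([a ^ b], a & b), ([a ^ b ^ 1], a | b)
    half = n // 2
    lo0, lo1 = conditional_sum(a_bits[:half], b_bits[:half])
    hi0, hi1 = conditional_sum(a_bits[half:], b_bits[half:])

    def merge(lo):
        lo_sum, lo_carry = lo
        # The multiplexer: the low half's carry out selects the high half.
        hi_sum, hi_carry = hi1 if lo_carry else hi0
        return lo_sum + hi_sum, hi_carry

    return merge(lo0), merge(lo1)


def conditional_sum_add(a, b, carry_in=0, width=4):
    """Add two width-bit numbers; returns (sum modulo 2**width, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    for_cin0, for_cin1 = conditional_sum(a_bits, b_bits)
    sum_bits, carry_out = for_cin1 if carry_in else for_cin0
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, carry_out
```

The real carry input is needed only at the final selection, so the two halves can be evaluated in parallel.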
FIGURE 2.23 The Brent-Kung carry-lookahead adder (CLA). (a) Carry generation in a
4-bit CLA. (b) A cell to generate the lookahead terms, C[0] to C[3]. (c) Cells L1, L2, and
L3 are rearranged into a tree that has less delay. Cell L4 is added to calculate C[2] that
is lost in the translation. (d) and (e) Simplified representations of parts a and c. (f) The
lookahead logic for an 8-bit adder. The inputs, 0:7, are the propagate and carry terms
formed from the inputs to the adder. (g) An 8-bit Brent-Kung CLA. The outputs
of the lookahead logic are the carry bits that (together with the inputs) form the sum.
One advantage of this adder is that delays from the inputs to the outputs are more
nearly equal than in other adders. This tends to reduce the number of unwanted and
unnecessary switching events and thus reduces power dissipation.
MULTIPLIERS
There are two items we can attack to improve the performance of a multiplier: the
number of partial products and the addition of the partial products. Booth encoding reduces
the number of partial products by a factor of two and thus considerably reduces the area as
well as increasing the speed of our multiplier.
Booth Encoding
The above multiplier architecture can be divided into two stages. In the first stage the partial
products are formed by the Booth encoder and partial product generator (PPG). In the second
stage the partial products from the first stage are merged to form the result.
Andrew Donald Booth proposed Booth's multiplication algorithm, which multiplies two
signed binary numbers in 2's complement form.
RADIX-2 BOOTH RECODING
https://linus5.blogspot.com/p/design-and-implementation-of-efficient.html
https://digitalsystemdesign.in/booths-multiplication-algorithm/
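A minimal sketch of radix-2 Booth recoding follows, assuming a 2's complement multiplier of a known width. Each pair of adjacent multiplier bits is recoded into a digit from {-1, 0, +1}, and each nonzero digit contributes one shifted copy of the multiplicand. The function names are chosen for this example. Note that radix-2 recoding still produces one digit per multiplier bit; the factor-of-two reduction in partial products quoted earlier comes from the radix-4 (modified Booth) form, which is not shown here.

```python
def booth_radix2_digits(multiplier, width):
    """Recode a width-bit 2's complement multiplier into Booth digits.

    Returns a list of digits in {-1, 0, +1}, least significant first.
    Digit i is x[i-1] - x[i], with an implicit x[-1] = 0.
    """
    bits = multiplier & ((1 << width) - 1)
    digits = []
    previous = 0  # implicit bit to the right of the LSB
    for i in range(width):
        current = (bits >> i) & 1
        # Bit pair (current, previous): 01 -> +1, 10 -> -1, 00 or 11 -> 0.
        digits.append(previous - current)
        previous = current
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Multiply signed integers by summing Booth-weighted partial products."""
    digits = booth_radix2_digits(multiplier, width)
    # Each nonzero digit selects +/- the multiplicand shifted left by i bits.
    return sum(d * (multiplicand << i) for i, d in enumerate(digits))
```

For example, the 4-bit multiplier 0110 (decimal 6) recodes to the digits 0, -1, 0, +1 (LSB first), that is 8 - 2, so only two partial products are nonzero.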
FIGURE 2.31 Symbols for datapath elements. (a) An array or vector of flip-flops (a
register). (b) A two-input NAND cell with databus inputs. (c) A two-input NAND cell with a
control input. (d) A buswide MUX. (e) An incrementer/decrementer. (f) An all-zeros
detector. (g) An all-ones detector. (h) An adder/subtracter.
IO CELLS
FIGURE 2.32 A three-state bidirectional output buffer. When the output enable,
OE, is '1' the output section is enabled and drives the I/O pad. When OE is '0' the output
buffer is placed in a high-impedance state.
CELL COMPILERS
The process of hand crafting circuits and layout for a full-custom IC is a tedious,
time-consuming, and error-prone task.
There are two types of automated layout assembly tools, often known as silicon
compilers.
1. The first type produces a specific kind of circuit, a RAM compiler or multiplier
compiler, for example.
2. The second type of compiler is more flexible, usually providing a
programming language that assembles or tiles layout from an input command
file, but this is full-custom IC design.
In addition to producing layout we also need a model compiler so that we can verify the
circuit at the behavioral level, and we need a netlist from a netlist compiler so that we can
simulate the circuit and verify that it works correctly at the structural level. Silicon compilers
are thus complex pieces of software.
Advanced VLSI
(21EC71)
SEMESTER – VII
Module 2
Floor planning and placement: Goals and objectives, Measurement of delay in Floor
planning, Floor planning tools, Channel definition, I/O and Power planning and Clock
planning. Placement: Goals and Objectives, Min-cut Placement algorithm, Iterative
Placement Improvement, Time driven placement methods, Physical Design Flow.
Routing: Global Routing: Goals and objectives, Global Routing Methods, Global routing
between blocks, Back annotation.
Textbook 1
CONTENTS
Floorplanning
Floorplanning Goals and Objectives
Measurement of Delay in Floorplanning
Floorplanning Tools
Channel Definition
I/O and Power Planning
Clock Planning
Placement
Placement Terms and Definitions
Placement Goals and Objectives
Placement Algorithms
Min-Cut Placement Method
Iterative Placement Improvement
Timing-Driven Placement Methods
Physical Design Flow
Routing
Global Routing
Goals and Objectives
Global Routing Methods
Global Routing Between Blocks
Back-annotation
FLOORPLANNING
Floorplanning is a mapping between the logical description (the netlist) and the
physical description (the floorplan).
FIGURE 16.3 Interconnect and gate delays. As feature sizes decrease, both average
interconnect delay and average gate delay decrease but at different rates. This is because
interconnect capacitance tends to a limit that is independent of scaling. Interconnect delay
now dominates gate delay.
The objectives of floorplanning are to minimize the chip area and minimize delay.
FIGURE 16.4 Predicted capacitance. (a) Interconnect lengths as a function of fanout (FO)
and circuit-block size. (b) Wire-load table. There is only one capacitance value for each
fanout (typically the average value). (c) The wire-load table predicts the capacitance and
delay of a net (with a considerable error).
FLOORPLANNING TOOLS
Figure 16.6(a) shows an initial random floorplan generated by a floorplanning tool. Two
of the blocks, A and C in this example, are standard-cell areas (the chip shown in Figure
16.1 is one large standard-cell area). These are flexible blocks (or variable blocks)
because, although their total area is fixed, their shape (aspect ratio) and connector
locations may be adjusted during the placement step.
We may force logic cells to be in selected flexible blocks by seeding. Seeding may be
hard or soft. A hard seed is fixed and not allowed to move during the remaining
floorplanning and placement steps. A soft seed is an initial suggestion only and can be
altered if necessary by the floorplanner.
FIGURE 16.6 Floorplanning a cell-based ASIC. (a) Initial floorplan generated by the
floorplanning tool. Two of the blocks are flexible (A and C) and contain rows of
standard cells (unplaced). A pop-up window shows the status of block A. (b) An
estimated placement for flexible blocks A and C. The connector positions are known
and a rat's nest display shows the heavy congestion below block B. (c) Moving blocks
to improve the floorplan. (d) The updated display shows the reduced congestion after
the changes.
We need to control the aspect ratio of our floorplan because we have to fit our chip into
the die cavity (a fixed-size hole, usually square) inside a package.
With practice, we can create a good initial placement by floorplanning and a pictorial
display.
FIGURE 16.7 Congestion analysis. (a) The initial floorplan with a 2:1.5 die aspect ratio.
(b) Altering the floorplan to give a 1:1 chip aspect ratio. (c) A trial floorplan with a
congestion map. Blocks A and C have been placed so that we know the terminal positions
in the channels. Shading indicates the ratio of channel density to the channel capacity.
Dark areas show regions that cannot be routed because the channel congestion exceeds the
estimated capacity. (d) Resizing flexible blocks A and C alleviates congestion.
CHANNEL DEFINITION
During the floorplanning step we assign the areas between blocks that are to be used for
interconnect. This process is known as channel definition or channel allocation.
Figure 16.8 shows a T-shaped junction between two rectangular channels and illustrates
why we must route the stem (vertical) of the T before the bar. The general problem of
choosing the order of rectangular channels to route is channel ordering.
FIGURE 16.8 Routing a T-junction between two channels in two-level metal. The dots
represent logic cell pins. (a) Routing channel A (the stem of the T) first allows us to adjust
the width of channel B. (b) If we route channel B first (the top of the T), this fixes the
width of channel A. We have to route the stem of a T-junction before we route the top.
Figure 16.9 shows a floorplan of a chip containing several blocks. Suppose we cut along
the block boundaries slicing the chip into two pieces (Figure 16.9a). Then suppose we can
slice each of these pieces into two. If we can continue in this fashion until all the blocks are
separated, then we have a slicing floorplan (Figure 16.9b). Figure 16.9(c) shows how the
sequence we use to slice the chip defines a hierarchy of the blocks. Reversing the slicing
order ensures that we route the stems of all the channel T-junctions first.
FIGURE 16.9 Defining the channel routing order for a slicing floorplan using a
slicing tree. (a) Make a cut all the way across the chip between circuit blocks. Continue
slicing until each piece contains just one circuit block. Each cut divides a piece into two
without cutting through a circuit block. (b) A sequence of cuts: 1, 2, 3, and 4 that
successively slices the chip until only circuit blocks are left. (c) The slicing tree
corresponding to the sequence of cuts gives the order in which to route the channels: 4, 3,
2, and finally 1.
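The channel routing order can be read off a slicing tree with a post-order traversal: every channel created by a later cut is routed before the channel of the cut that encloses it. The sketch below assumes a tree encoded as nested tuples (cut number, first piece, second piece) with block names as leaves. The example tree is hypothetical and only mimics the 1, 2, 3, 4 cut sequence of the caption; it is not a transcription of the figure.

```python
def routing_order(node):
    """Return the order in which to route the channels of a slicing tree.

    A leaf is a block name (str); an internal node is a tuple
    (cut_number, first_piece, second_piece). Post-order traversal visits
    the cuts inside each piece before the cut that separated the pieces.
    """
    if isinstance(node, str):
        return []  # a single circuit block contains no channel
    cut, first_piece, second_piece = node
    return routing_order(first_piece) + routing_order(second_piece) + [cut]


# Hypothetical slicing tree: cut 1 splits off block A, cut 2 splits off B,
# and so on until only blocks D and E remain.
EXAMPLE_TREE = (1, "A", (2, "B", (3, "C", (4, "D", "E"))))
```

For this example tree routing_order(EXAMPLE_TREE) yields 4, 3, 2, 1, the reverse of the slicing order.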
Figure 16.10 shows a floorplan that is not a slicing structure. We cannot cut the chip all
the way across with a knife without chopping a circuit block in two. This means we cannot
route any of the channels in this floorplan without routing all of the other channels first. We
say there is a cyclic constraint in this floorplan. There are two solutions to this problem.
One solution is to move the blocks until we obtain a slicing floorplan. The other solution is to
allow the use of L-shaped, rather than rectangular, channels (or areas with fixed
connectors on all sides: a switch box).
FIGURE 16.10 Cyclic constraints. (a) A nonslicing floorplan with a cyclic constraint
that prevents channel routing. (b) In this case it is difficult to find a slicing floorplan
without increasing the chip area. (c) This floorplan may be sliced (with initial cuts 1 or 2)
and has no cyclic constraints, but it is inefficient in area use and will be very difficult to
route.
FIGURE 16.12 Pad-limited and core-limited die. (a) A pad-limited die. The number of
pads determines the die size. (b) A core-limited die: The core logic determines the die size.
(c) Using both pad-limited pads and core-limited pads for a square die.
Special power pads are used for the positive supply, or VDD, power buses (or power
rails) and the ground or negative supply, VSS or GND.
Usually one set of VDD/VSS pads supplies one power ring that runs around the pad
ring and supplies power to the I/O pads only.
Another set of VDD/VSS pads connects to a second power ring that supplies the logic
core.
We sometimes call the I/O power dirty power since it has to supply large transient
currents to the output transistors. We keep dirty power separate to avoid injecting
noise into the internal-logic power (the clean power).
I/O pads also contain special circuits to protect against electrostatic discharge
(ESD). These circuits can withstand very short high-voltage (several kilovolt) pulses
that can be generated during human or machine handling.
Figure 16.13(a) and (b) are magnified views of the southeast corner of our example chip
and show the different types of I/O cells. Figure 16.13(c) shows a stagger-bond
arrangement using two rows of I/O pads. In this case the design rules for bond wires (the
spacing and the angle at which the bond wires leave the pads) become very important.
Figure 16.13(d) shows an area-bump bonding arrangement (also known as flip-chip,
solder-bump, or C4, terms coined by IBM, who developed this technology [Masleid, 1991])
used, for example, with ball-grid array (BGA) packages.
FIGURE 16.13 Bonding pads. (a) This chip uses both pad-limited and core-limited
pads. (b) A hybrid corner pad. (c) A chip with stagger-bonded pads. (d) An area-bump
bonded chip (or flip-chip). The chip is turned upside down and solder bumps connect the
pads to the lead frame.
CLOCK PLANNING
Figure 16.16(a) shows a clock spine (not to be confused with a channel spine) routing
scheme with all clock pins driven directly from the clock driver. MGAs and FPGAs often use
this fishbone type of clock distribution scheme.
FIGURE 16.16 Clock distribution. (a) A clock spine for a gate array. (b) A clock spine
for a cell-based ASIC (typical chips have thousands of clock nets).
(c) A clock spine is usually driven from one or more clock-driver cells. Delay in the
driver cell is a function of the number of stages and the ratio of output to input
capacitance for each stage (taper). (d) Clock latency and clock skew. We would like to
minimize both latency and skew.
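As a small illustration of part (d), the sketch below computes latency and skew from a set of clock arrival times. The arrival values and pin names are made up for illustration, and taking latency as the largest arrival delay is an assumption, not a definition given in these notes.

```python
# Hypothetical clock arrival times (ns) at four flip-flop clock pins,
# measured from the clock-driver input. Values are illustrative only.
arrival_ns = {"ff1": 1.25, "ff2": 1.5, "ff3": 1.0, "ff4": 1.75}

def clock_latency(arrivals):
    """Latency taken here as the longest driver-to-pin delay (assumption)."""
    return max(arrivals.values())

def clock_skew(arrivals):
    """Skew as the spread between the latest and earliest arrivals."""
    return max(arrivals.values()) - min(arrivals.values())

print(clock_latency(arrival_ns), clock_skew(arrival_ns))
```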
PLACEMENT
After completing a floorplan we can begin placement of the logic cells within the
flexible blocks. Placement is much more suited to automation than floorplanning. Thus we
shall need measurement techniques and algorithms.
FIGURE 16.18 Interconnect structure. (a) The two-level metal CBIC floorplan
shown in Figure 16.11 (b). (b) A channel from the flexible block A. This channel has a
channel height equal to the maximum channel density of 7 (there is room for seven
interconnects to run horizontally in m1). (c) A channel that uses OTC (over-the-cell)
routing in m2.
With two layers of metal, we route within the rectangular channels using the first
metal layer for horizontal routing, parallel to the channel spine, and the second metal layer
for the vertical direction (if there is a third metal layer it will normally run in the horizontal
direction again). The maximum number of horizontal interconnects that can be placed side
by side, parallel to the channel spine, is the channel capacity.
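Channel density can be checked against channel capacity with a simple sweep. The following is a minimal sketch, assuming each net contributes one horizontal segment described by its leftmost and rightmost column; the function name and the data layout are hypothetical.

```python
def channel_density(spans):
    """Return the largest number of horizontal segments crossing any column.

    spans: list of (left, right) column indices, inclusive, one per segment.
    """
    events = []
    for left, right in spans:
        events.append((left, 1))        # segment starts at this column
        events.append((right + 1, -1))  # segment is gone after this column
    density = current = 0
    # Sorting puts an end event before a start event at the same column.
    for _, delta in sorted(events):
        current += delta
        density = max(density, current)
    return density

spans = [(0, 3), (2, 5), (4, 6), (1, 2)]
print(channel_density(spans))
```

In this simplified model a channel can hold its interconnects only if the density does not exceed the channel capacity.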
The most commonly used placement objectives are one or more of the following:
● Minimize the total estimated interconnect length
● Meet the timing requirements for critical nets
● Minimize the interconnect congestion
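One common way to estimate interconnect length during placement is the half-perimeter of the bounding box around each net. The sketch below assumes that measure; the cell names and coordinates are hypothetical, and a real tool may use a different estimate.

```python
def half_perimeter(net, positions):
    """Half-perimeter of the bounding box enclosing all cells on a net."""
    xs = [positions[cell][0] for cell in net]
    ys = [positions[cell][1] for cell in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(nets, positions):
    """Sum of the half-perimeter estimates over all nets."""
    return sum(half_perimeter(net, positions) for net in nets)

positions = {"A": (0, 0), "B": (3, 1), "C": (1, 4)}  # hypothetical placement
nets = [["A", "B"], ["A", "B", "C"]]
print(total_wirelength(nets, positions))
```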
PLACEMENT ALGORITHMS
There are two classes of placement algorithms commonly used in commercial CAD
tools:
1. constructive placement
a. variations on the min-cut algorithm
b. eigenvalue method
2. iterative placement improvement.
Placement usually starts with a constructed solution and then improves it using an
iterative algorithm.
FIGURE 16.24 Min-cut placement. (a) Divide the chip into bins using a grid.
(b) Merge all connections to the center of each bin. (c) Make a cut and swap
logic cells between bins to minimize the cost of the cut. (d) Take the cut pieces and
throw out all the edges that are not inside the piece. (e) Repeat the process with a
new cut and continue until we reach the individual bins.
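Step (c) can be sketched as follows, assuming the cost of a cut is the number of nets with cells on both sides. This is a simplified greedy swap for illustration, not the Kernighan-Lin procedure or any particular tool's algorithm, and the function names are hypothetical.

```python
from itertools import product

def cut_cost(nets, left):
    """Count nets that have cells on both sides of the cut."""
    return sum(1 for net in nets
               if 0 < sum(cell in left for cell in net) < len(net))

def improve_cut(nets, left, right):
    """Swap one cell pair across the cut at a time while the cost drops."""
    left, right = set(left), set(right)
    improved = True
    while improved:
        improved = False
        best = cut_cost(nets, left)
        for a, b in product(sorted(left), sorted(right)):
            trial = (left - {a}) | {b}
            if cut_cost(nets, trial) < best:
                left, right = trial, (right - {b}) | {a}
                improved = True
                break
    return left, right

nets = [("A", "B"), ("C", "D")]
left, right = improve_cut(nets, {"A", "C"}, {"B", "D"})
print(sorted(left), sorted(right), cut_cost(nets, left))
```

Swapping pairs keeps the number of cells on each side constant, which is why the sketch never moves a single cell on its own.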
● The measurement criteria that decide whether to move the selected cells.
There are several interchange or iterative exchange methods that differ in their
selection and measurement criteria:
● pairwise interchange,
● force-directed interchange,
● force-directed relaxation, and
● force-directed pairwise relaxation.
FIGURE 16.26 Interchange. (a) Swapping the source logic cell with a destination logic
cell in pairwise interchange. (b) Sometimes we have to swap more than two logic cells
at a time to reach an optimum placement, but this is expensive in computation time.
Limiting the search to neighborhoods reduces the search time.
Logic cells within a distance ε of a logic cell form an ε-neighborhood. (c) A
one-neighborhood. (d) A two-neighborhood.
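A pairwise interchange limited to a neighborhood might look like the sketch below. It assumes Manhattan distance for the neighborhood and half-perimeter wirelength as the measurement criterion; both are assumptions, and the names `radius`, `neighborhood`, and `pairwise_interchange` are hypothetical.

```python
def wirelength(nets, pos):
    """Total half-perimeter length of all nets (an assumed cost measure)."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def neighborhood(pos, source, radius):
    """Cells within Manhattan distance `radius` of the source cell."""
    sx, sy = pos[source]
    return [c for c, (x, y) in pos.items()
            if c != source and abs(x - sx) + abs(y - sy) <= radius]

def pairwise_interchange(nets, pos, source, radius=1):
    """Swap source with the neighbor that lowers wirelength most, if any."""
    pos = dict(pos)
    best_cost, best_cell = wirelength(nets, pos), None
    for cell in neighborhood(pos, source, radius):
        pos[source], pos[cell] = pos[cell], pos[source]  # trial swap
        cost = wirelength(nets, pos)
        if cost < best_cost:
            best_cost, best_cell = cost, cell
        pos[source], pos[cell] = pos[cell], pos[source]  # undo trial swap
    if best_cell is not None:
        pos[source], pos[best_cell] = pos[best_cell], pos[source]
    return pos, best_cost

start = {"A": (0, 0), "B": (1, 0), "C": (2, 0)}
new_pos, cost = pairwise_interchange([("A", "C")], start, "A", radius=1)
print(new_pos, cost)
```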
FIGURE 16.27 Force-directed placement. (a) A network with nine logic cells.
(b) We make a grid (one logic cell per bin). (c) Forces are calculated as if springs were
attached to the centers of each logic cell for each connection. The two nets connecting
logic cells A and I correspond to two springs. (d) The forces are proportional to the
spring extensions.
Without external forces to counteract the pull of the springs between logic cells, the
network will collapse to a single point as it settles. An important part of force-directed
placement is fixing some of the logic cells in position. Normally ASIC designers use the I/O
pads or other external connections to act as anchor points or fixed seeds.
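The role of the anchor points can be seen in a small relaxation sketch. It assumes each spring pulls with a force proportional to its extension, so the zero-force position of a free cell is the weighted mean of the cells it connects to. The function name, data layout, and coordinates are hypothetical, and the result is a continuous position that would still need snapping to a legal site.

```python
def relax(positions, springs, fixed, iterations=50):
    """Move each free cell to the weighted mean of its neighbors' positions.

    springs: dict mapping (cell_a, cell_b) -> weight (number of nets).
    fixed: set of cells (for example I/O pads) that never move.
    """
    pos = dict(positions)
    for _ in range(iterations):
        for cell in pos:
            if cell in fixed:
                continue
            sx = sy = total = 0.0
            for (a, b), weight in springs.items():
                if cell in (a, b):
                    other = b if a == cell else a
                    sx += weight * pos[other][0]
                    sy += weight * pos[other][1]
                    total += weight
            if total:
                pos[cell] = (sx / total, sy / total)
    return pos

pads = {"P1": (0.0, 0.0), "P2": (4.0, 0.0)}
start = dict(pads, X=(1.0, 5.0))
springs = {("P1", "X"): 1, ("X", "P2"): 3}
print(relax(start, springs, fixed=set(pads))["X"])
```

With the two pads fixed, cell X settles three quarters of the way toward P2 because that spring is three times stronger.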
We know that we can use net weights in our algorithms. The problem is to calculate
the weights. One method finds the n most critical paths (using a timing-analysis engine,
possibly in the synthesis tool). The net weights might then be the number of times each net
appears in this list. The problem with this approach is that as soon as we fix (for example)
the first 100 critical nets, suddenly another 200 become critical.
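The counting scheme described above can be sketched in a few lines. The critical-path list here is hypothetical input; in practice it would come from a timing-analysis engine.

```python
from collections import Counter

def net_weights(critical_paths):
    """Weight each net by how many of the listed critical paths contain it."""
    counts = Counter()
    for path in critical_paths:
        counts.update(set(path))  # count a net at most once per path
    return dict(counts)

# Hypothetical list of the n = 3 most critical paths, each a list of net names.
paths = [["n1", "n2", "n3"], ["n2", "n4"], ["n2", "n3"]]
print(net_weights(paths))
```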
FIGURE 16.29 The zero-slack algorithm. (a) The circuit with no net delays. (b) The
zero-slack algorithm adds net delays (at the outputs of each gate, equivalent to increasing
the gate delay) to reduce the slack times to zero.
With the zero-slack algorithm we simplify but overconstrain the problem. For
example, we might be able to do a better job by making some nets a little longer than the
slack indicates if we can tighten up other nets. What we would really like to do is deal with
paths such as the critical path shown in Figure 16.29 (a) and not just nets. Path-based
algorithms have been proposed to do this, but they are complex and not all commercial
tools have this capability.
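The slack that the zero-slack algorithm works with can be computed as required time minus arrival time. The sketch below shows only this slack calculation on a small hypothetical circuit with zero net delays; it does not implement the step that distributes slack to the nets, and the gate names, delays, and clock period are made up.

```python
def slacks(gates, fanin, delay, clock_period):
    """Return slack (required time minus arrival time) at each gate output.

    gates must be in topological order (drivers before the gates they feed).
    """
    arrival = {}
    for g in gates:
        latest_input = max((arrival[f] for f in fanin.get(g, [])), default=0)
        arrival[g] = latest_input + delay[g]
    required = {g: clock_period for g in gates}
    for g in reversed(gates):
        for f in fanin.get(g, []):
            required[f] = min(required[f], required[g] - delay[g])
    return {g: required[g] - arrival[g] for g in gates}

gates = ["a", "b", "c"]                 # hypothetical circuit: a and b drive c
fanin = {"c": ["a", "b"]}
delay = {"a": 3, "b": 1, "c": 2}
print(slacks(gates, fanin, delay, clock_period=6))
```

Gate b has more slack than gate a, so the net at its output could tolerate more added delay before the path through c becomes critical.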
ROUTING
Once the designer has floorplanned a chip and the logic cells within the flexible
blocks have been placed, it is time to make the connections by routing the chip. Routing is
done in two steps:
1. global routing
2. detailed routing
GLOBAL ROUTING
The details of global routing differ slightly between cell-based ASICs, gate arrays,
and FPGAs, but the principles are the same in each case. A global router does not make any
connections; it just plans them. We typically global route the whole chip (or large pieces if it
is a large chip) before detail routing the whole chip (or the pieces).
● Maximize the probability that the detailed router can complete the routing.
Global routing methods: sequential routing and hierarchical routing.
One approach to global routing takes each net in turn and calculates the shortest path
using tree-on-graph algorithms with the added restriction of using the available channels.
This process is known as sequential routing.
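For a two-terminal net, the shortest-path step can be sketched with Dijkstra's algorithm on a graph whose edges are channels weighted by length. The graph below is hypothetical, and a real router would also handle multi-terminal nets and channel capacities.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; graph maps node -> {neighbor: channel_length}."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, length in graph[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (dist + length, nxt, path + [nxt]))
    return None  # goal is not reachable through the available channels

# Hypothetical channel-intersection graph with channel lengths as weights.
graph = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1},
    "C": {"A": 5, "B": 1},
}
print(shortest_path(graph, "A", "C"))
```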
There are two different ways that a global router normally handles this problem.
Using order-independent routing, a global router proceeds by routing each net, ignoring
how crowded the channels are. Whether a particular net is processed first or last does not
matter; the channel assignment will be the same. In order-independent routing, after all the
interconnects are assigned to channels, the global router returns to those channels that are
the most crowded and reassigns some interconnects to other, less crowded, channels.
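The two passes of order-independent routing can be sketched as follows. This is a simplified model, assuming each net comes with a list of usable channels (preferred channel first) and every channel has the same capacity; the function name and data layout are hypothetical.

```python
from collections import Counter

def assign_then_rebalance(candidates, capacity):
    """candidates: net -> list of usable channels, preferred channel first."""
    # Pass 1: every net takes its preferred channel, ignoring crowding.
    choice = {net: chans[0] for net, chans in candidates.items()}
    load = Counter(choice.values())
    # Pass 2: revisit channels, most crowded first, and move nets out of
    # any channel that is over capacity into a less crowded alternative.
    for chan, _ in load.most_common():
        for net in sorted(candidates):
            if load[chan] <= capacity:
                break
            if choice[net] != chan:
                continue
            alts = [c for c in candidates[net] if load[c] < capacity]
            if alts:
                target = min(alts, key=lambda c: load[c])
                load[chan] -= 1
                load[target] += 1
                choice[net] = target
    return choice

candidates = {"n1": ["c1", "c2"], "n2": ["c1", "c2"], "n3": ["c1"]}
print(assign_then_rebalance(candidates, capacity=2))
```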
Starting at the whole chip, or highest level, and proceeding down to the logic cells is
the top-down approach. The bottom-up approach starts at the lowest level of hierarchy and
globally routes the smallest areas first.
FIGURE 17.4 Global routing for a cell-based ASIC formulated as a graph problem. (a) A
cell-based ASIC with numbered channels. (b) The channels form the edges of a graph. (c)
The channel-intersection graph. Each channel corresponds to an edge on a graph whose
weight corresponds to the channel length.
Figure 17.5 shows an example of global routing for a net with five terminals, labeled A1
through F1, for the cell-based ASIC shown in Figure 17.4. If a designer wishes to use
minimum total interconnect path length as an objective, the global router finds the minimum-
length tree shown in Figure 17.5 (b). This tree determines the channels the interconnects will
use.
FIGURE 17.5 Finding paths in global routing. (a) A cell-based ASIC (from Figure 17.4)
showing a single net with a fanout of four (five terminals). We have to order the numbered
channels to complete the interconnect path for terminals A1 through F1. (b) The terminals
are projected to the center of the nearest channel, forming a graph. A minimum-length
tree for the net that uses the channels and takes into account the channel capacities. (c)
The minimum-length tree does not necessarily correspond to minimum delay. If we wish to
minimize the delay from terminal A1 to D1, a different tree might be better.
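A minimum-length tree over the projected terminals can be approximated with a minimum spanning tree. The sketch below uses Prim's algorithm with Manhattan distance between terminals; it is an approximation that ignores the channel graph and channel capacities, and the terminal names and coordinates are hypothetical rather than taken from Figure 17.5.

```python
def minimum_spanning_tree(points):
    """Prim's algorithm over terminals using Manhattan distance.

    points: dict terminal -> (x, y). Returns (total_length, edges).
    """
    names = sorted(points)
    in_tree = {names[0]}
    edges, total = [], 0
    while len(in_tree) < len(names):
        # Pick the shortest connection from the tree to a terminal outside it.
        length, a, b = min(
            (abs(points[a][0] - points[b][0]) + abs(points[a][1] - points[b][1]), a, b)
            for a in in_tree for b in names if b not in in_tree
        )
        in_tree.add(b)
        edges.append((a, b))
        total += length
    return total, edges

terminals = {"T1": (0, 0), "T2": (2, 0), "T3": (2, 3), "T4": (0, 1)}
print(minimum_spanning_tree(terminals))
```

As part (c) of the caption notes, the shortest tree is not necessarily the fastest one for a particular source-to-sink pair.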
BACK-ANNOTATION
After global routing is complete it is possible to accurately predict what the length
of each interconnect in every net will be after detailed routing, probably to within 5 percent.
The global router can give us not just an estimate of the total net length (which was all we
knew at the placement stage), but the resistance and capacitance of each path in each net.
This RC information is used to calculate net delays. We can back-annotate this net delay
information to the synthesis tool for in-place optimization or to a timing verifier to make
sure there are no timing surprises. Differences in timing predictions at this point arise due
to the different ways in which the placement algorithms estimate the paths and the way the
global router actually builds the paths.
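One way to turn the RC information into a net delay is the Elmore delay of the path. The notes do not say which delay model is used, so the sketch below is an assumption, and the segment values are illustrative numbers rather than extracted data.

```python
def elmore_delay(segments, load_cap=0.0):
    """Elmore delay of a chain of RC segments driving load_cap.

    segments: list of (resistance, capacitance) pairs from source to sink.
    Each segment's capacitance is lumped at its far end (an assumption).
    """
    delay = 0.0
    downstream = load_cap + sum(c for _, c in segments)
    for r, c in segments:
        delay += r * downstream  # this resistance charges everything after it
        downstream -= c
    return delay

# Two hypothetical segments (arbitrary units) driving a load of 1.
print(elmore_delay([(10, 2), (20, 3)], load_cap=1))
```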