Introduction 07 23
Islam Ahmed
Course Objective
The main objective of the course is the study of digital backend flow.
By the end of this course, everyone should be able to implement all steps of
RTL2GDS.
Contents
Introduction to VLSI
ASIC
Design approaches
ASIC VS FPGA
ASIC Cost
Fabless VS FABs.
Basics of Microfabrication
Introduction to PnR
Standard Cell Libraries
Multiple Analysis Corners
Outline
Training plan
Introduction to ASIC design
Different ways of Implementing digital systems
ASIC
FABs VS FABless
VLSI
Various circuit elements: transistors, capacitors, resistors , and
even small inductances can be integrated on one chip.
Integrating different devices on the same chip enhances
performance and decreases area.
Moore’s Law
▪ 1965: Gordon Moore observed that the number of transistors in an IC doubles at a regular interval, commonly quoted as every 18 to 24 months. This observation became known as Moore’s law.
[Figure: Moore’s Law. Transistors per chip (log scale, 1,000 to 10,000,000,000) versus year (1970 to 2015) for Intel processors: 4004, 8008, 8080, 8085, 286, Pentium II, Pentium III, Pentium 4, Itanium, Itanium 2 and Dual-Core Itanium 2.]
Source: http://www.intel.com/technology/mooreslaw/
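The doubling rule can be written as N(t) = N0 * 2^((t - t0) / T), where T is the doubling period. The sketch below is only an illustration: it starts from the Intel 4004 (about 2,300 transistors in 1971) and compares an 18-month and a 24-month doubling period. The function name is made up for this example.

```python
def projected_transistors(n0: float, year0: float, year: float,
                          doubling_years: float) -> float:
    """Transistor count under pure exponential doubling.

    N(year) = n0 * 2 ** ((year - year0) / doubling_years)
    """
    return n0 * 2 ** ((year - year0) / doubling_years)


# Starting point: Intel 4004 (1971), about 2,300 transistors.
N0, YEAR0 = 2300, 1971

for period in (1.5, 2.0):  # doubling every 18 months vs. every 24 months
    n_2001 = projected_transistors(N0, YEAR0, 2001, period)
    print(f"doubling every {period} years: about {n_2001:.3g} transistors in 2001")
```

The two projections differ by a factor of 32 after 30 years, which shows how sensitive the prediction is to the assumed doubling period.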
Performance
Customization
Application Specific Integrated Circuit (ASIC)
ASIC
- Hundreds of millions of logic gates can be integrated on a single ASIC to create very large and complex functions.
- Because building a new silicon foundry is extremely expensive, many companies work in design only, and a few companies specialize in fabricating those designs.
SoC Example: Mobile Application SoC
ASIC: Design approaches
1) Full Custom
A design methodology in which every element of the circuit layout (transistors, resistors, capacitors, digital logic and analog circuits) is positioned by the designer. [“handcrafted” designs]
Pros: Maximum performance, minimum area and the highest degree of flexibility.
Cons: Huge design effort, high design cost and NRE cost, design is frozen in silicon, and long time to market.
ASIC: Design approaches
2) Semi-Custom [Standard Cell Based ASICs]
Components from a predesigned standard cell library are used.
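All logic cells are predesigned. In standard-cell ASICs all mask layers are still customized for each design; in gate-array ASICs only the interconnect mask layers are customized.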
Standard cell libraries are usually designed using full custom approach.
Pros over full custom: easier, automatable (less design effort), practical for large designs, reasonable turnaround time (TAT) and reduced risk.
Standard cells: the main building blocks
ASIC Cost
Total Product Cost = NRE + (Number of parts * recurring cost per part)
Cost-per-part
Wafer cost.
Wafer processing.
Production yield.
Packaging.
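The cost formula above can be turned into a small break-even calculation. This is a minimal sketch: all prices are hypothetical placeholders chosen for round numbers, and the function names are made up for this example.

```python
import math


def total_cost(nre: float, unit_cost: float, parts: int) -> float:
    """Total Product Cost = NRE + (number of parts * recurring cost per part)."""
    return nre + parts * unit_cost


def break_even_volume(nre_a: float, unit_a: float,
                      nre_b: float, unit_b: float) -> int:
    """Smallest volume at which option A (higher NRE, lower unit cost)
    costs no more than option B."""
    if unit_a >= unit_b:
        raise ValueError("option A must have the lower recurring cost")
    return math.ceil((nre_a - nre_b) / (unit_b - unit_a))


# Hypothetical numbers, for illustration only.
ASIC_NRE, ASIC_UNIT = 2_000_000.0, 5.0   # high NRE, cheap parts
FPGA_NRE, FPGA_UNIT = 0.0, 45.0          # no NRE, expensive parts

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print("break-even volume:", volume)
print("ASIC total at break-even:", total_cost(ASIC_NRE, ASIC_UNIT, volume))
print("FPGA total at break-even:", total_cost(FPGA_NRE, FPGA_UNIT, volume))
```

Below the break-even volume the option without NRE is cheaper; above it, the lower recurring cost dominates.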
ASIC: Design approaches
3) Programmable ASICs
Field Programmable Gate Array
FPGAs are large, complex reconfigurable devices. Their distinguishing
feature is that both the logic cells and the interconnect are
programmable, so no mask layer is customized. Major FPGA vendors
include Xilinx/AMD, Altera/Intel and Microsemi/Microchip.
FPGA VS ASIC
FPGA advantages:
- Faster time to market.
- No NRE.
- Simpler, more predictable design cycle.
- Re-programmability.
- Reusability.
- Perfect for prototyping.
- Built-in blocks such as MACs, memories and high-speed I/Os.
ASIC advantages:
- Lower unit cost for mass production.
- Faster than FPGA.
- Lower power than FPGA.
- More flexible: analog and mixed-signal designs can be created, which is not possible in an FPGA.
FPGA disadvantages:
- Higher unit cost.
- Slower than ASIC.
- Higher power than ASIC, with no control over power optimization.
- Design size is limited by the FPGA resources.
ASIC disadvantages:
- Longer time to market.
- High NRE, very expensive tools.
- The design cycle has to analyze/enhance more aspects, such as DFM, crosstalk, EMIR, LVS and ERC/PERC.
- Design is frozen in silicon.
Exercise 1
What is the best design approach for implementing each of the following: [hint: you can use a mix of
approaches when needed]
a. ADC
b. Microcontroller
c. AI accelerator HW testing and prototyping.
d. Mobile application chip.
e. Medical and Aerospace applications. [Low volume and high complexity]
Exercise 2
When does the high NRE cost of ASIC development make sense economically?
Main semiconductor market players:
Fabless VS FABs
- Fabless semiconductor companies only design the chip. Their deliverable is the layout, which
specifies the placement of the different layers inside the IC (n-diffusion, p-diffusion, metal1,
metal2, etc.). The shapes on these layers define the transistors and the connections
between them.
- For example:
Chip vendors: Qualcomm, Broadcom, Nvidia, Infineon, Freescale and Renesas.
IP vendors: Arm, Synopsys, Cadence, Imagination Technologies and CEVA.
- Fabrication foundries first check that the layout satisfies their design rules, then
fabricate it. A pure-play foundry specializes in fabrication and does not compete
with its customers [Chip vendors].
- For example: TSMC, Samsung, UMC.
- Intel is now joining the market to fabricate for other chip vendors.
Tape-Out
❑ The layout is sent to the fab house in GDSII or OASIS file format.
▪ Delivering the layout to the fab is called “tapeout” because the data used to be shipped on a
magnetic tape. Today it is delivered electronically.
Exercise 3
What are the benefits of integrating more components into a single chip,
moving system blocks from board level to chip level? [cost, performance,
power, integration]
Layout vs Cross-Section
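[Figure: inverter layout and its cross-section, showing the GND and VDD connections, the n+ and p+ diffusions, the n well in the p substrate, and the well tap and substrate tap.]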
Litho-Etch process
CMOS Fabrication process
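[Figure: CMOS fabrication steps, including applying photoresist (step 2), removing photoresist (step 5) and doping to get the n-well for the PMOS (step 6), followed by metal. The final cross-section shows the nMOS and pMOS transistors with GND, VDD and the well tap.]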
Inverter Mask Set
Increasing Complexity
The number of transistors on a chip has grown exponentially, driven by advances in
fabrication and enabled by powerful CAD tools.
The greatest challenge in modern VLSI design is not in designing the individual
transistors but rather in managing system complexity.
Modern System-On-Chip (SOC) designs combine memories, processors, high-speed I/O
interfaces, and dedicated application-specific logic on a single chip.
Design Abstractions
It is all about hiding details until they become necessary.
Structured design, which is also used in large software projects, relies on the
principles of hierarchy, regularity, modularity and locality to manage complexity.
Design Abstractions
Columns: Level; Modeling Object; Example of Modeling Object; Hardware Description Language Used
- System level (Electronic System Level, ESL). Modeling object: structural circuit. Example: RAM, bus, CPU. Languages: C/C++, SystemVerilog, SystemC.
- Modeling object: functional circuits. Example: Add.
- Circuit level (transistor level, SPICE netlist). Modeling object: electrical circuit. Languages: SPICE, CDL.
- Device level. Modeling object: IC components. Example: n+ and p+ regions in n and p material. Language: none.
Notes
These levels are interdependent and all influence each of the design objectives.
For example, choices of microarchitecture and logic are strongly dependent on the
number of transistors that can be placed on the chip, which depends on the physical
design and process technology.
Digital VLSI design favors the engineer who can evaluate how choices in one
part of the system impact other parts of the system.
Digital ASIC Design Flow
For electronic systems, whether analog or digital, circuit design is usually separated from
layout design. Each job is performed by a different engineering team.
→Circuit design (RTL design for digital systems) is done by the frontend team.
→Layout design (PnR for digital systems) is done by the backend team. [Also called digital implementation, chip
implementation or ASIC design].
Exercise 4
List 4 main responsibilities of digital frontend teams, and digital backend
teams.
Digital ASIC Design Flow
Responsibilities: Frontend VS Backend
Digital Frontend team (System → algorithms → RTL)
-Choosing the microarchitecture, suitable algorithms, number of pipeline stages, etc. according to the system specifications.
-Developing synthesizable RTL and constraints.
-Developing suitable DC scripts for synthesis and DFT.
-Specifying power domains and generating UPF maps.
-Performing verification of the RTL.
-Performing gate-level simulations to check backend deliverables.
Digital Backend team (RTL → logic → logic cells and FFs → standard cells)
-Meeting setup and hold timing requirements according to the SC library specifications.
-Meeting special timing requirements requested by the digital team (e.g. skew balancing).
-Matching the digital team’s intended design: gate-level netlist matching the RTL.
-Meeting physical design rules specified by the foundry (DRC, LVS, Antenna, DFM).
-Minimizing IR drop over the design so that it is below a defined threshold (~2%).
-Minimizing power consumption so that it is comparable to a similar node or similar design.
Digital Backend Design
It’s the transformation of a digital circuit design (RTL in Verilog or VHDL) into a
physical representation (layout) for manufacturing (GDS/Oasis).
The design after being represented in the physical layout has to meet all signoff
criteria:
A. Mathematically equivalent to the RTL. [Formality clean VS RTL]
B. Timing signoff.
C. Physical signoff (DRC, LVS, Antenna…etc)
D. Power Integrity signoff.
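[Figure: digital backend flow chart. Recoverable labels: Design Verification, Floorplanning, a “Clean?” decision (No: iterate, Yes: continue) and Signoff/Tapeout.]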
Exercise 5
How can we confirm that our design is ready for tapeout/fabrication? [3 criteria]
Standard Cell Library
A good application of design abstraction.
A standard cell is a group of transistors and interconnect structures that provides a
Boolean logic function (e.g., AND, OR, inverter) or a storage function (flip-flop or latch).
Standard Cell Library
• Cell categories
All basic and universal gates (AND, OR, NOT, NAND, NOR, XOR etc)
Complex gates (MUX, HA, FA, Comparators, AOI, OAI etc)
Clock tree cells (Clock buffers, clock inverters, ICG cells etc)
Flip flops and latches
Delay cells
Physical only cells
Scannable Flip flops, Latches.
Standard Cell Library: Typical views
Behavioral views
Gate-level netlist (.v): used for simulation and logic equivalence checking.
Timing/Power views (.lib): contain the characterization of the library, used for STA and EMIR
analysis. Also an input to logic synthesis and PnR tools for optimization.
Physical views
.lef: abstract format for modeling of cells in PnR tools.
.gds: graphical representation of the layout going to be fabricated, used for DRC and LVS.
.sp: spice netlist contains transistor level representation of cells, used for LVS.
Cell variants: Drive strengths
Each logic cell (NAND, NOR, INV…) is
implemented in the SC library in:
A. Multiple sizes (x1, x2, x4, x8..etc).
B. Multiple flavors (LVT, SVT).
Cell variants: MT-CMOS
One additional mask can provide more or less
doping in a transistor channel, shifting the
threshold voltage.
Most libraries provide equivalent cells with
three VTs (SVT, HVT, LVT) to trade off speed
against leakage.
All threshold varieties have same footprint and
therefore can be swapped without any
placement/routing iterations. [Footprint: pins
and obstructions]
Exercise 6
Which cells are preferred for the following applications? [size and flavor]
- Heart pacemaker.
- Datacenter processor.
- Battery-powered IoT device.
Compare the following cells in terms of speed, area and leakage power:
-svt_x2_buf, svt_x8_buf, svt_x16_buf
-svt_x2_buf, lvt_x2_buf [lvt: low V-threshold cell]
-Compare using a complex “AOI” cell with implementing the same logic using equivalent NAND2 cells, w.r.t. overall delay and
performance.
Standard cell Layout
• There is a VDD rail at the top of the standard cell and a VSS rail at the bottom.
• The nwell region, near the VDD rail, is where pMOS transistors are built.
• The gap between the nwell and the pwell is usually dedicated to wiring.
• The pwell region, near the VSS rail, is where nMOS transistors are built.
Standard cell Layout [example]
Liberty file (.lib or .db file)
• The .lib file is a human-readable ASCII format that characterizes the standard cell library cells in terms of timing, area,
power and other parameters. The .db file is its compiled binary form.
• Each cell is characterized by simulation, and timing and power results are obtained under a variety of
conditions.
1. Liberty file (.lib or .db file)
Why use .lib?
- To know whether the design meets timing:
• Running SPICE would consume a lot of time and computing
resources.
• Instead, we use a timing model that abstracts cell behavior
and simplifies calculations.
For each signoff corner, we use the .lib file provided for
that corner to perform timing analysis (STA) and
power analysis.
Liberty file (.lib or .db file)
Non-Linear Delay Model (NLDM)
-Non-linear delay models use SPICE-derived timing at several
input_transition and output_load points.
-Data points not found in the tables are linearly interpolated.
-Note the presence of two tables:
A. One for cell delay.
B. Another for output transition (true rise/fall time).
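The interpolation step can be sketched as a bilinear lookup between the four surrounding table points. This is only an illustration of the idea, not the algorithm of any particular tool: the function names are made up, the table is a 3x3 excerpt of the INVX1 cell_rise table shown in this deck, and points outside the table are simply extrapolated from the nearest segment.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x (clamped to the table)."""
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)


def nldm_lookup(index_1, index_2, values, transition, load):
    """Bilinear interpolation in an NLDM table.

    index_1: input transition axis (one row of `values` per entry)
    index_2: output load axis (one column of `values` per entry)
    """
    i = _bracket(index_1, transition)
    j = _bracket(index_2, load)
    t = (transition - index_1[i]) / (index_1[i + 1] - index_1[i])
    u = (load - index_2[j]) / (index_2[j + 1] - index_2[j])
    return ((1 - t) * (1 - u) * values[i][j]
            + (1 - t) * u * values[i][j + 1]
            + t * (1 - u) * values[i + 1][j]
            + t * u * values[i + 1][j + 1])


# Excerpt of the cell_rise table (rows: input transition, columns: output load).
INDEX_1 = [0.032, 0.064, 0.128]
INDEX_2 = [0.25, 0.5, 1.0]
CELL_RISE = [
    [0.0255491, 0.0279298, 0.0319930],
    [0.0366966, 0.0402223, 0.0462823],
    [0.0524727, 0.0576512, 0.0665647],
]

# A point on the grid returns the table entry itself.
print(nldm_lookup(INDEX_1, INDEX_2, CELL_RISE, 0.064, 0.5))
# A point between grid points is a weighted average of its four neighbors.
print(nldm_lookup(INDEX_1, INDEX_2, CELL_RISE, 0.048, 0.375))
```

The same lookup applies to the output transition table, whose result becomes the input transition of the next cell on the path.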
• Cell
• Specific information about cell characterization.
• For example: function, area, leakage power.
1. Liberty file (.lib or .db file)
Timing data of standard cells is provided in the Liberty
format.
• Pin
• Timing, power, capacitance, leakage, functionality,
design rules and other characteristics of each pin in
each cell.
Parasitic Estimation: WLM
▪ All parasitics depend on interconnect length.
▪ The larger the chip (the more gates it has), the greater the interconnect length.
For calculating the propagation delay of a specific cell, how can DC/ICC calculate its input transition and load
capacitance?
Cell Types
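Combinational
Cell output changes with changes at the input.
[Figure: combinational gate with inputs A, B and C]
Sequential
Cell has memory, and the output depends on memory and input.
[Figure: flip-flop with Data and Clock inputs, a Reset input and an Output; labels mark the synchronous input and the asynchronous input]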
Timing Group Names
1. Rise transition time (rise_transition). Unit: ns. Symbol: tR.
Definition: The time it takes a driving pin to make a transition from kVDD to (1-k)VDD. Usually k=0.1 (k=0.2, 0.3, etc. are also possible).
[Figure: rising edge from VSS to VDD, with tR measured between 0.1VDD and 0.9VDD]
2. Fall transition time (fall_transition). Unit: ns. Symbol: tF.
Definition: The time it takes a driving pin to make a transition from (1-k)VDD to kVDD. Usually k=0.1 (k=0.2, 0.3, etc. are also possible).
[Figure: falling edge from VDD to VSS, with tF measured between 0.9VDD and 0.1VDD]
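The transition-time definitions can be checked numerically on a sampled waveform. The sketch below assumes a monotonic edge given as (time, voltage) samples and uses linear interpolation between samples. The function names and the ideal 1 V, 1 ns ramp are made up for the example.

```python
def crossing_time(times, volts, level):
    """Time at which the waveform first crosses `level` (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def transition_time(times, volts, vdd, k=0.1):
    """Time between the k*VDD and (1-k)*VDD crossings.

    Works for both rising and falling edges because the absolute
    difference of the two crossing times is returned.
    """
    t_low = crossing_time(times, volts, k * vdd)
    t_high = crossing_time(times, volts, (1 - k) * vdd)
    return abs(t_high - t_low)


# Ideal linear ramp from 0 V to 1 V in 1 ns (hypothetical waveform).
rise_t, rise_v = [0.0, 1.0], [0.0, 1.0]
print(transition_time(rise_t, rise_v, vdd=1.0))          # 10% to 90%
print(transition_time(rise_t, rise_v, vdd=1.0, k=0.2))   # 20% to 80%
```

The choice of k matters when comparing libraries: the same edge reports a shorter transition with k=0.2 than with k=0.1.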
Cell Timing Data
library(){
lu_table_template ("del_1_7_7") {
variable_1 : "input_net_transition";
index_1("1, 2, 3, 4, 5, 6, 7");
variable_2 : "total_output_net_capacitance";
index_2("1, 2, 3, 4, 5, 6, 7");
}
cell (INVX1) {
pin (Y) {
timing () {
related_pin : "A";
timing_type : "combinational";
timing_sense : "negative_unate";
cell_rise ("del_1_7_7") {
index_1("0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024");
index_2("0.1, 0.25, 0.5, 1, 2, 4, 8");
values("0.016861, 0.0179019, 0.0195185, 0.0229259, 0.029658, 0.043145, 0.07712", \
"0.0239648, 0.0255491, 0.0279298, 0.0319930, 0.0387540, 0.0520896, 0.0790211", \
"0.0342118, 0.0366966, 0.0402223, 0.0462823, 0.0558327, 0.0705154, 0.0967339", \
"0.0491695, 0.0524727, 0.0576512, 0.0665647, 0.0810999, 0.1027237, 0.1342571", \
"0.0721332, 0.0765389, 0.0836775, 0.0960890, 0.1171612, 0.1497265, 0.1957640", \
"0.1111560, 0.1164417, 0.1252609, 0.1422002, 0.1712097, 0.2171862, 0.2847010", \
"0.1841131, 0.1901881, 0.2010298, 0.2194395, 0.2555983, 0.3182710, 0.4139452");
} /* cell_rise */
} /* timing */
} /* pin (Y) */
} /* cell (INVX1) */
} /* library */
Combinational Timing Arc Syntax
Combinational timing arc between input A and output Y with negative unateness, i.e. when A is
rising, Y is falling, and vice versa.
cell (INVX1) {
pin (Y) {
timing () {
related_pin : "A";
timing_type : "combinational";
timing_sense : "negative_unate";
cell_rise ("del_1_7_7") {
index_1("0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024");
index_2("0.1, 0.25, 0.5, 1, 2, 4, 8");
values("0.0168610, 0.0179019, 0.0195185, 0.0229259, 0.0296588, 0.0431451, 0.0702328", \
"0.0239648, 0.0255491, 0.0279298, 0.0319930, 0.0387540, 0.0520896, 0.0790211", \
"0.0342118, 0.0366966, 0.0402223, 0.0462823, 0.0558327, 0.0705154, 0.0967339", \
"0.0491695, 0.0524727, 0.0576512, 0.0665647, 0.0810999, 0.1027237, 0.1342571", \
"0.0721332, 0.0765389, 0.0836775, 0.0960890, 0.1171612, 0.1497265, 0.1957640", \
"0.1111560, 0.1164417, 0.1252609, 0.1422002, 0.1712097, 0.2171862, 0.2847010", \
"0.1841131, 0.1901881, 0.2010298, 0.2194395, 0.2555983, 0.3182710, 0.4139452");
} /* cell_rise */
} /* timing */
} /* pin (Y) */
} /* cell (INVX1) */
Delay Analysis
1 0
Timing Constraints: Timing Types
Setup/Hold, Recovery/Removal Constraints
1. Setup time (setup_rising, setup_falling; only for flip-flops or latches). Unit: ns. Symbol: tSU.
Definition: The minimum period during which the input data to a flip-flop or a latch must be stable before the active edge of the clock occurs.
[Figure: DATA and CLOCK waveforms, with tSU measured between their 0.5VDD crossings]
2. Hold time (hold_rising, hold_falling; only for flip-flops or latches). Unit: ns. Symbol: tH.
Definition: The minimum period during which the input data to a flip-flop or a latch must remain stable after the active edge of the clock has occurred.
[Figure: DATA and CLOCK waveforms, with tH measured between their 0.5VDD crossings]
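The setup and hold requirements translate into slack checks on a register-to-register path. The sketch below is deliberately simplified: it assumes a single ideal clock with zero skew and zero uncertainty, with data launched and captured on consecutive edges of the same type. All numbers are hypothetical and the function names are made up for the example.

```python
def setup_slack(clock_period: float, t_setup: float, arrival_max: float) -> float:
    """Setup slack = required time - latest data arrival time.

    The data must arrive at least t_setup before the next active clock edge,
    so the required time is clock_period - t_setup.
    """
    return (clock_period - t_setup) - arrival_max


def hold_slack(t_hold: float, arrival_min: float) -> float:
    """Hold slack = earliest data arrival time - hold requirement.

    The new data must not reach the capturing flip-flop until t_hold
    after the clock edge that captures the previous data.
    """
    return arrival_min - t_hold


# Hypothetical path, all values in ns.
print("setup slack:", setup_slack(clock_period=2.0, t_setup=0.1, arrival_max=1.7))
print("hold slack:", hold_slack(t_hold=0.05, arrival_min=0.03))
```

A negative slack means a violation. In this example the setup check passes with about 0.2 ns to spare, while the hold check fails by about 0.02 ns.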
LEF: Library exchange format (.lef file)
• It’s a readable ASCII format that contains detailed PIN information that is used later by PnR tools to
guide routing.
LEF: Library exchange format (.lef file)
Technology LEF
Tech .lef contains simplified information about the
technology to be used by the PnR tool. (Physical
synthesis tool)
Layers
Via definitions
Design rules
Antenna data
Spice and GDS
.def: Design Exchange Format
.def file holds both physical and logical information of the design.
It is used for exchanging information between tools, enabling inter-operability within the ASIC flow. For example,
doing floorplanning and placement with one tool, CTS with a 2nd one, and parasitic extraction with a 3rd one.
This file has information about:
Die/block size, dimensions
Row height
Nets, NDRs
Placement blockages and routing blockages
Macro and std cell locations
Pin locations, etc.
It can be used as an input to parasitic extraction tools [.def/.lef flows]
Contents
Standard cell libraries
Multiple Analysis Corners
Static timing analysis
Definition
STA VS simulation
Timing paths
Required time, arrival time, slack
FF basics characteristics, setup and hold requirements
Timing paths reporting
The Multiple Analysis Corners
MC => Opt(P,V,T)
[Figure: process (P), voltage (V) and temperature (T) axes, each ranging over best, typical and worst.]
Operating Conditions
The operating conditions of a design include the following
parameters:
Process
Voltage
Temperature
The chip is intended to operate under these parameters.
Operating Conditions (2)
Process variation
Deviations in the semiconductor fabrication process
Supply voltage variation
Design’s supply voltage can vary from the established value during day-to-day
operation.
Operating temperature variation
Effects on performance caused by temperature fluctuations
The Multiple Analysis Corners
Table columns: Corner Name; Process (NMOS proc. – PMOS proc.); Power Supply (V); Temperature (T); Notes
A corner is defined as a PVT, and it is provided to the analysis and optimization tool as logic libraries per
PVT and parasitics data.
Corners are not due to functional settings, but rather result from process variations during manufacturing,
and voltage and temperature variations in the environment in which the chip will operate.
Each standard cell library is characterized for a set of signoff corners, according to the required signoff
corner for the design(s) that will use the library later.
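Since a corner is one combination of process, voltage and temperature, the full corner space is the Cartesian product of the three axes. The sketch below enumerates it under stated assumptions: a hypothetical 1.0 V nominal supply with a ±10% range, and a hypothetical -40 °C, 25 °C and 125 °C temperature set. In practice only a subset of these corners is characterized and used for signoff.

```python
from itertools import product

NOMINAL_V = 1.0                            # hypothetical nominal supply, in volts
PROCESSES = ("best", "typical", "worst")   # process corners
VOLTAGE_FACTORS = (0.9, 1.0, 1.1)          # assumed +/-10% supply range
TEMPERATURES_C = (-40, 25, 125)            # hypothetical temperature range


def enumerate_corners(nominal_v: float = NOMINAL_V):
    """Return every (process, voltage, temperature) combination as a dict."""
    corners = []
    for proc, factor, temp in product(PROCESSES, VOLTAGE_FACTORS, TEMPERATURES_C):
        volts = round(nominal_v * factor, 3)
        corners.append({
            "name": f"{proc}_{volts:.2f}V_{temp}C",
            "process": proc,
            "voltage": volts,
            "temperature_c": temp,
        })
    return corners


corners = enumerate_corners()
print(len(corners), "corners, for example:", corners[0]["name"])
```

Each selected corner then needs its own .lib and parasitics data, which is why the list of signoff corners is usually kept as short as the product requirements allow.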
The Multiple Analysis Corners
Supply voltage variations [nominal, 1.1*nominal,
0.9*nominal]
• Supply noise due to parasitic inductance.
• The DC source or voltage regulator output can change
over time. It can go above or below the
expected voltage, which changes the current and
makes the circuit faster or slower than
expected.
What do you expect in the following scenarios for your design’s area and power:
A. If you have extremely slow corners in your required signoff corners.
B. If you have extremely fast corners in your required signoff corners.
Exercise 5
If you are a product marketing manager for an MCU chip vendor
specialized in automotive MCUs (ex. Renesas), how would you choose
your signoff corners?
If you are a project manager for an IP vendor (ex. ARM), and a customer is
asking for signoff corners 1.5x slower than usual, what is the impact on
the IP? [area, power, timing, development time, how much you charge the
customer ☺ ]
Exercise 6
If the input transition of an inverter is in the range of the characterized
lookup tables in the .lib, but not exactly matching an input transition value,
how will optimization/analysis engines calculate cell delay?
Exercise 7
If you were to implement a physical design tool, which views would you prefer
for the different PnR stages (placement, CTS, routing): .gds or .lef?