Handout Chip Design Methods
METHODOLOGIES
OR
DESIGN METHODS
DSP Processor Design Approaches
• Full custom
• Standard cell**
• Gate array*
• FPGA*
• Programmable DSP
• Programmable general purpose
[figure: arrow alongside the list, pointing toward the top (full custom), labeled "higher performance, lower energy (power), lower per-part cost"]
• VLSI
– Originally meant “Very Large Scale Integration”, meaning a large number of transistors per chip
– Now generally means “semiconductor chip”
• Fabrication technologies are characterized by their minimum feature length (the length of a transistor’s gate)
• Some typical state-of-the-art fabrication technologies in late 2019:
– 14 nm: mature production for logic chips
– 5 nm: “Industry-leading 5 nm CMOS technology features, for the first time, full-fledged EUV, and high mobility channel finFETs, offering ~1.84x logic density, 15% speed gain or 30% power reduction over 7 nm. This true 5 nm technology successfully passed qualification with high yield, and targets for mass production in 1H 2020.” (IEDM, December 2019)
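The scaling figures in the IEDM quote can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: the function names and the 100 mm² example block are hypothetical, and it assumes the quoted "15% speed gain or 30% power reduction" is an either/or trade-off relative to the same design in 7 nm.

```python
# Scaling factors quoted in the IEDM 2019 statement (5 nm relative to 7 nm).
LOGIC_DENSITY_GAIN = 1.84  # ~1.84x logic density
SPEED_GAIN = 0.15          # 15% speed gain, OR ...
POWER_REDUCTION = 0.30     # ... 30% power reduction (assumed either/or)


def scaled_area(area_7nm_mm2: float) -> float:
    """Area of the same logic after moving from 7 nm to 5 nm."""
    return area_7nm_mm2 / LOGIC_DENSITY_GAIN


def scaled_frequency(freq_7nm_ghz: float) -> float:
    """Clock frequency if the whole benefit is taken as speed."""
    return freq_7nm_ghz * (1.0 + SPEED_GAIN)


def scaled_power(power_7nm_w: float) -> float:
    """Power if the whole benefit is taken as a power reduction."""
    return power_7nm_w * (1.0 - POWER_REDUCTION)


# Hypothetical 7 nm logic block: 100 mm^2, 2.0 GHz, 10 W.
area_5nm = scaled_area(100.0)      # a little over half the original area
freq_5nm = scaled_frequency(2.0)   # speed option
power_5nm = scaled_power(10.0)     # power option
```

In other words, a 1.84x density gain means the same logic needs roughly 54% of the original area.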
© B. Baas 14
Full Custom
• Multiplier chip
– Multiplier
– I/O pads
– Clock generator
– Control logic
– Buffers
Standard Cell
• Constant-height cells
• Regular “pin” locations
• Regular layout makes it much easier for CAD tools to place and route cells automatically
Combination Standard Cell and Full Custom
• Dense, regular full-custom blocks
• Random logic implemented with standard cells and automatic place and route
[figure from S. Hauck]
Typical Standard Cell, Gate Array, or FPGA Design Flow
• HDL (Verilog) source code is synthesized to generate a gate netlist made up of elements from the Standard Cell library
• The same HDL design may be synthesized to various libraries; for example:
– Standard cell (NAND, NOR, Flip-Flop, etc.)
– FPGA library (CLBs, LUTs, etc.)
[figure: HDL (Verilog or VHDL) and a cell library feed the Synthesizer CAD Tool, which produces the Hardware Implementation (e.g., gate netlist)]
Ex (HDL): c=a&b
Ex (gate netlist): x=NAND(a,b); c=INV(x)
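The slide's example maps the HDL expression c=a&b onto a NAND cell followed by an inverter. A minimal sketch of why that mapping is valid is an exhaustive truth-table comparison; the Python function names below are hypothetical stand-ins for the HDL expression and the library cells, not part of any synthesis tool.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """Model of a 2-input NAND library cell on 1-bit values."""
    return 1 - (a & b)


def inv(x: int) -> int:
    """Model of an inverter library cell on a 1-bit value."""
    return 1 - x


def behavioral(a: int, b: int) -> int:
    """The HDL-level description from the slide: c = a & b."""
    return a & b


def netlist(a: int, b: int) -> int:
    """The synthesized gate netlist from the slide."""
    x = nand(a, b)
    c = inv(x)
    return c


def equivalent() -> bool:
    """Compare both descriptions over every input combination."""
    return all(
        behavioral(a, b) == netlist(a, b)
        for a, b in product((0, 1), repeat=2)
    )
```

Calling `equivalent()` checks all four input combinations. Real flows use formal equivalence checking or simulation for the same purpose, since exhaustive enumeration does not scale to large designs.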
Simplified diagram of Standard Cell design flow after synthesis
[figure: Hardware Implementation (e.g., gate netlist; Ex: x=NAND(a,b), c=INV(x)) goes through Place & Route, which produces the Final Layout (could be fabricated), a Gate-level description, and Timing Information; the layout is verified with a Design Rule Check (DRC) and a Layout vs. Schematic (LVS) Check]
Layout synthesized from Verilog and a Standard Cell library, and then “Placed & Routed”
module multiplier (
    input  in1,
    input  in2,
    output out
);
    // module body not shown on the slide
endmodule
• Polysilicon and diffusion are the same for all designs
• Metal layers customized for particular chips
[figure labels: p-type diffusion, PMOS transistor, polysilicon, n-type diffusion, NMOS transistor]
Gate Array
[figure labels: polysilicon, metal, possible contact, VDD, GND, rows of uncommitted cells, routing channel, Uncommitted Cell, Committed Cell (4-input NOR) with output Out]
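The committed cell in the figure is labeled as a 4-input NOR between VDD and GND. As a rough switch-level sketch, and assuming the cell is wired as a conventional static CMOS NOR (PMOS transistors in series to VDD, NMOS transistors in parallel to GND), its behavior can be modeled as below. The function names are hypothetical, and the wiring is an assumption, since the slide only names the gate.

```python
from itertools import product


def pull_up_conducts(inputs: tuple) -> bool:
    """Series PMOS network: conducts to VDD only if every gate input is 0."""
    return all(v == 0 for v in inputs)


def pull_down_conducts(inputs: tuple) -> bool:
    """Parallel NMOS network: conducts to GND if any gate input is 1."""
    return any(v == 1 for v in inputs)


def nor4(a: int, b: int, c: int, d: int) -> int:
    """Output of the committed cell for one input combination."""
    inputs = (a, b, c, d)
    up = pull_up_conducts(inputs)
    down = pull_down_conducts(inputs)
    # In static CMOS exactly one network conducts, so Out is never
    # floating and never shorted between VDD and GND.
    assert up != down
    return 1 if up else 0


# Truth table of the cell: Out is 1 only when all four inputs are 0.
truth_table = {bits: nor4(*bits) for bits in product((0, 1), repeat=4)}
```

Only the metal and contact pattern decides which function an uncommitted cell becomes; the transistors underneath are identical for every design.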
Programmable Processor
• Intel 8086
• First released 1978
• 33 mm²
• 3.2 µm
• 4–12 MHz
• 29,000 transistors
4.80 GHz General-Purpose Processor
• Intel Core i9-8950HK (code-named Coffee Lake)
• 14 nm CMOS
• 6 cores (12 threads)
• 2.90 GHz base frequency
• 4.60 GHz standard turbo frequency
• 4.80 GHz maximum turbo frequency, possible only if the CPU is below 53 °C
• 12 MB on-die cache
• 45 W TDP (Thermal Design Power)
Massive General-Purpose Server Processor
• Itanium Poulson
• 32 nm
• 3.1 billion transistors
• 18.2 mm × 29.9 mm = 544 mm²
• 8 multi-threaded cores
• 54 MB total on-die cache
• 170 W TDP
• [ISSCC 2011]
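The die dimensions and transistor counts on the processor slides allow a quick density comparison between the 8086 and Poulson. The sketch below only uses numbers stated on the slides; the helper names are hypothetical, and average density over the whole die is a crude measure because it mixes logic, cache, and I/O.

```python
def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Rectangular die area in mm^2."""
    return width_mm * height_mm


def transistor_density(transistors: float, area_mm2: float) -> float:
    """Average transistors per mm^2 over the whole die."""
    return transistors / area_mm2


# Itanium Poulson (32 nm): 3.1 billion transistors, 18.2 mm x 29.9 mm.
poulson_area = die_area_mm2(18.2, 29.9)
poulson_density = transistor_density(3.1e9, poulson_area)

# Intel 8086 (3.2 um): 29,000 transistors on 33 mm^2.
i8086_density = transistor_density(29_000, 33.0)

# How many times denser Poulson is than the 8086.
density_ratio = poulson_density / i8086_density
```

The feature size shrinks by a factor of 100 between the two chips (3.2 µm to 32 nm), so an ideal area scaling would suggest about 10,000x. The measured ratio comes out in the same order of magnitude.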
Programmable DSP Processor
• TI C64X
• 600 MHz, 0.13 µm, 718 mW @ 1.2 V
• 8-way VLIW core
• 2-level memory system
• 64 million transistors
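Because energy is one of the main comparison axes in this handout, the C64X figures can be converted into energy per clock cycle. This is a rough sketch with hypothetical names. The per-instruction figure assumes all 8 VLIW slots are filled every cycle, which is a best-case assumption rather than a measured value.

```python
def energy_per_cycle_nj(power_w: float, freq_hz: float) -> float:
    """Energy dissipated per clock cycle, in nanojoules."""
    return power_w / freq_hz * 1e9


# TI C64X figures from the slide: 718 mW at 600 MHz, 8-way VLIW.
C64X_POWER_W = 0.718
C64X_FREQ_HZ = 600e6
C64X_ISSUE_WIDTH = 8

c64x_nj_per_cycle = energy_per_cycle_nj(C64X_POWER_W, C64X_FREQ_HZ)

# Best case: every VLIW slot does useful work each cycle (assumption).
c64x_pj_per_instruction = c64x_nj_per_cycle / C64X_ISSUE_WIDTH * 1000.0
```

This gives roughly 1.2 nJ per cycle, or about 150 pJ per instruction at peak issue rate.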
Heterogeneous Programmable Platforms
FPGA Fabric
Embedded memories
Embedded PowerPC
Hardwired multipliers
High-speed I/O
EEC 116, B. Baas 35
[Xilinx]
Design at a crossroad: System-on-a-Chip
• Applications where cost, performance, and energy are big issues!
• DSP and control
• Mixed-mode
• Combines programmable and application-specific modules
• Software plays crucial role
[figure labels: Analog; Multi-Spectral Imager; RAM; + 1 Gbit DRAM, Preprocessing; 64 SIMD Processor Array + SRAM, Image Conditioning, 100 GOPS; µC system + 2 Gbit DRAM, Recognition]