
Handout Chip Design Methods

This document discusses various chip design methodologies and technologies. It describes full custom, standard cell, gate array, FPGA, and programmable processor design approaches. For each technology, it provides details on their advantages such as performance, power, cost, and design time. It also discusses semiconductor fabrication technologies ranging from 14nm to 5nm nodes. Finally, it provides examples of chips designed using different methodologies including general purpose processors, DSP processors, and GPUs.

Uploaded by

emindemirbas06.2

CHIP DESIGN METHODOLOGIES OR DESIGN METHODS

DSP Processor Design Approaches

• Full custom
• Standard cell**
• Gate array*
• FPGA*
• Programmable DSP
• Programmable general purpose

Toward the top of the list: higher performance, lower energy (power), lower per-part cost
Toward the bottom of the list: lower design time, lower one-time cost

* Design domain of EEC 281
© B. Baas 13

VLSI Design Technologies

• VLSI
– Originally meant “Very Large Scale Integration,” meaning a large number of transistors per chip
– Now generally means “semiconductor chip”
• Fabrication technologies are characterized by their minimum feature length (the length of a transistor’s gate)
• Some typical state-of-the-art fabrication technologies in late 2019:
– 14 nm: mature production for logic chips
– 5 nm: “Industry-leading 5 nm CMOS technology features, for the first time, full-fledged EUV, and high mobility channel finFETs, offering ~1.84x logic density, 15% speed gain or 30% power reduction over 7 nm. This true 5 nm technology successfully passed qualification with high yield, and targets for mass production in 1H 2020.” (IEDM, December 2019)

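To make the quoted 5 nm figures concrete, the sketch below applies them to a made-up 7 nm logic block. The function names and the example block (100 mm^2, 2.0 GHz, 10 W) are hypothetical, and the sketch assumes logic area scales as the inverse of the density gain and that the speed gain and the power reduction are alternatives, as the "or" in the quote suggests.

```python
# Scaling figures quoted in the IEDM 2019 statement (5 nm relative to 7 nm).
LOGIC_DENSITY_GAIN = 1.84  # ~1.84x logic density
SPEED_GAIN = 0.15          # 15% speed gain ...
POWER_REDUCTION = 0.30     # ... or 30% power reduction


def area_at_5nm(area_at_7nm_mm2: float) -> float:
    """Logic area after the shrink, assuming area scales as 1 / density."""
    return area_at_7nm_mm2 / LOGIC_DENSITY_GAIN


def clock_at_5nm(clock_at_7nm_ghz: float) -> float:
    """Clock rate if the whole benefit is taken as speed."""
    return clock_at_7nm_ghz * (1.0 + SPEED_GAIN)


def power_at_5nm(power_at_7nm_w: float) -> float:
    """Power if the whole benefit is taken as power savings instead."""
    return power_at_7nm_w * (1.0 - POWER_REDUCTION)


if __name__ == "__main__":
    # Hypothetical 7 nm block: 100 mm^2 of logic, 2.0 GHz, 10 W.
    print(f"area:  {area_at_5nm(100.0):.1f} mm^2")
    print(f"clock: {clock_at_5nm(2.0):.2f} GHz (speed option)")
    print(f"power: {power_at_5nm(10.0):.1f} W (power option)")
```
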
Full Custom

• All transistors and interconnect drawn by hand
• Full control over sizing and layout

[figure from S. Hauck]


Full Custom

• Multiplier chip
– Multiplier
– I/O pads
– Clock generator
– Control logic
– Buffers

Standard Cell

• Constant-height cells
• Regular “pin” locations
• Regular layout makes it much easier for CAD tools to place and route cells automatically

[figure from S. Hauck]


Standard Cell

• Channels for routing are used only in older technologies (they are not necessary in modern processes with many levels of interconnect)

[figure from S. Hauck]


Standard Cell

• Wireless LAN chip
• Ten major standard cell digital blocks, plus one analog block in the upper right corner
• Many embedded memory arrays
• Horizontal power grid stripes

Combination Standard Cell and Full Custom

• Dense, regular full-custom blocks
• Random logic implemented with standard cells and automatic place and route

[figure from S. Hauck]

Typical Standard Cell, Gate Array, or FPGA Design Flow

• HDL (Verilog) source code is synthesized to generate a gate netlist made up of elements from the Standard Cell library
• The same HDL design may be synthesized to various libraries; for example:
– Standard cell (NAND, NOR, Flip-Flop, etc.)
– FPGA library (CLBs, LUTs, etc.)

[Diagram: HDL (Verilog or VHDL) → Synthesizer CAD tool, which reads a cell library → Hardware implementation (e.g., gate netlist)]
Ex: the HDL statement c=a&b is synthesized to the netlist x=NAND(a,b), c=INV(x)

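The synthesis example on this slide (c=a&b becoming a NAND followed by an inverter) can be checked exhaustively. The sketch below is only an illustration in Python, not a synthesis tool: the gate functions and the `equivalent` helper are hypothetical names, and each "cell" is modeled as a plain Boolean function on 0/1 values.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """Model of a 2-input NAND standard cell."""
    return 1 - (a & b)


def inv(a: int) -> int:
    """Model of an inverter standard cell."""
    return 1 - a


def hdl_behavior(a: int, b: int) -> int:
    """Behavior described by the HDL source: c = a & b."""
    return a & b


def gate_netlist(a: int, b: int) -> int:
    """Netlist from the slide: x = NAND(a, b); c = INV(x)."""
    x = nand(a, b)
    c = inv(x)
    return c


def equivalent(f, g, n_inputs: int) -> bool:
    """Compare two functions on every combination of 0/1 inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))


if __name__ == "__main__":
    print("netlist matches HDL:", equivalent(hdl_behavior, gate_netlist, 2))
```

Exhaustive comparison is practical only for a handful of inputs; it is used here because the example has just two.
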
Simplified diagram of Standard Cell design flow after synthesis

[Diagram: Hardware implementation (e.g., gate netlist; Ex: x=NAND(a,b), c=INV(x)) → Place & Route → Final layout (could be fabricated), checked with Design Rule Check (DRC) and Layout vs. Schematic (LVS) check. Place & Route also produces a gate-level description with timing information → Gate level dynamic and/or static analysis]

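Layout synthesized from Verilog and a Standard Cell library, and then “Placed & Routed”

module multiplier (
   input  in1,
   input  in2,
   output out
);

   // Port widths are not given on the slide; as written, each port is 1 bit wide.
   // A continuous assignment needs the "assign" keyword.
   assign out = in1 * in2;

endmodule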

EEC 116, B. Baas Source: Digital Integrated Circuits, 2nd © 23


Gate Array

• Polysilicon and diffusion are the same for all designs
• Metal layers customized for particular chips

[Figure labels: p-type diffusion (PMOS transistor), polysilicon, n-type diffusion (NMOS transistor)]

Gate Array

• Polysilicon and diffusion are the same for all designs
• 0.125 µm example

[figure from LETI]


Gate Array: Sea-of-gates

[Figure: sea-of-gates array with rows of uncommitted cells and a routing channel. Labels: polysilicon, metal, VDD, GND, possible contact, uncommitted cell, committed cell (4-input NOR) with inputs In1, In2, In3, In4 and output Out]

Source: Digital Integrated Circuits, 2nd


Field Programmable Gate Array (FPGA)

• Metal-layer connections are now programmable with SRAM instead of being hardwired during manufacture, as with a gate array
• Cells contain general programmable logic and registers

[figure from S. Hauck]

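One common way to build the "general programmable logic" in an FPGA cell is a lookup table (LUT), which an earlier slide lists as an FPGA library element. The sketch below is a simplified software model under that assumption; the `LUT` class and its method names are hypothetical and do not describe any particular vendor's cell.

```python
class LUT:
    """k-input lookup table: 2**k configuration bits held in (modeled) SRAM."""

    def __init__(self, k: int):
        self.k = k
        self.sram = [0] * (2 ** k)  # one stored bit per input combination

    def program(self, func) -> None:
        """Fill the SRAM with the truth table of func (a function of k bits)."""
        for addr in range(2 ** self.k):
            bits = [(addr >> i) & 1 for i in range(self.k)]
            self.sram[addr] = func(*bits) & 1

    def evaluate(self, *inputs: int) -> int:
        """Use the inputs as an address and read back the stored bit."""
        addr = sum(bit << i for i, bit in enumerate(inputs))
        return self.sram[addr]


if __name__ == "__main__":
    lut = LUT(2)
    lut.program(lambda a, b: a & b)        # configure as AND
    print("AND(1,1) =", lut.evaluate(1, 1))
    lut.program(lambda a, b: 1 - (a | b))  # reconfigure the same cell as NOR
    print("NOR(0,0) =", lut.evaluate(0, 0))
```

The same hardware cell implements a different function after reprogramming, which is the flexibility the slide contrasts with a gate array's fixed metal.
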


Field Programmable Gate Array (FPGA)

• Chips now “designed” with software
• User pays for up-front chip design costs
– All: full-custom, standard cell
– Half: gate array
– Shared: FPGA
• User writes code (e.g., Verilog), compiles it, and downloads it into the chip
• The flexibility comes at a great cost, however; as a very approximate comparison, FPGAs are over 10x slower, less energy efficient, and larger in area than an equivalent Standard Cell design

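The trade-off between up-front cost and per-part cost can be sketched as a simple amortization. Every number below is a hypothetical placeholder chosen only to make the arithmetic easy to follow, not data from the slides, and the function names are illustrative. The FPGA option is modeled with no user-paid up-front cost, on the assumption that the shared design cost is folded into the part price.

```python
def cost_per_part(up_front_cost: float, unit_cost: float, volume: int) -> float:
    """Total cost per part once the up-front cost is spread over the volume."""
    return up_front_cost / volume + unit_cost


def break_even_volume(nre_a: float, unit_a: float,
                      nre_b: float, unit_b: float) -> float:
    """Volume at which option A (higher up-front, lower unit cost) matches B."""
    return (nre_a - nre_b) / (unit_b - unit_a)


if __name__ == "__main__":
    # Hypothetical placeholder costs, in arbitrary currency units.
    STD_CELL_NRE, STD_CELL_UNIT = 1_000_000.0, 5.0  # user pays all up-front cost
    FPGA_NRE, FPGA_UNIT = 0.0, 55.0                 # up-front cost shared via part price

    volume = break_even_volume(STD_CELL_NRE, STD_CELL_UNIT, FPGA_NRE, FPGA_UNIT)
    print(f"break-even volume: {volume:.0f} parts")
    for n in (1_000, 20_000, 1_000_000):
        std = cost_per_part(STD_CELL_NRE, STD_CELL_UNIT, n)
        fpga = cost_per_part(FPGA_NRE, FPGA_UNIT, n)
        print(f"{n:>9} parts: standard cell {std:.2f}, FPGA {fpga:.2f}")
```

With these placeholder values the FPGA is cheaper per part at low volume and the standard cell design is cheaper at high volume, which matches the direction of the "lower one-time cost" and "lower per-part cost" labels earlier in the handout.
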
Programmable Processor

• Intel 8086
• First released 1978
• 33 mm²
• 3.2 µm
• 4–12 MHz
• 29,000 transistors

4.80 GHz General-Purpose Processor

• Intel Core i9 (code-named Coffee Lake) [i9-8950HK]
• 14 nm CMOS
• 6 cores (12 threads)
• 2.90 GHz base frequency
• 4.60 GHz standard turbo frequency
• 4.80 GHz maximum turbo frequency, possible only if the CPU is below 53 °C
• 12 MB on-die cache
• 45 Watts TDP (Thermal Design Power)

Massive General-Purpose Server Processor

• Itanium Poulson
• 32 nm
• 3.1 Billion Transistors
• 18.2 mm × 29.9 mm = 544 mm²
• 8 multi-threaded cores
• 54 MB total on-die cache
• 170 Watts TDP
• [ISSCC 2011]

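The die-area figure on this slide is a straightforward product, and dividing the transistor count by it gives an average density. A short sketch using only the numbers on the slide (the function names are illustrative):

```python
def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular die."""
    return width_mm * height_mm


def transistor_density_per_mm2(transistors: float, area_mm2: float) -> float:
    """Average transistor density over the whole die."""
    return transistors / area_mm2


if __name__ == "__main__":
    area = die_area_mm2(18.2, 29.9)  # Itanium Poulson dimensions from the slide
    density = transistor_density_per_mm2(3.1e9, area)
    print(f"die area: {area:.0f} mm^2")
    print(f"density:  {density / 1e6:.1f} million transistors per mm^2")
```
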
Programmable DSP Processor

• TI C64X
• 600 MHz, 0.13 µm, 718 mW @ 1.2 V
• 8-way VLIW core
• 2-level memory system
• 64 million transistors

[figure from S. Agarwala]


Massive Special-Purpose Processor

• Nvidia V100
• TSMC 12 nm FinFET
• 21.1 Billion Transistors
• 815 mm²
– Approximately 37.9 mm × 21.5 mm
– At the reticle limit
• 1.45 GHz
• 80 streaming multiprocessors
• 300 Watts TDP
• Memory interface to HBM2: 1.75 GHz, 4096-bit bus, 900 GB/s
• [HotChips 2017]

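The 900 GB/s memory figure follows approximately from the bus width and transfer rate. The sketch below assumes each of the 4096 bus bits carries one bit per cycle at the quoted 1.75 GHz rate; that interpretation is an assumption, and the function name is illustrative.

```python
def peak_bandwidth_gbytes_per_s(rate_ghz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s, assuming one bit per bus line per cycle."""
    bits_per_ns = rate_ghz * bus_width_bits  # Gbit/s across the whole bus
    return bits_per_ns / 8                   # 8 bits per byte


if __name__ == "__main__":
    bandwidth = peak_bandwidth_gbytes_per_s(1.75, 4096)
    print(f"peak HBM2 bandwidth: {bandwidth:.0f} GB/s (slide quotes 900 GB/s)")
    print(f"die area check: {37.9 * 21.5:.0f} mm^2 (slide quotes 815 mm^2)")
```
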
Graphcore

Heterogeneous Programmable Platforms

Xilinx Virtex-II Pro [Xilinx]
• FPGA fabric
• Embedded memories
• Embedded PowerPC
• Hardwired multipliers
• High-speed I/O

Design at a crossroad
System-on-a-Chip

• Often used in embedded applications where cost, performance, and energy are big issues!
• DSP and control
• Mixed-mode
• Combines programmable and application-specific modules
• Software plays crucial role

[Figure labels: Analog; Multi-Spectral Imager; RAM; 500 k Gates FPGA + 1 Gbit DRAM (Preprocessing); 64 SIMD Processor Array + SRAM (Image Conditioning, 100 GOPS); µC system + 2 Gbit DRAM (Recognition)]



A System-on-a-Chip Example
High Definition TV Chip

Courtesy: Philips


The World’s Largest Chip
Cerebras Wafer-Scale Engine
• 46,225 mm² chip
– 8.5” × 8.5”
– Built from a 12” wafer
– 56x larger than the biggest GPU ever made (815 mm² and 21.1 billion transistors)
• 1.2 Trillion transistors
• 15 kW!
• 400,000 cores
• Fabbed by TSMC; 98%-99% of wafer area is usable
• 18 GB on-chip SRAM
• 100 Pb/s interconnect (100,000 Tb/s = 12,500 TB/s)
• Approximately $200M startup capital as of Aug 2019
https://www.cerebras.net/
https://www.zdnet.com/article/cerebras-has-as-a-three-year-lead-on-competition-with-its-giant-chip/
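
Several of the Cerebras figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the slide and assumes a square die and decimal prefixes (1 Pb = 10^15 bits, 1 TB = 10^12 bytes); the constant names are illustrative.

```python
import math

WSE_AREA_MM2 = 46_225          # Cerebras Wafer-Scale Engine
GPU_AREA_MM2 = 815             # largest GPU cited on the slide
INTERCONNECT_PBIT_PER_S = 100  # 100 Pb/s
MM_PER_INCH = 25.4

# Side length of the die, assuming it is square.
side_mm = math.sqrt(WSE_AREA_MM2)
side_inches = side_mm / MM_PER_INCH

# Area ratio against the largest GPU.
area_ratio = WSE_AREA_MM2 / GPU_AREA_MM2

# Convert Pb/s to TB/s with integer arithmetic: bits -> bytes -> terabytes.
interconnect_tbyte_per_s = INTERCONNECT_PBIT_PER_S * 10**15 // 8 // 10**12

if __name__ == "__main__":
    print(f"side length: {side_mm:.0f} mm = {side_inches:.2f} in")
    print(f"area ratio:  {area_ratio:.1f}x the largest GPU")
    print(f"interconnect: {interconnect_tbyte_per_s} TB/s")
```
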
