Chip Builder GPT


DEVELOPING CHIPS FOR TOMORROW: YOU, ME, AND CHATGPT®
Dr Jawad Nasrullah | Palo Alto Electron
February 07, 2024
Agenda

1. ChatGPT® and chip design
2. Heterogeneous chips for generative AI: an example
3. AI memory demand beyond HBM
4. Challenges and opportunities of AI HW


Vision

• Chips to enhance AI
• AI to improve chip design
ChatGPT® and Chip Design
Chip design uses coding
Chip design life-cycle:
• Manufacturing design kits and collaterals
• Architecture and early planning
• Design implementation and verification
• Verification and sign off
• Manufacture, test, qualify and ship




write a spice netlist of an inverter in a subcircuit named chiplet1

[Figure: inverter symbol with terminals Vdd, Vss, In, Out]

    .subckt chiplet1 Vin Vout Vdd Gnd
    * Definition of PMOS transistor
    M1 Vout Vin Vdd Vdd PMOS L=1u W=2u
    * Definition of NMOS transistor
    M2 Vout Vin Gnd Gnd NMOS L=1u W=1u
    .ends chiplet1

write behavioral model of chiplet1 in Verilog

[Figure: inverter symbol with terminals Vdd, Vss, In, Out]

    module chiplet1(Vin, Vout);
      input Vin;   // Input port
      output Vout; // Output port
      // Logical behavior of inverter
      assign Vout = ~Vin;
    endmodule

write a SPICE testbench to test transient output of chiplet1 with a pulse of 100ns width

[Figure: inverter symbol with terminals Vdd, Vss, In, Out]

    .include "Chiplet1.spice" ; Include the Chiplet1 subcircuit definition
    Vdd Vdd 0 DC 5V
    Vss Vss 0 DC 0V
    Vin in 0 PULSE(0V 5V 0 1ns 1ns 100ns 200ns)
    X1 in out Vdd Vss Chiplet1
    .tran 1ns 500ns ; Transient analysis: step time, total time
    .end

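As a rough aid for reading the testbench, the sketch below evaluates a SPICE-style PULSE source with the same arguments (0V to 5V, 1ns edges, 100ns width, 200ns period) and pairs it with an idealized inverter. The helper names `pulse` and `ideal_inverter` are illustrative and not part of the slides, and the threshold at Vdd/2 is an assumption rather than a transistor-level result.

```python
NS = 1e-9  # one nanosecond, in seconds


def pulse(t, v1, v2, td, tr, tf, pw, per):
    """Value of a SPICE-style PULSE source at time t (seconds)."""
    if t < td:
        return v1
    tc = (t - td) % per              # time elapsed within the current period
    if tc < tr:                      # rising edge
        return v1 + (v2 - v1) * tc / tr
    if tc < tr + pw:                 # flat top of the pulse
        return v2
    if tc < tr + pw + tf:            # falling edge
        return v2 + (v1 - v2) * (tc - tr - pw) / tf
    return v1                        # low until the period ends


def ideal_inverter(vin, vdd=5.0):
    """Idealized inverter (assumption): output is Vdd when the input is below Vdd/2."""
    return vdd if vin < vdd / 2 else 0.0


# Arguments mirror PULSE(0V 5V 0 1ns 1ns 100ns 200ns) from the testbench.
PULSE_ARGS = dict(v1=0.0, v2=5.0, td=0.0, tr=1 * NS, tf=1 * NS, pw=100 * NS, per=200 * NS)

for t_ns in (0, 50, 150, 250, 350):
    vin = pulse(t_ns * NS, **PULSE_ARGS)
    print(f"t={t_ns:3d} ns  Vin={vin:.1f} V  Vout(ideal)={ideal_inverter(vin):.1f} V")
```

Over the 500ns transient window the input goes high around 0ns, 200ns, and 400ns, so a working inverter output should be low during those intervals.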
write an essay on Chiplet1

In the realm of semiconductor design and electronics, the concept of a "chiplet," such as the imaginary "Chiplet1," represents a significant stride in the modular approach to integrated circuit (IC) design. While "Chiplet1" is a hypothetical …. serves as an excellent example to understand modern trends in IC design, such as chiplet-based architecture…

• Not hallucinating and shows some ‘understanding’


• Can be useful in writing verification scripts
A simple view of the generative pipeline

Prompt -> Foundation Large Language Model (LLM) -> Output

Training inputs:
• Language Training Corpora
• SPICE and Verilog Manuals, App Notes

Model size:
• ChatGPT3: 175 Billion parameters
• ChatGPT4: ~1.7 Trillion parameters
  - ~4TB (Inference)
  - ~27TB (Training)
• ChatGPT5: 20x increase?
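One way to see where memory figures of this size could come from is to multiply the parameter count by a bytes-per-parameter budget. The sketch below does that for ~1.7 trillion parameters. The byte budgets are assumptions and are not stated on the slide: 2 bytes per parameter for 16-bit inference weights, and 16 bytes per parameter for mixed-precision training with an Adam-style optimizer (16-bit weights and gradients, plus 32-bit master weights and two 32-bit optimizer moments).

```python
PARAMS = 1.7e12  # ~1.7 trillion parameters, from the slide

# Assumptions (not from the slides): memory needed per parameter.
BYTES_INFERENCE = 2    # 16-bit weights only
BYTES_TRAINING = 16    # 2 (weights) + 2 (gradients) + 4 (master weights) + 4 + 4 (Adam moments)


def footprint_tb(params, bytes_per_param):
    """Memory footprint in terabytes (10^12 bytes)."""
    return params * bytes_per_param / 1e12


inference_tb = footprint_tb(PARAMS, BYTES_INFERENCE)
training_tb = footprint_tb(PARAMS, BYTES_TRAINING)

print(f"Inference weights: {inference_tb:.1f} TB")
print(f"Training state:    {training_tb:.1f} TB")
```

Under these assumptions the weights alone account for most of the quoted inference figure, and the training estimate lands close to the quoted ~27TB. Activations and caches are not counted here.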
A chip design generative pipeline

Engineering Prompt -> Chip Design Fine-Tuned LLM -> EDA tools -> Architecture, Implementation, Verify, Sign off

Inputs to the fine-tuned LLM:
• Chip Design Training Corpora (Style, IP, Kits)
• Design Database Retrieval

Chiplet modularity helps simplify this pipeline
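The flow above can be sketched as a toy program: look up relevant entries in a design database, add them to the engineering prompt, pass the result to a model, then hand the draft to a checking step. Everything here is hypothetical and only illustrates the data flow: the database contents, the word-overlap retrieval, and the `stub_llm` and `stub_eda_check` functions stand in for a real fine-tuned LLM and real EDA tools.

```python
def retrieve(design_db, prompt, top_k=2):
    """Toy retrieval: rank entries by how many words they share with the prompt."""
    words = set(prompt.lower().split())
    ranked = sorted(
        design_db.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


def build_prompt(design_db, prompt):
    """Prepend the retrieved design-database entries to the engineering prompt."""
    context = "\n".join(f"[{name}] {design_db[name]}" for name in retrieve(design_db, prompt))
    return f"Context:\n{context}\n\nTask: {prompt}"


def run_pipeline(design_db, prompt, llm, eda_check):
    """Prompt -> LLM draft -> tool check; returns the draft and the check result."""
    draft = llm(build_prompt(design_db, prompt))
    return draft, eda_check(draft)


# Hypothetical design database entries (style, IP, kits).
DESIGN_DB = {
    "chiplet1": "inverter subcircuit with ports Vin Vout Vdd Gnd",
    "style": "netlists use lowercase node names and one device per line",
    "pdk": "manufacturing design kit device models for PMOS and NMOS",
}


def stub_llm(text):
    """Stand-in for a fine-tuned LLM call; returns a fixed draft netlist."""
    return ".subckt chiplet1 Vin Vout Vdd Gnd\n.ends chiplet1"


def stub_eda_check(netlist):
    """Stand-in for an EDA tool: checks only that the subcircuit is opened and closed."""
    return netlist.startswith(".subckt") and ".ends" in netlist


draft, passed = run_pipeline(
    DESIGN_DB, "write a testbench for the inverter subcircuit", stub_llm, stub_eda_check
)
print(draft)
print("check passed:", passed)
```

A smaller, self-contained block such as a chiplet keeps the retrieved context short, which is one reading of why modularity would simplify the pipeline.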
GPT4 Training Cost Estimates

• [image] x ~2000-3000 needed for a month to train GPT
• Rental cost/training ~$80 Million
• Electricity cost/training ~$4 Million (Greenhouse gas emissions equivalent to ~1600 gas-powered cars for a year.)
• Equipment CapEx ~$450 Million
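The electricity estimate can be sanity-checked with a short calculation. The sketch below assumes 2500 units (the midpoint of ~2000-3000), a 30-day month, and 10.2 kW per unit. The per-unit power is an assumption borrowed from the data center estimate elsewhere in this deck, since this slide does not state it. The script then reports the electricity price implied by the ~$4 Million figure.

```python
UNITS = 2500                 # assumption: midpoint of the ~2000-3000 range
KW_PER_UNIT = 10.2           # assumption: per-unit power from the data center slide
HOURS = 30 * 24              # assumption: one month taken as 30 days
ELECTRICITY_COST_USD = 4e6   # ~$4 Million, from the slide

energy_kwh = UNITS * KW_PER_UNIT * HOURS
implied_price = ELECTRICITY_COST_USD / energy_kwh

print(f"Energy for one training run: {energy_kwh / 1e6:.2f} GWh")
print(f"Implied electricity price:   ${implied_price:.3f} per kWh")
```

The implied price depends directly on the assumed unit count and power, so it is a consistency check rather than an independent estimate.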


Data Center GPU Estimates

• [image] x 75k units @10.2kW ("600k H100 equivalent" - Meta/Zuckerberg)
• Power Demand = 750 MW
• Needs own power generation
• Equipment CapEx >$10 B
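The power demand follows from the unit count and per-unit power. The sketch below multiplies them out. The figure of 8 GPUs per unit is an assumption, inferred from 600k GPUs divided by 75k units, and is not stated on the slide.

```python
UNITS = 75_000        # from the slide
KW_PER_UNIT = 10.2    # from the slide
GPUS_PER_UNIT = 8     # assumption: 600k GPUs / 75k units

total_mw = UNITS * KW_PER_UNIT / 1000
total_gpus = UNITS * GPUS_PER_UNIT

print(f"Power demand: {total_mw:.0f} MW")
print(f"GPU count:    {total_gpus:,}")
```

The product comes out slightly above the rounded 750 MW on the slide, and it covers only the compute units, not cooling or other facility loads.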


Compute = 1 x 10^18 math operations/s
Power Dissipation = 20 W
Memory = 2.5 x 10^15 bytes = 2.5 Peta Bytes
Chips for AI
A heterogeneous example
AMD MI300A            AMD MI300X
228 GPU, 24 CPU       304 GPU
128 GB HBM DRAM       192 GB HBM DRAM
5.3TB/s Memory BW     5.3TB/s Memory BW
750W
750A @ 1V
16A @ 48V
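The two current figures follow from I = P / V. The sketch below reproduces them and, as an added illustration, compares the resistive loss (I²R) for the same conductor resistance at the two voltages. The loss comparison is not on the slide.

```python
def current_a(power_w, volts):
    """Current in amperes needed to deliver power_w at the given voltage."""
    return power_w / volts


POWER_W = 750.0

i_core = current_a(POWER_W, 1.0)    # delivery at ~1 V core voltage
i_bus = current_a(POWER_W, 48.0)    # delivery at 48 V distribution voltage

# For a fixed conductor resistance, loss scales with the square of the current.
loss_ratio = (i_core / i_bus) ** 2

print(f"At 1 V:  {i_core:.0f} A")
print(f"At 48 V: {i_bus:.1f} A")
print(f"I^2R loss ratio (1 V vs 48 V): {loss_ratio:.0f}x")
```

This is the usual argument for distributing power at 48V and converting to the core voltage as close to the chip as possible.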

Gen AI justifying super expensive chips


Rack ~50kW

MI300 systems leverage OCP open accelerator infrastructure

AMD MI300 Chip
• 304 GPU (8 XCD)
• 192 GB HBM (DRAM)
• 5.3 TB/s Memory BW
• 750 W TBP

MI300 OCP OAM Module
• OAM Heatsink

MI300 OCP UBB
• 8x OAM
• 1.5 TB HBM (DRAM)
• 42 TB/s Memory BW
• 10 kW UBB
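The UBB figures are roughly eight times the per-chip figures. The sketch below multiplies them out. Treating the gap between the summed accelerator power and the 10 kW board figure as headroom for everything else on the board is an assumption, since the slide does not break the 10 kW down.

```python
OAM_COUNT = 8            # 8x OAM per UBB, from the slide
HBM_GB_PER_CHIP = 192    # from the slide
BW_TBS_PER_CHIP = 5.3    # from the slide
POWER_W_PER_CHIP = 750   # TBP, from the slide
UBB_POWER_W = 10_000     # 10 kW UBB, from the slide

ubb_hbm_gb = OAM_COUNT * HBM_GB_PER_CHIP
ubb_bw_tbs = OAM_COUNT * BW_TBS_PER_CHIP
accel_power_w = OAM_COUNT * POWER_W_PER_CHIP
other_power_w = UBB_POWER_W - accel_power_w   # assumption: non-accelerator share

print(f"HBM capacity:      {ubb_hbm_gb} GB")
print(f"Memory bandwidth:  {ubb_bw_tbs:.1f} TB/s")
print(f"Accelerator power: {accel_power_w} W, remaining budget: {other_power_w} W")
```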
MI300 OAM (dimensions shown: 102mm, ~78mm, 170mm)
• 304 GPUs
• 192 GB
• 750W

[Figure: package cross-section with labels GPU, CPU, IOD, HBM Stack, Passive Si Interposer, Substrate]
                    MI300 OAM              +3 Years                +10 Years
GPUs                304                    456                     >1000
Memory              192 GB                 1TB                     4 TB
Power               750W                   1.5kW                   3kW
Dimensions shown    102mm, ~78mm, 170mm    102mm, ~110mm, 170mm    150mm, 170mm
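To put the ten-year projection in perspective, the sketch below converts the start and end values into compound annual growth rates. Treating 4 TB as 4000 GB and using 1000 as the GPU count (the slide says ">1000") are assumptions, so the GPU rate is a lower bound.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


YEARS = 10

memory_rate = cagr(192, 4000, YEARS)   # GB; assumption: 4 TB taken as 4000 GB
power_rate = cagr(750, 3000, YEARS)    # W
gpu_rate = cagr(304, 1000, YEARS)      # assumption: ">1000" taken as 1000

print(f"Memory: {memory_rate:.1%} per year")
print(f"Power:  {power_rate:.1%} per year")
print(f"GPUs:   {gpu_rate:.1%} per year (lower bound)")
```

Memory grows much faster than power in this projection, which points to the memory system as the area needing the most new technology.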
HBM and beyond
Still “there is plenty of room at the bottom”
HBM System Trends
[Chart: log-scale axis from 1 to 100000, over 2020 to 2040]
[Figure: HBM stack next to a GPU on a silicon interposer; labels: 1024 data bus, ~50um, 745um]
• DRAM capacity (#stacks x stack capacity)
• Bandwidth/stack (#wires x symbol rate)
• DRAM power budget (system design)
                        Now            +10 years
Die Stack               8-Hi, 12-Hi    24-Hi
HBM Capacity/Package    288GB          4TB
Data Bus Width          1024           2048
Symbol rate/wire        8Gbps          32Gbps
Core Vdd                1.1V           0.8V

• Cu-Cu bonding, New DRAM devices, large interposers
• Substrate/interposer technology improvement
• New physical/logical layer circuitry
• DTCO, circuit design
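Bandwidth per stack is the number of data wires times the symbol rate per wire. The sketch below applies that to the bus widths and symbol rates listed for now and for +10 years. It assumes one bit per symbol and reports raw rates in GB/s, ignoring protocol overhead.

```python
def stack_bandwidth_gbytes(bus_width, gbps_per_wire):
    """Raw bandwidth of one HBM stack in GB/s (assumes one bit per symbol)."""
    return bus_width * gbps_per_wire / 8   # 8 bits per byte


bw_now = stack_bandwidth_gbytes(1024, 8)       # 1024 wires at 8 Gbps
bw_future = stack_bandwidth_gbytes(2048, 32)   # 2048 wires at 32 Gbps

print(f"Now:       {bw_now:.0f} GB/s per stack")
print(f"+10 years: {bw_future:.0f} GB/s per stack")
print(f"Increase:  {bw_future / bw_now:.0f}x")
```

Doubling the bus width and quadrupling the symbol rate multiply together, which is why both the interposer wiring and the physical-layer circuitry appear in the list of needed improvements.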
Challenges and opportunities of AI HW
Power efficiency and scale out
Manufacturing
• Beyond CMOS (multi-gate)
• Vdd scaling (target 200mV)
• True 3D transistor stacking
• Fine pitch backend

Chip Design
• Modularity (Chiplets)
• Power reduction
• Circuit Density

System Design
• Scale-out networking
• Multi chiplet integration
• HBM/3D integration

Connecting pieces shown between Manufacturing, Chip Design, and System Design:
• Chip Design Kits
• System Design Kits
• Design Automation
• Power Supply
• Cooling
Thank you.
