Introduction to CMOS VLSI Design: MIPS Processor Example
Outline
Design Partitioning
MIPS Processor Example
  Architecture
  Microarchitecture
  Logic Design
  Circuit Design
  Physical Design
Fabrication, Packaging, Testing
Slide 2
Activity 2
Sketch a stick diagram for a 4-input NOR gate
Slide 3
[Figure: stick diagram of a 4-input NOR gate with inputs A, B, C, D between the VDD and GND rails]
2: MIPS Processor Example CMOS VLSI Design Slide 4
Slide 5
Structured Design
Hierarchy: Divide and Conquer
  Recursively divide system into modules
Regularity
  Reuse modules wherever possible
  Ex: Standard cell library
Modularity: well-formed interfaces
  Allows modules to be treated as black boxes
Locality
  Physical and temporal
Slide 6
Design Partitioning
Architecture: User's perspective, what does it do?
  Instruction set, registers
  MIPS, x86, Alpha, PIC, ARM, ...
Microarchitecture
  Single cycle, multicycle, pipelined, superscalar?
Logic: how are functional blocks constructed
  Ripple carry, carry lookahead, carry select adders
Circuit: how are transistors used
  Complementary CMOS, pass transistors, domino
Physical: chip layout
  Datapaths, memories, random logic
Slide 7
Gajski Y-Chart
Slide 8
MIPS Architecture
Example: subset of MIPS processor architecture
  Drawn from Patterson & Hennessy
MIPS is a 32-bit architecture with 32 registers
  Consider 8-bit subset using 8-bit datapath
  Only implement 8 registers ($0 - $7)
  $0 hardwired to 00000000
  8-bit program counter
You'll build this processor in the labs
  Illustrate the key concepts in VLSI design
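The register constraints above can be sketched as a small behavioral model. This is a minimal Python sketch, not the lab implementation; the class name RegFile and its read/write methods are hypothetical names chosen for illustration.

```python
class RegFile:
    """Behavioral model of the 8-register, 8-bit subset ($0 - $7)."""

    NUM_REGS = 8   # only $0 - $7 are implemented
    MASK = 0xFF    # 8-bit datapath

    def __init__(self):
        self.regs = [0] * self.NUM_REGS

    def _check(self, idx):
        if not 0 <= idx < self.NUM_REGS:
            raise IndexError(f"register ${idx} is not implemented")

    def read(self, idx):
        self._check(idx)
        # $0 is hardwired to 00000000
        return 0 if idx == 0 else self.regs[idx]

    def write(self, idx, value):
        self._check(idx)
        if idx != 0:  # writes to $0 are discarded
            self.regs[idx] = value & self.MASK


rf = RegFile()
rf.write(0, 5)       # ignored: $0 stays zero
rf.write(3, 0x1FF)   # truncated to 8 bits
print(rf.read(0), rf.read(3))
```

Masking on write keeps every stored value within the 8-bit datapath width, so reads never need to truncate.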
Slide 9
Instruction Set
Slide 10
Instruction Encoding
32-bit instruction encoding
  Requires four cycles to fetch on 8-bit datapath

format  example              encoding (field widths in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)
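The three formats can be sketched as bit-packing helpers. This is a minimal Python sketch; the function names are hypothetical, and the opcode and funct values in the usage lines (add funct 0x20, beq opcode 4, j opcode 2) are taken from the standard MIPS ISA rather than from this slide. Splitting the word most-significant byte first is likewise an illustrative choice.

```python
def encode_r(ra, rb, rd, funct):
    """R format: 0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)."""
    return ((ra & 0x1F) << 21) | ((rb & 0x1F) << 16) | ((rd & 0x1F) << 11) | (funct & 0x3F)


def encode_i(op, ra, rb, imm):
    """I format: op (6) | ra (5) | rb (5) | imm (16)."""
    return ((op & 0x3F) << 26) | ((ra & 0x1F) << 21) | ((rb & 0x1F) << 16) | (imm & 0xFFFF)


def encode_j(op, dest):
    """J format: op (6) | dest (26)."""
    return ((op & 0x3F) << 26) | (dest & 0x3FFFFFF)


def split_bytes(word):
    """Split a 32-bit word into four bytes, one per fetch cycle on an 8-bit datapath."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


add_word = encode_r(ra=2, rb=3, rd=1, funct=0x20)   # add $1, $2, $3
beq_word = encode_i(op=4, ra=1, rb=2, imm=3)        # beq $1, $2, 3
j_word = encode_j(op=2, dest=5)                     # j 5
print(hex(add_word), hex(beq_word), hex(j_word), split_bytes(add_word))
```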
Slide 11
Fibonacci (C)
f_0 = 1; f_{-1} = -1
f_n = f_{n-1} + f_{n-2}
f = 1, 1, 2, 3, 5, 8, 13, ...
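The recurrence can be sketched as a loop that uses only addition and subtraction, which suits a small instruction subset. This is a Python sketch rather than the C program shown on the slide; the function name fib and the variables f1 and f2 are assumptions, while the starting values 1 and -1 and the count n = 8 come from the slides.

```python
def fib(n):
    """Iterate f_n = f_{n-1} + f_{n-2} starting from f_0 = 1, f_{-1} = -1."""
    f1, f2 = 1, -1          # f1 holds the newest term, f2 the one before it
    while n != 0:
        f1 = f1 + f2        # next term
        f2 = f1 - f2        # recover the previous newest term without a temporary
        n = n - 1
    return f1


print(fib(8))
```

With n = 8 the loop returns 13. Recovering the older term by subtraction avoids a third working register.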
Slide 12
Fibonacci (Assembly)
1st statement: n = 8
How do we translate this to assembly?
Slide 13
Fibonacci (Assembly)
Slide 14
Fibonacci (Binary)
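1st statement: addi $3, $0, 8
How do we translate this to machine language?
  Hint: use instruction encodings below

format  example              encoding (field widths in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)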
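One way to work the exercise is to pack the I-format fields for addi $3, $0, 8 and print them field by field. This is a hedged Python sketch: the opcode 001000 for addi comes from the standard MIPS ISA and is not stated on this slide, and the helper name encode_addi is hypothetical. It assumes the source register sits in the ra field and the destination in the rb field.

```python
ADDI_OP = 0b001000  # standard MIPS opcode for addi (assumption: not given on the slide)


def encode_addi(dest, src, imm):
    """I format: op (6) | ra = src (5) | rb = dest (5) | imm (16)."""
    return (ADDI_OP << 26) | ((src & 0x1F) << 21) | ((dest & 0x1F) << 16) | (imm & 0xFFFF)


def show_fields(word):
    """Return the word as 'op ra rb imm' bit strings."""
    bits = f"{word:032b}"
    return " ".join([bits[0:6], bits[6:11], bits[11:16], bits[16:32]])


word = encode_addi(dest=3, src=0, imm=8)
print(show_fields(word))   # op, ra, rb, imm as separate bit fields
print(hex(word))
```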
Slide 15
Fibonacci (Binary)
Machine language program
Slide 16
MIPS Microarchitecture
Multicycle microarchitecture from Patterson & Hennessy
[Figure: multicycle MIPS datapath and control. Control signals: PCWriteCond, PCWrite, PCEn, PCSource, ALUOp, ALUSrcA, ALUSrcB, IorD, MemRead, MemWrite, MemtoReg, RegWrite, RegDst, IRWrite[3:0], Op[5:0]. Datapath blocks: PC, memory, instruction register, memory data register, register file, A and B registers, ALU, ALUOut, shift left 2, jump address.]
Slide 17
Multicycle Controller
Instruction fetch (states 0 - 3)
  Every fetch state asserts: MemRead, ALUSrcA = 0, IorD = 0, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00
  State 0: IRWrite3
  State 1: IRWrite2
  State 2: IRWrite1
  State 3: IRWrite0
Jump completion (state 12): PCWrite, PCSource = 10
State 6: MemRead, IorD = 1
[Figure: rest of the state transition diagram, including Reset, the memory access and R-type completion states, and transitions on Op = 'LB', 'SB', R-type, 'BEQ', and 'J'; the remaining state outputs are not recoverable from the extraction]
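The four fetch states can be sketched as a loop that reads one byte per cycle and advances the PC. This is a minimal Python sketch under stated assumptions: ALUSrcB = 01 is taken to select the constant 1, the byte enabled by IRWrite3 is taken to be the most significant byte of the instruction, and memory is modeled as a plain list of bytes. The function name fetch is hypothetical.

```python
def fetch(memory, pc):
    """Model fetch states 0-3: one memory byte per cycle, PC + 1 each cycle."""
    ir = [0, 0, 0, 0]                 # ir[k] is the byte enabled by IRWrite[k]
    for state in range(4):
        enable = 3 - state            # state 0 -> IRWrite3, ..., state 3 -> IRWrite0
        ir[enable] = memory[pc]       # MemRead with IorD = 0: address comes from PC
        # ALUSrcA = 0 selects PC, ALUSrcB = 01 selects 1 (assumed), ALUOp = 00 adds
        pc = (pc + 1) & 0xFF          # PCWrite with PCSource = 00; 8-bit program counter
    # Assumption: IRWrite3 loads instruction bits [31:24]
    word = (ir[3] << 24) | (ir[2] << 16) | (ir[1] << 8) | ir[0]
    return word, pc


memory = [0x20, 0x03, 0x00, 0x08] + [0] * 252
word, next_pc = fetch(memory, 0)
print(hex(word), next_pc)
```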
Slide 18
Logic Design
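Start at top level
  Hierarchically decompose MIPS into units
Top-level interface
[Figure: top-level interface. A crystal oscillator feeds a 2-phase clock generator producing ph1 and ph2; the MIPS processor, with a reset input, connects to external memory through memread, memwrite, and the 8-bit adr, writedata, and memdata buses.]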
Slide 19
Block Diagram
[Figure: block diagram overlaying the controller, alucontrol, and datapath blocks on the multicycle microarchitecture. Signals: memread, memwrite, op[5:0], zero, alusrca, alusrcb[1:0], pcen, pcsource[1:0], memtoreg, regdst, iord, regwrite, irwrite[3:0], aluop[1:0], funct[5:0], alucontrol[2:0].]
Slide 20
Hierarchical Design
[Figure: design hierarchy with nodes mips, controller, alucontrol, alu, and standard cell library cells fulladder, or2, and2, mux4, mux2, tri]
Slide 21
HDLs
Hardware Description Languages
  Widely used in logic design
  Verilog and VHDL
Describe hardware using code
  Document logic functions
  Simulate logic before building
  Synthesize code into gates and layout
    Requires a library of standard cells
Slide 22
Verilog Example
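module fulladder(input a, b, c,
                 output s, cout);

  // Instance names s1, c1 and the sum port order are assumed;
  // the figure shows only the sum and carry blocks.
  sum   s1(a, b, c, s);
  carry c1(a, b, c, cout);
endmodule

module carry(input a, b, c,
             output cout);

  assign cout = (a&b) | (a&c) | (b&c);
endmodule

[Figure: fulladder block with inputs a, b, c feeding a carry block (output cout) and a sum block (output s)]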
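The carry expression can be checked exhaustively against ordinary binary addition. This is a small Python sketch, not part of the slide's Verilog; the sum function a ^ b ^ c is the standard full-adder sum and is an assumption here, since the slide shows only the carry module's body.

```python
from itertools import product


def carry(a, b, c):
    """Same expression as the Verilog assign: majority of the three inputs."""
    return (a & b) | (a & c) | (b & c)


def sum_bit(a, b, c):
    """Assumed sum logic: 1 when an odd number of inputs are 1."""
    return a ^ b ^ c


truth_table = []
for a, b, c in product((0, 1), repeat=3):
    total = a + b + c                      # arithmetic reference, 0..3
    assert carry(a, b, c) == total >> 1    # carry is the high bit
    assert sum_bit(a, b, c) == total & 1   # sum is the low bit
    truth_table.append((a, b, c, sum_bit(a, b, c), carry(a, b, c)))

for row in truth_table:
    print(*row)
```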
Slide 23
Circuit Design
How should logic be implemented?
  NANDs and NORs vs. ANDs and ORs?
  Fan-in and fan-out?
  How wide should transistors be?
These choices affect speed, area, power
Logic synthesis makes these choices for you
  Good enough for many applications
  Hand-crafted circuits are still better
Slide 24
[Figure: transistor-level schematic of the carry gate with transistors n1 - n6 and p1 - p6, internal nodes i1 - i4 and cn, inputs a, b, c, and output cout]
Gate-level Netlist
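module carry(input a, b, c,
             output cout);

  wire x, y, z;

  and g1(x, a, b);
  and g2(y, a, c);
  and g3(z, b, c);
  or  g4(cout, x, y, z);
endmodule

[Figure: gates g1, g2, g3 combine input pairs (a, b), (a, c), (b, c) into x, y, z; gate g4 combines x, y, z into cout]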
Slide 28
Transistor-Level Netlist
module carry(input a, b, c,
             output cout);

  wire i1, i2, i3, i4, cn;

  tranif1 n1(i1, 0, a);
  tranif1 n2(i1, 0, b);
  tranif1 n3(cn, i1, c);
  tranif1 n4(i2, 0, b);
  tranif1 n5(cn, i2, a);
  tranif0 p1(i3, 1, a);
  tranif0 p2(i3, 1, b);
  tranif0 p3(cn, i3, c);
  tranif0 p4(i4, 1, b);
  tranif0 p5(cn, i4, a);
  tranif1 n6(cout, 0, cn);
  tranif0 p6(cout, 1, cn);
endmodule
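The netlist's connectivity can be sanity-checked with a simple switch-level model. This is a hedged Python sketch that treats each tranif1 as closed when its gate is 1 and each tranif0 as closed when its gate is 0, and ignores timing and charge sharing. The function name is hypothetical; the series and parallel groupings are read off the node names in the netlist.

```python
from itertools import product


def carry_switch_level(a, b, c):
    """Evaluate the pull-down and pull-up networks on cn, then the output inverter."""
    # nMOS: n3 in series with (n1 parallel n2); n5 in series with n4
    pull_down = bool((c and (a or b)) or (a and b))
    # pMOS: p3 in series with (p1 parallel p2); p5 in series with p4
    pull_up = bool(((not c) and ((not a) or (not b))) or ((not a) and (not b)))
    # Exactly one network should conduct: no floating node and no VDD-to-GND path
    assert pull_up != pull_down
    cn = 1 if pull_up else 0
    cout = 1 - cn                 # inverter formed by n6 and p6
    return cn, cout


for a, b, c in product((0, 1), repeat=3):
    cn, cout = carry_switch_level(a, b, c)
    print(a, b, c, cn, cout)
```

For every input combination the output matches the majority function, so the transistor netlist implements the same logic as the gate-level version.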
[Figure: transistor-level schematic of the carry gate with transistors n1 - n6 and p1 - p6, internal nodes i1 - i4 and cn, inputs a, b, c, and output cout]
Slide 29
SPICE Netlist
.SUBCKT CARRY A B C COUT VDD GND
MN1 I1 A GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
MN2 I1 B GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
MN3 CN C I1 GND NMOS W=1U L=0.18U AD=0.5P AS=0.5P
MN4 I2 B GND GND NMOS W=1U L=0.18U AD=0.15P AS=0.5P
MN5 CN A I2 GND NMOS W=1U L=0.18U AD=0.5P AS=0.15P
MP1 I3 A VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
MP2 I3 B VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
MP3 CN C I3 VDD PMOS W=2U L=0.18U AD=1P AS=1P
MP4 I4 B VDD VDD PMOS W=2U L=0.18U AD=0.3P AS=1P
MP5 CN A I4 VDD PMOS W=2U L=0.18U AD=1P AS=0.3P
MN6 COUT CN GND GND NMOS W=2U L=0.18U AD=1P AS=1P
MP6 COUT CN VDD VDD PMOS W=4U L=0.18U AD=2P AS=2P
CI1 I1 GND 2FF
CI3 I3 GND 3FF
CA A GND 4FF
CB B GND 4FF
CC C GND 2FF
CCN CN GND 4FF
CCOUT COUT GND 2FF
.ENDS
Slide 30
Physical Design
Floorplan
Standard cells
  Place & route
Datapaths
  Slice planning
Area estimation
Slide 31
MIPS Floorplan
[Figure: MIPS floorplan with rows of 10 I/O pads along the chip edges; dimension labels 5000, 3500, 5000]
Slide 32
MIPS Layout
Slide 33
Standard Cells
Uniform cell height
Uniform well height
M1 VDD and GND rails
M2 access to I/Os
Well / substrate taps
Exploits regularity
Slide 34
Synthesized Controller
Synthesize HDL into gate-level netlist
Place & Route using standard cell library
Slide 35
Pitch Matching
Synthesized controller area is mostly wires
  Design is smaller if wires run through/over cells
  Smaller = faster, lower power as well!
Design snap-together cells for datapaths and arrays
  Plan wires into cells
  Connect by abutment
    Exploits locality
    Takes lots of effort
[Figure: array of abutting cells, four rows of A A A A B above a row of C C D]
Slide 36
MIPS Datapath
8-bit datapath built from 8 bitslices (regularity)
Zipper at top drives control signals to datapath
Slide 37
Slice Plans
Slice plan for bitslice
  Cell ordering, dimensions, wiring tracks
  Arrange cells for wiring locality
Slide 38
MIPS ALU
Arithmetic / Logic Unit is part of bitslice
Slide 39
Area Estimation
Need area estimates to make floorplan
  Compare to another block you already designed
  Or estimate from transistor counts
  Budget room for large wiring tracks
  Your mileage may vary!
Slide 40
Design Verification
Fabrication is slow & expensive
  MOSIS 0.6 μm: $1000, 3 months
  State of art: $1M, 1 month
Debugging chips is very hard
  Limited visibility into operation
Prove design is right before building!
  Logic simulation
  Ckt. simulation / formal verification
  Layout vs. schematic comparison
  Design & electrical rule checks
Verification is > 50% of effort on most chips!
[Figure: verification flow from Specification through Architecture Design, Logic Design, Circuit Design, and Physical Design, checking that Function is equal (=) between successive levels]
Slide 41
Slide 42
Testing
Test that chip operates
  Design errors
  Manufacturing errors
A single dust particle or wafer defect kills a die
  Yields from 90% to < 10%
  Depends on die size, maturity of process
Test each part before shipping to customer
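The dependence of yield on die size can be illustrated with the first-order Poisson yield model, Y = exp(-A * D), where A is die area and D is defect density. The slides do not name a yield model, so this model is swapped in purely for illustration, and the areas and defect densities below are hypothetical numbers, not data from the slides.

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Poisson model: fraction of dice that receive zero killer defects."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


# Hypothetical values chosen only to show the trend
for area in (0.1, 1.0, 4.0):          # die area in cm^2
    for density in (0.1, 1.0):        # mature vs. immature process, defects per cm^2
        y = poisson_yield(area, density)
        print(f"area={area} cm^2, D={density}/cm^2 -> yield {y:.1%}")
```

Under this model, yield falls as die area grows or as the process becomes less mature, which matches the qualitative trend the slide describes.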
Slide 43