DDCA Ch7
Chapter 7 <1>
MICROARCHITECTURE Chapter 7 :: Topics
• Introduction
• Performance Analysis
• Single-Cycle Processor
• Multicycle Processor
• Pipelined Processor
• Exceptions
• Advanced Microarchitecture
MICROARCHITECTURE Introduction
• Microarchitecture: how to implement an architecture in hardware
• Processor:
– Datapath: functional blocks
– Control: control signals
[Figure: levels of abstraction, from top to bottom: Application Software (programs), Architecture (instructions, registers), Microarchitecture (datapaths), Logic (adders, memories), Analog Circuits (amplifiers, filters), Devices (transistors, diodes), Physics (electrons)]
MICROARCHITECTURE Microarchitecture
• Multiple implementations for a single
architecture:
– Single-cycle: Each instruction executes in a
single cycle
– Multicycle: Each instruction is broken into a series
of shorter steps
– Pipelined: Each instruction is broken into a series
of steps, and multiple instructions execute at once
MICROARCHITECTURE Processor Performance
• Program execution time
Execution Time = (#instructions)(cycles/instruction)(seconds/cycle)
• Definitions:
– CPI: Cycles/instruction
– clock period: seconds/cycle
– IPC: instructions/cycle = 1/CPI
• Challenge is to satisfy constraints of:
– Cost
– Power
– Performance
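The execution-time product above can be sketched in a few lines of Python. The function names and the sample numbers are illustrative assumptions, not values from the slides.

```python
def execution_time(instructions: float, cpi: float, clock_period_s: float) -> float:
    """Execution time in seconds = (#instructions)(cycles/instruction)(seconds/cycle)."""
    return instructions * cpi * clock_period_s


def ipc_from_cpi(cpi: float) -> float:
    """IPC (instructions/cycle) is the reciprocal of CPI (cycles/instruction)."""
    return 1.0 / cpi


# Hypothetical workload: 1 million instructions, CPI of 2, 1 ns clock period.
example_time = execution_time(1e6, 2.0, 1e-9)
example_ipc = ipc_from_cpi(2.0)
print(example_time, example_ipc)
```

Halving any one of the three factors halves the execution time, which is why the microarchitectures that follow trade CPI against clock period.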
MICROARCHITECTURE MIPS Processor
• Consider subset of MIPS instructions:
– R-type instructions: and, or, add, sub, slt
– Memory instructions: lw, sw
– Branch instructions: beq
MICROARCHITECTURE Architectural State
• Determines everything about a processor:
– PC
– 32 registers
– Memory
MICROARCHITECTURE MIPS State Elements
MICROARCHITECTURE Single-Cycle MIPS Processor
• Datapath
• Control
MICROARCHITECTURE Single-Cycle Datapath: lw fetch
STEP 1: Fetch instruction
[Figure: state elements of the single-cycle datapath (PC, instruction memory, register file, data memory); the PC drives the instruction memory address A, and the read data RD is Instr]
MICROARCHITECTURE Single-Cycle Datapath: lw Register Read
STEP 2: Read source operands from the register file
[Figure: Instr[25:21] drives register file address A1; the base register value appears on RD1]
MICROARCHITECTURE Single-Cycle Datapath: lw Immediate
STEP 3: Sign-extend the immediate
[Figure: Instr[15:0] passes through Sign Extend to produce SignImm]
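The bit ranges on the datapath wires (31:26, 25:21, 20:16, 15:11, 15:0, 5:0) correspond to instruction fields. A minimal Python sketch of field extraction and sign extension follows; the helper names are hypothetical, and the register numbers in the example ($t1 = 9, $s0 = 16) follow the standard MIPS convention rather than anything stated on the slides.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit value to a 32-bit pattern (returned as an unsigned int)."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction into the fields used by the datapath."""
    return {
        "op": bits(instr, 31, 26),
        "rs": bits(instr, 25, 21),
        "rt": bits(instr, 20, 16),
        "rd": bits(instr, 15, 11),
        "imm": bits(instr, 15, 0),
        "funct": bits(instr, 5, 0),
    }


# Example: lw $s0, -4($t1) -> op = 100011, rs = 9, rt = 16, imm = 0xFFFC
lw_instr = (0b100011 << 26) | (9 << 21) | (16 << 16) | 0xFFFC
fields = decode(lw_instr)
sign_imm = sign_extend16(fields["imm"])
```

For lw, rs selects the base register read on RD1 and the sign-extended immediate becomes the offset added to it.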
MICROARCHITECTURE Single-Cycle Datapath: lw address
STEP 4: Compute the memory address
ALUControl2:0 = 010 (add)
[Figure: the ALU adds SrcA (RD1) and SrcB (SignImm); ALUResult is the data memory address]
MICROARCHITECTURE Single-Cycle Datapath: lw Memory Read
STEP 5: Read data from memory and write it back to the register file
[Figure: ALUResult addresses the data memory; ReadData returns to register file port WD3, and Instr[20:16] drives write address A3]
MICROARCHITECTURE Single-Cycle Datapath: lw PC Increment
STEP 6: Determine the address of the next instruction
[Figure: an adder computes PCPlus4 = PC + 4, which feeds PC']
MICROARCHITECTURE Single-Cycle Datapath: sw
Write data in rt to memory
Control signals: RegWrite = 0, ALUControl2:0 = 010, MemWrite = 1
[Figure: Instr[20:16] also drives register file address A2; RD2 becomes WriteData on data memory port WD]
MICROARCHITECTURE Single-Cycle Datapath: R-Type
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
Control signals: RegWrite = 1, RegDst = 1, ALUSrc = 0, ALUControl2:0 = varies, MemWrite = 0, MemtoReg = 0
[Figure: datapath with three added multiplexers: ALUSrc selects RD2 or SignImm as SrcB, RegDst selects Instr[20:16] or Instr[15:11] as WriteReg4:0, and MemtoReg selects ALUResult or ReadData as Result]
MICROARCHITECTURE Single-Cycle Datapath: beq
• Determine whether values in rs and rt are equal
• Calculate branch target address:
BTA = (sign-extended immediate << 2) + (PC+4)
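As a rough illustration of the branch target formula, the Python sketch below computes BTA and the next PC. The function names are hypothetical, and the rule PCSrc = Branch AND Zero is the usual textbook arrangement, assumed here because the extracted slide text does not spell it out.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a 32-bit pattern."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def branch_target(pc: int, imm16: int) -> int:
    """BTA = (sign-extended immediate << 2) + (PC + 4), wrapped to 32 bits."""
    pc_plus4 = (pc + 4) & MASK32
    offset = (sign_extend16(imm16) << 2) & MASK32
    return (pc_plus4 + offset) & MASK32


def next_pc(pc: int, imm16: int, branch: bool, zero: bool) -> int:
    """Select PCBranch when the branch is taken, else PCPlus4 (assumes PCSrc = Branch AND Zero)."""
    pc_src = branch and zero
    return branch_target(pc, imm16) if pc_src else (pc + 4) & MASK32
```

A negative immediate branches backward: with PC = 0x00400010 and an immediate of -1 (0xFFFF), the target is PC + 4 - 4 = 0x00400010.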
[Figure: datapath extended for beq: SignImm << 2 is added to PCPlus4 to form PCBranch, and a multiplexer controlled by PCSrc selects PCPlus4 or PCBranch as PC']
MICROARCHITECTURE Single-Cycle Processor
[Figure: complete single-cycle MIPS processor. The control unit takes Op (Instr[31:26]) and Funct (Instr[5:0]) and drives MemtoReg, MemWrite, Branch, ALUControl2:0, ALUSrc, RegDst, and RegWrite; PCSrc selects between PCPlus4 and PCBranch]
MICROARCHITECTURE Single-Cycle Control
[Figure: control unit. The main decoder takes Opcode5:0 and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite, and ALUOp1:0. The ALU decoder takes ALUOp1:0 and Funct5:0 and produces ALUControl2:0.]
MICROARCHITECTURE Review: ALU
F2:0  Function
000   A & B
001   A | B
010   A + B
011   not used
100   A & ~B
101   A | ~B
110   A - B
111   SLT
[Figure: ALU symbol with N-bit inputs A and B, 3-bit control F, and N-bit output Y]
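The function table above can be modeled behaviorally. The following Python sketch is an assumption-laden illustration, not the slide's hardware: the name `alu`, the default width of 32 bits, and the choice to compute SLT as the sign bit of A - B (ignoring overflow) are all illustrative choices.

```python
def alu(a: int, b: int, f: int, n: int = 32):
    """Behavioral model of the ALU function table. Returns (Y, Zero)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    not_b = ~b & mask
    if f == 0b000:
        y = a & b
    elif f == 0b001:
        y = a | b
    elif f == 0b010:
        y = a + b
    elif f == 0b100:
        y = a & not_b
    elif f == 0b101:
        y = a | not_b
    elif f == 0b110:
        y = a - b
    elif f == 0b111:
        # SLT: sign bit of (A - B), zero-extended; overflow is ignored in this sketch
        y = ((a - b) & mask) >> (n - 1)
    else:
        raise ValueError("F = 011 is not used")
    y &= mask
    return y, y == 0
```

The Zero output is what beq relies on: subtracting equal operands with F = 110 yields Y = 0 and Zero = True.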
MICROARCHITECTURE Review: ALU
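[Figure: N-bit ALU implementation. F2 selects B or ~B through a 2:1 multiplexer and feeds the adder; F1:0 controls a 4:1 multiplexer that produces Y from its four inputs, one of which is the zero-extended sign bit S[N-1] of the adder output (used for SLT)]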
MICROARCHITECTURE Control Unit: ALU Decoder
ALUOp1:0  Meaning
00        Add
01        Subtract
10        Look at Funct
11        Not Used
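A minimal Python sketch of the ALU decoder follows. The ALUOp meanings come from the table above; the Funct encodings (add 100000, sub 100010, and 100100, or 100101, slt 101010) are the standard MIPS values and are an assumption here, since the extracted slides do not list them. The function name is hypothetical.

```python
# Standard MIPS funct codes (assumed) mapped to ALUControl2:0
FUNCT_TO_ALUCONTROL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_decoder(alu_op: int, funct: int) -> int:
    """Map ALUOp1:0 and Funct5:0 to ALUControl2:0."""
    if alu_op == 0b00:
        return 0b010  # add (lw, sw address calculation)
    if alu_op == 0b01:
        return 0b110  # subtract (beq comparison)
    if alu_op == 0b10:
        return FUNCT_TO_ALUCONTROL[funct]  # R-type: look at Funct
    raise ValueError("ALUOp = 11 is not used")
```

Splitting control this way keeps the main decoder small: it only needs to say "add", "subtract", or "look at Funct".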
MICROARCHITECTURE Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
MICROARCHITECTURE Single-Cycle Datapath: or
[Figure: single-cycle processor executing or, with RegWrite = 1, RegDst = 1, ALUSrc = 0, Branch = 0, MemWrite = 0, MemtoReg = 0, and ALUControl2:0 = 001]
MICROARCHITECTURE Extended Functionality: addi
No change to the datapath: addi adds rs and the sign-extended immediate and writes the result to rt, which the existing datapath already supports.
MICROARCHITECTURE Control Unit: addi
Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00
MICROARCHITECTURE Extended Functionality: j
[Figure: single-cycle processor extended for j: a Jump control signal selects PCJump as PC'. PCJump is formed from PCPlus4[31:28] and Instr[25:0] << 2 (bits 27:0)]
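The jump target is assembled from the bit ranges labeled on the figure (31:28, 25:0, << 2). A small Python sketch, with a hypothetical function name, shows the concatenation:

```python
def jump_target(pc: int, instr: int) -> int:
    """PCJump = {PCPlus4[31:28], Instr[25:0], 00}."""
    pc_plus4 = (pc + 4) & 0xFFFFFFFF
    addr26 = instr & 0x03FFFFFF           # Instr[25:0]
    return (pc_plus4 & 0xF0000000) | (addr26 << 2)


# Example: j with a 26-bit address field of 0x0100000, fetched from PC = 0x00400000
j_instr = (0b000010 << 26) | 0x0100000
target = jump_target(0x00400000, j_instr)
```

Because only 26 address bits fit in the instruction, the upper four bits come from PC + 4, so a jump stays within the current 256 MB region.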
MICROARCHITECTURE Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
j            000010  0         X       X       X       0         X         XX        1
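The main decoder is a pure lookup from opcode to control signals, so it can be sketched as a table in Python. The type and variable names below are hypothetical, don't-cares (X) are represented as None, and the addi row (with Jump = 0 assumed) is carried over from the earlier addi table.

```python
from typing import NamedTuple, Optional


class Controls(NamedTuple):
    reg_write: int
    reg_dst: Optional[int]
    alu_src: Optional[int]
    branch: Optional[int]
    mem_write: int
    mem_to_reg: Optional[int]
    alu_op: Optional[int]
    jump: int


X = None  # don't-care

MAIN_DECODER = {
    0b000000: Controls(1, 1, 0, 0, 0, 0, 0b10, 0),  # R-type
    0b100011: Controls(1, 0, 1, 0, 0, 1, 0b00, 0),  # lw
    0b101011: Controls(0, X, 1, 0, 1, X, 0b00, 0),  # sw
    0b000100: Controls(0, X, 0, 1, 0, X, 0b01, 0),  # beq
    0b001000: Controls(1, 0, 1, 0, 0, 0, 0b00, 0),  # addi (Jump = 0 assumed)
    0b000010: Controls(0, X, X, X, 0, X, X, 1),     # j
}


def main_decoder(opcode: int) -> Controls:
    """Look up the control signals for a 6-bit opcode."""
    return MAIN_DECODER[opcode]
```

Note that RegWrite and MemWrite are never don't-cares: they change architectural state, so they must be 0 for any instruction that does not write.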
MICROARCHITECTURE Review: Processor Performance
Program Execution Time
= (# instructions)(cycles/instruction)(seconds/cycle)
= # instructions × CPI × Tc
MICROARCHITECTURE Single-Cycle Performance
[Figure: single-cycle processor executing lw, the instruction with the longest path: RegWrite = 1, RegDst = 0, ALUSrc = 1, ALUControl2:0 = 010, Branch = 0, MemWrite = 0, MemtoReg = 1]
MICROARCHITECTURE Single-Cycle Performance Example
Element               Parameter  Delay (ps)
Register clock-to-Q   tpcq_PC    30
Register setup        tsetup     20
Multiplexer           tmux       25
ALU                   tALU       200
Memory read           tmem       250
Register file read    tRFread    150
Register file setup   tRFsetup   20
Tc = ?
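One way to work the question is sketched below in Python. The critical-path expression is an assumption, since the slide that states it is missing from this text: it follows lw from the PC through instruction memory, register file read, ALU, data memory, the result multiplexer, and register file setup, and it assumes the register file read dominates the sign-extend path.

```python
# Delays in picoseconds, taken from the table above
t_pcq_pc = 30
t_mux = 25
t_alu = 200
t_mem = 250
t_rf_read = 150
t_rf_setup = 20

# Assumed lw critical path:
# PC clock-to-Q -> instruction memory -> register file read -> ALU
# -> data memory -> result mux -> register file setup
tc_single_cycle_ps = t_pcq_pc + 2 * t_mem + t_rf_read + t_mux + t_alu + t_rf_setup
print(tc_single_cycle_ps, "ps")
```

Under these assumptions the two memory accesses account for more than half of the cycle time.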
MICROARCHITECTURE Multicycle MIPS Processor
• Single-cycle:
+ simple
- cycle time limited by longest instruction (lw)
- 2 adders/ALUs & 2 memories
• Multicycle:
+ higher clock speed
+ simpler instructions run faster
+ reuse expensive hardware on multiple cycles
- sequencing overhead paid many times
• Same design steps: datapath & control
MICROARCHITECTURE Multicycle State Elements
• Replace Instruction and Data memories with
a single unified memory – more realistic
[Figure: multicycle state elements: PC (with enable EN), unified Instr / Data memory (A, RD, WD, WE), and register file (A1, A2, A3, WD3, WE3, RD1, RD2)]
MICROARCHITECTURE Multicycle Datapath: Instruction Fetch
STEP 1: Fetch instruction
[Figure: the PC addresses the unified memory; the word read on RD is stored in the instruction register (Instr), which is enabled by IRWrite]
MICROARCHITECTURE Multicycle Datapath: lw Register Read
STEP 2a: Read source operands from RF
MICROARCHITECTURE Multicycle Datapath: lw Immediate
STEP 2b: Sign-extend the immediate
[Figure: Instr[15:0] passes through Sign Extend to produce SignImm]
MICROARCHITECTURE Multicycle Datapath: lw Address
STEP 3: Compute the memory address
[Figure: the ALU adds the base register value and SignImm (SrcB); ALUResult is stored in the ALUOut register]
MICROARCHITECTURE Multicycle Datapath: lw Memory Read
STEP 4: Read data from memory
[Figure: a multiplexer selects ALUOut (input 1) as the memory address; the word read is stored in the Data register]
MICROARCHITECTURE Multicycle Datapath: lw Write Register
STEP 5: Write data back to the register file
[Figure: the Data register drives WD3, and Instr[20:16] drives write address A3]
MICROARCHITECTURE Multicycle Datapath: Increment PC
STEP 6: Increment PC
[Figure: the PC gains an enable; the SrcB multiplexer (inputs 00, 01, 10, 11) supplies the constant 4 on input 01 so that the ALU computes PC + 4]
MICROARCHITECTURE Multicycle Datapath: sw
Write data in rt to memory
[Figure: Instr[20:16] drives register file address A2; RD2 is written to the unified memory through WD at the address computed by the ALU]
MICROARCHITECTURE Multicycle Datapath: R-Type
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
Control signals: PCWrite, IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB1:0, ALUControl2:0
[Figure: multicycle datapath for R-type instructions: RegDst selects Instr[20:16] (0) or Instr[15:11] (1) as the write address A3, and MemtoReg selects ALUOut (0) or Data (1) as WD3]
MICROARCHITECTURE Multicycle Datapath: beq
• rs == rt?
• BTA = (sign-extended immediate << 2) + (PC+4)
Control signals: IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB1:0, ALUControl2:0, Branch, PCWrite, PCSrc (PCEn enables the PC)
[Figure: multicycle datapath for beq: SignImm << 2 feeds SrcB input 11, and the PCSrc multiplexer selects ALUResult (0) or ALUOut (1) as PC']
MICROARCHITECTURE Multicycle Processor
[Figure: complete multicycle MIPS processor. The control unit takes Op (Instr[31:26]) and Funct (Instr[5:0]) and drives PCWrite, Branch, IorD, MemWrite, IRWrite, PCSrc, ALUControl2:0, ALUSrcB1:0, ALUSrcA, RegWrite, MemtoReg, and RegDst]
MICROARCHITECTURE Multicycle Control
[Figure: multicycle control unit. The main controller (FSM) takes Opcode5:0 and produces the multiplexer selects (MemtoReg, RegDst, IorD, PCSrc, ALUSrcB1:0, ALUSrcA), the register enables (IRWrite, MemWrite, PCWrite, Branch, RegWrite), and ALUOp1:0. The ALU decoder takes ALUOp1:0 and Funct5:0 and produces ALUControl2:0.]
MICROARCHITECTURE Main Controller FSM: Fetch
S0: Fetch (entered on Reset)
IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
[Figure: multicycle processor during fetch, with PCWrite = 1, Branch = 0, ALUSrcB1:0 = 01, and ALUControl2:0 = 010]
MICROARCHITECTURE Main Controller FSM: Decode
S0: Fetch (entered on Reset)
IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
S1: Decode (follows S0)
[Figure: multicycle processor during decode, with PCWrite = 0 and Branch = 0]
MICROARCHITECTURE Main Controller FSM: Address
S0: Fetch and S1: Decode as before
S1 to S2 when Op = LW or Op = SW
S2: MemAdr
ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
[Figure: multicycle processor during address calculation, with ALUSrcA = 1, ALUSrcB1:0 = 10, ALUControl2:0 = 010, PCWrite = 0, and Branch = 0]
MICROARCHITECTURE Main Controller FSM: lw
S0: Fetch, S1: Decode, and S2: MemAdr as before
S2 to S3 when Op = LW
S3: MemRead
IorD = 1
S4: Mem Writeback (follows S3)
RegDst = 0, MemtoReg = 1, RegWrite
MICROARCHITECTURE Main Controller FSM: sw
S0 through S4 as before
S2 to S5 when Op = SW
S5: MemWrite
IorD = 1, MemWrite
MICROARCHITECTURE Main Controller FSM: R-Type
S0 through S5 as before
S1 to S6 when Op = R-type
S6: Execute
ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
S7: ALU Writeback (follows S6)
RegDst = 1, MemtoReg = 0, RegWrite
MICROARCHITECTURE Main Controller FSM: beq
S0 and S2 through S7 as before
S1: Decode now also computes the branch target
ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
S1 to S8 when Op = BEQ
S8: Branch
ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
MICROARCHITECTURE Multicycle Controller FSM
S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
S1 to S2 when Op = LW or Op = SW; S1 to S6 when Op = R-type; S1 to S8 when Op = BEQ
S2: MemAdr: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
S2 to S3 when Op = LW; S2 to S5 when Op = SW
S3: MemRead: IorD = 1
S4: Mem Writeback: RegDst = 0, MemtoReg = 1, RegWrite
S5: MemWrite: IorD = 1, MemWrite
S6: Execute: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
S7: ALU Writeback: RegDst = 1, MemtoReg = 0, RegWrite
S8: Branch: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
MICROARCHITECTURE Main Controller FSM: addi
S0 through S8 as before
S1 to S9 when Op = ADDI
S9: ADDI Execute
ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
S10: ADDI Writeback (follows S9)
RegDst = 0, MemtoReg = 0, RegWrite
MICROARCHITECTURE Extended Functionality: j
Control signals: IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB1:0, ALUControl2:0, Branch, PCWrite, PCSrc1:0 (PCEn enables the PC)
[Figure: multicycle datapath extended for j: PCSrc widens to two bits, and multiplexer input 10 selects PCJump, formed from Instr[25:0] << 2 (bits 27:0) and the upper PC bits]
MICROARCHITECTURE Main Controller FSM: j
S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite; next S1
S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00; next S2 if Op = LW or SW, S6 if Op = R-type, S8 if Op = BEQ, S9 if Op = ADDI, S11 if Op = J
S2: MemAdr: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00; next S3 if Op = LW, S5 if Op = SW
S3: MemRead: IorD = 1; next S4
S4: Mem Writeback: RegDst = 0, MemtoReg = 1, RegWrite; next S0
S5: MemWrite: IorD = 1, MemWrite; next S0
S6: Execute: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALU Writeback: RegDst = 1, MemtoReg = 0, RegWrite; next S0
S8: Branch: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch; next S0
S9: ADDI Execute: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00; next S10
S10: ADDI Writeback: RegDst = 0, MemtoReg = 0, RegWrite; next S0
S11: Jump: PCSrc = 10, PCWrite; next S0
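The controller's state transitions can be sketched as a small Python table, which also makes it easy to count cycles per instruction. The state names follow the FSM slides; the dictionary and function names are hypothetical, and the returns to S0 at the end of each instruction are an assumption, since the arrows of the diagram do not survive in this text.

```python
# Transitions out of S1 (Decode) and S2 (MemAdr) depend on the opcode
DECODE_NEXT = {"lw": "S2", "sw": "S2", "R-type": "S6", "beq": "S8", "addi": "S9", "j": "S11"}
MEMADR_NEXT = {"lw": "S3", "sw": "S5"}

# All other states have a single successor (returns to S0 are assumed)
FIXED_NEXT = {
    "S0": "S1", "S3": "S4", "S4": "S0", "S5": "S0", "S6": "S7",
    "S7": "S0", "S8": "S0", "S9": "S10", "S10": "S0", "S11": "S0",
}


def next_state(state: str, op: str) -> str:
    """Return the state that follows `state` for instruction type `op`."""
    if state == "S1":
        return DECODE_NEXT[op]
    if state == "S2":
        return MEMADR_NEXT[op]
    return FIXED_NEXT[state]


def cycles(op: str) -> int:
    """Count the states visited from S0 (Fetch) until the FSM returns to S0."""
    state, count = "S0", 0
    while True:
        count += 1
        state = next_state(state, op)
        if state == "S0":
            return count


cycle_counts = {op: cycles(op) for op in DECODE_NEXT}
```

Each state occupies one clock cycle, so the number of states an instruction passes through is its cycle count.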
MICROARCHITECTURE Multicycle Processor Performance
• Instructions take different number of cycles:
– 3 cycles: beq, j
– 4 cycles: R-Type, sw, addi
– 5 cycles: lw
• CPI is weighted average
• SPECINT2000 benchmark:
– 25% loads
– 10% stores
– 11% branches
– 2% jumps
– 52% R-type
Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
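The weighted average can be checked with a short Python sketch. The instruction mix and cycle counts are those listed above; the variable names are illustrative.

```python
# Instruction mix (fraction of dynamic instructions) and cycles per instruction class
MIX = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "R-type": 0.52}
CYCLES = {"lw": 5, "sw": 4, "beq": 3, "j": 3, "R-type": 4}

# Average CPI is the sum of (fraction x cycles) over all instruction classes
average_cpi = sum(MIX[kind] * CYCLES[kind] for kind in MIX)
print(round(average_cpi, 2))
```

The fractions must sum to 1 for the result to be a true average, which they do here (25 + 10 + 11 + 2 + 52 = 100%).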
MICROARCHITECTURE Multicycle Processor Performance
Multicycle critical path:
Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup
MICROARCHITECTURE Multicycle Performance Example
Element               Parameter  Delay (ps)
Register clock-to-Q   tpcq_PC    30
Register setup        tsetup     20
Multiplexer           tmux       25
ALU                   tALU       200
Memory read           tmem       250
Register file read    tRFread    150
Register file setup   tRFsetup   20
Tc = ?
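Plugging the table values into the multicycle critical-path expression, Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup, gives the cycle time. The Python sketch below does the arithmetic; the variable names are illustrative.

```python
# Delays in picoseconds, taken from the table above
t_pcq = 30
t_setup = 20
t_mux = 25
t_alu = 200
t_mem = 250

# Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup
tc_multicycle_ps = t_pcq + t_mux + max(t_alu + t_mux, t_mem) + t_setup
print(tc_multicycle_ps, "ps")
```

With these delays the memory read (250 ps) is slower than the ALU plus multiplexer (225 ps), so memory sets the cycle time.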
MICROARCHITECTURE Multicycle Performance Example
Program with 100 billion instructions
Execution Time = (# instructions) × CPI × Tc
= (100 × 10^9)(4.12)(325 × 10^-12)
= 133.9 seconds
MICROARCHITECTURE Review: Single-Cycle Processor
[Figure: complete single-cycle MIPS processor with jump support: the control unit drives Jump, MemtoReg, MemWrite, Branch, ALUControl2:0, ALUSrc, RegDst, and RegWrite; the next PC is selected among PCPlus4, PCBranch, and PCJump]
MICROARCHITECTURE Review: Multicycle Processor
[Figure: complete multicycle MIPS processor with jump support: the control unit drives PCWrite, Branch, IorD, MemWrite, IRWrite, PCSrc, ALUControl2:0, ALUSrcB1:0, ALUSrcA, RegWrite, MemtoReg, and RegDst; PCSrc input 10 selects PCJump]