Lec 2

The document outlines the structure and functioning of a processor's main control unit, detailing control signals for different instruction types such as R-type, load/store, and branch instructions. It discusses the implementation of pipelining to improve performance, highlighting the stages of instruction execution and the impact of hazards on the pipeline. Additionally, it addresses the design considerations for MIPS ISA to facilitate pipelining and strategies to mitigate data hazards.

The Main Control Unit

■ Control signals derived from instruction

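Format       Fields (bit positions)
R-type       0 (31:26) | rs (25:21) | rt (20:16) | rd (15:11) | shamt (10:6) | funct (5:0)
Load/Store   35 or 43 (31:26) | rs (25:21) | rt (20:16) | address (15:0)
Branch       4 (31:26) | rs (25:21) | rt (20:16) | address (15:0)

■ Bits 31:26: opcode
■ rs (25:21): always read
■ rt (20:16): read, except for load
■ rd (15:11) for R-type, rt (20:16) for load: write
■ address (15:0): sign-extend and add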
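The field layout above can be sketched as a small decoder. This is only an illustration: the function names `decode_fields` and `sign_extend16` are hypothetical, and the sample word is a hand-assembled `lw $t0, -4($sp)` (opcode 35, rs = 29, rt = 8).

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields shown above.

    All fields are extracted regardless of format; which ones are
    meaningful depends on the opcode (hypothetical helper, not from the slides).
    """
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31:26
        "rs": (word >> 21) & 0x1F,       # bits 25:21
        "rt": (word >> 16) & 0x1F,       # bits 20:16
        "rd": (word >> 11) & 0x1F,       # bits 15:11 (R-type only)
        "shamt": (word >> 6) & 0x1F,     # bits 10:6  (R-type only)
        "funct": word & 0x3F,            # bits 5:0   (R-type only)
        "address": word & 0xFFFF,        # bits 15:0  (load/store/branch)
    }


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement integer."""
    return value - 0x10000 if value & 0x8000 else value


fields = decode_fields(0x8FA8FFFC)       # lw $t0, -4($sp)
offset = sign_extend16(fields["address"])
```

Here `fields["opcode"]` is 35 (load) and `offset` is -4, the value that would be added to the contents of register rs.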

Chapter 4 — The Processor — 1


Datapath With Control



Datapath With Control

R-type: 0 (31:26) | rs (25:21) | rt (20:16) | rd (15:11) | shamt (10:6) | funct (5:0)

Datapath With Control

load/store: 35 or 43 (31:26) | rs (25:21) | rt (20:16) | address (15:0)

Datapath With Control

branch: 4 (31:26) | rs (25:21) | rt (20:16) | address (15:0)

Control Line Settings
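• 8 control lines (control read/write and multiplexors)
• [Table: control line settings per instruction; ALUOp column: 10, 00, 00, 01]
• Best practice: do not use don't-care (X) values for signals that control memory
• Note: use don't-care (X) only for multiplexor select signals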
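A minimal sketch of such a control table follows. The signal names and values are the ones used in the standard textbook single-cycle MIPS design, which is an assumption here because the slide's table did not survive in the text; only the ALUOp values (10, 00, 00, 01) appear in the source. `None` stands for a don't-care (X).

```python
# Opcode -> settings for the 8 control lines.
# Values follow the standard textbook single-cycle design (an assumption).
# None models a don't-care (X) and appears only on multiplexor selects.
CONTROL = {
    0: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,        # R-format
            MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    35: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,       # lw
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,  # sw
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    4: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,   # beq
            MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

MUX_SELECTS = {"RegDst", "ALUSrc", "MemtoReg"}


def dont_cares_only_on_muxes(table: dict) -> bool:
    """Check the rule from the notes: X may appear only on mux selects."""
    return all(
        value is not None or name in MUX_SELECTS
        for row in table.values()
        for name, value in row.items()
    )


rule_holds = dont_cares_only_on_muxes(CONTROL)
```

Signals that enable a state change (RegWrite, MemRead, MemWrite, Branch) always carry a definite 0 or 1, so an unintended memory or register write cannot slip through a don't-care.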

R-Type Instruction

Load Instruction

Branch-on-Equal Instruction

Implementing Jumps
Jump: 2 (31:26) | address (25:0)

■ Jump uses word address
■ Update PC with concatenation of
  ■ Top 4 bits of PC+4
  ■ 26-bit jump address
  ■ 00
■ Need an extra control signal decoded from opcode
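The concatenation described above can be sketched in a few lines. The function name `jump_target` and the example addresses are illustrative assumptions.

```python
def jump_target(pc: int, instruction: int) -> int:
    """Build the jump target: top 4 bits of PC+4, 26-bit address, then 00."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    addr26 = instruction & 0x03FFFFFF          # bits 25:0 of the instruction
    # Shifting left by 2 appends the two zero bits (word address -> byte address).
    return (pc_plus_4 & 0xF0000000) | (addr26 << 2)


# A jump at 0x00400000 whose address field is 0x0100008
j_word = (2 << 26) | 0x0100008
target = jump_target(0x00400000, j_word)
```

With these example values `target` is 0x00400020. Because only the low 28 bits come from the instruction, a jump can reach targets only within the same 256 MB region as PC+4.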
Datapath With Jumps Added

Shortcomings of a Single Cycle Implementation
• Limits reuse of hardware components
– each functional unit can be used only once per cycle
– e.g. separate instruction and data memories are required
• Inefficient
– clock cycle is determined by the longest possible path in the machine
– e.g. assume time for:
• Memory units = 200 ps
• ALU and adders = 100 ps
• Register file (read or write) = 50 ps

Instruction   Instruction   Register   ALU         Data     Register   Total
class         memory        read       operation   memory   write
R-type        200           50         100         0        50         400 ps
Load word     200           50         100         200      50         600 ps
Store word    200           50         100         200      -          550 ps
Branch        200           50         100         0        -          350 ps
Jump          200           -          -           -        -          200 ps
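As a quick check of the table, the totals can be recomputed from the three unit delays. This is a sketch; the dictionary layout and names are assumptions, while the delays and the units each class uses come from the table.

```python
MEM, ALU, REG = 200, 100, 50   # unit delays in ps, from the assumptions above

# Functional units on the path of each instruction class, in order of use
PATHS = {
    "R-type": [MEM, REG, ALU, REG],
    "Load word": [MEM, REG, ALU, MEM, REG],
    "Store word": [MEM, REG, ALU, MEM],
    "Branch": [MEM, REG, ALU],
    "Jump": [MEM],
}

totals = {name: sum(units) for name, units in PATHS.items()}

# In a single-cycle design every instruction takes one clock cycle,
# so the cycle time must cover the slowest class.
clock_cycle = max(totals.values())
```

The cycle time comes out at 600 ps (set by load word), so a jump that needs only 200 ps still occupies a full 600 ps cycle.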

Pipelining Analogy
■ Pipelined laundry: overlapping execution
■ Parallelism improves performance

■ Four loads:
  ■ Speedup = 16/7 ≈ 2.3
■ Non-stop loads (number of loads = n, n → ∞):
  ■ Speedup = 4n/(4 + n - 1)
  ■ → 4 as n → ∞
  ■ = number of stages
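The speedup formula can be evaluated directly. The sketch below assumes, as the formula implies, four stages that each take one time unit; `laundry_speedup` is a hypothetical helper name.

```python
from fractions import Fraction


def laundry_speedup(n: int, stages: int = 4) -> Fraction:
    """Speedup of pipelined over sequential laundry for n loads."""
    sequential = stages * n          # each load runs all stages alone
    pipelined = stages + n - 1       # fill the pipeline, then one load per step
    return Fraction(sequential, pipelined)


four_loads = laundry_speedup(4)              # 16/7
many_loads = float(laundry_speedup(10**6))   # just under 4
```

For four loads the result is exactly 16/7; as n grows the fill time (stages - 1) becomes negligible and the speedup approaches the number of stages.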



MIPS Pipeline
■ Five stages, one step per stage
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register



Pipeline Performance
■ Assume time for stages is
  ■ 100 ps for register read or write
  ■ 200 ps for other stages
■ Compare pipelined datapath with single-cycle datapath

Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
lw         200 ps        100 ps          200 ps   200 ps          100 ps           800 ps
sw         200 ps        100 ps          200 ps   200 ps          -                700 ps
R-format   200 ps        100 ps          200 ps   -               100 ps           600 ps
beq        200 ps        100 ps          200 ps   -               -                500 ps



Pipeline Performance
Single-cycle (Tc = 800 ps)

Pipelined (Tc = 200 ps)
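A rough sketch of what the two clock periods mean for total execution time, assuming a five-stage pipeline with no stalls. The function names are hypothetical; the cycle times are the ones given above.

```python
def single_cycle_time(n: int, tc: int = 800) -> int:
    """Total time in ps for n instructions, one full cycle each."""
    return n * tc


def pipelined_time(n: int, tc: int = 200, stages: int = 5) -> int:
    """Total time in ps: fill the pipeline, then finish one instruction per cycle."""
    return (stages + n - 1) * tc


three_single = single_cycle_time(3)      # 2400 ps
three_pipelined = pipelined_time(3)      # 1400 ps
long_run_speedup = single_cycle_time(10**6) / pipelined_time(10**6)
```

For long instruction streams the speedup approaches 800/200 = 4 rather than 5, because the stages are not balanced: every stage is stretched to the 200 ps clock even though register access needs only 100 ps.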



Pipeline Speedup
■ If all stages are balanced
  ■ i.e., all take the same time
  ■ Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of stages
■ If not balanced, speedup is less
■ Speedup due to increased throughput
  ■ Latency (time for each instruction) does not decrease



Pipelining and ISA Design
■ MIPS ISA designed for pipelining
■ All instructions are 32 bits
  ■ Easier to fetch and decode in one cycle
  ■ cf. x86: 1- to 17-byte instructions
■ Few and regular instruction formats
  ■ Can decode and read registers in one step
■ Load/store addressing
  ■ Can calculate address in 3rd stage, access memory in 4th stage
■ Alignment of memory operands
  ■ Memory access takes only one cycle



Hazards
■ Situations that prevent starting the next instruction in the next cycle
■ Structure hazard
  ■ A required resource is busy
■ Data hazard
  ■ Need to wait for previous instruction to complete its data read/write
■ Control hazard
  ■ Deciding on control action depends on previous instruction



Structure Hazards
■ Conflict for use of a resource
■ In MIPS pipeline with a single memory
  ■ Load/store requires data access
  ■ Instruction fetch would have to stall for that cycle
  ■ Would cause a pipeline “bubble”
■ Hence, pipelined datapaths require separate instruction/data memories
  ■ Or separate instruction/data caches



Data Hazards
■ An instruction depends on completion of data access by a previous instruction
■ add $s0, $t0, $t1
  sub $t2, $s0, $t3
■ RAW (read-after-write) hazard on $s0: sub reads $s0 before add has written it
■ WAW and WAR hazards do not occur in this in-order MIPS pipeline
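A minimal sketch of how RAW dependences like the one above could be found in an instruction list. The tuple representation `(op, dest, sources)`, the function name `raw_hazards`, and the default `window` of two instructions are all assumptions made for illustration, not part of the slides.

```python
def raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for each source
    register written by one of the `window` preceding instructions."""
    hazards = []
    for i, (_, _, sources) in enumerate(program):
        for reg in sources:
            # Search backward so the nearest writer of reg is reported.
            for j in range(i - 1, max(i - window, 0) - 1, -1):
                if program[j][1] == reg:
                    hazards.append((j, i, reg))
                    break
    return hazards


program = [
    ("add", "$s0", ("$t0", "$t1")),
    ("sub", "$t2", ("$s0", "$t3")),
]
found = raw_hazards(program)
```

For the add/sub pair this reports one hazard: instruction 1 reads `$s0`, which instruction 0 writes.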



Forwarding (aka Bypassing)
■ Use result when it is computed
■ Don’t wait for it to be stored in a register
■ Requires extra connections in the datapath



Load-Use Data Hazard
■ Can’t always avoid stalls by forwarding
■ If value not computed when needed
■ Can’t forward backward in time!
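Why forwarding removes the stall for an ALU result but not for a load can be sketched with cycle offsets. The model below is an assumption: it takes the five-stage pipeline described earlier, counts cycles from the producer's fetch (IF = 0, ID = 1, EX = 2, MEM = 3), and assumes full forwarding into the consumer's EX stage. `forwarding_stalls` is a hypothetical helper.

```python
def forwarding_stalls(producer_is_load: bool, distance: int = 1) -> int:
    """Stall cycles needed when a consumer `distance` instructions later
    uses the producer's result in its EX stage, with forwarding."""
    # Cycle at whose end the value exists: end of EX for ALU ops,
    # end of MEM for loads.
    ready = 3 if producer_is_load else 2
    # Without stalls the consumer's EX is in cycle distance + 2,
    # so the value must exist by the end of the cycle before it.
    needed_by = distance + 2 - 1
    return max(0, ready - needed_by)


alu_then_use = forwarding_stalls(producer_is_load=False)   # 0
load_then_use = forwarding_stalls(producer_is_load=True)   # 1
load_gap_one = forwarding_stalls(producer_is_load=True, distance=2)  # 0
```

A loaded value exists only after MEM, one cycle too late for the next instruction's EX, so one bubble is unavoidable unless an independent instruction is placed in between.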



Code Scheduling to Avoid Stalls
■ Reorder code to avoid stalls
■ C code for A = B + E; C = B + F;

Original order                Reordered
lw  $t1, 0($t0)               lw  $t1, 0($t0)
lw  $t2, 4($t0)               lw  $t2, 4($t0)
(stall)                       lw  $t4, 8($t0)
add $t3, $t1, $t2             add $t3, $t1, $t2
sw  $t3, 12($t0)              sw  $t3, 12($t0)
lw  $t4, 8($t0)               add $t5, $t1, $t4
(stall)                       sw  $t5, 16($t0)
add $t5, $t1, $t4
sw  $t5, 16($t0)
13 cycles                     11 cycles
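The two cycle counts can be reproduced with a small model. It assumes a five-stage pipeline with forwarding, in which the only stall is one cycle when an instruction uses the result of the load immediately before it; the `(op, dest, sources)` tuples and the name `count_cycles` are illustrative.

```python
def count_cycles(program, stages=5):
    """Cycles to run `program`: pipeline fill plus one per instruction,
    plus one stall for each load followed directly by a user of its result."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev[0] == "lw" and prev[1] in cur[2]:
            stalls += 1
    return len(program) + (stages - 1) + stalls


original = [
    ("lw", "$t1", ("$t0",)),
    ("lw", "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),   # uses $t2 right after its load
    ("sw", None, ("$t3", "$t0")),
    ("lw", "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),   # uses $t4 right after its load
    ("sw", None, ("$t5", "$t0")),
]
# Move the third load up so no add directly follows the load it depends on.
reordered = [original[i] for i in (0, 1, 4, 2, 3, 5, 6)]

cycles_original = count_cycles(original)
cycles_reordered = count_cycles(reordered)
```

The original order gives 7 + 4 + 2 = 13 cycles and the reordered version 7 + 4 + 0 = 11, matching the counts above.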
