Unit 3 Basic Processing Unit
Computer Organization
“Central” Processing Unit (CPU)
datapath
2. Doing some computation (in the ALU)
3. Accessing the memory
4. Writing a register (in the register file)
Processor’s building blocks
• PC provides instruction address
• Instruction is fetched into IR
• Instruction address generator updates PC
• ALU performs some computation during execution
• Control circuitry interprets instruction and generates control signals to perform the actions needed.
A digital processing system
• datapath
A multi-stage digital processing system
• datapath
Why multi-stage?
• Processing moves from one stage to the next
in each clock cycle
• Such a multi-stage system is the basis for
pipelined operation
– High-performance processors have a pipelined
organization
– Pipelining enables the execution of successive
instructions to be overlapped
• We will return to pipelining later. For now, let’s
focus on the basics of the multi-stage
architecture of a RISC-style processor
Instruction execution
• Pipelined organization is most effective if all
instructions can be executed in the same number
of steps.
• Each step is carried out in a separate hardware
stage.
• Processor design will be illustrated using five
hardware stages.
• How can instruction execution be divided into
five steps?
– Let’s start from some representative RISC instructions
A memory access instruction:
Load R5, X(R7)
1. Fetch the instruction and increment the
program counter.
2. Decode the instruction and read the contents
of register R7 in the register file.
3. Compute the effective address = X + [R7].
4. Read the memory source operand.
5. Load the operand into the destination
register, R5.
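The five steps above can be traced with a short Python sketch. It is only an illustration: the offset X = 8, the register and memory contents, and the names `load_five_steps`, `regs`, and `mem` are assumptions, not part of the slides.

```python
# Illustrative trace of Load R5, X(R7); all values are assumptions.

def load_five_steps(regs, mem, pc, x, base, dest):
    """Run the five steps of 'Load dest, x(base)' and return (new_pc, trace)."""
    trace = []

    # Step 1: fetch and increment the PC. The instruction fetch itself is
    # not modeled here; only the PC update is.
    pc += 4
    trace.append(f"step 1: PC = {pc}")

    # Step 2: decode and read the base register.
    ra = regs[base]
    trace.append(f"step 2: [R{base}] = {ra}")

    # Step 3: compute the effective address X + [R7].
    address = x + ra
    trace.append(f"step 3: effective address = {address}")

    # Step 4: read the memory source operand.
    operand = mem[address]
    trace.append(f"step 4: memory operand = {operand}")

    # Step 5: load the operand into the destination register.
    regs[dest] = operand
    trace.append(f"step 5: R{dest} = {operand}")

    return pc, trace


regs = {5: 0, 7: 100}   # hypothetical register contents
mem = {108: 42}         # hypothetical memory word at address 108
new_pc, trace = load_five_steps(regs, mem, pc=0, x=8, base=7, dest=5)
print("\n".join(trace))
```

Each step uses only the result of the step before it, which is what lets every step be assigned to its own hardware stage.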
A computational instruction:
Add R3, R4, R5
1. Fetch the instruction and increment the program
counter.
2. Decode the instruction and read registers
R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.
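The same five-step pattern can be sketched for the Add instruction. This is a minimal illustration: the register contents are made up, the function name `add_five_steps` is hypothetical, and the local names `ra`, `rb`, `rz`, `ry` are borrowed from the inter-stage registers named in these slides. Arithmetic overflow is not modeled.

```python
# Illustrative trace of Add R3, R4, R5; register contents are assumptions.

def add_five_steps(regs, pc, dest, src1, src2):
    """Run the five steps of 'Add dest, src1, src2' and return the new PC."""
    # Step 1: fetch and increment the PC (the fetch itself is not modeled).
    pc += 4

    # Step 2: decode and read both source registers.
    ra, rb = regs[src1], regs[src2]

    # Step 3: compute the sum [R4] + [R5].
    rz = ra + rb

    # Step 4: no action; the result simply passes through this step.
    ry = rz

    # Step 5: load the result into the destination register.
    regs[dest] = ry
    return pc


regs = {3: 0, 4: 10, 5: 32}   # hypothetical register contents
new_pc = add_five_steps(regs, pc=0, dest=3, src1=4, src2=5)
print(regs[3], new_pc)
```

Step 4 does nothing here, but keeping it means Add takes the same number of steps as Load.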
• The register file may be implemented using a 2-port memory.
Hardware components: ALU (1)
• Both source operands
and the destination
location are in the
register file.
[RA] and [RB] denote values of registers that are identified by addresses A and B
new [RC] denotes the result that is stored to the register identified by address C
Hardware components: ALU (2)
A 5-stage implementation of
a RISC processor
• Instruction processing
moves from stage to stage
in every clock cycle,
starting with fetch.
• …
• If a memory operation is
involved, it takes place in
stage 4.
• Register file,
used in stages 2 and 5
– (Inter-stage registers RA, RB, RZ,
RM, RY needed to carry data
from one stage to the next)
• ALU stage
• Memory stage
Generated by decoding
the OPCODE field of the
instruction held in the
IR register
Instruction
Format
R
I
ALU control signals
Generated by decoding the OPCODE field of the instruction held in the IR register
Analyzed by the CONTROL CIRCUITRY during the execution of a branch instruction
Result selection
Generated by decoding
the OPCODE field of the
instruction held in the
IR register
Memory access
• When data are found in the cache, access to
memory can be completed in one clock cycle.
• Otherwise, read and write operations may
require several clock cycles to load data from
main memory into the cache.
• A control signal is needed to indicate that the
memory function has been completed (MFC).
E.g., for step 1:
1. Memory address ← [PC], Read memory,
Wait for MFC,
IR ← Memory data, PC ← [PC] + 4
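The effect of waiting for MFC on step 1 can be sketched as follows. This is a rough model under stated assumptions: the cache is a plain dictionary, a miss costs a fixed `miss_penalty` of 3 extra cycles (an invented figure), and the function name `fetch` is hypothetical.

```python
# Sketch of step 1 with a cache hit or miss; the miss penalty is an assumption.

def fetch(memory, cache, pc, miss_penalty=3):
    """Return (ir, new_pc, cycles) for one instruction fetch."""
    cycles = 1                      # a cache hit completes in one clock cycle
    if pc not in cache:
        # Cache miss: MFC is delayed while the word is loaded from main memory.
        cycles += miss_penalty
        cache[pc] = memory[pc]
    ir = cache[pc]                  # IR <- Memory data (once MFC is asserted)
    return ir, pc + 4, cycles       # PC <- [PC] + 4


memory = {0: "Load R5, X(R7)"}      # hypothetical instruction memory
cache = {}
ir, next_pc, first = fetch(memory, cache, 0)    # miss: extra cycles
_, _, second = fetch(memory, cache, 0)          # hit: one cycle
print(ir, next_pc, first, second)
```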
Memory and IR control signals
E.g.
RF_write = T5&(ALU | Load | Call);
PC_enable = T1&MFC | T3&(BR | Ret | Call);
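The two logic expressions above can be written as Boolean functions to check when each signal is asserted. In this sketch the timing signals T1 to T5 are modeled by an integer `step`, and the instruction-class flags are keyword arguments; both choices, and the Python function names, are assumptions made for illustration.

```python
# Boolean model of the two control-signal expressions; 'step' stands in for
# the timing signals T1..T5 (step == 5 means T5 is asserted).

def rf_write(step, alu=False, load=False, call=False):
    """Register-file write enable: T5 & (ALU | Load | Call)."""
    return bool(step == 5 and (alu or load or call))


def pc_enable(step, mfc=False, br=False, ret=False, call=False):
    """PC enable: T1 & MFC | T3 & (BR | Ret | Call)."""
    return bool((step == 1 and mfc) or (step == 3 and (br or ret or call)))


print(rf_write(5, load=True))    # Load writes the register file in step 5
print(rf_write(4, load=True))    # but not in step 4
print(pc_enable(1, mfc=False))   # PC is held until the memory responds
print(pc_enable(3, br=True))     # a branch updates the PC in step 3
```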
CISC processors
• CISC-style processors have more complex
instructions.
• The full set of instructions cannot all be
implemented in a fixed number of steps.
• Execution steps for different instructions do not
all follow a prescribed sequence of actions.
• Hardware organization should therefore enable
a flexible flow of data and actions to
accommodate CISC.
Hardware organization for a CISC computer
Main difference between 5-stage RISC organization and CISC organization, where a datapath cannot be identified easily
Hold temporary results during instruction execution
Bus
• An example of an interconnection network.
• When functional units are connected to a
common bus, tri-state drivers are needed.
Register Enable
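The role of tri-state drivers on a shared bus can be sketched as below. This is a simplified model, not a circuit description: each driver is an (enable, value) pair, a disabled driver contributes nothing (high impedance), and the function name `bus_value` is hypothetical.

```python
# Simplified model of a common bus with tri-state drivers.

def bus_value(drivers):
    """Return the value on the bus, or None if no driver is enabled.

    drivers is a list of (enable, value) pairs. A disabled driver is in the
    high-impedance state and does not affect the bus.
    """
    active = [value for enable, value in drivers if enable]
    if len(active) > 1:
        # Two enabled drivers would fight over the same wires.
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else None


# Three hypothetical registers share the bus; only the second is enabled.
print(bus_value([(False, 1), (True, 7), (False, 3)]))
```

The enable inputs must be controlled so that at most one unit drives the bus at any time.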
A 3-bus interconnection network
Example 1: Add R5, R6
1. Memory address ← [PC],
Read memory, Wait for
MFC, IR ← Memory data,
PC ← [PC] + 4
2. Decode instruction
3. R5 ← [R5] + [R6]
A 3-bus interconnection network
Example 2: And X(R7), R9
1. Memory address ← [PC], Read
memory, Wait for MFC,
IR ← Memory data,
PC ← [PC] + 4
2. Decode instruction
3. Memory address ← [PC], Read
memory, Wait for MFC,
Temp1 ← Memory data,
PC ← [PC] + 4
4. Temp2 ← [Temp1] + [R7]
5. Memory address ← [Temp2], Read
memory, Wait for MFC, Temp1 ←
Memory data
6. Temp1 ← [Temp1] AND [R9]
7. Memory address ← [Temp2],
Memory data ← [Temp1], Write
memory, Wait for MFC