Unit-3
Architecture vs Organization
(1) How many clock pulses are required to perform the
previous computation using sequential execution (that is,
without using a pipeline)?
Throughput
$\text{Throughput} = \dfrac{n}{(k + n - 1)\, t_p}$
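The cycle counts behind the question above can be checked with a short sketch. It assumes the usual textbook model, which the slides do not spell out: n tasks, a pipeline of k segments, one clock period t_p per segment, and a non-pipelined unit that needs k clock periods per task. The function names and the sample values are illustrative only.

```python
def sequential_cycles(n_tasks: int, k_segments: int) -> int:
    """Cycles without a pipeline: every task occupies all k segments by itself."""
    return n_tasks * k_segments


def pipelined_cycles(n_tasks: int, k_segments: int) -> int:
    """Cycles with a pipeline: k cycles for the first task, then one more per task."""
    return k_segments + n_tasks - 1


def throughput(n_tasks: int, k_segments: int, t_p: float) -> float:
    """Tasks completed per unit time: n / ((k + n - 1) * t_p)."""
    return n_tasks / (pipelined_cycles(n_tasks, k_segments) * t_p)


def speedup(n_tasks: int, k_segments: int) -> float:
    """Ratio of sequential to pipelined cycle counts (same clock assumed for both)."""
    return sequential_cycles(n_tasks, k_segments) / pipelined_cycles(n_tasks, k_segments)


if __name__ == "__main__":
    n, k = 100, 4  # illustrative values, not taken from the slides
    print(sequential_cycles(n, k))    # 400
    print(pipelined_cycles(n, k))     # 103
    print(round(speedup(n, k), 2))    # 3.88
```

As n grows, the speedup under this model approaches k, because the k - 1 fill cycles become negligible.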
Example-2
Example-3.1
Example-3.2
Example-4
Example-5
Example-6
Example-7
Example-8
Example-11
Instruction Pipeline
INSTRUCTION EXECUTION IN A 4-STAGE PIPELINE
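[Flowchart of the four-segment instruction pipeline; recoverable labels:]
Segment 2: Decode instruction and calculate effective address
Decision: Branch? (yes / no)
Segment 3: Fetch operand from memory
Decision: Interrupt? (yes: interrupt handling / no)
Update PC
Empty pipe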
Step:                   1   2   3   4   5   6   7   8   9   10  11  12  13
Instruction 1           FI  DA  FO  EX
Instruction 2               FI  DA  FO  EX
Instruction 3 (Branch)          FI  DA  FO  EX
Instruction 4                       FI  -   -   FI  DA  FO  EX
Instruction 5                                       FI  DA  FO  EX
Instruction 6                                           FI  DA  FO  EX
Instruction 7                                               FI  DA  FO  EX
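The timing above can be reproduced with a small sketch. It assumes the behaviour the diagram suggests: one instruction is fetched per step, and after a branch the next useful fetch happens in the step after the branch finishes EX. The wasted fetch of instruction 4 in step 4 is not modelled separately. The function names are illustrative.

```python
STAGES = ["FI", "DA", "FO", "EX"]


def fetch_steps(n_instructions, branch_at=None, n_stages=len(STAGES)):
    """Return {instruction number: step of its final (useful) FI}."""
    steps = {}
    next_step = 1
    for i in range(1, n_instructions + 1):
        steps[i] = next_step
        if i == branch_at:
            # Assumed: the pipe is emptied and refilled only after the
            # branch has completed its last stage.
            next_step += n_stages
        else:
            next_step += 1
    return steps


def total_steps(n_instructions, branch_at=None, n_stages=len(STAGES)):
    """Step in which the last instruction leaves the final stage."""
    last_fetch = fetch_steps(n_instructions, branch_at, n_stages)[n_instructions]
    return last_fetch + n_stages - 1


if __name__ == "__main__":
    for instr, start in fetch_steps(7, branch_at=3).items():
        # Pad with blank columns up to the fetch step, then list the stages.
        print(f"{instr}: " + "    " * (start - 1) + "  ".join(STAGES))
    print("total steps:", total_steps(7, branch_at=3))  # total steps: 13
```

Without the branch, the same seven instructions would finish in 4 + 7 - 1 = 10 steps, so the branch costs three steps in this model.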
Example (continued)
Cycle:  1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
I1      FI  DI  FO  EI  WB
I2          FI  DI  FO  EI  WB
I3              FI  DI  FO  EI  WB
I4                  FI  DI  FO  EI  WB
I5
I6
I7
I8
I9                                  FI  DI  FO  EI  WB
I10                                     FI  DI  FO  EI  WB
I11                                         FI  DI  FO  EI  WB
I12                                             FI  DI  FO  EI  WB
Example (continued)
Number of cycles = 15
Time = 15 × 11 = 165 ns
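The total can be cross-checked with the k + n - 1 cycle count plus a stall term. This is a sketch under assumptions read off the example rather than stated in it: 8 instructions actually execute (I1 to I4 and I9 to I12), the pipeline has 5 stages, and the branch costs 3 extra cycles, which is the difference between the stated 15 cycles and the stall-free 12. The 11 ns cycle time is the figure used in the example.

```python
def total_cycles(executed_instructions: int, k_stages: int, stall_cycles: int = 0) -> int:
    """Cycles to run the instructions through a k-stage pipeline, plus stalls."""
    return k_stages + executed_instructions - 1 + stall_cycles


if __name__ == "__main__":
    CYCLE_TIME_NS = 11   # from the example
    EXECUTED = 8         # I1..I4 and I9..I12 (assumed from the diagram)
    STAGES = 5           # FI, DI, FO, EI, WB
    BRANCH_PENALTY = 3   # inferred: 15 stated cycles minus 12 stall-free cycles

    cycles = total_cycles(EXECUTED, STAGES, BRANCH_PENALTY)
    print(cycles)                  # 15
    print(cycles * CYCLE_TIME_NS)  # 165
```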
GPK
Difficulties that cause the instruction pipeline to deviate from normal operation
• There are three major difficulties that cause the instruction pipeline to
deviate from its normal operation.
1. Resource conflicts caused by access to memory by two segments at
the same time. Most of these conflicts can be resolved by using
separate instruction and data memories.
2. Data dependency conflicts arise when an instruction depends on the
result of a previous instruction, but this result is not yet available.
3. Branch difficulties arise from branch and other instructions that
change the value of PC.
Structural Hazard / Resource Conflict
• Two different instructions try to use the same hardware resource at the same time
MAJOR HAZARDS IN PIPELINED EXECUTION
Structural hazards (Resource Conflicts)
The hardware resources required by instructions in
simultaneous overlapped execution cannot all be supplied at once
Data hazards (Data Dependency Conflicts)
An instruction scheduled to be executed in the pipeline requires the
result of a previous instruction, which is not yet available
Example: R1 <- R1 + 1
INC:  DA | bubble | R1 + 1
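A data dependency of this kind can be detected by comparing the register an instruction writes with the registers a following instruction reads. The sketch below is a simplified illustration: the `Instr` record, the `window` parameter, and the ADD and SUB instructions in the sample program are hypothetical, with only the INC on R1 taken from the slide.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    name: str
    dest: Optional[str]     # register written, if any
    srcs: Tuple[str, ...]   # registers read


def raw_hazards(program, window=1):
    """List (producer, consumer, register) triples where a later instruction
    reads a register written by one of the `window` instructions before it."""
    hazards = []
    for j, later in enumerate(program):
        for i in range(max(0, j - window), j):
            earlier = program[i]
            if earlier.dest is not None and earlier.dest in later.srcs:
                hazards.append((earlier.name, later.name, earlier.dest))
    return hazards


if __name__ == "__main__":
    program = [
        Instr("ADD", "R1", ("R2", "R3")),  # hypothetical producer of R1
        Instr("INC", "R1", ("R1",)),       # R1 <- R1 + 1, needs the ADD result
        Instr("SUB", "R4", ("R5", "R6")),  # hypothetical, independent
    ]
    print(raw_hazards(program))  # [('ADD', 'INC', 'R1')]
```

How many following instructions are actually at risk depends on the pipeline depth, so `window` would be chosen to match the machine being modelled.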
Control hazards
Branches and other instructions that change the PC
delay the fetch of the next instruction
Example (branch address dependency):
JMP:  ID | PC + PC
Next instruction:  bubble | IF | ID | OF | OE | OS
Hardware Interlock
- Hardware detects the data dependency and delays the dependent
instruction by stalling it for enough clock cycles
Operand Forwarding (bypassing, short-circuiting)
- A dedicated data path routes a value from its source (usually the
ALU) directly to the unit that needs it, bypassing the designated
register. The value can then be used at an earlier stage of the
pipeline than would otherwise be possible
Software Technique
- Instruction scheduling by the compiler, used for delayed load
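The software technique can be illustrated with a minimal sketch of delayed load, in which the compiler places a no-op after a load whose result is needed by the very next instruction. The sketch assumes a single delay slot; the `Instr` record, the `NOP` constant, and the sample program are hypothetical. A real compiler would first try to move a useful, independent instruction into the slot.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]     # register written, if any
    srcs: Tuple[str, ...]   # registers read


NOP = Instr("NOP", None, ())


def insert_load_delays(program):
    """Insert a NOP after each LOAD whose destination register is read by
    the immediately following instruction (one delay slot assumed)."""
    scheduled = []
    for idx, instr in enumerate(program):
        scheduled.append(instr)
        following = program[idx + 1] if idx + 1 < len(program) else None
        if instr.op == "LOAD" and following is not None and instr.dest in following.srcs:
            scheduled.append(NOP)
    return scheduled


if __name__ == "__main__":
    program = [
        Instr("LOAD", "R1", ()),            # R1 <- M[address 1]
        Instr("LOAD", "R2", ()),            # R2 <- M[address 2]
        Instr("ADD", "R3", ("R1", "R2")),   # needs R2 from the load just before
        Instr("STORE", None, ("R3",)),      # M[address 3] <- R3
    ]
    print([i.op for i in insert_load_delays(program)])
    # ['LOAD', 'LOAD', 'NOP', 'ADD', 'STORE']
```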
Operand Forwarding
CONTROL HAZARDS
Prefetch Target Instruction
• Fetch instructions in both streams, branch not taken and branch taken
• Both are saved until branch is executed. Then, select the right
instruction stream and discard the wrong stream
Branch Target Buffer (BTB; Associative Memory)
• Entry: address of a previously executed branch, its target instruction,
and the next few instructions
• When fetching an instruction, search the BTB.
• If found, fetch the instruction stream from the BTB;
• If not, fetch the new stream from memory and update the BTB
Loop Buffer (High-Speed Register File)
• Stores an entire loop so that it can be executed without accessing memory
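The lookup-then-update behaviour of a BTB can be sketched with a small dictionary-backed model. This is an illustration only: the class and function names, the capacity, and the oldest-entry eviction policy are assumptions, and `memory_fetch` stands in for whatever supplies the target instruction stream on a miss.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Maps a branch address to the instruction stream at its target."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, branch_addr):
        """Return the stored stream, or None on a miss."""
        return self.entries.get(branch_addr)

    def update(self, branch_addr, stream):
        # Assumed policy: when full, evict the oldest inserted entry.
        if branch_addr not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[branch_addr] = stream


def fetch_stream(btb, branch_addr, memory_fetch):
    """Return (stream, hit). On a miss, fetch from memory and update the BTB."""
    stream = btb.lookup(branch_addr)
    if stream is not None:
        return stream, True
    stream = memory_fetch(branch_addr)
    btb.update(branch_addr, stream)
    return stream, False


if __name__ == "__main__":
    btb = BranchTargetBuffer(capacity=2)
    memory = lambda addr: [f"instr@{addr}", f"instr@{addr + 1}"]  # stand-in
    print(fetch_stream(btb, 100, memory)[1])  # False (first encounter, miss)
    print(fetch_stream(btb, 100, memory)[1])  # True  (served from the BTB)
```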
Branch Prediction
• Guess the outcome of the branch condition and fetch an instruction stream
based on the guess. A correct guess eliminates the branch penalty
Delayed Branch
• Compiler detects the branch and rearranges the instruction sequence
by inserting useful instructions that keep the pipeline busy
in the presence of a branch instruction
Example-9
Example-9 (Solution)
From Previous Example...
Example-10
Example-10 (Solution)