Unit-3

Uploaded by

Krishil Patel
Copyright
© All Rights Reserved
The Processor:

Architecture vs Organization
(1) How many clock pulses are required to perform the
previous computation using sequential execution (that is,
without a pipeline)?

(2) For n = 6 and n = 7, how many clock cycles are required
in the pipelined and the non-pipelined architecture
(for the previous example)?
Pipeline Cycle Time
• Minimum time in which all segments can perform
their respective suboperations.
• Denoted by tp.
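For n tasks on a pipeline with k segments and cycle time tp:

(1) Time required to perform the first task? k * tp

(2) Time required to perform the remaining n-1 tasks? (n - 1) * tp

(3) Time required to complete all n tasks? (k + n - 1) * tp

(4) Total cycles required to complete n tasks? k + n - 1

Without a pipeline, time required for n tasks = n * tn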
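The cycle counts asked for above can be checked with a few lines of Python. This is a minimal sketch, assuming every segment takes exactly one clock cycle and that a non-pipelined unit spends k cycles on each task; the function names and the choice k = 4 are illustrative and are not taken from the slides.

```python
def pipeline_cycles(k: int, n: int) -> int:
    """Clock cycles needed for n tasks on a k-segment pipeline."""
    first_task = k      # the first task must pass through all k segments
    remaining = n - 1   # each later task finishes one cycle after the previous one
    return first_task + remaining  # k + n - 1


def sequential_cycles(k: int, n: int) -> int:
    """Clock cycles without a pipeline (assumes k cycles per task)."""
    return n * k


if __name__ == "__main__":
    k = 4  # assumed number of segments, for illustration only
    for n in (6, 7):
        print(f"n={n}: pipelined={pipeline_cycles(k, n)}, "
              f"sequential={sequential_cycles(k, n)}")
```

Multiplying the pipelined count by tp gives the total pipelined time.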
Cycle Time in a Synchronous Pipeline
tp = maximum of the segment delays + intermediate register delay.
tn = sum of all segment delays.
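The two definitions above translate directly into code. The sketch below is only an illustration: the function name `cycle_times` and the delay values are hypothetical, not figures from the slides.

```python
def cycle_times(segment_delays, register_delay):
    """Return (tp, tn) for a synchronous pipeline.

    tp: clock period, set by the slowest segment plus the register delay.
    tn: time for one task without pipelining, the sum of the segment delays.
    """
    tp = max(segment_delays) + register_delay
    tn = sum(segment_delays)
    return tp, tn


if __name__ == "__main__":
    # Hypothetical segment delays in nanoseconds, chosen only for illustration.
    tp, tn = cycle_times([40, 50, 30, 45], register_delay=5)
    print(f"tp = {tp} ns, tn = {tn} ns")
```

Note that tp is governed by the slowest segment, so balancing the segment delays is what keeps the clock period short.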
Latency
Time after which the machine can accept the next input.

Latency in non pipeline = tn


Latency in pipeline = tp
Throughput
• Number of inputs processed per unit of time
• For n tasks on a k-segment pipeline with cycle time tp:
  Throughput = n / ((k + n - 1) * tp)
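As a rough numerical check of the throughput definition, the sketch below divides the number of tasks by the total pipelined time (k + n - 1) * tp. The function name and the sample values are assumptions made for illustration.

```python
def pipelined_throughput(k: int, n: int, tp: float) -> float:
    """Tasks completed per unit of time on a k-segment pipeline."""
    total_time = (k + n - 1) * tp
    return n / total_time


if __name__ == "__main__":
    # Hypothetical values: 4 segments, 20 ns cycle time.
    for n in (1, 10, 100, 10_000):
        print(n, pipelined_throughput(4, n, 20.0))
```

For large n the value approaches 1 / tp, that is, one result per clock cycle.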
Instruction Pipeline
INSTRUCTION EXECUTION IN A 4-STAGE PIPELINE

Segment 1: Fetch instruction from memory
Segment 2: Decode instruction and calculate effective address
    Branch? yes: go to Update PC; no: continue with Segment 3
Segment 3: Fetch operand from memory
Segment 4: Execute instruction
    Interrupt? yes: interrupt handling, then Update PC; no: return to Segment 1
Update PC
Empty pipe, then return to Segment 1
Step:            1   2   3   4   5   6   7   8   9   10  11  12  13
Instruction  1   FI  DA  FO  EX
             2       FI  DA  FO  EX
    (Branch) 3           FI  DA  FO  EX
             4               FI  -   -   FI  DA  FO  EX
             5                               FI  DA  FO  EX
             6                                   FI  DA  FO  EX
             7                                       FI  DA  FO  EX
(- marks a step in which instruction 4 waits for the branch to be resolved)
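The timing diagram above can be regenerated with a small script. This is a sketch under stated assumptions: the branch is resolved in EX, every instruction after the branch is delayed by a fixed penalty of 3 steps, and the discarded first fetch of instruction 4 is not drawn. The names `start_steps`, `total_steps` and `render` are illustrative.

```python
STAGES = ["FI", "DA", "FO", "EX"]


def start_steps(num_instructions: int, branch_index: int, penalty: int = 3):
    """Map each instruction number to the step of its (final) FI.

    Instructions up to and including the branch enter one per step;
    later ones are pushed back by `penalty` steps (an assumed value).
    """
    starts = {}
    for i in range(1, num_instructions + 1):
        starts[i] = i if i <= branch_index else i + penalty
    return starts


def total_steps(num_instructions: int, branch_index: int, penalty: int = 3) -> int:
    """Step in which the last instruction finishes EX."""
    starts = start_steps(num_instructions, branch_index, penalty)
    return starts[num_instructions] + len(STAGES) - 1


def render(num_instructions: int, branch_index: int, penalty: int = 3) -> str:
    """Build a text timing diagram, one row per instruction."""
    starts = start_steps(num_instructions, branch_index, penalty)
    width = total_steps(num_instructions, branch_index, penalty)
    rows = []
    for i, first in starts.items():
        cells = ["  "] * width
        for offset, stage in enumerate(STAGES):
            cells[first - 1 + offset] = stage
        rows.append(f"{i:>2} " + " ".join(cells).rstrip())
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(7, branch_index=3))
    print("total steps:", total_steps(7, branch_index=3))
```

With 7 instructions and instruction 3 as the branch, the sketch reproduces the 13 steps shown in the diagram.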
Example (continued)
     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
I1   FI  DI  FO  EI  WB
I2       FI  DI  FO  EI  WB
I3           FI  DI  FO  EI  WB
I4               FI  DI  FO  EI  WB
I5
I6
I7
I8
I9                               FI  DI  FO  EI  WB
I10                                  FI  DI  FO  EI  WB
I11                                      FI  DI  FO  EI  WB
I12                                          FI  DI  FO  EI  WB
Example (continued)
Number of cycles = 15
Time = 15 * 11 = 165 ns
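The total above can be cross-checked with the k + n - 1 rule plus a stall term. The sketch assumes a 5-stage pipeline, 8 executed instructions (I1 to I4 and I9 to I12), a cycle time of 11 ns, and 3 stall cycles; the stall count is inferred as the value that reproduces the 15 cycles stated here, not taken from the original problem statement.

```python
def total_cycles(stages: int, executed: int, stall_cycles: int) -> int:
    """Cycles for `executed` instructions on a `stages`-deep pipeline
    with `stall_cycles` lost cycles (for example, after a taken branch)."""
    return stages + executed - 1 + stall_cycles


if __name__ == "__main__":
    cycles = total_cycles(stages=5, executed=8, stall_cycles=3)  # assumed inputs
    cycle_time_ns = 11
    print(cycles, "cycles,", cycles * cycle_time_ns, "ns")
```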
Difficulties that cause the instruction pipeline to deviate

• There are three major difficulties that cause the instruction pipeline to
deviate from its normal operation.
1. Resource conflicts, caused by access to memory by two segments at
the same time. Most of these conflicts can be resolved by using
separate instruction and data memories.
2. Data dependency conflicts, which arise when an instruction depends on the
result of a previous instruction, but this result is not yet available.
3. Branch difficulties, which arise from branch and other instructions that
change the value of PC.
Structural Hazard / Resource Conflict
• Two different inputs try to use the same resource at the same time
MAJOR HAZARDS IN PIPELINED EXECUTION
Structural hazards (Resource Conflicts)
Hardware resources required by the instructions in
simultaneous overlapped execution cannot be met
Data hazards (Data Dependency Conflicts)
An instruction scheduled to be executed in the pipeline requires the
result of a previous instruction, which is not yet available

R1 <- B + C       ADD   DA    B,C      +
R1 <- R1 + 1            INC   DA       bubble   R1   +1
(Data dependency)

Control hazards
Branches and other instructions that change the PC
delay the fetch of the next instruction

JMP   ID       PC + PC
      bubble   IF   ID   OF   OE   OS
(Branch address dependency)

Hazards in pipelines may make it necessary to stall the pipeline.
Pipeline Interlock: detect hazards and stall until they are cleared.
DATA HAZARDS
A data hazard occurs when the execution of an instruction
depends on the result of a previous instruction:
ADD R1, R2, R3
SUB R4, R1, R5
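The dependency between these two instructions can be detected mechanically. The following is a minimal sketch, assuming the first operand is the destination register and the others are sources (the reading under which SUB depends on ADD); the `Instr` type, the `raw_hazards` function and the two-instruction look-back window are illustrative assumptions.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str               # assumed: first operand is the destination
    srcs: Tuple[str, ...]   # assumed: remaining operands are sources


def raw_hazards(program: List[Instr], window: int = 2):
    """Return (producer_index, consumer_index, register) for every
    instruction that reads a register written by one of the previous
    `window` instructions (a read-after-write dependency)."""
    hazards = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            if program[i].dest in consumer.srcs:
                hazards.append((i, j, program[i].dest))
    return hazards


if __name__ == "__main__":
    program = [
        Instr("ADD", "R1", ("R2", "R3")),
        Instr("SUB", "R4", ("R1", "R5")),
    ]
    print(raw_hazards(program))
```

A hardware interlock performs a comparison of this kind on register numbers in order to decide when to stall.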
Data hazards can be dealt with by either hardware
or software techniques
Hardware Technique

Hardware Interlock
- Hardware detects the data dependencies and delays the scheduling
of the dependent instruction by stalling for enough clock cycles
Operand Forwarding (bypassing, short-circuiting)
- Accomplished by a data path that routes a value from a source
(usually an ALU) to a user, bypassing a designated register. This
allows the value to be used at an earlier stage in the
pipeline than would otherwise be possible

Software Technique
Instruction scheduling (by the compiler) for delayed load
CONTROL HAZARDS
Prefetch Target Instruction
• Fetch instructions in both streams, branch not taken and branch taken
• Both are saved until branch is executed. Then, select the right
instruction stream and discard the wrong stream
Branch Target Buffer (BTB; Associative Memory)
• Entry: address of a previously executed branch, its target instruction,
and the next few instructions
• When fetching an instruction, search the BTB.
• If found, fetch the instruction stream from the BTB;
• If not, fetch the new stream and update the BTB
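The lookup-then-update behaviour described above can be sketched in software. In this illustration the associative memory is modelled with an ordered dictionary keyed by branch address; the class and function names, the capacity, and the oldest-entry-first replacement policy are assumptions, since the slides do not specify them.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy BTB: branch address -> saved target instruction stream."""

    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, address):
        """Return the saved stream, or None if the branch is not in the BTB."""
        return self.entries.get(address)

    def update(self, address, stream):
        """Store a stream, evicting the oldest entry when full (assumed policy)."""
        self.entries[address] = stream
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def fetch(btb, address, fetch_from_memory):
    """Search the BTB first; on a miss, fetch from memory and update the BTB.

    Returns (stream, hit) where hit tells whether the BTB supplied the stream.
    """
    stream = btb.lookup(address)
    if stream is not None:
        return stream, True
    stream = fetch_from_memory(address)
    btb.update(address, stream)
    return stream, False


if __name__ == "__main__":
    btb = BranchTargetBuffer(capacity=2)
    memory = lambda addr: [f"instr@{addr}", f"instr@{addr + 1}"]
    print(fetch(btb, 100, memory))  # first time: miss
    print(fetch(btb, 100, memory))  # second time: hit
```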
Loop Buffer (High-Speed Register File)
• Stores an entire loop so that it can be executed without accessing memory
Branch Prediction
• Guess the branch condition and fetch an instruction stream based on
the guess. A correct guess eliminates the branch penalty
Delayed Branch
• Compiler detects the branch and rearranges the instruction sequence
by inserting useful instructions that keep the pipeline busy
in the presence of a branch instruction