Advanced Computer Architecture: Pipelined Processor

This document discusses pipelined processors. It begins by explaining the principle of pipelining, which involves processing a sequence of instructions simultaneously using multiple stages. It then discusses pipeline performance measures such as cycle time and repetition rate. Examples are given to show how performance can be calculated. The document also discusses how data dependencies between instructions, such as RAW dependencies, can cause stalls if the results of one instruction are needed by a subsequent instruction before they are available. Various techniques for improving performance are presented, such as bypassing to avoid stalls from data dependencies. The key aspects of pipeline design space and examples of pipeline layouts from actual processors are also summarized.

Uploaded by

Sajid Hussain

Advanced Computer Architecture

Lecture 4

Pipelined Processor

Principle of pipelining

Processing of a sequence of instructions using a basic pipeline

Pipelined and unpipelined processing

General structure of pipelines

Pipeline Performance Measures

Cycle time (tc): determined by the worst-case processing time of the longest stage

Repetition rate (R): the shortest possible time interval between subsequent independent instructions in the pipeline
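
As a rough illustration of how these two measures combine, the sketch below estimates the time needed to process n independent instructions. The function name, the assumption that an instruction passes through every stage in one cycle each, and the sample numbers are illustrative and not taken from the lecture.

```python
def time_for_independent_instructions(n, n_stages, tc_ns, repetition_rate=1):
    """Estimated time (ns) to complete n independent instructions.

    Assumptions (not from the lecture): the first instruction needs
    n_stages cycles to pass through the pipeline, and every later
    instruction starts repetition_rate cycles after its predecessor.
    """
    if n <= 0:
        return 0.0
    cycles = n_stages + (n - 1) * repetition_rate
    return cycles * tc_ns


# Hypothetical 5-stage pipeline with tc = 12 ns
print(time_for_independent_instructions(100, 5, 12.0))     # R = 1
print(time_for_independent_instructions(100, 5, 12.0, 2))  # R = 2
```

For long instruction sequences the fill time of the pipeline becomes negligible, and the total time is dominated by the product n * R * tc.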

Pipeline Performance Measures (Example)

Performance potential of a pipeline: P

P = 1/(R * tc)

Example: PowerPC 603 floating-point double-precision multiply, R = 2, tc = 12 ns
P = 1/(R * tc) = 1/(2 * 12 ns) ≈ 41.7 MFLOPS
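
The formula can be checked with a few lines of Python. This is only a sketch: the function name and the unit conversion are my own, while R = 2 and tc = 12 ns are the values from the example.

```python
def performance_potential(repetition_rate, cycle_time_ns):
    """P = 1 / (R * tc), returned in millions of operations per second."""
    ops_per_ns = 1.0 / (repetition_rate * cycle_time_ns)
    return ops_per_ns * 1e3  # 1 operation/ns equals 1000 million operations/s


p = performance_potential(2, 12.0)
print(f"{p:.1f} MFLOPS")  # 41.7 MFLOPS
```

With R = 2 and tc = 12 ns the expression evaluates to 1/(24 ns), which is about 41.7 million floating-point operations per second.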

Performance: RAW-dependent

Latency: specifies the amount of time that the result of a particular instruction takes to become available in the pipeline for a subsequent dependent instruction.

Define-use latency (10 to 100 cycles)

mul r1, r2, r3
add r5, r1, r4

Performance: RAW-dependent

Load-use latency (1 to 3 cycles)

load r1, x
add r5, r1, r2

Stalled: with a latency of n cycles, the immediately following RAW-dependent instruction has to be stalled in the pipeline for n-1 cycles
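
The n-1 rule can be reproduced with a small issue-timing model. The sketch below is a simplification under stated assumptions: in-order issue of one instruction per cycle, and a hypothetical `issue_cycles` helper that takes each instruction as a (destination, sources, latency) tuple. The latency of 3 cycles for the load is an example value.

```python
def issue_cycles(program):
    """Return the cycle in which each instruction issues.

    program: list of (dest, sources, latency_cycles) tuples.
    Assumes in-order issue, at most one instruction per cycle, and that
    a result is usable `latency_cycles` after its producer issued.
    """
    ready = {}        # register -> first cycle in which a consumer may issue
    issued = []
    next_cycle = 0
    for dest, sources, latency in program:
        start = max([next_cycle] + [ready.get(r, 0) for r in sources])
        issued.append(start)
        ready[dest] = start + latency
        next_cycle = start + 1
    return issued


# load r1, x  followed by  add r5, r1, r2  with a load-use latency of n = 3
program = [("r1", ["x"], 3), ("r5", ["r1", "r2"], 1)]
cycles = issue_cycles(program)
stall = cycles[1] - cycles[0] - 1
print(cycles, stall)  # [0, 3] 2
```

The dependent add issues in cycle 3 instead of cycle 1, which is a stall of n-1 = 2 cycles. With a latency of one cycle the same model reports no stall.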

Improve Performance

There is a difference between:
  additions/subtractions and multiplications
  integer and double-precision operations

Design space of pipelines

Key aspects of the design space of pipelines

Basic layout of a pipeline

Design space of the overall stage layout

Increasing parallelism by raising the number of pipeline stages

Eight-stage pipeline

Problems arise for more stages

Data and control dependencies occur more frequently:
  instructions are stalled and wait for data
  the pipe is reloaded in case of a branch

Subtasks become less balanced (in execution time):
  cycle time is determined by the worst-case processing time of the longest stage

In most cases: 5-10 stages
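
The cost of unbalanced subtasks can be made concrete with a short calculation. The sketch below compares the unpipelined time (the sum of all stage delays) with the pipelined cycle time (the longest stage delay). The function name and the stage delays are hypothetical, and dependencies and latch overhead are ignored.

```python
def ideal_speedup(stage_delays_ns):
    """Upper bound on pipeline speedup for independent instructions.

    Unpipelined time per instruction is the sum of all stage delays;
    the pipelined cycle time is set by the longest stage.
    """
    unpipelined_ns = sum(stage_delays_ns)
    tc_ns = max(stage_delays_ns)
    return unpipelined_ns / tc_ns


balanced = [10.0, 10.0, 10.0, 10.0, 10.0]   # hypothetical delays in ns
unbalanced = [6.0, 14.0, 10.0, 12.0, 8.0]   # same total work, uneven split

print(ideal_speedup(balanced))              # 5.0
print(round(ideal_speedup(unbalanced), 2))  # 3.57
```

Both layouts perform the same 50 ns of work in five stages, but the uneven split loses more than a quarter of the achievable speedup because every stage must wait for the 14 ns stage.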

Pipelines e.g. DEC 21064

Layout of the stage sequence

Bypasses (data forwarding for RAW dependencies)

Unless special arrangements are made, the result of an operation instruction is written into the register file, or into memory, and is then fetched from there as a source operand.
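
To see why this round trip through the register file is expensive, the following sketch counts stall cycles with and without a bypass. It assumes a classic five-stage layout (IF, ID, EX, MEM, WB) that is not taken from the lecture, register reads in ID, and a register file whose newly written value can only be read in the cycle after WB. The function name and parameters are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed layout, one cycle per stage


def stall_cycles(result_stage, bypass, distance=1):
    """Stalls for a consumer issued `distance` cycles after its producer.

    result_stage: stage at whose end the producer's result exists
                  ("EX" for an ALU operation, "MEM" for a load).
    bypass:       True if results are forwarded directly to the EX input.
    """
    if bypass:
        # The consumer's EX stage must come after the cycle that produces the result.
        produced = STAGES.index(result_stage)
        unstalled = distance + STAGES.index("EX")
        return max(0, produced + 1 - unstalled)
    # Without a bypass the consumer's register read (ID) must follow the producer's WB.
    written = STAGES.index("WB")
    unstalled = distance + STAGES.index("ID")
    return max(0, written + 1 - unstalled)


print(stall_cycles("EX", bypass=False))   # 3  define-use, no bypass
print(stall_cycles("EX", bypass=True))    # 0  define-use, with bypass
print(stall_cycles("MEM", bypass=True))   # 1  load-use, with bypass
```

Under these assumptions the bypass removes the define-use stall entirely, while a load followed directly by its consumer still costs one cycle because the data only exists after the memory stage.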

Principle of bypassing in define-use and load-use conflicts

Possibilities for the timing of pipeline operation
