Computer Organization and Assembly Language: Pipeline: Introduction

This document introduces pipelining in computer processors. It defines pipelining as a way to speed up the execution of instructions by overlapping the execution of multiple instructions. It provides an analogy using a laundry process to illustrate how pipelining works by performing tasks like washing, drying, and folding clothes simultaneously across multiple loads of laundry. The document then discusses how pipelining can be applied to a digital system by breaking computations into stages separated by pipeline registers. Finally, it shows how pipelining can be applied to a MIPS processor by splitting the instruction execution process into five stages - fetch, decode, execute, memory, and writeback.

Uploaded by

Aroosa Sheikh

Computer Organization and Assembly Language

Pipeline: Introduction

CSCE430/830 Pipeline
Pipelining Outline

• Introduction 
– Defining Pipelining
– Pipelining Instructions
• Hazards
– Structural hazards
– Data Hazards
– Control Hazards
• Performance
• Controller implementation

What is Pipelining?

• A way of speeding up execution of instructions

• Key idea:
overlap execution of multiple instructions

The Laundry Analogy

• Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold
• Washer takes 30 minutes
• Dryer takes 30 minutes
• “Folder” takes 30 minutes
• “Stasher” takes 30 minutes to put clothes into drawers

If we do laundry sequentially...

[Figure: timeline from 6 PM to 2 AM in 30-minute slots. Loads A, B, C, D are listed in task order; each load runs its four 30-minute steps only after the previous load has finished.]
• Time Required: 8 hours for 4 loads

To Pipeline, We Overlap Tasks

[Figure: timeline from 6 PM in seven 30-minute slots. Loads A, B, C, D are listed in task order; each load starts one slot after the previous one, so their steps overlap.]
• Time Required: 3.5 Hours for 4 Loads
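
The two laundry totals can be checked with a minimal Python sketch. The function names are made up for illustration, and the sketch assumes every step takes the same time and that a load can start a step as soon as the previous load leaves it.

```python
def sequential_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Each load finishes every step before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Loads overlap: the first load needs `stages` slots, and each later
    load finishes one slot after the load before it."""
    return (stages + loads - 1) * stage_minutes


if __name__ == "__main__":
    # Four loads (A-D), four 30-minute steps: wash, dry, fold, stash.
    print(sequential_minutes(4, 4, 30) / 60)  # 8.0 hours
    print(pipelined_minutes(4, 4, 30) / 60)   # 3.5 hours
```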

To Pipeline, We Overlap Tasks

• Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Multiple tasks operate simultaneously
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup

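
The effect of fill/drain time and of unbalanced stages on speedup can be sketched in Python. This is an illustrative model only: the function names are hypothetical, and it assumes every pipeline slot is as long as the slowest stage.

```python
def pipelined_time(tasks: int, stage_times: list[float]) -> float:
    """Total time when every stage slot is as long as the slowest stage."""
    slot = max(stage_times)
    return (len(stage_times) + tasks - 1) * slot


def speedup(tasks: int, stage_times: list[float]) -> float:
    """Sequential time divided by pipelined time for `tasks` tasks."""
    sequential = tasks * sum(stage_times)
    return sequential / pipelined_time(tasks, stage_times)


if __name__ == "__main__":
    balanced = [30, 30, 30, 30]
    unbalanced = [30, 60, 30, 30]  # hypothetical: the dryer takes twice as long
    for tasks in (4, 40, 1000):
        print(tasks, speedup(tasks, balanced), speedup(tasks, unbalanced))
```

With balanced stages the speedup approaches the number of stages (4) as the number of loads grows, because fill and drain time is amortized. With the hypothetical slow dryer the speedup stays below 2.5 however many loads there are.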
Pipelining a Digital System
1 nanosecond = 10^-9 second
1 picosecond = 10^-12 second

• Key idea: break big computation up into pieces

[Figure: one block of logic with a delay of 1ns]
• Separate each piece with a pipeline register

[Figure: the same logic split into five 200ps pieces, separated by pipeline registers]

Pipelining a Digital System

• Why do this? Because it's faster for repeated computations

Non-pipelined: 1 operation finishes every 1ns

Pipelined: 1 operation finishes every 200ps

Comments about pipelining

• Pipelining increases throughput, but not latency
– Answer available every 200ps, BUT
– A single computation still takes 1ns
• Limitations:
– Computations must be divisible into stages of similar size
– Pipeline registers add overhead
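
The throughput and latency figures above, and the cost of pipeline registers, can be illustrated with a small Python sketch. The function name and the 20ps register delay are assumptions for illustration; the slides give no overhead value for this example.

```python
def pipeline_metrics(logic_ps: float, stages: int, register_ps: float = 0.0):
    """Return (cycle time, latency of one computation) in picoseconds.

    Assumes the logic splits evenly across `stages` and that each stage pays
    `register_ps` for its pipeline register (a hypothetical overhead value).
    """
    cycle = logic_ps / stages + register_ps
    latency = cycle * stages
    return cycle, latency


if __name__ == "__main__":
    print(pipeline_metrics(1000, 1))      # unpipelined: (1000.0, 1000.0)
    print(pipeline_metrics(1000, 5))      # ideal 5 stages: (200.0, 1000.0)
    print(pipeline_metrics(1000, 5, 20))  # with overhead: (220.0, 1100.0)
```

In this model a result is available every cycle, so throughput improves by roughly the number of stages, while the latency of one computation stays at 1ns at best and grows once register overhead is counted.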

Pipelining a Processor

• Recall the 5 steps in instruction execution:


1. Instruction Fetch (IF)
2. Instruction Decode and Register Read (ID)
3. Execute operation or calculate address (EX)
4. Memory access (MEM)
5. Write result into register (WB)

• Review: Single-Cycle Processor


– All 5 steps done in a single clock cycle
– Dedicated hardware required for each step

Review - Single-Cycle Processor

• What do we need to add to actually split the datapath into stages?

The Basic Pipeline For MIPS

[Figure: pipeline diagram for four instructions in program order across clock cycles 1 to 7. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg, starting one cycle after the instruction before it.]

What do we need to add to actually split the datapath into stages?
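
The overlap in the basic pipeline can be reproduced with a short Python sketch that builds the stage-by-cycle chart. It assumes an ideal pipeline (one instruction issued per cycle, no stalls) and uses the IF/ID/EX/MEM/WB names from the earlier slide; the function name is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_chart(num_instructions: int) -> list[list[str]]:
    """Row i lists what instruction i does in each clock cycle ("" = idle).

    Assumes an ideal pipeline: one instruction issued per cycle, no stalls.
    """
    total_cycles = len(STAGES) + num_instructions - 1
    chart = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name  # instruction i is in stage s during cycle i + s
        chart.append(row)
    return chart


if __name__ == "__main__":
    for i, row in enumerate(pipeline_chart(4)):
        print(f"instr {i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Four instructions need 5 + 4 - 1 = 8 cycles in this model, so the fourth instruction finishes one cycle after the last cycle drawn on the slide.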


Basic Pipelined Processor

Pipeline example: lw

[Figures: five slides showing lw in the pipelined datapath at each stage in turn: IF, ID, EX, MEM, WB]

Single-Cycle vs. Pipelined Execution

Non-Pipelined (time axis in ps, 0 to 1800)
  lw $1, 100($0): Instruction Fetch, REG RD, ALU, MEM, REG WR
  lw $2, 200($0): same five steps, starting 800ps after the first lw
  lw $3, 300($0): starting 800ps after the second lw
  800ps per instruction

Pipelined (time axis in ps, 0 to 1600)
  lw $1, 100($0): Instruction Fetch, REG RD, ALU, MEM, REG WR
  lw $2, 200($0): same five steps, starting 200ps after the first lw
  lw $3, 300($0): same five steps, starting 200ps after the second lw
  200ps per stage
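
The total time for the three lw instructions can be worked out with a minimal Python sketch. The per-stage latencies below are an assumption chosen to add up to the 800ps shown for one non-pipelined instruction; the slide itself gives only the 800ps total and the 200ps pipelined slot.

```python
# Assumed per-stage latencies in ps (instruction fetch, register read, ALU,
# memory access, register write). Only their 800 ps sum and the 200 ps
# pipelined slot appear in the source.
STAGE_PS = [200, 100, 200, 200, 100]


def single_cycle_total(n: int) -> int:
    """Each instruction takes one long cycle that covers all five steps."""
    return n * sum(STAGE_PS)


def pipelined_total(n: int) -> int:
    """Every stage gets a slot as long as the slowest stage."""
    slot = max(STAGE_PS)
    return (len(STAGE_PS) + n - 1) * slot


if __name__ == "__main__":
    print(single_cycle_total(3))  # 2400 ps
    print(pipelined_total(3))     # 1400 ps
```

Under these assumptions a single pipelined lw takes 5 x 200ps = 1000ps, longer than the 800ps single-cycle instruction, which matches the earlier point that pipelining improves throughput rather than latency.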

Speedup
• Consider the unpipelined processor introduced previously. Assume that
it has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and
branches and 5 cycles for memory operations. Assume that the relative
frequencies of these operations are 40%, 20%, and 40%, respectively.
Suppose that, due to clock skew and setup, pipelining the processor
adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how
much speedup in the instruction execution rate will we gain from a
pipeline?

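Average instruction execution time (unpipelined)
= 1 ns * ((40% + 20%) * 4 + 40% * 5)
= 4.4 ns

Average instruction execution time (pipelined, one instruction per cycle)
= 1 ns + 0.2 ns overhead
= 1.2 ns

Speedup from pipeline
= Average instruction time unpipelined / Average instruction time pipelined
= 4.4 ns / 1.2 ns ≈ 3.7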
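
The same calculation as a short Python sketch, using the numbers from the problem statement. The variable and function names are illustrative, and the pipelined processor is assumed to complete one instruction per clock cycle.

```python
def average_cycles(mix):
    """mix: list of (fraction of instructions, cycles per instruction)."""
    return sum(fraction * cycles for fraction, cycles in mix)


# ALU operations, branches, memory operations
mix = [(0.40, 4), (0.20, 4), (0.40, 5)]

clock_ns = 1.0
overhead_ns = 0.2

unpipelined_ns = clock_ns * average_cycles(mix)  # about 4.4 ns
pipelined_ns = clock_ns + overhead_ns            # 1.2 ns, one instruction per cycle
speedup = unpipelined_ns / pipelined_ns

if __name__ == "__main__":
    print(round(unpipelined_ns, 1), round(speedup, 1))  # 4.4 3.7
```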

Comments about Pipelining

• The good news
– Multiple instructions are being processed at the same time
– This works because stages are isolated by registers
– Best-case speedup of N, the number of pipeline stages
• The bad news
– Instructions interfere with each other: hazards
» Example: different instructions may need the same
piece of hardware (e.g., memory) in the same clock cycle
» Example: an instruction may require a result produced
by an earlier instruction that is not yet complete
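
The first example, two instructions needing memory in the same cycle, can be sketched in Python. This is a simplified model under stated assumptions: an ideal 5-stage pipeline, a single memory shared by instruction fetch and data access, and only lw and sw touching data memory. The function name and every instruction other than lw are hypothetical.

```python
def memory_conflicts(program: list[str]) -> list[tuple[int, int, int]]:
    """Return (cycle, index in MEM, index in IF) for each cycle in which a
    load/store reaches MEM while a later instruction is being fetched.

    Assumes an ideal 5-stage pipeline with ONE memory shared by instruction
    fetch and data access, and instruction i entering IF in cycle i (0-based).
    """
    mem_stage = 3  # IF=0, ID=1, EX=2, MEM=3, WB=4
    conflicts = []
    for i, instr in enumerate(program):
        opcode = instr.split()[0]
        fetched = i + mem_stage  # instruction in IF while instruction i is in MEM
        if opcode in ("lw", "sw") and fetched < len(program):
            conflicts.append((i + mem_stage, i, fetched))
    return conflicts


if __name__ == "__main__":
    program = [
        "lw $1, 100($0)",
        "add $2, $3, $4",
        "sub $5, $6, $7",
        "and $8, $9, $10",
    ]
    print(memory_conflicts(program))  # [(3, 0, 3)]
```

In cycle 3 the lw is reading data memory while the fourth instruction is being fetched, so with one shared memory only one of them can proceed.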

Pipeline Hazards

• Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle
– Structural hazards: two different instructions use the same hardware
in the same cycle
– Data hazards: an instruction depends on the result of a prior
instruction still in the pipeline
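
A data hazard of the kind described above can be detected with a simple scan over the program. The Python sketch below is illustrative only: instructions are represented as hypothetical (destination, sources) tuples, and the window of 2 assumes a 5-stage pipeline without forwarding in which the register file is written before it is read within a cycle.

```python
def raw_hazards(program, window=2):
    """Return (producer index, consumer index, register) triples.

    program: list of (destination register or None, [source registers]).
    window: how many following instructions can still read a stale value;
            2 is an assumption for a 5-stage pipeline without forwarding.
    """
    hazards = []
    for i, (dest, _) in enumerate(program):
        if dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][1]:
                hazards.append((i, j, dest))
            if program[j][0] == dest:
                break  # a newer write to dest hides instruction i's result
    return hazards


if __name__ == "__main__":
    program = [
        ("$1", ["$0"]),        # lw  $1, 100($0)
        ("$4", ["$1", "$5"]),  # add $4, $1, $5  (hypothetical)
        ("$6", ["$1", "$7"]),  # sub $6, $1, $7  (hypothetical)
        ("$8", ["$1", "$9"]),  # or  $8, $1, $9  (hypothetical)
    ]
    print(raw_hazards(program))  # [(0, 1, '$1'), (0, 2, '$1')]
```

The fourth instruction also reads $1, but under the stated assumptions the lw has already written the register by the time it is read, so no hazard is reported for it.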

Summary - Pipelining Overview

• Pipelining increases throughput (but not latency)
• Hazards limit performance
– Structural hazards
– Data hazards

Pipelining Outline

• Introduction
– Defining Pipelining
– Pipelining Instructions
• Hazards
– Structural hazards 
– Data Hazards
• Performance
