5 Pipelined Processor: Temporal Overlapping of Processing, Assembly Line

This document discusses the principles and design of pipelined processors. It covers topics such as the basic concept of pipelining, pipeline performance measures, data and control dependencies that can cause stalls, and techniques to improve performance like bypassing and multiple-operation instructions. It also provides examples of integer and load/store pipeline designs for RISC and CISC processors, including stages, latency, and techniques to reduce load-use delay.


5 Pipelined Processor

temporal overlapping of processing, assembly line

5.1 Basic concept
5.2 Design space of pipelines
5.3 Overview of pipelined instruction processing
5.4 Pipelined execution of integer and Boolean instructions
5.5 Pipelined processing of loads and stores


5.1.1 Principle of pipelining

Principle of pipelining e.g.

Processing of a sequence of instructions using a basic pipeline

Pipelined and unpipelined processing

5.1.2 General structure of pipelines

Structure and pipelined operation of the FX unit of the IBM Power1

Pipeline Performance Measures


Cycle time: tc
is determined by the worst-case processing time of the longest stage

Repetition Rate: R
the shortest possible time interval between subsequent independent instructions in the pipeline

Performance potential of a pipeline: P = 1/(R * tc)
e.g. PowerPC 603, FP double-precision multiply: R = 2, tc = 12 nsec
P = 1/(R * tc) = 1/(2 * 12 nsec) ≈ 41.7 MFLOPS
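To make the arithmetic explicit, a minimal sketch in Python (the function name is illustrative, not from the slides):

def performance_potential(R_cycles, tc_ns):
    # P = 1 / (R * tc), returned in operations per second
    return 1.0 / (R_cycles * tc_ns * 1e-9)

# PowerPC 603 FP double-precision multiply: R = 2 cycles, tc = 12 nsec
print(performance_potential(2, 12) / 1e6)   # ~41.7 MFLOPS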

Performance with RAW-dependent instructions
Latency:
specifies the amount of time that the result of a particular instruction takes to become available in the pipeline for a subsequent dependent instruction.

Define-use latency (10 to 100 cycles)


mul r1, r2, r3
add r5, r1, r4

Load-use latency (1 to 3 cycles)


load r1, x
add r5, r1, r2

Stall: if the latency is n cycles, the immediately following RAW-dependent instruction has to be stalled in the pipeline for n-1 cycles
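A minimal sketch of the stall rule above, assuming a simple in-order pipeline (names are illustrative):

def stall_cycles(latency, issue_gap=1):
    # With a latency of n cycles, a RAW-dependent instruction issued
    # issue_gap cycles after the producer stalls for n - issue_gap cycles
    # (never less than zero).
    return max(0, latency - issue_gap)

print(stall_cycles(3))   # immediately following instruction: 2 stall cycles
print(stall_cycles(1))   # latency of 1 cycle: no stall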

Improve Performance
Multiple-operation instructions

HP PA 7100
FMPYADD RM1, RM2, RM3, RA1, RA2
RM3 ← RM1 * RM2; RA2 ← RA1 + RA2

PowerPC
FMA for performing (A*C) + B
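The semantics of these multiple-operation instructions, written out as a short Python sketch that mirrors the register transfers above (an illustration only, not the processors' actual datapaths):

# HP PA 7100 FMPYADD: one instruction does an FP multiply and an FP add
def fmpyadd(rm1, rm2, ra1, ra2):
    rm3 = rm1 * rm2          # RM3 <- RM1 * RM2
    ra2 = ra1 + ra2          # RA2 <- RA1 + RA2
    return rm3, ra2

# PowerPC FMA: fused multiply-add, (A * C) + B in a single instruction
def fma(a, c, b):
    return a * c + b

print(fmpyadd(2.0, 3.0, 1.0, 4.0))   # (6.0, 5.0)
print(fma(2.0, 3.0, 1.0))            # 7.0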

5.1.4 Application scenarios of pipelines

5.2 Design space of pipelines


Key aspects of the design space of pipelines

5.2.2 Basic layout of a pipeline


Design space of the overall stage layout

Increasing parallelism by raising the number of pipeline stages

Eight-stage pipeline

Problems arise with more stages


data and control dependencies occur more frequently
the pipeline stalls while waiting for data and has to be reloaded in case of a branch

subtasks become less balanced (in execution time)


cycle time is determined by the worst-case processing time of the longest stage (see the sketch below)

In most cases
5-10 stages
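A small sketch of why unbalanced stages hurt: the cycle time is set by the slowest stage, so the faster stages sit partly idle (the stage latencies below are made-up numbers for illustration):

# Illustrative stage latencies in ns for a deep, unbalanced pipeline
stage_latencies_ns = [3.0, 4.0, 9.0, 4.0, 5.0]

cycle_time_ns = max(stage_latencies_ns)        # 9.0 ns: the longest stage
useful_work_ns = sum(stage_latencies_ns)       # 25.0 ns of actual work
slot_time_ns = cycle_time_ns * len(stage_latencies_ns)   # 45.0 ns of stage slots

print(cycle_time_ns, useful_work_ns / slot_time_ns)   # tc = 9.0, utilization ~0.56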

Pipelines e.g. DEC 21064

Layout of the stage sequence

Bypasses (data forwarding to resolve RAW dependencies)


Unless special arrangements are made, the result of an operation instruction is written into the register file or into memory, and is then fetched from there as a source operand.

Principle of bypassing in define-use and load-use conflicts
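A minimal sketch of the bypassing idea under a simplified model: an operand is taken from the producing stage's output latch when the destination matches, and from the register file otherwise (all names and values are illustrative):

def read_operand(reg, regfile, bypass_tag=None, bypass_value=None):
    # With bypassing, a result that has not yet been written back can be
    # forwarded straight from the execute/load stage to a dependent instruction.
    if bypass_tag == reg:
        return bypass_value      # forwarded result (define-use or load-use)
    return regfile[reg]          # normal register file read

regfile = {"r2": 7, "r3": 3, "r4": 5}
# mul r1, r2, r3 has just produced 21; add r5, r1, r4 reads it via the bypass:
r1 = read_operand("r1", regfile, bypass_tag="r1", bypass_value=21)
print(r1 + regfile["r4"])        # 26, without waiting for the writeback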

Possibilities for the timing of pipeline operation

5.3 Overview: pipelined instruction processing

Declaration of logical pipelines: e.g. PowerPC 601

Detailed specification of each of the pipelines: e.g.

Implementation of instruction pipelines (vs. logical)

Layout of physical pipelines

Multiplicity of pipelines

Preserving sequential consistency

Preserving sequential consistency, implementation e.g.

Preserving sequential consistency, e.g.

Case studies: Pentium


Logical layout of the Pentium's pipelines

Case studies: PowerPC 604

5.4 Pipelined execution of integer and Boolean instructions (FX)

RISC pipelines 4 or 5 stages

Traditional FX pipeline of RISC processors

Logical to physical: e.g. the PowerPC 601 using a single universal FX unit

Layout with 5 stages, e.g.: FX and L/S pipelines in the MIPS R4200
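As a generic illustration of such a short RISC FX pipeline (using textbook-style fetch, decode, execute and write-back stages rather than any specific processor's), a tiny sketch that prints which stage each instruction occupies in each cycle:

STAGES = ["F ", "D ", "E ", "WB"]    # generic 4-stage FX pipeline

def pipeline_chart(n_instructions):
    # With no stalls, instruction i is in stage s during cycle i + s,
    # so successive instructions overlap in time (the assembly-line effect).
    width = n_instructions + len(STAGES) - 1
    for i in range(n_instructions):
        row = ["  "] * width
        for s, name in enumerate(STAGES):
            row[i + s] = name
        print("i%d: %s" % (i, " ".join(row)))

pipeline_chart(4)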

CISC pipeline 6 or 5 stages

Traditional CISC pipeline:


The execution of register-memory instructions

CISC pipeline:
Execution of register-register and load/store instructions

CISC pipeline with 5 stages: recycling the E/C stage

Implementation of FX units: how many

Trends in increasing performance

5.5 Pipelined processing of loads and stores

5.5.3 Load-use delay: RISC pipelines

Load-use delay: MIPS

Load-use delay: CISC

Handling Load-use delay


Basic approaches to cope with a load-use delay

Remove Load-use delay

Remove load-use delay: bringing forward the calculation of the virtual address (for slow caches)
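One of the basic approaches above is to let the compiler hide the load-use delay by moving an independent instruction between the load and its use; a minimal scheduling sketch under that assumption (the tuple encoding and helper name are made up for illustration):

def can_fill_delay_slot(load, candidate):
    # The candidate may be hoisted into the load's delay slot if it does not
    # read the loaded register and does not overwrite anything the pair uses.
    load_dest, load_srcs = load
    cand_dest, cand_srcs = candidate
    return (load_dest not in cand_srcs and cand_dest != load_dest
            and cand_dest not in load_srcs)

# (dest, sources):  load r1, x ; add r5, r1, r2 ; sub r6, r7, r8
load_insn = ("r1", ["x"])
use_insn = ("r5", ["r1", "r2"])
indep = ("r6", ["r7", "r8"])

schedule = [load_insn, use_insn, indep]
if can_fill_delay_slot(load_insn, indep):
    schedule = [load_insn, indep, use_insn]   # one cycle of load-use delay hidden
print(schedule)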
