Pipelining

Pipelining is a technique that allows multiple stages of instruction execution to occur simultaneously, improving efficiency and throughput in processors. It involves breaking down processes into sub-operations executed in dedicated segments, with specific stages for fetching, decoding, executing, accessing memory, and writing back results. The document also discusses the concepts of speedup, efficiency, hazards, and vector processing in pipelined architectures.

Uploaded by Manoj Kumar Sain

Pipelining

 The term Pipelining refers to a technique of decomposing a sequential process into sub-operations, with each sub-operation being executed in a dedicated segment that operates concurrently with all other segments.

 It allows different stages of instruction execution to operate simultaneously, much like an assembly line in a factory.
Example

Ai * Bi + Ci for i = 1, 2, 3, ......., 7

 The operation to be performed on the numbers is decomposed into sub-operations, with each sub-operation implemented in a segment within a pipeline.
 The sub-operations performed in each segment of the pipeline are defined as:

R1 ← Ai, R2 ← Bi          Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci     Multiply, and input Ci
R5 ← R3 + R4              Add Ci to product
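The three-segment example above can be simulated in Python (a minimal sketch I wrote for illustration; the function name and the latch representation are my own — each inter-stage register holds the item index and the values latched at the previous clock edge):

```python
def pipeline_multiply_add(A, B, C):
    """Simulate the three-segment pipeline computing Ai * Bi + Ci."""
    n = len(A)
    out = [None] * n
    seg1 = seg2 = None   # inter-stage registers: (index, latched values)
    fetched, cycle = 0, 0
    while any(v is None for v in out):
        cycle += 1
        # Segment 3: R5 <- R3 + R4 (consumes last cycle's segment-2 latch)
        if seg2 is not None:
            i, (r3, r4) = seg2
            out[i] = r3 + r4
        # Segment 2: R3 <- R1 * R2, R4 <- Ci
        if seg1 is not None:
            i, (r1, r2) = seg1
            seg2 = (i, (r1 * r2, C[i]))
        else:
            seg2 = None
        # Segment 1: R1 <- Ai, R2 <- Bi
        if fetched < n:
            seg1 = (fetched, (A[fetched], B[fetched]))
            fetched += 1
        else:
            seg1 = None
    return out, cycle

out, cycles = pipeline_multiply_add([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(out, cycles)   # [11, 18, 27] 5  (k + n - 1 = 3 + 3 - 1 cycles)
```

Note how the first result emerges after k = 3 cycles (the pipeline fill time) and the last after k + n − 1 cycles, which is exactly the pipelined-time formula derived later in these slides.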
Stages of Instruction Execution:
In most processors, executing an instruction involves several stages:
•Fetch (F): The instruction is fetched from memory.
•Decode (D): The instruction is decoded to understand the operation and operands.
•Execute (E): The operation is performed (e.g., addition, subtraction).
•Memory Access (M): Data is read from or written to memory, if required.
•Write Back (WB): The result of the operation is written back to the register.
In a non-pipelined processor, these stages are performed sequentially for each instruction.
In a pipelined processor, these stages overlap.
Total Execution Time in Pipelined Processor

For a pipeline with k stages and n instructions:

Total Time (Pipelined) = k + (n - 1)

Explanation:
- First instruction takes k cycles to fill the pipeline.
- Each subsequent instruction takes 1 cycle to complete.
- Total time = k cycles for the first instruction + (n - 1) cycles for the
remaining instructions.
Non-Pipelined Execution Time

In a non-pipelined system:

Total Time (Non-Pipelined) = n * k

Explanation:
- Each instruction takes k cycles to complete (each stage sequentially).
Speedup Formula

Speedup is the ratio of execution time in the non-pipelined system to that in the
pipelined system.

Speedup = (Total Time Non-Pipelined) / (Total Time Pipelined)

Substituting formulas:

Speedup = (n * k) / (k + (n - 1))
Throughput in Pipelined Processor

Throughput is the number of instructions completed per unit time.

Throughput = n / (k + (n - 1))

For large n, throughput approaches 1 instruction per cycle.


Efficiency of Pipeline

Efficiency measures how well the pipeline is utilized.

Efficiency = (Speedup) / (Number of Pipeline Stages)

Substituting formulas:

Efficiency = (n * k) / (k * (k + (n - 1)))

For large n, efficiency approaches 1 (100%).
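The formulas above can be checked with a few helper functions (a quick sketch; the function names are mine):

```python
def pipelined_time(k, n):
    """Cycles for n instructions on a k-stage pipeline: k + (n - 1)."""
    return k + (n - 1)

def speedup(k, n):
    """Non-pipelined time (n * k) over pipelined time."""
    return (n * k) / pipelined_time(k, n)

def throughput(k, n):
    """Instructions completed per cycle."""
    return n / pipelined_time(k, n)

def efficiency(k, n):
    """Fraction of stage-cycles doing useful work."""
    return speedup(k, n) / k

# For large n, throughput and efficiency both approach 1:
print(round(throughput(5, 10_000), 4), round(efficiency(5, 10_000), 4))   # 0.9996 0.9996
```

Algebraically, efficiency simplifies to n / (k + n − 1), which is the same expression as throughput — both tend to 1 as n grows.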


Example Calculation

Given:
- k = 5 stages
- n = 10 instructions

Non-Pipelined Execution Time:


Total Time (Non-Pipelined) = 10 * 5 = 50 cycles

Pipelined Execution Time:


Total Time (Pipelined) = 5 + (10 - 1) = 14 cycles
Speedup:
Speedup = 50 / 14 ≈ 3.57

Throughput:
Throughput = 10 / (5 + (10 - 1)) = 10 / 14 ≈ 0.71 instructions per cycle
8 instructions and 5 stages

•Number of stages (k) = 5
•Number of instructions (n) = 8

1. Total Execution Time (Pipelined)
• Formula: Total Time (Pipelined) = k + (n - 1)
• Given: k = 5, n = 8
• Calculation: Total Time (Pipelined) = 5 + (8 - 1) = 5 + 7 = 12 cycles

2. Total Execution Time (Non-Pipelined)
• Formula: Total Time (Non-Pipelined) = n × k
• Given: k = 5, n = 8
• Calculation: Total Time (Non-Pipelined) = 8 × 5 = 40 cycles

3. Speedup
• Formula: Speedup = Total Time (Non-Pipelined) / Total Time (Pipelined)
• Calculation: Speedup = 40 / 12 ≈ 3.33

4. Efficiency
• Formula: Efficiency = Speedup / Number of Pipeline Stages (k)
• Given: Speedup = 3.33, k = 5
• Calculation: Efficiency = 3.33 / 5 ≈ 0.67 or 67%
Stage delay in pipelining
refers to the time it takes for a specific pipeline stage to complete its operation. Each
stage in a pipeline performs a distinct part of the instruction processing (e.g., instruction
fetch, decode, execution, memory access, write-back), and the delay of each stage
impacts the overall performance of the pipeline.
 The throughput (number of instructions completed per unit time) is reduced because the slowest stage bottlenecks the entire pipeline.
 Efficiency decreases as faster stages spend part of their time waiting for slower stages.

Cycle Time = Maximum (Stage Delay + Register Delay) across all stages
Role of Pipeline Registers
Pipeline registers (also called inter-stage buffers) are placed between consecutive pipeline stages to:
1. Synchronize Data Flow:
   • Pipeline registers store the intermediate results from one stage and pass them to the next stage in the subsequent clock cycle.
   • This synchronization ensures that even if stages have different delays, the pipeline operates in a coordinated manner.
2. Hold Results During Waiting Periods:
   • Faster stages write their outputs into the pipeline registers and wait for the clock edge determined by the slowest stage.
   • This prevents data from being overwritten or lost while waiting for slower stages to complete their tasks.
A 4-stage pipeline has stage delays of 150, 120, 160 and 140 nanoseconds, respectively. The registers used between the stages have a delay of 5 nanoseconds each. Assuming a constant clocking rate, the total time taken to process 1000 data items on this pipeline will be:

Ans: 165.5 microseconds
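The arithmetic behind this answer can be sketched as follows (assuming, per the cycle-time formula above, that the clock period equals the slowest stage delay plus one register delay):

```python
stage_delays = [150, 120, 160, 140]           # ns
register_delay = 5                            # ns
cycle = max(stage_delays) + register_delay    # slowest stage sets the clock: 165 ns
k, n = len(stage_delays), 1000
total_ns = (k + (n - 1)) * cycle              # (4 + 999) * 165
print(total_ns / 1000)                        # 165.495 -> about 165.5 microseconds
```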


Consider a non-pipelined processor with a clock rate of 2.5 GHz and an average of four cycles per instruction. The same processor is upgraded to a pipelined processor with five stages; but due to the internal pipeline delay, the clock speed is reduced to 2 GHz. Assume that there are no stalls in the pipeline. The speedup achieved by this pipelined processor is:
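The slides leave this one unanswered; a sketch of the usual calculation (per-instruction time = CPI / clock rate, with the pipelined CPI assumed to be 1 since there are no stalls):

```python
time_old = 4 / 2.5e9                   # non-pipelined: CPI 4 at 2.5 GHz -> 1.6 ns/instruction
time_new = 1 / 2.0e9                   # pipelined: CPI 1 at 2 GHz -> 0.5 ns/instruction
print(round(time_old / time_new, 2))   # 3.2
```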
Arithmetic Pipeline

An arithmetic pipeline is a specialized form of pipelining used for performing arithmetic operations (e.g., addition, subtraction, multiplication, and division). It splits a complex arithmetic computation into smaller, simpler stages that are executed in parallel, improving the throughput and efficiency of the operation.

An arithmetic pipeline divides an arithmetic problem into various sub-problems for execution in various pipeline segments.
Floating point addition using arithmetic pipeline:
The following sub-operations are performed in this case:
1. Compare the exponents (here the difference is 3 - 2 = 1).
2. Align the mantissas: the mantissa associated with the smaller exponent must be shifted to the right.
3. Add or subtract the mantissas.
4. Normalize the result.

Example: X = 0.9504 * 10^3 and Y = 0.8200 * 10^2
After alignment: Y = 0.08200 * 10^3
Z = X + Y = 1.0324 * 10^3 = 0.10324 * 10^4

Exercise: repeat for X = 0.3214 * 10^3 and Y = 0.4500 * 10^2.
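The four sub-operations can be sketched in Python (my own illustration using base-10 mantissa/exponent pairs, not real hardware floating point; the function name is mine):

```python
def fp_add(mx, ex, my, ey):
    """Add two numbers given as (mantissa, exponent) pairs in base 10."""
    # 1. Compare the exponents; keep the larger one in (mx, ex)
    if ex < ey:
        mx, ex, my, ey = my, ey, mx, ex
    # 2. Align: shift the mantissa of the smaller exponent to the right
    my = my / 10 ** (ex - ey)
    # 3. Add the mantissas
    mz, ez = mx + my, ex
    # 4. Normalize (this sketch only handles mantissa overflow past 1.0)
    while mz >= 1.0:
        mz, ez = mz / 10, ez + 1
    return mz, ez

m, e = fp_add(0.9504, 3, 0.8200, 2)
print(round(m, 5), e)   # 0.10324 4, i.e. Z = 0.10324 * 10^4
```

In an arithmetic pipeline each numbered step is one segment, so four additions at different steps can be in flight at once.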
Instruction Pipeline
 In an instruction pipeline, the instruction cycle is divided into multiple stages.
 Each stage of the pipeline is responsible for a specific task in the overall instruction execution
process.
 The key benefit of pipelining is increased throughput—the ability to process more
instructions per unit of time.
Basic Pipeline Stages
1. Fetch (IF - Instruction Fetch)
   • The processor fetches the instruction from memory.
   • The Program Counter (PC) holds the address of the next instruction to be fetched.
   • The instruction is retrieved from memory and placed in the instruction register (IR).
2. Decode (ID - Instruction Decode)
   • The instruction in the IR is decoded to determine which operation is to be performed.
   • The registers are read (if needed), and the instruction's operands are identified.
   • The control signals are generated for the execution phase, specifying the operations to be performed.
3. Execute (EX - Execute)
   • The operation specified by the instruction is performed. This can involve:
     • Arithmetic or logical operations (e.g., addition, subtraction).
     • Address calculations (for memory operations).
     • Decision making (for branch instructions).
   • The Arithmetic Logic Unit (ALU) is often used in this stage.
4. Memory Access (MEM - Memory Access)
   • If the instruction involves memory (e.g., load or store), this stage accesses the memory.
     • Load: The data is read from memory and placed in a register.
     • Store: Data is written from a register to memory.
5. Write-back (WB - Write Back)
   • The result of the instruction (e.g., data from the ALU or memory) is written back into the destination register.
   • This is the final stage of the instruction cycle.
Example
•I1: ADD R1, R2, R3 (Add R2 and R3, store in R1)
•I2: SUB R4, R5, R6 (Subtract R6 from R5, store in R4)
•I3: LOAD R7, 100(R1) (Load data from memory address (R1+100) into R7)
•I4: MUL R8, R9, R10 (Multiply R9 and R10, store in R8)
•I5: STORE 200(R11), R12 (Store value from R12 into memory address (R11+200))
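The overlap of these five instructions can be visualized with a short helper (my own sketch, assuming an ideal pipeline with no stalls):

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def schedule(num_instructions):
    """One row per instruction; column j shows its stage in cycle j + 1."""
    rows = []
    for i in range(num_instructions):
        # Instruction i enters IF in cycle i + 1, then advances one stage per cycle
        rows.append(["--"] * i + STAGES + ["--"] * (num_instructions - 1 - i))
    return rows

for i, row in enumerate(schedule(5), start=1):
    print(f"I{i}: " + "  ".join(f"{s:>3}" for s in row))
```

The printed diagram is 9 columns wide, matching the k + (n − 1) = 5 + 4 = 9 cycles the formula predicts for five instructions.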
Vector Processing in Pipelining

 Vector Processing refers to the processing of data in parallel, using vector registers and vector instructions to perform operations on entire arrays or vectors of data rather than individual scalar values.

 When integrated into a pipelined architecture, vector processing enhances performance by allowing multiple data elements to be processed simultaneously in different stages of the pipeline.
Scalar Data:
•Definition: A scalar is a single data value or a single element of data. It represents a single
quantity or number, typically a single integer, floating-point value, or character. Scalars are
the most basic form of data.
•Example:
•A single integer: 5
•A single floating-point value: 3.14
•A single character: 'A'
Key Characteristics:
•It represents one value at a time.
•Operations like addition, subtraction, multiplication, etc., are typically applied to scalars in
simple arithmetic or logical operations.
Vector Data:
•Definition: A vector is an ordered collection or sequence of multiple data values, typically of the
same type. It is an array or list that can hold several scalar values.
•Example:
•A vector of integers: A = [1, 2, 3, 4]
•A vector of floating-point numbers: B = [1.1, 2.2, 3.3, 4.4]
Key Characteristics:
•It holds multiple values at once.
•It allows operations to be performed on entire sets of data in a parallel manner, with one
operation affecting multiple elements at once.
•Vectors are typically used to represent data in higher dimensions (e.g., 2D or 3D coordinates,
time series data, or arrays).
Key Concepts in Vector Processing Pipelines:
1. Vector Registers:
   • A vector register is a storage location that can hold a collection of data elements (usually integers or floating-point values) that represent a vector. Each data element in the vector can be processed in parallel.
2. Vector Instructions:
   • A vector instruction operates on an entire vector of data, performing an operation (like addition or multiplication) on each corresponding element in two or more vectors.
Pipeline Stages:
•In a vector processor pipeline, the stages of the pipeline are designed to handle vector
operations. For example, a basic pipeline might have the following stages:
• IF (Instruction Fetch): Fetching vector instructions from memory.
• ID (Instruction Decode): Decoding the vector instructions, identifying the operation,
and determining the operands (vector registers).
• EX (Execution): Performing the vector operation (e.g., adding or multiplying vector
elements).
• MEM (Memory Access): Accessing memory for read/write operations.
• WB (Writeback): Writing the result back to the vector register.
Handling Vector Length:

1. Short Vectors: If the vector length is shorter than the vector register size, there will be unused entries in the vector register.
2. Long Vectors: If the vector is too large to fit into the vector register, the processor must divide the vector into smaller chunks and process them sequentially, with some possible pipeline stalls for loading new chunks.

Vector addition example: two vectors A = [1, 2, 3, 4] and B = [5, 6, 7, 8] are added element-wise, and the result is stored in C = [C[1], C[2], C[3], C[4]].
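Element-wise addition with chunking can be sketched as follows (pure Python stands in for the hardware's lane-parallel execution; the register size of 4 is an assumption of mine):

```python
VECTOR_REGISTER_SIZE = 4   # elements per vector register (assumed)

def vector_add(a, b):
    """Process long vectors in register-sized chunks, as described above."""
    c = []
    for start in range(0, len(a), VECTOR_REGISTER_SIZE):
        # Load one chunk of each operand into "vector registers"
        va = a[start:start + VECTOR_REGISTER_SIZE]
        vb = b[start:start + VECTOR_REGISTER_SIZE]
        # One vector instruction would add all lanes of the chunk at once
        c.extend(x + y for x, y in zip(va, vb))
    return c

print(vector_add([1, 2, 3, 4], [5, 6, 7, 8]))   # [6, 8, 10, 12]
```

A vector longer than 4 elements simply takes additional loop iterations — the sequential chunk processing the "Long Vectors" case describes.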
Vector Processing for Matrix Inner Product
Vectorized Computation (SIMD or SIMT Approach):

In vectorized processing, instead of performing these operations serially, vector processors allow you to:
1. Load the elements of the row of A and the column of B into vector registers.
2. Perform the element-wise multiplication of corresponding elements in parallel.
3. Compute the sum of the products in parallel and store the result.
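The three steps can be sketched for one row-by-column inner product (plain Python for clarity; a real SIMD unit would perform the multiplies across lanes in parallel, and the function name is mine):

```python
def inner_product(row, col):
    # Steps 1-2: "load" the operands and multiply corresponding elements
    products = [x * y for x, y in zip(row, col)]
    # Step 3: reduce the partial products to a single sum
    return sum(products)

print(inner_product([1, 2, 3], [4, 5, 6]))   # 1*4 + 2*5 + 3*6 = 32
```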
Hazards

Hazards are issues that arise in pipelined instruction execution and can lead to incorrect behavior or reduced performance. Hazards occur because of dependencies or conflicts between instructions as they execute concurrently in a pipeline.

Types of Hazards
1. Data Hazard
   1.1 RAW (Read After Write)
   1.2 WAR (Write After Read)
   1.3 WAW (Write After Write)
2. Structural Hazard
3. Control Hazard
Data Hazards
Data hazards occur when instructions that depend on the results of previous instructions are
executed concurrently, causing incorrect results.
Read After Write (RAW) - True Dependency:
•An instruction depends on the result of a previous instruction.
I1: R2 ← R2 + R3
I2: R5 ← R2 + R4
I2 depends on the result of I1 (it reads R2 before I1 has written it back).
Write After Read (WAR) - Anti-dependency:
•An instruction writes to a register that a previous instruction reads from.

Write After Write (WAW) - Output Dependency:
•Two instructions write to the same register, and the order of writes matters.
Solutions:

•Forwarding/Bypassing: Use the output of a stage directly in subsequent stages instead of waiting for it to be written back to the register.
•Pipeline Stalling: Insert NOP (no-operation) instructions or stalls until the hazard is resolved.
•Out-of-Order Execution: Reorder instructions to avoid dependencies.
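A small sketch of RAW-hazard detection — the check that forwarding or stalling hardware must perform. The tuple instruction format (dest, src1, src2) and the 3-deep lookback window are assumptions of mine:

```python
def raw_hazards(instructions, pipeline_depth=3):
    """Return (consumer_index, producer_index) pairs with a RAW dependency."""
    hazards = []
    for i, (_, src1, src2) in enumerate(instructions):
        # Look back at predecessors whose result is not yet written back
        for j in range(max(0, i - pipeline_depth + 1), i):
            dest = instructions[j][0]
            if dest in (src1, src2):
                hazards.append((i, j))
    return hazards

# I1: R2 <- R2 + R3 ; I2: R5 <- R2 + R4  -> I2 reads R2 written by I1
prog = [("R2", "R2", "R3"), ("R5", "R2", "R4")]
print(raw_hazards(prog))   # [(1, 0)]
```

Each reported pair is a point where hardware must either forward the value from a pipeline latch or stall the consumer.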


Structural Hazards
Structural hazards occur when hardware resources required for an instruction are not available
because they are being used by another instruction. This happens if the hardware design does not
provide enough resources to execute multiple instructions concurrently.
Example:
•If a single memory module is shared between instruction fetch (IF) and data memory access
(MEM) stages, both cannot access memory simultaneously, leading to a stall.
Solutions:
•Use duplicated hardware resources (e.g., separate instruction and data memories, also known
as Harvard architecture).
•Use stalling: The pipeline pauses until the resource becomes available.
Control Hazards
Control hazards arise from branch and jump instructions, which affect the flow of instructions in the pipeline. When the processor does not yet know which path a branch will take, it may fetch and execute incorrect instructions.
Solution
•Branch Prediction: Predict the outcome of branches (taken or not taken).
•Use dynamic predictors such as 1-bit or 2-bit predictors, or global history-based predictors.
•Stall the pipeline, or use a prediction circuit that resolves the branch before the next fetch.
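The 2-bit predictor mentioned above can be sketched with a saturating counter (a single counter for illustration; real predictors keep a table of such counters indexed by branch address):

```python
class TwoBitPredictor:
    """2-bit saturating counter: two wrong predictions are needed to flip."""

    def __init__(self):
        self.state = 0          # 0-1 predict not taken, 2-3 predict taken

    def predict(self):
        return self.state >= 2  # True = predict taken

    def update(self, taken):
        # Saturate at the ends of the 0..3 range
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
hits = 0
for outcome in [True, True, True, False, True, True]:   # loop-like branch pattern
    hits += p.predict() == outcome
    p.update(outcome)
print(hits)   # 3 correct out of 6 (two warm-up misses, one on the not-taken blip)
```

Because the counter saturates, a single not-taken outcome in a long taken streak does not flip the prediction — the advantage of 2-bit over 1-bit predictors for loop branches.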
Thank you.

Any Questions?
