Pipelining
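Pipelined Execution Time
In a pipelined system with k stages and n instructions:
Total time = k + (n - 1) cycles
Explanation:
- The first instruction takes k cycles to pass through and fill the pipeline.
- Each subsequent instruction completes 1 cycle after the one before it.
- Total time = k cycles for the first instruction + (n - 1) cycles for the
remaining instructions.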
Non-Pipelined Execution Time
In a non-pipelined system:
Total time = n * k cycles
Explanation:
- Each instruction takes k cycles to complete, because it passes through every stage sequentially before the next instruction starts.
Speedup Formula
Speedup is the ratio of execution time in the non-pipelined system to that in the
pipelined system.
Substituting formulas:
Speedup = (n * k) / (k + (n - 1))
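As a quick check of the speedup formula, the following Python sketch evaluates it for a growing number of instructions. The function names are illustrative and not part of the lecture material; the sketch assumes an ideal pipeline in which every stage takes exactly one cycle and no stalls occur.

```python
def pipelined_cycles(n: int, k: int) -> int:
    """Cycles to run n instructions on a k-stage pipeline (no stalls assumed)."""
    return k + (n - 1)


def non_pipelined_cycles(n: int, k: int) -> int:
    """Cycles to run n instructions when each one uses all k stages in turn."""
    return n * k


def speedup(n: int, k: int) -> float:
    """Speedup = non-pipelined time / pipelined time."""
    return non_pipelined_cycles(n, k) / pipelined_cycles(n, k)


if __name__ == "__main__":
    k = 5
    for n in (1, 10, 100, 10_000):
        print(f"n = {n:>6}: speedup = {speedup(n, k):.3f}")
```

With k = 5 the speedup is 1 for a single instruction and approaches, but never reaches, k as n grows, because the k - 1 cycles spent filling the pipeline are amortised over more instructions.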
Throughput in Pipelined Processor
Throughput = n / (k + (n - 1)) instructions per cycle
Efficiency
Efficiency = Speedup / k
Substituting formulas:
Efficiency = (n * k) / (k * (k + (n - 1))) = n / (k + (n - 1))
Given:
- k = 5 stages
- n = 10 instructions
Throughput:
Throughput = 10 / (5 + (10 - 1)) = 10 / 14 ≈ 0.71 instructions per cycle
Example: n = 8 instructions and k = 5 stages
Efficiency
• Formula: Efficiency = Speedup / Number of Pipeline Stages (k)
• Given: Speedup = (8 * 5) / (5 + (8 - 1)) = 40 / 12 ≈ 3.33, k = 5
• Calculation: Efficiency = 3.33 / 5 ≈ 0.67 or 67%
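The two worked examples (n = 10 and n = 8, both with k = 5) can be reproduced with a short Python sketch. The function name `pipeline_metrics` and the dictionary keys are hypothetical, chosen only for illustration, and the model again assumes an ideal pipeline with one cycle per stage and no stalls.

```python
def pipeline_metrics(n: int, k: int) -> dict:
    """Cycle counts, speedup, throughput and efficiency for an ideal pipeline."""
    pipelined = k + (n - 1)        # fill the pipeline, then one result per cycle
    non_pipelined = n * k          # every instruction uses all k stages in turn
    speedup = non_pipelined / pipelined
    return {
        "pipelined_cycles": pipelined,
        "non_pipelined_cycles": non_pipelined,
        "speedup": speedup,
        "throughput": n / pipelined,   # instructions per cycle
        "efficiency": speedup / k,     # fraction of the ideal speedup k
    }


if __name__ == "__main__":
    for n, k in ((10, 5), (8, 5)):
        m = pipeline_metrics(n, k)
        print(
            f"n={n}, k={k}: speedup={m['speedup']:.2f}, "
            f"throughput={m['throughput']:.2f}, efficiency={m['efficiency']:.2f}"
        )
```

Note that throughput in instructions per cycle and efficiency both reduce to n / (k + (n - 1)), so the two values coincide in this idealised model.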
Stage delay in pipelining
Stage delay refers to the time it takes for a specific pipeline stage to complete its operation. Each
stage in a pipeline performs a distinct part of the instruction processing (e.g., instruction
fetch, decode, execution, memory access, write-back), and the delay of each stage
impacts the overall performance of the pipeline.
When the stage delays are unequal, the clock period is set by the slowest stage.
The throughput (number of instructions completed per unit time) is reduced because
the slowest stage bottlenecks the entire pipeline.
Efficiency decreases as faster stages spend part of their time waiting for slower stages.
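The effect of unequal stage delays can be illustrated with a small Python sketch. The delay values below are made up for illustration, latch overhead is ignored, and the non-pipelined time is assumed to be the sum of the stage delays per instruction; none of these numbers come from the lecture.

```python
from typing import Sequence


def unbalanced_pipeline(stage_delays_ns: Sequence[float], n: int) -> dict:
    """Timing of n instructions on a pipeline whose stages have unequal delays."""
    k = len(stage_delays_ns)
    clock_ns = max(stage_delays_ns)              # slowest stage sets the clock period
    pipelined_ns = (k + (n - 1)) * clock_ns
    non_pipelined_ns = n * sum(stage_delays_ns)  # assumption: stages run back to back
    return {
        "clock_ns": clock_ns,
        "pipelined_ns": pipelined_ns,
        "non_pipelined_ns": non_pipelined_ns,
        "speedup": non_pipelined_ns / pipelined_ns,
        # Fraction of each clock period in which a stage does useful work.
        "busy_fraction": [d / clock_ns for d in stage_delays_ns],
    }


if __name__ == "__main__":
    # Hypothetical delays for IF, ID, EX, MEM, WB in nanoseconds.
    result = unbalanced_pipeline([2, 3, 5, 3, 2], n=100)
    for key, value in result.items():
        print(f"{key}: {value}")
```

In this hypothetical case the 5 ns stage forces a 5 ns clock, so the 2 ns stages are busy for only 40% of each cycle and the speedup stays well below the stage count of 5.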
An arithmetic pipeline divides an arithmetic operation into sub-operations, each of which is
executed in its own pipeline segment.
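Floating point addition using arithmetic pipeline:
The following sub-operations are performed in this case:
1. Compare the exponents.
2. Align the mantissas. The mantissa associated with the smaller exponent must be shifted to the right.
3. Add or subtract the mantissas.
4. Normalize the result.
Example: X = 0.9504 * 10^3 and Y = 0.8200 * 10^2
1. Compare the exponents: 3 - 2 = 1
2. Align the mantissas: X = 0.9504 * 10^3, Y = 0.08200 * 10^3
3. Add the mantissas: Z = X + Y = 1.0324 * 10^3
4. Normalize the result: Z = 0.10324 * 10^4
Second pair of operands: X = 0.3214 * 10^3 and Y = 0.4500 * 10^2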
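The four sub-operations can be traced in software with the Python sketch below. It runs the steps one after another for a single pair of operands, so it shows what each segment computes rather than the overlap of a real pipeline. The function name `fp_add` and the (mantissa, exponent) representation are assumptions made for illustration; `decimal.Decimal` is used so that the decimal mantissas stay exact.

```python
from decimal import Decimal


def fp_add(x_mant: Decimal, x_exp: int, y_mant: Decimal, y_exp: int):
    """Add two decimal floating-point numbers given as mantissa * 10^exponent."""
    # Segment 1: compare the exponents.
    diff = x_exp - y_exp

    # Segment 2: align the mantissas by shifting the one with the smaller
    # exponent to the right.
    if diff >= 0:
        y_mant = y_mant.scaleb(-diff)
        exp = x_exp
    else:
        x_mant = x_mant.scaleb(diff)
        exp = y_exp

    # Segment 3: add the mantissas.
    z_mant = x_mant + y_mant

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(z_mant) >= 1:
        z_mant = z_mant.scaleb(-1)
        exp += 1
    while z_mant != 0 and abs(z_mant) < Decimal("0.1"):
        z_mant = z_mant.scaleb(1)
        exp -= 1
    return z_mant, exp


if __name__ == "__main__":
    print(fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2))
    print(fp_add(Decimal("0.3214"), 3, Decimal("0.4500"), 2))
```

For the first pair the result is numerically equal to 0.10324 * 10^4; for the second pair no normalization shift is needed and the result is numerically equal to 0.3664 * 10^3.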
Instruction Pipeline
In an instruction pipeline, the instruction cycle is divided into multiple stages.
Each stage of the pipeline is responsible for a specific task in the overall instruction execution
process.
The key benefit of pipelining is increased throughput: the ability to process more
instructions per unit of time.
Basic Pipeline Stages
1. Fetch (IF - Instruction Fetch)
   1. The processor fetches the instruction from memory.
   2. The Program Counter (PC) holds the address of the next instruction to be fetched.
   3. The instruction is retrieved from memory and placed in the instruction register (IR).
2. Decode (ID - Instruction Decode)
   1. The instruction in the IR is decoded to determine which operation is to be performed.
   2. The registers are read (if needed), and the instruction's operands are identified.
   3. The control signals are generated for the execution phase, specifying the operations to be performed.
3. Execute (EX - Execute)
   1. The operation specified by the instruction is performed. This can involve:
      1. Arithmetic or logical operations (e.g., addition, subtraction).
      2. Address calculations (for memory operations).
      3. Decision making (for branch instructions).
   2. The Arithmetic Logic Unit (ALU) is often used in this stage.
4. Memory Access (MEM - Memory Access)
   1. If the instruction involves memory (e.g., load or store), this stage accesses the memory.
      1. Load: Data is read from memory and placed in a register.
      2. Store: Data is written from a register to memory.
5. Write-back (WB - Write Back)
   1. The result of the instruction (e.g., data from the ALU or memory) is written back into the destination register.
   2. This is the final stage of the instruction cycle.
Example
• I1: ADD R1, R2, R3 (Add R2 and R3, store in R1)
• I2: SUB R4, R5, R6 (Subtract R6 from R5, store in R4)
• I3: LOAD R7, 100(R1) (Load data from memory address (R1+100) into R7)
• I4: MUL R8, R9, R10 (Multiply R9 and R10, store in R8)
• I5: STORE 200(R11), R12 (Store value from R12 into memory address (R11+200))
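How these five instructions overlap in the five stages can be shown with a space-time diagram. The Python sketch below builds one for an ideal pipeline; it deliberately ignores hazards (for example, I3 reads R1, which I1 writes), and the function names are illustrative only.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def space_time_diagram(instructions):
    """Return (name, cells) rows; cells[c] is the stage occupied in cycle c + 1."""
    k = len(STAGES)
    n = len(instructions)
    total_cycles = k + (n - 1)
    rows = []
    for i, name in enumerate(instructions):
        cells = [""] * total_cycles
        # Instruction i enters IF in cycle i + 1 and advances one stage per cycle.
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage
        rows.append((name, cells))
    return rows


def render(rows):
    """Format the rows as a text table with one column per clock cycle."""
    total_cycles = len(rows[0][1])
    lines = ["      " + "".join(f"{c:>5}" for c in range(1, total_cycles + 1))]
    for name, cells in rows:
        lines.append(f"{name:<6}" + "".join(f"{cell:>5}" for cell in cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(space_time_diagram(["I1", "I2", "I3", "I4", "I5"])))
```

The table has k + (n - 1) = 5 + 4 = 9 columns: I1 finishes WB in cycle 5, and each later instruction finishes one cycle after its predecessor, so I5 finishes in cycle 9.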
Vector Processing in Pipelining
1. Short Vectors: If the vector length is shorter than the vector register size, there will be
2. Long Vectors: If the vector is too large to fit into the vector register, the processor must
divide the vector into smaller chunks and process them sequentially, with some possible
1. Load the elements of the row of A and the column of B into vector registers.
3. Compute the sum of the products in parallel and store the result.
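Splitting a long vector into register-sized chunks can be sketched in Python for the row-times-column product used in matrix multiplication. This is only an illustration: plain lists stand in for vector registers, `VECTOR_REGISTER_LENGTH = 4` is a hypothetical register size, and the element-wise multiply between the load and the sum is an assumed intermediate step that a real vector unit would perform in parallel.

```python
VECTOR_REGISTER_LENGTH = 4  # hypothetical register size, chosen for illustration


def chunked_dot(row, column, vlen=VECTOR_REGISTER_LENGTH):
    """Dot product of a row of A and a column of B, processed chunk by chunk."""
    if len(row) != len(column):
        raise ValueError("row and column must have the same length")
    total = 0
    for start in range(0, len(row), vlen):
        # Load one chunk of the row and the column into "vector registers".
        va = row[start:start + vlen]
        vb = column[start:start + vlen]
        # Element-wise products; vector hardware would compute these in parallel.
        products = [a * b for a, b in zip(va, vb)]
        # Accumulate the sum of the products for this chunk.
        total += sum(products)
    return total


def matmul(a, b):
    """Matrix product built from chunked row-by-column dot products."""
    columns = list(zip(*b))
    return [[chunked_dot(row, col) for col in columns] for row in a]


if __name__ == "__main__":
    print(chunked_dot([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]))
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

With six elements and a register length of 4, the dot product is processed as one full chunk of 4 followed by a short chunk of 2, which covers both the long-vector and the short-vector case.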
Hazards
Hazards are problems that arise in pipelined instruction execution and can
lead to incorrect behavior or reduced performance. Hazards occur
because of dependencies or conflicts between instructions as they
execute concurrently in a pipeline.
Types of Hazard
1. Data Hazard
1.1 RAW (Read After Write)
1.2 WAR (Write After Read)
1.3 WAW (Write After Write)
2. Structural Hazard
3. Control Hazard
Data Hazards
Data hazards occur when instructions that depend on the results of previous instructions are
executed concurrently, which can produce incorrect results.
Read After Write (RAW) - True Dependency:
• An instruction depends on the result of a previous instruction.
I1: R2 ← R2 + R3
I2: R5 ← R2 + R4
I2 reads R2, which I1 writes, so I2 depends on the result of I1.
Write After Read (WAR) - Anti-dependency:
• A later instruction writes to a register that a previous instruction reads from; the write must not take effect before the read.
• Pipeline Stalling: Insert NOP (no-operation) instructions or stalls until the hazard is resolved.
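The three data hazard types and stalling with NOPs can be sketched in Python. Everything here is a simplified illustration: the `Instr` record, the function names, and the default of 2 NOPs per RAW hazard are assumptions, since the number of stall cycles actually needed depends on the pipeline design. The stall function also checks only adjacent instructions.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    name: str
    dest: Optional[str]        # register written, or None for a NOP
    sources: Tuple[str, ...]   # registers read


NOP = Instr("NOP", None, ())


def data_hazards(first: Instr, second: Instr) -> Set[str]:
    """Classify the data hazards between `first` and a later instruction `second`."""
    hazards = set()
    if first.dest is not None and first.dest in second.sources:
        hazards.add("RAW")     # second reads what first writes
    if second.dest is not None and second.dest in first.sources:
        hazards.add("WAR")     # second writes what first reads
    if first.dest is not None and first.dest == second.dest:
        hazards.add("WAW")     # both write the same register
    return hazards


def stall_on_raw(program: List[Instr], nops_per_raw: int = 2) -> List[Instr]:
    """Insert NOPs after an instruction whose result the next instruction reads."""
    scheduled = []
    for current, following in zip(program, program[1:] + [None]):
        scheduled.append(current)
        if following is not None and "RAW" in data_hazards(current, following):
            scheduled.extend([NOP] * nops_per_raw)
    return scheduled


if __name__ == "__main__":
    i1 = Instr("I1", "R2", ("R2", "R3"))   # I1: R2 <- R2 + R3
    i2 = Instr("I2", "R5", ("R2", "R4"))   # I2: R5 <- R2 + R4
    print(data_hazards(i1, i2))
    print([instr.name for instr in stall_on_raw([i1, i2])])
```

For the I1/I2 pair from the RAW example, only the RAW check fires, and the stalled schedule places the assumed two NOPs between I1 and I2.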