Unit 7 N
Calculate the speedup ratio of a 5-segment pipeline with a clock cycle time of 25 ns that executes
100 tasks.
Solution: Here,
Number of tasks (n) = 100
Number of segments (K) = 5
Clock cycle time (tp) = 25 ns
Time to complete a task without pipelining: tn = K × tp = 5 × 25 = 125 ns
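The solution above stops at tn. A minimal sketch of the remaining arithmetic follows, assuming the standard speedup formula S = n·tn / ((K + n - 1)·tp) for a K-segment pipeline. The function name `pipeline_speedup` is illustrative and does not come from the notes.

```python
def pipeline_speedup(n, k, tp, tn=None):
    """Speedup of a k-segment pipeline over a non-pipelined unit for n tasks.

    Assumes the standard formula S = n*tn / ((k + n - 1) * tp).
    If tn is not given, it is taken as k*tp, as in the solution above.
    """
    if tn is None:
        tn = k * tp
    pipelined_time = (k + n - 1) * tp   # first task needs k cycles, each later task one more
    sequential_time = n * tn            # every task pays the full tn
    return sequential_time / pipelined_time


s = pipeline_speedup(n=100, k=5, tp=25)
print(round(s, 2))  # 4.81
```

As n grows, the ratio approaches K = 5, the theoretical maximum for this pipeline.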
A non-pipelined system takes 100 ns to process a task. The same task can be processed in a
six-segment pipeline whose segment delays are as follows: 20 ns, 25
ns, 30 ns. Determine the speedup ratio of the pipeline for 100 tasks.
Solution:
tn = 100 ns
K = 6
tp = 30 ns (the largest segment delay)
n = 100
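With these values, and assuming the usual speedup formula for a K-segment pipeline executing n tasks, the ratio works out as follows:

```latex
S = \frac{n\,t_n}{(K + n - 1)\,t_p}
  = \frac{100 \times 100}{(6 + 100 - 1) \times 30}
  = \frac{10000}{3150}
  \approx 3.17
```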
Suppose that the time delays of the four segments are t1 = 60 ns, t2 = 70 ns, t3 = 100 ns, t4 = 80 ns
and the interface registers have a delay of 10 ns. Determine the speedup ratio.
Solution: Here,
tp = 100 + 10 = 110 ns (slowest segment plus register delay)
tn = t1 + t2 + t3 + t4
= 60 + 70 + 100 + 80 = 320 ns
Arithmetic Pipeline
- Pipeline arithmetic units are usually found in very high speed computers. They are used
to implement floating point operations, multiplication of fixed point numbers, and similar
computations encountered in scientific problems.
- We will now show an example of a pipeline unit for floating point addition and
subtraction. The inputs to the floating point adder pipeline are two normalized floating
point binary numbers.
$X = A \times 2^{a}$
$Y = B \times 2^{b}$
- The floating point addition and subtraction can be performed in four segments, as shown
in figure. The registers labeled R are placed between the segments to store intermediate
results.
- The suboperations performed in the four segments are:
i. Compare the exponents.
ii. Align the mantissas.
iii. Add or subtract the mantissas.
iv. Normalize the result.
- Procedure: The exponents are compared by subtracting them to determine their
difference. The larger exponent is chosen as the exponent of the result. The exponent
difference determines how many times the mantissa associated with the smaller exponent
must be shifted to the right. This produces an alignment of the two mantissas. It should be
noted that the shifter must be designed as a combinational circuit to reduce the shift time.
The two mantissas are added or subtracted in segment 3. The result is normalized in
segment 4. When an overflow occurs, the mantissa of the sum or difference is shifted
right and the exponent is incremented by one.
- For simplicity, we use decimal numbers. Consider the two normalized floating-point numbers:
$X = 0.9504 \times 10^{3}$
$Y = 0.8200 \times 10^{2}$
- The two exponents are subtracted in the first segment to obtain 3-2 = 1. The larger
exponent 3 is chosen as the exponent of the result. The next segment shifts the mantissa
of Y to the right to obtain
$X = 0.9504 \times 10^{3}$
$Y = 0.0820 \times 10^{3}$
- This aligns the two mantissas under the same exponent. The addition of the two
mantissas in segment 3 produces the sum
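$Z = 1.0324 \times 10^{3}$
- Segment 4 then normalizes the sum by shifting the mantissa once to the right and
incrementing the exponent by one, giving $Z = 0.10324 \times 10^{4}$.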
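The four segments can be sketched in Python on the decimal example. This is a software illustration under stated assumptions, not a model of the hardware: mantissas are `Decimal` fractions, subtraction is handled by passing a negative mantissa, and the function name `fp_add` is hypothetical.

```python
from decimal import Decimal


def fp_add(mx, ex, my, ey):
    """Add two normalized decimal floating-point numbers given as (mantissa, exponent).

    Mirrors the four segments: compare, align, add, normalize.
    """
    # Segment 1: compare exponents by subtraction and keep the larger one.
    diff = ex - ey
    exp = max(ex, ey)
    # Segment 2: shift the mantissa with the smaller exponent to the right.
    if diff > 0:
        my = my.scaleb(-diff)
    elif diff < 0:
        mx = mx.scaleb(diff)
    # Segment 3: add the aligned mantissas.
    mz = mx + my
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1 (or the result is zero).
    while abs(mz) >= 1:
        mz = mz.scaleb(-1)   # overflow: shift right, increment exponent
        exp += 1
    while mz != 0 and abs(mz) < Decimal("0.1"):
        mz = mz.scaleb(1)    # leading zero: shift left, decrement exponent
        exp -= 1
    return mz, exp


mantissa, exponent = fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)
print(mantissa, exponent)
```

For the example values, the sum 1.0324 × 10^3 is normalized to a mantissa equal to 0.10324 with exponent 4.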
Instruction Pipeline
- Pipeline processing can occur not only in the data stream but in the instruction stream as
well.
- An instruction pipeline reads consecutive instructions from memory while previous
instructions are being executed in other segments.
- This causes the instruction fetch and execute phases to overlap and perform simultaneous
operations. This technique is called instruction pipelining.
Consider subdividing instruction processing into two stages:
1. fetch instruction
2. execute instruction
- In the most general case, the computer needs to process each instruction with the following
sequence of steps:
i. Fetch the instruction from memory.
ii. Decode the instruction.
iii. Calculate the effective address.
iv. Fetch the operands from memory.
v. Execute the instruction.
vi. Store the result in the proper place.
To gain further speedup, the pipeline must have more stages. Consider the following
decomposition of instruction processing:
a) Fetch instruction (FI)
b) Decode instruction (DI)
c) Fetch operands (FO)
d) Execute instruction (EI)
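How the four stages overlap can be shown with a small space-time diagram. The sketch below is idealized: it assumes every stage takes one clock cycle and that there are no branches or stalls. The function name `space_time` is illustrative.

```python
STAGES = ["FI", "DI", "FO", "EI"]


def space_time(num_instructions, stages=STAGES):
    """Return one row per instruction and one column per clock cycle.

    Assumes an ideal pipeline: one cycle per stage, no branches, no stalls.
    """
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name   # instruction i occupies stage s during cycle i + s
        rows.append(row)
    return rows


for i, row in enumerate(space_time(4), start=1):
    print(f"I{i}: " + " ".join(row))
```

Each row is shifted one cycle to the right of the previous one, so once the pipeline is full, one instruction completes per clock cycle.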
Vector Instruction
1. Operation Code
2. Base Address
3. Address Increment
4. Address Offset
5. Vector Length
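The five fields can be pictured as a simple record. The sketch below is only an illustration: the mnemonic `VADD`, the field values, and the addressing rule (base + offset + i × increment) are assumptions, not something stated in these notes.

```python
from dataclasses import dataclass


@dataclass
class VectorInstruction:
    opcode: str             # operation code, e.g. "VADD" (hypothetical mnemonic)
    base_address: int       # where the vector starts in memory
    address_increment: int  # spacing between consecutive elements
    address_offset: int     # displacement relative to the base address
    vector_length: int      # number of elements to process

    def element_addresses(self):
        """Assumed addressing rule: base + offset + i * increment."""
        start = self.base_address + self.address_offset
        return [start + i * self.address_increment
                for i in range(self.vector_length)]


instr = VectorInstruction("VADD", base_address=1000, address_increment=4,
                          address_offset=16, vector_length=5)
print(instr.element_addresses())  # [1016, 1020, 1024, 1028, 1032]
```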
Superscalar Processors
The term superscalar was first coined in 1987. A superscalar machine is designed to improve the
performance of scalar instruction execution. In most applications, most of the operations are on
scalar quantities, so the superscalar approach yields high-performance general-purpose processors.
The main principle of the superscalar approach is that instructions are executed independently in
different pipelines. Instruction pipelining already overlaps the processing of instructions
and so speeds it up. In a superscalar processor, multiple such
pipelines are provided for different operations, which further increases parallelism.
There are multiple functional units, each of which is implemented as a pipeline. Each pipeline
consists of multiple stages and can therefore handle several instructions at a time, which supports
parallel execution of instructions.
It increases the throughput because the CPU can execute multiple instructions per clock cycle.
Thus, superscalar processors are much faster than scalar processors.
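The throughput gain can be estimated with an idealized cycle count. The sketch below assumes independent instructions, no stalls, and a fixed number of instructions issued per cycle. The name `ideal_cycles` and the chosen numbers are illustrative, and real programs gain less because of dependencies.

```python
import math


def ideal_cycles(n_instructions, stages, issue_width=1):
    """Cycles to finish n independent instructions on an ideal pipeline.

    Assumes no stalls and that issue_width instructions enter per cycle.
    """
    return stages + math.ceil(n_instructions / issue_width) - 1


scalar = ideal_cycles(100, stages=4)                      # one pipeline
superscalar = ideal_cycles(100, stages=4, issue_width=2)  # two parallel pipelines
print(scalar, superscalar, round(scalar / superscalar, 2))  # 103 53 1.94
```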
In a scalar processor each instruction operates on one or two data items, while in a vector
processor each instruction operates on multiple data items. A superscalar processor is a mixture of
the two: each instruction processes one data item, but there are multiple execution units within
each CPU, so multiple instructions can process separate data items concurrently.
Although a superscalar CPU is typically also pipelined, superscalar execution and pipelining are
two different performance enhancement techniques. It is possible to have a non-pipelined
superscalar CPU or a pipelined non-superscalar CPU. The superscalar technique is associated with
several characteristics. These are: