Unit 7 - Parallel Processing Paradigm
By
Smita Mande
Introduction to Parallel Processing
Serial Processing:
To solve a problem, an algorithm divides the problem into smaller
instructions. These discrete instructions are then executed on the
computer's Central Processing Unit one by one; the next instruction
starts only after the previous one finishes.
Key points:
1. The problem statement is broken into discrete instructions.
2. The instructions are executed one by one.
3. Only one instruction executes at any moment in time.
Parallel Processing
Parallel processing is the use of multiple processing elements simultaneously to solve a problem.
The problem is broken down into instructions that are solved concurrently, with each processing
resource working at the same time. Parallelism occurs at several levels:
Bit-level parallelism
Instruction-level parallelism
Task Parallelism
Data-level parallelism (DLP)
Bit-level parallelism
It is the form of parallel computing based on increasing the processor's word size. It
reduces the number of instructions that the system must execute in order to perform an
operation on data larger than the word size.
Example: Consider a scenario where an 8-bit processor must compute the sum of two
16-bit integers. It must first add the 8 lower-order bits and then the 8 higher-order
bits, thus requiring two instructions to perform the operation. A 16-bit processor can
perform the operation with a single instruction.
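The two-step addition above can be sketched in Python (the function name and the explicit carry handling are illustrative, modeling what an 8-bit ALU would do):

```python
def add16_with_8bit_alu(a, b):
    # Sketch: an 8-bit ALU adding two 16-bit integers needs two add
    # steps -- low-order bytes first, then high-order bytes plus carry.
    lo = (a & 0xFF) + (b & 0xFF)                 # step 1: add low-order bytes
    carry = lo >> 8                              # carry out of the low byte
    hi = ((a >> 8) + (b >> 8) + carry) & 0xFF    # step 2: high bytes + carry
    return (hi << 8) | (lo & 0xFF)
```

A 16-bit processor performs the same sum in one native add instruction, which is exactly the saving bit-level parallelism provides.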
Instruction-level parallelism
Modern processors can issue more than one instruction per clock cycle. Instructions
that do not depend on one another can be re-ordered and grouped, then executed
concurrently without affecting the result of the program. This is called
instruction-level parallelism.
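A small Python sketch of the idea (illustrative only: the point is which statements are independent, not that Python itself reorders them):

```python
def evaluate(x):
    # Instructions 1 and 2 form a dependence chain; instruction 3 is
    # independent of both, so hardware may overlap it with 1 and 2 --
    # the final result is the same either way.
    a = x + 1        # instruction 1
    b = a * 2        # instruction 2: depends on instruction 1
    c = x - 3        # instruction 3: independent of 1 and 2
    return b + c
```

A superscalar processor could execute instruction 3 in the same cycle as instruction 1, since neither reads the other's result.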
Task Parallelism –
Task parallelism decomposes a task into subtasks and allocates each
subtask to a processor for execution. The processors execute the
subtasks concurrently.
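A minimal sketch of task parallelism with Python threads (the subtask names and the shared `results` dictionary are invented for illustration): each thread runs a *different* operation on the same data, concurrently.

```python
import threading

results = {}

def summarize(data):          # subtask 1: compute the sum
    results["sum"] = sum(data)

def find_extremes(data):      # subtask 2: compute min and max
    results["min"], results["max"] = min(data), max(data)

data = list(range(1, 101))
tasks = [threading.Thread(target=summarize, args=(data,)),
         threading.Thread(target=find_extremes, args=(data,))]
for t in tasks:
    t.start()                 # launch both subtasks concurrently
for t in tasks:
    t.join()                  # wait for both to finish
```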
Data-level parallelism (DLP) –
Instructions from a single stream operate concurrently on several data elements.
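A hedged Python sketch of data-level parallelism: one operation (`square`, an illustrative name) is mapped over many data elements, and the thread pool may apply it to several elements concurrently. Contrast this with task parallelism, where each worker runs a different operation.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # The single "instruction stream": the same operation for every element.
    return x * x

data = [1, 2, 3, 4, 5, 6, 7, 8]
# The pool may apply `square` to several elements at the same time;
# map() still returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, data))
```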
Parallelism in Uniprocessor Systems
Parallel adders can be implemented using techniques such as carry-lookahead and carry-save.
The multiplier can be recoded to eliminate more complex calculations.
3. Overlapped CPU and I/O Operation
The CPU is orders of magnitude faster than main memory, which in turn
is far faster than I/O devices, so the slower levels throttle the
overall processing speed:
t_d > t_m > t_p
where t_d is the processing time of the device,
t_m is the processing time of the main memory,
and t_p is the processing time of the central processing unit.
To balance the speeds of the CPU and memory, a fast cache memory can
be used to buffer information between memory and the CPU.
To balance the bandwidth between memory and I/O devices, input-output
channels with different speeds can be used between main memory and the
I/O devices.
Software Approach for Parallelism in Uniprocessor
Multiprogramming
Program interleaving works as follows: while process P1 is engaged in an I/O
operation, the process scheduler can switch the CPU to process P2. This lets
P1 and P2 make progress simultaneously. This interleaving of CPU and I/O
activity across programs is called multiprogramming.
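The interleaving above can be sketched with two Python threads (an assumption-laden model: the `sleep` stands in for P1's I/O wait, and the names P1/P2 mirror the text):

```python
import threading
import time

log = []

def p1():
    # Process P1: mostly "I/O" (simulated with sleep). While P1 waits,
    # the scheduler is free to run P2 on the CPU.
    log.append("P1: start I/O")
    time.sleep(0.05)
    log.append("P1: I/O done")

def p2():
    # Process P2: pure CPU work that overlaps with P1's I/O wait.
    total = sum(range(1000))
    log.append(f"P2: computed {total}")

t1 = threading.Thread(target=p1)
t2 = threading.Thread(target=p2)
t1.start(); t2.start()
t1.join(); t2.join()
```

In a typical run, P2's computation completes while P1 is still waiting on its simulated I/O, which is exactly the overlap multiprogramming exploits.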
Time-Sharing
Time-sharing builds on multiprogramming: CPU time is shared among multiple
programs in slices. Without enforced time slices, a high-priority program can
engage the CPU for a long period, starving the other processes in the computer.
Classification based on Architectural schemes
Flynn’s classification
Shore's classification
Feng’s classification
Handle’s classification
Flynn’s classification
Flynn classifies computer architectures by the number of simultaneous
instruction streams and data streams, giving four categories: SISD, SIMD,
MISD, and MIMD.
• Resource Sharing
A multi-core processor shares a variety of resources, both internal and external;
networks, system buses, and main memory are among these resources. Consequently,
any program running on the same chip has a higher chance of being interrupted.
• Software Interference
Because of resource sharing, software interference can cause problems: more
cores imply a greater number of interference paths.