Lecture 10

Uploaded by

ayatbahaa21
Copyright
© All Rights Reserved
COMPUTER ARCHITECTURE

PARALLEL PROCESSING

Parallel Processing
Serial processing (sequential)

http://www.itrelease.com/2017/11/difference-serial-parallel-processing/
Serial processing completes one task at a time. The processor cannot work on
more than one task at once, so the tasks run in sequence.

Parallel processing (concurrent)

Parallel processing is a term used to denote a large class of techniques that
provide simultaneous data-processing tasks for the purpose of increasing the
computational speed of a computer system.

• The purpose of parallel processing:
1. Speed up the computer's processing capability
2. Increase the throughput

Throughput is the amount of processing that can be accomplished during a given
interval of time.

• The side effects:
1. The amount of hardware increases
2. The cost of the system increases

Parallel processing classifications
• Parallel processing can be classified:
1. By the internal organization of the processors
2. By the interconnection structure between processors
3. By the flow of information through the system (Flynn's classification)


Based on M. Morris Mano “Computer System Architecture”-- Lecturer Ahmed Salah Hameed
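Flynn's classification
Abbreviations used in the figures: CU = control unit, PU = processing unit,
MM = memory module, IS = instruction stream, DS = data stream.

1. SISD
[Figure: one CU sends an instruction stream (IS) to one PU, which exchanges a
data stream (DS) with one MM; the MM supplies the IS to the CU.]

2. SIMD
[Figure: one CU sends the same IS to PU 1 ... PU n; each PU i exchanges its own
data stream DS i with memory module MM i of the shared memory; the shared
memory supplies the IS to the CU.]

3. MISD
[Figure: CU 1 ... CU n each send their own instruction stream IS i to PU i; a
single DS passes through the PUs; the shared memory (MM 1 ... MM n) supplies
IS 1 ... IS n to the CUs.]

4. MIMD
[Figure: CU 1 ... CU n each send their own instruction stream IS i to PU i;
each PU i exchanges a data stream with memory module MM i of the shared memory;
the shared memory supplies IS 1 ... IS n to the CUs.]
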
Single-instruction, single-data (SISD) systems: An SISD computing system is a uniprocessor machine
capable of executing a single instruction stream operating on a single data stream.

In SISD, machine instructions are processed sequentially, and computers adopting this model
are popularly called sequential computers.

Most conventional computers have an SISD architecture.

All the instructions and data to be processed have to be stored in primary memory.

Single-instruction, multiple-data (SIMD) systems: An SIMD system is a multiprocessor machine capable of
executing the same instruction on all the CPUs, with each CPU operating on a different data stream.

Machines based on the SIMD model are well suited to scientific computing, which involves many vector
and matrix operations.

So that the work can be distributed to all the processing elements (PEs), the data elements of a vector
can be divided into multiple sets (N sets for an N-PE system), and each PE processes one data set.
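
The partitioning described above can be sketched in Python. This is only an illustration under stated assumptions, not a model of real SIMD hardware: the names split_into_sets and simd_apply are hypothetical, and the PEs are simulated by a loop that applies one shared instruction (a Python function) to each PE's data set in turn.

```python
from typing import Callable, List


def split_into_sets(vector: List[float], n_pe: int) -> List[List[float]]:
    """Divide a vector into n_pe contiguous sets, one per processing element.

    When the vector length is not a multiple of n_pe, the first sets get one
    extra element, so set sizes differ by at most one.
    """
    if n_pe < 1:
        raise ValueError("n_pe must be a positive integer")
    base, extra = divmod(len(vector), n_pe)
    sets, start = [], 0
    for pe in range(n_pe):
        size = base + (1 if pe < extra else 0)
        sets.append(vector[start:start + size])
        start += size
    return sets


def simd_apply(instruction: Callable[[float], float],
               vector: List[float], n_pe: int) -> List[float]:
    """Apply the same instruction to every data set, as each PE would."""
    data_sets = split_into_sets(vector, n_pe)
    # Every simulated PE executes the identical instruction on its own set.
    per_pe = [[instruction(x) for x in data_set] for data_set in data_sets]
    # Concatenate the per-PE results back into a single vector.
    return [y for pe_result in per_pe for y in pe_result]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    print(split_into_sets(data, 4))
    print(simd_apply(lambda x: x * 2, data, 4))
```

The single function passed as instruction plays the role of the one instruction stream; the N sets play the role of the N data streams.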

Multiple-instruction, single-data (MISD) systems: An MISD computing system is a multiprocessor machine
capable of executing different instructions on different PEs, with all of them operating on the same data set.

Multiple-instruction, multiple-data (MIMD) systems: An MIMD system is a multiprocessor machine
capable of executing multiple instructions on multiple data sets.

Each PE in the MIMD model has separate instruction and data streams; therefore machines built using
this model are capable of running any kind of application.

Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.

Pipelining
• Pipelining is a technique of decomposing a sequential
process into suboperations, with each suboperation being
executed in a special dedicated segment that operates
concurrently with all other segments.

[Figure: timing of a 2-segment pipelined unit (clock cycles 1 to 4) compared with a non-pipelined unit (clock cycles 1 to 6)]

Example of the Pipeline Organization

SPEED CALCULATION
• n is the number of tasks.
• Each task is divided into k segments.
• Each segment executes in one clock cycle of duration tp.

• The first task T1 requires a time equal to k * tp to complete its operation,
because there are k segments in the pipe.

• The remaining n - 1 tasks emerge from the pipe at the rate of one task per
clock cycle, so they are completed after a further (n - 1) * tp.
• Therefore, completing n tasks using a k-segment pipeline requires:
k + (n - 1) clock cycles

For example, for a system with four segments and six tasks, the time required
to complete all the operations is:
4 + (6 - 1) = 9 clock cycles
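
The cycle count above can be checked with a few lines of Python. This is a minimal sketch; the function name pipeline_cycles is hypothetical, and it assumes an ideal pipeline in which every segment takes exactly one clock cycle and no task ever stalls.

```python
def pipeline_cycles(k: int, n: int) -> int:
    """Clock cycles needed to complete n tasks on a k-segment pipeline.

    The first task needs k cycles to pass through all segments; each of the
    remaining n - 1 tasks then completes one cycle after the previous one.
    """
    if k < 1 or n < 1:
        raise ValueError("k and n must be positive integers")
    return k + (n - 1)


if __name__ == "__main__":
    # Four segments, six tasks: the example from the text.
    print(pipeline_cycles(4, 6))  # 9
```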

Space-time diagram (four segments, six tasks):

Clock cycle 1   2   3   4   5   6   7   8   9
Task 1      k1  k2  k3  k4
Task 2          k1  k2  k3  k4
Task 3              k1  k2  k3  k4
Task 4                  k1  k2  k3  k4
Task 5                      k1  k2  k3  k4
Task 6                          k1  k2  k3  k4
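
A diagram like this one can be generated for any k and n. The sketch below is illustrative only: space_time_diagram and format_diagram are hypothetical helper names, and the code assumes the ideal case in which a new task enters the pipe every clock cycle and advances one segment per cycle.

```python
from typing import List


def space_time_diagram(k: int, n: int) -> List[List[str]]:
    """Return one row per task; row[c] names the segment used in cycle c + 1.

    Task i (0-based) enters segment 1 in cycle i + 1 and advances one segment
    per cycle, so it occupies segment s in clock cycle i + s.
    """
    total_cycles = k + (n - 1)
    rows = []
    for task in range(n):
        row = [""] * total_cycles
        for segment in range(1, k + 1):
            row[task + segment - 1] = f"k{segment}"
        rows.append(row)
    return rows


def format_diagram(rows: List[List[str]]) -> str:
    """Render the diagram as aligned plain text, four characters per cycle."""
    cycles = range(1, len(rows[0]) + 1)
    lines = [("Clock cycle " + "".join(f"{c:<4}" for c in cycles)).rstrip()]
    for i, row in enumerate(rows, start=1):
        cells = "".join(f"{cell:<4}" for cell in row)
        lines.append((f"Task {i:<7}" + cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_diagram(space_time_diagram(4, 6)))
```

Each row is shifted one cycle to the right of the row above it, which is why the last task finishes in cycle k + (n - 1).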

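NONPIPELINE UNIT
• Each task takes a time equal to tn.
• The total time required for n tasks is:
n * tn

PIPELINE UNIT
• To complete n tasks using a k-segment pipeline:
k + (n - 1) clock cycles
• Total time is: (k + (n - 1)) * tp

SPEED UP
• S = (n * tn) / ((k + (n - 1)) * tp)
• As n becomes much larger than k, S approaches tn / tp.
• If a task takes the same time in both units (tn = k * tp), the maximum speedup is Smax = k.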
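
As a rough numerical check, the sketch below computes the speedup as the ratio of the non-pipelined time n * tn to the pipelined time (k + (n - 1)) * tp. The function names are hypothetical, and the numbers in the usage section (k = 4, tp = 20 ns, tn = k * tp) are assumed values chosen for illustration, not figures taken from the lecture.

```python
def nonpipeline_time(n: int, tn: float) -> float:
    """Total time for n tasks when each task takes tn and tasks do not overlap."""
    return n * tn


def pipeline_time(k: int, n: int, tp: float) -> float:
    """Total time for n tasks on a k-segment pipeline with clock period tp."""
    return (k + (n - 1)) * tp


def speedup(k: int, n: int, tp: float, tn: float) -> float:
    """Ratio of non-pipelined time to pipelined time for the same n tasks."""
    return nonpipeline_time(n, tn) / pipeline_time(k, n, tp)


if __name__ == "__main__":
    k, tp = 4, 20.0   # assumed: 4 segments, 20 ns clock cycle
    tn = k * tp       # assumed: a non-pipelined task takes as long as k segments
    for n in (6, 100, 10_000):
        # The ratio grows with n but stays below k under this assumption.
        print(n, round(speedup(k, n, tp, tn), 3))
```

Under the assumption tn = k * tp, the printed ratio approaches but never reaches k as n grows, because the first task always pays the full k cycles to fill the pipe.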

EXAMPLE ON SPEED CALCULATION

INSTRUCTION PIPELINE
• An instruction pipeline reads consecutive instructions from memory
while previous instructions are being executed in other segments. This
causes the instruction fetch and execute phases to overlap and perform
simultaneous operations.
• Simple Example:
• Consider a computer with an instruction fetch unit and an instruction
execution unit designed to provide a two-segment pipeline.

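2-segment pipelined (F = fetch, E = execute):

Clock cycle    1  2  3  4
Instruction 1  F  E
Instruction 2     F  E
Instruction 3        F  E

Non pipelined:

Clock cycle    1  2  3  4  5  6
Instruction 1  F  E
Instruction 2        F  E
Instruction 3              F  E
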
• NOTE 1: Computers with complex instructions require other phases in
addition to fetch and execute to process an instruction completely.

• NOTE 2: Several difficulties can prevent the instruction pipeline from
operating at its maximum rate:
1. Different segments may take different times to operate on the
incoming information.
2. Some segments are skipped for certain operations. For example, a
register-mode instruction does not need an effective address
calculation.
3. Two or more segments may require memory access at the same time,
causing one segment to wait until another is finished with the
memory.

• NOTE 3: Memory access conflicts are sometimes resolved by using
two memory buses for accessing instructions and data in separate
modules. In this way, an instruction word and a data word can be read
simultaneously from two different modules.

• NOTE 4: The design of an instruction pipeline will be most efficient if
the instruction cycle is divided into segments of equal duration. The
time that each step takes to fulfill its function depends on the
instruction and the way it is executed.

EXAMPLE: FOUR-SEGMENT INSTRUCTION PIPELINE

1. FI is the segment that fetches an instruction.
2. DA is the segment that decodes the instruction
and calculates the effective address.
3. FO is the segment that fetches the operand.
4. EX is the segment that executes the instruction.
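
The overlap between these four segments can be illustrated with a small timing sketch. It assumes the ideal case only: every segment takes one clock cycle, there are no branches, and no two segments compete for memory. The name instruction_timing is hypothetical.

```python
from typing import Dict

# The four segments, in the order an instruction passes through them.
SEGMENTS = ["FI", "DA", "FO", "EX"]


def instruction_timing(n_instructions: int) -> Dict[int, Dict[str, int]]:
    """Map each instruction number to the clock cycle in which it uses each segment.

    Instruction i (1-based) is fetched in cycle i and moves to the next
    segment on every following cycle, so no stalls are modelled.
    """
    timing = {}
    for i in range(1, n_instructions + 1):
        timing[i] = {seg: i + offset for offset, seg in enumerate(SEGMENTS)}
    return timing


if __name__ == "__main__":
    for instr, cycles in instruction_timing(3).items():
        print(f"Instruction {instr}: {cycles}")
```

In cycle 4 of this ideal schedule all four segments are busy at once: instruction 1 is in EX, instruction 2 in FO, instruction 3 in DA, and a fourth instruction would be in FI.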

PROBLEMS

Space-time diagram (six segments, eight tasks):

Clock cycle 1   2   3   4   5   6   7   8   9   10  11  12  13
Task 1      k1  k2  k3  k4  k5  k6
Task 2          k1  k2  k3  k4  k5  k6
Task 3              k1  k2  k3  k4  k5  k6
Task 4                  k1  k2  k3  k4  k5  k6
Task 5                      k1  k2  k3  k4  k5  k6
Task 6                          k1  k2  k3  k4  k5  k6
Task 7                              k1  k2  k3  k4  k5  k6
Task 8                                  k1  k2  k3  k4  k5  k6
