
Unit 4

MULTIPROCESSORS AND THREAD LEVEL PARALLELISM

Learning Objectives

• Multiprocessors
• Thread Level Parallelism
• Interconnection Structures
• Multi-threaded Architecture
• Distributed Memory MIMD Architecture
• Review Questions
Computer Architecture

10.1 Multiprocessors

[Figure 10.1 shows the following taxonomy:]

Processor Organization
  Serial
    SISD: Uniprocessor, Overlapped Operations, Multi ALU
  Parallel
    SIMD: Vector Processor, Array Processor
    MISD
    MIMD
      Tightly Coupled (Shared Memory): Symmetric multiprocessor (SMP), Nonuniform memory access (NUMA)
      Loosely Coupled (Distributed Memory): Clusters

Figure 10.1: Taxonomy of mono-multiprocessor organizations


A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment. The term "processor" in multiprocessor can mean either a central processing unit (CPU) or an input-output processor (IOP). However, a system with a single CPU and one or more IOPs is usually not included in the definition of a multiprocessor system unless the IOP has computational facilities comparable to a CPU. The various processors making up a multiprocessor typically share resources such as communication facilities, I/O devices, program libraries, and databases, and are controlled by a common operating system.

Definition of Multiprocessor

A multiprocessor is a multiple instruction stream, multiple data stream (MIMD) computer containing two or more CPUs that cooperate on common computational tasks.

[Figure 10.2 shows several processors connected to one another through an interprocessor communication network.]

Figure 10.2: Basic multiprocessor architecture


Computers are interconnected with each other by means of communication lines to form a computer network. The network consists of several autonomous computers that may or may not communicate with each other. A multiprocessor system is controlled by one operating system that provides interaction between processors, and all the components of the system cooperate in the solution of a problem. The fact that microprocessors take very little physical space and are very inexpensive brings about the feasibility of interconnecting a large number of microprocessors into one composite system.
Characteristics of Multiprocessors

• Very-large-scale integrated circuit technology has reduced the cost of computer components to such a low level that the concept of applying multiple processors to meet system performance requirements has become an attractive design possibility.
• Multiprocessing improves the reliability of the system, so that a failure or error in one part has a limited effect on the rest of the system. If a fault causes one processor to fail, a second processor can be assigned to perform the functions of the disabled processor. The system as a whole can continue to function correctly, with perhaps some loss in efficiency.
• Multiprocessing improves system performance because computations can proceed in parallel in one of two ways:
  • Multiple independent jobs can be made to operate in parallel, with the central processing units (CPUs) and input-output processors (IOPs) working concurrently.
  • A single job can be partitioned into multiple parallel tasks.
• In a multiprocessor organization, an overall function can be partitioned into a number of tasks that each processor can handle individually.
• System tasks may be allocated to special-purpose processors whose design is optimized to perform certain types of processing efficiently.
Examples

• A computer system where one processor performs the computations for an industrial process control while others monitor and control the various parameters, such as temperature and flow rate.

" Acomputer where one processor performs high speed floating-point


Computationsand another takes care of routine data-processing tasks. mathematical
" Multiprocessing can improve performance by decomposing a program into paralel
executable tasks. This can be achieved in one of two ways.
" The user can explicitly declare that certain tasks of the program be executed in
parallel. This must be done prior to loading the program by specifying the nar
executable segments. Most multiprocessor manufacturers provide an
operating
system with programming language constructs suitable for specifying parallel
processing.
" The other, more efficient way is to provide a compiler with
multiprocessor
software that can automatically detect parallelism in a user's program Th
compiler checks for data dependency in the program. If a program depends on
data generated in another part, the part yielding the needed data must be exerutaa
first. However, tw0 parts of a program that do not use data generated by each ea
run concurrently. The parallelizing compiler checks the entire program to detect
any possible data dependencies. These that have no data dependency are then
considered for concurrent scheduling on different processors.
Classification of Multiprocessors / Coupling of Processors
Based on the way memory is organized, multiprocessors are classified as:
• Tightly coupled multiprocessor
• Loosely coupled multiprocessor
Tightly Coupled Multiprocessor / Shared Memory
A multiprocessor system with common shared memory is called a tightly coupled or shared-memory multiprocessor. This does not preclude each processor from having its own local memory. There is a global common memory that all CPUs can access. Information can therefore be shared among the CPUs by placing it in the common global memory.
Loosely Coupled Multiprocessor / Distributed Memory
Each processor element in a loosely coupled system has its own private local memory. Tasks or processors do not communicate in a synchronized fashion. The processors communicate with each other through a message-passing scheme which uses packets. A packet consists of an address, the data content, and some error detection code. The packets are addressed to a specific processor or taken by the first available processor, depending on the communication system used. In loosely coupled systems the overhead for data exchange is high. Loosely coupled systems are most efficient when the interaction between tasks is minimal, whereas tightly coupled systems can tolerate a higher degree of interaction between tasks and are used for high-speed real-time processing.
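As a toy illustration of the packet scheme just described (the names `make_packet`, `deliver`, and the additive checksum are illustrative assumptions, not from the text), a minimal sketch in Python models each processor's private memory as a mailbox that can only be reached by an addressed packet:

```python
# Minimal sketch of loosely coupled message passing: a packet carries an
# address, the data content, and a simple error-detection code.

def make_packet(dest, data):
    # Error-detection code: an additive checksum over the data values.
    checksum = sum(data) % 256
    return {"address": dest, "data": data, "checksum": checksum}

def deliver(packet, mailboxes):
    # Route the packet to the mailbox of the addressed processor,
    # verifying the checksum on arrival.
    if sum(packet["data"]) % 256 != packet["checksum"]:
        raise ValueError("corrupted packet")
    mailboxes[packet["address"]].append(packet["data"])

# Four processors, each with a private mailbox (no shared memory).
mailboxes = {pid: [] for pid in range(4)}
deliver(make_packet(2, [10, 20, 30]), mailboxes)
deliver(make_packet(0, [5]), mailboxes)
print(mailboxes[2])  # [[10, 20, 30]]
```

Because the only way data moves is through `deliver`, the sketch mirrors the high data-exchange overhead of loosely coupled systems: every interaction costs a packet.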
10.2 Thread Level Parallelism

A thread is a separate process with its own instructions and data. A thread may represent a process that is part of a parallel program consisting of multiple processes, or it may represent an independent program on its own. Each thread has all the state (instructions, data, program counter, register state, and so on) necessary to allow it to execute. A thread is a lightweight process, and threads commonly share a single address space, whereas processes don't.

Unlike instruction level parallelism, which exploits implicit parallel operations within a loop or straight-line code segment, thread level parallelism is explicitly represented by the use of multiple threads of execution that are inherently parallel.

Thread level parallelism is an important alternative to instruction level parallelism, primarily because it could be more cost-effective to exploit than instruction level parallelism. There are many important applications where thread level parallelism occurs naturally, as it does in many server applications.

Today, the term thread is often used in a casual way to refer to multiple loci of execution that may run on different processors, even when they do not share an address space. To take advantage of an MIMD multiprocessor with n processors, we must usually have at least n threads or processes to execute. The independent threads are typically identified by the programmer or created by the compiler. Since the parallelism in this situation is contained in the threads, it is called thread-level parallelism.
Threads may vary from large-scale, independent processes - for example, independent programs running in a multiprogrammed fashion on different processors - to parallel iterations of a loop, automatically generated by a compiler and each executing for perhaps less than a thousand instructions. Although the size of a thread is important in considering how to exploit thread-level parallelism efficiently, the important qualitative distinction is that such parallelism is identified at a high level by the software system and that the threads consist of hundreds to millions of instructions that may be executed in parallel. In contrast, instruction level parallelism is identified primarily by the hardware, though with software help in some cases, and is found and exploited one instruction at a time.
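As a toy illustration of partitioning a single job into parallel tasks, one thread per slice of the data (the helper name `partial_sum` and the four-way split are illustrative assumptions, not from the text), a sketch in Python:

```python
import threading

# Each thread computes the sum of its own slice and writes the result
# into its private output slot; no two threads touch the same slot.
def partial_sum(data, out, idx):
    out[idx] = sum(data)

data = list(range(1000))
n_threads = 4
chunk = len(data) // n_threads
results = [0] * n_threads
threads = []
for i in range(n_threads):
    t = threading.Thread(target=partial_sum,
                         args=(data[i * chunk:(i + 1) * chunk], results, i))
    threads.append(t)
    t.start()
for t in threads:
    t.join()          # wait for every worker before combining results
print(sum(results))   # 499500
```

Here the parallelism is explicit: the programmer identified the four independent tasks, exactly the high-level identification the paragraph above contrasts with hardware-discovered instruction level parallelism.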
10.3 Interconnection Structures

The components that form the multiprocessor system are CPUs, IOPs, and a memory unit that may be partitioned into a number of separate modules. The interconnection between the components of a multiprocessor system can have different physical configurations, depending on the number of transfer paths that are available between the processors and memory in a shared memory system, and among the processing elements in a distributed memory system.

The processors must be able to share a set of main memory modules and I/O devices in a multiprocessor system. This sharing capability can be provided through interconnection structures. The interconnection structures that are commonly used are:
• Time-Shared / Common Bus
• Multiport Memory
• Crossbar Switch
• Multistage Switching Network
• Hypercube System

Time-Shared Common Bus

A common-bus multiprocessor system consists of a number of processors connected through a common path to a memory unit. A time-shared common bus for five processors is shown in Figure 10.3. Only one processor can communicate with the memory or another processor at any given time. Any processor wishing to initiate a transfer must first determine the availability status of the bus, and only then is the transfer initiated. A command is issued to inform the destination unit. The receiving unit responds to the control signals from the sender.

Disadvantage
• There may be transfer conflicts, since one common bus is shared by all processors.

[Figure 10.3 shows five CPUs (CPU 1 through CPU 5) attached to a single memory unit over one shared bus.]

Figure 10.3: Time-shared common bus organization

[Figure 10.4 shows a common system bus linking several nodes and a common shared memory; within each node, a local bus connects a CPU, an IOP, and local memory, and a system bus controller bridges the local bus to the system bus.]

Figure 10.4: System bus structure for multiprocessors


Memory access is fairly uniform, but not very scalable. A more economical implementation of a dual bus structure is illustrated in Figure 10.4. A number of local buses are each connected to their own local memory and to one or more processors. Each local bus may be connected to a CPU, an IOP, or any combination of processors. A system bus controller links each local bus to a common system bus. The I/O devices connected to the local IOP, as well as the local memory, are available to the local processor. The memory connected to the common system bus is shared by all processors. If an IOP is connected directly to the system bus, the I/O devices attached to it may be made available to all processors.

Only one processor can communicate with the shared memory and other common resources through the system bus at any given time. The other processors are kept busy communicating with their local memory and I/O devices. Part of the local memory may be designed as a cache memory attached to the CPU.

Advantages
• Inexpensive, as no extra hardware such as a switch is required.
• Simple and easy to configure, as the functional units are directly connected to the bus.

Disadvantages
• If malfunctioning occurs in any of the bus interface circuits, the complete system will fail.
• Decreased throughput, since at any time only one processor can communicate with any other unit.
• The total overall transfer rate within the system is limited by the speed of the single path.
• Increased arbitration logic: as the number of processors and memory units increases, the bus contention problem increases.

Multiport Memory

Amultiport memory system employs separate buses between each


memory module and each CPU.
modules.
Figure 10.5 shows a multiport memory system with four CPU's and 4 memory
[Figure 10.5 shows four CPUs (CPU 1 through CPU 4), each with a dedicated bus to each of four memory modules (MM 1 through MM 4).]

Figure 10.5: Multiport memory


Each processor bus is connected to each memory module. A processor bus consists of the address, data, and control lines required to communicate with memory.
• The memory module is said to have four ports, and each port accommodates one of the buses. Each port serves a CPU.
• The module must have internal control logic to determine which port will have access to memory at any given time.
• Memory access conflicts are resolved by assigning fixed priorities to each memory port. The priority for memory access associated with each processor may be established by the physical port position that its bus occupies in each module.
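The fixed-priority rule above can be sketched in a few lines of Python (the function name `arbitrate` and the "lower port number wins" convention are illustrative assumptions; a real module resolves this in hardware each cycle):

```python
# Sketch of fixed-priority arbitration: when several CPUs request the same
# memory module in a cycle, the lowest-numbered port wins, modelling a
# priority fixed by physical port position.
def arbitrate(requests):
    # requests: {cpu_id: module_id}; returns {module_id: winning cpu_id}.
    grant = {}
    for cpu in sorted(requests):       # lower port number = higher priority
        module = requests[cpu]
        grant.setdefault(module, cpu)  # first (highest-priority) CPU wins
    return grant

# CPUs 1 and 3 contend for module 2; CPU 4 has module 0 to itself.
print(arbitrate({3: 2, 1: 2, 4: 0}))  # {2: 1, 0: 4}
```

Losing CPUs would simply retry on a later cycle; the point of the sketch is that the outcome depends only on port position, never on arrival order.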

Advantages
• A high transfer rate can be achieved because of the multiple paths.

Disadvantages
• It requires expensive memory control logic and a large number of cables and connections.

Crossbar Switch

The crossbar switch organization consists of a number of crosspoints that are placed at the intersections between processor buses and memory module paths. It provides a separate path for each memory module.
[Figure 10.6(a) shows crosspoints at each intersection of the four CPU buses and the four memory modules (MM 1 through MM 4); Figure 10.6(b) shows, for one memory module, the multiplexers and arbitration logic that select the data, address, and control lines from one of the four CPUs.]

Figure 10.6: (a) Crossbar switch (b) Block diagram of crossbar switch
Each switch point has control logic to set up the transfer path between a processor and a memory. It examines the address that is placed on the bus to determine whether its particular module is being addressed. It also resolves multiple requests for access to the same memory on a predetermined priority basis. It also supports simultaneous transfers from all memory modules, because there is a separate path associated with each module.

The functional design of a crossbar switch connected to one memory module is shown in Figure 10.6. The circuit consists of multiplexers that select the data, address, and control from one CPU for communication with the memory module. Priority levels are established by the arbitration logic to select one CPU when two or more CPUs attempt to access the same memory.

Advantages
• Supports simultaneous transfers from all memory modules.

Disadvantages
• The hardware required to implement the switch can become quite large and complex.
Multistage Switching Network

• The basic component of a multistage switching network is a two-input, two-output interchange switch. The switch has the capability of connecting either input to either of the outputs.
[Figure 10.7 shows the two settings of a 2 x 2 interchange switch: input A connected to output 0 or to output 1, with input B taking the other output.]

Figure 10.7: Operation of 2 x 2 interchange switch


Using the 2 x 2 switch as a building block, it is possible to build a multistage network to control the communication between a number of sources and destinations. To see how this is done, consider the binary tree shown in Figure 10.8 below.
[Figure 10.8 shows processors P1 and P2 connected through three levels of 2 x 2 switches to eight destinations numbered 000 through 111. Some requests cannot be satisfied simultaneously: for example, if P1 is connected to one of the destinations 000 through 011, P2 can be connected to only one of the destinations 100 through 111.]

Figure 10.8: Binary tree with 2 x 2 switches


The two processors P1 and P2 are connected through switches to eight memory modules marked in binary from 000 through 111. The path from a source to a destination is determined from the binary bits of the destination number. The first bit of the destination number determines the switch output in the first level, the second bit specifies the output of the switch in the second level, and the third bit specifies the output of the switch in the third level.
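The bit-per-stage rule above can be written as a one-line routing function (a sketch; the name `route` and returning the per-level switch outputs as a list are assumptions for illustration):

```python
# Sketch: in the binary tree of 2 x 2 switches, the switch output chosen at
# level k is bit k of the destination address, taken most significant first.
def route(dest, n_stages=3):
    # Returns the output (0 or 1) selected at each of the n_stages levels.
    return [(dest >> (n_stages - 1 - k)) & 1 for k in range(n_stages)]

print(route(0b101))  # [1, 0, 1]: lower half, then upper quarter, then lower
```

So reaching destination 101 means taking output 1 at the first switch, output 0 at the second, and output 1 at the third, exactly tracing the destination's bits.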

Many different topologies have been proposed for multistage switching networks to control processor-memory communication in a tightly coupled multiprocessor system, or to control the communication between the processing elements in a loosely coupled system.
• One such topology is the omega switching network shown in Figure 10.9.
• In this configuration, there is exactly one path from each source to any particular destination.
[Figure 10.9 shows eight sources (0 through 7) connected through three stages of 2 x 2 switches to eight destinations (000 through 111).]

Figure 10.9: 8 x 8 omega switching network


" Some request patterns cannot be connected simultaneously. i.e., any two sources cannot
connected simultaneously to destination 000and 001. In a tightly coupled multiprocess
system, the source is a processor and the destination is a memory module.
" Set up the path ’ transfer the address into memory ’ transfer the data hterc
" In aloosely coupled multiprocessor system, both the source and destination are Processilthou
elements. After path establishment source transfers a message to the destination processorgjor
Hypercube System

The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N = 2^n processors interconnected in an n-dimensional binary cube.
• Each processor forms a node of the cube; in effect, it contains not only a CPU but also local memory and an I/O interface.
• Each processor has direct communication paths to n other neighbour processors. These paths correspond to the edges of the cube.

[Figure 10.10 shows a one-cube (two nodes, 0 and 1), a two-cube (four nodes, 00 through 11), and a three-cube (eight nodes, 000 through 111).]

Figure 10.10: Hypercube structures for n = 1, 2, and 3


Multiprocessors and Threod Level Parallelism (10.11
A one-cube structure has n = 1 and 2^n = 2. It contains two processors interconnected by a single path. A two-cube structure has n = 2 and 2^n = 4. It contains four nodes interconnected as a square. An n-cube structure has 2^n nodes, with a processor residing in each node. Each node is assigned a binary address in such a way that the addresses of two neighbours differ in exactly one bit position.
• Routing messages through an n-cube structure may take from one to n links from a source node to a destination node.
• For example, in a three-cube structure, node 000 can communicate directly with node 001, but must cross at least two links to communicate with 011 (from 000 to 001 to 011, or from 000 to 010 to 011).
" A routing procedure can be developed by computing the exclusive-OR of the source noue
address with the destination node address.
For example, in a three-cube structure, a message at 010 going to 001 produces a
to 000 and then
the two adaress equal to 011. The message can be sent along the second axis
through the third axis to 001.
resulting binary value will have l
" The message is then sent along any one of the axes that the
bits corresponding to the axes on which the two nodes differ.
• A representative of the hypercube architecture is the Intel iPSC computer complex. It consists of 128 (n = 7) microcomputers; each node consists of a CPU, a floating-point processor, local memory, and serial communication interface units.
The interconnection structure can decide the overall system's performance in a multiprocessor environment. Although using a common bus system is easy and simple, the availability of only one path is its major drawback: if the bus fails, the whole system fails. To overcome this and to improve overall performance, the crossbar, multiport, hypercube, and multistage switch networks evolved.
10.4 Multi-threaded Architecture

Multi-threading is a mechanism by which the instruction stream is divided into several smaller streams (threads) that can be executed in parallel.

Increasing the utilization of a processor by switching to another thread when one thread is stalled is known as hardware multi-threading.
• A multi-threaded CPU is not a parallel architecture, strictly speaking; multi-threading is obtained through a single CPU, but it allows a programmer to design and develop applications as a set of programs that can virtually execute in parallel; namely, threads.
• Multi-threading is a solution to avoid wasting clock cycles while missing data is fetched: the CPU manages more peer threads concurrently, and if a thread gets blocked, the CPU can execute instructions of another thread, thus keeping the functional units busy.
• Each thread must have a private program counter and a set of private registers, separate from other threads.
• The architecture often exposes a register file of words, and the instruction set is composed of instructions that operate on individual words.

Types of Multi-threading
1. Fine-grained Multi-threading
2. Coarse-grained Multi-threading
3. Simultaneous Multi-threading

Coarse-grained Multi-threading

A version of hardware multi-threading that implies switching between threads only after significant events, such as a last-level cache miss.
• This choice relieves the need to have thread switching be extremely fast, and is much less likely to slow down the execution of an individual thread, since instructions from other threads will only be issued when a thread encounters a costly stall.

Advantages
• There is no need for very fast thread switching.
• Doesn't slow down the execution of an individual thread.

Disadvantages
• It is hard to overcome throughput losses from shorter stalls, due to pipeline start-up costs.
• Since the CPU issues instructions from one thread, when a stall occurs the pipeline must be emptied.
• The new thread must fill the pipeline before instructions can complete.

Due to this start-up overhead, coarse-grained multi-threading is much more useful for reducing the penalty of high-cost stalls, where the pipeline refill time is negligible compared to the stall time.

Fine-grained Multi-threading

Aversion of hardware multi-threading that implies switching between threads after every
instruction resulting in interleaved execution of multiple threads. It switches from one thread
to another at each clock cycle.
" This interleaving is often done in around-robin fashion, skipping any threads that are stalled
at that clock cycle.
To make fine-grained multi-threading practical, the processor must be able to switch threads on
every clock cycle.
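The round-robin-with-skipping policy can be sketched as a toy cycle-by-cycle simulation (the function name `interleave` and the representation of stalls as a set of `(thread, cycle)` pairs are illustrative assumptions, not a real pipeline model):

```python
from collections import deque

# Toy simulation of fine-grained multi-threading: each cycle the processor
# issues one instruction from the next ready thread in round-robin order,
# skipping any thread that is stalled that cycle.
def interleave(threads, stalled_at):
    # threads: {tid: [instr, ...]}; stalled_at: set of (tid, cycle) stalls.
    order, ready = [], deque(sorted(threads))
    cycle = 0
    while any(threads.values()):
        for _ in range(len(ready)):
            tid = ready[0]
            ready.rotate(-1)           # advance the round-robin pointer
            if threads[tid] and (tid, cycle) not in stalled_at:
                order.append(threads[tid].pop(0))
                break                  # one instruction issued this cycle
        cycle += 1
    return order

prog = {0: ["A0", "A1"], 1: ["B0", "B1"]}
# Thread 1 stalls at cycle 1, so thread 0 fills that slot.
print(interleave(prog, stalled_at={(1, 1)}))  # ['A0', 'A1', 'B0', 'B1']
```

Without the stall, the issue order would be the strict alternation A0, B0, A1, B1; the stall at cycle 1 shows the "skip stalled threads" rule hiding the latency with work from the other thread.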

Advantages
• Vertical waste is eliminated. Pipeline hazards cannot arise.
• Zero switching overhead.
• Ability to hide latency within a thread, i.e., it can hide the throughput losses that arise from both short and long stalls.
• Instructions from other threads can be executed when one thread stalls.
• High execution efficiency.
• Potentially less complex than alternative high-performance processors.

Disadvantages
• Clock cycles are wasted if a thread has few operations to execute.
• Needs a lot of threads to execute.
• It is more expensive than coarse-grained multi-threading.
• It slows down the execution of the individual threads, since a thread that is ready to execute without stalls will be delayed by instructions from other threads.

Simultaneous Multi-threading (SMT)

• It is a variation on hardware multi-threading that uses the resources of a multiple-issue, dynamically scheduled pipelined processor to exploit thread-level parallelism at the same time it exploits instruction level parallelism.
• The key insight that motivates SMT is that multiple-issue processors often have more functional unit parallelism available than most single threads can effectively use.
• Since SMT relies on the existing dynamic mechanisms, it does not switch resources every cycle. Instead, SMT is always executing instructions from multiple threads, leaving it to the hardware to associate instruction slots and renamed registers with their proper threads.

Advantages
• The ability to boost utilization by dynamically scheduling functional units among multiple threads.
• It increases hardware design flexibility.
• It produces better performance and adds resources in a fine-grained manner.

Disadvantages
• It cannot improve performance if any of the shared resources are the limiting bottlenecks for performance.

10.5 Distributed Memory MIMD Architecture

MIMD (Multiple Instruction Multiple Data) architecture is one of the recent and popular computer architectures. Multiple instructions operate on multiple data to boost the performance of the computer. MIMD computers are basically computers with thread- and process-level architectures. MIMD is appropriate for programs restricted by condition statements. After advancement and development in integrated circuit technology, MIMD architectures became common.

An MIMD architecture comprises many processing elements (processors) which are connected with some memory through a common bus. Task and data are distributed among these processing elements, so at the same time all processing elements execute their instructions on their own data to complete the given task.

" All processors in the system are directly connected to own memory and caches. Any processor
cannot directly access another processor's memory.
" Each node has anetwork interface (NI).
nox.
" All communication and synchronization between processors happens via messages
through the NI.
" Since this approach uses messages for communication and synchronization, it is often called
message passing architecture.

[Figure 10.11 shows n+1 processing elements (nodes) PE0 through PEn, each pairing a processor (P0 through Pn) with its own memory (M0 through Mn), attached to an interconnection network.]

Figure 10.11: Structure of distributed memory MIMD architectures


A distributed memory MIMD architecture is known as a multicomputer. It replicates processor/memory pairs and links them through an interconnection network. A processor/memory pair is known as a processing element (PE), and PEs work more or less separated from each other. Interaction between them is possible only through message passing; one PE cannot directly access the memory of another PE. This class of MIMD machines is known as distributed memory MIMD architectures or message passing MIMD architectures.
In distributed-memory MIMD machines, each processor has its own memory location. Each processor has no explicit knowledge of other processors' memory. For data to be transmitted, it must be sent from one processor to another as a message. Because there is no shared memory, contention is not as great an issue with these machines. It is not economically feasible to connect every processor directly to every other. A method to prevent this multitude of direct connections is to connect each processor to only a few others. This type of design can be inefficient because of the added time needed to pass a message from one processor to another along the message path. The time needed for processors to perform even simple message routing can be considerable. The goal of MIMD architectural design is to develop a message passing parallel computer system organized such that the processor time spent in communication within the network is reduced to a minimum.
An MIMD (Multiple Instruction stream, Multiple Data stream) computer system has a number of independent processors that operate upon separate data concurrently. Hence each processor has its own program memory or has access to program memory. Similarly, each processor has its own data memory or access to data memory. Clearly there needs to be a mechanism to load the program and data memories, and a mechanism for passing information between processors as they work on some problem. MIMD has clearly emerged as the architecture of choice for general-purpose multiprocessors.

MIMD machines offer flexibility. With the correct hardware and software support, MIMDs can function as single-user machines focusing on high performance for one application, as multiprogrammed machines running many tasks simultaneously, or as some combination of these functions. There are two types of MIMD architectures: distributed memory MIMD architecture and shared memory MIMD architecture.
Components of the Multicomputer and Their Tasks

Within a multicomputer, there are a large number of nodes and a communication network linking these nodes together. Inside each node, there are three important elements that have tasks related to message passing:

(a) Computation processor and private memory.

(b) Communication processor. This component is responsible for organizing communication among the multicomputer nodes: "packetizing" a message as a chunk of memory on the giving end, and "de-packetizing" the same message for the receiving node.

(c) Router, commonly referred to as a switch unit. The router's main task is to transmit the message from one node to the next and to assist the communication processor in organizing the communication of the message through the network of nodes.

The development of each of the above three elements took place progressively, as hardware became more and more complex and useful. First came the Generation I computers, in which messages were passed through direct links between nodes, but there were no communication processors or routers. Then Generation II multicomputers came along, with independent switch units separate from the processor, and finally in the Generation III multicomputers, all three components exist.
Example systems of distributed memory MIMD architecture:
• IBM SP-2
• Intel Paragon
Advantages of Distributed Memory MIMD Architectures
• Every processor has its own local memory; therefore, there is no problem of memory contention.
• The processors cannot communicate through shared data structures, so sophisticated synchronization approaches like monitors are not required. Message passing satisfies all the requirements of communication and synchronization.
• These systems are highly scalable and good architecture candidates for building massively parallel computers.
10.16 . Computer Architecture

Disadvantages of Distributed Memory MIMD Architectures
• To achieve high performance in a multicomputer, special attention must be paid to load balancing. Although recently much research effort has been devoted to automatic mapping and load balancing, in many systems it is still the responsibility of the user to partition the code and data among the PEs.
• Message-passing-based communication and synchronization can lead to a deadlock situation. At the architecture level, it is the task of the communication protocol designer to avoid deadlocks derived from incorrect routing schemes. However, avoiding deadlocks derived from message-based synchronization at the software level is still the responsibility of the user.
• Although there is no architectural bottleneck in a multicomputer, message passing requires the physical copying of data structures between processes. Intensive data copying can result in significant performance degradation. This was particularly the case in the first generation of multicomputers, where the applied store-and-forward switching technique consumed both processor time and memory space.
• The problem is radically reduced in the second generation of multicomputers, where the introduction of wormhole routing and the employment of special-purpose communication processors resulted in an improvement of three orders of magnitude in communication latency.

10.6 Review Questions

Short Answer Questions

1. Define multiprocessor system.
2. How are multiprocessors classified?
3. List any two interconnection structures.
4. What is a time-shared common bus?
5. Define multithreading.
6. What are the different types of multithreading?

Long Answer Questions

1. Write the characteristics of multiprocessors.
2. Explain thread level parallelism.
3. Explain multiport memory.
4. Explain multithreaded architecture.
5. Write the advantages and disadvantages of fine-grained multithreading.
6. Write the advantages of distributed memory MIMD architecture.
7. Write the disadvantages of distributed memory MIMD architecture.
