Multiprocessors and Thread: Unit-4

Uploaded by

pallavi.bcwcc

Unit-4

MULTIPROCESSORS AND THREAD LEVEL PARALLELISM

Learning Objectives

• Multiprocessors
• Thread Level Parallelism
• Interconnection Structures
• Multi-threaded Architecture
• Distributed Memory MIMD Architecture
• Review Questions

10.1 Multiprocessors

[Figure: processor organizations are divided into SISD (uniprocessor), SIMD (vector processor, array processor), MISD, and MIMD. MIMD is divided into tightly coupled shared-memory systems (symmetric multiprocessor, SMP; nonuniform memory access, NUMA) and loosely coupled distributed-memory systems (clusters).]

Figure 10.1: Taxonomy of mono-multiprocessor organizations

A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment. The term "processor" in multiprocessor can mean either a central processing unit (CPU) or an input-output processor (IOP). However, a system with a single CPU and one or more IOPs is usually not included in the definition of a multiprocessor system unless the IOP has computational facilities comparable to a CPU. The various processors making up a multiprocessor typically share resources such as communication facilities, I/O devices, program libraries, and databases, and are controlled by a common operating system.

Definition of Multiprocessor

A multiprocessor is a multiple instruction stream, multiple data stream (MIMD) computer containing two or more CPUs that cooperate on common computational tasks.

[Figure: several processors connected to one another through an interprocessor communication network.]

Figure 10.2: Basic multiprocessor architecture

Computers are interconnected with each other by means of communication lines to form a computer network. The network consists of several autonomous computers that may or may not communicate with each other. A multiprocessor system is controlled by one operating system that provides interaction between processors, and all the components of the system cooperate in the solution of a problem. The fact that microprocessors take very little physical space and are very inexpensive brings about the feasibility of interconnecting a large number of microprocessors into one composite system.
Characteristics of Multiprocessors
• Very-large-scale integrated circuit technology has reduced the cost of computer components to such a low level that the concept of applying multiple processors to meet system performance requirements has become an attractive design possibility.
• Multiprocessing improves the reliability of the system so that a failure or error in one part has a limited effect on the rest of the system. If a fault causes one processor to fail, a second processor can be assigned to perform the functions of the disabled processor. The system as a whole can continue to function correctly with perhaps some loss in efficiency.
• Multiprocessing improves system performance because computations can proceed in parallel in one of two ways.
  • Multiple independent jobs can be made to operate in parallel, or
  • A single job can be partitioned into multiple parallel tasks.
• In a multiprocessor organization an overall function can be partitioned into a number of tasks that each processor can handle individually. System tasks may be allocated to special purpose processors whose design is optimized to perform certain types of processing efficiently.
Examples
• A computer system where one processor performs the computations for an industrial process control while others monitor and control the various parameters, such as temperature and flow rate.
• A computer where one processor performs high speed floating-point mathematical computations and another takes care of routine data-processing tasks.
• Multiprocessing can improve performance by decomposing a program into parallel executable tasks. This can be achieved in one of two ways.
  • The user can explicitly declare that certain tasks of the program be executed in parallel. This must be done prior to loading the program by specifying the parallel executable segments. Most multiprocessor manufacturers provide an operating system with programming language constructs suitable for specifying parallel processing.
  • The other, more efficient way is to provide a compiler with multiprocessor software that can automatically detect parallelism in a user's program. The compiler checks for data dependency in the program. If one part of a program depends on data generated in another part, the part yielding the needed data must be executed first. However, two parts of a program that do not use data generated by each other can run concurrently. The parallelizing compiler checks the entire program to detect any possible data dependencies. Those parts that have no data dependency are then considered for concurrent scheduling on different processors.
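
The data dependency check described above can be illustrated with a small sketch. It is not how any particular parallelizing compiler works; it only assumes that each program part can be summarized by the set of variables it reads and the set it writes. The function name `can_run_concurrently` and the three sample parts are hypothetical, and the third test (two parts writing the same variable) is an added assumption in the style of Bernstein's conditions.

```python
def can_run_concurrently(part_a, part_b):
    """Return True if neither part uses or overwrites data the other generates.

    Each part is a (reads, writes) pair of sets of variable names.
    """
    a_reads, a_writes = part_a
    b_reads, b_writes = part_b
    return not (
        a_writes & b_reads       # B needs data generated by A
        or b_writes & a_reads    # A needs data generated by B
        or a_writes & b_writes   # both generate the same variable (assumption)
    )


# Hypothetical program parts, written as (variables read, variables written).
s1 = ({"a", "b"}, {"x"})   # x = a + b
s2 = ({"c", "d"}, {"y"})   # y = c * d
s3 = ({"x", "y"}, {"z"})   # z = x - y

independent_pair = can_run_concurrently(s1, s2)
dependent_pair = can_run_concurrently(s1, s3)
print(independent_pair, dependent_pair)
```

Here s1 and s2 share no data, so they are candidates for concurrent scheduling, while s3 must wait for the part that yields x.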
Classification of Multiprocessors / Coupling of Processors
Based on the way memory is organized, multiprocessors are classified as
• Tightly coupled multiprocessor
• Loosely coupled multiprocessor
Tightly Coupled Multiprocessor / Shared Memory
A multiprocessor system with common shared memory is called a tightly coupled or shared-memory multiprocessor. This does not preclude each processor from having its own local memory. There is a global common memory that all CPUs can access. Information can therefore be shared among the CPUs by placing it in the common global memory.
Loosely Coupled Multiprocessor / Distributed Memory
Each processor element in a loosely coupled system has its own private local memory. Tasks or processors do not communicate in a synchronized fashion. The processors communicate with each other through a message-passing scheme which uses packets.
A packet consists of an address, the data content, and some error detection code. The packets are addressed to a specific processor or taken by the first available processor, depending on the communication system used. In loosely coupled systems the overhead for data exchange is high.
Loosely coupled systems are most efficient when the interaction between tasks is minimal, whereas tightly coupled systems can tolerate a higher degree of interaction between tasks and are used for high speed real time processing.
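
As a rough illustration of the packet layout just described (address, data content, error detection code), the sketch below builds a packet and checks it on arrival. The `Packet` class, its field names, and the choice of CRC-32 as the error detection code are assumptions made for the example; the text does not say which code a real system uses.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    address: int    # number of the destination processor
    data: bytes     # data content
    checksum: int   # error detection code (CRC-32 is an assumption here)


def make_packet(address, data):
    """Build a packet and attach a checksum computed over the data."""
    return Packet(address, data, zlib.crc32(data))


def is_intact(packet):
    """Recompute the checksum at the receiver and compare."""
    return zlib.crc32(packet.data) == packet.checksum


pkt = make_packet(3, b"partial sum = 42")
ok_before = is_intact(pkt)
pkt.data = b"partial sum = 43"   # simulate corruption in transit
ok_after = is_intact(pkt)
print(ok_before, ok_after)
```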
10.2 Thread Level Parallelism
A thread is a separate process with its own instructions and data. A thread may represent a process that is part of a parallel program consisting of multiple processes, or it may represent an independent program on its own. Each thread has all the state (instructions, data, program counter, register state, and so on) necessary to allow it to execute. A thread is a lightweight process, and threads commonly share a single address space, whereas processes don't.
Unlike instruction level parallelism, which exploits implicit parallel operations within a loop or straight-line code segment, thread level parallelism is explicitly represented by the use of multiple threads of execution that are inherently parallel.
Thread level parallelism is an important alternative to instruction level parallelism, primarily because it could be more cost-effective to exploit than instruction level parallelism. There are many important applications where thread level parallelism occurs naturally, as it does in many server applications.
Today, the term thread is often used in a casual way to refer to multiple loci of execution that may run on different processors, even when they do not share an address space. To take advantage of an MIMD multiprocessor with n processors, we must usually have at least n threads or processes to execute. The independent threads are typically identified by the programmer or created by the compiler. Since the parallelism in this situation is contained in the threads, it is called thread-level parallelism.
Threads may vary from large-scale, independent processes - for example, independent programs running in a multiprogrammed fashion on different processors - to parallel iterations of a loop, automatically generated by a compiler and each executing for perhaps less than a thousand instructions. Although the size of a thread is important in considering how to exploit thread-level parallelism efficiently, the important qualitative distinction is that such parallelism is identified at a high level by the software system and that the threads consist of hundreds to millions of instructions that may be executed in parallel. In contrast, instruction level parallelism is identified primarily by the hardware, though with software help in some cases, and is found and exploited one instruction at a time.
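
A minimal sketch of programmer-identified threads, assuming a loop whose iterations can be split into independent slices. The names `partial_sum`, `values`, and `results` are hypothetical. All threads share one address space (the `results` list), which is the property the text attributes to threads; in standard CPython builds the interpreter lock means this shows the structure of thread-level parallelism rather than a speedup.

```python
import threading


def partial_sum(values, start, stop, results, slot):
    # Each thread handles its own slice of the loop iterations and
    # stores its answer in a list that every thread can see.
    results[slot] = sum(values[start:stop])


values = list(range(1000))
n_threads = 4
chunk = len(values) // n_threads
results = [0] * n_threads

threads = [
    threading.Thread(
        target=partial_sum,
        args=(values, i * chunk, (i + 1) * chunk, results, i),
    )
    for i in range(n_threads)
]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait until every thread has finished

total = sum(results)
print(total)
```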
10.3 Interconnection Structures
The components that form the multiprocessor system are CPUs, IOPs and a memory unit that may be partitioned into a number of separate modules. The interconnection between the components of a multiprocessor system can have different physical configurations depending on the number of transfer paths that are available between the processors and memory in a shared memory system and among the processing elements in a distributed memory system.
The processors must be able to share a set of main memory modules and I/O devices in a multiprocessor system. This sharing capability can be provided through interconnection structures. The interconnection structures that are commonly used are
• Time-Shared / Common Bus
• Multiport Memory
• Crossbar Switch
• Multistage Switching Network
• Hypercube System

Time Shared Common Bus

A common-bus multiprocessor system consists of a number of processors connected through a common path to a memory unit. A time-shared common bus for five processors is shown in fig. 10.3. Only one processor can communicate with the memory or another processor at any given time. Any processor wishing to initiate a transfer must first determine the availability status of the bus, and only then is the transfer initiated. A command is issued to inform the destination unit. The receiving unit responds to the control signals from the sender.
Disadvantage
• There may be transfer conflicts since one common bus is shared by all processors.

[Figure: a memory unit and CPU 1 through CPU 5 attached to a single common bus.]

Figure 10.3: Time shared common bus organization

[Figure: a common shared memory on the system bus; several local buses, each with a CPU, an IOP, and a local memory, are linked to the system bus through system bus controllers.]

Figure 10.4: System Bus Structure for Multiprocessor

Memory access is fairly uniform, but not very scalable. A more economical implementation of a dual bus structure is illustrated in Figure 10.4. Here a number of local buses are each connected to their own local memory and to one or more processors. Each local bus may be connected to a CPU, an IOP or any combination of processors. A system bus controller links each local bus to a common system bus. The I/O devices connected to the local IOP, as well as the local memory, are available to the local processor. The memory connected to the common system bus is shared by all processors. If an IOP is connected directly to the system bus, the I/O devices attached to it may be made available to all processors. Only one processor can communicate with the shared memory and other common resources through the system bus at any given time. The other processors are kept busy communicating with their local memory and I/O devices. Part of the local memory may be designed as a cache memory attached to the CPU.

Advantages

• Inexpensive, as no extra hardware such as a switch is required.
• Simple and easy to configure, as the functional units are directly connected to the bus.

Disadvantages

• If a malfunction occurs in any of the bus interface circuits, the complete system will fail.
• Decreased throughput, since only one processor at a time can communicate with any other functional unit.
• The total overall transfer rate within the system is limited by the speed of the single path.
• Increased arbitration logic: as the number of processors and memory units increases, the bus contention problem increases.

Multiport Memory

A multiport memory system employs separate buses between each memory module and each CPU. Figure 10.5 shows a multiport memory system with four CPUs and four memory modules.
[Figure: memory modules MM 1 to MM 4, each with one port for the bus of each of CPU 1 to CPU 4.]

Figure 10.5: Multiport Memory

Each processor bus is connected to each memory module. A processor bus consists of the address, data, and control lines required to communicate with memory.
• The memory module is said to have four ports and each port accommodates one of the buses. Each port serves a CPU.
• The module must have internal control logic to determine which port will have access to memory at any given time.
• Memory access conflicts are resolved by assigning fixed priorities to each memory port. The priority for memory access associated with each processor may be established by the physical port position that its bus occupies in each module.
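
The fixed-priority rule above can be sketched as follows. This is only an illustration under stated assumptions: CPU numbers stand in for port positions, a lower number means higher priority, and each CPU requests at most one module per cycle. The function `arbitrate` and the sample request pattern are hypothetical.

```python
def arbitrate(requests):
    """Decide which CPU each memory module serves in one cycle.

    requests maps CPU number -> requested memory module.
    Returns a dict mapping module -> CPU granted access.
    """
    grants = {}
    for cpu in sorted(requests):          # port order encodes priority
        module = requests[cpu]
        grants.setdefault(module, cpu)    # first (highest-priority) request wins
    return grants


# CPUs 1, 2 and 4 all want module 2; CPU 3 wants module 0.
requests = {1: 2, 2: 2, 3: 0, 4: 2}
grants = arbitrate(requests)
waiting = sorted(cpu for cpu in requests if grants[requests[cpu]] != cpu)
print(grants, waiting)
```

Requests to different modules are served in the same cycle, which is where the high transfer rate of the multiport organization comes from; only requests to the same module have to wait.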

Advantages
• A high transfer rate can be achieved because of the multiple paths.

Disadvantages
• It requires expensive memory control logic and a large number of cables and connections.

Crossbar Switch

The crossbar switch organization consists of a number of crosspoints that are placed at intersections between processor buses and memory module paths. It provides a separate path for each module.

[Figure: (a) the buses of CPU 1 to CPU 4 crossing the paths of memory modules MM 1 to MM 4, with a switch at each crosspoint; (b) a memory module with data, address, read/write, and memory enable inputs driven by multiplexers and arbitration logic that receive data, address, and control from CPU 1 through CPU 4.]

Figure 10.6: (a) Crossbar switch (b) Block diagram of crossbar switch
Each switch point has control logic to set up the transfer path between a processor and a memory. It examines the address that is placed on the bus to determine whether its particular module is being addressed. It also resolves multiple requests for access to the same memory on a predetermined priority basis. It also supports simultaneous transfers from all memory modules because there is a separate path associated with each module.
The functional design of a crossbar switch connected to one memory module is shown in figure 10.6. The circuit consists of multiplexers that select the data, address, and control from one CPU for communication with the memory module.
Priority levels are established by the arbitration logic to select one CPU when two or more CPUs attempt to access the same memory.

Advantages
• Supports simultaneous transfers from all memory modules.

Disadvantages
• The hardware required to implement the switch can become quite large and complex.
Multi-Stage Switching Network
• The basic component of a multi-stage switching network is a two-input, two-output interchange switch. The switch has the capacity to connect either input to either of the outputs.
[Figure: a 2 x 2 interchange switch with inputs A and B and outputs 0 and 1, shown in each of its connection states.]

Figure 10.7: Operation of 2 x 2 Interconnection switch

Using the 2 x 2 switch as a building block, it is possible to build a multi-stage network to control the communication between a number of sources and destinations.
To see how this is done, consider the binary tree shown in fig. 10.8 below.
[Figure: processors P1 and P2 enter a binary tree of 2 x 2 switches whose outputs lead to destinations 000 through 111. Note in the figure: some requests cannot be satisfied simultaneously. For example, if P1 is connected to one of the destinations 000 through 011, P2 can be connected to only one of the destinations 100 through 111.]

Figure 10.8: Binary tree with 2 x 2 switches

The two processors P1 and P2 are connected through switches to eight memory modules marked in binary from 000 through 111. The path from a source to a destination is determined from the binary bits of the destination number. The first bit of the destination number determines the switch output in the first level. The second bit specifies the output of the switch in the second level, and the third bit specifies the output of the switch in the third level.
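
The bit-by-bit path selection can be written out as a short sketch. The helper names `tree_route` and `tree_conflict` are hypothetical, and the conflict test assumes the tree of figure 10.8, where P1 and P2 share the single first-level switch and therefore cannot both use the same first-level output.

```python
def tree_route(destination, levels=3):
    """Return the switch output (0 or 1) selected at each level.

    The first bit of the destination picks the output at level 1,
    the second bit at level 2, and so on.
    """
    bits = format(destination, f"0{levels}b")
    return [int(b) for b in bits]


def tree_conflict(dest_p1, dest_p2, levels=3):
    """True if both requests need the same output of the first-level switch."""
    return tree_route(dest_p1, levels)[0] == tree_route(dest_p2, levels)[0]


route_to_101 = tree_route(0b101)
same_half = tree_conflict(0b000, 0b011)    # both need first-level output 0
other_half = tree_conflict(0b000, 0b100)   # different first-level outputs
print(route_to_101, same_half, other_half)
```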
Many different topologies have been proposed for multi-stage switching networks to control processor-memory communication in a tightly coupled multiprocessor system or to control the communication between the processing elements in a loosely coupled system.
• One such topology is the omega switching network shown in fig 10.9.
• In this configuration, there is exactly one path from each source to any particular destination.

[Figure: sources 0 to 7 connected through stages of 2 x 2 switches to destinations 000 to 111.]

Figure 10.9: 8 x 8 Omega switching network

• Some request patterns cannot be connected simultaneously, i.e., any two sources cannot be connected simultaneously to destinations 000 and 001. In a tightly coupled multiprocessor system, the source is a processor and the destination is a memory module.
• Set up the path → transfer the address into memory → transfer the data
• In a loosely coupled multiprocessor system, both the source and destination are processing elements. After path establishment the source transfers a message to the destination processor.
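
The single-path property and the possibility of blocked request patterns can be explored with a small model. The sketch assumes the commonly described omega wiring, in which every stage applies a perfect shuffle (a left rotation of the line number) before its column of 2 x 2 switches, and each switch output is chosen by one destination bit. The wiring of figure 10.9 may differ in detail, and `omega_path` and `paths_conflict` are hypothetical names.

```python
def omega_path(source, destination, n_bits=3):
    """Return the line number occupied after each stage of the network."""
    mask = (1 << n_bits) - 1
    line = source
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        line = ((line << 1) | (line >> (n_bits - 1))) & mask
        # The switch output is chosen by the next destination bit (MSB first).
        bit = (destination >> (n_bits - 1 - stage)) & 1
        line = (line & ~1) | bit
        path.append(line)
    return path


def paths_conflict(req_a, req_b, n_bits=3):
    """True if two (source, destination) requests need the same line at some stage."""
    path_a = omega_path(*req_a, n_bits)
    path_b = omega_path(*req_b, n_bits)
    return any(a == b for a, b in zip(path_a, path_b))


blocked = paths_conflict((0, 0b000), (4, 0b001))
compatible = paths_conflict((0, 0b000), (1, 0b111))
print(omega_path(4, 0b001), blocked, compatible)
```

In this model the requests from sources 0 and 4 to destinations 000 and 001 collide at the first stage, so one of them has to wait, while the second pair can be set up at the same time.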
Hypercube System: The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N = 2^n processors interconnected in an n-dimensional binary cube.
• Each processor forms a node of the cube; in effect it contains not only a CPU but also local memory and an I/O interface.
• Each processor has direct communication paths to n other neighbour processors. These paths correspond to the edges of the cube.

[Figure: a one-cube with nodes 0 and 1; a two-cube with nodes 00, 01, 10, and 11; a three-cube with nodes 000 through 111.]

Figure 10.10: Hypercube structure for n = 1, 2, and 3.

• A one-cube structure has n = 1 and 2^n = 2. It contains two processors interconnected by a single path.
• A two-cube structure has n = 2 and 2^n = 4. It contains four nodes interconnected as a square.
• An n-cube structure has 2^n nodes with a processor residing in each node. Each node is assigned a binary address in such a way that the addresses of two neighbours differ in exactly one bit position.
• Routing messages through an n-cube structure may take from one to n links from a source node to a destination node.
For example, in a three-cube structure, node 000 can communicate directly with node 001. It must cross at least two links to communicate with 011 (from 000 to 001 to 011 or from 000 to 010 to 011).
• A routing procedure can be developed by computing the exclusive-OR of the source node address with the destination node address.
• The resulting binary value will have 1 bits corresponding to the axes on which the two nodes differ. The message is then sent along any one of these axes.
For example, in a three-cube structure, a message at 010 going to 001 produces an exclusive-OR of the two addresses equal to 011. The message can be sent along the second axis to 000 and then through the third axis to 001.
• A representative of the hypercube architecture is the Intel iPSC computer complex.
• It consists of 128 (n = 7) microcomputers; each node consists of a CPU, a floating-point processor, local memory, and serial communication interface units.
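
The exclusive-OR routing procedure can be traced with a few lines of code. This is a sketch of one possible policy, which always takes the lowest-numbered differing axis first, with axis 1 taken to be the leftmost address bit so that it matches the 010 to 001 example in the text. The function name `hypercube_route` is hypothetical.

```python
def hypercube_route(source, destination, n):
    """Return the node addresses visited from source to destination.

    The XOR of the two addresses has a 1 bit on every axis where the
    nodes differ; the message crosses one such axis per link.
    """
    diff = source ^ destination
    node = source
    route = [node]
    for axis in range(1, n + 1):        # axis 1 is the leftmost bit
        bit = 1 << (n - axis)
        if diff & bit:
            node ^= bit                 # move along this axis to the neighbour
            route.append(node)
    return route


route = hypercube_route(0b010, 0b001, 3)
labels = [format(node, "03b") for node in route]
links = len(route) - 1
print(labels, links)
```

The number of links crossed equals the number of 1 bits in the exclusive-OR, which is why a route in an n-cube takes between one and n links.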
The interconnection structure can decide the overall system's performance in a multiprocessor environment. Although using a common bus system is easy and simple, the availability of only one path is its major drawback: if the bus fails, the whole system fails. To overcome this and to improve overall performance, crossbar, multiport, hypercube and multistage switch networks evolved.
10.4 Multi-threaded Architecture
Multi-threading is a mechanism by which the instruction stream is divided into several smaller streams (threads) that can be executed in parallel.
Increasing utilization of a processor by switching to another thread when one thread is stalled is known as hardware multi-threading.
A multi-threaded CPU is not a parallel architecture, strictly speaking; multi-threading is obtained through a single CPU, but it allows a programmer to design and develop applications as a set of programs that can virtually execute in parallel; namely, threads.
• Multi-threading is a solution to avoid waiting clock cycles as the missing data is fetched: the CPU manages more peer-threads concurrently, and if a thread gets blocked, the CPU can execute instructions of another thread, thus keeping functional units busy.
• Each thread must have a private Program Counter and a set of private registers, separate from other threads.
• The architecture often exposes a register file of words, and the instruction set is composed of instructions that operate on individual words.

Types of Multi-threading

1. Fine-grained Multi-threading
2. Coarse-grained Multi-threading
3. Simultaneous Multi-threading

Coarse-grained Multi-threading
• A version of hardware multi-threading that implies switching between threads only after significant events, such as a last-level cache miss.
• This change relieves the need to have thread switching be extremely fast and is much less likely to slow down the execution of an individual thread, since instructions from other threads will only be issued when a thread encounters a costly stall.

Advantages
• Thread switching does not need to be very fast.
• It does not slow down the individual thread.

Disadvantages
• It is hard to overcome throughput losses from shorter stalls, due to pipeline start-up costs.
• Since the CPU issues instructions from one thread, when a stall occurs, the pipeline must be emptied.
• The new thread must fill the pipeline before instructions can complete.
Due to this start-up overhead, coarse-grained multi-threading is much more useful for reducing the penalty of high-cost stalls, where pipeline refill is negligible compared to the stall time.

Fine-grained Multi-threading

• A version of hardware multi-threading that implies switching between threads after every instruction, resulting in interleaved execution of multiple threads. It switches from one thread to another at each clock cycle.
• This interleaving is often done in a round-robin fashion, skipping any threads that are stalled at that clock cycle.
• To make fine-grained multi-threading practical, the processor must be able to switch threads on every clock cycle.
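
The round-robin interleaving with skipping of stalled threads can be simulated in a few lines. This is a toy model under stated assumptions, not a description of any real processor: one instruction is issued per cycle, and stalls are given in advance as a table of which threads are stalled in which cycle. The function `fine_grained_schedule` and its arguments are hypothetical.

```python
def fine_grained_schedule(n_instructions, stalled, n_cycles):
    """Return the thread that issues in each cycle (None if no thread can).

    n_instructions: instructions each thread still has to issue.
    stalled: dict mapping cycle number -> set of threads stalled in that cycle.
    """
    remaining = list(n_instructions)
    n_threads = len(remaining)
    next_thread = 0
    issued = []
    for cycle in range(n_cycles):
        choice = None
        for offset in range(n_threads):          # round-robin search
            t = (next_thread + offset) % n_threads
            if remaining[t] > 0 and t not in stalled.get(cycle, set()):
                choice = t
                break
        if choice is not None:
            remaining[choice] -= 1
            next_thread = (choice + 1) % n_threads
        issued.append(choice)
    return issued


# Three threads with three instructions each; thread 1 is stalled in cycles 1 and 2.
trace = fine_grained_schedule([3, 3, 3], {1: {1}, 2: {1}}, 9)
print(trace)
```

In this run the stall of thread 1 is hidden completely: threads 2 and 0 fill cycles 1 and 2, and no cycle is left idle.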

Advantages
• Vertical waste is eliminated. Pipeline hazards cannot arise.
• Zero switching overhead.
• Ability to hide latency within a thread, i.e., it can hide the throughput losses that arise from both short and long stalls.
• Instructions from other threads can be executed when one thread stalls.
• High execution efficiency.
• Potentially less complex than alternative high performance processors.

Disadvantages
• Clock cycles are wasted if a thread has few operations to execute.
• Needs a lot of threads to execute.
• It is more expensive than coarse-grained multi-threading.
• It slows down the execution of the individual threads, since a thread that is ready to execute without stalls will be delayed by instructions from other threads.

Simultaneous multi-threading (SMT)

• It is a variation on hardware multi-threading that uses the resources of a multiple-issue, dynamically scheduled pipelined processor to exploit thread-level parallelism at the same time it exploits instruction level parallelism.
• The key insight that motivates SMT is that multiple-issue processors often have more functional unit parallelism available than most single threads can effectively use.
• Since SMT relies on the existing dynamic mechanisms, it does not switch resources every cycle. Instead, SMT is always executing instructions from multiple threads, leaving it to the hardware to associate instruction slots and renamed registers with their proper threads.

Advantages

• It has the ability to boost utilization by dynamically scheduling functional units among multiple threads.
• It increases hardware design facility.
• It produces better performance and adds resources in a fine-grained manner.

Disadvantages
• It cannot improve performance if any of the shared resources are the limiting bottlenecks for the performance.


10.5 Distributed Memory MIMD Architecture
MIMD (Multiple Instruction Multiple Data) architecture is one of the recent and popular computer architectures. Multiple instructions work on multiple data to boost the performance of the computer. MIMD computers are basically computers with thread- and process-level architectures. MIMD is appropriate for programs restricted by condition statements. After advancement and development in integrated circuit technology the MIMD architectures became common.
MIMD architecture comprises many processing elements (processors) which are connected with some memory through a common bus. Tasks and data are distributed among these processing elements. So at the same time all processing elements execute instructions on their own data to complete the given task.

• All processors in the system are directly connected to their own memory and caches. No processor can directly access another processor's memory.
• Each node has a network interface (NI).
• All communication and synchronization between processors happens via messages through the NI.
• Since this approach uses messages for communication and synchronization, it is often called message passing architecture.
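
The message passing style listed above can be imitated in software. In this sketch each node keeps a private dictionary as its memory and a queue that stands in for its network interface; threads play the role of processors only so that the example runs on one machine. The `Node` class and the `producer` and `consumer` functions are hypothetical.

```python
import queue
import threading


class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {}            # private memory, never touched by other nodes
        self.inbox = queue.Queue()  # stands in for the network interface (NI)


def producer(node, peer):
    node.memory["partial"] = sum(range(10))
    # Communication happens only by sending a message to the peer's NI.
    peer.inbox.put(("partial", node.memory["partial"]))


def consumer(node):
    # The blocking receive also synchronizes the two nodes.
    key, value = node.inbox.get()
    node.memory[key] = value


pe0, pe1 = Node("PE0"), Node("PE1")
receiver = threading.Thread(target=consumer, args=(pe1,))
sender = threading.Thread(target=producer, args=(pe0, pe1))
receiver.start()
sender.start()
sender.join()
receiver.join()

received = pe1.memory["partial"]
print(received)
```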

[Figure: processing elements (nodes) PE0, PE1, ..., PEn, each consisting of a memory M0, M1, ..., Mn and a processor P0, P1, ..., Pn, all attached to an interconnection network.]

Figure 10.11: Structure of Distributed Memory MIMD Architectures

Distributed memory MIMD architecture is known as a multicomputer. It replicates the processor/memory pairs and links them through an interconnection network. The processor/memory pair is known as the processing element (PE), and PEs work more or less separately from each other. Interaction between them is possible only through message passing; one PE cannot directly access the memory of another PE. This class of MIMD machines is known as distributed memory MIMD architectures or message passing MIMD architectures.
In distributed-memory MIMD machines, each processor has its own memory location. Each processor has no explicit knowledge about other processors' memory. For data to be shared, it must be passed from one processor to another as a message. Because there is no shared memory, contention is not as great an issue with these machines. It is not economically feasible to connect a large number of processors directly to each other.
A method to avoid this multitude of direct connections is to connect each processor to only a few others. This type of design can be inefficient because of the added time needed to pass a message from one processor to another along the message path. The amount of time needed for processors to perform simple message routing can be considerable. The goal of MIMD architectural design is to develop a message passing parallel computer system organized such that the processor time spent in communication within the network is reduced to a minimum.
An MIMD (Multiple Instruction stream, Multiple Data stream) computer system has a number of independent processors that operate upon separate data concurrently. Hence each processor has its own program memory or has access to program memory. Similarly, each processor has its own data memory or access to data memory. Clearly there needs to be a mechanism to load the program and data memories and a mechanism for passing information between processors as they work on some problem. MIMD has clearly emerged as the architecture of choice for general-purpose multiprocessors. MIMD machines offer flexibility. With the correct hardware and software support, MIMDs can function as single user machines focusing on high performance for one application, as multiprogrammed machines running many tasks simultaneously, or as some combination of these functions. There are two types of MIMD architectures: distributed memory MIMD architecture, and shared memory MIMD architecture.
Components of the Multicomputer and Their Tasks
Within a multicomputer, there are a large number of nodes and a communication network linking these nodes together. Inside each node, there are three important elements that have tasks related to message passing:
(a) Computation Processor and Private Memory
(b) Communication Processor
This component is responsible for organizing communication among the multicomputer nodes, "packetizing" a message as a chunk of memory on the giving end, and "de-packetizing" the same message for the receiving node.
(c) Router, commonly referred to as switch units
The router's main task is to transmit the message from one node to the next and assist the communication processor in organizing the communication of the message through the network of nodes.
The development of each of the above three elements took place progressively, as hardware became more and more complex and useful. First came the Generation I computers, in which messages were passed through direct links between nodes, but there were not any communication processors or routers. Then Generation II multicomputers came along with independent switch units that were separate from the processor, and finally in the Generation III multicomputer, all three components exist.
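
The "packetizing" and "de-packetizing" work attributed to the communication processor can be sketched as below. The packet fields (`dest`, `seq`, `total`, `payload`) and the eight-byte payload size are assumptions chosen for the example; the text does not define a packet format for any particular machine.

```python
def packetize(message, dest, size=8):
    """Split a message (bytes) into numbered packets for one destination node."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"dest": dest, "seq": n, "total": len(chunks), "payload": chunk}
        for n, chunk in enumerate(chunks)
    ]


def depacketize(packets):
    """Rebuild the message at the receiving node, whatever the arrival order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing packets")
    return b"".join(p["payload"] for p in ordered)


message = b"result vector: 1 2 3 4 5"
packets = packetize(message, dest=2)
restored = depacketize(list(reversed(packets)))   # simulate out-of-order arrival
print(len(packets), restored)
```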
Example systems of Distributed Memory MIMD architecture:
• IBM SP-2
• Intel Paragon
Advantages of Distributed Memory MIMD Architectures
• Every processor in a distributed memory system has its own local memory; therefore, there is no problem of memory contention.
• The processors cannot communicate through shared data structures, and therefore sophisticated synchronization approaches like monitors are not required. Message passing solves all the requirements of communication and synchronization.
• These systems are highly scalable and are good architecture candidates for building massively parallel computers.
10.16 . Computer Architecture

Disadvantages of Distributed Memory MIMD Architectures

• To achieve high performance in multicomputers, special attention must be paid to load balancing. Although recently much research effort has been devoted to automatic mapping and load balancing, in many systems it is still the responsibility of the user to partition the code and data among the PEs.
• Message-passing-based communication and synchronization can lead to a deadlock situation. On the architecture level, it is the task of the communication protocol designer to avoid deadlocks derived from incorrect routing schemes. However, avoiding deadlocks derived from message-based synchronization at the software level is still the responsibility of the user.
• Although there is no architectural bottleneck in multicomputers, message passing requires data structures to be physically copied between processes. Intensive data copying can result in significant performance degradation. This was particularly the case in the first generation of multicomputers, where the applied store-and-forward switching technique consumed both processor time and memory space.
• The problem is radically reduced in the second generation of multicomputers, where the introduction of wormhole routing and the employment of special-purpose communication processors resulted in an improvement of three orders of magnitude in communication latency.
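The difference between store-and-forward switching and wormhole routing can be made concrete with a simplified latency model. This is a common first-order textbook approximation, not a formula taken from this unit: it ignores routing delay, contention and software overhead, and every parameter name and value below is an assumption chosen for illustration. With store-and-forward, each node on the path receives the whole message before sending it on, so latency grows with message length times distance. With wormhole routing, only a small header flit pays the per-hop cost and the rest of the message is pipelined behind it.

```python
def store_and_forward_latency(message_bytes, hops, bandwidth_bytes_per_s):
    """Each hop buffers the complete message before forwarding it."""
    return hops * (message_bytes / bandwidth_bytes_per_s)


def wormhole_latency(message_bytes, hops, bandwidth_bytes_per_s, flit_bytes):
    """Only the header flit pays the per-hop cost; the body is pipelined."""
    header_time = hops * (flit_bytes / bandwidth_bytes_per_s)
    body_time = message_bytes / bandwidth_bytes_per_s
    return header_time + body_time


if __name__ == "__main__":
    # Hypothetical values: 1000-byte message, 10-byte flits, 1000 bytes/s links.
    for hops in (1, 4, 16):
        sf = store_and_forward_latency(1000, hops, 1000.0)
        wh = wormhole_latency(1000, hops, 1000.0, 10)
        print(f"hops={hops:2d}  store-and-forward={sf:.2f}s  wormhole={wh:.2f}s")
```

In this model the wormhole latency is almost independent of the number of hops, and intermediate nodes no longer need buffer space for whole messages. The model does not reproduce the three-orders-of-magnitude figure quoted above, which also reflects the dedicated communication processors.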

10.6 Review Questions

Short Answer Questions

1. Define multiprocessor system.

2. How are multiprocessors classified?
3. List out any two interconnection structures.
4. What is time shared common bus?
5. Define multithreading.
6. What are the different types of multithreading?
Long Answer Questions
1. Write the characteristics of multiprocessors.
2. Explain thread level parallelism.
3. Explain multiport memory.
4. Explain multithreaded architecture.
5. Write the advantages and disadvantages of fine-grained multithreading.
6. Write the advantages of Distributed Memory MIMD Architecture.
7. Write the disadvantages of Distributed Memory MIMD Architecture.

