1.2 Underlying Principles of Parallel and Distributed Computing
UNIT NO 1
INTRODUCTION
20CSPC603
CLOUD COMPUTING (Common to CSE & IT)
⮚ Levels of Parallelism
⮚Shared memory: a single address space that is accessible to all the processors.
⮚Parallel programs:
⮚Are broken down into several units of execution that can be allocated to different processors and
can communicate with each other by means of shared memory.
⮚Parallel systems: include all architectures that are based on the concept of shared memory,
whether it is physically present or created with the support of libraries or specific hardware.
⮚Distributed computing:
• Encompasses any architecture or system that allows the computation to be broken down into units
and executed concurrently on different computing elements.
⮚Distributed computing includes a wider range of systems and applications than parallel computing.
• Computing grids
⮚ A parallel program consists of multiple active processes (tasks) simultaneously solving a given
problem.
⮚ A given task is divided into multiple subtasks using a divide-and-conquer technique, and each subtask is
processed on a different central processing unit (CPU).
⮚ Programming a multiprocessor system using the divide-and-conquer technique is called parallel
programming.
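The divide-and-conquer idea above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the names `split` and `parallel_sum` are hypothetical, and threads from the standard `concurrent.futures` module stand in for separate CPUs (in CPython, true CPU parallelism would need processes instead).

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide: cut `data` into `parts` nearly equal subtasks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, workers=4):
    """Solve each subtask on its own worker, then combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))  # conquer: one subtask per worker
    return sum(partial_sums)                        # combine


print(parallel_sum(list(range(1, 101))))  # 5050
```

The combine step is cheap here because addition is associative; problems whose subtasks depend on each other need extra coordination.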
⮚ Based on the number of instruction and data streams that can be processed simultaneously,
computing systems are classified into four categories: SISD, SIMD, MISD, and MIMD.
Single-Instruction, Single-Data (SISD) systems
[Figure: Processor 1, Processor 2, ..., Processor N]
⮚A SIMD computing system is a multiprocessor machine capable of executing the same instruction on all the
CPUs but operating on different data streams.
⮚Machines based on this model are well suited for scientific computing, since it involves many vector and
matrix operations.
⮚For instance, the statement Ci = Ai * Bi can be passed to all the processing elements (PEs); the data
elements of vectors A and B can be divided into multiple sets (N sets for an N-PE system), and each PE can
process one data set.
⮚Dominant representative SIMD systems are Cray's vector processing machines, Thinking Machines' Connection
Machine (CM) series, and GPGPU accelerators.
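The Ci = Ai * Bi example can be mimicked in Python as a rough sketch. The function name `simd_apply` and the interleaved way of dividing the vectors among PEs are assumptions made for illustration; on real SIMD hardware the PEs execute the instruction in lockstep rather than in a loop.

```python
import operator


def simd_apply(instruction, a, b, num_pes):
    """Broadcast one instruction to num_pes PEs; each PE processes its own data set."""
    if len(a) != len(b):
        raise ValueError("vectors A and B must have the same length")
    result = [None] * len(a)
    for pe in range(num_pes):                 # conceptually simultaneous on SIMD hardware
        for i in range(pe, len(a), num_pes):  # PE `pe` owns elements pe, pe+N, pe+2N, ...
            result[i] = instruction(a[i], b[i])
    return result


A = [1, 2, 3, 4, 5, 6]
B = [10, 20, 30, 40, 50, 60]
C = simd_apply(operator.mul, A, B, num_pes=3)
print(C)  # [10, 40, 90, 160, 250, 360]
```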
CS8791
⮚A MISD computing system is a multiprocessor machine capable of executing different instructions
on different PEs, all of them operating on the same data set.
For example
⮚Machines built using the MISD model are not useful in most applications.
⮚Systems of this type are more of an intellectual exercise than a practical configuration.
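A minimal sketch of the MISD idea, under stated assumptions: the helper `misd_apply` is hypothetical, and the three trigonometric functions are chosen here only to show different instructions applied to one data item.

```python
import math


def misd_apply(instructions, x):
    """Run every instruction (one per PE) on the same data item x."""
    return [instruction(x) for instruction in instructions]


results = misd_apply([math.sin, math.cos, math.tan], 0.0)
print(results)  # [0.0, 1.0, 0.0]
```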
Multiple-Instruction, Multiple-Data (MIMD) systems
[Figure: MIMD architecture. Processors 1 to N, each with its own data input and data output]
⮚A MIMD computing system is a multiprocessor machine capable of executing multiple instructions on
multiple data sets.
⮚Each PE in the MIMD model has separate instruction and data streams; hence machines built using
this model are well suited to any kind of application.
⮚MIMD machines are broadly categorized into shared-memory MIMD and distributed-memory MIMD,
based on the way PEs are coupled to the main memory.
⮚In shared-memory MIMD, all the PEs are connected to a single global memory and they all have access to it.
⮚Systems based on this model are also called tightly coupled multiprocessor systems.
⮚The communication between PEs in this model takes place through the shared memory.
⮚A modification of the data stored in the global memory by one PE is visible to all other PEs.
⮚Dominant representative shared-memory MIMD systems are Silicon Graphics machines and Sun/IBM SMP
(Symmetric Multi-Processing) systems.
[Figure: Memory Bus]
⮚The shared-memory MIMD architecture is easier to program but is less tolerant of failures and harder to
extend than the distributed-memory MIMD model.
⮚Failures in a shared-memory MIMD system affect the entire system, whereas this is not the case for the
distributed model, in which each of the PEs can be easily isolated.
⮚Moreover, shared-memory MIMD architectures are less likely to scale, because the addition of more PEs leads
to memory contention.
⮚This situation does not arise in the case of distributed memory, in which each PE has its own
memory.
⮚As a result, distributed-memory MIMD architectures are the most popular today.
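The contrast between the two MIMD variants can be sketched as follows. This is an illustration under assumptions, not a model of real hardware: threads play the role of PEs, a lock-protected variable plays the role of the global memory, and a `queue.Queue` stands in for the interconnection network; the function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(values, num_pes=4):
    """PEs communicate through one global variable guarded by a lock."""
    total = {"value": 0}          # the single global memory
    lock = threading.Lock()

    def pe(chunk):
        local = sum(chunk)
        with lock:                # contention point: every PE must take this lock
            total["value"] += local

    threads = [threading.Thread(target=pe, args=(values[i::num_pes],))
               for i in range(num_pes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def distributed_memory_sum(values, num_pes=4):
    """Each PE works on private data and sends its result as a message."""
    channel = queue.Queue()       # stands in for the network between PEs

    def pe(private_chunk):
        channel.put(sum(private_chunk))   # no shared variable is touched

    threads = [threading.Thread(target=pe, args=(values[i::num_pes],))
               for i in range(num_pes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(channel.get() for _ in range(num_pes))


data = list(range(1000))
print(shared_memory_sum(data), distributed_memory_sum(data))  # 499500 499500
```

In the first version every added PE competes for the same lock, which mirrors the memory contention described above; in the second, adding PEs only adds messages.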
⮚A sequential program is one that runs on a single processor and has a single line of control.
⮚To make many processors collectively work on a single program, the program must be divided
into smaller independent chunks so that each processor can work on separate chunks of the
problem.
⮚A program decomposed in this way is a parallel program. A wide variety of parallel
programming approaches are available.
⮚In the case of process parallelism, a given operation has multiple (but distinct) activities that
can be processed on multiple processors.
⮚In the case of the farmer-and-worker model, a job distribution approach is used: one processor is
configured as master and all the remaining PEs are designated as slaves. The master assigns
the jobs to the slave PEs and, on completion, they inform the master, which in turn collects the results.
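A minimal farmer-and-worker sketch using Python's standard `queue` and `threading` modules. The names `farmer` and `worker`, the squaring job, and the use of `None` as a stop signal are all assumptions made for the example.

```python
import queue
import threading


def worker(jobs, results):
    """Worker PE: take jobs until the stop signal arrives, report each result."""
    while True:
        job = jobs.get()
        if job is None:               # stop signal from the farmer
            break
        results.put((job, job * job))  # inform the farmer on completion


def farmer(job_list, num_workers=3):
    """Farmer (master) PE: hand out jobs, then collect the results."""
    jobs, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job in job_list:
        jobs.put(job)                 # assign the jobs
    for _ in workers:
        jobs.put(None)                # one stop signal per worker
    for w in workers:
        w.join()
    collected = dict(results.get() for _ in job_list)
    return [collected[job] for job in job_list]


print(farmer([1, 2, 3, 4, 5]))  # [1, 4, 9, 16, 25]
```

Because workers pull jobs from a common queue, a fast worker naturally takes more jobs than a slow one, which is one reason this model balances load well.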
Levels of Parallelism
⮚Levels of parallelism are decided based on the lumps of code (grain size) that can be a potential
candidate for parallelism.
•To conceal latency, there must be another thread ready to run whenever a lengthy operation
occurs.
[Figure: Levels of parallelism. Large level (processes, tasks): Task 1, Task 2, ..., Task N exchange messages via IPC. Medium level (threads, functions): function f1(), function f2(), ..., function fJ() communicate through shared memory.]
Laws of Caution
⮚These laws concern how much an application or a software system can gain from parallelism.
⮚In particular, keep in mind that parallelism is used to perform multiple
activities together so that the system can increase its throughput or its speed.
⮚But the relations that control the increase in speed are not linear.
⮚For example, given n processors, the user expects the speed to increase by n times.
This is an ideal situation, but it rarely happens because of communication overhead.
⮚The speed of computation is proportional to the square root of the system cost; it never increases
linearly. Therefore, the faster a system becomes, the more expensive it is to increase its speed.
⮚The speed of a parallel computer increases as the logarithm of the number of processors (i.e.,
y = k*log(N)).
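A small numeric check makes the two relations concrete. The function names are hypothetical, the constant k is set to 1, and base 2 is assumed for the logarithm, since the source does not fix either.

```python
import math


def speed_from_cost(cost, k=1.0):
    """First law: speed is proportional to the square root of the system cost."""
    return k * math.sqrt(cost)


def speed_from_processors(n, k=1.0):
    """Second law: y = k * log(N); base 2 is an assumption of this sketch."""
    return k * math.log2(n)


# Quadrupling the cost only doubles the speed.
print(speed_from_cost(100), speed_from_cost(400))              # 10.0 20.0

# Going from 2 to 256 processors (128 times more) gives only 8 times the speed.
print([speed_from_processors(n) for n in (2, 4, 16, 256)])     # [1.0, 2.0, 4.0, 8.0]
```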
⮚ The components of a distributed system communicate through some form of message passing. This
is a term that encompasses several communication models.
[Figure: layers of a distributed system, including "Applications" and "Frameworks for distributed programming"]
⮚Software architectural styles are based on the logical arrangement of software components.
⮚They are helpful because they provide an intuitive view of the whole system, despite its physical
deployment.
⮚They also identify the main abstractions that are used to shape the components of the system and
the expected interaction patterns between them
⮚The repository architectural style is the most relevant reference model in this category.
⮚It is characterized by two main components: the central data structure, which represents the current
state of the system, and a collection of independent components, which operate on the central data.
⮚The ways in which the independent components interact with the central data structure can be very
heterogeneous.
⮚In particular, repository-based architectures differentiate and specialize further into subcategories
according to the choice of control discipline to apply to the shared data structure. Of particular
interest are databases and blackboard systems.
⮚In repository systems, the dynamics of the system are controlled by the independent components,
which, by issuing an operation on the central repository, trigger the selection of specific
processes that operate on data.
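The repository style can be sketched as a central data structure plus processes that are triggered when a component issues an operation on it. The class `Repository`, its methods, and the order-total example are hypothetical names chosen for illustration only.

```python
class Repository:
    """Central data structure that represents the current state of the system."""

    def __init__(self):
        self._state = {}
        self._triggers = {}   # key -> processes selected when that key changes

    def on_change(self, key, process):
        """Register a process to run whenever `key` is written."""
        self._triggers.setdefault(key, []).append(process)

    def put(self, key, value):
        """An operation issued by an independent component."""
        self._state[key] = value
        for process in self._triggers.get(key, []):
            process(self, value)   # the operation triggers the selected processes

    def get(self, key):
        return self._state.get(key)


def compute_total(repo, order):
    """An independent component: derives new data from the central state."""
    repo.put("total", sum(order))


repo = Repository()
repo.on_change("order", compute_total)
repo.put("order", [5, 10, 20])
print(repo.get("total"))  # 35
```

Note that the components never call each other directly; all interaction goes through the central data structure.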
DATA FLOW ARCHITECTURES
⮚Access to data is the core feature; data-flow styles explicitly incorporate the pattern of data flow.
⮚Batch Sequential:
The batch sequential style is characterized by an ordered sequence of separate programs executing
one after the other.
These programs are chained together: the output generated by one program after its completion,
most likely in the form of a file, is provided as input to the next program.
⮚Pipe-and-Filter Style:
Each component of the processing chain is called a filter, and the connection between one filter and
the next is represented by a data stream.
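The difference between the two data-flow styles can be sketched with plain Python. All function names here are hypothetical: the batch version finishes each stage completely before the next one starts, while the pipe-and-filter version uses generators so that each item streams through the filters one at a time.

```python
def batch_sequential(lines):
    """Each 'program' runs to completion and hands its full output to the next."""
    stage1 = [line.strip() for line in lines if line.strip()]  # program 1 finishes first
    stage2 = [line.upper() for line in stage1]                 # program 2 starts afterwards
    return stage2


def strip_blank(stream):
    """Filter: drop empty lines and trim whitespace."""
    for line in stream:
        if line.strip():
            yield line.strip()


def to_upper(stream):
    """Filter: convert each line to upper case."""
    for line in stream:
        yield line.upper()


def pipe_and_filter(lines):
    """Filters connected by data streams; items flow through one by one."""
    return list(to_upper(strip_blank(iter(lines))))


text = ["  grid ", "", "cloud"]
print(batch_sequential(text))  # ['GRID', 'CLOUD']
print(pipe_and_filter(text))   # ['GRID', 'CLOUD']
```

Both produce the same result; the streaming version does not need to hold an intermediate file or list between stages.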
⮚The virtual machine class of architectural styles is characterized by the presence of an abstract execution
environment (virtual machine) that simulates features that are not available in the hardware or software.
⮚Applications and systems are implemented on top of this layer and become portable over different
hardware and software environments.
⮚Rule-Based Style:
⮚Client/Server:
The information and the services of interest can be centralized and accessed through a single access
point: the server.
Multiple clients are interested in such services, and the server must be appropriately designed to efficiently
serve requests coming from different clients.
⮚Peer-to-Peer:
Symmetric architectures in which all the components, called peers, play the same role and incorporate both
the client and server capabilities of the client/server model.
More precisely, each peer acts as a server when it processes requests from other peers and as a client
when it issues requests to other peers.
[Figure: Client/server architectural styles. Two-tier (classic model): the client sends a request and the server returns a response. Three-tier and N-tier variants add further servers.]
[Figure: Peer-to-peer architectural style, with peers connected to one another]
Models for Interprocess Communication
⮚Distributed systems are composed of a collection of concurrent processes interacting with each other
by means of a network connection.
⮚IPC is used either to exchange data and information or to coordinate the activity of processes.
⮚IPC is what ties together the different components of a distributed system, thus making them act as a
single system.
⮚There are several different models in which processes can interact with each other; these map to
different abstractions for IPC.
⮚Among the most relevant are shared memory, remote procedure call (RPC), and
message passing.
⮚At a lower level, IPC is realized through the fundamental tools of network programming.
⮚Sockets are the most popular IPC primitive for implementing communication channels between
distributed processes.
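A minimal socket sketch using Python's standard `socket` module. To keep it self-contained, both endpoints live in one process and are created with `socket.socketpair()`; this is an assumption of the example, since truly distributed processes would instead use a listening socket on one node and a connecting socket on another. The function name `exchange` and the upper-casing "service" are hypothetical.

```python
import socket


def exchange(message: bytes) -> bytes:
    """Send a request over a socket channel and return the peer's reply."""
    left, right = socket.socketpair()   # two connected endpoints of one channel
    try:
        left.sendall(message)            # "client" side sends the request
        request = right.recv(1024)       # "server" side receives it
        right.sendall(request.upper())   # ... and sends back a reply
        return left.recv(1024)           # client reads the reply
    finally:
        left.close()
        right.close()


print(exchange(b"ping"))  # b'PING'
```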
Message-based communication
⮚The term message-based communication model can be used to refer to any model for IPC.
⮚Message Passing:
The entities exchanging information explicitly encode the data to be exchanged in the form of a
message. The structure and the content of a message vary according to the model.
Examples:
Message-Passing Interface (MPI)
OpenMP (strictly a shared-memory programming API, commonly listed next to MPI as a parallel programming standard)
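The explicit encoding step in message passing can be sketched as follows. This is not MPI: the message layout (`from`, `kind`, `payload`), the JSON encoding, and the use of in-process queues as channels are assumptions made to keep the example runnable with the standard library alone.

```python
import json
import queue
import threading


def encode(sender, kind, payload):
    """Explicitly encode the data to be exchanged as a message."""
    return json.dumps({"from": sender, "kind": kind, "payload": payload})


def receiver(inbox, outbox):
    """Entity B: decode the request, compute, and send an encoded reply."""
    message = json.loads(inbox.get())
    outbox.put(encode("B", "reply", sum(message["payload"])))


def send_and_wait(numbers):
    """Entity A: send a request message to B and wait for the reply message."""
    to_b, to_a = queue.Queue(), queue.Queue()   # one channel per direction
    thread = threading.Thread(target=receiver, args=(to_b, to_a))
    thread.start()
    to_b.put(encode("A", "request", numbers))
    reply = json.loads(to_a.get())
    thread.join()
    return reply


print(send_and_wait([1, 2, 3]))  # {'from': 'B', 'kind': 'reply', 'payload': 6}
```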
⮚ Remote Procedure Call (RPC) : This paradigm extends the concept of procedure call beyond the
boundaries of a single process, thus triggering the execution of code in remote processes.
⮚ Distributed Objects : This is an implementation of the RPC model for the object-oriented paradigm and
contextualizes this feature for the remote invocation of methods exposed by objects.
⮚ Web Service
Push Strategy
Pull Strategy
⮚RPC is the fundamental abstraction enabling the execution of procedures on clients' request.
⮚RPC allows extending the concept of a procedure call beyond the boundaries of a process and a
single memory address space.
⮚The called procedure and the calling procedure may be on the same system or they may be on
different systems.
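The mechanics behind an RPC can be sketched without a real network. Everything named here (`PROCEDURES`, `server_dispatch`, `transport`, `remote_call`, and the JSON request format) is an assumption for illustration: the caller packs the procedure name and arguments into a request, a stand-in transport carries the bytes, and the remote side unpacks them, runs the procedure, and packs the result.

```python
import json

# Procedures exposed by the "remote" process (hypothetical registry).
PROCEDURES = {"add": lambda a, b: a + b}


def server_dispatch(request_bytes):
    """Remote side: unpack the request, run the procedure, pack the response."""
    request = json.loads(request_bytes)
    try:
        result = PROCEDURES[request["procedure"]](*request["args"])
        response = {"result": result, "error": None}
    except Exception as exc:
        response = {"result": None, "error": str(exc)}
    return json.dumps(response).encode()


def transport(request_bytes):
    """Stand-in for the network; a real system would send these bytes to another node."""
    return server_dispatch(request_bytes)


def remote_call(procedure, *args):
    """Caller side: looks like a local call, but only bytes cross the boundary."""
    request = json.dumps({"procedure": procedure, "args": args}).encode()
    response = json.loads(transport(request))
    if response["error"] is not None:
        raise RuntimeError(response["error"])
    return response["result"]


print(remote_call("add", 2, 3))  # 5
```

Since only serialized bytes pass through `transport`, the same caller code works whether the called procedure is on the same system or a different one.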
Remote Procedure Call (RPC)
[Figure: RPC interaction between Node A and Node B across the network]
Distributed Object Programming model
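[Figure: Distributed object programming model. Application A on Node A asks for an object reference and invokes the remote instance hosted by Application B on Node B through an object proxy, the network, and an object skeleton; numbered steps 1 to 21 trace the request and the response, including object activation.]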
⮚DCOM/COM+: Microsoft's technology for distributed object programming before the introduction of .NET technology.
⮚.NET Remoting: IPC among .NET applications, a uniform platform for accessing remote objects from within
any application developed in any of the languages supported by .NET.
⮚Service-oriented computing organizes distributed systems in terms of services, which represent the
major abstraction for building systems.
⮚Service orientation expresses applications and software systems as an aggregation of services that are
coordinated within a service-oriented architecture (SOA).
⮚Even though there is no single technology designed specifically for the development of service-oriented
software systems, web services are the de facto approach for developing SOA.
⮚Web services, the fundamental component enabling cloud computing systems, leverage the Internet as
the main interaction channel between users and the system.