Lecture 5
• The development of applications and systems is a major element of this era, and it reached consolidation
when problem-solving environments were designed and introduced to facilitate and empower engineers.
• This is when the paradigm characterizing computing achieved maturity and became mainstream.
Moreover, every aspect of this era underwent a three-phase process: research and development
(R&D), commercialization, and commoditization.
Principles of Parallel and Distributed Computing
Parallel vs. Distributed computing :
• The terms parallel computing and distributed computing are often used interchangeably, even though they
mean slightly different things.
• The term parallel implies a tightly coupled system, whereas distributed refers to a wider class of systems,
including those that are tightly coupled.
• Parallel Computing:
• The term parallel computing refers to a model in which the computation is divided among several processors
sharing the same memory.
• The architecture of a parallel computing system is often characterized by the homogeneity of its components: each
processor is of the same type and has the same capability as the others.
• The shared memory has a single address space, which is accessible to all the processors.
• Parallel programs are then broken down into several units of execution that can be allocated to different
processors and can communicate with each other by means of the shared memory.
• Originally, only those architectures that featured multiple processors sharing the same physical memory and that
were presented as a single computer were considered parallel systems.
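A minimal sketch of the shared-memory idea described above, assuming Python's standard multiprocessing module (the function name square_slice and the array size are illustrative, not part of the lecture): several worker processes operate on disjoint slices of one array that lives in a single shared address space.

```python
from multiprocessing import Process, Array

def square_slice(shared, start, end):
    # Each unit of execution works on its own slice of the shared memory.
    for i in range(start, end):
        shared[i] = shared[i] * shared[i]

if __name__ == "__main__":
    data = Array("d", range(8))      # one address space visible to all workers
    mid = len(data) // 2
    workers = [Process(target=square_slice, args=(data, 0, mid)),
               Process(target=square_slice, args=(data, mid, len(data)))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(list(data))                # results were written directly into shared memory
```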
Principles of Parallel and Distributed Computing
Distributed Computing:
The term distributed computing encompasses any architecture or system that allows the computation to be
broken down into units and executed concurrently on different computing elements, whether these are
processors on different nodes, processors on the same computer, or cores within the same processor.
Therefore, distributed computing includes a wider range of systems and applications than parallel
computing and is often considered a more general term.
Even though it is not a rule, the term distributed often implies that the locations of the computing elements
are not the same and such elements might be heterogeneous in terms of hardware and software features.
Classic examples of distributed computing systems are computing grids or Internet computing systems,
which combine the widest variety of architectures, systems, and applications in the world.
Principles of Parallel and Distributed Computing
Elements of Parallel Processing:
• Processing of multiple tasks simultaneously on multiple processors is called parallel processing.
• The parallel program consists of multiple active processes (tasks) simultaneously solving a given
problem.
• A given task is divided into multiple subtasks using a divide-and-conquer technique, and each
subtask is processed on a different central processing unit (CPU).
• Programming on a multiprocessor system using the divide-and-conquer technique is called parallel
programming.
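As a hedged illustration of the divide-and-conquer approach just described (the helper partial_sum and the chunk count are invented for this sketch), the task of summing a list is split into subtasks, each processed by a separate process, and the partial results are then combined.

```python
from multiprocessing import Process, Queue

def partial_sum(chunk, results):
    results.put(sum(chunk))          # each CPU solves one subtask

if __name__ == "__main__":
    numbers = list(range(1_000_000))
    results = Queue()
    n_parts = 4
    size = len(numbers) // n_parts
    chunks = [numbers[i * size:(i + 1) * size] for i in range(n_parts)]
    procs = [Process(target=partial_sum, args=(c, results)) for c in chunks]
    for p in procs:
        p.start()
    total = sum(results.get() for _ in range(n_parts))   # conquer: combine partial results
    for p in procs:
        p.join()
    print(total)
```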
Principles of Parallel and Distributed Computing
Hardware architectures for parallel processing
• The core elements of parallel processing are CPUs.
• Based on the number of instruction and data streams that can be processed simultaneously, computing
systems are classified into the following four categories:
• Single-instruction, single-data (SISD) systems
• Single-instruction, multiple-data (SIMD) systems
• Multiple-instruction, single-data (MISD) systems
• Multiple-instruction, multiple-data (MIMD) systems
Principles of Parallel and Distributed Computing
• Single-instruction, single-data (SISD) systems:
• An SISD computing system is a uniprocessor machine capable of executing a single instruction, which
operates on a single data stream.
• In SISD, machine instructions are processed sequentially; hence computers adopting this model are
popularly called sequential computers.
• Most conventional computers are built using the SISD model. All the instructions and data to be
processed have to be stored in primary memory.
• The speed of the processing element in the SISD model is limited by the rate at which the computer can
transfer information internally.
• Dominant representative SISD systems are IBM PC, Macintosh, and workstations.
Principles of Parallel and Distributed Computing
• Single-instruction, multiple-data (SIMD) systems:
• An SIMD computing system is a multiprocessor machine capable of executing the same instruction on all the
CPUs but operating on different data streams.
• Machines based on the SIMD model are well suited to scientific computing, since it involves many vector and
matrix operations.
• Dominant representative SIMD systems are Cray’s vector processing machine and Thinking Machines’ Connection
Machine (CM).
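A sketch of the SIMD idea, using NumPy purely as an illustration: one operation, written once, is applied to every element of the data, as in the vector and matrix computations mentioned above (NumPy's vectorized operations are typically mapped to CPU vector instructions rather than to a dedicated SIMD machine).

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.ones(1_000_000, dtype=np.float64)

c = a * 2.0 + b      # the same multiply-and-add is applied to every element
print(c[:5])
```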
Principles of Parallel and Distributed Computing
• Multiple-instruction, single-data (MISD) systems:
• An MISD computing system is a multiprocessor machine capable of executing different instructions on different
PEs but all of them operating on the same data set.
• Machines built using the MISD model are not useful in most applications; a few machines have been built, but
none of them is available commercially.
Principles of Parallel and Distributed Computing
• Multiple-instruction, multiple-data (MIMD) systems:
• An MIMD computing system is a multiprocessor machine capable of executing multiple instructions on multiple
data sets.
• Each PE in the MIMD model has separate instruction and data streams; hence machines built using this model
are well suited to any kind of application.
• Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.
Principles of Parallel and Distributed Computing
• MIMD machines are broadly categorized into shared-memory MIMD and distributed-memory MIMD based on
the way PEs are coupled to the main memory.
• In the shared memory MIMD model, all the PEs are connected to a single global memory and they all have
access to it.
• Systems based on this model are also called tightly coupled multiprocessor systems.
• The communication between PEs in this model takes place through the shared memory; modification of the
data stored in the global memory by one PE is visible to all other PEs.
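A minimal sketch of shared-memory MIMD behaviour, assuming Python threads (the names producer and auditor are illustrative): two threads execute different instruction streams but operate on the same memory, so a modification made by one is visible to the other; a lock guards the shared structure.

```python
import threading

shared = {"count": 0, "log": []}
lock = threading.Lock()

def producer():
    for _ in range(5):
        with lock:
            shared["count"] += 1                       # modify the shared data

def auditor():
    for _ in range(5):
        with lock:
            shared["log"].append(shared["count"])      # observe the same shared data

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=auditor)
t1.start(); t2.start()
t1.join(); t2.join()
print(shared)
```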
Principles of Parallel and Distributed Computing
• Distributed-memory MIMD machines: In the distributed-memory MIMD model, each PE has its own local memory.
Systems based on this model are also called loosely coupled multiprocessor systems.
• The communication between PEs in this model takes place through the interconnection network (the
interprocess communication channel, or IPC).
• The network connecting the PEs can be configured as a tree, mesh, cube, and so on.
• Each PE operates asynchronously; if communication/synchronization among tasks is necessary, it is achieved by
exchanging messages between them.
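A sketch of the distributed-memory model, with a multiprocessing Pipe standing in for the interconnection network (the node names are illustrative): each process keeps its own local memory, and the only way to share information is to exchange messages.

```python
from multiprocessing import Process, Pipe

def node_a(conn):
    local = [1, 2, 3]                # local memory, invisible to node_b
    conn.send(sum(local))            # communicate by sending a message
    conn.close()

def node_b(conn):
    received = conn.recv()           # receive the message from node_a
    print("node_b received:", received)

if __name__ == "__main__":
    a_end, b_end = Pipe()
    pa = Process(target=node_a, args=(a_end,))
    pb = Process(target=node_b, args=(b_end,))
    pa.start(); pb.start()
    pa.join(); pb.join()
```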
Principles of Parallel and Distributed Computing
• Approaches to parallel programming:
• A sequential program runs on a single processor and has a single line of control.
• To make many processors collectively work on a single program, the program must be divided into smaller
independent chunks so that each processor can work on separate chunks of the problem.
• The program decomposed in this way is a parallel program.
• A wide variety of parallel programming approaches are available. The most prominent among them are the
following:
• Data parallelism
• Process parallelism
• Farmer-and-worker model
• Data parallelism:
• In the case of data parallelism, the divide-and-conquer technique is used to split data into multiple sets, and each
data set is processed on a different PE using the same instruction.
• Process parallelism:
• In the case of process parallelism, a given operation has multiple (but distinct) activities that can be processed
on multiple processors.
• Farmer-and-worker model:
• In the case of the farmer-and-worker model, a job distribution approach is used: one processor is configured as
the master and all the other PEs are designated as slaves;
• The master assigns jobs to the slave PEs and, on completion, they inform the master, which in turn collects the
results.
• These approaches can be utilized at different levels of parallelism.
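An illustrative farmer-and-worker sketch, assuming Python's multiprocessing Pool (the job list and the work function are invented): the main process acts as the farmer (master), assigns jobs to a pool of worker processes, and collects the results on completion.

```python
from multiprocessing import Pool

def work(job):
    return job * job                      # each worker PE processes one job

if __name__ == "__main__":
    jobs = list(range(10))
    with Pool(processes=4) as farmer:     # four worker PEs
        results = farmer.map(work, jobs)  # the farmer assigns jobs and gathers results
    print(results)
```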
Principles of Parallel and Distributed Computing
• Levels of parallelism:
• Levels of parallelism are decided based on the lumps of code (grain size) that can be a potential candidate for
parallelism.
• Table 2.1 lists categories of code granularity for parallelism.
• All these approaches have a common goal: to boost processor efficiency by hiding latency.
• To conceal latency, there must be another thread ready to run whenever a lengthy operation occurs.
• The idea is to execute concurrently two or more single-threaded applications, such as compiling, text
formatting, database searching, and device simulation.
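A rough sketch of latency hiding with two Python threads (the sleep stands in for a lengthy operation such as a disk read): while one thread waits, the other is ready to run, so the processor stays busy instead of idling.

```python
import threading
import time

def long_io_operation():
    time.sleep(1.0)                       # lengthy operation (e.g., waiting on a device)
    print("I/O finished")

def computation():
    total = sum(range(5_000_000))         # useful work done while the other thread waits
    print("computation finished:", total)

io_thread = threading.Thread(target=long_io_operation)
cpu_thread = threading.Thread(target=computation)

start = time.time()
io_thread.start()
cpu_thread.start()
io_thread.join()
cpu_thread.join()
print("elapsed:", round(time.time() - start, 2), "seconds")
```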
Principles of Parallel and Distributed Computing
• Parallelism within an application can be detected at several levels:
• Large grain (or task level)
• Medium grain (or control level)
• Fine grain (data level)
• Very fine grain (multiple-instruction issue)
Principles of Parallel and Distributed Computing
Elements of distributed computing:
Distributed computing studies the models, architectures,
and algorithms used for building and managing distributed
systems.
A distributed system is a collection of independent
computers that appears to its users as a single coherent
system.
A distributed system is one in which components located
at networked computers communicate and coordinate
their actions only by passing messages.
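A hedged sketch of the last definition: two components that could live on different networked computers coordinate only by passing messages; here a TCP socket on localhost (with an arbitrary, illustrative port) stands in for the network.

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 50007           # illustrative address and port

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((HOST, PORT))
        s.listen(1)
        conn, _ = s.accept()
        with conn:
            request = conn.recv(1024)     # coordination happens only through this message
            conn.sendall(b"ack: " + request)

def client():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((HOST, PORT))
        s.sendall(b"hello")
        print(s.recv(1024).decode())

t = threading.Thread(target=server)
t.start()
time.sleep(0.2)                           # crude wait so the server is listening first
client()
t.join()
```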
Principles of Parallel and Distributed Computing
Components of a distributed
system:
• A distributed system is the result of the interaction
of several components that traverse the entire
computing stack from hardware to software.
• It emerges from the collaboration of several
elements that—by working together—give users
the illusion of a single coherent system.
• Figure 2.10 provides an overview of the different
layers that are involved in providing the services of
a distributed system.
Principles of Parallel and Distributed Computing
• At the very bottom layer, computer and network
hardware constitute the physical infrastructure;
these components are directly managed by the
operating system, which provides the basic
services for interprocess communication (IPC),
process scheduling and management, and
resource management in terms of file system
and local devices.
• Taken together these two layers become the
platform on top of which specialized software is
deployed to turn a set of networked computers
into a distributed system.
Principles of Parallel and Distributed Computing
• The use of well-known standards at the
operating system level, and even more at the
hardware and network levels, allows easy
harnessing of heterogeneous components and
their organization into a coherent and uniform
system.
• For example, network connectivity between
different devices is controlled by standards,
which allow them to interact seamlessly.
• At the operating system level, IPC services are
implemented on top of standardized
communication protocols such as Transmission
Control Protocol/Internet Protocol (TCP/IP), User
Datagram Protocol (UDP), or others.
Principles of Parallel and Distributed Computing
• The middleware layer leverages such services to
build a uniform environment for the
development and deployment of distributed
applications.
• This layer supports the programming paradigms
for distributed systems.
• By relying on the services offered by the
operating system, the middleware develops its
own protocols, data formats, and programming
language or frameworks for the development of
distributed applications.
• All of them constitute a uniform interface to
distributed application developers that is
completely independent of the underlying
operating system and hides all the
heterogeneities of the bottom layers.
Principles of Parallel and Distributed Computing
• The top of the distributed system stack is
represented by the applications and services
designed and developed to use the middleware.
• These can serve several purposes and often
expose their features in the form of graphical
user interfaces (GUIs) accessible locally or
through the Internet via a Web browser.
• For example, in the case of a cloud computing
system, the use of Web technologies is strongly
preferred, not only to interface distributed
applications with the end user but also to
provide platform services aimed at building
distributed systems.
Principles of Parallel and Distributed Computing
• The general reference architecture of a
distributed system is contextualized in the case
of a cloud computing system.
Principles of Parallel and Distributed Computing
• Architectural styles for distributed computing:
• Architectural styles are mainly used to determine the vocabulary of components and connectors
that are used as instances of the style together with a set of constraints on how they can be
combined.
• The architectural styles are organized into two major classes:
I. Software architectural styles
II. System architectural styles