
CS3551 Distributed Computing 2023

UNIT I
INTRODUCTION
1.1 DEFINITION
A distributed system is a collection of independent entities that cooperate to solve a
problem that cannot be solved by any of them individually. A distributed system can be
characterized as a collection of mostly autonomous processors communicating over a
communication network.
Features / Issues of Distributed Systems
1. No common physical clock
2. No shared memory: This is a key feature that requires message-passing for
communication.
3. Geographical separation: The entities may be geographically far apart; even a network of workstations (NOW) on a LAN qualifies. The Google search engine is based on the NOW architecture.
4. Autonomy and heterogeneity: The processors are “loosely coupled” in that they have
different speeds and each can be running a different operating system.
5. Communication is hidden from users
6. Applications interact in uniform and consistent way
7. High degree of scalability
8. A distributed system is functionally equivalent to the systems of which it is composed.
9. Resource sharing is possible in distributed systems.
10. Distributed systems act as fault tolerant systems
11. Enhanced performance
Differences between centralized and distributed systems

Centralized Systems:
 Several jobs are done on a particular central processing unit (CPU).
 They have shared memory and shared variables.
 A common clock is present.
Distributed Systems:
 Jobs are distributed among several processors, which are interconnected by a computer network.
 They have no global state, i.e., no shared memory and no shared variables.
 There is no global clock.

1.2. RELATION TO COMPUTER SYSTEM COMPONENTS


A typical distributed system is shown in figure 1.1 below. Each computer has a memory-
processing unit, and the computers are connected by a communication network.


Middleware
The distributed software is also termed middleware. Middleware is the distributed
software that drives the distributed system while providing transparency of heterogeneity at the
platform level.

Distributed execution
A distributed execution is the execution of processes across the distributed system to
collaboratively achieve a common goal. An execution is also sometimes termed a computation
or a run.
The distributed system uses a layered architecture to break down the complexity of
system design.
Here we assume that the middleware layer does not contain the traditional application
layer functions of the network protocol stack, such as http, mail, ftp, and telnet. Various
primitives and calls to functions defined in various libraries of the middleware layer are
embedded in the user program code.
Examples of middleware
1. Object Management Group’s (OMG) common object request broker architecture
(CORBA)
2. Remote procedure call (RPC) mechanism
3. DCOM (distributed component object model)



4. Message-passing interface (MPI)
5. RMI (remote method invocation)
 In the RPC mechanism, procedure code may reside on a remote machine, and the
RPC software sends a message across the network to invoke the remote procedure. It
then awaits a reply, after which the procedure call completes from the perspective of
the program that invoked it (a small sketch follows below).
 Currently deployed commercial versions of middleware often use CORBA, DCOM
(distributed component object model), Java, and RMI (remote method invocation)
technologies.
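The RPC flow described above can be sketched with Python's standard-library XML-RPC support; the host, port, and the add() procedure here are illustrative assumptions, not part of the text:

```python
# Minimal RPC sketch using Python's standard library (illustrative only).
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    # The "remote procedure": it executes in the server's address space.
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client call looks like a local procedure call; the RPC layer marshals
# the arguments, sends a request message, and blocks awaiting the reply.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))  # prints 5
```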
1.3 MOTIVATION
The following are the key points that act as a motivating factor of distributed systems:
1. Inherently distributed computations: Distributed systems can process
the computations at geographically remote locations.
2. Resource sharing: Hardware, databases, and special libraries can be
shared between systems without each system owning a dedicated copy or replica.
This is cost-effective and reliable.
3. Access to geographically remote data and resources: As mentioned
previously, computations may happen at remote locations. Resources such
as centralized servers can also be accessed from distant locations.
4. Enhanced reliability: Distributed systems provide enhanced reliability,
since they run on multiple copies of resources. The distribution of
resources at distant locations makes them less susceptible to faults. The
term reliability comprises:
a. Availability: the resource/ service provided by the resource should
be accessible at all times
b. Integrity: the value/state of the resource should be correct and
consistent.
c. Fault-Tolerance: the ability to recover from system failures
5. Increased performance/cost ratio: The resource sharing and remote
access features of Distributed systems naturally increase the
performance / cost ratio.
6. Scalable: The number of systems operating in a distributed environment
can be increased as the demand increases.
1.4 RELATION TO PARALLEL SYSTEMS

The main objective of parallel systems is to improve the processing speed. They are
sometimes known as multiprocessor or multi computers or tightly coupled systems. They
refer to simultaneous use of multiple computer resources that can include a single computer with



multiple processors, a number of computers connected by a network to form a parallel
processing cluster or a combination of both.
1.4.1 CHARACTERISTICS OF PARALLEL SYSTEMS
There are three types of parallel (or) shared memory multiprocessors:
i) A multiprocessor system- Uniform Memory Access (UMA)
ii) Multicomputer parallel system-Non Uniform Memory Access NUMA
iii) Array processors

[Diagram: shared-memory multiprocessor models (UMA, NUMA, COMA)]

i) Multiprocessor system- Uniform Memory Access (UMA)


In UMA, all the processors share the physical memory in a centralized manner with equal
access time to all the memory words.
 Each processor may have a private cache memory.
 Same rule is followed for peripheral devices.
UMA systems are further classified into symmetric and asymmetric multiprocessors:

Symmetric multiprocessor
When all the processors have equal access to all the peripheral devices, the system is
called a symmetric multiprocessor.
Asymmetric multiprocessor
When only one or a few processors can access the peripheral devices, the system is called
an asymmetric multiprocessor.


Drawbacks:
1. Shared memory can quickly become a bottleneck for system performance.
2. All processors must synchronize on the single bus and memory access.
ii) Multicomputer parallel system-Non Uniform Memory Access NUMA
In the NUMA multiprocessor model, the access time varies with the location of the memory
word. Here, the shared memory is physically distributed among all the processors as local
memories.

Global address space:


The collection of all local memories forms a global address space which can be accessed
by all the processors. NUMA systems also share CPUs and the address space, but each processor
has a local memory, visible to all other processors.

NUMA systems are further classified into non-caching NUMA (NC-NUMA) and cache-coherent NUMA (CC-NUMA):


Non-Caching NUMA (NC-NUMA):


 No local cache.
 No cache coherency problem.
 Remote memory access is very inefficient.

Cache-Coherent NUMA (CC-NUMA):


Caching can alleviate the problem due to remote data access, but brings the cache
coherency issue. It uses following protocols:
 Bus snooping
 Directory-based protocol
Advantages of NUMA
1. In NUMA systems access to local memory blocks is quicker than access to
remote memory blocks.
2. Programs written for UMA systems run with no change on NUMA ones.
3. NUMA is used to build larger multiprocessor systems.
iii) Array processors
Array processors belong to a class of parallel computers that are physically co-located,
are very tightly coupled, and have a common system clock (but may not share memory and
communicate by passing data using messages).
Array processors and systolic arrays that perform tightly synchronized processing and
data exchange in lock-step for applications such as DSP and image processing belong to this
category. These applications usually involve a large number of iterations on the data. This
class of parallel systems has a very niche market.

1.4.2 TOPOLOGIES
The choice of the interconnection network may affect several characteristics of the
system such as node complexity, scalability and cost etc. The interconnection network form the
topology to access the memory. It may be following:
 Omega network
 Butterfly network
 Torus or 2D Mesh Topology
 Hypercube
 Array Processors


Omega network

A multistage omega network is formed from 2×2 switching elements. Each 2×2 switch
allows data on either of the two input wires to be routed to either output wire, but only one data
unit can be sent on an output wire in a single step. To avoid collision of data, many buffering
techniques have been proposed.
An omega network connecting n processors with n memory units has (n/2)·log₂ n switching
elements of size 2×2, arranged in log₂ n stages.
Omega interconnection function (the perfect shuffle maps input i to output j):
j = 2i          for 0 ≤ i ≤ n/2 − 1
j = 2i + 1 − n  for n/2 ≤ i ≤ n − 1
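The interconnection function amounts to a one-bit left rotation of the log₂(n)-bit label of i. A small sketch, assuming n is a power of two:

```python
# Perfect-shuffle interconnection of one omega stage (n a power of two).
def omega_shuffle(i: int, n: int) -> int:
    """Output wire j that input wire i connects to."""
    if i < n // 2:
        return 2 * i
    return 2 * i + 1 - n

# Example: for n = 8, inputs 0..7 map to outputs 0, 2, 4, 6, 1, 3, 5, 7.
print([omega_shuffle(i, 8) for i in range(8)])
```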

Butterfly network

A butterfly network links multiple computers into a high-speed network. For a butterfly
network with n processor nodes, there need to be n(log n + 1) switching nodes. The
interconnection pattern between a pair of adjacent stages depends not only on n but also
on the stage number s. In a stage-s switch, if the (s + 1)th MSB of j is 0, the data is routed to the
upper output wire; otherwise it is routed to the lower output wire.



Torus or 2D Mesh Topology
A k × k mesh contains k² processors, with a maximum path length of 2(k/2 − 1). Every
unit in the torus topology is identified using a unique label, with dimensions distinguished as bit
positions.
Hypercube
The length of the shortest path between any two nodes in a hypercube equals the Hamming
distance between their labels. Routing is done hop by hop, with each hop moving to an adjacent
node whose label differs in exactly one bit. This topology has good congestion control and fault
tolerance.
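The hop-by-hop routing described above can be sketched as follows (a hedged illustration: node labels are d-bit integers and each hop corrects one differing bit, so the path length equals the Hamming distance):

```python
# Dimension-order routing in a hypercube (illustrative sketch).
def hypercube_route(src: int, dst: int) -> list:
    """Sequence of node labels visited from src to dst."""
    path, cur, bit = [src], src, 0
    while cur != dst:
        if (cur ^ dst) & (1 << bit):  # this bit differs: cross that dimension
            cur ^= 1 << bit
            path.append(cur)
        bit += 1
    return path

# Example in a 4-D hypercube: Hamming distance between 0000 and 1011 is 3.
print(hypercube_route(0b0000, 0b1011))  # [0, 1, 3, 11]
```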

1.4.3 FLYNN’S TAXONOMY

Flynn's taxonomy is a classification of parallel computer architectures based on the
number of concurrent instruction streams (single or multiple) and data streams (single or
multiple) available in the architecture.
The four combinations of instruction streams and data streams are the following:
 (SISD) single instruction, single data
 (MISD) multiple instruction, single data
 (SIMD) single instruction, multiple data
 (MIMD) multiple instruction, multiple data


1) SISD (Single Instruction, Single Data stream)


Single Instruction, Single Data (SISD) refers to an architecture in which a single
processor (one CPU) executes exactly one instruction stream at a time.
Most CPU designs are based on the von Neumann architecture and follow SISD.
The SISD model is a non-pipelined architecture with general-purpose registers, a
Program Counter (PC), an Instruction Register (IR), Memory Address Registers
(MAR) and Memory Data Registers (MDR).
2) Single Instruction, Multiple Data (SIMD)
SIMD is an architecture with a single control unit (CU) and more than one
processing unit (PU); each PU operates like a von Neumann machine, executing the
single instruction stream issued through the CU.
The SIMD architecture is capable of achieving data-level parallelism (a loose code analogy follows after this list).
3) Multiple Instructions, Single Data (MISD)
MISD is an Instruction Set Architecture for parallel computing where many
functional units perform different operations by executing different instructions on
the same data set.
This type of architecture is common mainly in the fault-tolerant computers
executing the same instructions redundantly in order to detect and mask errors.
4) Multiple Instruction stream, Multiple Data stream (MIMD)
MIMD is an Instruction Set Architecture for parallel computing that is typical of
the computers with multiprocessors.
Using the MIMD, each processor in a multiprocessor system can execute
asynchronously different set of the instructions independently on the different set of
data units.
The MIMD based computer systems can use the shared memory in a memory pool
or work using distributed memory across heterogeneous network computers in a
distributed environment.
The MIMD architectures is primarily used in a number of application areas such as
computer-aided design/computer-aided manufacturing, simulation, modeling,
communication switches etc.
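To make the SISD/SIMD distinction concrete, here is a loose analogy in Python (assuming NumPy is installed; the array size is arbitrary). The element-by-element loop processes one data item per step, SISD-style, while the single NumPy expression applies one logical operation across all elements at once, which is the SIMD programming style (NumPy's compiled loops can in turn use the CPU's vector instructions):

```python
# Loose SISD vs SIMD analogy; NumPy assumed installed, size arbitrary.
import numpy as np

a = np.arange(1_000, dtype=np.float64)
b = np.arange(1_000, dtype=np.float64)

# SISD style: one instruction stream, one data element per step.
c_scalar = [x + y for x, y in zip(a, b)]

# SIMD style: a single vector operation applied to many data elements.
c_vector = a + b

assert np.allclose(c_scalar, c_vector)
```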



[Table: Comparison of Flynn's taxonomy]

1.4.4 COUPLING, PARALLELISM, CONCURRENCY AND GRANULARITY


i) Coupling
Concurrency is a major design challenge in these systems. The term coupling is
associated with the configuration and design of processors in a multiprocessor system.
The multiprocessor systems are classified into two types based on coupling:
1. Loosely coupled systems

2. Tightly coupled systems


Tightly Coupled systems:
 Tightly coupled multiprocessor systems contain multiple CPUs that are connected at the
bus level with both local as well as central shared memory.
 Tightly coupled systems perform better, due to faster access to memory and
intercommunication and are physically smaller and use less power. They are
economically costlier.
 An example of a tightly coupled multiprocessor with NUMA shared memory that
communicates by message passing is the SGI Origin 2000.
Loosely Coupled systems:
 Loosely coupled multiprocessors consist of distributed memory where each processor has
its own memory and IO channels.
 The processors communicate with each other via message passing or interconnection
switching.
 Loosely coupled systems are less costly than tightly coupled systems, but are physically
bigger and have a low performance compared to tightly coupled systems.
 The individual nodes in a loosely coupled system can be easily replaced and are usually
inexpensive.



 They require more power and are more robust and can resist failures.
 The extra hardware required to provide communication between the individual
processors makes them complex and less portable.
ii) Parallelism within a system
It is the use of multiple processing elements simultaneously to solve a problem.
 This is a measure of the relative speedup of a specific program on a given machine. The
speedup depends on the number of processors and the mapping of the code to the processors.
 It is expressed as the ratio of the time T(1) with a single processor to the time T(n) with n
processors: speedup = T(1)/T(n). For example, if a program takes 100 s on one processor and
30 s on four, the speedup is 100/30 ≈ 3.3.
iii) Parallelism within a parallel/distributed program
This is an aggregate measure of the percentage of time that all the processors are
executing CPU instructions productively. Parallelism can be classified into three
categories based on work distribution among the parallel tasks:
1. Fine-grained: the application is partitioned into small units of work, leading
to a low computation-to-communication ratio.
2. Coarse-grained: the application is partitioned into large units of work, giving a
high computation-to-communication ratio.
3. Medium-grained: the task size and communication time are greater than in fine-
grained parallelism but lower than in coarse-grained parallelism.
 Programs with fine-grained parallelism are best suited for tightly coupled systems.
iv) Concurrency
 Concurrent programming refers to techniques for decomposing a task into subtasks that
can execute in parallel, and for managing the risks that arise when the program executes
more than one task at the same time.
 The parallelism or concurrency in a parallel or distributed program can be measured by
the ratio of the number of local (non-communication and non-shared-memory access)
operations to the total number of operations, including the communication or shared
memory access operations.

1.5 MESSAGE-PASSING SYSTEMS VERSUS SHARED MEMORY SYSTEMS


There are two types of inter process communication models. They are,
1. Message passing systems
2. Shared memory systems
Message passing systems:
 This allows multiple processes to read and write data to the message queue
without being connected to each other.
 Messages are stored on the queue until their recipient retrieves them. Message
queues are quite useful for interprocess communication and are used by most
operating systems.



Shared memory systems:
 The shared memory is the memory that can be simultaneously accessed by multiple
processes. This is done so that the processes can communicate with each other.
 Communication among processors takes place through shared data variables, and
control variables for synchronization among the processors.
 Semaphores and monitors are common synchronization mechanisms on shared
memory systems.
 When shared memory model is implemented in a distributed environment, it is
termed as distributed shared memory.
Differences between message passing and shared memory models
Message Passing:
 Variables have to be marshalled at one process, transmitted, and unmarshalled into other variables at the receiving process.
 Processes can communicate with other processes while being protected from one another by having private address spaces.
 This technique can be used between heterogeneous computers.
 Synchronization between processes is through message-passing primitives.
 Processes communicating via message passing must execute at the same time.
Distributed Shared Memory:
 The processes share variables directly, so no marshalling and unmarshalling is needed.
 A process does not have a private address space, so one process can alter the execution of another.
 This cannot be used between heterogeneous computers.
 Synchronization is through locks and semaphores.
 Processes communicating through DSM may execute with non-overlapping lifetimes.

[Figure: (a) Message passing model; (b) Shared memory model]



1.5.1 EMULATING MESSAGE-PASSING ON A SHARED MEMORY SYSTEM
(MP → SM)
The shared memory system can be made to act as a message-passing system. The shared
address space can be partitioned into disjoint parts, one part being assigned to each processor.
Send and receive operations are implemented by writing to and reading from the
destination/sender processor's portion of the address space. The read and write operations are
synchronized. Specifically, a separate location can be reserved as the mailbox for each ordered
pair of processes.
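A minimal sketch of this emulation, with threads standing in for processes (the Mailbox class and all names are illustrative assumptions):

```python
# Emulating send/receive on shared memory: one mailbox per ordered pair.
import threading
from collections import deque

class Mailbox:
    def __init__(self):
        self._msgs = deque()
        self._cv = threading.Condition()

    def send(self, msg):
        # "send" = synchronized write into the partitioned shared space.
        with self._cv:
            self._msgs.append(msg)
            self._cv.notify()

    def receive(self):
        # "receive" = synchronized read from the same reserved location.
        with self._cv:
            while not self._msgs:
                self._cv.wait()
            return self._msgs.popleft()

# One mailbox reserved for the ordered pair (p1, p2).
mailboxes = {("p1", "p2"): Mailbox()}
threading.Thread(target=lambda: mailboxes[("p1", "p2")].send("hello")).start()
print(mailboxes[("p1", "p2")].receive())  # prints "hello"
```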
1.5.2 EMULATING SHARED MEMORY ON A MESSAGE-PASSING SYSTEM
(SM → MP)
This is also implemented through read and write operations. Each shared location can be
modeled as a separate process. A write to a shared location is emulated by sending an update
message to the corresponding owner process, and a read from a shared location is emulated
by sending a query message to the owner process.
This emulation is expensive because a process has to gain access to another process's memory
location via messages. The latencies involved in read and write operations may be high even
when using shared memory emulation, because the read and write operations are implemented
through network-wide communication.
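A hedged sketch of this reverse emulation, with an "owner" thread modelling the process that owns one shared location and serves update/query messages (all names illustrative):

```python
# Emulating one shared location on a message-passing system.
import queue
import threading

requests = queue.Queue()

def owner():
    value = 0  # the emulated shared location, private to its owner
    while True:
        op, arg, reply = requests.get()
        if op == "write":
            value = arg        # update message from a writer
            reply.put("ack")
        elif op == "read":
            reply.put(value)   # query message answered with the value

threading.Thread(target=owner, daemon=True).start()

reply = queue.Queue()
requests.put(("write", 42, reply)); reply.get()  # emulated write
requests.put(("read", None, reply))
print(reply.get())                               # emulated read -> 42
```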

1.6 PRIMITIVES FOR DISTRIBUTED COMMUNICATION


Message send and message receive communication primitives are done through
following operations:
 send()
 receive()
Send primitive
It has two parameters: the destination, and the buffer in the user space that holds the data
to be sent.
Receive primitive
It also has two parameters: the source from which the data is to be received and the user
buffer into which the data is to be received.

1.6.1 BLOCKING / NON BLOCKING / SYNCHRONOUS / ASYNCHRONOUS


There are two ways of sending data when the Send primitive is called:
 Buffered: It copies the data from the user buffer to the kernel buffer. The data later
gets copied from the kernel buffer onto the network. For the Receive primitive, the
buffered option is usually required because the data may already have arrived when
the primitive is invoked, and needs a storage place in the kernel.
 Unbuffered: The data gets copied directly from the user buffer onto the network.



Blocking primitives
 A primitive is blocking if control returns to the invoking process only after the
processing for the primitive (whether in synchronous or asynchronous mode)
completes.
 The sending process must wait after a send until an acknowledgement is made by
the receiver.
 The receiving process must wait for the expected message from the sending
process.
Non-Blocking primitives
 A primitive is non-blocking if control returns back to the invoking process
immediately after invocation, even though the operation has not completed.
 For a non-blocking Send, control returns to the process even before the data is
copied out of the user buffer.
 For a non-blocking Receive, control returns to the process even before the data
may have arrived from the sender.
Synchronous primitives
 A send() or a receive() primitive is synchronous if the send() and the receive() handshake
with each other.
 The processing for the send primitive completes only after the invoking processor learns
that the other corresponding receive primitive has also been invoked and that the receive
operation has been completed.

 The processing for the receive primitive completes when the data to be received is copied
into the receiver’s user buffer.
Asynchronous primitives
 A Send primitive is said to be asynchronous, if control returns back to the invoking
process after the data item to be sent has been copied out of the user-specified buffer.
 It does not make sense to define asynchronous Receive primitives.
 For non-blocking primitives, a return parameter on the primitive call returns a system-
generated handle which can be later used to check the status of completion of the call.
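A hedged sketch of such a handle, using a background thread pool to perform the transfer and a standard-library Future as the system-generated handle (the sleep stands in for the data copy and network I/O; all names are illustrative):

```python
# Non-blocking Send returning a handle that can be polled or Waited on.
from concurrent.futures import ThreadPoolExecutor, Future
import time

_pool = ThreadPoolExecutor(max_workers=4)

def nonblocking_send(dest, data) -> Future:
    def do_send():
        time.sleep(0.1)  # stand-in for copying the buffer / network I/O
        return f"sent {data!r} to {dest}"
    return _pool.submit(do_send)  # control returns at once with a handle

handle = nonblocking_send("p2", b"payload")
print(handle.done())    # poll the status of completion (likely False here)
print(handle.result())  # Wait: block until the operation completes
```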

1.6.2 MODES OF SEND AND RECEIVE PRIMITIVES


1. Blocking synchronous
2. Non- blocking synchronous
3. Blocking asynchronous
4. Non- blocking asynchronous
Four modes of send operation


i) Blocking synchronous Send:


 The data gets copied from the user buffer to the kernel buffer and is then sent over
the network.
 After the data is copied to the receiver’s system buffer and a receive call has
been issued, an acknowledgement back to the sender causes control to return to
the process that invoked the Send operation and completes the Send.
ii) Non-blocking synchronous Send:
 Control returns back to the invoking process as soon as the copy of data from the
user buffer to the kernel buffer is initiated.
 A parameter in the non-blocking call also gets set with the handle of a location
that the user process can later check for the completion of the synchronous send
operation.
 The location gets posted after an acknowledgement returns from the receiver.
 The user process can keep checking for the completion of the non-blocking
synchronous Send by testing the returned handle, or it can invoke the blocking
Wait operation on the returned handle.



iii) Blocking asynchronous Send:
 The user process that invokes the Send is blocked until the data is copied from the
user’s buffer to the kernel buffer.
iv) Non-blocking asynchronous Send:
 The user process that invokes the Send is blocked until the transfer of the data
from the user’s buffer to the kernel buffer is initiated.
 Control returns to the user process as soon as this transfer is initiated, and a
parameter in the non-blocking call also gets set with the handle of a location
that the user process can check later using the Wait operation for the completion
of the asynchronous Send.
1.6.3 MODES OF RECEIVE OPERATION
i) Blocking Receive:
The Receive call blocks until the data expected arrives and is written in the specified
user buffer. Then control is returned to the user process.
ii) Non-blocking Receive
The Receive call will cause the kernel to register the call and return the handle of a
location that the user process can later check for the completion of the non- blocking
Receive operation.
This location gets posted by the kernel after the expected data arrives and is copied
to the user-specified buffer. The user process can check for the completion of the
non- blocking Receive by invoking the Wait operation on the returned handle.
1.6.4 PROCESSOR SYNCHRONY

Processor synchrony indicates that all the processors execute in lock-step with
their clocks synchronized.
Since distributed systems do not follow a common clock, this abstraction is implemented
using some form of barrier synchronization to ensure that no processor begins executing the next
step of code until all the processors have completed executing the previous steps of code
assigned to each of the processors.
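A minimal sketch of barrier synchronization, with Python threads standing in for processors (thread and step counts are arbitrary): no thread begins step k + 1 until all threads have completed step k.

```python
# Barrier synchronization: lock-step execution of steps across threads.
import threading

N = 4
barrier = threading.Barrier(N)

def worker(pid):
    for step in range(3):
        # ... execute the code assigned to this processor for this step ...
        print(f"processor {pid} finished step {step}")
        barrier.wait()  # block until all N processors reach the barrier

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
```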

1.6.5 LIBRARIES AND STANDARDS


1. Message Passing Interface (MPI): This is a standardized and portable message-
passing system designed to function on a wide variety of parallel computers. The
primary goal of the Message Passing Interface is to provide a widely used standard for
writing message-passing programs (a short sketch follows this list).
2. Parallel Virtual Machine (PVM): It is a software tool for parallel networking of
computers. It is designed to allow a network of heterogeneous Unix and/or
Windows machines to be used as a single distributed parallel processor.
3. Remote Procedure Call (RPC): The Remote Procedure Call (RPC) is a common
model of request reply protocol. In RPC, the procedure need not exist in the same



address space as the calling procedure. The two processes may be on the same
system, or they may be on different systems with a network connecting them. By
using RPC, programmers of distributed applications avoid the details of the
interface with the network. RPC makes the client/server model of computing more
powerful and easier to program.
4. Remote Method Invocation (RMI): RMI is a way for a programmer to write
object-oriented programs in which objects on different computers can interact in a
distributed network. It is a set of protocols developed by Sun's JavaSoft division
that enables Java objects to communicate remotely with other Java objects.
5. Common Object Request Broker Architecture (CORBA): CORBA describes a
messaging mechanism by which objects distributed over a network can
communicate with each other irrespective of the platform and language used to
develop those objects. The data representation is concerned with an external
representation for the structured and primitive types that can be passed as the
arguments and results of remote method invocations in CORBA. It can be used by a
variety of programming languages.
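As a small illustration of the message-passing style these libraries support, here is a hedged MPI sketch using the mpi4py binding (assuming it is installed; run with, e.g., mpiexec -n 2 python demo.py):

```python
# Two-process send/receive with MPI via mpi4py (illustrative sketch).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"payload": 42}, dest=1, tag=0)  # blocking send to process 1
elif rank == 1:
    data = comm.recv(source=0, tag=0)          # blocking receive from process 0
    print(f"process 1 received {data}")
```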
The commonalities between RMI and RPC are as follows:
 They both support programming with interfaces.
 They are constructed on top of request-reply protocols.
 They both offer a similar level of transparency.
Differences between RMI and RPC

1.7 SYNCHRONOUS VS ASYNCHRONOUS EXECUTIONS


The execution of process in distributed systems may be synchronous or asynchronous.
Asynchronous Execution:
A communication among processes is considered asynchronous, when every
communicating process can have a different observation of the order of the messages being
exchanged. In an asynchronous execution:
 there is no processor synchrony and there is no bound on the drift rate of processor clocks
 message delays are finite but unbounded
 no upper bound on the time taken by a process

Fig 1.18: Asynchronous execution in message passing system


Synchronous Execution:
A communication among processes is considered synchronous when every process
observes the same order of messages within the system. In the same manner, the execution is
considered synchronous when every individual process in the system observes the same total
order of all the events which happen within it. In a synchronous execution:

 processors are synchronized and the clock drift rate between any two processors is
bounded
 message delivery times are such that they occur in one logical step or round
 upper bound on the time taken by a process to execute a step.

Fig 1.19 : Synchronous execution

Emulating an asynchronous system by a synchronous system (A → S)


An asynchronous program can be emulated on a synchronous system fairly trivially as
the synchronous system is a special case of an asynchronous system – all communication



finishes within the same round in which it is initiated.
Emulating a synchronous system by an asynchronous system (S → A)
A synchronous program can be emulated on an asynchronous system using a tool called a
synchronizer.
Emulation for a fault free system

Fig 1.20: Emulations in a failure free message passing system

If system A can be emulated by system B, denoted A/B, and if a problem is not solvable
in B, then it is also not solvable in A. If a problem is solvable in A, it is also solvable in B.
Hence, in a sense, all four classes are equivalent in terms of computability in failure-free
systems.

1.8 DESIGN ISSUES AND CHALLENGES IN DISTRIBUTED SYSTEMS

The design of distributed systems has numerous challenges. They can be categorized into:
 Issues related to system and operating systems design
 Issues related to algorithm design
 Issues arising due to emerging technologies
The above three classes are not mutually exclusive.

1.8.1 ISSUES RELATED TO SYSTEM AND OPERATING SYSTEMS DESIGN

The following are some of the common challenges to be addressed in designing a


distributed system from system perspective:
a) Communication: This task involves designing suitable communication mechanisms
among the various processes in the networks.
Examples: RPC, RMI
b) Processes: The main challenges involved are process and thread management at both
client and server environments, migration of code between systems, design of software
and mobile agents.



c) Naming: Devising easy-to-use and robust schemes for names, identifiers, and addresses is
essential for locating resources and processes in a transparent and scalable manner. The
remote and highly varied geographical locations make this task difficult.
d) Synchronization: Mutual exclusion, leader election, deploying physical clocks, global
state recording are some synchronization mechanisms.
e) Data storage and access schemes: Designing file systems for easy and efficient data
storage with implicit accessing mechanisms is essential for distributed operation.
f) Consistency and replication: The notion of distributed systems goes hand in hand with
replication of data to provide a high degree of scalability. The replicas should be handled
with care, since data consistency is a prime issue.
g) Fault tolerance: This requires maintenance of fail proof links, nodes, and processes.
Some of the common fault tolerant techniques are resilience, reliable communication,
distributed commit, check pointing and recovery, agreement and consensus, failure
detection, and self-stabilization.
h) Security: Cryptography, secure channels, access control, key management (generation
and distribution), authorization, and secure group management are some of the security
measures imposed on distributed systems.
i) Applications Programming Interface (API) and transparency: User friendliness
and ease of use are very important for making distributed services usable by a wide
community. Transparency, which is hiding the inner implementation details from users, is
of the following types:
 Access transparency: hides differences in data representation
 Location transparency: hides differences in locations for providing uniform access
to data located at remote locations.
 Migration transparency: allows relocating resources without changing names.
 Replication transparency: Makes the user unaware whether he is working on
original or replicated data.
 Concurrency transparency: Masks the concurrent use of shared resources for the
user.
 Failure transparency: system being reliable and fault-tolerant.
j) Scalability and modularity: The algorithms, data and services must be as distributed as
possible. Various techniques such as replication, caching and cache management, and
asynchronous processing help to achieve scalability.



1.8.2 ALGORITHMIC CHALLENGES IN DISTRIBUTED COMPUTING

a. Designing useful execution models and frameworks


 The interleaving model, partial order model, input/output automata model and the
Temporal Logic of Actions (TLA) are some examples of models that provide
different degrees of infrastructure.
b. Dynamic distributed graph algorithms and distributed routing algorithms
 The distributed system is generally modeled as a distributed graph.
 Hence graph algorithms form the basis for a large number of higher-level
communication, data dissemination, object location, and object search functions.
 These algorithms must be able to deal with highly dynamic graph
characteristics. They are expected to function like routing algorithms.
c. Time and global state in a distributed system
 Geographically remote resources demand synchronization based on logical
time.
 Logical time is relative and eliminates the overhead of providing physical time for
applications. Logical time can
(i) capture the logic and inter-process dependencies, and
(ii) track the relative progress of each process.
d. Synchronization/coordination mechanisms
 Synchronization is essential for the distributed processes to facilitate concurrent
execution without affecting other processes.
 The synchronization mechanisms also involve resource management and
concurrency management mechanisms.
 Some techniques for providing synchronization are:
 Physical clock synchronization
 Leader election
 Mutual exclusion
 Deadlock detection and resolution: detection should be coordinated to
avoid duplicate work, and resolution should be coordinated to avoid
unnecessary aborts of processes.
 Termination detection: cooperation among the processes to detect the specific
global state of quiescence.
 Garbage collection: Detecting garbage requires coordination among the
processes.
e. Group communication, multicast, and ordered message delivery
 A group is a collection of processes that share a common context and collaborate on
a common task within an application domain. Group management protocols are



needed for group communication wherein processes can join and leave groups
dynamically, or fail.
 The concurrent execution of remote processes may sometimes violate the semantics
and order of the distributed program. Hence, a formal specification of the semantics
of ordered delivery needs to be formulated and then implemented.
f. Monitoring distributed events and predicates
 Predicates defined on program variables that are local to different processes are used
for specifying conditions on the global system state.
 On-line algorithms for monitoring such predicates are hence important.
 An important paradigm for monitoring distributed events is that of event streaming,
wherein streams of relevant events reported from different processes are examined
collectively to detect predicates.
g. Distributed program design and verification tools
 Methodically designed and verifiably correct programs can greatly reduce the
overhead of software design, debugging, and engineering. Designing these is a big
challenge.
h. Debugging distributed programs
 Debugging distributed programs is much harder because of the concurrency and
replication. Adequate debugging mechanisms and tools are the need of the hour.

i. Data replication, consistency models, and caching


 Fast access to data and other resources is important in distributed systems.
 Managing replicas and their updates faces concurrency problems.
 Placement of the replicas in the systems is also a challenge because resources
usually cannot be freely replicated.
j. World Wide Web design – caching, searching, scheduling
 WWW is a commonly known distributed system.
 The issues of object replication and caching, prefetching of objects have to be done
on WWW also.
 Object search and navigation on the web are important functions in the operation of
the web.
k. Distributed shared memory abstraction
 A shared memory abstraction simplifies programming, since the programmer does
not have to manage the communication tasks explicitly.
 The communication is done by the middleware via message passing.
 The overhead of shared memory is to be dealt with by the middleware technology.
 Some of the methodologies that perform the task of communication in shared-memory
distributed systems are:
 Wait-free algorithms
 Mutual exclusion



l. Register constructions
Architectures must be designed in such a way that registers allow concurrent access
without any restrictions on the concurrency permitted.
m. Reliable and fault-tolerant distributed systems
The following are some of the fault tolerant strategies:
 Consensus algorithms: Consensus algorithms allow correctly functioning
processes to reach agreement among themselves in spite of the existence of
malicious processes.
 Replication and replica management: The Triple Modular Redundancy (TMR)
technique is used in software and hardware implementation. TMR is a fault- tolerant
form of N-modular redundancy, in which three systems perform a process and that
result is processed by a majority-voting system to produce a single output.
 Voting and quorum systems: Providing redundancy in the active or passive
components in the system and then performing voting based on some quorum
criterion is a classical way of dealing with fault-tolerance.
 Distributed databases and distributed commit: The distributed databases should
also follow atomicity, consistency, isolation and durability (ACID) properties.
 Self-stabilizing systems: All system executions have associated good (or legal)
states and bad (or illegal) states; during correct functioning, the system makes
transitions among the good states.
 Checkpointing and recovery algorithms: Checkpointing is periodically recording
the current state on secondary storage so that, in case of a failure, the entire
computation is not lost but can be recovered from one of the recently taken
checkpoints.
 Failure detectors: Asynchronous distributed systems do not have a bound on the
message transmission time. This makes failure detection difficult, since a receiver
does not know how long to wait for a message. Failure detectors probabilistically
suspect another process as having failed and then converge on a determination of the
up/down status of the suspected process.
n. Load balancing
The objective of load balancing is to gain higher throughput, and reduce the user
perceived latency. Load balancing may be necessary because of a variety of factors such
as high network traffic or high request rate causing the network connection to be a
bottleneck, or high computational load. The following are some forms of load balancing:
 Data migration: The ability to move data around in the system, based on
the access pattern of the users
 Computation migration: The ability to relocate processes in order to
perform a redistribution of the workload.
 Distributed scheduling: This achieves a better turnaround time for the users
by using idle processing power in the system more efficiently.



o. Real-time scheduling
Real-time scheduling becomes more challenging when a global view of the system
state is absent with more frequent on-line or dynamic changes. The message propagation
delays which are network-dependent are hard to control or predict.
p. Performance
User perceived latency in distributed systems must be reduced. The common issues
in performance:
 Metrics: Appropriate metrics must be defined for measuring the performance of
theoretical distributed algorithms and its implementation.
 Measurement methods/tools: The distributed system is a complex entity;
appropriate methodologies and tools must be developed for measuring the
performance metrics.

1.8.3 APPLICATIONS OF DISTRIBUTED COMPUTING AND NEWER CHALLENGES


a. Mobile systems
 Mobile systems which use wireless communication in shared broadcast medium
have issues related to physical layer such as transmission range, power, battery
power consumption, interfacing with wired internet, signal processing and
interference.
o Base-station approach (cellular approach): The geographical region is divided
into hexagonal physical locations called cells. The powerful base station
transmits signals to all other nodes in its range
o Ad-hoc network approach: This is an infrastructure-less approach which does not
have any base station to transmit signals. Instead, all the responsibility is
distributed among the mobile nodes.
b. Sensor networks
 A sensor is a processor with an electro-mechanical interface that is capable of
sensing physical parameters.
 They are low cost equipment with limited computational power and battery life.
They are designed to handle streaming data and route it to external computer
network and processes.
 They are susceptible to faults and have to reconfigure themselves.
c. Ubiquitous or pervasive computing
 In Ubiquitous systems the processors are embedded in the environment to perform
application functions in the background.
 Examples: Intelligent devices, smart homes etc.
 They are distributed systems with recent advancements operating in wireless
environments through actuator mechanisms.
 They can be self-organizing and network-centric with limited resources.



d. Peer-to-peer computing
 Peer-to-peer (P2P) computing is computing over an application-layer network where
all interactions among the processors are at the same level.
 This is a form of symmetric computation, in contrast to the client-server paradigm.
 P2P systems are self-organizing, with or without a regular structure to the network.
e. Publish-subscribe, content distribution, and multimedia
 Present-day users require only the information of interest to them.
 In a dynamic environment where the information constantly fluctuates, there is great
demand for
(i) Publish :an efficient mechanism for distributing this information
(ii) Subscribe: an efficient mechanism to allow end users to indicate interest in
receiving specific kinds of information
(iii) An efficient mechanism for aggregating large volumes of published
information and filtering it as per the user’s subscription filter.
 Content distribution refers to a mechanism that categorizes the information based on
parameters.
 The publish subscribe and content distribution overlap each other.
 Multimedia data introduces special issue because of its large size.
f. Distributed agents
 Agents are software processes or sometimes robots that move around the system to
do specific tasks for which they are programmed.
 Agents collect and process information and can exchange such information with
other agents.
 Challenges in distributed agent systems include coordination mechanisms among the
agents, controlling the mobility of the agents,their software design and interfaces.
g. Distributed data mining
 Data mining algorithms process large amount of data to detect patterns and trends in
the data, to mine or extract useful information.
 The mining can be done by applying database and artificial intelligence techniques
to a data repository.

h. Grid computing
 Grid computing is deployed to manage resources. For instance, idle CPU cycles of
machines connected to the network will be available to others.
 The challenges includes: scheduling jobs, framework for implementing quality of
service, real-time guarantees, security.
i. Security in distributed systems
 The challenges of security in a distributed setting include: confidentiality,
authentication and availability. This can be addressed using efficient and scalable
solutions.


1.9 DISTRIBUTED PROGRAM


A distributed program is composed of a set of asynchronous processes that communicate
by message passing over the communication network. Each process may run on a different
processor.
 The processes do not share a global memory and communicate solely by passing
messages. These processes do not share a global clock that is instantaneously
accessible to these processes.
 Process execution and message transfer are asynchronous – a process may execute
an action spontaneously and a process sending a message does not wait for the
delivery of the message to be complete.
 The global state of a distributed computation is composed of the states of the
processes and the communication channels. The state of a process is characterized
by the state of its local memory and depends upon the context.
 The state of a channel is characterized by the set of messages in transit in the
channel.
1.10 A MODEL OF DISTRIBUTED EXECUTIONS
 The execution of a process consists of a sequential execution of its actions.
 The actions are atomic and the actions of a process are modeled as three types of
events:
 internal events
 message send events
 message receive events.
 The occurrence of events changes the states of respective processes and channels,
thus causing transitions in the global system state.
 An internal event changes the state of the process at which it occurs.
 A send event changes the state of the process that sends the message and the state of
the channel on which the message is sent.
The execution of process p_i produces a sequence of events e_i^1, e_i^2, e_i^3, ..., which is
denoted by H_i, where
H_i = (h_i, →_i).
Here h_i is the set of events produced by p_i, and the binary relation →_i defines a linear order
on these events, i.e., the causal dependencies among the events of p_i.
 The relation →_msg indicates the dependency that exists due to message passing between two
events.


For every message m exchanged between two processes, send(m) →_msg rec(m).

Fig 1.21: Space-time diagram of a distributed execution
 A send event changes the state of the process that sends the message and the state of
the channel on which the message is sent.
 A receive event changes the state of the process that receives the message and the
state of the channel on which the message is received.
1.10.1 CAUSAL PRECEDENCE RELATIONS
Causal message ordering is a partial ordering of messages in a distributed computing
environment. It is the delivery of messages to a process in the order in which they were
transmitted to that process.
Happened-Before Relation
The partial ordering obtained by generalizing the relationship between two processes is
called the happened-before relation, causal ordering, or potential causal ordering. The term was
coined by Lamport. Happened-before defines a partial order of events in a distributed system;
some pairs of events cannot be placed in the order. We write A → B if A happens before B.
A → B is defined using the following rules:
 Local ordering: A and B occur on the same process and A occurs before B.
 Messages: send(m) → receive(m) for any message m.
 Transitivity: e → e'' if e → e' and e' → e''.
Lamport's ordering is the happened-before relation, denoted →:
 a → b, if a and b are events in the same process and a occurred before b.
 a → b, if a is the event of sending a message m in a process and b is the event of the
same message m being received by another process.
 If a → b and b → c, then a → c; Lamport's relation satisfies the transitivity property.
When any of the above conditions holds, a and b are causally related.
Example:
Consider two events c and d. If both c → d and d → c are false, then c and d are not causally
related; such events are said to be concurrent, denoted c ∥ d.


Fig 1.22: Communication between processes

Fig 1.22 shows the communication of messages m1 and m2 between three processes p1,
p2 and p3, with events a, b, c, d, e and f. It can be inferred from the diagram that a → b; c → d;
e → f; b → c; d → f; a → d; a → f; b → d; b → f. Also a ∥ e and c ∥ e.

1.10.2 LOGICAL VS PHYSICAL CONCURRENCY

Logical concurrency:
Several program units of the same program appear to execute simultaneously, while the actual
execution takes place in interleaved fashion on a single processor.
Physical concurrency:
Several program units of the same program actually execute simultaneously on multiple
processors.

Differences between logical and physical concurrency

Logical concurrency:
 Several units of the same program execute on the same processor in interleaved fashion,
giving the programmer the illusion that they are executing on multiple processors.
 It is implemented through interleaving on a single processor.
Physical concurrency:
 Several program units of the same program execute at the same time on different
processors.
 It is implemented with multiple CPUs, uniprocessors with I/O channels, or networks of
uni- or multi-CPU machines.

1.11 MODELS OF COMMUNICATION NETWORK


There are three types of communication models in distributed systems. They are,
1. FIFO (first-in, first-out)
2. Non-FIFO (N-FIFO)
3. Causal Ordering (CO):
FIFO
In FIFO, each channel acts as a FIFO message queue. So message ordering is preserved
by a channel.
Non-FIFO (N-FIFO)
In N-FIFO, a channel acts like a set in which a sender process adds messages and
receiver removes messages in random order.



Causal Ordering (CO)
In causal ordering, message delivery respects Lamport's happened-before relation on the
corresponding send events.

The relation between the three models is given by CO ⊂ FIFO ⊂ Non-FIFO.

1.12 GLOBAL STATE


The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
 The state of a process at any time is defined by the contents of processor
registers, stacks, local memory, etc. and depends on the local context of the
distributed application.
 The state of a channel is given by the set of messages in transit in the channel.
 The occurrence of events changes the states of respective processes and channels,
thus causing transitions in global system state. For example, an internal event
changes the state of the process at which it occurs. A send event (or a receive
event) changes the state of the process that sends (or receives) the message and
the state of the channel on which the message is sent (or received).
A snapshot of the system is a single configuration of the system.

Consistent state
A distributed snapshot should reflect a consistent state. A global state is consistent if it
could have been observed by an external observer. For a successful Global State, all states must
be consistent:
 If we have recorded that a process P has received a message from a process Q,
then we should have also recorded that process Q actually sent that
message.
 Otherwise, a snapshot will contain the recording of messages that have been
received but never sent.
 The reverse condition (Q has sent a message that P has not received) is allowed.
The notion of a global state can be graphically represented by a cut. A cut represents the
last event that has been recorded for each process.
The history of each process p_i is given by h_i = ⟨e_i^1, e_i^2, ...⟩.

 Each event e_i^k is an action of the process (an internal action, or the sending or
receiving of a message).
 s_i^k denotes the state of process p_i immediately before the kth event occurs.
 The state s_i in the global state S corresponding to the cut C is that of p_i
immediately after the last event executed by p_i in the cut, e_i^{c_i}.
 The set of events {e_i^{c_i}} is called the frontier of the cut.

Fig 1.23: Types of cuts

Consistent states:
The states should not violate causality. Such states are called consistent global states and
are meaningful global states.
Inconsistent global states:
They are not meaningful in the sense that a distributed system can never be in an
inconsistent state.
1.13 CUTS OF A DISTRIBUTED COMPUTATION
The notion of a global state can be graphically represented by a cut. A cut represents the
last event that has been recorded for each process.
Cut is pictorially a line slices the space–time diagram, and thus the set of events in the
distributed computation, into a PAST and a FUTURE.
 The PAST contains all the events to the left of the cut
 FUTURE contains all the events to the right of the cut.
 For a cut C, let PAST(C) and FUTURE(C) denote the set of events in the PAST
and FUTURE of C, respectively.
Consistent cut:
A consistent global state corresponds to a cut in which every message received in the
PAST of the cut was sent in the PAST of that cut.
Inconsistent cut:
A cut is inconsistent if a message crosses the cut from the FUTURE to the PAST.
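The consistency condition above can be checked mechanically. In this hedged sketch, a cut is given as the number of events recorded per process, and each message carries the 1-based indices of its send and receive events (all names are illustrative assumptions):

```python
# Check that no message crosses a cut from the FUTURE to the PAST.
def is_consistent(cut: dict, messages: list) -> bool:
    """True iff every message received in the PAST was sent in the PAST."""
    for sender, send_idx, receiver, recv_idx in messages:
        received_in_past = recv_idx <= cut[receiver]
        sent_in_past = send_idx <= cut[sender]
        if received_in_past and not sent_in_past:
            return False  # a message crosses from FUTURE to PAST
    return True

# m: p1's 2nd event sends a message received as p2's 3rd event.
msgs = [("p1", 2, "p2", 3)]
print(is_consistent({"p1": 3, "p2": 3}, msgs))  # True: consistent cut
print(is_consistent({"p1": 1, "p2": 3}, msgs))  # False: receive without send
```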
1.14 PAST AND FUTURE CONES OF AN EVENT
In a distributed computation, an event e_j could have been affected only by events e_i
such that e_i → e_j, and all the information available at e_i could be made accessible at e_j. In
other words, e_i and e_j must have a causal relationship. Let Past(e_j) denote all events in the
past of e_j in the computation, and let Past_i(e_j) be the events of Past(e_j) on process p_i.

 The term max(Past_i(e_j)) denotes the latest event of process p_i that has affected e_j.
 This will always be a message send event.


Fig 1.24: Past and future cones of event

A cut in a space–time diagram is a line joining an arbitrary point on each process line that
slices the space–time diagram into a PAST and a FUTURE. A consistent global state corresponds
to a cut in which every message received in the PAST of the cut was sent in the PAST of that cut.

Futurei(ej) is the set of those events of Future(ej) that are on process pi, and min(Futurei(ej))
is the first event on process pi that is affected by ej. All events at a process pi that occurred after
max(Pasti(ej)) but before min(Futurei(ej)) are concurrent with ej.

1.15 MODELS OF PROCESS COMMUNICATIONS

There are two basic models of process communications


Synchronous: the sender process blocks until the message has been received by the
receiver process. The sender process resumes execution only after it learns that the receiver
process has accepted the message. The sender and the receiver processes must synchronize
to exchange a message.
Asynchronous: It is non-blocking communication, where the sender and the receiver do not
synchronize to exchange a message. The sender process does not wait for the message to be
delivered to the receiver process. The message is buffered by the system and is delivered to
the receiver process when it is ready to accept it. A buffer overflow may occur if a process
sends a large number of messages in a burst to another process.
Asynchronous communication achieves a high degree of parallelism and non-determinism,
at the cost of the implementation complexity of buffers. Synchronous communication, on the
other hand, is simpler to implement but offers lower performance: the possibility of deadlocks
and the frequent blocking of events prevent it from reaching higher performance levels.
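As an illustration of the two models (a minimal sketch; the Channel class and its method names are assumptions, not a prescribed interface), the asynchronous send merely deposits the message into a system buffer and returns, while the synchronous send blocks until the receiver has accepted the message:

import threading, queue

class Channel:
    def __init__(self, capacity=16):
        self.buf = queue.Queue(maxsize=capacity)    # system message buffer

    def async_send(self, msg):
        self.buf.put(msg)            # returns once the message is buffered
                                     # (blocks only on buffer overflow)

    def sync_send(self, msg):
        accepted = threading.Event()
        self.buf.put((msg, accepted))
        accepted.wait()              # sender blocks until the receiver accepts

    def sync_recv(self):
        msg, accepted = self.buf.get()
        accepted.set()               # handshake: unblock the sender
        return msg

ch = Channel()
t = threading.Thread(target=lambda: print("got", ch.sync_recv()))
t.start()
ch.sync_send("hello")                # returns only after the receiver accepted
t.join()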


LOGICAL TIME
Logical clocks are based on capturing chronological and causal relationships of processes and
ordering events based on these relationships.
Three types of logical clock are maintained in distributed systems:
 Scalar clock
 Vector clock
 Matrix clock
Differences between physical and logical clocks:
 A physical clock is a physical process combined with a method of measuring that process, used to record the passage of time, whereas a logical clock is a mechanism for capturing chronological and causal relationships in a distributed system.
 Physical clocks are based on cyclic processes such as a celestial rotation, whereas a logical clock allows a global ordering on events from different processes.

A system of logical clocks consists of a time domain T and a logical clock C. Elements of T
form a partially ordered set over a relation <. This relation is usually called the happened
before or causal precedence.
1.16 A FRAMEWORK FOR A SYSTEM OF LOGICAL CLOCKS

The logical clock C is a function that maps an event e in a distributed system to an
element in the time domain T, denoted as C(e) and called the timestamp of e, such that for any
two events ei and ej:
    ei → ej ⇒ C(ei) < C(ej)
This monotonicity property is called the clock consistency condition.
When T and C satisfy the following condition:
    ei → ej ⇔ C(ei) < C(ej)
then the system of clocks is said to be strongly consistent.


1.16.1 IMPLEMENTING LOGICAL CLOCKS
The two major issues in implementing logical clocks are:
 Data structures: representation of each process
 Protocols: rules for updating the data structures to ensure consistent conditions.
Data structures:
Each process pi maintains data structures with the given capabilities:
• A local logical clock (lci)



 helps process pi measure its own progress.
• A logical global clock (gci)
 is a representation of process pi’s local view of the logical global time. It
allows this process to assign consistent timestamps to its local events.
Protocol:
The protocol ensures that a process’s logical clock, and thus its view of the global time,
is managed consistently with the following rules:
Rule 1:
Decides the updates of the logical clock by a process. It controls send, receive and other
operations.
Rule 2:
Decides how a process updates its global logical clock to update its view of the global
time and global progress. It dictates what information about the logical time is piggybacked in a
message and how this information is used by the receiving process to update its view of the
global time.
1.17 SCALAR TIME
Scalar time is designed by Lamport to synchronize all the events in distributed systems.
A Lamport logical clock is an incrementing counter maintained in each process. This logical
clock has meaning only in relation to messages moving between processes. When a process
receives a message, it resynchronizes its logical clock with that sender maintaining causal
relationship.
Rules R1 and R2 to update the clocks are as follows:
R1: Before executing an event (send, receive, or internal), process pi executes the following:
    Ci := Ci + d    (d > 0)
In general, every time R1 is executed, d can have a different value, and this value may be
application-dependent. However, typically d is kept at 1 because this is able to identify the time of
each event uniquely at a process, while keeping the rate of increase of d to its lowest level.
R2: Each message piggybacks the clock value of its sender at sending time. When a process pi
receives a message with timestamp Cmsg, it executes the following actions:
1. Ci := max(Ci, Cmsg);
2. execute R1;
3. deliver the message.

Fig 1.25: Evolution of scalar time
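A minimal Python sketch of rules R1 and R2, with d = 1 (the class and method names are illustrative, not a prescribed implementation):

class LamportClock:
    # Scalar (Lamport) clock implementing rules R1 and R2 with d = 1.
    def __init__(self):
        self.c = 0

    def tick(self):              # R1: before any internal/send/receive event
        self.c += 1
        return self.c

    def send(self):              # the returned value is piggybacked on the message
        return self.tick()

    def receive(self, c_msg):    # R2: Ci := max(Ci, Cmsg), then execute R1
        self.c = max(self.c, c_msg)
        return self.tick()

p1, p2 = LamportClock(), LamportClock()
ts = p1.send()                   # p1 sends with timestamp 1
p2.receive(ts)                   # p2's clock becomes max(0, 1) + 1 = 2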


1.17.1 BASIC PROPERTIES OF SCALAR TIME:


1. Consistency property: Scalar clocks always satisfy monotonicity: a monotonic
clock only increments its timestamp and never jumps. Hence it is consistent:
    ei → ej ⇒ C(ei) < C(ej)
2. Total ordering: Scalar clocks can be used to totally order the events in a
distributed system. However, two or more events at different processes may have
identical timestamps, so a tie-breaking mechanism is needed to order such
events. The tie is broken as follows:
 Process identifiers are linearly ordered.
 The process with the lower identifier value is given higher priority.
The term (t, i) denotes the timestamp of an event, where t is its time of occurrence and i is
the identity of the process where it occurred. The total order relation ≺ over two events x
and y with timestamps (h, i) and (k, j), respectively, is given by:
    x ≺ y ⇔ (h < k) ∨ (h = k ∧ i < j)
A total order is generally used to ensure liveness properties in distributed algorithms.

3. Event Counting
If event e has a timestamp h, then h−1 represents the minimum logical duration, counted
in units of events, required before producing the event e. This is called height of the event e. h-1
events have been produced sequentially before the event e regardless of the processes that
produced these events.
4. No strong consistency
The reason that scalar clocks are not strongly consistent is that the logical local clock and the
logical global clock of a process are squashed into one, resulting in the loss of causal dependency
information among events at different processes.
1.18 VECTOR TIME
The time domain is represented by a set of n-dimensional non-negative
integer vectors in vector time.

The system of vector clocks was developed independently by Fidge, Mattern, and
Schmuck. In the system of vector clocks, the time domain is represented by a set of n-
dimensional non-negative integer vectors.
Each process pi maintains a vector vti[1…n], where vti[i] is the local logical clock of pi
and describes the logical time progress at process pi. vti[j] represents process pi's latest
knowledge of process pj's local time. If vti[j] = x, then process pi knows that the local time at
process pj has progressed till x. The entire vector vti constitutes pi's view of the global logical
time and is used to timestamp events.



1.18.1 RULES OF VECTOR TIME

Rule 1:
Before executing an event, process pi updates its local logical time as follows:
    vti[i] := vti[i] + d    (d > 0)
Rule 2:
Each message m is piggybacked with the vector clock vt of the sender
process at sending time. On the receipt of such a message (m, vt), process
pi executes the following sequence of actions:
1. update its global logical time:
    1 ≤ k ≤ n : vti[k] := max(vti[k], vt[k])
2. execute Rule 1;
3. deliver the message m.

Fig 1.26: Evolution of vector time
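A minimal Python sketch of Rules 1 and 2, with d = 1 (names are illustrative):

class VectorClock:
    # Vector clock of process pid in a system of n processes, with d = 1.
    def __init__(self, pid, n):
        self.pid, self.vt = pid, [0] * n

    def tick(self):                            # Rule 1
        self.vt[self.pid] += 1

    def send(self):
        self.tick()
        return list(self.vt)                   # piggyback a copy of vt

    def receive(self, vt_msg):                 # Rule 2
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        self.tick()

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()          # p0's clock: [1, 0]
p1.receive(m)          # p1's clock: [1, 1]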



1.18.2 BASIC PROPERTIES OF VECTOR TIME
1. Isomorphism:
 The relation “→” induces a partial order on the set of events that are produced by a
distributed execution.
 If events x and y are timestamped as vh and vk, respectively, then:
    x → y ⇔ vh < vk
    x ∥ y ⇔ ¬(vh < vk) ∧ ¬(vk < vh)
where vh < vk means vh[i] ≤ vk[i] for all i and vh ≠ vk.
2. Strong consistency
The system of vector clocks is strongly consistent; thus, by examining the vector timestamp of
two events, we can determine if the events are causally related.
3. Event counting
If an event e has timestamp vh, vh[j] denotes the number of events executed by

process pj that causally precede e.
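Because vector clocks are strongly consistent, causality between two events can be decided from their timestamps alone. A small illustrative sketch of this test (function names are assumptions):

def happened_before(vh, vk):
    # vh -> vk iff vh <= vk componentwise and vh != vk
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk

def concurrent(vh, vk):
    return not happened_before(vh, vk) and not happened_before(vk, vh)

print(happened_before([1, 0], [1, 1]))   # True : a send precedes its receive
print(concurrent([2, 0], [0, 1]))        # True : causally unrelated events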


1.19 PHYSICAL CLOCK SYNCHRONIZATION: NTP
Centralized systems do not need clock synchronization, as they work under a common
clock. But distributed systems do not have a common clock: each system functions based on
its own internal clock and its own notion of time.
The time in distributed systems is measured in the following contexts:
 The time of the day at which an event happened on a specific machine in the
network.
 The time interval between two events that happened on different machines in the
network.
 The relative ordering of events that happened on different machines in the network.

Clock synchronization is the process of ensuring that physically distributed
processors have a common notion of time.
Clocks are synchronized to an accurate real-time standard like UTC (Universal Coordinated
Time). Clocks that must not only be synchronized with each other but also have to adhere to
physical time are termed physical clocks. This degree of synchronization additionally enables to
coordinate and schedule actions between multiple computers connected to a common network.



1.19.1 BASIC TERMINOLOGIES
If Ca and Cb are two different clocks, then:
 Time: The time of a clock in a machine p is given by the function Cp(t),where
Cp(t)= t for a perfect clock.
 Frequency: Frequency is the rate at which a clock progresses. The frequency at
time t of clock Ca is Ca’(t).
 Offset: Clock offset is the difference between the time reported by a clock and the
real time. The offset of the clock Ca is given by Ca(t) − t. The offset of clock Ca
relative to Cb at time t ≥ 0 is given by Ca(t) − Cb(t).
 Skew: The skew of a clock is the difference in the frequencies of the clock and the
perfect clock. The skew of a clock Ca relative to clock Cb at time t is Ca′(t) − Cb′(t).
 Drift (rate): The drift of clock Ca is the second derivative of the clock value with
respect to time, i.e., Ca″(t). The drift of clock Ca relative to clock Cb at time t is
Ca″(t) − Cb″(t).
Clocking Inaccuracies
Physical clocks are synchronized to an accurate real-time standard like UTC (Universal
Coordinated Time). Due to the clock inaccuracies discussed above, a timer (clock) is said to be
working within its specification if
    1 − ρ ≤ dC/dt ≤ 1 + ρ
where the constant ρ is the maximum skew rate specified by the manufacturer.
Offset delay estimation: the Network Time Protocol (NTP)
NTP is a time service for the Internet: it synchronizes clients to UTC, achieves reliability
through redundant servers and redundant paths, is scalable, and authenticates time sources. The
design of NTP involves a hierarchical tree of time servers, in which the primary server at the
root synchronizes with UTC.
Clock offset and delay estimation
A source node cannot accurately estimate the local time on the target node due to varying
message or network delays between the nodes. This protocol employs a very common practice of
performing several trials and chooses the trial with the minimum delay.



Fig 1.29: Behavior of clocks

Fig 1.30 a) Offset and delay estimation Fig 1.30 b) Offset and delay estimation
between processes from same server between processes from different servers
Let T1, T2, T3, T4 be the values of the four most recent timestamps (see Fig 1.30). Assume that
clocks A and B are stable and running at the same speed. Let a = T1 − T3 and b = T2 − T4. If the
network delay difference from A to B and from B to A, called the differential delay, is small, the
clock offset θ and round-trip delay δ of B relative to A at time T4 are approximately given by:
    θ = (a + b)/2,    δ = a − b
Each NTP message includes the latest three timestamps T1, T2, and T3, while T4 is
determined upon arrival.
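The computation is a direct transcription of these formulas; reading Fig 1.30 so that A sends the request at T3, B receives it at T1 and replies at T2, and A receives the reply at T4 (this reading is an assumption that makes the formulas consistent):

def ntp_offset_delay(t1, t2, t3, t4):
    # t3: request sent by A (A's clock);  t1: request received by B (B's clock)
    # t2: reply sent by B (B's clock);    t4: reply received by A (A's clock)
    a = t1 - t3
    b = t2 - t4
    theta = (a + b) / 2        # clock offset of B relative to A
    delta = a - b              # round-trip delay
    return theta, delta

# B's clock is 5 units ahead of A's; each one-way network delay is 10 units.
print(ntp_offset_delay(t1=115, t2=116, t3=100, t4=121))   # (5.0, 20)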


PART-A (Possible Questions)


1. Define distributed systems.
A distributed system is a collection of independent computers, interconnected via a network,
capable of collaborating on a task. Distributed computing is computing performed in a
distributed system.

2. What are the aspects of distributed systems?


The definition of distributed systems deals with two aspects that:
 Deals with hardware: The machines linked in a distributed system are autonomous.
 Deals with software: A distributed system gives an impression to the users that they
are dealing with a single system.

3. List the features of Distributed Systems.


 Communication is hidden from users
 Applications interact in uniform and consistent way
 High degree of scalability
 A distributed system is functionally equivalent to the systems of which it is
composed.
 Resource sharing is possible in distributed systems.
 Distributed systems act as fault tolerant systems
 Enhanced performance
 No common clock
 Geographical isolation
 Autonomous
 Heterogeneous

4. List the issues in distributed systems


 Concurrency
 Distributed systems function in heterogeneous environments, so adaptability is a
major issue.
 Latency
 Memory considerations: The distributed systems work on both local and shared
memory.
 Synchronization issues
 Applications must need to adapt gracefully without affecting other parts of the
systems in case of failures.
 Since they are widespread, security is a major issue.
 Limits imposed on scalability
 They are less transparent.



5. Give the QOS parameters
The distributed systems must offer the following QOS:
 Performance
 Reliability
 Availability
 Security

6. Give the differences between centralized and distributed systems

Centralized Systems Distributed Systems


In Centralized Systems, several jobs are In Distributed Systems, jobs are
done on a particular central processing distributed among several processors. The
unit(CPU) Processor are interconnected by a
computer network
They have shared memory and shared They have no global state (i.e.) no shared
variables. memory and no shared variables.
Clocking is present. No global clock.

7. What is reliability?
Reliability involves:
 Availability: the resource/service provided should be accessible at all times.
 Integrity: the value/state of the resource should be correct and consistent.
 Fault-tolerance: the ability to recover from system failures.

8. Give the three types of shared memory multiprocessors.


Uniform Memory Access (UMA), Non Uniform Memory Access (NUMA), Cache Only
Memory Access (COMA)

9. Define Flynn’s taxonomy.


Flynn's taxonomy is a specific classification of parallel computer architectures that are based on
the number of concurrent instruction (single or multiple) and data streams (single or multiple)
available in the architecture.

10. Classify computer architectures based on Flynn's taxonomy.


 SISD: single instruction, single data
 MISD: multiple instruction, single data
 SIMD: single instruction, multiple data
 MIMD: multiple instruction, multiple data

11. Define degree of coupling.


The degree of coupling among a set of modules, whether hardware or software, is measured in
terms of the interdependency and binding and/or homogeneity among the modules.



12. What are Tightly Coupled systems?
 Tightly coupled multiprocessor systems contain multiple CPUs that are connected at
the bus level with both local as well as central shared memory.
 Tightly coupled systems perform better, due to faster access to memory and
intercommunication and are physically smaller and use less power. They are
economically costlier.
 Tightly coupled multiprocessors with UMA shared memory may be either switch-
based (e.g., NYU Ultracomputer, RP3) or bus-based (e.g., Sequent, Encore).
 Some examples of tightly coupled multiprocessors with NUMA shared memory or
that communicate by message passing are the SGI Origin 2000.

13. What are Loosely Coupled systems?


 Loosely coupled multiprocessors consist of distributed memory where each
processor has its own memory and IO channels.
 The processors communicate with each other via message passing or
interconnection switching.

 Each processor may also run a different operating system and have its own bus
control logic.
 Loosely coupled systems are less costly than tightly coupled systems, but are
physically bigger and have a low performance compared to tightly coupled systems.
 The individual nodes in a loosely coupled system can be easily replaced and are
usually inexpensive.
 They require more power and are more robust and can resist failures.

14. Define concurrent programming.


Concurrent programming refer to techniques for decomposing a task into subtasks that can
execute in parallel and managing the risks that arise when the program executes more than one
task at the same time.

15. What is granularity?


Granularity or grain size is a measure of the amount of work or computation that is performed by
that task.

16. Classify parallelism based on work distribution.


1. Fine-grained: Partitioning the application into small amounts of work done leading
to a low computation to communication ratio.
2. Coarse-grained parallelism: This has high computation to communication ratio.
3. Medium-grained: Here the task size and communication time greater than fine-
grained parallelism and lower than coarse-grained parallelism.



17. What are blocking primitives?
The primitive commands wait for the message to be delivered. The execution of the processes is
blocked. The sending process must wait after a send until an acknowledgement is made by the
receiver. The receiving process must wait for the expected message from the sending process.
The receipt is determined by polling common buffer or interrupt. This is a form of
synchronization or synchronous communication. A primitive is blocking if control returns to the
invoking process after the processing for the primitive completes.

18. What are Non Blocking primitives?


If send is nonblocking, it returns control to the caller immediately, before the message is sent.
The advantage of this scheme is that the sending process can continue computing in parallel with
the message transmission, instead of having the CPU go idle. This is a form of asynchronous
communication. A primitive is non-blocking if control returns back to the invoking process
immediately after invocation, even though the operation has not completed. For a non-blocking
Send, control returns to the process even before the data is copied out of the user buffer. For a
non-blocking Receive, control returns to the process even before the data may have arrived from
the sender.

19. What is Synchronous primitive?


A Send or a Receive primitive is synchronous if both the Send() and Receive() handshake with
each other. The processing for the Send primitive completes only after the invoking processor
learns that the other corresponding Receive primitive has also been invoked and that the receive
operation has been completed. The processing for the Receive primitive completes when the data
to be received is copied into the receiver’s user buffer.

20. What is Asynchronous primitive?


A Send primitive is said to be asynchronous, if control returns back to the invoking process after
the data item to be sent has been copied out of the user-specified buffer. It does not make sense
to define asynchronous Receive primitives. Implementing non-blocking operations is tricky.
For non-blocking primitives, a return parameter on the primitive call returns a system-generated
handle which can later be used to check the status of completion of the call.

21. What are the modes of send and receive primitives?


• Blocking synchronous
• Non- blocking synchronous
• Blocking asynchronous
• Non- blocking asynchronous



22. What is Blocking Receive?
The Receive call blocks until the data expected arrives and is written in the specified user buffer.
Then control is returned to the user process.

23. What is Non-blocking Receive?


• The Receive call will cause the kernel to register the call and return the handle of a
location that the user process can later check for the completion of the non-
blocking Receive operation.
• This location gets posted by the kernel after the expected data arrives and is copied
to the user-specified buffer. The user process can check for the completion of the
non-blocking Receive by invoking the Wait operation on the returned handle.

24. What is MPI?
This is a standardized and portable message-passing system to function on a wide variety of
parallel computers. MPI primarily addresses the message-passing parallel programming model:
data is moved from the address space of one process to that of another process through
cooperative operations on each process.
The primary goal of the Message Passing Interface is to provide a widely used standard for
writing message passing programs.

25. Brief about Parallel Virtual Machine (PVM).


It is a software tool for parallel networking of computers. It is designed to allow a network of
heterogeneous Unix and/or Windows machines to be used as a single distributed parallel
processor.

26. Define Remote Procedure Call (RPC).


The Remote Procedure Call (RPC) is a common model of request reply protocol. In RPC, the
procedure need not exist in the same address space as the calling procedure. The two processes
may be on the same system, or they may be on different systems with a network connecting
them. By using RPC, programmers of distributed applications avoid the details of the interface
with the network. RPC makes the client/server model of computing more powerful and easier to
program.

27. What is Remote Method Invocation (RMI)?


RMI (Remote Method Invocation) is a way that a programmer can write object- oriented
programming in which objects on different computers can interact in a distributed network. It is
a set of protocols being developed by Sun's JavaSoft division that enables Java objects to
communicate remotely with other Java objects.



28. What is Remote Procedure Call (RPC)?
It is a protocol that one program can use to request a service from a program located in another
computer in a network without having to understand network details. RPC is a powerful
technique for constructing distributed, client-server based applications. In RPC, the procedure
need not exist in the same address space as the calling procedure. The two processes may be on
the same system, or they may be on different systems with a network connecting them. By using
RPC, programmers of distributed applications avoid the details of the interface with the network.
RPC makes the client/server model of computing more powerful and easier to program.

29. Differentiate between RMI and RPC


RMI vs RPC:
 RMI uses an object-oriented paradigm, where the user needs to know the object and the method of the object to invoke, whereas RPC is not object oriented and does not deal with objects; rather, it calls specific subroutines that are already established.
 With RPC, a remote call looks like a local call, and RPC handles the complexities involved with passing the call from the local to the remote computer. RMI handles the same complexities, but instead of passing a procedural call, it passes a reference to the object and the method that is being called.

30. List the commonalities between RMI and RPC.


 They both support programming with interfaces.
 They are constructed on top of request-reply protocols.
 They both offer a similar level of transparency.

31. Write about CORBA.


CORBA describes a messaging mechanism by which objects distributed over a network can
communicate with each other irrespective of the platform and language used to develop those
objects. The data representation is concerned with an external representation for the structured
and primitive types that can be passed as the arguments and results of remote method
invocations in CORBA. It can be used by a variety of programming languages.

32. What are the transparency types?


 Access transparency: hides differences in data representation
 Location transparency: hides differences in locations by providing uniform access to
data located at remote locations.



 Migration transparency: allows relocating resources without changing names.
 Replication transparency: Makes the user unaware whether he is working on
original or replicated data.
 Concurrency transparency: Masks the concurrent use of shared resources for the
user.
 Failure transparency: system being reliable and fault-tolerant.

33. Write about Wait-free algorithms.


A wait-free algorithm guarantees the ability of a process to complete its execution irrespective
of the actions of other processes. Wait-free algorithms control the access to shared resources in
the shared memory abstraction. They are expensive.

34. Define Mutual exclusion.


Concurrent access of processes to a shared resource or data is executed in mutually exclusive
manner. Only one process is allowed to execute the critical section at any given time. In a
distributed system, shared variables or a local kernel cannot be used to implement mutual
exclusion. Message passing is the sole means for implementing distributed mutual exclusion.

35. Write about the causal precedence relationship.


It places a restriction on communication between processes by requiring that if the transmission
of message mi to process pk necessarily preceded the transmission of message mj to the same
process, then the delivery of these messages to that process must be ordered such that mi is
delivered before mj.

36. Differentiate between logical and physical concurrency.

Logical concurrency: Several units of the same program execute simultaneously on the same processor, giving the programmer an illusion that they are executing on multiple processors. It is implemented through interleaving.
Physical concurrency: Several units of the same program execute at the same time on different processors. It is implemented as a uniprocessor with I/O channels, multiple CPUs, or a network of uni- or multi-CPU machines.

37. Define a snapshot of a system.


A distributed snapshot represents a state that the distributed system might have been in. A
snapshot of the system is a single configuration of the system.



38. What is a cut?
A cut is a set of cut events, one per node, each of which captures the state of the node on which
it occurs.

39. Write about consistent and inconsistent cuts.


Consistent cut: A consistent global state corresponds to a cut in which every message received in
the PAST of the cut was sent in the PAST of that cut.
Inconsistent cut: A cut is inconsistent if a message crosses the cut from the FUTURE to the
PAST.

40. What are logical clocks?


Logical clocks are based on capturing chronological and causal relationships of processes and
ordering events based on these relationships.


PART – B (Possible Questions)

1.Explain in detail about types of multiprocessor systems.


2.Describe the various topologies.
3.Brief about Flynn’s classification.
4.Classify systems based on granularity.
5.Explain about shared memory systems.
6. Elaborate about message passing system in detail.
7. Describe the various primitives for distributed communication.
8. Explain synchronous and asynchronous executions.
9. Brief about the design issues and challenges in distributed system.
10. Explain the model of distributed execution.
11. Write in detail about global state.
12. Explain cuts, future cones and events.
13. Describe the various clocks.
14. Give the various implementations of vector clock.
15. Explain in detail about network time protocol.

PART – C (Possible Questions)


1. (i) Formulate the Relation to computer system components.(10)
(ii) Design the model of distributed executions. (5)
2. (i) Describe the Design issues and challenges in distributed system (8)
(ii) Describe Physical clock synchronization and its applications.(7)
3. (i) Briefly describe the idea behind Message-passing systems versus shared
memory systems.(8)
(ii) Describe the Primitives for distributed communication.(7)
4. Explain in detail about Models of communication networks (15)
5. With an example, explain about Global state, Cuts in detail. (15)
6. Prepare a summary report on Scalar time and Vector time. (15)


UNIT II
MESSAGE ORDERING AND GROUP COMMUNICATION

2.1 MESSAGE ORDERING PARADIGMS


The message ordering means the order of delivering the messages to the intended recipients.
There are four types of communication models in distributed systems. They are,
(i) Non-FIFO
(ii) FIFO
(iii) Causal order
(iv) Synchronous order
There is always a trade-off between concurrency and ease of use and implementation.
ASYNCHRONOUS EXECUTIONS (NON-FIFO)
 An asynchronous execution places no constraint on the order in which messages are delivered; the only ordering between events is that imposed by causality.
 Messages can be delivered in any order, since the channels are non-FIFO.
 Though there is a physical link that delivers the messages sent on it in FIFO order due to
the physical properties of the medium, a logical link may be formed as a composite of
physical links and multiple paths may exist between the two end points of the logical
link.

FIFO EXECUTIONS
In FIFO, each channel acts as a FIFO message queue. So message ordering is preserved
by a channel.
 FIFO logical channels can be realistically assumed when designing distributed
algorithms since most of the transport layer protocols follow connection oriented service.
 A FIFO logical channel can be created over a non-FIFO channel by using a separate
numbering scheme to sequence the messages on each logical channel.
 The sender assigns and appends a <sequence_num, connection_id> tuple to each
message.
 The receiver uses a buffer to order the incoming messages as per the sender’s sequence
numbers, and accepts only the “next” message in sequence.
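The numbering-and-buffering scheme in the bullets above can be sketched as follows (the class names are illustrative assumptions):

class FifoSender:
    def __init__(self, connection_id):
        self.connection_id, self.seq = connection_id, 0

    def wrap(self, msg):                     # append <sequence_num, connection_id>
        self.seq += 1
        return (self.seq, self.connection_id, msg)

class FifoReceiver:
    def __init__(self):
        self.next_seq, self.buffer = 1, {}

    def on_arrival(self, packet):
        seq, _, msg = packet
        self.buffer[seq] = msg
        delivered = []
        while self.next_seq in self.buffer:  # accept only the "next" message
            delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return delivered

s, r = FifoSender("c1"), FifoReceiver()
p1, p2 = s.wrap("a"), s.wrap("b")
print(r.on_arrival(p2))   # []          -- out of order, buffered
print(r.on_arrival(p1))   # ['a', 'b']  -- delivered in FIFO order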


CAUSALLY ORDERED (CO) EXECUTIONS


A causally ordered execution follows Lamport's happened-before relation: if two send events are causally related and their messages are sent to the same destination, the messages must be delivered in that causal order.

The relation between the three models is given by CO ⊂ FIFO ⊂ non-FIFO.

 Figure (a) shows an execution that violates CO because s1 ≺ s3 and at the
common destination P1, we have r3 ≺ r1.
 Figure (b) shows an execution that satisfies CO. Only s1 and s2 are related by
causality but the destinations of the corresponding messages are different.
 Figure (c) shows an execution that satisfies CO. No send events are related by
causality.
 Figure (d) shows an execution that satisfies CO. s2 and s1 are related by
causality but the destinations of the corresponding messages are different.
Similarly for s2 and s3.
Other properties of causal ordering


SYNCHRONOUS ORDER
When all the communication between pairs of processes uses synchronous send and receive
primitives, the resulting order is the synchronous order.
 The synchronous communication always involves a handshake between the receiver and
the sender, the handshake events may appear to be occurring instantaneously and
atomically.
 The instantaneous communication property of synchronous executions requires modified
definition of the causality relation because for each (s, r) ∈ T, the send event is not
causally ordered before the receive event.
 The two events are viewed as being atomic and simultaneous, and neither event precedes
the other.


2.2. ASYNCHRONOUS EXECUTION WITH SYNCHRONOUS COMMUNICATION


When all the communication between pairs of processes is by using synchronous send and
receive primitives, the resulting order is the synchronous order. Algorithms that run on
asynchronous systems will not necessarily work on synchronous systems, and vice versa.
Example
The asynchronous execution of Figure 6.4, illustrated in Figure 6.5(a) using a timing diagram,
will deadlock if run with synchronous primitives. The executions in Figure 6.5(b)–(c) will also
deadlock when run on a synchronous system.

2.2.1 REALIZABLE SYNCHRONOUS COMMUNICATION (RSC)

• An execution can be modeled to give a total order that extends the partial order (E, ≺).
• In an A-execution, the messages can be made to appear instantaneous if there exist a
linear extension of the execution, such that each send event is immediately followed by
its corresponding receive event in this linear extension.


• In the non-separated linear extension, if the adjacent send event and its corresponding
receive event are viewed atomically, then that pair of events shares a common past and a
common future with each other.
Crown
A crown of size k in an execution is a sequence ⟨(s_i, r_i), i ∈ {0, …, k−1}⟩ of pairs of
corresponding send and receive events such that s_0 ≺ r_1, s_1 ≺ r_2, …, s_{k−2} ≺ r_{k−1},
and s_{k−1} ≺ r_0. Such cyclic dependencies may exist in an execution. The crown criterion
states that an A-computation is RSC, i.e., it can be realized on a system with synchronous
communication, if and only if it contains no crown.

 Figure (a): a crown ⟨(s1, r1), (s2, r2)⟩ exists, as we have s1 ≺ r2 and s2 ≺ r1.
 Figure (b): a crown ⟨(s1, r1), (s2, r2)⟩ exists, as we have s1 ≺ r2 and s2 ≺ r1.
 Figure (c): a crown exists among its send and receive events as well, so the execution is not RSC.
Timestamp criterion for RSC execution
An execution (E, ≺) is RSC if and only if there exists a mapping from E to T (scalar
timestamps) such that:
 for any message M, T(s(M)) = T(r(M));
 for each (a, b) in (E × E) \ T, a ≺ b ⇒ T(a) < T(b).
2.2.2.HIERARCHY OF ORDERING PARADIGMS


The orders of executions are:
 Synchronous order (SYNC)
 Causal order (CO)
 FIFO order (FIFO)
 Non FIFO order (non-FIFO)



For an A-execution, A is RSC if and only if A is an S-execution.
RSC ⊂ CO ⊂ FIFO ⊂ A.

The above hierarchy implies that some executions belonging to a class X will not belong to
any of the classes included in X. The degree of concurrency is most in A and least in SYNC.
 A program using synchronous communication is easiest to develop and verify.
 A program using non-FIFO communication, resulting in an A execution, is hardest to
design and verify

Fig 2.3: Hierarchy of execution classes


2.2.3 SIMULATIONS
Asynchronous programs on synchronous systems
The events in the RSC execution are scheduled as per some non-separated linear extension
and adjacent (s, r) events in this linear extension are executed sequentially in the synchronous
system.
 The partial order of the asynchronous execution remains unchanged.
 If an A-execution is not RSC, then there is no way to schedule the events to make them
RSC, without actually altering the partial order of the given A-execution.

Fig 2.4: Modeling channels as processes to simulate an execution using asynchronous primitives on a synchronous system



Synchronous programs on asynchronous systems
A (valid) S-execution can be trivially realized on an asynchronous system by scheduling the
messages in the order in which they appear in the S-execution.
 The partial order of the S-execution remains unchanged but the communication
occurs on an asynchronous system that uses asynchronous communication
primitives.
 Once a message send event is scheduled, the middleware layer waits for
acknowledgment; after the ack is received, the synchronous send primitive
completes.

2.3 SYNCHRONOUS PROGRAM ORDER ON AN ASYNCHRONOUS SYSTEM


Non deterministic programs
In a deterministic program, repeated runs of the same program produce the same partial order
of messages, preserving the deterministic nature of the execution. But distributed programs
sometimes exhibit non-determinism:
 A receive call can receive a message from any sender who has sent a message, if
the expected sender is not specified.
 Multiple send and receive calls which are enabled at a process can be executed in
an interchangeable order.
 If i sends to j, and j sends to i concurrently using blocking synchronous calls,
a deadlock results.
 There is no semantic dependency between the send and the immediately
following receive at each of the processes. If the receive call at one of the
processes can be scheduled before the send call, then there is no deadlock.
2.3.1 RENDEZVOUS
Rendezvous systems are a form of synchronous communication among an arbitrary number of
asynchronous processes. All the processes involved meet with each other, i.e., communicate
synchronously with each other at one time. Two types of rendezvous systems are possible:
 Binary rendezvous:
o When two processes agree to synchronize.
 Multi-way rendezvous:
o When more than two processes agree to synchronize.
Features of binary rendezvous:
 For the receive command, the sender must be specified. However, multiple receive
commands can exist. A type check on the data is implicitly performed.
 Send and received commands may be individually disabled or enabled. A command is
disabled if it is guarded and the guard evaluates to false. The guard would likely contain
an expression on some local variables.



 Synchronous communication is implemented by scheduling messages under the covers
using asynchronous communication.
 Scheduling involves pairing of matching send and receives commands that are both
enabled. The communication events for the control messages under the covers do not
alter the partial order of the execution.
2.3.2 BINARY RENDEZVOUS ALGORITHM
If multiple interactions are enabled, a process chooses one of them and tries to synchronize with
the partner process. The problem reduces to one of scheduling messages satisfying the following
constraints:
 Schedule on-line, atomically, and in a distributed manner.
 Schedule in a deadlock-free manner (i.e., crown-free).
 Schedule to satisfy the progress property in addition to the safety property.

Fig: Messages used to implement synchronous order. Pi has higher priority than Pj . (a) Pi
issues SEND(M). (b) Pj issues SEND(M).
Key rules to prevent cycles
 To send to a lower priority process, messages M and ack(M) are involved in that order.
The sender issues send(M) and blocks until ack(M) arrives. Thus, when sending to a
lower priority process, the sender blocks waiting for the partner process to synchronize
and send an acknowledgement.
 To send to a higher priority process, messages request(M), permission(M), and M are
involved, in that order. The sender issues send(request(M)), does not block, and awaits
permission. When permission(M) arrives, the sender issues send(M).
Steps in Bagrodia algorithm
1. Receive commands are forever enabled from all processes.
2. A send command, once enabled, remains enabled until it completes, i.e., it is not
possible that a send command gets disabled before the send is executed.
3. To prevent deadlock, process identifiers are used to introduce asymmetry to break
potential crowns that arise.
4. Each process attempts to schedule only one send event at any time.



The message (M) types used are:
i) M
ii) ack(M)
iii) request(M)
iv) permission(M).
Execution events in the synchronous execution are only the send and the receive of the message
M. The send and receive events of the other message types – ack(M), request(M), and
permission(M) – are control events and are not part of the synchronous execution. The messages
request(M), ack(M), and permission(M) use M's unique tag; the message M itself is not included
in these messages.
Bagrodia algorithm:

Fig : Examples showing how to schedule messages sent with synchronous primitives.


In either case, a higher priority process blocks on a lower priority process. So cyclic
waits are avoided.

2.4 GROUP COMMUNICATION


Group communication is done by broadcasting of messages. A message broadcast is the sending
of a message to all members in the distributed system. The communication may be
 Multicast:
o A message is sent to a certain subset or a group.
 Unicasting: A point-to-point message communication.
The network layer protocol cannot provide the following functionalities:
 Application-specific ordering semantics on the order of delivery of messages.
 Adapting groups to dynamically changing membership.
 Sending multicasts to an arbitrary set of processes at each send event.
 Providing various fault-tolerance semantics.
The multicast algorithms can be open or closed group.
Differences between closed and open group algorithms:
 Closed group algorithms: the sender is also one of the receivers in the multicast group. They are specific and easy to implement, but they do not support large systems where client processes have short lives.
 Open group algorithms: the sender is not a part of the communication group. They are more general, but more difficult to design and more expensive; they can support large systems.



2.5 CAUSAL ORDER (CO)
In the context of group communication, there are two modes of communication:
1. Causal order
2. Total order.
Given a system with FIFO channels, causal order needs to be explicitly enforced by a
protocol. The following two criteria must be met by a causal ordering protocol:
1. Safety
2. Liveness
Safety:
In order to prevent causal order from being violated, a message M that arrives at a
process may need to be buffered until all system wide messages sent in the causal past of the
send (M) event to that same destination have already arrived. The arrival of a message is
transparent to the application process. The delivery event corresponds to the receive event in the
execution model.
Liveness:
A message that arrives at a process must eventually be delivered to the process.
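For illustration, the two criteria can be met by buffering arrived messages and testing a vector-clock delivery condition. The sketch below uses the classic Birman–Schiper–Stephenson style test for causal broadcast; it is a standard technique offered as an example, not the Raynal–Schiper–Toueg algorithm presented next:

class CausalReceiver:
    def __init__(self, n):
        self.vc = [0] * n      # vc[j] = number of messages from pj delivered
        self.pending = []      # buffered messages (safety)

    def deliverable(self, sender, vt):
        return (vt[sender] == self.vc[sender] + 1 and
                all(vt[k] <= self.vc[k]
                    for k in range(len(self.vc)) if k != sender))

    def on_arrival(self, sender, vt, msg):
        self.pending.append((sender, vt, msg))
        delivered, progress = [], True
        while progress:        # liveness: deliver all newly enabled messages
            progress = False
            for s, t, m in list(self.pending):
                if self.deliverable(s, t):
                    self.pending.remove((s, t, m))
                    self.vc[s] = t[s]
                    delivered.append(m)
                    progress = True
        return delivered

r = CausalReceiver(2)
print(r.on_arrival(1, [1, 1], "m2"))   # []           -- m1 missing, buffered
print(r.on_arrival(0, [1, 0], "m1"))   # ['m1', 'm2'] -- causal order restored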

2.5.1 THE RAYNAL–SCHIPER–TOUEG ALGORITHM


Each message M should carry a log of all other messages sent causally before M’s send
event, and sent to the same destination dest(M).
The Raynal–Schiper–Toueg algorithm is a canonical algorithm, representative of several
algorithms that reduce the size of the local space and message overhead by various
techniques.
To distribute log information, broadcast and multicast communication is used. The hardware-
assisted or network layer protocol assisted multicast cannot efficiently provide features:
 Application-specific ordering semantics on the order of delivery of messages.
 Adapting groups to dynamically changing membership.
 Sending multicasts to an arbitrary set of processes at each send event.
 Providing various fault-tolerance semantics
An optimal CO algorithm stores in local message logs, and propagates on messages, information
of the form “d is a destination of M” about a message M sent in the causal past, as long as and only
as long as:
 Propagation Constraint I: it is not known that the message M is delivered to d.
 Propagation Constraint II: it is not known that a message has been sent to d in the causal
future of Send(M), and hence it is not guaranteed using a reasoning based on transitivity
that the message M will be delivered to d in CO.


The data structures maintained are sorted row–major and then column–major:
1. Explicit tracking
Tracking of (source, timestamp, destination) information for messages
(i) not known to be delivered
(ii) not guaranteed to be delivered in CO, is performed explicitly.
2. Implicit tracking
Tracking of messages that are either
(i) already delivered,
(ii) guaranteed to be delivered in CO, is performed implicitly.
Information about messages:
(i) not known to be delivered
(ii) not guaranteed to be delivered in CO, is explicitly tracked by the algorithm using (source,
timestamp, destination) information.


Multicasts M5,1 and M4,1


Message M5,1 sent to processes P4 and P6 contains the piggybacked information M5,1.Dests
= {P4, P6}. Additionally, at the send event (5, 1), the information M5,1.Dests = {P4, P6}
is also inserted in the local log Log5. When M5,1 is delivered to P6, the (new) piggybacked
information P4 ∈ M5,1.Dests is stored in Log6 as M5,1.Dests = {P4}; the information P6 ∈
M5,1.Dests, which was needed for routing, must not be stored in Log6 because of constraint I. In
the same way, when M5,1 is delivered to process P4 at event (4, 1), only the new piggybacked
information P6 ∈ M5,1.Dests is inserted in Log4 as M5,1.Dests = {P6}, which is later propagated
during multicast M4,2.
Multicast M4,3
At event (4, 3), the information P6 ∈ M5,1.Dests in Log4 is propagated on multicast M4,3
only to process P6 to ensure causal delivery using the Delivery Condition. The piggybacked
information on message M4,3 sent to process P3 must not contain this information because of
constraint II. As long as any future message sent to P6 is delivered in causal order w.r.t. M4,3
sent to P6, it will also be delivered in causal order w.r.t. M5,1. And as M5,1 is already delivered
to P4, the information M5,1.Dests = ∅ is piggybacked on M4,3 sent to P3. Similarly, the
information P6 ∈ M5,1.Dests must be deleted from Log4 as it will no longer be needed, because
of constraint II. M5,1.Dests = ∅ is stored in Log4 to remember that M5,1 has been delivered or is
guaranteed to be delivered in causal order to all its destinations.

58

Downloaded by AKILESH J S PSGiTECH ([email protected])


lOMoARcPSD|40984340

CS3551 Distributed Computing 2023

Learning implicit information at P2 and P3

When message M4,2 is received by processes P2 and P3, they insert the (new)
piggybacked information in their local logs, as information M5,1.Dests = {P6}. They both
continue to store this in Log2 and Log3 and propagate this information on multicasts until
they learn at events (2, 4) and (3, 2), on receipt of messages M3,3 and M4,3 respectively, that
any future message is expected to be delivered in causal order to process P6 w.r.t. M5,1 sent
to P6. Hence, by constraint II, this information must be deleted from Log2 and Log3.
Processing at P6
When message M5,1 is delivered to P6, only M5,1.Dests = {P4} is added to Log6. Further,
P6 propagates only M5,1.Dests = {P4} on message M6,2, and this conveys the current implicit
information that M5,1 has been delivered to P6, by its very absence in the explicit information.
Processing at P1
When M2,2 arrives carrying the piggybacked information M5,1.Dests = {P6}, this (new)
information is inserted in Log1. When M6,2 arrives with the piggybacked information M5,1.Dests
= {P4}, P1 learns the implicit information that M5,1 has been delivered to P6, by the very absence
of the explicit information P6 ∈ M5,1.Dests in the piggybacked information, and hence marks the
information P6 ∈ M5,1.Dests for deletion from Log1. Simultaneously, M5,1.Dests = {P6} in Log1
implies the implicit information that M5,1 has been delivered or is guaranteed to be delivered in
causal order to P4. Thus, P1 also learns that the explicit piggybacked information M5,1.Dests =
{P4} is outdated. M5,1.Dests in Log1 is set to ∅.

2.6 TOTAL ORDER


Centralized Algorithm for total ordering
Each process sends the message it wants to broadcast to a centralized process, which relays all
the messages it receives to every other process over FIFO channels.
(1) When process Pi wants to multicast a message M to group G:
(1a) send M(i, G) to the central coordinator.
(2) When M(i, G) arrives from Pi at the central coordinator:
(2a) send M(i, G) to all members of the group G.
(3) When M(i, G) arrives at pj from the central coordinator:
(3a) deliver M(i, G) to the application.
Complexity:
Each message transmission takes two message hops and exactly n messages in a system
of n processes.
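A minimal sketch of this centralized scheme (names are illustrative; plain Python lists stand in for the FIFO channels):

class Coordinator:
    def __init__(self):
        self.members = []                 # one FIFO queue per group member

    def join(self):
        q = []                            # stands in for a FIFO channel
        self.members.append(q)
        return q

    def relay(self, sender, msg):         # step (2): relay to all members
        for q in self.members:
            q.append((sender, msg))

coord = Coordinator()
q1, q2 = coord.join(), coord.join()
coord.relay("P1", "M1")                   # step (1): P1 sends M1 to coordinator
coord.relay("P2", "M2")
print(q1 == q2)                           # True: identical delivery order at all members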


Drawbacks:
A centralized algorithm has a single point of failure and congestion, and is not an elegant
solution.
Three phase distributed algorithm
Three phases can be seen in both sender and receiver side.
I. SENDER SIDE
Phase 1
In the first phase, a process multicasts the message M with a locally unique tag and the
local timestamp to the group members.
Phase 2
The sender process awaits a reply from all the group members who respond with a
tentative proposal for a revised timestamp for that message M.
The await call is non-blocking.
Phase 3
The process multicasts the final timestamp to the group.


II. RECEIVER SIDE


Phase 1
The receiver receives the message with a tentative timestamp. It updates the variable
priority that tracks the highest proposed timestamp, then revises the proposed timestamp to the
priority, and places the message with its tag and the revised timestamp at the tail of the queue
temp_Q. In the queue, the entry is marked as undeliverable.
Phase 2
The receiver sends the revised timestamp back to the sender. The receiver then waits in a
non-blocking manner for the final timestamp.
Phase 3
The final timestamp is received from the multicaster. The corresponding message entry
in temp_Q is identified using the tag, and is marked as deliverable after the revised timestamp is
overwritten by the final timestamp.
Complexity
 This algorithm uses three phases, and, to send a message to n − 1 processes, it uses
3(n − 1) messages.
 It incurs a delay of three message hops.
Example:


The main sequence of steps is as follows:


1. A sends a REVISE_TS(7) message, having timestamp 7. B sends a REVISE_TS(9) message,
having timestamp 9.
2. C receives A’s REVISE_TS(7), enters the corresponding message in temp_Q, and marks it as
undeliverable; priority = 7. C then sends PROPOSED_TS(7) message to A.
3. D receives B’s REVISE_TS(9), enters the corresponding message in temp_Q, and marks it as
undeliverable; priority = 9. D then sends PROPOSED_TS(9) message to B.
4. C receives B’s REVISE_TS(9), enters the corresponding message in temp_Q, and marks it as
undeliverable; priority = 9. C then sends PROPOSED_TS(9) message to B.
5. D receives A’s REVISE_TS(7), enters the corresponding message in temp_Q, and marks it as
undeliverable; priority = 10. D assigns a tentative timestamp value of 10, which is greater than
all of the timestamps on REVISE_TSs seen so far, and then sends PROPOSED_TS(10) message
to A.
6. When A receives PROPOSED_TS(7) from C and PROPOSED_TS(10) from D, it computes
the final timestamp as max(7, 10) = 10, and sends FINAL_TS(10) to C and D.
7. When B receives PROPOSED_TS(9) from C and PROPOSED_TS(9) from D, it computes the
final timestamp as max(9, 9) = 9, and sends FINAL_TS(9) to C and D.
8. C receives FINAL_TS(10) from A, updates the corresponding entry in temp_Q with the
timestamp, resorts the queue, and marks the message as deliverable. As the message is not at the
head of the queue, and some entry ahead of it is still undeliverable, the message is not moved to
delivery_Q.
9. D receives FINAL_TS(9) from B, updates the corresponding entry in temp_Q by marking the
corresponding message as deliverable, and resorts the queue. As the message is at the head of the
queue, it is moved to delivery_Q.
10. When C receives FINAL_TS(9) from B, it will update the corresponding entry in temp_Q by
marking the corresponding message as deliverable. As the message is at the head of the queue, it
is moved to the delivery_Q, and the next message (of A), which is also deliverable, is also
moved to the delivery_Q.
11. When D receives FINAL_TS(10) from A, it will update the corresponding entry in temp_Q
by marking the corresponding message as deliverable. As the message is at the head of the
queue, it is moved to the delivery_Q.
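The receiver-side bookkeeping of this example can be sketched as follows (an illustrative simplification; in particular, the priority update rule max(priority + 1, received timestamp) is an assumption that reproduces the proposals 7, 9, 9, and 10 seen above):

class Receiver:
    def __init__(self):
        self.priority = 0          # highest timestamp proposed so far
        self.temp_q = []           # entries: [timestamp, tag, deliverable]
        self.delivery_q = []

    def on_revise_ts(self, tag, ts):
        # Phases 1-2: revise the proposed timestamp and send it back.
        self.priority = max(self.priority + 1, ts)
        self.temp_q.append([self.priority, tag, False])
        return self.priority                     # PROPOSED_TS value

    def on_final_ts(self, tag, final_ts):
        # Phase 3: overwrite with the final timestamp, mark the entry
        # deliverable, resort, and move deliverable head entries out.
        entry = next(e for e in self.temp_q if e[1] == tag)
        entry[0], entry[2] = final_ts, True
        self.temp_q.sort()
        while self.temp_q and self.temp_q[0][2]:
            self.delivery_q.append(self.temp_q.pop(0)[1])

r = Receiver()                 # plays the role of process C above
r.on_revise_ts("A", 7)         # proposes 7 (step 2)
r.on_revise_ts("B", 9)         # proposes 9 (step 4)
r.on_final_ts("A", 10)         # A deliverable but not at the head (step 8)
r.on_final_ts("B", 9)          # B, then A, moved to delivery_Q (step 10)
print(r.delivery_q)            # ['B', 'A']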

GLOBAL STATE AND SNAPSHOT RECORDING ALGORITHMS

A distributed computing system consists of processes that do not share a common
memory and communicate asynchronously with each other by message passing.
 Each component has a local state. The state of a process is its local memory and a
history of its activity.


 The state of a channel is characterized by the set of messages sent along the channel less
the messages received along the channel. The global state of a distributed system is a
collection of the local states of its components.

 If shared memory were available, an up-to-date state of the entire system would be
available to the processes sharing the memory.
 The absence of shared memory necessitates ways of getting a coherent and complete
view of the system based on the local states of individual processes.
 A meaningful global snapshot can be obtained if the components of the distributed
system record their local states at the same time.
2.7 SYSTEM MODEL & ITS DEFINITIONS
The system consists of a collection of n processes, p1, p2,…, pn that are connected by channels.
 Let Cij denote the channel from process pi to process pj.
 Processes and channels have states associated with them.
 The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc., and may be highly dependent on the local context of
the distributed application.
 The state of channel Cij, denoted by SCij, is given by the set of messages in
transit in the channel.
 The events that may happen are: internal event, send (send (mij)) and receive
(rec(mij)) events.
 The occurrences of events cause changes in the process state.
 A channel is a distributed entity and its state depends on the local states of the

processes on which it is incident.


The transit function records the state of the channel Cij:
    transit(LSi, LSj) = { mij | send(mij) ∈ LSi ∧ rec(mij) ∉ LSj }
 In the FIFO model, each channel acts as a first-in first-out message queue and, thus,
message ordering is preserved by a channel.
 In the non-FIFO model, a channel acts like a set in which the sender process adds
messages and the receiver process removes messages from it in a random order.
2.7.1 A CONSISTENT GLOBAL STATE
The global state of a distributed system is a collection of the local states of the processes and the channels. The global state is given by:

GS = { ∪i LSi, ∪i,j SCij }

A global state GS is consistent if it satisfies the following two conditions:
C1: send(mij) ∈ LSi ⇒ mij ∈ SCij ⊕ rec(mij) ∈ LSj (⊕ is the exclusive-or operator)
C2: send(mij) ∉ LSi ⇒ mij ∉ SCij ∧ rec(mij) ∉ LSj
 Condition C1 preserves the law of conservation of messages.
 Condition C2 states that in the collected global state, for every effect, its cause must be present.

 In a consistent global state, every message that is recorded as received is also


recorded as sent. Such a global state captures the notion of causality that message
cannot be received if it was not sent.
 Consistent global states are meaningful global states and inconsistent global states are not meaningful, in the sense that a distributed system can never be in an inconsistent state.
2.7.2 INTERPRETATION OF CUTS
Cuts in a space–time diagram provide a powerful graphical aid in representing and reasoning
about the global states of a computation. A cut is a line joining an arbitrary point on each process
line that slices the space–time diagram into a PAST and a FUTURE.
A consistent global state corresponds to a cut in which every message received in the PAST of
the cut has been sent in the PAST of that cut. Such a cut is known as a consistent cut.
In a consistent snapshot, all the recorded local states of processes are concurrent; that is, the recorded local state of no process causally affects the recorded local state of any other process.
2.7.3 ISSUES IN RECORDING GLOBAL STATE
The non-availability of a global clock in a distributed system raises the following issues:
Issue 1:
How to distinguish between the messages to be recorded in the snapshot from those not to be recorded?
Answer:
Any message that is sent by a process before recording its snapshot must be recorded in the global snapshot (from C1).
Any message that is sent by a process after recording its snapshot must not be recorded in the global snapshot (from C2).
Issue 2:
How to determine the instant when a process takes its snapshot?
Answer:
A process pj must record its snapshot before processing a message mij that was sent by process pi after recording its snapshot.

2.8 SNAPSHOT ALGORITHMS FOR FIFO CHANNELS


Each distributed application has a number of processes running on different physical servers. These processes communicate with each other through messaging channels.
Snapshots are required for:
 Checkpointing
 Collecting garbage
 Detecting deadlocks
 Debugging
2.8.1 Chandy–Lamport algorithm
 The algorithm records a global snapshot consisting of the local state of each process and the state of each channel.
 The Chandy-Lamport algorithm uses a control message, called a marker.
 After a site has recorded its snapshot, it sends a marker along all of its outgoing channels
before sending out any more messages.
 Since channels are FIFO, a marker separates the messages in the channel into those to be
included in the snapshot from those not to be recorded in the snapshot.
 This addresses Issue 1. The role of markers in a FIFO system is to act as delimiters for the messages in the channels so that the channel state recorded by the process at the receiving end of the channel satisfies the condition C2.
Chandy–Lamport algorithm
1. Marker sending rule for process pi
(1) Process pi records its state.
(2) For each outgoing channel C on which a marker has not been sent, pi sends a marker along C before sending further messages along C.
2. Marker receiving rule for process pj
(1) On receiving a marker along channel C:
if pj has not recorded its state then
record the state of C as the empty set
execute the “marker sending rule”
else
record the state of C as the set of messages received along C after pj’s state was recorded and before pj received the marker along C


Initiating a snapshot:
 Process Pi initiates the snapshot
 Pi records its own state and prepares a special marker message.
 Send the marker message to all other processes.
 Start recording all incoming messages from channels Cij for j not equal to i.
Propagating a snapshot
 For all processes Pj, consider a message on channel Ckj.
 If the marker message is seen for the first time:
 Pj records its own state and marks Ckj as empty
 Send the marker message to all other processes.
 Record all incoming messages from channels Clj for l not equal to j or k.
 Else add all messages from inbound channels.
Terminating a snapshot
 All processes have received a marker.
 All processes have received a marker on all the N-1 incoming channels.
 A central server can gather the partial state to build a global snapshot.
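The marker rules above can be summarized in a short sketch. The following Python fragment is a minimal illustration, not a complete implementation: capture_local_state, send, and deliver are placeholder hooks (assumptions), and the underlying transport is assumed to provide FIFO channels.

    class Process:
        # Sketch of the Chandy-Lamport marker rules over FIFO channels.
        def __init__(self, pid, in_channels, out_channels):
            self.pid = pid
            self.state_recorded = False
            self.local_snapshot = None
            self.chan_state = {c: [] for c in in_channels}   # recorded in-transit messages
            self.recording = {c: False for c in in_channels}
            self.out_channels = out_channels

        def record_state(self):
            # Marker sending rule: record local state, then send a marker on
            # every outgoing channel before any further application message.
            self.local_snapshot = self.capture_local_state()
            self.state_recorded = True
            for c in self.out_channels:
                self.send(c, "MARKER")
            for c in self.recording:
                self.recording[c] = True   # start recording all incoming channels

        def on_marker(self, channel):
            # Marker receiving rule.
            if not self.state_recorded:
                self.record_state()
                self.chan_state[channel] = []    # state of this channel = empty set
            self.recording[channel] = False      # stop recording this channel

        def on_app_message(self, channel, msg):
            if self.recording.get(channel):
                self.chan_state[channel].append(msg)  # message in transit during snapshot
            self.deliver(msg)

        # Placeholder hooks; a real system would plug in its own versions.
        def capture_local_state(self): return {"pid": self.pid}
        def send(self, channel, msg): pass
        def deliver(self, msg): pass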
Correctness of the algorithm
 Since a process records its snapshot when it receives the first marker on any incoming
channel, no messages that follow markers on the channels incoming to it are recorded in
the process’s snapshot.
 A process stops recording the state of an incoming channel when a marker is received on
that channel.
 Due to FIFO property of channels, it follows that no message sent after the marker on
that channel is recorded in the channel state. Thus, condition C2 is satisfied.
 When a process pj receives message mij that precedes the marker on channel Cij, it acts
as follows: if process pj has not taken its snapshot yet, then it includes mij in its recorded
snapshot. Otherwise, it records mij in the state of the channel Cij. Thus, condition C1
is satisfied.
Complexity
The recording part of a single instance of the algorithm requires O(e) messages and O(d) time,
where e is the number of edges in the network and d is the diameter of the network.

2.8.2 PROPERTIES OF THE RECORDED GLOBAL STATE


The recorded global state may not correspond to any of the global states that occurred during
the computation. Consider two possible executions of the snapshot
algorithm.


Figure : Timing diagram of two possible executions of the banking example.


1.(Markers shown using dashed-and-dotted arrows.)
Let site S1 initiate the algorithm just after t1. Site S1 records its local state (account A
= $550) and sends a marker to site S2. The marker is received by site S2 after t4. When site S2
receives the marker, it records its local state (account B = $170), the state of channel C12 as $0, and
sends a marker along channel C21. When site S1 receives this marker, it records the state of channel
C21 as $80. The $800 amount in the system is conserved in the recorded global state:
A = $550, B = $170, C12 = $0, C21 = $80
2.(Markers shown using dotted arrows.)
Let site S1 initiate the algorithm just after t0 and before sending the $50 for S2. Site S1
records its local state (account A = $600) and sends a marker to site S2. The marker is received
by site S2 between t2 and t3. When site S2 receives the marker, it records its local state (account
B = $120), the state of channel C12 as $0, and sends a marker along channel C21. When site
S1 receives this marker, it records the state of channel C21 as $80. The $800 amount in the system is conserved in the recorded global state: A = $600, B = $120, C12 = $0, C21 = $80.
In both these possible runs of the algorithm, the recorded global states never occurred in the
execution. This happens because a process can change its state asynchronously before the
markers it sent are received by other sites and the other sites record their states.


PART-A (Possible Questions)


1. What are the types of message orderings?
The message orderings are
(i) Non-FIFO
(ii) FIFO
(iii) Causal order
(iv) Synchronous order
2. What is asynchronous execution?
An asynchronous execution (or A-execution) is an execution (E,≺) for which the causality
relation is a partial order.
3. What is FIFO execution?
A FIFO execution is an A-execution in which, for all (s, r) and (s′, r′) ∈ T, (s ∼ s′ and r ∼ r′ and s ≺ s′) ⟹ r ≺ r′.
4. What is causal ordering?
A CO (causally ordered) execution is an A-execution in which, for all (s, r) and (s′, r′) ∈ T, (r ∼ r′ and s ≺ s′) ⟹ r ≺ r′.

5. Define delivery event.
The delayed message m is then given to the application for processing. The event of an
application processing an arrived message is referred to as a delivery event.
6. Write about RSC.
An A-execution that can be realized under synchronous communication is called realizable with synchronous communication (RSC).
7. Define crown.
Let E be an execution. A crown of size k in E is a sequence <(si, ri), i ∈ {0, …, k−1}> of pairs of corresponding send and receive events such that: s0 ≺ r1, s1 ≺ r2, …, sk−2 ≺ rk−1, sk−1 ≺ r0.
8. Define rendezvous systems.
Rendezvous systems are a form of synchronous communication among an arbitrary
number of asynchronous processes. All the processes involved meet with each other, i.e.,
communicate synchronously with each other at one time.
9. List the types of rendezvous systems.
• Binary rendezvous: When two processes agree to synchronize.
• Multi-way rendezvous: When more than two processes agree to synchronize.
10. Give the features of binary rendezvous.
 For the receive command, the sender must be specified. However, multiple receive commands can exist. A type check on the data is implicitly performed.
 Send and received commands may be individually disabled or enabled. A command
is disabled if it is guarded and the guard evaluates to false. The guard would likely
contain an expression on some local variables.
 Synchronous communication is implemented by scheduling messages under the
covers using asynchronous communication.
 Scheduling involves pairing of matching send and receives commands that are both
enabled. The communication events for the control messages under the covers do
not alter the partial order of the execution.
11. List the steps in Bagrodia algorithm.
1. Receive commands are forever enabled from all processes.
2. A send command, once enabled, remains enabled until it completes, i.e., it is not possible that a send command gets disabled before the send is executed.
3. To prevent deadlock, process identifiers are used to introduce asymmetry to break
potential crowns that arise.
4. Each process attempts to schedule only one send event at any time.
12. What functionalities does the network layer protocol not provide?
The network layer protocol cannot provide the following functionalities:
 Application-specific ordering semantics on the order of delivery of messages.
 Adapting groups to dynamically changing membership.
 Sending multicasts to an arbitrary set of processes at each send event.
 Providing various fault-tolerance semantics.
The multicast algorithms can be open or closed group.
13. Differentiate between closed and open group algorithms.

Closed group algorithms:
 If the sender is also one of the receivers in the multicast algorithm, then it is a closed group algorithm.
 They are specific and easy to implement.
 They do not support large systems where client processes have short lives.
Open group algorithms:
 If the sender is not a part of the communication group, then it is an open group algorithm.
 They are more general, but difficult to design and expensive.
 They can support large systems.
14. What is the purpose of Raynal-Schiper- Toueg algorithm?
The Raynal–Schiper–Toueg algorithm is a canonical algorithm, representative of several algorithms that reduce the size of the local space and message space overhead by various techniques.
15. Define total ordering.
For each pair of processes Pi and Pj and for each pair of messages Mx and My that are delivered to both the processes, Pi is delivered Mx before My if and only if Pj is delivered Mx before My.
16. Mention the complexity and limitation of total order.
Complexity: Each message transmission takes two message hops and exactly n messages
in a system of n processes.
Drawbacks: A centralized algorithm has a single point of failure and congestion, and is
not an elegant solution.
17. Define channel.
A channel is a distributed entity and its state depends on the local states of the processes
on which it is incident.

18. What is transit?


Transit: transit(LSi, LSj) = {mij | send(mij) ∈ LSi ∧ rec(mij) ∉ LSj}
19. Give the use of transit function.
The transit function records the state of the channel Cij.
20. Define law of conservation of messages.
Every message mij that is recorded as sent in the local state of a process pi must be captured in the state of the channel Cij or in the collected local state of the receiver process pj.
21. Define cuts.
Cuts in a space–time diagram provide a powerful graphical aid in representing and
reasoning about the global states of a computation. A cut is a line joining an arbitrary point on
each process line that slices the space–time diagram into a PAST and a FUTURE.
22. Define consistent global state.
A consistent global state corresponds to a cut in which every message received in the
PAST of the cut has been sent in the PAST of that cut. Such a cut is known as a consistent cut.
23. What is consistent snapshot?
In a consistent snapshot, all the recorded local states of processes are concurrent; that is,
the recorded local state of no process casually affects the recorded local state of any other
process.
24. How to terminate a snapshot?
• All processes have received a marker.
• All processes have received a marker on all the N-1 incoming channels.
• A central server can gather the partial state to build a global snapshot.


PART-B (Possible Questions)

1. Explain about Message Ordering Paradigms.


2. Compare and contrast Asynchronous execution with synchronous communication.
3. Describe about simulations.
4. How to implement synchronous program order on an asynchronous system.
5. Explain about group communication.
6. Elaborate causal ordering.
7. Compare implicit and explicit tracking.
8. Describe total ordering of events.
9. Brief about Global state and snapshot recording algorithms.
10. Explain in detail about Chandy–Lamport algorithm with properties of recorded state.

PART-C (Possible Questions)


1. (i) Formulate the Message ordering paradigms.(10)
(ii)Design the Global state and snapshot recording algorithms (5)
2. (i) Describe the Asynchronous execution with synchronous communication in
distributed system (8)
(ii) Describe Physical clock synchronization and its applications.(7)
3. (i) Briefly describe the idea behind Causal order (CO) and Total order (8)
(ii) Describe the Primitives for distributed communication. (7)
4. Explain in detail about Models of communication networks (15)
5. With an example, explain about Snapshot algorithms for FIFO channels (15)
6. Prepare summary report on Group communication


UNIT III
DISTRIBUTED MUTEX & DEADLOCK

3.1 INTRODUCTION
In distributed systems, a process may request resources in any order, which may not be
known a priori, and a process can request a resource while holding others. If the allocation
sequence of process resources is not controlled in such environments, deadlocks can occur.
Deadlocks can be dealt with using the following three strategies:
 Deadlock prevention
 Deadlock avoidance
 Deadlock detection
Deadlock prevention:
Deadlock prevention is achieved by either having a process acquire all the needed
resources simultaneously before it begins execution or by pre-empting a process that holds the
needed resource.
Deadlock avoidance:
Deadlock avoidance means a resource is granted to a process if the resulting global system state is safe.
Deadlock detection:
Deadlock detection requires an examination of the status of the process–resources
interaction for the presence of a deadlock condition. To resolve the deadlock, abort a deadlocked
process.
 Mutual exclusion is introduced to prevent race conditions.

Mutual exclusion in a distributed system states that only one process is allowed to
execute the critical section (CS) at any given time.

Critical section:
A critical section is a segment of code that cannot be executed by more than one process at a time.
Three approaches for implementing distributed mutual exclusion:
 Token-based approach
 Non-token-based approach
 Quorum-based approach
1. Token-based approach:
 A unique token is shared among all the sites.
 If a site possesses the unique token, it is allowed to enter its critical section.
 This approach uses sequence number to order requests for the critical section.


 Sequence number is used to distinguish old and current requests.


 Eg: Suzuki-Kasami’s Broadcast Algorithm.

2. Non-token-based approach:
 A site communicates with other sites in order to determine which sites should
execute critical section next. This requires exchange of two or more
successive round of messages among sites.
 This approach use timestamps instead of sequence number to order requests for
the critical section.
 Whenever a site make request for critical section, it gets a timestamp.
Timestamp is also used to resolve any conflict between critical section
requests.
 Eg: Lamport's algorithm, Ricart–Agrawala algorithm.
3. Quorum-based approach:
 Instead of requesting permission to execute the critical section from all other
sites, each site requests only a subset of sites which is called a quorum.
 Any two subsets of sites or Quorum contain a common site.
 This common site is responsible to ensure mutual exclusion.
 Eg: Maekawa’s Algorithm.

3.2 PRELIMINARIES
3.2.1 SYSTEM MODEL
 The system consists of N sites, S1, S2, S3, …, SN.
 Assume that a single process is running on each site.
 The process at site Si is denoted by pi. All these processes communicate
asynchronously over an underlying communication network.
 A process wishing to enter the CS requests all other or a subset of processes
by sending REQUEST messages, and waits for appropriate replies before
entering the CS.
 While waiting the process is not allowed to make further requests to enter the
CS.
 A site can be in one of the following three states:
 requesting the CS
 executing the CS
 neither requesting nor executing the CS.
 In the requesting state, the site is blocked and cannot make further requests
for the CS.


 In the executing state, the site is executing in the CS.


 In the idle state, the site is executing outside the CS.
 N denotes the number of processes or sites involved in invoking the critical
section, T denotes the average message delay, and E denotes the average
critical section execution time.
Requirements of mutual exclusion algorithms:
A mutual exclusion algorithm should satisfy the following properties:
 Safety property
 Liveness property
 Fairness
Safety property:
 Safety property states that at any instant, only one process can execute the critical
section. This is an essential property of a mutual exclusion algorithm.
Liveness property:
 This property states the absence of deadlock and starvation. Two or more sites
should not endlessly wait for messages that will never arrive.
Fairness:
 Fairness in the context of mutual exclusion means that each process gets a fair
chance to execute the CS.
 The fairness property means that the CS execution requests are executed in order
of their arrival in the system.

3.2.2 PERFORMANCE METRICS
 Message complexity: This is the number of messages that are required
per CS execution by a site.
 Synchronization delay: After a site leaves the CS, it is the time required
and before the next site enters the CS.
 Response time: This is the time interval a request waits for its CS
execution to be over after its request messages have been sent out.
 System throughput: This is the rate at which the system executes requests for the CS. If SD is the synchronization delay and E is the average critical section execution time, then
system throughput = 1 / (SD + E)


3.2.3 BEST AND WORST CASE PERFORMANCE


 The best value of the response time is a round trip message delay plus the CS execution
time, 2T +E.
 The best and worst values of the response time are achieved when load is low and high, respectively.

3.3 LAMPORT’S ALGORITHM


Lamport developed a distributed mutual exclusion algorithm. It follows the permission-based approach. Lamport's algorithm executes CS requests in the increasing order of timestamps, i.e., a request with a smaller timestamp will be given permission to execute the critical section before a request with a larger timestamp.
This algorithm requires communication channels to deliver messages in FIFO order.
Types of messages:
1. REQUEST
2. REPLY
3. RELEASE
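Before walking through the figures, a compact sketch may help. The following Python fragment is illustrative only (the class LamportMutex and its transport stubs broadcast/send are assumptions): requests are queued by (timestamp, site id), a site enters the CS when its own request heads the queue and all REPLYs have arrived, and a request is dequeued on RELEASE.

    import heapq

    class LamportMutex:
        # Sketch of Lamport's mutual exclusion rules; channels must be FIFO.
        def __init__(self, site_id, n_sites):
            self.id, self.n = site_id, n_sites
            self.clock = 0
            self.request_q = []   # priority queue of (timestamp, site_id)
            self.replies = set()

        def request_cs(self):
            self.clock += 1
            heapq.heappush(self.request_q, (self.clock, self.id))
            self.broadcast(("REQUEST", self.clock, self.id))

        def on_request(self, ts, site):
            self.clock = max(self.clock, ts) + 1
            heapq.heappush(self.request_q, (ts, site))
            self.send(site, ("REPLY", self.clock, self.id))

        def on_reply(self, site):
            self.replies.add(site)

        def can_enter_cs(self):
            # Own request at the head of the queue + REPLY from everyone else.
            return (self.request_q and self.request_q[0][1] == self.id
                    and len(self.replies) == self.n - 1)

        def release_cs(self):
            heapq.heappop(self.request_q)   # remove own request from the head
            self.replies.clear()
            self.broadcast(("RELEASE", self.id))

        def on_release(self, site):
            # Remove the releasing site's request and restore the heap order.
            self.request_q = [r for r in self.request_q if r[1] != site]
            heapq.heapify(self.request_q)

        def broadcast(self, msg): pass   # placeholder transport
        def send(self, site, msg): pass  # placeholder transport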


 In fig.9.3. Sites S1 and S2 are making requests for the CS and send out
REQUEST messages to other sites. The timestamps of the requests are (1,1) and (1,2),
respectively.
 In fig.9.4. Both the sites S1 and S2 have received REPLY messages from all other
sites. S1 has its request at the top of its request_queue but site S2 does not have its
request at the top of its request_queue. Consequently, site S1 enters the CS.
 In fig.9.5. S1 exits the CS and sends RELEASE messages to all other sites.
 In fig.9.6. Site S2 has received REPLY from all other sites and also received a RELEASE message from site S1. Site S2 updates its request_queue and its request is now at the top of its request_queue. Consequently, it enters the CS next.


Example: [operation of Lamport’s algorithm]


Properties satisfied by the Lamport’s Algorithm:


 Lamport’s algorithm achieves mutual exclusion.
 Lamport’s algorithm is fair.
Performance:
 Synchronization delay is equal to maximum message transmission time. It
requires 3(N – 1) messages per CS execution.
 The synchronization delay in the algorithm is T.
 Algorithm can be optimized to 2(N – 1) messages by omitting the REPLY
message in some situations.
Drawbacks of Lamport’s Algorithm:
 Unreliable approach: failure of any one of the processes will halt the progress of
entire system.
 High message complexity: Algorithm requires 3(N-1) messages per critical section
invocation.

3.4 RICART–AGRAWALA ALGORITHM


Ricart–Agrawala algorithm is an algorithm for mutual exclusion proposed by Glenn
Ricart and Ashok Agrawala. This algorithm is an extension and optimization of Lamport’s
Distributed Mutual Exclusion Algorithm.
 It follows permission based approach to ensure mutual exclusion.
 This algorithm requires communication channels to deliver messages in FIFO
order.
Types of messages:
1. REQUEST
2. REPLY
A site send a REQUEST message to all other site to get their permission to enter critical
section. A site send a REPLY message to other site to give its permission to enter the critical
section.
 A timestamp is given to each critical section request using Lamport’s logical clock.
 Timestamp is used to determine priority of critical section requests.
 Smaller timestamp gets high priority over larger timestamp.
 The execution of critical section requests is always in the order of their timestamps.
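The deferred-reply behaviour that distinguishes this algorithm from Lamport's can be sketched as follows (illustrative Python; the class RicartAgrawala and the transport stubs are assumptions, not a definitive implementation):

    class RicartAgrawala:
        # Sketch of the Ricart-Agrawala rules.
        def __init__(self, site_id, n_sites):
            self.id, self.n = site_id, n_sites
            self.clock = 0
            self.requesting = False
            self.my_ts = None
            self.replies = 0
            self.deferred = []            # sites whose REPLY we postpone

        def request_cs(self):
            self.clock += 1
            self.my_ts, self.requesting, self.replies = self.clock, True, 0
            self.broadcast(("REQUEST", self.my_ts, self.id))

        def on_request(self, ts, site):
            self.clock = max(self.clock, ts) + 1
            # Defer the reply while we hold a higher-priority request
            # (smaller timestamp, ties broken by site id).
            if self.requesting and (self.my_ts, self.id) < (ts, site):
                self.deferred.append(site)
            else:
                self.send(site, ("REPLY", self.id))

        def on_reply(self):
            self.replies += 1
            return self.replies == self.n - 1   # True: may enter the CS

        def release_cs(self):
            self.requesting = False
            for site in self.deferred:          # answer all deferred requests
                self.send(site, ("REPLY", self.id))
            self.deferred.clear()

        def broadcast(self, msg): pass          # placeholder transport
        def send(self, site, msg): pass         # placeholder transport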


Example: [operation of Ricart–Agrawala algorithm]


 In Fig. 9.7, Sites S1 and S2 are each making requests for the CS and sending out
REQUEST messages to other sites. The timestamps of the requests are (2,1) and (1,2),
respectively.
 In Fig. 9.8, S2 has received REPLY messages from all other sites and, consequently,
enters the CS.
 In Fig. 9.9, S2 exits the CS and sends a REPLY message to site S1.
 In Fig. 9.10, site S1 has received REPLY from all other sites and enters the CS next.

Properties satisfied by the Ricart–Agrawala Algorithm:


 Ricart–Agrawala algorithm achieves mutual exclusion.
Performance:
 Synchronization delay is equal to maximum message transmission time It requires 2(N-1)
messages per Critical section execution.
 The synchronization delay in the algorithm is T.
Drawbacks of Ricart–Agrawala algorithm:
 Unreliable approach: failure of any one of node in the system can halt the progress
of the system. In this situation, the process will starve forever.

3.5 MAEKAWA‘s ALGORITHM


Maekawa's algorithm was the first quorum-based approach. In permission-based algorithms like Lamport's algorithm and the Ricart–Agrawala algorithm, a site requests permission from every other site.
But in the quorum-based approach, a site does not request permission from every other site, but only from a subset of sites called a quorum.
Types of messages:
1. REQUEST
2. REPLY
3. RELEASE



 A site sends a REQUEST message to all other sites in its request set or quorum to get their permission to enter the critical section.
 A site sends a REPLY message to a requesting site to give its permission to enter the critical section.
 A site sends a RELEASE message to all other sites in its request set or quorum upon exiting the critical section.

Conditions to be satisfied for Maekawa's algorithm:
The request sets Ri are constructed to satisfy the following conditions:
M1: (∀i ∀j : i ≠ j, 1 ≤ i, j ≤ N :: Ri ∩ Rj ≠ ∅)
M2: (∀i : 1 ≤ i ≤ N :: Si ∈ Ri)
M3: (∀i : 1 ≤ i ≤ N :: |Ri| = K)
M4: Any site Sj is contained in K number of Ri's, 1 ≤ i, j ≤ N.
 Conditions M1 and M2 are necessary for correctness: whenever two sites make concurrent requests, at least one common site receives both requests and grants permission to only one of them.
 Condition M3 states that all sites should have to do an equal amount of work to invoke mutual exclusion.
 Condition M4 states that all sites have equal responsibility in granting permission to other sites.
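As a concrete (if non-optimal) way to satisfy M1 and M2, the following Python sketch builds grid quorums: sites are arranged in a √N x √N grid and each site's quorum is its row plus its column, so any two quorums intersect. This is an illustration only, not Maekawa's original construction (which uses finite projective planes to obtain quorums of size roughly √N):

    import math

    def grid_quorums(n):
        # Arrange n sites (n a perfect square) in a k x k grid; the quorum of
        # site i is its entire row plus its entire column.
        k = math.isqrt(n)
        assert k * k == n, "this demo assumes n is a perfect square"
        quorums = []
        for i in range(n):
            row, col = divmod(i, k)
            q = {row * k + c for c in range(k)} | {r * k + col for r in range(k)}
            quorums.append(q)
        return quorums

    qs = grid_quorums(9)
    assert all(qi & qj for qi in qs for qj in qs)   # M1: every pair intersects
    assert all(i in qs[i] for i in range(9))        # M2: each site is in its own quorum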

Properties satisfied by the Maekawa’s algorithm:


 Maekawa’s algorithm achieves mutual exclusion.
Performance:
 Synchronization delay is equal to twice the message propagation delay time. It requires 3√N messages per critical section execution.
 Synchronization delay in this algorithm is 2T.
Drawbacks of Maekawa’s Algorithm:
 This algorithm is deadlock prone because a site is exclusively locked by other
sites and requests are not prioritized by their timestamp.


3.6 SUZUKI–KASAMI‘S BROADCAST ALGORITHM:


Suzuki–Kasami algorithm is a token-based algorithm for achieving mutual exclusion in
distributed systems. This is modification of Ricart–Agrawala algorithm, a permission based
(Non-token based) algorithm which uses REQUEST and REPLY messages to ensure mutual
exclusion. In token-based algorithms, A site is allowed to enter its critical section if it possesses
the unique token.
 Non-token based algorithms uses timestamp to order requests for the critical
section where as sequence number is used in token based algorithms.
 Each requests for critical section contains a sequence number. This sequence
number is used to distinguish old and current requests.
 Suzuki–Kasami's algorithm is not symmetric because a site retains the token even if it does not have a request for the CS.
Two design issues that must be efficiently addressed:
1. How to distinguish an outdated REQUEST message from a current REQUEST message.
2. How to determine which site has an outstanding request for the CS.
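Both issues are handled with a local array RN (largest request number heard from each site) and an array LN carried in the token (request number of each site's last-served request). The following Python fragment is a simplified sketch (class and method names are assumptions, the transport is a placeholder, and the check that the holder is outside the CS is folded into on_request for brevity):

    class SuzukiKasami:
        # Sketch of Suzuki-Kasami token passing.
        def __init__(self, site_id, n_sites, has_token=False):
            self.id, self.n = site_id, n_sites
            self.rn = [0] * n_sites   # RN[j]: largest request number heard from j
            self.in_cs = False
            # The token carries LN[j] (last served request of j) and a FIFO queue.
            self.token = {"ln": [0] * n_sites, "queue": []} if has_token else None

        def request_cs(self):
            if self.token is None:
                self.rn[self.id] += 1
                self.broadcast(("REQUEST", self.id, self.rn[self.id]))

        def on_request(self, site, seq):
            # Issue 1: an outdated REQUEST has seq <= RN[site], so max() ignores it.
            self.rn[site] = max(self.rn[site], seq)
            # Issue 2: site j has an outstanding request iff RN[j] == LN[j] + 1.
            if (self.token and not self.in_cs
                    and self.rn[site] == self.token["ln"][site] + 1):
                self.pass_token(site)

        def release_cs(self):
            tok = self.token
            tok["ln"][self.id] = self.rn[self.id]      # own request is now served
            for j in range(self.n):                    # enqueue waiting sites
                if j not in tok["queue"] and self.rn[j] == tok["ln"][j] + 1:
                    tok["queue"].append(j)
            if tok["queue"]:
                self.pass_token(tok["queue"].pop(0))

        def pass_token(self, site):
            self.token = None   # placeholder: would ship the token to `site`
        def broadcast(self, msg): pass   # placeholder transport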


Properties satisfied by the Suzuki–Kasami’s Algorithm:


 Suzuki–Kasami’s Algorithm achieves mutual exclusion.
Performance:
 Synchronization delay is 0 and no message is needed if the site holds the idle
token at the time of its request.
 In case site does not hold the idle token, the maximum synchronization delay is
equal to maximum message transmission time and a maximum of N message is
required per critical section invocation.
 The synchronization delay in this algorithm is 0 or T.
Drawbacks of Suzuki–Kasami Algorithm:
 Non-symmetric algorithm: a site retains the token even if it has not requested the critical section.


DEADLOCK DETECTION IN DISTRIBUTED SYSTEMS

3.7 INTRODUCTION:
Deadlocks are a fundamental problem in distributed systems. A deadlock can be defined
as a condition where a set of processes request resources that are held by other processes
in the set.
Deadlocks can be dealt with using any one of the following three strategies:
 Deadlock prevention
 Deadlock avoidance
 Deadlock detection.
Deadlock prevention:
 Deadlock prevention is commonly achieved by either having a process acquire all the
needed resources simultaneously before it begins execution or by pre-empting a
process that holds the needed resource.
Deadlock avoidance:
 In deadlock avoidance, a resource is granted to a process if the resulting global system state is safe.
Deadlock detection:
Deadlock detection requires an examination of the status of the process–resources
interaction for the presence of a deadlock condition.
3.8 SYSTEM MODEL
 The system can be modeled as a directed graph in which vertices represent the
processes and edges represent unidirectional communication channels.
 A process can be in two states, running or blocked. In the running state (also called
active state), a process has all the needed resources and is either executing or is ready
for execution. In the blocked state, a process is waiting to acquire some resource.

 Deadlock can neither be prevented nor avoided in distributed system as the system is so
vast that it is impossible to do so. Therefore, only deadlock detection can be
implemented.
Requirements of deadlock detection:
 Progress: The method should be able to detect all the deadlocks in the system.
 Safety: The method should not detect false or phantom deadlocks.
Three approaches:
Centralized approach:
 Here there is only one responsible resource to detect deadlock.


 The advantage of this approach is that it is simple and easy to implement.


 Drawbacks include excessive workload at one node and a single point of failure, which in turn makes the system less reliable.
Distributed approach:
 In the distributed approach different nodes work together to detect
deadlocks. No single point failure as workload is equally divided among all
nodes.
 The speed of deadlock detection also increases.
Hierarchical approach:
 It is the combination of both centralized and distributed approaches of
deadlock detection in a distributed system.
 In this approach, some selected nodes or cluster of nodes are responsible for
deadlock detection and these selected nodes are controlled by a single node.
Wait-for graph (WFG)
This is used for deadlock detection. A graph is drawn based on the request and acquirement of resources. If the graph has a closed loop or a cycle, then there is a deadlock.
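As a sketch of how a collected WFG would be examined, the following Python function (illustrative; the dictionary representation of the graph is an assumption) performs a depth-first search for a cycle, which signals a deadlock in the single-resource and AND models:

    def has_deadlock(wfg):
        # wfg maps each process to the set of processes it waits for.
        WHITE, GREY, BLACK = 0, 1, 2
        color = {}

        def visit(p):
            color[p] = GREY
            for q in wfg.get(p, ()):
                c = color.get(q, WHITE)
                if c == GREY or (c == WHITE and visit(q)):
                    return True               # back edge found: cycle => deadlock
            color[p] = BLACK
            return False

        return any(color.get(p, WHITE) == WHITE and visit(p) for p in wfg)

    # P1 waits on P2, P2 on P3, P3 on P1: a cycle, hence a deadlock.
    assert has_deadlock({"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}})
    assert not has_deadlock({"P1": {"P2"}, "P2": {"P3"}})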


 Figure 10.1 shows a WFG, where process P11 of site 1 has an edge to process P21 of site
1 and an edge to process P32 of site 2. Process P32 of site 2 is waiting for a resource that is
currently held by process P33 of site 3.
 At the same time process P21 at site 1 is waiting on process P24 at site 4 to release a
resource, and so on. If P33 starts waiting on process P24, then processes in the WFG are
involved in a deadlock depending upon the request model.
3.9 PRELIMINARIES
3.9.1 DEADLOCK HANDLING STRATEGIES
Three strategies:
 Deadlock prevention
 Deadlock avoidance
 Deadlock detection.
Deadlock prevention:
 Deadlock prevention is commonly achieved by either having a process acquire all the
needed resources simultaneously before it begins execution or by pre-empting a process that
holds the needed resource.
Deadlock avoidance:
 In deadlock avoidance, a resource is granted to a process if the resulting global system is
safe.
Deadlock detection:
 Deadlock detection requires an examination of the status of the process–resources
interaction for the presence of a deadlock condition.
3.9.2 ISSUES IN DEADLOCK DETECTION
Deadlock handling faces two major issues
1. Detection of existing deadlocks
2. Resolution of detected deadlocks
I. Detection of existing deadlocks
 Detection of deadlocks involves addressing two issues namely maintenance of the
WFG and searching of the WFG for the presence of cycles or knots.
 In distributed systems, a cycle or knot may involve several sites; the search for
cycles greatly depends upon how the WFG of the system is represented across the
system.
 Depending upon the way WFG information is maintained and the search for cycles
is carried out, there are centralized, distributed, and hierarchical algorithms for
deadlock detection in distributed systems.
 A deadlock detection algorithm must satisfy the following two conditions:
a) Progress-No undetected deadlocks:


 The algorithm must detect all existing deadlocks in finite time.


b) Safety - No false deadlocks:
 The algorithm should not report deadlocks which do not exist. These are also called phantom or false deadlocks.

II. Resolution of detected deadlocks:


 Deadlock resolution involves breaking existing wait-for dependencies between the
processes to resolve the deadlock.
 It involves rolling back one or more deadlocked processes and assigning their
resources to blocked processes so that they can resume execution.
 The deadlock detection algorithms propagate information regarding wait-for
dependencies along the edges of the wait-for graph.
 When a wait-for dependency is broken, the corresponding information should be
immediately cleaned from the system.
 If this information is not cleaned in a timely manner, it may result in detection of
phantom deadlocks.

3.10 MODELS OF DEADLOCKS


3.10.1 SINGLE RESOURCE MODEL
A process can have at most one outstanding request for only one unit of a resource. The
maximum out-degree of a node in a WFG for the single resource model can be 1, the presence of
a cycle in the WFG shall indicate that there is a deadlock.

Fig 3.10: Deadlock in single resource model


3.10.2 AND MODEL
In the AND model, a process can request more than one resource simultaneously and the
request is satisfied only after all the requested resources are granted to the process. The requested
resources may exist at different locations.
 The out degree of a node in the WFG for AND model can be more than 1.
 The presence of a cycle in the WFG indicates a deadlock in the AND model.
 Each node of the WFG in such a model is called an AND node.
 In the AND model, if a cycle is detected in the WFG, it implies a deadlock but not vice versa.


That is, even if a process is not a part of a cycle, it can still be deadlocked.

 In Fig. 10.1. Process P11 has two outstanding resource requests. In case of the
AND model, P11 shall become active from idle state only after both the resources
are granted. There is a cycle P11→P21→P24→P54→P11, which corresponds to
a deadlock situation.
 Consider process P44 in Figure 10.1. It is not a part of any cycle but is still
deadlocked as it is dependent on P24.
3.10.3 OR MODEL
In OR model, a process can make a request for numerous resources simultaneously and the
request is satisfied if any one of the requested resources is granted. The requested resources may
exist at different locations.
 If all requests in the WFG are OR requests, then the nodes are called OR nodes.
 Presence of a cycle in the WFG of an OR model does not imply a deadlock in the OR model.
 In the OR model, the presence of a knot indicates a deadlock.

Deadlock in OR model: a process Pi is blocked if it has a pending OR request to


be satisfied.
 With every blocked process, there is an associated set of processes called dependent set.
 A process shall move from an idle to an active state on receiving a grant message from
any of the processes in its dependent set.
 A process is permanently blocked if it never receives a grant message from any of the
processes in its dependent set.
 A set of processes S is deadlocked if all the processes in S are permanently blocked.


 In short, a process is deadlocked or permanently blocked, if the following conditions are


met:

1. Each of the process is the set S is blocked.


2. The dependent set for each process in S is a subset of S.
3. No grant message is in transit between any two processes in set S.
 A blocked process P is the set S becomes active only after receiving a grant message
from a process in its dependent set, which is a subset of S.
 Consider Figure 10.1. If all nodes are OR nodes, then process P11 is not deadlocked because once process P33 releases its resources, P32 shall become active as one of its requests is satisfied. After P32 finishes execution and releases its resources, process P11 can continue with its processing.
 However, P44 can be deadlocked even though it is not in a knot. In an OR model, a blocked process P is deadlocked if it is either in a knot or it can only reach processes on a knot.
3.10.4 THE AND-OR MODEL
 A generalization of the previous two models (OR model and AND model) is the AND-OR model. In the AND-OR model, a request may specify any combination of AND and OR in the resource request. For example, in the AND-OR model, a request for multiple resources can be of the form x and (y or z).
 The requested resources may exist at different locations.
 Deadlock can be detected by repeated application of the test for OR-model deadlock.

3.10.5 THE P-OUT-OF-Q MODEL
This is a variation of the AND-OR model. It allows a request to obtain any k available resources from a pool of n resources. Both models have the same expressive power.
 This favours a more compact formulation of a request.
 Every request in this model can be expressed in the AND-OR model and vice versa.
 Note that an AND request for p resources can be stated as a p-out-of-p request, and an OR request for p resources can be stated as a 1-out-of-p request.
3.10.6 UNRESTRICTED MODEL
 No assumptions are made regarding the underlying structure of resource requests.
 In this model, only one assumption that the deadlock is stable is made and hence it is the
most general model.

3.11 KNAPP‘S CLASSIFICATION


There are four classes of distributed deadlock detection algorithm. They are:
1. Path-pushing


2. Edge-chasing
3. Diffusion computation
4. Global state detection

1. Path Pushing algorithms:


In path-pushing algorithms, distributed deadlocks are detected by maintaining an explicit global wait-for graph. The basic idea is to build a global WFG (Wait-For Graph) for each site of the distributed system.
 At each site whenever deadlock computation is performed, it sends its local WFG
to all the neighbouring sites.
 After the local data structure of each site is updated, this updated WFG is then
passed along to other sites, and the procedure is repeated until some site has a
sufficiently complete picture of the global state to announce deadlock or to
establish that no deadlocks are present.This feature of sending around the paths of
global WFGhas led to the term path- pushing algorithms.
Examples:
 Menasce-Muntz
 Gligor and Shattuck
 Ho and Ramamoorthy
 Obermarck
2. Edge Chasing Algorithms:
The presence of a cycle in a distributed graph structure is verified by propagating special messages called probes along the edges of the graph. These probe messages are different from the request and reply messages. The formation of a cycle can be detected by a site if it receives the matching probe sent by it previously.
 Whenever a process that is executing receives a probe message, it discards this
message and continues.
 Only blocked processes propagate probe messages along their outgoing edges.
 The main advantage of edge-chasing algorithms is that probes are fixed-size messages which are normally very short.
Examples:
 Chandy et al.
 Kshemkalyani–Singhal
 Sinha–Natarajan algorithms.
3. Diffusing Computation Based Algorithms:
In diffusion computation based distributed deadlock detection algorithms, deadlock detection computation is diffused through the WFG of the system. These algorithms make use of echo algorithms to detect deadlocks. This computation is superimposed on the underlying distributed computation.


o If this computation terminates, the initiator declares a deadlock.


o To detect a deadlock, a process sends out query messages along all
the outgoing edges in the WFG.
o These queries are successively propagated (i.e., diffused) through
the edges of the WFG.
o When a blocked process receives the first query message for a particular deadlock detection initiation, it does not send a reply message until it has received a reply message for every query it sent.
o For all subsequent queries for this deadlock detection initiation, it immediately
sends back a reply message.
o The initiator of a deadlock detection detects a deadlock when it receives reply for
every query it had sent out.

Examples:
 Chandy–Misra–Haas algorithm for the OR model
 Chandy– Herman algorithm
4. Global state detection-based algorithms
Global state detection based deadlock detection algorithms exploit the following
facts:
 A consistent snapshot of a distributed system can be obtained without
freezing the underlying computation.
 If a stable property holds in the system before the snapshot collection is
initiated, this property will still hold in the snapshot.
 Therefore, distributed deadlocks can be detected by taking a snapshot
of the system and examining it for the condition of a deadlock.

3.12 ALGORITHM FOR THE SINGLE RESOURCE MODEL [MITCHELL AND MERRITT'S ALGORITHM]
 This deadlock detection algorithm assumes a single resource model.
 It detects both local and global deadlocks. Each process maintains two labels, namely private and public; each label contains the process id. The labelling scheme guarantees that only one process will detect a deadlock.
 Probes are sent in the opposite direction to the edges of the WFG.
 When a probe initiated by a process comes back to it, the process declares deadlock.



Features:
1. Only one process in a cycle detects the deadlock. This simplifies the deadlock
resolution – this process can abort itself to resolve the deadlock. This algorithm can be
improvised by including priorities, and the lowest priority process in a cycle detects
deadlock and aborts.
2. In this algorithm, a process that is detected in deadlock is aborted spontaneously,
even though under this assumption phantom deadlocks cannot be excluded. It can be
shown, however, that only genuine deadlocks will be detected in the absence of
spontaneous aborts.
Each node of the WFG has two local variables, called labels:
1. a private label, which is unique to the node at all times, though it is not constant.
2. a public label, which can be read by other processes and which may not be unique.
Each process is represented as u/v, where u and v are the public and private labels, respectively.
Initially, private and public labels are equal for each process. A global WFG is maintained and it
defines the entire state of the system.

Fig 3.12: Four possible state transitions



 The algorithm is defined by the four state transitions shown in Fig. 3.12, where z = inc(u, v), and inc(u, v) yields a unique label greater than both u and v; labels that are not shown do not change.

 The transitions defined by the algorithm are block, activate, transmit, and detect.
 Block creates an edge in the WFG.
 Two messages are needed, one resource request and one message back to the
blocked process to inform it of the public label of the process it is waiting for.
 Activate denotes that a process has acquired the resource from the process it was
waiting for.
 Transmit propagates larger labels in the opposite direction of the edges by sending
a probe message.
 Detect means that the probe with the private label of some process has returned to
it, indicating a deadlock.
 This algorithm can easily be extended to include priorities, so that whenever a
deadlock occurs, the lowest priority process gets aborted.
 This priority based algorithm has two phases.
1. The first phase is almost identical to the algorithm.
2. In the second phase, the smallest priority is propagated around the cycle. The propagation stops when one process recognizes the propagated priority as its own.
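The four transitions can be sketched as label manipulations. The following Python fragment is illustrative only (the Node class and the inc implementation are assumptions; real probes are messages, not direct reads of another node's label):

    import itertools
    _fresh = itertools.count(1)

    def inc(u, v):
        # Yields a label strictly greater than both u and v, unique system-wide.
        return max(u, v) + next(_fresh)

    class Node:
        def __init__(self):
            self.public = self.private = next(_fresh)  # initially equal (u = v)

    def block(waiter, holder):
        # Block: create a WFG edge; both labels jump above both public labels.
        waiter.public = waiter.private = inc(waiter.public, holder.public)

    def transmit(waiter, holder):
        # Transmit: the larger public label propagates opposite to the WFG edge.
        if holder.public > waiter.public:
            waiter.public = holder.public

    def detect(waiter, holder):
        # Detect: the waiter's private label has travelled around a cycle and
        # come back as the public label of the process it waits for.
        return waiter.private == holder.public

    # Demo: a wait cycle P1 -> P2 -> P3 -> P1.
    p1, p2, p3 = Node(), Node(), Node()
    block(p1, p2); block(p2, p3); block(p3, p1)
    transmit(p2, p3); transmit(p1, p2)
    assert detect(p3, p1)   # the last blocker's label came back: deadlock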
The invariant is, for all processes u/v: v ≤ u.
Proof
Initially u = v for all processes. The only requests that change u or v are:
1. Block: u and v are set such that u = v.
2. Transmit: u is increased.
Hence, the invariant follows.
From the previous invariant, we have the following lemmas.
Lemma
For any process u/v, if u > v, then u was set by a Transmit step.
Theorem
If a deadlock is detected, a cycle of blocked nodes exists.
Message Complexity:
 If a deadlock persists long enough to be detected, the worst-case complexity of the
algorithm is s(s - 1)/2 Transmit steps, where s is the number of processes in the cycle.


3.13 THE AND MODEL AND THE OR MODEL


3.13.1 CHANDY–MISRA–HAAS ALGORITHM FOR THE AND MODEL
 This is based on edge-chasing, probe-based algorithm.
 This algorithm uses a special message called a probe, which is a triplet (i, j, k), denoting that it belongs to a deadlock detection initiated for process Pi and is being sent by the home site of process Pj to the home site of process Pk.
 Each probe message contains the following information:
 the id of the process that is blocked (the one that initiates the probe message)
 the id of the process that is sending this particular version of the probe message
 the id of the process that should receive this probe message
 A probe message travels along the edges of the global WFG graph, and a deadlock
is detected when a probe message returns to the process that initiated it.
 A process Pj is said to be dependent on another process Pk if there exists a
sequence of processes Pj, Pi1 , Pi2 , . . . , Pim, Pk such that each process except Pk in
the sequence is blocked and each process, except the Pj, holds a resource for which
the previous process in the sequence is waiting.
 Process Pj is said to be locally dependent upon process Pk if Pj is dependent upon Pk and both the processes are on the same site.
Data structures
Each process Pi maintains a boolean array dependenti, where dependenti(j) is true only if Pi knows that Pj is dependent on it. Initially, dependenti(j) is false for all i and j.
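A simplified sketch of the probe handling follows (illustrative Python; it omits the full guard conditions of the original algorithm, and the send callback is a stand-in for the network):

    class ProcessAND:
        # Simplified sketch of Chandy-Misra-Haas edge chasing (AND model).
        def __init__(self, pid, n_procs):
            self.pid = pid
            self.blocked = False
            self.waits_for = set()              # processes holding needed resources
            self.dependent = [False] * n_procs  # dependent[i]: initiator Pi depends on us

        def initiate_detection(self, send):
            # A blocked process Pi starts detection for itself: probes (i, i, k).
            for k in self.waits_for:
                send(k, ("PROBE", self.pid, self.pid, k))

        def on_probe(self, i, j, k, send):
            # Executing processes discard probes; blocked ones propagate them.
            if self.blocked and not self.dependent[i]:
                self.dependent[i] = True
                if i == self.pid:
                    print("deadlock: probe initiated by P%d returned" % i)
                else:
                    for m in self.waits_for:    # chase the probe along WFG edges
                        send(m, ("PROBE", i, self.pid, m))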


Performance analysis:
 In the algorithm, one probe message is sent on every edge of the WFG which
connects processes on two sites.
 The algorithm exchanges at most m(n − 1)/2 messages to detect a deadlock that
involves m processes and spans over n sites.
 The size of messages is fixed and is very small (only three integer words).
 The delay in detecting a deadlock is O(n).
Advantages:
 It is easy to implement.
 Each probe message is of fixed length.
 There is very little computation.
 There is very little overhead.
 There is no need to construct a graph, nor to pass graph information to other sites.
 This algorithm does not find false (phantom) deadlock.
 There is no need for special data structures.



3.13.2 CHANDY–MISRA–HAAS ALGORITHM FOR THE OR MODEL
It is based on the approach of diffusion computation. A blocked process determines if it
is deadlocked by initiating a diffusion computation.
 Two types of messages are used in a diffusion computation:
 query(i, j, k)
 reply(i, j, k) denoting that they belong to a diffusion computation initiated by a
process pi and are being sent from process pj to process pk.

 A blocked process initiates deadlock detection by sending query messages to all


processes in its dependent set.
 If an active process receives a query or reply message, it discards it.
 When a blocked process Pk receives a query(i, j, k) message, it takes the following
actions:
1. If this is the first query message received by Pk for the deadlock detection
initiated by Pi, then it propagates the query to all the processes in its dependent
set and sets a local variable numk (i) to the number of query messages sent.
2. If this is not the engaging query, then Pk returns a reply message to it
immediately provided Pk has been continuously blocked since it received the
corresponding engaging query. Otherwise, it discards the query.



Process Pk maintains a boolean variable waitk(i) that denotes the fact that it has been continuously blocked since it received the last engaging query from process Pi.
 When a blocked process Pk receives a reply(i, j, k) message, it decrements
numk(i) only if waitk(i) holds.
 A process sends a reply message in response to an engaging query only after
it has received a reply to every query message it has sent out for this
engaging query.
 The initiator process detects a deadlock when it has received reply messages
to all the query messages it has sent out.
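A condensed sketch of this query/reply bookkeeping is given below (illustrative Python; the num, wait, and engager structures follow the text, while the class layout and the send callback are assumptions):

    class ProcessOR:
        # Simplified sketch of the diffusion computation for the OR model.
        def __init__(self, pid, dependent_set):
            self.pid = pid
            self.dependents = dependent_set  # any one of these can unblock us
            self.blocked = True
            self.num = {}       # num[i]: queries of initiator Pi still unanswered
            self.wait = {}      # wait[i]: continuously blocked since engaging query
            self.engager = {}   # who sent us the engaging query of initiator Pi

        def initiate(self, send):
            self.num[self.pid], self.wait[self.pid] = len(self.dependents), True
            for k in self.dependents:
                send(k, ("QUERY", self.pid, self.pid, k))

        def on_query(self, i, j, send):
            if not self.blocked:
                return                                # active: discard the query
            if not self.wait.get(i):                  # engaging query: diffuse it
                self.wait[i], self.engager[i] = True, j
                self.num[i] = len(self.dependents)
                for m in self.dependents:
                    send(m, ("QUERY", i, self.pid, m))
            else:                                     # non-engaging: reply at once
                send(j, ("REPLY", i, self.pid, j))

        def on_reply(self, i, send):
            if self.wait.get(i):
                self.num[i] -= 1
                if self.num[i] == 0:                  # all our queries answered
                    if i == self.pid:
                        print("P%d is deadlocked" % i)
                    else:                             # answer our engaging query
                        send(self.engager[i], ("REPLY", i, self.pid, self.engager[i]))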
Performance analysis:
 For every deadlock detection, the algorithm exchanges e query messages and e
reply messages, where e = n(n – 1) is the number of edges



PART-A (Possible Questions)
1. What is meant by Deadlock prevention?
It is achieved by either having a process acquire all the needed resources simultaneously before
it begins execution or by pre-empting a process that holds the needed resource.
2. Write about dead lock avoidance.
In the deadlock avoidance approach to distributed systems, a resource is granted to a process if
the resulting global system is safe. Deadlock detection requires an examination of the status of
the process–resources interaction for the presence of a deadlock condition.
3. Define mutual exclusion.
Mutual exclusion in a distributed system states that only one process is allowed to execute the
critical section (CS) at any given time.
4. Define Message complexity.
This is the number of messages that are required per CS execution by a site.
5. What is Synchronization delay?
After a site leaves the CS, it is the time required and before the next site enters the CS.
6. Define Response time.
This is the time interval a request waits for its CS execution to be over after its request messages
have been sent out. Thus, response time does not include the time a request waits at a site before
its request messages have been sent out.
7. Define System throughput.
This is the rate at which the system executes requests for the CS. If SD is the synchronization
delay and E is the average critical section execution time.

8. What is the Message Complexity of Lamport’s algorithm?


Lamport's algorithm requires invocation of 3(N – 1) messages per critical section execution. These 3(N – 1) messages involve:
 (N – 1) request messages
 (N – 1) reply messages
 (N – 1) release messages
9. List the drawbacks of Lamport’s Algorithm.
Unreliable approach: failure of any one of the processes will halt the progress of
entire system.
High message complexity: Algorithm requires 3(N-1) messages per critical section
invocation.
10. Give the Performance of Lamport’s algorithm.
Synchronization delay is equal to maximum message transmission time. It requires 3(N – 1)
messages per CS execution. Algorithm can be optimized to 2(N – 1) messages by omitting the
REPLY message in some situations.


11. What is the use of Ricart–Agrawala algorithm?


It is an algorithm for mutual exclusion in a distributed system proposed by Glenn Ricart and Ashok Agrawala. This algorithm is an extension and optimization of Lamport's distributed mutual exclusion algorithm. It follows the permission-based approach to ensure mutual exclusion.
12. Give the Drawbacks and performance of Ricart–Agrawala algorithm.
Unreliable approach: failure of any one of node in the system can halt the progress of the system.
In this situation, the process will starve forever. The problem of failure of node can be solved by
detecting failure after some timeout.
Performance:
Synchronization delay is equal to maximum message transmission time It requires 2(N– 1)
messages per Critical section execution.
13. What is Maekawa’s Algorithm?
It is a quorum-based approach to ensure mutual exclusion in distributed systems. In permission-based algorithms like Lamport's algorithm and the Ricart–Agrawala algorithm, a site requests permission from every other site, but in the quorum-based approach, a site requests permission only from a subset of sites, called a quorum.
14. What is the Message Complexity Maekawa’s Algorithm?
This requires invocation of 3√N messages per critical section execution, as the size of a request set is √N. These 3√N messages involve:
 √N request messages
 √N reply messages
 √N release messages
15. List the drawbacks of Maekawa’s Algorithm.
This algorithm is deadlock prone because a site is exclusively locked by other sites and requests
are not prioritized by their timestamp.
16. Give the Performance of Maekawa’s Algorithm.
Synchronization delay is equal to twice the message propagation delay time. It requires 3√n
messages per critical section execution.
17. What is Suzuki–Kasami algorithm?
It is a token-based algorithm for achieving mutual exclusion in distributed systems. This is a modification of the Ricart–Agrawala algorithm, a permission-based (non-token-based) algorithm which uses REQUEST and REPLY messages to ensure mutual exclusion.
18. Give the Message Complexity of Suzuki–Kasami algorithm.
The algorithm requires no messages if the site already holds the idle token at the time of its critical section request, or a maximum of N messages per critical section execution. These N messages involve:



(N – 1) request messages
1 reply message
19. Give the drawbacks of Suzuki–Kasami Algorithm.
Non-symmetric algorithm: a site retains the token even if it has not requested the critical section.

20. Give the Performance of Suzuki–Kasami algorithm.


Synchronization delay is 0 and no message is needed if the site holds the idle token at the time of its request. In case the site does not hold the idle token, the maximum synchronization delay is equal to the maximum message transmission time and a maximum of N messages is required per critical section invocation.
21. Define Centralized approach.
Here there is only one responsible resource to detect deadlock.
The advantage of this approach is that it is simple and easy to implement, while the drawbacks include excessive workload at one node and a single point of failure, which in turn makes the system less reliable.
22. What is Distributed approach?
In the distributed approach different nodes work together to detect deadlocks. No
single point failure as workload is equally divided among all nodes.
The speed of deadlock detection also increases.
23. What is Hierarchical approach?
This approach is the most advantageous approach.
It is the combination of both centralized and distributed approaches of deadlock
detection in a distributed system.
In this approach, some selected nodes or cluster of nodes are responsible for
deadlock detection and these selected nodes are controlled by a single node.
24. Define Wait for graph.
This is used for deadlock detection. A graph is drawn based on the requests and acquisitions of
resources. If the graph contains a closed loop (a cycle), then there is a deadlock; a cycle-detection sketch is given below.
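As an illustration, the following minimal sketch (an invention of these notes, with assumed process ids and an adjacency-map representation) detects a cycle in a wait-for graph: a depth-first search that revisits a vertex already on the current path has found a cycle, i.e., a deadlock.

def has_deadlock(wfg):
    # wfg maps each process id to the set of processes it is waiting for.
    # A back edge found during depth-first search means a cycle (deadlock).
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on current path / finished
    color = {}

    def dfs(p):
        color[p] = GREY
        for q in wfg.get(p, ()):
            if color.get(q, WHITE) == GREY:      # q is on the current path
                return True
            if color.get(q, WHITE) == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and dfs(p) for p in wfg)

# P1 waits for P2, P2 for P3, P3 for P1: a cycle, hence a deadlock.
print(has_deadlock({1: {2}, 2: {3}, 3: {1}}))   # True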
25. What is Deadlock prevention?
 This is achieved either by having a process acquire all the needed resources
simultaneously before it begins executing or by preempting a process which
holds the needed resource.
 This approach is highly inefficient and impractical in distributed systems.
26. What is Deadlock avoidance?
 A resource is granted to a process if the resulting global system state is safe. This
is impractical in distributed systems.
27. What is Deadlock detection?
 This requires examination of the status of process-resource interactions for
presence of cyclic wait.
 Deadlock detection in distributed systems seems to be the best approach to handle
deadlocks in distributed systems
28. What is Single Resource Model?
A process can have at most one outstanding request for only one unit of a resource. The
maximum out-degree of a node in a WFG for the single-resource model is 1; hence, the presence of
a cycle in the WFG indicates that there is a deadlock.
29. How to detect deadlock in OR model?
Deadlock in OR model: a process Pi is blocked if it has a pending OR request to be satisfied.
30. What are the four classes of distributed deadlock detection algorithm?
1. Path-pushing
2. Edge-chasing
3. Diffusion computation
4. Global state detection
31. What are Path Pushing algorithms?
In path-pushing algorithms, distributed deadlocks are detected by
maintaining an explicit global wait-for graph.
The basic idea is to build a global WFG (wait-for graph) for each site of the
distributed system.
Whenever deadlock computation is performed at a site, it sends its local WFG to
all the neighbouring sites.
32. Write about Edge Chasing Algorithms
The presence of a cycle in a distributed graph structure is verified by propagating
special messages called probes along the edges of the graph.
These probe messages are different from the request and reply messages.
The formation of a cycle is detected by a site if it receives the matching probe
that it sent previously.
33. Brief about the Diffusing Computation Based Algorithms
In diffusion computation based distributed deadlock detection algorithms, deadlock
detection computation is diffused through the WFG of the system.
These algorithms make use of echo algorithms to detect deadlocks.
34. Write about Global state detection-based algorithms.
Global state detection based deadlock detection algorithms exploit the following facts:
1. A consistent snapshot of a distributed system can be obtained without freezing the
underlying computation.
2. If a stable property holds in the system before the snapshot collection is initiated,
this property will still hold in the snapshot.
35. Explain the features of Mitchell and Merritt’s algorithm.
1. Only one process in a cycle detects the deadlock. This simplifies the deadlock
resolution – this process can abort itself to resolve the deadlock. This algorithm can
be improvised by including priorities, and the lowest priority process in a cycle
detects deadlock and aborts.
2. In this algorithm, a process that is detected in deadlock is aborted spontaneously,
even though under this assumption phantom deadlocks cannot be excluded. It can
be shown, however, that only genuine deadlocks will be detected in the absence of
spontaneous aborts.
36. What are the transitions defined by the Mitchell and Merritt’s algorithm.
Block creates an edge in the WFG.
Two messages are needed: one resource request and one message back to the
blocked process to inform it of the public label of the process it is waiting for.
Activate denotes that a process has acquired the resource from the process it was
waiting for.
Transmit propagates larger labels in the opposite direction of the edges by sending a
probe message.
Detect means that the probe with the private label of some process has returned to it,
indicating a deadlock.
37. What is the use of a probe in the Chandy–Misra–Haas algorithm?
Each probe message contains the following information:
 the id of the process that is blocked (the one that initiates the probe message);
 the id of the process sending this particular version of the probe message;
 the id of the process that should receive this probe message.
A toy simulation of probe propagation is sketched below.
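The following is a single-machine sketch (a simplification by these notes: message passing is replaced by direct method calls, and the class and method names are inventions of the sketch) of probe propagation in the AND model. A blocked process forwards the probe (initiator, sender, receiver) along every wait-for edge; if the probe returns to its initiator, a deadlock is declared.

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.blocked_on = set()   # AND model: waits for ALL of these processes
        self.seen = set()         # initiators whose probe already visited us

    def initiate(self, procs):
        # The blocked initiator sends probe (i, j, k) along each WFG edge.
        return any(procs[k].on_probe((self.pid, self.pid, k), procs)
                   for k in self.blocked_on)

    def on_probe(self, probe, procs):
        initiator, sender, receiver = probe
        if initiator == self.pid:
            return True                      # probe returned: deadlock detected
        if not self.blocked_on or initiator in self.seen:
            return False                     # active process, or duplicate probe
        self.seen.add(initiator)
        return any(procs[k].on_probe((initiator, self.pid, k), procs)
                   for k in self.blocked_on)

procs = {i: Process(i) for i in range(3)}
procs[0].blocked_on = {1}
procs[1].blocked_on = {2}
procs[2].blocked_on = {0}
print(procs[0].initiate(procs))    # True: P0 -> P1 -> P2 -> P0 is a cycle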
PART-B ( Possible Questions)
1. Write in detail about mutual exclusion algorithms.
2. Describe Lamport's algorithm.
3. Brief about the Ricart–Agrawala algorithm.
4. Explain Maekawa's algorithm.
5. Elaborate Suzuki–Kasami's broadcast algorithm.
6. How to deal with deadlocks in distributed systems?
7. Explain in detail about issues in deadlock detection.
8. Elaborate deadlock models.
9. Describe Knapp's classification of distributed deadlock detection algorithms.
10. Explain the Mitchell and Merritt's algorithm for the single-resource model.
11. Explain the Chandy–Misra–Haas algorithm for the AND model.
12. Describe the Chandy–Misra–Haas algorithm for the OR model.

PART-C ( Possible Questions)

1. Discuss about Lamport's algorithm.
2. Illustrate the Ricart–Agrawala algorithm.
3. Explain in detail about Maekawa's algorithm.
4. Elaborate Suzuki–Kasami's broadcast algorithm.
5. Discuss Knapp's classification of distributed deadlock detection algorithms.
6. Elaborate the Mitchell and Merritt's algorithm for the single-resource model.
7. Design the Chandy–Misra–Haas algorithm for the AND model.
8. Design the Chandy–Misra–Haas algorithm for the OR model.
UNIT IV

RECOVERY & CONSENSUS

4.1 INTRODUCTION
Rollback and checkpointing are used to ensure consistency in distributed systems.

The saved state is called a checkpoint, and the procedure of restarting from a previously
checkpointed state is called rollback recovery.
Upon a failure of one or more processes in a system, the dependencies created by message
exchange may force some of the processes that did not fail to roll back, creating what is
commonly called rollback propagation.
 In rollback propagation, the rollback of one process may force further rollbacks of other
processes, possibly cascading back to the initial state. This phenomenon is called the domino effect.
 Independent checkpointing: If each participating process takes its checkpoints
independently, then the system is susceptible to the domino effect. This approach is
called independent or uncoordinated checkpointing.
 Coordinated checkpointing: It is desirable to avoid the domino effect and therefore
several techniques have been developed to prevent it. One such technique is coordinated
checkpointing where processes coordinate their checkpoints to form a system-wide
consistent state.
 Communication induced checkpointing: Communication-induced checkpointing forces
each process to take checkpoints based on information piggybacked on the application
messages it receives from other processes. Checkpoints are taken such that a system-wide
consistent state always exists on stable storage, thereby avoiding the domino effect.
 Log based rollback recovery combines checkpointing with logging of nondeterministic
events.
 Log-based rollback recovery enables a system to recover beyond the most recent set of
consistent checkpoints. It is therefore particularly attractive for applications that
frequently interact with the outside world, which consists of input and output devices that
cannot roll back.
A rollback recovery protocol ensures that the system recovers correctly: after recovery,
its internal state is consistent with the observable behavior of the system before
the failure.
4.2 BACKGROUND AND DEFINITIONS
4.2.1 SYSTEM MODEL
A distributed system consists of a fixed number of processes, P1, P2, …, Pn, that
communicate with each other only by sending and receiving messages.

Fig 4.2: Example of communications in a distributed system


 Figure 4.2 shows a system consisting of three processes and interactions with the
outside world.
 Rollback-recovery protocols always keep track of information about the
internal interactions among processes and also the external interactions with the
outside world.
4.2.2 LOCAL CHECKPOINT

A local checkpoint is a snapshot of the state of the process at a given instant,
and the event of recording the state of a process is called local checkpointing.

 All processes save their local states at certain instants of time.
 Depending upon the checkpointing method used, a process may keep several local
checkpoints or just a single checkpoint at any time.
 The assumptions made in local checkpointing are:
– A process stores all local checkpoints on the stable storage.
– A process is able to roll back to any of its existing local checkpoints.
 Ci,k denotes the kth local checkpoint at process Pi; Ci,0 denotes the checkpoint that a
process takes before it starts execution.
A consistent global state is one that may occur during a failure-free execution of a
distributed computation.
4.2.3 CONSISTENT STATES
A global state of a distributed system is a collection of the individual states of all
participating processes and the states of the communication channels.
 A local checkpoint is a snapshot of a local state of a process and a global checkpoint is a
set of local checkpoints, one from each process.
A consistent system state is one in which, if a process's state reflects a message receipt, then
the state of the corresponding sender reflects the sending of that message.
 A consistent global checkpoint is a global checkpoint such that no message sent by a
process after taking its local checkpoint is received by another process before taking its own
checkpoint.
 The consistency of global checkpoints strongly depends on the flow of messages
exchanged by processes; an arbitrary set of local checkpoints at processes may not
form a consistent global checkpoint. A small consistency checker is sketched below.
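To make the definition concrete, here is a minimal checker, a sketch under assumed representations (logical timestamps for checkpoints and message events, both conventions of this sketch), that reports whether a chosen set of local checkpoints contains an orphan message:

def is_consistent(cut, messages):
    # cut[p] is the logical time of process p's chosen local checkpoint.
    # messages is a list of (sender, send_time, receiver, recv_time) tuples.
    # The cut is inconsistent iff some message's receipt is recorded
    # (recv_time <= cut[receiver]) while its send is not (send_time > cut[sender]):
    # such a message would be an orphan.
    for sender, t_send, receiver, t_recv in messages:
        if t_recv <= cut[receiver] and t_send > cut[sender]:
            return False
    return True

msgs = [("P1", 5, "P2", 7)]
print(is_consistent({"P1": 6, "P2": 8}, msgs))   # True: send and receive recorded
print(is_consistent({"P1": 4, "P2": 8}, msgs))   # False: the message is an orphan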

Fig 4.2: Consistent and Inconsistent states


 The primary objective of any rollback-recovery protocol is to bring the system to a
consistent state after a failure.
 The reconstructed consistent state is not necessarily one that occurred before the failure.
 It is enough to restore the reconstructed state be one that could have occurred before the
failure in a failure-free execution, provided that it is consistent with the interactions that
the system had with the outside world.
4.2.3 INTERACTIONS WITH THE OUTSIDE WORLD
 Communication with outside world is mainly to receive input data or deliver the outcome
of a computation.
 The outside world process (OWP) is defined as a special process that interacts with the
rest of the distributed system through message passing.
 Before sending output to the OWP, the system must ensure that the state from which the
output is sent will be recovered despite any future failure. This is commonly called the
output commit problem.
 The most common approach in interacting with outside world is to save each input
message on the stable storage before allowing the application program to process it.
 Also, the input messages that a system receives from the OWP may not be reproducible
during recovery, because it may not be possible for the outside world to regenerate them.
 Outside world interaction is indicated using the Symbol “||”. This specifies that an
interaction with the outside world to deliver the outcome of a computation is to be done.

4.2.4 DIFFERENT TYPES OF MESSAGES


 A process failure and its recovery leave some of the messages that were sent
and received before the failure in abnormal states.
 Handling these messages correctly facilitates the rollback of processes for recovery.
This operation involves sending and receiving several types of messages.

Fig 4.3: Types of messages


In Transit Messages:
 These are the messages that have been sent but not yet received.
 These messages do not cause any inconsistency. However, depending on whether
the system model assumes reliable communication channels, rollback recovery
protocols may have to guarantee the delivery of in-transit messages when failures
occur.
 For reliable communication channels, a consistent state must include in-transit
messages because they will always be delivered to their destinations in any legal
execution of the system.
 If a system model assumes lossy communication channels, then in-transit
messages can be omitted from system state.
 The global state {C1,8 , C2,9, C3,8, C4,8 } in Fig 4.3 shows that message m1 has been
sent but not yet received.
Lost Messages:
 Messages whose send is not undone but receive is undone due to rollback are called lost
messages.
 This type of messages occurs when the process rolls back to a checkpoint prior to
reception of the message while the sender does not rollback beyond the send operation of
the message.
 The message m1 in Fig 4.3 is a lost message.

Delayed messages:
 Messages whose receive is not recorded because the receiving process was either down
or the message arrived after the rollback of the receiving process, are called delayed
messages.
 Messages m2 and m5 are delayed messages.
Orphan messages:
 Messages with receive recorded but message send not recorded are called orphan
messages.
 A rollback might have undone the send of such messages, leaving the receive event intact
at the receiving process.
 Orphan messages do not arise if processes roll back to a consistent global state.
Duplicate messages:
 Duplicate messages arise due to message logging and replaying during process recovery.
 In fig 4.3, message m4 was sent and received before the rollback.
 Due to the rollback of process P4 to C4,8 and process P3 to C3,8, both send and receipt of
message m4 are undone.
 When process P3 restarts from C3,8, it will resend message m4. Therefore, P4 should not
replay message m4 from its log. If P4 replays message m4, then message m4 is called a
duplicate message. A compact classification of these message types is sketched below.
4.3 ISSUES IN FAILURE RECOVERY


During the recovery process, it is essential not only to restore the system to a consistent
state, but also to correctly handle the messages that are left in an abnormal state by the
failure and recovery.

Fig 4.4: Illustration of issues in failure recovery

Assume that the process Pi fails, and all the contents of the volatile memory of Pi are lost;
after Pi has recovered from the failure, the system needs to be restored to a consistent global
state from where the processes can resume their execution. Process Pi's state is restored to a
valid state by rolling it back to its most recent checkpoint Ci,1.
 To restore the system to a consistent state, the process Pj rolls back to checkpoint
Cj,1 because the rollback of process Pi to checkpoint Ci,1 created an orphan
message H.
 Pj does not roll back to checkpoint Cj,2 but to checkpoint Cj,1, because rolling back
to checkpoint Cj,2 would not eliminate the orphan message H.
 Even this resulting state is not a consistent global state, as an orphan message I is
created due to the rollback of process Pj to checkpoint Cj,1.
 To eliminate this orphan message, process Pk rolls back to checkpoint Ck,1.
The restored global state {Ci,1, Cj,1 , Ck,1} is a consistent state
The system state has been restored to a consistent state but there are several messages left
in an erroneous state which must be handled correctly. Messages A, B, D, G, H, I, and J had
been received at the points indicated in the figure and messages C, E, and F were in transit when
the failure occurred.
 Restoration of system state to checkpoints { Ci,1, Cj,1 , Ck,1 } automatically
handles messages A, B, and J because the send and receive events of messages
A, B, and J have been recorded, and both the events for G, H, and I have been
completely undone.
 These messages cause no problem and we call messages A, B, and J normal
messages and messages G, H, and I vanished messages.
Messages C, D, E, and F are potentially problematic.
 Message C is in transit during the failure and it is a delayed message.
 The delayed message C has several possibilities: C might arrive at process Pi
before it recovers, it might arrive while Pi is recovering, or it might arrive after Pi
has completed recovery.
 Each of these cases must be dealt with correctly.
Message D is a lost message
The send event for D is recorded in the restored state for process Pj, but the receive event
has been undone at process Pi.
 Process Pj will not resend D without an additional mechanism, since the send D at
Pj occurred before the checkpoint and the communication system successfully
delivered D.
Messages E and F are delayed orphan messages.
They pose perhaps the most serious problem of all the messages.
 When messages E and F arrive at their respective destinations, they must be
discarded since their send events have been undone.
 Processes, after resuming execution from their checkpoints, will generate both of
these messages, and recovery techniques must be able to distinguish between
messages like C and those like E and F.

 Lost messages like D can be handled by having processes keep a message log of
all the sent messages.
 So when a process restores to a checkpoint, it replays the messages from its log to
handle the lost message problem.
 However, message logging and message replaying during recovery can result in
duplicate messages.
 Process Pk, which has already received message J, will receive it again, thereby
causing inconsistency in the system state.
 Overlapping failures further complicate the recovery process. A process Pj that
begins rollback/recovery in response to the failure of a process Pi can itself fail
and develop amnesia with respect to process Pi's failure; that is, Pj can act in a
fashion that exhibits ignorance of process Pi's failure.
4.4 CHECKPOINT-BASED RECOVERY

In checkpoint-based recovery, the state of each process and the communication
channel is checkpointed frequently so that when a failure occurs, the system can be
restored to a globally consistent set of checkpoints.
 This scheme does not depend on the piecewise deterministic (PWD) execution assumption,
and hence it does not need to detect, log, or replay non-deterministic events.
 These protocols are less restrictive and simpler to implement than log-based
rollback recovery.
 The drawback here is, it does not guarantee that pre-failure execution can be
deterministically regenerated after a rollback.
The three types of checkpoint based rollback-recovery techniques are:
1. Uncoordinated checkpointing
2. Coordinated checkpointing
3. Communication-induced checkpointing
1. Uncoordinated Checkpointing
Here, each process has autonomy in deciding when to take checkpoints. This eliminates
the synchronization overhead as there is no need for coordination between processes and it
allows processes to take checkpoints when it is most convenient or efficient.
Advantage of this method:
 Lower runtime overhead during normal execution.
Limitations
 Domino effect during a recovery
 Recovery from a failure is slow because processes need to iterate to find a consistent
set of checkpoints
 Each process maintains multiple checkpoints and periodically invoke a garbage
collection algorithm
 Not suitable for application with frequent output commits
 In Figure 13.5, when process Pi at interval Ii,x sends a message m to Pj, it piggybacks
the pair (i, x) on m. When Pj receives m during interval Ij,y, it records the dependency
from Ii,x to Ij,y, which is later saved onto stable storage when Pj takes checkpoint Cj,y.
 As each process takes checkpoints independently, it is necessary to determine a consistent
global checkpoint to rollback to, when a failure occurs.
 To accomplish this, the processes record the dependencies among their checkpoints
caused by message exchange during failure free operation.
 When a failure occurs, the recovering process initiates rollback by broadcasting a
dependency request message to collect all the dependency information maintained by
each process.
 When a process receives this message, it stops its execution and replies with the
dependency information saved on the stable storage as well as with the dependency
information, if any, which is associated with its current state.
 The initiator then calculates the recovery line based on the global dependency
information and broadcasts a rollback request message containing the recovery line.
 Upon receiving this message, a process whose current state belongs to the recovery line
simply resumes execution; otherwise, it rolls back to an earlier checkpoint as indicated
by the recovery line. A sketch of this rollback-propagation computation is given below.
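Under assumed conventions (checkpoint C_{j,y} closes interval I_{j,y}, and C_{i,0}, taken before execution, carries empty dependencies — both conventions of this sketch), the initiator can compute a consistent line by repeatedly rolling a process back past any checkpoint that records the receipt of a message whose send has been undone. The loop below is exactly where the domino effect can appear.

def recovery_line(latest, deps):
    # latest[p]: index of p's most recent usable checkpoint (the failed
    # process has already been rolled back to its last saved one).
    # deps: set of ((i, x), (j, y)) pairs meaning a message sent in interval
    # I_{i,x} was received in interval I_{j,y} (recorded at checkpoint C_{j,y}).
    line = dict(latest)
    changed = True
    while changed:
        changed = False
        for (i, x), (j, y) in deps:
            # Orphan: receive is kept (line[j] >= y) but send is undone
            # (line[i] < x), so j must roll back before C_{j,y}.
            if line[j] >= y and line[i] < x:
                line[j] = y - 1
                changed = True
    return line

# P2 was rolled back to checkpoint 1; P1 received a message from interval I_{2,2},
# so P1 must also roll back, from checkpoint 3 to checkpoint 2.
print(recovery_line({"P1": 3, "P2": 1}, {(("P2", 2), ("P1", 3))}))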
2. Coordinated checkpointing
In coordinated checkpointing, processes orchestrate their checkpointing activities so that
all local checkpoints form a consistent global state. Coordinated checkpointing simplifies
recovery and is not susceptible to the domino effect, since every process always restarts from its
most recent checkpoint.
 This requires each process to maintain only one checkpoint on the stable
storage, reducing the storage overhead and eliminating the need for garbage collection.
Types of Coordinated checkpoints:
1. Blocking checkpointing: after a process takes a local checkpoint, to prevent orphan
messages, it remains blocked until the entire checkpointing activity is complete.
 The disadvantage is that the computation is blocked during checkpointing.
2. Non-blocking checkpointing: the processes need not stop their execution while taking
checkpoints. A fundamental problem in coordinated checkpointing is to prevent a process from
receiving application messages that could make the checkpoint inconsistent.
Main disadvantages of this method:
 Large latency is involved in committing output, as a global checkpoint is needed before a
message is sent to the OWP.
 Delays and overhead are involved every time a new global checkpoint is taken.
 If perfectly synchronized clocks were available at processes, the following simple
method can be used for checkpointing:
o All processes agree at what instants of time they will take checkpoints, and the
clocks at processes trigger the local checkpointing actions at all processes.
 Since perfectly synchronized clocks are not available, the following approaches are used
to guarantee checkpoint consistency:
o either the sending of messages is blocked for the duration of the protocol
o checkpoint indices are piggybacked to avoid blocking.

Fig: (a) Non-blocking coordinated checkpoint; (b) inconsistency in checkpoint


 Coordinated checkpointing requires all processes to participate in every
checkpoint.
 This requirement generates valid concerns about its scalability. Hence it is
desirable to reduce the number of processes involved in a coordinated
checkpointing session.
 This can be done since only those processes that have communicated with the
checkpoint initiator either directly or indirectly since the last checkpoint need to
take new checkpoints.
3. Impossibility of min-process non-blocking checkpointing
A min-process, non-blocking checkpointing algorithm is one that forces only a minimum
number of processes to take a new checkpoint, and at the same time it does not force any process
to suspend its computation.
The algorithm consists of two phases.
 First phase: the checkpoint initiator identifies all processes with which it has
communicated since the last checkpoint and sends them a request. Upon
receiving the request, each process in turn identifies all processes it has
communicated with since the last checkpoint and sends them a request, and
so on, until no more processes can be identified.
 Second phase: all processes identified in the first phase take a checkpoint.
The result is a consistent checkpoint that involves only the participating
processes. In this protocol, after a process takes a checkpoint, it cannot send
any message until the second phase terminates successfully, although
receiving a message after the checkpoint has been taken is allowable.
 Using the notion of z-dependency, it can be shown that there does not exist a
non-blocking algorithm that forces only a minimum number of processes to take their checkpoints.
4. Communication-induced checkpointing
Communication-induced checkpointing avoids the domino effect, while allowing
processes to take some of their checkpoints independently. Processes may be forced to take
additional checkpoints, and thus process independence is constrained to guarantee the eventual
progress of the recovery line.
 Communication-induced checkpointing reduces or completely eliminates
the useless checkpoints.
 In communication-induced checkpointing, processes take two types
of checkpoints: autonomous and forced checkpoints.
 The checkpoints that a process takes independently are called local
(autonomous) checkpoints, while those that a process is forced to take are
called forced checkpoints.
 Communication-induced checkpointing piggybacks protocol-related
information on each application message.
Two types of communication-induced checkpointing:
 model-based checkpointing
 index-based checkpointing.
Model based checkpointing
This prevents patterns of communications and checkpoints that may result in inconsistent
states among the existing checkpoints. A process detects the inconsistent checkpoints and
independently forces local checkpoints to prevent the formation of undesirable patterns. A forced
checkpoint prevents the undesirable patterns from occurring.
 No control messages are exchanged among the processes during the normal
operation.
 All information necessary to execute the protocol is piggybacked on
application messages.
 The decision to take a forced checkpoint is done locally using the
information available.
 There are several domino-effect-free checkpoint and communication
models.
The MRS (mark, send, and receive) model avoids the domino effect by
ensuring that within every checkpoint interval all message receiving events
precede all message-sending events.
 This model can be maintained by taking an additional checkpoint before
every message-receiving event that is not separated from its previous
message-sending event by a checkpoint.
 Another method is to take a checkpoint immediately after every
message-sending event.
Index-based checkpointing
 It assigns monotonically increasing indexes to checkpoints, such that the
checkpoints having the same index at different processes form a consistent
state.
 Inconsistency between checkpoints of the same index can be avoided in a
lazy fashion if indexes are piggybacked on application messages to help
receivers decide when they should take a forced checkpoint; a minimal sketch follows.
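The sketch below illustrates the index-based idea; the class and method names are inventions of these notes, and the stable-storage writes are elided as comments:

class IndexedProcess:
    # Every message piggybacks the sender's current checkpoint index.  A
    # receiver whose index lags takes a forced checkpoint *before* delivering
    # the message, so checkpoints with equal indexes at different processes
    # form a consistent global line.
    def __init__(self):
        self.index = 0

    def take_local_checkpoint(self):       # autonomous checkpoint
        self.index += 1
        # ... save state to stable storage, tagged with self.index ...

    def send(self, payload):
        return (self.index, payload)       # piggybacked protocol information

    def receive(self, message, deliver):
        sender_index, payload = message
        if sender_index > self.index:      # lazy forced checkpoint
            self.index = sender_index
            # ... save state tagged with the new index before delivery ...
        deliver(payload)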

4.5 LOG-BASED ROLLBACK RECOVERY


A log-based rollback recovery makes use of deterministic and nondeterministic
events in a computation.

4.5.1 Deterministic and non-deterministic events


 In log-based rollback, a process execution can be modeled as a sequence of
deterministic state intervals, each starting with the execution of a non-
deterministic event.
 A non-deterministic event can be the receipt of a message from another process or
an event internal to the process.
 In the above figure, the execution of process P0 is a sequence of four deterministic
intervals.
 The first one starts with the creation of the process, while the remaining three
start with the receipt of messages m0, m3, and m7, respectively.
 The send event of message m2 is uniquely determined by the initial state of P0 and by
the receipt of message m0, and is therefore not a non-deterministic event.
 Log-based rollback recovery assumes that all non-deterministic events can be
identified and their corresponding determinants can be logged into the stable
storage.
 During failure-free operation, each process logs the determinants of all non-
deterministic events that it observes onto the stable storage.
 Each process also takes checkpoints to reduce the extent of rollback during
recovery. After a failure occurs, the failed processes recover by using the
checkpoints and logged determinants to replay the corresponding non-
deterministic events precisely as they occurred during the pre-failure execution.
 The execution within each deterministic interval depends only on the sequence of
non-deterministic events that preceded the interval’s beginning, the pre-failure
execution of a failed process can be reconstructed during recovery up to the first
non-deterministic event whose determinant is not logged.
4.5.2 NO ORPHANS CONSISTENCY CONDITIONS
Let e be a non-deterministic event that occurs at process p.
 Depend(e): the set of processes that are affected by a non-deterministic event
e. This set consists of p, and any process whose state depends on the event e according to
Lamport’s happened before relation.
 Log(e): the set of processes that have logged a copy of e’s determinant in
their volatile memory.
 Stable(e): a predicate that is true if e’s determinant is logged on the stable
storage.

No-orphans condition: if any surviving process depends on an event e, then either
event e's determinant is logged on the stable storage, or the process itself has a
copy of the determinant of event e. Formally, ∀e: ¬Stable(e) ⇒ Depend(e) ⊆ Log(e).
If this condition holds at all times, it is called the always-no-orphans condition.

 Log-based rollback-recovery protocols guarantee that upon recovery of all
failed processes, the system does not contain any orphan process, i.e., a
process whose state depends on a non-deterministic event that cannot be
reproduced during recovery.
 Log-based rollback-recovery protocols are of three types: pessimistic
logging, optimistic logging, and causal logging protocols.
 They differ in their failure-free performance overhead, latency of output
commit, simplicity of recovery and garbage collection, and the potential for
rolling back surviving processes.

4.5.3 PESSIMISTIC LOGGING

Pessimistic logging protocols assume that a failure can occur after any non-deterministic
event in the computation.
Pessimistic protocols log to the stable storage the determinant of each non-deterministic
event before the event is allowed to affect the computation. They implement synchronous
logging, which is a condition stronger than the always-no-orphans condition. Processes also
take checkpoints to minimize the amount of work that has to be repeated during recovery.
 When a process fails, the process is restarted from the most recent checkpoint and
the logged determinants are used to recreate the prefailure execution.
 In a pessimistic logging system, the observable state of each process is always
recoverable.
 But there is performance penalty incurred by synchronous logging which may
lead to high performance overhead.
 Implementations of pessimistic logging must use special techniques to reduce the
effects of synchronous logging on performance.
 This overhead can be lowered using special hardware, for example, magnetic disk
devices and a special bus that guarantee atomic logging of all messages exchanged
in the system. A sketch of the synchronous-logging discipline follows.
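The essence of synchronous logging is that the determinant must reach stable storage before the message may influence the process. A minimal sketch follows; the file-like stable-storage handle and the message fields "sender" and "ssn" (sender sequence number) are assumptions of the sketch, and a real implementation would also need to force the data to disk (e.g., fsync) rather than merely flush.

import pickle

class PessimisticReceiver:
    def __init__(self, stable_file):
        self.stable = stable_file        # handle opened on stable storage
        self.next_receive_order = 0

    def deliver(self, msg, handler):
        # Determinant of the non-deterministic receipt: delivery order,
        # the sender's identity, and the sender's sequence number.
        determinant = (self.next_receive_order, msg["sender"], msg["ssn"])
        self.stable.write(pickle.dumps(determinant))
        self.stable.flush()              # synchronous logging: block until logged
        # Only now may the message affect the computation.
        self.next_receive_order += 1
        handler(msg["payload"])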

 In Figure 13.8, during failure-free operation the logs of processes P0, P1, and P2
contain the determinants needed to replay messages {m0, m4, m7}, {m1, m3, m6}, and
{m2, m5}, respectively.
 Suppose processes P1 and P2 fail as shown, restart from checkpoints B and C,
and roll forward using their determinant logs to deliver again the same sequence
of messages as in the pre-failure execution.
 This guarantees that P1 and P2 will repeat exactly their pre-failure execution and
re-send the same messages. Hence, once the recovery is complete, both processes
will be consistent with the state of P0 that includes the receipt of message m7
from P1.
4.5.4 OPTIMISTIC LOGGING
In these protocols, processes log determinants asynchronously to the stable storage.
They optimistically assume that logging will be completed before a failure occurs.
These protocols do not implement the always-no-orphans condition.
 To perform rollbacks correctly, optimistic logging protocols track causal dependencies
during failure free execution.
 Optimistic logging protocols require a non-trivial garbage collection scheme.
 Pessimistic protocols need only keep the most recent checkpoint of each process,
whereas optimistic protocols may need to keep multiple checkpoints for each process.
 The overheads in optimistic logging are complicated recovery, garbage collection, and
slower output commit.
 If a process fails, the determinants in its volatile log are lost, and the state intervals that
were started by the non-deterministic events corresponding to these determinants cannot
be recovered.
 If the failed process sent a message during any of the state intervals that cannot be
recovered, the receiver of the message becomes an orphan process and must roll back to
undo the effects of receiving the message.

Fig 4.7: Optimistic logging

 Upon a failure, the dependency information is used to calculate and recover the latest
global state of the pre-failure execution in which no process is in an orphan.
 Suppose process P2 fails before the determinant for m5 is logged to the stable storage.
Process P1 then becomes an orphan process and must roll back to undo the effects of
receiving the orphan message m6.
 The rollback of P1 further forces P0 to roll back to undo the effects of receiving message
m7.

4.5.5 CASUAL LOGGING


This combines the advantages of both pessimistic and optimistic logging at the expense
of a more complex recovery protocol.
 Like optimistic logging, it does not require synchronous access to the stable storage
except during output commit.
 Like pessimistic logging, it allows each process to commit output independently
and never creates orphans, thus isolating processes from the effects of failures at
other processes.
 It ensures that the always-no-orphans property holds.
 Each process maintains information about all the events that have causally affected
its state.
 Causal logging protocols make sure that the always-no-orphans property holds by
ensuring that the determinant of each non-deterministic event that causally
precedes the state of a process is either stable or it is available locally to that
process.

 In Fig. 13.10, messages m5 and m6 are likely to be lost on the failures of P1 and P2
at the indicated instants.
 Process P0 at state X will have logged the determinants of the nondeterministic
events that causally precede its state according to Lamport's happened-before
relation.
 These events consist of the delivery of messages m0, m1, m2, m3, and m4.
 The determinant of each of these non-deterministic events is either logged on the
stable storage or is available in the volatile log of process P0.
 The process P0 will be able to guide the recovery of P1 and P2 since it knows the
order in which P1 should replay messages m1 and m3 to reach the state from which
P1 sent message m4.
 P0 has the order in which P2 should replay message m2 to be consistent with both
P0 and P1.
 The content of these messages is obtained from the sender log of P0 or regenerated
deterministically during the recovery of P1 and P2.
 The information about messages m5 and m6 is lost due to failures. These messages
may be resent after recovery possibly in a different order.
 Each process maintains information about all the events that have causally affected
its state.
 This information protects it from the failures of other processes and also allows the
process to make its state recoverable by simply logging the information available
locally.
 Thus, a process does not need to run a multi-host protocol to commit output. It can
commit output independently.

4.6 COORDINATED CHECKPOINTING ALGORITHM (KOO-TOUEG)


 Uncoordinated checkpointing may lead to domino effect or to livelock.
 Two basic approaches to checkpoint coordination:
1. The Koo–Toueg algorithm: a process initiates the system-wide
checkpointing process.
2. Staggering checkpoints: an algorithm which staggers checkpoints in time;
staggering checkpoints can help avoid near-simultaneous heavy loading of
the disk system.
 Here the communication is induced by checkpointing procedures, but
uncoordinated checkpointing algorithms can deal with isolated failures.

Koo and Toueg proposed a coordinated checkpointing and recovery technique that takes a
consistent set of checkpoints and avoids the domino effect and livelock problems
during recovery.

 This algorithm includes two parts:
• the checkpointing algorithm
• the recovery algorithm
4.6.1 CHECKPOINTING ALGORITHM
 The following are the assumptions made in the checkpointing algorithm:
 FIFO channel
 end-to-end protocols
 communication failures do not partition the network
 single process initiation
 no process failures during the execution of the algorithm
 The algorithm facilitates two kinds of checkpoints:
 Permanent: a local checkpoint, part of a consistent global checkpoint
 Tentative: a temporary checkpoint that becomes a permanent checkpoint when
the algorithm terminates successfully
 The algorithm is implemented in two phases:
Phase I: Initiation
 The initiating process Pi takes a tentative checkpoint and requests all other processes to take
tentative checkpoints.
 No process can send messages after taking a tentative checkpoint.
 All processes will finally reach the same single decision: make permanent or discard.
 A process says no to a request if it fails to take a tentative checkpoint, which could be
due to several reasons, depending upon the underlying application.
 If Pi learns that all the processes have successfully taken tentative checkpoints, Pi
decides that all tentative checkpoints should be made permanent; otherwise, Pi decides
that all the tentative checkpoints should be discarded.
Phase II:
 All processes will receive the final decision from the initiating process and act accordingly.
 Pi informs all the processes of the decision it reached at the end of the first phase.
 A process, on receiving the message from Pi, will act accordingly.
 Either all or none of the processes advance the checkpoint by taking permanent
checkpoints.
 The algorithm requires that after a process has taken a tentative checkpoint, it cannot
send messages related to the underlying computation until it is informed of Pi’s decision.
Correctness:
 Either all or none of the processes take permanent checkpoint.
 No process sends message after taking permanent checkpoint.
Optimization: not all of the processes need to take checkpoints. A two-phase sketch of the algorithm is given below.
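The following single-machine sketch shows the two-phase structure; the Site class and the direct method calls stand in for real sites and messages (inventions of these notes), and any site returning False from take_tentative vetoes the checkpoint, as in Phase I above.

class Site:
    def __init__(self):
        self.tentative = None
        self.permanent = None
        self.may_send = True

    def take_tentative(self):
        self.tentative = "snapshot-of-local-state"
        self.may_send = False       # no computation messages until the decision
        return True                 # a site may return False to veto

    def decide(self, commit):
        if commit:
            self.permanent = self.tentative   # Phase II: make permanent
        self.tentative = None                 # or discard
        self.may_send = True

def coordinated_checkpoint(sites, initiator):
    # Phase I: initiator checkpoints tentatively and polls every other site.
    ok = sites[initiator].take_tentative()
    ok = all([ok] + [s.take_tentative()
                     for i, s in sites.items() if i != initiator])
    # Phase II: broadcast one common decision to all sites.
    for s in sites.values():
        s.decide(ok)
    return ok

sites = {i: Site() for i in range(3)}
print(coordinated_checkpoint(sites, initiator=0))   # True: all made permanent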
4.6.2 ROLLBACK RECOVERY ALGORITHM
 This algorithm restores the system state to a consistent state after a failure, with the
following assumptions: single initiator; the checkpoint and rollback recovery algorithms are not
invoked concurrently.
 This is implemented in two phases:
 The initiating process sends a message to all other processes and asks whether
they are willing to restart from their previous checkpoints. All processes need
to agree on a single decision: restart or not.
 The initiating process sends the final decision to all processes, and the
processes act accordingly after receiving it.
Phase I:
 An initiating process Pi sends a message to all other processes to check if they all are willing
to restart from their previous checkpoints.
 A process may reply no to a restart request for any reason.
 If Pi learns that all processes are willing to restart from their previous checkpoints, Pi decides
that all processes should roll back to their previous checkpoints.

 Otherwise, Pi aborts the rollback attempt, and it may attempt a recovery at a
later time.

Phase II:
 Pi propagates its decision to all the processes.
 On receiving Pi’s decision, a process acts accordingly.
 During the execution of the recovery algorithm, a process cannot send
messages related to the underlying computation while it is waiting for Pi’s
decision.

 In Figure 13.11, the set {x1, y1, z1} is a consistent set of checkpoints. Suppose process
X decides to initiate the checkpointing algorithm after receiving message m.
It takes a tentative checkpoint x2 and sends "take tentative checkpoint" messages to
processes Y and Z, causing Y and Z to take checkpoints y2 and z2, respectively.


Clearly, {x2, y2, z2} forms a consistent set of checkpoints. Note, however, that {x2, y2,
z1} also forms a consistent set of checkpoints.
In this example, there is no need for process Z to take checkpoint z2 because Z has not
sent any message since its last checkpoint. However, process Y must take a checkpoint
since it has sent messages since its last checkpoint.
Correctness
 All processes restart from an appropriate state because, if they decide to restart, they
resume execution from a consistent state.
Optimization
 Not all processes need to be recovered, since some of the processes may not have changed anything.

4.7 ALGORITHM FOR ASYNCHRONOUS CHECKPOINTING AND RECOVERY (JUANG–VENKATESAN)
 This algorithm enables recovery under asynchronous checkpointing.
4.7.1 SYSTEM MODEL AND ASSUMPTIONS
 The following are the assumptions made:
 communication channels are reliable
 delivery messages in FIFO order
 infinite buffers
 message transmission delay is arbitrary but finite
 The underlying computation or application is event-driven: when
process P is at state s and receives message m, it processes the message,
moves to state s′, and sends messages out. So the triplet (s, m, msgs_sent)
represents the state of P.
 To facilitate recovery after a process failure and restore the system to a
consistent state, two types of log storage are maintained:
 Volatile log: quick to access, but its contents are lost if the processor crashes.
The contents of the volatile log are moved to the stable log periodically.
 Stable log: slower to access, but its contents survive a crash.
4.7.2 ASYNCHRONOUS CHECKPOINTING


After executing an event, a processor records a triplet (s, m, msgs_sent) in its volatile
storage:
 s: state of the processor before the event
 m: message that triggered the event
 msgs_sent: set of messages that were sent by the processor during the event
 A local checkpoint at a processor consists of the record of an event occurring at the
processor and it is taken without any synchronization with other processors.
 Periodically, a processor independently saves the contents of the volatile log in
the stable storage and clears the volatile log.
 This operation is equivalent to taking a local checkpoint.
4.7.3 RECOVERY ALGORITHM
The algorithm maintains the following data structures at each processor pi, for each
neighboring processor pj:
 SENTi→j(s): the number of messages sent by pi to pj up to state s
 RCVDi←j(s): the number of messages received by pi from pj up to state s
Basic idea:
 The main idea of the algorithm is to find a set of consistent checkpoints, from the
set of checkpoints.
 This is done based on the number of messages sent and received.
 Recovery may involve multiple iterations of roll backs by processors.
 Whenever a processor rolls back, it is necessary for all other processors to find
out if any message sent by the rolled back processor has become an orphan
message.
 The orphan messages are identified by comparing the number of messages sent to
and received from neighboring processors.
 When a processor restarts after a failure, it broadcasts a ROLLBACK message
that it has failed.
 The recovery algorithm at a processor is initiated when it restarts after a failure or
when it learns of a failure at another processor.
 Because of the broadcast of ROLLBACK messages, the recovery algorithm is
initiated at all processors.
 The rollback starts at the failed processor and slowly diffuses into the entire
system through ROLLBACK messages.
 During the kth iteration (k ≠ 1), a processor pi does the following:
(i) based on the state CkPti to which it was rolled back in the (k − 1)th iteration, it
computes SENTi→j(CkPti) for each neighbor pj and sends this value in a
ROLLBACK message to that neighbor;
(ii) pi waits for and processes ROLLBACK messages that it receives from its
neighbors in the kth iteration, and determines a new recovery point CkPti for pi
based on the information in these messages.
 At the end of each iteration, at least one processor will roll back to its final
recovery point, unless the current recovery points are already consistent. A
counter-based sketch of this iteration is given below.
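Under the assumption that each checkpoint carries cumulative SENT/RCVD counters (represented below as plain dictionaries, a convention of this sketch rather than the algorithm's wire format), the iteration condenses into the following single-machine loop. The initial checkpoint of each processor carries zero counters, so the loop always terminates.

def jv_recovery(ckpts):
    # ckpts[p]: list of p's saved states, oldest first; each state is a dict
    # {"sent": {q: n}, "rcvd": {q: n}} of cumulative per-neighbor counters.
    cur = {p: len(states) - 1 for p, states in ckpts.items()}   # latest state
    changed = True
    while changed:
        changed = False
        for p in ckpts:
            for q in ckpts:
                if p == q:
                    continue
                sent_by_q = ckpts[q][cur[q]]["sent"].get(p, 0)
                # p has received an orphan from q: roll p back one state further.
                while ckpts[p][cur[p]]["rcvd"].get(q, 0) > sent_by_q:
                    cur[p] -= 1
                    changed = True
    return cur    # index of every processor's recovery point

Each pass enforces RCVD_{p←q}(CkPt_p) ≤ SENT_{q→p}(CkPt_q) for every pair of processors, which is exactly the consistency condition applied in the worked example that follows.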

 Figure 4.9 shows a system consisting of three processors. Suppose processor Y fails and
restarts. If event ey2 is the latest checkpointed event at Y, then Y will restart
from the state corresponding to ey2.
 Because of the broadcast nature of ROLLBACK messages, the recovery
algorithm is also initiated at processors X and Z. Initially, X, Y, and Z set
CkPtX ← ex3, CkPtY ← ey2, and CkPtZ ← ez2, respectively, and X, Y, and
Z send the following messages.
 During the first iteration: Y sends ROLLBACK(Y, 2) to X and ROLLBACK(Y, 1) to Z;
X sends ROLLBACK(X, 2) to Y and ROLLBACK(X, 0) to Z; and Z sends
ROLLBACK(Z, 0) to X and ROLLBACK(Z, 1) to Y.
Since RCVDX←Y(CkPtX) = 3 > 2 (2 is the value received in the ROLLBACK(Y, 2)
message from Y), X will set CkPtX to ex2, satisfying RCVDX←Y(ex2) = 2 ≤ 2.
Since RCVDZ←Y(CkPtZ) = 2 > 1, Z will set CkPtZ to ez1, satisfying
RCVDZ←Y(ez1) = 1 ≤ 1.
At Y, RCVDY←X(CkPtY) = 1 < 2 and RCVDY←Z(CkPtY) = 1 = SENTZ←Y(CkPtZ).
Hence, Y need not roll back further.

 The second and third iterations will progress in the same manner. Note that
the set of recovery points chosen at the end of the first iteration, {ex2, ey2,
ez1}, is consistent, and no further rollback occurs.
CONSENSUS AND AGREEMENT ALGORITHMS

4.8 PROBLEM DEFINITION
A consensus algorithm is a process that achieves agreement on a single data value
among distributed processes or systems. Typical uses include:
 deciding whether to commit a distributed transaction to a database;
 designating a node as a leader for some distributed task;
 synchronizing state machine replicas and ensuring consistency among them.
4.8.1 ASSUMPTIONS IN CONSENSUS ALGORITHMS
1. Failure models:
 Some of the processes may be faulty in distributed systems.
 A faulty process can behave in any manner allowed by the failure model
assumed.
 Some of the well-known failure models include fail-stop, send-omission and
receive-omission, and Byzantine failures.
 Fail-stop model: a process may crash in the middle of a step, which could be the
execution of a local operation or the processing of a message for a send or receive
event; it may send a message to only a subset of the destination set before
crashing.
 Byzantine failure model: a process may behave arbitrarily.
 The choice of the failure model determines the feasibility and complexity of
solving consensus.
2. Synchronous/asynchronous communication:
 If a failure-prone process chooses to send a message to another process but fails,
the intended recipient cannot detect the non-arrival of the message.
 This is because that scenario is indistinguishable from one in which the
message merely takes a very long time in transit. This is a major hurdle in
asynchronous systems.
 In a synchronous system, an unsent-message scenario can be identified by the
intended recipient at the end of the round.
 The intended recipient can deal with the non-arrival of the expected message by
assuming the arrival of a message containing some default data, and then
proceeding with the next round of the algorithm.
3. Network connectivity:
 The system has full logical connectivity, i.e., each process can communicate with
any other by direct message passing.
4. Sender identification:
 A process that receives a message always knows the identity of the sender
process.
 When multiple messages are expected from the same sender in a single round, a
scheduling algorithm is employed that sends these messages in sub-rounds, so
that each message sent within the round can be uniquely identified.
5. Channel reliability
 The channels are reliable, and only the processes may fail.
 Authenticated vs. non-authenticated messages:
 With unauthenticated messages, when a faulty process relays a message to other
processes
 it can forge the message and claim that it was received from another process,
 it can also tamper with the contents of a received message before relaying it.
 When a process receives a message, it has no way to verify its authenticity. Such a
message is known as an unauthenticated message, also called an oral message or an unsigned message.
 Using authentication via techniques such as digital signatures, it is easier to solve
the agreement problem because, if some process forges a message or tampers
with the contents of a received message before relaying it, the recipient can detect
the forgery or tampering.
 Thus, faulty processes can inflict less damage.
6. Agreement variable:
 The agreement variable may be boolean or multivalued, and need not be an
integer.
 This simplifying assumption does not affect the results for other data types, but
helps in the abstraction while presenting the algorithms.
4.8.2 BYZANTINE GENERALS PROBLEM
The Byzantine Generals' Problem (BGP) is a classic problem faced by any distributed
computer network. Imagine that the grand Eastern Roman empire, also known as the Byzantine
empire, has decided to capture a city.
 There is fierce resistance from within the city.
 The Byzantine army has completely encircled the city.
 The army has many divisions and each division has a general.
 The generals communicate between each as well as between all lieutenants within
their division only through messengers.
 All the generals or commanders have to agree upon one of the two plans of
action.
 Exact time to attack all at once or if faced by fierce resistance then the time to
retreat all at once. The army cannot hold on forever.
 If the attack or retreat is without full strength then it means only one thing —
Unacceptable brutal defeat.
 If all generals and/or messengers were trustworthy then it is a very simple
solution.

 However, some of the messengers and even a few generals/commanders are


traitors. They are spies or even enemy soldiers.
 There is a very high chance that they will not follow orders or pass on the
incorrect message. The level of trust in the army is very less.
 Consider just the case of 1 commander and 2 lieutenants, and just 2 types of
messages: 'Attack' and 'Retreat'.

 In Fig 4.10, Lieutenant 2 is a traitor who purposely changes the message that
is to be passed to Lieutenant 1.
 Now Lieutenant 1 has received 2 messages and does not know which one to
follow. Assume Lieutenant 1 follows the Commander because of the strict
hierarchy in the army.
 Still, 1/3rd of the army is weaker by force, as Lieutenant 2 is a traitor, and this
creates a lot of confusion.
 However, what if the Commander is a traitor (as explained in Fig 4.11)? Now
2/3rd of the total army has followed the incorrect order and failure is certain.
 After adding 1 more Lieutenant and 1 more type of message (Let’s say the 3rd message
is ‘Not sure’), the complexity of finding a consensus between all the Lieutenants and the
Commander is increased.
 Now imagine the exponential increase when there are hundreds of Lieutenants.
This is BGP. It is applicable to every distributed network. All participants or nodes
('lieutenants') are of exactly equal hierarchy. If agreement is reachable, then protocols to
reach it need to be devised.
 All participating nodes have to agree upon every message that is transmitted
between the nodes.
 If a group of nodes is corrupt or the message that they transmit is corrupt then still
the network as a whole should not be affected by it and should resist this ‘Attack’.

 The network in its entirety has to agree upon every message transmitted in the
network. This agreement is called consensus.

The Byzantine agreement problem requires a designated source process,
with an initial value, to reach agreement with the other processes about its
initial value, subject to:
 Agreement: All non-faulty processes must agree on the same value.
 Validity: If the source process is non-faulty, then the agreed upon
value by all the non-faulty processes must be the same as the initial
value of the source.
 Termination: Each non-faulty process must eventually decide on a
value.

There are two other versions of the Byzantine agreement problem:
 the consensus problem
 the interactive consistency problem
 A correct process is a process that does not exhibit a Byzantine behaviour.
 A process is Byzantine if, during its execution, one of the following faults occurs:
 Crash: The process stops executing statements of its program and halts.
 Corruption: The process changes arbitrarily the value of a local variable
with respect to its program specification. This fault could be propagated to
other processes by including incorrect values in the content of a message sent
by the process.
 Omission: The process omits to execute a statement of its program. If a
process omits to execute an assignment, this could lead to a corruption fault.
 Duplication: The process executes a statement of its program more than once. If a process executes an assignment more than once, this could lead to a corruption fault.
 Misevaluation: The process misevaluates an expression included in its program. This fault is different from a corruption fault: misevaluating an expression does not imply the update of the variables involved in the expression and, in some cases, the result of an evaluation is not assigned to a variable.
4.8.3 CONSENSUS PROBLEM

All the processes have an initial value, and all the correct processes must agree on a single value. This is the consensus problem.
Consensus is a fundamental paradigm for fault-tolerant asynchronous distributed
systems. Each process proposes a value to the others. All correct processes have to agree
(Termination) on the same value (Agreement) which must be one of the initially proposed values
(Validity).
The requirements of the consensus problem are:
 Agreement: All non-faulty processes must agree on the same (single) value.
 Validity: If all the non-faulty processes have the same initial value, then the
agreed upon value by all the non-faulty processes must be that same value.
 Termination: Each non-faulty process must eventually decide on a value.
Interactive Consistency Problem:

All the processes have an initial value, and all the correct processes must agree upon a set of values, with one value for each process. This is the interactive consistency problem.
The formal specifications are:
 Agreement: All non-faulty processes must agree on the same array of values
A[v1, …,vn].
 Validity: If process i is non-faulty and its initial value is vi, then all non-faulty processes agree on vi as the ith element of the array A. If process j is faulty, then the non-faulty processes can agree on any value for A[j].
 Termination: Each non-faulty process must eventually decide on the array A.
The difference between the agreement problem and the consensus problem is that, in the
agreement problem, a single process has the initial value, whereas in the consensus problem, all
processes have an initial value.
4.9 OVERVIEW OF RESULTS
Consensus is not solvable in asynchronous systems even if one process can fail by
crashing. The results are tabulated below. f indicates the number of processes that can fail and n indicates the total number of processes.

S.No  Failure Mode            Synchronous System                       Asynchronous System

1.    No failure              Agreement is attainable. Common          Agreement is attainable. Concurrent
                              knowledge is also attainable.            common knowledge is also attainable.

2.    Crash failure           Agreement is attainable; f < n           Agreement is not attainable.
                              processes; f + 1 rounds are needed.

3.    Byzantine (malicious)   Agreement is attainable only if          Agreement is not attainable.
      failure                 f ≤ ⌊(n − 1)/3⌋; f + 1 rounds are
                              needed.

Solvable variants of agreement problem

Fig 4.14: Circumventing the impossibility results


A synchronous message-passing system and a shared memory system can be used to solve the consensus problem. The following are the weaker variants of the consensus problem in an asynchronous system:
 Terminating reliable broadcast: A correct process will always get a message even if the sender crashes while sending. If the sender crashes while sending the message, the message may even be null, but it still has to be delivered to the correct processes.
 K-set consensus: The non-faulty processes may agree on different values, as long as the size of the set of values agreed upon is bounded by k. It is solvable as long as the number of crashes f is less than the parameter k.
 Approximate agreement: The consensus value is from a multi-valued domain. The values agreed upon by the non-faulty processes must be within ε of each other.
 Renaming problem: Instead of agreeing on a single value, it requires the processes to agree on necessarily distinct values.
 Reliable broadcast: A weaker version of terminating reliable broadcast (RTB), in which the termination condition is dropped; it is solvable under crash failures.

Fig 4.15: Solvable variants of agreement problem in asynchronous system


4.10 AGREEMENT IN A FAILURE-FREE SYSTEM
In a failure-free system, consensus can be reached by collecting information from the
different processes, arriving at a decision, and distributing this decision in the system. A
distributed mechanism would have each process broadcast its values to others, and each process
computes the same function on the values received. The decision can be reached by using an
application specific function.
 Algorithms to collect the initial values and then distribute the decision may be based on token circulation on a logical ring, the three-phase tree-based broadcast-convergecast-broadcast, or direct communication with all nodes.
 In a synchronous system, this can be done simply in a constant number of rounds.
 Further, common knowledge of the decision value can be obtained using an
additional round.

 In an asynchronous system, consensus can similarly be reached in a constant number of message hops.
 Further, concurrent common knowledge of the consensus value can also be attained.
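
The decision procedure can be made concrete with a small sketch. The following is a minimal Python simulation of failure-free consensus, assuming reliable communication; the process names and the use of min() as the application-specific decision function are illustrative assumptions, not part of the text.

def failure_free_consensus(initial_values, decide=min):
    """initial_values: dict mapping process id -> proposed value."""
    # One broadcast round: with no failures, every process receives the
    # same multiset of values from all processes.
    received = {pid: list(initial_values.values()) for pid in initial_values}
    # Each process applies the identical decision function locally.
    decisions = {pid: decide(vals) for pid, vals in received.items()}
    assert len(set(decisions.values())) == 1   # Agreement holds trivially
    return decisions

print(failure_free_consensus({'P1': 7, 'P2': 3, 'P3': 5}))
# {'P1': 3, 'P2': 3, 'P3': 3} -- every process decides the same value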
4.11 AGREEMENT IN (MESSAGE-PASSING) SYNCHRONOUS SYSTEMS WITH FAILURES
4.11.1 Consensus algorithm for crash failures (synchronous system):
 The consensus algorithm for crash failures works in a message-passing synchronous system.
 The algorithm is for n processes, of which up to f (where f < n) may fail in the fail-stop failure model.
 Here the consensus variable x takes an integer value; each process i has an initial value xi. If up to f failures are to be tolerated, the algorithm has f + 1 rounds. In each round, a process i sends the value of its variable xi to all other processes if that value has not been sent before.
 Of all the values received within that round and its own value xi at the start of the round, the process takes the minimum and updates xi. After f + 1 rounds, the local value xi is guaranteed to be the consensus value.
 If one process among three is faulty, then f = 1, so agreement requires f + 1 = 2 rounds.
 For example, suppose the faulty process crashes while broadcasting, so that of the processes i, j, and k some receive its value and others do not. In the next round, each process that received the value relays it, so by the end of the second round all surviving processes hold the same set of values and take the same minimum. A runnable sketch of this algorithm is given after the correctness argument below.

 The agreement condition is satisfied because in the f + 1 rounds, there must be at least one round in which no process failed.
 In this round, say round r, all the processes that have not failed so far succeed in
broadcasting their values, and all these processes take the minimum of the values
broadcast and received in that round.
Thus, the local values at the end of the round are the same, say xi^r, for all non-failed processes.
 In further rounds, only this value may be sent by each process at most once, and no process i will update its value xi^r.
 The validity condition is satisfied because processes do not send fictitious values
in this failure model.
 For all i, if the initial value is identical, then the only value sent by any process is
the value that has been agreed upon as per the agreement condition.
 The termination condition is seen to be satisfied.
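
As a concrete illustration, here is a small Python simulation of the f + 1 round algorithm described above, under the simplifying assumption that a crashed process sends nothing in its crash round (in the full fail-stop model it may reach a subset of destinations before crashing). The process names and the crash schedule are illustrative assumptions.

def crash_consensus(values, f, crash_schedule=None):
    """values: dict pid -> initial integer; f: max crashes tolerated.
    crash_schedule: dict pid -> round in which that process crashes."""
    crash_schedule = crash_schedule or {}
    x = dict(values)                     # current estimate per process
    sent = {pid: set() for pid in x}     # values already broadcast
    alive = set(x)
    for rnd in range(1, f + 2):          # f + 1 rounds
        inbox = {pid: [] for pid in alive}
        for pid in sorted(alive):
            if crash_schedule.get(pid) == rnd:
                alive.discard(pid)       # crashes, silent from now on
                continue
            if x[pid] not in sent[pid]:  # send only values not sent before
                sent[pid].add(x[pid])
                for q in alive:
                    if q != pid:
                        inbox[q].append(x[pid])
        for pid in alive:                # minimum of own value + received
            x[pid] = min([x[pid]] + inbox[pid])
    return {pid: x[pid] for pid in alive}

# f = 1, so 2 rounds; P1 crashes before broadcasting its 0, and the
# survivors agree on 1: {'P2': 1, 'P3': 1}
print(crash_consensus({'P1': 0, 'P2': 1, 'P3': 2}, f=1, crash_schedule={'P1': 1}))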
Complexity
 The algorithm requires f + 1 rounds, where f < n. The number of messages is O(n^2) in each round, and each message carries one integer. Hence, the total number of messages is O((f + 1) · n^2).
Lower bound on the number of rounds
 At least f + 1 rounds are required, where f < n.
 In the worst-case scenario, one process may fail in each round; with f + 1 rounds,
there is at least one round in which no process fails.
 In that guaranteed failure-free round, all messages broadcast can be delivered
reliably, and all processes that have not failed can compute the common function
of the received values to reach an agreement value.

4.11.2 CONSENSUS ALGORITHMS FOR BYZANTINE FAILURES (SYNCHRONOUS SYSTEM)

Upper bound on Byzantine processes
 In a system of n processes, the Byzantine agreement problem can be solved in a synchronous system only if the number of Byzantine processes f is such that f ≤ ⌊(n − 1)/3⌋, i.e., n ≥ 3f + 1.


Fig 4.17: Impossibility of achieving Byzantine agreement with n = 3 processes and f = 1 malicious process
 With n = 3 and f = 1, the condition f ≤ (n − 1)/3 is violated, so by the result above Byzantine agreement is not possible.
 Two scenarios must be considered: in one, the source P0 is faulty; in the other, the source P0 is non-faulty but one of the remaining processes is.
 When the source is non-faulty but, say, P2 is faulty, the source sends the same value to P1 and P2, while P2 relays a different value because it is faulty. Process P1 receives conflicting reports and cannot distinguish this scenario from the one in which the source itself is faulty, so it cannot always decide correctly.
 Agreement is possible when f = 1 and the total number of processes is 4.
 To see why, take the commander Pc as the source and suppose it is faulty: it sends 0 to Pd and Pb, but 1 to Pa.
 In the next round, Pa relays the 1 it received to both of its neighbours, while Pb and Pd, being non-faulty, relay the 0 they received.
 Each lieutenant now holds one 1 and two 0s, so the majority computed at Pa, at Pb, and at Pd is 0 in every case.
 Thus, even if the source is faulty, the processes reach agreement, and the agreed-upon value of the agreement variable is 0.

 Consider Fig 4.18. Commander Pc sends its value to the other three lieutenants.
 In the second round, each lieutenant relays to the other two lieutenants, the value
it received from the commander in the first round.
 At the end of the second round, a lieutenant takes the majority of the values it
received
(i) directly from the commander in the first round
(ii) from the other two lieutenants in the second round.

Fig 4.18: Achieving Byzantine agreement when n = 4 processes and f = 1 Malicious process

Byzantine agreement tree algorithm: exponential (synchronous system)- recursive formulation

 The majority gives a correct estimate of the commander’s value.


 All three lieutenants take the majority of (1, 0, 0), which is “0,” the agreement value. Pd is malicious. Lieutenants Pa and Pb agree on “0,” the value of the commander.
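
The two-round majority vote of Fig 4.18 can be sketched in a few lines of Python. The scenario below, in which lieutenant Pd maliciously flips the bit it relays, is an illustrative assumption matching the discussion above.

from collections import Counter

def majority(vals):
    # Most common value; ties resolve by first-encountered order.
    return Counter(vals).most_common(1)[0][0]

# Round 1: the commander Pc sends 0 to each lieutenant.
received = {'Pa': 0, 'Pb': 0, 'Pd': 0}

# Round 2: each lieutenant relays its round-1 value to the other two;
# the traitor Pd relays the flipped bit instead.
relayed = {lt: (1 - v if lt == 'Pd' else v) for lt, v in received.items()}

for lt in ('Pa', 'Pb'):                      # the loyal lieutenants decide
    votes = [received[lt]] + [relayed[o] for o in received if o != lt]
    print(lt, 'decides', majority(votes))    # both decide 0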


 Lamport-Shostak-Pease Algorithm

This is also known as the Oral Message algorithm OM(f), where f is the number of faulty processes and n is the total number of processes (n ≥ 3f + 1). The algorithm is defined recursively. The base case OM(0) is:
1. The source process sends its value to each other process.
2. Each process uses the value it receives from the source.
For f > 0, OM(f) has the source send its value to each other process; each process then acts as the source in an instance of OM(f − 1) to relay the value it received to the remaining processes; finally, each process decides on the majority of the value it received directly and the values relayed to it by the others.
 The algorithm is recursive; the base of the recursion, OM(0), says that the source process sends its value to each other process, and each process uses the value it receives from the source. If no value is received, the default value 0 is assumed.
 Each message has the following parameters:
i) a consensus estimate value (v)
ii) a set of destinations (Dests)
iii) a list of nodes traversed by the message, from most recent to least recent
(List)
iv) The number of Byzantine processes that the algorithm still needs to
tolerate (faulty).
 The commander invokes the algorithm with parameter faulty set to f, the maximum
number of malicious processes to be tolerated.
 The algorithm uses f + 1 synchronous rounds. Each message (having this parameter
faulty = k) received by a process invokes several other instances of the algorithm
with parameter faulty = k − 1.
 The terminating case of the recursion is when the parameter faulty is 0.
 As the recursion folds, each process progressively computes the majority function
over the values it used as a source for that level of invocation in the unfolding, and
the values it has just computed as consensus values using the majority function for
the lower level of invocations.

 There are an exponential number of messages, O(n^f), used by this algorithm.
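
A compact recursive sketch of OM(f) in Python follows. The way the traitors misbehave here (a traitorous sender transmits an alternating bit to its destinations) is one illustrative choice of Byzantine behaviour; any other arbitrary behaviour is equally admissible.

from collections import Counter

def majority(vals):
    return Counter(vals).most_common(1)[0][0]

def om(f, commander, lieutenants, value, traitors):
    """Return dict: lieutenant -> the value it uses for this commander."""
    # The commander sends `value` to every lieutenant; a traitorous
    # commander sends an arbitrary bit (here: alternating) instead.
    sent = {lt: (i % 2 if commander in traitors else value)
            for i, lt in enumerate(lieutenants)}
    if f == 0:                          # OM(0): use the value received
        return sent
    # Each lieutenant relays its value to the others via OM(f - 1).
    relayed = {lt: om(f - 1, lt,
                      [o for o in lieutenants if o != lt],
                      sent[lt], traitors)
               for lt in lieutenants}
    # Each lieutenant decides on the majority of the value it received
    # directly and the values the other lieutenants relayed to it.
    return {lt: majority([sent[lt]] +
                         [relayed[o][lt] for o in lieutenants if o != lt])
            for lt in lieutenants}

# n = 4, f = 1, traitorous lieutenant P4: the loyal lieutenants P2 and P3
# both settle on the loyal commander's value 1.
print(om(1, 'P1', ['P2', 'P3', 'P4'], 1, traitors={'P4'}))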

Fig 4.20: Relationship between number of rounds and messages


 As multiple messages are received in any one round from each of the other
processes, they can be distinguished using the List, or by using a scheduling
algorithm within each round.
 Each process maintains a tree of boolean variables. The tree data structure at a non-
initiator is used as follows:
 There are f + 1 levels from level 0 through level f.


 Level 0 has one root node, vinit, after round 1.
 Level h, 0 < h ≤ f, has 1 · (n − 2)(n − 3) … (n − h)(n − (h + 1)) nodes after round h + 1. Each node at level (h − 1) has (n − (h + 1)) child nodes.
 Node vL,k denotes the command received from the node head(L) by node k, which forwards it to node i. The command was relayed to head(L) by head(tail(L)), which received it from head(tail(tail(L))), and so on. The very last element of L is the commander, denoted Pinit.
 In the f + 1 rounds of the algorithm (lines 2a–2e of the iterative version), each level k, 0 ≤ k ≤ f, of the tree is successively filled to remember the values received at the end of round k + 1, and with which the process sends the multiple instances of the OM message with the fourth parameter as f − (k + 1) for round k + 2.
 Once the entire tree is filled from root to leaves, the actions in the folding of the
recursion are simulated in lines 2f–2h of the iterative version, proceeding from the leaves
up to the root of the tree. These actions are crucial – they entail taking the majority of the
values at each level of the tree. The final value of the root is the agreement value, which
will be the same at all processes.

Fig 4.22: Local tree at P3 for solving the Byzantine agreement, for n = 10 and f = 3. Only one branch of the tree is shown for simplicity.
Correctness
The correctness of the Byzantine agreement algorithm can be observed from the following two
informal inductive arguments. Here we assume that the Oral_Msg algorithm is invoked with
parameter x, and that there are a total of f malicious processes. There are two cases depending on
whether the commander is malicious. A malicious commander causes more chaos than an honest
commander.
Phase-king algorithm for consensus: polynomial (synchronous system)
 The phase-king algorithm proposed by Berman and Garay solves the consensus problem under the same model, requiring f + 1 phases and a polynomial number of messages, but it can tolerate only f < ⌈n/4⌉ malicious processes.
 The algorithm is so called because it operates in f + 1 phases, each with two rounds, and a unique process plays an asymmetrical role as the leader in each phase.

 In the first round of each phase, each process broadcasts its estimate of the
consensus value to all other processes, and likewise awaits the values broadcast
by others.
 At the end of the round, it counts the number of “1” votes and the number of “0” votes. If either number is greater than n/2, then it sets its majority variable to that value, and sets mult to the number of votes received for the majority value.
 If neither number is greater than n/2, which may happen when the malicious processes do not respond and the correct processes are split among themselves, then a default value is used for the majority variable.
 In the second round (lines 1g–1o) of each phase, the phase king initiates processing; the phase king for phase k is the process with identifier Pk, where k ∈ {1, …, n}.


 The phase king broadcasts its majority variable majority, which serves the role of a tie-breaker vote for those other processes whose value of mult is less than n/2 + f.
 Thus, when a process receives the tie-breaker from the phase king, it updates its
estimate of the decision variable v to the value sent by the phase king if its own
mult variable < n/2 + f.
 The reason for this is that among the votes for its own majority value, f votes
could be bogus and hence it does not have a clear majority of votes (i.e., > n/2)
from the non-malicious processes.
 Hence, it adopts the value of the phase king. However, if mult > n/2 + f (lines 1k–1l), then it has received a clear majority of votes from the non-malicious processes, and hence it updates its estimate of the consensus variable v to its own majority value, irrespective of what tie-breaker value the phase king has sent in the second round.
 At the end of f + 1 phases, it is guaranteed that the estimate v of all the processes
is the correct consensus value.
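
The following Python sketch simulates the phase-king algorithm for binary consensus. The traitor model used here (a traitor always reports to each receiver the opposite of that receiver's own estimate) is an illustrative assumption; for correctness n > 4f must hold.

def phase_king(values, f, traitors):
    """values: dict pid -> initial bit; needs n > 4f for correctness."""
    n, pids, v = len(values), sorted(values), dict(values)
    for k in range(f + 1):                 # f + 1 phases, kings P1..P(f+1)
        king = pids[k]
        majority, mult = {}, {}
        # Round 1: everyone broadcasts its estimate; a traitor sends each
        # receiver the opposite of that receiver's own estimate.
        for p in pids:
            votes = [(1 - v[p]) if q in traitors else v[q] for q in pids]
            ones = sum(votes)
            majority[p] = 1 if ones > n / 2 else 0
            mult[p] = ones if majority[p] == 1 else n - ones
        # Round 2: the phase king broadcasts its majority as a tie-breaker.
        for p in pids:
            if p in traitors:
                continue
            if mult[p] > n / 2 + f:        # clear majority, ignore the king
                v[p] = majority[p]
            else:                          # adopt the king's tie-breaker
                v[p] = (1 - v[p]) if king in traitors else majority[king]
    return {p: v[p] for p in pids if p not in traitors}

# n = 5, f = 1, traitor P5: the four loyal processes decide the same bit.
print(phase_king({'P1': 0, 'P2': 1, 'P3': 1, 'P4': 0, 'P5': 1}, 1, {'P5'}))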
Correctness
The correctness reasoning is in three steps:
 Among the f + 1 phases, the phase king of some phase k is non-malicious because
there are at most f malicious processes.
 As the phase king of phase k is non-malicious, all non-malicious processes can be
seen to have the same estimate value v at the end of phase k.
 All non-malicious processes have the same consensus estimate v at the start of phase k + 1, and they continue to have the same estimate at the end of phase k + 1.
Complexity
The algorithm requires f + 1 phases with two sub-rounds in each phase, and (f + 1)[(n − 1)(n + 1)] messages.

PART-A (Possible Questions)


1. Define checkpoint and rollback.
The saved state is called a checkpoint, and the procedure of restarting from a previously
checkpointed state is called rollback recovery.
2. Define Domino effect.
In rollback propagation, the dependencies may force some of the processes that did not fail to roll back. This phenomenon is called the domino effect.
3. What is Independent checkpointing?
If each participating process takes its checkpoints independently, then the system is susceptible
to the domino effect. This approach is called independent or uncoordinated checkpointing.
4. What is Coordinated checkpointing?
It is desirable to avoid the domino effect and therefore several techniques have been developed to prevent it. One such technique is coordinated checkpointing, where processes coordinate their checkpoints to form a system-wide consistent state.
5. What is Communication induced checkpointing?
In case of a process failure, the system state can be restored to such a consistent set of
checkpoints, preventing the rollback propagation. Communication-induced checkpointing forces
each process to take checkpoints based on information piggybacked on the application messages
it receives from other processes. Checkpoints are taken such that a system-wide consistent state
always exists on stable storage, thereby avoiding the domino effect.
6. What is Rollback recovery protocol?
Rollback recovery protocol is a process in which a system recovers correctly if its internal state
is consistent with the observable behavior of the system before the failure.
7. Define local checkpointing.
A local checkpoint is a snapshot of the state of the process at a given instance and the event of
recording the state of a process is called local checkpointing. A local checkpoint is a snapshot of
a local state of a process and a global checkpoint is a set of local checkpoints, one from each
process.
8. What is consistent global state?
A consistent global state is one that may occur during a failure-free execution of a distributed
computation.
9. What is global state?
A global state of a distributed system is a collection of the individual states of all participating
processes and the states of the communication channels.
10. Define consistent system state.
A consistent system state is one in which, if a process’s state reflects a message receipt, then the state of the corresponding sender must reflect the sending of that message.



11. What is global checkpoint?
It is a set of local checkpoints, one from each process
12. Define consistent global checkpoint.
It is a global checkpoint such that no message is sent by a process after taking its local point that
is received by another process before taking its checkpoint.
13. Define OWP.
The Outside World Process (OWP) is a defined as a special process that interacts with the rest of
the distributed system through message passing.
14. Write about output commit problem.
Before sending output to the OWP, the system must ensure that the state from which the output
is sent will be recovered despite any future failure. This is commonly called the output commit
problem.
15. What are Delayed messages?
 Messages whose receive is not recorded because the receiving process was either
down or the message arrived after the rollback of the receiving process, are
called delayed messages.
 Messages m2 and m5 are delayed messages.
16. Define Orphan messages.
 Messages with receive recorded but message send not recorded are called orphan
messages.
 A rollback might have undone the send of such messages, leaving the receive
event intact at the receiving process.
 Orphan messages do not arise if processes roll back to a consistent global state.
17. Write about check point based recovery.
In check point based recovery, the state of each process and the communication channel is
checkpointed frequently so that when a failure occurs, the system can be restored to a globally
consistent set of checkpoints.
18. List the types of rollback-recovery techniques.
(a) Uncoordinated checkpointing
(b) Coordinated checkpointing
(c) Communication-induced checkpointing
19. Give the limitations of uncoordinated checkpointing.
- Domino effect during a recovery
- Recovery from a failure is slow because processes need to iterate to find a
consistent set of checkpoints
- Each process maintains multiple checkpoints and periodically invoke a
garbage collection algorithm
- Not suitable for application with frequent output commits


20. What is blocking checkpointing?


After a process takes a local checkpoint, to prevent orphan messages, it remains blocked until the
entire checkpointing activity is complete. The disadvantages are the computation is blocked
during the checkpointing.
21. What is Non-blocking Checkpointing?
The processes need not stop their execution while taking checkpoints. A fundamental problem in
coordinated checkpointing is to prevent a process from receiving application messages that could
make the checkpoint inconsistent.
22. Define Z-dependency.
Z-dependency is the dependency between checkpoints induced by zigzag (Z) paths of messages. A consequence of Z-dependency is that there does not exist a non-blocking algorithm that will allow a minimum number of processes to take their checkpoints.
23. Give the types of communication-induced checkpointing.
 The checkpoints that a process takes independently are called local checkpoints.
 The process is forced to take are called forced checkpoints.
24. What is MRS model?
The MRS (mark, send, and receive) model avoids the domino effect by ensuring that within
every checkpoint interval all message receiving events precede all message- sending events.
25. Write about Index-based checkpointing
 This assigns monotonically increasing indexes to checkpoints, such that the
checkpoints having the same index at different processes form a consistent state.
 Inconsistency between checkpoints of the same index can be avoided in a lazy fashion if indexes are piggybacked on application messages to help receivers decide when they should take a forced checkpoint.
26. Define log-based rollback recovery.
It makes use of deterministic and nondeterministic events in a computation.
27. State No-orphans condition.
It states that if any surviving process depends on an event e, then either event e is logged on the
stable storage, or the process has a copy of the determinant of event e.
28. Define pessimistic logging.
Pessimistic logging protocols assume that a failure can occur after any non- deterministic event
in the computation.
29. What is optimistic logging?
 In these protocols, processes log determinants asynchronously to the stable
storage.
 Optimistically assume that logging will be complete before a failure occurs.
 These protocols do not implement the always-no-orphans condition.



30. Write about causal logging.
 This combines the advantages of both pessimistic and optimistic logging at the
expense of a more complex recovery protocol.
 Like optimistic logging, it does not require synchronous access to the stable
storage except during output commit
 Like pessimistic logging, it allows each process to commit output independently
and never creates orphans, thus isolating processes from the effects of failures at
other processes.
31. Define Staggering checkpoints.
It is an algorithm which staggers checkpoints in time; staggering checkpoints can help avoid near-simultaneous heavy loading of the disk system.
32. What are the two types of log storage?
 Volatile log: It takes a short time to access but is lost if the processor crashes. The contents of the volatile log are moved to the stable log periodically.
 Stable log: It takes a longer time to access but remains even if the processor crashes.

33. Define consensus algorithm.


A consensus algorithm is a process that achieves agreement on a single data value among
distributed processes or systems.
34. What is the fail-stop model?
A process may crash in the middle of a step, which could be the execution of a local operation or the processing of a message for a send or receive event. It may send a message to only a subset of the destination set before crashing.
35. What is Byzantine failure model?
A process may behave arbitrarily.
36. Define consensus.
The network in its entirety has to agree upon every message transmitted in the network. This
agreement is called as consensus.
37. State the Byzantine agreement problem.
The Byzantine agreement problem requires a designated source process, with an initial value, to reach agreement with the other processes about its initial value, subject to:
 Agreement: All non-faulty processes must agree on the same value.
 Validity: If the source process is non-faulty, then the agreed upon value by all the
non-faulty processes must be the same as the initial value of the source.
 Termination: Each non-faulty process must eventually decide on a value.
38. Define crash.
The process stops executing statements of its program and halts.



39. What is Corruption?
The process changes arbitrarily the value of a local variable with respect to its program
specification. This fault could be propagated to other processes by including incorrect values in
the content of a message sent by the process.
40. Define Omission.
The process omits to execute a statement of its program. If a process omits to execute an
assignment, this could lead to a corruption fault.
41. What is Duplication?
The process executes a statement of its program more than once. If a process executes an assignment more than once, this could lead to a corruption fault.
42. Define Misevaluation.
The process misevaluates an expression included in its program. This fault is different from a corruption fault: misevaluating an expression does not imply the update of the variables involved in the expression and, in some cases, the result of an evaluation is not assigned to a variable.
43. What is interactive consistency problem?
All the process has an initial value, and all the correct processes must agree upon a set of values,
with one value for each process. This is interactive consistency problem.
44. What is phase king algorithm?
The phase-king algorithm proposed by Berman and Garay solves the consensus problem under the same model, requiring f + 1 phases and a polynomial number of messages, but it can tolerate only f < ⌈n/4⌉ malicious processes.


PART-B (Possible Questions)


1. Write in detail about check pointing and rollback.
2. Discuss in detail about failure recovery issues.
3. Explain check point based recovery.
4. Describe log based recovery.
5. Elaborate Koo-Toueg coordinated checkpointing algorithm.
6. Discuss about Juang-Venkatesan’s asynchronous checkpoint and recovery.
7. Explain about consensus and agreement.
8. Tabulate the results of consensus problem.
9. Discuss about agreement in failure free system.
10. Describe agreement in (message-passing) synchronous systems with failures.

PART-C (Possible Questions)

1. Discuss about log based recovery.


2. Elaborate Koo-Toueg coordinated checkpointing algorithm.
3. Describe the Juang-Venkatesan’s asynchronous checkpoint and recovery.
4. Explain about consensus and agreement.
5. Tabulate the results of consensus problem.
6. Discuss about agreement in failure free system.
7. Describe agreement in (message-passing) synchronous systems with failures.


UNIT V
CLOUD COMPUTING

5.1 DEFINITION OF CLOUD COMPUTING

Originally, the cloud was thought of as a bunch of combined services, technologies, and activities.
What happened inside the cloud was not known to the users of the services. This is partially how the
cloud got its name. Cloud computing is a virtualization-based technology that allows us to create,
configure, and customize applications via an internet connection. The cloud technology includes a
development platform, hard disk, software application, and database. The term cloud refers to a network or
the internet. It is a technology that uses remote servers on the internet to store, manage, and access data
online rather than local drives. The data can be anything such as files, images, documents, audio, video,
and more.

The following operations can be performed using cloud computing:

 Developing new applications and services


 Storage, back up, and recovery of data
 Hosting blogs and websites
 Delivery of software on demand
 Analysis of data
 Streaming videos and audios

5.2 KEY CLOUD CHARACTERISTICS

The NIST definition of cloud computing outlines five key cloud characteristics: on-demand self-service,
broad network access, resource pooling, rapid elasticity, and measured service.
a)On-Demand Self-Service

On-demand self-service means that a consumer can request and receive access to a service offering,
without an administrator or some sort of support staff having to fulfill the request manually. The request
processes and fulfillment processes are all automated. This offers advantages for both the provider and the
consumer of the service.
b)Broad Network Access

Cloud services should be able to be accessed by a wide variety of client devices. Laptops and desktops
aren’t the only devices used to connect to networks and the Internet. Users also connect via tablets,
smartphones, and a host of other options. Cloud services need to support all of these devices. If the
service requires a client application, the provider may have to build platform-specific applications (e.g.,
Windows, Mac, iOS, and Android). Having to develop and maintain a number of different client
applications is costly, so it is extremely advantageous if the solution can be architected in such a way that
doesn’t require a client at all.

c)Resource Pooling
Resource pooling helps save costs and allows flexibility on the provider side. Resource pooling is based
on the fact that clients will not have a constant need for all the resources available to them. When
resources are not being used by one customer, instead of sitting idle those resources can be used by
another customer. This gives providers the ability to service many more customers than they could if each
customer required dedicated resources. Resource pooling is often achieved using virtualization.
Virtualization allows providers to increase the density of their systems. They can host multiple virtual
sessions on a single system. In a virtualized environment, the resources on one physical system are placed
into a pool that can be used by multiple virtual systems.

d)Rapid Elasticity
Rapid elasticity describes the ability of a cloud environment to easily grow to satisfy user demand. Cloud
deployments should already have the needed infrastructure in place to expand the service capacity. Rapid
elasticity is usually accomplished through the use of automation and orchestration. When resource usage
hits a certain point, a trigger is set off. This trigger automatically begins the process of capacity expansion.
Once the usage has subsided, the capacity shrinks as needed to ensure that resources are not wasted. The
rapid elasticity feature of cloud implementations is what enables them to be able to handle the “burst”
capacity needed by many of their users. Burst capacity is an increased capacity that is needed for only a
short period of time.
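
A toy sketch of such trigger-based scaling is shown below; the thresholds, step sizes, and numbers are illustrative assumptions, not taken from any particular provider.

def autoscale(capacity, usage, scale_up_at=0.8, scale_down_at=0.3,
              min_capacity=1):
    utilisation = usage / capacity
    if utilisation > scale_up_at:        # trigger fires: expand capacity
        return capacity + 1
    if utilisation < scale_down_at:      # usage subsided: shrink capacity
        return max(min_capacity, capacity - 1)
    return capacity

capacity = 2
for usage in [1.0, 1.8, 2.6, 2.0, 0.9, 0.5]:   # simulated load over time
    capacity = autoscale(capacity, usage)
    print(f"usage={usage:.1f} -> capacity={capacity}")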
e)Measured Service

Cloud services must have the ability to measure usage. Usage can be quantified using various metrics,
such as time used, bandwidth used, and data used. The measured service characteristic is what enables the
“pay as you go” feature of cloud computing. Once an appropriate metric has been identified, a rate is
determined. This rate is used to determine how much a customer should be charged. This way, the client is
billed based on consumption levels.
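
A minimal sketch of such consumption-based, pay-as-you-go billing follows; the metrics and rates are illustrative assumptions.

rates = {'compute_hours': 0.10, 'gb_bandwidth': 0.05, 'gb_storage': 0.02}
usage = {'compute_hours': 720, 'gb_bandwidth': 150, 'gb_storage': 500}

# The bill is simply metered usage times the agreed rate for each metric.
bill = sum(usage[m] * rates[m] for m in rates)
print(f"monthly charge: ${bill:.2f}")   # 720*0.10 + 150*0.05 + 500*0.02 = $89.50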

5.3 CLOUD DEPLOYMENT MODELS

The way the cloud is used varies from organization to organization. Every organization has its own requirements as to what services it wants to access from a cloud and how much control it wants to have over the environment. To accommodate these varying requirements, a cloud environment can be implemented using different service models. Each service model has its own set of requirements and benefits. The NIST definition of cloud computing outlines four different cloud deployment models: public, private, community, and hybrid.
i)PUBLIC CLOUDS

When most people think about cloud computing, they are thinking of the public cloud service model. In
the public service model, all the systems and resources that provide the service are housed at an external
service provider. That service provider is responsible for the management and administration of the
systems that are used to provide the service. The client is only responsible for any software or client
application that is installed on the end-user system. Connections to public cloud providers are usually
made through the Internet.

Benefits
The number of public cloud implementations continues to grow at a rapid pace due to the numerous
benefits public clouds offer.
 Availability
Public cloud deployments can offer increased availability over what is achievable internally. Every
organization has an availability quotient that they would like to achieve. Every organization also has an
availability quotient that they are capable of achieving. Sometimes the two match; sometimes they don’t.
The problem is that availability comes at a cost, whether hardware cost, software cost, training cost, or
staffing cost. Whichever it is, an organization may not be able to afford it, so they have to make do with
what they have and therefore not be able to achieve the level of availability they would like. Most public
cloud providers already have the hardware, software, and staffing in place to make their offerings highly
available. They may charge a little extra for the service to provide increased availability, but it will be
nowhere near the cost of doing it internally.


 Scalability
Public cloud implementations offer a highly scalable architecture, as do most cloud implementations.
What public cloud implementations offer that private clouds do not is the ability to scale your
organization’s capacity without having to build out your own infrastructure.
 Accessibility
Public cloud providers place great importance on accessibility. To expand their potential customer base as
wide as possible, they attempt to ensure that they can service as many different client types as possible.
Their goal is to ensure that their services can be accessed by any device on the Internet without the need
for VPNs or any special client software.
 Cost Savings
Public clouds are particularly attractive because of the cost savings they offer. But you do have to be
careful because the savings might not be as good as you think. You need to have a good understanding of
not only the amount of savings but also the type of savings.

Drawbacks

a)Integration Limitations
In public SaaS clouds, the systems are external to your organization; this means that the data is also
external. Having your data housed externally can cause problems when you’re doing reporting or trying to
move to on-premises systems. If you need to run reports or do business intelligence (BI) analytics against
the data, you could end up having to transmit the data through the Internet. This can raise performance
concerns as well as security issues.

b)Reduced Flexibility
When you are using a public cloud provider, you are subject to that provider’s upgrade schedule. In most
cases, you will have little or no influence over when upgrades are performed.

c)Forced Downtime
When you use a public cloud provider, the provider controls when systems are taken offline for
maintenance. Maintenance may be performed at a time that is inconvenient for you and your organization.

Responsibilities

With public clouds, most of the responsibilities lie with the service provider. The provider is responsible
for maintenance and support. The provider is also responsible for making sure support personnel are
properly trained.
Security Considerations
Ensuring security is especially difficult in public cloud scenarios. Since you probably won’t manage
access to the systems providing the services, it’s very difficult to ensure that they are secure. You basically
have to take the provider’s word for it and trust in the provider’s capabilities.


a)Data
Public cloud providers raise a real issue over data security. There is a question of data ownership. Since
the service provider owns the systems where your data resides, the provider could potentially be
considered the true owner of the data. There is also an issue with data access. Theoretically, anyone who
works at the service provider could potentially have access to your data.

b) Compliance
Compliance can be a big concern with public service providers, largely because you will have little to no visibility into what’s happening on the back end.
c) Auditing
In the case of public cloud providers, you will generally have limited auditing capabilities. You may not have direct access to any logs or event management systems. Many public cloud providers will allow you access to at least some form of application logs. These logs can be used to view user access and make decisions regarding licensing.

ii)PRIVATE CLOUDS
In a private cloud, the systems and resources that provide the service are located internal to the company
or organization that uses them. That organization is responsible for the management and administration of
the systems that are used to provide the service. In addition, the organization is also responsible for any
software or client application that is installed on the end-user system. Private clouds are usually accessed
through the local LAN or wide area network (WAN). In the case of remote users, the access will generally
be provided through the Internet or occasionally through the use of a virtual private network (VPN).
Benefits
a)Support and Troubleshooting
Private cloud environments can be easier to troubleshoot than public cloud environments. In a private
cloud environment, you will have direct access to all systems. You can access logs, run network traces,
run debug traces, or do anything else you need to do to troubleshoot an issue. You don’t have to rely on a
service provider for help.
b) Maintenance
With private clouds, you control the upgrade cycle. You aren’t forced to upgrade when you don’t want.
You don’t have to perform upgrades unless the newer version has some feature or functionality that you
want to take advantage of. You can control when upgrades are performed. If your organization has
regularly scheduled maintenance windows, you can perform your upgrades and other maintenance
activities during that specified timeframe. This may help reduce the overall impact of a system outage.
c) Monitoring
Since you will have direct access to the systems in your private cloud environment, you will be able to do
whatever monitoring you require. You can monitor everything from the application to the system
hardware. One big advantage of this capability is that you can take preemptive measures to prevent an
outage, so you are able to be more proactive in servicing your customers.


Drawbacks
a) Cost
Implementing a private cloud requires substantial upfront costs. You have to implement an infrastructure
that not only can support your current needs but your future needs as well. You need to estimate the needs
of all the business units you will be supporting. You also have to implement an infrastructure that can
support peak times. All the systems needed to support peak times don’t always have to be running if you
have a way of automatically starting them when necessary.
b) Hardware and Software Compatibility
You have to make sure the software you implement is compatible with the hardware in your environment.
In addition, you have to make sure the software you implement is compatible with the clients in your
environment.
c) Expertise Needed
With private clouds you still need expertise in all the applications and systems you want to deploy. The need for internal expertise can lead to expensive training and education. You will be responsible for installing, maintaining, and supporting them, so you must ensure that you either have the in-house knowledge to do so or the ability to bring in outside contractors or consultants to help.
Responsibilities
a)Security Considerations
With a private cloud implementation, your organization will have complete control over the systems,
applications, and data. You can control who has access to what. Ensuring security is easier in a private
cloud environment. There you have complete control over the systems, so you can implement any security
means you like.
b)Compliance
In a private cloud environment, you are responsible for making sure that you follow any applicable
compliance regulations. If your organization has the skills and the technology to ensure adherence to
compliance regulations, having the systems and the data internal can be a big advantage. If you don’t have
the skills and technology, you will have to obtain the skills, or you could face serious problems.
c)Data
In a private cloud environment, you own the data and the systems that house the data. This gives you more control over who can access the data and what they can do with it. It also gives you greater assurance that your data is safe.
d)Auditing
In a private cloud environment, you have complete access to all the application and system logs. You can see who accessed what and what they did with it. The biggest advantage is that you can see all of this in real time, so you are able to take any corrective action necessary to ensure the integrity of your systems.
iii)COMMUNITY CLOUDS
Community clouds are semi-public clouds that are shared between members of a select group of
organizations. These organizations will generally have a common purpose or mission. The organizations
do not want to use a public cloud that is open to everyone. They want more privacy than what a public
cloud offers. In addition, each organization doesn’t want to be individually responsible for maintaining the
cloud; they want to be able to share the responsibilities with others.


Benefits
a)Cost
In a community cloud, costs are shared between the community members. This shared cost allows for the
purchase of infrastructure that any single member organization may not have been able to afford. This way
the community members are also able to achieve greater economies of scale.
b)Multitenancy
In a community cloud, multitenancy can help you take advantage of some economies of scale. Your
organization alone may not be large enough to take advantage of some of the cost savings, but working
with another organization or multiple organizations, together you may be large enough to see these
benefits.
Drawbacks
There are some potential drawbacks to implementing a community cloud. Any time you have multiple
organizations working together, there is the potential for conflict. Steps must be taken to mitigate any
potential issues.
a)Ownership
Ownership in a community cloud needs to be clearly defined. If multiple organizations are coming
together to assemble infrastructure, you must determine some agreement for joint ownership. If you are
purchasing capital resources, those resources need to go against some organization’s budget. In some
instances, the organizations coming together to build the community cloud may establish a single common
organization that can “own” the resources.
b) Responsibilities
In a community cloud, responsibilities are shared between the member organizations. There may be
problems deciding who owns what and who is responsible for what, but after those questions have been
decided, the shared responsibility can be quite beneficial. This shared responsibility reduces the
administrative burden on any single organization.
c)Security Considerations
Community clouds present a special set of circumstances when it comes to security because there will be
multiple organizations accessing and controlling the environment.
d)Data
In a community cloud, all the participants in the community may have access to the data. For this reason,
you don’t want to store any data that is restricted to only your organization.
e)Compliance
In a community cloud, compliance can be particularly tricky. The systems will be subject to all the
compliance regulations to which each of the member organizations is subject. So, your organization may
be subject to regulations with which you have little familiarity.
f)Auditing
In a community cloud, member organizations will have shared access to all the application and system
audit logs. You will want to have some agreement as to who will perform what activities. Trawling through logs can be particularly tedious and time-consuming, so you don’t want people wasting time doing duplicate work.


iv)HYBRID CLOUDS
A hybrid cloud model is a combination of two or more other cloud models. The clouds themselves are not
mixed together; rather, each cloud is separate, and they are all linked together. A hybrid cloud may
introduce more complexity to the environment, but it also allows more flexibility in fulfilling an
organization’s objectives.

Benefits
In addition to the benefits brought by each of the cloud models, the hybrid cloud model brings increased
flexibility. If your ultimate goal is to move everything to a public service provider, a hybrid environment
allows you to move to a cloud environment without being forced to move everything public until the time
is right. You may have certain applications for which the public service offerings are very expensive. You
can keep these applications internal until the price comes down. You may also have security concerns
about moving certain types of data to a public service provider. Again, the hybrid cloud model allows you
to leave that data internal until you can be assured that it will be safe in a public cloud environment.

Drawbacks
A hybrid cloud environment can be the most complex environment to implement. You have different
considerations for each type of cloud you plan to implement. Not all your rules and procedures will apply
to all environments. You will have to develop a different set of rules and procedures for each environment.
a)Integration
You may have some applications in a public cloud and some applications in a private one, but these
applications may need to access and use the same data. You have two choices here: You can duplicate
copies of data, which would require you to set up some type of replication mechanism to keep the data in
sync, or you can move data around as needed. Moving data around in a hybrid cloud environment can be
tricky because you have to worry about bandwidth constraints.
b)Security Considerations
Hybrid clouds can bring about particular security considerations. Not only do you have to worry about
security issues in each individual environment, you have to worry about issues created by connecting the
environments together.
c)Data
Moving data back and forth between cloud environments can be very risky. You have to ensure that all
environments involved have satisfactorily secured data. Data in motion can be particularly difficult to
secure. Both sides of the conversation must support the same security protocols, and they must be
compatible with each other.
d)Auditing
Auditing in hybrid environments can be tricky. User access may rotate between internal and external.
Following a process from start to finish may take you through both internal and external systems. It’s
important that you have some way of doing event log correlation so that you can match up these internal
and external events.

5.4 CLOUD SERVICE MODELS


The NIST definition of cloud computing outlines three basic service models: Infrastructure as a Service
(IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
 Infrastructure as a Service
Infrastructure as a Service, or IaaS, provides basic infrastructure services to customers. These services
may include physical machines, virtual machines, networking, storage, or some combination of these. You
are then able to build whatever you need on top of the managed infrastructure. IaaS implementations are
used to replace internally managed datacenters. They allow organizations more flexibility but at a reduced
cost.
An IaaS provider may provide you with hardware resources such as servers. These servers would be
housed in the provider’s datacenter, but you would have direct access to them. You could then install
whatever you needed to onto the servers. This can be costly, though, because the provider would not be
able to make use of multitenancy or economies of scale. Therefore, customers would have to absorb all the
costs of the systems themselves.
Responsibilities
In an IaaS deployment, the customer is responsible for most of the environment. The provider is
responsible for the hypervisor layer (if used) and below. This includes physical hardware, storage, and
networking. The physical hardware will be stored in the provider’s datacenter, but the customer may have
full access to it. The customer is responsible for obvious things like operating system and application
maintenance. The customer is responsible for ensuring that systems have up-to-date antivirus.


Drivers
Many organizations look to IaaS providers to expand their capacity. Instead of spending a lot of money
expanding a datacenter or building a new datacenter, organizations are basically renting systems provided
by an IaaS provider.
Challenges
There have been several challenges to IaaS adoption. Most organizations see the benefits, but they worry
about the loss of control. The total cost can also be an issue. In many IaaS environments, you are charged
for resource usage, such as processor and memory. Unless you carefully monitor your system usage, you
may be in for a shock when the bill comes.
Security Challenges
The security challenges for IaaS implementations are similar to those for other service providers.
However, since the provider does not need access to the actual operating system or items at a higher level,
there is no need for them to have administrative accounts on the system. This can give the customer at
least some level of comfort regarding security.
IaaS Providers
IaaS providers are really picking up steam in the marketplace. This isn’t just due to demand. There is also
the fact that IaaS platforms such as CloudStack and OpenStack have been developed to make automation
and orchestration easier. Here we cover two of the most well-known IaaS providers: Amazon EC2 and
Rackspace.

Amazon Elastic Compute Cloud (EC2)


The other important type of IaaS is compute as a service, whereby computing resources are offered as a service. Of course, for a useful compute-as-a-service offering, it should be possible to associate storage with the computing service (so that the results of the computation can be made persistent). Virtual networking is needed as well so that it is possible to communicate with the computing instance. All these together make up Infrastructure as a Service (IaaS).

 Platform as a Service
Platform as a Service, or PaaS, provides an operating system, development platform, and/or a database
platform. PaaS implementations allow organizations to develop applications without having to worry
about building the infrastructure needed to support the development environment.


PaaS Characteristics
PaaS implementations allow organizations to build and deploy Web applications without having to build their own infrastructure. PaaS offerings generally include facilities for development, integration, and testing.

a) Customization
With PaaS, you have complete control over the application, so you are free to customize the application as
you see fit.
b) Analytics
Since you, the customer, will be creating the applications, you will have the ability to view application
usage and determine trends. You will be able to see which components are getting the most use and which
ones are not being used.

c) Integration


In a PaaS environment, the data will be stored at the provider site, but the customer will have direct
access to it. Conducting business intelligence and reporting should not be a problem from an access point
of view, but you could run into issues when it comes to bandwidth usage, because you may be moving
large amounts of data between your internal environment and the provider’s environment.

PaaS Responsibilities

In a PaaS offering, responsibilities are somewhat distributed between the service provider and the
customer.


In a PaaS implementation, the customer is generally responsible for everything above the operating system
and development platform level. You will be responsible for installing and maintaining any additional
applications you will need. This includes application patching and application monitoring. The database
platform may be supplied for you, but you will be responsible for the data. In a PaaS implementation, you
will usually have direct access to the data. If there are any problems with the data, you will be able to
implement any direct data fix you might need to perform.

PaaS Drivers

There have been many drivers influencing the growth of the PaaS market. Many organizations want to
move towards a public cloud model, but can’t find public SaaS providers offering the applications they
need. A PaaS model allows them to move the infrastructure and platforms out of their internal datacenters
while allowing them to be able to develop the applications they need.

PaaS Challenges

A number of challenges come into play with public PaaS environments, including issues related to
flexibility and security.

Flexibility Challenges

You may have difficulty finding a provider with the platform you need. Most PaaS providers limit their
offerings to specific platform sets. If you need a special set or special configuration, you might not be able
to find a provider that offers what you need.

Security Challenges

The provider will have administrative control over the operating system and the database platform. Since
the provider has direct access to the systems, they will have direct access to all of the applications and
data.

PaaS Providers
The number of PaaS providers in the market continues to grow. First we take a look at Windows Azure.

a) Windows Azure
Windows Azure has a free offering and upgraded offerings that include features such as increased SLAs. Windows Azure makes it very easy to spin up a Web site or development platform. Windows Azure includes a wide variety of options such as compute services, data services, app services, and network services.


 Software as a Service
Software as a Service, or SaaS, provides application and data services. Applications, data, and all the
necessary platforms and infrastructure are provided by the service provider. SaaS is the original cloud
service model. It still remains the most popular model, offering by far the largest number of provider
options.


Customization
With SaaS implementations, the service provider usually controls virtually everything about the application. In many cases, this will limit any customization that can be done. But depending on the implementation, you may be able to request that the user interface (UI) or the look and feel of the application be modified slightly. Usually wholesale changes are not allowed. In most cases the customer will not be able to make the changes themselves; the provider will have to make the changes. In a SaaS environment, allowing customization can be very costly for the service provider and, hence, the customer.
Support and Maintenance
In a SaaS environment, software upgrades are centralized and performed by the service provider. You
don’t have to worry about upgrading software on multiple clients. The centralized upgrades allow for more
frequent upgrades, which can allow for accelerated feature delivery. The exception to this rule is when
there is client software that is used to access the centralized application. But most SaaS providers try to provide access to their applications without requiring a client application.
Analytics
Analytics and usage statistics can provide valuable information about application usage. In SaaS
implementations, the provider has the ability to view user activities and determine trends. In many cases
this information is shared with the customers. For large organizations, this information can be invaluable.
Since most cloud environments are pay-as-you-go offerings, it’s important to understand usage trends.
Understanding trends helps you understand when you may have a spike in usage and therefore a spike in
costs. It’s also important to understand concurrent usage and total usage. You may be able to reduce your
license costs.
Integration
In a SaaS environment, the data will be stored at the provider site. In most cases, the customer will not
have direct access to the data. This can be a problem when it comes to reporting and business intelligence.
It’s also a problem if you need to do a manual fix of the data or load or reload data in bulk. In some cases
there is nothing you can do about that.

Responsibilities
In SaaS implementations, most of the responsibilities fall on the service provider. This is one of the reasons SaaS implementations have become so popular. Organizations are able to free up their internal
resources for other activities, as opposed to using them for system administration. Figure 4.3 gives you an
idea of what is generally the responsibility of the service provider and what is usually taken care of by the
customer.


In a SaaS environment, the provider is basically responsible for everything except the client systems. It
will ensure that the application is up to date. It will make sure that the systems have been patched
appropriately. It will ensure that the data is being stored properly. It will monitor the systems for
performance and make any adjustments that need to be made. In a SaaS environment, the customer is
responsible for the client system or systems. The customer must ensure that the clients have connectivity
to the SaaS application. The client systems must have any necessary client software installed. The client
systems must be patched to an appropriate level.

SaaS Drivers
Many drivers have contributed to the rise of public SaaS offerings. There has been a big rise in the
creation and consumption of Web-based applications. Users are getting used to the look and feel of these
types of applications. Not to mention the fact that the look and feel have improved. Most SaaS providers


offer their services in the form of Web-based applications. So, as acceptance of Web-based applications
grows, so does the acceptance of SaaS services.

SaaS Challenges
Even though SaaS is currently the most popular cloud service model, there are still many challenges to the
adoption of SaaS. SaaS providers have been able to resolve many of the challenges and mitigate concerns,
but many still exist.
Disparate Location
SaaS applications are generally hosted offsite. This means connections between the client and the
application must travel over the public Internet, sometimes long distances. This distance may introduce
latency into the environment. This can be a limiting factor for some applications. Some applications
require response times in milliseconds. These applications will not work in environments where there is a
great deal of latency.
Multitenancy
Multitenancy can cause several issues. Since the application is shared, generally little to no customization
can be performed. This can be a problem if your organization requires extensive customization. You may
have to go with an on-premises application.

Other Security Challenges


One of the big worries organizations have with SaaS is around the security of the data. The employees at
the service provider will have direct access to the systems that house the data. One way to mitigate this is
to protect the data at the software level. You would have to encrypt the data at rest and the data in motion.
This would prevent the provider from reading the data when it's stored on the provider's systems and when it's traveling on the provider's network.
SaaS Providers
There are a multitude of public SaaS providers out there. Here we cover a few of the most popular.
Outlook.com

A default Outlook.com mail account is free. If you want advanced features or a version that does not
include advertisements, you have to upgrade your account. This can be done by selecting the gear icon in
the top-right corner and selecting More Mail Settings.

Google Drive
Google Drive gives you online access to view and create word processing documents, spreadsheets,
presentations, and a host of other documents. You can use the built-in document types or add new types.
To add a new type of document, choose Create in the left pane and select Connect More Apps.


5.5 DRIVING FACTORS AND CHALLENGES OF CLOUD

Driving Factors
1. Reduced costs

Establishing and running a data center is expensive. You need to purchase the right equipment and hire
technicians to install and manage the center. When you shift to cloud computing, you will only pay for the
services procured.

2. Flexibility

One of the major benefits of cloud computing is mobility. The service gives you and your employees the
flexibility to work from any location. Employees can complete their tasks at home or from the field.

You can reduce the number of workstations in your office and allow some employees to work from home
to save costs further. Cloud computing enables you to monitor the operations in your business effectively.
You just need a fast internet connection to get real time updates of all operations.

3. Scalability

The traditional way of planning for unexpected growth is to purchase and keep additional servers, storage,
and licenses. It may take years before you actually use the reserve resources. Scaling cloud computing
services is easy. You can get additional storage space or features whenever you need them. Your provider
will simply upgrade your package within minutes as long as you meet the additional cost.

4. No need for a backup plan

Traditional computing systems require backup plans, especially for data storage. A disaster can lead to
permanent data loss if no backup storage is in place. Businesses do not require any such means when
storing data on a cloud. The data will always be available as long as users have an internet connection.
Some businesses use cloud computing services as backup and a plan for disaster recovery.

5. Data security

Sometimes storing data on the cloud is safer than storing it on physical servers and data centers. A breach of security at your premises can lead to compromised data if laptops or computers are stolen. If you
have data on the cloud, you can delete any confidential information remotely or move it to a different
account. Breaching the security measures on clouding platforms is difficult. Hence, you are assured of data
security.

6. A wide range of options


We have already mentioned the main groups of cloud computing services, that is, IaaS, PaaS, and SaaS. Each of these groups has many subcategories that vary across providers. For instance, if you are looking
for software, you will have hundreds of options from different providers. You can choose the service
providers with the best features and rates for the service that your business needs.

7. Improved collaboration

Business owners are always looking for ways to boost individual and team performance. Cloud computing is among the most effective ways of improving team performance. Staff members can easily share data and collaborate to complete projects even from different locations. Field workers can easily share real-time data and updates with those in the office. In addition, cloud computing eliminates redundant or repetitive tasks such as data re-entry. You can improve the level of efficiency, increase productivity, and save costs by moving your business to cloud computing. The best approach is to shift the operations gradually to avoid data losses or manipulation during the shift.

Challenges of Cloud Computing

1. Security

The topmost concern in investing in cloud services is security. It is because your data gets stored and processed by a third-party vendor and you cannot see it. Every other day, you get informed about broken authentication, compromised credentials, account hacking, data breaches, etc. in a particular organization. It makes you a little more skeptical.

2. Password Security

As large numbers of people access your cloud account, it becomes vulnerable. Anybody who knows your
password or hacks into your cloud will be able to access your confidential information. Here the
organization should use multi-level authentication and ensure that the passwords remain protected. Also, the passwords should be modified regularly, especially when a particular employee resigns and leaves the organization. Access rights to usernames and passwords should be given judiciously.

3. Cost Management

Cloud computing lets you access application software over a fast internet connection and saves you from investing in costly computer hardware, software, management, and maintenance. The challenge, however, is that pay-as-you-go pricing can make total costs hard to predict: without careful monitoring of resource usage, the bill can grow unexpectedly.

4. Lack of expertise
With the increasing workload on cloud technologies and continuously improving cloud tools, management


has become difficult. There has been a consistent demand for a trained workforce who can deal with cloud
computing tools and services. Hence, firms need to train their IT staff to minimize this challenge.
5. Internet Connectivity

Cloud services are dependent on a high-speed internet connection. So businesses that are relatively small
and face connectivity issues should ideally first invest in a good internet connection so that no downtime
happens. It is because internet downtime might incur vast business losses.

6. Control or Governance

Another ethical issue in cloud computing is maintaining proper control over asset management and
maintenance. There should be a dedicated team to ensure that the assets used to implement cloud services
are used according to agreed policies and dedicated procedures. There should be proper maintenance, and the assets should be used to meet your organization's goals successfully.

7. Compliance

Another major risk of cloud computing is maintaining compliance. By compliance we mean a set of rules about what data is allowed to be moved and what should be kept in-house to maintain compliance. Organizations must follow and respect the compliance rules set by various government bodies.

8. Multiple Cloud Management

Companies have started to invest in multiple public clouds, multiple private clouds or a combination of
both called the hybrid cloud. This has grown rapidly in recent times. So it has become important to list the
challenges faced by such organizations and find solutions to grow with the trend.

9. Creating a private cloud

Implementing an internal cloud is advantageous. This is because all the data remains secure in-house. But
the challenge here is that the IT team has to build and fix everything by themselves. Also, the team needs
to ensure the smooth functioning of the cloud. They need to automate maximum manual tasks. The
execution of tasks should be in the correct order.

10. Performance

When your business applications move to a cloud or a third-party vendor, your business performance starts to depend on your provider as well. Another major problem in cloud computing is investing in the
right cloud service provider. Before investing, you should look for providers with innovative technologies.


Be cautious about choosing the provider and investigate whether they have protocols to mitigate issues
that arise in real-time.

11. Migration

Migration is nothing but moving a new application or an existing application to a cloud. In the case of a
new application, the process is pretty straightforward. But if it is an age-old company application, it
becomes tedious. Velostrata conducted a survey recently, wherein 95% of organizations are moving their
applications to the cloud. The survey showed that most organizations are finding it a nightmare. Some
notable issues faced here are slow data migrations, security challenges in cloud computing, extensive
troubleshooting, application downtime, migration agents, and cutover complexity.

12. Interoperability and Portability

Another challenge of cloud computing is that applications need to be easily migrated between cloud
providers without being locked for a set period. There is a lack of flexibility in moving from one cloud
provider to another because of the complexity involved. Changing cloud providers brings a slew of new challenges like managing data movement and establishing a secure network from scratch. Another
challenge is that customers can’t access it from everywhere, but this can be fixed by the cloud provider so
that the customer can securely access the cloud from anywhere.

13. Reliability and High Availability

Some of the most pressing issues in cloud computing are the need for high availability (HA) and reliability. Availability refers to the likelihood that a system is up and running at any given point in time, whereas reliability refers to the probability that the system keeps functioning correctly, without failure, over a given interval.

14. Hybrid-Cloud Complexity

For any company, a hybrid cloud environment is often a messy mix of multiple cloud application
development and cloud service providers, as well as private and public clouds, all operating at once. A
common user interface, consistent data, and analytical benefits for businesses are all missing from these
complex cloud ecosystems.

5.6 VIRTUALIZATION IN CLOUD COMPUTING AND TYPES

Virtualization is a technique for separating a service from the underlying physical delivery of that
service. It is the process of creating a virtual version of something like computer hardware. It was initially
developed during the mainframe era. It involves using specialized software to create a virtual or software-
created version of a computing resource rather than the actual version of the same resource. With the help


of Virtualization, multiple operating systems and applications can run on the same machine and its same
hardware at the same time, increasing the utilization and flexibility of hardware.

In other words, one of the main cost-effective, hardware-reducing, and energy-saving techniques used by
cloud providers is Virtualization. Virtualization allows sharing of a single physical instance of a resource
or an application among multiple customers and organizations at one time. It does this by assigning a
logical name to physical storage and providing a pointer to that physical resource on demand. The term
virtualization is often synonymous with hardware virtualization, which plays a fundamental role in
efficiently delivering Infrastructure-as-a-Service (IaaS) solutions for cloud computing. Moreover,
virtualization technologies provide a virtual environment for not only executing applications but also for
storage, memory, and networking.


 Host Machine: The machine on which the virtual machine is going to be built is known as Host
Machine.
 Guest Machine: The virtual machine is referred to as a Guest Machine.


Work of Virtualization in Cloud Computing

In the case of cloud computing, users store data in the cloud, but with the help of Virtualization, users have the extra benefit of sharing the infrastructure. Cloud vendors take care of the required physical resources, but these cloud providers charge a huge amount for these services, which impacts every user or organization. Virtualization allows the services that a company requires to be maintained by external (third-party) people, which helps in reducing costs to the company. This is the way Virtualization works in Cloud Computing.

Benefits of Virtualization

 More flexible and efficient allocation of resources.


 Enhance development productivity.
 It lowers the cost of IT infrastructure.
 Remote access and rapid scalability.
 High availability and disaster recovery.
 Pay-per-use of the IT infrastructure on demand.
 Enables running multiple operating systems.

Drawback of Virtualization

 High Initial Investment: Clouds have a very high initial investment, but it is also true that it will
help in reducing the cost of companies.
 Learning New Infrastructure: As the companies shifted from Servers to Cloud, it requires highly
skilled staff who have skills to work with the cloud easily, and for this, you have to hire new staff
or provide training to current staff.
 Risk of Data: Hosting data on third-party resources can put the data at risk, since it has the chance of being attacked by any hacker or cracker very easily.

Characteristics of Virtualization

 Increased Security: The ability to control the execution of a guest program in a completely
transparent manner opens new possibilities for delivering a secure, controlled execution
environment. All the operations of the guest programs are generally performed against the virtual
machine, which then translates and applies them to the host programs.
 Managed Execution: In particular, sharing, aggregation, emulation, and isolation are the most
relevant features.
 Sharing: Virtualization allows the creation of a separate computing environment within the same
host.
 Aggregation: It is possible to share physical resources among several guests, but virtualization
also allows aggregation, which is the opposite process.


Types of Virtualization

1. Application Virtualization
2. Network Virtualization
3. Desktop Virtualization
4. Storage Virtualization
5. Server Virtualization
6. Data virtualization


1. Application Virtualization: Application virtualization helps a user to have remote access to an


application from a server. The server stores all personal information and other characteristics of the
application but can still run on a local workstation through the internet. An example of this would be a
user who needs to run two different versions of the same software.

2. Network Virtualization: The ability to run multiple virtual networks, each with a separate control and data plane. They co-exist together on top of one physical network and can be managed by individual parties that are kept isolated from each other. Network virtualization provides a facility to create and provision virtual networks, logical switches, routers, firewalls, load balancers, Virtual Private Networks (VPN), and workload security within days or even weeks.


3. Desktop Virtualization: Desktop virtualization allows the users’ OS to be remotely stored on a server
in the data center. It allows the user to access their desktop virtually, from any location by a different
machine. Users who want specific operating systems other than Windows Server will need to have a
virtual desktop. The main benefits of desktop virtualization are user mobility, portability, and easy
management of software installation, updates, and patches.

4. Storage Virtualization: Storage virtualization is an array of servers that are managed by a virtual storage system. The servers aren't aware of exactly where their data is stored and instead function more like worker bees in a hive. It allows storage from multiple sources to be managed and utilized as a single repository. Storage virtualization software maintains smooth operations, consistent performance, and a continuous suite of advanced functions despite changes, breakdowns, and differences in the underlying equipment.

5. Server Virtualization: This is a kind of virtualization in which the masking of server resources takes place. Here, the central server (physical server) is divided into multiple different virtual servers by changing the identity number and processors, so each system can run its operating system in an isolated manner, while each sub-server knows the identity of the central server. It increases performance and reduces operating cost by deploying main server resources into sub-server resources. It's beneficial in virtual migration, reducing energy consumption, reducing infrastructural costs, etc.

6. Data Virtualization: This is the kind of virtualization in which data is collected from various sources and managed in a single place, without users needing technical details such as how the data is collected, stored, and formatted. The data is arranged logically so that its virtual view can be accessed remotely by interested stakeholders and users through various cloud services. Many big companies provide such services, like Oracle, IBM, AtScale, CData, etc.

Uses of Virtualization

 Data integration
 Business integration
 Service-oriented architecture data services
 Searching organizational data

5.7 LOAD BALANCING

Load balancing is an essential technique used in cloud computing to optimize resource utilization and
ensure that no single resource is overburdened with traffic. It is a process of distributing workloads across
multiple computing resources, such as servers, virtual machines, or containers, to achieve better
performance, availability, and scalability.

In cloud computing, load balancing can be implemented at various levels, including the network layer, application layer, and database layer. The most common load balancing techniques used in cloud computing are:

1. Network Load Balancing: This technique is used to balance the network traffic across multiple servers or instances. It is implemented at the network layer and ensures that the incoming traffic is distributed evenly across the available servers.
2. Application Load Balancing: This technique is used to balance the workload across multiple instances of an application. It is implemented at the application layer and ensures that each instance receives an equal share of the incoming requests.
3. Database Load Balancing: This technique is used to balance the workload across multiple database servers. It is implemented at the database layer and ensures that the incoming queries are distributed evenly across the available database servers.


Advantages:

1. Improved Performance: Load balancing helps to distribute the workload across multiple resources,
which reduces the load on each resource and improves the overall performance of the system.
2. High Availability: Load balancing ensures that there is no single point of failure in the system,
which provides high availability and fault tolerance to handle server failures.
3. Scalability: Load balancing makes it easier to scale resources up or down as needed, which helps to
handle spikes in traffic or changes in demand.
4. Efficient Resource Utilization: Load balancing ensures that resources are used efficiently, which
reduces wastage and helps to optimize costs.

Disadvantages:

1. Complexity: Implementing load balancing in cloud computing can be complex, especially when
dealing with large-scale systems. It requires careful planning and configuration to ensure that it
works effectively.
2. Cost: Implementing load balancing can add to the overall cost of cloud computing, especially when
using specialized hardware or software.
3. Single Point of Failure: While load balancing helps to reduce the risk of a single point of failure, it
can also become a single point of failure if not implemented correctly.
4. Security: Load balancing can introduce security risks if not implemented correctly, such as
allowing unauthorized access or exposing sensitive data.

Cloud load balancing is defined as the method of splitting workloads and computing resources in a cloud computing environment. It enables enterprises to manage workload demands or application demands by distributing resources among numerous computers, networks, or servers. Cloud load balancing includes holding the circulation of workload traffic and demands that exist over the Internet. Traffic on the internet is growing rapidly, at roughly 100% of present traffic annually, so the workload on servers is growing fast, which leads to the overloading of servers, mainly for popular web servers. There are two elementary solutions to overcome the problem of overloading on the servers:

 First is a single-server solution in which the server is upgraded to a higher performance server.
However, the new server may also be overloaded soon, demanding another upgrade. Moreover, the
upgrading process is arduous and expensive.
 Second is a multiple-server solution in which a scalable service system on a cluster of servers is
built. That’s why it is more cost effective as well as more scalable to build a server cluster system
for network services.

Load balancing is beneficial with almost any type of service, like HTTP, SMTP, DNS, FTP, and POP/IMAP. It also raises reliability through redundancy. The balancing service is provided by a dedicated


hardware device or program. Cloud-based server farms can attain more precise scalability and availability using server load balancing. Load balancing solutions can be categorized into two types –

1. Software-based load balancers: Software-based load balancers run on standard hardware


(desktop, PCs) and standard operating systems.
2. Hardware-based load balancer: Hardware-based load balancers are dedicated boxes which include Application Specific Integrated Circuits (ASICs) adapted for a particular use. ASICs allow high-speed forwarding of network traffic and are frequently used for transport-level load balancing, because hardware-based load balancing is faster than a software solution.

Major Examples of Load Balancers –

1. Direct Routing Request Dispatching Technique: This approach to request dispatching is similar to the one implemented in IBM's Net Dispatcher. A real server and load balancer share the virtual IP address. In this technique, the load balancer takes an interface constructed with the virtual IP address that accepts request packets and directly routes the packets to the selected servers.
2. Dispatcher-Based Load Balancing Cluster: A dispatcher performs smart load balancing by utilizing server availability, workload, capability, and other user-defined criteria to decide where to send a TCP/IP request. The dispatcher module of a load balancer can split HTTP requests among various nodes in a cluster. The dispatcher splits the load among the many servers in the cluster so that the services of the various nodes appear as a single virtual service on one IP address; clients interact as if it were a single server, without any knowledge of the back-end infrastructure (see the sketch after this list).
3. Linux Virtual Load Balancer: This is an open-source, enhanced load balancing solution used to build highly scalable and highly available network services such as HTTP, POP3, FTP, SMTP, media and caching, and Voice over Internet Protocol (VoIP). It is a simple and powerful product made for load balancing and fail-over. The load balancer itself is the primary entry point of the server cluster system and can execute Internet Protocol Virtual Server (IPVS), which implements transport-layer load balancing in the Linux kernel, also known as Layer-4 switching.
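
To make the dispatcher idea concrete, here is a minimal round-robin dispatcher sketch in Python. It is illustrative only: the server addresses and request strings are assumptions, and a real dispatcher such as IBM's Net Dispatcher or IPVS forwards packets at the network level rather than in application code.

```python
import itertools

class RoundRobinDispatcher:
    """Toy dispatcher that spreads incoming requests across back-end nodes.

    Clients see a single entry point; the cluster behind it stays hidden,
    as described above for dispatcher-based load balancing clusters.
    """

    def __init__(self, servers):
        # Cycle endlessly over the configured back-end servers.
        self._servers = itertools.cycle(servers)

    def dispatch(self, request):
        # Pick the next server in round-robin order and forward the request.
        server = next(self._servers)
        print(f"Forwarding {request!r} to {server}")
        return server

# Usage: three (hypothetical) back-end nodes behind one virtual endpoint.
lb = RoundRobinDispatcher(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
for req in ["GET /a", "GET /b", "GET /c", "GET /d"]:
    lb.dispatch(req)
```

Round-robin is the simplest dispatching policy; a smarter dispatcher would also weigh server availability, workload, and capability, as noted above.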

5.8 SCALABILITY AND ELASTICITY

Cloud Elasticity: Elasticity refers to the ability of a cloud to automatically expand or compress the infrastructural resources in response to a sudden rise or fall in requirements, so that the workload can be managed efficiently. This elasticity helps to minimize infrastructural costs. It is not applicable to all kinds of environments; it is helpful only in scenarios where the resource requirements fluctuate up and down suddenly for a specific time interval.

It works in such a way that when the number of clients accessing the system expands, applications are automatically provisioned with extra computing, storage, and network resources such as CPU, memory, storage, or bandwidth, and when there are fewer clients, those resources are automatically reduced as per the requirement.


It is most commonly used in pay-per-use, public cloud services, where IT managers are willing to pay only for the duration for which they consumed the resources.

Example: Consider an online shopping site whose transaction workload increases during festive season
like Christmas. So for this specific period of time, the resources need a spike up. In order to handle this
kind of situation, we can go for a Cloud-Elasticity service rather than Cloud Scalability. As soon as the
season goes out, the deployed resources can then be requested for withdrawal.
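
As a rough sketch of how such elastic provisioning decisions can be made, the following hypothetical autoscaling rule adds or removes instances when average CPU utilization crosses thresholds. The thresholds, limits, and utilization figures are illustrative assumptions, not any provider's defaults.

```python
def autoscale(current_instances, avg_cpu_percent,
              scale_up_at=80, scale_down_at=20,
              min_instances=1, max_instances=10):
    """Return the new instance count for one autoscaling evaluation."""
    if avg_cpu_percent > scale_up_at and current_instances < max_instances:
        return current_instances + 1   # provision one more instance
    if avg_cpu_percent < scale_down_at and current_instances > min_instances:
        return current_instances - 1   # release one instance
    return current_instances           # demand is steady; change nothing

# During a festive-season spike, load rises and capacity follows it,
# then shrinks again once the season is over.
instances = 2
for cpu in [85, 90, 88, 40, 15, 10]:
    instances = autoscale(instances, cpu)
    print(f"avg CPU {cpu}% -> {instances} instance(s)")
```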

Cloud Scalability: Cloud scalability is used to handle the growing workload where good performance is
also needed to work efficiently with software or applications. Scalability is commonly used where the
persistent deployment of resources is required to handle the workload statically.

Example: Consider you are the owner of a company whose database size was small in earlier days but as
time passed your business does grow and the size of your database also increases, so in this case you just
need to request your cloud service vendor to scale up your database capacity to handle a heavy workload.

It is totally different from what you have read above in Cloud Elasticity. Scalability is used to fulfill the
static needs while elasticity is used to fulfill the dynamic need of the organization. Scalability is a similar
kind of service provided by the cloud where the customers have to pay-per-use. So, in conclusion, we can
say that Scalability is useful where the workload remains high and increases statically.

Types of Scalability:

1. Vertical Scalability (Scale-up): In this type of scalability, we increase the power of existing resources in the working environment in an upward direction.


2. Horizontal Scalability: In this kind of scaling, the resources are added in a horizontal row.

3. Diagonal Scalability: It is a mixture of both Horizontal and Vertical scalability, where the resources are added both vertically and horizontally.

Difference Between Cloud Elasticity and Scalability:

1. Elasticity is used just to meet a sudden rise and fall in the workload for a small period of time, whereas scalability is used to meet a static increase in the workload.
2. Elasticity is used to meet dynamic changes, where the resource needs can increase or decrease, whereas scalability is always used to address an increase in workload in an organization.
3. Elasticity is commonly used by small companies whose workload and demand increase only for a specific period of time, whereas scalability is used by giant companies whose customer circle persistently grows, in order to do their operations efficiently.
4. Elasticity is short-term planning, adopted just to deal with an unexpected increase in demand or seasonal demands, whereas scalability is long-term planning, adopted to deal with an expected increase in demand.


5.9 REPLICATION

Replication in Cloud Computing refers to storing the same data in several different locations, usually by synchronizing these data sources. Replication in Cloud Computing is done partly for backup and partly to reduce response times, especially for read requests.

Replication in Cloud Computing : Basics

The simplest form of data replication in a cloud computing environment is to store a copy of a file, in expanded form the familiar copy-and-paste operation of any modern operating system. Replication is the reproduction of the original data in unchanged form. Data-changing accesses are, in general, expensive under replication. In the frequently encountered master/slave replication, a distinction is made between the original data (primary data) and the dependent copies. With peer copies (as in version control), the data sets must be merged (synchronization). Sometimes it is important to know which data sets the replicas must hold. Depending on the type of replication, a certain period of time lies between a change to the primary data and its replication. This period is usually referred to as latency.

Replication is synchronous when a change operation on a data object can only complete successfully once it has also been performed on the replicas. To implement this, a protocol that ensures the indivisibility of transactions is used: the commit protocol.

If there is latency between the processing of the primary data and the replication, the replication is asynchronous; the replicas then match the primary data only as of the last synchronization. A simple variant of asynchronous replication is "file transfer replication", the transfer of files via FTP or SSH. The data of the replicas represent only a snapshot of the primary data at a specific time. At the database level this can happen at short intervals: the transaction logs are transported from one server to another and read into the database. Assuming an intact network, the latency corresponds to the time interval at which the transaction logs are written. Four methods can be used for Replication in Cloud Computing: Merge Replication, Primary Copy, Snapshot Replication, and Standby Replication.
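
The contrast between synchronous and asynchronous replication can be sketched with a toy in-memory model. The classes and method names below are illustrative assumptions, not a real replication protocol; in particular, a production system would use a proper commit protocol rather than a simple loop.

```python
import queue

class SyncReplicatedStore:
    """A write completes only after it has been applied to every replica."""

    def __init__(self, replicas):
        self.primary = {}
        self.replicas = replicas          # dicts standing in for replica nodes

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:     # apply to all replicas before returning
            replica[key] = value
        return True


class AsyncReplicatedStore:
    """Writes return immediately; replicas catch up later (the latency window)."""

    def __init__(self, replicas):
        self.primary = {}
        self.replicas = replicas
        self.log = queue.Queue()          # pending changes, like a transaction log

    def write(self, key, value):
        self.primary[key] = value
        self.log.put((key, value))        # replicate later
        return True

    def flush(self):
        # Ship the accumulated log to the replicas, closing the latency gap.
        while not self.log.empty():
            key, value = self.log.get()
            for replica in self.replicas:
                replica[key] = value
```

Until flush() runs, the asynchronous replicas hold only a snapshot of the primary data as of the last synchronization, which is exactly the latency described above.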

5.10 MONITORING

Cloud monitoring is a method of reviewing, observing, and managing the operational workflow in a cloud-
based IT infrastructure. Manual or automated management techniques confirm the availability and
performance of websites, servers, applications, and other cloud infrastructure. This continuous evaluation
of resource levels, server response times, and speed predicts possible vulnerability to future issues before
they arise.


Types of cloud monitoring

The cloud has numerous moving components, and for top performance, it's critical to ensure that everything comes together seamlessly. This need has led to a variety of monitoring techniques to fit the type of outcome that a user wants. The main types of cloud monitoring are:

Database monitoring

Because most cloud applications rely on databases, this technique reviews processes, queries, availability,
and consumption of cloud database resources. This technique can also track queries and data integrity,
monitoring connections to show real-time usage data. For security purposes, access requests can be
tracked as well. For example, an uptime detector can alert if there’s database instability and can help
improve resolution response time from the precise moment that a database goes down.
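
A very small version of such an uptime detector can be sketched with only the Python standard library. The endpoint URL and the alert action (a print) below are placeholder assumptions; a real monitor would page an operator or raise an incident instead.

```python
import time
import urllib.request

def check_once(url, timeout=5):
    """Return True if the endpoint answers with an HTTP 2xx within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:                  # connection refused, timeout, HTTP error, ...
        return False

def monitor(url, interval=30):
    """Poll the endpoint and alert the moment it stops responding."""
    while True:
        if not check_once(url):
            print(f"ALERT: {url} appears to be down")
        time.sleep(interval)

# Example with a hypothetical health endpoint:
# monitor("https://db-health.example.com/ping")
```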

Website monitoring

A website is a set of files that is stored locally, which, in turn, sends those files to other computers over a
network. This monitoring technique tracks processes, traffic, availability, and resource utilization of cloud-
hosted sites.

Virtual network monitoring

This monitoring type creates software versions of network technology such as firewalls, routers, and load
balancers. Because they’re designed with software, these integrated tools can give you a wealth of data
about their operation. If one virtual router is endlessly overcome with traffic, for example, the network
adjusts to compensate. Therefore, instead of swapping hardware, virtualization infrastructure quickly
adjusts to optimize the flow of data.

Cloud storage monitoring

This technique tracks multiple analytics simultaneously, monitoring storage resources and processes that
are provisioned to virtual machines, services, databases, and applications. This technique is often used to
host infrastructure-as-a-service (IaaS) and software-as-a-service (SaaS) solutions. For these applications,
you can configure monitoring to track performance metrics, processes, users, databases, and available
storage. It provides data to help you focus on useful features or to fix bugs that disrupt functionality.

Virtual machine monitoring

This technique is a simulation of a computer within a computer; that is, virtualization infrastructure and
virtual machines. It’s usually scaled out in IaaS as a virtual server that hosts several virtual desktops. A


monitoring application can track the users, traffic, and status of each machine. You get the benefits of
traditional IT infrastructure monitoring with the added benefit of cloud monitoring solutions.

Benefits of cloud monitoring

Monitoring is a skill, not a full-time job. In today’s world of cloud-based architectures that are
implemented through DevOps projects, developers, site reliability engineers (SREs), and operations staff
must collectively define an effective cloud monitoring strategy. Such a strategy should focus on
identifying when service-level objectives (SLOs) are not being met, likely negatively affecting the user
experience. So, then what are the benefits of leveraging cloud monitoring tools? With cloud monitoring:

 Scaling for increased activity is seamless and works in organizations of any size
 Dedicated tools (and hardware) are maintained by the host
 Tools are used across several types of devices, including desktop computers, tablets, and phones,
so your organization can monitor apps from any location
 Installation is simple because infrastructure and configurations are already in place
 Your system doesn’t suffer interruptions when local problems emerge, because resources aren’t
part of your organization’s servers and workstations
 Subscription-based solutions can keep your costs low

Monitoring in public, private, and hybrid clouds

A private cloud gives you extensive control and visibility. Because systems and the software stack are fully accessible, cloud monitoring is straightforward when it's operated in a private cloud. Monitoring in public or hybrid clouds, however, can be tough. Let's review the focal points:

 Because the data exists between private and public clouds, a hybrid cloud environment presents unique challenges. Limited security and compliance create problems for data access. Your administrator can solve these issues by deciding which data to store in various clouds and which data to asynchronously update.
 A private cloud gives you more control, but to promote optimal performance, it’s still wise to
monitor workloads. Without a clear picture of workload and network performance, it’s nearly
impossible to justify configuration or architectural changes or to quantify quality-of-service
implementations.

What is a cloud platform? A cloud platform refers to the operating system and hardware of a server in an
Internet-based data center. It allows software and hardware products to co-exist remotely and at scale.

So, how do cloud platforms work? Enterprises rent access to compute services, such as servers, databases,
storage, analytics, networking, software, and intelligence. Therefore, the enterprises don’t have to set up
and own data centers or computing infrastructure. They simply pay for what they use.


5.11 CLOUD SERVICES AND PLATFORMS: COMPUTE SERVICES


Types of Cloud Platforms

There are several types of cloud platforms. Not a single one works for everyone. There are several models,
types, and services available to help meet the varying needs of users. They include:

 Public Cloud: Public cloud platforms are third-party providers that deliver computing resources
over the Internet. Examples include Amazon Web Services (AWS), Google Cloud Platform,
Alibaba, Microsoft Azure, and IBM Bluemix.

 Private Cloud: A private cloud platform is exclusive to a single organization. It’s usually in an on-
site data center or hosted by a third-party service provider.

 Hybrid Cloud: This is a combination of public and private cloud platforms. Data and applications
move seamlessly between the two. This gives the organization greater flexibility and helps
optimize infrastructure, security, and compliance.

A cloud platform allows organizations to create cloud-native applications, test and build applications, and
store, back up, and recover data. It also allows organizations to analyze data. Organizations can also
stream video and audio, embed intelligence into their operations, and deliver software on-demand on a
global scale.

5.12 STORAGE SERVICES

What is Cloud Storage?

Cloud Storage is a mode of computer data storage in which digital data is stored on servers in off-site
locations. The servers are maintained by a third-party provider who is responsible for hosting, managing,
and securing data stored on its infrastructure. The provider ensures that data on its servers is always
accessible via public or private internet connections.

Cloud Storage enables organizations to store, access, and maintain data so that they do not need to own
and operate their own data centers, moving expenses from a capital expenditure model to operational.
Cloud Storage is scalable, allowing organizations to expand or reduce their data footprint depending on
need.

How does Cloud Storage work?


Cloud Storage uses remote servers to save data, such as files, business data, videos, or images. Users
upload data to servers via an internet connection, where it is saved on a virtual machine on a physical
server. To maintain availability and provide redundancy, cloud providers will often spread data to multiple
virtual machines in data centers located across the world. If storage needs increase, the cloud provider will
spin up more virtual machines to handle the load. Users can access data in Cloud Storage through an
internet connection and software such as web portal, browser, or mobile app via an application
programming interface (API).
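
As a deliberately generic illustration of such API-based access, the sketch below uploads a local file to a hypothetical REST storage endpoint with an HTTP PUT. The URL and bearer token are assumptions; real providers each expose their own SDKs and authentication schemes.

```python
import urllib.request

def upload(url, path, token):
    """PUT the local file's bytes to the storage endpoint and return the status."""
    with open(path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",          # assumed auth scheme
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status                               # 2xx on success

# Example with a hypothetical endpoint and credential:
# upload("https://storage.example.com/bucket/report.pdf", "report.pdf", "TOKEN")
```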

Cloud Storage is available in four different models:

Public

Public Cloud Storage is a model where an organization stores data in a service provider’s data centers that
are also utilized by other companies. Data in public Cloud Storage is spread across multiple regions and is
often offered on a subscription or pay-as-you-go basis. Public Cloud Storage is considered to be “elastic”
which means that the data stored can be scaled up or down depending on the needs of the organization.
Public cloud providers typically make data available from any device such as a smartphone or web portal.

Private

Private Cloud Storage is a model where an organization utilizes its own servers and data centers to store
data within their own network. Alternatively, organizations can deal with cloud service providers to
provide dedicated servers and private connections that are not shared by any other organization. Private
clouds are typically utilized by organizations that require more control over their data and have stringent
compliance and security requirements.

Hybrid

A hybrid cloud model is a mix of private and public cloud storage models. A hybrid cloud storage model
allows organizations to decide which data it wants to store in which cloud. Sensitive data and data that
must meet strict compliance requirements may be stored in a private cloud while less sensitive data is
stored in the public cloud. A hybrid cloud storage model typically has a layer of orchestration to integrate
between the two clouds. A hybrid cloud offers flexibility and allows organizations to still scale up with the
public cloud if need arises.

Multicloud

A multicloud storage model is when an organization sets up more than one cloud model from more than
one cloud service provider (public or private). Organizations might choose a multicloud model if one
cloud vendor offers certain proprietary apps, an organization requires data to be stored in a specific
country, various teams are trained on different clouds, or the organization needs to serve different


requirements that are not stated in the servicers’ Service Level Agreements. A multicloud model offers
organizations flexibility and redundancy.

Advantages of Cloud Storage


Total cost of ownership

Cloud Storage enables organizations to move from a capital expenditure to an operational expenditure
model, allowing them to adjust budgets and resources quickly.

Elasticity

Cloud Storage is elastic and scalable, meaning that it can be scaled up (more storage added) or down (less
storage needed) depending on the organization’s needs.

Flexibility

Cloud Storage offers organizations flexibility on how to store and access data, deploy and budget
resources, and architect their IT infrastructure.

Security

Most cloud providers offer robust security, including physical security at data centers and cutting edge
security at the software and application levels. The best cloud providers offer zero trust architecture,
identity and access management, and encryption.

Sustainability

One of the greatest costs when operating on-premises data centers is the overhead of energy consumption.
The best cloud providers operate on sustainable energy through renewable resources.

Redundancy

Redundancy (replicating data on multiple servers in different locations) is an inherent trait in public
clouds, allowing organizations to recover from disasters while maintaining business continuity.

Disadvantages of Cloud Storage


Compliance

Certain industries such as finance and healthcare have stringent requirements about how data is stored and
accessed. Some public cloud providers offer tools to maintain compliance with applicable rules and
regulations.


Latency

Traffic to and from the cloud can be delayed because of network traffic congestion or slow internet
connections.

Control

Storing data in public clouds relinquishes some control over access and management of that data,
entrusting that the cloud service provider will always be able to make that data available and maintain its
systems and security.

Outages

While public cloud providers aim to ensure continuous availability, outages sometimes do occur, making
stored data unavailable.

How to use Cloud Storage

Cloud Storage provides several use cases that can benefit individuals and organizations. Whether a person
is storing their family budget on a spreadsheet, or a massive organization is saving years of financial data
in a highly secure database, Cloud Storage can be used for saving digital data of all kinds for as long as
needed.

Backup

Data backup is one of the simplest and most prominent uses of Cloud Storage. Production data can be
separated from backup data, creating a gap between the two that protects organizations in the case of a
cyber threat such as ransomware. Data backup through Cloud Storage can be as simple as saving files to a
digital folder such as Google Drive or using block storage to maintain gigabytes or more of important
business data.

Archiving
The ability to archive old data has become an important aspect of Cloud Storage, as organizations
move to digitize decades of old records, as well as hold on to records for governance and compliance
purposes. Google Cloud offers several tiers of storage for archiving data, including coldline storage
and archival storage, that can be accessed whenever an organization needs them.
Disaster recovery

A disaster—natural or otherwise— that wipes out a data center or old physical records needs not be the
business-crippling event that it was in the past. Cloud Storage allows for disaster recovery so that
organizations can continue with their business, even when times are tough.


Data processing

As Cloud Storage makes digital data immediately available, data becomes much more useful on an
ongoing basis. Data processing, such as analyzing data for business intelligence or applying machine
learning and artificial intelligence to large datasets, is possible because of Cloud Storage.

Content delivery

With the ability to save copies of media data, such as large audio and video files, on servers dispersed
across the globe, media and entertainment companies can serve their audience low-latency, always
available content from wherever they reside.

Types of Cloud Storage

Cloud Storage comes in three different types: object, file, and block.

Object

Object storage is a data storage architecture for large stores of unstructured data. It designates each piece
of data as an object, keeps it in a separate storehouse, and bundles it with metadata and a unique identifier
for easy access and retrieval.
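
The object model just described (object = data + metadata + unique identifier) can be sketched as a toy in-memory store; the class and method names are illustrative assumptions, not any vendor's API.

```python
import uuid

class ObjectStore:
    """Toy flat object store: each object bundles data, metadata, and an ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        object_id = str(uuid.uuid4())    # unique identifier used for retrieval
        self._objects[object_id] = {"data": data, "metadata": metadata}
        return object_id

    def get(self, object_id: str) -> dict:
        return self._objects[object_id]

# Usage: store unstructured data with its metadata and fetch it back by ID.
store = ObjectStore()
oid = store.put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
print(store.get(oid)["metadata"])
```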

File

File storage organizes data in a hierarchical format of files and folders. File storage is common in personal
computing where data is saved as files and those files are organized in folders. File storage makes it easy
to locate and retrieve individual data items when they are needed. File storage is most often used in
directories and data repositories.

Block

Block storage breaks data into blocks, each with a unique identifier, and then stores those blocks as
separate pieces on the server. The cloud network stores those blocks wherever it is most efficient for the
system. Block storage is best used for large volumes of data that require low latency such as workloads
that require high performance or databases.

5.13 APPLICATION SERVICES

Definition

Application Services (a term often used interchangeably with application management services or application
services management) are a pool of services, such as load balancing, application performance monitoring,
application acceleration, autoscaling, micro-segmentation, service proxy and service discovery, needed to
optimally deploy, run and improve applications.

What is Application Services Management?

The process of configuring, monitoring, optimizing and orchestrating different app services is known as
application services management.

Today, application services management is handled both by organizations that run their own data centers and
by those that use the public cloud. In the early days of internet adoption, application service providers
(ASPs) were companies that delivered applications to end users for a fixed cost. This single-tenant, hosted
model was largely displaced by the advent of the Software-as-a-Service (SaaS) delivery model, which is
multi-tenant and on-demand.


What is Cloud Application Services?

Cloud application services are a wide range of application services for applications deployed on cloud-based
resources. Services such as load balancing, application firewalling and service discovery can be delivered
for applications running in private, public, hybrid or multi-cloud environments.
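One of the listed services, service discovery, can be illustrated with a toy registry in Python; the service names and addresses below are hypothetical, and production systems use dedicated tools such as Consul or Kubernetes DNS instead.

class ServiceRegistry:
    # Toy service discovery: instances register themselves, clients look up.
    def __init__(self):
        self._instances = {}                  # service name -> set of host:port

    def register(self, service: str, address: str) -> None:
        self._instances.setdefault(service, set()).add(address)

    def deregister(self, service: str, address: str) -> None:
        self._instances.get(service, set()).discard(address)

    def lookup(self, service: str) -> list:
        return sorted(self._instances.get(service, ()))

registry = ServiceRegistry()
registry.register("payments", "10.0.0.5:8080")    # hypothetical addresses
registry.register("payments", "10.0.0.6:8080")
print(registry.lookup("payments"))                # both instances discoverable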

What are App Modernization Services?

Traditional applications were built as monolithic blocks of software. These monolithic applications have
long life cycles because any change or update to one function usually requires reconfiguring the entire
application. This costly and time-consuming process delays advancements and updates in application
development.

Application modernization services enable the migration of monolithic, legacy application architectures to
new architectures that more closely match the business needs of a modern enterprise's application portfolio.
Application modernization is often part of an organization's digital transformation.

An example of this is the microservices architecture, in which each app service is created individually and
deployed separately from the others. This allows services to be scaled based on specific business needs, and
individual services can be changed rapidly without affecting other parts of the application.
Application-centric enterprises are choosing microservices architectures to take advantage of flexible,
container-based infrastructure models.


Part A

1. Define Cloud Computing.


Cloud computing is a virtualization-based technology that allows us to create, configure, and
customize applications via an internet connection. Cloud technology includes development
platforms, storage, software applications, and databases. The term cloud refers to a network or the
internet.

2. Define community cloud.


Community clouds are semi-public clouds that are shared between members of a select group
of organizations. These organizations will generally have a common purpose or mission.

3. What is a hybrid cloud?


A hybrid cloud model is a combination of two or more other cloud models. The clouds
themselves are not mixed together; rather, each cloud is separate, and they are all linked together.
A hybrid cloud may introduce more complexity to the environment, but it also allows more
flexibility in fulfilling an organization’s objectives.

4. Define Infrastructure as a Service.


Infrastructure as a Service, or IaaS, provides basic infrastructure services to customers. These
services may include physical machines, virtual machines, networking, storage, or some
combination of these.
5. Explain Virtualization.
Virtualization is a technique for separating a service from the underlying physical delivery of that
service. It is the process of creating a virtual version of a computing resource, such as computer
hardware, using specialized software rather than the actual physical resource. Virtualization was
initially developed during the mainframe era.

6. Explain Load balancing.

Load balancing is an essential technique used in cloud computing to optimize resource utilization
and ensure that no single resource is overburdened with traffic. It is a process of distributing
workloads across multiple computing resources, such as servers, virtual machines, or containers, to
achieve better performance, availability, and scalability.
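A minimal sketch of one common policy, round-robin, in Python (the server names are hypothetical; real load balancers also take health checks and current load into account):

import itertools

class RoundRobinBalancer:
    # Hand out servers in rotation so no single one is overburdened.
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self) -> str:
        return next(self._cycle)

lb = RoundRobinBalancer(["vm-1", "vm-2", "vm-3"])
for _ in range(6):
    print(lb.next_server())                   # vm-1, vm-2, vm-3, vm-1, ...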

7. Define cloud elasticity.


Cloud Elasticity: Elasticity refers to the ability of a cloud to automatically expand or shrink the
infrastructural resources in response to sudden rises and falls in demand, so that the workload can be
managed efficiently. This elasticity helps to minimize infrastructure costs.
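The decision rule behind elasticity can be sketched as a simple threshold-based autoscaler in Python; the thresholds and instance limits below are illustrative assumptions, not any provider's defaults.

def desired_instances(current: int, cpu_load: float,
                      low: float = 0.3, high: float = 0.8,
                      min_n: int = 1, max_n: int = 10) -> int:
    # Expand the pool under heavy load, compress it when load falls off.
    if cpu_load > high:
        return min(current + 1, max_n)        # scale out
    if cpu_load < low:
        return max(current - 1, min_n)        # scale in
    return current                            # current pool is sufficient

print(desired_instances(current=3, cpu_load=0.92))    # -> 4 (scale out)
print(desired_instances(current=3, cpu_load=0.10))    # -> 2 (scale in)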


8. Difference Between Cloud Elasticity and Scalability.

1. Elasticity is used to meet sudden, short-lived rises and falls in the workload; scalability is used
to meet a static increase in the workload.
2. Elasticity handles dynamic changes, where resource needs can increase or decrease; scalability
always addresses an increase in an organization's workload.
3. Elasticity is commonly used by small companies whose workload and demand increase only for a
specific period of time; scalability is used by large companies whose customer base grows
persistently, so that operations run efficiently.
4. Elasticity is short-term planning, adopted to deal with unexpected or seasonal increases in
demand; scalability is long-term planning, adopted to deal with an expected increase in demand.

Part B

1. Explain Cloud Deployment Models.


2. Explain Cloud Service Models.
3. Explain Virtualization.

Part C
1. Explain storage services and application services.


REFERENCES

1. Ajay D. Kshemkalyani and Mukesh Singhal, "Distributed Computing: Principles, Algorithms, and
Systems", Cambridge University Press, 2011.
2. George Coulouris, Jean Dollimore and Tim Kindberg, "Distributed Systems: Concepts and
Design", Fifth Edition, Pearson Education, 2012.
3. Pradeep K. Sinha, "Distributed Operating Systems: Concepts and Design", Prentice Hall of
India, 2007.
4. Mukesh Singhal and Niranjan G. Shivaratri, "Advanced Concepts in Operating Systems",
McGraw-Hill, 1994.
5. Andrew S. Tanenbaum and Maarten Van Steen, "Distributed Systems: Principles and Paradigms",
Pearson Education, 2007.
6. M. L. Liu, "Distributed Computing: Principles and Applications", Pearson Education, 2004.
7. Nancy A. Lynch, "Distributed Algorithms", Morgan Kaufmann Publishers, USA, 2003.


ANNA UNIVERSITY
MODEL QUESTION PAPERS


MODEL QUESTION PAPER – I


Fifth Semester
Computer Science and Engineering

CS8603 Distributed Systems


(Common to: Computer and Communication Engineering/ B.Tech. Information Technology)
(Regulation 2017)
Time: Three Hours Maximum: 100 Marks
Answer ALL questions
PART – A (10x2=20 Marks)
1. Distinguish between synchronous and asynchronous execution.
2. List the issues in distributed systems.
3. List the types of message ordering.
4. Differentiate between causal and total order.
5. Define distributed mutual exclusion.
6. Label the models of deadlocks.
7. Define checkpoint and rollback recovery.
8. Distinguish between coordinated and uncoordinated checkpointing.
9. List the advantages of distributed shared memory.
10. Define overlay graph.

PART – B (5x13=65 Marks)


11. (a) Explain the primitives of distributed communication. (13)
(OR)
(b) Explain the challenges and issues of distributed systems.(13)
12. (a) Discuss about message ordering paradigm in detail.(13)
(OR)
(b) Elaborate the snapshot algorithm for FIFO channels.(13)


13. (a) Compare the various distributed mutual exclusion algorithms with diagrams. (13)
(OR)
(b) Explain the Knapp’s classification with suitable examples. (13)

14. (a) Explain in detail about checkpoint based recovery and log based rollback recovery (13)
(OR)
(b) Discuss about agreement in synchronous systems with failures. (13)
15. (a) Explain about the working of Tapestry in detail. (13)
(OR)
(b) Describe the memory consistency models. (13)
PART – C (1x15=15 Marks)
16. (a) Design the algorithms for asynchronous checkpointing and recovery. (15)
(OR)
(b) Explain shared memory mutual exclusion in detail. (15)
