
Distributed system

• Definition: A distributed system consists of multiple physically separate nodes linked together
by a network. All the nodes in this system communicate with each other and coordinate their processing
as a team, and each node runs a small portion of the distributed operating system software. The system
connects multiple computers through a communication channel and uses many processors to serve multiple
real-time applications and users.
• Types of distributed system
❑ Client/server systems
❑ Peer-to-peer systems
❑ Middleware
❑ Three-tier
❑ N-tier
Client/server systems
• In client-server systems, the client requests a resource or file and the server fetches that
resource. Clients and servers usually communicate over a computer network, so they form
part of a distributed system. A client is typically in contact with just one server.
Peer-to-peer systems
• The peer-to-peer architecture contains nodes that are equal participants in data sharing. The nodes
communicate with each other over the network as needed to share resources, and the tasks are
divided equally among all the nodes.
Middleware
• Middleware can be thought of as an application that sits between two separate applications and
provides services to both. It works as a base for applications that must interoperate while running on
different operating systems, and data can be transferred between them by using this service.
Three-tier
• A three-tier system uses a separate layer and server for each function of a program. Client data is
stored in the middle tier rather than on the client system or on the server itself, which makes
development easier. It includes a Presentation Layer, an Application Layer, and a Data Layer.
This architecture is mostly used in web or online applications.
N-tier
• N-tier is also called a multitier distributed system. An N-tier system can contain any number of
functions in the network and has a structure similar to the three-tier architecture; any tier can send a
request to another application to perform a task or provide a service.
N-tier is commonly used in web applications and data systems.
Distributed system goals
• Boosting Performance: A distributed system tries to make things faster by dividing a
bigger task into small chunks and processing them simultaneously on different computers,
much like a group of people working together on a project. For example, when we search
for something on the internet, the search engine distributes the work among several servers, then
retrieves the results and displays the webpage within a few seconds.
• Enhancing Reliability: A distributed system ensures reliability by minimizing the impact of
individual computer failures. If one computer fails, the other computers keep the
system running smoothly. For example, when we search for something on social media and one
server has an issue, we can still access photos and posts because the system switches to another server
quickly.
• Scaling for the Future: Distributed systems are experts at handling increased demands. They
manage the demands by incorporating more and more computers into the system. This way they
run everything smoothly and can handle more users.
Contd.
• Resourceful Utilization: Resource utilization is one of the most prominent features of a
distributed system. Instead of putting the entire load on one computer, tasks are distributed among the
available resources. This ensures that work gets done while every resource is utilized.
• Fault Tolerance and Resilience: Distributed systems come with backup plans. If any
computer fails, its tasks are redirected to another computer, ensuring minimal delay and a smooth
experience.
• Security and Data Integrity: Distributed systems use safeguards such as keys and locks to protect data
from others. They use well-known techniques for encryption and authentication to keep
information safe from unauthorized access; a distributed system prioritizes data security the way you keep
your own secrets safe.
• Load Balancing: Distributed systems ensure good resource utilization and allow
the system to handle a high volume of data without slowing down. This is achieved by load
balancing, which evenly distributes the load across all available computers, thus preventing
single-machine overload and bottlenecks.
Distributed System Models
• Following are different models in distributed system

❑ Physical Model

❑ Architectural Model

❑ Fundamental Model
Physical Model
• A physical model is basically a representation of the underlying hardware elements of a distributed
system. It encompasses the hardware composition of a distributed system in terms of computers
and other devices and their interconnections.
• It is primarily used to design, manage, implement and determine the performance of a distributed
system. A physical model majorly consists of the following components:

❖ Nodes
❖ Links
❖ Middleware
❖ Network Topology
❖ Communication Protocols
Architectural Model
• The architectural model of a distributed computing system is the overall design and structure of the
system: how its different components are organized to interact with each other and provide the
desired functionality. It gives an overview of how development, deployment
and operations will take place.
• The key aspects of the architectural model are
❖ Client-Server model
❖ Peer-to-peer model
❖ Layered model
❖ Micro-services model
• In the micro-services model, a complex application or task is decomposed into multiple
independent services, and these services run on different servers. Each service performs only a
single function and is focused on a specific business capability. This makes the overall system
more maintainable, scalable, and easier to understand.
Fundamental Model
• The fundamental model in a distributed computing system is a broad
conceptual framework that helps in understanding the key aspects of
the distributed systems. These are concerned with more formal
description of properties that are generally common in all architectural
models. It represents the essential components that are required to
understand a distributed system’s behaviour.
• These fundamental models are as follows
1. Interaction Model
2. Remote Procedure Call (RPC)
3. Failure Model
4. Security Model
Interaction model
• Distributed computing systems are full of many processes interacting with
each other in highly complex ways. Interaction model provides a framework
to understand the mechanisms and patterns that are used for communication
and coordination among various processes. Different components that are
important in this model are –
• Message Passing – It deals with passing messages that may contain data,
instructions, a service request, or process synchronisation information between different
computing nodes. It may be synchronous or asynchronous depending on the
types of tasks and processes.
• Publish/Subscribe Systems – Also known as pub/sub systems. Here a
publishing process publishes a message on a topic, and the processes that
have subscribed to that topic pick it up and act on it. This pattern is
especially important in event-driven architectures.
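• As an illustration only, the sketch below shows the pub/sub idea with a tiny in-process broker; the class and topic names are made up for this example, and real systems would use a dedicated message broker over the network:

// Minimal in-process publish/subscribe broker (illustrative sketch only).
import java.util.*;
import java.util.function.Consumer;

class PubSubBroker
{
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // A subscriber registers interest in a topic by supplying a callback.
    public synchronized void subscribe(String topic, Consumer<String> handler)
    {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // A publisher sends a message on a topic; every subscriber's callback runs.
    public synchronized void publish(String topic, String message)
    {
        for (Consumer<String> handler : subscribers.getOrDefault(topic, new ArrayList<>()))
            handler.accept(message);
    }

    public static void main(String[] args)
    {
        PubSubBroker broker = new PubSubBroker();
        broker.subscribe("node-status", msg -> System.out.println("Monitor got: " + msg));
        broker.publish("node-status", "node-3 is up");
    }
}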
Remote Procedure Call(RPC)
• It is a communication paradigm that makes it possible to invoke a procedure or method in a
remote process as if it were a local procedure call. The client process makes a procedure call
using RPC, and the message is passed to the required server process using communication
protocols. These message-passing protocols are abstracted away, and the result, once obtained from the
server process, is sent back to the client process to continue execution.
Failure model
• This model addresses the faults and failures that occur in the distributed
computing system. It provides a framework to identify and rectify the faults
that occur or may occur in the system. Fault tolerance mechanisms are
implemented so as to handle failures by replication and error detection and
recovery methods. Different failures that may occur are:
• Crash failures – A process or node unexpectedly stops functioning.
• Omission failures – It involves a loss of message, resulting in absence of
required communication.
• Timing failures – The process deviates from its expected time quantum and
may lead to delays or unsynchronised response times.
• Byzantine failures – The process may send malicious or unexpected
messages that conflict with the set protocols.
Security model
• Distributed computing systems may suffer malicious attacks, unauthorised access and data
breaches. Security model provides a framework for understanding the security
requirements, threats, vulnerabilities, and mechanisms to safeguard the system and its
resources. Various aspects that are vital in the security model are –
• Authentication – It verifies the identity of the users accessing the system. It ensures that
only the authorised and trusted entities get access. It involves –
• Password-based authentication – Users provide a unique password to prove their identity.
• Public-key cryptography – Entities possess a private key and a corresponding public key, allowing
verification of their authenticity.
• Multi-factor authentication – Multiple factors, such as passwords, biometrics, or security tokens,
are used to validate identity.
• Encryption – It is the process of transforming data into a format that is unreadable without
a decryption key. It protects sensitive information from unauthorized access or disclosure.
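• As a small illustration of encryption only, the sketch below uses Java's standard javax.crypto API to encrypt and decrypt a short message with an AES key; key distribution, cipher modes and IVs are deliberately left out, so this is not a complete security mechanism:

// Illustrative sketch: symmetric encryption and decryption with AES.
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.util.Base64;

public class EncryptionDemo
{
    public static void main(String[] args) throws Exception
    {
        // Generate a secret key shared (in principle) by sender and receiver.
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();

        // Transform the plaintext into unreadable ciphertext.
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] cipherText = cipher.doFinal("secret message".getBytes());
        System.out.println("Encrypted: " + Base64.getEncoder().encodeToString(cipherText));

        // Only a holder of the key can turn the ciphertext back into plaintext.
        cipher.init(Cipher.DECRYPT_MODE, key);
        System.out.println("Decrypted: " + new String(cipher.doFinal(cipherText)));
    }
}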

Design Issues in distributed model
• The following are some of the major design issues of distributed systems
1. Heterogeneity: Heterogeneity applies to the network, computer hardware, operating systems, and
implementations by different developers. A key component of the heterogeneous distributed
client-server environment is middleware. Middleware is a set of services that enables applications
and end-users to interact with each other across a heterogeneous distributed system.
2. Openness: The openness of the distributed system is determined primarily by the degree to which
new resource-sharing services can be made available to the users. Open systems are characterized
by the fact that their key interfaces are published. It is based on a uniform communication
mechanism and published interface for access to shared resources. It can be constructed from
heterogeneous hardware and software.
3. Scalability: The system should remain efficient even with a significant increase
in the number of users and resources connected; it shouldn't matter whether a program runs on 10 or 100
nodes, performance shouldn't vary. Scaling a distributed system requires consideration of a
number of elements, including size, geography, and management.
Contd.
4. Security: The security of an information system has three components: confidentiality, integrity, and
availability. Encryption protects shared resources and keeps sensitive information secret when
transmitted.
5. Failure Handling: When faults occur in hardware or software, processes may produce
incorrect results or may stop before they have completed the intended computation, so corrective
measures should be implemented to handle such cases. Failure handling is difficult in distributed
systems because failures are partial, i.e. some components fail while others continue to function.
6. Concurrency: There is a possibility that several clients will attempt to access a shared resource at
the same time. Multiple users make requests on the same resources, i.e. read, write, and update. Each
resource must be safe in a concurrent environment. Any object that represents a shared resource in a
distributed system must ensure that it operates correctly in a concurrent environment.
7. Transparency: Transparency ensures that the distributed system is perceived as a single
entity by the users or the application programmers rather than as a collection of cooperating
autonomous systems. The user should be unaware of where the services are located, and the transfer
from a local machine to a remote one should be transparent.
Communication in distributed system
• Message passing is the interaction of exchanging messages between at least two processes. The
process which is sending the message to another process is known as the sender, and the process
which is receiving the message is known as the receiver.
• In a message-passing system, we can send a message by utilizing the send function and we can
receive a message by utilizing the receive function. The following are the general syntaxes of the send
function and the receive function.
Send()
Receive()
Send (receiver, message)
Receive(sender, message)

• Message passing is possible whenever the processes are in communication with each other. The
communication of a message can be established in a distributed system in two ways:
• Interprocess communication (IPC)
• Remote procedure call (RPC)
Interprocess Communication
• Inter-process communication (IPC) is a mechanism that allows
processes to communicate with each other and synchronize their
actions. The communication between these processes can be seen as a
method of cooperation between them. Processes can communicate
with each other through both:
1. Shared Memory
2. Message Passing
• Communication between processes using shared memory requires
processes to share some variable, and it completely depends on how
the programmer will implement it.
Contd.
• One way of communication using shared memory can be imagined like this: Suppose process1 and process2
are executing simultaneously, and they share some resources or use some information from another process.
Process1 generates information about certain computations or resources being used and keeps it as a record in
shared memory.
• When process2 needs to use the shared information, it will check the record stored in shared memory and take
note of the information generated by process1, and act accordingly.
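• As a rough illustration only, one common way to get shared-memory-style communication between two processes in Java is a memory-mapped file; in the sketch below the file name shared.dat is just an assumption made for the example:

// Illustrative sketch: process1 writes a value into a memory-mapped region
// that a second process mapping the same file ("shared.dat", hypothetical name) can read.
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class SharedMemoryWriter
{
    public static void main(String[] args) throws Exception
    {
        try (RandomAccessFile file = new RandomAccessFile("shared.dat", "rw");
             FileChannel channel = file.getChannel())
        {
            MappedByteBuffer shared = channel.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            shared.putInt(0, 42);   // process1 records its result in the shared region
            // A reader process mapping the same file would call shared.getInt(0)
            // to pick up the value and act accordingly.
        }
    }
}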
Message Passing Method
• In this method, processes communicate with each other without using
any kind of shared memory. If two processes p1 and p2 want to
communicate with each other, they proceed as follows:
Establish a communication link (if a link already exists, no need to
establish it again.)
Start exchanging messages using basic primitives.
We need at least two primitives:
–send(message, destination) or send(message)
– receive(message, host) or receive(message)
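• A minimal sketch of these two primitives, modelling the two processes as threads and the communication link as an in-memory queue; this is an assumption made only for illustration, since a real distributed system would send the message over the network:

// Illustrative sketch of send(message) / receive(message) between two "processes".
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassingDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<String> link = new ArrayBlockingQueue<>(10);   // the communication link

        Thread p1 = new Thread(() -> {
            try { link.put("hello from p1"); }                        // send(message)
            catch (InterruptedException ignored) { }
        });

        Thread p2 = new Thread(() -> {
            try { System.out.println("p2 received: " + link.take()); }  // receive(message)
            catch (InterruptedException ignored) { }
        });

        p1.start(); p2.start();
        p1.join(); p2.join();
    }
}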
Contd.
• The message size can be fixed or variable. A fixed size is easy for the OS designer but
complicated for the programmer, while a variable size is easy for the programmer but complicated for
the OS designer. A standard message has two parts: a header and a body.
• The header is used for storing the message type, destination id, source id, message length, and control
information. The control information covers things such as what to do if the receiver runs out of buffer space,
the sequence number, and the priority. Generally, messages are sent in FIFO order.
Message Passing through Communication Link
• Message passing is carried out via a link. A link has some capacity that determines the number of messages
that can reside in it temporarily, and every link has a queue associated with it which can be of zero
capacity, bounded capacity, or unbounded capacity. With zero capacity, the sender waits until the receiver
informs it that the message has been received. In non-zero-capacity cases, a process does not know
whether a message has been received after the send operation; for that, the sender must communicate
with the receiver explicitly. The implementation of the link depends on the situation: it can be either a direct
communication link or an indirect communication link.

• Direct communication links are implemented when the processes use a specific process identifier for the
communication, but it is hard to identify the sender ahead of time, for example in a print server.

• Indirect communication is done via a shared mailbox (port), which consists of a queue of messages. The
sender puts messages in the mailbox and the receiver picks them up.
Synchronous vs Asynchronous Transmission
• Synchronous Transmission: In synchronous transmission, data is sent in the form of blocks or
frames. This transmission is full-duplex. Synchronization between sender and receiver is
compulsory, and there is no time gap between data items. It is more
efficient and more reliable than asynchronous transmission for transferring a large amount of data.
• Examples: chat rooms, telephone conversations, video conferencing
Contd.
• Asynchronous Transmission: In asynchronous transmission, data is
sent in the form of bytes or characters. This transmission is half-duplex.
In this transmission, start bits and stop bits are added to the data,
and it does not require synchronization.
• Examples: email, forums, letters


Message Passing through Exchanging the Messages.
• Synchronous and Asynchronous Message Passing:
• A process that is blocked is one that is waiting for some event, such as a resource becoming
available or the completion of an I/O operation. IPC is possible between processes on the same
computer as well as between processes running on different computers, i.e. in a networked/distributed
system.
• In both cases, the process may or may not be blocked while sending a message or attempting to
receive a message, so message passing may be blocking or non-blocking. Blocking is
considered synchronous: a blocking send means the sender is blocked until the message is
received by the receiver, and a blocking receive has the receiver block until a message is available.
• Non-blocking is considered asynchronous: a non-blocking send has the sender send the message and
continue, and a non-blocking receive has the receiver receive either a valid message or null.
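• The difference can be sketched with the same queue-based model used earlier (again only an illustration of the idea, not a real network primitive): put/take block the caller, while offer/poll return immediately.

// Illustrative sketch: blocking (synchronous) vs non-blocking (asynchronous) primitives.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingVsNonBlocking
{
    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<String> link = new ArrayBlockingQueue<>(1);

        // Non-blocking receive: returns null immediately because nothing has arrived.
        System.out.println("non-blocking receive got: " + link.poll());

        // Non-blocking send: returns true/false instead of waiting if the link is full.
        System.out.println("non-blocking send accepted: " + link.offer("m1"));

        // Blocking receive: take() waits until a message is available.
        System.out.println("blocking receive got: " + link.take());

        // Blocking send: put() waits until there is space in the link.
        link.put("m2");
    }
}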
RPC
• RPC is an effective mechanism for building distributed client-server systems. RPC
enhances the power and ease of programming of the client/server computing concept.
• It is a protocol that allows one program to request a service from a program on another
computer in a network without having to know the details of the network. The program that makes the
request is called the client, and the program that provides the service is called the server.
• The calling parameters are sent to the remote procedure during a remote procedure call, and the
caller waits for a response from the remote procedure.
• There are 5 elements used in the working of RPC:
A. Client
B. Client Stub
C. RPC Runtime
D. Server Stub
E. Server
Contd.
• Client: The client process initiates RPC. The client makes a standard
call, which triggers a correlated procedure in the client stub.
Contd.
• Client Stub: Stubs are used by RPC to achieve semantic transparency.
The client calls the client stub. The client stub does the following tasks:

• The first task performed by the client stub is that, when it receives a request from the
client, it packs (marshals) the parameters and the required specification of the
remote/target procedure into a message.

• The second task performed by the client stub is that, upon receiving the result
values after execution, it unpacks (unmarshals) those results and passes them
to the client.
RPC Runtime:
• The RPC runtime is in charge of message transmission between client
and server via the network. Retransmission, acknowledgment, routing,
and encryption are all tasks performed by it.
• On the client side, it receives the result values in a message from the
server side and then passes them on to the client stub, whereas on
the server side, the RPC runtime gets the result message from the server
stub and forwards it to the client machine.
• It also accepts call request messages from the client machine and forwards them to
the server stub.
Server Stub & Server
• Server stub does the following tasks:
The first task performed by the server stub is that it
unpacks (unmarshals) the call request message which is received
from the local RPC Runtime and makes a regular call to invoke the
required procedure in the server.
The second task performed by the server stub is that, when it receives
the server's procedure execution result, it packs it into a message and
asks the local RPC Runtime to transmit it to the client stub, where it is
unpacked.
• After receiving a call request from the client machine, the server stub passes
it to the server. The execution of the required procedure is made by the
server and finally, it returns the result to the server stub so that it can be
passed to the client machine using the local RPC Runtime.
RPC Process
• The client, the client stub, and one instance of RPC Runtime are all
running on the client machine.
• A client initiates a client stub process by giving parameters as normal.
The client stub acquires storage in the address space of the client.
• At this point, the user can access RPC by using a normal Local
Procedural Call. The RPC runtime is in charge of message
transmission between client and server via the network.
Retransmission, acknowledgment, routing, and encryption are all tasks
performed by it.
Contd.
• On the server side, after the completion of the server operation, the return
values are returned to the server stub, which then packs (which is also
known as marshaling) them into a message. The transport
layer receives this message from the server stub.

• The resulting message is transmitted by the transport layer to the client
transport layer, which then passes the message up to the client stub.

• The client stub unpacks (which is also known as unmarshalling) the
return arguments in the resulting packet, and execution
returns to the caller at this point.
Remote Method Invocation (RMI)
• Remote Method Invocation (RMI) is an API that allows an object to
invoke a method on an object that exists in another address space,
which could be on the same machine or on a remote machine.
• Through RMI, an object running in a JVM present on a computer
(Client-side) can invoke methods on an object present in another JVM
(Server-side).
• RMI creates a public remote server object that enables client and
server-side communications through simple method calls on the server
object.
Contd.
• Stub Object: The stub object on the client machine builds an information
block and sends this information to the server.
• The block consists of
An identifier of the remote object to be used
Method name which is to be invoked
Parameters to the remote JVM
• Skeleton Object: The skeleton object passes the request from the stub
object to the remote object. It performs the following tasks
It calls the desired method on the real object present on the server.
It forwards the parameters received from the stub object to the method.
Working of RMI
• The communication between client and server is handled by using two
intermediate objects: Stub object (on client side) and Skeleton object
(on server-side) as also can be depicted from below media as follows:
Contd.
• These are the steps to be followed sequentially to implement the interface
defined below:
1. Defining a remote interface
2. Implementing the remote interface
3. Creating Stub and Skeleton objects from the implementation class
using rmic (RMI compiler)
4. Start the rmiregistry
5. Create and execute the server application program
6. Create and execute the client application program.
Step 1: Defining the remote interface
• The first thing to do is to create an interface that will provide the description
of the methods that can be invoked by remote clients. This interface should
extend the Remote interface and the method prototype within the interface
should throw the RemoteException.

// Creating a Search interface
import java.rmi.*;

public interface Search extends Remote
{
    // Declaring the method prototype
    public String query(String search) throws RemoteException;
}
Step 2: Implementing the remote interface

• The next step is to implement the remote interface. To implement the
remote interface, the class should extend the UnicastRemoteObject
class of the java.rmi package.

• Also, a default constructor needs to be created to throw the
java.rmi.RemoteException from its parent constructor.
Contd.
// Java program to implement the Search interface
import java.rmi.*;
import java.rmi.server.*;

public class SearchQuery extends UnicastRemoteObject implements Search
{
    // Default constructor to throw RemoteException
    // from its parent constructor
    SearchQuery() throws RemoteException
    {
        super();
    }

    // Implementation of the query interface
    public String query(String search) throws RemoteException
    {
        String result;
        if (search.equals("Reflection in Java"))
            result = "Found";
        else
            result = "Not Found";

        return result;
    }
}
Contd.
• Step 3: Creating Stub and Skeleton objects from the
implementation class using rmic

• The rmic tool is used to invoke the RMI compiler that creates the Stub
and Skeleton objects. Its prototype is rmic classname. For the above
program, the following command needs to be executed at the command
prompt:
rmic SearchQuery
• Step 4: Start the rmiregistry
Start the registry service by issuing the following command at the
command prompt:
start rmiregistry
Contd.
• Step 5: Create and execute the server application program
The next step is to create the server application program and execute it
on a separate command prompt.

The server program uses the createRegistry method of the
LocateRegistry class to create the rmiregistry within the server JVM,
with the port number passed as an argument.
The rebind method of the Naming class is used to bind the remote
object to the new name.
Contd.
// Java program for server application
import java.rmi.*;
import java.rmi.registry.*;

public class SearchServer
{
    public static void main(String args[])
    {
        try
        {
            // Create an object of the interface
            // implementation class
            Search obj = new SearchQuery();

            // rmiregistry within the server JVM with
            // port number 1900
            LocateRegistry.createRegistry(1900);

            // Binds the remote object by the name
            // geeksforgeeks
            Naming.rebind("rmi://localhost:1900" +
                          "/geeksforgeeks", obj);
        }
        catch(Exception ae)
        {
            System.out.println(ae);
        }
    }
}
Step 6: Create and execute the client application program

// Java program for client application
import java.rmi.*;

public class ClientRequest
{
    public static void main(String args[])
    {
        String answer, value = "Reflection in Java";
        try
        {
            // lookup method to find reference of remote object
            Search access =
                (Search)Naming.lookup("rmi://localhost:1900" +
                                      "/geeksforgeeks");
            answer = access.query(value);
            System.out.println("Article on " + value +
                               " " + answer + " at GeeksforGeeks");
        }
        catch(Exception ae)
        {
            System.out.println(ae);
        }
    }
}
PRAM or Parallel Random Access Machines
• Parallel Random Access Machine, also called PRAM is a model
considered for most of the parallel algorithms. It helps to write a
precursor parallel algorithm without any architecture constraints and
also allows parallel-algorithm designers to treat processing power as
unlimited. It ignores the complexity of inter-process communication.

• PRAM algorithms are mostly theoretical but can be used as a basis for
developing an efficient parallel algorithm for practical machines and
can also motivate building specialized machines.
PRAM Architecture Model
• The following are the modules of which a PRAM consists of:
1. It consists of a control unit, global memory, and an unbounded set of similar processors, each
with its own private memory.
2. An active processor reads from global memory, performs required computation, and then
writes to global memory.
3. Therefore, if there are N processors in a PRAM, then N number of independent operations
can be performed in a particular unit of time.
Models of PRAM
• While accessing the shared memory, there can be conflicts while performing the
read and write operation (i.e.), a processor can access a memory block that is
already being accessed by another processor. Therefore, there are various
constraints on a PRAM model which handles the read or write conflicts. They are:
• EREW: also called Exclusive Read Exclusive Write is a constraint that doesn’t
allow two processors to read or write from the same memory location at the same
instance.
• CREW: also called Concurrent Read Exclusive Write is a constraint that allows all
the processors to read from the same memory location but are not allowed to write
into the same memory location at the same time.
• ERCW: also called Exclusive Read Concurrent Write, is a constraint that allows all
the processors to write to the same memory location but does not allow them to read from
the same memory location at the same time.
• CRCW: also called Concurrent Read Concurrent Write is a constraint that allows
all the processors to read from and write to the same memory location parallelly.
Example
• Suppose we wish to add an array consisting of N numbers. We
generally iterate through the array and use N steps to find the sum of
the array.
• So, if the size of the array is N and for each step, let’s assume the time
taken to be 1 second. Therefore, it takes N seconds to complete the
iteration.
• The same operation can be performed more efficiently using a CRCW
model of a PRAM. Let there be N/2 parallel processors for an array of
size N; the processors add pairs of elements in parallel, so in the illustration
the time taken for the execution is 4 seconds, which is less than the N = 6 seconds
needed sequentially.
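• The pairwise idea can be sketched with Java's built-in parallel streams; this is only an analogy for the PRAM example, not a true PRAM, and the array values below are an arbitrary choice:

// Illustrative sketch: summing N numbers sequentially vs. by distributing
// the additions across the available processors.
import java.util.stream.IntStream;

public class ParallelSumDemo
{
    public static void main(String[] args)
    {
        int[] numbers = {3, 1, 4, 1, 5, 9};   // N = 6

        // Sequential: conceptually N steps, one addition after another.
        int sequentialSum = IntStream.of(numbers).sum();

        // Parallel: the runtime splits the array and combines partial sums,
        // mirroring the tree-like combination the PRAM example assumes.
        int parallelSum = IntStream.of(numbers).parallel().sum();

        System.out.println(sequentialSum + " == " + parallelSum);
    }
}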
Contd.
Message Oriented Vs. Stream Oriented Communication

• UDP (User Datagram Protocol) uses message-oriented communication; TCP (Transmission Control Protocol) uses stream-oriented communication.
• In message-oriented communication, data is sent by the application in discrete packets called messages; in stream-oriented communication, data is sent as a stream with no particular structure.
• Message-oriented communication is connectionless, with data sent without any setup; stream-oriented communication is connection-oriented, with a connection established before communication.
• Message-oriented delivery is unreliable, best-effort, without acknowledgement; stream-oriented delivery is reliable, and data is acknowledged.
• Retransmission is not performed in message-oriented communication; in stream-oriented communication, lost data is retransmitted automatically.
• Message-oriented communication has low overhead; stream-oriented communication has high overhead.
• Message-oriented communication has no flow control; stream-oriented communication uses flow control through protocols such as sliding window.
• Message-oriented communication suits applications such as audio and video, where speed matters more than the loss of some messages; stream-oriented communication suits applications such as e-mail systems, where data must arrive intact even if delivered late.
Module 4: Resource and Process Management
• The following are the desirable features of a global scheduling algorithm:
Fault Tolerance: A good global scheduling algorithm should not stop working when system
nodes crash or fail temporarily. The algorithm should also keep working even if groups of
nodes become separated from one another.
Scalability: A good global scheduling algorithm should be scalable, which means that the
algorithm should work well even as the number of nodes increases. An algorithm that queries the
workload of every node in the system and then selects the node with the least load scales
poorly. For distributed applications the algorithm should balance the load between nodes, but it
is not always the case that an exactly balanced load is best: more capable nodes may give a
better response time even when more heavily loaded, while lightly loaded nodes may still respond
poorly. So the load should be shared instead of strictly balanced, i.e. the resources of the nodes
should be shared among the nodes as long as the tasks being performed are not adversely affected.
Contd.
No a priori knowledge about the processes: Scheduling algorithms that work
based on the characteristics and resource requirements of the processes need this
information to be provided by the user. This obviously puts extra
overhead on users, so a good global scheduling algorithm should require as little
prior knowledge as possible.
Quick decision-making capability: Scheduling algorithms should be fast
and should provide the best possible optimal decision in less amount of
time. A good process-scheduling algorithm must make quick decisions about
the assignment of processes to processors. This is an extremely important
aspect of the algorithms and makes many potential solutions unsuitable. For
example, an algorithm that models the system by a mathematical program
and solves it online is unsuitable because it does not meet this requirement.
Heuristic methods requiring less computational effort while providing
near-optimal results are therefore normally preferable to exhaustive
(optimal) solution methods.
Contd.
Stability: Useless migration of processes should be prevented.
For example, if node n1 is idle while nodes n2 and n3 have multiple processes, then
both n2 and n3 may send processes to n1, causing n1 itself to become
overloaded; n1 will then move these processes to other nodes,
which is useless overhead for the system. Good scheduling
algorithms should therefore be stable and prevent useless migration.
Dynamic in nature: Process allocation should be dynamic, meaning
that allocations should be made based on the current load of the
system and not on fixed, pre-planned conditions. The scheduler
should also be able to move processes from one node to another
so that the distribution keeps following the current system
load.
Contd.
• A distributed system uses various approaches for resource management. These techniques
can be classified as follows:
Task Assignment Approach
Load Balancing Approach

• Task Assignment Approach: In this approach, when a user submits a process, the
DOS (distributed operating system) considers it to be a set of tasks and assigns resources
to each task so as to maximize the performance of the system. This approach has a drawback in
that it cannot assign resources to processes dynamically and thus lacks dynamism, but at the
same time this approach ensures that:
o I/O cost is reduced to a minimum
o Turnaround time is lower
o Parallelism is enhanced
o Resources are utilized effectively
Load Balancing Approach
• As the name suggests, this approach tries to balance the load among
the various resources as well as the processes. In this approach, all the
submitted processes are distributed among the various resources of the
system, which leads to maximum utilization of resources and
maximizes throughput. It uses various types
of load-balancing algorithms, as follows:
Contd.
• Static and Dynamic: Static algorithms divide the resources among the
processes only at the time they are submitted to the system and cannot
redistribute the resources to the processes dynamically. In the case of dynamic
algorithms, the redistribution of resources is done among the processes as
new processes enter the system. This ensures that the resources are utilized in
real time.
• Deterministic and probabilistic: Deterministic method works by analysing
the properties of nodes and the process characteristics for the allocation of
resources to the processes whereas probabilistic method uses static
information such as network topology, node capacity to allocate the
resources. Deterministic approach is expensive to implement but has a better
performance than probabilistic approach.
Contd.
• Centralized and Distributed: In centralized approach there is a central
node that controls the distribution of resources among the other nodes. In
this case, other nodes need to regularly update the central node about the
status and other nodes need to replicate the information at their end also so
that they can use it. This approach suffers from reliability issues. Replication
also leads to increased costs of communication. In the distributed approach the
whole process is divided among various nodes and they are free to process it
according to their own algorithms. Finally the result from all the nodes is
combined to produce the final result.
• Cooperative and Non-cooperative: Cooperative approach works through
cooperation between various nodes whereas in non-cooperative approach,
nodes can act autonomously irrespective of the influence of the other nodes.
Cooperative approach generally leads to high stability at high overhead cost.
Load Sharing Approach
• This approach makes use of a router acting as a reverse proxy to distribute the load to different
servers. Various algorithms such as Round Robin, Least Time, Least
Connections, etc. are used to distribute the load to different servers. It ensures that the
system is utilised to its maximum and that no node is idle. Several steps are used to allocate a
node to a process (a round-robin sketch follows this list):
First of all, it is checked whether a node is idle or not.
The system may then decide whether the work is to be transferred to the nodes that are
idle or to the nodes that are about to finish processing.
Further, the choice of node also depends on either the sender or the receiver. If the
sender chooses the node for allocation, the transfer is sender-initiated, but if the receiving
node chooses where to process the task, it is called receiver-initiated.
A node is responsible for communicating its state whenever it changes. This can be done
either by polling or by broadcasting.
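• A minimal sketch of the round-robin idea from the list above, with hypothetical server names, just to show how successive requests are spread evenly:

// Illustrative sketch: a reverse proxy choosing the next server in round-robin order.
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinBalancer
{
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> servers)
    {
        this.servers = servers;
    }

    // Each call returns the next server in rotation, so no single node is overloaded.
    public String pickServer()
    {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args)
    {
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("node-1", "node-2", "node-3"));
        for (int i = 0; i < 5; i++)
            System.out.println("request " + i + " -> " + lb.pickServer());
    }
}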
Desirable Features For Load Balancing Algorithm
• Various factors should be considered while designing a load balancing algorithm. These are:
✔ A good algorithm should work without any prior knowledge of the resource demand of a
process which will prevent the communication overheads by the user.
✔ It should be able to handle the load dynamically as in real time the processes are always coming
to the system. A static approach may not prove useful in real time scenarios.
✔ The techniques used for allocating the resource to a process should be efficient and fast so that
the processes do not suffer from a long waiting time. One of the approach is to use heuristics.
✔ It should work with minimum need to communicate with other nodes to know about their states.
✔ It must ensure that real work is being done instead of just switching nodes to prevent processor
thrashing.
✔ The algorithm should have high reliability and be able to manage the work even if one of the
nodes crashes or fails.
✔ It must ensure that all users’ work is done simultaneously and no user is made to wait for his
task to be completed so that process starvation can be prevented.
Process Management
• Process management is a core mechanism used in the distributed system to
gain control of all the processes and the task that they’re associated with, the
resources they’ve occupied, and how they’re communicating through
various IPC mechanisms. All of this is a part of process management,
managing the lifecycle of the executing processes.
• The following tasks are the foundation of process management:
• 1. Creation: When a program moves from secondary memory to main memory,
it becomes a process, and that's when the real procedure starts. In the context of
a distributed system, the initialization of a process can be done by one of the nodes in
the system, by a user's request, because it is required as a dependency by another system
component, or by being forked by another process as part of some bigger functionality.
Contd.
• 2. Termination: A process can be terminated either voluntarily or involuntarily by one of
the nodes in the run-time environment. Voluntary termination happens when the process
has completed its task; the process might also be terminated by the OS if it is consuming
resources beyond the criteria set by the distributed system.
• 3. Process Coordination
• As part of a whole system with multiple nodes, process coordination (otherwise
known as process synchronization) becomes a crucial part of managing the overall system.
Very frequently, a distributed system will come across a scenario where multiple
processes have to agree on a single decision based on some criteria. This agreement is
governed by algorithms such as the 2-Phase Commit (2-PC) and 3-Phase Commit
(3-PC) protocols.
• 4. Fault Tolerance
• Fault tolerance is the ability of the system to respond to the client even in the case of
a system failure. The distributed system achieves this by replicating the data across various
nodes, so if one of the nodes fails for any reason, such as being under maintenance or down due to
a hardware failure, the system will fetch the data from the other nodes.
Contd.
• Process Management is Done in Distributed Systems by the following ways
• 1. Process Allocation
• Process Allocation deals with allocating a processor, a node, or some fixed
amount of memory to the process (the size may vary as the requirements of the
process increase). This is the initial procedure when the process is born and is
about to perform its assigned tasks.
• 2. Process Migration
• Process migration, as its name indicates, is shifting (or migrating) a
process to a desired node or processor. Migration can be done for many
reasons, such as load balancing when the current node on which the process was executing has
exhausted its limit for handling a certain number of processes at a time, or for better
resource utilization. Process migration is further of 2 types:
Contd.
1. Non Pre-emptive Migration: The process is migrated before starting
its execution on the source node; that is, before it starts executing on the
node on which it was born, it is moved to its target node.
2. Pre-emptive Migration: In this case, the process has already started
its execution, but due to some unexpected factors or demands it needs
to be migrated to another node. This is a costly procedure because it
requires the OS to save the state of the process and all the related
information, such as the process id, files it has opened, program counter, state,
and priority, in the Process Control Block (PCB).
Threads
• Thread is a lightweight process. Thread is the segment of a process
which means a process can have multiple threads and these multiple
threads are contained within a process. A thread has three states:
Running, Ready, and Blocked.
• Advantages of Multi Thread
❑ No need to block with every system call
❑ Easy to exploit available parallelism in multiprocessors
❑ Cheaper communication between components than with IPC
❑ Better fit for most complex applications
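• A small sketch of several threads inside one process sharing the process's memory (illustrative only; the class name is made up for this example):

// Illustrative sketch: one process containing threads that communicate via shared memory.
public class MultiThreadDemo
{
    // Shared by every thread of the process, since threads share the address space.
    private static int sharedCounter = 0;

    public static void main(String[] args) throws InterruptedException
    {
        Runnable work = () -> {
            synchronized (MultiThreadDemo.class)
            {
                sharedCounter++;   // cheap communication compared to IPC between processes
            }
        };

        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();

        System.out.println("Counter updated by both threads: " + sharedCounter);
    }
}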
Contd.: Process vs Thread
• A process is any program in execution; a thread is a segment of a process.
• A process takes more time to terminate; a thread takes less time to terminate.
• A process takes more time for context switching; a thread takes less time for context switching.
• A process is less efficient in terms of communication; a thread is more efficient in terms of communication.
• Processes are isolated; threads share memory.
• Process switching uses an interface to the operating system; thread switching does not require calling the operating system or causing an interrupt to the kernel.
• If one process is blocked, it does not affect the execution of other processes; if a user-level thread is blocked, all other user-level threads of that process are blocked.
• Processes do not share data with each other; threads share data with each other.
• A system call is involved in creating a process; no system call is involved in creating a thread, which is created using APIs.
• A process has its own Process Control Block, stack, and address space; a thread has its parent's PCB, its own Thread Control Block and stack, and a common address space.
• Changes to the parent process do not affect child processes; since all threads of the same process share the address space and other resources, any change to the main thread may affect the behaviour of the other threads of the process.
Process Migration
• Process migration is a particular type of process management by
which processes are moved from one computing environment to
another.

• There are two types of Process Migration:


❖ Non-preemptive process
❖ Preemptive process
Contd.
• Non-preemptive process: If a process is moved before it begins
execution on its source node, this is known as non-preemptive
process migration.

• Preemptive process: If a process is moved during its execution,
this is known as preemptive process migration. Preemptive process
migration is more expensive than non-preemptive migration,
because the process environment must accompany the process to
its new node.
Contd.
• The reasons to use process migration are:
❖ Dynamic Load Balancing: It permits processes to exploit less loaded nodes
by relocating from overloaded ones.
❖ Accessibility: Processes that reside on faulty nodes can be moved to other
healthy nodes.
❖ System Administration: Processes residing on a node can be moved to
different nodes if that node is going through system maintenance.
❖ The locality of data: Processes can exploit the locality of data or other
special capabilities of a specific node.
❖ Mobility: Processes can be relocated from a hand-held device or computer
to a server-based computer before the device gets disconnected from
the network.
❖ Recovery of faults: The mechanism to stop, transport and resume a process is
valuable in supporting fault recovery in applications that are
based on transactions.
Methods of Migration
• The methods of Process Migration are:
• 1. Homogeneous Process Migration: Homogeneous process migration implies
relocating a process in a homogeneous environment where all systems have a
similar operating system as well as architecture. There are two unique strategies for
performing process migration. These are i) User-level process migration ii) Kernel
level process migration.
User-level process migration: In this procedure, process migration is
managed without modifying the operating system kernel. User-level
migration implementations are simpler to create and handle but usually
have two issues: i) The kernel state is not accessible to them. ii) They must cross the
kernel boundary using kernel requests, which are slow and expensive.
Kernel-level process migration: In this procedure, process migration is
done by modifying the operating system kernel. Accordingly, process
migration becomes simpler and more efficient. This facility
permits the migration process to be done faster and to relocate more types of
processes.
Contd.
• There are five fundamental calculations for homogeneous process migration:
Total Copy Algorithm
Pre-Copy Algorithm
Demand Page Algorithm
File Server Algorithm
Freeze Free Algorithm

• Heterogeneous Process Migration: Heterogeneous process migration is the relocation of
a process across machine architectures and operating systems. Clearly, it is more
complex than the homogeneous case, since it must account for the different machine and
operating-system designs and attributes, as well as transfer the same data as homogeneous
process migration, including the process state, address space, files, and communication data.
Heterogeneous process migration is particularly appropriate in the mobile environment, where it is
almost certain that the mobile unit and the base support station will be different machine types.
Contd.
• There are four essential types of heterogeneous migration:
❖ Passive object: The information is moved and should be
translated
❖ Active object, move when inactive: The process is relocated at
the point when it isn’t executing. The code exists in the two
areas, and just the information is moved and translated.
❖ Active object, interpreted code: The process is executing
through an interpreter so just information and interpreter state
need to be moved.
❖ Active object, native code: Both code and information should
be translated, as they are compiled for a particular
architecture.
Code Migration
• Instead of passing data around, why not move the code?
• Code migration is used to
❑ Improve load distribution in compute-intensive systems
❑ Save network resources and response time by moving processing
closer to where the data is
❑ Improve parallelism w/o code complexities
• Mobile agents for web searches
❑ Dynamic configuration of distributed systems
• Instantiation of distributed system on dynamically available
resources; binding to service-specific, client-side code at
invocation time
Models for code segmentation
• A process is seen as composed of three segments
– Code segment – the set of instructions that make up the program
– Resource segment – references to the external resources needed
– Execution segment – the state of the process (e.g. stack, PC, …)
• Some alternatives
– Weak/strong mobility – move the code segment only, or the code and execution segments
– Sender- or receiver-initiated migration
– A new process for the migrated code?
– Cloning instead of migration
Synchronization in Distributed System
• In the distributed system, the hardware and software components
communicate and coordinate their actions by message passing. Each node in
distributed systems can share its resources with other nodes. So, there is a
need for proper allocation of resources to preserve the state of resources and
help coordinate between the several processes.
• To preserve the state of resources, synchronization is used.
• Synchronization in distributed systems is achieved via clocks. The physical
clocks are used to adjust the time of nodes. Each node in the system can
share its local time with other nodes in the system. The time is set based on
UTC (Coordinated Universal Time). UTC is used as a reference time clock
for the nodes in the system. Clock synchronization can be achieved by 2
ways: External and Internal Clock Synchronization.
1. External clock synchronization is the one in which an external reference
clock is present. It is used as a reference and the nodes in the system can set
and adjust their time accordingly.
2. Internal clock synchronization is the one in which each node shares its
time with other nodes and all the nodes set and adjust their times
accordingly.
• There are 2 types of clock synchronization algorithms: Centralized and
Distributed.
• Centralized is the one in which a time server is used as a reference. The
single time-server propagates its time to the nodes, and all the nodes adjust
their time accordingly. It is dependent on a single time-server, so if that node
fails, the whole system will lose synchronization. Examples of centralized
algorithms are the Berkeley Algorithm, Passive Time Server, Active Time Server, etc.
1. Distributed is the one in which there is no centralized time-server present. Instead, the
nodes adjust their time by using their local time and then, taking the average of the
differences in time with other nodes. Distributed algorithms overcome the issues of
centralized algorithms, such as scalability and single point of failure. Examples of distributed
algorithms are – Global Averaging Algorithm, Localized Averaging Algorithm, NTP
(Network time protocol), etc.

• Centralized clock synchronization algorithms suffer from two major drawbacks:


1. They are subject to a single-point failure. If the time-server node fails, the clock
synchronization operation cannot be performed. This makes the system unreliable. Ideally,
a distributed system should be more reliable than its individual nodes. If one goes down,
the rest should continue to function correctly.
2. From a scalability point of view, it is generally not acceptable to get all the time requests
serviced by a single-time server. In a large system, such a solution puts a heavy burden on
that one process.
• Distributed algorithms overcome these drawbacks as there is no
centralized time-server present. Instead, a simple method for clock
synchronization may be to equip each node of the system with a
real-time receiver so that each node’s clock can be independently
synchronized in real-time. Multiple real-time clocks (one for each
node) are normally used for this purpose.

Logical Clock
• We should do the operations on our PCs one by one in an organized
way.
• Suppose we have more than 10 PCs in a distributed system and every
PC is doing its own work; how do we make them work together?
The solution to this is the LOGICAL CLOCK.
• Method-1:
To order events across processes, one approach is to try to sync the
clocks.
This means that if one PC has a time of 2:00 pm then every PC
should have the same time, which is not really possible; not every
clock can be synced at the same time, so we can't follow this method.
Contd.
• Method-2:
Another approach is to assign timestamps to events.
If we give each PC its own number, then everything will be organized in
such a way that the 1st PC will complete its process first, then the second, and so
on.
BUT, timestamps will only work as long as they obey causality.
• Taking a single PC: if 2 events A and B occur one after another then TS(A) <
TS(B). If A has a timestamp of 1, then B should have a timestamp greater than 1;
only then does the happened-before relationship hold.
• Taking 2 PCs, with event A in P1 (PC 1) and event B in P2 (PC 2), the
condition is still TS(A) < TS(B). For example, suppose you send a
message to someone at 2:00:00 pm and the other person receives it at 2:00:02
pm. Then it is obvious that TS(sender) < TS(receiver).
• Properties Derived from Happen Before Relationship –
• Transitive Relation –
If, TS(A) <TS(B) and TS(B) <TS(C), then TS(A) < TS(C)
• Causally Ordered Relation –
a->b, this means that a occurs before b, and any change in a
will surely be reflected in b.
• Concurrent Event –
This means that not every process occurs one by one, some processes
are made to happen simultaneously i.e., A || B.
Lamport’s logical clock
• Lamport define the relation happens-before (->) between any pair
of events with 3 rules:
1. If a and b are events on the same process, then a -> b if a occurs
before b based on the local clock.
2. If a process sends a message m to another process, then
send(m) -> receive(m) where send(m) and receive(m) are events
from first and second processes respectively.
3. happens-before is transitive, i.e. if a -> b and b -> c then a -> c.
• The goal of Lamport’s logical clock is to assign timestamps to all
events such that these timestamps obey causality - if an event B is
caused by an earlier event A, then everyone must see A before
seeing B. Formally, if an event A causally happens before another
event B, then timestamp(A) < timestamp(B). The timestamp must
always go forward and not backward.
Contd.
• Let's look at an example with 3 processes in the system, under the
following conditions:
1. We assume the clocks use a local counter which is an integer (the initial
value of the counter is 0), but the increment of each clock can be different.
2. A process increments its counter when an event happens or when it
sends a message. The counter value is assigned to the event as its
timestamp; a message event also carries its timestamp.
Contd.
• The messages m1 and m2 obey happens-before; however,
messages m3 and m4 do not, and we need to correct the local
clock. For example, if m3 is sent at 50, then m3 should only be
received at 51 or later. The algorithm to update Lamport's
counter is:
1. Before executing an event, process A increments its
counter, i.e. timestamp(A) = timestamp(A) + increment.
2. When A sends a message to process B, it sends along
timestamp(A).
3. Upon receiving the message, B adjusts its local clock, and
the counter is then incremented by 1 before the message is
considered received, i.e. timestamp(B) = max(timestamp(A),
timestamp(B)) + 1.
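• A compact sketch of these update rules in Java (illustrative only; the class and method names are made up for this example):

// Illustrative sketch of Lamport's logical clock rules for a single process.
public class LamportClock
{
    private long time = 0;

    // Rule 1: increment the counter before executing a local event or sending.
    public synchronized long tick()
    {
        return ++time;
    }

    // Rule 2: the value returned by tick() is sent along with the outgoing message.
    public synchronized long send()
    {
        return tick();
    }

    // Rule 3: on receive, jump ahead of the sender's timestamp and add 1.
    public synchronized long receive(long senderTimestamp)
    {
        time = Math.max(time, senderTimestamp) + 1;
        return time;
    }

    public static void main(String[] args)
    {
        LamportClock p1 = new LamportClock(), p2 = new LamportClock();
        long sendTs = p1.send();            // send event on process 1
        long recvTs = p2.receive(sendTs);   // corresponding receive on process 2
        System.out.println("send=" + sendTs + " receive=" + recvTs);   // send < receive
    }
}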
Contd.
• Sometimes, we do not want 2 events to occur at exactly the same
time. In this case, we need to use a unique identifier to break the tie,
i.e. an event at process A at timestamp 10 is timestamped as (10, A);
so if A < B then (10, A) < (10, B).

• On the other hand, if events happen on different processes and do
not exchange messages directly or indirectly, then nothing can be said
about their relation, and these events are said to be concurrent.
Concurrent events are not causally related and their order is not
guaranteed.
Contd.

• From the example above, for every message, a process needs to send first
before the other receives, and timestamp(send) < timestamp(receive) for
every message. Within the same process, for instance at process
2, we know that the receiving of m1 happens before the sending of
m3, hence timestamp(receive_1) < timestamp(send_3). However, while by
construction timestamp(receive_1) < timestamp(send_2), nothing can
be said about the relative order of the sending of m2 and the receiving of m1.
Contd.
• Hence, Lamport's logical timestamps obey the rule of
causality but cannot distinguish between causal and
concurrent events:
If 2 events follow the happens-before relationship, i.e. E1
-> E2, then timestamp(E1) < timestamp(E2); but
timestamp(E1) < timestamp(E2) implies either (E1 -> E2) or
(E1 and E2 are concurrent).
Vector Clock
• The vector clock tries to overcome the shortcoming of the logical
clock. Suppose there are N processes in the system; each process
uses a vector of integer clocks, where each vector has N elements.
We denote the vector maintained by process i as Vi[1…N]; the j-th
element of the vector at process i, Vi[j], is i's knowledge of the
latest events at process j.
• Vector Clock algorithm to assign and adjust the vector timestamp in each
process (a code sketch follows the list):
1. On an instruction or send event at process i, it increments only the
i-th element of its vector clock.
2. When a process sends a message, it attaches its vector clock to the
message.
3. When a process j receives a message from process i, it increments the
j-th element of its own vector clock and then updates the other elements
of the vector:
o V_j[j] = V_j[j] + 1
o V_j[k] = max(V_i[k], V_j[k]) for k ≠ j
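A compact Python sketch of the three rules above (0-indexed, with illustrative names; the text's 1-based Vi[1…N] notation maps to a plain list here):

class VectorClock:
    # Vector clock for process i out of n processes (illustrative sketch).

    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n           # V_i[1..N], zero-initialised

    def local_event(self):
        # Rule 1: increment only our own element on an instruction
        # or send event.
        self.v[self.i] += 1

    def send(self):
        # Rule 2: attach a copy of the vector to the outgoing message.
        self.local_event()
        return list(self.v)

    def receive(self, msg_vector):
        # Rule 3: bump our own element, then take the element-wise max
        # with the vector carried by the message.
        self.v[self.i] += 1
        for k, t in enumerate(msg_vector):
            if k != self.i:
                self.v[k] = max(self.v[k], t)

# Node 0 sends a message to node 1 (indices start at 0 here, unlike the
# 1-based notation used in the text).
n0, n1 = VectorClock(0, 3), VectorClock(1, 3)
m = n0.send()          # n0.v == [1, 0, 0]
n1.receive(m)          # n1.v == [1, 1, 0]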
Contd.
• In the example above, node 1 updates its vector to [1,0,0] to
represent the send event at A before sending to node 2.
Upon receiving, node 2 updates its vector for the receive event at C.
When node 2 receives another message at F, it again updates its vector
for the receive event and then adjusts the other elements in the
vector.
Contd.
• Using vector clocks, we define some relationships between 2 events a
and b (helper functions follow the list):
1. V_a = V_b if and only if V_a[i] = V_b[i], for all i = 1, … , N
2. V_a ≤ V_b if and only if V_a[i] ≤ V_b[i], for all i = 1, … , N
3. a and b are causally related if V_a < V_b, where V_a < V_b if and
only if V_a ≤ V_b and there exists j such that 1 ≤ j ≤ N and
V_a[j] < V_b[j]. So in the example above, using vector clocks, node
2 can tell that the messages A->C and E->F are causally related.
4. a and b are concurrent if and only if NOT (V_a ≤ V_b)
AND NOT (V_b ≤ V_a)
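These relations translate directly into element-wise comparisons; a small Python sketch (function names are illustrative):

def vc_leq(a, b):
    # V_a <= V_b : every element of a is <= the matching element of b.
    return all(x <= y for x, y in zip(a, b))

def causally_before(a, b):
    # V_a < V_b : a <= b and at least one element is strictly smaller.
    return vc_leq(a, b) and any(x < y for x, y in zip(a, b))

def concurrent(a, b):
    # Neither vector is <= the other.
    return not vc_leq(a, b) and not vc_leq(b, a)

print(causally_before([1, 0, 0], [1, 1, 0]))   # True (A -> C style)
print(concurrent([2, 0, 0], [0, 1, 0]))        # True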
Lamport’s Algorithm for Mutual Exclusion in
Distributed System
• Three types of messages (REQUEST, REPLY and RELEASE) are used, and
communication channels are assumed to follow FIFO order.
• A site sends a REQUEST message to all other sites to get their permission to enter the critical
section.
• A site sends a REPLY message to the requesting site to give its permission to enter the critical
section.
• A site sends a RELEASE message to all other sites upon exiting the critical section.
• Every site Si keeps a queue to store critical section requests ordered by their
timestamps; request_queuei denotes the queue of site Si.
• A timestamp is given to each critical section request using Lamport’s logical clock.
• Timestamps are used to determine the priority of critical section requests: a smaller timestamp
gets higher priority than a larger timestamp. Critical section requests are always executed in
the order of their timestamps.
Algorithm
• To enter the critical section (see the sketch after this list):
• When a site Si wants to enter the critical section, it sends a request
message REQUEST(tsi, i) to all other sites and places the request on request_queuei.
Here, tsi denotes the timestamp of site Si.
• When a site Sj receives the request message REQUEST(tsi, i) from site Si, it returns
a timestamped REPLY message to site Si and places the request of site
Si on request_queuej.
• To execute the critical section:
• A site Si can enter the critical section if it has received a message with timestamp
larger than (tsi, i) from all other sites and its own request is at the top
of request_queuei.
• To release the critical section:
• When a site Si exits the critical section, it removes its own request from the top of its
request queue and sends a timestamped RELEASE message to all other sites.
• When a site Sj receives the timestamped RELEASE message from site Si, it
removes the request of Si from its request queue.
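A single-site sketch of this bookkeeping in Python, assuming a send(dest, msg) transport and a message dispatcher supplied elsewhere (all names are illustrative, not a definitive implementation):

import heapq

class LamportMutexSite:
    # Sketch of one site's state in Lamport's mutual exclusion algorithm.
    # Message delivery/dispatch is assumed to exist outside this class.

    def __init__(self, site_id, all_sites, send):
        self.id = site_id
        self.others = [s for s in all_sites if s != site_id]
        self.send = send
        self.clock = 0                          # Lamport logical clock
        self.queue = []                         # heap of (timestamp, site)
        self.last_seen = {s: 0 for s in self.others}
        self.my_request = None

    def request_cs(self):
        self.clock += 1
        self.my_request = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_request)
        for s in self.others:
            self.send(s, ('REQUEST', self.my_request))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = max(self.last_seen[sender], ts)
        heapq.heappush(self.queue, (ts, sender))
        self.send(sender, ('REPLY', self.clock, self.id))

    def on_reply(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = max(self.last_seen[sender], ts)

    def can_enter_cs(self):
        # Enter when our own request heads the queue and every other site
        # has been heard from with a timestamp larger than our request's.
        return (self.my_request is not None
                and self.queue and self.queue[0] == self.my_request
                and all(t > self.my_request[0] for t in self.last_seen.values()))

    def release_cs(self):
        heapq.heappop(self.queue)               # drop our own request
        self.my_request = None
        self.clock += 1
        for s in self.others:
            self.send(s, ('RELEASE', self.clock, self.id))

    def on_release(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = max(self.last_seen[sender], ts)
        self.queue = [r for r in self.queue if r[1] != sender]
        heapq.heapify(self.queue)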
Contd.
• Message Complexity: Lamport’s algorithm requires 3(N – 1) messages per
critical section execution. These 3(N – 1) messages involve:
o (N – 1) request messages
o (N – 1) reply messages
o (N – 1) release messages
• Drawbacks of Lamport’s Algorithm:
o Unreliable approach: failure of any one of the processes will
halt the progress of the entire system.
o High message complexity: the algorithm requires 3(N – 1) messages
per critical section invocation.
Ricart–Agrawala Algorithm in Mutual
Exclusion in Distributed System
• This algorithm is an extension and optimization of Lamport’s Distributed Mutual
Exclusion Algorithm. Like Lamport’s algorithm, it also follows a permission-based
approach to ensure mutual exclusion. In this algorithm:
o Two types of messages (REQUEST and REPLY) are used, and communication
channels are assumed to follow FIFO order.
o A site sends a REQUEST message to all other sites to get their permission to enter the
critical section.
o A site sends a REPLY message to another site to give its permission to enter the
critical section.
o A timestamp is given to each critical section request using Lamport’s logical clock.
o Timestamps are used to determine the priority of critical section requests: a smaller
timestamp gets higher priority than a larger timestamp. Critical section requests are
always executed in the order of their timestamps.
Algorithm
• To enter the critical section (see the sketch after this list):
o When a site Si wants to enter the critical section, it sends a
timestamped REQUEST message to all other sites.
o When a site Sj receives a REQUEST message from site Si, it sends
a REPLY message to site Si if and only if
❖ Site Sj is neither requesting nor currently executing the critical section, or
❖ Site Sj is requesting but the timestamp of site Si’s request is smaller than the timestamp of its own request.
Otherwise, the reply is deferred.
• To execute the critical section:
o Site Si enters the critical section once it has received a REPLY message from all other
sites.
• To release the critical section:
o Upon exiting the critical section, site Si sends a REPLY message to all the deferred requests.
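A similar per-site sketch for Ricart–Agrawala, again assuming an external send(dest, msg) transport and message dispatcher (names are illustrative):

class RicartAgrawalaSite:
    # Sketch of one site's state in the Ricart-Agrawala algorithm.

    def __init__(self, site_id, all_sites, send):
        self.id = site_id
        self.others = [s for s in all_sites if s != site_id]
        self.send = send
        self.clock = 0
        self.requesting = False
        self.in_cs = False
        self.my_ts = None                      # (timestamp, site id)
        self.replies_pending = set()
        self.deferred = []                     # requests to answer on exit

    def request_cs(self):
        self.clock += 1
        self.my_ts = (self.clock, self.id)
        self.requesting = True
        self.replies_pending = set(self.others)
        for s in self.others:
            self.send(s, ('REQUEST', self.my_ts))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        # Defer the reply only if we are in the CS, or we are requesting
        # with an older (smaller) timestamp than the incoming request.
        if self.in_cs or (self.requesting and self.my_ts < ts):
            self.deferred.append(sender)
        else:
            self.send(sender, ('REPLY', self.id))

    def on_reply(self, sender):
        self.replies_pending.discard(sender)
        if self.requesting and not self.replies_pending:
            self.in_cs = True                  # all replies received: enter CS

    def release_cs(self):
        self.in_cs = False
        self.requesting = False
        for s in self.deferred:                # answer the deferred requests
            self.send(s, ('REPLY', self.id))
        self.deferred = []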
Contd.
• Message Complexity: the Ricart–Agrawala algorithm requires 2(N – 1) messages per
critical section execution. These 2(N – 1) messages involve:
o (N – 1) request messages
o (N – 1) reply messages
• Advantages of the Ricart–Agrawala Algorithm:
o Low message complexity: the algorithm requires only 2(N – 1) messages per critical
section invocation, compared with 3(N – 1) for Lamport’s algorithm, where N is the total
number of nodes in the system.
o Scalability: the algorithm is scalable and can be used in systems with a large number of
nodes.
o Non-blocking: a node can continue executing its normal operations while waiting to enter
the critical section.
• Drawbacks of the Ricart–Agrawala algorithm:
o Unreliable approach: failure of any one node in the system can halt the progress of the
whole system; in this situation, the requesting process may starve forever. Node failure
can be handled by detecting the failure after some timeout.
Election algorithm and distributed processing
• A distributed algorithm is an algorithm that runs on a distributed system.
A distributed system is a collection of independent computers that do not share
memory. Each processor has its own memory, and they communicate via
communication networks. Communication in the network is implemented by a
process on one machine communicating with a process on another machine. Many
algorithms used in distributed systems require a coordinator that performs
functions needed by the other processes in the system.
• Election algorithms are designed to choose a coordinator.
Election Algorithms
• Election algorithms choose a process from a group of processors to act as the coordinator. If the
coordinator process crashes for some reason, then a new coordinator is elected on another
processor. An election algorithm basically determines where a new copy of the coordinator should
be restarted. Election algorithms assume that every active process in the system has a unique
priority number. The process with the highest priority is chosen as the new coordinator. Hence,
when a coordinator fails, the algorithm elects the active process that has the highest priority
number. This number is then sent to every active process in the distributed system. We have two
election algorithms for two different configurations of a distributed system.
• The Bully Algorithm – This algorithm applies to systems where every process can send a message
to every other process in the system.
Bully Algorithm
• Suppose process P sends a message to the coordinator (a simplified simulation follows the steps).
1. If the coordinator does not respond within a time interval T, it is
assumed that the coordinator has failed.
2. Process P then sends an election message to every process with a higher priority
number.
3. It waits for responses; if no one responds within time interval T, then process P
elects itself as the coordinator.
4. It then sends a message to all lower priority number processes announcing that it is
elected as their new coordinator.
5. However, if an answer is received within time T from some other process Q:
o Process P waits for another time interval T’ to receive a message from
Q announcing that Q has been elected as coordinator.
o If Q does not respond within time interval T’, it is assumed to have
failed and the algorithm is restarted.
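A deliberately simplified simulation of the bully election in Python: timeouts and message delays are abstracted away, and alive_ids stands in for the set of processes that would actually answer within time T (all names are illustrative):

def bully_election(initiator, alive_ids):
    # One round of the bully algorithm, with failures modelled simply by
    # omitting dead processes from alive_ids.
    higher = [p for p in alive_ids if p > initiator]
    if not higher:
        # No higher-priority process answered: the initiator wins and
        # would announce itself to all lower-priority processes.
        return initiator
    # Otherwise a responding higher process takes over the election;
    # the highest-priority live process ends up as coordinator.
    return bully_election(max(higher), alive_ids)

print(bully_election(initiator=2, alive_ids={1, 2, 4, 5}))   # -> 5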
The Ring Algorithm
• This algorithm applies to systems organized as a ring (logically or physically). In this algorithm we
assume that the links between the processes are unidirectional and every process can send a message
only to the process on its right. The data structure this algorithm uses is the active list, a list that
holds the priority numbers of all active processes in the system (a simulation sketch follows the steps).
1. If process P1 detects a coordinator failure, it creates a new active list which is initially empty.
It sends an election message containing its number 1 to its neighbour on the right and adds the
number 1 to its active list.
2. When process P2 receives an election message from the process on its left, it responds in one of 3 ways:
o If the received message does not already contain 2 in the active list, then P2 adds 2 to the active
list and forwards the message.
o If this is the first election message P2 has received or sent, it creates a new active list with
the numbers 1 and 2. It then sends the election message containing 1, followed by the one containing 2.
o If process P1 receives its own election message containing 1, its active list now contains the
numbers of all the active processes in the system. Process P1 then picks the highest
priority number from the list and elects that process as the new coordinator.
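A small simulation sketch of this active-list variant, assuming the ids list is ordered clockwise around the ring and that every listed process is alive (names are illustrative):

def active_list_ring_election(ids, initiator_index):
    # The election message carries the list of numbers collected so far;
    # once it returns to the initiator it holds every active process and
    # the highest priority number becomes the new coordinator.
    n = len(ids)
    active_list = [ids[initiator_index]]
    pos = (initiator_index + 1) % n
    while pos != initiator_index:          # pass the message around the ring
        if ids[pos] not in active_list:
            active_list.append(ids[pos])   # each node adds its own number
        pos = (pos + 1) % n
    return max(active_list)                # highest priority number wins

print(active_list_ring_election([3, 7, 2, 9, 5], initiator_index=0))   # -> 9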
Asynchronous Ring Election Algorithm (Clockwise leader
election)
• Each node v executes the following code (a simulation sketch follows):
1. Node v stores the largest known ID in m_v.
2. Initialize m_v := ID(v) and send ID(v) to the clockwise neighbour.
3. if v receives a message with ID(w) > m_v then
4. v forwards ID(w) to its clockwise neighbour and sets m_v := ID(w).
5. v decides not to be the leader, if it has not done so already.
6. elseif v receives a message with ID(v) then
7. v decides to be the leader.
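A round-based simulation sketch of this clockwise algorithm in Python (the ring and message delivery are modelled with plain lists; names are illustrative):

def ring_election(ids):
    # ids[i]'s clockwise neighbour is ids[(i + 1) % n]. Returns the leader.
    n = len(ids)
    m = list(ids)                                  # m[i]: largest ID known to node i
    in_flight = [(i, ids[i]) for i in range(n)]    # (position, carried ID)
    while in_flight:
        next_round = []
        for pos, msg_id in in_flight:
            dest = (pos + 1) % n                   # deliver to clockwise neighbour
            if msg_id == ids[dest]:
                return ids[dest]                   # node sees its own ID: leader
            if msg_id > m[dest]:
                m[dest] = msg_id                   # remember and forward larger IDs
                next_round.append((dest, msg_id))
            # messages carrying smaller IDs are swallowed, not forwarded
        in_flight = next_round
    return None

print(ring_election([3, 7, 2, 9, 5]))   # -> 9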
Synchronous Leader Election Algorithm
• The algorithm consists of phases i = 1, 2, …, each of length n rounds.
• Every node v does the following (a short sketch follows):
• if phase i = ID(v) and v has not yet received a message then
• v becomes the leader and sends the message “v is leader” around the ring.
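Because phases are visited in increasing order, the node with the smallest ID reaches its own phase first and wins, silencing everyone else; a minimal sketch of that observation (illustrative only, with message rounds abstracted away):

def synchronous_election(ids):
    # Phase i lasts n rounds; the first phase that matches a live ID elects
    # that node, so the minimum ID always wins.
    for phase in range(1, max(ids) + 1):
        if phase in ids:
            return phase
    return None

print(synchronous_election([3, 7, 2, 9, 5]))   # -> 2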