DS Unit-5
Mutual exclusion in a distributed system states that only one process is allowed to execute the
critical section (CS) at any given time.
Message passing is the sole means for implementing distributed mutual exclusion.
The decision as to which process is allowed access to the CS next is arrived at by
message passing, in which each process learns about the state of all other processes in
some consistent way.
There are three basic approaches for implementing distributed mutual exclusion:
1. Token-based approach:
A unique token is shared among all the sites.
If a site possesses the unique token, it is allowed to enter its critical section.
This approach uses sequence numbers to order requests for the critical section.
Each request for the critical section carries a sequence number, which is used to
distinguish old requests from current ones.
This approach ensures mutual exclusion because the token is unique.
Eg: Suzuki-Kasami’s Broadcast Algorithm
2. Non-token-based approach:
A site communicates with other sites to determine which site should execute the
critical section next. This requires the exchange of two or more successive
rounds of messages among the sites.
This approach uses timestamps instead of sequence numbers to order requests
for the critical section.
Whenever a site makes a request for the critical section, it gets a timestamp.
The timestamp is also used to resolve any conflict between critical section
requests.
All algorithms that follow the non-token-based approach maintain a logical
clock. Logical clocks are updated according to Lamport’s scheme.
Eg: Lamport's algorithm, Ricart–Agrawala algorithm
Page 1 of 20 Meenakshi.R
CS8603- Distributed Systems
3. Quorum-based approach:
Instead of requesting permission to execute the critical section from all other
sites, each site requests permission only from a subset of sites, called a quorum.
Any two quorums contain at least one common site.
This common site is responsible for ensuring mutual exclusion.
Eg: Maekawa’s Algorithm
3.1.1 Preliminaries
The system consists of N sites, S1, S2, S3, …, SN.
Assume that a single process is running on each site.
The process at site Si is denoted by pi. All these processes communicate
asynchronously over an underlying communication network.
A process wishing to enter the CS requests all other processes, or a subset of them,
by sending REQUEST messages, and waits for appropriate replies before entering the
CS.
While waiting the process is not allowed to make further requests to enter the CS.
A site can be in one of the following three states: requesting the CS, executing the CS,
or neither requesting nor executing the CS.
In the requesting the CS state, the site is blocked and cannot make further requests for
the CS.
In the idle state, the site is executing outside the CS.
In the token-based algorithms, a site can also be in a state where a site holding the
token is executing outside the CS. Such a state is referred to as the idle token state.
At any instant, a site may have several pending requests for CS. A site queues up
these requests and serves them one at a time.
N denotes the number of processes or sites involved in invoking the critical section, T
denotes the average message delay, and E denotes the average critical section
execution time.
The safety property states that at any instant, only one process can execute the
critical section. This is an essential property of a mutual exclusion algorithm.
Liveness property:
This property states the absence of deadlock and starvation. Two or more sites
should not endlessly wait for messages that will never arrive. In addition, a site must
not wait indefinitely to execute the CS while other sites are repeatedly executing the
CS. That is, every requesting site should get an opportunity to execute the CS in finite
time.
Fairness:
Fairness in the context of mutual exclusion means that each process gets a fair
chance to execute the CS. In mutual exclusion algorithms, the fairness property
generally means that the CS execution requests are executed in order of their arrival in
the system.
3.1.3 Performance metrics
Message complexity: This is the number of messages that are required per CS
execution by a site.
Synchronization delay: This is the time required, after a site leaves the CS, before
the next site enters the CS. (Figure 3.1)
Response time: This is the time interval a request waits for its CS execution to be
over after its request messages have been sent out. Thus, response time does not
include the time a request waits at a site before its request messages have been sent
out. (Figure 3.2)
System throughput: This is the rate at which the system executes requests for the
CS. If SD is the synchronization delay and E is the average critical section execution
time, then the system throughput is 1/(SD + E).
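With SD and E as defined above, the system throughput works out to 1/(SD + E); a quick check with illustrative numbers (the values below are made up, not from the text):

```python
def throughput(sd: float, e: float) -> float:
    """Rate of CS executions per unit time, given synchronization
    delay sd and average CS execution time e."""
    return 1.0 / (sd + e)

# Example: SD = 2 ms, E = 8 ms -> 0.1 CS executions per millisecond.
print(throughput(2.0, 8.0))
```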
For example, the best and worst values of the response time are achieved when the
load is low and high, respectively;
the best and worst message traffic is generated at low and heavy load conditions,
respectively.
Correctness
Theorem: Lamport’s algorithm achieves mutual exclusion.
Proof: Proof is by contradiction.
Suppose two sites Si and Sj are executing the CS concurrently. For this to happen
conditions L1 and L2 must hold at both the sites concurrently.
This implies that at some instant in time, say t, both Si and Sj have their own requests
at the top of their request queues and condition L1 holds at them. Without loss of
generality, assume that Si ’s request has smaller timestamp than the request of Sj .
From condition L1 and FIFO property of the communication channels, it is clear that
at instant t the request of Si must be present in request queuej when Sj was executing
its CS. This implies that Sj ’s own request is at the top of its own request queue when
a smaller timestamp request, Si ’s request, is present in the request queuej – a
contradiction!
Message Complexity:
Lamport’s Algorithm requires invocation of 3(N – 1) messages per critical section execution.
These 3(N – 1) messages involve:
(N – 1) request messages
(N – 1) reply messages
(N – 1) release messages
Performance:
Synchronization delay is equal to the maximum message transmission time. The algorithm
requires 3(N – 1) messages per CS execution, and can be optimized to 2(N – 1) messages
by omitting the REPLY message in some situations.
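The entry conditions L1 and L2 used in the correctness proof above can be sketched as a simple check; the data layout here is an illustrative simulation, not a full message-passing implementation. Requests are (timestamp, site_id) pairs compared lexicographically, so ties on timestamps are broken by site id.

```python
def can_enter_cs(site_id, request_queue, last_msg_ts, own_request):
    """L1: every other site has sent a message timestamped later than
    this site's request. L2: the site's own request heads its queue."""
    ts, _ = own_request
    l1 = all(t > ts for s, t in last_msg_ts.items() if s != site_id)
    l2 = min(request_queue) == own_request
    return l1 and l2

# Site 1 requested at timestamp 3; sites 2 and 3 have since sent
# messages timestamped 5 and 4, and (3, 1) heads the queue.
print(can_enter_cs(1, [(3, 1), (6, 2)], {2: 5, 3: 4}, (3, 1)))  # True
```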
When site Sj receives site Si’s REQUEST message, it sends a REPLY message immediately
if it is neither requesting nor executing the CS, or if it is requesting but the
timestamp of site Si’s request is smaller than that of its own request.
Otherwise, the request is deferred by site Sj.
Message Complexity:
Ricart–Agrawala algorithm requires invocation of 2(N – 1) messages per critical section
execution. These 2(N – 1) messages involve:
(N – 1) request messages
(N – 1) reply messages
Performance:
Synchronization delay is equal to the maximum message transmission time. It requires
2(N – 1) messages per critical section execution.
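The REPLY/defer decision of the Ricart–Agrawala algorithm can be sketched as a small function; the state encoding below is an illustrative assumption, and requests are (timestamp, site_id) pairs compared lexicographically:

```python
def should_reply(j_state, j_request, i_request):
    """Sj decides whether to REPLY to Si's request immediately.
    j_state: 'idle', 'requesting', or 'executing'."""
    if j_state == "executing":
        return False          # defer until Sj leaves the CS
    if j_state == "requesting" and j_request < i_request:
        return False          # Sj's own request has priority
    return True               # send REPLY immediately

print(should_reply("requesting", (4, 2), (3, 1)))  # True: Si's request is older
print(should_reply("requesting", (2, 2), (3, 1)))  # False: deferred
```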
A site sends a RELEASE message to all other sites in its request set or quorum upon
exiting the critical section.
Maekawa used the theory of projective planes and showed that N = K(K – 1)+ 1. This
relation gives |Ri|= √N.
Correctness
Theorem: Maekawa’s algorithm achieves mutual exclusion.
Proof: Proof is by contradiction.
Suppose two sites Si and Sj are concurrently executing the CS.
This means site Si received a REPLY message from all sites in Ri and concurrently
site Sj was able to receive a REPLY message from all sites in Rj .
Since Ri ∩ Rj ≠ ∅, there is a site Sk ∈ Ri ∩ Rj; then site Sk must have sent REPLY
messages to both Si and Sj concurrently, which is a contradiction.
Message Complexity:
Maekawa’s Algorithm requires invocation of 3√N messages per critical section execution,
as the size of a request set is √N. These 3√N messages involve:
√N request messages
√N reply messages
√N release messages
Performance:
Synchronization delay is equal to twice the message propagation delay time. It requires
3√N messages per critical section execution.
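The quorum-intersection property can be verified directly. The sets below are the lines of the Fano plane, one standard construction for N = 7 (K = 3, so N = K(K − 1) + 1 = 7):

```python
from itertools import combinations

quorums = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
]

# Every pair of quorums shares at least one common site, which is
# the site responsible for ensuring mutual exclusion between them.
assert all(a & b for a, b in combinations(quorums, 2))
print("all", len(quorums), "quorums pairwise intersect")
```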
To enter Critical section:
When a site Si wants to enter the critical section and it does not have the token then it
increments its sequence number RNi[i] and sends a request message REQUEST(i, sn)
to all other sites in order to request the token.
Here sn is the updated value of RNi[i].
When a site Sj receives the request message REQUEST(i, sn) from site Si, it sets
RNj[i] to the maximum of RNj[i] and sn, i.e., RNj[i] = max(RNj[i], sn).
After updating RNj[i], site Sj sends the token to site Si if it has the token and
RNj[i] = LN[i] + 1.
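The request-handling step above can be sketched as follows; the dictionary layout for RN and LN is an illustrative assumption. RN[j][i] is Sj’s highest known request number from Si, and LN[i] (carried with the token) is the request number of Si’s last executed CS.

```python
def on_request(j, i, sn, RN, LN, holder):
    """Sj receives REQUEST(i, sn): update RN[j][i], then report
    whether Sj should pass the token to Si (Sj holds an idle token
    and the request is current, i.e., RN[j][i] == LN[i] + 1)."""
    RN[j][i] = max(RN[j][i], sn)
    return holder == j and RN[j][i] == LN[i] + 1

RN = {1: {1: 0, 2: 0}, 2: {1: 0, 2: 0}}
LN = {1: 0, 2: 0}
# S2 holds the idle token; S1 broadcasts REQUEST(1, 1).
print(on_request(2, 1, 1, RN, LN, holder=2))  # True: token goes to S1
```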
Correctness
Mutual exclusion is guaranteed because there is only one token in the system and a site holds
the token during the CS execution.
Theorem: A requesting site enters the CS in finite time.
Proof: Token request messages of a site Si reach other sites in finite time.
Since one of these sites will have token in finite time, site Si ’s request will be placed in the
token queue in finite time.
Since there can be at most N − 1 requests in front of this request in the token queue, site Si
will get the token and execute the CS in finite time.
Message Complexity:
The algorithm requires no messages if the site already holds the idle token at the
time of its critical section request; otherwise, a maximum of N messages is needed
per critical section execution. These N messages involve:
(N – 1) request messages
1 reply message
Performance:
Synchronization delay is 0 and no message is needed if the site holds the idle token at the
time of its request. In case the site does not hold the idle token, the maximum
synchronization delay is equal to the maximum message transmission time, and a maximum
of N messages is required per critical section invocation.
3.6 DEADLOCK DETECTION IN DISTRIBUTED SYSTEMS
Deadlock can neither be prevented nor avoided in distributed system as the system is
so vast that it is impossible to do so. Therefore, only deadlock detection can be implemented.
The techniques of deadlock detection in the distributed system require the following:
Progress: The method should be able to detect all the deadlocks in the system.
Safety: The method should not detect false or phantom deadlocks.
Distributed approach:
In the distributed approach, different nodes work together to detect deadlocks.
There is no single point of failure, as the workload is equally divided among all nodes.
The speed of deadlock detection also increases.
Hierarchical approach:
This approach is the most advantageous approach.
It is the combination of both centralized and distributed approaches of deadlock
detection in a distributed system.
In this approach, some selected nodes or cluster of nodes are responsible for deadlock
detection and these selected nodes are controlled by a single node.
System Model
Preliminaries
Correctness criteria
A deadlock detection algorithm must satisfy the following two conditions:
1. Progress-No undetected deadlocks:
The algorithm must detect all existing deadlocks in finite time. In other words, after all
wait-for dependencies for a deadlock have formed, the algorithm should not wait for any
more events to occur to detect the deadlock.
2. Safety -No false deadlocks:
The algorithm should not report deadlocks that do not exist. These are also called
phantom or false deadlocks.
In the AND model, if a cycle is detected in the WFG, it implies a deadlock, but not vice
versa. That is, even if a process is not part of a cycle, it can still be deadlocked.
3.7.3 OR Model
A process can make a request for numerous resources simultaneously and the request
is satisfied if any one of the requested resources is granted.
Presence of a cycle in the WFG of an OR model does not imply a deadlock
in the OR model.
In the OR model, the presence of a knot indicates a deadlock.
With every blocked process, there is an associated set of processes called dependent
set.
A process shall move from an idle to an active state on receiving a grant message
from any of the processes in its dependent set.
A process is permanently blocked if it never receives a grant message from any of the
processes in its dependent set.
A set of processes S is deadlocked if all the processes in S are permanently blocked.
In short, a process is deadlocked or permanently blocked if the following conditions
are met:
1. Each of the processes in the set S is blocked.
2. The dependent set of each process in S is a subset of S.
3. No grant message is in transit between any two processes in set S.
A blocked process P in the set S becomes active only after receiving a grant message
from a process in its dependent set, which is a subset of S.
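A snapshot-style check of these three conditions can be sketched directly; the process data below is illustrative:

```python
def is_deadlocked(S, blocked, dependent, grants_in_transit):
    """Check the three OR-model conditions for set S on a snapshot:
    all of S blocked, dependent sets contained in S, and no grant
    message in transit between two members of S."""
    return (
        all(p in blocked for p in S)                       # 1. all blocked
        and all(dependent[p] <= S for p in S)              # 2. dep sets within S
        and not any(src in S and dst in S                  # 3. no grant in transit
                    for src, dst in grants_in_transit)
    )

blocked = {1, 2, 3}
dependent = {1: {2}, 2: {3}, 3: {1}}
print(is_deadlocked({1, 2, 3}, blocked, dependent, grants_in_transit=set()))  # True
```

A single grant message in transit inside S is enough to break the deadlock claim, which is exactly why condition 3 is needed.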
This allows a request to obtain any k available resources from a pool of n resources.
Both models have the same expressive power.
This favours a more compact formulation of a request.
Every request in this model can be expressed in the AND-OR model and vice versa.
Note that an AND request for p resources can be stated as a (p/p) request and an OR
request for p resources as a (1/p) request.
3.8.2 Edge Chasing Algorithms
The presence of a cycle in a distributed graph structure can be verified by propagating
special messages called probes along the edges of the graph.
These probe messages are different from the request and reply messages.
The formation of a cycle can be detected by a site if it receives the matching probe
that it sent previously.
Whenever a process that is executing receives a probe message, it discards this
message and continues.
Only blocked processes propagate probe messages along their outgoing edges.
The main advantage of edge-chasing algorithms is that probes are fixed-size messages,
which are normally very short.
Examples: Chandy et al., Choudhary et al., Kshemkalyani–Singhal, and Sinha–Natarajan
algorithms.
Therefore, distributed deadlocks can be detected by taking a snapshot of the system and
examining it for the condition of a deadlock.
Probes are sent in the opposite direction to the edges of the WFG.
When a probe initiated by a process comes back to it, the process declares deadlock.
Features:
1. Only one process in a cycle detects the deadlock. This simplifies the deadlock
resolution – this process can abort itself to resolve the deadlock. This algorithm can
be improvised by including priorities, and the lowest priority process in a cycle
detects deadlock and aborts.
2. In this algorithm, a process that is detected in deadlock is aborted spontaneously, even
though under this assumption phantom deadlocks cannot be excluded. It can be
shown, however, that only genuine deadlocks will be detected in the absence of
spontaneous aborts.
Each node of the WFG has two local variables, called labels:
1. a private label, which is unique to the node at all times, though it is not constant.
2. a public label, which can be read by other processes and which may not be unique.
Each process is represented as u/v, where u and v are the public and private labels,
respectively. Initially, private and public labels are equal for each process. A global
WFG is maintained and it defines the entire state of the system.
The algorithm is defined by the four state transitions shown in Fig. 3.10, where
z = inc(u, v), and inc(u, v) yields a unique label greater than both u and v. Labels
that are not shown do not change.
The transitions defined by the algorithm are block, activate, transmit, and detect.
Block creates an edge in the WFG.
Two messages are needed: one resource request, and one message back to the blocked
process to inform it of the public label of the process it is waiting for.
Activate denotes that a process has acquired the resource from the process it was
waiting for.
Transmit propagates larger labels in the opposite direction of the edges by sending a
probe message.
Detect means that the probe with the private label of some process has returned to it,
indicating a deadlock.
This algorithm can easily be extended to include priorities, so that whenever a
deadlock occurs, the lowest priority process gets aborted.
This priority-based algorithm has two phases:
1. The first phase is almost identical to the basic algorithm.
2. In the second phase, the smallest priority is propagated around the cycle. The
propagation stops when one process recognizes the propagated priority as its
own.
Message Complexity:
If we assume that a deadlock persists long enough to be detected, the worst-case complexity
of the algorithm is s(s - 1)/2 Transmit steps, where s is the number of processes in the cycle.
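The label rules above can be exercised in a small sequential simulation; this is a sketch assuming the single-resource model, with an illustrative inc() that draws fresh labels from a counter. In this run, process 3, the last to block, detects the deadlock:

```python
labels = {p: [p, p] for p in (1, 2, 3)}   # [public, private] per process
counter = 3                                # labels 1..3 already in use

def inc(u, v):
    """Return a fresh label strictly greater than both u and v."""
    global counter
    counter = max(counter, u, v) + 1
    return counter

waits_for = {}

def block(p, q):
    """Block step: p blocks on q and takes a fresh largest label."""
    z = inc(labels[p][0], labels[q][0])
    labels[p] = [z, z]
    waits_for[p] = q

for p, q in [(1, 2), (2, 3), (3, 1)]:      # requests form a cycle
    block(p, q)

detector = None
while detector is None:                    # apply Transmit until Detect
    for p, q in waits_for.items():
        if labels[q][0] == labels[p][1]:   # Detect: own private label returned
            detector = p
            break
        if labels[q][0] > labels[p][0]:    # Transmit: adopt the larger public
            labels[p][0] = labels[q][0]    # label, against the edge direction
print("deadlock detected by process", detector)
```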
Data structures
Each process Pi maintains a boolean array dependenti, where dependenti(j) is true only
if Pi knows that Pj is dependent on it. Initially, dependenti(j) is false for all i and j.
Performance analysis
In the algorithm, one probe message is sent on every edge of the WFG which
connects processes on two sites.
The algorithm exchanges at most m(n − 1)/2 messages to detect a deadlock that
involves m processes and spans over n sites.
The size of messages is fixed and is very small (only three integer words).
The delay in detecting a deadlock is O(n).
Advantages:
It is easy to implement.
Each probe message is of fixed length.
There is very little computation.
There is very little overhead.
There is no need to construct a graph, nor to pass graph information to other sites.
This algorithm does not find false (phantom) deadlock.
There is no need for special data structures.
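The probe-forwarding scheme can be sketched as a sequential simulation; the probe(i, j, k) shape follows the text, while the wait-for data and the detect() helper are illustrative assumptions (real sites exchange probes asynchronously):

```python
def detect(initiator, waits_for, blocked):
    """Initiator Pi sends probe(i, i, k) to each Pk it waits on.
    Blocked processes forward probes along their outgoing edges;
    active processes discard them. A probe whose destination is the
    initiator itself means the matching probe came back: deadlock."""
    pending = [(initiator, initiator, k) for k in waits_for.get(initiator, [])]
    seen = set()
    while pending:
        i, j, k = pending.pop()
        if k not in blocked or (i, j, k) in seen:
            continue                      # discarded or duplicate probe
        seen.add((i, j, k))
        if k == i:
            return True                   # probe returned to its sender
        pending.extend((i, k, m) for m in waits_for.get(k, []))
    return False

wfg = {1: [2], 2: [3], 3: [1]}
print(detect(1, wfg, blocked={1, 2, 3}))            # True: 1 -> 2 -> 3 -> 1
print(detect(1, {1: [2], 2: [3]}, blocked={1, 2}))  # False: P3 is active
```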
All query and reply messages are of the form query(i, j, k) and reply(i, j, k),
denoting that they belong to a diffusion computation initiated by process Pi and are
being sent from process Pj to process Pk.
A blocked process initiates deadlock detection by sending query messages to all
processes in its dependent set.
If an active process receives a query or reply message, it discards it.
When a blocked process Pk receives a query(i, j, k) message, it takes the following
actions:
1. If this is the first query message received by Pk for the deadlock detection
initiated by Pi, then it propagates the query to all the processes in its dependent
set and sets a local variable numk (i) to the number of query messages sent.
2. If this is not the engaging query, then Pk returns a reply message to it
immediately provided Pk has been continuously blocked since it received the
corresponding engaging query. Otherwise, it discards the query.
Process Pk maintains a boolean variable waitk(i) that denotes the fact that it
has been continuously blocked since it received the last engaging query from
process Pi.
When a blocked process Pk receives a reply(i, j, k) message, it decrements
numk(i) only if waitk(i) holds.
A process sends a reply message in response to an engaging query only after it
has received a reply to every query message it has sent out for this engaging
query.
The initiator process detects a deadlock when it has received reply messages to
all the query messages it has sent out.
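The query/reply bookkeeping above can be sketched as a sequential simulation; message delivery is modeled as direct calls, every blocked process is assumed continuously blocked (so the waitk(i) test is implicit), and the dependent sets are illustrative:

```python
dependents = {1: {2}, 2: {3}, 3: {1}}     # who each blocked process waits on

def initiate(i, blocked):
    """Pi starts a diffusion computation; returns True on deadlock."""
    st = {"parent": {}, "num": {}, "deadlock": False}
    st["num"][i] = len(dependents[i])
    for m in dependents[i]:
        on_query(i, i, m, st, blocked)
    return st["deadlock"]

def on_query(i, j, k, st, blocked):
    """Pk receives query(i, j, k) from Pj."""
    if k not in blocked:
        return                            # active processes discard queries
    if k not in st["num"] and k != i:     # engaging query: propagate it
        st["parent"][k] = j
        st["num"][k] = len(dependents[k])
        for m in dependents[k]:
            on_query(i, k, m, st, blocked)
    else:                                 # non-engaging: reply at once
        on_reply(i, k, j, st, blocked)    # (assumes continuous blocking)

def on_reply(i, j, k, st, blocked):
    """Pk receives reply(i, j, k) from Pj."""
    if k not in blocked:
        return                            # active processes discard replies
    st["num"][k] -= 1
    if st["num"][k] == 0:
        if k == i:
            st["deadlock"] = True         # replies to all queries received
        else:
            on_reply(i, k, st["parent"][k], st, blocked)

print(initiate(1, blocked={1, 2, 3}))     # True: the cycle is deadlocked
print(initiate(1, blocked={1, 2}))        # False: P3 is active
```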
Performance analysis
For every deadlock detection, the algorithm exchanges e query messages and e reply
messages, where e = n(n – 1) is the number of edges.