CS3551-DISTRIBUTED COMPUTING – UNIT 3
3.2. PRELIMINARIES
We describe here:
1. the system model,
2. the requirements that mutual exclusion algorithms should satisfy, and
3. the metrics we use to measure the performance of mutual exclusion algorithms.
1. SYSTEM MODEL
The system consists of N sites, S1, S2, ..., SN. We assume that a single process is
running on each site.
The process at site Si is denoted by pi.
A process wishing to enter the CS requests all other processes, or a subset of them, by
sending REQUEST messages, and waits for appropriate replies before entering the CS.
While waiting, the process is not allowed to make further requests to enter the CS.
A site can be in one of the following three states:
1. Requesting the Critical Section.
2. Executing the Critical Section.
3. Neither requesting nor executing the CS (i.e., idle).
In the ‘requesting the CS’ state, the site is blocked and cannot make further requests
for the CS. In the ‘idle’ state, the site is executing outside the CS.
In token-based algorithms, a site can also be in a state where it holds the token but is
executing outside the CS. Such a state is referred to as the idle token state.
At any instant, a site may have several pending requests for the CS. A site queues up these
requests and serves them one at a time.
2. REQUIREMENTS OF MUTUAL EXCLUSION ALGORITHMS
A mutual exclusion algorithm should satisfy the following properties:
Safety property: At any instant, only one process can execute the critical section. This is an essential property of a mutual exclusion algorithm.
Liveness property: A site must not wait indefinitely to execute the CS while other sites are repeatedly executing it; i.e., the algorithm should be free from deadlock and starvation.
Fairness: Each process gets a fair chance to execute the CS; fairness generally means that CS execution requests are executed in the order of their arrival in the system (the order is determined by logical timestamps).
3. PERFORMANCE METRICS
The performance of mutual exclusion algorithms is generally measured by the following metrics:
a. Message complexity: the number of messages required per CS execution by a site.
b. Synchronization delay: after a site leaves the CS, the time required before the next site enters the CS (see Figure 9.1).
c. Response time: It is the time interval a request waits for its CS execution to be over after
its request messages have been sent out (see Figure 9.2).
d. System throughput: the rate at which the system executes requests for the CS. If SD is the synchronization delay and E is the average critical section execution time, then system throughput = 1/(SD + E).
3.3. LAMPORT’S ALGORITHM
Lamport’s distributed mutual exclusion algorithm requires the communication channels to deliver messages in FIFO order. Every site Si keeps a queue, request_queuei, which contains mutual exclusion requests ordered by their timestamps.
ALGORITHM
1. Requesting the critical section:
• When a site Si wants to enter the CS, it broadcasts a REQUEST(tsi, i) message to all other sites and places the request on request_queuei ((tsi, i) denotes the timestamp of the request).
• When a site Sj receives the REQUEST(tsi, i) message from site Si, it places site Si’s request on request_queuej and returns a timestamped REPLY message to Si.
2. Executing the critical section:
Site Si enters the CS when the following two conditions hold:
L1: Si has received a message with timestamp larger than (tsi, i) from all other sites.
L2: Si’s request is at the top of request_queuei.
3. Releasing the critical section:
Site Si, upon exiting the CS, removes its request from the top of its request queue and
broadcasts a timestamped RELEASE message to all other sites.
When a site Sj receives a RELEASE message from site Si, it removes Si’s request
from its request queue. When a site removes a request from its request queue, its own
request may come at the top of the queue, enabling it to enter the CS. Clearly, when a
site receives a REQUEST, REPLY or RELEASE message, it updates its clock using
the timestamp in the message.
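To make the three steps concrete, the following is a minimal, illustrative Python sketch of one site’s handlers. It assumes a reliable FIFO network layer exposing a send(dest, message) callback; the class and all names (LamportSite, request_cs, and so on) are hypothetical and not part of these notes.

import heapq

class LamportSite:
    def __init__(self, site_id, all_sites, send):
        self.id = site_id
        self.others = [s for s in all_sites if s != site_id]
        self.send = send              # hypothetical network callback
        self.clock = 0                # Lamport logical clock
        self.queue = []               # request_queue: heap of (ts, site_id)
        self.last_ts = {}             # latest timestamp seen from each site
        self.my_req = None            # own pending request, if any

    def _tick(self, received_ts=0):
        # Lamport clock rule: advance past any timestamp we have seen.
        self.clock = max(self.clock, received_ts) + 1

    def request_cs(self):
        self._tick()
        self.my_req = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_req)
        for s in self.others:                      # broadcast REQUEST(tsi, i)
            self.send(s, ("REQUEST", self.my_req[0], self.id))

    def on_message(self, kind, ts, sender):
        self._tick(ts)
        self.last_ts[sender] = ts
        if kind == "REQUEST":                      # queue it, REPLY at once
            heapq.heappush(self.queue, (ts, sender))
            self.send(sender, ("REPLY", self.clock, self.id))
        elif kind == "RELEASE":                    # drop sender's request
            self.queue = [r for r in self.queue if r[1] != sender]
            heapq.heapify(self.queue)

    def can_enter_cs(self):
        # L1: a message with larger timestamp received from every other site.
        l1 = all((self.last_ts.get(s, 0), s) > self.my_req for s in self.others)
        # L2: own request is at the top of the request queue.
        return l1 and bool(self.queue) and self.queue[0] == self.my_req

    def release_cs(self):
        self.queue = [r for r in self.queue if r[1] != self.id]
        heapq.heapify(self.queue)
        self.my_req = None
        self._tick()
        for s in self.others:                      # broadcast RELEASE
            self.send(s, ("RELEASE", self.clock, self.id))

Representing requests as (timestamp, site_id) pairs makes the total order of requests a plain tuple comparison, with site IDs breaking ties.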
Correctness
Theorem 1: Lamport’s algorithm achieves mutual exclusion.
Proof: The proof is by contradiction.
Suppose two sites Si and Sj are executing the CS concurrently. For this to happen
conditions L1 and L2 must hold at both the sites concurrently.
This implies that at some instant in time, say t, both Si and Sj have their own requests
at the top of their request_queues and condition L1 holds at both of them. Without loss of
generality, assume that Si’s request has a smaller timestamp than the request of Sj.
From condition L1 and the FIFO property of the communication channels, it is clear that at
instant t the request of Si must be present in request_queuej when Sj was executing its
CS.
This implies that Sj’s own request is at the top of its own request_queue when a smaller
timestamp request, Si’s request, is present in request_queuej – a contradiction!
Hence, Lamport’s algorithm achieves mutual exclusion.
An Example
Figures 9.3 to 9.6 illustrate the operation of Lamport’s algorithm. In Figure 9.6, site S2 has received a REPLY from all other sites and has also received a RELEASE message from site S1. Site S2 updates its request_queue and its request is now at the top of its request_queue. Consequently, it enters the CS next.
Performance
For each CS invocation, the algorithm requires:
(N − 1) REQUEST messages,
(N − 1) REPLY messages, and
(N − 1) RELEASE messages,
i.e., a total of 3(N − 1) messages per CS invocation. The synchronization delay Sd is equal to the maximum message transmission time T.
Drawbacks of Lamport’s Algorithm:
Unreliable approach: the failure of any one of the processes will halt the progress of the
entire system.
High message complexity: the algorithm requires 3(N − 1) messages per critical section
invocation.
Optimization: the algorithm can be optimized to require only 2(N − 1) messages per CS execution by omitting the REPLY message in some situations, e.g., when site Sj receives a REQUEST message from site Si after it has sent its own REQUEST message with a timestamp higher than that of site Si’s request; Si can infer the needed ordering from Sj’s REQUEST, so Sj need not send a REPLY.
3.4. RICART-AGRAWALA ALGORITHM
The Ricart-Agrawala algorithm assumes the communication channels are FIFO.
The algorithm uses two types of messages: REQUEST and REPLY.
A process sends a REQUEST message to all other processes to request their permission
to enter the critical section.
A process sends a REPLY message to a process to give its permission to that process.
Processes use Lamport-style logical clocks to assign a timestamp to critical section
requests. Timestamps are used to decide the priority of requests in case of conflict: if
a process pi that is waiting to execute the critical section receives a REQUEST
message from process pj, then if the priority of pj’s request is lower, pi defers the
REPLY to pj and sends a REPLY message to pj only after executing the CS for its
pending request.
Otherwise, pi sends a REPLY message to pj immediately, provided it is currently not
executing the CS. Thus, if several processes are requesting execution of the CS, the
highest priority request succeeds in collecting all the needed REPLY messages and gets
to execute the CS.
Each process pi maintains the Request-Deferred array, RDi, the size of which is the
same as the number of processes in the system. Initially, ∀i ∀j: RDi[j] = 0. Whenever pi defers
the request sent by pj, it sets RDi[j] = 1, and after it has sent a REPLY message to pj, it sets
RDi[j] = 0. (Deferred means the request is postponed, i.e., kept waiting.)
ALGORITHM
1. Requesting the critical section:
(a) When a site Si wants to enter the CS, it broadcasts a time stamped REQUEST message to
all other sites.
(b) When site Sj receives a REQUEST message from site Si, it sends a REPLY message
to Site Si
• if site Sj is neither requesting nor executing the CS, or
• if site Sj is requesting and Si’s request’s timestamp is smaller than site
Sj’s own request’s timestamp.
• Otherwise, the reply is deferred and Sj sets RDj[i] = 1.
2. Executing the critical section:
(c) Site Si enters the CS after it has received a REPLY message from every site it
sent a REQUEST message to.
3. Releasing the critical section:
(d) When site Si exits the CS, it sends all the deferred REPLY messages: ∀j:
if RDi[j] = 1, then send a REPLY message to Sj and set RDi[j] = 0.
When a site receives a message, it updates its clock using the timestamp in the message. Also,
when a site takes up a request for the CS for processing, it updates its local clock and assigns a
timestamp to the request. In this algorithm, a site’s REPLY messages are blocked only by
sites that are requesting the CS with higher priority (i.e., a smaller timestamp). Thus, when a
site sends out deferred REPLY messages, the site with the next highest priority request receives
the last needed REPLY message and enters the CS. Execution of the CS requests in this
algorithm is always in the order of their timestamps. A sketch of these message handlers follows.
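The handlers below are a minimal, illustrative Python sketch of these rules, under the same assumptions as the Lamport sketch above (a hypothetical send(dest, message) callback; all names are illustrative, not from the source).

class RicartAgrawalaSite:
    def __init__(self, site_id, all_sites, send):
        self.id = site_id
        self.others = [s for s in all_sites if s != site_id]
        self.send = send                       # hypothetical network callback
        self.clock = 0
        self.my_req = None                     # (timestamp, site_id) while requesting
        self.in_cs = False
        self.replies_pending = set()
        self.RD = {s: 0 for s in self.others}  # Request-Deferred array RDi

    def request_cs(self):
        self.clock += 1
        self.my_req = (self.clock, self.id)
        self.replies_pending = set(self.others)
        for s in self.others:
            self.send(s, ("REQUEST", self.my_req[0], self.id))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        # Defer iff executing the CS, or requesting with higher priority
        # (smaller (timestamp, id) pair) than the incoming request.
        if self.in_cs or (self.my_req is not None and self.my_req < (ts, sender)):
            self.RD[sender] = 1
        else:
            self.send(sender, ("REPLY", self.clock, self.id))

    def on_reply(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.replies_pending.discard(sender)
        if self.my_req is not None and not self.replies_pending:
            self.in_cs = True                  # REPLY received from every site

    def release_cs(self):
        self.in_cs = False
        self.my_req = None
        self.clock += 1
        for s, deferred in self.RD.items():    # send all deferred REPLY messages
            if deferred:
                self.RD[s] = 0
                self.send(s, ("REPLY", self.clock, self.id))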
An Example
Figures 9.7 to 9.10 illustrate the operation of the Ricart-Agrawala algorithm. In Figure 9.7,
sites S1 and S2 are making requests for the CS and send out REQUEST messages to other
sites. The timestamps of the requests are (2, 1) and (1, 2), respectively. In Figure 9.8, S2 has
received REPLY messages from all other sites and consequently, it enters the CS. In Figure
9.9, S2 exits the CS and sends a REPLY message to site S1. In Figure 9.10, site S1 has
received REPLY from all other sites and enters the CS next.
Performance
Message Complexity
For each CS execution, the Ricart-Agrawala algorithm requires
(N − 1) REQUEST messages and
(N − 1) REPLY messages.
Thus, it requires 2(N − 1) messages per CS execution. The synchronization
delay in the algorithm is T, the maximum message transmission time.
3.5 TOKEN BASED ALGORITHMS
A unique token (also known as the PRIVILEGE message) is shared among the
sites.
A site is allowed to enter its CS if it possesses the token and it continues to hold the
token until the execution of the CS is over.
Mutual exclusion is ensured because the token is unique.
Example: Suzuki-Kasami’s Broadcast Algorithm
R[1...n] – request array of size n maintained at each site Si; each index corresponds to
a site of the system, and Ri[j] records the largest sequence number received so far in a
REQUEST message from site Sj.
T[1...n] – token array of size n, which records the number of times each particular site
has executed the CS with the token.
Q – token request queue, which holds the site IDs of simultaneous requests from
different sites.
Messages:
REQUEST(j, n) – a broadcast message from site Sj indicating that it is making its nth request for the CS.
TOKEN(T, Q) – the token (PRIVILEGE) message, which carries the token array T and the token request queue Q.
Two design issues must be efficiently addressed:
1. How to distinguish an outdated REQUEST message from a current REQUEST message:
Due to variable message delays, a site may receive a token request message after the corresponding request has been satisfied. If a site cannot determine whether the request corresponding to a token request has been satisfied, it may dispatch the token to a site that does not need it. When a site Si receives a REQUEST(j, n) message, it sets Ri[j] = max(Ri[j], n); the REQUEST is outdated if Ri[j] > n.
2. How to determine which site has an outstanding request for the CS:
After a site has finished executing the CS, it must determine which sites have an
outstanding request for the CS so that the token can be dispatched to one of them.
At site Si, if Ri[j] = T[j] + 1, then site Sj is currently requesting the token. The site
appends the IDs of all such sites to the token queue Q (if they are not already in Q) and
finally sends the token to the site whose ID is at the head of Q. A sketch of these handlers follows.
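The following is a minimal, illustrative Python sketch of these rules using the notes’ names R, T, and Q. The send(dest, message) callback and the class name are assumptions for illustration; token hand-off simply ships T and Q inside the TOKEN message.

from collections import deque

class SuzukiKasamiSite:
    def __init__(self, site_id, all_sites, send, has_token=False):
        self.id = site_id
        self.sites = list(all_sites)
        self.send = send                            # hypothetical network callback
        self.R = {s: 0 for s in self.sites}         # R[j]: largest seq no seen from Sj
        # token = (T, Q) while held; T counts completed CS executions per site.
        self.token = ({s: 0 for s in self.sites}, deque()) if has_token else None
        self.in_cs = False

    def request_cs(self):
        if self.token is not None:                  # idle token held: enter at once
            self.in_cs = True
            return
        self.R[self.id] += 1                        # this is our R[id]-th request
        for s in self.sites:
            if s != self.id:
                self.send(s, ("REQUEST", self.id, self.R[self.id]))

    def on_request(self, j, n):
        self.R[j] = max(self.R[j], n)               # outdated requests change nothing
        if self.token is not None and not self.in_cs:
            T, Q = self.token
            if self.R[j] == T[j] + 1:               # Sj has an outstanding request
                self.token = None
                self.send(j, ("TOKEN", T, Q))

    def on_token(self, T, Q):
        self.token = (T, Q)
        self.in_cs = True

    def release_cs(self):
        self.in_cs = False
        T, Q = self.token
        T[self.id] = self.R[self.id]                # record our completed execution
        for s in self.sites:                        # append new outstanding requests
            if s not in Q and self.R[s] == T[s] + 1:
                Q.append(s)
        if Q:                                       # hand token to the head of Q
            nxt = Q.popleft()
            self.token = None
            self.send(nxt, ("TOKEN", T, Q))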
Correctness
Mutual exclusion is guaranteed because there is only one token in the system and a site
holds the token during the CS execution.
Theorem: A requesting site enters the CS in finite time.
Proof: Token request messages of a site Si reach other sites in finite time.
Since one of these sites will have the token in finite time, site Si’s request will be placed in the
token queue in finite time.
Since there can be at most N − 1 requests in front of this request in the token queue, site Si
will get the token and execute the CS in finite time.
Message Complexity:
The algorithm requires no messages if the site already holds the idle token at the
time of its critical section request; otherwise, it requires a maximum of N messages per
critical section execution. These N messages comprise
(N – 1) REQUEST messages and
1 message to transfer the token (the reply).
Performance:
The synchronization delay is 0 and no messages are needed if the site holds the idle token at
the time of its request. If the site does not hold the idle token, the maximum
synchronization delay is equal to the maximum message transmission time, and a maximum
of N messages is required per critical section invocation.
Example:
Initial State:
S1 wants to enter the critical section and broadcasts the token request REQUEST(1, 1):
On Receiving S1 Request:
Granting token:
Granting Token:
3.8. PRELIMINARIES
Deadlock Handling Strategies
There are three strategies for handling deadlocks:
1. Deadlock Prevention,
2. Deadlock Avoidance,
3. Deadlock Detection.
Handling of deadlocks becomes highly complicated in distributed systems because no
site has accurate knowledge of the current state of the system and because every
inter-site communication involves a finite and unpredictable delay.
Deadlock prevention
It is commonly achieved either by having a process acquire all the needed resources
simultaneously before it begins executing or by pre-empting a process which holds the
needed resource.
This approach is highly inefficient and impractical in distributed systems.
Deadlock avoidance
A resource is granted to a process if the resulting global system state is safe (note that a
global state includes all the processes and resources of the distributed system).
However, due to several problems, deadlock avoidance is impractical in distributed
systems.
Deadlock Detection
In the deadlock detection approach, deadlocks are allowed to occur; the system detects them and then resolves them.
Issues in Deadlock Detection
Deadlock handling using the approach of deadlock detection entails addressing two basic issues:
1. Detection of existing deadlocks
2. Resolution of detected deadlocks.
1. Detection of Existing Deadlocks
Detection of deadlocks involves addressing two issues:
• maintenance of the wait-for graph (WFG) and
• searching of the WFG for the presence of cycles (or knots).
Since in distributed systems, a cycle or knot may involve several sites, the search for
cycles greatly depends upon how the WFG of the system is represented across the
system.
Depending upon the way WFG information is maintained and search for cycles is
carried out, there are centralized, distributed, and hierarchical algorithms for deadlock
detection in distributed systems.
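To make “searching the WFG for cycles” concrete, the following is a toy, fully centralized Python illustration: the WFG is a dictionary mapping each process to the set of processes it waits for (AND model), and a depth-first search looks for a back edge. Real distributed detection algorithms never assemble the whole graph at one site; this sketch only shows the property being tested.

def wfg_has_cycle(wfg):
    """wfg maps each process to the set of processes it waits for."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wfg}

    def dfs(p):
        color[p] = GREY
        for q in wfg.get(p, ()):
            c = color.get(q, WHITE)
            if c == GREY:                 # back edge: a cycle, i.e., a deadlock
                return True
            if c == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and dfs(p) for p in wfg)

# P1 waits for P2, P2 for P3, P3 for P1: a deadlocked cycle.
print(wfg_has_cycle({"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}))  # True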
Correctness Criteria:
A deadlock detection algorithm must satisfy the following two conditions:
1. Progress (no undetected deadlocks):
The algorithm must detect all existing deadlocks in finite time. In other words, after
all wait-for dependencies for a deadlock have formed, the algorithm should not wait for
any more events to occur to detect the deadlock.
2. Safety (no false deadlocks):
The algorithm should not report deadlocks that do not exist. Such reported deadlocks are
called phantom or false deadlocks.
2. Resolution of a Detected Deadlock
Deadlock resolution involves breaking existing wait-for dependencies between the
processes to resolve the deadlock.
CS3551-DISTRIBUTED COMPUTING – UNIT 3
It involves rolling back one or more deadlocked processes and assigning their resources
to blocked processes so that they can resume execution.
Note that several deadlock detection algorithms propagate information regarding wait-
for dependencies along the edges of the wait-for graph.
Therefore, when a wait-for dependency is broken, the corresponding information
should be immediately cleaned from the system.
If this information is not cleaned in a timely manner, it may result in the detection of
phantom deadlocks.
Untimely and inappropriate cleaning of broken wait-for dependencies is the main
reason why many deadlock detection algorithms reported in the literature are incorrect.
3.9. MODELS OF DEADLOCKS
The models of deadlocks are explained based on their hierarchy. Distributed systems
allow many kinds of resource requests. A process might require a single resource or a
combination of resources for its execution.
Note that AND requests for p resources can be stated as a1 ∧ a2 ∧ ... ∧ ap, and OR requests for
p resources can be stated as a1 ∨ a2 ∨ ... ∨ ap.
Unrestricted model
No assumptions are made regarding the underlying structure of resource requests.
In this model, the only assumption made is that the deadlock is stable; hence it
is the most general model.
This model helps separate concerns: concerns about properties of the problem
(stability and deadlock) are separated from concerns about the underlying distributed
system computations (e.g., message passing versus synchronous communication).
3.10. CHANDY-MISRA-HAAS ALGORITHM FOR THE AND MODEL AND OR
MODEL.
KNAPP’S CLASSIFICATION OF DISTRIBUTED DEADLOCK DETECTION
ALGORITHMS
Distributed deadlock detection algorithms can be divided into four classes: path-pushing, edge-chasing, diffusion computation, and global state detection. The Chandy-Misra-Haas algorithm for the AND model is an edge-chasing algorithm; the Chandy-Misra-Haas algorithm for the OR model is based on diffusion computation.
Chandy-Misra-Haas Algorithm for the AND Model
The algorithm uses a special message called a probe, which is a triplet (i, j, k) denoting that it belongs to a deadlock detection initiated for process Pi and is being sent by the home site of process Pj to the home site of process Pk. A probe message travels along the edges of the global WFG, and process Pi declares a deadlock when it receives probe (i, j, i), i.e., when a probe it initiated returns to it.
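A minimal, illustrative Python sketch of the probe handling described above; the send callback, the dependent_set structure, and the seen set (which plays the role of an “already propagated for this initiator” flag) are assumptions for illustration, not names from the source.

class CmhAndProcess:
    def __init__(self, pid, send):
        self.pid = pid
        self.send = send               # hypothetical network callback
        self.blocked = False
        self.dependent_set = set()     # processes this process waits for
        self.seen = set()              # initiators whose probe we already forwarded

    def initiate_detection(self):
        # A blocked process Pi starts detection by probing its dependents.
        for k in self.dependent_set:
            self.send(k, ("PROBE", self.pid, self.pid, k))

    def on_probe(self, i, j, k):
        if i == self.pid:
            # The probe travelled around a cycle back to its initiator.
            print(f"P{self.pid}: deadlock detected")
        elif self.blocked and i not in self.seen:
            self.seen.add(i)           # forward each initiator's probe only once
            for m in self.dependent_set:
                self.send(m, ("PROBE", i, self.pid, m))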
Performance analysis
In the algorithm, one probe message is sent on every edge of the WFG
which connects processes on two sites.
The algorithm exchanges at most m(n − 1)/2 messages to detect a deadlock
that involves m processes and spans over n sites.
The size of messages is fixed and is very small (only three integer words).
The delay in detecting a deadlock is O(n).
Advantages:
It is easy to implement.
Each probe message is of fixed length.
There is very little computation.
There is very little overhead.
There is no need to construct a graph, nor to pass graph information to other sites.
This algorithm does not find false (phantom) deadlocks.
There is no need for special data structures.
Chandy-Misra-Haas Algorithm for the OR Model
A blocked process determines whether it is deadlocked by initiating a diffusion computation. Two types of messages are used: query(i, j, k) and reply(i, j, k), denoting that they belong to a diffusion computation initiated by process Pi and are being sent from process Pj to process Pk. A blocked process initiates deadlock detection by sending queries to all processes in its dependent set. A process Pk handles these messages as follows:
1. If this is the first query message received by Pk for the deadlock detection initiated by Pi
(called the engaging query), then it propagates the query to all the processes in its dependent set
and sets a local variable numk(i) to the number of query messages sent.
2. If this is not the engaging query, then Pk returns a reply message to it immediately, provided
Pk has been continuously blocked since it received the corresponding engaging query.
Otherwise, it discards the query.
3. Process Pk maintains a boolean variable waitk(i) that denotes the fact that it has been
continuously blocked since it received the last engaging query from process Pi. When a
blocked process Pk receives a reply(i, j, k) message, it decrements numk(i) only if waitk(i)
holds.
4. A process sends a reply message in response to an engaging query only after it has received
a reply to every query message it had sent out for this engaging query.
5. The initiator process detects a deadlock when it receives reply messages to all the query
messages it had sent out. A sketch of this diffusion computation follows.
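Below is a minimal, illustrative Python sketch of steps 1–5; num and wait mirror the variables numk(i) and waitk(i), while send, dependent_set, and engager are illustrative assumptions. Unblocking, which would reset wait, is omitted.

class CmhOrProcess:
    def __init__(self, pid, send):
        self.pid = pid
        self.send = send               # hypothetical network callback
        self.blocked = False
        self.dependent_set = set()
        self.num = {}                  # num[i] ~ numk(i): outstanding queries
        self.wait = {}                 # wait[i] ~ waitk(i): blocked since engaging query
        self.engager = {}              # engager[i]: sender of the engaging query

    def initiate_detection(self):
        self.wait[self.pid] = True
        self.num[self.pid] = len(self.dependent_set)
        for m in self.dependent_set:
            self.send(m, ("QUERY", self.pid, self.pid, m))

    def on_query(self, i, j, k):
        if not self.blocked:
            return                     # an active process discards queries
        if i not in self.wait:         # engaging query: propagate (step 1)
            self.engager[i] = j
            self.wait[i] = True
            self.num[i] = len(self.dependent_set)
            for m in self.dependent_set:
                self.send(m, ("QUERY", i, self.pid, m))
        elif self.wait[i]:             # non-engaging query: reply at once (step 2)
            self.send(j, ("REPLY", i, self.pid, j))

    def on_reply(self, i, j, k):
        if self.wait.get(i):           # decrement only while waitk(i) holds (step 3)
            self.num[i] -= 1
            if self.num[i] == 0:       # all replies received (steps 4 and 5)
                if i == self.pid:
                    print(f"P{self.pid}: deadlock detected")
                else:
                    e = self.engager[i]
                    self.send(e, ("REPLY", i, self.pid, e))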
Performance Analysis
For every deadlock detection, the algorithm exchanges e query messages and e reply
messages, where e = n(n − 1) is the number of edges.