MODULE 3 Synchronization

Session 2021-2022 CS-4010

Distributed Computing

Harish Tiwari
SIR PADAMPAT SINGHANIA UNIVERSITY
CS-4010 DISTRIBUTED COMPUTING

Contents

Chapter 3 Synchronization in Distributed System
3.1 Introduction
3.2 Clock Synchronization
3.3 Physical Clock
3.4 Clock Synchronization Algorithms
3.4.1 Cristian’s Algorithm
3.4.2 The Berkeley Algorithm
3.4.3 Network Time Protocol (NTP)
3.5 Election Algorithm
3.6 Mutual Exclusion
3.6.1 Overview
3.6.2 A Centralized Algorithm
3.6.3 Decentralized Algorithm
3.6.4 A Token Ring Algorithm


Chapter 3 Synchronization in Distributed System


3.1 Introduction
In the previous chapters, we have looked at processes and communication between processes. The methods used include layered protocols, request/reply message passing (including RPC), and group communication. While communication is important, it is not the entire story. Closely related is how processes cooperate and synchronize with one another.

In single CPU systems, critical regions, mutual exclusion, and other synchronization
problems are solved using methods such as semaphores. These methods will not
work in distributed systems because they implicitly rely on the existence of shared
memory.

Synchronization in distributed systems is often much more difficult than synchronization in uniprocessor or multiprocessor systems. The problems and solutions discussed in this chapter are, by their nature, rather general, and occur in many different situations in distributed systems.

Communication between processes in a distributed system can have unpredictable delays, processes can fail, and messages may be lost. Synchronization in distributed systems is therefore harder than in centralized systems and requires distributed algorithms.

In this chapter, we mainly concentrate on how processes can synchronize. For example, it is important that multiple processes do not simultaneously access a shared resource, such as a printer, but instead cooperate in granting each other temporary exclusive access.

Another example is that multiple processes may sometimes need to agree on the
ordering of events, such as whether message m1 from process P was sent before or
after message m2 from process Q.

3.2 Clock Synchronization


In a centralized system, time is unambiguous. When a process wants to know the time, it makes a system call, and the kernel tells it. If process A asks for the time and then a little later process B asks for the time, the value that B gets will be higher than (or possibly equal to) the value A got. It will certainly not be lower.

Just think, for a moment, about the implications of the lack of global time on the
UNIX make program, as a single example. Normally, in UNIX, large programs are
split up into multiple source files, so that a change to one source file only requires
one file to be recompiled, not all the files. If a program consists of 100 files, not
having to recompile everything because one file has been changed greatly increases
the speed at which programmers can work.

The way make normally works is simple. When the programmer has finished changing all the source files, he runs make, which examines the times at which all the source and object files were last modified. If the source file input.c has time 2151 and the corresponding object file input.o has time 2150, make knows that input.c has been changed since input.o was created, and thus input.c must be recompiled. On the other hand, if output.c has time 2144 and output.o has time 2145, no compilation is needed. Thus make goes through all the source files to find out which ones need to be recompiled and calls the compiler to recompile them.

Now imagine what could happen in a distributed system in which there were no
global agreement on time. Suppose that output.o has time 2144 as above, and
shortly thereafter output.c is modified but is assigned time 2143 because the clock
on its machine is slightly behind, as shown in Fig. 1. Make will not call the compiler.
The resulting executable binary program will then contain a mixture of object files
from the old sources and the new sources. It will probably crash and the programmer
will go crazy trying to understand what is wrong with the code.


Figure 1 When each machine has its own clock, an event that occurred after another event may nevertheless be
assigned an earlier time.


Clock synchronization can be achieved in two ways: external and internal clock synchronization.

• External clock synchronization uses an external reference clock. It serves as a reference, and the nodes in the system set and adjust their time accordingly.
• Internal clock synchronization is the one in which each node shares its time with the other nodes, and all the nodes set and adjust their times accordingly.

There are two types of clock synchronization algorithms: centralized and distributed.

• Centralized algorithms use a time server as a reference. The single time server propagates its time to the nodes, and all the nodes adjust their time accordingly. Because they depend on a single time server, if that node fails the whole system loses synchronization. Examples of centralized algorithms are the Berkeley algorithm, the passive time server, and the active time server.
• Distributed algorithms have no centralized time server. Instead, the nodes adjust their time using their local time and the average of the differences in time with the other nodes. Distributed algorithms overcome the issues of centralized algorithms, such as scalability and the single point of failure. Examples of distributed algorithms are the global averaging algorithm, the localized averaging algorithm, and NTP (Network Time Protocol).


3.3 Physical Clock


Nearly all computers have a circuit for keeping track of time. Despite the use of the
word "clock" to refer to these devices, they are not actually clocks in the usual sense.
Timer is perhaps a better word.

A computer timer is usually a precisely machined quartz crystal. When kept under
tension, quartz crystals oscillate at a well-defined frequency that depends on the kind
of crystal, how it is cut, and the amount of tension. Associated with each crystal are
two registers, a counter and a holding register. Each oscillation of the crystal
decrements the counter by one. When the counter gets to zero, an interrupt is
generated, and the counter is reloaded from the holding register. In this way, it is
possible to program a timer to generate an interrupt 60 times a second, or at any
other desired frequency. Each interrupt is called one clock tick.

When the system is booted, it usually asks the user to enter the date and time, which
is then converted to the number of ticks after some known starting date and stored in
memory. Most computers have a special battery-backed up CMOS RAM so that the
date and time need not be entered on subsequent boots. At every clock tick, the
interrupt service procedure adds one to the time stored in memory. In this way, the
(software) clock is kept up to date.

With a single computer and a single clock, it does not matter much if this clock is off by a small amount. Since all processes on the machine use the same clock, they will still be internally consistent.

As soon as multiple CPUs are introduced, each with its own clock, the situation
changes radically. Although the frequency at which a crystal oscillator runs is usually
fairly stable, it is impossible to guarantee that the crystals in different computers all
run at exactly the same frequency. In practice, when a system has n computers, all n
crystals will run at slightly different rates, causing the (software) clocks gradually to
get out of synch and give different values when read out. This difference in time
values is called clock skew.


As a consequence of this clock skew, programs that expect the time associated with
a file, object, process, or message to be correct and independent of the machine on
which it was generated (i.e., which clock it used) can fail.

Clock skew problem

To avoid the clock skew problem, two types of clocks are used:

• Logical clocks: to provide consistent event ordering


• Physical clocks : clocks whose values must not deviate from the real time by
more than a certain amount.

In some systems (e.g., real-time systems), the actual clock time is important. Under
these circumstances, external physical clocks are needed. For reasons of efficiency
and redundancy, multiple physical clocks are generally considered desirable, which
yields two problems:

• How do we synchronize them with real-world clocks?


• How do we synchronize the clocks with each other?

Sometimes we simply need the exact time, not just an ordering, and the solution is UTC (Universal Coordinated Time). UTC is based on the number of transitions per second of the cesium-133 atom (pretty accurate). At present, the real time is taken as the average of some 50 cesium clocks around the world, and a leap second is introduced from time to time to compensate for the fact that days are getting longer.

To provide UTC to people who need precise time, the National Institute of Standard
Time (NIST) operates a shortwave radio station with call letters WWV from Fort
Collins, Colorado. WWV broadcasts a short pulse at the start of each UTC second.
The accuracy of WWV itself is about ±1 msec, but due to random atmospheric
fluctuations that can affect the length of the signal path, in practice the accuracy is
no better than ±10 msec.

Assumption: a distributed system with a UTC receiver somewhere in it. The basic principle is as follows:

• Every machine has a timer that generates an interrupt H times per second.


• There is a clock in machine p that ticks on each timer interrupt. Denote the value of that clock by Cp(t), where t is UTC time.
• Ideally, we have that for each machine p, Cp(t) = t, or, in other words, dC/dt = 1.
• In practice: 1 - p <= dC/dt <= 1 + p.
• To guarantee that two clocks never differ by more than δ time units, they must be resynchronized at least every δ/(2p) seconds.
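The last bullet can be checked with a quick calculation. Here is a minimal sketch; the numeric values chosen for the drift rate p and the tolerance δ below are illustrative examples, not figures from the text.

```python
# Resynchronization interval for clocks with maximum drift rate p.
# Two clocks drifting in opposite directions can move apart at rate 2p,
# so to keep them within delta seconds of each other they must be
# resynchronized at least every delta / (2p) seconds.

def resync_interval(delta: float, p: float) -> float:
    """Maximum allowed time between synchronizations, in seconds."""
    return delta / (2 * p)

# Example: drift rate p = 10^-5 sec/sec, required accuracy delta = 1 ms:
print(resync_interval(delta=1e-3, p=1e-5))  # 50.0 (seconds)
```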

Figure 2 The relation between clock time and UTC when clocks tick at different rates

The constant p is specified by the manufacturer and is known as the maximum drift
rate. Note that the maximum drift rate specifies to what extent a clock's skew is
allowed to fluctuate. Slow, perfect, and fast clocks are shown in Fig-2.

3.4 Clock Synchronization Algorithms

The following algorithms provide clock synchronization:

• Cristian’s Algorithm
• The Berkeley Algorithm


• Network Time Protocol

3.4.1 Cristian’s Algorithm


Cristian suggested the use of a time server, connected to a device that receives signals
from a source of UTC, to synchronize computers externally. Round trip times between
processes are often reasonably short in practice, yet theoretically unbounded. The
practical estimation is possible if round-trip times are sufficiently short in comparison to
required accuracy.

Getting the current time from a time server: the basic principle of this algorithm is as follows:

• The UTC-synchronized time server S is used here.
• A process P sends a request to S and measures the round-trip time Tround.
• In a LAN, Tround should be around 1-10 ms.
• During this time, a clock with a 10⁻⁶ sec/sec drift rate varies by at most 10⁻⁸ sec.
• Hence the estimate of Tround is reasonably accurate.
• P then sets its clock to t + ½ Tround, where t is the time returned by S.

Figure 3 Cristian’s algorithm

Problems in this algorithm:

• Timer must never run backward.


• Variable delays in message passing / delivery occurs.
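The procedure above can be sketched in code. This is a minimal illustration, not the real protocol: request_server_time is a hypothetical stand-in for the message exchange with the time server S.

```python
import time

def cristian_sync(request_server_time):
    """One round of Cristian's algorithm (simplified sketch).

    `request_server_time` stands in for sending a request to the
    UTC time server S and receiving its clock value t in reply.
    """
    t0 = time.monotonic()
    t = request_server_time()      # server replies with its time t
    t1 = time.monotonic()
    t_round = t1 - t0              # measured round-trip time
    # Assume the reply spent half the round trip in transit, so the
    # server's clock now reads approximately t + Tround/2.
    return t + t_round / 2

# Example with a fake server whose clock reads exactly 100.0:
estimate = cristian_sync(lambda: 100.0)
print(estimate)   # slightly above 100.0 (half the measured round trip)
```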

3.4.2 The Berkeley Algorithm


In many algorithms such as NTP, the time server is passive. Other machines
periodically ask it for the time. All it does is respond to their queries.

In Berkeley UNIX, exactly the opposite approach is taken. In the Berkeley algorithm the time server (actually, a time daemon) is active, polling every machine from time to time to ask what time it is there. Based on the answers, it computes an average time and tells all the other machines to advance their clocks to the new time or slow their clocks down until some specified reduction has been achieved.

The Berkeley algorithm was developed to solve the problems of Cristian's algorithm. This algorithm does not need external synchronization. A master-slave approach is used here.

• The master polls the slaves periodically about their clock readings.
• An estimate of each local clock time is calculated using the round-trip time.
• The average value is obtained from the group of processes.
• This method cancels out individual clocks' tendencies to run fast or slow, and the master tells each slave process by what amount of time to adjust its local clock.
• In case of master failure, a master election algorithm is used.

This method is suitable for a system in which no machine has a WWV receiver. The
time daemon's time must be set manually by the operator periodically. The method is
illustrated in Fig.3.


In Fig. 3 (a), at 3:00, the time daemon tells the other machines its time and asks for
theirs. In Fig. 3(b), they respond with how far ahead or behind the time daemon they
are. Armed with these numbers, the time daemon computes the average and tells
each machine how to adjust its clock [see Fig. 3(c)].

In other words, the working of the algorithm can be summarized as,

• The time daemon asks all the other machines for their clock values.
• The machines answer the request.
• The time daemon tells everyone how to adjust their clock.
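The three steps can be sketched as follows. This is a simplified illustration: round-trip compensation is omitted, and the clock readings are hypothetical (they mirror the 3:00 / 3:25 / 2:50 example, in minutes).

```python
def berkeley_adjustments(daemon_time, slave_times):
    """One round of the Berkeley algorithm (simplified sketch).

    The time daemon averages its own clock reading with the readings
    reported by the slaves, then computes the adjustment each machine
    (daemon first, then the slaves in order) must apply to reach the
    average.
    """
    readings = [daemon_time] + list(slave_times)
    average = sum(readings) / len(readings)
    return [average - r for r in readings]

# Daemon reads 3:00 (180 min); the slaves report 3:25 and 2:50:
print(berkeley_adjustments(180, [205, 170]))  # [5.0, -20.0, 15.0]
```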

3.4.3 Network Time Protocol (NTP)


The Network Time Protocol defines an architecture for a time service and a protocol to distribute time information over the Internet.

Features of NTP:

• To provide a service enabling clients across the Internet to be synchronized accurately to UTC.
• To provide a reliable service that can survive lengthy losses of connectivity.
• To enable clients to resynchronize sufficiently frequently to offset the rates of drift found in most computers.
• To provide protection against interference with the time service, whether malicious or accidental.

The NTP service is provided by a network of servers located across the Internet. Primary servers are connected directly to a time source such as a radio clock receiving UTC. Secondary servers are synchronized with primary servers. The servers are connected in a logical hierarchy called a synchronization subnet, and its levels are called strata.

NTP uses a hierarchical, semi-layered system of time sources. Each level of this hierarchy is termed a stratum and is assigned a number starting with zero for the reference clock at the top. A server synchronized to a stratum n server runs at stratum n + 1. The number represents the distance from the reference clock and is used to prevent cyclical dependencies in the hierarchy.

Figure 4 NTP Configuration

In Figure 4 above, yellow arrows indicate a direct connection; red arrows indicate a network connection. A brief description of strata 0, 1, 2, and 3 is provided below.

• Stratum 0
These are high-precision timekeeping devices such as atomic clocks, GNSS (including GPS) or other radio clocks. They generate a very accurate pulse-per-second signal that triggers an interrupt and timestamp on a connected computer. Stratum 0 devices are also known as reference clocks. NTP servers cannot advertise themselves as stratum 0.
• Stratum 1
These are computers whose system time is synchronized to within a few microseconds of their attached stratum 0 devices. Stratum 1 servers may peer with other stratum 1 servers for sanity checking and backup. They are also referred to as primary time servers.


• Stratum 2
These are computers that are synchronized over a network to stratum 1 servers. Often a stratum 2 computer queries several stratum 1 servers. Stratum 2 computers may also peer with other stratum 2 computers to provide more stable and robust time for all devices in the peer group.
• Stratum 3
These are computers that are synchronized to stratum 2 servers. They employ the same algorithms for peering and data sampling as stratum 2, and can themselves act as servers for stratum 4 computers, and so on.

Working: NTP follows a layered client-server architecture, based on UDP message passing. Synchronization at clients with higher stratum numbers is less accurate due to the increased latency to the stratum 1 time server. If a stratum 1 server's time source fails, it may become a stratum 2 server that is synchronized through another stratum 1 server.
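A single NTP exchange records four timestamps: T1 (client sends the request), T2 (server receives it), T3 (server sends the reply), T4 (client receives it). From these the client computes its clock offset and the round-trip delay using the standard NTP formulas. The timestamps below are made-up values in milliseconds.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Clock offset and round-trip delay from one NTP exchange.

    t1: client sends request    (client clock)
    t2: server receives request (server clock)
    t3: server sends reply      (server clock)
    t4: client receives reply   (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far ahead the server clock is
    delay = (t4 - t1) - (t3 - t2)          # time actually spent on the network
    return offset, delay

# Client clock 5000 ms behind the server, ~20 ms each way (times in ms):
print(ntp_offset_delay(t1=100000, t2=105020, t3=105021, t4=100041))
# (5000.0, 40)
```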

3.5 Election Algorithm


Many algorithms require that some process act as a coordinator. How do we select this special process dynamically?

• In many systems the coordinator is chosen by hand (e.g., file servers). This leads to centralized solutions with a single point of failure.
• If a coordinator is chosen dynamically, to what extent can one speak of a centralized or distributed solution? Having a central coordinator does not necessarily make an algorithm non-distributed.

So, an algorithm for choosing a unique process to play a particular role is called an election algorithm.

In general, election algorithms attempt to locate the process with the highest process
number and designate it as coordinator. The algorithms differ in the way they do the
location.

We also assume that every process knows the process number of every other process. What the processes do not know is which ones are currently up and which ones are currently down. The goal of an election algorithm is to ensure that when an election starts, it concludes with all processes agreeing on who the new coordinator is to be. There are many algorithms and variations. Example election algorithms:

• The Bully Algorithm,


• A Ring Algorithm.

3.5.1 The Bully Algorithm

As a first example, consider the bully algorithm devised by Garcia-Molina (1982). When any process notices that the coordinator is no longer responding to requests, it initiates an election. Each process has an associated priority (weight). The process with the highest priority should always be elected as the coordinator.

In this algorithm, there are three types of messages:

• Election: This is sent to announce an election.
• OK: This is sent in response to an election message.
• Coordinator: This is sent to announce the identity of the elected process.

Procedure:

• A process P starts an election by sending an ELECTION message to all processes with higher numbers and awaits a response.
• If none arrives within time T, the process considers itself the coordinator and sends a COORDINATOR message to all processes with lower identifiers.
• If one of the higher-ups answers, it takes over. P's job is done.

At any moment, a process can get an ELECTION message from one of its lower-
numbered colleagues. When such a message arrives, the receiver sends an OK
message back to the sender to indicate that he is alive and will take over. The
receiver then holds an election, unless it is already holding one. Eventually, all
processes give up but one, and that one is the new coordinator. It announces its
victory by sending all processes a message telling them that starting immediately it is
the new coordinator.

If a process that was previously down comes back up, it holds an election. If it happens to be the highest-numbered process currently running, it will win the election and take over the coordinator's job. Thus the biggest guy in town always wins, hence the name "bully algorithm."

Figure 5 The bully election algorithm. (a) Process 4 holds an election. (b) Processes 5 and 6 respond, telling 4 to stop. (c) Now 5 and 6 each hold an election. (d) Process 6 tells 5 to stop. (e) Process 6 wins and tells everyone.
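Under the (strong) simplifying assumptions that messages arrive instantly and process liveness is known, the outcome of a bully election can be sketched like this:

```python
def bully_election(initiator, alive):
    """Outcome of a bully election (simplified sketch).

    `alive` is the set of process numbers currently up. The initiator
    sends ELECTION to every higher-numbered process; any responder
    takes over and holds its own election, so in the end the
    highest-numbered live process becomes coordinator.
    """
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator   # no higher process answered: initiator wins
    return max(higher)     # the highest responder eventually wins

# Processes 1..7; the old coordinator 7 has crashed; 4 notices and
# starts an election (the scenario of Figure 5):
print(bully_election(4, {1, 2, 3, 4, 5, 6}))  # 6
```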

3.5.2 The Ring Algorithm

Another election algorithm is based on the use of a ring. Unlike some ring algorithms, this one does not use a token. All the processes are arranged in a logical ring, and process priority is obtained from this ordering. We assume that the processes are physically or logically ordered, so that each process knows who its successor is.

Here too, the process with the highest priority should be elected as coordinator. When any process notices that the coordinator is not functioning, it can initiate the election.

HARISH TIWARI 14

Sir Padampat Singhania University


CS-4010 DISTRIBUTED COMPUTING

• Any process can begin an election by sending an ELECTION message containing its own process number to its successor.
• If a successor is down, the message is passed on to the next successor along the ring.
• As the message is passed on, each sender adds its own process number to the list in the message.
• Eventually the message gets back to the initiator, which recognizes this event when it receives an incoming message containing its own process number.
• The initiator then sends a COORDINATOR message around the ring containing a list of all living processes. The one with the highest priority is elected as coordinator.

Figure 6 Election algorithm using a ring
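The steps above can be sketched as a simulation: one pass of the ELECTION message around the ring collects the numbers of the live processes, and the highest collected number becomes coordinator. This is an illustrative sketch, not a message-passing implementation.

```python
def ring_election(ring, alive, initiator):
    """Coordinator chosen by one pass of a ring election (sketch).

    `ring` is the logical ordering of process numbers; dead successors
    are simply skipped. The ELECTION message accumulates the numbers
    of the live processes it visits; when it returns to the initiator,
    the highest number on the list becomes coordinator.
    """
    n = len(ring)
    members = []
    i = ring.index(initiator)
    while True:
        if ring[i] in alive:
            members.append(ring[i])   # live process adds its own number
        i = (i + 1) % n               # pass the message to the successor
        if ring[i] == initiator:      # message is back at the initiator
            break
    return max(members)

# Ring of processes 0..7; process 7 has crashed; 5 starts the election:
print(ring_election(list(range(8)), set(range(7)), 5))  # 6
```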

3.6 Mutual Exclusion


Fundamental to distributed systems is the concurrency and collaboration among multiple processes. In many cases, this also means that processes will need to simultaneously access the same resources. To prevent such concurrent accesses from corrupting the resource or making it inconsistent, solutions are needed to grant processes mutually exclusive access. In this section, we take a look at some of the more important distributed algorithms that have been proposed.

Mutual exclusion makes sure that concurrent processes access shared resources or data in a serialized way. If a process, say Pi, is executing in its critical section, then no other process can be executing in its critical section. Distributed mutual exclusion thus provides critical regions in a distributed environment.

3.6.1 Overview
Distributed mutual exclusion algorithms can be classified into two different categories: token-based solutions and permission-based solutions.

Token based Solution

In token-based solutions mutual exclusion is achieved by passing a special message between the processes, known as a token. There is only one token available, and whoever has that token is allowed to access the shared resource. When finished, the token is passed on to the next process. If a process holding the token is not interested in accessing the resource, it simply passes it on.

Token-based solutions have a few important properties.

• Avoiding starvation: First, depending on how the processes are organized, they can fairly easily ensure that every process will get a chance at accessing the resource. In other words, they avoid starvation.
• Avoiding deadlock: Second, deadlocks, in which several processes are waiting for each other to proceed, can easily be avoided, contributing to their simplicity.

Unfortunately, the main drawback of token-based solutions is a rather serious one:

• When the token is lost (e.g., because the process holding it crashed), a complex distributed procedure needs to be started to ensure that a new token is created, but above all, that it is also the only token.

Permission based Solution

As an alternative, many distributed mutual exclusion algorithms follow a permission-based approach. In this case, a process wanting to access the resource first requires the permission of other processes. There are many ways of granting such permission, and in the sections that follow we will consider a few of them.


3.6.2 A Centralized Algorithm


This employs the simplest way to grant permission to enter the critical section: using a server (coordinator). The most straightforward way to achieve mutual exclusion in a distributed system is to simulate how it is done in a one-processor system.

One process is elected as the coordinator. Whenever a process wants to access a shared resource, it sends a request message to the coordinator stating which resource it wants to access and asking for permission. If no other process is currently accessing that resource, the coordinator sends back a reply granting permission, as shown in Fig. 7(a).

Figure 7 Centralised Algorithm

Now suppose that another process, 2 in Fig. 7(b), asks for permission to access the
resource. The coordinator knows that a different process is already at the resource,
so it cannot grant permission. The exact method used to deny permission is system
dependent. In Fig. 7(b), the coordinator just refrains from replying, thus blocking
process 2, which is waiting for a reply. Alternatively, it could send a reply saying
"permission denied." Either way, it queues the request from 2 for the time being and
waits for more messages.

When process 1 is finished with the resource, it sends a message to the coordinator
releasing its exclusive access, as shown in Fig.7(c). The coordinator takes the first
item off the queue of deferred requests and sends that process a grant message. If
the process was still blocked (i.e., this is the first message to it), it unblocks and
accesses the resource.


It is easy to see that the algorithm guarantees mutual exclusion: the coordinator only lets one process at a time access the resource.

• It is also fair, since requests are granted in the order in which they are received.
• No process ever waits forever (no starvation).
• The scheme is easy to implement, too, and requires only three messages per use of a resource (request, grant, release).
• Its simplicity makes it an attractive solution for many practical situations.

The centralized approach also has disadvantages.

• The coordinator is a single point of failure, so if it crashes, the entire system


may go down. If processes normally block after making a request, they cannot
distinguish a dead coordinator from "permission denied" since in both cases
no message comes back.
• In addition, in a large system, a single coordinator can become a performance
bottleneck.
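The request/grant/release protocol can be sketched as a small simulation of the coordinator's bookkeeping. This is an illustrative sketch: the blocking of requesters is only modeled (by recording which grants have been sent), not implemented with real messages.

```python
from collections import deque

class Coordinator:
    """Centralized mutual-exclusion coordinator (simplified sketch).

    Three message types are modeled: request, grant, release.
    Requests arriving while the resource is busy are queued and
    granted in FIFO order, which makes the scheme fair.
    """
    def __init__(self):
        self.holder = None
        self.queue = deque()
        self.grants = []              # record of grant messages sent

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            self.grants.append(pid)   # grant permission immediately
        else:
            self.queue.append(pid)    # defer; the requester blocks

    def release(self, pid):
        assert pid == self.holder, "only the holder may release"
        self.holder = self.queue.popleft() if self.queue else None
        if self.holder is not None:
            self.grants.append(self.holder)  # grant the next in line

# Process 1 gets in at once; 2's request is queued until 1 releases:
c = Coordinator()
c.request(1); c.request(2); c.release(1)
print(c.grants)  # [1, 2]
```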

3.6.3 Decentralized Algorithm


Many researchers have looked for deterministic distributed mutual exclusion algorithms; Ricart and Agrawala (1981) made one of the more efficient ones. In this section we will describe their method.

Ricart and Agrawala's algorithm requires that there be a total ordering of all events in the system. That is, for any pair of events, such as messages, it must be unambiguous which one happened first.

The algorithm works as follows:

• When a process wants to access a shared resource, it builds a message containing the name of the resource, its process number, and the current (logical) time. It then sends the message to all other processes, conceptually including itself. The sending of messages is assumed to be reliable; that is, no message is lost.


• When a process receives a request message from another process, the action it takes depends on its own state with respect to the resource named in the message. Three different cases have to be clearly distinguished:
o If the receiver is not accessing the resource and does not want to access it, it sends back an OK message to the sender.
o If the receiver already has access to the resource, it simply does not reply. Instead, it queues the request.
o If the receiver wants to access the resource as well but has not yet done so, it compares the timestamp of the incoming message with the one contained in the message that it has sent everyone. The lowest one wins. If the incoming message has a lower timestamp, the receiver sends back an OK message. If its own message has a lower timestamp, the receiver queues the incoming request and sends nothing.

After sending out requests asking permission, a process sits back and waits until
everyone else has given permission. As soon as all the permissions are in, it may go
ahead. When it is finished, it sends OK messages to all processes on its queue and
deletes them all from the queue.

Figure 8 (a) Two processes want to access a shared resource at the same moment. (b) Process 0 has the lowest timestamp, so it wins. (c) When process 0 is done, it sends an OK, so process 2 can now access the resource.

Let us try to understand why the algorithm works. If there is no conflict, it clearly
works. However, suppose that two processes try to simultaneously access the
resource, as shown in Fig. 8(a).


Process 0 sends everyone a request with timestamp 8, while at the same time,
process 2 sends everyone a request with timestamp 12. Process 1 is not interested
in the resource, so it sends OK to both senders. Processes 0 and 2 both see the
conflict and compare timestamps. Process 2 sees that it has lost, so it grants
permission to 0 by sending OK. Process 0 now queues the request from 2 for later
processing and accesses the resource, as shown in Fig. 8(b). When it is finished, it
removes the request from 2 from its queue and sends an OK message to process 2,
allowing the latter to go ahead, as shown in Fig. 8(c). The algorithm works because
in the case of a conflict, the lowest timestamp wins and everyone agrees on the
ordering of the timestamps.

The number of messages required per entry is now 2(n - 1), where n is the total
number of processes in the system. Best of all, no single point of failure exists.
Unfortunately, however, the single point of failure has been replaced by n points of
failure: if any process crashes, it will fail to respond to requests, and its silence will
be incorrectly interpreted as a denial of permission.

3.6.4 A Token Ring Algorithm


The simplest way to arrange mutual exclusion between N processes without
requiring an additional process is to arrange them in a logical ring.

Figure 9 (a) An unordered group of processes on a network (b) A logical ring constructed in software

Here we have a bus network, as shown in Fig. 9(a), (e.g., Ethernet), with no inherent
ordering of the processes. In software, a logical ring is constructed in which each
process is assigned a position in the ring, as shown in Fig. 9(b). The ring
positions may be allocated in numerical order of network addresses or some other


means. It does not matter what the ordering is. All that matters is that each process
knows who is next in line after itself.

• When the ring is initialized, process 0 is given a token. The token circulates
around the ring. It is passed from process k to process k +1 (modulo the ring
size) in point-to-point messages.
• When a process acquires the token from its neighbor, it checks to see if it
needs to access the shared resource. If so, the process goes ahead, does all
the work it needs to, and releases the resource. After it has finished, it
passes the token along the ring. It is not permitted to immediately enter the
resource again using the same token.
• If a process is handed the token by its neighbor and is not interested in the
resource, it just passes the token along. As a consequence, when no
processes need the resource, the token just circulates at high speed around
the ring.
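The circulation rule above can be illustrated with a minimal single-process simulation. The function name and its parameters are hypothetical, and a real implementation passes the token in point-to-point messages between machines rather than in a loop:

```python
def token_ring_round(ring, wants, holder=0):
    """Pass the token once around the ring, starting at `holder`.

    `ring` is the list of process ids in ring order; `wants` is the set of
    processes that currently want the resource. Returns the order in which
    processes entered their critical section during this circulation."""
    entered = []
    n = len(ring)
    for step in range(n):
        pid = ring[(holder + step) % n]
        if pid in wants:
            entered.append(pid)   # holds the token, uses the resource once...
        # ...then passes the token to the next process (k -> k+1 mod n)
    return entered

# Example: processes 2 and 5 want the resource on a 6-process ring.
print(token_ring_round(list(range(6)), {5, 2}))   # -> [2, 5]
```

Note how the result depends only on ring position, not on request time: process 2 enters before process 5 simply because the token reaches it first.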

Token-based solutions have a few important properties.

• Avoiding starvation: First, depending on how the processes are organized,
they can fairly easily ensure that every process gets a chance to access the
resource. In other words, they avoid starvation.
• Avoiding deadlock: Second, deadlocks, in which several processes wait for
each other to proceed, can easily be avoided, contributing to their simplicity.

Unfortunately, the main drawback of token-based solutions is a rather serious one.

• When the token is lost (e.g., because the process holding it crashed), a
complex distributed procedure needs to be started to ensure that a new token
is created and, above all, that it is the only token.
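The text does not specify the recovery procedure. One simplified, commonly described approach is a timeout plus a fixed regeneration rule; everything below — the timeout value, the highest-id convention, the function name — is an assumption for illustration, and note that a timeout cannot perfectly distinguish a lost token from a merely slow one:

```python
TOKEN_TIMEOUT = 5.0  # assumed: seconds without the token before suspecting loss

def should_regenerate(my_pid, all_pids, last_seen, now):
    """Return True if this process should create the replacement token.

    Only the highest-id process regenerates, so at most one new token
    appears (assuming all processes agree on the membership list)."""
    if now - last_seen < TOKEN_TIMEOUT:
        return False                  # token presumed still circulating
    return my_pid == max(all_pids)    # a single, agreed-upon regenerator
```

Picking the regenerator by highest id is essentially a degenerate election; a full solution would use one of the election algorithms discussed earlier in this chapter.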

