A Study of Token Based Algorithms For Distributed Mutual Exclusion
Abstract
The selection of a 'good' mutual exclusion algorithm is of great importance in the design of distributed systems. A number of mutual exclusion algorithms, employing different techniques and exhibiting varying performance characteristics, are available in the literature. These algorithms can be broadly classified into token based and non-token based algorithms. Several survey papers on non-token based mutual exclusion algorithms exist, and although some of them also discuss token based algorithms, none covers the newer variants of classic mutual exclusion, such as k-mutual exclusion and group mutual exclusion. This paper presents an exhaustive survey of token based mutual exclusion algorithms. The variants of the mutual exclusion problem, namely k-mutual exclusion and group mutual exclusion, are also covered.
Keywords: Distributed mutual exclusion; Token; Algorithm; k-mutual exclusion; Group mutual exclusion
message traffic in comparison to non-token based algorithms. Because of the existence of a unique token in the system, token based algorithms are deadlock free. However, their resiliency to failure is poor: if the process holding the token fails, or the token is lost in transit, a complex process of token regeneration and recovery has to be started.

…from the queue and the privilege message is sent to that site. If the token queue is empty, site i holds the token until request messages arrive. The algorithm requires at most N messages per CS invocation and its synchronization delay is one-way trip communication time (T). This is an improvement over Ricart-Agrawala's algorithm [13], which has a message complexity of 2(N − 1) and a synchronization delay equal to two-way trip communication time (2T). The drawback of Suzuki-Kasami's algorithm is that its sequence numbers are not bounded. To remove this drawback, a modified algorithm was presented by Suzuki-Kasami. However, in the modified algorithm the number of messages increases to L * N + (N − 1) for L mutual exclusion invocations by a single node, i.e. N + (N − 1)/L messages per invocation.

…the value of (j − i) mod N is minimum among those sites having pending requests. Nishio et al.'s algorithm does not require a queue to store pending requests; the relationship between the array of sequence numbers at a site and the array of sequence numbers in the token is used to determine whether a site has a pending request. The additional data structures maintained by the algorithm are site-age[N], stored at each site, and token-age, stored in the token. When a site i suspects token loss, it executes the algorithm regenerate, which determines a candidate value propose_age_i for the token age and sends a token-missing message to all sites j (≠ i). Site j checks whether propose_age_i > site-age[j]; if so, site-age[j] is updated and an ACK message is sent to i, otherwise a NACK message is sent to i. The site proposing the highest age regenerates the token on receiving positive responses from all other sites. A time-out mechanism is used to detect communication link failures or site failures: a requesting site waits for the token up to a specified time, after which the failure recovery algorithm is initiated. A two-phase recovery scheme has been suggested to recover from situations in which the information stored at the sites, or the token itself, may be lost due to some catastrophic failure. The algorithm is able to handle a variety of network failures and ensures mutual exclusion as long as there is one operational site in the network.

In the above algorithms the request set of every site is static and its cardinality is always N − 1. M. Singhal [36] used a heuristic approach in his algorithm and showed that the number of messages required per CS invocation can be reduced by dynamically changing the request set. The algorithm makes use of the state information of the sites in changing the request set. A site may be in any one of the following states: R – Requesting, N – Not Requesting, E – Executing, H – Holding. Sequence numbers are used to differentiate between current and old requests. Each site maintains a state vector (SV) and a sequence number vector (SN). The token maintains two vectors, TSV and TSN, which store the state information of the sites and the sequence number of the last served request of each site, respectively. The token and the request messages are used to disseminate state information among the sites.
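Singhal's heuristic request-set selection and the reconciliation of site and token vectors described above can be sketched as follows. This is a minimal illustration under assumed data layouts (Python lists indexed by site id), not the paper's own pseudocode.

```python
# Sketch of Singhal's heuristic: a requesting site contacts only the sites
# whose recorded state is R (Requesting), since one of them is likely to
# hold, or soon receive, the token. Names here are invented for illustration.

R, N, E, H = "R", "N", "E", "H"  # Requesting, Not requesting, Executing, Holding

def heuristic_request_set(my_id, state_vector):
    """Return the ids a requesting site should contact: every other site
    whose state-vector entry is R."""
    return {j for j, s in enumerate(state_vector) if j != my_id and s == R}

def merge_state(site_sv, site_sn, token_tsv, token_tsn):
    """On exiting the CS, keep whichever side has the fresher information
    per site, judged by the higher sequence number, and overwrite the
    outdated side."""
    for j in range(len(site_sv)):
        if token_tsn[j] > site_sn[j]:      # token knows something newer
            site_sn[j], site_sv[j] = token_tsn[j], token_tsv[j]
        elif site_sn[j] > token_tsn[j]:    # site knows something newer
            token_tsn[j], token_tsv[j] = site_sn[j], site_sv[j]
    return site_sv, token_tsv
```

For example, a site 0 whose state vector reads [H, R, N, R] would send requests only to sites 1 and 3.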
Initially site 1 holds the token and SV1[1] = H. Each site S_i (1 ≤ i ≤ n) assumes that sites S_n, S_{n−1}, …, S_{i+1} are not requesting the token and that sites S_{i−1}, …, S_1 are requesting the token, one of which holds it. When a site wants to enter the CS, it uses a heuristic to guess a set of sites which are likely to have the token and sends its request message only to those sites. The heuristic used in the algorithm is: "all the sites whose state vector entries are R are added to the request set". When a site exits the CS, it updates its state vector and the token vectors. For this purpose the state vector at the site is compared with the token state vector to determine which has more current information about the state of the sites; the outdated information is replaced with the more current information. An arbitration rule is used to determine which of the many requesting sites should get the token next, and the fairness of the algorithm depends upon this rule. The following two arbitration rules were suggested in the paper:
(i) Grant the token to a requesting site with the lowest sequence number.
(ii) Grant the token to the requesting site nearest to the current site.
The simulation study shows that under light load the algorithm performs better than Suzuki-Kasami's algorithm as far as the number of messages per CS is concerned; under heavy load the performance of both algorithms is comparable. Under light load the mean delay in granting the CS equals one round trip time (2T); under heavy load, after every T + E units of time a site gets the token and executes its CS.

Singhal's algorithm uses fewer messages than Suzuki-Kasami's algorithm under light load conditions, but its fairness depends upon the arbitration rule used in selecting the next site to which the token is sent. The space requirement of Singhal's algorithm is also quite large, because an array containing state information has to be stored at each site; the token, too, contains an array storing information about the state of the sites.

Chang et al. [55] presented another token based algorithm in which the request set of a site changes dynamically. The algorithm does not use state information; therefore, the arrays for storing state information are not required. In this algorithm the token maintains a FIFO queue of waiting requests. Each site maintains an array of sequence numbers and a request set R, and a requesting site sends its request message only to the sites in its request set. Initially all other sites are in the request set of a site, except the site X, which is randomly chosen to hold the token and whose request set is empty. When a site exits the CS, it checks whether new requests have arrived and adds them to the token queue; it then removes the site at the head of the token queue and sends the token to that site. When a site gets the token, its request set is emptied and all sites in the token queue are added to its request set. Simulation studies show that only 0.6N messages/CS are required under light load; under heavy load the algorithm shows the same performance as Suzuki-Kasami's algorithm.

3.1.3 Other Algorithms
Mizuno et al. [32] used a data structure called quorum agreements [5], similar to a coterie, to present a variant of Suzuki-Kasami's algorithm [16]. Let QA = (Q, Q⁻¹) be the quorum agreement used by the algorithm. Each node i has a request set Ri (Ri ∈ Q), an acquired set Ai (Ai ∈ Q⁻¹), and an array of sequence numbers RNi. The token carries an array of the sequence numbers of the last served request of each site and a queue of pending requests. Initial request messages are issued by i to the nodes in Ri only. …acquired message to all nodes in its acquired set Ai. The privilege message contains a copy of the array LN, which contains the sequence numbers of the last served requests of each node. When k receives an acquired message from j and RN[k] > LN[k], it sends back a request message to node j; otherwise node k waits until it receives new requests. The number of messages used by node i for one CS entry are …

…token is used to allow entry into the CS. Each processor dynamically updates the latest known location of the token, and a requesting process sends out only a request message to chase the token along its latest known location. A request message dynamically changes its path based on the local information of the intermediate nodes. The message complexity of the …

Saxena et al. [42] proposed a token based algorithm for arbitrary network topologies using a broadcast based approach. A site can be in any one of the following states: R – Requesting, N – Not Requesting, E – Executing, H – Holding. In this algorithm the token serves the requests of the sites which fall on the route to its destination. Although the algorithm does not satisfy requests in strictly FCFS order, under heavy load conditions this …
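Returning to Chang et al.'s algorithm, its dynamic request-set rule can be sketched as follows. The class and function names are invented for illustration; only the request-set and token-queue bookkeeping described above is modeled.

```python
from collections import deque

# Sketch of the dynamic request set in Chang et al.'s algorithm: initially a
# site's request set contains all other sites (empty at the token holder X);
# on receiving the token, the request set shrinks to the sites still queued.

class Site:
    def __init__(self, sid, n, token_holder):
        self.sid = sid
        self.request_set = set() if sid == token_holder else \
            {j for j in range(n) if j != sid}

def exit_cs(token_queue, new_requests):
    """On exiting the CS: append newly arrived requests, then hand the
    token to the site at the head of the FIFO token queue (if any)."""
    token_queue.extend(new_requests)
    return token_queue.popleft() if token_queue else None

def receive_token(site, token_queue):
    """On receiving the token, the site's request set becomes exactly the
    set of sites still waiting in the token queue."""
    site.request_set = set(token_queue)
```

Under light load the queue is short, so a site contacts few peers, which is consistent with the 0.6N messages/CS figure reported above.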
…the request of a process arrives at the tail; if the tail is waiting for the token, the requesting process is linked to the tail. If the tail has the token and is in its CS, the requesting process is likewise linked to the tail; otherwise the token is transferred to the requesting process. When a process leaves the CS, it gives the token to the next process in the queue; if no such next process exists, it keeps holding the token. The algorithm does not use logical clocks or sequence numbers to serialize concurrent events, all its variables are bounded, and only O(log n) messages per CS request are used.

Julien Sopena et al. [21] proposed a fault tolerant extension of the Naimi-Trehel algorithm [35]. The algorithm tries to reconstruct the next queue by gathering the intact portions of the previous next queue which existed just before the failure. To maintain information about predecessors, whenever a site updates its next variable, it sends a commit message to the requester S_i. …commit message, it sends a message to its predecessors, from closest to farthest; S_i stops when it receives an answer from one of its predecessor sites. …If S_i does not receive any answer within a specified time, it regenerates the token and initializes its position to zero in the queue. However, if S_i has not received a commit message, an election algorithm has to be executed and only the elected site is allowed to continue the recovery process. Only one commit message per CS request is added in the failure-free case, hence the message complexity remains O(log n). In case of failure the algorithm requires fewer messages and less time than Naimi-Trehel's fault tolerant algorithm [33]. It can tolerate at most N − 1 site failures, and since the next queue is rebuilt from the previous queue, the original ordering of requests is preserved to an extent.

Bertier et al. [30] proposed a hierarchical token based mutual exclusion algorithm based on the Naimi-Trehel algorithm [35], which takes into account the latency gap between local and remote clusters. …When the proxy is aware that a remote site j holds the token, it redirects i's request towards j, thus avoiding messages to remote clusters.
(ii) Aggregation – when a request has to be sent to the probable token holder belonging to a remote cluster, the request is not sent immediately but is stored in a queue at the last node of the cluster that will enter the critical section, called the "Local Root".
(iii) Token Preemption – a high priority is given to requests originating from the local cluster, in order to exploit locality. A threshold is defined to avoid starvation: whenever the number of local requests is below this threshold, the request path is modified so that local requests are served first.
A variant of Naimi-Trehel's algorithm [35] using the above techniques was presented by Bertier et al., and it was observed that these techniques are quite useful in reducing inter-cluster messages.

…does not require a separate election or recovery protocol to recover a lost token; instead, token recovery is integrated into the protocol itself. The concept of logical time is used to detect and recover from token loss. The algorithm can handle message loss, site failures and network partitioning.

Self-stabilization is the most fundamental concept of automatic recovery from transient faults in distributed systems. Jun Kiniwa [19] presented a self-stabilizing token passing algorithm in which a token is passed via a dynamic BFS tree rooted at a requesting process. No queues are used in this algorithm and every variable is bounded. The stabilization time of the algorithm is 1 + 5 * D rounds, where D is the diameter of the network, and its k-covering time is k * n rounds, where n is the number of processors in the system.
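Several algorithms in this subsection (the O(log n) queue based scheme and the Naimi-Trehel extensions) rely on a distributed waiting queue maintained through per-process next pointers. The following is a minimal, single-address-space sketch of that queue discipline; the class and function names are invented for illustration.

```python
# Sketch of a Naimi-Trehel-style distributed waiting queue: each process
# remembers only a 'next' pointer, requests are linked at the tail, and the
# token moves along the chain when a process leaves its CS.

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.next = None        # successor in the distributed queue
        self.has_token = False
        self.in_cs = False

def request(tail, requester):
    """The request reaches the current tail. If the tail is waiting or in
    its CS, the requester is linked behind it; an idle token holder hands
    the token over directly. Returns the new tail."""
    if tail.has_token and not tail.in_cs:
        tail.has_token, requester.has_token = False, True
    else:
        tail.next = requester
    return requester

def release(p):
    """On leaving the CS, pass the token to 'next' if it exists; otherwise
    keep holding the token."""
    p.in_cs = False
    if p.next is not None:
        p.has_token, p.next.has_token = False, True
        p.next = None
```

Note that no logical clocks or sequence numbers appear here, matching the bounded-variable property claimed for the queue based scheme above.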
…the head of the queue, and the rest of the queue is stored in a local queue after-me. On exiting the CS, if the node's after-me queue is non-empty, the node sends a token-grant message directly to the first node in it. If the length of the active token queue is less than a predefined length (the warning length), the master sends a forward-token message, containing a list of the nodes in the passive token queue, to the node at the tail of the active token queue; the master then moves the nodes of its passive token queue into its active token queue. The master labels each forward-token and token-grant message with a batch number, so that previous requests may be distinguished from current requests. The batch number is passed along with the token, and each client remembers the highest batch number it has seen. The light load synchronization delay of the algorithm is 2T and the number of messages/CS under light load is 3; the heavy load synchronization delay is T. If a batch size of b > warning length is assumed, then 2b + 2 messages are required for the whole batch, hence the number of messages/CS request is 2 + 2/b.

Wu-Shu [40] presented another centralized token based algorithm in which one node is designated as the coordinator (n_m); initially the token is held by n_m. Any node that wants to enter its CS sends its request to n_m. …sequence number is seq-1. The forward message carries information about the next requesting node, thus forming a distributed waiting queue. When a node exits the CS it checks whether a forward message has arrived; if none has, the node returns the token to n_m by sending a release message, otherwise the token is sent to the next node in the distributed queue. The heavy load synchronization delay of the algorithm is T. Wu-Shu argued that in centralized algorithms only the coordinator node has to do some extra work, whereas in decentralized algorithms every node is swamped with extra work and communication traffic.

4. k-Mutual Exclusion
Raymond [26] introduced the k-mutual exclusion problem as a variant of the mutual exclusion problem in which at most k processes (1 < k ≤ n) may be in the critical section at a time. For example, there may be k copies of a software license, so that it can be used by only k users at a time. Raymond also presented a non-token based algorithm for k-mutual exclusion, based on Ricart-Agrawala's algorithm [13], which requires 2(n − 1) messages per CS invocation.

Srimani-Reddy [44] presented an extension of Suzuki-Kasami's algorithm for the k-mutual exclusion problem, in which k tokens are circulated. Because of the existence of k tokens in the system, up to k
processes may be inside their critical sections simultaneously. If a node owns a token, it may enter the critical section directly; otherwise it sends a request message to all other N − 1 nodes and waits for a token. The upper bound on the message complexity of this algorithm is N + K − 1 per CS entry.

M. Naimi [34] proposed a directed graph based algorithm for the k-mutual exclusion problem. Initially a logical spanning tree is defined over an arbitrary network and k tokens are given to the root; a waiting queue is maintained at each node. A node requesting the CS sends a request to its predecessor and puts itself in its waiting queue; the requesting node then becomes the root and waits for a token. On exiting the CS, a node sends the token to the first node in its waiting queue. On receiving a request from its neighbor Y, if a free token is available at X, it is sent directly to Y; otherwise node X puts Y in its waiting queue and transmits Y's request to its predecessors other than Y. Node Y then becomes a predecessor of node X, and the directed graph is transformed into another directed graph. The number of messages required is between 0 and 2 * (n − 1) per critical section entry.

K. Makki et al. [23] used a general semaphore and a token queue carried with the token in their algorithm. One site is chosen as the 'good site'. The good site places all the token requests it receives in its local queue. When the good site eventually receives the token, it executes its CS and then appends its local queue to the token queue. A new good site is then chosen, and an update message about the new selection is sent to all the sites which are not currently in the token queue. A general semaphore, which is part of the token, indicates the number of critical sections that are available. Each site that receives the token checks the value of this semaphore: if it is non-zero, the value is decremented by 1 and the token is passed to the next site in the token queue; if the value of the semaphore is zero, no change is made to the semaphore and the site receiving the token holds it. When a site S_i exits its critical section, it sends a release message to the site that was K-th in the token queue when S_i removed itself from the queue; if the good site is before the K-th site in the token queue, the release message is sent to the good site instead. The value of the semaphore is incremented by 1 with every release message. In this algorithm a cycle is defined to begin when the good site picks a new good site and sends out update messages, and to end when the new good site finally receives the token and executes its own CS. Let m be the number of token requests in the token queue at the beginning of the cycle; then n/m + 2 messages per CS request are required. Under light load (m close to 1) the message complexity is O(n); under heavy load (m close to n) the performance of the algorithm is good, and only 3 messages are required in the extreme case when all sites are requesting to enter the CS.

Wang-Lang [52] presented a token-based algorithm for the k-mutual exclusion problem based on Raymond's tree based approach. The nodes are assumed to be arranged in a tree structure whose shape remains static; however, the direction of an edge can be changed, and multiple edges with mixed directions may exist between nodes. Each node has a token direction bag (tdb) storing those neighbors which lie on the outgoing paths leading to the nodes holding the tokens. A node j can appear several times in a tdb if several tokens are reachable from j. Each node maintains a local variable token-count, which indicates the number of free tokens at
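The semaphore-in-token bookkeeping of Makki et al.'s algorithm described above can be sketched as follows. This is a simplification that ignores the good-site selection and release-routing details, and the data-structure names are invented.

```python
from collections import deque

# Sketch of the general semaphore carried inside the token in Makki et
# al.'s algorithm: it counts how many of the k critical sections are free.

def on_receive_token(token):
    """A site receiving the token enters its CS if the semaphore is
    non-zero (decrementing it) and forwards the token to the next queued
    site; with a zero semaphore the site holds the token instead."""
    if token["sem"] > 0:
        token["sem"] -= 1
        return token["queue"].popleft() if token["queue"] else None
    return None  # hold the token until a release message arrives

def on_release(token):
    """Every release message increments the semaphore by 1."""
    token["sem"] += 1
```

The semaphore is why up to k sites can be in their critical sections at once even though there is a single circulating token.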
the node, and a queue which stores the requests from its neighbors. A root node is chosen randomly to hold the K tokens. The directions of the edges are selected in such a way that from each node there exist k paths leading to the root. When a node i wants to enter the CS and does not hold any token, it sends a request(i) message to each distinct node in tdb_i and deletes one occurrence of each distinct node from tdb_i. If tdb_i is empty (i has already sent messages on behalf of its neighbors), i simply waits. When node i receives a token while waiting to enter its CS, it could send the token to a neighbor whose request is ahead of its own in the queue; it can also use a greedy strategy by always putting its own requests at the front of the queue. If the queue is empty, i retains the token and increments token-count by 1. The performance of the algorithm is highly dependent on the topology; the algorithm requires at most 2KD messages for a node to enter the CS, where D is the diameter of the tree.

Bulgannawar-Vaidya [47] used a dynamic forest structure for each token to forward token requests. Each node maintains a pointer array with one entry per token; these pointers define k forests corresponding to the k tokens. Each node also maintains a FIFO queue, and the token contains a queue of the identifiers of the nodes to which the token must be forwarded in FIFO order. For performance comparison, simulation experiments were performed against 3 other algorithms, namely Raymond's, Makki's …shows that the proposed algorithm achieves lower delay in entering the CS as well as a lower number of messages; however, the average message size is 1.5 to 2 times larger than in the other algorithms.

The Group Mutual Exclusion (GME) problem, introduced by Joung [56], deals with two contradictory issues in distributed systems, namely mutual exclusion and concurrency. Joung modeled the GME problem as the "Congenial Talking Philosophers Problem" [57]. In this problem there is a set of n philosophers. A philosopher spends his time either thinking alone or talking in a group. There is only one meeting room; therefore only one group can be held at a time. A philosopher interested in a group can succeed in entering the meeting room only if the meeting room is empty, or some philosopher interested in the same group is already in the meeting room. Joung gave the example of a CD jukebox containing large data objects. When a process needs a data object, the object is loaded into a cache buffer from the jukebox; the cache buffer is large enough to store only one data object at a time. Processes interested in the currently loaded data object are allowed to read concurrently, while a process requiring a different data object has to wait till the requested object is loaded into the cache buffer. The mutual exclusion and readers-writers problems are special cases of the group mutual exclusion problem: for mutual exclusion, one forum can be allocated to each process, so that only one process can be in the critical section at a time; for the readers-writers problem, a common read forum can be used by all processes, while a unique write forum is assigned to each process.

Solutions to the GME problem in the shared memory model have been proposed by Hadzilacos [54], Keane-Moir [45], P. Jayanti et al. [43] and S. Petrovic [51]. In message passing systems, non-token based algorithms for the GME problem have been presented by …
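The meeting-room admission rule of the congenial talking philosophers model described above can be captured in a few lines. This sketch models only the admission condition (enter if the room is empty or the same forum is in session), not forum departure or fairness; all names are invented.

```python
# Minimal sketch of the GME admission rule: philosophers interested in the
# same forum may share the meeting room; a different forum must wait.

def can_enter(room_forum, occupants, wanted_forum):
    """room_forum is None when the meeting room is empty."""
    return occupants == 0 or room_forum == wanted_forum

def try_enter(state, forum):
    if can_enter(state["forum"], state["occupants"], forum):
        state["forum"] = forum
        state["occupants"] += 1
        return True
    return False
```

Setting every process's forum to a distinct value recovers plain mutual exclusion, while a single shared "read" forum plus per-process "write" forums recovers the readers-writers problem, as noted above.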
…issued. Each process maintains an array request_i; request_i[j] contains the sequence number of the latest request of process P_j, along with its type, …

…primary token, cannot use it immediately, because some secondary tokens may be in use; therefore, it waits until it has received all the required release messages from the processes holding secondary tokens.
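The waiting rule in the fragment above (a primary token is unusable while secondary tokens are still outstanding) amounts to counting release messages. The following sketch illustrates that rule; the class and method names are invented, not the paper's.

```python
# Hypothetical sketch: a process that acquires the primary token must wait
# for a release message from every process still holding a secondary token
# before it may enter its critical section.

class PrimaryTokenHolder:
    def __init__(self, outstanding_secondaries):
        # ids of the processes that still hold a secondary token
        self.outstanding = set(outstanding_secondaries)

    def on_release(self, pid):
        """Record a release message from a secondary-token holder."""
        self.outstanding.discard(pid)

    def can_enter(self):
        # the primary token is usable only once no secondary token is in use
        return not self.outstanding
```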