Acm Synchronization Algorithm
Tai-Kuo Woo
Department of Computer Science
Jacksonville University
Jacksonville, FL 32211
Kenneth Block
Department of Computer and Information Science
University of Florida
Gainesville, FL 32611
Abstract

Synchronization is an important aspect of computing. System performance can be greatly reduced if an inefficient synchronization algorithm is used. Here we propose an algorithm for achieving mutual exclusion, a major task of synchronization, for distributed systems using a tournament approach. In the algorithm, a request message is passed from a leaf node to the root node, and then back to the leaf node again, signaling that a process is permitted to enter the critical section. A Huffman coding technique is used to minimize the number of messages required and to reduce the bound of delay. High fault tolerance is achieved through backtracking the messages that have been acknowledged by the faulty node.

1 Introduction

A problem that often arises in distributed systems is synchronizing concurrent processes, particularly guaranteeing mutually exclusive access to shared resources. An efficient mutual exclusion algorithm can maximize the parallelism among concurrent processes. For distributed systems, the protocol of a mutual exclusion algorithm cannot be made to rely on access to shared memory, but must communicate through message passing. At present there are two models for achieving mutual exclusion in distributed systems. The first model takes a probabilistic approach [1, 3, 4, 5]. Each process generates a random number and then compares its number with others. The winner gets into the critical section. The second model uses a deterministic approach. Processes reach agreement by comparing local counters in the nodes, such as time stamps, sequence numbers, etc. [8, 10, 2]. Even though the results may depend on the arrival sequence of the messages, the decision of who gets into the critical section is based on the state of the system. Most mutual exclusion algorithms require that a process wishing to enter the critical section send messages to every other process. The number of messages required for each entry to the critical section is O(n), where n is the number of nodes in a distributed system.

In this paper, we use a tournament approach for achieving mutual exclusion. The idea is to
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
not, have weak reliability, since reaching agree-
The request message from node 1 is forwarded to node 4, and the request message from node 0 is blocked. For ease of detection of faulty nodes, an ACK_withheld and an ACK_forward are returned to nodes 0 and 1, respectively. Node 4 forwards the request message from designated node 0 to designated node 6 if no request message is received from node 1. When the request message returns to node 1, it enters the critical section. When node 1 finishes the critical section, it sends release messages to nodes 6, 4, and 0 to unblock the request messages.

The following procedures implement the protocol.

Procedure Enter_Critical_Section;
begin
    {Send a message to its immediate designated node.}
    Designated node ID := Node ID/2;
    message(request_critical_section, node ID, level, designated node ID), Designated node ID;
    Go to sleep
    {It wakes up and enters the critical section when it receives a message of
     request_critical_section with its node ID and level equal to log2 n}
end;

Procedure Exit_Critical_Section;
begin
    Designated_node(log2 n) := n - 2;
    {Starting with the designated node at the highest level}
    for k := log2 n downto 1 do
    begin
        Message(release_critical_section, node ID), Designated_node(k);
        Designated_node(k - 1) := 2 * Designated_node(k) - n
        {Compute the designated node at one level below}
    end
end;

Procedure Receiving_Message;
begin
    case message of
    release_critical_section:
        if there is a request message being withheld then
            if level = log2 n then
                Forward the withheld request message to node ID
            else
                Designated node ID := Designated node ID/2 + n/2;
                Forward the withheld request message to designated node ID at a higher level
        else
            Set the token on;
    request_critical_section:
        level := level + 1;
        if token is off then
            Withhold the message;
            Return ACK_withheld
        else
            if level = log2 n then
                Designated node ID := Node ID
                {The request message has reached the top level and is going to be
                 returned to the node which issues the request}
            else
                Designated node ID := Designated node ID/2 + n/2;
                {Compute the designated node at the next level}
            Set the token off and forward the request message to designated node ID at a higher level;
            Return ACK_forward;
    end case
end;
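As an illustration only, the following Python sketch simulates the token gates of the procedures above inside a single process. It is not the authors' implementation: the class name Tournament, its fields, and the sequential execution are assumptions made for this example, messages and ACKs are replaced by direct method calls, and n is assumed to be a power of two. The release step walks the stored request path in reverse, which for node 1 with n = 8 visits nodes 6, 4, and 0 as in the example.

```python
import math


class Tournament:
    """Sequential simulation of the token gates (hypothetical helper class)."""

    def __init__(self, n: int):
        self.n = n                      # number of leaf nodes, assumed a power of two
        self.levels = int(math.log2(n))
        self.token_off = set()          # designated IDs whose token is currently off
        self.withheld = {}              # designated ID -> (requester, index on its path)
        self.in_cs = None               # node currently in the critical section

    def path(self, node_id: int) -> list:
        """Designated nodes visited by a request, lowest level first."""
        d = node_id // 2                # Designated node ID := Node ID/2
        out = [d]
        for _ in range(self.levels - 1):
            d = d // 2 + self.n // 2    # Designated node ID/2 + n/2
            out.append(d)
        return out

    def _climb(self, requester: int, start: int) -> bool:
        gates = self.path(requester)
        for i in range(start, len(gates)):
            gate = gates[i]
            if gate in self.token_off:  # token off: withhold (ACK_withheld)
                self.withheld[gate] = (requester, i)
                return False
            self.token_off.add(gate)    # set token off and forward (ACK_forward)
        self.in_cs = requester          # request has come back from the top level
        return True

    def request(self, node_id: int) -> bool:
        """Return True if the node enters the critical section immediately."""
        return self._climb(node_id, 0)

    def release(self, node_id: int) -> None:
        assert self.in_cs == node_id, "only the node in the critical section may release"
        self.in_cs = None
        for gate in reversed(self.path(node_id)):   # highest level first
            if gate in self.withheld:               # forward the withheld request
                requester, i = self.withheld.pop(gate)
                self._climb(requester, i + 1)
            else:
                self.token_off.discard(gate)        # set the token on


t = Tournament(8)
print(t.path(1))     # [0, 4, 6]
print(t.request(1))  # True: node 1 enters the critical section
print(t.request(0))  # False: withheld at designated node 0
t.release(1)
print(t.in_cs)       # 0: the withheld request has advanced to the top
```

Because each gate holds at most one withheld request in this sketch, a plain dictionary is enough; a real deployment would also need the timeouts and ACK messages described in Section 4.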
2.1 Proof of Correctness

Lemma 2.1 Mutual exclusion is guaranteed. Mutual exclusion ensures that only one process is in the critical section at a time.

Lemma 2.2 Deadlock is impossible. Deadlock is the situation where no process can ever enter the critical section.

Proof of Lemma 2.2 The claim is true. If only one process is at the highest level, it proceeds to the critical section. If more than one process is at the highest level, one of the processes advances to the next level and enters the critical section eventually.

Lemma 2.3 Starvation is not possible. Starvation is the situation where a process is prevented indefinitely from entering the critical section.

Proof of Lemma 2.3 A process which is withheld at a level can never be passed by any of its descendants. When a process exits the critical section, the processes above the withheld process move to a higher level and allow the withheld process to advance one level. Eventually, the withheld process enters the critical section.

3 A Technique for Reducing Message Traffic

In distributed systems, nodes have different probabilities of entering the critical section; i.e., some of them enter the critical section more frequently than the others. The number of messages and the delay required for each entry to the critical section is reduced if we place the nodes with high probability of requesting the critical section near the root. For example, in a system of four nodes, 0..3, with probabilities 0.01, 0.1, 0.1, and 0.79, respectively, it takes two messages for each entry to the critical section if the nodes are placed at leaves equidistant from the root. However, if we rearrange the nodes, placing the high probability nodes at the leaves closer to the root, as shown in Figure 2, the expected number of messages required for each entry to the critical section is reduced to 1.22.

3.1 Description of the Technique

The technique for finding the optimal arrangement of nodes can be found in coding theory. In a Huffman coding scheme [7], a set of items is to be coded in binary for transmission. Each item is associated with a probability of occurrence. To minimize the average-length binary code, a short binary code is used to represent an item with a high probability of occurrence. If we treat each binary digit as a message, the problem of minimizing the average length of binary code can be transformed into the problem of minimizing the average number of messages required for each entry to the critical section. The Huffman coding algorithm works as follows. First, it orders the items in descending order according to their probabilities of occurrence. Second, the code of the item with the lowest probability is concatenated with a "0" at the front, and the code of the item with the second lowest probability is concatenated with a "1." Third, the probabilities of these two items are summed up to form a new item, and all the items are reordered. Again, the last two items are concatenated with a "0" and a "1", respectively. This process is repeated until
In the Huffman coding scheme, the probabili-
ties of items are known. The probability that a
node will issue a request to enter the critical sec-
tion can be estimated statistically from its past
history.
The percentage reduction in the number of messages required depends on the variance of the probabilities with which nodes request entry to the critical section: the higher the variance, the greater the percentage reduction in messages.
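The placement idea can be made concrete with a short Python sketch. It assumes that the cost of one entry equals the depth of the requesting leaf in the Huffman tree, and that the reduction is measured against the log2 n messages of a balanced tree. Both are assumptions of this example (the second one is consistent with the rows of Tables 2 to 4 but is not stated in the surviving text), and the function names are hypothetical.

```python
import heapq
import itertools
import math


def huffman_depths(probs):
    """Return the leaf depth that Huffman's algorithm assigns to each item."""
    counter = itertools.count()          # tie-breaker so the heap never compares lists
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depth = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)   # lowest probability
        p2, _, b = heapq.heappop(heap)   # second lowest probability
        for i in a + b:
            depth[i] += 1                # every merged leaf moves one level down
        heapq.heappush(heap, (p1 + p2, next(counter), a + b))
    return depth


def expected_messages(probs):
    """Expected leaf depth, used here as the per-entry message count (assumption)."""
    return sum(p * d for p, d in zip(probs, huffman_depths(probs)))


def reduction_coefficient(probs):
    """Relative saving against a balanced tree of depth log2 n (assumed definition)."""
    return 1 - expected_messages(probs) / math.log2(len(probs))


probs = [0.01, 0.1, 0.1, 0.79]
print(huffman_depths(probs))                    # [3, 3, 2, 1]
print(round(expected_messages(probs), 2))       # 1.32
print(round(reduction_coefficient(probs), 2))   # 0.34
```

Under this pure leaf-depth cost the sketch yields 1.32 for the four-node example rather than the 1.22 quoted in Section 3. Figure 2 is not reproduced here, so the paper's counting convention may differ from the assumption made in the sketch.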
   σ      NMH     RC
  0.16    1.76    0.12
  0.20    1.64    0.18
  0.24    1.50    0.25
  0.28    1.38    0.31
  0.32    1.36    0.32
  0.36    1.20    0.40

Table 2: Reduction Coefficient of Using Huffman Coding (n=4)

   σ      NMH     RC
  0.09    2.61    0.13
  0.11    2.40    0.20
  0.13    2.28    0.24
  0.15    2.16    0.28
  0.17    2.13    0.29
  0.19    2.04    0.32

Table 3: Reduction Coefficient of Using Huffman Coding (n=8)

4 Network Reliability

There are two major issues in network reliability: fault detection and fault tolerance. The distributed algorithm proposed in this paper provides high reliability.

First, fault detection is very easy to implement. A node is monitored by its immediate children. When a node receives a request message from its children, an ACK is returned. If a child node does not receive an ACK before a timeout period expires, the child node determines that its parent node is faulty. To achieve high fault tolerance, we use a backtracking technique. When a node forwards a request message to its parent and an ACK is not returned before a timeout period expires, it forwards the request message to the last node whose request message has been acknowledged. For instance, node 5 forwards a request message from node 1 to node 7 and an ACK is returned to both node 5 and node 6. Note that an ACK should include the ID number of the node that originates the request to enter the critical section. Later on, if either node 5 or node 6 detects that node 7 is faulty, the new request message, which could come from any of the descendants of node 5 or node 6, is forwarded to node 1, because node 1 was the last node to successfully function on the chain leading to node 7.

To achieve multiple fault tolerance, a queue of ACKs can be established at each node and the replacing node for the faulty node can be designated according to the ordering of the nodes in the queue.
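The backtracking rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's protocol code: the class ChildMonitor and its methods are hypothetical, timeout detection is abstracted into a set of nodes presumed faulty, and the ACK queue is modeled as a bounded deque of originator IDs.

```python
from collections import deque


class ChildMonitor:
    """Bookkeeping a node keeps about its parent (hypothetical helper)."""

    def __init__(self, parent_id, history=4):
        self.parent_id = parent_id
        # Originator IDs carried by the parent's ACKs, most recent last.
        self.acked = deque(maxlen=history)

    def on_ack(self, originator_id):
        """Record the originator named in an ACK received from the parent."""
        self.acked.append(originator_id)

    def next_hop(self, faulty):
        """Pick the destination of the next request message.

        `faulty` is the set of nodes presumed faulty after a timeout (assumption:
        the timeout mechanism itself is outside this sketch).
        """
        if self.parent_id not in faulty:
            return self.parent_id
        for node in reversed(self.acked):   # backtrack, most recent ACK first
            if node not in faulty:
                return node
        return None                         # no live replacement is known


node5 = ChildMonitor(parent_id=7)
node5.on_ack(1)                  # node 7 acknowledged a request that originated at node 1
print(node5.next_hop(set()))     # 7: parent is healthy
print(node5.next_hop({7}))       # 1: node 7 timed out, fall back to node 1
```

Walking the queue from the most recent entry backwards is one way to read "according to the ordering of the nodes in the queue"; other orderings are possible.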
   σ      NMH     RC
  0.05    3.48    0.13
  0.06    3.40    0.15
  0.07    3.20    0.20
  0.08    2.88    0.28
  0.09    2.8     0.30
  0.10    2.36    0.41

Table 4: Reduction Coefficient of Using Huffman Coding (n=16)

5 Network Management

Since in the distributed algorithm each node receives request messages from its immediate descendants, the addition of a node is very straightforward. First, the position of the new node on the tree is calculated using the Huffman coding scheme. Second, the old parent notifies its immediate child that a new parent has been established. For example, if a new node 4 is to be inserted into the link between nodes 1 and 2, node 2 notifies node 1 that all request messages should be sent to node 4.
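The insertion step amounts to updating one parent pointer. The Python sketch below illustrates this with a plain dictionary; the name parent_of and the function insert_between are assumptions of this example, and the notification message is replaced by a direct assignment.

```python
def insert_between(parent_of, child, new_node):
    """Splice new_node into the link between child and its current parent."""
    old_parent = parent_of[child]
    parent_of[new_node] = old_parent   # the new node forwards to the old parent
    parent_of[child] = new_node        # the child now sends its requests to the new node
    return parent_of


parent_of = {1: 2}                     # node 1 currently sends requests to node 2
insert_between(parent_of, child=1, new_node=4)
print(parent_of)                       # {1: 4, 4: 2}
```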
Similarly, a node being deleted from the network notifies its immediate children that all request messages should be sent to its neighbor.

6 Conclusion

In this paper, we have described a mutual exclusion algorithm for distributed systems using a tournament approach. A process wishing to enter the critical section must issue a request message, which is then passed through the nodes on the path from the leaves to the root. When the request message is returned to the requesting node, the process enters the critical section. When exiting the critical section, the process sends a message to each node on the path the message has passed through. As a result, the number of messages required for each entry to the critical section is 2 log2 n, where n is the number of nodes. To further reduce the number of messages, the Huffman coding technique is used to determine the position of each node in the tournament based on the probability that a node may issue a request for entry to the critical section. Performance evaluation shows that this algorithm produces a significant reduction in messages, which also implies a reduction in the average bound of delay. The reliability of the protocol is enhanced through a backtracking technique. When a node receives a request message from one of its immediate children, it returns an ACK to both of its children immediately. When the children detect a faulty parent (i.e., no ACK is received), the request message is forwarded to the node whose request message passes through the faulty node. Multiple fault tolerance can be achieved by using a queue to store the nodes that have passed through a faulty node. Also, since a node needs to know only its parent and immediate children, adding and deleting nodes can be done on the fly. Moreover, the approach of using the Huffman coding technique can be applied to other application fields, such as file structures and database retrieval, without adding overhead. For instance, in a hierarchical file structure, frequently retrieved files are placed near the root to reduce the amount of searching time. In dynamic hashing and indexing, the items recalled most often should have the least rehashing and reindexing.

References

[1] Burns, J. E. Symmetry in Systems of Asynchronous Processes. Proceedings of 22nd Annual ACM Symposium on Foundations of Computer Science, 1981, pp. 169-174.

[2] Carvalho, O. and Roucairol, G. On Mutual Exclusion in Computer Networks. Communications of the ACM, 26(2), Feb. 1983, pp. 146-147.

[3] Chang, C. K. Bidding Against Competitor. IEEE Transactions on Software Engineering, 16(1), Jan. 1990, pp. 100-104.

[4] Cohen, S., Lehmann, D., and Pnueli. Symmetric and Economical Solution to the Mutual Exclusion Problem in Distributed Systems. Theoretical Computer Science, Vol. 34, 1984, pp. 215-226.

[5] Francez, N. and Rodeh, M. A Distributed Abstract Data Type Implemented by a Probabilistic Communication Scheme. Proceedings of the 21st Annual ACM Symposium on Foundations of Computer Science, 1980, pp. 373-379.

[6] Graunke, G. and Thakkar, S. Synchronization Algorithms for Shared-Memory Multiprocessors. IEEE Computer, June 1990, pp. 60-69.

[7] Huffman, D. A Method for the Construction of Minimum Redundancy Code. Proceedings of IRE, 40, 1952.

[8] Lamport, L. Time, Clocks, and the Ordering of Events in a Distributed System. Communications of the ACM, 21(7), July 1978, pp. 558-565.

[9] Peterson, G. and M. Fischer. Economic Solutions for the Critical Section Problem in a Distributed System. Proceedings of the Ninth ACM Symposium on Theory of Computing, 1977, pp. 91-97.

[10] Ricart, G. and Agrawala, A. K. An Optimal Algorithm for Mutual Exclusion in Computer Networks. Communications of the ACM, 24(1), Jan. 1981, pp. 9-17.