A Synchronization Algorithm for Distributed Systems

Tai-Kuo Woo
Department of Computer Science
Jacksonville University
Jacksonville, FL 32211
Kenneth Block
Department of Computer and Information Science
University of Florida
Gainesville, FL 32611

Abstract

Synchronization is an important aspect of computing. System performance can be greatly reduced if an inefficient synchronization algorithm is used. Here we propose an algorithm for achieving mutual exclusion, a major task of synchronization, for distributed systems using a tournament approach. In the algorithm, a request message is passed from a leaf node to the root node, and then back to the leaf node again, signaling that a process is permitted to enter the critical section. A Huffman coding technique is used to minimize the number of messages required and to reduce the bound of delay. High fault tolerance is achieved through backtracking the messages that have been acknowledged by the faulty node.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1991 ACM 089791-382-5/91/0003/0358 $1.50

1 Introduction

A problem that often arises in distributed systems is synchronizing concurrent processes, particularly guaranteeing mutually exclusive access to shared resources. An efficient mutual exclusion algorithm can maximize the parallelism among concurrent processes. For distributed systems, the protocol of a mutual exclusion algorithm cannot be made to rely on access to shared memory, but must communicate through message passing. At present there are two models for achieving mutual exclusion in distributed systems. The first model takes a probabilistic approach [1, 3, 4, 5]. Each process generates a random number and then compares its number with others. The winner gets into the critical section. The second model uses a deterministic approach. Processes reach agreement by comparing local counters in the nodes, such as time stamps, sequence numbers, etc. [8, 10, 2]. Even though the results may depend on the arrival sequence of the messages, the decision of who gets into the critical section is based on the state of the system. Most mutual exclusion algorithms require that a process wishing to enter the critical section send messages to every other process. The number of messages required for each entry to the critical section is O(n) where n is the number of nodes in a distributed system.

In this paper, we use a tournament approach for achieving mutual exclusion. The idea is to let the processes compete with one another as in a tournament. If a process wins, it advances to the next level and competes with another process, the winner of the level below. This competition continues until a process reaches the top and then enters the critical section. While the idea of using tournament is not new [9, 6], what distinguishes this paper from others is the way messages are passed and the arrangement of the nodes, which is based on Huffman coding theory. Also, most algorithms, whether probabilistic or not, have weak reliability, since reaching agreement requires a process to send messages directly to every other process. The approach presented in this paper lessens this problem; a faulty node only affects its descendents, and other processes can still enter critical section. Once a faulty node is detected, it is replaced dynamically through a
backtracking technique.

This paper is organized as follows. In Section 2, we delineate the synchronization algorithm. In Section 3, we describe how the Huffman coding technique can be used to reduce the message traffic. In Section 4, we discuss the strategy for increasing fault-tolerance. In Section 5, we look at network management. A conclusion is given in Section 6.

2 Description of the Algorithm

The protocol for a node wishing to enter the critical section is as follows. A node competes with its neighbor at a designated node by sending a request message containing information such as the ID of the request node and its immediate designated nodes. A designated node is the place where two nodes with processes interested in the critical section compete against each other. For instance, we can choose node 1 as the designated node for nodes 2 and 3. Whenever these two nodes have a request, the request messages are forwarded to node 1. An alternative is to let the message frame contain the node number of the request and the level number. The level number is initialized to 1. Each designated node calculates the node number of the designated node at the next level by using the formula provided in the procedure below. The designated node that determines who goes to the next level immediately returns an acknowledgment to both of its children and forwards the message to another designated node at a higher level.

Figure 1: A Tournament Among Nodes

Figure 1 shows the tournament among nodes. The solid white circles represent nodes, and the grey circles denote designated nodes. Each designated node holds a token, all of which are initialized to be true. A request message from node 1 would include designated nodes 0, 4, and 6. In the figure, nodes 0 and 1 compete at designated node 0; nodes 2 and 3 compete at designated node 1; and so on. The designated node determines the winner based on which request message arrives first. The request message arriving first finds the token is true and is forwarded to the designated node at higher level. A request message that gets to the designated node late is withheld until the node whose message arrives first exits the critical section. Since a node is only assigned as a designated node once, it can receive at most two request messages. When a node exits the critical section, it issues messages to unblock the request messages at the designated nodes. For example, suppose both nodes 0 and 1 want to enter the critical section and the request message from node 1 arrives at node 0 first (i.e., the message arrives at node 0 before node 0 generates a process for entering the critical section).

The request message from node 1 is forwarded to node 4, and the request message from node 0 is blocked. For ease of detection of faulty nodes, an ACK_withheld and an ACK_forward are returned to nodes 0 and 1, respectively. Node 4 forwards the request message from designated node 0 to designated node 6 if no request message is received from node 1. When the request message returns to node 1, it enters the critical section. When node 1 finishes the critical section, it sends messages_release to nodes 6, 4, and 0 to unblock the request messages.

The following procedures implement the protocol.

Procedure Enter_Critical_Section;
begin
    {Send a message to its immediate designated node.}
    Designated node ID := Node ID/2;
    message(request_critical_section, node ID, level, designated node ID), Designated node ID;
    Go to sleep
    {It wakes up and enters the critical section when it receives a message of
     request_critical_section with its node ID and level equal to log2 n.}
end;

Procedure Exit_Critical_Section;
begin
    Designated node(k) := n - 2;
    {Starting with the designated node at the highest level}
    for k := log2 n downto 1 do
    begin
        Message(release_critical_section, node ID), Designated node(k);
        Designated node(k - 1) := 2 * Designated node(k) - n
        {Compute the designated node at one level below}
    end
end;

Procedure Receiving_Message;
begin
    case message of
    release_critical_section:
        if there is a request message being withheld then
            if level = log2 n then
                Forward the withheld request message to node ID.
            else
                Designated node ID := Designated node ID/2 + n/2;
                Forward the withheld request message to designated node ID at a higher level.
        else
            Set the token on;
    request_critical_section:
        level := level + 1;
        if token is off then
            Withhold the message;
            Return ACK_withheld
        else
            if level = log2 n then
                Designated node ID := Node ID;
                {The request message has reached the top level and is going to be
                 returned to the node which issues the request}
            else
                Designated node ID := Designated node ID/2 + n/2;
                {Compute the designated node at the next level}
            Set the token off and forward the request message to designated node ID at a higher level.
            Return ACK_forward;
    end case
end;
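The procedures above can be exercised with a small single-process Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `designated_path` and `Tournament` are hypothetical, n is assumed to be a power of two, message passing is replaced by direct method calls, and each designated node is assumed to hold at most one withheld request. The release path is obtained by reversing the upward path rather than by the downward formula, because integer halving is not uniquely invertible (for n = 8, node 2 climbs through designated nodes 1, 4, 6, while 2 * 4 - 8 gives 0).

```python
def designated_path(node_id: int, n: int) -> list[int]:
    """Designated nodes a request from leaf `node_id` visits, bottom to top."""
    levels = n.bit_length() - 1              # log2(n), assuming n is a power of two
    path = [node_id // 2]                    # level 1: Node ID / 2
    for _ in range(levels - 1):
        path.append(path[-1] // 2 + n // 2)  # next level: ID / 2 + n / 2
    return path


class Tournament:
    """Single-process model; message sends become direct method calls."""

    def __init__(self, n: int):
        self.n = n
        self.token = {}      # designated node -> True while the token is on
        self.withheld = {}   # designated node -> (requester, index in its path)
        self.in_cs = None    # node currently in the critical section

    def request(self, node_id: int) -> None:
        self._advance(node_id, 0)

    def _advance(self, node_id: int, start: int) -> None:
        path = designated_path(node_id, self.n)
        for i in range(start, len(path)):
            d = path[i]
            if not self.token.get(d, True):       # token off: ACK_withheld
                self.withheld[d] = (node_id, i)
                return
            self.token[d] = False                 # token on: take it, ACK_forward
        assert self.in_cs is None                 # mutual exclusion invariant
        self.in_cs = node_id                      # request returned to its sender

    def release(self) -> None:
        node_id, self.in_cs = self.in_cs, None
        for d in reversed(designated_path(node_id, self.n)):   # top level first
            if d in self.withheld:                # unblock; the token stays off
                waiter, i = self.withheld.pop(d)
                self._advance(waiter, i + 1)
            else:
                self.token[d] = True              # nobody waiting: token back on


t = Tournament(8)
for node in (1, 0, 2):
    t.request(node)
order = []
while t.in_cs is not None:
    order.append(t.in_cs)
    t.release()
print(order)   # [1, 2, 0]
```

With eight leaves, `designated_path(1, 8)` returns `[0, 4, 6]`, matching the Figure 1 example. In the run shown, node 2 enters before node 0 because its request was already waiting one level higher when node 1 released.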

2.1 Proof of Correctness

Lemma 2.1 Mutual exclusion is guaranteed. Mutual exclusion ensures that only one process is in the critical section at a time.

Proof of Lemma 2.1 The claim is true because each node has two immediate children and each child can only send one request message. At the root, only one request message can go through. When releasing the critical section, each withheld request message advances one level and the request message withheld at the root gets into the critical section. As a result, at any time, only one process stays in the critical section.

Lemma 2.2 Deadlock is impossible. Deadlock is the situation where no process can ever enter the critical section.

Proof of Lemma 2.2 The claim is true. If only one process is at the highest level, it proceeds to the critical section. If more than one process is at the highest level, one of the processes advances to the next level and enters the critical section eventually.

Lemma 2.3 Starvation is not possible. Starvation is the situation where a process is prevented indefinitely from entering the critical section.

Proof of Lemma 2.3 A process which is withheld at a level can never be passed by any of its descendents. When a process exits the critical section, the processes above the withheld process move to a higher level and allow the withheld process to advance one level. Eventually, the withheld process enters the critical section.

3 A Technique for Reducing Message Traffic

In distributed systems, nodes have different probabilities of entering the critical section; i.e., some of them enter the critical section more frequently than the others. The number of messages and the delay required for each entry to the critical section is reduced if we place the nodes with high probability of requesting the critical section near the root. For example, in a system of four nodes, 0..3, with probabilities 0.01, 0.1, 0.1, and 0.79, respectively, it takes two messages for each entry to the critical section if the nodes are placed at leaves equidistant from the root. However, if we rearrange the nodes, placing the high probability nodes at the leaves closer to the root, as shown in Figure 2, the expected number of messages required for each entry to the critical section is reduced to 1.22.

Figure 2: A New Arrangement of Nodes

3.1 Description of the Technique

The technique for finding the optimal arrangement of nodes can be found in coding theory. In a Huffman coding scheme [7], a set of items is to be coded in binary for transmission. Each item is associated with a probability of occurrence. To minimize the average-length binary code, a short binary code is used to represent an item with a high probability of occurrence. If we treat each binary digit as a message, the problem of minimizing the average length of binary code can be transformed into the problem of minimizing the average number of messages required for each entry to the critical section. The Huffman coding algorithm works as follows. First, it orders the items in descending order according to their probabilities of occurrence. Second, the code of the item with the lowest probability is concatenated with a "0" at the front, and the code of the item with the second lowest probability is concatenated with a "1." Third, the probabilities of these two items are summed up to form a new item, and all the items are reordered. Again, the last two items are concatenated with a "0" and a "1", respectively. This process is repeated until

only two items are left, and each is assigned a binary digit. For example, items 0 through 7 have probabilities 1/2, 1/4, 1/16, 1/16, 1/16, 1/32, 1/64, and 1/64. Applying the Huffman coding algorithm would generate the binary codes shown in Table 1.

Item  Probability  Binary Code
0     1/2          0
1     1/4          10
2     1/16         1110
3     1/16         1101
4     1/16         1100
5     1/32         11110
6     1/64         111111
7     1/64         111110

Table 1: Binary Codes of Items Using Huffman Coding Scheme

To transform the binary codes into the tournament tree, we start with the root, assigning 1 to the right link of the root and 0 to the left link. If any of the paths from the root down matches the binary code of a node, it terminates. The paths which do not match any binary code branch to the next level. Eventually, all the paths find a match in the tree. Figure 3 shows the new configuration of nodes using Huffman coding theory.

Figure 3: A Configuration of Nodes Using Huffman Coding Theory

In the Huffman coding scheme, the probabilities of items are known. The probability that a node will issue a request to enter the critical section can be estimated statistically from its past history.

There is no doubt that the percentage of reduction of the number of messages required depends on the variance of the probabilities that nodes may issue a request for entry to the critical section; the higher the variance the greater the percentage of reduction of messages.

3.2 Performance Evaluation

Here, we perform a simulation to determine the effectiveness of using Huffman coding to rearrange the nodes. In the simulation, we calculate the percentage of reduction of messages (the reduction coefficient, RC) required for each entry to the critical section by using the Huffman coding technique for n = 4, 8, and 16, where n is the number of nodes in the system. The reduction coefficient is calculated by the formula below:

$$RC = 1 - \frac{NMH}{\log_2 n}$$

where NMH is the expected number of messages required using Huffman coding theory and $\log_2 n$ is the number of messages required without using Huffman coding. The NMH is obtained by taking the average of five simulation runs. On each simulation run, a set of probabilities is generated with the specified standard deviation. Then, by associating each node with a probability and applying the Huffman coding algorithm, an optimal arrangement of nodes is determined. The NMH is calculated by the formula below.

$$NMH = \sum_{i=1}^{n} p_i l_i$$

where $p_i$ is the probability that node i may request to enter the critical section and $l_i$ is the number of links between the node and the root in the tree. As shown in Tables 2, 3, and 4, the reduction coefficient increases as the standard deviation of the probabilities ($\sigma$) that the nodes may issue a request to enter the critical section increases.
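The quantities NMH and RC can be checked numerically with a short Python sketch. The function names `huffman_depths`, `nmh`, and `reduction_coefficient` are illustrative and do not come from the paper. The sketch assumes a priority queue (`heapq`) in place of the repeated reordering described in Section 3.1, which performs the same merges, and it only tracks the leaf depths l_i rather than building the tree or the binary codes.

```python
import heapq
from fractions import Fraction
from itertools import count
from math import log2


def huffman_depths(probs):
    """Return the depth l_i of each node's leaf in a Huffman tree over probs."""
    tie = count()                        # tie-breaker so the heap never compares lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depth = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)   # the two least likely subtrees
        p2, _, b = heapq.heappop(heap)
        for i in a + b:                  # merging pushes their leaves one level down
            depth[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), a + b))
    return depth


def nmh(probs, depth):
    """Expected number of messages: the sum of p_i * l_i."""
    return sum(p * l for p, l in zip(probs, depth))


def reduction_coefficient(probs):
    """RC = 1 - NMH / log2(n)."""
    return 1 - nmh(probs, huffman_depths(probs)) / log2(len(probs))


table1 = [Fraction(1, d) for d in (2, 4, 16, 16, 16, 32, 64, 64)]
print(huffman_depths(table1))                # [1, 2, 4, 4, 4, 5, 6, 6]
print(nmh(table1, huffman_depths(table1)))   # 67/32

four = [Fraction(1, 100), Fraction(1, 10), Fraction(1, 10), Fraction(79, 100)]
print(nmh(four, huffman_depths(four)))       # 33/25
```

For the Table 1 probabilities the depths equal the code lengths in the table, giving NMH = 67/32 (about 2.09) against log2 8 = 3, so RC is about 0.30. For the four-node example of Section 3 the same sum evaluates to 0.79·1 + 0.1·2 + 0.1·3 + 0.01·3 = 1.32, which differs from the 1.22 quoted there and suggests a misprint in that figure.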

σ     NMH   RC
0.16  1.76  0.12
0.20  1.64  0.18
0.24  1.50  0.25
0.28  1.38  0.31
0.32  1.36  0.32
0.36  1.20  0.40

Table 2: Reduction Coefficient of Using Huffman Coding (n=4)

σ     NMH   RC
0.09  2.61  0.13
0.11  2.40  0.20
0.13  2.28  0.24
0.15  2.16  0.28
0.17  2.13  0.29
0.19  2.04  0.32

Table 3: Reduction Coefficient of Using Huffman Coding (n=8)

4 Network Reliability

There are two major issues in network reliability: fault detection and fault tolerance. The distributed algorithm proposed in this paper provides high reliability.

First, fault detection is very easy to implement. A node is monitored by its immediate children. When a node receives a request message from its children, an ACK is returned. If a child node does not receive an ACK before a timeout period expires, the child node determines that its parent node is faulty. To achieve high fault tolerance, we use a backtracking technique. When a node forwards a request message to its parent and an ACK is not returned before a timeout period expires, it forwards the request message to the last node whose request message has been acknowledged. For instance, node 5 forwards a request message from node 1 to node 7 and an ACK is returned to both node 5 and node 6. Note that an ACK should include the ID number of the node that originates the request to enter the critical section. Later on if either node 5 or node 6 detects that node 7 is faulty, the new request message, which could come from any of the descendents of node 5 or node 6, is forwarded to node 1, because node 1 was the last node to successfully function on the chain leading to node 7.

To achieve multiple fault tolerance, a queue of ACKs can be established at each node and the replacing node for the faulty node can be designated according to the ordering of the nodes in the queue.
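The bookkeeping behind the backtracking rule can be sketched in Python as follows. This is a hedged illustration only: the class `BacktrackingNode`, its methods, and the `history` parameter are hypothetical names, timeouts are reduced to a plain comparison instead of real timers, and trying the most recently acknowledged originator first is an assumption, since the text does not fix the ordering used in the queue.

```python
from collections import deque


class BacktrackingNode:
    """Child-side state for detecting and replacing a faulty parent."""

    def __init__(self, node_id, parent_id, history=4):
        self.node_id = node_id
        self.parent_id = parent_id
        # Originators whose requests the parent has acknowledged, oldest first.
        self.acked = deque(maxlen=history)

    def on_ack(self, originator_id):
        """Record an ACK; each ACK carries the ID of the request's originator."""
        self.acked.append(originator_id)

    @staticmethod
    def parent_is_faulty(sent_at, now, timeout):
        """No ACK within the timeout period means the parent is presumed faulty."""
        return now - sent_at > timeout

    def replacement(self, failed=()):
        """Pick the node that stands in for the faulty parent.

        Assumption: the most recently acknowledged originator is tried first;
        `failed` lists candidates already known to be down (multiple faults).
        """
        for candidate in reversed(self.acked):
            if candidate not in failed:
                return candidate
        return None


# Example from the text: node 7 acknowledges node 1's request to nodes 5 and 6.
node5 = BacktrackingNode(5, parent_id=7)
node6 = BacktrackingNode(6, parent_id=7)
for child in (node5, node6):
    child.on_ack(1)

if BacktrackingNode.parent_is_faulty(sent_at=0.0, now=5.0, timeout=2.0):
    print(node5.replacement(), node6.replacement())   # 1 1
```

Keeping a bounded queue of acknowledged originators gives each child an ordered list of fallbacks, so a second fault can be handled by passing the failed candidate in `failed` and taking the next entry.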

σ     NMH   RC
0.05  3.48  0.13
0.06  3.40  0.15
0.07  3.20  0.20
0.08  2.88  0.28
0.09  2.8   0.30
0.10  2.36  0.41

Table 4: Reduction Coefficient of Using Huffman Coding (n=16)

5 Network Management

Since in the distributed algorithm each node receives request messages from its immediate descendents, the addition of a node is very straightforward. First, the position of the new node on the tree is calculated using the Huffman coding scheme. Second, the old parent notifies its immediate child that a new parent has been established. For example, if a new node 4 is to be inserted into the link between nodes 1 and 2, node 2 notifies node 1 that all request messages should be sent to node 4.

Similarly, a node being deleted from the network notifies its immediate children that all request messages should be sent to its neighbor.

6 Conclusion

In this paper, we have described a mutual exclusion algorithm for distributed systems using a tournament approach. A process wishing to enter the critical section must issue a request message, which is then passed through the nodes on the path from the leaves to the root. When the request message is returned to the requesting node, the process enters the critical section. When exiting the critical section, the process sends a message to each node on the path the message has passed through. As a result, the number of messages required for each entry to the critical section is $2\log_2 n$, where n is the number of nodes. To further reduce the number of messages, the Huffman coding technique is used to determine the position of each node in the tournament based on the probability that a node may issue a request for entry to the critical section. Performance evaluation shows that this algorithm produces a significant reduction in messages, which also implies a reduction in the average bound of delay. The reliability of the protocol is enhanced through a backtracking technique. When a node receives a request message from one of its immediate children, it returns an ACK to both of its children immediately. When the children detect a faulty parent (i.e., no ACK is received), the request message is forwarded to the node whose request message passes through the faulty node. Multiple fault tolerance can be achieved by using a queue to store the nodes that have passed through a faulty node. Also, since a node needs to know only its parent and immediate children, adding and deleting nodes can be done on the fly. Moreover, the approach of using Huffman coding technique can be applied to other application fields, such as file structures and database retrieval, without adding overhead. For instance, in a hierarchical file structure, frequently retrieved files are placed near the root to reduce the amount of searching time. In dynamic hashing and indexing, the items recalled most often should have the least rehashing and reindexing.

References

[1] Burns, J. E. Symmetry in Systems of Asynchronous Processes. Proceedings of 22nd Annual ACM Symposium on Foundations of Computer Science, 1981, pp. 169-174.

[2] Carvalho, O. and Roucairol, G. On Mutual Exclusion in Computer Networks. Communications of the ACM, 26(2), Feb. 1983, pp. 146-147.

[3] Chang, C. K. Bidding Against Competitor. IEEE Transactions on Software Engineering, 16(1), Jan. 1990, pp. 100-104.

[4] Cohen, S., Lehmann, D., and Pnueli. Symmetric and Economical Solution to the Mutual Exclusion Problem in Distributed Systems. Theoretical Computer Science, Vol. 34, 1984, pp. 215-226.

[5] Francez, N. and Rodeh, M. A Distributed Abstract Data Type Implemented by a Probabilistic Communication Scheme. Proceedings of the 21st Annual ACM Symposium on Foundations of Computer Science, 1980, pp. 373-379.

[6] Graunke, G. and Thakkar, S. Synchronization Algorithms for Shared-Memory Multiprocessors. IEEE Computer, June 1990, pp. 60-69.

[7] Huffman, D. A Method for the Construction of Minimum Redundancy Code. Proceedings of IRE, 40, 1952.

[8] Lamport, L. Time, Clocks, and the Ordering of Events in a Distributed System. Communications of the ACM, 21(7), July 1978, pp. 558-565.

[9] Peterson, G. and Fischer, M. Economic Solutions for the Critical Section Problem in a Distributed System. Proceedings of the Ninth ACM Symposium on Theory of Computing, 1977, pp. 91-97.

[10] Ricart, G. and Agrawala, A. K. An Optimal Algorithm for Mutual Exclusion in Computer Networks. Communications of the ACM, 24(1), Jan. 1981, pp. 9-17.
