An Overview of Distributed MST Algorithms

Arkanath Pathak, Buddha Prakash, Utpal, Palash Mittal

1 Introduction
In this paper, we discuss the problem of finding a Minimum Spanning Tree (MST) over a weighted
undirected graph. We also discuss and compare three classic distributed algorithms for the problem
in the sections that follow. Finding a minimum-weight spanning tree with a distributed algorithm is
a fundamental problem in the field of distributed network algorithms. Trees and MSTs are used in a
wide variety of algorithms in the distributed graph structure domain. Around 1983, when the
influential and classic approach was provided by Gallager et al. [7], MST algorithms were already
being used in broadcast algorithms for communication networks. With the help of a minimum-cost
tree, the cost of a broadcast can be reduced significantly [2, 3]. Here, the edge weights
correspond to the cost of using a channel in a specific direction. In addition to the broadcast
application, there are many potential control problems for networks whose communication complexity
is reduced by having a known spanning tree. Spanning trees themselves are essential components in
classic distributed graph problems like Leader Election [10], network synchronization [1],
Breadth-First-Search [3] and Deadlock Resolution [4].

1.1 Problem Definition

Let G = (V, E) be an undirected graph, where V is the finite nonempty set of nodes and E ⊆ V × V
is the set of edges defined over V. A graph G′ = (V_G, E_G) is a spanning sub-graph of G if V_G = V
and E_G ⊆ E. A spanning tree of a graph is a connected acyclic spanning sub-graph. The original
graph G is assumed to be connected for this problem domain; otherwise a spanning tree cannot be
formed. A spanning tree T = (V, E_T), E_T ⊆ E, is the minimum cost spanning tree if the sum of
edge costs over E_T is minimum. Formally,

$$T^{*} = \operatorname*{argmin}_{T} \sum_{e \in E_T} w(e),$$

where w(e) is the weight assigned to the edge e. The edges are assumed to have distinct weights,
which makes the minimum weight spanning tree unique. This property can easily be ensured by
various workarounds, e.g., appending node ids to the edge weights, which ensures a proper ordering
in which ties are broken by the node ids. In the late 1950’s, long before the importance of the
problem in the communication networks domain was realized, the classic approaches towards finding
the MST were proposed by Dijkstra [6], Prim [12] and Kruskal [9]. These approaches primarily rely
on the following two lemmas. Note that the proofs are informal and that they assume the edge costs
to be distinct, which ensures the presence of a unique MST.

Lemma 1.1 (MST Cut Lemma) Let P and Q be two disjoint, nonempty node sets whose union is the node
set of the original graph. Then, among all the edges between P and Q, the edge with minimum cost
must be present in the MST.

Proof Let e be the minimum-cost edge between P and Q, and let T be an MST that does not contain e.
Adding e to T produces a cycle. This cycle crosses between P and Q at least once more, along some
edge e′ ≠ e, and w(e′) > w(e) since the costs are distinct. Replacing e′ with e gives a spanning
tree that is cheaper than T, a contradiction.

Lemma 1.2 (MST Cycle Lemma) The costliest edge of any cycle in a connected undirected graph cannot
be in the MST.

Proof Let T be an MST that contains e, the costliest edge of some cycle C. Removing e from T splits
T into two disjoint connected components. The rest of C connects the endpoints of e, so it contains
an edge e′ that joins the two components, and w(e′) < w(e). Replacing e with e′ gives a cheaper
spanning tree, a contradiction.

Either of the two lemmas above can be used to construct the MST. Prim’s algorithm [12] is a nearly
direct implementation of the Cut Lemma (Lemma 1.1). These approaches have a time complexity
of O(E log V).
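
As an illustration of how the Cut Lemma drives Prim’s algorithm, here is a minimal sequential sketch in Python. The function name `prim_mst`, the `(weight, u, v)` edge format and the choice of node 0 as the starting point are assumptions made for this example; they do not come from the cited papers.

```python
import heapq

def prim_mst(num_nodes, edges):
    """Return the MST edges of a connected graph, grown from node 0.

    edges: iterable of (weight, u, v) with nodes numbered 0..num_nodes-1.
    """
    adjacency = {n: [] for n in range(num_nodes)}
    for w, u, v in edges:
        adjacency[u].append((w, u, v))
        adjacency[v].append((w, v, u))

    in_tree = {0}                    # P: the nodes already in the tree
    frontier = list(adjacency[0])    # candidate edges leaving P
    heapq.heapify(frontier)
    mst = []
    while frontier and len(in_tree) < num_nodes:
        w, u, v = heapq.heappop(frontier)
        if v in in_tree:
            continue                 # edge no longer crosses the cut (P, V \ P)
        # (u, v) is the cheapest edge crossing the cut, so by the Cut Lemma
        # it belongs to the MST.
        in_tree.add(v)
        mst.append((w, u, v))
        for edge in adjacency[v]:
            if edge[2] not in in_tree:
                heapq.heappush(frontier, edge)
    return mst
```

With a binary heap, each edge is pushed and popped at most twice, which is consistent with the O(E log V) bound quoted above.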

However, from here on, we focus on the distributed approaches towards the problem in the static
asynchronous network domain [1]. In this construct, the graph is assumed to “be representative
of a point-to-point communication network, where the set of nodes V represents processors of the
network and the set of edges E represents bidirectional non-interfering communication channels
operating between neighbouring nodes” [3]. Communication takes place only through the channels, and
no memory is shared between the processes. However, a node does know the identity of its
neighbours. Also, the approaches are confined to event-driven algorithms; hence, there is no
global clock to help a node learn about events taking place in other processes.

For the distributed approaches, the complexity of an algorithm is usually measured in terms of
either communication or time. Communication complexity is the number of messages sent during the
execution of the algorithm. In some models, it is further assumed that a single message can contain
at most O(log V) bits [3]. The definition of time complexity varies with the usage. It is usually
based on either the number of rounds (relative to an algorithm) or time units (global clock).

1.2 Types Of Solutions

All three distributed algorithms for the MST problem that we review follow the same general
scheme, which was introduced in the classical GHS algorithm by Gallager et al. [7]. There is no
pre-specified root in the distributed network. Each node stores its own copy of the algorithm, and
the algorithm is started independently by all the nodes. The algorithm proceeds by merging
fragments (connected subgraphs of the MST). It is interesting to note that the algorithm by Garay
et al. [8] also breaks fragments down into smaller fragments to ensure an upper bound on the
diameter of each fragment. Initially, all fragments are singletons (consisting of only a single
node). In the merge step, each fragment determines its “best” edge (the minimum-weight edge among
all its outgoing edges) and merges with the fragment at the other end of this edge. The edge is
determined by propagating an initiate message to each node in the fragment. Each node gathers the
reports of all its children and propagates the minimum outgoing edge among those reported by its
children and the one it determined itself. Upon termination, the algorithm has found the unique MST
of the distributed network and propagated this information to all the nodes.
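
The fragment-merging scheme can be illustrated with a sequential, Borůvka-style simulation. The sketch below is not any of the three distributed algorithms: it has global knowledge of the graph and no messages, and the names `boruvka_mst` and `find` as well as the `(weight, u, v)` edge format are assumptions made for this example. Ties are broken by node ids, as suggested in the problem definition.

```python
def boruvka_mst(num_nodes, edges):
    """Sequential simulation of fragment merging; edges are (weight, u, v)."""
    fragment = list(range(num_nodes))   # fragment[n]: pointer towards the representative

    def find(n):
        while fragment[n] != n:
            fragment[n] = fragment[fragment[n]]   # path halving
            n = fragment[n]
        return n

    mst = set()
    num_fragments = num_nodes           # initially every fragment is a singleton
    while num_fragments > 1:
        best = {}                       # fragment -> its minimum-weight outgoing edge
        for w, u, v in edges:
            fu, fv = find(u), find(v)
            if fu == fv:
                continue                # internal edge of a fragment
            key = (w, min(u, v), max(u, v))   # ties broken by node ids
            for f in (fu, fv):
                if f not in best or key < best[f]:
                    best[f] = key
        if not best:
            raise ValueError("graph is not connected")
        for w, u, v in best.values():
            fu, fv = find(u), find(v)
            if fu != fv:                # two fragments may have picked the same edge
                fragment[fu] = fv
                mst.add((w, u, v))
                num_fragments -= 1
    return mst
```

Because the tie-broken keys are distinct, the edges selected in one round cannot close a cycle, so every selected edge is safe by the Cut Lemma.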

Even though all the three algorithms follow the same general scheme, they differ in terms of prob-
lem breakdown and the approaches adopted to solve the sub-problems. In a general sense, there
are two methods for computing the MST.

1. Combining fragments: add edges until the MST is constructed.

2. Eliminating cycles: delete edges until only the MST is left.

Both Gallager’s [7] and Awerbuch’s [1] algorithms use the first approach, whereas Garay’s [8]
algorithm combines the two approaches.

Awerbuch’s and Garay’s algorithms achieve optimal performance, both in terms of message passing
and time complexity. The algorithms proceed by breaking the problem down into three smaller
sub-problems, which are solved in three different phases. In the different phases, the algorithms
establish a trade-off between message passing and time complexity.

Awerbuch’s algorithm starts with the Counting phase, in which it determines the number of nodes
in the network. The Counting phase is followed by a second phase in which the MST is developed by
following the GHS algorithm until the fragments reach a large size, O(V / log V). At this stage
the algorithm switches to the third phase, which consists of two relatively complex procedures,
Root Distance and Leader Distance, for adding the remaining edges of the MST. We discuss the
implementation details and subtleties of the procedures later in the report.

Garay’s algorithm starts with the Controlled GHS phase, which is a modified and controlled
version of the GHS algorithm. This first phase produces a forest with a bounded number of
fragments, where the diameter of each fragment is also bounded by an upper limit. It is followed by
the second phase, which eliminates short cycles in the fragment graph. The third phase performs
global edge elimination, in which the remaining fragment graph is reduced to the MST.

1.3 Challenges
Before moving to the individual algorithms, let us first review the typical issues faced in this
domain. In the distributed domain, no node knows the state of the whole graph, which both Prim’s
and Kruskal’s algorithms require. In the message-passing model, it is also difficult to process a
single node at a time. Hence, the algorithms need to handle the different orderings of events that
can take place at different nodes. Some distributed algorithms bear similarities to Borůvka’s
algorithm [5] for the classical MST problem. For the distributed approaches, there are issues to be
tackled for each algorithm. A common issue arises from the assumption of known, distinct edge
costs. This is not always the case for the channels. However, as mentioned earlier, workarounds
exist to ensure a proper ordering of the edges. Another common issue is to identify each fragment,
so that internal edges can be told apart from outgoing ones. This issue is resolved by associating
the identity of a fragment with its root node or edge.

2 Gallager et al. (1983) - GHS Algorithm

In this section, we describe the classical article by Gallager, Humblet and Spira published in 1983
[7]. The approach is built on the Cut Lemma (Lemma 1.1) and, as mentioned before, develops a set
of fragments (or sub-trees) which ultimately converge into the MST. Before going into the
workings, we list the assumptions made by the algorithm:

• Each node has the knowledge of all its neighbours, including the weights (or costs) of the
edges (or channels) connecting to them.

• Each node has its own copy of the full algorithm in the initial state.

• The channels are assumed to be asynchronous, bidirectional, FIFO, and without any errors.

• The weights of the edges in the network are mutually distinct.

With these assumptions, we move towards the working of the algorithm.

2.1 Overview
The algorithm involves some complications due to the detailed logistics; we therefore first give
an overview of how it works, and then build up the specifics in the next section.

At any point in the algorithm, there is a set of sub-trees, which the authors call fragments. These
fragments merge with each other to ultimately form the MST. Each fragment is hence formed of a set
of nodes. Initially, a fragment is just a single node of the graph. Each node also holds some
identifying information (to be defined later) for the fragment it belongs to, and is aware of the
edge leading to the core (a special edge, to be defined later) of the fragment. By the Cut Lemma
(Lemma 1.1), we know that a fragment should merge only along its outgoing edge of minimum cost. To
reduce the complexity, the authors impose a further constraint: a fragment may merge only with a
bigger fragment. Due to this asymmetry, this merging is often called absorbing or hooking into the
bigger fragment. Merging into the bigger set is a classic optimization technique used in data
structures like Union-Find. Hence, the edge must also connect to a bigger fragment. Such an edge,
if found, is called the best-edge of the fragment, and each fragment gets merged along its
best-edge. Finding the best-edge of a fragment is therefore a crucial module of the algorithm.

The core of a fragment, introduced in the previous paragraph, can be treated as the root of the
fragment, since each node maintains an inbound edge leading to the core. The identification of a
fragment (also introduced in the previous paragraph) is simply the weight of the core edge. Since
the weights are distinct, the identification is also unique. The definition of the core edge relies
on an attribute called the level, defined for each fragment. The level provides a lower bound on
the logarithm of the number of nodes in the fragment, as we shall see shortly. The level of a
fragment is initialized as 0 (when the fragment consists of a single node). Each fragment gets
absorbed into a fragment of higher or equal level. If the fragment gets absorbed into a fragment of
a higher level, it just becomes a part of the bigger fragment and assumes the identity of the
bigger fragment. However, if the levels of the two fragments are equal, a new fragment is formed
with the level incremented by 1. We can now define the core, whose weight serves as the ID of the
fragment. When a fragment has level 0 (single node), there is no core edge. During each increment
of the level (caused by the merging of fragments of equal levels), the core is updated to the edge
along which the merge took place (the edge connecting the two fragments). Note that if the levels
of the two fragments are equal during the merging, the best-edge for both fragments will change. It
follows that a level L + 1 fragment always contains at least two level L fragments, and hence, by
induction, each level L fragment contains at least 2^L nodes. Thus, the level of a fragment provides
a lower bound on the base-2 logarithm of the number of nodes.
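
The level rules can be checked with a small simulation. This is only a sketch under simplifying assumptions: fragments are reduced to `(level, size)` pairs, pairs of fragments are combined at random rather than along best-edges, and the names `combine` and `simulate` are hypothetical. It illustrates that the invariant size ≥ 2^level is preserved by both absorption and equal-level merging.

```python
import random

def combine(level_a, size_a, level_b, size_b):
    """Fragment A connects to fragment B; return (level, size) of the result.

    Returns None when A would have to wait, i.e. when B has a lower level.
    """
    if level_a < level_b:
        return level_b, size_a + size_b       # absorption: B keeps its level
    if level_a == level_b:
        return level_a + 1, size_a + size_b   # equal levels: new core, level + 1
    return None                               # A waits for B to catch up

def simulate(num_nodes, seed=0):
    """Combine random pairs of fragments until one is left; check the invariant."""
    rng = random.Random(seed)
    fragments = [(0, 1)] * num_nodes          # singletons: level 0, size 1
    while len(fragments) > 1:
        a, b = rng.sample(range(len(fragments)), 2)
        result = combine(*fragments[a], *fragments[b])
        if result is None:
            continue                          # try another pair
        fragments = [f for i, f in enumerate(fragments) if i not in (a, b)]
        fragments.append(result)
        # Invariant from the text: a level-L fragment has at least 2^L nodes.
        assert all(size >= 2 ** level for level, size in fragments)
    return fragments[0]
```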

To give a one-line summary, the GHS algorithm maintains a set of disjoint fragments. Each fragment
has a core edge, and at any point in time each fragment is waiting to get merged along its
best-edge (which may not be known at all times). These fragments ultimately merge into a single
fragment, the MST.

2.2 Algorithm: Specifics

We now discuss the specifics of the algorithm, including the types of messages that are sent and
the actions taken by each node. A node can be in one of three “states” during the course of the
algorithm: Sleeping, Find and Found. Initially, every node is in the Sleeping state. It is then
“awakened” either by itself or by another node, after which it never returns to the Sleeping
state. At any time after the node has awakened, it is either in the Find state or in the Found
state; in fact, it switches between the two until the MST is found.

We now describe how the best-edge of a fragment is found. If the fragment consists of a single
node, the best-edge is simply the least-cost outgoing edge, which necessarily leads to a fragment
of higher or equal level. The node iterates over all of its adjacent edges, chooses the edge with
the minimum cost, and sends a Connect message to the adjacent node. It subsequently goes into the
state Found and waits for a response from the fragment at the other end. Now, let us consider a
fragment with more than one node (non-zero level). The type of message used for this case is
Initiate. The search is started whenever two fragments of level L − 1 merge to form a bigger
fragment with a new core. The two nodes forming the core edge broadcast the Initiate message to the
other nodes of the fragment. This broadcast is done along the outward edges of the tree (the
opposite of the inbound edges). The Initiate message also carries the information <ID, Level>,
where ID is the weight of the core. When a node receives the Initiate message, it changes its state
to Find. This starts the process of finding the minimum-weight outgoing adjacent edge at each node.
Each node labels each of its adjacent edges with one of the following three classes: Branch,
Rejected and Basic. Branch means that the edge is in the fragment tree. Rejected means that the
edge has been discovered to lead to a node of the same fragment. The Basic edges are the remaining
edges, which are labelled neither Branch nor Rejected. To find its best-edge candidate, a node
picks its minimum-weight Basic edge and sends a Test message to the node on the other side of the
edge. The Test message contains the <ID, Level> information of the fragment. If the node on the
other side has the same ID (same fragment), it replies with a Reject message, and both nodes label
the edge as Rejected; the testing node then moves on to its next Basic edge. Note that if a node
sends a Test message and receives a Test message with the same identity from the other side as
well, it need not send a Reject message as a reply and can just label the edge as Rejected. If the
node on the other side has a different identity, it either responds with an Accept message (if its
level is greater than or equal to the received level), or it delays its response until its fragment
reaches a level greater than or equal to the one received in the message. Since the response is
delayed, the node which sent the message is blocked, and hence the whole fragment is blocked. This
essentially means that a fragment finishes finding its best-edge if and only if none of the
best-edge candidates of the fragment leads to a fragment with a lower level.
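
The three possible reactions to a Test message can be summarised in a few lines of Python. The names `Reply` and `respond_to_test` are hypothetical and the sketch covers only the decision itself, not message queues or edge labelling.

```python
from enum import Enum

class Reply(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    DEFER = "defer"    # hold the message until the local level has caught up

def respond_to_test(my_id, my_level, sender_id, sender_level):
    """Decide how a node reacts to Test(<sender_id, sender_level>)."""
    if my_id == sender_id:
        return Reply.REJECT          # both endpoints are in the same fragment
    if my_level >= sender_level:
        return Reply.ACCEPT          # different fragment of sufficient level
    return Reply.DEFER               # lower level: the sender stays blocked
```

For instance, a level-0 node that receives a Test from a level-1 fragment defers its answer, which is what blocks the testing fragment in the discussion above.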

When the individual nodes have found their respective minimum-weight outgoing edges (the
candidates for best-edge), the nodes need to cooperate to find the best-edge for the fragment. This
is achieved by propagating Report messages towards the core. A Report(W) message is sent on the
inbound edge, where W is the weight of the minimum-weight outgoing edge encountered so far. W is ∞
when no outgoing edge has been encountered. A node waits to receive Report messages from all its
branches except the inbound edge, and takes the minimum of those weights and the weight of the
minimum-weight outgoing edge found by the node itself. This minimum is then propagated in a Report
message on the node’s inbound edge. When a node sends a Report message, it changes its state to
Found. Furthermore, each node saves the edge leading towards the best-edge candidate so that the
path can be traced back.
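
The Report phase is a convergecast of a minimum over the fragment tree. The recursive sketch below computes the value a node would place in its Report(W) message; it is a centralised stand-in for the message exchange, and the function name `report` and the dictionaries `children` and `local_best` are assumptions made for this example.

```python
import math

def report(node, children, local_best):
    """Return the weight W that `node` would send in Report(W) towards the core.

    children:   node -> list of nodes reached over its branches, excluding
                the inbound edge
    local_best: node -> weight of the node's own accepted outgoing edge
                (missing or math.inf when it has none)
    """
    best = local_best.get(node, math.inf)
    for child in children.get(node, []):
        # Wait for the child's Report and keep the smaller weight.
        best = min(best, report(child, children, local_best))
    return best
```

A subtree with no outgoing edges reports `math.inf`, matching the W = ∞ case in the text.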

Ultimately, when both nodes of the core edge have exchanged Report messages, these nodes act to
inform the node that has the minimum-weight outgoing edge. It is now certain that the fragment
must merge along that best-edge; hence the inbound edges need to be reoriented towards that node,
since the new core edge will either lie in the adjoining fragment or be the best-edge itself. To do
so, a Change-core message is propagated towards the node with the best-edge. For each edge
encountered along this path, the inbound edge is reversed to point towards the best-edge.

Finally, the node with the best-edge sends a Connect(L) message along the best-edge, where L is
the level of the fragment of the sending node. It may happen that the fragment on the other side
has the same level and has sent a Connect message along the same edge. This causes the best-edge
to become the new core of a newly formed fragment with level L + 1. To achieve this, Initiate
messages are broadcast with the new <ID, Level> information to both fragments. This serves two
purposes: it sends the information update to all the nodes, and it initiates a new search, since
the level has changed. On the other hand, if the connecting fragment has level L′ > L, the fragment
needs to get absorbed into the connecting fragment. To do so, the Initiate message is broadcast
only to the joining fragment (the one with the smaller level). This both updates the <ID, Level>
information of the joining fragment and initiates a search in the joining fragment, since its level
is now updated. Note that the nodes in the connecting fragment remain unchanged, since they were
already in the middle of their search for the best-edge.

We now formalize these details by giving a pseudo-code for the algorithm.

2.3 Example
In Figure 1, we have shown an example run of the algorithm on a graph with 5 nodes.

2.4 Algorithm: Pseudo-code

We give pseudo-code, inspired by the version originally provided by the authors [7], in Algorithm 1.
Some lines of the pseudo-code carry trailing comments.

Notations used in Algorithm 1 (GHS):

• Any procedure named RESPONSE-X is automatically triggered when a message X is received
at a node.

• Procedure WAKE-UP can be triggered automatically if a node is sleeping.

• n is used as the default notation for the node at which the procedure executes.

• NodeState(n): State of the node n, enumerator variable with possible values SLEEPING,
FIND and FOUND. Initialized with SLEEPING.

• EdgeState(e): State of the edge e, enumerator variable with possible values BRANCH,
REJECTED and BASIC. Initialized with BASIC.

• Weight(e): Variable storing the weight of the edge e.

• ID(n): Variable storing the identity of the fragment which contains node n.

• Level(n): Variable storing the level of the fragment which contains node n.

• BestEdge(n): Variable storing the edge leading to the best-edge of the fragment which
contains node n. Used for tracing back from the core.

• BestWeight(n): Variable storing the weight of the best-edge of the fragment which
contains node n.

• TestEdge(n): Variable storing the outgoing edge on which a Test message has been sent. Set
back to nil after the response is received.

• InboundEdge(n): Variable storing the inbound edge which leads to the core of the
fragment.

• FindCount(n): Variable storing the count of Initiate messages sent by the node n. The node must
receive Report messages from all of these before reporting on its inbound edge.

[Figure 1, panels (a) to (h); the images are not reproduced here]

(a) Initial graph, all nodes are in sleeping state.

(b) Node B spontaneously wakes up, sends Connect message to A.

(c) Node A wakes up and connects with B. Initiate message with new <ID, Level> information is sent
to both. Node C also wakes up independently.

(d) A − B becomes the core. Nodes A and B send Test messages. Node E wakes up and merges with
Node C.

(e) C accepts the Test message since both have Level 1. D wakes up and sends the Connect message.
D rejects the Test message since level is lower (0 < 1).

(f) B responds with Initiate message to D. A and B report to each other and Change-Core message is
sent towards A.

(g) Change-Core reaches A (node with best-edge) and Initiate message with incremented ID is sent to
both fragments. D’s Test message will be rejected later.

(h) During the propagation of Initiate messages (not shown), inbound edge direction for E is
reversed. MST is formed since no outgoing edges.

Figure 1: Example run for GHS with a graph containing 5 nodes and 6 edges

2.5 Communication and Time Analysis

The authors show that the total number of messages is upper-bounded by 5V log₂ V + 2E, where
V is the number of nodes and E is the number of edges. The proof goes as follows. A Reject message
is sent at most once in each direction of an edge; hence, there are at most 2E Reject messages.
Each node sends at most one Accept, Initiate, Report, Connect and Test message unless the fragment
it belongs to changes its ID (and hence its level). Since there can be at most log₂ V level
changes, the number of messages other than Reject is bounded by 5V log₂ V.
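
To get a feel for the magnitude of this bound, it can be evaluated directly. The helper name `ghs_message_bound` is hypothetical; the formula is the one stated above.

```python
import math

def ghs_message_bound(num_nodes, num_edges):
    """Upper bound 5 V log2 V + 2 E on the number of GHS messages."""
    return 5 * num_nodes * math.log2(num_nodes) + 2 * num_edges
```

For a network with V = 8 nodes and E = 12 edges, the bound evaluates to 5 · 8 · 3 + 2 · 12 = 144 messages.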

The time complexity is shown to be O(V log V) time units. The proof relies on the fact that it
takes at most O(lV) time units until all nodes reach level l. This can be proved by induction on
the level, since the propagation of the cooperation signals within a fragment requires O(V) time
units. Since the level l is upper-bounded by log V, the total time is bounded by O(V log V).

Algorithm 1 GHS Algorithm (▷ marks a comment)

procedure WAKE-UP    ▷ Called at spontaneous waking of a node
    Level(n) := 0
    FindCount(n) := 0
    NodeState(n) := FOUND
    Find the adjacent edge m such that w(m) is minimum
    EdgeState(m) := BRANCH
    Send Connect(0) on m

procedure RESPONSE-INITIATE(<ID, Level>, State) received through edge j
    Level(n) := Level
    ID(n) := ID
    NodeState(n) := State
    InboundEdge(n) := j    ▷ Initiate message comes from the path leading to core edge
    BestEdge(n) := nil    ▷ Initially, no best-edge, set only after a valid best-edge found
    BestWeight(n) := ∞
    Send Initiate(<ID, Level>, State) on all adjacent edges (except j) of n which have
        state set as BRANCH    ▷ Broadcast along the branch edges
    if State = FIND then
        FindCount(n) := number of adjacent edges (except j) of n which have state set as BRANCH
        Find the adjacent edge m such that w(m) is minimum and EdgeState(m) = BASIC
        if there is no such edge m then    ▷ No edges to send Test to, good to go
            TestEdge(n) := nil
            Execute REPORT
        else
            TestEdge(n) := m
            Send Test(<ID(n), Level(n)>) on m

procedure RESPONSE-TEST(<ID, Level>) received through edge j
    if NodeState(n) = SLEEPING then
        Execute WAKE-UP
    if ID(n) = ID then    ▷ Same fragment
        Send Reject on j
        EdgeState(j) := REJECTED
    else if Level(n) ≥ Level then    ▷ Different fragment of sufficient level
        Send Accept on j
    else    ▷ Different fragment with Level(n) < Level
        Delay the response by placing the received message at the end of the queue

procedure REPORT    ▷ Checks if got response from all the branches as well as outgoing edges
    if FindCount(n) = 0 and TestEdge(n) = nil then
        NodeState(n) := FOUND    ▷ Now need to propagate the results back to core
        Send Report(BestWeight(n)) on InboundEdge(n)

procedure RESPONSE-REPORT(Weight) received through edge j
    if j ≠ InboundEdge(n) then
        FindCount(n) := FindCount(n) − 1
        if Weight < BestWeight(n) then
            BestWeight(n) := Weight
            BestEdge(n) := j
        Execute REPORT    ▷ Check if Report received from all branches
    else if NodeState(n) = FIND then    ▷ Received at core while own search is still running
        Delay the response by placing the received message at the end of the queue
    else if Weight > BestWeight(n) then
        Execute CHANGE-CORE    ▷ Received at core, best-edge lies on this side
    else if Weight = BestWeight(n) = ∞ then    ▷ No outgoing edges
        halt, MST found

procedure RESPONSE-ACCEPT received through edge j
    TestEdge(n) := nil    ▷ Good to go
    if Weight(j) < BestWeight(n) then    ▷ Check with branches who have already reported
        BestEdge(n) := j
        BestWeight(n) := Weight(j)
    Execute REPORT

procedure RESPONSE-REJECT received through edge j
    if EdgeState(j) = BASIC then
        EdgeState(j) := REJECTED
    Find the adjacent edge m such that w(m) is minimum and EdgeState(m) = BASIC
    if there is no such edge m then
        TestEdge(n) := nil
        Execute REPORT
    else
        TestEdge(n) := m
        Send Test(<ID(n), Level(n)>) on m

procedure CHANGE-CORE    ▷ Best-edge found, inform the node with best-edge
    if EdgeState(BestEdge(n)) = BRANCH then
        Send Change-Core on BestEdge(n)    ▷ Propagate
    else    ▷ Reached node with best-edge, send Connect
        Send Connect(Level(n)) on BestEdge(n)
        EdgeState(BestEdge(n)) := BRANCH

procedure RESPONSE-CHANGE-CORE received through edge j
    InboundEdge(n) := j
    Execute CHANGE-CORE    ▷ Recursive implementation

procedure RESPONSE-CONNECT(Level) received through edge j
    if NodeState(n) = SLEEPING then
        Execute WAKE-UP
    if Level(n) > Level then    ▷ Absorb the lower-level fragment
        EdgeState(j) := BRANCH
        Send Initiate(<ID(n), Level(n)>, NodeState(n)) on j
        if NodeState(n) = FIND then
            FindCount(n) := FindCount(n) + 1
    else if EdgeState(j) = BASIC then
        Delay the response by placing the received message at the end of the queue
    else    ▷ New core, note that Connect will be received by sending node as well
        Send Initiate(<Weight(j), Level(n) + 1>, FIND) on j

3 Awerbuch (1987)
In this section, we describe the classical article by Awerbuch [4] published in 1987. We give a
brief overview of the algorithm followed by the algorithm specifics.

3.1 Overview
The main contribution of this work over earlier work is a linear-time algorithm for finding the
Minimum Spanning Tree in an asynchronous network; the best previous algorithms had
Θ(E + V log V) message complexity and took Θ(V log V) time. The GHS algorithm described in the
previous section introduced the basic ideas and concepts, and the best previous algorithms were
given by Chin and Ting and by Gafni. These earlier algorithms are suboptimal in time by a factor of
Θ(log V). The reason is that small trees sometimes wait for bigger trees, and the resulting
waiting relations between trees form a complex combinatorial structure.

The improvement in performance of the MST algorithm in this paper is primarily due to two
innovations: Root Update and Test Distance. The algorithm consists of two stages: the Counting
stage, which computes the number of nodes in the network, and the Minimum Spanning Tree stage,
which uses this information to find the MST. Both are optimal in communication and time. The
Counting stage first finds some spanning tree and elects a leader in the network, which helps to
compute the number of nodes in the system.

In most distributed MST algorithms, a spanning forest of rooted trees is maintained, each tree
being a subtree of the MST. Initially, every tree consists of a single node. In the course of the
algorithm, each subtree tries to find its best edge (the minimum-weight edge among all edges
leading to other trees). The best edge is guaranteed to be in the MST, given that the weights are
unique. The tree then hooks itself onto the tree on the other side of that edge, becoming a
sub-tree of the bigger tree. Hooking is a sequence of manipulations of father pointers. Core edges
(two trees hooking onto each other) create a cycle of length two in the pointer graph; to break it,
the root with the bigger identity is unhooked, so that it becomes the root of the combined tree. A
naive implementation of this algorithm requires O(V²) messages and time; in the worst case, a tree
of size V/2 is hooked onto other trees V/2 times, each hooking requiring linear work.

The classical idea for improving this is the one used in Union-Find: always hook the smaller tree
onto the larger one, so that the tree containing a node at least doubles in size each time the
pointer of that node is changed. Since each node then undergoes at most log_2 V pointer changes, we
can achieve a communication complexity of O(E + V log V) and a time complexity of O(V log V),
provided we ensure that the best edge of a tree leads to a bigger or equal tree.
To achieve this, the previous paper introduces the technique of levels. The reason that its time
complexity is O(V log V) rather than linear is that there might be a chain of subtrees of the same
level (say l), each hooked onto the next one in the chain, resulting in a tree of level l + 1
regardless of the length of the chain. A tree of level 1 with V/2 nodes may thus be created, which
may undergo log V − 1 changes in level, each needing Ω(V) time. Chin, Ting and Gafni addressed this
problem by updating the level to the logarithm of the cardinality of the tree each time the best
edge is computed. However, the time complexity remained the same. The log V factor is due to the
fact that the level update of a long chain comes too late. Relaxing the minimum-weight requirement
can help to achieve a linear-time algorithm, because then, instead of hooking itself onto its
minimum-weight edge, each tree can hook itself onto an edge leading to a neighbouring tree of
maximum level. This is the main idea behind the Counting stage of our algorithm.
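
As a rough illustration of the doubling argument above, the following Python sketch merges singleton
trees in a random order, always hooking the smaller tree onto the larger one, and counts how often
each node ends up on the hooked side. The function name simulate_hooking and the random merge order
are assumptions made purely for illustration; this is a sequential accounting model, not a
distributed implementation of any of the algorithms discussed.

```python
import math
import random


def simulate_hooking(num_nodes: int, seed: int = 0) -> list[int]:
    """Merge singleton trees pairwise, hooking the smaller tree onto the larger.

    Returns, for every node, how many times its tree was the one being hooked,
    which is the quantity bounded by the doubling argument.
    """
    rng = random.Random(seed)
    members = {v: [v] for v in range(num_nodes)}  # root id -> nodes of its tree
    changes = [0] * num_nodes
    while len(members) > 1:
        a, b = rng.sample(sorted(members), 2)
        if len(members[a]) > len(members[b]):
            a, b = b, a  # make a the smaller (or equal) tree
        for v in members[a]:
            changes[v] += 1  # v now belongs to a tree at least twice as large
        members[b].extend(members.pop(a))
    return changes


# Every node is re-hooked at most log2(V) times, whatever the merge order.
worst = max(simulate_hooking(64))
assert worst <= math.log2(64)
```

Because the tree containing a node at least doubles whenever that node is counted, the assertion
holds for any seed and any number of nodes, not only for the values shown.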

3.2 Algorithm Specifics

The algorithm operates in two stages- Counting stage and MST stage. We first explain the MST
stage and then the Counting stage.

3.2.1 MST stage
The MST stage is performed in two phases. The first phase runs an algorithm similar to the GHS
algorithm [7] discussed above; the only difference is that it terminates once all trees reach a
size of Ω(V / log V).

The second phase brings new algorithmic ideas: levels are updated aggressively and accurately, so
that small trees are prevented from waiting for big trees, which speeds up the algorithm. The
Counting stage is needed in order to know the number of nodes V. The details of the algorithm are
as follows:

1) Root initialisation: In the course of the algorithm (second phase, MST stage), trees coalesce
and hence new nodes become roots of the resulting trees. As soon as a node r becomes the root, with
level l, of a tree T, it broadcasts an initialisation message containing the parameters (r, l) over
T; the message is further relayed onto trees that hook themselves onto T.
Upon receiving the initialisation message, an internal node j stores those parameters in the local
variables Level_j and Root_j, and starts executing a local search procedure.

2) Local candidate selection: The local search procedure tries to find the minimum-weight edge
outgoing from node j to a node i in a separate tree such that i's level is greater than l. To do
so, node j scans its incident edges one by one: it sends a special test message to the node k on the
other side of the edge and waits for k's reply. Three broad cases arise:
a) Root_k = r: k is in T. The reply is negative.
b) Level_k < l: k delays its response to the message until Level_k reaches l. If this level
increase at k is due to k's tree hooking onto T, then k will have Root_k = r, and hence the reply
is negative.
c) Level_k > l: Edge (j, k) is declared to be a local candidate for the best edge of T.

3) Best edge selection: The names of the local candidates are collected at the root. The root
waits until all nodes have received replies from all their neighbours and all possible candidates
have reached it. If there is no local candidate, the algorithm terminates, since the tree spans the
network and hence is the MST.
Otherwise, the root selects the minimum-weight candidate edge (v, w), with v being the internal
endpoint. The root sends a special message (the pointer-reversal message) to v, reversing all the
father pointers on the path from r to v, so that v becomes the new root. Two cases arise:
a) (v, w) is a core edge and v is its bigger endpoint: w hooks itself onto v and v becomes the root
of the resulting tree. The level of v increases by 1 and Root initialisation (step 1) is performed.
b) (v, w) is not a core edge, or v is not the bigger endpoint: v hooks itself onto w. T becomes a
subtree of the bigger tree.
If w has received an initialisation message, then v relays it over T, making T participate in the
best edge selection of the entire tree. Until best edge selection happens, the Test-Distance
procedure is iterated by v. This is where the aggressive update of levels takes place and where the
innovation of the paper lies. Upon each invocation of Test-Distance, node v sends an exploration
token to its father w. The token initially carries the counter value 2^{l(v)+1}. Upon arrival of
the token at a node, the node subtracts its number of sons from the counter. If the counter is
positive and the receiving node is not a root, that node forwards the token to its father. Thus,
moving up, either the counter becomes ≤ 0 and the token dies, or a positive counter reaches the
root. If the token is alive when it reaches the root, an acknowledgment is sent back from the root
to v, upon which v sends a special message over T that causes every node in T to increase its level
by 1. The Test-Distance procedure is then invoked again with the increased level, and this repeats
each time the token survives. Note that Test-Distance keeps taking place until a new root of the
tree is decided.

4) Root Update Procedure: This process is activated either when the initialisation message has
travelled a distance greater than 2^{m+1}, or when some node has detected more than 2^{m+1}
internal edges during local candidate selection, m being the level of the tree root. In either
case, best edge selection and the Test-Distance procedure are interrupted, the level of the root is
increased by 1, and the Root initialisation process is invoked again.

3.2.2 Algorithm: Pseudo-code

We give pseudo-code for phase 2 of the MST stage of the algorithm. Comments (beginning
with ▷) are also present for some lines of the pseudo-code.

Notations used in Algorithm 2 (MST (Phase 2)):

• Any procedure named RESPONSE-X is automatically triggered when a message X is received at a node.

• Procedure WAKE-UP can be triggered on a node at the start to initialise all the variables.

• n is used as the default notation for the node at which the procedure executes.

• Weight(e): Variable storing the weight of the edge e.

• ID: Variable storing the identity of n.

• Level: Variable storing the level of n.

• BestWeight(n): Variable storing the weight of the best edge of the fragment which contains node n.

• Root: Variable storing the root of the tree of n.

• count(best-edge): Count of the number of best edges.

• local-cand-arr: Array storing all the local candidate edges with their weights.

• parent: Variable storing the parent id of the node.

• V: Variable storing the total number of nodes.

Figure 2: MST Stage, Awerbuch's Algorithm. Panels:
(a) Broadcast: level l is reset or r is made root.
(b) Local candidate selection case 1: Root_j = Root_k, Level_j = Level_k.
(c) Local candidate selection case 2: Level_k < Level_j.
(d) Local candidate selection case 3: Level_k > Level_j, (j − k) stored as local candidate.
(e) Best candidate selection step 1: All local candidates sent up to root. Minimum edge v-w selected.
(f) Best candidate selection step 2 case 1: Level_v = Level_w (core-edge), ID_v = ID_w || Level_v < Level_w.
(g) Best candidate selection step 2 case 2: Level_v > Level_w || Level_v = Level_w (core-edge), ID_w > ID_v.
(h) Test distance step 1: Sum of degrees of nodes on w-root path < 2^{Level(v)+1}.
(i) Test distance step 2: Broadcast to increase level of all nodes of subtree of v.
(j) Root update.

Figure 3: Counting Stage: Link Search, Awerbuch's Algorithm

Algorithm 2 Phase 2, MST Stage (Awerbuch)
procedure WAKE-UP(id, V)    ▷ Called at each node at the start of the algorithm
    Level := 0
    Root := n
    count := dict    ▷ empty dictionary of counters
    local-cand-arr := []
    ID := id
    count(best-edge) := 0
    parent := id
    V := V

procedure ROOT-INITIALISATION    ▷ Called just after phase 1 of the MST stage, or after the level of the root or the root node is reset
    Broadcast Initiate(<ID, Level>, 0) over the tree

procedure RESPONSE-INITIALISATION(Initiate(<id, level>, val)) received through edge i
    if Initiate(<id, level>, val) received for the first time then
        Level := level
        Root := id
        val := val + 1
        if val > 2^{level+1} then
            Send Root-Update to father, to be forwarded along the path to the root
        Call procedure LOCAL-CANDIDATE-SELECTION
        Broadcast Initiate(<Root, Level>, val) over the tree    ▷ relay the root's parameters

procedure LOCAL-CANDIDATE-SELECTION
    no-internal-edge := 0
    for each edge k incident to n do
        Send Test-Message(<Root, Level>) to k
        Receive Response-Test-Message(<Reply>) from k
        if Reply > 0 then
            add <<n, k>, Weight(<n, k>)> to local-cand-arr
        else
            no-internal-edge := no-internal-edge + 1
            if no-internal-edge > 2^{Level+1} then
                Send Root-Update to father, to be forwarded along the path to the root
    end for
    Send Best-Edge<local-cand-arr> to parent

procedure BEST-EDGE-SELECTION(Best-Edge<local-cand-arr>) received from son i
    if Root != ID then
        Send Best-Edge<local-cand-arr> to parent, storing the path
    else
        if count of total number of best-edge arrays received = V − 1 then
            Select minimum edge v − w, v being the internal node
            Remove first node in path to v from its sons and set it as its parent
            Send Pointer-Reversal<v, w> to first node in path to v
        else
            count(best-edge) := count(best-edge) + 1

procedure RECEIVE-POINTER-REVERSAL(Pointer-Reversal<v, w>) received through edge i
    Add i to set of sons
    if ID != v then
        Set first node in path to v as parent
        Send Pointer-Reversal<v, w> to first node in path to v
    else
        Set father as empty
        Send Joining<Root, ID, Level> to w
        Receive <join> from w
        if join > 0 then    ▷ v hooks onto w
            Set w as father
            Call procedure TEST-DISTANCE(2^{Level+1}, v)
        else    ▷ w hooks onto v
            Add w to its list of sons
            Level := Level + 1
            Call procedure ROOT-INITIALISATION

procedure RECEIVE-JOINING(Joining<Root, ID, Level>) received from v
    if (Level == [Link] < [Link]) or (Level < [Link]) then    ▷ n hooks onto v
        Add father to list of sons
        Reverse the pointers along the path to the root so as to reset the father and son pointers
        Set v as father
        Send −1 to v
    else    ▷ v hooks onto n
        Add v to its list of sons
        Send 1 to v

procedure TEST-DISTANCE(val, v) called, or on receiving Test-Distance<val, v> through edge i
    if Root == ID and val > 0 then
        Send <Ack-Test> to v using the saved path
    else
        if val > 0 then
            temp := val − deg(n)    ▷ deg(n): number of sons of n
            Send Test-Distance<temp, v> to father

procedure TEST-DISTANCE-UPDATE(<Ack-Test>) received
    Level := Level + 1
    Broadcast an increase in level of 1 in the entire subtree
    Send Test-Distance<2^{Level+1}, v> to father

procedure ROOT-UPDATE(Root-Update) received through son i
    if Root == ID then
        Level := Level + 1
        Call procedure ROOT-INITIALISATION
    else
        Send Root-Update to its parent

3.2.3 Counting stage

The algorithm again maintains a forest of trees which ultimately combine to form a spanning tree.
The root of each tree is the leader of the tree. Initially, each node in the network forms a tree
consisting of one node. A level is maintained for each tree, intended to reflect the size of the
tree. The level of a tree containing a single node is 0. As the algorithm proceeds, the bigger-level
trees invade the smaller-level ones, capturing their nodes. Schematically, the algorithm has three
basic processes:
a) Link Search: This procedure is called after each increase in level. Each node scans its links in
increasing order of weight until it finds an edge leading to another tree of bigger or equal level;
this edge is called the feasible link of the node at that level.
Exploration messages are sent along the edges to implement the scanning. Links already known to be
internal are not scanned anymore. If a node of smaller level is detected in the course of scanning,
the bigger tree starts invading the smaller tree through that link, while at the same time the node
in the bigger tree continues its search for a feasible link, attempting to find a link to another
tree of bigger or equal level. Once a node finishes its search for a feasible link, it reports the
result of the search to the root of the tree. The reporting is done through a convergecast: a leaf
node sends its report as soon as it finishes the search, while an internal node does so after
receiving reports from all of its children. The report contains either the identity of the feasible
link together with the label of the tree on the other side of the link, or simply states that no
feasible link has been found. The root node collects such reports from all nodes of its tree,
including the nodes that have just been captured or are going to be captured.

b) Level Update: Whenever the time spent by the Link Search procedure is too high, it is
interrupted. Specifically, whenever a node is detected such that the sum of its height and degree
in the tree exceeds 2^{k+1}, where k is the level of the tree, the Link Search procedure is
interrupted.
The Level Update procedure succeeds only when the tree is not being absorbed by a bigger-level
tree, and aborts otherwise. It operates similarly to “two-phase commit” protocols: it locks the
nodes which have not been captured by some other tree. The locking takes place in two phases, each
involving one broadcast and one convergecast.
In the first broadcast, nodes are informed that the locking mechanism has started. A node receiving
the first broadcast is locked if it has not been invaded by another tree. Once a node is locked,
all incoming exploration messages are buffered, and they are processed immediately after the node
is unlocked.
This is followed by a convergecast, through which the leader finds out whether all locks have been
obtained. The locking succeeds if all the locks have been obtained. If it succeeds, the new level
is computed as the (integer) value of the logarithm of the number of nodes (cardinality) of the
tree.
The second broadcast informs all the nodes whether the locking was successful. If it was, each node
updates its level. In either case, the nodes become unlocked.
The second convergecast is needed for synchronisation, that is, to ensure that all the nodes have
completed the procedure.
In case Level Update aborts, the leader becomes inactive, with no additional procedure being
executed in its tree. This is because the leader can never again become the network leader, as its
tree is being absorbed by bigger-level trees. Upon termination of Level Update, either the level is
increased or the tree becomes inactive.
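
The outcome of Level Update can be summarised in a few lines of Python. This is a centralised
sketch of the decision only, with the two broadcasts and convergecasts collapsed into a single
function call; the name level_update and the boolean list invaded (one flag per node of the tree)
are hypothetical, and the "(integer) value of the logarithm" is assumed here to mean the floor of
the base-2 logarithm.

```python
from typing import Optional, Sequence


def level_update(invaded: Sequence[bool]) -> Optional[int]:
    """Return the new level of the tree, or None if Level Update aborts.

    invaded[i] is True when node i has already been invaded by another tree,
    in which case its lock cannot be obtained. The tree is assumed non-empty.
    """
    if any(invaded):
        return None                      # some lock is missing: the tree becomes inactive
    size = len(invaded)                  # cardinality counted by the convergecast
    return size.bit_length() - 1         # floor(log2(size)), assumed rounding


print(level_update([False] * 5))         # tree of 5 nodes, all locks obtained
print(level_update([False, True, False]))
```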

Thus, two events may take place: either Link Search executes uninterrupted, or the tree is invaded
by another tree. In the latter case, the tree leader is killed.
In the former case, if none of the tree nodes has found a feasible link, then the tree must cover
the entire network, and the algorithm terminates since a spanning tree has been found. The root is
declared the leader; its name is broadcast over the network and the total number of nodes is counted.
Otherwise, some feasible links have been found. Two cases arise:
1) If all feasible links lead to trees of the same level, then the feasible link with the minimum
weight is elected as the preferred link; the tree on the other side of that edge is called the
preferred tree.
2) If there exists a bigger-level tree on the other side, the tree becomes inactive.

c) Marriage Procedure: If the tree is active at this point, that is, if all feasible links lead to
trees of the same level, then the Marriage Procedure merges pairs of trees of the same level that
have the same preferred link. In each such pair, the tree with the bigger identity conquers the
tree with the smaller identity.
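
The decision taken by a root after an uninterrupted Link Search can be sketched as follows. The
function name after_link_search, the tuple layout (weight, level of the other tree, identity of
the other tree) and the string labels are assumptions for illustration; feasible links are, by
definition, links to trees of bigger or equal level.

```python
from typing import List, Optional, Tuple

Link = Tuple[int, int, int]  # (weight, level of other tree, id of other tree)


def after_link_search(level: int, feasible: List[Link]) -> Tuple[str, Optional[Link]]:
    """Decide what a tree of the given level does with its reported feasible links."""
    if not feasible:
        return ("terminate", None)       # the tree spans the whole network
    if any(other_level > level for _, other_level, _ in feasible):
        return ("inactive", None)        # a bigger-level tree exists on the other side
    # All links lead to trees of the same level: prefer the minimum-weight one.
    return ("marry", min(feasible))


print(after_link_search(2, [(5, 2, 7), (3, 2, 9)]))
```

Since edge weights are assumed distinct, taking the minimum over the tuples always selects the
link of minimum weight.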

4 Garay et al. (1998)

In this section, we describe the paper by Garay, Kutten and Peleg published in 1998 [8], which
presents a sub-linear time distributed algorithm for the MST problem. A lot of work on distributed
network algorithms has focused on achieving an O(n) time bound on an n-vertex distributed network.
Both the algorithms we have previously discussed try to achieve this “optimal” O(V) bound. In the
paper, the authors try to identify inherent graph parameters which accurately describe the behavior
of distributed MST construction, and then propose an optimal algorithm with a bound in terms of
these inherent graph parameters.

They identify the diameter of the graph as one such inherent parameter in the construction of an
MST, and present a distributed MST algorithm whose time complexity is sub-linear in V and linear in
Diam. The motivation for the work comes from the fact that there exist trivial O(Diam) algorithms
for various other important distributed network problems such as Leader Election and Breadth-First
Search tree construction. Another motivation is that in most real large-area networks Diam ≪ V, so
any such improvement would considerably improve performance in real-world distributed systems.

For the algorithm to execute within the stated time and message complexity, we need to make a few
assumptions. We retain all the assumptions of the GHS algorithm listed earlier. In addition, we
make the following assumptions:

1. The size of each message is bounded by O(log V).

2. A node may send at most one message on each edge in each time unit.

3. Edge weights are polynomial in V, so an edge weight can be sent in a single message.

With these assumptions, we move on to the working of the algorithm.

4.1 The Sub-linear Algorithm for MST Construction

The algorithm involves a lot of complications due to the detailed logistics, however, we first try to
give an overview of the working. With this in mind, we will build upon the specifics in the next
section.

4.1.1 Phase I: Controlled-GHS

Overview: Garay et al. give an innovative three-phase algorithm which is considerably more complex
than the algorithms we have previously discussed and involves a number of subtleties. The first
phase of the algorithm is the Controlled-GHS phase, a modified, controlled version of the GHS
algorithm. As with the basic GHS algorithm, Controlled-GHS starts simultaneously at all the nodes
with singleton fragments, and executes a total of I phases. Each of the I phases consists of the
following two stages:

1. In the first stage, the basic GHS algorithm is executed until the point where each fragment in
the network has found its minimum-weight outgoing edge (an outgoing edge of a fragment F being an
edge with one endpoint in F and the other at a node outside it). At the end of this stage, we get a
forest structure over the fragments, which is referred to as FF in the remainder of the report.

2. Each tree of the resulting forest is broken down into small trees of O(1) depth, and the merge
operation of the GHS algorithm is performed only on these small trees. The trees are broken down by
computing a dominating set M(T) on each tree T of the fragment forest FF, and then the merge
operation is carried out with each fragment F ∈ M picking one neighboring fragment F' ∉ M and
merging with it. Breaking down the trees in this stage ensures that the diameter of each fragment
remains small.

Small-Dom-Set Construction: A procedure Small-Dom-Set is used for computing a small dominating set
on each fragment. This procedure achieves the following goal. Given a tree T with vertex set V(T),
find a set of vertices M(T) such that:

1. M dominates V(T)
2. |M| ≤ |V(T)| / 2

The procedure is based on the following. For a vertex v in a tree T, let Child(v) denote the set
of v's children in T. We use a depth function L(v) on the nodes, defined as follows:

\[
L(v) =
\begin{cases}
0, & \text{if } v \text{ is a leaf,} \\
1 + \min_{u \in \mathit{Child}(v)} L(u), & \text{otherwise.}
\end{cases}
\]

Also, we denote the set of tree nodes at depth i by L_i. Now we can proceed to give the algorithm
of the procedure:

Algorithm: Small-Dom-Set
1. Mark the nodes of T with depth numbers L(v) = 0, 1, 2.

2. Select an MIS, Q, in the set R of unmarked nodes.

3. Then, M := Q ∪ L_1.
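
The three steps of Small-Dom-Set can be sketched sequentially in Python. This is only an
illustration under stated assumptions: the tree is given as a hypothetical children dictionary
with a designated root and at least two nodes, and the MIS is computed by a simple greedy scan in
increasing node order, which stands in for the distributed MIS algorithm used in the paper.

```python
from typing import Dict, List, Set


def small_dom_set(children: Dict[int, List[int]], root: int) -> Set[int]:
    """Return M = Q ∪ L_1 for the rooted tree described by `children`."""
    # Pre-order traversal; its reverse visits children before their parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    # Depth function: L(v) = 0 for a leaf, else 1 + min over the children.
    depth: Dict[int, int] = {}
    for v in reversed(order):
        kids = children.get(v, [])
        depth[v] = 0 if not kids else 1 + min(depth[u] for u in kids)

    parent = {u: v for v in order for u in children.get(v, [])}
    level_one = {v for v in order if depth[v] == 1}          # L_1
    unmarked = {v for v in order if depth[v] > 2}            # R

    # Greedy maximal independent set Q inside R (stand-in for a distributed MIS).
    q: Set[int] = set()
    for v in sorted(unmarked):
        neighbours = set(children.get(v, []))
        if v in parent:
            neighbours.add(parent[v])
        if not neighbours & q:
            q.add(v)
    return q | level_one


# Path 0 - 1 - ... - 7 rooted at node 0.
path = {i: [i + 1] for i in range(7)}
print(sorted(small_dom_set(path, 0)))  # [0, 2, 4, 6]
```

On this path, L_1 = {6}, the unmarked nodes are 0 to 4, and the greedy scan picks Q = {0, 2, 4}, so
every node is either in M or adjacent to a node of M, and |M| = 4 = |V(T)| / 2.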

Output of Controlled-GHS: For the computation of the dominating sets we use a distributed
implementation of Procedure Small-Dom-Set. The algorithm employs the distributed Maximal
Independent Set (MIS) algorithm of Panconesi and Srinivasan [?] for calculating the MIS.

It is important to note that the first phase of the algorithm achieves the following.
Lemma 4.1 In each phase of Controlled-GHS:
1. the number of fragments at least halves;

2. Diam(F) increases by a factor of at most 3, for every fragment F.

Lemma 4.2 When algorithm Controlled-GHS is activated for I phases, it takes O(3^I · 2^{√(log V)})
time, and yields up to N = V / 2^I fragments, of diameter at most d = 3^I.

The above results can be easily proved from the basic properties of the procedures of fragment
breakdown and small dominating set construction used in Controlled-GHS.

4.1.2 Phase II: Small Cycle Elimination on Fragment Graph

Overview: Improving the time bound of MST construction requires solving one key problem: since the
diameter of the MST of the network may be as large as Ω(V), we cannot afford to communicate over
the tree structure itself, as that would require Ω(V) time.
The algorithm solves this problem by deviating from the GHS algorithm at an appropriately chosen
point and switching to an algorithm which eliminates edges that are definitely not going to be part
of the MST.

Let FG denote the fragment graph that is the outcome of Phase I. The vertices of this graph are
the fragments constructed in Phase I, and its edges are all the inter-fragment edges. This graph
may contain cycles as well as multiple edges (from different nodes belonging to the same fragment)
connecting the same two fragments. The algorithm uses a complex procedure to identify and eliminate
these cycles.

Cycle Elimination procedure: For eliminating cycles, the procedure depends on the following
lemma:
Lemma 4.3 Given a weighted graph G = (V, E), if e is a bottleneck edge of G (that is, the heaviest
edge on some cycle of G), then e ∉ MST(G).
In each fragment F, one of the nodes is distinguished as the fragment's center r(F), which is also
considered the root of the fragment tree T(F). The procedure eliminates all short cycles of length
at most l, and also concentrates in r(F), via T(F), all the information pertaining to every other
fragment up to distance l from F.

The procedure starts by eliminating all cycles of length 2 and then goes on to eliminate all
cycles of length at most l. We first consider the elimination of cycles of length 2, since
extending the procedure to longer cycles is then much easier. The nodes of a fragment F collect
information on the edges connecting F to the adjacent fragments and send it upwards on the tree
T(F) to the center r(F). To do so, each node v of F creates a record Path(F') containing edge
information for each fragment F' ∈ FG adjacent to F. It is important to note that, out of all the
records concerning fragments adjacent to nodes in its subtree, v sends exactly one record for each
such fragment. It is easy to verify the following basic property of this pipelining policy.
Lemma 4.4 Each node v ∈ F sends to its parent exactly one record Path_l(F') for each fragment
F' that is adjacent to nodes in v's subtree in T(F); these records are sent up in increasing order
of fragment id.
For eliminating the remaining small cycles, the algorithm essentially repeats the procedure
described above for l phases.
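
For cycles of length 2, the elimination amounts to keeping only the lightest of the parallel edges
between each pair of fragments, since by Lemma 4.3 every heavier parallel edge is a bottleneck
edge. The Python sketch below shows this filtering centrally on a list of edges; the function name
eliminate_two_cycles and the tuple layout (weight, fragment id, fragment id) are assumptions for
illustration, and the sketch ignores the pipelined upcast through which the records actually reach
r(F).

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, int]  # (weight, fragment id, fragment id)


def eliminate_two_cycles(edges: List[Edge]) -> List[Edge]:
    """Keep, for every pair of adjacent fragments, only the lightest connecting edge."""
    lightest: Dict[Tuple[int, int], Edge] = {}
    for weight, a, b in edges:
        pair = (min(a, b), max(a, b))          # unordered pair of fragments
        if pair not in lightest or weight < lightest[pair][0]:
            lightest[pair] = (weight, a, b)    # heavier parallel edges are dropped
    return sorted(lightest.values())


# Fragments 0 and 1 are joined by two edges; only the one of weight 2 survives.
print(eliminate_two_cycles([(4, 0, 1), (2, 1, 0), (7, 1, 2)]))
```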

4.1.3 Phase III: Global edge Elimination

At the end of the second phase of the algorithm, we obtain a set of edges which contains all the
edges of the final MST, but also a small number of additional edges which were not eliminated in
Phase II. To remove the remaining edges and reduce the total number of edges to the required
V − 1, we follow these steps:
• Build a breadth-first search tree B on G, the original graph.
• From every fragment's center r(F), upcast the list of (uneliminated) external edges adjacent
to F on B.
• The final computation (elimination of edges) is performed centrally at B's root, which then
broadcasts the resulting MST to all nodes over the tree B.
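
The text does not say how the central computation at B's root is carried out. One possible way,
shown here purely as an assumption, is to run Kruskal's algorithm with a union-find structure over
the fragment ids on the upcast edges; the function name central_elimination and the tuple layout
(weight, fragment id, fragment id) are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (weight, fragment id, fragment id)


def central_elimination(num_fragments: int, edges: List[Edge]) -> List[Edge]:
    """Keep the inter-fragment edges that Kruskal's algorithm would select."""
    parent = list(range(num_fragments))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept: List[Edge] = []
    for weight, a, b in sorted(edges):     # scan edges by increasing weight
        ra, rb = find(a), find(b)
        if ra != rb:                       # edge joins two different components
            parent[ra] = rb
            kept.append((weight, a, b))
    return kept


# The heaviest edge of the triangle on fragments 0, 1, 2 is eliminated.
print(central_elimination(3, [(3, 0, 2), (1, 0, 1), (2, 1, 2)]))
```

The kept inter-fragment edges, together with the edges inside the fragments built in Phase I, would
then form the tree that the root broadcasts over B.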

4.2 Example
In Figure 4 we show a snapshot of the system running Controlled-GHS phase of the algorithm on
a distributed network. In Figure 5, we show the edge elimination procedure of the second stage of
the algorithm.

(a) Stage I: A particular fragment tree formed after merging of smaller fragments.
(b) Label each node on the fragment tree with their respective levels.
(c) Find dominating set by taking the union of the MIS and first level nodes in the fragment tree.
(d) Breaking down the fragments tree on the basis of their dominating set.

Figure 4: Example run for Stage I-Controlled GHS with a graph containing 15 nodes and 22 edges

Figure 5: Maximum weight edge elimination in the Fragment Graph. All small cycles are detected
and edges eliminated.

4.3 Message Passing and Time Complexity

The complexities of all three parts of our algorithm are as follows, for the given parameter I
specified in the first part:

Part I: 3^I · O(2^{√(log V)})
Part II: 3^I + (V / 2^I) · log V^2
Part III: Diam(G) + V / 2^I

To optimize the running time we choose I such that 3^I = V / 2^I, i.e. I = ln V / ln 6.
For this value of I we get a bound of O(Diam(G) + V^{0.614}) on the time complexity.
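
The choice of I and the resulting exponent can be checked with a short derivation, written here in
LaTeX. It only restates the balancing condition above; the numerical values of the logarithms are
rounded.

```latex
\begin{align*}
3^I = \frac{V}{2^I}
  \;&\Longleftrightarrow\; 6^I = V
  \;\Longleftrightarrow\; I = \frac{\ln V}{\ln 6}, \\
3^I = e^{I \ln 3}
  &= V^{\ln 3 / \ln 6}
  \approx V^{1.0986 / 1.7918}
  \approx V^{0.6131} \le V^{0.614}.
\end{align*}
```

With this I, the terms 3^I and V / 2^I are equal by construction, so both are bounded by V^{0.614}.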

5 Conclusion
In this paper, we discussed three distributed algorithms which solve the problem of finding the
Minimum Spanning Tree of a connected asynchronous network. In a classic work, Angluin [?] showed
that there exists no deterministic distributed algorithm to solve the MST problem with a bounded
number of messages if the distributed network graph has neither distinct edge weights nor distinct
node identifiers. Therefore, we assume that each edge is associated with a distinct weight known to
the adjacent nodes. Even though having distinct edge weights is not an essential requirement, we
assume this as it guarantees a unique MST in the network. All the algorithms also operate under the
condition that the size of messages is upper bounded by O(log V). With these assumptions, the
algorithms attempt to optimally solve the problem of finding an MST on a distributed network. All
three algorithms use ideas from the GHS algorithm, and additionally involve other complex
procedures and subtleties, to achieve optimal performance in terms of message and time complexity.
The classical algorithm by Gallager et al. has an optimal message complexity of O(E + V log V) but
a suboptimal running time of O(V log V). The algorithm by Awerbuch [3] achieved optimal running
time and communication complexity by breaking down the problem into three parts and solving the
sub-problems in three phases. The different phases represent a trade-off between the demands of the
initial part of the problem (involving large numbers of small fragments, where bounding the number
of messages is most important) and the last part (involving a small number of large fragments,
where we need to bound the running time). Finally, Garay et al. identified the diameter of the
graph as an inherent parameter in the construction of the MST and presented an algorithm whose time
complexity is sub-linear in V and linear in Diam. The motivation for the algorithm comes from the
fact that there are several O(Diam) algorithms for various other important network problems [11].

References
[1] Baruch Awerbuch. Complexity of network synchronization. Journal of the ACM (JACM),
32(4):804–823, 1985.
[2] Baruch Awerbuch. Reliable broadcast protocols in unreliable networks. Networks,
16(4):381–396, 1986.
[3] Baruch Awerbuch. Optimal distributed algorithms for minimum weight spanning tree, counting,
leader election, and related problems. In Proceedings of the Nineteenth Annual ACM Symposium on
Theory of Computing, pages 230–240. ACM, 1987.
[4] Baruch Awerbuch and Silvio Micali. Dynamic deadlock resolution protocols. In 27th Annual
Symposium on Foundations of Computer Science, pages 196–207. IEEE, 1986.
[5] Cüneyt F. Bazlamaçcı and Khalil S. Hindi. Minimum-weight spanning tree algorithms: a survey
and empirical study. Computers & Operations Research, 28(8):767–785, 2001.
[6] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik,
1(1):269–271, 1959.
[7] Robert G. Gallager, Pierre A. Humblet, and Philip M. Spira. A distributed algorithm for
minimum-weight spanning trees. ACM Transactions on Programming Languages and Systems (TOPLAS),
5(1):66–77, 1983.
[8] Juan A. Garay, Shay Kutten, and David Peleg. A sublinear time distributed algorithm for
minimum-weight spanning trees. SIAM Journal on Computing, 27(1):302–316, 1998.
[9] Joseph B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman
problem. Proceedings of the American Mathematical Society, 7(1):48–50, 1956.
[10] Navneet Malpani, Jennifer L. Welch, and Nitin Vaidya. Leader election algorithms for mobile
ad hoc networks. In Proceedings of the 4th International Workshop on Discrete Algorithms and
Methods for Mobile Computing and Communications, pages 96–103. ACM, 2000.
[11] David Peleg. Time-optimal leader election in general networks. Journal of Parallel and
Distributed Computing, 8(1):96–99, 1990.
[12] Robert Clay Prim. Shortest connection networks and some generalizations. Bell System
Technical Journal, 36(6):1389–1401, 1957.

Luhn Algorithm
3 pages
Class 7 First Term Exam
100% (5)
Class 7 First Term Exam
2 pages
Network Flow Optimization Techniques
No ratings yet
Network Flow Optimization Techniques
18 pages
MCA Curriculum Overview
No ratings yet
MCA Curriculum Overview
102 pages
800 Ds ML Courses 1674378827
No ratings yet
800 Ds ML Courses 1674378827
29 pages
Seminar PPT 002311002008
No ratings yet
Seminar PPT 002311002008
15 pages
Decidable and Recognizable Languages with Turing Machines
No ratings yet
Decidable and Recognizable Languages with Turing Machines
20 pages
Grade 5 Math Quarter 1 Summative Test
100% (1)
Grade 5 Math Quarter 1 Summative Test
1 page
Maharishi University of Management: CS 435 - Design and Analysis of Algorithms
No ratings yet
Maharishi University of Management: CS 435 - Design and Analysis of Algorithms
8 pages
Algorithms & Programming Concepts
No ratings yet
Algorithms & Programming Concepts
6 pages
Factorial Notation-Permutation
No ratings yet
Factorial Notation-Permutation
31 pages
Feb 2015 NCTM Calendar For Students
No ratings yet
Feb 2015 NCTM Calendar For Students
2 pages
3-2 Human Machine - Minimum Card
No ratings yet
3-2 Human Machine - Minimum Card
2 pages
1976 21erdos
No ratings yet
1976 21erdos
12 pages
Sequence and Series Worksheet 2
No ratings yet
Sequence and Series Worksheet 2
7 pages
HEC Computer Science
No ratings yet
HEC Computer Science
171 pages
DAA Paper MTE
No ratings yet
DAA Paper MTE
2 pages
Function and Recursion of Python
No ratings yet
Function and Recursion of Python
42 pages
MATLAB Root-Finding Methods Guide
No ratings yet
MATLAB Root-Finding Methods Guide
2 pages
Instantaneous Codes and Kraft Inequality
No ratings yet
Instantaneous Codes and Kraft Inequality
5 pages
Dynamic Programming
No ratings yet
Dynamic Programming
23 pages
Guessing Game: Linear vs. Binary Search
No ratings yet
Guessing Game: Linear vs. Binary Search
2 pages
Quadratic Equations Word Problems 20 Questions
No ratings yet
Quadratic Equations Word Problems 20 Questions
2 pages
5.E Graph Theory (Exercises)
No ratings yet
5.E Graph Theory (Exercises)
15 pages
Elementary Number Theory in Nine Chapters: Jamesj - Tattersall
No ratings yet
Elementary Number Theory in Nine Chapters: Jamesj - Tattersall
25 pages
DM Important Questions
No ratings yet
DM Important Questions
6 pages
AI Strategies in Tic-Tac-Toe
No ratings yet
AI Strategies in Tic-Tac-Toe
23 pages
GRT Unit1 2022 finalmoduleforITMS-01 PDF
No ratings yet
GRT Unit1 2022 finalmoduleforITMS-01 PDF
25 pages
Aptitude Book
No ratings yet
Aptitude Book
75 pages
MOMC S2 23 To 24
No ratings yet
MOMC S2 23 To 24
14 pages

An overview of distributed MST algorithms

Arkanath Pathak, Buddha Prakash, Utpal, Palash Mittal

1.1 Problem Definition

1.2 Types Of Solutions

1. Combining fragments: add edges until the MST is constructed.

2. Eliminating cycles: delete edges until only the MST is left.
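The two solution types listed above can be illustrated with a small sequential sketch. It is not the distributed algorithms discussed in this paper: `mst_by_combining` is a Borůvka-style fragment merge and `mst_by_eliminating` is a reverse-delete pass, both chosen here only to show the two ideas. The function names, the `(weight, u, v)` edge format, and the assumption of a connected simple graph on nodes `0..n-1` are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, int]  # (weight, u, v); hypothetical format for this sketch


def mst_by_combining(n: int, edges: List[Edge]) -> List[Edge]:
    """Combining fragments: every fragment adds its minimum outgoing edge."""
    parent = list(range(n))  # union-find forest, one fragment per node at start

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[Edge] = []
    while len(tree) < n - 1:
        best: Dict[int, Edge] = {}  # fragment root -> cheapest outgoing edge
        for e in edges:
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge is internal to a fragment
            for r in (ru, rv):
                if r not in best or e < best[r]:
                    best[r] = e
        if not best:
            break  # no outgoing edges left: the graph is disconnected
        for w, u, v in set(best.values()):
            ru, rv = find(u), find(v)
            if ru != rv:  # skip an edge whose fragments were merged this round
                parent[ru] = rv
                tree.append((w, u, v))
    return sorted(tree)


def _connected(n: int, edges: List[Edge]) -> bool:
    """Depth-first search from node 0 over the given edges."""
    adj: Dict[int, List[int]] = {i: [] for i in range(n)}
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == n


def mst_by_eliminating(n: int, edges: List[Edge]) -> List[Edge]:
    """Eliminating cycles: drop the heaviest edge whose removal keeps the graph connected."""
    kept = sorted(edges, reverse=True)  # heaviest first
    for e in list(kept):
        trial = [x for x in kept if x != e]
        if _connected(n, trial):
            kept = trial  # e lay on a cycle, so it can be deleted
    return sorted(kept)


example = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
print(mst_by_combining(4, example))
print(mst_by_eliminating(4, example))
```

On a graph with distinct weights both functions return the same edge set, since the MST is then unique.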

2 Gallager et al. (1983) - GHS Algorithm

• The weights of the edges in the network are mutually distinct.

With these assumptions, we turn to the working of the algorithm.
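When the edge weights of a network are not already distinct, a common remedy is to break ties with the identities of the endpoints. The sketch below shows one way to do this, assuming every node has a unique, comparable ID and that there is at most one edge between any pair of nodes. The helper name `distinct_key` is hypothetical and does not come from the paper.

```python
from typing import Tuple


def distinct_key(weight: float, u: int, v: int) -> Tuple[float, int, int]:
    """Return a totally ordered key for the undirected edge {u, v}.

    Assumption: node IDs are unique and comparable, and the graph is simple,
    so two different edges never share the same (weight, lo, hi) triple.
    """
    lo, hi = (u, v) if u <= v else (v, u)  # both endpoints compute the same key
    return (weight, lo, hi)


# Two edges of equal weight are now strictly ordered.
print(distinct_key(5, 1, 2) < distinct_key(5, 1, 3))
```

Both endpoints of an edge derive the same key locally, so no extra messages are needed to agree on the ordering.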

2.2 Algorithm: Specifics

We now formalize these details by giving pseudo-code for the algorithm.

2.4 Algorithm: Pseudo-code

Notation used in Algorithm 1 (GHS):

• Any procedure named RESPONSE-X is triggered automatically when a message X is received.

• Procedure WAKE-UP can be triggered automatically if a node is sleeping.

• Weight(e): Variable storing the weight of the edge e.

2.5 Communication and Time Analysis

3.2 Algorithm Specifics

3.2.2 Algorithm: Pseudo-code

Notation used in Algorithm 2 (MST(Phase 2)):

• Any procedure named RESPONSE-X is triggered automatically when a message X is received.

• Weight(e): Variable storing the weight of the edge e.

• ID: Variable storing the identity of n.

• Level: Variable storing the level of n.

• Root: Variable storing the root of the tree of n.

• count(best-edge): Count of the number of best-edges.

• parent: Variable storing the ID of the parent of n.

• V: Variable storing the total number of nodes.

(c) Local candidate selection case 2: Level_k < Level_j

(f) Best candidate selection step 2 case 1:

Figure 2: MST Stage, Awerbuch’s Algorithm

Figure 3: Counting Stage: Link Search, Awerbuch’s Algorithm

3.2.3 Counting stage

4 Garay et al. (1998)

With these assumptions, we turn to the working of the algorithm.

4.1 The Sub-linear Algorithm for MST Construction

4.1.1 Phase I: Controlled-GHS

Small-Dom-Set Construction: A Procedure Small-Dom-Set is used for computing a small

2. Select an MIS, Q, in the set R of unmarked nodes;

2. Diam(F) increases by a factor of at most 3, for every fragment F.

4.1.2 Phase II: Small Cycle Elimination on Fragment Graph

4.1.3 Phase III: Global Edge Elimination

4.3 Message Passing and Time Complexity
