0% found this document useful (0 votes)
91 views21 pages

Kruskal's Algorithm: Melanie Ferreri

The document defines key terms related to graphs and spanning trees. It provides motivation for finding minimum spanning trees by describing an application to connecting villages with roads at minimum cost. It then presents Kruskal's algorithm for finding a minimum spanning tree in a weighted graph by sequentially adding edges in order of non-decreasing weight that do not form cycles. The algorithm is proved to always produce a minimum spanning tree. An example applies the algorithm to find the minimum spanning tree of a sample graph.

Uploaded by

Work 133
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
91 views21 pages

Kruskal's Algorithm: Melanie Ferreri

The document defines key terms related to graphs and spanning trees. It provides motivation for finding minimum spanning trees by describing an application to connecting villages with roads at minimum cost. It then presents Kruskal's algorithm for finding a minimum spanning tree in a weighted graph by sequentially adding edges in order of non-decreasing weight that do not form cycles. The algorithm is proved to always produce a minimum spanning tree. An example applies the algorithm to find the minimum spanning tree of a sample graph.

Uploaded by

Work 133
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

Definitions and Motivation

The Algorithm
Time Complexity
References

Kruskal’s Algorithm

Melanie Ferreri

February 8, 2021

Melanie Ferreri Kruskal’s Algorithm 1 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Preliminary Definitions
Definition
A graph G = (V, E) consists of a finite set V of vertices, and a set E
of two-element subsets of V , which represent edges between
vertices. We sometimes denote the vertex set of G by V (G) and the
edge set by E(G). A weighted graph is a graph whose edges have
each been assigned a numerical weight. The weight of a graph is the
sum of the weights of its edges.

Example (A weighted graph G)


b 4
3 a
2
c 5
1
1 e
d
Melanie Ferreri Kruskal’s Algorithm 2 / 21
Definitions and Motivation
The Algorithm
Time Complexity
References

Preliminary Definitions

Definition
A tree is an acyclic connected graph (there is a path between any
pair of vertices, and there are no cycles).

A spanning tree of a graph G is a spanning subgraph H (the vertex


set of H is the same as V (G)) such that H is a tree.

A minimum spanning tree (MST) of G is a spanning tree whose


weight is minimum among all possible spanning trees of G.

Melanie Ferreri Kruskal’s Algorithm 3 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Examples of spanning trees


Our G from earlier was:
b 4
3 a
2
c 5
1
1 e
d
A spanning tree and minimum spanning tree of G are as follows:
b 4 b
3 a 3 a
2
c 5 c
1
1 e 1 e
d d
Melanie Ferreri Kruskal’s Algorithm 4 / 21
Definitions and Motivation
The Algorithm
Time Complexity
References

Some Facts

MST’s do not have to be unique.


Every connected graph G has a spanning tree:
If G is already a tree, then it is its own spanning tree. If G has
cycles, then we can remove edges from cycles in a way that
keeps G connected, until there are no more cycles, and we end
up with a spanning tree.

Melanie Ferreri Kruskal’s Algorithm 5 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Motivation for Kruskal’s algorithm

Suppose we have some set of villages, and we want to connect them


by roads in the least expensive way possible.

Some roads would be more expensive to construct than others,


based on the land they would go through, the labor required, etc.

We can view villages as vertices, and roads as edges, weighted by


the costs of building them.

The cheapest way to ensure that we can get from any village A to any
village B is by building roads in the shape of a minimum spanning
tree.

Melanie Ferreri Kruskal’s Algorithm 6 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Kruskal’s Algorithm

Start with a connected weighted graph G.


Sort the edges of G by weight in increasing order.
Let T be the graph whose vertex set is V (G) and whose edge
set is intialized to empty.
Let e1 be any edge of minimum weight in E(G), and add it to
E(T ). Then, let e2 be any remaining edge of minimum weight,
and add that to E(T ).
Let e3 be a remaining edge of minimum weight which does not
form a cycle with the already chosen edges. Add e3 to E(T ).
Keep adding minimum-weight edges that don’t form a cycle, until
we obtain a spanning tree T .

Melanie Ferreri Kruskal’s Algorithm 7 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Kruskal’s Algorithm Works

Theorem
Kruskal’s algorithm produces a minimum spanning tree in a
connected weighted graph.

Proof.
Let G be a connected weighted graph with n vertices. Let T be a
spanning tree obtained by Kruskal’s algorithm, where
E(T ) = {e1 , e2 , . . . , en−1 } ordered by selection in constructing the
tree. Let w(ei ) be the weight of ei , so w(e1 ) ≤ w(e2 ) ≤ · · · ≤ w(en−1 ),
n−1
X
and w(T ) = w(ei ).
i=1
We want to show that T is a MST. Suppose to the contrary that T is
not minimum, and let H be a MST which has the most possible edges
in common with T among all MST’s.

Melanie Ferreri Kruskal’s Algorithm 8 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Kruskal’s Algorithm Works


Proof (continued).
Since H 6= T , there is some edge on T that is not on H. Let ei be the
first such edge in E(T ). So for all j < i, ej is in both H and T .
Define G0 = H + ei . Since H is a tree and we are adding an edge,
this produces a cycle, so G0 has a cycle C. T has no cycle, so there
must be some edge e0 on C that is not on T . Then let T0 = G0 − e0 ,
so T0 is another spanning tree of G. Also,
w(T0 ) = w(H) + w(ei ) − w(e0 ). Since H is a MST, w(H) ≤ w(T0 );
thus w(H) ≤ w(H) + w(ei ) − w(e0 ), so w(e0 ) ≤ w(ei ). By Kruskal’s
algorithm, w(e0 ) = w(ei ) if i = 1. If i > 1, then ei is an edge that can
be added to {e1 , . . . , ei−1 } without producing a cycle, but also e0 can
be added without producing a cycle. So by Kruskal’s algorithm,
w(ei ) ≤ w(e0 ) since edges are chosen in order of weight. Thus for
i > 1, w(ei ) = w(e0 ) as well. So w(T0 ) = w(H), so T0 is a MST, and
T0 has more edges in common with T than H does, which is a
contradiction. 
Melanie Ferreri Kruskal’s Algorithm 9 / 21
Definitions and Motivation
The Algorithm
Time Complexity
References

Example

We will use Kruskal’s algorithm to find a minimum spanning tree of


the following graph:
a
4 7

b c
2 1

d
7 3
2
e
6 1
5
f g

Melanie Ferreri Kruskal’s Algorithm 10 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Example (continued)

We start out with the graph T having no edges, then we add the first
two edges of minimum weight.
a a

b c b 1 c
d d
e e 1
f g → f g

Melanie Ferreri Kruskal’s Algorithm 11 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Example (continued)

Then, we add the next-lowest weighted edges, making sure to only


include them if they do not produce a cycle.
a a

b 1 c b 2 1 c
d d
2 2
e 1 e 1
f g → f g

Melanie Ferreri Kruskal’s Algorithm 12 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Example (continued)

The next-lowest weighted edge is between c and g with weight 3, but


if we add it, it would make a cycle, so we go with the edge of weight 4
instead.

4 a 4 a
b 2 1 c b 2 1 c
d d
2 2
e 1 e 1
5
f g → f g

Now T is connected, and we have a minimum spanning tree of weight


15.

Melanie Ferreri Kruskal’s Algorithm 13 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

What contributes to the time complexity of Kruskal’s


algorithm?

The major contributors to the complexity of Kruskal’s algorithm are:


Sorting the edges by weight
Checking whether a potential new edge forms a cycle with the
previously selected edges

Suppose |V (G)| = n and |E(G)| = m.

We can sort the edges with MergeSort, which takes O(m log m) time.
For checking for cycles, we can use a Union Find data structure to
manage the edges of the MST.

Melanie Ferreri Kruskal’s Algorithm 14 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Union Find

A Union Find or "disjoint sets" data structure manages a list of objects


that is partitioned into connected subsets.

In the case of creating a spanning tree, we want each new edge to


connect two separate connected components of the in-progress MST
T . (If this is not the case, then we would be adding an edge to an
already connected component, thereby creating a cycle.)

So we can set up the vertices of the graph in a Union Find structure


so that we know which pieces of the graph are connected to one
another, and avoid adding edges between vertices which are already
connected.

Melanie Ferreri Kruskal’s Algorithm 15 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

How Union Find works

We start out by making an array Arr[] of size n. Arr[i] will keep


track of the parent of vertex i. We will use this array to build trees
which represent the connected components of the graph.

Union Find has two operations: union, and find. When we call
Union(a,b), we set vertex b as the parent of vertex a, so Arr[a] =
b. The Find operation tells us whether two vertices are connected,
by checking if they are in the same tree. This is done by following up
the parents of each vertex until we reach a vertex which is its own
parent - this is the root of the tree. If two vertices are in the same tree,
we will get the same root, and know they are connected.

Melanie Ferreri Kruskal’s Algorithm 16 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Example of Union Find


We will number the vertices by alphabetical order (so a = 0, b = 1,
etc). Our array starts out with A[i] = i for all i. Then when we add an
edge to T , we make one of the vertices on that edge a parent of the
other. At the third step of our last example, we had this:
a

b 1 c
d
2
e 1
f g

So Arr[] looked something like [ 0 1 2 2 3 5 4 ], where c is


the parent of d, d is the parent of e, and e is the parent of g, while a, b,
c and f are their own parents.
Melanie Ferreri Kruskal’s Algorithm 17 / 21
Definitions and Motivation
The Algorithm
Time Complexity
References

Example of Union Find (continued)

b 1 c
d
2
e 1
f g

Then when we wanted to add the next-smallest edge of weight 3


between c and g, we called Find(2,6) and it returned true since c
and g are both in the tree rooted at c, so they are connected. Thus we
didn’t add that edge and moved ahead in our sorted list.

Melanie Ferreri Kruskal’s Algorithm 18 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Overall Complexity

There is a variation on union find which builds the trees in a balanced


way, by comparing the sizes of two subsets before unioning them to
determine how to join them in the data structure. Without a balanced
tree, the union and find operations can take O(n) time, but with a
balanced tree, this is reduced to O(log n).

At each step of Kruskal’s algorithm, we have to use the union and find
operations in order to check if our potential new edge will create a
cycle. We will have to iterate through at most all m edges of G, which
contributes m ∗ O(log n) = O(m log n) time. We also know sorting the
edges by weight takes O(m log m) time, so in total we have time
complexity O(m log m + m log n). There can be at most n2 edges of
G; thus log(m) is at most 2 log(n), so O(log m) = O(log n) in this case.
Thus, the total time complexity of Kruskal’s algorithm is O(m log m).

Melanie Ferreri Kruskal’s Algorithm 19 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

Applications and Conclusion

In conclusion, Kruskal’s algorithm runs in O(m log m) or O(m log n)


time (they are on the same order of magnitude for simple graphs).

This algorithm has applications in network design and other problems


like cluster analysis. It can also be used to approximate a solution to
the traveling salesperson problem (finding a shortest path which visits
each vertex at least once).
Thank you for listening!

Melanie Ferreri Kruskal’s Algorithm 20 / 21


Definitions and Motivation
The Algorithm
Time Complexity
References

References

Applications of minimum spanning tree problem. https://www.geeksforgeeks.


org/applications-of-minimum-spanning-tree/, a.
Kruskal’s minimum spanning tree algorithm | greedy algo-2.
https://www.geeksforgeeks.org/
kruskals-minimum-spanning-tree-algorithm-greedy-algo-2/, b.
Disjoint set (or union-find) | set 1 (detect cycle in an undirected graph).
https://www.geeksforgeeks.org/union-find/, c.

Omar Khaled Abdelaziz Abdelnabi. Minimum spanning tree.


https://www.hackerearth.com/practice/algorithms/graphs/
minimum-spanning-tree/tutorial/.
G. Chartrand and P. Zhang. A First Course in Graph Theory. Dover Books on
Mathematics. Dover Publications, 2013. ISBN 9780486297309. URL
https://books.google.com/books?id=zA_CAgAAQBAJ.
Prateek Garg. Disjoint set union (union find). https://www.hackerearth.com/
practice/notes/disjoint-set-union-union-find/.

Melanie Ferreri Kruskal’s Algorithm 21 / 21

You might also like