lecture19

The document discusses Kruskal's and Prim's algorithms for finding the Minimum Spanning Tree (MST) of a graph, detailing their methodologies and properties. It explains the concepts of disjoint sets and the cycle and cut properties that underpin these algorithms. Additionally, it provides announcements related to course projects and exercises, emphasizing the importance of understanding these algorithms in the context of data structures and algorithms coursework.
Lecture 19: Disjoint Sets
CSE 373: Data Structures and Algorithms

CSE 373 2` SP – CHAMPION 1


Warmup
Run Kruskal’s algorithm on the graph below to find its MST (minimum spanning tree).
Recall the definition of a minimum spanning tree: a minimum-weight set of edges such that you can get from
any vertex of the graph to any other using only those edges. This set of edges forms a valid tree in the graph.
Below is the provided pseudocode for Kruskal’s algorithm to choose all the edges.

PollEv.com/373lecture

KruskalMST(Graph G)
    initialize each vertex to be an independent component
    sort the edges by weight
    foreach(edge (u, v) in sorted order){
        if(u and v are in different components){
            add (u,v) to the MST
            update u and v to be in the same component
        }
    }

CSE 373 SP 18 - KASEY CHAMPION 2


Announcements
P3 due Wednesday May 12th
P4 comes out Wednesday- due in 3 weeks on Wednesday June 2nd
- last project!

E3 came out on Friday – due this Friday May 14th
- two more exercises coming

Minimum Spanning Trees

CSE 373 20 SP – CHAMPION & CHUN 4


Review Minimum Spanning Trees (MSTs)
A Minimum Spanning Tree for a graph is a set of that graph’s edges that connect all of that
graph’s vertices (spanning) while minimizing the total weight of the set (minimum)
- Note: does NOT necessarily minimize the path from each vertex to every other vertex
- Any tree with V vertices will have V-1 edges
- A separate entity from the graph itself! More of an “annotation” applied to the graph, just like a Shortest Paths
Tree (SPT)


[Figure: example graph on vertices A through F with edge weights 1 through 11 and its Minimum Spanning Tree highlighted]

Review Why do MST Algorithms Work?
Two useful properties for MST edges. We can think about them from either perspective:
- Cycle Property: The heaviest edge along a cycle is NEVER part of an MST.
- Cut Property: Split the vertices of the graph into any two sets A and B. The lightest edge between A and B is
ALWAYS part of an MST. (Prim’s thinks this way)

Whenever you add an edge to a tree you create exactly one cycle. Removing any edge
from that cycle gives another tree!
This observation, combined with the cycle and cut properties, forms the basis of all of the
greedy algorithms for MSTs.
- greedy algorithm: chooses best known option at each point and commits, rather than waiting for a global
view of the graph before deciding
Review Adapting Dijkstra’s: Prim’s Algorithm
• Normally, Dijkstra’s checks for a shorter path from the start.
• But MSTs don’t care about individual paths, only the overall weight!
• New condition: “would this be a smaller edge to connect the current known set to the rest of the graph?”

prims(G graph, V start)    (adapted from dijkstraShortestPath)
    Map edgeTo, distTo;
    initialize distTo with all nodes mapped to ∞, except start to 0
    PriorityQueue<V> perimeter; perimeter.add(start);

    while (!perimeter.isEmpty()):
        u = perimeter.removeMin()
        known.add(u)
        for each edge (u,v) to unknown v with weight w:
            oldDist = distTo.get(v)  // previous smallest edge to v
            newDist = w              // is this a smaller edge to v? (Dijkstra’s would use distTo.get(u) + w)
            if (newDist < oldDist):
                distTo.put(v, newDist)
                edgeTo.put(v, u)
                if (perimeter.contains(v)):
                    perimeter.changePriority(v, newDist)
                else:
                    perimeter.add(v, newDist)

[Figure: a KNOWN set containing A, B, and C, with candidate edges of different weights leading to an unknown vertex X]
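
As a rough illustration of the Prim’s pseudocode above, here is a minimal Python sketch. It is not the course’s implementation: the adjacency-dict format, the name prim_mst, and the small example graph are assumptions made for this example. Python’s heapq has no changePriority, so the sketch pushes a fresh entry and skips stale ones instead.

```python
import heapq

def prim_mst(graph, start):
    """Sketch of Prim's algorithm.

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs,
    with every undirected edge listed in both directions (assumed format).
    Returns (edge_to, total_weight) for the component containing start.
    """
    known = set()
    edge_to = {}               # edge_to[v] = vertex that v was connected from
    dist_to = {start: 0}       # lightest known edge weight into each vertex
    perimeter = [(0, start)]   # min-heap of (edge weight, vertex)

    while perimeter:
        _, u = heapq.heappop(perimeter)
        if u in known:
            continue           # stale entry: u was already added via a lighter edge
        known.add(u)
        for v, w in graph[u]:
            # Prim's compares the edge weight alone, not dist_to[u] + w.
            if v not in known and w < dist_to.get(v, float("inf")):
                dist_to[v] = w
                edge_to[v] = u
                heapq.heappush(perimeter, (w, v))
    return edge_to, sum(dist_to.values())


# Hypothetical 4-vertex graph, not the one pictured on the slide.
example = {
    "A": [("B", 1), ("C", 3)],
    "B": [("A", 1), ("C", 2)],
    "C": [("A", 3), ("B", 2), ("D", 4)],
    "D": [("C", 4)],
}
edge_to, total = prim_mst(example, "A")
```

Here C is first reached from A with weight 3, then re-attached through B once the lighter edge of weight 2 is seen, which is the “smaller edge to the known set” condition in action.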
A Different Approach
Suppose the MST in the figure was produced by Prim’s.
Observation: We basically chose all the smallest edges in the entire graph (1, 2, 3, 4, 6)
- The only exception was 5. Why shouldn’t we add edge 5?
- Because adding 5 would create a cycle, and to connect A, C, & D we’d rather choose 1 & 4 than 1 & 5 or 4 & 5.

Prim’s thinks “vertex by vertex”, but what if you think “edge by edge” instead?
- Start with the smallest edge in the entire graph and work your way up
- Add the edge to the MST as long as it connects two new groups (meaning don’t add any edges that would create a cycle)

Building an MST “edge by edge” in this graph:
• Add edge 1
• Add edge 2
• Add edge 3
• Add edge 4
• Skip edge 5 (would create a cycle)
• Add edge 6
• Finished: all vertices in the MST!

[Figure: the example graph on vertices A through F with edge weights 1 through 11 and its MST highlighted]
Kruskal’s Algorithm “islands”

This “edge by edge” approach is how Kruskal’s Algorithm works!

• Key Intuition: Kruskal’s keeps track of isolated “islands” of vertices (each is a sub-MST)
- Start with each vertex as its own “island”
- If an edge connects two vertices within the same “island”, it forms a cycle! Discard it.
- If an edge connects two vertices in different “islands”, add it to the MST! Now those “islands” need to be combined.

kruskalMST(G graph)
    Set(?) msts; Set finalMST;
    initialize msts with each vertex as single-element MST
    sort all edges by weight (smallest to largest)

    for each edge (u,v) in ascending order:
        uMST = msts.find(u)
        vMST = msts.find(v)
        if (uMST != vMST):
            finalMST.add(edge (u, v))
            msts.union(uMST, vMST)

[Figure: animation over the example graph (vertices A through F, edge weights 1 through 11) showing the islands merging as each edge is considered]
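
The “islands” bookkeeping can be sketched in Python without any special data structure, at the cost of a slow merge step. This is only an illustration: the function name, the edge-tuple format, and the edge list are assumptions. The endpoints below are hypothetical, chosen so that edges 1, 2, 3, 4, and 6 are added and edge 5 is skipped, as in the walkthrough; they are not read off the slide’s figure.

```python
def kruskal_mst(vertices, edges):
    """Sketch of Kruskal's algorithm with naive island labels.

    edges: list of (weight, u, v) tuples (assumed format).
    Returns the chosen edges in the order they were added.
    """
    island = {v: v for v in vertices}   # each vertex starts as its own island
    final_mst = []
    for w, u, v in sorted(edges):       # ascending by weight
        if island[u] != island[v]:      # different islands: no cycle
            final_mst.append((w, u, v))
            old, new = island[u], island[v]
            for x in vertices:          # merge: relabel every member of u's island
                if island[x] == old:
                    island[x] = new
        # same island: the edge would close a cycle, so it is discarded
    return final_mst


# Hypothetical endpoints for edges of weight 1 through 8.
vertices = ["A", "B", "C", "D", "E", "F"]
edges = [
    (1, "A", "C"), (2, "B", "E"), (3, "A", "B"), (4, "C", "D"),
    (5, "A", "D"), (6, "E", "F"), (7, "C", "E"), (8, "D", "F"),
]
mst = kruskal_mst(vertices, edges)
chosen_weights = [w for w, _, _ in mst]
```

The relabeling loop touches every vertex on each merge, which is exactly the cost that motivates a dedicated structure for find and union.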
Prim’s Demos and Visualizations
Dijkstra’s Algorithm: Dijkstra’s proceeds radially from its source, because it chooses edges by path length from source
Prim’s Algorithm: Prim’s jumps around the graph (the perimeter), because it chooses edges by edge weight (there’s no source)

Kruskal’s Demos and Visualizations
Prim’s Algorithm: Prim’s jumps around the graph (the perimeter), because it chooses edges by edge weight (there’s no source)
Kruskal’s Algorithm: Kruskal’s jumps around the entire graph, because it chooses from all edges purely by edge weight (while preventing cycles)
Selecting an ADT
Kruskal’s needs to find what MST a vertex belongs to, and union those MSTs together
- Our existing ADTs don’t lend themselves well to “unioning” two sets…
- Let’s define a new one!

kruskalMST(G graph)
    Set(?) msts; Set finalMST;
    initialize msts with each vertex as single-element MST
    sort all edges by weight (smallest to largest)

    for each edge (u,v) in ascending order:
        uMST = msts.find(u)
        vMST = msts.find(v)
        if (uMST != vMST):
            finalMST.add(edge (u, v))
            msts.union(uMST, vMST)

[Figure: the example graph on vertices A through F with edge weights 1 through 11]
Disjoint Sets ADT (aka “Union-Find”)
Kruskal’s will use a Disjoint Sets ADT under the hood
- Conceptually, a single instance of this ADT contains a “family” of sets that are disjoint (no element belongs to multiple sets)

DISJOINT SETS ADT

State
Family of Sets
• disjoint: no shared elements
• each set has a representative (either a member or a unique ID)

Behavior
makeSet(value) – new set with value as only member (and representative)
find(value) – return representative of the set containing value
union(x, y) – combine sets containing x and y into one set with all elements, choose single new representative

kruskalMST(G graph)
    DisjointSets<V> msts; Set finalMST;
    initialize msts with each vertex as single-element MST
    sort all edges by weight (smallest to largest)

    for each edge (u,v) in ascending order:
        uMST = msts.find(u)
        vMST = msts.find(v)
        if (uMST != vMST):
            finalMST.add(edge (u, v))
            msts.union(uMST, vMST);

[Figure: the example graph on vertices A through F with edge weights 1 through 11]
Project 4: Mazes!
You find yourself trapped in the Labyrinth of Greek
legend – bummer!
How do you solve a maze?
- If we could model a maze as a graph, we’d just need an algorithm
to find a path from s to t… Maybe even the shortest path?

How do you generate a maze?


- We’d love an algorithm that is guaranteed to connect s to t
(spanning), but only produces one path from s to t (tree)…
Project 4: Mazes
It turns out that randomizing the weights of a graph and then computing the MST is a
fantastic way to generate mazes!
In P4, you’ll do both: Implement Dijkstra’s to solve an arbitrary maze, then implement
Kruskal’s (and a Disjoint Set) to generate those mazes
This project is really application-heavy!
- Graphical User Interface (GUI) for viewing mazes and solving them
- Significantly more starter code than past projects, to give you practice integrating with an existing codebase
- A major part of the challenge in P4 is reading through the starter code to understand what you need to
interface with! Don’t underestimate the time that takes.

2* week project, and 2* weeks worth of work. It’s never been more important to start early!
- You really have 3 weeks because of Thanksgiving in the middle, but don’t let that fool you!
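
To make the maze-generation idea above concrete (random edge weights, then an MST), here is a small Python sketch on a rectangular grid. It is not the P4 starter code: the grid representation, the function name generate_maze, the seed parameter, and the simple parent-pointer find are all assumptions for illustration, standing in for the Disjoint Sets implementation the project asks for.

```python
import random

def generate_maze(rows, cols, seed=None):
    """Sketch: carve a maze by running Kruskal's on randomly weighted walls.

    Cells are (row, col) pairs; every wall between adjacent cells is a
    candidate edge. Returns the list of walls removed (the MST edges).
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]

    walls = []
    for r, c in cells:
        if r + 1 < rows:
            walls.append((rng.random(), (r, c), (r + 1, c)))
        if c + 1 < cols:
            walls.append((rng.random(), (r, c), (r, c + 1)))

    parent = {cell: cell for cell in cells}   # each cell starts as its own set

    def find(cell):
        while parent[cell] != cell:
            cell = parent[cell]
        return cell

    passages = []
    for _, a, b in sorted(walls):             # ascending by random weight
        root_a, root_b = find(a), find(b)
        if root_a != root_b:                  # removing this wall creates no cycle
            parent[root_a] = root_b
            passages.append((a, b))
    return passages


maze = generate_maze(4, 5, seed=373)
```

Because the result is a spanning tree of the grid, every cell is reachable and there is exactly one path between any two cells, which matches the “spanning” and “tree” requirements stated for maze generation.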
Disjoint Sets



Disjoint Sets in mathematics
- “In mathematics, two sets are said to be disjoint sets if they have no
element in common.” - Wikipedia
- disjoint = not overlapping

[Figure: Set #1 (Kevin, Keanu, Aileen) and Set #2 (Leona, Sherdil) share no members: these two sets are disjoint sets. Set #3 and Set #4 (Nishu, Brian, and Santino between them) both contain Santino: these two sets are not disjoint sets.]


Disjoint Sets in computer science (new ADT!)

In computer science, disjoint sets can refer to this ADT/data structure that keeps track of the multiple “mini” sets that are disjoint (confusing naming, I know)

[Figure: a grey blob containing Set #1 (Kevin, Keanu, Aileen) and Set #2 (Leona, Sherdil)]
This overall grey blob thing is the actual disjoint sets, and it’s keeping track of any number of mini-sets, which are all disjoint (the mini sets have no overlapping values).

Note: this might feel really different than ADTs we’ve run into before. The ADTs we’ve seen before (dictionaries, lists, sets, etc.) just store values directly. But the Disjoint Set ADT is particularly interested in letting you group your values into sets and keep track of which particular set your values are in.


DisjointSets ADT methods
Just 3 methods (and makeSet is pretty simple!)

- findSet(value)
- union(valueA, valueB)
- makeSet(value)



findSet(value)
findSet(value) returns some ID for which particular set the value is in. For Disjoint Sets, we
often call this the representative (as it’s a value that represents the whole set).

Examples:
findSet(Brian) → 3
findSet(Sherdil) → 2
findSet(Velocity) → 2
findSet(Kevin) == findSet(Aileen) → true

[Figure: four mini sets, Set #1 through Set #4, over the members Kevin, Aileen, Sherdil, Kasey, Velocity, Brian, and Keanu]


union(valueA, valueB)
union(valueA, valueB) merges the set that A is in with the set that B is in (basically adds the
two sets together into one).
Example: union(Blarry, Brian)

[Figure: before the call, Blarry and Brian are in different mini sets; after the call they are in one combined set, and Set #3 no longer exists]
makeSet(value)
makeSet(value) makes a new mini set that just has the value parameter in it.

Examples:
makeSet(Elena)
makeSet(Anish)

[Figure: new single-member sets Set #5 (Elena) and Set #6 (Anish) appear alongside the existing Sets #1 through #4]


Disjoint Sets ADT Summary

Disjoint-Sets ADT

state
Set of Sets
- Mini sets are disjoint: Elements must be unique across mini sets
- No required order
- Each set has id/representative

behavior
makeSet(value) – creates a new set within the disjoint set where the
only member is the value. Picks id/representative for set
findSet(value) – looks up the set containing the value, returns
id/representative of that set
union(x, y) – looks up set containing x and set containing y, combines
two sets into one. All of the values of one set are added to the other,
and the now empty set goes away.



New ADT

Set ADT
state
    Set of elements
    - Elements must be unique!
    - No required order
    Count of Elements
behavior
    create(x) - creates a new set with a single member, x
    add(x) - adds x into set if it is unique, otherwise add is ignored
    remove(x) – removes x from set
    size() – returns current number of elements in set

Disjoint-Set ADT
state
    Set of Sets
    - Disjoint: Elements must be unique across sets
    - No required order
    - Each set has representative
    Count of Sets
behavior
    makeSet(x) – creates a new set within the disjoint set where the only member is x. Picks representative for set
    findSet(x) – looks up the set containing element x, returns representative of that set
    union(x, y) – looks up set containing x and set containing y, combines two sets into one. Picks new representative for resulting set

[Figure: a single set of elements next to a family of disjoint sets]


Example
new()
makeSet(a)
makeSet(b)
makeSet(c)
makeSet(d)
makeSet(e)
findSet(a)
findSet(d)
union(a, c)
union(b, d)
findSet(a) == findSet(c)
findSet(a) == findSet(d)

State after the makeSet calls: Rep 0: {a}, Rep 1: {b}, Rep 2: {c}, Rep 3: {d}, Rep 4: {e}
State after union(a, c): Rep 0: {a, c}, Rep 1: {b}, Rep 3: {d}, Rep 4: {e}
State after union(b, d): Rep 0: {a, c}, Rep 1: {b, d}, Rep 4: {e}

CSE 373 WI 18 – MICHAEL LEE
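
The call sequence in this example can be replayed with a deliberately naive Python sketch that stores a representative ID for every value. The class name NaiveDisjointSets and its method names are hypothetical; the choice to keep the first argument’s representative on union is an assumption that happens to reproduce the Rep numbers shown in the example.

```python
class NaiveDisjointSets:
    """Sketch: map every value directly to its set's representative ID."""

    def __init__(self):
        self.rep = {}        # value -> representative ID
        self.next_id = 0

    def make_set(self, value):
        self.rep[value] = self.next_id   # a new set gets the next unused ID
        self.next_id += 1

    def find_set(self, value):
        return self.rep[value]

    def union(self, x, y):
        keep, drop = self.rep[x], self.rep[y]
        for value in list(self.rep):     # relabel every member of y's set
            if self.rep[value] == drop:
                self.rep[value] = keep


sets = NaiveDisjointSets()
for name in "abcde":
    sets.make_set(name)
first_a = sets.find_set("a")
first_d = sets.find_set("d")
sets.union("a", "c")
sets.union("b", "d")
same_ac = sets.find_set("a") == sets.find_set("c")
same_ad = sets.find_set("a") == sets.find_set("d")
```

Lookups are a single dictionary access here, while union has to scan every value.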
Implementation

Disjoint-Set ADT
state
    Set of Sets
    - Disjoint: Elements must be unique across sets
    - No required order
    - Each set has representative
    Count of Sets
behavior
    makeSet(x) – creates a new set within the disjoint set where the only member is x. Picks representative for set
    findSet(x) – looks up the set containing element x, returns representative of that set
    union(x, y) – looks up set containing x and set containing y, combines two sets into one. Picks new representative for resulting set

TreeDisjointSet<E>
state
    Collection<TreeSet> forest
    Dictionary<NodeValues, NodeLocations> nodeInventory
behavior
    makeSet(x) - create a new tree of size 1 and add to our forest
    findSet(x) - locates node with x and moves up tree to find root
    union(x, y) - append tree with y as a child of tree with x

TreeSet<E>
state
    SetNode overallRoot
behavior
    TreeSet(x)
    add(x)
    remove(x, y)
    getRep() - returns data of overallRoot

SetNode<E>
state
    E data
    Collection<SetNode> children
behavior
    SetNode(x)
    addChild(x)
    removeChild(x, y)


TreeDisjointSet<E>: Implement makeSet(x)
makeSet(0)
makeSet(1)
makeSet(2)
makeSet(3)
makeSet(4)
makeSet(5)

forest: 0  1  2  3  4  5  (six single-node trees, one per call)

Worst case runtime? O(1)
QuickUnion Data Structure
Fundamental idea:
- QuickFind tracks each element’s ID
- QuickUnion tracks each element’s parent. Only the root has an ID!
- Each set becomes tree-like, but something slightly different called an up-tree: store pointers from children to parents!

[Figure: the abstract idea of “Disjoint Sets” ({Aileen, Santino}, {Joyce, Sam, Ken, Alex}, {Paul}) next to its implementation using QuickUnion: three up-trees rooted at Aileen (1), Joyce (2), and Paul (3)]

QuickUnion: Find
find(Ken):
    jump to Ken node
    travel upward until root
    return ID

find(Santino) → 1
find(Ken) → 2
find(Santino) != find(Ken)
find(Santino) == find(Aileen)

Key idea: can travel upward from any node to find its representative ID
How do we jump to a node quickly?
- Also store a map from value to its node (omitted in future slides)
QuickUnion: Union
Key idea: easy to simply rearrange pointers to union entire trees together!
Which of these implementations would you prefer?

Left implementation:
union(Ken, Santino):
    rootS = find(Santino)
    set Ken to point to rootS

Right implementation:
union(Ken, Santino):
    rootK = find(Ken)
    rootS = find(Santino)
    set rootK to point to rootS

[Figure: RESULT of each version. Left: only Ken moves under Aileen (1), while Joyce (2) remains a separate root. Right: Joyce, the root of Ken’s tree, now points to Aileen (1), so the whole tree joins Aileen’s set. Paul (3) is unchanged in both.]

• We prefer the right implementation because by changing just the root, we effectively pull the
entire tree into the new set!
• If we change the first node instead, we have to do more work for the rest of the old tree
• A rare example of constant time work manipulating a factor of n elements
QuickUnion: Why bother with the second root?
Why not just use the version on the right?

Left (uses both roots):
union(Ken, Santino):
    rootK = find(Ken)
    rootS = find(Santino)
    set rootK to point to rootS

Right (skips the second find):
union(Ken, Santino):
    rootK = find(Ken)
    set rootK to point to Santino

[Figure: with both roots, Joyce becomes a child of Aileen (1); without the second find, Joyce becomes a child of Santino, one level deeper]

Key idea: will help minimize runtime for future find() calls if we keep the height of the
tree short!
- Pointing directly to the second element would make the tree taller
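
A minimal Python sketch of the QuickUnion idea, using the root-to-root union described above. The class and method names are hypothetical, and for simplicity find returns the root value itself rather than a separate numeric ID. The exact position of Alex in Joyce’s tree is also an assumption, since the figure does not survive in the text.

```python
class QuickUnion:
    """Sketch: each value stores only a pointer to its parent (an up-tree)."""

    def __init__(self):
        self.parent = {}                 # value -> parent value, None for roots

    def make_set(self, value):
        self.parent[value] = None

    def find(self, value):
        # Travel upward until the root, which represents the whole set.
        while self.parent[value] is not None:
            value = self.parent[value]
        return value

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b   # one pointer change moves a whole tree


qu = QuickUnion()
for name in ["Aileen", "Santino", "Joyce", "Sam", "Ken", "Alex", "Paul"]:
    qu.make_set(name)
qu.union("Santino", "Aileen")     # {Aileen, Santino}, rooted at Aileen
for name in ["Sam", "Ken", "Alex"]:
    qu.union(name, "Joyce")       # {Joyce, Sam, Ken, Alex}, rooted at Joyce

different_before = qu.find("Santino") != qu.find("Ken")
qu.union("Ken", "Santino")        # Joyce (Ken's root) now points to Aileen
root_of_alex = qu.find("Alex")
```

After the final union, Alex reports Aileen as its representative even though Alex’s own parent pointer never changed.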
QuickUnion: Checking in on those runtimes

                 Maps to Sets   QuickFind   QuickUnion
makeSet(value)   Θ(1)           Θ(1)        Θ(1)
findSet(value)   Θ(n)           Θ(1)        Θ(n)
union(x, y)      Θ(n)           Θ(n)        Θ(1)*

* Only if we discount the runtime from union’s calls to find! Otherwise, Θ(n).
- However, for Kruskal’s, not a bad assumption: we only ever call union with roots anyway!

union(A, B):
    rootA = find(A)
    rootB = find(B)
    set rootA to point to rootB

kruskalMST(G graph)
    DisjointSets<V> msts; Set finalMST;
    initialize msts with each vertex as single-element MST
    sort all edges by weight (smallest to largest)

    for each edge (u,v) in ascending order:
        uMST = msts.find(u)
        vMST = msts.find(v)
        if (uMST != vMST):
            finalMST.add(edge (u, v))
            msts.union(uMST, vMST);
QuickUnion: Let’s Build a Worst Case
Even with the “use-the-roots” implementation of union, try to
come up with a series of calls to union that would create a worst-
case runtime for find on these Disjoint Sets: A, B, C, D (four single-node trees)

find(A):
    jump to A node
    travel upward until root
    return ID

union(A, B):
    rootA = find(A)
    rootB = find(B)
    set rootA to point to rootB

One answer:
union(A, B)
union(B, C)
union(C, D)
find(A)

[Figure: the resulting degenerate tree is a chain with D at the root, then C, then B, then A at the bottom, so find(A) walks through every node]
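
The worst case above is easy to reproduce. The following Python sketch counts how many parent pointers find follows; the helper names (make_sets, find, union, build_chain) are hypothetical, and the union rule is the one from the pseudocode, where rootA always goes under rootB.

```python
def make_sets(values):
    """Each value starts as its own root (parent None)."""
    return {v: None for v in values}

def find(parent, x):
    """Return (root, hops), where hops counts the pointers followed."""
    hops = 0
    while parent[x] is not None:
        x = parent[x]
        hops += 1
    return x, hops

def union(parent, a, b):
    root_a, _ = find(parent, a)
    root_b, _ = find(parent, b)
    if root_a != root_b:
        parent[root_a] = root_b      # rootA always goes under rootB

def build_chain(n):
    """union(0, 1), union(1, 2), ... puts the growing tree under a new root each time."""
    parent = make_sets(range(n))
    for i in range(n - 1):
        union(parent, i, i + 1)
    return parent


parent = make_sets("ABCD")
union(parent, "A", "B")
union(parent, "B", "C")
union(parent, "C", "D")
root, hops = find(parent, "A")

_, long_hops = find(build_chain(100), 0)
```

With n elements the same pattern yields n - 1 hops, which is the Θ(n) find listed for QuickUnion.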
Analyzing the QuickUnion Worst Case
How did we get a degenerate tree?
- Even though pointing a root to a root usually helps with this, we can still get a degenerate tree if we put the
root of a large tree under the root of a small tree.
- In QuickUnion, rootA always goes under rootB
- But what if we could ensure the smaller tree went under the larger tree?

[Figure: union(C, D), where C is the root of the chain C, B, A. What currently happens: C goes under D, giving the chain D, C, B, A. What would help avoid a degenerate tree: D goes under C, so C has children B and D, with A under B.]
WeightedQuickUnion
Goal: Always pick the smaller tree to go under the larger tree
Implementation: Store the number of nodes (or “weight”) of each tree in the root
- Constant-time lookup instead of having to traverse the entire tree to count

union(A, B):
    rootA = find(A)
    rootB = find(B)
    put lighter root under heavier root

Now what happens?
union(A, B)
union(B, C)
union(C, D)
find(A)

[Figure: starting from single nodes A, B, C, D, the calls produce one root with A, C, and D as its children]
Perfect! Best runtime we can get.
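
Here is one way the weighted rule might look in Python. The class name and fields are hypothetical, and the tie-breaking choice (on equal weights, rootA goes under rootB, as in plain QuickUnion) is an assumption that the slides do not spell out.

```python
class WeightedQuickUnion:
    """Sketch: QuickUnion that always puts the lighter tree under the heavier."""

    def __init__(self):
        self.parent = {}    # value -> parent value, None for roots
        self.weight = {}    # root value -> number of nodes in its tree

    def make_set(self, value):
        self.parent[value] = None
        self.weight[value] = 1

    def find(self, value):
        while self.parent[value] is not None:
            value = self.parent[value]
        return value

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        # Make root_a the lighter root; on a tie root_a stays the lighter one.
        if self.weight[root_a] > self.weight[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_a] = root_b
        self.weight[root_b] += self.weight.pop(root_a)   # only roots keep a weight


wqu = WeightedQuickUnion()
for name in "ABCD":
    wqu.make_set(name)
wqu.union("A", "B")
wqu.union("B", "C")
wqu.union("C", "D")
children_of_root = sorted(v for v, p in wqu.parent.items() if p == wqu.find("A"))
```

The same three calls that built a chain under plain QuickUnion now leave every non-root node one step from the root.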
WeightedQuickUnion: Performance
union()’s runtime is still dependent on find()’s runtime, which is a function of the tree’s
height
What’s the worst-case height for WeightedQuickUnion?

union(A, B):
rootA = find(A)
rootB = find(B)
put lighter root under heavier root
WeightedQuickUnion: Performance
• Consider the worst case where the tree height grows as fast as possible
• Worst case tree height is Θ(log N)

N     H
1     0
2     1
4     2
8     3
16    4

[Figure: worst-case trees on nodes 0 through 15, each built by putting one tree of N/2 nodes under the root of another tree of the same size, which adds one level]
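
The N/H table can be checked with a short simulation. The sketch below is an illustration under assumptions: it uses list indices as elements, breaks weight ties by putting the first root under the second, and the function names are made up for this example. It unions equal-sized trees in rounds, which is the pattern that makes the height grow as fast as the weighted rule allows. The usual argument for the bound is that a node only gets one level deeper when its tree is placed under a tree at least as large, so its tree at least doubles in size each time, and that can happen at most log2(N) times.

```python
def build_worst_case(k):
    """Weighted-union 2**k singletons in k rounds of equal-size merges."""
    n = 2 ** k
    parent = list(range(n))      # parent[x] == x marks a root
    size = [1] * n

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return
        if size[root_a] > size[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_a] = root_b          # lighter (or tied) root goes under
        size[root_b] += size[root_a]

    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            union(start, start + step)   # merge two trees of `step` nodes each
        step *= 2
    return parent

def height(parent):
    """Longest chain of parent pointers from any node to its root."""
    def depth(x):
        d = 0
        while parent[x] != x:
            x = parent[x]
            d += 1
        return d
    return max(depth(x) for x in range(len(parent)))


table = [(2 ** k, height(build_worst_case(k))) for k in range(5)]
```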
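Why Weights Instead of Heights?
We used the number of items in a tree to decide upon the root
Why not use the height of the tree?
- HeightedQuickUnion’s runtime is asymptotically the same: Θ(log(n))
- It’s easier to track weights than heights, even though WeightedQuickUnion can lead to some suboptimal
structures like this one:

[Figure: a tree rooted at 0 with children 1 through 5 (six nodes, height 1) is unioned with a tree rooted at 6 with children 7 and 8 and node 9 one level further down (four nodes, height 2). The lighter but taller tree goes under 0, giving a tree of height 3.]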
WeightedQuickUnion Runtime

                                  Maps to Sets   QuickFind   QuickUnion   WeightedQuickUnion
makeSet(value)                    Θ(1)           Θ(1)        Θ(1)         Θ(1)
find(value)                       Θ(n)           Θ(1)        Θ(n)         Θ(log n)
union(x, y), assuming root args   Θ(n)           Θ(n)        Θ(1)         Θ(1)
union(x, y)                       Θ(n)           Θ(n)        Θ(n)         Θ(log n)

This is pretty good! But there’s one final optimization we can make: path compression
Appendix
TreeDisjointSet<E>
state
    Collection<TreeSet> forest
    Dictionary<NodeValues, NodeLocations> nodeInventory
behavior
    makeSet(x) - create a new tree of size 1 and add to our forest
    findSet(x) - locates node with x and moves up tree to find root
    union(x, y) - append tree with y as a child of tree with x

Implement union(x, y)
Start: the forest holds six single-node trees 0 1 2 3 4 5, and nodeInventory maps each value 0 through 5 to its node.
union(3, 5): 5 becomes a child of 3. Roots: 0 1 2 3 4
union(2, 1): 1 becomes a child of 2. Roots: 0 2 3 4
union(2, 5): the tree containing 5, rooted at 3, becomes a child of 2. Roots: 0 2 4

Implement findSet(x)
findSet(0)
findSet(3)
findSet(5)

Worst case runtime? O(n)
Worst case runtime of union? O(n)
