Lecture 15


CSE 326: Data Structures
Lecture #15
The Dynamic (Equivalence) Duo:
Weighted Union & Path Compression
Bart Niswonger
Summer Quarter 2001
Today’s Outline
• Project
– Rules of competition
• Making a “good” maze
• Disjoint Set Union/Find ADT
• Up-trees
• Weighted Unions
• Path Compression
Unix Tutorial!!
• Tuesday, July 31st
– 10:50am, Sieg 322
• Printing worksheet
• Shell
– different shell quotes: ' `
– scripting, #!
– alias
– variables / environment
– redirection, piping
• Useful tools
– grep, egrep/grep -E
– sort
– cut
– file
– tr
– find, xargs
– diff, patch
– which, locate, whereis
• Finding info
• Techniques
• Resources (ACM webpage, web, internal docs)
• Process management
• File management/permissions
• Filesystem layout
What’s a Good Maze?
The Maze Construction Problem
• Given:
– collection of rooms: V
– connections between rooms (initially all closed): E
• Construct a maze:
– collection of rooms: V′ = V
– designated rooms in, i ∈ V, and out, o ∈ V
– collection of connections to knock down: E′ ⊆ E such that one unique path connects every two rooms
The Middle of the Maze
• So far, a number of walls have been knocked down while others remain.
• Now, we consider the wall between rooms A and B.
• Should we knock it down?
– if A and B are otherwise connected
– if A and B are not otherwise connected
Maze Construction Algorithm
While edges remain in E
• Remove a random edge e = (u, v) from E
• If u and v have not yet been connected
  - add e to E′
  - mark u and v as connected

Mysterious note:
We’ll see this algorithm again!
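The loop above can be sketched in C++ roughly as follows. This is a minimal sketch under stated assumptions, not the course's project code: rooms are assumed to be numbered 0..n-1, the names `Edge`, `findRoot`, and `buildMaze` are hypothetical, "remove a random edge" is modeled by shuffling the edge list once, and connectivity is tracked with a plain parent array.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hypothetical edge type: a wall between rooms u and v (rooms are 0..n-1).
using Edge = std::pair<int, int>;

// Follow parent links until reaching a root (a room whose parent is -1).
int findRoot(const std::vector<int>& up, int x) {
    while (up[x] != -1) x = up[x];
    return x;
}

// Returns E', the walls to knock down. If the input graph is connected,
// exactly one path joins any two rooms in the result.
std::vector<Edge> buildMaze(int numRooms, std::vector<Edge> edges, std::mt19937& rng) {
    std::vector<int> up(numRooms, -1);              // every room starts alone
    std::shuffle(edges.begin(), edges.end(), rng);  // visit edges in random order
    std::vector<Edge> knockedDown;
    for (const Edge& e : edges) {
        assert(e.first >= 0 && e.first < numRooms);
        assert(e.second >= 0 && e.second < numRooms);
        int ru = findRoot(up, e.first);
        int rv = findRoot(up, e.second);
        if (ru != rv) {                  // u and v have not yet been connected
            knockedDown.push_back(e);    // add e to E'
            up[rv] = ru;                 // mark u and v as connected
        }
    }
    return knockedDown;
}
```

Every kept edge joins two previously separate groups of rooms, so a connected graph of n rooms always yields n - 1 knocked-down walls, whatever the shuffle order.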
Equivalence Relations
An equivalence relation R must have three properties
– reflexive: for any x, xRx is true
– symmetric: for any x and y, xRy implies yRx
– transitive: for any x, y, and z, xRy and yRz implies xRz

Connection between rooms is an equivalence relation


– any room is connected to itself
– if room a is connected to room b, then room b is
connected to room a
– if room a is connected to room b and room b is connected
to room c, then room a is connected to room c
Disjoint Set Union/Find ADT
• Union/Find operations
– create
– destroy
– union
– find
[Figure: the disjoint sets {1,4,8}, {7}, {6}, {5,9,10}, {2,3}. find(4) returns 8, the name of the set {1,4,8}; union(2,6) merges {2,3} and {6} into {2,3,6}.]
• Disjoint set equivalence property: every element of a DS U/F structure belongs to exactly one set
• Dynamic equivalence property: the set of an element can change after execution of a union
Disjoint Set Union/Find† (More Formally)
• Given a set U = {a1, a2, … , an}
• Maintain a partition of U, a set of subsets of U {S1, S2, … , Sk} such that:
– each pair of subsets Si and Sj (i ≠ j) are disjoint: Si ∩ Sj = ∅
– together, the subsets cover U: U = S1 ∪ S2 ∪ … ∪ Sk
– each subset has a unique name
• Union(a, b) creates a new subset which is the union of a’s subset and b’s subset
• Find(a) returns a unique name for a’s subset

† AKA the dynamic equivalence problem
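To make the definitions concrete, here is a deliberately naive sketch of the ADT. It assumes the elements are the integers 0..n-1 and stores each element's subset name directly, so Find is O(1) but Union is O(n). The class name `NaiveDisjointSets` and the method name `unite` are illustrative choices, and this is not the up-tree structure the lecture goes on to develop.

```cpp
#include <cassert>
#include <vector>

// Deliberately naive partition of {0, ..., n-1}: name_[a] is the name of a's subset.
class NaiveDisjointSets {
public:
    explicit NaiveDisjointSets(int n) : name_(n) {
        // Start with n singleton subsets, each named after its only member.
        for (int i = 0; i < n; ++i) name_[i] = i;
    }

    // Find(a): the unique name of a's subset. O(1).
    int find(int a) const {
        assert(a >= 0 && a < static_cast<int>(name_.size()));
        return name_[a];
    }

    // Union(a, b): merge b's subset into a's subset, keeping a's name. O(n).
    // Called unite because `union` is a reserved word in C++.
    void unite(int a, int b) {
        int keep = find(a);
        int gone = find(b);
        if (keep == gone) return;        // already in the same subset
        for (int& s : name_) {
            if (s == gone) s = keep;     // relabel every member of b's subset
        }
    }

private:
    std::vector<int> name_;
};
```

Relabeling keeps the subsets disjoint and covering by construction: every element always carries exactly one name.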
Example
Construct the maze on the right.

Initial (the name of each set is underlined):
{a}{b}{c}{d}{e}{f}{g}{h}{i}

Maze (numbers give the order in which edges are considered; shown in blue on the slide):
a  3  b 10  c
2     1     6
d  4  e  7  f
11    9     8
g 12  h  5  i
Example, First Step
{a}{b}{c}{d}{e}{f}{g}{h}{i}

find(b) → b
find(e) → e
find(b) ≠ find(e) so:
  add 1 to E′
  union(b, e)

{a}{b,e}{c}{d}{f}{g}{h}{i}

Maze (numbers give the order in which edges are considered; shown in blue on the slide):
a  3  b 10  c
2     1     6
d  4  e  7  f
11    9     8
g 12  h  5  i
Example, Continued
{a}{b,e}{c}{d}{f}{g}{h}{i}

Maze (numbers give the order in which edges are considered; shown in blue on the slide):
a  3  b 10  c
2           6
d  4  e  7  f
11    9     8
g 12  h  5  i
Up-Tree Intuition
Finding the representative member of a set is somewhat like the opposite of finding whether a given key exists in a set.

So, instead of using trees with pointers from each node to its children, let’s use trees with a pointer from each node to its parent.
Up-Tree Union-Find Data Structure
• Each subset is an up-tree with its root as its representative member
• All members of a given set are nodes in that set’s up-tree
• Hash table maps input data to the node associated with that data

  a    c  g  h
 / \   |     |
b   d  f     i
|
e

Up-trees are not necessarily binary!
Find
find(f)
find(e)

Maze (numbers mark edges still to be considered):
a     b 10  c

d     e  7  f
11    9     8
g 12  h     i

  a    c  g  h
 / \   |     |
b   d  f     i
|
e

Just traverse to the root!

runtime:
Union
union(a,c)

Maze (numbers mark edges still to be considered):
a     b 10  c

d     e     f
11    9     8
g 12  h     i

  a    c  g  h
 / \   |     |
b   d  f     i
|
e

Just hang one root from the other!

runtime:
The Whole Example (1/11)
union(b,e)

Maze (numbers mark edges still to be considered):
a  3  b 10  c
2     1     6
d  4  e  7  f
11    9     8
g 12  h  5  i

Before:
a  b  c  d  e  f  g  h  i

After:
a  b  c  d  f  g  h  i
   |
   e
The Whole Example (2/11)
union(a,d)

Maze (numbers mark edges still to be considered):
a  3  b 10  c
2           6
d  4  e  7  f
11    9     8
g 12  h  5  i

Before:
a  b  c  d  f  g  h  i
   |
   e

After:
a  b  c  f  g  h  i
|  |
d  e
The Whole Example (3/11)
union(a,b)

Maze (numbers mark edges still to be considered):
a  3  b 10  c
            6
d  4  e  7  f
11    9     8
g 12  h  5  i

Before:
a  b  c  f  g  h  i
|  |
d  e

After:
  a    c  f  g  h  i
 / \
b   d
|
e
The Whole Example (4/11)
find(d) = find(e)
No union!

Maze (numbers mark edges still to be considered):
a     b 10  c
            6
d  4  e  7  f
11    9     8
g 12  h  5  i

  a    c  f  g  h  i
 / \
b   d
|
e

While we’re finding e, could we do anything else?
The Whole Example (5/11)
union(h,i)

Maze (numbers mark edges still to be considered):
a     b 10  c
            6
d     e  7  f
11    9     8
g 12  h  5  i

Before:
  a    c  f  g  h  i
 / \
b   d
|
e

After:
  a    c  f  g  h
 / \            |
b   d           i
|
e
The Whole Example (6/11)
union(c,f)

Maze (numbers mark edges still to be considered):
a     b 10  c
            6
d     e  7  f
11    9     8
g 12  h     i

Before:
  a    c  f  g  h
 / \            |
b   d           i
|
e

After:
  a    c  g  h
 / \   |     |
b   d  f     i
|
e
The Whole Example (7/11)
find(e)
find(f)
union(a,c)

Maze (numbers mark edges still to be considered):
a     b 10  c

d     e  7  f
11    9     8
g 12  h     i

Before:
  a    c  g  h
 / \   |     |
b   d  f     i
|
e

After:
    c    g  h
   / \      |
  a   f     i
 / \
b   d
|
e

Could we do a better job on this union?
The Whole Example (8/11)
find(f)
find(i)
union(c,h)

Maze (numbers mark edges still to be considered):
a     b 10  c

d     e     f
11    9     8
g 12  h     i

Before:
    c    g  h
   / \      |
  a   f     i
 / \
b   d
|
e

After:
      c      g
    / | \
   a  f  h
  / \    |
 b   d   i
 |
 e
The Whole Example (9/11)
find(e) = find(h) and find(b) = find(c)
So, no unions for either of these.

Maze (numbers mark edges still to be considered):
a     b 10  c

d     e     f
11    9
g 12  h     i

      c      g
    / | \
   a  f  h
  / \    |
 b   d   i
 |
 e
The Whole Example (10/11)
find(d)
find(g)
union(c, g)

Maze (numbers mark edges still to be considered):
a     b     c

d     e     f
11
g 12  h     i

Before:
      c      g
    / | \
   a  f  h
  / \    |
 b   d   i
 |
 e

After:
      g
      |
      c
    / | \
   a  f  h
  / \    |
 b   d   i
 |
 e
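The Whole Example (11/11)
find(g) = find(h)
So, no union.
And, we’re done!

Maze (numbers mark edges still to be considered):
a     b     c

d     e     f

g 12  h     i

      g
      |
      c
    / | \
   a  f  h
  / \    |
 b   d   i
 |
 e

[Figure: the finished 3×3 maze over rooms a to i. The walls left standing are 4 (d–e), 9 (e–h), 10 (b–c), and 12 (g–h).]
Ooh… scary! Such a hard maze!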
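The eleven steps above can be replayed mechanically. The sketch below assumes the edge numbering read off the example grid (edge 1 joins b and e, edge 2 joins a and d, and so on) and uses a plain unweighted union; `NumberedEdge`, `kExampleEdges`, and `replayExample` are hypothetical names. The direction of each union here is arbitrary, so the tree shapes need not match the slides, but which edges are kept does not depend on that choice.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record: the edge's number in the example and the two rooms it joins.
struct NumberedEdge { int label; char u; char v; };

// Edges of the 3x3 example, listed in the order the slides consider them.
const std::vector<NumberedEdge> kExampleEdges = {
    {1, 'b', 'e'}, {2, 'a', 'd'},  {3, 'a', 'b'},  {4, 'd', 'e'},
    {5, 'h', 'i'}, {6, 'c', 'f'},  {7, 'e', 'f'},  {8, 'f', 'i'},
    {9, 'e', 'h'}, {10, 'b', 'c'}, {11, 'd', 'g'}, {12, 'g', 'h'}};

// Returns the labels of the edges added to E' (the walls knocked down).
std::vector<int> replayExample() {
    std::vector<int> up(9, -1);                  // rooms a..i, each its own root
    auto find = [&up](char name) {
        assert(name >= 'a' && name <= 'i');
        int id = name - 'a';                     // simple perfect hash on the letter
        while (up[id] != -1) id = up[id];        // traverse to the root
        return id;
    };
    std::vector<int> knockedDown;
    for (const NumberedEdge& e : kExampleEdges) {
        int ru = find(e.u);
        int rv = find(e.v);
        if (ru != rv) {                          // not yet connected: knock the wall down
            up[rv] = ru;
            knockedDown.push_back(e.label);
        }
    }
    return knockedDown;
}
```

Under these assumptions the kept edges are 1, 2, 3, 5, 6, 7, 8, and 11, which leaves walls 4, 9, 10, and 12 standing: eight passages for nine rooms.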
Nifty storage trick
A forest of up-trees can easily be stored in an array.
Also, if the node names are integers or characters, we can use a very simple, perfect hash.

  a    c  g  h
 / \   |     |
b   d  f     i
|
e

index:     0 (a)  1 (b)  2 (c)  3 (d)  4 (e)  5 (f)  6 (g)  7 (h)  8 (i)
up-index:   -1      0     -1      0      1      2     -1     -1      7
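As a small illustration of the array layout, the sketch below hard-codes the up-index row from this slide and walks it to find each node's root. The helper names `indexOf` and `findName` are assumptions made for the example; the "perfect hash" is simply the letter's offset from 'a'.

```cpp
#include <array>
#include <cassert>

// Up-index array from the slide; -1 marks a root.
// node:                        a   b   c  d  e  f   g   h  i
const std::array<int, 9> up = {-1,  0, -1, 0, 1, 2, -1, -1, 7};

// Perfect hash for the node names 'a'..'i': the letter's offset from 'a'.
int indexOf(char name) {
    assert(name >= 'a' && name <= 'i');
    return name - 'a';
}

// Walk the up-links to the root and return the root's name.
char findName(char name) {
    int id = indexOf(name);
    while (up[id] != -1) id = up[id];
    return static_cast<char>('a' + id);
}
```

For instance, `findName('e')` follows e → b → a and returns 'a', while `findName('i')` returns 'h'.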
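Implementation
typedef int ID;

ID find(Object x) {
  assert(hTable.contains(x));
  ID parentID = hTable[x];

  // Follow up-links until we reach a root (up == -1).
  while (up[parentID] != -1) {
    parentID = up[parentID];
  }

  return parentID;
}
runtime: O(depth) or …

// Named unionSets because `union` is a reserved word in C++.
ID unionSets(ID x, ID y) {
  assert(up[x] == -1);   // both arguments must be roots
  assert(up[y] == -1);

  up[y] = x;             // hang y's tree from x
  return x;              // x is the root of the merged tree
}
runtime: O(1)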
Improvement: Weighted Union
• Always makes the root of the larger tree the new root
• Often cuts down on height of the new up-tree

Before union(a,c):
  a    c  g  h
 / \   |     |
b   d  f     i
|
e

Unweighted union: could we do a better job on this union?
    c    g  h
   / \      |
  a   f     i
 / \
b   d
|
e

Weighted union!
    a      g  h
  / | \       |
 b  d  c      i
 |     |
 e     f
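Weighted Union Code
typedef int ID;

// Named unionSets because `union` is a reserved word in C++.
ID unionSets(ID x, ID y) {
  assert(up[x] == -1);   // both arguments must be roots
  assert(up[y] == -1);

  if (weight[x] > weight[y]) {
    up[y] = x;                 // hang the lighter tree y from x
    weight[x] += weight[y];
    return x;
  }
  else {
    up[x] = y;                 // hang x from y (ties go this way too)
    weight[y] += weight[x];
    return y;
  }
}
new runtime of union: O(1)
new runtime of find: O(log n)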
Weighted Union Find Analysis
• Finds with weighted union are O(max up-tree height)
• But, an up-tree of height h with weighted union must have at least 2^h nodes
  Base case: h = 0, tree has 2^0 = 1 node
  Induction hypothesis: assume true for h < h′
  A merge can only increase tree height by one over the smaller tree. So, a tree of height h′-1 was merged with a larger tree to form the new tree. Each tree then has ≥ 2^(h′-1) nodes by the induction hypothesis, for a total of at least 2^h′ nodes. QED.
• So, 2^(max height) ≤ n and max height ≤ log n
• So, find takes O(log n)
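The bound can be checked empirically. The sketch below is a hedged illustration, not part of the lecture: it assumes n is a power of two and repeatedly unions trees of equal weight, the pattern that grows height fastest under weighted union, then measures the tallest tree. `Forest`, `unionRoots`, and `worstCaseHeight` are hypothetical names, and ties are broken toward the first argument, which does not affect the height.

```cpp
#include <cassert>
#include <vector>

// Up-tree forest over 0..n-1 with weighted union and no path compression.
struct Forest {
    std::vector<int> up;      // parent index, -1 for roots
    std::vector<int> weight;  // node count of the tree (meaningful at roots)

    explicit Forest(int n) : up(n, -1), weight(n, 1) {}

    int find(int x) const {
        while (up[x] != -1) x = up[x];
        return x;
    }

    // Weighted union of two distinct roots: the lighter root hangs from the heavier.
    int unionRoots(int x, int y) {
        assert(up[x] == -1 && up[y] == -1 && x != y);
        if (weight[x] < weight[y]) { int t = x; x = y; y = t; }
        up[y] = x;
        weight[x] += weight[y];
        return x;
    }

    // Height of the tallest tree: the largest node depth in the forest.
    int maxDepth() const {
        int best = 0;
        for (int i = 0; i < static_cast<int>(up.size()); ++i) {
            int d = 0;
            for (int x = i; up[x] != -1; x = up[x]) ++d;
            if (d > best) best = d;
        }
        return best;
    }
};

// Pair up equal-weight trees round by round (n assumed to be a power of two).
int worstCaseHeight(int n) {
    Forest f(n);
    for (int step = 1; step < n; step *= 2) {
        for (int i = 0; i + step < n; i += 2 * step) {
            f.unionRoots(f.find(i), f.find(i + step));
        }
    }
    return f.maxDepth();
}
```

For this union sequence the height comes out to exactly log n (base 2), so the 2^h bound is tight: 16 elements give height 4 and 1024 give height 10.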
Improvement: Path Compression
• Points everything along the path of a find to the root
• Reduces the height of the entire access path to 1

Before find(e): while we’re finding e, could we do anything else?
  a    c  f  g  h  i
 / \
b   d
|
e

After find(e): path compression!
    a      c  f  g  h  i
  / | \
 b  d  e
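Path Compression Example
find(e)

[Figure: a forest with roots c and g, before and after find(e). After the find, every node on the path from e up to the root c points directly to c.]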
Path Compression Code
typedef int ID;

ID find(Object x) {
  assert(hTable.contains(x));
  ID parentID = hTable[x];
  ID hold = parentID;

  // First pass: walk up to the root.
  while (up[parentID] != -1) {
    parentID = up[parentID];
  }
  ID rootID = parentID;

  // Second pass: point every node on the path directly at the root.
  while (up[hold] != -1) {
    ID oldParentID = up[hold];
    up[hold] = rootID;
    hold = oldParentID;
  }

  return rootID;
}
runtime:
Digression:
Doping at the Silicon Downs
How fast does log n grow? log n = 4 for n = 16
Let log^(k) n = log (log (log … (log n))), with k logs
Then, let log* n = minimum k such that log^(k) n ≤ 1
How fast does log* n grow? log* n = 4 for n = 65536

Ackermann created a really big function A(x, y) with the inverse α(x, y) which is really small
How fast does α(x, y) grow? α(x, y) = 4 for n far larger than the number of atoms in the universe (2^300)
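The log* definition above can be evaluated exactly with integers. The sketch below assumes base-2 logarithms and n ≥ 1, and uses an equivalent formulation: log^(k) n ≤ 1 exactly when n ≤ tower(k), where tower(0) = 1 and tower(k) = 2^tower(k-1). The function name `logStar` is an illustrative choice.

```cpp
#include <cassert>
#include <cstdint>

// log* n for n >= 1: the minimum k such that k nested base-2 logs of n give a value <= 1.
// Computed as the smallest k with tower(k) >= n, which avoids floating point.
int logStar(std::uint64_t n) {
    assert(n >= 1);
    int k = 0;
    std::uint64_t tower = 1;                 // tower(0)
    while (tower < n) {
        ++k;
        if (tower >= 64) break;              // 2^tower exceeds any 64-bit n, so k is final
        tower = std::uint64_t{1} << tower;   // tower(k) = 2^tower(k-1)
    }
    return k;
}
```

The towers are 1, 2, 4, 16, 65536, so `logStar(65536)` is 4, as on the slide, and every larger 64-bit input gives 5.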
Complex Complexity of Weighted Union + Path Compression
• Tarjan proved that m weighted union and find operations on a set of n elements have worst case complexity O(m·α(m, n))
• For all practical purposes this is amortized constant time per operation
• In some practical cases, one or both improvements are unnecessary because trees do not naturally get very deep.
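Putting the two improvements together might look like the sketch below. It assumes elements are the integers 0..n-1 (so no hash table is needed); the class name `DisjointSets`, the method name `unite`, and the `parent` accessor are illustrative choices, and `parent` exists only so the effect of compression can be inspected.

```cpp
#include <cassert>
#include <vector>

// Up-tree forest over 0..n-1 with weighted union and path compression.
class DisjointSets {
public:
    explicit DisjointSets(int n) : up_(n, -1), weight_(n, 1) {}

    // Find with path compression, in two passes.
    int find(int x) {
        assert(x >= 0 && x < static_cast<int>(up_.size()));
        int root = x;
        while (up_[root] != -1) root = up_[root];   // pass 1: locate the root
        while (x != root) {                         // pass 2: repoint the path at the root
            int next = up_[x];
            up_[x] = root;
            x = next;
        }
        return root;
    }

    // Weighted union of the sets containing a and b; returns the merged root.
    // Called unite because `union` is a reserved word in C++.
    int unite(int a, int b) {
        int x = find(a);
        int y = find(b);
        if (x == y) return x;                       // already in the same set
        if (weight_[x] < weight_[y]) { int t = x; x = y; y = t; }
        up_[y] = x;                                 // lighter root hangs from heavier
        weight_[x] += weight_[y];
        return x;
    }

    // Exposed only to inspect the structure; -1 means x is a root.
    int parent(int x) const { return up_[x]; }

private:
    std::vector<int> up_;      // parent index, -1 for roots
    std::vector<int> weight_;  // node count of the tree (meaningful at roots)
};
```

After `unite(0, 1)`, `unite(2, 3)`, and `unite(0, 2)`, element 3 sits two links below the root 0; a single `find(3)` repoints it directly at 0, so later finds on it take one step.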
To Do
• Start Project III (only 5 days!)
• Read chapter 8 in the book
• Start reading chapter 7
Coming Up
• Algorithms
• Sorting (Chapter 7)
• Project III due (next
Wednesday)

• Unix Tutorial (next Tuesday)
