
Disjoint-Set Forests

Thanks for Showing Up!


Outline for Today
● Incremental Connectivity
● Maintaining connectivity as edges are added to a
graph.
● Disjoint-Set Forests
● A simple data structure for incremental connectivity.

● Union-by-Rank and Path Compression
● Two improvements over the basic data structure.
● Forest Slicing
● A technique for analyzing these structures.
● The Ackermann Inverse Function
● An unbelievably slowly-growing function.
The Dynamic Connectivity Problem
The Connectivity Problem
● The graph connectivity problem is the following:
Given an undirected graph G, preprocess the graph so
that queries of the form “are nodes u and v
connected?” can be answered efficiently.
● With Θ(m + n) preprocessing (for example, a DFS that
labels each node with its connected component), queries
can be answered in time O(1).
Dynamic Connectivity
● The dynamic connectivity problem is the following:
Maintain an undirected graph G so that edges may be
inserted and deleted and connectivity queries may be
answered efficiently.
● This is a much harder problem!
Dynamic Connectivity
● Today, we'll focus on the incremental dynamic connectivity
problem: maintaining connectivity when edges can only be
added, not deleted.
● Has applications to Kruskal's MST algorithm and to many other
online connectivity settings.
● Look up percolation theory for an example.
● These data structures are also used as building blocks in other
algorithms:
● Speeding up Edmonds' blossom algorithm for finding
maximum matchings.
● As a subroutine in Tarjan's offline lowest common ancestors
algorithm.
● Building meldable priority queues out of non-meldable
queues.
Incremental Connectivity and Partitions
Set Partitions
● The incremental connectivity problem is
equivalent to maintaining a partition of a set.
● Initially, each node belongs to its own set.
● As edges are added, the sets at the endpoints
become connected and are merged together.
● Querying for connectivity is equivalent to
querying for whether two elements belong to
the same set.
Representatives
● Given a partition of a set S, we can choose one
representative from each of the sets in the
partition.
● Representatives give a simple proxy for which set
an element belongs to: two elements are in the
same set of the partition iff they have the same
representative.
Union-Find Structures
● A union-find structure is a data structure
supporting the following operations:
● find(x), which returns the representative of the
set containing node x, and
● union(x, y), which merges the sets containing x
and y into a single set.
● We'll focus on these sorts of structures as a
solution to incremental connectivity.
Data Structure Idea
● Idea: Have each element store a pointer
directly to its representative.
● To determine if two nodes are in the same set,
check if they have the same representative.
● To link two sets together, change all elements
of the two sets so they reference a single
representative.
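
To make this concrete, here is a minimal Python sketch of the idea (the QuickFind class is our own illustration, not from the original slides): find and connectivity checks are O(1), but union must relabel one entire set.

    class QuickFind:
        def __init__(self, n):
            self.rep = list(range(n))          # rep[x] is x's representative

        def find(self, x):
            return self.rep[x]                 # O(1): one direct pointer

        def connected(self, x, y):
            return self.rep[x] == self.rep[y]  # same representative?

        def union(self, x, y):
            rx, ry = self.rep[x], self.rep[y]
            if rx != ry:
                for i in range(len(self.rep)):   # relabel ry's whole set: O(n)
                    if self.rep[i] == ry:
                        self.rep[i] = rx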
Using Representatives
● If we update all the representative
pointers in a set when doing a union, we
may spend time O(n) per union
operation.
● If you're clever with how you change the
pointers, you can make it amortized O(log n)
per operation. Do you see how?
● Can we avoid paying this cost?
Hierarchical Representatives
● In a degenerate case, a hierarchical
representative approach will require
time Θ(n) for some find operations.
● Therefore, some union operations will
take time Θ(n) as well.
● Can we avoid these degenerate cases?
Union by Rank
● Assign to each node a rank that is initially zero.
● To link two trees, link the tree of the smaller
rank to the tree of the larger rank.
● If both trees have the same rank, link one to
the other and increase the rank of the other
tree by one.
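
Here is a hedged Python sketch of these linking rules (our own rendering, not from the original slides; find simply walks up parent pointers, with path compression still to come):

    class DisjointSetForest:
        def __init__(self, n):
            self.parent = list(range(n))  # each node starts as its own root
            self.rank = [0] * n           # all ranks start at zero

        def find(self, x):
            while self.parent[x] != x:    # walk up to the representative
                x = self.parent[x]
            return x

        def union(self, x, y):
            rx, ry = self.find(x), self.find(y)   # two finds, then an O(1) link
            if rx == ry:
                return                            # already in the same set
            if self.rank[rx] < self.rank[ry]:     # link smaller rank under larger
                rx, ry = ry, rx
            self.parent[ry] = rx
            if self.rank[rx] == self.rank[ry]:
                self.rank[rx] += 1                # equal ranks: new root's rank grows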
Union by Rank
● Claim: The number of nodes in a tree of rank r is at
least 2^r.
● Proof is by induction; intuitively, the size must double to
get a tree of the next rank.
● Fun fact: the smallest tree with a root of rank r is a binomial
tree of order r. Crazy!
● Claim: The maximum rank of a node in a forest with n
nodes is O(log n).
● Runtime for union and find is now O(log n).
● Useful fact for later on: The number of nodes of
rank r or higher in a disjoint-set forest with n nodes is
at most n / 2^r.
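
To spell out the inductive step in the first claim (our own one-line sketch): a root reaches rank r + 1 only when two roots of rank r are linked, so by the inductive hypothesis

    size(new tree) ≥ 2^r + 2^r = 2^(r+1).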
Path Compression
● Path compression is an optimization to the
standard disjoint-set forest.
● When performing a find, change the parent
pointers of each node found along the way to
point to the representative.
● Purely using path compression, each
operation has amortized cost O(log n).
● What happens if we combine this with union-by-rank?
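
In code, path compression is a small change to find. Here is a sketch rewriting the find method of the DisjointSetForest class above (our own illustration; the recursion depth equals the path length, so an iterative two-pass version avoids Python's recursion limit on deep trees):

    def find(self, x):
        # After locating the root, make every node on the path point directly at it.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]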
The Claim
● Claim: The runtime of performing m union and find
operations on an n-node disjoint-set forest using path
compression and union-by-rank is O(n + mα(n)),
where α is an extremely slowly-growing function.
● The original proof of this result (which is included in
CLRS) is due to Tarjan and uses a complex amortized
charging scheme.
● Today, we'll use an aggregate analysis due to
Seidel and Sharir based on a technique called forest-
slicing.
Where We're Going
● First, we're going to define our cost model
so we know how to analyze the structure.
● Next, we'll introduce the forest-slicing
approach and use it to prove a key lemma.
● Finally, we'll use that lemma to build
recurrence relations that analyze the
runtime.
Our Cost Model
● The cost of performing a union or find depends on
the length of the paths followed.
● The cost of any one operation is
Θ(1 + #ptr-changes-made)
because each time we visit a node that doesn't
immediately point to its representative, we change
where it points.
● Therefore, the cost of m operations is
Θ(m + #ptr-changes-made)
● We will analyze the number of pointers changed
across the life of the data structure to bound the
overall cost.
Some Accounting Tricks
● To perform a union operation, we need to first
perform two finds.
● After that, only O(1) time is required to perform
the union operation.
● Therefore, we can replace each union(x, y) with
three operations:
● A call to find(x).
● A call to find(y).
● A linking step between the nodes found this way.

● Going forward, we will assume that each union
operation takes worst-case time O(1).
A Slight Simplification
● Currently, find(x) compresses from x up to its
ancestor.
● For mathematical simplicity, we'll introduce an
operation compress(x, y) that compresses
from x upward to y, assuming that y is an
ancestor of x.
● Our analysis will then try to bound the total
cost of the compress operations.
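
A sketch of compress (our own, assuming node objects with a .parent field and that y really is an ancestor of x):

    def compress(x, y):
        # Redirect every node on the path from x up to y to point directly at y.
        while x is not y:
            next_node = x.parent  # save the old parent before rewiring
            x.parent = y
            x = next_node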
Removing the Interleaving
● We will run into some trouble in our
analysis because unions and
compresses can be interleaved.
● To address this, we will remove the
interleaving by pretending that all
unions come before all compresses.
● This does not change the overall work
being done.
Removing the Interleaving
[Figure: the same operations on a small forest, in two orders.
Original order: compress(j, b); union(b, a); compress(h, a),
producing pointer changes f→b, h→b, j→b, b→a, h→a.
Reordered: union(b, a); compress(j, b); compress(h, a),
producing pointer changes b→a, f→b, h→b, j→b, h→a.
The same set of pointer changes occurs in both orders.]
Recap: The Setup
● Transform any sequence of unions and finds as
follows:
● Replace all union operations with two finds and a
union on the ancestors.
● Replace each find operation with a compress
operation indicating its start and end nodes.
● Move all union operations to the front.
● Since all unions are at the front, we build the
entire forest before we begin compressing.
● Can analyze compress assuming the forest has
already been created for us.
A Quick Initial Analysis
An Initial Analysis
● Lemma: Any series of m compress operations on a forest ℱ with
n nodes and maximum rank r makes at most nr pointer changes.
● Proof: Ranks strictly increase as we move up toward a root, so
every time a node's parent pointer changes, the rank of its parent
increases. This can happen at most r times per node, giving an
upper bound of nr. ■

The Forest-Slicing Approach
Forest-Slicing
● Let ℱ be a disjoint-set forest.
● Consider splitting ℱ into two forests ℱ₊
and ℱ₋ with the following properties:
● ℱ₊ is upward-closed: if x ∈ ℱ₊, then any
ancestor of x is also in ℱ₊.
● ℱ₋ is downward-closed: if x ∈ ℱ₋, then any
descendant of x is also in ℱ₋.
● We'll call ℱ₊ the top forest and ℱ₋ the
bottom forest.
Forest-Slicing
[Figure: a forest sliced into ℱ₊ (top) and ℱ₋ (bottom).
Nodes from ℱ₋ never move into ℱ₊ or vice-versa. We retain
the original cut after doing compressions.]
Why Slice Forests?
Forest-Slicing
● Key insight: Each compress operation
is either
● purely in ℱ₊,
● purely in ℱ₋, or
● crosses from ℱ₋ into ℱ₊.
● If we can bound the cost of compress
operations that cross from ℱ₋ to ℱ₊, we
can try to set up a recurrence relation to
analyze the cost of those compresses.
[Figure: a crossing compression, before and after.]
Observation 1: The portion of the compression in ℱ₊ is
equivalent to a compression of the first node in ℱ₊ on
the compression path to the last node in ℱ₊ on the
compression path.
[Figure: the same compression, viewed in ℱ₋.]
Observation 2: The effect of the compression on ℱ₋ is
not the same as the effect of compressing from the first
node in ℱ₋ to the last node in ℱ₋.
[Figure: counting pointer changes in ℱ₋.]
Observation 3: The cost of the compress in ℱ₋ is the
number of nodes in ℱ₋ that got a parent in ℱ₊, plus
(possibly) one more for the topmost node in ℱ₋ on the
compression path.
The Cost of Crossing Compressions
● Suppose we do m compressions, of which m₊ of
them cross from ℱ₋ into ℱ₊.
● We can upper bound the cost of these
compressions as the sum of the following:
● the cost of all the tops of those compressions,
which occur purely in ℱ₊;
● the number of nodes in ℱ₋, since each node in
ℱ₋ gets a parent in ℱ₊ for the first time at most
once; and
● m₊, since each compression may change the
pointer of the topmost node on the path in ℱ₋.
Theorem: Let ℱ be a disjoint-set forest and let ℱ₊
and ℱ₋ be a partition of ℱ into top and bottom
forests.
Then for any series of m compressions C, there exist
two sequences of compressions
– C₊, a series of m₊ compressions purely in ℱ₊; and
– C₋, a series of m₋ compressions purely in ℱ₋,

such that
· m₊ + m₋ = m
· cost(C) ≤ cost(C₊) + cost(C₋) + n + m₊

[Term by term: cost(C₊) + cost(C₋) covers compressions that
appear purely in ℱ₊ or purely in ℱ₋, plus the tops of crossing
compressions; n covers nodes in ℱ₋ getting their first parent
in ℱ₊; m₊ covers nodes in ℱ₋ having their parent in ℱ₊ change.]
Time-Out for Announcements!
The midterm is tonight
from 7PM – 10PM
in room 320-105.

Good luck!
Back to CS166!
The Main Analysis
Where We Are
● We now have a sort of recurrence relation for
evaluating the runtime of a series C of m
compresses on an n-node forest ℱ sliced into ℱ₊
and ℱ₋:
cost(C) ≤ cost(C₊) + cost(C₋) + n + m₊
● This recurrence relation assumes that we already
know how we've sliced ℱ into ℱ₊ and ℱ₋.
● To complete the analysis, we're going to need to
precisely quantify what happens if we slice the
forest in a number of different ways.
Natural Slices
● One “natural” way to slice a forest ℱ into ℱ₊ and
ℱ₋ is to pick some threshold rank. We then
choose ℱ₊ to be all the nodes whose rank is above
the threshold and ℱ₋ to be all the other nodes.

[Figure: a forest with each node labeled by its rank; slicing at
a threshold rank puts the high-rank nodes in ℱ₊ and the rest in ℱ₋.]
Natural Slices
● If our initial forest has maximum rank r and we
slice the forest at rank r', the bottom forest has
maximum rank r' and the top forest is
(essentially) a forest of rank r – r'.
[Figure: the same forest after slicing; the top forest's ranks,
shifted down, run from 0 to r – r', while the bottom forest
keeps ranks 0 through r'.]
Slicing our Forest
● Imagine that we have our forest ℱ of
maximum rank r.
● Suppose we slice the forest into ℱ₊
and ℱ₋ at some rank r'.
● We know that
cost(C) ≤ cost(C₊) + cost(C₋) + n + m₊.
● Let's investigate cost(C₊) and cost(C₋)
independently.
The Top Forest
● Let's begin by thinking about cost(C₊), the cost of
compresses in the top forest ℱ₊.
● Recall: ℱ₊ consists of all nodes of rank r' or higher.
● Intuitively, we'd expect there to not be “too many”
nodes in the top forest, since it's exponentially harder
to reach progressively higher ranks.
● Using our lemma from before, we know that there can
be at most n / 2^r' nodes in ℱ₊.
● Therefore, using our (weak) bound from before, we
see that
cost(C₊) ≤ nr / 2^r'.
Slicing our Forest
● Imagine that we have our forest ℱ of maximum
rank r.
● Suppose we slice the forest into ℱ₊ and ℱ₋ at
some rank r'.
● We know that
cost(C) ≤ cost(C₊) + cost(C₋) + n + m₊.
● Therefore
cost(C) ≤ nr / 2^r' + cost(C₋) + n + m₊.
● Let's now go investigate cost(C₋).
Improving our Recurrence
cost(C) ≤ nr / 2^r' + cost(C₋) + n + m₊.
● Notice that cost(C) is the cost of
● doing m compresses,
● in an n-node forest, with
● maximum rank r.
● We now have cost(C₋), which is the cost of
● doing m₋ compresses,
● in a forest with at most n nodes, with
● maximum rank r'.
● Let's make these dependencies more explicit.
Improving our Recurrence
cost(C) ≤ nr / 2^r' + cost(C₋) + n + m₊.
● Define T(m, n, r) to be the cost of
● performing m compress operations,
● in a forest of at most n nodes, where
● the maximum rank is r.
● The above recurrence can be rewritten as
T(m, n, r) ≤ T(m₋, n, r') + nr / 2^r' + n + m₊
● Now, we “just” need to solve this recurrence.
Don't worry... it's not too bad!
Finalizing our Recurrence
T(m, n, r) ≤ T(m₋, n, r') + nr / 2^r' + n + m₊
● The above recurrence depends on choosing r' as a
function of r.
● If we make r' too large, then the recurrence
relation takes too long to bottom out and we'll
expect a higher runtime.
● If we make r' too small, the nr / 2^r' term will be too
large and our analysis won't be tight.
● How do we balance these terms out?
Finalizing our Recurrence
T(m, n, r) ≤ T(m₋, n, r') + nr / 2^r' + n + m₊
● Idea: Choose r' = lg r. Then nr / 2^r' = nr / r = n, so
T(m, n, r) ≤ T(m₋, n, lg r) + 2n + m₊.
● Imagine that this recurrence expands out L times
before it bottoms out. Think about what happens:
● The 2n term gets summed in L times.
● The m₊ term – the number of compresses in the
top forest – sums up to at most m across all
compressions.
● Overall, we get T(m, n, r) ≤ 2nL + m.
Iterated Logarithms
● We now have
T(m, n, r) ≤ 2nL + m.
● The quantity L represents the number of layers in the
recurrence, and at each step we have r dropping to
lg r.
● The iterated logarithm, denoted lg* n, is the number
of times we can apply lg to n before it drops to some
constant (say, 2). Therefore:
T(m, n, r) ≤ 2n lg* r + m.
● And since the maximum rank is at most lg n, we see
that the cost of performing m operations on an n-node
forest is O(n lg* n + m).
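
A tiny Python sketch of lg* (ours, not from the slides), counting how many times lg must be applied before the value drops to at most 2; floating point is fine for illustration, though it overflows long before lg* n can reach 6:

    import math

    def lg_star(n):
        """Number of times lg can be applied before the value drops to <= 2."""
        count = 0
        while n > 2:
            n = math.log2(n)   # one application of lg
            count += 1
        return count

    # Under this convention: lg_star(65536) == 3, since 65536 -> 16 -> 4 -> 2.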
Iterated Logarithms
● The function lg n is the inverse of the function
2^n = 2 × 2 × … × 2 (n copies of 2).
● The tetration operation, denoted ⁿ2, is given
by ⁿ2 = 2^(2^(⋯^2)), a tower of n copies of 2 in the
exponents. It grows extremely quickly!
● The function lg* n is the inverse of tetration. It
grows extremely slowly!
● Useful fact: lg* n ≤ 5 for any n less than or
equal to the number of atoms in the universe.
Our Strategy
● Let's recap how we got here.
● We began with a forest ℱ of maximum rank r.
● We sliced ℱ at rank lg r.
● We (directly) obtained a weak bound on the cost of the
compressions in the (small) forest ℱ₊. (Cost here: n.)
● We recursively obtained a (good) bound on the cost of the
compressions in the (larger) forest ℱ₋.
(Cost here: T(m₋, n, lg r).)
● We solved the recurrence to get the bound
T(m, n, r) ≤ 2n lg* r + m.
Our Strategy
What could we do to tighten the runtime bound?
● Option 1: Tighten the bound on the cost of the top forest.
[Callout: Previously, we used our weak bound that the cost of
any series of operations on n nodes in a forest of maximum
rank r was at most nr. We now have a bound of 2n lg* r + m,
which is much tighter.]
● Option 2: Slice the forest even lower to make the recursion
tree shorter.
Our Strategy
What could we do to tighten the runtime bound?
● Option 1: Tighten the bound on the cost of the top forest.
● Option 2: Slice the forest even lower to make the recursion
tree shorter.
[Callout: If we have a tighter bound on the cost of the top
forest, we can afford to have more nodes in the top forest at
the same cost, so we can slice the bottom forest even deeper.]
Slicing our Forest, Again
● Imagine that we have a forest ℱ of maximum rank r.
● Suppose we slice the forest into ℱ₊ and ℱ₋ at
some rank r'.
● We know that
cost(C) ≤ cost(C₊) + cost(C₋) + n + m₊.
● Therefore
T(m, n, r) ≤ cost(C₊) + T(m₋, n, r') + n + m₊.
● Let's investigate cost(C₊) using our previous analysis.
The Top Forest
● Lemma: In an n-node forest ℱ of maximum rank r, if
we split ℱ into ℱ₊ and ℱ₋ by cutting the forest at
rank r', then cost(C₊) ≤ 2(n / 2^r') lg* r + m₊.
● Proof: There are at most n / 2^r' nodes in this forest and the
maximum rank is at most r. The cost of performing
m₊ compress operations here is therefore at most
2(n / 2^r') lg* r + m₊. ■
● Observation: Our previous bound was
rn / 2^r'.
We previously set r' = lg r because that was as low as
we could go without cost(C₊) being too high. With
our new bound, we can afford to make r' much lower.
Our Recurrence
● We had
T(m, n, r) ≤ cost(C₊) + T(m₋, n, r') + n + m₊.
● So we now have
T(m, n, r) ≤ T(m₋, n, r') + 2(n / 2^r') lg* r + n + 2m₊.
● Previously, we picked r' = lg r and ended up with a bound in
terms of lg* r.
● Now, we pick r' = lg* r. Since 2 lg* r / 2^(lg* r) ≤ 1, the middle
term is at most n, and we have
T(m, n, r) ≤ T(m₋, n, lg* r) + 2n + 2m₊.
● Using a similar analysis as before, if L is the number of
layers in the recurrence, this solves to
T(m, n, r) ≤ 2nL + 2m.
Iterated Iteration
● We have
T(m, n, r) ≤ 2nL + 2m,
where L is the number of layers in the iteration.
● At each step, we shrink r to lg* r. The maximum
number of times we can do this is denoted lg** r, so
we have
T(m, n, r) ≤ 2n lg** r + 2m.
● So the cost of any m operations is O(n lg** n + m).
Iterated Iterated Logarithms

● The pentation operation is next in the family of fast-
growing functions.
● Just as tetration is iterated exponentiation, pentation is
iterated tetration: 2 pentated to the nth power is a tower
of 2s whose height is itself a tower of 2s, and so on,
nested n times.
● The function lg** n is the inverse of pentation. It grows
unbelievably slowly!
Our Strategy
● Let's recap how we got here.
● We began with a forest ℱ of maximum rank r.
● We sliced ℱ at rank lg* r.
● We (directly) obtained a weak bound on the cost of the
compressions in the (small) forest ℱ₊. (Cost here: n + m₊.)
● We recursively obtained a (good) bound on the cost of the
compressions in the (larger) forest ℱ₋.
(Cost here: T(m₋, n, lg* r).)
● We solved the recurrence to get the bound
T(m, n, r) ≤ 2n lg** r + 2m.
Our Strategy
What could we do to tighten the runtime bound?
● Option 1: Tighten the bound on the cost of the top forest.
[Callout: Previously, we used our weak bound that the cost of
any series of operations on n nodes in a forest of maximum
rank r was at most 2n lg* r + m. We now have a bound of
2n lg** r + 2m, which is much tighter.]
● Option 2: Slice the forest even lower to make the recursion
tree shorter.
Our Strategy
What could we do to tighten the runtime bound?
● Option 1: Tighten the bound on the cost of the top forest.
● Option 2: Slice the forest even lower to make the recursion
tree shorter.
[Callout: If we have a tighter bound on the cost of the top
forest, we can afford to have more nodes in the top forest at
the same cost, so we can slice the bottom forest even deeper.]
The Feedback Lemma
● Lemma: Suppose we know that
T(m, n, r) ≤ 2n lg*(k) r + km,
where lg*(k) denotes lg followed by k stars (so
lg*(1) = lg* and lg*(2) = lg**). Then
T(m, n, r) ≤ 2n lg*(k+1) r + (k+1)m.
● Proof: Induction! Use the previous proof as a
template: split the forest at rank lg*(k) r, use the
known bound to bound the cost of the top
forest, and use recursion to bound the cost of
the bottom forest. ■
The Final Steps
● For any k ∈ ℕ, we have
T(m, n, r) ≤ 2n lg*(k) r + km.
● We can upper-bound r at log n, so we have
T(m, n) ≤ 2n lg*(k) n + km.
● As n gets larger and larger, we can increase the value of
k to make the lg*(k) n term at most some constant value.
● Question: What is that k, as a function of n?
● The Ackermann inverse function, denoted α(n), is
α(n) = min{ k ∈ ℕ | lg*(k) n ≤ 3 }
● Theorem: The cost of performing any m operations on
any n-node disjoint set forest using union-by-rank and
path compression is O(n + mα(n)).
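
Putting the definition into code (our own sketch, not from the slides; lg*(k) means lg followed by k stars, and floats stand in for the astronomically large inputs where α actually grows):

    import math

    def lg_star_k(n, k):
        """lg*(k) n: plain lg when k == 0; otherwise, the number of times
        lg*(k-1) must be applied before the value drops to <= 2."""
        if k == 0:
            return math.log2(n)
        count = 0
        while n > 2:
            n = lg_star_k(n, k - 1)
            count += 1
        return count

    def alpha(n):
        """Ackermann inverse: the least k for which lg*(k) n <= 3."""
        k = 0
        while lg_star_k(n, k) > 3:
            k += 1
        return k

    # Even for the largest double (about 1.8e308), alpha returns just 2.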
Intuiting α(n)
● Imagine we want to define some function A such that
● A(n, 0) = 2
● A(n, 1) = 2 + 2 + … + 2 = 2n
● A(n, 2) = 2 × 2 × … × 2 = 2^n
● A(n, 3) = 2^(2^(⋯^2)) = ⁿ2 (tetration)
● A(n, 4) = 2 pentated to the nth power (pentation)
● A(n, 5) doesn't have a name, but scares children.
● The function A is called an Ackermann-type
function. There are a number of different functions in
this family, but they all (fundamentally) apply higher
and higher orders of operations to their arguments.
Intuiting α(n)

● Theorem: Asymptotically, the function α(n) is the inverse
of A(n, n), hence the name “Ackermann inverse.”
● Intuition:
● lg n is the inverse of 2^n, which is A(n, 2).
● lg* n is the inverse of ⁿ2 (tetration), which is A(n, 3).
● lg** n is the inverse of pentation, which is A(n, 4).
● α(n) tells you how many stars you need to make lg*(k) n drop to a
constant, which essentially asks what order of operation you
need to invert.
● This function grows more slowly than any of the iterated-
logarithm families. It's so slowly-growing that an input
that would make it more than, say, 10 can't even be
expressed without inventing special notation for fast-
growing numbers.
Intuiting α(n)
● If you keep dividing by two, you should
expect a log term.
● If you keep taking logs, you should
expect a log* term.
● If you keep taking log*s, you should
expect a log** term.
● If you keep adding stars to your logs, you
should expect an α term.
Some Notes on α(n)
● The term α(n) arises in many different algorithms:
● Range semigroup queries: there's a lower bound of α(n) on
the cost of a query under certain algebraic assumptions.
● Minimum spanning trees: the fastest known deterministic
MST algorithm runs in time O(mα(n)) due to a connection to
the above topic.
● Splay trees: imagine you treat a splay tree as a deque.
Hilariously, the best bound we have on the runtime of
performing n deque operations is O(nα*(n)). It's suspected to
be O(n), but this hasn't been proven.
● α(n) and its variants are the slowest-growing functions
that are routinely encountered in algorithms and data
structures. And now you know where it comes from!
Next Time
● Euler Tour Trees
● Fully dynamic connectivity in forests.
● Dynamic Graphs
● Fully dynamic connectivity in general graphs
(ITA).
