Analysis of Iterative Algorithms
1 / 46
Topics covered today
Introduction
Average-case analysis
Amortized/aggregate analysis
Disjoint sets
Review on summations
2 / 46
Motivation
In earlier lectures we analyzed the running time of these algorithms
for the worst and best cases
3 / 46
Possible problems with a worst-case analysis
4 / 46
Average-Case Analysis
Worst-case analysis seeks the input of size n that has the largest
running time. In best-case analysis, we identify the input of size n
that has the lowest running time. Average-case analysis instead
averages over all inputs:

T_avg(n) = ∑_{i input of size n} p_i · t_i

where p_i is the probability that input i will occur out of all inputs of
size n, and t_i is the number of elementary operations performed on
input i.
5 / 46
Problem :
If all inputs of size n are equally likely to occur, then each input i of
size n occurs with probability 1/(number of inputs of size n) and

T_avg(n) = ( ∑_{i input of size n} {# of elementary ops performed for input i} ) / (# of inputs of size n).
6 / 46
Example Average-Case Analysis
LinearSearch (L[1..n], x)
Foundx = false ; i = 1 ;
while (i ≤ n & Foundx = false) do
if L[i] = x then
Foundx = true ;
else
i = i + 1;
if not Foundx then i = 0 ;
return i ;
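A runnable version of this pseudocode may help; the following Python sketch keeps the 1-based indexing of the slides, returning 0 when x is not found:

```python
def linear_search(L, x):
    """Return the 1-based position of x in L, or 0 if x is not in L."""
    found_x = False
    i = 1
    while i <= len(L) and not found_x:
        if L[i - 1] == x:       # L[i] in the 1-based pseudocode
            found_x = True
        else:
            i = i + 1
    if not found_x:
        i = 0
    return i
```

For example, linear_search([5, 3, 7], 3) returns 2, and linear_search([5, 3, 7], 4) returns 0.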
7 / 46
What are all the possible inputs of size n ?
LinearSearch (L[1..n], x)
Foundx = false ; i = 1 ;
while (i ≤ n & Foundx = false) do
if L[i] = x then
Foundx = true ;
else
i = i + 1;
if not Foundx then i = 0 ;
return i ;
8 / 46
What is the running time of each input ?
For that we need to identify a basic operation in the pseudocode and
count how many times this instruction is executed for each different
input of size n
9 / 46
What is the running time of each input ?
LinearSearch (L[1..n], x)
Foundx = false ; i = 1 ;
while (i ≤ n & Foundx = false) do
if L[i] = x then
Foundx = true ;
else
i = i + 1;
if not Foundx then i = 0 ;
return i ;
Category   Description                  # of basic ops
1          x will be found in L[1]      1
2          x will be found in L[2]      2
3          x will be found in L[3]      3
...        ...                          ...
n          x will be found in L[n]      n
n+1        x will not be found in L     n
10 / 46
Assumptions about the probability distribution :
Let p be the probability that x occurs somewhere in L. If x is in L, it
is equally likely to be in any of the n positions, so each category i
(1 ≤ i ≤ n) occurs with probability p/n, and x is absent (category
n + 1) with probability 1 − p.
11 / 46
The average-case formulation for linear search
Category   Description                  Probability   # of basic ops
1          x will be found in L[1]      p/n           1
2          x will be found in L[2]      p/n           2
3          x will be found in L[3]      p/n           3
...        ...                          ...           ...
n          x will be found in L[n]      p/n           n
n+1        x will not be found in L     1 − p         n
12 / 46
Computing the average running time of linear search

T_avg(n) = ∑_{i=1}^{n} (p/n) · i + (1 − p) · n
         = (p/n) · n(n + 1)/2 + (1 − p) · n
         = p(n + 1)/2 + (1 − p) · n
13 / 46
▶ If x is always found (p = 1), then approximately half the entries of
L are compared to x.
▶ If x is never found (p = 0), then x is compared to all entries of L.
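These observations can be checked numerically; a small sketch that evaluates the category table above for a given p and compares the result against the closed form p(n + 1)/2 + (1 − p)n:

```python
def avg_comparisons(n, p):
    """Average number of comparisons of linear search when x is at
    position i with probability p/n (i = 1..n) and absent with 1 - p."""
    found = sum((p / n) * i for i in range(1, n + 1))   # categories 1..n
    absent = (1 - p) * n                                # category n+1
    return found + absent

n = 100
for p in [0.0, 0.5, 1.0]:
    closed_form = p * (n + 1) / 2 + (1 - p) * n
    assert abs(avg_comparisons(n, p) - closed_form) < 1e-9
```

With p = 1 this gives (n + 1)/2, about half the entries; with p = 0 it gives exactly n.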
14 / 46
Amortized analysis
15 / 46
Amortized analysis
Assume you have the following algorithm :
int a = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        if (i ̸= 0) then
            a = a − 1;
            j = n;
        else a = a + 1;
Based on standard iterative-algorithm analysis, since an inner for
loop that runs O(n) iterations in the worst case is nested inside an
outer loop that also runs n iterations, we conclude that the time
complexity of this algorithm is O(n²).
While O(n²) is correct, it is not the tightest bound we can get, since
the worst case of the inner for loop occurs for only one iteration of
the outer loop (i = 0); for every other value of i the inner loop exits
after a single pass.
16 / 46
Amortized analysis
The total number of inner-loop iterations is averaged over all n
iterations of the outer loop: 2n/n = 2, so the amortized cost per
outer-loop iteration is 2, and the total running time is O(n).
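With the inner loop fixed to increment j rather than i, this count can be checked directly; a sketch that tallies executions of the inner-loop body:

```python
def count_inner_iterations(n):
    """Count executions of the inner-loop body in the nested loops
    above: the inner loop runs n times only when i == 0, and exits
    after one iteration for every other i."""
    count = 0
    a = 0
    for i in range(n):
        j = 0
        while j < n:
            count += 1
            if i != 0:
                a -= 1
                j = n          # mimics j = n: exit inner loop
            else:
                a += 1
                j += 1
    return count

assert count_inner_iterations(1000) == 2 * 1000 - 1   # n + (n - 1)
```

The total is n + (n − 1) = 2n − 1 inner iterations, about 2 per outer iteration.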
17 / 46
Amortized analysis : stacks
Stacks support two constant-time operations: push(s, x) puts an
element x on the top of the stack s, and pop(s) takes the top element
off of the stack s. Both operations cost O(1), so a total of n
operations (in any order) results in O(n) total time.
18 / 46
Amortized analysis : stacks
A third operation, multipop(s, k), pops the top min(k, |s|) elements
off the stack s; a single multipop can therefore cost O(n) in the
worst case.
19 / 46
Amortized analysis : a second example
We notice that each element can be popped at most once per time that
it is pushed. Therefore, in a sequence of n push, pop and multipop
operations :
▶ # pushes ≤ n, which implies # pops ≤ n as well, including those
in multipop
▶ The total number of push and pop operations is O(n)
▶ Therefore the average cost per operation (including multipop) is
O(n)/n = O(1).
The total cost of executing n push, pop, and multipop operations is O(n).
This amortized analysis technique is also called aggregate analysis.
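The aggregate bound can be illustrated with a sketch; the multipop helper below is a hypothetical implementation, not code from the slides:

```python
import random

def multipop(stack, k):
    """Pop min(k, len(stack)) elements; return #pops performed."""
    pops = 0
    while stack and pops < k:
        stack.pop()
        pops += 1
    return pops

# Aggregate argument: over any sequence of n operations, the total
# number of pops (including those inside multipop) cannot exceed the
# total number of pushes, which is at most n.
random.seed(1)
stack, pushes, pops = [], 0, 0
n = 10_000
for _ in range(n):
    op = random.choice(["push", "pop", "multipop"])
    if op == "push":
        stack.append(0)
        pushes += 1
    elif op == "pop" and stack:
        stack.pop()
        pops += 1
    else:
        pops += multipop(stack, random.randint(1, 50))
assert pops <= pushes <= n    # total cost O(n), amortized O(1) per op
```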
20 / 46
Exercise 1 on amortized analysis
Show how to implement a FIFO queue using two stacks, stack1 and
stack2.
21 / 46
Exercise 1 on amortized analysis (cont.)
O(n²) is not a very accurate characterization of the time needed for a sequence of n
enqueue and dequeue operations, even though in the worst case an individual
dequeue can take O(n) time.
1. Write pseudocode for this algorithm that is detailed enough to allow an
analysis of the running time.
2. Argue for a better bound on the running time of your algorithm using
amortized analysis. To simplify the amortized analysis, consider only the cost
of the push and pop operations, not that of checking whether stack2 is empty.
22 / 46
Solution : algorithm
Enqueue(x) : push(stack1, x)
Dequeue() :
    if stack2 is empty then
        while stack1 is not empty do
            push(stack2, pop(stack1))
    return pop(stack2)
23 / 46
Solution : amortized analysis
The amortized cost of the dequeue covers the final pop, plus the
verification of the stack being empty. Its amortized cost is 2.
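The two-stack queue can be sketched in Python as follows (stack1 receives enqueues; stack2 is refilled from stack1 only when it runs empty, which preserves FIFO order):

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks (Python lists)."""

    def __init__(self):
        self.stack1 = []   # receives enqueued items
        self.stack2 = []   # holds items in dequeue order

    def enqueue(self, x):
        self.stack1.append(x)          # one push per enqueue

    def dequeue(self):
        if not self.stack2:            # refill only when stack2 is empty
            while self.stack1:
                self.stack2.append(self.stack1.pop())
        return self.stack2.pop()       # raises IndexError if queue empty

q = TwoStackQueue()
for v in [1, 2, 3]:
    q.enqueue(v)
assert q.dequeue() == 1
q.enqueue(4)
assert [q.dequeue() for _ in range(3)] == [2, 3, 4]
```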
24 / 46
Exercise 2 on amortized analysis
If the number of items in the queue gets larger than n, then the array
is reallocated to an array of size 2n, and all items are copied from the
old array to the new array. Show that the amortized cost of an
enqueue is O(1).
25 / 46
Solution to exercise 2
We only consider the enqueue operation since this operation may cause to “resize”
the array if full. Enqueue cost O(1).
Assume we start with an array of size 1, inserting item i1 . The insertion of the next
item, i2 , cause “resizing” the array to size 2, thus resizing the array cost 1, for
copying item i1 from the old array into the new one
Inserting i3 cause “resizing” the array from size 2 to 4, resizing cost 2, for copying
items i1 and i2 from the old array into the new one
Inserting i5 cause “resizing” the array from size 4 to 8, resizing cost 4, for copying
items i1 to i4 from the old array into the new one
Thus the cost of resizings can be expressed with the following summation :
1 + 2 + 4 + 8 + · · · + 2i for 2i < n. This summation is smaller than 2n
The total cost of n enqueues = n + cost of resizes (< 2n) < 3n
Average over n enqueue operations is < 3, thus the amortized cost is O(1)
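This argument can be checked by simulating the doubling array and counting one unit per insertion plus one per copied item (a sketch, assuming the array starts at capacity 1):

```python
def total_enqueue_cost(n):
    """Total cost of n enqueues into a doubling array: 1 per insertion,
    plus the number of items copied at each resize."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:          # array full: resize to 2*capacity
            cost += size              # copy all current items
            capacity *= 2
        size += 1
        cost += 1                     # the insertion itself
    return cost

for n in [1, 2, 7, 100, 1000]:
    assert total_enqueue_cost(n) < 3 * n   # amortized cost per enqueue < 3
```

For n = 7 the cost is 7 insertions plus 1 + 2 + 4 = 7 copies, i.e. 14 < 21.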
26 / 46
Disjoint sets
27 / 46
Disjoint Set implementation
Each set is represented by its own linked list.
Each set has a "head" pointer pointing to the first element in the list
and a "tail" pointer pointing to the last element.
Each element in the list contains a pointer to the next element in the
list, and a pointer back to the head of the list.
The representative of the set is the first element in the list.
28 / 46
Implementation of disjoint set union operation
Union(x, y) is obtained by appending y's list onto the end of x's list
The representative of x's list becomes the representative of the
resulting set
The tail pointer for x's list is used to quickly find where to append y's
list.
29 / 46
Implementation of disjoint set union operation
We need to update the tail pointer of the first list and the next pointer
of the last element in the first list; this cost is constant
We also need to update the head pointer of each element in the second
list; this costs O(l), where l is the length of the second list
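A minimal sketch of this linked-list representation (class and function names are illustrative, not from the slides):

```python
class Node:
    """One element of a linked-list disjoint set."""
    def __init__(self, value):
        self.value = value
        self.head = self      # back pointer to the representative
        self.next = None

class LinkedListSet:
    def __init__(self, node):
        self.head = node      # representative: first element
        self.tail = node
        self.size = 1

def find_set(node):
    return node.head          # O(1): follow the back pointer

def union(s1, s2):
    """Append s2's list onto the end of s1's list (no weight
    heuristic): costs O(len(s2)) for updating head pointers."""
    s1.tail.next = s2.head
    s1.tail = s2.tail
    node = s2.head
    while node is not None:   # update every element of the second list
        node.head = s1.head
        node = node.next
    s1.size += s2.size
    return s1

na, nb, nc = Node(1), Node(2), Node(3)
sa, sb, sc = LinkedListSet(na), LinkedListSet(nb), LinkedListSet(nc)
union(sa, sb)
union(sa, sc)
assert find_set(nb) is na     # na is now the representative of all three
```

The weighted-union heuristic would simply compare s1.size and s2.size and always append the smaller list.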
30 / 46
Disjoint Set : run time analysis
Assume initially n sets are created which are then merged into a single
set using the union operation :
Operation          number of elements updated
MakeSet(x1)        O(1)
MakeSet(x2)        O(1)
...                ...
MakeSet(xn)        O(1)
Union(x2, x1)      1
Union(x3, x2)      2
Union(x4, x3)      3
...                ...
Union(xn, xn−1)    n − 1

A sequence of 2n − 1 MakeSet and Union operations (n − 1 Unions)
costs O(n²), since the cost of the n − 1 Union operations is
∑_{i=1}^{n−1} i = O(n²)
using the above implementation of the Union operation
31 / 46
Disjoint Set Union : the weight-union heuristic
We can improve the running time of the union operation: always
append the smaller list onto the larger list
Establish an upper bound on the number of times the pointer to the
head of an element x can be updated :
▶ The first time x’s pointer is updated the resulting set must have
had at least 2 members.
▶ The second time the resulting set must have had at least 4
members.
▶ For k ≤ n, after x’s pointer has been updated ⌈lg k⌉ times, the
resulting set must have at least k members.
Since the largest set has at most n members, each object’s pointer is
updated at most ⌈lg n⌉ times over all the union operations.
32 / 46
Performance of disjoint set with weight-union heuristic
Since each element's head pointer is updated at most ⌈lg n⌉ times, a
sequence of m MakeSet, Union, and FindSet operations, n of which
are MakeSet, takes O(m + n lg n) time.
33 / 46
Exercise : weighted-union
Show the data structure that results and the answers returned by the
FIND-SET operations in the following program. Use the linked-list
representation with the weighted-union heuristic
1- for (i = 1; i ≤ 16; i++)
2-     MAKE-SET(xi)
3- for (i = 1; i ≤ 15; i += 2)
4-     UNION(xi, xi+1)
5- for (i = 1; i ≤ 13; i += 4)
6-     UNION(xi, xi+2)
7- UNION(x1, x5)
8- UNION(x9, x13)
9- UNION(x1, x9)
10- FIND-SET(x2)
11- FIND-SET(x9)
34 / 46
Solution : exercise on weighted-union
Originally we have 16 sets, each containing one element xi.
After the for loop on line 3 we have 8 sets of size 2 :
{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}, {11, 12}, {13, 14}, {15, 16}
After the for loop on line 5 we have 4 sets of size 4 :
{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}
Line 7 results in :
{1, 2, 3, 4, 5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}
Line 8 results in :
{1, 2, 3, 4, 5, 6, 7, 8}, {9, 10, 11, 12, 13, 14, 15, 16}
Line 9 results in :
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
FIND-SET(x2) and FIND-SET(x9) each return pointers to x1
35 / 46
Disjoint Set Union : disjoint-set forests implementation
Each element points only to its parent. The root of each tree contains
the representative and is its own parent.
The union operation consists of making the root of one tree point to
the root of the other tree.
In union by rank, we make the root with smaller rank (smaller height)
point to the root with larger rank during a Union operation.
36 / 46
Performance of disjoint-set forests implementation
A Union costs O(1): we just change the root pointer of one tree
With union by rank, the height of each tree is O(lg n), so a FindSet
costs O(lg n)
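A sketch of the forest representation with union by rank (without path compression, which these slides do not cover; names are illustrative):

```python
class DSNode:
    """Disjoint-set forest node: parent pointer plus rank."""
    def __init__(self):
        self.parent = self   # a root is its own parent
        self.rank = 0

def find_set(x):
    while x.parent is not x:   # walk up to the root
        x = x.parent
    return x

def union(x, y):
    rx, ry = find_set(x), find_set(y)
    if rx is ry:
        return
    if rx.rank < ry.rank:      # smaller-rank root points to larger-rank root
        rx, ry = ry, rx
    ry.parent = rx
    if rx.rank == ry.rank:     # equal ranks: new root's rank grows by 1
        rx.rank += 1

nodes = [DSNode() for _ in range(8)]
for i in range(1, 8):
    union(nodes[0], nodes[i])
root = find_set(nodes[0])
assert all(find_set(n) is root for n in nodes)
assert root.rank <= 3          # union by rank keeps height O(lg n)
```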
37 / 46
Review on Summations : Patterns in Sums
38 / 46
Summation Notation
∑_{i=a}^{b} f(i)
▶ Meaning :
∑_{i=a}^{b} f(i) = f(a) + f(a + 1) + f(a + 2) + · · · + f(b)
39 / 46
Example of sums
40 / 46
Operator Priorities
▶ ∑ has the same priority as addition because it represents
additions.
▶ ∑ and + : same priority
Example :
∑_{i=a}^{b} 2i + 1 = (∑_{i=a}^{b} 2i) + 1
▶ ∑ and multiplication : multiplication has higher priority
∑_{i=a}^{b} i × log2 10 = ∑_{i=a}^{b} (i × log2 10)
41 / 46
Sums and Closed Forms
Definition : A closed form for a sum is some function that gives you
the value of the sum with only a constant number of operations.
▶ Example :
for(i = 1; i ≤ n; i++) do
- execute i elementary operations -
end for
▶ 1 + 2 + 3 + · · · + n = ∑_{i=1}^{n} i = n(n + 1)/2.
42 / 46
Example :
for(i = 1; i ≤ n; i++) do
- execute n − i operations -
end for
does
∑_{i=1}^{n} (n − i) = (n − 1) + (n − 2) + (n − 3) + · · · + 1 + 0
= ∑_{i=1}^{n−1} i
= (n − 1)n / 2 operations
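Both closed forms can be checked directly (a quick sketch):

```python
def sum_first(n):
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def sum_n_minus_i(n):
    """Closed form for sum_{i=1}^{n} (n - i)."""
    return (n - 1) * n // 2

for n in range(1, 50):
    assert sum_first(n) == sum(range(1, n + 1))
    assert sum_n_minus_i(n) == sum(n - i for i in range(1, n + 1))
```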
43 / 46
Sums and Closed Forms : Continue
▶ Sum of a constant c :
∑_{i=a}^{b} c = c + c + · · · + c   (b − a + 1 of them)
= (b − a + 1) × c.
44 / 46
▶ Similar to sum of squares :
∑_{i=1}^{u} i(i + 1) = (1 × 2) + (2 × 3) + (3 × 4) + · · · + (u × (u + 1))
= u(u + 1)(u + 2) / 3
▶ Sum of powers of a constant c :
∑_{i=0}^{u} c^i = c^0 + c^1 + c^2 + · · · + c^u
= (c^(u+1) − 1) / (c − 1)
Example : ∑_{i=0}^{2} 3^i = (3^3 − 1)/2 = 13.
45 / 46
▶ Sum of the first u integers times increasing powers of a
constant c :
∑_{i=0}^{u} i·c^i = 0 × c^0 + 1 × c^1 + 2 × c^2 + · · · + u × c^u
= (((c − 1)(u + 1) − c) · c^(u+1) + c) / (c − 1)^2
Example :
∑_{i=0}^{3} i·2^i = 0 × 2^0 + 1 × 2^1 + 2 × 2^2 + 3 × 2^3 = 34
= (((2 − 1)(3 + 1) − 2) · 2^(3+1) + 2) / (2 − 1)^2 = 34
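The last two closed forms can be verified for small values (a sketch; both formulas yield integers for integer c ≥ 2, so exact integer division is safe):

```python
def geometric(c, u):
    """Closed form for sum_{i=0}^{u} c**i."""
    return (c ** (u + 1) - 1) // (c - 1)

def arithmetic_geometric(c, u):
    """Closed form for sum_{i=0}^{u} i * c**i."""
    return (((c - 1) * (u + 1) - c) * c ** (u + 1) + c) // (c - 1) ** 2

for c in [2, 3, 5]:
    for u in range(10):
        assert geometric(c, u) == sum(c ** i for i in range(u + 1))
        assert arithmetic_geometric(c, u) == sum(i * c ** i
                                                 for i in range(u + 1))

assert geometric(3, 2) == 13              # example from the slides
assert arithmetic_geometric(2, 3) == 34   # example from the slides
```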
46 / 46