1 Mathematical Preliminaries
When analyzing an algorithm, the amount of resources required is usually expressed as a function of the input size. A nontrivial algorithm typically consists of repeating a set of instructions either iteratively, e.g. by executing a for or while loop, or recursively by invoking the same algorithm again and again, each time reducing the input size until it becomes small enough, at which point the algorithm solves the input instance using a straightforward method. This implies that the amount of resources used by an algorithm can be expressed as a summation or a recursive formula. Consequently, we need basic mathematical tools for dealing with summations and recursive formulas when analyzing an algorithm. In this section we review some of the mathematical preliminaries and briefly discuss some of these mathematical tools that are frequently employed in the analysis of algorithms.
1.1 Logarithms
Let b be a positive real number greater than 1, x a real number, and suppose that for some positive real number a we have a = b^x. Then, x is called the logarithm of a to the base b, and we write this as

x = log_b a.

Here b is referred to as the base of the logarithm. For any real numbers x and y greater than 0, we have

log_b(xy) = log_b x + log_b y.
The log of a product equals the sum of the logs. Similarly, log_b(x^y) = y log_b x if x > 0.
When b = 2, we will write log x instead of log_2 x. To convert from one base to another, we use the change of base formula: log_b a = log_c a / log_c b for any base c > 1.
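As a quick numerical check (a minimal Python sketch; the particular values are just an illustration), we can verify the product rule and the change of base formula:

import math

x, y, b = 8.0, 32.0, 2.0

# Product rule: log_b(xy) = log_b(x) + log_b(y)
print(math.log(x * y, b), math.log(x, b) + math.log(y, b))   # 8.0 8.0

# Change of base: log_2(a) = ln(a) / ln(2)
a = 100.0
print(math.log(a, 2), math.log(a) / math.log(2))             # both about 6.6439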
1.2 The Floor and Ceiling Functions
The floor of x, denoted by ⌊x⌋, is defined as the greatest integer less than or equal to x. The ceiling of x, denoted by ⌈x⌉, is defined as the least integer greater than or equal to x.
For example,

⌊√2⌋ = 1,  ⌈√2⌉ = 2,  ⌊−2.5⌋ = −3,  ⌈−2.5⌉ = −2.
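These values can be checked directly with Python's math.floor and math.ceil (a small illustrative snippet):

import math

print(math.floor(math.sqrt(2)))   # 1
print(math.ceil(math.sqrt(2)))    # 2
print(math.floor(-2.5))           # -3
print(math.ceil(-2.5))            # -2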
1.3 Counting and Probability
Definition 1. A sample space is the set of all possible outcomes of a random
process or experiment. An event is a subset of a sample space.
In case an experiment has finitely many outcomes and all outcomes are
equally likely to occur, the probability of an event (set of outcomes) is just
the ratio of the number of outcomes in the event to the total number of
outcomes.
Equally Likely Probability Formula
If S is a finite sample space in which all outcomes are equally likely and E is an event in S, then the probability of E, denoted P(E), is

P(E) = (the number of outcomes in E) / (the total number of outcomes in S).
Notation
For any finite set A, N(A) denotes the number of elements in A. With this notation, the equally likely probability formula becomes

P(E) = N(E) / N(S).
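For instance (a standard die-rolling illustration, not taken from the notes): for one roll of a fair die, S = {1, 2, 3, 4, 5, 6}, the event "an even number appears" is E = {2, 4, 6}, and P(E) = N(E)/N(S) = 3/6 = 1/2. A small Python check:

from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}            # sample space: one roll of a fair die
E = {x for x in S if x % 2 == 0}  # event: an even number is rolled

# Equally likely probability formula: P(E) = N(E) / N(S)
P_E = Fraction(len(E), len(S))
print(P_E)                        # 1/2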
Summation
Let S = a1, a2, . . . , an be any finite sequence of numbers. The sum a1 + a2 + · · · + an can be expressed compactly using the notation

∑_{j=1}^{n} a_j.
Theorem 2 (Sum of the First n Integers). For all integers n ≥ 1,

1 + 2 + 3 + · · · + n = n(n + 1)/2.

Writing

1 + 2 + 3 + · · · + n = n(n + 1)/2

expresses the sum 1 + 2 + 3 + · · · + n in closed form.
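A quick numerical check of the closed form (an illustrative Python sketch; the value of n is arbitrary):

n = 1000

# Direct summation versus the closed form n(n + 1)/2
direct = sum(range(1, n + 1))
closed = n * (n + 1) // 2
assert direct == closed == 500500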
For each positive integer n, the quantity n factorial, denoted n!, is defined to be the product of all the integers from 1 to n:

n! = n(n − 1)(n − 2) · · · 3 · 2 · 1.

By convention, 0! = 1.
over the others. So, for example, if you just need to cross the street to get to city B, you can simply walk over; that makes a lot more sense than flying. But if city B were across the country, it would probably make more sense to buy a plane ticket and fly instead of riding your bike.
In computing, algorithms are much the same. Take sorting, for example. If you want to sort a number of elements from smallest to largest, you have many tools available to you, such as insertion sort, selection sort, quicksort, bubble sort, bottom-up merge sort, etc.
Our ultimate goal in algorithm design is to complete the task we want to solve efficiently. By efficiently, we mean that the algorithm should be quick in terms of time and should require little space, because memory is limited in computer systems and can fill up quickly. Hence, efficiency is measured in terms of time and space.
By analyzing algorithms, we can compare them and, depending on the task, pick the best one.
So what is running time analysis?
We want to determine how the running time increases relative to the size of the input: as the input gets bigger, we want to see exactly what that does to the running time.
The input usually consists of n values, so we reason in terms of n. We cannot assume that n is going to be small; it is best to always assume that n is going to be adequately large.
Analyzing the running time without making assumptions about n, or about a particular machine, makes for a timeless concept that translates from old computing systems to modern machines.
So let’s look at some definitions.
Figure 1: Growth of some typical functions that represent running times
f(n) = n³ + n
Here n³ is the higher order term and n is the lower order term. When using asymptotic notation (e.g. the big-Oh notation), for simplicity's sake we drop the lower order term n. Therefore, this function is O(n³).
Figure 1(a) on page 6 shows some functions that are widely used to express the running time of algorithms. They are called, respectively, logarithmic, linear, quadratic and cubic. Higher order, exponential and hyper-exponential functions are not shown in the figure; they grow faster than the ones shown, even for small values of n.
Let us take, for instance, the operation of adding two integers. For the
running time of this operation to be constant, we stipulate that the size of
its operands be fixed no matter what algorithm is used. Furthermore, as we
are now dealing with the asymptotic running time, we can freely choose any
positive integer k to be the ”word length” of our ”model of computation”.
Incidentally, this is but one instance in which the beauty of asymptotic no-
tation shows off; the word length can be any fixed positive integer. If we
want to add arbitrarily large numbers, an algorithm whose running time is
proportional to its input size can easily be written in terms of the elementary
operation of addition. Likewise, we can choose from a large pool of operations and apply the fixed-size condition to obtain as many elementary operations as we wish. The following operations on fixed-size operands are examples of elementary operations.
differences of a constant factor and differences that occur only for small sets
of input data.
The idea of the notations, O−notation (read “big-Oh notation”), Ω−notation
(read “big-Omega notation”) and Θ−notation (read “big-Theta notation”)
is this. Suppose f and g are real-valued functions of a real variable n.
1. If, for sufficiently large values of n, the values of |f | are less than those
of a multiple of |g|, then f is of order at most g, or f (n) is O(g(n)).
2. If, for sufficiently large values of n, the values of |f | are greater than
those of a multiple of |g|, then f is of order at least g, or f (n) is Ω(g(n)).
3. If, for sufficiently large values of n, the values of |f | are bounded both
above and below by those of multiples of |g|, then f is of order g, or
f (n) is Θ(g(n)).
Definition 6 (Big-Oh notation). Let f(n) and g(n) be two functions from the set of natural numbers to the set of nonnegative real numbers. f(n) is said to be O(g(n)) if there exist a natural number n0 and a constant c > 0 such that c|g(n)| ≥ |f(n)| for all n ≥ n0.
Consequently, if lim_{n→∞} f(n)/g(n) exists, then

lim_{n→∞} f(n)/g(n) ≠ ∞ implies f(n) = O(g(n)).
Informally, this definition says that f (n) grows no faster than the product
of some constant times g(n).
The O-notation is sometimes used in equations as a simplification tool. For example, instead of writing out all of the lower order terms explicitly, we may write

f(n) = 10n³ + O(n²).

This is helpful if we are not interested in the details of the lower order terms of the equation.
Definition 7 (Big-Omega notation). Let f(n) and g(n) be two functions from the set of natural numbers to the set of nonnegative real numbers. f(n) is said to be Ω(g(n)) if there exist a natural number n0 and a constant c > 0 such that c|g(n)| ≤ |f(n)| for all n ≥ n0.
Consequently, if lim_{n→∞} f(n)/g(n) exists, then

lim_{n→∞} f(n)/g(n) ≠ 0 implies f(n) = Ω(g(n)).
Informally, this definition says that f grows at least as fast as the product of some constant and g. It is obvious from the definition that f(n) is Ω(g(n)) if and only if g(n) is O(f(n)).
Definition 8 (Big-Theta notation). Let f(n) and g(n) be two functions from the set of natural numbers to the set of nonnegative real numbers. f(n) is said to be Θ(g(n)) if there exist two positive constants c1 and c2 and a natural number n0 such that c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all n ≥ n0.
Consequently, if lim_{n→∞} f(n)/g(n) exists, then

lim_{n→∞} f(n)/g(n) = c for some constant c > 0 implies f(n) = Θ(g(n)).
Type of Analysis
We can analyse or evaluate an algorithm in one of three ways: best case, average case, or worst case analysis (see Table 1).
Table 1: Type of Analysis
Most of the time we use the Big-Oh (worst case) analysis, because it is the most practically useful of the three. We cannot always rely on the best case, since the best case rarely occurs, and the average case is often less informative than the worst case. So what we usually reason about is the worst case: we plan for the worst case scenario.
So, for example, if g(n) = n⁴, then saying that the function f(n) = O(g(n)) is the same as writing

f(n) = O(n⁴).

Is it best to always use Big-Oh notation? Most of the time, yes! However, if f(n) = O(n) and f(n) = Ω(n), then we tend to write f(n) = Θ(n). This is just a formality and widely accepted practice.
So let us look at some examples of algorithms and see what their Big-Oh notation is.
Example 1
A simple loop with a constant time operation m = m + 2 inside it.
1. for i = 0 to n
2. m=m+2
3. end for
If you look at the simple for loop above, inside it is a constant time operation (a simple addition), which we denote by C. On the outside, the loop executes n times. So the running time is a constant times n, given as:

f(n) = C × n
f(n) = O(n)
We observe that the constant time operation is the same; however, the loop executes only n/2 times. Therefore

f(n) = C × (1/2) × n.

Here C × 1/2 is a constant factor and n comes from the loop. Remember, the leading constants are usually ignored! Thus, our function's running time is Big-Oh of n, denoted as

f(n) = O(n).
Even though the second loop only iterates half as many times as the first, they are BOTH O(n)!
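To make this concrete, here is a small Python sketch (our own illustration, not part of the original notes) that counts the constant time operations executed by a loop that runs n times and by one that runs only n/2 times:

def count_ops_full(n):
    # Loop that runs n times; each iteration costs one constant time operation.
    ops = 0
    for _ in range(n):
        ops += 1          # stands in for m = m + 2
    return ops

def count_ops_half(n):
    # Loop that runs only n/2 times.
    ops = 0
    for _ in range(n // 2):
        ops += 1
    return ops

for n in (100, 200, 400):
    print(n, count_ops_full(n), count_ops_half(n))
# Doubling n doubles both counts: both loops grow linearly, i.e. both are O(n).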
We know that C × n² is the highest order term in this case, and it will take precedence. Thus,

f(n) = O(n²)
Furthermore, let’s look at another example dealing with the if-then-else
statement.
Example 5: if-then-else Statement
// if-then-else statement
1. if (x + 1 < 5)
2. return −1
3. else
4. for i = 1 to n
5. for j = 1 to n
6. k = k + 1
7. end for
8. end for
9. return k
Let's build our function as we go. Looking at the if statement, we have a constant time operation (x + 1 < 5), which we denote by C0. If this condition is met, then we have another constant operation (return −1), which we denote by C1. So at this point

f(n) = C0 + C1.

With the else branch, however, the outer loop goes up to n and the inner loop also goes up to n, with a constant operation inside that we denote by C2. Thus,

f(n) = C0 + C1 + n × n × C2.

Grouping the terms,

f(n) = C0 + C1 + C2 × n²
f(n) = O(n²)
Example 6
Consider Algorithm count, which consists of two nested loops and a variable
count which counts the number of iterations performed by the algorithm on
input n, which is a positive integer.
Algorithm count
Input: A positive integer n.
Output: count = number of times Step 5 is executed.
1. count ← 0
2. for i = 1 to n
3. m ← ⌊n/i⌋
4. for j ← 1 to m
5. count ← count + 1
6. end for
7. end for
8. return count
The inner for loop is executed repeatedly for the following values of m:

⌊n/1⌋, ⌊n/2⌋, . . . , ⌊n/n⌋.

Since

∑_{i=1}^{n} (n/i − 1) ≤ ∑_{i=1}^{n} ⌊n/i⌋ ≤ ∑_{i=1}^{n} n/i,

we conclude that Step 5 is executed Θ(n log n) times. (See Example 11.)
As the running time is proportional to count, we conclude that it is Θ(n log n).
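As a sanity check (an illustrative Python sketch; the comparison with n·ln n is only a rough guide to the growth rate), we can run Algorithm count and watch the count grow like n log n:

import math

def count(n):
    # Return the number of times the innermost statement (Step 5) executes.
    c = 0
    for i in range(1, n + 1):
        m = n // i                 # m <- floor(n / i)
        for _ in range(1, m + 1):
            c += 1
    return c

for n in (1_000, 10_000, 100_000):
    print(n, count(n), round(n * math.log(n)))
# count(n) stays within a small constant factor of n log n, as the analysis predicts.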
In general, let f(n) = a_k n^k + a_{k−1} n^{k−1} + · · · + a_1 n + a_0, where a_k > 0. Then, f(n) = Θ(n^k).
Recall that this implies that f(n) = O(n^k) and f(n) = Ω(n^k).
Example 7 Since

lim_{n→∞} (log n²)/n = lim_{n→∞} (2 log n)/n = lim_{n→∞} 2/(n ln 2) = (2/ln 2) lim_{n→∞} 1/n = 0,

we conclude that log n² = O(n).
Example 8 Since log n² = 2 log n, we immediately see that log n² = Θ(log n). In general, for any fixed constant k, log n^k = Θ(log n).
Example 9
Any constant function is O(1), Ω(1) and Θ(1).
Example 10
Consider the series nj=1 log j. Clearly
P
n
X n
X
log l ≤ log n
j=1 j=1
That is n
X
log j = O(n log n).
j=1
Also
n bn/2c
X X n n
log j ≥ log( ) = bn/2c log( ) = bn/2c log n − bn/2c
j=1 j=1
2 2
Thus,
n
X
log j = Ω(n log n)
j=1
It follows that
n
X
log j = Θ(n log n)
j=1
Example 11
We want to find an exact bound for the function f(n) = log n!. First, note that log n! = ∑_{j=1}^{n} log j. We have shown in Example 10 that ∑_{j=1}^{n} log j = Θ(n log n). It follows that log n! = Θ(n log n).
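As a numerical illustration (a sketch, not part of the original notes), the ratio of log n! = ∑ log j to n log n stays between fixed positive constants and slowly creeps toward 1 as n grows:

import math

for n in (10, 100, 1_000, 10_000):
    log_factorial = sum(math.log2(j) for j in range(1, n + 1))   # log n! in base 2
    print(n, round(log_factorial / (n * math.log2(n)), 4))
# The printed ratios increase toward 1, consistent with log n! = Θ(n log n).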
Example 12
It is easy to see that

∑_{j=1}^{n} n/j ≤ ∑_{j=1}^{n} n/1 = O(n²).
In what follows, we list closed form formulas for some of the summations that occur quite often when analyzing algorithms. The proofs of these formulas are left for the student as exercises.
The arithmetic series:

∑_{j=1}^{n} j = n(n + 1)/2 = Θ(n²).     (1)
For the geometric series ∑_{j=0}^{n} c^j = (c^{n+1} − 1)/(c − 1), c ≠ 1, two special cases are worth listing. If c = 2, we have

∑_{j=0}^{n} 2^j = 2^{n+1} − 1 = Θ(2^n).     (4)

If c = 1/2, we have

∑_{j=0}^{n} 1/2^j = 2 − 1/2^n < 2 = Θ(1).     (5)
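Both special cases are easy to check numerically (an illustrative Python sketch; the value of n is arbitrary):

from fractions import Fraction

n = 20

# Sum of 2^j for j = 0..n equals 2^(n+1) - 1
assert sum(2**j for j in range(n + 1)) == 2**(n + 1) - 1

# Sum of (1/2)^j for j = 0..n equals 2 - 1/2^n, which is always less than 2
s = sum(Fraction(1, 2**j) for j in range(n + 1))
assert s == 2 - Fraction(1, 2**n) and s < 2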
f(n) < c g(n) for all n ≥ n0.
Consequently, if lim_{n→∞} f(n)/g(n) exists, then

lim_{n→∞} f(n)/g(n) = 0 implies f(n) = o(g(n)).

For example, saying that n log n is o(n²) is equivalent to saying that n log n is O(n²) but n² is not O(n log n).
and that gives us the least running time (best case). Therefore

C_Best(n) = Ω(1).
To find the maximum number of comparisons, let us first consider applying
the linear search on the array
A = [1, 4, 33, 7, 8, 17, 9, 10, 20, 12, 15, 22, 23, 27, 32, 18, 35].
If we search for 4, we need two comparisons, whereas searching for 8 costs five comparisons. Now, in the case of an unsuccessful search, it is easy to see that searching for an element not in the array takes n comparisons, because we need to search through all the elements in the array. It is not difficult to
see that, in general, the algorithm always performs the maximum number of
comparisons whenever x = A[n] or when x does not appear in the array at
all. Thus in the worst case, the linear search algorithm is O(n). Hence,
C_Worst(n) = O(n)
To find the average number of comparisons, we consider a situation where the element x being searched for exists in the array and is equally likely to occur at any position in the array. Accordingly, the number of comparisons can be any of the numbers 1, 2, 3, . . . , n, each occurring with probability 1/n. Then
C_Avg(n) = 1 × (1/n) + 2 × (1/n) + · · · + n × (1/n)
         = (1/n)[1 + 2 + 3 + · · · + n]
         = (1/n) · n(n + 1)/2
         = (n + 1)/2.
This agrees with our intuitive feeling that the average number of compar-
isons required to find an item is approximately equal to half the number of
elements in the array.
Asymptotically, therefore
C_Avg(n) = Θ(n)
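The following Python sketch (the array contents are an arbitrary illustration) counts the comparisons made by linear search and confirms that the average over all successful searches is (n + 1)/2:

def linear_search(A, x):
    # Return (position, comparisons); position is 0 if x is not found (1-based).
    comparisons = 0
    for i, a in enumerate(A, start=1):
        comparisons += 1
        if a == x:
            return i, comparisons
    return 0, comparisons

A = list(range(1, 101))                       # n = 100 distinct elements
counts = [linear_search(A, x)[1] for x in A]  # search once for each element
print(sum(counts) / len(counts))              # 50.5, i.e. (n + 1)/2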
4 Binary Search
Using the linear search approach for the searching problem, we see that, intuitively, scanning all entries of A[1 . . . n] is inevitable if no information about the ordering of the elements in A is given. If we are also given that the elements in A are sorted, say in nondecreasing order, then there is a much more efficient algorithm. The following example illustrates this efficient search method.
Consider searching the array
A[1 . . . 14] = [1, 4, 5, 7, 8, 9, 10, 12, 15, 22, 23, 27, 32, 35].
Algorithm BINARYSEARCH
Input: An array A[1 . . . n] of n elements sorted in nondecreasing order and an element x.
Output: j if x = A[j], 1 ≤ j ≤ n, and 0 otherwise.
1. low ← 1; high ← n; j ← 0
2. while (low ≤ high) and (j = 0)
3. mid ← ⌊(low + high)/2⌋
4. if x = A[mid] then j ← mid
5. else if x < A[mid] then high ← mid − 1
6. else low ← mid + 1
7. end while
8. return j
A[1 . . . 14] = [1, 4, 5, 7, 8, 9, 10, 12, 15, 22, 23, 27, 32, 35].
If we search for x = 35, in each iteration of the algorithm the bottom half of the remaining array is discarded, until there is only one element left:

[12, 15, 22, 23, 27, 32, 35] −→ [27, 32, 35] −→ [35].
Therefore, to compute the maximum number of element comparisons per-
formed by Algorithm BINARYSEARCH, we may assume that x is greater
than or equal to all elements in the array to be searched. To compute the
number of remaining elements in A[1 . . . n] in the second iteration, there are
two cases to consider according to whether n is even or odd. If n is even,
then the number of entries in A[mid+1 . . . n] is n/2; otherwise it is (n−1)/2.
Thus, in both cases, the number of elements in A[mid + 1 . . . n] is exactly
⌊n/2⌋.
Similarly, the number of remaining elements to be searched in the third iteration is ⌊⌊n/2⌋/2⌋ = ⌊n/4⌋.
In general, in the jth pass through the while loop, the number of remaining elements is ⌊n/2^{j−1}⌋. The iteration is continued until either x is found or the size of the subsequence being searched reaches 1, whichever occurs first. As a result, the maximum number of iterations needed to search for x is that value of j satisfying the condition

⌊n/2^{j−1}⌋ = 1.

By the definition of the floor function, this happens exactly when

1 ≤ n/2^{j−1} < 2,

or

2^{j−1} ≤ n < 2^j,

or

j − 1 ≤ log n < j.

Since j is an integer, we conclude that

j = ⌊log n⌋ + 1.
Example 14
Suppose an array contains 1,000,000 elements. Then ⌊log 1,000,000⌋ + 1 = 19 + 1 = 20. Accordingly, using the binary search algorithm, one requires only about 20 comparisons to find the position of an item in the array.
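A short Python sketch (the sorted array of 1,000,000 integers and the searched values are just an illustration) confirms that the number of iterations never exceeds ⌊log n⌋ + 1 = 20:

def binary_search(A, x):
    # Return (position, iterations); position is 0 if x is not in A (1-based).
    low, high, j, iterations = 1, len(A), 0, 0
    while low <= high and j == 0:
        iterations += 1
        mid = (low + high) // 2
        if x == A[mid - 1]:
            j = mid
        elif x < A[mid - 1]:
            high = mid - 1
        else:
            low = mid + 1
    return j, iterations

A = list(range(1_000_000))                  # n = 1,000,000 sorted elements
worst = max(binary_search(A, x)[1] for x in (A[0], A[-1], -1, 10**9))
print(worst)                                # 20, i.e. floor(log2 n) + 1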
Why, then, would one ever use any other search algorithm? This is because the algorithm requires two conditions: (1) the array must be sorted and (2) one must have direct access to the middle element of any sub-array. This means that one must essentially use a sorted array to hold the data. But keeping data in a sorted array is normally very expensive when there are many insertions and deletions. Accordingly, in such situations, one may use a different data structure, such as a binary search tree, to store the data.
The performance of the binary search algorithm can be described in terms of a decision tree, which is a binary tree that exhibits the behavior of the algorithm. Figure 2 shows the decision tree corresponding to the array given in Example 13, when searching for the element x = 22. The darkened nodes are those compared against x.
Note that the decision tree is a function of the number of elements in
the array only. Figure 3 on page 23 shows two decision trees corresponding
to two arrays of sizes 10 and 14, respectively. As implied by the two figures,
the maximum number of comparisons in both trees is 4. In general, the
maximum number of comparisons is one plus the height of the corresponding
decision tree. Since the height of such a tree is blog nc, we conclude that the
maximum number of comparisons is blog nc + 1. We have in effect given two
proofs of the following theorem:
Figure 3: Two decision trees corresponding to two arrays of sizes 10 and 14.
C(n) ≤ C(⌊n/2⌋) + 1
     ≤ C(⌊⌊n/2⌋/2⌋) + 1 + 1
     = C(⌊n/4⌋) + 1 + 1
     ...
     ≤ ⌊log n⌋ + 1.

That is, C(n) ≤ ⌊log n⌋ + 1. It follows that C(n) = O(log n). Since the operation of element comparison is a basic operation in Algorithm BINARYSEARCH, we conclude that its time complexity is O(log n).
8. B[k] ←− A[t]
9. t ←− t + 1
10. end if
11. k ←− k + 1
12. end while
13. if s = q + 1 then B[k..r] ←− A[t..r]
14. else B[k..r] ←− A[s..q]
15. end if
16. A[p..r] ←− B[p..r]
Let n denote the size of the array A[p..r] in the input to Algorithm
MERGE, i.e., n = r − p + 1. We want to find the number of comparisons that
are needed to rearrange the entries of A[p..r]. It should be emphasized that
from now on when we talk about the number of comparisons performed by
an algorithm, we mean element comparisons, i.e., the comparisons involving
objects in the input data. Thus, all other comparisons, e.g. those needed for
the implementation of the while loop, will be excluded.
the while loop, the if statements, etc. in order to find out how the algorithm
works and then compute the number of element assignments.
However, it is easy to see that each entry of array B is assigned exactly once.
Similarly, each entry of array A is assigned exactly once, when copying B
back into A. As a result, we have the following observation:
Observation 2. The number of element assignments performed by Algorithm
MERGE to merge two arrays into one sorted array of size n is exactly 2n.
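The following Python sketch of the merging step (a standard two-way merge written with 0-based indices; the variable names are ours) counts element comparisons and element assignments, including the copy of the auxiliary array B back into A, and so reproduces Observation 2's count of 2n assignments:

def merge(A, p, q, r):
    # Merge the sorted runs A[p..q] and A[q+1..r] (0-based, inclusive) in place.
    # Returns (comparisons, assignments), counting assignments into B and the
    # copy of B back into A.
    B, comparisons, assignments = [], 0, 0
    s, t = p, q + 1
    while s <= q and t <= r:
        comparisons += 1
        if A[s] <= A[t]:
            B.append(A[s]); s += 1
        else:
            B.append(A[t]); t += 1
        assignments += 1
    leftover = A[s:q + 1] if s <= q else A[t:r + 1]
    B.extend(leftover)
    assignments += len(leftover)
    A[p:r + 1] = B                       # copy B back into A
    assignments += len(B)
    return comparisons, assignments

A = [2, 5, 9, 1, 4, 6, 8]                # two sorted runs of sizes 3 and 4, n = 7
print(merge(A, 0, 2, 6), A)              # (6, 14): assignments = 2n = 14; A is sorted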
Algorithm SELECTIONSORT
Input: An array A[1 . . . n] of n elements.
Output: A[1 . . . n] sorted in nondecreasing order.
1. for i ←− 1 to n − 1
2. k ←− i
3. for j ←− i + 1 to n {Find the ith smallest element.}
4. if A[j] < A[k] then k ←− j
5. end for
6. if k ≠ i then interchange A[i] and A[k]
7. end for
It is easy to see that the number of element comparisons performed by the algorithm is exactly

f(n) = ∑_{i=1}^{n−1} (n − i) = (n − 1) + (n − 2) + · · · + 1 = ∑_{i=1}^{n−1} i = n(n − 1)/2.
Observation 3. The number of element comparisons performed by Algo-
rithm SELECTIONSORT is n(n−1)/2. The number of element assignments
is between 0 and 3(n − 1).
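A Python sketch of Algorithm SELECTIONSORT with counters (the input array is an arbitrary illustration) confirms the counts in Observation 3:

def selection_sort(A):
    # Sort A in place; return (comparisons, assignments due to interchanges).
    n = len(A)
    comparisons = assignments = 0
    for i in range(n - 1):
        k = i
        for j in range(i + 1, n):        # find the (i+1)-st smallest element
            comparisons += 1
            if A[j] < A[k]:
                k = j
        if k != i:
            A[i], A[k] = A[k], A[i]      # one interchange = 3 element assignments
            assignments += 3
    return comparisons, assignments

A = [9, 2, 7, 1, 8, 5, 3, 6, 4]          # n = 9
print(selection_sort(A), A)              # comparisons = 9 * 8 / 2 = 36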
It is easy to see that the number of element comparisons is minimum when the array is already sorted in nondecreasing order. In this case, the number of element comparisons is exactly n − 1, as each element A[i], 2 ≤ i ≤ n, is compared with A[i − 1] only. On the other hand, the maximum number of element comparisons occurs if the array is already sorted in decreasing order and all elements are distinct. In this case, the number of element comparisons is

f(n) = ∑_{i=2}^{n} (i − 1) = ∑_{i=1}^{n−1} i = n(n − 1)/2,
as each element A[i], 2 ≤ i ≤ n, is compared with each entry in the sub-array
A[1..i − 1]. This number coincides with that of Algorithm SELECTION-
SORT.
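The algorithm referred to here is the standard insertion sort, whose pseudocode is not reproduced in this excerpt. The following Python sketch (our own formulation) counts element comparisons and exhibits the n − 1 best case and the n(n − 1)/2 worst case:

def insertion_sort(A):
    # Sort A in place and return the number of element comparisons.
    comparisons = 0
    for i in range(1, len(A)):
        x = A[i]
        j = i - 1
        while j >= 0:
            comparisons += 1             # compare x with A[j]
            if A[j] <= x:
                break
            A[j + 1] = A[j]              # shift the larger element up
            j -= 1
        A[j + 1] = x
    return comparisons

n = 10
print(insertion_sort(list(range(n))))         # already sorted: n - 1 = 9
print(insertion_sort(list(range(n, 0, -1))))  # reverse sorted: n(n - 1)/2 = 45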
is done during the merge process when the elements in the sub-arrays are copied back into the original array.
Figure 4 on page 29 depicts the merging process used to sort the array

A[1..8] = [3, 1, 4, 1, 5, 9, 2, 6].

The dashed lines show how the original array is repeatedly halved until each element is in its own single-element array, with the values shown at the bottom. The single-element arrays are then merged back up into two-element arrays to produce the values shown in the second level. The merging process continues up the diagram to produce the final sorted version of the array shown at the top.
Algorithm Bottomupsort
Input: An array A[1..n] of n elements.
Output: A[1..n] sorted in nondecreasing order.
1. t ←− 1
2. while t < n
3. s ←− t; t ←− 2s; i ←− 0
4. while i + t ≤ n
5. MERGE(A, i + 1, i + s, i + t)
6. i ←− i + t
7. end while
8. if i + s < n then MERGE(A, i + 1, i + s, n)
9. end while
The diagram makes the analysis of bottom-up merge sort easy. Starting at the bottom level, we have to copy n elements to build the second level; from the second to the third level, again n values are copied, and so on for every level. The only question left is: how many levels are there? This boils down to how many times an array of size n can be split in half. You already know from the analysis of binary search that this is just log_2 n. Therefore, the total work required to sort n elements is proportional to n log_2 n. Computer scientists call this an n log n algorithm.
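The following Python sketch renders Algorithm BOTTOMUPSORT directly (1-based indices in the calls, to match the pseudocode; the merge helper is a standard two-way merge rather than the text's MERGE verbatim):

def merge(A, p, q, r):
    # Merge the sorted runs A[p..q] and A[q+1..r] (1-based, inclusive) in place.
    left, right = A[p - 1:q], A[q:r]
    i = j = 0
    for k in range(p - 1, r):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]; i += 1
        else:
            A[k] = right[j]; j += 1

def bottom_up_sort(A):
    # Sort A in place by merging runs of size s = 1, 2, 4, ... as in BOTTOMUPSORT.
    n = len(A)
    t = 1
    while t < n:
        s, t, i = t, 2 * t, 0
        while i + t <= n:
            merge(A, i + 1, i + s, i + t)
            i += t
        if i + s < n:
            merge(A, i + 1, i + s, n)

A = [3, 1, 4, 1, 5, 9, 2, 6]
bottom_up_sort(A)
print(A)   # [1, 1, 2, 3, 4, 5, 6, 9]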
9 Further Reading
1. Introduction to Algorithms by Thomas H. Cormen et al.
2. Algorithms by Robert Sedgewick & Kevin Wayne.
3. The Algorithm Design Manual by Steven S. Skiena.
4. Algorithms for Interviews by Adnan Aziz and Amit Prakash.
5. Algorithms in a Nutshell.
6. Algorithm Design by Kleinberg & Tardos.
7. Introduction to Algorithms: A Creative Approach by Udi Manber.
8. The Design and Analysis of Algorithms.
9. Data Structures and Algorithms by Aho, Hopcroft & Ullman.