UNIT 1 ANALYSIS OF ALGORITHMS
1.0 INTRODUCTION
A common person’s belief is that a computer can do anything. This is far from the truth. In reality, a computer can perform only certain predefined instructions. The formal representation of this model as a sequence of instructions is called an algorithm, and an algorithm coded in a specific computer language is called a program. Analysis of algorithms has long been an area of research in computer science; the evolution of very high-speed computers has not diluted the need for the design of time-efficient algorithms.
1.1 OBJECTIVES
After going through this unit, you should be able to:
Definition of Algorithm
An algorithm should have the following five characteristic features:
1. Input
2. Output
3. Definiteness
4. Effectiveness
5. Termination.
Complexity classes
All decision problems fall into sets of comparable complexity, called complexity
classes.
The complexity class P is the set of decision problems that can be solved by a
deterministic machine in polynomial time. This class corresponds to the set of problems
which can be solved effectively even in the worst case. We will consider algorithms
belonging to this class for the analysis of time complexity. Not all algorithms in this
class make practical sense, as many of them have rather high complexity. These are
discussed later.
The complexity class NP is the set of decision problems that can be solved by a non-deterministic machine in polynomial time. This class contains many well-known problems such as the Boolean satisfiability problem, the Hamiltonian path problem and the vertex cover problem.
What is Complexity?
Complexity refers to the rate at which the required storage or consumed time grows as
a function of the problem size. The absolute growth depends on the machine used to
execute the program, the compiler used to construct the program, and many other
factors. We would like to have a way of describing the inherent complexity of a
program (or piece of a program), independent of machine/compiler considerations.
This means that we must not try to describe the absolute time or storage needed. We
must instead concentrate on a “proportionality” approach, expressing the complexity
in terms of its relationship to some known function. This type of analysis is known as
asymptotic analysis. It may be noted that we are dealing with the complexity of an
algorithm, not that of a problem. For example, a simple problem could still have an
algorithm of a high order of time complexity, and vice versa.
Asymptotic Analysis
Asymptotic analysis is based on the idea that as the problem size grows, the
complexity can be described as a simple proportionality to some known function. This
idea is incorporated in the “Big O”, “Omega” and “Theta” notation for asymptotic
performance.
Notations like “little-oh” are similar in spirit to “big-Oh”, but they are rarely used in
computer science for asymptotic analysis.
We will learn about various techniques to bound the complexity function. In fact, our
aim is not to count the exact number of steps of a program or the exact amount of time
required for executing an algorithm. In theoretical analysis of algorithms, it is
common to estimate complexity in the asymptotic sense, i.e., to estimate the
complexity function for reasonably large lengths of input ‘n’. Big-O notation, omega
notation Ω and theta notation Θ are used for this purpose, big-O being the most
common. Asymptotic analysis of algorithms is used because the time taken to execute
an algorithm varies with the input size ‘n’ and with other factors which may differ from
computer to computer and from run to run. The essence of these asymptotic notations
is to bound the growth function of the time complexity by a simpler function for all
sufficiently large inputs.
Certainly, there are other choices for c1, c2 and n0. Now we may show that the
function f(n) = 6n3 ≠ Θ(n2).
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n)
for all n ≥ n0 }
We can see from the earlier definition of Θ that Θ is a tighter notation than big-O
notation.
The function f(n) = an + c is O(n) and is also O(n2), but O(n) is an asymptotically
tight bound whereas O(n2) is not.
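For instance, assuming a and c are positive constants, one admissible choice of witnesses in the definition of O above is

    f(n) = an + c ≤ (a + c)·n ≤ (a + c)·n2   for all n ≥ 1,

so f(n) = O(n) (with the constant a + c and n0 = 1) and, by the same inequality, also O(n2); only the first of these bounds is asymptotically tight.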
In terms of Θ notation, the above function f(n) is Θ(n). As big-O notation gives an
upper bound on a function, it is often used to describe the worst case running time of
algorithms.
The Ω-Notation (Lower Bound)
This notation gives a lower bound for a function to within a constant factor. We write
f(n) = Ω(g(n)), if there are positive constants n0 and c such that to the right of n0, the
value of f(n) always lies on or above cg(n). Figure 1.3 depicts the plot of
f(n) = Ω(g(n)).
Mathematically, for a given function g(n), we may define Ω(g(n)) as the set of
functions:
Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all
n ≥ n0 }.
Since Ω notation describes lower bound, it is used to bound the best case running time
of an algorithm.
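As a quick illustration of this definition, take the illustrative function f(n) = 3n + 2 and g(n) = n. Choosing c = 3 and n0 = 1 gives

    0 ≤ 3n ≤ 3n + 2   for all n ≥ 1,

so 3n + 2 = Ω(n).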
Asymptotic notation
2. aknk + ak−1nk−1 + · · · + a1n + a0 = O(nk) for all k ≥ 0 and for all a0, a1, . . . , ak ∈ R.
In other words, every polynomial of degree k can be bounded by the function nk.
Smaller order terms can be ignored in big-O notation.
3. The base of the logarithm can be ignored in big-O notation, i.e., loga n = O(logb n) for
any bases a, b (see the identity after this list). We generally write O(log n) to denote a logarithm of n to any base.
4. Any logarithmic function can be bounded by a polynomial, i.e., logb n = O(nc)
for any base b and any positive exponent c > 0.
6. Any exponential function can be bounded by the factorial function. For example,
an = O(n!) for any base a.
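Fact 3 above follows from the change-of-base identity for logarithms (a short derivation, for any bases a, b > 1):

    loga n = logb n / logb a = (1 / logb a) · logb n

Since 1/logb a is a positive constant that does not depend on n, loga n is at most a constant multiple of logb n, i.e., loga n = O(logb n).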
• 30n2
• 10n3 + 6n2
• 5nlogn + 30n
• log n + 3n
• log n + 32
Time Complexity: The maximum time required by a Turing machine to execute on any
input of length n.
Space Complexity: The amount of storage space required by an algorithm varies with
the size of the problem being solved. The space complexity is normally expressed as
an order of magnitude of the size of the problem, e.g., O(n2) means that if the size of
the problem (n) doubles then the working storage (memory) requirement will become
four times.
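The factor of four in the example follows from simple arithmetic. If the working storage is roughly S(n) = c·n2 for some constant c, then doubling the problem size gives

    S(2n) = c·(2n)2 = 4·c·n2 = 4·S(n),

i.e., four times the original requirement.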
The random access model (RAM) of computation was devised by John von Neumann
to study algorithms. Algorithms are studied in computer science because they are
independent of machine and language.
We will do all our design and analysis of algorithms based on RAM model of
computation:
The complexity of algorithms using big-O notation can be defined in the following
way for a problem of size n:
The process of analysis of algorithm (program) involves analyzing each step of the
algorithm. It depends on the kinds of statements used in the program.
Statement 1;
Statement 2;
...
...
Statement k;
The total time can be found out by adding the times for all statements:

total time = time(statement 1) + time(statement 2) + ... + time(statement k)

It may be noted that the time required by each statement will vary greatly depending on
whether the statement is simple (involves only basic operations) or otherwise.
Assuming that each of the above statements involves only basic operations, the time for
each simple statement is constant and the total time is also constant: O(1).
In this example, assume the statements are simple unless noted otherwise.
if-then-else statements
if (cond) {
sequence of statements 1
}
else {
sequence of statements 2
}
In this if-then-else statement, either sequence 1 or sequence 2 will execute, depending
on the boolean condition. The worst-case time is the slower of the two possibilities.
For example, if sequence 1 is O(N2) and sequence 2 is O(1), then the worst-case time
for the whole if-then-else statement is O(N2).
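The following small C sketch illustrates this rule; the function name and branch bodies are hypothetical, chosen only so that one branch costs O(n2) and the other O(1):

#include <stdio.h>

/* Hypothetical example: the worst case of this if-then-else is O(n*n),
   because the slower branch dominates the bound. */
long branch_example(int cond, int n)
{
    long count = 0;
    if (cond) {
        /* sequence of statements 1: a nested loop, O(n^2) */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                count++;
    } else {
        /* sequence of statements 2: a single simple statement, O(1) */
        count = 1;
    }
    return count;
}

int main(void)
{
    printf("%ld\n", branch_example(1, 100));   /* prints 10000 */
    return 0;
}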
for (i = 0; i < n; i++) {
    sequence of statements
}
Here, the loop executes n times, so the sequence of statements also executes n times.
Since we assume the time complexity of each statement is O(1), the total time for the
loop is n * O(1), which is O(n). Here, the number of statements in the body does not
matter, as it only changes the running time by a constant factor and the overall
complexity remains O(n).
for (i = 0; i < n; i++) {
    for (j = 0; j < m; j++) {
        sequence of statements
    }
}
Here, the outer loop executes n times. Every time the outer loop executes, the inner
loop executes m times. As a result, the statements in the inner loop execute a total of
n * m times, and the time complexity is O(n * m). If we modify the condition of the
inner loop to j < n instead of j < m (i.e., the inner loop also executes n times), then
the total complexity of the nested loop is O(n2).
int psum(int n)
{
    int i, partial_sum;
    partial_sum = 0;                       /* Line 1 */
    for (i = 1; i <= n; i++) {             /* Line 2 */
        partial_sum = partial_sum + i*i;   /* Line 3 */
    }
    return partial_sum;                    /* Line 4 */
}
This function returns the sum from i = 1 to n of i squared, i.e., psum = 12 + 22 + 32
+ ... + n2. As we have to determine the running time of each statement in this
program, we have to count the number of statements that are executed in this
procedure. The code at line 1 and line 4 is one statement each. The for loop on line 2
is actually 2n + 2 statements:

• i = 1; the initialization statement is executed once.

• i <= n; the condition is tested once for each value of i from 1 to n+1 (until the
condition becomes false), i.e., n+1 times.

• i++ is executed once for each execution of the body of the loop, i.e., n times.
In terms of the big-O notation defined above, this function is O(n), because if we choose
c = 3, then we see that cn > 2n+3 for sufficiently large n. As we have already noted,
big-O notation only provides an upper bound to the function, so it is also O(n log(n))
and O(n2), since n2 > n log(n) > 2n+3 for large n. However, we will choose the smallest
function that describes the order of the function, and it is O(n).
By looking at the definitions of Omega notation and Theta notation, it is also clear that
the function is Ω(n), and therefore Θ(n) too: if we choose c = 1, then we see that
cn < 2n+3, hence it is Ω(n). Since 2n+3 = O(n) and 2n+3 = Ω(n), it follows that
2n+3 = Θ(n).
It is again reiterated here that smaller order terms and constants may be ignored while
describing asymptotic notation. For example, if f(n) = 4n+6 instead of f(n) = 2n +3 in
terms of big-O, Ω and Θ, this does not change the order of the function. The function
f(n) = 4n+6 = O(n) (by choosing c appropriately as 5); 4n+6 = Ω(n) (by choosing
c = 1), and therefore 4n+6 = Θ(n). The essence of this analysis is that in asymptotic
notation we can count a statement as one, and need not worry about its exact execution
time, which may depend on hardware and other implementation factors, as long as it is
of the order of 1, i.e., O(1).
Let us consider the following pseudocode to analyse the exact runtime complexity of
insertion sort.
Line   Pseudocode                           Cost   Number of times executed
1      for j = 2 to length[A]               c1     n
2      {   key = A[j]                       c2     n − 1
3          i = j − 1                        c3     n − 1
4          while (i > 0) and (A[i] > key)   c4     ∑_{j=2}^{n} Tj
5          {   A[i+1] = A[i]                c5     ∑_{j=2}^{n} (Tj − 1)
6              i = i − 1   }                c6     ∑_{j=2}^{n} (Tj − 1)
7          A[i+1] = key   }                 c7     n − 1

Here Tj denotes the number of times the while-loop test at line 4 is executed for that
value of j.
The statements at lines 5 and 6 will each execute Tj − 1 times (one step fewer than the
test at line 4). So, the total time is the sum, over all lines, of the number of times each
line executes multiplied by its cost factor:
T(n) = c1n + c2(n−1) + c3(n−1) + c4 ∑_{j=2}^{n} Tj + c5 ∑_{j=2}^{n} (Tj − 1) + c6 ∑_{j=2}^{n} (Tj − 1) + c7(n−1)
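For reference, a direct C rendering of the above pseudocode might look as follows (a sketch using 0-based array indexing; the line numbers in the comments correspond to the pseudocode lines):

#include <stdio.h>

/* Insertion sort, following the pseudocode above (0-based indexing). */
void insertion_sort(int A[], int n)
{
    for (int j = 1; j < n; j++) {          /* line 1 */
        int key = A[j];                    /* line 2 */
        int i = j - 1;                     /* line 3 */
        while (i >= 0 && A[i] > key) {     /* line 4 */
            A[i + 1] = A[i];               /* line 5 */
            i = i - 1;                     /* line 6 */
        }
        A[i + 1] = key;                    /* line 7 */
    }
}

int main(void)
{
    int A[] = {5, 2, 4, 6, 1, 3};
    insertion_sort(A, 6);
    for (int k = 0; k < 6; k++)
        printf("%d ", A[k]);               /* prints: 1 2 3 4 5 6 */
    printf("\n");
    return 0;
}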
Three cases can emerge depending on the initial configuration of the input list: first,
the list is already sorted; second, the list is sorted in reverse order; and third, the list
is in random order (unsorted). The best case scenario emerges when the list is already
sorted.
Worst Case: Worst case running time is an upper bound for running time with any
input. It guarantees that, irrespective of the type of input, the algorithm will not take
any longer than the worst case time.
Best Case: It guarantees that under any circumstances the running time of the algorithm
will take at least this much time.
Average case : This gives the average running time of algorithm. The running time for
any given size of input will be the average number of operations over all problem
instances for a given size.
Best Case: If the list is already sorted then A[i] <= key at line 4, so the rest of the lines
in the inner loop will not execute. Then,
T(n) = c1n + c2(n−1) + c3(n−1) + c4(n−1) + c7(n−1) = O(n), which indicates that the
time complexity is linear.
Worst Case: This case arises when the list is sorted in reverse order. In that case, the
boolean condition at line 4 remains true until i reaches 0, so for each value of j the
inner loop executes its maximum number of times.

So, line 4 is executed ∑_{j=2}^{n} j = n(n+1)/2 − 1 times. Therefore,
T (n) = c1n + c2(n −1) + c3(n −1) + c4 (n(n+1)/2 − 1) + c5(n(n −1)/2) + c6(n(n−1)/2)
+ c7 (n −1)
= O (n2).
Average case: In most cases, the list will be in some random order, i.e., neither sorted
in ascending nor in descending order, and the time complexity will lie somewhere
between the best and the worst case.
[Figure: running time versus input size for the worst, average and best cases]
void printMultiplicationTable(int max)
{
    for (int i = 1; i <= max; i++)
    {
        for (int j = 1; j <= max; j++)
            cout << (i * j) << " ";
        cout << endl;
    } // for
}
3) Consider the following program segment:
for (i = 1; i <= n; i *= 2)
{
j = 1;
}
Each recursive call takes a constant amount of space for local variables and function
arguments, and some space is also needed for remembering where each call should
return to. In general, recursive calls use linear space: for n nested recursive calls, the
space complexity is O(n).
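For instance, the following illustrative C function makes n nested recursive calls before any of them returns, so up to n stack frames are alive at once, giving O(n) space:

#include <stdio.h>

/* Recursively sums 1 + 2 + ... + n.
   Each call waits for the next one, so n frames are on the stack
   at the deepest point: O(n) space (and O(n) time). */
long rec_sum(int n)
{
    if (n <= 0)               /* base case: no further calls */
        return 0;
    return n + rec_sum(n - 1);
}

int main(void)
{
    printf("%ld\n", rec_sum(100));   /* prints 5050 */
    return 0;
}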
Example: Find the greatest common divisor (GCD) of two integers, m and n.
The algorithm for GCD may be defined as follows:
Code in C
int gcd(int m, int n)
/* The preconditions are: m > 0 and n > 0. Let g = gcd(m, n). */
{
    while (m > 0)
    {
        if (n > m)
        {  int t = m;  m = n;  n = t;  }   /* swap m and n */
        /* m >= n > 0 */
        m -= n;
    }
    return n;
}
The space-complexity of the above algorithm is a constant. It just requires space for
three integers m, n and t. So, the space complexity is O(1).
The time complexity depends on the loop and on the condition whether m > n or not.
The real issue is, how many iterations take place? The answer depends on both m and
n.
Space complexity of a Turing Machine: The (worst case) maximum length of the tape
required to process an input string of length n.
In complexity theory, the class PSPACE is the set of decision problems that can be
solved by a Turing machine using a polynomial amount of memory, and unlimited
time.
x = 4y + 3
z=z+1
p=1
As we have seen, x, y, z and p are all scalar variables and the running time is constant
irrespective of their values. Here, we emphasize that each line of code may take a
different time to execute, but the bottom line is that each will take a constant amount
of time. Thus, we will describe the run time of each such line of code as O(1).
Binary search in a sorted list is carried out by dividing the list into two parts based on
the comparison of the key. As the search interval halves each time, the iteration takes
place in the search. The search interval will look like following after each iteration
N, N/2, N/4, N/8 , .......... 8, 4, 2, 1
The number of iterations (the number of elements in the series) is not so evident from
the above series. But, if we take logs of each element of the series, then

log2 N, log2 N − 1, log2 N − 2, .......... 3, 2, 1, 0

As the sequence decrements by 1 each time, the total number of elements in the above
series is log2 N + 1. So, the number of iterations is log2 N + 1, which is of the order of
O(log2 N).
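A minimal C sketch of iterative binary search over a sorted array (the function name and signature are illustrative, not taken from this unit):

#include <stdio.h>

/* Returns the index of key in the sorted array a[0..n-1], or -1 if absent.
   Each iteration halves the search interval, so at most about log2(n) + 1
   iterations are performed. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* middle of current interval */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            low = mid + 1;                  /* discard the left half */
        else
            high = mid - 1;                 /* discard the right half */
    }
    return -1;                              /* not found */
}

int main(void)
{
    int a[] = {2, 3, 5, 7, 11, 13, 17};
    printf("%d\n", binary_search(a, 7, 11));   /* prints 4 */
    return 0;
}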
As of now, there is no known algorithm that finds a tour of minimum length covering
all the cities in polynomial time. However, there are numerous very good heuristic
algorithms.
• T(n) = O(1). This is called constant growth. T(n) does not grow at all as a
function of n, it is a constant. For example, array access has this characteristic.
A[i] takes the same time independent of the size of the array A.
• T(n) = O(log2 (n)). This is called logarithmic growth. T(n) grows proportional to
the base 2 logarithm of n. Actually, the base of logarithm does not matter. For
example, binary search has this characteristic.
• T(n) = O(n). This is called linear growth. T(n) grows linearly with n. For
example, looping over all the elements in a one-dimensional array of n elements
would be of the order of O(n).
• T(n) = O(n log(n)). This is called n log n growth. T(n) grows proportional to n
times the base 2 logarithm of n. Time complexity of Merge Sort has this
characteristic. In fact, no sorting algorithm that uses comparison between
elements can be faster than n log n.
• T(n) = O(nk). This is called polynomial growth. T(n) grows proportional to the
k-th power of n. We rarely consider algorithms that run in time O(nk) where k is
bigger than 2 , because such algorithms are very slow and not practical. For
example, selection sort is an O(n2) algorithm.
Table 1.2 compares the typical running time of algorithms of different orders.
The growth patterns above have been listed in order of increasing size.
That is, O(1) < O(log(n)) < O(n) < O(n log(n)) < O(n2) < O(n3), ... , O(2n).
Notation     Name                               Example
O(1)         Constant                           Constant growth; does not grow as a function of n (e.g., accessing one array element A[i])
O(log n)     Logarithmic                        Binary search
O(n)         Linear                             Looping over the n elements of an array of size n (normally)
O(n log n)   Sometimes called “linearithmic”    Merge sort
O(n2)        Quadratic                          Worst case of insertion sort; matrix multiplication
O(nc)        Polynomial, sometimes “geometric”
O(cn)        Exponential
O(n!)        Factorial
Table 1.1 : Comparison of various algorithms and their complexities
n          log2 n    n          n2            2n
8          3         8          64            256
128        7         128        16,384        3.4*1038
256        8         256        65,536        1.15*1077
1000       10        1000       1 million     1.07*10301
100,000    17        100,000    10 billion    ........

Table 1.2 : Typical running times of algorithms of different orders
1.6 SUMMARY
Reference Websites
http://en.wikipedia.org/wiki/Big_O_notation
http://www.webopedia.com