Analysis of Algorithms
Given two algorithms for a task, how do we find out which one is
better?
One naive way of doing this is to implement both algorithms, run the two programs on your computer for different inputs, and see which one takes less time. There are several problems with this approach to the analysis of algorithms.
• It might be possible that for some inputs the first algorithm performs better than the second, and for other inputs the second performs better.
• It might also be possible that for some inputs the first algorithm performs better on one machine, while for other inputs the second performs better on another machine.
Asymptotic Analysis is the big idea that handles the above issues in analyzing algorithms. In
Asymptotic Analysis, we evaluate the performance of an algorithm in terms of input size (we
don’t measure the actual running time). We calculate, how the time (or space) taken by an
algorithm increases with the input size.
For example, let us consider the search problem (searching a given item) in a sorted array.
• Let us say:
o we run Linear Search on a fast computer A, and
o Binary Search on a slow computer B, and
o pick machine-dependent constants for the two computers, so that multiplying the constant by the number of steps tells us exactly how long the given machine takes to perform the search, in seconds.
• Let's say the constant for A is 0.2 and the constant for B is 1000, which means that A is 5000 times more powerful than B.
• For small values of input array size n, the fast computer may take less time.
• But, after a certain value of input array size, the Binary Search will definitely start
taking less time compared to the Linear Search even though the Binary Search is
being run on a slow machine.
• The reason is the order of growth of Binary Search with respect to input size is
logarithmic while the order of growth of Linear Search is linear.
• So the machine-dependent constants can always be ignored after a certain value of
input size.
Running times for this example (Linear Search on A takes about 0.2*n seconds; Binary Search on B takes about 1000*log2(n) seconds):
n          Linear Search on A    Binary Search on B
10         2 sec                 ~1 h
100        20 sec                ~1.8 h
10^6       ~55 h                 ~5.5 h
10^9       ~6.3 years            ~8.3 h
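The logarithmic growth of Binary Search comes from halving the search range at every step. Below is a minimal iterative sketch (the function name binarySearch and the exact loop shape are illustrative, not taken from the discussion above):
C++
// Returns the index of x in the sorted array arr[0..n-1], or -1 if absent.
// Each iteration halves the remaining range, so the loop runs about log2(n) times.
int binarySearch(int arr[], int n, int x)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // avoids overflow of (lo + hi)
        if (arr[mid] == x)
            return mid;
        else if (arr[mid] < x)
            lo = mid + 1;               // discard the left half
        else
            hi = mid - 1;               // discard the right half
    }
    return -1;
}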
Also, in Asymptotic analysis, we always talk about input sizes larger than a constant value. It
might be possible that those large inputs are never given to your software and an asymptotically
slower algorithm always performs better for your particular situation. So, you may end up
choosing an algorithm that is Asymptotically slower but faster for your software.
Worst, Average and Best Case Time Complexities
It is important to analyze an algorithm after writing it, to find its efficiency in terms of time and space, so that it can be improved if possible.
When it comes to analyzing algorithms, asymptotic analysis is the standard approach, because it evaluates an algorithm in terms of the input size: it checks how the time and space requirements grow as the input size grows.
In this article, we will take Linear Search as an example and analyze it using asymptotic analysis for the following three cases:
1. Worst Case
2. Average Case
3. Best Case
Below is the algorithm for performing linear search:
C
// Linearly search x in arr[].
// If x is present then return the index,
// otherwise return -1
int search(int arr[], int n, int x)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)
            return i;
    }
    return -1;
}
Worst Case Analysis (Usually Done): In the worst case analysis, we calculate the upper bound on the running time of an algorithm. We must know the case that causes the maximum number of operations to be executed. For Linear Search, the worst case happens when the element to be searched (x in the above code) is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the worst case time complexity of linear search is O(N), where N is the number of elements in the array.
Average Case Analysis (Sometimes done): In average case analysis, we take all possible inputs, calculate the computing time for each of them, sum all the calculated values, and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case of x not being present in the array). So we sum the cost of all the cases and divide the sum by (N+1). The following is the value of the average case time complexity.
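A short sketch of that computation, assuming the N+1 cases (x at position 1, 2, ..., N, or x absent) are equally likely and the i-th case costs on the order of i comparisons:
Average case time = (Θ(1) + Θ(2) + ... + Θ(N+1)) / (N+1) = Θ((N+1)(N+2)/2) / (N+1) = Θ(1 + N/2) = Θ(N)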
Best Case Analysis (Bogus): In the best case analysis, we calculate the lower bound on the running time of an algorithm. We must know the case that causes the minimum number of operations to be executed. In the linear search problem, the best case occurs when x is present at the first location. The number of operations in the best case is constant (not dependent on N). So the time complexity in the best case would be O(1).
Example:
C++
// Driver code that calls the search() function defined above
#include <bits/stdc++.h>
using namespace std;

// Driver's Code
int main()
{
    int arr[] = { 1, 10, 30, 15 };
    int x = 30;
    int n = sizeof(arr) / sizeof(arr[0]);

    // Function call
    cout << x << " is present at index "
         << search(arr, n, x);
    return 0;
}
Time Complexity Analysis: (in Big-O notation)
• Best Case: O(1). This occurs when the element to be searched is at the first index of the given list, so the number of comparisons is 1.
• Average Case: O(n). Averaged over all positions the element could occupy (and the case where it is absent), about n/2 comparisons are needed, which is still linear in n.
• Worst Case: O(n). This occurs when:
o The element to be searched is at the last index, or
o The element to be searched is not present in the list.
Important Points:
• Most of the time, we do the worst case analysis to analyze algorithms. In the worst case analysis, we guarantee an upper bound on the running time of an algorithm, which is a useful piece of information.
• The average case analysis is not easy to do in most practical cases and it is rarely done. In the average case analysis, we must know (or predict) the mathematical distribution of all possible inputs.
• The best case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide much information, as in the worst case the algorithm may still take years to run.
Asymptotic Notations
Asymptotic notations are mathematical tools to represent the time complexity of algorithms for
asymptotic analysis. The following 3 asymptotic notations are mostly used to represent the time
complexity of algorithms:
1) Θ Notation: The theta notation bounds a function from above and below, so it defines exact
asymptotic behavior.
A simple way to get the Theta notation of an expression is to drop low-order terms and ignore leading
constants. For example, consider the following expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be a value of n after which the n^3 term dominates the n^2 term, irrespective of the constants involved.
For a given function g(n), we denote by Θ(g(n)) the following set of functions:
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0}
The above definition means that if f(n) is theta of g(n), then the value of f(n) is always between c1*g(n) and c2*g(n) for large values of n (n ≥ n0). The definition of theta also requires that f(n) must be non-negative for values of n greater than n0.
2) Big O Notation: The Big O notation defines an upper bound of an algorithm, it bounds a function
only from above. For example, consider the case of Insertion Sort. It takes linear time in best case and
quadratic time in worst case. We can safely say that the time complexity of Insertion sort is O(n^2).
Note that O(n^2) also covers linear time.
If we use O notation to represent time complexity of Insertion sort, we have to use two statements for
best and worst cases:
1. The worst case time complexity of Insertion Sort is O(n^2).
2. The best case time complexity of Insertion Sort is O(n).
The Big O notation is useful when we only have upper bound on time complexity of an algorithm. Many
times we easily find an upper bound by simply looking at the algorithm.
3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω notation
provides an asymptotic lower bound.
Ω Notation can be useful when we have lower bound on time complexity of an algorithm. The Omega
notation is the least used notation among all three.
In this article, we will discuss the analysis of the algorithm using Big - O asymptotic notation in
complete detail.
We can express algorithmic complexity using the big-O notation. For a problem of size n:
f(n) = O(g(n)) if there exist a positive integer n0 and a positive constant c such that f(n) ≤ c*g(n) for all n ≥ n0.
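For a concrete instance (numbers chosen purely for illustration): f(n) = 3n + 2 is O(n), because choosing c = 4 and n0 = 2 gives 3n + 2 ≤ 4n for all n ≥ 2.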
The general step-wise procedure for Big-O runtime analysis is as follows:
▪ Constant Multiplication:
If f(n) = c*g(n), then O(f(n)) = O(g(n)); where c is a nonzero constant.
▪ Polynomial Function:
If f(n) = a0 + a1*n + a2*n^2 + ... + am*n^m, then O(f(n)) = O(n^m).
▪ Summation Function:
If f(n) = f1(n) + f2(n) + ... + fm(n) and fi(n) ≤ fi+1(n) for all i = 1, 2, ..., m-1,
then O(f(n)) = O(max(f1(n), f2(n), ..., fm(n))).
▪ Logarithmic Function:
If f(n) = log_a(n) and g(n) = log_b(n), then O(f(n)) = O(g(n));
all log functions grow in the same manner in terms of Big-O.
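For example, applying these rules to f(n) = 7n^3 + 2n*log(n) + 30 (an expression made up for illustration), the summation rule keeps only the fastest-growing term, so O(f(n)) = O(n^3).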
Basically, this asymptotic notation is used to measure and compare the worst-case scenarios of
algorithms theoretically. For any algorithm, the Big-O analysis should be straightforward as long as we
correctly identify the operations that are dependent on n, the input size.
In general, we mainly measure and compare the worst-case theoretical running time complexities of algorithms for performance analysis.
The fastest possible running time for any algorithm is O(1), commonly referred to as Constant Running
Time. In this case, the algorithm always takes the same amount of time to execute, regardless of the
input size. This is the ideal runtime for an algorithm, but it's rarely achievable.
In actual cases, the performance (runtime) of an algorithm depends on n, that is, the size of the input and the number of operations required for each input item.
The common running-time classes, from best to worst performance, are O(log n), O(n), O(n log n), O(n^2), O(2^n), and O(n!). The following values (using base-2 logarithms) show how quickly these functions grow:
                 If n = 10        If n = 20
log2(n)          ~3.32            ~4.32
n                10               20
n*log2(n)        ~33.2            ~86.4
n^2              100              400
2^n              1024             1048576
n!               3628800          ~2.43e+18
For the performance analysis of an algorithm, runtime measurement is not the only relevant metric; we also need to consider the amount of memory the program uses. This is referred to as the Memory Footprint of the algorithm, also known as Space Complexity.
Here also, we need to measure and compare the worst case theoretical space complexities of algorithms
for the performance analysis.
It basically depends on two major aspects described below:
• Firstly, the implementation of the program is responsible for memory usage. For example, a recursive implementation typically reserves more memory (for the call stack) than the corresponding iterative implementation of the same problem.
• The other one is n, the input size or the amount of storage required for each item. For example, a simple algorithm with a large input size can consume more memory than a complex algorithm with a small input size.
Algorithmic Examples of Memory Footprint Analysis: Algorithms classified from best-to-worst performance (Space Complexity), based on their typical auxiliary space usage, are mentioned below:
▪ Ideal algorithm - O(1) - Linear Search, Binary Search (iterative),
Bubble Sort, Selection Sort, Insertion Sort, Heap Sort, Shell Sort.
▪ Logarithmic algorithm - O(log n) - Quick Sort (recursion stack, on average).
▪ Linear algorithm - O(n) - Merge Sort.
▪ Linear in n plus the key range - O(n+k) - Radix Sort.
There is usually a trade-off between optimal memory use and runtime performance.
In general, for an algorithm, space efficiency and time efficiency sit at two opposite ends, and each point in between has a certain time and space efficiency. So, the more time efficiency you have, the less space efficiency you have, and vice versa.
For example, Mergesort algorithm is exceedingly fast but requires a lot of space to do the operations.
On the other side, Bubble Sort is exceedingly slow but requires the minimum space.
At the end of this topic, we can conclude that finding an algorithm that runs in less time and also requires less memory space can make a huge difference in how well an algorithm performs.
For example, consider the time complexity of a program with two consecutive (non-nested) loops:
C++
// C++ program to find the time complexity of two consecutive for loops
#include <bits/stdc++.h>
using namespace std;

int main()
{
    // declare variables
    int a = 0, b = 0;

    // declare sizes
    int N = 5, M = 5;

    // This loop runs N times
    for (int i = 0; i < N; i++) {
        a = a + 5;
    }

    // This loop runs M times
    for (int i = 0; i < M; i++) {
        b = b + 10;
    }

    // print the values of a and b
    cout << a << ' ' << b;
    return 0;
}
Output
25 50
Explanation:
The first loop runs N times, and the second loop runs M times. The remaining statements take O(1) time. Adding these up, the time complexity is O(N + M + 1) = O(N + M).
Time Complexity: O(N + M)
Omega Notation
This article will discuss Big - Omega Notation represented by a Greek letter (Ω).
Definition: Let g and f be functions from the set of natural numbers to itself. The function f is said to be Ω(g) if there is a constant c > 0 and a natural number n0 such that c*g(n) ≤ f(n) for all n ≥ n0.
Mathematical Representation:
Ω(g) = {f(n): there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0}
Note: Ω (g) is a set
In simple language, Big - Omega (Ω) notation specifies the asymptotic (at the extreme) lower bound for
a function f(n).
To calculate Big - Omega (Ω) for a program, break it into smaller segments, count the number of operations each segment performs in terms of the input size, add them up, and after dropping constants pick any function that stays at or below this total for large n. Consider the following example:
C++
// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
// Prints every ordered pair (a[i], a[j]) with i != j,
// so the print statement runs about n^2 times
void print(int a[], int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (i != j)
                cout << a[i] << " " << a[j] << "\n";
        }
    }
}
// Driver Code
int main()
{
    // Given array
    int a[] = { 1, 2, 3 };
    int n = sizeof(a) / sizeof(a[0]);
    // Function Call
    print(a, n);
    return 0;
}
Output
1 2
1 3
2 1
2 3
3 1
3 2
In this example, it is evident that the print statement gets executed n^2 times. Therefore, if the running time vs. n graph is plotted, a parabolic graph is obtained, f(n^2). Now linear functions g(n), logarithmic functions g(log n), and constant functions g(1) are all less than a parabolic function when the input tends to infinity. Therefore, the worst-case running time of this program can be Ω(log n), Ω(n), Ω(1), or any function g(n) which is less than n^2 when n tends to infinity.
When to use Big - Ω notation: Big - Ω notation is the least used notation for the analysis of algorithms
because it can make a correct but imprecise statement over the performance of an algorithm. Suppose
a person takes 100 minutes to complete a task, using Ω notation, it can be stated that the person takes
more than 10 minutes to do the task, this statement is correct but not precise as it doesn't mention the
upper bound of the time taken. Similarly, using Ω notation we can say that the worst-case running time
for the binary search is Ω(1), which is true because we know that binary search would at least take
constant time to execute.
Theta Notation
This article will discuss Big - Theta notations represented by a Greek letter (Θ).
Definition: Let g and f be functions from the set of natural numbers to itself. The function f is said to be Θ(g) if there are constants c1, c2 > 0 and a natural number n0 such that c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0.
Mathematical Representation:
Θ (g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1 * g(n) ≤ f(n) ≤ c2 * g(n)
for all n ≥ n0 }
Note: Θ(g) is a set
The above definition means, if f(n) is theta of g(n), then the value f(n) is always between c1 * g(n) and
c2 * g(n) for large values of n (n ≥ n0). The definition of theta also requires that f(n) must be non-negative
for values of n greater than n0.
In simple language, Big - Theta(Θ) notation specifies asymptotic bounds (both upper and lower) for a
function f(n) and provides the average time complexity of an algorithm.
Follow the steps below to find the average time complexity of any program:
1. Break the program into smaller segments.
2. Find all types and number of inputs and calculate the number of operations they take to be
executed. Make sure that the input cases are equally distributed.
3. Find the sum of all the calculated values and divide the sum by the total number of inputs. Let's say the function of n obtained after removing all the constants is g(n); then, in Θ notation, it is represented as Θ(g(n)).
Example: Consider the problem of finding whether a key exists in an array, using linear search. The idea is to traverse the array and check every element to see whether it is equal to the key.
Below is the implementation of the above approach:
C++
// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;

// Returns true if x is present in arr[], otherwise false
bool linearSearch(int arr[], int n, int x)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)
            return true;
    }
    return false;
}

// Driver Code
int main()
{
    // Given Input
    int arr[] = { 2, 3, 4, 10, 40 };
    int x = 10;
    int n = sizeof(arr) / sizeof(arr[0]);

    // Function Call
    if (linearSearch(arr, n, x))
        cout << "Element is present in array";
    else
        cout << "Element is not present in array";
    return 0;
}
Output
Element is present in array
In a linear search problem, let's assume that all the cases are uniformly distributed (including the case when the key is absent in the array). So, sum all the cases (when the key is present at position 1, 2, 3, ......, n, and when it is not present), and divide the sum by n + 1.
⇒ Θ( ((n + 1)(n + 2) / 2) / (n + 1) )
⇒ Θ(1 + n/2), which is Θ(n)
Analysis of Loops
1) O(1): The time complexity of a function (or set of statements) is considered O(1) if it doesn't contain a loop, recursion, or a call to any other non-constant-time function. A loop that runs a constant number of times is also O(1):
// Here c is a constant
for (int i = 1; i <= c; i++) {
    // some O(1) expressions
}
2) O(n): The time complexity of a loop is considered O(n) if the loop variables are incremented/decremented by a constant amount. A recursive function that makes one recursive call and does a constant amount of work per call is also O(n). For example, the following functions have O(n) time complexity (a loop version is sketched after the recursive one).
// Recursive function
void recurse(int n)
{
    if (n == 0)
        return;
    else {
        // some O(1) expressions
    }
    recurse(n - 1);
}
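A minimal loop sketch with the same O(n) behaviour (n is the input size and c is any positive constant, both assumed to be defined):
C++
// The loop variable is incremented by a constant amount,
// so the body executes about n / c = O(n) times
for (int i = 1; i <= n; i += c) {
    // some O(1) expressions
}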
3) O(n^c): The time complexity of nested loops is equal to the number of times the innermost statement is executed. For example, the following sample loops have O(n^2) time complexity:
for (int i = 1; i <= n; i += c) {
    for (int j = 1; j <= n; j += c) {
        // some O(1) expressions
    }
}
For example, Selection Sort and Insertion Sort have O(n^2) time complexity.
4) O(log n): The time complexity of a loop is considered O(log n) if the loop variables are divided/multiplied by a constant amount. For example, Binary Search (refer to the iterative implementation) has O(log n) time complexity. Let us see mathematically how it is O(log n). The series that we get in such a loop is 1, c, c^2, c^3, ..., c^k. If we put k equal to log_c(n), we get c^(log_c(n)), which is n, so the loop runs about log_c(n) times.
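A minimal sketch of such a loop (n is the input size and c is any constant greater than 1, both assumed to be defined):
C++
// i takes the values 1, c, c^2, c^3, ..., so the loop
// terminates after about log_c(n) iterations
for (int i = 1; i <= n; i *= c) {
    // some O(1) expressions
}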
5) O(log log n): The time complexity of a loop is considered O(log log n) if the loop variables are reduced/increased exponentially by a constant amount (for example, squared on every iteration).
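A minimal sketch, assuming the loop variable is squared each time (the starting value 2 is chosen just to make the loop well defined):
C++
// i takes the values 2, 4, 16, 256, ..., i.e. 2^(2^j), so the loop
// terminates after about log(log(n)) iterations
for (long long i = 2; i <= n; i = i * i) {
    // some O(1) expressions
}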
How to calculate time complexity when there are many if, else statements inside loops?
As discussed above, worst-case time complexity is the most useful among the best, average, and worst cases. Therefore we need to consider the worst case: we evaluate the situation in which the values in the if-else conditions cause the maximum number of statements to be executed.
For example, consider the linear search function where we consider the case when an element is
present at the end or not present at all.
When the code is too complex to consider all if-else cases, we can get an upper bound by ignoring if-else
and other complex control statements.
Many algorithms are recursive. When we analyze them, we get a recurrence relation for time
complexity. We get running time on an input of size n as a function of n and the running time on inputs
of smaller sizes. For example in Merge Sort, to sort a given array, we divide it into two halves and
recursively repeat the process for the two halves. Finally, we merge the results. Time complexity of
Merge Sort can be written as T(n) = 2T(n/2) + cn. There are many other algorithms like Binary Search,
Tower of Hanoi, etc.
Substitution Method:
We make a guess for the solution and then we use mathematical induction to prove the guess is correct
or incorrect.
For example, consider the recurrence T(n) = 2T(n/2) + n. We guess the solution as T(n) = O(n log n), and now we use induction to prove our guess.
We need to prove that T(n) <= c*n*log(n) for some constant c. We can assume that it is true for values smaller than n.
T(n) = 2T(n/2) + n
    <= 2 * (c*(n/2)*log(n/2)) + n
     = c*n*log(n/2) + n
     = c*n*log(n) - c*n*log(2) + n
     = c*n*log(n) - c*n + n
    <= c*n*log(n)            (taking log base 2 and c >= 1)
Recurrence Tree Method:
We draw the recurrence tree, calculate the cost at every level, and sum the costs of all levels. For example, consider the recurrence T(n) = T(n/4) + T(n/2) + cn^2. The root of its recurrence tree costs cn^2 and has two children, of sizes n/4 and n/2:

                  cn^2
                /      \
           T(n/4)      T(n/2)

Breaking down the subproblems further:

                    cn^2
                 /        \
          c(n^2)/16      c(n^2)/4
           /     \         /     \
  c(n^2)/256 c(n^2)/64 c(n^2)/64 c(n^2)/16
     / \        / \       / \       / \

The values at successive levels form a geometric series: cn^2 + (5/16)cn^2 + (25/256)cn^2 + ... To get an upper bound, we can sum the infinite series. We get the sum as (cn^2)/(1 - 5/16), which is O(n^2).
Master Method:
The Master Method is a direct way to get the solution. It works only for recurrences of the form T(n) = aT(n/b) + f(n), where a >= 1 and b > 1, or for recurrences that can be transformed into this form.
The master method is mainly derived from the recurrence tree method. If we draw the recurrence tree of T(n) = aT(n/b) + f(n), we can see that the work done at the root is f(n), and the work done at all leaves is Θ(n^c) where c = log_b(a). And the height of the recurrence tree is log_b(n).
In the recurrence tree method, we calculate the total work done. If the work done at leaves is
polynomially more, then leaves are the dominant part, and our result becomes the work done at leaves
(Case 1). If work done at leaves and root is asymptotically the same, then our result becomes height
multiplied by work done at any level (Case 2). If work done at the root is asymptotically more, then our
result becomes work done at the root (Case 3).
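For reference, with c = log_b(a), the three cases sketched above are commonly stated as follows (a standard textbook formulation, not spelled out in the text above):
• Case 1: if f(n) = O(n^(c - ε)) for some constant ε > 0, then T(n) = Θ(n^c) (the leaves dominate).
• Case 2: if f(n) = Θ(n^c), then T(n) = Θ(n^c * log n).
• Case 3: if f(n) = Ω(n^(c + ε)) for some constant ε > 0, and a*f(n/b) <= k*f(n) for some constant k < 1 and all sufficiently large n, then T(n) = Θ(f(n)) (the root dominates).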
Examples of some standard algorithms whose time complexity can be evaluated using the
Master Method
• Merge Sort: T(n) = 2T(n/2) + Θ(n). It falls in case 2 as c is 1 and log_b(a) is also 1. So the solution is Θ(n log n).
• Binary Search: T(n) = T(n/2) + Θ(1). It also falls in case 2 as c is 0 and log_b(a) is also 0. So the solution is Θ(log n).
Notes:
• It is not necessary that a recurrence of the form T(n) = aT(n/b) + f(n) can be solved using Master
Theorem. The given three cases have some gaps between them. For example, the recurrence T(n)
= 2T(n/2) + n/Logn cannot be solved using master method.
• Case 2 can be extended for f(n) = Θ(n^c * log^k(n)):
If f(n) = Θ(n^c * log^k(n)) for some constant k >= 0 and c = log_b(a), then T(n) = Θ(n^c * log^(k+1)(n)).
Recursion Tree Method for Solving Recurrences
The Recursion Tree Method is a way of solving recurrence relations. In this method, a recurrence
relation is converted into recursive trees. Each node represents the cost incurred at various levels of
recursion. To find the total cost, costs of all levels are summed up.
Example 1: Consider the recurrence T(n) = 2T(n/2) + c, in which every call does a constant amount of work c and makes two recursive calls on inputs of half the size.
Solution:
• Step 1: Draw the recursion tree. The root costs c and has two children of size n/2; each of those again costs c and has two children of size n/4, and so on.
• Step 2: Calculate the work done or cost at each level and count the total no. of levels in the recursion tree.
The subproblem size at level k is n/2^k, so the last level is reached when
n/2^k = 1
2^k = n
k = log2(n)
Total no. of levels in the recursion tree = k + 1 = log2(n) + 1
• Step 3: Count the total number of nodes in the last level and calculate the cost of the last level.
No. of nodes at level 0 = 2^0 = 1, at level 1 = 2^1 = 2, ..., at the last level = 2^(log2(n)) = n.
Cost of the last level = n x T(1) = Θ(n); cost of all the levels above it = c + 2c + 4c + ... + (n/2)c, which is about c(n)
Total cost = c(n) + Θ(n)
Thus, T(n) = Θ(n)
Example 2: Consider the recurrence T(n) = T(n/10) + T(9n/10) + n.
Solution:
• Step 1: Draw the recursion tree. The root costs n and has two children, of sizes n/10 and 9n/10; each node again splits into a 1/10 part and a 9/10 part.
• Step 2: Calculate the work done or cost at each level and count the total no. of levels in the recursion tree. The longest path is the one that keeps the 9/10 fraction, so the last level is reached when
(9/10)^k * n = 1
(9/10)^k = 1/n
k = log10/9(n)
Total no. of levels in the recursion tree = k + 1 = log10/9(n) + 1
• Step 3: Count the total number of nodes in the last level and calculate the cost of the last level.
No. of nodes at level 0 = 2^0 = 1, and each node has two children, so the last level has 2^(log10/9(n)) = n^(log10/9(2)) nodes.
Cost of the subproblems at level log10/9(n) (the last level) = n^(log10/9(2)) x T(1) = n^(log10/9(2)) x 1 = n^(log10/9(2))
Total cost = n*log10/9(n) + Θ(n^(log10/9(2)))
Thus, T(n) = Θ(n^(log10/9(2)))
Space Complexity
The term Space Complexity is often misused for Auxiliary Space. The following are the correct definitions of Auxiliary Space and Space Complexity.
Auxiliary Space is the extra space or temporary space used by an algorithm.
Space Complexity of an algorithm is the total space taken by the algorithm with respect to the input size. Space complexity includes both auxiliary space and the space used by the input.
For example, if we want to compare standard sorting algorithms on the basis of space, then Auxiliary Space would be a better criterion than Space Complexity: Merge Sort uses O(n) auxiliary space, while Insertion Sort and Heap Sort use O(1) auxiliary space, yet the space complexity of all these sorting algorithms is O(n).
Space complexity is a parallel concept to time complexity. If we need to create an array of size n, this will require O(n) space. If we create a two-dimensional array of size n*n, this will require O(n^2) space.
Example :
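The call trace below comes from a simple recursive sum; a minimal sketch of such a function (the name add and its exact body are illustrative assumptions, not given in the original text):
C++
// Illustrative recursive sum of the integers from 1 to n.
// Each call waits for the next one to finish, so up to n + 1
// frames are on the call stack at the same time: O(n) space.
int add(int n)
{
    if (n <= 0)
        return 0;
    return n + add(n - 1);
}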
1. add(4)
2. -> add(3)
3. -> add(2)
4. -> add(1)
5. -> add(0)
Each of these calls is added to the call stack and takes up actual memory, so the recursion takes O(n) space.
However, just because you have n calls total doesn't mean it takes O(n) space.
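A contrasting sketch, assuming a hypothetical helper pairSum that adds two numbers: there are still n calls in total, but each one returns before the next begins, so only O(1) of them occupy the stack at any moment.
C++
// Hypothetical helper: adds two numbers in O(1) time and space
int pairSum(int a, int b)
{
    return a + b;
}

// Makes n calls to pairSum, but each call finishes before the
// next one starts, so the stack depth stays O(1), not O(n)
int addPairs(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += pairSum(i, i + 1);
    }
    return sum;
}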