Algorithm
A program is written to solve a problem. A solution to a problem consists of two things:
A way to organize the data
A sequence of steps to solve the problem
The way data are organized in a computer's memory is called a data structure, and the sequence of
computational steps to solve a problem is called an algorithm. Therefore, a program is nothing but
data structures plus algorithms.
A model defines an abstract view of the problem. The model focuses only on the aspects relevant to the
problem, and the programmer tries to define the properties of the problem.
With abstraction you create a well-defined entity that can be properly handled. These entities define the
data structure of the program.
An entity with the properties just described is called an abstract data type (ADT).
A data structure is a language construct that the programmer has defined in order to implement an abstract
data type.
There are many formalized, standard abstract data types, such as stacks, queues, and trees.
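To make the distinction between an ADT and a data structure concrete, here is a minimal sketch of a stack in C++. The class name `IntStack` and the choice of `std::vector` as storage are illustrative assumptions, not something these notes prescribe. Callers see only the operations `push`, `pop`, `top`, and `empty` (the abstract data type), while the private vector is the data structure that implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: a stack ADT for ints.
// The public operations form the abstract data type;
// the private std::vector is the data structure behind it.
class IntStack {
public:
    // Places a value on top of the stack.
    void push(int value) { items_.push_back(value); }

    // Removes and returns the most recently pushed value.
    int pop() {
        assert(!items_.empty());  // precondition: stack is not empty
        int value = items_.back();
        items_.pop_back();
        return value;
    }

    // Returns the most recently pushed value without removing it.
    int top() const {
        assert(!items_.empty());  // precondition: stack is not empty
        return items_.back();
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;  // storage could be swapped without changing callers
};
```

Replacing the vector with a linked list would change the data structure but leave the ADT, and every caller, unchanged.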
1.1.2. Abstraction
Abstraction is a process of classifying characteristics as relevant and irrelevant for the particular purpose
at hand and ignoring the irrelevant ones.
How do data structures model the world or some part of the world?
The value held by a data structure represents some specific characteristic of the world
The characteristic being modeled restricts the possible values held by a data structure
The characteristic being modeled restricts the possible operations to be performed on the data
structure.
Note: Notice the relation between characteristic, value, and data structures
1.2. Algorithms
An algorithm is a well-defined computational procedure that takes some value or a set of values as input
and produces some value or a set of values as output. Data structures model the static part of the world.
They are unchanging while the world is changing. In order to model the dynamic part of the world we
need to work with algorithms. Algorithms are the dynamic part of a program’s world model.
An algorithm transforms data structures from one state to another state in two ways:
The quality of a data structure is related to its ability to successfully model the characteristics of the
world. Similarly, the quality of an algorithm is related to its ability to successfully simulate the changes in
the world.
However, independent of any particular world model, the quality of data structure and algorithms is
determined by their ability to work together well. Generally speaking, correct data structures lead to
simple and efficient algorithms and correct algorithms lead to accurate and efficient data structures.
In order to solve a problem, there are many possible algorithms. One has to be able to choose the best
algorithm for the problem at hand using some scientific method. To classify some data structures and
algorithms as good, we need precise ways of analyzing them in terms of resource requirement. The main
resources are:
Running Time
Memory Usage
Communication Bandwidth
Running time is usually treated as the most important since computational time is the most precious
resource in most problem domains.
Accordingly, we can analyze an algorithm according to the number of operations required, rather than
according to an absolute amount of time involved. This can show how an algorithm’s efficiency changes
according to the size of the input.
Complexity Analysis is the systematic study of the cost of computation, measured either in time units or
in operations performed, or in the amount of storage space required.
The goal is to have a meaningful measure that permits comparison of algorithms independent of operating
platform.
There are two things to consider:
Time Complexity: Determine the approximate number of operations required to solve a problem
of size n.
Space Complexity: Determine the approximate memory required to solve a problem of size n.
Algorithm Analysis: Analysis of the algorithm or data structure to produce a function T(n) that
describes the algorithm in terms of the operations performed in order to measure the complexity of
the algorithm.
Order of Magnitude Analysis: Analysis of the function T(n) to determine the general
complexity category to which it belongs.
There is no generally accepted set of rules for algorithm analysis. However, an exact count of operations
is commonly used.
2. int total(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + 1;
    return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)
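The count above can be sanity-checked by instrumenting the loop. The sketch below is an illustration only: the function name `countTotalUnits` and the `units` counter are assumptions introduced here, and the for loop is unrolled into an explicit test so that each test, increment, and body execution can be tallied with the same accounting as the table.

```cpp
#include <cassert>

// Hypothetical helper: executes the same steps as total(n) and
// returns the number of time units charged by the table above.
long countTotalUnits(int n) {
    assert(n >= 0);          // the analysis assumes a non-negative size
    long units = 0;

    int sum = 0;
    units += 1;              // int sum = 0
    int i = 1;
    units += 1;              // loop initialisation: i = 1

    while (true) {
        units += 1;          // loop test, executed n+1 times in total
        if (!(i <= n)) break;
        sum = sum + 1;
        units += 2;          // one addition and one assignment
        i++;
        units += 1;          // loop increment
    }

    units += 1;              // return statement
    (void)sum;               // sum is not needed by the caller here
    return units;
}
```

For any n >= 0 the returned value should equal 4n+4, matching T(n).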
3. void func()
{
    int n;      // declared so the input below compiles; a bare declaration is not counted
    int x = 0;
    int i = 0;
    int j = 1;
    cout << "Enter an Integer value";
    cin >> n;
    while (i < n) {
        x++;
        i++;
    }
    while (j < n) {
        j++;
    }
}
Time Units to Compute
-------------------------------------------------
1 for the first assignment statement: x=0;
1 for the second assignment statement: i=0;
1 for the third assignment statement: j=1;
1 for the output statement.
1 for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
T (n)= 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O(n)
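An instrumented version of this function makes one hidden assumption of the count visible: the second loop performs n tests and n-1 increments only when n >= 1. The sketch below is illustrative; the name `countFuncUnits` is an assumption, and n is passed as a parameter instead of being read from the keyboard so the count can be checked automatically.

```cpp
#include <cassert>

// Hypothetical helper: mirrors func() and tallies time units
// using the same accounting as the table above.
long countFuncUnits(int n) {
    assert(n >= 1);   // for n < 1 the second loop still costs one test, so 5n+5 no longer holds
    long units = 0;

    int x = 0; units += 1;   // x = 0
    int i = 0; units += 1;   // i = 0
    int j = 1; units += 1;   // j = 1
    units += 1;              // output statement
    units += 1;              // input statement (n arrives as a parameter here)

    while (true) {
        units += 1;          // first loop test: n+1 times
        if (!(i < n)) break;
        x++;
        i++;
        units += 2;          // two increments per iteration
    }
    while (true) {
        units += 1;          // second loop test: n times
        if (!(j < n)) break;
        j++;
        units += 1;          // one increment per iteration, n-1 times
    }
    (void)x;
    return units;
}
```

For n >= 1 the result should equal 5n+5.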
4. int sum(int n)
{
    int partial_sum = 0;
    for (int i = 1; i <= n; i++)
        partial_sum = partial_sum + (i * i * i);
    return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
• In general, a for loop translates to a summation. The index and bounds of the summation are the
same as the index and bounds of the for loop.
for (int i = 1; i <= N; i++) {
    sum = sum+i;
}
$\sum_{i=1}^{N} 1 = N$
• Suppose we count the number of additions that are done. There is 1 addition per iteration of the
loop, hence N additions in total.
• Nested for loops translate into multiple summations, one for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum+i+j;
    }
}
$\sum_{i=1}^{N} \sum_{j=1}^{M} 2 = \sum_{i=1}^{N} 2M = 2MN$
• Again, count the number of additions. The outer summation is for the outer for loop.
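The double summation can be checked by counting additions directly. In the sketch below the function name `countNestedAdditions` and the `additions` counter are assumptions made for illustration. Each execution of the inner statement performs two additions, so the total should be 2·M·N.

```cpp
#include <cassert>

// Hypothetical helper: runs the nested loops and counts
// the additions performed by "sum = sum + i + j".
long countNestedAdditions(int N, int M) {
    assert(N >= 0 && M >= 0);
    long additions = 0;
    long sum = 0;
    for (int i = 1; i <= N; i++) {        // outer summation: i = 1..N
        for (int j = 1; j <= M; j++) {    // inner summation: j = 1..M
            sum = sum + i + j;
            additions += 2;               // two additions per inner iteration
        }
    }
    (void)sum;
    return additions;
}
```

For example, N = 3 and M = 4 give 2·4·3 = 24 additions.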
Conditionals: Formally
• If (test) s1 else s2: Compute the maximum of the running time for s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum+i;
    }
}
else for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum+i+j;
    }
}
$\max\left(\sum_{i=1}^{N} 1,\ \sum_{i=1}^{N} \sum_{j=1}^{N} 2\right) = \max(N, 2N^2) = 2N^2$
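The rule for conditionals can be illustrated by counting the additions in each branch separately and taking the larger count. The function names `countBranchAdditions` and `worstCaseAdditions` below are assumptions introduced for this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: counts additions on the branch selected by `test`.
long countBranchAdditions(bool test, int N) {
    assert(N >= 0);
    long additions = 0;
    long sum = 0;
    if (test) {
        for (int i = 1; i <= N; i++) {
            sum = sum + i;
            additions += 1;               // one addition per iteration
        }
    } else {
        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                sum = sum + i + j;
                additions += 2;           // two additions per inner iteration
            }
        }
    }
    (void)sum;
    return additions;
}

// The running time charged to the if/else is the maximum over both branches.
long worstCaseAdditions(int N) {
    return std::max(countBranchAdditions(true, N),
                    countBranchAdditions(false, N));
}
```

For N >= 1 the else branch dominates, so the maximum is 2N^2.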
Example:
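Suppose we have hardware capable of executing 10^6 instructions per second. How long would it take to
execute an algorithm whose complexity function was
T(n) = 2n^2 on an input size of n = 10^8?
The total number of operations to be performed would be T(10^8) = 2(10^8)^2 = 2 × 10^16.
At 10^6 instructions per second this takes 2 × 10^16 / 10^6 = 2 × 10^10 seconds, which is roughly 634 years.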
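The arithmetic for this example can be reproduced with a few lines of code. This is only a sketch: the function names `operations` and `secondsNeeded` are assumptions, and 64-bit unsigned integers are used because the operation count is far too large for a 32-bit int.

```cpp
#include <cassert>
#include <cstdint>

// T(n) = 2 * n^2, the complexity function from the example.
// Note: the product overflows 64 bits once n exceeds roughly 3 * 10^9.
std::uint64_t operations(std::uint64_t n) {
    return 2 * n * n;
}

// Whole seconds needed at `rate` instructions per second (integer division).
std::uint64_t secondsNeeded(std::uint64_t n, std::uint64_t rate) {
    assert(rate > 0);
    return operations(n) / rate;
}
```

Calling `secondsNeeded(100000000, 1000000)` evaluates 2 × 10^16 / 10^6, which is 2 × 10^10 seconds.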
1.3. Measures of Times
In order to determine the running time of an algorithm it is possible to define three functions Tbest(n),
Tavg(n) and Tworst(n) as the best, the average and the worst case running time of the algorithm respectively.
Average Case (Tavg): The amount of time the algorithm takes on an "average" set of inputs.
Worst Case (Tworst): The amount of time the algorithm takes on the worst possible set of inputs.
Best Case (Tbest): The amount of time the algorithm takes on the most favorable possible set of inputs.
We are usually interested in the worst-case time, since it provides a bound for all inputs; this is called
the “Big-Oh” estimate.
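Linear search is a common illustration of how the three cases differ for inputs of the same size. The sketch below is an assumption-laden example, not part of the original notes: the function name `linearSearchComparisons` is hypothetical, and the cost measure is the number of element comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many element comparisons a
// linear search makes before it finds `target` or gives up.
std::size_t linearSearchComparisons(const std::vector<int>& data, int target) {
    assert(!data.empty());   // best and worst case are compared for a fixed size n >= 1
    std::size_t comparisons = 0;
    for (int value : data) {
        ++comparisons;
        if (value == target) break;   // found: stop scanning
    }
    return comparisons;
}
```

With n elements, the best case is a target in the first position (1 comparison) and the worst case is a target in the last position or absent altogether (n comparisons).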
There are five notations used to describe a running time function: Big-Oh (O), Big-Omega (Ω), Theta (Θ),
little-o (o), and little-omega (ω).
Formal Definition: f(n) = O(g(n)) if there exist c, k ∊ ℛ+ such that for all n ≥ k, f(n) ≤ c·g(n).
Examples: The following points are facts that you can use for Big-Oh problems:
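1. f(n) = 10n+5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)) we must show that there exist constants c and k such that
f(n) ≤ c·g(n) for all n ≥ k. For n ≥ 1 we have 10n+5 ≤ 10n+5n = 15n, so the definition is
satisfied with (c=15, k=1).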
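A finite scan is not a proof, but it is a quick way to sanity-check a candidate pair of constants before attempting the algebra. The sketch below assumes f(n) = 10n+5 and g(n) = n from the example; the function name `boundHolds` and the `limit` parameter are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: checks f(n) <= c * g(n) for every n in [k, limit],
// with f(n) = 10n + 5 and g(n) = n.
bool boundHolds(long c, long k, long limit) {
    assert(k >= 1 && limit >= k);
    for (long n = k; n <= limit; n++) {
        long f = 10 * n + 5;
        long g = n;
        if (f > c * g) return false;   // counterexample found
    }
    return true;
}
```

The pair (c=15, k=1) passes, while c=10 fails for every k because 10n+5 > 10n. The constants are not unique: (c=11, k=5) also works.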
Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but these are simply proportional to
logarithms in any other base.
Demonstrating that a function f(n) is big-O of a function g(n) requires that we find specific constants c
and k for which the inequality holds (and show that the inequality does in fact hold).
Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values of n.
An upper bound is the best algorithmic solution that has been found for a problem.
“What is the best that we know we can do?”
Exercise:
f(n) = (3/2)n^2 + (5/2)n - 3
Show that f(n) = O(n^2)
In simple words, f (n) =O(g(n)) means that the growth rate of f(n) is less than or equal to g(n).
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n).
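The following functions are listed in increasing order of growth rate; each one is big-O of those that follow it:
log_b n
n
n log_b n
n^2
n to higher powers
2^n
3^n
larger constants to the nth power
n!
n^n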
Exponential functions grow faster than powers, i.e. n^k is O(b^n) for all b > 1 and k >= 0.
E.g. n^20 is O(1.05^n)
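This example also shows that the constant k in the Big-Oh definition can be large: 1.05^n only overtakes n^20 once n is in the thousands. The sketch below locates that point; the function names are assumptions, and the two sides are compared through their logarithms (20·ln n against n·ln 1.05) because the raw values would overflow a double.

```cpp
#include <cassert>
#include <cmath>

// True when n^20 <= 1.05^n, compared via natural logarithms.
bool exponentialDominates(long n) {
    assert(n >= 1);
    return 20.0 * std::log(static_cast<double>(n)) <= n * std::log(1.05);
}

// Hypothetical helper: smallest n >= 2 at which 1.05^n catches up with n^20.
// Beyond this point the exponential stays ahead, since n / ln(n) keeps growing.
long crossover() {
    long n = 2;
    while (!exponentialDominates(n)) n++;
    return n;
}
```

Any k at or beyond the value returned by `crossover()` serves, with c = 1, as a witness that n^20 is O(1.05^n).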
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic
lower bound.
Formal Definition: A function f(n) is Ω(g(n)) if there exist constants c and k ∊ ℛ+ such that
f(n) >= c·g(n) for all n >= k.
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all values of n
greater than or equal to some k.
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to that of g(n).
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and c2 such that it can be
sandwiched between c1·g(n) and c2·g(n), for sufficiently large values of n.
Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words, there
exist constants c1, c2, and k > 0 such that c1·g(n) <= f(n) <= c2·g(n) for all n >= k.
In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of growth.
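The sandwich condition can be explored numerically. The sketch below uses f(n) = (3/2)n^2 + (5/2)n - 3 and g(n) = n^2 as an assumed example; the function name `sandwiched` and the `limit` parameter are hypothetical, and a finite scan only suggests suitable constants rather than proving them.

```cpp
#include <cassert>

// Hypothetical helper: checks c1*g(n) <= f(n) <= c2*g(n) for every n in [k, limit],
// with f(n) = 1.5 n^2 + 2.5 n - 3 and g(n) = n^2.
bool sandwiched(double c1, double c2, long k, long limit) {
    assert(k >= 1 && limit >= k);
    for (long n = k; n <= limit; n++) {
        double dn = static_cast<double>(n);
        double f = 1.5 * dn * dn + 2.5 * dn - 3.0;
        double g = dn * dn;
        if (c1 * g > f || f > c2 * g) return false;   // left the sandwich
    }
    return true;
}
```

Here c1 = 1, c2 = 4, and k = 1 pass the scan. Tightening either constant too far (for example c1 = 2 or c2 = 1.5 with k = 1) makes the check fail.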
Example: for f(n) = 2n^2,
f(n) = O(n^4)
f(n) = O(n^3)
f(n) = O(n^2)
All these are technically correct, but the last expression is the best and tightest one. Since 2n^2 and n^2 have the
same growth rate, it can be written as f(n) = Θ(n^2).
2n^2 = O(n^2)
2n^2 = O(n^3)
f(n)=o(g(n)) means for all c>0 there exists some k>0 such that f(n)<c.g(n) for all n>=k. Informally,
f(n)=o(g(n)) means f(n) becomes insignificant relative to g(n) as n approaches infinity.
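The difference from Big-Oh is that the inequality must hold for every positive c, with k allowed to depend on c. The sketch below illustrates this for the assumed pair f(n) = 2n^2 and g(n) = n^3; the function name `thresholdFor` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: for f(n) = 2n^2 and g(n) = n^3, returns the smallest k
// such that f(n) < c * g(n) for all n >= k. Since f(n)/g(n) = 2/n is decreasing,
// the inequality keeps holding once it holds for the first time.
long thresholdFor(double c) {
    assert(c > 0.0);
    long k = 1;
    while (2.0 * k * k >= c * k * k * k) k++;   // still f(k) >= c*g(k): move on
    return k;
}
```

A smaller c pushes k further out (k = 3 for c = 1, k = 5 for c = 0.5), but a k always exists, which is what 2n^2 = o(n^3) requires. For g(n) = n^2 no such k exists once c <= 2, so 2n^2 is not o(n^2).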
1.4.5. Little-Omega (ω notation)
Little-omega (ω) notation is to big-omega (Ω) notation as little-o notation is to Big-Oh notation. We use
ω notation to denote a lower bound that is not asymptotically tight.
Formal Definition: f(n) = ω(g(n)) if for every constant c > 0 there exists a constant k > 0 such that
0 <= c·g(n) < f(n) for all n >= k.
Transitivity
Symmetry
Reflexivity
• f(n) = Θ(f(n)),
• f(n) = O(f(n)),
• f(n) = Ω(f(n)).