Module 1 Notes
MODULE-1 INTRODUCTION
The study of algorithms includes many important and active areas of research. There are four
distinct areas of study one can identify:
1. How to devise algorithms: The act of creating an algorithm is an art which may
never be fully automated. A major goal is to study various design techniques that
have proven useful in that they have often yielded good algorithms.
2. How to validate algorithms: Once an algorithm is devised, it is necessary to show
that it computes the correct answer for all legitimate inputs.
3. How to analyze algorithms: Analysis refers to determining how much computing
time and storage an algorithm requires.
4. How to test a program: Testing a program really consists of two phases: debugging
and profiling. Debugging is the process of executing programs on sample data sets to
determine whether faulty results occur and, if so, to correct them. Profiling is the
process of executing a correct program on data sets and measuring the time and
space it takes to compute the results.
Algorithms in these notes are written in a pseudocode with the following conventions:
1. Comments begin with // and continue until the end of the line.
2. Blocks are indicated with matching braces: { and }. A compound statement (i.e., a
collection of simple statements) can be represented as a block. Statements are
delimited by ;.
3. An identifier begins with a letter. The data types of variables are not explicitly
declared; the types will be clear from the context. Whether a variable is global or
local to a procedure will also be evident from the context. We assume simple data
types such as integer, float, char, boolean, and so on. Compound data types can be
formed with records (a conventional-language equivalent is sketched after this list).
Here is an example:
node = record
{
    datatype_1 data_1;
    .
    .
    datatype_n data_n;
    node *link;
}
4. Assignment of values to variables is done using the assignment statement
⟨variable⟩ := ⟨expression⟩.
5. There are two Boolean values, true and false. In order to produce these values, the
logical operators and, or, and not and the relational operators <, <=, >, >=, =, and ≠
are provided.
6. Elements of multidimensional arrays are accessed using [ and ]. For example, the
(i, j)th element of a two-dimensional array A is denoted A[i, j].
7. The following looping statements are employed: for, while, and repeat-until. The
while loop takes the following form:
while ⟨condition⟩ do
{
    ⟨statement 1⟩
    .
    .
    ⟨statement n⟩
}
As long as ⟨condition⟩ is true, the statements get executed. When ⟨condition⟩ becomes
false, the loop is exited. The value of ⟨condition⟩ is evaluated at the top of the loop.
The general form of a for loop is
for variable := value_1 to value_n do
{
    ⟨statement 1⟩
    .
    .
    ⟨statement n⟩
}
The repeat-until loop takes the following form:
repeat
    ⟨statement 1⟩
    .
    .
    ⟨statement n⟩
until ⟨condition⟩
The statements are executed as long as ⟨condition⟩ is false. The value of ⟨condition⟩ is
computed after executing the statements.
8. A conditional statement has the following forms:
if ⟨condition⟩ then ⟨statement⟩
if ⟨condition⟩ then ⟨statement 1⟩ else ⟨statement 2⟩
9. Input and output are done using the instructions read and write. No format is used to
specify the size of input or output quantities.
10. There is only one type of procedure: Algorithm. An algorithm consists of a heading
and a body. The heading takes the form
Algorithm Name(⟨parameter list⟩)
where Name is the name of the procedure and ⟨parameter list⟩ is a listing of the
procedure parameters.
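As an illustration of the record convention above (not part of the original conventions), here is a minimal Python sketch of the node record; the field names data_1 and link simply mirror the pseudocode placeholders:

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    # Mirrors the pseudocode record: a data field followed by
    # a pointer to another node of the same type.
    data_1: Any = None
    link: Optional["Node"] = None

# Usage: a two-element linked chain built from the record type.
head = Node(data_1=10, link=Node(data_1=20))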
Example: Illustrating the notion of algorithm, we consider three methods for solving
the same problem: computing the greatest common divisor of two integers.
Euclid’s Algorithm:
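The statement of the algorithm is missing from these notes; here is a minimal Python sketch of the standard Euclid method, which repeatedly applies gcd(m, n) = gcd(n, m mod n) until the second number becomes 0:

def euclid_gcd(m: int, n: int) -> int:
    """Greatest common divisor via Euclid's algorithm.

    Applies gcd(m, n) = gcd(n, m mod n) until n becomes 0;
    the remaining value of m is the answer.
    """
    while n != 0:
        m, n = n, m % n
    return m

# Usage: gcd(60, 24) = 12.
print(euclid_gcd(60, 24))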
We can consider algorithms to be procedural solutions to problems. These solutions are not
answers but specific instructions for getting answers. It is this emphasis on precisely defined
constructive procedures that makes computer science distinct from other disciplines.
Figure 1.2 shows the sequence of steps one typically goes through in designing and
analyzing an algorithm.
The first thing you need to do before designing an algorithm is to understand completely the
problem given. Read the problem's description carefully and ask questions if you have any
doubts about the problem, do a few small examples by hand, think about special cases, and
ask questions again if needed.
There are a few types of problems that arise in computing applications quite often. An
input to an algorithm specifies an instance of the problem the algorithm solves. It is very
important to specify exactly the range of instances the algorithm needs to handle. (As an
example, recall the variations in the range of instances for the three greatest common divisor
algorithms.) If you fail to do this, your algorithm may work correctly for a majority of inputs
but crash on some "boundary" value. Remember that a correct algorithm is not one that works
most of the time, but one that works correctly for all legitimate inputs.
Measuring an Input’s Size: For some problems it is preferable to measure size by the
number b of bits in the n’s binary representation: b = ⌊log₂ n⌋ + 1.
Orders of Growth
A difference in running times on small inputs is not what really distinguishes efficient
algorithms from inefficient ones. For large values of n, it is the function’s order of growth
that counts: just look at Table 1.1, which contains values of a few functions particularly
important for analysis of algorithms.
Algorithm efficiency depends on the input size n, and for some algorithms it also
depends on the specifics of the input. This gives rise to best-, worst-, and average-case
efficiencies.
Average-case efficiency: the average number of times the basic operation is executed
over all possible instances of the input (assumed random). Consider sequential search,
and let p be the probability of a successful search (0 ≤ p ≤ 1). In the case of a successful
search, the probability of the first match occurring in the ith position of the list is p/n
for every i, and the number of comparisons made by the algorithm in such a situation is
obviously i. In the case of an unsuccessful search, the number of comparisons will be n,
with the probability of such a search being (1 − p). Therefore,
Cavg(n) = (expected comparisons over successful searches) + (expected comparisons over unsuccessful searches)
= [1 · p/n + 2 · p/n + . . . + i · p/n + . . . + n · p/n] + n · (1 − p)
= (p/n)(1 + 2 + . . . + n) + n(1 − p) = p(n + 1)/2 + n(1 − p).
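As a sanity check, a small Python sketch (illustrative, not from the notes) can compare the comparison counts of sequential search against the formula p(n + 1)/2 + n(1 − p):

def comparisons(lst, key):
    """Count key comparisons made by plain sequential search."""
    count = 0
    for x in lst:
        count += 1
        if x == key:
            break
    return count

n, p = 10, 1.0  # assume a successful search (p = 1), match equally likely anywhere
lst = list(range(n))
# Average over all n equally likely match positions.
avg = sum(comparisons(lst, k) for k in lst) / n
print(avg, p * (n + 1) / 2 + n * (1 - p))  # both print 5.5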
Let t(n) and g(n) be any nonnegative functions defined on the set of natural numbers,
where
● t(n) is the algorithm’s running time (usually indicated by its basic operation count C(n)), and
● g(n) is some simple function to compare the count with.
Big Oh Notation:
DEFINITION: A function t(n) is said to be in O(g(n)), denoted t(n) ∈ O(g(n)), if t(n) is
bounded above by some constant multiple of g(n) for all large n, i.e., if there exist some
positive constant c and some nonnegative integer n0 such that
t(n) ≤ cg(n) for all n ≥ n0.
As an example, let us formally prove one of the assertions made in the introduction:
100n + 5 ∈ O(n²). Indeed,
100n + 5 ≤ 100n + n (for all n ≥ 5) = 101n ≤ 101n².
Thus, as values of the constants c and n0 required by the definition, we can take 101 and 5,
respectively. Note that the definition gives us a lot of freedom in choosing specific values for
the constants c and n0. For example, we could also reason that 100n + 5 ≤ 100n + 5n (for all
n ≥ 1) = 105n ≤ 105n² to complete the proof with c = 105 and n0 = 1.
PROBLEMS:
1. Let f(n) = 3n + 2. Express f(n) in Big-Oh notation, O(g(n)).
Given f(n) = 3n + 2.
To find g(n), bound f(n) above by a constant multiple of its highest-order term:
3n + 2 ≤ 3n + n = 4n for all n ≥ 2.
So f(n) ≤ c · g(n) with c = 4, g(n) = n, and n0 = 2. By the definition,
f(n) = O(g(n)), i.e., f(n) = O(n).
4. Let f(n) = 3n + 2. Express f(n) in Big-Omega notation, Ω(g(n)).
Given f(n) = 3n + 2.
To find g(n), bound f(n) below by dropping the lower-order term (replacing 2 with 0):
3n + 2 ≥ 3n for all n ≥ 1.
So f(n) ≥ c · g(n) with c = 3, g(n) = n, and n0 = 1. By the definition,
f(n) = Ω(g(n)), i.e., f(n) = Ω(n).
6. Let f(n) = 3n + 2. Express f(n) in Big-Theta notation, Θ(g(n)).
From the two problems above, 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2. It is clear from these
relations that c1 = 3, c2 = 4, g(n) = n, and n0 = 2. So by definition,
f(n) = Θ(g(n))
i.e., f(n) = Θ(n)
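For reference, the corresponding definitions of the Ω and Θ notations used in the problems above, stated in the same form as the Big-Oh definition:
DEFINITION: A function t(n) is said to be in Ω(g(n)), denoted t(n) ∈ Ω(g(n)), if t(n) is bounded below by some positive constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some nonnegative integer n0 such that
t(n) ≥ cg(n) for all n ≥ n0.
DEFINITION: A function t(n) is said to be in Θ(g(n)), denoted t(n) ∈ Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all large n, i.e., if there exist some positive constants c1 and c2 and some nonnegative integer n0 such that
c1g(n) ≤ t(n) ≤ c2g(n) for all n ≥ n0.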
THEOREM: If t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then t1(n) + t2(n) ∈ O(max{g1(n),
g2(n)}).
PROOF: The proof extends to orders of growth the following simple fact about four arbitrary
real numbers a1, b1, a2, b2: if a1 ≤ b1 and a2 ≤ b2, then a1 + a2 ≤ 2 max{b1, b2}.
Since t1(n) ∈ O(g1(n)), there exist some positive constant c1 and some nonnegative
integer n1 such that
t1(n) ≤ c1g1(n) for all n ≥ n1.
Similarly, since t2(n) ∈ O(g2(n)),
t2(n) ≤ c2g2(n) for all n ≥ n2.
Let us denote c3 = max{c1, c2} and consider n ≥ max{n1, n2} so that we can use both
inequalities. Adding them yields the following:
t1(n) + t2(n) ≤ c1g1(n) + c2g2(n)
≤ c3g1(n) + c3g2(n) = c3[g1(n) + g2(n)]
≤ c3 [2 max{g1(n), g2(n)}].
Hence, t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}), with the constants c and n0 required by the O
definition being 2c3 = 2 max{c1, c2} and max{n1, n2}, respectively.
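The next passage analyzes the brute-force algorithm for finding the value of the largest element in a list of n numbers. Its pseudocode does not survive in these notes; here is a minimal Python sketch of the algorithm under analysis:

def max_element(A):
    """Return the value of the largest element in a nonempty list A."""
    maxval = A[0]
    for i in range(1, len(A)):      # i runs from 1 to n - 1
        if A[i] > maxval:           # the basic operation: one comparison per iteration
            maxval = A[i]
    return maxval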
The obvious measure of an input’s size here is the number of elements in the array, i.e., n.
The operations that are going to be executed most often are in the algorithm’s for loop. There
are two operations in the loop’s body: the comparison A[i]> maxval and the assignment
maxval←A[i]. Since the comparison is executed on each repetition of the loop and the
assignment is not, we should consider the comparison to be the algorithm’s basic operation.
Note that the number of comparisons will be the same for all arrays of size n; therefore, in
terms of this metric, there is no need to distinguish among the worst, average, and best cases
here.
Let us denote by C(n) the number of times this comparison is executed and try to find a formula
expressing it as a function of size n. The algorithm makes one comparison on each execution
of the loop, which is repeated for each value of the loop’s variable i within the bounds 1 and
n − 1, inclusive. Therefore, we get the following sum for C(n):
C(n) = ∑_{i=1}^{n−1} 1.
This is an easy sum to compute because it is nothing other than 1 repeated n − 1 times. Thus,
C(n) = n − 1 ∈ Θ(n).
EXAMPLE 2: Consider the element uniqueness problem: check whether all the elements in
a given array of n elements are distinct. This problem can be solved by the following
straightforward algorithm.
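The algorithm’s pseudocode is missing here; a minimal Python sketch of the straightforward element-uniqueness check being analyzed:

def unique_elements(A):
    """Return True if all elements of list A are distinct, else False."""
    n = len(A)
    for i in range(n - 1):              # i from 0 to n - 2
        for j in range(i + 1, n):       # j from i + 1 to n - 1
            if A[i] == A[j]:            # the basic operation: one comparison
                return False
    return True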
The natural measure of the input’s size here is again n, the number of elements in the array.
Since the innermost loop contains a single operation (the comparison of two elements), we
should consider it as the algorithm’s basic operation. Note, however, that the number of
element comparisons depends not only on n but also on whether there are equal elements in
the array and, if there are, which array positions they occupy. We will limit our investigation
to the worst case only.
By definition, the worst case input is an array for which the number of element comparisons
Cworst(n) is the largest among all arrays of size n. An inspection of the innermost loop reveals
that there are two kinds of worst-case inputs—inputs for which the algorithm does not exit
the loop prematurely: arrays with no equal elements and arrays in which the last two elements
are the only pair of equal elements. For such inputs, one comparison is made for each
repetition of the innermost loop, i.e., for each value of the loop variable j between its limits i
+ 1 and n − 1; this is repeated for each value of the outer loop, i.e., for each value of the loop
variable i between its limits 0 and n − 2. Accordingly, we get
Cworst(n) = ∑_{i=0}^{n−2} ∑_{j=i+1}^{n−1} 1 = ∑_{i=0}^{n−2} (n − 1 − i) = (n − 1)n/2 ≈ n²/2 ∈ Θ(n²).
EXAMPLE 3: Given two n × n matrices A and B, find the time efficiency of the definition-
based algorithm for computing their product C = AB. By definition, C is an n × n matrix
whose elements are computed as the scalar (dot) products of the rows of matrix A and the
columns of matrix B:
where C[i, j ]= A[i, 0]B[0, j]+ . . . + A[i, k]B[k, j]+ . . . + A[i, n − 1]B[n − 1, j] for every pair of
indices 0 ≤ i, j ≤ n − 1.
and the total number of multiplications M(n) is expressed by the following triple sum:
M(n) = ∑_{i=0}^{n−1} ∑_{j=0}^{n−1} ∑_{k=0}^{n−1} 1.
Now, we can compute this sum by using formula (S1) and rule (R1) given above. Starting
with the innermost sum ∑_{k=0}^{n−1} 1, which is equal to n, we get
M(n) = ∑_{i=0}^{n−1} ∑_{j=0}^{n−1} n = ∑_{i=0}^{n−1} n² = n³.
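The definition-based algorithm itself is not shown in these notes; a minimal Python sketch (plain lists, no library) of the triple-loop computation whose multiplication count M(n) = n³ was just derived:

def matrix_product(A, B):
    """Definition-based product C = AB of two n-by-n matrices (lists of lists)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):          # innermost loop: n multiplications
                C[i][j] += A[i][k] * B[k][j]
    return C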
EXAMPLE 4: The following algorithm finds the number of binary digits in the binary representation of
a positive decimal integer.
ALGORITHM Binary(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n’s binary representation
count ← 1
while n > 1 do
    count ← count + 1
    n ← ⌊n/2⌋
return count
First, notice that the most frequently executed operation here is not inside the while loop but
rather the comparison n > 1 that determines whether the loop’s body will be executed. Since
the number of times the comparison will be executed is larger than the number of repetitions
of the loop’s body by exactly 1, the choice is not that important.
A more significant feature of this example is the fact that the loop variable takes on only a
few values between its lower and upper limits; therefore, we have to use an alternative way of
computing the number of times the loop is executed. Since the value of n is about halved on
each repetition of the loop, the answer should be about log₂ n. The exact formula for the
number of times the comparison n > 1 will be executed is actually ⌊log₂ n⌋ + 1, the number
of bits in the binary representation of n.
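A quick Python check (illustrative, not part of the notes) that the loop’s comparison count matches ⌊log₂ n⌋ + 1, which is exactly Python’s int.bit_length():

from math import floor, log2

def binary_digit_count(n):
    """Count binary digits of a positive integer, tracking evaluations of n > 1."""
    count, comparisons = 1, 0
    while True:
        comparisons += 1            # each evaluation of the condition n > 1
        if not n > 1:
            break
        count += 1
        n //= 2                     # integer halving, i.e., floor(n / 2)
    return count, comparisons

for n in (1, 2, 15, 16, 1000):
    count, comps = binary_digit_count(n)
    assert count == comps == floor(log2(n)) + 1 == n.bit_length()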
Consider the factorial function F(n) = n!. Since n! = 1 · . . . · (n − 1) · n = (n − 1)! · n for
n ≥ 1, and 0! = 1 by definition, we can compute F(n) = F(n − 1) · n with the following
recursive algorithm.
ALGORITHM F(n)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0 return 1
else return F(n − 1) ∗ n
For simplicity, we consider n itself as an indicator of this algorithm’s input. The basic
operation of the algorithm is multiplication, whose number of executions we denote M(n).
Since the function F(n) is computed according to the formula
F(n) = F(n − 1) · n for n > 0,
the number of multiplications M(n) needed to compute it must satisfy the equality
M(n) = M(n − 1) + 1 for n > 0.
Indeed, M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication
is needed to multiply the result by n.
The last equation defines the sequence M(n) that we need to find. This equation defines M(n)
not explicitly, i.e., as a function of n, but implicitly as a function of its value at another point,
namely n − 1. Such equations are called recurrence relations or, for brevity, recurrences.
Recurrence relations play an important role not only in analysis of algorithms but also in
some areas of applied mathematics.
To determine a solution uniquely, we need an initial condition that tells us the value with
which the sequence starts. We can obtain this value by inspecting the condition that makes
the algorithm stop its recursive calls:
if n = 0 return 1.
This tells us two things. First, since the calls stop when n = 0, the smallest value of n for
which this algorithm is executed and hence M(n) defined is 0. Second, by inspecting the
pseudocode’s exiting line, we can see that when n = 0, the algorithm performs no
multiplications. Therefore, the initial condition we are after is
M(0) = 0.
We thus have two recursively defined functions here. The first is the factorial function F(n)
itself, defined by the recurrence F(n) = F(n − 1) · n with F(0) = 1. The second is the number
of multiplications M(n) needed to compute F(n) by the recursive algorithm whose
pseudocode was given at the beginning of the section:
M(n) = M(n − 1) + 1 for n > 0, M(0) = 0.
Among the several techniques available for solving recurrence relations, we use what can be
called the method of backward substitutions. The method’s idea (and the reason for the
name) is immediately clear from the way it applies to solving our particular recurrence:
M(n) = M(n − 1) + 1    substitute M(n − 1) = M(n − 2) + 1
= [M(n − 2) + 1] + 1 = M(n − 2) + 2    substitute M(n − 2) = M(n − 3) + 1
= [M(n − 3) + 1] + 2 = M(n − 3) + 3.
After inspecting the first three lines, we see an emerging pattern, which makes it possible to
predict not only the next line (what would it be?) but also a general formula for the pattern:
M(n) = M(n − i) + i.
Since the initial condition is specified for n = 0, we have to substitute i = n in the pattern’s
formula to get the ultimate result of our backward substitutions:
M(n) = M(n − 1) + 1 = . . . = M(n − i) + i = . . . = M(n − n) + n = n.
The issue of time efficiency is actually not that important for the problem of computing n!,
because the function’s values get so large so fast that we can realistically compute exact
values of n! only for very small n’s.
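A small Python rendering of the recursive algorithm, instrumented (as an illustration, not part of the original pseudocode) to confirm that M(n) = n multiplications are performed:

def factorial(n, counter):
    """Recursive n! that appends one mark to counter per multiplication."""
    if n == 0:
        return 1                    # base case: no multiplication
    result = factorial(n - 1, counter) * n
    counter.append(1)               # one multiplication per nonzero n
    return result

counter = []
print(factorial(5, counter))        # 120
print(len(counter))                 # 5 multiplications, i.e., M(5) = 5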
Let us apply the general plan outlined above to the Tower of Hanoi problem. The number of
disks n is the obvious choice for the input’s size indicator, and so is moving one disk as the
algorithm’s basic operation. Clearly, the number of moves M(n) depends on n only, and we
get the following recurrence equation for it:
M(n) = M(n − 1) + 1 + M(n − 1) for n > 1.
With the obvious initial condition M(1) = 1, we have the following recurrence relation for the
number of moves M(n):
M(n) = 2M(n − 1) + 1 for n > 1, M(1) = 1.
Solving it by backward substitutions yields M(n) = 2^n − 1.
When a recursive algorithm makes more than a single call to itself, it can be useful for
analysis purposes to construct a tree of its recursive calls. In this tree, nodes correspond to
recursive calls, and we can label them with the value of the parameter (or, more generally,
parameters) of the calls. For the Tower of Hanoi example, the tree is given in Figure 1.7. By
counting the number of nodes in the tree, we can get the total number of calls made by the
algorithm, which is again 2^n − 1.
FIGURE 1.7: Tree of recursive calls made by the recursive algorithm for the Tower of
Hanoi puzzle.
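The recursive algorithm itself is not reproduced in these notes; a minimal Python sketch under the standard formulation (move n − 1 disks aside, move the largest disk, then move the n − 1 disks back on top), with a move log to confirm M(n) = 2^n − 1:

def hanoi(n, source, target, spare, moves):
    """Move n disks from source peg to target peg, recording each move."""
    if n == 1:
        moves.append((source, target))          # base case: one move
        return
    hanoi(n - 1, source, spare, target, moves)  # clear n - 1 disks out of the way
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # put the n - 1 disks back on top

moves = []
hanoi(4, "A", "C", "B", moves)
print(len(moves))   # 15 moves, i.e., 2**4 - 1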
Consider next the recursive algorithm BinRec for finding the number of binary digits of n,
which returns 1 when n = 1 and BinRec(⌊n/2⌋) + 1 otherwise (a Python sketch follows the
analysis below). Let us set up a recurrence and an initial condition for the number of
additions A(n) made by the algorithm. The number of additions made in computing
BinRec(⌊n/2⌋) is A(⌊n/2⌋), plus one more addition is made by the algorithm to increase the
returned value by 1. This leads to the recurrence
A(n) = A(⌊n/2⌋) + 1 for n > 1.
Since the recursive calls end when n is equal to 1 and there are no additions made then, the
initial condition is
A(1) = 0.
The presence of ⌊n/2⌋ in the function’s argument makes the method of backward substitutions
stumble on values of n that are not powers of 2. Therefore, the standard approach to solving
such a recurrence is to solve it only for n = 2^k and then take advantage of the theorem called
the smoothness rule (see Appendix B), which claims that under very broad assumptions the
order of growth observed for n = 2^k gives a correct answer about the order of growth for all
values of n. (Alternatively, after getting a solution for powers of 2, we can sometimes fine-
tune this solution to get a formula valid for an arbitrary n.) So let us apply this recipe to our
recurrence, which for n = 2^k takes the form
A(2^k) = A(2^(k−1)) + 1 for k > 0, A(2^0) = 0.
Backward substitutions then yield A(2^k) = k, and hence A(n) = log₂ n ∈ Θ(log n).
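For reference, a minimal Python sketch of the BinRec algorithm assumed in this analysis:

def bin_rec(n):
    """Number of binary digits in a positive integer n, computed recursively."""
    if n == 1:
        return 1                       # base case: no addition performed
    return bin_rec(n // 2) + 1         # one addition per recursive step

print(bin_rec(16))   # 5 binary digits: 10000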
Selection Sort:
We start selection sort by scanning the entire given list to find its smallest element and exchange it
with the first element, putting the smallest element in its final position in the sorted list. Then we scan
the list, starting with the second element, to find the smallest among the last n - 1 elements and
exchange it with the second element, putting the second smallest element in its final position.
Generally, on the ith pass through the list, which we number from 0 to n − 2, the algorithm
searches for the smallest item among the last n − i elements and swaps it with A[i].
After n − 1 passes, the list is sorted. Here is a pseudocode of this algorithm, which, for
simplicity, assumes that the list is implemented as an array.
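The pseudocode itself did not survive extraction; a minimal Python sketch of selection sort as described above, followed by a small usage example (the sample list is illustrative only):

def selection_sort(A):
    """Sort list A in place by repeatedly selecting the smallest remaining element."""
    n = len(A)
    for i in range(n - 1):                    # passes 0 through n - 2
        min_idx = i
        for j in range(i + 1, n):             # find smallest among A[i..n-1]
            if A[j] < A[min_idx]:
                min_idx = j
        A[i], A[min_idx] = A[min_idx], A[i]   # swap it into position i

data = [89, 45, 68, 90, 29, 34, 17]
selection_sort(data)
print(data)   # [17, 29, 34, 45, 68, 89, 90]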
Sequential Search:
We have already encountered a brute-force algorithm for the general searching problem: it is called
sequential search (see Section 2.1). To repeat, the algorithm simply compares successive elements
of a given list with a given search key until either a match is encountered (successful search) or the
list is exhausted without finding a match (unsuccessful search). A simple extra trick is often
employed in implementing sequential search: if we append the search key to the end of the list, the
search for the key will have to be successful, and therefore we can eliminate a check for the list's
end on each iteration of the algorithm. Here is a pseudocode for this enhanced version, with its
input implemented as an array.
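The enhanced pseudocode is missing from these notes; a minimal Python sketch of sequential search with the appended-key (sentinel) trick described above:

def sequential_search_sentinel(A, key):
    """Return the index of key in list A, or -1 if absent.

    Appends key as a sentinel so the loop needs no end-of-list check.
    """
    A = A + [key]                 # sentinel guarantees the search succeeds
    i = 0
    while A[i] != key:            # single comparison per iteration
        i += 1
    return i if i < len(A) - 1 else -1   # index of sentinel means "not found"

print(sequential_search_sentinel([6, 2, 9, 4], 9))   # 2
print(sequential_search_sentinel([6, 2, 9, 4], 7))   # -1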
Brute-Force String Matching:
Given a text of n characters and a pattern of m characters, the brute-force algorithm aligns
the pattern with the beginning of the text and compares the corresponding characters from
left to right until either all m pairs match or a mismatching pair is found; in the latter case,
the pattern is shifted one position to the right, for a total of up to n − m + 1 trials. Note that
for a typical example, the algorithm shifts the pattern almost always after a single character
comparison. However, the worst case is much worse: the algorithm may have to make all m
comparisons before shifting the pattern, and this can happen for each of the n − m + 1 tries.
(Problem 6 asks you to give a specific example of such a situation.) Thus, in the worst case,
the algorithm is in Θ(nm). For a typical word search in a natural language text, however, we
should expect that most shifts would happen after very few comparisons. Therefore, the
average-case efficiency should be considerably better than the worst-case efficiency.
Indeed it is: for searching in random texts, it has been shown to be linear, i.e., Θ(n + m) = Θ(n).
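A minimal Python sketch of the brute-force matcher described above (the original pseudocode is not in these notes; the sample text and pattern are illustrative):

def brute_force_match(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):          # up to n - m + 1 alignments of the pattern
        j = 0
        while j < m and pattern[j] == text[i + j]:
            j += 1                       # advance while characters match
        if j == m:                       # all m characters matched
            return i
    return -1

print(brute_force_match("NOBODY_NOTICED_HIM", "NOT"))   # 7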