
Design and Analysis of Algorithms B.Sc. CSIT

Chapter 1
Principles of Analyzing algorithms and Problems
An algorithm is a finite set of computational instructions, each executable in finite time, that takes some value or set of values as input and produces some value or set of values as output. Algorithms do not depend on a particular machine, programming language, or compiler, i.e. an algorithm runs in the same manner everywhere. An algorithm is therefore a mathematical object, assumed to run on a machine with unlimited capacity.

Examples of problems
 You are given two numbers, how do you find the Greatest Common Divisor.
 Given an array of numbers, how do you sort them?
We need algorithms to understand the basic concepts of computer science and programming. To understand where the computations are done, and to understand the input-output relation of a problem, we must be able to understand the steps involved in obtaining the output(s) from the given input(s).

You also need algorithm design concepts, because if you only study existing algorithms then you are bound to those algorithms and to selection among them. If you have knowledge of design, however, you can attempt to improve performance using different design principles.

The analysis of algorithms gives good insight into the algorithms under study. Analysis tries to answer questions such as: is the algorithm correct, i.e. does it generate the required result? Does it terminate for all inputs in the problem domain? Other issues of analysis are efficiency, optimality, etc. By knowing these aspects of different algorithms for the same problem domain, we can choose the better algorithm for our need. This requires knowing the resources an algorithm needs for its execution. The two most important resources are

Prepared By: Arjun S Saud, Faculty CDCSIT,TU 1

Downloaded from CSIT Tutor



the time and the space. Both resources are measured in terms of complexity; for time, instead of absolute time we consider the growth of running time with input size.

Algorithms Properties
Input(s)/Output(s): There must be some inputs from the standard set of inputs, and an algorithm's execution must produce output(s).
Definiteness: Each step must be clear and unambiguous.
Finiteness: An algorithm must terminate after a finite number of steps.
Correctness: A correct set of output values must be produced for each set of inputs.
Effectiveness: Each step must be carried out in finite time.
Here we deal with correctness and finiteness.

Expressing Algorithms
There are many ways of expressing algorithms; in order of ease of expression they are natural language, pseudocode, and real programming-language syntax. In this course we intermix natural-language and pseudocode conventions.

Random Access Machine Model


The RAM model is the base model for our study of design and analysis of algorithms; it lets us design and analyze in a machine-independent setting. In this model each basic operation (+, -, etc.) takes one step; loops and subroutines are not basic operations. Each memory reference takes one step. We measure the running time of an algorithm by counting steps.

Best, Worst and Average case


Best case complexity gives a lower bound on the running time of the algorithm over all instances of the input(s). It indicates that the algorithm can never have a lower running time than the best case for a particular class of problems.


Worst case complexity gives an upper bound on the running time of the algorithm over all instances of the input(s). This ensures that no input can exceed the running-time limit posed by the worst case.

Average case complexity gives the average number of steps required over all instances of the input(s).

In our study we concentrate on worst case complexity only.

Example 1: Fibonacci Numbers


Input: n
Output: nth Fibonacci number.
Algorithm: assume a as the first (previous) and b as the second (current) number
fib(n)
{
    a = 0, b = 1, f = 1;
    for (i = 2; i <= n; i++)
    {
        f = a + b;
        a = b;
        b = f;
    }
    return f;
}

Efficiency
Time Complexity: The loop above iterates n - 1 times (for i = 2 to n), so the time complexity is O(n).

Space Complexity: The space complexity is constant i.e. O(1).
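As a quick check, the pseudocode above can be transcribed into Python (a direct rendering; it assumes the pseudocode's convention that fib(1) = fib(2) = 1):

```python
def fib(n):
    # a is the first (previous) number, b the second (current) one.
    a, b, f = 0, 1, 1
    for i in range(2, n + 1):   # runs n - 1 times: O(n) time
        f = a + b               # next Fibonacci number
        a = b
        b = f
    return f                    # only three variables: O(1) space
```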


Example 2: Greatest Common Divisor


Inputs: Two numbers a and b
Output: G.C.D of a and b.
Algorithm: assume (for simplicity) a > b >= 0
gcd(a, b)
{
    while (b != 0)
    {
        d = a / b;
        temp = b;
        b = a - b * d;
        a = temp;
    }
    return a;
}

Efficiency
Running Time: if the given numbers a and b are n-bit numbers, then the loop executes O(n) times, and each division and multiplication can be done in O(n^2) time. So the total running time becomes O(n^3).

Another way of analyzing:


For simplicity, let us assume that
b = 2^n
⇒ n = log b
⇒ the loop executes log b times in the worst case
⇒ Time Complexity = O(log b)
Space Complexity: The only allocated spaces are for variables so space
complexity is constant i.e. O(1).
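The same algorithm in Python (a direct transcription of the pseudocode; it assumes a > b >= 0 as stated):

```python
def gcd(a, b):
    # Euclid's algorithm, transcribed from the pseudocode above.
    while b != 0:
        d = a // b          # integer quotient
        temp = b
        b = a - b * d       # this is exactly a mod b
        a = temp
    return a
```

Note that b = a - b*d is the remainder of a divided by b, so each iteration replaces (a, b) with (b, a mod b).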


Mathematical Foundation
Mathematics can provide a clear view of an algorithm, and understanding mathematical concepts aids in the design and analysis of good algorithms. Here we present some of the mathematical concepts that are helpful in our study.

Exponents
Some of the formulas that are helpful are:

x^a · x^b = x^(a+b)
x^a / x^b = x^(a-b)
(x^a)^b = x^(ab)
x^n + x^n = 2x^n
2^n + 2^n = 2^(n+1)

Logarithms
Some of the formulas that are helpful are:

1. log_a b = log_c b / log_c a ; c > 0
2. log(ab) = log a + log b
3. log(a/b) = log a - log b
4. log(a^b) = b log a
5. log x < x for all x > 0
6. log 1 = 0, log 2 = 1, log 1024 = 10
7. a^(log_b n) = n^(log_b a)
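These identities can be spot-checked numerically; the sketch below uses Python's math module and arbitrarily chosen values (a spot check, not a proof):

```python
import math

# Arbitrary positive test values (any would do).
a, b, c = 8.0, 32.0, 2.0

assert math.isclose(math.log(b, a), math.log(b, c) / math.log(a, c))  # rule 1
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))       # rule 2
assert math.isclose(math.log(a / b), math.log(a) - math.log(b))       # rule 3
assert math.isclose(math.log(a ** 3), 3 * math.log(a))                # rule 4
assert math.isclose(a ** math.log(100, 2), 100 ** math.log(a, 2))     # rule 7
print("all identities hold")
```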

Series


Asymptotic Notation
Exact complexity analysis of an algorithm is very hard. We know that the complexity (worst, best, or average) of an algorithm is a mathematical function of the size of the input, so it is easier to analyze the algorithm in terms of bounds (upper and lower). For this purpose we need the concept of asymptotic notations. The figure below illustrates the upper- and lower-bound concept.

Big Oh (O) notation


When we have only an asymptotic upper bound, we use O notation. A function f(x) = O(g(x)) (read as "f(x) is big oh of g(x)") iff there exist two positive constants c and x0 such that for all x >= x0, 0 <= f(x) <= c*g(x).
This relation says that g(x) is an upper bound of f(x).
Some properties:
Transitivity: f(x) = O(g(x)) & g(x) = O(h(x)) ⇒ f(x) = O(h(x))
Reflexivity: f(x) = O(f(x))
O(1) is used to denote constants.


For all values of n >= n0, the plot shows clearly that f(n) lies below or on the curve of c*g(n).

Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = O(g(n)).
Proof: Let us choose c = 14 and n0 = 1. Then we have
f(n) <= c*g(n) for all n >= n0, i.e.
3n^2 + 4n + 7 <= 14n^2 for all n >= 1.
The above inequality is trivially true.
Hence f(n) = O(g(n)).
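The chosen constants can be sanity-checked numerically (a spot check over a finite range, not a substitute for the proof):

```python
def f(n):
    return 3 * n ** 2 + 4 * n + 7

def g(n):
    return n ** 2

c, n0 = 14, 1
# f(n) <= c*g(n) should hold for every n >= n0; equality at n = 1.
assert all(f(n) <= c * g(n) for n in range(n0, 1000))
```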

2. Prove that n log(n^3) is O(n^3).

Proof: We have n log(n^3) = 3n log n.
If we can prove log n = O(n), the problem is solved, because then 3n log n = O(n·n) = O(n^2), and O(n^2) ⊆ O(n^3).
We can use the fact that log^a n = O(n^b) for all a, b > 0.
Taking a = 1 and b = 1/2,


hence log n = O(n).


So by knowing log n = O(n) we proved that
n log (n3) = O(n3)).

3. Is 2^(n+1) = O(2^n)?
Is 2^(2n) = O(2^n)?
Big Omega (Ω) notation

Big omega notation gives an asymptotic lower bound. A function f(x) = Ω(g(x)) (read as "f(x) is big omega of g(x)") iff there exist two positive constants c and x0 such that for all x >= x0, 0 <= c*g(x) <= f(x).
This relation says that g(x) is a lower bound of f(x).
Some properties:
Transitivity: f(x) = Ω(g(x)) & g(x) = Ω(h(x)) ⇒ f(x) = Ω(h(x))
Reflexivity: f(x) = Ω(f(x))

Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Ω(g(n)).

Proof: Let us choose c = 1 and n0 = 1. Then we have
f(n) >= c*g(n) for all n >= n0, i.e.


3n^2 + 4n + 7 >= 1·n^2 for all n >= 1.

The above inequality is trivially true.

Hence f(n) = Ω(g(n)).

Big Theta (Θ) notation


When we need an asymptotically tight bound, we use Θ notation. A function f(x) = Θ(g(x)) (read as "f(x) is big theta of g(x)") iff there exist three positive constants c1, c2 and x0 such that for all x >= x0, c1*g(x) <= f(x) <= c2*g(x).
This relation says that f(x) is of the order of g(x).
Some properties:
Transitivity: f(x) = Θ(g(x)) & g(x) = Θ(h(x)) ⇒ f(x) = Θ(h(x))
Reflexivity: f(x) = Θ(f(x))
Symmetry: f(x) = Θ(g(x)) iff g(x) = Θ(f(x))

Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Θ(g(n)).
Proof: Let us choose c1 = 14, c2 = 1 and n0 = 1. Then we can
have,
f(n) <= c1*g(n) for n >= n0, as 3n^2 + 4n + 7 <= 14n^2, and


f(n) >= c2*g(n) for n >= n0, as 3n^2 + 4n + 7 >= 1·n^2,

for all n >= 1 (in both cases).
So c2*g(n) <= f(n) <= c1*g(n) is trivial.
Hence f(n) = Θ(g(n)).

2. Show (n + a)^b = Θ(n^b), for any real constants a and b, where b > 0.
Here, using the binomial theorem to expand (n + a)^b, we get
C(b,0)n^b + C(b,1)n^(b-1)a + … + C(b,b-1)na^(b-1) + C(b,b)a^b.
We can obtain constants such that (n + a)^b <= c1·n^b, for all n >= n0,
and (n + a)^b >= c2·n^b, for all n >= n0.
Here we may take c1 = 2^b, c2 = 1, n0 = |a|,
since 1·n^b <= (n + a)^b <= 2^b·n^b for n >= n0.
Hence the problem is solved.
Exercises
1. Show that 2^n is O(n!).
2. Let f(x) = a_n x^n + a_(n-1) x^(n-1) + … + a_1 x + a_0, where a_0, a_1, …, a_n are real numbers with a_n != 0.

Then show that f(x) is O(x^n), f(x) is Ω(x^n), and hence that f(x) is Θ(x^n).

Recurrences

 Recursive algorithms are described by using recurrence relations.

 A recurrence is an equation or inequality that describes a function in terms of its own value on smaller inputs.
For example:
Recursive algorithm for finding a factorial:
T(n) = 1 when n = 1
T(n) = T(n-1) + O(1) when n > 1

Recursive algorithm for finding Nth Fibonacci number


T(1)=1 when n=1


T(2)=1 when n=2
T(n)=T(n-1) + T(n-2) +O(1) when n>2

Recursive algorithm for binary search

T(1)=1 when n=1


T(n)=T(n/2) + O(1) when n>1

Technicalities
Consider the recurrence
T(n) = k when n = 1
T(n) = 2T(n/2) + kn when n > 1

What is odd about the above? In the next iteration n may not be integral.

More accurate is:
T(n) = k when n = 1
T(n) = T(floor(n/2)) + T(ceil(n/2)) + kn when n > 1
This difference rarely matters, so we usually ignore this detail.

Again consider the recurrence

T(n) = (1) n<b


T(n) = aT(n/b) + f(n) nb

For constant-sized problem, sizes can bound


algorithm by some constant value.

This constant value is irrelevant for asymptote. Thus, we often skip writing the
base case equation


Techniques for Solving Recurrences


We'll use five techniques:
 Iteration method
 Recursion Tree
 Substitution
 Master Method – for divide & conquer
 Characteristic Equation – for linear

Iteration method
 Expand the relation so that summation independent on n is obtained.
 Bound the summation
e.g.
T(n)= 2T(n/2) +1 when n>1
T(n)= 1 when n=1

T(n) = 2T(n/2) + 1
     = 2{2T(n/4) + 1} + 1
     = 4T(n/4) + 2 + 1
     = 4{T(n/8) + 1} + 2 + 1
     = 8T(n/8) + 4 + 2 + 1
     ………………………
     ………………………
     = 2^k T(n/2^k) + 2^(k-1) + … + 4 + 2 + 1

For simplicity assume:

n = 2^k
⇒ k = log n
⇒ T(n) = 2^k + 2^(k-1) + … + 2^2 + 2^1 + 2^0
⇒ T(n) = (2^(k+1) - 1)/(2 - 1)
⇒ T(n) = 2^(k+1) - 1


⇒ T(n) = 2·2^k - 1
⇒ T(n) = 2n - 1
⇒ T(n) = O(n)
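The closed form T(n) = 2n - 1 can be checked against the recurrence itself (a numeric sanity check for powers of two, where the recurrence is exact):

```python
def T(n):
    # The recurrence itself: T(1) = 1, T(n) = 2T(n/2) + 1
    # (defined here for n a power of two).
    if n == 1:
        return 1
    return 2 * T(n // 2) + 1

# The iteration above gives the closed form T(n) = 2n - 1 for n = 2^k.
for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n - 1
```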
Second Example:
T(n) = T(n/3) + O(n) when n>1
T(n) = 1 when n=1

T(n) = T(n/3) + O(n)

⇒ T(n) = T(n/3^2) + O(n/3) + O(n)
⇒ T(n) = T(n/3^3) + O(n/3^2) + O(n/3) + O(n)
⇒ T(n) = T(n/3^4) + O(n/3^3) + O(n/3^2) + O(n/3) + O(n)
⇒ T(n) = T(n/3^k) + O(n/3^(k-1)) + … + O(n/3) + O(n)
For simplicity assume
n = 3^k
⇒ k = log_3 n
⇒ T(n) <= T(1) + c·n/3^(k-1) + … + c·n/3^2 + c·n/3 + c·n
⇒ T(n) <= 1 + {c·n/3^(k-1) + … + c·n/3^2 + c·n/3 + c·n}
⇒ T(n) <= 1 + c·n·(1/(1 - 1/3))
⇒ T(n) <= 1 + (3/2)·c·n
⇒ T(n) = O(n)

Substitution Method
Takes two steps:
1. Guess the form of the solution, using unknown constants.
2. Use induction to find the constants & verify the solution.

Completely dependent on making reasonable guesses


Consider the example:
T(n) = 1 n=1
T(n) = 4T(n/2) + n n>1
Guess: T(n) = O(n3).


More specifically:
T(n) <= cn^3, for all large enough n.
Prove by strong induction on n.
Assume: T(k) <= ck^3 for k < n.
Show: T(n) <= cn^3 for n > n0.
Base case,
for n = 1:
T(n) = 1                  (definition)
1 <= c                    (choose c large enough for the conclusion)
Inductive case, n > 1:
T(n) = 4T(n/2) + n        (definition)
     <= 4c(n/2)^3 + n     (induction)
     = (c/2)n^3 + n       (algebra)

While this is O(n^3), we're not done.

We need to show (c/2)n^3 + n <= cn^3.
Fortunately, the constant factor is shrinking, not growing.

T(n) <= (c/2)n^3 + n           (from before)
     = cn^3 - ((c/2)n^3 - n)   (algebra)
     <= cn^3                   (since n > 0, if c >= 2)
Proved:
T(n) <= 2n^3 for n > 0.
Thus, T(n) = O(n^3).

Second Example
T(n) = 1 when n = 1
T(n) = 4T(n/2) + n when n > 1
Guess: T(n) = O(n^2).
Same recurrence, but now try a tighter bound.
More specifically:
T(n) <= cn^2 for n > n0.


Assume T(k) <= ck^2, for k < n.

T(n) = 4T(n/2) + n
     <= 4c(n/2)^2 + n
     = cn^2 + n

Not <= cn^2!
The problem is that the constant isn't shrinking.

Solution: use a tighter guess and inductive hypothesis.

Subtract a lower-order term, a common technique.
Guess:
T(n) <= cn^2 - dn for n > 0.
Assume T(k) <= ck^2 - dk, for k < n. Show T(n) <= cn^2 - dn.
Base case, n = 1:
T(n) = 1                          (definition)
1 <= c - d                        (choosing c, d appropriately)
Inductive case, n > 1:
T(n) = 4T(n/2) + n                (definition)
     <= 4(c(n/2)^2 - d(n/2)) + n  (induction)
     = cn^2 - 2dn + n             (algebra)
     = cn^2 - dn - (dn - n)       (algebra)
     <= cn^2 - dn                 (choosing d >= 1)
T(n) <= 2n^2 - dn for n > 0.
Thus, T(n) = O(n^2).
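The guess with c = 2 and d = 1 can be checked numerically; in fact, for powers of two this recurrence satisfies T(n) = 2n^2 - n exactly, so the subtracted lower-order term is just right (a spot check, not a proof):

```python
def T(n):
    # The recurrence: T(1) = 1, T(n) = 4T(n/2) + n (n a power of two).
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

# With c = 2, d = 1 the guess T(n) <= c*n^2 - d*n holds; for powers of
# two it holds with equality, so the bound cannot be tightened further.
for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n * n - n
```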
Ability to guess effectively comes with experience.

Changing Variables:
Sometimes a little algebraic manipulation can make an unknown recurrence similar to one we have seen.


Consider the example

T(n) = 2T(n^(1/2)) + log n
This looks difficult, so rearrange it:
let m = log n, so n = 2^m.
Thus,
T(2^m) = 2T(2^(m/2)) + m
Again, let
S(m) = T(2^m), so S(m) = 2S(m/2) + m.
We can show that
S(m) = O(m log m)
⇒ T(n) = T(2^m) = S(m) = O(m log m) = O(log n · log log n)

Recursion Tree
Just a simplification of the iteration method.
Consider the recurrence
T(1) = 1 when n = 1
T(n) = T(n/2) + 1 when n > 1

The tree has a single branch, T(n) → T(n/2) → T(n/4) → … → T(n/2^k), and the cost at each level is 1.

For simplicity assume that n = 2^k
⇒ k = log n


Summing the cost at each level,

Total cost = 1 + 1 + 1 + … (up to log n terms)
⇒ complexity = O(log n)

Second Example
T(n) = 1 when n = 1
T(n) = 4T(n/2) + n when n > 1

The root costs n; each node has 4 children of half the size, so level i has 4^i nodes, each costing n/2^i, for a cost at that level of 2^i·n:

level 0: n
level 1: 2n
level 2: 2^2·n
…
level k: 2^k·n

Assume: n = 2^k
⇒ k = log n
T(n) = n + 2n + 4n + … + 2^(k-1)·n + 2^k·n
     = n(1 + 2 + 4 + … + 2^(k-1) + 2^k)
     = n(2^(k+1) - 1)/(2 - 1)
     ≤ n·2^(k+1)
     = 2n·2^k
     = 2n·n
     = O(n^2)


Solve T(n) = T(n/4) + T(n/2) + n^2.

The root costs n^2; its two children cost (n/4)^2 + (n/2)^2 = (5/16)n^2; the next level costs (5/16)^2·n^2, and so on down to the leaves.

Total cost ≤ n^2 + (5/16)n^2 + (5/16)^2·n^2 + (5/16)^3·n^2 + … + (5/16)^k·n^2

(Why ≤ and not =? Because the subtrees have different depths: the T(n/4) branches reach the base case sooner, so the lower levels of the tree are incomplete.)

= n^2 (1 + 5/16 + (5/16)^2 + (5/16)^3 + …)
≤ n^2 · 1/(1 - 5/16)
= (16/11)·n^2
= O(n^2)

Master Method
A cookbook solution for some recurrences of the form
T(n) = a·T(n/b) + f(n)
where
a >= 1, b > 1, and f(n) is asymptotically positive.

The method has three cases, described below.

Master Method Case 1

T(n) = a·T(n/b) + f(n)

If f(n) = O(n^(log_b a - ε)) for some ε > 0, then T(n) = Θ(n^(log_b a)).


T(n) = 7T(n/2) + cn^2, so a = 7, b = 2.

Here f(n) = cn^2, and n^(log_b a) = n^(log_2 7) ≈ n^2.8.
⇒ cn^2 = O(n^(log_b a - ε)), for any ε <= 0.8.
T(n) = Θ(n^(log_2 7)) = Θ(n^2.8)

Master Method Case 2

T(n) = a·T(n/b) + f(n)

If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a)·lg n).

T(n) = 2T(n/2) + cn, so a = 2, b = 2.

Here f(n) = cn, and n^(log_b a) = n.
⇒ f(n) = Θ(n^(log_b a))
T(n) = Θ(n lg n)

Master Method Case 3

T(n) = a·T(n/b) + f(n)

If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and

a·f(n/b) <= c·f(n) for some c < 1 and all large enough n (the regularity condition),
then T(n) = Θ(f(n)).

I.e., is the constant factor shrinking?

T(n) = 4T(n/2) + n^3, so a = 4, b = 2.

Is n^3 = Ω(n^(log_b a + ε)) = Ω(n^(log_2 4 + ε)) = Ω(n^(2 + ε))? Yes, for any ε <= 1.

Again, 4(n/2)^3 = (1/2)n^3 <= cn^3, for any c >= 1/2.

T(n) = Θ(n^3)


Master Method Case 4

T(n) = a·T(n/b) + f(n)

None of the previous cases apply; the master method doesn't help.

T(n) = 4T(n/2) + n^2/lg n, so a = 4, b = 2.

Case 1? Is n^2/lg n = O(n^(log_b a - ε)) = O(n^(log_2 4 - ε)) = O(n^(2 - ε))? For example, with ε = 1, is n^2/lg n = O(n^2/n)?

No, since lg n is asymptotically less than n^ε for every ε > 0;

thus n^2/lg n is asymptotically greater than n^(2 - ε).

Case 2? Is n^2/lg n = Θ(n^(log_b a)) = Θ(n^(log_2 4)) = Θ(n^2)?

No.

Case 3? Is n^2/lg n = Ω(n^(log_b a + ε)) = Ω(n^(log_2 4 + ε)) = Ω(n^(2 + ε))?

No, since n^2/lg n is asymptotically less than n^(2 + ε) for every ε > 0.
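When the driving function is a plain polynomial f(n) = n^d, the three cases reduce to comparing d with log_b a. A small Python sketch (the name master_polynomial is ours; it handles only f(n) = n^d and skips the regularity condition of case 3, which n^d satisfies automatically when d > log_b a):

```python
import math

def master_polynomial(a, b, d):
    # Solve T(n) = a*T(n/b) + n^d by comparing d with log_b a.
    crit = math.log(a, b)                  # critical exponent log_b a
    if math.isclose(d, crit):
        return f"Theta(n^{d:g} log n)"     # case 2: all levels cost the same
    if d < crit:
        return f"Theta(n^{crit:.3g})"      # case 1: the leaves dominate
    return f"Theta(n^{d:g})"               # case 3: the root dominates

print(master_polynomial(7, 2, 2))  # cf. the case 1 example above
print(master_polynomial(2, 2, 1))  # cf. the case 2 example above
print(master_polynomial(4, 2, 3))  # cf. the case 3 example above
```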

Exercises
 Show that the solution of T(n) = 2T(ceil(n/2)) + n is Ω(n log n). Conclude that the solution is Θ(n log n).
 Show that the solution of T(n) = 2T(floor(n/2)) + n is O(n log n).
 Write a recursive Fibonacci-number algorithm, derive a recurrence relation for it, and solve the recurrence by the substitution method. {Guess 2^n}
 Argue that the solution to the recurrence T(n) = T(n/3) + T(2n/3) + n is Ω(n log n) by appealing to a recursion tree.
 Use iteration to solve the recurrence T(n) = T(n-a) + T(a) + n, where a >= 1 is a constant.
 Use the master method to give tight asymptotic bounds for the following recurrences.
 T(n) = 9T(n/3) + n
 T(n) = 3T(n/4) + n log n


 T(n) = 2T(2n/3) + 1
 T(n) = 2T(n/2) + n log n
 T(n) = 2T(n/4) + √n
 T(n) = T(√n) + 1
 The running time of an algorithm A is described by the recurrence T(n) = 7T(n/2) + n^2. A competing algorithm A' has a running time of T'(n) = aT'(n/4) + n^2. What is the largest integer value of a such that A' is asymptotically faster than A?


Chapter 2

Review of Data Structures

Simple Data structures


The basic structure to represent unit value types are bits, integers, floating numbers, etc. The
collection of values of basic types can be represented by arrays, structure, etc. The access of the
values are done in constant time for these kind of data structured

Linear Data Structures


Linear data structures are widely used; we quickly go through the following ones.
Lists
A list is the simplest general-purpose data structure, and lists come in different varieties. The most fundamental representation of a list is an array; the other representation is the linked list. There are also several linked-list variants: singly linked, doubly linked, circular, etc. Some pointer is used to point to the first element, and to traverse there is a mechanism for pointing to the next (and, in a doubly linked list, the previous) element. Lists require linear space to collect and store the elements, proportional to the number of items. For example, to store n items in an array, n·d space is required, where d is the size of a data item. A singly linked list takes n(d + p), where p is the size of a pointer. Similarly, for a doubly linked list the space requirement is n(d + 2p).
Array representation
 Operations require simple implementations.
 Insert, delete, and search, require linear time, search can take O(logn) if binary search is
used. To use the binary search array must be sorted.
 Inefficient use of space
Singly linked representation (unordered)
 Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise
O(n) time.
 Search and traversing can be done in O(n) time
 Memory overhead, but allocated only to entries that are present.


Doubly linked representation


 Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise
O(n) time.
 Search and traversing can be done in O(n) time
 Memory overhead, but allocated only to entries that are present, search becomes easy.
Some Operation with List
 boolean isEmpty ();
Return true if and only if this list is empty.
 int size ();
Return this list’s length.
 int get (int i);
Return the element with index i in this list.
 boolean equals (List a, List b);
Return true if and only if two list have the same length, and each element of the lists are
equal
 void clear ();
Make this list empty.
 void set (int i, int elem);
Replace by elem the element at index i in this list.
 void add (int i, int elem);
Add elem as the element with index i in this list.
 void add (int elem);
Add elem after the last element of this list.
 void addAll (List a, List b);
Add all the elements of list b after the last element of list a.
 int remove (int i);
Remove and return the element with index i in this list.
 void visit (List a);
Prints all elements of the list


Operation Array representation SLL representation


get O(1) O(n)
set O(1) O(n)
add(int,data) O(n) O(n)
add(data) O(1) O(1)
remove O(n) O(n)
equals O(n2) O(n2)
addAll O(n2) O(n2)

Stacks and Queues


These data structures are special cases of lists. A stack is also called a LIFO (Last In First Out) list. In this structure items can be added or removed at only one end. Stacks are generally represented either in an array or in a singly linked list; in both cases insertion/deletion time is O(1), but search time is O(n).
Operations on stacks
 boolean isEmpty ();
Return true if and only if this stack is empty. Complexity is O(1).
 int getLast ();
Return the element at the top of this stack. Complexity is O(1).
 void clear ();
Make this stack empty. Complexity is O(1).
 void push (int elem);
Add elem as the top element of this stack. Complexity is O(1).
 int pop ();
Remove and return the element at the top of this stack. Complexity is O(1).
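A minimal array-backed stack in Python matching the operations above (an illustrative sketch; the method names mirror the operation list, and Python's list gives the stated costs):

```python
class Stack:
    # LIFO stack backed by a Python list. push is amortized O(1);
    # the other operations are O(1), matching the costs in the text.
    def __init__(self):
        self._items = []

    def is_empty(self):
        return len(self._items) == 0

    def get_last(self):
        # Top element, without removing it.
        return self._items[-1]

    def clear(self):
        self._items.clear()

    def push(self, elem):
        self._items.append(elem)

    def pop(self):
        # Remove and return the top element.
        return self._items.pop()
```

For example, pushing 1 then 2 makes 2 the top element, so pop() returns 2 first.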

Queues are also like stacks, but they implement a FIFO (First In First Out) policy. One end is for insertion and the other for deletion. They are mostly represented circularly in an array for O(1) insertion/deletion time. A circular singly linked representation takes O(1) insertion time and O(1) deletion time. Representing queues in a doubly linked list also gives O(1) insertion and deletion time.


Operations on queues
 boolean isEmpty ();
Return true if and only if this queue is empty. Complexity is O(1).
 int size ();
Return this queue’s length. Complexity is O(n).
 int getFirst ();
Return the element at the front of this queue. Complexity is O(1).
 void clear ();
Make this queue empty. Complexity is O(1).
 void insert (int elem);
Add elem as the rear element of this queue. Complexity is O(1).
 int delete ();
Remove and return the front element of this queue. Complexity is O(1).
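A minimal queue sketch using Python's collections.deque, which provides the O(1) insertion at the rear and deletion at the front described above (method names mirror the operation list):

```python
from collections import deque

class Queue:
    # FIFO queue backed by collections.deque: O(1) insert at the rear,
    # O(1) delete at the front.
    def __init__(self):
        self._items = deque()

    def is_empty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)

    def get_first(self):
        return self._items[0]

    def insert(self, elem):
        self._items.append(elem)       # rear

    def delete(self):
        return self._items.popleft()   # front
```

Note that deque tracks its own length, so size() is O(1) here, better than the O(n) of a plain linked list without a counter.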

Tree Data Structures


A tree is a collection of nodes. If the collection is empty the tree is empty; otherwise it contains a distinct node called the root (r) and zero or more sub-trees whose roots are directly connected to the node r by edges. The root of each sub-tree is a child of r, and r is the parent. Any node without a child is called a leaf. We can also describe a tree as a connected graph without a cycle, so there is a path from any one node to any other node in the tree. The main appeal of this data structure is that most of the operations can run in O(log n) time. We can represent a tree as an array or a linked list.
Some of the definitions
 Level h of a full tree has d^(h-1) nodes.
 The first h levels of a full tree have
1 + d + d^2 + … + d^(h-1) = (d^h - 1)/(d - 1) nodes.
Binary Search Trees
In a BST, each parent has at most two children. The key at each vertex must be greater than all the keys held by its left descendants, and smaller than or equal to all the keys held by its right descendants. Searching and insertion both take O(h) worst-case time, where h is the height of the tree,


and the relation between the height and the number of nodes n is given by log n < h + 1 <= n. For example, the height of a binary tree with 16 nodes may be anywhere between 4 and 15. (When is the height 4, and when is it 15?)
So if we are sure that the tree is height balanced, then we can say that search and insertion take O(log n) time; otherwise we have to be content with O(n).

Operation Algorithm Time complexity


Search BST search O(log n) best O(n) worst
Add BST insertion O(log n) best O(n) worst
Remove BST deletion O(log n) best O(n) worst
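A minimal BST insert and search in Python, following the key rule above (left descendants strictly smaller, equal keys to the right); both run in O(h) time (an illustrative sketch, not code from the text):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    # Walk down from the root and attach a new leaf: O(h) time.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        # Keys equal to the parent go to the right subtree,
        # per the ordering rule stated above.
        root.right = bst_insert(root.right, key)
    return root

def bst_search(root, key):
    # One comparison per level: O(h) time.
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None
```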

AVL Trees
A balanced tree named after Adelson-Velskii and Landis. AVL trees are a special case of BSTs in which the heights of the two sub-trees of each node differ by at most 1. Insertion and deletion may make the tree unbalanced, so rebalancing must be done by using a left rotation, a right rotation, or a double rotation.

Operation Algorithm Time complexity


Search AVL search O(log n) best, worst
Add AVL insertion O(log n) best, worst
Remove AVL deletion O(log n) best, worst

Priority Queues
A priority queue is a queue in which the elements are prioritized. The least element in the priority queue is always removed first. Priority queues are used in many computing applications; for example, many operating systems use a scheduling algorithm where the next process executed is the one with the shortest execution time or the highest priority. Priority queues can be implemented by using arrays, linked lists, or a special kind of tree (i.e. a heap).
 boolean isEmpty ();
Return true if and only if this priority queue is empty.


 int size ();


Return the length of this priority queue.
 int getLeast ();
Return the least element of this priority queue. If there are several least elements, return
any of them.
 void clear ();
Make this priority queue empty.
 void add (int elem);
Add elem to this priority queue.
 int delete ();
Remove and return the least element from this priority queue. (If there are several least elements, remove the same element that would be returned by getLeast.)
Operation Sorted SLL Unsorted SLL Sorted Array Unsorted Array
add O(n) O(1) O(n) O(1)
removeLeast O(1) O(n) O(1) O(n)
getLeast O(1) O(n) O(1) O(n)

Heap
A heap is a complete tree with an ordering relation R holding between each node and its descendants. Note that "complete" here means the tree can miss only the rightmost part of the bottom level. R can be smaller-than or bigger-than.
E.g. a heap with degree 2 where R is "bigger than":

(Figure: two trees rooted at 8. The first satisfies both the ordering relation and completeness and is a heap; the second is missing nodes above the bottom level, so it is not a heap.)


Heap Sort: build a heap from the given set (O(n) time), then repeatedly remove the elements from the heap (O(n log n) time in total).
Implementation
Heaps are implemented by using arrays. Insertion and deletion of an element takes O(log n) time.
More on this later
Operation Algorithm Time complexity
add insertion O(log n)
delete deletion O(log n)
getLeast access root element O(1)
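Python's heapq module is a ready-made array-backed binary min-heap with exactly these costs (a brief illustration):

```python
import heapq

# heapq maintains a binary min-heap inside a plain list, matching the
# table above: add and delete are O(log n), getLeast is O(1).
pq = []
for x in [5, 1, 8, 3]:
    heapq.heappush(pq, x)    # add: O(log n) per element

least = pq[0]                # getLeast: O(1) peek at the root
removed = heapq.heappop(pq)  # delete: O(log n), returns the least element
```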

Priority Queues
Priority queue is a queue in which the elements are prioritized. The least element in the priority
queue is always removed first. Priority queues are used in many computing applications. For
example, many operating systems used a scheduling algorithm where the next process executed
is the one with the shortest execution time or the highest priority. Priority queues can be
implemented by using arrays, linked list or special kind of tree (I.e. heap).

 boolean isEmpty ();


Return true if and only if this priority queue is empty.

 int size ();


Return the length of this priority queue.

 int getLeast ();


Return the least element of this priority queue. If there are several least elements, return
any of them.

 void clear ();


Make this priority queue empty.

 void add (int elem);


Add elem to this priority queue.

Prepared By: Arjun Singh Saud, Faculty CDCSIT, TU

Downloaded from CSIT Tutor


Design and Analysis of algorithm B.Sc. CSIT

 int delete();
Remove and return the least element from this priority queue. (If there are several least
elements, remove the same element that would be returned by getLeast.

Operation Sorted SLL Unsorted SLL Sorted Array Unsorted Array


add O(n) O(1) O(n) O(1)
removeLeast O(1) O(n) O(1) O(n)
getLeast O(1) O(n) O(1) O(n)



Hash Table
A hash table or hash map is a data structure that uses a hash function to map identifying values,
known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).
The hash function is used to transform the key into the index of an array element where the
corresponding value is to be sought. Ideally, the hash function should map each possible key to a
unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed;
i.e. new entries are never added to the table after it is created). Instead, most hash table designs
assume that hash collisions—different keys that map to the same hash value—will occur and
must be accommodated in some way.
In a well-dimensioned hash table, the average cost of each lookup is independent of the number of
elements stored in the table. Many hash table designs also allow arbitrary insertions and
deletions of key-value pairs, at constant average cost (i.e. O(1)). In many situations, hash tables turn
out to be more efficient than search trees or any other table lookup structure. For this reason,
they are widely used in many kinds of computer software, particularly for associative
arrays, database indexing, caches, and sets.
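As a concrete illustration of the collision handling described above, here is a minimal hash table with separate chaining; the class and method names are illustrative sketches, not a fixed API:

```python
# A tiny hash table with separate chaining: keys are hashed to a bucket
# index, and colliding keys share a bucket (a Python list).
class ChainedHashTable:
    def __init__(self, nbuckets=8):
        self._buckets = [[] for _ in range(nbuckets)]

    def _index(self, key):
        return hash(key) % len(self._buckets)    # hash function -> array index

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                         # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))              # collision or new key: chain it

    def get(self, key):
        for k, v in self._buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)
```

With a tiny bucket count, collisions are guaranteed, yet lookups still succeed because each bucket is scanned linearly.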


Chapter 3
Divide and Conquer Algorithms
Sorting
Sorting is among the most basic problems in algorithm design. We are given a sequence of items,
each associated with a given key value. The problem is to permute the items so that they are in
increasing (or decreasing) order by key. Sorting is important because it is often the first step in
more complex algorithms. Sorting algorithms are usually divided into two classes, internal
sorting algorithms, which assume that data is stored in an array in main memory, and external
sorting algorithms, which assume that data is stored on disk or some other device that is best
accessed sequentially. We will only consider internal sorting. Sorting algorithms often have
additional properties that are of interest, depending on the application. Here are two important
properties.
In-place: The algorithm uses no additional array storage, and hence (other than perhaps the
system’s recursion stack) it is possible to sort very large lists without the need to allocate
additional working storage.
Stable: A sorting algorithm is stable if two elements that are equal remain in the same relative
position after sorting is completed. This is of interest, since in some sorting applications you sort
first on one key and then on another. It is nice to know that two items that are equal on the
second key, remain sorted on the first key.
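The two-key scenario can be demonstrated with Python's sorted(), which is a stable sort: sorting first on the second key and then on the first key leaves ties on the first key still ordered by the second key.

```python
# Stability demo: sort on the second key, then (stably) on the first key.
records = [("b", 2), ("a", 3), ("b", 1), ("a", 1)]
by_second = sorted(records, key=lambda r: r[1])
# Stable sort on the first key preserves the second-key order within ties.
by_both = sorted(by_second, key=lambda r: r[0])
```

After the second pass, the records are ordered by first key, and equal first keys remain ordered by second key.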

Bubble Sort
The bubble sort algorithm compares adjacent elements; if the first is greater than the second, it
swaps them. This is done for every pair of adjacent elements, starting from the first two elements
up to the last two. At the end of pass 1 the greatest element has taken its proper place. The whole
process is repeated, excluding the last element each time, so that at each pass the comparisons become fewer.


Algorithm
BubbleSort(A, n)
{
for(i = 0; i <n-1; i++)
{
for(j = 0; j < n-i-1; j++)
{
if(A[j] > A[j+1])
{
temp = A[j];
A[j] = A[j+1];
A[j+1] = temp;
}
}
}
}

Time Complexity:
Inner loop executes for (n-1) times when i=0, (n-2) times when i=1 and so on:
Time complexity = (n-1) + (n-2) + (n-3) + …………………………. +2 +1
= O(n2)
There is no best-case linear time complexity for this algorithm.
Space Complexity:
Since no extra space besides 3 variables is needed for sorting,
Space complexity = O(n) in total, with only O(1) auxiliary space.

Selection Sort
Idea: Find the least (or greatest) value in the array, swap it into the leftmost (or rightmost)
component (where it belongs), and then forget that component. Do this repeatedly.


Algorithm:
SelectionSort(A)
{
    for(i = 0; i < n-1; i++)
    {
        least = A[i]; p = i;
        for(j = i + 1; j < n; j++)
        {
            if(A[j] < least)
            {
                least = A[j]; p = j;
            }
        }
        swap(A[i], A[p]);
    }
}
Time Complexity:
Inner loop executes for (n-1) times when i=0, (n-2) times when i=1 and so on:
Time complexity = (n-1) + (n-2) + (n-3) + …………………………. +2 +1
= O(n2)
There is no best-case linear time complexity for this algorithm, but the number of swap
operations is greatly reduced.
Space Complexity:
Since no extra space besides 5 variables is needed for sorting,
Space complexity = O(n) in total, with only O(1) auxiliary space.

Insertion Sort
Idea: like sorting a hand of playing cards. Start with an empty left hand and the cards facing down
on the table. Remove one card at a time from the table and insert it into the correct position in
the left hand, comparing it with each of the cards already in the hand, from right to left. The cards
held in the left hand are always sorted.


Algorithm:
InsertionSort(A)
{
    for(i = 1; i < n; i++)
    {
        key = A[i];
        for(j = i - 1; j >= 0 && A[j] > key; j--)
        {
            A[j + 1] = A[j];
        }
        A[j + 1] = key;
    }
}
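A runnable Python translation of the insertion sort idea above (the variable names mirror the pseudocode):

```python
# Insertion sort: grow a sorted prefix, inserting each new key into place.
def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                 # drop the key into its place
    return a
```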

Time Complexity:
Worst Case Analysis:
Array elements are in reverse sorted order.
Inner loop executes 1 time when i=1, 2 times when i=2, ... and n-1 times when i=n-1:
Time complexity = 1 + 2 + 3 + …………………………. + (n-2) + (n-1)
= O(n2)

Best Case Analysis:
Array elements are already sorted.
The inner loop test executes only once for each i:
Time complexity = 1 + 1 + 1 + …………………………. + 1 + 1
= O(n)


Space Complexity:
Since no extra space besides 5 variables is needed for sorting,
Space complexity = O(n) in total, with only O(1) auxiliary space.

Merge Sort

To sort an array A[l . . r]:


• Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2 elements
each
• Conquer
– Sort the subsequences recursively using merge sort. When the size of the
sequences is 1 there is nothing more to do
• Combine
– Merge the two sorted subsequences

[Figure: the divide step recursively splits the array into halves down to single elements.]


Algorithm:

MergeSort(A, l, r)
{
if(l < r)
{ //Check for base case
m = (l + r)/2 //Divide
MergeSort(A, l, m) //Conquer
MergeSort(A, m + 1, r) //Conquer
Merge(A, l, m+1, r) //Combine
}
}

Merge(A, l, m, r) //left half is A[l..m-1], right half is A[m..r]; B is an auxiliary array
{
    x = l; y = m; k = l;
    while(x < m && y <= r)
    {
        if(A[x] < A[y])
        {
            B[k] = A[x];
            k++; x++;
        }
        else
        {
            B[k] = A[y];
            k++; y++;
        }
    }
    while(x < m)
    {
        B[k] = A[x];
        k++; x++;
    }
    while(y <= r)
    {
        B[k] = A[y];
        k++; y++;
    }
    for(i = l; i <= r; i++)
    {
        A[i] = B[i];
    }
}

Time Complexity:
Recurrence Relation for Merge sort:
T(n) = 1 if n=1
T(n) = 2 T(n/2) + O(n) if n>1
Solving this recurrence we get
Time Complexity = O(nlogn)

Space Complexity:
It uses one extra array and some extra variables during sorting, therefore
Space Complexity= 2n + c = O(n)
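A runnable Python version of the MergeSort/Merge pair, with a local work list playing the role of the auxiliary array B:

```python
# Merge sort: divide, conquer the halves recursively, then merge.
def merge_sort(a, l=0, r=None):
    if r is None:
        r = len(a) - 1
    if l < r:
        m = (l + r) // 2
        merge_sort(a, l, m)        # conquer left half
        merge_sort(a, m + 1, r)    # conquer right half
        merge(a, l, m + 1, r)      # combine
    return a

def merge(a, l, m, r):
    # left half is a[l..m-1], right half is a[m..r]
    b = [0] * (r + 1)              # work array (only slots l..r are used)
    x, y, k = l, m, l
    while x < m and y <= r:
        if a[x] < a[y]:
            b[k] = a[x]; x += 1
        else:
            b[k] = a[y]; y += 1
        k += 1
    while x < m:
        b[k] = a[x]; x += 1; k += 1
    while y <= r:
        b[k] = a[y]; y += 1; k += 1
    a[l:r + 1] = b[l:r + 1]        # copy the merged run back
```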


Quick Sort
• Divide
Partition the array A[l..r] into 2 subarrays A[l..p-1] and A[p+1..r] around a pivot element
placed at index p, such that each element of A[l..p-1] is smaller than or equal to A[p], which
in turn is smaller than or equal to each element of A[p+1..r]. The partition step needs to
find this index p.
• Conquer
Recursively sort A[l..p-1] and A[p+1..r] using quicksort.
• Combine
Trivial: the arrays are sorted in place. No additional work is required to combine them.

Example trace of partitioning (pivot p = 5):
5 3 2 6 4 1 3 7    x stops at 6, y stops at the second 3
5 3 2 6 4 1 3 7    swap the elements at x and y
5 3 2 3 4 1 6 7    x and y have crossed: swap the pivot with the element at y
1 3 2 3 4 5 6 7    the pivot is now at its final position p
Algorithm:
QuickSort(A,l,r)
{
if(l<r)
{
p = Partition(A,l,r);
QuickSort(A,l,p-1);
QuickSort(A,p+1,r);
}
}


Partition(A,l,r)
{
x =l; y =r ; p = A[l];
while(x<y)
{
do {
x++;
}while(A[x] <= p);
do {
y--;
} while(A[y] >=p);
if(x<y)
swap(A[x],A[y]);
}
A[l] = A[y]; A[y] = p; return y; //return position of pivot
}
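A runnable Python sketch of quicksort with a pivot-at-left partition in the spirit of the pseudocode above; the scan comparisons are made strict and bounds-checked here so the indices cannot run out of range (a detail the pseudocode leaves implicit):

```python
# Hoare-style partition with the pivot taken from the left end.
def partition(a, l, r):
    p = a[l]                          # pivot is the leftmost element
    x, y = l, r + 1
    while True:
        x += 1
        while x <= r and a[x] < p:    # scan right for an element >= pivot
            x += 1
        y -= 1
        while a[y] > p:               # scan left for an element <= pivot
            y -= 1
        if x >= y:
            break
        a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], a[l]           # put the pivot between the partitions
    return y

def quick_sort(a, l=0, r=None):
    if r is None:
        r = len(a) - 1
    if l < r:
        p = partition(a, l, r)
        quick_sort(a, l, p - 1)
        quick_sort(a, p + 1, r)
    return a
```

On the example array from the trace above, the first partition call produces exactly the final state shown there.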
Time Complexity:
We can notice that the complexity of partitioning is O(n), because the outer while loop executes
at most cn times.
Thus the recurrence relation for quick sort is:
T(n) = T(k) + T(n-k-1) + O(n)
Best Case:
Partitioning divides the array into two partitions of equal size, therefore
T(n) = 2T(n/2) + O(n). Solving this recurrence we get,
 Time Complexity = O(nlogn)

Worst Case:


When the array is already sorted or sorted in reverse order, one partition contains n-1 items
and the other contains zero items, therefore
T(n) = T(n-1) + O(n). Solving this recurrence we get
 Time Complexity = O(n2)

Case between worst and best:

9-to-1 partition split:
T(n) = T(9n/10) + T(n/10) + O(n). Solving this recurrence we get
Time Complexity = O(nlogn)


Average case:
All permutations of the input numbers are equally likely. On a random input array, we
will have a mix of well balanced and unbalanced splits. Good and bad splits are randomly
distributed across the tree.
Suppose the splits alternate: balanced, unbalanced, balanced, ….
B(n) = 2UB(n/2) + Θ(n)    Balanced
UB(n) = B(n – 1) + Θ(n)    Unbalanced
Solving:
B(n) = 2(B(n/2 – 1) + Θ(n/2)) + Θ(n)
= 2B(n/2 – 1) + Θ(n)
= Θ(nlogn)

Randomized Quick Sort:


An algorithm is called randomized if its behavior depends on the input as well as on random
values generated by a random-number generator. The beauty of a randomized algorithm is that no
particular input can produce worst-case behavior of the algorithm. IDEA: Partition around a
random element. The running time is then independent of the input order; no assumptions need to
be made about the input distribution, and no specific input elicits the worst-case behavior. The
worst case is determined only by the output of the random-number generator. Randomization
cannot eliminate the worst case, but it can make it less likely!
Algorithm:
RandQuickSort(A,l,r)
{
if(l<r)
{
m = RandPartition(A,l,r);
RandQuickSort(A,l,m-1);
RandQuickSort(A,m+1,r);
}
}


RandPartition(A,l,r)
{
k = random(l,r); //generates a random number between l and r including both.
swap(A[l],A[k]);
return Partition(A,l,r);
}

Time Complexity:
Worst Case:
T(n) = worst-case running time
T(n) = max_{1≤q≤n-1} (T(q) + T(n-q)) + Θ(n)
Use the substitution method to show that the running time of quicksort is O(n^2).
Guess T(n) = O(n^2)
– Induction goal: T(n) ≤ cn^2
– Induction hypothesis: T(k) ≤ ck^2 for any k < n
Proof of induction goal:
T(n) ≤ max_{1≤q≤n-1} (cq^2 + c(n-q)^2) + Θ(n)
= c · max_{1≤q≤n-1} (q^2 + (n-q)^2) + Θ(n)

The expression q^2 + (n-q)^2 achieves a maximum over the range 1 ≤ q ≤ n-1 at one of the
endpoints:
max_{1≤q≤n-1} (q^2 + (n-q)^2) = 1^2 + (n-1)^2 = n^2 – 2(n – 1)
T(n) ≤ cn^2 – 2c(n – 1) + Θ(n)
≤ cn^2   {choose c large enough that 2c(n – 1) dominates Θ(n)}
Average Case:
To analyze average case, assume that all the input elements are distinct for simplicity. If
we are to take care of duplicate elements also the complexity bound is same but it needs
more intricate analysis. Consider the probability of choosing pivot from n elements is
equally likely i.e. 1/n.
Now we give recurrence relation for the algorithm as


n 1
T(n) = 1/n  (T (k )  T (n  k ))  O(n)
k 1

For some k = 1,2, …, n-1, T(k) and T(n-k) is repeated two times

n 1
T(n) = 2/n  T (k )  O(n)
k 1

n 1
nT(n) = 2  T (k ) + O(n2)
k 1

Similarly
n2
(n-1)T(n-1) = 2  T (k ) + O(n-1)2
k 1

nT(n) - (n-1)T(n-1) = 2T(n-1) + 2n -1

nT(n) – (n+1) T(n-1) = 2n-1

T(n)/(n+1) = T(n-1)/n +(2n +1)/n(n-1)


Let An = T(n) /(n+1)

 An = An-1 + (2n+1)/n(n-1)
n
 An =  2i  1 / i(i  1)
i 1

n
 An ≈  2i / i(i  1)
i 1

n
 An ≈ 2 1 /(i  1)
i 1

 An ≈ 2logn

Since An = T(n) /(n+1)


T(n) = nlogn


Heap Sort
A heap is a nearly complete binary tree with the following two properties:
– Structural property: all levels are full, except possibly the last one, which is
filled from left to right
– Order (heap) property: for any node x, Parent(x) ≥ x

Array Representation of Heaps


A heap can be stored as an array A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[ i/2 ]
– Heapsize[A] ≤ length[A]
The elements in the subarray A[(n/2+1) .. n] are leaves


Max-heaps (largest element at root), have the max-heap property:


– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
Min-heaps (smallest element at root), have the min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]

Adding/Deleting Nodes
New nodes are always inserted at the bottom level (left to right) and nodes are removed from the
bottom level (right to left).

Operations on Heaps

 Maintain/Restore the max-heap property


– MAX-HEAPIFY
 Create a max-heap from an unordered array
– BUILD-MAX-HEAP
 Sort an array in place
– HEAPSORT
 Priority queues

Maintaining the Heap Property

Suppose a node is smaller than a child and Left and Right subtrees of i are max-heaps. To
eliminate the violation:
– Exchange with larger child


– Move down the tree


– Continue until node is not smaller than children

Algorithm:
Max-Heapify(A, i, n)
{
    l = Left(i)
    r = Right(i)
    largest = i
    if(l ≤ n and A[l] > A[largest])
        largest = l
    if(r ≤ n and A[r] > A[largest])
        largest = r
    if(largest ≠ i)
    {
        exchange(A[i], A[largest])
        Max-Heapify(A, largest, n)
    }
}


Analysis:
In the worst case Max-Heapify is called recursively h times, where h is height of the heap and
since each call to the heapify takes constant time
Time complexity = O(h) = O(logn)

Building a Heap
Convert an array A[1 … n] into a max-heap (n = length[A]). The elements in the subarray
A[(n/2+1) .. n] are leaves. Apply MAX-HEAPIFY on elements between 1 and n/2.


Algorithm:
Build-Max-Heap(A)
n = length[A]
for i ← n/2 downto 1
do MAX-HEAPIFY(A, i, n)

Time Complexity:
Running time: the loop executes O(n) times and the complexity of Heapify is O(logn), therefore
the complexity of Build-Max-Heap is O(nlogn).
This is not an asymptotically tight upper bound. Heapify takes O(h), and the cost of Heapify on
a node i is proportional to the height of the node i in the tree. Let hi = h – i be the height of
the subtrees rooted at level i and ni = 2^i be the number of nodes at level i. Then

T(n) = Σ_{i=0}^{h} ni hi

T(n) = Σ_{i=0}^{h} 2^i (h – i)

T(n) = Σ_{i=0}^{h} 2^h (h – i) / 2^(h-i)

Let k = h – i:

T(n) = 2^h Σ_{k=0}^{h} k/2^k

 T(n) ≤ n Σ_{k=0}^{∞} k/2^k

We know that Σ_{k=0}^{∞} x^k = 1/(1-x) for x < 1.

Differentiating both sides we get,

Σ_{k=0}^{∞} k x^(k-1) = 1/(1-x)^2

Σ_{k=0}^{∞} k x^k = x/(1-x)^2

Put x = 1/2:

Σ_{k=0}^{∞} k/2^k = (1/2)/(1/2)^2 = 2

 T(n) ≤ 2n = O(n)

Heapsort
– Build a max-heap from the array
– Swap the root (the maximum element) with the last element in the array
– “Discard” this last node by decreasing the heap size
– Call Max-Heapify on the new root
– Repeat this process until only one node remains

[Figure: trace of Heapsort on a five-element heap — after each swap of the root with the last unsorted element, Heapify(A,1) restores the max-heap property on the remaining elements.]


Algorithm:
HeapSort(A)
{
BuildHeap(A); //into max heap
n = length[A];
for(i = n ; i >= 2; i--)
{
swap(A[1],A[n]);
n = n-1;
Heapify(A,1);
}
}
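A runnable Python sketch of the whole pipeline above (0-indexed here, so the children of node i are 2i+1 and 2i+2 rather than 2i and 2i+1):

```python
# Heapsort: build a max-heap, then repeatedly move the root to the end.
def max_heapify(a, i, n):
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < n and a[l] > a[largest]:
        largest = l
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)     # continue sifting down

def heap_sort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build a max-heap bottom-up
        max_heapify(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]       # move current maximum to the end
        max_heapify(a, 0, end)            # restore heap on the shrunken prefix
    return a
```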

Lower Bound for Sorting


All the sorting algorithms we have seen so far (e.g., insertion sort, merge sort, quicksort,
heapsort) are comparison sorts: they use only comparisons to determine the relative order of
elements. The best worst-case running time that we've seen for comparison sorting is O(nlgn).
Sort 〈a1, a2, …, an〉
[Figure: decision tree for sorting three elements — the root compares a1:a2, each internal node compares some ai:aj, and the leaves are the 3! = 6 possible permutations 〈1,2,3〉, 〈2,1,3〉, 〈1,3,2〉, 〈3,1,2〉, 〈2,3,1〉, 〈3,2,1〉.]

Each internal node is labeled i:j for i, j ∈{1, 2,…, n}. The left subtree shows subsequent
comparisons if ai≤aj. The right subtree shows subsequent comparisons if ai≥aj. A decision tree

can model the execution of any comparison sort: the tree contains the comparisons along all
possible instruction traces. The running time of the algorithm is the length of the path taken.
The worst-case running time is the height of the tree.

The tree must contain ≥ n! leaves, since there are n! possible permutations. A height-h binary tree
has ≤ 2^h leaves.
Thus, n! ≤ 2^h
∴ h ≥ lg(n!)
Since lg(n!) = Ω(nlg n),
 h = Ω(nlg n).
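The final estimate lg(n!) = Ω(n lg n) follows from a standard bound not worked out in the notes: keep only the larger half of the factors of n!.

```latex
\lg(n!) \;=\; \sum_{k=1}^{n}\lg k
\;\ge\; \sum_{k=\lceil n/2\rceil}^{n}\lg k
\;\ge\; \frac{n}{2}\,\lg\frac{n}{2}
\;=\; \Omega(n\lg n)
```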

Searching
Sequential Search
Simply search for the given element left to right and return the index of the element, if found.
Otherwise return “Not Found”.

Algorithm:
LinearSearch(A, n,key)
{
for(i=0;i<n;i++)
{
if(A[i] == key)
{
return i;
}
}

return -1;//-1 indicates unsuccessful search


}

Analysis:
Time complexity = O(n)

Binary Search:

To find a key in a large file containing keys z[0], z[1], …, z[n-1] in sorted order, we first compare
key with z[n/2], and depending on the result we recurse either on the first half of the file,
z[0], …, z[n/2 - 1], or on the second half, z[n/2], …, z[n-1].


Algorithm
BinarySearch(A, l, r, key)
{
    if(l > r)
        return 0; //0 indicates unsuccessful search
    m = (l + r)/2; //integer division
    if(key == A[m])
        return m+1; //position counted from 1
    else if(key < A[m])
        return BinarySearch(A, l, m-1, key);
    else
        return BinarySearch(A, m+1, r, key);
}

Analysis:
From the above algorithm we can say that the running time of the algorithm is:
T(n) = T(n/2) + Θ(1)
= O(logn) .
In the best case output is obtained at one run i.e. O(1) time if the key is at middle. In the worst
case the output is at the end of the array so running time is O(logn) time. In the average case also
running time is O(logn). For unsuccessful search best, worst and average time complexity is
O(logn).
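A runnable Python version of the recursive search; unlike the pseudocode's 1-based convention, this sketch returns the 0-based index, or -1 on an unsuccessful search:

```python
# Recursive binary search on a sorted list a[l..r].
def binary_search(a, l, r, key):
    if l > r:
        return -1                      # empty range: unsuccessful search
    m = (l + r) // 2
    if key == a[m]:
        return m
    elif key < a[m]:
        return binary_search(a, l, m - 1, key)   # recurse on the left half
    else:
        return binary_search(a, m + 1, r, key)   # recurse on the right half
```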


Max and Min Finding


Here our problem is to find the minimum and maximum items in a set of n elements. We will
see two methods: the first one is an iterative version and the next one uses the divide and
conquer strategy to solve the problem.
Iterative Algorithm:
MinMax(A,n)
{
max = min = A[0];
for(i = 1; i < n; i++)
{
if(A[i] > max)
max = A[i];
if(A[i] < min)
min = A[i];
}
}

The above algorithm requires 2(n-1) comparisons in the worst, best, and average cases. The
comparison A[i] < min is needed only when A[i] > max is not true. If we replace the content
inside the for loop by

if(A[i] > max)


max = A[i];
else if(A[i] < min)
min = A[i];

Then the best case occurs when the elements are in increasing order, with (n-1) comparisons, and
the worst case occurs when the elements are in decreasing order, with 2(n-1) comparisons. For
the average case, A[i] > max holds about half of the time, so the number of comparisons is 3n/2 – 1.
We can clearly conclude that the time complexity is O(n).


Divide and Conquer Algorithm:


The main idea behind the algorithm is: if the number of elements is 1 or 2 then max and min are
obtained trivially. Otherwise, split the problem into two approximately equal parts and solve recursively.

MinMax(l,r)
{
if(l == r)
max = min = A[l];
else if(l == r-1)
{
if(A[l] < A[r])
{
max = A[r]; min = A[l];
}
else
{
max = A[l]; min = A[r];
}
}
else
{
//Divide the problems
mid = (l + r)/2; //integer division
//solve the subproblems
{min,max}=MinMax(l,mid);
{min1,max1}= MinMax(mid +1,r);
//Combine the solutions
if(max1 > max) max = max1;
if(min1 < min) min = min1;
}
}


Analysis:
We can give the following recurrence relation for the MinMax algorithm in terms of the number
of comparisons:
T(n) = 2T(n/2) + 1, if n > 2
T(n) = 1, if n ≤ 2
Solving the recurrence using the master method (case 1), the complexity is O(n).
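A runnable Python sketch of the divide-and-conquer MinMax, returning the pair (min, max) instead of assigning to shared variables as the pseudocode does:

```python
# Divide-and-conquer min/max: trivial for 1 or 2 elements, else split.
def min_max(a, l, r):
    if l == r:                      # one element
        return a[l], a[l]
    if l == r - 1:                  # two elements: one comparison
        if a[l] < a[r]:
            return a[l], a[r]
        return a[r], a[l]
    mid = (l + r) // 2
    lo1, hi1 = min_max(a, l, mid)          # solve left subproblem
    lo2, hi2 = min_max(a, mid + 1, r)      # solve right subproblem
    return min(lo1, lo2), max(hi1, hi2)    # combine the two half-solutions
```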

Selection
The ith order statistic of a set of elements is the ith smallest (or largest) element. In what follows,
the ith order statistic gives the ith smallest. Then the minimum is the first order statistic and the
maximum is the last order statistic. Similarly, a median is given by the ith order statistic where
i = (n+1)/2 for odd n, and i = n/2 or n/2 + 1 for even n. This kind of problem is commonly
called the selection problem.

This problem can be solved in Θ(nlogn) time in a very straightforward way: first sort the elements
in Θ(nlogn) time and then pick up the ith item from the array in constant time. What about a
linear time algorithm for this problem? The next sections answer this.

Nonlinear general selection algorithm


We can construct a simple, but inefficient general algorithm for finding the kth smallest or kth
largest item in a list, requiring O(kn) time, which is effective when k is small. To accomplish
this, we simply find the most extreme value and move it to the beginning until we reach our
desired index.
Select(A, k) //k is a 0-based index of the desired order statistic
{
    for(i = 0; i <= k; i++)
    {
        minindex = i;
        minvalue = A[i];
        for(j = i+1; j < n; j++)
        {
            if(A[j] < minvalue)
            {
                minindex = j;
                minvalue = A[j];
            }
        }
        swap(A[i], A[minindex]);
    }
    return A[k];
}
Analysis:
When i=0, the inner loop executes n-1 times
When i=1, the inner loop executes n-2 times
When i=2, the inner loop executes n-3 times
…………………………………………...
When i=k, the inner loop executes n-k-1 times
Thus, Time Complexity = (n-1) + (n-2) + ……………….. + (n-k-1)
= O(kn), which is O(n2) when k = Θ(n)

Selection in expected linear time


This problem is solved by using the “divide and conquer” method. The main idea for this
problem solving is to partition the element set as in Quick Sort where partition is randomized
one.
Algorithm:
RandSelect(A,l,r,i)
{
    if(l == r)
        return A[l];
    p = RandPartition(A,l,r);
    k = p - l + 1; //rank of the pivot within A[l..r]
    if(i == k)
        return A[p];
    else if(i < k)
        return RandSelect(A,l,p-1,i);
    else
        return RandSelect(A,p+1,r,i-k);
}
Analysis:
Since our algorithm is randomized, no particular input is responsible for the worst case; however,
the worst-case running time of this algorithm is O(n2). This happens if, every time, the pivot
chosen is unfortunately the largest one (if we are finding the minimum element).
Assume that the probability of selecting each element as the pivot is equal, i.e. 1/n. Then we have
the recurrence relation,
n 1
T(n) = 1/n(  T(max(j,n – j))) + O(n)
j 1

Where, max(j,n-j) = j, if j>= ceil(n / 2)


and max(j,n-j) = n-j, otherwise.
Observe that every T(j) or T(n – j) will repeat twice for both odd and even value of n (one
may not be repeated) one time form 1 to ceil(n / 2) and second time for ceil(n / 2) to n-1, so we
can write,
n 1
T(n) = 2/n( 
j n / 2
T(j)) + O(n)

Using substitution method,


Guess T(n) = O(n)
To show T(n) <= cn
Assume T(j) <= cj
Substituting on the relation
n 1
T(n) = 2/n 
j n / 2
cj + O(n)

n 1 n / 2 1
T(n) = 2/n {  cj -  cj }+ O(n)
j 1 j 1

T(n) = 2/n { (n(n-1))/2 –( (n/2-1)n/2)/2} + O(n)


T(n) ≤ c(n-1) – c(n/2-1)/2 + O( n)
T(n) ≤ cn – c – cn/4 + c/2 +O( n)
= cn – cn/4 –c/2 + O(n)
≤ cn {choose the value of c such that (-cn/4-c/2 +O(n) ≤ 0 }

Prepared By: Arjun Singh Saud, Faculty CDCISIT, TU

Downloaded from CSIT Tutor


Design And Analysis of Algorithms B.SC. CSIT

 T(n) = O(n)
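A runnable Python sketch of RandSelect (i is 1-based here, so i = 1 returns the minimum); the partition routine below is a Hoare-style scheme chosen for this sketch, not dictated by the notes:

```python
import random

# Pivot-at-left partition; returns the pivot's final index.
def partition(a, l, r):
    p = a[l]
    x, y = l, r + 1
    while True:
        x += 1
        while x <= r and a[x] < p:
            x += 1
        y -= 1
        while a[y] > p:
            y -= 1
        if x >= y:
            break
        a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], a[l]
    return y

# Randomized selection of the i-th smallest element of a[l..r].
def rand_select(a, l, r, i):
    if l == r:
        return a[l]
    k = random.randint(l, r)          # random pivot, as in RandPartition
    a[l], a[k] = a[k], a[l]
    p = partition(a, l, r)
    k = p - l + 1                     # rank of the pivot within a[l..r]
    if i == k:
        return a[p]
    elif i < k:
        return rand_select(a, l, p - 1, i)
    else:
        return rand_select(a, p + 1, r, i - k)
```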

Selection in worst case linear time

Divide the n elements into groups of 5. Find the median of each 5-element group. Recursively
SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.

Algorithm
Divide n elements into groups of 5
Find median of each group
Use Select() recursively to find median x of the n/5 medians
Partition the n elements around x. Let k = rank(x ) //index of x
if (i == k) then return x
if (i < k) then use Select() recursively to find ith smallest element in first partition
else (i > k) use Select() recursively to find (i-k)th smallest element in last partition

At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Each such median has at least two more elements of its group ≤ it, so at least 3⌊n/10⌋ elements
are ≤ x. Similarly, at least 3⌊n/10⌋ elements are ≥ x.


For n ≥ 50, we have 3⌊n/10⌋ ≥ n/4.
Therefore, for n ≥ 50 the recursive call to SELECT on the larger side of the partition is executed
on ≤ 3n/4 elements in the worst case. Thus, the recurrence for the running time can assume that
this step takes time T(3n/4) in the worst case.
Now, we can write the recurrence relation for the above algorithm as:
T(n) = T(n/5) + T(3n/4) + Θ(n)
Guess T(n) = O(n)
To Show T(n) ≤ cn

Assume that our guess is true for all k<n


Now,
T(n) ≤ cn/5 + 3cn/4 + O(n)
= 19cn/20 + O(n)
= cn - cn/20 + O(n)
≤ cn   {choose the value of c such that O(n) - cn/20 ≤ 0}
 T(n) = O(n)
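A compact Python sketch of the worst-case linear SELECT; it partitions with list comprehensions rather than in place, and handles duplicate pivot values explicitly (a detail the notes do not spell out). Here i is 1-based, so i = 1 is the minimum:

```python
# Median-of-medians selection of the i-th smallest element.
def select(items, i):
    if len(items) <= 5:                 # small base case: just sort
        return sorted(items)[i - 1]
    # median of each group of 5
    medians = []
    for j in range(0, len(items), 5):
        group = sorted(items[j:j + 5])
        medians.append(group[len(group) // 2])
    # pivot: recursively selected median of the medians
    x = select(medians, (len(medians) + 1) // 2)
    lows = [v for v in items if v < x]
    highs = [v for v in items if v > x]
    k = len(lows)
    eq = len(items) - k - len(highs)    # copies of the pivot value
    if i <= k:
        return select(lows, i)          # answer is in the low side
    elif i <= k + eq:
        return x                        # answer is the pivot itself
    else:
        return select(highs, i - k - eq)
```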


Chapter 4
Dynamic Programming
Dynamic Programming: technique is among the most powerful for designing algorithms for
optimization problems. Dynamic programming problems are typically optimization problems
(find the minimum or maximum cost solution, subject to various constraints). The technique is
related to divide-and-conquer, in the sense that it breaks problems down into smaller problems
that it solves recursively. However, because of the somewhat different nature of dynamic
programming problems, standard divide-and-conquer solutions are not usually efficient. The
basic elements that characterize a dynamic programming algorithm are:
Substructure: Decompose your problem into smaller (and hopefully simpler)
subproblems. Express the solution of the original problem in terms of solutions for
smaller problems.
Table-structure: Store the answers to the sub-problems in a table. This is done because
subproblem solutions are reused many times.
Bottom-up computation: Combine solutions on smaller subproblems to solve larger
subproblems. (We also discuss a top-down alternative, called memoization.)

The most important question in designing a DP solution to a problem is how to set up the
subproblem structure. This is called the formulation of the problem. Dynamic programming is
not applicable to all optimization problems. There are two important elements that a problem
must have in order for DP to be applicable.
Optimal substructure: (Sometimes called the principle of optimality.) It states that for
the global problem to be solved optimally, each subproblem should be solved optimally.
(Not all optimization problems satisfy this. Sometimes it is better to lose a little on one
subproblem in order to make a big gain on another.)
Polynomially many subproblems: An important aspect to the efficiency of DP is that
the total number of subproblems to be solved should be at most a polynomial number.


Fibonacci numbers
Recursive Fibonacci revisited: In the recursive version of the algorithm for finding a Fibonacci
number, each calculation of the Fibonacci number of a larger argument has to calculate the
Fibonacci numbers of the two previous arguments, regardless of whether those computations
have already been done. So there are many redundancies in calculating the Fibonacci number of
a particular argument. Let’s try to calculate the Fibonacci number of 4. The recursion tree shown
below shows the repetition in the calculation.

Fib(4)
├── Fib(3)
│   ├── Fib(2)
│   │   ├── Fib(1)
│   │   └── Fib(0)
│   └── Fib(1)
└── Fib(2)
    ├── Fib(1)
    └── Fib(0)

In the above tree we saw that calculations of fib(0) is done two times, fib(1) is done 3 times,
fib(2) is done 2 times, and so on. So if we somehow eliminate those repetitions we will save the
running time.

Algorithm:
DynaFibo(n)
{
A[0] = 0, A[1]= 1;
for(i = 2 ; i <=n ; i++)
A[i] = A[i-2] +A[i-1] ;
return A[n] ;
}
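A runnable version of DynaFibo, together with the top-down (memoized) alternative mentioned earlier in the chapter:

```python
# Bottom-up DP: store every subproblem in a table and reuse it.
def dyna_fibo(n):
    a = [0, 1]
    for i in range(2, n + 1):
        a.append(a[i - 2] + a[i - 1])   # reuse the two stored subproblems
    return a[n]

# Top-down alternative: memoize results of the recursive version.
def memo_fibo(n, memo=None):
    if memo is None:
        memo = {0: 0, 1: 1}
    if n not in memo:
        memo[n] = memo_fibo(n - 1, memo) + memo_fibo(n - 2, memo)
    return memo[n]
```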
Analysis


Analyzing the above algorithm we find that there is no repetition of calculation of the
subproblems already solved, and the running time decreases from O(2^(n/2)) to O(n). This reduction
was possible due to remembering the solutions of subproblems that are already solved and reusing
them to solve the problem of larger size.

0/1 Knapsack Problem


Statement: A thief has a bag or knapsack that can contain a maximum weight W of his
loot. There are n items; the weight of the ith item is wi and it is worth vi. The amount of an item
that can be put into the bag is 0 or 1, i.e. xi is 0 or 1. The objective is to collect the items
that maximize the total profit earned.

We can formally state this problem as: maximize Σ_{i=1}^{n} xi vi subject to the constraint
Σ_{i=1}^{n} xi wi ≤ W.

The algorithm takes as input maximum weight W, the number of items n, two arrays v[] for
values of items and w[] for weight of items. Let us assume that the table c[i,w] is the value of
solution for items 1 to i and maximum weight w. Then we can define recurrence relation for 0/1
knapsack problem as

C[i,w] = 0                                        if i=0 or w=0
C[i,w] = C[i-1,w]                                 if wi > w
C[i,w] = max{vi + C[i-1,w-wi], C[i-1,w]}          if i>0 and w ≥ wi

DynaKnapsack(W,n,v,w)
{
for(w=0; w<=W; w++)
C[0,w] = 0;
for(i=1; i<=n; i++)
C[i,0] = 0;
for(i=1; i<=n; i++)
{
for(w=1; w<=W; w++)
{
if(w[i]<=w)


{
if v[i] +C[i-1,w-w[i]] > C[i-1,w]
{
C[i,w] = v[i] +C[i-1,w-w[i]];
}
else
{
C[i,w] = C[i-1,w];
}
}
else
{
C[i,w] = C[i-1,w];
}
}
}
}

Analysis
For run time analysis examining the above algorithm the overall run time of the algorithm is
O(nW).

Example
Let the problem instance be with 7 items where v[] = {2,3,3,4,4,5,7}and w[] = {3,5,7,4,3,9,2}and
W = 9.

 i\w  0  1  2  3  4  5  6  7  8  9
  0   0  0  0  0  0  0  0  0  0  0
  1   0  0  0  2  2  2  2  2  2  2
  2   0  0  0  2  2  3  3  3  5  5
  3   0  0  0  2  2  3  3  3  5  5
  4   0  0  0  2  4  4  4  6  6  7
  5   0  0  0  4  4  4  6  8  8  8
  6   0  0  0  4  4  4  6  8  8  8
  7   0  0  7  7  7 11 11 11 13 15

Profit= C[7][9]=15
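The recurrence and table above can be sketched as a short Python program (the function name knapsack_01 and the 0-based indexing of the input arrays are choices made here, not part of the notes); on the 7-item instance it reproduces the table entry C[7][9] = 15.

```python
def knapsack_01(W, values, weights):
    """Bottom-up 0/1 knapsack: C[i][w] = best value using items 1..i, capacity w."""
    n = len(values)
    C = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        vi, wi = values[i - 1], weights[i - 1]
        for w in range(1, W + 1):
            if wi <= w:
                # either take item i or leave it, whichever is better
                C[i][w] = max(vi + C[i - 1][w - wi], C[i - 1][w])
            else:
                C[i][w] = C[i - 1][w]
    return C[n][W]

# the 7-item instance from the example above
print(knapsack_01(9, [2, 3, 3, 4, 4, 5, 7], [3, 5, 7, 4, 3, 9, 2]))  # -> 15
```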

Matrix Chain Multiplication


Chain Matrix Multiplication Problem: Given a sequence of matrices A1, A2, …, An and
dimensions p0, p1, …, pn, where Ai is of dimension pi-1 x pi, determine the order of
multiplication that minimizes the number of operations.
Important Note: This algorithm does not perform the multiplications, it just determines the best
order in which to perform the multiplications.

Although any legal parenthesization will lead to a valid result, not all involve the same number
of operations. Consider the case of 3 matrices: A1 be 5 x 4, A2 be 4 x 6 and A3 be 6 x 2.
multCost[((A1A2)A3)] = (5 . 4 . 6) + (5 . 6 . 2) = 180
multCost[(A1(A2A3))] = (4 . 6 . 2) + (5 . 4 . 2) = 88

Even for this small example, considerable savings can be achieved by reordering the evaluation
sequence.

Let Ai…j denote the result of multiplying matrices i through j. It is easy to see that Ai…j is a pi-1 x pj matrix. So for some k, the total cost is the sum of the cost of computing Ai…k, the cost of computing Ak+1…j, and the cost of multiplying Ai…k and Ak+1…j.

Recursive definition of optimal solution: let m[i,j] denote the minimum number of scalar
multiplications needed to compute Ai…j.

m[i,j] = 0                                          if i=j
m[i,j] = mini≤k<j{m[i,k] + m[k+1,j] + pi-1pkpj}     if i<j


Matrix-Chain-Multiplication(p)
{
n =length[p]
for(i=1; i<=n; i++)
{
m[i, i]= 0
}
for(l=2; l<= n; l++)
{
for( i= 1; i<=n-l+1; i++)
{
j=i+l−1
m[i, j] = ∞
for(k= i; k<= j-1; k++)
{
c= m[i, k] + m[k + 1, j] + p[i−1] * p[k] * p[j]
if(c < m[i, j])
{
m[i, j] = c
s[i, j] = k
}
}
}
}
return m and s
}

Analysis
The above algorithm can be easily analyzed for running time as O(n3), due to the three
nested loops.
The space complexity is O(n2).

Example:
Consider matrices A1, A2, A3 And A4 of order 3x4, 4x5, 5x2 and 2x3.

M Table (cost of multiplication):

 i\j   1    2    3    4
  1    0   60   64   82
  2         0   40   64
  3              0   30
  4                   0

S Table (points of parenthesization):

 i\j   1    2    3    4
  1         1    1    3
  2              2    3
  3                   3
  4

Constructing optimal solution


(A1A2A3A4) => ((A1A2A3)(A4)) => (((A1)(A2A3))(A4))
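The pseudocode above translates to the following Python sketch (matrix_chain_order is a hypothetical name; the dimensions are passed as a single array p, with Ai of dimension p[i-1] x p[i]). On the example above it reproduces m[1,4] = 82 and the split point s[1,4] = 3.

```python
import sys

def matrix_chain_order(p):
    """m[i][j] = min scalar multiplications for A_i..A_j; s[i][j] = optimal split point."""
    n = len(p) - 1  # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):            # chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):        # try every split A_i..k * A_k+1..j
                c = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if c < m[i][j]:
                    m[i][j] = c
                    s[i][j] = k
    return m, s

m, s = matrix_chain_order([3, 4, 5, 2, 3])  # A1:3x4, A2:4x5, A3:5x2, A4:2x3
print(m[1][4], s[1][4])  # -> 82 3
```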

Longest Common Subsequence Problem


Given two sequences X = (x1, x2, …, xm) and Z = (z1, z2, …, zk), we say that Z is a
subsequence of X if there is a strictly increasing sequence of k indices (i1, i2, …, ik) (1 ≤ i1 < i2
< … < ik) such that Z = (Xi1, Xi2, …, Xik).

For example, let X = (ABRACADABRA) and let Z = (AADAA), then Z is a subsequence of X.


Given two strings X and Y , the longest common subsequence of X and Y is a longest sequence
Z that is a subsequence of both X and Y . For example, let X = (ABRACADABRA) and let Y =
(YABBADABBAD). Then the longest common subsequence is Z = (ABADABA)

The Longest Common Subsequence Problem (LCS) is the following: given two sequences X =
(x1, …, xm) and Y = (y1, …, yn), determine a longest common subsequence.


DP Formulation for LCS:


Given a sequence X = <x1, x2, …, xm>, Xi = <x1, x2, …, xi> is called the ith prefix of X; here X0 is the empty sequence. Now consider the sequences Xi and Yj:

If xi = yj (i.e. the last characters match), we claim that the LCS must also contain the character xi = yj.

If xi ≠ yj (i.e. the last characters do not match), then xi and yj cannot both be in the LCS
(since they would both have to be the last character of the LCS). Thus either xi is not part of the LCS,
or yj is not part of the LCS (and possibly both are not). Let L[i,j] represent the
length of the LCS of sequences Xi and Yj.

L[i,j] = 0                             if i=0 or j=0
L[i,j] = L[i-1,j-1]+1                  if i,j>0 and xi = yj
L[i,j] = max{L[i-1,j], L[i,j-1]}       if i,j>0 and xi ≠ yj

LCS(X,Y)
{
    m = length[X];
    n = length[Y];
    for(i=0; i<=m; i++)
        c[i][0] = 0;
    for(j=0; j<=n; j++)
        c[0][j] = 0;
    for(i=1; i<=m; i++)
        for(j=1; j<=n; j++)
        {
            if(X[i]==Y[j])
            {
                c[i][j] = c[i-1][j-1]+1; b[i][j] = "upleft";
            }
            else if(c[i-1][j] >= c[i][j-1])
            {
                c[i][j] = c[i-1][j]; b[i][j] = "up";
            }
            else
            {
                c[i][j] = c[i][j-1]; b[i][j] = "left";
            }
        }
    return b and c;
}
Analysis:
The above algorithm can be easily analyzed for running time as O(mn), due to two nested loops.
The space complexity is O(mn).
Example:
Consider the character Sequences X=abbabba Y=aaabba

 X\Y   Φ   a        a        a        b        b        a
 Φ     0   0        0        0        0        0        0
 a     0   1 upleft 1 upleft 1 upleft 1 left   1 left   1 left
 b     0   1 up     1 up     1 up     2 upleft 2 upleft 2 left
 b     0   1 up     1 up     1 up     2 upleft 3 upleft 3 left
 a     0   1 upleft 2 upleft 2 upleft 2 up     3 up     4 upleft
 b     0   1 up     2 up     2 up     3 upleft 3 upleft 4 up
 b     0   1 up     2 up     2 up     3 upleft 4 upleft 4 up
 a     0   1 upleft 2 upleft 3 upleft 3 up     4 up     5 upleft

LCS = aabba
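The LCS algorithm and the backtracking step can be sketched in Python as follows (here the direction table b is left implicit: the backtracking re-derives the "upleft"/"up"/"left" moves from the length table, which is equivalent to following the stored arrows). It reproduces the result above for X = abbabba and Y = aaabba.

```python
def lcs(X, Y):
    """Return (length of LCS, one longest common subsequence) of X and Y."""
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # backtrack from L[m][n] to recover one LCS
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:          # "upleft" move
            out.append(X[i - 1]); i -= 1; j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:  # "up" move
            i -= 1
        else:                             # "left" move
            j -= 1
    return L[m][n], ''.join(reversed(out))

print(lcs("abbabba", "aaabba"))  # -> (5, 'aabba')
```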


Chapter 5
Greedy Algorithms
In many optimization algorithms a series of selections must be made. In dynamic
programming we saw one way to make these selections: the optimal solution is
described in a recursive manner and then computed bottom-up. Dynamic programming
is a powerful technique, but it often leads to algorithms with higher than desired running
times. The greedy method typically leads to simpler and faster algorithms, but it is not as
powerful or as widely applicable as dynamic programming. Even when greedy algorithms do
not produce the optimal solution, they often provide fast heuristics (non-optimal solution
strategies) and are often used to find good approximations.

To prove that a greedy algorithm is optimal we must show the following two characteristics
are exhibited.
Greedy Choice Property
Optimal Substructure Property

Fractional Knapsack Problem

Statement: A thief has a bag or knapsack that can contain a maximum weight W of his loot.
There are n items, the weight of the ith item is wi, and its value is vi. Any amount of an item can be
put into the bag, i.e. a fraction xi of item i can be collected, where 0 ≤ xi ≤ 1. Here the objective
is to collect the items that maximize the total profit earned.

Algorithm
Take as much of the item with the highest value-per-weight ratio (vi/wi) as you can. If the item is
exhausted, then move on to the item with the next highest vi/wi, and continue until the knapsack is
full. v[1 … n] and w[1 … n] contain the values and weights, respectively, of the n objects
sorted in nonincreasing order of v[i]/w[i]. W is the capacity of the knapsack, x[1 … n] is
the solution vector that holds the fractional amount of each item taken, and n is the number of items.
GreedyFracKnapsack(W,n)
{
    for(i=1; i<=n; i++)
        x[i] = 0.0;
    tw = W; // remaining capacity
    for(i=1; i<=n; i++)
    {
        if(w[i] > tw)
            break;
        else
        {
            x[i] = 1.0;
            tw = tw - w[i];
        }
    }
    if(i<=n)
        x[i] = tw/w[i]; // take a fraction of the first item that does not fit
}

Analysis:
We can see that the above algorithm contains just a single loop, i.e. no nested loops, so the
running time of the algorithm is O(n). However, our requirement is that v[1 … n] and
w[1 … n] are sorted, and sorting takes O(nlogn) time, so the
complexity of the algorithm including sorting becomes O(nlogn).
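A Python sketch of the greedy fractional-knapsack strategy (the function name and the instance used below are illustrative choices, not from the notes): items are sorted by value/weight ratio, whole items are taken while they fit, and the last item is split.

```python
def greedy_frac_knapsack(W, items):
    """items: list of (value, weight) pairs; returns the maximum total value.
    Greedily take the highest value/weight ratio first, splitting the last item."""
    total = 0.0
    remaining = W
    for v, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if w <= remaining:
            total += v                    # take the whole item
            remaining -= w
        else:
            total += v * remaining / w    # take only the fraction that fits
            break
    return total

print(greedy_frac_knapsack(50, [(60, 10), (100, 20), (120, 30)]))  # -> 240.0
```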

Job Sequencing with Deadline


We are given a set of n jobs. Associated with each job i is an integer deadline di ≥ 0 and a profit
pi ≥ 0. For any job i the profit is earned if and only if the job is completed by its deadline. To complete a job,
one has to process the job for one unit of time. Our aim is to find a feasible subset of jobs such that
the profit is maximum.

Example
n=4, (p1,p2,p3,p4)=(100,10,15,27), (d1,d2,d3,d4)=(2,1,2,1)
n=4, (p1,p4,p3,p2)=(100,27,15,10), (d1,d4,d3,d2)=(2,1,2,1)


     Feasible solution   Processing sequence   Profit value
 1.  (1, 2)              2, 1                  110
 2.  (1, 3)              1, 3 or 3, 1          115
 3.  (1, 4)              4, 1                  127
 4.  (2, 3)              2, 3                  25
 5.  (3, 4)              4, 3                  42
 6.  (1)                 1                     100
 7.  (2)                 2                     10
 8.  (3)                 3                     15
 9.  (4)                 4                     27

With this brute-force approach we have to try all the possibilities, so the complexity is O(n!).


Greedy strategy using total profit as the optimization function, applied to the above example:
Begin with J = Φ
– Job 1 considered, and added to J → J = {1}
– Job 4 considered, and added to J → J = {1,4}
– Job 3 considered, but discarded because not feasible → J = {1,4}
– Job 2 considered, but discarded because not feasible → J = {1,4}
The final solution is J = {1,4} with total profit 127, and it is optimal.

Algorithm
Assume the jobs are ordered such that p[1] ≥ p[2] ≥ … ≥ p[n]. d[i] ≥ 1, 1 ≤ i ≤ n, are the
deadlines, n ≥ 1. J[i] is the ith job in the optimal solution, 1 ≤ i ≤ k.
JobSequencing(int d[], int J[], int n)
{
    for(i=1; i<=n; i++)
    {
        // initially no jobs are selected; all slots are free
        J[i] = 0;
    }
    for(i=1; i<=n; i++)
    {
        // place job i in the latest free slot on or before its deadline
        for(k=min(n,d[i]); k>=1; k--)
        {
            if(J[k]==0)
            {
                J[k] = i;
                break;
            }
        }
    }
}
Analysis
The first for loop executes O(n) times. In the second part, the outer for loop executes O(n) times and the inner for loop executes at most O(n) times in the worst case. All other statements take O(1) time, so each iteration of the outer for loop costs O(n) in the worst case.
Thus the time complexity = O(n) + O(n2) = O(n2).
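A Python sketch of the greedy job-sequencing strategy (the function name job_sequencing and the slot bookkeeping are choices made here): jobs are considered in nonincreasing order of profit and each is placed in the latest free slot at or before its deadline. It reproduces the optimal solution J = {1,4} with profit 127 for the example above.

```python
def job_sequencing(profits, deadlines):
    """Greedy job sequencing with deadlines (unit-time jobs, one slot per time unit).
    Returns (job number scheduled in each slot, total profit); jobs are 1-indexed."""
    n = len(profits)
    # consider jobs in nonincreasing order of profit
    order = sorted(range(n), key=lambda i: profits[i], reverse=True)
    slots = [0] * (n + 1)  # slots[t] = job scheduled at time t (0 = free)
    total = 0
    for i in order:
        # latest free slot on or before the deadline of job i
        for t in range(min(n, deadlines[i]), 0, -1):
            if slots[t] == 0:
                slots[t] = i + 1
                total += profits[i]
                break
    return slots[1:], total

print(job_sequencing([100, 10, 15, 27], [2, 1, 2, 1]))  # -> ([4, 1, 0, 0], 127)
```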

Huffman Coding
Huffman coding is an algorithm for the lossless compression of files based on the frequency of
occurrence of a symbol in the file that is being compressed. In any file, certain characters are
used more than others. Using binary representation, the number of bits required to represent each
character depends upon the number of characters that have to be represented. Using one bit we
can represent two characters, i.e., 0 represents the first character and 1 represents the second
character. Using two bits we can represent four characters, and so on. Unlike ASCII code, which
is a fixed-length code using seven bits per character, Huffman compression is a variable-length
coding system that assigns smaller codes for more frequently used characters and larger codes
for less frequently used characters in order to reduce the size of files being compressed and
transferred.


For example, in a file with the following data: ‘XXXXXXYYYYZZ’. The frequency of "X" is 6,
the frequency of "Y" is 4, and the frequency of "Z" is 2. If each character is represented using a
fixed-length code of two bits, then the number of bits required to store this file would be 24, i.e.,
(2 x 6) + (2x 4) + (2x 2) = 24. If the above data were compressed using Huffman compression,
the more frequently occurring numbers would be represented by smaller bits, such as: X by the
code 0 (1 bit),Y by the code 10 (2 bits) and Z by the code 11 (2 bits), the size of the file becomes
18, i.e., (1x 6) + (2 x 4) + (2 x 2) = 18. In this example, more frequently occurring characters are
assigned smaller codes, resulting in a smaller number of bits in the final compressed file.
Huffman compression was named after its discoverer, David Huffman.

To generate Huffman codes we should create a binary tree of nodes. Initially, all nodes are leaf
nodes, which contain the symbol itself, the weight (frequency of appearance) of the symbol. As
a common convention, bit '0' represents following the left child and bit '1' represents following
the right child. A finished tree with n symbols has n leaf nodes and n-1 internal nodes. A Huffman tree
that omits unused symbols produces the most optimal code lengths.
begins with the leaf nodes containing the probabilities of the symbol they represent, then a new
node whose children are the 2 nodes with smallest probability is created, such that the new
node's probability is equal to the sum of the children's probability. With the previous 2 nodes
merged into one node and with the new node being now considered, the procedure is repeated
until only one node remains, the Huffman tree. The simplest construction algorithm uses
a priority queue where the node with lowest probability is given highest priority:
Example
The following example is based on a data source using a set of five different symbols. The
symbols' frequencies are:
Symbol Frequency
A 24
B 12
C 10
D 8
E 8
----> total 186 bit (with 3 bit per code word)


Step 1: Order the leaf nodes by frequency: E:8, D:8, C:10, B:12, A:24.

Step 2: Merge the two smallest nodes, E:8 and D:8, into a new node of weight 16, giving C:10, B:12, 16, A:24.

Step 3: Merge C:10 and B:12 into a node of weight 22, giving 16, 22, A:24.

Step 4: Merge 16 and 22 into a node of weight 38, giving A:24, 38.

Step 5: Merge A:24 and 38 into the root of weight 62. Labelling each left branch 0 and each right branch 1 yields the codes below.


Symbol   Frequency   Code   Code length   Total length
A        24          0      1             24
B        12          100    3             36
C        10          101    3             30
D        8           110    3             24
E        8           111    3             24
------------------------------------------------------
Total length of message: 138 bit
Algorithm
A greedy algorithm can construct a Huffman code, which is an optimal prefix code. A tree
corresponding to the optimal code is constructed in a bottom-up manner, starting from the |C|
leaves and performing |C|-1 merging operations. Use a priority queue Q to keep the nodes ordered by frequency.
Here the priority queue we consider is a binary heap.
HuffmanAlgo(C)
{
n = |C|; Q = C;
for(i=1; i<=n-1; i++)
{
z = Allocate-Node();
x = Extract-Min(Q);
y = Extract-Min(Q);
left(z) = x; right(z) = y;
f(z) = f(x) + f(y);
Insert(Q,z);
}
}
Analysis
We can use BuildHeap(C) to create a priority queue, which takes O(n) time. Inside the for loop the
expensive operations can each be done in O(logn) time. Since the operations inside the for loop execute
n-1 times, the total running time of the Huffman algorithm is O(nlogn).
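The total encoded length can be computed with a priority queue exactly as described: each merge adds the weight of the new internal node, and the sum of all merge costs equals Σ frequency × code length. A minimal Python sketch using heapq (frequencies only, without building the explicit tree):

```python
import heapq

def huffman_cost(freqs):
    """Total encoded length in bits: the sum of all internal node weights,
    which equals sum(frequency * code_length) for the Huffman code."""
    heap = list(freqs)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        x = heapq.heappop(heap)  # two lowest-frequency nodes
        y = heapq.heappop(heap)
        total += x + y           # cost of this merge = weight of new internal node
        heapq.heappush(heap, x + y)
    return total

print(huffman_cost([24, 12, 10, 8, 8]))  # -> 138
```

On the XXXXXXYYYYZZ example earlier (frequencies 6, 4, 2) the same function gives 18, matching the bit count computed there.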


Chapter 6
Graph Algorithms
Graph is a collection of vertices or nodes, connected by a collection of edges. Graphs are
extremely important because they are a very flexible mathematical model for many
application problems. Basically, any time you have a set of objects, and there is some
“connection” or “relationship” or “interaction” between pairs of objects, a graph is a good
way to model this. Examples of graphs in application include communication and
transportation networks, VLSI and other sorts of logic circuits, surface meshes used for shape
description in computer-aided design and geographic information systems, precedence
constraints in scheduling systems etc.

A directed graph (or digraph) G = (V,E) consists of a finite set V , called the vertices or
nodes, and E, a set of ordered pairs, called the edges of G.

An undirected graph (or graph) G = (V,E) consists of a finite set V of vertices, and a set E of
unordered pairs of distinct vertices, called the edges.

We say that vertex v is adjacent to vertex u if there is an edge (u, v). In a directed graph,
given the edge e = (u, v), we say that u is the origin of e and v is the destination of e. In
undirected graphs u and v are the endpoints of the edge. The edge e is incident on (meaning that
it touches) both u and v.


In a digraph, the number of edges coming out of a vertex is called the out-degree of that
vertex, and the number of edges coming in is called the in-degree. In an undirected graph we
just talk about the degree of a vertex as the number of incident edges. By the degree of a
graph, we usually mean the maximum degree of its vertices.

Notice that generally the number of edges in a graph may be as large as quadratic in the
number of vertices. However, the large graphs that arise in practice typically have much
fewer edges. A graph is said to be sparse if E = Θ(V ), and dense, otherwise. When giving the
running times of algorithms, we will usually express it as a function of both V and E, so that
the performance on sparse and dense graphs will be apparent.

Graph Representation
Graph is a pair G = (V,E) where V denotes a set of vertices and E denotes the set of edges
connecting two vertices. Many natural problems can be explained using graph for example
modeling road network, electronic circuits, etc. The example below shows the road network.

Representing Graphs
Generally we represent graph in two ways namely adjacency lists and adjacency matrix. Both
ways can be applied to represent any kind of graph i.e. directed and undirected. An

adjacency matrix is an n x n matrix M where M[i,j] = 1 if there is an edge from vertex i to
vertex j and M[i,j] = 0 if there is not. Adjacency matrices are the simplest way to represent
graphs. This representation takes O(n2) space regardless of the structure of the graph. So, if
we have a larger number of nodes, say 100000, then we must have space for 100000^2 =
10,000,000,000 entries, which is quite a lot of space to work with. The adjacency matrix representation
of the above road network graph is given below. Take the order
{Kirtipur, TU gate, Balkhu, Kalanki, Kalimati}.

It is very easy to see whether there is an edge from one vertex to another (O(1) time), but what
about space, especially when the graph is sparse or undirected? The adjacency list
representation of a graph contains an array of size n such that, for each vertex, every vertex
connected to it by an edge is added to the list stored at the corresponding array element. The
example below gives the adjacency list representation of the above road network graph.

Searching for some edge (i,j) requires O(d) time, where d is the degree of vertex i.


Some points:
- To test if (x, y) is in the graph, adjacency matrices are faster.
- To find the degree of a vertex, the adjacency list is good.
- For edge insertion and deletion, the adjacency matrix takes O(1) time whereas the adjacency list takes O(d) time.

Graph Traversals
There are a number of approaches used for solving problems on graphs. One of the most
important approaches is based on the notion of systematically visiting all the vertices and
edge of a graph. The reason for this is that these traversals impose a type of tree structure (or
generally a forest) on the graph, and trees are usually much easier to reason about than
general graphs.

Breadth-first search
This is one of the simplest methods of graph searching. Choose some vertex arbitrarily as a
root. Add all the vertices and edges that are incident on the root. The new vertices added
become the vertices at level 1 of the BFS tree. From the set of vertices added at level
1, find other vertices connected by edges to the level-1 vertices. Follow the
above step until all the vertices are added.
Algorithm:
BFS(G,s) //s is start vertex
{
    T = {s};
    L = Φ; //an empty queue
    Enqueue(L,s);
    while(L != Φ)
    {
        v = Dequeue(L);
        for each neighbor w of v
        {
            if(w ∉ L and w ∉ T)
            {
                Enqueue(L,w);
                T = T U {w}; //put edge {v,w} also
            }
        }
    }
}
Example:
Use breadth first search to find a BFS tree of the following graph.

Solution:

Analysis
From the algorithm above, all the vertices are put into the queue once and are accessed once.
For each vertex taken from the queue, its adjacent vertices are examined, which takes O(n)
time in the worst case (when the graph is complete). Doing this for all n vertices that may pass
through the queue gives the complexity of the algorithm as O(n2). Also, from aggregate
analysis, we can write the complexity as O(E+V) because the inner loop executes E times in total.
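A Python sketch of BFS using an explicit queue (the adjacency list below is a small hypothetical graph, not the road network from the notes); the function returns the tree edges of the BFS tree in the order they are discovered.

```python
from collections import deque

def bfs_tree(adj, s):
    """Breadth-first search from s over an adjacency-list graph adj.
    Returns the list of BFS tree edges (parent, child) in visiting order."""
    visited = {s}
    tree = []
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                tree.append((v, w))  # edge {v,w} joins the BFS tree
                q.append(w)
    return tree

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bfs_tree(adj, 1))  # -> [(1, 2), (1, 3), (2, 4), (4, 5)]
```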

Depth First Search


This is another technique that can be used to search the graph. Choose a vertex as the root and
form a path starting at the root by successively adding vertices and edges. This
process is continued until the path can no longer be extended. If the path contains all the vertices,
then the tree consisting of this path is the DFS tree. Otherwise, we must add other edges and
vertices. For this, move back from the last vertex visited on the previous path and find
whether it is possible to start a new path from a vertex met earlier. If there is such a path,
continue the process above. If this cannot be done, move back to another vertex and repeat
the process. The whole process is continued until all the vertices are visited. This method of
search is also called backtracking.
Algorithm:
DFS(G,s)
{
T = {s};
Traverse(s);
}
Traverse(v)
{
for each w adjacent to v and not yet in T
{
T = T U {w}; //put edge {v,w} also
Traverse (w);
}
}

Analysis:
The complexity of the algorithm is determined by the Traverse function. We can write its
running time as the recurrence T(n) = T(n-1) + O(n), where the O(n) term accounts for checking
at most all the vertices adjacent to a vertex (the for loop), and each recursive call decreases the
number of unvisited vertices by one. Solving this, the complexity of the algorithm is O(n2).

Also, from aggregate analysis, we can write the complexity as O(E+V) because the Traverse
function is invoked at most V times and the for loop executes O(E) times in total.
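The recursive Traverse procedure maps directly to Python (the adjacency list below is a small hypothetical graph chosen for illustration, and the function returns the DFS tree edges in discovery order):

```python
def dfs_tree(adj, s):
    """Recursive depth-first search from s; returns tree edges in discovery order."""
    visited = {s}
    tree = []

    def traverse(v):
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                tree.append((v, w))  # edge {v,w} joins the DFS tree
                traverse(w)          # go deeper before trying v's other neighbors

    traverse(s)
    return tree

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(dfs_tree(adj, 1))  # -> [(1, 2), (2, 4), (4, 3), (4, 5)]
```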

Minimum Spanning Tree


A tree is defined to be an undirected, acyclic and connected graph (or more simply, a graph in
which there is only one path connecting each pair of vertices). Assume there is an undirected,
connected graph G. A spanning tree is a sub-graph of G that is tree and contains all the vertices
of G. A minimum spanning tree is a spanning tree, but has weights or lengths associated with the
edges, and the total weight of the tree (the sum of the weights of its edges) is at a minimum.
Application of MST
- A practical application of an MST is the design of networks. For instance, a group of individuals, who are separated by varying distances, wish to be connected together in a telephone network. An MST can be used to determine the least costly paths with no cycles in this network, thereby connecting everyone at a minimum cost.
- Another useful application of MST is finding airline routes: MST can be applied to optimize airline routes by finding the least costly paths with no cycles.
Kruskal's algorithm
It is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted
graph. This means it finds a subset of the edges that forms a tree that includes every vertex,
where the total weight of all the edges in the tree is minimized. If the graph is not connected,


then it finds a minimum spanning forest (a minimum spanning tree for each connected
component). Kruskal's algorithm is an example of a greedy algorithm. It works as follows:
- Create a forest F (a set of trees), where each vertex in the graph is a separate tree.
- Create a set S containing all the edges in the graph.
- While S is nonempty and F is not yet spanning:
  - Remove an edge with minimum weight from S.
  - If that edge connects two different trees, then add it to the forest, combining two trees into a single tree (i.e. it does not create a cycle).
  - Otherwise discard that edge.
At the termination of the algorithm, the forest has only one component and forms a minimum
spanning tree of the graph.

Consider the following Example, This is our original graph. The numbers near the arcs
indicate their weight. None of the arcs are highlighted.


AD and CE are the shortest arcs, with length 5, and AD has been arbitrarily chosen, so it is
highlighted.

CE is now the shortest arc that does not form a cycle, with length 5, so it is highlighted as the
second arc.

The next arc, DF with length 6, is highlighted using much the same method.

The next-shortest arcs are AB and BE, both with length 7. AB is chosen arbitrarily, and is
highlighted. The arc BD has been highlighted in red, because there already exists a path (in
green) between B and D, so it would form a cycle (ABD) if it were chosen.


The process continues to highlight the next-smallest arc, BE with length 7. Many more arcs
are highlighted in red at this stage: BC because it would form the loop BCE, DE because it
would form the loop DEBA, and FE because it would form FEBAD.

Finally, the process finishes with the arc EG of length 9, and the minimum spanning tree is
found.
Algorithm:
KruskalMST(G)
{
    T = Φ; // T holds MST edges; implicitly we start with a forest of n single-vertex trees
    S = set of edges sorted in nondecreasing order of weights;
    while(|T| < n-1 and S != Φ)
    {
        select (u,v), the next edge in S;
        remove (u,v) from S;
        if((u,v) does not create a cycle in T)
            T = T U {(u,v)};
    }
}
Analysis
In the above algorithm, creating the n-tree forest at the beginning takes O(V) time, and the creation
of the sorted edge set S takes O(ElogE) time. The while loop examines at most E edges, and the steps
inside the loop take almost-constant amortized time (see disjoint-set operations: find and union).
So the total time taken is O(ElogE), which is asymptotically equivalent to O(ElogV).
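A Python sketch of Kruskal's algorithm (a union-find structure with path compression implements the cycle test; the small weighted graph below is hypothetical, chosen only for illustration):

```python
def kruskal_mst(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1.
    Returns (total MST weight, list of chosen edges)."""
    parent = list(range(n))

    def find(x):  # find the root of x's tree, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):       # nondecreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # edge joins two different trees: no cycle
            parent[ru] = rv
            total += w
            chosen.append((u, v))
            if len(chosen) == n - 1:    # spanning tree complete
                break
    return total, chosen

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal_mst(4, edges))  # -> (6, [(0, 1), (1, 3), (1, 2)])
```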
Prim's algorithm
It is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted
graph. This means it finds a subset of the edges that forms a tree that includes every vertex,
where the total weight of all the edges in the tree is minimized.

How it works
- This algorithm builds the MST one vertex at a time.
- It starts at any vertex in a graph (vertex A, for example), and finds the least-cost vertex (vertex B, for example) connected to the start vertex.
- Now, from either A or B, it will find the next least costly vertex connection, without creating a cycle (vertex C, for example).
- Now, from A, B, or C, it will find the next least costly vertex connection, without creating a cycle, and so on it goes.
- Eventually, all the vertices will be connected, without any cycles, and an MST will be the result.
Example,

This is our original weighted graph. The numbers near the edges indicate their weight.


Vertex D has been arbitrarily chosen as a starting point. Vertices A, B, E and F are connected to
D through a single edge. A is the vertex nearest to D and will be chosen as the second vertex
along with the edge AD.

The next vertex chosen is the vertex nearest to either D or A. B is 9 away from D and 7 away
from A, E is 15, and F is 6. F is the smallest distance away, so we highlight the vertex F and
the arc DF.

The algorithm carries on as above. Vertex B, which is 7 away from A, is highlighted.


In this case, we can choose between C, E, and G. C is 8 away from B, E is 7 away from B,
and G is 11 away from F. E is nearest, so we highlight the vertex E and the arc BE.

Here, the only vertices available are C and G. C is 5 away from E, and G is 9 away from E.
C is chosen, so it is highlighted along with the arc EC.

Vertex G is the only remaining vertex. It is 11 away from F, and 9 away from E. E is nearer,
so we highlight it and the arc EG.
Algorithm:
PrimMST(G)
{
    T = Φ;   // T is the set of edges of the MST
    S = {s}; // s is a randomly chosen start vertex; S is the set of visited vertices
    while(S != V)
    {
        e = (u,v), an edge of minimum weight incident to vertices in T and not forming a
            simple circuit in T if added to T, i.e. u ∈ S and v ∈ V-S;
        T = T U {(u,v)};
        S = S U {v};
    }
}
Analysis
In the above algorithm, the while loop executes O(V) times. The edge of minimum weight
crossing the cut can be found in O(E) time, so the total time is O(EV). We can improve the
performance of the above algorithm by choosing better data structures: if we use a heap, the
edge of minimum weight can be selected in O(logE) time, and the running time of
Prim's algorithm becomes O(ElogE), which is equivalent to O(ElogV).
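A Python sketch of Prim's algorithm using a binary heap, which gives the improved O(ElogV) behaviour (the weighted graph below is hypothetical, chosen only for illustration):

```python
import heapq

def prim_mst(adj, s):
    """adj: dict mapping vertex -> list of (weight, neighbor) pairs.
    Grows the tree from s, always taking the cheapest edge leaving the
    visited set; returns the total MST weight."""
    visited = {s}
    heap = list(adj[s])
    heapq.heapify(heap)
    total = 0
    while heap:
        w, v = heapq.heappop(heap)
        if v in visited:
            continue        # this edge would form a cycle
        visited.add(v)
        total += w
        for edge in adj[v]:
            heapq.heappush(heap, edge)
    return total

adj = {
    0: [(1, 1), (4, 2)],
    1: [(1, 0), (3, 2), (2, 3)],
    2: [(4, 0), (3, 1), (5, 3)],
    3: [(2, 1), (5, 2)],
}
print(prim_mst(adj, 0))  # -> 6
```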

Shortest Path Algorithms


Dijkstra's Algorithm
This is an approach for computing single-source shortest paths. The algorithm assumes that
there are no negative-weight edges. Dijkstra's algorithm works using the greedy approach, as
below:

Initially, the shortest-path estimate from the source to every other vertex is infinity; suppose g is
the source vertex.

Take the unvisited vertex with the smallest shortest-path estimate (i.e. vertex g), and relax all
neighboring vertices. Vertices a, b, i and h get relaxed.

Again, take the unvisited vertex with the smallest estimate (i.e. vertex i), and relax all
neighboring vertices. Vertices f and e get relaxed.

Take the unvisited vertex with the smallest estimate (i.e. vertex f), and relax all neighboring
vertices. None of the vertices gets relaxed.


Take the unvisited vertex with the smallest estimate (i.e. vertex a, e or h), and relax all
neighboring vertices. None of the vertices gets relaxed.

Take the unvisited vertex with the smallest estimate (i.e. vertex e or h), and relax all
neighboring vertices. Vertex d gets relaxed.

Take the unvisited vertex with the smallest estimate (i.e. vertex h), and relax all neighboring
vertices. Vertex c gets relaxed.


Take the unvisited vertex with the smallest estimate (i.e. vertex c), and relax all neighboring
vertices. None of the vertices gets relaxed.

There will be no change for vertices b and d; continue the above steps for b and d to complete.
The shortest-path tree is shown with dark connections.
Algorithm:
Dijkstra(G,w,s)
{
    for each vertex v ∈ V
    {
        d[v] = ∞;
        p[v] = Nil;
    }
    d[s] = 0;
    S = Φ;
    Q = V; // priority queue keyed by d[]
    while(Q != Φ)
    {
        u = extract the vertex with minimum d[] from Q;
        S = S U {u};
        for each vertex v adjacent to u
        {
            if(d[v] > d[u] + w(u,v))
            {
                d[v] = d[u] + w(u,v);
                p[v] = u;
            }
        }
    }
}

Analysis
In the above algorithm, the first for loop takes O(V) time. Initialization of the priority queue Q
takes O(V) time. The while loop executes O(V) times, and for each execution the block inside
the loop takes O(V) time. Hence the total running time is O(V2).
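A Python sketch of Dijkstra's algorithm using a binary heap with lazy deletion (a common variant of the extract-minimum step; the graph below is hypothetical, chosen only for illustration):

```python
import heapq

def dijkstra(adj, s):
    """adj: dict vertex -> list of (neighbor, weight), nonnegative weights.
    Returns a dict of shortest distances from s to every reachable vertex."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                        # stale heap entry; skip it
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w             # relax edge (u, v)
                heapq.heappush(heap, (dist[v], v))
    return dist

adj = {
    's': [('a', 2), ('b', 5)],
    'a': [('b', 1), ('c', 4)],
    'b': [('c', 1)],
    'c': [],
}
print(dijkstra(adj, 's'))  # -> {'s': 0, 'a': 2, 'b': 3, 'c': 4}
```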
Floyd-Warshall Algorithm
The algorithm being discussed uses the dynamic programming approach and works even if some
of the edges have negative weights (as long as there are no negative-weight cycles). Consider a
weighted graph G = (V,E) and denote the weight of the edge connecting vertices i and j by wij. Let
W be the adjacency matrix of the given graph G. Let Dk denote an n x n matrix such that Dk(i,j) is
defined as the weight of the shortest path from vertex i to vertex j using only vertices from 1,2,…,k
as intermediate vertices in the path. Computing such a shortest path has two cases: either the path
does not use k as an intermediate vertex, or it does. This gives the relations Dk(i,j) = Dk-1(i,j)
when k is not an intermediate vertex, and Dk(i,j) = Dk-1(i,k) + Dk-1(k,j) when k is an intermediate
vertex. So from the above relations we obtain:
Dk(i,j) = min{Dk-1(i,j), Dk-1(i,k) + Dk-1(k,j)}.

Algorithm:

FloydWarshalAPSP(W, D, n)   // W is the adjacency matrix of graph G.
{
    for(i=1; i<=n; i++)
        for(j=1; j<=n; j++)
            D[i][j] = W[i][j];   // initially D[][] is D0.
    for(k=1; k<=n; k++)
        for(i=1; i<=n; i++)
            for(j=1; j<=n; j++)
                D[i][j] = min{D[i][j], D[i][k] + D[k][j]};
}


Analysis:
Clearly the above algorithm’s running time is O(n³), where n = |V| is the number of vertices.
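The recurrence can be turned into a minimal Python sketch (the matrix encoding, with float('inf') for missing edges and 0 on the diagonal, is an assumption of this sketch):

```python
INF = float('inf')

def floyd_warshall(W):
    """All-pairs shortest paths.

    W: n x n adjacency matrix with W[i][j] = edge weight,
    0 on the diagonal, and INF where there is no edge.
    Returns the final distance matrix D = D^n.
    """
    n = len(W)
    D = [row[:] for row in W]          # D starts as D^0 = W
    for k in range(n):                 # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

W = [[0,   3,   INF, 7],
     [8,   0,   2,   INF],
     [5,   INF, 0,   1],
     [2,   INF, INF, 0]]
D = floyd_warshall(W)
print(D[0][2])   # 5  (path 0 -> 1 -> 2)
```

Note that k must be the outermost loop: D[i][k] and D[k][j] on the right-hand side are then already the Dk-1 values required by the recurrence.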


Directed Acyclic Graph (DAG)


In a DAG, directed means that each edge has an arrow denoting that the edge can be traversed in only that particular direction. Acyclic means that the graph has no cycles, i.e., starting at one node and following directed edges, you can never return to the same node. A DAG can be used to find shortest paths from a given source node to all other nodes. To find shortest paths in a DAG, first sort the vertices of the graph topologically and then relax the vertices in topological order.
Example:

Step1: Sort the vertices of graph topologically

Step2: Relax from S

Step3: Relax from C


Step4: Relax from A

Step5: Relax from B

Step6: Relax from D

Algorithm
DagSP(G, w, s)
{
    Topologically sort the vertices of G
    for each vertex v ∈ V
        d[v] = ∞
    d[s] = 0
    for each vertex u, taken in topologically sorted order
        for each vertex v adjacent to u
            if d[v] > d[u] + w(u,v)
                d[v] = d[u] + w(u,v)
}
Analysis:
In the above algorithm, the topological sort can be done in O(V+E) time, since it is essentially a DFS. The first for loop takes O(V) time. The nested for loops relax each edge exactly once over the whole run, so by aggregate analysis they take O(E) time in total. Hence the total running time is O(V+E).
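The relax-in-topological-order scheme can be sketched in Python; for brevity the topological order is supplied by the caller rather than computed, and the adjacency-list dictionary is an assumption of this sketch:

```python
def dag_shortest_paths(graph, order, s):
    """Single-source shortest paths in a DAG.

    graph: dict vertex -> list of (neighbor, weight) edges.
    order: the vertices in topological order.
    Each edge is relaxed exactly once, giving O(V + E) overall.
    """
    d = {v: float('inf') for v in graph}
    d[s] = 0
    for u in order:                    # process vertices in topological order
        if d[u] == float('inf'):
            continue                   # u is unreachable from s
        for v, w in graph[u]:
            if d[v] > d[u] + w:        # relax edge (u, v)
                d[v] = d[u] + w
    return d

g = {'s': [('c', 2), ('a', 6)],
     'c': [('a', 7), ('d', 4)],
     'a': [('b', 1)],
     'b': [('d', 2)],
     'd': []}
order = ['s', 'c', 'a', 'b', 'd']      # one valid topological order of g
print(dag_shortest_paths(g, order, 's'))
```

Because the order is topological, by the time a vertex u is processed its distance d[u] is already final, so no priority queue is needed.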


Chapter 7

Geometric Algorithms
The easy explanation is that geometry algorithms are what a software developer programs to
solve geometry problems. And we all know what geometry problems are, right? The simplest of
such problems might be to find the intersection of two lines, the area of a given region, or the
inscribed circle of a triangle. Methods and formulas have been around for a long time to solve
such simple problems. But, when it comes to solving even these simple problems as accurate,
robust, and efficient software programs, the easy formulas are sometimes inappropriate and
difficult to implement. This is the starting point for geometry algorithms as methods for
representing elementary geometric objects and performing the basic constructions of classical
geometry.

Geometric Primitives
A line is a group of points on a straight path that extends to infinity. Any two points on the line
can be used to name it.

A line segment is a part of a line that has two end points. The two end points of the line segment
are used to name the line segment.

Polygon


A closed figure of n line segments, where n ≥ 3. The polygon P is represented by its vertices,
usually in counterclockwise order of traversal of its boundary, P = (p0, p1, ..., pn-1) or the line
segments that are ordered as P = (S0, S1, ...,Sn-1) such that the end point of preceding line segment
becomes starting point of the next line segment.
[Figures: two example polygons with vertices labeled p0 … p5, traversed counterclockwise.]

A Simple polygon is a polygon P with no two non-consecutive edges intersecting. There is a


well-defined bounded interior and unbounded exterior for a simple polygon, where the interior is
surrounded by edges.

Convex Polygon
A simple polygon P is convex if and only if for any pair of points x, y in P, the line segment
joining x and y lies entirely in P. We can notice that if every interior angle is less than 180°, then
the simple polygon is a convex polygon.

[Figure: a convex polygon containing points p and q; the segment pq lies entirely within the polygon.]


Ear and Mouth


A vertex pi of a simple polygon P is called an ear if the segment (pi-1, pi+1) joining its two
consecutive neighboring vertices is an internal diagonal. A vertex pi of a simple polygon P is
called a mouth if the segment (pi-1, pi+1) is an external diagonal.

[Figure: a simple polygon p0 … p5 with vertex p1 forming an ear and vertex p4 forming a mouth.]

Convex Hull
The convex hull of a polygon P is the smallest convex polygon that contains P. Similarly, we can
define the convex hull of a set of points R as the smallest convex polygon containing R.

Computing point of intersection between two line segments


We can apply our coordinate geometry method for finding the point of intersection between two
line segments. Let S1 and S2 be any two line segments. The following steps are used to calculate
point of intersection between two line segments. We are not considering parallel line segments
here in this discussion.
 Determine the equations of the lines through the line segments S1 and S2. Say the equations
are L1: y = m1x + c1 and L2: y = m2x + c2 respectively. We can find the equation

of line L1 using the formula of slope (m1) = (y2-y1)/ (x2-x1), where (x1,y1) and (x2,y2)
are two given end points of the line segment S1. Similarly we can find the m2 for L2
also. The value of each ci can then be obtained by substituting an end point of the
corresponding line segment into its line equation.
 Solve the two equations of lines L1 and L2; let the point obtained by solving them be p = (xi, yi).
Here there are two cases. The first case: if p is the intersection of the two line
segments, then p lies on both S1 and S2. The second case: if p is not an intersection
point, then p does not lie on at least one of the line segments S1 and S2.
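The two steps above can be sketched in Python; as in the discussion, parallel segments are excluded, and (an additional assumption of this sketch) vertical segments are excluded so that both slopes exist:

```python
def segment_intersection(s1, s2, eps=1e-9):
    """Intersection point of two non-parallel, non-vertical segments.

    Each segment is ((x1, y1), (x2, y2)). Builds y = m*x + c for each
    supporting line, solves the pair of equations, then checks that the
    solution lies on both segments. Returns the point or None.
    """
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    m1 = (y2 - y1) / (x2 - x1)         # slope of the line through s1
    m2 = (y4 - y3) / (x4 - x3)         # slope of the line through s2
    c1 = y1 - m1 * x1                  # intercepts from a point on each segment
    c2 = y3 - m2 * x3
    xi = (c2 - c1) / (m1 - m2)         # solve m1*x + c1 = m2*x + c2
    yi = m1 * xi + c1

    def on(seg, x, y):                 # does (x, y) lie within the segment's box?
        (ax, ay), (bx, by) = seg
        return (min(ax, bx) - eps <= x <= max(ax, bx) + eps and
                min(ay, by) - eps <= y <= max(ay, by) + eps)

    if on(s1, xi, yi) and on(s2, xi, yi):
        return (xi, yi)
    return None

print(segment_intersection(((0, 0), (4, 4)), ((0, 4), (4, 0))))   # (2.0, 2.0)
```

Because p lies on both supporting lines by construction, the membership check only needs to compare coordinates against each segment's bounding box.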

[Figure: two line segments intersecting at point p.]

Determining Intersection between two Line Segments


Suppose we want to know whether two consecutive line segments turn left or right at a point p1.
Cross products allow us to answer this question without computing angles. We simply check
whether the directed segment p0p2 is clockwise or counterclockwise relative to the directed
segment p0p1. To do this, we compute the cross product (p1 - p0) X (p2 - p0). If the sign of this
cross product is positive, then p0p2 is counterclockwise with respect to p0p1, and thus we make
a left turn at p1. A negative cross product indicates a clockwise orientation and a right turn. A
cross product of 0 means that points p0, p1, and p2 are collinear.


We compute the cross product of the vectors given by the two line segments as:
d = (p1 - p0) X (p2 - p0) = (x1 - x0, y1 - y0) X (x2 - x0, y2 - y0) = (x1 - x0)(y2 - y0) - (y1 - y0)(x2 - x0)

Here we have:
 If d = 0 then p0, p1, p2 are collinear.
 If d > 0 then p0, p1, p2 make a left turn, i.e., there is a left turn at p1 (p0p1 is clockwise with
respect to p0p2).
 If d < 0 then p0, p1, p2 make a right turn, i.e., there is a right turn at p1 (p0p1 is anticlockwise
with respect to p0p2).

Using the concept of left and right turns we can detect intersection between two line segments
in a very efficient manner. Two segments S1 = (P, Q) and S2 = (R, S) do not intersect if PQR and
PQS are of the same turn type or RSP and RSQ are of the same turn type.
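The turn test and the intersection criterion above can be sketched as follows; general position (no collinear triples among the endpoints) is assumed for the intersection test:

```python
def cross(p0, p1, p2):
    """Cross product (p1 - p0) x (p2 - p0).
    > 0: left turn at p1; < 0: right turn; == 0: collinear."""
    return ((p1[0] - p0[0]) * (p2[1] - p0[1]) -
            (p1[1] - p0[1]) * (p2[0] - p0[0]))

def segments_intersect(p, q, r, s):
    """Segments PQ and RS intersect iff R and S lie on opposite sides
    of line PQ and P and Q lie on opposite sides of line RS."""
    return (cross(p, q, r) * cross(p, q, s) < 0 and
            cross(r, s, p) * cross(r, s, q) < 0)

print(cross((0, 0), (1, 0), (1, 1)))                       # 1 -> left turn
print(segments_intersect((0, 0), (4, 4), (0, 4), (4, 0)))  # True
```

Multiplying the two cross products gives a negative value exactly when the turn types differ, which is how "same turn type" is checked without branching.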


Graham Scan Algorithm


The convex hull of a set Q of points is the smallest convex polygon P for which each point
in Q is either on the boundary of P or in its interior. Graham's scan solves the convex-hull
problem by maintaining a stack S of candidate points. Each point of the input set Q is pushed
once onto the stack, and the points that are not vertices of convex hull are eventually popped
from the stack. When the algorithm terminates, stack S contains exactly the vertices of convex
hull, in counterclockwise order of their appearance on the boundary.

The algorithm starts by picking a point in Q known to be a vertex of the convex hull. This can
be done in O(n) time by selecting the rightmost lowest point in the set, that is, the point with
minimum y-coordinate, breaking ties by taking the one with maximum x-coordinate. Having
selected this base point, call it p0, the algorithm then sorts the other points p in Q by the
increasing counterclockwise angle that the line segment p0p makes with the x-axis. If there is a
tie and two points have the same angle, discard the one that is closer to p0.

Given Points: Select point with lowest y-coordinate



[Figure: the given points, with p0 the lowest point.]

Sort the remaining points in increasing order of angle in counterclockwise order; say the sorted
set of points is S = {p1, p2, p3, p4, p5, p6, p7, p8, p9}.

The scan then proceeds as follows (each step is shown in the figures):
push(p1); push(p2)
p1p2p3: Left turn  => push(p3)
p2p3p4: Right turn => pop()
p1p2p4: Left turn  => push(p4)
p2p4p5: Left turn  => push(p5)
p4p5p6: Right turn => pop()
p2p4p6: Left turn  => push(p6)
p4p6p7: Left turn  => push(p7)
p6p7p8: Right turn => pop()
p4p6p8: Left turn  => push(p8)
p6p8p9: Left turn  => push(p9)


Algorithm
GrahamScan(P)
{
    p0 = point with lowest y-coordinate value
    Angularly sort the other points with respect to p0; call them p1, p2, …, pn-1
    Push(S, p0)   // S is a stack
    Push(S, p1)
    Push(S, p2)
    for(i = 3; i < n; i++)
    {
        while (NextToTop(S), Top(S), pi make a non-left turn)
            Pop(S)
        Push(S, pi)
    }
    return S
}

Analysis
It requires O(n) time to find p0. Sorting the points requires O(nlogn) time. Each push operation
takes constant time, i.e., O(1). Each point is pushed at most once and popped at most once, so
all executions of the while loop together take O(n) time, i.e., amortized O(1) per point. So the
worst-case running time of the algorithm is T(n) = O(n) + O(nlogn) + O(n) = O(nlogn), where n = |P|.
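The whole procedure can be sketched in Python; using atan2 for the angular sort is one concrete realization of the sort described above, and collinear ties are handled here by the non-left-turn pop rather than by discarding points in advance:

```python
import math

def graham_scan(points):
    """Convex hull by Graham's scan. Returns the hull vertices in
    counterclockwise order, starting from the lowest point."""
    def cross(o, a, b):                # > 0 means o -> a -> b is a left turn
        return ((a[0] - o[0]) * (b[1] - o[1]) -
                (a[1] - o[1]) * (b[0] - o[0]))

    p0 = min(points, key=lambda p: (p[1], p[0]))   # lowest, then leftmost
    rest = sorted((p for p in points if p != p0),
                  key=lambda p: (math.atan2(p[1] - p0[1], p[0] - p0[0]),
                                 (p[0] - p0[0]) ** 2 + (p[1] - p0[1]) ** 2))
    stack = [p0]
    for p in rest:
        while len(stack) > 1 and cross(stack[-2], stack[-1], p) <= 0:
            stack.pop()                # non-left turn: top is not a hull vertex
        stack.append(p)
    return stack

pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
print(graham_scan(pts))   # [(0, 0), (4, 0), (4, 4), (0, 4)]
```

The `<= 0` in the pop condition also discards collinear points, so interior points such as (2, 2) never survive on the stack.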

Polygon Triangulation
Triangulation of a simple polygon P is decomposition of P into triangles by a maximal set of
non-intersecting diagonals. Diagonal is an open line segment that connects two vertices of P and


lies in the interior of P. Triangulations are usually not unique. Any triangulation of a simple
polygon with n vertices consists of exactly n-2 triangles.

Triangulation is done by adding diagonals. For a polygon with n vertices, there can be O(n²)
candidate diagonals. Checking one candidate against the O(n) edges costs O(n), so testing all
candidates costs O(n³). A triangulation contains n - 3 diagonals in total, so the total cost of this
naive triangulation is O(n⁴).


Chapter 8

NP-Completeness
Most of the problems considered up to now can be solved by algorithms in worst-case
polynomial time. However, not every problem admits such an efficient solution. In terms of
what computers can do: some problems can be solved in a manageable amount of time, e.g.,
sorting; some problems require an unmanageable amount of time, e.g., finding Hamiltonian
cycles; and some problems cannot be solved at all, e.g., the Halting Problem. In this section we
concentrate on the specific class of problems called NP-complete problems (defined later).
Tractable and Intractable Problems
We call problems tractable or easy if they can be solved by polynomial-time algorithms. The
problems that cannot be solved in polynomial time but require superpolynomial-time
algorithms are called intractable or hard. There are many problems for which no algorithm with
running time better than exponential is known; some of them are the traveling salesman
problem, Hamiltonian cycles, and circuit satisfiability.
Polynomial time reduction
Given two problems A and B, a polynomial-time reduction from A to B is a polynomial-time
computable function f that transforms instances of A into instances of B such that the output of
the algorithm for problem A on an input instance x is the same as the output of the algorithm for
problem B on the input instance f(x). If such a polynomial-time computable function f exists, we
say A reduces to B and write A ≤p B. The function f is called the reduction function, and an
algorithm for computing f is called a reduction algorithm.


P and NP classes and NP completeness


The set of problems that can be solved using polynomial time algorithm is regarded as class P.
The problems that are verifiable in polynomial time constitute the class NP. The class of NP
complete problems consists of those problems that are NP as well as they are as hard as any
problem in NP (more on this later). The main concern of studying NP completeness is to
understand how hard the problem is. So if we can find some problem as NP complete then we try
to solve the problem using methods like approximation, rather than searching for the faster
algorithm for solving the problem exactly.
Complexity Class P
P is the class of problems that can be solved in polynomial time on a deterministic effective
computing system (ECS). Loosely speaking, all computing machines that exist in the real world
are deterministic ECSs. So P is the class of things that can be computed in polynomial time on
real computers.
Complexity Class NP
NP is the class of problems that can be solved in polynomial time on a non-deterministic
effective computing system (ECS). Since all computing machines that exist in the real world are
deterministic ECSs, problems in NP may, as far as we know, require superpolynomial time on
real computers; what characterizes them is that a proposed solution can be verified in
polynomial time. Using this idea we say a problem is in class NP (nondeterministic polynomial
time) if there is an algorithm for the problem that verifies a certificate in polynomial time. For
example, the circuit satisfiability problem (SAT) is the question “Given a Boolean combinational
circuit, is it satisfiable? i.e., does the circuit have an assignment of truth values to its inputs that
produces the output 1?” Given the circuit satisfiability problem, take a circuit x and a certificate
y with a set of values claimed to produce output 1; we can verify in polynomial time whether
the given certificate satisfies the circuit. So we can say that the circuit satisfiability problem is
in NP.

Complexity Class NP-Complete


NP-complete problems are the hardest problems in class NP. We say some problem A is
NP-complete if
a. A ∈ NP, and
b. B ≤p A, for every B ∈ NP.
A problem satisfying property b (but not necessarily property a) is called NP-hard.

NP-Complete problems arise in many domains like: boolean logic; graphs, sets and partitions;
sequencing, scheduling, allocation; automata and language theory; network design; compilers,
program optimization; hardware design/optimization; number theory, algebra etc.

Cook’s Theorem
SAT is NP-complete
Proof
To prove that SAT is NP-complete, we have to show that
 SAT ∈ NP
 SAT is NP-hard
SAT ∈ NP
The circuit satisfiability problem (SAT) asks: “Given a Boolean combinational circuit, is it
satisfiable? i.e., does the circuit have an assignment of truth values that produces the output
1?” Given a circuit x and a certificate y with a set of values claimed to produce output 1, we can
verify in polynomial time whether the given certificate satisfies the circuit. So we can say that
the circuit satisfiability problem is in NP.
SAT is NP-hard
Take any problem V ∈ NP, and let A be the algorithm that verifies V in polynomial time (such an
A must exist since V ∈ NP). We can program A on a computer, and therefore there exists a
(huge) logical circuit whose input wires correspond to the bits of the inputs x and y of A and
which outputs 1 precisely when A(x,y) returns yes. For any instance x of V, let Ax be the circuit
obtained from A by fixing the x-input wires to the bits of the specific string x. The construction
of Ax from x is our reduction function.


Approximation Algorithms
An approximation algorithm is a way of dealing with NP-completeness for optimization
problems. This technique does not guarantee the best solution. The goal of an approximation
algorithm is to come as close as possible to the optimum value in a reasonable amount of time,
which is at most polynomial time. If we are dealing with an optimization problem (maximization
or minimization) whose feasible solutions have positive cost, then it is worthwhile to look at
approximation algorithms for near-optimal solutions.
Vertex Cover Problem
A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that for each edge
(u, v) ∈ E, either u ∈ V' or v ∈ V' (or both). The problem here is to find a vertex cover of
minimum size in a given graph G. Optimal vertex cover is the optimization version of an
NP-complete problem, but it is not too hard to find a vertex cover that is near-optimal.
Algorithm
ApproxVertexCover(G)
{
    C = { }
    E' = E
    while E' is not empty
        do Let (u, v) be an arbitrary edge of E'
           C = C ∪ {u, v}
           Remove from E' every edge incident on either u or v
    return C
}
Example: (vertex cover running example for graph below)
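The procedure can also be sketched in Python over an explicit edge list (the edge-list representation is an assumption of this sketch); picking both endpoints of each chosen edge is what yields the well-known 2-approximation guarantee:

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: repeatedly pick an arbitrary
    remaining edge, add BOTH its endpoints to the cover, and delete
    every edge incident on either endpoint."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]            # an arbitrary edge of E'
        cover.update((u, v))
        remaining = [(a, b) for (a, b) in remaining
                     if a not in cover and b not in cover]
    return cover

E = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('c', 'e'), ('d', 'f')]
C = approx_vertex_cover(E)
print(sorted(C))                       # ['a', 'b', 'c', 'd']
assert all(u in C or v in C for (u, v) in E)   # C covers every edge
```

The chosen edges form a matching, and any optimal cover must contain at least one endpoint of each matched edge, so |C| is at most twice the optimum.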

