
Course Code/Title: CS3302 / Design and Analysis of Algorithms

COURSE OBJECTIVES:
To understand and apply the algorithm analysis techniques on searching and sorting algorithms
To understand the different Greedy Algorithms
To understand different algorithm design techniques.
To solve programming problems using state space tree.
To understand the concepts behind NP-Completeness, Approximation algorithms and randomized algorithms.
UNIT I INTRODUCTION 9
Problem Solving: Programs and Algorithms – Problem Solving Aspects – Problem Solving Techniques - Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties - Best case, Worst case and Average case analysis – Recurrence relation: substitution method - Searching: Interpolation Search - Pattern search: The naïve string-matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt algorithm.

Algorithm: A systematic, logical approach; a well-defined, step-by-step procedure that allows a computer to solve a problem.
Pseudocode: A simpler version of programming code in plain English that uses short phrases to write code for a program before it is implemented in a specific programming language.
Program: The exact code written for a problem, following all the rules of the programming language.
Algorithm:
An algorithm is used to provide a solution to a particular problem in the form of well-defined steps. Whenever you use a computer to solve a particular problem, the steps which lead to the solution should be properly communicated to the computer. While executing an algorithm on a computer, several operations such as additions and subtractions are combined to perform more complex mathematical operations. Algorithms can be expressed using natural language, flowcharts, etc.
Algorithm of linear search:
1. Start from the leftmost element of arr[] and one by one compare x with each element of arr[].
2. If x matches with an element, return the index.


3. If x doesn’t match with any of the elements, return -1.


Pseudocode:
It is one of the methods which can be used to represent an algorithm for a program. It does not have a specific syntax like any of the programming languages and thus cannot be executed on a computer. There are several formats which are used to write pseudocode, and most of them take down the structures from languages such as C, Lisp, FORTRAN, etc.
Many times algorithms are presented using pseudocode, since they can be read and understood by programmers who are familiar with different programming languages. Pseudocode allows you to include several control structures such as While, If-then-else, Repeat-until, for and case, which are present in many high-level languages.

Note: Pseudocode is not an actual programming language.


Pseudocode for Linear Search:

FUNCTION linearSearch(list, searchTerm):
    FOR index FROM 0 TO length(list) - 1:
        IF list[index] == searchTerm THEN
            RETURN index
        ENDIF
    ENDLOOP
    RETURN -1
END FUNCTION
Program:
A program is a set of instructions for the computer to follow. The machine can’t read a program directly, because it only understands machine code. But you can write the program in a computer language, and then a compiler or interpreter can make it understandable to the computer.
Program for Linear Search :
// C++ code to linearly search x in arr[]. If x
// is present then return its location, otherwise
// return -1
int search(int arr[], int n, int x)
{
    int i;
    for (i = 0; i < n; i++)
        if (arr[i] == x)
            return i;
    return -1;
}


Algorithm vs Pseudocode vs Program:


An algorithm is defined as a well-defined sequence of steps that provides a solution for a given problem, whereas pseudocode is one of the methods that can be used to represent an algorithm. While algorithms are generally written in a natural language or plain English, pseudocode is written in a format that is similar to the structure of a high-level programming language. A program, on the other hand, allows us to write code in a particular programming language.
PROBLEM SOLVING:
Problem solving is the act of defining a problem; determining the cause of the problem;
identifying, prioritizing, and selecting alternatives for a solution; and implementing a solution.

THE PROBLEM-SOLVING PROCESS

In order to effectively manage and run a successful organization, leadership must guide their employees and develop problem-solving techniques. Finding a suitable solution for issues can be accomplished by following the basic four-step problem-solving process and methodology outlined below.


Step 1: Define the problem
 Differentiate fact from opinion
 Specify underlying causes
 Consult each faction involved for information
 State the problem specifically
 Identify what standard or expectation is violated
 Determine in which process the problem lies
 Avoid trying to solve the problem without data

Step 2: Generate alternative solutions
 Postpone evaluating alternatives initially
 Include all involved individuals in the generating of alternatives
 Specify alternatives consistent with organizational goals
 Specify short- and long-term alternatives
 Brainstorm on others' ideas
 Seek alternatives that may solve the problem

Step 3: Evaluate and select an alternative
 Evaluate alternatives relative to a target standard
 Evaluate all alternatives without bias
 Evaluate alternatives relative to established goals
 Evaluate both proven and possible outcomes
 State the selected alternative explicitly

Step 4: Implement and follow up on the solution
 Plan and implement a pilot test of the chosen alternative
 Gather feedback from all affected parties
 Seek acceptance or consensus by all those affected
 Establish ongoing measures and monitoring
 Evaluate long-term results based on final solution

1. Define the problem

Diagnose the situation so that your focus is on the problem, not just its symptoms. Helpful problem-solving techniques include using flowcharts to identify the expected steps of a process and cause-and-effect diagrams to define and analyze root causes.

The sections below help explain key problem-solving steps. These steps support the involvement of interested parties, the use of factual information, comparison of expectations to reality, and a focus on root causes of a problem. You should begin by:

Reviewing and documenting how processes currently work (i.e., who does what, with what information, using what tools, communicating with what organizations and individuals, in what time frame, using what format).
Evaluating the possible impact of new tools and revised policies in the development of your "what should be" model.

2. Generate alternative solutions

Postpone the selection of one solution until several problem-solving alternatives have been proposed. Considering multiple alternatives can significantly enhance the value of your ideal solution. Once you have decided on the "what should be" model, this target standard becomes the basis for developing a road map for investigating alternatives. Brainstorming and team problem-solving techniques are both useful tools in this stage of problem solving.

Many alternative solutions to the problem should be generated before final evaluation. A common mistake in problem solving is that alternatives are evaluated as they are proposed, so the first acceptable solution is chosen, even if it’s not the best fit. If we focus on trying to get the results we want, we miss the potential for learning something new that will allow for real improvement in the problem-solving process.


3. Evaluate and select an alternative

Skilled problem solvers use a series of considerations when selecting the best alternative. They consider the extent to which:

A particular alternative will solve the problem without causing other unanticipated
problems.
All the individuals involved will accept the alternative.
Implementation of the alternative is likely.
The alternative fits within the organizational constraints.

4. Implement and follow up on the solution

Leaders may be called upon to direct others to implement the solution, "sell" the solution, or facilitate the implementation with the help of others. Involving others in the implementation is an effective way to gain buy-in and support and minimize resistance to subsequent changes.

Regardless of how the solution is rolled out, feedback channels should be built into the implementation. This allows for continuous monitoring and testing of actual events against expectations. Problem solving, and the techniques used to gain clarity, are most effective if the solution remains in place and is updated to respond to future changes.

ALGORITHMIC PROBLEM SOLVING:


Algorithmic problem solving is solving problems that require the formulation of an algorithm for the solution.

Understanding the Problem


It is the process of finding the input of the problem that the algorithm solves.
It is very important to specify exactly the set of inputs the algorithm needs to handle.
A correct algorithm is not one that works most of the time, but one that works correctly for all legitimate inputs.
Ascertaining the Capabilities of the Computational Device
If the instructions are executed one after another, it is called a sequential algorithm.
If the instructions are executed concurrently, it is called a parallel algorithm.
Choosing between Exact and Approximate Problem Solving
The next principal decision is to choose between solving the problem exactly or solving it approximately.
Based on this, algorithms are classified as exact algorithms and approximation algorithms.
Deciding on a data structure:
Data structures play a vital role in the design and analysis of algorithms.
Some of the algorithm design techniques also depend on structuring the data that specifies a problem’s instance.
Algorithm + Data Structure = Program.
Algorithm Design Techniques
An algorithm design technique (or “strategy” or “paradigm”) is a general approach to solving problems algorithmically that is applicable to a variety of problems from different areas of computing.
Learning these techniques is of utmost importance for the following reasons.
First, they provide guidance for designing algorithms for new problems.
Second, algorithms are the cornerstone of computer science.
Methods of Specifying an Algorithm
Pseudocode is a mixture of a natural language and programming language-like constructs. Pseudocode is usually more precise than natural language, and its usage often yields more succinct algorithm descriptions.
In the earlier days of computing, the dominant vehicle for specifying algorithms was a flowchart, a method of expressing an algorithm by a collection of connected geometric shapes containing descriptions of the algorithm’s steps.
Pseudocode cannot be fed into an electronic computer directly. Instead, it needs to be converted into a computer program written in a particular computer language. We can look at such a program as yet another way of specifying the algorithm, although it is preferable to consider it as the algorithm’s implementation.
Proving an Algorithm’s Correctness
Once an algorithm has been specified, you have to prove its correctness. That is, you have to prove that the algorithm yields a required result for every legitimate input in a finite amount of time.
A common technique for proving correctness is to use mathematical induction, because an algorithm’s iterations provide a natural sequence of steps needed for such proofs.
It might be worth mentioning that although tracing the algorithm’s performance for a few specific inputs can be a very worthwhile activity, it cannot prove the algorithm’s correctness conclusively. But in order to show that an algorithm is incorrect, you need just one instance of its input for which the algorithm fails.
Analysing an Algorithm
1. Efficiency.
Time efficiency, indicating how fast the algorithm runs.
Space efficiency, indicating how much extra memory it uses.

2. Simplicity.
An algorithm should be precisely defined and investigated with mathematical expressions.
Simpler algorithms are easier to understand and easier to program.
Simple algorithms usually contain fewer bugs.
Coding an Algorithm
Most algorithms are destined to be ultimately implemented as computer programs. Programming an algorithm presents both a peril and an opportunity.
A working program provides an additional opportunity in allowing an empirical analysis of the underlying algorithm. Such an analysis is based on timing the program on several inputs and then analysing the results obtained.

Algorithm Analysis
Time Complexity: The time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the input. Note that the time to run is a function of the length of the input and not the actual execution time of the machine on which the algorithm is running.
Definition:
A valid algorithm takes a finite amount of time for execution. The time required by the algorithm to solve a given problem is called the time complexity of the algorithm. Time complexity is a very useful measure in algorithm analysis.
Example 1: Addition of two scalar variables.

Algorithm ADD SCALAR(A, B)
//Description: Perform arithmetic addition of two numbers
//Input: Two scalar variables A and B
//Output: variable C, which holds the addition of A and B
C <- A + B
return C

The addition of two scalar numbers requires one addition operation. The time complexity of this algorithm is constant, so T(n) = O(1).

In order to calculate the time complexity of an algorithm, it is assumed that a constant time c is taken to execute one operation, and then the total operations for an input length of N are calculated.
Space Complexity:
Definition: Problem-solving using a computer requires memory to hold temporary data or the final result while the program is in execution. The amount of memory required by the algorithm to solve a given problem is called the space complexity of the algorithm.

The space complexity of an algorithm quantifies the amount of space taken by an algorithm to run as a function of the length of the input. Consider an example: Suppose a problem is to find the frequency of array elements.

It is the amount of memory needed for the completion of an algorithm. To estimate the memory requirement we need to focus on two parts:

(1) A fixed part: It is independent of the input size. It includes memory for instructions (code), constants, variables, etc.

(2) A variable part: It is dependent on the input size. It includes memory for the recursion stack, referenced variables, etc.

Example: Addition of two scalar variables

Algorithm ADD SCALAR(A, B)
//Description: Perform arithmetic addition of two numbers
//Input: Two scalar variables A and B
//Output: variable C, which holds the addition of A and B
C <- A + B
return C

The addition of two scalar numbers requires one extra memory location to hold the result. Thus the space complexity of this algorithm is constant, hence S(n) = O(1).

Complexity analysis is defined as a technique to characterize the time taken by an algorithm with respect to input size (independent of the machine, language and compiler). It is used for evaluating the variations of execution time on different algorithms.

Need of Complexity Analysis:


● Complexity Analysis determines the amount of time and space resources required to execute an algorithm.
● It is used for comparing different algorithms on different input sizes.


● Complexity helps to determine the difficulty of a problem.

● It is often measured by how much time and space (memory) it takes to solve a particular problem.

Asymptotic Notations in Complexity Analysis:

1. Big O Notation
Big-O notation represents the upper bound of the running time of an algorithm. Therefore, it gives the worst-case complexity of an algorithm. By using big-O notation, we can asymptotically limit the expansion of a running time to a range of constant factors above and below. It is a model for quantifying algorithm performance.

Mathematical Representation of Big-O Notation:


O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0 }
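For a quick worked check of this definition (an illustrative example, not from the original notes), take f(n) = 3n + 2:

3n + 2 ≤ 4n for all n ≥ 2 (since 3n + 2 ≤ 4n is equivalent to n ≥ 2)
so with c = 4 and n0 = 2, 3n + 2 = O(n).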

2. Omega Notation
Omega notation represents the lower bound of the running time of an
algorithm. Thus, it provides the best-case complexity of an algorithm.
The execution time serves as a lower bound on the algorithm’s time complexity. It is defined as the condition that allows an algorithm to complete statement execution in the shortest amount of time.


Mathematical Representation of Omega Notation:


Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0 }
Note: Ω (g) is a set

3. Theta Notation
Theta notation encloses the function from above and below. Since it represents the upper and the lower bound of the running time of an algorithm, it is used for analyzing the average-case complexity of an algorithm. The execution time serves as both a lower and upper bound on the algorithm’s time complexity. It exists as both the most and least boundary for a given input value.

Mathematical Representation:
Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0 }
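The same function gives a worked check here (illustrative, not from the original notes): for f(n) = 3n + 2,

3n ≤ 3n + 2 ≤ 4n for all n ≥ 2,
so with c1 = 3, c2 = 4 and n0 = 2, 3n + 2 = Θ(n).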

4. Little ο asymptotic notation


Big-O is used as a tight upper bound on the growth of an algorithm’s effort (this effort is described by the function f(n)), even though, as written, it can also be a loose upper bound. “Little-o” (o()) notation is used to describe an upper bound that cannot be tight.
Mathematical Representation:

f(n) = o(g(n)) means lim_{n→∞} f(n)/g(n) = 0
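A worked instance of this limit definition (illustrative, not from the original notes): n = o(n²), because lim_{n→∞} n/n² = lim_{n→∞} 1/n = 0; but n ≠ o(n), because lim_{n→∞} n/n = 1 ≠ 0, so little-o rules out tight bounds.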


5. Little ω asymptotic notation


Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c*g(n) ≥ 0 for every integer n ≥ n0.
Mathematical Representation:
if f(n) ∈ ω(g(n)) then lim_{n→∞} f(n)/g(n) = ∞
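Correspondingly (an illustrative check, not from the original notes): n² = ω(n), because lim_{n→∞} n²/n = lim_{n→∞} n = ∞; but n ≠ ω(n), since lim_{n→∞} n/n = 1, which is not ∞.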

Measure of Complexity:
The complexity of an algorithm can be measured in three ways:

1. Time Complexity
The time complexity of an algorithm is defined as the amount of time taken by an algorithm to run as a function of the length of the input. Note that the time to run is a function of the length of the input and not the actual execution time of the machine on which the algorithm is running.

How is Time complexity computed?

To estimate the time complexity, we need to consider the cost of each fundamental
instruction and the number of times the instruction is executed.

If we have statements with basic operations like comparisons, return statements, assignments, and reading a variable, we can assume they take constant time each, O(1).

statement 1: int a = 5;                // reading a variable
statement 2: if (a == 5) return true;  // return statement
statement 3: int x = (4 > 5) ? 1 : 0;  // comparison
statement 4: bool flag = true;         // assignment

The overall time complexity is then calculated as follows:

total time = time(statement1) + time(statement2) + ... + time(statementN)


Assuming that n is the size of the input, let’s use T(n) to represent the overall time and t to represent the amount of time that a statement or collection of statements takes to execute.

T(n) = t(statement1) + t(statement2) + ... + t(statementN)

Overall, T(n) = O(1), which means constant complexity.


For any loop, we find out the runtime of the block inside it and multiply it by the number of times the program will repeat the loop.

for (int i = 0; i < n; i++) {
    cout << "GeeksForGeeks" << endl;
}

For the above example, the loop will execute n times, printing “GeeksForGeeks” n times, so the time taken to run this program is:

T(n) = n * (t(cout statement))
     = n * O(1)
     = O(n), linear complexity.

For 2D arrays, we would have nested loops, which means a loop inside a loop.

for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
        cout << "GeeksForGeeks" << endl;
    }
}

For the above example, the cout statement will execute n*m times, printing “GeeksForGeeks” n*m times, so the time taken to run this program is:

T(n) = n * m * (t(cout statement))
     = n * m * O(1)
     = O(n*m), quadratic complexity.

2. Space Complexity:
The amount of memory required by the algorithm to solve a given problem is called the space complexity of the algorithm. Problem-solving using a computer requires memory to hold temporary data or the final result while the program is in execution.

How is space complexity computed?

The space complexity of an algorithm is the total space taken by the algorithm with respect to the input size. Space complexity includes both auxiliary space and the space used by the input.
Space complexity is a parallel concept to time complexity. If we need to create an array of size n, this will require O(n) space. If we create a two-dimensional array of size n*n, this will require O(n²) space.
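A minimal C++ sketch of these two cases (illustrative only; the variable names are not from the notes):

#include <vector>

int main() {
    int n = 1000;
    std::vector<int> a(n);                      // one array of size n -> O(n) space
    std::vector<std::vector<int>> b(n, std::vector<int>(n)); // n x n matrix -> O(n^2) space
    return 0;
}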

In recursive calls, stack space also counts. Example:

int add(int n) {
    if (n <= 0) {
        return 0;
    }
    return n + add(n - 1);
}
Here each call adds a level to the stack:
1. add(4)
2. -> add(3)
3. -> add(2)
4. -> add(1)
5. -> add(0)
Each of these calls is added to the call stack and takes up actual memory. So it takes O(n) space.
However, just because you have n calls in total doesn’t mean it takes O(n) space.

Look at the below function:

int pairSum(int x, int y); // forward declaration

int addSequence(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += pairSum(i, i + 1);
    }
    return sum;
}

int pairSum(int x, int y) {
    return x + y;
}
There will be roughly O(n) calls to pairSum. However, those calls do not
exist simultaneously on the call stack, so you only need O(1) space.

3. Auxiliary Space :

The temporary space needed for the use of an algorithm is referred to as auxiliary space, like temporary arrays, pointers, etc.
It is preferable to use auxiliary space when comparing things like sorting algorithms. For example, sorting algorithms take O(n) space, as there is an input array to sort, but the auxiliary space is O(1) in that case.

Optimizing the time and space complexity of an algorithm:


Optimization means modifying the brute-force approach to a problem. It is done to derive the best possible solution to solve the problem so that it will take less time and space complexity. We can optimize a program by either limiting the search space at each step or occupying less search space from the start.
We can optimize a solution using both time and space optimization. To optimize a program (a concrete sketch follows this list),
● we can reduce the time taken to run the program and increase the space occupied;
● we can reduce the memory usage of the program and increase its total run time; or
● we can reduce both time and space complexity by deploying relevant algorithms.
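As a concrete sketch of the first trade-off (spending memory to save time), here is a hedged illustration using the pair-sum problem discussed later in this unit; the function name pairExists and the hash-set approach are illustrative, not from the notes. The brute-force scan takes O(n²) time and O(1) extra space; this version takes O(n) expected time at the cost of O(n) extra space.

#include <unordered_set>
#include <vector>

// Returns true if some pair of elements in a sums to z.
// Extra space: the hash set of already-seen values -> O(n).
bool pairExists(const std::vector<int>& a, int z) {
    std::unordered_set<int> seen;
    for (int x : a) {
        if (seen.count(z - x)) return true; // complement of x already seen
        seen.insert(x);
    }
    return false;
}

int main() {
    std::vector<int> v{3, 8, 1, 6};
    return pairExists(v, 9) ? 0 : 1; // 3 + 6 == 9 -> true
}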

Time Complexity:
A valid algorithm takes a finite amount of time for execution. The time required by the algorithm to solve a given problem is called the time complexity of the algorithm. Time complexity is a very useful measure in algorithm analysis. It is the time needed for the completion of an algorithm. To estimate the time complexity, we need to consider the cost of each fundamental instruction and the number of times the instruction is executed.



In order to calculate the time complexity of an algorithm, it is assumed that a constant time c is taken to execute one operation, and then the total operations for an input length of N are calculated. Consider an example to understand the process of calculation: Suppose a problem is to find whether a pair (X, Y) exists in an array A of N elements whose sum is Z. The simplest idea is to consider every pair and check if it satisfies the given condition or not.

The pseudo-code is as follows:

int a[n];
for (int i = 0; i < n; i++)
    cin >> a[i];

for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        if (i != j && a[i] + a[j] == z)
            return true;
return false;
Assuming that each of the operations in the computer takes approximately constant time, let it be c. The number of lines of code executed actually depends on the value of Z. During analysis of the algorithm, mostly the worst-case scenario is considered, i.e., when there is no pair of elements with sum equal to Z. In the worst case:

N*c operations are required for input.
The outer loop (the i loop) runs N times.
For each i, the inner loop (the j loop) runs N times.

So the total execution time is N*c + N*N*c + c. Now ignore the lower order terms, since they are relatively insignificant for large input; therefore only the highest order term is taken (without the constant), which is N*N in this case. Different notations are used to describe the limiting behavior of a function, but since the worst case is taken, big-O notation will be used to represent the time complexity.

Hence, the time complexity is O(N²) for the above algorithm. Note that the time complexity is solely based on the number of elements in array A, i.e. the input length, so if the length of the array increases, the time of execution will also increase.

Order of growth is how the time of execution depends on the length of the input. In the above example, it is clearly evident that the time of execution quadratically depends on the length of the array. Order of growth will help to compute the running time with ease.
Some general time complexities are listed below with the input range for which they are accepted in competitive programming:


Input Length | Worst Accepted Time Complexity | Usual Types of Solutions
10-12        | O(N!)                          | Recursion and backtracking
15-18        | O(2^N * N)                     | Recursion, backtracking, and bit manipulation
18-22        | O(2^N * N)                     | Recursion, backtracking, and bit manipulation
30-40        | O(2^(N/2) * N)                 | Meet in the middle, Divide and Conquer
100          | O(N^4)                         | Dynamic programming, Constructive
400          | O(N^3)                         | Dynamic programming, Constructive
2K           | O(N^2 * log N)                 | Dynamic programming, Binary Search, Sorting, Divide and Conquer
10K          | O(N^2)                         | Dynamic programming, Graph, Trees, Constructive
1M           | O(N * log N)                   | Sorting, Binary Search, Divide and Conquer
100M         | O(N), O(log N), O(1)           | Constructive, Mathematical, Greedy Algorithms

Space Complexity:
Problem-solving using a computer requires memory to hold temporary data or the final result while the program is in execution. The amount of memory required by the algorithm to solve a given problem is called the space complexity of the algorithm.

The space complexity of an algorithm quantifies the amount of space taken by an algorithm to run as a function of the length of the input. Consider an example: Suppose a problem is to find the frequency of array elements.

It is the amount of memory needed for the completion of an algorithm. To estimate the memory requirement we need to focus on two parts:

(1) A fixed part: It is independent of the input size. It includes memory for instructions (code), constants, variables, etc.

(2) A variable part: It is dependent on the input size. It includes memory for the recursion stack, referenced variables, etc.


The pseudo-code is as follows:


int freq[n];
int a[n];

for (int i = 0; i < n; i++)
{
    cin >> a[i];
    freq[a[i]]++;
}
There is also auxiliary space, which is different from space complexity. The main difference is that where space complexity quantifies the total space used by the algorithm, auxiliary space quantifies the extra space that is used in the algorithm apart from the given input. In the above example, the auxiliary space is the space used by the freq[] array, because that is not part of the given input. So the total auxiliary space is N * c + c, which is O(N) only.

Asymptotic Notations:

Asymptotic Notations are mathematical tools used to analyze the performance of algorithms by understanding how their efficiency changes as the input size grows. These notations provide a concise way to express the behavior of an algorithm’s time or space complexity as the input size approaches infinity. Rather than comparing algorithms directly, asymptotic analysis focuses on understanding the relative growth rates of algorithms’ complexities. It enables comparisons of algorithms’ efficiency by abstracting away machine-specific constants and implementation details, focusing instead on fundamental trends.
Asymptotic analysis allows for the comparison of algorithms’ space and time complexities by examining their performance characteristics as the input size varies. By using asymptotic notations, such as Big O, Big Omega, and Big Theta, we can categorize algorithms based on their worst-case, best-case, or average-case time or space complexities, providing valuable insights into their efficiency.

There are mainly three asymptotic notations:


1. Big-O Notation (O-notation)
2. Omega Notation (Ω-notation)
3. Theta Notation (Θ-notation)

1. Theta Notation (Θ-Notation):


Theta notation encloses the function from above and below. Since it represents the
upper and the lower bound of the running time of an algorithm, it is used for analyzing the
average-case complexity of an algorithm.


Theta (Average Case): You add the running times for each possible input combination and take the average in the average case.

Let g and f be functions from the set of natural numbers to itself. The function f is said to be Θ(g), if there are constants c1, c2 > 0 and a natural number n0 such that c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0.

Mathematical Representation of Theta notation:


Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0 }
Note: Θ(g) is a set.

The above expression can be described as: if f(n) is theta of g(n), then the value f(n) is always between c1*g(n) and c2*g(n) for large values of n (n ≥ n0). The definition of theta also requires that f(n) must be non-negative for values of n greater than n0.

The execution time serves as both a lower and upper bound on the algorithm’s time complexity.

It exists as both the most and least boundary for a given input value.

A simple way to get the Theta notation of an expression is to drop low-order terms and ignore leading constants. For example, consider the expression 3n³ + 6n² + 6000 = Θ(n³); dropping the lower order terms is always fine because there will always be a number n after which Θ(n³) has higher values than Θ(n²), irrespective of the constants involved. For a given function g(n), we denote by Θ(g(n)) the following set of functions.

Examples :

{ 100, log(2000), 10^4 } belongs to Θ(1)
{ (n/4), (2n+3), (n/100 + log(n)) } belongs to Θ(n)
{ (n²+n), (2n²), (n²+log(n)) } belongs to Θ(n²)
Note: Θ provides exact bounds.

2. Big-O Notation (O-notation):


Big-O notation represents the upper bound of the running time of an algorithm. Therefore, it gives the worst-case complexity of an algorithm.

● It is the most widely used notation for asymptotic analysis.
● It specifies the upper bound of a function.
● The maximum time required by an algorithm, or the worst-case time complexity.
● It returns the highest possible output value (big-O) for a given input.
● Big-O (Worst Case): It is defined as the condition that allows an algorithm to complete statement execution in the longest amount of time possible.

If f(n) describes the running time of an algorithm, f(n) is O(g(n)) if there exist a positive constant c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0.

It returns the highest possible output value (big-O) for a given input.

The execution time serves as an upper bound on the algorithm’s time complexity.

Mathematical Representation of Big-O Notation:


O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0 }

For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that the time complexity of Insertion Sort is O(n²).
Note: O(n²) also covers linear time.

If we use Θ notation to represent the time complexity of Insertion Sort, we have to use two statements for the best and worst cases:

● The worst-case time complexity of Insertion Sort is Θ(n²).
● The best-case time complexity of Insertion Sort is Θ(n).
● The Big-O notation is useful when we only have an upper bound on the time complexity of an algorithm. Many times we easily find an upper bound by simply looking at the algorithm.
Examples :

{ 100, log(2000), 10^4 } belongs to O(1)
U { (n/4), (2n+3), (n/100 + log(n)) } belongs to O(n)
U { (n²+n), (2n²), (n²+log(n)) } belongs to O(n²)

Note: Here, U represents union; we can write it in this manner because O provides exact or upper bounds.

3. Omega Notation (Ω-Notation):


Omega notation represents the lower bound of the running time of an algorithm. Thus, it provides the best-case complexity of an algorithm.

The execution time serves as a lower bound on the algorithm’s time complexity.


It is defined as the condition that allows an algorithm to complete statement execution in the shortest amount of time.

Let g and f be functions from the set of natural numbers to itself. The function f is said to be Ω(g), if there is a constant c > 0 and a natural number n0 such that c*g(n) ≤ f(n) for all n ≥ n0.

Mathematical Representation of Omega notation :


Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0 }

Let us consider the same Insertion Sort example here. The time complexity of Insertion Sort can be written as Ω(n), but it is not very useful information about Insertion Sort, as we are generally interested in the worst case and sometimes in the average case.

Examples :

{ (n²+n), (2n²), (n²+log(n)) } belongs to Ω(n²)
U { (n/4), (2n+3), (n/100 + log(n)) } belongs to Ω(n)
U { 100, log(2000), 10^4 } belongs to Ω(1)

Note: Here, U represents union; we can write it in this manner because Ω provides exact or lower bounds.

Properties of Asymptotic Notations:

1. General Properties:
If f(n) is O(g(n)) then a*f(n) is also O(g(n)), where a is a constant.

Example:
f(n) = 2n²+5 is O(n²)
then, 7*f(n) = 7(2n²+5) = 14n²+35 is also O(n²).

Similarly, this property satisfies both Θ and Ω notation. We can say:
If f(n) is Θ(g(n)) then a*f(n) is also Θ(g(n)), where a is a constant.
If f(n) is Ω(g(n)) then a*f(n) is also Ω(g(n)), where a is a constant.

2. Transitive Properties:
If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) = O(h(n)).
f(n) = O(g(n)) and g(n) = O(h(n)) ⇒ f(n) = O(h(n))

Example:


If f(n) = n, g(n) = n² and h(n) = n³:
n is O(n²) and n² is O(n³), then n is O(n³)

Similarly, this property satisfies both Θ and Ω notation. We can say:

If f(n) is Θ(g(n)) and g(n) is Θ(h(n)) then f(n) = Θ(h(n)).
If f(n) is Ω(g(n)) and g(n) is Ω(h(n)) then f(n) = Ω(h(n)).

3. Reflexive Properties:
Reflexive properties are always easy to understand after transitive.
If f(n) is given then f(n) is O(f(n)), since the maximum value of f(n) will be f(n) itself!
f(n) = O(f(n))
Hence x = f(n) and y = O(f(n)) always tie themselves in a reflexive relation.

Example:


Similarly, this property satisfies both Θ and Ω notation. We can say that:

If f(n) is given then f(n) is Θ(f(n)).
If f(n) is given then f(n) is Ω(f(n)).

4. Symmetric Properties:
If f(n) is Θ(g(n)) then g(n) is Θ(f(n)).
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))

Example:
If f(n) = n² and g(n) = n²
then f(n) = Θ(n²) and g(n) = Θ(n²)

This property is satisfied only by Θ notation.

5. Transpose Symmetric Properties:
If f(n) is O(g(n)) then g(n) is Ω(f(n)).

Example:
If f(n) = n, g(n) = n²
then n is O(n²) and n² is Ω(n)

This property is satisfied only by O and Ω notations.

6. Some More Properties:


1. If f(n) = O(g(n)) and f(n) = Ω(g(n)) then f(n) = Θ(g(n)).
2. If f(n) = O(g(n)) and d(n) = O(e(n)) then f(n) + d(n) = O(max(g(n), e(n))).

Example:


f(n) = n, i.e. O(n)
d(n) = n², i.e. O(n²)
then f(n) + d(n) = n + n², i.e. O(n²)

3. If f(n) = O(g(n)) and d(n) = O(e(n)) then f(n) * d(n) = O(g(n) * e(n)).

Example:
f(n) = n, i.e. O(n)
d(n) = n², i.e. O(n²)
then f(n) * d(n) = n * n² = n³, i.e. O(n³)

Note: If f(n) = O(g(n)) then g(n) = Ω(f(n)).

Measurement of Complexity of an Algorithm


Based on the above three notations of Time Complexity there are three cases to
analyze an algorithm:

1. Worst Case Analysis (Mostly used)


In the worst-case analysis, we calculate the upper bound on the running time of an algorithm. We must know the case that causes a maximum number of operations to be executed. For Linear Search, the worst case happens when the element to be searched (x) is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the worst-case time complexity of linear search would be O(n).

2. Best Case Analysis (Very Rarely used)


In the best-case analysis, we calculate the lower bound on the running time of an algorithm. We must know the case that causes a minimum number of operations to be executed. In the linear search problem, the best case occurs when x is present at the first location. The number of operations in the best case is constant (not dependent on n). So the time complexity in the best case would be Θ(1).

3. Average Case Analysis (Rarely used)


In average case analysis, we take all possible inputs and calculate the computing time for all of the inputs. Sum all the calculated values and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case of x not being present in the array). So we sum all the cases and divide the sum by (n+1). Following is the value of the average-case time complexity.

Average Case Time = (Σ_{i=1}^{n+1} θ(i)) / (n+1) = θ((n+1)(n+2)/2) / (n+1) = θ(n)


Below is the ranked mention of complexity analysis notations based on popularity:

1. Worst Case Analysis:


Most of the time, we do worst-case analysis to analyze algorithms. In the worst-case analysis, we guarantee an upper bound on the running time of an algorithm, which is good information.

2. Average Case Analysis


The average case analysis is not easy to do in most practical cases and it is rarely done. In the average case analysis, we must know (or predict) the mathematical distribution of all possible inputs.

3. Best Case Analysis


The best case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn’t provide any information, as in the worst case an algorithm may take years to run.

Examples:
1. Linear search algorithm:
// C++ implementation of the approach
#include <bits/stdc++.h>
using namespace std;

// Linearly search x in arr[].
// If x is present then return the index,
// otherwise return -1
int search(int arr[], int n, int x)
{
    int i;
    for (i = 0; i < n; i++) {
        if (arr[i] == x)
            return i;
    }
    return -1;
}

// Driver's code
int main()
{
    int arr[] = { 1, 10, 30, 15 };
    int x = 30;
    int n = sizeof(arr) / sizeof(arr[0]);

    // Function call
    cout << x << " is present at index "
         << search(arr, n, x);

    return 0;
}
Output
30 is present at index 2

Time Complexity Analysis: (In Big-O notation)

Best Case: O(1). This will take place if the element to be searched is at the first index of the given list. So the number of comparisons, in this case, is 1.
Average Case: O(n). This will take place if the element to be searched is at the middle index of the given list.
Worst Case: O(n). This will take place if:
The element to be searched is at the last index.
The element to be searched is not present in the list.

Advantages:
● This technique allows developers to understand the performance of algorithms under different scenarios, which can help in making informed decisions about which algorithm to use for a specific task.
● Worst case analysis provides a guarantee on the upper bound of the running time of an algorithm, which can help in designing reliable and efficient algorithms.
● Average case analysis provides a more realistic estimate of the running time of an algorithm, which can be useful in real-world scenarios.

Disadvantages:
● This technique can be time-consuming and requires a good understanding of the algorithm being analyzed.
● Worst case analysis does not provide any information about the typical running time of an algorithm, which can be a disadvantage in real-world scenarios.


● Average case analysis requires knowledge of the probability distribution of input data, which may not always be available.

Important points:
● The worst case analysis of an algorithm provides an upper bound on the running time of the algorithm for any input size.
● The average case analysis of an algorithm provides an estimate of the running time of the algorithm for a random input.
● The best case analysis of an algorithm provides a lower bound on the running time of the algorithm for any input size.
● The big O notation is commonly used to express the worst case running time of an algorithm.
● Different algorithms may have different best, average, and worst case running times.

Recurrence Relation
A recurrence is an equation or inequality that describes a function in terms of its values on smaller inputs. To solve a recurrence relation means to obtain a function defined on the natural numbers that satisfies the recurrence.

There are four methods for solving Recurrence:


1. Substitution Method
2. Iteration Method
3. Recursion Tree Method
4. Master Method

1. Substitution Method:
The Substitution Method consists of two main steps:
● Guess the solution.
● Use mathematical induction to find the boundary condition and show that the guess is correct.

For example, solve the following equation by the Substitution Method:

T(n) = T(n/2) + 1

We have to show that it is asymptotically bounded by O(log n).

Solution:

For T(n) = O(log n), we have to show that for some constant c,

T(n) ≤ c log n.

Put this in the given recurrence equation:

T(n) ≤ c log(n/2) + 1
     = c log n - c log 2 + 1
     ≤ c log n for c ≥ 1

Thus T(n) = O(log n).

Example 2: Consider the recurrence

T(n) = 2T(n/2) + n, n > 1. Find an asymptotic bound on T.

Solution:

We guess the solution is O(n log n). Thus for constant c, T(n) ≤ c n log n.

Put this in the given recurrence equation:

T(n) ≤ 2c(n/2) log(n/2) + n
     ≤ c n log n - c n log 2 + n
     = c n log n - n(c log 2 - 1)
     ≤ c n log n for c ≥ 1

Thus T(n) = O(n log n).
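The guess in Example 2 can also be checked numerically (a small illustrative sketch, not part of the substitution method itself; the function name T below is just a stand-in for the recurrence): compute T(n) = 2T(n/2) + n directly and compare it with c*n*log₂(n) for c = 2.

#include <cmath>
#include <iostream>

// T(n) = 2T(n/2) + n with T(1) = 1, using integer halving.
long long T(long long n) {
    if (n <= 1) return 1;
    return 2 * T(n / 2) + n;
}

int main() {
    for (long long n = 2; n <= (1 << 16); n *= 2) {
        double bound = 2.0 * n * std::log2(n); // c = 2 in the guess c*n*log n
        std::cout << "n = " << n << "  T(n) = " << T(n)
                  << "  2*n*log2(n) = " << bound << '\n';
    }
    return 0;
}

For every printed n, T(n) stays below the bound, which is consistent with (though of course not a proof of) T(n) = O(n log n).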
2. Iteration Method
It means to expand the recurrence and express it as a summation of terms of n and the initial condition.

Example 1: Consider the recurrence

T(n) = 1          if n = 1
     = 2T(n-1)    if n > 1

Solution:

T(n) = 2T(n-1)
     = 2[2T(n-2)] = 2^2 T(n-2)
     = 4[2T(n-3)] = 2^3 T(n-3)
     = 8[2T(n-4)] = 2^4 T(n-4)    (Eq. 1)


Repeat the procedure for i times:

T(n) = 2^i T(n-i)

Put n-i = 1, i.e. i = n-1, in (Eq. 1):

T(n) = 2^(n-1) T(1)
     = 2^(n-1) * 1    {T(1) = 1 ... given}
     = 2^(n-1)
Example 2: Consider the recurrence

T(n) = T(n-1) + 1 and T(1) = θ(1).

Solution:

T(n) = T(n-1) + 1
     = (T(n-2) + 1) + 1 = T(n-2) + 2
     = (T(n-3) + 1) + 2 = T(n-3) + 3
     = T(n-4) + 4 = T(n-5) + 5
     = ... = T(n-k) + k
where k = n-1:
T(n-k) = T(1) = θ(1)
T(n) = θ(1) + (n-1) = 1 + n - 1 = n = θ(n).

Recursion Tree Method


Recursion is a fundamental concept in computer science and mathematics that allows functions to call themselves, enabling the solution of complex problems through iterative steps. One visual representation commonly used to understand and analyze the execution of recursive functions is a recursion tree. In this section, we will explore the theory behind recursion trees, their structure, and their significance in understanding recursive algorithms.

What is a Recursion Tree?


A recursion tree is a graphical representation that illustrates the execution flow of a
recursive function. It provides a visual breakdown of recursive calls, showcasing the
progression of the algorithm as it branches out and eventually reaches a base case. The
tree structure helps in analyzing the time complexity and understanding the recursive
process involved.

Tree Structure
Each node in a recursion tree represents a particular recursive call. The initial call is depicted at the top, with subsequent calls branching out beneath it. The tree grows downward, forming a hierarchical structure. The branching factor of each node depends on the number of recursive calls made within the function. Additionally, the depth of the tree corresponds to the number of recursive calls before reaching the base case.

Base Case
The base case serves as the termination condition for a recursive function. It defines the point at which the recursion stops and the function starts returning values. In a recursion tree, the nodes representing the base case are usually depicted as leaf nodes, as they do not result in further recursive calls.

Recursive Calls
The child nodes in a recursion tree represent the recursive calls made within the function. Each child node corresponds to a separate recursive call, resulting in the creation of new sub-problems. The values or parameters passed to these recursive calls may differ, leading to variations in the sub-problems' characteristics.

Execution Flow:
Traversing a recursion tree provides insights into the execution flow of a recursive function. Starting from the initial call at the root node, we follow the branches to reach subsequent calls until we encounter the base case. As the base cases are reached, the recursive calls start to return, and their respective nodes in the tree are marked with the returned values. The traversal continues until the entire tree has been traversed.

Time Complexity Analysis


Recursion trees aid in analyzing the time complexity of recursive algorithms. By examining the structure of the tree, we can determine the number of recursive calls made and the work done at each level. This analysis helps in understanding the overall efficiency of the algorithm and identifying any potential inefficiencies or opportunities for optimization.

Introduction
Think of a program that determines a number's factorial. This function takes a number N as input and returns the factorial of N as a result. This function's pseudo-code will resemble:

// find factorial of a number
factorial(n) {
    // Base case
    if n is less than 2:    // factorial of 0 or 1 is 1
        return n
    // Recursive step
    return n * factorial(n-1)   // factorial of 5 => 5 * factorial(4) ...
}

/* How function calls are made:

Factorial(5) [ 120 ]
|
5 * Factorial(4) ==> 120
|
4 * Factorial(3) ==> 24
|
3 * Factorial(2) ==> 6
|
2 * Factorial(1) ==> 2
|
1

*/
Recursion is exemplified by the function above. We invoke a function to determine a number's factorial. Then, given a lesser value of the same number, this function calls itself. This continues until we reach the base case, in which there are no more function calls.
Recursion is a technique for handling complicated problems when the outcome is dependent on the outcomes of smaller instances of the same problem.
If we think about functions, a function is said to be recursive if it keeps calling itself until it reaches the base case.
Any recursive function has two primary components: the base case and the recursive step. We stop going into the recursive step once we reach the base case. To prevent endless recursion, base cases must be properly defined; they are crucial. Infinite recursion is recursion that never reaches the base case. If a program never reaches the base case, stack overflow will occur.
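A runnable C++ counterpart of the factorial pseudo-code above (a minimal sketch; it returns 1 in the base case and does no error handling for negative inputs):

#include <iostream>

// Base case: factorial of 0 or 1 is 1.
// Recursive step: n! = n * (n-1)!
long long factorial(int n) {
    if (n < 2) return 1;
    return n * factorial(n - 1);
}

int main() {
    std::cout << factorial(5) << '\n'; // prints 120
    return 0;
}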
Recursion Types
Generally speaking, there are two different forms of recursion:
1. Linear Recursion
2. Tree Recursion

Linear Recursion:

A function that calls itself just once each time it executes is said to be linearly recursive. A nice illustration of linear recursion is the factorial function. The name "linear recursion" refers to the fact that a linearly recursive function takes a linear amount of time to execute.

Take a look at the pseudo-code below:

function doSomething(n) {
    // base case to stop recursion
    if n is 0:
        return
    // here are some instructions
    // recursive step
    doSomething(n-1)
}
If we look at the function doSomething(n), it accepts a parameter named n and does some calculations before calling the same procedure once more but with lower values.
When the method doSomething() is called with the argument value n, let's say that T(n) represents the total amount of time needed to complete the computation. For this, we can also formulate a recurrence relation, T(n) = T(n-1) + K. K serves as a constant here. Constant K is included because it takes time for the function to allocate or de-allocate memory to a variable or perform a mathematical operation. We use K to define the time since it is so minute and insignificant.
This recursive program's time complexity may be simply calculated since, in the worst scenario, the method doSomething() is called n times. Formally speaking, the function's time complexity is O(N).

Tree Recursion
When you make a recursive call in your recursive case more than once, it is referred to as tree recursion. An effective illustration of tree recursion is the Fibonacci sequence. Tree recursive functions operate in exponential time; they are not linear in their time complexity.

Take a look at the pseudo-code below:

function doSomething(n) {
    // base case to stop recursion
    if n is less than 2:
        return n
    // here are some instructions
    // recursive step
    return doSomething(n-1) + doSomething(n-2)
}
The only difference between this code and the previous one is that this one makes one more call to the same function with a lower value of n.
Let's put T(n) = T(n-1) + T(n-2) + K as the recurrence relation for this function. K serves as a constant once more.
When more than one call to the same function with smaller values is performed, this sort of recursion is known as tree recursion. The intriguing question is now: how time-consuming is this function?
Take a guess based on the recursion tree below for the same function.

[Figure: recursion tree for the tree-recursive function above]
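A runnable C++ version of this tree-recursive pattern, using the Fibonacci numbers mentioned above (a minimal sketch):

#include <iostream>

// Tree recursion: two self-calls per invocation, so the call tree
// doubles at each level and the running time grows exponentially.
long long fib(int n) {
    if (n < 2) return n;             // base case
    return fib(n - 1) + fib(n - 2);  // two recursive calls -> tree recursion
}

int main() {
    std::cout << fib(10) << '\n'; // prints 55
    return 0;
}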


It may occur to you that it is challenging to estimate the time complexity by looking directly at a recursive function, particularly when it is a tree recursion. The Recursion Tree Method is one of several techniques for calculating the time complexity of such functions. Let's examine it in further detail.

What Is Recursion Tree Method?


Recurrence relations like T(N) = T(N/2) + N, or the two we covered earlier in the types-of-recursion section, are solved using the recursion tree approach. These recurrence relations often use a divide-and-conquer strategy to address problems. It takes time to integrate the answers to the smaller sub-problems that are created when a larger problem is broken down into smaller sub-problems.
The recurrence relation, for instance, is T(N) = 2*T(N/2) + O(N) for Merge Sort. The time needed to combine the answers to two sub-problems, each of size T(N/2), is O(N), which is true at the implementation level as well.

35
Course Code/Title:CS3302/Design Analysis of Algorithms

For instance, since the recurrence relation for binary search is T(N) = T(N/2) + 1, we know that each iteration of binary search results in a search space that is cut in half. Once the outcome is determined, we exit the function. The recurrence relation has +1 added because this is a constant-time operation.
The recurrence relation T(n) = 2T(n/2) + Kn is one to consider. Kn denotes the amount of time required to combine the answers to sub-problems of size n/2.
Let's depict the recursion tree for the aforementioned recurrence relation.

[Figure: recursion tree for T(n) = 2T(n/2) + Kn]


We may draw a few conclusions from studying the recursion tree above, including:

1. The magnitude of the problem at each level is all that matters for determining the value of a node. The problem size is n at level 0, n/2 at level 1, n/4 at level 2, and so on.

2. In general, we define the height of the tree as equal to log(n), where n is the size of the problem, and the height of this recursion tree is equal to the number of levels in the tree. This is true because, as we just established, the divide-and-conquer strategy is used by recurrence relations to solve problems, and getting from problem size n to problem size 1 simply requires taking log(n) steps.

Consider the value N = 16, for instance. If we are permitted to divide N by 2 at each step, how many steps are required to get N = 1? Considering that we are dividing by two at each step, the correct answer is 4, which is the value of log(16) base 2.

log(16) base 2
= log(2^4) base 2
= 4 * log(2) base 2, since log(a) base a = 1
so, 4 * log(2) base 2 = 4

3. At each level, the second term in the recurrence is regarded as the root.

Although the word "tree" appears in the name of this strategy, you don't need to be an expert on trees to comprehend it.

How to Use a Recursion Tree to Solve Recurrence Relations?


In the recursion tree technique, the cost of a sub-problem is the amount of time needed to solve that sub-problem. Therefore, if you notice the phrase "cost" linked with the recursion tree, it simply refers to the amount of time needed to solve a certain sub-problem.

Let's understand all of these steps with a few examples.

Example

Consider the recurrence relation:

T(n) = 2T(n/2) + K

Solution

The given recurrence relation shows the following properties,

A problem of size n is divided into two sub-problems, each of size n/2. The cost of combining the solutions to these sub-problems is K.

Each problem of size n/2 is divided into two sub-problems, each of size n/4, and so on.


At the last level, the sub-problem size will be reduced to 1. In other words, we finally hit
the base case.

Let's follow the steps to solve this recurrence relation.

Step 1: Draw the Recursion Tree

[Figure: recursion tree for the recurrence T(n) = 2T(n/2) + K]


Step 2: Calculate the Height of the Tree

We know that when we continuously divide a number by 2, it is eventually reduced to 1. The same holds for the problem size N: suppose that after k divisions by 2, N becomes equal to 1, which implies (n / 2^k) = 1.

Here n / 2^k is the problem size at the last level and it is always equal to 1.

Now we can easily calculate the value of k from the above expression by taking log() on both sides. Below is a clearer derivation:

n = 2^k

log(n) = log(2^k)
log(n) = k * log(2)


k = log(n) / log(2) = log(n) base 2


So the height of the tree is log(n) base 2.

Step 3: Calculate the cost at each level

Cost at Level-0 = K: the two sub-problems are merged once.

Cost at Level-1 = K + K = 2*K: two sub-problems are merged twice.
Cost at Level-2 = K + K + K + K = 4*K: two sub-problems are merged four times.
and so on....
Step 4: Calculate the number of nodes at each level

Let's first determine the number of nodes in the last level. From the recursion tree, we can deduce this:

Level-0 has 1 (2^0) node
Level-1 has 2 (2^1) nodes
Level-2 has 4 (2^2) nodes
Level-3 has 8 (2^3) nodes
So level log(n) should have 2^(log(n)) nodes, i.e., n nodes.

Step 5: Sum up the cost of all the levels

The total cost can be written as,


Total Cost = Cost of all levels except the last level + Cost of the last level
Total Cost = Cost for level-0 + Cost for level-1 + Cost for level-2 + ............... + Cost for level-(log(n) - 1) + Cost for the last level
The cost of the last level is calculated separately because it is the base case and no merging is done at the last level, so the cost to solve a single problem at this level is some constant value. Let's take it as O(1).

Let's put the values into the formulae,

T(n) = K + 2*K + 4*K + ..... (log(n) terms) + O(1) * n

T(n) = K(1 + 2 + 4 + .... log(n) terms) + O(n)
T(n) = K(2^0 + 2^1 + 2^2 + ..... + 2^(log(n) - 1)) + O(n)
If you look closely at the above expression, the bracketed part forms a geometric progression (a, ar, ar^2, ar^3, ...). The sum of a finite GP with first term a, common ratio r, and m terms is S = a(r^m - 1)/(r - 1). Here a = 1 and r = 2, so the sum is (2^(log n) - 1)/(2 - 1) = n - 1. Therefore, T(n) = K(n - 1) + O(n) = O(n).
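To make this concrete, here is a small illustrative C++ sketch (an addition of this rewrite, not part of the original notes): a recursive range sum whose running time follows exactly the recurrence T(n) = 2T(n/2) + K, and which, as derived above, runs in O(n) time. The function name and the example array are assumptions for illustration only.

#include <iostream>
#include <vector>

// A function whose running time satisfies T(n) = 2T(n/2) + K:
// the range is split in half (two recursive calls) and the two partial
// results are combined with one constant-time addition (the K term).
long long rangeSum(const std::vector<int>& a, int lo, int hi) {
    if (lo == hi) return a[lo];        // base case: problem size 1, cost O(1)
    int mid = lo + (hi - lo) / 2;
    return rangeSum(a, lo, mid)        // T(n/2)
         + rangeSum(a, mid + 1, hi);   // T(n/2) + K (one addition)
}

int main() {
    std::vector<int> a = {3, 1, 4, 1, 5, 9, 2, 6};
    std::cout << rangeSum(a, 0, (int)a.size() - 1) << std::endl; // prints 31
    return 0;
}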


1. Lower Bound Theory:


According to the lower bound theory, for a lower bound L(n) of an algorithm, it is not possible for any other algorithm (for the same problem) to have a time complexity less than L(n) for arbitrary input. Also, every algorithm must take at least L(n) time in the worst case. Note that L(n) here is the minimum, over all possible algorithms, of their maximum (worst-case) complexity.

The lower bound is very important for any algorithm. Once we have calculated it, we can compare it with the actual complexity of the algorithm, and if their orders are the same then we can declare our algorithm optimal. So in this section, we will be discussing techniques for finding the lower bound of an algorithm.

Note that our main motive is to get an optimal algorithm, which is the one
having its Upper Bound the Same as its Lower Bound (U(n)=L(n)). Merge Sort is a
common example of an optimal algorithm.

Trivial Lower Bound –


It is the easiest method for finding a lower bound. Lower bounds that can be easily observed from the number of inputs taken and the number of outputs produced are called trivial lower bounds.

Example: Multiplication of two n x n matrices, where:

Input: for 2 matrices we will have 2n^2 inputs
Output: 1 matrix of order n x n, i.e., n^2 outputs

In the above example, it's easy to see that the lower bound is Ω(n^2), since every input must be read and every output produced.

Computational Model –
The method applies to all algorithms that are comparison-based. For example, in sorting, we have to compare the elements of the list among themselves and then sort them accordingly. The case is similar with searching, and thus we can apply the same reasoning there. Now we will look at some examples to understand its usage.

Ordered Searching –
It is a type of searching in which the list is already sorted.

Example-1: Linear search
Explanation –


In linear search, we compare the key with the first element; if it does not match, we compare it with the second element, and so on till we check against the nth element. Otherwise, we end up with a failure.

Example-2: Binary search

Explanation –


In binary search, we check the middle element against the key; if it is greater, we search the first half, else we check the second half, and repeat the same process. The diagram below illustrates binary search in an array consisting of 4 elements.

Calculating the lower bound: The maximum number of comparisons is n. Let there be k levels in the tree.
The number of nodes will be 2^k - 1.
The upper bound on the number of nodes in any comparison-based search of an element in a list of size n is n, as there are at most n comparisons in the worst-case scenario: n <= 2^k - 1.
Each level takes 1 comparison, thus the number of comparisons is k >= ceil(log2 n).
Thus the lower bound of any comparison-based search in a list of n elements cannot be less than log(n). Therefore we can say that binary search is optimal, as its complexity is Θ(log n).

Sorting –
The diagram below is an example of the tree formed when sorting 3 elements.


Example – For n elements, finding lower bound using computation model.

Explanation –
For n elements, we have a total of n! orderings (leaf nodes). (Refer to the diagram: the total number of orderings is 3! = 6.) It is also clear that the tree formed is a binary tree. Each level in the diagram indicates a comparison. Let there be k levels; then 2^k is the maximum number of leaf nodes in a full binary tree, thus in this case we have n! <= 2^k.

As k in the above example is the number of comparisons, by the computational model the lower bound = k.

Now we can say that n! <= 2^T(n).

Thus, T(n) >= ceil(log(n!)).
Also, n! <= n^n, thus log(n!) <= log(n^n).
Taking the ceiling function on both sides, we get ceil(log(n^n)) >= ceil(log(n!)).
Thus the complexity becomes Θ(log(n^n)), i.e., Θ(n log n).

Using lower bound theory to solve an algebraic problem:

Straight Line Program –


A program built without any loops or control structures is called a Straight Line Program. For example,

#include <iostream>

// Function to sum two numbers without using loops or control structures
int Sum(int a, int b) {
    int c = a + b;
    return c;
}

int main() {
    // Example usage
    int num1 = 5;
    int num2 = 7;

    int result = Sum(num1, num2);

    std::cout << "The sum of " << num1 << " and " << num2 << " is: " << result << std::endl;

    return 0;
}

Output
The sum of 5 and 7 is: 12

Algebraic Problem –
Problems related to algebra like solving equations inequalities etc. come under
algebraic problems. For example, solving equation ax2+bx+c with simple
programming.

#include <iostream>

int Algo_Sol(int a, int b, int c, int x) {
    // 1 assignment
    int v = a * x;

    // 1 assignment
    v = v + b;

    // 1 assignment
    v = v * x;

    // 1 assignment
    int ans = v + c;
    return ans;
}

int main() {
    // Example usage
    int result = Algo_Sol(2, 3, 4, 5);
    std::cout << "Result: " << result << std::endl;

    return 0;
}

Output
Result: 69

The cost of solving here is 4 assignments (excluding the return).

The above example shows a simple way to evaluate a 2-degree polynomial at cost 4; evaluating an nth-degree polynomial term by term, as below, has complexity O(n^2).

Let us demonstrate via an algorithm. Example: a_n*x^n + ... + a_1*x + a_0 is a polynomial of degree n.

#include <iostream>
#include <vector> // Include vector header for using vectors

// Function to calculate x raised to the power n
int power(int x, int n) {
    int p = 1;

    // Loop from 1 to n
    for (int i = 1; i <= n; ++i) {
        p *= x;
    }


return p;
}

// Function to evaluate the polynomial with coefficients A, value x, and degree n
int polynomial(std::vector<int>& A, int x, int n) {
    int v = 0;

    for (int i = 0; i <= n; ++i) {
        // Loop within a loop: power(x, i) itself loops
        v += A[i] * power(x, i);
    }

    return v;
}

int main() {
    // Example usage:
    std::vector<int> A = {2, 3, 4}; // Coefficients of the polynomial
    int x = 5;                      // Value of x
    int n = A.size() - 1;           // Degree of the polynomial

    int result = polynomial(A, x, n);

    std::cout << "Result: " << result << std::endl;

    return 0;
}

Output
Result: 117

Loop within a loop => complexity = O(n^2)


Now, to find an optimal algorithm, we need to find the lower bound (as per lower bound theory). As per lower bound theory, the optimal algorithm to solve the above problem is one having complexity O(n). Let's prove this theorem using lower bounds.


Theorem: The optimal algorithm for evaluating an n-degree polynomial is O(n).
Proof: The best way to reduce the work is to make this problem less complex by dividing the polynomial into a sequence of straight-line operations.

=> a_n*x^n + a_(n-1)*x^(n-1) + a_(n-2)*x^(n-2) + ... + a_1*x + a_0 can be written as
((..(a_n*x + a_(n-1))*x + .. + a_2)*x + a_1)*x + a_0

Now, the algorithm will be:
v = 0
v = v + a_n
v = v * x
v = v + a_(n-1)
v = v * x
...
v = v + a_1
v = v * x
v = v + a_0

#include <iostream>

int polynomial(int A[], int x, int n) {
    int v = 0;

    // Horner's rule: the loop executes n+1 times,
    // folding in one coefficient per step
    for (int i = n; i >= 0; i--) {
        v = v * x + A[i];
    }

    return v;
}

int main() {
    // Example usage
    int coefficients[] = {2, -1, 3}; // Coefficients of 3x^2 - x + 2 (A[i] is the coefficient of x^i)
    int degree = 2;                  // Degree of the polynomial
    int x_value = 4;                 // Value of x

    int result = polynomial(coefficients, x_value, degree);

    std::cout << "Result of the polynomial evaluation: " << result << std::endl;


return 0;
}

Output
Result of the polynomial evaluation: 46

The complexity of this code is O(n). This way of evaluating such polynomials is called Horner's method. Here is where lower bound theory works and gives the optimal algorithm's complexity as O(n).

2. Upper Bound Theory:


According to the upper bound theory, for an upper bound U(n) of an algorithm, we can always solve the problem in at most U(n) time. The time taken by a known algorithm to solve a problem with worst-case input gives us the upper bound.
It's difficult to provide a comprehensive list of advantages and disadvantages of lower and upper bound theory, as it depends on the specific context in which it is being used. However, here are some general advantages and disadvantages:

Advantages:
● Provides a clear understanding of the range of possible values for a quantity, which can be useful in decision-making.
● Helps to identify the optimal value within the range of possible values, which can lead to more efficient and effective solutions to problems.
● Can be used to prove the existence of solutions to optimization problems.
● Provides a theoretical framework for analyzing and solving a wide range of mathematical problems.

Disadvantages:
● May not always provide a precise solution to optimization problems, as the optimal value
may not be within the range of possible values determined by the lower and upper bounds.
● Can be computationally intensive, especially for complex optimization problems with
many constraints.
● May be limited by the accuracy of the data used to determine the lower and upper bounds.
● Requires a strong mathematical background to use effectively.

Searching
Searching is the fundamental process of locating a specific element or item within
a collection of data. This collection of data can take various forms, such as arrays, lists,
trees, or other structured representations. The primary objective of searching is to
determine whether the desired element exists within the data, and if so, to identify its
precise location or retrieve it. It plays an important role in various


computational tasks and real-world applications, including information retrieval, data analysis, decision-making processes, and more.

Searching terminologies:

Target Element:
In searching, there is always a specific target element or item that you want to find within the data collection. This target could be a value, a record, a key, or any other data entity of interest.

Search Space:
The search space refers to the entire collection of data within which you are looking
for the target element. Depending on the data structure used, the search space may vary in
size and organization.

Complexity:
Searching can have different levels of complexity depending on the data structure
and the algorithm used. The complexity is often measured in terms of time and space
requirements.

Deterministic vs. Non-deterministic:


Some searching algorithms, like binary search, are deterministic, meaning they follow a clear, systematic approach. Others, such as linear search, are called non-deterministic here in the sense that they may need to examine the entire search space in the worst case.

Importance of Searching in DSA:


Efficiency: Efficient searching algorithms improve program performance.
Data Retrieval: Quickly find and retrieve specific data from large datasets.
Database Systems: Enables fast querying of databases.
Problem Solving: Used in a wide range of problem-solving tasks.

Applications of Searching:
Searching algorithms have numerous applications across various fields. Here are
some common applications:
1. Information Retrieval: Search engines like Google, Bing, and Yahoo use sophisticated
searching algorithms to retrieve relevant information from vast amounts of data on the web.
2. Database Systems: Searching is fundamental in database systems for retrieving specific data records based on user queries, improving efficiency in data retrieval.
3. E-commerce: Searching is crucial in e-commerce platforms for users to find products
quickly based on their preferences, specifications, or keywords.
4. Networking: In networking, searching algorithms are used for routing packets


efficiently through networks, finding optimal paths, and managing network resources.
5. Artificial Intelligence: Searching algorithms play a vital role in AI applications, such as
problem-solving, game playing (e.g., chess), and decision-making processes
6. Pattern Recognition: Searching algorithms are used in pattern matching tasks,such as
image recognition, speech recognition, and handwriting recognition.

Linear Search Algorithm:


Linear Search is a method for searching for an element in a collection of elements. In Linear Search, each element of the collection is visited one by one in a sequential fashion to find the desired element. Linear Search is also known as Sequential Search.

Algorithm:
Start: Begin at the first element of the collection of elements.
Compare: Compare the current element with the desired element.
Found: If the current element is equal to the desired element, return true or the index of the current element.
Move: Otherwise, move to the next element in the collection.
Repeat: Repeat steps 2-4 until we have reached the end of the collection.
Not found: If the end of the collection is reached without finding the desired element, return that the desired element is not in the array.

Working:
● Every element is considered as a potential match for the key and checked for the same.
● If any element is found equal to the key, the search is successful and the index of that element is returned.
● If no element is found equal to the key, the search yields "No match found".

For example: Consider the array arr[] = {10, 50, 30, 70, 80, 20, 90, 40} and
key = 30

Step 1: Start from the first element (index 0) and compare key with each element
(arr[i]).

Comparing key with first element arr[0]. Since not equal, the iterator moves to the next element as a potential match.


Comparing key with next element arr[1]. Since not equal, the iterator moves to the next element as a potential match.

Step 2: Now, when comparing arr[2] with the key, the values match. So the Linear Search algorithm yields a success message and returns the index at which the key is found (here, 2).
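Putting the above steps together, here is a minimal C++ sketch of linear search (an illustration consistent with the notes, not taken from them), run on the same example array and key:

#include <iostream>

// Scan left to right and return the index of the first match, or -1 if absent
int linearSearch(const int arr[], int n, int key) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == key)
            return i; // match found at index i
    }
    return -1; // key not present
}

int main() {
    int arr[] = { 10, 50, 30, 70, 80, 20, 90, 40 };
    int n = sizeof(arr) / sizeof(arr[0]);
    std::cout << linearSearch(arr, n, 30) << std::endl; // prints 2
    return 0;
}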

Time Complexity:
Best Case: In the best case, the key might be present at the first index. So the best case complexity is O(1).
Worst Case: In the worst case, the key might be present at the last index, i.e., opposite to the end from which the search started in the list. So the worst-case complexity is O(N), where N is the size of the list.
Average Case: O(N)
Auxiliary Space: O(1), as except for the variable used to iterate through the list, no other variable is used.

Applications of Linear Search Algorithm:


1. Unsorted Lists: When we have an unsorted array or list, linear search is most commonly used to find any element in the collection.


2. Small Data Sets: Linear Search is preferred over binary search when we have small data sets.
3. Searching Linked Lists: In linked list implementations, linear search is commonly used
to find elements within the list. Each node is checked sequentially until the desired
element is found.
4. Simple Implementation: Linear Search is much easier to understand and implement as
compared to Binary Search or Ternary Search.

Advantages of Linear Search Algorithm:


● Linear search can be used irrespective of whether the array is sorted or not. It can be used on arrays of any data type.
● Does not require any additional memory.
● It is a well-suited algorithm for small datasets.

Disadvantages of Linear Search Algorithm:


● Linear search has a time complexity of O(N), which in turn makes it slow for large
datasets.
● Not suitable for large arrays.

When to use Linear Search Algorithm?


● When we are dealing with a small dataset.
● When you are searching for a dataset stored in contiguous memory.

Binary Search Algorithm:


Binary search is a search algorithm used to find the position of a target value within a sorted array. It works by repeatedly dividing the search interval in half until the target value is found or the interval is empty. The search interval is halved by comparing the target element with the middle value of the search space.

Conditions to apply Binary Search Algorithm in a Data Structure:


● The data structure must be sorted.
● Access to any element of the data structure takes constant time.


● In this algorithm, divide the search space into two halves by finding the middle index "mid".

● Compare the middle element of the search space with the key.
● If the key is found at the middle element, the process is terminated.
● If the key is not found at the middle element, choose which half will be used as the next search space.
● If the key is smaller than the middle element, then the left side is used for the next search.
● If the key is larger than the middle element, then the right side is used for the next search.
● This process is continued until the key is found or the total search space is exhausted.

To understand the working of binary search, consider the following illustration:


Consider an array arr[] = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91}, and the target = 23.

First Step: Calculate mid and compare the mid element with the key. If the key is less than the mid element, move the search space to the left; if it is greater than mid, move the search space to the right.

Key (i.e., 23) is greater than current mid element (i.e., 16). The search space moves to
the right.

Key is less than the current mid 56. The search space moves to the left.


If the key matches the value of the mid element, the element is found and the search stops.

The Binary Search Algorithm can be implemented in the following two ways
1. Iterative Binary Search Algorithm
2. Recursive Binary Search Algorithm

Iterative Binary Search Algorithm:


Here we use a while loop to continue the process of comparing the key and splitting the search space into two halves.

Implementation of Iterative Binary Search Algorithm


// C++ program to implement iterative Binary Search
#include <bits/stdc++.h>
using namespace std;

// An iterative binary search function.


int binarySearch(int arr[], int low, int high, int x)
{
    while (low <= high) {
        int mid = low + (high - low) / 2;

        // Check if x is present at mid
        if (arr[mid] == x)
            return mid;

        // If x greater, ignore left half
        if (arr[mid] < x)
            low = mid + 1;

        // If x is smaller, ignore right half
        else
            high = mid - 1;
    }

    // If we reach here, then element was not present
    return -1;
}

// Driver code
int main(void)
{
    int arr[] = { 2, 3, 4, 10, 40 };
    int x = 10;
    int n = sizeof(arr) / sizeof(arr[0]);
    int result = binarySearch(arr, 0, n - 1, x);
    (result == -1)
        ? cout << "Element is not present in array"
        : cout << "Element is present at index " << result;
    return 0;
}
Output
Element is present at index 3
Time Complexity: O(log N)
Auxiliary Space: O(1)

Recursive Binary Search Algorithm:


Create a recursive function and compare the mid of the search space with the key. Based on the result, either return the index where the key is found or call the recursive function for the next search space.

Implementation of Recursive Binary Search Algorithm:

#include <bits/stdc++.h>
using namespace std;

// A recursive binary search function. It returns the
// location of x in the given array arr[low..high] if present,
// otherwise -1
int binarySearch(int arr[], int low, int high, int x)
{
    if (high >= low) {
        int mid = low + (high - low) / 2;

        // If the element is present at the middle itself
        if (arr[mid] == x)
            return mid;

        // If element is smaller than mid, then
        // it can only be present in left subarray
        if (arr[mid] > x)
            return binarySearch(arr, low, mid - 1, x);

        // Else the element can only be present in right subarray
        return binarySearch(arr, mid + 1, high, x);
    }

    // Element is not present in the array
    return -1;
}

// Driver code
int main()
{
    int arr[] = { 2, 3, 4, 10, 40 };
    int query = 10;
    int n = sizeof(arr) / sizeof(arr[0]);
    int result = binarySearch(arr, 0, n - 1, query);
    (result == -1)
        ? cout << "Element is not present in array"
        : cout << "Element is present at index " << result;
    return 0;
}

Output
Element is present at index 3

Complexity Analysis of Binary Search Algorithm:
Time Complexity:
Best Case: O(1)
Average Case: O(log N)
Worst Case: O(log N)
Auxiliary Space: O(1); if the recursive call stack is considered, then the auxiliary space will be O(log N).


Applications of Binary Search Algorithm:


● Binary search can be used as a building block for more complex algorithms used in machine
learning, such as algorithms for training neural networks or finding the optimal
hyperparameters for a model.
● It can be used for searching in computer graphics such as algorithms for ray tracing or
texture mapping.
● It can be used for searching a database.

Advantages of Binary Search:


● Binary search is faster than linear search, especially for large arrays.
● More efficient than other searching algorithms with a similar time complexity, such as interpolation search or exponential search.
● Binary search is well-suited for searching large datasets that are stored in external memory, such as on a hard drive or in the cloud.

Disadvantages of Binary Search:


● The array should be sorted.
● Binary search requires that the data structure being searched be stored in contiguous memory locations.
● Binary search requires that the elements of the array be comparable, meaning that they
must be able to be ordered.

Pattern Searching:
Pattern searching involves searching for a specific pattern or sequence of
elements within a given data structure. This technique is commonly used in string
matching algorithms to find occurrences of a particular pattern within a text or a larger
string. By using various algorithms like the Knuth-Morris-Pratt (KMP) algorithm or the
Rabin-Karp algorithm, pattern searching plays a crucial role in tasks such as text processing,
data retrieval, and computational biology.


Naive algorithm for Pattern Searching


Given a text string of length n and a pattern of length m, the task is to print all occurrences of the pattern in the text.
Note: You may assume that n > m.

Examples:

Input: text = "THIS IS A TEST TEXT", pattern = "TEST"
Output: Pattern found at index 10

Input: text = "AABAACAADAABAABA", pattern = "AABA"
Output: Pattern found at index 0, Pattern found at index 9, Pattern found at index 12


Slide the pattern over the text one by one and check for a match. If a match is found, then slide by 1 again to check for subsequent matches.

#include <iostream>
#include <string>
using namespace std;

void search(string& pat, string& txt) {
    int M = pat.size();
    int N = txt.size();

    // A loop to slide pat[] one by one
    for (int i = 0; i <= N - M; i++) {
        int j;

        // For current index i, check for pattern match
        for (j = 0; j < M; j++) {
            if (txt[i + j] != pat[j]) {
                break;
            }
        }

        // If pattern matches at index i
        if (j == M) {
            cout << "Pattern found at index " << i << endl;
        }
    }


}

// Driver's code
int main() {
    // Example 1
    string txt1 = "AABAACAADAABAABA";
    string pat1 = "AABA";
    cout << "Example 1: " << endl;
    search(pat1, txt1);

    // Example 2
    string txt2 = "agd";
    string pat2 = "g";
    cout << "\nExample 2: " << endl;
    search(pat2, txt2);

    return 0;
}

Output
Example 1:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 12

Example 2:
Pattern found at index 1

Time Complexity: O(N^2)
Auxiliary Space: O(1)

Complexity Analysis of Naive algorithm for Pattern Searching:

Best Case: O(N)
When the pattern is found at the very beginning of the text (or very early on), the algorithm performs only on the order of O(N) comparisons, where N is the length of the text.
Worst Case: O(N^2)
When the pattern doesn't appear in the text at all, or appears only at the very end, the algorithm performs O((N-M+1)*M) comparisons, where N is the length of the text and M is the length of the pattern. In the worst case, for each position in the text, the algorithm may need to compare the entire pattern against the text.

KMP Algorithm for Pattern Searching


Given a text txt[0 . . . N-1] and a pattern pat[0 . . . M-1], write a function search(char pat[],
char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that N > M.

Examples:

Input: txt[] = “THIS IS A TEST TEXT”, pat[] = “TEST”


Output: Pattern found at index 10

Input: txt[] = "AABAACAADAABAABA", pat[] = "AABA"

Output: Pattern found at index 0, Pattern found at index 9, Pattern found at index 12


The worst-case complexity of the Naive algorithm is O(m(n-m+1)). The time complexity of the KMP algorithm is O(n+m) in the worst case.

KMP (Knuth Morris Pratt) Pattern Searching:

The Naive pattern-searching algorithm doesn’t work well in cases where we see
many matching characters followed by a mismatching character.

Examples:

1) txt[] = “AAAAAAAAAAAAAAAAAB”, pat[] = “AAAAB”


2) txt[] = “ABABABCABABABCABABABC”, pat[] = “ABABAC” (not a worst
case, but a bad case for Naive)

The KMP matching algorithm uses the degenerating property of the pattern (the same sub-patterns appearing more than once in the pattern) and improves the worst-case complexity to O(n+m).

The basic idea behind KMP's algorithm is: whenever we detect a mismatch (after some matches), we already know some of the characters in the text of the next window. We take advantage of this information to avoid matching the characters that we know will anyway match.


Matching Overview

txt = "AAAAABAAABA"
pat = "AAAA"


We compare first window of txt with pat

txt = “AAAAABAAABA”
pat = “AAAA” [Initial position]
We find a match. This is same as Naive String Matching.

In the next step, we compare the next window of txt with pat.
txt = "AAAAABAAABA"
pat = "AAAA" [Pattern shifted one position]

This is where KMP optimizes over Naive. In this second window, we only compare the fourth A of the pattern with the fourth character of the current window of the text to decide whether the current window matches or not. Since we know the first three characters will anyway match, we skip matching them.

Need of Preprocessing?
An important question arises from the above explanation: how do we know how many characters to skip? To know this, we pre-process the pattern and prepare an integer array lps[] that tells us the count of characters to be skipped.

Preprocessing Overview:
The KMP algorithm preprocesses pat[] and constructs an auxiliary array lps[] of size m (the same as the size of the pattern), which is used to skip characters while matching.
The name lps indicates the longest proper prefix which is also a suffix. A proper prefix is a prefix that is not the whole string. For example, the prefixes of "ABC" are "", "A", "AB" and "ABC"; the proper prefixes are "", "A" and "AB". The suffixes of the string are "", "C", "BC", and "ABC".
We search for lps in sub-patterns. More precisely, we focus on sub-strings of the pattern that are both a prefix and a suffix.
For each sub-pattern pat[0..i] where i = 0 to m-1, lps[i] stores the length of the maximum matching proper prefix which is also a suffix of the sub-pattern pat[0..i]:
lps[i] = the longest proper prefix of pat[0..i] which is also a suffix of pat[0..i].


Note: lps[i] could also be defined as the longest prefix which is also a proper suffix. We
need to use it properly in one place to make sure that the whole substring is not
considered.

Examples of lps[] construction:

For the pattern “AAAA”, lps[] is [0, 1, 2, 3]

For the pattern “ABCDE”, lps[] is [0, 0, 0, 0, 0]

For the pattern “AABAACAABAA”, lps[] is [0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5]

For the pattern “AAACAAAAAC”, lps[] is [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]

For the pattern “AAABAAA”, lps[] is [0, 1, 2, 0, 1, 2, 3]

Preprocessing Algorithm:
In the preprocessing part, we calculate the values in lps[]. To do that, we keep track of the length of the longest prefix-suffix value for the previous index (we use the len variable for this purpose).
We initialize lps[0] and len as 0.
If pat[len] and pat[i] match, we increment len by 1 and assign the incremented value to lps[i].
If pat[i] and pat[len] do not match and len is not 0, we update len to lps[len-1].

Illustration of preprocessing (or construction of lps[]):

pat[] = “AAACAAAA”

=> len = 0, i = 0:

lps[0] is always 0, we move to i = 1


=> len = 0, i = 1:

Since pat[len] and pat[i] match, do len++,store it in lps[i] and do i++.


Set len = 1, lps[1] = 1, i = 2
=> len = 1, i = 2:


Since pat[len] and pat[i] match, do len++,store it in lps[i] and do i++.


Set len = 2, lps[2] = 2, i = 3
=> len = 2, i = 3:

Since pat[len] and pat[i] do not match, and len > 0,Set len = lps[len-1] = lps[1] = 1
=> len = 1, i = 3:

Since pat[len] and pat[i] do not match and len > 0,len = lps[len-1] = lps[0] = 0
=> len = 0, i = 3:

Since pat[len] and pat[i] do not match and len = 0,Set lps[3] = 0 and i = 4
=> len = 0, i = 4:

Since pat[len] and pat[i] match, do len++,Store it in lps[i] and do i++.


Set len = 1, lps[4] = 1, i = 5
=> len = 1, i = 5:

Since pat[len] and pat[i] match, do len++,Store it in lps[i] and do i++.


Set len = 2, lps[5] = 2, i = 6
=> len = 2, i = 6:

Since pat[len] and pat[i] match, do len++,Store it in lps[i] and do i++.


len = 3, lps[6] = 3, i = 7
=> len = 3, i = 7:

Since pat[len] and pat[i] do not match and len > 0,Set len = lps[len-1] = lps[2] = 2
=> len = 2, i = 7:

Since pat[len] and pat[i] match, do len++,Store it in lps[i] and do i++.


len = 3, lps[7] = 3, i = 8
We stop here as we have constructed the whole lps[].


Implementation of KMP algorithm:


Unlike the Naive algorithm, where we slide the pattern by one and compare all characters at each shift, we use a value from lps[] to decide the next characters to be matched. The idea is to not match a character that we know will anyway match.

How to use lps[] to decide the next positions (or to know the number of characters to be
skipped)?

● We start the comparison of pat[j] with j = 0 against the characters of the current window of the text.
● We keep matching characters txt[i] and pat[j] and keep incrementing i and j while pat[j] and txt[i] keep matching.
● When we see a mismatch:
● We know that characters pat[0..j-1] match with txt[i-j…i-1] (note that j starts with 0 and is incremented only when there is a match).
● We also know (from the above definition) that lps[j-1] is the count of characters of pat[0…j-1] that are both a proper prefix and a suffix.
From the above two points, we can conclude that we do not need to match these lps[j-1] characters with txt[i-j…i-1], because we know that these characters will match anyway. Let us consider the above example to understand this.

Below is the illustration of the above algorithm:

Consider txt[] = “AAAAABAAABA“, pat[] = “AAAA“

If we follow the above LPS building process then lps[] = {0, 1, 2, 3}

-> i = 0, j = 0: txt[i] and pat[j] match, do i++, j++

-> i = 1, j = 1: txt[i] and pat[j] match, do i++, j++

-> i = 2, j = 2: txt[i] and pat[j] match, do i++, j++

-> i = 3, j = 3: txt[i] and pat[j] match, do i++, j++

-> i = 4, j = 4: Since j == M, print pattern found and reset j, j = lps[j-1] = lps[3] = 3
Here, unlike the Naive algorithm, we do not match the first three


characters of this window. The value of lps[j-1] (in the above step) gave us the index of the next character to match.

-> i = 4, j = 3: txt[i] and pat[j] match, do i++, j++

-> i = 5, j = 4: Since j == M, print pattern found and reset j, j = lps[j-1] = lps[3] = 3
Again, unlike the Naive algorithm, we do not match the first three characters of this window. The value of lps[j-1] (in the above step) gave us the index of the next character to match.

-> i = 5, j = 3: txt[i] and pat[j] do NOT match and j > 0, change only j. j = lps[j-1]
= lps[2] = 2

-> i = 5, j = 2: txt[i] and pat[j] do NOT match and j > 0, change only j. j = lps[j-1]
= lps[1] = 1

-> i = 5, j = 1: txt[i] and pat[j] do NOT match and j > 0, change only j. j = lps[j-1]
= lps[0] = 0

-> i = 5, j = 0: txt[i] and pat[j] do NOT match and j is 0, we do i++.

-> i = 6, j = 0: txt[i] and pat[j] match, do i++ and j++

-> i = 7, j = 1: txt[i] and pat[j] match, do i++ and j++

We continue this way till there are sufficient characters in the text to be compared with the characters in the pattern…

Below is the implementation of the above approach:

// C++ program for implementation of KMP pattern searching algorithm
#include <bits/stdc++.h>

void computeLPSArray(char* pat, int M, int* lps);

// Prints occurrences of pat[] in txt[]
void KMPSearch(char* pat, char* txt)
{
    int M = strlen(pat);
    int N = strlen(txt);

    // create lps[] that will hold the longest prefix suffix
    // values for pattern
    int lps[M];

    // Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps);

    int i = 0; // index for txt[]
    int j = 0; // index for pat[]
    while ((N - i) >= (M - j)) {
        if (pat[j] == txt[i]) {
            j++;
            i++;
        }

        if (j == M) {
            printf("Found pattern at index %d ", i - j);
            j = lps[j - 1];
        }

        // mismatch after j matches
        else if (i < N && pat[j] != txt[i]) {
            // Do not match lps[0..lps[j-1]] characters,
            // they will match anyway
            if (j != 0)
                j = lps[j - 1];
            else
                i = i + 1;
        }
    }
}

// Fills lps[] for given pattern pat[0..M-1]
void computeLPSArray(char* pat, int M, int* lps)
{
    // length of the previous longest prefix suffix
    int len = 0;

    lps[0] = 0; // lps[0] is always 0

    // the loop calculates lps[i] for i = 1 to M-1
    int i = 1;
    while (i < M) {
        if (pat[i] == pat[len]) {
            len++;
            lps[i] = len;
            i++;
        }
        else // (pat[i] != pat[len])
        {
            // This is tricky. Consider the example
            // AAACAAAA and i = 7. The idea is similar
            // to the search step.
            if (len != 0) {
                len = lps[len - 1];
                // Also, note that we do not increment i here
            }
            else // if (len == 0)
            {
                lps[i] = 0;
                i++;
            }
        }
    }
}

// Driver code
int main()
{
    char txt[] = "ABABDABACDABABCABAB";
    char pat[] = "ABABCABAB";
    KMPSearch(pat, txt);
    return 0;
}
Output
Found pattern at index 10


Time Complexity: O(N+M), where N is the length of the text and M is the length of the pattern to be found.
Auxiliary Space: O(M)

Rabin-Karp Algorithm for Pattern Searching:


Given a text T[0...n-1] and a pattern P[0...m-1], write a function search(char P[], char T[]) that prints all occurrences of P[] present in T[] using the Rabin-Karp algorithm. You may assume that n > m.

Examples:

Input: T[] = “THIS IS A TEST TEXT”, P[] = “TEST”


Output: Pattern found at index 10

Input: T[] = “AABAACAADAABAABA”, P[] = “AABA”


Output: Pattern found at index 0 Pattern found at index 9 Pattern found at index 12

In the Naive string matching algorithm, we check one by one whether every substring of the text of the pattern's size is equal to the pattern.

Like the Naive algorithm, the Rabin-Karp algorithm also checks every substring. But unlike the Naive algorithm, the Rabin-Karp algorithm matches the hash value of the pattern with the hash value of the current substring of the text, and only if the hash values match does it start matching individual characters. So the Rabin-Karp algorithm needs to calculate hash values for the following strings:

● Pattern itself
● All the substrings of the text of length m which is the size of pattern.

How is Hash Value calculated in Rabin-Karp?


The hash value is used to efficiently check for potential matches between a pattern and substrings of a larger text. The hash value is calculated using a rolling hash function, which allows you to update the hash value for a new substring by efficiently removing the contribution of the old character and adding the contribution of the new character. This makes it possible to slide the pattern over the text and calculate the hash value for each substring without recalculating the entire hash from scratch.


Here's how the hash value is typically calculated in Rabin-Karp:

Step 1: Choose a suitable base and a modulus:

Select a prime number 'p' as the modulus. This choice helps avoid overflow issues and ensures a good distribution of hash values.
Choose a base 'b' (usually a prime number as well), which is often the size of the character set (e.g., 256 for ASCII characters).
Step 2: Initialize the hash value:

Set an initial hash value ‘hash‘ to 0.


Step 3: Calculate the initial hash value for the pattern:

Iterate over each character in the pattern from left to right.


For each character 'c' at position 'i', calculate its contribution to the hash value as 'c * b^(pattern_length - i - 1) % p' and add it to 'hash'.
This gives you the hash value for the entire pattern.

Step 4: Slide the pattern over the text:

Start by calculating the hash value for the first substring of the text that is the same length
as the pattern.
Step 5: Update the hash value for each subsequent substring:

To slide the pattern one position to the right, remove the contribution of the leftmost character and add the contribution of the new character on the right.
The formula for updating the hash value when the window advances to end at position 'i' is:
hash = ((hash - text[i - pattern_length] * b^(pattern_length - 1)) * b + text[i]) % p

Step 6: Compare hash values:

When the hash value of a substring in the text matches the hash value of the pattern, it's a potential match.
If the hash values match, we should perform a character-by-character comparison to confirm the match, as hash collisions can occur.


Step-by-step approach:

1. Initially calculate the hash value of the pattern.
2. Start iterating from the beginning of the string:
3. Calculate the hash value of the current substring having length m.
4. If the hash value of the current substring and the pattern are the same, check if the substring is the same as the pattern.
5. If they are the same, store the starting index as a valid answer. Otherwise, continue with the next substrings.
6. Return the starting indices as the required answer.

Below is the implementation of the above approach:

/* The following program is a C++ implementation of the Rabin-Karp
algorithm given in the CLRS book */
#include <bits/stdc++.h>
using namespace std;

// d is the number of characters in the input alphabet
#define d 256

/* pat -> pattern
   txt -> text
   q -> a prime number
*/
void search(char pat[], char txt[], int q)


{
    int M = strlen(pat);
    int N = strlen(txt);
    int i, j;
    // hash values are kept in long long so that d * value cannot
    // overflow even when q is as large as INT_MAX
    long long p = 0; // hash value for pattern
    long long t = 0; // hash value for txt
    long long h = 1;

    // The value of h would be "pow(d, M-1) % q"
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Calculate the hash value of pattern and first
    // window of text
    for (i = 0; i < M; i++) {
        p = (d * p + pat[i]) % q;
        t = (d * t + txt[i]) % q;
    }

    // Slide the pattern over text one by one
    for (i = 0; i <= N - M; i++) {

        // Check the hash values of current window of text
        // and pattern. If the hash values match then only
        // check for characters one by one
        if (p == t) {
            /* Check for characters one by one */
            for (j = 0; j < M; j++) {
                if (txt[i + j] != pat[j]) {
                    break;
                }
            }

            // if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
            if (j == M)
                cout << "Pattern found at index " << i << endl;
        }

        // Calculate hash value for next window of text:
        // remove leading digit, add trailing digit
        if (i < N - M) {
            t = (d * (t - txt[i] * h) + txt[i + M]) % q;

            // We might get a negative value of t; convert it to positive
            if (t < 0)
                t = (t + q);
        }
    }
}

/* Driver code */
int main()
{
    char txt[] = "GEEKS FOR GEEKS";
    char pat[] = "GEEK";

    // we take the modulus to avoid overflow, but q should be
    // as large as possible to avoid collisions
    int q = INT_MAX;

    // Function Call
    search(pat, txt, q);
    return 0;
}

Output
Pattern found at index 0
Pattern found at index 10

Time Complexity:

The average and best-case running time of the Rabin-Karp algorithm is O(n+m), but its worst-case time is O(nm). The worst case of the Rabin-Karp algorithm occurs when all characters of the pattern and text are the same, as the hash values of all the substrings of T[] then match the hash value of P[].


Auxiliary Space: O(1)

Limitations of Rabin-Karp Algorithm


Spurious Hit: When the hash value of the pattern matches the hash value of a window of the text but the window is not an actual occurrence of the pattern, it is called a spurious hit. Spurious hits increase the time complexity of the algorithm. In order to minimize spurious hits, use a good hash function; it greatly reduces spurious hits.


UNIT – II GRAPH TECHNIQUE

1. Minimum spanning tree: Kruskal’s and Prim’s algorithm


2. Shortest path: Bellman-Ford algorithm - Dijkstra's algorithm - Floyd-Warshall algorithm
3. Network flow: Flow networks - Ford-Fulkerson method
4. Matching: Maximum bipartite matching

Minimum Spanning Tree:

A Spanning Tree is a tree which has V vertices and V-1 edges. All nodes in a spanning tree are reachable from each other.

A Minimum Spanning Tree (MST), or minimum weight spanning tree, for a weighted, connected, undirected graph is a spanning tree having a weight less than or equal to the weight of every other possible spanning tree. The weight of a spanning tree is the sum of the weights given to each edge of the spanning tree. In short, out of all spanning trees of a given graph, the spanning tree having minimum weight is the MST.

Algorithms for finding Minimum Spanning Tree(MST):-

1. Prim’s Algorithm
2. Kruskal’s Algorithm

Prim’s Algorithm:

Prim's algorithm is a minimum spanning tree algorithm that takes a graph as input and
finds the subset of the edges of that graph which
● Form a tree that includes every vertex
● Has the minimum sum of weights among all the trees that can be formed from the graph.

How Prim's algorithm works:

It falls under a class of algorithms called greedy algorithms that find the local optimum in
the hopes of finding a global optimum.
We start from one vertex and keep adding edges with the lowest weight until we reach our
goal.

The steps for implementing Prim's algorithm are as follows:

1. Initialize the minimum spanning tree with a vertex chosen at random.


2. Find all the edges that connect the tree to new vertices, find the minimum and add
it to the tree
3. Keep repeating step 2 until we get a minimum spanning tree
Example of Prim's algorithm:


Start with a weighted graph

Choose a vertex

Choose the shortest edge from this vertex and add it

Choose the nearest vertex not yet in the solution

Choose the nearest edge not yet in the solution, if there are multiple choices, choose one
at random

Prim's Algorithm pseudocode:


The pseudocode for Prim's algorithm shows how we create two sets of vertices, U and V-U. U contains the list of vertices that have been visited, and V-U the list of vertices that haven't. One by one, we move vertices from set V-U to set U by connecting the least weight edge.
T = ∅;
U = {1};
while (U ≠ V)
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U;
    T = T ∪ {(u, v)}
    U = U ∪ {v}
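As a concrete illustration of this pseudocode, here is a minimal C++ sketch of Prim's algorithm (an addition of this rewrite, not part of the original notes), using a min-priority queue over candidate edges; it returns the total weight of the MST:

#include <bits/stdc++.h>
using namespace std;

// adj[u] holds pairs (v, w): an edge u-v of weight w
int primMST(int V, vector<vector<pair<int,int>>>& adj) {
    vector<bool> inMST(V, false);            // the set U of visited vertices
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
    pq.push({0, 0});                         // (weight, vertex), start at vertex 0
    int total = 0;
    while (!pq.empty()) {
        auto [w, u] = pq.top(); pq.pop();
        if (inMST[u]) continue;              // skip stale queue entries
        inMST[u] = true;                     // move u from V-U into U
        total += w;
        for (auto [v, wt] : adj[u])
            if (!inMST[v]) pq.push({wt, v}); // candidate lowest-cost edges
    }
    return total;
}

int main() {
    // 4 vertices, edges: (0-1,1), (1-2,2), (0-3,4), (2-3,3)
    vector<vector<pair<int,int>>> adj(4);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    };
    addEdge(0, 1, 1); addEdge(1, 2, 2); addEdge(0, 3, 4); addEdge(2, 3, 3);
    cout << primMST(4, adj) << endl;         // prints 6 (edges 1 + 2 + 3)
    return 0;
}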

Prim's Algorithm Complexity:

The time complexity of Prim's algorithm is O(E log V).

Kruskal Algorithm:

Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input and
finds the subset of the edges of that graph which
● Form a tree that includes every vertex
● Has the minimum sum of weights among all the trees that can be formed from the graph

How Kruskal's algorithm works:

It falls under a class of algorithms called greedy algorithms that find the local optimum in
the hopes of finding a global optimum.

We start from the edges with the lowest weight and keep adding edges until we reach our
goal.

The steps for implementing Kruskal's algorithm are as follows:

1. Sort all the edges from low weight to high

2. Take the edge with the lowest weight and add it to the spanning tree. If adding the edge creates a cycle, then reject this edge.

3. Keep adding edges until we reach all vertices.


Example of Kruskal's algorithm:

Start with a weighted graph

Choose the edge with the least weight; if there are more than 1, choose any one

Choose the next shortest edge and add it

Choose the next shortest edge that doesn't create a cycle and add it

Choose the next shortest edge that doesn't create a cycle and add it


Repeat until you have a spanning tree.


Kruskal Algorithm Pseudocode:

KRUSKAL(G):
A = ∅
For each vertex v ∈ G.V:
    MAKE-SET(v)
For each edge (u, v) ∈ G.E, ordered by increasing weight(u, v):
    if FIND-SET(u) ≠ FIND-SET(v):
        A = A ∪ {(u, v)}
        UNION(u, v)
return A
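Here is a matching minimal C++ sketch of this pseudocode (illustrative, not from the original notes), with FIND-SET/UNION implemented as a simple union-find structure:

#include <bits/stdc++.h>
using namespace std;

struct DSU {                                // disjoint-set for MAKE-SET/FIND-SET/UNION
    vector<int> parent;
    DSU(int n) : parent(n) { iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {              // false if a and b are already connected
        a = find(a); b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

// edges are (weight, u, v) tuples; returns the total weight of the MST
int kruskalMST(int V, vector<tuple<int,int,int>> edges) {
    sort(edges.begin(), edges.end());       // 1. sort edges from low weight to high
    DSU dsu(V);
    int total = 0;
    for (auto [w, u, v] : edges)
        if (dsu.unite(u, v))                // 2. add edge unless it creates a cycle
            total += w;
    return total;
}

int main() {
    vector<tuple<int,int,int>> edges = {
        {1, 0, 1}, {2, 1, 2}, {4, 0, 3}, {3, 2, 3}};
    cout << kruskalMST(4, edges) << endl;   // prints 6
    return 0;
}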

Shortest Path Algorithm:

The shortest path problem is about finding a path between vertices in a graph such that the total sum of the edge weights is minimum.

Algorithms for Shortest Path

1. Bellman-Ford Algorithm
2. Dijkstra's Algorithm
3. Floyd-Warshall Algorithm

Bellman-Ford Algorithm:

Bellman Ford algorithm helps us find the shortest path from a vertex to all other vertices
of a weighted graph. It is similar to Dijkstra's algorithm but it can work with graphs in which
edges can have negative weights.

How Bellman Ford's algorithm works:


The Bellman-Ford algorithm works by overestimating the length of the path from the starting vertex to all other vertices. Then it iteratively relaxes those estimates by finding new paths that are shorter than the previously overestimated paths.

By doing this repeatedly for all vertices, we can guarantee that the result is optimized.

Step-1 for Bellman Ford's algorithm


Step-2 for Bellman Ford 's algorithm

Step-3 for Bellman Ford 's algorithm

Step-4 for Bellman Ford's algorithm

Step-5 for Bellman Ford's algorithm


Step-6 for Bellman Ford's algorithm

Bellman Ford Pseudocode:

We need to maintain the path distance of every vertex. We can store that in an array of
size v, where v is the number of vertices.

We also want to be able to get the shortest path, not only know the length of the shortest
path. For this, we map each vertex to the vertex that last updated its path length.

Once the algorithm is over, we can backtrack from the destination vertex to the source
vertex to find the path.

function bellmanFord(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
    distance[S] <- 0

    for each vertex V in G
        for each edge (U, V) in G
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U

    for each edge (U, V) in G
        if distance[U] + edge_weight(U, V) < distance[V]
            Error: Negative Cycle Exists

    return distance[], previous[]
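The pseudocode translates directly into the following minimal C++ sketch (illustrative, not from the original notes; it returns only the distance array and reports a negative cycle by returning an empty vector):

#include <bits/stdc++.h>
using namespace std;

struct Edge { int u, v, w; };

vector<long long> bellmanFord(int V, const vector<Edge>& edges, int src) {
    const long long INF = LLONG_MAX / 4;
    vector<long long> dist(V, INF);
    dist[src] = 0;
    for (int pass = 1; pass < V; pass++)     // V-1 relaxation passes
        for (const Edge& e : edges)
            if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
                dist[e.v] = dist[e.u] + e.w;
    for (const Edge& e : edges)              // one extra pass to detect a negative cycle
        if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
            return {};                       // negative cycle exists
    return dist;
}

int main() {
    // edge 1->2 has a negative weight, which Bellman-Ford handles
    vector<Edge> edges = {{0, 1, 4}, {0, 2, 5}, {1, 2, -3}, {2, 3, 4}};
    for (long long d : bellmanFord(4, edges, 0))
        cout << d << " ";                    // prints: 0 4 1 5
    return 0;
}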

Bellman Ford's Complexity:

Time Complexity:

Best Case Complexity O(E)

Average Case Complexity O(VE)

Worst Case Complexity O(VE)

Dijkstra Algorithm:

Dijkstra's algorithm allows us to find the shortest path between any two vertices of a
graph.

It differs from the minimum spanning tree because the shortest distance between two vertices might not include all the vertices of the graph.

How Dijkstra's Algorithm works:

Dijkstra's Algorithm works on the basis that any subpath B -> D of the shortest path A -> D between vertices A and D is also the shortest path between vertices B and D.


Each subpath is the shortest path. Dijkstra used this property in the opposite direction, i.e., we overestimate the distance of each vertex from the starting vertex. Then we visit each node and its neighbors to find the shortest subpath to those neighbors.

The algorithm uses a greedy approach in the sense that we find the next best solution
hoping that the end result is the best solution for the whole problem.
Example of Dijkstra's algorithm:

It is easier to start with an example and then think about the algorithm.

Start with a weighted graph

Choose a starting vertex and assign infinity path values to all other devices


Go to each vertex and update its path length

If the path length of the adjacent vertex is lesser than new path length, don't update it

Avoid updating path lengths of already visited vertices

After each iteration, we pick the unvisited vertex with the least path length. So we choose


5 before 7

Notice how the right most vertex has its path length updated twice

Repeat until all the vertices have been visited.


Dijkstra's algorithm pseudocode:

We need to maintain the path distance of every vertex. We can store that in an array of size v, where v is the number of vertices.

We also want to be able to get the shortest path, not only know the length of the shortest
path. For this, we map each vertex to the vertex that last updated its path length.

Once the algorithm is over, we can backtrack from the destination vertex to the source
vertex to find the path.

A minimum priority queue can be used to efficiently receive the vertex with least path
distance.

function dijkstra(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
        if V != S, add V to Priority Queue Q
    distance[S] <- 0

    while Q IS NOT EMPTY
        U <- Extract MIN from Q
        for each unvisited neighbor V of U
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U
    return distance[], previous[]
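Here is a minimal C++ sketch of this pseudocode (illustrative, not from the original notes), using std::priority_queue as the minimum priority queue; stale queue entries are skipped instead of being decreased in place:

#include <bits/stdc++.h>
using namespace std;

// adj[u] holds pairs (v, w); weights must be non-negative
vector<long long> dijkstra(int V, vector<vector<pair<int,int>>>& adj, int src) {
    const long long INF = LLONG_MAX;
    vector<long long> dist(V, INF);
    priority_queue<pair<long long,int>,
                   vector<pair<long long,int>>, greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;           // stale entry, skip
        for (auto [v, w] : adj[u])
            if (dist[u] + w < dist[v]) {     // relax edge (u, v)
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}

int main() {
    vector<vector<pair<int,int>>> adj(4);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    };
    addEdge(0, 1, 4); addEdge(0, 2, 1); addEdge(2, 1, 2); addEdge(1, 3, 1);
    for (long long d : dijkstra(4, adj, 0))
        cout << d << " ";                    // prints: 0 3 1 4
    return 0;
}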

Dijkstra's Algorithm Complexity:

Time Complexity: O(E log V), where E is the number of edges and V is the number of vertices.
Space Complexity: O(V)

Floyd Warshall Algorithm:

The Floyd-Warshall Algorithm is an algorithm for finding the shortest paths between all pairs of vertices in a weighted graph. This algorithm works for both directed and undirected weighted graphs. But it does not work for graphs with negative cycles (where the sum of the edges in a cycle is negative).

A weighted graph is a graph in which each edge has a numerical value associated with it. The Floyd-Warshall algorithm is also called Floyd's algorithm, the Roy-Floyd algorithm, the Roy-Warshall algorithm, or the WFI algorithm.

This algorithm follows the dynamic programming approach to find the shortest paths.
How Floyd-Warshall Algorithm Works?

Let the given graph be:

Initial graph

Follow the steps below to find the shortest path between all the pairs of vertices.

1. Create a matrix A0 of dimension n*n where n is the number of vertices. The row
and the column are indexed as i and j respectively. i and j are the vertices of the
graph. Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex.
If there is no path from ith vertex to jth vertex, the cell is left as infinity.


Fill each cell with the distance between ith and jth vertex

2. Now, create a matrix A1 using matrix A0. The elements in the first column and the
first row are left as they are. The remaining cells are filled in the following way. Let
k be the intermediate vertex in the shortest path from source to destination.

In this step, k is the first vertex.


A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).

That is, if the direct distance from the source to the destination is greater than the path through the vertex k, then the cell is filled with A[i][k] + A[k][j].

In this step, k is vertex 1. We calculate the distance from the source vertex to the destination vertex through this vertex k.

For example: for A1[2,4], the direct distance from vertex 2 to 4 is 4, and the sum of the distances from vertex 2 to 4 through vertex 1 (i.e., from vertex 2 to 1 and from vertex 1 to 4) is 7. Since 4 < 7, A1[2,4] is filled with 4.

3. Similarly, A2 is created using A1. The elements in the second column and the second row are left as they are. In this step, k is the second vertex (i.e., vertex 2). The remaining steps are the same as in step 2.

Calculate the distance from the source vertex to destination vertex through this


vertex 2

4. Similarly, A3 and A4 are also created.

Calculate the distance from the source vertex to destination vertex through this
vertex 4

5. A4 gives the shortest path between each pair of vertices.

Floyd-Warshall Algorithm:

n = number of vertices
A = matrix of dimension n*n

for k = 1 to n
    for i = 1 to n
        for j = 1 to n
            A^k[i, j] = min(A^(k-1)[i, j], A^(k-1)[i, k] + A^(k-1)[k, j])
return A
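The triple loop above maps directly onto the following minimal C++ sketch (illustrative, not from the original notes). Updating the matrix in place is equivalent to building A^1 ... A^n from A^0:

#include <bits/stdc++.h>
using namespace std;

// dist starts as the weight matrix: 0 on the diagonal, INF where no edge
void floydWarshall(vector<vector<long long>>& dist) {
    int n = dist.size();
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}

int main() {
    const long long INF = 1e15;             // large, but safe against overflow when added
    vector<vector<long long>> d = {
        {0,   3,   INF, 7},
        {8,   0,   2,   INF},
        {5,   INF, 0,   1},
        {2,   INF, INF, 0}};
    floydWarshall(d);
    cout << d[0][2] << endl;                // prints 5 (path 0 -> 1 -> 2)
    return 0;
}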

Time Complexity:

There are three nested loops, each with a constant-time body. So, the time complexity of the Floyd-Warshall algorithm is O(n^3).

Network Flow:

A flow network is a directed graph that is used for modeling material flow. There are two special vertices: one is a source, which produces material at some steady rate, and the other is a sink, which consumes the material at the same constant rate. The flow of the material at any point in the system is the rate at which the material moves.

Some real-life problems like the flow of liquids through pipes, the current through wires


and the delivery of goods can be modelled using flow networks.

Definition: A Flow Network is a directed graph G=(V,E) such that

1. For each edge (u, v) ∈ E, we associate a nonnegative weight capacity c(u, v) ≥ 0. If (u, v) ∉ E, we assume that c(u, v) = 0.
2. There are two distinguished vertices: the source s and the sink t.
3. For every vertex v ∈ V, there is a path from s to t containing v.

Let G = (V, E) be a flow network. Let s be the source of the network, and let t be the sink.
A flow in G is a real-valued function f: V x V→R such that the following properties hold:
o Capacity Constraint: For all u, v ∈ V, we need f(u, v) ≤ c(u, v).
o Skew Symmetry: For all u, v ∈ V, we need f(u, v) = -f(v, u).
o Flow Conservation: For all u ∈ V - {s, t}, we need ∑v∈V f(u, v) = 0.

The quantity f(u, v), which can be positive or negative, is known as the net flow from vertex
u to vertex v. In the maximum-flow problem, we are given a flow network G with source s
and sink t, and we wish to find a flow of maximum value from s to t.

Ford-Fulkerson Algorithm:

Initially, the value of the flow is 0. Find some augmenting path p and increase the flow f on each
edge of p by the residual capacity cf(p). When no augmenting path exists, the flow f is a maximum
flow.

FORD-FULKERSON-METHOD (G, s, t)

1. Initialize flow f to 0
2. while there exists an augmenting path p
3.     do augment flow f along p
4. return f

FORD-FULKERSON (G, s, t)

1. for each edge (u, v) ∈ E[G]
2.     do f[u, v] ← 0
3.        f[v, u] ← 0
4. while there exists a path p from s to t in the residual network Gf
5.     do cf(p) ← min {cf(u, v) : (u, v) is on p}
6.        for each edge (u, v) in p
7.            do f[u, v] ← f[u, v] + cf(p)
8.               f[v, u] ← -f[u, v]
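
The method above can be turned into a runnable program. Below is a minimal C sketch that finds augmenting paths with breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson); the 6-vertex capacity matrix in main is an illustrative assumption, not the example graph in the figures.

#include <stdio.h>
#include <string.h>

#define V 6

int bfs(int r[V][V], int s, int t, int parent[]) {
    int visited[V] = {0}, queue[V], head = 0, tail = 0;
    queue[tail++] = s;
    visited[s] = 1;
    parent[s] = -1;
    while (head < tail) {
        int u = queue[head++];
        for (int v = 0; v < V; v++)
            if (!visited[v] && r[u][v] > 0) {   /* residual capacity left */
                parent[v] = u;
                visited[v] = 1;
                queue[tail++] = v;
            }
    }
    return visited[t];                          /* did we reach the sink? */
}

int maxFlow(int cap[V][V], int s, int t) {
    int r[V][V];                    /* residual network */
    memcpy(r, cap, sizeof(r));
    int parent[V], flow = 0;
    while (bfs(r, s, t, parent)) {  /* while an augmenting path exists */
        int cf = 1 << 30;           /* bottleneck capacity cf(p) */
        for (int v = t; v != s; v = parent[v])
            if (r[parent[v]][v] < cf) cf = r[parent[v]][v];
        for (int v = t; v != s; v = parent[v]) {
            r[parent[v]][v] -= cf;  /* use up forward capacity */
            r[v][parent[v]] += cf;  /* allow the flow to be undone */
        }
        flow += cf;
    }
    return flow;
}

int main(void) {
    int cap[V][V] = {               /* assumed sample network */
        {0, 16, 13, 0,  0,  0},
        {0, 0,  10, 12, 0,  0},
        {0, 4,  0,  0,  14, 0},
        {0, 0,  9,  0,  0,  20},
        {0, 0,  0,  7,  0,  4},
        {0, 0,  0,  0,  0,  0}
    };
    printf("Maximum flow: %d\n", maxFlow(cap, 0, 5));  /* 23 for this sample */
    return 0;
}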

Example: Each Directed Edge is labeled with capacity. Use the Ford-Fulkerson algorithm
to find themaximum flow.

Solution: The left side of each part shows the residual network Gf with a shaded augmenting
path p, and the right side of each part shows the net flow f.


Maximum Bipartite Matching:

A bipartite matching is a set of edges in a graph chosen in such a way that no two edges in
the set share an endpoint. A maximum matching is a matching with the maximum number of edges.

When a maximum matching is found, we cannot add another edge; if one edge is added to a
maximum matched graph, it is no longer a matching. For a bipartite graph, more than one
maximum matching is possible.

Algorithm:

bipartiteMatch(u, visited, assign)
Input: Starting node u, a visited list to keep track of visited vertices, and an assign list that records which vertex each node is matched with.
Output: Returns true when a matching for vertex u is possible.
Begin
   for all vertices v adjacent to u, do
      if v is not visited, then
         mark v as visited
         if v is not assigned, or bipartiteMatch(assign[v], visited, assign) is true, then
            assign[v] := u
            return true
   done
   return false
End

maxMatch(graph)
Input: The given graph.
Output: The maximum number of matches.
Begin
   initially no vertex is assigned
   count := 0
   for all applicants u in M, do
      mark all vertices as unvisited
      if bipartiteMatch(u, visited, assign), then
         increase count by 1
   done
   return count
End
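
The pseudocode above can be realized as a short C program. The following sketch assumes a small sample instance with 3 applicants and 3 jobs stored in an adjacency matrix; the names graph, bipartiteMatch and maxMatch mirror the pseudocode.

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

#define M 3   /* applicants (left side) */
#define N 3   /* jobs (right side) */

bool graph[M][N] = {   /* assumed sample instance */
    {1, 1, 0},
    {0, 1, 0},
    {0, 0, 1}
};

/* Try to find an augmenting path starting from applicant u. */
bool bipartiteMatch(int u, bool visited[], int assign[]) {
    for (int v = 0; v < N; v++) {
        if (graph[u][v] && !visited[v]) {
            visited[v] = true;       /* mark job v as visited */
            /* Job v is free, or its current holder can be re-assigned. */
            if (assign[v] < 0 || bipartiteMatch(assign[v], visited, assign)) {
                assign[v] = u;
                return true;
            }
        }
    }
    return false;
}

int maxMatch(void) {
    int assign[N];
    memset(assign, -1, sizeof(assign));   /* no job assigned yet */
    int count = 0;
    for (int u = 0; u < M; u++) {
        bool visited[N] = {false};        /* fresh visited list per applicant */
        if (bipartiteMatch(u, visited, assign))
            count++;
    }
    return count;
}

int main(void) {
    printf("Maximum matching size: %d\n", maxMatch());
    return 0;
}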


UNIT III – DIVIDE AND CONQUER AND DYNAMIC PROGRAMMING

The divide-and-conquer technique involves taking a large-scale problem and dividing it into similar sub-
problems of a smaller scale and recursively solving each of these sub-problems. Generally, a problem is
divided into sub-problems repeatedly until the resulting sub-problems are very easy to solve.
Let T(n) be the time complexity of a divide-and-conquer algorithm to solve this problem. Then T(n)
satisfies an equation of the form T(n) = a T(n/b) + f(n), where a ≥ 1 is the number of recursive
calls and n/b, with b > 1, is the size of each sub-problem.
Finding Maximum and Minimum Element using Divide and Conquer Method:
Max-Min Problem
The Max-Min Problem in algorithm analysis is finding the maximum and minimum value in an array.
Solution
To find the maximum and minimum numbers in a given array numbers[] of size n, the following algorithms
can be used. First we present the naive method, and then the divide and conquer approach.
Naive Method
Naive method is a basic method to solve any problem. In this method, the maximum and minimum number
can be found separately. To find the maximum and minimum numbers, the following straightforward
algorithm can be used.
Algorithm: Max-Min-Element (numbers[])
max := numbers[1]
min := numbers[1]
for i = 2 to n do
if numbers[i] > max then
max := numbers[i]
if numbers[i] < min then
min := numbers[i]
return (max, min)
Example
Program:
#include <stdio.h>
struct Pair {
int max;
int min;
};
// Function to find maximum and minimum using the naive algorithm
struct Pair maxMinNaive(int arr[], int n) {
struct Pair result;
result.max = arr[0];
result.min = arr[0];
// Loop through the array to find the maximum and minimum values
for (int i = 1; i < n; i++) {


if (arr[i] > result.max) {


result.max = arr[i]; // Update the maximum value if a larger element is found
}
if (arr[i] < result.min) {
result.min = arr[i]; // Update the minimum value if a smaller element is found
}
}
return result; // Return the pair of maximum and minimum values
}
int main() {
int arr[] = {6, 4, 26, 14, 33, 64, 46};
int n = sizeof(arr) / sizeof(arr[0]);
struct Pair result = maxMinNaive(arr, n);
printf("Maximum element is: %d\n", result.max);
printf("Minimum element is: %d\n", result.min);
return 0;
}
Output
Maximum element is: 64
Minimum element is: 4
Analysis
The number of comparisons in the naive method is 2n - 2.
The number of comparisons can be reduced using the divide and conquer approach. Following is the
technique.
Divide and Conquer Approach
In this approach, the array is divided into two halves. Then using recursive approach maximum and
minimum numbers in each halves are found. Later, return the maximum of two maxima of each half and
the minimum of two minima of each half.
In this given problem, the number of elements in the array is n = y - x + 1, where y is
greater than or equal to x.
Max-Min(x, y) will return the maximum and minimum values of the array numbers[x...y].
Algorithm: Max-Min(x, y)
if y - x ≤ 1 then
    return (max(numbers[x], numbers[y]), min(numbers[x], numbers[y]))
else
    (max1, min1) := maxmin(x, ⌊(x + y)/2⌋)
    (max2, min2) := maxmin(⌊(x + y)/2⌋ + 1, y)
    return (max(max1, max2), min(min1, min2))
Example
#include <stdio.h>
// Structure to store both maximum and minimum elements
struct Pair {


int max;
int min;
};
struct Pair maxMinDivideConquer(int arr[], int low, int high) {
struct Pair result;
struct Pair left;
struct Pair right;
int mid;
// If only one element in the array
if (low == high) {
result.max = arr[low];
result.min = arr[low];
return result;
}
// If there are two elements in the array
if (high == low + 1) {
if (arr[low] < arr[high]) {
result.min = arr[low];
result.max = arr[high];
} else {
result.min = arr[high];
result.max = arr[low];
}
return result;
}
// If there are more than two elements in the array
mid = (low + high) / 2;
left = maxMinDivideConquer(arr, low, mid);
right = maxMinDivideConquer(arr, mid + 1, high);
// Compare and get the maximum of both parts
result.max = (left.max > right.max) ? left.max : right.max;
// Compare and get the minimum of both parts
result.min = (left.min < right.min) ? left.min : right.min;
return result;
}
int main() {
int arr[] = {6, 4, 26, 14, 33, 64, 46};
int n = sizeof(arr) / sizeof(arr[0]);
struct Pair result = maxMinDivideConquer(arr, 0, n - 1);
printf("Maximum element is: %d\n", result.max);
printf("Minimum element is: %d\n", result.min);
return 0;
}


Output
Maximum element is: 64
Minimum element is: 4
Analysis
Let T(n) be the number of comparisons made by Max-Min(x, y), where the number of elements
n = y - x + 1.
The recurrence relation for the number of comparisons is:

T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 2   for n > 2
T(2) = 1
T(1) = 0

Let us assume that n is a power of 2. Hence, n = 2^k where k is the height of the recursion tree.
So,
T(n) = 2·T(n/2) + 2 = 2·(2·T(n/4) + 2) + 2 = ..... = 3n/2 - 2
Compared to the naive method, the divide and conquer approach makes fewer comparisons. However,
using asymptotic notation both approaches are represented by O(n).
Merge Sort:
Merge sort is yet another sorting algorithm that falls under the category of Divide and Conquer technique.
It is one of the best sorting techniques that successfully build a recursive algorithm.
Divide and Conquer Strategy
In this technique, we segment a problem into two halves and solve them individually. After finding the
solution of each half, we merge them back to represent the solution of the main problem.
Suppose we have an array A, such that our main concern will be to sort the subsection, which starts at
index p and ends at index r, represented by A[p..r].
Divide
Assuming q to be the central point somewhere between p and r, we fragment the
subarray A[p..r] into two arrays A[p..q] and A[q+1..r].
Conquer
After splitting the arrays into two halves, the next step is to conquer. In this step, we individually sort both
of the subarrays A[p..q] and A[q+1..r]. If we have not yet reached the base case, we again follow
the same procedure, i.e., we further segment these subarrays and sort them separately.
Combine
Once the base case is reached by the conquer step, we successfully get our sorted
subarrays A[p..q] and A[q+1..r], after which we merge them back to form a new sorted array A[p..r].
Merge Sort algorithm
The MergeSort function keeps on splitting an array into two halves until a condition is met where we try to
perform MergeSort on a subarray of size 1, i.e., p == r.
And then, it combines the individually sorted subarrays into larger arrays until the whole array is merged.
ALGORITHM MERGE-SORT (A, p, r)
1. if p < r
2.     then q ← ⌊(p + r)/2⌋
3.         MERGE-SORT (A, p, q)
4.         MERGE-SORT (A, q+1, r)
5.         MERGE (A, p, q, r)
Here we call MergeSort(A, 0, length(A)-1) to sort the complete array.


As you can see in the image given below, the merge sort algorithm recursively divides the array into halves
until the base condition is met, where we are left with only 1 element in the array. And then, the merge
function picks up the sorted sub-arrays and merge them back to sort the entire array.
The following figure illustrates the dividing (splitting) procedure.

FUNCTION: MERGE (A, p, q, r)

1. n1 = q - p + 1
2. n2 = r - q
3. create arrays L[1.....n1 + 1] and R[1.....n2 + 1]
4. for i ← 1 to n1
5.     do L[i] ← A[p + i - 1]
6. for j ← 1 to n2
7.     do R[j] ← A[q + j]
8. L[n1 + 1] ← ∞
9. R[n2 + 1] ← ∞
10. i ← 1
11. j ← 1
12. for k ← p to r
13.     do if L[i] ≤ R[j]
14.         then A[k] ← L[i]
15.             i ← i + 1
16.         else A[k] ← R[j]
17.             j ← j + 1
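
A runnable C version of MERGE-SORT and MERGE is sketched below; instead of the ∞ sentinels used in the pseudocode it uses explicit bounds checks, and it is applied to the example array A = (36, 25, 40, 2, 7, 80, 15) used later in this section.

#include <stdio.h>

void merge(int A[], int p, int q, int r) {
    int n1 = q - p + 1, n2 = r - q;
    int L[n1], R[n2];
    for (int i = 0; i < n1; i++) L[i] = A[p + i];      /* left half */
    for (int j = 0; j < n2; j++) R[j] = A[q + 1 + j];  /* right half */
    int i = 0, j = 0, k = p;
    while (i < n1 && j < n2)              /* pick the smaller head */
        A[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) A[k++] = L[i++];       /* copy any leftovers */
    while (j < n2) A[k++] = R[j++];
}

void mergeSort(int A[], int p, int r) {
    if (p < r) {
        int q = (p + r) / 2;
        mergeSort(A, p, q);       /* sort left half */
        mergeSort(A, q + 1, r);   /* sort right half */
        merge(A, p, q, r);        /* combine */
    }
}

int main(void) {
    int A[] = {36, 25, 40, 2, 7, 80, 15};
    int n = sizeof(A) / sizeof(A[0]);
    mergeSort(A, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}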


The merge step of Merge Sort


Mainly, the recursive algorithm depends on a base case as well as on its ability to merge back the results derived
from the base cases. Merge sort is no different; it is just that here the merge step possesses more
importance.
To any given problem, the merge step is one such solution that combines the two individually sorted
lists(arrays) to build one large sorted list(array).
The merge sort algorithm upholds three pointers, i.e., one for both of the two arrays and the other one to
preserve the final sorted array's current index.
1. Did you reach the end of the array?
2. No:
3. Firstly, start with comparing the current elements of both the arrays.
4. Next, copy the smaller element into the sorted array.
5. Lastly, move the pointer of the element containing a smaller element.
6. Yes:
7. Simply copy the rest of the elements of the non-empty array
Merge( ) Function Explained Step-By-Step
Consider the following example of an unsorted array, which we are going to sort with the help of the Merge
Sort algorithm.
A= (36,25,40,2,7,80,15)
Step1: The merge sort algorithm iteratively divides the array into equal halves until we achieve an atomic
value. If there is an odd number of elements in the array, then one of the halves will have one more
element than the other.
Step2: After dividing the array into two subarrays, we will notice that the order of the elements
within each half is the same as in the original array. Now, we will further divide these two arrays into other halves.
Step3: Again, we will divide these arrays until we achieve an atomic value, i.e., a value that cannot be
further divided.
Step4: Next, we will merge them back in the same way as they were broken down.
Step5: For each list, we will first compare the element and then combine them to form a new sorted list.
Step6: In the next iteration, we will compare the lists of two data values and merge them back into a list of
four data values, all placed in a sorted manner.

Hence the array is sorted.


Analysis of Merge Sort:
Let T (n) be the total time taken by the Merge Sort algorithm.

o Sorting the two halves will take at most 2T(n/2) time.

o When we merge the sorted lists, we make a total of n - 1 comparisons, because the last remaining element
only needs to be copied down into the combined list, with no comparison.
Thus, the relational formula will be T(n) = 2T(n/2) + (n - 1).

But we ignore the '-1' because copying that element into the merged list still takes some time.

So T (n) = 2T(n/2) + n ...... equation 1
Note: Stopping Condition T (1) = 0 because at the end only 1 element is left, which needs to be copied,
and there is no comparison.

Put n = n/2 in equation 1: T(n/2) = 2T(n/4) + n/2 ...... equation 2

Putting equation 2 in equation 1: T(n) = 4T(n/4) + 2n ...... equation 3

Repeating the substitution, after i steps we get:

T(n) = 2^i T(n/2^i) + i·n

From the stopping condition, the expansion stops when n/2^i = 1, i.e., n = 2^i.

Apply log on both sides:

log n = log 2^i

log n = i log 2

i = log2 n

Substituting i = log2 n:

T(n) = n T(1) + n log2 n = n log2 n

Hence T(n) = O(n log n).
Best Case Complexity: The merge sort algorithm has a best-case time complexity of O(n*log n) for the
already sorted array.
Average Case Complexity: The average-case time complexity for the merge sort algorithm is O(n*log n),
which happens when 2 or more elements are jumbled, i.e., neither in the ascending order nor in the
descending order.
Worst Case Complexity: The worst-case time complexity is also O(n*log n), which occurs when we sort
the descending order of an array into the ascending order.
Space Complexity: The space complexity of merge sort is O(n).
Quick Sort:
It is an algorithm of Divide & Conquer type.
Divide: Rearrange the elements and split the array into two sub-arrays around a pivot element, such that
each element in the left sub-array is less than or equal to the pivot and each element in the right sub-array
is larger than the pivot.
Conquer: Recursively, sort two sub arrays.
Combine: Combine the already sorted array.
Algorithm:
QUICKSORT (array A, int m, int n)
1 if (n > m)
2 then
3 i ← a random index from [m,n]
4 swap A [i] with A[m]
5 o ← PARTITION (A, m, n)
6 QUICKSORT (A, m, o - 1)
7 QUICKSORT (A, o + 1, n)
Partition Algorithm:
Partition algorithm rearranges the sub arrays in a place.
PARTITION (array A, int m, int n)
1 x ← A[m]
2 o ← m
3 for p ← m + 1 to n


4 do if (A[p] < x)
5 then o ← o + 1
6 swap A[o] with A[p]
7 swap A[m] with A[o]
8 return o
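
The following C sketch follows the QUICKSORT/PARTITION pseudocode above, except that it takes the first element as the pivot (rather than a random index) so that it matches the worked example below.

#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

int partition(int A[], int m, int n) {
    int x = A[m];                 /* pivot */
    int o = m;
    for (int p = m + 1; p <= n; p++)
        if (A[p] < x)             /* move smaller elements left */
            swap(&A[++o], &A[p]);
    swap(&A[m], &A[o]);           /* place pivot at its final position */
    return o;
}

void quickSort(int A[], int m, int n) {
    if (m < n) {
        int o = partition(A, m, n);
        quickSort(A, m, o - 1);
        quickSort(A, o + 1, n);
    }
}

int main(void) {
    int A[] = {44, 33, 11, 55, 77, 90, 40, 60, 99, 22, 88};
    int n = sizeof(A) / sizeof(A[0]);
    quickSort(A, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}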
Figure: shows the execution trace partition algorithm

Example of Quick Sort:


1. 44 33 11 55 77 90 40 60 99 22 88
Let 44 be the Pivot element and scanning done from right to left
Comparing 44 to the right-side elements, and if right-side elements are smaller than 44, then swap it.
As 22 is smaller than 44 so swap them.
22 33 11 55 77 90 40 60 99 44 88
Now comparing 44 to the left side element and the element must be greater than 44 then swap them.
As 55 are greater than 44 so swap them.
22 33 11 44 77 90 40 60 99 55 88
Recursively, repeating steps 1 & steps 2 until we get two lists one left from pivot element 44 & one right
from pivot element.
22 33 11 40 77 90 44 60 99 55 88
Swap with 77:


22 33 11 40 44 90 77 60 99 55 88
Now, the element on the right side and left side are greater than and smaller than 44 respectively.
Now we get two sorted lists:

And these sublists are sorted under the same process as above done.
These two sorted sublists side by side.

Merging Sublists:

SORTED LISTS
Worst Case Analysis: It is the case when the items are already in sorted form and we try to sort them
again. This takes a lot of time.
Equation:
1. T (n) = T(1) + T(n-1) + n
T(1) is the time taken by the pivot element.
T(n-1) is the time taken by the remaining elements except for the pivot element.
n is the number of comparisons required to place the pivot in its exact position: for example,
comparing the first (pivot) element of a 6-item list with the others takes 5 comparisons.
It means there will be about n comparisons if there are n items.


Relational Formula for Worst Case:

T(n) = T(1) + T(n-1) + n

Expanding in the same way, T(n-1) = T(1) + T(n-2) + (n-1), then T(n-2) = T(1) + T(n-3) + (n-2),
and so on. Note: for making T(n-4) into T(1) we put (n-1) in place of '4'; and if
we put (n-1) in place of 4 then we have to put (n-2) in place of 3 and (n-3)
in place of 2, and so on.

T(n) = (n-1) T(1) + T(n-(n-1)) + (n-(n-2)) + (n-(n-3)) + (n-(n-4)) + ... + n
T (n) = (n-1) T (1) + T (1) + 2 + 3 + 4 + ............ + n
T (n) = (n-1) T (1) + T (1) + 2 + 3 + 4 + ........... + n + 1 - 1

[Adding 1 and subtracting 1 for making an AP series]

T (n) = (n-1) T (1) + T (1) + 1 + 2 + 3 + 4 + ........ + n - 1

T (n) = (n-1) T (1) + T (1) + n(n+1)/2 - 1

Stopping Condition: T (1) = 0
Because at last there is only one element left and no comparison is required.

T (n) = (n-1)(0) + 0 + n(n+1)/2 - 1

Worst Case Complexity of Quick Sort is T (n) = O(n2)


Randomized Quick Sort [Average Case]:
Generally, we assume the first element of the list to be the pivot element. In the average case, each of
the n items is equally likely to be chosen as the pivot.
1. Let total time taken =T (n)
2. For eg: In a given list
3. p 1, p 2, p 3, p 4............pn
4. If p 1 is the pivot list then we have 2 lists.
5. I.e. T (0) and T (n-1)
6. If p2 is the pivot list then we have 2 lists.
7. I.e. T (1) and T (n-2)
8. p 1, p 2, p 3, p 4............pn
9. If p3 is the pivot list then we have 2 lists.
10. I.e. T (2) and T (n-3)
11. p 1, p 2, p 3, p 4............p n
So in general if we take the Kth element to be the pivot element.
Then, the time for one call is the n comparisons done by the pivot plus the time for the two
recursive calls, T(k-1) and T(n-k). Since we are doing the average case, every pivot position
k = 1, 2, ..., n is equally likely, so we average over all of them.

So the relational formula for Randomized Quick Sort is:

T(n) = n + 1 + (1/n) [(T(0) + T(1) + T(2) + ... + T(n-1)) + (T(n-1) + T(n-2) + ... + T(0))]

= n + 1 + (2/n) (T(0) + T(1) + T(2) + ... + T(n-2) + T(n-1))

Multiplying both sides by n:

n T(n) = n(n+1) + 2 (T(0) + T(1) + T(2) + ... + T(n-1)) ........ eq 1

Put n = n-1 in eq 1:

(n-1) T(n-1) = (n-1) n + 2 (T(0) + T(1) + T(2) + ... + T(n-2)) ........ eq 2

Subtracting eq 2 from eq 1:

n T(n) - (n-1) T(n-1) = n(n+1) - n(n-1) + 2 T(n-1)
n T(n) = [2 + (n-1)] T(n-1) + 2n
n T(n) = (n+1) T(n-1) + 2n ........ eq 3

Dividing both sides of eq 3 by n(n+1):

T(n)/(n+1) = T(n-1)/n + 2/(n+1)

Put n = n-1: T(n-1)/n = T(n-2)/(n-1) + 2/n

Put n = n-2: T(n-2)/(n-1) = T(n-3)/(n-2) + 2/(n-1)

and so on. Substituting each equation into the previous one, the terms telescope, and we get:

T(n)/(n+1) = T(1)/2 + 2 (1/3 + 1/4 + ... + 1/(n+1))

The bracketed sum is part of the harmonic series, which is approximately log n. Therefore,

T(n) ≈ 2(n+1) log n = O(n log n)

is the average case complexity of quick sort for sorting n elements.


3. Quick Sort [Best Case]: In any sorting, the case in which we make no comparison between elements
occurs only when we have a single element to sort. For quick sort, the best case arises when every pivot
splits the list into two equal halves, giving the recurrence T(n) = 2T(n/2) + n and hence a best-case
complexity of O(n log n).

Dynamic Programming
Dynamic programming is a technique that breaks a problem into sub-problems and saves their results for
future use, so that we do not need to compute a result again. The property that the overall solution can be
optimized by combining optimal solutions of the subproblems is known as the optimal substructure
property. The main use of dynamic programming is to solve optimization problems; here, optimization
problems mean problems in which we are trying to find the minimum or the maximum solution of a problem.
Dynamic programming guarantees to find the optimal solution of a problem if such a solution exists.
The definition of dynamic programming says that it is a technique for solving a complex problem by first
breaking into a collection of simpler subproblems, solving each subproblem just once, and then storing
their solutions to avoid repetitive computations.
Let's understand this approach through an example.
Consider an example of the Fibonacci series. The following series is the Fibonacci series:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
The numbers in the above series are not randomly calculated. Mathematically, we could write each of the
terms using the below formula:
F(n) = F(n-1) + F(n-2),
with the base values F(0) = 0 and F(1) = 1. To calculate the other numbers, we follow the above
relationship. For example, F(2) is the sum of F(0) and F(1), which is equal to 1.
How can we calculate F(20)?
The F(20) term will be calculated using the nth formula of the Fibonacci series. The below figure shows
that how F(20) is calculated.


As we can observe in the above figure, F(20) is calculated as the sum of F(19) and F(18). In the dynamic
programming approach, we try to divide the problem into similar subproblems; we follow this approach in
the above case, where F(20) is divided into the similar subproblems F(19) and F(18). If we recall the
definition of dynamic programming, it says that a similar subproblem should not be computed more than
once. Still, in the above case the subproblems are calculated twice: F(18) is calculated two times, and
similarly F(17) is also calculated twice. The technique is useful because it solves similar subproblems,
but we need to be careful to store the results we have computed once; if we do not, the same values
get recomputed, which leads to a wastage of resources.
In the above example, recalculating F(18) in the right subtree leads to tremendous usage
of resources and decreases the overall performance.
The solution to the above problem is to save the computed results in an array. First, we calculate F(16) and
F(17) and save their values in an array. The F(18) is calculated by summing the values of F(17) and F(16),
which are already saved in an array. The computed value of F(18) is saved in an array. The value of F(19)
is calculated using the sum of F(18), and F(17), and their values are already saved in an array. The computed
value of F(19) is stored in an array. The value of F(20) can be calculated by adding the values of F(19) and
F(18), and the values of both F(19) and F(18) are stored in an array. The final computed value of F(20) is
stored in an array.
How does the dynamic programming approach work?
The following are the steps that the dynamic programming follows:
o It breaks down the complex problem into simpler subproblems.

o It finds the optimal solution to these sub-problems.


o It stores the results of subproblems (memoization). The process of storing the results of subproblems
is known as memoization.
o It reuses them so that the same sub-problem is not calculated more than once.
o Finally, calculate the result of the complex problem.
The above five steps are the basic steps of dynamic programming. Dynamic programming is applicable
to problems that have overlapping subproblems and optimal substructure. Here, optimal
substructure means that the solution of an optimization problem can be obtained by simply combining the
optimal solutions of all the subproblems.
In the case of dynamic programming, the space complexity would be increased as we are storing the
intermediate results, but the time complexity would be decreased.
Approaches of dynamic programming
There are two approaches to dynamic programming:
o Top-down approach

o Bottom-up approach
Top-down approach
The top-down approach follows the memoization technique, while the bottom-up approach follows the
tabulation method. Here memoization is equal to the sum of recursion and caching. Recursion means
calling the function itself, while caching means storing the intermediate results.
Advantages
o It is very easy to understand and implement.

o It solves the subproblems only when it is required.


o It is easy to debug.
Disadvantages
It uses the recursion technique that occupies more memory in the call stack. Sometimes when the recursion
is too deep, the stack overflow condition will occur.
It occupies more memory that degrades the overall performance.
Let's understand dynamic programming through an example.
int fib(int n)
{
    if (n < 0)
        return -1;                        /* error: invalid input */
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    return fib(n - 1) + fib(n - 2);       /* sum of the two previous terms */
}


In the above code, we have used the recursive approach to find the Fibonacci numbers. When the value of
'n' increases, the function calls will also increase, and the computations will also increase. In this case, the time
complexity increases exponentially, and it becomes O(2^n).
One solution to this problem is to use the dynamic programming approach. Rather than generating the
recursive tree again and again, we can reuse the previously calculated value. If we use the dynamic
programming approach, then the time complexity would be O(n).
When we apply the dynamic programming approach in the implementation of the Fibonacci series, then
the code would look like:
static int memo[100];      /* memo[n] stores F(n) once computed; 0 means "not yet computed" for n >= 2 */
static int count = 0;      /* counts how many values are actually computed */

int fib(int n)
{
    if (n < 0)
        return -1;         /* error: invalid input */
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    if (memo[n] != 0)      /* already computed, reuse the stored value */
        return memo[n];
    count++;
    memo[n] = fib(n - 1) + fib(n - 2);
    return memo[n];
}
In the above code, we have used the memoization technique in which we store the results in an array to
reuse the values. This is also known as a top-down approach in which we move from the top and break the
problem into sub-problems.
Bottom-Up approach
The bottom-up approach is also one of the techniques which can be used to implement the dynamic
programming. It uses the tabulation technique to implement the dynamic programming approach. It solves
the same kind of problems but it removes the recursion. If we remove the recursion, there is no stack
overflow issue and no overhead of the recursive functions. In this tabulation technique, we solve the
problems and store the results in a matrix.
There are two ways of applying dynamic programming:


o Top-Down
o Bottom-Up
The bottom-up is the approach used to avoid the recursion, thus saving the memory space. The bottom-up
is an algorithm that starts from the beginning, whereas the recursive algorithm starts from the end and works
backward. In the bottom-up approach, we start from the base case to find the answer for the end. As we
know, the base cases in the Fibonacci series are 0 and 1. Since the bottom-up approach starts from the base
cases, we will start from 0 and 1.
Key points
o We solve all the smaller sub-problems that will be needed to solve the larger sub-problems then
move to the larger problems using smaller sub-problems.
o We use for loop to iterate over the sub-problems.
o The bottom-up approach is also known as the tabulation or table filling method.
Let's understand through an example.
Suppose we have an array that has 0 and 1 values at a[0] and a[1] positions, respectively shown as below:

Since the bottom-up approach starts from the lower values, so the values at a[0] and a[1] are added to find
the value of a[2] shown as below:

The value of a[3] will be calculated by adding a[1] and a[2], and it becomes 2 shown as below:

The value of a[4] will be calculated by adding a[2] and a[3], and it becomes 3 shown as below:

The value of a[5] will be calculated by adding the values of a[4] and a[3], and it becomes 5 shown as below:

The code for implementing the Fibonacci series using the bottom-up approach is given below:
int fib(int n)
{
    int A[n + 1];           /* table of already-computed Fibonacci values */
    A[0] = 0;
    A[1] = 1;
    for (int i = 2; i <= n; i++)
    {
        A[i] = A[i - 1] + A[i - 2];   /* reuse the two previous results */
    }
    return A[n];
}
In the above code, base cases are 0 and 1 and then we have used for loop to find other values of Fibonacci
series.
Let's understand through the diagrammatic representation.
Initially, the first two values, i.e., 0 and 1 can be represented as:

When i=2 then the values 0 and 1 are added shown as below:

When i=3 then the values 1and 1 are added shown as below:

When i=4 then the values 2 and 1 are added shown as below:


When i=5, then the values 3 and 2 are added shown as below:

Elements of Dynamic Programming


 Optimal Substructure
 Overlapping Sub-problems
 Variant: Memoization

Matrix Chain Multiplication


Example: We are given the sequence {4, 10, 3, 12, 20, 7}. The matrices have sizes 4 x 10, 10 x 3, 3 x
12, 12 x 20, 20 x 7. We need to compute M[i, j], 1 ≤ i ≤ j ≤ 5. We know M[i, i] = 0 for all i.


Let us proceed with working away from the diagonal. We compute the optimal solution for the product of
2 matrices.

Here P0 to P5 are the positions, and M1 to M5 are matrices, where matrix Mi has size p(i-1) x p(i).


On the basis of this sequence, we use the formula

M[i, j] = min over i ≤ k < j of { M[i, k] + M[k+1, j] + p(i-1) * p(k) * p(j) }

In dynamic programming, every cell is initialized with '0', so we initialize M[i, i] = 0, and the table is
filled out diagonal by diagonal.
We have to try all the combinations, but only the minimum-cost combination is taken into consideration.
Calculation of Product of 2 matrices:
1. m (1,2) = m1 x m2
= 4 x 10 x 10 x 3
= 4 x 10 x 3 = 120

2. m (2, 3) = m2 x m3
= 10 x 3 x 3 x 12
= 10 x 3 x 12 = 360

3. m (3, 4) = m3 x m4
= 3 x 12 x 12 x 20
= 3 x 12 x 20 = 720

4. m (4,5) = m4 x m5
= 12 x 20 x 20 x 7
= 12 x 20 x 7 = 1680


o We initialize the diagonal elements (where i = j) with '0'.
o After that, the second diagonal is worked out and we get all the values corresponding to it.
Now the third diagonal will be solved in the same way.
Now product of 3 matrices:
M [1, 3] = M1 M2 M3
1. There are two cases by which we can solve this multiplication: (M1 x M2) x M3, and M1 x (M2 x M3)
2. After solving both cases we choose the case in which minimum output is there.

M [1, 3] =264
As comparing both outputs, 264 is the minimum, so we insert 264 in the table, and the (M1 x M2) x M3
combination is chosen for producing the output.
M [2, 4] = M2 M3 M4
1. There are two cases by which we can solve this multiplication: (M2 x M3) x M4, and M2 x (M3 x M4)
2. After solving both cases we choose the case in which minimum output is there.

M [2, 4] = 1320
As comparing both outputs, 1320 is the minimum, so we insert 1320 in the table, and the M2 x (M3 x M4)
combination is chosen for producing the output.
M [3, 5] = M3 M4 M5
1. There are two cases by which we can solve this multiplication: (M3 x M4) x M5, and M3 x (M4 x M5)
2. After solving both cases we choose the case in which minimum output is there.

M [3, 5] = 1140
As comparing both outputs, 1140 is the minimum, so we insert 1140 in the table, and the (M3 x M4) x M5
combination is chosen for producing the output.


Now Product of 4 matrices:


M [1, 4] = M1 M2 M3 M4
There are three cases by which we can solve this multiplication:
1. ( M1 x M2 x M3) M4
2. M1 x(M2 x M3 x M4)
3. (M1 xM2) x ( M3 x M4)
After solving these cases we choose the case in which minimum output is there

M [1, 4] =1080
As comparing the output of the different cases, '1080' is the minimum output, so we insert 1080 in the table,
and the (M1 x M2) x (M3 x M4) combination is chosen for producing the output.
M [2, 5] = M2 M3 M4 M5
There are three cases by which we can solve this multiplication:
1. (M2 x M3 x M4)x M5
2. M2 x( M3 x M4 x M5)
3. (M2 x M3)x ( M4 x M5)
After solving these cases we choose the case in which minimum output is there

M [2, 5] = 1350
As comparing the output of the different cases, '1350' is the minimum output, so we insert 1350 in the table,
and the M2 x (M3 x M4 x M5) combination is chosen for producing the output.

Now Product of 5 matrices:


M [1, 5] = M1 M2 M3 M4 M5
There are four cases by which we can solve this multiplication:
1. (M1 x M2 x M3 x M4) x M5
2. M1 x (M2 x M3 x M4 x M5)
3. (M1 x M2 x M3) x (M4 x M5)
4. (M1 x M2) x (M3 x M4 x M5)
After solving these cases we choose the case in which minimum output is there

M [1, 5] = 1344
As comparing the output of the different cases, '1344' is the minimum output, so we insert 1344 in the table,
and the (M1 x M2) x (M3 x M4 x M5) combination is chosen for producing the output.
Final Output is:

Step 3: Computing Optimal Costs: let us assume that matrix Ai has dimension p(i-1) x p(i) for i = 1, 2, 3, ..., n.
The input is a sequence (p0, p1, ..., pn) where length[p] = n + 1. The procedure uses an auxiliary table
m[1...n, 1...n] for storing the m[i, j] costs and an auxiliary table s[1...n, 1...n] that records which index k
achieved the optimal cost in computing m[i, j].
The algorithm first computes m[i, i] ← 0 for i = 1, 2, 3, ..., n, the minimum costs for chains of length 1.
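
The table-filling procedure described above can be written as a short C program. The sketch below computes only the minimum cost m[1, n] for the example dimensions p = {4, 10, 3, 12, 20, 7}; the s table of split points is omitted for brevity.

#include <stdio.h>
#include <limits.h>

int matrixChainOrder(int p[], int n) {      /* n = number of matrices */
    int m[n + 1][n + 1];
    for (int i = 1; i <= n; i++) m[i][i] = 0;   /* chains of length 1 */
    for (int len = 2; len <= n; len++) {        /* increasing chain length */
        for (int i = 1; i <= n - len + 1; i++) {
            int j = i + len - 1;
            m[i][j] = INT_MAX;
            for (int k = i; k < j; k++) {       /* try every split point */
                int cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
                if (cost < m[i][j]) m[i][j] = cost;
            }
        }
    }
    return m[1][n];     /* minimum cost of multiplying M1..Mn */
}

int main(void) {
    int p[] = {4, 10, 3, 12, 20, 7};
    int n = sizeof(p) / sizeof(p[0]) - 1;   /* 5 matrices */
    printf("Minimum number of multiplications: %d\n",
           matrixChainOrder(p, n));         /* 1344, as computed above */
    return 0;
}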
Multistage Graph
A multistage graph is a directed and weighted graph, in which all vertices are divided into stages, such that
all the edges are only directed from the vertex of the current stage to the vertex of the next stage (i.e there
is no edge between the vertex of the same stage and to the previous stage vertex). Both the first and last
stage contains only one vertex called source and destination/sink respectively.
Mathematically, a multistage graph can be defined as:
Multistage graph G = (V, E, W) is a weighted directed graph in which vertices are partitioned into k ≥ 2
disjoint subsets V = {V1, V2, …, Vk} such that if edge (u, v) is present in E then u ∈ Vi and v ∈ Vi+1, 1 ≤
i < k. The sets V1 and Vk are such that |V1| = |Vk| = 1


A multistage graph with 9 vertices and 5 stages


In the multistage graph problem, we are required to find the shortest path between the source and the
sink/destination. This problem can be easily solved by dynamic programming. Before getting started, let’s
understand what is Dynamic Programming(DP).
Dynamic Programming:
Dynamic Programming is an optimization over recursion. Whenever we have an optimization problem with
repeated subproblems we use dynamic programming because, in dynamic programming, we store the
solution to smaller subproblems in a table, and use it when the same subproblem arrives. Dynamic
programming reduces the exponential time complexity of a recursive algorithm to polynomial time
complexity. Dynamic programming can be applied to many problems like travelling salesman problems
and multistage graph problems.
The multistage graph problem can be solved in two ways using dynamic programming :
1. Forward approach
2. Backward approach
FORWARD METHOD:
In the forward approach, we assume that there are k stages in the graph. We start from the last stage and
find out the cost of each and every node to the first stage. We then find out the minimum cost path from
the source to destination (i.e from stage 1 to stage k).
Procedure:
1. Maintain a cost array cost[n] which stores the distance from any vertex to the destination.
2. If a vertex has more than one path, then the path with the minimum distance is chosen and the
intermediate vertex is stored in the distance array D[n]. This will give us a minimum cost path from
each and every vertex.


3. Finally, the cost from 1st vertex cost(1,k) gives the minimum cost of the shortest path from source
to destination.
4. For finding the path, start from vertex-1 then the distance array D(1) will give the minimum cost
neighbour vertex which in turn gives the next nearest vertex and proceed in this way till we reach
the destination. For a ‘k’ stage graph, there will be ‘k’ vertex in the path.
5. For forward approach,
Cost(i, j) = min { c(j, l) + Cost(i+1, l) : l ∈ Vi+1 and (j, l) ∈ E }
Algorithm:
MULTI_STAGE(G, k, n, p)
// Description: Solve multi-stage problem using dynamic programming
// Input:
//   k: Number of stages in graph G = (V, E)
//   c[i, j]: Cost of edge (i, j)
// Output: p[1:k]: Minimum cost path
cost[n] ← 0
for j ← n - 1 to 1 do
    // Let r be a vertex such that (j, r) ∈ E and c[j, r] + cost[r] is minimum
    cost[j] ← c[j, r] + cost[r]
    π[j] ← r
end
// Find minimum cost path
p[1] ← 1
p[k] ← n
for j ← 2 to k - 1 do
    p[j] ← π[p[j - 1]]
end
Code:
// CPP program to find shortest distance
// in a multistage graph.
#include<bits/stdc++.h>
using namespace std;
#define N 12
#define INF INT_MAX
// Returns shortest distance from 0 to
// N-1.
int shortestDist(int graph[N][N]) {
// dist[i] is going to store shortest
// distance from node i to node N-1.
int dist[N];
dist[N-1] = 0;


// Calculating shortest path for


// rest of the nodes
for (int i = N - 2; i >= 0; i--)
{
// Initialize distance from i to
// destination (N-1)
dist[i] = INF;
// Check all nodes of next stages
// to find shortest distance from
// i to N-1.
for (int j = i ; j < N ; j++)
{
// Reject if no edge exists
if (graph[i][j] == INF)
continue;
// We apply recursive equation to
// distance to target through j.
// and compare with minimum distance
// so far.
dist[i] = min(dist[i], graph[i][j] +
dist[j]);
}
}
return dist[0];
}
// Driver code
int main()
{
// Graph stored in the form of an
// adjacency Matrix
int graph[N][N] =
{{INF, 9, 7, 3, 2, INF, INF, INF, INF,INF,INF,INF},
{INF, INF, INF, INF, INF,4, 2, 1, INF,INF,INF,INF},
{INF, INF, INF, INF, INF,2, 7, INF, INF,INF,INF,INF},
{INF, INF, INF, INF, INF,INF, INF, 11, INF,INF,INF,INF},
{INF, INF, INF, INF, INF,INF, 11, 8, INF,INF,INF,INF},
{INF, INF, INF, INF, INF,INF, INF,INF, 6,5,INF,INF},
{INF, INF, INF, INF, INF,INF, INF,INF, 4,3,INF,INF},
{INF, INF, INF, INF, INF,INF, INF,INF, INF,5,6,INF},
{INF, INF, INF, INF, INF,INF, INF,INF, INF,INF,INF,4},
{INF, INF, INF, INF, INF,INF, INF,INF, INF,INF,INF,2},
{INF, INF, INF, INF, INF,INF, INF,INF, INF,INF,INF,5},
{INF, INF, INF, INF, INF,INF, INF,INF, INF,INF,INF,0}};


cout << shortestDist(graph);


return 0;
}
Output :
16
Example: Find minimum path cost between vertex s and t for following multistage graph using dynamic
programming.

Solution:
In the above graph, cost of an edge is represented as c(i, j). We need to find the minimum cost path from
vertex 1 to vertex 12. Using the below formula we can find the shortest cost path from source to destination:
cost(i,j)=min{c(j,l)+cost(i+1,l)}
Step 1:
Stage 5
cost(5,12)=c(12,12)=0
We use forward approach here therefore (cost(5,12) = 0 ). Here, 5 represents the stage number and 12
represents a node in that stage. Since there are no outgoing edges from vertex 12, the cost is 0 and D[12]=12
Step 2:
Stage 4
cost(4,9)= c(9,12) = 4 D[9]=12
cost(4,10) = c(10,12) = 2 D[10]=12
cost(4,11) = c(11,12) = 5 D[11]=12
Step 3:
Stage 3
cost(3,6)=min{c(6,9)+cost(4,9), c(6,10)+cost(4,10)}
= min{6+4,5+2} =min{10,7}=7 D[6]=10
cost(3,7)=min{c(7,9)+cost(4,9), c(7,10)+cost(4,10)}
= min{4+4,3+2} =min{8,5}=5 D[7]=10


cost(3,8)=min{c(8,10)+cost(4,10), c(8,11)+cost(4,11)}
= min{5+2,6+5} =min{7,11}=7 D[8]=10
Step 4:
Stage 2
cost(2,2) = min{c(2,6)+cost(3,6), c(2,7)+cost(3,7), c(2,8)+cost(3,8)}
= min{4+7,2+5,1+7}=min{11,7,8}=7 D[2]=7
cost(2,3) = min{c(3,6)+cost(3,6), c(3,7)+cost(3,7)}
= min{2+7,7+5}=min{9,12}=9 D[3]=6
cost(2,4) = min{c(4,8)+cost(3,8)}
= min{11+7}=min{18}=18 D[4]=8
cost(2,5) = min{c(5,7)+cost(3,7), c(5,8)+cost(3,8)}
= min{11+5,8+7}=min{16,15}=15 D[5]=8
Step 5:
Stage 1
cost(1,1) = min(c(1,2)+cost(2,2), c(1,3)+cost(2,3), c(1,4)+cost(2,4), c(1,5)+cost(2,5)}
= min{9+7, 7+9, 3+18, 2+15}=min{16,16,21,17}=16
D[1] = 2 (we can take 3 also )
The path through which we have to find the shortest distance
Start from vertex 2
D ( 1) = 2
D ( 2) = 7
D ( 7) = 10
D (10) = 12
So, the minimum-cost path is 1 → 2 → 7 → 10 → 12.

The cost is 9+2+3+2=16


Code for this is given above.
ANALYSIS: The time complexity of this forward method is O(|V| + |E|)
2. BACKWARD METHOD :
If there are 'k' stages in a graph, then using the backward approach we find the cost of each and every
vertex starting from the 1st stage up to the kth stage, and we find the minimum cost path from the
destination to the source (i.e., from stage k to stage 1).
PROCEDURE:
1. It is similar to the forward approach but differs only in two or three ways.
2. Maintain a cost matrix to store the cost of every vertex and a distance matrix to store the minimum
distance vertex.
3. Find out the cost of each and every vertex starting from vertex 1 up to vertex k.


4. To find the path, start from vertex 'k'; the distance array D(k) gives the minimum cost
neighbour vertex, which in turn gives the next nearest neighbour vertex, and we proceed in
this way till we reach the destination.
Application of multistage graph :
1. It is used to find the minimum cost shortest path.
2. If a problem can be represented as a multistage graph then it can be solved by dynamic
programming.

Optimal Binary Search Tree


As we know that in binary search tree, the nodes in the left subtree have lesser value than the root node and
the nodes in the right subtree have greater value than the root node.
We know the key values of each node in the tree, and we also know the frequency of each node in terms
of searching, i.e., how often that node is searched. The frequency and key value determine
the overall cost of searching a node. The cost of searching is a very important factor in various applications,
and the overall cost of searching a node should be as small as possible. The time required to search a node in a
BST is more than in a balanced binary search tree, as a balanced binary search tree contains fewer levels than an
unbalanced BST. There is one way to reduce the cost of a binary search tree, known as an optimal binary
search tree.
Let's understand through an example.
If the keys are 10, 20, 30, 40, 50, 60, 70

In the above tree, all the nodes on the left subtree are smaller than the value of the root node, and all the
nodes on the right subtree are larger than the value of the root node. The maximum time required to search
a node is proportional to the height of the tree, which is log n for a balanced tree.
Now we will see how many binary search trees can be made from the given number of keys.
For example: 10, 20, 30 are the keys, and the following are the binary search trees that can be made out
from these keys.
The formula for calculating the number of trees:

Number of binary search trees on n keys = (2n)! / ((n+1)! * n!)   (the nth Catalan number)

When we use the above formula with n = 3, it is found that a total of 5 trees can be created.
The cost required for searching an element depends on the comparisons to be made to search an element.
Now, we will calculate the average cost of time of the above binary search trees.

In the above tree, total number of 3 comparisons can be made. The average number of comparisons can be
made as:

In the above tree, the average number of comparisons that can be made as:

In the above tree, the average number of comparisons that can be made as:


In the above tree, the total number of comparisons can be made as 3. Therefore, the average number of
comparisons that can be made as:

In the above tree, the total number of comparisons can be made as 3. Therefore, the average number of
comparisons that can be made as:

In the third case, the number of comparisons is less because the height of the tree is less, so it's a balanced
binary search tree.
Till now, we read about the height-balanced binary search tree. To find the optimal binary search tree, we
will determine the frequency of searching a key.
Let's assume that frequencies associated with the keys 10, 20, 30 are 3, 2, 5.
The above trees have different total costs. The tree with the lowest total cost is considered the
optimal binary search tree. The tree with cost 17 is the lowest, so it would be considered as the
optimal binary search tree.
Dynamic Approach
Consider the below table, which contains the keys and frequencies.


First, we will calculate the values where j-i is equal to zero.


When i=0, j=0, then j-i = 0
When i = 1, j=1, then j-i = 0
When i = 2, j=2, then j-i = 0
When i = 3, j=3, then j-i = 0
When i = 4, j=4, then j-i = 0
Therefore, c[0, 0] = 0, c[1 , 1] = 0, c[2,2] = 0, c[3,3] = 0, c[4,4] = 0
Now we will calculate the values where j-i equal to 1.
When j=1, i=0 then j-i = 1
When j=2, i=1 then j-i = 1
When j=3, i=2 then j-i = 1
When j=4, i=3 then j-i = 1
Now to calculate the cost, we will consider only the jth value.
The cost of c[0,1] is 4 (The key is 10, and the cost corresponding to key 10 is 4).
The cost of c[1,2] is 2 (The key is 20, and the cost corresponding to key 20 is 2).
The cost of c[2,3] is 6 (The key is 30, and the cost corresponding to key 30 is 6)
The cost of c[3,4] is 3 (The key is 40, and the cost corresponding to key 40 is 3)

Now we will calculate the values where j-i = 2


When j=2, i=0 then j-i = 2
When j=3, i=1 then j-i = 2
When j=4, i=2 then j-i = 2
In this case, we will consider two keys.


o When i=0 and j=2, then keys 10 and 20. There are two possible trees that can be made out from
these two keys shown below:

In the first binary tree, cost would be: 4*1 + 2*2 = 8


In the second binary tree, cost would be: 4*2 + 2*1 = 10
The minimum cost is 8; therefore, c[0,2] = 8

o When i=1 and j=3, then keys 20 and 30. There are two possible trees that can be made out from
these two keys shown below:
In the first binary tree, cost would be: 1*2 + 2*6 = 14
In the second binary tree, cost would be: 1*6 + 2*2 = 10
The minimum cost is 10; therefore, c[1,3] = 10
o When i=2 and j=4, we will consider the keys at 3 and 4, i.e., 30 and 40. There are two possible trees
that can be made out from these two keys shown as below:
In the first binary tree, cost would be: 1*6 + 2*3 = 12
In the second binary tree, cost would be: 1*3 + 2*6 = 15
The minimum cost is 12, therefore, c[2,4] = 12


Now we will calculate the values when j-i = 3


When j=3, i=0 then j-i = 3
When j=4, i=1 then j-i = 3
o When i=0, j=3 then we will consider three keys, i.e., 10, 20, and 30.

The following are the trees that can be made if 10 is considered as a root node.

In the above tree, 10 is the root node, 20 is the right child of node 10, and 30 is the right child of node 20.
Cost would be: 1*4 + 2*2 + 3*6 = 26

In the above tree, 10 is the root node, 30 is the right child of node 10, and 20 is the left child of node 30.
Cost would be: 1*4 + 2*6 + 3*2 = 22
The following tree can be created if 20 is considered as the root node.


In the above tree, 20 is the root node, 30 is the right child of node 20, and 10 is the left child of node 20.
Cost would be: 1*2 + 2*4 + 2*6 = 22
The following are the trees that can be created if 30 is considered as the root node.

In the above tree, 30 is the root node, 20 is the left child of node 30, and 10 is the left child of node 20.
Cost would be: 1*6 + 2*2 + 3*4 = 22

In the above tree, 30 is the root node, 10 is the left child of node 30 and 20 is the right child of node 10.
Cost would be: 1*6 + 2*4 + 3*2 = 20
Therefore, the minimum cost is 20 which is the 3rd root. So, c[0,3] is equal to 20.
o When i=1 and j=4 then we will consider the keys 20, 30, 40

c[1,4] = min{ c[1,1] + c[2,4], c[1,2] + c[3,4], c[1,3] + c[4,4] } + 11


= min{0+12, 2+3, 10+0}+ 11
= min{12, 5, 10} + 11
The minimum value is 5; therefore, c[1,4] = 5+11 = 16


o Now we will calculate the values when j-i = 4


When j=4 and i=0 then j-i = 4
In this case, we will consider four keys, i.e., 10, 20, 30 and 40. The frequencies of 10, 20, 30 and 40 are 4,
2, 6 and 3 respectively.
w[0, 4] = 4 + 2 + 6 + 3 = 15
If we consider 10 as the root node then
C[0, 4] = min {c[0,0] + c[1,4]}+ w[0,4]
= min {0 + 16} + 15= 31
If we consider 20 as the root node then
C[0,4] = min{c[0,1] + c[2,4]} + w[0,4]
= min{4 + 12} + 15
= 16 + 15 = 31
If we consider 30 as the root node then,
C[0,4] = min{c[0,2] + c[3,4]} +w[0,4]
= min {8 + 3} + 15
= 26
If we consider 40 as the root node then,
C[0,4] = min{c[0,3] + c[4,4]} + w[0,4]
= min{20 + 0} + 15
= 35
In the above cases, we have observed that 26 is the minimum cost; therefore, c[0,4] is equal to 26.

The optimal binary tree can be created as:


General formula for calculating the minimum cost is:

C[i, j] = min over i < k ≤ j of { C[i, k-1] + C[k, j] } + w(i, j)
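
The general formula can be implemented directly. Below is a minimal C sketch that fills the c and w tables for the example keys 10, 20, 30, 40 with frequencies 4, 2, 6, 3; it prints the optimal cost 26 computed above.

#include <stdio.h>
#include <limits.h>

#define N 4

int optimalBST(int freq[], int n) {
    int c[N + 1][N + 1];              /* c[i][j]: optimal cost of keys i+1..j */
    int w[N + 1][N + 1];              /* w[i][j]: sum of their frequencies */
    for (int i = 0; i <= n; i++) {
        c[i][i] = 0;
        w[i][i] = 0;
    }
    for (int len = 1; len <= n; len++) {
        for (int i = 0; i + len <= n; i++) {
            int j = i + len;
            w[i][j] = w[i][j - 1] + freq[j - 1];
            c[i][j] = INT_MAX;
            for (int k = i + 1; k <= j; k++) {   /* try key k as the root */
                int cost = c[i][k - 1] + c[k][j] + w[i][j];
                if (cost < c[i][j]) c[i][j] = cost;
            }
        }
    }
    return c[0][n];
}

int main(void) {
    int freq[N] = {4, 2, 6, 3};       /* frequencies of keys 10, 20, 30, 40 */
    printf("Cost of optimal BST: %d\n", optimalBST(freq, N));
    return 0;
}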

An Activity Selection Problem


The activity selection problem is a mathematical optimization problem. Our first illustration is the problem
of scheduling a resource among several competing activities. We find that a greedy algorithm provides a well-
designed and simple method for selecting a maximum-size set of mutually compatible activities.
Suppose S = {1, 2, ..., n} is the set of n proposed activities. The activities share a resource which can be used
by only one activity at a time, e.g., a tennis court, a lecture hall, etc. Each activity i has a start time si and
a finish time fi, where si ≤ fi. If selected, activity i takes place during the half-open time interval [si, fi).
Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., i and j are compatible
if si ≥ fj or sj ≥ fi). The activity-selection problem is to choose a maximum-size set of mutually compatible
activities.
Algorithm Of Greedy- Activity Selector:
GREEDY- ACTIVITY SELECTOR (s, f)
1. n ← length [s]
2. A ← {1}
3. j ← 1.
4. for i ← 2 to n
5. do if si ≥ fj
6. then A ← A ∪ {i}
7. j ← i
8. return A
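
A minimal C sketch of GREEDY-ACTIVITY-SELECTOR applied to the example below is given here; the arrays are assumed to be already arranged in increasing order of finish time, and the name array simply keeps the original activity labels.

#include <stdio.h>

int main(void) {
    /* Activities from the example, sorted by finish time. */
    int s[] = {1, 3, 2, 4, 8, 7, 9, 11, 9, 12};    /* start times */
    int f[] = {3, 4, 5, 7, 9, 10, 11, 12, 13, 14}; /* finish times (sorted) */
    const char *name[] = {"A1","A3","A2","A4","A6","A5","A7","A9","A8","A10"};
    int n = 10;

    printf("%s ", name[0]);   /* the first activity is always selected */
    int j = 0;                /* index of the last selected activity */
    for (int i = 1; i < n; i++) {
        if (s[i] >= f[j]) {   /* starts after the last finish => compatible */
            printf("%s ", name[i]);
            j = i;
        }
    }
    printf("\n");             /* prints: A1 A3 A4 A6 A7 A9 A10 */
    return 0;
}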
Example: Given 10 activities along with their start and end time as
S = (A1 A2 A3 A4 A5 A6 A7 A8 A9 A10)


Si = (1,2,3,4,7,8,9,9,11,12)
fi = (3,5,4,7,10,9,11,13,12,14)
Compute a schedule where the greatest number of activities takes place.
Solution: The solution to the above Activity scheduling problem using a greedy strategy is illustrated
below:
Arranging the activities in increasing order of end time

Now, schedule A1
Next schedule A3 as A1 and A3 are non-interfering.
Next skip A2 as it is interfering.
Next, schedule A4 as A1 A3 and A4 are non-interfering, then next, schedule A6 as A1 A3 A4 and A6 are non-
interfering.
Skip A5 as it is interfering.
Next, schedule A7 as A1 A3 A4 A6 and A7 are non-interfering.
Next, schedule A9 as A1 A3 A4 A6 A7 and A9 are non-interfering.
Skip A8 as it is interfering.
Next, schedule A10 as A1 A3 A4 A6 A7 A9 and A10 are non-interfering.
Thus the final Activity schedule is:


Optimal Merge Pattern Algorithm


Merge a set of sorted files of different length into a single sorted file. We need to find an optimal solution,
where the resultant file will be generated in minimum time.
If a number of sorted files is given, there are many ways to merge them into a single sorted file. This
merge can be performed pair wise; hence, this type of merging is called 2-way merge patterns.
As different pairings require different amounts of time, in this strategy we want to determine an optimal
way of merging many files together. At each step, the two shortest sequences are merged.
To merge a p-record file and a q-record file requires possibly p + q record moves, the obvious choice
being, merge the two smallest files together at each step.
Two-way merge patterns can be represented by binary merge trees. Let us consider a set of n sorted files {f1,
f2, f3, …, fn}. Initially, each element of this is considered as a single node binary tree. To find this optimal
solution, the following algorithm is used.
Pseudocode
Following is the pseudocode of the Optimal Merge Pattern Algorithm −
for i := 1 to n - 1 do
    declare new node
    node.leftchild := least (list)
    node.rightchild := least (list)
    node.weight := (node.leftchild).weight + (node.rightchild).weight
    insert (list, node)
done
return least (list)
At the end of this algorithm, the weight of the root node represents the optimal cost.
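
The pseudocode can be realized in C as follows. For brevity, this sketch replaces the heap-based least(list) with a simple O(n^2) scan for the two smallest files; the file sizes are those of the example below.

#include <stdio.h>

int minIndex(int a[], int n) {           /* index of the smallest element */
    int m = 0;
    for (int i = 1; i < n; i++)
        if (a[i] < a[m]) m = i;
    return m;
}

int optimalMergeCost(int files[], int n) {
    int total = 0;
    while (n > 1) {
        int i = minIndex(files, n);      /* smallest file */
        int first = files[i];
        files[i] = files[--n];           /* remove it from the list */
        int j = minIndex(files, n);      /* second smallest file */
        int second = files[j];
        files[j] = first + second;       /* replace by the merged file */
        total += first + second;         /* record moves for this merge */
    }
    return total;
}

int main(void) {
    int files[] = {20, 30, 10, 5, 30};   /* f1..f5 from the example */
    printf("Optimal merge cost: %d\n", optimalMergeCost(files, 5)); /* 205 */
    return 0;
}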
Examples
Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of elements respectively.
If merge operations are performed according to the provided sequence, then
M1 = merge f1 and f2 => 20 + 30 = 50
M2 = merge M1 and f3 => 50 + 10 = 60
M3 = merge M2 and f4 => 60 + 5 = 65
M4 = merge M3 and f5 => 65 + 30 = 95
Hence, the total number of operations is
50 + 60 + 65 + 95 = 270
Now, the question arises is there any better solution?
Sorting the numbers according to their size in an ascending order, we get the following sequence −
f4, f3, f1, f2, f5
Hence, merge operations can be performed on this sequence
M1 = merge f4 and f3 => 5 + 10 = 15
M2 = merge M1 and f1 => 15 + 20 = 35
M3 = merge M2 and f2 => 35 + 30 = 65

134
Course Code/Title:CS3302/Design Analysis of Algorithms

M4 = merge M3 and f5 => 65 + 30 = 95


Therefore, the total number of operations is
15 + 35 + 65 + 95 = 210
Obviously, this is better than the previous one.
In this context, we are now going to solve the problem using this algorithm.
Initial Set

Step 1

Step 2

Step 3

Step 4


Hence, by always merging the two smallest files currently in the list, the solution takes 15 + 35 + 60 + 95 = 205 record moves, which is even better than the 210 obtained above.


0/1 Knapsack Problem
Here the knapsack is like a container or a bag. Suppose we are given some items which have weights
and profits. We have to put items in the knapsack in such a way that the total value produces a maximum profit.
For example, the capacity of the container is 20 kg. We have to select the items in such a way that the sum
of the weights of the items is smaller than or equal to the capacity of the container, and the profit
is maximum.
There are two types of knapsack problems:
o 0/1 knapsack problem

o Fractional knapsack problem


We will discuss both the problems one by one. First, we will learn about the 0/1 knapsack problem.
What is the 0/1 knapsack problem?
The 0/1 knapsack problem means that each item is either completely included in the knapsack or not included at all.
For example, suppose we have two items having weights 2 kg and 3 kg, respectively. If we pick the 2 kg item, we
cannot pick just 1 kg out of it, because an item is not divisible; we have to pick the 2 kg item completely.
This is the 0/1 knapsack problem, in which either we pick an item completely or we leave it. The
0/1 knapsack problem is solved by dynamic programming.
What is the fractional knapsack problem?
The fractional knapsack problem means that we can divide an item. For example, if we have an item of 3 kg,
we can pick 2 kg of it and leave the remaining 1 kg. The fractional knapsack problem is solved by
the greedy approach.
Example of 0/1 knapsack problem.
Consider a problem with the following weights and profits:
Weights: {3, 4, 6, 5}
Profits: {2, 3, 1, 4}
The capacity of the knapsack is 8 kg
The number of items is 4
Some of the possible selections for the above problem are:
xi = {1, 0, 0, 1}
xi = {0, 0, 0, 1}
xi = {0, 1, 0, 1}
Here 1 denotes that the item is completely picked and 0 means that the item is not picked. Since there
are 4 items, the number of possible combinations is:


2^4 = 16, so there are 16 possible combinations that can be made for the above problem. Once all the
combinations are made, we have to select the combination that provides the maximum profit.
Another approach to solve the problem is dynamic programming approach. In dynamic programming
approach, the complicated problem is divided into sub-problems, then we find the solution of a sub-problem
and the solution of the sub-problem will be used to find the solution of a complex problem.
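
Before filling the table by hand, here is a compact C sketch of the dynamic programming approach for this example (weights {3, 4, 5, 6}, profits {2, 3, 4, 1}, capacity 8); M[i][w] holds the best profit using the first i items with capacity w, matching the table built below.

#include <stdio.h>

#define N 4
#define W 8

int max(int a, int b) { return a > b ? a : b; }

int knapsack(int wt[], int p[]) {
    int M[N + 1][W + 1];
    for (int w = 0; w <= W; w++) M[0][w] = 0;    /* no items => profit 0 */
    for (int i = 1; i <= N; i++) {
        for (int w = 0; w <= W; w++) {
            M[i][w] = M[i - 1][w];               /* option 1: skip item i */
            if (wt[i - 1] <= w)                  /* option 2: take item i */
                M[i][w] = max(M[i][w],
                              M[i - 1][w - wt[i - 1]] + p[i - 1]);
        }
    }
    return M[N][W];
}

int main(void) {
    int wt[N] = {3, 4, 5, 6};   /* weights in ascending order */
    int p[N]  = {2, 3, 4, 1};   /* profits matched to those weights */
    printf("Maximum profit: %d\n", knapsack(wt, p));
    return 0;
}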
How this problem can be solved by using the Dynamic programming approach?
First, we create a matrix shown as below, with one column for each capacity from 0 to 8 and one row for
each item (plus a row 0 for "no items"):

    0   1   2   3   4   5   6   7   8
0
1
2
3
4
In the above matrix, the columns represent the capacities 0 through 8, and the rows represent the items with
their profits and weights. We have not taken the weight 8 directly; the problem is divided into sub-problems
with capacities 0, 1, 2, 3, 4, 5, 6, 7, 8. The solutions of the sub-problems are saved in the cells, and the answer
to the full problem is stored in the final cell. First, we write the weights in ascending order and the profits
according to their weights, shown as below:
wi = {3, 4, 5, 6}
pi = {2, 3, 4, 1}
The first row and the first column would be 0 as there is no item for w=0
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0

2 0

3 0

4 0
When i=1, W=1
w1 = 3; Since we have only one item in the set having weight 3, but the capacity of the knapsack is 1. We
cannot fill the item of 3kg in the knapsack of capacity 1 kg so add 0 at M[1][1] shown as below:


0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0

2 0

3 0

4 0
When i = 1, W = 2
w1 = 3; Since we have only one item in the set having weight 3, but the capacity of the knapsack is 2. We
cannot fill the item of 3kg in the knapsack of capacity 2 kg so add 0 at M[1][2] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0

2 0

3 0

4 0
When i=1, W=3
w1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is also
3; therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][3] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2

2 0

3 0

4 0
When i=1, W = 4


W1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is 4;
therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][4] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2

2 0

3 0

4 0
When i=1, W = 5
W1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is 5;
therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][5] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2

2 0

3 0

4 0
When i =1, W=6
W1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is 6;
therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][6] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2

2 0


3 0

4 0
When i=1, W = 7
W1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is 7;
therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][7] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2

2 0

3 0

4 0
When i =1, W =8
W1 = 3; Since we have only one item in the set having weight equal to 3, and weight of the knapsack is 8;
therefore, we can fill the knapsack with an item of weight equal to 3. We put profit corresponding to the
weight 3, i.e., 2 at M[1][8] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0

3 0

4 0
Now the value of 'i' gets incremented, and becomes 2.
When i =2, W = 1
The weight corresponding to the value 2 is 4, i.e., w 2 = 4. Since we have only one item in the set having
weight equal to 4, and the weight of the knapsack is 1. We cannot put the item of weight 4 in a knapsack,
so we add 0 at M[2][1] shown as below:
0 1 2 3 4 5 6 7 8


0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0

3 0

4 0
When i =2, W = 2
The weight corresponding to the value 2 is 4, i.e., w 2 = 4. Since we have only one item in the set having
weight equal to 4, and the weight of the knapsack is 2. We cannot put the item of weight 4 in a knapsack,
so we add 0 at M[2][2] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0

3 0

4 0
When i =2, W = 3
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights
3 and 4, and the weight of the knapsack is 3. We can put the item of weight 3 in a knapsack, so we add 2 at
M[2][3] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2

3 0

4 0
When i =2, W = 4
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights


3 and 4, and the weight of the knapsack is 4. We can put item of weight 4 in a knapsack as the profit
corresponding to weight 4 is more than the item having weight 3, so we add 3 at M[2][4] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3

3 0

4 0
When i = 2, W = 5
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights
3 and 4, and the weight of the knapsack is 5. We can put item of weight 4 in a knapsack and the profit
corresponding to weight is 3, so we add 3 at M[2][5] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3

3 0

4 0
When i = 2, W = 6
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights
3 and 4, and the weight of the knapsack is 6. We can put item of weight 4 in a knapsack and the profit
corresponding to weight is 3, so we add 3 at M[2][6] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3

3 0


4 0
When i = 2, W = 7
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights
3 and 4, and the weight of the knapsack is 7. We can put item of weight 4 and 3 in a knapsack and the
profits corresponding to weights are 2 and 3; therefore, the total profit is 5, so we add 5 at M[2][7] shown
as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5

3 0

4 0
When i = 2, W = 8
The weight corresponding to the value 2 is 4, i.e., w2 = 4. Since we have two items in the set having weights
3 and 4, and the weight of the knapsack is 8, we can put the items of weight 4 and 3 in the knapsack; the
profits corresponding to these weights are 3 and 2, so the total profit is 5 and we add 5 at M[2][8] shown
as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0

4 0
Now the value of 'i' gets incremented, and becomes 3.
When i = 3, W = 1
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set having
weights 3, 4, and 5, and the weight of the knapsack is 1, we cannot put any of the items in the knapsack,
so we add 0 at M[3][1] shown as below:
0 1 2 3 4 5 6 7 8


0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0

4 0
When i = 3, W = 2
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set having
weights 3, 4, and 5, and the weight of the knapsack is 2, we cannot put any of the items in the knapsack,
so we add 0 at M[3][2] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0

4 0
When i = 3, W = 3
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,
4, and 5 respectively and weight of the knapsack is 3. The item with a weight 3 can be put in the knapsack
and the profit corresponding to the item is 2, so we add 2 at M[3][3] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2

4 0
When i = 3, W = 4
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,


4, and 5 respectively, and weight of the knapsack is 4. We can keep the item of either weight 3 or 4; the
profit (3) corresponding to the weight 4 is more than the profit corresponding to the weight 3 so we add 3
at M[3][4] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3

4 0
When i = 3, W = 5
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,
4, and 5 respectively, and weight of the knapsack is 5. We can keep the item of either weight 3, 4 or 5; the
profit (3) corresponding to the weight 4 is more than the profits corresponding to the weight 3 and 5 so we
add 3 at M[3][5] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3

4 0
When i =3, W = 6
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,
4, and 5 respectively, and weight of the knapsack is 6. We can keep the item of either weight 3, 4 or 5; the
profit (3) corresponding to the weight 4 is more than the profits corresponding to the weight 3 and 5 so we
add 3 at M[3][6] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2


2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3

4 0
When i =3, W = 7
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,
4, and 5 respectively, and weight of the knapsack is 7. In this case, we can keep both the items of weight 3
and 4, the sum of the profit would be equal to (2 + 3), i.e., 5, so we add 5 at M[3][7] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5

4 0
When i = 3, W = 8
The weight corresponding to the value 3 is 5, i.e., w3 = 5. Since we have three items in the set of weight 3,
4, and 5 respectively, and the weight of the knapsack is 8. In this case, we can keep both the items of weight
3 and 4, the sum of the profit would be equal to (2 + 3), i.e., 5, so we add 5 at M[3][8] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0
Now the value of 'i' gets incremented and becomes 4.
When i = 4, W = 1
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 1. The weight of all the items is more than the
weight of the knapsack, so we cannot add any item in the knapsack; Therefore, we add 0 at M[4][1] shown
as below:


0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0
When i = 4, W = 2
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 2. The weight of all the items is more than the
weight of the knapsack, so we cannot add any item in the knapsack; Therefore, we add 0 at M[4][2] shown
as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0
When i = 4, W = 3
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 3, the item with weight 3 can be put in the
knapsack, and the profit corresponding to weight 3 is 2, so we add 2 at M[4][3] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5


4 0 0 0 2
When i = 4, W = 4
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 4. The item with a weight 4 can be put in the
knapsack and the profit corresponding to the weight 4 is 3, so we will add 3 at M[4][4] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3
When i = 4, W = 5
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 5. The item with a weight 4 can be put in the
knapsack and the profit corresponding to the weight 4 is 3, so we will add 3 at M[4][5] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3
When i = 4, W = 6
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 6. In this case, we can put the items in the
knapsack either of weight 3, 4, 5 or 6 but the profit, i.e., 4 corresponding to the weight 6 is highest among
all the items; therefore, we add 4 at M[4][6] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0


1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3 4
When i = 4, W = 7
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 7. Here, if we add two items of weights 3 and 4
then it will produce the maximum profit, i.e., (2 + 3) equals to 5, so we add 5 at M[4][7] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3 4 5
When i = 4, W = 8
The weight corresponding to the value 4 is 6, i.e., w4 = 6. Since we have four items in the set of weights 3,
4, 5, and 6 respectively, and the weight of the knapsack is 8. Here, if we add two items of weights 3 and 4
then it will produce the maximum profit, i.e., (2 + 3) equals to 5, so we add 5 at M[4][8] shown as below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3 4 5 5
As we can observe in the above table, 5 is the maximum profit among all the entries. The pointer starts at
the last row and the last column, which holds the value 5. Now we compare this value 5 with the previous
row; if the previous row, i.e., i = 3, contains the same value 5, then the pointer shifts upwards. Since the
previous row does contain the value 5, the pointer is shifted upwards as shown in the below table:


0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3 4 5 5
Again, we compare the value 5 with the row above, i.e., i = 2. Since that row also contains the value 5, the
pointer is again shifted upwards as shown in the below table:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5

4 0 0 0 2 3 3 4 5 5
Again, we compare the value 5 with the row above, i.e., i = 1. Since row i = 1 does not contain the value 5,
the item of row i = 2 is part of the solution; the weight corresponding to this row is 4. Therefore, we select
the weight 4 and we reject the weights 5 and 6:
x = {x1, 1, 0, 0}
The profit corresponding to the weight 4 is 3, so the remaining profit is (5 - 3) = 2 and the remaining
capacity is (8 - 4) = 4. Now we compare this value 2 with row i = 1 at capacity 4. Since row i = 1 contains
the value 2 there, the pointer shifts upwards as shown below:
0 1 2 3 4 5 6 7 8

0 0 0 0 0 0 0 0 0 0

1 0 0 0 2 2 2 2 2 2

2 0 0 0 2 3 3 3 5 5

3 0 0 0 2 3 3 3 5 5


4 0 0 0 2 3 3 4 5 5
Again we compare the value 2 with the row above, i.e., i = 0. Since row i = 0 does not contain the value 2,
the item of row i = 1 is part of the solution, and the weight corresponding to i = 1 is 3:
X = {1, 1, 0, 0}
The profit corresponding to the weight 3 is 2, so the remaining profit becomes 0, and the pointer moves to
row i = 0, which contains only 0 values. Thus, in this problem two weights are selected, i.e., 3 and 4, to
maximize the profit.


UNIT IV
STATE SPACE SEARCH TREE AND BACKTRACKING

N-queen Problem
The N-Queens problem is to place n queens on an N x N chessboard in such a manner that no two queens
attack each other by being in the same row, column or diagonal.
It can be seen that for n = 1 the problem has a trivial solution, and no solution exists for n = 2 and n = 3. So
first we will consider the 4-queens problem and then generalize it to the n-queens problem.
Given a 4x4 chessboard, number the rows and columns of the chessboard 1 through 4.

Since we have to place 4 queens q1, q2, q3 and q4 on the chessboard such that no two queens attack
each other, each queen must be placed in a different row, i.e., we put queen "i" on row "i".
Now, we place queen q1 in the very first acceptable position (1, 1). Next, we place queen q2 so that both
these queens do not attack each other. We find that if we place q2 in columns 1 or 2, a dead end is
encountered. Thus the first acceptable position for q2 is column 3, i.e. (2, 3), but then no position is
left for placing queen q3 safely. So we backtrack one step and place queen q2 in (2, 4), the next
possible position. Then we obtain the position (3, 2) for placing q3. But later this position
also leads to a dead end, and no place is found where q4 can be placed safely. Then we have to
backtrack all the way to q1 and place it at (1, 2), after which all the other queens are placed safely by moving q2 to (2,
4), q3 to (3, 1) and q4 to (4, 3). That is, we get the solution (2, 4, 1, 3). This is one possible solution
for the 4-queens problem. For another possible solution, the whole method is repeated for all partial
solutions. The other solution for the 4-queens problem is (3, 1, 4, 2), i.e.


The implicit tree for the 4-queens problem for the solution (2, 4, 1, 3) is as follows:

The figure shows the complete state space for the 4-queens problem. But we can use the backtracking method
to generate only the necessary nodes and stop when the next node violates the rule, i.e., when two queens
are attacking.


4- Queens solution space with nodes numbered in DFS


It can be seen that all the solutions to the 4-queens problem can be represented as 4-tuples (x1, x2, x3, x4),
where xi represents the column on which queen qi is placed.
One possible solution for the 8-queens problem is shown in the figure:

1. Thus, one solution of the 8-queens problem is (4, 6, 8, 2, 7, 1, 3, 5).

2. Suppose two queens are placed at positions (i, j) and (k, l).

3. Then they are on the same diagonal only if i - j = k - l or i + j = k + l.

4. The first equation implies that j - l = i - k.

5. The second equation implies that j - l = k - i.

6. Therefore, two queens lie on the same diagonal if and only if |j - l| = |i - k|.

Place(k, i) returns a Boolean value that is true if the kth queen can be placed in column i. It tests
both whether i is distinct from all previous values x1, x2, ..., xk-1 and whether there is no other
queen on the same diagonal.
Using Place, we give a precise solution to the n-queens problem.
Place(k, i)
{
    for j ← 1 to k - 1 do
        if (x[j] = i) or (Abs(x[j] - i) = Abs(j - k)) then
            return false;
    return true;
}
Place(k, i) returns true if a queen can be placed in the kth row and ith column; otherwise it returns
false. x[] is a global array whose first k - 1 values have been set. Abs(r) returns the absolute
value of r.
N-Queens(k, n)
{
    for i ← 1 to n do
        if Place(k, i) then
        {
            x[k] ← i;
            if (k == n) then
                write(x[1 ... n]);
            else
                N-Queens(k + 1, n);
        }
}
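As an illustration, the two procedures translate directly into runnable Python (1-based indexing is kept by leaving index 0 of x unused); this sketch prints both solutions of the 4-queens problem:

def place(x, k, i):
    # Queen k may go in column i if i differs from every earlier column
    # and no earlier queen shares a diagonal with square (k, i).
    for j in range(1, k):
        if x[j] == i or abs(x[j] - i) == abs(j - k):
            return False
    return True

def n_queens(k, n, x, solutions):
    for i in range(1, n + 1):
        if place(x, k, i):
            x[k] = i
            if k == n:
                solutions.append(x[1:])
            else:
                n_queens(k + 1, n, x, solutions)

solutions = []
n_queens(1, 4, [0] * 5, solutions)
print(solutions)    # [[2, 4, 1, 3], [3, 1, 4, 2]]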

Hamiltonian Circuit
The Hamiltonian cycle is a cycle in a graph that visits every vertex exactly once and terminates at the
starting node. It need not include all the edges.
 The Hamiltonian cycle problem is the problem of finding a Hamiltonian cycle in a
graph, if any such cycle exists.
 The input to the problem is an undirected, connected graph. For the graph shown in
Figure (a), the path A-B-E-D-C-A forms a Hamiltonian cycle. It visits all the vertices


exactly once, but does not visit the edge <B, D>.

 The Hamiltonian cycle problem is both a decision problem and an optimization problem. The decision problem is stated as, "Given a path, is it a Hamiltonian cycle of the graph?"
 The optimization problem is stated as, "Given a graph G, find a Hamiltonian cycle for the graph."
 We can define the constraints for the Hamiltonian cycle problem as follows:
 In any path, vertices i and (i + 1) must be adjacent.
 The 1st and (n - 1)th vertices must be adjacent (the nth vertex of the cycle is the initial vertex itself).
 Vertex i must not appear in the first (i - 1) vertices of any path.
 With the adjacency matrix representation of the graph, the adjacency of two vertices
can be verified in constant time.
Algorithm
HAMILTONIAN(i)
// Description: Solve the Hamiltonian cycle problem using backtracking.
// Input: Undirected, connected graph G = <V, E> and current position i
// Output: Hamiltonian cycle
if FEASIBLE(i) then
    if (i == n - 1) then
        Print V[0 ... n - 1]
    else
        j ← 2
        while (j ≤ n) do
            V[i] ← j
            HAMILTONIAN(i + 1)
            j ← j + 1
        end
    end
end

function FEASIBLE(i)
// V[i] must not repeat any earlier vertex
for j ← 1 to i - 1 do
    if (V[j] == V[i]) then
        return false
end
// V[i] must be adjacent to the previously placed vertex
if Adjacent(V[i - 1], V[i]) then
    return true
else
    return false
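A runnable Python version of the same backtracking is sketched below for illustration; the tour is fixed to start at vertex 0, and the adjacency matrix is an assumed 5-vertex example, not taken from the figure:

def hamiltonian_cycles(adj):
    # Enumerate Hamiltonian cycles of an undirected graph by backtracking.
    n = len(adj)
    path = [0]                                       # fix the starting vertex

    def extend():
        if len(path) == n:
            if adj[path[-1]][path[0]]:               # close the cycle
                print(path + [path[0]])
            return
        for v in range(1, n):
            if v not in path and adj[path[-1]][v]:   # unvisited and adjacent
                path.append(v)
                extend()
                path.pop()                           # backtrack

    extend()

adj = [[0, 1, 0, 1, 1],
       [1, 0, 1, 1, 0],
       [0, 1, 0, 1, 0],
       [1, 1, 1, 0, 1],
       [1, 0, 0, 1, 0]]
hamiltonian_cycles(adj)    # prints [0, 1, 2, 3, 4, 0] and its reverse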

Complexity Analysis
Looking at the state space tree, in the worst case the total number of nodes in the tree would be
T(n) = 1 + (n - 1) + (n - 1)^2 + (n - 1)^3 + ... + (n - 1)^(n-1) = ((n - 1)^n - 1) / (n - 2),
so T(n) = O(n^n). Thus, the Hamiltonian cycle algorithm runs in exponential time.

Example: Find the Hamiltonian cycle by using the backtracking approach for a given graph.


The backtracking approach uses a state-space tree to check whether there exists a Hamiltonian cycle in the
graph. Figure (g) shows the simulation of the Hamiltonian cycle algorithm. For simplicity, we have
not explored all possible paths; the concept is self-explanatory. It is not possible to include all the
paths in the figure, so a few of the successful and unsuccessful paths are traced. Black nodes
indicate the Hamiltonian cycle.

Subset Sum Problem
Sum of Subsets Problem: Given a set of positive integers, find the combination of numbers that
sum to a given value M.
The sum of subsets problem is analogous to the knapsack problem. The knapsack problem tries to fill the
knapsack using a given set of items so as to maximize the profit. Items are selected in such a way that
the total weight in the knapsack does not exceed the capacity of the knapsack. The inequality

condition in the knapsack problem is replaced by equality in the sum of subsets problem.
Given the set of n positive integers, W = {w1, w2, ..., wn}, and a positive integer M,
the sum of subsets problem can be formulated as follows (where wi and M correspond to item
weights and knapsack capacity in the knapsack problem): find a vector X = (x1, x2, ..., xn) such that
w1·x1 + w2·x2 + ... + wn·xn = M,
where each xi ∈ {0, 1}.
Numbers are sorted in ascending order, such that w1 < w2 < w3 < ... < wn. The solution is often
represented using the solution vector X: if the ith item is included, xi is set to 1, otherwise to 0. In
each iteration, one item is tested. If the inclusion of the item does not violate the constraint of the
problem, it is added. Otherwise we backtrack, remove the previously added item, and continue the same
procedure for all remaining items.
The solution is easily described by the state space tree. Each left edge denotes the inclusion of wi and
the right edge denotes the exclusion of wi. Any path from the root to a leaf forms a subset. A state-
space tree for n = 3 is demonstrated in Fig. (a).

Fig.(a):State space tree for n= 3


Algorithm for Sum of subsets
The algorithm for solving the sum of subsets problem using recursion is stated below:
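In outline, the recursion tries to include each wi in turn and backtracks as soon as the partial sum overshoots M or the remaining numbers can no longer reach M. A minimal Python sketch of this recursion (our own illustration; the function name sum_of_subsets is assumed) is:

def sum_of_subsets(w, M):
    # w is sorted in ascending order; prints every subset summing to M.
    n = len(w)
    x = [0] * n

    def search(k, current, remaining):
        if current == M:                         # found a solution
            print([w[i] for i in range(n) if x[i]])
            return
        if k == n:
            return
        if current + w[k] <= M:                  # left branch: include w[k]
            x[k] = 1
            search(k + 1, current + w[k], remaining - w[k])
            x[k] = 0
        if current + remaining - w[k] >= M:      # right branch: exclude w[k]
            search(k + 1, current, remaining - w[k])

    search(0, 0, sum(w))

sum_of_subsets([3, 5, 6, 7], 15)    # prints [3, 5, 7]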


Examples
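As an illustrative instance (not from the original notes), take W = {3, 5, 6, 7} and M = 15. Including 3 and 5 gives a partial sum of 8; including 6 next gives 14, and the remaining number 7 would overshoot M, so we backtrack; including 7 instead gives 3 + 5 + 7 = 15, a solution. All other branches are pruned because they either exceed M or can no longer reach it, so the only solution vector is X = (1, 1, 0, 1).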


Graph Colouring

In this problem, an undirected graph is given, along with m colors. The problem is to determine whether
it is possible to color the nodes with at most m different colors, such that no two adjacent vertices of the
graph get the same color. If a solution exists, then display which color is assigned to which
vertex.
Starting from vertex 0, we try to assign colors one by one to the different nodes. But before
assigning, we have to check whether the color is safe or not. A color is not safe if an adjacent
vertex already contains that color.
Input and Output. Input: the adjacency matrix of a graph G(V, E) and an integer m, which
indicates the maximum number of colors that can be used.

Let the maximum number of colors be m = 3.

Output: the algorithm returns which node is assigned which color. If a solution is not
possible, it returns false. For this input the assigned colors are:
Node 0 -> color 1
Node 1 -> color 2
Node 2 -> color 3
Node 3 -> color 2

Algorithm
isValid(vertex, colorList, col)
Input − Vertex to check, the colorList, and the color col we are trying to assign.
Output − True if assigning col is valid, otherwise false.
Begin
    for all vertices v of the graph, do
        if there is an edge between v and vertex, and colorList[v] = col, then
            return false
    done
    return true
End

graphColoring(colors, colorList, vertex)
Input − The number of available colors, the list recording which vertex is colored with which
color, and the current vertex.
Output − True when all vertices are colored, otherwise false.
Begin
    if all vertices are checked, then
        return true
    for all colors col from the available colors, do
        if isValid(vertex, colorList, col), then
            add col to the colorList for vertex
            if graphColoring(colors, colorList, vertex + 1) = true, then
                return true
            remove col for vertex
    done
    return false
End
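The pseudocode above translates almost line for line into Python. The sketch below is illustrative; the 4-vertex adjacency matrix is an assumed instance, chosen so that the printed assignment matches the output listed earlier:

def is_valid(graph, color_list, vertex, col):
    # col is safe only if no neighbour of vertex already uses it
    return all(not (graph[vertex][v] and color_list[v] == col)
               for v in range(len(graph)))

def graph_coloring(graph, m, color_list, vertex=0):
    if vertex == len(graph):              # all vertices coloured
        return True
    for col in range(1, m + 1):
        if is_valid(graph, color_list, vertex, col):
            color_list[vertex] = col
            if graph_coloring(graph, m, color_list, vertex + 1):
                return True
            color_list[vertex] = 0        # backtrack
    return False

graph = [[0, 1, 1, 1],
         [1, 0, 1, 0],
         [1, 1, 0, 1],
         [1, 0, 1, 0]]
colors = [0] * 4
if graph_coloring(graph, 3, colors):
    for v, c in enumerate(colors):
        print("Node %d -> color %d" % (v, c))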
Branch and Bound
Solving the 15-puzzle Problem (LCBB)
The problem consists of 15 numbered (1-15) tiles on a square box with 16 positions (one position is blank or
empty). The objective of this problem is to change the initial arrangement into the goal arrangement by
a series of legal moves.
The initial and goal arrangements are shown in the following figure.


1  3  4  15        1  2  3  4
2      5  12       5  6  7  8
7  6  11  14       9  10 11 12
8  9  10  13       13 14 15

Initial Arrangement        Final Arrangement

In the initial node four moves are possible: the user can move any one of the tiles 2, 3, 5 or 6 into the
empty position. So we have four possibilities to move from the initial node.
A legal move slides a tile adjacent to the empty position left, right, up or down, one tile at a time.
Each move creates a new arrangement, and each arrangement is called a state of the puzzle
problem. Using the different states, a state space tree diagram is created, in which edges are labeled
according to the direction in which the empty space moves.
The state space tree is very large because there can be 16! different arrangements. In the state space
tree, nodes are numbered per level. At each level we calculate the cost of each node
using the formula:
c(x) = f(x) + g(x),
where f(x) is the length of the path from the root (initial node) to node x, and g(x) is the estimated
length of the path from x downward to the goal node, taken as the number of non-blank tiles that are
not in their correct position.
Initially the bound is set as c(x) < infinity.
Each time, the node with the smallest cost is selected for further expansion towards the goal node.
This node becomes the E-node.
The state space tree with node costs is shown in the diagram.
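As an illustration, g(x) and c(x) can be computed as follows; boards are written as flat 16-tuples with 0 marking the blank, and the initial arrangement used is our reading of the figure above:

def cost(board, goal, depth):
    # c(x) = f(x) + g(x): f(x) is the depth of x in the tree, g(x) the
    # number of non-blank tiles that are out of place.
    g = sum(1 for have, want in zip(board, goal) if have != 0 and have != want)
    return depth + g

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0)
start = (1, 3, 4, 15, 2, 0, 5, 12, 7, 6, 11, 14, 8, 9, 10, 13)   # assumed
print(cost(start, goal, 0))    # cost of the root node, with f(x) = 0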


Assignment Problem
Problem Statement
Let's first define the job assignment problem. In a standard version of the job
assignment problem there are n jobs and n workers; to keep it simple,
we take a small number of jobs and workers in our example:

We can assign any of the available jobs to any worker, with the condition that if a job is
assigned to a worker, the other workers can't take that particular job. We should also
note that each job has some cost associated with it, and this cost differs from one worker to
another.
The main aim is to complete all the jobs by assigning one job to each worker in such a
way that the sum of the costs of all the jobs is minimized.
Branch and Bound Algorithm Pseudo code
Now let’s discuss how to solve the job assignment problem using a branch and
bound algorithm. Let’s see the pseudocode first:


Here, the input is the cost matrix, which contains information like the number of available jobs, the
list of available workers, and the associated cost of each job for each worker. The function MinCost()
maintains a list of active nodes. The function LeastCost() calculates the minimum cost of the
active nodes at each level of the tree. After finding the node with minimum cost, we remove the
node from the list of active nodes and return it.
We use the add() function in the pseudocode, which calculates the cost of a
particular node and adds it to the list of active nodes.
In the search space tree, each node contains some information, such as its cost, the total number of
jobs, as well as the total number of workers.
Now let's run the algorithm on the sample example we created:


Advantages
In a branch and bound algorithm, we don't explore all the nodes in the tree. That is
why the running time of a branch and bound algorithm is often smaller when compared
with exhaustive algorithms.
If the problem is not large and the branching can be done in a reasonable amount of time,
it finds an optimal solution for the given problem.
The branch and bound algorithm finds a minimal path to reach the optimal solution for a given problem; it
doesn't repeat nodes while exploring the tree.
Disadvantages
The branch and bound algorithm can be time-consuming. Depending on the size of the given
problem, the number of nodes in the tree can be too large in the worst case.

Knapsack Problem using branch and bound

Problem Statement
We are given a set of n objects, each of which has a value vi and a weight wi. The
objective of the 0/1 knapsack problem is to find a subset of objects such that the total value
is maximized, and

the sum of the weights of the objects does not exceed a given threshold W. An important
condition here is that one can either take an entire object or leave it; it is not possible to
take a fraction of an object.
Consider an example where n = 4, the values are given by {10, 10, 12, 18} and the weights
are given by {2, 4, 6, 9}. The maximum weight is W = 15. Here, the solution to the
problem is to include the first, second and fourth objects.

Here, the procedure to solve the problem is as follows (a sketch of the bound computation follows this list):

 Calculate the cost function and the upper bound for the two children of each
node. Here, the (i + 1)th level indicates whether the ith object is to be included
or not.

 If the cost function for a given node is greater than the upper bound, then the node need
not be explored further; hence, we can kill this node. Otherwise, calculate the upper
bound for this node. If this value is less than U, then replace the value of U with this
value. Then, kill all unexplored nodes whose cost function is greater than this value.
 The next node to be checked after reaching all nodes in a particular level will be the one
with the least cost function value among the unexplored nodes.
 While including an object, one needs to check whether adding the object
crosses the threshold. If it does, one has reached the terminal point in that
branch, and all the succeeding objects will not be included.
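The bound used here is the fractional (relaxed) bound: take whole objects greedily, then a fraction of the first object that does not fit. The Python helper below is an illustration (the notes negate values so that smaller is better, while this helper returns the positive bound); objects are assumed ordered by decreasing value/weight ratio:

def fractional_bound(value, weight, index, values, weights, W):
    # Upper bound for a node that has already collected `value` and
    # `weight` from objects before `index`: greedily add whole objects,
    # then a fraction of the first object that does not fit.
    if weight >= W:
        return value
    b = value
    for i in range(index, len(values)):
        if weight + weights[i] <= W:
            weight += weights[i]
            b += values[i]
        else:
            b += values[i] * (W - weight) / weights[i]   # take a fraction
            break
    return b

# Root bound for the example solved below
print(fractional_bound(0, 0, 0, [10, 10, 12, 18], [2, 4, 6, 9], 15))   # 38.0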

Time and Space Complexity

Even though this method is more efficient than the other solutions to this problem, its worst-case
time complexity is still O(2^n), in cases where the entire tree has to be explored.
However, in its best case, only one path through the tree has to be explored, and hence
its best-case time complexity is O(n). Since this method requires the creation of the
state space tree, its space complexity will also be exponential.

Solving an Example
Consider the problem with n = 4, V = {10, 10, 12, 18}, w = {2, 4, 6, 9} and W = 15.
Here, we calculate the initial upper bound to be U = 10 + 10 + 12 = 32. Note that the 4th
object cannot be included here, since that would exceed W. For the cost, we add 3/9 of
the fourth object's value (6), and hence the cost function is 38. Remember to negate the
values after calculation before comparison.
After calculating the cost at each node, kill nodes that do not need exploring. Hence, the
final state space tree will be as follows (here, the number of a node denotes the order in
which the state space tree was explored):


Note here that nodes 3 and 5 have been killed after updating U at node 7. Also, node 6 is
not explored further, since adding any more weight would exceed the threshold. At the end, only
nodes 6 and 8 remain. Since the value of U is less for node 8, we select this node. Hence the
solution is {1, 1, 0, 1}, and we can see here that the total weight is exactly equal to the threshold
value in this case.

Travelling Salesman Problem

 The Travelling Salesman Problem (TSP) is an interesting problem. It is defined
as: "given n cities and the distance between each pair of cities, find the path
which visits each city exactly once and comes back to the starting city, with the
constraint of minimizing the total travelling distance."
 TSP has many practical applications. It is used in network design and
transportation route design. The objective is to minimize the distance. We can
start the tour from any random city and visit the other cities in any order. With n cities,
n! different permutations are possible. Exploring all paths by brute force
may not be useful in real-life applications.
LCBB using a Static State Space Tree for the Travelling Salesman Problem
 Branch and bound is an effective way to find a better, if not the best, solution in quick
time by pruning some of the unnecessary branches of the search tree.
 It works as follows:
Consider a directed weighted graph G = (V, E, W), where nodes represent cities and
weighted directed edges represent the direction and distance between two cities.
1. Initially, the graph is represented by a cost matrix C, where
   Cij = cost of the edge, if there is a direct path from city i to city j
   Cij = ∞, if there is no direct path from city i to city j.
2. Convert the cost matrix to a reduced matrix by subtracting minimum values from the
appropriate rows and columns, so that each row and column contains at least one
zero entry.
3. Find the cost of the reduced matrix. The cost is the sum of the amounts subtracted
from the cost matrix to convert it into the reduced matrix.
4. Prepare the state space tree for the reduced matrix.
5. Find the least-cost node A (i.e., the E-node) by computing the reduced cost
matrix for every remaining node.
6. If the edge <i, j> is to be included, then do the following:
   (a) Set all values in row i and all values in column j of A to ∞.
   (b) Set A[j][1] = ∞.
   (c) Reduce A again, except for rows and columns having all ∞ entries.
7. Compute the cost of the newly created reduced matrix as
   Cost = L + M[i][j] + r,
   where L is the cost of the parent's reduced matrix, M[i][j] is the entry for edge <i, j>
   in the parent's matrix, and r is the reduction cost of the new matrix.
8. If all nodes are not visited, then go to step 4.
The reduction procedure is described below:
Row Reduction:
A matrix M is called a reduced matrix if each of its rows and columns has at least one
zero entry, or the entire row or entire column is ∞. Let M represent the
distance matrix of 5 cities. M can be row-reduced as follows:
M_RowRed = {Mij - min{Mij | 1 ≤ j ≤ n, and Mij < ∞}}
Consider the following distance matrix:

Find the minimum element of each row and subtract it from each cell of that row.


Reduced matrix would be:

Row reduction cost is the sum of all the values subtracted from the rows:
Row reduction cost (M) = 10 + 2 + 2 + 3 + 4 = 21
Column reduction:
Matrix M_RowRed is row reduced but not column reduced. A matrix is called column reduced
if each of its columns has at least one zero entry or all ∞ entries.
M_ColRed = {Mji - min{Mji | 1 ≤ i ≤ n, and Mji < ∞}}
To reduce the above matrix, we find the minimum element of each column and subtract it from each cell of that column.

Column reduced matrix MColRed would be:


Each row and column of M_ColRed has at least one zero entry, so this matrix is reduced.
Column reduction cost (M) = 1 + 0 + 3 + 0 + 0 = 4
The state space tree for the 5-city problem is depicted in Fig. 6.6.1. The number within a circle
indicates the order in which the node is generated, and the number on an edge indicates the city being visited.
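As an illustration, the reduction step can be coded directly. The sketch below is our own; the sample matrix is the classic five-city instance whose row and column reduction costs match the values 21 and 4 computed above, which should be treated as an assumption about the missing figure:

INF = float('inf')

def reduce_matrix(M):
    # Row-reduce then column-reduce M in place; return the total
    # amount subtracted (the lower-bound contribution).
    n = len(M)
    cost = 0
    for i in range(n):                        # row reduction
        m = min(M[i])
        if 0 < m < INF:
            cost += m
            M[i] = [x - m if x < INF else x for x in M[i]]
    for j in range(n):                        # column reduction
        m = min(M[i][j] for i in range(n))
        if 0 < m < INF:
            cost += m
            for i in range(n):
                if M[i][j] < INF:
                    M[i][j] -= m
    return cost

M = [[INF, 20, 30, 10, 11],
     [15, INF, 16,  4,  2],
     [ 3,  5, INF,  2,  4],
     [19,  6, 18, INF,  3],
     [16,  4,  7, 16, INF]]
print(reduce_matrix(M))    # 21 + 4 = 25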

Example
Example: Find the solution of the following travelling salesman problem using the branch and bound
method.

Solution:
 The procedure for dynamic reduction is as follows:
 Draw the state space tree with the optimal reduction cost at the root node.
 Derive the cost of the path from node i to j by setting all entries in the ith row and jth column to ∞. Also set M[j][i] = ∞.
 The cost of the corresponding node N for path i to j is: optimal cost + reduction cost + M[i][j].
 After exploring all nodes at level i, set the node with minimum cost as the E-node
and repeat the procedure until all nodes are visited.
 The given matrix is not reduced. In order to find its reduced matrix, we first
find the row reduced matrix, followed by the column reduced matrix if needed. We
can find the row reduced matrix by subtracting the minimum element of each row
from each element of the corresponding row. The procedure is described below:
 Reduce the above cost matrix by subtracting the minimum value from each row and column.


M'1 is not a reduced matrix. Reduce it by subtracting the minimum value from the corresponding column.
Doing this we get:

Cost of M1 = C(1)
= Row reduction cost + Column reduction cost
= (10 + 2 + 2 + 3 + 4) + (1 + 3) = 25
This means all tours in the graph have length at least 25. This is the lower bound on the cost of any tour.
State space tree

Let us find the cost of the edges from node 1 to nodes 2, 3, 4 and 5.

Select edge 1-2:
Set M1[1][ ] = M1[ ][2] = ∞
Set M1[2][1] = ∞
Reduce the resultant matrix if required.


M2 is already reduced.
Cost of node 2:
C(2) = C(1) + Reduction cost + M1[1][2]
= 25 + 0 + 10 = 35
Select edge 1-3:
Set M1[1][ ] = M1[ ][3] = ∞
Set M1[3][1] = ∞
Reduce the resultant matrix if required.

Cost of node 3:
C(3) = C(1) + Reduction cost + M1[1][3]
= 25 + 11 + 17 = 53
Select edge 1-4:
Set M1[1][ ] = M1[ ][4] = ∞
Set M1[4][1] = ∞
Reduce the resultant matrix if required.

Matrix M4 is already reduced.
Cost of node 4:


C(4) = C(1) + Reduction cost + M1[1][4]
= 25 + 0 + 0 = 25
Select edge 1-5:
Set M1[1][ ] = M1[ ][5] = ∞
Set M1[5][1] = ∞
Reduce the resultant matrix if required.

Cost of node 5:
C(5) = C(1) + Reduction cost + M1[1][5]
= 25 + 5 + 1 = 31
State space diagram:

Node 4 has the minimum cost, for path 1-4. From here we can go to vertex 2, 3 or 5. Let's explore all three nodes.
Select path 1-4-2 (add edge 4-2):
Set M4[1][ ] = M4[4][ ] = M4[ ][2] = ∞
Set M4[2][1] = ∞
Reduce the resultant matrix if required.

Matrix M6 is already reduced.
Cost of node 6:
C(6) = C(4) + Reduction cost + M4[4][2]
= 25 + 0 + 3 = 28


Select edge 4-3 (path 1-4-3):
Set M4[1][ ] = M4[4][ ] = M4[ ][3] = ∞
Set M4[3][1] = ∞
Reduce the resultant matrix if required.

M'7 is not reduced. Reduce it by subtracting 11 from column 1.

Cost of node 7:
C(7) = C(4) + Reduction cost + M4[4][3]
= 25 + (2 + 11) + 12 = 50
Select edge 4-5 (path 1-4-5):

Matrix M8 is already reduced.
Cost of node 8:
C(8) = C(4) + Reduction cost + M4[4][5]
= 25 + 11 + 0 = 36
State space tree


Path 1-4-2 leads to the minimum cost. Let's find the costs of the two possible extensions.

Add edge 2-3 (path 1-4-2-3):
Set M6[1][ ] = M6[4][ ] = M6[2][ ] = M6[ ][3] = ∞
Set M6[3][1] = ∞
Reduce the resultant matrix if required.

Cost of node 9:
C(9) = C(6) + Reduction cost + M6[2][3]
= 28 + (11 + 2) + 11 = 52
Add edge 2-5 (path 1-4-2-5):
Set M6[1][ ] = M6[4][ ] = M6[2][ ] = M6[ ][5] = ∞
Set M6[5][1] = ∞
Reduce the resultant matrix if required.


Cost of node 10:
C(10) = C(6) + Reduction cost + M6[2][5]
= 28 + 0 + 0 = 28
State space tree

Add edge 5-3 (path 1-4-2-5-3):


Cost of node 11:
C(11) = C(10) + Reduction cost + M10[5][3]
= 28 + 0 + 0 = 28
State space tree:

So we can select either of them. Thus the final path includes the edges <3,1>, <5,3>, <1,4>, <4,2>
and <2,5>, which form the tour 1-4-2-5-3-1. This tour has a cost of 28.


UNIT V
NP-COMPLETE AND APPROXIMATION ALGORITHMS

Tractable and Intractable Problems


Tractable problems refer to computational problems that can be solved efficiently using
algorithms that scale with the input size of the problem. In other words, the time required
to solve a tractable problem increases at most polynomially with the input size.
On the other hand, intractable problems are computational problems for which no known
algorithm can solve them efficiently in the worst case. This means that the time
required to solve an intractable problem grows exponentially, or even faster, with the input
size.
One example of a tractable problem is computing the sum of a list of n numbers. The time
required to solve this problem scales linearly with the input size, as each number can be
added to a running total in constant time. Another example is computing the shortest path
between two nodes in a graph, which can be solved efficiently using algorithms like
Dijkstra's algorithm or the A* algorithm.
In contrast, some well-known intractable problems include the traveling salesman problem,
the knapsack problem, and the Boolean satisfiability problem. These problems are NP-hard,
meaning that any problem in NP (the set of problems that can be solved in polynomial time
by a non-deterministic Turing machine) can be reduced to them in polynomial time. While
it is possible to find approximate solutions to these problems, there is no known algorithm that
can solve them exactly in polynomial time.
In short: tractable problems are those that can be solved efficiently by algorithms that scale well
with the input size, while intractable problems are those that cannot be solved efficiently in
the worst-case scenario.
Examples of Tractable problems
1. Sorting: Given a list of n items, the task is to sort them in ascending order.
Algorithms like Quick Sort and Merge Sort can solve this problem in
O(n log n) time.
2. Matrix multiplication: Given two matrices A and B, the task is to find their product
C = AB. The best-known algorithm for matrix multiplication runs in O(n^2.37) time,
which is considered tractable for practical applications.
3. Shortest path in a graph: Given a graph G and two nodes s and t, the task is to find
the shortest path between s and t. Algorithms like Dijkstra's algorithm and the A*
algorithm can solve this problem in O(m + n log n) time, where m is the
number of edges and n is the number of nodes in the graph.
4. Linear programming: Given a system of linear constraints and a linear objective
function, the task is to find the values of the variables that optimize the objective function
subject to the constraints. Algorithms like the simplex method solve this problem
efficiently in practice.
5. Graph coloring: Given an undirected graph G, the task is to assign a color to each node
such that no two adjacent nodes have the same color, using as few colors as possible.
A greedy heuristic produces a valid coloring in O(n^2) time, where n is the
number of nodes in the graph.
These problems are considered tractable because algorithms exist that solve them in
polynomial time, which means that the time required to solve them grows no
faster than a polynomial function of the input size.

Examples of intractable problems

1. Traveling salesman problem (TSP): Given a set of cities and the distances between
them, the task is to find the shortest possible route that visits each city exactly once
and returns to the starting city. The best-known exact algorithms for the TSP have
an exponential worst-case time complexity, which makes it intractable for large
instances of the problem.
2. Knapsack problem: Given a set of items with weights and values, and a knapsack
that can carry a maximum weight, the task is to find the most valuable subset of
items that can be carried by the knapsack. The knapsack problem is also NP-hard and
is intractable for large instances of the problem.
3. Boolean satisfiability problem (SAT): Given a Boolean formula in conjunctive
normal form (CNF), the task is to determine if there exists an assignment of truth
values to the variables that makes the formula true. The SAT problem is one of the
most well-known NP-complete problems, which means that any NP problem can be
reduced to SAT in polynomial time.
4. Subset sum problem: Given a set of integers and a target sum, the task is to find
a subset of the integers that sums up to the target sum. Like the knapsack
problem, the subset sum problem is also intractable for large instances of the
problem.
5. Graph isomorphism problem: Given two graphs G1 and G2, the task is to determine if there
exists a one-to-one mapping between their vertices that preserves edges. No polynomial-time
algorithm is known for this problem in general.
By contrast, the following problems all have polynomial-time algorithms:
1. Linear search: Given a list of n items, the task is to find a specific item in the
list. The time complexity of linear search is O(n), which is a polynomial
function of the input size.

2. Bubble sort: Given a list of n items, the task is to sort them in ascending order. The time
complexity of bubble sort is O(n^2), which is also a polynomial function of the input
size.
3. Shortest path in a graph: Given a graph G and two nodes s and t, the task is to find
the shortest path between s and t. Algorithms like Dijkstra's algorithm and the A*
algorithm can solve this problem in O(m + n log n) time, which is a
polynomial function of the input size.
4. Maximum flow in a network: Given a network with a source node and a sink node,
and capacities on the edges, the task is to find the maximum flow from the source to the
sink. The Ford-Fulkerson algorithm can solve this problem in O(m·f) time, where m is the
number of edges in the network and f is the maximum flow, which is also a
polynomial function of the input size for bounded flows.
5. Linear programming: Given a system of linear constraints and a linear objective
function, the task is to find the values of the variables that optimize the objective
function subject to the constraints. Algorithms like the simplex method solve this
problem efficiently in practice.

P (Polynomial) problems
P problems are problems that an algorithm can solve in a polynomial amount of
time, i.e., whose Big-O is a polynomial (O(1), O(n), O(n²), etc.). These are
problems that would be considered 'easy' to solve, and thus do not generally have
immense run times.
NP (Non-deterministic Polynomial) problems
NP problems are problems whose proposed solutions can be verified in polynomial time,
but for which no polynomial-time solving algorithm is known. Solving one by brute force
may take time like O(2^n) or O(n!) or much larger.
NP-Hard problems
A problem is classified as NP-Hard when an algorithm for solving it can be translated
into one for solving any NP problem. Then we can say this problem is at least as hard as any
NP problem, but it could be much harder or more complex.


NP-Complete Problems
NP-Complete problems are problems that live in both the NP and NP-Hard
classes. This means that NP-Complete problems can be verified in polynomial
time and that any NP problem can be reduced to this problem in polynomial time.

Bin Packing problem

The bin packing problem involves assigning n items of different weights to bins, each
of capacity c, such that the total number of used bins is minimized. It may be
assumed that all items have weights smaller than the bin capacity.
The following four algorithms depend on the order of their inputs. They pack the item
given first and then move on to the next item.
1) Next Fit algorithm

The simplest approximate approach to the bin packing problem is the Next-Fit (NF)
algorithm. The first item is assigned to bin 1.
Items 2, ..., n are then considered in increasing order of index: each item is assigned to the
current bin if it fits; otherwise, it is assigned to a new bin, which becomes the current
one.
Visual Representation
Let us consider the same example as used above, with bins of size 1.

Assuming the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}, the minimum
number of bins required would be Ceil(TotalWeight / BinCapacity) = Ceil(3.7 / 1)
= 4 bins.


The Next Fit solution NF(I) for this instance I would be:
Considering the 0.5-sized item first, we place it in the first bin.

Moving on to the 0.7-sized item, we cannot place it in the first bin, hence we place it in a new bin.

Moving on to the 0.5-sized item, we cannot place it in the current bin, hence we place it in a new bin.

Moving on to the 0.2-sized item, we can place it in the current (third) bin.

Similarly, placing all the other items following the Next-Fit algorithm we get:

Thus we need 6 bins as opposed to the 4 bins of the optimal solution. We can see
that this algorithm is not very efficient.
Analyzing the approximation ratio of the Next-Fit algorithm
The time complexity of the algorithm is clearly O(n). It is easy to prove that, for any
instance I of BPP, the solution value NF(I) provided by the algorithm satisfies the
bound


NF(I) < 2 z(I)
where z(I) denotes the optimal solution value. Furthermore, there exist instances for
which the ratio NF(I)/z(I) is arbitrarily close to 2, i.e., the worst-case approximation
ratio of NF is r(NF) = 2.
Pseudo code
NEXTFIT(size[], n, c)
// size[] is the array containing the sizes of the items,
// n is the number of items and c is the capacity of a bin
{
    // Initialize result (count of bins) and remaining capacity of the current bin;
    // starting bin_rem at 0 makes the first item open the first bin
    res = 0
    bin_rem = 0
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // If this item can't fit in the current bin, use a new bin
        if (size[i] > bin_rem) {
            res++
            bin_rem = c - size[i]
        }
        else
            bin_rem -= size[i]
    }
    return res
}
2) First Fit algorithm

A better algorithm, First-Fit (FF), considers the items in increasing order of index and
assigns each item to the lowest-indexed open bin into which it fits; only when the current
item cannot fit into any open bin is a new bin introduced.
Visual Representation
Let us consider the same example as used above, with bins of size 1.

Assuming the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}, the minimum
number of bins required would be Ceil(3.7 / 1) = 4 bins.
The First Fit solution FF(I) for this instance I would be:

Considering the 0.5-sized item first, we place it in the first bin.
Moving on to the 0.7-sized item, we cannot place it in the first bin, hence we place it in a new bin.

Moving on to the 0.5-sized item, we can place it in the first bin.

Moving on to the 0.2-sized item, we cannot place it in the first bin; we check the second bin and
place it there.

Moving on to the 0.4-sized item, we cannot place it in any existing bin, hence we place it in a new bin.

Similarly, placing all the other items following the First-Fit algorithm we get:


Thus we need 5 bins, as opposed to the 4 bins of the optimal solution, but this is much more
efficient than the Next-Fit algorithm.
Analyzing the approximation ratio of the First-Fit algorithm
If FF(I) is the First-Fit solution for instance I and z(I) is the optimal solution, then
it can be shown that First Fit never uses more than roughly 1.7·z(I) bins. So First-Fit is
better than Next Fit in terms of the upper bound on the number of bins.
Pseudo code
FIRSTFIT(size[], n, c)
// size[] is the array containing the sizes of the items,
// n is the number of items and c is the capacity of a bin
{
    // Initialize result (count of bins)
    res = 0
    // Array to store the remaining space in each bin; there can be at most n bins
    bin_rem[n]
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the first bin that can accommodate size[i]
        int j;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i]) {
                bin_rem[j] = bin_rem[j] - size[i]
                break
            }
        }
        // If no bin could accommodate size[i], open a new bin
        if (j == res) {
            bin_rem[res] = c - size[i]
            res++
        }
    }
    return res
}
3) Best Fit algorithm

The next algorithm, Best-Fit (BF), is obtained from FF by assigning the current item to the
feasible bin (if any) having the smallest residual capacity (breaking ties in favor of the lowest
indexed bin).
Simply put, the idea is to place the next item in the tightest spot: put it in the
bin so that the smallest empty space is left.
Visual Representation
Let us consider the same example as used above, with bins of size 1.

Assuming the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}, the
minimum number of bins required would be Ceil(3.7 / 1) = 4 bins.
The Best Fit solution BF(I) for this instance I would be:

Considering the 0.5-sized item first, we place it in the first bin.

Moving on to the 0.7-sized item, we cannot place it in the first bin, hence we place it in a new bin.


Moving on to the 0.5-sized item, we can place it in the first bin, filling it tightly.

Moving on to the 0.2-sized item, we cannot place it in the first bin, but we can place it in the second
bin tightly.

Moving on to the 0.4-sized item, we cannot place it in any existing bin, hence we place it in a new bin.

Similarly, placing all the other items following the Best-Fit algorithm we get:

Thus we need 5 bins, as opposed to the 4 bins of the optimal solution, but this is much more efficient
than the Next-Fit algorithm.
Analyzing the approximation ratio of the Best-Fit algorithm
It can be noted that Best-Fit (BF) is obtained from FF by assigning the current item
to the feasible bin (if any) having the smallest residual capacity (breaking ties in
favour of the lowest indexed bin). BF satisfies the same worst-case bounds as FF.

Analysis of the upper bound of the Best-Fit algorithm
If z(I) is the optimal number of bins, then Best Fit never uses more than 2·z(I) - 2 bins.
So Best Fit is the same as Next Fit in terms of the upper bound on the number of bins.


Pseudo code
BESTFIT(size[], n, c)
// size[] is the array containing the sizes of the items,
// n is the number of items and c is the capacity of a bin
{
    // Initialize result (count of bins)
    res = 0
    // Array to store the remaining space in each bin; there can be at most n bins
    bin_rem[n]
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the best bin that can accommodate size[i]:
        // initialize the minimum space left and the index of the best bin
        int j, min = c + 1, bi = 0;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i] && bin_rem[j] - size[i] < min) {
                bi = j
                min = bin_rem[j] - size[i]
            }
        }
        // If no bin could accommodate size[i], create a new bin
        if (min == c + 1) {
            bin_rem[res] = c - size[i]
            res++
        }
        else
            // Assign the item to the best bin
            bin_rem[bi] -= size[i]
    }
    return res
}
In the offline version, we have all the items at our disposal from the start of the
execution. The natural solution is to sort the array from largest to smallest, and then
apply the algorithms discussed above.
NOTE: In the online programs we have given the inputs up front for simplicity, but they
can also work interactively.
Let us look at the various offline algorithms.


1) First Fit Decreasing

We first sort the array of items in decreasing order by size and then apply the First-Fit
algorithm as discussed above (see the sketch after this list).
Algorithm
 Read the inputs (items)
 Sort the array of items in decreasing order by their sizes
 Apply the First-Fit algorithm
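A compact Python sketch of First Fit Decreasing follows (an illustration; the item sizes of the example are scaled by 10 to integers so the capacity comparisons are exact):

def first_fit_decreasing(sizes, c):
    # Sort items in decreasing order, then First-Fit: each item goes
    # into the lowest-indexed bin with enough remaining room.
    bins = []                        # remaining capacity of each open bin
    for s in sorted(sizes, reverse=True):
        for i, rem in enumerate(bins):
            if rem >= s:             # first bin that fits
                bins[i] -= s
                break
        else:
            bins.append(c - s)       # open a new bin
    return len(bins)

items = [5, 7, 5, 2, 4, 2, 5, 1, 6]     # the example sizes, times 10
print(first_fit_decreasing(items, 10))  # 4 bins, matching the optimum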
Visual Representation
Let us consider the same example as used above, with bins of size 1.

Assuming the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6},
sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}.
The First Fit Decreasing solution would be:
We start with 0.7 and place it in the first bin.

We then select the 0.6 sized item. We cannot place it in bin 1, so we place it in bin 2.

We then select the 0.5 sized item. We cannot place it in any existing bin, so we place it in bin 3.

We then select the next 0.5 sized item. We can place it in bin 3.


Doing the same for all items, we get:

Thus only 4 bins are required, which is the same as the optimal solution.

2) Best Fit Decreasing

We first sort the array of items in decreasing order by size and then apply the Best-Fit
algorithm as discussed above (a short sketch follows the steps below).
Algorithm
 Read the inputs of items
 Sort the array of items in decreasing order by their sizes
 Apply Best-Fit algorithm
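A minimal C sketch of these steps, reusing cmpDesc and bestFit from the sketches above (illustrative names, not part of the original notes).

#include <stdlib.h>

// Best Fit Decreasing: sort the sizes in decreasing order, then run Best Fit.
// On the scaled example {5,7,5,2,4,2,5,1,6} with capacity 10 it uses 4 bins.
int bestFitDecreasing(int size[], int n, int c)
{
    qsort(size, n, sizeof(int), cmpDesc);
    return bestFit(size, n, c);
}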
Visual Representation
Let us consider the same example as used above and bins of size 1

Assuming the sizes of the items to be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6},
sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}.
The Best Fit Decreasing solution would be:
We will start with 0.7 and place it in the first bin.


We then select the 0.6 sized item. We cannot place it in bin 1, so we place it in bin 2.

We then select the 0.5 sized item. We cannot place it in any existing bin, so we place it in bin 3.

We then select the next 0.5 sized item. We can place it in bin 3.

Doing the same for all items, we get:

Thus only 4 bins are required, which is the same as the optimal solution.
Approximation Algorithms for the Traveling Salesman Problem
We solved the traveling salesman problem by exhaustive search in Section 3.4,
mentioned its decision version as one of the most well-known NP-complete problems
in Section 11.3, and saw how its instances can be solved by a branch-and-bound
algorithm in Section 12.2. Here, we consider several approximation algorithms, a
small sample of the dozens of such algorithms suggested over the years for this famous
problem.

But first let us answer the question of whether we should hope to find a polynomial-
time approximation algorithm with a finite performance ratio on all instances of the
traveling salesman problem. As the following theorem [Sah76] shows, the answer turns
out to be no, unless P = NP.

THEOREM 1 If P != NP, there exists no c-approximation algorithm for the traveling
salesman problem, i.e., there exists no polynomial-time approximation algorithm for
this problem so that f(sa) ≤ c f(s∗) for all instances.

Nearest-neighbor algorithm
The following well-known greedy algorithm is based on the nearest-neighbor
heuristic: always go next to the nearest unvisited city (a minimal sketch follows the steps below).
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been visited: go to the
unvisited city nearest the one visited last (ties can be broken arbitrarily).
Step 3 Return to the starting city.
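A minimal C sketch of the heuristic, assuming a complete graph given as an N x N cost matrix; the matrix size, INF sentinel, and the name nearestNeighborTour are illustrative assumptions, not part of the original notes.

#define N 4                 // number of cities (hypothetical instance)
#define INF 1000000

// Runs the nearest-neighbor heuristic starting from city 'start'.
// w[i][j] is the distance from city i to city j. Returns the tour length;
// tour[] receives the visiting order (tour[N] closes the circuit).
int nearestNeighborTour(int w[N][N], int start, int tour[N + 1])
{
    int visited[N] = {0};
    int cur = start, length = 0;

    visited[start] = 1;
    tour[0] = start;

    for (int step = 1; step < N; step++) {
        int best = -1, bestDist = INF;
        // Step 2: go to the nearest unvisited city
        for (int v = 0; v < N; v++)
            if (!visited[v] && w[cur][v] < bestDist) {
                best = v;
                bestDist = w[cur][v];
            }
        visited[best] = 1;
        length += bestDist;
        tour[step] = cur = best;
    }
    // Step 3: return to the starting city
    length += w[cur][start];
    tour[N] = start;
    return length;
}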
EXAMPLE 1 For the instance represented by the graph in Figure 12.10, with a as the
starting vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian
circuit) sa: a − b − c − d − a of length 10.

The optimal solution, as can be easily checked by exhaustive search,
is the tour s∗: a − b − d − c − a of length 8. Thus, the accuracy ratio of this
approximation is
r(sa) = f(sa)/f(s∗) = 10/8 = 1.25
(i.e., tour sa is 25% longer than the optimal tour s∗).

Unfortunately, except for its simplicity, not many good things can be said about the
nearest-neighbor algorithm. In particular, nothing can be said in general about the
accuracy of solutions obtained by this algorithm, because it can force us to traverse a
very long edge on the last leg of the tour. Indeed, if we change the weight of edge
(a, d) from 6 to an arbitrarily large number w ≥ 6 in Example 1, the algorithm will still
yield the tour a − b − c − d − a of length 4 + w, and the optimal solution will still be a
− b − d − c − a of length 8. Hence,
r(sa) = f(sa)/f(s∗) = (4 + w)/8,
which can be made as large as we wish by choosing an appropriately large value of w. Hence, RA =
∞ for this algorithm (as it should be, according to Theorem 1).

Twice-around-the-tree algorithm
Step 1 Construct a minimum spanning tree of the graph corresponding to a given
instance of the traveling salesman problem.
Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning
tree, recording all the vertices passed by. (This can be done by a DFS traversal.)
Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated
occurrences of the same vertex except the starting one at the end of the list. (This step
is equivalent to making shortcuts in the walk.) The vertices remaining on the list will
form a Hamiltonian circuit, which is the output of the algorithm. A sketch of the
shortcutting step appears after these steps.
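Steps 1 and 2 require an MST routine and a DFS, so the C sketch below covers only the shortcutting of Step 3, using the fact that visiting the MST's vertices in DFS preorder is exactly the doubled walk with repeated vertices removed. The names shortcutTourLength, w, and preorder are illustrative assumptions.

#define N 5   // number of vertices (hypothetical instance)

// Length of the Hamiltonian circuit obtained by shortcutting: preorder[]
// holds the MST's vertices in DFS preorder; w is the N x N cost matrix.
int shortcutTourLength(int w[N][N], int preorder[N])
{
    int length = 0;
    for (int i = 0; i < N - 1; i++)
        length += w[preorder[i]][preorder[i + 1]];
    length += w[preorder[N - 1]][preorder[0]];   // close the circuit
    return length;
}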

EXAMPLE 2 Let us apply this algorithm to the graph in Figure 12.11a. The minimum
spanning tree of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e). A twice-
around-the-tree walk that starts and ends at a is
a, b, c, b, d, e, d, b, a.
Eliminating the second b (a shortcut from c to d), the second d, and the third b (a shortcut from e to a) yields the Hamiltonian circuit
a, b, c, d, e, a
of length 39.
The tour obtained in Example 2 is not optimal. Although that instance is small enough
to find an optimal solution by either exhaustive search or branch-and-bound, we
refrained from doing so to reiterate a general point. As a rule, we do not know what the
length of an optimal tour actually is, and therefore we cannot compute the accuracy
ratio f(sa)/f(s∗). For the twice-around-the-tree algorithm, we can at least estimate it
from above, provided the graph is Euclidean.

Fermat's Little Theorem: If n is a prime number, then for every a, 1 < a < n − 1,
a^(n−1) ≡ 1 (mod n), or equivalently a^(n−1) % n = 1.
Example: Since 5 is prime,
2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1],
3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5).
Since 7 is prime, 2^6 ≡ 1 (mod 7),
3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7),
5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7).
Algorithm (a minimal C sketch follows the steps below)
1) Repeat the following k times:

a) Pick a randomly in the range [2, n − 2]

b) If gcd(a, n) ≠ 1, then return false

c) If a^(n−1) ≢ 1 (mod n), then return false

2) Return true [probably prime].
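A minimal C sketch of this Fermat primality test; the helper names gcdll, powerMod, and isProbablyPrime are illustrative assumptions, not part of the original notes.

#include <stdlib.h>

// Iterative gcd
long long gcdll(long long a, long long b)
{
    while (b != 0) { long long t = a % b; a = b; b = t; }
    return a;
}

// Modular exponentiation: computes (a^e) mod m in O(log e) steps
long long powerMod(long long a, long long e, long long m)
{
    long long result = 1;
    a %= m;
    while (e > 0) {
        if (e & 1)
            result = result * a % m;
        a = a * a % m;
        e >>= 1;
    }
    return result;
}

// Fermat primality test: returns 1 if n is probably prime, 0 if composite.
// k is the number of random trials; a higher k lowers the error probability.
int isProbablyPrime(long long n, int k)
{
    if (n <= 1 || n == 4) return 0;
    if (n <= 3) return 1;
    while (k-- > 0) {
        long long a = 2 + rand() % (n - 3);         // step a: random a in [2, n-2]
        if (gcdll(a, n) != 1) return 0;             // step b
        if (powerMod(a, n - 1, n) != 1) return 0;   // step c
    }
    return 1;   // step 2: probably prime
}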

Unlike merge sort, we don't need to merge the two sorted subarrays. Thus Quicksort
requires less auxiliary space than Merge Sort, which is why it is often preferred to
Merge Sort.
Using a randomly generated pivot we can further improve the time complexity of QuickSort.
Algorithm for random pivoting


partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo                          // place for swapping
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = random number from lo to hi
    swap arr[r] and arr[hi]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)
Finding the Kth smallest element
Problem Description: Given an array A[] of n elements and a positive integer K, find
the Kth smallest element in the array. It is given that all array elements are distinct.
For example:
Input: A[] = {10, 3, 6, 9, 2, 4, 15, 23}, K = 4
Output: 6
Input: A[] = {5, -8, 10, 37, 101, 2, 9}, K = 6
Output: 37
Quick-Select: Approach similar to quicksort
This approach is similar to the quicksort algorithm, where we use the partition on the
input array recursively. But unlike quicksort, which processes both sides of the array
recursively, this algorithm works on only one side of the partition. We recur for either
the left or the right side according to the position of the pivot.
Solution Steps
1. Partition the array A[left..right] into two subarrays A[left..pos] and A[pos+1..right]
such that each element of A[left..pos] is less than each element of A[pos+1..right].

2. Compute the number of elements in the subarray A[left..pos], i.e. count = pos − left + 1.

3. If (count == K), then A[pos] is the Kth smallest element.

4. Otherwise, determine in which of the two subarrays A[left..pos−1] and
A[pos+1..right] the Kth smallest element lies.
 If (count > K), then the desired element lies on the left side of the partition.
 If (count < K), then the desired element lies on the right side of the partition. Since we
already know count values that are smaller than the Kth smallest element of A[left..right],
the desired element is the (K − count)th smallest element of A[pos+1..right].
 The base case is the scenario of a single-element array, i.e. left == right; return A[left] (or A[right]).
Pseudo-Code

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

int partition(int A[], int l, int r)
{
    int x = A[r], i = l - 1;           // pivot is the last element
    for (int j = l; j <= r - 1; j++) {
        if (A[j] <= x) {
            i = i + 1;
            swap(&A[i], &A[j]);
        }
    }
    swap(&A[i + 1], &A[r]);
    return i + 1;                      // final position of the pivot
}

// Original call: kthSmallest(A, 0, n - 1, K)
int kthSmallest(int A[], int left, int right, int K)
{
    if (left == right)                 // base case: single element
        return A[left];
    int pos = partition(A, left, right);
    int count = pos - left + 1;        // elements in A[left..pos]
    if (count == K)
        return A[pos];
    else if (count > K)
        return kthSmallest(A, left, pos - 1, K);
    else                               // skip the count smaller elements
        return kthSmallest(A, pos + 1, right, K - count);
}
Complexity Analysis
Time Complexity: The worst-case time complexity of this algorithm is O(n²), but it can be
improved if we choose the pivot element randomly. If we randomly select the pivot, the
expected time complexity is linear, O(n). A sketch of randomized pivoting for Quick-Select follows.
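As a minimal sketch of that improvement (reusing the swap and partition routines from the pseudo-code above; partition_rand is an illustrative name), we can swap a randomly chosen element into the last position before partitioning:

#include <stdlib.h>

// Randomized pivot selection: swap a random element into A[r] before the
// standard partition. Calling this from kthSmallest in place of partition
// gives expected O(n) time.
int partition_rand(int A[], int l, int r)
{
    int pos = l + rand() % (r - l + 1);   // random index in [l, r]
    swap(&A[pos], &A[r]);
    return partition(A, l, r);
}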
