CIT310
FACULTY OF SCIENCES
Course Code CIT 310
Programme Leader
Course Coordinator
NATIONAL OPEN UNIVERSITY OF NIGERIA
CONTENTS
Introduction
Course Aims
Course Objectives
Study Units
Presentation Schedule
Assessment
Summary
Introduction
In writing the Algorithms and Complexity Analysis course, emphasis is placed on understanding the concept of computer algorithms and on how to develop algorithms and test them before translating them into viable, workable programs. This course is specifically tailored towards students who are studying computing and are interested in developing and testing computer algorithms and in applying them to develop programs in any programming language.
Course Aims
The aim of the course is to guide learners of computing and computer programming on how to design and test algorithms, to help them identify different types of algorithm design paradigms, and to simplify the task of understanding the theory behind computer algorithms.
Course Objectives
Below are the objectives of the course which are to:
1. Provide sound understanding of computer algorithms.
2. Provide an understanding of algorithm design paradigms.
3. Provide suitable examples of different types of algorithms and why algorithms
are very important in computing.
The course includes tutor-marked assignments for assessment and grading. At the end of the course there is a final examination.
To keep abreast of this course, you are advised to take advantage of the tutorial or online facilitation sessions, where you will have the opportunity to compare your knowledge with that of your colleagues.
The Course Materials
The main components of the course are:
Study Units
The study units in this course are as follows:
Presentation Schedule
The course material assignments have important deadlines for submission. Learners should guard against falling behind the stipulated deadlines.
Assessment
There are three ways of carrying out assessment of the course. The first is made up of self-assessment exercises, the second consists of the tutor-marked assignments, and the last is the written end-of-course examination.
You are expected to do all the self-assessment exercises by applying what you
have read in the units. The tutor marked assignments should be submitted to your
facilitators for formal assessment in accordance with the deadlines stated in the
presentation schedule and the assignment files. The total assessment will carry
30% of the total course score. At the end of the course, the final examination will
not be more than three hours and will carry 70% of the total marks.
Marks
Assignments 30%
Examination 70%
Total 100%
Facilitators/Tutor and Tutorials
There are 16 hours of tutorials provided in support of this course. You will be
notified of the dates, times and location of these tutorials as well as the name and
phone number of your facilitator, as soon as you are allocated a tutorial group.
Your facilitator will mark and comment on your assignments, keep a close watch on your progress and on any difficulties you might face, and provide assistance to you during the course. You are expected to mail your Tutor-Marked Assignments to your facilitator before the scheduled date (at least two working days are required). The assignments will be marked by your tutor and returned to you as soon as possible.
Do not delay in contacting your facilitator by telephone or e-mail if you need assistance. Such assistance could be needed for any of the following:
Summary
This course provides an overview of computer algorithms and the analysis of their complexity. In particular, we will learn more about the nature and design of algorithms, why they are so important in the field of computing, and the several algorithm design paradigms that will be explained. The learners will also learn how to do basic run-time and space-complexity analysis of computer algorithms. Some examples of algorithms applied in the fields of Searching and Sorting will also be examined.
I wish you success in the course and I hope you will find the course both
interesting and useful.
CIT 310 - Algorithms and Complexity Analysis (3 Units)
Table of Contents
Module 1 Basic Algorithm Analysis
Unit 1 Basic Algorithm Concepts
Unit 2 Analysis and Complexity of Algorithms
Unit 3 Algorithm Design Techniques
Unit 4 Recursion and Recursive Algorithms
Unit 5 Recurrence Relations
Module 1: Basic Algorithm Analysis
Unit 1: Basic Algorithm Concepts
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignments
7.0 Further Reading and Other Resources
1.0 Introduction
2.0 Objectives
By the end of this unit, you should be able to:
Define and describe what an algorithm is
Enumerate the different characteristics of an algorithm
Examine some of the advantages of algorithms
Identify some shortcomings or disadvantages of algorithms
Look at the concept of a pseudocode
Examine some benefits and shortcomings of a pseudocode
Make a comparison between an algorithm and a pseudocode
Look at the various reasons why an algorithm is needed
Algorithms are named for the 9th century Persian mathematician Al-Khowarizmi.
He wrote a treatise in Arabic in 825 AD, On Calculation with Hindu Numerals.
It was translated into Latin in the 12th century as Algoritmi de numero Indorum,
which title was likely intended to mean "[Book by] Algoritmus on the numbers
of the Indians", where "Algoritmi" was the translator's rendition of the author's
name in the genitive case; but people misunderstanding the title treated Algoritmi
as a Latin plural and this led to the word "algorithm" (Latin algorismus) coming
to mean "calculation method".
o Easy and Efficient Coding: An algorithm is nothing but a blueprint of a
program that helps develop a program.
o Independent of Programming Language: Since an algorithm is language-independent, it can easily be coded in any high-level language.
3.2 Pseudocode
Suppose there are 60 students in a class. How will you calculate the
number of absentees in the class?
i. Pseudocode Approach:
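The pseudocode itself is not reproduced here; the following is a minimal sketch of the idea in C++, assuming (hypothetically) that attendance is recorded as an array of flags, where 1 means present and 0 means absent:

#include <iostream>

int main()
{
    int attendance[60] = {0};     // hypothetical register of 60 students
    attendance[0] = 1;            // mark a few students present
    attendance[1] = 1;

    int present = 0;
    for (int i = 0; i < 60; i++)  // count the students marked present
        present += attendance[i];

    int absentees = 60 - present; // absentees = class size - present
    std::cout << "Absentees: " << absentees << std::endl;
    return 0;
}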
3.3 Need of Algorithms (Why do we need Algorithms?)
10. To measure the behavior (or performance) of the methods in all cases (best
cases, worst cases, average cases)
11. With the help of an algorithm, we can also identify the resources (memory, input-output cycles) required by the algorithm.
14. We can measure and analyze the complexity (time and space) of the
problems concerning input size without implementing and running it; it
will reduce the cost of design.
Self-Assessment Exercise
1. What is an algorithm?
2. Differentiate between an algorithm and a pseudocode
3. Highlight some of the basic reasons why algorithms are needed.
4. How is an algorithm similar to and different from a program?
5. Why must every good computer programmer understand an algorithm first?
6. State an algorithm for adding three numbers A, B, and C
4.0 Conclusion
The concept of understanding and writing computer algorithms is very essential
to understanding the task of programming and every computing student has to
imbibe the concepts of algorithms. In fact, algorithms are the basic key to
understanding the theory and practice of computing.
5.0 Summary
In this unit we have presented an overview of algorithms and their basic characteristics. In addition, we looked at some of the benefits and shortcomings of algorithms, examined the concept of a pseudocode along with some of its benefits and shortcomings, made a brief comparison between a pseudocode and an algorithm, and finally looked at some of the reasons why an algorithm is needed.
Module 1: Basic Algorithm Analysis
Unit 2: Analysis and Complexity of Algorithms
1.0 Introduction
2.0 Objectives
3.0 Analysis of Algorithms
3.1 Types of Time Complexity Analysis
3.1.1 Worst-case Time Complexity
3.1.2 Average-case Time Complexity
3.1.3 Best-case Time Complexity
3.2 Complexity of Algorithms
3.3 Typical Complexities of an Algorithm
3.3.1 Constant Complexity
3.3.2 Logarithmic Complexity
3.3.3 Linear Complexity
3.3.4 Quadratic Complexity
3.3.5 Cubic Complexity
3.3.6 Exponential Complexity
3.4 How to Approximate the Time Taken by an Algorithm
3.4.1 Some Examples
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignments
7.0 Further Reading/References
1.0 Introduction
2.0 Objectives
By the end of this unit, you will be able to
Understand runtime and space analysis or complexity of algorithms
Know the different types of analysis
Understand the typical complexities of an algorithm
Learn how to approximate the time taken by an algorithm
3.0 Analysis of Algorithms
Analysis is the process of estimating the efficiency of an algorithm, that is, trying to determine how good or how bad an algorithm could be. There are two main parameters on which we can analyze an algorithm: the time it takes and the space (memory) it uses.
Or in other words, you should describe what you want to include in your code in
an English-like language for it to be more readable and understandable before
implementing it, which is nothing but the concept of Algorithm.
In general, if there is a problem P1, then it may have many solutions, such that
each of these solutions is regarded as an algorithm. So, there may be many
algorithms such as A1, A2, A3, …, An.
Before you implement any algorithm as a program, it is better to find out which
among these algorithms are good in terms of time and memory.
It would be best to analyze every algorithm in terms of Time that relates to which
one could execute faster and Memory or Space corresponding to which one will
take less memory.
So, the design and analysis of algorithms is about how to design various algorithms and how to analyze them. After designing and analyzing, choose the best algorithm, the one that takes the least time and the least memory, and then implement it as a program in C.
We will focus more on time than on space, because time is the more limiting parameter in terms of hardware. It is not easy to take a computer and change its speed, so if we are running an algorithm on a particular platform, we are more or less stuck with the performance that platform can give us in terms of speed.
Memory, on the other hand, is relatively more flexible: we can increase it when required by simply adding a memory card. So, we will focus on time rather than space.
3.1.1 Worst-case time complexity: For 'n' input size, the worst-case time
complexity can be defined as the maximum amount of time needed by an
algorithm to complete its execution. Thus, it is nothing but a function
defined by the maximum number of steps performed on an instance having
an input size of n.
3.1.2 Average case time complexity: For 'n' input size, the average-case time
complexity can be defined as the average amount of time needed by an
algorithm to complete its execution. Thus, it is nothing but a function
defined by the average number of steps performed on an instance having
an input size of n.
3.1.3 Best case time complexity: For 'n' input size, the best-case time
complexity can be defined as the minimum amount of time needed by an
algorithm to complete its execution. Thus, it is nothing but a function
defined by the minimum number of steps performed on an instance having
an input size of n.
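To make the three cases concrete, here is a small illustrative function (not from the course text) that checks whether an array is sorted in ascending order; its best, worst and average cases differ depending on where the first out-of-order pair appears:

#include <iostream>

// Returns true if arr[0..n-1] is sorted in ascending order.
bool isSorted(const int arr[], int n)
{
    for (int i = 0; i < n - 1; i++)
        if (arr[i] > arr[i + 1])   // first out-of-order pair: stop early
            return false;
    return true;
}

int main()
{
    int a[] = {3, 1, 2};   // best case: the very first comparison fails
    int b[] = {1, 2, 3};   // worst case: all n-1 comparisons are made
    std::cout << isSorted(a, 3) << " " << isSorted(b, 3) << std::endl;
    return 0;
}

Here the best case makes a single comparison (O(1)), the worst case makes n-1 comparisons (O(n)), and the average case lies in between, depending on the input distribution.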
The term algorithm complexity measures how many steps are required by the
algorithm to solve the given problem. It evaluates the order of count of operations
executed by an algorithm as a function of input data size.
The complexity can take any form, such as constant, logarithmic, linear, n*log(n), quadratic, cubic, exponential, etc. It is simply the order (constant, logarithmic, linear, and so on) of the number of steps taken to complete a particular algorithm. To make it more precise, we often refer to the complexity of an algorithm as its "running time".
3.3 Typical Complexities of an Algorithm
We examine the different types of complexities of an algorithm; any given algorithm or program will fall into one of the following categories:
3.3.4 Quadratic Complexity: It imposes a complexity of O(n²). For an input of size N, it undergoes on the order of N² operations on the N elements to solve a given problem.
If N = 100, it will endure 10,000 steps. In other words, whenever the order of operations has a quadratic relation with the input data size, the result is quadratic complexity.
For example, for N elements the number of steps may be found to be on the order of 3N²/2. Since constants do not have a significant effect on the order of the operation count, it is better to ignore them.
(By contrast, algorithms that undergo N, N/2 or 3N operations on N elements are all considered linear and roughly equally efficient.)
Self Assessment Exercises
1. Compare the Worst-case and the Best-case analysis of an algorithm
2. Why is the Worst-case analysis the most important in algorithm
analysis?
3. Among the different complexity types of an algorithm, which do you
consider as the worst?
4. Presently we can solve problem instances of size 30 in 1 minute using algorithm A, which is a Θ(2ⁿ) algorithm. On the other hand, we will soon have to solve problem instances twice this large in 1 minute. Do you think it would help to buy a faster (and more expensive) computer?
So, to find this out, we shall first understand the types of algorithms we have. There are two types of algorithms: iterative algorithms and recursive algorithms.
However, it is worth noting that any program that is written in iteration could be
written as recursion. Likewise, a recursive program can be converted to iteration,
making both of these algorithms equivalent to each other.
But to analyze the iterative program, we have to count the number of times the
loop is going to execute, whereas in the recursive program, we use recursive
equations, i.e., we write a function of F(n) in terms of F(n/2).
Suppose the program is neither iterative nor recursive. In that case, it can be
concluded that there is no dependency of the running time on the input data size,
i.e., whatever is the input size, the running time is going to be a constant value.
Thus, for such programs, the complexity will be O(1).
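For instance, the following sketch (an illustration, not from the course text) contains neither a loop nor a recursive call, so its running time is a constant, O(1), regardless of n:

// Constant-time function: no loop, no recursion, so the running
// time does not depend on the input size n at all.
int middleIndex(int n)
{
    return n / 2;   // one arithmetic operation: O(1)
}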
Consider the following example programs, shown here in a simple C-like form.
Example 1:
In the first example, we have an integer i and a for loop running from i equal to 1 up to n. How many times does the name get printed?
void A(int n)
{
    int i;
    for (i = 1; i <= n; i++)
        printf("Abdullahi");
}
The printf statement executes n times, so the time complexity is O(n).
Example 2:
void A(int n)
{
    int i, j;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            printf("Abdullahi");
}
In this case, the outer loop runs n times, and for each of those iterations the inner loop also runs n times. Thus, the time complexity is O(n²).
Example 3:
void A(int n)
{
    int i = 1, S = 1;
    while (S <= n)
    {
        i++;
        S = S + i;
        printf("Abdullahi");
    }
}
As we can see from the above example, we have two variables, i and S. The loop condition is S <= n, which means S starts at 1 and the loop stops as soon as the value of S becomes greater than n. Here i increments in steps of one, while S increments by the value of i; the increment in i is linear, but the increment in S depends on i.
Initially;
i=1, S=1
i=2, S=3
i=3, S=6
Thus, S is the sum of the first i natural numbers: by the time i reaches k, the value of S will be k(k+1)/2.
The loop stops when k(k+1)/2 becomes greater than n. Solving k(k+1)/2 > n gives k on the order of √n, so the time complexity of this program is O(√n).
We now consider recursive programs.
Example 1:
A(n)
{
    if (n > 1)
        return A(n - 1);
}
Solution:
Here we will use the simple back-substitution method to solve the above problem. Each call does constant work and then recurses on n-1, so the running time satisfies T(n) = 1 + T(n-1) ... (Eq. 1). Substituting the relation into itself repeatedly:
T(n) = 1 + T(n-1) = 2 + T(n-2) = 3 + T(n-3) = ... = k + T(n-k)
Putting n-k = 1, i.e. k = n-1, gives T(n) = (n-1) + T(1), so T(n) = O(n).
Now, according to Eq. (1), i.e. T(n) = 1 + T(n-1), the algorithm will run as long as n > 1. Basically, n starts from a large value and decreases gradually. When T(n) reaches T(1), the algorithm stops; such a terminating condition is called the anchor condition, base condition or stopping condition.
4.0 Conclusion
Analysis of algorithms helps us to determine how good or how bad they are in
terms of speed or time taken and memory or space utilized. Designing good
programs is dependent on how good or how bad the algorithm is and the analysis
helps us to determine the efficiency of such algorithms.
5.0 Summary
In this unit, we have learnt the meaning of algorithm analysis and the different types of analysis. We also examined the complexity of algorithms and the different types of complexities.
Cormen, T.H., Leiserson, C.E., Rivest, R.L., and Stein, C. (2009). Introduction
to Algorithms, 3rd ed. MIT Press.
Module 1: Basic Algorithm Analysis
Unit 3: Algorithm Design Techniques
1.0 Introduction
2.0 Objectives
3.0 Algorithm Design Techniques
3.1 Popular Algorithm Design Techniques
3.1.1 Divide-and-Conquer Approach
3.1.2 Greedy Techniques
3.1.3 Dynamic Programming
3.1.4 Branch and Bound
3.1.5 Backtracking Algorithm
3.1.6 Randomized Algorithm
3.2 Asymptotic Analysis (Growth of Function)
3.2.1 Asymptotic Analysis
3.2.2 Why is Asymptotic Analysis Important?
3.3 Asymptotic Notation
3.3.1 Big O Notation
3.3.2 Big Omega Notation
3.3.3 Big Theta Notation
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and Other References
1.0 Introduction
The design of any algorithm follows some planning, as there are different design techniques, strategies or paradigms that could be adopted depending on the problem domain and the designer's understanding. Some of these techniques can also be combined, while the limiting behaviour of an algorithm can be represented with asymptotic analysis. In this unit we shall look at examples of algorithm design techniques and asymptotic notations.
2.0 Objectives
By the end of this unit, you will be able to
Understand several design techniques or paradigms of algorithms
Know the meaning of Asymptotic notations
Understand some popular Asymptotic notations
Learn how to apply some of the Asymptotic notations learnt
3.1 Popular Algorithm Design Techniques
The following are some standard algorithms of the Divide and Conquer variety (for example, Merge Sort, Quick Sort and Binary Search).
o A Greedy Algorithm always makes the choice (the greedy criterion) that looks best at the moment, in order to optimize a given objective.
o The greedy algorithm doesn't always guarantee the optimal solution
however it generally produces a solution that is very close in value to the
optimal.
Tower of Hanoi
Dijkstra Shortest Path
Fibonacci sequence
Matrix chain multiplication
Egg-dropping puzzle, etc
The branch and bound method is a solution approach that partitions the feasible solution space into smaller subsets of solutions.
Branch and bound is an algorithm design paradigm which is generally used for
solving combinatorial optimization problems.
Knapsack problems
Traveling Salesman Problem
Job Assignment Problem, etc
Backtracking is a general algorithm for finding solutions to some
computational problems, notably constraint satisfaction problems, that
incrementally builds candidates to the solutions, and abandons a candidate
("backtracks") as soon as it determines that the candidate cannot possibly be
completed to a valid solution.
Given a problem, a backtracking algorithm has the general form:
Backtrack(s)
    if s is not a solution
        return false
    if s is a new solution
        add s to the list of solutions
    backtrack(expand s)
There are the following scenarios in which you can use the backtracking:
It is used to solve a variety of problems. You can use it, for example, to
find a feasible solution to a decision problem.
Backtracking algorithms were also discovered to be very effective for
solving optimization problems.
In some cases, it is used to find all feasible solutions to
the enumeration problem.
Backtracking, on the other hand, is not regarded as an optimal problem-
solving technique. It is useful when the solution to a problem does not
have a time limit.
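As a concrete illustration (a minimal sketch, not taken from the course text), the following program uses backtracking to decide whether some subset of an array sums to a given target, abandoning a partial candidate as soon as it can no longer lead to a solution:

#include <iostream>

// Backtracking: try including or excluding each element; abandon a
// branch ("backtrack") once the remaining choices cannot succeed.
// The target < 0 pruning assumes non-negative elements.
bool subsetSum(const int arr[], int n, int i, int target)
{
    if (target == 0) return true;             // found a valid subset
    if (i == n || target < 0) return false;   // dead end: backtrack
    // Either include arr[i] in the subset, or exclude it.
    return subsetSum(arr, n, i + 1, target - arr[i])
        || subsetSum(arr, n, i + 1, target);
}

int main()
{
    int arr[] = {3, 34, 4, 12, 5, 2};
    std::cout << (subsetSum(arr, 6, 0, 9) ? "Yes" : "No") << std::endl; // Yes (4 + 5)
    return 0;
}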
There are two main types of randomized algorithms: Las Vegas algorithms and
Monte-Carlo algorithms.
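To illustrate the distinction with a minimal sketch (not from the course text): a Monte Carlo algorithm always runs quickly but may return only an approximately correct answer, as in this classic estimation of π by random sampling, whereas a Las Vegas algorithm (such as randomized quicksort) always returns a correct answer but has a random running time:

#include <cstdlib>
#include <iostream>

// Monte Carlo estimation of pi: bounded running time, approximate answer.
int main()
{
    const int trials = 1000000;
    int inside = 0;
    for (int i = 0; i < trials; i++) {
        double x = (double)rand() / RAND_MAX;   // random point in the unit square
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0)
            inside++;                           // point fell inside the quarter circle
    }
    std::cout << "pi is approximately " << 4.0 * inside / trials << std::endl;
    return 0;
}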
Self-Assessment Exercise
1. What do you understand by an Algorithm design paradigm?
2. How does the Greedy Technique work and give an example?
3. Give a difference between the Backtracking and Randomized algorithm techniques.
In this function, the n² term dominates when n gets sufficiently large, so that
f(n) ~ n².
Asymptotic notations are used to write fastest and slowest possible running time
for an algorithm. These are also referred to as 'best case' and 'worst case' scenarios
respectively.
"In asymptotic notations, we derive the complexity concerning the size of the
input. (Example in terms of n)"
"These notations are important because without expanding the cost of running the
algorithm, we can estimate the complexity of the algorithms."
Hence, function g (n) is an upper bound for function f (n), as g (n) grows faster
than f (n)
Examples:
1. 3n+2=O(n) as 3n+2≤4n for all n≥2
2. 3n+3=O(n) as 3n+3≤4n for all n≥3
The function f(n) = Ω(g(n)) [read as "f of n is omega of g of n"] if and only if there exist positive constants c and n₀ such that f(n) ≥ c·g(n) for all n ≥ n₀.
Example:
f(n) = 8n² + 2n − 3 ≥ 8n² − 3
= 7n² + (n² − 3) ≥ 7n² = 7·g(n) for all n ≥ 2, with g(n) = n²
Thus, c = 7 and n₀ = 2.
Hence, the complexity of f (n) can be represented as Ω (g (n))
The function f(n) = θ(g(n)) [read as "f of n is theta of g of n"] if and only if there exist positive constants k₁, k₂ and n₀ such that k₁·g(n) ≤ f(n) ≤ k₂·g(n) for all n ≥ n₀.
For example:
3n + 2 = θ(n), as 3n + 2 ≥ 3n and 3n + 2 ≤ 4n for all n ≥ 2; here k₁ = 3, k₂ = 4, and n₀ = 2.
The Theta notation is more precise than both the Big-O and Omega notations: f(n) = θ(g(n)) means that g(n) is both an upper and a lower bound for f(n).
Self-Assessment Exercise
1. Which of the Asymptotic notations do you consider more important and
why?
2. What do you understand by a Backtracking algorithm?
3. What do you understand by the Upper and Lower bound of an algorithm?
4.0 Conclusion
5.0 Summary
Several design techniques or paradigms are available for specifying algorithms
and they range from the popular Divide-and-Conquer, Greedy techniques and
Randomized algorithms amongst others. In the same vein, we have three main
notations for carrying out the Asymptotic analysis of algorithms and they are the
Big O, Big Omega and Big Theta notations.
Module 1: Basic Algorithm Analysis
Unit 4: Recursion and Recursive Algorithms
1.0 Introduction
2.0 Objectives
3.0 Recursion and Recursive Algorithms
3.1 Why use Recursion
3.1.1 Factorial Example
3.1.2 Purpose of Recursions
3.1.3 Conditionals to Start, Continue and Stop Recursion
3.1.4 The Three Laws of Recursion
3.2 Types of Recursions
3.2.1 Direct Recursion
3.2.2 Indirect Recursion
3.3 Recursion versus Iteration
3.4 Some Recursive Algorithms (Examples)
3.4.1 Reversing an Array
3.4.2 Fibonacci Sequence
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and other Resources
1.0 Introduction
Recursion is a method of solving problems that involves breaking a problem
down into smaller and smaller subproblems until you get to a small enough
problem that it can be solved trivially. In computer science, recursion involves a
function calling itself. While it may not seem like much on the surface, recursion
allows us to write elegant solutions to problems that may otherwise be very
difficult to program.
2.0 Objectives
By the end of this unit, you will be able to
Know the meaning of Recursion and a Recursive algorithm
Understand the different types of recursive algorithms
See some examples of recursive algorithms
Understand how the recursive algorithm works
Know the difference between recursion and iteration
Know the reasons why recursion is preferred in programming
Know the runtime and space complexity of different recursive algorithms
There are two main instances of recursion. The first is when recursion is used as
a technique in which a function makes one or more calls to itself. The second is
when a data structure uses smaller instances of the exact same type of data
structure when it represents itself.
3.1 Why use recursion?
The factorial function is denoted with an exclamation point and is defined as the
product of the integers from 1 to n. Formally, we can state this as:
n! = n ⋅ (n−1) ⋅ (n−2) … 3 ⋅ 2 ⋅ 1
4! = 4 ⋅ 3 ⋅ 2 ⋅ 1 = 24.
So how can we state this in a recursive manner? This is where the concept
of base case comes in.
4! = 4 ⋅ (3 ⋅ 2 ⋅ 1) = 24
4! = 4 ⋅ 3! = 24
Meaning we can rewrite the formal recursion definition in terms of recursion like
so:
n! = n ⋅ (n−1) !
Note that if n = 0, then n! = 1. This means the base case occurs when n = 0, and the recursive cases are defined in the equation above. Whenever you are trying to develop a recursive solution, it is very important to think about the base case, as your solution will need to return the base case once all the recursive cases have been worked through. Let's look at how we can create the factorial function in Python:
def fact(n):
    '''
    Returns factorial of n (n!).
    Note use of recursion
    '''
    # BASE CASE!
    if n == 0:
        return 1
    # Recursion!
    else:
        return n * fact(n-1)
Note how we had an if statement to check if a base case occurred. Without it this
function would not have successfully completed running. We can visualize the
recursion with the following figure:
We can follow this flow chart from the top, reaching the base case, and then
working our way back up.
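For instance, the call fact(4) expands as follows before collapsing back up to the answer:
fact(4) = 4 * fact(3)
        = 4 * (3 * fact(2))
        = 4 * (3 * (2 * fact(1)))
        = 4 * (3 * (2 * (1 * fact(0))))
        = 4 * (3 * (2 * (1 * 1)))    (base case reached)
        = 24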
3.1.2 Purpose of Recursions
Recursive functions have many uses, but like any other kind of code, their
necessity should be considered. As discussed above, consider the differences
between recursions and loops, and use the one that best fits your needs. If you
decide to go with recursions, decide what you want the function to do before you
start to compose the actual code.
It’s important to look at any arguments or conditions that would start the recursion
in the first place. For example, the function could have an argument that might be
a string or array. The function itself may have to recognize the datatype versus it
being recognized before this point (such as by a parent function). In simpler
scenarios, starting conditions may often be the exact same conditions that force
the recursion to continue.
More importantly, you want to establish a condition where the recursive action
stops. These conditionals, known as base cases, produce an actual value rather
than another call to the function. However, in the case of tail-end recursion, the
return value still calls a function but gets the value of that function right away.
Self-Assessment Exercises
1. What do you understand by the term “base case”?
2. Why must a stopping criterion be specified in a recursive algorithm?
3. What happens when a recursive algorithm calls itself recursively?
3.2.1 Direct Recursion
a. Tail Recursion:
If a recursive function calls itself and that recursive call is the last statement executed in the function, it is known as Tail Recursion. After that call there is nothing left to do, so the function performs all of its operations at the time of calling.
Example:
// Code Showing Tail Recursion
#include <iostream>
using namespace std;

// Recursion function
void fun(int n)
{
    if (n > 0) {
        cout << n << " ";
        // The recursive call is the last statement in the function
        fun(n - 1);
    }
}

// Driver Code
int main()
{
    int x = 3;
    fun(x);
    return 0;
}

Output:
3 2 1

Time Complexity for Tail Recursion: O(n)
Space Complexity for Tail Recursion: O(n)
Let us now convert the tail recursion into a loop and compare the two in terms of time and space complexity to decide which is more efficient.
// Converting Tail Recursion into Loop
#include <iostream>
using namespace std;
void fun(int y)
{
while (y > 0) {
cout << y << " ";
y--;
}
}
// Driver code
int main()
{
int x = 3;
fun(x);
return 0;
}
Output
3 2 1
So it is seen that in the case of the loop, the space complexity is O(1); in terms of space complexity it is therefore better to write the code as a loop instead of as tail recursion.
b. Head Recursion:
If a recursive function calls itself and that recursive call is the first statement in the function, it is known as Head Recursion. There is no statement and no operation before the call; the function doesn't have to process or perform any operation at the time of calling, and all operations are done at returning time.
Example:
// C++ program showing Head Recursion
#include <bits/stdc++.h>
using namespace std;

// Recursive function
void fun(int n)
{
    if (n > 0) {
        // The recursive call is the first statement in the function
        fun(n - 1);
        cout << " " << n;
    }
}

// Driver code
int main()
{
    int x = 3;
    fun(x);
    return 0;
}

Output:
1 2 3
Time Complexity For Head Recursion: O(n)
Space Complexity For Head Recursion: O(n)
Let us now convert the head recursion into a loop:
// Iterative equivalent of the head-recursive function
void fun(int n)
{
int i = 1;
while (i <= n) {
cout <<" "<< i;
i++;
}
}
// Driver code
int main()
{
int x = 3;
fun(x);
return 0;
}
Output:
1 2 3
c. Tree Recursion:
To understand Tree Recursion let’s first understand Linear Recursion.
If a recursive function calls itself once, it is known as Linear Recursion. Otherwise, if a recursive function calls itself more than once, it is known as Tree Recursion.
#include <iostream>
using namespace std;
// Recursive function
void fun(int n)
{
if (n > 0)
{
cout << " " << n;
// Calling once
fun(n - 1);
// Calling twice
fun(n - 1);
}
}
// Driver code
int main()
{
fun(3);
return 0;
}
Output:
3 2 1 1 2 1 1
d. Nested Recursion:
In nested recursion, a recursive function passes a recursive call as its parameter, i.e. there is recursion inside the recursion.
Example:
// C++ program to show Nested Recursion
#include <iostream>
using namespace std;
int fun(int n)
{
if (n > 100)
return n - 10;
// A recursive function passing parameter
// as a recursive call or recursion inside
// the recursion
return fun(fun(n + 11));
}
// Driver code
int main()
{
int r;
r = fun(95);
cout << " " << r;
return 0;
}
Output:
91
3.2.2 Indirect Recursion
In indirect recursion, two or more functions call one another in a cycle: fun(A) calls fun(B), fun(B) calls fun(C), and fun(C) calls fun(A) again, thus making a cycle.
Example:
// C++ program to show Indirect Recursion
#include <iostream>
using namespace std;
void funB(int n);
void funA(int n)
{
if (n > 0) {
cout <<" "<< n;
// Fun(A) is calling fun(B)
funB(n - 1);
}
}
void funB(int n)
{
if (n > 1) {
cout <<" "<< n;
// Fun(B) is calling fun(A)
funA(n / 2);
}
}
// Driver code
int main()
{
funA(20);
return 0;
}
Output:
20 19 9 8 4 3 1
3.3 Recursion versus Iteration

Self-Assessment Exercises

3.4 Some Recursive Algorithms (Examples)
3.4.1 Reversing an Array
Algorithm ReverseArray(A, i, j):
Input: An array A and nonnegative integer indices i and j
Output: The reversal of the elements in A starting at
index i and ending at j
if i < j then
    Swap A[i] and A[j]
    ReverseArray(A, i+1, j-1)
return
3.4.2 Fibonacci Sequence
Write the recursive function and the call tree for F(5).
Algorithm Fib(n) {
if (n < 2) return 1
else return Fib(n-1) + Fib(n-2)
}
The above recursion is called binary recursion, since it makes two recursive calls instead of one. How many calls are needed to compute the kth Fibonacci number? Let nₖ denote the number of calls performed in the execution.
n₀ = 1
n₁ = 1
n₂ = n₁ + n₀ + 1 = 3 > 2¹
n₃ = n₂ + n₁ + 1 = 5 > 2²
n₄ = n₃ + n₂ + 1 = 9 > 2³
n₅ = n₄ + n₃ + 1 = 15 > 2³
...
nₖ > 2^(k/2)
This means that the Fibonacci recursion makes a number of calls that are
exponential in k. In other words, using binary recursion to compute Fibonacci
numbers is very inefficient. Compare this problem with binary search, which is
very efficient in searching items, why is this binary recursion inefficient? The
main problem with the approach above, is that there are multiple overlapping
recursive calls.
We can compute F(n) much more efficiently using linear recursion. One way to
accomplish this conversion is to define a recursive function that computes a
pair of consecutive Fibonacci numbers F(n) and F(n-1) using the convention
F(-1) = 0.
Algorithm LinearFib(n) {
Input: A nonnegative integer n
Output: Pair of Fibonacci numbers (Fn, Fn-1)
if (n <= 1) then
return (n, 0)
else
(i, j) <-- LinearFib(n-1)
return (i + j, i)
}
Let's use iteration to generate the Fibonacci numbers. What's the complexity of
this algorithm?
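The iterative version is not shown in the text; a minimal sketch might be the following, which follows the same convention as above (F(0) = F(1) = 1):

#include <iostream>

// Iterative Fibonacci: runs in O(n) time and O(1) space, avoiding
// the repeated overlapping calls of the binary recursion.
int fib(int n)
{
    int prev = 0, curr = 1;          // pair of consecutive Fibonacci numbers
    for (int i = 0; i < n; i++) {
        int next = prev + curr;      // F(k+1) = F(k) + F(k-1)
        prev = curr;
        curr = next;
    }
    return curr;
}

int main()
{
    std::cout << fib(5) << std::endl;  // prints 8 under this convention
    return 0;
}

Each loop iteration does constant work, so the complexity is O(n), in contrast to the exponential number of calls made by the binary recursion.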
Self-Assessment Exercises
1. Write either the pseudocode or the Java code for the following problems. Draw the recursion trace of a simple case. What are the running time and space requirements?
4.0 Conclusion
Recursive algorithms are very important in programming as they help us write
very good programs and also allow us to understand the concept of computing
well. So many programs are naturally recursive and many others can be turned
into a recursive algorithm.
5.0 Summary
In computer science, recursion is a method of solving a problem where the
solution depends on solutions to smaller instances of the same problem. Such
problems can generally be solved by iteration, but this needs to identify and
index the smaller instances at programming time. There exist several natural
examples of recursive algorithms while other programming algorithms that
are iterative can be turned into recursive algorithms.
The concept of recursion is very important to developers of algorithms and
also to programmers.
3. What makes recursion better than iteration and what makes iteration better
than recursion.
4. Give a vital difference between Head recursion and Tail recursion.
Module 1: Basic Algorithm Analysis
Unit 5: Recurrence Relations
1.0 Introduction
2.0 Objectives
3.0 Recurrence Relations
3.1 Methods for Resolving Recurrence Relations
3.1.1 Guess-and-Verify Method
3.1.2 Iteration Method
3.1.3 Recursion Tree Method
3.1.4 Master Method
3.2 Example of Recurrence Relation: Tower of Hanoi
3.2.1 Program for Tower of Hanoi
3.2.2 Applications of Tower of Hanoi Problem
3.2.3 Finding a Recurrence
3.2.4 Closed-form Solution
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and Other References
1.0 Introduction
In principle, a recurrence relation allows us to calculate T(n) for any n by applying the first equation until we reach the base case. To solve a recurrence, we find a formula that calculates T(n) directly from n, without this recursive computation.
2.0 Objectives
By the end of this unit, you will be able to
Know more about Recurrences and Recurrence relations
Understand the different methods for resolving recurrence
relations
Know the areas of applications of recurrence relations
For example, the worst-case running time T(n) of the Merge Sort procedure is described by the recurrence:
T(n) = θ(1) if n = 1
T(n) = 2T(n/2) + θ(n) if n > 1
Recurrence relations can be resolved with any of the following four methods: the Guess-and-Verify (substitution) method, the Iteration method, the Recursion Tree method, and the Master method.
As when solving any other mathematical problem, we are not required to explain
where our solution came from as long as we can prove that it is correct. So the
most general method for solving recurrences can be called "guess but verify".
Naturally, unless you are very good friends with the existential quantifier, you may find it hard to come up with good guesses. But sometimes it is possible to make a good guess by iterating the recurrence a few times and seeing what happens.
Example: Consider the recurrence
T(n) = T(n/2) + 1
Solution: We guess that T(n) = O(log n), i.e. that T(n) ≤ c log n for some constant c. Substituting the guess into the recurrence:
T(n) ≤ c log(n/2) + 1 = c log n − c + 1 ≤ c log n, for c ≥ 1.
Thus T(n) = O(log n).
Example: Consider the recurrence T(n) = 2T(n/2) + n for n > 1.
Solution: A similar substitution argument, with the guess T(n) ≤ c·n log n, shows that T(n) = O(n log n).
3.1.2 Iteration Method
In the iteration method, we expand the recurrence repeatedly until a pattern emerges. Consider:
T(n) = 1 if n = 1
T(n) = 2T(n-1) if n > 1
Solution:
T(n) = 2T(n-1)
= 2[2T(n-2)] = 2²T(n-2)
= 4[2T(n-3)] = 2³T(n-3)
= 8[2T(n-4)] = 2⁴T(n-4) (Eq. 1)
Repeating the procedure i times:
T(n) = 2ⁱ T(n-i)
Put n-i = 1, i.e. i = n-1, in (Eq. 1):
T(n) = 2ⁿ⁻¹ T(1)
= 2ⁿ⁻¹ · 1 {T(1) = 1, given}
= 2ⁿ⁻¹
Example: Consider T(n) = T(n-1) + 1, with T(1) = θ(1).
Solution:
T(n) = T(n-1) + 1
= (T(n-2) + 1) + 1 = T(n-2) + 2
= (T(n-3) + 1) + 2 = T(n-3) + 3
= ... = T(n-k) + k
Where k = n-1:
T(n-k) = T(1) = θ(1)
T(n) = θ(1) + (n-1) = θ(n)
A Recursion Tree is best used to generate a good guess, which can be verified by
the Substitution Method.
Example 1
Consider T(n) = 2T(n/2) + n²
We have to obtain the asymptotic bound using the recursion tree method.
Example 2: Consider T(n) = 4T(n/2) + n
Example 3: Consider a recurrence whose recursion tree has levels that each sum to n. When we add the values across the levels of the recursion tree, we get a value of n for every level; multiplying this per-level cost by the height of the tree (the longest path from the root to a leaf) gives the overall bound.
3.1.4 Master Method
The Master Method is used for solving recurrences of the following type:
T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is a function.
This can be interpreted as: the problem is divided into a subproblems, each of size n/b, and f(n) is the cost of dividing the problem and combining the results.
Master Theorem:
Case 1: If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Example:
T(n) = 8T(n/2) + 1000n². Apply the master theorem to it.
Solution:
Compare with T(n) = aT(n/b) + f(n):
a = 8, b = 2, f(n) = 1000n², log_b a = log₂8 = 3
Here f(n) = 1000n² = O(n^(3−ε)) for ε = 1. Since this condition holds, the first case of the master theorem applies to the given recurrence relation, thus resulting in the conclusion:
T(n) = Θ(n^(log_b a))
Therefore: T(n) = Θ(n³)
Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
Example: T(n) = 2T(n/2) + n
Here a = 2, b = 2, f(n) = n, and log_b a = log₂2 = 1, so f(n) = Θ(n¹) and the second case applies.
Therefore: T(n) = Θ(n^(log_b a) · log n)
= Θ(n log n)
Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Example: T(n) = 2T(n/2) + n²
Solution:
Compare with T(n) = aT(n/b) + f(n):
a = 2, b = 2, f(n) = n², log_b a = log₂2 = 1
Here f(n) = n² = Ω(n^(1+ε)) for ε = 1. For the regularity condition, a·f(n/b) = 2(n/2)² = n²/2; if we choose c = 1/2, then n²/2 ≤ (1/2)n² for all n ≥ 1.
So it follows that T(n) = Θ(f(n)):
T(n) = Θ(n²)
Self-Assessment Exercises

3.2 Example of Recurrence Relation: Tower of Hanoi
In the Tower of Hanoi puzzle, the disks must all be moved one at a time from one peg to another peg, using only three pegs, and a larger disk may never be placed on top of a smaller disk.
According to the legend, once all the disks have been moved, the world will end!!! This problem can easily be solved by a Divide & Conquer algorithm.
In the above seven steps, all the disks from peg A are transferred to peg C subject to the given conditions.
Let T (n) be the total time taken to move n disks from peg A to peg C
1. Moving n-1 disks from the first peg to the second peg. This can be done in
T (n-1) steps.
2. Moving the largest disk from the first peg to the third peg requires one step.
3. Recursively moving n-1 disks from the second peg to the third peg will
require again T (n-1) step.
So, the total time taken is T(n) = T(n-1) + 1 + T(n-1), i.e.
T(n) = 2T(n-1) + 1
Expanding, this is a geometric progression series with common ratio r = 2, and we get
T(n) = 2ⁿ − 1
For n = 3: T(3) = 2³ − 1 = 8 − 1 = 7
[As shown step by step above there are 7 moves; this is now proved by the general equation.]
#include <stdio.h>

void towers(int, char, char, char);

int main()
{
    int num;
    printf("Enter the number of disks : ");
    scanf("%d", &num);
    printf("The sequence of moves involved in the Tower of Hanoi are:\n");
    towers(num, 'A', 'C', 'B');
    return 0;
}

void towers(int num, char frompeg, char topeg, char auxpeg)
{
    if (num == 1)
    {
        printf("\n Move disk 1 from peg %c to peg %c", frompeg, topeg);
        return;
    }
    /* Move n-1 disks to the auxiliary peg, move the largest disk,
       then move the n-1 disks on top of it. */
    towers(num - 1, frompeg, auxpeg, topeg);
    printf("\n Move disk %d from peg %c to peg %c", num, frompeg, topeg);
    towers(num - 1, auxpeg, topeg, frompeg);
}
To answer how long it will take our friendly monks to destroy the world, we write
a recurrence (let's call it M(n)) for the number of moves MoveTower takes for
an n-disk tower.
The base case - when n is 1 - is easy: The monks just move the single disk directly.
M(1) = 1
In the other cases, the monks follow our three-step procedure. First they move the
(n-1)-disk tower to the spare peg; this takes M(n-1) moves. Then the monks move
the nth disk, taking 1 move. And finally they move the (n-1)-disk tower again,
this time on top of the nth disk, taking M(n-1) moves. This gives us our recurrence
relation,
M(n) = 2 M(n-1) + 1.
This would be more convenient if we could put M(n) into a closed-form solution, that is, if we could write a formula for M(n) without using recursion. Do you see what it should be? (It may help to go ahead and compute the first few values, like M(2), M(3), and M(4).)
M(1) =1
M(2)=2M(1) + 1 =3
M(3)=2M(2) + 1 =7
M(4)=2M(3) + 1 =15
M(5)=2M(4) + 1 =31
From these values we can guess the closed form M(n) = 2ⁿ − 1, and verify it against the recurrence's cases:
M(1) = 1 = 2¹ − 1
M(n) = 2M(n−1) + 1 = 2(2ⁿ⁻¹ − 1) + 1 = 2ⁿ − 1
Since our expression 2ⁿ − 1 is consistent with all the recurrence's cases, this is the closed-form solution.
So the monks will make 2⁶⁴ − 1 (about 1.845 × 10¹⁹) moves. If they are really good and can move one disk a millisecond, then they'll have to work for about 584.6 million years. It looks like we're safe.
Self-Assessment Exercise
1. Simulate the Tower-of-Hanoi problem for N = 7 disks and N = 12
disks.
2. Can we solve the Tower of Hanoi problem for any value of Tn
without using a Recurrence relation? Discuss.
3. What are the application areas for the Tower of Hanoi problem?
4.0 Conclusion
A recurrence relation permits us to compute the members of a sequence one after the other, starting from one or more initial values. Recurrence relations rely completely on recursion, and there exist one or more base cases that determine the stopping criterion.
5.0 Summary
In mathematics and computer science, a recurrence relation is an equation that
expresses the nth term of a sequence as a function of the k preceding terms, for
some fixed k, which is called the order of the relation. Recurrence relations can
be solved by several methods ranging from the popular Guess-and-Verify method
to the Master method and they help us understand the workings of algorithms
better.
b) Solve the recurrence relation from part (a) to find the number of goats on the island at the start of the n-th year.
c) Construct a recurrence relation for the number of goats on the island at the start of the n-th year, assuming that n goats are removed during the n-th year for each n ≥ 3.
d) Solve the recurrence relation in part (c) for the number of goats on the island at the start of the n-th year.
3. a) Find all solutions of the recurrence relation aₙ = 2aₙ₋₁ + 2n².
b) Find the solution of the recurrence relation in part (a) with initial condition a₁ = 4.
Module 2: Sorting and Searching Algorithms
Unit 1: Bubble Sort and Selection Sort Algorithm
1.0 Introduction
2.0 Objectives
3.0 Bubble Sort Algorithm
3.1 How Bubble Sort Works
3.2 Complexity Analysis of Bubble Sort
3.2.1 Time Complexities
3.2.2 Advantages of Bubble Sort
3.2.3 Disadvantages of Bubble Sort
3.3 Selection Sort Algorithm
3.3.1 Algorithm Selection Sort
3.3.2 How Selection Sort Works
3.3.3 Complexity of Selection Sort
3.3.4 Time Complexity
3.3.5 Advantages of Selection Sort
3.3.6 Disadvantages of Selection Sort
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and other Resources
1.0 Introduction
Sorting and searching are two of the most frequently needed algorithms in
program design. Common algorithms have evolved to take account of this need.
Since computers were created, users have devised programs, many of which have
needed to do the same thing. As a result, common algorithms have evolved and
been adopted in many programs.
Methods of searching include:
linear search
binary search
Methods of sorting include:
bubble sort
merge sort
insertion sort
quicksort
radix sort
selection sort
2.0 Objectives
By the end of this unit, you should be able to:
Know some of the techniques for sorting a list containing numbers or texts
Identify how the bubble sort and Selection sort algorithm works
Know some benefits and disadvantages of Bubble sort and Selection sort
Identify the worst case and best case of Bubble sort and selection sort
Know where the bubble sort and selection sort algorithms are applied
3.0 Bubble Sort
Bubble Sort, also known as Exchange Sort, is a simple sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing two items at a time and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which means the list is sorted.
Algorithm
Step 1 ➤ Initialization: set l ← n (the number of elements) and p ← 1 (the pass number)
Step 2 ➤ loop: compare each pair of adjacent elements in the unsorted part of the list and swap them if they are in the wrong order, counting the number of exchanges E
Step 3 ➤ if (E = 0) then exit, else set l ← l − 1 and repeat Step 2
For each iteration, the bubble sort will compare up to the last unsorted element. Once all the elements are sorted in ascending order, the algorithm terminates.
Consider the following example of an unsorted array that we will sort with the
help of the Bubble Sort algorithm.
Initially,
Pass 1:
o Compare a0 and a1
o Compare a1 and a2
o Compare a2 and a3
o Compare a3 and a4
Pass 2:
o Compare a0 and a1
o Compare a1 and a2
Here a1 < a2, so the array will remain as it is.
o Compare a2 and a3
Pass 3:
o Compare a0 and a1
o Compare a1 and a2
Pass 4:
o Compare a0 and a1
Therefore, the bubble sort algorithm has a time complexity of O(n²) and a space complexity of O(1), since it needs only some extra memory space for the temp variable used in swapping.
o Best Case Complexity: The bubble sort algorithm has a best-case time complexity of O(n) for an already sorted array.
o Average Case Complexity: The average-case time complexity of the bubble sort algorithm is O(n²), which happens when the elements are in jumbled order, i.e., neither ascending nor descending.
o Worst Case Complexity: The worst-case time complexity is also O(n²), which occurs when we sort an array in descending order into ascending order.
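A straightforward implementation consistent with the description above (a sketch, with example values chosen arbitrarily; the early-exit flag provides the O(n) best case):

#include <iostream>

void bubbleSort(int a[], int n)
{
    for (int pass = 0; pass < n - 1; pass++) {
        bool swapped = false;
        // each pass bubbles the largest unsorted element to the end
        for (int i = 0; i < n - pass - 1; i++) {
            if (a[i] > a[i + 1]) {
                int temp = a[i];   // swap using a temp variable
                a[i] = a[i + 1];
                a[i + 1] = temp;
                swapped = true;
            }
        }
        if (!swapped) break;       // no swaps: already sorted, best case O(n)
    }
}

int main()
{
    int a[] = {13, 32, 26, 35, 10};
    bubbleSort(a, 5);
    for (int x : a) std::cout << x << " ";   // 10 13 26 32 35
    return 0;
}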
Self-Assessment Exercise
1. What exactly do we mean by the concept of “Sorting”
2. Explain the terms “Sorting in Ascending order” and “Sorting in Descending order”.
3. Why do we prefer using the Bubble sort algorithm in teaching Sorting and in sorting
small list of numbers?
3.3 Selection Sort Algorithm
The selection sort improves on the bubble sort by making only a single swap for each pass through the list. In order to do this, a selection sort searches for the largest value as it makes a pass and, after finishing the pass, places it in its proper position. As with a bubble sort, after the first pass the largest item is in the right place; after the second pass, the next largest is in place. This procedure continues and requires n−1 passes to sort n items, since the final item must be in place after the (n−1)th pass.
n ← length[A]
for j ← 1 to n − 1
    smallest ← j
    for i ← j + 1 to n
        if A[i] < A[smallest]
            then smallest ← i
    exchange (A[j], A[smallest])
5. For each iteration, we will start the indexing from the first element of the
unsorted list. We will repeat the Steps from 1 to 4 until the list gets sorted
or all the elements get correctly positioned.
6. Consider the following example of an unsorted array that we will sort with
the help of the Selection Sort algorithm.
1st Iteration:
Set minimum = 7
o Compare a0 and a1
o Compare a1 and a2
o Compare a2 and a3
As, a2 < a3, set minimum= 3.
o Compare a2 and a4
2nd Iteration:
Set minimum = 4
o Compare a1 and a2
o Compare a1 and a3
o Compare a1 and a4
Again, a1 < a4, set minimum = 4.
Since the minimum is already placed in the correct position, so there will be no
swapping.
3rd Iteration:
Set minimum = 7
o Compare a2 and a3
o Compare a3 and a4
Since 5 is the smallest element among the leftover unsorted elements, so we will
swap 7 and 5.
4th Iteration:
Set minimum = 6
o Compare a3 and a4
Since the minimum is already placed in the correct position, so there will be no
swapping.
Therefore, the selection sort algorithm has a time complexity of O(n²) and a space complexity of O(1), since it needs only some extra memory space for the temp variable used in swapping.
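A C++ sketch of the pseudocode above, using the same array of values as the walkthrough:

#include <iostream>

void selectionSort(int a[], int n)
{
    for (int j = 0; j < n - 1; j++) {
        int smallest = j;
        // find the smallest element in the unsorted part a[j..n-1]
        for (int i = j + 1; i < n; i++)
            if (a[i] < a[smallest])
                smallest = i;
        int temp = a[j];           // a single swap per pass
        a[j] = a[smallest];
        a[smallest] = temp;
    }
}

int main()
{
    int a[] = {7, 4, 3, 6, 5};
    selectionSort(a, 5);
    for (int x : a) std::cout << x << " ";   // 3 4 5 6 7
    return 0;
}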
Self-Assessment Exercise
1. How does the Selection sort algorithm work?
2. What is the Average case and Worst case complexity of the Selection Sort
algorithm?
4.0 Conclusion
The sorting problem leads us to find better algorithms for arranging the items in a list or sequence in a given order. Ascending order is when the list is arranged from the smallest item to the biggest, while descending order is when the list is arranged from the biggest item to the smallest. We looked at the bubble sort and selection sort algorithms, which are well suited to sorting small lists efficiently.
5.0 Summary
In simple terms, the Sorting algorithm arranges a list from either smallest item
consecutively to the biggest item (Ascending order) or from the biggest item
consecutively to the smallest item (Descending order).
Two methods of sorting small lists (Bubble Sort and Selection Sort) were introduced; incidentally, they both have the same worst-case running time of O(n²).
Module 2: Sorting and Searching Algorithms
Unit 2: Insertion Sort and Linear Search Algorithm
1.0 Introduction
2.0 Objectives
3.0 Insertion Sort
3.1 How Insertion Sort Works
3.2 Complexity of Insertion Sort
3.2.1 Time Complexities
3.2.2 Space Complexity
3.2.3 Insertion Sort Applications
3.2.4 Advantages of Insertion Sort
3.2.5 Disadvantages of Insertion Sort
3.3 Linear Search Algorithm
3.4 Complexity of Linear Search
3.4.1 Advantages of Linear Search
3.4.2 Disadvantages of Linear Search
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignments
7.0 Further Reading and Other Resources
1.0 Introduction
Insertion sort is one of the simplest sorting algorithms for the reason that it sorts
a single element at a particular instance. It is not the best sorting algorithm in
terms of performance, but it's slightly more efficient than selection sort
and bubble sort in practical scenarios. It is an intuitive sorting technique.
2.0 Objectives
By the end of this unit, you will be able to:
Know how Insertion sort and Linear search works
Understand the complexities of both Linear search and Insertion sort
Know the advantages and disadvantages of Linear search
Know the advantages and disadvantages of Insertion sort
Use the Linear Search and Insertion sort algorithms to write good programs
in any programming language of your choice.
3.0 Insertion Sort
Let's consider the example of cards to have a better understanding of the logic
behind the insertion sort.
Suppose we have a set of cards in our hand, such that we want to arrange these
cards in ascending order. To sort these cards, we have a number of intuitive ways.
One such thing we can do is initially we can hold all of the cards in our left hand,
and we can start taking cards one after other from the left hand, followed by
building a sorted arrangement in the right hand.
Assuming the first card to be already sorted, we will select the next unsorted card.
If the unsorted card is found to be greater than the selected card, we will simply
place it on the right side, else to the left side. At any stage during this whole
process, the left hand will be unsorted, and the right hand will be sorted.
In the same way, we will sort the rest of the unsorted cards by placing them in the
correct position. At each iteration, the insertion algorithm places an unsorted
element at its right place.
1. We will start by assuming the very first element of the array is already sorted. We will store the second element in a variable called key.
Next, we will compare our first element with the key, such that if the key is found
to be smaller than the first element, we will interchange their indexes or place the
key at the first index. After doing this, we will notice that the first two elements
are sorted.
2. Now, we will move on to the third element and compare it with the left-hand
side elements. If it is the smallest element, then we will place the third element at
the first index.
Else if it is greater than the first element and smaller than the second element,
then we will interchange its position with the third element and place it after the
first element. After doing this, we will have our first three elements in a sorted
manner.
3. Similarly, we will sort the rest of the elements and place them in their correct
position.
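The steps above can be sketched in C++ as follows (the array values here are arbitrary illustrative data):

#include <iostream>

void insertionSort(int a[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = a[j];            // the element to be inserted
        int i = j - 1;
        // shift elements greater than key one position to the right
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;            // insert key at its correct position
    }
}

int main()
{
    int a[] = {45, 22, 63, 14, 55, 36};
    insertionSort(a, 6);
    for (int x : a) std::cout << x << " ";   // 14 22 36 45 55 63
    return 0;
}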
Consider the following example of an unsorted array that we will sort with the
help of the Insertion Sort algorithm.
Initially,
1st Iteration:
Set key = 22
Compare a1 with a0
2nd Iteration:
Set key = 63
3rd Iteration:
Set key = 14
Compare a3 with a2, a1 and a0
Since a3 is the smallest among all the elements on the left-hand side, place a3 at
the beginning of the array.
4th Iteration:
Set key = 55
5th Iteration:
Set key = 36
Since a5 < a2, so we will place the elements in their correct positions.
Logic: If we are given n elements, then in the first pass the algorithm makes n−1 comparisons; in the second pass, n−2; in the third pass, n−3, and so on. Thus, the total number of comparisons is
(n−1) + (n−2) + (n−3) + ... + 1 = n(n−1)/2
which is O(n²).
3.2.1 Time Complexities:
o Best Case Complexity: The insertion sort algorithm has a best-case time complexity of O(n) for an already sorted array, because then only the outer loop runs n times while the inner loop performs no shifts.
o Average Case Complexity: The average-case time complexity of the insertion sort algorithm is O(n²), which is incurred when the elements are in jumbled order, i.e., neither ascending nor descending.
o Worst Case Complexity: The worst-case time complexity is also O(n²), which occurs when we sort an array in descending order into ascending order.
In this algorithm, each element is compared with the elements before it, so up to n−1 comparisons are made for the nth element. The insertion sort has a space complexity of O(1), due to the usage of a single extra variable, key.
4. It is in-place (only requires a constant amount O (1) of extra memory
space).
5. It is an online algorithm, which can sort a list when it is received.
Self-Assessment Exercise:
1. What is the worst case time complexity of insertion sort where position
of the data to be inserted is calculated using binary search?
2. Consider an array of elements arr[5]= {5,4,3,2,1} , what are the steps of
insertions done while doing insertion sort in the array.
3. How many passes does an insertion sort algorithm consist of?
4. What is the average case running time of an insertion sort algorithm?
5. What is the running time of an insertion sort algorithm if the input is pre-
sorted?
3.3 Linear Search Algorithm
Starting at the beginning of the data set, each item of data is examined until a match is made. Once the item is found, the search ends.
Suppose we were to search for the value 2. The search would start at position 0
and check the value held there, in this case 3.
A linear search, although simple, can be quite inefficient. Suppose the data set
contained 100 items of data, and the item searched for happens to be the last item
in the set? All of the previous 99 items would have to be searched through first.
However, linear searches have the advantage that they will work on any data set,
whether it is ordered or unordered.
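A minimal C++ sketch of linear search, using a small hypothetical data set that, as in the description above, holds the value 3 at position 0 and contains the value 2:

#include <iostream>

// Returns the index of target in a[0..n-1], or -1 if not found.
int linearSearch(const int a[], int n, int target)
{
    for (int i = 0; i < n; i++)     // examine each item in turn
        if (a[i] == target)
            return i;               // match found: stop searching
    return -1;                      // searched the whole list
}

int main()
{
    int a[] = {3, 5, 2, 8};
    std::cout << linearSearch(a, 4, 2) << std::endl;  // prints 2
    return 0;
}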
3.4 Complexity of Linear Search
The worst case complexity of linear search is O(n).
If the element to be searched for is in the first position, then the best-case complexity is O(1).
3.4.1 Advantages of Linear Search
a. It will perform fast searches of small to medium lists. With today's powerful computers, small to medium arrays can be searched relatively quickly.
Self-Assessment Exercises
1. Given a list of numbers 12, 45, 23, 7, 9, 10, 22, 87, 45, 23, 34, 56
a. Use the linear search algorithm to search for the number 10
b. Comment on the worst-case running time of your algorithm
2. When do we consider the linear search algorithm a better alternative?
3. What is the best case for linear search?
4.0 Conclusion
The insertion sort is a simple sorting algorithm that builds the final sorted array one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort or merge sort. A linear search, or sequential search, is a method for finding an element within a list; it sequentially checks each element of the list until a match is found or the whole list has been searched.
5.0 Summary
We examined the Insertion sort algorithm and how it can be used to sort or
arrange a list in any order while at the same time noting its complexity,
advantages and disadvantages. A linear search algorithm, which is also known as sequential search, is used to find a given element in a list and returns a positive answer once the element is located; otherwise it returns a negative answer. Linear search is very efficient for searching an item within a small list.
Module 2: Searching and Sorting algorithms
Unit 3: Radix Sort and Stability in Sorting
1.0 Introduction
2.0 Objectives
3.0 Radix Sort
3.1 Complexity of Radix Sort
3.1.1 Advantages of Radix Sort
3.1.2 Disadvantages of Radix Sort
3.1.3 Applications of Radix Sort
3.2 Stability in Sorting
3.2.1 Why is Stable Sort Useful?
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignments
7.0 Further Reading and Other Resources
1.0 Introduction
Radix sort is a non-comparative sorting algorithm that sorts numbers digit by digit, from the least significant digit to the most significant. Because each digit-wise pass can be done in linear time with a stable sort, radix sort can outperform comparison-based sorts when the keys are short.
2.0 Objectives
By the end of this unit, you will be able to:
Know how the Radix sort algorithm works
Understand the complexity, advantages and disadvantages of Radix sort
Understand the concept of stability in sorting and why it is useful
Radix Sort is a Sorting algorithm that is useful when there is a constant 'd' such
that all keys are d digit numbers. To execute Radix Sort, for p =1 towards 'd'
sort the numbers with respect to the Pth digits from the right using any linear
time stable sort.
Radix sort is a sorting technique that sorts the elements digit to digit based on
radix. It works on integer numbers. To sort the elements of the string type, we
can use their hash value. This sorting algorithm makes no comparison.
The Code for Radix Sort is straightforward. The following procedure assumes
that each element in the n-element array A has d digits, where digit 1 is the
lowest order digit and digit d is the highest-order digit.
Here is the algorithm that sorts A[1..n], where each number is d digits long.
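The procedure referred to here follows the standard formulation given in Cormen et al.:

RADIX-SORT (A, d)
1. for i ← 1 to d
2. do use a stable sort to sort array A on digit i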
Example: The first column is the input. The remaining columns show the list
after successive sorts on increasingly significant digit positions. The vertical
106
arrows indicate the digit position sorted on to produce each list from the previous
one.
576 49[4] 9[5]4 [1]76 176
494 19[4] 5[7]6 [1]94 194
194 95[4] 1[7]6 [2]78 278
296 → 57[6] → 2[7]8 → [2]96 → 296
278 29[6] 4[9]4 [4]94 494
176 17[6] 1[9]4 [5]76 576
954 27[8] 2[9]6 [9]54 954
Space Complexity
In this algorithm, we have two auxiliary arrays cnt of size b (base)
and tempArray of size n (number of elements), and an input array arr of size n.
107
Space complexity: O(n+b)
The base of the radix sort does not depend upon the number of elements. In some
cases, the base may be larger than the number of elements.
Radix sort becomes slow when the element size is large but the radix is small.
We cannot always use a large radix because it requires large memory for the counting sort.
It is good to use radix sort when d is small.
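As a concrete illustration, the following is a sketch in C of an LSD radix sort for non-negative integers in base 10, using a counting pass as the stable inner sort; the names counting_pass and radix_sort and all parameters are illustrative:

#include <string.h>

/* One stable counting-sort pass over arr[0..n-1], keyed on the decimal
   digit selected by place (1, 10, 100, ...). */
static void counting_pass(int arr[], int n, int place)
{
    int out[n];
    int cnt[10] = {0};
    for (int i = 0; i < n; i++)
        cnt[(arr[i] / place) % 10]++;
    for (int d = 1; d < 10; d++)       /* prefix sums give end positions */
        cnt[d] += cnt[d - 1];
    for (int i = n - 1; i >= 0; i--)   /* scanning backwards keeps it stable */
        out[--cnt[(arr[i] / place) % 10]] = arr[i];
    memcpy(arr, out, n * sizeof(int));
}

/* Sort non-negative integers; max is the largest value in arr. */
void radix_sort(int arr[], int n, int max)
{
    for (int place = 1; max / place > 0; place *= 10)
        counting_pass(arr, n, place);
}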
3.1.1 Advantages of Radix Sort

Fast when the keys are short, i.e. when the range of the array elements is small.
Used in suffix array construction algorithms like Manber's algorithm and the DC3 algorithm.
Radix Sort is a stable sort, as the relative order of elements with equal values is maintained.
3.1.2 Disadvantages of Radix Sort

Since Radix Sort depends on digits or letters, it is much less flexible than other sorts.
The constant factor for Radix sort is greater compared to other sorting algorithms.
It takes more space compared to Quicksort, which is an in-place sort.
3.1.3 Applications of Radix Sort

Radix sort can be applied to data that can be sorted lexicographically, such
as words and integers. It is also used for stably sorting strings.
It is a good option when the algorithm runs on parallel machines, making
the sorting faster. To use parallelization, we divide the input into several
108
buckets, enabling us to sort the buckets in parallel, as they are independent
of each other.
It is used for constructing a suffix array. (An array that contains all the
possible suffixes of a string in sorted order is called a suffix array.)
Self-Assessment Exercises
1. If we use Radix Sort to sort n integers in the range (n^(k/2), n^k], for some k > 0
which is independent of n, the time taken would be?
3. Sort the following list in descending order using the Radix sort
algorithm
109
3.2 Stability in Sorting

A sorting algorithm is said to be stable if it maintains the relative order of records
with equal keys: if R and S have the same key, and R appears before S in the original
list, then R will always appear before S in the sorted list.
When equal elements are indistinguishable, such as with integers, or more
generally, any data where the entire element is the key, stability is not an issue.
Stability is also not an issue if all keys are different.
An example of stable sort on playing cards. When the cards are sorted by rank
with a stable sort, the two 5s must remain in the same order in the sorted output
that they were originally in. When they are sorted with a non-stable sort, the 5s
may end up in the opposite order in the sorted output.
110
For example, suppose the cards are to be sorted such that the suits are in the order
clubs (♣), diamonds (♦), hearts (♥), spades (♠), and within each suit, the cards are
sorted by rank. This can be done by first sorting the cards by rank (using any sort),
and then doing a stable sort by suit:
Within each suit, the stable sort preserves the ordering by rank that was already
done. This idea can be extended to any number of keys and is utilized by radix
sort. The same effect can be achieved with an unstable sort by using a
lexicographic key comparison, which, e.g., compares first by suit, and then
compares by rank if the suits are the same.
111
For example, suppose a list of (key, value) pairs such as (4, 5), (1, 2), (4, 3) is
sorted by key. A stable sort produces (1, 2), (4, 5), (4, 3), while an unstable sort
may produce (1, 2), (4, 3), (4, 5).
The sorting algorithm that produces the first output is known as a stable
sorting algorithm, because the original order of equal keys is maintained: you
can see that (4, 5) comes before (4, 3) in the sorted order, which was the original
order, i.e. in the given input, (4, 5) comes before (4, 3).
On the other hand, the algorithm that produces the second output is known as an
unstable sorting algorithm, because the order of objects with the same key is not
maintained in the sorted order. You can see that in the second output, (4, 3)
comes before (4, 5), which was not the case in the original input.
Self-Assessment Exercise
1. Can any unstable sorting algorithm be altered to become stable? If so, how?
2. What is the use of differentiating algorithms on the basis of stability?
3. When is it definitely unnecessary to look at the nature of stability of a
sorting algorithm?
4. What are some stable sorting techniques?
5. What properties of sorting algorithms are most likely to get affected when
a typically unstable sorting algorithm is implemented to be stable?
4.0 Conclusion
In computer science, radix sort is a non-comparative sorting algorithm. It
avoids comparison by creating and distributing elements into buckets
according to their radix. Stable sorting algorithms on the other hand maintain
the relative order of records with equal keys (i.e. values). That is, a sorting
algorithm is stable if whenever there are two records R and S with the same
key and with R appearing before S in the original list, R will appear before S
in the sorted list.
5.0 Summary
We considered another good example of a sorting algorithm known as Radix sort,
which, perhaps without realising it, is one of the commonest methods we use when
sorting items in a list. We also looked at stability in sorting algorithms and how to
identify stable and unstable sorting algorithms.
112
6.0 Tutor Marked Assignments

2. Assuming that the number of digits used is not excessive, the worst-case
cost for Radix Sort when sorting n keys with distinct key values is:
3. If an unstable sorting algorithm happens to preserve the relative order in a
particular example, is it said to be stable?
4. The running time of radix sort on an array of n integers in the range
[0 .. n^5 - 1] when using base 10 representation is?
5. How can you convert an unstable sorting algorithm into a stable sorting
algorithm?
7.0 Further Reading and Other Resources

Levitin, A. (2012). Introduction to the Design and Analysis of Algorithms, 3rd Ed.
Pearson Education, ISBN-10: 0132316811
113
Module 2: Searching and Sorting Algorithms
Unit 4: Divide and Conquer Strategies I: Binary Search Algorithm
Page
1.0 Introduction 115
2.0 Objectives 115
3.0 Divide-and-Conquer Algorithms 116
3.1 Fundamentals of Divide-and-Conquer Strategy 116
3.1.1 Applications of Divide-and-Conquer Approach 117
3.1.2 Advantages of Divide-and-Conquer 118
3.1.3 Disadvantages of Divide-and-Conquer 119
3.1.4 Properties of Divide-and-Conquer Algorithms 119
3.2 Binary Search 120
3.2.1 Complexity of Binary Search 122
3.2.2 Advantages of Binary Search 123
3.2.3 Disadvantages of Binary Search 123
3.2.4 Applications of Binary Search 123
114
1.0 Introduction
Divide-and-Conquer is a useful problem-solving technique that divides a large
problem instance into smaller and smaller instances and then solves these
smaller instances to give a complete solution to the bigger problem. There are
several strategies for implementing the Divide-and-Conquer approach. We
shall first examine the Binary Search algorithm, which requires that a list be
sorted and then proceeds to find any requested item in the list; it is very efficient
for large lists since it runs in logarithmic time.
2.0 Objectives
By the end of this unit, you should be able to:
Know the meaning of a Divide-and-Conquer Algorithm
Know how to use a Divide-and-Conquer algorithm
Know the different applications of Divide-and-Conquer algorithms
Understand the Binary Search algorithm,
Know why the Binary Search algorithm is useful
Understand the benefits and shortcomings of Binary search
Know the different application areas of Binary Search
3.0 Divide-and-Conquer Algorithms

A Divide and Conquer algorithm solves a problem using the following three
steps.
115
Generally, we can follow the divide-and-conquer approach in a three-step
process:
Divide the problem into a number of subproblems.
Conquer the subproblems by solving them recursively.
Combine the solutions to the subproblems into the solution for the original problem.
Examples: specific computer algorithms based on the Divide & Conquer approach
include Binary Search, Merge Sort, Quick Sort and the Tower of Hanoi.

3.1 Fundamentals of Divide-and-Conquer Strategy

There are two fundamentals of the Divide & Conquer strategy:
1. Relational Formula
2. Stopping Condition
1. Relational Formula: this is the formula we generate from the given technique.
After generating the formula, we apply the Divide & Conquer strategy, i.e. we
break the problem recursively and solve the broken subproblems.
116
2. Stopping Condition: when we break the problem using the Divide & Conquer
strategy, we need to know how long to keep applying it. The condition at which
we stop the recursive steps of Divide & Conquer is called the Stopping Condition.
The following algorithms are based on the concept of the Divide and Conquer
technique: Binary Search, Quicksort, Merge Sort and the Closest Pair of Points
problem, which asks for the closest pair of points in a plane or
117
space, given n points, such that the distance between the pair of points
is minimal.
3.1.2 Advantages of Divide and Conquer

a. Divide and Conquer tends to successfully solve some of the biggest problems,
such as the Tower of Hanoi, a mathematical puzzle. It is challenging to
solve complicated problems for which you have no basic idea, but the divide
and conquer approach lessens the effort, as it works by dividing the main
problem into two halves and then solving them recursively. Such an
algorithm is often much faster than the alternatives.
b. It uses cache memory efficiently without occupying much space, because it
solves simple subproblems within the cache memory instead of accessing
the slower main memory.
c. It is more proficient than its counterpart, the Brute Force technique.
d. Since these algorithms exhibit parallelism, they can be handled without
modification by systems incorporating parallel processing.
118
3.1.3 Disadvantages of Divide and Conquer
119
Self-Assessment Exercise
1. The steps in the Divide-and-Conquer process that takes a recursive
approach is said to be?
2. Given the recurrence f(n) = 4 f(n/2) + 1, how many sub-problems will a
divide-and-conquer algorithm divide the original problem into, and what
will be the size of those sub-problems?
3. Design a divide-and-conquer algorithm to compute k^n for k > 0 and
integer n >= 0.
4. Define divide and conquer approach to algorithm design
120
3.2 Binary Search

Binary search repeatedly compares the target with the middle item of a sorted list
and discards the half that cannot contain it. Suppose we were to search a sorted
list of nine items (positions 0 to 8) for the value 11.
The midpoint is found by adding the lowest position to the highest position and
dividing by 2.
8/2 = 4
NOTE - if the answer is a decimal, round up. For example, 3.5 becomes 4. We
can round down as an alternative, as long as we are consistent.
7 is less than 11, so the bottom half of the list (including the midpoint) is
discarded.
14 is greater than 11, so the top half of the list (including the midpoint) is
discarded.
121
Check at position 6.
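A minimal iterative sketch of binary search in C (all names illustrative) is:

/* Iterative binary search on a sorted array; returns the index of key
   in arr[0..n-1], or -1 if key is not present. */
int binary_search(const int arr[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = (low + high) / 2;   /* midpoint of the current range */
        if (arr[mid] == key)
            return mid;
        else if (arr[mid] < key)
            low = mid + 1;            /* discard the bottom half, incl. midpoint */
        else
            high = mid - 1;           /* discard the top half, incl. midpoint */
    }
    return -1;
}

Note that this sketch rounds the midpoint down, which is the consistent alternative mentioned in the note above.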
Self-Assessment Exercise
1. Which type of lists or data sets are binary searching algorithms used
for?
123
2. A binary search is to be performed on the list:
[3 5 9 10 23]. How many comparisons would it take to find number
9?
3. How many binary searches will it take to find the value 7 in the list
[1,4,7,8,10,28]?
4. Given an array arr = {45,77,89,90,94,99,100} and key = 100, what
are the mid values (corresponding array elements) generated in the
first and second iterations?
4.0 Conclusion
In computer science, divide and conquer is an algorithm design paradigm. A
divide-and-conquer algorithm recursively breaks down a problem into two or
more sub-problems of the same or related type, until these become simple enough
to be solved directly.
A binary search algorithm is a widely used algorithm in the computational
domain. It is a fast and accurate search algorithm that works well on both big
and small datasets. A binary search algorithm is simple and reliable to implement,
and with time and space analysis, the benefits of using this particular technique
are evident.
5.0 Summary
We looked at the meaning of Divide-and-Conquer algorithms and how they work
and then considered a very good example of a Divide-and-Conquer algorithm
called Binary Search which is very efficient for large lists as its worst case
complexity is given in logarithmic time.
6.0 Tutor Marked Assignments

4. How many binary searches will it take to find the value 10 in the list
[1,4,9,10,11]?
124
5. Given a real number, x, and a natural number n, x^n can be defined by
the following recursive function:
x^n = 1 if n = 0
x^n = x · x^(n-1) if n > 0
125
Module 2: Searching and Sorting Algorithms
Unit 5: Merge Sort and Quick Sort Algorithms
Page
1.0 Introduction 127
2.0 Objectives 127
3.0 MergeSort 127
3.1 Mergesort Algorithm 128
3.1.1 Complexity Analysis of Mergesort 132
3.1.2 Mergesort Applications 133
3.1.3 Advantages of Mergesort 133
3.1.4 Disadvantages of Mergesort 133
3.2 Quicksort 134
3.2.1 Complexity of Quicksort 136
3.2.2 Advantages of Quicksort 137
3.2.3 Disadvantages of Quicksort 137
3.2.4 Applications of Quicksort 137
126
1.0 Introduction
We continue with two more examples of Divide-and-Conquer algorithms which,
incidentally, are sorting algorithms. The Merge sort (also spelled mergesort) is an
efficient sorting algorithm that uses a divide-and-conquer approach to order
elements in an array. It repeatedly breaks down a list into several sublists until
each sublist consists of a single element, then merges those sublists in a manner
that results in a sorted list.
Like mergesort, Quick Sort (also spelled QuickSort) is a Divide and Conquer
algorithm. It picks an element as pivot and partitions the given array around the
picked pivot.
2.0 Objectives
At the end of this unit, you will be able to:
Understand the Mergesort algorithm
Know when and where we can apply mergesort
Understand the complexity of the mergesort approach
Know the benefits and shortcomings of mergesort
Know more about the Quicksort algorithm
Understand how Quicksort works
Be able to write code for mergesort and quicksort
Be able to perform simple sorting of any list using quicksort and mergesort.
3.0 Mergesort

Merge sort is yet another sorting algorithm that falls under the category of the Divide
and Conquer technique. It is one of the best sorting techniques, built on a
recursive approach.
In this technique, we segment a problem into two halves and solve them
individually. After finding the solution of each half, we merge them back to
represent the solution of the main problem.
Suppose we have an array A, such that our main concern will be to sort the
subsection, which starts at index p and ends at index r, represented by A[p..r].
127
Divide

If q is the halfway point between p and r, then we can split the subarray A[p..r]
into two subarrays A[p..q] and A[q+1..r].

Conquer

After splitting the array into two halves, the next step is to conquer. In this step,
we individually sort both of the subarrays A[p..q] and A[q+1..r]. If we have not
yet reached the base case, we again follow the same procedure, i.e., we
further segment these subarrays and sort them separately.
Combine

When the conquer step reaches the base step and we get two completely
sorted subarrays A[p..q] and A[q+1..r], we merge them back to form
a new sorted array A[p..r].
The MergeSort function keeps on splitting an array into two halves until a
condition is met where we try to perform MergeSort on a subarray of size 1, i.e., p
== r.
And then, it combines the individually sorted subarrays into larger arrays until
the whole array is merged.
ALGORITHM: MERGE-SORT (A, p, r)
1. If p < r
2. Then q ← (p + r)/2
3. MERGE-SORT (A, p, q)
4. MERGE-SORT (A, q+1, r)
5. MERGE (A, p, q, r)
As you can see in the image given below, the merge sort algorithm recursively
divides the array into halves until the base condition is met, where we are left
with only 1 element in the array. And then, the merge function picks up the sorted
sub-arrays and merges them back to sort the entire array.
128
The following figure illustrates the dividing (splitting) procedure.
MERGE (A, p, q, r)
1. n1 ← q - p + 1
2. n2 ← r - q
3. create arrays L[1..n1 + 1] and R[1..n2 + 1]
4. for i ← 1 to n1
5. do L[i] ← A[p + i - 1]
6. for j ← 1 to n2
7. do R[j] ← A[q + j]
8. L[n1 + 1] ← ∞
9. R[n2 + 1] ← ∞
10. i ← 1
11. j ← 1
12. for k ← p to r
13. do if L[i] ≤ R[j]
14. then A[k] ← L[i]
15. i ← i + 1
16. else A[k] ← R[j]
17. j ← j + 1
129
The merge step of Merge Sort
The recursive algorithm depends mainly on a base case and on its ability to
merge back the results derived from the base cases. Merge sort is no different;
it is just that here the merge step carries more importance.
For any given problem, the merge step is the solution that combines the two
individually sorted lists (arrays) to build one large sorted list (array).
The merge sort algorithm maintains three pointers: one for each of the two
arrays and one to preserve the final sorted array's current index.
130
Merge( ) Function Explained Step-By-Step
Consider the following example of an unsorted array, which we are going to sort
with the help of the Merge Sort algorithm.
A= (36,25,40,2,7,80,15)
Step 1: The merge sort algorithm repeatedly divides the array into equal halves
until we reach an atomic value. If there is an odd number of elements
in the array, then one of the halves will have more elements than the other half.
Step 2: After dividing the array into two subarrays, we will notice that the division
did not hamper the order of elements as they were in the original array. We then
further divide these two arrays into halves.
Step 3: Again, we divide these arrays until we achieve an atomic value, i.e.,
a value that cannot be further divided.
Step 4: Next, we merge them back in the same way as they were broken
down.
Step 5: For each list, we first compare the elements and then combine them to
form a new sorted list.
Step 6: In the next iteration, we compare the lists of two data values and
merge them back into lists of four data values, all placed in a sorted manner.
131
Hence the array is sorted.
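For illustration, here is a sketch of the algorithm in C; unlike the pseudocode above, it avoids the ∞ sentinels by checking bounds, and all names are illustrative:

#include <string.h>

/* Merge the sorted runs A[p..q] and A[q+1..r] back into A. */
static void merge(int A[], int p, int q, int r)
{
    int n1 = q - p + 1, n2 = r - q;
    int L[n1], R[n2];
    memcpy(L, A + p, n1 * sizeof(int));
    memcpy(R, A + q + 1, n2 * sizeof(int));
    int i = 0, j = 0;
    for (int k = p; k <= r; k++) {
        if (j >= n2 || (i < n1 && L[i] <= R[j]))
            A[k] = L[i++];    /* take from the left run (<= keeps it stable) */
        else
            A[k] = R[j++];    /* take from the right run */
    }
}

void merge_sort(int A[], int p, int r)
{
    if (p < r) {
        int q = (p + r) / 2;      /* split point */
        merge_sort(A, p, q);
        merge_sort(A, q + 1, r);
        merge(A, p, q, r);
    }
}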
Best Case Complexity: The merge sort algorithm has a best-case time
complexity of O(n*log n) for the already sorted array.
Average Case Complexity: The average-case time complexity for the merge sort
algorithm is O(n*log n), which occurs when the elements are jumbled, i.e., neither
in ascending nor in descending order.
132
Worst Case Complexity: The worst-case time complexity is also O(n*log n),
which occurs when we sort the descending order of an array into the ascending
order.
Self-Assessment Exercise
1. A list of n strings, each of length n, is sorted into lexicographic order
using the merge-sort algorithm. The worst case running time of this
computation is?
2. What is the average case time complexity of merge sort?
3. A mergesort works by first breaking a sequence in half a number of
times so it is working with smaller pieces. When does it stop
breaking the list into sublists (in its simplest version)?
133
3.2 Quick Sort
Divide: rearrange the elements and split the array into two sub-arrays with a
pivot element in between, such that each element in the left sub-array is less
than or equal to the pivot element and each element in the right sub-array is
larger than the pivot element.
Algorithm:
Partition Algorithm:
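The algorithm and partition procedure are not reproduced in the text; a common formulation (shown here with the last element as pivot, whereas the worked example below picks the first element) is:

QUICKSORT (A, low, high)
1. if low < high
2. then pivot ← PARTITION (A, low, high)
3. QUICKSORT (A, low, pivot - 1)
4. QUICKSORT (A, pivot + 1, high)

PARTITION (A, low, high)
1. pivot ← A[high]
2. i ← low - 1
3. for j ← low to high - 1
4. do if A[j] ≤ pivot
5. then i ← i + 1
6. exchange A[i] ↔ A[j]
7. exchange A[i+1] ↔ A[high]
8. return i + 1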
134
Example of Quick Sort. Given the following list;
44 33 11 55 77 90 40 60 99 22 88
Let 44 be the Pivot element and scanning done from right to left
22 33 11 55 77 90 40 60 99 44 88
Now comparing 44 to the left-side elements, an element greater than 44 must
be swapped with it. As 55 is greater than 44, swap them.
22 33 11 44 77 90 40 60 99 55 88
Recursively, repeating steps 1 and steps 2 until we get two lists one left from
pivot element 44 & one right from pivot element.
22 33 11 40 77 90 44 60 99 55 88
22 33 11 40 44 90 77 60 99 55 88
Now, the element on the right side and left side are greater than and smaller
than 44 respectively.
And these sublists are sorted under the same process as above done.
135
Merging Sublists:
SORTED LISTS
Worst Case Analysis: the worst case occurs when the partition process always
picks the greatest or smallest element as pivot. If we consider the above partition
strategy, where the last element is always picked as pivot, the worst case occurs
when the array is already sorted in increasing or decreasing order.
The recurrence for the worst case is T(n) = T(n-1) + O(n), which evaluates to O(n^2).
136
Average Case Analysis: T(n) = O(n log n) is the average case complexity of
quick sort for sorting n elements.
Best Case Analysis: in any sorting algorithm, the best case is the one in which no
comparisons between elements are made, which happens only when there is a
single element to sort.
Self-Assessment Exercises
1. What is recurrence for worst case of QuickSort and what is the time
complexity in Worst case?
2. Sort the following list in descending order of magnitude using
QuickSort [23, 65, 8, 78, 3, 65, 21, 9, 4, 43, 76, 1, 6, 4, 8, 56]. You
can pick any element as your pivot.
137
3. Apply Quick sort on a given sequence 7 11 14 6 9 4 3 12. What is
the sequence after first phase, pivot is first element?
4.0 Conclusion
Merge sort is a sorting technique based on the divide and conquer technique. With
a worst-case time complexity of O(n log n), it is one of the most respected
algorithms.
Merge sort first divides the array into equal halves and then combines them in a
sorted manner.
Quicksort is a sorting algorithm that makes O(n log n) comparisons in the average
case when sorting an array of n elements. It is a fast and highly efficient sorting
algorithm that follows the divide-and-conquer approach.
5.0 Summary
In this Unit, we examined two sorting algorithms that are examples of Divide-and-
Conquer algorithms. The Mergesort, which is also an external sorting algorithm,
was considered, with its complexity analysis explained as well as its benefits and
shortcomings.
The QuickSort algorithm, which is another example of a Divide-and-Conquer
algorithm, was also looked at, as well as its advantages and disadvantages.
a. MergeSort b. Quicksort
138
7.0 Further Reading and other Resources
Karumanchi, N. (2016). Data Structures and Algorithms, CareerMonk
Publications. ISBN-13: 978-8193245279
139
140
Module 3: Other Algorithm Techniques
Unit 1: Binary Search Trees
Page
1.0 Introduction 142
2.0 Objectives 142
3.0 Binary Search Trees 142
3.0.1 Binary Search Tree Property 143
3.1 Traversal in Binary Search Trees 143
3.1.1 Inorder Tree Walk 143
3.1.2 Preorder Tree Walk 144
3.1.3 Postorder Tree Walk 144
3.2 Querying a Binary Search Tree 144
3.2.1 Searching 144
3.2.2 Minimum and Maximum 145
3.2.3 Successor and Predecessor 146
3.2.4 Insertion in Binary Search Trees 147
3.2.5 Deletion in Binary Search Trees 148
3.3 Red Black Trees 151
3.3.1 Properties of Red Black Trees 151
3.4 Operations on Red Black Trees 152
3.4.1 Rotation 152
3.4.2 Insertion 155
3.4.3 Deletion 159
141
1.0 Introduction
We introduce here a special search tree called the Binary Search Tree and a
derivative of it known as the Red Black Tree.
A binary search tree, also known as an ordered binary tree, is a binary tree wherein
the nodes are arranged in an order: a) all the values in the left sub-tree are less
than that of the root node; b) all the values in the right sub-tree are greater than
the value of the root node.
A red-black tree, on the other hand, is a binary tree where each node has a colour
as an extra attribute, either red or black. By checking the node colours on any
simple path from the root to a leaf, red-black trees ensure that no such path is
more than twice as long as any other, so that the tree is generally balanced.
2.0 Objectives
3.0 Binary Search Trees

A Binary Search Tree is organized as a binary tree. Such a tree can be defined by
a linked data structure in which a particular node is an object. In addition to a key
field, each node contains fields left, right, and p that point to the nodes
corresponding to its left child, its right child, and its parent, respectively. If a child
or parent is missing, the appropriate field contains the value NIL. The root node is
the only node in the tree whose parent field is NIL.
142
3.0.1 Binary Search Tree Property

The keys in a binary search tree are always stored in such a way as to satisfy the
binary-search-tree property: if y is a node in the left subtree of node x, then
key[y] ≤ key[x]; if y is a node in the right subtree of x, then key[x] ≤ key[y].

3.1 Traversal in Binary Search Trees

3.1.1 Inorder Tree Walk

In an Inorder Tree walk, we always print the keys in the binary search tree in sorted
order.

INORDER-TREE-WALK (x):
1. If x ≠ NIL.
2. then INORDER-TREE-WALK (left [x])
143
3. print key [x]
4. INORDER-TREE-WALK (right [x])
3.1.2 Preorder Tree Walk

In a Preorder Tree walk, we visit the root node before the nodes in either subtree.
PREORDER-TREE-WALK (x):
1. If x ≠ NIL.
2. then print key [x]
3. PREORDER-TREE-WALK (left [x]).
4. PREORDER-TREE-WALK (right [x]).
3.1.3 Postorder Tree Walk

In a Postorder Tree walk, we visit the root node after the nodes in its subtrees.
POSTORDER-TREE-WALK (x):
1. If x ≠ NIL.
2. then POSTORDER-TREE-WALK (left [x]).
3. POSTORDER-TREE-WALK (right [x]).
4. print key [x]
3.2 Querying a Binary Search Tree

3.2.1 Searching

The TREE-SEARCH (x, k) algorithm searches the subtree rooted at x for a node
whose key value equals k. It returns a pointer to the node if it exists, otherwise
NIL.
TREE-SEARCH (x, k)
1. If x = NIL or k = key [x].
2. then return x.
144
3. If k < key [x].
4. then return TREE-SEARCH (left [x], k)
5. else return TREE-SEARCH (right [x], k)
Clearly, this algorithm runs in O(h) time, where h is the height of the tree. The
iterative version of the above algorithm is very easy to implement:
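For reference, the iterative version in its standard formulation (as in Cormen et al.) is:

ITERATIVE-TREE-SEARCH (x, k)
1. while x ≠ NIL and k ≠ key [x]
2. do if k < key [x]
3. then x ← left [x]
4. else x ← right [x]
5. return x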
3.2.2 Minimum and Maximum

An item in a binary search tree whose key is a minimum can always be found
by following left child pointers from the root until a NIL is encountered. The
following procedure returns a pointer to the minimum element in the subtree
rooted at a given node x.

TREE-MINIMUM (x)
1. While left [x] ≠ NIL
2. do x ← left [x]
3. return x

Symmetrically, TREE-MAXIMUM follows right child pointers until a NIL is
encountered.
145
3.2.3 Successor and Predecessor

Given a node in a binary search tree, we sometimes need to find its successor
in the sorted order determined by an inorder tree walk. If all keys are distinct,
the successor of a node x is the node with the smallest key greater than key[x].
The structure of a binary search tree allows us to determine the successor of a node
without ever comparing keys. The following procedure returns the successor of a
node x in a binary search tree if it exists, and NIL if x has the greatest key in
the tree:
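The standard formulation of this procedure (as in Cormen et al.) is:

TREE-SUCCESSOR (x)
1. if right [x] ≠ NIL
2. then return TREE-MINIMUM (right [x])
3. y ← p [x]
4. while y ≠ NIL and x = right [y]
5. do x ← y
6. y ← p [y]
7. return y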
The code for TREE-SUCCESSOR is broken into two cases. If the right subtree
of node x is nonempty, then the successor of x is just the leftmost node in the
right subtree, which we find in line 2 by calling TREE-MINIMUM (right [x]).
On the other hand, if the right subtree of node x is empty and x has a successor
y, then y is the lowest ancestor of x whose left child is also an ancestor of x.
To find y, we quickly go up the tree from x until we encounter a node that is
the left child of its parent; lines 3-7 of TREE-SUCCESSOR handle this case.
146
3.2.4. Insertion in Binary Search Tree:
To insert a new value v into a binary search tree T, we use the procedure TREE-
INSERT. The procedure takes a node z for which key [z] = v, left [z] = NIL, and
right [z] = NIL. It modifies T and some of the attributes of z in such a way that z
is inserted into an appropriate position in the tree.
TREE-INSERT (T, z)
1. y ←NIL.
2. x←root [T]
3. while x ≠ NIL.
4. do y←x
5. if key [z]< key [x]
6. then x←left [x].
7. else x←right [x].
8. p [z]←y
9. if y = NIL.
10. then root [T]←z
11. else if key [z] < key [y]
12. then left [y]←z
13. else right [y]←z
For Example:
Working of TREE-INSERT
Suppose we want to insert an item with key 13 into a Binary Search Tree.
147
x = 1
y = 1 as x ≠ NIL.
Key [z] < key [x]
13 is not less than 12, so:
x ←right [x].
x ←3
Again x ≠ NIL
y ←3
key [z] < key [x]
13 < 18
x←left [x]
x←6
Again x ≠ NIL, y←6
13 < 15
x←left [x]
x←NIL
p [z]←6
Now our node z will be either left or right child of its parent (y).
3.2.5 Deletion in Binary Search Trees

When deleting a node from a tree, it is essential that any relationships implicit in
the tree are maintained. Three cases arise in the deletion of nodes from a binary
search tree:
1. Nodes with no children: This case is trivial. Simply set the parent's
pointer to the node to be deleted to nil and delete the node.
2. Nodes with one child: when z has no left child, we replace z by its
right child, which may or may not be NIL; and when z has no right child,
we replace z with its left child.
3. Nodes with both children: when z has both a left and a right child, we find z's
successor y, which lies in z's right subtree and has no left child (the
148
successor of z will be the node with minimum value in z's right subtree, and so
it has no left child).
o If y is z's right child, then we replace z by y.
o Otherwise, y lies within z's right subtree but is not z's right child. In
this case, we first replace y by its own right child, and then we replace z
by y.
TREE-DELETE (T, z)
If left [z] = NIL or right [z] = NIL.
Then y ← z
Else y ← TREE- SUCCESSOR (z)
If left [y] ≠ NIL.
Then x ← left [y]
Else x ← right [y]
If x ≠NIL
Then p[x] ← p [y]
If p[y] = NIL.
Then root [T] ← x
Else if y = left [p[y]]
Then left [p[y]] ← x
Else right [p[y]] ← x
If y ≠ z.
Then key [z] ← key [y]
If y has other fields, copy them, too.
Return y
For Example: Deleting a node z from a binary search tree. Node z may be the
root, a left child of node q, or a right child of q.
Node z has two children; its left child is node l, its right child is its successor y,
and y's right child is node x. We replace z by y, updating y's left child to become
l, but leaving x as y's right child.
Node z has two children (left child l and right child r), and its successor y ≠ r lies
within the subtree rooted at r. We replace y with its own right child x, and we set
y to be r's parent. Then, we set y to be q's child and the parent of l.
150
Self-Assessment Exercises
1. What is the worst case time complexity for search, insert and delete
operations in a general Binary Search Tree?
2. We are given a set of n distinct elements and an unlabelled binary tree with
n nodes. In how many ways can we populate the tree with the given set so
that it becomes a binary search tree?
3. How many distinct binary search trees can be created out of 4 distinct keys?
4. Suppose the numbers 7, 5, 1, 8, 3, 6, 0, 9, 4, 2 are inserted in that order into
an initially empty binary search tree. The binary search tree uses the usual
ordering on natural numbers. What is the in-order traversal sequence of the
resultant tree?
3.3 Red Black Trees

A Red Black Tree is a category of self-balancing binary search tree. It was
created in 1972 by Rudolf Bayer, who termed them "symmetric binary B-trees."
A red-black tree is a binary tree where each node has a colour as an extra
attribute, either red or black. By checking the node colours on any simple path from
the root to a leaf, red-black trees ensure that no such path is more than twice as
long as any other, so that the tree is generally balanced.
151
3.3.1 Properties of Red Black Trees

A red-black tree satisfies the following conditions:
1. Every node is coloured either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both of its children are black.
5. For each node, every simple path from the node down to a descendant leaf
contains the same number of black nodes.

A tree T is an almost red-black tree (ARB tree) if the root is red, but the other
conditions above hold.
3.4 Operations on Red Black Trees

3.4.1 Rotation
152
Clearly, the in-order sequence (A, x, B, y, C) is preserved by the rotation operation.
Therefore, if we start with a BST and only restructure using rotation, we will still
have a BST, i.e. rotations do not break the BST property.
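The insertion and deletion fix-up procedures below call LEFT-ROTATE and RIGHT-ROTATE. The standard left-rotation (as in Cormen et al.; RIGHT-ROTATE is symmetric, with "right" and "left" exchanged) is:

LEFT-ROTATE (T, x)
1. y ← right [x]
2. right [x] ← left [y]
3. if left [y] ≠ nil [T]
4. then p [left [y]] ← x
5. p [y] ← p [x]
6. if p [x] = nil [T]
7. then root [T] ← y
8. else if x = left [p [x]]
9. then left [p [x]] ← y
10. else right [p [x]] ← y
11. left [y] ← x
12. p [x] ← y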
Example: Draw the complete binary tree of height 3 on the keys {1, 2, 3... 15}.
Add the NIL leaves and color the nodes in three different ways such that the black
heights of the resulting trees are: 2, 3 and 4.
Solution:
153
Tree with black-height-2
154
Tree with black-height-4
3.4.2. Insertion:
Insert the new node the way it is done in Binary Search Trees.
Color the node red
If an inconsistency arises for the red-black tree, fix the tree according to
the type of discrepancy.
A discrepancy can arise when a parent and a child both have the red colour. The
type of discrepancy is determined by the location of the node with respect to its
grandparent and by the colour of the sibling of its parent.
RB-INSERT (T, z)
y ← nil [T]
x ← root [T]
while x ≠ NIL [T]
do y ← x
if key [z] < key [x]
then x ← left [x]
else x ← right [x]
p [z] ← y
155
if y = nil [T]
then root [T] ← z
else if key [z] < key [y]
then left [y] ← z
else right [y] ← z
left [z] ← nil [T]
right [z] ← nil [T]
color [z] ← RED
RB-INSERT-FIXUP (T, z)
After inserting the new node, colouring it black may violate the black-height
condition, while colouring it red may violate the colouring conditions, i.e. that the
root is black and that a red node has no red children. Black-height violations are
hard to fix, so we colour the new node red. After this, if there is any colour
violation, we correct it with the RB-INSERT-FIXUP procedure.
RB-INSERT-FIXUP (T, z)
1. while color [p[z]] = RED
2. do if p [z] = left [p[p[z]]]
3. then y ← right [p[p[z]]]
4. if color [y] = RED
5. then color [p[z]] ← BLACK //Case 1
6. color [y] ← BLACK //Case 1
7. color [p[p[z]]] ← RED //Case 1
8. z ← p[p[z]] //Case 1
9. else if z = right [p[z]]
10. then z ← p [z] //Case 2
11. LEFT-ROTATE (T, z) //Case 2
12. color [p[z]] ← BLACK //Case 3
13. color [p [p[z]]] ← RED //Case 3
14. RIGHT-ROTATE (T, p [p[z]]) //Case 3
15. else (same as then clause with "right" and "left" exchanged)
16. color [root[T]] ← BLACK
Example: Show the red-black trees that result after successively inserting the
keys 41,38,31,12,19,8 into an initially empty red-black tree.
156
Solution:
Insert 41
Insert 19
157
Thus the final tree is
158
3.4.3. Deletion:
If the element to be deleted is in a node with only left child, swap this node
with one containing the largest element in the left subtree. (This node has
no right child).
If the element to be deleted is in a node with only right child, swap this
node with the one containing the smallest element in the right subtree (This
node has no left child).
If the element to be deleted is in a node with both a left child and a right
child, then swap in any of the above two ways. While swapping, swap only
the keys but not the colors.
The item to be deleted now has only a left child or only a right child.
Replace this node with its sole child. This may violate red constraints or the
black constraint. Violations of red constraints can be easily fixed.
If the deleted node is black, the black constraint is violated. The elimination
of a black node y causes any path that contained y to have one fewer black
node.
Two cases arise:
o The replacing node is red, in which case we merely color it black to
make up for the loss of one black node.
o The replacing node is black.
RB-DELETE (T, z)
1. if left [z] = nil [T] or right [z] = nil [T]
2. then y ← z
3. else y ← TREE-SUCCESSOR (z)
4. if left [y] ≠ nil [T]
5. then x ← left [y]
6. else x ← right [y]
7. p [x] ← p [y]
8. if p[y] = nil [T]
159
9. then root [T] ← x
10. else if y = left [p[y]]
11. then left [p[y]] ← x
12. else right [p[y]] ← x
13. if y≠ z
14. then key [z] ← key [y]
15. copy y's satellite data into z
16. if color [y] = BLACK
17. then RB-delete-FIXUP (T, x)
18. return y
RB-DELETE-FIXUP (T, x)
1. while x ≠ root [T] and color [x] = BLACK
2. do if x = left [p[x]]
3. then w ← right [p[x]]
4. if color [w] = RED
5. then color [w] ← BLACK //Case 1
6. color [p[x]] ← RED //Case 1
7. LEFT-ROTATE (T, p [x]) //Case 1
8. w ← right [p[x]] //Case 1
9. If color [left [w]] = BLACK and color [right[w]] =
BLACK
10. then color [w] ← RED //Case 2
11. x ← p[x] //Case 2
12. else if color [right [w]] = BLACK
13. then color [left[w]] ← BLACK //Case 3
14. color [w] ← RED //Case 3
15. RIGHT-ROTATE (T, w) //Case 3
16. w ← right [p[x]] //Case 3
17. color [w] ← color [p[x]] //Case 4
18. color p[x] ← BLACK //Case 4
19. color [right [w]] ← BLACK //Case 4
20. LEFT-ROTATE (T, p [x]) //Case 4
21. x ← root [T] //Case 4
22. else (same as then clause with "right" and "left"
exchanged)
23. color [x] ← BLACK
160
Example: In a previous example, we found that the red-black tree that results
from successively inserting the keys 41,38,31,12,19,8 into an initially empty tree.
Now show the red-black trees that result from the successful deletion of the keys
in the order 8, 12, 19,31,38,41.
Solution:
161
Delete 38
Delete 41
No Tree.
Self-Assessment Exercises
1. When deleting a node from a red-black tree, what condition might happen?
2. What is the maximum height of a Red-Black Tree with 14 nodes? (Hint:
The black depth of each external node in this tree is 2.) Draw an example
of a tree with 14 nodes that achieves this maximum height.
3. Why can't a Red-Black tree have a black node with exactly one black child
and no red child?
162
4.0 Conclusion
A binary search tree, also called an ordered or sorted binary tree, is a rooted binary
tree data structure whose internal nodes each store a key greater than all the keys
in the node’s left subtree and less than those in its right subtree. On the other
hand, a red–black tree is a kind of self-balancing binary search tree. Each node
stores an extra bit representing "color", used to ensure that the tree remains
balanced during insertions and deletions.
5.0 Summary
In this unit, we considered the Binary Search Tree and looked at how such trees
can be traversed, while also examining the various methods of querying or
accessing information from a Binary Search Tree. In addition, we looked at a
special derivative of the Binary Search Tree called the Red Black Tree, its properties
and some operations that can be carried out on Red Black Trees.
163
7.0 Further Reading and other Resources
164
Module 3: Other Algorithm Techniques
Unit 2: Dynamic Programming
Page
1.0 Introduction 166
2.0 Objectives 166
3.0 Dynamic Programming 166
3.1 How Dynamic Programming Works 168
3.2 Approaches of Dynamic Programming 169
3.2.1 Top-down approach 169
3.2.2 Bottom-up approach 171
3.3 Divide-and-Conquer Method vs Dynamic Programming 175
3.4 Techniques for Solving Dynamic Programming Problems 176
165
1.0 Introduction
Dynamic programming is both a mathematical optimization method and a
computer programming method. The method was developed by Richard Bellman
in the 1950s and has found applications in numerous fields, from aerospace
engineering to economics. We look at some of the techniques of Dynamic
Programming in this unit, as well as some benefits and applications of Dynamic
Programming.
2.0 Objectives
166
3.0 Dynamic Programming

Dynamic programming works by breaking a complex problem down into a
collection of simpler subproblems, solving each subproblem just once, and then
storing their solutions to avoid repetitive computations.
Consider the Fibonacci series: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
The numbers in this series are not randomly calculated. Mathematically, each
term can be written using the formula
F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.
To calculate the other numbers, we follow the above relationship. For example,
F(2) is the sum of F(0) and F(1), which is equal to 1.
The F(20) term is calculated using this formula for the nth term of the Fibonacci
series. The figure below shows how F(20) is calculated.
167
As we can observe in the above figure, F(20) is calculated as the sum of F(19)
and F(18).
In the dynamic programming approach, we try to divide the problem into
similar subproblems. We follow this approach in the above case, where
F(20) is divided into the similar subproblems F(19) and F(18). If we revisit the
definition of dynamic programming, it says that similar subproblems should
not be computed more than once. Still, in the above case, subproblems are
calculated twice: F(18) is calculated two times, and similarly F(17) is also calculated
twice. The technique is useful because it solves similar subproblems, but we
need to be careful to store the results once they have been computed; otherwise
recomputation leads to a wastage of resources.
In the above example, if we calculate F(18) again in the right subtree, it leads
to tremendous usage of resources and decreases the overall performance.
The solution to the above problem is to save the computed results in an array.
First, we calculate F(16) and F(17) and save their values in an array. The F(18) is
calculated by summing the values of F(17) and F(16), which are already saved in
an array. The computed value of F(18) is saved in an array. The value of F(19) is
calculated using the sum of F(18), and F(17), and their values are already saved
in an array. The computed value of F(19) is stored in an array. The value of F(20)
can be calculated by adding the values of F(19) and F(18), and the values of both
F(19) and F(18) are stored in an array. The final computed value of F(20) is stored
in an array.
The following are the steps that dynamic programming follows:
It breaks down the complex problem into simpler subproblems.
It finds the optimal solution to these subproblems.
It stores the results of the subproblems (memoization).
It reuses the stored results so that the same subproblem is not calculated more than once.
168
Finally, calculate the result of the complex problem.
The above five steps are the basic steps of dynamic programming. Dynamic
programming is applicable to problems having properties such as overlapping
subproblems and optimal substructure.

3.2 Approaches of Dynamic Programming

There are two approaches to dynamic programming:
Top-down approach
Bottom-up approach
169
Disadvantages of Top-down approach
It uses the recursion technique that occupies more memory in the call stack.
Sometimes when the recursion is too deep, the stack overflow condition
will occur.
It occupies more memory that degrades the overall performance.
int fib(int n)
{
if(n<0)
error;
if(n==0)
return 0;
if(n==1)
return 1;
int sum = fib(n-1) + fib(n-2);
return sum;
}
In the above code, we have used the recursive approach to compute the Fibonacci
series. When the value of 'n' increases, the function calls and computations also
increase. In this case, the time complexity grows exponentially and becomes
O(2^n).
170
int fib(int n)   /* assumes a global array memo[] initialised to zeros */
{
if(n<0)
error;
if(n==0)
return 0;
if(n==1)
return 1;
if(memo[n] != 0)
return memo[n];
int sum = fib(n-1) + fib(n-2);
memo[n] = sum;
return sum;
}
In the above code, we have used the memoization technique, in which we store
the results in an array so as to reuse the values. This is also known as the top-down
approach, in which we move from the top and break the problem into sub-
problems.
The bottom-up approach uses the tabulation technique to implement the dynamic
programming approach. It solves the same kind of problems but it removes the
recursion. If we remove the recursion, there is no stack overflow issue and no
overhead of the recursive functions. In this tabulation technique, we solve the
problems and store the results in a matrix.
The bottom-up approach is used to avoid recursion, thus saving memory
space. Bottom-up is an algorithm that starts from the beginning, whereas a
recursive algorithm starts from the end and works backward. In the
bottom-up approach, we start from the base case to find the answer for the end.
As we know, the base cases in the Fibonacci series are 0 and 1. Since the bottom-up
approach starts from the base cases, we will start from 0 and 1.
We solve all the smaller sub-problems that will be needed to solve the
larger sub-problems then move to the larger problems using smaller
sub-problems.
We use for loop to iterate over the sub-problems.
The bottom-up approach is also known as the tabulation or table filling
method.
171
Let's understand through an example.
Suppose we have an array that has 0 and 1 values at a[0] and a[1] positions,
respectively shown as below:
Since the bottom-up approach starts from the lower values, so the values at a[0]
and a[1] are added to find the value of a[2] shown as below:
The value of a[3] will be calculated by adding a[1] and a[2], and it becomes 2
shown as below:
The value of a[4] will be calculated by adding a[2] and a[3], and it becomes 3
shown as below:
172
The value of a[5] will be calculated by adding the values of a[4] and a[3], and it
becomes 5 shown as below:
The code for implementing the Fibonacci series using the bottom-up approach is
given below:
int fib(int n)
{
int A[n+1];
A[0] = 0, A[1] = 1;
for(int i=2; i<=n; i++)
{
A[i] = A[i-1] + A[i-2];
}
return A[n];
}
In the above code, base cases are 0 and 1 and then we have used for loop to find
other values of Fibonacci series.
Initially, the first two values, i.e., 0 and 1 can be represented as:
When i=2 then the values 0 and 1 are added shown as below:
173
When i=3 then the values 1 and 1 are added shown as below:
When i=4 then the values 2 and 1 are added shown as below:
When i=5, then the values 3 and 2 are added shown as below:
174
In the above case, we are starting from the bottom and reaching the top.
3.3 Divide-and-Conquer Method vs Dynamic Programming

Divide and Conquer:
1. It involves three steps at each level of recursion:
Divide the problem into a number of subproblems.
Conquer the subproblems by solving them recursively.
Combine the solutions to the subproblems into the solution for the original problem.

Dynamic Programming:
1. It involves a sequence of four steps:
Characterize the structure of optimal solutions.
Recursively define the values of optimal solutions.
Compute the value of optimal solutions in a bottom-up manner.
Construct an optimal solution from computed information.
175
3. Divide and Conquer does more work on the subproblems and hence has more
time consumption, whereas Dynamic Programming solves each subproblem only
once and then stores the result in a table.
6. Examples of Divide and Conquer: Merge Sort, Binary Search, etc. Example of
Dynamic Programming: Matrix Chain Multiplication.
3.4 Techniques for Solving Dynamic Programming Problems

To solve any dynamic programming problem, we can use the FAST method.
'F' stands for Find the recursive solution: whenever we find any DP
problem, we first have to find a recursive solution.
'A' stands for Analyse the solution: once we find the recursive
solution, we have to analyse it and look for overlapping subproblems.
'S' stands for Save the results for future use: once we find the
overlapping subproblems, we store the solutions of these sub-problems. To
store the solutions, we use an n-dimensional array for caching purposes.
'T' stands for Tweak the solution: finally, we remove the recursion and make
the solution iterative.
The first three steps ('F', 'A' and 'S') give the top-down approach, which is not
purely iterative because it still uses the recursive technique; the final step turns
it into the bottom-up approach.
176
Above are the four steps to solve a complex problem.
Problem Statement: Write an efficient program to find the nth Fibonacci number?
0, 1, 1, 2, 3, 5, 8, 13, 21,...
fib(n)
{
if(n<2)
return n;
return fib(n-1) + fib(n-2);
}
The above recursive solution solves the problem, but its time complexity is O(2^n).
So dynamic programming is used to reduce the time complexity from exponential
time to linear time.
As we can observe in the above figure, fib(2) is calculated two times while
fib(1) is calculated three times, so the overlapping subproblem occurs here. In this
step, we have analysed the solution.
177
Third step is to save the result.
The process of saving the result is known as memoization. In this step, we
follow the same recursive approach, but with a small difference: we use a cache
to store the solutions so that they can be re-used whenever required. Note that the
cache must be created once, outside the recursive function (e.g. int[] cache = new
int[n+1] before the first call); creating it inside the function would discard the
stored values on every call.

fib(n)
{
if(n<2)
return n;
if(cache[n]!= 0)
return cache[n];
return cache[n] = fib(n-1) + fib(n-2);
}
In the above code, we have used a cache array of size n+1. If cache[n] is not equal
to zero, we return the result from the cache; otherwise we calculate the value,
store it in the cache and return it. The technique used here is the top-down
approach, as it follows the recursive approach. We always look in the cache first,
so the cache is populated on demand. Suppose we want to calculate fib(4): first we
look into the cache, and if the value is not there, the value is calculated and stored
in the cache.
178
Fourth step is to Tweak the solution
In this step, we will remove the recursion completely and make it an iterative
approach. So, this technique is known as a bottom-up approach.
fib(n)
{
int[] cache = new int[n+1];
// base cases
cache[0] = 0;
cache[1] = 1;
for(int i=2; i<=n; i++)
{
cache[i] = cache[i-1] + cache[i-2];
}
return cache[n];
}
In the above code, we have followed the bottom-up approach. We have declared
a cache array of size n+1. The base cases are cache[0] and cache[1] with their
values 0 and 1 respectively. In the above code, we have removed the recursion
completely and used an iterative approach. We have defined a for loop in
which we populate the cache with the values from index i=2 to n, and from
the cache we return the result. Suppose we want to calculate f(4): first we
calculate f(2), then f(3), and finally we calculate the
179
value of f(4). Here we are going from down to up so this approach is known as a
bottom-up approach.
As we can observe in the above figure, we are populating the cache from
bottom to top, so this is known as the bottom-up approach. This approach is much
more efficient than the previous one as it does not use recursion, but both
approaches have the same time and space complexity, i.e., O(n).
In this case, we have used the FAST method to obtain the solution. The
above is the best solution we have obtained so far, but it is still not purely
optimal.
Efficient solution:
fib(n)
{
int first=0, second=1, sum=0;
if(n<2)
{
return n;
}
for(int i =2; i<=n; i++)
{
sum = first + second;
first = second;
180
second = sum;
}
return sum;
}
The above solution is the most efficient, as it does not use a cache array; it keeps
only the last two values and therefore needs only constant extra space.
The following are some of the top problems that can easily be solved using Dynamic
programming:
a. Longest Common Subsequence.
b. Shortest Common Supersequence.
c. Longest Increasing Subsequence problem.
d. The Levenshtein distance (Edit distance) problem.
e. Matrix Chain Multiplication.
f. 0–1 Knapsack problem.
g. Partition problem.
h. Rod Cutting.
Self-Assessment Exercises
2. Four matrices M1, M2, M3 and M4 of dimensions pxq, qxr, rxs and sxt
respectively can be multiplied in several ways, with different numbers of
total scalar multiplications. For example, when multiplied as ((M1 X M2)
X (M3 X M4)), the total number of multiplications is pqr + rst + prt. When
multiplied as (((M1 X M2) X M3) X M4), the total number of scalar
multiplications is pqr + prs + pst. If p = 10, q = 100, r = 20, s = 5 and t =
80, then the number of scalar multiplications needed is?
181
5. What happens when a top-down approach of dynamic programming is
applied to a problem?
4.0 Conclusion
5.0 Summary
182
7.0 Further Reading and Other Resources
183
Module 3: Other Algorithm Techniques
Unit 3: Computational Complexity
Page
1.0 Introduction 185
2.0 Objectives 185
3.0 Computational Complexity Theory 185
3.0.1 Notations Used 186
3.1 Deterministic Algorithms 187
3.1.1 Facts about Deterministic Algorithms 188
3.2 Non Deterministic Algorithms 188
3.2.1 What makes an Algorithm Non-Deterministic? 188
3.2.2 Facts about Non-Deterministic Algorithms 189
3.2.3 Deterministic versus Non-Deterministic Algorithms 190
3.3 NP Problems 190
3.3.1 Definition of P Problems 191
3.4 Decision-Based Problems 191
3.4.1 NP-Hard Problems 191
3.4.2 NP-Complete Problems 192
3.4.3 Representation of NP Classes 192
3.5 Tractable and Intractable Problems 193
3.5.1 Tractable Problems 193
3.5.2 Intractable Problems 193
3.5.3 Is P = NP? 193
4.0 Conclusion 194
5.0 Summary 194
6.0 Tutor Marked Assignment 194
7.0 Further Reading and Other Resources 195
184
1.0 Introduction
In general, the amount of resources (or cost) that an algorithm requires in order
to return the expected result is called computational complexity or just
complexity. The complexity of an algorithm can be measured in terms of time
complexity and/or space complexity.
Computational complexity theory focuses on classifying computational
problems according to their resource usage, and relating these classes to each
other. A computational problem is a task solved by a computer. A computation
problem is solvable by mechanical application of mathematical steps, such as
an algorithm.
A problem is regarded as inherently difficult if its solution requires significant
resources, whatever the algorithm used.
We shall be looking at the famous P (Polynomial Time) and NP (Nondeterministic
Polynomial Time) classes, as well as NP-complete problems.
2.0 Objectives
By the end of this unit, you should be able to:
Know the meaning and focus of Computational Complexity theory
Identify the different cases of P and NP problems
Differentiate between Tractable and Intractable problems
Know what we mean by Deterministic and Non-Deterministic problems
Understand the differences between Deterministic and Non
Deterministic algorithms
3.0.1 Notations Used

Big O notation, written O(g(N)), means that the run-time function is bounded
above by the function g(N).
Big Omega notation, written Ω(g(N)), means that the run-time function is
bounded below by the function g(N). For example, as explained a moment ago,
N log(N) is a lower bound for algorithms that sort by using comparisons, so those
algorithms are Ω(N log N).
Big Theta notation, written ϴ(g(N)) , means that the run-time function is
bounded both above and below by the function g(N) . For example, the mergesort
algorithm’s run time is bounded above by O(N log N), and the run time of any
algorithm that sorts by using comparisons is bounded below by Ω(N log N), so
mergesort has performance ϴ(N log N).
In summary,
Big O notation gives an upper bound,
Big Omega gives a lower bound, and
Big Theta gives an upper and lower bound.
186
3.1 Deterministic Algorithms

A Deterministic algorithm is an algorithm which, given a particular input, will
always produce the same output, with the underlying machine always passing
through the same sequence of states.
In other words, Deterministic algorithm will always come up with the same result
given the same inputs.
Deterministic algorithms are by far the most studied and familiar kind of
algorithm as well as one of the most practical, since they can be run on real
machines efficiently.
187
iii. On the basis of execution and outcome, Deterministic algorithms are also
classified as reliable algorithms, since for a particular input the machine will
always give the same output.
iv. In the execution of a Deterministic algorithm, the target machine executes the
same instructions and yields the same outcome, which is not dependent on the
way or process in which the instructions get executed.
v. As the outcome is known and is consistent across different executions, a
deterministic algorithm takes polynomial time for its execution.
188
3.2.1 What makes an Algorithm Non-Deterministic?

An algorithm behaves non-deterministically in the following cases:
i. If it uses external state other than the input, such as user input, a global
variable, a hardware timer value, a random value, or stored disk data.
ii. If it operates in a way that is timing-sensitive, for example if it has multiple
processors writing to the same data at the same time. In this case, the
precise order in which each processor writes its data will affect the result.
iii. If a hardware error causes its state to change in an unexpected way.
3.2.3 Deterministic versus Non-Deterministic Algorithms

The following table gives some vital differences between a Deterministic and a
Non-Deterministic algorithm.
189
BASIS OF COMPARISON | DETERMINISTIC ALGORITHM | NON-DETERMINISTIC ALGORITHM
190
3.3 NP Problems

NP (Nondeterministic Polynomial) is the class of decision problems that can be
solved by a non-deterministic machine in polynomial time. The NP class contains
the P class as a subset. NP problems are very hard to solve.
Note: the term "NP" does not mean "Not Polynomial". Originally, the term
meant "Non-Deterministic Polynomial": it means that, for a single input, a number
of different outputs may be produced.
3.4 Decision-Based Problems

A problem is called a decision problem if its output is a simple "yes" or "no" (or
you may need to represent it as true/false, 0/1, accept/reject). We will phrase
many optimization problems as decision problems.
For example: given a graph G = (V, E), does there exist a Hamiltonian cycle in G?
3.4.1 NP-Hard Problems

A problem is NP-hard if an algorithm for solving it can be translated into one for
solving any NP (nondeterministic polynomial time) problem. NP-hard
therefore means "at least as hard as any NP-problem," although it might, in fact,
be harder.
191
A problem must satisfy the following points to be classified as NP-hard:
1. If we can solve this problem in polynomial time, then we can solve all NP
problems in polynomial time.
2. Every problem in NP can be converted (reduced) to this problem within
polynomial time.
192
3.5 Tractable and Intractable Problems

3.5.1 Tractable Problems

A problem is tractable if it can be solved by a polynomial-time algorithm.

3.5.2 Intractable Problems

A problem is intractable if no polynomial-time algorithm can solve it. Any
algorithm for such a problem does not provide an efficient solution and is, therefore,
not feasible for computation with anything more than the smallest input.
Examples:
Towers of Hanoi: we can prove that any algorithm that solves this problem must
have a worst-case running time that is at least 2^n - 1.
Listing all permutations (all possible orderings) of n numbers.
3.5.3 Is P = NP?
The P versus NP problem is a major unsolved problem in computer science. It
asks whether every problem whose solution can be quickly verified can also be
solved quickly.
193
An answer to the P versus NP question would determine whether problems that
can be verified in polynomial time can also be solved in polynomial time.
If it turns out that P ≠ NP, which is widely believed, it would mean that there are
problems in NP that are harder to compute than to verify: they could not be solved
in polynomial time, but the answer could be verified in polynomial time.
4.0 Conclusion
Computational complexity theory focuses on classifying computational problems
according to their resource usage, and relating these classes to each other. A
computational problem is a task solved by a computer. A computation problem
is solvable by mechanical application of mathematical steps, such as an
algorithm. Several areas considered in this Unit were P and NP problems,
Deterministic versus Non Deterministic problems as well as Tractable versus
Intractable problems.
5.0 Summary
In this Unit we looked at the meaning and nature of Computational Complexity
theory and also examined the notion of Deterministic as well as Non
Deterministic algorithms. Several examples of the algorithms were listed and we
also treated P, NP, NP-hard and NP-complete problems while also mentioning
Tractable and Intractable problems. On a final note, we also looked at the
unsolvable problem of P = NP.
194
6.0 Tutor Marked Assignment

2. What, on the other hand, would happen if P ≠ NP?
3. What makes exponential time and algorithms with factorials more difficult
to solve?
4. How many stages of procedure does a Non deterministic algorithm consist
of?
195
Module 3: Other Algorithm Techniques
Unit 4: Approximate Algorithms I
1.0 Introduction
2.0 Objectives
3.0 Approximate Algorithms
3.1 Vertex Cover
3.2 Traveling Salesman Problem
3.3 Spanning Trees
3.3.1 What is a Minimum Spanning Tree?
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and Other Resources
1.0 Introduction
In computer science and operations research, approximation algorithms
are efficient algorithms that find approximate solutions to optimization problems
(in particular NP-hard problems) with provable guarantees on the distance of the
returned solution to the optimal one. Approximation algorithms are typically
used when finding an optimal solution is intractable, but can also be used in some
situations where a near-optimal solution can be found quickly and an exact
solution is not needed.
2.0 Objectives
Learn more about approximate algorithms and their approximation ratios, and
about the Vertex Cover and Traveling Salesman problems.
3.0 Approximate Algorithms
Suppose we work on an optimization problem where every solution carries a cost.
An Approximate Algorithm returns a legal solution, but the cost of that legal
solution may not be optimal.
We say the approximate algorithm has an approximation ratio P(n) for an input
size n if, for any input of size n,
max (C / C*, C* / C) ≤ P(n),
where C is the cost of the solution produced by the algorithm and C* is the
cost of an optimal solution.
Intuitively, the approximation ratio measures how far the approximate solution
is from the optimal solution. A large (small) approximation ratio means the
solution is much worse than (more or less the same as) an optimal solution.
Observe that P(n) is always ≥ 1. If the ratio does not depend on n, we may
simply write P. Therefore, a 1-approximation algorithm gives an optimal
solution. Some problems have polynomial-time approximation algorithms with
small constant approximation ratios, while others have best-known
polynomial-time approximation algorithms whose ratios grow with n.
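As a small illustrative calculation (the costs here are hypothetical, not from
the course text): if an approximate algorithm returns a tour of cost C = 120
while an optimal tour costs C* = 100, then the ratio is
max(120/100, 100/120) = 1.2, so the algorithm is a 1.2-approximation on that
input and the returned solution is within 20% of optimal.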
3.1 Vertex Cover
The decision vertex-cover problem was proven NP-complete. Now, we want to
solve the optimization version of the vertex cover problem, i.e., we want to
find a minimum-size vertex cover of a given graph. We call such a vertex cover
an optimal vertex cover C*.
The idea is to take an edge (u, v) one at a time, put both vertices into C,
and remove all the edges incident to u or v. We carry on until all edges have
been removed. C is then a vertex cover. But how good is C? (A sketch of this
procedure is given below.)
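The following Python sketch implements the greedy idea just described; the
edge-set representation and function name are illustrative assumptions, not
from the course text. It is a classical 2-approximation: the cover it returns
is at most twice the size of an optimal cover.

def approx_vertex_cover(edges):
    # Repeatedly pick an uncovered edge (u, v), add both endpoints to the
    # cover, and discard every edge touching u or v.
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = remaining.pop()               # take an arbitrary remaining edge
        cover.update((u, v))                 # put both endpoints into the cover
        remaining = {(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)}
    return cover

print(approx_vertex_cover({("a", "b"), ("b", "c"), ("c", "d")}))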
For the example graph (figure omitted), the algorithm returns
VC = {b, c, d, e, f, g}.
3.2 Traveling Salesman Problem
In the Traveling Salesman Problem (TSP), a salesman must visit n cities. We
can say that the salesman wishes to make a tour, or Hamiltonian cycle,
visiting each city exactly once and finishing at the city he starts from.
There is a non-negative cost c(i, j) to travel from city i to city j. The goal
is to find a tour of minimum cost. We assume that every two cities are
connected.
We can model the cities as a complete graph of n vertices, where each vertex
represents a city.
If we assume the cost function c satisfies the triangle inequality, then we can use
the following approximate algorithm.
Triangle inequality: c(u, w) ≤ c(u, v) + c(v, w) for all vertices u, v, w ∈ V.
Approx-TSP (G = (V, E))
{
1. Compute a MST T of G;
2. Select any vertex r as the root of the tree;
3. Let L be the list of vertices visited in a preorder tree walk of T;
4. Return the Hamiltonian cycle H that visits the vertices in the order L;
}
Intuitively, Approx-TSP first makes a full walk of MST T, which visits each
edge exactly two times. To create a Hamiltonian cycle from the full walk, it
bypasses some vertices, which corresponds to making a shortcut (see the sketch
below).
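Below is a minimal runnable sketch of Approx-TSP, assuming the graph is given
as a symmetric cost matrix satisfying the triangle inequality; the function
name and the simple O(n²) Prim scan used to build the MST are illustrative
choices, not from the course text.

def approx_tsp(cost):
    # cost: symmetric n x n matrix of non-negative travel costs.
    n = len(cost)
    # Step 1: compute an MST T of G, rooted at r = 0, with a simple Prim scan.
    in_tree = [False] * n
    parent = [0] * n
    key = [float("inf")] * n
    key[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: key[v])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)    # record the tree structure
        for v in range(n):
            if not in_tree[v] and cost[u][v] < key[v]:
                key[v], parent[v] = cost[u][v], u
    # Steps 2-4: a preorder walk of T gives the Hamiltonian cycle H.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))  # keep left-to-right preorder
    return tour + [0]                        # return to the start city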
3.3 Spanning Trees
Before learning about the minimum spanning tree, we should first understand
the spanning tree.
To understand the concept of spanning tree, consider the graph below:
The above graph can be represented as G(V, E), where V is the set of vertices
and E is the set of edges. A spanning tree of the above graph would be
represented as G′(V′, E′). Here, V′ = V means that the number of vertices in
the spanning tree is the same as the number of vertices in the graph, but the
number of edges is different. The edge set of the spanning tree is a subset of
the edge set of the original graph. Therefore, the edges can be written as:
E′ ⊆ E
|E′| = |V| − 1
The number of vertices in the spanning tree is the same as the number of
vertices in the original graph:
V′ = V
The number of edges in the spanning tree equals the number of vertices
minus 1:
|E′| = |V| − 1
(A small checking sketch follows.)
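The following function (names are illustrative, not from the course text)
checks the two properties above: a spanning tree of a vertex set must contain
exactly |V| − 1 edges and must join them without forming a cycle.

def is_spanning_tree(vertices, edges):
    if len(edges) != len(vertices) - 1:       # must have |V| - 1 edges
        return False
    parent = {v: v for v in vertices}
    def find(x):                              # union-find representative
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                      # edge would form a cycle
        parent[ru] = rv                       # merge the two trees
    return True   # |V| - 1 edges and acyclic, hence a spanning tree

print(is_spanning_tree({1, 2, 3, 4, 5}, [(1, 2), (2, 3), (3, 4), (4, 5)]))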
The above graph contains 5 vertices. As we know, the vertices in the spanning
tree are the same as in the graph; therefore, |V′| equals 5. The number of
edges in the spanning tree equals (5 − 1), i.e., 4. The following are the
possible spanning trees:
3.3.1 What is a Minimum Spanning Tree?
The minimum spanning tree is a spanning tree whose sum of edge weights is
minimum. Consider the graph below, which contains edge weights:
The following are the spanning trees that we can make from the above graph.
i. The first spanning tree is obtained by removing the edge between
vertices 1 and 5, as shown below:
The sum of the edges of the above tree is (1 + 4 + 5 + 2) = 12.
ii. The second spanning tree is obtained by removing the edge between
vertices 1 and 2, as shown below:
The sum of the edges of the above tree is (3 + 2 + 5 + 4) = 14.
iii. The third spanning tree is obtained by removing the edge between
vertices 2 and 3, as shown below:
The sum of the edges of the above tree is (1 + 3 + 2 + 5) = 11.
iv. The fourth spanning tree is obtained by removing the edge between
vertices 3 and 4, as shown below:
The sum of the edges of the above tree is (1 + 3 + 2 + 4) = 10. The cost 10
is minimum, so this is a minimum spanning tree.
The number of spanning trees that can be made from the above complete graph
equals n^(n−2) = 4^(4−2) = 16.
The maximum number of edges that can be removed to construct a spanning tree
equals e − n + 1 = 6 − 4 + 1 = 3.
Minimum spanning trees have practical applications; for example, consider
houses that must be connected by utilities such as:
Electric power
Water
Telephone lines
Sewage lines
To reduce cost, you can connect the houses with minimum cost spanning trees.
Self-Assessment Exercises
4.0 Conclusion
Approximate algorithms return solutions whose cost is provably close to
optimal for optimization problems, such as Vertex Cover and the Traveling
Salesman Problem, for which finding exact solutions is intractable.
5.0 Summary
In this Unit we defined approximate algorithms and the approximation ratio,
examined approximation algorithms for the Vertex Cover and Traveling Salesman
problems, and introduced spanning trees and minimum spanning trees.
6.0 Tutor Marked Assignment
1. How does the practical Traveling Salesman Problem differ from the
classical Traveling Salesman Problem?
2. Consider a complete undirected graph with vertex set {0, 1, 2, 3, 4}. Entry
Wij in the matrix W below is the weight of the edge {i, j}. What is the
minimum possible weight of a spanning tree T in this graph such that vertex
0 is a leaf node in the tree T?
4. In the graph given in question (2) above, what is the minimum possible
weight of a path P from vertex 1 to vertex 2 in this graph such that P
contains at most 3 edges?
5. Consider a weighted complete graph G on the vertex set {v1, v2, ..., vn}
such that the weight of the edge (vi, vj) is 2|i − j|. The weight of a
minimum spanning tree of G is?
7.0 Further Reading and Other Resources
Module 3: Other Algorithm Techniques
Unit 5: Approximate Algorithms II
1.0 Introduction
2.0 Objectives
3.0 Methods of Minimum Spanning Tree (MST)
3.1 Kruskal's Algorithm
3.1.1 Steps for Finding MST using Kruskal's Algorithm
3.2 Prim's Algorithm
3.2.1 Steps for Finding MST using Prim's Algorithm
1.0 Introduction
An approximation algorithm is a way of dealing with NP-completeness for an
optimization problem. The goal of the approximation algorithm is to come as
close as possible to the optimal solution in polynomial time.
2.0 Objectives
At the end of this unit, you should be able to find minimum spanning trees
using:
1. Kruskal's Algorithm
2. Prim's Algorithm
3.0 Methods of Minimum Spanning Tree (MST)
3.1 Kruskal's Algorithm
3.1.1 Steps for Finding MST using Kruskal's Algorithm
1. Arrange the edges of G in order of increasing weight.
2. Starting with only the vertices of G and proceeding sequentially, add each
edge which does not result in a cycle, until (n − 1) edges are used.
3. EXIT.
Analysis:
Where E is the number of edges in the graph and V is the number of vertices,
Kruskal's Algorithm can be shown to run in O(E log E) time, or simply,
O(E log V) time, all with simple data structures. These running times are
equivalent because E ≤ V², so log E = O(log V). (A runnable sketch of the
steps above is given below.)
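The following Python sketch implements the steps above, using a union-find
(disjoint-set) structure to detect cycles; the vertex numbering and function
names are illustrative assumptions, not from the course text.

def kruskal(n, edges):
    # n: number of vertices (numbered 0 .. n-1);
    # edges: list of (weight, u, v) tuples.
    parent = list(range(n))
    def find(x):                           # set representative, with
        while parent[x] != x:              # path halving for speed
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst, total = [], 0
    for w, u, v in sorted(edges):          # step 1: edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # step 2: skip cycle-forming edges
            parent[ru] = rv                # merge the two trees
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1:          # step 3: stop at n - 1 edges
                break
    return mst, total

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3)]))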
Example:
Find the Minimum Spanning Tree of the following graph using Kruskal's
algorithm.
Solution:
First we initialize the set A to the empty set and create |V| trees, one
containing each vertex, with the MAKE-SET procedure. Then sort the edges in E
into order by non-decreasing weight.
Now, check for each edge (u, v) whether the endpoints u and v belong to the
same tree. If they do, then the edge (u, v) cannot be added (it would create a
cycle). Otherwise, the two vertices belong to different trees, the edge (u, v)
is added to A, and the vertices in the two trees are merged by the UNION
procedure.
Step 2: then edge (g, f) is considered.
Step 3: then edges (a, b) and (i, g) are considered, and the forest becomes
Step 4: Now, edge (h, i). Both vertices h and i are in the same set; thus it
creates a cycle, so this edge is discarded.
Then edges (c, d), (b, c), (a, h), (d, e) and (e, f) are considered, and the
forest becomes
Step 5: For edge (e, f), both endpoints e and f exist in the same tree, so
this edge is discarded. Then edge (b, h) also creates a cycle.
Step 6: After that comes edge (d, f), and the final spanning tree is shown in
dark lines.
Step 7: This is the required Minimum Spanning Tree because it contains all
the 9 vertices and (9 − 1) = 8 edges.
3.2 Prim's Algorithm
Prim's algorithm builds the MST one vertex at a time. At every step, it
considers all the edges that connect the tree to vertices not yet in the tree
and picks the minimum-weight such edge. After picking the edge, it moves the
other endpoint of the edge into the set containing the MST.
MST-PRIM (G, w, r)
  for each u ∈ V[G]
      do key[u] ← ∞
         π[u] ← NIL
  key[r] ← 0
  Q ← V[G]
  while Q ≠ ∅
      do u ← EXTRACT-MIN(Q)
         for each v ∈ Adj[u]
             do if v ∈ Q and w(u, v) < key[v]
                  then π[v] ← u
                       key[v] ← w(u, v)
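The following Python sketch mirrors MST-PRIM, using a binary heap (Python's
heapq) for EXTRACT-MIN. Since heapq has no decrease-key operation, the sketch
re-inserts vertices and skips stale entries, a common idiomatic substitute;
the adjacency-list representation is an assumption for illustration, and a
connected graph is assumed.

import heapq

def prim(adj, r=0):
    # adj: list where adj[u] is a list of (neighbor, weight) pairs; r: root.
    n = len(adj)
    key = [float("inf")] * n
    parent = [None] * n
    in_tree = [False] * n
    key[r] = 0
    pq = [(0, r)]                          # priority queue ordered by key[v]
    while pq:
        k, u = heapq.heappop(pq)           # EXTRACT-MIN
        if in_tree[u]:
            continue                       # stale entry; skip it
        in_tree[u] = True
        for v, w in adj[u]:
            if not in_tree[v] and w < key[v]:
                key[v] = w                 # "decrease-key" by lazy re-insert
                parent[v] = u              # corresponds to π[v] ← u
                heapq.heappush(pq, (w, v))
    return parent, sum(key)                # π array and total MST weight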
Example:
Generate minimum cost spanning tree for the following graph using Prim's
algorithm.
Solution:
In Prim's algorithm, first we initialize the priority queue Q to contain all
the vertices, and the key of each vertex to ∞, except for the root, whose key
is set to 0. Suppose vertex 0 is the root, i.e., r. By the EXTRACT-MIN(Q)
procedure, now u = r and Adj[u] = {5, 1}.
Remove u from the set Q and add it to the set V − Q of vertices in the tree.
Now, update the key and π fields of every vertex v adjacent to u but not in
the tree.
Now remove 4 because key[4] = 25, which is minimum, so u = 4.
π[3] = 4, π[6] = 4
Adj[3] = {4, 6, 2}
Vertex 4 is already in the tree (4 ∉ Q), so it is skipped.
For vertex 2: key[2] = ∞ and w(3, 2) = 12, so w(3, 2) < key[2] and key[2]
becomes 12.
For vertex 6: key[6] = 24 and w(3, 6) = 18, so w(3, 6) < key[6] and key[6] =
24 now becomes key[6] = 18.
Now in Q, key[2] = 12, key[6] = 18, key[1] = 28, and the parent of both 2 and
6 is 3:
π[2] = 3, π[6] = 3
u = EXTRACT-MIN(Q) gives u = 2, because key[2] < key[6] (12 < 18).
Now the current vertex is u = 2.
Adj[2] = {3, 1}
Vertex 3 is already in the tree, so it is skipped.
Taking 1: key[1] = 28 and w(2, 1) = 16, so w(2, 1) < key[1], key[1] becomes 16
and π[1] = 2.
Now EXTRACT-MIN(Q) removes 1 because key[1] = 16 is minimum. Vertex 6 is
adjacent to 1 with w(1, 6) = 14 < key[6] = 18, so key[6] becomes 14 and
π[6] = 1.
Now all the vertices have been spanned. Using the table above we get the
Minimum Spanning Tree:
0 → 5 → 4 → 3 → 2 → 1 → 6
[because π[5] = 0, π[4] = 5, π[3] = 4, π[2] = 3, π[1] = 2, π[6] = 1]
Thus the final spanning tree is as shown, with
Total Cost = 10 + 25 + 22 + 12 + 16 + 14 = 99.
Self-Assessment Exercises
3. Let G be a connected undirected graph of 100 vertices and 300 edges. The
weight of a minimum spanning tree of G is 500. When the weight of each edge
of G is increased by five, the weight of a minimum spanning tree becomes?
4.0 Conclusion
Many problems that are NP-hard are also non-approximable assuming P≠NP.
5.0 Summary
1. Suppose the weight of every edge of a connected undirected graph G is
increased by the same positive constant. Which of the following statements
is/are TRUE?
P: Minimum spanning tree of G does not change
Q: Shortest path between any pair of vertices does not change
2. G = (V, E) is an undirected simple graph in which each edge has a distinct
weight, and e is a particular edge of G. Which of the following statements
about the minimum spanning trees (MSTs) of G is/are TRUE?