Algorithms Unit 1

The document provides an introduction to algorithm analysis, covering time and space complexity, asymptotic notations, and various searching and sorting algorithms. It includes definitions, properties of algorithms, and the necessity for analyzing algorithms to determine efficiency. Additionally, it discusses best, worst, and average case scenarios in algorithm performance, along with examples and important questions for further understanding.


UNIT I - INTRODUCTION

Algorithm analysis: Time and space complexity - Asymptotic Notations


and its properties Best case, Worst case and average case analysis –
Recurrence relation: substitution method - Lower bounds – searching:
linear search, binary search and Interpolation Search, Pattern search:
The naïve string-matching algorithm - Rabin-Karp algorithm - Knuth-
Morris-Pratt algorithm. Sorting: Insertion sort – heap sort.

Important Questions:
Part A Questions
 Define time complexity and space complexity. Write an algorithm for
adding n natural numbers and find the space required by that
algorithm
 List the steps to write an Algorithm
 Define Big ‘Oh’ notation.
 Differentiate between Best, average and worst case efficiency.
 Define recurrence relation.
 How do you measure efficiency of the algorithm?
 Write an algorithm to find the area and circumference of a circle.
 How to measure algorithms running time?
 List the desirable properties of algorithms
 Write the recursive Fibonacci algorithm and its recurrence relation.
 Write an algorithm to compute the GCD of two numbers.

Part B Questions
 Discuss the concepts of asymptotic notations and its properties
 What is divide and conquer strategy? Explain binary search problem in detail.
 Solve the following using Brute-Force algorithm:
 Find whether the given string follows the specified pattern and
return 0 or 1 accordingly
Examples:
Pattern: "abba", input "redblueredblue" should return 1
Pattern: "aaaa", input "asdasdasdasd" should return 1
Pattern: "aabb", input "xyzabcxyzabc" should return 0



1. Algorithm
1.1 Definition
The sequence of steps to be performed in order to solve a problem by the computer is
known as an algorithm.
Programs = Algorithms + Data
Another way to describe an algorithm is as a sequence of unambiguous instructions. It
starts from an initial input and a set of instructions that describe a computation,
proceeds through a finite number of well-defined successive steps, and produces an
output and a final ending state.

Figure 1.1 Definition of the Algorithm

The first algorithms were developed by the Persian scientist, astronomer and
mathematician Abdullah Muhammad bin Musa al-Khwarizmi in the 9th century. He is
often cited as "The father of Algebra", and the term "Algorithm" is derived from
his name.

1.1.1 Examples of Algorithms
Example 1:
Problem statement: Calling a friend on the telephone

Input: The telephone number of your friend.
Output: Talk to your friend.

Algorithm Steps:
1. Pick up the phone and listen for a dial tone
2. Press each digit of the phone number on the phone
3. If busy, hang up phone, wait 2 minutes, jump to step 2
4. If no one answers, leave a message then hang up
5. If no answering machine, hang up and wait 2 hours, then jump to step 2
6. Talk to friend
7. Hang up phone

Example 2:
Problem statement: Find the largest number in the given list of numbers

Input: A list of positive integer numbers.
Output: Largest number.

Algorithm Steps:
1. Define a variable 'max' and initialize with '0'.
2. Compare first number (say 'x') in the list 'L' with 'max'.

3. If 'x' is larger than 'max', set 'max' to 'x'.

4. Repeat step 2 and step 3 for all numbers in the list 'L'.
5. Display the value of 'max' as a result.

1.1.2 Properties of an Algorithm


 Finiteness: The algorithm must always terminate after a finite number of steps.
 Definiteness: Each instruction must be clear, well-defined and precise. There
should not be any ambiguity.
 Effectiveness: Each Instruction must be simple and be carried out in a finite amount of
time.
 Input: An algorithm has zero or more inputs, taken from a specified set of objects.
 Output: An algorithm has one or more outputs, which have a specified relation to the
inputs.
 Feasibility: It must be possible to perform each instruction.
 Generality: The algorithm must be able to work for a set of inputs rather than a single
input.
 Efficiency: The term efficiency is measured in terms of time and space required
by an algorithm to implement. Thus, an algorithm must ensure that it takes
little time and less memory space.
 Independent: An algorithm must be language independent. It means that it
should mainly focus on the input and the procedure required to get the
output instead of depending upon the language.

1.1.3 Necessity to analyse the algorithm


If we want to go from city "A" to city "B", there can be many ways of doing
this. We can go by flight, by bus, by train and also by bicycle. Depending on the
availability and convenience, we choose the one which suits us. Similarly, in
computer science, there are multiple algorithms to solve a problem. When we have
more than one algorithm to solve a problem, we need to select the best one.
Performance analysis helps us to select the best algorithm from multiple
algorithms to solve a problem.
Performance of an algorithm depends on parameters such as:
 Whether that algorithm provides the exact solution for the problem statement
 Whether it is easy to understand
 Whether it is easy to implement
 How much space (memory) required to solve the problem
 How much time required to solve the problem

1.2 Algorithm analysis


Algorithm analysis is an important part of computational complexity theory,
which provides theoretical estimation for the required resources of an algorithm to
solve a specific computational problem. Most algorithms are designed to work with
inputs of arbitrary length. Algorithm analysis is the process of calculating space
and time required by that algorithm. The term "analysis of algorithms" was coined
by Donald Knuth.

Algorithm analysis is performed by using the following measures -


 Space Complexity: Space required to complete the task. It includes program
space and data space
 Time Complexity: Time required to complete the task.
1.2.1 Space complexity
Space complexity is an amount of memory used by the algorithm (including
the input values of the algorithm), to execute it completely and produce the result.
We know that to execute an algorithm it must be loaded in the main memory. The
memory can be used in different forms:
 Variables (This includes the constant values and temporary values)
 Program Instruction
 Execution
Space complexity includes both Auxiliary space and space used by input. Auxiliary
Space is the extra space or temporary space used by an algorithm.
Memory Usage during program execution
 Instruction Space  used to save compiled instruction in the memory.
 Environmental Stack  used for storing the addresses while a module
calls another module or functions during execution.
 Data space used to store data, variables, and constants which are stored
by the program and it is updated during execution.

Space complexity is a parallel concept to time complexity. If we need to create an
array of size n, this will require O(n) space. If we create a two-dimensional array
of size n * n, this will require O(n²) space.
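
A minimal sketch (the function names are illustrative, not from the notes) of the
difference between O(n) and O(n²) auxiliary space, using the sizes of the lists an
algorithm allocates:

def one_dimensional(n):
    a = [0] * n                          # allocates n cells -> O(n) space
    return a

def two_dimensional(n):
    b = [[0] * n for _ in range(n)]      # allocates n * n cells -> O(n²) space
    return b

print(len(one_dimensional(4)))                      # 4 cells
print(sum(len(row) for row in two_dimensional(4)))  # 16 cells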

1.2.2 Time complexity


Time complexity of an algorithm measures the amount of time taken by an
algorithm, i.e. the time taken to execute each statement of code in an algorithm.
Example:
 Time taken to execute 1 statement = x milliseconds.
 Time taken to execute n statements = x * n milliseconds.
 To execute n statements inside a FOR loop = x * n + y milliseconds,
where y milliseconds is the time taken to execute the FOR loop itself.
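
As a rough sketch (the helper name count_statements is mine, not from the notes), the
"x * n" behaviour can be made visible by counting how many times a loop body executes:

def count_statements(n):
    count = 0
    for _ in range(n):   # the FOR loop itself contributes the 'y' overhead
        count += 1       # one statement executed n times: the 'x * n' part
    return count

print(count_statements(10))    # 10
print(count_statements(1000))  # 1000 -> grows linearly with n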

1.2.3 Asymptotic Notation


To perform analysis of an algorithm, it is necessary to calculate the
complexity of that algorithm. Calculating the exact amount of resource required
is usually not practical, so instead of the exact amount we represent the
complexity in a general form (notation) for the analysis process.
In asymptotic notation, the complexity of an algorithm is represented
only by its most significant term, and the least significant terms are ignored
(here complexity means space complexity or time complexity).
Example,
 Algorithm 1 : 25n³ + 2n + 1
 Algorithm 2 : 1223n² + 8n + 3
The term '2n + 1' is less significant than the term '25n³', and the term '8n + 3'
is less significant than the term '1223n²'.
Definition:
Asymptotic notations are mathematical tools to represent the time and space
complexity of algorithms for asymptotic analysis.
There are mainly three asymptotic notations:
 Big-O Notation (O-notation)
 Omega Notation (Ω-notation)
 Theta Notation (Θ-notation)

1.2.3.1 Big - Oh Notation (O)


 Big - Oh notation is used to define the upper bound of an algorithm, i.e. it
indicates the maximum time required by an algorithm for all input values.
Therefore, it gives the worst-case complexity of an algorithm.
 Consider function f(n) as the time complexity of an algorithm and g(n) as the
most significant term. If f(n) <= C g(n) for all n >= n0, C > 0 and n0 >= 1,
then we can represent f(n) as O(g(n)).

f(n) = O(g(n))
Fig. Big - Oh Notation
Examples:
 100, log(2000), 10⁴ --> have O(1)
 (n/4), (2n+3), (n/100 + log(n)) --> have O(n)
 (n²+n), (2n²), (n²+log(n)) --> have O(n²)
O provides upper bounds.
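
As a small numeric illustration of the definition (the constants C = 5 and n0 = 1 are
one possible choice, not taken from the notes), f(n) = 2n + 3 from the examples above
can be checked against g(n) = n:

def f(n):
    return 2 * n + 3

def g(n):
    return n

C, n0 = 5, 1
# f(n) <= C * g(n) holds for every n >= n0, so f(n) = O(g(n)) = O(n)
print(all(f(n) <= C * g(n) for n in range(n0, 1000)))   # True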

1.2.3.2 Omega Notation (Ω-Notation)


 Omega notation represents the lower bound of an algorithm, i.e. it indicates
the minimum time required by an algorithm for all input values. Thus, it
provides the best-case complexity of an algorithm.

 Consider function f(n) as the time complexity of an algorithm and g(n) as the
most significant term. If f(n) >= C g(n) for all n >= n0, C > 0 and n0 >= 1,
then we can represent f(n) as Ω(g(n)).

f(n) = Ω(g(n))



Fig. Omega Notation
Examples:
 100, log(2000), 10⁴ --> have Ω(1)
 (n/4), (2n+3), (n/100 + log(n)) --> have Ω(n)
 (n²+n), (2n²), (n²+log(n)) --> have Ω(n²)
Ω provides lower bounds.

1.2.3.3 Theta Notation (Θ-Notation)


 Theta notation bounds a function from both above and below, so it defines an
exact (tight) asymptotic bound. Since it represents the upper and the lower
bound of the running time of an algorithm, it is commonly used for analyzing
the average-case complexity of an algorithm.

 Consider function f(n) as the time complexity of an algorithm and g(n) as the
most significant term. If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, C1 > 0,
C2 > 0 and n0 >= 1, then we can represent f(n) as Θ(g(n)).

f(n) = Θ(g(n))
Fig. Theta Notation

Examples:
 100, log(2000), 10⁴ --> have Θ(1)
 (n/4), (2n+3), (n/100 + log(n)) --> have Θ(n)
 (n²+n), (2n²), (n²+log(n)) --> have Θ(n²)
Θ provides exact bounds.

Asymptotic Notation summary:
f(n) = O(g(n))  Big - Oh Notation -- Upper Bound ====> Worst case
f(n) = Ω(g(n))  Omega Notation -- Lower Bound ====> Best case
f(n) = Θ(g(n))  Theta Notation -- Upper & Lower Bound ====> Average case

1.2.4 Worst Case, Average Case, and Best Case in Algorithm Analysis
 Best case - Function which performs the minimum number of steps on input data of
size n.
 Worst case - Function which performs the maximum number of steps on
input data of size n.



 Average case - Function which performs an average number of steps on input data
of size n.



Best Case Analysis (Very Rarely used):
 In the best case analysis, we calculate the lower bound of the execution time of
an algorithm. It is necessary to know the case which causes the execution of
the minimum number of operations.
 Example – Linear Search
In linear search, the best case occurs when x is present at the first location. Since
only one comparison is made, the best case time complexity would be Ω(1).

Worst Case Analysis (Mostly used):


 In the worst case analysis, we calculate the upper limit of the execution time of
an algorithm. It is necessary to know the case which causes the execution of the
maximum number of operations.
 Example – Linear Search
In linear search, the worst case occurs when x is NOT present in the array. The
worst case time complexity of the linear search would be O(n).

Average Case Analysis (Rarely used):

 In average case analysis, take all possible inputs and calculate the
computing time for all of the inputs. Sum all the calculated values and
divide the sum by the total number of inputs.
 Example:
In linear search, assume all cases are uniformly distributed (including the case
of x not being present in the array). After summing all the cases, divide the
sum by (n+1).
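
A minimal sketch of this averaging for linear search (the helper name is illustrative):
it sums the comparison counts of the n "found at position i" cases and the one
"not found" case, then divides by (n + 1):

def average_comparisons(n):
    total = sum(i for i in range(1, n + 1))   # found at position i costs i comparisons
    total += n                                # not present: all n elements compared
    return total / (n + 1)

print(average_comparisons(5))      # 3.33... comparisons on average
print(average_comparisons(1000))   # roughly n/2, which is O(n)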

Types of time complexities:


Big O Notation | Name         | Example(s)
O(1)           | Constant     | 1. Odd or even number checking  2. Look-up table (on average)
O(n)           | Linear       | 1. Find max element in unsorted array  2. Duplicate elements in array with Hash Map
O(n²)          | Quadratic    | 1. Duplicate elements in array  2. Bubble sort
O(log n)       | Logarithmic  | Binary searching
O(n log n)     | Linearithmic | Merge sort
O(2ⁿ)          | Exponential  | 1. Travelling salesman problem using dynamic programming  2. Fibonacci series generation

O(1) - Constant time


O(1) describes algorithms that take the same amount of time to compute regardless of
the input size. For example, if a function takes the same time to process ten
elements and 1 million items, then it is O(1).
Examples:
 Find if a number is even or odd.

 Check if an item on an array is null.
 Print the first element from a list.
 Find a value on a map.

O(n) - Linear time


Linear time complexity O(n) means that the algorithms take proportionally longer to
complete as the input grows. These algorithms imply that the program visits every
element from the input.
Examples
 Get the max/min value in an array.
 Find a given element in a collection.
 Print all the values in a list.

O(n²) - Quadratic time


A function with a quadratic time complexity has a growth rate of n². If the input is size
2, it will do four operations. If the input is size 8, it will take 64 operations, and so on.
Examples
 Check if a collection has duplicated values.
 Sorting using bubble sort, insertion sort, or selection sort.
 Find all possible ordered pairs in an array.

O(log n) - Logarithmic time


Logarithmic time complexities usually apply to algorithms that divide problems in half
every time. For example, to find a word in a book which is sorted alphabetically,
there are two ways to do it.
Method 1:
 Start on the first page of the book and go word by word until you find the
matching word.
Method 2:
 Open the book in the middle and check the first word on it.
 If the word you are looking for is alphabetically greater, then
look in the right half. Otherwise, look in the left half.
 Divide the remainder in half again, and repeat the above step until you find the
matching word.

Method 1 - go word by word - O(n)


Method 2 - split the problem in half for each iteration - O(log n)

Example
 Binary search.

O(n log n) - Linearithmic


Linearithmic time complexity is slightly slower than a linear algorithm. However, it is
still much better than a quadratic algorithm.
Examples
 Sorting algorithms like merge sort, quicksort, and others.

O(2n) - Exponential time


Exponential (base 2) running time means the calculations performed by an algorithm
double every time as the input grows.

Examples:
 Fibonacci series generation
 Travelling salesman problem using dynamic programming
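
As a rough illustration (the call counting is mine, not part of the notes), a naive
recursive Fibonacci shows the exponential growth in work as n increases:

def fib(n):
    # returns (value, number of calls made)
    if n <= 1:
        return n, 1
    v1, c1 = fib(n - 1)
    v2, c2 = fib(n - 2)
    return v1 + v2, c1 + c2 + 1

for n in (10, 20):
    value, calls = fib(n)
    print(n, value, calls)   # 10 -> 55 in 177 calls, 20 -> 6765 in 21891 calls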

2. Recurrence Relation
A recurrence relation is an equation that defines a sequence based on a rule that
gives the next term as a function of the previous term(s). It helps in finding the
subsequent term (next term) with the previous term. If we know the previous term
in a given series, then we can easily determine the next term.
Example 1:
 Recursive definition for the factorial function
n! = (n−1)! * n

Example 2:
 Recursive definition for Fibonacci sequence
Fib(n)=Fib(n−1)+Fib(n−2)

Recurrence relations are often used to model the cost of recursive functions. For
example, the number of multiplications required by a recursive version of the
factorial function for an input of size n will be zero when n=0 or n=1 (the base
cases), and it will be one plus the cost of calling fact on a value of n−1.
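
A small sketch of this idea (the tuple-returning fact below is an illustrative variant,
not the notes' code): it counts multiplications alongside the factorial value, and the
count follows the recurrence M(n) = M(n - 1) + 1 with M(0) = M(1) = 0:

def fact(n):
    if n <= 1:
        return 1, 0                    # (value, multiplications): base cases cost 0
    value, mults = fact(n - 1)
    return n * value, mults + 1        # one extra multiplication per recursive level

print(fact(5))   # (120, 4): n - 1 multiplications for n >= 1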

2.1 Expansion of the recurrence equations

Example 1:
Let us see the expansion of the following recurrence equation.
T(n)= T(n−1)+1 for n>1
T(0)= T(1)=0.
Step 1:
T(n) = 1 + T(n - 1)
Step 2:
T(n) = 1 + (1 + T(n - 2))

Step 3:
T(n) = 1 + (1 + (1 + T(n - 3)))
Step 4:
T(n) = 1 + (1 + (1 + (1 + T(n - 4))))
Step 5:
This pattern will continue till we reach a sub-problem of size 1.
T(n) = 1 + (1 + (1 + (1 + (1 + (1 + ………))))

Step 6:

Thus the closed form of T(n) = 1 + T(n - 1) can be modeled as ∑_{i=1}^{n} 1.

Example 2:
Let us see the expansion of the following recurrence equation.
T(n)= T(n−1)+ n
T(1)= 1

Step 1:
T(n) = n + T(n - 1)

Step 2:
T(n) = n +(n - 1 + T(n - 2))

Step 3:
T(n) = n + (n - 1 + (n - 2 + T(n - 3))

Step 4:
T(n) = n +(n - 1+(n - 2 +(n –3 +T(n - 4))))

Step 5:
This pattern will continue till we reach a sub-problem of size 1.
T(n) = n + (n -1 + (n - 2 + (n - 3 + (n - 4 + ……… 1))))

Step 6:

Thus the closed form of T(n) = n + T(n - 1) can be modeled as ∑_{i=1}^{n} i = n(n+1)/2.
2.2 Methods for solving Recurrence
 Substitution Method
 Iteration Method
 Recursion Tree Method
 Master Method
2.2.1 Substitution Method



In the substitution method, we have a known recurrence, and we use induction to
prove that our guess is a good bound for the recurrence's solution.
Steps
 Guess a solution through your experience.
 Use induction to prove that the guess is an upper bound solution for the
given recurrence relation.

Example:
T(n) = 1            if n = 1
     = 2T(n-1)      if n > 1

Expanding the recurrence:
T(n) = 2T(n-1)
     = 2[2T(n-2)] = 2²T(n-2)
     = 4[2T(n-3)] = 2³T(n-3)
     = 8[2T(n-4)] = 2⁴T(n-4)

Repeating the procedure for i times:
T(n) = 2ⁱ T(n-i)            ... (Eq. 1)

Put n - i = 1, i.e. i = n - 1, in (Eq. 1):
T(n) = 2ⁿ⁻¹ T(1)
     = 2ⁿ⁻¹ . 1        {T(1) = 1, given}
     = 2ⁿ⁻¹
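
A quick numeric check (not part of the original notes) that the guess 2^(n-1) really
solves the recurrence T(1) = 1, T(n) = 2T(n-1):

def T(n):
    if n == 1:
        return 1
    return 2 * T(n - 1)

for n in range(1, 8):
    assert T(n) == 2 ** (n - 1)    # closed form matches the recurrence
print("closed form verified for n = 1..7")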

3. Searching
Searching is a technique that helps to find whether the given element is
present in the set of elements. Any search is said to be successful or unsuccessful
depending upon whether the element that is being searched is found or not. Some
of the standard searching techniques are:
 Linear Search or Sequential Search
 Binary Search
 Interpolation Search

3.1 Linear Search:


It is one of the simplest and most straightforward search algorithms. In
this, you need to traverse the list and compare the current element with the
target element. If a match is found, you can stop the search; else continue.

Linear search is implemented using following steps...


Step 1: Read the search element from the user.
Step 2: Compare the search element with the first element in the array.
Step 3: If both are matched, then display "Given element found!!!" and terminate the program.
Step 4: If both are not matched, then compare the search element with the next element in the array.
Step 5: Repeat steps 3 and 4 until the search element is compared with the last element in the array.
Step 6: If the last element in the array is also not matched, then display "Element not found!!!"
and terminate the function.



/* 3.1.1 Python Program to search the given element in the list of
items using Linear Search */
Example

Given the array of elements: 59, 58, 96, 78, 23 and the element to be
searched is 96, the working of linear search is as follows:

The Element is FOUND. Hence stop the searching process.

def LinearSearch(mylist, n, k):
    for j in range(0, n):
        if (mylist[j] == k):
            return j
    return -1

mylist = [1, 3, 5, 7, 9]
print("Given Elements : ", mylist)
k = int(input("Enter the element to be searched : "))
n = len(mylist)
result = LinearSearch(mylist, n, k)
if(result == -1):
    print("Element not found")
else:
    print("Element found at index: ", result)

Execution:
Input
Given Elements : [1, 3, 5, 7, 9]
Enter the element to be searched : 3
Output
Element found at index: 1

3.1.2 Complexity Analysis of Linear Search


Time Complexity
 Best case - O(1)
The best case occurs when the target element is found at the beginning of
the list/array. Since only one comparison is made, the time complexity is O(1).
o Example:
Array A[] = {3, 4, 0, 9, 8} & target element = 3.
Here, the target is found at A[0].

 Worst-case - O(n), where n is the size of the list/array.


The worst-case occurs when the target element is found at the end of the
list or is not present in the list/array. Since you need to traverse the entire
list, the time complexity is O(n), as n comparisons are needed.

 Average case - O(n)


The average case complexity of the linear search is also O(n).

Space Complexity
 The space complexity of the linear search is O(1), as we don't need
any auxiliary space for the algorithm.

3.2 Binary search


Binary search is a searching algorithm which works efficiently on sorted elements.
It uses the divide and conquer method, in which we compare the target element with
the middle element of the list. If they are equal, then the target is found at the
middle position; else, we reduce the search space by half, i.e. apply binary search
on either the left or the right half of the list depending upon whether
target < middle element or target > middle element. We continue this until a
match is found or the size of the array reaches 1.



Binary search is implemented using following steps:
Step 1: Read the search element from the user
Step 2: Find the middle element in the sorted array
Step 3: Compare, the search element with the middle element in the sorted array.
Step 4: If both are matched, then display "Given element found!!!" and
terminate the function
Step 5: If both are not matched, then check whether the search element is
smaller or larger than middle element.
Step 6: If the search element is smaller than middle element, then repeat
steps 2, 3, 4 and 5 for the left sub array of the middle element.
Step 7: If the search element is larger than middle element, then repeat
steps 2, 3, 4 and 5 for the right sub array of the middle element.
Step 8: Repeat the same process until we find the search element in the
array or until the sub array contains only one element.
Step 9: If that element also doesn't match with the search element, then display
"Element not found in the array!!!" and terminate the function.
/* 3.2.1 Python Program to search the given element in the list of
items using Binary Search using Iterative approach */
Method 1 – Iterative approach:
Given an array of elements: 6, 12, 17, 23, 38, 45, 77, 84, 90
The element to be searched: 45
Formula for calculating the middle index: Mid = (start + end) / 2



The Element is FOUND. Hence stop the searching process.

def mybinarySearch(myarray, x, low, high):
    # Binary Search using iterative approach
    while low <= high:
        mid = low + (high - low)//2
        if myarray[mid] == x:
            return mid
        elif myarray[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1

myarray = [3, 4, 5, 6, 7, 8, 9]
print("Elements in the array: " , myarray)
x = int(input("Enter the element to be searched : "))



result = mybinarySearch(myarray, x, 0, len(myarray)-1)

if result != -1:
print("Element is present at index :" + str(result))
else:
print("Element not found ")

Execution:
Input
Elements in the array: [3, 4, 5, 6, 7, 8, 9]
Enter the element to be searched : 6
Output
Element is present at index :3

/* 3.2.2 Python Program to search the given element in the list of


items using Binary Search using Recursive approach */
Method 2 – Recursive approach:
Method 2 is the recursive approach, in which the function calls itself
again and again. We declare a recursive function and its base condition:
the lowest index must be smaller than or equal to the highest index. We
calculate the middle index as in the previous program.
We use an if statement to proceed with the binary search.
 If the middle value is equal to the number that we are looking for,
the middle index is returned.
 If the middle value is less than the value we are looking for, the
recursive function is called again on the right half, i.e. low = mid + 1.
 If the middle value is greater than the value we are looking for, the
recursive function is called again on the left half, i.e. high = mid - 1.

Program:
def mybinary_search(myarr, low, high, x):
    if high >= low:
        mid = (high + low) // 2
        if myarr[mid] == x:
            return mid
        # If element is smaller than mid, then it can only
        # be present in left subarray
        elif myarr[mid] > x:
            return mybinary_search(myarr, low, mid - 1, x)
        # Else the element can only be present in right subarray
        else:
            return mybinary_search(myarr, mid + 1, high, x)
    else:
        # Element is not present in the array
        return -1



# Test data
myarr = [ 2, 3, 4, 10, 40 ]
print("Elements in the array :", myarr)
x = int(input("Enter the element to be searched : "))

# Function call
result = mybinary_search(myarr, 0, len(myarr)-1, x)

if result != -1:
print("Element is present at index : ", str(result))
else:
print("Element is not present in array")

Execution:
Input
Elements in the array : [2, 3, 4, 10, 40]
Enter the element to be searched : 10
Output
Element is present at index : 3

3.2.3 Complexity Analysis of Binary Search


Time Complexity
 Best case - O(1)
The best case occurs when the target element is found in the middle of
list/array. Since only one comparison is made, the time complexity is O(1).

 Worst-case - O(logn)
The worst occurs when the algorithm keeps on searching for the target
element until the size of the array reduces to 1. Since the number of
comparisons required is logn, the time complexity is O(logn).

 Average case - O(logn)


Binary search has an average-case complexity of O(logn).

Space Complexity
 Since no extra space is needed, the space complexity of the binary search is
O(1).

3.3 Interpolation Search


The interpolation search is basically an improved version of the binary
search. This searching algorithm resembles the method by which one might search
a telephone book for a name. It performs very efficiently when there are uniformly
distributed elements in the sorted list. In a binary search, we always start
searching from the middle of the list, whereas in the interpolation search we
determine the starting position depending on the item to be searched. In the
interpolation search algorithm, the starting search position is most likely to be the
closest to the

start or end of the list depending on the search item. If the search item is near to
the first element in the list, then the starting search position is likely to be near
the start of the list.
Important points on Interpolation Search
 Interpolation search is an improvement over binary search.
 Binary Search always checks the value at middle index. But, interpolation
search may check at different locations based on the value of element being
searched.
 For interpolation search to work efficiently the array elements/data should
be sorted and uniformly distributed.

Interpolation search is implemented using following steps:


Step 1: Let A - array of elements, e - element to be searched, pos - current position.
Step 2: Assign start = 0 & end = n-1.
Step 3: Calculate the position (pos) to start searching by using the formula below
(a worked computation is shown after these steps):
        pos = start + ((e - A[start]) * (end - start)) / (A[end] - A[start])
Step 4: If A[pos] == e, the element is found at index pos.
Step 5: Otherwise, if e > A[pos], make start = pos + 1.
Step 6: Else, if e < A[pos], make end = pos - 1.
Step 7: Repeat steps 3, 4, 5, 6
        while: start <= end && e >= A[start] && e <= A[end]
 start <= end is checked until we have elements in the sub-array.
 e >= A[start] holds when the element we are looking for is greater
than or equal to the starting element of the sub-array we are looking in.
 e <= A[end] holds when the element we are looking for is less than or
equal to the last element of the sub-array we are looking in.
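
A worked computation of the probe position (assuming the same sample array used in
the program below and a target x = 20):

arr = [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
start, end, x = 0, len(arr) - 1, 20

# pos = start + ((x - A[start]) * (end - start)) / (A[end] - A[start])
pos = start + ((x - arr[start]) * (end - start)) // (arr[end] - arr[start])
print(pos, arr[pos])   # probes index 3 (value 16) first; 20 > 16, so the search continues to the right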

/* Python Program to search the given element in the list of


items using Interpolation Search */

Example: Element to be searched = 4.



Program
def interpolationSearch(arr, lo, hi, x):
    if (lo <= hi and x >= arr[lo] and x <= arr[hi]):
        # Probe position: multiply before the integer division so the
        # ratio is not truncated to zero
        pos = lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])
        if arr[pos] == x:
            return pos
        if arr[pos] < x:
            return interpolationSearch(arr, pos + 1, hi, x)
        if arr[pos] > x:
            return interpolationSearch(arr, lo, pos - 1, x)
    return -1

arr = [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
print("Elements in the array :", arr)
x = int(input("Enter the element to be searched : "))

n = len(arr)
index = interpolationSearch(arr, 0, n - 1, x)

if index != -1:
    print("Element found at index", index)
else:
    print("Element not found")

Execution:
Input
Elements in the array : [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
Enter the element to be searched : 20
Output
Element found at index 6

Complexity Analysis of Interpolation Search


Time Complexity
 Best case - O(1)
The best-case occurs when the target is found exactly as the first expected
position computed using the formula. As we only perform one comparison,
the time complexity is O(1).

 Worst-case - O(n)
The worst case occurs when the given data set is exponentially distributed.



 Average case - O(log(log(n)))
If the data set is sorted and uniformly distributed, then it takes O(log(log(n)))
time as on an average (log(log(n))) comparisons are made.
Space Complexity
 Since no extra space is needed, the space complexity of the
interpolation search is O(1).

3.4 Comparative analysis:

Algorithm            | Best case | Worst-case | Average case   | Space Complexity
Linear Search        | O(1)      | O(n)       | O(n)           | O(1)
Binary Search        | O(1)      | O(logn)    | O(logn)        | O(1)
Interpolation Search | O(1)      | O(n)       | O(log(log(n))) | O(1)

4. Pattern Search
The Pattern Searching algorithms are sometimes also referred to as String
Searching Algorithms. These algorithms are useful in the case of searching a
pattern in a string.

Algorithms used for String Matching:


Various string matching algorithms are:
 The Naive String Matching Algorithm
 The Rabin-Karp-Algorithm
 Finite Automata
 The Knuth-Morris-Pratt Algorithm
 The Boyer-Moore Algorithm

Algorithms based on character comparison


Naive Match Algorithm:
It slides the pattern over text one by one and checks for a match. If a match
is found, then slides by 1 again to check for subsequent matches.

KMP (Knuth Morris Pratt) Algorithm:


KMP algorithm is used to find a "Pattern" in a "Text". This algorithm
compares character by character from left to right. But whenever a mismatch
occurs, it uses a pre-processed table called "Prefix Table" to skip characters
comparison while matching.

Algorithms based on Hashing Technique


Rabin Karp Algorithm:
It matches the hash value of the pattern with the hash value of current
substring of text, and if the hash values match then only it starts matching
individual characters.

4.1 Naive Match Algorithm:


This is a simple brute force approach. It compares the first
character of the pattern with the given string. If a match is found, pointers in both
strings are advanced. If a match is not found,

the pointer to text is incremented and pointer of the pattern is reset. This process
is repeated till the end of the text. The naïve approach does not require any pre-
processing.
Given a text array T[1..n] of n characters and a pattern array P[1..m] of m
characters, the algorithm finds an integer s, called a valid shift, where 0 ≤ s ≤ n-m. In
other words, we need to find whether P occurs in T, i.e. whether P is a substring of T.
The characters of P and T are drawn from some finite alphabet such as {0, 1} or
{A, B, ..., Z, a, b, ..., z}.
Steps:
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s
Input:
string = "This is my class room"
pattern = "class"
Output:
Pattern found at index 11
Input:
string = "AABAACAADAABAABA"
pattern = "AABA"
Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 12



Fig Working of Naïve Pattern matching algorithm



/* 4.1.1 Python Program to search the pattern in the given string
using Naïve Match algorithm */

def naïve_algorithm(string, pattern):
    n = len(string)
    m = len(pattern)
    if m > n:
        print("Pattern not found")
        return
    for i in range(n - m + 1):
        j = 0
        while j < m:
            if string[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            print("Pattern found at index: ", i)

string = "hellohihello"
print("Given String : ", string)
pattern = input("Enter the pattern to be searched :")
naïve_algorithm(string, pattern)

Execution:
Input
Given String : hellohihello
Enter the pattern to be searched :hi
Output
Pattern found at index: 5

4.1.2 Complexity Analysis of Naïve Match


Time Complexity
 Best Case Complexity - O(n).
Best case complexity occurs when the first character of the pattern is not present in
string.
String = “HIHELLOHIHELLO”
Pattern = “ LI”
The number of comparisons in best case is O(n).

 Worst Case Complexity - O(m*(n-m+1)).


Worst case complexity of Naive Pattern Searching occurs in
following cases. Case 1: When all the characters of the string and
pattern are same.
String = “HHHHHHHHHHHH”
Pattern = “ HHH”

Case 2: When only the last character is different.



String = “HHHHHHHHHHHM”
Pattern = “ HHM”
The number of comparisons in the worst case is O(m*(n-m+1)).

Space Complexity
 Since no extra space is needed, the space complexity of the naïve search is
O(1).

4.1.3 Merits & Demerits:


Advantages:
 The comparison of the pattern with the given string can be done in any order
 No extra space required
 Since it doesn’t require the pre-processing phase, as the running time is
equal to matching time
Disadvantage:
 Naive method is inefficient because information from a shift is not used again.

4.2 Rabin Karp Algorithm:


Rabin-Karp algorithm is an algorithm used for searching/matching patterns in
the text using a hash function. Unlike Naive string matching algorithm, it does not
travel through every character in the initial phase rather it filters the characters
that do not match and then performs the comparison.
 Initially calculate the hash value of the pattern.
 Start iterating from the starting of the string:
o Calculate the hash value of the current substring having length m.
o If the hash value of the current substring and the pattern are
same, check if the substring is same as the pattern.
o If they are same, store the starting index as a valid answer.
Otherwise, continue for the next substrings.
 Return the starting indices as the required answer.

Comparing the hash value of the pattern "acad" with the hash value of each window of the text:

Hash(acad) = 1466, Hash(abra) = 1493 → Hash(acad) ≠ Hash(abra), hence it is a mismatch.
Hash(acad) = 1466, Hash(brac) = 1533 → Hash(acad) ≠ Hash(brac), hence it is a mismatch.
Hash(acad) = 1466, Hash(raca) = 1595 → Hash(acad) ≠ Hash(raca), hence it is a mismatch.
Hash(acad) = 1466, Hash(acad) = 1466 → Hash(acad) = Hash(acad), match found at index 3.



Steps in Rabin-Karp Algorithm:
Step 1:
 Take the input text and the pattern which we want to match.

Text:    A B C C D D A E F G
Pattern: C D D

Step 2:
 Here, we have taken the first ten alphabets only (i.e. A to J) and given them weights.
A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
Step 3:
n  Length of the text
m  Length of the pattern
Here, n = 10 and m = 3.
d  Number of characters in the input set.
Here, we have taken the input set {A, B, C, ..., J}. So, d = 10.
Note: we can assume any suitable value for d.
Step 4:
 Calculate the hash value of the pattern (CDD)
hash value of pattern(p) = Σ(vᵢ × d^(m-1-i)) mod 13
                         = ((3 × 10²) + (4 × 10¹) + (4 × 10⁰)) mod 13
                         = 344 mod 13
                         = 6
In the calculation above, we choose a prime number (here, 13) in such a way
that we can perform all the calculations with single-precision arithmetic.
 Now calculate the hash value for the first window (ABC)
hash value of text(t) = ((1 × 10²) + (2 × 10¹) + (3 × 10⁰)) mod 13
                      = 123 mod 13
                      = 6
 Compare the hash value of the pattern with the hash value of the text. If
they match, then character matching is performed. In the above example,
the hash value of the first window (ABC) matches that of the pattern, so we
compare the characters of ABC and CDD. Since they do not match, we move
to the next window.

Step 5:
 We calculate the hash value of the next window by subtracting the first term
and adding the next term as shown below.
 Simple numerical example:
o Pattern length is 3 and the string is "23456".
o Let us assume that we computed the value of the first window as 234.
o How do we compute the value of the next window "345"? It is just
(234 - 2*100)*10 + 5, and we get 345.
hash value of the next window BCC:
t = (((1 × 10²) + (2 × 10¹) + (3 × 10⁰) - (1 × 10²)) × 10 + (3 × 10⁰)) mod 13
  = 233 mod 13
  = 12
For BCC, t = 12 (≠ 6). Therefore, go for the next window.
After a few searches, we will get the match for the window CDD in the text.
A small sketch of this rolling-hash update is shown below.
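
A minimal sketch of the rolling-hash update (using the plain digits of "23456" with
base d = 10 and no modulus, purely for illustration):

digits = [2, 3, 4, 5, 6]
m, d = 3, 10

h = 0
for v in digits[:m]:
    h = h * d + v                  # hash of the first window "234"
print(h)                           # 234

# slide the window: drop the leading digit 2, append the next digit 5
h = (h - digits[0] * d ** (m - 1)) * d + digits[m]
print(h)                           # 345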

/* 4.2.1 Python Program to search the pattern in the given string


using Rabin-Karp algorithm */

d = 10
def search(pattern, text, q):
    m = len(pattern)
    n = len(text)
    p = 0
    t = 0
    h = 1

    for i in range(m-1):
        h = (h*d) % q

    # Calculate hash value for pattern and first window of text
    for i in range(m):
        p = (d*p + ord(pattern[i])) % q
        t = (d*t + ord(text[i])) % q

    # Find the match
    for i in range(n-m+1):
        if p == t:
            for j in range(m):
                if text[i+j] != pattern[j]:
                    break
                j += 1
            if j == m:
                print("Pattern is found at position: " + str(i+1))

        if i < n-m:
            t = (d*(t-ord(text[i])*h) + ord(text[i+m])) % q
            if t < 0:
                t = t+q

text = "hihellohi"
print("Given String : ", text)
pattern = input("Enter the pattern to be searched :")
q = int(input("Enter the prime number :"))

search(pattern, text, q)



Execution:
Input
Given String : hihellohi
Enter the pattern to be searched :hello
Enter the prime number :3
Output
Pattern is found at position: 3

4.2.2 Complexity Analysis of Rabin-Karp algorithm


Time Complexity
 Best Case Complexity - O(n+m).
The average and best-case running time of the Rabin-Karp algorithm is
O(n+m), but its worst-case time is O(nm).

 Worst Case Complexity - O(nm).


The worst case of the Rabin-Karp algorithm occurs when all characters of
pattern and text are the same as the hash values of all the substrings of
text matches with the hash value of pattern.

Space Complexity
 Since no extra space is needed, the space complexity of the Rabin-Karp
algorithm is O(1).

4.2.3 Merits & Demerits:


Advantages:
 Extends to 2D patterns.
 Extends to finding multiple patterns.
Disadvantage:
 Arithmetic operations are slower than character comparisons.

4.3 Knuth-Morris-Pratt Algorithm


KMP Algorithm is one of the most popular patterns matching algorithms.
KMP stands for Knuth Morris Pratt algorithm. KMP algorithm was the first linear
time complexity algorithm for string matching. KMP algorithm is used to find a
"Pattern" in a "Text". This algorithm compares character by character from left to
right. But whenever a mismatch occurs, it uses a pre-processed table called "Prefix
Table" to skip characters comparison while matching. Sometimes prefix table is also
known as LPS Table. Here LPS stands for "Longest proper Prefix which is also
Suffix".

Steps for Creating LPS Table (Prefix Table)


Step 1: Define a one-dimensional array with the size equal to the length of the
pattern (LPS[size]).
Step 2: Define variables i & j. Set i = 0, j = 1 and LPS[0] = 0.
Step 3: Compare the characters at Pattern[i] and Pattern[j].
Step 4: If both are matched then set LPS[j] = i+1 and increment both i & j values by one.
Go to Step 3.
Step 5: If both are not matched then check the value of variable 'i'. If it is '0' then set
LPS[j] = 0 and increment 'j' value by one; if it is not '0' then set i = LPS[i-1]. Go to Step 3.
Step 6: Repeat the above steps until all the values of LPS[] are filled.

Example:



Given Pattern: A B C D A B D
Initialize the LPS[] table with size 7, which is equal to the length of the pattern.
        0 1 2 3 4 5 6
LPS
Step 1:
 Define variables i & j.
 Set i = 0, j= 1 and LPS[0] = 0.
0 1 2 3 4 5 6
LPS 0

Step 2:
 Compare Pattern[i] with Pattern[j] ====>A is compared with B. Since
both were not matching, check the value of i.
 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.
0 1 2 3 4 5 6
LPS 0 0
 Now, i = 0 & j = 2

Step 3:
 Compare Pattern[i] with Pattern[j] ====>A is compared with C. Since
both were not matching, check the value of i.
 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.
0 1 2 3 4 5 6
LPS 0 0 0
 Now, i = 0 & j = 3

Step 4:
 Compare Pattern[i] with Pattern[j] ====>A is compared with D. Since
both were not matching, check the value of i.
 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.
0 1 2 3 4 5 6
LPS 0 0 0 0
 Now, i = 0 & j = 4

Step 5:
 Compare Pattern[i] with Pattern[j] ====>A is compared with A. Since both
are matching, set LPS[j] = i+1 and increment both ‘i’ & ‘j’ value by 1.
0 1 2 3 4 5 6
LPS 0 0 0 0 1
 Now, i = 1 & j = 5

Step 6:
 Compare Pattern[i] with Pattern[j] ====>B is compared with B. Since both
are matching, set LPS[j] = i+1 and increment both ‘i’ & ‘j’ value by 1.
0 1 2 3 4 5 6



LPS 0 0 0 0 1 2
 Now, i = 2 & j = 6
Step 7:
 Compare Pattern[i] with Pattern[j] ====>C is compared with D. Since
both were not matching, check the value of i.
 i !=0, so set i= LPS[i-1]====>i= LPS[2-1]
 i= 0
0 1 2 3 4 5 6
LPS 0 0 0 0 1 2
 Now, i = 0 & j = 6

Step 8:
 Compare Pattern[i] with Pattern[j] ====>A is compared with D. Since
both were not matching, check the value of i.
 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.
0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0
 Now, i = 0 & j = 7

Final LPS[] table is as follows:


0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0
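
A compact sketch of the table-building steps above (written as a standalone helper,
not the notes' program); it reproduces this LPS table for the pattern "ABCDABD":

def build_lps(pattern):
    lps = [0] * len(pattern)
    i, j = 0, 1
    while j < len(pattern):
        if pattern[i] == pattern[j]:
            lps[j] = i + 1      # match: record prefix length, advance both i and j
            i += 1
            j += 1
        elif i != 0:
            i = lps[i - 1]      # mismatch with i != 0: fall back within the table
        else:
            lps[j] = 0          # mismatch with i == 0: no proper prefix ends here
            j += 1
    return lps

print(build_lps("ABCDABD"))     # [0, 0, 0, 0, 1, 2, 0]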

Working mechanism of KMP:


We use the LPS table to decide how many characters are to be skipped for
comparison when a mismatch has occurred. When a mismatch occurs, check the
LPS value of the previous character of the mismatched character in the pattern.
 If it is '0' then start comparing the first character of the pattern with the
next character to the mismatched character in the text.
 If it is not '0' then start comparing the character which is at an index value
equal to the LPS value of the previous character to the mismatched
character in pattern with the mismatched character in the Text.

Example:
Consider the following Text and Pattern
Text : ABC ABCDAB ABCDABCDABDE
Pattern: ABCDABD
LPS[] table for the above pattern is as follows:
0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0

Step 1:
 Start comparing the first character of the pattern with the first character of the
Text from left to right.



Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here the mismatch occurs at pattern[3], so we need to consider the LPS[2] value.
Since it is '0', we must compare the first character of the pattern with the next
character in the Text.

Step 2:
 Start comparing the first character of the pattern with the next
character in the Text.
Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here the mismatch occurs at pattern[6], so we need to consider the LPS[5] value.
LPS[5] = 2, so now we must start comparing from pattern[2] against the
mismatched character in the Text.
Step 3:
 Since LPS value is ‘2’ no need to compare Pattern[0] &
Pattern[1] values..
Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here the mismatch occurs at pattern[2]. We need to consider the LPS[1] value,
which is '0'. Hence, compare the first character of the pattern with the next
character in the Text.
Step 4:
 Since LPS value is ‘2’ no need to compare Pattern[0] &
Pattern[1] values..

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here the mismatch occurs at pattern[6]. We need to consider the LPS[5] value.
LPS[5] = 2, so now we must start comparing from pattern[2] against the
mismatched character in the Text.

Step 5:
 Since LPS value is ‘2’ no need to compare Pattern[0] & Pattern[1]
values. Compare pattern[2] with mismatched character in Text.



Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here all the characters of the pattern matched with the substring in
the Text, which starts at index value 15. Hence, conclude that
pattern found at index 15.

/* 4.3.1 Python Program to search the pattern in the given string


using Knuth-Morris-Pratt Algorithm */

def KMP_String(pattern, text):
    a = len(text)
    b = len(pattern)
    prefix_arr = get_prefix_arr(pattern, b)

    initial_point = []
    m = 0   # index into the text
    n = 0   # index into the pattern

    while m != a:
        if text[m] == pattern[n]:
            m += 1
            n += 1
            if n == b:                       # full pattern matched
                initial_point.append(m - n)
                n = prefix_arr[n - 1]        # continue searching for further matches
        elif n != 0:
            n = prefix_arr[n - 1]            # mismatch: fall back using the prefix table
        else:
            m += 1                           # mismatch at n == 0: move to the next text character

    return initial_point

def get_prefix_arr(pattern, b):
    prefix_arr = [0] * b
    n = 0
    m = 1
    while m != b:
        if pattern[m] == pattern[n]:
            n += 1
            prefix_arr[m] = n
            m += 1
        elif n != 0:
            n = prefix_arr[n-1]
        else:
            prefix_arr[m] = 0
            m += 1
    return prefix_arr

string = "hihellohihellohi"

print("Given String : ", string)
pat = input("Enter the pattern to be searched :")

initial_index = KMP_String(pat, string)


for i in initial_index:
    print('Pattern is found at index: ', i)

Execution:
Input
Given String : hihellohihellohi
Enter the pattern to be searched :hi
Output
Pattern is found at index: 0
Pattern is found at index: 7
Pattern is found at index: 14

4.3.2 Complexity Analysis of Knuth-Morris-Pratt Algorithm


Time Complexity
 Worst case complexity of KMP algorithm is O(m+n).
o O(m) time is taken for LPS table creation.
o Once this prefix suffix table is created, actual search complexity is O(n).
Space Complexity
 Space complexity of KMP algorithm is O(m) because some pre-
processing work is involved.

4.3.3 Merits & Demerits:


Advantages:
 The running time of the KMP algorithm is O(m + n), which is very fast.
 The algorithm never needs to move backwards the input text T. It makes
the algorithm good for processing very large files.
Disadvantage:
 Doesn’t work so well as the size of the alphabets increases.

4.4 Comparative analysis:

Algorithm                    | Pre-processing the Pattern | Time Complexity | Space Complexity
Naive Match Algorithm        | No pre-processing          | O(m*(n-m+1))    | O(1)
Rabin-Karp Algorithm         | No pre-processing          | O(nm)           | O(1)
Knuth-Morris-Pratt Algorithm | Pre-process the pattern    | O(m + n)        | O(m)



5. Sorting
Sorting is the process of arranging data in ascending or descending
order. There are several types of sorting in data structures, namely:
 Bubble sort
 Insertion sort
 Selection sort
 Bucket sort
 Heap sort
 Quick sort
 Radix sort etc.

5.1 Insertion Sort


Insertion sort is a simple sorting algorithm that works similar to the way you
play cards in your hands. The array is virtually split into a sorted and an unsorted
part. Values from the unsorted part are picked and placed at the correct position
in the sorted part.

Fig. Insertion sort

Steps:
Step 1:
 The first element in the array is assumed to be sorted.
Step 2:
 Take the second element and store it separately in currentvalue. Compare
currentvalue with the first element. If the first element is greater than
currentvalue, then currentvalue is placed in front of the first element. Now,
the first two elements are sorted.
Step 3:
 Take the third element and compare it with the elements on the left of it.
Place it just after the element smaller than it. If there is no element
smaller than it, then place it at the beginning of the array.
Step 4:
 Similarly, place every unsorted element at its correct position. Repeat until list is
sorted.

Working of Insertion Sort algorithm:


Example:
List = [12, 11, 13, 5, 6]



First Pass:
 Initially, the first two elements of the array are compared in insertion sort.
12 11 13 5 6
 Here, 12 is greater than 11. They are not in the ascending order and 12 is
not at its correct position. Hence, swap 11 and 12.
 So, for now 11 is stored in a sorted sub-array.
11 12 13 5 6

Second Pass:
 Now, move to the next two elements and compare them
11 12 13 5 6
 Here, 13 is greater than 12, thus both elements seems to be in ascending
order, hence, no swapping will occur. 12 also stored in a sorted sub-array
along with 11

Third Pass:
 Now, two elements are present in the sorted sub-array which are 11 and 12
 Moving forward to the next two elements which are 13 and 5
11 12 13 5 6
 Both 5 and 13 are not present at their correct place so swap them
11 12 5 13 6
 After swapping, elements 12 and 5 are not sorted, thus swap again
11 5 12 13 6
 Here, again 11 and 5 are not sorted, hence swap again
5 11 12 13 6

Fourth Pass:
 Now, the elements which are present in the sorted sub-array are 5, 11 and 12
 Moving to the next two elements 13 and 6
5 11 12 13 6
 Clearly, they are not sorted, thus perform swap between both
5 11 12 6 13
 Now, 6 is smaller than 12, hence, swap again
5 11 6 12 13
 Here, also swapping makes 11 and 6 unsorted hence, swap again
5 6 11 12 13
Finally, the list is completely sorted.

/* 5.1.1 Python Program to sort the elements in the list using Insertion sort */

def insertionSort(arr):
    for index in range(1, len(arr)):
        currentvalue = arr[index]
        position = index

        while position > 0 and arr[position-1] > currentvalue:
            arr[position] = arr[position-1]
            position = position - 1

        arr[position] = currentvalue

arr = [54, 26, 93, 17, 77, 91, 31, 44, 55, 20]
print("Given list : ", arr)
insertionSort(arr)
print("Sorted list : ", arr)

Execution:
Input
Given list : [54, 26, 93, 17, 77, 91, 31, 44, 55, 20]
Output
Sorted list : [17, 20, 26, 31, 44, 54, 55, 77, 91, 93]

5.1.2 Complexity Analysis of Insertion sort


Time Complexity
 Best case complexity - O(n)
It occurs when there is no sorting required, i.e. the array is already sorted.
 Worst case complexity - O(n2)
It occurs when the array elements are required to be sorted in reverse order. It
means suppose we need to sort the array elements in ascending order, but its
elements are in descending order.
 Average case complexity - O(n2)
It occurs when the array elements are in jumbled order that is not properly
ascending and not properly descending.
Space Complexity
 Space complexity of insertion sort is O(1)

5.2 Heap sort


Heap sort is a comparison-based sorting technique based on Binary Heap
data structure. It is similar to the selection sort where we first find the minimum
element and place the minimum element at the beginning. Repeat the same
process for the remaining elements. Heap sort processes the elements by creating
the min-heap or max-heap using the elements of the given array. Min- heap or
max-heap represents the ordering of array in which the root element represents
the minimum or maximum element of the array.

5.2.1 Heap
 A heap is a complete binary tree; a binary tree is a tree in which each
node can have at most two children. A complete binary tree is a binary
tree in which all the levels except the last level, i.e. the leaf level, are
completely filled, and all the nodes are left-justified.

5.2.2 Relationship between Array Indexes and Tree Elements


 A complete binary tree has an interesting property that we can use to find
the children and parents of any node.



 If the index of any element in the array is i, the element at index 2i+1
is its left child and the element at index 2i+2 is its right child. Also, the
parent of any element at index i is given by the floor of (i-1)/2.

Example:
Given array elements:

Index:   0  1  2  3  4  5
Element: 1 12  9  5  6 10

Array is converted to Heap

Steps to convert array elements to Heap


Left child of 1 (index 0)   = element at index (2*0+1) = element at index 1 = 12
Right child of 1 (index 0)  = element at index (2*0+2) = element at index 2 = 9
Left child of 12 (index 1)  = element at index (2*1+1) = element at index 3 = 5
Right child of 12 (index 1) = element at index (2*1+2) = element at index 4 = 6

Rules to find parent of any node


Parent of 9 (position 2)  = element at index (2-1)/2 = 0.5 ~ index 0 = 1
Parent of 12 (position 1) = element at index (1-1)/2 = index 0 = 1
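
The same index arithmetic in a few lines of Python (the helper names are illustrative),
applied to the example array above:

heap = [1, 12, 9, 5, 6, 10]

def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2

print(heap[left(0)], heap[right(0)])     # children of 1  -> 12 9
print(heap[left(1)], heap[right(1)])     # children of 12 -> 5 6
print(heap[parent(2)], heap[parent(1)])  # parent of 9 and parent of 12 -> 1 1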

5.2.3 Heap Data Structure


Heap is a special tree-based data structure. A binary tree is said to follow a heap data
structure if
 it is a complete binary tree
 All nodes in the tree follow the property that they are greater than their
children, i.e. the largest element is at the root and both its children are
smaller than the root, and so on. Such a heap is called a max-heap. If instead
all nodes are smaller than their children, it is called a min-heap.



Fig Max Heap and Min Heap



5.2.4 "Heapify" process
 Starting from a complete binary tree, we can modify it to become a Max-
Heap by running a function called heapify on all the non-leaf elements of the
heap. Heapify process uses recursion.
Pseudocode
heapify(array)
    Root = array[0]
    Largest = largest(array[0], array[2*0 + 1], array[2*0 + 2])
    if (Root != Largest)
        Swap(Root, Largest)

 If the top element is not the largest but all the sub-trees are max-heaps, then to
maintain the max-heap property for the entire tree we have to keep pushing the
out-of-place element downwards until it reaches its correct position.

Steps

Step 1: Construct a binary tree with the given list of elements.
Step 2: Transform the binary tree into a max heap.
Step 3: Since the tree satisfies the max-heap property, the largest item is stored at the root
node. Three operations are performed at each step:
 Swap: Remove the root element and put it at the end of the array (nth position).
Put the last item of the tree (heap) at the vacant place.
 Remove: Reduce the size of the heap by 1.
 Heapify: Heapify the root element again so that we have the highest element at the root.
Step 4: Put the removed element into the sorted list.
Step 5: Repeat the same until the max heap becomes empty.
Step 6: Display the sorted list.

Working of the Heap Sort Algorithm

Example:
Construct a binary heap with the given list of elements.



Given array elements:

Index:   0  1  2  3  4  5  6  7
Element: 81 89  9 11 14 76 54 22

Array is converted to Heap

Convert the constructed heap to max heap using heapify algorithm

After converting the given heap into max heap, the array elements are -
0  1  2  3  4  5  6  7
89 81 76 22 14 9  54 11

Next, we have to delete the root element (89) from the max heap. To delete this
node, we have to swap it with the last node, i.e. (11). After deleting the root
element, we again have to heapify it to convert it into max heap.

After swapping the array element 89 with 11, and converting the heap into max-
heap, the elements of array are –
0  1  2  3  4  5  6  7
81 22 76 11 14 9  54 89

In the next step, again, we have to delete the root element (81) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (54). After deleting
the root element, we again have to heapify it to convert it into max heap.



After swapping the array element 81 with 54 and converting the heap into max-
heap, the elements of array are –
0  1  2  3  4  5  6  7
76 22 54 11 14 9  81 89

In the next step, we have to delete the root element (76) from the max heap again.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting
the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 76 with 9 and converting the heap into max-
heap, the elements of array are –
0  1  2  3  4  5  6  7
54 22 9  11 14 76 81 89

In the next step, again we have to delete the root element (54) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (14). After deleting
the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 54 with 14 and converting the heap into max-
heap, the elements of array are –
0  1  2  3  4  5  6  7
22 14 9  11 54 76 81 89

In the next step, again we have to delete the root element (22) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (11). After deleting
the root element, we again have to heapify it to convert it into max heap.
After swapping the array element 22 with 11 and converting the heap into max-
heap, the elements of array are –
0  1  2  3  4  5  6  7
14 11 9  22 54 76 81 89

In the next step, again we have to delete the root element (14) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting
the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 14 with 9 and converting the heap into max-
heap, the elements of array are –
0 1 2 3 4 5 6 7
11 9 14 22 54 76 81 89

In the next step, again we have to delete the root element (11) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting
the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 11 with 9, the elements of array are –
0  1  2  3  4  5  6  7
9  11 14 22 54 76 81 89

Now, heap has only one element left. After deleting it, heap will be empty.

After completion of sorting, the array elements are –


0  1  2  3  4  5  6  7
9  11 14 22 54 76 81 89
Now, the array is completely sorted.



/* 5.2.5 Python Program to sort the elements in the list using Heap sort */

def heapify(array, a, b):
    # a = size of the heap, b = index of the subtree root
    largest = b
    l = 2 * b + 1        # index of the left child
    root = 2 * b + 2     # index of the right child

    if l < a and array[b] < array[l]:
        largest = l

    if root < a and array[largest] < array[root]:
        largest = root

    # Change root if needed
    if largest != b:
        array[b], array[largest] = array[largest], array[b]
        heapify(array, a, largest)

# sort an array of given size
def Heap_Sort(array):
    a = len(array)

    # Building maxheap..
    for b in range(a // 2 - 1, -1, -1):
        heapify(array, a, b)

    # swap elements
    for b in range(a-1, 0, -1):
        array[b], array[0] = array[0], array[b]
        heapify(array, b, 0)

array = [81, 89, 9, 11, 14, 76, 54, 22]
print("Original Array :", array)
Heap_Sort(array)
a = len(array)
print("Sorted Array : ", array)

Execution:
Input
Original Array : [81, 89, 9, 11, 14, 76, 54, 22]
Output
Sorted Array :  [9, 11, 14, 22, 54, 76, 81, 89]

5.2.6 Complexity Analysis of Heap sort
Time Complexity
 Best case complexity - O(nlogn)
 Worst case complexity - O(nlogn)
 Average case complexity - O(nlogn)
Heap sort performs O(nlogn) work in every case: building the heap takes O(n) time
and each of the n root removals requires an O(logn) heapify, regardless of whether
the input is already sorted, in reverse order, or in jumbled order.
Space Complexity
 Space complexity of Heap sort is O(1)

5.4 Comparative analysis:

Algorithm      | Best case | Worst-case | Average case | Space Complexity
Insertion sort | O(n)      | O(n²)      | O(n²)        | O(1)
Heap Sort      | O(nlogn)  | O(nlogn)   | O(nlogn)     | O(1)

