
Module 1

Introduction and Advanced Data Structure

Notation of Algorithms, Analysis of Algorithms, Designing of Algorithms, Growth
of Functions, Master's Theorem, Asymptotic Notations and Basic Efficiency
Classes. Sorting and Searching Algorithms: Insertion Sort, Selection Sort and
Bubble Sort. Divide and Conquer: Merge Sort, Quick Sort, Heap Sort, Sequential
Search and Binary Search. Binary Search Tree and AVL Tree: Traversal and
Related Properties, Binomial Heaps, Fibonacci Heaps, Data Structures for
Disjoint Sets.

What is an Algorithm?
An algorithm is a finite sequence of steps for solving a problem by performing calculation,
data processing, or automated reasoning tasks. An algorithm is an efficient method
that can be expressed within a finite amount of time and space.

An algorithm must possess the following characteristics:


1. Finiteness: An algorithm should have a finite number of steps and it should
end after a finite time.
2. Input: An algorithm may have many inputs or no inputs at all.
3. Output: It should result in at least one output.
4. Definiteness: Each step must be clear, well-defined and precise. There should
be no ambiguity.
5. Effectiveness: Each step must be simple and should take a finite amount of
time.
6. Language Independent: The Algorithm designed must be
language-independent, i.e. it must be just plain instructions that can be
implemented in any language, and yet the output will be the same, as
expected.
7. Feasible: The algorithm must be simple, generic, and practical, such that it can
be executed with the available resources.

Analysis of Algorithms

The efficiency of an algorithm can be decided by measuring its performance. We can
measure the performance of an algorithm by computing two factors:
1. Amount of time required by an algorithm to execute.
2. Amount of storage required by an algorithm
This is popularly known as time complexity and space complexity of an algorithm.
Time Complexity: The time complexity of an algorithm is the amount of computer
time required by an algorithm to run to completion. It is measured by calculating the
iteration of loops, number of comparisons etc.
Space Complexity: The space complexity of an algorithm is the amount of memory
it needs to run to completion. It includes space used by necessary input variables
and any extra space (excluding the space taken by inputs) that is used by the
algorithm. The space needed by a program has the following components:
Instruction space: Instruction space is the space needed to store the compiled
version of the program instructions. The amount of instruction space needed
depends on factors such as:
● The compiler used to compile the program into machine code.
● The compiler options in effect at the time of compilation.
● The target computer.
Data space: Data space is the space needed to store all constant and variable
values. Data space has two components:
● Space needed by constants and simple variables in the program.
● Space needed by dynamically allocated objects such as arrays and class
instances.
Environment stack space: The environment stack is used to save information
needed to resume execution of partially completed functions.

Why is Analysis of Algorithms important?


● To predict the behavior of an algorithm without implementing it on a specific
computer.
● It is much more convenient to have simple measures for the efficiency of an
algorithm than to implement the algorithm and test the efficiency every time a
certain parameter in the underlying computer system changes.
● It is impossible to predict the exact behavior of an algorithm. There are too
many influencing factors.
● The analysis is thus only an approximation; it is not perfect.
● More importantly, by analyzing different algorithms, we can compare them to
determine the best one for our purpose.

Analysis of algorithms:
Usually, the Analysis of algorithms (time required by an algorithm) falls under three
types −
Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.

Designing of algorithms
An algorithm design technique means a unique approach or mathematical method for
creating algorithms and solving problems. While multiple algorithms can solve a
problem, not all algorithms can solve it efficiently. Therefore, we must create
algorithms using a suitable algorithm design method based on the nature of the
problem. An algorithm created with the right design technique can solve the problem
much more efficiently with respect to the computational power required.
The algorithms can be classified in various ways. They are:
1. Implementation Method
2. Design Method
3. Other Classifications
Classification by Implementation Method: There are primarily three main
categories into which an algorithm can be named in this type of classification. They
are:
1. Recursion or Iteration: A recursive algorithm is an algorithm which calls
itself again and again until a base condition is achieved whereas iterative
algorithms use loops and/or data structures like stacks, queues to solve any
problem. Every recursive solution can be implemented as an iterative
solution and vice versa.
Example: The Tower of Hanoi is implemented in a recursive fashion while
the Stock Span problem is implemented iteratively.
2. Exact or Approximate: Algorithms that are capable of finding an optimal
solution for any instance of a problem are known as exact algorithms. For
problems where it is not possible (or not practical) to find the most optimized
solution, an approximation algorithm is used. Approximation algorithms
produce a solution that is provably close to the optimal one.
Example: For NP-Hard problems, approximation algorithms are used.
Sorting algorithms are exact algorithms.
3. Serial or Parallel or Distributed Algorithms: In serial algorithms, one
instruction is executed at a time while parallel algorithms are those in which
we divide the problem into subproblems and execute them on different
processors. If parallel algorithms are distributed on different machines, then
they are known as distributed algorithms.

Classification by Design Method: The main categories into which an algorithm
can be placed in this type of classification are:
1. Greedy Method: In the greedy method, at each step, a decision is made to
choose the local optimum, without thinking about the future consequences.
Example: Fractional Knapsack, Activity Selection.
2. Divide and Conquer: The Divide and Conquer strategy involves dividing
the problem into sub-problem, recursively solving them, and then
recombining them for the final answer.
Example: Merge sort, Quicksort.
3. Dynamic Programming: The approach of Dynamic programming is similar
to divide and conquer. The difference is that whenever we have recursive
function calls with the same result, instead of calling them again we try to
store the result in a data structure in the form of a table and retrieve the
results from the table. Thus, the overall time complexity is reduced.
“Dynamic” means we dynamically decide whether to call a function or
retrieve values from the table.
Example: 0-1 Knapsack, subset-sum problem.
4. Linear Programming: In Linear Programming, there are inequalities in
terms of inputs and maximizing or minimizing some linear functions of
inputs.
Example: Maximum flow of Directed Graph
5. Reduction(Transform and Conquer): In this method, we solve a difficult
problem by transforming it into a known problem for which we have an
optimal solution. Basically, the goal is to find a reducing algorithm whose
complexity is not dominated by the resulting reduced algorithms.
Example: Selection algorithm for finding the median in a list involves first
sorting the list and then finding out the middle element in the sorted list.
These techniques are also called transform and conquer.

Other Classifications: Apart from classifying the algorithms into the above broad
categories, the algorithm can be classified into other broad categories like:
1. Randomized Algorithms: Algorithms that make random choices for faster
solutions are known as randomized algorithms.
Example: Randomized Quicksort Algorithm.
2. Classification by complexity: Algorithms that are classified on the basis of
time taken to get a solution to any problem for input size. This analysis is
known as time complexity analysis.
Example: Some algorithms take O(n), while some take exponential time.
3. Classification by Research Area: In CS each field has its own problems
and needs efficient algorithms.
Example: Sorting Algorithm, Searching Algorithm, Machine Learning etc.
4. Branch and Bound Enumeration and Backtracking: These are mostly
used in Artificial Intelligence.

Notation of Algorithms (Asymptotic Notations):

To choose the best algorithm, we need to check the efficiency of each algorithm. The
efficiency can be measured by computing the time complexity of each algorithm.
Using asymptotic notations we can express time complexity as "fastest possible",
"slowest possible" or "average time".
Notations such as Ω, O and Θ are called asymptotic notations.
Big O Notation: The Big O notation defines an upper bound of an algorithm, it
bounds a function only from above.
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 <= f(n) <=
c*g(n) for all n >= n0}
Ω Notation: Just as Big O notation provides an asymptotic upper bound on a
function, Ω notation provides an asymptotic lower bound.
Ω (g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= c*g(n) <=
f(n) for all n >= n0}.
Θ Notation: The theta notation bounds a function from above and below, so it defines
exact asymptotic behavior.
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 <=
c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}

o-notation: Pronounced little-oh notation, it is used to denote an upper bound that is
not asymptotically tight. For a given function g(n), we denote by o(g(n)) the set of
functions o(g(n)) = { f(n): for any positive constant c > 0, there exists a constant n0 > 0
such that 0 <= f(n) < c*g(n) for all n >= n0 }.

ω-notation: Pronounced little-omega notation, it is used to denote a lower bound that is
not asymptotically tight. For a given function g(n), we denote by ω(g(n)) the set of
functions ω(g(n)) = { f(n): for any positive constant c > 0, there exists a constant n0 > 0
such that 0 <= c*g(n) < f(n) for all n >= n0 }.
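The constants c and n0 in these definitions can be checked numerically for small cases. The
following minimal Python sketch (the function names and the chosen witnesses c = 4, n0 = 3
are illustrative assumptions, not part of the original text) verifies that f(n) = 3n + 2 satisfies
the Big-O definition with g(n) = n:

# Minimal sketch: numerically checking the Big-O definition f(n) <= c*g(n) for n >= n0.
# The witnesses c = 4 and n0 = 3 are illustrative choices for f(n) = 3n + 2 and g(n) = n.

def f(n):
    return 3 * n + 2

def g(n):
    return n

def is_bounded(f, g, c, n0, n_max=1000):
    # Check 0 <= f(n) <= c*g(n) for every n in [n0, n_max]
    # (a finite sanity check, not a proof for all n).
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))

if __name__ == "__main__":
    print(is_bounded(f, g, c=4, n0=3))   # True, so f(n) = O(n) with these witnesses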

Basic Efficiency Classes (Growth of Functions):

constant <= log log n <= (log n)^k <= n^(1/2) <= n <= n log n <= n^(3/2) <=
n^2 <= n^2 log n <= n^3 <= n^(log n) <= 2^n <= 3^n <= n! <= n^n

Recurrences Relation.
A recurrence relation is a way of expressing the time complexity of an algorithm by
recursively defining the value of a function in terms of its previous values. Recurrence
relations are commonly used in algorithm design and analysis to understand the time
complexity of recursive algorithms or algorithms that can be broken down into subproblems.

For example
1. Recurrence relation for factorial:
T(n) = 1 for n = 0,  T(n) = 1 + T(n-1) for n > 0
(and not T(n) = 1 for n = 0, T(n) = n*T(n-1) for n > 0). Why?
We generally use a recurrence relation to find the time complexity of an
algorithm. Here, the function T(n) is not calculating the value of the
factorial; it is describing the time complexity of the factorial algorithm.
It means that finding the factorial of n takes one more operation than finding the
factorial of n-1.
2. Recurrence for Merge Sort:
To sort a given array using Merge Sort , we divide it in two halves and
recursively repeat the process for the two halves. Finally we merge the results.
So, the time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn.
There are many other algorithms like Binary Search, Tower of Hanoi, etc.

3. Recurrence for tower of hanoi is :


T(n)=2T(n-1)+1 for n>0;

Concept Rule Of Towers of Hanoi problem:


Tower of Hanoi is a mathematical puzzle where we have three poles and
n disks. The main goal of the puzzle is to move the entire stack to another pole,
obeying the following simple rules:
1. Only one disk can be moved at a time.
2. Each move consists of taking the upper disk from one of the stacks
and placing it on top of another stack i.e. a disk can only be moved if it is the
uppermost disk on a stack.
3. No disk may be placed on top of a smaller disk.
Solution of the Towers of Hanoi problem:
Begin with n disks on pole 1.
1. We can transfer the top n-1 disks from pole 1 to pole 3 using T(n-1) moves.
2. We keep the largest disk fixed during these moves.
3. Then, we use one move to transfer the largest disk to the second pole.
4. We can transfer the top n-1 disks from pole 3 to pole 2 using T(n-1) additional
moves, placing them on top of the largest disk, which always stays on the
bottom of pole 2.
Moreover, it is easy to see that the puzzle cannot be solved using fewer moves. This
shows that the recurrence relation capturing the optimal execution time of the Towers
of Hanoi problem with n discs is T(n) = 2T(n − 1) + 1.
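A minimal Python sketch of this procedure, counting moves to confirm that n disks take
2^n − 1 moves (the function and pole names are illustrative):

# Minimal sketch: recursive Towers of Hanoi, counting moves.
# For n disks the count equals 2**n - 1, matching T(n) = 2T(n-1) + 1 with T(0) = 0.

def hanoi(n, source, target, auxiliary, moves):
    if n == 0:
        return
    hanoi(n - 1, source, auxiliary, target, moves)   # move n-1 disks out of the way
    moves.append((source, target))                   # move the largest disk
    hanoi(n - 1, auxiliary, target, source, moves)   # move n-1 disks onto it

if __name__ == "__main__":
    moves = []
    n = 4
    hanoi(n, "pole1", "pole2", "pole3", moves)
    print(len(moves), 2**n - 1)   # 15 15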

There are several methods to solve recurrence relations. Here are some common methods:

1) Substitution Method: We make a guess for the solution and then we use
mathematical induction to prove the guess is correct or incorrect.
Examples :-
1. T(n) = 2T(n/2) + n; n>1
We guess the solution as T(n) = O(nLogn). Now we use induction
to prove our guess.
We need to prove that T(n) <= c*n*log n for some constant c.
T(n) = 2T(n/2) + n
     <= 2[c*(n/2)*log(n/2)] + n
     = c*n*log n - c*n*log 2 + n
     = c*n*log n + n(1 - c)
     <= c*n*log n          (for any c >= 1)
so T(n) = O(n log n)
___________________________________________________________________
2) Iterative Method :The iteration method is a "brute force" method of solving a
recurrence relation. The general idea is to iteratively substitute the value of the
recurrent part of the equation until a pattern (usually a summation) is noticed, at which
point the summation can be used to evaluate the recurrence relation.

Examples :-
Q1. T (n) = 1 if n=1

= 2T (n-1) if n>1

Solution: -

T(n) = 2T(n-1)

     = 2[2T(n-2)] = 2^2 * T(n-2)

     = 2^2[2T(n-3)] = 2^3 * T(n-3)

     = 2^3[2T(n-4)] = 2^4 * T(n-4)

Repeating the procedure i times:

T(n) = 2^i * T(n-i)          ... (Eq. 1)

Put n-i = 1, i.e. i = n-1, in (Eq. 1):

T(n) = 2^(n-1) * T(1)

     = 2^(n-1) * 1           {T(1) = 1 ..... given}

     = 2^(n-1)

Q2. T (n) = T (n-1) +1 and T (1) = θ (1)

Solution :-
T (n) = T (n-1) +1 = (T (n-2) +1) +1 = (T (n-3) +1) +1+1

= T (n-4) +4 = T (n-5) +1+4 = T (n-5) +5= T (n-k) + k

Where k = n-1

T (n-k) = T (1) = θ (1)

T (n) = θ (1) + (n-1) = 1+n-1=n= θ (n).

3) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate
the time taken by every level of tree. Finally, we sum the work done at all levels. To
draw the recurrence tree, we start from the given recurrence and keep drawing till we
find a pattern among levels. The pattern is typically an arithmetic or geometric series.
Examples :-
1. T(n) = 2T(n/2) + n^2
[Recursion tree figure omitted in the source: level i of the tree contributes
2^i * (n/2^i)^2 = n^2 / 2^i work, a decreasing geometric series, so the total work is Θ(n^2).]

2. T(n) = 4T(n/2) + n
[Recursion tree figure omitted in the source: level i of the tree contributes
4^i * (n/2^i) = 2^i * n work, so the bottom level dominates and the total work is Θ(n^2).]

3. [The recurrence and its recursion tree figure are omitted in the source.]
When we add the values across the levels of the recursion tree, we get a value of n
for every level; the number of levels is determined by the longest path from the root to a leaf.
_______________________________________________________________________________

4) Master Method: Master Method is a direct way to get the solution. The master
method works only for following types of recurrences or for recurrences that can be
transformed to the following type.
Master Theorem -

T(n) = aT(n/b) + f(n), where a >= 1 and b > 1

Here,
n = size of the input
a = number of subproblems in the recursion
n/b = size of each subproblem (all subproblems are assumed to have the same size)
f(n) = cost of the work done outside the recursive calls, which includes the cost of
dividing the problem and the cost of merging the solutions. f(n) is an asymptotically
positive function, i.e. for a sufficiently large value of n, f(n) > 0.
There are the following three cases:
1. If f(n) = O(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b a))
2. If f(n) = Θ(n^c) where c = log_b(a), then T(n) = Θ(n^c * log n)
3. If f(n) = Ω(n^c) where c > log_b(a), then T(n) = Θ(f(n))
Each of the above conditions can be interpreted as:
● If the cost of solving the sub-problems at each level increases by a certain factor,
the value of f(n) will be polynomially smaller than n^(log_b a). Thus, the time
complexity is dominated by the cost of the last level, i.e. n^(log_b a).
● If the cost of solving the sub-problems at each level is nearly equal, then the value
of f(n) will be Θ(n^(log_b a)). Thus, the time complexity will be f(n) times the total
number of levels, i.e. n^(log_b a) * log n.
● If the cost of solving the sub-problems at each level decreases by a certain factor,
the value of f(n) will be polynomially larger than n^(log_b a). Thus, the time
complexity is dominated by the cost of f(n).

Master Theorem Limitations


The master theorem cannot be used if:
➔ T(n) is not monotone, e.g. T(n) = sin n
➔ f(n) is not a polynomial, e.g. f(n) = 2^n or f(n) = n/log n
➔ a is not a constant, e.g. a = 2n
➔ a < 1
Examples of some standard algorithms whose time complexity can be
evaluated using the Master Method:
Merge Sort: T(n) = 2T(n/2) + Θ(n). It falls in case 2 as c is 1 and log_b(a) = log_2(2) is also 1.
So the solution is Θ(n log n).
Binary Search: T(n) = T(n/2) + Θ(1). It also falls in case 2 as c is 0 and log_b(a) = log_2(1) is
also 0. So the solution is Θ(log n).
Solved Example of Master Theorem
T(n) = 3T(n/2) + n^2. Here, a = 3, n/b = n/2, f(n) = n^2.
log_b(a) = log_2(3) ≈ 1.58 < 2, i.e. f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0.
Case 3 applies here. Thus, T(n) = Θ(f(n)) = Θ(n^2)
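A small Python sketch that applies the three cases above for recurrences of the form
T(n) = aT(n/b) + Θ(n^c) (the function name and returned strings are illustrative assumptions):

# Minimal sketch: classify T(n) = a*T(n/b) + Theta(n^c) using the three Master Theorem cases.
import math

def master_theorem(a, b, c):
    # Compare c with log_b(a); exact float equality is fine for these simple textbook cases.
    log_b_a = math.log(a, b)
    if c < log_b_a:
        return "Case 1: T(n) = Theta(n^%.3f)" % log_b_a
    elif c == log_b_a:
        return "Case 2: T(n) = Theta(n^%g * log n)" % c
    else:
        return "Case 3: T(n) = Theta(n^%g)" % c

if __name__ == "__main__":
    print(master_theorem(2, 2, 1))   # Merge Sort     -> Case 2: Theta(n^1 * log n)
    print(master_theorem(1, 2, 0))   # Binary Search  -> Case 2: Theta(n^0 * log n) = Theta(log n)
    print(master_theorem(3, 2, 2))   # Solved example -> Case 3: Theta(n^2)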

Master theorem for subtract-and-conquer recurrences -

Let T(n) be a function defined on positive n, having the property

T(n) = c                     for n <= 1
T(n) = a*T(n - b) + f(n)     for n > 1

for some constants c, a > 0, b > 0, d >= 0, and function f(n). If f(n) is in O(n^d), then

T(n) = O(n^d)                if a < 1
T(n) = O(n^(d+1))            if a = 1
T(n) = O(n^d * a^(n/b))      if a > 1

(The proof, which expands the recursion by one step, is omitted in the source.)

Remark: This theorem is written to reveal a similarity to the Master theorem. The first
case is there for the sake of similarity. It doesn't occur in algorithm analysis, since if a
is the number of recursive calls, a < 1 implies no recursive calls and no need for the
theorem. The second case arises often: insertion sort (T(n) = T(n-1) + n) and modexp()
are two examples at hand. The third case is not so common, but applies, for instance, to
the iconic Towers of Hanoi problem (T(n) = 2T(n-1) + 1).
__________________________________________________________________________
Sorting Algorithm -

Insertion sort :
It is a simple sorting algorithm that works similar to the way you sort playing cards in
your hands. The array is virtually split into a sorted and an unsorted part. Values from
the unsorted part are picked and placed at the correct position in the sorted part.
Algorithm
To sort an array of size n in ascending order:
1: Iterate from arr[1] to arr[n-1] over the array.
2: Compare the current element (key) to its predecessor.
3: If the key element is smaller than its predecessor, compare it to the elements
before. Move the greater elements one position up to make space for the swapped
element.

Example 1: 12, 11, 13, 5, 6


Let us loop for i = 1 (second element of the array) to 4 (last element of the array)
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in A[0..i-1] are smaller than 13
11, 12, 13, 5, 6
i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one
position ahead of their current position. 5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one position
ahead of their current position. 5, 6, 11, 12, 13

Example 2:

# Python program for implementation of Insertion Sort

# Function to do insertion sort
def insertionSort(arr):
    # Traverse through 1 to len(arr)
    for i in range(1, len(arr)):
        key = arr[i]
        # Move elements of arr[0..i-1] that are
        # greater than key one position ahead
        # of their current position
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

# Driver code to test above
arr = [12, 11, 13, 5, 6]
insertionSort(arr)
for i in range(len(arr)):
    print("%d" % arr[i])

Time Complexity: O(n^2) Auxiliary Space: O(1)


Boundary Cases: Insertion sort takes maximum time to sort if elements are sorted in
reverse order. And it takes minimum time (Order of n) when elements are already
sorted.
Algorithmic Paradigm: Incremental Approach

Selection Sort -

This algorithm sorts an array by repeatedly finding the minimum element (considering
ascending order) from the unsorted part and putting it at the beginning. The algorithm
maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending
order) from the unsorted subarray is picked and moved to the sorted subarray.
Following example explains the above steps:
arr[] = 64 25 12 22 11
// Find the minimum element in arr[0...4] and place it at beginning
11 25 12 22 64
// Find the minimum element in arr[1...4] and place it at beginning of arr[1...4]
11 12 25 22 64
// Find the minimum element in arr[2...4] and place it at beginning of arr[2...4]
11 12 22 25 64
// Find the minimum element in arr[3...4] and place it at beginning of arr[3...4]
11 12 22 25 64

# Python program for implementation of Selection Sort
A = [64, 25, 12, 22, 11]

# Traverse through all array elements
for i in range(len(A)):
    # Find the minimum element in the remaining
    # unsorted array
    min_idx = i
    for j in range(i + 1, len(A)):
        if A[min_idx] > A[j]:
            min_idx = j
    # Swap the found minimum element with
    # the first element
    A[i], A[min_idx] = A[min_idx], A[i]

# Driver code to test above
print("Sorted array")
for i in range(len(A)):
    print("%d" % A[i], end=" , ")

Time Complexity: O(n^2) as there are two nested loops. Auxiliary Space: O(1)
The good thing about selection sort is it never makes more than O(n) swaps and can
be useful when memory write is a costly operation.

Bubble Sort -
It is the simplest sorting algorithm that works by repeatedly swapping the adjacent
elements if they are in the wrong order.
Example:
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and
swaps since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5),
the algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed.
The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )

# Optimized Python program for implementation of Bubble Sort
def bubbleSort(arr):
    n = len(arr)
    # Traverse through all array elements
    for i in range(n):
        swapped = False
        # Last i elements are already in place
        for j in range(0, n - i - 1):
            # Traverse the array from 0 to n-i-1
            # Swap if the element found is greater
            # than the next element
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if swapped == False:
            break

# Driver code to test above
if __name__ == "__main__":
    arr = [64, 34, 25, 12, 22, 11, 90]
    bubbleSort(arr)
    print("Sorted array:")
    for i in range(len(arr)):
        print("%d" % arr[i], end=" ")

Worst and Average Case Time Complexity: O(n^2). Worst case occurs when the
array is reverse sorted.
Best Case Time Complexity: O(n). Best case occurs when the array is already
sorted.
Auxiliary Space: O(1)
Boundary Cases: Bubble sort takes minimum time (Order of n) when elements are
already sorted.

Divide and Conquer- Merge sort, Quick Sort, Heap Sort.

Merge Sort -
It is a Divide and Conquer algorithm. It divides the input array into two halves, calls
itself for the two halves, and then merges the two sorted halves. The merge()
function is used for merging two halves. The merge(arr, l, m, r) is a key process that
assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted subarrays
into one.
MergeSort(arr[], l, r)
If r > l
    1. Find the middle point to divide the array into two halves:
           middle m = l + (r - l)/2
    2. Call mergeSort for the first half:   Call mergeSort(arr, l, m)
    3. Call mergeSort for the second half:  Call mergeSort(arr, m+1, r)
    4. Merge the two halves sorted in steps 2 and 3:
           Call merge(arr, l, m, r)
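A minimal Python sketch of the same procedure (the helper name merge and the slicing-based
implementation are illustrative assumptions, not the only way to write the merge step):

# Minimal sketch: merge sort via divide and conquer.

def merge(left, right):
    # Merge two already-sorted lists into one sorted list.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    m = len(arr) // 2
    return merge(merge_sort(arr[:m]), merge_sort(arr[m:]))

if __name__ == "__main__":
    print(merge_sort([38, 27, 43, 3, 9, 82, 10]))   # [3, 9, 10, 27, 38, 43, 82]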

Time Complexity: Merge Sort is a recursive algorithm and its time complexity can be
expressed by the following recurrence relation: T(n) = 2T(n/2) + θ(n).
The above recurrence can be solved either using the Recurrence Tree method or the
Master method. It falls in case 2 of the Master Method and the solution of the recurrence
is θ(n log n). The time complexity of Merge Sort is θ(n log n) in all 3 cases (worst, average
and best) as merge sort always divides the array into two halves and takes linear time
to merge the two halves.
Auxiliary Space: O(n)   Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No in a typical implementation Stable: Yes
Applications of Merge Sort
1. Merge Sort is useful for sorting linked lists in O(nLogn) time. In the case of
linked lists, the case is different mainly due to the difference in memory
allocation of arrays and linked lists. Unlike arrays, linked list nodes may not be
adjacent in memory. Unlike an array, in the linked list, we can insert items in
the middle in O(1) extra space and O(1) time. Therefore, the merge operation
of merge sort can be implemented without extra space for linked lists.
In arrays, we can do random access as elements are contiguous in memory.
Let us say we have an integer (4-byte) array A and let the address of A[0] be x
then to access A[i], we can directly access the memory at (x + i*4). Unlike
arrays, we can not do random access in the linked list. Quick Sort requires a
lot of this kind of access. In a linked list to access i’th index, we have to travel
each and every node from the head to i’th node as we don’t have a continuous
block of memory. Therefore, the overhead increases for quicksort. Merge sort
accesses data sequentially and the need for random access is low.
2. Inversion Count Problem
3. Used in External Sorting
Drawbacks of Merge Sort
● Slower compared to other sorting algorithms for small inputs.
● The merge sort algorithm requires additional memory space of O(n) for the
temporary array.
● It goes through the whole process even if the array is sorted.

QuickSort -

It is a Divide and Conquer algorithm. It picks an element as pivot and partitions the
given array around the picked pivot. There are many different versions of quickSort
that pick pivot in different ways.
Always pick the first element as pivot.
Always pick last element as pivot (implemented below)
Pick a random element as pivot.
Pick median as pivot.

How does QuickSort work?


The key process in quickSort is partition(). Target of partitions is, given an array and
an element x of array as pivot, put x at its correct position in sorted array and put all
smaller elements (smaller than x) before x, and put all greater elements (greater than
x) after x. All this should be done in linear time.

Partition Algorithm
There can be many ways to do partition, following pseudo code adopts the method
given in CLRS book. The logic is simple: we start from the leftmost element and keep
track of the index of smaller (or equal to) elements as i. While traversing, if we find a
smaller element, we swap the current element with arr[i]. Otherwise we ignore the
current element.

/* low --> Starting index, high --> Ending index */
quickSort(arr[], low, high)
{
    if (low < high)
    {
        /* pi is partitioning index, arr[pi] is now
           at right place */
        pi = partition(arr, low, high);

        quickSort(arr, low, pi - 1);  // Before pi
        quickSort(arr, pi + 1, high); // After pi
    }
}

Pseudo code for partition()

/* This function takes the last element as pivot, places
   the pivot element at its correct position in the sorted
   array, and places all smaller elements (smaller than pivot)
   to the left of the pivot and all greater elements to the right
   of the pivot */
partition(arr[], low, high)
{
    // pivot (element to be placed at its right position)
    pivot = arr[high];

    i = (low - 1)  // Index of smaller element; indicates the
                   // right position of pivot found so far

    for (j = low; j <= high - 1; j++)
    {
        // If current element is smaller than the pivot
        if (arr[j] < pivot)
        {
            i++;  // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}

Illustration of partition() :

arr[] = {10, 80, 30, 90, 40, 50, 70}


Indexes: 0 1 2 3 4 5 6

low = 0, high = 6, pivot = arr[h] = 70


Initialize index of smaller element, i = -1

Traverse elements from j = low to high-1


j = 0 : Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])
i=0
arr[] = {10, 80, 30, 90, 40, 50, 70} // No change as i and j
// are same

j = 1 : Since arr[j] > pivot, do nothing


// No change in i and arr[]

j = 2 : Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])


i=1
arr[] = {10, 30, 80, 90, 40, 50, 70} // We swap 80 and 30

j = 3 : Since arr[j] > pivot, do nothing


// No change in i and arr[]

j = 4 : Since arr[j] <= pivot, do i++ and swap(arr[i], arr[j])


i=2
arr[] = {10, 30, 40, 90, 80, 50, 70} // 80 and 40 Swapped
j = 5 : Since arr[j] <= pivot, do i++ and swap arr[i] with arr[j]
i=3
arr[] = {10, 30, 40, 50, 80, 90, 70} // 90 and 50 Swapped

We come out of the loop because j has reached high-1 (the last index before the pivot).

Finally we place the pivot at its correct position by swapping
arr[i+1] and arr[high] (the pivot):
arr[] = {10, 30, 40, 50, 70, 90, 80} // 80 and 70 swapped

Now 70 is at its correct place. All elements smaller than 70 are before it and all elements
greater than 70 are after it.
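The same Lomuto-style partition scheme in Python, as a minimal sketch (the function names
mirror the pseudocode above):

# Minimal sketch: QuickSort with the last element as pivot (Lomuto partition).

def partition(arr, low, high):
    pivot = arr[high]          # element to be placed at its right position
    i = low - 1                # index of the last element known to be smaller than pivot
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]   # place pivot after the smaller elements
    return i + 1

def quick_sort(arr, low, high):
    if low < high:
        pi = partition(arr, low, high)
        quick_sort(arr, low, pi - 1)    # before pi
        quick_sort(arr, pi + 1, high)   # after pi

if __name__ == "__main__":
    a = [10, 80, 30, 90, 40, 50, 70]
    quick_sort(a, 0, len(a) - 1)
    print(a)   # [10, 30, 40, 50, 70, 80, 90]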

Analysis of QuickSort
Time taken by QuickSort, in general, can be written as follows.

T(n) = T(k) + T(n-k-1) + Θ(n)

The first two terms are for two recursive calls, the last term is for the partition process. k is
the number of elements which are smaller than pivot.

The time taken by QuickSort depends upon the input array and partition strategy. Following
are three cases.

Worst Case: The worst case occurs when the partition process always picks the greatest or
smallest element as pivot. If we consider the above partition strategy where the last element
is always picked as pivot, the worst case would occur when the array is already sorted in
increasing or decreasing order. Following is the recurrence for the worst case.
T(n) = T(0) + T(n-1) + Θ(n)
which is equivalent to
T(n) = T(n-1) + Θ(n)
The solution of the above recurrence is Θ(n^2).

Best Case: The best case occurs when the partition process always picks the middle
element as pivot. Following is a recurrence for best case.

T(n) = 2T(n/2) + Θ(n)

The solution of the above recurrence is Θ(n log n). It can be solved using case 2 of the
Master Theorem.

Average Case:
To do an average case analysis, we need to consider all possible permutations of the array
and calculate the time taken by every permutation, which is not easy. We can get an
idea of the average case by considering the case when the partition puts O(n/9) elements in
one set and O(9n/10) elements in the other set. Following is the recurrence for this case.

T(n) = T(n/9) + T(9n/10) + Θ(n)

The solution of the above recurrence is also O(n log n).


Although the worst case time complexity of QuickSort is O(n^2), which is more than many
other sorting algorithms like Merge Sort and Heap Sort, QuickSort is faster in practice,
because its inner loop can be efficiently implemented on most architectures, and in most
real-world data. QuickSort can be implemented in different ways by changing the choice of
pivot, so that the worst case rarely occurs for a given type of data. However, merge sort is
generally considered better when data is huge and stored in external storage.

Is QuickSort stable?
The default implementation is not stable. However any sorting algorithm can be made stable
by considering indexes as comparison parameters.

Is QuickSort In-place?
As per the broad definition of in-place algorithm it qualifies as an in-place sorting algorithm as
it uses extra space only for storing recursive function calls but not for manipulating the input.

What is 3-Way QuickSort?


In simple QuickSort algorithm, we select an element as pivot, partition the array around pivot
and recur for subarrays on left and right of pivot.
Consider an array which has many redundant elements. For example, {1, 4, 2, 4, 2, 4, 1, 2, 4,
1, 2, 2, 2, 2, 4, 1, 4, 4, 4}. If 4 is picked as pivot in Simple QuickSort, we fix only one 4 and
recursively process remaining occurrences. In 3 Way QuickSort, an array arr[l..r] is divided in
3 parts:
a) arr[l..i] elements less than pivot.
b) arr[i+1..j-1] elements equal to pivot.
c) arr[j..r] elements greater than pivot.

Algorithm          Time Complexity
                 Best          Average       Worst
Selection Sort   Ω(n^2)        θ(n^2)        O(n^2)
Bubble Sort      Ω(n)          θ(n^2)        O(n^2)
Insertion Sort   Ω(n)          θ(n^2)        O(n^2)
Heap Sort        Ω(n log n)    θ(n log n)    O(n log n)
Quick Sort       Ω(n log n)    θ(n log n)    O(n^2)
Merge Sort       Ω(n log n)    θ(n log n)    O(n log n)
Bucket Sort      Ω(n+k)        θ(n+k)        O(n^2)
Radix Sort       Ω(nk)         θ(nk)         O(nk)
Count Sort       Ω(n+k)        θ(n+k)        O(n+k)

Heap Sort -
Heap sort is a comparison-based sorting technique based on the Binary Heap data structure. It is
similar to selection sort: using a max-heap, we repeatedly find the maximum element and place
it at the end of the array, and repeat the same process for the remaining elements.

Since a Binary Heap is a Complete Binary Tree, it can be easily represented as an array and
the array-based representation is space-efficient. If the parent node is stored at index i, the
left child can be calculated by 2*i + 1 and the right child by 2*i + 2 (assuming the indexing
starts at 0).

Algorithm for “heapify”:

heapify(array)
    Root = array[0]
    Largest = largest(array[0], array[2 * 0 + 1], array[2 * 0 + 2])
    if (Root != Largest)
        Swap(Root, Largest)

Example of “heapify”:
        30(0)
       /     \
   70(1)     50(2)

Child (70(1)) is greater than the parent (30(0)),
so swap the child (70(1)) with the parent (30(0)):

        70(0)
       /     \
   30(1)     50(2)

Algorithm of Heap sort -
// first build a max-heap from the array (createAndBuildHeap), then
for i = n - 1 down to i >= 1:
    // take the largest number from the heap and place it at the end
    swap arr[0], arr[i]
    // heapify the remaining heap of size i, starting from index 0
    heapify(arr, i, 0)
end for
Time Complexity: Time complexity of heapify is O(Logn). Time complexity of
createAndBuildHeap() is O(n) and the overall time complexity of Heap Sort is O(nLogn).
Advantages of heapsort –
● Efficiency – The time required by heap sort grows only in proportion to n log n as the
number of items to sort increases, so it remains practical even for large inputs. This
sorting algorithm is very efficient.
● Memory Usage – Memory usage is minimal because apart from what is necessary
to hold the initial list of items to be sorted, it needs no additional memory space to
work
● Simplicity – It is simpler to understand than other equally efficient sorting
algorithms because it does not use advanced computer science concepts such as
recursion
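Putting the heapify and heap-sort steps above together, a minimal Python sketch using the
0-indexed array representation described above (function names are illustrative):

# Minimal sketch: heap sort using a max-heap stored in the array itself.

def heapify(arr, n, i):
    # Sift the element at index i down so the subtree rooted at i is a max-heap.
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    # Build a max-heap, starting from the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Repeatedly move the current maximum to the end and shrink the heap.
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

if __name__ == "__main__":
    a = [12, 11, 13, 5, 6, 7]
    heap_sort(a)
    print(a)   # [5, 6, 7, 11, 12, 13]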
Strassen’s Matrix Multiplication
In the general method, the main idea is to use the divide and conquer technique: divide
matrix A and matrix B into four submatrices each, compute the eight submatrix products,
and then recursively combine them to obtain the submatrices of C.

Consider the following matrices A and B:

A = | a  b |     B = | e  f |     and we know A*B = matrix C = | ae+bg  af+bh |
    | c  d |         | g  h |                                  | ce+dg  cf+dh |

There will be 8 recursive calls:
a*e, b*g, a*f, b*h, c*e, d*g, c*f, d*h
The above strategy is the basic O(N^3) strategy.
Using the Master Theorem with T(n) = 8T(n/2) + O(n^2) we still get a runtime of
O(n^3).

But Strassen came up with a solution where we don't need 8 recursive calls; the
product can be computed with only 7 calls and some extra addition and subtraction
operations. Strassen's 7 products (shown as a figure in the source) are the standard
combinations:
M1 = (a + d)(e + h)    M2 = (c + d)e        M3 = a(f - h)    M4 = d(g - e)
M5 = (a + b)h          M6 = (c - a)(e + f)  M7 = (b - d)(g + h)
and the result submatrices are combined as
C11 = M1 + M4 - M5 + M7,  C12 = M3 + M5,  C21 = M2 + M4,  C22 = M1 - M2 + M3 + M6.

The time complexity, using the Master Theorem with T(n) = 7T(n/2) + O(n^2), is
O(n^(log2 7)) ≈ O(n^2.8074), which is better than O(n^3).

Pseudocode of Strassen’s multiplication

● Divide matrix A and matrix B into 4 sub-matrices of size N/2 x N/2 each, as shown
above.
● Calculate the 7 matrix products M1 .. M7 recursively.
● Compute the submatrices of C from these products.
● Combine these submatrices into the result matrix C (a small sketch follows below).
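A minimal Python sketch of one level of Strassen's method for 2 x 2 matrices, checked against
the ordinary product (the function names are illustrative; a full implementation would apply
the same seven-product step recursively on N/2 x N/2 blocks):

# Minimal sketch: one level of Strassen's 7-multiplication scheme on 2x2 matrices.

def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

def naive_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return [[a * e + b * g, a * f + b * h],
            [c * e + d * g, c * f + d * h]]

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(strassen_2x2(A, B))   # [[19, 22], [43, 50]] using 7 multiplications
    print(naive_2x2(A, B))      # same result, using 8 multiplications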

Searching Algorithm -

Linear/Sequential Search -
A linear search is also known as a sequential search; it simply scans each element, one
at a time, from the beginning of the array or list, comparing it with the value being
searched for, without skipping any item. The worst-case complexity is O(n).

Algorithm:
Linear Search (Array A, Value x)

Step 1: Set i to 1
Step 2: if i > n, then jump to step 7
Step 3: if A[i] = x then jump to step 6
Step 4: Set i to i + 1
Step 5: Go to step 2
Step 6: Print element x found at index i and jump to step 8
Step 7: Print element not found
Step 8: Exit
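A minimal Python sketch of the steps above, using 0-based indexing (the function name is
illustrative):

# Minimal sketch: linear (sequential) search, returning the index of x or -1.

def linear_search(A, x):
    for i in range(len(A)):   # Steps 1-5: scan every position in turn
        if A[i] == x:         # Step 3: compare the current element with x
            return i          # Step 6: element found at index i
    return -1                 # Step 7: element not found

if __name__ == "__main__":
    print(linear_search([10, 50, 30, 70, 80], 70))   # 3
    print(linear_search([10, 50, 30, 70, 80], 25))   # -1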

Minimum/ Maximum, K-th smallest element

Binary Search -
A binary search is a search in which the middle element is examined to check
whether it is smaller or larger than the element being searched for. The main
advantage of using binary search is that it does not scan each element in the list.
Instead of scanning every element, it restricts the search to half of the list at each step.
So, binary search takes less time to search for an element than a linear search.
The worst-case cost of finding the element is O(log2 n).
The one pre-requisite of binary search is that the array should be in sorted order,
whereas linear search works on both sorted and unsorted arrays. The binary search
algorithm is based on the divide and conquer technique, which means that it
divides the array recursively.
There are three cases used in the binary search:
Case 1: data < a[mid], then right = mid - 1
Case 2: data > a[mid], then left = mid + 1
Case 3: data = a[mid] // element is found
Suppose we have an array of size 10, indexed from 0 to 9, and we want to search
for the element 70 in this array.
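A minimal Python sketch of these three cases on a sorted array (iterative form; the function
name is illustrative):

# Minimal sketch: iterative binary search on a sorted array, returning the index or -1.

def binary_search(a, data):
    left, right = 0, len(a) - 1
    while left <= right:
        mid = (left + right) // 2
        if data < a[mid]:        # Case 1: search the left half
            right = mid - 1
        elif data > a[mid]:      # Case 2: search the right half
            left = mid + 1
        else:                    # Case 3: element found
            return mid
    return -1

if __name__ == "__main__":
    a = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(binary_search(a, 70))   # 6
    print(binary_search(a, 35))   # -1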

Binary Search Tree -
Binary Search Tree is a node-based binary tree data structure which has the following
properties:
● The left subtree of a node contains only nodes with keys lesser than the node’s key.
● The right subtree of a node contains only nodes with keys greater than the node’s key.
● The left and right subtree each must also be a binary search tree.

Creating a Binary Search tree (O(n^2)): 45, 15, 79, 90, 10, 55, 12, 20, 50

Searching in the Binary Search tree: Suppose we have to find node 20 from the below
tree.

Search(root, item)
    if (item = root → data) or (root = NULL)
        return root
    else if (item < root → data)
        return Search(root → left, item)
    else
        return Search(root → right, item)
END
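A minimal Python sketch of a BST node together with the search above and the insert
operation used when creating the tree (class and function names are illustrative):

# Minimal sketch: binary search tree with insert and search.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                      # duplicate keys are ignored

def search(root, item):
    if root is None or root.key == item:
        return root
    if item < root.key:
        return search(root.left, item)
    return search(root.right, item)

if __name__ == "__main__":
    root = None
    for k in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
        root = insert(root, k)
    print(search(root, 20) is not None)   # True
    print(search(root, 99) is not None)   # False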

Deletion in Binary Search tree:


To delete a node from BST, there are three possible situations occur -
● The node to be deleted is the leaf node, or,
● The node to be deleted has only one child, and,
● The node to be deleted has two children
The node to be deleted is the leaf node

The node to be deleted has only one child

The node to be deleted has two children

Insertion in Binary Search tree:

Operation    Best case time    Average case time    Worst case time
             complexity        complexity           complexity
Insertion    O(log n)          O(log n)             O(n)
Deletion     O(log n)          O(log n)             O(n)
Search       O(log n)          O(log n)             O(n)

The space complexity of all operations of a Binary Search tree is O(n).

AVL tree:Traversal and Related Properties.


AVL Tree can be defined as a height balanced binary search tree in which each node
is associated with a balance factor.
Balance Factor (k) = height (left(k)) - height (right(k))

Insertion and deletion in an AVL tree is performed in the same way as it is performed in a
binary search tree. However, it may lead to violation in the AVL tree property and therefore
the tree may need balancing. The tree can be balanced by applying rotations.

Algorithm    Average case    Worst case
Space        O(n)            O(n)
Search       O(log n)        O(log n)
Insert       O(log n)        O(log n)
Delete       O(log n)        O(log n)


We perform rotation in the AVL tree only in case the Balance Factor is other than -1, 0, and 1.
There are basically four types of rotations which are as follows:
1. L L rotation: Inserted node is in the left subtree of left subtree of A

2. R R rotation : Inserted node is in the right subtree of right subtree of A

3. L R rotation : Inserted node is in the right subtree of left subtree of A:


LR rotation = RR rotation + LL rotation, i.e, First RR rotation is performed on
subtree and then LL rotation is performed on full tree, by full tree we mean the first
node from the path of inserted nodes whose balance factor is other than -1, 0, or 1.

4. R L rotation : Inserted node is in the left subtree of right subtree of A:


R L rotation = LL rotation + RR rotation, i.e, First LL rotation is performed on
subtree and then RR rotation is performed on full tree, by full tree we mean the first
node from the path of inserted node whose balance factor is other than -1, 0, or 1.

H, I, J, B, A, E, C, F, D, G, K, L
1. Insert H, I, J

On inserting the above elements, especially in the case of H, the BST becomes unbalanced
as the Balance Factor of H is -2. Since the BST is right-skewed, we will perform RR Rotation
on node H. It results in a balanced tree.
2. Insert B, A

On inserting the above elements, especially in the case of A, the BST becomes unbalanced as
the Balance Factor of H and I is 2. We consider the first unbalanced node from the last inserted
node, i.e. H. Since the BST rooted at H is left-skewed, we will perform an LL Rotation on node H.
It results in a balanced tree.
3. Insert E

On inserting E, BST becomes unbalanced as the Balance Factor of I is 2, since if we travel


from E to I we find that it is inserted in the left subtree of right subtree of I, we will perform LR
Rotation on node I. LR = RR + LL rotation
3 a) We first perform RR rotation on node B,
3b) Then we perform LL rotation on the node I, it results in a balanced tree.
4. Insert C, F, D

On inserting C, F, D, BST becomes unbalanced as the Balance Factor of B and H is -2, since if
we travel from D to B we find that it is inserted in the right subtree of left subtree of B, we will
perform RL Rotation on node I. RL = LL + RR rotation.
4a) We first perform LL rotation on node E
4b) We then perform RR rotation on node B

5. Insert G

On inserting G, BST become unbalanced as the Balance Factor of H is 2, since if we travel


from G to H, we find that it is inserted in the left subtree of right subtree of H, we will perform
LR Rotation on node I. LR = RR + LL rotation.
5 a) We first perform RR rotation on node C
The resultant tree after RR rotation is:

5 b) We then perform LL rotation on node H


6. Insert K

On inserting K, BST becomes unbalanced as the Balance Factor of I is -2. Since the BST is
right-skewed from I to K, hence we will perform RR Rotation on the node I.
7. Insert L
On inserting the L tree is still balanced as the Balance Factor of each node is now either, -1, 0,
+1. Hence the tree is a Balanced AVL tree

—------------------------------------------------------------------------------------------------
Binomial Heaps -
A Binomial Heap is a collection of Binomial Trees
Binomial Tree -
A Binomial Tree of order 0 has 1 node. A Binomial Tree of order k can be constructed by
taking two binomial trees of order k-1 and making one the leftmost child of the other.
A Binomial Tree of order k has the following properties.
a) It has exactly 2^k nodes.
b) It has depth k.
c) There are exactly C(k, i) nodes at depth i for i = 0, 1, . . . , k.
d) The root has degree k and the children of the root are themselves Binomial Trees of
order k-1, k-2, .., 0 from left to right.

Binomial Heap:
A Binomial Heap is a set of Binomial Trees where each Binomial Tree follows the Min Heap
property. And there can be at most one Binomial Tree of any degree.
Example of a Binomial Heap:

12 ---------- 10 -------------------- 20
             /  \                   /  |  \
           15    50               70   50   40
           |                     / |         |
           30                  80  85        65
                               |
                              100

A Binomial Heap with 13 nodes. It is a collection of 3 Binomial Trees of orders 0, 2 and 3
from left to right.
Complexity -
Decrease key -> Decreases an existing key to some value - Θ(log n)
Delete -> Deletes a node given a reference to the node - Θ(log n)
Extract minimum -> Removes and returns the minimum value given a reference to the node - Θ(log n)
Find minimum -> Returns the minimum value - O(log n)
Insert -> Inserts a new value - O(log n)
Union -> Combines the heap with another to form a valid binomial heap - Θ(log n)
—---------------------------------------------------------------------------------------------------------------
Fibonacci Heaps -

In terms of Time Complexity, Fibonacci Heap beats both Binary and Binomial Heaps.
Below are amortized time complexities of Fibonacci Heap.
1) Find Min: Θ(1) [Same as both Binary and Binomial]
2) Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]
3) Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]
4) Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]
5) Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and
Θ(Log n) in Binomial]
Like Binomial Heap, Fibonacci Heap is a collection of trees with min-heap or max-heap
property. In Fibonacci Heap, trees can have any shape even all trees can be single
nodes (This is unlike Binomial Heap where every tree has to be Binomial Tree).
Below is an example Fibonacci Heap -

Fibonacci Heap maintains a pointer to the minimum value (which is the root of a tree). All tree
roots are connected using a circular doubly linked list, so all of them can be accessed using
a single ‘min’ pointer.
The main idea is to execute operations in a “lazy” way. For example, the merge operation
simply links two heaps, insert operation simply adds a new tree with a single node. The
operation extract minimum is the most complicated operation. It does delayed work of
consolidating trees. This makes delete also complicated as delete first decreases the key to
minus infinite, then calls extract minimum.
Below are some interesting facts about Fibonacci Heap
1. The reduced time complexity of Decrease-Key has importance in Dijkstra and Prim
algorithms. With Binary Heap, the time complexity of these algorithms is O(VLogV
+ ELogV). If Fibonacci Heap is used, then time complexity is improved to O(VLogV
+ E)
2. Although Fibonacci Heap looks promising time complexity wise, it has been found
slow in practice as hidden constants are high (Source Wiki).
3. Fibonacci heaps are mainly called so because Fibonacci numbers are used in the
running time analysis. Also, every node in a Fibonacci Heap has degree at most
O(log n) and the size of a subtree rooted in a node of degree k is at least F(k+2),
where F(k) is the k-th Fibonacci number.
—-----------------------------------------------------------------------------------------------------------------------

Data structure for Disjoints sets -


The efficiency of an algorithm sometimes depends on the data structure that is used. An
efficient data structure, like the disjoint-set-union, can reduce the execution time of an
algorithm.
Let’s say there are 5 people A, B, C, D, and E.
A is B's friend, B is C's friend, and D is E's friend, therefore, the following is true:
1. A, B, and C are connected to each other
2. D and E are connected to each other
You have to perform two operations:
1. Union(A, B): Connect two elements A and B
2. Find(A, B): Find whether the two elements A and B are connected
Assumption
A and B objects are connected only if arr[ A ] = arr[ B ].
Implementation
To implement the operations of union and find, do the following:
1. Find(A, B): Check whether arr[ A ] = arr[ B ]
2. Union(A, B): Connect A to B and merge the components that comprise A and B by
replacing elements that have a value of arr[ A ] with the value of arr[ B ].
Initially there are 10 subsets and each subset has one element in it.

When each subset contains only one element, the array arr is as follows:

Let’s perform some operations.


1. Union(2, 1)

The arr is as follows:


1. Union(4, 3)
2. Union(8, 4)
3. Union(9, 3)

The arr is as follows:

1. Union(6, 5)

The arr is as follows:
After performing some operations of Union (A ,B), there are now 5 subsets as follows:
1. First subset comprises the elements {3, 4, 8, 9}
2. Second subset comprises the elements {1, 2}
3. Third subset comprises the elements {5, 6}
4. Fourth subset comprises the elements {0}
5. Fifth subset comprises the elements {7}
The elements of a subset, which are connected to each other directly or indirectly, can be
considered as the nodes of a graph. Therefore, all these subsets are called connected
components.
The union-find data structure is useful in graphs for performing various operations like
connecting nodes, finding connected components etc.
Let’s perform some find(A, B) operations.
1. Find(0, 7): 0 and 7 are disconnected, and therefore, you will get a false result
2. Find(8, 9): Although 8 and 9 are not connected directly, there is a path that connects
both the elements, and therefore, you will get a true result
Implementation
Approach A
Initially there are N subsets containing one element in each subset. Therefore, to initialize the
array use the initialize () function.
void initialize( int Arr[ ], int N)
{
    for(int i = 0; i < N; i++)
        Arr[ i ] = i;
}

//returns true if A and B are connected, else returns false
bool find( int Arr[ ], int A, int B)
{
    if(Arr[ A ] == Arr[ B ])
        return true;
    else
        return false;
}

//change all entries from arr[ A ] to arr[ B ].
void union(int Arr[ ], int N, int A, int B)
{
    int TEMP = Arr[ A ];
    for(int i = 0; i < N; i++)
    {
        if(Arr[ i ] == TEMP)
            Arr[ i ] = Arr[ B ];
    }
}

Time complexity (of this approach)
As the loop in the union function iterates through all the N elements for connecting two
elements, performing this operation on N objects will take O(N^2) time, which is quite
inefficient.
Approach B
Let’s try another approach.
Idea
Arr[ A ] is a parent of A.
Consider the root element of each subset, which is only a special element in that subset
having itself as the parent. Assume that R is the root element, then arr[ R ] = R.
For more clarity, consider the subset S = {0, 1, 2, 3, 4, 5}
Initially each element is the root of itself in all the subsets because arr[ i ] = i, where i is the
element in the set. Therefore root(i) = i.

Performing union(1, 0) will connect 1 to 0 and will set root (0) as the parent of root (1). As
root(1) = 1 and root(0) = 0, the value of arr[ 1 ] will change from 1 to 0. Therefore, 0 will be
the root of the subset that contains the elements {0, 1}.

Performing union (0, 2), will indirectly connect 0 to 2 by setting root(2) as the parent of
root(0). As root(0) is 0 and root(2) is 2, it will change the value of arr[ 0 ] from 0 to 2.
Therefore, 2 will be the root of the subset that contains the elements {2, 0, 1}.

Performing union (3, 4) will indirectly connect 3 to 4 by setting root(4) as the parent of root(3).
As root(3) is 3 and root(4) is 4, it will change the value of arr[ 3 ] from 3 to 4. Therefore, 4 will
be the root of the subset that contains the elements {3, 4}.

Performing union (1, 4) will indirectly connect 1 to 4 by setting root(4) as the parent of root(1).
As root(4) is 4 and root(1) is 2, it will change the value of arr[ 2 ] from 2 to 4. Therefore, 4 will
be the root of the set containing elements {0, 1, 2, 3, 4}.

After each step, you will see the change in the array arr also.
After performing the required union(A, B) operations, you can perform the find(A, B)
operation easily to check whether A and B are connected. This can be done by calculating
the roots of both A and B. If the roots of A and B are the same, then it means that both A and
B are in the same subset and are connected.
Calculating the root of an element
Arr[ i ] is the parent of i (where i is the element of the set). The root of i is Arr[ Arr[ Arr[
…...Arr[ i ]...... ] ] ] until arr[ i ] is not equal to i. You can run a loop until you get an element
that is a parent of itself.
Note This can only be done when there is no cycle in the elements of the subset, else the
loop will run infinitely.
1. Find(1, 4): 1 and 4 have the same root i.e. 4. Therefore, it means that they are
connected and this operation will give the result True.
2. Find(3, 5): 3 and 5 do not have the same root because root(3) is 4 and root(5) is 5.
This means that they are not connected and this operation will give the result False.
Implementation
Initially each element is a parent of itself, which can be done by using the initialize function as
discussed above.
//finding root of an element
int root(int Arr[ ], int i)
{
    while(Arr[ i ] != i)   //chase parent of current element until it reaches the root
    {
        i = Arr[ i ];
    }
    return i;
}

/*modified union function where we connect the elements by changing the root of one of
the elements*/
void union(int Arr[ ], int A, int B)
{
    int root_A = root(Arr, A);
    int root_B = root(Arr, B);
    Arr[ root_A ] = root_B;   //setting parent of root(A) as root(B)
}

bool find(int Arr[ ], int A, int B)
{
    if( root(Arr, A) == root(Arr, B) )   //if A and B have the same root, they are connected
        return true;
    else
        return false;
}
In the worst case, this idea will also take linear time in connecting 2 elements and
determining (finding) whether two elements are connected. Another disadvantage is that
while connecting two elements, which subset has more elements is not checked. This may
sometimes create a big problem because you will have to perform approximately linear time
operations.

To avoid this, track the size of each subset and while connecting two elements connect the
root of each subset that has a smaller number of elements to the root of each subset that has
a larger number of elements.
Example
If you want to connect 1 and 5, then connect the root of subset A (the subset that contains 1)
to the root of subset B ( the subset that contains 5) because subset A contains less number
of elements than subset B.

It will balance the tree formed by performing the operations discussed above. This is known
as weighted-union operation .
Implementation
Initially the size of each subset will be one because each subset will have only one element.
You can initialize it in the initialize function discussed above. The size[ ] array will
keep track of the size of each subset.
//modified initialize function:
void initialize( int Arr[ ], int size[ ], int N)
{
    for(int i = 0; i < N; i++)
    {
        Arr[ i ] = i;
        size[ i ] = 1;
    }
}
The root() and find() functions will be the same as discussed above .
The union function will be modified because the two subsets will be connected based on the
number of elements in each subset.
//modified union function
void weighted_union(int Arr[ ], int size[ ], int A, int B)
{
    int root_A = root(Arr, A);
    int root_B = root(Arr, B);
    if(size[ root_A ] < size[ root_B ])
    {
        Arr[ root_A ] = Arr[ root_B ];
        size[ root_B ] += size[ root_A ];
    }
    else
    {
        Arr[ root_B ] = Arr[ root_A ];
        size[ root_A ] += size[ root_B ];
    }
}
Example
You have a set S = {0, 1, 2, 3, 4, 5} Initially all the subsets have a single element and each
element is a root of itself. Initially size[ ] array will be as follows:

Perform union(0, 1). Here, you can connect any root of any element with the root of another
element because both the element’s subsets are of the same size. After the roots are
connected, the respective sizes will be updated. If you connect 1 to 0 and make 0 the root,
then the size of 0 will change from 1 to 2.

While performing union(1, 2), connect root(2) with root(1) because the subset of 2 has fewer
elements than the subset of 1.

Similarly, in union(3, 2), connect root(3) to root(2) because the subset of 3 has fewer
elements than the subset of 2.

Maintaining a balanced tree reduces the complexity of the union-find operations from O(N) to O(log2 N).
Idea for improving this approach further
Union with path compression
While computing the root of A, set each i to point to its grandparent (thereby halving the
length of the path), where i is the node that comes in the path while computing the root of A.
//modified root function

int root(int Arr[ ], int i)
{
    while(Arr[ i ] != i)
    {
        Arr[ i ] = Arr[ Arr[ i ] ];   //point i to its grandparent (path compression)
        i = Arr[ i ];
    }
    return i;
}
When you use the weighted-union operation with path compression, each union or find
operation takes O(log* N) time, where N is the number of elements in the set.
log* N (the iterated logarithm) is the number of times you have to take the logarithm of
N before the value of N reaches 1.
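Collecting the ideas above (union by size plus path compression), a minimal Python sketch of
the disjoint-set data structure (class and method names are illustrative):

# Minimal sketch: disjoint-set (union-find) with union by size and path compression.

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))   # initially every element is its own root
        self.size = [1] * n

    def root(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]   # point i to its grandparent
            i = self.parent[i]
        return i

    def union(self, a, b):
        ra, rb = self.root(a), self.root(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:      # attach the smaller tree under the larger
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def find(self, a, b):
        return self.root(a) == self.root(b)

if __name__ == "__main__":
    ds = DisjointSet(10)
    for a, b in [(2, 1), (4, 3), (8, 4), (9, 3), (6, 5)]:
        ds.union(a, b)
    print(ds.find(0, 7))   # False
    print(ds.find(8, 9))   # True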
—------------------------------------------------------------------------------------------------------------------------

1. Which one of the following statements is TRUE for all positive functions f(n)?
[GATE CSE 2022]
(A) f(n^2) = θ(f(n)^2), when f(n) is a polynomial
(B) f(n^2) = o(f(n)^2)
(C) f(n^2) = O(f(n)^2), when f(n) is an exponential function
(D) f(n^2) = Ω(f(n)^2)
Solution: Correct answer is (A)

2. For parameters a and b, both of which are ω(1), T(n) = T(n^(1/a)) + 1, and T(b) = 1. Then
T(n) is [GATE CSE 2020]
(A) θ(log_a log_b n)  (B) θ(log_ab n)  (C) θ(log_b log_a n)  (D) θ(log_2 log_2 n)
Solution: Correct answer is (A)

3. Which one of the following is the recurrence equation for the worst-case time
complexity of the Quicksort algorithm for sorting (n ≥ 2) numbers? In the recurrence
equations given in the options below, c is a constant. [GATE CSE 2015]
(A) T(n) = 2T(n/2) + cn (B) T(n) = T(n-1) + T(0) + cn
(C) T(n) = 2T(n-1) + cn (D) T(n) = T(n/2) + cn
Solution: Correct answer is (B)

4. An unordered list contains n distinct elements. The number of comparisons to find


an element in this list that is neither maximum nor minimum is [GATE CSE 2015]
(A) θ(n log n) (B) θ(n) (C) θ(log n) (D) θ(1)
Solution: Correct answer is (D)
.
5. Consider the following array of elements: (89,19,50,17,12,15,2,5,7,11,6,9,100). The
minimum number of interchanges needed to convert it into a max-heap is [GATE CSE
2015]
(A) 4 (B) 5 (C) 2 (D) 3
Solution: Correct answer is (D)
6. The tightest lower bound on the number of comparisons, in the worst case, for
comparison-based sorting is of the order of [GATE CSE 2004]
(A) n  (B) n^2  (C) n log n  (D) n log^2 n
Solution: Correct answer is (C)
7. A sorting technique is called stable if: [GATE CSE 1999]
(A) It takes O(n log n) time
(B) It maintains the relative order of occurrence of non-distinct elements.
(C) It uses divide and conquers paradigm
(D) It takes O(n) space
Solution: Correct answer is (B)

8. For merging two sorted lists of sizes m and n into a sorted list of size m+n, we
require comparisons of [GATE CSE 1995]
(A) O(m) (B) O(n) (C) O(m+n) (D) O(log m + log n)
Solution: Correct answer is (C)

Assignment -1
1. Apply Master's theorem to find the complexity of the following recurrence relations:
i) T(n) = 3T(n/2) + n^2
ii) T(n) = 4T(n/2) + n^2
iii) T(n) = T(n/2) + n^2
iv) T(n) = 2^n T(n/2) + n^n
2. Write a Quick sort algorithm and compute its best and worst case time
complexity. Sort 7, 9, 2, 3, 15, 1, 12, 30 in ascending order using quick sort.
3. Solve using the Recursion Tree method: T(n) = 3T(n/4) + n^2
4. Solve the following recurrence relations using the substitution method. Also,
prove your answer using the iteration method.
a. T(n) = 3T(n/3) + n/log n
b. T(n) = T(n/2) + T(n/4) + T(n/8) + n
5. Examine the time complexity of the equations:
a. T(n) = T(√n) + O(1)
b. T(n) = 7T(n/2) + n^3
6. Write a merge sort algorithm and compute its best, worst case time complexity.
Sort L,U,C,K,N,O,W in alphabetical order using merge sort.
7. Sort given array
A={27,46,11,95,67,32,78}
Using insertion sort algorithm. Also perform best case and worst case analysis
of insertion sort algorithm.
8. Describe in detail the strassen’s matrix multiplication algorithm based on divide
and conquer strategies with suitable examples.
9. Sort the following sequence {25, 57, 48, 36, 12, 91, 86, 32} using heap sort.
10. Write short notes on Binomial heap.
