Module 1 DAA
What is an Algorithm?
An algorithm is a finite set of steps for solving a problem by performing calculation, data processing, and automated reasoning tasks. An algorithm is an effective method that can be expressed within a finite amount of time and space.
Analysis of Algorithms
Usually, the analysis of an algorithm (the time required by it) falls under three types −
Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.
Designing of algorithms
An algorithm design technique means a unique approach or mathematical method for
creating algorithms and solving problems. While multiple algorithms can solve a
problem, not all algorithms can solve it efficiently. Therefore, we must create
algorithms using a suitable algorithm design method based on the nature of the
problem. An algorithm created with the right design technique can solve the problem
much more efficiently with respect to the computational power required.
The algorithms can be classified in various ways. They are:
1. Implementation Method
2. Design Method
3. Other Classifications
Classification by Implementation Method: In this type of classification, algorithms fall primarily into three main categories:
1. Recursion or Iteration: A recursive algorithm is one that calls itself repeatedly until a base condition is reached, whereas iterative algorithms use loops and/or data structures such as stacks and queues to solve a problem. Every recursive solution can be implemented as an iterative solution and vice versa.
Example: The Tower of Hanoi is implemented in a recursive fashion while
the Stock Span problem is implemented iteratively.
2. Exact or Approximate: Algorithms that are capable of finding an optimal solution to a problem are known as exact algorithms. For problems where it is not feasible to find the most optimized solution, an approximation algorithm is used: it finds a result that is guaranteed to be close to the optimal solution.
Example: For NP-Hard problems, approximation algorithms are used. Sorting algorithms are exact algorithms.
3. Serial or Parallel or Distributed Algorithms: In serial algorithms, one instruction is executed at a time, while in parallel algorithms the problem is divided into subproblems that are executed on different processors. If a parallel algorithm is distributed across different machines, it is known as a distributed algorithm.
Classification by Design Method: In this type of classification, algorithms are grouped by the design strategy they use:
1. Greedy Method: In the greedy method, at each step, a decision is made to
choose the local optimum, without thinking about the future consequences.
Example: Fractional Knapsack, Activity Selection.
2. Divide and Conquer: The Divide and Conquer strategy involves dividing the problem into sub-problems, recursively solving them, and then recombining their solutions for the final answer.
Example: Merge sort, Quicksort.
3. Dynamic Programming: The approach of Dynamic Programming is similar to divide and conquer. The difference is that whenever we have recursive function calls with the same result, instead of computing them again we store the result in a data structure in the form of a table and retrieve the results from the table. Thus, the overall time complexity is reduced. “Dynamic” means we dynamically decide whether to call a function or retrieve values from the table. (A short memoization sketch follows this list.)
Example: 0-1 Knapsack, subset-sum problem.
4. Linear Programming: In Linear Programming, we maximize or minimize a linear function of the inputs subject to linear inequality constraints.
Example: Maximum flow in a directed graph.
5. Reduction (Transform and Conquer): In this method, we solve a difficult
problem by transforming it into a known problem for which we have an
optimal solution. Basically, the goal is to find a reducing algorithm whose
complexity is not dominated by the resulting reduced algorithms.
Example: Selection algorithm for finding the median in a list involves first
sorting the list and then finding out the middle element in the sorted list.
These techniques are also called transform and conquer.
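To make the dynamic programming idea above concrete, here is a minimal memoized Fibonacci sketch in Python (illustrative; the function and values are not from the original notes):

from functools import lru_cache

@lru_cache(maxsize=None)     # the cache plays the role of the results table
def fib(n):
    if n <= 1:               # base cases
        return n
    return fib(n - 1) + fib(n - 2)   # repeated subproblems are fetched from the cache

print(fib(40))   # 102334155, computed with O(n) distinct calls instead of exponentially many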
Other Classifications: Apart from classifying the algorithms into the above broad
categories, the algorithm can be classified into other broad categories like:
1. Randomized Algorithms: Algorithms that make random choices for faster
solutions are known as randomized algorithms.
Example: Randomized Quicksort Algorithm.
2. Classification by complexity: Algorithms are classified on the basis of the time they take to find a solution, as a function of the input size. This analysis is known as time complexity analysis.
Example: Some algorithms take O(n) time, while some take exponential time.
3. Classification by Research Area: In CS each field has its own problems
and needs efficient algorithms.
Example: Sorting Algorithm, Searching Algorithm, Machine Learning etc.
4. Branch and Bound, Enumeration and Backtracking: These are mostly used in Artificial Intelligence.
A useful ordering of common growth rates, from slowest-growing to fastest-growing:
constant <= log log n <= (log n)^k <= n^(1/2) <= n <= n log n <= n^(3/2) <= n^2 <= n^2 log n <= n^3 <= n^(log n) <= 2^n <= 3^n <= n! <= n^n
Recurrence Relations
A recurrence relation is a way of expressing the time complexity of an algorithm by
recursively defining the value of a function in terms of its previous values. Recurrence
relations are commonly used in algorithm design and analysis to understand the time
complexity of recursive algorithms or algorithms that can be broken down into subproblems.
For example
1. The recurrence relation for factorial is:
T(n) = 1 for n = 0, T(n) = 1 + T(n-1) for n > 0
(not T(n) = 1 for n = 0, T(n) = n*T(n-1) for n > 0). Why?
We generally use a recurrence relation to find the time complexity of an algorithm. Here, the function T(n) does not calculate the value of the factorial; it describes the time complexity of the factorial algorithm. It means that finding the factorial of n takes one more operation than finding the factorial of n-1 (see the sketch after example 2).
2. Recurrence for Merge Sort:
To sort a given array using Merge Sort, we divide it into two halves and recursively repeat the process for the two halves. Finally, we merge the results. So the time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn.
There are many other algorithms, like Binary Search and Tower of Hanoi, that can be analyzed the same way.
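Returning to example 1, a minimal Python sketch (illustrative, not from the original notes) makes the correspondence explicit: each call does constant work plus one recursive call, giving T(n) = 1 + T(n-1), which unrolls to T(n) = n + 1 = O(n).

def factorial(n):
    if n == 0:                        # T(0) = 1: one unit of work
        return 1
    return n * factorial(n - 1)       # T(n) = 1 + T(n-1)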
There are several methods to solve recurrence relations. Here are some common methods:
1) Substitution Method: We make a guess for the solution and then use mathematical induction to prove whether the guess is correct.
Examples :-
1. T(n) = 2T(n/2) + n; n > 1
We guess the solution as T(n) = O(n Log n). Now we use induction to prove our guess.
We need to prove that T(n) <= c n Log n for some constant c.
T(n) = 2T(n/2) + n
     <= 2[c(n/2) Log(n/2)] + n
     = cn Log n - cn Log 2 + n
     = cn Log n - cn + n
     = cn Log n + n(1 - c)
     <= cn Log n        (for any c >= 1)
Hence T(n) = O(n Log n).
___________________________________________________________________
2) Iterative Method: The iteration method is a "brute force" method of solving a
recurrence relation. The general idea is to iteratively substitute the value of the
recurrent part of the equation until a pattern (usually a summation) is noticed, at which
point the summation can be used to evaluate the recurrence relation.
Examples :-
Q1. T(n) = 1        if n = 1
    T(n) = 2T(n-1)  if n > 1
Solution:
T(n) = 2T(n-1) = 2[2T(n-2)] = 2^2 T(n-2) = ... = 2^k T(n-k)
Where k = n-1:
T(n) = 2^(n-1) T(1) = 2^(n-1)

Q2. T(n) = T(n-1) + 1, with T(1) = 1
Solution:
T(n) = T(n-1) + 1 = (T(n-2) + 1) + 1 = (T(n-3) + 1) + 1 + 1 = ... = T(n-k) + k
Where k = n-1:
T(n) = T(1) + (n-1) = n = O(n)
3) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate
the time taken by every level of tree. Finally, we sum the work done at all levels. To
draw the recurrence tree, we start from the given recurrence and keep drawing till we
find a pattern among levels. The pattern is typically an arithmetic or geometric series.
Examples :-
1. T(n) = 2T(n/2) + n^2
The recursion tree for the above recurrence (figure not reproduced):
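Summing the tree level by level (a standard computation, supplied in place of the missing figure): level i has 2^i nodes, each costing (n/2^i)^2, so the work at level i is 2^i * (n/2^i)^2 = n^2 / 2^i. Therefore
T(n) = n^2 (1 + 1/2 + 1/4 + ...) <= 2n^2 = O(n^2).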
2. T(n) = 4T(n/2) + n
The recursion tree for the above recurrence (figure not reproduced):
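Summing by level again (a standard computation, in place of the missing figure): level i has 4^i nodes, each costing n/2^i, so the work at level i is 4^i * (n/2^i) = 2^i n. With about log2 n levels, the last level dominates:
T(n) = n (1 + 2 + 4 + ... + 2^(log n)) = O(n * 2^(log n)) = O(n^2).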
3. (The recurrence for this example was given in a figure that is not reproduced here.)
The given recurrence has a recursion tree in which, when we add the values across the levels, we get a value of n for every level; the longest path from the root to a leaf then determines the number of levels and hence the total.
_______________________________________________________________________________
4) Master Method: The Master Method is a direct way to get the solution. It works only for the following type of recurrence, or for recurrences that can be transformed into this type:
T(n) = aT(n/b) + f(n), where a >= 1 and b > 1
Case 1: if f(n) = O(n^(log_b a - e)) for some constant e > 0, then T(n) = θ(n^(log_b a)).
Case 2: if f(n) = θ(n^(log_b a)), then T(n) = θ(n^(log_b a) Log n).
Case 3: if f(n) = Ω(n^(log_b a + e)) for some constant e > 0, and a f(n/b) <= c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = θ(f(n)).
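A few quick applications of these cases (standard examples, added here for illustration):
- T(n) = 2T(n/2) + n: here n^(log_2 2) = n = f(n), so Case 2 gives T(n) = θ(n Log n) (merge sort).
- T(n) = 4T(n/2) + n: here n^(log_2 4) = n^2 dominates f(n) = n, so Case 1 gives T(n) = θ(n^2).
- T(n) = 2T(n/2) + n^2: here f(n) = n^2 dominates n^(log_2 2) = n, so Case 3 gives T(n) = θ(n^2).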
Master Theorem for Subtract-and-Conquer Recurrences: Let T(n) be a function defined on positive n, having the property
T(n) = c for n <= 1, and T(n) = aT(n-b) + f(n) for n > 1,
for some constants c, a > 0, b > 0, d >= 0, and function f(n). If f(n) is in O(n^d), then
T(n) = O(n^d) if a < 1,
T(n) = O(n^(d+1)) if a = 1,
T(n) = O(n^d a^(n/b)) if a > 1.
Remark: This theorem is written to reveal a similarity to the Master theorem. The first case is there for the sake of similarity; it does not occur in algorithm analysis, since if a is the number of recursive calls, a < 1 implies no recursive calls and no need for the theorem. The second case arises often; insertion sort and modexp() are two examples at hand. The third case is not so common, but applies, for instance, to the iconic Towers of Hanoi problem.
__________________________________________________________________________
Sorting Algorithm -
Insertion sort :
It is a simple sorting algorithm that works similar to the way you sort playing cards in
your hands. The array is virtually split into a sorted and an unsorted part. Values from
the unsorted part are picked and placed at the correct position in the sorted part.
Algorithm
To sort an array of size n in ascending order:
1: Iterate from arr[1] to arr[n-1] over the array.
2: Compare the current element (key) to its predecessor.
3: If the key element is smaller than its predecessor, compare it to the elements
before. Move the greater elements one position up to make space for the swapped
element.
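A runnable Python version of these steps (a sketch in the same style as the other code in these notes; the array values are illustrative):

# Python program for implementation of Insertion Sort
def insertionSort(arr):
    # Iterate from arr[1] to arr[n-1]
    for i in range(1, len(arr)):
        key = arr[i]
        # Move elements of arr[0..i-1] that are greater than key
        # one position ahead of their current position
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

arr = [12, 11, 13, 5, 6]
insertionSort(arr)
print("Sorted array:", arr)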
Selection Sort -
This algorithm sorts an array by repeatedly finding the minimum element (considering
ascending order) from the unsorted part and putting it at the beginning. The algorithm
maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending
order) from the unsorted subarray is picked and moved to the sorted subarray.
Following example explains the above steps:
arr[] = 64 25 12 22 11
// Find the minimum element in arr[0...4] and place it at beginning
11 25 12 22 64
// Find the minimum element in arr[1...4] and place it at beginning of arr[1...4]
11 12 25 22 64
// Find the minimum element in arr[2...4] and place it at beginning of arr[2...4]
11 12 22 25 64
// Find the minimum element in arr[3...4] and place it at beginning of arr[3...4]
11 12 22 25 64
# Python program for implementation of Selection Sort
A = [64, 25, 12, 22, 11]

# Traverse through all array elements
for i in range(len(A)):
    # Find the minimum element in the remaining unsorted array
    min_idx = i
    for j in range(i + 1, len(A)):
        if A[min_idx] > A[j]:
            min_idx = j
    # Swap the found minimum element with
    # the first element
    A[i], A[min_idx] = A[min_idx], A[i]

# Driver code to test above
print("Sorted array")
for i in range(len(A)):
    print("%d" % A[i], end=" , ")
Time Complexity: O(n^2), as there are two nested loops. Auxiliary Space: O(1)
The good thing about selection sort is that it never makes more than O(n) swaps, so it can be useful when memory writes are costly.
Bubble Sort -
It is the simplest sorting algorithm that works by repeatedly swapping the adjacent
elements if they are in the wrong order.
Example:
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ). Here, the algorithm compares the first two elements and swaps them since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5),
the algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed.
The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
# Optimized Python program for implementation of Bubble Sort
def bubbleSort(arr):
    n = len(arr)
    # Traverse through all array elements
    for i in range(n):
        swapped = False
        # Last i elements are already in place
        for j in range(0, n - i - 1):
            # Traverse the array from 0 to n-i-1
            # Swap if the element found is greater
            # than the next element
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if swapped == False:
            break

# Driver code to test above
if __name__ == "__main__":
    arr = [64, 34, 25, 12, 22, 11, 90]
    bubbleSort(arr)
    print("Sorted array:")
    for i in range(len(arr)):
        print("%d" % arr[i], end=" ")
Worst and Average Case Time Complexity: O(n^2). The worst case occurs when the array is reverse sorted.
Best Case Time Complexity: O(n). The best case occurs when the array is already sorted.
Auxiliary Space: O(1)
Boundary Cases: Bubble sort takes minimum time (order of n) when the elements are already sorted.
Merge Sort -
It is a Divide and Conquer algorithm. It divides the input array into two halves, calls
itself for the two halves, and then merges the two sorted halves. The merge()
function is used for merging two halves. The merge(arr, l, m, r) is a key process that
assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted subarrays
into one.
MergeSort(arr[], l, r)
If r > l
    1. Find the middle point to divide the array into two halves:
           middle m = l + (r - l)/2
    2. Call mergeSort for first half: Call mergeSort(arr, l, m)
    3. Call mergeSort for second half: Call mergeSort(arr, m+1, r)
    4. Merge the two halves sorted in steps 2 and 3:
           Call merge(arr, l, m, r)
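A compact runnable Python sketch of the same scheme (illustrative; unlike the pseudocode above, which works in place with index arguments, this version returns new lists for brevity):

def mergeSort(arr):
    if len(arr) <= 1:
        return arr
    m = len(arr) // 2                # middle point
    left = mergeSort(arr[:m])        # sort first half
    right = mergeSort(arr[m:])       # sort second half
    # merge the two sorted halves
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(mergeSort([12, 11, 13, 5, 6, 7]))   # [5, 6, 7, 11, 12, 13]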
QuickSort -
It is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array around the picked pivot. There are many different versions of quickSort that pick the pivot in different ways:
Always pick the first element as pivot.
Always pick the last element as pivot (used in the pseudocode below).
Pick a random element as pivot.
Pick the median as pivot.
Partition Algorithm
There can be many ways to do the partition; the following pseudocode adopts the method given in the CLRS book. The logic is simple: we start from the leftmost element and keep track of the index of smaller (or equal) elements as i. While traversing, if we find a smaller element, we swap the current element with arr[i]. Otherwise, we ignore the current element.

/* Takes the last element as pivot and places it at its correct position */
partition (arr[], low, high)
{
    pivot = arr[high]
    i = low - 1        // index of smaller element
    for (j = low; j <= high - 1; j++)
    {
        // If current element is smaller than the pivot
        if (arr[j] < pivot)
        {
            i++;       // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}
Illustration of partition() :
Now 70 is at its correct place. All elements smaller than 70 are before it and all elements
greater than 70 are after it.
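For completeness, a runnable Python version of the same last-element-pivot scheme (a sketch; the names mirror the pseudocode, and the test array is illustrative):

def partition(arr, low, high):
    pivot = arr[high]              # last element as pivot
    i = low - 1                    # index of smaller element
    for j in range(low, high):
        if arr[j] < pivot:         # current element smaller than pivot
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def quickSort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)   # pivot is now in its place
        quickSort(arr, low, p - 1)
        quickSort(arr, p + 1, high)

arr = [10, 80, 30, 90, 40, 50, 70]
quickSort(arr, 0, len(arr) - 1)
print(arr)   # [10, 30, 40, 50, 70, 80, 90]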
Analysis of QuickSort
Time taken by QuickSort, in general, can be written as follows:
T(n) = T(k) + T(n-k-1) + θ(n)
The first two terms are for the two recursive calls; the last term is for the partition process. k is the number of elements which are smaller than the pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following
are three cases.
Worst Case: The worst case occurs when the partition process always picks the greatest or
smallest element as pivot. If we consider the above partition strategy where the last element
is always picked as pivot, the worst case would occur when the array is already sorted in
increasing or decreasing order. Following is the recurrence for the worst case.
T(n) = T(0) + T(n-1) + θ(n)
which is equivalent to
T(n) = T(n-1) + θ(n)
The solution of the above recurrence is θ(n^2).
Best Case: The best case occurs when the partition process always picks the middle element as pivot. Following is the recurrence for the best case; its solution is θ(n Log n):
T(n) = 2T(n/2) + θ(n)
Average Case:
To do an average case analysis, we need to consider all possible permutations of the array and calculate the time taken by each, which does not look easy. We can get an idea of the average case by considering the case when the partition puts O(n/10) elements in one set and O(9n/10) elements in the other. Following is the recurrence for this case; its solution is also θ(n Log n):
T(n) = T(n/10) + T(9n/10) + θ(n)
Is QuickSort stable?
The default implementation is not stable. However, any sorting algorithm can be made stable by using element indexes as a tiebreaker in comparisons.
Is QuickSort In-place?
As per the broad definition of an in-place algorithm, it qualifies as an in-place sorting algorithm, since it uses extra space only for storing recursive function calls and not for manipulating the input.
Heap Sort -
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is
similar to selection sort where we first find the minimum element and place the minimum
element at the beginning. We repeat the same process for the remaining elements.
Since a Binary Heap is a Complete Binary Tree, it can be easily represented as an array and
the array-based representation is space-efficient. If the parent node is stored at index I, the
left child can be calculated by 2 * I + 1 and the right child by 2 * I + 2 (assuming the indexing
starts at 0).
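A runnable Python sketch of heap sort built on this array representation (illustrative, in the style of the other code in these notes):

def heapify(arr, n, i):
    # Sift the value at index i down so the subtree rooted at i
    # satisfies the max-heap property (heap size is n)
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heapSort(arr):
    n = len(arr)
    # Build a max-heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Repeatedly move the current maximum to the end
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

arr = [12, 11, 13, 5, 6, 7]
heapSort(arr)
print(arr)   # [5, 6, 7, 11, 12, 13]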
Strassen's Matrix Multiplication -
The straightforward divide and conquer method multiplies two n x n matrices using 8 recursive multiplications of n/2 x n/2 submatrices. But Strassen came up with a solution where we don't need 8 recursive calls; it can be done in only 7 calls plus some extra addition and subtraction operations. Strassen's 7 calls are shown below (the original figure is not reproduced).
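The standard formulation of the 7 products and the resulting quadrants (a well-known identity, supplied here in place of the missing figure; A and B are split into quadrants A11, A12, A21, A22 and B11, B12, B21, B22):

M1 = (A11 + A22)(B11 + B22)
M2 = (A21 + A22) B11
M3 = A11 (B12 - B22)
M4 = A22 (B21 - B11)
M5 = (A11 + A12) B22
M6 = (A21 - A11)(B11 + B12)
M7 = (A12 - A22)(B21 + B22)

C11 = M1 + M4 - M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 - M2 + M3 + M6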
The time complexity follows from the Master Theorem: T(n) = 7T(n/2) + O(n^2) = O(n^(log2 7)) ≈ O(n^2.8074), which is better than O(n^3).
Searching Algorithm -
Linear/Sequential Search -
A linear search, also known as a sequential search, simply scans the elements one at a time. Suppose we want to search for an element in an array or list: we examine each item in sequence, without jumping over any item. The worst-case complexity is O(n).
Algorithm:
Linear Search (Array A, Value x)
Step 1: Set i to 1
Step 2: if i > n, then jump to step 7
Step 3: if A[i] = x then jump to step 6
Step 4: Set i to i + 1
Step 5: Go to step 2
Step 6: Print element x found at index i and jump to step 8
Step 7: Print element not found
Step 8: Exit
Binary Search -
A binary search is a search in which the middle element is examined to check whether it is smaller or larger than the element being searched for. The main advantage of binary search is that it does not scan every element in the list; instead, each comparison discards half of the remaining list. So binary search takes less time to find an element than linear search. The worst-case complexity is O(log2 n).
The one prerequisite of binary search is that the array must be in sorted order, whereas linear search works on both sorted and unsorted arrays. The binary search algorithm is based on the divide and conquer technique, which means that it divides the array recursively.
There are three cases used in the binary search:
Case 1: data < a[mid], then right = mid - 1 (search the left half)
Case 2: data > a[mid], then left = mid + 1 (search the right half)
Case 3: data = a[mid] // element is found
Suppose we have an array of size 10, indexed from 0 to 9, as shown in the figure (not reproduced). We want to search for the element 70 in this array.
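A runnable iterative Python sketch of these three cases (illustrative; the sample array is an assumption standing in for the missing figure):

def binarySearch(a, data):
    left, right = 0, len(a) - 1
    while left <= right:
        mid = (left + right) // 2
        if data < a[mid]:        # Case 1: search the left half
            right = mid - 1
        elif data > a[mid]:      # Case 2: search the right half
            left = mid + 1
        else:                    # Case 3: element found
            return mid
    return -1                    # element not present

a = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(binarySearch(a, 70))       # prints 6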
Binary Search Tree -
Binary Search Tree is a node-based binary tree data structure which has the following
properties:
● The left subtree of a node contains only nodes with keys less than the node’s key.
● The right subtree of a node contains only nodes with keys greater than the node’s key.
● The left and right subtrees must each also be a binary search tree.
Creating a Binary Search Tree (O(n^2) in the worst case): 45, 15, 79, 90, 10, 55, 12, 20, 50
Searching in the Binary Search Tree: Suppose we have to find node 20 in the tree built above (figure not reproduced).
Search (root, item)
    if (item = root → data) or (root = NULL)
        return root
    else if (item < root → data)
        return Search(root → left, item)
    else
        return Search(root → right, item)
END
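A minimal Python sketch of BST insertion and search (illustrative; it builds the tree from the sequence used above and mirrors the pseudocode):

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def search(root, item):
    if root is None or root.key == item:
        return root
    if item < root.key:
        return search(root.left, item)
    return search(root.right, item)

root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)
print(search(root, 20) is not None)   # True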
Deletion in a BST has three cases: the node to be deleted is a leaf, has one child, or has two children (the illustrations of these cases are not reproduced here).
Operation    Best case time complexity    Average case time complexity    Worst case time complexity
Deletion     O(log n)                     O(log n)                        O(n)
AVL Tree -
Insertion and deletion in an AVL tree are performed in the same way as in a binary search tree. However, they may violate the AVL tree property, and therefore the tree may need balancing. The tree can be balanced by applying rotations.
2. RR Rotation: The inserted node is in the right subtree of the right subtree of A.
Example: construct an AVL tree by inserting H, I, J, B, A, E, C, F, D, G, K, L in order.
1. Insert H, I, J
On inserting the above elements, especially J, the BST becomes unbalanced, as the Balance Factor of H is -2. Since the BST is right-skewed, we perform an RR Rotation on node H, which results in a balanced tree.
2. Insert B, A
On inserting the above elements, especially A, the BST becomes unbalanced, as the Balance Factor of H and I is 2. We consider the first unbalanced node on the path from the last inserted node, i.e. H. Since the BST rooted at H is left-skewed, we perform an LL Rotation on node H, which results in a balanced tree.
3. Insert E
4. Insert C, F, D
On inserting C, F, D, the BST becomes unbalanced, as the Balance Factor of B and H is -2. Since, travelling from D to B, we find that D is inserted in the right subtree of the left subtree of B, we perform an RL Rotation (RL = LL rotation followed by RR rotation):
4a) We first perform LL rotation on node E
4b) We then perform RR rotation on node B
5. Insert G
6. Insert K
On inserting K, the BST becomes unbalanced, as the Balance Factor of I is -2. Since the BST is right-skewed from I to K, we perform an RR Rotation on node I.
7. Insert L
On inserting L, the tree is still balanced, as the Balance Factor of each node is now either -1, 0, or +1. Hence the tree is a balanced AVL tree.
—------------------------------------------------------------------------------------------------
Binomial Heaps -
A Binomial Heap is a collection of Binomial Trees.
Binomial Tree -
A Binomial Tree of order 0 has a single node. A Binomial Tree of order k can be constructed by taking two binomial trees of order k-1 and making one the leftmost child of the other.
A Binomial Tree of order k has the following properties:
a) It has exactly 2^k nodes.
b) Its depth is k.
c) There are exactly kCi (k choose i) nodes at depth i for i = 0, 1, . . . , k.
d) The root has degree k, and the children of the root are themselves Binomial Trees of order k-1, k-2, ..., 0 from left to right.
Binomial Heap:
A Binomial Heap is a set of Binomial Trees where each Binomial Tree follows the Min Heap
property. And there can be at most one Binomial Tree of any degree.
Example Binomial Heap:

12 ---------- 10 -------------------- 20
             /  \                   /  |  \
           15    50               70   50  40
           |                     /  |   |
           30                  80   85  65
                               |
                              100

A Binomial Heap with 13 nodes. It is a collection of 3 Binomial Trees of orders 0, 2 and 3 from left to right.
Complexity -
Decrease key -> Decreases an existing key to some value: Θ(log n)
Delete -> Deletes a node given a reference to the node: Θ(log n)
Extract minimum -> Removes and returns the minimum value: Θ(log n)
Find minimum -> Returns the minimum value: O(log n)
Insert -> Inserts a new value: O(log n)
Union -> Combines the heap with another to form a valid binomial heap: Θ(log n)
—---------------------------------------------------------------------------------------------------------------
Fibonacci Heaps -
In terms of Time Complexity, Fibonacci Heap beats both Binary and Binomial Heaps.
Below are amortized time complexities of Fibonacci Heap.
1) Find Min: Θ(1) [Same as both Binary and Binomial]
2) Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]
3) Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]
4) Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]
5) Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and
Θ(Log n) in Binomial]
Like a Binomial Heap, a Fibonacci Heap is a collection of trees with the min-heap or max-heap property. In a Fibonacci Heap, trees can have any shape; even all trees can be single nodes (this is unlike a Binomial Heap, where every tree has to be a Binomial Tree). Below is an example Fibonacci Heap (figure not reproduced).
A Fibonacci Heap maintains a pointer to the minimum value (which is the root of a tree). All tree roots are connected using a circular doubly linked list, so all of them can be accessed using a single ‘min’ pointer.
The main idea is to execute operations in a “lazy” way. For example, the merge operation
simply links two heaps, insert operation simply adds a new tree with a single node. The
operation extract minimum is the most complicated operation. It does the delayed work of consolidating trees. This makes delete also complicated, as delete first decreases the key to minus infinity and then calls extract minimum.
Below are some interesting facts about Fibonacci Heap
1. The reduced time complexity of Decrease-Key is important in Dijkstra's and Prim's algorithms. With a Binary Heap, the time complexity of these algorithms is O(V Log V + E Log V). If a Fibonacci Heap is used, the time complexity improves to O(V Log V + E).
2. Although Fibonacci Heap looks promising in terms of time complexity, it has been found slow in practice, as the hidden constants are high (source: Wikipedia).
3. Fibonacci heaps are mainly called so because Fibonacci numbers are used in the running time analysis. Also, every node in a Fibonacci Heap has degree at most O(log n), and the size of a subtree rooted in a node of degree k is at least F(k+2), where F(k) is the k-th Fibonacci number.
—-----------------------------------------------------------------------------------------------------------------------
Union-Find (Disjoint Set) -
A union-find structure maintains a collection of disjoint subsets over N elements and supports union and find operations. When each subset contains only one element, the array arr is as follows: arr[i] = i for every element i.
1. Union(6, 5)
After this operation, arr is updated accordingly (figure not reproduced).
After performing some operations of Union (A ,B), there are now 5 subsets as follows:
1. First subset comprises the elements {3, 4, 8, 9}
2. Second subset comprises the elements {1, 2}
3. Third subset comprises the elements {5, 6}
4. Fourth subset comprises the elements {0}
5. Fifth subset comprises the elements {7}
The elements of a subset, which are connected to each other directly or indirectly, can be
considered as the nodes of a graph. Therefore, all these subsets are called connected
components.
The union-find data structure is useful in graphs for performing various operations like
connecting nodes, finding connected components etc.
Let’s perform some find(A, B) operations.
1. Find(0, 7): 0 and 7 are disconnected, and therefore, you will get a false result
2. Find(8, 9): Although 8 and 9 are not connected directly, there is a path that connects
both the elements, and therefore, you will get a true result
Implementation
Approach A
Initially there are N subsets containing one element in each subset. Therefore, to initialize the
array use the initialize () function.
void initialize(int Arr[], int N)
{
    for(int i = 0; i < N; i++)
        Arr[i] = i;
}

// returns true if A and B are connected, else returns false
bool find(int Arr[], int A, int B)
{
    if(Arr[A] == Arr[B])
        return true;
    else
        return false;
}
// change all entries whose value equals Arr[A] to Arr[B]
// ('union' is a reserved keyword in C/C++, so the function is named Union here)
void Union(int Arr[], int N, int A, int B)
{
    int temp = Arr[A];
    for(int i = 0; i < N; i++)
    {
        if(Arr[i] == temp)
            Arr[i] = Arr[B];
    }
}
Time complexity (of this approach)
As the loop in the Union function iterates through all N elements to connect two elements, performing this operation on N objects takes O(N^2) time, which is quite inefficient.
Approach B
Let’s try another approach.
Idea
Arr[ A ] is a parent of A.
Consider the root element of each subset, which is the one special element in that subset having itself as the parent. Assume that R is the root element; then arr[R] = R.
For more clarity, consider the subset S = {0, 1, 2, 3, 4, 5}
Initially each element is the root of itself in all the subsets because arr[ i ] = i, where i is the
element in the set. Therefore root(i) = i.
Performing union(1, 0) will connect 1 to 0 and will set root (0) as the parent of root (1). As
root(1) = 1 and root(0) = 0, the value of arr[ 1 ] will change from 1 to 0. Therefore, 0 will be
the root of the subset that contains the elements {0, 1}.
Performing union (0, 2), will indirectly connect 0 to 2 by setting root(2) as the parent of
root(0). As root(0) is 0 and root(2) is 2, it will change the value of arr[ 0 ] from 0 to 2.
Therefore, 2 will be the root of the subset that contains the elements {2, 0, 1}.
Performing union (3, 4) will indirectly connect 3 to 4 by setting root(4) as the parent of root(3).
As root(3) is 3 and root(4) is 4, it will change the value of arr[ 3 ] from 3 to 4. Therefore, 4 will
be the root of the subset that contains the elements {3, 4}.
Performing union (1, 4) will indirectly connect 1 to 4 by setting root(4) as the parent of root(1).
As root(4) is 4 and root(1) is 2, it will change the value of arr[ 2 ] from 2 to 4. Therefore, 4 will
be the root of the set containing elements {0, 1, 2, 3, 4}.
After each step, you will see the change in the array arr also.
After performing the required union(A, B) operations, you can perform the find(A, B)
operation easily to check whether A and B are connected. This can be done by calculating
the roots of both A and B. If the roots of A and B are the same, then it means that both A and
B are in the same subset and are connected.
Calculating the root of an element
Arr[i] is the parent of i (where i is an element of the set). The root of i is Arr[Arr[Arr[ ... Arr[i] ... ]]], applied until arr[i] equals i. You can run a loop until you reach an element that is its own parent.
Note: This works only when there is no cycle among the elements of the subset; otherwise the loop will run forever.
1. Find(1, 4): 1 and 4 have the same root i.e. 4. Therefore, it means that they are
connected and this operation will give the result True.
2. Find(3, 5): 3 and 5 do not have the same root because root(3) is 4 and root(5) is 5.
This means that they are not connected and this operation will give the result False.
Implementation
Initially each element is a parent of itself, which can be done by using the initialize function as
discussed above.
// finding root of an element
int root(int Arr[], int i)
{
    while(Arr[i] != i)   // chase parent of current element until it reaches root
    {
        i = Arr[i];
    }
    return i;
}
/* modified union function where we connect the elements by changing the root
   of one of the elements (body completed here from the description above) */
void Union(int Arr[], int A, int B)
{
    int root_A = root(Arr, A);
    int root_B = root(Arr, B);
    Arr[root_A] = root_B;   // root of B's subset becomes parent of root of A's subset
}

A drawback of this approach is that repeated unions can produce tall, skewed trees, which makes root() slow.
To avoid this, track the size of each subset, and while connecting two elements, connect the root of the subset that has fewer elements to the root of the subset that has more elements.
Example
If you want to connect 1 and 5, then connect the root of subset A (the subset that contains 1) to the root of subset B (the subset that contains 5), because subset A contains fewer elements than subset B.
This balances the tree formed by performing the operations discussed above. This is known as the weighted-union operation.
Implementation
Initially the size of each subset will be one because each subset has only one element. You can initialize it in the initialize function discussed above. The size[] array keeps track of the size of each subset.
// modified initialize function (size[] is passed in alongside Arr[]):
void initialize(int Arr[], int size[], int N)
{
    for(int i = 0; i < N; i++)
    {
        Arr[i] = i;
        size[i] = 1;
    }
}
The root() and find() functions will be the same as discussed above.
The union function will be modified, because the two subsets will be connected based on the number of elements in each subset.
// modified union function ('-' is not allowed in C identifiers,
// so the function is named weightedUnion here)
void weightedUnion(int Arr[], int size[], int A, int B)
{
    int root_A = root(Arr, A);
    int root_B = root(Arr, B);
    if(size[root_A] < size[root_B])
    {
        Arr[root_A] = Arr[root_B];    // attach smaller tree below larger root
        size[root_B] += size[root_A];
    }
    else
    {
        Arr[root_B] = Arr[root_A];
        size[root_A] += size[root_B];
    }
}
Example
You have a set S = {0, 1, 2, 3, 4, 5}. Initially all the subsets have a single element, each element is the root of itself, and every entry of the size[] array is 1.
Perform union(0, 1). Here you can connect the root of either element to the root of the other, because both subsets are of the same size. After the roots are connected, the respective sizes are updated: if you connect 1 to 0 and make 0 the root, the size of 0 changes from 1 to 2.
While performing union(1, 2), connect root(2) with root(1), because the subset of 2 has fewer elements than the subset of 1.
Similarly, in union(3, 2), connect root(3) to root(2) because the subset of 3 has fewer
elements than the subset of 2.
Maintaining a balanced tree reduces the complexity of the union-find operations from O(N) to O(log2 N).
Idea for improving this approach further
Union with path compression
While computing the root of A, set each i to point to its grandparent (thereby halving the length of the path), where i is a node that comes on the path while computing the root of A.

// modified root function (body completed here from the grandparent-pointing idea just described)
int root(int Arr[], int i)
{
    while(Arr[i] != i)
    {
        Arr[i] = Arr[Arr[i]];   // point i to its grandparent (path compression)
        i = Arr[i];
    }
    return i;
}
—------------------------------------------------------------------------------------------------
GATE Questions -
1. Which one of the following statements is TRUE for all positive functions f(n)? [GATE CSE 2022]
(A) f(n^2) = θ((f(n))^2), when f(n) is a polynomial
(B) f(n^2) = o((f(n))^2)
(C) f(n^2) = O((f(n))^2), when f(n) is an exponential function
(D) f(n^2) = Ω((f(n))^2)
Solution: The correct answer is (A).
2. For parameters a and b, both of which are ω(1), T(n) = T(n^(1/a)) + 1, and T(b) = 1. Then T(n) is [GATE CSE 2020]
(A) θ(log_a log_b n) (B) θ(log_ab n) (C) θ(log_b log_a n) (D) θ(log_2 log_2 n)
Solution: The correct answer is (A). Repeatedly taking the a-th root, the argument n^(1/a^k) reaches b when a^k = log_b n, i.e. after k = log_a log_b n steps.
3. Which one of the following is the recurrence equation for the worst-case time complexity of the Quicksort algorithm for sorting n (≥ 2) numbers? In the recurrence equations given in the options below, c is a constant. [GATE CSE 2015]
(A) T(n) = 2T(n/2) + cn (B) T(n) = T(n-1) + T(0) + cn
(C) T(n) = 2T(n-1) + cn (D) T(n) = T(n/2) + cn
Solution: The correct answer is (B). In the worst case the pivot splits the array into parts of sizes n-1 and 0, and the partition step costs cn.
8. For merging two sorted lists of sizes m and n into a sorted list of size m+n, we require comparisons of [GATE CSE 1995]
(A) O(m) (B) O(n) (C) O(m+n) (D) O(log m + log n)
Solution: The correct answer is (C). Each comparison places at least one element into the output, so at most m+n-1 comparisons are needed.
Assignment -1
1. Apply the Master theorem to find the complexity of the following recurrence relations:
i) T(n) = 3T(n/2) + n^2
ii) T(n) = 4T(n/2) + n^2
iii) T(n) = T(n/2) + n^2
iv) T(n) = 2^n T(n/2) + n^n
2. Write the Quick sort algorithm and compute its best and worst case time complexities. Sort 7, 9, 2, 3, 15, 1, 12, 30 in ascending order using quick sort.
3. Solve T(n) = 3T(n/4) + n^2 using the recursion tree method.
4. Solve the following recurrence relations using the substitution method. Also,
prove your answer using the iteration method.
a. T(n) = 3T(n/3) + n/log n
b. T(n) = T(n/2) + T(n/4) + T(n/8) + n
5. Examine the time complexity of the equations:
a. T(n) = T(√n) + O(1)
b. T(n) = 7T(n/2) + n^3
6. Write the merge sort algorithm and compute its best and worst case time complexities. Sort L, U, C, K, N, O, W in alphabetical order using merge sort.
7. Sort the given array A = {27, 46, 11, 95, 67, 32, 78} using the insertion sort algorithm. Also perform best case and worst case analysis of the insertion sort algorithm.
8. Describe in detail Strassen's matrix multiplication algorithm based on the divide and conquer strategy, with suitable examples.
9. Sort the following sequence {25, 57, 48, 36, 12, 91, 86, 32} using heap sort.
10. Write short notes on Binomial heap.