Unit II
Basic strategy, matrix operations, Strassen's matrix multiplication, binary search, quick sort, merge sort,
amortized analysis, application of amortized analysis, advanced data structures like Fibonacci heap,
binomial heap, disjoint set representation
Basic strategy:
A problem-solving strategy that involves breaking a problem into smaller subproblems, solving each subproblem
recursively, and then combining their solutions to solve the original problem.
The basic strategy of the Divide and Conquer paradigm involves three main steps for solving a problem:
1. Divide: Break the original problem into smaller subproblems that are similar to the original problem but
smaller in size. The size of these subproblems decreases with each recursive step.
2. Conquer: Solve the subproblems recursively. If the subproblem sizes are small enough, solve them directly
using base case solutions (e.g., simple arithmetic operations).
3. Combine: Merge the solutions of the subproblems to form the solution of the original problem.
Example: Merge Sort (Divide and Conquer Strategy)
• Divide: Split the array into two halves.
• Conquer: Recursively sort each half.
• Combine: Merge the two sorted halves to produce a single sorted array.
The divide and conquer strategy is effective when a problem can be broken down into independent or nearly
independent subproblems that can be solved recursively and then combined.
Advantages:
• Efficiency: By dividing the problem, the solution often achieves better time complexity (e.g., O(n log n) in merge sort).
• Parallelization: Subproblems can often be solved in parallel, making this approach well-suited for
distributed computing.
Matrix operations:
Basic operations like matrix addition, subtraction, and multiplication. This forms the foundation for more complex
algorithms.
Strassen's matrix multiplication
An algorithm that multiplies two matrices faster than the standard matrix multiplication method, reducing the time complexity from O(n^3) to approximately O(n^2.81).
Divide and Conquer:
The following is a simple divide-and-conquer method to multiply two square matrices.
1. Divide matrices A and B into 4 sub-matrices of size N/2 x N/2, calling the blocks of A a, b, c, d and the blocks of B e, f, g, h.
2. Calculate the following values recursively: ae + bg, af + bh, ce + dg, and cf + dh, as laid out below.
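Writing out the blocks explicitly: if A is split into blocks a, b, c, d and B into blocks e, f, g, h, i.e.,
A = | a b |    B = | e f |
    | c d |        | g h |
then
A x B = | ae + bg  af + bh |
        | ce + dg  cf + dh |
so the product takes 8 recursive multiplications of N/2 x N/2 matrices plus 4 matrix additions. The resulting recurrence T(N) = 8T(N/2) + O(N^2) solves to O(N^3), i.e., this plain divide-and-conquer scheme is no faster than the naive method.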
Strassen's algorithm, developed by Volker Strassen in 1969, is a fast algorithm for matrix multiplication. It is an
efficient divide-and-conquer method that reduces the number of arithmetic operations required to multiply two
matrices compared to the conventional matrix multiplication algorithm (the naive approach).
The traditional matrix multiplication algorithm has a time complexity of O(n^3) for multiplying two n x n matrices. However, Strassen's algorithm improves this to O(n^log2 7), which is approximately O(n^2.81). The algorithm achieves this improvement by recursively breaking down the matrix multiplication into smaller subproblems, using only seven half-size multiplications instead of eight, and combining the results.
The efficiency of Strassen's algorithm comes from the fact that it reduces the number of recursive multiplications from eight to seven, which means fewer multiplication operations are needed overall. However, due to its higher constant factors and increased overhead, Strassen's algorithm is often slower than the naive algorithm for small matrices in practical implementations. For very large matrices, it can provide a significant speedup. Additionally, further optimized algorithms like the Coppersmith-Winograd algorithm have been developed to improve the asymptotic complexity of matrix multiplication even more.
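For reference, one standard formulation of Strassen's seven products, using the same block names a, b, c, d (for A) and e, f, g, h (for B) as above, is:
P1 = a(f - h)
P2 = (a + b)h
P3 = (c + d)e
P4 = d(g - e)
P5 = (a + d)(e + h)
P6 = (b - d)(g + h)
P7 = (a - c)(e + f)
The four quadrants of the product are then recovered using only additions and subtractions:
C11 = P5 + P4 - P2 + P6
C12 = P1 + P2
C21 = P3 + P4
C22 = P5 + P1 - P3 - P7
Since only 7 half-size multiplications are needed instead of 8, the recurrence becomes T(n) = 7T(n/2) + O(n^2), which solves to O(n^log2 7) ≈ O(n^2.81).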
Binary search:
A search algorithm that works on sorted arrays by repeatedly dividing the search interval in half. Its time
complexity is O(log n).
Binary search is a search algorithm used to find the position of a target value within a sorted array. It works by
repeatedly dividing the search interval in half until the target value is found or the interval is empty. The search
interval is halved by comparing the target element with the middle value of the search space.
Binary search follows the divide and conquer approach in which the list is divided into two halves, and the item is
compared with the middle element of the list. If the match is found then, the location of the middle element is
returned. Otherwise, we search into either of the halves depending upon the result produced through the match.
To apply Binary Search algorithm:
• The data structure must be sorted.
• Access to any element of the data structure should take constant time.
Below is the step-by-step algorithm for Binary Search:
• Divide the search space into two halves by finding the middle index “mid”.
• Compare the middle element of the search space with the key.
• If the key is found at the middle element, the process is terminated.
• If the key is not found at the middle element, choose which half will be used as the next search space.
If the key is smaller than the middle element, then the left side is used for the next search.
If the key is larger than the middle element, then the right side is used for the next search.
• This process is continued until the key is found or the total search space is exhausted.
Pseudo Code:
The Binary Search Algorithm can be implemented in the following two ways
• Iterative Binary Search Algorithm
• Recursive Binary Search Algorithm
Recursive implementation:
static int binarySearch(int[] arr, int low, int high, int x) // Searches arr[low..high] for x
{
    if (low > high) return -1; // Search space exhausted; element not present
    int mid = low + (high - low) / 2; // Middle index, written to avoid integer overflow
    if (arr[mid] == x) return mid; // Key found at the middle element
    if (arr[mid] > x) return binarySearch(arr, low, mid - 1, x); // Element can only be in the left subarray
    return binarySearch(arr, mid + 1, high, x); // Else the element can only be present in the right subarray
}
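For completeness, here is a minimal iterative version of the same search, a sketch matching the recursive code above:
static int binarySearchIterative(int[] arr, int x) // Returns the index of x in arr, or -1
{
    int low = 0, high = arr.length - 1;
    while (low <= high) // Loop until the search space is exhausted
    {
        int mid = low + (high - low) / 2;
        if (arr[mid] == x) return mid; // Key found
        else if (arr[mid] < x) low = mid + 1; // Key is larger: search the right half
        else high = mid - 1; // Key is smaller: search the left half
    }
    return -1; // Key not present in the array
}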
Quick sort
A highly efficient sorting algorithm that uses a divide-and-conquer approach. It works by selecting a 'pivot' and
partitioning the array around the pivot, then recursively sorting the subarrays.
QuickSort is a sorting algorithm based on the Divide and Conquer algorithm that picks an element as a pivot and
partitions the given array around the picked pivot by placing the pivot in its correct position in the sorted array.
Quicksort picks an element as pivot, and then it partitions the given array around the picked pivot element. In quick
sort, a large array is divided into two arrays in which one holds values that are smaller than the specified value
(Pivot), and another array holds the values that are greater than the pivot.
Divide: In Divide, first pick a pivot element. After that, partition or rearrange the array into two sub-arrays such
that each element in the left sub-array is less than or equal to the pivot element and each element in the right sub-
array is larger than the pivot element.
Conquer: Recursively, sort two subarrays with Quicksort.
Combine: Combine the already sorted subarrays. Because partitioning places the pivot in its final position and sorting is done in place, no explicit combine step is needed.
Choosing the Pivot:
Picking a good pivot is necessary for a fast implementation of quicksort. However, it is difficult to determine a good pivot in advance. Some of the ways of choosing a pivot are as follows –
There are many different choices for picking pivots.
➢ Pivot can be random, i.e. select the random pivot from the given array.
➢ Pivot can either be the rightmost element or the leftmost element of the given array.
➢ Select median as the pivot element.
Code:
import java.io.*;

class GFG
{
    // A utility function to swap two elements
    static void swap(int[] arr, int i, int j)
    {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    /* This function takes the last element as pivot, places the pivot element at its correct
       position in the sorted array, and places all smaller elements to the left of the pivot
       and all greater elements to the right of the pivot. */
    static int partition(int[] arr, int low, int high)
    {
        int pivot = arr[high]; // Choosing the pivot
        int i = (low - 1); // Index of smaller element; indicates the right position of pivot found so far
        for (int j = low; j <= high - 1; j++)
        {
            if (arr[j] < pivot) // If current element is smaller than the pivot
            {
                i++; // Increment index of smaller element
                swap(arr, i, j);
            }
        }
        swap(arr, i + 1, high); // Place the pivot in its correct sorted position
        return (i + 1);
    }

    // The main function that implements QuickSort on arr[low..high]
    static void quickSort(int[] arr, int low, int high)
    {
        if (low < high)
        {
            int pi = partition(arr, low, high); // pi is the partitioning index
            quickSort(arr, low, pi - 1); // Recursively sort elements before the pivot
            quickSort(arr, pi + 1, high); // Recursively sort elements after the pivot
        }
    }
}
Merge Sort:
Another divide-and-conquer algorithm that splits the array into halves, recursively sorts them, and then merges the
sorted halves. Its time complexity is O(n log n).
Merge sort is similar to the quick sort algorithm in that it uses the divide and conquer approach to sort the elements. It is one of the most popular and efficient sorting algorithms. It divides the given list into two equal halves, calls itself for the two halves, and then merges the two sorted halves. We have to define the merge() function to perform the merging.
The sub-lists are divided again and again into halves until the list cannot be divided further. Then we combine the pairs of one-element lists into two-element lists, sorting them in the process. The sorted two-element pairs are merged into four-element lists, and so on until we get the fully sorted list.
1. Divide: Divide the list or array recursively into two halves until it can no more be divided.
2. Conquer: Each subarray is sorted individually using the merge sort algorithm.
3. Merge: The sorted subarrays are merged back together in sorted order. The process continues until all
elements from both subarrays have been merged.
Code:
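The following is a minimal Java sketch of merge sort in the style of the quicksort code above (class and method names are illustrative):

import java.util.Arrays;

class MergeSort
{
    // Merges two sorted subarrays arr[l..m] and arr[m+1..r] into arr[l..r]
    static void merge(int[] arr, int l, int m, int r)
    {
        int[] left = Arrays.copyOfRange(arr, l, m + 1);
        int[] right = Arrays.copyOfRange(arr, m + 1, r + 1);
        int i = 0, j = 0, k = l;
        while (i < left.length && j < right.length) // Repeatedly take the smaller head element
            arr[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length) arr[k++] = left[i++]; // Copy any remaining left elements
        while (j < right.length) arr[k++] = right[j++]; // Copy any remaining right elements
    }

    // Recursively sorts arr[l..r] using merge sort
    static void mergeSort(int[] arr, int l, int r)
    {
        if (l < r)
        {
            int m = l + (r - l) / 2;
            mergeSort(arr, l, m); // Divide: sort the left half
            mergeSort(arr, m + 1, r); // Divide: sort the right half
            merge(arr, l, m, r); // Combine: merge the two sorted halves
        }
    }
}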
Amortized Analysis
A method for analyzing the average time complexity of an algorithm over a sequence of operations, rather than a single operation.
Amortized analysis is a powerful technique for data structure analysis, involving the total runtime of a sequence of
operations, which is often what we really care about.
In amortized analysis, one averages the total time required to perform a sequence of data-structure operations over
all operations performed. Upshot of amortized analysis: worst-case cost per query may be high for one particular
query, so long as overall average cost per query is small in the end!
Amortized analysis is a worst-case analysis. That is, it measures the average performance of each operation in the
worst case.
In Amortized Analysis, we analyze a sequence of operations and guarantee a worst-case average time that is lower
than the worst-case time of a particularly expensive operation.
The example data structures whose operations are analyzed using Amortized Analysis are Hash Tables, Disjoint
Sets, and Splay Trees.
Amortized analysis is a technique used in computer science to analyze the average-case time complexity of algorithms that perform a sequence of operations, where some operations may be more expensive than others. The idea is to spread the cost of these expensive operations over multiple operations, so that the average cost per operation remains small.
Types of amortized analyses:
Three common types of amortized analyses:
1. Aggregate Analysis: determine upper bound T(n) on total cost of sequence of n operations. So amortized
complexity is T(n)/n.
2. Accounting Method: assign a certain charge to each operation (independent of the actual cost of the operation). If an operation is cheaper than its charge, the surplus builds up as credit to use later.
3. Potential Method: one comes up with a “potential energy” for the data structure, a function that maps each state of the entire data structure to a real number (its “potential”). It differs from the accounting method in that credit is assigned to the data structure as a whole, instead of to each operation.
Aggregate Method:
The method we used in the above analysis is the aggregate method: just add up the cost of all the operations and
then divide by the number of operations.
In aggregate analysis, there are two steps. First, we must show that a sequence of n operations takes T(n) time in the worst case. Then, we show that each operation takes T(n)/n time, on average. Therefore, in aggregate analysis, each operation has the same amortized cost.
A common example of aggregate analysis is a modified stack. Stacks are a linear data structure that have two
constant-time operations. push(element) puts an element on the top of the stack, and pop() takes the top element
off of the stack and returns it. These operations are both constant-time, so a total of n operations (in any order)
will result in O(n) total time.
The aggregate method is the simplest method. Because it is so simple, it may not be able to analyze more complicated algorithms.
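As a concrete sketch, here is the stack described above extended with a multipop(k) operation (the standard textbook variant, assumed here for illustration), written in Java:

import java.util.ArrayDeque;
import java.util.Deque;

class MultipopStack
{
    private final Deque<Integer> stack = new ArrayDeque<>();

    void push(int element) { stack.push(element); } // O(1) actual cost

    Integer pop() { return stack.poll(); } // O(1) actual cost; null if empty

    // Pops up to k elements; actual cost is min(k, size). Since each element is
    // pushed once and popped at most once, any sequence of n operations costs
    // O(n) in total, i.e. O(1) amortized per operation (aggregate analysis).
    void multipop(int k)
    {
        while (k-- > 0 && !stack.isEmpty())
            stack.pop();
    }
}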
Accounting Method:
This method allows an operation to store credit into a bank for future use, if its assigned amortized cost > its
actual cost; it also allows an operation to pay for its extra actual cost using existing credit, if its assigned
amortized cost < its actual cost.
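As a standard illustration using the multipop stack sketched above: charge each push an amortized cost of 2 units. One unit pays for the push itself, and the other is stored as credit on the pushed element. Every later pop (including each element removed by multipop) is paid for entirely by the credit already sitting on the element it removes, so pops are free in the amortized accounting, and any sequence of n operations costs at most 2n = O(n) in total.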
Potential Method:
The potential method is similar to the accounting method. However, instead of thinking about the analysis in
terms of cost and credit, the potential method thinks of work already done as potential energy that can pay for
later operations. This is similar to how rolling a rock up a hill creates potential energy that then can bring it back
down the hill with no effort. Unlike the accounting method, however, potential energy is associated with the data
structure as a whole, not with individual operations.
This method defines a potential function Φ that maps a data structure (DS) configuration to a value. This function Φ is equivalent to the total unused credits stored up by all past operations (the bank account balance). Now
amortized cost of operation i = actual cost of operation i + Φ(DS after operation i) − Φ(DS before operation i)
And, summing over a sequence of n operations (the intermediate potentials cancel),
total amortized cost = total actual cost + Φ(final DS) − Φ(initial DS)
In order for the amortized bound to hold, Φ should never go below Φ(initial DS) at any point. If Φ(initial DS) = 0, which is usually the case, then Φ should never go negative (intuitively, we cannot “owe the bank”).
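As a standard illustration, again with the multipop stack: take Φ = the number of elements currently on the stack, so Φ(initial DS) = 0 and Φ never goes negative. A push has actual cost 1 and increases Φ by 1, giving amortized cost 1 + 1 = 2. A multipop that removes k' elements has actual cost k' and decreases Φ by k', giving amortized cost k' - k' = 0. Hence any sequence of n operations has total amortized, and therefore total actual, cost O(n).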
Advantages of amortized analysis:
1. More accurate predictions: Amortized analysis provides a more accurate prediction of the average-case
complexity of an algorithm over a sequence of operations, rather than just the worst-case complexity of
individual operations.
2. Provides insight into algorithm behavior: By analyzing the amortized cost of an algorithm, we can gain
insight into how it behaves over a longer period of time and how it handles different types of inputs.
3. Helps in algorithm design: Amortized analysis can be used as a tool for designing algorithms that are
efficient over a sequence of operations.
4. Useful in dynamic data structures: Amortized analysis is particularly useful in dynamic data structures like heaps, stacks, and queues, where the cost of an operation may depend on the current state of the data structure.
1. Fibonacci heap:
A Fibonacci heap is defined as a collection of rooted trees in which all the trees must hold the min-heap property. That is, for every node, the key value of the node should be greater than or equal to the key value of its parent node:
The above Fibonacci heap consists of five rooted min-heap-ordered trees with 14 nodes. A min-heap-ordered tree means a tree which holds the property of a min-heap. The dashed line shows the root list. The minimum node in the Fibonacci heap is the node containing the key = 3, pointed to by the pointer FH-min.
Here, 18, 39, 26, and 35 are marked nodes, meaning they have each lost one child. The potential of the Fibonacci heap = No. of rooted trees + twice the number of marked nodes = 5 + 2 * 4 = 13.
In the above figure, we can observe that each node contains four pointers: the parent pointer points to its parent (upward), the child pointer points to one of its children (downward), and the left and right pointers point to its siblings (sideways).
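A minimal Java sketch of such a node structure (field names are illustrative assumptions, not a fixed API):

class FibNode
{
    int key; // The node's key value
    int degree; // Number of children
    boolean mark; // TRUE if the node has lost a child since it last became a child itself
    FibNode parent; // Upward pointer
    FibNode child; // Pointer to one child; the children form their own circular list
    FibNode left, right; // Sibling pointers in a circular doubly linked list

    FibNode(int key)
    {
        this.key = key;
        this.mark = false; // Newly created nodes are marked FALSE
        this.left = this; // A single node is its own circular list
        this.right = this;
    }
}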
Properties of Fibonacci Heap:
1. It can have multiple trees of equal degrees, and each tree doesn't need to have 2^k nodes.
2. All the trees in the Fibonacci Heap are rooted but not ordered.
3. All the roots and siblings are stored in a separate circular doubly linked list.
4. The degree of a node is the number of its children. Node X -> degree = Number of X's children.
5. Each node has a mark attribute in which it is marked TRUE or FALSE. FALSE indicates that the node has not lost any of its children since it last became a child of another node. TRUE represents that the node has lost one child. A newly created node is marked FALSE.
6. The potential function of the Fibonacci heap is F(FH) = t[FH] + 2 * m[FH]
7. The Fibonacci Heap (FH) has some important technicalities listed below:
1. min[FH] - Pointer points to the minimum node in the Fibonacci Heap
2. n[FH] - Determines the number of nodes
3. t[FH] - Determines the number of rooted trees
4. m[FH] - Determines the number of marked nodes
5. F(FH) - Potential Function.
2. Binomial heap:
A heap similar to a binary heap, but with a more complex structure that allows faster merging of two heaps. It is
used in applications requiring fast merging.
What is a Binomial tree?
A binomial tree Bk is an ordered tree defined recursively, where k is the order of the binomial tree.
• If the binomial tree is represented as B0, then the tree consists of a single node.
• In general terms, Bk consists of two binomial trees Bk-1 that are linked together, in which the root of one tree becomes the leftmost child of the root of the other.
We can understand it with the example given below:
i. For B0 (k = 0), there exists only one node in the tree.
ii. For B1 (k = 1), there are two binomial trees B0, in which one B0 becomes the left subtree of the other.
iii. For B2 (k = 2), there are two binomial trees B1, in which one B1 becomes the left subtree of the other.
iv. For B3 (k = 3), there are two binomial trees B2, in which one B2 becomes the left subtree of the other.
• Every binomial tree in the heap must follow the min-heap property, i.e., the key of a node is greater than or equal to the key of its parent.
• For any non-negative integer k, there should be at most one binomial tree in the heap whose root has degree k.
The first property ensures that the min-heap property holds throughout the heap, whereas the second property ensures that a binomial heap with n nodes has at most 1 + log2 n binomial trees, where log2 is the binary logarithm.
The above figure has three binomial trees, i.e., B0, B2, and B3. The above all three binomial trees satisfy the min
heap's property as all the nodes have a smaller value than the child nodes.
The above figure also satisfies the second property of the binomial heap. For example, if we consider the value of k
as 3, we can observe in the above figure that the binomial tree of degree 3 exists in a heap.
When we create a new binomial heap, it simply takes O(1) time, because creating a heap only creates the head of the heap, with no elements attached.
As stated above, a binomial heap is a collection of binomial trees, and every binomial tree satisfies the min-heap property, which means each root node contains the minimum value of its tree. Therefore, we only have to compare the root nodes of all the binomial trees to find the minimum key. Since a heap with n nodes contains O(log n) trees, the time complexity of finding the minimum key in a binomial heap is O(log n).
We can see that there are two binomial heaps, so first we have to combine both heaps. To combine the heaps, we first need to arrange their binomial trees in increasing order of degree.
Now, first apply Case 1, which says 'if degree[x] ≠ degree[next x], then move the pointer ahead'; but in the above example degree[x] = degree[next x], so this case does not apply.
Now, apply Case 2, which says 'if degree[x] = degree[next x] = degree[sibling(next x)], then move the pointer ahead'. This case also does not apply in the above heap.
Now, apply Case 3, which says 'if degree[x] = degree[next x] ≠ degree[sibling[next x]] and key[x] < key[next x], then remove [next x] from the root list and attach it to x'. We will apply this case because the above heap satisfies the conditions of Case 3 -
Insert an element in the heap
Inserting an element into the heap can be done by simply creating a new heap containing only the element to be inserted and then merging it with the original heap. Due to the merging, a single insertion into a heap takes O(log n) time.
Now, let's understand the process of inserting a new node in a heap using an example.
First, we have to combine both of the heaps. As both node 12 and node 15 are of degree 0, node 15 is attached to node 12 as shown below –
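Both union and insert rest on one basic step: linking two binomial trees of the same degree so that the root with the larger key becomes a child of the root with the smaller key. A minimal Java sketch (node field names are illustrative assumptions):

class BinomialNode
{
    int key, degree;
    BinomialNode parent, child, sibling; // child = leftmost child; sibling = next node in the list

    BinomialNode(int key) { this.key = key; }
}

class BinomialLink
{
    // Links two trees of order k-1 into one tree of order k: the larger-keyed root
    // becomes the leftmost child of the smaller-keyed root, preserving min-heap order.
    static BinomialNode link(BinomialNode a, BinomialNode b)
    {
        if (a.key > b.key) { BinomialNode t = a; a = b; b = t; } // Ensure a has the smaller key
        b.parent = a;
        b.sibling = a.child; // b goes to the front of a's child list
        a.child = b;
        a.degree++;
        return a;
    }
}

This is exactly the attachment performed in Case 3 above, and it is what attaches node 15 to node 12 in the insert example.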
Disjoint set representation
Also known as union-find, this data structure manages a partition of a set into disjoint (non-overlapping) subsets.
It's commonly used in network connectivity and minimum spanning tree algorithms.
The disjoint set data structure is also known as the union-find data structure or merge-find set. It is a data structure that contains a collection of disjoint or non-overlapping sets; that is, a set partitioned into disjoint subsets. Various operations can be performed on the disjoint subsets: we can add new sets, we can merge sets, and we can find the representative member of a set. It also allows us to find out efficiently whether two elements are in the same set or not.
Disjoint sets are sets that have no elements in common. Let's understand disjoint sets through an example.
s1 = {1, 2, 3, 4}
s2 = {5, 6, 7, 8}
We have two subsets named s1 and s2. The s1 subset contains the elements 1, 2, 3, 4, while s2 contains the elements 5, 6, 7, 8. Since there is no common element between these two sets, their intersection is empty, which is why they are called disjoint sets. Now the question arises of how we can perform operations on them. We can perform only two operations, i.e., find and union.
In the case of the find operation, we have to check which set a given element is present in. There are two sets named s1 and s2, shown below:
Suppose we want to perform the union operation on these two sets. First, we have to check whether the elements on which we are performing the union operation belong to different sets or the same set. If they belong to different sets, then we can perform the union operation; otherwise, we cannot. For example, suppose we want to perform the union operation between 4 and 8. Since 4 and 8 belong to different sets, we apply the union operation. Once the union operation is performed, an edge will be added between 4 and 8, as shown below:
When the union operation is applied, the set would be represented as:
s1 ∪ s2 = {1, 2, 3, 4, 5, 6, 7, 8}
Suppose we add one more edge between 1 and 5. Now the final set can be represented as:
s3 = {1, 2, 3, 4, 5, 6, 7, 8}
If we consider any element from the above set, all the elements belong to the same set; since 1 and 5 were already in the same set before this edge was added, the edge creates a cycle in the graph.
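A minimal Java sketch of the disjoint set (union-find) structure, using the standard path compression and union by rank optimizations (class and method names are illustrative, not from the text above):

class DisjointSet
{
    private final int[] parent, rank;

    DisjointSet(int n) // Initially, each element is its own singleton set
    {
        parent = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    int find(int x) // Returns the representative of x's set, compressing the path as it goes
    {
        if (parent[x] != x)
            parent[x] = find(parent[x]);
        return parent[x];
    }

    boolean union(int x, int y) // Merges the sets of x and y; returns false if already in the same set
    {
        int rx = find(x), ry = find(y);
        if (rx == ry) return false; // Same set: adding this edge would create a cycle
        if (rank[rx] < rank[ry]) { int t = rx; rx = ry; ry = t; } // Attach the shorter tree under the taller
        parent[ry] = rx;
        if (rank[rx] == rank[ry]) rank[rx]++;
        return true;
    }
}

With the sets above, union(4, 8) succeeds because 4 and 8 have different representatives, while calling union(1, 5) after all eight elements have been merged returns false, which is exactly the cycle-detection signal described above.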