Cs3401 Algorithms Unit III
Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort.
Dynamic programming: Elements of dynamic programming - Matrix-chain multiplication -
Multistage graph - Optimal Binary Search Trees. Greedy Technique: Elements of the
greedy strategy - Activity-selection problem - Optimal Merge pattern - Huffman Trees.
Applications
The divide-and-conquer technique is the basis of efficient algorithms for many problems, such
as:
Finding maximum/minimum
Sorting (e.g., quicksort, merge sort)
Multiplying large numbers (e.g., the Karatsuba algorithm)
Finding the closest pair of points
Syntactic analysis (e.g., top-down parsers)
Computing the discrete Fourier transform (FFT).
Methods
- Naïve method
- Using inbuilt function - sort()
- Divide and Conquer method (or) Recursive method
def find_min_max(arr):
    # Naive method: one pass, two comparisons per element
    min_val, max_val = arr[0], arr[0]
    for i in range(1, len(arr)):
        if arr[i] < min_val:
            min_val = arr[i]
        if arr[i] > max_val:
            max_val = arr[i]
    return min_val, max_val
Execution
Minimum value in the array : 4
Maximum value in the array : 89
Algorithm analysis
- The worst case occurs when the elements are sorted in descending order; we then make
two comparisons at each iteration.
- Total number of comparisons = 2*(n - 1), so Time Complexity = 2n - 2 = O(n).
- Space complexity = O(1)
Execution
Minimum value in the array : 4
Maximum value in the array : 89
def find_max_min(arr):
    if len(arr) == 1:
        return (arr[0], arr[0])
    elif len(arr) == 2:
        return (max(arr[0], arr[1]), min(arr[0], arr[1]))
    else:
        mid = len(arr) // 2
        left_max, left_min = find_max_min(arr[:mid])
        right_max, right_min = find_max_min(arr[mid:])
        return (max(left_max, right_max), min(left_min, right_min))

# Example usage
arr = [3, 5, 1, 9, 7, 4, 2, 8, 6]
max_val, min_val = find_max_min(arr)
print(f"Max value: {max_val}")
print(f"Min value: {min_val}")
Algorithm analysis
Let T(n) = time required to apply the algorithm on an array of size n. The algorithm makes
two recursive calls on halves of the array and combines their results with two more
comparisons, giving the recurrence T(n) = 2T(n/2) + 2. Let us assume n is a power of 2,
i.e., n = 2^k, where k is the height of the recursion tree.
T(2) = 1, the time required to compare two elements/items. (Time is measured in units of
the number of comparisons.)
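Unfolding this recurrence (with T(2) = 1 and n = 2^k) gives the exact number of comparisons:

```latex
T(n) = 2\,T(n/2) + 2
     = 2^2\,T(n/2^2) + 2^2 + 2
     = \dots
     = 2^{k-1}\,T(2) + \sum_{i=1}^{k-1} 2^i
     = \frac{n}{2} + (2^k - 2)
     = \frac{3n}{2} - 2
```

So the divide-and-conquer method needs 3n/2 - 2 comparisons, fewer than the 2n - 2 of the naive method.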
Sorting
Sorting involves arranging a collection of elements in a specific order, such as numerical or
alphabetical order. There are several algorithms for sorting, including bubble sort, selection
sort, insertion sort, quicksort, mergesort, and heapsort.
The combination of sorting and divide and conquer is often used in solving problems
that involve sorting a large data set. Merge sort and quick sort use the divide and conquer
method to sort a collection of elements.
Merge sort
Merge sort is a sorting algorithm that works by dividing an array into two halves,
recursively sorting each half, and then merging the sorted halves back together.
Basic idea
Repeatedly divide the array in half until each sub-array contains only one element,
which is already sorted by definition.
The two sorted sub-arrays are then merged back together to form a larger sorted
array, and the process continues until the entire array is sorted.
Here are the high-level steps of the merge sort algorithm:
Divide the array into two halves.
Recursively sort each half using merge sort.
Merge the sorted halves back together.
The key step in merge sort is the merge operation, which combines two sorted sub-arrays
into a larger sorted array. Here are the steps for the merge operation:
Create a new array to hold the merged result.
Initialize two pointers, one for each sub-array, pointing to the first element of each
sub-array.
Compare the values at the two pointers, and add the smaller value to the merged result
array.
Move the pointer of the sub-array whose value was added to the merged result array
to the next element.
Repeat steps 3 and 4 until one sub-array is fully processed.
Add the remaining elements of the other sub-array to the merged result array.
Example:
Here is an example of how merge sort works on the array [5, 2, 9, 1, 5, 6]:
Divide the array into two halves: [5, 2, 9] and [1, 5, 6].
Recursively sort each half using merge sort:
o [5, 2, 9] -> [2, 5, 9] and [1, 5, 6] -> [1, 5, 6]
Merge the two sorted halves back together
o [2, 5, 9] and [1, 5, 6] ->[1, 2, 5, 5, 6, 9].
The final result is the sorted array [1, 2, 5, 5, 6, 9].
Example 2
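The merge sort steps above can be implemented as follows (a minimal sketch; the input array used here is an assumption, chosen so the sorted result matches the execution output shown):

```python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # recursively sort the left half
    right = merge_sort(arr[mid:])   # recursively sort the right half
    # merge the two sorted halves using two pointers
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # remaining elements of the left half
    merged.extend(right[j:])   # remaining elements of the right half
    return merged

print(merge_sort([49, 5, 65, 12, 1, 6]))  # [1, 5, 6, 12, 49, 65]
```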
Execution:
[1, 5, 6, 12, 49, 65]
Quick sort
Quick sort picks a pivot element, partitions the array into the elements smaller than, equal
to, and greater than the pivot, and then recursively sorts the two partitions.
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
Execution
[2, 3, 13, 35, 39, 47]
Note:
Memoization (or memoisation) is an optimization technique that speeds up programs by
storing the results of expensive function calls; when the same inputs occur again, the
cached result is returned instead of being recomputed.
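For instance, a memoized Fibonacci function (a standard illustration, not taken from these notes) caches each result the first time it is computed:

```python
def fib(n, memo=None):
    """Top-down Fibonacci: each subproblem is computed only once."""
    if memo is None:
        memo = {}
    if n in memo:
        return memo[n]          # same input seen before: return cached result
    if n < 2:
        return n
    memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

print(fib(30))  # 832040
```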
Bottom-Up approach
The bottom-up approach uses the tabulation technique. It solves the same subproblems
but removes the recursion: with no recursion there is no stack overflow issue and no
overhead of recursive function calls. In this tabulation technique, we solve the subproblems
iteratively and store the results in a table (matrix).
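The same Fibonacci example in tabulated, bottom-up form (again an illustration, not from the original notes) fills the table iteratively with no recursion:

```python
def fib_table(n):
    """Bottom-up Fibonacci: fill a table from the base cases upward."""
    if n < 2:
        return n
    table = [0] * (n + 1)   # table[i] will hold fib(i)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib_table(30))  # 832040
```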
Substructure: Sub-structuring is the process of dividing the given problem into smaller
sub-problems. By solving these sub-problems and combining their solutions, we can solve
the original problem. This is the basis for the recursive nature of dynamic programming.
Table Structure: It is necessary to store the result of each sub-problem in a table. By
reusing the solutions of sub-problems many times, we avoid solving the same sub-problem
again and again.
Bottom-up approach: The solutions of the sub-problems are combined, using the table, to
produce the final result.
Aim
Find the most efficient way to multiply a given sequence of matrices. The problem is
not actually to perform the multiplications but to decide the sequence of the matrix
multiplications involved.
Matrix multiplication is associative: no matter how the product is parenthesized, the
result obtained remains the same.
Example of Matrix Chain Multiplication
Consider 4 matrices A, B, C, and D.
((AB)C)D = (A(BC))D = (AB)(CD) = A((BC)D) = A(B(CD))
if A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix
(AB)C needs (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) needs (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Clearly, the first method is more efficient.
Algorithm
First, divide the matrix sequence into two subsequences.
Find the minimum cost of multiplying out each subsequence.
Add these costs together, along with the cost of multiplying the two resulting
matrices.
Repeat this procedure for every possible split of the matrix sequence and take the
minimum.
Detailed Algorithm
Define the subproblems:
Let m[i,j] be the minimum number of scalar multiplications needed to compute the
product of matrices Ai x Ai+1 x ... x Aj.
Identify the base cases:
When there is only one matrix, m[i,i] = 0.
Define the recurrence relation:
To find m[i,j], we need to consider all possible ways to split the product Ai x Ai+1 x ... x Aj
into two sub-products, and then choose the one that requires the minimum number of
scalar multiplications. Let k be the index at which we split the product, such that i<= k < j.
Then we have:
m[i,j] = min over i <= k < j of (m[i,k] + m[k+1,j] + ri * ck * cj)
where ri is the number of rows of Ai, ck is the number of columns of Ak (which equals the
number of rows of Ak+1), and cj is the number of columns of Aj.
Compute the solution:
The solution to the original problem is m[1,n].
Python Code
def matrix_chain_multiplication(dimensions):
    # dimensions[i] = (rows, cols) of matrix Ai
    n = len(dimensions)
    m = [[float('inf')] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 0
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                cost = (m[i][k] + m[k+1][j]
                        + dimensions[i][0] * dimensions[k+1][0] * dimensions[j][1])
                if cost < m[i][j]:
                    m[i][j] = cost
    return m[0][n-1]
# Example usage
dimensions = [(10, 20), (20, 30), (30, 40)]
min_cost = matrix_chain_multiplication(dimensions)
print(f"Minimum cost for multiplying matrices {dimensions} is {min_cost}.")
M[1, 3] = 264
Comparing, 264 is the minimum of the cases, so we insert 264 in the table and choose the
combination (M1 x M2) x M3.
M[2, 4] = 1320
Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table and choose
the combination M2 x (M3 x M4).
M[3, 5] = M3 M4 M5
There are two cases by which we can solve this multiplication: (M3 x M4) x M5 and
M3 x (M4 x M5). After solving both cases we choose the one with the minimum cost.
Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table and choose
the combination (M3 x M4) x M5.
M[1, 4] = 1080
Comparing the outputs of the different cases, 1080 is the minimum, so we insert 1080 in
the table and choose the combination (M1 x M2) x (M3 x M4).
M[2, 5] = M2 M3 M4 M5
There are three cases by which we can solve this multiplication:
(M2 x M3 x M4) x M5
M2 x (M3 x M4 x M5)
(M2 x M3) x (M4 x M5)
M[2, 5] = 1350
Comparing the outputs of the different cases, 1350 is the minimum, so we insert 1350 in
the table and choose the combination M2 x (M3 x M4 x M5).
M[1, 5] = 1344
Comparing the outputs of the different cases, 1344 is the minimum, so we insert 1344 in
the table and choose the combination M1 x M2 x (M3 x M4 x M5).
Step 3: Computing Optimal Costs: let us assume that matrix Ai has dimension p(i-1) x pi
for i = 1, 2, 3, ..., n. The input is a sequence (p0, p1, ..., pn) where length[p] = n + 1. The
procedure uses an auxiliary table m[1..n, 1..n] for storing the m[i, j] costs and an auxiliary
table s[1..n, 1..n] that records which index k achieved the optimal cost in computing m[i, j].
The algorithm first sets m[i, i] = 0 for i = 1, 2, 3, ..., n, the minimum costs for chains of
length 1.
Example
Matrices: A1 dimensions: (3 x 5), A2 dimensions: (5 x 4) and A3 dimensions: (4 x 6)
Option 1: ((A1 . A2) . A3) costs (3 x 5 x 4) + (3 x 4 x 6) = 60 + 72 = 132 scalar
multiplications.
Option 2: (A1 . (A2 . A3)) costs (5 x 4 x 6) + (3 x 5 x 6) = 120 + 90 = 210 scalar
multiplications, so Option 1 is cheaper.
Algorithm analysis
Approach Time Complexity Space Complexity
Recursive Solution O(2^n) O(n)
Dynamic Programming O(n^3) O(n^2)
Multistage Graph
Steps
Divide the graph into multiple stages.
Define the sub-problems for each stage and compute the optimal solution for each sub-
problem.
Use the optimal solution for the last stage to recursively compute the optimal solution
for the previous stages.
Compute the optimal solution for the first stage, which is the solution to the original
problem.
The algorithm operates in the backward direction, i.e., it starts from the last vertex of
the graph and proceeds backward to find the minimum cost path.
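The backward procedure above can be sketched in Python. The graph below is a hypothetical 4-stage example (vertex 1 = source, vertex 6 = sink, vertices numbered in topological order); the function name and data are ours, not from the notes:

```python
def multistage_min_cost(edges, n):
    """Backward DP on a multistage graph with vertices 1..n in topological order."""
    INF = float('inf')
    cost = [INF] * (n + 1)   # cost[u] = minimum cost from u to the sink
    nxt = [0] * (n + 1)      # nxt[u] = next vertex on an optimal path
    cost[n] = 0
    for u in range(n - 1, 0, -1):       # process vertices backward
        for v, w in edges.get(u, []):
            if w + cost[v] < cost[u]:
                cost[u] = w + cost[v]
                nxt[u] = v
    path, u = [1], 1                    # reconstruct the optimal path
    while u != n:
        u = nxt[u]
        path.append(u)
    return cost[1], path

edges = {1: [(2, 2), (3, 1)],
         2: [(4, 2), (5, 3)],
         3: [(4, 3), (5, 1)],
         4: [(6, 2)],
         5: [(6, 5)]}
print(multistage_min_cost(edges, 6))  # (6, [1, 2, 4, 6])
```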
Applications
Advantages
It is more efficient and can be used to solve complex problems.
Multistage graphs are also easier to implement and can be more easily scaled.
Disadvantages
Uses more space.
Difficult to interpret since the information is spread out over multiple stages.
Algorithm analysis
Time complexity: O(nm^2), where n is the number of stages in the graph and m is the
maximum number of nodes in any stage. The algorithm computes the optimal solution for
each node in each stage, and the computation for each node takes O(m) time. Since there
are n stages with up to m nodes each, the overall time complexity is O(nm^2).
Solve the following multistage graph and find the minimum cost
Solution
This problem is solved using the tabular method, so we draw a table containing all
vertices (v), costs (c) and destinations (d).
Our main objective is to select the paths which have minimum cost, so we can say that
this is a minimization problem. It can be solved by the principle of optimality, which says
that an optimal sequence of decisions is composed of optimal sub-decisions; that means at
every stage we have to take a decision.
In this problem we will start from vertex 12, whose cost is 0.
Now calculate for V(12) and stage 5.
Here Cost(5, 12) = 0, i.e., Cost(stage number, vertex).
Find the minimum of [16, 16, 21, 17], which is 16; the vertices giving the minimum cost
are 2 and 3, because we get 16 twice, so we consider both vertices.
Now update the new cost and distance in table for stage 1.
Like all tree data structures, a binary search tree has a root, the top node (just one node);
each parent node has at most two child nodes, and children of the same parent are called
siblings. An edge is the connection between one node and another. A node without
children is called a leaf.
Basic Idea
Given a sorted array keys[0 .. n-1] of search keys and an array freq[0 .. n-1] of
frequency counts, where freq[i] is the number of searches for keys[i], construct a
binary search tree of all the keys such that the total cost of all the searches is as small as
possible.
For example: 10, 20, 30 are the keys, and the following are the binary search trees that can be
made out from these keys.
Formula for calculating the number of trees:
The number of distinct binary search trees with n nodes is given by the Catalan
number: C(n) = (2n)! / ((n+1)! * n!). For n = 3 keys this gives 5 possible trees.
The cost required for searching an element depends on the comparisons to be made to search
an element. Now, we will calculate the average cost of time of the above binary search trees.
Example 1: Case 1, Case 2, Case 3, Case 4; Examples 3 and 4 (tree diagrams not
reproduced here).
Step 1:
Calculate the values where j - i is equal to zero. (The cost table after steps 1 and 2 is not
reproduced here.)
When i =0, j=0, then j-i = 0 (0,0)
When i = 1, j=1, then j-i = 0 (1,1)
When i = 2, j=2, then j-i = 0 (2,2)
When i = 3, j=3, then j-i = 0 (3,3)
When i = 4, j=4, then j-i = 0 (4,4)
c[0,0] = 0, c[1 ,1] = 0, c[2,2] = 0, c[3,3] = 0, c[4,4] = 0
Step 2:
Calculate the values where j - i equal to 1.
When j=1, i =0 then j-i = 1 (0,1)
When j =2, i = 1 then j-i = 1 (1,2)
When j =3, i = 2 then j-i = 1 (2,3)
When j =4, i = 3 then j-i = 1 (3,4)
Cost of c[0,1] = 4 [Key is 10 & Cost of key 10 = 4]
Cost of c[1,2] = 2 [Key is 20 & Cost of key 20 = 2]
Cost of c[2,3] = 6 [Key is 30 & Cost of key 30 = 6]
Cost of c[3,4] = 3 [Key is 40 & Cost of key 40 = 3]
Step 3:
Calculate the values where j - i = 2.
When j=2, i=0 then j-i = 2 (0,2)
When j=3, i=1 then j-i = 2 (1,3)
When j=4, i=2 then j-i = 2 (2,4)
In this case, we will consider two keys at a time.
When i=0 and j=2, we consider keys 10 and 20. There are two possible trees that can be
made from these two keys:
In the first binary tree, cost = 4*1 + 2*2 = 8
In the second binary tree, cost = 4*2 + 2*1 = 10
The minimum cost is 8; therefore, c[0,2] = 8
When i=1 and j=3, we consider keys 20 and 30. There are two possible trees that can be
made from these two keys:
In the first binary tree, cost = 1*2 + 2*6 = 14
In the second binary tree, cost = 1*6 + 2*2 = 10
The minimum cost is 10; therefore, c[1,3] = 10
When i=2 and j=4, we consider the keys at positions 3 and 4, i.e., 30 and 40. There are
two possible trees that can be made from these two keys:
In the first binary tree, cost = 1*6 + 2*3 = 12
In the second binary tree, cost = 1*3 + 2*6 = 15
The minimum cost is 12; therefore, c[2,4] = 12
Cost Table: (not reproduced here)
w[0, 4] = 4 + 2 + 6 + 3 = 15
Greedy Technique
Definition
Greedy algorithms work step by step, always choosing the step that provides an immediate
profit/benefit. They choose the "locally optimal solution", without thinking about future
consequences.
Greedy algorithms may not always lead to the globally optimal solution, because they
do not consider the entire data; the choice made by the greedy approach does not
consider future data and choices.
The greedy technique is used for optimization problems (finding a maximum or
minimum).
Activity-Selection Problem
Basic idea
Choose the activity that ends first, and then choose subsequent activities that do not
overlap with the previously chosen activity and have the earliest end time.
Greedy approach is used - since we want to maximize the activities that can be
executed, thus yielding an optimal solution.
Working process
Two activities, say i and j, are said to be non-conflicting if Sj >= Fi, where Sj denotes the
starting time of activity j and Fi refers to the finishing time of activity i.
Algorithm
Sort the activities by their end time in ascending order.
Select the first activity in the sorted list and mark it as selected.
For each subsequent activity, if its start time is greater than or equal to the end time of
the last selected activity, select it and mark it as selected.
Repeat step 3 until no more activities are left.
Example problem
Problem 1
List of activities to be performed.
Step 1: Sort the activities in ascending order of finish times.
Hence, the execution schedule of maximum number of non-conflicting activities will be:
(1,2), (3,4), (5,7), (8,9)
Problem 2
Given 10 activities along with their start and end time as
A = (A1 A2 A3 A4 A5 A6 A7 A8 A9 A10)
Si = (1,2,3,4,7,8,9,9,11,12)
Fi = (3,5,4,7,10,9,11,13,12,14)
Activities A1 A2 A3 A4 A5 A6 A7 A8 A9 A10
Start time 1 2 3 4 7 8 9 9 11 12
Finish time 3 5 4 7 10 9 11 13 12 14
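The greedy schedule for Problem 2 can be computed directly: sort by finish time, then pick each activity whose start time is at least the finish time of the last selected activity (the variable names below are ours):

```python
# Activities A1..A10 with their start and finish times from Problem 2
acts = list(zip(range(1, 11),
                [1, 2, 3, 4, 7, 8, 9, 9, 11, 12],      # start times
                [3, 5, 4, 7, 10, 9, 11, 13, 12, 14]))  # finish times
acts.sort(key=lambda a: a[2])          # sort by finish time
selected, last_finish = [], 0
for idx, s, f in acts:
    if s >= last_finish:               # non-conflicting with last selection
        selected.append(f"A{idx}")
        last_finish = f
print(selected)  # ['A1', 'A3', 'A4', 'A6', 'A7', 'A9', 'A10']
```

The greedy choice yields 7 non-conflicting activities for this data.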
Algorithm Analysis
Time Complexity: O(n) when the activities are already sorted by finish time, and
O(n log n) when they are not, since sorting dominates.
Space Complexity: O(n), as we need to store the start and end times of each activity in
memory.
Python Program
def activity_selection(start, finish):
    # assumes the activities are already sorted by finish time
    n = len(finish)
    activities = []
    i = 0
    activities.append(i)
    for j in range(1, n):
        if start[j] >= finish[i]:
            activities.append(j)
            i = j
    return activities
start = [1, 3, 0, 5, 8, 5]
finish = [2, 4, 6, 7, 9, 9]
selected_activities = activity_selection(start, finish)
print("Selected activities:", selected_activities)
Execution
Selected activities: [0, 1, 3, 4]
Example 1
Given 3 files with sizes 2, 3, 4 units. Find an optimal way to combine these files
Solution
Input: n = 3, size = {2, 3, 4}
Different ways to combine the files (Methods 1-3) correspond to the three possible first
merges:
Merging 2 and 3 first (cost 5), then 5 with 4 (cost 9): total = 5 + 9 = 14.
Merging 2 and 4 first (cost 6), then 6 with 3 (cost 9): total = 6 + 9 = 15.
Merging 3 and 4 first (cost 7), then 7 with 2 (cost 9): total = 7 + 9 = 16.
Merging the two smallest files first is optimal, with a total cost of 14.
Example 2
Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of
elements respectively.
Method 1 Method 2
If merge operations are performed, then Sorting the files - f4, f3, f1, f2, f5
M1 = merge f1 and f2 => 20 + 30 = 50 M1 = merge f4 and f3 => 5 + 10 = 15
M2 = merge M1 and f3 => 50 + 10 = 60 M2 = merge M1 and f1 => 15 + 20 = 35
M3 = merge M2 and f4 => 60 + 5 = 65 M3 = merge M2 and f2 => 35 + 30 = 65
M4 = merge M3 and f5 => 65 + 30 = 95 M4 = merge M3 and f5 => 65 + 30 = 95
Total number of operations is Total number of operations is
50 + 60 + 65 + 95 = 270 15 + 35 + 65 + 95 = 210
Method 3 (optimal merge pattern: always merge the two smallest files)
M1 = merge f4 and f3 => 5 + 10 = 15
M2 = merge M1 and f1 => 15 + 20 = 35
M3 = merge f2 and f5 => 30 + 30 = 60
M4 = merge M2 and M3 => 35 + 60 = 95
Total number of operations is
15 + 35 + 60 + 95 = 205
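The always-merge-the-two-smallest strategy can be sketched with Python's heapq (the function name is ours):

```python
import heapq

def optimal_merge_cost(sizes):
    """Optimal merge pattern: repeatedly merge the two smallest files."""
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)   # smallest file
        b = heapq.heappop(heap)   # second smallest file
        total += a + b            # cost of this merge
        heapq.heappush(heap, a + b)
    return total

print(optimal_merge_cost([20, 30, 10, 5, 30]))  # 205, as computed above
```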
Example 3
Consider the sequence {3, 5, 9, 11, 16, 18, 20}. Find optimal merge pattern for this data
Solution:
At each step, merge the two smallest sequences
Steps 1 to 7: repeatedly merge the two smallest sequences and keep the working list
sorted in ascending order, until a single sequence remains (diagrams not reproduced here).
Algorithm Analysis
Time complexity: O(n log n), where n is the total number of elements in all the input
arrays. The algorithm uses a min-heap to repeatedly find the minimum element among all
the arrays until all elements have been merged into a single array.
Space complexity: O(n). The algorithm needs to store all the elements in memory in
order to perform the merging operation.
Example
Suppose the string below is to be sent over a network.
Initial string
B C A A D D D C C A C A C A C
Each ASCII character occupies 8 bits. There are a total of 15 characters in the above string.
Thus, a total of 8 * 15 = 120 bits are required to send this string.
Step 5 - For each non-leaf node, assign 0 to the left edge and 1 to the right edge.
Repeat steps 3 to 5 for all the characters.
Algorithm Analysis
Time complexity: O(n log n), where n is the number of characters in the message. It
involves building a binary tree from the characters in the message, with the tree nodes
ordered by frequency in a priority queue.
Space complexity: O(n), for storing the binary tree and the codes assigned to each
character.
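The construction can be sketched with a min-heap: repeatedly merge the two lowest-frequency nodes, prefixing 0 along the left edge and 1 along the right, as in steps 3 to 5 above (the function name is ours):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build Huffman codes for the characters of text."""
    # heap items: [weight, [symbol, code], [symbol, code], ...]
    heap = [[wt, [sym, ""]] for sym, wt in sorted(Counter(text).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)       # lowest-frequency subtree
        hi = heapq.heappop(heap)       # next lowest
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]    # left edge gets 0
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]    # right edge gets 1
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(pair for pair in heap[0][1:])

message = "BCAADDDCCACACAC"
codes = huffman_codes(message)
bits = sum(len(codes[ch]) for ch in message)
print(codes, bits)  # 28 bits in total, versus 120 bits in plain ASCII
```

For this string the most frequent character C receives a 1-bit code, which is what drives the compression from 120 bits down to 28.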