Unit 3
The greedy method is one of the techniques used to solve problems. This method is used for solving optimization problems: problems that require either a maximum or a minimum result.
The greedy approach is a simple method. It finds a solution by taking the next best move without exploring its future consequences, so it sometimes ends up with a suboptimal solution. A greedy algorithm does not provide the optimal solution for all problem instances. However, due to its simplicity and lower computational time, greedy algorithms are preferred over other methods whenever they do provide the optimal solution.
Examples:
Fractional knapsack problem
Huffman coding
Prim's and Kruskal's minimum spanning tree algorithms
On the other hand, dynamic programming always ensures the optimal solution. Dynamic programming requires more memory, as it stores the solution of every possible subproblem in a table. It does more work than the greedy approach, but the optimal solution is guaranteed.
Examples:
Floyd-Warshall and Bellman-Ford
Mathematical optimization problem
All-pairs shortest path problem
Flight control and robotics control.
Time sharing: It schedules the job to maximize CPU usage.
Huffman coding
o Huffman's greedy algorithm uses a table of the frequencies of occurrence of each character to build up an optimal way of representing each character as a binary string.
o How can we represent the data in a compact way?
o Fixed-length code: each character is represented by an equal number of bits. For example, with an alphabet of six characters, a fixed-length code needs at least 3 bits per character.
o Variable-length code: it can do considerably better than a fixed-length code by giving frequent characters short codewords and infrequent characters long codewords.
o It is used for the lossless compression of data.
o It uses variable-length encoding.
o It assigns variable-length codes to all the characters.
o The code length of a character depends on how frequently it occurs in the given text.
o Huffman Tree
o The steps involved in the construction of a Huffman tree are as follows:
1. Create a leaf node for each character and build a min-heap of all the leaf nodes, keyed on frequency.
2. Extract the two nodes with the minimum frequency from the min-heap.
3. Create a new internal node whose frequency is the sum of the two extracted nodes' frequencies; make the two extracted nodes its children and insert the new node back into the min-heap.
4. Repeat steps 2 and 3 until the min-heap contains a single node; that node is the root of the Huffman tree.
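As a rough Python sketch of these steps, using heapq as the min-priority queue (the frequency table at the bottom is illustrative, not taken from the source):

import heapq

def huffman_codes(freq):
    # Step 1: a min-heap of leaf nodes (frequency, tie-breaker, [(char, code)])
    heap = [(f, i, [(ch, "")]) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Step 2: extract the two nodes with minimum frequency
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Step 3: merge them; codes in the left subtree get '0', the right '1'
        merged = ([(ch, "0" + code) for ch, code in left]
                  + [(ch, "1" + code) for ch, code in right])
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    # Step 4: the single remaining node is the root; return its code table
    return dict(heap[0][2])

# Frequent characters receive shorter codes
print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))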
The knapsack problem is one of the famous and important problems that come under the greedy method.
This problem can be solved using two techniques:
o Brute-force approach: The brute-force approach tries all the possible solutions with all the different fractions, but it is a time-consuming approach.
o Greedy approach: In the greedy approach, we calculate the profit/weight ratio of each item and select items accordingly. The item with the highest ratio is selected first.
As this problem is solved using a greedy method, it is one of the optimization problems. An optimization problem needs an optimal solution, and an exhaustive search approach would be too time-consuming, which is why the greedy method is applied.
Consider a bag that has a capacity of 12 kg. Let's denote the maximum capacity of the bag by 'm'; here m = 12. Now let us take some objects, their weights, and the profit associated with each object.
We can see in the above example that there are 4 objects, each with a profit and a weight (the table itself is not reproduced here). Note that the total weight of the objects exceeds the capacity of the bag (total weight: 6 + 2 + 7 + 5 = 20).
Now, what should be our approach to fill the bag? We can choose the object with the maximum profit first, the object with the maximum weight first, or combine both criteria by selecting on the profit/weight ratio.
The third approach, selecting objects in decreasing order of the profit/weight ratio, is the one that yields the optimal solution for the fractional knapsack. For example, for a different set of seven objects, the ratios are:
Object 1: 5/1 = 5
Object 2: 10/3 = 3.33
Object 3: 15/5 = 3
Object 4: 7/4 = 1.75
Object 5: 8/1 = 8
Object 6: 9/3 = 3
Object 7: 4/2 = 2
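A minimal Python sketch of this ratio-based greedy selection follows; the capacity of 15 for this seven-object set is my assumption, since the surviving text does not state it:

def fractional_knapsack(profits, weights, capacity):
    # Consider objects in decreasing order of profit/weight ratio
    items = sorted(zip(profits, weights), key=lambda pw: pw[0] / pw[1], reverse=True)
    total = 0.0
    for profit, weight in items:
        if capacity >= weight:      # the whole object fits: take all of it
            total += profit
            capacity -= weight
        else:                       # take only the fraction that still fits
            total += profit * (capacity / weight)
            break
    return total

# The seven objects from the ratio list above, with an assumed capacity of 15
print(fractional_knapsack([5, 10, 15, 7, 8, 9, 4], [1, 3, 5, 4, 1, 3, 2], 15))  # 51.0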
When the same problem is solved exactly with dynamic programming (the 0/1 knapsack), the selected objects are recovered by tracing back through the filled table. As we can observe in the table (not reproduced here), 5 is the maximum profit among all the entries. The pointer points to the last row and the last column, which holds the value 5. We now compare this 5 with the previous row: if the previous row (i = 3) contains the same value 5, the pointer shifts upwards. Since the previous row does contain the value 5, the pointer is shifted upwards.
Again, we compare the value 5 with the row above (i = 2). Since that row also contains the value 5, the pointer is again shifted upwards.
Again, we compare the value 5 with the row above, i = 1. Since that row does not contain the same value, we consider the row i = 1, and the weight corresponding to this row is 4. Therefore, we select the weight 4 and reject the weights 5 and 6, as shown below:
x = {1, 0, 0}
The profit corresponding to the selected weight is 3; therefore, the remaining profit is 5 - 3 = 2. We now repeat the procedure with the value 2: since the row above also contains the value 2, the pointer is shifted upwards again.
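The traceback described above can be sketched generically in Python: fill the table t, then walk upwards from the last cell, selecting object i whenever its row differs from the row above (the profits, weights and capacity below are hypothetical, since the original table is not reproduced):

def knapsack_01(profits, weights, capacity):
    n = len(profits)
    # t[i][w] = best profit achievable with the first i objects and capacity w
    t = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            t[i][w] = t[i - 1][w]                  # option 1: skip object i
            if weights[i - 1] <= w:                # option 2: take object i
                t[i][w] = max(t[i][w], t[i - 1][w - weights[i - 1]] + profits[i - 1])
    # Traceback: if the value differs from the row above, object i was selected
    x, w = [0] * n, capacity
    for i in range(n, 0, -1):
        if t[i][w] != t[i - 1][w]:
            x[i - 1] = 1
            w -= weights[i - 1]
    return t[n][capacity], x

# Hypothetical data: prints (5, [0, 0, 1])
print(knapsack_01([3, 4, 5], [4, 5, 6], 8))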
Tree Traversal
Traversal is a process to visit all the nodes of a tree and may print their values too. Because all nodes are connected via edges (links), we always start from the root (head) node; that is, we cannot randomly access a node in a tree. There are three ways in which we traverse a tree, illustrated below:
In-order Traversal
Pre-order Traversal
Post-order Traversal
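A minimal Python sketch of the three traversals on a simple binary-tree node (the node class and the sample tree are my own):

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):       # left subtree, then root, then right subtree
    if node:
        inorder(node.left)
        print(node.value, end=" ")
        inorder(node.right)

def preorder(node):      # root, then left subtree, then right subtree
    if node:
        print(node.value, end=" ")
        preorder(node.left)
        preorder(node.right)

def postorder(node):     # left subtree, then right subtree, then root
    if node:
        postorder(node.left)
        postorder(node.right)
        print(node.value, end=" ")

# A small tree: root 1 with children 2 and 3
root = Node(1, Node(2), Node(3))
inorder(root)    # prints: 2 1 3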
Spanning Tree
A spanning tree is a subgraph of an undirected graph that connects all the vertices using the minimum possible number of edges.
If all the vertices are connected in a graph, then there exists at least one spanning tree. In a graph,
there may exist more than one spanning tree.
Properties
A spanning tree does not have any cycle.
Any vertex can be reached from any other vertex.
Minimum Spanning Tree
A Minimum Spanning Tree (MST) is a subset of edges of a connected weighted undirected
graph that connects all the vertices together with the minimum possible total edge weight. To
derive an MST, Prim’s algorithm or Kruskal’s algorithm can be used.
Kruskal’s Algorithm
Kruskal's Algorithm builds the spanning tree by adding edges one by one into a growing spanning tree. Kruskal's algorithm follows the greedy approach, as in each iteration it finds the edge with the least weight and adds it to the growing spanning tree.
Algorithm Steps:
Sort the graph edges with respect to their weights.
Start adding edges to the MST from the edge with the smallest weight until the edge of
the largest weight.
Only add edges that do not form a cycle, i.e., edges that connect only disconnected components.
So now the question is: how do we check whether two vertices are connected or not?
This could be done using a DFS that starts from the first vertex and checks whether the second vertex is visited. But DFS would make the time complexity large, as it has an order of O(V+E), where V is the number of vertices and E is the number of edges. So the better solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, meaning they have no element in common. A combined sketch of Kruskal's algorithm using disjoint sets appears after the example below.
Example:
In Kruskal’s algorithm, at each iteration we will select the edge with the lowest weight. So, we
will start with the lowest weighted edge first i.e., the edges with weight 1. After that we will
select the second lowest weighted edge i.e., edge with weight 2. Notice these two edges are
totally disjoint. Now, the next edge will be the third lowest weighted edge i.e., edge with weight
3, which connects the two disjoint pieces of the graph. Now, we are not allowed to pick the edge with weight 4, as that would create a cycle, and we can't have any cycles. So we will select the next lowest weighted edge, i.e., the edge with weight 5. The remaining two edges would create cycles, so we will ignore them. In the end, we end up with a minimum spanning tree of total cost 11 (= 1 + 2 + 3 + 5).
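A combined Python sketch of Kruskal's algorithm with the disjoint-set check discussed above; the edge list is a hypothetical graph chosen so that the selected weights (1, 2, 3, 5) and total cost 11 match the walkthrough, since the original figure is not reproduced:

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v); vertices are numbered 0..num_vertices-1
    parent = list(range(num_vertices))

    def find(v):
        # Find the representative of v's set, compressing the path as we go
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total, mst = 0, []
    for weight, u, v in sorted(edges):      # smallest weight first
        ru, rv = find(u), find(v)
        if ru != rv:                        # disjoint sets: no cycle is formed
            parent[ru] = rv                 # union the two sets
            total += weight
            mst.append((u, v, weight))
    return total, mst

edges = [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 2), (5, 3, 4), (6, 1, 3), (7, 0, 4)]
print(kruskal(5, edges))  # total cost 11 = 1 + 2 + 3 + 5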
Time Complexity:
In Kruskal's algorithm, the most time-consuming operation is sorting the edges, which takes O(E log E) = O(E log V) time. The disjoint-set operations take at most O(E log V) as well, so the overall time complexity of the algorithm is O(E log V).
Prim's Algorithm
Prim's algorithm is a greedy algorithm that is used to find the minimum spanning tree of a graph. It finds the subset of edges that includes every vertex of the graph such that the sum of the weights of the edges is minimized.
Prim's algorithm starts with a single node and explores all the adjacent nodes with all the connecting edges at every step. The edges with minimal weights that cause no cycles in the graph get selected.
It starts from one vertex and continues to add the edges with the smallest weight until all the vertices are included. The steps to implement Prim's algorithm are given as follows -
o First, we have to initialize the MST with a randomly chosen vertex.
o Now, we have to find all the edges that connect the tree in the above step with the new
vertices. From the edges found, select the minimum edge and add it to the tree.
o Repeat step 2 until the minimum spanning tree is formed.
The applications of prim's algorithm are -
o Prim's algorithm can be used in network designing.
o It can be used in cluster analysis.
o It can also be used to lay down electrical wiring cables
Example of Prim's algorithm
o Now, let's see the working of Prim's algorithm using an example. It will be easier to understand Prim's algorithm with an example.
o Suppose we have the following weighted graph (the figure is not reproduced here).
Step 1 - First, we have to choose a starting vertex of the graph. Let's choose vertex B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two edges from vertex B: B to C with weight 10 and B to D with weight 4. Among these, the edge BD has the minimum weight, so add it to the MST.
Step 3 - Now, again, choose the edge with the minimum weight among the remaining edges. In this case, the edges DE and CD are such edges. Add them to the MST one by one: first, select the edge DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE, as it would create a cycle in the graph. So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph. The cost of
the MST is given below
Cost of MST = 4 + 2 + 1 + 3 = 10 units.
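A minimal Python sketch of Prim's algorithm, using heapq as the min-priority queue. The adjacency list reconstructs the example above: BD = 4 and BC = 10 are given in the steps; DE = 1, CD = 2 and CA = 3 are inferred from the step order and the cost line; CE = 8 is purely my assumption:

import heapq

def prim(graph, start):
    # graph: {vertex: [(weight, neighbor), ...]}; returns (cost, mst_edges)
    visited = {start}
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    cost, mst = 0, []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)   # cheapest edge leaving the tree
        if v in visited:
            continue                    # this edge would create a cycle
        visited.add(v)
        cost += w
        mst.append((u, v, w))
        for w2, nxt in graph[v]:        # new candidate edges from v
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return cost, mst

graph = {
    "A": [(3, "C")],
    "B": [(10, "C"), (4, "D")],
    "C": [(3, "A"), (10, "B"), (2, "D"), (8, "E")],
    "D": [(4, "B"), (2, "C"), (1, "E")],
    "E": [(1, "D"), (8, "C")],
}
print(prim(graph, "B"))  # cost 10 = 4 + 1 + 2 + 3, as in the example above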
Time complexity analysis
If an adjacency list is used to represent the graph, then using breadth-first search, all the
vertices can be traversed in O(V + E) time.
We traverse all the vertices of the graph using breadth-first search and use a min-heap for
storing the vertices not yet included in the MST.
To get the minimum-weight edge, we use the min-heap as a priority queue.
Min-heap operations like extracting the minimum element and decreasing a key value take
O(log V) time.
So, the overall time complexity
= O(E + V) x O(log V)
= O((E + V) log V)
= O(E log V)
This time complexity can be improved and reduced to O(E + V log V) using a Fibonacci heap.
Matrix Chain Multiplication Solution using Dynamic Programming
The matrix chain multiplication problem can be easily solved using dynamic programming because it is an optimization problem, where we need to find the most efficient sequence for multiplying the matrices.
For example, consider the parenthesizations ((A.B).C).D and (A.B).(C.D) for a chain of four matrices A, B, C and D.
Notice that the multiplication of matrix A with matrix B, i.e., (A.B), is repeated in the two sequences.
If we could reuse the previous multiplication result of A.B in the next sequence, our algorithm would become faster.
For this, we have to store the solutions of such subproblems in a 2D array (i.e., memoize them) so that we can easily reuse them later.
The optimal substructure is defined as: m[i, j] = 0 when i = j, and otherwise
m[i, j] = min over i <= k < j of ( m[i, k] + m[k+1, j] + dims[i-1] * dims[k] * dims[j] ),
where matrix A[i] has dimensions dims[i-1] x dims[i].
In Python, a direct translation of this recurrence (keeping the original comments) is:

import sys

def matrix_chain_multiplication(dims):
    # dims has length n + 1: matrix A[i] has dimensions dims[i-1] x dims[i]
    n = len(dims) - 1
    # m[i][j] = minimum number of scalar multiplications (i.e., cost)
    # needed to compute the matrix A[i]A[i+1]...A[j] = A[i..j];
    # the cost is zero when multiplying one matrix
    m = [[0] * (n + 1) for _ in range(n + 1)]
    # s[i][j] = index of the subsequence split that achieved minimal cost
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # subsequence lengths
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                cost = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m[1][n], s
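As a quick check of the routine above, with a hypothetical dimension list for three matrices A1 (10x30), A2 (30x5) and A3 (5x60):

cost, s = matrix_chain_multiplication([10, 30, 5, 60])
print(cost)      # 4500: computing (A1.A2) first costs 10*30*5 + 10*5*60
print(s[1][3])   # 2: the optimal split is after A2, i.e., (A1.A2).A3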
There are three nested loops, and each loop executes at most n times, so the complexity is O(n^3).
Longest common subsequence using dynamic programming
The longest common subsequence (LCS) is defined as the longest subsequence that
is common to all the given sequences, provided that the elements of the
subsequence are not required to occupy consecutive positions within the original
sequences.
If S1 and S2 are the two given sequences, then Z is a common subsequence of S1 and S2 if Z is a subsequence of both S1 and S2.
Suppose we have a string W = abcd.
The following are some of the subsequences that can be created from the above string:
o ab
o bd
o ac
o ad
o acd
o bcd
In the above table (not reproduced here), all the entries are filled, and we start at the last cell, which holds the value 4. This cell moves to the left, which also contains the value 4; therefore, the first character found for the LCS is 'a'. The left cell then moves diagonally upwards to a cell whose value is 3; therefore, the next character is 'b', and the string becomes 'ba'. Now the cell has the value 2 and moves to the left. The next cell also has the value 2 and moves upwards; therefore, the next character is 'a', and the string becomes 'aba'. The next cell has the value 1 and moves upwards. Now we reach the cell (b, b), whose value moves diagonally upwards; therefore, the next character is 'b'. The final longest common subsequence is 'baba'.
If we use the dynamic programming approach, then the number of function calls is reduced. The dynamic programming approach stores the result of each function call so that it can be reused in future function calls without calling the function again.
In the above dynamic programming algorithm, the results obtained from the comparisons between the elements of x and the elements of y are stored in the table so that they can be reused in future computations.
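A minimal Python sketch of this tabulation, together with the traceback that recovers the subsequence (the strings and variable names here are my own illustration, not the example from the table above):

def lcs(x, y):
    m, n = len(x), len(y)
    # c[i][j] = length of the LCS of the prefixes x[:i] and y[:j]
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1            # characters match
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])  # skip a character
    # Traceback from the last cell to recover one LCS
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])        # diagonal move: part of the LCS
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1                      # move upwards
        else:
            j -= 1                      # move to the left
    return "".join(reversed(out))

print(lcs("abaab", "baab"))  # prints: baab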