Unit 3

The document discusses greedy algorithms and dynamic programming as methods for solving optimization problems, highlighting their differences in terms of optimality and computational efficiency. It provides examples of algorithms such as Prim's, Kruskal's, and Dijkstra's for greedy methods, and explains the Huffman coding technique for data compression. Additionally, it covers the knapsack problem, tree traversal methods, and matrix chain multiplication, detailing their respective approaches and complexities.


A greedy method is one of the techniques used to solve problems. This method is used for
solving optimization problems, which are problems that require either a maximum or a
minimum result.

The greedy approach is a simple method. It finds a solution by taking the next best move
without exploring its future consequences, so it sometimes ends up with a suboptimal
solution. A greedy algorithm does not provide the optimal solution for all problem instances.
However, due to its simplicity and lower computational time, a greedy algorithm is preferred
over other methods whenever it does provide the optimal solution.

Examples:

 Prim's Minimal Spanning Tree Algorithm.
 Travelling Salesman Problem.
 Graph – Map Coloring.
 Kruskal's Minimal Spanning Tree Algorithm.
 Dijkstra's Shortest Path Algorithm.
 Graph – Vertex Cover.
 Knapsack Problem.
 Job Scheduling Problem.

On the other hand, dynamic programming always ensures the optimal solution. Dynamic
programming requires more memory, as it stores the solution of every possible subproblem
in a table. It does a lot of work compared to the greedy approach, but the optimal solution is
ensured.
Examples:
 Floyd-Warshall and Bellman-Ford algorithms
 Mathematical optimization problems
 All-pairs shortest path problem
 Flight control and robotics control
 Time sharing: it schedules jobs to maximize CPU usage.
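The contrast can be seen concretely on the coin change problem (a small illustrative example, not from the text): with coin values {1, 3, 4} and amount 6, the greedy choice of always taking the largest coin that fits uses three coins (4 + 1 + 1), while dynamic programming finds the optimal two coins (3 + 3). A minimal Python sketch:

```python
def greedy_coins(coins, amount):
    """Greedy: repeatedly take the largest coin that still fits."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None

def dp_coins(coins, amount):
    """Dynamic programming: optimal coin count for every sub-amount."""
    INF = float("inf")
    best = [0] + [INF] * amount          # best[a] = fewest coins summing to a
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != INF else None
```

The greedy method is faster but, as this instance shows, not always optimal; the DP method does more work (a table of all sub-amounts) but the optimal answer is ensured.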
Huffman coding

o Data can be encoded efficiently using Huffman codes.
o It is a widely used and beneficial technique for compressing data.
o Huffman's greedy algorithm uses a table of the frequencies of occurrence of each
character to build up an optimal way of representing each character as a binary string.

o How can we represent the data in a compact way?
o Fixed-length code: each character is represented by an equal number of bits. With a
fixed-length code for a six-character alphabet, at least 3 bits per character are needed.
o Variable-length code: it can do considerably better than a fixed-length code by giving
frequent characters short codewords and infrequent characters long codewords.
o It is used for the lossless compression of data.
o It uses variable-length encoding: it assigns variable-length codes to the characters, and
the code length of a character depends on how frequently it occurs in the given text.

o Huffman Tree-

o The steps involved in the construction of a Huffman tree are as follows:

o Create a leaf node for each character of the text; the leaf node of a character contains
the frequency of occurrence of that character.
o Arrange all the nodes in increasing order of their frequency value.
o Take the first two nodes having minimum frequency and create a new internal node.
The frequency of this new node is the sum of the frequencies of those two nodes.
o Make the first node the left child and the other node the right child of the newly created
node.
o Repeat these steps until a single node (the root of the Huffman tree) remains.

o The overall time complexity of Huffman coding is O(n log n).
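The construction steps above can be sketched in Python, with a min-heap standing in for the repeatedly re-sorted node list. The character frequencies below are an assumed six-character example used only for illustration:

```python
import heapq

def huffman_codes(freq):
    """Build a Huffman code from a {symbol: frequency} map.
    Returns {symbol: bitstring}."""
    # Heap entries: (frequency, tiebreaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two minimum-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # New internal node whose frequency is the sum of the two
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")      # left edge  -> bit 0
            walk(node[1], prefix + "1")      # right edge -> bit 1
        else:
            codes[node] = prefix or "0"      # single-symbol edge case
    walk(heap[0][2], "")
    return codes

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
codes = huffman_codes(freq)
```

Each of the n - 1 merges costs O(log n) heap work, which gives the O(n log n) bound stated above. The resulting code is prefix-free: no codeword is a prefix of another.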

Knapsack problems using greedy approach

The knapsack problem is one of the famous and important problems that come under the
greedy method.

This problem can be solved with the help of two techniques:

o Brute-force approach: The brute-force approach tries all the possible solutions with all
the different fractions, but it is a time-consuming approach.
o Greedy approach: In the greedy approach, we calculate the profit/weight ratio of each
item and select items accordingly; the item with the highest ratio is selected first.

As this problem is solved using a greedy method, it is one of the optimization
problems.

An optimization problem needs to find an optimal solution; an exhaustive search would
find one but is far too slow, which is why the greedy approach is applied instead.

The knapsack problem is used in logistics, mathematics, cryptography, computer
science, and more.
A knapsack can also be considered as a bag, and the problem is to fill the bag with
objects in such a way that the profit is maximized. As we are trying to maximize the
profit, this problem is an optimization as well as a maximization problem.

Consider a bag that has a capacity of 12 kg; let us denote the maximum capacity of the
bag by 'm', so here m = 12. Now let us take 4 objects with weights 6, 2, 7 and 5 kg and
a profit associated with each weight. We can see that the total weight of the objects
(6 + 2 + 7 + 5 = 20 kg) exceeds the capacity of the bag.

Now, what should be our approach to fill the bag? We can choose the object with the
maximum profit first, or the object with the minimum weight first, or, in the third
approach, the object with the highest profit/weight ratio first. It is the third approach
that yields the optimal solution for the fractional knapsack.

Third approach:

In this case, we first calculate the profit/weight ratio of each object.

Object 1: 5/1 = 5

Object 2: 10/3 = 3.33

Object 3: 15/5 = 3

Object 4: 7/4 = 1.75

Object 5: 8/1 = 8

Object 6: 9/3 = 3

Object 7: 4/2 = 2

P/W: 5, 3.33, 3, 1.75, 8, 3, 2


The time complexity of the fractional knapsack is Θ(n log n) in the worst, best, and average case, dominated by sorting the items by their profit/weight ratio.
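The ratio-based selection can be sketched as follows. The knapsack capacity for the seven-object example is not stated in the text, so a capacity of 14 kg is assumed here for illustration; with it, the greedy method takes objects 5, 1, 2, 3 and 6 whole and half of object 7.

```python
def fractional_knapsack(profits, weights, capacity):
    """Greedy fractional knapsack: take items in decreasing
    profit/weight order, splitting the last item if needed.
    Returns the maximum total profit."""
    # Item indices sorted by profit/weight ratio, highest first.
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    total = 0.0
    remaining = capacity
    for i in order:
        if remaining <= 0:
            break
        take = min(weights[i], remaining)   # whole item, or the fraction that fits
        total += profits[i] * (take / weights[i])
        remaining -= take
    return total

# Seven-object example from the text; capacity of 14 kg is assumed.
best = fractional_knapsack([5, 10, 15, 7, 8, 9, 4],
                           [1, 3, 5, 4, 1, 3, 2], 14)
```

Sorting dominates the running time, matching the Θ(n log n) bound above.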

0/1 Knapsack Problem


The 0/1 knapsack problem means that each item is either taken completely or not taken at
all. For example, suppose we have two items having weights 2 kg and 3 kg, respectively. If
we pick the 2 kg item, we cannot take only 1 kg of it (the item is not divisible); we have to
pick the 2 kg item completely. This is the 0/1 knapsack problem, in which we either pick an
item completely or do not pick it at all. The 0/1 knapsack problem is solved by dynamic
programming.
Example of the 0/1 knapsack problem.
Consider the problem having the following weights and profits:
Weights: {3, 4, 6, 5}
Profits: {2, 3, 1, 4}
The capacity of the knapsack is 8 kg.
The number of items is 4.
xi = {1, 0, 0, 1}
= {0, 0, 0, 1}
= {0, 1, 0, 1}
The above are some of the possible combinations. 1 denotes that the item is completely
picked and 0 means that the item is not picked. Since there are 4 items, the number of
possible combinations is 2^4 = 16. So there are 16 possible combinations that can be made
for the above problem. Once all the combinations are made, we have to select the
combination that provides the maximum profit.
Another approach to solve the problem is the dynamic programming approach. In the
dynamic programming approach, the complicated problem is divided into sub-problems; we
find the solution of each sub-problem, and the solutions of the sub-problems are then used
to build the solution of the complete problem.

As we can observe in the table, the maximum profit among all the entries is 6, found in the
last row and the last column. We compare this value with the entry in the previous row,
i = 3: since that row holds the smaller value 5, the fourth item (weight 5, profit 4) was
included. We subtract its weight from the capacity and its profit from the total, leaving
capacity 3 and remaining profit 2, and continue the trace from row i = 3 at capacity 3.
At capacity 3, rows i = 3, i = 2 and i = 1 all hold the value 2, so the pointer keeps shifting
upwards, and items 3 and 2 are rejected.
Finally, row i = 0 holds the value 0, which differs from 2, so the first item (weight 3,
profit 2) was included. The selected items are therefore x = {1, 0, 0, 1}, with total weight
3 + 5 = 8 and maximum profit 2 + 4 = 6.

What is the complexity of the 0/1 knapsack problem?

Time Complexity: Each entry of the table requires constant time Θ(1) for its computation. It
takes Θ(nW) time to fill the (n+1)(W+1) table entries. It takes Θ(n) time for tracing the
solution, since the tracing process traces the n rows. Thus, overall Θ(nW) time is taken to
solve the 0/1 knapsack problem using dynamic programming.
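The table-filling and traceback described above can be sketched as follows, using the example weights, profits and capacity from this section:

```python
def knapsack_01(weights, profits, capacity):
    """0/1 knapsack via dynamic programming.
    V[i][w] = best profit using the first i items with capacity w.
    Returns (best profit, inclusion vector x)."""
    n = len(weights)
    V = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            V[i][w] = V[i - 1][w]                      # skip item i
            if weights[i - 1] <= w:                    # or take it, if it fits
                V[i][w] = max(V[i][w],
                              V[i - 1][w - weights[i - 1]] + profits[i - 1])
    # Trace back: if the value differs from the row above, item i was taken.
    x, w = [0] * n, capacity
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            x[i - 1] = 1
            w -= weights[i - 1]
    return V[n][capacity], x

# Example instance from this section
profit, x = knapsack_01([3, 4, 6, 5], [2, 3, 1, 4], 8)
```

Filling the table costs Θ(nW) and the traceback Θ(n), matching the complexity analysis above.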

Tree Traversal
Traversal is a process to visit all the nodes of a tree, and may print their values too. Because
all nodes are connected via edges (links), we always start from the root (head) node; that is,
we cannot randomly access a node in a tree. There are three ways in which we traverse a tree −

 In-order Traversal
 Pre-order Traversal
 Post-order Traversal
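The three traversals can be sketched in Python; the node values below are arbitrary, chosen only for illustration:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):
    """Left subtree, root, right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

def preorder(node):
    """Root, left subtree, right subtree."""
    if node is None:
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)

def postorder(node):
    """Left subtree, right subtree, root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]

# Example tree:      1
#                  /   \
#                 2     3
#                / \
#               4   5
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
```

For this tree, in-order visits 4, 2, 5, 1, 3; pre-order visits 1, 2, 4, 5, 3; post-order visits 4, 5, 2, 3, 1.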
Minimum spanning tree

A spanning tree is a subgraph of an undirected graph that has all the vertices connected by
the minimum number of edges.
If all the vertices of a graph are connected, then there exists at least one spanning tree. A
graph may contain more than one spanning tree.
Properties
 A spanning tree does not have any cycle.
 Any vertex can be reached from any other vertex.
Minimum Spanning Tree
A Minimum Spanning Tree (MST) is a subset of edges of a connected weighted undirected
graph that connects all the vertices together with the minimum possible total edge weight. To
derive an MST, Prim’s algorithm or Kruskal’s algorithm can be used.

Kruskal’s Algorithm
Kruskal's Algorithm builds the spanning tree by adding edges one by one into a growing
spanning tree. Kruskal's algorithm follows the greedy approach, as in each iteration it finds
the edge with the least weight and adds it to the growing spanning tree.
Algorithm Steps:
 Sort the graph edges with respect to their weights.
 Start adding edges to the MST from the edge with the smallest weight up to the edge
with the largest weight.
 Only add edges that don't form a cycle, i.e., edges that connect otherwise disconnected
components.
So now the question is: how do we check whether two vertices are connected?

This could be done using a DFS that starts from the first vertex and then checks whether the
second vertex is visited. But DFS would make the time complexity large, as each check has
an order of O(V+E), where V is the number of vertices and E is the number of edges. So the
better solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, meaning that they don't have any
element in common.

Example:
In Kruskal's algorithm, at each iteration we select the edge with the lowest weight. So, we
start with the lowest weighted edge, i.e., the edge with weight 1. After that we select the
second lowest weighted edge, i.e., the edge with weight 2. Notice these two edges are
totally disjoint. The next edge is the third lowest weighted edge, i.e., the edge with weight
3, which connects the two disjoint pieces of the graph. Now, we are not allowed to pick the
edge with weight 4, as that would create a cycle, and we can't have any cycles. So we select
the fifth lowest weighted edge, i.e., the edge with weight 5. The remaining two edges would
create cycles, so we ignore them. In the end, we end up with a minimum spanning tree with
total cost 11 (= 1 + 2 + 3 + 5).
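The walkthrough can be sketched with Kruskal's algorithm plus a simple Disjoint-Set structure. The original figure is not reproduced here, so the edge list below is an assumed graph chosen to be consistent with the walkthrough (seven edges, MST cost 11 = 1 + 2 + 3 + 5):

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v). Returns (total cost, MST edges)."""
    parent = list(range(num_vertices))       # Disjoint-Set forest

    def find(x):                             # representative, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, cost = [], 0
    for w, u, v in sorted(edges):            # edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                         # joins two disconnected components
            parent[ru] = rv
            mst.append((u, v, w))
            cost += w                        # else: would form a cycle; skip
    return cost, mst

# Assumed graph consistent with the walkthrough above
edges = [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 2),
         (5, 3, 4), (6, 1, 3), (7, 2, 4)]
cost, mst = kruskal(5, edges)
```

The edge of weight 4 is rejected because both endpoints are already in the same set, exactly the cycle check described above.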
Time Complexity:
In Kruskal's algorithm, the most time-consuming operation is sorting the edges, which takes
O(E log E) = O(E log V) time; the Disjoint-Set operations add up to at most O(E log V) as
well, so the overall time complexity of the algorithm is O(E log V).

Prim’s Algorithm
is a greedy algorithm that is used to find the minimum spanning tree of a graph. Prim's
algorithm finds the subset of edges that includes every vertex of the graph such that the sum
of the weights of the edges is minimized.
Prim's algorithm starts with a single node and at every step explores all the adjacent nodes
with their connecting edges. The edges with the minimal weights that cause no cycles in the
graph get selected.
It starts from one vertex and continues to add the edges with the smallest weight until all
the vertices are included. The steps to implement Prim's algorithm are given as follows -
o First, we have to initialize an MST with the randomly chosen vertex.
o Now, we have to find all the edges that connect the tree in the above step with the new
vertices. From the edges found, select the minimum edge and add it to the tree.
o Repeat step 2 until the minimum spanning tree is formed.
The applications of Prim's algorithm are -
o Prim's algorithm can be used in network designing.
o It can also be used to lay down electrical wiring cables.
Example of Prim's algorithm

o Now, let's see the working of Prim's algorithm using an example. It will be easier to
understand Prim's algorithm with an example.
o Suppose a weighted graph is given.

Step 1 - First, choose a starting vertex; suppose we start from vertex B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two edges
from vertex B that are B to C with weight 10 and edge B to D with weight 4. Among the edges,
the edge BD has the minimum weight. So, add it to the MST.

Step 3 - Now, again choose the edge with the minimum weight among the remaining edges. In
this case, the edges DE and CD are such edges. Select the edge DE and add it to the MST.

Step 4 - Now, select the edge CD and add it to the MST; this brings the vertices adjacent to C,
i.e., E and A, into consideration.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE, as it would create a
cycle in the graph. So, choose the edge CA and add it to the MST.

So, the graph produced in step 5 is the minimum spanning tree of the given graph. The cost of
the MST is given below
Cost of MST = 4 + 2 + 1 + 3 = 10 units.
Time complexity analysis
 If an adjacency list is used to represent the graph, then using breadth-first search, all the
vertices can be traversed in O(V + E) time.
 We traverse all the vertices of the graph using breadth-first search and use a min-heap
for storing the vertices not yet included in the MST.
 To get the minimum-weight edge, we use the min-heap as a priority queue.
 Min-heap operations like extracting the minimum element and decreasing a key value
take O(log V) time.
So, the overall time complexity
= O(E + V) x O(log V)
= O((E + V) log V)
= O(E log V)
This time complexity can be improved and reduced to O(E + V log V) using a Fibonacci heap.
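The steps above can be sketched with a binary min-heap as the priority queue. The figure's edge weights are not reproduced in the text, so the adjacency list below is an assumed graph consistent with the walkthrough (MST cost 4 + 2 + 1 + 3 = 10):

```python
import heapq

def prim(graph, start):
    """graph: {vertex: [(weight, neighbour), ...]} adjacency list.
    Returns the total weight of the MST grown from `start`."""
    visited = {start}
    heap = list(graph[start])          # candidate edges leaving the tree
    heapq.heapify(heap)
    total = 0
    while heap:
        w, v = heapq.heappop(heap)     # cheapest candidate edge
        if v in visited:               # both ends in tree: would form a cycle
            continue
        visited.add(v)
        total += w
        for edge in graph[v]:          # new candidates from the added vertex
            heapq.heappush(heap, edge)
    return total

# Assumed weights: B-C 10, B-D 4, C-D 1, D-E 2, C-A 3, C-E 5
graph = {
    'A': [(3, 'C')],
    'B': [(10, 'C'), (4, 'D')],
    'C': [(10, 'B'), (1, 'D'), (3, 'A'), (5, 'E')],
    'D': [(4, 'B'), (1, 'C'), (2, 'E')],
    'E': [(2, 'D'), (5, 'C')],
}
```

Whichever vertex we start from, the total MST weight is the same; the edge CE is popped but skipped because both its endpoints are already in the tree.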
Matrix Chain Multiplication Solution using Dynamic Programming
The matrix chain multiplication problem can be easily solved using dynamic programming
because it is an optimization problem, where we need to find the most efficient sequence
(parenthesization) for multiplying a chain of matrices.
For example, consider the parenthesizations ((A.B).C).D and (A.B).(C.D) for a chain of four
matrices.
Notice that the multiplication of matrix A with matrix B, i.e. (A.B), is repeated in the two
sequences.
If we could reuse the previous multiplication result of A.B in the next sequence, our
algorithm would become faster.
For this, we store the solutions of such subproblems in a 2D array, i.e. we memoize them, so
that we can reuse them later easily.
The optimal substructure is defined as

m[i, j] = 0, if i = j
m[i, j] = min over i <= k < j of { m[i, k] + m[k+1, j] + d[i-1] * d[k] * d[j] }, if i < j

where d = {d0, d1, d2, …, dn} is the vector of matrix dimensions (matrix Ai has dimension
d[i-1] x d[i]), and

m[i, j] = least number of scalar multiplications required to multiply the matrix sequence
Ai….Aj.

// Matrix A[i] has dimension dims[i-1] x dims[i] for i = 1..n

MatrixChainMultiplication(int dims[])
{
    // length[dims] = n + 1
    n = dims.length - 1;
    // m[i,j] = minimum number of scalar multiplications (i.e., cost)
    // needed to compute the matrix A[i]A[i+1]...A[j] = A[i..j]
    // the cost is zero when multiplying one matrix
    for (i = 1; i <= n; i++)
        m[i, i] = 0;
    for (len = 2; len <= n; len++) {        // subsequence lengths
        for (i = 1; i <= n - len + 1; i++) {
            j = i + len - 1;
            m[i, j] = MAXINT;
            for (k = i; k <= j - 1; k++) {
                cost = m[i, k] + m[k+1, j] + dims[i-1]*dims[k]*dims[j];
                if (cost < m[i, j]) {
                    m[i, j] = cost;
                    // index of the subsequence split that achieved minimal cost
                    s[i, j] = k;
                }
            }
        }
    }
}
There are three nested loops, and each loop executes at most n times, so the
complexity is O(n³).
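The pseudocode above can be written as runnable Python. The dimension vector [10, 30, 5, 60] is an assumed example (A1: 10×30, A2: 30×5, A3: 5×60), for which parenthesizing as (A1.A2).A3 costs 10·30·5 + 10·5·60 = 4500 scalar multiplications:

```python
import sys

def matrix_chain_order(dims):
    """dims[i-1] x dims[i] is the shape of matrix A[i], i = 1..n.
    Returns the minimum number of scalar multiplications."""
    n = len(dims) - 1
    # m[i][j]: minimal cost to compute A[i]...A[j]; zero for one matrix
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]   # best split index
    for length in range(2, n + 1):              # chain lengths
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):               # try every split point
                cost = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m[1][n]
```

For example, `matrix_chain_order([10, 30, 5, 60])` returns 4500, choosing the split (A1.A2).A3 over A1.(A2.A3), which would cost 27000.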
Longest common subsequence using dynamic programming
The longest common subsequence (LCS) is defined as the longest subsequence that
is common to all the given sequences, provided that the elements of the
subsequence are not required to occupy consecutive positions within the original
sequences.
If S1 and S2 are the two given sequences, then Z is a common subsequence
of S1 and S2 if Z is a subsequence of both S1 and S2.
Suppose we have a string W1 = abcd.
The following are some of the subsequences that can be created from the above string:
o ab
o bd
o ac
o ad
o acd
o bcd

Finding LCS using dynamic programming with the help of a table.

o Consider two strings:
o X = a b a a b a
o Y = b a b b a b

In the table, all the entries are filled, and the last (bottom-right) cell holds the
value 4, which is the length of the LCS. To recover the string, we trace back from
this cell. Whenever the corresponding characters of X and Y match, the cell's value
was obtained from the upper-left diagonal cell, and that character belongs to the
LCS; when they do not match, we move to the neighbouring cell (left or up) holding
the same value. Tracing back in this way, we first find the character 'a', then 'b'
(giving 'ba'), then 'a' (giving 'aba'), and finally 'b'. The final string of the
longest common subsequence is therefore 'baba'.
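The table construction and traceback can be sketched as follows. Note that depending on how ties are broken during the traceback, a different but equally long LCS (such as 'abab') may be produced for these two strings:

```python
def lcs(x, y):
    """Bottom-up LCS: dp[i][j] = length of the LCS of x[:i] and y[:j].
    Returns one longest common subsequence."""
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1      # diagonal move on a match
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Trace back from the bottom-right cell.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

z = lcs("abaaba", "babbab")
```

The table costs O(mn) to fill, and the traceback walks at most m + n cells.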

Why is a dynamic programming approach to solving the LCS problem more
efficient than the recursive algorithm?

If we use the dynamic programming approach, then the number of function calls
is reduced. The dynamic programming approach stores the result of each function
call so that it can be used in future function calls without the need of calling
the function again.

In the above dynamic algorithm, the results obtained from the comparisons between
the elements of X and the elements of Y are stored in the table so that they can
be reused in future computations.

The time taken by the dynamic programming approach to complete the table is
O(mn), whereas the time taken by the recursive algorithm is O(2^max(m, n)).
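The same call-saving idea can be seen directly by memoizing the recursive definition; here `functools.lru_cache` plays the role of the table, so each (i, j) subproblem is computed only once:

```python
from functools import lru_cache

def lcs_length(x, y):
    """Top-down recursive LCS length; each (i, j) subproblem is
    computed once and afterwards served from the cache."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return rec(i - 1, j - 1) + 1     # characters match
        return max(rec(i - 1, j), rec(i, j - 1))
    return rec(len(x), len(y))
```

There are only (m + 1)(n + 1) distinct (i, j) pairs, so the memoized version runs in O(mn) time instead of the exponential time of the plain recursion.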

Optimal Binary Search Tree


We know that in a binary search tree, the nodes in the left subtree have a lesser value
than the root node and the nodes in the right subtree have a greater value than the
root node.
We know the key value of each node in the tree, and we also know the frequency of
each node in terms of searching, i.e., how often that node is searched for. The
frequency and key value determine the overall cost of searching a node. The cost of
searching is a very important factor in various applications, and the overall cost of
searching a node should be as low as possible. The time required to search a node in
a BST is greater than in a balanced binary search tree, as a balanced binary search
tree contains fewer levels than an unbalanced BST. There is one further way to reduce
the cost of a binary search tree, which is known as an optimal binary search tree.
The search time of an element in a BST is O(n), whereas in a balanced BST the search
time is O(log n). The search time can be improved further in an optimal cost binary
search tree by placing the most frequently used data in the root and closer to the
root element, while placing the least frequently used data near the leaves and in the
leaves.

1. First, we calculate the values where j - i is equal to 0.
2. Then, we calculate the values where j - i is equal to 1.
3. Then, we calculate the values where j - i = 2.
4. Now, we calculate the values where j - i = 3.
5. Finally, we calculate the values where j - i = 4.
Complexity Analysis of the Optimal Binary Search Tree
It is very simple to derive the complexity of this approach from the above
algorithm. It uses three nested loops. Statements in the innermost loop run in Θ(1)
time, so the running time of the algorithm is computed as O(n³).
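Filling the table diagonal by diagonal (j - i = 0, 1, 2, ...) can be sketched as follows; the frequencies below are an assumed three-key example:

```python
def optimal_bst_cost(freq):
    """freq[i]: search frequency of the i-th key (keys in sorted order).
    Returns the minimal weighted search cost (root counts as level 1)."""
    n = len(freq)
    # cost[i][j]: minimal cost of a BST built from keys i..j
    cost = [[0] * n for _ in range(n)]
    for i in range(n):                        # diagonal j - i = 0
        cost[i][i] = freq[i]
    for length in range(2, n + 1):            # diagonals j - i = 1, 2, ...
        for i in range(n - length + 1):
            j = i + length - 1
            total = sum(freq[i:j + 1])        # every key drops one level deeper
            best = min(
                (cost[i][r - 1] if r > i else 0) +
                (cost[r + 1][j] if r < j else 0)
                for r in range(i, j + 1))     # try each key as the root
            cost[i][j] = best + total
    return cost[0][n - 1]
```

For assumed frequencies [34, 8, 50], placing the third key at the root and the first key just below it gives the minimal cost 142; the three nested loops (diagonal, start index, root choice) give the O(n³) bound stated above.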
