Unit III - Daa
Introduction – Greedy
Huffman Coding
Knapsack Problem
Minimum Spanning Tree (Kruskal's Algorithm)
Minimum Spanning Tree (Prim’s Algorithm)
Dynamic Programming
Introduction
0/1 knapsack problem
Matrix chain multiplication using dynamic programming
Longest common subsequence using dynamic programming
Optimal binary search tree (OBST) using dynamic programming
Greedy Method
•A greedy algorithm builds a solution to an optimization
problem by making a sequence of decisions.
•Decisions are made one by one in some order.
•Each decision is made using a greedy-choice
property or greedy criterion.
•A decision, once made, is (usually) not changed
later.
Introduction
•Greedy is a strategy that works well on optimization
problems.
•Optimization - the action of making the best or most
effective use of a situation or resource.
•Characteristics:
1. Greedy-choice property: A globally optimal solution can be
reached by making locally optimal (greedy) choices. While
constructing the solution, the algorithm maintains two sets:
one contains all the chosen items, and the other contains the
rejected items.
2. Optimal substructure: An optimal solution to the problem
contains optimal solutions to its subproblems.
Components of Greedy Algorithm
•Candidate set: The set of items from which a solution is
created.
•Selection function: This function chooses the best
candidate to be added to the solution next.
•Feasibility function: This function determines whether a
candidate can be used to contribute to the solution or not.
•Objective function: This function assigns a value to the
solution or the partial solution.
•Solution function: This function indicates whether a
complete solution has been reached or not.
Applications of Greedy Algorithm
Without encoding, the total size of the string was 120 bits. After
encoding the size is reduced to 32 + 15 + 28 = 75.
Huffman Coding Algorithm
• Problem: Finding the minimum length bit string which can be used to
encode a string of symbols.
• Used for compressing data.
• Uses a simple heap based priority queue.
• Each leaf is labeled with a character and its frequency of occurrence.
• Each internal node is labeled with the sum of the weights of the
leaves in its subtree.
• Huffman coding is a lossless data compression algorithm.
Example
• Solve: build a variable-length code for the following characters.
Character   A      B      C      D      E     F
Frequency   45000  13000  12000  16000  9000  5000
• The time complexity for encoding n unique characters based on their
frequencies is O(n log n).
• Extracting the minimum frequency from the priority queue takes place
2*(n-1) times, and each extraction costs O(log n). Thus the overall
complexity is O(n log n).
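The heap-based construction described above can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function name `huffman_code_lengths` is hypothetical, and instead of building an explicit tree it tracks only the depth (code length) of each character, which keeps the sketch short but makes each merge cost more than the O(log n) of a tree-based version. The frequencies are the ones from the example table.

```python
import heapq
from itertools import count


def huffman_code_lengths(freq):
    """Return {symbol: code length} for a Huffman code built from freq."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    # Each heap entry is (weight, tiebreak, {symbol: depth so far}).
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit.
        return {sym: 1 for sym in heap[0][2]}
    while len(heap) > 1:
        # Extract the two lowest-weight subtrees and merge them.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


freq = {"A": 45000, "B": 13000, "C": 12000, "D": 16000, "E": 9000, "F": 5000}
lengths = huffman_code_lengths(freq)
total_bits = sum(freq[s] * lengths[s] for s in freq)
print(lengths)
print(total_bits)
```

For these frequencies the code lengths come out as 1 bit for A, 3 bits for B, C and D, and 4 bits for E and F, giving 224000 bits in total, compared with 300000 bits for a fixed 3-bit code.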
Solution:
Object          1     2     3     4     5
V               20    30    66    40    60
Ratio = vi/wi   2     1.5   2.2   1     1.2
•Step 2 : Sort the items according to the ratio and select the
items in order of highest ratio. The first item is selected:
Object          3     1     2     5     4
W               30    10    20    50    40
Ratio = vi/wi   2.2   2     1.5   1.2   1
Selected item   1
The second and third items are then selected in the same way:
Object          3     1     2     5     4
W               30    10    20    50    40
Ratio = vi/wi   2.2   2     1.5   1.2   1
Selected item   1     1     1
Knapsack Problem Using Greedy Method
Problem:
• A thief enters a house to rob it. He can carry a maximum weight of 60 kg
in his bag. There are 5 items in the house with the following weights and
values. Which items should the thief take if he can also take a fraction of
any item with him?
Item   Weight   Value
1      5        30
2      10       40
3      15       45
4      22       77
5      25       90
Or
• Find the optimal solution for the fractional knapsack problem making use of
the greedy approach. Consider:
• n = 5
• w = 60 kg
• (w1, w2, w3, w4, w5) = (5, 10, 15, 22, 25)
• (b1, b2, b3, b4, b5) = (30, 40, 45, 77, 90)
Time Complexity of the Knapsack Problem Using Greedy Method
•The main time-consuming step is sorting all items in decreasing
order of their value / weight ratio.
•If the items are already arranged in the required order, the
selection loop takes O(n) time.
•The average time complexity of Quick Sort is O(n log n).
•Therefore, the total time taken including the sort is O(n log n).
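The sort-then-select procedure can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the function name `fractional_knapsack` and its return shape are hypothetical, and the input is the thief example above (capacity 60 kg, weights 5, 10, 15, 22, 25 and values 30, 40, 45, 77, 90).

```python
def fractional_knapsack(weights, values, capacity):
    """Greedy fractional knapsack; returns (total value, fraction taken per item)."""
    # Sort item indices by value/weight ratio, highest first: O(n log n).
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    fractions = [0.0] * len(weights)
    total = 0.0
    remaining = capacity
    # Single pass over the sorted items: O(n).
    for i in order:
        if remaining <= 0:
            break
        take = min(weights[i], remaining)  # whole item, or whatever still fits
        fractions[i] = take / weights[i]
        total += values[i] * take / weights[i]
        remaining -= take
    return total, fractions


total, fractions = fractional_knapsack([5, 10, 15, 22, 25],
                                       [30, 40, 45, 77, 90], 60)
print(total)
print(fractions)
```

With these inputs, items 1, 2 and 5 are taken whole, 20 kg of item 4 is taken, item 3 is left behind, and the total value is 230.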
Minimum Spanning Tree
• Step 1 - Remove all loops and parallel edges from the given graph.
• Step 2 - Pick the smallest remaining edge. Check whether it forms a cycle
with the spanning tree formed so far. If no cycle is formed, include this
edge. Otherwise, discard it.
• Step 3 - Repeat Step 2 until there are (V-1) edges in the spanning tree.
Example
[Figure: weighted network of six towns: Avonford (A), Brinleigh (B), Cornwell (C), Donster (D), Edan (E) and Fingley (F), with edge weights 2, 3, 4, 4, 5, 5, 6, 7, 8, 8]
We model the situation as a network; the problem is then
to find the minimum connector for the network.
Kruskal's Algorithm
• Select ED 2
• Select AB 3
• Select CD 4 (or AE 4)
• Select AE 4
• BC 5: forms a cycle, so it is rejected
• Select EF 5
• All vertices are now connected. Total weight of tree: 18
Kruskal's Algorithm
• Solve: find an MST using Kruskal's algorithm.
Kruskal's algorithm to create an MST
Algorithm
MST-KRUSKAL(G, w)
A ← Ø
for each vertex v ∈ V[G]
    do MAKE-SET(v)
sort the edges of E into non-decreasing order by weight w
for each edge (u, v) ∈ E, taken in non-decreasing order by weight
    do if FIND-SET(u) ≠ FIND-SET(v)    (i.e. (u, v) does not form a cycle)
        then A ← A ∪ {(u, v)}
             UNION(u, v)
return A
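The pseudocode above can be sketched in Python with a simple disjoint-set structure. This is a minimal sketch, not a definitive implementation: the helper names `kruskal` and `find` are illustrative, and the edge list contains only the six edges named in the worked example (ED 2, AB 3, CD 4, AE 4, BC 5, EF 5), since the remaining edges of the figure are not recoverable from the text.

```python
def kruskal(vertices, edges):
    """Return the MST edges as (u, v, weight); edges are (weight, u, v) tuples."""
    parent = {v: v for v in vertices}  # MAKE-SET for every vertex

    def find(v):
        # FIND-SET with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges):  # non-decreasing order by weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # (u, v) does not form a cycle
            tree.append((u, v, weight))
            parent[root_u] = root_v     # UNION
    return tree


# Only the edges named in the worked example (an assumption; the figure has more).
edges = [(2, "E", "D"), (3, "A", "B"), (4, "C", "D"),
         (4, "A", "E"), (5, "B", "C"), (5, "E", "F")]
tree = kruskal("ABCDEF", edges)
total_weight = sum(w for _, _, w in tree)
print(tree)
print(total_weight)
```

On this input the edge BC is rejected because B and C are already connected, and the five accepted edges have total weight 18, matching the worked example.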
Algorithm Analysis
• Time complexity
• Sorting the edges takes O(E log E) time, where E is the number of edges.
• The overall complexity is O(E log E), which is the same as O(E log V), where V is the number of vertices.
Prim's Algorithm
Prim's Spanning Tree Algorithm
•Like Kruskal's algorithm, Prim's algorithm finds a minimum
cost spanning tree using the greedy approach.
Prim's algorithm shares a similarity with
the shortest path first algorithms.
• Step 1 - Remove all loops and parallel edges from the given graph. In
case of parallel edges, keep the one which has the least cost associated
and remove all others.
• Step 2 - Choose any arbitrary node as root node
• In this case, we choose S node as the root node of Prim's spanning
tree. This node is arbitrarily chosen, so any node can be the root
node.
• Step 3 - Check outgoing edges and select the one with the least cost
• After choosing the root node S, we see that S,A and S,C are two
edges with weight 7 and 8, respectively. We choose the edge S,A as
its weight is lower than the other.
• Now, the tree S-7-A is treated as one node and we check for all
edges going out from it. We select the one which has the lowest cost
and include it in the tree.
• After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a
node and will check all the edges again. However, we will choose
only the least cost edge. In this case, C-3-D is the new edge, which is
less than other edges' cost 8, 6, 4, etc.
• After adding node D to the spanning tree, we now have two edges
going out of it having the same cost, i.e. D-2-T and D-2-B. Thus, we
can add either one. But the next step will again yield edge 2 as the
least cost. Hence, we are showing a spanning tree with both edges
included.
Algorithm
MST-Prims(G, w)
A ← Ø
Select the source vertex u
T ← {u}
while T ≠ V[G]
    do choose the minimum-weight edge (u, v) with u ∈ T and v ∉ T
       (since v is outside the tree, (u, v) cannot form a cycle)
       A ← A ∪ {(u, v)}
       T ← T ∪ {v}
return A
Algorithm Analysis
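• Time complexity
• O(E * log V), where E is the number of edges and V the number of vertices, when the candidate edges are kept in a binary-heap priority queue.
• Each edge is pushed onto and popped from the queue a constant number of times, and each queue operation costs O(log V).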
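The heap-based version of Prim's algorithm can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `prim` is hypothetical, and the graph contains only the edges that the walkthrough names explicitly (S-A 7, S-C 8, A-C 3, C-D 3, D-T 2, D-B 2); the other edges of the original figure are not recoverable from the text.

```python
import heapq


def prim(graph, root):
    """Return the MST edges as (u, v, weight), grown outward from root."""
    visited = {root}
    # Candidate edges leaving the tree, ordered by weight.
    frontier = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(visited) < len(graph):
        w, u, v = heapq.heappop(frontier)  # least-cost outgoing edge
        if v in visited:
            continue                       # would form a cycle, skip it
        visited.add(v)
        tree.append((u, v, w))
        for nxt, nw in graph[v].items():
            if nxt not in visited:
                heapq.heappush(frontier, (nw, v, nxt))
    return tree


# Only the edges named in the walkthrough (an assumption; the figure has more).
edge_list = [("S", "A", 7), ("S", "C", 8), ("A", "C", 3),
             ("C", "D", 3), ("D", "T", 2), ("D", "B", 2)]
graph = {}
for u, v, w in edge_list:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

tree = prim(graph, "S")
total_weight = sum(w for _, _, w in tree)
print(tree)
print(total_weight)
```

Starting from S, the sketch picks S-A (7), then A-C (3), then C-D (3), and finally both weight-2 edges out of D, in the same order as the walkthrough. The edge S-C (8) is never used.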
Solve using Prim's Spanning Tree Algorithm
Dynamic Programming
Introduction to dynamic programming
•Dynamic Programming is also used in optimization
problems. Like divide-and-conquer method, Dynamic
Programming solves problems by combining the
solutions of subproblems.
•Moreover, Dynamic Programming algorithm solves
each sub-problem just once and then saves its
answer in a table, thereby avoiding the work of
re-computing the answer every time.
•Two main properties of a problem suggest that the given
problem can be solved using Dynamic Programming.
These properties are
•overlapping sub-problems and
•optimal substructure.
•Overlapping Sub-Problems
•Similar to Divide-and-Conquer approach, Dynamic
Programming also combines solutions to
sub-problems. It is mainly used where the solution
of one sub-problem is needed repeatedly. The
computed solutions are stored in a table, so that
these don’t have to be re-computed. Hence, this
technique is needed where overlapping
sub-problem exists.
•Optimal Sub-Structure
•A given problem has Optimal Substructure
Property, if the optimal solution of the given
problem can be obtained using optimal solutions of
its sub-problems.
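The effect of storing sub-problem solutions in a table can be illustrated with a small Python sketch. The Fibonacci recurrence is used here only as an assumed illustration (it does not appear in these slides) because its sub-problems overlap heavily; the `calls` counter is a hypothetical device for comparing the amount of work.

```python
from functools import lru_cache

calls = {"plain": 0, "memo": 0}


def fib_plain(n):
    """Recomputes the same sub-problems again and again."""
    calls["plain"] += 1
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)  # the cache plays the role of the table
def fib_memo(n):
    """Solves each sub-problem once and reuses the stored answer."""
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


plain_result = fib_plain(20)
memo_result = fib_memo(20)
print(plain_result, memo_result)
print(calls)
```

Both functions return the same value, but the memoized version evaluates each of the 21 distinct sub-problems only once, while the plain version makes thousands of calls.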
Greedy Vs Dynamic Programming
Greedy method | Dynamic Programming
Make an optimal choice (without knowing solutions to subproblems) and then solve remaining subproblems | Solve subproblems first, then use those solutions to make an optimal choice
Solutions are top down | Solutions are bottom up
Best choice does not depend on solutions to subproblems | Choice at each step depends on solutions to subproblems
Make best choice at current time, then work on subproblems. Best choice does depend on choices so far | Many subproblems are repeated in solving larger problems. This repetition results in great savings when the computation is bottom up
Optimal Substructure: solution to problem contains within it optimal solutions to subproblems | Optimal Substructure: solution to problem contains within it optimal solutions to subproblems
Fractional knapsack: at each step, choose item with highest ratio | 0-1 Knapsack: to determine whether to include item i for a given size, must consider best solution, at that size, with and without item i
Divide & Conquer Method vs Dynamic Programming
0/1 Knapsack Problem
Ex 1: n = 4, m = 8, P = {1, 2, 5, 6} and W = {2, 3, 4, 5}
Using Tabular Method (rows are items, columns are capacities):
P   W   Item   0   1   2   3   4   5   6   7   8
-   -   0      0   0   0   0   0   0   0   0   0
1   2   1      0   0   1   1   1   1   1   1   1
2   3   2      0   0   1   2   2   3   3   3   3
5   4   3      0   0   1   2   5   5   6   7   7
6   5   4      0   0   1   2   5   6   6   7   8
0/1 Knapsack Problem
• For row 0, no item is selected and so no weight is included in the
knapsack. So fill the first row with all 0's, and fill the first column with all 0's.
• For row 1, consider the first item. Its weight is 2, so fill column 2 of
row 1 with the profit of the first item, i.e. 1. Only the first item can be
selected, so all the remaining columns to the right are also 1, and the
columns to the left keep the previous row's values.
• For row 2, consider the second item. Its weight is 3, so fill column 3 of
row 2 with the profit of the second item, i.e. 2. Fill all the columns to the
left with the previous row's values. Here we can also select both items: the
first two items weigh 5 in total and their total profit is 3, so fill column 5
with 3. Column 4 keeps the value 2, and the columns to the right of column 5
are filled with 3, because only the first two items can be selected.
• For row 3, consider the 3rd item. Its weight is 4, so fill column 4 with its
profit, i.e. 5. The first 3 items are now available, but we need not select all
three; we must identify the combinations that include the 3rd item.
- Select the 3rd and 1st items: the total weight is 6, so fill column 6 with
the total profit of the 1st and 3rd items, i.e. 6.
- Select the 3rd and 2nd items: the total weight is 7, so fill column 7 with
the total profit of the 2nd and 3rd items, i.e. 7.
- Fill the last column with 7 as well, because no remaining item can be added.
• For row 4, consider the 4th item. Its weight is 5, so fill column 5 of
row 4 with the profit of the 4th item, i.e. 6. All the columns to the left
keep the old values. Now combine the 4th item with the remaining items.
- Select the 4th item with the 1st item: the total weight of these two items
is 7, so fill column 7 with their total profit, i.e. 7.
- Select the 4th item with the 2nd item: the total weight of these two items
is 8, so fill column 8 with their total profit, i.e. 8.
- Fill column 6 with the value from column 5.
• After filling the entire table, construct the solution x1, x2, x3 and x4.
0/1 Knapsack Problem
•Formula for filling all the rows:
V[i, w] = max { V[i-1, w], V[i-1, w-w[i]] + p[i] }   if w ≥ w[i]
V[i, w] = V[i-1, w]                                  if w < w[i]
•V[4, 1] = max{ V[3, 1], V[3, 1-5] + 6 }
         = max{ 0, V[3, -4] + 6 }
V[3, -4] is undefined because item 4 (weight 5) does not fit.
So up to w=4 take the same values as the previous row.
•V[4, 5] = max{ V[3, 5], V[3, 5-5] + 6 }
         = max{ 5, V[3, 0] + 6 }
         = max{ 5, 0 + 6 } = max{ 5, 6 } = 6
•V[4, 6] = max{ V[3, 6], V[3, 6-5] + 6 }
         = max{ 6, V[3, 1] + 6 }
         = max{ 6, 0 + 6 } = max{ 6, 6 } = 6
•V[4, 7] = max{ V[3, 7], V[3, 7-5] + 6 }
         = max{ 7, V[3, 2] + 6 }
         = max{ 7, 1 + 6 } = max{ 7, 7 } = 7
•V[4, 8] = max{ V[3, 8], V[3, 8-5] + 6 }
         = max{ 7, V[3, 3] + 6 }
         = max{ 7, 2 + 6 } = max{ 7, 8 } = 8
Algorithm
Dynamic-0-1-knapsack (v, w, n, W)
for w = 0 to W do
    c[0, w] = 0
for i = 1 to n do
    c[i, 0] = 0
    for w = 1 to W do
        if wi ≤ w then
            if vi + c[i-1, w-wi] > c[i-1, w] then
                c[i, w] = vi + c[i-1, w-wi]
            else c[i, w] = c[i-1, w]
        else
            c[i, w] = c[i-1, w]
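The table-filling algorithm can be sketched in Python as below. This is a minimal sketch, assuming 0-based Python lists for the profits and weights (so item i in the table is index i-1 in the lists); the function name `knapsack_table` is illustrative. The input is Ex 1 (P = {1, 2, 5, 6}, W = {2, 3, 4, 5}, m = 8).

```python
def knapsack_table(profits, weights, capacity):
    """Build the 0/1 knapsack table V, where V[i][w] is the best profit
    using the first i items with capacity w."""
    n = len(profits)
    # Row 0 and column 0 are all zeros.
    V = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        p, wt = profits[i - 1], weights[i - 1]
        for w in range(1, capacity + 1):
            V[i][w] = V[i - 1][w]              # item i left out
            if wt <= w:                        # item i fits
                V[i][w] = max(V[i][w], V[i - 1][w - wt] + p)
    return V


V = knapsack_table([1, 2, 5, 6], [2, 3, 4, 5], 8)
for row in V:
    print(row)
```

The printed rows reproduce the table from Ex 1, and the bottom-right entry V[4][8] is the maximum profit, 8.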
0/1 Knapsack Problem
•Select the maximum profit value, i.e. 8, which is in the 4th row.
Check whether 8 also appears in the 3rd row. It does not, which means
the 4th item is included, so x4=1.
- The 4th item's profit is 6, so the remaining profit is 8-6=2.
•2 is in row 3. Check whether 2 also appears in row 2. It does,
which means the 3rd item is not included, so x3=0.
•2 is in row 2. Check whether 2 also appears in row 1. It does not,
so the second item is included, x2=1.
- The remaining profit is 2-2=0.
•0 is in row 1. Check whether 0 also appears in row 0. It does,
so item 1 is not included, x1=0.
•So the solution is: x1=0, x2=1, x3=0, x4=1.
•Total profit obtained = p1 * x1 + p2 * x2 + p3 * x3 + p4 * x4
= 1 * 0 + 2 * 1 + 5 * 0 + 6 * 1 = 8
•Knapsack filled with weight = 2*0 + 3*1 + 4*0 + 5*1 = 8
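The solution vector can also be recovered programmatically. The sketch below uses a slightly different but commonly used check from the one described above: instead of searching the previous row for the remaining profit, it compares V[i][w] with V[i-1][w] at the current remaining capacity w. The function name `selected_items` is hypothetical, and the table literal is copied from Ex 1.

```python
def selected_items(V, weights):
    """Walk the filled table from the bottom-right corner and return x1..xn."""
    n = len(weights)
    w = len(V[0]) - 1          # start at full capacity
    x = [0] * n
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            # The value changed when item i became available, so it was taken.
            x[i - 1] = 1
            w -= weights[i - 1]
    return x


# Table from Ex 1 (P = {1, 2, 5, 6}, W = {2, 3, 4, 5}, m = 8).
V = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 2, 2, 3, 3, 3, 3],
    [0, 0, 1, 2, 5, 5, 6, 7, 7],
    [0, 0, 1, 2, 5, 6, 6, 7, 8],
]
profits, weights = [1, 2, 5, 6], [2, 3, 4, 5]
x = selected_items(V, weights)
total_profit = sum(p * xi for p, xi in zip(profits, x))
total_weight = sum(wt * xi for wt, xi in zip(weights, x))
print(x, total_profit, total_weight)
```

For this table the sketch gives the same answer as the manual trace: items 2 and 4 are selected, with total profit 8 and total weight 8.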
0/1 Knapsack Problem
• Solve using the tabular method:
Ex 3: n = 4, m = 8, P = {2, 3, 1, 4} and W = {3, 4, 6, 5}
Matrix Chain Multiplication
Matrix chain multiplication is an optimization
problem. It can be solved using dynamic
programming. The problem is defined below:
Problem: In what order should n matrices A1, A2, A3, …,
An be multiplied so that the result is derived with the
minimum number of computations?
• Single matrix
• m[i, i] = 0, for i = 1 to 5
• Two Matrix
• m[1, 2] = {m[1, 1] + m[2, 2] + d0 x d1 x d2}
          = {0 + 0 + 1 × 5 × 4}
          = 20
• m[2, 3] = {m[2, 2] + m[3, 3] + d1 x d2 x d3}
          = {0 + 0 + 5 × 4 × 3}
          = 60
• m[3, 4] = {m[3, 3] + m[4, 4] + d2 x d3 x d4}
          = {0 + 0 + 4 × 3 × 2}
          = 24
• m[4, 5] = 6
• Three Matrix
• m[1, 3] = 32, m[2, 4] = 64, m[3, 5] = 18
• Four Matrix
• m[1, 4] = 38, m[2, 5] = 38
• Five Matrix
• m[1, 5] = 40
• Completed table:
      1     2     3     4     5
1     0     20    32    38    40
2           0     60    64    38
3                 0     24    18
4                       0     6
5                             0
• The optimal sequence is
• 5 matrices: (A*B*C*D) * E
• 4 matrices: ((A*B*C) * D) * E
• 3 matrices: (((A*B) * C) * D) * E
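The table above can be reproduced with a short Python sketch of the standard recurrence m[i, j] = min over i ≤ k < j of m[i, k] + m[k+1, j] + d(i-1) x d(k) x d(j). The dimension list is an assumption reconstructed from the worked products: d0 to d4 = 1, 5, 4, 3, 2 appear directly in the calculations, while d5 = 1 is inferred from m[4, 5] = 6. The function names `matrix_chain_order` and `parenthesize` are illustrative.

```python
def matrix_chain_order(d):
    """Return (m, s): m[i][j] is the minimum scalar multiplications for
    Ai..Aj, and s[i][j] is the split point k that achieves it (1-indexed)."""
    n = len(d) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # chains of 2, 3, ..., n matrices
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):            # try every split (Ai..Ak)(Ak+1..Aj)
                cost = m[i][k] + m[k + 1][j] + d[i - 1] * d[k] * d[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m, s


def parenthesize(s, i, j):
    """Rebuild the optimal parenthesization from the split table."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + parenthesize(s, i, k) + parenthesize(s, k + 1, j) + ")"


# d5 = 1 is inferred from m[4, 5] = 6 (an assumption, see the lead-in).
m, s = matrix_chain_order([1, 5, 4, 3, 2, 1])
order = parenthesize(s, 1, 5)
print(m[1][5], order)
```

With these dimensions the sketch yields m[1][5] = 40 and a left-to-right parenthesization, which agrees with the optimal sequence listed above.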
• Solve: We are given the sequence {4, 10, 3, 12, 20, 7}. The
matrices have sizes 4 x 10, 10 x 3, 3 x 12, 12 x 20 and 20 x 7. We need to
compute M[i, j] for 1 ≤ i ≤ j ≤ 5. We know M[i, i] = 0 for all i.
Longest Common Subsequence
•The longest common subsequence (LCS) is
defined as the longest subsequence that is
common to all the given sequences, provided
that the elements of the subsequence are not
required to occupy consecutive positions within
the original sequences.
•If S1 and S2 are the two given sequences,
then Z is a common subsequence
of S1 and S2 if Z is a subsequence of
both S1 and S2. Furthermore, the elements of Z must
be taken from strictly increasing indices
of both S1 and S2.
•That is, the indices of the elements chosen
from each original sequence must be in
ascending order in Z.
Problem
•Let us understand LCS with an example.
•If
S1 = {B, C, D, A, A, C, D}
S2 = {A, C, D, B, A, C}
•Then, common subsequences are {B, C}, {C,
D, A, C}, {D, A, C}, {A, A, C}, {A, C}, {C, D}, ...
• Solution
• The following steps are followed for finding the longest common
subsequence.
1. Create a table of dimension (n+1) × (m+1), where n and m are the lengths
of S1 and S2 respectively. The first row and the first column are filled with zeros.
2. Fill each remaining cell of the table using the following logic.
3. If the characters corresponding to the current row and current column
match, fill the current cell by adding one to the diagonal
element. Point an arrow to the diagonal cell.
4. Otherwise, fill the current cell with the maximum of the values in the
previous column and the previous row. Point an arrow to the cell with the
maximum value. If they are equal, point to either of them.
• The value in the last row and the last column is the length of the
longest common subsequence.
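The table-filling steps can be sketched in Python as below, using the example sequences S1 = {B, C, D, A, A, C, D} and S2 = {A, C, D, B, A, C}. This is a minimal sketch: the function name `lcs` is illustrative, and rather than storing arrows it re-derives the direction during the walk back, breaking ties by moving up a row (either direction is acceptable, as noted in the steps).

```python
def lcs(s1, s2):
    """Return (length, one longest common subsequence) of s1 and s2."""
    n, m = len(s1), len(s2)
    # (n+1) x (m+1) table; first row and first column stay zero.
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i - 1] == s2[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1            # diagonal + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])  # previous row / column

    # Follow the "arrows" back from the last cell to recover the subsequence.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[n][m], "".join(reversed(out))


length, subsequence = lcs("BCDAACD", "ACDBAC")
print(length, subsequence)
```

For the example sequences the last cell holds 4, and the recovered subsequence is C, D, A, C, which is one of the common subsequences listed in the problem.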