DAA Unit-IV
Let us sort and label the jobs by finishing time: f1 ≤ f2 ≤ . . . ≤ fN. To maximize profit, for each
job j we can either include it in the optimal solution or exclude it. How do we find the profit when the
current job is included? The idea is to find the latest job before the current job (in the sorted array) that
does not conflict with it. Once we find such a job, we recur for all jobs up to that job and add the profit
of the current job to the result.
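The take-or-skip recurrence described above can also be computed bottom-up. The following is an
illustrative sketch, not part of the original notes; it assumes jobs are given as (start, finish,
profit) triples, and the names are our own:

```python
import bisect

def max_profit(jobs):
    # jobs: list of (start, finish, profit) triples
    jobs = sorted(jobs, key=lambda j: j[1])   # sort by finishing time
    finishes = [j[1] for j in jobs]
    n = len(jobs)
    # dp[i] = maximum profit achievable using only the first i jobs
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        start, finish, profit = jobs[i - 1]
        # latest earlier job (in sorted order) that finishes no later
        # than this job's start, i.e. does not conflict with it
        k = bisect.bisect_right(finishes, start, 0, i - 1)
        dp[i] = max(dp[i - 1],        # exclude the current job
                    dp[k] + profit)   # include it, plus best up to job k
    return dp[n]
```

For example, `max_profit([(1, 2, 50), (3, 5, 20), (6, 19, 100), (2, 100, 200)])` returns 250
(take the jobs with profits 50 and 200).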
2. Huffman Coding:
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length
codes to input characters, lengths of the assigned codes are based on the frequencies of corresponding
characters. The most frequent character gets the smallest code, and the least frequent character gets the
largest code.
The variable-length codes assigned to input characters are prefix codes, meaning the code (bit
sequence) assigned to one character is never a prefix of the code assigned to any other character.
This is how Huffman coding ensures that there is no ambiguity when decoding the generated
bitstream.
Let us understand prefix codes with a counterexample. Let there be four characters a, b, c and d, with
corresponding variable-length codes 00, 01, 0 and 1. This coding leads to ambiguity because the code
assigned to c is a prefix of the codes assigned to a and b. If the compressed bit stream is 0001, the
decompressed output may be “cccd” or “ccb” or “acd” or “ab”.
There are mainly two major parts in Huffman Coding
1. Build a Huffman Tree from input characters.
2. Traverse the Huffman Tree and assign codes to characters.
Example:
Character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree with single
node.
Step 2: Extract the two minimum-frequency nodes from the min heap. Add a new internal node with
frequency 5 + 9 = 14.
Illustration of step 2
Now the min heap contains 5 nodes, where 4 nodes are roots of trees with a single element each, and
one heap node is the root of a tree with 3 elements.
Character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract the two minimum-frequency nodes from the heap. Add a new internal node with frequency
12 + 13 = 25.
Illustration of step 3
Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each, and two
heap nodes are roots of trees with more than one node.
Character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30
Illustration of step 4
Now min heap contains 3 nodes.
Character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55
Illustration of step 5
Now min heap contains 2 nodes.
Character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100
Illustration of step 6
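The six steps above can be sketched in Python with the heapq module acting as the min heap. The
function name and the tree representation (a character for a leaf, a pair for an internal node) are
our own illustration:

```python
import heapq

def huffman_codes(freq):
    # freq: dict mapping character -> frequency.
    # Heap entries are (frequency, tie_breaker, tree); unique tie-breakers
    # keep heapq from ever comparing two trees directly.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # extract the two minimum-frequency nodes and merge them
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def assign(tree, code):
        if isinstance(tree, str):       # leaf: record the accumulated code
            codes[tree] = code or "0"
        else:                           # internal node: 0 = left, 1 = right
            assign(tree[0], code + "0")
            assign(tree[1], code + "1")
    assign(heap[0][2], "")
    return codes
```

On the example frequencies (a:5, b:9, c:12, d:13, e:16, f:45), the most frequent character f receives
a 1-bit code and the least frequent characters a and b receive 4-bit codes, for a total encoded length
of 224 bits.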
3. Dynamic Programming:
Dynamic programming, like the divide-and-conquer method, solves problems by combining the
solutions to sub-problems. Dynamic programming is applicable when the sub-problems are not
independent, that is, when sub-problems share sub sub-problems. A dynamic-programming algorithm
solves every sub sub-problem just once and then saves its answer in a table, thereby avoiding the work of
re-computing the answer every time the sub sub-problem is encountered. Dynamic programming is
typically applied to optimization problems. The development of a dynamic-programming algorithm can
be broken into a sequence of four steps.
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from computed information.
The dynamic programming technique was developed by Bellman, based upon the principle known as the
principle of optimality. Dynamic programming uses optimal substructure in a bottom-up fashion. That is,
we first find optimal solutions to subproblems and, having solved the subproblems, we find an optimal
solution to the problem.
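As a minimal illustration of steps 2 and 3 (the recursive definition and the bottom-up computation),
consider the Fibonacci numbers, whose subproblems overlap heavily. The example is ours, added purely
for illustration:

```python
def fib(n):
    # Step 3: compute values bottom-up, storing each answer in a table
    # so that every subproblem is solved exactly once.
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        # Step 2's recursive definition: F(i) = F(i-1) + F(i-2)
        table[i] = table[i - 1] + table[i - 2]
    return table[n]
```

A naive recursive version recomputes the same subproblems exponentially often; the table reduces the
work to O(n).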
Algorithm Matrix_Mul(A, B)
// A is an m x n matrix, B is a p x q matrix
{
    if (n ≠ p) then
        error "incompatible dimensions"
    else
    {
        for i ← 1 to m do
            for j ← 1 to q do
            {
                C[i, j] ← 0
                for k ← 1 to n do
                    C[i, j] ← C[i, j] + A[i, k] * B[k, j]
            }
        return C
    }
}
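The pseudocode above translates directly to Python; this sketch takes the dimensions m, n, p, q from
the lists themselves rather than as parameters:

```python
def matrix_mul(A, B):
    m, n = len(A), len(A[0])      # A is m x n
    p, q = len(B), len(B[0])      # B is p x q
    if n != p:
        raise ValueError("incompatible dimensions")
    C = [[0] * q for _ in range(m)]
    for i in range(m):
        for j in range(q):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

The three nested loops give the familiar O(m * n * q) cost, which is what the matrix-chain problem
below tries to minimize over an entire product.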
Step 4: Constructing an optimal solution. In the first level we compare M12 and M23. When M12 < M23 we
parenthesize A1A2 in the product A1A2A3, i.e. (A1A2)A3; when M12 > M23 we parenthesize A2A3, i.e.
A1(A2A3). This process is repeated until the whole product is parenthesized. The top entry in the
table, i.e. M13, gives the optimum cost of the matrix chain multiplication.
Example:-Find an optimal parenthesization of a matrix-chain product whose dimensions are given in the
table below.
Matrix   Dimension
P        5 x 4
Q        4 x 6
R        6 x 2
T        2 x 7
Solution: Given
P0 = 5, P1 = 4, P2 = 6, P3 = 2, P4 = 7
The bottom level of the table is initialized first:
Mi,i = 0 for all i
In the first level we compare M12, M23 and M34. As M23 = 48 is the minimum among the three, we
parenthesize QR in the product PQRT, i.e. P(QR)T. In the second level we compare M13 and M24. As
M13 = 88 is the minimum of the two, we parenthesize P and QR in the product PQRT, i.e. (P(QR))T.
Finally we parenthesize the whole product, i.e. ((P(QR))T). The top entry in the table, i.e. M14,
gives the optimum cost of ((P(QR))T).
Algorithm Matrix_Chain_Mul(p)
{
    for i ← 1 to n do
        M[i, i] ← 0
    for len ← 2 to n do
    {
        for i ← 1 to n - len + 1 do
        {
            j ← i + len - 1
            M[i, j] ← ∞
            for k ← i to j - 1 do
            {
                q ← M[i, k] + M[k + 1, j] + P(i-1) * P(k) * P(j)
                if q < M[i, j] then
                    M[i, j] ← q
            }
        }
    }
    return M
}
Time Complexity: The algorithm Matrix_Chain_Mul uses the first for loop to initialize M[i, i], which
takes O(n). The M[i, j] values are computed using three nested for loops, which takes O(n3). Thus the
overall time complexity of Matrix_Chain_Mul is O(n3).
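A runnable Python sketch of Matrix_Chain_Mul (names ours). On the dimension array of the P, Q, R, T
example above, p = [5, 4, 6, 2, 7], it yields 158, the cost of the parenthesization ((P(QR))T):

```python
from math import inf

def matrix_chain_mul(p):
    # p: dimension array; matrix i has dimensions p[i-1] x p[i]
    n = len(p) - 1
    # M[i][j] = minimum scalar multiplications to compute A_i ... A_j
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            M[i][j] = inf
            for k in range(i, j):           # try every split point
                q = M[i][k] + M[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < M[i][j]:
                    M[i][j] = q
    return M[1][n]
```

For example, `matrix_chain_mul([5, 4, 6, 2, 7])` returns 158, matching the cost of ((P(QR))T):
QR costs 4*6*2 = 48, P(QR) adds 5*4*2 = 40, and multiplying by T adds 5*2*7 = 70.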
Here longest means the subsequence of maximum length. Common means the characters appear in both
strings. A subsequence is formed by taking characters from a string in their original order, though
not necessarily contiguously.
• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the
final solution.
• Subproblems: “find LCS of pairs of prefixes of X and Y”
• First, we’ll find the length of LCS. Later we’ll modify the algorithm to find LCS itself.
• Define Xi, Yj to be the prefixes of X and Y of length i and j respectively
• Define c[i,j] to be the length of LCS of Xi and Yj
• Then the length of LCS of X and Y will be c[m,n]
c[i, j] = c[i-1, j-1] + 1               if x[i] = y[j],
c[i, j] = max(c[i, j-1], c[i-1, j])     otherwise
• We start with i = j = 0 (empty substrings of x and y)
• Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] = 0)
• LCS of empty string and any other string is empty, so for every i and j: c[0, j] = c[i,0] = 0
• When we calculate c[i,j], we consider two cases:
• First case: x[i] = y[j]: one more symbol in strings X and Y matches, so the length of the LCS of Xi
and Yj equals the length of the LCS of the smaller strings Xi-1 and Yj-1, plus 1
• Second case: x[i] != y[j]
• As the symbols don’t match, our solution is not improved, and the length of LCS(Xi, Yj) is the same
as before (i.e. the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj))
LCS Algorithm:
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if ( X[i] == Y[j] )
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
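LCS-Length translates directly to Python; with 0-based indexing, X[i-1] plays the role of Xi in the
pseudocode above:

```python
def lcs_length(X, Y):
    m, n = len(X), len(Y)
    # c[i][j] = length of the LCS of X[:i] and Y[:j];
    # row 0 and column 0 stay 0 (LCS with an empty prefix is empty)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                # matching symbols extend the LCS of the smaller prefixes
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                # otherwise keep the better of dropping a symbol from X or Y
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]
```

On the example below (X = ABCB, Y = BDCAB) it returns 3, the length of the common subsequence BCB.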
Example:
X = ABCB, Y = BDCAB
Initially, the table is as shown below.