0% found this document useful (0 votes)
0 views

algorithm_lecture 07_LCS

The document discusses the Longest Common Subsequence (LCS) algorithm, which is used to compare two sequences, such as DNA strings. It details the brute-force approach and the more efficient dynamic programming method to find the length of the LCS, explaining the recursive solution and providing an example with sequences X and Y. The algorithm's complexity and the process of filling a table to derive the LCS length are also outlined.

Uploaded by

tachbirdewan.bd
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
0 views

algorithm_lecture 07_LCS

The document discusses the Longest Common Subsequence (LCS) algorithm, which is used to compare two sequences, such as DNA strings. It details the brute-force approach and the more efficient dynamic programming method to find the length of the LCS, explaining the recursive solution and providing an example with sequences X and Y. The algorithm's complexity and the process of filling a table to derive the LCS length are also outlined.

Uploaded by

tachbirdewan.bd
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 29

Algorithms

Dynamic programming
Longest Common Subsequence

01/21/25 1
Longest Common Subsequence (LCS)

Application: comparison of two DNA strings


Ex: X= {A B C B D A B }, Y= {B D C A B A}
Longest Common Subsequence:
X= AB C B DAB
Y= BDCA B A
Brute force algorithm would compare each
subsequence of X with the symbols in Y

01/21/25 2
LCS Algorithm
• if |X| = m, |Y| = n, then there are 2m
subsequences of x; we must compare each
with Y (n comparisons)
• So the running time of the brute-force
algorithm is O(n 2m)
• Notice that the LCS problem has optimal
substructure: solutions of subproblems are
parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes
of X and Y”

01/21/25 3
LCS Algorithm
• First we’ll find the length of LCS. Later we’ll
modify the algorithm to find LCS itself.
• Define Xi, Yj to be the prefixes of X and Y of
length i and j respectively
• Define c[i,j] to be the length of LCS of Xi and
Yj
• Then the length of LCS of X and Y will be
c[m,n] c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

01/21/25 4
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• We start with i = j = 0 (empty substrings of x


and y)
• Since X0 and Y0 are empty strings, their LCS
is always empty (i.e. c[0,0] = 0)
• LCS of empty string and any other string is
empty, so for every i and j: c[0, j] = c[i,0] = 0
01/21/25 5
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• When we calculate c[i,j], we consider two


cases:
• First case: x[i]=y[j]: one more symbol in
strings X and Y matches, so the length of LCS
Xi and Yj equals to the length of LCS of

01/21/25
smaller strings Xi-1 and Yi-1 , plus 1 6
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• Second case: x[i] != y[j]


• As symbols don’t match, our solution is not
improved, and the length of LCS(Xi , Yj) is
the same as before (i.e. maximum of LCS(X i,
Yj-1) and LCS(Xi-1,Yj)

01/21/25
Why not just take the length of LCS(X i-1, Yj-1) ? 7
LCS Length Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if ( Xi == Yj )
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
01/21/25 8
LCS Example
We’ll see how LCS algorithm works on the
following example:
• X = ABCB
• Y = BDCAB

What is the Longest Common Subsequence


of X and Y?

LCS(X, Y) = BCB
X=AB C B
01/21/25 Y= BDCAB 9
ABCB
LCS Example (0) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
A
1
2 B

3 C

4 B

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,4]
01/21/25 10
ABCB
LCS Example (1) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0
2 B
0
3 C 0
4 B 0

for i = 1 to m c[i,0] = 0
for j = 1 to n c[0,j] = 0
01/21/25 11
ABCB
LCS Example (2) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 12
ABCB
LCS Example (3) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 13
ABCB
LCS Example (4) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 14
ABCB
LCS Example (5) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 15
ABCB
LCS Example (6) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 16
ABCB
LCS Example (7) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 17
ABCB
LCS Example (8) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 18
ABCB
LCS Example (10) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 19
ABCB
LCS Example (11) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 20
ABCB
LCS Example (12) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 21
ABCB
LCS Example (13) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 22
ABCB
LCS Example (14) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 23
ABCB
LCS Example (15) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 24
LCS Algorithm Running Time

• LCS algorithm calculates the values of each


entry of the array c[m,n]
• So what is the running time?

O(m*n)
since each c[i,j] is calculated in
constant time, and there are m*n
elements in the array
01/21/25 25
How to find actual LCS
• So far, we have just found the length of LCS,
but not LCS itself.
• We want to modify this algorithm to make it
output Longest Common Subsequence of X
and Y
Each c[i,j] depends on c[i-1,j] and c[i,j-1]
or c[i-1, j-1]
For each c[i,j] we can say how it was acquired:
2 2 For example, here
2 3 c[i,j] = c[i-1,j-1] +1 = 2+1=3
01/21/25 26
How to find actual LCS - continued
• Remember that
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

 So we can start from c[m,n] and go backwards


 Whenever c[i,j] = c[i-1, j-1]+1, remember
x[i] (because x[i] is a part of LCS)
 When i=0 or j=0 (i.e. we reached the
beginning), output remembered letters in
reverse order
01/21/25 27
Finding LCS
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3

01/21/25 28
Finding LCS (2)
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
LCS (reversed order): B C B
LCS (straight order): B C B
01/21/25 (this string turned out to be a palindrome) 29

You might also like