The Needleman-Wunsch Algorithm for Sequence Alignment
7th Melbourne Bioinformatics Course
Vladimir Likić, Ph.D.
e-mail: [email protected]
Example:
SIMILARITY
PI-LLAR---
For example:
SIMILARITY
PI-LLAR---
--MOLARITY
ATGGCGT
ATG-AGT score: +1 + 1 + 1 + 0 − 1 + 1 + 1 = 4
An alternative alignment:
ATGGCGT
A-TGAGT score: +1 + 0 − 1 + 1 − 1 + 1 + 1 = 2
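The two scores above can be reproduced with a short sketch. It assumes the scheme implied by the arithmetic: +1 for a match, −1 for a mismatch, and 0 for a gap. The function name `score_alignment` is illustrative and not part of the course material.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=0):
    """Sum the per-column scores of two gapped strings of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap          # column containing a gap
        elif a == b:
            total += match        # identical residues
        else:
            total += mismatch     # different residues
    return total


print(score_alignment("ATGGCGT", "ATG-AGT"))  # 4
print(score_alignment("ATGGCGT", "A-TGAGT"))  # 2
```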
    C   T   A   G
C   1  -1  -1  -1
T  -1   1  -1  -1
A  -1  -1   1  -1
G  -1  -1  -1   1
    C   T   A   G
C   2   1  -1  -1
T   1   2  -1  -1
A  -1  -1   2   1
G  -1  -1   1   2
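Both tables can be generated from a simple rule, which makes the difference between them explicit. The sketch below assumes the second table gives the milder score of 1 to substitutions within the same chemical class (C↔T or A↔G) and −1 to the rest. The function names are illustrative.

```python
BASES = "CTAG"
PURINES = {"A", "G"}  # C and T are the pyrimidines


def identity_score(a, b):
    """First table: 1 for identical bases, -1 otherwise."""
    return 1 if a == b else -1


def class_aware_score(a, b):
    """Second table: 2 for identity, 1 within a class, -1 across classes."""
    if a == b:
        return 2
    same_class = (a in PURINES) == (b in PURINES)
    return 1 if same_class else -1


def build_matrix(score_fn):
    """Return a nested dict so that matrix[a][b] is the substitution score."""
    return {a: {b: score_fn(a, b) for b in BASES} for a in BASES}


simple = build_matrix(identity_score)
class_aware = build_matrix(class_aware_score)
print(simple["C"]["T"], class_aware["C"]["T"])  # -1 1
```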
The Needleman-Wunsch algorithm for sequence alignment – p.13/46
Protein substitution matrices
γ(n) = −o − (n − 1)e
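As a small worked example of the formula above, the sketch below reads o as the gap-opening penalty, e as the gap-extension penalty, and n as the gap length. This is the usual affine-gap reading, but it is an assumption here, and the values 10 and 1 are chosen only for illustration.

```python
def affine_gap_penalty(n, gap_open, gap_extend):
    """Return gamma(n) = -o - (n - 1) * e for a gap of length n >= 1."""
    if n < 1:
        raise ValueError("gap length must be at least 1")
    return -gap_open - (n - 1) * gap_extend


# Hypothetical parameters: opening costs 10, each extra position costs 1.
for length in (1, 2, 3):
    print(length, affine_gap_penalty(length, gap_open=10, gap_extend=1))
```

With these values a gap of length 3 costs 12, whereas three separate gaps of length 1 would cost 30, so a single long gap is penalized less than several short ones.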
Brute-force approach:
Generate the list of all possible alignments between two
sequences and score them
Select the alignment with the best score
$$\frac{2^{2N}}{\sqrt{\pi N}}$$
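To get a feel for how quickly this expression grows, the sketch below evaluates it for a few sequence lengths. It reads the expression as the approximate number of alignments a brute-force search would have to score for two sequences of length N. The comparison with the central binomial coefficient C(2N, N), whose Stirling approximation has this form, is added for illustration.

```python
import math


def approx_alignment_count(n):
    """Evaluate 2^(2N) / sqrt(pi * N) directly (fine for small N)."""
    return 2 ** (2 * n) / math.sqrt(math.pi * n)


def log10_alignment_count(n):
    """Same quantity on a log10 scale, which stays usable for large N."""
    return 2 * n * math.log10(2) - 0.5 * math.log10(math.pi * n)


print(approx_alignment_count(10), math.comb(20, 10))
for n in (10, 100, 250):
    print(f"N = {n}: about 10^{log10_alignment_count(n):.1f}")
```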
A divide-and-conquer strategy:
Break the problem into smaller subproblems.
Solve the smaller problems optimally.
Use the subproblem solutions to construct an optimal
solution for the original problem.
SEND
-AND score: +1
A-ND score: +3 ← the best
AN-D score: -3
AND- score: -8
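The four scores listed above can be checked with a short sketch. It assumes BLOSUM62 substitution scores and a gap penalty of −10 per gap position, which is consistent with the initialization values of the score matrix. Only the BLOSUM62 entries needed for SEND and AND are included, and the names are illustrative.

```python
GAP = -10  # assumed penalty per gap position

# Subset of BLOSUM62, keyed as (residue from SEND) + (residue from AND).
BLOSUM62_SUBSET = {
    "SA": 1, "EA": -1, "NA": -2, "DA": -2,
    "SN": 1, "EN": 0, "NN": 6, "DN": 1,
    "SD": 0, "ED": 2, "ND": 1, "DD": 6,
}


def score(top, bottom):
    """Score a gapped alignment column by column."""
    total = 0
    for a, b in zip(top, bottom):
        total += GAP if "-" in (a, b) else BLOSUM62_SUBSET[a + b]
    return total


candidates = ["-AND", "A-ND", "AN-D", "AND-"]
scores = {c: score("SEND", c) for c in candidates}
best = max(scores, key=scores.get)
print(scores)  # {'-AND': 1, 'A-ND': 3, 'AN-D': -3, 'AND-': -8}
print(best)    # A-ND
```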
[Figure: empty alignment matrix with the residues of SEND and AND labelling its two axes]
i = 1, 2, ..., N and j = 1, 2, ..., M
The score and traceback matrices
The first row and the first column of the score and
traceback matrices are filled during the initialization.
Score matrix                   Traceback matrix
           S    E    N    D                S     E     N     D
A   −10                        A    up
N   −20                        N    up
D   −30                        D    up
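The initialization step can be sketched as follows, assuming a linear gap penalty of −10 and 0-based list indices (the slides count cells from 1). The "up" labels for the first column come from the slides. The "left" and "done" labels for the first row and the corner cell are assumptions made for this sketch.

```python
def init_matrices(rows_seq, cols_seq, gap=-10):
    """Create score and traceback matrices with first row/column filled."""
    n_rows, n_cols = len(rows_seq) + 1, len(cols_seq) + 1
    score = [[None] * n_cols for _ in range(n_rows)]
    trace = [[None] * n_cols for _ in range(n_rows)]
    score[0][0], trace[0][0] = 0, "done"      # corner cell (assumed label)
    for j in range(1, n_cols):                # first row: gaps only
        score[0][j], trace[0][j] = j * gap, "left"
    for i in range(1, n_rows):                # first column: gaps only
        score[i][0], trace[i][0] = i * gap, "up"
    return score, trace


score, trace = init_matrices("AND", "SEND")
print([row[0] for row in score])  # [0, -10, -20, -30]
print([row[0] for row in trace])  # ['done', 'up', 'up', 'up']
```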
The score matrix cells are filled by row starting from the
cell C(2, 2)
C(i−1,j−1) C(i−1,j)
C(i,j−1) C(i,j)
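The four cells shown above are the ones involved in each update. A standard way to write the Needleman-Wunsch update for a linear gap penalty is given below. The symbols S(i, j), the substitution score for the two residues that meet at cell (i, j), and g, the gap penalty (−10 in this example), are notation assumed here rather than taken from the slides.

```latex
C(i,j) = \max \left\{
  \begin{array}{l}
    C(i-1,\,j-1) + S(i,j) \\
    C(i-1,\,j) + g \\
    C(i,\,j-1) + g
  \end{array}
\right.
```

The traceback matrix records which of the three terms produced the maximum.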
Score matrix                   Traceback matrix
           S    E    N    D                S     E     N     D
A   −10    ?                   A    up     ?
N   −20                        N    up
D   −30                        D    up
To compute C(2, 2), the values C(1, 1), C(1, 2), and C(2, 1) are
read from the score matrix, and S(S, A) is the score for the
S ↔ A substitution, taken from the BLOSUM62 matrix.
Score matrix                   Traceback matrix
           S    E    N    D                S     E     N     D
A   −10    1                   A    up  diag
N   −20                        N    up
D   −30                        D    up
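The value 1 and the direction "diag" in the first filled cell can be checked with a short sketch. It assumes a corner value of 0, neighbouring boundary values of −10, and a gap penalty of −10, all inferred from the initialization rather than stated explicitly. Preferring "diag" on ties is also an assumption.

```python
def fill_cell(diag, up, left, sub, gap=-10):
    """Return (score, direction) for one cell from its three neighbours."""
    options = [
        (diag + sub, "diag"),  # align the two residues
        (up + gap, "up"),      # gap: move from the cell above
        (left + gap, "left"),  # gap: move from the cell to the left
    ]
    # max() keeps the first maximum, so ties resolve in the order above.
    return max(options, key=lambda option: option[0])


# First cell: S(S, A) = 1 in BLOSUM62.
print(fill_cell(diag=0, up=-10, left=-10, sub=1))   # (1, 'diag')
# Next cell in the same row: S(E, A) = -1 in BLOSUM62.
print(fill_cell(diag=-10, up=-20, left=1, sub=-1))  # (-9, 'left')
```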
Score matrix                   Traceback matrix
           S    E    N    D                S     E     N     D
A   −10    1    ?              A    up  diag     ?
N   −20                        N    up
D   −30                        D    up
After all cells are filled, the score and traceback matrices
are:
S E N D S E N D
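The filled matrices can be regenerated with the sketch below. It assumes BLOSUM62 scores (only the entries needed for SEND and AND are included), a gap penalty of −10, and a preference for "diag" over "up" over "left" when scores tie. The tie-breaking rule and the "left" and "done" labels are assumptions of this sketch.

```python
GAP = -10
# Subset of BLOSUM62, keyed as (residue from SEND) + (residue from AND).
BLOSUM62_SUBSET = {
    "SA": 1, "EA": -1, "NA": -2, "DA": -2,
    "SN": 1, "EN": 0, "NN": 6, "DN": 1,
    "SD": 0, "ED": 2, "ND": 1, "DD": 6,
}


def fill_matrices(cols_seq, rows_seq, sub=BLOSUM62_SUBSET, gap=GAP):
    """Fill the score and traceback matrices row by row."""
    n, m = len(rows_seq), len(cols_seq)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    trace = [["done"] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j], trace[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        score[i][0], trace[i][0] = i * gap, "up"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = cols_seq[j - 1] + rows_seq[i - 1]
            options = [
                (score[i - 1][j - 1] + sub[pair], "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            score[i][j], trace[i][j] = max(options, key=lambda o: o[0])
    return score, trace


score, trace = fill_matrices("SEND", "AND")
for label, s_row, t_row in zip(" AND", score, trace):
    print(label, s_row, t_row)
```

Under these assumptions the bottom-right cell holds the optimal score, 3, and the last row of the traceback matrix reads up, up, diag, diag, diag.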
Traceback
[Figure: traceback matrix. Row D reads: up up diag diag diag. The traceback starts at the last cell of this row.]
The traceback step-by-step
[Figures: the step-by-step traceback, parts (2) to (4), each showing the same traceback matrix.]
Compare with the exhaustive search
SEND
A-ND
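The whole procedure, including the traceback, can be put together in one compact sketch and compared with the exhaustive result. It assumes BLOSUM62 scores (subset shown), a gap penalty of −10, and "diag" preferred on ties. It is an illustration under those assumptions, not the course's reference implementation.

```python
GAP = -10
# Subset of BLOSUM62, keyed as (residue from SEND) + (residue from AND).
BLOSUM62_SUBSET = {
    "SA": 1, "EA": -1, "NA": -2, "DA": -2,
    "SN": 1, "EN": 0, "NN": 6, "DN": 1,
    "SD": 0, "ED": 2, "ND": 1, "DD": 6,
}


def needleman_wunsch(cols_seq, rows_seq, sub=BLOSUM62_SUBSET, gap=GAP):
    """Return (aligned cols_seq, aligned rows_seq, optimal score)."""
    n, m = len(rows_seq), len(cols_seq)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    trace = [["done"] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j], trace[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        score[i][0], trace[i][0] = i * gap, "up"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = cols_seq[j - 1] + rows_seq[i - 1]
            options = [
                (score[i - 1][j - 1] + sub[pair], "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            score[i][j], trace[i][j] = max(options, key=lambda o: o[0])

    # Traceback: start at the bottom-right cell and follow the stored moves.
    top, bottom = [], []
    i, j = n, m
    while trace[i][j] != "done":
        move = trace[i][j]
        if move == "diag":
            top.append(cols_seq[j - 1])
            bottom.append(rows_seq[i - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            top.append("-")
            bottom.append(rows_seq[i - 1])
            i -= 1
        else:  # "left"
            top.append(cols_seq[j - 1])
            bottom.append("-")
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


print(needleman_wunsch("SEND", "AND"))  # ('SEND', 'A-ND', 3)
```

The result matches the best alignment found by the exhaustive search, with the same score of 3.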