lecture2_sequence_alignment
LECTURE NO 2
SEQUENCE ALIGNMENT
By:
Laila Sehar
abcdef
||
abdgf
Dynamic Programming:
Needleman-Wunsch algorithm
Smith-Waterman algorithm
Dynamic Programming
We distinguish two kinds of algorithm:
Global alignment algorithms, which optimize the
overall alignment between two sequences.
We assume that the two proteins are basically
similar over their entire length.
Local alignment algorithms, which seek only
relatively conserved pieces of sequence.
They align those parts of the sequences that appear
to have good similarity.
Global vs. Local Alignment
Global
LGPSSKQTGKGS-SRIWDN
|    |  |||   |  |
LN-ITKSAGKGAIMRLGDA
Local
--------GKG--------
        |||
--------GKG--------
Elements of Global Sequence Alignment
Alignment scores. The score of an alignment
is the sum of the scores for aligned pairs of
letters plus the scores for letters aligned with
nulls (gaps).
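The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `score_alignment` is hypothetical, and the default values (match +1, mismatch -1, gap -2) are assumptions chosen to agree with the global alignment example later in these notes.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=-2):
    """Sum the column scores of two aligned strings; '-' marks a null (gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += gap        # letter aligned with a null
        elif x == y:
            total += match      # identical pair of letters
        else:
            total += mismatch   # substituted pair of letters
    return total


print(score_alignment("AAAC", "A-GC"))  # 1 - 2 - 1 + 1 = -1
```

Each column contributes independently to the total, which is what makes the dynamic programming approach in the next slides possible.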
The Needleman-Wunsch algorithm
The Needleman-Wunsch algorithm is a dynamic
programming algorithm for optimal sequence
alignment (Needleman and Wunsch, 1970).
The optimal path can be determined by
incremental extension of the optimal sub-paths.
In a Needleman-Wunsch alignment, the optimal
path must stretch from beginning to end in both
sequences (hence the term ‘global alignment’).
The key to understanding this approach is to
observe how the alignment problem can be
divided into subproblems.
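One common way to write the subproblem decomposition is the recurrence below. The symbols are assumptions introduced for illustration: a and b are the two sequences of lengths m and n, s(x, y) is the score for aligning letters x and y, d is the (negative) score for a letter aligned with a null, and H(i, j) is the best score for aligning the first i letters of a with the first j letters of b.

```latex
\begin{aligned}
H(0,0) &= 0, \qquad H(i,0) = i\,d, \qquad H(0,j) = j\,d, \\
H(i,j) &= \max \bigl\{\, H(i-1,j-1) + s(a_i, b_j),\;\; H(i-1,j) + d,\;\; H(i,j-1) + d \,\bigr\}
\end{aligned}
```

Under this formulation the optimal global alignment score is H(m, n), and each cell depends only on its three already-computed neighbours.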
Steps:
Initialization
Matrix fill or scoring
Trace back and alignment
Pairwise Global Alignment of sequences
        empty    A    A    A    C
empty       0   -2   -4   -6   -8
A          -2    1   -1   -3   -5
G          -4   -1    0   -2   -4
C          -6   -3   -2   -1   -1

AAAC    AAAC
A-GC    -AGC
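The initialization and matrix fill steps can be sketched as follows. The lecture does not state the scoring scheme for this example; the values match +1, mismatch -1 and gap -2 are inferred from the matrix entries and should be treated as an assumption. The function name `nw_matrix` is hypothetical.

```python
def nw_matrix(row_seq, col_seq, match=1, mismatch=-1, gap=-2):
    """Fill the Needleman-Wunsch score matrix (rows: row_seq, columns: col_seq)."""
    m, n = len(row_seq), len(col_seq)
    H = [[0] * (n + 1) for _ in range(m + 1)]

    # Initialization: aligning a prefix against the empty sequence costs only gaps.
    for i in range(1, m + 1):
        H[i][0] = i * gap
    for j in range(1, n + 1):
        H[0][j] = j * gap

    # Matrix fill: each cell takes the best of diagonal, up and left moves.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if row_seq[i - 1] == col_seq[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + pair,   # align the two letters
                          H[i - 1][j] + gap,        # row letter against a null
                          H[i][j - 1] + gap)        # column letter against a null
    return H


for row in nw_matrix("AGC", "AAAC"):
    print(row)
```

With these assumed parameters the printed rows agree with the matrix for AGC against AAAC shown above, and the bottom-right cell holds the optimal global score.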
Finding the alignments that give
the highest score
The arrows constitute paths in the matrix. To find
the highest-scoring alignments, we follow the paths
backwards from H(m,n) to H(0,0).
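The traceback step can be sketched in Python as below. Instead of storing arrows, this version rechecks at each cell which neighbour could have produced the score, which is one common alternative and not necessarily how the lecture figure was built. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, gap -2, inferred from the example matrix) are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = i * gap
    for j in range(1, n + 1):
        H[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + pair,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)

    # Trace back from H(m,n) to H(0,0), preferring diagonal moves on ties.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")   # null in b
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])   # null in a
            j -= 1
    return H[m][n], "".join(reversed(top)), "".join(reversed(bottom))


print(needleman_wunsch("AAAC", "AGC"))  # (-1, 'AAAC', '-AGC')
```

Because several paths can tie, this sketch reports only one optimal alignment. A different tie-breaking order, or a traceback that branches on every tie, would reach the other co-optimal alignments with the same score.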
Local Alignment Algorithm
Sequences: CTTCAGC and CTCAGC
Match = 3
Mismatch = -2
Gap = -3
Smith-Waterman Algorithm
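A minimal sketch of the Smith-Waterman procedure, using the example sequences and scores given above (match 3, mismatch -2, gap -3). It differs from the global algorithm in two ways: cell scores are never allowed to drop below zero, and the traceback starts at the highest-scoring cell and stops at the first zero. The function name `smith_waterman` is hypothetical, and a linear gap score is assumed.

```python
def smith_waterman(a, b, match=3, mismatch=-2, gap=-3):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]   # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                       # start a fresh local alignment
                          H[i - 1][j - 1] + pair,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        pair = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = smith_waterman("CTTCAGC", "CTCAGC")
print(score)
print(top)
print(bottom)
```

For this pair the best local score is 15, which can be reached in more than one way (five consecutive matches, or six matches with one gap), so the alignment reported depends on how ties are broken in the traceback.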
Conclusion
The Needleman-Wunsch algorithm, for global alignments,
and the Smith-Waterman algorithm, for local alignments,
constitute the fundamental basis on which numerous
sequence comparison tools were built:
BLAST (database search)
FASTA (database search)
Clustal (multiple sequence alignment)
The database search tools use indexing techniques,
heuristics, and fast comparison methods to compare a query
sequence quickly against an entire database.