Algorithm
Dr. P. Borah
Professor & Head, Dept. of Animal Biotechnology
College of Veterinary Science
AAU, Khanapara, Guwahati-781022
Two types of alignment:
• Global alignment
• Local alignment
Global alignment
• Compares two sequences over their entire length and gives the
best overall alignment.
Needleman-Wunsch algorithm
Local Alignment
Finds the regions with the highest degree of similarity between
two sequences; gaps are allowed within the aligned region.
Smith-Waterman algorithm
Dotplots
Needleman-Wunsch algorithm
Seq. 1: AATGCAGCT
Seq. 2: ATATGCAGC
Match = 1
Mismatch = 0
Gap = -1
Seq. 1 runs across the top of the matrix and Seq. 2 down the side; the top-left cell starts at 0.
First row and first column initialised with cumulative gap penalties:
        A   A   T   G   C   A   G   C   T
    0  -1  -2  -3  -4  -5  -6  -7  -8  -9
A  -1
T  -2
A  -3
T  -4
G  -5
C  -6
A  -7
G  -8
C  -9
Each remaining cell takes the highest of three candidate scores:
• Z + M/MM (Z = diagonal neighbour; M/MM = match or mismatch score)
• X + Gap
• Y + Gap
X and Y are the neighbours directly above and directly to the left of the cell.
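The fill rule above can be sketched in Python. This is only an illustration: the function name `cell_score` and its parameter names are assumptions, not part of the slides. The scores are the ones used in this example (match = 1, mismatch = 0, gap = -1).

```python
MATCH, MISMATCH, GAP = 1, 0, -1  # scoring scheme from the slides


def cell_score(diag, up, left, a, b):
    """Return the Needleman-Wunsch score of one cell (hypothetical helper).

    diag, up, left: scores of the three already-filled neighbours.
    a, b: the two residues that meet at this cell.
    """
    sub = MATCH if a == b else MISMATCH
    return max(diag + sub, up + GAP, left + GAP)


# First cell of the example (A against A): the neighbours are 0, -1 and -1.
print(cell_score(0, -1, -1, "A", "A"))  # 1
```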
Filling the first row, cell by cell (the remaining rows are filled the same way):
        A   A   T   G   C   A   G   C   T
    0  -1  -2  -3  -4  -5  -6  -7  -8  -9
A  -1   1   0  -1
Completed matrix:
        A   A   T   G   C   A   G   C   T
    0  -1  -2  -3  -4  -5  -6  -7  -8  -9
A  -1   1   0  -1  -2  -3  -4  -5  -6  -7
T  -2   0   1   1   0  -1  -2  -3  -4  -5
A  -3  -1   1   1   1   0   0  -1  -2  -3
T  -4  -2   0   2   1   1   0   0  -1  -1
G  -5  -3  -1   1   3   2   1   1   0  -1
C  -6  -4  -2   0   2   4   3   2   2   1
A  -7  -5  -3  -1   1   3   5   4   3   2
G  -8  -6  -4  -2   0   2   4   6   5   4
C  -9  -7  -5  -3  -1   1   3   5   7   6
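As a rough check, the whole matrix can be filled programmatically. The sketch below assumes Seq. 1 across the top and Seq. 2 down the side, as in the slides; the function name `nw_matrix` is an illustrative choice.

```python
def nw_matrix(seq1, seq2, match=1, mismatch=0, gap=-1):
    """Fill a Needleman-Wunsch matrix (seq1 across the top, seq2 down the side)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    F = [[0] * cols for _ in range(rows)]
    # First row and first column: cumulative gap penalties.
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        F[i][0] = i * gap
    # Every other cell: best of diagonal, upper and left neighbour.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq2[i - 1] == seq1[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + sub,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    return F


F = nw_matrix("AATGCAGCT", "ATATGCAGC")
for label, row in zip(" ATATGCAGC", F):
    print(label, " ".join(f"{v:3d}" for v in row))
print("Global alignment score:", F[-1][-1])  # 6
```

The bottom-right cell holds the score of the best global alignment.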
Tracing back from the bottom-right cell (score 6) to the top-left cell gives the alignment:
A-ATGCAGCT
ATATGCAGC-
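A minimal sketch of the traceback, assuming ties are resolved in the order diagonal, up, left (the slides do not state a tie-breaking rule, so other equally good alignments may exist). The name `nw_align` is hypothetical.

```python
def nw_align(seq1, seq2, match=1, mismatch=0, gap=-1):
    """Global alignment: fill the matrix, then trace back from the last cell."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    F = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        F[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq2[i - 1] == seq1[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + sub,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)

    top, bottom = [], []
    i, j = rows - 1, cols - 1
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and seq2[i - 1] == seq1[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + sub:
            # Diagonal step: the two residues are paired.
            top.append(seq1[j - 1])
            bottom.append(seq2[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            # Vertical step: gap in seq1.
            top.append("-")
            bottom.append(seq2[i - 1])
            i -= 1
        else:
            # Horizontal step: gap in seq2.
            top.append(seq1[j - 1])
            bottom.append("-")
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), F[-1][-1]


top, bottom, score = nw_align("AATGCAGCT", "ATATGCAGC")
print(top)     # A-ATGCAGCT
print(bottom)  # ATATGCAGC-
print(score)   # 6
```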
Scoring the alignment A-ATGCAGCT / ATATGCAGC- (Match = 1, Mismatch = 0, Gap = -1):
Matches = 8
Mismatches = 0
Indels = 2
Total score = 8 x 1 + 0 + 2 x (-1) = 8 - 2 = 6
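The same tally can be reproduced with a few lines of Python. The helper `alignment_score` is an illustrative name; it assumes gaps are written as "-" and that both aligned strings have equal length.

```python
def alignment_score(top, bottom, match=1, mismatch=0, gap=-1):
    """Count matches, mismatches and indels in a gapped alignment and score it."""
    matches = mismatches = indels = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    total = matches * match + mismatches * mismatch + indels * gap
    return matches, mismatches, indels, total


print(alignment_score("A-ATGCAGCT", "ATATGCAGC-"))  # (8, 0, 2, 6)
```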
Smith-Waterman Algorithm for
Local Alignment
It is a modification of the Needleman-Wunsch algorithm.
Modifications:
1. No negative values are allowed; the first row and first column are filled with zeros.
2. Any score that would be negative is replaced by zero.
3. Backtracking is initiated from the largest value anywhere in the matrix.
4. Follow the path back from that value until a cell with value zero is reached.
5. Then align the sequences based on the directions of the
arrows, as done in the Needleman-Wunsch algorithm.
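Modifications 1 and 2 amount to adding zero as a fourth candidate in the fill rule. A minimal sketch follows; the name `sw_cell_score` is hypothetical, and the default scores are those of this deck's Smith-Waterman example (match = +2, mismatch = -1, gap = -2).

```python
def sw_cell_score(diag, up, left, a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score of one cell: as Needleman-Wunsch, but never below 0."""
    sub = match if a == b else mismatch
    return max(0, diag + sub, up + gap, left + gap)


print(sw_cell_score(0, 0, 0, "T", "A"))  # 0 (a negative score is replaced by zero)
print(sw_cell_score(2, 0, 0, "C", "C"))  # 4
```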
Seq 1: AATCGATCGG
Seq 2: TCAAGTC
Match = +2
Mismatch = -1
Gap = -2
Diagonal step = M/MM (match or mismatch); horizontal or vertical step = Gap
       A  A  T  C  G  A  T  C  G  G
    0  0  0  0  0  0  0  0  0  0  0
T   0  0  0  2  0  0  0  2  0  0  0
C   0  0  0  0  4  2  0  0  4  2  0
A   0  2  2  0  2  3  4  2  2  3  1
A   0  2  4  2  0  1  5  3  1  1  2
G   0  0  2  3  1  2  3  4  2  3  3
T   0  0  0  4  2  0  1  5  3  1  2
C   0  0  0  2  6  4  2  3  7  5  3
Local alignment:
AATCGA-TCGG
  TCAAGTC
Matches = 5 x 2 = 10
Mismatches = 1 x (-1) = -1
Gaps = 1 x (-2) = -2
Total score = 10 - 1 - 2 = +7
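The local alignment above can be reproduced with the sketch below. It assumes traceback starts at the first cell holding the maximum score and that ties are resolved in the order diagonal, up, left; the function name `smith_waterman` is an illustrative choice.

```python
def smith_waterman(seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Local alignment: returns the two aligned segments and the best score."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay at 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq2[i - 1] == seq1[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest score until a zero is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if seq2[i - 1] == seq1[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(seq1[j - 1])
            bottom.append(seq2[i - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append("-")
            bottom.append(seq2[i - 1])
            i -= 1
        else:
            top.append(seq1[j - 1])
            bottom.append("-")
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), best


top, bottom, score = smith_waterman("AATCGATCGG", "TCAAGTC")
print(top)     # TCGA-TC
print(bottom)  # TCAAGTC
print(score)   # 7
```

Only the aligned segment of Seq 1 is returned; the flanking residues (AA at the start, GG at the end) lie outside the local alignment.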
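DOT PLOT METHOD
Seq 1: TAATCGATCGG
Seq 2: TTCGAGTCAG
    T A A T C G A T C G G
T   x . . x . . . x . . .
T   x . . x . . . x . . .
C   . . . . x . . . x . .
G   . . . . . x . . . x x
A   . x x . . . x . . . .
G   . . . . . x . . . x x
T   x . . x . . . x . . .
C   . . . . x . . . x . .
A   . x x . . . x . . . .
G   . . . . . x . . . x x
The main diagonal breaks at two points: an extra G in Seq 2 (indel) and an A in Seq 2 opposite a G in Seq 1 (substitution).
Final alignment:
TAATCGA-TCGG
  TTCGAGTCAG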
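A dot plot is simple to generate: mark every cell where the residue of Seq 1 equals the residue of Seq 2. The sketch below uses an illustrative function name, `dot_plot`, and prints "." for empty cells.

```python
def dot_plot(seq1, seq2):
    """Return one string per residue of seq2: 'x' where it matches seq1, '.' elsewhere."""
    return ["".join("x" if a == b else "." for a in seq1) for b in seq2]


seq1, seq2 = "TAATCGATCGG", "TTCGAGTCAG"
print("  " + " ".join(seq1))
for label, row in zip(seq2, dot_plot(seq1, seq2)):
    print(label, " ".join(row))
```

Runs of x along a diagonal indicate matching stretches; a shift from one diagonal to a neighbouring one suggests an indel, and a single missing x within a diagonal suggests a substitution.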
Analysis of dot plot matrix
Thank you