Lecture-7-Dynamic Programming Global-Sequence Alignment

The document describes the dynamic programming method for sequence alignment. It involves creating a scoring matrix to find the optimal alignment between two sequences by accounting for matches and mismatches. The method proceeds in four steps - initialization, matrix filling via scoring, traceback, and obtaining the final alignment.

Uploaded by Veer khade
Copyright: © All Rights Reserved

Sequence Alignment Algorithms:
Dynamic Programming

Dr. Aditya Kumar Padhi
Laboratory for Computational Biology & Biomolecular Design (LCBD)
School of Biochemical Engineering, IIT (BHU)
Sequence Alignment Algorithms
The three primary methods of producing pairwise sequence alignments:

1. Dot-matrix method.

2. The dynamic programming (DP) algorithm (advanced method).

3. Word or k-tuple methods.


Sequence Alignment Algorithms
The dot-matrix method:

• The two sequences are written out as the column and row headings of a
two-dimensional matrix.

• A dot is put in the dot-matrix plot at every position where the nucleotides in
the two sequences are identical.

• The alignment is defined by a path from the upper-left element to the
lower-right element.
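
The dot-matrix construction described above can be sketched in a few lines of Python. This is only an illustration: the function names dot_matrix and render and the example sequences ATCG and TCG are assumptions made for this sketch, and the plot is drawn with row 0 at the top.

```python
def dot_matrix(row_seq: str, col_seq: str) -> list[list[bool]]:
    """Cell [i][j] is True where row_seq[i] and col_seq[j] are identical."""
    return [[a == b for b in col_seq] for a in row_seq]


def render(row_seq: str, col_seq: str) -> str:
    """Draw the plot as text: '*' marks a dot, '.' marks an empty cell."""
    lines = ["  " + " ".join(col_seq)]
    for a, hits in zip(row_seq, dot_matrix(row_seq, col_seq)):
        lines.append(a + " " + " ".join("*" if hit else "." for hit in hits))
    return "\n".join(lines)


# Hypothetical example sequences, chosen only for illustration.
print(render("ATCG", "TCG"))
```

In the printed plot the three dots fall on one diagonal, which is the kind of path the last bullet refers to.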
Advantages of Dot-Matrix method

The vertical gap indicates that a coding region corresponding to ~75 amino acids has either been deleted from the human gene or inserted into the bacterial gene.

The two diagonally oriented parallel lines most probably indicate that a small internal duplication has occurred in the bacterial gene.
Disadvantages of Dot-Matrix method

May not identify the best alignment


Dynamic programming method
• Dynamic programming determines the optimal alignment by comparing all
possible pairs of characters between the two sequences.

• It is fundamentally similar to the dot-matrix method in that it also
creates a 2-dimensional alignment grid.

• However, it finds the alignment in a more quantitative way by converting
the dot matrix into a scoring matrix that accounts for matches and
mismatches between the sequences.

• By searching for the path with the highest total score through this matrix,
the best alignment can be obtained.
Global alignment method
• Needleman & Wunsch were the first to propose this method.

• 4 steps in dynamic programming:

1. Initialization

2. Matrix filling (scoring)

3. Traceback

4. Alignment
Global alignment method
1. Initialization: This is the 1st step in the global alignment dynamic programming
approach, where a matrix with (M+1) columns and (N+1) rows is created.
M and N are the lengths of the two sequences to be aligned.

2. Matrix filling (scoring): We fill each box of the matrix with the highest score it can take.

3. Traceback: Start from the last corner and follow the arrows back to the starting box.

4. Alignment: Read the alignment off the traceback path and verify it.

• All these steps are to be performed by following certain RULES.


Global alignment method
RULES

1. The 1st row and 1st column should be labelled with the two sequences.

2. Fill the 1st block with a gap and “0”.

3. Put scores for each category (if Match: +1, if Mismatch: -1, if Gap: -2).

4. The 1st column and the last (bottom) row should be filled with scores or values first.

5. The scoring or value placement should then proceed from the lower left towards the
upper right of the matrix.
Initialization
2 Sequences
• GCTA (Sequence 1)
• TCG (Sequence 2)
• We need to do the alignment

GAP 0

GAP T C G

RULES
• 1st row and 1st column should correspond to the two sequences.
• Fill 1st block with a gap and “0”.
• Put scores for each category (Match: +1, Mismatch: -1, Gap: -2)
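
As a minimal sketch of the initialization step, the grid and the scoring constants can be set up in Python as follows. The names MATCH, MISMATCH, GAP and empty_grid are assumptions made for this sketch. It also assumes that index 0 of the Python list is the bottom (GAP) row of the slides, so Sequence 1 (GCTA, written from top to bottom) is read upward from the GAP box as A, T, C, G.

```python
MATCH, MISMATCH, GAP = 1, -1, -2  # scores taken from the rules above


def empty_grid(row_seq: str, col_seq: str):
    """Create an (N+1)-row by (M+1)-column grid; only the GAP/GAP block is filled."""
    n_rows, n_cols = len(row_seq) + 1, len(col_seq) + 1
    grid = [[None] * n_cols for _ in range(n_rows)]
    grid[0][0] = 0  # the 1st block holds "0"
    return grid


# Assumption: Sequence 1 read from the GAP box upward, Sequence 2 left to right.
grid = empty_grid("ATCG", "TCG")
print(len(grid), "rows x", len(grid[0]), "columns")  # 5 rows x 4 columns
```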
Initialization
Sequences
• GCTA
• TCG
• We need to do the alignment

G

C

T

A ?

GAP 0

GAP T C G

RULES
• 1st row and 1st column should correspond to the two sequences.
• Fill 1st block with a gap and “0”.
• Put scores for each category (Match: +1, Mismatch: -1, Gap: -2)

Direction of arrow
• Box beside: add the gap score of -2 to the value of the box beside (to the left).
• Box bottom: add the gap score of -2 to the value of the box below.
• Diagonal box (Match/Mismatch criteria): add +1 for a match or -1 for a mismatch to the value of the diagonal (lower-left) box.
• Then keep only the highest of the three values, together with its arrow direction.
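
The rule for a single box can be written as a small Python function. This is a sketch, not the lecture's own code: the name score_cell and the arrow labels are assumptions, and the neighbour values in the usage line are illustrative inputs.

```python
MATCH, MISMATCH, GAP = 1, -1, -2


def score_cell(beside: int, bottom: int, diagonal: int, a: str, b: str):
    """Return (best score, arrow) for one box, given its three filled neighbours."""
    candidates = {
        "beside": beside + GAP,      # move in from the box to the left
        "bottom": bottom + GAP,      # move in from the box below
        "diagonal": diagonal + (MATCH if a == b else MISMATCH),
    }
    # max() returns the first key among ties, so only one arrow is reported here.
    arrow = max(candidates, key=candidates.get)
    return candidates[arrow], arrow


# Illustrative box: A against T, with neighbours -2 (beside), -2 (bottom), 0 (diagonal).
print(score_cell(-2, -2, 0, "A", "T"))  # (-1, 'diagonal')
```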
Matrix Filling / Scoring
C-1

GAP 0 -2 -4 -6 R-4

GAP T C G

RULES
• Column-1 (C-1) and Row-4 (R-4) have to be filled up first.
• To fill them, simply keep adding the value of the “gap”, i.e. “-2”, from one box to the next.
• Put scores for each category (Match: +1, Mismatch: -1, Gap: -2)

• Direction of arrow
• Box beside: add the gap score of -2 to the value of the box beside (to the left).
• Box bottom: add the gap score of -2 to the value of the box below.
• Diagonal box (Match/Mismatch criteria): add +1 for a match or -1 for a mismatch to the value of the diagonal (lower-left) box.
• Then keep only the highest of the three values, together with its arrow direction.
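
Filling the first column and the gap row amounts to repeatedly adding the gap score. A minimal Python sketch, assuming index 0 of the list is the bottom (GAP) row of the slides and using the hypothetical helper name init_borders:

```python
GAP = -2


def init_borders(row_seq: str, col_seq: str):
    """Fill only the gap row and the first column; all other boxes stay empty."""
    n_rows, n_cols = len(row_seq) + 1, len(col_seq) + 1
    grid = [[None] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        grid[0][j] = j * GAP  # gap row: 0, -2, -4, ...
    for i in range(n_rows):
        grid[i][0] = i * GAP  # first column: 0, -2, -4, ...
    return grid


# Assumption: Sequence 1 (GCTA) is read upward from the GAP box as ATCG.
grid = init_borders("ATCG", "TCG")
print(grid[0])                   # [0, -2, -4, -6]
print([row[0] for row in grid])  # [0, -2, -4, -6, -8]
```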
Matrix Filling / Scoring
C-1

G -8

C -6

T -4

A -2

GAP 0 -2 -4 -6 R-4

GAP T C G

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 X X X

A -2 X X X

GAP 0 -2 -4 -6

GAP T C G

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 X X X

A -2 ? X X

GAP 0 -2 -4 -6

GAP T C G

Candidates so far for the box marked “?” (A against T): -4 (box beside) and -4 (box bottom)

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

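T -4 X X X

A -2 ? X X

GAP 0 -2 -4 -6

GAP T C G

Candidates for the box marked “?” (A against T): -4 (box beside), -4 (box bottom) and 0-1 = -1 (diagonal box, mismatch)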

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 X X X

A -2 0-1= (-1) X X

GAP 0 -2 -4 -6

GAP T C G

SCORING
• -1 is greater.
• So, we will delete -4.
• We will score this box with -1.
• Specify the direction.

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 X X X

A -2 -1 ? X

GAP 0 -2 -4 -6

GAP T C G

Candidates for the box marked “?” (A against C):
• Box beside: -1-2 = (-3)
• Box bottom: -4-2 = (-6)
• Diagonal box: -2-1 = (-3)

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

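T -4 X X X

A -2 -1 ? ?

GAP 0 -2 -4 -6

GAP T C G

Candidates for the two boxes marked “?”:
• A against C: -3 (box beside) and -3 (diagonal box)
• A against G: -5 (box beside), -5 (diagonal box) and -8 (box bottom)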

RULES
• Put scores for each category (Match: +1, Mismatch: -1, Gap: -2)

• Direction of arrow
• Box beside: add the gap score of -2 to the value of the box beside (to the left).
• Box bottom: add the gap score of -2 to the value of the box below.
• Diagonal box (Match/Mismatch criteria): add +1 for a match or -1 for a mismatch to the value of the diagonal (lower-left) box.
• Then keep only the highest value & its arrow direction (here 2 arrows are kept, because two candidates tie for the highest value).
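
When two candidates tie, both arrows can be recorded. The following Python sketch shows one way to do that; the function name best_arrows and the arrow labels are assumptions made for illustration.

```python
MATCH, MISMATCH, GAP = 1, -1, -2


def best_arrows(beside: int, bottom: int, diagonal: int, a: str, b: str):
    """Return the highest candidate score and every arrow that reaches it."""
    candidates = {
        "beside": beside + GAP,
        "bottom": bottom + GAP,
        "diagonal": diagonal + (MATCH if a == b else MISMATCH),
    }
    best = max(candidates.values())
    return best, [name for name, value in candidates.items() if value == best]


# Box A against C: neighbours are -1 (beside), -4 (bottom), -2 (diagonal).
print(best_arrows(-1, -4, -2, "A", "C"))  # (-3, ['beside', 'diagonal'])
```

Keeping every tied arrow preserves all equally good traceback paths; keeping just one of them is enough to recover a single optimal alignment.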
Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 X X X

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 ? X X

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

Candidates for the box marked “?” (T against T): -6 (box beside), -1 (diagonal box) and -3 (box bottom)

SCORING
• -1 is greater.
• So, we will delete -6 and -3.
• We will score this box with -1.

Matrix Filling / Scoring
G -8 X X X

C -6 X X X

T -4 -1 X X

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

Matrix Filling / Scoring
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G
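
The complete matrix-filling step can be sketched in Python as below. This is a minimal illustration under stated assumptions: index 0 of the list is the bottom (GAP) row of the slides, Sequence 1 (GCTA) is therefore read upward from the GAP box as ATCG, and fill_matrix is a hypothetical helper name.

```python
MATCH, MISMATCH, GAP = 1, -1, -2


def fill_matrix(row_seq: str, col_seq: str):
    """Fill every box with the highest of its beside, bottom and diagonal candidates."""
    n_rows, n_cols = len(row_seq) + 1, len(col_seq) + 1
    grid = [[0] * n_cols for _ in range(n_rows)]
    for j in range(1, n_cols):
        grid[0][j] = j * GAP  # gap row
    for i in range(1, n_rows):
        grid[i][0] = i * GAP  # first column
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            pair = MATCH if row_seq[i - 1] == col_seq[j - 1] else MISMATCH
            grid[i][j] = max(
                grid[i][j - 1] + GAP,       # box beside
                grid[i - 1][j] + GAP,       # box bottom
                grid[i - 1][j - 1] + pair,  # diagonal box
            )
    return grid


grid = fill_matrix("ATCG", "TCG")
# Print the top row first so that the layout matches the slides.
for label, row in reversed(list(zip("-ATCG", grid))):
    print(label, row)
```

With these inputs the printed rows carry the same values as the filled matrix shown on the slide.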

Traceback
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

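Traceback

• For a global alignment, the traceback starts from the last corner of the matrix (here the upper-right box), as stated in step 3.

• Here that box holds +1, which is also the highest value in this matrix; it is called the “most suitable value”.

• From this box, we need to trace back down to the starting value i.e. 0.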
Traceback
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

Traceback

• Based on the direction of the previous blue arrow, i.e. the box from which each value came,
we now move down one box at a time.

• We will now go to the next step i.e. “alignment”.


Traceback
G -8 -5 -2 [+1]

C -6 -3 [0] -2

T -4 [-1] -2 -4

A [-2] -1 -3 -5

GAP [0] -2 -4 -6

GAP T C G

Traceback

• Based on the direction of the previous blue arrow, we now come down box by box (the boxes on the path are shown in square brackets).

• We now have a straight diagonal line from (+1) down to (-2), the box next to (0), followed by one vertical step to (0).

• We will now go to the next step i.e. “alignment”.
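
The traceback walk can be sketched in Python as follows. The matrix values are copied from the slides; the function name traceback, the arrow labels and the tie-breaking order (diagonal first, then bottom, then beside) are assumptions made for this sketch. Index 0 of the list is the bottom (GAP) row, so Sequence 1 is read as ATCG.

```python
MATCH, MISMATCH, GAP = 1, -1, -2

# Filled matrix copied from the slides; index 0 is the bottom (GAP) row.
GRID = [
    [0, -2, -4, -6],   # GAP
    [-2, -1, -3, -5],  # A
    [-4, -1, -2, -4],  # T
    [-6, -3, 0, -2],   # C
    [-8, -5, -2, 1],   # G
]


def traceback(grid, row_seq: str, col_seq: str):
    """Walk from the last corner back to the GAP/GAP box, recording each arrow."""
    i, j = len(row_seq), len(col_seq)
    moves = []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = MATCH if row_seq[i - 1] == col_seq[j - 1] else MISMATCH
            if grid[i][j] == grid[i - 1][j - 1] + pair:
                moves.append("diagonal")
                i, j = i - 1, j - 1
                continue
        if i > 0 and grid[i][j] == grid[i - 1][j] + GAP:
            moves.append("bottom")  # value came from the box below
            i -= 1
        else:
            moves.append("beside")  # value came from the box to the left
            j -= 1
    return moves


print(traceback(GRID, "ATCG", "TCG"))
# ['diagonal', 'diagonal', 'diagonal', 'bottom']
```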


Alignment
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

RULE for Alignment

• We need to look at each traceback arrow (shown in green) and its direction.

• If the arrow is diagonal, we write the characters of both sequences.

• If the arrow is horizontal or vertical, we write a gap in one sequence opposite the character of the other.
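
These alignment rules can be sketched in Python by turning a list of traceback arrows into two aligned strings. The arrow labels, the list of moves and the function name build_alignment are assumptions made for this sketch; "bottom" stands for a vertical arrow and "beside" for a horizontal one, and Sequence 1 is read as ATCG from the GAP box upward.

```python
def build_alignment(moves, row_seq: str, col_seq: str):
    """Convert traceback arrows (listed from the last corner to the start) into an alignment."""
    i, j = len(row_seq), len(col_seq)
    top, bottom_line = [], []
    for move in moves:
        if move == "diagonal":  # write the characters of both sequences
            i, j = i - 1, j - 1
            top.append(row_seq[i])
            bottom_line.append(col_seq[j])
        elif move == "bottom":  # vertical arrow: gap in the column sequence
            i -= 1
            top.append(row_seq[i])
            bottom_line.append("-")
        else:                   # horizontal arrow: gap in the row sequence
            j -= 1
            top.append("-")
            bottom_line.append(col_seq[j])
    # The arrows were followed backwards, so reverse to read left to right.
    return "".join(reversed(top)), "".join(reversed(bottom_line))


# Assumed arrows for the worked example: three diagonal steps, then one vertical step.
moves = ["diagonal", "diagonal", "diagonal", "bottom"]
print(build_alignment(moves, "ATCG", "TCG"))  # ('ATCG', '-TCG')
```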


Alignment
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

A T C G
[GAP] T C G
This is the Alignment



Alignment
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

A T C G
[GAP] T C G
This is the Alignment



Alignment: cross-checking
G -8 -5 -2 +1

C -6 -3 0 -2

T -4 -1 -2 -4

A -2 -1 -3 -5

GAP 0 -2 -4 -6

GAP T C G

A T C G
[GAP] T C G
= (-2) + (1) + (1) + (1)
= (+1)

Alignment

• To check the accuracy of the alignment, we compare the score of the alignment
with the score in the last corner of the matrix, where the traceback started (here it is also the highest score in the matrix).
• (+1) matches the score in that box, i.e. (+1).
• That indicates the alignment is correct.
• This cross-checking is important.
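
The cross-check can be sketched in Python by scoring the alignment column by column and comparing the total with the corner value of the matrix. The function name alignment_score and the use of "-" for a gap are assumptions made for this sketch.

```python
MATCH, MISMATCH, GAP = 1, -1, -2


def alignment_score(top: str, bottom_line: str) -> int:
    """Add up the score of every column of an alignment ('-' marks a gap)."""
    total = 0
    for a, b in zip(top, bottom_line):
        if a == "-" or b == "-":
            total += GAP
        elif a == b:
            total += MATCH
        else:
            total += MISMATCH
    return total


corner_value = 1  # value of the last corner box of the filled matrix
score = alignment_score("ATCG", "-TCG")
print(score, score == corner_value)  # 1 True
```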
Next class:
• We will go through “Scoring matrices”.

• PAM matrices.

• BLOSUM matrices

• Comparison between them.

• Statistical significance of sequence alignment.

Thank you
