Introduction Dynamic Programming

This document provides an introduction to dynamic programming algorithms for sequence alignment. It outlines three strategies for sequence alignment - visual inspection, enumerating all possible alignments, and using dot plots. Dynamic programming is introduced as a method to partition the global alignment problem into smaller subproblems. The Needleman-Wunsch algorithm uses a recursive definition and dynamic programming to find the optimal global alignment between two sequences in quadratic time.

Uploaded by

Thangathurai Kartheeswaran

Introduction to Dynamic Programming
The sequence alignment problem

Wilson Leung 08/2015

Outline
Overview of the sequence alignment problem
Calculate the optimal global alignment
Characteristics of dynamic programming algorithms
Calculate the optimal local alignment
Learning objectives
Understand the theory behind sequence alignment
Become a better informed user of NCBI BLAST

This presentation will not cover:

The BLAST algorithm
Parameter optimizations
Statistics for similarity searches (Karlin-Altschul theory)

Korf, I., Yandell, M., and Bedell, J. (2003). BLAST. O’Reilly Media, Inc.
Design goals

Generate an alignment between two sequences

Identify the “best” (most parsimonious) alignment
Generate the best alignment “quickly”
Strategy #1: Visual inspection

Query:   ATTACCAG
         || |||||
Subject: ATCACCAG
Sequences must have high percent identity
Applications:
PAM scoring matrix (align sequences with >= 85% identity)
Align mononucleotide runs during sequence improvement
Strategy #2: Enumerate all alignments

Guaranteed to find the best alignment

Does not scale
Combinatorial explosion
Two 300 bp sequences have ~10^179 possible alignments (Eddy 2004)

Brute-force algorithm
Establish baseline performance and test cases
Identify patterns in the problem space
Apply the brute force algorithm to a single column of the alignment

            Homologous    Not homologous
Query:      A             A-    -A
Subject:    A             -A    A-
Three possible alignments for two 1 bp sequences
Query length (M) = 1; Subject length (N) =1

Only two biological interpretations:

A in the query is homologous to A in the subject
A in the query is not homologous to A in the subject
Six possible relationships between the query and subject for M=2, N=2
(Each color in the original figure denotes a different evolutionary relationship)

2 aligned bases
Query:   AT
Subject: AT

1 aligned base
Query:   A-T    -AT    AT-    A-T    AT-    -AT
Subject: -AT    A-T    A-T    AT-    -AT    AT-

0 aligned bases
Query:   AT--    --AT    A-T-    -A-T    A--T    -AT-
Subject: --AT    AT--    -A-T    A-T-    -AT-    A--T
Observations from the brute force
alignment strategy
Many of the possible alignments are redundant
Imply the same evolutionary relationship

Large number of possible alignments

13 possible alignments for sequences of length 2

Can ignore many possible alignments

Many are suboptimal compared to the best alignment
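The counts quoted above (3 alignments for M=N=1, 13 for M=N=2) can be reproduced with a short sketch. It assumes that every distinct arrangement of columns counts as a separate alignment, and that the last column is always one of three types: an aligned pair, a query base over a gap, or a gap over a subject base. The function name count_alignments is hypothetical and not part of the original slides.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(m: int, n: int) -> int:
    """Count all gapped alignments of an m-base query and an n-base subject."""
    if m == 0 or n == 0:
        # One sequence is exhausted: the rest must be gaps, so one way remains.
        return 1
    # The last column is an aligned pair, a gap in the subject, or a gap in the query.
    return (count_alignments(m - 1, n - 1)
            + count_alignments(m - 1, n)
            + count_alignments(m, n - 1))


if __name__ == "__main__":
    for length in (1, 2, 3, 10):
        print(length, count_alignments(length, length))
```

Under this counting scheme the number of alignments grows very quickly with sequence length, which is why enumeration is useful only for tiny test cases.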
Strategy #3: Dot plot
Cell position (i,j):
i = Query position (x-axis)
j = Subject position (y-axis)
Draw a dot at (i,j) if the two bases are identical
Connect the dots to make a line (alignment)
Level of noise depends on repeat density
Use longer words and higher cutoff scores to reduce noise
[Figure: dot plot with Query on the x-axis and Subject on the y-axis; annotated features: Align, Deletion in subject, Insertion in subject]
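A dot plot with a word size of one base can be sketched in a few lines. This is a minimal illustration only: the function name dot_plot and the text rendering ("*" for a dot, "." for an empty cell) are assumptions, and the sequences are the ones from the visual inspection example.

```python
def dot_plot(query: str, subject: str) -> list[str]:
    """Return one text row per subject base; '*' marks identical bases."""
    rows = []
    for s_base in subject:  # j = subject position (y-axis)
        # i = query position (x-axis)
        rows.append("".join("*" if q_base == s_base else "." for q_base in query))
    return rows


if __name__ == "__main__":
    query, subject = "ATTACCAG", "ATCACCAG"
    print("  " + query)
    for s_base, row in zip(subject, dot_plot(query, subject)):
        print(s_base, row)
```

The main diagonal shows the alignment, and the off-diagonal dots are the noise that longer words and higher cutoff scores are meant to suppress.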
Assessment of the three sequence
alignment strategies
Infeasible to examine all possible alignments
Need to reduce the search space

Only a small subset of alignments are “interesting”

Many alignments are redundant

Connect the dots in the dot plot to create an alignment

Consider the cumulative levels of similarity
The optimal alignment is composed of smaller optimal alignments
Query: AT; Subject: AT
Candidate alignments:
Query:   A T    A - T    A T -    A - T -
Subject: A T    - A T    A - T    - A - T
Only the best alignment at each position could be part of the final optimal alignment
[Figure legend: Align, Deletion in subject, Insertion in subject]

Partition the alignment problem into smaller subproblems
[Figure: alignment matrix with Query on the x-axis and Subject on the y-axis, both running from position 1 to 100]
Assume the query and subject sequences are the same
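Three different ways to reach cell (i,j) in the alignment matrix
From (i-1, j-1): Align (A in query with A in subject)
From (i-1, j): Gap in subject (A in query with -)
From (i, j-1): Gap in query (- with A in subject)
Arrow = alignment
[Figure: cells (i-1,j-1), (i,j-1), (i-1,j), and (i,j), with Query on the x-axis and Subject on the y-axis]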
Construct a scoring system to measure
similarity between two sequences
Scoring system for the aligned state: 𝛔
𝛔(a, b) = Score for aligning a in query with b in subject
𝛔(A, A) = Bonus for aligning A in query with A in subject
𝛔(A, T) = Penalty for aligning A in query with T in subject

Penalty for adding a gap: 𝛾

More sophisticated scoring systems take transitions, transversions, and affine gap penalties into account
Pearson WR. Selecting the Right Similarity-Scoring Matrix. Curr Protoc
Bioinformatics. 2013;43:3.5.1-3.5.9.
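The scoring system can be sketched as two small functions. The numeric values (Match = +5; Mismatch = -2; Gap = -6) are the ones used in the worked example of this presentation; the names sigma, GAP, and alignment_score are hypothetical, and a linear (non-affine) gap penalty is assumed.

```python
MATCH, MISMATCH, GAP = 5, -2, -6  # values from the worked example


def sigma(a: str, b: str) -> int:
    """Score for aligning base a in the query with base b in the subject."""
    return MATCH if a == b else MISMATCH


def alignment_score(query_row: str, subject_row: str) -> int:
    """Sum the column scores of a finished alignment ('-' denotes a gap)."""
    total = 0
    for a, b in zip(query_row, subject_row):
        total += GAP if "-" in (a, b) else sigma(a, b)
    return total


if __name__ == "__main__":
    # Six matches, one mismatch, two gaps: 5*5 - 2 - 12 = 11
    print(alignment_score("TGCTCGTA", "T--TCATA"))
```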
Recursive definition for the optimal cumulative alignment score S(i,j)
S(i,j) = max {
  S(i-1,j-1) + 𝛔(a,b)   (Align)
  S(i-1,j) + 𝛾          (Gap in subject)
  S(i,j-1) + 𝛾          (Gap in query)
}
[Figure: cells (i-1,j-1), (i,j-1), (i-1,j), and (i,j); base a on the Query (x) axis, base b on the Subject (y) axis]
Determine the best way to reach cell (i,j) if it were part of the optimal alignment
S(i,j) = max {
  Align (a in query with b in subject)
  Gap in subject (a in query with -)
  Gap in query (- with b in subject)
}
Use the maximum score at each cell to eliminate entire branches of suboptimal alignments
[Figure: the three candidate steps into cell (i,j): Gap in query, Align, Gap in subject]
Cumulative score S(i,j) encapsulates the
alignment decisions up to position (i,j)
All potential optimal alignments that go through cell
(i,j) have the same ancestry
Re-use the cumulative alignment score (memoization)

Gaps are described by the cumulative score

Do not affect the coordinates of the alignment matrix

Do not know the optimal alignment until we complete the entire alignment matrix
Optimal alignment has the highest cumulative score
Needleman-Wunsch algorithm (global alignment)
(Query length: M; Subject length: N)
Construct an (M+1) x (N+1) matrix (M+1 query columns; N+1 subject rows)
Extra column and row = gaps at the beginning of the alignment

Fill in the cells in the first row and first column with the
cumulative gap costs
Calculate the maximum score for subsequent cells (i,j)
Keep track of the decision that leads to the maximum score (S)

S(i,j) = max {
  S(i-1,j-1) + 𝛔(a,b)
  S(i-1,j) + 𝛾
  S(i,j-1) + 𝛾
}
Needleman SB, Wunsch CD. A general method applicable to the search for similarities in
the amino acid sequence of two proteins. J Mol Biol. 1970 Mar;48(3):443-53.
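The matrix-filling steps above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name nw_matrix is hypothetical, the default scores are the ones from the worked example, and the matrix is stored as S[j][i] (row j = subject position, column i = query position) so that it prints in the same orientation as the slides.

```python
def nw_matrix(query: str, subject: str,
              match: int = 5, mismatch: int = -2, gap: int = -6) -> list[list[int]]:
    """Fill the Needleman-Wunsch score matrix; S[j][i] holds S(i,j)."""
    m, n = len(query), len(subject)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    # First row and first column: cumulative gap costs.
    for i in range(1, m + 1):
        S[0][i] = i * gap
    for j in range(1, n + 1):
        S[j][0] = j * gap
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            sigma = match if query[i - 1] == subject[j - 1] else mismatch
            S[j][i] = max(S[j - 1][i - 1] + sigma,  # align
                          S[j][i - 1] + gap,        # gap in subject
                          S[j - 1][i] + gap)        # gap in query
    return S


if __name__ == "__main__":
    for row in nw_matrix("TGCTCGTA", "TTCATA"):
        print(" ".join(f"{value:4d}" for value in row))
```

The bottom-right cell of the printed matrix is the optimal global alignment score.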
Initialize the alignment matrix
(Match = +5; Mismatch = -2; Gap = -6)

Columns = Query positions 0-8; rows = Subject positions 0-6 (Eddy, 2004)

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0   -6  -12  -18  -24  -30  -36  -42  -48
1  T   -6
2  T  -12
3  C  -18
4  A  -24
5  T  -30
6  A  -36
Calculate the possible scores for the cell at position (1,1)
Neighboring cells: S(0,0) = 0; S(1,0) = -6; S(0,1) = -6
S(1,1) = max {
  S(0,0) + 𝛔(T,T)   (Align)
  S(0,1) + 𝛾        (Gap in subject)
  S(1,0) + 𝛾        (Gap in query)
}
Calculate the optimal score for the cell at position (1,1)
(Match = +5; Mismatch = -2; Gap = -6)
S(1,1) = max {
  0 + (+5) = 5        (Align)
  -6 + (-6) = -12     (Gap in subject)
  -6 + (-6) = -12     (Gap in query)
}
S(1,1) = 5
Calculate the possible scores for the cell at position (2,1)
Neighboring cells: S(1,0) = -6; S(2,0) = -12; S(1,1) = 5
S(2,1) = max {
  S(1,0) + 𝛔(G,T)   (Align)
  S(1,1) + 𝛾        (Gap in subject)
  S(2,0) + 𝛾        (Gap in query)
}
Calculate the optimal score for the cell at position (2,1)
(Match = +5; Mismatch = -2; Gap = -6)
S(2,1) = max {
  -6 + (-2) = -8       (Align)
  5 + (-6) = -1        (Gap in subject)
  -12 + (-6) = -18     (Gap in query)
}
S(2,1) = -1
Alignment matrix after two iterations
(Match = +5; Mismatch = -2; Gap = -6)
(Color legend in the original slide: Align, Gap in subject, Gap in query)
Columns = Query; rows = Subject

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0   -6  -12  -18  -24  -30  -36  -42  -48
1  T   -6    5   -1
2  T  -12
3  C  -18
4  A  -24
5  T  -30
6  A  -36
Calculate the optimal score for the cell at position (3,1)
(Match = +5; Mismatch = -2; Gap = -6)
Neighboring cells: S(2,0) = -12; S(3,0) = -18; S(2,1) = -1
S(3,1) = max {
  -12 + (-2) = -14     (Align)
  -1 + (-6) = -7       (Gap in subject)
  -18 + (-6) = -24     (Gap in query)
}
S(3,1) = -7
Matrix after three iterations
(Match = +5; Mismatch = -2; Gap = -6)
(Color legend in the original slide: Align, Gap in subject, Gap in query)
Columns = Query; rows = Subject

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0   -6  -12  -18  -24  -30  -36  -42  -48
1  T   -6    5   -1   -7
2  T  -12
3  C  -18
4  A  -24
5  T  -30
6  A  -36
Calculate the optimal score for the cell at position (1,2)
(Match = +5; Mismatch = -2; Gap = -6)
Neighboring cells: S(0,1) = -6; S(1,1) = 5; S(0,2) = -12
S(1,2) = max {
  -6 + (+5) = -1       (Align)
  -12 + (-6) = -18     (Gap in subject)
  5 + (-6) = -1        (Gap in query)
}
S(1,2) = -1 (Align and Gap in query tie)
Complete alignment matrix
(Match = +5; Mismatch = -2; Gap = -6)
(Color legend in the original slide: Align, Gap in subject, Gap in query)
Columns = Query; rows = Subject

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0   -6  -12  -18  -24  -30  -36  -42  -48
1  T   -6    5   -1   -7  -13  -19  -25  -31  -37
2  T  -12   -1    3   -3   -2   -8  -14  -20  -26
3  C  -18   -7   -3    8    2    3   -3   -9  -15
4  A  -24  -13   -9    2    6    0    1   -5   -4
5  T  -30  -19  -15   -4    7    4   -2    6    0
6  A  -36  -25  -21  -10    1    5    2    0   11
Use traceback to recover the
optimal alignment
Start from the cell in the last row and last column (the bottom-right corner), which holds the optimal global alignment score
Recall the step (color) that leads to this optimal score
Report this step in the alignment output
All the alignment decisions have already been made

Repeat until we reach the beginning of the sequences

Two options if multiple paths produce the same score

Report only one of the paths (pick arbitrarily)
Report all paths with the optimal score
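The traceback procedure can be sketched by recording, for every cell, which step produced its maximum score. This is a minimal sketch under stated assumptions: the function name needleman_wunsch and the step labels are hypothetical, the default scores come from the worked example, and ties are broken arbitrarily in favor of the first candidate (align), which corresponds to the "report only one of the paths" option.

```python
def needleman_wunsch(query: str, subject: str,
                     match: int = 5, mismatch: int = -2, gap: int = -6):
    """Return (score, aligned query, aligned subject) for a global alignment."""
    m, n = len(query), len(subject)
    S = [[0] * (m + 1) for _ in range(n + 1)]      # S[j][i] holds S(i,j)
    step = [[""] * (m + 1) for _ in range(n + 1)]  # decision that led to each cell
    for i in range(1, m + 1):
        S[0][i], step[0][i] = i * gap, "gap_in_subject"
    for j in range(1, n + 1):
        S[j][0], step[j][0] = j * gap, "gap_in_query"
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            sigma = match if query[i - 1] == subject[j - 1] else mismatch
            candidates = [(S[j - 1][i - 1] + sigma, "align"),
                          (S[j][i - 1] + gap, "gap_in_subject"),
                          (S[j - 1][i] + gap, "gap_in_query")]
            # max() keeps the first candidate on ties, so "align" wins ties.
            S[j][i], step[j][i] = max(candidates, key=lambda c: c[0])
    # Traceback from the bottom-right corner to the top-left corner.
    q_row, s_row = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if step[j][i] == "align":
            q_row.append(query[i - 1]); s_row.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif step[j][i] == "gap_in_subject":
            q_row.append(query[i - 1]); s_row.append("-")
            i -= 1
        else:  # gap_in_query
            q_row.append("-"); s_row.append(subject[j - 1])
            j -= 1
    return S[n][m], "".join(reversed(q_row)), "".join(reversed(s_row))


if __name__ == "__main__":
    score, q_aln, s_aln = needleman_wunsch("TGCTCGTA", "TTCATA")
    print(score)
    print("Query:  ", q_aln)
    print("Subject:", s_aln)
```

Storing the decisions costs a second matrix of the same size; an alternative design recomputes the three candidates during traceback instead.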
Traceback (partial):
Query:   T C G T A
Subject: T C A T A
[Figure: complete alignment matrix with the partial traceback path highlighted]
Calculate the optimal score for the cell at position (5,3)
(Match = +5; Mismatch = -2; Gap = -6)
Neighboring cells: S(4,2) = -2; S(5,2) = -8; S(4,3) = 2
S(5,3) = max {
  -2 + (+5) = 3        (Align)
  2 + (-6) = -4        (Gap in subject)
  -8 + (-6) = -14      (Gap in query)
}
S(5,3) = 3
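Traceback must follow the steps that produce the optimal cumulative global alignment score
[Figure: cells around (5,3): S(4,2) = -2, S(5,2) = -8, S(4,3) = 2, S(5,3) = 3; the traceback follows the Align step from (5,3) back to (4,2)]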
Traceback (complete):
Query:   T G C T C G T A
Subject: T - - T C A T A
[Figure: complete alignment matrix with the full traceback path highlighted]
The Needleman-Wunsch algorithm is an example
of a dynamic programming algorithm

Problem must satisfy two criteria:

Optimal substructure
Optimal solution to the complete problem is composed of optimal
solutions to the subproblems
Overlapping subproblems
Re-use the results for the subproblems (e.g., lookup table)

Many bioinformatics problems satisfy these criteria

Sequence alignment, gene prediction, RNA-folding

Bellman B. The theory of dynamic programming. Bulletin of the American Mathematical Society. 1954; 60(6):503–516.
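Both criteria can be seen in a top-down version of the recursion. The sketch below is an illustration only: it assumes the scores from the worked example, the function name global_score is hypothetical, and functools.lru_cache stands in for the lookup table. The recursion expresses optimal substructure, and the cache statistics show how few distinct subproblems are actually solved.

```python
from functools import lru_cache


def global_score(query: str, subject: str,
                 match: int = 5, mismatch: int = -2, gap: int = -6):
    """Return (optimal global score, number of distinct subproblems solved)."""

    @lru_cache(maxsize=None)
    def S(i: int, j: int) -> int:
        if i == 0:
            return j * gap  # only gaps in the query remain
        if j == 0:
            return i * gap  # only gaps in the subject remain
        sigma = match if query[i - 1] == subject[j - 1] else mismatch
        return max(S(i - 1, j - 1) + sigma,  # align
                   S(i - 1, j) + gap,        # gap in subject
                   S(i, j - 1) + gap)        # gap in query

    score = S(len(query), len(subject))
    # Each cache miss is one subproblem that was computed exactly once.
    return score, S.cache_info().misses


if __name__ == "__main__":
    print(global_score("TGCTCGTA", "TTCATA"))
```

For an 8-base query and a 6-base subject there are only (8+1) x (6+1) = 63 distinct subproblems, even though the uncached recursion would revisit them many times.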
Smith-Waterman algorithm (local alignment)
(Query length: M; Subject length: N)

Three changes to the Needleman-Wunsch algorithm:

The minimum score for a cell is zero
Initiate a new alignment when the cumulative score is negative
Begin traceback from the cell within the entire matrix that has
the highest score
Terminate traceback when the score is zero

S(i,j) = max {
  S(i-1,j-1) + 𝛔(a,b)
  S(i-1,j) + 𝛾
  S(i,j-1) + 𝛾
  0
}
Smith TF, Waterman MS. Identification of common molecular subsequences.
J Mol Biol. 1981 Mar 25;147(1):195-7.
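The three changes can be sketched as follows. As before, this is an illustrative sketch rather than a reference implementation: the function name smith_waterman is hypothetical, the default scores come from the worked example, only one optimal path is reported, and the traceback recomputes the candidates instead of storing the decisions.

```python
def smith_waterman(query: str, subject: str,
                   match: int = 5, mismatch: int = -2, gap: int = -6):
    """Return (score, aligned query, aligned subject) for the best local alignment."""
    m, n = len(query), len(subject)
    S = [[0] * (m + 1) for _ in range(n + 1)]  # first row and column stay zero
    best, best_i, best_j = 0, 0, 0
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            sigma = match if query[i - 1] == subject[j - 1] else mismatch
            S[j][i] = max(0,                        # start a new alignment
                          S[j - 1][i - 1] + sigma,  # align
                          S[j][i - 1] + gap,        # gap in subject
                          S[j - 1][i] + gap)        # gap in query
            if S[j][i] > best:
                best, best_i, best_j = S[j][i], i, j
    # Traceback starts at the highest-scoring cell and stops at a zero.
    q_row, s_row = [], []
    i, j = best_i, best_j
    while i > 0 and j > 0 and S[j][i] > 0:
        sigma = match if query[i - 1] == subject[j - 1] else mismatch
        if S[j][i] == S[j - 1][i - 1] + sigma:
            q_row.append(query[i - 1]); s_row.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif S[j][i] == S[j][i - 1] + gap:
            q_row.append(query[i - 1]); s_row.append("-")
            i -= 1
        else:
            q_row.append("-"); s_row.append(subject[j - 1])
            j -= 1
    return best, "".join(reversed(q_row)), "".join(reversed(s_row))


if __name__ == "__main__":
    print(smith_waterman("TGCTCGTA", "TTCATA"))
```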
Global versus local alignments
Global alignment
Optimal alignment along the entire length of two sequences
Compare protein sequences to identify orthologs

Local alignment
Optimal alignment between parts of two sequences
Identify conserved domains within protein sequences

Glocal (semi-global) alignment

Optimal global alignment for one sequence; optimal local alignment
for the other sequence
Map a coding exon against a genomic sequence
Initialize the local alignment matrix
(Match = +5; Mismatch = -2; Gap = -6)

Columns = Query positions 0-8; rows = Subject positions 0-6

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0    0    0    0    0    0    0    0    0
1  T    0
2  T    0
3  C    0
4  A    0
5  T    0
6  A    0
Calculate the possible local alignment scores for the cell at position (1,1)
Neighboring cells: S(0,0) = 0; S(1,0) = 0; S(0,1) = 0
S(1,1) = max {
  S(0,0) + 𝛔(T,T)   (Align)
  S(0,1) + 𝛾        (Gap in subject)
  S(1,0) + 𝛾        (Gap in query)
  0
}
Calculate the optimal local alignment score for the cell at position (1,1)
(Match = +5; Mismatch = -2; Gap = -6)
S(1,1) = max {
  0 + (+5) = 5        (Align)
  0 + (-6) = -6       (Gap in subject)
  0 + (-6) = -6       (Gap in query)
  0
}
S(1,1) = 5
Local alignment matrix
(Match = +5; Mismatch = -2; Gap = -6)
(Color legend in the original slide: Align, Gap in subject, Gap in query)
Columns = Query; rows = Subject

        0    1    2    3    4    5    6    7    8
             T    G    C    T    C    G    T    A
0       0    0    0    0    0    0    0    0    0
1  T    0    5    0    0    5    0    0    5    0
2  T    0    5    3    0    5    3    0    5    3
3  C    0    0    3    8    2   10    4    0    3
4  A    0    0    0    2    6    4    8    2    5
5  T    0    5    0    0    7    4    2   13    7
6  A    0    0    3    0    1    5    2    7   18
Traceback (from the highest score, 18, back to the first zero):
Query:   T C G T A
Subject: T C A T A
[Figure: local alignment matrix with the traceback path highlighted]
Techniques to improve the performance of
sequence alignment
Time and space complexity: O(MN)
Doubling the length of both sequences leads to a four-fold increase in the amount of time and space required

Reduce memory requirement

Myers EW, Miller W. Optimal alignments in linear space. Comput Appl Biosci.
1988 Mar;4(1):11-7.

Fill the matrix in parallel (SIMD, CUDA)

Farrar M. Striped Smith-Waterman speeds database searches six times over other
SIMD implementations. Bioinformatics. 2007 Jan 15;23(2):156-61.

Find high-scoring alignments instead of the best alignment

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment
search tool. J Mol Biol. 1990 Oct 5;215(3):403-10.
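The simplest way to reduce memory can be sketched by keeping only two rows of the matrix. Note that this is not the Myers and Miller algorithm cited above: it returns only the optimal global score, not the alignment, because the information needed for traceback is discarded. The function name global_score_two_rows is hypothetical and the default scores are those of the worked example.

```python
def global_score_two_rows(query: str, subject: str,
                          match: int = 5, mismatch: int = -2, gap: int = -6) -> int:
    """Compute the optimal global alignment score using O(M) memory."""
    # Row 0 of the matrix: cumulative gap costs along the query.
    prev = [i * gap for i in range(len(query) + 1)]
    for j, s_base in enumerate(subject, start=1):
        curr = [j * gap]  # first column: cumulative gap costs along the subject
        for i, q_base in enumerate(query, start=1):
            sigma = match if q_base == s_base else mismatch
            curr.append(max(prev[i - 1] + sigma,  # align
                            curr[i - 1] + gap,    # gap in subject
                            prev[i] + gap))       # gap in query
        prev = curr  # the older row is no longer needed
    return prev[-1]


if __name__ == "__main__":
    print(global_score_two_rows("TGCTCGTA", "TTCATA"))
```

Time remains O(MN); only the space requirement drops, from O(MN) to O(M).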
Questions?

Eddy SR. What is dynamic programming? Nat Biotechnol. 2004 Jul;22(7):909-10.

Rationale for calculating the scores for
the entire alignment matrix
Cannot determine the best global alignment without
aligning the entire query and subject sequences
Cannot evaluate all possible alignments

If the alignment before we reached cell (i,j) is part of the optimal alignment:
Identify the next step (i.e., align, gap in query, gap in subject) that will be part of the optimal alignment

Use traceback to determine the final alignment

Different alignments could produce the same score
Overview of the BLAST algorithm
Heuristic algorithm to find local regions of similarity
between the query and subject sequences

Consists of four main stages:

Find common subsequences (words)
Extend the word matches into longer alignments
Evaluate the significance of the high-scoring segment pairs
(HSPs)
Combine multiple HSPs into a longer alignment

Korf, I., Yandell, M. and Bedell, J. (2003). The BLAST Algorithm. In BLAST (76-87).
Sebastopol, CA: O’Reilly Media, Inc.
Number of alignments for two sequences with length N
C(2N, N) = (2N)! / (N!)^2
Stirling’s approximation
C(2N, N) ≈ 2^(2N) / sqrt(πN)
Brute force alignment approach is computationally intractable

Sequence length (N)    # possible alignments
10                     1.87E+05
50                     1.01E+29
100                    9.07E+58
200                    1.03E+119
300                    1.35E+179
400                    1.88E+239
500                    2.70E+299
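The values in this table are consistent with the central binomial coefficient C(2N, N), so they can be reproduced with the standard library. The sketch below assumes that counting scheme; the function names exact_count and stirling_estimate are hypothetical.

```python
from math import comb, pi, sqrt


def exact_count(n: int) -> int:
    """Central binomial coefficient C(2N, N) = (2N)! / (N!)^2."""
    return comb(2 * n, n)


def stirling_estimate(n: int) -> float:
    """Stirling-based approximation: 2^(2N) / sqrt(pi * N)."""
    return 4.0 ** n / sqrt(pi * n)


if __name__ == "__main__":
    for n in (10, 50, 100, 200, 300, 400, 500):
        print(f"{n:4d}  {exact_count(n):.2e}  {stirling_estimate(n):.2e}")
```

The approximation slightly overestimates the exact count, and the relative error shrinks as N grows.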
