Inexact Matching, Sequence Alignment, and Dynamic Programming

The document discusses inexact matching and sequence alignment using dynamic programming. It introduces edit distance and transcripts as ways to quantify the differences between strings and describe their alignment allowing for mismatches and gaps. Dynamic programming is used to compute the optimal edit distance and alignment between two sequences in quadratic time by filling a 2D matrix according to recurrence relations and then tracing back through the matrix. Both global and local alignment are discussed as well as variants that assign weights to operations.

Inexact Matching, Sequence Alignment, and Dynamic Programming

Presented By
Dr. Shazzad Hosain
Asst. Prof. EECS, NSU
Inexact Matching and Alignment
• Inexact (approximate) matching means that a match is allowed to contain some errors
• Alignment generally means lining up the characters of strings, allowing mismatches as well as matches, and allowing characters of one string to be placed opposite spaces made in the opposing string.
Importance of Alignment or Approximate Matching
• It is central in computational molecular biology
• Because of the active mutational process
• “Duplication and modification” is a central part of protein evolution
• In DNA, RNA, and amino acid sequences, high sequence similarity usually implies significant functional or structural similarity.
Edit Distance Between Two Strings
• Measures the difference between two strings
• It focuses on transforming (or editing) one string into the other by a series of edit operations on individual characters

R I M D M D M M I
v _ i n t n e r _
w r i _ t _ e r s

• The permitted edit operations are
– Insertion (I) of a character into the first string
– Deletion (D) of a character from the first string
– Substitution (or replacement) (R) of a character in the first string with a character in the second string
• For a match (M) no operation is necessary
Edit Transcript vs. Edit Distance
Edit Transcript: A string over the alphabet I, D, R, M that
describes a transformation of one string to another is called
an edit transcript, or transcript for short, of the two strings.

R I M D M D M M I
v _ i n t n e r _
w r i _ t _ e r s
Edit Distance: The minimum number of edit operations –
insertions, deletions and substitutions – needed to
transform the first string into the second. Also known as
Levenshtein distance.

What is the edit distance in this example? 5

Optimal Transcript
• Optimal transcript is an edit transcript that
uses minimal number of edit operations.
• There may be more than one optimal
transcript for two strings
String Alignment
• A (global) alignment of two strings S1 and S2 is obtained
by first inserting chosen spaces, either into or at the
ends of S1 and S2, and then placing the two resulting
strings one above the other so that every character or
space in either string is opposite a unique character or a
unique space in the other string.
Two example alignments:
v_intner_     qac_dbd
wri_t_ers     qawx_b_
Alignment vs. Edit Transcript
• From a mathematical viewpoint, these are equivalent ways to describe the relationship between two strings
• An alignment can easily be converted to an edit transcript and vice versa
• From a modeling standpoint, they are quite different
– An edit transcript emphasizes the putative mutational events that transform one string into another
– An alignment displays only the relationship between the strings
– So, one is the process (edit transcript), the other is the product (alignment)
v_intner_     qac_dbd
wri_t_ers     qawx_b_
Dynamic Programming Calculation of Edit
Distance
• How do we compute the edit distance of two strings, along with the accompanying edit transcript or alignment?
Definition: For two strings S1 and S2, D(i, j) is defined to be the edit distance of S1[1…i] and S2[1…j]

D(n, m) is the desired value, where n and m are the lengths of S1 and S2

Steps of Dynamic Programming
• Recurrence relation
• Tabular Computation
• Traceback
The Recurrence Relation
• The recurrence relation establishes a relationship between the value of D(i, j), for positive i and j, and values of D with index pairs smaller than i, j.
• Base conditions are
– D(i, 0) = i, i.e. delete i characters
– D(0, j) = j, i.e. insert j characters
• The recurrence relation is
– D(i, j) = min[D(i-1, j) + 1, D(i, j-1) + 1, D(i-1, j-1) + t(i, j)]
– where t(i, j) = 1 if S1(i) ≠ S2(j) and t(i, j) = 0 if S1(i) = S2(j)
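The base conditions and the recurrence can be sketched in a few lines of Python. This is a minimal illustration rather than the slides' own code; the function name `edit_distance` is an assumption, and the 1-based string positions of the slides become 0-based indices in the code.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Return D(n, m), the unweighted edit distance between s1 and s2."""
    n, m = len(s1), len(s2)
    # D has (n + 1) rows and (m + 1) columns; row 0 and column 0 hold the base conditions.
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i          # delete i characters
    for j in range(m + 1):
        D[0][j] = j          # insert j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t = 0 if s1[i - 1] == s2[j - 1] else 1   # t(i, j)
            D[i][j] = min(D[i - 1][j] + 1,           # deletion
                          D[i][j - 1] + 1,           # insertion
                          D[i - 1][j - 1] + t)       # match or substitution
    return D[n][m]


print(edit_distance("vintner", "writers"))  # 5
```

Each cell is computed from three already-filled neighbors, so the two nested loops give the O(nm) running time.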
Tabular Computation: Bottom Up Approach

[Figure: dynamic programming table, partially filled in]

• Each cell is filled from its three neighbors using the recurrence

D(i, j) = min[D(i-1, j) + 1, D(i, j-1) + 1, D(i-1, j-1) + t(i, j)]

• Filling the whole table takes O(nm) time
The Traceback

For an optimal edit transcript, follow any traceback path from cell (n, m) to cell (0, 0)

1. Horizontal edge, from (i, j) to (i, j-1), is an insertion (I) of character S2(j) into S1
2. Vertical edge, from (i, j) to (i-1, j), is a deletion (D) of S1(i) from S1
3. Diagonal edge, from (i, j) to (i-1, j-1), is a match (M) if S1(i) = S2(j) and a substitution (R) if S1(i) ≠ S2(j)
The Traceback

Alternatively, in terms of alignment:

1. Horizontal edge specifies a space inserted into S1
2. Vertical edge specifies a space inserted into S2
3. Diagonal edge specifies either a match or a mismatch

Finding one traceback path takes O(n + m) time.

Three traceback paths for S1 = vintner and S2 = writers (identical from (7, 7) to (3, 3)):
wri_t_ers     wri_t_ers     writ_ers
_vintner_     v_intner_     vintner_
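The traceback rules can be sketched as follows. This is an illustrative sketch only: the function name `edit_transcript` and the tie-breaking order (diagonal first, then vertical, then horizontal) are assumptions, so it returns just one of the possibly many optimal transcripts.

```python
def edit_transcript(s1: str, s2: str) -> str:
    """Return one optimal edit transcript (over I, D, R, M) turning s1 into s2."""
    n, m = len(s1), len(s2)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t = 0 if s1[i - 1] == s2[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1, D[i][j - 1] + 1, D[i - 1][j - 1] + t)

    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and s1[i - 1] == s2[j - 1]
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + (0 if same else 1):
            ops.append("M" if same else "R")   # diagonal edge
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i - 1][j] + 1:
            ops.append("D")                    # vertical edge
            i -= 1
        else:
            ops.append("I")                    # horizontal edge
            j -= 1
    return "".join(reversed(ops))              # traceback yields the transcript backwards


print(edit_transcript("vintner", "writers"))
```

The while loop moves one step toward (0, 0) per iteration, which is where the O(n + m) bound for a single path comes from.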
Edit Graphs
• It is often useful to represent dynamic programming solutions of string problems in terms of a weighted edit graph
– If |S1| = n and |S2| = m, then the weighted edit graph has (n+1) x (m+1) nodes
– Each edge has a weight
• In the case of the edit distance problem, each horizontal or vertical edge has weight 1, and each diagonal edge into node (i, j) has weight t(i, j): 0 for a match, 1 for a mismatch
• Any shortest path from (0, 0) to (n, m) specifies an optimal edit transcript
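To illustrate the shortest-path view, the sketch below runs Dijkstra's algorithm on the edit graph without ever building the table. It assumes unit weights on horizontal and vertical edges and weight 0 or 1 on diagonal edges (match or mismatch); the function name `shortest_edit_path` is hypothetical.

```python
import heapq


def shortest_edit_path(s1: str, s2: str) -> int:
    """Length of a shortest path from node (0, 0) to node (n, m) in the edit graph."""
    n, m = len(s1), len(s2)
    dist = {(0, 0): 0}
    heap = [(0, 0, 0)]                       # entries are (distance, i, j)
    while heap:
        d, i, j = heapq.heappop(heap)
        if (i, j) == (n, m):
            return d
        if d > dist[(i, j)]:
            continue                         # stale heap entry
        edges = []
        if i < n:
            edges.append((i + 1, j, 1))      # vertical edge: deletion
        if j < m:
            edges.append((i, j + 1, 1))      # horizontal edge: insertion
        if i < n and j < m:
            w = 0 if s1[i] == s2[j] else 1   # diagonal edge: match or substitution
            edges.append((i + 1, j + 1, w))
        for a, b, w in edges:
            if d + w < dist.get((a, b), float("inf")):
                dist[(a, b)] = d + w
                heapq.heappush(heap, (d + w, a, b))
    return dist[(n, m)]


print(shortest_edit_path("vintner", "writers"))  # 5
```

The result agrees with the tabular computation; the table fill is simply a shortest-path computation that visits the nodes in a fixed order.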
Weighted Edit Distance
• An easy but crucial generalization is to associate a weight (a cost or score) with every edit operation, as well as with a match
– Let the insertion or deletion weight be d
– Let the substitution weight be r, and
– Let the match weight be e, usually very small, often zero
• Equivalently, in terms of operation-weight alignment
– A mismatch costs r
– A match costs e
– A space costs d
• Two types of weighted edit distance
– Operation weight
– Alphabet weight
Operation-weight Edit Transcript

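• It can also be represented as a shortest path problem on a weighted edit graph
• With d = 1, r = 1 and e = 0, we get the three optimal alignments of the unweighted problem
• With d = 4, r = 2 and e = 1, the alignment
writ_ers
vintner_
has total weight 3r + 3e + 2d = 17. The ungapped alignment of writers and vintner costs 6r + e = 13, so 17 is not the minimum under these weights.
• Modified recurrence relations:
– D(i, 0) = i·d and D(0, j) = j·d
– D(i, j) = min[D(i-1, j) + d, D(i, j-1) + d, D(i-1, j-1) + t(i, j)], where t(i, j) = e if S1(i) = S2(j) and t(i, j) = r otherwise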
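A hedged sketch of the operation-weight version follows, assuming the usual generalization of the unweighted recurrence: base conditions D(i, 0) = i·d and D(0, j) = j·d, a space costing d, a mismatch costing r, and a match costing e. The names `operation_weight_distance` and `alignment_weight` are illustrative, not taken from the slides.

```python
def operation_weight_distance(s1: str, s2: str, d: int, r: int, e: int) -> int:
    """Minimum total weight of an alignment under operation weights d, r, e."""
    n, m = len(s1), len(s2)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * d
    for j in range(1, m + 1):
        D[0][j] = j * d
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t = e if s1[i - 1] == s2[j - 1] else r
            D[i][j] = min(D[i - 1][j] + d, D[i][j - 1] + d, D[i - 1][j - 1] + t)
    return D[n][m]


def alignment_weight(row1: str, row2: str, d: int, r: int, e: int) -> int:
    """Total weight of a given alignment; '_' marks a space."""
    total = 0
    for a, b in zip(row1, row2):
        if a == "_" or b == "_":
            total += d
        elif a == b:
            total += e
        else:
            total += r
    return total


print(operation_weight_distance("vintner", "writers", 1, 1, 0))  # 5
print(alignment_weight("vintner_", "writ_ers", 4, 2, 1))         # 17
print(operation_weight_distance("vintner", "writers", 4, 2, 1))  # 13
```

With d = 4, r = 2, e = 1 the two spaces of the gapped alignment cost 8 on their own, which is why the ungapped alignment (six mismatches and one match, weight 13) comes out cheaper.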
Alphabet-weight Edit Distance
• Assign score/weight depending on characters
– For example, it may be more costly to replace an A with a T
than with a G
– Or, the weight of a deletion / insertion may depend on exactly
which character is deleted / inserted
• Weighted edit distance usually means alphabet-weight
version
• Dominant scoring matrices are PAM matrices, and the
newer BLOSUM scoring matrices
– They are defined in terms of maximization problem (string
similarity) rather than edit distance.
String Similarity
• While edit distance is to minimize weights, string similarity
is to maximize weights
• For string similarity
– Match scores are greater than or equal to zero
– Mismatch scores are less than zero
Computing String Similarity

• Let V(i, j) be the value of the optimal alignment of prefixes S1[1..i] and S2[1..j]
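The recurrences for V are not reproduced in the text above. A standard formulation, given here as an assumption, mirrors the edit distance recurrence with max in place of min, where s(x, y) is the score of aligning character x with character y and "_" denotes a space:

```latex
\begin{align*}
V(0, 0) &= 0 \\
V(i, 0) &= V(i-1, 0) + s\bigl(S_1(i), \_\bigr) \\
V(0, j) &= V(0, j-1) + s\bigl(\_, S_2(j)\bigr) \\
V(i, j) &= \max
  \begin{cases}
    V(i-1, j-1) + s\bigl(S_1(i), S_2(j)\bigr) \\
    V(i-1, j) + s\bigl(S_1(i), \_\bigr) \\
    V(i, j-1) + s\bigl(\_, S_2(j)\bigr)
  \end{cases}
\end{align*}
```

V(n, m) is then the similarity of S1 and S2, and the table is filled and traced back exactly as for edit distance.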
End-space Free Variant

• Any spaces at the beginning or end of the alignment cost zero

• Encourages one string to align in the interior of the other
• Or the suffix of one string to align with a prefix of the other
• The shotgun sequence assembly problem (see sections 16.14 and 16.15) uses this variant; it can be a project.
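One common way to realize the end-space free variant is sketched below, under stated assumptions: the first row and column are initialized to zero (leading spaces are free) and the answer is the best value in the last row or last column (trailing spaces are free). The function name and the scores match = 1, mismatch = -1, space = -1 are illustrative choices.

```python
def end_space_free_score(s1: str, s2: str,
                         match: int = 1, mismatch: int = -1, space: int = -1) -> int:
    """Best similarity when spaces at either end of the alignment cost nothing."""
    n, m = len(s1), len(s2)
    # Row 0 and column 0 stay at 0: leading spaces are free.
    V = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if s1[i - 1] == s2[j - 1] else mismatch
            V[i][j] = max(V[i - 1][j - 1] + s,
                          V[i - 1][j] + space,
                          V[i][j - 1] + space)
    # Trailing spaces are free: take the best cell in the last row or last column.
    return max(max(V[n]), max(row[m] for row in V))


# A suffix of the first string overlaps a prefix of the second.
print(end_space_free_score("TTTACGT", "ACGTCCC"))  # 4
# A short string aligns in the interior of a longer one.
print(end_space_free_score("ACG", "TTACGTT"))      # 3
```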
Local vs. global alignment

• Global alignment: entire sequences

• Local alignment: segments of sequences

• Local alignment is often the most relevant

– Depends on biological assumptions
The Needleman-Wunsch and the Smith-Waterman Algorithms for Sequence Alignment
Global Sequence Alignment
• The Needleman–Wunsch algorithm performs a
global alignment on two sequences
• It is an example of dynamic programming, and was the
first application of dynamic programming to biological
sequence comparison
• Suitable when the two sequences are of similar
length, with a significant degree of similarity
throughout
• Aim: The best alignment over the entire length of two
sequences
Three steps in Needleman-Wunsch Algorithm

• Initialization
• Scoring
• Trace back (Alignment)
• Consider two DNA sequences to be globally aligned:
ATCG (x = 4, length of sequence 1)
TCG (y = 3, length of sequence 2)
Scoring Scheme

• Match Score = +1
• Mismatch Score = -1
• Gap penalty = -1
• Substitution Matrix
     A   C   G   T
A    1  -1  -1  -1
C   -1   1  -1  -1
G   -1  -1   1  -1
T   -1  -1  -1   1
Initialization Step
• Create a matrix with x + 1 rows and y + 1 columns
• The first row and the first column of the score matrix are filled with multiples of the gap penalty

        T   C   G
    0  -1  -2  -3
A  -1
T  -2
C  -3
G  -4
Scoring
• The score of any cell C(i, j) is the maximum of:
scorediag = C(i-1, j-1) + S(i, j)
scoreup = C(i-1, j) + g
scoreleft = C(i, j-1) + g
where S(i, j) is the substitution score for the letters at row i and column j, and g is the gap penalty
Scoring ….
• Example:
The calculation for the cell C(2, 2) (second row, second column of the matrix, comparing A with T):
scorediag = C(i-1, j-1) + S(i, j) = 0 + (-1) = -1
scoreup = C(i-1, j) + g = -1 + (-1) = -2
scoreleft = C(i, j-1) + g = -1 + (-1) = -2
C(2, 2) = max(-1, -2, -2) = -1

        T   C   G
    0  -1  -2  -3
A  -1  -1
T  -2
C  -3
G  -4
Scoring ….
• Final Scoring Matrix
        T   C   G
    0  -1  -2  -3
A  -1  -1  -2  -3
T  -2   0  -1  -2
C  -3  -1   1   0
G  -4  -2   0   2

Note: The last cell always holds the optimal global alignment score (here 2); it need not be the largest value in the matrix.
Trace back
• The trace back step determines the actual
alignment(s) that result in the maximum score
• There are likely to be multiple maximal
alignments
• Trace back starts from the last cell, i.e.
position X, Y in the matrix
• Gives alignment in reverse order
Trace back ….
• There are three possible moves: diagonally (toward
the top-left corner of the matrix), up, or left
• Trace back takes the current cell and looks to the neighbor cells that could be direct predecessors. With sequence #1 along the rows and sequence #2 along the columns, this means it looks to the neighbor to the left (gap in sequence #1), the diagonal neighbor (match/mismatch), and the neighbor above it (gap in sequence #2). The algorithm for trace back chooses one of the possible predecessors as the next cell in the sequence
Trace back ….
        T   C   G
    0  -1  -2  -3
A  -1  -1  -2  -3
T  -2   0  -1  -2
C  -3  -1   1   0
G  -4  -2   0   2

• The only possible predecessor is the diagonal match/mismatch neighbor. If more than
one possible predecessor exists, any can be chosen. This gives us a current alignment of
Seq 1: G
|
Seq 2: G
Trace back ….
• Final Trace back
        T   C   G
    0  -1  -2  -3
A  -1  -1  -2  -3
T  -2   0  -1  -2
C  -3  -1   1   0
G  -4  -2   0   2

Best Alignment:
A T C G
  | | |
_ T C G
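The three steps (initialization, scoring, trace back) can be put together in a short sketch. The function name `needleman_wunsch` and the tie-breaking order in the trace back are assumptions; the scores are the ones from the scoring scheme above, and rows correspond to sequence 1, columns to sequence 2.

```python
def needleman_wunsch(seq1: str, seq2: str,
                     match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score matrix, aligned seq1, aligned seq2) for a global alignment."""
    x, y = len(seq1), len(seq2)
    # Initialization: first row and column are multiples of the gap penalty.
    C = [[0] * (y + 1) for _ in range(x + 1)]
    for i in range(1, x + 1):
        C[i][0] = i * gap
    for j in range(1, y + 1):
        C[0][j] = j * gap
    # Scoring: each cell is the maximum of the diagonal, up and left candidates.
    for i in range(1, x + 1):
        for j in range(1, y + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            C[i][j] = max(C[i - 1][j - 1] + s, C[i - 1][j] + gap, C[i][j - 1] + gap)
    # Trace back from the last cell; the alignment is built in reverse order.
    a1, a2 = [], []
    i, j = x, y
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and seq1[i - 1] == seq2[j - 1] else mismatch
        if i > 0 and j > 0 and C[i][j] == C[i - 1][j - 1] + s:
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1])   # diagonal move
            i, j = i - 1, j - 1
        elif i > 0 and C[i][j] == C[i - 1][j] + gap:
            a1.append(seq1[i - 1]); a2.append("_")           # up: gap in sequence 2
            i -= 1
        else:
            a1.append("_"); a2.append(seq2[j - 1])           # left: gap in sequence 1
            j -= 1
    return C, "".join(reversed(a1)), "".join(reversed(a2))


matrix, top, bottom = needleman_wunsch("ATCG", "TCG")
print(matrix[-1][-1])  # 2
print(top)             # ATCG
print(bottom)          # _TCG
```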
Local Sequence Alignment
• The Smith-Waterman algorithm performs a
local alignment on two sequences
• It is an example of dynamic programming
• Useful for dissimilar sequences that are
suspected to contain regions of similarity or
similar sequence motifs within their larger
sequence context
• Aim: The best alignment over the conserved
domain of two sequences
Differences in Needleman-Wunsch and
Smith-Waterman Algorithms:
• In the initialization stage, the first row and first
column are all filled in with 0s
• While filling the matrix, if a score becomes
negative, put in 0 instead
• In the traceback, start with the cell that has
the highest score and work back until a cell
with a score of 0 is reached.
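Three steps in Smith-Waterman Algorithm

• Initialization
• Scoring
• Trace back (Alignment)

Initialization: the first row and the first column are filled with 0s

       T  C  G
    0  0  0  0
A   0
T   0
C   0
G   0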
Scoring
• The score of any cell C(i, j) is the maximum of:
scorediag = C(i-1, j-1) + S(i, j)
scoreup = C(i-1, j) + g
scoreleft = C(i, j-1) + g
and 0
(here S(i, j) is the substitution score for the letters at row i and column j, and g is the gap penalty)
Scoring ….
• Example:
The calculation for the cell C(2, 2) (second row, second column of the matrix, comparing A with T):
scorediag = C(i-1, j-1) + S(i, j) = 0 + (-1) = -1
scoreup = C(i-1, j) + g = 0 + (-1) = -1
scoreleft = C(i, j-1) + g = 0 + (-1) = -1
C(2, 2) = max(-1, -1, -1, 0) = 0

       T  C  G
    0  0  0  0
A   0  0
T   0
C   0
G   0
Scoring ….
• Final Scoring Matrix
       T  C  G
    0  0  0  0
A   0  0  0  0
T   0  1  0  0
C   0  0  2  1
G   0  0  1  3

Note: The last cell does not necessarily hold the maximum alignment score!
Trace back
• The trace back step determines the actual
alignment(s) that result in the maximum score
• There are likely to be multiple maximal
alignments
• Trace back starts from the cell with maximum
value in the matrix
• Gives alignment in reverse order
Trace back ….
• There are three possible moves: diagonally (toward the
top-left corner of the matrix), up, or left
• Trace back takes the current cell and looks to the neighbor cells that could be direct predecessors. With sequence #1 along the rows and sequence #2 along the columns, this means it looks to the neighbor to the left (gap in sequence #1), the diagonal neighbor (match/mismatch), and the neighbor above it (gap in sequence #2). The algorithm for trace back chooses one of the possible predecessors as the next cell in the sequence. This continues until a cell with value 0 is reached.
Trace back ….
       T  C  G
    0  0  0  0
A   0  0  0  0
T   0  1  0  0
C   0  0  2  1
G   0  0  1  3

Best Alignment:
T C G
| | |
T C G
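A matching sketch for the local case shows the three differences from the global algorithm: zero initialization, a floor of 0 on every cell, and a trace back that starts at the highest-scoring cell and stops at a 0. The function name `smith_waterman` and the choice of the first maximum found (when several cells tie) are assumptions.

```python
def smith_waterman(seq1: str, seq2: str,
                   match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score matrix, aligned segment of seq1, aligned segment of seq2)."""
    x, y = len(seq1), len(seq2)
    C = [[0] * (y + 1) for _ in range(x + 1)]   # first row and column are all 0
    best, best_i, best_j = 0, 0, 0
    for i in range(1, x + 1):
        for j in range(1, y + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            C[i][j] = max(0,                                 # negative scores become 0
                          C[i - 1][j - 1] + s,
                          C[i - 1][j] + gap,
                          C[i][j - 1] + gap)
            if C[i][j] > best:
                best, best_i, best_j = C[i][j], i, j
    # Trace back from the best cell until a cell with score 0 is reached.
    a1, a2 = [], []
    i, j = best_i, best_j
    while C[i][j] > 0:
        s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if C[i][j] == C[i - 1][j - 1] + s:
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif C[i][j] == C[i - 1][j] + gap:
            a1.append(seq1[i - 1]); a2.append("_")
            i -= 1
        else:
            a1.append("_"); a2.append(seq2[j - 1])
            j -= 1
    return C, "".join(reversed(a1)), "".join(reversed(a2))


matrix, top, bottom = smith_waterman("ATCG", "TCG")
print(max(max(row) for row in matrix))  # 3
print(top, bottom)                      # TCG TCG
```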
Gaps
• A gap is any maximal, consecutive run of spaces in a
single string of a given alignment.

c t t t a a c _ _ a _ a c
c _ _ _ c a c c c a t _ c
Four gaps and seven spaces
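The simplest objective function that includes gaps

• Maximize [Σ s(S1'(i), S2'(i)) – k·Wg], where the sum runs over the columns i of the alignment and S1', S2' denote the two strings after spaces are inserted

1. Wg is a constant weight charged for each gap
2. k is the number of gaps
3. s(x, _) = s(_, x) = 0 for every character x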
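The distinction between gaps and spaces is easy to check mechanically. In this small sketch the helper name `count_gaps_and_spaces` is hypothetical and "_" is assumed to mark a space, as in the example alignment above.

```python
import re


def count_gaps_and_spaces(row1: str, row2: str):
    """Return (number of gaps, number of spaces) in a two-row alignment."""
    gaps = spaces = 0
    for row in (row1, row2):
        runs = re.findall(r"_+", row)          # each maximal run of '_' is one gap
        gaps += len(runs)
        spaces += sum(len(run) for run in runs)
    return gaps, spaces


print(count_gaps_and_spaces("ctttaac__a_ac", "c___cacccat_c"))  # (4, 7)
```

Under the constant gap weight model only the first number matters, since every space scores zero and each gap is charged Wg once.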
Why Gaps?

• Top row shows part of the RNA sequences of one strain of the HIV-1 virus.
• The HIV virus mutates rapidly
• The three bottom rows each show a mutated virus strain derived from the original one.
• The dark portions are the matching regions; white space represents gaps
• Matching means similarity, i.e. mismatches or spaces may be present, but only in a small percentage of the region
cDNA Matching: A Concrete Example

• cDNA means complementary DNA

[Figure: connection between DNA and protein, showing exons and introns]
The cDNA
• Each cell contains the same chromosome, the same set
of genes
• Yet, in each specialized cell (a liver cell for example)
only a small fraction of the genes are expressed
• You want to hunt the location of the encoding gene for
that specific protein
• Capture the mRNA in that cell after it leaves the cell
nucleus
• That mRNA is used to create a DNA string complementary to it, which is known as cDNA
cDNA Problem

[Figure: the cDNA matching problem]
Why Gaps in the Objective Function

• Without a gap term in the objective function, long gaps are unlikely to appear in the best alignment, and gaps cannot be weighted to suit the specific problem
Choice of Gap Weights
• Constant
– Maximize [Wm(# matches) – Wms(# mismatches) – Wg(# gaps)]
– Or
• Affine
– Maximize [Wm(# matches) – Wms(# mismatches) – Wg(# gaps) – Ws(# spaces)]
– Wg is the gap initiation cost, Ws is the gap extension cost
• Convex
• Arbitrary

c t t t a a c _ _ a _ a c
c _ _ _ c a c c c a t _ c
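The affine objective can be evaluated for a given alignment as sketched below. The function name `affine_alignment_score` and the weights in the usage line are hypothetical; the sketch assumes "_" marks a space and that no column pairs a space with a space. Setting Ws to 0 gives the constant gap weight model.

```python
from itertools import groupby


def affine_alignment_score(row1: str, row2: str,
                           wm: int, wms: int, wg: int, ws: int) -> int:
    """Wm(# matches) - Wms(# mismatches) - Wg(# gaps) - Ws(# spaces)."""
    matches = mismatches = spaces = 0
    for a, b in zip(row1, row2):
        if a == "_" or b == "_":
            spaces += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    # A gap is a maximal run of consecutive spaces within a single row.
    gaps = sum(1 for row in (row1, row2) for ch, _ in groupby(row) if ch == "_")
    return wm * matches - wms * mismatches - wg * gaps - ws * spaces


top, bottom = "ctttaac__a_ac", "c___cacccat_c"
# 5 matches, 1 mismatch, 4 gaps, 7 spaces
print(affine_alignment_score(top, bottom, wm=2, wms=1, wg=3, ws=1))  # -10
print(affine_alignment_score(top, bottom, wm=2, wms=1, wg=3, ws=0))  # -3
```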
Reference
• Chapters 10 and 11 of Algorithms on Strings, Trees, and Sequences (Dan Gusfield)
