Sequence Alignment (Chapter 6) : The Biological Problem

The document discusses sequence alignment and related concepts. It introduces global sequence alignment, which finds the optimal alignment between two sequences by assigning scores to matches and mismatches. The optimal alignment is computed using dynamic programming to evaluate all possible alignments and choose the one with the highest score. Local sequence alignment is also mentioned as another type of sequence alignment.

Uploaded by

Jahir Hasan


Sequence Alignment (chapter 6)

• The biological problem
• Global alignment
• Local alignment
• Multiple alignment

Introduction to bioinformatics, Autumn 2006 22

Background: comparative genomics
• Basic question in biology: what properties are shared among organisms?
• Genome sequencing allows comparison of organisms at DNA and protein levels
• Comparisons can be used to
  − Find evolutionary relationships between organisms
  − Identify functionally conserved sequences
  − Identify corresponding genes in human and model organisms: develop models for human diseases

Homologs
• Two genes or characters gB and gC evolved from the same ancestor gA are called homologs
• Homologs usually exhibit conserved functions
• Close evolutionary relationship => expect a high number of homologs

  gA = agtgtccgttaagtgcgttc
  gB = agtgccgttaaagttgtacgtc
  gC = ctgactgtttgtggttc

Sequence similarity
• Intuitively, similarity of two sequences refers to the degree of match between corresponding positions in the sequences

  agtgccgttaaagttgtacgtc
  ctgactgtttgtggttc

• What about sequences that differ in length?

Similarity vs homology
• Sequence similarity is not sequence homology
  − If the two sequences gB and gC have accumulated enough mutations, the similarity between them is likely to be low

  #mutations  sequence                  #mutations  sequence
  0           agtgtccgttaagtgcgttc      64          acagtccgttcgggctattg
  1           agtgtccgttatagtgcgttc     128         cagagcactaccgc
  2           agtgtccgcttatagtgcgttc    256         cacgagtaagatatagct
  4           agtgtccgcttaagggcgttc     512         taatcgtgata
  8           agtgtccgcttcaaggggcgt     1024        acccttatctacttcctggagtt
  16          gggccgttcatgggggt         2048        agcgacctgcccaa
  32          gcagggcgtcactgagggct      4096        caaac

Homology is more difficult to detect over greater evolutionary distances.

Similarity vs homology (2)
• Sequence similarity can occur by chance
  − Similarity does not imply homology
• Similarity is an expected consequence of homology

Orthologs and paralogs
• We distinguish between two types of homology
  − Orthologs: homologs from two different species
  − Paralogs: homologs within a species

[Figure: gene tree. Labels: Organism A (gA); "Gene A is copied within organism A" (gA, gA'); Organism B (gB); Organism C (gC).]

Orthologs and paralogs (2)
• Orthologs typically retain the original function
• In paralogs, one copy is free to mutate and acquire new function (no selective pressure)

[Figure: gene tree. Labels: Organism A (gA); "Gene A is copied within organism A" (gA, gA'); Organism B (gB); Organism C (gC).]

Sequence alignment
• Alignment specifies which positions in two sequences match

  acgtctag        acgtctag        acgtctag
  ||                 |||||        || |||||
  actctag-        -actctag        ac-tctag

  2 matches       5 matches       7 matches
  5 mismatches    2 mismatches    0 mismatches
  1 not aligned   1 not aligned   1 not aligned
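The match, mismatch and "not aligned" counts for the three alignments of acgtctag and actctag can be checked mechanically. The following is a minimal sketch; the function name `alignment_counts` is made up for this illustration and is not part of the course material.

```python
def alignment_counts(top: str, bottom: str) -> tuple[int, int, int]:
    """Count (matches, mismatches, unaligned) columns of a pairwise alignment.

    Both rows must have the same length; '-' marks an indel position.
    """
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    matches = mismatches = unaligned = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            unaligned += 1      # one symbol is paired with an indel
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, unaligned


for bottom in ("actctag-", "-actctag", "ac-tctag"):
    print(bottom, alignment_counts("acgtctag", bottom))
```

Moving the single indel changes the number of matches from 2 to 7, which is why the placement of indels has to be optimised rather than guessed.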

Mutations: insertions, deletions and substitutions

  acgtctag
     |||||
  -actctag

• Indel: insertion or deletion of a base with respect to the ancestor sequence
• Mismatch: substitution (point mutation) of a single base
• Insertions and/or deletions are called indels
  − We can't tell whether the ancestor sequence had a base or not at the indel position

Problems
• What sorts of alignments should be considered?
• How to score alignments?
• How to find optimal or good scoring alignments?
• How to evaluate the statistical significance of scores?

In this course, we discuss the first three problems.
The course Biological sequence analysis tackles all four in depth.

Global alignment
• Problem: find optimal scoring alignment between two sequences (Needleman & Wunsch 1970)
• We give a score for each position in the alignment
  − Identity (match): +1
  − Substitution (mismatch): −µ
  − Indel: −δ

  WHAT
  ||
  WH-Y

S(WHAT/WH-Y) = 1 + 1 − δ − µ
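The position-by-position scoring can be written as a short function. This is a sketch under the scoring scheme of the slide: +1 for a match, a mismatch penalty µ and an indel penalty written δ here. The name `alignment_score` and the example values µ = 1, δ = 2 are assumptions made for the illustration.

```python
def alignment_score(top: str, bottom: str, mu: float = 1, delta: float = 2) -> float:
    """Sum the column scores of a given alignment ('-' marks an indel)."""
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    score = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            score -= delta      # indel
        elif x == y:
            score += 1          # identity
        else:
            score -= mu         # substitution
    return score


# S(WHAT/WH-Y) = 1 + 1 - delta - mu
print(alignment_score("WHAT", "WH-Y", mu=1, delta=2))
```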

Representing alignments and scores

  WHAT
  ||
  WH-Y

The alignment as a path (X) through the matrix:

       -   W   H   A   T
  -
  W        X
  H            X   X
  Y                    X

Representing alignments and scores

  WHAT
  ||
  WH-Y

Cumulative scores along the path:

       -   W   H   A     T
  -    0
  W        1
  H            2   2−δ
  Y                      2−δ−µ

Global alignment score S_{3,4} = 2 − δ − µ

Dynamic programming
• How to find the optimal alignment?
• We use previous solutions for optimal alignments of smaller subsequences
• This general approach is known as dynamic programming

Filling the alignment matrix
Consider the alignment process at the shaded square.
• Case 1. Align H against H (match or substitution).
• Case 2. Align H in WHY against - (indel) in WHAT.
• Case 3. Align H in WHAT against - (indel) in WHY.

[Figure: alignment matrix with columns -, W, H, A, T and rows -, W, H, Y; the shaded square is at (H, H), with arrows labelled Case 1, Case 2 and Case 3 entering it.]

Filling the alignment matrix (2)
Scoring the alternatives.
• Case 1. S_{2,2} = S_{1,1} + s(2, 2)
• Case 2. S_{2,2} = S_{1,2} − δ
• Case 3. S_{2,2} = S_{2,1} − δ

s(i, j) = 1 for matching positions, s(i, j) = −µ for substitutions.
Choose the case (path) that yields the maximum score.
Keep track of path choices.

[Figure: alignment matrix with columns -, W, H, A, T and rows -, W, H, Y; arrows labelled Case 1, Case 2 and Case 3 enter the square at (H, H).]

Global alignment: formal development
A = a_1 a_2 a_3 … a_n, B = b_1 b_2 b_3 … b_m

  b_1  b_2  b_3  b_4  -
  -    a_1  -    a_2  a_3

• Any alignment can be written as a unique path through the matrix
• Score for aligning A and B up to positions i and j:
  S_{i,j} = S(a_1 a_2 a_3 … a_i, b_1 b_2 b_3 … b_j)

           0    1    2    3    4
           -    b_1  b_2  b_3  b_4
  0  -
  1  a_1
  2  a_2
  3  a_3

Scoring partial alignments
• Alignment of A = a_1 a_2 a_3 … a_n with B = b_1 b_2 b_3 … b_m can end in three ways
  − Case 1: (a_1 a_2 … a_{i-1}) a_i
            (b_1 b_2 … b_{j-1}) b_j
  − Case 2: (a_1 a_2 … a_{i-1}) a_i
            (b_1 b_2 … b_j)     -
  − Case 3: (a_1 a_2 … a_i)     -
            (b_1 b_2 … b_{j-1}) b_j

Scoring alignments
• Scores for each case:
  − Case 1: (a_1 a_2 … a_{i-1}) a_i
            (b_1 b_2 … b_{j-1}) b_j
    s(a_i, b_j) = +1 if a_i = b_j, −µ otherwise
  − Case 2: (a_1 a_2 … a_{i-1}) a_i
            (b_1 b_2 … b_j)     -
    s(a_i, -) = −δ
  − Case 3: (a_1 a_2 … a_i)     -
            (b_1 b_2 … b_{j-1}) b_j
    s(-, b_j) = −δ

Scoring alignments (2)
• First row and first column correspond to initial alignment against indels:
  S_{i,0} = −iδ
  S_{0,j} = −jδ
• Optimal global alignment score S(A, B) = S_{n,m}

           0     1     2     3     4
           -     b_1   b_2   b_3   b_4
  0  -     0     −δ    −2δ   −3δ   −4δ
  1  a_1   −δ
  2  a_2   −2δ
  3  a_3   −3δ

Algorithm for global alignment
Input sequences A, B, n = |A|, m = |B|
Set S_{i,0} := −iδ for all i
Set S_{0,j} := −jδ for all j
for i := 1 to n
    for j := 1 to m
        S_{i,j} := max{S_{i-1,j} − δ, S_{i-1,j-1} + s(a_i, b_j), S_{i,j-1} − δ}
    end
end

Algorithm takes O(nm) time and space.
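The pseudocode translates almost line by line into Python. The sketch below assumes the scoring of the earlier slides (+1 for a match, −µ for a mismatch, −δ per indel, with the indel penalty written `delta`); the function name `global_alignment_matrix` and the default values µ = 1, δ = 2 are choices made for this illustration.

```python
def global_alignment_matrix(a: str, b: str, mu: int = 1, delta: int = 2) -> list[list[int]]:
    """Fill the (n+1) x (m+1) score matrix S for global alignment of a and b."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -i * delta            # a_1..a_i aligned against indels only
    for j in range(1, m + 1):
        S[0][j] = -j * delta            # b_1..b_j aligned against indels only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1 if a[i - 1] == b[j - 1] else -mu
            S[i][j] = max(
                S[i - 1][j] - delta,    # a_i against an indel
                S[i - 1][j - 1] + sub,  # a_i against b_j
                S[i][j - 1] - delta,    # b_j against an indel
            )
    return S


S = global_alignment_matrix("ATCGT", "TGGTG")
print(S[-1][-1])   # optimal global alignment score S_{n,m}
```

Both loops touch every cell once, which is where the O(nm) time and space bound comes from.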

Global alignment: example
µ = 1, δ = 2

         -    T    G    G    T    G
    -    0   -2   -4   -6   -8  -10
    A   -2
    T   -4
    C   -6
    G   -8
    T  -10                        ?

Global alignment: example (2)
µ = 1, δ = 2

         -    T    G    G    T    G
    -    0   -2   -4   -6   -8  -10
    A   -2   -1   -3   -5   -7   -9
    T   -4   -1   -2   -4   -4   -6
    C   -6   -3   -2   -3   -5   -5
    G   -8   -5   -2   -1   -3   -4
    T  -10   -7   -4   -3    0   -2

  ATCGT-
   | ||
  -TGGTG
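"Keep track of path choices" can be done without storing pointers: starting from the bottom-right cell, check which of the three cases produced each value and step backwards. The following is a sketch under the same scoring assumptions as the example (match +1, µ = 1, indel penalty δ = 2); the function name `needleman_wunsch` and the tie-breaking order (diagonal first, then up, then left) are choices made for this illustration, and integer penalties are assumed so that the equality tests are exact.

```python
def needleman_wunsch(a: str, b: str, mu: int = 1, delta: int = 2) -> tuple[int, str, str]:
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -i * delta
    for j in range(1, m + 1):
        S[0][j] = -j * delta
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1 if a[i - 1] == b[j - 1] else -mu
            S[i][j] = max(S[i - 1][j] - delta, S[i - 1][j - 1] + sub, S[i][j - 1] - delta)

    # Backtrack from S[n][m] to S[0][0], re-deriving which case was used.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 1 if a[i - 1] == b[j - 1] else -mu
            if S[i][j] == S[i - 1][j - 1] + sub:        # a_i against b_j
                top.append(a[i - 1]); bottom.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] - delta:    # a_i against an indel
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:                                           # b_j against an indel
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return S[n][m], "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = needleman_wunsch("ATCGT", "TGGTG")
print(score)
print(top)
print(bottom)
```

When several cases tie, other optimal alignments with the same score exist; this sketch reports only one of them.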


Local alignment: rationale
• Otherwise dissimilar proteins may have local regions of similarity
  -> Proteins may share a function

[Figure: two protein structures. Human bone morphogenic protein receptor type II precursor (left) has a 300 aa region that resembles a 291 aa region in TGF-β receptor (right). The shared function here is protein kinase.]

Local alignment: rationale
[Figure: sequences A and B with their regions of similarity marked.]

• Global alignment would be inadequate
• Problem: find the highest scoring local alignment between two sequences
• Previous algorithm with minor modifications solves this problem (Smith & Waterman 1981)

From global to local alignment
• Modifications to the global alignment algorithm
  − Look for the highest-scoring path in the alignment matrix (not necessarily through the matrix)
  − Allow preceding and trailing indels without penalty

Scoring local alignments
A = a_1 a_2 a_3 … a_n, B = b_1 b_2 b_3 … b_m

Let I and J be intervals (substrings) of A and B, respectively: I ⊂ A, J ⊂ B

Best local alignment score:

  M(A, B) = max { S(I, J) : I ⊂ A, J ⊂ B }

where S(I, J) is the score for substrings I and J.

Allowing preceding and trailing indels
• First row and column initialised to zero: M_{i,0} = M_{0,j} = 0

           0    1    2    3    4
           -    b_1  b_2  b_3  b_4
  0  -     0    0    0    0    0
  1  a_1   0
  2  a_2   0
  3  a_3   0

  b_1  b_2  b_3
  -    -    a_1

Recursion for local alignment
• M_{i,j} = max { M_{i-1,j-1} + s(a_i, b_j), M_{i-1,j} − δ, M_{i,j-1} − δ, 0 }

       -   T   G   G   T   G
  -    0   0   0   0   0   0
  A    0   0   0   0   0   0
  T    0   1   0   0   1   0
  C    0   0   0   0   0   0
  G    0   0   1   1   0   1
  T    0   1   0   0   2   0
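Compared with the global case, only two things change: the borders are zero, and 0 is a fourth option in the maximum. The sketch below assumes match +1, mismatch −µ and an indel penalty written `delta`, with µ = 1 and δ = 2 chosen to reproduce the ATCGT/TGGTG matrix; the function name `local_alignment_matrix` is made up for the illustration.

```python
def local_alignment_matrix(a: str, b: str, mu: int = 1, delta: int = 2) -> list[list[int]]:
    """Fill the local alignment matrix M; no entry is ever negative."""
    n, m = len(a), len(b)
    M = [[0] * (m + 1) for _ in range(n + 1)]   # first row and column stay 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1 if a[i - 1] == b[j - 1] else -mu
            M[i][j] = max(
                M[i - 1][j - 1] + sub,  # extend with a_i against b_j
                M[i - 1][j] - delta,    # a_i against an indel
                M[i][j - 1] - delta,    # b_j against an indel
                0,                      # start a new local alignment here
            )
    return M


for row in local_alignment_matrix("ATCGT", "TGGTG"):
    print(row)
```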

Finding best local alignment
• Optimal score is the highest value in the matrix:
  M(A, B) = max_{i,j} M_{i,j}
• Best local alignment can be found by backtracking from the highest value in M

       -   T   G   G   T   G
  -    0   0   0   0   0   0
  A    0   0   0   0   0   0
  T    0   1   0   0   1   0
  C    0   0   0   0   0   0
  G    0   0   1   1   0   1
  T    0   1   0   0   2   0

Local alignment: example

          0   1   2   3   4   5   6   7   8   9  10
          -   G   G   C   T   C   A   A   T   C   A
  0   -   0   0   0   0   0   0   0   0   0   0   0
  1   A   0
  2   C   0
  3   C   0
  4   T   0
  5   A   0
  6   A   0
  7   G   0
  8   G   0

Local alignment: example
Scoring: match +2, mismatch -1, indel -2

          0   1   2   3   4   5   6   7   8   9  10
          -   G   G   C   T   C   A   A   T   C   A
  0   -   0   0   0   0   0   0   0   0   0   0   0
  1   A   0   0   0   0   0   0   2   2   0   0   2
  2   C   0   0   0   2   0   2   0   1   1   2   0
  3   C   0   0   0   2   1   2   1   0   0   3   1
  4   T   0   0   0   0   4   2   1   0   2   1   2
  5   A   0   0   0   0   2   3   4   3   1   1   3
  6   A   0   0   0   0   0   1   5   6   4   2   3
  7   G   0   2   2   0   0   0   3   4   5   3   1
  8   G   0   2   4   2   0   0   1   2   3   4   2

  C T - A A
  C T C A A
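The example can be reproduced by filling the matrix, remembering where the highest value occurs, and backtracking from that cell until a zero is reached. This is a sketch under the scoring stated on the slide (match +2, mismatch −1, indel −2); the function name `smith_waterman`, the parameter names and the tie-breaking order are assumptions of the illustration.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   indel: int = -2) -> tuple[int, str, str]:
    """Return (best score, aligned part of a, aligned part of b)."""
    n, m = len(a), len(b)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_i, best_j = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(0, M[i - 1][j - 1] + sub, M[i - 1][j] + indel, M[i][j - 1] + indel)
            if M[i][j] > best:                  # first occurrence of the maximum
                best, best_i, best_j = M[i][j], i, j

    # Backtrack from the highest value until the score drops to zero.
    top, bottom = [], []
    i, j = best_i, best_j
    while M[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if M[i][j] == M[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif M[i][j] == M[i - 1][j] + indel:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = smith_waterman("ACCTAAGG", "GGCTCAATCA")
print(score)
print(top)
print(bottom)
```

The loop cannot run off the matrix because the first row and column are zero, so the backtracking always stops at or before the border.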

Non-uniform mismatch penalties
• We used a uniform penalty for mismatches:
  s('A', 'C') = s('A', 'G') = … = s('G', 'T') = −µ
• Transition mutations (A->G, G->A, C->T, T->C) are approximately twice as frequent as transversions (e.g. A->T, T->A, A->C, G->T)
  − Use non-uniform mismatch penalties

         A      C      G      T
  A      1     -1     -0.5   -1
  C     -1      1     -1     -0.5
  G     -0.5   -1      1     -1
  T     -1     -0.5   -1      1

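The score table does not have to be typed in entry by entry: a transition is a change within the purines (A, G) or within the pyrimidines (C, T), and everything else is a transversion. The sketch below rebuilds the table from that rule; the function name `substitution_score` is made up for the illustration, and the values 1, −0.5 and −1 are the ones in the table.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_score(x: str, y: str) -> float:
    """Score for aligning base x against base y (non-uniform mismatches)."""
    if x == y:
        return 1.0
    pair = {x, y}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return -0.5     # transition: purine<->purine or pyrimidine<->pyrimidine
    return -1.0         # transversion: purine<->pyrimidine


for x in "ACGT":
    print(x, [substitution_score(x, y) for y in "ACGT"])
```

Such a function can replace the fixed match/mismatch test inside the alignment recursions without any other change.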
Gaps in alignment
• Gap is a succession of indels in alignment

  C T - - - A A
  C T C G C A A

• Previous model scored a length k gap as w(k) = −kδ
• Replication processes may produce longer stretches of insertions or deletions
  − In coding regions, insertions or deletions of codons may preserve functionality

Gap open and extension penalties (2)
• We can design a score that allows the penalty for opening a gap to be larger than the penalty for extending it:
  w(k) = −α − β(k − 1)
• Gap open cost α, gap extension cost β
• Our previous algorithm can be extended to use w(k) (not discussed on this course)
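The difference between the two gap models is easiest to see by evaluating both for the same gap lengths. In this sketch the linear model charges the same indel penalty (called `delta` here) for every position, while the affine model charges an opening cost once and an extension cost for each further position (called `alpha` and `beta` here). The function names and the example values are assumptions of the illustration.

```python
def linear_gap(k: int, delta: float = 2) -> float:
    """Linear model: every indel in a length-k gap costs delta."""
    return -k * delta


def affine_gap(k: int, alpha: float = 4, beta: float = 1) -> float:
    """Affine model: alpha to open the gap, beta for each of the k-1 extensions."""
    if k < 1:
        raise ValueError("gap length must be at least 1")
    return -alpha - beta * (k - 1)


for k in (1, 2, 3, 6):
    print(k, linear_gap(k), affine_gap(k))
```

With these example values a single indel is more expensive under the affine model, but a gap of three positions costs the same in both models and longer gaps are cheaper, which favours one long gap over several scattered indels.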


Multiple alignment
• Consider a set of n sequences:
  − Orthologous sequences from different organisms
  − Paralogs from multiple duplications
• How can we study relationships between these sequences?

  aggcgagctgcgagtgcta
  cgttagattgacgctgac
  ttccggctgcgac
  gacacggcgaacgga
  agtgtgcccgacgagcgaggac
  gcgggctgtgagcgcta
  aagcggcctgtgtgcccta
  atgctgctgccagtgta
  agtcgagccccgagtgc
  agtccgagtcc
  actcggtgc

Optimal alignment of three sequences
• Alignment of A = a_1 a_2 … a_i and B = b_1 b_2 … b_j can end either in (-, b_j), (a_i, b_j) or (a_i, -)
• 2^2 − 1 = 3 alternatives
• Alignment of A, B and C = c_1 c_2 … c_k can end in 2^3 − 1 = 7 ways: (a_i, -, -), (-, b_j, -), (-, -, c_k), (-, b_j, c_k), (a_i, -, c_k), (a_i, b_j, -) or (a_i, b_j, c_k)
• Solve the recursion using a three-dimensional dynamic programming matrix: O(n^3) time and space
• Generalizes to n sequences but is impractical with a moderate number of sequences
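The count of possible last columns follows from a simple rule: each sequence contributes either its last symbol or an indel, and the column of indels only is not allowed. A short sketch that enumerates the columns is given below; the function name `last_columns` and the symbol labels are made up for the illustration.

```python
from itertools import product


def last_columns(symbols: list[str]) -> list[tuple[str, ...]]:
    """All ways an alignment of len(symbols) sequences can end.

    Each position holds either the sequence's last symbol or '-';
    the all-indel column is excluded.
    """
    choices = [(s, "-") for s in symbols]
    return [col for col in product(*choices) if any(c != "-" for c in col)]


print(len(last_columns(["a_i", "b_j"])))            # 2^2 - 1
for col in last_columns(["a_i", "b_j", "c_k"]):     # 2^3 - 1 columns
    print(col)
```

Every cell of the dynamic programming matrix has to consider all of these columns, so the work per cell also grows exponentially with the number of sequences, on top of the growth of the matrix itself.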

Multiple alignment in practice
• In practice, real-world multiple alignment problems are usually solved with heuristics
• Progressive multiple alignment
  − Choose two sequences and align them
  − Choose third sequence w.r.t. two previous sequences and align the third against them
  − Repeat until all sequences have been aligned
  − Different options how to choose sequences and score alignments

Multiple alignment in practice
• Profile-based progressive multiple alignment: CLUSTALW
  − Construct a distance matrix of all pairs of sequences using dynamic programming
  − Progressively align pairs in order of decreasing similarity
  − CLUSTALW uses various heuristics to contribute to accuracy
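The first two steps, scoring all pairs by dynamic programming and ordering them by decreasing similarity, can be sketched as follows. This is not CLUSTALW itself: it uses the raw global alignment score (match +1, mismatch −1, indel −2) as the similarity measure, which is an assumption of this illustration, and it stops before any profile alignment. The function names `global_score` and `rank_pairs` are made up for the example.

```python
from itertools import combinations


def global_score(a: str, b: str, mu: int = 1, delta: int = 2) -> int:
    """Optimal global alignment score, keeping only two matrix rows in memory."""
    prev = [-j * delta for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [-i * delta]
        for j in range(1, len(b) + 1):
            sub = 1 if a[i - 1] == b[j - 1] else -mu
            cur.append(max(prev[j] - delta, prev[j - 1] + sub, cur[j - 1] - delta))
        prev = cur
    return prev[-1]


def rank_pairs(seqs: list[str]) -> list[tuple[int, int, int]]:
    """Return (score, i, j) for all pairs i < j, most similar pair first."""
    scored = [(global_score(seqs[i], seqs[j]), i, j)
              for i, j in combinations(range(len(seqs)), 2)]
    return sorted(scored, key=lambda t: -t[0])


for score, i, j in rank_pairs(["ACGT", "ACGA", "TTTT"]):
    print(i, j, score)
```

A progressive aligner would start from the pair at the top of this list and add the remaining sequences one at a time.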

Additional material
• R. Durbin, S. Eddy, A. Krogh, G. Mitchison: Biological sequence analysis
• Course Biological sequence analysis in Spring 2007
