Module Comparing and Visualizing Multiple Biological Sequences

Uploaded by

engeneeringtrader


From Pairwise to Multiple Alignment

• Up until now we have only tried to align two sequences.
• A faint (and statistically insignificant) similarity between two sequences becomes significant if it is present in many other sequences.
• Multiple alignments can reveal subtle similarities that pairwise alignments do not reveal.

Bioinformatics Algorithms: An Active Learning Approach.

YAFDLGYTCMFPVLLGGGELHIVQKETYTAPDEIAHYIKEHGITYIKLTPSLFHTIVNTASFAFDANFESLRLIVLGGEKIIPIDVIAFRKMYGHTE-FINHYGPTEATIGA

-AFDVSAGDFARALLTGGQLIVCPNEVKMDPASLYAIIKKYDITIFEATPALVIPLMEYI-YEQKLDISQLQILIVGSDSCSMEDFKTLVSRFGSTIRIVNSYGVTEACIDS

IAFDASSWEIYAPLLNGGTVVCIDYYTTIDIKALEAVFKQHHIRGAMLPPALLKQCLVSA----PTMISSLEILFAAGDRLSSQDAILARRAVGSGV-Y-NAYGPTENTVLS


• Alignment of 2 sequences is a 2-row matrix.

• Alignment of 3 sequences is a 3-row matrix

A T - G C G -
A - C G T - A
A T C A C - A

• Our scoring function should score alignments with conserved columns higher.


• Alignment of ATGC, AATC, and ATGC, with the number of symbols used up to each position shown above each row:

0 1 1 2 3 4
 A - T G C

0 1 2 3 3 4
 A A T - C

0 0 1 2 3 4
 - A T G C

• The corresponding path: (0,0,0) → (1,1,0) → (1,2,1) → (2,3,2) → (3,3,3) → (4,4,4)
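The symbol counts above can be turned into the path mechanically. The following is a minimal sketch in Python; the function name `alignment_to_path` is chosen here for illustration and gaps are assumed to be written as `-`.

```python
def alignment_to_path(rows):
    """Map alignment rows to the path of symbol counts through the alignment graph."""
    node = tuple(0 for _ in rows)
    path = [node]
    for column in zip(*rows):
        # A row advances by one position whenever its symbol in this column is not a gap.
        node = tuple(c + (sym != "-") for c, sym in zip(node, column))
        path.append(node)
    return path


print(alignment_to_path(["A-TGC", "AAT-C", "-ATGC"]))
# [(0, 0, 0), (1, 1, 0), (1, 2, 1), (2, 3, 2), (3, 3, 3), (4, 4, 4)]
```

Each column of the alignment becomes one edge of the path, so an alignment with five columns yields six nodes.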


[Figure: a cell of the 3-D alignment graph next to a 2-D cell. Labeled nodes: (i-1,j-1,k), (i-1,j,k), (i,j-1,k-1), (i,j,k-1), (i,j-1,k), (i,j,k).]


• δ(x, y, z) is an entry in the 3-D scoring matrix.
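As a rough illustration of how δ is used, the sketch below fills a 3-D dynamic programming table in Python. The scoring function `delta` here is a hypothetical stand-in (1 when all three symbols agree, 0 otherwise), not the scoring matrix from the slides, and the name `align3` is an assumption made for this example.

```python
from itertools import product


def delta(x, y, z):
    """Hypothetical column score: 1 if all three symbols agree (and are not gaps), else 0."""
    return 1 if x == y == z and x != "-" else 0


def align3(v, w, u):
    """Score of an optimal alignment of three strings via 3-D dynamic programming."""
    n, m, l = len(v), len(w), len(u)
    # The seven ways to reach (i, j, k): every nonzero 0/1 step vector.
    moves = [mv for mv in product((0, 1), repeat=3) if any(mv)]
    s = {(0, 0, 0): 0}
    # product() visits nodes in lexicographic order, so predecessors are always filled first.
    for i, j, k in product(range(n + 1), range(m + 1), range(l + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = None
        for di, dj, dk in moves:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            # A coordinate that does not advance contributes a gap to the column.
            x = v[i - 1] if di else "-"
            y = w[j - 1] if dj else "-"
            z = u[k - 1] if dk else "-"
            cand = s[(pi, pj, pk)] + delta(x, y, z)
            if best is None or cand > best:
                best = cand
        s[(i, j, k)] = best
    return s[(n, m, l)]


print(align3("ATGC", "AATC", "ATGC"))  # 3
```

With this particular `delta`, the score is the length of the longest subsequence common to all three strings (here ATC). A different scoring matrix only changes `delta`; the seven-way maximum stays the same.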


Copyright 2018 Compeau and Pevzner.
Multiple Alignment: Running Time
• For 3 sequences of length n, the run time is proportional to 7n^3.
• For a k-way alignment, build a k-dimensional Gedi graph with
– n^k nodes
– most nodes have 2^k – 1 incoming edges.
– Runtime: O(2^k n^k)

Calculate the runtime for aligning 10 sequences of length 100 each?
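One way to work this out is to count edges, taking roughly n^k nodes with 2^k – 1 incoming edges each. The helper name `msa_dp_work` below is made up for this sketch, and the count ignores boundary nodes, so it is an estimate.

```python
def msa_dp_work(k, n):
    """Approximate edge count of the k-dimensional alignment graph: (2^k - 1) * n^k."""
    return (2**k - 1) * n**k


print(msa_dp_work(3, 100))            # 7000000, i.e. 7 * n^3 for three sequences
print(f"{msa_dp_work(10, 100):.3e}")  # 1.023e+23
```

At an assumed 10^9 edge updates per second, about 10^23 updates would take on the order of millions of years, which is why exact multiple alignment is impractical for more than a handful of sequences.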


Every multiple alignment induces pairwise alignments:

AC-GCGG-C
AC-GC-GAG
GCCGC-GAG

ACGCGG-C AC-GCGG-C AC-GCGAG

ACGC-GAG GCCGC-GAG GCCGCGAG
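The projection from a multiple alignment to an induced pairwise alignment keeps two rows and drops every column in which both rows hold a gap. A minimal Python sketch, with `induced_pairwise` as an illustrative name:

```python
def induced_pairwise(row_a, row_b):
    """Project a multiple alignment onto two rows, dropping columns where both rows are gaps."""
    kept = [(a, b) for a, b in zip(row_a, row_b) if not (a == "-" and b == "-")]
    return "".join(a for a, _ in kept), "".join(b for _, b in kept)


msa = ["AC-GCGG-C", "AC-GC-GAG", "GCCGC-GAG"]
print(induced_pairwise(msa[0], msa[1]))  # ('ACGCGG-C', 'ACGC-GAG')
print(induced_pairwise(msa[0], msa[2]))  # ('AC-GCGG-C', 'GCCGC-GAG')
print(induced_pairwise(msa[1], msa[2]))  # ('AC-GCGAG', 'GCCGCGAG')
```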


Given a set of arbitrary pairwise alignments, can we construct a multiple alignment that induces them?

AAAATTTT----    ----AAAATTTT    TTTTGGGG----
----TTTTGGGG    GGGGAAAA----    ----GGGGAAAA


- A G G C T A T C A C C T G
T A G - C T A C C A - - - G
C A G - C T A C C A - - - G
C A G - C T A T C A C - G G
C A G - C T A T C G C - G G

A 0 1 0 0 0 0 1 0 0 .8 0 0 0 0
C .6 0 0 0 1 0 0 .4 1 0 .6 .2 0 0
G 0 0 1 .2 0 0 0 0 0 .2 0 0 .4 1
T .2 0 0 0 0 1 0 .6 0 0 0 0 .2 0
- .2 0 0 .8 0 0 0 0 0 0 .4 .8 .4 0
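The profile is the table of per-column symbol frequencies of the alignment. The sketch below recomputes it in Python; the function name `profile` and the dictionary layout (one list of frequencies per symbol) are choices made for this example.

```python
def profile(alignment, alphabet="ACGT-"):
    """Column-wise symbol frequencies of an alignment, one list per symbol."""
    n_rows = len(alignment)
    n_cols = len(alignment[0])
    return {
        sym: [sum(row[c] == sym for row in alignment) / n_rows for c in range(n_cols)]
        for sym in alphabet
    }


rows = [
    "-AGGCTATCACCTG",
    "TAG-CTACCA---G",
    "CAG-CTACCA---G",
    "CAG-CTATCAC-GG",
    "CAG-CTATCGC-GG",
]
p = profile(rows)
for sym in "ACGT-":
    print(sym, p[sym])
```

Every column of the result sums to 1, since each of the five rows contributes exactly one symbol (or gap) per column.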

• In the past we were aligning a sequence against a sequence.
– Can we align a sequence against a profile?
– Can we align a profile against a profile?
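One possible answer to the first question is to reuse the pairwise dynamic programming recurrence and score a symbol against a profile column by its expected score under the column's frequencies. The Python sketch below does this; the expected-score scheme, the zero cost for gap against gap, and the name `align_to_profile` are all assumptions of this example rather than something stated in the slides.

```python
def align_to_profile(seq, prof, n_cols, match=1, mismatch=-1, indel=-1):
    """Best global score of seq against a profile under an expected-score scheme (an assumption)."""

    def col_score(x, c):
        # Expected score of placing symbol x (or a gap) against profile column c.
        total = 0.0
        for sym, freqs in prof.items():
            if x == "-" and sym == "-":
                pair = 0  # gap against gap costs nothing (assumption)
            elif x == "-" or sym == "-":
                pair = indel
            else:
                pair = match if x == sym else mismatch
            total += freqs[c] * pair
        return total

    # s[i][j]: best score of seq[:i] against the first j profile columns.
    s = [[0.0] * (n_cols + 1) for _ in range(len(seq) + 1)]
    for i in range(1, len(seq) + 1):
        s[i][0] = s[i - 1][0] + indel  # symbol placed against a new all-gap column
    for j in range(1, n_cols + 1):
        s[0][j] = s[0][j - 1] + col_score("-", j - 1)
    for i in range(1, len(seq) + 1):
        for j in range(1, n_cols + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + col_score(seq[i - 1], j - 1),
                s[i][j - 1] + col_score("-", j - 1),
                s[i - 1][j] + indel,
            )
    return s[len(seq)][n_cols]


# Profile of the one-sequence "alignment" ACGT: every column holds a single certain symbol.
prof = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1], "-": [0, 0, 0, 0]}
print(align_to_profile("ACGT", prof, 4))  # 4.0
print(align_to_profile("AGT", prof, 4))   # 2.0
```

When the profile comes from a single sequence, as in the usage lines, the result coincides with ordinary pairwise alignment, which is a convenient sanity check. Aligning a profile against a profile could follow the same pattern with a column-against-column expected score.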
Approximate methods to perform MSA
Approximate methods:
* Build the alignment from small conserved regions, or from statistical or probabilistic models, which makes them fast.

* Give a sub-optimal alignment that is close to the best one, in reasonable time.

* Most popular methods:

Progressive (greedy approach)
Iterative
Multiple Alignment: Greedy Approach
• Choose the two most similar sequences and combine them into a profile, thereby reducing the alignment of k sequences to an alignment of k – 2 sequences and 1 profile.
• Iterate.
• Used by ClustalW, T-COFFEE.


• 6 pairwise alignments (premium for match +1, penalties for indels and mismatches -1)

s2 GTCTGA                 s1 GATTCA--
s4 GTCAGC  (score = 2)    s4 G-T-CAGC (score = 0)

s1 GAT-TCA                s2 G-TCTGA
s2 G-TCTGA (score = 1)    s3 GATAT-T  (score = -1)

s1 GAT-TCA                s3 GAT-ATT
s3 GATAT-T (score = 1)    s4 G-TCAGC  (score = -1)
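The six scores can be reproduced with a standard global alignment recurrence using the same scoring (match +1, mismatch -1, indel -1). The Python sketch below is one way to do it; `global_score` is a name chosen for this example, and the sequences are read off the alignments above with gaps removed.

```python
from itertools import combinations


def global_score(v, w, match=1, mismatch=-1, indel=-1):
    """Optimal global alignment score, computed row by row (Needleman-Wunsch style)."""
    prev = [j * indel for j in range(len(w) + 1)]
    for i in range(1, len(v) + 1):
        cur = [i * indel]
        for j in range(1, len(w) + 1):
            diag = prev[j - 1] + (match if v[i - 1] == w[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + indel, cur[j - 1] + indel))
        prev = cur
    return prev[-1]


seqs = {"s1": "GATTCA", "s2": "GTCTGA", "s3": "GATATT", "s4": "GTCAGC"}
scores = {(a, b): global_score(seqs[a], seqs[b]) for a, b in combinations(seqs, 2)}
closest = max(scores, key=scores.get)
print(closest, scores[closest])  # ('s2', 's4') 2
```

The greedy approach starts from the highest-scoring pair, so `closest` is the pair that gets merged into a profile first.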
Greedy Approach: Example
• Since s2 and s4 are closest, we consolidate them into a profile:
s2 GTCTGA
s4 GTCAGC
s2,4 = GTCt/aGa/c
• New set of 3 sequences to align:
s1 GATTCA
s3 GATATT
s2,4 GTCt/aGa/c
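The consolidated notation writes an agreed symbol in upper case and a disagreement as the two lower-case alternatives separated by a slash. A small Python sketch of that convention, assuming the two sequences are already aligned without gaps (the helper name `consolidate` is hypothetical):

```python
def consolidate(a, b):
    """Merge two gap-free aligned sequences of equal length into the slide's profile notation."""
    assert len(a) == len(b)
    return "".join(x if x == y else f"{x.lower()}/{y.lower()}" for x, y in zip(a, b))


print(consolidate("GTCTGA", "GTCAGC"))  # GTCt/aGa/c
```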

What will be the next steps?

Approximate algorithms: Iterative alignment

ALGORITHM:

1. Obtain a draft progressive alignment.

2. Improve the alignment using a Kimura distance matrix.

3. Refine the alignment.
Approximate algorithms: Refine Iterative alignment

REFINEMENT:

1. Choose a sequence with gaps.

2. Move gaps within the aligned sequence to obtain more matches.

3. Go to step 1 until all sequences with gaps have been realigned.
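The refinement steps can be sketched as a simple local search in Python. This is only an illustration under stated assumptions: "more matches" is measured here as the number of identical symbol pairs per column, and a gap is moved by swapping it with a neighboring symbol. The names `matches` and `refine_row` are made up for this example and do not describe any particular tool.

```python
def matches(rows):
    """Number of matching symbol pairs summed over all columns (gaps never match)."""
    total = 0
    for col in zip(*rows):
        for a in range(len(col)):
            for b in range(a + 1, len(col)):
                if col[a] == col[b] != "-":
                    total += 1
    return total


def refine_row(rows, r):
    """Greedily swap a gap with a neighboring symbol in row r while that increases matches."""
    rows = list(rows)
    improved = True
    while improved:
        improved = False
        row = rows[r]
        for p in range(len(row) - 1):
            # Swapping a gap with its neighbor moves the gap without changing the sequence.
            if (row[p] == "-") != (row[p + 1] == "-"):
                cand = row[:p] + row[p + 1] + row[p] + row[p + 2:]
                trial = rows[:r] + [cand] + rows[r + 1:]
                if matches(trial) > matches(rows):
                    rows, improved = trial, True
                    break
    return rows


print(refine_row(["ACGT", "A-GT", "AG-T"], 2))  # ['ACGT', 'A-GT', 'A-GT']
```

The loop stops when no single gap move improves the count, so it can end in a local optimum; that is consistent with these methods being approximate.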
What do these alignments look like in practice?

Similar (amino acids with similar properties) > Identical (the same amino acid)

Sequence conservation implies function

Why do we need multiple sequence alignments?

An MSA of DNA or protein sequences can yield more information than the analysis of a single sequence, such as:

● Which part of the sequence is shared between different organisms?

● When dealing with a new protein of unknown function, the presence of several domains (functional parts of a protein) similar to domains in other “known” protein sequences can imply a similar structure or function.

It is known that the selective pressure of evolution results from the need to conserve function.

● Predicting the structure of a new protein

● Building phylogenetic trees (depicting evolutionary relationships)

How do we make phylogenetic trees?

Next module
Revisiting Darwin’s experiments using MSA
