Sequence Analysis in Bioinformatics

The document discusses sequence analysis and alignment in bioinformatics. It defines sequence alignment, describes different types of sequence alignment including global, local and multiple sequence alignment. It also explains different methods used for sequence alignment like dot matrix, dynamic programming, PAM and BLOSUM matrices, and progressive multiple sequence alignment.

Uploaded by

Bhaskar Chatterjee

Sequence Analysis in Bioinformatics
BIOF605
• Sequence comparison lies at the heart of bioinformatics analysis. It is
an important first step toward structural and functional analysis of
newly determined sequences.
• As new biological sequences are being generated at exponential rates,
sequence comparison is becoming increasingly important for drawing
functional and evolutionary inferences about a new protein from proteins
already in the database.
• The most fundamental process in this type of comparison is sequence
alignment. This is the process by which sequences are compared by
searching for common character patterns and establishing residue–
residue correspondence among related sequences. Pairwise sequence
alignment is the process of aligning two sequences and is the basis of
database similarity searching and multiple sequence alignment.
DEFINITION OF SEQUENCE ALIGNMENT
• Sequence alignment is the procedure of comparing two (pair-wise
alignment) or more (multiple sequence alignment) sequences by searching
for a series of individual characters or character patterns that are in the
same order in the sequences. Two sequences are aligned by writing them
across a page in two rows.
• Identical or similar characters are placed in the same column, and
nonidentical characters can either be placed in the same column as a
mismatch or opposite a gap in the other sequence. In an optimal
alignment, nonidentical characters and gaps are placed to bring as many
identical or similar characters as possible into vertical register.
• Sequences that can be readily aligned in this manner are said to be similar.
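The column-by-column view described above can be sketched in a few lines of Python. This is only an illustration: the function name `column_counts`, the gap symbol `-`, and the example rows are assumptions made here, not part of the slides.

```python
def column_counts(row_a, row_b, gap="-"):
    """Count identical, mismatched and gapped columns in a two-row alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            counts["gap"] += 1        # character opposite a gap
        elif a == b:
            counts["match"] += 1      # identical characters in one column
        else:
            counts["mismatch"] += 1   # nonidentical characters in one column
    return counts


# Two hypothetical rows, already aligned by hand.
counts = column_counts("GATTACA", "GA-TCCA")
print(counts)
```

An optimal alignment, in the sense used above, is one that arranges the gaps and mismatches so that the match count (or a weighted score built from these counts) is as high as possible.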
Pairwise Sequence Alignment
• The goal of pairwise sequence alignment is to establish a
correspondence between the elements in a pair of sequences that
share a common property, such as common ancestry or a common
structural or functional role.
• In bioinformatics, the sequences under consideration are typically
nucleic acid or amino acid polymers.
• We will consider two variants of the pairwise sequence alignment
problem: global alignment and local alignment.
• Global Alignment
For two hypothetical protein sequence fragments, the global alignment is stretched over
the entire sequence length to include as many matching amino acids as possible, up to and
including the sequence ends. Vertical bars between the sequences indicate the presence of
identical amino acids. Although there is an obvious region of identity in this example (the
sequence GKG preceded by a commonly observed substitution of T for A), a global alignment
may leave such regions unaligned so that more amino acids along the entire sequence lengths
can be matched.
• Local Alignment
In a local alignment, the alignment stops at the ends of regions of identity or strong similarity,
and a much higher priority is given to finding these local regions than to extending the alignment
to include more neighboring amino acid pairs. Dashes indicate sequence not included in the
alignment. This type of alignment favors finding conserved nucleotide patterns in DNA
sequences or conserved amino acid patterns in protein sequences.
Dot Matrix Method
• The most basic sequence alignment method is the dot matrix method, also known as the dot plot
method.
• It is a graphical way of comparing two sequences in a two-dimensional matrix.
• In a dot matrix, two sequences to be compared are written in the horizontal and vertical axes of
the matrix.
• The comparison is done by scanning each residue of one sequence for similarity with all residues
in the other sequence.
• If a residue match is found, a dot is placed within the graph.
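The procedure in the bullets above can be sketched as follows. The function names `dot_matrix` and `render`, the `*` and `.` symbols, and the example sequences are illustrative assumptions; practical dot plot tools usually also apply a sliding window and a threshold to suppress noise, which this sketch leaves out.

```python
def dot_matrix(seq_x, seq_y):
    """Return a grid where grid[i][j] is True when seq_y[i] equals seq_x[j]."""
    return [[x == y for x in seq_x] for y in seq_y]


def render(seq_x, seq_y):
    """Draw the dot matrix as text: seq_x along the top, seq_y down the side."""
    grid = dot_matrix(seq_x, seq_y)
    lines = ["  " + " ".join(seq_x)]
    for y, row in zip(seq_y, grid):
        lines.append(y + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTA", "GATA"))
```

Runs of dots along a diagonal mark stretches where the two sequences agree residue by residue.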
Dynamic Programming Method
• Dynamic programming is a method that determines optimal alignment by matching two
sequences for all possible pairs of characters between the two sequences.
• It is fundamentally similar to the dot matrix method in that it also creates a two-dimensional
alignment grid.
• However, it finds alignment in a more quantitative way by converting a dot matrix into a scoring
matrix to account for matches and mismatches between sequences.
• By searching for the set of highest scores in this matrix, the best alignment can be accurately
obtained.
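As a minimal sketch of how such a scoring matrix is filled and then traced back, the following Python code implements global alignment in the style of Needleman and Wunsch. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the example sequences are assumptions chosen for illustration; real analyses would normally use a substitution matrix and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty. Returns (score, row_a, row_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell takes the best of: diagonal (pair two residues), up or left (gap).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, row_a, row_b = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(row_a)
print(row_b)
```

With these assumed scores the example reaches 4: six matches and one gap. Several alignments can tie for the best score, and the traceback above reports only one of them.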
PAM Matrices
• The PAM matrices (also called Dayhoff PAM matrices) were first constructed by Margaret Dayhoff,
who compiled alignments of seventy-one groups of very closely related protein sequences.
• PAM stands for “point accepted mutation” (although “accepted point mutation” or APM may be a
more appropriate term, PAM is easier to pronounce).
• Because of the use of very closely related homologs, the observed mutations were not expected
to significantly change the common function of the proteins.
• Thus, the observed amino acid mutations are considered to be accepted by natural selection.
BLOSUM Matrices
• Unlike PAM matrices, which are extrapolated from closely related sequences, BLOSUM matrices
are derived directly from substitutions observed in aligned sequence blocks at a chosen
percentage identity.
• For example, BLOSUM62 is built from blocks in which sequences sharing 62% identity or more
are clustered together, so the matrix reflects substitutions between sequences that are less
than 62% identical.
• Other BLOSUM matrices based on sequence groups of various identity levels have also been
constructed.
• In contrast to the PAM numbering system, the lower the BLOSUM number, the more
divergent the sequences the matrix represents.
Pairwise Sequence Alignment Tools < EMBL-EBI
Global alignment uses the Needleman–Wunsch algorithm; local alignment uses the Smith–Waterman algorithm.
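To make the local case concrete, here is a minimal Smith–Waterman sketch in Python. The scoring values (match +2, mismatch -1, linear gap -2) and the two example fragments, built around the GKG motif mentioned earlier, are assumptions made for illustration and are not the parameters used by the EMBL-EBI tools.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty. Returns (score, row_a, row_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores never drop below zero, so a poor region cannot
            # drag down a good local match that follows it.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


local_score, local_a, local_b = smith_waterman("WWWTGKGHHH", "PPAGKGYY")
print(local_score, local_a, local_b)
```

The two differences from the global method are the zero floor in each cell and the traceback, which starts at the best cell anywhere in the matrix and stops at the first zero rather than running corner to corner.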
Multiple Sequence Alignment
• From a multiple alignment of three or more protein sequences, the
highly conserved residues that define structural and functional
domains in protein families can be identified.
• New members of such families can then be found by searching
sequence databases for other sequences with these same domains.
• Alignment of DNA sequences can assist in finding conserved
regulatory patterns in DNA sequences.
• Despite the great value of multiple sequence alignments, obtaining
one presents a very difficult algorithmic problem.
SCORING FUNCTION
• Multiple sequence alignment arranges sequences so that a maximum number of
residues from each sequence are matched up according to a particular scoring function.
• The scoring function for multiple sequence alignment is based on the concept of sum of pairs
(SP).
• As the name suggests, it is the sum of the scores of all possible pairs of sequences in a multiple
alignment based on a particular scoring matrix. In calculating the SP scores, each column is scored
by summing the scores for all possible pairwise matches, mismatches and gap costs.
• The score of the entire alignment is the sum of all of the column scores. The purpose of most
multiple sequence alignment algorithms is to achieve maximum SP scores.
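The sum-of-pairs calculation can be sketched as below. The scores (match +1, mismatch -1, residue against gap -2, gap against gap 0) stand in for a real substitution matrix and gap cost and are assumptions for this example only, as are the function names.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of characters taken from the same alignment column."""
    if x == "-" and y == "-":
        return 0          # assumed convention: two gaps cost nothing
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum the pair scores over every column of a multiple alignment."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        # Every unordered pair of sequences contributes once per column.
        total += sum(pair_score(x, y, match, mismatch, gap)
                     for x, y in combinations(column, 2))
    return total


sp = sum_of_pairs(["GKG", "GKG", "G-A"])
print(sp)
```

For the three rows shown, the columns score 3, -3 and -1 under these assumed values, so the alignment as a whole scores -1.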
Progressive Alignment Method
• Progressive alignment depends on the stepwise assembly of multiple alignment and is heuristic
in nature.
• It speeds up the alignment of multiple sequences through a multistep process. It first conducts
pairwise alignments for each possible pair of sequences using the Needleman–Wunsch global
alignment method and records these similarity scores from the pairwise comparisons.
• The scores can either be percent identity or similarity scores based on a particular substitution
matrix. Both scores correlate with the evolutionary distances between sequences.
• The scores are then converted into evolutionary distances to generate a distance matrix for all
the sequences involved. A simple phylogenetic analysis is then performed based on the
distance matrix to group sequences based on pairwise distance scores.
• As a result, a phylogenetic tree is generated using the neighbor-joining method. The tree
reflects evolutionary proximity among all the sequences.
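The first steps of this procedure, from pairwise scores to a distance matrix, can be sketched as follows. To stay short, the sketch assumes the sequences are already the same length and compares them position by position instead of running a full Needleman–Wunsch alignment for each pair. It uses distance = 1 - fractional identity, which is one simple conversion among several, and it only picks out the closest pair rather than building a neighbor-joining tree. The sequence names and function names are hypothetical.

```python
from itertools import combinations


def fractional_identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length sequences")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def distance_matrix(seqs):
    """Build a symmetric matrix of 1 - identity for every pair of sequences."""
    names = list(seqs)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = 1.0 - fractional_identity(seqs[n], seqs[m])
        dist[n][m] = dist[m][n] = d
    return dist


def closest_pair(dist):
    """Return the two sequence names separated by the smallest distance."""
    return min(combinations(list(dist), 2), key=lambda p: dist[p[0]][p[1]])


seqs = {"s1": "GATTACA", "s2": "GATTACC", "s3": "GCTTGCC"}
dist = distance_matrix(seqs)
first_join = closest_pair(dist)
print(first_join)
```

A tree-building method such as neighbor-joining would start from this matrix and repeatedly group sequences or clusters until a single tree remains.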
Clustal Omega < Multiple Sequence Alignment < EMBL-EBI
