lecture2_sequence_alignment

This document provides an introduction to sequence alignment in bioinformatics, detailing methods for both pairwise and multiple sequence alignments. It explains the significance of alignment in identifying evolutionary relationships and introduces key algorithms such as Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. The document also outlines the steps involved in these algorithms and their applications in database searching.

Uploaded by

Zahra.

INTRODUCTION TO BIOINFORMATICS

LECTURE NO 2

SEQUENCE ALIGNMENT

By:
Laila Sehar

National Center for Bioinformatics

Quaid-e-Azam University, Islamabad
Introduction
Sequence Alignment
• Definition: “Procedure for comparing two or more sequences by searching for a series of individual characters or character patterns that are in the same order in the sequences”
• Pairwise alignment: compares two sequences
• Multiple sequence alignment: compares more than two sequences
Example sequence alignment

• Align “abcdef” with “abdgf”
  • Write the second sequence below the first:
    abcdef
    abdgf
  • Shift the sequences to give the maximum number of matches between them
  • Mark matching characters with a vertical bar
Example sequence alignment

abcdef
||
abdgf

• Insert a gap between b and d in the lower sequence so that d and f can align:

abcdef
|| | |
ab-dgf

Note that e and g do not match.
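
The vertical-bar notation can be produced mechanically once the gapped rows are known. The following is a minimal Python sketch, not part of the original lecture; the function name `match_line` and the use of "-" as the gap character are assumptions made for illustration.

```python
def match_line(top: str, bottom: str, gap: str = "-") -> str:
    """Return a row with '|' under every column where both rows hold the same letter."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b and a != gap else " "   # a gap never counts as a match
        for a, b in zip(top, bottom)
    )


top, bottom = "abcdef", "ab-dgf"
print(top)
print(match_line(top, bottom))
print(bottom)
```

The function only marks matches in an alignment that has already been chosen; finding the best placement of gaps is the job of the dynamic programming algorithms.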
Why do sequence alignments?

• To find whether two (or more) genes or proteins are evolutionarily related to each other
• To find structurally or functionally similar regions within proteins
Pairwise Sequence Alignment
• Pairwise sequence alignment methods are used to find the best-matching piecewise (local) or global alignment of two query sequences.
• Pairwise sequence alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).
Methods for Pairwise Alignment

• Dynamic programming:
  • Needleman-Wunsch algorithm
  • Smith-Waterman algorithm
Dynamic Programming

Two methods for sequence alignment:

• Local sequence alignment
• Global sequence alignment

Global vs. Local Alignment

• Global alignment algorithms optimize the overall alignment between two sequences.
  • They assume that the two proteins are basically similar over their entire length.
• Local alignment algorithms seek only relatively conserved pieces of sequence.
  • They align those parts of the sequences that appear to have good similarity.
Global vs. Local Alignment
• Global
LGPSSKQTGKGS-SRIWDN
|    |  |||   |  |
LN-ITKSAGKGAIMRLGDA
• Local
--------GKG--------
        |||
--------GKG--------
Elements of Global Sequence Alignment
• Alignment scores. The score of an alignment is the sum of the scores for aligned pairs of letters and the scores for letters aligned with nulls (gaps).
• Substitution scores. Scores for aligned pairs of letters are called substitution scores, whether the aligned letters are identical or not.
• Gap scores. The score for a letter aligned with a null is called a gap score.
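
As a sketch of how these three elements combine, the Python function below sums substitution and gap scores column by column. It is illustrative only: the name `score_alignment` is hypothetical, and the default values (match = 1, mismatch = -1, gap = -2) are borrowed from the worked example later in this lecture rather than from a real substitution matrix.

```python
def score_alignment(top: str, bottom: str,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Score an existing alignment: sum of substitution scores and gap scores."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap          # letter aligned with a null
        elif a == b:
            total += match        # substitution score for identical letters
        else:
            total += mismatch     # substitution score for different letters
    return total


print(score_alignment("AAAC", "A-GC"))      # two matches, one gap, one mismatch
print(score_alignment("abcdef", "ab-dgf"))  # four matches, one gap, one mismatch
```

In practice, protein alignments usually take substitution scores from a matrix rather than a single match/mismatch pair; the simple scheme here is kept only to match the lecture's example.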
Significance
• Sequence alignment is commonly used in bioinformatics to align protein or nucleotide sequences.
• Alignments try to tell the evolutionary story of the proteins.
• Sequence alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).
• One motivation is database searching: given a query sequence q, find the sequences in a database D that make ‘good’ alignments to q.
Pairwise Global Alignment Algorithm

The Needleman-Wunsch algorithm
• The Needleman-Wunsch algorithm is a dynamic programming algorithm for optimal sequence alignment (Needleman and Wunsch, 1970).
• The optimal path can be determined by incremental extension of the optimal sub-paths.
• In a Needleman-Wunsch alignment, the optimal path must stretch from beginning to end in both sequences (hence the term ‘global alignment’).
• The key to understanding this approach is to observe how the alignment problem can be divided into subproblems.

Steps:
1. Initialization
2. Matrix fill (scoring)
3. Traceback and alignment
Pairwise Global Alignment of sequences

• Place each sequence along one axis
• Place a score of 0 in the upper-left corner
• Fill in the first row and column with multiples of the gap penalty
• Fill in each remaining cell with the maximum of three possible moves:
  • Vertical move: score of the cell above + gap penalty
  • Horizontal move: score of the cell to the left + gap penalty
  • Diagonal move: score of the upper-left diagonal cell + match/mismatch score
• The optimal alignment score is in the lower-right corner
• To reconstruct the optimal alignment, trace back where the maximum at each step came from, and stop on reaching the origin.
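
The initialization and matrix-fill steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `nw_matrix` is hypothetical, the gap penalty is linear, and the scoring values are those of the lecture's AAAC/AGC example.

```python
def nw_matrix(rows_seq: str, cols_seq: str,
              match: int = 1, mismatch: int = -1, gap: int = -2):
    """Fill the Needleman-Wunsch score matrix; rows_seq runs down, cols_seq across."""
    n, m = len(rows_seq), len(cols_seq)
    H = [[0] * (m + 1) for _ in range(n + 1)]

    # Initialization: first column and first row hold multiples of the gap penalty.
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap

    # Matrix fill: each cell is the best of the three possible moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if rows_seq[i - 1] == cols_seq[j - 1] else mismatch
            H[i][j] = max(
                H[i - 1][j] + gap,      # vertical move
                H[i][j - 1] + gap,      # horizontal move
                H[i - 1][j - 1] + s,    # diagonal move
            )
    return H


H = nw_matrix("AGC", "AAAC")
for row in H:
    print(row)
print("optimal score:", H[-1][-1])
```

The matrix has (n+1) x (m+1) cells and each cell is computed once, so both time and memory grow with the product of the two sequence lengths.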
Example of Global Alignment
• Let gap = -2, match = 1, mismatch = -1.
• Vertical move: Score + gap penalty
• Horizontal move: Score + gap penalty
• Diagonal move: Score + match/mismatch score

       empty   A   A   A   C
empty      0  -2  -4  -6  -8
A         -2   1  -1  -3  -5
G         -4  -1   0  -2  -4
C         -6  -3  -2  -1  -1

Two of the optimal alignments (each scores -1):

AAAC    AAAC
A-GC    -AGC
Finding the alignments that give the highest score
• The arrows constitute paths in the matrix. To find the highest-scoring alignments, follow the paths backwards from H(m,n) to H(0,0), adding one alignment column per step:
  • if the arrow comes from H(i−1,j), the column is (q_i, −);
  • if the arrow comes from H(i,j−1), the column is (−, d_j);
  • if the arrow comes from H(i−1,j−1), the column is (q_i, d_j).
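
The traceback rules above can be sketched as a recursive walk that follows every arrow, so that all highest-scoring alignments are collected rather than just one. This Python sketch is illustrative: the name `nw_all_alignments` is hypothetical, q indexes the rows and d the columns as in the H(i,j) notation, and the scoring values are those of the lecture's example.

```python
def nw_all_alignments(q: str, d: str,
                      match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (optimal score, list of all optimal (q_row, d_row) alignments)."""
    n, m = len(q), len(d)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if q[i - 1] == d[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j] + gap, H[i][j - 1] + gap, H[i - 1][j - 1] + s)

    results = []

    def walk(i, j, q_cols, d_cols):
        """Follow every arrow from H(i,j) back to H(0,0)."""
        if i == 0 and j == 0:
            results.append(("".join(reversed(q_cols)), "".join(reversed(d_cols))))
            return
        if i > 0 and H[i][j] == H[i - 1][j] + gap:        # arrow from H(i-1,j): (q_i, -)
            walk(i - 1, j, q_cols + [q[i - 1]], d_cols + ["-"])
        if j > 0 and H[i][j] == H[i][j - 1] + gap:        # arrow from H(i,j-1): (-, d_j)
            walk(i, j - 1, q_cols + ["-"], d_cols + [d[j - 1]])
        if i > 0 and j > 0:
            s = match if q[i - 1] == d[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:            # arrow from H(i-1,j-1): (q_i, d_j)
                walk(i - 1, j - 1, q_cols + [q[i - 1]], d_cols + [d[j - 1]])

    walk(n, m, [], [])
    return H[n][m], results


score, alignments = nw_all_alignments("AGC", "AAAC")
print("score:", score)
for q_row, d_row in alignments:
    print(d_row)
    print(q_row)
    print()
```

For AGC against AAAC with these scores, the walk finds three alignments with the same optimal score of -1: A-GC, -AGC and AG-C, each paired with AAAC. Enumerating every path can become expensive for long sequences, which is why many implementations report a single traceback.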
Local Alignment Algorithm

The Smith-Waterman algorithm

• In 1981, T. F. Smith and M. S. Waterman developed a new algorithm that can also detect local similarity.
• Suppose we have a long DNA sequence and want to isolate each subsequence that is similar to some part of the yeast genome.
Smith & Waterman
• Place each sequence along one axis
• Place a score of 0 in the upper-left corner
• Fill in the first row and column with 0s
• Fill in each remaining cell with the maximum of four possible values:
  • 0
  • Vertical move: score of the cell above + gap penalty
  • Horizontal move: score of the cell to the left + gap penalty
  • Diagonal move: score of the upper-left diagonal cell + match/mismatch score
• The optimal alignment score is the maximum value in the matrix
• To reconstruct the optimal alignment, start at that maximum, trace back where the maximum at each step came from, and stop when a zero is reached.
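
The local-alignment procedure can be sketched in Python as below. This is a minimal illustration rather than a definitive implementation: the name `smith_waterman` is hypothetical, the gap penalty is linear, only one traceback is followed (diagonal moves are preferred on ties), and the default scores are those used in this lecture's examples.

```python
def smith_waterman(a: str, b: str,
                   match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (best local score, aligned part of a, aligned part of b)."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]   # first row and column stay 0
    best, best_pos = 0, (0, 0)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a fresh local alignment here
                H[i - 1][j] + gap,      # vertical move
                H[i][j - 1] + gap,      # horizontal move
                H[i - 1][j - 1] + s,    # diagonal move
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the maximum cell until a zero is hit.
    i, j = best_pos
    a_cols, b_cols = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            a_cols.append(a[i - 1]); b_cols.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            a_cols.append(a[i - 1]); b_cols.append("-"); i -= 1
        else:
            a_cols.append("-"); b_cols.append(b[j - 1]); j -= 1
    return best, "".join(reversed(a_cols)), "".join(reversed(b_cols))


score, a_row, b_row = smith_waterman("GATACCC", "GATCACCT")
print("score:", score)
print(b_row)
print(a_row)
```

The zero in the maximum is what makes the alignment local: a cell never carries a negative running score forward, so a poorly matching region cannot drag down a good region that follows it.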
Local Alignment
• Let gap = -2, match = 1, mismatch = -1.
• Vertical move: Score + gap penalty
• Horizontal move: Score + gap penalty
• Diagonal move: Score + match/mismatch score

Sequences:    GATCACCT
              GATACCC

Alignment:    GATCACCT
              GAT-ACCC

       empty   G   A   T   C   A   C   C   T
empty      0   0   0   0   0   0   0   0   0
G          0   1   0   0   0   0   0   0   0
A          0   0   2   0   0   1   0   0   0
T          0   0   0   3   1   0   0   0   1
A          0   0   1   1   2   2   0   0   0
C          0   0   0   0   2   1   3   1   0
C          0   0   0   0   1   1   2   4   2
C          0   0   0   0   1   0   2   3   3
Practice

Align the following sequences using the Smith-Waterman algorithm:

CTTCAGC
CTCAGC

Match = 3
Mismatch = -2
Gap = -3
Conclusion
• The Needleman-Wunsch algorithm for global alignment and the Smith-Waterman technique for local alignment form the fundamental basis on which numerous sequence comparison tools were built:
  • BLAST (database search)
  • FASTA (database search)
  • Clustal (multiple sequence alignment)
• Database search tools such as BLAST and FASTA use indexing techniques, heuristics, and fast comparative methods to quickly compare a query sequence against an entire database.
