03 - Sequence Alignment

Sequence alignment is a method used in bioinformatics to compare biological sequences such as DNA, RNA, or proteins to identify similarities that may indicate functional, structural, or evolutionary relationships. It can be categorized into global alignment, which aligns entire sequences, and local alignment, which focuses on the most similar segments. Various algorithms, such as Needleman-Wunsch for global alignment and Smith-Waterman for local alignment, are used to compute alignments and edit distances, although they may have practical limitations in real-life applications.

Uploaded by

Md. Abdul Mukit
Copyright
© All Rights Reserved

Sequence Alignment

Sequence alignment is the process of comparing and detecting similarities
between biological sequences.
Which similarities are detected depends on the goals of the particular
alignment process. Sequence alignment is useful in many bioinformatics
applications.
In bioinformatics, a sequence alignment is a way of arranging the sequences of
DNA, RNA, or protein to identify regions of similarity that may be a consequence
of functional, structural, or evolutionary relationships between the sequences.[1]
Aligned sequences of nucleotide or amino acid residues are typically represented
as rows within a matrix. Gaps are inserted between the residues so that identical or
similar characters are aligned in successive columns. Sequence alignments are also
used for non-biological sequences, such as calculating the distance cost between
strings in a natural language or in financial data.

For example, the simplest way to compare two sequences of the same length is to
calculate the number of matching symbols. The value that measures the degree of
sequence similarity is called the alignment score of two sequences. The opposite
value, corresponding to the level of dissimilarity between sequences, is usually
referred to as the distance between sequences. The number of non-matching
characters is called the Hamming distance. Fig. 1 shows an example of two
sequences with Hamming distance equal to 3.
Fig. 1. Example of two sequences with Hamming distance equal to 3.
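
The position-by-position comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names `hamming_distance` and `match_score` and the example sequences are assumptions made here, and they are not the sequences shown in Fig. 1.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def match_score(a: str, b: str) -> int:
    """Simplest similarity score: the number of matching positions."""
    return len(a) - hamming_distance(a, b)


# Hypothetical example sequences (not the ones from Fig. 1).
seq1 = "GATTACA"
seq2 = "GACTATA"
print(hamming_distance(seq1, seq2))  # 2 mismatching positions
print(match_score(seq1, seq2))       # 5 matching positions
```

For sequences of equal length the two values always add up to the sequence length, which is why the distance can be read as the opposite of the score.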

It is, however, worth noting that comparing sequence characters position by
position as described above can barely be called an alignment process, since it
does not take into account such typical biological events as deletions and
insertions.
The classical notion of sequence alignment includes calculating the so-called edit
distance, which generally corresponds to the minimal number of
substitutions, insertions and deletions needed to turn one sequence into another.
Fig. 2 demonstrates an example of two sequences with edit distance equal to 3.

Fig. 2. Example of two sequences with edit distance equal to 3.
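
A minimal sketch of how edit distance can be computed with dynamic programming, assuming unit cost for every substitution, insertion and deletion (the Levenshtein variant). The function name `edit_distance` and the example sequences are assumptions made for illustration, and they are not the sequences from Fig. 2.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimal number of substitutions, insertions and deletions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical example: Hamming distance is 3, but edit distance is only 2
# (delete the C, then append a C at the end).
print(edit_distance("ACGTT", "AGTTC"))  # 2
```

Keeping only two rows of the table is a design choice that saves memory when just the distance is needed; recovering the alignment itself requires the full table or a more elaborate scheme.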


The problems of computing edit distance and various types of sequence alignment
have exact solutions, e.g., the Smith-Waterman (Smith and Waterman, 1981) and
Needleman-Wunsch (Needleman and Wunsch, 1970) algorithms. Since these algorithms
were initially developed for protein-protein alignment and later adapted for DNA
sequence alignment, they are described in the section ‘Protein-protein
alignment’. In most real-life cases, however, these algorithms are impractical
for DNA alignment because, in their basic form, their running time and memory
requirements grow with the product of the two sequence lengths.

Why is sequence alignment important?


Sequence alignments are useful in bioinformatics for identifying sequence
similarity, producing phylogenetic trees, and developing homology models of
protein structures.

Local And Global Sequence Alignment

Sequence alignments can be further divided into global alignments that align the
complete sequences and local alignments that identify only the most similar
segments or sequence patterns (motifs). While global alignment algorithms
produce more accurate alignments for proteins of similar length, local alignment
algorithms are better at identifying similar regions within sequences when the
sequences are not related over their entire length.

Facts About Global Sequence Alignment

1. In global alignment, an attempt is made to align the entire sequence (end to
end alignment).
2. A global alignment contains all letters from both the query and target
sequences.
3. If two sequences have approximately the same length and are quite similar,
they are suitable for global alignment.
4. Suitable for aligning two closely related sequences.
5. Global alignments are usually done for comparing homologous genes, such as
two genes with the same function (in human vs. mouse), or for comparing two
proteins with similar function.
6. A general global alignment technique is the Needleman–Wunsch algorithm.
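
The end-to-end behaviour listed above can be illustrated with a small sketch of Needleman–Wunsch-style global alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch` are assumptions chosen for illustration; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two letters
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner so every letter is included.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


total, row_a, row_b = needleman_wunsch("ACGT", "AGT")
print(total)  # 2: three matches and one gap
print(row_a)  # ACGT
print(row_b)  # A-GT
```

Because the traceback starts at the bottom-right cell and ends at the top-left cell, the result contains all letters of both sequences, as stated in fact 2.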

Facts About Local Sequence Alignment


1. Finds local regions with the highest level of similarity between the two
sequences.
2. A local alignment aligns a substring of the query sequence to a substring of
the target sequence.
3. Any two sequences can be locally aligned, because local alignment finds
stretches with a high level of matches without considering the alignment of
the rest of the sequences.
4. Suitable for aligning more divergent sequences or distantly related
sequences.
5. Used for finding out conserved patterns in DNA sequences or conserved
domains or motifs in two proteins.
6. A general local alignment method is the Smith–Waterman algorithm.
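
The substring-to-substring behaviour listed above can be illustrated with a small sketch of Smith–Waterman-style local alignment. The scoring values (match +2, mismatch -1, linear gap penalty -1) and the function name `smith_waterman` are assumptions chosen for illustration, not parameters taken from this document.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                          # start a new alignment
                              score[i - 1][j - 1] + sub,  # align two letters
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical sequences that share only the segment ACGT.
print(smith_waterman("AAAACGTCCC", "GGGACGTTTT"))  # (8, 'ACGT', 'ACGT')
```

The two differences from the global sketch are the floor at zero, which lets an alignment restart anywhere, and the traceback that begins at the highest-scoring cell instead of the corner.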

Difference Between Global And Local Sequence Alignment In Tabular Form


GLOBAL SEQUENCE ALIGNMENT                | LOCAL SEQUENCE ALIGNMENT
-----------------------------------------+-----------------------------------------
In global alignment, an attempt is made  | Finds local regions with the highest
to align the entire sequence (end to     | level of similarity between the two
end alignment).                          | sequences.
-----------------------------------------+-----------------------------------------
A global alignment contains all letters  | A local alignment aligns a substring of
from both the query and target           | the query sequence to a substring of the
sequences.                               | target sequence.
-----------------------------------------+-----------------------------------------
If two sequences have approximately the  | Any two sequences can be locally aligned,
same length and are quite similar, they  | because local alignment finds stretches
are suitable for global alignment.       | with a high level of matches without
                                         | considering the alignment of the rest of
                                         | the sequences.
-----------------------------------------+-----------------------------------------
Suitable for aligning two closely        | Suitable for aligning more divergent
related sequences.                       | sequences or distantly related
                                         | sequences.
-----------------------------------------+-----------------------------------------
Global alignments are usually done for   | Used for finding conserved patterns in
comparing homologous genes, such as two  | DNA sequences or conserved domains or
genes with the same function (in human   | motifs in two proteins.
vs. mouse) or two proteins with similar  |
function.                                |
-----------------------------------------+-----------------------------------------
A general global alignment technique is  | A general local alignment method is the
the Needleman–Wunsch algorithm.          | Smith–Waterman algorithm.
-----------------------------------------+-----------------------------------------
Examples of global alignment tools       | Examples of local alignment tools
include: EMBOSS Needle; Needleman-Wunsch | include: BLAST; EMBOSS Water; LALIGN.
Global Align Nucleotide Sequences        |
(Specialized BLAST).                     |
