Bioinfo Generic Skill
SEQUENCE ALIGNMENT
Types
- Pairwise sequence alignment
- Multiple sequence alignment
Pairwise sequence alignment
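Pairwise sequence alignment arranges two sequences against each other to
find the optimal pairing of their residues. Two dynamic programming
algorithms are commonly used:
- Needleman-Wunsch algorithm (global alignment)
- Smith-Waterman algorithm (local alignment)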
Needleman-Wunsch
The Needleman-Wunsch algorithm is a dynamic programming algorithm used in
bioinformatics and computational biology to perform global sequence alignment.
It compares two sequences, such as DNA, RNA, or protein sequences, end to end
in order to identify similarities and differences between them.
1. Initialization:
- Create a matrix (or table) where the rows and columns
represent the characters or elements of the two sequences you
want to align.
- Initialize the first row and the first column of the matrix with
cumulative gap penalties. These cells represent a prefix of one
sequence aligned entirely against gaps, so with a linear gap penalty
the k-th cell holds k times the gap penalty.
2. Scoring Scheme:
- Define a scoring scheme that assigns scores to
matches, mismatches, and gaps. For example, you might
assign a positive score for a match, a negative score for a
mismatch, and a penalty for opening a gap and extending
a gap.
5. Output:
- You will obtain one or more optimal alignments, all with the same score,
depending on how many traceback paths lead from the bottom-right cell back
to the top-left corner.
- The final alignment(s) will show matches, mismatches, and gaps,
along with their positions in the input sequences.
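The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name needleman_wunsch and the default scores (match=1, mismatch=-1, gap=-2) are assumptions chosen for the example, and it uses a single linear gap penalty rather than separate gap-opening and gap-extension penalties. It also returns only one optimal alignment, even when several traceback paths exist.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment sketch with a linear gap penalty (assumed scores)."""
    m, n = len(a), len(b)

    # Initialization: (m+1) x (n+1) matrix; the first row and column
    # hold cumulative gap penalties.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Matrix filling: best of diagonal (match/mismatch), up (gap in b),
    # and left (gap in a).
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Traceback from the bottom-right cell to the top-left corner,
    # following one optimal path (diagonal moves are preferred on ties).
    out_a, out_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[m][n], "".join(reversed(out_a)), "".join(reversed(out_b))


total, aligned_a, aligned_b = needleman_wunsch("ACGT", "AGT")
print(total)
print(aligned_a)
print(aligned_b)
```

The two returned strings have equal length, with "-" marking a gap. Removing the gaps from each recovers the original input sequences.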
Smith-Waterman
The Smith-Waterman algorithm is a dynamic programming algorithm used
in bioinformatics to perform local sequence alignment between two
nucleotide or protein sequences. It is similar to the Needleman-Wunsch
algorithm, which performs global sequence alignment, but it finds the
highest-scoring pair of subsequences rather than aligning the two
sequences end to end.
1. Initialization:
- Create a matrix (usually a 2D table) with dimensions (m+1) x
(n+1), where m and n are the lengths of the two sequences you
want to align.
- Initialize the first row and first column of the matrix with zeros.
2. Scoring Scheme:
- Define a scoring scheme that assigns scores to matches,
mismatches, gap openings, and gap extensions. These scores
are typically provided as input to the algorithm.
5. Reporting:
- Report the aligned subsequences along with their scores and
positions.
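The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name smith_waterman and the default scores (match=2, mismatch=-1, gap=-2) are chosen for the example, and a single linear gap penalty stands in for separate gap-opening and gap-extension scores. The sketch floors every cell at zero, starts the traceback at the highest-scoring cell, and stops at the first zero. It reports only one best local alignment.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment sketch with a linear gap penalty (assumed scores)."""
    m, n = len(a), len(b)

    # Initialization: (m+1) x (n+1) matrix of zeros.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Matrix filling: same three moves as global alignment, but no cell
    # is allowed to drop below zero.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    # (i, j) are now the 0-based start offsets of the local alignment
    # in a and b respectively.
    return best, "".join(reversed(out_a)), "".join(reversed(out_b)), (i, j)


best, sub_a, sub_b, start = smith_waterman("TTTACGTTT", "GGACGGG")
print(best, sub_a, sub_b, start)
```

Flooring each cell at zero is what makes the alignment local: a poorly matching region resets the running score instead of dragging down the alignment that follows it.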