
Bioinfo Generic Skill

The Needleman-Wunsch algorithm and Smith-Waterman algorithm are dynamic programming algorithms used for pairwise sequence alignment. The Needleman-Wunsch algorithm performs global alignment to identify similarities across the full length of two sequences, while the Smith-Waterman algorithm performs local alignment to identify partial similarities between sequences. Both algorithms work by constructing a scoring matrix and tracing back through it to find the optimal alignment.

Uploaded by

Charmi thor

Pairwise sequence alignment

SEQUENCE ALIGNMENT

Sequence alignment arranges two or more nucleotide or amino acid
sequences to identify regions of similarity between the sequences.
These regions of similarity are helpful in understanding the
functional, structural, and evolutionary relationships between the
sequences.

Types
- Pairwise sequence alignment
- Multiple sequence alignment
Pairwise sequence alignment
- Involves aligning two sequences to identify the optimal pairing
of the sequences.
- Based on a scoring system that assigns positive scores to
matching characters and negative scores to mismatching
characters or gaps.
- Main objective: to obtain the highest possible score, which
indicates the degree of similarity between the two sequences.
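
To make the scoring system concrete, here is a minimal sketch that scores two sequences that have already been aligned. The function name `score_alignment` and the values match = +1, mismatch = -1, gap = -2 are assumptions chosen for illustration; they do not come from the slides.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-1, gap=-2):
    """Sum the per-column scores of two aligned, equal-length strings.

    '-' marks a gap. The default scores are illustrative assumptions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            total += gap        # a gap in either sequence
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # different characters
    return total


# Six matching columns and one gap column: 6 * 1 + (-2) = 4
print(score_alignment("GATTACA", "GA-TACA"))
```

An alignment algorithm searches over all possible gap placements for the arrangement that maximizes this total.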
There are various techniques available for pairwise sequence alignment.

The most important and basic techniques are:

- NEEDLEMAN-WUNSCH ALGORITHM

- SMITH-WATERMAN ALGORITHM
Needleman-Wunsch
The Needleman-Wunsch algorithm is a dynamic programming algorithm used in
bioinformatics and computational biology to perform global sequence alignment.
It compares two sequences, such as DNA, RNA, or protein sequences, across
their full length in order to identify similarities and differences between them.

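1. Initialization:
- Create a matrix (or table) with dimensions (m+1) x (n+1), where m
and n are the lengths of the two sequences you want to align. Rows
correspond to the characters of one sequence and columns to the
characters of the other, with one extra leading row and column.
- Set cell [0,0] to 0 and fill the rest of the first row and the
first column with cumulative gap penalties: cell [i,0] holds i times
the gap penalty and cell [0,j] holds j times the gap penalty. These
cells represent aligning a prefix of one sequence entirely against gaps.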
2. Scoring Scheme:
- Define a scoring scheme that assigns scores to
matches, mismatches, and gaps. For example, you might
assign a positive score for a match, a negative score for a
mismatch, and a penalty for each gap position. Separate
penalties for opening and extending a gap (affine gaps)
are also possible but need an extended recurrence.

3. Filling the Matrix:
- Starting from cell [1,1] (the first row and column are already initialized),
move row by row and compute each cell from three candidate values:
- the cell diagonally above-left plus the match or mismatch score for the
two characters (the characters are aligned with each other);
- the cell directly above plus the gap penalty (a gap in one sequence);
- the cell to the left plus the gap penalty (a gap in the other sequence).
- Choose the maximum of these three values and fill the current cell
with that score.
4. Traceback:
- Once the matrix is filled, start from the bottom-right corner (position
[m, n], where m and n are the lengths of the two sequences) and trace
back to the top-left corner to find the alignment.
- At each step, move to the neighbouring cell (diagonal, above, or left)
from which the current score was derived. A diagonal move is a
match/mismatch; a move up or left is a gap.
- Record the aligned sequences as you backtrack through the matrix.

5. Output:
- You will obtain one or more optimal alignments with the same score,
depending on the number of traceback paths that lead to the top-left corner.
- The final alignment(s) will show matches, mismatches, and gaps,
along with their positions in the input sequences.
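
The five steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `needleman_wunsch` and the scores (match = +1, mismatch = -1, gap = -2) are assumptions, a single linear gap penalty is used instead of separate opening and extension penalties, and only one optimal alignment is returned even when several exist.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative scores)."""
    m, n = len(a), len(b)

    # Step 1: (m+1) x (n+1) matrix; first row and column hold
    # cumulative gap penalties.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap

    # Steps 2-3: fill each cell with the best of diagonal, up, and left.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )

    # Step 4: trace back from [m, n] to [0, 0], preferring the diagonal.
    out_a, out_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    # Step 5: report the score and one optimal alignment.
    return score[m][n], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

In the example, three matches (+3) and one gap (-2) give a total of 1. Checking the diagonal first during traceback is an arbitrary tie-breaking choice; a different order can yield a different alignment with the same score.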
Smith-Waterman
The Smith-Waterman algorithm is a dynamic programming algorithm used
in bioinformatics to perform local sequence alignment between two
nucleotide or protein sequences. It is closely related to the
Needleman-Wunsch algorithm, which performs global sequence alignment,
but it finds the best-matching pair of subsequences instead.

1. Initialization:
- Create a matrix (usually a 2D table) with dimensions (m+1) x
(n+1), where m and n are the lengths of the two sequences you
want to align.
- Initialize the first row and first column of the matrix with zeros.
2. Scoring Scheme:
- Define a scoring scheme that assigns scores to matches,
mismatches, gap openings, and gap extensions. These scores
are typically provided as input to the algorithm.

3. Fill the Matrix:
- Iterate through the matrix, starting from the second row and second column.
- For each cell (i, j) in the matrix, calculate three candidate scores:
- Match or mismatch: score in the diagonal cell (i-1, j-1) plus the score for
aligning character i of sequence 1 with character j of sequence 2.
- Gap in one sequence: score in the cell above (i-1, j) minus a penalty for
gap opening or extension.
- Gap in the other sequence: score in the cell to the left (i, j-1) minus a
penalty for gap opening or extension.
- Fill the cell with the maximum of the three candidates. If that maximum is
negative, set the cell to zero (this is what makes the alignment local).
4. Traceback:
- Start at the cell with the highest score in the matrix and
trace back by moving, at each step, to the neighbouring cell
from which the current score was derived.
- Traceback stops when a cell with a score of zero is reached
(this includes the cells on the top and left edges of the matrix).
- Record the alignment path, marking matches, mismatches,
and gaps.

5. Reporting:
- Report the aligned subsequences along with their score and
their positions in the input sequences.
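
The local-alignment steps above can be sketched in Python as follows. As before, this is an illustration under stated assumptions: the function name `smith_waterman` and the scores (match = +2, mismatch = -1, gap = -2) are chosen for the example, a single linear gap penalty stands in for separate opening and extension penalties, and only the first highest-scoring cell is traced back.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty (illustrative scores).

    Returns (score, aligned_a, aligned_b, span_a, span_b), where each span
    is a half-open, 0-based (start, end) range in the input sequence.
    """
    m, n = len(a), len(b)

    # Step 1: (m+1) x (n+1) matrix, first row and column left at zero.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    # Steps 2-3: fill the matrix, clamping negative values to zero.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Step 4: trace back from the best cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    # Step 5: report score, aligned subsequences, and their positions.
    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))
    return best, aligned_a, aligned_b, (i, best_pos[0]), (j, best_pos[1])


print(smith_waterman("TTTACGTTT", "GGGACGGGG"))
# (6, 'ACG', 'ACG', (3, 6), (3, 6))
```

Only the shared region ACG is reported; the dissimilar flanking characters, which a global alignment would be forced to include, are left out because the zero floor lets the alignment restart anywhere.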
