Bioinfo
Algorithm:
The Needleman-Wunsch algorithm in Python is as follows.
1st Step
Define a function that takes seq1 and seq2, together with the
scoring parameters (match score, mismatch penalty, and gap
penalty), as parameters.
2nd Step
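Initialize the number of rows and columns of the scoring matrix.
Each is one more than the sequence length, so that row 0 and
column 0 can represent alignment against an empty prefix.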
rows = len(seq1) + 1
cols = len(seq2) + 1
3rd Step
Create a scoring matrix of size rows x cols filled with zeros, then
initialize its first row and column with cumulative gap penalties:
For i from 1 to len(seq1):
    score_matrix[i][0] = gap_penalty * i
For j from 1 to len(seq2):
    score_matrix[0][j] = gap_penalty * j
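As a minimal sketch of Steps 2 and 3, the matrix can be built as a list of lists. The helper name init_score_matrix is only an illustration and is not part of the original code; the default gap penalty of -2 is taken from the example later in this document.

```python
def init_score_matrix(seq1, seq2, gap_penalty=-2):
    """Create the scoring matrix and fill its first row and column.

    Hypothetical helper used for illustration only.
    """
    rows = len(seq1) + 1
    cols = len(seq2) + 1
    # One row per prefix of seq1 and one column per prefix of seq2.
    score_matrix = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score_matrix[i][0] = gap_penalty * i
    for j in range(1, cols):
        score_matrix[0][j] = gap_penalty * j
    return score_matrix


print(init_score_matrix("CG", "GAC"))
# [[0, -2, -4, -6], [-2, 0, 0, 0], [-4, 0, 0, 0]]
```

The inner cells are still zero at this point; they are filled in the next step.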
4th Step
Now fill the remaining cells of the matrix:
For i from 1 to rows - 1:
    For j from 1 to cols - 1:
        Calculate the scores for match/mismatch and gap:
        If seq1[i-1] equals seq2[j-1]:
            match = score_matrix[i-1][j-1] + match_score
        Else:
            match = score_matrix[i-1][j-1] + mismatch_penalty
        delete = score_matrix[i-1][j] + gap_penalty
        insert = score_matrix[i][j-1] + gap_penalty
        Take the maximum of the three using the built-in max function:
        score_matrix[i][j] = max(match, delete, insert)
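To make the rule for a single cell concrete, here is a small sketch that computes one cell from its three neighbours. The function name cell_score and its argument order are assumptions made for this illustration; the two calls use cells from the example sequences CGTATT and GACTTT used in this document.

```python
def cell_score(diag, up, left, a, b,
               match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Score of one cell given its diagonal, upper and left neighbours.

    Hypothetical helper used for illustration only.
    """
    # Diagonal move: align character a with character b.
    match = diag + (match_score if a == b else mismatch_penalty)
    # Vertical move: character a is aligned with a gap.
    delete = up + gap_penalty
    # Horizontal move: character b is aligned with a gap.
    insert = left + gap_penalty
    return max(match, delete, insert)


# C against G in the top-left cell: neighbours are 0 (diag), -2 (up), -2 (left).
print(cell_score(0, -2, -2, "C", "G"))   # -1
# G against G one row down: neighbours are -2 (diag), -1 (up), -4 (left).
print(cell_score(-2, -1, -4, "G", "G"))  # -1
```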
5th Step
Print the matrix.
for row in score_matrix:
    print(row)
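Printing the raw rows gives plain lists. If labelled rows and columns are wanted, a sketch like the following could be used. The name format_matrix and the width parameter are assumptions for this illustration.

```python
def format_matrix(score_matrix, seq1, seq2, width=4):
    """Return the matrix as text with seq1 as row labels and seq2 as column labels.

    Hypothetical helper used for illustration only.
    """
    # Header: blank label column, blank column for the empty prefix, then seq2.
    header = " " * (2 * width) + "".join(ch.rjust(width) for ch in seq2)
    lines = [header]
    # Row 0 has no letter; the remaining rows are labelled with seq1.
    for label, row in zip(" " + seq1, score_matrix):
        cells = "".join(str(value).rjust(width) for value in row)
        lines.append(label.rjust(width) + cells)
    return "\n".join(lines)


small = [[0, -2], [-2, 1]]  # matrix for aligning "A" with "A"
print(format_matrix(small, "A", "A"))
```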
6th Step
1. Initialize two empty strings, align1 and align2, to store
the aligned sequences.
2. Set i to the row index of the bottom-right corner of the
scoring matrix.
3. Set j to the column index of the bottom-right corner of
the scoring matrix.
4. While i > 0 and j > 0:
    a. If seq1[i - 1] is equal to seq2[j - 1]:
        - Prepend seq1[i - 1] to align1.
        - Prepend seq2[j - 1] to align2.
        - Decrement i and j by 1.
    b. Else if the current score at score_matrix[i][j] is equal to score_matrix[i - 1][j - 1] + mismatch_penalty:
        - Prepend seq1[i - 1] to align1.
        - Prepend seq2[j - 1] to align2.
        - Decrement i and j by 1.
    c. Else if the current score at score_matrix[i][j] is equal to score_matrix[i - 1][j] + gap_penalty:
        - Prepend seq1[i - 1] to align1.
        - Prepend a gap symbol "-" to align2.
        - Decrement i by 1.
    d. Else:
        - Prepend a gap symbol "-" to align1.
        - Prepend seq2[j - 1] to align2.
        - Decrement j by 1.
5. While i > 0:
    - Prepend seq1[i - 1] to align1.
    - Prepend a gap symbol "-" to align2.
    - Decrement i by 1.
6. While j > 0:
    - Prepend a gap symbol "-" to align1.
    - Prepend seq2[j - 1] to align2.
    - Decrement j by 1.
7. Return align1 and align2 as the aligned sequences.
Code
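def needleman_wunsch(seq1, seq2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    rows = len(seq1) + 1
    cols = len(seq2) + 1
    score_matrix = [[0] * cols for _ in range(rows)]  # rows x cols matrix of zeros
    for i in range(1, rows):  # first column: gaps only
        score_matrix[i][0] = gap_penalty * i
    for j in range(1, cols):  # first row: gaps only
        score_matrix[0][j] = gap_penalty * j
    for i in range(1, rows):  # fill the remaining cells
        for j in range(1, cols):
            if seq1[i - 1] == seq2[j - 1]:
                match = score_matrix[i - 1][j - 1] + match_score
            else:
                match = score_matrix[i - 1][j - 1] + mismatch_penalty
            delete = score_matrix[i - 1][j] + gap_penalty
            insert = score_matrix[i][j - 1] + gap_penalty
            score_matrix[i][j] = max(match, delete, insert)
    for row in score_matrix:
        print(row)
    # Traceback to find the optimal alignment
    align1 = ""
    align2 = ""
    i, j = rows - 1, cols - 1
    while i > 0 and j > 0:
        if (seq1[i - 1] == seq2[j - 1]  # match or mismatch: move diagonally
                or score_matrix[i][j] == score_matrix[i - 1][j - 1] + mismatch_penalty):
            align1 = seq1[i - 1] + align1
            align2 = seq2[j - 1] + align2
            i -= 1
            j -= 1
        elif score_matrix[i][j] == score_matrix[i - 1][j] + gap_penalty:  # gap in seq2
            align1 = seq1[i - 1] + align1
            align2 = "-" + align2
            i -= 1
        else:  # gap in seq1
            align1 = "-" + align1
            align2 = seq2[j - 1] + align2
            j -= 1
    while i > 0:  # leftover characters of seq1
        align1 = seq1[i - 1] + align1
        align2 = "-" + align2
        i -= 1
    while j > 0:  # leftover characters of seq2
        align1 = "-" + align1
        align2 = seq2[j - 1] + align2
        j -= 1
    return align1, align2
# Example usage
sequence1 = "CGTATT"
sequence2 = "GACTTT"
alignment1, alignment2 = needleman_wunsch(sequence1, sequence2)
print(alignment1, alignment2, sep="\n")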
Matrix
We have two sequences:
Sequence1 = CGTATT
Sequence2 = GACTTT
Match score = 1
Mismatch penalty = -1
Gap penalty = -2
          G    A    C    T    T    T
     0   -2   -4   -6   -8  -10  -12
C   -2   -1   -3   -3   -5   -7   -9
G   -4   -1   -2   -4   -4   -6   -8
T   -6   -3   -2   -3   -3   -3   -5
A   -8   -5   -2   -3   -4   -4   -4
T  -10   -7   -4   -3   -2   -3   -3
T  -12   -9   -6   -5   -2   -1   -2
Sequence1 = C G T A T T
Sequence2 = G A C T T T
Score = -2
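The score can be checked by adding up the columns of the alignment one by one. The following is a small sketch for that check; the function name alignment_score is an assumption for this illustration, and the default scores are the ones listed above.

```python
def alignment_score(align1, align2,
                    match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Sum the column scores of two aligned sequences of equal length.

    Hypothetical helper used for illustration only.
    """
    if len(align1) != len(align2):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for a, b in zip(align1, align2):
        if a == "-" or b == "-":
            score += gap_penalty       # one side is a gap
        elif a == b:
            score += match_score       # identical characters
        else:
            score += mismatch_penalty  # different characters
    return score


# Four mismatches (-4) and two matches (+2) give -2.
print(alignment_score("CGTATT", "GACTTT"))  # -2
```

This agrees with the value in the bottom-right cell of the matrix.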