Bioinfo

The document outlines the Needleman-Wunsch algorithm for sequence alignment, detailing the steps to create a scoring matrix and traceback for optimal alignment. It includes a Python implementation of the algorithm, which takes two sequences and scoring parameters as input, and produces aligned sequences. An example with sequences 'CGTATT' and 'GACTTT' is provided, along with the resulting scoring matrix and alignment score.

Uploaded by

Annaya Ch

Question

Write the algorithm and the code for the Needleman-Wunsch method, and use them to compare these two sequences: CGTATT and GACTTT.

Algorithm:
The Needleman-Wunsch algorithm, as implemented in Python, proceeds as follows.

1st Step
 Define a function that takes seq1 and seq2, together with the scoring parameters match score, mismatch penalty, and gap penalty.

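2nd Step
 Set the number of rows and columns (one extra of each for the empty prefix) and create the scoring matrix filled with zeros.
 rows = len(seq1) + 1
 cols = len(seq2) + 1
 score_matrix = a rows x cols matrix of zeros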

3rd Step
 Initialize the first column and the first row of the scoring matrix with cumulative gap penalties:
 For i from 1 to len(seq1):
     score_matrix[i][0] = gap_penalty * i
 For j from 1 to len(seq2):
     score_matrix[0][j] = gap_penalty * j
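
The initialization above can be sketched for the example sequences used later in this document. The helper name init_matrix is hypothetical and is used here only for illustration; the default gap penalty of -2 is the value the document uses.

```python
def init_matrix(seq1, seq2, gap_penalty=-2):
    """Return a (len(seq1)+1) x (len(seq2)+1) matrix with only the borders filled."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    matrix = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        matrix[i][0] = gap_penalty * i   # aligning seq1[:i] against nothing
    for j in range(1, cols):
        matrix[0][j] = gap_penalty * j   # aligning seq2[:j] against nothing
    return matrix

border = init_matrix("CGTATT", "GACTTT")
print(border[0])                    # first row
print([row[0] for row in border])   # first column
```

Both the first row and the first column run from 0 down to -12 in steps of -2, because each extra character aligned against nothing costs one more gap.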

4th Step
 Fill the remaining cells of the matrix:
 For i from 1 to rows - 1:
     For j from 1 to cols - 1:
         Calculate the scores for match/mismatch and for the two gap moves:
         If seq1[i-1] equals seq2[j-1]:
             match = score_matrix[i-1][j-1] + match_score
         Else:
             match = score_matrix[i-1][j-1] + mismatch_penalty
         delete = score_matrix[i-1][j] + gap_penalty
         insert = score_matrix[i][j-1] + gap_penalty
         Store the largest of the three, using the built-in max function:
         score_matrix[i][j] = max(match, delete, insert)
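
To make the cell update concrete, the sketch below computes a single cell from its three neighbours. The function name cell_score and its argument order are assumptions made for this illustration; the default scores are the ones used in this document.

```python
def cell_score(diag, up, left, a, b,
               match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Score of one cell given its diagonal, upper and left neighbours."""
    match = diag + (match_score if a == b else mismatch_penalty)
    delete = up + gap_penalty     # gap in the second sequence
    insert = left + gap_penalty   # gap in the first sequence
    return max(match, delete, insert)

# Cell (1, 1) of the example: 'C' against 'G', neighbours 0, -2 and -2.
print(cell_score(0, -2, -2, "C", "G"))    # -1

# Cell (3, 4) of the example: 'T' against 'T', neighbours -4, -4 and -3.
print(cell_score(-4, -4, -3, "T", "T"))   # -3
```

In both cases the diagonal move wins, since a gap move costs 2 on top of an already negative neighbour.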

5th Step
 Print the matrix row by row.
 for row in score_matrix:
     print(row)

6th Step
 1. Initialize two empty strings, align1 and align2, to store the aligned sequences.
 2. Set i to the row index of the bottom-right corner of the scoring matrix.
 3. Set j to the column index of the bottom-right corner of the scoring matrix.
 4. While i > 0 and j > 0:
     a. Let diagonal be match_score if seq1[i - 1] equals seq2[j - 1], otherwise mismatch_penalty. If score_matrix[i][j] equals score_matrix[i - 1][j - 1] + diagonal:
         - Prepend seq1[i - 1] to align1.
         - Prepend seq2[j - 1] to align2.
         - Decrement i and j by 1.
     b. Else if score_matrix[i][j] equals score_matrix[i - 1][j] + gap_penalty:
         - Prepend seq1[i - 1] to align1.
         - Prepend a gap symbol "-" to align2.
         - Decrement i by 1.
     c. Else:
         - Prepend a gap symbol "-" to align1.
         - Prepend seq2[j - 1] to align2.
         - Decrement j by 1.
 5. While i > 0:
     - Prepend seq1[i - 1] to align1.
     - Prepend a gap symbol "-" to align2.
     - Decrement i by 1.
 6. While j > 0:
     - Prepend a gap symbol "-" to align1.
     - Prepend seq2[j - 1] to align2.
     - Decrement j by 1.
 7. Return align1 and align2 as the aligned sequences.
 Checking the score in step 4a, rather than only comparing the two characters, keeps the traceback on the path that actually produced each cell.

Code
def needleman_wunsch(seq1, seq2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Globally align seq1 and seq2, print the scoring matrix, return the aligned strings."""
    # Create the scoring matrix
    rows = len(seq1) + 1
    cols = len(seq2) + 1
    score_matrix = [[0] * cols for _ in range(rows)]

    # Initialize the first row and column with gap penalties
    for i in range(1, rows):
        score_matrix[i][0] = gap_penalty * i
    for j in range(1, cols):
        score_matrix[0][j] = gap_penalty * j

    # Fill in the rest of the scoring matrix
    for i in range(1, rows):
        for j in range(1, cols):
            # Calculate the scores for match/mismatch and gap
            if seq1[i - 1] == seq2[j - 1]:
                match = score_matrix[i - 1][j - 1] + match_score
            else:
                match = score_matrix[i - 1][j - 1] + mismatch_penalty
            delete = score_matrix[i - 1][j] + gap_penalty
            insert = score_matrix[i][j - 1] + gap_penalty
            # Choose the maximum score
            score_matrix[i][j] = max(match, delete, insert)

    # Print the matrix
    for row in score_matrix:
        print(row)

    # Traceback to find the optimal alignment
    align1 = ""
    align2 = ""
    i, j = rows - 1, cols - 1
    while i > 0 and j > 0:
        # Score a diagonal move would have added for this pair of characters
        diagonal = match_score if seq1[i - 1] == seq2[j - 1] else mismatch_penalty
        if score_matrix[i][j] == score_matrix[i - 1][j - 1] + diagonal:
            # Match or mismatch: consume one character from each sequence
            align1 = seq1[i - 1] + align1
            align2 = seq2[j - 1] + align2
            i -= 1
            j -= 1
        elif score_matrix[i][j] == score_matrix[i - 1][j] + gap_penalty:
            # Gap in seq2
            align1 = seq1[i - 1] + align1
            align2 = "-" + align2
            i -= 1
        else:
            # Gap in seq1
            align1 = "-" + align1
            align2 = seq2[j - 1] + align2
            j -= 1

    # Consume whatever is left once one sequence is exhausted
    while i > 0:
        align1 = seq1[i - 1] + align1
        align2 = "-" + align2
        i -= 1
    while j > 0:
        align1 = "-" + align1
        align2 = seq2[j - 1] + align2
        j -= 1

    return align1, align2


# Example usage
sequence1 = "CGTATT"
sequence2 = "GACTTT"
alignment1, alignment2 = needleman_wunsch(sequence1, sequence2)
print("Alignment 1:", alignment1)
print("Alignment 2:", alignment2)

Matrix
We have two sequences and the following scoring parameters.
Sequence 1 = CGTATT (rows)
Sequence 2 = GACTTT (columns)
Match score = 1
Mismatch penalty = -1
Gap penalty = -2
         G   A   C   T   T   T
     0  -2  -4  -6  -8 -10 -12
C   -2  -1  -3  -3  -5  -7  -9
G   -4  -1  -2  -4  -4  -6  -8
T   -6  -3  -2  -3  -3  -3  -5
A   -8  -5  -2  -3  -4  -4  -4
T  -10  -7  -4  -3  -2  -3  -3
T  -12  -9  -6  -5  -2  -1  -2
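
A labelled table like this one can be produced with a short sketch such as the following. The helper names build_matrix and format_matrix, and the column width of 4, are assumptions chosen for this illustration; the fill rule is the one described in the 4th Step.

```python
def build_matrix(seq1, seq2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Fill the full scoring matrix for seq1 (rows) against seq2 (columns)."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    m = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        m[i][0] = gap_penalty * i
    for j in range(1, cols):
        m[0][j] = gap_penalty * j
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq1[i - 1] == seq2[j - 1]
            diag = m[i - 1][j - 1] + (match_score if same else mismatch_penalty)
            m[i][j] = max(diag, m[i - 1][j] + gap_penalty, m[i][j - 1] + gap_penalty)
    return m


def format_matrix(matrix, seq1, seq2, width=4):
    """Return the matrix as text with seq2 as column labels and seq1 as row labels."""
    header = " " * (2 + width) + "".join(f"{c:>{width}}" for c in seq2)
    lines = [header]
    # The first row has no letter: it corresponds to the empty prefix of seq1.
    for label, row in zip(" " + seq1, matrix):
        lines.append(f"{label:<2}" + "".join(f"{v:>{width}}" for v in row))
    return "\n".join(lines)


matrix = build_matrix("CGTATT", "GACTTT")
print(format_matrix(matrix, "CGTATT", "GACTTT"))
print("Score:", matrix[-1][-1])
```

The bottom-right cell holds the score of the optimal global alignment, which for these sequences and parameters is -2.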

Sequence 1 = C G T A T T
Sequence 2 = G A C T T T
Score = -2
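
The reported score can be checked by scoring the aligned pair column by column. This is a minimal sketch; the function name score_alignment is hypothetical, and it assumes the same scoring parameters as above, with "-" as the gap symbol.

```python
def score_alignment(align1, align2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
    """Sum the per-column scores of two aligned strings of equal length."""
    if len(align1) != len(align2):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(align1, align2):
        if a == "-" or b == "-":
            total += gap_penalty
        elif a == b:
            total += match_score
        else:
            total += mismatch_penalty
    return total

print(score_alignment("CGTATT", "GACTTT"))  # -2
```

The ungapped alignment has four mismatches and two matches, so 4 * (-1) + 2 * 1 = -2, which agrees with the bottom-right cell of the matrix.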
