Bioinformatics HW1
sequence from 5' to 3' direction, replacing T by U to find the RNA sequence.
ATGAGCATGTCGAAACATGATTCACCGATCATCATCAAATTACTAATGTGCTGGAACTATATGCTATT
TACGTGAGACAATGACCCACATTAACCGGCCAATGGGA
ATGAGCATGCGGCTAGACATGTCGAAACATGATTCACCGATCATCATCAAATTACTAATGTGCTGGA
ACTATATGCTATTTACGTGAGACAATGACCCACATTAACCGGCCAATGGGA
AUGAGCAUGCGGCUAGACAUGUCGAAACAUGAUUCACCGAUCAUCAUCAAAUUACUAAUGUGCUGGA
ACUAUAUGCUAUUUACGUGAGACAAUGACCCACAUUAACCGGCCAAUGGGA
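The T-to-U replacement can be reproduced with a few lines of Python. This is a minimal sketch, assuming the modified DNA shown above is the coding strand written 5' to 3'; the function name `transcribe` is illustrative and not part of the assignment.

```python
def transcribe(dna: str) -> str:
    """Return the RNA version of a coding-strand DNA sequence (5' to 3')."""
    # Drop whitespace left over from line wrapping, then swap every T for U.
    cleaned = "".join(dna.split()).upper()
    return cleaned.replace("T", "U")


# Modified DNA sequence copied from the text above (two wrapped lines joined).
modified_dna = (
    "ATGAGCATGCGGCTAGACATGTCGAAACATGATTCACCGATCATCATCAAATTACTAATGTGCTGGA"
    "ACTATATGCTATTTACGTGAGACAATGACCCACATTAACCGGCCAATGGGA"
)
rna = transcribe(modified_dna)
print(rna)
```

The output should contain no T and have the same length as the input, which is a quick sanity check on a hand transcription.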
The first codons of the RNA sequence, read in frame from the initial AUG, are listed below (the list is not
exhaustive; continue as far as your analysis requires):
AUG AGC AUG CGG CUA GAC AUG UCG AAA CAU GAU UCA CCG AUC AUC AUC AAA UUA
CUA AUG UGC UGG AAC UAU...
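Splitting by hand is error-prone, so the grouping into triplets can be checked programmatically. The sketch below assumes the reading frame starts at position 0 (the first AUG); `split_codons` is a hypothetical helper name, and only the first 48 bases of the RNA are used to keep the example short.

```python
def split_codons(rna: str) -> list[str]:
    """Split an RNA string into complete, non-overlapping triplets from position 0."""
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# First 48 bases of the RNA sequence above (16 complete codons).
rna_prefix = "AUGAGCAUGCGGCUAGACAUGUCGAAACAUGAUUCACCGAUCAUCAUC"
codons = split_codons(rna_prefix)
print(" ".join(codons))
```

Passing the full RNA string instead of the prefix gives the complete codon list for frame 0.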
5. Determine the Amino Acids Produced
Using a codon chart, we translate the RNA into amino acids. The first few are as follows:
AUG = Methionine (M)
AGC = Serine (S)
AUG = Methionine (M)
CGG = Arginine (R)
CUA = Leucine (L)
GAC = Aspartic acid (D)
AUG = Methionine (M)
UCG = Serine (S)
So, the protein sequence for the first portion of the RNA is:
M-S-M-R-L-D-M-S-K-H...
You can continue this process to identify the entire amino acid sequence.
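The codon-chart lookup can also be automated. The following is a sketch under stated assumptions: it uses the standard genetic code, begins at the first AUG it finds, and stops at the first in-frame stop codon. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the sequence U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna: str) -> str:
    """Translate from the first AUG up to (not including) the first in-frame stop."""
    start = rna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons of the RNA sequence discussed above.
first_ten = "AUGAGCAUGCGGCUAGACAUGUCGAAACAU"
print("-".join(translate(first_ten)))  # M-S-M-R-L-D-M-S-K-H
```

Feeding the full RNA string to `translate` continues the same process through to the first stop codon.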
7. Mutation Analysis
Original sequence: ATTCACCGATCATCATCAAATT
Mutated sequence: TATGCTATTTAAATGCCACAAT
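Read from its first base, the mutated sequence contains a STOP codon (TAA, i.e. UAA in the RNA) as its
fourth triplet, while the original sequence contains none. If that triplet lies in the reading frame,
translation ends there, so the protein will be truncated and likely non-functional.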
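Whether a stop codon actually truncates the protein depends on the reading frame, so it helps to scan each frame of both segments. This is a minimal sketch that works on the two DNA segments exactly as given; the helper `first_stop` is hypothetical, and which frame corresponds to the gene's true reading frame is an assumption the code does not resolve.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA equivalents of UAA, UAG, UGA


def first_stop(dna: str, frame: int = 0):
    """Return the 0-based index of the first stop codon in the given frame, or None."""
    for i in range(frame, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return i
    return None


original = "ATTCACCGATCATCATCAAATT"
mutated = "TATGCTATTTAAATGCCACAAT"

for frame in range(3):
    print(frame, first_stop(original, frame), first_stop(mutated, frame))
```

For these two segments the scan reports a stop only in frame 0 of the mutated sequence (index 9, the triplet TAA).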
GTCATGCCG
This substring is common to both sequences, and its length is 9 nucleotides.
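A longest common substring can be found with a standard dynamic-programming scan. The sketch below is one common approach, not necessarily the method used for the result above. The two input strings are hypothetical placeholders built around GTCATGCCG, because the sequences that were compared are not reproduced in this section.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest contiguous substring shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical inputs for illustration only; substitute the real sequences.
seq_a = "AAGTCATGCCGTT"
seq_b = "CCGTCATGCCGAA"
result = longest_common_substring(seq_a, seq_b)
print(result, len(result))  # GTCATGCCG 9
```

The scan takes time proportional to the product of the two sequence lengths, which is fine for sequences of a few hundred nucleotides.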
Summary:
Modified RNA Sequence: AUGAGCAUGCGGCUAGACAUGUCGAAACAUGAUUCACCGAUCAUCAUCAA...
Codons Identified: Translated into amino acids starting from Methionine (AUG)
and continuing.
Amino Acid Sequence: M-S-M-R-L-D-M-S-K-H...
Protein Prediction: Based on start and stop codons, at least one valid protein is
identified.
Mutation Effect: Mutation introduces an early stop codon, resulting in truncated
proteins.
Longest Common Substring: GTCATGCCG