Bioinformatics Lab # 6

This document outlines a lab exercise focused on using the BLAST program to identify unknown protein and nucleotide sequences. It includes tasks such as determining the protein name, query length, number of sequences in the database, and E-value for a given protein sequence, as well as finding details for a nucleotide sequence. The E-value is explained as a measure of significance in sequence alignment, with lower values indicating more reliable matches.

Uploaded by

salmanalishba980
Bioinformatics: Lab Exercise #6

BLAST EXERCISE: Identification of unknown protein and nucleotide sequences through BLAST

1. Given a protein sequence, use the BLAST program to find out which protein it is and
which organism it comes from.
MERGVRRGAALVAAWRSLWERGGLALFRPQCRTGCGACRVQGTRPFSLSAAASAVLG
LGSWGGDSGKQKLTLQDVAELIRKKECRRVVVMAGAGISTPSGIPDFRSPGSGLYSNLE
QYNIPYPEAIFELAYFFINPKPFFTLAKELYPGNYRPNYAHYFLRLLHDKGLLLRLYTQNI
DGLERVAGIPPDRLVEAHGTFATATCTVCRRKFPGEDFRGDVMADKVPHCRVCTGIVK
PDIVFFGEELPQRFFLHMTDFPMADLLFVIGTSLEVEPFASLAGAVRNSVPRVLINR
DLVGPFAWQQRYNDIAQLGDVVTGVEKMVELLDWNEEMQTLIQKEKEKLDAKDK
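
Before submitting the sequence, it can help to strip the line breaks and count the residues, so the length can be compared with the query length that BLAST reports. The following is a minimal Python sketch under that assumption; the helper names clean_sequence and to_fasta and the record name unknown_protein are illustrative and not part of the lab handout.

```python
# Protein query copied from the exercise, with its original line breaks.
RAW_QUERY = """
MERGVRRGAALVAAWRSLWERGGLALFRPQCRTGCGACRVQGTRPFSLSAAASAVLG
LGSWGGDSGKQKLTLQDVAELIRKKECRRVVVMAGAGISTPSGIPDFRSPGSGLYSNLE
QYNIPYPEAIFELAYFFINPKPFFTLAKELYPGNYRPNYAHYFLRLLHDKGLLLRLYTQNI
DGLERVAGIPPDRLVEAHGTFATATCTVCRRKFPGEDFRGDVMADKVPHCRVCTGIVK
PDIVFFGEELPQRFFLHMTDFPMADLLFVIGTSLEVEPFASLAGAVRNSVPRVLINR
DLVGPFAWQQRYNDIAQLGDVVTGVEKMVELLDWNEEMQTLIQKEKEKLDAKDK
"""


def clean_sequence(raw: str) -> str:
    """Remove all whitespace (including line breaks) and upper-case the residues."""
    return "".join(raw.split()).upper()


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record with lines of at most `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines)


query = clean_sequence(RAW_QUERY)
print("Residues in query:", len(query))
print(to_fasta("unknown_protein", query))  # record name is a placeholder
```

The printed residue count should agree with the "Query Length" field on the BLAST results page; a mismatch usually means part of the sequence was lost when copying.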
From the BLAST results, find:
• What is the name of the protein with this sequence?

• What is the length of the query sequence?

Query Length: 346

• What is the number of sequences in the database searched?

• What is the E-value?

The E-value (Expect value) is a statistical measure used in sequence database searches (such as
BLAST). It is the number of alignments scoring at least as well as the observed one that you
would expect to see by chance when searching a database of that size.
• A low E-value (close to 0) indicates a highly significant match, meaning the alignment
is unlikely to have occurred by chance.
• A high E-value suggests the match could be due to random chance and is less significant.
In summary, the E-value helps assess the reliability of an alignment result:
• E-value ≈ 0: strong, significant match.
• Larger E-value (approaching or exceeding 1): weaker, less significant match.
The lower the E-value, the more reliable the match between the sequences. Because the E-value
scales with the query length and the database size, the same alignment can receive different
E-values in different searches.
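
To make the relationship concrete, the sketch below uses the standard approximation E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the total database length. BLAST itself applies effective (edge-corrected) lengths, so reported values will differ somewhat. The function name and the database size are assumptions for illustration only, not values from this exercise.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value: E = m * n * 2**(-S') for bit score S'.

    m is the query length and n the total database length in residues.
    Real BLAST output uses effective lengths, so this is only an estimate.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


QUERY_LEN = 346            # query length stated in the exercise
ASSUMED_DB_LEN = 10 ** 8   # hypothetical database size, for illustration only

for score in (20, 40, 80):  # hypothetical bit scores
    e_value = expected_chance_hits(score, QUERY_LEN, ASSUMED_DB_LEN)
    print(f"bit score {score:>3}: E ~ {e_value:.3g}")
```

Each additional bit of score halves the E-value, while doubling the database size doubles it, which is why E-values are only comparable between searches against the same database.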

2. Given a nucleotide sequence, use BLAST to find details about it, such as the
organism, name, and accession ID.

Nucleotide Sequence

AGTGCCGCGCGTCGAGCGGAGCAGAGGAGGCGAGGGCGGAGGGCCAGAGAGGCAGTTGGAAGATGGCGGAC
GAGGTGGCGCTCGCCCTTCAGGCCGCCGGCTCCCCTTCCGCGGCGGCCGCCATGGAGGCCGCGTCGCAGCCGGC
GGACGAGCCGCTCCGCAAGAGGCCCCGCCGAGACGGGCCTGGCCTCGGGCGCAGCCCGGGCGAGCCGAGCGCA
GCAGTGGCGCCGGCGGCCGCGGGGTGTGAGGCGGCGAGCGCCGCGGCCCCGGCGGCGCTGTGGCGGGAGGCG
GCAGGGGCGGCGGCGAGCGCGGAGCGGGAGGCCCCGGCGACGGCCGTGGCCGGGGACGGAGACAATGGGTCC
GGCCTGCGGCGGGAGCCGAGGGCGGCTGACGACTTCGACGACGACGAGGGCGAGGAGGAGGACGAGGCGGCG
GCGGCAGCGGCGGCGGCAGCGATCGGCTACCGAGACAACCTCCTGTTGACCGATGGACTCCTCACTAATGGCTTT
CATTCCTGTGAAAGTGATGACGATGACAGAACGTCACACGCCAGCTCTAGTGACTGGACTCCGCGGCCGCGGATA
GGTCCATATACTTTTGTTCAGCAACATCTCATGATTGGCACCGATCCTCGAACAATTCTTAAAGATTTATTACCAGA
AACAATTCCTCCACCTGAGCTGGATGATATGACGCTGTGGCAGATTGTTATTAATATCCTTTCAGAACCACCAAAGC
GGAAAAAAAGAAAAGATATCAATACAATTGAAGATGCTGTGAAGTTACTGCAGGAGTGTAAAAAGATAATAGTTC
TGACTGGAGCTGGGGTTTCTGTCTCCTGTGGGATTCCTGACTTCAGATCAAGAGACGGTATCTATGCTCGCCTTGC
GGTGGACTTCCCAGACCTCCCAGACCCTCAAGCCATGTTTGATATTGAGTATTTTAGAAAAGACCCAAGACCATTCT
TCAAGTTTGCAAAGGAAATATATCCCGGACAGTTCCAGCCGTCTCTGTGTCACAAATTCATAGCTTTGTCAGATAA
GGAAGGAAAACTACTTCGAAATTATACTCAAAATATAGATACCTTGGAGCAGGTTGCAGGAATCCAAAGGATCCT
TCAGTGTCATGGTTCCTTTGCAACAGCATCTTGCCTGATTTGTAAATACAAAGTTGATTGTGAAGCTGTTCGTGGAG
ACATTTTTAATCAGGTAGTTCCTCGGTGCCCTAGGTGCCCAGCTGATGAGCCACTTGCCATCATGAAGCCAGAGAT
TGTCTTCTTTGGTGAAAACTTACCAGAACAGTTTCATAGAGCCATGAAGTATGACAAAGATGAAGTTGACCTCCTC
ATTGTTATTGGATCTTCTCTGAAAGTGAGACCAGTAGCACTAATTCCAAGTTCTATACCCCATGAAGTGCCTCAAAT
ATTAATAAATAGGGAACCTTTGCCTCATCTACATTTTGATGTAGAGCTCCTTGGAGACTGCGATGTTATAATTAATG
AGTTGTGTCATAGGCTAGGTGGTGAATATGCCAAACTTTGTTGTAACCCTGTAAAGCTTTCAGAAATTACTGAAAA
ACCTCCACGCCCACAAAAGGAATTGGTTCATTTATCAGAGTTGCCACCAACACCTCTTCATATTTCGGAAGACTCAA
GTTCACCTGAAAGAACTGTACCACAAGACTCTTCTGTGATTGCTACACTTGTAGACCAAGCAACAAACAACAATGT
TAATGATTTAGAAGTATCTGAATCAAGTTGTGTGGAAGAAAAACCACAAGAAGTACAGACTAGTAGGAATGTTGA
GAACATTAATGTGGAAAATCCAGATTTTAAGGCTGTTGGTTCCAGTACTGCAGACAAAAATGAAAGAACTTCAGTT
GCAGAAACAGTGAGAAAATGCTGGCCTAATAGACTTGCAAAGGAGCAGATTAGTAAGCGGCTTGAGGGTAATCA
ATACCTGTTTGTACCACCAAATCGTTACATATTCCACGGTGCTGAGGTATACTCAGACTCTGAAGATGACGTCTTGT
CCTCTAGTTCCTGTGGCAGTAACAGTGACAGTGGCACATGCCAGAGTCCAAGTTTAGAAGAACCCTTGGAAGATG
AAAGTGAAATTGAAGAATTCTACAATGGCTTGGAAGATGATACGGAGAGGCCCGAATGTGCTGGAGGATCTGGA
TTTGGAGCTGATGGAGGGGATCAAGAGGTTGTTAATGAAGCTATAGCTACAAGACAGGAATTGACAGATGTAAA
CTATCCATCAGACAAATCATAACACTATTGAAGCTGTCCGGATTCAGGAATTGCTCCACCAGCATTGGGAACTTTA
GCATGTCAAAAAATGAATGTTTACTTGTGAACTTGAACAAGGAAATCTGAAAGATGTATTATTTATAGACTGGAAA
ATAGATTGTCTTCTTGGATAATTTCTAAAGTTCCATCATTTCTGTTTGTACTTGTACATTCAACACTGTTGGTTGACTT
CATCTTCCTTTCAAGGTTCATTTGTATGATACATTCGTATGTATGTATAATTTTGTTTTTTGCCTAATGAGTTTCAACC
TTTTAAAGTTTTCAAAAGCCATTGGAATGTTAATGTAAAGGGAACAGCTTATCTAGACCAAAGAATGGTATTTCAC
ACTTTTTTGTTTGTAACATTGAATAGTTTAAAGCCCTCAATTTCTGTTCTGCTGAACTTTTATTTTTAGGACAGTTAAC
TTTTTAAACACTGGCATTTTCCAAAACTTGTGGCAGCTAACTTTTTAAAATCACAGATGACTTGTAATGTGAGGAGT
CAGCACCGTGTCTGGAGCACTCAAAACTTGGTGCTCAGTGTGTGAAGCGTACTTACTGCATCGTTTTTGTACTTGCT
GCAGACGTGGTAATGTCCAAACAGGCCCCTGAGACTAATCTGATAAATGATTTGGAAATGTGTTTCAGTTGTTCTA
GAAACAATAGTGCCTGTCTATATAGGTCCCCTTAGTTTGAATATTTGCCATTGTTTAATTAAATACCTATCACTGTGG
TAGAGCCTGCATAGATCTTCACCACAAATACTGCCAAGATGTGAATATGCAAAGCCTTTCTGAATCTAATAATGGT
ACTTCTACTGGGGAGAGTGTAATATTTTGGACTGCTGTTTTTCCATTAATGAGGAAAGCAATAGGCCTCTTAATTAA
AGTCCCAAAGTCATAAGATAAATTGTAGCTCAACCAGAAAGTACACTGTTGCCTGTTGAGGATTTGGTGTAATGTA
TCCCAAGGTGTTAGCCTTGTATTATGGAGATGAATACAGATCCAATAGTCAAATGAAACTAGTTCTTAGTTATTTAA
AAGCTTAGCTTGCCTTAAAACTAGGGATCAATTTTCTCAACTGCAGAAACTTTTAGCCTTTCAAACAGTTCACACCT
CAGAAAGTCAGTATTTATTTTACAGACTTCTTTGGAACATTGCCCCCAAATTTAAATATTCATGTGGGTTTAGTATTT
ATTACAAAAAAATGATTTGAAATATAGCTGTTCTTTATGCATAAAATACCCAGTTAGGACCATTACTGCCAGAGGA
GAAAAGTATTAAGTAGCTCATTTCCCTACCTAAAAGATAACTGAATTTATTTGGCTACACTAAAGAATGCAGTATAT
TTAGTTTTCCATTTGCATGATGTGTTTGTGCTATAGACAATATTTTAAATTGAAAAATTTGTTTTAAATTATTTTTACA
GTGAAGACTGTTTTCAGCTCTTTTTATATTGTACATAGACTTTTATGTAATCTGGCATATGTTTTGTAGACCGTTTAA
TGACTGGATTATCTTCCTCCAACTTTTGAAATACAAAAACAGTGTTTTATACTTGTATCTTGTTTTAAAGTCTTATATT
AAAATTGTCATTTGACTTTTTTCCCGTTAAAAAAAAAAAAAAA

http://vlab.amrita.edu/?sub=3&brch=274&sim=1434&cnt=5
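
Protein queries such as the one in task 1 are normally searched with blastp, and nucleotide queries such as the one in task 2 with blastn. The sketch below is a rough heuristic for telling the two apart from the letters in a pasted sequence. The function name and the 0.9 threshold are assumptions made for illustration, not part of BLAST.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_blast_program(sequence: str, threshold: float = 0.9) -> str:
    """Return "blastn" if the sequence looks like DNA/RNA, otherwise "blastp".

    Heuristic: a sequence counts as nucleotide when at least `threshold`
    of its characters are A, C, G, T, U or N.
    """
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    nucleotide_fraction = sum(ch in NUCLEOTIDE_LETTERS for ch in seq) / len(seq)
    return "blastn" if nucleotide_fraction >= threshold else "blastp"


# First few characters of the two queries in this exercise.
print(guess_blast_program("AGTGCCGCGCGTCGAGCGGAGCAGAGGAGG"))
print(guess_blast_program("MERGVRRGAALVAAWRSLWERGGLALFRPQ"))
```

A very short or unusually composed protein could still be misclassified, so the result is only a hint for choosing the search page.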

3. Find the details regarding the sequence, such as organism, name, accession ID, etc., using BLAST.
