Lecture 3

The document discusses biological sequence databases. It notes that DNA contains A, C, T, and G nucleotides while proteins contain 20 different amino acids. Because the number of sequences is so large, it is impractical to keep this information on a single computer. Therefore, public databases like GenBank and UniProt were created to store DNA, RNA, and protein sequences online for researchers. GenBank, hosted at NCBI, allows users to search sequences by name, ID, species, locus, or other attributes. Similarly, UniProt can be used to search protein sequences.

Uploaded by

SUNDAS FATIMA
Copyright
© All Rights Reserved

Bioinformatics

Lecture 3
STORAGE OF BIOLOGICAL SEQUENCE INFORMATION
• We know that a DNA sequence is made up of the nucleotides A, C, T and G,
and an RNA sequence of A, C, U and G. A protein sequence is made up of
A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y and V, the one-letter
codes of the 20 standard amino acids that compose a protein.
• When DNA and RNA (for example mRNA) are sequenced in the lab, the
resulting sequences contain a large number of nucleotides in varied order.
• Protein sequences are likewise long: each contains a large number of
amino acid residues (not bases), as proteins are complex in nature.
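The three alphabets above can be written down directly as sets of letters. The following is a minimal Python sketch, not part of the lecture, that checks which alphabets a given sequence is consistent with. The function name `possible_types` is hypothetical and chosen only for illustration; the protein set uses the 20 standard one-letter amino acid codes.

```python
# One-letter alphabets for the three kinds of biological sequence.
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ARNDCEQGHILKMFPSTWYV")  # 20 standard amino acids


def possible_types(sequence):
    """Return the names of every alphabet that contains all letters of `sequence`.

    `possible_types` is an illustrative helper, not a library function.
    """
    letters = set(sequence.upper())
    alphabets = {
        "DNA": DNA_ALPHABET,
        "RNA": RNA_ALPHABET,
        "protein": PROTEIN_ALPHABET,
    }
    # A sequence fits an alphabet when its letters are a subset of it.
    return [name for name, alphabet in alphabets.items() if letters <= alphabet]


print(possible_types("ACGU"))   # U occurs only in the RNA alphabet
print(possible_types("ACGT"))   # A, C, G and T are also amino acid codes
```

Note that a short string such as ACGT fits both the DNA and the protein alphabet, so the letters alone do not always identify the sequence type. This is one reason database records state the molecule type explicitly.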
Biological Databases
• Such a large volume of sequence data is impractical to store and keep
up to date on a single computer, so the solution lies in public sequence
databases. For DNA and RNA the public database is GenBank (by NCBI).
• For proteins the public database is UniProt (by the UniProt Consortium).
• Both GenBank and UniProt are online databases, and the DNA, RNA
and protein sequences they hold are available online to the public and
to researchers.
Biological Databases
• GenBank is an online database where researchers can access DNA and
RNA sequences, along with the protein translations annotated on them.
• To find a sequence, we go online to the NCBI GenBank website, which is
a public database site:
• www.ncbi.nlm.nih.gov/genbank
• For example, suppose we want to find the sequence for
Mycobacterium tuberculosis, a species of pathogenic bacteria and
the causative agent of tuberculosis (TB).
Biological Databases
• Sequences can be searched in GenBank by typing any of the following:
• Sequence name
• ID
• Name
• Species
• Locus
• Accession Number
• Author
• Journal
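The same kind of field-restricted search can also be expressed programmatically. The sketch below is an illustration rather than part of the lecture: it assumes NCBI's E-utilities `esearch` endpoint and the Entrez convention of appending a field tag in square brackets (for example `[Organism]`, `[Accession]`, `[Author]`, `[Journal]`) to the search value. The helper name `build_search_url` is hypothetical, and the code only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint: the NCBI E-utilities search service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(value, field, database="nucleotide", max_results=5):
    """Build an esearch URL that restricts `value` to one Entrez field.

    `build_search_url` is an illustrative helper, not an NCBI function.
    """
    term = f"{value}[{field}]"  # e.g. "Mycobacterium tuberculosis[Organism]"
    params = {
        "db": database,         # which Entrez database to search
        "term": term,           # the field-tagged query
        "retmax": max_results,  # how many record IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Search by species, as in the Mycobacterium tuberculosis example.
url = build_search_url("Mycobacterium tuberculosis", "Organism")
print(url)
```

Opening the printed URL in a browser should return a list of matching record IDs. Swapping the field tag changes which attribute from the list above is searched.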
Biological Databases
• UniProt is a public database used to search for protein sequences.
• www.Uniprot.org
• For example, suppose we want to search for the sequence of human
insulin, a protein that plays an important role in managing blood sugar.
We go online to the website www.Uniprot.org and search for it there.
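Protein sequences are commonly downloaded from UniProt as FASTA text: a header line starting with ">" followed by one or more lines of sequence. The sketch below is an illustration, not part of the lecture. It assumes the UniProt REST address pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` and uses P01308, the UniProt accession for human insulin. The helper names `fasta_url` and `parse_fasta` are hypothetical, and the record parsed at the end is a made-up toy example, not real insulin data.

```python
# Assumed base address of the UniProt REST service.
UNIPROT_REST = "https://rest.uniprot.org/uniprotkb"


def fasta_url(accession):
    """Return the assumed FASTA download URL for one UniProt accession."""
    return f"{UNIPROT_REST}/{accession}.fasta"


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without breaks
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


print(fasta_url("P01308"))  # where the human insulin record would be fetched

# Toy record for illustration only (not a real protein).
toy = ">toy example protein\nMKV\nLLA\n"
print(parse_fasta(toy))
```

In practice the text returned from the URL would be passed to `parse_fasta` in place of the toy string.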
