Bioinformatics Biological Database
DATABASES
A database is an organized collection of data.
For instance, a list of some of the movies that we like would be a movie database.
•Entities: The kind of things that we want to store in a database. E.g.: Genes, DNA
sequences, bibliographical references.
•Records: The particular things stored in the database. E.g.: The gene BRCA1
•Identifiers or keys: The unique name that identifies a record
•Fields: The properties that an entity has. E.g.: The name, sequence and mutations of
the gene
If the different entities are stored in different
tables and the records in those tables are related
by their unique identifiers, the structure
constitutes a relational database.
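The terms above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, the identifier GENE0001 and the placeholder reference titles are all hypothetical and chosen only for illustration; they do not come from any real biological database.

```python
import sqlite3

# In-memory database; every table, column and value here is illustrative.
conn = sqlite3.connect(":memory:")

# One table per entity; each column is a field of that entity.
conn.execute("CREATE TABLE gene (gene_id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE reference ("
    " ref_id INTEGER PRIMARY KEY,"
    " gene_id TEXT NOT NULL REFERENCES gene(gene_id),"
    " title TEXT NOT NULL)"
)

# Each row is a record; gene_id is the identifier (key) of a gene record.
conn.execute("INSERT INTO gene VALUES (?, ?)", ("GENE0001", "BRCA1"))
conn.executemany(
    "INSERT INTO reference (gene_id, title) VALUES (?, ?)",
    [("GENE0001", "Placeholder title A"), ("GENE0001", "Placeholder title B")],
)

# Records in the two tables are related through the shared identifier.
rows = conn.execute(
    "SELECT g.name, r.title FROM gene g "
    "JOIN reference r ON r.gene_id = g.gene_id ORDER BY r.ref_id"
).fetchall()
print(rows)
```

Keeping genes and references in separate tables means a gene's name is stored once, however many references point to it.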
For example, UniProt accepts primary sequences derived from peptide sequencing
experiments.
However, UniProt also infers peptide sequences from genomic information, and it
provides a wealth of additional information, some derived from
automated annotation (TrEMBL) and even more from careful manual analysis
(Swiss-Prot).
4. Specialized databases are those that cater to a particular research
interest.
For example, FlyBase, the HIV Sequence Database, and the Ribosomal Database Project are
databases that specialize in a particular organism or a particular type of data.
Primary database vs. secondary database
This database has been developed and is maintained at the NCBI, Bethesda,
MD, USA, as part of the International Nucleotide Sequence Database Collaboration (INSDC).
•KEGG PATHWAY Database contains graphical pathway maps for all known
metabolic pathways from various organisms.
•GOLD (Genomes Online Database at the University of Illinois, USA) contains a list
of all the complete and ongoing genome projects worldwide.
•TIGR Database (TDB), at The Institute for Genomic Research in Rockville, MD, USA.
Virological databases
•Escherichia coli: E. coli Genome Center (University of Wisconsin, USA), The E. coli Index (University of
Birmingham, UK)
•Danio rerio (zebrafish): ZFIN (Zebrafish Information Network at the University of Oregon, USA)
•Saccharomyces cerevisiae (baker's yeast): SGD (Saccharomyces Genome Database at Stanford, USA)
Annotation of Gene:
In molecular biology, the genome is the basic genetic material of an organism and
typically consists of DNA.
The genome includes the genes (coding regions) and the non-coding regions.
The coding regions are of particular interest because they actively influence the basic life processes.
The genes contain the biological information required to build and
maintain an organism. Gene annotation can be defined simply as the process of
making a nucleotide sequence meaningful.
Gene annotation involves taking the raw DNA sequence produced by
genome sequencing projects and adding the layers of analysis and interpretation
necessary to extract biologically significant information and place the derived
details into context.
Annotation is the process by which pertinent information about these raw DNA
sequences is added to the databases.
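The idea of adding a layer of interpretation on top of a raw sequence can be sketched with a toy example. The classes Feature and AnnotatedSequence, the function find_first_orf and the short test sequence are all hypothetical; real annotation pipelines use far more elaborate gene prediction and evidence from other databases. The sketch only scans the forward strand for the first ATG that is followed by an in-frame stop codon and records the result as a feature.

```python
from dataclasses import dataclass, field

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Feature:
    kind: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    note: str = ""


@dataclass
class AnnotatedSequence:
    raw: str                                      # the raw DNA sequence
    features: list = field(default_factory=list)  # the annotation layer


def find_first_orf(seq):
    """Return (start, end) of the first ATG...stop reading frame, or None."""
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG, looking for a stop.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find("ATG", start + 1)
    return None


# Toy sequence, made up for this example.
record = AnnotatedSequence(raw="CCATGAAATTTGGGTAACC")
orf = find_first_orf(record.raw)
if orf is not None:
    record.features.append(
        Feature("putative_ORF", orf[0], orf[1], "forward strand, toy prediction")
    )
print(record.features)
```

The raw sequence is left unchanged; the annotation lives alongside it as a list of features with coordinates, which is roughly how feature tables in sequence database entries relate to the sequence they describe.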
Accession number