Biological Databases

Biological Databases: Types and Importance

August 3, 2023 by Sagar Aryal


Edited By: Sagar Aryal
• One of the hallmarks of modern genomic research is the
generation of enormous amounts of raw sequence data.
• As the volume of genomic data grows, sophisticated computational
methodologies are required to manage the data deluge.
• Thus, the very first challenge in the genomics era is to store and
handle the staggering volume of information through the
establishment and use of computer databases.
• A biological database is a large, organized body of persistent data,
usually associated with computerized software designed to update,
query, and retrieve components of the data stored within the
system.
• A simple database might be a single file containing many records,
each of which includes the same set of information.
• The chief objective of the development of a database is to organize
data in a set of structured records to enable easy retrieval of
information.
Example. A few popular databases are GenBank from NCBI (National
Center for Biotechnology Information), SwissProt from the Swiss
Institute of Bioinformatics and PIR from the Protein Information
Resource.
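
The idea of a simple database as a single file of uniform records can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the accession numbers, the sequences, and the helper names `load_records` and `query` are hypothetical and do not reflect the format of GenBank or any other real database.

```python
import csv
import io

# Hypothetical flat file: one record per line, the same fields in every record.
# The accessions and sequences are made up for illustration.
FLAT_FILE = """accession,organism,sequence
X0001,Escherichia coli,ATGGCGTTAGC
X0002,Homo sapiens,ATGCCCGGGTTTAAA
X0003,Escherichia coli,ATGAAACGT
"""


def load_records(text):
    """Parse the flat file into a list of dicts, one dict per record."""
    return list(csv.DictReader(io.StringIO(text)))


def query(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


records = load_records(FLAT_FILE)
ecoli = query(records, organism="Escherichia coli")
print([record["accession"] for record in ecoli])  # ['X0001', 'X0003']
```

Because every record carries the same fields, a single generic `query` function can retrieve records by any of them, which is the "easy retrieval" the structured layout is meant to provide.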

Types of Biological Databases


Based on their contents, biological databases can be roughly divided
into two categories:
1. Primary databases
• Primary databases are also called archival databases.
• They are populated with experimentally derived data such as
nucleotide sequence, protein sequence or macromolecular
structure.
• Experimental results are submitted directly into the database by
researchers, and the data are essentially archival in nature.
• Once given a database accession number, the data in primary
databases are never changed: they form part of the scientific
record.
Examples
• ENA, GenBank and DDBJ (nucleotide sequence)
• ArrayExpress Archive and GEO (functional genomics data)
• Protein Data Bank (PDB; coordinates of three-dimensional
macromolecular structures)
2. Secondary databases
• Secondary databases comprise data derived from the results of
analysing primary data.
• Secondary databases often draw upon information from numerous
sources, including other databases (primary and secondary),
controlled vocabularies and the scientific literature.
• They are highly curated, often using a complex combination of
computational algorithms and manual analysis and interpretation
to derive new knowledge from the public record of science.
Examples
• InterPro (protein families, motifs and domains)
• UniProt Knowledgebase (sequence and functional information on
proteins)
• Ensembl (variation, function, regulation and more layered onto
whole genome sequences)
Many data resources, however, have both primary and secondary
characteristics. For example, UniProt accepts primary sequences
derived from peptide sequencing experiments. UniProt
also infers peptide sequences from genomic information, and it
provides a wealth of additional information, some derived from
automated annotation (TrEMBL), and even more from careful
manual analysis (SwissProt).

There are also specialized databases that cater to particular
research interests. For example, FlyBase, the HIV sequence database,
and the Ribosomal Database Project are databases that specialize in a
particular organism or a particular type of data.
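
The relationship between primary and secondary data described in this section can be sketched in Python. This is only a toy model under stated assumptions: the class names `PrimaryRecord` and `DerivedRecord`, the `derive` function, and the accessions and sequences are all hypothetical, and computing length and GC fraction stands in for the far more elaborate algorithms and manual curation that real secondary databases use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the archived record is not edited in place
class PrimaryRecord:
    accession: str
    sequence: str


@dataclass
class DerivedRecord:
    accession: str       # points back to the primary record it was derived from
    length: int
    gc_fraction: float


def derive(primary):
    """Compute simple derived annotations from a primary sequence record."""
    seq = primary.sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    return DerivedRecord(primary.accession, len(seq), gc / len(seq) if seq else 0.0)


# Hypothetical submissions; accessions and sequences are made up.
primary = [
    PrimaryRecord("X0001", "ATGGCGTTAGC"),
    PrimaryRecord("X0002", "ATATATAT"),
]

# The "secondary" layer holds only derived values, keyed by primary accession.
secondary = {record.accession: derive(record) for record in primary}
print(secondary["X0002"].length, secondary["X0002"].gc_fraction)  # 8 0.0
```

Keeping the derived values in a separate structure keyed by accession mirrors the separation in the text: the archived submission stays untouched, while the analysis layered on top of it can be recomputed or re-curated at any time.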
Importance of Databases
• Databases act as a storehouse of information.
• Databases are used to store and organize data in such a way that
information can be retrieved easily via a variety of search criteria.
• They allow knowledge discovery, which refers to the identification of
connections between pieces of information that were not known
when the information was first entered. This facilitates the
discovery of new biological insights from raw data.
• Secondary databases have become the molecular biologist’s
reference library over the past decade or so, providing a wealth of
information on just about any gene or gene product that has been
investigated by the research community.
• They make it possible for many users to access the same
data entries at the same time.
• They allow data to be indexed.
• They help to remove redundancy in the data.
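
Two of the points above, indexing and the removal of redundancy, can be illustrated with a short Python sketch. The records, field names, and the helper functions `build_index` and `remove_redundant` are hypothetical; treating two records as redundant when their sequences are identical is a simplifying assumption, not the rule any particular database applies.

```python
from collections import defaultdict

# Hypothetical records; accessions and sequences are made up for illustration.
records = [
    {"accession": "X0001", "organism": "Escherichia coli", "sequence": "ATGGCG"},
    {"accession": "X0002", "organism": "Homo sapiens", "sequence": "ATGCCC"},
    {"accession": "X0003", "organism": "Escherichia coli", "sequence": "ATGGCG"},
]


def build_index(records, field):
    """Map each value of `field` to the accessions of the records that have it."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record["accession"])
    return dict(index)


def remove_redundant(records):
    """Keep only the first record seen for each distinct sequence."""
    seen = set()
    unique = []
    for record in records:
        if record["sequence"] not in seen:
            seen.add(record["sequence"])
            unique.append(record)
    return unique


by_organism = build_index(records, "organism")
print(by_organism["Escherichia coli"])                       # ['X0001', 'X0003']
print([r["accession"] for r in remove_redundant(records)])   # ['X0001', 'X0002']
```

Once the index is built, a lookup by organism no longer needs to scan every record, which is what makes retrieval by different search criteria fast on large collections.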
