Database

The document provides an overview of biological databases, including their definitions, classifications, and purposes. It details various types of databases such as bibliographic, genome, sequence, DNA, protein, metabolic, disease, expression, and chemical databases, along with examples and their functions. Additionally, it discusses the submission processes for researchers to contribute their data to these databases.

Uploaded by

sadiquraga

BCH 418

PRINCIPLES OF BIOINFORMATICS

DR. A. Dandare
Dept. of Biochemistry & Molecular Biology
Usmanu Danfodiyo University Sokoto
Database
▪ A database is a computerized archive used to store and organize data in such a way
that information can be retrieved easily via a variety of search criteria.

▪ The chief objective of developing a database is to organize data into a set of
structured records so that information can be retrieved easily.

▪ Biological database: a collection of data that is structured, searchable, updated
periodically, and cross-referenced.
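
The idea of structured records retrieved by several search criteria can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table layout, the accession strings and the record contents below are hypothetical placeholders, not entries from any real biological database.

```python
import sqlite3

# Hypothetical records: (accession, organism, description, length in bases).
RECORDS = [
    ("X0001", "Homo sapiens", "insulin precursor mRNA", 465),
    ("X0002", "Mus musculus", "insulin precursor mRNA", 450),
    ("X0003", "Homo sapiens", "hemoglobin beta chain mRNA", 628),
]


def build_database():
    """Create an in-memory database holding one structured record per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry (accession TEXT PRIMARY KEY, organism TEXT, "
        "description TEXT, length INTEGER)"
    )
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", RECORDS)
    conn.commit()
    return conn


def search(conn, organism=None, keyword=None, min_length=None):
    """Return accessions matching every search criterion that was supplied."""
    clauses, params = [], []
    if organism is not None:
        clauses.append("organism = ?")
        params.append(organism)
    if keyword is not None:
        clauses.append("description LIKE ?")
        params.append(f"%{keyword}%")
    if min_length is not None:
        clauses.append("length >= ?")
        params.append(min_length)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    query = "SELECT accession FROM entry" + where + " ORDER BY accession"
    return [row[0] for row in conn.execute(query, params)]
```

Calling search(conn, organism="Homo sapiens", min_length=500) combines two criteria in one query, which is the kind of retrieval a flat list of files would not support directly.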
Biological databases are developed to perform several functions:

i. Biological databases aid in the organisation of biological experiments and analysis.

ii. Biological databases make biological data available to scientists in one place and
help them obtain data for their research and for cross-validation.

iii. Biological databases are available in computer-readable format and thus form the first
fundamental step of biological data analysis.
Classification of Biological Databases
Biological databases are broadly classified into nine categories based on the composition
of the data types.

1. Bibliographic Database
A bibliographic database is a scientific literature database consisting of numerous research
papers and articles from various journals.

PubMed, available at NCBI, is the most widely used bibliographic database. It is maintained by
the National Library of Medicine (NLM) and contains more than 12.8 million abstracts from
4,400 biomedical and biochemical journals.

MEDLINE is also an NLM premier bibliographic database, covering the fields of human medicine,
nursing, dentistry, veterinary medicine, health care systems, and pre-clinical science.
It covers 4,800 biomedical journals.
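
A PubMed search can also be issued programmatically through NCBI's Entrez E-utilities. The sketch below only builds the ESearch request URL and does not send it; the function name and the default of 20 results are choices made for this illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI Entrez E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def pubmed_search_url(term, max_results=20):
    """Build (but do not send) an ESearch request against the PubMed database."""
    params = {
        "db": "pubmed",          # database to search
        "term": term,            # the query text
        "retmax": max_results,   # maximum number of record IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


print(pubmed_search_url("bioinformatics databases", max_results=5))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching PubMed record identifiers.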
2. Genome databases
▪ Genome databases give comprehensive information on the heritable properties of an
organism. These databases help to identify genes and predict their functions.
A few genome databases have links with organism-specific databases.

▪ GOLD (Genomes Online Database at the University of Illinois, USA)
contains a list of all the complete and ongoing genome projects worldwide.

▪ Genomes at NCBI (National Center for Biotechnology Information, USA).

3. Sequence Databases

▪ The RefSeq database, for example, is an open-access, annotated and curated collection of publicly
available nucleotide sequences (DNA, RNA) and their protein products.

▪ The National Center for Biotechnology Information Reference Sequence (NCBI RefSeq) database
provides curated, non-redundant sequences of genomic regions, transcripts and proteins for
taxonomically diverse organisms including Archaea, Bacteria, Eukaryotes, and Viruses.

▪ The RefSeq database is derived from the sequence data available in the redundant archival
database GenBank. RefSeq records include annotation of coding regions, conserved domains,
variations, etc.

▪ Nucleic acid sequence databases include GenBank, EMBL-Bank (European Molecular Biology
Laboratory), DDBJ (DNA Data Bank of Japan), etc.

▪ Protein sequence databases include Entrez Protein, Swiss-Prot, Protein Data Bank (PDB),
Molecular Modelling Database (MMDB), Gene3D, and the EMBL Macromolecular Structure Database.
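
RefSeq records for genomic regions, transcripts and proteins can be told apart by their accession prefix. The sketch below maps a few of the common prefixes to the molecule type; the list is deliberately incomplete, and the function name is an assumption made for this example.

```python
# A few common RefSeq accession prefixes and the molecule type each denotes.
# This table is not exhaustive.
REFSEQ_PREFIXES = {
    "NC_": "complete genomic molecule",
    "NG_": "genomic region",
    "NM_": "mRNA transcript",
    "NR_": "non-coding RNA transcript",
    "NP_": "protein",
    "XM_": "predicted mRNA transcript",
    "XR_": "predicted non-coding RNA transcript",
    "XP_": "predicted protein",
}


def classify_refseq_accession(accession):
    """Return the molecule type for a RefSeq-style accession, or None otherwise."""
    prefix = accession[:3].upper()
    return REFSEQ_PREFIXES.get(prefix)


# Example accession strings; the last one has no RefSeq prefix.
for acc in ("NM_000546.6", "NP_000537.3", "U49845"):
    print(acc, "->", classify_refseq_accession(acc))
```

An accession without one of these prefixes, such as an archival GenBank accession, returns None, which mirrors the split between the curated RefSeq collection and the archive it is derived from.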
DNA Databases
▪ A DNA database centers on managing DNA data from many species or from a few
specific species.

▪ The primary functions of human DNA databases include establishing
the reference genome.

▪ A representative example of a DNA database is GenBank, a collection of
all publicly available DNA sequences.

▪ GenBank contains over 184 billion nucleotide bases in more than 179
million sequences.
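
The two GenBank figures quoted above give a rough feel for the size of a typical record. The calculation below is approximate only, because both figures are lower bounds ("over" and "more than").

```python
TOTAL_BASES = 184_000_000_000   # "over 184 billion nucleotide bases"
TOTAL_SEQUENCES = 179_000_000   # "more than 179 million sequences"


def mean_sequence_length(total_bases, total_sequences):
    """Average number of bases per sequence record."""
    return total_bases / total_sequences


average = mean_sequence_length(TOTAL_BASES, TOTAL_SEQUENCES)
print(f"Approximate mean record length: {average:.0f} bases")
```

The result is on the order of a thousand bases per record, which suggests that the bulk of the entries are short sequences rather than whole genomes.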
Protein Databases

▪ The purposes of constructing protein databases include the collection of universal
proteins, identification of protein families and domains, reconstruction of phylogenetic trees,
and profiling of protein structures.

▪ A representative example of a protein database is PDB, the main primary database for 3D
structures of biological macromolecules determined by X-ray crystallography and NMR.

▪ PDB contains more than 105,465 biological macromolecular structures, of which 27,393
entries belong to human (http://www.rcsb.org/pdb).

▪ Another example is the Universal Protein Resource (UniProt), a collaborative project between
EMBL-EBI, the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR).

▪ UniProt provides a comprehensive, high-quality, and freely accessible resource of protein
sequence and functional information.
4. Metabolic Database
It contains data on biological pathways and enzymes in different organisms.
Pathway databases
▪ Pathway databases contain biological pathways for metabolic, signalling, and
regulatory pathway analysis.

▪ A representative example is KEGG PATHWAY, a curated biological pathway resource on
molecular interaction and reaction networks.

▪ The KEGG PATHWAY database contains graphical pathway maps for all known metabolic
pathways from various organisms.

▪ KEGG PATHWAY integrates many entities that are stored in KEGG sibling databases,
including genes, proteins, RNAs, chemical compounds, and chemical reactions.
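
The way a pathway entry integrates entities held in sibling databases can be sketched as a record that stores only identifiers and looks the details up elsewhere. The class, the two small dictionaries and the sample entries below are assumptions made for illustration; they are written in KEGG's identifier style but are not a copy of KEGG's data model or API.

```python
from dataclasses import dataclass, field

# Stand-ins for sibling databases, keyed by identifier (illustrative sample data).
GENES = {"hsa:3098": "HK1 (hexokinase 1)"}
COMPOUNDS = {"C00031": "D-Glucose"}


@dataclass
class Pathway:
    """A pathway record that references entities by ID instead of copying them."""
    pathway_id: str
    name: str
    gene_ids: list = field(default_factory=list)
    compound_ids: list = field(default_factory=list)

    def resolve(self):
        """Look up each referenced ID in its sibling database, skipping unknown IDs."""
        return {
            "genes": [GENES[g] for g in self.gene_ids if g in GENES],
            "compounds": [COMPOUNDS[c] for c in self.compound_ids if c in COMPOUNDS],
        }


glycolysis = Pathway(
    "hsa00010", "Glycolysis / Gluconeogenesis",
    gene_ids=["hsa:3098"], compound_ids=["C00031"],
)
print(glycolysis.resolve())
```

Storing references rather than copies means a correction made in a sibling database is seen by every pathway that points to that entry.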
5. Disease databases
▪ These are dedicated sources of disease-related information. For example, OMIM (Online Mendelian
Inheritance in Man) provides data about human genes and genetic disorders.

▪ The Genetic Association Database is another popular disease database, containing data on human
genetic association studies of complex diseases and disorders.

▪ This database helps in rapidly identifying medically relevant polymorphisms from large volumes of
polymorphism and mutational data, and therefore has significant therapeutic value.

▪ The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium
(ICGC) are further examples of disease databases.

▪ TCGA aims to collect a wide diversity of omics data (including exome, SNP, mRNA, miRNA,
and methylation) for more than 20 different types of human cancer.

▪ ICGC aims to obtain a comprehensive description of genomic, transcriptomic, and epigenomic
changes in 50 different tumor types and/or subtypes.
6. Expression databases
Expression databases can be used for various purposes, including archiving expression
data (e.g., GEO), detecting differential and baseline expression (e.g., Expression Atlas),
exploring tissue-specific gene expression and regulation (e.g., TiGER), and profiling
expression information based on both RNA and protein data (e.g., Human Protein Atlas).

➢ A representative example of an expression database is the Human Protein Atlas.

➢ It encompasses expression profiles for a large majority of human protein-coding
genes based on both RNA (transcriptome analysis based on 213 tissue and cell line
samples) and protein data (proteome analysis based on 24,028 antibodies)
(http://www.proteinatlas.org).
Classification of Databases Based on Data Source
1. Primary databases

Primary databases, also called archival databases, accept original data from researchers
with relatively little checking or validation. They contain the original submissions from researchers.

They are populated with experimentally derived data such as nucleotide sequences, protein
sequences or macromolecular structures.

▪ Once given a database accession number, the data in primary databases are never changed:
they form part of the scientific record.

Examples

▪ ENA, GenBank and DDBJ (nucleotide sequences).

▪ ArrayExpress Archive and GEO (functional genomics data).

▪ Protein Data Bank (PDB; coordinates of three-dimensional macromolecular structures).
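
The rule that data are never changed once an accession number has been assigned can be pictured as an append-only store. The class below is a toy sketch under that assumption; the "DEMO" prefix and the numbering scheme are invented for illustration and do not reflect how any real archive issues accessions.

```python
class PrimaryArchive:
    """Toy append-only archive: records are added once and never modified."""

    def __init__(self, prefix="DEMO"):
        self._prefix = prefix   # hypothetical accession prefix
        self._records = {}      # accession -> submitted data

    def submit(self, data):
        """Store a new submission and return its newly assigned accession."""
        accession = f"{self._prefix}{len(self._records) + 1:06d}"
        self._records[accession] = data
        return accession

    def fetch(self, accession):
        """Return the data exactly as it was submitted."""
        return self._records[accession]

    def update(self, accession, data):
        """Reject any attempt to alter an archived record."""
        raise PermissionError(
            f"{accession} is part of the archive and cannot be changed"
        )


archive = PrimaryArchive()
first = archive.submit("ATGGCCATTGTAATGGGCCGC")
print(first, archive.fetch(first))
```

A correction in such a scheme would be handled by submitting a new record rather than overwriting the old one, so earlier citations of an accession keep pointing at the same data.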

2. Secondary databases
▪ Secondary databases comprise data derived from the results of analysing primary data.
▪ Secondary databases often draw upon information from numerous sources, including other
databases (primary and secondary), controlled vocabularies and the scientific literature.
▪ They are highly curated, often using a complex combination of computational algorithms and
manual analysis and interpretation to derive new knowledge from the public record of science.
Examples
▪ InterPro (protein families, motifs and domains)
▪ UniProt Knowledgebase (sequence and functional information on proteins)
▪ Ensembl (variation, function, regulation and more layered onto whole genome sequences)

Many data resources, however, have both primary and secondary characteristics. For example,
UniProt accepts primary sequences derived from peptide sequencing experiments. However, UniProt
also infers peptide sequences from genomic information, and it provides a wealth of additional
information, some derived from automated annotation (TrEMBL), and even more from careful manual
analysis (Swiss-Prot).
Classification of Biological Databases
7. Chemical Databases

▪ These databases store chemical information on various molecules. Examples:

▪ PubChem, at NCBI, contains substance descriptions of small molecules with fewer than 1000
atoms and 1000 bonds.

▪ ChEMBL is a large-scale bioactivity database containing binding, functional, in vivo
absorption, distribution, metabolism, excretion, and toxicity (ADMET) information about
drug-like bioactive compounds.

▪ ChEMBL data are manually curated from the published literature together with data drawn
from other databases. ChEMBL data are standardized for use in many types of chemical
biology and drug-discovery research problems.

▪ The ChEMBL database can be accessed through a web-based interface that provides a variety
of search and browsing functions.

▪ ChEMBL data are freely available from its FTP site in Oracle, MySQL,
PostgreSQL, structure-data file (SDF), FASTA and RDF formats.
Submission to Database
Investigators are encouraged to submit their newly obtained sequences directly to a member of the
International Nucleotide Sequence Database Collaboration, such as:

1. The National Center for Biotechnology Information (NCBI), which manages GenBank
(http://www.ncbi.nlm.nih.gov)

2. The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp)

3. The European Molecular Biology Laboratory (EMBL)/EBI Nucleotide Sequence Database
(http://www.embl-heidelberg.de)

• The simplest way of submitting sequences is through the website http://ncbi.nlm.nih.gov/
on a Web form page called BankIt.
The sequence can be annotated with information about it, such as mRNA start
and coding regions.
Ways of submission to Databases
▪ The submitted form is transformed into GenBank format and returned to the submitter
for review before it is added to GenBank.
▪ The other method of submission is to use Sequin (formerly called Authorin), which
runs on personal computers and UNIX machines.
▪ The programme provides an easy-to-use graphical interface and can manage large
submissions such as genomic sequences.
▪ It is described and demonstrated at http://www.ncbi.nlm.nih.gov/Sequin/index.html

▪ Completed files, using the appropriate or standard format:

▪ Files containing only sequence characters.
▪ Based on American Standard Code for Information Interchange (ASCII).
▪ Can be sent by email to [email protected]
▪ Or mailed on diskette to GenBank Submissions, NCBI, National Library of Medicine,
Bldg. 38A, Room 8N-803, Bethesda, Maryland 20894.
▪ The sequence then becomes publicly available.
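
Before sending a file, the two requirements above (ASCII text, sequence characters only) can be checked locally. The sketch below assumes that "sequence characters" means the IUPAC nucleotide codes and that whitespace and line breaks are allowed; the function name is hypothetical and this is not a replacement for the validation done by the submission tools.

```python
# IUPAC nucleotide codes: A, C, G, T/U plus the standard ambiguity codes.
# Treating these as the allowed "sequence characters" is an assumption.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def check_sequence_text(text):
    """Return a list of problems found in the text; an empty list means it passed."""
    problems = []
    if not text.isascii():
        problems.append("contains non-ASCII characters")
    # Ignore whitespace and line breaks, and compare case-insensitively.
    sequence = "".join(text.split()).upper()
    if not sequence:
        problems.append("no sequence characters found")
    invalid = sorted(set(sequence) - IUPAC_NUCLEOTIDES)
    if invalid:
        problems.append("unexpected characters: " + "".join(invalid))
    return problems


print(check_sequence_text("ATGGCC\nATTGTA\n"))   # clean input
print(check_sequence_text("ATG1CC"))             # contains a digit
```

Returning a list of problems instead of a single True or False lets the submitter see every issue in one pass.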
