Biological Data Bases

The document discusses major databases in bioinformatics, categorizing them into primary, secondary, and composite databases. It highlights key examples such as GenBank, DDBJ, and KEGG, which store original biological data, and explains the functions of secondary databases like PROSITE and PRINTS. Additionally, it mentions database search engines like Entrez and SRS that facilitate access to these biological databases.

MAJOR DATABASES IN BIOINFORMATICS
Dr. ABDULJALEEL K
DEPARTMENT OF ZOOLOGY
GOVERNMENT COLLEGE KASARAGOD
• A database is a computerized archive used to store and organize data.
• It includes the computer hardware and software used for data management.
• Researchers generate an enormous amount of biological data every day.
• A biological database can be defined as a collection of files containing records of biological data in machine-readable form, arranged in fields, which can be accessed, added to, retrieved, manipulated and modified.
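The idea of records arranged in fields can be sketched in a few lines of Python. This is only an illustration, not how any real biological database is implemented: the class names, field names and the DEMO accession numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """One record; each attribute is a field (field names are illustrative)."""
    accession: str
    organism: str
    sequence: str
    description: str = ""


class TinyDatabase:
    """In-memory stand-in for a database: records keyed by accession."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        # Each record must have a unique identifier.
        if record.accession in self._records:
            raise KeyError(f"duplicate accession: {record.accession}")
        self._records[record.accession] = record

    def retrieve(self, accession):
        return self._records[accession]

    def search(self, organism):
        # Access by a field other than the key.
        return [r for r in self._records.values() if r.organism == organism]

    def modify(self, accession, **changes):
        record = self._records[accession]
        for name, value in changes.items():
            if not hasattr(record, name):
                raise AttributeError(f"unknown field: {name}")
            setattr(record, name, value)
        return record


db = TinyDatabase()
db.add(SequenceRecord("DEMO0001", "Escherichia coli", "ATGGCGTTAA"))
db.add(SequenceRecord("DEMO0002", "Homo sapiens", "ATGCCCGGGTAA"))
db.modify("DEMO0001", description="hypothetical demo gene")
```

Here `add`, `retrieve`, `search` and `modify` correspond to the operations named in the definition above.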
Types
• Three types of databases:
• A) Primary databases
• B) Secondary databases
• C) Composite databases
PRIMARY DATABASES
• They are also known as archival databases.
• They contain original biological data, i.e., the raw sequences submitted by the research community.
• The data are unique and obtained through laboratory experiments.
• E.g., GenBank, PDB, DDBJ, PIR, MIPS, KEGG, EcoCyc, etc.
Types of primary databases
• a) Nucleotide sequence databases: GenBank, DDBJ, EMBL
• b) Protein sequence databases: Swiss-Prot, MIPS, PIR
• c) Metabolic databases: KEGG, EcoCyc
1. NUCLEOTIDE SEQUENCE DATABASES
• There are several databases containing nucleotide sequences.
GenBank
• Established in the USA in 1982; it grew out of the Los Alamos Sequence Database, which was started in 1979.
• It is produced and maintained by the National Center for Biotechnology Information (NCBI). NCBI is a part of the National Institutes of Health (NIH) in the United States.
• It is an annotated collection of publicly available nucleotide sequences from almost all organisms.
• It includes genomic DNA, mRNA, cDNA, etc.
DDBJ (DNA Data Bank of Japan)
• Established in 1986.
• It is a major nucleotide sequence database that collects sequences from researchers and issues an accession number to the submitter.
• SAKURA is a tool used to deposit data in DDBJ.
• ARSA is used to search data in DDBJ.
• The principal purpose of DDBJ is to improve the quality of the International Nucleotide Sequence Database (INSD) as a public-domain resource.
EMBL (European Molecular Biology Laboratory)
• The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA).
• Data are exchanged between the collaborating databases on a daily basis to keep them synchronized.
• The web-based tool Webin is the preferred system for individual submission of nucleotide sequences.
• For sequence similarity searching, a variety of tools (e.g. FASTA and BLAST) allow external users to compare their own sequences against the data in the EMBL Nucleotide Sequence Database.
2. PROTEIN SEQUENCE DATABASES
• Protein sequence databases store information about proteins.
• Each is an array of amino acid sequence entries arranged by identification number.
• E.g., Swiss-Prot, PIR, MIPS
Swiss-Prot
• It is a high-quality protein database.
• Swiss-Prot was created at the Department of Medical Biochemistry, University of Geneva, in 1986.
• This database is developed and maintained by the European Molecular Biology Laboratory and the Swiss Institute of Bioinformatics (SIB).
• It provides a high level of annotation (such as the description of the function of a protein, its domain structure, etc.).
PIR: Protein Information Resource
• It is an integrated public informatics resource that supports genomic, proteomic and systems biology research and scientific studies.
• PIR was established in 1984 by the National Biomedical Research Foundation (NBRF).
• It helps researchers identify and interpret protein sequence information.
MIPS: The Munich Information Center for Protein Sequences
• It provides genome-related information.
• MIPS supports both national and European sequencing and functional analysis projects.
• It develops systematic classification schemes for the functional annotation of protein sequences.
• It provides tools for the comprehensive analysis of protein sequences.
• It helps in gene expression analysis and proteomics.
3. METABOLIC DATABASES
• Metabolic databases are databases that represent the metabolic pathways of an organism.
• They are powerful and influential resources in computational biology and systems biology.
KEGG (Kyoto Encyclopedia of Genes and Genomes)
• It is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances.
• The KEGG project is run by the Bioinformatics Center, Institute for Chemical Research, Kyoto University.
EcoCyc
• It is a biological database for the bacterium E. coli.
• This database describes the genome, transcriptional regulation, transporters, and metabolic pathways of E. coli.
• New experimental discoveries about gene products, their function and regulation, new metabolic pathways, etc. are regularly added to EcoCyc.
B) SECONDARY DATABASES
• They are also known as curated databases.
• Secondary databases comprise data derived from the results of analyzing primary data.
• E.g., PROSITE, PRINTS, Blocks.
PROSITE
• The PROSITE database documents protein families, domains and functional sites, together with the biological signatures used to identify them.
• The database is manually curated by the Swiss Institute of Bioinformatics (SIB) and is integrated with Swiss-Prot.
• It consists of a large collection of biologically meaningful signatures that are described as patterns or profiles.
• It provides useful biological information on the protein family, domain or functional site identified by the signature.
• The PROSITE database is now complemented by a series of rules that can give more precise information about specific residues.
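A PROSITE pattern can be matched against a sequence by translating it into a regular expression. The sketch below is a simplified translator written for illustration, not PROSITE's own scanning software; the function name `prosite_to_regex` and the test sequence are assumptions. It covers `x` (any residue), `[..]` (allowed residues), `{..}` (excluded residues), repeat counts such as `(3)` or `(2,4)`, and the `<` and `>` terminal anchors, and it ignores rarer syntax. The example pattern is the N-glycosylation site signature, `N-{P}-[ST]-{P}`.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: supports x, [..], {..}, (n), (n,m), '<' and '>'.
    """
    pattern = pattern.rstrip(".")  # PROSITE patterns end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except these
        # [..] and single residue letters are already valid regex pieces.
        parts.append(prefix + core + ("{" + repeat + "}" if repeat else "") + suffix)
    return "".join(parts)


n_glyc = prosite_to_regex("N-{P}-[ST]-{P}.")
sequence = "MKNGSAANPTL"  # made-up test sequence
hits = [(m.start(), m.group()) for m in re.finditer(n_glyc, sequence)]
```

In the test sequence the first N is followed by G, S, A and matches; the second N is followed by P and is rejected by the `{P}` element.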
PRINTS
• PRINTS is a protein database that uses a different pattern-recognition approach called 'fingerprinting'.
• It provides both a detailed annotation resource for protein families and a diagnostic tool for new protein sequences.
BLOCKS
• The Blocks database is a collection of "blocks" representing known protein families. It can be used to compare a protein or DNA sequence with documented families of proteins.
• Blocks are ungapped multiple alignments of segments of related protein sequences that correspond to the most conserved regions of proteins.
• The main problem with Blocks is that the database is no longer updated.
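Because a block is ungapped, every segment in it has the same length, so conservation can be read off column by column. The following is a minimal sketch of that idea, not the method used to build the Blocks database; the function name and the four segments are invented for illustration.

```python
from collections import Counter


def column_conservation(block):
    """Return (most common residue, fraction of segments) for each column."""
    lengths = {len(segment) for segment in block}
    if len(lengths) != 1:
        raise ValueError("segments in an ungapped block must have equal length")
    profile = []
    for column in zip(*block):  # iterate over alignment columns
        residue, count = Counter(column).most_common(1)[0]
        profile.append((residue, count / len(block)))
    return profile


# Four made-up segments, one per sequence, aligned without gaps.
block = ["GDSGGP", "GDSGGP", "GDSGSP", "GDSGGA"]
profile = column_conservation(block)
fully_conserved = [i for i, (_, frac) in enumerate(profile) if frac == 1.0]
```

Columns where the fraction is 1.0 are identical in every segment; the remaining columns show where the family tolerates variation.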
C) SPECIALIZED DATABASES / COMPOSITE DATABASES
• These are collections on particular subjects, such as medical journal articles and abstracts, or on a particular organism.
• A composite database combines a number of primary sources using a set of defined criteria.
• Choosing different data sources and applying different criteria yields different composite databases.
• E.g., AGR (Arabidopsis Genome Resource), FlyBase, biodiversity databases
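How the choice of sources and criteria shapes a composite database can be sketched as follows. The function, the source names and the sequences are all hypothetical; the two criteria used here (merge identical sequences, drop sequences below a minimum length) are just examples of "defined criteria", not those of any real composite database.

```python
def build_composite(sources, min_length=0):
    """Merge entries from several primary sources into one non-redundant list.

    sources: dict mapping a source name to a list of (identifier, sequence).
    """
    composite = {}
    for source_name, entries in sources.items():
        for identifier, sequence in entries:
            if len(sequence) < min_length:
                continue  # criterion 1: drop short sequences
            if sequence in composite:
                # criterion 2: identical sequences become one entry,
                # keeping a cross-reference to every source record
                composite[sequence]["xrefs"].append((source_name, identifier))
            else:
                composite[sequence] = {
                    "sequence": sequence,
                    "xrefs": [(source_name, identifier)],
                }
    return list(composite.values())


# Two made-up primary sources with one shared sequence.
sources = {
    "source_a": [("A1", "MKT"), ("A2", "MKTAYIAK")],
    "source_b": [("B1", "MKTAYIAK"), ("B2", "MSD")],
}
strict = build_composite(sources, min_length=4)
loose = build_composite(sources)
```

The same two sources give one entry under the strict criteria and three under the loose ones, which is the point made in the last bullet above.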
DATABASE SEARCH ENGINES
• A search engine is a web-based tool that enables users to locate information on the World Wide Web.
Entrez
• It is a molecular biology database search and retrieval system developed by the National Center for Biotechnology Information (NCBI).
• It is an entry point for exploring distinct but integrated databases. The Entrez system provides access to:
• 1 Nucleotide sequence databases: GenBank, DDBJ, EMBL
• 2 Protein sequence databases: Swiss-Prot, PIR, PRF, PDB
• 3 Translated protein sequences from DNA sequence databases
• 4 Genome and chromosome mapping data
• 5 3-D structures from the Molecular Modeling Database
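Entrez can also be queried from a program through NCBI's E-utilities web interface, which the slides do not cover. The sketch below only builds a search URL and does not contact the network; the helper name `esearch_url` and the example query are assumptions for illustration, and NCBI's usage guidelines should be checked before sending real requests.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for one Entrez database (no request is sent)."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


# Example query: E. coli nucleotide records for the lacZ gene.
url = esearch_url("nucleotide", "Escherichia coli[Organism] AND lacZ[Gene]", retmax=5)
```

Changing `db` to `"protein"` points the same query at the protein sequence collection, which reflects how Entrez acts as a single entry point to several integrated databases.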
SRS
• The Sequence Retrieval System (SRS) is a network browser for databases in molecular biology.
• It is a powerful system for indexing, searching and retrieving sequence information.
• SRS is a homogeneous interface to biological databases developed at the European Bioinformatics Institute (EBI) at Hinxton, UK.
• The types of databases included are sequence and sequence-related, metabolic pathways, transcription factors, application results (e.g., BLAST), protein 3-D structure, genome, mapping, mutations, and locus-specific mutations.
• Users can access and query the contents of these databases and navigate among them.
STAG
• It is a molecular biology database and retrieval system of DDBJ. It is used for exploring integrated databases. It provides access to:
• 1 Nucleotide sequence databases
• 2 Protein sequence databases
• 3 Translated protein sequences from DNA sequence databases, etc.
