Introduction To Databases

Biological databases play a fundamental role in bioscience and bioinformatics by organizing vast amounts of biological data for easy access and analysis. There are various types of biological databases that contain data from the nucleic acid level to whole genomes and proteomes, including nucleotide and protein sequences, gene expression patterns, metabolic pathways, and more. Maintaining comprehensive yet structured biological databases is crucial as modern biological research generates enormous volumes of raw genomic and molecular data.

Uploaded by

jonny depp

Background

Biology is increasingly becoming a data-rich science, so the need for storing and
communicating large datasets has grown tremendously (e.g. nucleotide and protein
sequences, three-dimensional structures from X-ray crystallography and NMR). A biological
database is a collection of data that is organized so that its contents can easily be accessed,
managed, and updated. Biological databases play a fundamental role in bioscience, particularly
in bioinformatics. They offer scientists the opportunity to access sequence and structure data
for tens of thousands of sequences from a broad range of organisms. Biological databases
are an invaluable resource in support of biological research.

Types of Biological Data

Biological data range from the nucleotide level to the network level, which results in diverse
classes of biological databases, including:

• Nucleic acid sequence and structure
• Transcriptional regulation/gene expression patterns
• Protein sequence and structure
• Motifs and domains
• Protein-protein interactions
• Metabolic and signaling pathways
• Metabolites, enzymes, protein modification
• Viruses, bacteria, protozoa, and fungi
• Partial and whole genome sequences
• Genomic variation, diseases, and drugs
• Plant databases
• Other molecular biology databases, etc.

Definition of a Biological Database

A biological database is a collection of data that is organized so that its contents can easily be
accessed, managed, and updated.

Why do we need Biological Databases?

One of the hallmarks of modern genomic research is the generation of enormous amounts of
raw sequence data. As the volume of genomic data grows, sophisticated computational
methodologies are required to manage the data. Thus, the very first challenge in the genomics
era is to store and handle the overwhelming volume of information through the establishment
of computer databases. The development of databases to handle the vast amount of molecular
biological data is thus a fundamental task of bioinformatics. This chapter introduces some
basic concepts related to databases, in particular the types, designs, and architectures of
biological databases.

Objectives of Biological Databases

• Availability of biological data to the scientific community: To store, organize, and share data
in a structured and searchable manner, with the aim of facilitating data retrieval and visualization.

• Availability of biological data in computer-readable form: To maintain the data in
common formats and to provide web application programming interfaces for computers to
exchange and integrate data from various database resources in an automated manner.
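
To illustrate the second objective, here is a minimal Python sketch of what "computer-readable form" can look like in practice. It assumes FASTA as the common format and uses NCBI's E-utilities `efetch` endpoint as one example of a web programming interface. The function names, the accession `NM_000546`, and the sample text are chosen only for illustration. No request is actually sent; the code only builds the URL and parses locally supplied text.

```python
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities efetch endpoint (one example of a web API).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nucleotide"):
    """Build a URL asking for one record in FASTA format (no request is sent)."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines are collected under the current header.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical FASTA text, standing in for what a database might return.
example = """>seq1 hypothetical example
ATGCGT
ACGT
>seq2 another example
GGCC
"""

records = parse_fasta(example)
url = build_efetch_url("NM_000546")
print(url)
print(records)
```

Because the format is standardized, the same small parser can read records from any resource that exports FASTA, which is what makes automated exchange and integration possible.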

Features of Biological Databases

• Structured - Stored in a well-designed fashion
• Searchable (index) - Table of contents
• Updated periodically (release) - New edition
• Cross-referenced (hyperlinks) - Links with other databases

Organization of a Database

• Databases are composed of tables of data: A table is the same thing as a spreadsheet, a set of
rows and columns.

• Each table has several records (rows): A record stores all the information for a given
individual.

• Each record has several fields (columns): A field is an individual piece of data, a single
attribute of the record.

• Each record has a unique identifier, the primary key: A primary key serves to identify the
data stored in this record across all the tables in the database.
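
The organization described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names, column names, accessions (`EX0001`, `ST0001`), and sequences are all hypothetical and do not come from any real database.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Each table is a set of rows (records) and columns (fields).
# The accession is the primary key that uniquely identifies a protein record.
conn.execute(
    "CREATE TABLE protein ("
    " accession TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " organism TEXT,"
    " sequence TEXT)"
)
# A second table refers back to a protein record through the same key.
conn.execute(
    "CREATE TABLE structure ("
    " structure_id TEXT PRIMARY KEY,"
    " accession TEXT REFERENCES protein(accession),"
    " method TEXT)"
)

# Hypothetical records, one row per individual entry.
conn.executemany(
    "INSERT INTO protein VALUES (?, ?, ?, ?)",
    [
        ("EX0001", "example kinase", "Homo sapiens", "MKTAYIAK"),
        ("EX0002", "example protease", "Mus musculus", "MSLLTEVE"),
    ],
)
conn.execute("INSERT INTO structure VALUES (?, ?, ?)", ("ST0001", "EX0001", "X-ray"))

# The primary key links the data for one record across both tables.
row = conn.execute(
    "SELECT p.name, s.structure_id, s.method"
    " FROM protein p JOIN structure s ON s.accession = p.accession"
    " WHERE p.accession = ?",
    ("EX0001",),
).fetchone()
print(row)

# Re-using an existing primary key is rejected, so identifiers stay unique.
duplicate_rejected = False
try:
    conn.execute(
        "INSERT INTO protein VALUES (?, ?, ?, ?)",
        ("EX0001", "copy", "unknown", "MK"),
    )
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The join works only because both tables store the same identifier, which is the role the primary key plays across the tables of a database.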

The ‘Perfect’ Database

A perfect database has the following qualities:

• Comprehensive, but easy to search
• Annotated, but not “too annotated”
• A simple, easy-to-understand structure
• Cross-referenced
• Minimum redundancy
• Easy retrieval of data

Classification of Biological Databases

Biological databases can be broadly classified into two categories:

Sequence databases:
Contain nucleic acid and protein sequence information.

Structure databases:
Contain three-dimensional structures of proteins, nucleic acids, and macromolecular complexes.

These databases are important tools that help scientists analyze and explain a host of
biological phenomena, from the structure of biomolecules and their interactions, to the whole
metabolism of organisms, to the evolution of species. This knowledge helps in the fight
against diseases, assists in the development of medications, supports the prediction of
certain genetic diseases, and helps to uncover basic relationships among species in the history
of life.

Sequences and structures are only two of the several different types of data required in the
practice of modern biology. Other important data types include metabolic pathway
networks and molecular interactions, mutations and polymorphisms in molecular sequences
and structures, organelle structure and tissue type, genetic maps, physicochemical
data, gene and mRNA expression profiles, and two-dimensional gel electrophoresis images of
protein expression.

Sequence and structure databases can be further classified into:

• Primary
• Secondary
• Composite

Primary database:

Consist of experimentally derived data alone, such as nucleotide sequences, protein sequences,
and three-dimensional structures.
Examples include UniProtKB for protein sequences, GenBank and DDBJ for nucleotide
sequences, and the Protein Data Bank for protein structures.

Secondary databases:

Contain data that are derived from the analysis or treatment of primary data, such as
secondary structures, hydrophobicity plots, conserved sequences, signature sequences, and
domains.
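
As a small illustration of how secondary data is derived from primary data, the sketch below computes a hydrophobicity profile from a protein sequence. It assumes the Kyte-Doolittle hydropathy scale and a simple sliding-window average. The function name, the window size, and the example sequence are hypothetical choices made for this example, not something prescribed by any particular database.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Return the mean hydropathy of every window of `window` residues."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window > len(scores):
        # The sequence is too short to fill even one window.
        return []
    return [
        round(sum(scores[i:i + window]) / window, 2)
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical primary data: a short protein sequence.
primary_sequence = "MKTAYIAKQR"

# Derived (secondary) data: one averaged value per window position.
profile = hydropathy_profile(primary_sequence, window=5)
print(profile)
```

Plotting these values against the window position gives the kind of hydrophobicity plot mentioned above. The profile contains nothing that is not already implied by the sequence, which is why it counts as derived data.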

Secondary structure databases present detailed information from PDB entries in an organized
way, for example a structural classification of proteins by class, fold, superfamily, etc.

Most secondary databases are created and hosted by researchers at their individual
laboratories. Examples: SCOP, developed at Cambridge University; CATH, developed at
University College London; BMCD, developed at NIST, USA.

Composite databases:

A composite database merges a variety of different primary database sources, which avoids the
need to search multiple resources. Different composite databases use different combinations of
primary databases and different criteria in their search algorithms.
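
The merging step can be sketched in a few lines of Python. This is a simplified illustration only: it assumes that two entries are redundant when their sequences are identical, whereas real composite databases apply their own criteria. The source names, identifiers, and sequences are hypothetical.

```python
def merge_sources(*sources):
    """Merge records from several sources, collapsing identical sequences.

    Each source is a (source_name, records) pair, where records is a list of
    (identifier, sequence) tuples. Assumption: exact sequence identity is the
    only redundancy criterion.
    """
    merged = {}
    for source_name, records in sources:
        for identifier, sequence in records:
            key = sequence.upper()
            # One entry per distinct sequence, created on first sight.
            entry = merged.setdefault(key, {"sequence": key, "cross_references": []})
            # Remember where the sequence came from.
            entry["cross_references"].append((source_name, identifier))
    return list(merged.values())


# Hypothetical primary sources with one sequence in common.
source_a = ("SourceA", [("A1", "MKTAYIAK"), ("A2", "MSLLTEVE")])
source_b = ("SourceB", [("B7", "mktayiak"), ("B9", "MGDVEKGK")])

composite = merge_sources(source_a, source_b)
for entry in composite:
    print(entry["sequence"], entry["cross_references"])
```

Each merged entry keeps cross-references to the original records, so a single search of the composite set still leads back to every primary source that holds the sequence.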

For example, the National Center for Biotechnology Information (NCBI) hosts nucleotide and
protein databases and also provides OMIM (Online Mendelian Inheritance in Man), an online,
comprehensive, authoritative compendium of human genes and genetic phenotypes.

Current Status

The Database Issue of the journal “Nucleic Acids Research” is freely available and categorizes
many of the publicly available online databases related to biology and bioinformatics.
According to the 21st Nucleic Acids Research Database Issue, published in 2014, 1552 databases
were publicly accessible online [ref]. The 22nd Nucleic Acids Research Database Issue reports
the addition of 58 new molecular biology databases and updates on 115 existing databases
(Nucleic Acids Research, 2015, Vol. 43, Database issue, D1–D5).

About Me

• Researcher at Rajasthan University DBT Bioinformatics Centre.

• Highly rated freelancer at https://www.teacheron.com/tutor-profile/2w79

• Owns automatic updating chemical compound database https://sites.google.com/view/mud-data

• 2 years of Next-Generation Sequencing, Data Processing, and Bioinformatics experience.

• Developed septicemia and COVID-19 QSAR models using R.

• 3 years of project-based teaching experience to national and international students.

For any query, suggestion, or feedback, you can contact me at-

[email protected]

www.linkedin.com/in/shradheya-r-r-gupta-54492984

Thanks for taking the course.

My aim is to bridge the gap between the life sciences and computer science.

Enjoy learning!
