1 - Introduction and Sequence Database

The document introduces several primary biological databases, including NCBI, DDBJ, EMBL, ENSEMBL, and UCSC. It discusses how these databases classify and store literature, sequences, structures, and other data for use in laboratories. The document also outlines key characteristics of biological databases, such as how they contain curated genomics, proteomics, and other data from experiments and are structured to enable searching and periodic updates.

Uploaded by

Alisha
Copyright
© All Rights Reserved

NCBI, DDBJ, EMBL, ENSEMBL, UCSC

• Introduction to various databases
• Their classification (primary and secondary databases), e.g. NCBI, DDBJ, EMBL, ENSEMBL, UCSC
• Their use in laboratories: literature, sequence, structure, medical, enzyme, and metabolic pathway databases
• A database is a convenient system for storing, searching, and retrieving any type of data.
• A database makes it easy to handle and share large amounts of data, and supports large-scale analysis through easy access and data updating.
• Biological databases are libraries of life sciences information collected from scientific experiments, published literature, high-throughput experimental technology, and computational analysis.
• They contain information on genomics (gene sequence, function, structure, and both cellular and chromosomal localization), proteomics, transcriptomics, metabolomics, and microarray data.
• They are structured, searchable, periodically updated, and cross-referenced.
• Purpose:
  • Collation and organization of data related to biological systems
  • Availability of biological data
  • Computational support and a user-friendly interface that let a researcher analyze biological data meaningfully
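The idea of a structured, searchable, updatable, and cross-referenced store can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table names, identifiers (`G1`, `P1`, and so on), and values are hypothetical and chosen only for illustration; real biological databases use far richer schemas.

```python
import sqlite3

# In-memory database; all table and column names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (gene_id TEXT PRIMARY KEY, name TEXT, chromosome TEXT)")
conn.execute(
    "CREATE TABLE protein ("
    "protein_id TEXT PRIMARY KEY, "
    "gene_id TEXT REFERENCES gene(gene_id), "  # cross-reference to the gene table
    "length INTEGER)"
)

# Store: insert made-up records.
conn.executemany("INSERT INTO gene VALUES (?, ?, ?)",
                 [("G1", "toyA", "1"), ("G2", "toyB", "X")])
conn.executemany("INSERT INTO protein VALUES (?, ?, ?)",
                 [("P1", "G1", 120), ("P2", "G2", 85)])

# Update: correct one value in place.
conn.execute("UPDATE protein SET length = ? WHERE protein_id = ?", (121, "P1"))

# Search: follow the cross-reference from gene to protein.
rows = conn.execute(
    "SELECT gene.name, protein.protein_id, protein.length "
    "FROM gene JOIN protein ON protein.gene_id = gene.gene_id "
    "WHERE gene.chromosome = ?",
    ("1",),
).fetchall()
print(rows)
```

The join on `gene_id` plays the role of a cross-reference: a query that starts from one kind of record (a gene) reaches the related record (its protein) without duplicating data.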
5 Characteristics of a Good Database
• The data in a primary database come from basic research in academic and industrial labs.
• Researchers submit experimental results directly to the database.
• Data exist in their original form.
• Once a record has been given a database accession number, the data in the primary database are never changed.
• Primary databases are the primary sources for storing nucleic acid sequences, protein sequences, and structural information on biological macromolecules.
• It has a flat-file structure: an ASCII text file that is readable and downloadable by both humans and computers.
• There are two main ways of making batch sequence submissions to GenBank: NCBI's Barcode Submission Tool (BarSTool) and Sequin.
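To show why a plain ASCII flat file is readable by both people and programs, here is a minimal sketch that parses a toy record laid out in GenBank style. The record itself (accession `TOY00001`, its definition, and its sequence) is made up, and `parse_flat_file` is a hypothetical helper that handles only the few fields shown; real records carry many more fields, and dedicated parsers exist for them.

```python
# A toy record in GenBank-style flat-file layout.
# The accession, definition, and sequence are invented for illustration only.
TOY_RECORD = """\
LOCUS       TOY00001                  24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example sequence.
ACCESSION   TOY00001
ORIGIN
        1 atgcatgcat gcatgcatgc atgc
//
"""


def parse_flat_file(text):
    """Return a dict of header fields plus the concatenated sequence."""
    record = {}
    sequence_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            # "//" marks the end of the record
            break
        if in_sequence:
            # Sequence lines: a position number followed by blocks of bases
            sequence_parts.extend(line.split()[1:])
        elif line.startswith("ORIGIN"):
            in_sequence = True
        elif line[:1].strip():
            # Header keywords start in the first column
            keyword, _, value = line.partition(" ")
            record[keyword] = value.strip()
    record["sequence"] = "".join(sequence_parts)
    return record


record = parse_flat_file(TOY_RECORD)
print(record["ACCESSION"], len(record["sequence"]))
```

Because each field starts with a keyword in a fixed position, the same text a person can read in an editor can be split into fields with a few string operations.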
Database retrieval systems
• An annotated collection of all publicly available nucleotide and protein sequences
• Started in 1984 at the National Institute of Genetics (NIG) in Mishima
• http://www.ddbj.nig.ac.jp
