Bioinformatics Databases

The document provides an overview of various online bioinformatics databases, highlighting key providers like NCBI and EBI, and notable databases for biomolecular sequences, structures, protein functions, and genome information. It emphasizes the importance of features such as data accessibility, annotation, and links to additional resources when selecting a database. Additionally, it lists specific databases relevant to plant genomes, model organisms, and cancer research.

Uploaded by

Hasan Saiem


Bioinformatics databases

There are thousands of online bioinformatics databases. Here we list only a
handful of the most commonly used (i.e. most highly cited in the literature) in the
areas of biomolecular sequence, biomolecular structure, protein function and
domain annotation, genome databases and model organisms.
Additional databases of potential interest can often be found by looking through
the ‘Nucleic Acids Research (NAR) Database Issue’, available online.
When considering a particular database remember that desirable features will
likely include:
• Contains the data you are interested in.
• Allows fast data access.
• Provides annotation and curation of entries.
• Provides links to additional information (possibly in other databases).
• Allows you to make discoveries!

NCBI and EBI: Key database providers

The National Center for Biotechnology Information (NCBI) and European
Bioinformatics Institute (EBI) are the most prominent online bioinformatics
resource providers (both tools and databases).
Notable NCBI databases include:
• GenBank - an annotated collection of all publicly available DNA sequences.
• RefSeq - annotated set of non-redundant reference sequences (best
representation of a sequence in their judgment) including genomic, transcript,
and protein.
• PubMed - database of published biomedical literature (mostly abstracts).
• dbSNP - database of single nucleotide polymorphisms (SNPs) and multiple
small-scale variations of nucleotide sequences.
Notable EBI databases include:
• ENA - a comprehensive record of DNA sequences. Contains the same sequences
as GenBank (above) but offers different views and ways to navigate through the
data.
• UniProt - the premier protein sequence database.
• Ensembl - genome databases for vertebrates and other eukaryotic species.
• PDBsum - pictorial database of 3D biomolecular structures in the Protein Data
Bank (or PDB).
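
Both providers also offer programmatic access alongside their web pages. As a minimal sketch that is not part of the original list, the Python snippet below only builds request URLs for NCBI's E-utilities `efetch` endpoint and for UniProt's REST interface; it does not send any request. The helper names `efetch_url` and `uniprot_fasta_url` and the accessions shown are illustrative assumptions, and the endpoints should be checked against each provider's current documentation before use.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db: str, accession: str, rettype: str = "fasta") -> str:
    """Build (but do not send) an E-utilities efetch URL for one record."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


def uniprot_fasta_url(accession: str) -> str:
    """Build (but do not send) a UniProtKB URL for one entry in FASTA format."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


if __name__ == "__main__":
    # Illustrative accessions only; substitute the records you need.
    print(efetch_url("nucleotide", "NM_000518"))
    print(uniprot_fasta_url("P68871"))
```

Keeping URL construction separate from the network call makes the request easy to inspect, and any HTTP client can then be used to fetch the resulting address.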

Biomolecular sequence databases

▪ GenBank - NCBI’s nucleotide sequence database. Part of the ‘International
Nucleotide Sequence Database Collaboration’ (INSDC) together with the ENA
(‘European Nucleotide Archive’, from the EBI) and DDBJ (in Japan).
▪ RefSeq - The Reference Sequence collection constructed by NCBI to provide a
comprehensive, integrated, non-redundant set of DNA, RNA sequences and
protein products. It provides a stable reference for genome annotation, gene
identification and characterization, mutation and polymorphism analysis,
expression studies and comparative analyses.
▪ UniGene - An Organized View of the Transcriptome created by NCBI. Each
UniGene entry is a set of transcript sequences that appear to come from the same
transcription locus, together with information on protein similarities, gene
expression, cDNA clone reagents, and genomic location.
▪ dbSNP - The database of single nucleotide polymorphisms (SNPs) maintained by NCBI.
▪ UniProt - The main protein sequence database consisting of the protein
‘KnowledgeBase’ (UniProtKB), the sequence clusters (UniRef) and the sequence
archive (UniParc).
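
RefSeq accessions carry a prefix that indicates the kind of molecule, which maps onto the genomic, transcript and protein categories described for RefSeq in this list. The following is a hedged sketch covering only a few common prefixes; the function name `refseq_molecule_type` and the lookup table are assumptions made for illustration, and NCBI's documentation lists the full set of prefixes.

```python
# Common RefSeq accession prefixes (a partial, illustrative table).
REFSEQ_PREFIXES = {
    "NC_": "genomic",     # complete genomic molecule, e.g. a chromosome
    "NG_": "genomic",     # genomic region
    "NM_": "transcript",  # protein-coding mRNA
    "NR_": "transcript",  # non-coding RNA
    "XM_": "transcript",  # predicted (model) mRNA
    "XR_": "transcript",  # predicted (model) non-coding RNA
    "NP_": "protein",     # protein
    "XP_": "protein",     # predicted (model) protein
}


def refseq_molecule_type(accession: str) -> str:
    """Return 'genomic', 'transcript' or 'protein' for a RefSeq accession."""
    prefix = accession.strip()[:3].upper()
    if prefix not in REFSEQ_PREFIXES:
        raise ValueError(f"Unrecognised RefSeq prefix in {accession!r}")
    return REFSEQ_PREFIXES[prefix]


if __name__ == "__main__":
    # Illustrative accessions only.
    for acc in ("NC_000011.10", "NM_000518.5", "NP_000509.1"):
        print(acc, "->", refseq_molecule_type(acc))
```

A check like this can serve as a quick sanity test when a pipeline expects, say, only transcript accessions as input.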

Biomolecular structure databases

▪ PDB - The main repository of biomolecular structures maintained by the
Research Collaboratory for Structural Bioinformatics (RCSB). The same structures
are also contained in PDBe, an EBI maintained version of the protein data bank
that offers differing levels of annotation.
▪ SCOP - The database of Structural Classification of Proteins developed and
maintained by Cambridge University.
▪ CATH - The database of protein structure ‘Class, Architecture, Topology and
Homologous superfamily’ developed and maintained by University College,
London.

Protein function and domain databases

▪ Pfam - A database of protein families represented by multiple sequence
alignments and hidden Markov models, constructed and maintained by the
Sanger Institute, UK.
▪ Prosite - A database of protein domains, families and functional sites, created
and maintained by the Swiss Institute of Bioinformatics.
▪ PRINTS - A database of protein fingerprints consisting of conserved motifs within
a protein family, created and maintained by Manchester University, UK.
▪ BLOCKS - A database of multiply aligned ungapped segments corresponding to
the most highly conserved regions of proteins, created and maintained by the
Fred Hutchinson Cancer Research Center, US.
▪ ProDom - A database of protein domain families automatically generated from
UniProt sequence database, developed and maintained by the University Claude
Bernard, France.
▪ HPA - The website of the Human Protein Atlas, which shows the expression and
localization of proteins in a large variety of normal human tissues, cancer cells and
cell lines with the aid of immunohistochemistry images, developed and
maintained by Proteome Resource Center, Sweden.

Genome databases and genome browsers

▪ ENSEMBL - The web server of the European eukaryotic genome resource
developed by the EBI and Sanger Institute.
▪ UCSC Genome Information - The genome browser website containing the
reference sequence and working draft assemblies for a large collection of
genomes at the University of California at Santa Cruz (UCSC).
▪ NCBI Map Viewer - The NCBI genomic map viewer for the visualization of
completed and ongoing genome sequence.
▪ NCBI Genome - The entry portal to various NCBI genomic biology tools and
resources, including the Map Viewer, the Genome Project Database and the Plant
Genomes Central, etc.
▪ NCBI Genome Information - The NCBI genomic information table lists the
general information of genomes for all species.
▪ VISTA - A comprehensive suite of programs and databases for comparative
analysis of genomic sequences.
▪ GOLD - Genomes Online Database, a comprehensive information resource for
complete and ongoing genome sequencing projects with flowcharts and tables of
statistical data.

Plant genome databases

▪ Phytozome - A tool for green plant comparative genomics, maintained by the
Center for Integrative Genomics, Joint Genome Institute.
▪ Gramene - A curated open-source data resource for comparative genome
analysis in the grasses including rice, maize, wheat, barley, sorghum, etc., as well as
other plants including Arabidopsis, poplar and grape. Cross-species homology
relationships can be found using information derived from genomic and EST
sequencing, protein structure and function analysis, genetic and physical
mapping, interpretation of biochemical pathways, gene and QTL localization and
descriptions of phenotypic characters and mutations.
▪ TAIR - The Arabidopsis information resource maintained by Stanford University.
It includes the complete genome sequence along with gene structure, gene
product information, metabolism, gene expression, DNA and seed stocks, genome
maps, genetic and physical markers, publications, and information about the
Arabidopsis research community.
▪ AtENSEMBL - A genome browser for the commonly studied plant model
organism Arabidopsis thaliana.
▪ Oryzabase - A comprehensive rice science database maintained by National
Institute of Genetics, Japan. It contains genetic resource stock information, gene
dictionary, chromosome maps, mutant images and fundamental knowledge of
rice science.
▪ MaizeDB - The community database for biological information about the crop
plant Zea mays ssp. mays, with genetic, genomic, sequence, gene product,
functional characterization and literature reference data.
▪ SoyBase - Integrating Genetics and Molecular Biology for Soybean Researchers.
▪ SGN - A collection of data resources for Solanaceae species including tomato,
potato, pepper, eggplant, petunia and Nicotiana.
▪ ICuGI - The web portal for the International Cucurbit Genomics Initiative
including melon, cucumber, watermelon, pumpkin, etc.

Other genome databases

▪ PATRIC - the Bacterial Bioinformatics Resource Center, an information system
designed to support the biomedical research community’s work on bacterial
infectious diseases via integration of vital pathogen information with rich data
and analysis tools.
▪ GenoList - The bacterial genome database maintained at the Pasteur Institute.
▪ CyanoBase - The genome database for cyanobacteria developed by Kazusa
Institute, Japan.
▪ Viral Genomes - The main page of the NCBI viral genome information resource.
▪ GISAID - Global Initiative on Sharing Avian Influenza Data.
▪ OpenFlu - A database for human and animal influenza virus.
▪ NCBI Flu - NCBI Influenza Virus Resource with influenza genomic data and
analysis tools.
▪ Plant Viruses - This site provides a central source of information about viruses,
viroids and satellites of plants, fungi and protozoa.

Model organism focused databases

▪ MGI - The international database resource for the laboratory mouse, providing
integrated genetic, genomic, and biological data to facilitate the study of human
health and disease.
▪ ZFIN - The Zebrafish Information Network, the database resource for the
zebrafish model organism.
▪ FlyBase - A comprehensive database of Drosophila genes and genomes
maintained by Indiana University.
▪ WormBase - The biology and genome resource of the Caenorhabditis elegans
genome.
▪ SGD - The Saccharomyces Genome database.
▪ RGD - The Rat Genome Database at the Medical College of Wisconsin, which
collects, consolidates and integrates data generated from ongoing rat genetic and
genomic research.
▪ Xenbase - The biology and genomics resource for the frogs Xenopus laevis (the
African clawed frog) and Xenopus tropicalis.

Cancer specific databases

▪ ICGC Data Portal - Tools for visualizing, querying and downloading the data
released quarterly by the consortium's member projects.
▪ TCGA portal - Search, download, and analyze data sets generated by ‘The Cancer
Genome Atlas’ (TCGA). It contains clinical information, genomic characterization
data, and high level sequence analysis of tumor genomes.
▪ UCSC Cancer Genomics Browser - Interactively explore cancer genomics data
and its associated clinical information.
