Bioinformatic Databases 2

Biological databases are libraries of biological information collected from scientific experiments and literature. They provide structured, indexed data including sequences, functions, structures, and related references that is periodically updated. Major databases include sequence databases like NCBI and UniProt, and structure databases like PDB, CATH, and SCOP. Sequence databases provide nucleotide and protein sequences along with annotation, while structure databases classify protein domains and structures hierarchically based on structural and evolutionary relationships. Tools on these databases allow users to search, analyze, and visualize biological data.

Uploaded by

vivian1899190

Biological Databases

Dr. Sajid
Department of Biotechnology
Biological Databases
• Biological databases are libraries of biological information collected from scientific experiments, published literature, high-throughput experimental technologies, and computational analysis.
What we expect from a database
• Sequence, functional, structural information,
related bibliography
• Well Structured and Indexed
• Well cross-referenced (with other databases)
• Periodically updated
• Tools for analysis and visualization
Biological Databases
• Sequence databases
• Structure databases
Sequence databases
• Nucleotide databases
• Protein databases
Sequence databases
Nucleotide databases
• International Nucleotide Sequence
Database Collaboration (INSDC)
– NCBI
– EMBL
– DDBJ
Standard contents of a sequence database

• Sequences
• Accession number
• References
• Taxonomic data
• Annotation/curation
• Keywords
• Cross-references
• Documentation
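
The fields listed above can be pictured as one record type. The following is a minimal sketch only: the class name `SequenceRecord`, its field names, and the sample values are hypothetical and do not follow the schema of any real database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SequenceRecord:
    """Hypothetical record type; field names are illustrative, not a real schema."""
    accession: str                      # stable identifier for the entry
    sequence: str                       # nucleotide or protein sequence
    organism: str = ""                  # taxonomic data
    annotation: str = ""                # curated description of the entry
    references: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    cross_references: Dict[str, str] = field(default_factory=dict)

    def to_fasta(self) -> str:
        """Render the record as a two-line FASTA entry."""
        header = f">{self.accession} {self.annotation}".rstrip()
        return f"{header}\n{self.sequence}"


record = SequenceRecord(
    accession="EXAMPLE0001",            # placeholder, not a real accession
    sequence="ATGGCCATTGTAATGGGCCGC",   # made-up sequence
    organism="Homo sapiens",
    annotation="hypothetical example entry",
    keywords=["example"],
    cross_references={"PubMed": "placeholder-id"},
)
print(record.to_fasta())
```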
NCBI
• Very comprehensive biological database
• GenBank: the nucleotide sequence database
• Provides 42 different resources
• Provides a simple, easy-to-use web interface

http://www.ncbi.nlm.nih.gov/
• Sequence submission: done using BankIt or Sequin
• Search engine for data retrieval: Entrez
• Retrieves information across all the resources under NCBI
Example: PubMed, Taxonomy, SNP, PubChem, etc.
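
Entrez can also be queried programmatically through NCBI's E-utilities web service. The sketch below only builds an `efetch` request URL and sends nothing over the network. The helper name `efetch_url` is hypothetical, and the accession is an example value chosen for illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db: str, record_id: str,
               rettype: str = "fasta", retmode: str = "text") -> str:
    """Build an efetch URL for one record (hypothetical helper, no network call)."""
    query = urlencode({
        "db": db,            # Entrez database, e.g. "nuccore" or "protein"
        "id": record_id,     # accession or UID of the record
        "rettype": rettype,  # record format requested
        "retmode": retmode,  # plain text vs. XML
    })
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


# Example accession used purely for illustration.
url = efetch_url("nuccore", "NM_000280")
print(url)
```

Opening the printed URL in a browser, or passing it to an HTTP client, should return the record in FASTA format, subject to NCBI's usage policies.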
Tools for analysis
• BLAST
• Primer-BLAST
• B-Link
• ORF finder
• Genome workbench
Protein Sequence databases

• UniProt
• Pfam
• Gene Index project
UniProt
• Universal Protein Resource
• Formed by merging the Swiss-Prot, TrEMBL, and PIR-PSD databases
• Maintained by a consortium of:
– EBI
– SIB
– PIR
• Entry names usually combine a protein or gene mnemonic with a species code, e.g. PAX6_HUMAN
• Accession numbers are stable alphanumeric identifiers, e.g. P26367 (the accession for PAX6_HUMAN)
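
The accession and entry-name formats can be checked with a short script. This is a sketch under stated assumptions: the regular expression follows the accession pattern published in the UniProt help pages, while the helper names `is_uniprot_accession` and `split_entry_name` are hypothetical.

```python
import re
from typing import Tuple

# Accession pattern as documented by UniProt (6- or 10-character accessions).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text: str) -> bool:
    """Return True if the whole string has the shape of a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text) is not None


def split_entry_name(entry_name: str) -> Tuple[str, str]:
    """Split an entry name such as PAX6_HUMAN into (mnemonic, species code)."""
    mnemonic, _, species = entry_name.partition("_")
    return mnemonic, species


print(is_uniprot_accession("P26367"))      # accession from the slide
print(is_uniprot_accession("PAX6_HUMAN"))  # an entry name, not an accession
print(split_entry_name("PAX6_HUMAN"))
```

The check is purely syntactic: a string that matches the pattern is not guaranteed to exist in UniProt.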
UniProt features

• BLAST
• Align
• Retrieve
• ID mapping
Pfam
• Proteins contain conserved regions
• Based on the conserved regions, proteins are
classified into families
• Provides links to external databases like PDB,
SCOP, CATH etc.
Pfam: Features
• Sequence search
• View Pfam family
• View a clan
• View a sequence
• View a structure
• Keyword search
Gene Indices
• Project aimed at indexing genes and their variants in the various genome sequences
• Creates a catalogue of genes in a wide range of organisms
• Reduces redundancy
Gene Indices Software Tools
• TGI Clustering tools
• Clview
• SeqClean
• Cdbfasta/cdbyank
Structural databases
• PDB – Protein Data Bank
• CATH
• SCOP – Structural Classification of Proteins
wwPDB
• Contains information about experimentally determined structures of proteins, nucleic acids, and complex assemblies
• RCSB PDB, PDBe, PDBj, BMRB – repositories of protein structure data
• Files in PDB, mmCIF, PDBML/XML formats
• Advanced search – provides comprehensive information about a protein: sequence info, domain info, sequence similarity, and literature, in addition to the details of the structure
• Cross-referenced to SCOP and CATH
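
The legacy PDB file format mentioned above is a fixed-column text format, so an ATOM record can be read by slicing each line at known column positions. The sketch below assumes the standard ATOM column layout; the function name `parse_atom_line` is hypothetical, and the sample line is built in code with illustrative values rather than copied from a real entry.

```python
from typing import Dict, Union


def parse_atom_line(line: str) -> Dict[str, Union[str, int, float]]:
    """Extract a few fields from one ATOM record (hypothetical helper)."""
    return {
        "serial": int(line[6:11]),        # cols 7-11: atom serial number
        "name": line[12:16].strip(),      # cols 13-16: atom name
        "res_name": line[17:20].strip(),  # cols 18-20: residue name
        "chain_id": line[21],             # col 22: chain identifier
        "res_seq": int(line[22:26]),      # cols 23-26: residue sequence number
        "x": float(line[30:38]),          # cols 31-38: x coordinate
        "y": float(line[38:46]),          # cols 39-46: y coordinate
        "z": float(line[46:54]),          # cols 47-54: z coordinate
        "element": line[76:78].strip(),   # cols 77-78: element symbol
    }


# Illustrative record assembled field by field so the columns line up.
SAMPLE_LINE = "".join([
    f"{'ATOM':<6}",      # cols 1-6: record name
    f"{1:>5}",           # cols 7-11: serial
    " ",                 # col 12: unused
    f"{' N':<4}",        # cols 13-16: atom name
    " ",                 # col 17: alternate location indicator
    f"{'MET':>3}",       # cols 18-20: residue name
    " ",                 # col 21: unused
    "A",                 # col 22: chain
    f"{1:>4}",           # cols 23-26: residue number
    f"{'':4}",           # cols 27-30: insertion code and padding
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}",  # cols 31-54: x, y, z
    f"{1.00:6.2f}{9.67:6.2f}",                  # cols 55-66: occupancy, B-factor
    f"{'':10}",          # cols 67-76: unused
    f"{'N':>2}",         # cols 77-78: element
])

atom = parse_atom_line(SAMPLE_LINE)
print(atom)
```

For real work, an established parser that also handles mmCIF is usually a safer choice than hand-written slicing.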

CATH
• Classification of proteins based on domain
structures
• Each protein is chopped into individual domains, which are assigned to homologous superfamilies.
• Hierarchical domain classification of PDB entries.
CATH hierarchy
• Class – derived from secondary structure content; assigned automatically
• Architecture – describes the gross orientation of secondary structures, independent of connectivity
• Topology – clusters structures according to their topological connections and number of secondary structures
• Homologous superfamily – groups together protein domains that are thought to share a common ancestor and can therefore be described as homologous
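
CATH identifies each superfamily with a four-part dotted code, one number per level of the hierarchy above. The following sketch splits such a code and reports the deepest level two domains share. The names `CathCode`, `parse_cath_code`, and `deepest_shared_level` are hypothetical, and the codes are example values used for illustration.

```python
from typing import NamedTuple, Optional

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted code such as '3.40.50.300' into its four levels."""
    parts = code.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Return the name of the deepest hierarchy level shared by two codes."""
    shared = None
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break  # the hierarchy diverges here; deeper levels cannot match
        shared = level
    return shared


first = parse_cath_code("3.40.50.300")
second = parse_cath_code("3.40.50.720")
print(deepest_shared_level(first, second))
```

Two domains that agree down to the Topology level share a fold but are not necessarily homologous; only a match at the fourth level places them in the same homologous superfamily.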
SCOP
• Description of structural and evolutionary
relationships between all the proteins with
known structures
• Uses the PDB entries
• Search using keywords or PDB identifiers
Hierarchy in SCOP
• Class
• Fold
• Superfamily
• Family
• Species
Thank you
