Zoya Bioinformatics Assignment

The document outlines various biological databases, primarily maintained by NCBI and EMBL, that are essential for research in biotechnology and bioinformatics. Key databases include GenBank, PubMed, and UniProt, which provide access to genetic sequences, biomedical literature, and protein information, respectively. Additionally, resources like KEGG and Gene Ontology offer insights into metabolic pathways and gene functions, supporting a wide range of scientific studies.

Uploaded by

farazscholars54

B. K. Birla College of Arts, Science & Commerce, Kalyan

Empowered Autonomous Status

Name: - Ansari Zoya Mohammad Saud


Class: - M. Sc. Part – I
Faculty: - Biotechnology
Paper / Subject: - Sequence Alignment and Phylogenetics
Discipline Specific Elective (DSE)
Topic: - Types of Databases
Roll No.: - 06
Department: - Department of Biotechnology
College: - B.K. Birla College, Kalyan
A) National Center for Biotechnology Information (NCBI): -

The National Center for Biotechnology Information (NCBI) is a division of the National Library
of Medicine (NLM), USA, that provides access to a vast range of biological databases and tools.
It is responsible for storing, analyzing, and making biological and biomedical data publicly
available. Some of its most notable databases include GenBank, PubMed, PubChem, OMIM, and
the Taxonomy Database.

I) GenBank: -

GenBank is a comprehensive public database of nucleotide sequences and their protein
translations, maintained by NCBI. It allows researchers to store and retrieve genetic sequence data
and exchanges records with the European Nucleotide Archive (ENA, at EMBL-EBI) and the DNA Data
Bank of Japan (DDBJ) under the International Nucleotide Sequence Database Collaboration (INSDC).
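
Sequence records retrieved from GenBank are commonly exchanged as FASTA text. The following is a minimal sketch of reading such text, assuming it has already been downloaded; the record shown is a made-up toy sequence, and `parse_fasta` and `gc_content` are hypothetical helper names, not part of any NCBI tool.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy record for illustration only (not a real GenBank entry).
TOY_FASTA = ">toy_record_1 hypothetical example\nATGCGC\nTTAA\n"
toy_records = parse_fasta(TOY_FASTA)
for name, seq in toy_records.items():
    print(name, len(seq), gc_content(seq))
```

For real work, an established parser such as the one in Biopython is usually a safer choice than a hand-written one.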

II) Taxonomy Database: -

The NCBI Taxonomy Database provides a hierarchical classification of organisms based on
genetic and evolutionary relationships. It includes information on species, genera, families, and
higher taxonomic ranks, making it widely used in bioinformatics for species identification and
evolutionary studies.
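
The hierarchical classification can be pictured as a tree in which every node points to its parent. The sketch below walks such a tree from a species up to a higher rank. It is an illustration under stated assumptions: `TOY_NODES` is a hand-made subset in which intermediate ranks are omitted, so the parent links are simplified compared with the real database, and `lineage` is a hypothetical helper name.

```python
# Toy subset of a taxonomy tree: taxid -> (name, rank, parent taxid).
# Intermediate ranks are left out, so the parent links are simplified.
TOY_NODES = {
    9606: ("Homo sapiens", "species", 9605),
    9605: ("Homo", "genus", 9604),
    9604: ("Hominidae", "family", 9443),
    9443: ("Primates", "order", 40674),
    40674: ("Mammalia", "class", None),  # treated as the root of this toy tree
}


def lineage(taxid, nodes):
    """Return the list of (rank, name) pairs from the given node up to the root."""
    path = []
    while taxid is not None:
        name, rank, parent = nodes[taxid]
        path.append((rank, name))
        taxid = parent  # move one level up the hierarchy
    return path


for rank, name in lineage(9606, TOY_NODES):
    print(f"{rank}: {name}")
```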

III) Gene Expression Omnibus (GEO): -

The Gene Expression Omnibus (GEO) is a public repository for gene expression and other
functional genomics data, storing experimental data from microarray, RNA-Seq, and ChIP-Seq
studies. Researchers can submit, retrieve, and analyze these data for various biological studies.

IV) PubChem: -

PubChem is a free chemical database maintained by NCBI, containing information on small
molecules, their properties, biological activities, and toxicity. It is widely used in drug discovery
and includes millions of chemical structures, bioassay data, and compound interactions.

V) PubMed: -

PubMed is a biomedical literature database providing access to over 35 million citations from
MEDLINE, life sciences journals, and online books. It is widely used by researchers, doctors, and
students to access scientific publications.
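
PubMed can also be searched programmatically through the NCBI E-utilities. The sketch below only builds an ESearch request URL and does not send it; `pubmed_search_url` is a hypothetical helper name, and the search term is an arbitrary example. NCBI's usage guidelines (request rate limits, API keys) should be checked before sending real queries.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def pubmed_search_url(term, retmax=20):
    """Build an ESearch URL that asks PubMed for records matching `term`."""
    params = {
        "db": "pubmed",      # database to search
        "term": term,        # query string, same syntax as the web search box
        "retmax": retmax,    # maximum number of record IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


example_url = pubmed_search_url("CRISPR gene editing", retmax=5)
print(example_url)
```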

VI) OMIM (Online Mendelian Inheritance in Man): -

OMIM is a comprehensive catalog of human genes and genetic disorders, curated at Johns Hopkins
University and searchable through NCBI. It provides information on disease-causing mutations,
inheritance patterns, and clinical descriptions, making it valuable for medical genetics research.

VII) Virus Database: -

The NCBI Virus Database provides a collection of viral genome sequences, taxonomic
information, and related research data. It includes data on RNA and DNA viruses, including those
affecting humans, animals, and plants. This database is essential for studying viral evolution,
outbreaks, and vaccine development.

B) EMBL Database: -

The European Molecular Biology Laboratory (EMBL), through its European Bioinformatics Institute
(EMBL-EBI), maintains key resources for biological data storage and analysis. These cover
nucleotide sequences, protein functions, chemical compounds, and structural biology. Some of its
important databases include UniProt, AlphaFold, InterPro, ChEMBL, IntAct, IPD, and Pfam.

I) UniProt: -

UniProt (Universal Protein Resource) is a comprehensive database that provides protein sequence
and functional information. Its knowledgebase, UniProtKB, has two sections: Swiss-Prot, which holds
manually reviewed entries, and TrEMBL, which holds automatically annotated, unreviewed entries.
It is produced by a consortium of EMBL-EBI, the SIB Swiss Institute of Bioinformatics, and the
Protein Information Resource (PIR). UniProt is widely used for studying protein function,
structure, and interactions.
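
UniProt sequences downloaded in FASTA format carry a structured header line. The sketch below splits such a header into its parts, assuming the usual `db|accession|entry_name description` layout, where the `sp` prefix marks a Swiss-Prot entry and `tr` a TrEMBL entry. The header string is an illustrative example and `parse_uniprot_header` is a hypothetical helper name.

```python
import re


def parse_uniprot_header(header):
    """Split a UniProt FASTA header into accession, entry name, and description fields."""
    header = header.lstrip(">")
    ids, _, description = header.partition(" ")
    db, accession, entry_name = ids.split("|")
    sections = {"sp": "Swiss-Prot", "tr": "TrEMBL"}
    organism = re.search(r"OS=(.+?) OX=", description)
    return {
        "section": sections.get(db, "unknown"),
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": description.split(" OS=")[0],
        "organism": organism.group(1) if organism else None,
    }


# Illustrative header; the field values are assumed for the example.
EXAMPLE_HEADER = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
parsed_header = parse_uniprot_header(EXAMPLE_HEADER)
print(parsed_header)
```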

II) AlphaFold Database: -

The AlphaFold Database is an AI-driven protein structure prediction database developed by
DeepMind in collaboration with EMBL-EBI. It provides predicted 3D structural models of proteins,
many of them highly accurate, which are crucial for understanding protein function, interactions,
and drug discovery.

III) InterPro: -

InterPro is a database that classifies and predicts protein families, domains, and functional sites. It
integrates information from multiple protein signature databases to provide insights into protein
structure and function. Researchers use InterPro to identify conserved domains and evolutionary
relationships.

IV) ChEMBL: -

ChEMBL is a bioactivity database containing information on small molecules, drug-like
compounds, and their biological interactions. It is used for drug discovery, pharmacology research,
and computational modeling by providing data on compound-target interactions, ADMET
properties, and clinical trial details.

V) IntAct Database: -

The IntAct Database is a resource for molecular interaction data, storing experimentally verified
protein-protein, protein-DNA, and protein-RNA interactions. It supports researchers in
understanding biological pathways, disease mechanisms, and protein networks.

VI) IPD Database (Immuno Polymorphism Database): -

The Immuno Polymorphism Database (IPD) is a specialized resource that provides genetic
variation data of immune system-related genes, including human leukocyte antigens (HLA),
killer-cell immunoglobulin-like receptors (KIR), and the major histocompatibility complex (MHC)
of other species. It is essential for immunogenetics and transplantation research.

VII) Pfam Database: -

The Pfam Database is a collection of protein families and domains, using hidden Markov models
(HMMs) to predict conserved functional regions within proteins. It helps researchers study protein
evolution, domain architectures, and functional annotations across species.

C) PDB (Protein Data Bank): -

The Protein Data Bank (PDB) is a structural biology database that stores 3D atomic-level
structures of biomolecules, including proteins, DNA, RNA, and complex assemblies. It is
maintained by the Worldwide Protein Data Bank (wwPDB) consortium, which includes RCSB
PDB (USA), PDBe (Europe), and PDBj (Japan). PDB structures are obtained from X-ray
crystallography, NMR spectroscopy, and cryo-electron microscopy. The database is widely used
for drug discovery, protein function analysis, and structural bioinformatics.
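
Atomic coordinates in the legacy PDB file format are stored in fixed-width ATOM records. The sketch below reads one such line by column position. It is a minimal illustration under stated assumptions: the sample line is made up, `parse_atom_record` is a hypothetical helper name, and newer entries distributed only in PDBx/mmCIF format need a different parser.

```python
def parse_atom_record(line):
    """Parse one fixed-width ATOM/HETATM line of the legacy PDB format."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),          # columns 7-11: atom serial number
        "atom_name": line[12:16].strip(),   # columns 13-16: atom name
        "residue": line[17:20].strip(),     # columns 18-20: residue name
        "chain": line[21],                  # column 22: chain identifier
        "residue_number": int(line[22:26]), # columns 23-26: residue sequence number
        "x": float(line[30:38]),            # columns 31-38: x coordinate in angstroms
        "y": float(line[38:46]),            # columns 39-46: y coordinate
        "z": float(line[46:54]),            # columns 47-54: z coordinate
        "b_factor": float(line[60:66]),     # columns 61-66: temperature factor
        "element": line[76:78].strip(),     # columns 77-78: element symbol
    }


# Made-up example line laid out in the fixed columns of the format.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"
atom = parse_atom_record(SAMPLE)
print(atom["residue"], atom["chain"], atom["x"], atom["y"], atom["z"])
```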

D) Gene Ontology: -

Gene Ontology (GO) is a structured vocabulary in bioinformatics that categorizes gene and protein
functions across various species. It offers a consistent terminology for describing genes and their
products under three aspects: biological process, molecular function, and cellular component.

E) KEGG (Kyoto Encyclopedia of Genes and Genomes): -

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a bioinformatics resource that
provides information on genes, proteins, metabolic pathways, diseases, and drugs. It integrates
genomic, biochemical, and systems biology data to help researchers study cellular processes, gene
functions, and metabolic interactions. KEGG is widely used in genomics, metabolomics, and
systems biology research.

I) KEGG Pathway: -

The KEGG Pathway Database contains graphical representations of biochemical pathways
involved in metabolism, genetic information processing, cellular processes, human diseases, and
drug responses. These pathways help researchers understand biological functions and molecular
interactions within organisms. It is extensively used in metabolic engineering, disease research,
and drug discovery.

II) KEGG Enzyme: -

The KEGG Enzyme Database provides information on enzymes and their roles in biochemical
reactions. It classifies enzymes based on the Enzyme Commission (EC) number and links them to
metabolic pathways, substrates, and reactions. This database is useful for studying enzyme
mechanisms, metabolic engineering, and drug-target interactions.
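
An EC number has four dot-separated levels, and the first level names the broad enzyme class. The sketch below breaks an EC number into those levels. It is an illustration only: `parse_ec_number` is a hypothetical helper name and does not use or represent the KEGG interface.

```python
# Top-level enzyme classes of the Enzyme Commission system.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec_number(ec):
    """Split an EC number such as 'EC 2.7.1.1' into its four hierarchy levels."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # drop the optional 'EC' prefix
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels in EC number, got: {ec!r}")
    return {
        "class": EC_CLASSES[int(parts[0])],
        "subclass": ".".join(parts[:2]),
        "sub_subclass": ".".join(parts[:3]),
        "serial": parts[3],  # may be '-' when the last level is unassigned
    }


# Hexokinase, a transferase, is used here as the example.
hexokinase = parse_ec_number("EC 2.7.1.1")
print(hexokinase)
```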
