Data Bases

The document discusses various types of chemical databases, including structure, substructure, reaction, and relational databases, highlighting their importance in retrieving and analyzing chemical information. It emphasizes the role of the Cambridge Structural Database (CSD) and Protein Data Bank (PDB) in providing curated structural data for organic and biological molecules. Additionally, it covers the concept of 3D pharmacophores, which are essential for understanding molecular interactions in drug design.

DATA BASES

• https://www.science.co.il/chemistry/databases/Structure-databases.php
• InChI representation (formerly IChI), developed by IUPAC
DATA BASE SEARCH
• 2D SEARCH: structure search, substructure search, reaction search, patent search, relational search
• 3D SEARCH: CSD, Protein Data Bank, 3D pharmacophore


STRUCTURE SEARCHING
• The simplest searching task involves the extraction from the
database of information associated with a particular structure.
• For example, one may wish to look up the boiling point of acetic
acid or the price of acetone.
• The first step is to convert the structure provided by the user (the
query) into the relevant canonical ordering or representation.
• One could then search through the database, starting at the
beginning, in order to find this structure. However, the canonical
representation can also provide the means to retrieve
information about a given structure more directly from the
database through the generation of a hash key. A hash key is
typically an integer with a value between 0 and some large
number (e.g. 2^32 − 1).
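The hash-key lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular database system: the SMILES strings are assumed to be already canonicalised, CRC-32 is simply one convenient function that yields an integer between 0 and 2^32 − 1, and the record contents and the names `hash_key`, `index` and `lookup` are hypothetical.

```python
import zlib

def hash_key(canonical: str) -> int:
    """Map a canonical structure string to an integer in [0, 2**32 - 1]."""
    return zlib.crc32(canonical.encode("utf-8"))

# Hypothetical records keyed by SMILES strings that are assumed to be canonical.
# The property values are illustrative.
records = {
    "CC(=O)O": {"name": "acetic acid", "boiling_point_c": 118.1},
    "CC(C)=O": {"name": "acetone", "boiling_point_c": 56.1},
}

# Build the index: hash key -> list of (canonical string, data) pairs.
index = {}
for smiles, data in records.items():
    index.setdefault(hash_key(smiles), []).append((smiles, data))

def lookup(canonical: str):
    """Jump straight to the bucket for this structure instead of scanning."""
    for stored, data in index.get(hash_key(canonical), []):
        # Compare the full string, since two structures can share a hash key.
        if stored == canonical:
            return data
    return None

result = lookup("CC(=O)O")
```

The final string comparison matters because a hash key is not guaranteed to be unique; it only narrows the search to a small bucket.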
SUBSTRUCTURE SEARCHING
• A substructure search identifies all the molecules
in the database that contain a specified
substructure.
• A simple example would be to identify all
structures that contain a particular functional
group or sequence of atoms such as a carboxylic
acid, benzene ring or C5 alkyl chain.
• Most chemical database systems use a two-stage
mechanism to perform substructure search.
• The first step involves the use of screens to
rapidly eliminate molecules that cannot
possibly match the substructure query. The
aim is to discard a large proportion (ideally
more than 99%) of the database.
• The second step checks the molecules that pass
the screens with a detailed atom-by-atom match
against the query.
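The screening step can be sketched with bitstrings, assuming each bit records the presence of one structural feature. The feature list, the three molecules and the function names below are hypothetical; real systems use far longer screens and follow the screen with a full atom-by-atom match.

```python
# Hypothetical screen: each bit position stands for one structural feature.
FEATURES = ["carboxylic_acid", "benzene_ring", "c5_alkyl_chain", "amine"]
BIT = {name: 1 << i for i, name in enumerate(FEATURES)}

def make_screen(features):
    """Set one bit for every feature present."""
    bits = 0
    for feature in features:
        bits |= BIT[feature]
    return bits

# Precomputed screens for a toy database.
database = {
    "benzoic acid": make_screen(["carboxylic_acid", "benzene_ring"]),
    "hexanoic acid": make_screen(["carboxylic_acid", "c5_alkyl_chain"]),
    "aniline": make_screen(["benzene_ring", "amine"]),
}

def passes_screen(query_bits, molecule_bits):
    """A molecule survives only if every bit set in the query is set in it."""
    return (query_bits & molecule_bits) == query_bits

query = make_screen(["carboxylic_acid", "benzene_ring"])
candidates = [name for name, bits in database.items()
              if passes_screen(query, bits)]
```

A molecule that fails the screen cannot contain the query, so it is discarded without further work; a molecule that passes is only a candidate and still needs the detailed match.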
REACTION DATA BASE
• Reactions are central to the subject of
chemistry, being the means by which new
chemical entities are produced. As any student
of chemistry is aware, a huge number of new
reactions and new applications of existing
reactions are published every year.
• A long-standing printed source is the Beilstein
Handbuch der Organischen Chemie, which contains
information from 1771 onwards.
WHY DO YOU NEED REACTION SEARCH?
• When planning a synthesis a chemist may wish to
search a reaction database in a variety of ways.
• A simple type of query would involve an exact
structure search against the products, to
determine whether there was an established
synthesis for the compound of interest.
• Another type of query is a reaction search
involving the structures or substructures of the
precursor or reactant and the product.
• A more general type of search would involve
identification of all deposited syntheses for a
particular named reaction, in order to identify
a range of possible reaction conditions to
attempt. Other quantities of interest may
include specific reagents, catalysts, solvents,
the reaction conditions (temperature,
pressure, pH, time, etc.) together with the
yield.
• One may, of course, wish to combine more
than one of these criteria in a single query
(e.g. “find all deposited reaction schemes
which involve the reduction of an aliphatic
aldehyde to an alcohol where the temperature
is less than 150 °C and the yield is greater than
75%”).
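The combined query quoted above can be sketched as a filter over reaction records. This is a simplified illustration: the `Reaction` fields, the four sample records and the function name are hypothetical, and reactant and product classes are stored as plain text labels here, whereas a real reaction database would identify them by structure or substructure matching.

```python
from dataclasses import dataclass

@dataclass
class Reaction:
    reactant_class: str
    product_class: str
    transformation: str
    temperature_c: float
    yield_percent: float

# Hypothetical deposited reactions.
reactions = [
    Reaction("aliphatic aldehyde", "alcohol", "reduction", 25.0, 92.0),
    Reaction("aliphatic aldehyde", "alcohol", "reduction", 180.0, 88.0),
    Reaction("aliphatic aldehyde", "alcohol", "reduction", 0.0, 60.0),
    Reaction("aromatic ketone", "alcohol", "reduction", 25.0, 95.0),
]

def find_aldehyde_reductions(records, max_temp_c=150.0, min_yield=75.0):
    """Combine transformation, reactant, product, temperature and yield criteria."""
    return [
        r for r in records
        if r.transformation == "reduction"
        and r.reactant_class == "aliphatic aldehyde"
        and r.product_class == "alcohol"
        and r.temperature_c < max_temp_c
        and r.yield_percent > min_yield
    ]

hits = find_aldehyde_reductions(reactions)
```

Only the first record satisfies every criterion: the second is too hot, the third has too low a yield, and the fourth starts from a different reactant class.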
RELATIONAL DATA BASE
• Relational database systems have been used for
many years to store numeric and textual data.
• Most databases include an identifier for each
structure, such as a Chemical Abstracts Registry
Number (a CAS number), an internal registry
number, catalogue number or chemical name.
Other types of data that may be present include
measured and calculated physical properties,
details of supplier and price where appropriate,
date of synthesis and so on.
• A typical database contains a number of
tables, linked via unique identifiers. By way of
example, suppose we wish to construct a
simple relational database to hold inventory
and assay data relating to biological testing.
• In a relational database the data is stored in
rectangular tables, the columns corresponding
to the data items and each row representing a
different piece of data.
EXAMPLE OF RELATIONAL DATA BASE
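As a rough sketch of the inventory and assay example described above, the following uses Python's built-in sqlite3 module with an in-memory database. The table layout, column names, compound identifiers and measurements are all hypothetical; the point is only that the two tables are linked through a shared unique identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables linked via the compound identifier.
conn.executescript("""
CREATE TABLE inventory (
    compound_id TEXT PRIMARY KEY,
    name        TEXT,
    amount_mg   REAL
);
CREATE TABLE assay (
    assay_id    INTEGER PRIMARY KEY,
    compound_id TEXT REFERENCES inventory(compound_id),
    target      TEXT,
    ic50_nm     REAL
);
""")

conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("CPD-001", "compound A", 12.5), ("CPD-002", "compound B", 3.0)],
)
conn.executemany(
    "INSERT INTO assay (compound_id, target, ic50_nm) VALUES (?, ?, ?)",
    [("CPD-001", "kinase X", 40.0),
     ("CPD-002", "kinase X", 900.0),
     ("CPD-001", "kinase Y", 2500.0)],
)

# Join the tables: which potent compounds are in stock, and how much?
rows = conn.execute("""
    SELECT i.compound_id, i.amount_mg, a.target, a.ic50_nm
    FROM inventory AS i
    JOIN assay AS a ON a.compound_id = i.compound_id
    WHERE a.ic50_nm < 100
    ORDER BY a.ic50_nm
""").fetchall()
```

Each row of `inventory` describes one compound and each row of `assay` one measurement, so a compound tested against several targets appears once in the first table and several times in the second.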
3D DATA BASE SEARCH
• EXPERIMENTAL 3D DATABASES:
– Cambridge Structural Database (CSD)
– Protein Data Bank (PDB)
Cambridge Structural Database (CSD)

• Cambridge Crystallographic Data Centre (CCDC)

• The CSD is the world's repository of highly
curated experimentally determined organic and
metal-organic crystal structures. It is used
globally by scientists in over 70 countries to
understand how molecules behave and interact
in three dimensions in the solid form and
ultimately how this affects physical properties.
• More than 1 million structures
• CCDC are world-leading experts in structural chemistry data, software
and knowledge for materials and life science research and application.
• They are dedicated to the advancement of chemistry and
crystallography for the public benefit. They specialise in the collation,
preservation and application of scientific structural data for use in
pharmaceutical discovery, materials development and research and
education.
• CCDC compile and distribute the Cambridge Structural Database (CSD),
a certified trusted database of fully curated and enhanced organic and
metal-organic structures, used by researchers across the globe.
• Their cutting-edge software empowers scientists to extract invaluable
insights from the vast dataset, informing and accelerating their
research & development.
• https://www.ccdc.cam.ac.uk/structures/UnlicensedEnquiry?id=UnitCellSearch
PDB-Protein Data Bank
• The Protein Data Bank (PDB) contains more than 44,000 X-ray and
nuclear magnetic resonance (NMR) structures of proteins and
protein–ligand complexes, and some nucleic acid and
carbohydrate structures.
• The Protein Data Bank (PDB)[1] is a database for the
three-dimensional structural data of large biological molecules, such as
proteins and nucleic acids. The data, typically obtained by
X-ray crystallography, NMR spectroscopy, or, increasingly,
cryo-electron microscopy, and submitted by biologists and
biochemists from around the world, are freely accessible on the
Internet via the websites of its member organisations (PDBe,[2]
PDBj,[3] RCSB,[4] and BMRB[5]). The PDB is overseen by an
organization called the Worldwide Protein Data Bank (wwPDB).
• The PDB is more of a communal repository of data
files, one for each protein structure. Founded in
1971, the PDB now contains approximately 44,000
structures, most obtained using X-ray
crystallography but with some determined using
NMR.
• The PDB is most commonly accessed via a web
interface, which enables structures to be retrieved
using various textual queries (such as by author,
protein name, literature citation).
• Some web interfaces also enable searches to
be performed using amino acid sequence
information. As the number of protein
structures has grown, it has been
recognised that the “flat file” system is
inappropriate, and more modern database
systems and techniques have been introduced
based on the information in the PDB.
• The PDB has been used extensively to further
our understanding of the nature of protein
structure and its relationship to the amino
acid sequence. For example, various
classification schemes have been proposed for
dividing protein structures into families.
• The structures in the PDB also form the basis for
comparative modelling (also known as homology
modelling), where one attempts to predict the
conformation of a protein of known sequence but
unknown structure using the known 3D structure
of a related protein.
• The PDB has also provided information about the
nature of the interactions between amino acids,
and between proteins and water molecules and
small-molecule ligands.
3D PHARMACOPHORE
• A major use of 3D database systems is for the identification of
compounds that possess 3D properties believed to be
important for interaction with a particular biological target.
These requirements can be expressed in a variety of ways, one
of the most common being as a 3D pharmacophore. A 3D
pharmacophore is usually defined as a set of features together
with their relative spatial orientation. Typical features include
hydrogen bond donors and acceptors, positively and negatively
charged groups, hydrophobic regions and aromatic rings. The
use of such features is a natural extension of the concept of
bioisosterism, which recognises that certain functional groups
have similar biological, chemical and physical properties.
• Bioisosteres are chemical substituents or groups with
similar physical or chemical properties which produce
broadly similar biological properties to another
chemical compound.
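A 3D pharmacophore defined as a set of features with their relative spatial arrangement can be sketched as feature types plus target distances between them. The three-point pharmacophore, the distances in ångströms, the tolerance and the two test conformations below are all hypothetical, and the brute-force search over feature assignments is only practical for very small examples.

```python
from itertools import permutations
from math import dist

# Hypothetical three-point pharmacophore: feature types and pairwise distances (Å).
PHARMACOPHORE_TYPES = ["donor", "acceptor", "aromatic"]
PHARMACOPHORE_DISTANCES = {(0, 1): 5.0, (0, 2): 6.5, (1, 2): 4.0}
TOLERANCE = 0.5

def matches(features, types=PHARMACOPHORE_TYPES,
            distances=PHARMACOPHORE_DISTANCES, tol=TOLERANCE):
    """features: list of (type, (x, y, z)) for one conformation of a molecule.

    Returns True if some assignment of features to pharmacophore points has
    the right types and reproduces every distance within the tolerance.
    """
    for combo in permutations(features, len(types)):
        if any(feature[0] != wanted for feature, wanted in zip(combo, types)):
            continue
        if all(abs(dist(combo[i][1], combo[j][1]) - target) <= tol
               for (i, j), target in distances.items()):
            return True
    return False

# One conformation that fits the pharmacophore and one that does not.
hit = [("donor", (0.0, 0.0, 0.0)),
       ("acceptor", (5.0, 0.0, 0.0)),
       ("aromatic", (5.1, 4.0, 0.0))]
miss = [("donor", (0.0, 0.0, 0.0)),
        ("acceptor", (5.0, 0.0, 0.0)),
        ("aromatic", (1.0, 1.0, 0.0))]

hit_result = matches(hit)
miss_result = matches(miss)
```

Because only feature types and distances are compared, two molecules with different functional groups in the same arrangement match equally well, which is the link to bioisosterism noted above.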
• PDB file formats:
• The Protein Data Bank (pdb) file format is a
textual file format describing the three-dimensional
structures of molecules held in the
Protein Data Bank. The pdb format accordingly
provides for description and annotation of protein
and nucleic acid structures, including atomic
coordinates, secondary structure assignments and
atomic connectivity. In addition,
experimental metadata are stored. PDB format is
the legacy file format for the Protein Data Bank,
which now keeps data on biological
macromolecules in the newer mmCIF file format.
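Since the legacy PDB format is plain text with fixed-width columns, atomic coordinates can be read with simple string slicing. The sketch below handles only a few fields of ATOM and HETATM records; the sample record is illustrative rather than taken from a real entry, and the `Atom` type and `parse_atoms` function are hypothetical names.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float

# Illustrative ATOM record, assembled field by field to show the column layout.
SAMPLE_LINE = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # columns 7-11: atom serial number
    " "           # column 12: blank
    " N  "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "MET"         # columns 18-20: residue name
    " "           # column 21: blank
    "A"           # column 22: chain identifier
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and blanks
    "  27.340"    # columns 31-38: x coordinate (Å)
    "  24.430"    # columns 39-46: y coordinate (Å)
    "   2.614"    # columns 47-54: z coordinate (Å)
)

def parse_atoms(text):
    """Extract atoms from the ATOM and HETATM records of PDB-format text."""
    atoms = []
    for line in text.splitlines():
        if line[:6] not in ("ATOM  ", "HETATM"):
            continue
        atoms.append(Atom(
            serial=int(line[6:11]),
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain_id=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms

atoms = parse_atoms(SAMPLE_LINE)
```

For real work, an established parser that also understands mmCIF is a safer choice than hand-written slicing, because the legacy format has many record types and edge cases that this sketch ignores.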
Pharmacophore
• A pharmacophore is an abstract description
of molecular features that are necessary for
molecular recognition of a ligand by a
biological macromolecule.
