
CADD Unit 4 2

1. Bioinformatics is an interdisciplinary field that uses computer science, statistics, and mathematics to analyze and interpret biological data. 2. The goals of bioinformatics are to provide scientists with computational tools to explain normal biological processes, diseases, and improve drug discovery. 3. Key applications include sequence analysis, data analysis of nucleotide and protein sequences, and development of tools for efficient data access and management.

Uploaded by

mohit

7.1. INFORMATICS

Informatics is the use of computer systems to store, organise and process data. Informatics brings convenience and efficiency to the healthcare industry, especially when it comes to the tracking and management of pharmaceuticals.

Pharmacy informatics is defined by HIMSS as "The scientific field that focuses on medication-related data and knowledge within the continuum of healthcare systems, including its acquisition, storage, analysis, use and dissemination, in the delivery of optimal medication-related patient care and health outcomes".

The American Society of Health-System Pharmacists issued a practice document called the Statement on the Pharmacist's Role in Informatics. The statement reaffirmed the responsibilities of the pharmacist in the broad area of pharmacy informatics. The society outlines five broad areas of responsibility for this role in healthcare informatics:
1) Information management
2) Knowledge delivery
3) Data analytics
4) Clinical informatics
5) Change management

7.1.1. Bioinformatics

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics and engineering to analyse and interpret biological data. Bioinformatics has been used for in-silico analyses of biological queries using mathematical and statistical techniques. Bioinformatics derives knowledge from computer analysis of biological data.

Bioinformatics or computational biology is the use of mathematical and informational techniques, including statistics, to solve biological problems, usually by creating or using computer programs, mathematical models or both. One of the main areas of bioinformatics is the data mining and analysis of the data gathered by the various genome projects. Other areas are sequence alignment, protein structure prediction, systems biology, protein-protein interactions and virtual evolution.

Bioinformatics is the science of developing computer databases and algorithms for the purpose of speeding up and enhancing biological research.
Informatics and Methods in Drug Design (Chapter 7)

7.1.1.1. Aim

In general, there are three aims of bioinformatics.

The first aim of bioinformatics is to store biological data organised in the form of a database. This gives researchers easy access to existing information and lets them submit new entries. These data must be annotated to give them a suitable meaning or to assign their functional characteristics. The databases must also be able to correlate between different hierarchies of information.

The second aim is to develop tools and resources that aid in the analysis of data.

The third and most important aim of bioinformatics is to exploit these computational tools to analyse the biological data and interpret the results in a biologically meaningful manner.

7.1.1.2. Goal

The goals of bioinformatics thus are to provide scientists with a means to explain:
1) Normal biological processes.
2) Malfunctions in these processes which lead to diseases.
3) Approaches to improving drug discovery.

There are three important sub-disciplines within bioinformatics:
1) The development of new algorithms and statistics with which to assess relationships among members of large data sets;
2) The analysis and interpretation of various types of data including nucleotide and amino acid sequences, protein domains, and protein structures;
3) The development and implementation of tools that enable efficient access and management of different types of information.

7.1.1.3. Application

Bioinformatics joins mathematics, statistics, computer science and information technology to solve complex biological problems. These problems are usually at the molecular level and cannot be solved by other means. This interesting field of science has many applications and research areas where it can be applied.

All the applications of bioinformatics are carried out at the user level. Biologists, including students at various levels, can use certain applications and use the output in their research or study. The various bioinformatics applications can be categorised under the following groups:
1) Sequence Analysis: All the applications that analyse various types of sequence information and can compare similar types of information are grouped under Sequence Analysis. Sequence analysis determines those genes which encode regulatory sequences or peptides by using the information obtained from sequencing. For sequence analysis, there are many powerful tools and computers which perform the duty of analysing the genome sequences of various organisms. These computers and tools also detect DNA mutations in an organism and identify those sequences which are related. Shotgun sequencing techniques are also used for the sequence analysis of numerous fragments of DNA. Special software is used to find the overlaps between fragments and to assemble them.
2) Function Analysis: These applications analyse the function encoded within the sequences and help predict the functional interaction between various proteins or genes. Expression analysis of various genes is also a prime topic for research these days.
3) Structure Analysis: When it comes to the realm of RNA and proteins, structure plays a vital role in the interaction with any other molecule. This gave birth to a whole new branch termed Structural Bioinformatics, which is devoted to predicting the structures and possible roles of proteins or RNA.
4) Prediction of Protein Structure: It is easy to determine the primary structure of a protein, that is, its amino acid sequence, from the DNA sequence that encodes it, but it is difficult to determine the secondary, tertiary or quaternary structures of proteins. For this purpose, either crystallography is used or tools of bioinformatics can be used to determine the complex protein structures.
5) Genome Annotation: In genome annotation, genomes are marked to identify the regulatory sequences and protein-coding regions. It is a very important part of the human genome project as it determines the regulatory sequences.
6) Comparative Genomics: It is the branch of bioinformatics which determines the relation between genomic structure and function across different biological species. For this purpose, intergenomic maps are constructed which enable scientists to trace the processes of evolution that occur in the genomes of different species. These maps contain information about point mutations as well as information about the duplication of large chromosomal segments.
7) Health and Drug Discovery: The tools of bioinformatics are also helpful in drug discovery, diagnosis and disease management. Complete sequencing of human genes has enabled scientists to make medicines and drugs which can target more than 500 genes. Different computational tools and drug targets have made drug delivery easy and specific, because now only those cells which are diseased or mutated can be targeted. It is also easier to know the molecular basis of a disease.
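
The overlap-finding step mentioned under Sequence Analysis can be illustrated with a minimal sketch. It assumes error-free reads and an exact suffix-to-prefix match, which real assembly software does not require. The function names and the two reads are invented for illustration only.

```python
def overlap_length(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_fragments(left: str, right: str, min_len: int = 3) -> str:
    """Join two reads on their overlap (plain concatenation if none is found)."""
    size = overlap_length(left, right, min_len)
    return left + right[size:]


# Two invented reads that share the overlap "GTCA".
read_a = "ATGCGTCA"
read_b = "GTCATTAG"
contig = merge_fragments(read_a, read_b)
print(contig)
```

A real assembler repeats this over many reads, tolerates sequencing errors and handles both strands. The sketch only shows the idea of detecting an overlap and merging on it.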

Application of Bioinformatics in Current Research

Currently almost every field of biological research, whether molecular biology, genetics or even agriculture, has accepted bioinformatics as a research tool. There is a completely new emerging field of genome informatics which is based entirely on bioinformatics tools. Apart from these, there are many areas where bioinformatics is readily being accepted, with a primary role in the prediction of structural similarity and functional similarity in novel drug molecule research. Researchers initially perform tasks such as:

1) Submitting DNA Sequences to the Databases: This is one of the important things in biological research. Scientists sequence DNA and RNA, but until a sequence is deposited in a public sequence database it cannot benefit the scientific community. It has therefore become essential to submit sequenced data to public sequence repositories. Some of the important public repositories are DDBJ, EMBL and GenBank. Sequence data can be submitted to these repositories in two ways, by mail submission or by online submission. After submission, every sequence is checked for duplication and, after verification, the sequence database assigns a unique accession number to each unique sequence. The accession number was given as a single letter followed by a 5 digit number, but recently, due to the huge number of submissions, an accession number of two letters followed by a 6 digit number is now proposed.
2) Genomic Mapping and Mapping Databases: Gene mapping is one of the techniques used to estimate the accurate position of a gene and the corresponding distance between related genes of a similar type. After complete evaluation, a conclusion can be reached about the genome map or complete genome of that particular organism.
3) Information Retrieval from Biological Database: Developing biological databases and making them available online was one of the primary concerns at the initial stage of biological research, but now that many biological databases exist, it should be known how to retrieve exact data from a suitable database. Biological data is in the form of text, tables, pictures and many other formats. Database retrieval may be text data retrieval or sequence retrieval, or it may also include structural data retrieval.
4) Sequence Alignment and Database Searching: Alignment of a sequence against other relevant and similar sequences is very much needed in biological research to understand the relation between two sequences and also to predict structure and function based on sequence similarity. For basic alignment of sequences, the use of BLAST is very common. Based on the number of sequences involved, these alignments can be classified into pairwise alignment or multiple sequence alignment.
5) Predictive Methods using DNA Sequences: Gene-finding strategies can be classified into three major categories:
i) Content-based methods rely on the overall bulk properties of a sequence in making a determination. Characteristics considered here include how often particular codons are used, the periodicity of repeats, and the compositional complexity of the sequence. Because different organisms use synonymous codons with different frequency, such clues can provide insight into determining regions that are more likely to be exons.
ii) In site-based methods, the focus turns to the presence or absence of a specific sequence, pattern, or consensus. These methods are used to detect features such as donor and acceptor splice sites, binding sites for transcription factors, polyA tracts, and start and stop codons.
iii) Comparative methods can make determinations based on sequence homology. Here, translated sequences are subjected to database searches against protein sequences to determine whether a previously characterised coding region corresponds to a region in the query sequence. Although this is conceptually the most straightforward of the methods, it is restrictive because most newly discovered genes do not have gene products that match anything in the protein databases. Tools associated with these methods are GRAIL, GENSCAN, FGENES, PROCRUSTES and many others.
6) Predictive Methods Using Protein Sequences: There are tools based on predictive methods using protein sequences, such as PSLpred, NRpred and PSEApred. There are also other methods based on motif level, residue level, signal level, peptide level, domain level and profiles.
7) Sequence Assembly and Finishing Methods: At present, the sequencing process is often talked of as consisting of two parts, namely assembly and finishing, but in practice there is considerable overlap between the two. Assembly is the process of attempting to order and align the readings, and finishing is the task of checking and editing the assembled data. This includes performing new sequencing experiments to fill gaps or to cover segments where the data is poor, and adjudicating between conflicting readings when editing.
8) Phylogenetic Analysis: Phylogenetic analysis is also one of the important implementations of bioinformatics in biological research. Phylogenetic analysis is the study of the ancestral history of an organism. Starting from sequence and structural similarities, it tries to relate the ancestral histories of organisms to show how their origins are related to each other and what the order of evolution was.

Evolutionary history is analysed by phylogenetic analysis. Many tools are available online, as well as standalone packages such as PHYLIP, which generate trees with algorithms based on methods such as UPGMA and neighbour joining.
9) Comparative Genome Analysis: Comparative genome analysis is also being performed at many levels, in both academic and professional research. By comparing the finished reference sequence of the human genome with the genomes of other organisms, researchers can identify regions of similarity and difference.

This information can help scientists better understand the structure and function of human genes and thereby develop new strategies to combat human disease. Comparative genomics also provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved among species, as well as genes that give each organism its unique characteristics.
10) Large-Scale Genome Analysis: Large-scale genome analysis is complete genome sequencing. This application has advanced greatly as next generation sequencing and bioinformatics tools, such as those from Illumina, have been developed to analyse genomes very quickly. These instruments are generally termed sequencers and play a vital role in modern biological research.
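
The pairwise alignment named in item 4 can be illustrated with a small dynamic-programming sketch. This is the Needleman-Wunsch global alignment score, not BLAST, which is a faster heuristic local search. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen only for the example.

```python
def global_alignment_score(seq_a: str, seq_b: str,
                           match: int = 1, mismatch: int = -1,
                           gap: int = -1) -> int:
    """Best global alignment score of two sequences (Needleman-Wunsch)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            gap_in_b = score[i - 1][j] + gap   # residue of seq_a against a gap
            gap_in_a = score[i][j - 1] + gap   # residue of seq_b against a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATACA"))
```

Multiple sequence alignment generalises the same idea to more than two sequences, usually with heuristics, because the exact dynamic programme becomes too expensive.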

7.1.2. Chemoinformatics

Chemoinformatics plays an important role in chemistry and chemical research. The term was coined and defined by F.K. Brown, G. Paris, M. Hann, R. Green and Johann Gasteiger, chemical scientists with computing expertise.

Definitions
According to F.K. Brown, "The use of information technology and management has become a critical part of the drug discovery process. Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimisation."

According to M. Hann and R. Green, "Chemoinformatics, a new name for an old problem."

According to G. Paris, "Chemoinformatics is a generic term that encompasses the design, creation, organisation, management, retrieval, analysis, dissemination, visualisation and use of chemical information."

According to J. Gasteiger et al., "Chemoinformatics is the application of informatics methods to solve chemical problems."

Figure 7.1: From Data through Information to Knowledge (diagram: Data from measurements and calculations becomes Information through context, and Information becomes Knowledge through abstraction)

7.1.2.1. Objective

Chemoinformatics should assist the chemist in solving some of the following fundamental problems:
1) To design molecules with desired properties. The major task of a chemist is to make compounds with desired properties, to establish the structure-activity or structure-property relationships (SAR or SPR) needed for finding such compounds, or even to establish these relationships in a quantitative manner (QSAR or QSPR).
2) To design reactions and syntheses to make these compounds. Designing a synthesis includes choosing the sequence of reactions and the starting materials to be used to synthesise the desired compound.
3) To analyse and elucidate the structures obtained in reactions. There is a need to establish the structure of the reaction product by using various tools of structure elucidation.
4) To transform data into knowledge through information processing for the intended purpose of making better decisions faster.
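
A quantitative structure-activity relationship can be as simple as a straight line relating one descriptor to activity. The sketch below fits such a line by ordinary least squares. The descriptor and activity values are synthetic, generated from activity = 2 x descriptor + 1 purely so that the result is easy to check. They are not measurements, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic training set: one descriptor value and one activity per compound.
descriptor = [0.5, 1.0, 1.5, 2.0]
activity = [2.0, 3.0, 4.0, 5.0]

model = fit_line(descriptor, activity)
print(model, predict(model, 3.0))
```

Practical QSAR models use many descriptors, real assay data and proper validation. The point here is only that the relationship is expressed as a fitted equation that can then predict the activity of an untested compound.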
Computer Aided Drug Design
7.1.2.2. Chemoinformatics Tasks

Chemoinformatics collects, manages, analyses and disseminates the chemical information needed for drug discovery. Some of the tasks in chemoinformatics research are:
1) Analysis of HTS data.
2) Similarity search of chemicals.
3) Design of combinatorial libraries.
4) Design of focused libraries.
5) Comparison of the similarity/diversity of libraries.
6) Virtual screening.
7) Docking.
8) De novo design.
9) Pharmacophore perception.
10) Prediction of affinities, physicochemical properties and pharmacokinetic properties.
11) Establishment of QSAR models which can be interpreted and guide the further development of a new drug.
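
Similarity searching (task 2) is commonly done by comparing bit-string fingerprints with the Tanimoto coefficient, the number of bits set in both fingerprints divided by the number set in either. The sketch below is a minimal illustration. The 8-bit fingerprints and the molecule labels are invented, and real fingerprints are far longer.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient of two equal-length '0'/'1' fingerprint strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    return both / either if either else 0.0


# Invented query and library fingerprints.
query = "11010010"
library = {
    "mol_1": "11010010",
    "mol_2": "11000011",
    "mol_3": "00101100",
}

# Rank library entries from most to least similar to the query.
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]),
                reverse=True)
print(ranked)
```

The same coefficient can serve library comparison (task 5): averaging pairwise values within a library gives a rough measure of how diverse it is.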

7.1.2.3. Need of Chemoinformatics

Chemoinformatics plays a key role in maintaining and accessing the enormous amount of chemical data produced by chemists (more than 45 million chemical compounds are known, and the number increases by millions every year) by using a proper database. Also, the field of chemistry needs novel techniques for knowledge extraction from data, to model complex relationships between the structure of a chemical compound and its biological activity, or the influence of reaction conditions on chemical reactivity. Chemoinformatics has a wide range of applications, and Figure 7.2 shows some specific research areas.

Figure 7.2: Need for Chemoinformatics (diagram: Chemoinformatics at the centre, linked to Environmental effects and Hazards, Analysis and Spectroscopy, Modelling, Chemical and physical reference data, Environment, Pharmacology, Toxicology and Regulations)

Three major aspects of Chemoinformatics are:
1) Information Acquisition is the process of generating and collecting data empirically (experimentation) or from theory (molecular simulation).
2) Information Management deals with the storage and retrieval of information.
3) Information Use includes data analysis, correlation and applications to problems in the chemical and biochemical sciences.

7.1.2.4. Role of Chemoinformatics in Modern Drug Discovery

Recent developments in drug discovery are generating a lot of chemical data, which is referred to as the information explosion. This has created a demand to effectively collect, organise, analyse and apply chemical information in the process of modern drug discovery and development. The drug discovery process is aimed at discovering molecules that can be very rapidly developed into effective treatments to meet medical needs.

The entanglement of chemistry and information management started in the mid-1970s, with applications in the prediction of protein structure, Fourier transforms in X-ray crystallography, enzyme and chemical kinetics, the analysis of various types of spectroscopy data, and the binding of chemical compounds. During the early 1980s, computer technology came to be considered a core component by the medicinal chemist for solving chemical problems.

For example, the collection of crystal structures of small molecules in the Cambridge Structural Database (CSD) provides a fertile resource of geometrical data on molecular fragments for the calibration of force fields and the validation of results from computational chemistry. The need to store macromolecular data resulted in the Protein Data Bank (PDB). The needs and refinements of these approaches have resulted in several tools and have improved the process of solving these problems.

The traditional drug discovery process starts with a particular disease, followed by identification of a target, identification of a molecule effective against the target, and preclinical testing. Identifying the target and synthesising and modifying molecules to increase their suitability takes a large amount of time and cost (in millions), and is done in the "wet lab". This is the area where chemical informatics plays its major role in the drug discovery process. The development process then continues with human clinical trials, approval from the authority and delivery of the product to the market. This process takes about 10-15 years to discover, develop and bring a drug to the market.

The modern pharmaceutical drug discovery and development pipeline, as shown in Figure 7.3, starts with disease selection, followed by target identification, lead identification, lead optimisation, pre-clinical trial testing, clinical trial testing, and approval and circulation (drug in market).

In the modern drug discovery system, the steps of the traditional process which cost the most time and money are replaced by the lead identification and lead optimisation processes. Each phase has an interaction component that transfers data, knowledge and information to the others (shown in Figure 7.4).
Figure 7.3: Modern Drug Discovery and Development Life Cycle (stages shown: Disease Identification, Target Identification and Validation, Lead Identification, Lead Optimisation, Pre-Clinical Trials, Clinical Trials, Approval from Drug Authority (CDSCO))

Figure 7.4: Interaction Process (diagram: Chemoinformatics linked with HTS, Lead Optimisation, Compound design and ADME)

7.1.2.5. Modules of Chemoinformatics

Chemoinformatics is a significant application of information technology that helps chemists to investigate new problems and to organise, analyse and understand scientific data in the development of novel compounds, materials and processes. The primary modules of Chemoinformatics are Computer-Assisted Synthesis Design, Structure Representation and Chemometrics, shown in Figure 7.5.

Figure 7.5: Modules of Chemoinformatics (diagram: Chemoinformatics branches into CASD, Structure Representation and Chemometrics)

1) Computer-Assisted Synthesis Design (CASD): It is applied mainly where artificial intelligence techniques can be applied. This technique is used in various applications, including the pharmaceutical, food, textile and agro industries.

2) Structure Representation: Various forms of machine-readable chemical representation are the basis for designing chemical databases, where the chemical information is stored for analysis and manipulation. Chemical structure representations can be linear, 2D or 3D in format. Some of the chemical structure representations are shown in Table 7.1. SMILES (Simplified Molecular Input Line Entry System) is one of the linear chemical notation formats and is widely used among chemists for various clinical and analysis purposes. Structure representation deals with Reaction Representation, Structure Descriptors, Molecular Modelling, Structure Searching and Computer-Assisted Structure Elucidation (CASE), as shown in Figure 7.6.

Table 7.1: Some of the Chemical Structure Representations
Name                   Representation
Common Name            Caffeine
Synonyms               [illegible], caffeine, theine, mateine
Empirical formula      C8H10N4O2
IUPAC Name             3,7-dihydro-1,3,7-trimethyl-1H-purine-2,6-dione
CAS Registry Number    58-08-2
WLN Notation           T56 BN DN FNVNVJ B1 F1 H1
SMILES                 CN1C=NC2=C1C(=O)N(C(=O)N2C)C
InChI                  1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
Markush Structure      (structure diagram, not reproduced)
Connection Table       (11 x 11 bond-order matrix, not recoverable from the scan)
Fragment Code          [OH]c1ccccc1
Fingerprint            0001001 10100111
Hash Code              $244987098423150

Reaction Representation helps to understand the basic chemical models, quantify chemical reactivity and extract knowledge from reaction information. Molecular modelling is a method that includes a variety of computational schemes which are aimed at simulating molecular structures, their properties and "in-silico" behaviour.
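
The connection table listed in Table 7.1 is a square matrix whose entry for two atoms holds the order of the bond between them (0 for no bond). The sketch below builds such a matrix from a bond list. The example molecule, the heavy-atom skeleton of acetic acid, and the atom numbering are assumptions chosen for illustration and are not the molecule of the table in the source.

```python
def connection_table(n_atoms, bonds):
    """Build a symmetric bond-order matrix from (atom_i, atom_j, order) tuples.

    Atoms are numbered from 1, as in a printed connection table.
    """
    table = [[0] * n_atoms for _ in range(n_atoms)]
    for atom_i, atom_j, order in bonds:
        table[atom_i - 1][atom_j - 1] = order
        table[atom_j - 1][atom_i - 1] = order  # bonds have no direction
    return table


# Acetic acid without hydrogens: C1-C2, C2=O3, C2-O4.
acetic_acid_bonds = [(1, 2, 1), (2, 3, 2), (2, 4, 1)]
matrix = connection_table(4, acetic_acid_bonds)

for row in matrix:
    print(" ".join(str(value) for value in row))
```

Because most atom pairs are not bonded, the matrix is mostly zeros. Chemical file formats therefore usually store the bond list itself rather than the full matrix.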

Figure 7.6: Computer Assisted Structure Elucidation (CASE) (diagram: Structure Representation branches into Reaction Representation, Structure Descriptors, Molecular Modelling, Structure Searching and CASE)
