2012BC IntroductiontoBioinformatics PDF
net/publication/236630283
Introduction To Bioinformatics
All content following this page was uploaded by Dr. Kunwar Singh Vaisla on 04 June 2014.
Satyameva Jayate
Government of Uttarakhand
BIOINFORMATICS
Tools & Applications
Editor-in-Chief
Dr. J.M.S Rana, Ph.D.
Director,
Uttarakhand State Biotechnology Department,
Government of Uttarakhand
2012
Editor:
Dr. Kunwar Singh Vaisla, Ph.D.
Associate Professor,
Department of Computer Science & Engineering
BT Kumaon Institute of Technology
Dwarahat 263653, District Almora (Uttarakhand)
Co-Editor:
Mr. Rajendra Bharti,
Associate Professor,
Department of Computer Science & Engineering
BT Kumaon Institute of Technology
Dwarahat 263653, District Almora (Uttarakhand)
Disclaimer:
This document may be freely reviewed, abstracted, reproduced and translated in part
or in whole, but not for sale or for any commercial purpose. All reasonable precautions
have been taken by the editorial board and authors to verify the information contained in
this publication. However, the material is distributed without warranty of any kind,
either expressed or implied, and the responsibility for its interpretation lies with the reader.
The document is being published by USBD for free distribution among
trainees, participants and students.
And
Printed by:
Charu Printers, Dehradun
Phone : 0135-2727591
(Dr. C. Raman Suri)
Chief Scientist
Head-Nanobiotechnology
PREFACE
The unprecedented growth of computer science and
biotechnology has rapidly spawned both specialized research groups and
industries. Bioinformatics generally relates to biological
molecules and therefore requires knowledge and understanding of
fields such as biochemistry, molecular biology, biophysics and mathematical
modeling, to name a few. Thus, bioinformatics is a multi-disciplinary
field that brings together computer science, information
technology, biotechnology and related areas, encompassing the analysis and
interpretation of biological data, the modeling of biological processes and
the development of related algorithms and statistics. Bioinformatics aims to
solve important biological problems using computational
techniques and applications. Prominent examples include
gene-protein interactions, protein structure prediction,
molecular networks, structures and sequences, and genomics and
proteomics. Knowledge of bioinformatics helps tremendously in the
analysis and interpretation of various types of biological data, including
nucleotide and amino-acid sequences, protein domains and folding, and
protein structures. It is bioinformatics that made
it possible to identify the human genes, determine the complete sequence
of the approximately 3 billion base pairs (DNA subunits) of the human
genome, and make them accessible for further biological
studies. Research in this field therefore requires close collaboration
among multi-disciplinary teams of researchers in computer science,
statistics, physics, engineering, life sciences, medical sciences, and their
interfaces.
(J.M.S.Rana)
INDEX
Ch. No.  Topic                                          Page No.
1        Message                                        3
2        Foreword                                       5
3        Preface                                        7
4        Introduction To Bioinformatics                 11
         Jag Mohan Singh Rana & Kunwar Singh Vaisla
10       Phylogenetic Analysis                          74
         Kumar Sachin
12       Proteomics                                     113
         C. Rajasekaran & T. Kalaivani
What is Bioinformatics?
Bioinformatics is an interdisciplinary research area at the interface
between biological and computational sciences. Although the term
'Bioinformatics' is not really well-defined, you could say that this
scientific field deals with the computational management of all kinds of
molecular biological information. Most of the bioinformatics work that
is being done deals with either analyzing biological data, or with the
organization of biological information.
As a consequence of the large amount of data produced in the field of
molecular biology, most of the current bioinformatics projects deal with
structural and functional aspects of genes and proteins.
Defining the terms bioinformatics and computational biology is not
necessarily an easy task, as evidenced by the multiple definitions available
on the web. A Google search for "definition of bioinformatics"
returned over 43,000 results. In the past few years, as the areas have
grown, confusion between these two terms has increased. For
some, bioinformatics and computational biology have become
completely interchangeable, while for others there is a great
distinction. The definitions given here are based on what, in my experience,
is the consensus use of these two terms.
Computational biology and bioinformatics are multidisciplinary fields,
involving researchers from different areas of specialty, including (but
by no means limited to) statistics, computer science, physics,
biochemistry, genetics, molecular biology and mathematics. The scope of
these two fields is as follows:
• Bioinformatics: Typically refers to the field concerned with the
collection and storage of biological information. All matters
concerned with biological databases are considered
bioinformatics.
• Computational biology: Refers to the aspect of developing
algorithms and statistical models necessary to analyze
biological data through the aid of computers.
(Source: http://www.genome.jp/en/db_growth.html)
3. Third, high-speed digital computers, which had developed from
weapons research programmes during the Second World War,
finally became widely available to academic biologists. Not all
biologists had or wanted to have access to these machines but,
by 1960, scarcity of computers was no longer a serious
stumbling block for the development of computational biology.
2. Aims of bioinformatics
The aims of bioinformatics are three-fold.
1. First, at its simplest, bioinformatics organises data in a way that
allows researchers to access existing information and to submit
new entries as they are produced, e.g. the Protein Data Bank for
3D macromolecular structures.
2. While data curation is an essential task, the information stored
in these databases is essentially useless until analysed. Thus the
second aim is to develop tools and resources that aid
in the analysis of data.
3. The third aim is to use these tools to analyze the data and
interpret the results in a biologically meaningful manner.
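
The three aims above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the in-memory SQLite table, the accession names (DEMO0001, DEMO0002), the toy sequences and the choice of GC content as the "tool" are all hypothetical and are not taken from any real database.

```python
import sqlite3


def gc_content(sequence: str) -> float:
    """Aim 2 (a tool): return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Aim 1: organise data so that existing entries can be accessed
# and new entries can be submitted. All records here are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, dna TEXT)"
)
entries = [
    ("DEMO0001", "organism A", "ATGCGCGCTA"),
    ("DEMO0002", "organism B", "ATATATATGC"),
]
conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", entries)

# Aim 3: use the tool on the stored data and interpret the result.
rows = conn.execute(
    "SELECT accession, dna FROM sequences ORDER BY accession"
).fetchall()
gc_by_accession = {accession: gc_content(dna) for accession, dna in rows}
richest = max(gc_by_accession, key=gc_by_accession.get)

print(gc_by_accession)
print("Highest GC content:", richest)
```

The separation mirrors the text: storage and retrieval are independent of the analysis function, so either side could be replaced without touching the other.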
Traditionally, biological studies examined individual systems in detail
and frequently compared them with a few that are related. In
bioinformatics, we can conduct global analyses of all the available data
with the aim of uncovering common principles that apply across many
systems and highlighting features that are unique to some.
Research Areas of Bioinformatics
(Source: http://www.mpi-inf.mpg.de/departments/d3/areas/docking.html)
(iv) Biological database: Biological databases are libraries of life
sciences information, collected from scientific experiments,
published literature, high-throughput experiment technology,
and computational analyses. They contain information from
research areas including genomics, proteomics, metabolomics,
microarray gene expression, and phylogenetics. Information
contained in biological databases includes gene function,
structure, localization (both cellular and chromosomal), clinical
effects of mutations as well as similarities of biological sequences
and structures.
(Source: http://www.mrc-lmb.cam.ac.uk/genomes/madanm/pres/biodb.htm)
(v) Biological Data Mining: Biological data mining is the discovery
of useful knowledge from biological databases. Data mining
employs algorithms and techniques from statistics, machine
learning, artificial intelligence, databases and data warehousing.
Some of the most popular tasks are classification,
clustering, association and sequence analysis, and regression.
(Source: http://cs.salemstate.edu/hatfield/teaching/courses/DataMining/M.htm)
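
As an illustration of one of the tasks listed above, clustering, the following is a minimal k-means sketch in plain Python. The data points (imagined as two-condition expression profiles), the starting centroids and the function name kmeans are hypothetical choices made for this example; real analyses would normally rely on an established library.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its members; repeat until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignment = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignment, centroids


# Hypothetical toy data: four profiles forming two obvious groups.
profiles = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.2), (5.2, 4.8)]
labels, centres = kmeans(profiles, centroids=[profiles[0], profiles[2]])
print(labels)
print(centres)
```

Fixing the starting centroids keeps the sketch deterministic; in practice they are usually chosen at random and the procedure is repeated several times.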
(vi) Microarray informatics: Microarray Technology is a powerful
tool to monitor gene expression or gene expression changes of
hundreds or thousands of genes in a single experiment.
(Source: http://www.accessexcellence.org/RC/VL/GG/microArray.php)
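
A common way to express gene expression changes between two conditions is the log2 fold change. The sketch below assumes hypothetical intensity values for three made-up genes and an arbitrary two-fold threshold; it ignores normalisation, replicates and statistical testing, all of which matter in a real microarray analysis.

```python
from math import log2

# Hypothetical signal intensities for the same genes in two samples.
control = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
treated = {"geneA": 400.0, "geneB": 200.0, "geneC": 12.5}


def log2_fold_changes(control, treated):
    """Return log2(treated / control) for every gene present in control."""
    return {gene: log2(treated[gene] / control[gene]) for gene in control}


changes = log2_fold_changes(control, treated)

# Assumed threshold: at least a two-fold change in either direction.
up_regulated = sorted(g for g, value in changes.items() if value >= 1.0)
down_regulated = sorted(g for g, value in changes.items() if value <= -1.0)

print(changes)
print("Up:", up_regulated, "Down:", down_regulated)
```

Working on the log2 scale makes increases and decreases symmetric: a four-fold increase is +2 and a four-fold decrease is -2.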
(vii) Molecular Phylogenetics: Molecular phylogenetics is the study of
organisms on a molecular level to gather information about the
phylogenetic relationships between different organisms.
(Source: http://www.tulane.edu/~wiser/protozoology/notes/tree.html)
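
One simple way to quantify molecular relationships is the p-distance: the proportion of positions at which two aligned sequences differ. The sketch below is a minimal illustration, assuming three hypothetical, already aligned sequences of equal length; it is not a full tree-building method and applies no evolutionary model.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Hypothetical aligned sequences; names and bases are made up.
aligned = {
    "species_1": "ATGCTAGCTA",
    "species_2": "ATGCTAGCTT",
    "species_3": "TTGATAGGTT",
}

# Pairwise distance for every pair of species.
distances = {
    (a, b): p_distance(aligned[a], aligned[b])
    for a, b in combinations(sorted(aligned), 2)
}
closest_pair = min(distances, key=distances.get)

print(distances)
print("Most similar pair:", closest_pair)
```

A distance table of this kind is the usual input to distance-based tree-building methods, which group the most similar sequences first.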
(Source: http://kdbio.inesc-id.pt/~orlando/kdbio/software.html)