
Essentials of Bioinformatics, Volume I: Understanding Bioinformatics: Genes to Proteins

The book 'Essentials of Bioinformatics, Volume I' serves as a comprehensive introduction to bioinformatics, covering essential concepts, databases, and computational methods for analyzing biological data from genes to proteins. It aims to equip readers, regardless of their prior experience, with the necessary skills to handle complex genomic information and utilize various bioinformatics tools effectively. With contributions from experts in the field, the book is designed as a practical guide for students and researchers in molecular biology, genetics, and related disciplines.

Essentials of Bioinformatics, Volume I Understanding

Bioinformatics Genes to Proteins



This book is dedicated to King Abdulaziz
University, Jeddah, Saudi Arabia.
Foreword

Bioinformatics is an interdisciplinary branch of modern biology that uses advanced
computational methods to generate, extract, analyze, and interpret multidimensional
biological data. Rapid technological improvements, especially in molecular and
genetic data generation and collection, have made producing data an easier task for
scientists but have left its handling an extremely complex affair. The gap between
high-throughput data generation and analysis cannot be fully bridged without
computational methods, and this is where bioinformatics comes onto the scene. Every
step of data collection and analysis has its own computational tools, and beginners
trying out bioinformatics are often lost among the large number of ambiguously
defined terms and concepts.
The choice of the essentials of bioinformatics as the main theme of this book series
is well justified. For example, genome and protein biologists regularly deal with
large amounts of lightly annotated data, especially data produced by large-scale
sequencing projects in animal, microbial, plant, or human studies. A fundamental
understanding of the important computational methods has become an inescapable need
for them to perform first-hand analysis and interpretation of complex data without
relying on external sources.
The chapters in this book mainly discuss the introduction to the bioinformatics
field, databases, structural and functional bioinformatics, computer-aided drug
discovery, sequencing, and metabolomics data analysis. All the chapters provide an
overview of the currently available online tools, and many of them are illustrated
with examples. The methods include both basic and advanced methodologies, which
help the reader become familiar with the manifold approaches that characterize this
varied and interdisciplinary field.
This book is designed to allow anyone, regardless of prior experience, to delve
deep into this data-driven field, and it is a learner’s guide to bioinformatics. It
helps the reader acquire the basic skills needed to analyze sequencing data produced
by instruments and to perform advanced data analytics such as sequence alignment,
variation recording, and gene expression analysis. Students and researchers in
protein research, bioinformatics, biophysics, computational biology, molecular
modeling, and drug design will find this easy-to-understand book a ready reference
for staying current and productive in this changing, interdisciplinary field. Given
that bioinformatics is a fast-evolving field, the editors of this book chose a
well-defined target group and produced a good summary of the diverse gene- to
protein-level analysis methods available in present-day bioinformatics.
It is with these thoughts that I recommend this well-written book to the reader.

Kaiser Jamil
School of Life Sciences and Centre for Biotechnology and Bioinformatics
Jawaharlal Nehru Institute of Advanced Studies
Hyderabad, India
Preface

Bioinformatics is a relatively young subject compared to the numerous
well-established branches of biomedical sciences. The recent revolutionary
technological developments in interdisciplinary fields like chemistry, biology,
engineering, sequencing, and powerful computing methods have greatly contributed to
the rapid emergence of bioinformatics as a major scientific discipline. With the
exponential accumulation of sequencing data generated from different biological
organisms, and more forthcoming, accessing and interpreting the highly complex
genomic information have become central research themes of current-day molecular
biology and genetics. However, mining the huge volume of genetic data generated by
whole genome, exome, or RNA sequencing methods, often gigabytes to terabytes in
size, demands advanced computational methods. It has therefore become necessary for
not just biologists and statisticians but also computer experts to develop different
bioinformatics software for tackling the unprecedented challenges in next-generation
genomics.
In recent decades, we have all witnessed numerous multidisciplinary academic groups
regularly releasing new easy-to-use and effective bioinformatics tools into the
public domain. Since no single textbook can cover all the bioinformatics tools and
databases available, ordinary bench scientists find it difficult to keep themselves
updated with the latest developments in this field. Hence, to support this important
task, the current textbook discusses a diverse range of cutting-edge bioinformatics
tools in a clear and concise manner.
The current book introduces the reader to essential bioinformatics concepts and to
different databases and software programs applicable in gene- to protein-level
analysis, highlights their theoretical basis and practical applications, and also
presents simplified working approaches using graphs, figures, and screenshots. In
some instances, step-by-step working procedures are provided as practical examples
to solve particular problems. However, this book does not aim to serve as a manual
for each software program discussed; rather, in most instances it only outlines the
purpose of each tool, its source, and its working options. The majority of the
computational methods discussed in this book are available online, are free to use,
and do not require special skills or previous working experience. They are rather
straightforward to use and only demand simple inputs, like nucleotide or amino acid
sequences and protein structures, to analyze and return the output files.
Since the majority of the chapters in this book are prepared by scientists who use
different bioinformatics tools in their day-to-day research activities, we believe
that this book will mainly help young biologists keen to learn new skills in
bioinformatics. Our authors have taken care to present the chapter contents simply
enough for any practicing molecular biologist to independently employ bioinformatics
methods in their regular research tasks without consulting computational experts.
The reader of this book is expected to be familiar with fundamental concepts in
biochemistry, molecular biology, and genetics. Our authors have consulted numerous
original articles and online reading materials to provide the most up-to-date
information in preparing their chapters. Although we have taken care to cover the
major bioinformatics tools, if any important tool has been missed out, it is only
due to space limitations and not a bias against any particular software program.
The chapters in this book mainly cover the introduction to bioinformatics,
introduction to biological databases, sequence bioinformatics, structural
bioinformatics, functional bioinformatics, computer-aided drug discovery methods,
and some special concepts like in silico PCR and molecular modeling. A total of 17
chapters are included in this book, and most of them are relatively independent of
each other. The chapters in each section are arranged in a logical manner, where
each chapter acts as a follow-up to the previous one. Since this book is basically
meant for practicing molecular biologists, only a few selected molecular or
mathematical formulas, which are prerequisites for understanding the corresponding
concepts, are used. A general discussion of each computational program is usually
given along with its web links. A conclusion is provided at the end of each chapter
to refresh the reader’s understanding.
We sincerely thank the Princess Al-Jawhara Center of Excellence in Research of
Hereditary Disorders (PACER-HD) and the Department of Genetic Medicine, Faculty of
Medicine, and the Department of Biology, Faculty of Science, at King Abdulaziz
University (KAU) for providing us the opportunity to teach the bioinformatics
course, because of which we have been able to bring out this book. We thank Prof.
Jumana Y. Al-Aama, director of the PACER-HD, KAU, for her excellent moral and
administrative support in letting us be involved in teaching bioinformatics to young
minds in biology and medicine. We also thank Dr. Musharraf Jelani, Dr. Nuha
Al-Rayes, Dr. Sheriff Edris, and Dr. Khalda Nasser, our colleagues in the PACER-HD,
for their wonderful friendship. We would also like to thank the chairman of the
Department of Biological Sciences, Prof. Khalid M. AlGhamdi, and the head of the
Plant Sciences section, Dr. Hesham F. Alharby, for providing us with valuable
suggestions and encouragement to complete this task. Last but not least, we would
like to acknowledge the support of the Springer Nature publishing house for
accepting our book proposal, their regular follow-up, and the final publication of
this book.

Jeddah, Saudi Arabia
Noor Ahmad Shaik
Khalid Rehman Hakeem
Babajan Banaganapalli
Ramu Elango
Contents

1 Introduction to Bioinformatics
  Babajan Banaganapalli and Noor Ahmad Shaik
2 Introduction to Biological Databases
  Noor Ahmad Shaik, Ramu Elango, Muhummadh Khan, and Babajan Banaganapalli
3 Sequence Databases
  Vivek Kumar Chaturvedi, Divya Mishra, Aprajita Tiwari, V. P. Snijesh,
  Noor Ahmad Shaik, and M. P. Singh
4 Biological 3D Structural Databases
  Yasser Gaber, Boshra Rashad, and Eman Fathy
5 Other Biological Databases
  Divya Mishra, Vivek Kumar Chaturvedi, V. P. Snijesh, Noor Ahmad Shaik,
  and M. P. Singh
6 Introduction to Nucleic Acid Sequencing
  Preetha J. Shetty, Francis Amirtharaj, and Noor Ahmad Shaik
7 Tools and Methods in the Analysis of Simple Sequences
  Yogesh Kumar, Om Prakash, and Priyanka Kumari
8 Tools and Methods in Analysis of Complex Sequences
  Noor Ahmad Shaik, Babajan Banaganapalli, Ramu Elango, and Jumana Y. Al-Aama
9 Structural Bioinformatics
  Bhumi Patel, Vijai Singh, and Dhaval Patel
10 Protein Structure Annotations
  Mirko Torrisi and Gianluca Pollastri
11 Introduction to Functional Bioinformatics
  Peter Natesan Pushparaj
12 Biological Networks: Tools, Methods, and Analysis
  Basharat Ahmad Bhat, Garima Singh, Rinku Sharma, Mifftha Yaseen,
  and Nazir Ahmad Ganai
13 Metabolomics
  Peter Natesan Pushparaj
14 Drug Discovery: Concepts and Approaches
  Varalakshmi Devi Kothamuni Reddy, Babajan Banaganapalli, and Galla Rajitha
15 Molecular Docking
  Babajan Banaganapalli, Fatima A. Morad, Muhammadh Khan, Chitta Suresh Kumar,
  Ramu Elango, Zuhier Awan, and Noor Ahmad Shaik
16 In Silico PCR
  Babajan Banaganapalli, Noor Ahmad Shaik, Omran M. Rashidi, Bassam Jamalalail,
  Rawabi Bahattab, Hifaa A. Bokhari, Faten Alqahtani, Mohammed Kaleemuddin,
  Jumana Y. Al-Aama, and Ramu Elango
17 Modeling and Optimization of Molecular Biosystems to Generate Predictive Models
  Ankush Bansal, Siddhant Kalra, Babajan Banaganapalli, and Tiratha Raj Singh

Index
About the Editors

Noor Ahmad Shaik is an academician and researcher working in the field of human
molecular genetics. For the last 15 years, he has been working on the molecular
diagnostics of different monogenic and complex disorders. With the help of
high-throughput molecular methods like genetic sequencing, gene expression, and
metabolomics, his research team is currently trying to discover novel causal
genes/biomarkers for rare hereditary disorders. He has published 63 research
publications, including 50 original research articles and 13 book chapters, in the
fields of human genetics and bioinformatics. He has been a recipient of several
research grants from different national and international funding agencies. He
currently provides editorial services to journals such as Frontiers in Pediatrics
and Frontiers in Genetics.

Khalid Rehman Hakeem is an associate professor at King Abdulaziz University, Jeddah,
Saudi Arabia. He completed his PhD (Botany) at Jamia Hamdard, New Delhi, India, in
2011. Dr. Hakeem worked as a postdoctoral fellow in 2012 and as a fellow researcher
(associate professor) from 2013 to 2016 at Universiti Putra Malaysia, Selangor,
Malaysia. His specialities are plant ecophysiology, biotechnology and molecular
biology, plant-microbe-soil interactions, and environmental sciences, and so far he
has edited and authored more than 25 books with Springer International, Academic
Press (Elsevier), CRC Press, etc. He also has to his credit more than 120 research
publications in peer-reviewed international journals, including 42 book chapters in
edited volumes with international publishers.

Babajan Banaganapalli works as bioinformatics research faculty at King Abdulaziz
University. He initiated the interdisciplinary bioinformatics program at King
Abdulaziz University in 2014 and has run it successfully to date. He has more than
12 years of research experience in bioinformatics. His research interests span
genomics, proteomics, and drug discovery for complex diseases. He has published
more than 40 journal articles, conference papers, and book chapters. He has also
served on numerous conference program committees, organized several bioinformatics
workshops and training programs, and acts as editor and reviewer for various
international genetics/bioinformatics journals. Recently, he was honored as a young
scientist for his outstanding research in bioinformatics by the Venus International
Research Foundation, India.

Ramu Elango is an experienced molecular geneticist and computational biologist with
extensive experience at MIT, Cambridge, USA, and GlaxoSmithKline R&D, UK, gained
after completing his PhD in Human Genetics at the All India Institute of Medical
Sciences, New Delhi, India. At GlaxoSmithKline, he contributed extensively to
identifying novel causal genes and tractable drug targets in many disease areas of
interest. Dr. Ramu Elango presently heads the Research & Laboratories at the
Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders, King
Abdulaziz University. His research focus is on the genetics and genomics of complex
and polygenic diseases. His team exploits freely available large-scale genetic and
genomic data with bioinformatics tools to identify the risk factors or candidate
causal genes for many complex diseases.
Chapter 1
Introduction to Bioinformatics

Babajan Banaganapalli and Noor Ahmad Shaik

Contents
1.1 Introduction
1.2 The Role of Bioinformatics in Gene Expression Data Analysis
1.3 The Role of Bioinformatics in Gene/Genome Mapping
1.4 Role of Bioinformatics in Sequence Alignment and Similarity Search
1.5 Contribution of Bioinformatics toward Modern Cancer Research
1.6 The Domain of Structural Bioinformatics
1.7 Bioinformatics Processing of Big Data
1.8 Conclusion
References

1.1 Introduction

Bioinformatics is a fairly recent advancement in the field of biological research,
and its contribution toward cutting-edge medical research is phenomenal. Developing
skills in bioinformatics has become essential for all biologists because of its
interdisciplinary nature and its use of cutting-edge technology, through which it
has provided insight into the inherent blueprint of molecular cell systems. The
subject of bioinformatics is an amalgamation of concepts derived from domains such
as computer science, mathematics, molecular biology, genetics, and statistics.
Bioinformatics can be defined as a marriage between biology and computer science,
which routinely solves complex biological problems through computer science,
mathematics, and statistics (Can 2014). Bioinformatics endeavors to model complex
biological processes at the cellular level and to gain insights into disease
mechanisms from the analysis of the large volumes of data that are generated.
Figure 1.1 illustrates the contribution of bioinformatics toward different
biological and medical research domains.

Fig. 1.1 Bioinformatics and biological research

B. Banaganapalli (*)
Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders,
Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University,
Jeddah, Saudi Arabia
e-mail: [email protected]
N. A. Shaik
Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University,
Jeddah, Saudi Arabia
e-mail: [email protected]

© Springer Nature Switzerland AG 2019
N. A. Shaik et al. (eds.), Essentials of Bioinformatics, Volume I,
https://doi.org/10.1007/978-3-030-02634-9_1

Ever since the completion of the Human Genome Project in 2003, biological data
generation has increased explosively, and the domain of bioinformatics is playing a
key role in drawing critical inferences from the data (Greene and Troyanskaya
2011). During the Human Genome Project, comprehensive analysis of human DNA was
performed from multiple perspectives. The project involved the sequencing of all
the base pairs, the discovery of genes (more than 20,000), and the mapping of those
genes onto the chromosomes, leading to the further development of linkage maps. The
most critical achievement of the Human Genome Project is a comprehensive
understanding of the dynamics and modalities of human gene transcripts, their
presentation and localization in the genome, and their fundamental molecular
functions. The Human Genome Project generated huge amounts of biological data, and
its significance for knowledge generation depended on efficient data mining. The
interdisciplinary domain of bioinformatics has played a key role in the data mining
process, and in recent years a wide range of sophisticated bioinformatics methods,
techniques, algorithms, and tools have been developed that contribute immensely
toward extracting knowledge from major biological data repositories. These
developments have not only contributed to the progress of biological research in
general but also facilitated the progress of new scientific domains such as
molecular medicine, targeted gene therapy, and in silico drug design, to name a
few. State-of-the-art bioinformatic programs used in high-throughput sequence
analysis allow researchers to accurately monitor and identify minute genomic
alterations in genes, and these endeavors generate massive amounts of biological
data that require highly efficient analysis (Goldfeder et al. 2011; Yang et al.
2009). This once again highlights the advantage of bioinformatics over traditional
molecular techniques, with which one can study only a single gene at a time (Jorge
et al. 2012; Blekherman et al. 2011; Kihara et al. 2007).
From a biological data analysis perspective, bioinformatics is a robust set of data
mining paradigms that help convert textual data into human-perceivable knowledge.
In every domain of scientific research, computing technologies have become
virtually indispensable, and the situation is no different in the case of
bioinformatics (Akalin 2006; Bork 1997; Brzeski 2002). Biological sequencing
machines generate tremendous volumes of data, and it has become critically
important to archive, organize, and process such data efficiently using powerful
computational methods in order to extract knowledge. Similarly, computational
simulation of complex molecular processes can be an asset, as it allows researchers
to gain rapid insights in silico without the need for time-consuming
experimentation (Akalin 2006).
Some of the key contributions of the discipline of bioinformatics include (Akalin
2006):
• Conceptualization, design, and development of biological relational databases
for archiving, organizing, and retrieving biological data.
• Development of cutting-edge computer algorithms to model, visualize, mine and
compare biological data.
• Highly intuitive and user-friendly curation of biological data that will help bio-
logical researchers who lack IT knowledge to derive useful representation of
information.
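As a rough illustration of the first contribution listed above, the sketch below
archives and retrieves sequence records in a small relational database. It is a
minimal example under stated assumptions, not the schema of any real biological
database: the table names (organism, seq_record), the column names, and the
accessions DEMO0001 and DEMO0002 are all hypothetical, and Python's built-in
sqlite3 module stands in for a production database server.

```python
import sqlite3

# In-memory database; a real archive would use a persistent server.
conn = sqlite3.connect(":memory:")

# Hypothetical two-table schema: each sequence record points to one organism.
conn.executescript("""
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE seq_record (
    accession   TEXT PRIMARY KEY,
    organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
    molecule    TEXT NOT NULL CHECK (molecule IN ('DNA', 'RNA', 'protein')),
    residues    TEXT NOT NULL
);
""")

# Archive: made-up records, for illustration only.
conn.execute("INSERT INTO organism (organism_id, name) VALUES (1, 'Homo sapiens')")
conn.executemany(
    "INSERT INTO seq_record VALUES (?, ?, ?, ?)",
    [("DEMO0001", 1, "DNA", "ATGGCC"), ("DEMO0002", 1, "protein", "MA")],
)

# Retrieve: join the two tables and keep only DNA records.
rows = conn.execute(
    """
    SELECT s.accession, o.name, length(s.residues)
    FROM seq_record AS s
    JOIN organism AS o USING (organism_id)
    WHERE s.molecule = 'DNA'
    ORDER BY s.accession
    """
).fetchall()
print(rows)
```

Separating organisms from sequence records is the usual relational design choice:
the organism name is stored once and every record refers to it by key.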
Bioinformatic tools and techniques have become virtually ubiquitous in modern
biological research. In molecular biology experiments involving microarrays and
sequencing, bioinformatics techniques can be used for efficient analysis of raw
transcript signals and sequence data. In the field of genetics and genomics,
bioinformatics can assist in the analysis of sequencing data, annotation of genomic
landmarks, and identification of genetic mutations that could have a direct
correlation with disease pathology (Batzoglou and Schwartz 2014; Yalcin et al.
2016; Zharikova and Mironov 2016). Furthermore, bioinformatics data mining tools
can also be highly effective in the analysis and extraction of knowledge from
biological literature so that sophisticated gene ontologies can be conceptualized
for future query and analysis. Bioinformatics tools and algorithms also find
application in the field of proteomics, such as in the analysis of protein
expression and its regulation. Bioinformatics can contribute toward the analysis of
biological pathways and molecular networks from a systems biology approach (Hou et
al. 2016). In the field of structural biology, bioinformatics tools can help in the
in silico simulation and modeling of genomic and proteomic data, which can help
researchers better understand key molecular interactions. A key domain to which
bioinformatics has contributed immensely is biomedicine. It would not be wrong to
say that every human disease is somehow connected to a genetic event. In this
context, the complete draft of the human genome has greatly helped in mapping
disease-associated genes and elucidating their molecular functions. As a result, it
has become possible for researchers to gain comprehensive insights into
pathogenesis at the cellular level, thus creating grounds for the development of
effective therapeutic interventions. Through the use of state-of-the-art
bioinformatics and computational tools, it has now become possible to simulate,
identify, and establish potential drug targets that could have much greater
efficacy against diseases with minimal side effects. With the advent of different
bioinformatics paradigms, it has now become possible to analyze an individual’s
genetic profile, leading to the innovative concept of personalized medicine. In
domains such as agriculture, bioinformatics tools can be used to guide alterations
of the genomic structure so that the resistance of crops toward different plant
pathogens and insects is increased (Bolger et al. 2014; Edwards and Batley 2010).

1.2 The Role of Bioinformatics in Gene Expression Data Analysis

A molecular biology experimental paradigm that is extensively used to decipher the
inherent characteristics and functionality of an entire cohort of gene transcripts
in a genome is the microarray. A major strength of the microarray is its ability to
characterize the complete set of genes using a single microarray chip or plate
(Konishi et al. 2016; Li 2016). Microarrays consist of a solid substrate on which
protein or DNA is printed as microscopic spots. A DNA microarray chip may contain
short single-stranded oligonucleotides or large double-stranded DNA strands. These
spots are called probes, and they hybridize with the cDNA samples from the control
and the test subjects, whose identity is unknown (Konishi et al. 2016; Li 2016).
The microarray process starts with the extraction and purification of RNA from the
control and test samples and its conversion into cDNA using the reverse
transcriptase enzyme. To differentiate the test and control samples, the cDNA from
the control samples is labelled with green Cy3 fluorescent dye, while the test
sample cDNA is labelled with red Cy5 fluorescent dye. The role of the fluorescent
dyes is to help researchers accurately estimate the expression profile of the
genes. After the experiment is complete, the microarray chips/plates are scanned
using a scanning device, and the data are collected for analysis. The probes (both
test and control) offer a complete representation of the genes that have completed
the process of transcription. Certain gene transcripts may show hybridization with
both the test and the control samples and appear as yellow spots on the array (Frye
and Jin 2016; Non and Thayer 2015; Soejima 2009). At this juncture, the role of
bioinformatics becomes critical for analyzing and deciphering the microarray data
to extract the desired information. Bioinformatics data mining tools and work
pipelines, such as cluster analysis and heatmaps, show high efficacy in microarray
data analysis (Bodrossy and Sessitsch 2004; Loy and Bodrossy 2006). A typical
microarray experiment workflow is presented in Fig. 1.2 (Macgregor and Squire
2002).

Fig. 1.2 The microarray workflow
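The red, green, and yellow spots described above can be expressed numerically as
the log2 ratio of the test (Cy5) to the control (Cy3) intensity. The sketch below
is a minimal illustration under stated assumptions rather than a published
analysis pipeline: the function names, the pseudocount of 1, the two-fold
threshold, and the intensity values for geneA, geneB, and geneC are all
hypothetical, and real analyses also apply background correction and
normalization first.

```python
import math

def log2_ratio(cy5_test, cy3_control, pseudocount=1.0):
    """Log2 of test/control intensity; the pseudocount avoids division by zero."""
    return math.log2((cy5_test + pseudocount) / (cy3_control + pseudocount))

def spot_colour(cy5_test, cy3_control, threshold=1.0):
    """Label a spot by its dominant dye (assumed cut-off: two-fold change)."""
    ratio = log2_ratio(cy5_test, cy3_control)
    if ratio >= threshold:
        return "red"      # higher signal in the Cy5-labelled test sample
    if ratio <= -threshold:
        return "green"    # higher signal in the Cy3-labelled control sample
    return "yellow"       # similar signal in both samples

# Made-up (Cy5 test, Cy3 control) intensities for three hypothetical genes.
spots = {
    "geneA": (4000.0, 500.0),
    "geneB": (300.0, 2400.0),
    "geneC": (1500.0, 1450.0),
}
calls = {gene: spot_colour(*intensities) for gene, intensities in spots.items()}
print(calls)
```

Working on the log2 scale makes a two-fold increase (+1) and a two-fold decrease
(-1) symmetric, which is why a single threshold can be applied in both directions.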

1.3 The Role of Bioinformatics in Gene/Genome Mapping

High-throughput sequencing technologies have allowed us to generate whole genome
sequencing data at a phenomenal rate. Today, a wide range of user-friendly genome
browsers are available on the web, which can help researchers analyze their gene or
genome landmark of interest with the click of a button (Mychaleckyj 2007). Many
such genome browsers offer cutting-edge genome mapping tools, but it should be
noted that de novo genome mapping is still a necessity for many researchers. The
technique of gene or genome mapping involves the characterization of key landmarks
on the genome. The fundamental aim of genome mapping is to create a comprehensive
map of the genome of interest and identify landmarks such as gene regulatory sites,
short-sequence repeats, and single nucleotide polymorphisms, or even to discover
completely new gene transcripts (Waage et al. 2018; Brown 2002). Researchers across
the globe are carrying out more and more sequencing experiments with each passing
day, and a fundamental need that they face is for genome maps. These maps allow
identification of the key genetic features that could have major therapeutic
significance (Brown 2002).

Fig. 1.3 Illustration of large-scale mapping of a gene by genetic mapping and physical mapping
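One of the landmark types named above, the short-sequence repeat, is simple enough
to locate with a regular expression. The sketch below is a toy example under
stated assumptions, not the method of any genome browser: the function name, the
default unit length of 2 bases, the minimum of 4 repeats, and the demonstration
sequence are all hypothetical.

```python
import re

def find_short_repeats(sequence, unit_len=2, min_repeats=4):
    """Return (start, unit, copies) for each tandem repeat of a unit_len-base unit."""
    # ([ACGT]{n}) captures one unit; \1{m,} requires at least m more copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

# Made-up sequence containing a (CA)5 and a (GA)4 repeat.
genome = "TTG" + "CA" * 5 + "GGATC" + "GA" * 4 + "TT"
print(find_short_repeats(genome))
```

Positions are 0-based offsets into the input string. Matches are reported left to
right without overlap, so a repeat is listed once under the unit at which it
starts.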
In silico gene mapping (Schadt 2006) uses publicly available genomic databases to
discover and map genes on the genome. In silico gene mapping techniques using
bioinformatic applications have been very successful in identifying or mapping QTLs
(quantitative trait loci) that help in understanding the molecular pathways
associated with the pathogenesis of different polygenic diseases (Burgess-Herbert
et al. 2008). The excellent and informative BAC (bacterial artificial chromosome)
clone contig map for Sus scrofa was achieved with the help of in silico mapping
approaches. In silico mapping allows efficient and fast mapping of genes that may
specifically influence the resistance or susceptibility of a particular animal to a
specific disease. Moreover, the availability of excellent computational resources,
public genomic and phenotypic databases like UniGene and GenBank, and tools like
BLAST considerably accelerates the mapping process when it is done in silico,
compared to conventional methodologies that are more labor intensive, costly, and
time consuming (Schadt 2006). The simple workflow of the gene mapping technique is
illustrated in Fig. 1.3.
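The basic idea of placing a sequence on a reference in silico can be sketched with
an exact-match search on both strands. This is a deliberately simplified
illustration and not how BLAST works (BLAST tolerates mismatches and gaps and
relies on indexed heuristics); the function names and the reference and marker
sequences below are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def map_marker(reference, marker):
    """Return sorted (position, strand) pairs for exact matches of the marker."""
    hits = []
    for strand, query in (("+", marker), ("-", reverse_complement(marker))):
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)  # allow overlapping hits
    return sorted(hits)

# Made-up reference and markers, for illustration only.
reference = "AACCGGTTAGCTAGGATCCATG"
print(map_marker(reference, "GCTAG"))     # found on the forward strand
print(map_marker(reference, "ATGGATCC"))  # found on the reverse strand
```

Positions are 0-based offsets on the forward strand of the reference. A
palindromic marker is reported once per strand at the same position.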

1.4 Role of Bioinformatics in Sequence Alignment and Similarity Search

Deciphering the similarity between DNA, RNA, or protein sequences is a critical


step toward gaining insights into different biological processes and disease
pathways. Sequence similarity can help researchers identify and characterize
candidate genes or single nucleotide polymorphisms that can predispose an
individual toward major pathological conditions such as cancer and autoimmune
diseases. Sequence similarities can be determined by carrying out pairwise or
