Probabilistic Modeling in Bioinformatics and Medical Informatics 1st Edition Entire Book Download
The turn of the millennium has been described as the dawn of a new scientific
revolution, which will have as great an impact on society as the industrial and
computer revolutions before. This revolution was heralded by a large-scale
DNA sequencing effort in July 1995, when the entire 1.8-million-base-pair
genome of the bacterium Haemophilus influenzae was published – the
first genome of a free-living organism. Since then, the amount of DNA sequence data
in publicly accessible databases has been growing exponentially, including a
working draft of the complete 3.3 billion base-pair DNA sequence of the entire
human genome, as pre-released by an international consortium of 16 institutes
on June 26, 2000.
Besides genomic sequences, new experimental technologies in molecu-
lar biology, like microarrays, have resulted in a rich abundance of further
data, related to the transcriptome, the spliceosome, the proteome, and the
metabolome. This explosion of the “omes” has led to a paradigm shift in
molecular biology. While pre-genomic biology followed a hypothesis-driven
reductionist approach, applying mainly qualitative methods to small, isolated
systems, modern post-genomic molecular biology takes a holistic, systems-
based approach, which is data-driven and increasingly relies on quantitative
methods. Consequently, in the last decade, the new scientific discipline of
bioinformatics has emerged in an attempt to interpret the increasing amount
of molecular biological data. The problems faced are essentially statistical,
due to the inherent complexity and stochasticity of biological systems, the
random processes intrinsic to evolution, and the unavoidable error-proneness
and variability of measurements in large-scale experimental procedures.
vi Preface
The first part of this book provides a brief yet self-contained introduction to
the methodology of Bayesian networks. The following parts demonstrate how
these methods are applied in bioinformatics and medical informatics.
This book is by no means comprehensive. All three fields – the methodol-
ogy of probabilistic modeling, bioinformatics, and medical informatics – are
evolving very quickly. The text should therefore be seen as an introduction,
offering both elementary tutorials and more advanced applications and
case studies.
The first part introduces the methodology of statistical inference and
probabilistic modelling. Chapter 1 compares the two principal paradigms of
statistical inference: the frequentist and the Bayesian approaches. Chapter 2
provides a brief introduction to learning Bayesian networks from data. Chapter 3
interprets the methodology of feed-forward neural networks in a probabilistic
framework.
The second part describes how probabilistic modelling is applied to bioin-
formatics. Chapter 4 provides a self-contained introduction to molecular phy-
logenetic analysis, based on DNA sequence alignments, and it discusses the
advantages of a probabilistic approach over earlier algorithmic methods. Chap-
ter 5 describes how the probabilistic phylogenetic methods of Chapter 4 can
be applied to detect interspecific recombination between bacteria and viruses
from DNA sequence alignments. Chapter 6 generalizes and extends the stan-
dard phylogenetic methods for DNA so as to apply them to RNA sequence
alignments. Chapter 7 introduces the reader to microarrays and gene expres-
sion data and provides an overview of standard statistical pre-processing pro-
cedures for image processing and data normalization. Chapters 8 and 9 address
the challenging task of reverse-engineering genetic networks from microarray
gene expression data using dynamical Bayesian networks and state-space mod-
els.
The third part provides examples of how probabilistic models are applied
in medical informatics.
Chapter 10 illustrates the wide range of techniques that can be used to
develop probabilistic models for medical informatics, which include logistic
regression, neural networks, Bayesian networks, and class-probability trees.
The examples are supported with relevant theory, and the chapter emphasizes
the Bayesian approach to probabilistic modeling.
Chapter 11 discusses Bayesian models of groups of individuals who may
have taken several drug doses at various times throughout the course of a
clinical trial. The Bayesian approach helps the derivation of predictive distri-
butions that contribute to the optimization of treatments for different target
populations.
Variable selection is a common problem in regression, including neural-
network development. Chapter 12 demonstrates how Automatic Relevance
Determination, a Bayesian technique, successfully dealt with this problem for
the diagnosis of heart arrhythmia and the prognosis of lupus.
The development of a classifier is usually preceded by some form of data
preprocessing. In the Bayesian framework, the preprocessing stage and the
classifier-development stage are handled separately; however, Chapter 13 in-
troduces an approach that combines the two in a Bayesian setting. The ap-
proach is applied to the classification of electroencephalogram data.
There is growing interest in the application of the variational method to
model development, and Chapter 14 discusses the application of this emerging
technique to the development of hidden Markov models for biosignal analysis.
Chapter 15 describes the Treat decision-support system for the selection
of appropriate antibiotic therapy, a common problem in clinical microbiology.
Bayesian networks proved to be particularly effective at modelling this
problem.
The medical-informatics part of the book ends with Chapter 16, a description
of several software packages for model development. The chapter includes
example code to illustrate how some of these packages can be used.
Finally, an appendix explains the conventions and notation used through-
out the book.
Intended Audience
The book has been written for researchers and students in statistics, machine
learning, and the biological sciences. While the chapters in Parts II and III
describe applications at the level of current cutting-edge research, the chapters
in Part I provide a more general introduction to the methodology for the
benefit of students and researchers from the biological sciences.
Chapters 1, 2, 4, 5, and 8 are based on a series of lectures given at the
Statistics Department of Dortmund University (Germany) between 2001 and
2003, at Indiana University School of Medicine (USA) in July 2002, and at
the “International School on Computational Biology”, in Le Havre (France)
in October 2002.
Website
The website
http://robots.ox.ac.uk/~parg/pmbmi.html
complements this book. The site contains links to relevant software, data,
discussion groups, and other useful sites. It also contains colored versions of
some of the figures within this book.
Acknowledgments
This book was put together with the generous support of many people.
Stephen Roberts would like to thank Peter Sykacek, Iead Rezek and
Richard Everson for their help towards this book. Particular thanks, with
much love, go to Clare Waterstone.
Richard Dybowski expresses his thanks to his parents, Victoria and Henry,
for their unfailing support of his endeavors, and to Wray Buntine, Paulo Lis-
boa, Ian Nabney, and Peter Weller for critical feedback on Chapters 3, 10,
and 16.
Dirk Husmeier is most grateful to David Allcroft, Lynn Broadfoot, Thorsten
Forster, Vivek Gowri-Shankar, Isabelle Grimmenstein, Marco Grzegorczyk,
Anja von Heydebreck, Florian Markowetz, Jochen Maydt, Magnus Rattray,
Jill Sales, Philip Smith, Wolfgang Urfer, and Joanna Wood for critical feed-
back on and proofreading of Chapters 1, 2, 4, 5, and 8. He would also like to
express his gratitude to his parents, Gerhild and Dieter; if it had not been for
their support in earlier years, this book would never have been written. His
special thanks, with love, go to Ulli for her support and tolerance of the extra
workload involved with the preparation of this book.
References
Part II Bioinformatics