Deep Learning in
Bioinformatics
Techniques and Applications in
Practice

Habib Izadkhah
Department of Computer Science
University of Tabriz
Tabriz, Iran
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2022 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or any information storage and retrieval system, without permission in writing from the
publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our
arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found
at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may
be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any
information, methods, compounds, or experiments described herein. In using such information or methods they should be
mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any
injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or
operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data


A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library

ISBN: 978-0-12-823822-6

For information on all Academic Press publications


visit our website at https://www.elsevier.com/books-and-journals

Publisher: Mara Conner


Acquisitions Editor: Chris Katsaropoulos
Editorial Project Manager: Joshua Mearns
Production Project Manager: Nirmala Arumugam
Designer: Victoria Pearson
Typeset by VTeX
To my wife Sepideh
and my children Amir Reza and Rose
Contents

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
CHAPTER 1 Why life science? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Why deep learning? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Contemporary life science is about data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Deep learning and bioinformatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 What will you learn? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
CHAPTER 2 A review of machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 What is machine learning? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Challenges with machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Overfitting and underfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Mitigating overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.2 Adjusting parameters using cross-validation . . . . . . . . . . . . . . . . . . . . 15
2.4.3 Cross-validation methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Types of machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5.1 Supervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5.2 Unsupervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5.3 Reinforcement learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.6 The math behind deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6.1 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6.2 Relevant mathematical operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6.3 The math behind machine learning: statistics . . . . . . . . . . . . . . . . . . . . 25
2.7 TensorFlow and Keras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.8 Real-world tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
CHAPTER 3 An introduction to the Python ecosystem for deep learning . . . . . . . . . . . . 31
3.1 Basic setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 SciPy (scientific Python) ecosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 Scikit-learn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 A quick refresher in Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4.1 Identifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4.2 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4.3 Data type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4.4 Control flow statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.4.5 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4.6 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5 NumPy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.6 Matplotlib crash course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.7 Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.8 How to load dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.8.1 Considerations when loading CSV data . . . . . . . . . . . . . . . . . . . . . . . . 46
3.8.2 Pima Indians diabetes dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.8.3 Loading CSV files in NumPy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.8.4 Loading CSV files in Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.9 Dimensions of your data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.10 Correlations between features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.11 Techniques to understand each feature in the dataset . . . . . . . . . . . . . . . . . . . . 53
3.11.1 Histograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.11.2 Box-and-whisker plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.11.3 Correlation matrix plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.12 Prepare your data for deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.12.1 Scaling features to a range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.12.2 Data normalizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.12.3 Binarize data (make binary) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.13 Feature selection for machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.13.1 Univariate selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.13.2 Recursive feature elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.13.3 Principal component analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.13.4 Feature importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.14 Split dataset into training and testing sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.15 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
CHAPTER 4 Basic structure of neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2 The neuron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3 Layers of neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 How is a neural network trained? . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5 Delta learning rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.6 Generalized delta rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.7 Gradient descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.7.1 Stochastic gradient descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.7.2 Batch gradient descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7.3 Mini-batch gradient descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.8 Example: delta rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.8.1 Implementation of the SGD method . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.8.2 Implementation of the batch method . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.9 Limitations of single-layer neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
CHAPTER 5 Training multilayer neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.2 Backpropagation algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96


5.3 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.4 Neural network models in Keras . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.5 ‘Hello world!’ of deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.6 Tuning hyperparameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.7 Data preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.7.1 Vectorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.7.2 Value normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
CHAPTER 6 Classification in bioinformatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.1.1 Binary classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.1.2 Pima Indians onset of diabetes dataset . . . . . . . . . . . . . . . . . . . . . 115
6.1.3 Label encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.2 Multiclass classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.2.1 Sigmoid and softmax activation functions . . . . . . . . . . . . . . . . . . . . . . 128
6.2.2 Types of classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
CHAPTER 7 Introduction to deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.2 Improving the performance of deep neural networks . . . . . . . . . . . . . . . . . . . 132
7.2.1 Vanishing gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.2.2 Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.2.3 Computational load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.3 Configuring the learning rate in Keras . . . . . . . . . . . . . . . . . . . . . . . 151
7.3.1 Adaptive learning rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.3.2 Layer weight initializers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.4 Imbalanced dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.5 Breast cancer detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
7.5.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
7.5.2 Introduction and task definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
7.5.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
7.6 Molecular classification of cancer by gene expression . . . . . . . . . . . . . . . . . . 163
7.6.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.6.2 Introduction and task definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.6.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
CHAPTER 8 Medical image processing: an insight to convolutional neural networks . . . . 175
8.1 Convolutional neural network architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.2 Convolution layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
8.3 Pooling layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
8.4 Stride and padding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.5 Convolutional layer in Keras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.6 Coronavirus (COVID-19) disease diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . 184


8.6.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
8.6.2 Introduction and task definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
8.6.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
8.6.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8.7 Predicting breast cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
8.7.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
8.7.2 Introduction and task definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
8.7.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
8.7.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8.8 Diabetic retinopathy detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8.8.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8.8.2 Introduction and task definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8.8.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.8.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
CHAPTER 9 Popular deep learning image classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9.2 LeNet-5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
9.3 AlexNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
9.4 ZFNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.5 VGGNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.6 GoogLeNet/Inception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
9.7 ResNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
9.8 DenseNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.9 SE-Net . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
9.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
CHAPTER 10 Electrocardiogram (ECG) arrhythmia classification . . . . . . . . . . . . . . . . . . . . 249
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
10.2 MIT-BIH arrhythmia database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
10.3 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
10.4 Data augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
10.5 Architecture of the CNN model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
10.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
CHAPTER 11 Autoencoders and deep generative models in bioinformatics . . . . . . . . . . . . 261
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
11.2 Autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.2.1 Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.2.2 Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.2.3 Distance function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.3 Variant types of autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
11.3.1 Undercomplete autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
11.3.2 Deep autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
11.3.3 Convolutional autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269


11.3.4 Sparse autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
11.3.5 Denoising autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
11.3.6 Variational autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
11.3.7 Contractive autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
11.4 An example of denoising autoencoders – bone suppression in chest radiographs 284
11.4.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11.5 Implementation of autoencoders for chest X-ray images (pneumonia) . . . . . . . 290
11.5.1 Undercomplete autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
11.5.2 Sparse autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
11.5.3 Denoising autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
11.5.4 Variational autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
11.5.5 Contractive autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
11.6 Generative adversarial network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
11.6.1 GAN network architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
11.6.2 GAN network cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
11.6.3 Cost function optimization process in GAN . . . . . . . . . . . . . . . . . . . . . 310
11.6.4 General GAN training process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
11.7 Convolutional generative adversarial network . . . . . . . . . . . . . . . . . . . . . . . . . 314
11.7.1 Deconvolution layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
11.7.2 DCGAN network structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
11.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
CHAPTER 12 Recurrent neural networks: generating new molecules and protein sequence
classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
12.2 Types of recurrent neural network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
12.3 The problem, short-term memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
12.4 Bidirectional LSTM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
12.5 Generating new molecules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
12.5.1 Simplified molecular-input line-entry system . . . . . . . . . . . . . . . . . . . 329
12.5.2 A generative model for molecules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
12.5.3 Generating new SMILES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
12.5.4 Analyzing the generative model’s output . . . . . . . . . . . . . . . . . . . . . . . 337
12.6 Protein sequence classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
12.6.1 Protein structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
12.6.2 Protein function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
12.6.3 Prediction of protein function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
12.6.4 LSTM with dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
12.6.5 LSTM with bidirectional and CNN . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
12.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
CHAPTER 13 Application, challenge, and suggestion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
13.2 Legendary deep learning architectures, CNN, and RNN . . . . . . . . . . . . . . . . . 347
13.3 Deep learning applications in bioinformatics . . . . . . . . . . . . . . . . . . . . . . . . . 348


13.4 Biological networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
13.4.1 Learning tasks on graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
13.4.2 Graph neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
13.5 Perspectives, limitations, and suggestions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
13.6 DeepChem, a powerful library for bioinformatics . . . . . . . . . . . . . . . . . . . . . . 357
13.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Acknowledgments

This book is the product of the sincere cooperation of many people. The author would like to thank all those
who contributed to the process of writing and publishing this book. Dr. Masoud Kargar, Dr. Masoud
Aghdasifam, Hamed Babaei, Mahsa Famil, Esmaeil Roohparver, Mehdi Akbari, Mahsa Hashemzadeh,
and Shabnam Farsiani read the whole draft and made numerous suggestions that improved the presentation
quality of the book; I thank them for all their effort and encouragement.
I wish to express my sincere appreciation to the team at Elsevier, particularly Chris Katsaropoulos,
Senior Acquisitions Editor, for his guidance, comprehensive explanations of the issues, prompt replies
to my e-mails, and, of course, his patience. I would also like to thank Joshua Mearns and Nirmala
Arumugam for preparing the production process and coordinating the web page, as well as the production
team. Finally, I thank the anonymous reviewers for their excellent work in exposing what needed to be
restated, clarified, rewritten, or complemented.

Habib Izadkhah

xiii
Preface

Artificial Intelligence, Machine Learning, Deep Learning, and Big Data have become the latest hot
buzzwords, with deep learning and bioinformatics being two of the hottest areas of contemporary research.
Deep learning, an emerging branch of machine learning, is a good solution for big data analytics.
Deep learning methods have been extensively applied to various fields of science and engineering,
including computer vision, speech recognition, natural language processing, social network analysis,
and bioinformatics, where they have produced results comparable to, and in some cases superior to,
those of domain experts. A vital strength of deep learning is its ability to analyze and learn from massive
amounts of data, which makes it a valuable method for Big Data Analytics.
Bioinformatics research has entered an era of Big Data. With the increasing amount of data in biology, it is
expected that deep learning will become increasingly important in the field and will be utilized in the vast
majority of analysis problems. Mining the potential value in biological data has great significance for
researchers and the health care domain. Deep learning, which is especially formidable in handling big data,
shows outstanding performance in biological data processing.
To practice deep learning, you need to have a basic understanding of the Python ecosystem. Python
is a versatile language that offers a large number of libraries and features helpful for Artificial
Intelligence and Machine Learning in particular; of course, you do not need to learn all of these
libraries and features to work with deep learning. In this book, I first give you the necessary Python
background knowledge to study deep learning. Then, I introduce deep learning in an easy-to-understand
way and explore how deep learning can be utilized to address several important
problems in bioinformatics, including drug discovery, de novo molecular design, protein structure
prediction, gene expression regulation, protein sequence classification, and biomedical image processing.
Through real-world case studies and working examples, you’ll discover various methods and strategies
for building deep neural networks using the Keras library. The book gives you practical information
on the bioinformatics domain, including best practices. I believe that this book
will provide valuable insights for a successful career and will help graduate students, researchers, and
applied bioinformaticians working in industry and academia to use deep learning techniques in their
biological and bioinformatics studies as a starting point.
This book
• provides necessary Python background for practicing deep learning,
• introduces deep learning in a convenient way,
• provides the most practical information available on the domain to build efficient deep learning
models,
• presents how deep learning can be utilized for addressing several important problems in bioinformatics,
• explores the legendary deep learning architectures, including convolutional and recurrent neural
networks, for bioinformatics,
• discusses deep learning challenges and suggestions.

Habib Izadkhah
xv
CHAPTER 1

Why life science?
1.1 Introduction
There are many paths which people can follow based on their technical desires and interests in data.
Due to the availability of massive data in recent years, biomedical studies have drawn a great deal
of attention. The advent of modern medicine has transformed many fundamental aspects of human
life. Over the past 20 years, there have been innovations affecting the lives of many people. Not so
long ago, HIV/AIDS was considered a fatal disease. The ongoing development of antiviral treatments
has significantly increased the life expectancy of patients in developed countries. Other diseases such as
hepatitis C, which was not effectively treatable a decade ago, can now be treated. Genetic breakthroughs
have brought about high hopes for the treatment of different diseases. Innovation in diagnosis and
availability of precision tools enable physicians to diagnose and target a special disease in the human
body. Many of these breakthroughs have relied on, and will continue to benefit from, computational methods.

1.2 Why deep learning?


Living in the golden era of machine learning, we are now experiencing a revolution driven by machine
learning programs.
In today’s world, machine learning algorithms are indispensable to every process ranging from
prediction to financial services. As a matter of fact, machine learning is a modern human invention
that has not only led to developments in industries and different businesses but also left a significant
footprint on the individual lives of humans. Scientists are developing certain algorithms which enable
digital assistants (e.g., Amazon Echo and Google Home) to speak well. There have also been notable
advances in psychologist robots.
Sentiment analysis is another modern application of machine learning. This is the process of de-
termining a speaker’s or an author’s attitudes or beliefs. Machine learning developments have allowed
for multilingual translation. In addition to daily life, machine learning has affected many areas of the
physical sciences and other aspects of life. The algorithms are employed for different purposes, ranging
from the identification of new galaxies in telescopic images to the classification of subatomic
reactions in the Large Hadron Collider.
The development of a class of machine learning methods, known as deep neural networks, has contributed
to these technological advances. Although the technological infrastructure of artificial neural
networks was developed in the 1950s and modified in the 1980s, the real power of this technique was
not fully appreciated until the recent decade, in which many breakthroughs have been achieved in
computer hardware. While Chapters 3 and 4 give a more comprehensive review of neural networks, and a
Deep Learning in Bioinformatics. https://doi.org/10.1016/B978-0-12-823822-6.00008-1
Copyright © 2022 Elsevier Inc. All rights reserved.

deep neural network (deep learning) is presented in the subsequent chapters of the book, it is important
to know about some of the breakthroughs achieved with deep learning first.
A common application of deep learning is image recognition. Using deep learning for facial recog-
nition includes a wide range of applications from security areas and cell phone unlocking methods to
automated tagging of individuals who are present in an image. Companies now seek to use this feature
to set up the process of making purchases without the need for credit cards. For instance, have you
noticed that Facebook has developed an extraordinary feature that lets you know about the presence of
your friends in your photos? Facebook used to make you click on photos and type your friends’ names
to tag them. However, as soon as a photo is uploaded, Facebook now does the magic and tags everybody
for you. This technology is called facial recognition.
Deep learning can also be utilized to restore images or eliminate their noise. This feature of machine
learning is also employed in different security areas, identification of criminals, and quality enhance-
ment of a family photo or a medical image. Producing fake images is also another feature of deep
learning. In fact, deep learning algorithms are able to generate new images of people’s faces, objects,
and even sceneries that have never existed. These images are utilized in graphic design, video game
development, and movie production.
Many similar deep learning developments, which have led to a plethora of applications for users, are
now employed in bioinformatics and biomedicine, for example to classify tumor cells into various categories.
Given the scarcity of medical data, synthetic images can be generated to produce new training data.
Deep learning has also resulted in many speech recognition developments that have become perva-
sive in search engines, cell phones, computers, TV sets, and other online devices everywhere.
So far, various speech recognition technologies have been developed, such as Alexa, Cortana,
Google Assistant, and Siri, changing human interactions with devices, homes, cars, and jobs. Through
the speech recognition technology, it is possible to talk with computers and devices, which can also
understand what the speech means and can make a response. Introducing voice-controlled or digital
assistants into the speech recognition market has changed the outlook of this technology in the 21st
century.
Analyzing its users’ behavior, a recommender system suggests the most appropriate items (e.g., data,
information, and goods). By helping users find their targets faster, such a system addresses the
problems caused by the ever-growing amount of information. Many companies that
have extensive websites now employ recommender systems to facilitate their processes. Given different
preferences of various users at different ages, there is no doubt that users select different products; thus,
recommender systems should yield various results accordingly. Recommender systems have significant
effects on the revenues of different companies. If employed correctly, these systems can bring about
high profitability for companies. For instance, Netflix has announced that 60% of DVDs rented by users
are provided through recommender systems, which can greatly affect user choices of films.
Recommender systems can also be employed to prescribe appropriate medicines for patients. In
fact, prescribing the right medicines for patients is among the most important processes of their treat-
ments, for which accurate decisions must be made based on patients’ current conditions, history, and
symptoms. In many cases, patients may need more than one medicine or new medicines for another
condition in addition to a previous disease. Such cases increase the chances of medical error in the
prescription of medicines and the incidence of side effects of medicine misuse.
These are only a few innovations achieved through the use of deep learning methods in bioinformat-
ics. Ranging from medical diagnosis and tumor detection to production and prescription of customized
medicines based on a specific genome, deep learning has attracted many large pharmaceutical and
medical companies. Many deep learning ideas used in bioinformatics are inspired by the conventional
applications of deep learning.
We are living in an interesting era when there is a convergence of biological data and the extensive
scientific methods of processing that kind of data. Those who can combine data with novel methods to
learn from data patterns can achieve significant scientific breakthroughs.

1.3 Contemporary life science is about data


As discussed earlier, the fundamental nature of the life sciences has changed. The large-scale use of
automated experiments has significantly increased the amount of experimental data produced. For instance,
signal processing and 3D imaging in empirical molecular biology can result in a large amount of raw
information. In the 1980s, a biologist would conduct an experiment and draw a conclusion. This
experiment would lack a sufficient amount of data because of computational limitations. In addition, the
experimental data would not be made available to others due to the absence of extensive communication
tools. However, modern biology benefits from mechanisms that can generate millions of experimental
data points in one or two days. Furthermore, experiments such as gene sequencing, which can generate
massive datasets, have become inexpensive and easy to access.
Advances in gene sequencing can produce databases that link a person’s genetic code to
a multitude of health-related outcomes, including diabetes, cancer, and genetic diseases such as cystic
fibrosis. Employing computational techniques for the analysis and extraction of data, scientists are now
uncovering the causes of these diseases in order to develop novel treatment methods.
The disciplines which used to basically rely on human observations now benefit from the datasets
that cannot easily be analyzed manually due to their massive dimensions. Machine learning is now
usually used for image classification. The outputs of these machine learning models are employed to
detect and classify cancerous tumors and evaluate the effects of potential treatments for a disease.
Advances in empirical techniques have resulted in the development of several databases that
list the structures of chemicals and their effects on a wide range of processes or biological activities.
These structure–activity relationships (SARs) lay the foundations for a discipline known as
cheminformatics. Scientists use the data in these large datasets to develop predictive models. Moreover,
making good and rapid decisions in this field can lead to the identification and optimization
of promising candidates.
The huge amount of data requires a new generation of scientists who are competent in both scientific
and computational areas. Those who possess these combinatorial skills will have the potential to work
on the structures and procedures for big datasets and make scientific discoveries.
Bioinformatics is an interdisciplinary science that includes methods and software for understanding
biological information. It uses a combination of computer science, statistics, and mathematics
to analyze and interpret biological data; in other words, bioinformatics addresses
biological problems using computer algorithms and mathematical and statistical techniques.

1.4 Deep learning and bioinformatics


Deep learning, with its successful experimental results and wide applications, has the potential to change
the future of medical science. Today, the use of artificial intelligence has become increasingly common
in various fields, such as cancer diagnosis. Deep learning also enables computer vision,
imaging, and more accurate medical diagnosis. So it is no surprise that a report from ReportLinker
states that the market for artificial intelligence in the medical industry is expected to grow from $1.2
billion in 2018 to $26 billion in 2025!
Deep learning: the future of medical science
As deep learning has become so popular in industry, the question arises as to how it will affect
our lives in the next few years. In medicine, although we have stored large amounts of patient data
over the past few years, deep learning has so far been used mainly to analyze image or text data. In addition,
deep learning has recently been used to predict a wide range of problems and clinical outcomes. Deep
learning will have a remarkable future in medicine.
Today’s interest in deep learning in medicine stems from two factors: first, the widespread growth of
deep learning techniques; second, a dramatic increase in health care data.
Using deep learning in e-health records
Electronic health systems store patient data such as demographic information, medical records, and
test results. These systems can use deep learning algorithms to improve diagnostic accuracy and reduce the
time required to diagnose a disease. The algorithms use data stored in electronic health systems to
identify patterns in health trends and risk factors, and draw conclusions based on the identified patterns.
Researchers can also use data from e-health systems to create deep learning models that predict the
likelihood of certain health-related outcomes.

1.5 What will you learn?


Let us briefly review what you will learn in this book:
Chapter 2 provides a brief introduction to machine learning. I begin with a definition of Artificial
Intelligence from the Oxford Dictionary. Then, I provide a figure that shows the relationship between
Artificial Intelligence, Machine Learning, and Deep Learning, and state the difference between traditional
programming and machine learning methods. In Chapter 2, I discuss the model. Machine learning
aims to automatically create a “model” from “data”, which you can then use to make decisions. Machine
learning typically proceeds by initially splitting a dataset into a training set that is used to generate
a model and a test set that is used to evaluate the performance of the model. Chapter 2 also discusses
generalization. Generalization usually refers to a machine learning model’s ability to perform well on
unseen data rather than just the data that it was trained on. From the concept of generalizability in
machine learning, two other terms emerge, called underfitting and overfitting. If your model is overfitted,
then it will not generalize well. I describe these problems using an example and then summarize the
meanings of these two concepts. To deal with the overfitting problem, there are, in general, two approaches,
namely regularization and cross-validation, which I discuss.
There are different ways in which machines learn. In some cases, we train them (called supervised
learning) and, in some other cases, machines learn on their own (called unsupervised learning). In
Chapter 2, I discuss the three ways that a machine can learn: supervised learning, unsupervised
learning, and reinforcement learning.
To work with deep learning, you need to be familiar with a number of mathematical and statistical
concepts. In Chapter 2, I outline some of the important concepts, e.g., tensors, you will be working with.
Chapter 2 introduces the Keras library where we will implement deep learning projects. Chapter 2 ends
by introducing several real-world tensors.
Chapter 3 provides a brief introduction to the Python ecosystem. If you would like to make a
career in the domain of deep learning, you need to know Python programming along with the Python
ecosystem. According to a report from GitHub, Python is the most popular programming language used
for machine learning projects hosted on its service. To build effective deep learning models, you need to have
some basic understanding of the Python ecosystem, e.g., the NumPy, Pandas, Matplotlib, and Scikit-learn
libraries. This chapter introduces various Python libraries and examples that are very useful for developing
deep learning applications.
The chapter begins by introducing four computing environments that you can
use to write Python programs without installing anything locally: IPython, the Jupyter notebook,
Colaboratory, and Kaggle. Chapter 3 provides general descriptions of the SciPy (Scientific Python)
ecosystem and the Scikit-learn library. It then covers the basic details of Python syntax you
should be familiar with to understand the code and write a typical program. The syntax elements discussed
include identifiers, comments, data types, control flow statements, data structures, and functions, each
explained with examples.
NumPy is a Python core library that is widely used in deep learning applications. This library
supports multidimensional arrays and matrices, along with a large collection of high-level mathematical
functions to operate on them. In Chapter 3, I provide several examples about this library which you will
need in deep learning applications. After providing an overview of NumPy, I discuss the Matplotlib
library which is a plotting library used for creating plots and charts. An easy way to load data is to use
the Pandas library. This library is built on top of the Python programming language. In Chapter 3, you
learn how to use this library to load data. In Python, there exist several ways to load a CSV data file to
use in deep learning algorithms. In Chapter 3 you will learn two frequently used ways: (1) loading CSV
files with NumPy and (2) loading CSV Files with Pandas. Reviewing the shape of the dataset is one
of the most frequent data manipulation operations in deep learning applications, for example, seeing
how much data we have, in terms of rows and columns. Chapter 3 also provides examples of this. After
that, I explain how you can use the Pearson correlation coefficient to determine the correlation between
features.
In Chapter 3, I explain Histograms, Box and Whisker Plots, and Correlation Matrix Plot, three
techniques that you can use to understand each feature of your dataset independently. Deep learning
algorithms use numerical features to learn from the data. However, when the features have different
scales, such as “Age” in years and “Income” in hundreds of dollars, the features using larger scales
can unduly influence the model. As a result, we want the features to be on a similar scale that can
be achieved through scaling techniques. In this chapter, you learn how to standardize the data using
Scikit-learn.
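As a small taste of these Chapter 3 operations, the sketch below (not code from the book; the tiny in-memory CSV and its column names are invented for illustration) loads comma-separated data with NumPy, reviews the dataset's shape, and computes the Pearson correlation between two columns:

```python
import io

import numpy as np

# A tiny in-memory CSV standing in for a real data file (hypothetical values).
csv_text = """age,income,diabetic
25,30,0
35,50,0
45,70,1
55,90,1
"""

# Loading CSV data with NumPy; skiprows=1 skips the header row.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

# Reviewing the shape: (rows, columns), i.e., (samples, features + label).
print(data.shape)  # (4, 3)

# Pearson correlation coefficient between the first two columns.
r = np.corrcoef(data[:, 0], data[:, 1])[0, 1]
print(round(r, 3))  # 1.0 here, since the toy columns are perfectly linear
```

Pandas offers the same workflow through read_csv(), the .shape attribute, and .corr(); Chapter 3 demonstrates both routes.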
Bioinformatics datasets are often high-dimensional. Chapter 3 introduces several feature selection
methods. Feature selection is one of the key concepts in machine learning; it is used to select a
subset of features that contribute the most to the output and thus hugely impacts the performance of the
constructed model. Chapter 3 ends by introducing the train_test_split() function, which allows you to
split a dataset into training and test sets.
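The real function lives in scikit-learn (sklearn.model_selection.train_test_split); the standard-library sketch below is only an illustration of what it does, namely shuffling the rows and carving off a fraction as the test set:

```python
import random

def split_dataset(X, y, test_size=0.25, seed=0):
    """Simplified stand-in for scikit-learn's train_test_split:
    shuffle the row indices, then reserve a fraction as the test set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

X = [[i] for i in range(8)]        # eight toy samples
y = [i % 2 for i in range(8)]      # alternating class labels
X_train, X_test, y_train, y_test = split_dataset(X, y, test_size=0.25)
print(len(X_train), len(X_test))   # 6 2
```

Holding out a test set that the model never sees during training is what makes the evaluation an honest estimate of generalization.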
Chapter 4 provides the basic structure of neural networks. In this chapter, I discuss the types of
neural network and provide an example of how to train a single-layer neural network. Chapter 4 dis-
cusses gradient descent which is used to update the network’s weights. To this end, three gradient
descent methods, namely Stochastic Gradient Descent, Batch Gradient Descent, and Mini-batch Gradi-
ent Descent, are discussed. Chapter 4 ends with a discussion about the limitations of single-layer neural
networks.
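To preview the idea, here is a minimal gradient-descent loop on a one-parameter toy function (an illustration of the update rule only, not the book's code; the stochastic, batch, and mini-batch variants differ in how much data is used to estimate each gradient):

```python
# Minimize f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
w = 0.0            # initial weight
lr = 0.1           # learning rate
for _ in range(100):
    grad = 2 * (w - 3)   # derivative of (w - 3)^2 with respect to w
    w -= lr * grad       # the gradient-descent update rule
print(round(w, 4))  # 3.0, the minimizer of f
```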
Training a multilayer neural network is discussed in Chapter 5. In this chapter, the backpropagation
algorithm, an effective algorithm used to train a neural network, is introduced. After that, I explain how
you can design a neural network in Keras. The MNIST dataset is often considered the “hello world” of
deep learning. The purpose of this example is first to classify different types of handwritten numbers
based on their appearance and then to classify the handwritten input into the most similar group in order
to identify the corresponding digit. In this chapter, I implement a handwritten classification problem
with dense layers in Keras. After the implementation of this problem, you can learn the components of
neural networks without going into technical details. Chapter 5 ends with a discussion about two more
general data preprocessing techniques, namely vectorization and value normalization. After studying
this chapter, you will be able to design a deep learning network with dense layers.
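As a preview of value normalization, the sketch below (with made-up values) rescales a feature into the [0, 1] range, a common form of the technique known as min-max normalization:

```python
# Min-max normalization: rescale each value of a feature into [0, 1].
values = [20.0, 50.0, 80.0, 100.0]
lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]
print(normalized)  # [0.0, 0.375, 0.75, 1.0]
```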
Chapter 6 discusses the classification problem. Classification is a very important task in bioinfor-
matics and refers to a predictive modeling problem where a class label is predicted for a given input
data. Pima Indians Diabetes Database is employed to predict the onset of diabetes based on diagnostic
measures. In this dataset, there are 768 observations with 8 input variables (i.e., the number of features)
and one output variable (Diabetic or Nondiabetic). In this chapter, using the Pima dataset, I implement
a binary classification in Keras to classify people into diabetic and nondiabetic categories. Neural
networks expect numerical input values. For nonnumerical data, we need to convert it to numerical data
to make this data ready for the network. Label encoding is one of the popular processes of converting
labels, i.e., categorical texts, into numeric values in order to make them understandable for machines.
Chapter 6 explains how you can do this. In this chapter, I also discuss multiclass classification.
Chapter 7 provides an overview of deep learning. Deep learning is a type of machine learning that
has improved the ability to classify, recognize, detect, and generate—or in one word, understand. Chap-
ter 7 helps you understand why deep learning was introduced much later than the single-layer neural
networks and also what challenges deep learning faces. This chapter discusses the most important chal-
lenge of deep learning, namely overfitting, and how to deal with it. This chapter will show you how
to build a deep neural network with two examples from the bioinformatics field, namely breast cancer
classification and molecular classification of cancer by gene expression, using the Keras library.
In deep learning, the main problem is overfitting. The best solution to reduce overfitting is to get
more training data. When no further training data can be accessed, the next best solution is to limit
the amount of information your model can store or be allowed to store. This is called regularization.
In Chapter 7, I describe three techniques, namely reducing the network’s size, dropout, and weight
regularization, to deal with overfitting. Another important concept discussed in this chapter is how to
deal with imbalanced datasets. A dataset is said to be imbalanced when there is a significant difference
in the number of instances in one set of classes, called a majority class, compared to another set of
classes, called a minority class. On imbalanced datasets, neural networks may not function well. To deal
with this problem, this chapter introduces the RandomOverSampler class from the imbalanced-learn library.
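Random oversampling itself is simple to picture; the sketch below (a hand-rolled illustration, not the library's implementation) duplicates randomly chosen minority-class samples until the classes are balanced, which is the default behavior of imbalanced-learn's RandomOverSampler:

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples at random until every class
    is as large as the biggest one."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(samples) for samples in by_class.values())
    X_out, y_out = [], []
    for label, samples in by_class.items():
        extra = [rng.choice(samples) for _ in range(target - len(samples))]
        for xi in samples + extra:
            X_out.append(xi)
            y_out.append(label)
    return X_out, y_out

X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 0, 0, 1]            # imbalanced: five of class 0, one of class 1
X_res, y_res = random_oversample(X, y)
print(sorted(y_res))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```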

Chapter 8 introduces the convolutional neural network, a deep neural network with special image
processing applications. Such networks significantly improve the processing of information (images) by
deep layers. In Chapter 8, I briefly explain the basic part of a convolution architecture. How convolution
works can hardly be described in words. However, the concept and steps of calculating it are simpler
than they first seem. In this chapter, using a simple example, I show how convolution works. This
chapter also discusses the pooling layer. This layer is utilized to reduce the image’s size by summarizing
neighboring pixels and giving them a value. In fact, it is a downsampling operation. In this chapter, I
implement three medical image processing problems in Keras, namely predicting coronavirus disease
(COVID-19), predicting breast cancer, and detecting diabetic retinopathy. After studying these three
problems, you will learn many practical concepts and techniques in image processing.
Chapter 9 provides an overview of popular deep learning image classifiers. In this chapter, I analyze
eight well-known image classification architectures that have been ranked first in the ILSVRC compe-
tition in different years, along with their Keras codes. After studying this chapter, you will be able to
design high-precision convolutional networks for a problem of interest.
In Chapter 10, I discuss electrocardiogram (ECG) arrhythmia classification. Arrhythmia refers to
any irregular change from normal heart rhythms. This chapter helps you understand how to classify
ECG signals into normal and different types of arrhythmia using a convolutional neural network (CNN).
This chapter provides a Keras code to do this.
Chapter 11 discusses autoencoders and generative models and how to implement them. The networks
discussed in this chapter, although seemingly similar to those of the previous chapters, use different
concepts, called encoding and decoding, that were not present in earlier chapters. Autoencoders
and generative models form a newly emerging field that has shown a lot of success
and received increasing attention in the deep learning area over the past couple of years. In this chapter,
I discuss different types of deep generative model and focus on variations of autoencoders, teaching
how to implement and train autoencoders and deep generators using Keras.
A large part of data, such as speech, protein sequences, data received from sensors, videos,
and text, is inherently serial (sequential). Sequential data are data whose current value depends on
previous values. Recurrent neural networks (simple RNNs) are a good way to process sequential data
because they take sequence dependence into account in their calculations, but their capability to handle long
sequences is limited. The long short-term memory network, LSTM for short, is a type of recurrent
neural network used to handle long sequences. In Chapter 12, I discuss RNNs and LSTMs, as well as
two important topics in bioinformatics, namely protein sequence classification and the design of new
molecules.
Chapter 13 presents several deep learning applications in bioinformatics, then discusses several deep
learning challenges and ways we can overcome them.
CHAPTER 2

A review of machine learning
2.1 Introduction
“Machine learning can’t get something from nothing ... what it does is get more from less.” – Dr. Pedro
Domingos, University of Washington

Before moving on to the meaning of machine learning, let us find out what Artificial
Intelligence (AI) means. According to Webster’s dictionary, intelligence is the ability to learn and solve
problems; in other words, intelligence is the ability to acquire and apply knowledge. Knowledge
is the information gained through experience and/or training. Artificial, in turn, refers to something that is
simulated or made by humans, not by nature.
Now we are ready to define AI. There is no single definition of AI. The Oxford Dictionary
defines AI as “the theory and development of computer systems able to perform tasks normally requiring
human intelligence, such as visual perception, speech recognition, decision-making, and translation
between languages.” AI, therefore, aims to give a machine all the abilities that
the human mind possesses.

2.2 What is machine learning?


Machine Learning (ML), as a branch of AI, is the study of algorithms and statistical inference by which a
machine can learn on its own without being explicitly programmed, relying on patterns and inference
instead. Here is a basic definition of ML: machine learning is a data analysis method in which a system learns
from data and then employs what it has learned to make well-informed decisions. Many people
think that the terms machine learning, deep learning, and artificial intelligence are the same, and they
use these words interchangeably. These terms overlap and can easily be confused, but in the computer
science field they are related, not identical. Fig. 2.1 depicts the relationships among these
three terms. AI is an umbrella term often used to describe systems that make automatic decisions on
their own. ML is a way of achieving AI, which means that by the use of machine learning we may be able to
achieve AI in the future. While AI is the broad field of study that mimics human intelligence, machine
learning is a specific branch of AI that trains a machine how to learn. Therefore, AI encompasses
everything from machine learning to deep learning: deep learning is a subset
of machine learning, and machine learning is a subset of AI.
Machine learning provides the system with the capability of automatically learning from historical
data without using explicit instructions. Fig. 2.2 shows the difference between traditional programming
and machine learning methods.

FIGURE 2.1
The relationship between artificial intelligence, machine learning, and deep learning.

FIGURE 2.2
Traditional programming (left) vs machine learning (right).

In machine learning, we can generate a program (also known as a learned model) from examples of
that program’s inputs and outputs.

Machine learning is very popular now and is often synonymous with artificial intelligence.
In general, one cannot understand the concept of artificial intelligence without knowing how
machine learning works.

Let x and y be two vectors. In most machine learning problems, the aim is to create a
mathematical function of the form
y = f (x).
This function may take many vectors as input, perhaps thousands or even millions, and may generate
many numbers as output. Here are some examples of functions you may want to create:
• x contains the health characteristics of a person, e.g., Pregnancies, Glucose, Blood
Pressure, Skin Thickness, Insulin, BMI, and Age, and f (x) should equal 1 if the person has diabetes
and 0 otherwise.
• x is the structure of a protein (i.e., a sequence of amino acids) and f (x) must determine
the function of that protein; depending on the dataset used, there can be many functions.
• x is a color image, and f (x) should equal 1 if the image shows breast cancer and
0 otherwise.
• x is a chest radiograph (chest X-ray) image; f (x) should be a vector of numbers.
The first element indicates whether the image contains a pleural thickening, the second whether it
contains cardiomegaly, the third whether it contains a nodule, and so on for many types of findings.
As you can see, f (x) can be a very, very complex function! It usually takes a lot of inputs and tries
to extract patterns from them that cannot be extracted manually just by looking at the input numbers.
In machine learning, f (x) is called the model.
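To make this concrete, here is a minimal sketch (not from the book) of "learning" a model f(x) from example pairs. The data is synthetic and generated from a known linear rule, so we can check that learning recovers it; least squares stands in for whatever learning algorithm is used.

```python
import numpy as np

# Synthetic example pairs (x, y) generated from the rule y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# "Learning" here means finding the parameters w and b of the model
# f(x) = w*x + b that best fit the observed pairs (ordinary least squares).
w, b = np.polyfit(x, y, deg=1)

print(round(w, 1), round(b, 1))  # recovers values close to 2 and 1
```

Real models have far more parameters, but the principle is the same: the function is fitted to the example pairs rather than written by hand.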
In machine learning, we basically try to build a model from the dataset, referred to as the
"learned model," to make predictions on new and unseen data. This short description has implications that may
not be obvious at first glance, so let me elaborate on it, starting with a few words of terminology. Machine
learning aims to automatically create a "model" from "data," which you can then use to make decisions.
Here, the data means information such as genes, proteins, images, documents, etc.
Before going further toward the model, let me step aside from the model a bit. If you have noticed,
the definition of machine learning only describes the concepts of data and model, and does not discuss
anything about “learning.” The term machine learning itself describes the process of finding a model
by analyzing data without having to be done by a human. Because this process, i.e., finding a model,
is trained with the help of data, we call it the “learning process.” Therefore, the data used for building
the “model” is called the training set. The first thing you need, of course, is a training set to train the
model. Fig. 2.3 depicts the overall process of machine learning.
I need to point out that the dataset used for a problem is initially split into two sets: training set and
test set. As mentioned earlier, the samples in the training set are used to train the model and the samples
in the test set are used to evaluate the performance of the resulting model. Fig. 2.4 shows this division.
After testing the model, if it is observed that the model is performing well enough, it can be used in the
real environment for new data.
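The split itself is straightforward to implement. The following sketch (with made-up data) shuffles the samples and holds out a fraction for testing; the function name and the 20% test ratio are illustrative choices, not fixed conventions.

```python
import numpy as np

def split_train_test(X, y, test_ratio=0.2, seed=0):
    """Randomly split a labeled dataset into training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))       # shuffle the sample indices
    n_test = int(len(X) * test_ratio)   # how many samples go to the test set
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy dataset: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(len(X_train), len(X_test))  # 8 2
```

Shuffling before splitting matters: if the data is ordered (e.g., by class), a naive slice would give training and test sets with different distributions.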
Model is our main interest in this section, and let us now resume this discussion. In machine learn-
ing, the model is the final product we are looking for and this is what we actually use. The resulting

FIGURE 2.3
The overall process of machine learning.

FIGURE 2.4
Splitting data into two parts of training and test sets.

model can be a mathematical representation of a real-world process. For example, if we are developing
a prediction system to identify the risk of breast cancer at earlier stages of the disease, the prediction
system is the model that we are talking about. If the training data used in the learning process are com-
prehensive, the model constructed works as well as the experts themselves. Machine learning has two
steps of training and inference:
• Training refers to the process of creating a model,
• Inference refers to the process of using a trained model to make a prediction.

FIGURE 2.5
Using the model for prediction.

In machine learning, the output of the training process is a model, which we can then apply to
real-world data. This process is depicted in Fig. 2.5. The training data that is used to
create the model and the new data encountered in the real environment are often different.

2.3 Challenge with machine learning


I have just explained that machine learning is a data analysis method that automates model building
to recognize patterns (rules) and make decisions with minimal human interference. This method is
usually utilized to perform tasks that normally require human intelligence such as image recognition
where it is infeasible or difficult to design a conventional algorithm for effectively performing the task.
Using machine learning can solve this problem, but it creates inevitable issues. The following is the
fundamental issue that machine learning faces.
The difference between the data that the model was trained on and new, unseen data is the structural
challenge that machine learning faces. It would not be an exaggeration to say that every problem in
machine learning arises from this. For instance, suppose that we trained a model using a few medical
images for a particular disease. Will the model successfully recognize new medical images? The
chances are very low.
Machine learning needs a comprehensive training set in order to work properly. No machine learning
algorithm can achieve the desired aim with small-sized or poor training data. Generalization is a term
used to express a model’s capability to cope with new data. Generalization usually refers to a machine

learning model’s ability to perform well on unseen data rather than just the data that it was trained on.
The ability of a model to generalize is crucial to the success of machine learning (learned model).

2.4 Overfitting and underfitting


One of the important considerations in machine learning is how well the learned model generalizes to new
data. Because the data that is collected is typically small, incomplete, and noisy, a constructed
model must be generalizable.
Generalization refers to the fact that the concepts learned by a machine learning model can be well
generalized to the new examples encountered. So the concept of generalizability refers to the model’s
ability to make output (make a prediction) from new data that it has not yet seen.
Due to the concept of generalizability in machine learning, two other terms emerge, called Under-
fitting and Overfitting. Both of these concepts reflect the poor performance of the learner algorithm
in machine learning. Let us start with an example. Suppose you are studying for a final exam. The
teacher has given you 100 sample questions so that you can use them to prepare yourself for the
exam. If you study in such a way that you can answer only these 100 sample questions, and you answer
incorrectly any other question that differs even slightly from them, it means that your
mind has overfit to the sample questions the teacher gave you for learning.
The meanings of these two concepts are summarized as follows:
Overfitting occurs when learning is well done on training data, but performance on unseen data is not
good. As a matter of fact, the constructed model cannot be generalized. In summary,

Overfitting = Good Learning + Not Generalized

Overfitting is due to the model learning “too much” from the training data. When we simplify the
model to reduce the risk of overfitting, we call this process regularization.
Underfitting occurs when learning is poor on the training data itself, and the model also performs poorly on
other datasets. Underfitting is due to the model having "not learned enough" from the training data, yielding
low generalization and inaccurate predictions.
In summary,
• In overfitting, the accuracy of the model is high for the training data and for data very similar to the
training data.
• In overfitting, the model accuracy for new and never seen data is low.
• Overfitting occurs when the model is highly dependent on training data and therefore cannot be
generalized to new data.
• Overfitting occurs when a model learns the details and noise in training data to the extent that it
negatively affects the model performance on new data.
• Overfitting occurs when the model tries to memorize the training data only, instead of learning the
scope of the problem and finding the relationship between the independent and dependent variables,
and this is what we call being very dependent on the training data.
• Underfitting occurs when the model is not sufficiently trained from the training data at the time of
learning.
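Both behaviors are easy to reproduce numerically. The sketch below (synthetic sine-wave data, invented for illustration) fits polynomials of increasing degree to a small noisy training set: degree 1 underfits, while degree 9 fits the training points almost exactly yet typically does much worse on unseen test points.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    """Noisy samples of a sine wave on [0, 1]."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(10)   # small training set
x_test, y_test = make_data(200)    # unseen data

results = {}
for degree in (1, 3, 9):
    coef = np.polyfit(x_train, y_train, degree)   # fit a polynomial model
    mse_train = np.mean((np.polyval(coef, x_train) - y_train) ** 2)
    mse_test = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
    results[degree] = (mse_train, mse_test)
    print(f"degree {degree}: train MSE {mse_train:.4f}, test MSE {mse_test:.4f}")
# Degree 1 underfits (both errors high); degree 9 overfits:
# its training error is near zero, but it usually predicts unseen data
# worse than the moderate degree-3 model.
```

The gap between training error and test error is the signature of overfitting; when both are high, the model is underfitting.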

2.4.1 Mitigating overfitting


Overfitting negatively impacts the machine learning performance on unseen data. We can figure out
who is a professional and who is an amateur by looking at their strategies in dealing with overfitting.
To deal with the overfitting problem, in general, there are two ways, namely regularization and cross-
validation.
Regularization techniques help confront overfitting in machine learning aiming to build a model
as simple as possible. In simple terms, they reduce parameters and simplify the model. The resulting
simplified model:
1. Can reduce overfitting,
2. Usually converges faster during training (although the minimum it reaches may be somewhat worse),
3. Is less likely to learn noise in the data, which may improve the model's generalization capabilities.
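As a small illustration of the idea (a generic L2-penalized linear model, i.e., ridge regression, chosen for illustration and not prescribed by the book), adding a penalty λ on the size of the weights visibly shrinks them, yielding a simpler model. The data here is hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam*I)^(-1) X^T y."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Hypothetical data: only the first two of five features actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=30)

w_plain = ridge_fit(X, y, lam=0.0)   # ordinary least squares
w_reg = ridge_fit(X, y, lam=10.0)    # penalized: smaller, simpler weights

print(np.linalg.norm(w_reg) < np.linalg.norm(w_plain))  # True
```

Larger λ means stronger shrinkage; λ itself is typically chosen with the cross-validation procedure described next.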
We always need to determine whether a model could be generalized to new, unseen data, in other
words, whether the trained model is overfitted or not. Cross-validation is another method to deal with
overfitting. Validation is a very useful method to evaluate the effectiveness of your model, particularly
in cases where you need to reduce overfitting.

2.4.2 Adjusting parameters using cross-validation


In deep learning, we need to estimate the model parameters. If the number of parameters is large, the
model becomes more complex and the estimations may not be easy to perform. On the other hand,
increasing the parameters may reduce the efficiency of the model. As has already been mentioned, such
a problem is known as “overfitting.” The solution to such a problem could be to use “cross-validation”
in which the goal is to determine the appropriate number of parameters of the model. This method is
sometimes called “rotation estimation” or “out-of-sample testing.” In such a case, the parameters that
are estimated by cross-validation are called “out-of-sample estimation.”
To measure the performance of a model, two methods are usually used: (1) evaluation based on the
assumptions on which the model should work; and (2) evaluation based on the efficiency of the model
in predicting new values (not observed).
In Method 1, the evaluation of the model relies on the data (samples) observed and used to build the
model. For example, we expect the constructed model to have the least sum of squares of error com-
pared to any other model. It is clear that this method is possible based on the data on which the model is
based, but the performance of the model cannot be measured for new data that was not observed during
modeling. Method 2, which is called cross-validation, relies on data that is observed but not used when
building the model. This data is used to evaluate and measure the performance of the model to predict
new data.
Thus, to measure the efficiency of the model and its optimality, we resort to estimating the model
error based on the data that have been set aside for cross-validation. Estimating this error is commonly
referred to as an “out-of-sample error.” In the following, I describe cross-validation as a tool to measure
this error and examine different ways of implementing it.
Assume that observations from society are available as a random sample that is to be used in mod-
eling. The goal in cross-validation is to achieve a model whose number of parameters is optimal. That
is, finding a model that does not overfit. To achieve this goal in machine learning, the data is usually
divided into two parts, training data and test data.

FIGURE 2.6
Splitting the training data into two sets, namely training and validation sets. The validation set must not share any
samples with either the training set or the test set.

According to the separation intended for these two sets, modeling will be based only on the training
data part. But in the cross-validation method, hereinafter referred to as CV, during a repetitive process,
the training set used to create the model is split into two parts. Each time the CV process is repeated,
part of the data is used to train and part to test the model. Thus, this process is a sampling method
to estimate the model error. Fig. 2.6 illustrates the splitting of training data into two sets, training and
validation sets.
The ratio of these parts is also debatable, which I will not discuss here, but usually 50% of the total
data is for training purposes, 25% for cross-validation, and the rest of the data for model testing.
It should be noted that the test data in the CV process may be used as training data in the next
iteration, so their nature is different from the data previously introduced as test data.
At each stage of the CV process, the model trained by applying the training samples is used to
predict the other part of CV data, and the “error” or “accuracy” of the model is calculated on the samples
that were not used to train the model. The average of these errors (accuracy) is usually considered as
the overall error (accuracy) of the model. Of course, it is better to report the standard deviation of
errors (accuracy). Thus, according to the number of different parameters (model complexity), different
models can be produced and their estimation error can be measured using the CV method. At the end,
we will choose a model as the most appropriate if it has the lowest error estimate.

2.4.3 Cross-validation methods


Based on the method of selecting the validation set, different CV methods have been introduced. In the
following, I discuss some of them.
Holdout method. In this method, the data is randomly divided into two parts, training and validation.
Model parameters are estimated by using training data and model error is calculated based on validation
data.
The simplicity of calculations and nonrepetition of the CV process in this method are its advantages.
This method seems appropriate if the training and validation data is homogeneous. However, since the
model error calculations are based on only one part, a suitable estimate for the model error may not be
provided.

Leave-One-Out method. In this method, one observation is removed from the training set, and the
parameters are estimated based on the remaining observations. The model error is then calculated for
the removed observation. Since only one observation is removed at each stage of the
CV process, the number of iterations of the CV process equals the number of training samples. As a
result, the method is simple to implement, although the error calculation can be time-consuming for
large training sets. This method is sometimes called LOO for short.
Leave-P-Out method. If in the LOO method the number of observations left out of the training set
is equal to p, the method is called Leave-P-Out, or LPO for short. As a result, if n denotes the number
of observations in the training set, the number of steps in the CV process will be the binomial
coefficient C(n, p) = n!/(p!(n − p)!). Thus, at each stage of the process, p observations are removed
from the training data and the model is estimated based on the remaining observations. The model
error is then calculated for the removed observations. Finally, by
calculating the average of the obtained errors, the model error is estimated.
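The number of splits grows combinatorially with p, which is why LPO quickly becomes expensive. Python's standard library can compute the count directly (the n and p values below are arbitrary examples):

```python
from math import comb

# Number of CV iterations for Leave-P-Out on a training set of n samples:
# every way of choosing which p observations to hold out is one iteration.
n, p = 10, 2
print(comb(n, p))  # 45 train/validation splits

# With p = 1 this reduces to Leave-One-Out: n iterations.
print(comb(n, 1))  # 10
```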
K-Fold method. If we randomly split the training samples into k subsets or "folds" of the same size,
at each stage of the CV process we can use k − 1 of these folds as the training set and the remaining one
as the validation set. Fig. 2.7 illustrates the splitting of the training data into k folds. It is clear that by
selecting k = 5, the number of iterations of the CV process will be equal to 5 and it will be possible to
achieve the appropriate model quickly. This method is the gold-standard to evaluate the performance of
a machine learning algorithm.
Choosing the right number of folds is an important consideration in this approach. When choosing
the number of folds, it should be noted that it is necessary to have enough data in each fold to be able
to provide a good estimate of the model performance. On the other hand, the number of folds should
not be underestimated, in order to have enough folds to evaluate the model performance.
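A bare-bones k-fold loop can be sketched as follows; the straight-line model and synthetic data are purely illustrative stand-ins for whatever model and dataset are being evaluated.

```python
import numpy as np

def k_fold_cv(x, y, k=5, seed=0):
    """Estimate out-of-sample MSE of a straight-line model with k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)   # k index subsets
    errors = []
    for i in range(k):
        val_idx = folds[i]                               # validation fold
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])
        coef = np.polyfit(x[train_idx], y[train_idx], deg=1)
        pred = np.polyval(coef, x[val_idx])
        errors.append(np.mean((pred - y[val_idx]) ** 2))
    return np.mean(errors), np.std(errors)  # report mean error and spread

rng = np.random.default_rng(2)
x = np.linspace(0, 5, 40)
y = 3 * x + rng.normal(scale=0.3, size=40)
mean_err, std_err = k_fold_cv(x, y, k=5)
print(f"CV error estimate: {mean_err:.3f} +/- {std_err:.3f}")
```

As the text suggests, reporting the standard deviation alongside the mean gives a sense of how stable the estimate is across folds.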
Validation based on random sampling. In this method, sometimes known as Monte Carlo cross-
validation, the dataset is randomly divided into training and validation sets. The model parameters are
then estimated based on the training data and the error or accuracy of the model is calculated using the
validation data. By repeating the random separation of data, the mean error or accuracy of the models
is considered as the criterion for selecting the appropriate model (least error or highest accuracy). Due
to the random selection of data, the ratio of training data size and validation will not depend on the
number of iterations, and unlike the k-fold method, the CV process can be performed with any number
of iterations. Instead, due to the random selection of subsamples, some observations may never be used
in the validation section and others may be used more than once in the model error estimate calculations.

2.5 Types of machine learning


Generally, there are three ways for a machine to learn (Fig. 2.8): Supervised Learning, in which we
train the machine with labeled examples; Unsupervised Learning, in which the machine learns on its
own from unlabeled data; and Reinforcement Learning, in which an agent learns by interacting with an
environment. Let us see how a machine learns in detail.

FIGURE 2.7
K-fold validation process.

2.5.1 Supervised learning


The supervised approach is actually similar to a student learning under the supervision of a teacher.
The teacher teaches the student by solving examples, and then the student derives general rules from
these examples and thus will be able to solve new examples that he has not seen before. In supervised
learning, we have a training dataset consisting of samples in which we know the truth or the correct
output for each sample, and we train a model by telling the truth through samples. The shape of the
training dataset is as the following pairs:

{input, correct output}



FIGURE 2.8
Three core types of machine learning techniques differing in their approach.

Table 2.1 The shape of training dataset.


Input Correct output
Input #1 correct output for Input #1
... ...
Input #n correct output for Input #n

Table 2.1 shows the shape of the training dataset in detail. In this table, you can see that the correct
output is provided for each input. Another name for the “correct output” is “class” or “label.”
Now that you know the meaning of the supervised process, let us look at how a supervised algorithm
works:
Step 1. Data preparation – the very first step, conducted before training a model in the supervised
process, is to load labeled data into the system. This step usually takes the most time, as it includes
data labeling and preprocessing operations such as removing invalid data. Most
tasks at this stage are performed by a human trainer. At the end of this step, the
prepared dataset is divided into training and test sets.
Step 2. Training process – the goal of this step is to find a relationship between input and output with
acceptable accuracy. Machine learning algorithms are used to find such a relationship. The output of
this step is a model made for the problem.
Step 3. Testing process – the model built in the second step will be tested on new data in this step to
determine its performance in the face of new and unseen data.
Step 4. Prediction – when the model is ready after training and testing, it can start making a prediction
or decision when new data is given to it.
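These four steps can be sketched end to end with a deliberately simple classifier: a 1-nearest-neighbor model on made-up 2D data. The model and data are illustrative assumptions; any supervised algorithm could take their place.

```python
import numpy as np

class NearestNeighbor:
    """A minimal supervised model: memorize the training pairs (fit),
    then label new points by their closest training sample (predict)."""

    def fit(self, X, y):                     # Step 2: training
        self.X, self.y = np.asarray(X, float), np.asarray(y)
        return self

    def predict(self, X):                    # Step 4: prediction
        X = np.asarray(X, float)
        # Pairwise distances between new points and stored training points.
        dists = np.linalg.norm(X[:, None, :] - self.X[None, :, :], axis=2)
        return self.y[np.argmin(dists, axis=1)]

# Step 1: prepared, labeled data (two well-separated classes).
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], float)
y = np.array([0, 0, 0, 1, 1, 1])

model = NearestNeighbor().fit(X, y)

# Step 3: testing on unseen points near each class.
X_new = np.array([[0.5, 0.5], [5.5, 5.5]])
print(model.predict(X_new))  # [0 1]
```

Here "training" is trivial (memorization), which is exactly why nearest-neighbor models are prone to the overfitting issues discussed earlier.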
There are two main supervised learning techniques: Regression and Classification. Table 2.2 shows
a summary of what they perform.
A classification algorithm classifies the input data (new observation) into one of several predefined
classes. It learns from the available dataset and then uses this learning to classify new observations.
There are two types of classification, which are binary and nonbinary classification. The classification

Table 2.2 Two supervised machine learning algorithms.


Classification Classifying something into classes, and predicting unseen data from created model
Regression Finding the relationship between variables

Table 2.3 Structure of training data.


Input Class
Feature 1 Feature 2 ... Feature n correct output
Input #1 Value 1 Value 2 ... Value n correct output for Input 1
Input #2 Value 1 Value 2 ... Value n correct output for Input 2

of humans into two groups with diabetes and those without is an example of a binary classification.
Protein family classification is an example of nonbinary classification. In this problem, the proteins are
classified into classes that share similar function.
The structure of the training data of the classification problem, i.e., input and correct output pairs,
looks like in Table 2.3.

WHAT IS A FEATURE?
A feature in machine learning is any column value in the dataset that describes a piece of data.
For example, in the diagnosis of diabetes in a human, Pregnancies, Glucose, Blood Pressure,
etc., are examples of features. Note that we use features as independent variables.

Regression is another useful application from supervised machine learning algorithms that is used
to find a relationship between variables (features). It attempts to predict the output value when the input
value is given. In contrast to classification, regression does not determine the class. Instead, it involves
predicting a numerical value.

2.5.2 Unsupervised learning


In unsupervised learning, no training data (the data given is not labeled) is available, and the dataset
contains only inputs without correct outputs. Instead, the algorithm automatically identifies the patterns
and relationships within the dataset and creates a structure from the data itself. This machine learning
technique is employed when we do not know how to classify the given data but need to do so. Now, let
us use an example to see how unsupervised machine learning works.
Suppose we provide images of cucumbers, peaches, and bananas to the model, so the machine
learning algorithm creates classes based on some patterns and relationships, assigning fruits to those
classes. Now if new data enters the model, it adds it to one of the created classes.
There are two primary categories in unsupervised machine learning, clustering and dimensionality
reduction. Table 2.4 shows a summary of what they perform.
Just for reference, clustering is the process of dividing data points into several clusters so that
the data points in the same cluster are more similar than the data points in other clusters. It is often
confusing how classification and clustering differ from each other, as their process is similar. Despite
this similarity, these methods offer two completely different approaches so that clustering helps you to

Table 2.4 Two unsupervised machine learning algorithms.


Clustering Partitioning the given data into clusters
Dimensionality Reduction Reducing features to related and meaningful features aiming to improve accuracy

FIGURE 2.9
Several supervised and unsupervised algorithms.

find all kinds of unknown patterns in data. Something to bear in mind is that clustering and classification
are distinct terms. Some clustering approaches are:
• Partitioning methods
• Hierarchical clustering
• Fuzzy clustering
• Density-based clustering
• Model-based clustering
Why reduce dimensionality? Among the reasons are time or space complexity, desire to reduce the
cost of viewing and collecting additional and unnecessary data, and having better visualization when
data is 2D or 3D. Fig. 2.9 depicts several most used supervised and unsupervised algorithms.
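Clustering is easy to sketch from scratch. Below is a plain k-means loop on two synthetic blobs; it is an illustrative toy (fixed iteration count, simple initialization), not one of the specific clustering approaches listed above.

```python
import numpy as np

def k_means(X, k, n_iter=20, seed=0):
    """Plain k-means: alternately assign points to the nearest center
    and move each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):          # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated synthetic blobs (no labels are given to the algorithm).
rng = np.random.default_rng(3)
blob_a = rng.normal(loc=[0, 0], scale=0.5, size=(20, 2))
blob_b = rng.normal(loc=[8, 8], scale=0.5, size=(20, 2))
X = np.vstack([blob_a, blob_b])

labels, centers = k_means(X, k=2)
print(len(set(labels[:20].tolist())), len(set(labels[20:].tolist())))  # 1 1
```

Each blob ends up in its own cluster even though no labels were supplied, which is the essence of unsupervised learning; a PCA-style dimensionality reduction would likewise find structure in X without using labels.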

2.5.3 Reinforcement learning


In recent years, Reinforcement Learning (RL) has achieved many successes in various fields, but there
are also situations in which applying it is difficult. Reinforcement learning describes

a set of learning problems in which an “agent” must perform “actions” in an “environment” in order to
maximize the defined “reward function.”
Unlike in supervised learning, in reinforcement learning there is no labeled data, i.e., no correct
input and output pairs. Thus, a large part of learning takes place "online": the
agent actively interacts with its environment over many repetitions and gradually learns a "policy"
that tells it which actions to take in order to maximize the "reward."
Reinforcement learning has different goals compared to unsupervised learning. While the goal in
unsupervised learning is to explore the distribution in the data in order to learn more about the data,
reinforcement learning aims to discover the right data model that maximizes the “total cumulative
reward” for the agent.
Q-learning and SARSA (State–Action–Reward–State–Action) are two popular, model-independent
algorithms for reinforcement learning. The difference between these algorithms is in their search strate-
gies.
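To make the Q-learning update rule concrete, here is a toy sketch on a four-state corridor invented for illustration (the environment, hyperparameters, and episode count are all assumptions): the agent learns that moving right from every state maximizes the cumulative reward.

```python
import numpy as np

# Toy environment: states 0..3 on a line; action 0 = left, 1 = right.
# Reaching state 3 yields reward 1 and ends the episode.
N_STATES, N_ACTIONS = 4, 2
alpha, gamma, eps = 0.5, 0.9, 0.3
Q = np.zeros((N_STATES, N_ACTIONS))
rng = np.random.default_rng(0)

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

for _ in range(300):                      # episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit Q, sometimes explore randomly.
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s', a').
        target = r + (0.0 if done else gamma * np.max(Q[s2]))
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2

print(np.argmax(Q[:3], axis=1))  # [1 1 1]: "go right" from states 0, 1, 2
```

SARSA would differ only in the target: instead of the max over next actions, it bootstraps from the action the policy actually takes next, which is what makes it on-policy.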

2.6 The math behind deep learning


Let us review some of the basic mathematical concepts you need to know to practice deep learning. Let
us take a look at how the data is displayed.

2.6.1 Tensors
Tensor may be a new word to you. A tensor is a generalization of vectors and matrices to an arbitrary
number of dimensions. Typically, deep learning uses tensors as the primary data structure. Tensors are
the basis of this field, which is why Google's TensorFlow is so named. Now, what is a tensor? A tensor
is actually a container for storing data. Let us see the several types of tensors:
Scalars (zero-dimensional tensors). A tensor that contains only one number is called a scalar. This
number can be an integer or a decimal number.
Vectors (one-dimensional tensors). An array of numbers or a one-dimensional tensor is called a vector.
In mathematical texts, we often see vectors written as follows:

    ⎡ x1 ⎤
x = ⎢ ⋮  ⎥
    ⎣ xn ⎦

or, written as a row, [x1, . . . , xn].
A one-dimensional tensor has exactly one axis. If an array has four elements, then it is called a 4D
vector. There is a difference between a 4D vector and a 4D tensor: the 4D vector has only one axis
containing four components, while the 4D tensor has four axes.
Matrices (two-dimensional tensors). A vector of vectors, or arrays, is called a matrix. A matrix has
two axes known as the row axis and the column axis. For example, the following matrix has three rows

FIGURE 2.10
3D tensor.

FIGURE 2.11
4D tensor.

and three columns (a 3 × 3 matrix):


⎡0 2 4⎤
⎢1 3 5⎥
⎣7 8 9⎦.

In this example, [0, 2, 4] is the first row of the matrix.


Three- and higher-dimensional tensors. If the elements of a matrix are placed in a vector, a 3D
tensor is created so that each element of a vector contains a matrix. In other words, by stacking two-
dimensional tensors, a three-dimensional tensor is created. Fig. 2.10 depicts a 3D tensor. Each two-
dimensional tensor or matrix is called a channel here. For example, channel 1 in Fig. 2.10 is yellow
(light gray in print version). So we say, channel 0, channel 1, channel 2, and so on.
By putting three-dimensional tensors together, a four-dimensional tensor is formed. In fact, a four-
dimensional tensor is a vector, each element of which is a three-dimensional tensor. Fig. 2.11 shows a
4D tensor.
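These tensor types map directly onto NumPy arrays, where the number of axes is the `ndim` attribute. The quick check below is an illustrative sketch, not code from the book:

```python
import numpy as np

scalar = np.array(7)                        # 0D tensor: a single number
vector = np.array([1, 2, 3, 4])             # 1D tensor: one axis, four components
matrix = np.array([[0, 2, 4],
                   [1, 3, 5],
                   [7, 8, 9]])              # 2D tensor: rows and columns
tensor3 = np.stack([matrix, matrix])        # 3D tensor: a stack of matrices ("channels")
tensor4 = np.stack([tensor3, tensor3])      # 4D tensor: a vector of 3D tensors

for t in (scalar, vector, matrix, tensor3, tensor4):
    print(t.ndim, t.shape)
# 0 ()
# 1 (4,)
# 2 (3, 3)
# 3 (2, 3, 3)
# 4 (2, 2, 3, 3)
```

Note that the 4-element `vector` is a "4D vector" in the sense above, yet its `ndim` is 1; only the tensor with four axes has `ndim` equal to 4.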
Other documents randomly have
different content
and Rooklands, there was not much in Corston worth living for. But
at the time this story opens, the charge of the coast had not long
been put in the hands of (comparatively speaking) a young and hale
man who bid fair to keep anybody else out of it for a long while to
come. His office was no sinecure though, for, notwithstanding the
difficulty of landing, the coast was a celebrated one for smugglers,
and as soon as the dark months of winter set in there was no lack of
work for the preventive officers. For the village of Corston did not, of
itself, run down to the sea. Between it and the ocean there lay the
salt marshes, a bleak, desolate tract of land, which no skill or
perseverance could reclaim from apparent uselessness. Except to
the samphire and cockle-gatherers, the salt marshes of Corston
were an arid wilderness which could yield no fruit. Many a farmer
had looked longingly across the wide waste which terminated only
with the shingled beach, and wondered if it were possible to utilise it.
But as it had been from the beginning, so it remained until that day;
its stinted vegetation affording shelter for sea-fowl and smugglers’
booty only, and its brackish waters that flowed and ebbed with the
tides, tainting the best springs on the level ground of Corston. It was
the existence of these marshes that rendered the coastguard
necessary to the village, which would otherwise have become a
perfect nest of smugglers. As it was, notwithstanding all the vigilance
of Mr John Burton and his men, many a keg of spirits and roll of
tobacco were landed on the coast of Corston, and many a man in
the place was marked by them as guilty, though never discovered.
For they who had lived by the salt marshes all their lives were
cunning as to their properties, and knew just where they might bury
their illegal possessions with impunity when the tide was low, and
find them safe when it had flowed and ebbed again. Everyone was
not so fortunate. Lives had been lost in the marshes before now—ay,
and of Corston men too, and several dark tales were told by the
gossips of the village of the quagmires and quicksands that existed
in various parts of them, which looked, although they never were,
both firm and dry, but had the power to draw man and horse with the
temerity to step upon them, into their unfathomable depths. But if the
smugglers kept Mr Burton and his men fully occupied on the sea
shore, the poachers did no less for Lord Worcester’s band of
gamekeepers at Rooklands; and Farmer Murray, who had a drop of
Scotch blood running in his veins, and was never so much alive as
when his own interests were concerned, had only saved his game
for the last three years by having been fortunate enough to take the
biggest poacher in Corston, red-handed, and let him off on condition
that he became his keeper and preserved his covers from future
violence. ‘Set a thief to catch a thief’ is a time-honoured saying, and
Farmer Murray found it answer. Isaac Barnes, the unscrupulous
poacher, became a model gamekeeper, and the midnight rest of the
inhabitants of Mavis Farm had never been disturbed by a stray shot
since; though the eldest son, George Murray, had been heard to
affirm that half the fun of his life was gone now that there was no
chance of a tussle with the poachers. Such was the state of Corston
some forty years ago. The villagers were rough, uneducated, and
lawless, and the general condition of the residents, vapid and
uninteresting enough to have provoked any amount of wickedness, if
only for the sake of change or excitement.

It was the end of September, and the close of a glorious summer.
The harvest had been abundant and the Norfolk soil, which knows
so well how to yield her fruits in due season, was like an exhausted
mother which had just been delivered of her abundance. The last
sheaves of golden corn were standing in the fields ready to be
carried to the threshing-barn, the trees in the orchards were weighed
down with their wealth of pears and apples, and in every lane
clusters of bare-headed children with their hands full of nuts and
their faces stained with blackberry juice, proved how nature had
showered her bounties on rich and poor alike. Lizzie Locke, who was
making her way slowly in the direction of the village, with a huge
basket on her arm, stopped more than once to wipe her hot face,
and pull the sun bonnet she wore further over her eyes, although in
another couple of days the October moon would have risen upon the
land. She was a young girl of not more than eighteen or twenty
years, and, as her dress denoted, bred from the labouring classes.
Not pretty—unless soft brown hair, a fair skin and delicate features,
can make a woman so—but much more refined in appearance than
the generality of her kind. The hands that grasped the handle of her
heavy basket had evidently never done much hard work, nor were
her feet broadened or her back bent with early toiling in the turnip
and the harvest fields. The reason of this was apparent as soon as
she turned her eyes toward you. Quiet blue eyes shaded by long
lashes, that seldom unveiled them—eyes that, under more fortuitous
circumstances, might have flashed and sparkled with roguish mirth,
but that seemed to bear now a settled melancholy in them, even
when her mouth smiled: eyes, in fact, that had been blinded from
their birth.
Poor Lizzie Locke! There was a true and great soul burning in her
breast, but the windows were darkened and it had no power to look
out upon the world. As she stood still for a few moments’ rest for the
third or fourth time between the salt marshes and Corston, her quick
ear caught the sound of approaching horses’ feet, and she drew on
one side of the open road to let the rider pass. But instead of that,
the animal was drawn up suddenly upon its haunches, and a
pleasant young voice rang out in greeting to her.
‘Why, Lizzie, is that you? What a careless girl you are—I might
have ridden over you.’
‘Miss Rosa,’ exclaimed the blind girl, as she recognised the voice
and smiled brightly in return.
‘Of course it’s Miss Rosa, and Polly is as fresh as a two-year-old
this morning. She always is, when she gets upon the marshes. It’s
lucky I pulled up in time.’
The new comer, a handsome girl of about the same age as Lizzie,
was the only daughter of Farmer Murray, of Mavis Farm. Spoilt, as
one girl amongst half-a-dozen boys is sure to be, it is not to be
wondered at that Rosa Murray was impetuous, saucy, and self-
willed. For, added to her being her father’s darling, and not knowing
what it was to be denied anything in his power to give her, Miss Rosa
was extremely pretty, with grey eyes and dark hair, and a complexion
like a crimson rose. A rich brunette beauty that had gained for her
the title of the Damask Rose of Corston, and of which no one was
better aware than herself. Many a gentleman visitor at Rooklands
had heard of the fame of the farmer’s pretty daughter, and ridden
over to Corston on purpose to catch a glimpse of her, and it was
beginning to be whispered about the village that no one in those
parts would be considered good enough for a husband for Miss
Rosa, and that Mr Murray was set upon her marrying a gentleman
from London, any gentleman from ‘London’ being considered by the
simple rustics to be unavoidably ‘the glass of fashion and the mould
of form.’ Mr Murray was termed a ‘gentleman farmer’ in that part of
the county, because he lived in a substantially-built and well-
furnished house, and could afford to keep riding-horses in his stable
and sit down to a dinner spread on a tablecloth every day. But, in
reality, his father had commenced life as a ploughman in that very
village of Corston, and it was only necessary to bring Farmer Murray
into the presence of Lord Worcester and his fashionable friends to
see how much of a ‘gentleman’ he was. He had made the great
mistake, however, of sending his children to be educated at schools
above their station in life, the consequence of which was that, whilst
their tastes and proclivities remained plebeian as his own, they had
acquired a self-sufficiency and idea of their merits that accorded ill
with their surroundings and threatened to mar their future happiness.
The Damask Rose of Corston was the worst example amongst them
of the evil alluded to. She had unfortunately lost her mother many
years before, so was almost completely her own mistress, and the
admiration her beauty excited was fast turning her from a
thoughtless flirt into a heartless coquette, the most odious character
any woman can assume.
But with her own sex, and when it suited her, Rosa Murray could
be agreeable and ingenuous enough, and there was nothing but
cordiality in the tone in which she continued her conversation with
Lizzie Locke.
‘What are you doing out here by yourself, child? You really ought
not to go about alone. It can’t be safe.’
‘Oh, it’s safe enough, Miss Rosa. I’ve been used to find my way
about ever since I could walk. I’ve just come up from the marshes,
and I was going to take these cockles to Mavis Farm to see if the
master would like them for his breakfast to-morrow.’
‘I daresay they will be very glad of them. George and Bob are
awfully fond of cockles. What a lot you’ve gathered, Lizzie. How do
you manage to find them, when you can’t see?’
‘I know all the likeliest places they stick to, Miss Rosa, as well as I
do the chimney corner at home. The tide comes up and leaves them
on the bits of rocks, and among the boulders, and some spots are
regular beds of them. I’ve been at it half my life, you see, miss, and I
just feel for them with my fingers and pick them off. I can tell a piece
of samphire, too, by the sound it makes as I tread over it.’
‘It’s wonderful,’ said Rosa; ‘I have often been surprised to see you
go about just as though you had the use of your eyes. It seems to
make no difference to you.’
Poor Lizzie sighed.
‘Oh, miss! it makes a vast difference—such a difference as you
could never understand. But I try to make the best of it, and not be
more of a burden upon aunt and Larry than I need to be.’
‘I’m sure they don’t think you a burden,’ said the other girl, warmly.
‘But I wonder I didn’t meet you on the marshes just now. I’ve been
galloping all over them.’
‘Not past Corston Point, I hope, miss,’ exclaimed Lizzie, hurriedly.
‘Yes, I have! Why not?’
‘Oh, don’t go there again, Miss Rosa. It isn’t safe, particularly on
horseback. There’s no end of quagmires beyond the Point, and you
can never tell when you’ll come on one and be swallowed up, horse
and all.’
Rosa Murray laughed.
‘Why aren’t you swallowed up then, Lizzie?’
‘I know my way, miss, and I know the tread of it too. I can tell when
the soil yields more than it should at low tide that I’m nearing a
quicksand. When the Almighty takes away one sense He sharpens
the others to make up for it. But the sands are full of danger; some of
them are shifting too, and you can never tell if they’re firm to-day
whether they won’t be loose to-morrow. Do take heed, Miss Rosa,
and never you ride beyond Corston Point without one of the young
gentlemen to take care of you.’
‘Well, I’ll remember your advice, Lizzie, for I don’t want to be
swallowed up alive. Good-bye.’
She put her horse in motion and cantered on some little way in
advance—then suddenly checked him again and turned back. All
Rosa Murray’s actions, like her disposition, were quick and
impulsive.
‘By the way, Lizzie, it’s our harvest-home supper to-night. You
must be sure and make Larry bring you up to the big barn with him.’
The blind girl crimsoned with pleasure.
‘Oh, Miss Rosa! but what should I be doing at your supper? I can’t
dance, you know. I shall only be in the way.’
‘Nonsense! You can hear the singing and the music; we have
made papa get a couple of fiddlers over from Wells; and you can eat
some supper. You will enjoy yourself, won’t you, Lizzie?’
‘Yes, miss, I think so—that is, if Larry and aunt are willing that I
should go; but it’s very good of you to ask me.’
‘You must be sure and come. Tell Larry I insist upon it. We shall all
be there, you know, and I shall look out for you, Lizzie, and if I don’t
see you I shall send some one round to your cottage to fetch you.’
Lizzie Locke smiled and curtsied.
‘I’ll be sure and tell Larry of your goodness, miss,’ she said, ‘and
he’ll be able to thank you better than I can. Here comes a
gentleman,’ she added, as she withdrew herself modestly from the
side of the young lady’s horse.
The gentleman, whom Lizzie Locke could have distinguished only
as such from the different sound produced by his boots in walking,
was Lord Worcester’s head gamekeeper, Frederick Darley. He was a
young fellow to hold the responsible position he did, being only about
thirty years of age, and he had not held it long; but he was the son of
the gamekeeper on one of Lord Worcester’s estates in the south of
England, and his lordship had brought him to Rooklands as soon as
ever a vacancy occurred. He was a favourite with his master and his
master’s guests, being a man of rather superior breeding and
education, but on that very account he was much disliked by all the
poor people around. Gamekeepers are not usually popular in a
poaching district, but it was not Frederick Darley’s position alone that
made him a subject for criticism. His crying sin, to use their own
term, was that he ‘held his head too high.’ The velveteen coat he
usually wore, with a rose in the button-hole, his curly black hair and
waxed moustache, no less than the cigars he smoked and the air
with which he affected the society of the gentry, showed the tenants
of Rooklands that he considered himself vastly above themselves in
position, and they hated him accordingly. The animus had spread to
Corston, but Mr Darley was not well enough known there yet to have
become a subject for general comment. Lizzie Locke had never even
encountered him before.
He was walking from the village on the present occasion swinging
a light cane in his hand, and as Rosa Murray looked up at the blind
girl’s exclamation, she perceived him close to her horse’s head.
‘Good morning, Miss Murray,’ he said, lifting his hat.
‘Good morning,’ she replied, without mentioning any name, but
Lizzie Locke could detect from the slight tremor in her voice that she
was confused at the sudden encounter. ‘Were you going down to the
beach?’
‘I was going nowhere but in search of you.’
‘Shall we walk towards home then?’ said Rosa, suiting the action
to the word. She evidently did not wish the blind girl to be a party to
their conversation. She called out ‘Good-bye, Lizzie,’ once more as
she walked her horse away, but before she was out of hearing, the
little cockle-gatherer could distinguish her say to the stranger in a
fluttered voice,—
‘I am so glad you are coming over to our harvest-home to-night.’
‘One of the grand gentlemen over from Rooklands come to court
Miss Rosa,’ she thought in the innocence of her heart, as she turned
off the road to take a short cut across the country to Mavis Farm.
Meanwhile the couple she alluded to were making their way slowly
towards Corston; she, reining in her horse to the pace of a tortoise,
whilst he walked by the side with his hand upon the crutch of her
saddle.
‘Could you doubt for a moment whether I should come?’ said
Frederick Darley in answer to Rosa’s question. ‘Wouldn’t I go twenty
—fifty miles, for the pleasure of a dance with you?’
‘You’re such an awful flatterer,’ she replied, bridling under the
compliment; ‘but don’t make too sure of a dance with me, for papa
and my brothers will be there, and they are so horribly particular
about me.’
‘And not particularly fond of me—I know it, Miss Murray—but I
care nothing at all about it so long as—as—’
‘As what?’
‘As you are.’
‘Oh, Mr Darley! how can you talk such nonsense?’
‘It’s not nonsense! it’s sober sense—come, Rosa, tell me the truth.
Are you playing with me, or not?’
‘What do you mean by “playing”?’
‘You know. Are you in earnest or in jest? In fact—do you love me
better than you love your father and your brothers?’
‘Mr Darley! You know I do!’
‘Prove it then, by meeting me to-night.’
‘Meeting you? Are you not coming to the harvest-home?’
‘I may look in, but I shall not remain long; I shall only use it as an
excuse to come over to Corston. Mr Murray is suspicious of me—I
can see that—and your brothers dislike me. I don’t care to sit at the
table of men who are not my friends, Rosa. But if you will take an
opportunity to slip out of the barn and join me in the apple copse, I
will wait there for you at ten o’clock.’
‘Oh! Frederick—if papa should catch me!’
‘I will take care of that! Only say you’ll come.’
‘I should like to come—it will be so lovely and romantic. Just like a
scene in a novel. But I am afraid it is very wrong.’
‘What is there wrong in a moonlight stroll? “The summer nights
were made for love,” Rosa, and we shall have a glorious moon by
nine o’clock to-night. You won’t disappoint me, will you?’
‘No, indeed I won’t; but if anything should be discovered you will
promise me—’
‘What? I will promise you anything in the world.’
‘Only that you will shield me from papa’s anger—that you will say it
was all your fault. For papa is dreadful when he gets in a temper.’
‘If you should be discovered—which is not at all likely—I promise
you that, rather than give you back into papa’s clutches, I will carry
you straight off to Rooklands and marry you with a special licence.
Will that satisfy you? Would you consent to be my wife, Rosa?’
‘Yes!’ she replied, and earnestly, for she had been captivated by
the manner and appearance of Frederick Darley for some weeks
past, and this was not the first meeting by many that they had held
without the knowledge of her father.
‘That’s my own Damask Rose,’ he exclaimed triumphantly; ‘give
me a kiss, dear, just one to seal the contract; there’s no one looking!’
He held up his face towards her as he spoke—his handsome
insouciant face with its bright eyes and smile, and she stooped hers
to meet it, and give the embrace he petitioned for.
But someone was looking. Almost as Rosa’s lips met Darley’s a
frightened look came into her eyes, and she uttered a note of alarm.
‘What is it, darling?’
‘It’s my brother George! He’s coming this way. Oh! go, Mr Darley—
pray go across the field and let me canter on to meet him.’ He would
have stayed to remonstrate, but the girl pushed him from her, and
thinking discretion the better part of valour, he jumped over a
neighbouring stile and walked away in the direction she had
indicated, whilst she, with a considerable degree of agitation, rode
on to make what excuses she best could for the encounter to her
brother. George Murray was sauntering along the hedge-row
switching the leaves off the hazel bushes as he went, and apparently
quite unsuspicious of anything being wrong. But the first question he
addressed to his sister went straight to the point.
‘Who was that fellow that was talking to you just now, Rosa?’
She knew it would be of no use trying to deceive him, so she
spoke the truth.
‘It was Mr Darley!’
‘What’s he doing over here?’
‘How should I know? You’d better ask him yourself! Am I
accountable for Mr Darley’s actions?’
‘Don’t talk nonsense. You know what I mean perfectly well. Did he
come over to Rooklands to see you?’
‘To see me—what will you get into your head next?’
‘Well, you seemed to be hitting it off pretty well together. What
were you whispering to him about just now?’
‘I didn’t whisper to him.’
‘You did! I saw you stoop your head to his ear. Now look here,
Rosa! Don’t you try any of your flirtation games on with Darley, or I’ll
go straight to the governor and tell him.’
‘And what business is it of yours, pray?’
‘It would be the business of every one of us. You don’t suppose
we’re going to let you marry a gamekeeper, do you?’
‘Really, George, you’re too absurd. Cannot a girl stop to speak to a
man in the road without being accused of wanting to marry him? You
will say I want to marry every clodhopper I may dance with at the
harvest-home to-night next.’
‘That is a very different thing. The ploughboys are altogether
beneath you, but this Darley is a kind of half-and-half fellow that
might presume to imagine himself good enough to be a match for
you.’
‘Half-and-half indeed!’ exclaimed Rosa, nettled at the reflection on
her lover; ‘and pray, what are we when all’s said and done? Mr
Darley’s connections are as good as our own, and better, any day.’
‘Halloa! what are you making a row about? I’ll tell you what, Rosa.
It strikes me very forcibly you want to “carry on” with Lord
Worcester’s keeper, and you ought to be ashamed of yourself for
thinking of it. You—who have been educated and brought up in
every respect like a lady—to condescend to flirt with an upstart like
that, a mere servant! Why, he’s no better than Isaac Barnes, or old
Whisker, or any of the rest of them, only he’s prig enough to oil his
hair, and wear a button-hole, in order to catch the eye of such silly
noodles as yourself.’
‘You’ve no right to speak to me in this way, George. You know
nothing at all about the matter.’
‘I know that I found Darley and you in the lane with your heads
very close together, and that directly he caught sight of me he made
off. That doesn’t look as if his intentions were honourable, does it?
Now, look you here, Rosa. Is he coming to the barn to-night?’
‘I believe so!’
‘And who asked him?’
‘I don’t know,’ she replied, evasively; ‘papa, perhaps—or very
likely Mr Darley thought he required no invitation to join a
ploughman’s dance and supper.’
‘Well, you’re not to dance with him if he does come.’
‘I don’t know what right you have to forbid it.’
‘None at all! but if you won’t give me the promise I shall go straight
to the governor, and let him know what I saw to-day. He’s seen
something of it himself, I can tell you, and he told me to put you on
your guard, so you can take your choice of having his anger or not.’
This statement was not altogether true, for if Farmer Murray had
heard anything of his daughter’s flirtation with the handsome
gamekeeper, it had been only what his sons had suggested to him,
and he did not believe their reports. But the boys, George and
Robert, now young men of three or four-and-twenty, had had more
than one consultation together on the subject, and quite made up
their minds that their sister must not be allowed to marry Frederick
Darley. For they were quite alive to the advantages that a good
connection for her might afford to themselves, and wanted to see her
raise the family instead of lowering it.
Rosa, however, believed her brother’s word. Dread of her father’s
anger actuated in a great measure this belief, and she began to fear
lest all communication between Darley and herself might be broken
off if she did not give the required promise. And the very existence of
the fear opened her eyes to the truth, that her lover was become a
necessary part of life’s enjoyment to her. So, like a true woman and
a hunted hare, she temporised and ‘doubled.’
‘Does papa really think I am too intimate with Mr Darley, George?’
she inquired, trembling.
‘Of course he does, like all the rest of us.’
‘But it’s a mistake. I don’t care a pin about him.’
‘Then it will be no privation for you to give up dancing with him to-
night.’
‘I never intended to dance with him.’
‘Honour bright, Rosa?’
‘Well, I can’t say more than I have. However, you will see. I shall
not dance with him. If he asks me, I shall say I am engaged to you.’
‘You can say what you like, so long as you snub the brute. I
wonder at his impudence coming up to our “Home” at all. But these
snobs are never wanting in “cheek.” However, if Bob and I don’t give
him a pretty broad hint to-night that his room is preferable to his
company, I’m a duffer! Are you going in, Rosa?’
For the young people had continued to walk towards their own
home, and had now arrived at the farm gates.
‘Yes. I’ve been in the saddle since ten o’clock this morning, and
have had enough of it.’
‘Let me take Polly round to the stables before the governor sees
the state you’ve brought her home in, then,’ said George, as his
sister dismounted and threw him the reins. He could be good-
natured enough when he had his own way, and he thought he had
got it now with Rosa. But she went up to her chamber bent but on
one idea—how best to let Mr Darley know of what had passed
between her brother and herself, that he might not be surprised at
the caution of her behaviour when they met in the big barn.

Meanwhile Lizzie Locke having left her basket of cockles at Mavis
Farm, had reached her cottage home. Her thoughts had been very
pleasant as she journeyed there and pondered on the coming
pleasure of the evening. It was not often the poor child took any part
in the few enjoyments to be met in Corston. People were apt to leave
her out of their invitations, thinking that as she was blind she could
not possibly derive any amusement from hearing, and she was of too
shrinking and modest a nature to obtrude herself where she was not
specially required. She had never been to one of the harvest-home
suppers given by Farmer Murray (in whose employ her cousin
Laurence worked), though she had heard much of their delights. But
now that Miss Rosa had particularly desired her to come, she
thought Larry would be pleased to take her. And she had a print
dress nice and clean for the occasion, and her aunt would plait her
hair neatly for her, and she should hear the sound of Larry’s voice as
he talked to his companions, and of his feet whilst he was dancing,
and, perhaps, after supper one of his famous old English songs—
songs which they had heard so seldom of late, and the music of
which her aunt and she had missed so much.
It was past twelve o’clock as she entered the cottage, but she was
so full of her grand news that she scarcely remembered that she
must have kept both her relations waiting for their dinner of bacon
and beans.
‘Why, Lizzie, my girl, where on earth have you been to?’ exclaimed
her aunt, Mrs Barnes, as she appeared on the threshold. Mrs
Barnes’ late husband had been brother to the very Isaac Barnes,
once poacher, now gamekeeper on Farmer Murray’s estate, and
there were scandal-mongers in Corston ill-natured enough to assert
that the taint was in the blood, and that young Laurence Barnes was
very much inclined to go the same way as his uncle had done before
him. But at present he was a helper in the stables of Mavis Farm.
‘I’ve been along the marshes,’ said Lizzie, ‘gathering cockles, and
they gave me sixpence for them up at the farm; and oh, aunt! I met
Miss Rosa on my way back, and she says Larry must take me up to
the big barn this evening to their harvest-home supper.’
Laurence Barnes was seated at his mother’s table already
occupied in the discussion of a huge lump of bread and bacon, but
as the name of his master’s daughter left Lizzie’s lips it would have
been very evident to any one on the look-out for it that he started
and seemed uneasy.
‘And what will you be doing at a dance and a supper, my poor
girl?’ said her aunt, but not unkindly. ‘Come, Lizzie, sit down and
take your dinner; that’s of much more account to you than a harvest
merry-making.’
‘Not till Larry has promised to take me up with him this evening,’
replied the girl gaily, and without the least fear of a rebuff. ‘You’ll do
it, Larry, won’t you? for Miss Rosa said they’d all be there, and if she
didn’t see me she’d send round to the cottage after me. She said,
“Tell Larry I insist upon it; she did, indeed!”’
‘Well, then, I’m not going up myself, and so you can’t go,’ he
answered roughly.
‘Not going yourself!’
The exclamation left the lips of both women at once. They could
not understand it, and it equally surprised them. Larry—the best
singer and dancer for twenty miles round, to refuse to go up to his
master’s harvest-home! Why, what would the supper and the dance
be without him? At least, so thought Mrs Barnes and Lizzie.
‘Aren’t you well, Larry?’ demanded the blind girl, timidly.
‘I’m well enough; but I don’t choose to go. I don’t care for such
rubbish. Let ’em bide! They’ll do well enough without us.’
Lizzie dropt into her seat in silence, and began in a mechanical
way to eat her dinner. She was terribly disappointed, but she did not
dream of disputing her cousin’s decision. He was master in that
house; and she would not have cared to go to the barn without Larry.
Half the pleasure would be gone with his absence. He did not seem
to see that.
‘Mother can take you up, Liz, if she has a mind to,’ he said,
presently.
‘I take her along of me!’ cried Mrs Barnes, ‘when I haven’t so much
as a clean kerchief to pin across my shoulders. You’re daft, Larry. I
haven’t been to such a thing as a dance since I laid your father in the
churchyard, and if our Liz can’t go without me she must stop at
home.’
‘I don’t want to go, indeed I don’t, not without Larry,’ replied the
blind girl, earnestly.
‘And what more did Miss Rosa say to you?’ demanded her aunt,
inquisitively.
‘We talked about the sands, aunt. She’d been galloping all over
them this morning, and I told her how dangerous they were beyond
Corston Point, and we was getting on so nice together, when some
one came and interrupted us.’
‘Some one! Who’s some one?’ said Laurence Barnes, quickly.
‘I can’t tell you; I never met him before.’
‘’Twas a man, then?’
‘Oh yes! ’twas a man—a gentleman! I knew that, because there
were no nails in his boots, and he didn’t give at the knees as he
walked.’
‘What more?’ demanded Larry, with lowered brows.
‘Miss Rosa knew him well, because they never named each other,
but only wished “good morning.” She said, “What are you doing
here?” and he said, “Looking after you.” He carried a rose in his
hand or his coat, I think, for I smelt it, and a cane, too, for it struck
the saddle flap.’
‘Well, that’s enough,’ interrupted Laurence, fiercely.
‘I thought you wanted to hear all about it, Larry?’
‘Is there any more to tell, then?’
‘Only that as they walked away together, Miss Rosa said she was
so glad he was coming up to the harvest-home to-night.’
‘So he’s a-going, the cur!’ muttered the young man between his
teeth. ‘I know him, with his cane, and his swagger, and his stinking
roses; and I’ll be even with him yet, or my name’s not Larry Barnes.’
It was evident that Mr Frederick Darley was no greater favourite in
the cottage than the farm.
‘Whoever are you talking of?’ said Larry’s mother. ‘Do you know
the gentleman Lizzie met with Miss Rosa?’
‘Gentleman! He’s no gentleman. He’s nothing but a common
gamekeeper, same as uncle. But don’t let us talk of him any more. It
takes the flavour of the bacon clean out of my mouth.’
The rest of the simple meal was performed in silence, and then
Mrs Barnes gathered up the crockery and carried it into an outer
room to wash.
Larry and Lizzie were left alone. The girl seemed to understand
that in some mysterious way she had offended her cousin, and
wished to restore peace between them, so she crept up to where he
was smoking his midday pipe on the old settle by the fire, and laid
her head gently against his knees. They had been brought up from
babes together, and were used to observe such innocent little
familiarities towards each other.
‘Never mind about the outing, Larry. I’m not a bit disappointed, and
I’m sorry I said anything about it.’
‘That’s not true, Liz. You are disappointed, and it’s my doing; but I
couldn’t help it. I didn’t feel somehow as if I had the heart to go. But
I’ve changed my mind since dinner, and we’ll go up to the harvest-
home together, my girl. Will that content you?’
‘Oh, Larry! you are good!’ she said, raising herself, her cheeks
crimsoned with renewed expectation; ‘but I’d rather stop at home a
thousand times over than you should put yourself out of the way for
me.’
A sudden thought seemed to strike the young man as he looked at
Lizzie’s fair, sightless face. He had lived with her so long, in a sisterly
way, that it had never struck him to regard her in any other light. But
something in the inflection of her voice as she addressed him, made
him wonder if he were capable of making her happier than she had
ever been yet. He cherished no other hopes capable of realisation.
What if he could make his own troubles lighter by lightening those of
poor Liz? Something of this sort, but in much rougher clothing,
passed through his half-tutored mind. As it grasped the idea he
turned hurriedly towards the girl kneeling at his knee.
‘Do you really care about me, lass?’ he said. ‘Do you care if I’m
vexed or not? Whether I come in or go out? If I like my dinner or I
don’t like it? Does all this nonsense worry you? Answer me, for I
want to know.’
‘Oh! Larry, what do you mean? Of course I care. I can’t do much
for you—more’s the pity—without my poor eyes, but I can think of
you and love you, Larry, and surely you know that I do both.’
‘But would you like to love me more, Liz?’
‘How could I love you more?’
‘Would you like to have the right to care for me—the right to creep
after me in your quiet way wherever I might happen to go—the right
to walk alongside of me, with your hand in mine, up to the harvesting
home to-night; eh, Liz?’
The girl half understood her cousin’s meaning, but she was too
modest not to fear she might be mistaken. Larry could never wish to
take her, blind and helpless, for his wife.
‘Larry, speak to me more plainly; I don’t catch your meaning quite.’
‘Will you marry me then, Liz, and live along of mother and me to
the end of your life?’
‘Marry you!—Be your wife!—Me! Oh, Larry, you can’t mean it!
never.’
‘I do mean it,’ replied her cousin with an oath; ‘and I’ll take you as
soon as ever you’ll take me if you will but say the word.’
‘But I am blind, Larry.’
‘Do you suppose I don’t know that? Perhaps I likes you blind best.’
‘But I am so useless. I get about so slowly. If anything was to
happen to aunt, how could I keep the house clean and cook the
dinners, Larry? You must think a bit more before you decide for
good.’
But the poor child’s face was burning with excitement the while,
and her sightless eyes were thrown upwards to her cousin’s face as
though she would strain through the darkness to see it.
‘If anything happened to mother, do you suppose I’d turn you out
of doors, Liz? And in any case, then, I must have a wife or a servant
to do the work—it will make no difference that way. The only
question is, do you want me for a husband?’
‘Oh! I have loved you ever so!’ replied the girl, throwing herself into
his arms. ‘I couldn’t love another man, Larry. I know your face as well
as if I had seen it, and your step, and your voice. I can tell them long
before another body knows there’s sound a-coming.’
‘Then you’ll have me?’
‘If you’ll have me,’ she murmured in a tone of delight as she
nestled against his rough clothes.
‘That’s settled, then, and the sooner the banns are up the better!
Here, mother! Come along and hear the news. Lizzie has promised
to marry me, and I shall take her to church as soon as we’ve been
cried.’
‘Well! I am pleased,’ said Mrs Barnes. ‘You couldn’t have got a
neater wife, Larry, though her eyesight’s terribly against her, poor
thing! But I’m sure of one thing, Liz, if you can’t do all for him that
another woman might, you’ll love my lad with the best among them,
and that thought will make me lie quiet in my grave.’
The poor cannot afford the time to be as sentimental over such
things as the rich. Larry kissed his cousin two or three times on the
forehead in signification of the compact they had just entered into,
and then he got up and shook himself, and prepared to go back to
his afternoon work.
‘That’s a good job settled,’ he thought as he did so; ‘it will make
Lizzie happy, and drive a deal of nonsense may be out of my head.
But if ever I can pay out that scoundrel Darley I’ll do it, if it costs me
the last drop of my blood.’
The blind girl regarded what had passed between her cousin and
herself with very different feelings. Condemned, by reason of her
infirmity, to pass much of her young life in solitude, the privation had
repaid itself by giving her the time and opportunity for an amount of
self-culture which, if subjected to the rough toil and rougher
pleasures of her class, she never could have attained. Her ideas
regarding the sanctity of love and marriage were very different from
those of other Corston girls. She could never have ‘kept company,’
as they termed it, with one man this month and another the next. Her
pure mind, which dwelt so much within itself, shrank from the levity
and coarseness with which she had heard such subjects treated,
and believing, as she had done, that she should never be married,
she had pleased herself by building up an ideal of what a husband
should be, and how his wife would love and reverence him. And this
ideal had always had for its framework a fancied portrait of her
cousin Laurence. In reality, this young fellow was an average
specimen of a fresh-faced country youth, with plenty of colour and
flesh and muscle. But to the blind girl’s fancy he was perfection. Her
little hands from babyhood had traced each feature of his face until
she knew every line by heart, and though she had never
acknowledged it even to herself, she had been in love with him ever
since she was capable of understanding the meaning of the term. So
that although his proposal to marry her had come as a great
surprise, it had also come as a great glory, and set her heart
throbbing with the pleasant consciousness of returned affection.
She was in a flutter of triumph and delight all the afternoon, whilst
Larry was attending to his horses, and hardly knew how to believe in
her own happiness. Her aunt brushed and plaited her long hair for
her till it was as glossy and neat as possible, and tied her new
cherry-coloured ribbon round the girl’s throat that she might not
disgrace her son’s choice at the merry-making. And then Lizzie sat
down to wait for her affianced lover’s return, the proudest maid in
Corston. Larry came in punctually for his tea, and the first thing he
did was to notice the improvement in his little cousin’s appearance;
and indeed joy had so beautified her countenance that she was a
different creature from what she had been on the sands that
morning. The apathy and indifference to life had disappeared, and a
bright colour bloomed in her soft cheeks. As she tucked her hand
through her cousin’s arm, and they set off to walk together to Farmer
Murray’s harvest-home, Mrs Barnes looked after them with pride,
and declared that if poor Liz had only got her sight they would have
made the handsomest couple in the parish.
Larry was rather silent as they went up to the barn together, but
Liz was not exigeante, and trotted by his side with an air of perfect
content. When they arrived they found the place already full, but the
‘quality’ had not yet arrived, and until they did so, no one ventured to
do more than converse quietly with his neighbour, although the
fiddlers from Wells were all ready and only waiting a signal to strike
up. But in those days the working men did not consider their festival
complete without the presence of the master, and it would have been
a sore affront if the members and guests of the household had not