
Python for Bioinformatics, 2nd Edition, by Sebastian Bassi

The document provides information about the book 'Python for Bioinformatics, 2nd Edition' by Sebastian Bassi, including details about its content and structure. The book is part of the Chapman & Hall/CRC Mathematical and Computational Biology series, which aims to integrate mathematical, statistical, and computational methods into biology. It covers programming concepts, Python installation, and various data types, making it suitable for students and professionals in the field.

PYTHON FOR
BIOINFORMATICS
SECOND EDITION
CHAPMAN & HALL/CRC
Mathematical and Computational Biology Series

Aims and scope:


This series aims to capture new developments and summarize what is known
over the entire spectrum of mathematical and computational biology and
medicine. It seeks to encourage the integration of mathematical, statistical,
and computational methods into biology by publishing a broad range of
textbooks, reference works, and handbooks. The titles included in the
series are meant to appeal to students, researchers, and professionals in the
mathematical, statistical and computational sciences, fundamental biology
and bioengineering, as well as interdisciplinary researchers involved in the
field. The inclusion of concrete examples and applications, and programming
techniques and examples, is highly encouraged.

Series Editors

N. F. Britton
Department of Mathematical Sciences
University of Bath

Xihong Lin
Department of Biostatistics
Harvard University

Nicola Mulder
University of Cape Town
South Africa

Maria Victoria Schneider
European Bioinformatics Institute

Mona Singh
Department of Computer Science
Princeton University

Anna Tramontano
Department of Physics
University of Rome La Sapienza

Proposals for the series should be submitted to one of the series editors above or directly to:
CRC Press, Taylor & Francis Group
3 Park Square, Milton Park
Abingdon, Oxfordshire OX14 4RN
UK
Published Titles

An Introduction to Systems Biology: Design Principles of Biological Circuits
    Uri Alon
Glycome Informatics: Methods and Applications
    Kiyoko F. Aoki-Kinoshita
Computational Systems Biology of Cancer
    Emmanuel Barillot, Laurence Calzone, Philippe Hupé, Jean-Philippe Vert, and Andrei Zinovyev
Python for Bioinformatics, Second Edition
    Sebastian Bassi
Quantitative Biology: From Molecular to Cellular Systems
    Sebastian Bassi
Methods in Medical Informatics: Fundamentals of Healthcare Programming in Perl, Python, and Ruby
    Jules J. Berman
Chromatin: Structure, Dynamics, Regulation
    Ralf Blossey
Computational Biology: A Statistical Mechanics Perspective
    Ralf Blossey
Game-Theoretical Models in Biology
    Mark Broom and Jan Rychtář
Computational and Visualization Techniques for Structural Bioinformatics Using Chimera
    Forbes J. Burkowski
Structural Bioinformatics: An Algorithmic Approach
    Forbes J. Burkowski
Spatial Ecology
    Stephen Cantrell, Chris Cosner, and Shigui Ruan
Cell Mechanics: From Single Scale-Based Models to Multiscale Modeling
    Arnaud Chauvière, Luigi Preziosi, and Claude Verdier
Bayesian Phylogenetics: Methods, Algorithms, and Applications
    Ming-Hui Chen, Lynn Kuo, and Paul O. Lewis
Statistical Methods for QTL Mapping
    Zehua Chen
An Introduction to Physical Oncology: How Mechanistic Mathematical Modeling Can Improve Cancer Therapy Outcomes
    Vittorio Cristini, Eugene J. Koay, and Zhihui Wang
Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems
    Qiang Cui and Ivet Bahar
Kinetic Modelling in Systems Biology
    Oleg Demin and Igor Goryanin
Data Analysis Tools for DNA Microarrays
    Sorin Draghici
Statistics and Data Analysis for Microarrays Using R and Bioconductor, Second Edition
    Sorin Drăghici
Computational Neuroscience: A Comprehensive Approach
    Jianfeng Feng
Biological Sequence Analysis Using the SeqAn C++ Library
    Andreas Gogol-Döring and Knut Reinert
Gene Expression Studies Using Affymetrix Microarrays
    Hinrich Göhlmann and Willem Talloen
Handbook of Hidden Markov Models in Bioinformatics
    Martin Gollery
Meta-analysis and Combining Information in Genetics and Genomics
    Rudy Guerra and Darlene R. Goldstein
Differential Equations and Mathematical Biology, Second Edition
    D.S. Jones, M.J. Plank, and B.D. Sleeman
Knowledge Discovery in Proteomics
    Igor Jurisica and Dennis Wigle
Introduction to Proteins: Structure, Function, and Motion
    Amit Kessel and Nir Ben-Tal
RNA-seq Data Analysis: A Practical Approach
    Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss, and Garry Wong
Introduction to Mathematical Oncology
    Yang Kuang, John D. Nagy, and Steffen E. Eikenberry
Biological Computation
    Ehud Lamm and Ron Unger
Optimal Control Applied to Biological Models
    Suzanne Lenhart and John T. Workman
Clustering in Bioinformatics and Drug Discovery
    John D. MacCuish and Norah E. MacCuish
Spatiotemporal Patterns in Ecology and Epidemiology: Theory, Models, and Simulation
    Horst Malchow, Sergei V. Petrovskii, and Ezio Venturino
Stochastic Dynamics for Systems Biology
    Christian Mazza and Michel Benaïm
Statistical Modeling and Machine Learning for Molecular Biology
    Alan M. Moses
Engineering Genetic Circuits
    Chris J. Myers
Pattern Discovery in Bioinformatics: Theory & Algorithms
    Laxmi Parida
Exactly Solvable Models of Biological Invasion
    Sergei V. Petrovskii and Bai-Lian Li
Computational Hydrodynamics of Capsules and Biological Cells
    C. Pozrikidis
Modeling and Simulation of Capsules and Biological Cells
    C. Pozrikidis
Cancer Modelling and Simulation
    Luigi Preziosi
Introduction to Bio-Ontologies
    Peter N. Robinson and Sebastian Bauer
Dynamics of Biological Systems
    Michael Small
Genome Annotation
    Jung Soh, Paul M.K. Gordon, and Christoph W. Sensen
Niche Modeling: Predictions from Statistical Distributions
    David Stockwell
Algorithms for Next-Generation Sequencing
    Wing-Kin Sung
Algorithms in Bioinformatics: A Practical Introduction
    Wing-Kin Sung
Introduction to Bioinformatics
    Anna Tramontano
The Ten Most Wanted Solutions in Protein Bioinformatics
    Anna Tramontano
Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R
    Gabriel Valiente
Managing Your Biological Data with Python
    Allegra Via, Kristian Rother, and Anna Tramontano
Cancer Systems Biology
    Edwin Wang
Stochastic Modelling for Systems Biology, Second Edition
    Darren J. Wilkinson
Big Data Analysis for Bioinformatics and Biomedical Discoveries
    Shui Qing Ye
Bioinformatics: A Practical Approach
    Shui Qing Ye
Introduction to Computational Proteomics
    Golan Yona

PYTHON FOR
BIOINFORMATICS
SECOND EDITION

SEBASTIAN BASSI
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the
accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products
does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular
use of the MATLAB® software.

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2018 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20170626

International Standard Book Number-13: 978-1-1380-3526-3 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity
of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized
in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Names: Bassi, Sebastian, author.


Title: Python for bioinformatics / Sebastian Bassi.
Description: Second edition. | Boca Raton : CRC Press, 2017. | Series:
Chapman & Hall/CRC mathematical and computational biology | Includes
bibliographical references and index.
Identifiers: LCCN 2017014460| ISBN 9781138035263 (pbk. : alk. paper) |
ISBN 9781138094376 (hardback : alk. paper) | ISBN 9781315268743 (ebook) |
ISBN 9781351976961 (ebook) | ISBN 9781351976954 (ebook) |
ISBN 9781351976947 (ebook)
Subjects: LCSH: Bioinformatics. | Python (Computer program language)
Classification: LCC QH324.2 .B387 2017 | DDC 570.285--dc23
LC record available at https://lccn.loc.gov/2017014460

Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents

List of Figures xvii

List of Tables xxi

Preface to the First Edition xxiii

Preface to the Second Edition xxv

Acknowledgments xxix

Section I Programming

Chapter 1  Introduction 3
1.1 WHO SHOULD READ THIS BOOK 3
1.1.1 What the Reader Should Already Know 4
1.2 USING THIS BOOK 4
1.2.1 Typographical Conventions 4
1.2.2 Python Versions 5
1.2.3 Code Style 5
1.2.4 Get the Most from This Book without Reading It All 6
1.2.5 Online Resources Related to This Book 7
1.3 WHY LEARN TO PROGRAM? 7
1.4 BASIC PROGRAMMING CONCEPTS 8
1.4.1 What Is a Program? 8
1.5 WHY PYTHON? 10
1.5.1 Main Features of Python 10
1.5.2 Comparing Python with Other Languages 11
1.5.3 How Is It Used? 14
1.5.4 Who Uses Python? 15
1.5.5 Flavors of Python 15
1.5.6 Special Python Distributions 16
1.6 ADDITIONAL RESOURCES 17


Chapter 2  First Steps with Python 19


2.1 INSTALLING PYTHON 20
2.1.1 Learn Python by Using It 20
2.1.2 Install Python Locally 20
2.1.3 Using Python Online 21
2.1.4 Testing Python 22
2.1.5 First Use 22
2.2 INTERACTIVE MODE 23
2.2.1 Baby Steps 23
2.2.2 Basic Input and Output 23
2.2.3 More on the Interactive Mode 24
2.2.4 Mathematical Operations 26
2.2.5 Exit from the Python Shell 27
2.3 BATCH MODE 27
2.3.1 Comments 29
2.3.2 Indentation 30
2.4 CHOOSING AN EDITOR 32
2.4.1 Sublime Text 32
2.4.2 Atom 33
2.4.3 PyCharm 34
2.4.4 Spyder IDE 35
2.4.5 Final Words about Editors 36
2.5 OTHER TOOLS 36
2.6 ADDITIONAL RESOURCES 37
2.7 SELF-EVALUATION 37

Chapter 3  Basic Programming: Data Types 39


3.1 STRINGS 40
3.1.1 Strings Are Sequences of Unicode Characters 41
3.1.2 String Manipulation 42
3.1.3 Methods Associated with Strings 42
3.2 LISTS 44
3.2.1 Accessing List Elements 45
3.2.2 List with Multiple Repeated Items 45
3.2.3 List Comprehension 46
3.2.4 Modifying Lists 47
3.2.5 Copying a List 49


3.3 TUPLES 49
3.3.1 Tuples Are Immutable Lists 49
3.4 COMMON PROPERTIES OF THE SEQUENCES 51
3.5 DICTIONARIES 54
3.5.1 Mapping: Calling Each Value by a Name 54
3.5.2 Operating with Dictionaries 56
3.6 SETS 59
3.6.1 Unordered Collection of Objects 59
3.6.2 Set Operations 60
3.6.3 Shared Operations with Other Data Types 62
3.6.4 Immutable Set: Frozenset 63
3.7 NAMING OBJECTS 63
3.8 ASSIGNING A VALUE TO A VARIABLE VERSUS BINDING A NAME TO AN OBJECT 64
3.9 ADDITIONAL RESOURCES 67
3.10 SELF-EVALUATION 68

Chapter 4  Programming: Flow Control 69


4.1 IF-ELSE 69
4.1.1 Pass Statement 74
4.2 FOR LOOP 75
4.3 WHILE LOOP 77
4.4 BREAK: BREAKING THE LOOP 78
4.5 WRAPPING IT UP 80
4.5.1 Estimate the Net Charge of a Protein 80
4.5.2 Search for a Low-Degeneration Zone 81
4.6 ADDITIONAL RESOURCES 83
4.7 SELF-EVALUATION 83

Chapter 5  Handling Files 85


5.1 READING FILES 86
5.1.1 Example of File Handling 87
5.2 WRITING FILES 89
5.2.1 File Reading and Writing Examples 90
5.3 CSV FILES 90
5.4 PICKLE: STORING AND RETRIEVING THE CONTENTS OF VARIABLES 94
5.5 JSON FILES 96
5.6 FILE HANDLING: OS, OS.PATH, SHUTIL, AND PATH.PY MODULE 98
5.6.1 path.py Module 100
5.6.2 Consolidate Multiple DNA Sequences into One FASTA File 102
5.7 ADDITIONAL RESOURCES 102
5.8 SELF-EVALUATION 103

Chapter 6  Code Modularizing 105


6.1 INTRODUCTION TO CODE MODULARIZING 105
6.2 FUNCTIONS 106
6.2.1 Standard Way to Make Python Code Modular 106
6.2.2 Function Parameter Options 110
6.2.3 Generators 113
6.3 MODULES AND PACKAGES 114
6.3.1 Using Modules 115
6.3.2 Packages 116
6.3.3 Installing Third-Party Modules 117
6.3.4 Virtualenv: Isolated Python Environments 119
6.3.5 Conda: Anaconda Virtual Environment 121
6.3.6 Creating Modules 124
6.3.7 Testing Modules 125
6.4 ADDITIONAL RESOURCES 127
6.5 SELF-EVALUATION 128

Chapter 7  Error Handling 129


7.1 INTRODUCTION TO ERROR HANDLING 129
7.1.1 Try and Except 131
7.1.2 Exception Types 134
7.1.3 Triggering Exceptions 135
7.2 CREATING CUSTOMIZED EXCEPTIONS 136
7.3 ADDITIONAL RESOURCES 137
7.4 SELF-EVALUATION 138

Chapter 8  Introduction to Object Orienting Programming (OOP) 139


8.1 OBJECT PARADIGM AND PYTHON 139
8.2 EXPLORING THE JARGON 140


8.3 CREATING CLASSES 142
8.4 INHERITANCE 145
8.5 SPECIAL METHODS 149
8.5.1 Create a New Data Type Using a Built-in Data Type 154
8.6 MAKING OUR CODE PRIVATE 154
8.7 ADDITIONAL RESOURCES 155
8.8 SELF-EVALUATION 156

Chapter 9  Introduction to Biopython 157


9.1 WHAT IS BIOPYTHON? 158
9.1.1 Project Organization 158
9.2 INSTALLING BIOPYTHON 159
9.3 BIOPYTHON COMPONENTS 162
9.3.1 Alphabet 162
9.3.2 Seq 163
9.3.3 MutableSeq 165
9.3.4 SeqRecord 166
9.3.5 Align 167
9.3.6 AlignIO 169
9.3.7 ClustalW 171
9.3.8 SeqIO 173
9.3.9 AlignIO 176
9.3.10 BLAST 177
9.3.11 Biological Related Data 187
9.3.12 Entrez 190
9.3.13 PDB 194
9.3.14 PROSITE 196
9.3.15 Restriction 197
9.3.16 SeqUtils 200
9.3.17 Sequencing 202
9.3.18 SwissProt 205
9.4 CONCLUSION 207
9.5 ADDITIONAL RESOURCES 207
9.6 SELF-EVALUATION 209

Section II Advanced Topics

Chapter 10  Web Applications 213


10.1 INTRODUCTION TO PYTHON ON THE WEB 213
10.2 CGI IN PYTHON 214
10.2.1 Configuring a Web Server for CGI 215
10.2.2 Testing the Server with Our Script 215
10.2.3 Web Program to Calculate the Net Charge of a Protein (CGI version) 219
10.3 WSGI 221
10.3.1 Bottle: A Python Web Framework for WSGI 222
10.3.2 Installing Bottle 223
10.3.3 Minimal Bottle Application 223
10.3.4 Bottle Components 224
10.3.5 Web Program to Calculate the Net Charge of a Protein (Bottle Version) 229
10.3.6 Installing a WSGI Program in Apache 232
10.4 ALTERNATIVE OPTIONS FOR MAKING PYTHON-BASED DYNAMIC WEB SITES 232
10.5 SOME WORDS ABOUT SCRIPT SECURITY 232
10.6 WHERE TO HOST PYTHON PROGRAMS 234
10.7 ADDITIONAL RESOURCES 235
10.8 SELF-EVALUATION 236

Chapter 11  XML 237


11.1 INTRODUCTION TO XML 237
11.2 STRUCTURE OF AN XML DOCUMENT 241
11.3 METHODS TO ACCESS DATA INSIDE AN XML DOCUMENT 246
11.3.1 SAX: cElementTree Iterparse 246
11.4 SUMMARY 251
11.5 ADDITIONAL RESOURCES 252
11.6 SELF-EVALUATION 252

Chapter 12  Python and Databases 255


12.1 INTRODUCTION TO DATABASES 256
12.1.1 Database Management: RDBMS 257
12.1.2 Components of a Relational Database 258
12.1.3 Database Data Types 260


12.2 CONNECTING TO A DATABASE 261
12.3 CREATING A MYSQL DATABASE 262
12.3.1 Creating Tables 263
12.3.2 Loading a Table 264
12.4 PLANNING AHEAD 266
12.4.1 PythonU: Sample Database 266
12.5 SELECT: QUERYING A DATABASE 269
12.5.1 Building a Query 271
12.5.2 Updating a Database 273
12.5.3 Deleting a Record from a Database 273
12.6 ACCESSING A DATABASE FROM PYTHON 274
12.6.1 PyMySQL Module 274
12.6.2 Establishing the Connection 274
12.6.3 Executing the Query from Python 275
12.7 SQLITE 276
12.8 NOSQL DATABASES: MONGODB 278
12.8.1 Using MongoDB with PyMongo 278
12.9 ADDITIONAL RESOURCES 282
12.10 SELF-EVALUATION 284

Chapter 13  Regular Expressions 285


13.1 INTRODUCTION TO REGULAR EXPRESSIONS (REGEX) 285
13.1.1 REGEX Syntax 286
13.2 THE RE MODULE 287
13.2.1 Compiling a Pattern 290
13.2.2 REGEX Examples 292
13.2.3 Pattern Replace 294
13.3 REGEX IN BIOINFORMATICS 294
13.3.1 Cleaning Up a Sequence 296
13.4 ADDITIONAL RESOURCES 297
13.5 SELF-EVALUATION 298

Chapter 14  Graphics in Python 299


14.1 INTRODUCTION TO BOKEH 299
14.2 INSTALLING BOKEH 299
14.3 USING BOKEH 301
14.3.1 A Simple X-Y Plot 303


14.3.2 Two Data Series Plot 304
14.3.3 A Scatter Plot 306
14.3.4 A Heatmap 308
14.3.5 A Chord Diagram 309

Section III Python Recipes with Commented Source Code

Chapter 15  Sequence Manipulation in Batch 315


15.1 PROBLEM DESCRIPTION 315
15.2 PROBLEM ONE: CREATE A FASTA FILE WITH RANDOM SEQUENCES 315
15.2.1 Commented Source Code 315
15.3 PROBLEM TWO: FILTER NOT EMPTY SEQUENCES FROM A FASTA FILE 316
15.3.1 Commented Source Code 317
15.4 PROBLEM THREE: MODIFY EVERY RECORD OF A FASTA FILE 319
15.4.1 Commented Source Code 320

Chapter 16  Web Application for Filtering Vector Contamination 321


16.1 PROBLEM DESCRIPTION 321
16.1.1 Commented Source Code 322
16.2 ADDITIONAL RESOURCES 326

Chapter 17  Searching for PCR Primers Using Primer3 329


17.1 PROBLEM DESCRIPTION 329
17.2 PRIMER DESIGN FLANKING A VARIABLE LENGTH REGION 330
17.2.1 Commented Source Code 331
17.3 PRIMER DESIGN FLANKING A VARIABLE LENGTH REGION, WITH BIOPYTHON 332
17.4 ADDITIONAL RESOURCES 333

Chapter 18  Calculating Melting Temperature from a Set of Primers 335


18.1 PROBLEM DESCRIPTION 335
18.1.1 Commented Source Code 336
18.2 ADDITIONAL RESOURCES 336

Chapter 19  Filtering Out Specific Fields from a GenBank File 339


19.1 EXTRACTING SELECTED PROTEIN SEQUENCES 339
19.1.1 Commented Source Code 339
19.2 EXTRACTING THE UPSTREAM REGION OF SELECTED PROTEINS 340
19.2.1 Commented Source Code 340
19.3 ADDITIONAL RESOURCES 341

Chapter 20  Inferring Splicing Sites 343


20.1 PROBLEM DESCRIPTION 343
20.1.1 Infer Splicing Sites with Commented Source Code 345
20.1.2 Sample Run of Estimate Intron Program 347

Chapter 21  Web Server for Multiple Alignment 349


21.1 PROBLEM DESCRIPTION 349
21.1.1 Web Interface: Front-End. HTML Code 349
21.1.2 Web Interface: Server-Side Script. Commented Source Code 351
21.2 ADDITIONAL RESOURCES 353

Chapter 22  Drawing Marker Positions Using Data Stored in a Database 355


22.1 PROBLEM DESCRIPTION 355
22.1.1 Preliminary Work on the Data 355
22.1.2 MongoDB Version with Commented Source Code 358

Section IV Appendices

Appendix A  Collaborative Development: Version Control with GitHub 365


A.1 INTRODUCTION TO VERSION CONTROL 366
A.2 VERSION YOUR CODE 367
A.3 SHARE YOUR CODE 375
A.4 CONTRIBUTE TO OTHER PROJECTS 381
A.5 CONCLUSION 382
A.6 METHODS 384
A.7 ADDITIONAL RESOURCES 384

Appendix B  Install a Bottle App in PythonAnywhere 385


B.1 PYTHONANYWHERE 385
B.1.1 What Is PythonAnywhere 385
B.1.2 Installing a Web App in PythonAnywhere 385

Appendix C  Scientific Python Cheat Sheet 393


C.1 PURE PYTHON 394
C.2 VIRTUALENV 400
C.3 CONDA 402
C.4 IPYTHON 403
C.5 NUMPY 405
C.6 MATPLOTLIB 410
C.7 SCIPY 412
C.8 PANDAS 413

Index 417
List of Figures

2.1 Anaconda install in macOS. 21


2.2 Anaconda Python interactive terminal. 23
2.3 PyCharm Edu welcome screen. 35

3.1 Intersection. 60
3.2 Union. 61
3.3 Difference. 61
3.4 Symmetric difference. 62
3.5 Case 1. 65
3.6 Case 2. 66

5.1 Excel formatted spreadsheet called sampledata.xlsx. 93

8.1 IUPAC nucleic acid notation table. 147

9.1 Anatomy of a BLAST result. 181

10.1 Our first CGI. 216


10.2 CGI accessed from local disk instead from a web server. 217
10.3 greeting.html: A very simple form. 217
10.4 Output of CGI program that processes greeting.html. 218
10.5 Form protcharge.html ready to be submitted. 220
10.6 Net charge CGI result. 222
10.7 Hello World program made in Bottle, as seen in a browser. 224
10.8 Form for the web app to calculate the net charge of a protein. 229

11.1 Screenshot of XML viewer. 244


11.2 Codebeautify, a web based XML viewer. 245

12.1 Screenshot of PhpMyAdmin. 258


12.2 Creating a new database using phpMyAdmin. 262
12.3 Creating a new table using phpMyAdmin. 264
12.4 View of the Student table. 266


12.5 An intentionally faulty “Grades” table. 267
12.6 A better “Grades” table. 267
12.7 Courses table: A lookup table. 268
12.8 Modified “Grades” table. 268
12.9 Screenshot of SQLite manager. 277
12.10 View from a MongoDB cloud provider. 281

14.1 A circle with Bokeh. 302


14.2 Four circles with Bokeh. 303
14.3 A simple plot with Bokeh. 305
14.4 A two data series plot with Bokeh. 306
14.5 Scatter plot graphics. 308
14.6 A heatmap out of a microarray experiment. 310
14.7 A chord diagram. 312

16.1 HTML form for sequence filtering. 327


16.2 HTML form for sequence filtering. 328

21.1 Muscle Web interface. 350

22.1 Product of Listing 22.2, using the demo dataset (NODBDEMO). 356

A.1 The git add/commit process. 369


A.2 Working with a local repository. 370
A.3 Working with both a local and remote repository as a single user. 379
A.4 Contributing to open source projects. 383

B.1 “Consoles” tab. 386


B.2 The “Web” tab. 386
B.3 Upgrading domain type option. 387
B.4 Select a web framework screen, select Bottle. 388
B.5 Select a Python and Bottle version. 389
B.6 Form to enter the path of the web app. 390
B.7 The sample web app is ready to use. 390
B.8 The “File” tab. 391
B.9 Form to create a new directory in PythonAnywhere. 391
B.10 View and upload files into your account. 391
B.11 Front-end of the program to calculate charge of a protein using Bottle and hosted in PythonAnywhere. 392
List of Tables

2.1 Arithmetic-Style Operators 26

3.1 Common List Operations 48


3.2 Methods Associated with Dictionaries 58

9.1 Sequence and Alignment Formats 175


9.2 Blast programs 178
9.3 eUtils 191

10.1 Frameworks for Web Development 233

12.1 Students in Python University 259


12.2 Table with primary key 260
12.3 MySQL Data Types 261

13.1 REGEX Special Sequences 287

A.1 Resources 367

Preface to the First Edition

This book is a result of the experience accumulated during several years of working
for an agricultural biotechnology company. As a genomic database curator, I gave
support to staff scientists with a broad range of bioinformatics needs. Some of them
just wanted to automate the same procedure they were already doing by hand, while
others would come to me with biological problems to ask if there were bioinformatics
solutions. Most cases had one thing in common: Programming knowledge was
necessary for finding a solution to the problem. The main purpose of this book is to
help those scientists who want to solve their biological problems by helping them
to understand the basics of programming. To this end, I have attempted to avoid
taking for granted any programming-related concepts. The chosen language for this
task is Python.
Python is an easy-to-learn computer language that is gaining traction among
scientists. This is likely because it is easy to use, yet powerful enough to accomplish
most programming goals. With Python the reader can start doing real programming
very quickly. Journals such as Computing in Science and Engineering, Briefings
in Bioinformatics, and PLOS Computational Biology have published introductory
articles about Python. Scientists are using Python for molecular visualization,
genomic annotation, data manipulation, and countless other applications.
In the particular case of the life sciences, the development of Python has been
very important; the best exponent is the Biopython package. For this reason, Section
II is devoted to Biopython. Anyhow, I don’t claim that Biopython is the solution to
every biology problem in the world. Sometimes a simple custom-made solution may
better fit the problem at hand. There are other packages like BioNEB and CoreBio
that the reader may want to try.
The book begins from the very basic, with Section I (“Programming”), teaching
the reader the principles of programming. From the very beginning, I place a special
emphasis on practice, since I believe that programming is something that is best
learned by doing. That is why there are code fragments spread over the book. The
reader is expected to experiment with them, and attempt to internalize them. There
are also some sparse comparisons with other languages; they are included only when
doing so illuminates the current topic. I believe that most language comparisons do
more harm than good when teaching a new language. They introduce information
that is incomprehensible and irrelevant for most readers.
In an attempt to keep the interest of the reader, most examples are somehow
related to biology. In spite of that, these examples can be followed even if the reader
doesn’t have any specific knowledge in that field.
To reinforce the practical nature of this book, and also to use as reference
material, Section IV is called “Python Recipes with Commented Source Code.”
These programs can be used as is, but are intended to be used as a basis for other
projects. Readers may find that some examples are very simple; they do their job
without too many bells and whistles. This is intentional. The main reason for this
is to illustrate a particular aspect of the application without distracting the reader
with unnecessary features, as well as to avoid discouraging the reader with complex
programs. There will always be time to add features and customizations once the
basics have been learned.
The title of Section III (“Advanced Topics”) may seem intimidating, but in
this case, advanced doesn’t necessarily mean difficult. Eventually, everyone will
use the chapters in this section [especially relational database management system
—RDBMS— and XML]. An important part of the bioinformatics work is building
and querying databases, which is why I consider knowing an RDBMS like MySQL
to be a relevant part of the bioinformatics skill set. Integrating data from different
sources is one of the tasks most frequently performed in bioinformatics. The tool of
choice for this task is XML. This standard is becoming a widely used platform for
data interchange between applications. Python has several XML parsers and we
explain most of them in this book.
Appendix B, “Selected Papers,” provides introductory level papers on Python.
Although there is some overlapping of subjects, this was done to show several points
of view of the same subject.
Researchers are not the only ones for whom this book will be beneficial. It has
also been structured to be used as a university textbook. Students can use it for
programming classes, especially in the new bioinformatics majors.
Preface to the Second Edition

The first edition of Python for Bioinformatics was written in 2008 and published
in 2009. Even after eight years, the lessons in this book are still valuable. This is
quite an accomplishment in a field that evolves at such a fast pace. In spite of its
usefulness, the book is showing its age and would greatly benefit from a second
edition.
The predominant Python version is 3.6, although Python 2.7 is still in use in
production systems. Since there are incompatibilities between these versions, a lot of
effort was made to make all code in the book Python 3 compatible.
Not only has the software changed in these past eight years, but enterprise attitude
and support toward Open Source Software in general and Python in particular
have changed dramatically. There are also new computing paradigms that can’t be
ignored, such as collaborative development and cloud computing.
In the original book, Chapter 14 was called “Collaborative Development: Version
Control” and was based on Bazaar, software that follows the currently used
distributed development workflow but is not what most developers use
today. By far most software development is done with Git at GitHub. This
chapter was rewritten to focus on current practices.
Web development is another area that changed significantly. Although this is
not a book about web development, the chapter “Web Applications” now reflects
current usage of long-running processes and frameworks instead of CGI/WSGI and
middleware-based applications. Frameworks were discussed as a side note in this
chapter, but now the chapter is built around a framework (Bottle) and leaves the
old method as a historical footnote.
In databases, NoSQL gained a lot of traction. From being a bullet point in
the first edition, it now has its own section using MongoDB, and a Python recipe
was changed to use this NoSQL database.
Graphical libraries have improved since 2009, and there are high-quality competing
graphics libraries available for Python. There is a whole chapter devoted to
Bokeh, a free interactive visualization library.
Another change that is reflected in this book is the usage of Anaconda and
Jupyter Notebooks (with all code in a cloud notebook provided by Microsoft
Azure1).
1 See https://notebooks.azure.com/py4bio/libraries/py3.us


Regarding source code, there is a GitHub repository at
https://github.com/Serulab/Py4Bio where you can download all the code and
sample files used in this book.
There are corrections in every chapter. Sometimes there were actual mistakes,
but most of the corrections were related to the Python 3 upgrade and to keeping up
with current good practices. I expect that this edition may also
need corrections, so I made a web page where readers can get updates. Please
take a look at http://py3.us and subscribe to the low-volume mailing list while
you are at it.
Apart from software evolution and paradigm shifts, I also gained development
experience and changed my views on pedagogical matters. During these years I
worked on a genome sequencing project at an international consortium and as a
senior software developer in an NYSE-listed company (Globant). In the last 5 years
I worked for several well-known clients such as Salesforce and National Geographic.
I am currently working at PLOS (Public Library of Science).
By request of MATLAB, I include their contact information:
MATLAB ® is a registered trademark of The MathWorks, Inc. For product
information please contact: The MathWorks, Inc. 3 Apple Hill Drive Natick, MA,
01760-2098 USA Tel: 508-647-7000 Fax: 508-647-7001 E-mail: [email protected]
Web: www.mathworks.com
Regarding the Biopython logo, which is used on the cover, here is its usage
license (this covers all Biopython files, including its logo):
Biopython is currently released under the "Biopython License Agreement"
(given in full below). Unless stated otherwise in individual file headers, all Biopy-
thon’s files are under the "Biopython License Agreement".
Some files are explicitly dual licensed under your choice of the "Biopython Li-
cense Agreement" or the "BSD 3-Clause License" (both given in full below). This
is with the intention of later offering all of Biopython under this dual licensing
approach.

Biopython License Agreement


Permission to use, copy, modify, and distribute this software and its documenta-
tion with or without modifications and for any purpose and without fee is hereby
granted, provided that any copyright notices appear in all copies and that both
those copyright notices and this permission notice appear in supporting documen-
tation, and that the names of the contributors or copyright holders not be used in
advertising or publicity pertaining to distribution of the software without specific
prior permission.
THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFT-
WARE DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFT-
WARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS, IN NO EVENT SHALL THE CONTRIBUTORS OR COPY-
RIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CON-
SEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING
FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.

BSD 3-Clause License


Copyright (c) 1999-2017, The Biopython Contributors All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer. Redistributions in binary form must repro-
duce the above copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the distribution. Nei-
ther the name of the copyright holder nor the names of its contributors may be
used to endorse or promote products derived from this software without specific
prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPY-
RIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IM-
PLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PAR-
TICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPY-
RIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDI-
RECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAM-
AGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTI-
TUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSI-
NESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (IN-
CLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
Acknowledgments

A project such as this book couldn’t be done by just one person. For this reason,
there is a long list of people who deserve my thanks. In spite of the fact that the
average reader doesn’t care about the names, and at the risk of leaving someone out,
I would like to acknowledge the following people: my wife Virginia Gonzalez (Vicky)
and my son Maximo Bassi, who had to contend with my virtual absence for
more than a year. Vicky also assisted me in countless ways during manuscript
preparation. My parents and professors taught me important lessons. My family
(Oscar, Graciela, and Ramiro) helped me with the English copyediting, along with
Hugo and Lucas Bejar. Vicky, Griselda, and Eugenio also helped by providing a
development abstraction layer, which is needed for writers and developers.
I would like to thank the people in the local Python community
(http://www.python.org.ar): Facundo Batista, Lucio Torre, Gabriel Genellina, John Lenton,
Alejandro J. Cura, Manuel Kaufmann, Gabriel Patiño, Alejandro Weil, Marcelo
Fernandez, Ariel Rossanigo, Mariano Draghi, and Buanzo. I would choose Python
again just for this great community. I also thank the people at Biopython: Jeffrey
Chang, Brad Chapman, Peter Cock, Michiel de Hoon, and Iddo Friedberg. Peter
Cock is especially thanked for his comments on the Biopython chapter. I also thank
Shashi Kumar and Pablo Di Napoli, who helped me with the LaTeX2ε issues, and
Sunil Nair, who believed in me from the first moment. I also thank the people at Globant
who trusted in me, like Guido Barosio, Josefina Chausovsky, Lucas Campos, Pablo
Brenner and Guibert Englebienne; Globant co-workers such as Pedro Mourelle,
Chris DeBlois, Rodrigo Obi-Wan Iloro, Carlos Del Rio and Alejandro Valle; and the people
at PLOS, Jeffrey Gray and Nick Peterson.

I
Programming

CHAPTER 1

Introduction
CONTENTS
1.1 Who Should Read This Book
    1.1.1 What the Reader Should Already Know
1.2 Using this Book
    1.2.1 Typographical Conventions
    1.2.2 Python Versions
    1.2.3 Code Style
    1.2.4 Get the Most from This Book without Reading It All
    1.2.5 Online Resources Related to This Book
1.3 Why Learn to Program?
1.4 Basic Programming Concepts
    1.4.1 What Is a Program?
1.5 Why Python?
    1.5.1 Main Features of Python
    1.5.2 Comparing Python with Other Languages
        Readability
        Speed
    1.5.3 How Is It Used?
    1.5.4 Who Uses Python?
    1.5.5 Flavors of Python
    1.5.6 Special Python Distributions
1.6 Additional Resources

The most effective way to do it, is to do it.

Amelia Earhart

1.1 WHO SHOULD READ THIS BOOK


This book is for the life science researcher who wants to learn how to program.
He/she may have previous exposure to computer programming, but this is not
necessary to understand this book (although it surely helps).
This book is designed to be useful to several separate but related audiences:
students, graduates, postdocs, and staff scientists, since all of them can benefit
from knowing how to program.

Exposing students to programming at early stages in their career helps to boost
their creativity and logical thinking, and both skills can be applied in research. In
order to ease the learning process for students, all subjects are introduced with the
minimal prerequisites. There are also questions at the end of each chapter. They
can be used for self-assessing how much you’ve learned. The answers are available
to teachers in a separate guide.
Graduates and staff scientists having actual programming needs should find its
several real-world examples and abundant reference material extremely valuable.

1.1.1 What the Reader Should Already Know


Since this book is called Python for Bioinformatics, it has been written with the
following assumptions in mind:

• No programming knowledge is assumed, but the reader is required to have
minimum computer proficiency to be able to use a text editor and handle basic
tasks in his or her operating system (OS). Since Python is multi-platform, most
instructions from this book will apply to the most common operating systems
(Windows, macOS and Linux); when there is a command or a procedure that
applies only to a specific OS, it will be clearly noted.

• The reader should be working (or at least planning to work) with bioinformatics
tools. Even small-scale manual jobs, such as using NCBI BLAST
to identify a sequence, aligning proteins, searching for primers, or estimating a
phylogenetic tree, are useful background for following the examples. The more familiar the
reader is with bioinformatics, the better he or she will be able to apply the concepts
learned in this book.

1.2 USING THIS BOOK


1.2.1 Typographical Conventions
There are some typographical conventions I have tried to use in a uniform way
throughout the book. They should aid readability and were chosen to tell apart
user-made names (or variables) from language keywords. This comes in handy when
learning a new computer language.
Bold: Objects provided by Python and by third-party modules. With this no-
tation it should be clear that round is part of the language and not a user-defined
name. Bold is also used to highlight parts of the text. There is no way to confuse
one bold usage with the other.
Mono-spaced font: User declared variables, code, and filenames. For example:
sequence = 'MRVLLVALALLALAASATS'.
Italics: In commands, it is used to denote a variable that can take different
values. For example, in len(iterable), “iterable” can take different values. Used in
text, it marks a new word or concept. For example “One such fundamental data
structure is a dictionary.”
The content of lines starting with $ (dollar sign) is meant to be typed in your
operating system console (also called command prompt in Windows or terminal
in macOS).
↩: Line break. Some lines are longer than the available space in a printed
page, so this symbol is inserted to mean that what is on the next line in the page
represents the same line on the computer screen. Inside code, the symbol used is
<=.
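
To make these conventions concrete, here is a minimal sketch. The names sequence, codons, sequence_length, codon_count and rounded are illustrative user-defined names chosen for this example; len and round are provided by Python. The argument passed to len is the part that would be printed in italics, because it can take different values.

```python
# Names provided by Python: len, round.
# User-defined names (illustrative only): sequence, codons, and the results.
sequence = 'MRVLLVALALLALAASATS'
codons = ['ATG', 'CGT', 'TAA']

# len(iterable): the same built-in works on different kinds of iterables.
sequence_length = len(sequence)   # number of characters in the string
codon_count = len(codons)         # number of items in the list

# round is also part of the language, not a user-defined name.
rounded = round(2.9)

print(sequence_length, codon_count, rounded)
```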

1.2.2 Python Versions


The current version of Python at this moment is 3.6.1. There is a 2.7.12 version that
is maintained1 because there are still a sizable number of applications in production
using the 2.7 branch. Versions 3.x and 2.x differ enough to be
incompatible. Python 3 is more efficient than Python 2 in many aspects.
Large websites such as Instagram migrated from Python 2.7 to Python 3.6 to save
up to 30% in CPU and memory consumption. This book uses Python 3.6.
The only scenario where you may need to use Python 2.7, apart from maintenance
of old code, is when a specific library is not available for Python
3. In this case, before starting a project in Python 2.7, try to search for a replacement
library. For example, suppose you want to connect to a MySQL database and you
are told to use MySQLdb. Since this package is not Python 3 compatible, instead
of using Python 2.7, use mysqlclient or mysql-connector-python; both work
with Python 3.
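
As a rough illustration of working with versions, the sketch below checks whether the running interpreter is at least the version this book targets and shows one well-known difference between the two branches. The helper is_supported and the constant MINIMUM_VERSION are names invented for this example, not part of Python or of the book's code.

```python
import sys

MINIMUM_VERSION = (3, 6)  # the version this book targets


def is_supported(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# sys.version_info describes the interpreter running this script.
print(sys.version.split()[0], is_supported(sys.version_info))

# One behavior that differs between the branches: in Python 3 the / operator
# always performs true division, while Python 2 truncates for two integers.
half = 1 / 2         # 0.5 in Python 3
floor_half = 1 // 2  # floor division gives 0 in both branches
print(half, floor_half)
```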

1.2.3 Code Style


Python source code that appears in this book is presented as listings. Each line of
these listings is numbered. These numbers are not intended to be typed; they are
used to reference each line in the text. You don’t need to copy the code from the
book, since it can be downloaded from the GitHub repository at
https://github.com/Serulab/Py4Bio.
Code can be formatted in several ways and still be valid to the Python interpreter.
The following code is syntactically correct:

def GetAverage(X):
    avG=sum(X)/len(X)
    " Calculate the average "
    return avG

Also this one:

def get_average(items):
    """ Calculate the average
    """
    average = sum(items) / len(items)
    return average

1 Python 2.7.x has an end-of-life date in 2020. There will be no Python 2.8. For more information
see https://www.python.org/dev/peps/pep-0373/.

The latter code sample follows most accepted coding styles for Python.2
Throughout the book you will find mostly code formatted as the second sample.
Some code in the book will not follow accepted coding styles for the following
reasons:

• There are some instances where the most didactic way to show a particular
piece of code conflicts with the style guide. On those few occasions, I choose
to deviate from the style guide in favor of clarity.

• Due to size limitation in a printed book, some names were shortened and
other minor drifts from the coding styles have been introduced.

• To show that there is more than one way to write the same code. Coding
style is a guideline, and enforcement is not made at a language level, so some
programmers don’t follow it thoroughly. You should be able to read “bad”
code, since sooner or later you will have to read other people’s code.
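
To check that the two average functions shown earlier in this section behave the same, they can be run side by side. This is a small sketch; the list lengths holds made-up values used only for illustration. It also shows one practical consequence of style: a string is treated as a docstring only when it is the first statement in the function body, so only the second version exposes one.

```python
def GetAverage(X):
    avG=sum(X)/len(X)
    " Calculate the average "
    return avG


def get_average(items):
    """ Calculate the average
    """
    average = sum(items) / len(items)
    return average


lengths = [120, 98, 143, 101]  # made-up sequence lengths

# Both functions compute the same value despite the different formatting.
assert GetAverage(lengths) == get_average(lengths)
print(get_average(lengths))

# The misplaced string in GetAverage is not a docstring.
print(GetAverage.__doc__)
print(get_average.__doc__)
```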

1.2.4 Get the Most from This Book without Reading It All
• If you want to learn how to program, read the first section, from Chapter
1 to Chapter 8. The Regular Expressions (REGEX) chapter (Chapter 13) can
be skipped if you don’t need to deal with REGEX.

• If you know Python and just want to know about Biopython, first read
Chapter 9 (from page 158 to page 209). It is about Biopython modules and
functions. Then try to follow the programs found in Section III (from page 315
to page 363).

• There are three appendixes that can be read independently. Appendix
A (Collaborative Development: Version Control with GitHub) reproduces a
paper called “A Quick Introduction to Version Control with Git and GitHub.”
Appendix B shows how to install a web application using Python Anywhere.
Appendix C is reference material that can be used as a cheat sheet when
you need a quick answer without having to read a chapter.

2 The official Python style guide is located at https://www.python.org/dev/peps/pep-0008,
and an easier-to-read style guide is located at
http://docs.python-guide.org/en/latest/writing/style.

1.2.5 Online Resources Related to This Book


The book website is at http://py3.us. On this site you will find errata, a mailing
list to keep updated about Python, and links to source code repositories. The
official source code repository of this book is at GitHub
(https://github.com/Serulab/Py4Bio). From this site you can inspect online
or download all the code used in this book. To download all scripts, press the
green “Clone or download” button. If you have Git installed on
your machine (and know how to use it3), clone the repository using this address:
[email protected]:Serulab/Py4Bio.git. Another alternative is to click on
“Download ZIP”. Once you have the repository on your machine, go to the code
folder, which contains a set of folders, each holding the scripts related to one
chapter. Each script in the book has a name that corresponds to its filename.
There is another folder called notebooks, which contains Jupyter notebooks
that can be run locally. For more information on how to run a Jupyter
notebook, please see
http://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/execute.html.
Another online resource is the set of Jupyter Notebooks available at the Microsoft Azure
Notebook website (https://notebooks.azure.com/py4bio/libraries/py3.us).
The same notebooks that are in the book repository can be used online on this site.
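
Once the repository is on your machine, a few lines of Python can show which scripts each folder contains. This is only a sketch: the function scripts_by_folder is a helper written for this example, and the path Py4Bio/code assumes that the repository was cloned or unzipped into the current directory under that name, which may not match your setup.

```python
from pathlib import Path


def scripts_by_folder(code_dir):
    """Map each sub-folder of `code_dir` to the sorted names of its .py files."""
    code_dir = Path(code_dir)
    return {
        folder.name: sorted(script.name for script in folder.glob('*.py'))
        for folder in sorted(code_dir.iterdir())
        if folder.is_dir()
    }


# Hypothetical location of a local copy; adjust it to where you downloaded it.
repository_code = Path('Py4Bio') / 'code'
if repository_code.is_dir():
    for folder, scripts in scripts_by_folder(repository_code).items():
        print(folder, scripts)
```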

1.3 WHY LEARN TO PROGRAM?


Many of the tasks that a researcher performs with his or her computer are repetitive:
collecting data from a web page, converting files from one format to another, executing or
interpreting hundreds of BLAST results, designing primers, looking for restriction enzymes,
etc. In many cases it is evident that these are tasks that can be performed by a
computer, with less effort on our part and without the possibility of errors caused
by tiredness or distractions.
An important consideration when you’re evaluating whether or not to create a
program is the apparent time lost in the definition and formulation of the problem,
implementing it with code, and then debugging it (correcting errors). It is incorrect
to consider problem definition and evaluation as a waste of time. It is generally at
this precise point in the process where we understand thoroughly the problem that
we face. It is common that during the attempt to formulate a problem, we realize
that many of our initial assumptions were mistaken. It also helps us to detect when
it is necessary to restart the planning process. When this happens, it is better that
it happens at the planning stage than when we are in the middle of the project. In
these cases, the planning of the program represents time saved. Another advantage
to take into account is that the time that is invested to create a program once is
compensated by the speed with which the tasks are performed every time we run
it.
3 In Appendix A there is a tutorial on how to use GitHub.

Not only can a program automate the procedures that we do manually, but it will also
be able to do things that would otherwise not be possible.
Sometimes it is not very clear if a particular task can be done by a program.
Reading a book such as this one (including the examples) will help you identify
which tasks are feasible to automate with software and which ones are better done
manually.

1.4 BASIC PROGRAMMING CONCEPTS


Before installing Python, let’s review some programming fundamentals. If you have
some previous programming experience, you may want to skip this section and jump
straight to Chapter 2 “Installing Python.” This section introduces basic concepts
such as instructions, data types, variables, and some other related terminology that
is used throughout this book.

1.4.1 What Is a Program?


Computers only know what you tell them. The way to tell them to do something
is with a program. A program is a set of ordered instructions designed to command
the computer to do something. The word “ordered” is there because it is not enough
to declare what to do; the actual order of the instructions must also be stated.4
A program is often characterized as a recipe. A typical recipe consists of a
list of ingredients followed by step-by-step instructions on how to prepare a dish.
This analogy is reflected in several programming websites and tutorials with the
words “recipe” and “cookbook.” A laboratory protocol is another useful analogy. A
protocol is defined as a “predefined written procedural method in the design and
implementation of experiments.”
Here is a typical protocol, followed almost every day in several molecular labo-
ratories:

Listing 1.1: Protocol for Lambda DNA digestion

Restriction Digestion of Lambda DNA

Materials

 5.0 mcL Lambda DNA (0.1 g/L)
 2.5 mcL 10x buffer
16.5 mcL H2O
 1.0 mcL EcoRI

Procedure

Incubate the reagents at 37°C for 1 hr.
Add 2.5 mcL loading dye and incubate for another 15 minutes.
Load 20 mcL of the digestion mixture onto a minigel

4 There are declarative languages that state what the program should accomplish, rather than
describing how to accomplish it. Most computer languages (Python included) are imperative instead
of declarative.

There are at least two components of a protocol: materials or ingredients, and
procedures. A procedure provides specific orders like incubate, add, mix, load and
many others. The same goes for a computer program. The programmer gives specific
orders to the computer: print, read, write, add, multiply, round, and others.
While protocol procedures correlate with program instructions, materials are
the data. In protocols, procedures are applied to materials: Mix 2.5 µL of buffer
with 5 µL of Lambda DNA and 16.5 µL of H2O, load 20 µL onto a minigel. In a
program, instructions are applied to data: print the text string “Hello”, add two
integer numbers, round a float number.
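
The analogy can be sketched in Python: the materials of the protocol become data, and instructions are applied to that data. The dictionary materials and the names reaction_volume and with_dye are illustrative choices for this example, and the volumes are the ones listed in the protocol above.

```python
# Materials from the protocol, as data: reagent name -> volume in microliters.
materials = {
    'Lambda DNA': 5.0,
    '10x buffer': 2.5,
    'H2O': 16.5,
    'EcoRI': 1.0,
}

# Instructions applied to the data.
reaction_volume = sum(materials.values())   # add the reagent volumes together
with_dye = reaction_volume + 2.5            # add the loading dye

print('Hello')                 # print a text string
print(2 + 3)                   # add two integer numbers
print(round(2.9))              # round a float number
print(reaction_volume, with_dye)
```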
Just as a protocol can be written in different languages (like English, Spanish, or
French), there are different languages to program a computer. In science, English is
the de facto language. Due to historical, commercial and practical reasons, there is
no such equivalent in computer science. There are several languages, each with its
own strong points and weaknesses. For reasons that will make sense shortly, Python
was the computer language chosen for this book.
Let’s see a simple Python program:

Listing 1.2: sample.py: Sample Python Program

1 seq_1 = 'Hello,'
2 seq_2 = ' you!'
3 total = seq_1 + seq_2
4 seq_size = len(total)
5 print(seq_size)

Note: The numbers at the beginning of each line are for reference only;
they are not meant to be typed.
This small program can be read as “Name the string Hello, as seq_1. Name
the string you! as seq_2. Add the strings named seq_1 and seq_2 and call the
result as total. Get the length of the string called total and name this value as
seq_size. Print the value of seq_size.” This program prints the number 11.
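
As a rough companion to the reading above, the same program can be run with its intermediate values shown. The reference line numbers are left out because they are not meant to be typed; the two extra print calls are additions for illustration and are not part of sample.py.

```python
seq_1 = 'Hello,'
seq_2 = ' you!'
total = seq_1 + seq_2      # joining two strings gives a new string
seq_size = len(total)      # len counts every character, including the space

print(repr(total))         # 'Hello, you!'
print(seq_size)            # 11
print(type(total).__name__, type(seq_size).__name__)   # str int
```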
As shown, there are different types of data (often called “data types” or just
“types”). Numbers (integers or floats), text strings, and other data types are covered
in Chapter 3. In print(seq_size), the instruction is print and seq_size is the
name of the data. Data is often represented as variables. A variable is a name
that stands for a value that may vary during program execution. With variables,
a programmer can represent a generic command like “round n” instead of “round
2.9.” This way the programmer can take into account a non-fixed (hence variable) value. When
Exploring the Variety of Random
Documents with Different Content
"She has to stay three days with her husband," Edward took it upon
himself to answer; "then the wedding will be finished and she can come
here for a day. That is our custom. Even though our father is dead, they will
not permit her to come before three days."

"And a nice home-coming it will be!" Ronald groaned. "A cheerful


place to return to. Please tell the t'ai-t'ai that when Nancy returns I must be
here to see her and speak to her. I don't know what the Chinese custom is in
such a case, but this is absolutely necessary if I am to perform my duties as
a trustee in a satisfactory manner."

Edward communicated this demand, to which the t'ai-t'ai gave a shrug


of consent. There was nothing these foreigners appeared incapable of
asking, but she was too wholly in the man's power. It was no time to
quibble.

With this promise safely gained, Ronald told Edward to gather up his
things. It was not healthy for the boy to stay a minute longer than necessary
in a household where everyone's thoughts dwelt round the corpse of the
dead master. Edward went to his work listlessly and came back sniffing and
weeping after the woebegone task of dismantling the room he had occupied
so long. Neither the sympathetic help of his amah cheered him nor the
welcome of his new home, where David, awed by the distinction of one
who had lost his father, tried cautiously to say the appropriate word.
Edward wanted Nancy; his heart was hungering for her even when he
thought he mourned for his father.

On the third day he went with his Uncle Ronald, as already he had been
taught to call his guardian, to see the sister who had become a bride.

His own eagerness, if he had known it, did not exceed Ronald's. The
intervening day had been a busy one. Ronald had been to the legation to
have Herrick's will admitted to probate. He found friends who had known
Herrick long ago and who were avid for every last detail of Herrick's story,
but they could suggest no scheme for saving Nancy. It was a rotten
business, they agreed with some emphasis, but a matter which could not be
helped, for Nancy, by wedding a Chinese husband, had forfeited British
protection. Ronald might use pressure, and they hoped he would, to get the
girl away from her husband,—there was not one of them who expected the
marriage to end in any way except drastic misery,—but he had no lawful
right to divert any of Herrick's estate for the purpose. The estate, through
remarkably clever investments, had once been close to a fortune, but
recently Herrick's intemperate withdrawals had reduced it till it was barely
enough to cover the terms of his will.

So Ronald went impatiently to meet Nancy, determined that if she gave


him the slightest encouragement he would break all the laws of the land to
rescue her. Early though he went, the bride had arrived before him and had
given way to a frenzy of sorrow beside her father's coffin. She had not yet
put on mourning, for the mother-in-law had deemed it an unlucky thing to
interrupt the first festal days with any mark of sadness. So she had come,
oddly enough, wearing a red skirt; but any suggestion of happiness had
been erased by the stains of grief which made her eyes dull in their sunken
pits and her skin a bloodless white.

It was the first chance Nancy had had to yield to her passionate misery:
for three days she had struggled against tears, trying to preserve some
semblance of joy in a family which paid no heed to the death of her father.
The rites of the wedding were dragged out till she was on the point of
fainting under the cruel burden. She felt no love for the husband who had
been goaded into claiming her, and suffered bridal intimacies from one who
became worse than a stranger in her eyes. Beneath his treatment she felt the
hostility of a youth who had not desired this foreigner for his wife, and
beneath the treatment she met from her new mother she felt the
exasperation over delay in the payment of her dowry, disappointment taking
unkind shapes because the woman had never forgiven herself for selling her
son into what was likely to prove a bad bargain. For three days the family
had been most deliberately merry, trying to face out their regrets in the sight
of the world; they had been reckless of how they spent money, but thrifty of
a single friendly word to the girl whose heart was breaking while she
pretended to smile. At last they had let her go home to weep.

When Nancy, who had comforted herself before marriage with the hope
of coming back to see her father, realized that he too had deserted her and
that she had not won him a single day's peace by her sacrifice, she threw
herself down beside his coffin and wept till her body seemed torn apart by
her grief. Edward, who in his turn was ready to break down, understood the
sudden need to control himself, so that when the time came he could
comfort his sister in his affectionate boyish manner and bring her away to
the room where Ronald was waiting.

Nancy was dazed at seeing Ronald. She did not seem to know why he
was there. Her mind still lingered with her father. She had only perfunctory
words to spare for the living, while Ronald could hardly check the
temptation to carry her away by force, to carry her out of sight and sound of
this baneful household. Everything he wanted to say froze on his lips. He
had no heart to reproach the girl for persisting in the wedding she might
have stopped. With her face marred by grief, he could not ask her if she
were happy, if she were contented with her new home. The words would
have mocked their own meaning.

"Nancy," he did at last summon courage to say, "it is no use weeping


over the dead any more. It doesn't help them at all. If your father doesn't
know, then your tears are wasted; if he does know, then he will be the more
unhappy to see you so sad. The living are what we have to think of—you
and Edward. If you want your father to have peace, wherever he has gone,
you must help him not to worry over you. You must let him know that you
have peace yourself. Edward he won't worry about because he asked me to
take charge of him and so Edward has come to my sister's to live, but you
every one of us will worry about till we are sure that you are well and
happy. That's what you must tell me: you can speak as frankly as you
choose; there is no one here who dares to interrupt, but I must know how I
can help you."

"You can't help me," answered Nancy.

She was quieter now, but the hysterical stillness of her manner
frightened Ronald.

"That is no answer," exclaimed Ronald.

He was annoyed by the girl's obstinacy, which she had inherited in too
full measure from her father.

"You surely can be frank with me," he added, "because I may never
again be in such a position to help you. You know that I have your father's
estate to divide. As long as the money, which includes ten thousand taels
which were to be paid at your wedding, as long as this remains in my hands
I can make almost any terms you may wish with the t'ai-t'ai. But when it has
been divided, then my power will be gone. Now do you regret your
bargain? Are you sorry you kept to this marriage? Do tell me now, when I
can help you."

He had realized Nancy's stubbornness; he had not measured her pride.

"My marriage is what I expected," she answered.

How could she tell him the shame of the last three days? How could she
relate the scornful treatment of her new family? She might have told Kuei-
lien; she had no words to speak of it to Ronald. She could not run to him
like a weakling tired of her promise. To endure the mischances of her
marriage was no more than keeping faith with her father's good name. She
was a wife; that was the end of it. But Ronald seemed to read her thoughts.

"I don't know what your new home is like," he argued, "but I do know
what you are like, and I can hardly imagine you happy under the conditions
you will find there. Just now your sorrow for your father makes everything
else seem of small account, but the time will come when the sharpness will
wear off and you will have to think of the man you have married and the
life you have adopted. For it is an adopted life; it is not natural to you. Now
your father is dead, don't make a mistake of your loyalty to him and think
you have to embrace years of misery merely to gratify his memory. That's
not good enough. They don't want you—I can see that; they only want the
money that was promised with you. Nothing would please them better than
to get this money without the necessity of taking you. You are a foreigner
and always will be a foreigner to them. Can't you come home with Edward
and me, and I will promise, if I have to move heaven and earth, to get your
marriage annulled."

"If they want my money, they have to take me," said Nancy stubbornly.

She was not doing justice to Ronald's proposal, while the man, in his
turn, was far from seeing her marriage as she saw it. She could not
appreciate how in his foreign eyes her marriage was no marriage, nor could
he see how to her Chinese eyes it was a bond from which there was but one
honorable escape for the wife, the extreme measure of suicide. Ronald had
been reading deeply in the customs of the Chinese the better to understand
Nancy's case, but he missed the essential fact of her attitude, the value she
set by her good name. To have run away because she was displeased with
her first three days of wedded life seemed an act of intolerable cowardice.
Nancy's every thought was Chinese, more Chinese than Kuei-lien's: she had
an inbred fear of disgrace, not only for her own sake but for her father's
whose reputation rested helplessly in her care. So she met Ronald's most
persuasive entreaties with the same blank answer. If she had grounds for
quarreling with her husband or with his parents it was no business of an
outsider to know of them.

At last Ronald despaired of moving her. He gave up the attempt. He was
as sure as he was sure of his own love for the girl that she was unhappy in
her new home and would grow week by week unhappier, but she was less
responsive to his words now than before her marriage. He threw down his
hands with a hopeless gesture, inwardly cursing the folly of Timothy
Herrick, which was able to survive him in such fatuously obdurate wrong-
headedness. Nancy's white, troubled face reminded him of his first glimpse
of her in the temple. How much greater was her danger to-day than in that
first perilous meeting. How much less he could help her. Unable to leave
the girl without one sign of his deep overmastering passion, he crossed the
room and kissed her gently on the forehead.

"I shall always love you, Nancy," he said.

Nancy trembled a little beneath the touch of his lips, but the kiss came
so naturally that she had no time to be surprised and could only wonder
long afterward at the trance which had held her silent under so strange a
greeting, so strange a token of farewell.

CHAPTER XXX

Ronald did not see Nancy again until the day of Timothy Herrick's
funeral. On that dreary day she was more remote than ever, wearing her
headdress of white sackcloth and weeping loudly. Even Edward, who had
thrown off many vestiges of his Chinese upbringing in the short time he had
lived with the Ferrises, fell back disconcertingly into old habits and was as
Chinese as Herrick's half-caste children when he had donned his coat of
coarse bleached calico.

Ronald rightly insisted that as Herrick had lived so should he be buried,
and he advised the t'ai-t'ai to spare none of the rites suitable to a mandarin
of her husband's rank. He brought Beresford with him to the funeral.
Beresford was intrigued by the many peculiar rites, but Ronald listened to it
all with insufferable weariness and wondered if the priests were ever to be
finished chanting their guttural prayers. Each stroke of bell and drum
seemed to remove Nancy farther than ever from his hopes, tangling her
spirit in an alien region from which she would never come out again. He
saw nothing picturesque in the great scarlet catafalque put over Herrick's
coffin, the silk umbrellas, the tables with their food for the dead, the spirit
chair intricately wrapped in white muslin, the horrid crayon copy of
Herrick's photograph, borne in a chair of its own, the bright silken copes of
the priests, their contrast with the rags of the beggars, who carried white
banners certifying to the merits of the dead, the green-clad coolies who
labored with the weight of the coffin, the pervading smell of incense and
burning sandalwood—these were all details which Ronald might have noted
with an interested eye if he had not been oppressed by their meaning for
Nancy. It was her tragedy that when those who loved her could bring the
girl no comfort, she had to seek relief in this pitiless barbarity which
seemed to sing her father's failure, his exile from his own people, his
cheerless sojourn in the cold places of the dead.

All this Ronald heard in the weird music of the procession, as the coffin
and its mourners moved slowly toward the gates of the city; he felt that the
road Timothy Herrick was traveling, this same road there was no one to
prevent his daughter from taking, despite all her lovable instincts for joy
and for beauty—no one good enough to prevent her from following in her
own desolate hour.

Beresford, however, thought the whole funeral very splendid. So much
better, he declared, than being reminded of the skin-worms, and forced to
linger in the sickly smell of a church which had been banked like a flower-
seller's shop while bald-headed gentlemen trundled the coffin with
exaggerated slowness up the aisle. He envied Herrick's escape from those
absurd rites and from being consigned into eternity by the throaty reading
of a curate in a starched surplice. This brilliant procession, winding with
such an unrehearsed mixture of carelessness and dignity, did seem in his
eyes to express more reasonably the tragic naturalness of death. Even
Ronald, before they had reached Herrick's burial-place, began to feel
himself haunted by the sobbing voice of the flutes and to know that this
garish splendor was the ancient and simple way of keeping up man's
courage before the mystery of death. It was a shock, on coming outside the
city, to see the coffin stripped of its pall, the umbrellas and chairs sent back,
as though the chief object of the parade had been not to honor the unseeing
dead but to win honor from the populous streets of the city, yet the quiet
which ensued induced meditations that were not unpleasing though they
were sad. Autumn lay with warm sunshine on the land; sloping shafts of
light made the dry grass glow; wide and blue was the sky. The only sound
was the low-toned note of a gong which a priest rang from time to time as
he walked in front of the coffin.

Ronald was moved by the loneliness of Herrick's burial-ground. It was
so tranquil that he, too, half envied the dead man's privilege of sleeping
quietly with all the scenes he had loved, the serene clarity of the Western
Hills, the climbing palaces of Wan Shou Shan, the towers and golden roofs
of Peking, compassing from the far distance the little circle of pine and
cypress round the grave. Ronald's spirit was hushed by the stillness. The
man looked idly at the four characters gilded on the end of Herrick's coffin:
"Hai returns to the halls of spring," they said, and for the first time Ronald
believed that there was immortality in lying here beneath the open spaces of
heaven. A fresh outburst of wailing, the burning of paper money, and
exploding of crackers could not touch the peace of a heart fortified by the
strangely comforting thought that life was soon over.

The grave was ready at two, but the hour was even-numbered, unlucky;
mourners and priests and workmen waited in little gossiping groups till the
more fortunate hour of three, when the coffin was lowered into the grave
with the lavish sunshine pouring down upon it as if to make amends for
Herrick's last sight of day. Every clod that had been dug was thrown
scrupulously upon the round mound of the grave. Edward knelt down and
wept; Nancy wept and bowed her forehead to the ground; the women
prostrated themselves, tearing their hair and their clothes. Ronald stood
watching dumbly, but he got his moment of reward when Nancy rose, for
she gave him one searching look, one glance of understanding and love,
over which hovered the trembling flicker of a smile. She showed she had
not forgotten his kiss; this was her answer. So completely, indeed, had
Nancy seemed to belong to him throughout all the tedious hours of the
funeral that Ronald remembered afterward, with some amazement, that
among the gathering of the t'ai-t'ai's family, which followed the coffin, he
had not knowingly set eyes upon or even thought of singling out Nancy's
husband.

After Herrick had been buried, there was nothing to keep him from
dividing what remained of his money. Ronald was anxious to be done with
the task. He exacted but one promise, a promise from the t'ai-t'ai that when
Nancy's first month of married life was complete and the girl, as custom
allowed, was able to sleep a few nights under another roof than her
husband's, she should come to his sister's home instead of the father's house
she ought to have visited. This was reasonable, for Edward was the only
kinsman left to her.

Herrick's pretentious household melted away. Each wife, when she
received her money, took pains to put herself out of the t'ai-t'ai's reach.
There was none of them that wished to be slave to that arrogant lady. With a
contemptuous smile she watched them scatter. After they and their children
and their bundles and bedding and their wrangling servants had gone, she
gave up the lease of the house Herrick had occupied so long, sold what she
could of his furniture, and betook herself to her brother's. Of the line her
husband had been so ambitious to found, literally not even the name
remained.

Ronald took care to obtain and note the t'ai-t'ai's address; Nancy's bridal
month was so nearly finished that he could not govern his eagerness to have
her come. The rest of Herrick's family he made no effort to trace. Except
the amah, who of course remained with Edward, they might scatter to the
winds for all he cared. But suddenly one evening when the Ferrises had
finished dinner a hubbub in the kitchen woke them from the lethargy of
worrying about Nancy, for Edward's presence among them had been a
continual reminder of his sister's absence; they jumped up in alarm when
the old nurse rushed gasping into the room, crying out, "They've gone,
they've gone!" It took them some minutes to understand what she meant.
Not till Kuei-lien appeared and rapidly poured out her story to Edward was
the cause of the amah's excitement understood.

To their consternation they learned that the t'ai-t'ai had broken her
promise. She had gone with her brother and his whole family back to their
native town of Paoling. And Nancy, as naturally she must do, had gone with
them. It was the last blow.

The other details of Kuei-lien's story were more interesting to Edward
than to his discouraged guardian. The one fact which might have been of
use, her coming from the same town as the t'ai-t'ai, was robbed of
advantage because the girl did not dare nor intend to go home. If she had
done so she would have been handed over to the t'ai-t'ai by her stupid and
covetous family. She was the single one of Herrick's concubines whom his
wife had tried to retain. Her parents were dependents of the Chou family,
absolutely under their orders, while the t'ai-t'ai not only did not like losing a
slave of Kuei-lien's beauty and cleverness but still more regretted letting her
escape with the money she had gathered. Their separation had cost them a
quarrel. The t'ai-t'ai had commanded the concubine to remain, had
threatened to hold her boxes and to have the girl beaten. If Kuei-lien had
been less bountiful in bribing the servants, she could not have got away.
The t'ai-t'ai's stinginess had proved her safety.

So Kuei-lien, meditating new plans, lay low. She cultivated the
friendship of the amah, husbanded the money she owned, while she looked
for chances to get more. And because she maintained some slight
connection with Paoling and might get them news of Nancy, the Ferrises
were pleased to let her stay. They did not guess a tenth of her plans nor
realize that she was using the shelter of their servant quarters to let it be
known she was under foreign protection, that any offense offered to her
would be visited upon the offender by the King and Parliament of Great
Britain.

As for poor Nancy, the King and Parliament of Great Britain had lost
interest in her. The secluded Chihli village of Paoling kept her as hidden
from prying strangers as the fastnesses of Turkestan. Nancy had never been
told of the promise that she should visit Edward in his new home. She was
saved this disappointment. But she knew it was the last step away from her
friends when her mother-in-law summoned her to pack and to get up long
before dawn for the cold dark ride to the station. Long as she had lived in
Peking, the city was a place strange and unfamiliar to the girl, yet she
conceived a fondness even for the arches and walls she barely could descry
in the darkness, for she felt she should never set eyes upon them again.

With the rest of her husband's family she bundled uncomfortably into a
third-class carriage, squeezing herself so tightly between baskets and
bedding that she sat as though cramped stiffly in a vise. Everyone spoke
shrilly; the early hour, the bitterly frosty morning, had set their tempers on
edge. No one was in a mood to enjoy the novelty of a railway ride. Nancy
looked wearily at the dingy houses they passed, wondered if their occupants
could be unhappier than she was; she saw in the distance the blue roofs of
the Temple of Heaven, but paid no heed; if her legs had not been so stiff,
her whole body aching from the need of movement, she might have gone to
sleep counting the numbers of the telegraph poles. Her mind did go to
sleep; her body persisted in staying painfully awake.

She was grateful to get off the train, grateful to shake her numb legs into
life, pulling boxes and bales quickly out of the car. The t'ai-t'ai and her
mother-in-law gave contradictory orders, they wrangled and shouted,
pulling servants helter-skelter, scolding Nancy, scolding her husband; they
were only one of many groups invoking heaven and hell in their panic lest
the train should start before the last bundle had been rolled out of the
window.

By a miracle they got themselves untangled and down to the platform,
where the women sank breathless on rolls of bedding, waiting for a bargain
to be struck with the mule-drivers. This was not quickly nor quietly done
and Nancy, used to having these small matters arranged without her
presence, despaired of its ever being done at all. To the mule-drivers and
their opponents, however, the hiring of a cart was more heady business than
speech in a public forum. Not till vulgar interest was diverted to Nancy,
whose presence in this company became an eighth day's wonder, did the
arguing parties see that their prominence of the moment had passed; they
made the same bargain they could have made half an hour back. Chou
hsien-sheng swore he was cheated, the drivers swore they were robbed, but
the price they fixed had been the unchanging rate for a decade.

Nancy was glad to get into her cart, even to be thriftily crowded among
three women servants and a suffocating mass of baggage. She had not
enjoyed the ring of staring eyes which had surveyed her nor the coarse
guesses of the people as to her history, guesses loudly and impudently
debated with many rustic guffaws over the joke of a foreigner reduced to
Chinese clothes and the whims of a Chinese master.

All day long the carts moved slowly forward, lumbering in ruts, shaking
the teeth of their passengers on miles of chipped highway, ploughing deep
through sand. Nancy was acutely mindful of other mule-cart journeys, the
rides to the Western Hills, when Edward and Kuei-lien had been her
comrades and each new turn of the road had tempted their eyes to objects of
joyful interest. She was scornful of the ignorant maids squashed into this
unpleasant contact, closed her eyes to avoid seeing their puffy faces; their
few monosyllables were like a parody of human speech. They wheezed and
grunted and reeked of garlic till Nancy wondered why she could not
withdraw all her senses, as she had withdrawn her sense of sight, and shut
herself from these clownish wenches like a mussel in its shell.

Shortly before dark the carts lurched down the sunken streets of
Paoling. It was like all the other villages they had passed, dusty and poor.
Dikes of baked mud served for walls. Two policemen lounged at the gate as
though the place were not worth their vain offer of protection. Mud and
gray tile and leafless trees, streets without shops, worn into deep trenches,
people clothed in rags so dirty that the very patches were blended to a
greasy uniformity of color—not an item relieved the drab scene. And the
home of her husband, Nancy found, was a consistent part of its
surroundings. It was filthy, musty, and cold, a huge ramshackle place
replete with tottering chairs and tables, its stone floors overlaid with grime,
its courtyards heaped with dung. Only rats and spiders seemed fit to inhabit
such a place and Nancy's heart became chill with dismay when she thought
of dragging out her life in this cheerless hole.

In a panic of sheer terror she was taken to greet Ming-te's grandmother,
the matriarch of the clan, the old lady whose temper she had heard
discussed with lively fear during the month she had been married. She
shrank from being led to something more terrible than any of the evil things
she had seen. Her nerves were so unstrung by the weariness and misery, the
depressing finish of the day, that she was ready to shriek. She halted stock-
still in a room ill lit by native wicks.

"Kneel," chided the voice of her mother-in-law.

Nancy knelt and kowtowed three times before the august personage to
whose face she had not yet presumed to raise her eyes. She waited, prostrate
on the floor.

"Lift her, you fools," cried a voice that showed by its testiness it was
used to being obeyed. "Can't you see she is worn with weariness?"

The other women hastened to help Nancy to her feet. The girl looked
wonderingly at the little old woman who sat muffled in quilted satin on the
k'ang. From a face crossed and transcrossed with wrinkles burned eyes
whose haughtiness spoke an older and a finer generation than the women to
whom Nancy had been subjected. Her mother-in-law's were dog's eyes
compared with them. Nancy lost her fear. The eyes brought memories of
her father. They seemed to pierce, with their sadness, their cynical
discontent, the very mysteries of life.

"Come here, my child," said the old woman gently. "Come and sit with
me and tell me how you are. I have waited a long, long time to welcome
you."

CHAPTER XXXI

In the first relief that followed this kindly greeting, Nancy nearly broke
down. Tears welled to her eyes, do what she would to hold them back. She
could not help sobbing, but the old woman stroked her hands as though she
knew the misery pent up in the heart of this alien bride.

"My husband and your father were friends," she said, "and I am glad
that his daughter has become my granddaughter. But it's hard, isn't it?"

She gave a little chuckle, seeming to appreciate her own experiences as
a bride in years which only a handful of bent gray figures like herself still
lived to remember. Nancy could have lived as long without forgetting this
reception by the wise old woman whose harsh tongue she had been taught
to dread. It came with such sudden, blinding beauty at the end of a
comfortless journey, at the end of four suffering weeks in which her spirit
had been tortured nearly to the limits of its endurance!

Nancy would have suffered much from the women, from her mother-in-
law and from her stepmother—for the latter visited on the daughter her
anger over the justice of Timothy Herrick's will—and even at the hands of
lesser people, who took their pattern from this spiteful pair, but she had
hoped for some measure of sympathy, some pity, even if there could not be
love, from the youthful stranger, Ming-te, who had been given the rights of
a husband over her life.

In this she was disappointed. Ming-te felt that there was no one with a
grievance comparable to his own. His parents, however much they might
dislike this foreigner in the family, had invited her by their own choice. But
he had been given no choice.

Like most youths of his modern day, he detested being bound by an
early marriage even to a girl of his own race; he detested being set to breed
heirs for the pleasure of his parents. He envied the new laxities of Shanghai
and Peking, the parody of Western freedom carried on under the guise of
choosing one's wife for one's self. He was eager to push aside convention,
to realize republican liberty by bursting all restraint; he was a student,
member of a class bound by no laws of right or reason, to whom all things
ought to be allowed in the pursuit of knowledge; yet just when his
imagination had begun to run riot over the thought of embracing slim girl
students to the mutual advancement of their studies, when he was becoming
conscious of his own sacred importance as the hope of China and the flower
of creation, he had been put under restraint like his forefathers, suddenly,
brutally married, his hopes dashed. And his sacrifice had been
unmentionably worse than theirs; he, the heir of the ages, had amounted to
so little in the eyes of his elders that they had flung him a foreigner for a
bride!

So Ming-te, the handsome, spoiled idol of his parents, took his marriage
in bad grace and vented his spleen on Nancy. He did not take the trouble to
see whether here might not be the ideal comrade of whom he had prated so
freely in the safe company of his friends; he had made up his mind to
dislike the girl long before he set eyes upon her. The disgrace of his bridal
night, his sheepishness, the mockery of his family, of which he still heard
the echoes, were an added score to be wiped out. And because he could not
avenge himself on her mind he tried to avenge himself on her body, for at
heart he was afraid of Nancy; at heart he realized her contempt for his
shallowness and conceit; he seemed to see her eyes despising him as a
weakling, a petulant small boy, till she challenged him to ecstasies of
cruelty to prove that he was indeed her master.

Nancy had learned many undreamed-of things during this month, but
nothing more dumbfounding than the fact that real sorrow is an experience
without appeal; it has no glamour, no romance. It is like a headache which
goes on forever. She wondered at the vernal innocent person she had been,
blithely offering herself for a life of torture, as though it were no more than
one of those tempestuous black tragedies of childhood which last for an
hour, then ripple peacefully away like bird notes after a storm. It seemed so
splendid to sacrifice herself, against the protests of Ronald and his nieces
and Edward and Kuei-lien and even her father himself; she had been
thrilled by her own daring even when her heart was cold with the prospect,
so that, while she entered the bridal chair sad and afraid, longing to cling to
everything she was forsaking, some small part of her could not forbear
standing aside to gloat over the picturesque courage of her deed.

But she had been wakened too unmercifully from her dream; her vanity,
so excusable, so childishly serious, broken by a punishment out of all
justice to what it deserved. Her days of shyness were passing. She was
putting off the bride to put on the shrew—in that hard-mouthed family no
other role was safe—when her regrets for the folly of her sacrifice suddenly
dissolved and her heart swelled with pride, with thankfulness, because she
had kept faith with an old lady she had never met, who greeted her in the
twilight of a gray day, saying, "I have waited a long, long time to welcome
you."

The t'ai-t'ai and her sister-in-law were more surprised than Nancy. They
were dismayed. What the old t'ai-t'ai said, she meant; she had come to an
age when she did not trouble to hide her thoughts of other people, but ruled
her clan, as the last of the oldest generation, with an unsparing frankness
such as made them quail. Hers was a witty, biting tongue which she found
life too short to think of bridling; she did not like her daughter, still less her
daughter-in-law, thought none too highly of her sons, and, as for her
grandchildren, she called them a litter of gaping puppies. Her mind was a
catalogue of their faults; she could make the best of them wince with a
single sharply prodding phrase, for there was nothing ridiculous that any of
them had done, and wished with all his heart to forget, that she could not
recall when the occasion suited her. Grown men writhed for a pretext to get
beyond earshot of her chuckle.

Yet she did not welcome Nancy kindly—as the t'ai-t'ai and her sister-in-
law concluded—merely to annoy them. Her instinct, which always was
extravagantly right, had told her that Nancy would be a friend. She did not
care whether Ming-te had a wife or not, but she longed for someone young,
someone talented and pretty, to whom she could talk and be kind. Her own
family bored her. She yawned when she thought of them. They were a
small, petty-minded generation, while her memory dwelt upon the large
days of the past. Her loyalty was all to the past, to her husband and his
father, to the family in its time of splendor, before its name had been
dragged in the dust by a progeny that forsook their books and squabbled
over cash like beggars fighting in the street. So she had ruled them with a
testy loneliness, glad to be alive only because she knew they would be glad
if she were dead.

Her first glimpse of Nancy satisfied the keen-sighted old tyrant. She
drew the pale girl to her side like a child.

"It's a long time since I've seen anyone really young," she said, "young
and wise together as they used to be. Now we have a republic; men don't
trouble about wisdom and they think they can rule the eighteen provinces
before they have left off their mother's milk. You have read books, I have
heard, and can write poems. Your father would see to that. He knew our
customs. He was one of us."

She could be tactful when she chose; in her questions about the death of
Nancy's father she soothed rather than irritated the quick feelings of the
daughter.

"To die on the day of your wedding, ai, that was a strange thing. I have
lived many years, but I have never heard the like. That was a proof that he
loved you, my child. You must remember such a father. And you have a
brother, too; where is he?"

Nancy told the story of Edward's friends.

"So you have Western friends. How did you come to make them?"

Paragraph by paragraph she drew from Nancy's lips the tale of how they
had met and visited the Ferrises. The old lady enjoyed the freshness of the
girl's story. She wanted most exact details of how these foreigners lived.

"It must have surprised them to see one of their own blood living in the
fashion of a Chinese. Did you like their ways?"

"Sometimes," Nancy admitted.

"Do you like our ways better?"

Nancy was surprised at the question and reluctant to answer.

"Perhaps—sometimes," suggested the old grandmother, answering
herself, and turned to laugh at the shadow of the smile Nancy could not
hide.

"Don't be afraid of me," she said, patting the girl's hand from pleasure at
her own jest. "I shall be your father and your mother from this time forth—
hm-m, just like a magistrate, remember. You can tell your troubles to me as
freely as you please and, even if the walls have ears, they won't dare speak
till I let them."

Her words lulled Nancy into a pleasing warmth of security. She forgot
her weariness, the despair with which she had risen this very morning to
start on a hopeless journey, for the old t'ai-t'ai's words were spoken with the
authority of one who could promise peace when she wished and protection
to those she liked. And she really liked Nancy.

"Your Western friends," she resumed, "they must have been appalled by
your marrying a Chinese. Did they try to dissuade you?"

"Yes, they did try."

"Ah, of course, they wouldn't understand. And perhaps they were right.
You may go back to them some day; who knows?"

"Oh no, I shall never go back to them," Nancy protested, dreading lest
the woman should doubt her loyalty to the promise she had made.

"Young people, my daughter, should never use the word 'never.' When
you are as old as I am and have to think soberly of the spring winds as not
just a chance to fly kites, then 'never' means something; ah, it means too
much. There is so much happiness I shall never know again, so many faces
I shall never see. But you, with your handful of years, there is no 'never' for
you. You thought to-day you would never smile again. You had heard of
me, hadn't you, and trembled to meet a bad-tempered old grandmother;
don't deny it—I saw it in your face when they made you kneel. I shall not
be bad-tempered to you, child. We old people like to have flowers about us.
I shall be selfish of your company and most surely will begrudge you to
others. And will you be sorry? Aha, I don't think you will. Your father must
have taught you wisely for you remind me of children as they used to be
when I was young. I am tired of being waited on by servant maids or by
people who wonder when I'm going to die. Why should I die just to make
fools more comfortable in their folly! No, I shall not be bad-tempered to
you, because you are the first person I have had round me for years who
really wished me to live. But I'm not going to share you."

How firm were her intentions was soon shown, for Nancy's mother-in-
law came in to say, in a voice too carefully matter-of-fact, that if the old t'ai-
t'ai had been gracious to say all she wished to the 'hsi-fu,' they hoped she
would give her permission to withdraw, for there was much work to be
done and her room to be set right.

"And whose work, indeed, is she to do, if not mine?" asked the old t'ai-
t'ai. "Her room we can discuss later, but to-night her room will be here."

"Oh, but that would not be convenient," faintly protested the younger
woman; "we must not separate the bride from her husband. My mother
speaks this out of her kind heart, but surely it would make my mother
uncomfortable."

"It will be entirely convenient," snapped the dowager.

"Very well, that is only what we wished to be sure of," said Ming-te's
mother hastily, "we wanted to make sure of your comfort."

Yet the next day she was still so far from being satisfied of the old t'ai-
t'ai's comfort that she asked her sister-in-law to intercede and to get Nancy
out of the old lady's clutches before it was too late. Hai t'ai-t'ai, Nancy's
step-mother, was more than ready to try, for she knew that while the old
lady lived, if they did not make a stand quickly, Nancy would be lost to
their control. She had a portion of her mother's independence and did not
cringe in the august presence as her sister-in-law was apt to do. Waiting a
chance when Nancy was absent, she went boldly into the den.

"You have come to ask after my health, have you?" inquired her mother
brusquely. "My health is excellent, this morning. It has done me great good
to meet someone new."

"We are so glad that the foreign hsi-fu meets with your favor," lied the
daughter cheerfully. "I thought of your comfort when I began to arrange the
match."

"Did you? Well, you thought most intelligently, so intelligently that I
have decided to keep her as my companion, to give her the room next to
mine."

"Your companion, by all means," agreed Hai t'ai-t'ai, "but not too much
your companion. We can never permit her to tire you with her prattle. She
might become spoiled and think you were indulging her in liberties only fit
for yourself. I have known her for many years and I speak the truth when I
say she is difficult to control. She puts forward a good face at first, but she
is an obstinate, self-willed child, not always obedient to her elders. Her
training was sadly neglected because she was left to the charge of an
indulgent old amah—"

"And you think her training will suffer at my hands, do you?"
interrupted the old t'ai-t'ai with a laugh, "you fear that I will be another
indulgent old amah to her?"

"Oh no, not at all, but we trembled to put the burden of her training in
your hands."

"You are all very busy people. What is there for an old woman like
myself to do? I shall be happy to take the burden of her training into my
hands. When I weary of it, I have a tongue; I can tell you."

The daughter shrugged her shoulders. Nancy always had been a
mischievous obstacle to her plans; and now, with her new ally, was more
dangerous than ever. Her hands itched to beat the wench. But she went on
in smooth tones:—

"We must be just to Ming-te."

"I am just to Ming-te."

"But he has had his bride only for a month. Is it right to leave the boy
lonely without a mate for his bed? These things mean so much to the young.
If he is lonely, he may go out to drink and to gamble with evil companions.
He did not want to marry, yet for our sake he did even more: he married a
foreigner to help his family. And now, when he is beginning to understand
her excellent qualities—"

"Self-willed and obstinate," reminded the t'ai-t'ai.

"To understand her excellent qualities," continued the daughter, as
though she had not heard the interruption, "and is beginning to appreciate
her for his wife, you surely would not reward his unselfishness by taking
her away and making her a stranger to him. What of the future of the
family? How will they learn to live together in peace and harmony like—"

"Like a sparrow and a phœnix," suggested the mother wickedly. Hai t'ai-
t'ai flushed in annoyance, but the dowager stopped her from speaking.

"How will they learn to live together in peace and harmony?" she
echoed. "Ah, my daughter, you are old enough to answer that question, or
must I answer it for you and say they never will learn. If you could have got
fifteen thousand taels without this girl, would you have taken her? No,
indeed not. But I would have taken her without a cash. So she belongs to
me. She will never be of any use to this family because I am the only one
who knows how to use her. And I am old—and I have no husband to give
her. She will be safer with me. If Ming-te wants a bedfellow, get him one.
You can afford to spend on him a little of the money he has earned. Buy
him a nice, good-tempered, pretty wife; the country is full of them. He will
be happy, you will be happy, and I shall have peace."

CHAPTER XXXII