The ABCs of Gene Cloning

Third Edition
Dominic W. S. Wong
Western Regional Research Center
Albany, CA, USA

ISBN 978-3-319-77762-7    ISBN 978-3-319-77982-9 (eBook)

Library of Congress Control Number: 2018937521

© Springer International Publishing AG, part of Springer Nature 1997, 2006, 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, express or implied, with respect to the material contained herein or for any errors
or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer International Publishing AG part
of Springer Nature.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To: Benji and Theo
Preface to the Third Edition

In preparing this third edition, the author has become more convinced
than ever that mastering the very basics of speaking and reading the “language”
of gene cloning is the key to seeing its beauty. The overall objective remains the
same as that stated for the two previous editions, with emphasis on the “nuts and
bolts” of learning the vocabulary and language of gene cloning. To this end, Parts
I and II include updates to the chapters on cloning techniques, cloning
vectors, and transformation. A new chapter covers the concepts and
approaches used in developing gene-vector constructs for expression cloning.
During the 12 years since the second edition was prepared, there has
been remarkable advancement in the application technology of gene cloning. In
revising this book, topics of emerging impact have been added, particularly
those relating to medical science and technology. New sections include
disease gene identification by exome sequencing, recombinant
adeno-associated virus-mediated gene therapy, engineered nucleases and
CRISPR for gene/genome editing, and next generation sequencing. Other
chapters have been revised and updated as well.
It has been a delightful and inspiring experience to learn about the
contributions of numerous scientists to the ever-advancing field of gene cloning. I
thank the authors whose publications and materials are referenced in this
book, and the publishers for giving permission to use the copyrighted
materials. Thanks are due to many of my colleagues and students for the years of
research collaborations giving focus and meaning to the scope and presentation
of this book.

Preface to the Second Edition

In the 9 years since the First Edition, my contention remains that an
effective approach to understanding the subject of gene cloning is to learn the
“vocabulary” and the “language”. This book emphasizes the nuts and bolts of
just how to do that: reading and speaking the language of gene cloning. It
shows the readers how to distinguish between a gene and DNA, to read and
write a gene sequence, to talk intelligently about cloning, to read science news,
and to enjoy seminars with some degree of comprehension.
On the whole, the second edition is not any more advanced than the
first, with the intent of keeping the book concise and not burdening the readers
with unwarranted details. Nevertheless, changes were needed and new materials
were incorporated in the revision. Part I has a new chapter to provide a tutorial
on reading both prokaryotic and eukaryotic gene sequences. Part II contains
several additions covering new techniques and cloning vectors. The topics
in Part III have been rearranged in separate sections – Part III now focuses on
applications of gene cloning in agriculture, and Part IV is devoted entirely to
applications in medicine. Chapters on gene therapy, gene targeting, and DNA
typing have been thoroughly revised. Additional coverage is included on animal
cloning and human genome sequencing. The heavy activity in rewriting and
expanding Part IV reflects the rapid progress in the technology and the increased
impact of gene cloning.
I enjoyed writing and revising this book with deep satisfaction. It has
been an inspiring experience to witness the remarkable development in the field
of gene cloning and the tireless dedication of thousands of scientists in making
genes tick.

Preface to the First Edition

Gene cloning has become a fast-growing field with a wide-ranging
impact on every facet of our lives. The subject of gene cloning can be
intimidating to the novice with little formal training in biology. This book is not
intended to give an elementary treatment of recombinant DNA technology, as
there are already a number of books in this category. The objective of writing
this book is to provide a genuine introduction to gene cloning, so that interested
readers with no prior knowledge in this area can learn the vocabulary and acquire
some proficiency in reading and speaking the “language”.
In the process of writing this book, the author was continuously
confronted with how to present the language of a complex field in a simple and
accessible manner. I have chosen to devote Part I of this book to outlining some
basic concepts of biology in a straightforward way. My
intention is to highlight only the essentials that are most relevant to understanding
gene cloning. For those who want to pursue a thorough review of genetics or
molecular biology, there are many excellent references available. Part II of the
book describes cloning techniques and approaches used in microbial, plant, as
well as mammalian systems. I believe that a discussion beyond microbes is a
prerequisite to a better comprehension of the language and the practical uses of
gene cloning. Part III describes selected applications in agriculture and food
science, and in medicine and related areas. I have taken the approach of first
introducing the background information for each application, followed by an
example of cloning strategies published in the literature. The inclusion of
publications is an efficient way to demonstrate how gene cloning is conducted, and
to relate it to the concepts developed in Parts I and II. Moreover, it enables the
readers to “see” the coherent theme underlying the principles and techniques of
gene cloning. Consistent with its introductory nature, the text is extensively
illustrated and the contents are developed in a logical sequence. Each chapter is
supplemented with a list of review questions as a study aid.

I hope that this book will succeed in conveying not only the wonderful
language of gene cloning, but also a sense of relevance of this science in our
everyday lives. Finally, I acknowledge the contributions of my teachers and
colleagues, especially Professor Carl A. Batt (Cornell University) and Professor
Robert E. Feeney (UC Davis), to my pursuing an interest in biological molecules
and processes. Special thanks are due to Dr. Eleanor S. Reimer (Chapman &
Hall) who has been very supportive in making this book a reality.

Contents

Preface to the Third Edition
Preface to the Second Edition
Preface to the First Edition

Part One. Fundamentals of Genetic Processes

1 Introductory Concepts
1.1 What Is DNA and What Is a Gene?
1.2 What Is Gene Cloning?
1.3 Cell Organizations
1.4 Heredity Factors and Traits
1.5 Mitosis and Meiosis
1.6 Relating Genes to Inherited Traits
1.7 Why Gene Cloning?

2 Structures of Nucleic Acids
2.1 5′-P and 3′-OH Ends
2.2 Purine and Pyrimidine Bases
2.3 Complementary Base Pairing
2.4 Writing a DNA Molecule
2.5 Describing DNA Sizes
2.6 Denaturation and Renaturation
2.7 Ribonucleic Acid

3 Structures of Proteins
3.1 Amino Acids
3.2 The Peptide Bond
3.3 Structural Organization
3.4 Posttranslational Modification
3.5 Enzymes

4 The Genetic Process
4.1 From Genes to Proteins
4.2 Transcription
4.3 Translation
4.4 The Genetic Code
4.5 Why Present a Sequence Using the Coding Strand?
4.6 The Reading Frame
4.7 DNA Replication
4.8 The Replicon and Replication Origin
4.9 Relating Replication to Gene Cloning

5 Organization of Genes
5.1 The Lactose Operon
5.2 Control of Transcription
5.2.1 Where Are the Transcription Start Site and Termination Site?
5.2.2 When Does Transcription Start or Stop?
5.3 Control of Translation
5.3.1 Ribosome Binding Site and Start Codon
5.3.2 Translation Termination Site
5.4 The Tryptophan Operon
5.4.1 Co-repressor
5.4.2 Attenuation
5.4.3 Hybrid Promoters
5.5 The Control System in Eukaryotic Cells
5.5.1 Transcriptional Control
5.5.2 Introns and Exons
5.5.3 Capping and Tailing
5.5.4 Ribosome Binding Sequence
5.5.5 Monocistronic and Polycistronic

6 Reading the Nucleotide Sequence of a Gene
6.1 The E. coli dut Gene
6.2 The Human bgn Gene
6.2.1 Reading the Genomic Sequence
6.2.2 Reading the cDNA Sequence

Part Two. Techniques and Strategies of Gene Cloning

7 Enzymes Used in Cloning
7.1 Restriction Enzymes
7.2 Ligase
7.3 DNA Polymerases
7.3.1 E. coli DNA Polymerase I
7.3.2 Bacteriophage T4 and T7 Polymerase
7.3.3 Reverse Transcriptase
7.4 Phosphatase and Kinase

8 Techniques Used in Cloning
8.1 DNA Isolation
8.2 Gel Electrophoresis
8.2.1 Agarose Gel Electrophoresis
8.2.2 Polyacrylamide Gel Electrophoresis
8.3 Western Blot
8.4 Southern Transfer
8.5 Colony Blot
8.6 Hybridization
8.7 Colony PCR
8.8 Immunological Techniques
8.9 DNA Sequencing
8.10 Polymerase Chain Reaction
8.11 Site-Directed Mutagenesis
8.12 Non-radioactive Detection Methods

9 Cloning Vectors for Introducing Genes into Host Cells
9.1 Vectors for Bacterial Cells
9.1.1 Plasmid Vectors
9.1.2 Bacteriophage Vectors
9.1.3 Cosmids
9.1.4 Phagemids
9.2 Yeast Cloning Vectors
9.2.1 The 2 μ Circle
9.2.2 The Pichia pastoris Expression Vectors
9.3 Vectors for Plant Cells
9.3.1 Binary Vector System
9.3.2 Cointegrative Vector System
9.3.3 Genetic Markers
9.3.4 Plant Specific Promoters
9.4 Vectors for Mammalian Cells
9.4.1 SV40 Viral Vectors
9.4.2 Direct DNA Transfer
9.4.3 Insect Baculovirus
9.4.4 Retrovirus

10 Gene-Vector Construction
10.1 Cloning or Expression
10.2 The Basic Components
10.2.1 Expression Vectors
10.3 Reading a Vector Map
10.4 The Cloning/Expression Region
10.5 The Gene Must Ligate in Frame with the Vector for Expression
10.6 Linkers and Adapters for Introducing Restriction Sites

11 Transformation
11.1 Calcium Salt Treatment
11.2 Electroporation
11.3 Agrobacterium Infection
11.4 The Biolistic Process
11.5 Viral Transfection
11.6 Microinjection
11.7 Nuclear Transfer
11.8 Cell-Free Expression

12 Isolating Genes for Cloning
12.1 The Genomic Library
12.2 The cDNA Library
12.3 Choosing the Right Cell Types for mRNA Isolation

Part Three. Impact of Gene Cloning: Applications in Agriculture

13 Improving Tomato Quality by Antisense RNA
13.1 Antisense RNA
13.2 A Strategy for Engineering Tomatoes with Antisense RNA

14 Transgenic Crops Engineered with Insecticidal Activity
14.1 Bacillus thuringiensis Toxins
14.2 Cloning of the cry Gene into Cotton Plants
14.2.1 Modifying the cry Gene
14.2.2 The Intermediate Vector
14.2.3 Transformation by Agrobacterium

15 Transgenic Crops Conferred with Herbicide Resistance
15.1 Glyphosate
15.2 Cloning of the aroA Gene

16 Growth Enhancement in Transgenic Fish
16.1 Gene Transfer in Fish
16.2 Cloning Salmons with a Chimeric Growth Hormone Gene

Part Four. Impact of Gene Cloning: Applications in Medicine and Related Areas

17 Microbial Production of Recombinant Human Insulin
17.1 Structure and Action of Insulin
17.2 Cloning Human Insulin Gene

18 Finding Disease-Causing Genes
18.1 Genetic Linkage
18.1.1 Frequency of Recombination
18.1.2 Genetic Markers
18.2 Positional Cloning
18.2.1 Chromosome Walking
18.2.2 Chromosome Jumping
18.2.3 Yeast Artificial Chromosome
18.3 Exon Amplification
18.4 Isolation of the Mouse Obese Gene
18.5 Exome Sequencing
18.5.1 Targeted Enrichment by Sequence Capture
18.5.2 Disease Gene Identification

19 Human Gene Therapy
19.1 Physical and Chemical Methods
19.2 Biological Methods
19.2.1 Life Cycle of Retroviruses
19.2.2 Construction of a Safe Retrovirus Vector
19.2.3 Gene Treatment of Severe Combined Immune Deficiency
19.3 Adeno-Associated Virus
19.3.1 Life Cycle of Adeno-Associated Virus
19.3.2 Recombinant Adeno-Associated Virus
19.3.3 Recombinant Adeno-Associated Virus-Mediated Gene Treatment for Leber’s Congenital Amaurosis Type 2
19.4 Therapeutic Vaccines
19.4.1 Construction of DNA Vaccines
19.4.2 Delivery of DNA Vaccines

20 Gene Targeting and Genome Editing
20.1 Recombination
20.2 Replacement Targeting Vectors
20.3 Gene Targeting Without Selectable Markers
20.3.1 The PCR Method
20.3.2 The Double-Hit Method
20.3.3 The Cre/loxP Recombination
20.4 Gene Targeting for Xenotransplants
20.5 Engineered Nucleases: ZFN, TALEN, CRISPR
20.5.1 Zinc-Finger Nucleases
20.5.2 Transcription Activator-Like Effector Nucleases
20.5.3 The CRISPR/Cas System
20.5.4 Nonhomologous End Joining and Homology-Directed Repair
20.5.5 Expressing Engineered Nucleases in Target Cells

21 DNA Typing
21.1 Variable Number Tandem Repeats
21.2 Polymorphism Analysis Using VNTR Markers
21.3 Single-Locus and Multi-locus Probes
21.4 Paternity Case Analysis
21.5 Short Tandem Repeat Markers
21.5.1 The Combined DNA Index System
21.6 Mitochondrial DNA Sequence Analysis

22 Transpharmers: Bioreactors for Pharmaceutical Products
22.1 General Procedure for Production of Transgenic Animals
22.2 Transgenic Sheep for α1-Antitrypsin

23 Animal Cloning
23.1 Cell Differentiation
23.2 Nuclear Transfer
23.3 The Cloning of Dolly
23.4 Gene Transfer for Farm Animals

24 Whole Genome and Next Generation Sequencing
24.1 Genetic Maps
24.1.1 DNA Markers
24.1.2 Pedigree Analysis
24.2 Physical Maps
24.2.1 Sequence Tagged Sites
24.2.2 Radiation Hybridization
24.2.3 Clone Libraries
24.2.4 The Bacterial Artificial Chromosome Vector
24.3 Comprehensive Integrated Maps
24.4 Strategies for Genome Sequencing
24.4.1 Hierarchical Shotgun Sequencing
24.4.2 Whole-Genome Shotgun Sequencing
24.5 Next Generation Sequencing of Whole Genomes
24.5.1 The Basic Scheme of NGS

Suggested Readings

Index

Part One

Fundamentals of Genetic Processes

Chapter 1

Introductory Concepts

The building blocks of all forms of life are cells. Simple organisms
such as bacteria exist as single cells. Plants and animals are composed of many
cell types, each organized into tissues and organs with specific functions. The
determinants of the genetic traits of living organisms are contained within the
nucleus of each cell, in the form of a type of nucleic acid called
deoxyribonucleic acid (DNA). The genetic information in DNA is used for the synthesis
of proteins unique to a cell. Cells express the information coded
by DNA in the form of protein molecules through a two-stage process of
transcription and translation.

         Transcription           Translation
DNA ------------------> mRNA ------------------> Protein
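
The two-stage flow can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not a full implementation: the DNA sequence is hypothetical, the codon table is only a five-entry excerpt of the standard genetic code, and transcription is simplified to rewriting the coding strand with U in place of T.

```python
# Toy illustration of the flow DNA -> mRNA -> protein.
# Assumption: only a handful of codons are listed; this is a partial
# excerpt of the standard genetic code, chosen for this example.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "GCU": "Ala",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": None,   # stop codon
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time until a stop codon is met."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein

dna = "ATGGCTAAATGGTAA"  # hypothetical coding-strand sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGCUAAAUGGUAA
print(protein)  # ['Met', 'Ala', 'Lys', 'Trp']
```

A sequence containing a codon outside the excerpt would raise a `KeyError`; a complete table has 64 entries.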

1.1 What Is DNA and What Is a Gene?

A DNA molecule contains numerous discrete pieces of information,
each coding for the structure of a particular protein. Each piece of
information that specifies a protein corresponds to only a very small segment of the
DNA molecule. Bacteriophage λ, a virus that infects bacteria, contains all its 60
genes in a single DNA molecule. In humans, there are about 20,000 genes
organized in 46 chromosomes, complex structures of DNA molecules associated
with proteins.
When, how, and where the synthesis of each protein occurs is precisely
controlled. Biological systems are optimized for efficiency; proteins are made
only when needed. This means that transcription and translation of a gene in the
production of a protein are highly regulated by a number of control elements,
many of which are also proteins. These regulatory proteins are in turn coded by
a set of genes.

It is therefore more appropriate to define a gene as a functional unit. A
gene is a combination of DNA segments that contain all the information
necessary for its expression, leading to the production of a protein. A gene defined in
this context would include (1) the structural gene sequence that encodes the
protein, and (2) sequences that are involved in the regulatory function of the

1.2 What Is Gene Cloning?

Gene cloning is the process of introducing a foreign DNA (or gene)
into a host (bacterial, plant, or animal) cell. In order to accomplish this, the gene
is usually inserted into a vector (a small piece of DNA) to form a recombinant
DNA molecule. The vector acts as a vehicle for introducing the gene into the
host cell and for directing the proper replication (DNA -> DNA) and expression
(DNA -> protein) of the gene (Fig. 1.1).
The process by which the gene-containing vector is introduced into a
host cell is called “transformation”. The host cell now harboring the foreign
gene is a “transformed” cell or a “transformant”.
The host cell carrying the gene-containing vector produces progeny all
of which contain the inserted gene. These identical cells are called “clones”.
In the transformed host cell and its clones, the inserted gene is
transcribed and translated into proteins. The gene is therefore “expressed”, with the
gene product being a protein. The process is called “expression”.

Fig. 1.1. General scheme of gene cloning
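
The vocabulary introduced in this section (vector, recombinant DNA, transformation, transformant, clones) can be tied together in a small Python sketch. This is only a bookkeeping model under loud assumptions: the sequences, the `Cell` class, and the function names are all hypothetical, DNA is treated as a plain string rather than a circular molecule, and the enzymatic steps of cutting and joining are reduced to string splicing.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    """A host cell; `vector_dna` is empty until the cell is transformed."""
    name: str
    vector_dna: str = ""

def insert_gene(vector: str, gene: str, position: int) -> str:
    """Splice the gene into the vector at `position` to form recombinant DNA."""
    return vector[:position] + gene + vector[position:]

def transform(host: Cell, recombinant: str) -> Cell:
    """Introduce the recombinant DNA into the host, giving a transformant."""
    return Cell(host.name, recombinant)

def grow(cell: Cell, divisions: int) -> list:
    """Each division doubles the population; all progeny carry the same DNA."""
    population = [cell]
    for _ in range(divisions):
        population = population * 2
    return population

vector = "acgtacgtacgt"   # hypothetical vector sequence (lower case)
gene = "ATGAAATAA"        # hypothetical foreign gene (upper case)

recombinant = insert_gene(vector, gene, position=6)
transformant = transform(Cell("host"), recombinant)
clones = grow(transformant, divisions=3)

print(recombinant)                                # acgtacATGAAATAAgtacgt
print(len(clones))                                # 8
print(all(gene in c.vector_dna for c in clones))  # True
```

Every cell in `clones` carries the same inserted gene, which is the sense in which the progeny of a transformant are called clones.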

1.3 Cell Organizations

Let us focus for a moment on the organization and the
general structural features of a cell, knowledge of which is required for
commanding the language of gene cloning. Cells exist in one of two distinct types
of arrangements (Fig. 1.2). In a simple cell type, there are no separate
compartments for genetic materials and other internal structures.

Fig. 1.2. Drawing of cells showing details of organelles

Organisms with this type of cellular organization are referred to as
prokaryotes. The genetic materials of prokaryotes, such as bacteria, are present as
a single circular DNA molecule in a clear region called the nucleoid, which can be
observed microscopically. Some bacteria also contain small circular DNA molecules
called plasmids. (Plasmids are the DNA used to construct vectors in gene
cloning. See Sect. 9.1.) The rest of the cell interior is the cytoplasm, which contains
numerous minute spherical structures called ribosomes, the sites for protein
synthesis. Defined structures like ribosomes are called organelles. The rest
(fluid portion) of the cytoplasm is the cytosol, a solution of chemical
constituents that maintain various functions of the cell. All the intracellular materials are
enclosed by a plasma membrane, a bilayer of phospholipids in which various
proteins are embedded. In addition, some bacterial cells contain an outer layer
of peptidoglycan (a polymer of amino-sugars) and a capsule (a slimy layer of
In contrast, the vast majority of living species, including animals, plants,
and fungi, have cells that contain genetic materials in a membrane-bound
nucleus, separated from other internal compartments which are also surrounded
by membranes. Organisms with this type of cell organization are referred to as
eukaryotes. The number and the complexity of organelles in eukaryotic cells far
exceed those in bacteria (Fig. 1.2). In animal cells, the organelles and
constituents are bound by a plasma membrane. In plants and fungi, there is an additional
outer cell wall that is composed primarily of cellulose. (In plant and fungal
cells, the cell wall needs to be removed in some cases before a foreign DNA can be
introduced into the cell, as described in Sect. 11.1.)

1.4 Heredity Factors and Traits

In a eukaryotic nucleus, DNA exists as complexes with proteins to
form a structure called chromatin (Fig. 1.3). During cell division, the
fiber-like chromatin condenses to form a precise number of well-defined structures
called chromosomes, which can be seen under a microscope.
Chromosomes are grouped in pairs by similarities in shape and length
as well as genetic composition. The number of chromosome pairs varies in dif-
ferent species. For example, carrots have 9 pairs of chromosomes, humans have
23 pairs, and so on. The two similar chromosomes in a pair are described as
homologous, containing genetic materials that control the same inherited traits.
If a heredity factor (gene) that determines a specific inherited trait is located in
one chromosome, it is also found at the same location (locus) on the homologous
chromosome. The two copies of a gene found at the same locus in a
homologous chromosome pair are determinants of the same hereditary trait, but
may exist in different forms (alleles). In simple terms, dominant and recessive
alleles exist for each gene.

Fig. 1.3. Structure of cellular chromosome

In a homologous chromosome pair, the two copies of a gene can exist
in three types of combinations: 2 dominant alleles, 1 dominant and 1 recessive,
or 2 recessives. Dominant alleles are designated by capital letters, and recessive
alleles by the same letter but in lower case. For example, the shape of a pea seed
is determined by the presence of the R gene. The dominant form of the gene is
“R”, and the recessive form of the gene is designated as “r”. The homologous
combination of the alleles can be one of the following: (1) RR (both dominant),
(2) Rr (one dominant, one recessive) or (3) rr (both recessive). This genetic
makeup of a heredity factor is called the genotype. A dominant allele is the form
of a gene that is always expressed, while a recessive allele is suppressed in the
presence of a dominant allele. Hence, in the case of the genotypes RR and Rr, the
pea seeds acquire a round shape, and a genotype of rr will give a wrinkled seed.
The observed appearance from the expression of a genotype is its phenotype.
In the example, a pea plant with a genotype of RR or Rr has a phenotype
of round seeds. When two alleles of a gene are the same (such as RR
or rr), they are called homozygous (dominant or recessive). If the two alleles are
different (such as Rr), they are heterozygous. The genotypes and phenotypes of
the offspring from breeding between, for example, two pea plants having geno-
types of Rr (heterozygous) and rr (homozygous recessive), can be tracked by
the use of a Punnett square (Fig. 1.4a). The offspring in the first generation will
have genotypes of Rr and rr in a 1:1 ratio, and phenotypes of round seed and
wrinkled seed, respectively.
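The Punnett square described above can be sketched in a few lines of Python. This is only an illustrative sketch under simple assumptions: one gene, two alleles, complete dominance, and every gamete combination equally likely. The function names `punnett` and `phenotype` are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """Count offspring genotypes for a single-gene cross.

    Each parent is given as a two-letter genotype such as "Rr" or "rr".
    Every allele of one parent is paired with every allele of the other,
    which is exactly what the cells of a Punnett square represent.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sorting puts the upper-case (dominant) allele first: "Rr", not "rR".
        genotype = "".join(sorted((allele1, allele2)))
        counts[genotype] += 1
    return counts


def phenotype(genotype, dominant="round", recessive="wrinkled"):
    """Assume complete dominance: one dominant allele is enough."""
    has_dominant = any(allele.isupper() for allele in genotype)
    return dominant if has_dominant else recessive


# The cross discussed in the text: Rr (heterozygous) x rr (homozygous recessive).
offspring = punnett("Rr", "rr")
for genotype, count in sorted(offspring.items()):
    print(genotype, count, phenotype(genotype))
```

For the Rr x rr cross, the four cells of the square contain two Rr and two rr genotypes, which is the 1:1 ratio stated in the text. The same function can be applied to other crosses, such as Rr x Rr.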

Fig. 1.4. Cross between (a) Rr and rr pea plants, and (b) carrier female
and normal male

The example of round/wrinkled shape of pea seeds is typical of one
gene controlling a single trait. The situation is more complex in most cases,
because many traits are determined by polygenes. Eye color, for example, is
controlled by the presence of several genes. In some cases, a gene may exist in
more than two allelic forms. Human ABO blood types are controlled by a gene
with 3 alleles – I^A and I^B are codominant, and I^O is recessive. Additional variations
are introduced by a phenomenon called crossing over (or recombination)
in which a genetic segment of one chromosome is exchanged with the corre-
sponding segment of the homologous chromosome during meiosis (a cell divi-
sion process, see Sects. 1.5 and 18.1).
A further complication arises from sex-linked traits. Humans have 23
pairs of chromosomes. Chromosome pairs 1 to 22 are homologous pairs, and
the last pair consists of the sex chromosomes. Males have an XY pair and females have
an XX pair. The genes carried by the Y chromosome dictate the development
of a male; the lack of the Y chromosome results in a female. A sex-linked gene
is a gene located on a sex chromosome. Most known human sex-linked genes
are located on the X chromosome, and thus are referred to as X-linked. An
example of a sex-linked trait is color blindness, which is caused by a recessive
allele on the X chromosome (Fig. 1.4b). If a carrier female is married to a normal
male, the children will have the following genotypes and phenotypes, where Xc
denotes an X chromosome carrying the recessive allele. Sons: XcY (color blind) and
XY (normal); daughters: XcX (normal, carrier) and XX (normal, non-carrier).
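The sex-linked cross can be worked through in the same way as a Punnett square. The following Python sketch is illustrative only. It assumes a single X-linked recessive allele, and it uses the label "Xc" (a notation introduced here for the example) for an X chromosome carrying that allele. The function name `x_linked_cross` is hypothetical.

```python
from itertools import product


def x_linked_cross(mother, father):
    """List the children of a cross for one X-linked recessive trait.

    Each parent is a pair of sex chromosomes, e.g. ("Xc", "X") for a
    carrier female and ("X", "Y") for a normal male. "Xc" is an assumed
    label for an X chromosome carrying the recessive allele.
    """
    children = []
    for egg, sperm in product(mother, father):
        sex = "son" if sperm == "Y" else "daughter"
        # Only X chromosomes can carry the allele in this model.
        x_copies = [c for c in (egg, sperm) if c != "Y"]
        mutant_copies = x_copies.count("Xc")
        if mutant_copies == len(x_copies):
            status = "color blind"      # no normal copy to mask the allele
        elif mutant_copies > 0:
            status = "normal, carrier"  # one normal copy masks the allele
        else:
            status = "normal"
        children.append((sex, egg + sperm, status))
    return children


# Carrier female x normal male, as in Fig. 1.4b.
for child in x_linked_cross(("Xc", "X"), ("X", "Y")):
    print(child)
```

A son has only one X chromosome, so a single copy of the recessive allele is expressed. A daughter with one copy is a carrier because her second X chromosome supplies a normal allele.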

1.5 Mitosis and Meiosis

The presence of homologous chromosome pairs is the result of sexual
reproduction. One member of each chromosome pair is inherited from each parent.
In humans and other higher organisms, somatic cells (all cells except the
germ cells, i.e. sperm and eggs) contain a complete set of homologous
chromosome pairs, one member of each pair from each parent. These cells are
called diploid cells (2n). Germ cells contain only one homolog of each
chromosome pair, and are referred to as haploid (n).
A fundamental characteristic of cells is their ability to reproduce
themselves by cell division – a process of duplication in which two new (daugh-
ter) cells arise from the division of an existing (parent) cell. Bacterial cells
employ cell division as a means of asexual reproduction, producing daughter
cells by binary fission. The chromosome in a parent cell is duplicated, and
separated so that each of the two daughter cells acquires the same chromosome
as the parent cell.
In eukaryotes, the process is not as straightforward. Two types of cell
division, mitosis and meiosis, can be identified. In mitosis, each chromosome is
copied into duplicates (called chromatids) that are separated and partitioned
into two daughter cells. Therefore, each of the two daughter cells receives an
exact copy of the genetic information possessed by the parent cell (Fig. 1.5).
Mitosis permits new cells to replace old cells, a process essential for growth and
maintenance. In meiosis, the two chromatids of each chromosome stay attached,
and the chromosome pairs are separated instead, resulting in each daughter cell
carrying half of the number of chromosomes of the parent cell (Fig. 1.5). Note
that at this stage, each chromosome in the daughter cells consists of 2 chroma-
tids. In a second step of division, the chromatids split, resulting in 4 daughter
cells each containing a haploid number of chromosomes, i.e. only one member
of each homologous chromosome pair. Meiosis is the process by which germ
cells are produced. After fertilization of an egg with a sperm, the embryo has
complete pairs of homologous chromosomes.
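The chromosome bookkeeping in mitosis, meiosis, and fertilization can be checked with simple arithmetic. The Python sketch below is a deliberately minimal model that tracks only chromosome numbers per cell, not chromatids or crossing over. The function names are hypothetical, and the value 46 is the human diploid number (23 pairs) given earlier in the text.

```python
def mitosis(diploid_number):
    """One division: two daughter cells, each with the parent's chromosome number."""
    return [diploid_number, diploid_number]


def meiosis(diploid_number):
    """Two divisions: four daughter cells, each with half the parent's number."""
    if diploid_number % 2 != 0:
        raise ValueError("a diploid cell has an even number of chromosomes")
    haploid_number = diploid_number // 2
    return [haploid_number] * 4


def fertilization(egg, sperm):
    """Fusion of two haploid germ cells restores the diploid number."""
    return egg + sperm


human_2n = 46  # 23 homologous pairs
germ_cells = meiosis(human_2n)
print(mitosis(human_2n))
print(germ_cells)
print(fertilization(germ_cells[0], germ_cells[1]))
```

The model makes the contrast explicit: mitosis preserves the chromosome number across two cells, while meiosis halves it across four, and fertilization brings it back to 2n.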

1.6 Relating Genes to Inherited Traits

The preceding discussions on dominant and recessive forms, and genotypes
and phenotypes, can be interpreted at the molecular level by relating them
to how genes determine inherited traits. In simple terms, a gene can exist in a
functional form, so that it is expressed through transcription and translation to
yield a gene product (a specific protein) that exhibits its normal function.
However, a gene can also be non-functional due to a mutation, for example,
resulting in either the absence of a gene product, or a gene product that does not
function properly. Therefore, a homozygous dominant genotype, such as AA,
means that both alleles in the chromosome pair are functional. A genotype of Aa
will still have one functional copy of the gene that permits the synthesis of the
functional protein. A homozygous recessive (aa) individual either does not produce
the gene product or produces a nonfunctional one. A gene controls an
inherited trait through its expression, in that the gene product determines the
associated inherited characteristic. Genes with multiple alleles can be explained
by the difference in the efficiencies of the functions of the gene products.
Another explanation is that one copy of the gene produces a lower amount of the
gene product than the corresponding normal (functional) gene.
An example can be drawn from the genetic disorder of obesity in mice.
Obese (ob) is an autosomal recessive mutation on chromosome 6 of the mouse
genome. The normal gene encodes the Ob protein, which functions in a signal
pathway for the body to adjust its energy metabolism and fat accumulation (see
Sect. 18.4). Mice carrying 2 mutant copies (ob/ob) of the gene develop progressive
obesity with increased metabolic efficiency (i.e. increased weight gain
per calorie of intake). Mice with the ob/ob genotype do not produce the gene product
(Ob protein), because both copies of the ob gene are nonfunctional.

Fig. 1.5. Schematic comparison between mitosis and meiosis

1.7 Why Gene Cloning?

The general objective of gene cloning is to manipulate protein synthesis.
There are several reasons why we want to do this.
1. To produce a protein in large quantity. Large-scale production of
therapeutic proteins has been a primary focus of biotechnology. Many proteins
of potential therapeutic value are found only in minute amounts in biological
systems. It is not economically feasible to purify these proteins from their natural
sources. To circumvent this, the gene of a targeted protein is inserted into a
suitable host system that can efficiently produce the protein in large quantities.
Examples of pharmaceuticals of this type include human insulin, human growth
hormone, interferon, hepatitis B vaccine, tissue plasminogen activator, interleu-
kin-2, and erythropoietin. Another area of great interest is the development of
“transpharmers”. The gene of a pharmaceutical protein is cloned into livestock
animals, and the resulting transgenic animals can be raised for milking the
2. To manipulate biological pathways. One of the common objectives
in gene cloning is to improve crop plants and farm animals. This often involves
alteration of biological pathways either by (A) blocking the production of an
enzyme, or (B) implementing the production of an exogenous (foreign) enzyme
through the manipulation of genes. Many applications of gene cloning in agriculture
belong to the first category. A well-known example is the inhibition of
the breakdown of structural polymers in the tomato plant cell wall by blocking the
expression of the gene for the enzyme involved in the breakdown process (using
the antisense technique). The engineered tomatoes, with decreased softening, can
be left to ripen on the vine, allowing full development of color and flavor. Another
example is the control of ripening by blocking the expression of the enzyme that
catalyzes the key step in the formation of the ripening hormone, ethylene.
On the other hand, new functions can be introduced into plants and
animals by introducing a foreign gene for the production of new proteins that
were not previously present in the system. The development of pest-resistant
plants has been achieved by cloning the gene for a bacterial endotoxin. Other
examples include salt-tolerant and disease-resistant crop plants. Similar strategies
can be applied to raise farm animals with built-in resistance to particular diseases.
Animals carrying cloned growth hormone genes show an enhanced
growth rate, increased efficiency of energy conversion, and an increased protein-to-fat
ratio. All these translate into a lower cost of raising farm animals, and a lower
price for high quality meat.
A number of human genetic diseases, such as severe-combined immu-
nodeficiency (SCID), are caused by the lack of a functional protein or enzyme,
due to a single defective gene. In these cases, the defect can be corrected by the
introduction of a healthy (normal, therapeutic) gene. The augmentation enables
the patient to produce the key protein required for the normal functioning of the
biological pathway. “Naked” DNA such as plasmids containing the gene
encoding specific antigens can be used as therapeutic vaccines to stimulate
immune responses for protection against infectious diseases.
3. To change protein structure and function by manipulating its gene.
One can modify the physical and chemical properties of a protein by altering its
structure through gene manipulation. Using the tools of genetic engineering, it
is possible to probe the fine details of how proteins function by investigating
the effects of modifying specific sites in the molecule. This technique has
generated a vast amount of information on the mechanisms by which
important proteins and enzymes function.
For therapeutic applications, many proteins are engineered to
modify their structure and activity. For example, crosslinking the variable domains
of different monoclonal antibodies by short peptide linkers can form single-chain
bispecific antibodies that are less immunogenic and have enhanced tissue penetration.
Glycoengineering has been applied to introduce sugar moieties into
antibodies to improve solubility and increase the half-life of the protein.
Modifying the proteolytic cleavage site of coagulation factor VIII enhances its
resistance to inactivation for improved pharmacokinetic properties.
For illustration of the impact of gene cloning, some application exam-
ples are covered in Part III (for agriculture) and Part IV (for medicine and related
areas) of this book.

1. Define: (A) a gene, (B) transformation, (C) a clone, (D) expression.
2. What is a vector used for?
3. List some applications of gene cloning.
4. Describe the differences in structural features between prokaryotic and eukaryotic
5. Match by circling the correct answer in the right column.
Homozygous dominant RR, Rr, rr
Homozygous recessive RR, Rr, rr
Heterozygous RR, Rr, rr

6. Tongue rolling is an autosomal recessive trait. What are the genotypes and pheno-
types of the children from a heterozygous female married to a homozygous dominant
7. Hemophilia is a sex-linked trait. Describe the genotypes and phenotypes of the sons
and daughters from a marriage between a normal male and a carrier female.
8. Identify the differences between mitosis and meiosis.

Mitosis Meiosis
(A) Number of daughter cells
(B) Haploid or diploid
(C) One or two divisions
(D) Germ cells or somatic cells

9. Why is it that a dominant allele corresponds to a functional gene? Why is it recessive
if a gene is nonfunctional?
chapter 2

Structures of Nucleic Acids

What is the chemical structure of a deoxyribonucleic acid (DNA) molecule?
DNA is a polymer of deoxyribonucleotides. All nucleic acids consist of
nucleotides as building units. A nucleotide has three components: sugar, base,
and a phosphate group. (The combination of a sugar and a base is a nucleoside.)
In the case of DNA, the nucleotide is known as deoxyribonucleotide, because
the sugar in this case is deoxyribose. The base is either a purine (adenine or
guanine) or a pyrimidine (thymine or cytosine) (Figs. 2.1 and 2.3). Another type
of nucleic acid is ribonucleic acid (RNA), a polymer of ribonucleotides also
consisting of three components – a sugar, a base and a phosphate. The sugar in
this case is ribose, and the base thymine is replaced by uracil (Sect. 2.7).

2.1 5′-P and 3′-OH Ends

In DNA, the hydroxyl (OH) group is attached to the carbon at the 3′
position of the deoxyribose. One of the three phosphates (P) in the phosphate
group is attached to the carbon at the 5′ position (Fig. 2.1). The OH group and
the P group in a nucleotide are called 3′-OH (3 prime hydroxyl) and 5′-P (5
prime phosphate), respectively. A nucleotide is more appropriately described as
2′-deoxynucleoside 5′-triphosphate to indicate that the OH at the 2′ position is
deoxygenated and the phosphate group is attached to the 5′ position.
A DNA molecule is formed by linking the 5′-P of one nucleotide to the
3′-OH of the neighboring nucleotide (Fig. 2.2). A DNA molecule is therefore a
polynucleotide with nucleotides linked by 3′-5′ phosphodiester bonds. The 5′-P
end contains three phosphates but in the 3′-5′ phosphodiester bonds, two of the
phosphates have been cleaved during bond formation. An important consequence
of the phosphodiester linkage is that DNA molecules are directional: one
end of the chain has a free phosphate group, and the other end has a free OH
group. It is important in cloning to specify the two ends of a DNA molecule:
5′-P end (or simply 5′ end) and 3′-OH end (or 3′ end).
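The directionality of a DNA strand can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the strand is represented as a string of bases written in the 5′ to 3′ direction (a common convention, assumed here), and the function name `describe_strand` and the dictionary keys are hypothetical.

```python
VALID_BASES = set("ATGC")


def describe_strand(sequence):
    """Summarize a single DNA strand written in the 5' -> 3' direction.

    The first nucleotide carries the free 5'-P group, the last carries
    the free 3'-OH group, and each pair of neighbors is joined by one
    3'-5' phosphodiester bond.
    """
    sequence = sequence.upper()
    if not sequence or set(sequence) - VALID_BASES:
        raise ValueError("expected a non-empty sequence of A, T, G and C")
    return {
        "five_prime_end": sequence[0],
        "three_prime_end": sequence[-1],
        "phosphodiester_bonds": len(sequence) - 1,
    }


print(describe_strand("ATGC"))
```

Because the two ends are chemically different, the string "ATGC" read from the 5′ end is not the same molecule as "CGTA" read from the 5′ end, which is why the ends must always be specified.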


Fig. 2.1. Chemical structure of deoxyribonucleotide

Fig. 2.2. Polynucleotide showing a 3′-5′ phosphodiester bond

2.2 Purine and Pyrimidine Bases

The deoxyriboses and phosphate groups forming the backbone of a
DNA molecule are unchanged throughout the polynucleotide chain. However,
the bases in the nucleotides vary because there are 4 bases – adenine, thymine,
guanine and cytosine, abbreviated as A, T, G and C, respectively (Fig. 2.3,
Table 2.1). A and G are purines (with double-ring structures); T and C are
pyrimidines (with single-ring structures). Consequently, there are four different
