Molecular Systematics - David Hillis, Craig Moritz, Barbara Mable
Second Edition
Edited by
David M. Hillis
THE UNIVERSITY OF TEXAS
Craig Moritz
UNIVERSITY OF QUEENSLAND
and
Barbara K. Mable
THE UNIVERSITY OF TEXAS
FAX:413-549-1118
Internet: publish@sinauer.com
Printed in Canada
Contents in Brief
Part 1. Sampling
2. Project Design 17
3. Collection and Storage of Tissues 29
Part 3. Analysis
10. Intraspecific Differentiation 385
11. Phylogenetic Inference 407
12. Applications of Molecular Systematics:
The State of the Field and a Look to the Future 515
Contents
Preface
Preface to the First Edition
Contributors
Chapter 1
Molecular Systematics: Context and Controversies 1
Craig Moritz and David M. Hillis
THE EVOLUTION OF MOLECULAR SYSTEMATICS 1
The Link Between Molecular Evolution and Systematics 3
The Link Between Molecular Population Genetics and Phylogenetics 4
CONTROVERSIES IN MOLECULAR SYSTEMATICS 5
Molecules versus Morphology 5
Types of Characters and Methods of Analysis 6
Homology and Similarity in Molecular Systematics 7
Gene Trees and Organismal Phylogeny 9
Constancy of Evolutionary Rates 10
Neutrality of Molecular Variants 11
Data Quality and Presentation 11
SCOPE AND USE OF THIS BOOK 12
FOR FURTHER STUDY 12
Part 1 Sampling
Chapter 2
Project Design 17
Peter R. Baverstock and Craig Moritz
INTRODUCTION 17
STATISTICAL CONSIDERATIONS 18
MOLECULAR SYSTEMATICS 18
Studies of Population Structure 19
Studies of Species Boundaries and Hybridization 22
Phylogenetic Relationships 25
CONCLUDING REMARKS 27
Chapter 3
Collection and Storage of Tissues 29
Herbert C. Dessauer, Charles J. Cole, and Mark S. Hafner
INTRODUCTION 29
REGULATIONS GOVERNING ACQUISITION OF SPECIMENS 30
REMOVING AND PRESERVING TISSUES IN THE FIELD 30
General Procedures 30
Procedures Unique to Animal Tissue Collection 33
Procedures Unique to Plant Tissue Collection 35
Collecting Cell Lines 35
TRANSPORT OF TISSUES FROM FIELD TO LABORATORY OR BETWEEN LABORATORIES 36
Shipping Regulations 36
Sources of Liquid Nitrogen and Dry Ice 36
STORAGE OF TISSUES ON RETURN FROM THE FIELD 37
STABILITY OF MACROMOLECULES DURING LONG-TERM STORAGE 37
DEVELOPMENT AND SUPPORT OF SYNOPTIC TISSUE COLLECTIONS 39
Disposition of Tissues for Long-Term Preservation 40
Curatorial Problems Unique to Tissue Collections 40
EXISTING COLLECTIONS 41
Chapter 7
Nucleic Acids II: The Polymerase Chain Reaction 205
Stephen R. Palumbi
INTRODUCTION 205
PRINCIPLES AND COMPARISON OF METHODS 206
General Principles 206
The Cycle 207
Choosing Reaction Conditions 209
PCR Components 210
The Thermal Cycler 211
Primers and Primer Design 212
ASSUMPTIONS 214
APPLICATIONS AND LIMITATIONS 215
Types of Amplifications and Types of Data 215
Protocol 1: DNA Isolation for PCR 222
Protocol 2: The Polymerase Chain Reaction 225
Protocol 3: PCR From RNA 229
TROUBLESHOOTING 230
Avoiding PCR Problems: PCR Hygiene 230
Some Common Problems with PCR 230
Problems with Single-Strand Amplifications 231
USEFUL PRIMERS 232
Nuclear Ribosomal Gene Primers 232
Animal Mitochondrial Gene Primers 235
Chloroplast DNA Primers 239
Intron Primers 240
More Information about PCR 245
Appendix: Stock Solutions 246
Chapter 8
Nucleic Acids III: Analysis of Fragments and Restriction Sites 249
Thomas E. Dowling, Craig Moritz, Jeffrey D. Palmer, and Loren H. Rieseberg
PRINCIPLES AND COMPARISON OF METHODS 249
General Principles 249
Assumptions 255
Comparison of the Primary Methods 257
APPLICATIONS AND LIMITATIONS 266
Choice of Sequence 266
Population-Level Comparisons 268
Species-Level Comparisons 276
Higher-Level Systematics 279
LABORATORY SETUP 282
PROTOCOLS
Protocol 1: Isolation of Animal mtDNA Using CsCl-PI Gradients 283
Protocol 2: Isolation of cpDNA Using Sucrose Step and CsCl-EB Gradients 289
Protocol 3: Digestion of DNA with Restriction Endonucleases 290
Protocol 4: Agarose and Polyacrylamide Electrophoresis 291
Protocol 5: Staining with Ethidium Bromide 297
Protocol 6: 32P 3' End-labeling of Restriction Fragments 297
Protocol 7: Primer Labeling for Microsatellite Analysis 298
Protocol 8: Transfer Hybridization 299
Protocol 12: Mapping Restriction Sites 302
INTERPRETATION AND TROUBLESHOOTING 308
RFLP Analysis 308
Troubleshooting 314
Microsatellites 317
Appendix: Stock Solutions 319
Part 3: Analysis
Chapter 10
Intraspecific Differentiation 385
Bruce S. Weir
BIOLOGICAL CONTEXT 385
Genetic and Statistical Sampling 387
Fixed and Random Models 388
STATISTICAL METHODS 389
Fixed Populations 389
Random Populations 394
APPLICATIONS 401
Conditional Genotypic Frequencies 401
IMPLEMENTATION 402
Sampling 402
Analysis 403
AN EXAMPLE 403
CONCLUSION 405
Chapter 12
Applications of Molecular Systematics: The State of the Field and a Look to the Future 515
INTRODUCTION 515
DATA ANALYSIS: ISSUES AND CONTROVERSIES 521
Trees versus Networks 521
Combined versus Separate Analyses of Multiple Data Sets 522
Hypothesis Testing and the Parametric Bootstrap 523
Phylogenetic Accuracy 526
Acknowledgments 545
Measurement Symbols 548
Glossary and Abbreviations 549
Literature Cited 560
Index 636
Preface
It is gratifying to us to see how much the field of molecular systematics has
grown and matured since the first edition of Molecular Systematics appeared six
years ago. We have received a considerable amount of helpful advice and sug-
gestions about material that should be included in this book, and we have tried to
incorporate as many of these suggestions as possible in this new edition. Every
chapter has been completely updated, and most chapters have undergone major
revision and expansion. Because of this expansion and our desire to include a
new chapter on the polymerase chain reaction, we decided to drop the chapter
on immunological techniques from the new edition to keep the size of the book
within reason. Immunological techniques are no longer widely used in
systematic studies, so there was little need to update the summary from the
first edition.
In this edition, we have tried to incorporate more information on the
processes of molecular evolution. This is most visible in the chapter on
phylogenetic inference (Chapter 11), which now deals extensively with models of
nucleotide substitution. Throughout the remaining chapters, there is considerably
more information about applying the techniques to studying problems in molec-
ular evolution, although the emphasis is still on intraspecific and interspecific
systematic analyses. One of the trends in the molecular evolution literature that
has appeared over the past decade is a blurring of the distinction between studies
of molecular processes and studies of historical relationships among taxa. We see
this as a positive trend: as we learn more about how genes evolve, we have more
information to apply to the study of the history of populations and taxa;
conversely, population genetic and phylogenetic analyses are providing critical
contributions to the body of information on gene evolution. This reciprocal
illumination has resulted in rapid advances in the merging fields of molecular
systematics and evolution.
As with the first edition, we have relied heavily on outside reviewers for ad-
vice and assistance. Most of the individuals who helped in the preparation of the
first edition (listed in the Preface to the First Edition, which follows) have con-
tributed to this edition as well, and we thank them for their extensive help and
continuing enthusiasm for this project. In addition, Chris Austin, Marty Badgett,
Mike Charleston, Keith Crandall, Sandie Degnan, Joseph Felsenstein, Christina
James, John Huelsenbeck, Shane Lavery, Paul Lewis, Peter Lockhart, Phillip
Tucker, David Maddison, Wayne Maddison, Jim McGuire, and David Penny have
provided reviews or other assistance. Janet Young helped design the cover
illustration. The staff of Sinauer Associates has been extremely helpful in the
production of the book; we are especially grateful to Andy Sinauer, Chris Small, and
Carol Wigg for their dedication and work on this volume.
Preface to the First Edition
The need for a book on molecular systematics has been evident for many years.
However, no one person can possibly become a practitioner and at the same time
remain current in all of the molecular techniques used in systematic biology; the
technology changes too quickly. Thus, we decided in 1987 to organize a multi-
authored book on the subject. Because we were concerned about the possibility of
uneven treatment by the various authors, we structured the chapters carefully
and enforced the structure rigidly. We organized the book into three main sec-
tions that correspond to the three parts of every molecular systematic study: Sam-
pling design and execution, collection of molecular data, and data analysis. Our
hope is that this book can guide beginners all the way through a molecular sys-
tematic study, and at the same time provide established investigators with new
ideas, techniques, and approaches.
We use the term systematics in its broad sense to include the comparative
study of biotic diversity at any level. The goals of molecular systematics are also
the goals of systematics in general; this book deals specifically with molecular
approaches because of the unique problems of collecting and analyzing molecu-
lar data. We hope the book will also be useful to non-molecular systematists by
describing the principles, applications, and limitations of molecular techniques.
A book of this type must rely heavily on cooperation from expert reviewers,
and we have been fortunate to have extraordinary cooperation from the research
community. John Avise, John Gillespie, Morris Goodman, Mark Kirkpatrick, Irv
Kornfield, Mike Miyamoto, Colin Patterson, Vincent Sarich, and Allan Wilson
sent us detailed comments on several chapters each, and we thank them for their
considerable commitment of time. We also received very useful reviews of chap-
ters from Loren Ammerman, James Archie, Robert Baker, Peter Baverstock, John
Benzie, James Bull, Paul Chippindale, Joel Cracraft, Brian Crother, Ross Crozier,
Llewellyn Densmore, Michael Dixon, Rafael de Sá, Herbert Dessauer, John Gold,
Sheldon Guttman, James Hamrick, Richard Highton, John Kirsch, Mike Johnson,
Linda Maxson, Steve Palumbi, James Patton, Craig Pease, Eric Pianka, Michael
Ryan, Barbara Schaal, Charles Sibley, Montgomery Slatkin, Jerry Slightom, Carol
Stepien, David Swofford, D. Tagle, Bruce Weir, and Gregory Whitt. We appreciate
the time and effort that these reviewers have invested in this book.
Argye Hillis and Hamish McCallum provided invaluable statistical advice,
and Michael Dixon and Loren Ammerman assisted with figure preparation.
Thomas White provided advice and prepublication information on the poly-
merase chain reaction. We thank Linda Davis, Brad Garton, Diana Hews, Beth
Reid, and Vicki Young-Lehmeier for assisting with the correction, handling, and
translating of computer files of the chapters. Andy Sinauer has contributed to
every stage of the book, from planning and organizing to production; we thank
him for his personal interest and concern for this book. The National Science
Foundation and the Australian Research Council have provided generous sup-
port for our research in molecular systematics; this support provided us with the
experience in a diversity of molecular techniques that we needed to edit this
volume. Some of the travel involved in editing was generously supported by the
University of Queensland.
Finally, our wives Ann Hillis and Fiona Hamer have assisted us and
supported us throughout this project. We may never be able to repay them for all
their help, encouragement, and extraordinary patience.
Chapter 1
Molecular Systematics: Context and Controversies
Craig Moritz and David M. Hillis
THE EVOLUTION OF MOLECULAR SYSTEMATICS
For centuries, naturalists have tried to detect, describe, and explain diversity in
the biological world; this endeavor is known as systematics. The formalization of
a hierarchical system of nomenclature by Linnaeus (1758) established a frame-
work for describing and categorizing biological diversity. This hierarchical sys-
tem was initially independent of evolutionary theory, and in fact early evolu-
tionists (such as Buffon, 1753) opposed the Linnean system and the Aristotelian
essentialism it embodied. However, the Linnean system prevailed, and later evo-
lutionists (e.g., Lamarck, 1809; Darwin, 1859; Haeckel, 1866, reviewed by Mayr,
1983) simply co-opted the system to produce a classification based on
phylogenetic relationships (Figure 1). Initial efforts to reconstruct phylogenetic
history were based on few (if any) objective criteria, and estimates of phylogeny
were little more than plausible assertions by experts on particular taxonomic groups.
During much of the first half of the twentieth century, systematists were con-
cerned more with problems of species, speciation, and geographic variation than
with problems of phylogeny. In fact, the word phylogeny does not even appear in
the index to Julian Huxley's Evolution: The Modern Synthesis, published in 1942.
The situation began to change during the 1930s, 1940s, and 1950s, through
the efforts of individuals like the botanist Walter Zimmermann (1930; 1931; 1934;
1943) and the zoologist Willi Hennig (1950; 1966). They began to define objective
methods for reconstructing evolutionary history based on the shared attributes
of extant and fossil organisms. In the 1960s, these methods (and others) were
refined and developed into explicit criteria for estimating phylogeny. Algorithms
based on these criteria were soon implemented in computer programs, which allowed
the analysis of large and complex data sets. The past 30 years have continued to
see major conceptual and operational advances in the estimation of
phylogeny, as well as in the analysis of microevolutionary change, and now
studies of phylogeny are no longer limited to applications in biological
classification. Indeed, studies of phylogeny have permeated almost every
subdiscipline in biology, and comparative biologists of all sorts appreciate the
importance of phylogenetic methods for interpreting all kinds of biological
patterns and processes.
About the same time that these advances in methods for phylogenetic estimation
were developing in the 1960s, another sort of revolution was happening in
molecular biology. Methods for examining the molecular structure of proteins and
nucleic acids were soon adopted by evolutionary biologists, and the data
available for phylogenetic estimation began to increase exponentially, at least
for some taxa. This book is a summary of the methods and applications in
systematics that have developed out of these parallel advances in estimation
procedures, computational analyses, and molecular biotechnologies.
The six years between the first and second editions of this book have seen a
quantum leap in the number and scope of applications of molecular systematics.
Central to this increase has been development of new applications of the
polymerase chain reaction, or PCR, for investigating variation in DNA on a large
scale (Kleppe et al., 1977; Mullis and Faloona, 1987). In conjunction with the
design of broadly applicable sets of primers (see Chapter 7), gene amplification
methods have spawned increasingly large data sets on DNA sequence variation
within and between species. Gene amplification is also fundamental to new
approaches to DNA fingerprinting, such as microsatellite (Weber and May, 1989)
and RAPD (Williams et al., 1990) analyses. Symptomatic of this evolution, the new
edition of Molecular Systematics includes a separate chapter on PCR
amplification (Chapter 7).
Alongside the advances in biotechnology (indeed, driven by them) have been
improvements in the analysis of molecular variation within and among species.
Within species, the ability to obtain gene trees has encouraged the development
of coalescence theory (reviewed by Hudson, 1990) and the analysis of
phylogeography (Avise et al., 1987; Avise, 1994). More generally, the
sophistication of phylogenetic analysis has grown rapidly, with increasing
emphasis being placed on assessments of phylogenetic accuracy (reviewed by
Hillis, 1995). New methods of analysis relate not only to the generation of
phylogenetic hypotheses, but also to testing hypotheses about biogeography,
ecology, behavior, physiology, development, epidemiology, and almost every other
aspect of biology. Moreover, increased sophistication in the analysis of
character evolution (e.g., Maddison and Maddison, 1992) has greatly improved our
ability to investigate the intricacies of molecular data in relation to
evolutionary models and processes.
Our enthusiasm for new molecular tools and methods of analysis should not be
taken to mean that we advocate abandonment of old and proven techniques. Both
allozyme electrophoresis (Chapter 4) and cytogenetics (Chapter 5) have made major
contributions to evolutionary theory (M.J.D. White, 1973; Lewontin, 1974; Avise,
1994) and continue to be extremely cost-effective approaches for many
applications. There exists a broad spectrum of methods for analyzing variation in
DNA, including DNA-DNA hybridization (Chapter 6), various methods for generating
and analyzing DNA fragments (e.g., restriction enzyme analysis, microsatellites,
RAPDs, and multilocus DNA fingerprinting; Chapter 8), and DNA sequencing
(Chapter 9), each with characteristic strengths and limitations. A theme
developed throughout this book is that the technique and the molecule or form of
variation to be assayed must each be matched to a carefully defined problem,
assessed through pilot studies (Chapter 2), and the results subjected to
appropriate statistical analyses (Chapters 10-12). Only then will the full power
of molecular systematics be realized.
The Link Between Molecular Evolution and Systematics
There is a fundamental synergy between studies of molecular systematics and
molecular evolution. Molecular systematics uses genetic markers to make
inferences about population processes and phylogeny and in doing so creates a
substantial comparative database for specific genes or pro-
teins. Studies of molecular evolution use these data to evaluate rates,
processes, and constraints on molecular change through time (reviewed by Kimura,
1983b; J.H. Gillespie, 1991; Li and Graur, 1991). The results of molecular
evolutionary studies can then provide for more informed use of molecular markers
in population genetics and phylogenetic analyses.
This linkage of molecular systematics and evolution is most obvious in the
analysis of DNA sequence variation (Miyamoto and Cracraft, 1991; Crozier, 1993;
Simon et al., 1994; Chapters 11 and 12). It is clear from comparisons of closely
related sequences and analyses of pseudogenes and their functional paralogs that
substitutions between different bases occur at disparate rates (see Chapter 31
and Li and Graur, 1991). For instance, in animal mtDNA, the bias toward
transitions can be extreme. Protein-coding genes and the non-coding control
region of mtDNA may accumulate transitions 10 or more times more quickly than
transversions in some species (W.M. Brown et al., 1982; Irwin et al., 1991;
Kocher and Wilson, 1991), although other genes (such as the rRNA genes) may
experience very different substitutional pressures (Vawter and Brown, 1993). To
complicate matters further, the relative probabilities of substitutions between a
particular pair of bases (e.g., C ↔ T transitions) can be asymmetric, resulting
in biased base composition in the sequences compared. Correction for these
inequalities using weighted parsimony, maximum likelihood, or appropriate
estimators for sequence divergence increases the range of parameter space over
which estimates of phylogeny from DNA sequence data are reliable (Felsenstein,
1988a; Huelsenbeck, 1995; Chapter 11). Methods to correct for effects of
variation in base composition among taxa on phylogenetic analysis also have been
developed (e.g., Sidow and Wilson, 1991; Steel et al., 1993b; Lockhart et al.,
1994; Lake, 1994). Knowledge about various types of interactions among sequence
positions (e.g., Wheeler and Honeycutt, 1988; Korber et al., 1993) and
differences in probabilities of change across sites have led to objective
criteria for differential character weighting. An understanding of evolutionary
constraints also can guide the selection of genes to be used for a particular
problem. For example, contrary to earlier suggestions (e.g., Moritz et al.,
1987), it has become clear that genes with highly conserved amino acid sequences
may be less useful than those with high replacement rates for inferring phylogeny
of distantly related taxa if allowable substitutions in the former are rapidly
saturated (Graybeal, 1993, 1994). Molecular systematics also can contribute to
studies of molecular evolution beyond just providing comparative sequence data.
For example, estimated molecular phylogenies can be used to detect intragenic
recombination, exon shuffling, horizontal transfer, or gene conversion events
(e.g., Hughes and Nei, 1989; Valdes and Piñero, 1992; Robertson et al., 1995), to
test for heterogeneous rates of evolution (e.g., Easteal, 1990), and to identify
sequences subject to selection (Fu and Li, 1993; Klein, 1993). These examples are
far from exhaustive, but serve to illustrate the growing interplay between
molecular evolution and systematics in the analysis of DNA sequences.
The synergy between molecular evolution and systematics is also developing for
some forms of DNA fragment analysis. Assays of variation at microsatellite loci
are becoming increasingly popular for studies of intraspecific variation, but
there are concerns that the mutation rate may be so high as to overwhelm
information on population history and migration rate (reviewed in Chapter 8). In
particular, the effects of mutation and migration can be confounded, making it
difficult to use measures of among-population differentiation (e.g., FST) to
estimate migration rates. These difficulties can be overcome if the form and rate
of mutation at microsatellite loci is understood; indeed, measures of population
differentiation and genetic distance that incorporate specific mutation models
have been developed recently (Goldstein et al., 1995; Slatkin, 1995).
The Link Between Molecular Population Genetics and Phylogenetics
In the first edition of Molecular Systematics, we asserted that the field of
molecular systematics encompasses both intraspecific variation, tradition-
ally the field of population genetics, and interspecific diversity,
traditionally the field of phylogenetics. This linkage is fundamental to the
integration of molecular evolution and systematics discussed above and has been
enhanced by the use of allelic genealogies at both levels. Indeed, population
genetics is undergoing a renaissance fueled by the availability of information on
the molecular differences among alleles, which is often expressed as a phylogeny
(Avise, 1989). Coalescence theory (reviewed by Hudson, 1990) predicts the effects
of genetic drift, mutation, migration, and selection on expected times to common
ancestry of alleles. If the rate of nucleotide substitution is sufficient for the
allelic genealogy to be estimated, then inferences about historical population
size, gene flow, and selection events are possible (Slatkin and Maddison, 1989;
Slatkin, 1991; Slatkin and Hudson, 1991; Felsenstein, 1992; Hudson et al., 1992b;
Nee et al., 1995). A significant outcome of theoretical and empirical studies of
allelic genealogies within species will be an improved understanding of the
conditions for which gene and organismal trees are congruent in comparisons of
closely related species.
The methods described in this book vary in their ability to link population
genetics and phylogenetic analysis. Certainly, DNA sequencing and restriction
site analysis allow a close connection. However, some of the increasingly popular
methods for analyzing within-species variation, such as microsatellite analysis
and RAPDs, do not lend themselves to this approach because the homology of
alleles detected between species is questionable (FitzSimmons et al., 1995; J.J.
Smith et al., 1995; see Chapter 8). Other methods such as allozyme
electrophoresis (Chapter 4) or genomic DNA-DNA hybridization (Chapter 6) also do
not reveal individual gene genealogies and accordingly are less able to take
advantage of the developing interaction between molecular population genetics and
phylogenetics. Nonetheless, studies that combine sequence analysis or restriction
analysis with chromosomal or allozymic analysis provide an approach for linking
studies of allelic phylogeny to genetic analyses of populations or species (e.g.,
Moritz, 1991b; Radtke et al., 1995) or processes of molecular evolution (e.g.,
Moritz, 1991a; Hillis et al., 1991c; Bradley et al., 1993).
CONTROVERSIES IN MOLECULAR SYSTEMATICS
The collection of molecular data and their use in systematics has led to several
controversies, some of which have generated more heat than light. These
controversies include arguments about the relative value of molecular versus
morphological data, the types of data that should be collected, the various
philosophical approaches to analyzing data, the meaning of homology in relation
to molecular characters, the extent to which individual gene trees reflect
relationships among populations or species, the constancy of rates of molecular
evolution, and the neutrality of molecular variants. Some of these debates are
specific to molecular data, whereas others are general to all types of evidence
used to estimate phylogeny. Each of these debates is reviewed at length
elsewhere; here we merely outline the principal arguments and their implications
for molecular systematics. Another recent controversy, whether to analyze
multiple data sets separately or in combination, is discussed in Chapter 12.
Molecules versus Morphology
There has been considerable debate over whether molecular or morphological
features are inherently better sources of information for estimating phylogeny
(Patterson et al., 1993). Some have claimed that molecular characters are
relatively weak (e.g., Kluge, 1983), whereas others have claimed that
morphological characters are likely to be misleading or uninformative (e.g.,
Frelin and Vuilleumier, 1979; Sibley and Ahlquist, 1987a; Lamboy, 1994). Closer
examination shows this to be an empty argument (Hillis, 1987; Sytsma, 1990; Wiens
and Hillis, 1996). The real concerns for the practicing systematist are whether
the characters examined exhibit variation appropriate to the question(s) posed,
whether the characters have a clear and independent genetic basis, and whether
the data are collected and analyzed in such a way
that it is possible to compare and combine phylogenetic hypotheses derived from
them (see Chapter 12).
We suggest that the conflicts between molecular and morphological evidence have
been overemphasized. The development of molecular systematics has not resulted in
widespread refutation of phylogenetic hypotheses generated by morphologists,
although the molecular approach may be revealing in situations where
morphological variation is limited or the homology of morphological features is
unclear. Two recent controversies concerning relationships among eutherian
mammals illustrate this point.
Stimulated by the observation that flying foxes and their relatives
(megachiropterans) share a number of features of brain organization with primates
that are not present in other bats (microchiropterans), Pettigrew (1986) proposed
that the megachiropterans are more closely related to primates than they are to
the microchiropterans, and thus that wings and flying have evolved separately in
these lineages. This provocative hypothesis led to the generation of many
molecular datasets (Bennet et al., 1988; Adkins and Honeycutt, 1991; R.J. Baker
et al., 1991b; Mindell et al., 1991; Ammerman and Hillis, 1992; Bailey et al.,
1992; Stanhope et al., 1992) and critical reassessments of both the morphological
and molecular data (Pettigrew et al., 1989; Pettigrew, 1991a,b; R.J. Baker et
al., 1991b; N.B. Simmons et al., 1991; Pettigrew, 1994; Van Den Bussche et al.,
1996). Whatever the outcome for bat phylogeny, this debate has been very healthy
in focusing critical attention on the validity and interpretation of characters
and on potential sources of convergence of either morphological or molecular
features (e.g., see the section "Recognizing Systematic Errors" in Chapter 11).
Another example is the debate over cetacean relationships, where recent molecular
data have suggested paraphyly of toothed whales, which in turn led to a
reassessment of morphological characters (reviewed by Milinkovitch, 1995). In
this case there is disagreement between molecular data sets (Arnason and
Gullberg, 1994), and the conflicting hypotheses are again prompting critical
assessment of both the molecular and morphological evidence.
It also needs to be recognized that morphological and molecular approaches each
have distinct advantages and disadvantages. For example, most (but not all)
molecular data have a clear genetic basis and the total data set is limited only
by the genome size. On the other hand, morphological data can be obtained more
readily from ancient fossils (e.g., Gauthier et al., 1988) and extensive
preserved collections and can be interpreted in the context of ontogeny (Kluge
and Strauss, 1985; cf. Mabee, 1993). In general, studies that incorporate both
molecular and morphological data will provide much better descriptions and
interpretations of biological diversity than those that focus on just one
approach. Furthermore, it is possible to address some systematic problems only
with morphological data and other problems only with molecular data (see Hillis,
1987; Fernholm et al., 1989; Sytsma, 1990). This book is concerned only with
molecular variation because many issues are unique to molecular data and are
inadequately covered elsewhere, not because we view molecules as inherently
superior to morphological characters as markers of evolution.
Types of Characters and Methods of Analysis
The techniques of molecular systematics produce two fundamentally different types
of information: distance data, where differences among molecules are measured as
a single variable (e.g., DNA hybridization, Chapter 6); and character data, where
differences are measured as a series of discrete variables (characters), each
with multiple states. Character data can be converted to distances, but distances
usually cannot be converted into character data. Character data have some
advantages for data collection and analysis. It is relatively easy to add
information on new taxa to the data set (see Chapter 2) and data obtained from
different sources (i.e., other molecules or other types of attributes) can be
combined for analysis (see Chapter 12).
Arguments about different approaches to phylogeny estimation, such as the
relative efficiency, consistency, and robustness of the competing methods,
continue to dominate discussions of
phylogenetic analysis (see Chapter 11). Some molecular techniques inevitably restrict the range of applicable methods of analysis. This is not a problem as long as the remaining options can reliably estimate phylogeny, which in turn depends on the frequency with which assumptions specific to the method are violated and how sensitive the method is to those deviations. Considerable effort is being given to examining the robustness of alternative methods for phylogenetic analysis (Chapter 11) and estimation of population genetic parameters (Chapter 10). Development of new methods of analysis and their implementation in computer software packages also constitute a very active field. Nonetheless, the greatest obstacle to the incorporation of the new flood of molecular data in systematics remains a lack of adequate algorithm development and implementation, especially for the alignment and analysis of very large data sets.

Homology and Similarity in Molecular Systematics

The uses and misuses of the word homology frame a complex subject. Difficulties in its use arise as a result of differences in intended meaning between some molecular biologists and morphologists (Patterson, 1988), among molecular biologists (Reeck et al., 1987; Aboitiz, 1987; Dover, 1987; Wegnez, 1987; Hillis, 1994a), and depending on context (B.K. Hall, 1994).

In general, homology means inferred common ancestry, although it is commonly misused to mean similarity (Fitch, 1966, 1970; Reeck et al., 1987). Similarity is an empirical observation and can be quantified, whereas homology usually must be inferred and is not usually a quantifiable relationship. When a molecular biologist states that two proteins are "95% homologous," he or she usually means that the two proteins are the same at 95% of the amino acid positions (= 95% isologous; Wegnez, 1987), which may be used to infer that they are homologous. The statement is confusing because it is possible that 95% of one protein is homologous to the other, and that the remaining 5% is unrelated by direct ancestry (perhaps because of exon shuffling; see Hillis, 1994a). However, in most cases, it is likely that the two proteins are homologous across their length, and have simply diverged at 5% of the positions. Furthermore, there are several reasons that the proteins may be similar, including common ancestry (homology), convergence, and gene conversion (Patterson, 1988).

There are also several types of homology that must be distinguished. If the common ancestry of two sequences can be traced back to a speciation event, then they are said to be related by orthology (Fitch, 1970). If, on the other hand, the common ancestry of the sequences can be traced back to a gene duplication event, then the relationship is one of paralogy (Fitch, 1970). Homologous sequences also can be related through lateral gene transfer (via retroviruses, for instance), in which case the sequences are related by xenology (Gray and Fitch, 1983). The distinction is necessary because only orthologous sequences can be used to infer phylogeny of species. Confusion of paralogous and orthologous sequences can result in a correctly estimated phylogeny for the molecules that differs markedly from that of the organisms from which they were sampled. Consider the example in Figure 2. A gene duplication event in the ancestor of species 1, 2, and 3 gave rise to the two paralogous sequences, A and B. Subsequently, two speciation events gave rise to the three species, such that species 1 and 2 shared a more recent common ancestor (Figure 2A). One could potentially recover the phylogeny of the three species by examining the orthologous A sequences in each species, or by examining the orthologous B sequences in each species (Figure 2B). However, examination of paralogous sequences (e.g., A in species 1 and 3 and B in species 2) would result in incorrect inferences about species phylogeny but correct inferences about gene phylogeny (Figure 2C). Thus, for problems of species phylogeny, the sequences examined must be orthologous.

However, paralogous sequences within a species do not always evolve independently. Indeed, sequences that are repeated in tandem arrays rarely undergo independent evolution. Instead, the many copies evolve in concert (hence the name concerted evolution; Zimmer et al.,
[Figure 2: tree diagrams; tip labels 3A, 1A, 2A, 1B, 2B, 3B]
Figure 2 The consequences of using orthologous versus paralogous genes to infer phylogeny. (A) The phylogeny of a set of homologous genes in three species (1-3). A gene duplication event in the ancestor of the three species gave rise to two sets of paralogous genes (A and B), and two subsequent speciation events gave rise to orthologous genes in each of three species. (B) The phylogeny inferred from comparison of either set of orthologous genes. (Notice that this is the correct species phylogeny.) (C) The phylogeny inferred from comparison of two orthologous and one paralogous sequence. (This is the correct gene phylogeny, but not the correct species phylogeny.)
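The scenario in Figure 2 can be made concrete with a short sketch. The Python below is illustrative only and is not a published method: it encodes the gene phylogeny of Figure 2A as nested tuples, prunes it to three sampled genes, and reports which two species would be inferred as closest relatives. The names GENE_TREE, leaves, prune, and sister_species are hypothetical, and a label such as 1A is assumed to mean copy A sampled from species 1.

```python
# Gene phylogeny of Figure 2A as nested tuples (topology only, no branch lengths).
# Assumption: the root is the duplication that produced copies A and B, and each
# copy then follows the species phylogeny ((1, 2), 3).
GENE_TREE = (("3A", ("1A", "2A")), ("3B", ("1B", "2B")))


def leaves(tree):
    """Return the set of tip labels below a node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def prune(tree, keep):
    """Restrict a tree to the tips in `keep`, collapsing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (prune(child, keep) for child in tree) if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def sister_species(tree, sampled):
    """For exactly three sampled genes, return the two species inferred as sisters."""
    sampled = set(sampled)
    if len(sampled) != 3 or not sampled <= leaves(tree):
        raise ValueError("sample exactly three genes that occur in the tree")
    triplet = prune(tree, sampled)
    for child in triplet:
        if not isinstance(child, str):
            # The first character of each label is the species number.
            return frozenset(label[0] for label in leaves(child))
    raise ValueError("pruned tree is not a resolved triplet")


if __name__ == "__main__":
    # Orthologous sample: recovers the species phylogeny (species 1 and 2 are sisters).
    print(sorted(sister_species(GENE_TREE, ["1A", "2A", "3A"])))  # ['1', '2']
    # Mixed sample: a correct gene tree, but species 1 and 3 appear as sisters.
    print(sorted(sister_species(GENE_TREE, ["1A", "3A", "2B"])))  # ['1', '3']
```

Sampling the three A copies (or the three B copies) groups species 1 with species 2. Sampling A from species 1 and 3 and B from species 2 groups species 1 with species 3, because the two A copies share the post-duplication ancestor that the B copy lacks.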
1980) because of molecular processes such as biased gene conversion and unequal crossing over. Patterson (1988) suggested the term plerology to describe the relationship among paralogous sequences homogenized within a taxon as a result of concerted evolution. If the rate of concerted evolution is high enough, then all the copies in the repeated array may appear to be evolving almost as a single sequence, and the distinction between orthology and paralogy is blurred.

Sanderson and Doyle (1992) simulated the effects of concerted evolution on phylogenetic estimation. They found that when 70% of sites underwent homogenization between speciation events, the inferred trees represented the relationships of the taxa rather than of the individual genes. To reliably infer the gene trees among the paralogs, concerted evolution had to involve fewer than 10% of the sites between speciation events in their simulations. At intermediate rates of concerted evolution, the inferred trees usually confounded speciation and gene duplication events. Obviously, these estimates are dependent on the details of their simulations, but they give an approximation of the level of concerted evolution that is likely to confuse phylogenetic estimates of taxa or genes. Their results indicate that plerologous sequences can be used to infer relationships among taxa, as long as the rate of homogenization is demonstrably faster than the rate of speciation in the group of interest. For some sequences (such as nuclear ribosomal RNA genes), this condition appears to be met for most species comparisons.

The increased use of the polymerase chain reaction (Chapter 7) for in vitro amplification of DNA has increased the likelihood of practical problems associated with paralogy. Conserved
primer sequences for nuclear or organellar genes are likely to be conserved in paralogous pseudogenes as well. This may lead not only to the amplification of pseudogenes and other paralogs (see Chapters 7 and 9), but also to the amplification of in vitro recombination products among different alleles, functional paralogs, or pseudogenes (Saiki et al., 1988; Scharf et al., 1988a,b). Thus, the presence of the amplified sequences in the original organism must be confirmed, especially for studies of gene evolution where such recombinational artifacts are likely to be highly misleading. Verification of sequence fidelity may be accomplished through genomic restriction mapping or cloning techniques (Chapters 8 and 9).

Still another level of homology must be addressed if the sequences of homologous genes are to be compared. Even if two genes are known to be homologous (i.e., they are descended from a common ancestral gene), insertion and deletion events may have confused the positional homology of the individual nucleotide sites or amino acid positions. Because most phylogenetic inference methods depend upon accurate assessment of positional homology, as much (or more) attention should be given to sequence alignment as is given to analysis of the aligned sequences (see Chapter 9).

In restriction enzyme analyses, the potentially homologous characters are the restriction sites rather than the restriction fragments (Chapter 8). Two homologous sequences may not share any restriction fragments, even though all of the restriction sites in one sequence are also found in the other. Confusion of homology from using restriction fragments rather than restriction sites as characters has been shown to result in positively misleading analyses in experimental studies of phylogenetic inference (M.E. White et al., 1991; Hillis et al., 1994a). Similar problems plague inferences of homology in interspecific applications of some other DNA fragment techniques (FitzSimmons et al., 1995; J.J. Smith et al., 1995; Chapter 8).

Homology of protein loci in isozyme studies (Chapter 4) is inferred on the basis of a number of functional, structural, and expressional criteria. Most functional paralogous proteins are easily recognized as the products of distinct loci through differences in tissue specificity or electrophoretic migration, and nonfunctional pseudogenes do not cause difficulties since they are not expressed and therefore cannot be scored. This suggests that combined studies of allozyme electrophoresis and sequence analysis of the individual alleles have great potential for resolving problems of homology assignment in studies of gene evolution.

Finally, the term homologous has taken on at least two additional meanings in molecular biology. In cytogenetics, it is standard to refer to the respective chromosomes in a chromosome pair of a diploid organism as homologs and to refer to the homologous pair of chromosomes in another species as homeologs (Chapter 5), even though this is quite different from the use of homology in classical morphology (where homonomy is used to refer to a repeated structure in a single organism). In addition, a molecular probe is said to be homologous if it is used to study the same species from which it was derived, and heterologous if it is used to study an homologous sequence in another species. We prefer the terms homospecific and heterospecific for these latter meanings.

Gene Trees and Organismal Phylogeny

As DNA sequences have become easier to obtain, increasing emphasis has been placed on estimating gene trees and, from these, making inferences about relationships among populations or species. A major concern that arises is whether the gene tree reflects the organismal phylogeny (reviewed by Doyle, 1992; Avise, 1994). Assuming that the genes compared are truly homologous (see above), gene trees and organismal phylogenies can differ because of retention of ancestral polymorphisms, or reticulation among populations (i.e., gene flow) or species (i.e., hybridization). This is of particular concern for non-recombining segments such as organelle genomes because the effects of reticulation are potentially retained through subsequent generations (Doyle, 1992; Degnan, 1993b).

The process by which ancestral alleles are sorted among recently separated populations or species has been the subject of several theoretical analyses (e.g., Neigel and Avise, 1986; Pamilo and
Nei, 1988; Takahata, 1989; Wu, 1991). The relationships among gene lineages found in two populations progress from polyphyly to paraphyly and finally reciprocal monophyly following reproductive isolation. The rate at which this occurs is affected by the pre-existing geographic structure of the variation, demographic processes within each population, and, primarily, the effective population size (the process takes approximately 4Ne generations, where Ne is the effective population size; Neigel and Avise, 1986). Variation among loci is expected because of stochastic variation (Ball et al., 1990) and differences in effective population size (e.g., organellar genomes versus nuclear genes). Thus, differences in gene trees among populations or closely related species can arise purely because of lineage sorting effects (e.g., Hey and Kliman, 1993; Slade et al., 1994). Both theory and practice suggest that these effects can be overcome by combining data across a large number of loci (Pamilo and Nei, 1988; Atchley and Fitch, 1991; Slade et al., 1994), although there are potential drawbacks from combining data from distinct gene trees in a single analysis (Bull et al., 1993b; de Queiroz, 1993).

Reticulation will disrupt hierarchical patterns that are produced by an underlying process of lineage divergence (see the section "Trees versus Networks" in Chapter 12). On one hand, this disruption of phylogeographic structure provides the basis for estimating rates of gene flow among populations (Slatkin and Maddison, 1989). On the other hand, it can frustrate attempts to use phylogenetic methods to assess relationships among the populations or taxa involved, especially if the methods used assume an underlying tree structure for the relationships. In general, phylogenetic methods that assume a tree should only be used to infer population relationships from gene trees where those populations are effectively independent (i.e., where the rate of immigration is trivial compared to the rate of lineage extinction). In practice, phylogenies estimated from mtDNA have been very useful for generating or testing hypotheses about historical biogeography at geographic scales where migration rates are low (e.g., Bermingham and Avise, 1986; Moritz et al., 1992a; Moritz and Heideman, 1993; Joseph et al., 1995). Where there is substantial migration, approaches that do not assume a hierarchy of populations should be used to examine the extent of genetic differentiation in relation to geographic separation of populations (Lessa, 1990; Slatkin and Maddison, 1990; Slatkin, 1993; Crandall and Templeton, 1996).

Hybridization among long-separated species also can lead to introgression of a non-recombining gene (e.g., mtDNA or cpDNA) and, thus, discrepancies between gene and organismal phylogenies (reviewed in Avise, 1994). However, if relationships are assessed by several means, these discrepancies can provide insights into the evolutionary history of a species, in particular the role of hybridization (e.g., M.L. Arnold et al., 1991; Whittemore and Schall, 1991; Dowling and DeMarais, 1993).

Constancy of Evolutionary Rates

Early indications of a strong correlation between estimates of sequence divergence and divergence time (Zuckerkandl and Pauling, 1962) raised the exciting possibility that molecular comparisons could provide indications of the time of divergence for taxa where no fossils exist. Although most biologists now accept a broad correlation between amount of molecular divergence (at least for proteins and DNA) and time, recent evidence (reviewed in Chapter 12; J.H. Gillespie, 1991; Avise, 1994) indicates sufficient rate heterogeneity that one should not assume that rates are equal on an a priori basis. For instance, J.H. Gillespie (1987) has calculated the ratio of the variance of the number of substitutions to the mean number of substitutions that occur along a lineage as ranging from 1 to 35 for amino acid substitutions and 1 to 19 for silent substitutions, indicating considerable fluctuation in evolutionary rate (see also Lynch and Jarrell, 1993). This has significant implications for molecular systematics. Constancy of rates is an expectation of the neutral theory of molecular evolution (discussed below), is an assumption of a few methods for estimating phylogeny (Chapter 11), and is widely assumed in estimating time since divergence (Chapter 12).

To some extent, the arguments over the molecular clock stem from different expectations. The
utility of such a clock depends on the quality of information needed to test a specific hypothesis. If a clock indicates 3:20 P.M., but the real time could be anywhere from 12:20 P.M. to 6:20 P.M., the clock is only useful if one needs to know whether it is morning or afternoon. In some cases (e.g., hominoid divergences; see Sarich and Wilson, 1967), molecular estimates of divergence time have led to a substantial re-evaluation of fossil evidence. However, most purported tests of hypotheses about divergence time have ignored problems associated with calibration, and few have calculated appropriate confidence intervals. These confidence intervals can be so large in some cases that the term "clock," or even "sloppy clock," becomes meaningless (see Chapter 12).

Neutrality of Molecular Variants

A frequently voiced concern is that molecular characters are not neutral and that selection will bias analyses of intraspecific variation and estimates of phylogeny. This relates to a much broader argument over the evolutionary significance of molecular variation, the "neutralist-selectionist controversy" that has been a major concern of molecular population genetics since Kimura's (1968) seminal paper on the neutral theory (reviewed by Lewontin, 1974, 1985; Kimura, 1983b, 1986; Crow, 1985; J.H. Gillespie, 1991). There can be no doubt that many protein, chromosome, and DNA variants are acted on by selection; it also appears that much molecular variation is consistent with predictions of various modifications of the neutral theory (Sarich, 1977; Nei, 1987; Ohta, 1992). Thus, the debate is reduced to whether or not most molecular variants are selectively neutral (or nearly neutral), and whether neutrality or selection should be considered the null hypothesis. The current lack of a general testable theory of molecular evolution based on selection dictates that neutrality must usually serve as the null hypothesis. However, given the poor fit of many molecular data to the neutral theory (J.H. Gillespie, 1991), one should make a conscious distinction between testing for neutrality and simply assuming that it exists.

The impact of selection on systematic studies depends on the proportion of markers (loci) affected, the extent and sign of correlations among loci, and the robustness of the method of analysis to departures from neutrality (Chapters 2, 10, and 11). Also, it may be that selection is episodic rather than constant and therefore affects inferences based on interspecific versus intraspecific variation (e.g., McDonald and Kreitman, 1991) or intraspecific allele genealogies (e.g., Rand et al., 1994), but not those based on short-term dynamics of alleles (e.g., Easteal, 1985; Waples, 1989). Where deviations from neutrality are likely to bias analyses significantly, the assumption of neutrality should be made explicit, preferably in a way that can be tested. However, because most departures from neutrality are thought to be locus-specific, it is widely assumed that selection will have relatively minor effects on the overall analysis if numerous loci are examined.

Data Quality and Presentation

Population genetic or phylogenetic estimates can only be as accurate as the primary data themselves. Therefore, it is essential to confirm that apparent molecular variation has a genetic basis. This includes such obvious procedures as confirming DNA sequences using the complementary strand or overlapping primers and repeated runs (Chapter 9), using internal controls in allozyme electrophoresis (Chapter 4) and DNA hybridization procedures (Chapter 6), and verifying the repeatability of DNA fragment data (Chapter 8). However, it also includes some less obvious procedures, such as confirming that observed molecular variation exists in the organism, rather than being an artifact of in vitro recombination or polymerase errors (Chapter 7).

Inevitably, it is up to the investigator(s) to decide whether molecular data are accurate. However, the data should be presented in such a way that peer reviewers and readers can judge the technical quality and extent of the data themselves. Unfortunately, once techniques have become established in the literature, there has been a tendency on the part of editors and authors alike to dispense with the primary data, such as photographs of gels or chromosomes, raw experimental data, and even alignments of sequences! In practice, this can lead to unnecessarily acrimonious debates over data quality and interpretation (e.g., Cracraft, 1987; Sibley and Ahlquist, 1987; Sarich et al., 1989). This is a poor reflection on the field as a whole, and it is up to practitioners of molecular systematics to insist on rigorous standards of data quality and presentation. Increasingly, molecular journals are insisting that DNA sequences and alignments be entered into appropriate databases, but this does not ensure the accuracy of the sequences themselves. Clark and Whittam (1992) reported a low error rate (=1/1000 bp) for sequences in GenBank, and concluded that such an error rate would have little impact on molecular systematics of organisms or genes with high sequence diversity. However, this error rate could adversely affect population genetic analyses of species with low nucleotide diversity (e.g., humans; Li and Sadler, 1991), so even greater caution may be required for such applications.

SCOPE AND USE OF THIS BOOK

Molecular Systematics aims to provide an overview of molecular methods currently used to analyze diversity within and among species. The primary goal is to provide new workers in this rapidly expanding field with sufficient technical and theoretical information to enable them to select one or more appropriate methods, to design and implement a study, and to analyze the resulting data, all with maximum efficiency. In selecting an appropriate technique for obtaining data, the basic questions to be considered are (1) will it produce information compatible with the desired method of analysis; (2) is the signal-to-noise ratio likely to be sufficiently high to address the question(s) posed; and (3) is it cost-effective and feasible, given the available facilities and expertise? For practicing molecular systematists, these chapters may suggest alternative strategies for collecting and analyzing data and new perspectives on limitations and assumptions of familiar techniques.

The book has three major sections, each representing an important phase of a study: sampling design and methods (Chapters 2 and 3); methods for detecting and recording variation in proteins, chromosomes, and nucleic acids (Chapters 4-9); and methods for analyzing the data (Chapters 10-12). We have attempted to include a balance of viewpoints concerning different methods of data collection and analysis. Obvious omissions in this edition are amino acid sequencing and immunological methods, both techniques of great historical importance in molecular systematics (Goodman et al., 1987; Maxson and Maxson, 1990), but which have been largely replaced by nucleic acid sequencing for most applications in systematics. Otherwise, the coverage of techniques is fairly comprehensive. To facilitate comparisons, each of the molecular technique chapters is arranged into sections on (1) principles and comparisons of methods (including a discussion of assumptions); (2) applications and limitations; (3) laboratory setup; (4) protocols; and (5) interpretation and troubleshooting. A glossary of terms and common abbreviations is given after Chapter 12. Words and phrases included in the glossary appear in boldface type at their first appearance in the text.

For the most part, protocols are basic and well proven. Emphasis also has been placed on highlighting recent developments that appear particularly promising. However, for each approach, there is a wide range of alternative protocols not described here. This volume is designed to complement existing manuals that focus on a single approach (see works cited at the end of this chapter, as well as the manuals for phylogenetic analysis packages listed in Chapter 11). These more detailed sources should be referred to for additional background and alternative methods once a particular approach has been adopted.

FOR FURTHER STUDY

General References
Avise, J. C. 1994. Molecular Markers, Natural History, and Evolution. Chapman and Hall, New York.
Hoelzel, A. R. (ed.). 1992. Molecular Genetic Analysis of Populations. IRL Press, Oxford.
Hoelzel, A. R. and G. A. Dover. 1991. Molecular Genetic Ecology. Oxford University Press, Oxford.
Soltis, P. S., D. B. Soltis and J. J. Doyle (eds.). 1992. Molecular Systematics of Plants. Chapman and Hall, New York.
INTRODUCTION
Molecular systematic studies require particularly careful planning because they
are usually relatively expensive and may involve destructive sampling of the or-
ganisms (Chapter 3). The aim, therefore, should be to maximize the information
obtained per specimen; too few specimens may lead to an inconclusive or incor-
rect result; too many is sheer waste. Despite this requirement, molecular system-
atic studies seem especially prone to poor planning. Too often projects are well
advanced before it is realized that the sampling strategy is inappropriate, the
wrong tissues have been collected, the tissues have been stored inappropriately,
the wrong technique has been chosen, or far too many or far too few specimens
have been included.
Molecular systematic studies typically involve the following stages:
1. Define the problem
2. Conduct a pilot study
3. Determine the appropriate sampling strategy
4. Collect samples
5. Analyze the samples
6. Analyze the data
This chapter deals mainly with step 3, establishing the most efficient sampling
design. However, steps 1 and 2 will have a profound influence on step 3 and are
therefore considered. The remaining chapters of this book concern steps 4-6.
The cost of a project should take into account not only the cost of chemicals
and other consumables, but also the cost of time, which includes both the col-
lecting and the screening phases of the project. Thus, the sampling design should
18 Chapter 2 / Baverstock & Moritz
aim to minimize both the number of specimens and the handling in a way that remains compatible with the biological questions being asked. Indeed, the first and most important step is to define clearly the biological questions being asked. The questions should be stated in as specific and detailed a way as possible. It will be particularly useful to contrast formal hypotheses as a guide to further steps in the analysis.

STATISTICAL CONSIDERATIONS

At the outset, one needs to decide on the level of error that is acceptable. There are two types of errors: type I if the null hypothesis is rejected when it should be accepted, and type II if the null hypothesis is accepted when it should be rejected. Type II errors are difficult to define for most biological systems because the expected differences between two populations are usually unknown. They are usually expressed as the power of the test, i.e., 1 minus the probability of a type II error. The level of type I error one is prepared to accept depends on the consequences of being wrong. In biological studies it is usually set at 5%, but this limit should not be accepted blindly.

For example, a researcher may be testing the hypothesis that a particular species of commercial fish has a population structure characterized by isolated demes. The corresponding null hypothesis is that the entire species is a single panmictic unit. Before launching into a full-scale study, the researcher should conduct a pilot study to see if there is any suggestion at all that the fish population shows evidence of genetic substructuring. Here the type I error might be set at 20%, since the consequence of being wrong (i.e., rejecting the null hypothesis when it is true) is that a fuller study is carried out. However, in the full-scale study, a type I error might be set at 1%, since here the consequence of being wrong is that inappropriate management procedures are adopted for the fish species. By contrast, if the fish species concerned was endangered, the type I error might be set at 20%, since the consequences of (incorrectly) concluding that the species is a single panmictic unit may be disastrous for the recovery program.

The sample sizes required for a given level of type I error and a given power depend on the sampling variance. Many biological data follow a normal distribution, for which the mean and the variance are independent. However, genetic data such as allele frequencies determined by allozyme electrophoresis (Chapter 4) or DNA fragment studies (Chapter 8) may follow a binomial distribution where the variance ($s^2$) can be estimated from the mean:

$$s^2 = \frac{p(1 - p)}{n}$$

where p is the allele frequency and n is the sample size. For nuclear loci in a diploid population this distribution is appropriate if the genotype frequencies conform to Hardy-Weinberg equilibrium (HWE), otherwise resampling procedures should be used to estimate variances of allele frequencies (Chapter 10).

MOLECULAR SYSTEMATICS

Three main applications of molecular systematics will be considered here: studies of population structure (e.g., geographic variation, mating systems, heterozygosity, and individual relatedness), identification of species boundaries (including hybridization), and estimation of phylogenies. Each of these requires different approaches to virtually every phase of the study, from project planning through pilot studies to sample sizes, sampling strategies, methods of data collection, and data analysis. Therefore, it is necessary to have a clear idea of the aims of the study very early in the planning stage.

Determining the relationships of specific individuals (e.g., testing parentage) requires direct comparison of alleles at allozyme loci (Chapter 4) or, preferably, hypervariable loci (Chapter 8) between putative relatives. Sampling and statistical considerations are reviewed by Sensabaugh (1982) for allozymes and by others (e.g., Lynch, 1991a; Chakraborty, 1992; Chakraborty and Jin, 1993) for hypervariable loci. An important consideration is the need to have information on the frequencies of different alleles for the population
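The dependence of sampling variance on allele frequency and sample size, described under Statistical Considerations, can be illustrated with a minimal sketch. It assumes the standard binomial form p(1 - p)/n, with n counted as the number of gene copies sampled (2N for N diploid individuals under Hardy-Weinberg proportions). That reading of n, and the function names allele_freq_variance and allele_freq_se, are assumptions made for this illustration and are not definitions taken from the text.

```python
import math


def allele_freq_variance(p, n_copies):
    """Binomial sampling variance of an estimated allele frequency: p(1 - p) / n.

    Assumption: n_copies is the number of gene copies sampled, not individuals.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n_copies <= 0:
        raise ValueError("n_copies must be positive")
    return p * (1.0 - p) / n_copies


def allele_freq_se(p, n_individuals, ploidy=2):
    """Standard error of the allele frequency for a sample of individuals.

    Assumption: genotype frequencies are in Hardy-Weinberg proportions, so each
    individual contributes `ploidy` independent gene copies.
    """
    return math.sqrt(allele_freq_variance(p, ploidy * n_individuals))


if __name__ == "__main__":
    # How the standard error shrinks as more diploid individuals are sampled.
    for n in (10, 25, 50, 100):
        print(n, round(allele_freq_se(0.3, n), 4))
```

Because the standard error falls only with the square root of the sample size, quadrupling the number of individuals merely halves it. This is one way to see why very large samples yield diminishing returns, while very small samples leave allele frequencies poorly estimated.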
Project Design 19
in question in order to calculate exclusion/inclu- Units, but allele frequencies to define Manage-
sion probabilities (Lewontin and Hartl, 1991; cf. ment Units for monitoring of current populations.
Chakraborty and Kidd, 1991; Jin and Chakraborty, This has major implications for the choice of tech-
1995). Statistical considerations relevant to the niques and sampling design.
analysis of mating systems using RAPDs were To date, allozyme electrophoresis (Chapter 4)
discussed by Milligan and McMurray (1993). The has been the genetic technique most widely used
remaining applications are discussed below. to study the genetic structure of populations.
However, various DNA fragment methods are be-
Studies of Population Structure ing used with increasing frequency (Chapter 8).
The distribution of variation within and among
Background populations revealed by these methods may dif-
The genetic structure of a population is perhaps fer. Uniparentally inherited loci (e.g., mitochondr-
the most fundamental piece of information for a ial DNA; Y-linked loci; most chloroplast DNA) are
species that requires management. For some generally expected to show less variation within
species, the entire population may consist of a sin- populations and more between populations than
gle random mating unit; others may consist of a are biparentally inherited loci (e.g., autosomal nu-
series of small subpopulations, each largely iso- clear loci). Similarly, repeated sequences subject to
lated from other subpopulations (the stepping- strong concerted evolution (Chapter 8) also may
stone or islands model); still others may consist of have reduced levels of variation within popula-
a continuous population, but individuals within tions. Any such alteration in the distribution of
it exchange genes only with geographically prox- variation has important implications for sampling
imate individuals (the isolation-by-distance design. For hypervariable loci (e.g., microsatel-
model). Deciding which model best approximates lites), the large number of alleles and potential
the population structure of a particular species is genotypes complicate even the simplest statistical
usually the first step in understanding population analyses (e.g., tests of fit to Hardy-Weinberg equi-
biology. The three different models of population librium, Chapter 10). Altl-tough the theory to deal
structure result in different patterns of genetic with these data is still being developed (e.g.,
differentiation within and between geographic lo- Chakraborty, 1992; Chakraborty and Jin, 1993;
calities. Therefore, an analysis of the genetic struc- Slatkin, 1995) it is clear that large sample slzes are
ture of a species can give the investigator im- needed in combination with greater emphasls on
portant clues about the population structure randomization procedures for analysis.
(reviewed by Richardson et al., 1986; Slatkin, 1987;
see also Chapters 4 and 8). Pilot Studies
An important consideration is whether infor- The pilot study has three major aims: (1) to lind
mation is required on current population structure, genetic markers (i.e., polymorphic loci); (2) to de-
historical population structure, or both. Analyses termine whether the polymorphic markers are
of historical population structure are enhanced by suitable in a practical sense; and (3) to establish
information on the relationships among alleles as the feasibility of a large-scale sampling program.
well as their frequency and distribution (Avise,
1989; Hudson, 1990; Slatkin and Hudson, 1991; ESTABLISHING MARKERS Samples should be
Slatkin, 1993). However, in some circumstances, obtained from multiple populations representing
inclusion of information about allele phylogeny or a hierarchy from closely spaced to geographical-
divergences can be quite misleading about current ly distant sites to identify locally polymorphic
population processes (e.g., Avise et al., 1992a).For markers as well as those wit11 widespread varia-
threatened species, where population sizes and/or tion. It is at this point that the distribution of
migration rates may be changing rapidly, Moritz variation within versus among populat~olls
(1994) advocated the use of information on allele should be assessed.
phylogeny to define Evolutionarily Significant A suitable approach for the pilot study may
be to collect relatively large samples (e.g., n > 20) from two localities at the extremes of the range and smaller samples (n = 5) from several other localities. This represents a trade-off between the need to assess within-population variation (particularly for diploid nuclear loci) and among-population variation (particularly for diploid nuclear loci), and samples should be assayed for as many loci as possible, preferably including different genetic systems (e.g., allozymes or microsatellites as well as mtDNA or cpDNA). At this point, it may be appropriate to approach other laboratories with experience in specific approaches rather than spending resources establishing methods that turn out to be uninformative (i.e., monomorphic).

The actual number of genetic markers required will depend to some extent on the subtlety of the population substructuring encountered and the variance among loci. The need to examine a large number of loci is evident from the observations that the reliability of estimates of summary statistics such as heterozygosity, genetic distance, and FST depends more on the number of loci than on the number of individuals (Nei, 1978; Nei and Chesser, 1983; Chakraborty and Leimar, 1987; Slatkin and Barton, 1989; Leberg, 1992). One marker is clearly insufficient because the effects of selection and substructuring cannot be distinguished. Even two loci are insufficient because selection or linkage may give the same pattern for each. At the very least, three loci with multiple alleles at reasonably high frequency should be used. If fewer are found, some other approach to the problem should be explored. Weir (Chapter 10) suggests that a minimum of five polymorphic loci are needed to test the significance of population structure via resampling procedures.

SUITABILITY OF MARKERS Because a large number of samples may need to be screened in the main study, markers should be inexpensive and easily scored. Moreover, for diploid loci, it is highly preferable that heterozygotes can be clearly and consistently distinguished from both homozygotes. This can be a problem with multilocus minisatellite fingerprints, RAPDs, and some allozyme markers (e.g., Richardson et al., 1986), but not with RFLPs or microsatellites if a small number of loci is examined per gel (see Chapter 8). At this stage it is prudent to experiment with different tissues, different tissue treatment, different storage regimes, PCR conditions, etc. (see Chapters 3, 4, and 7) to improve the resolution of loci that are polymorphic but are proving difficult to score. These things should be sorted out before the main sampling program begins.

FEASIBILITY OF POPULATION SAMPLING PROGRAM The pilot study also gives the opportunity to test the feasibility of the main sampling program: Have all the logistic problems been sorted out? Is the spatial scale of sampling appropriate?

Sample Sizes and Strategies
Sampled localities may have different allele frequencies for a polymorphic locus, but the difference may go undetected because of small sample size (a type II error). The smaller the actual difference in allele frequencies, the larger the sample sizes needed to reliably detect it. Thus, the first consideration in setting sample sizes is the magnitude of the allele frequency differences one expects to encounter, which may be determined from the pilot study. The only other considerations are the level of type I and type II errors one is prepared to accept. Again, both can be reduced by increasing the sample size. Table 1 gives the minimum sample sizes required to detect given levels of allele frequency differences (assuming Hardy-Weinberg equilibrium) for various levels of type I error and various powers of test.

For example, let us assume that we have two diploid populations, both polymorphic at a locus with allele frequencies of 0.5/0.5 and 0.55/0.45, respectively. Our null hypothesis is that the two populations do not differ in allele frequency. Let us assume also that we have decided on a type I error of 5% (i.e., we will reject the null hypothesis only when the data have less than a 5% probability of occurring if the null hypothesis were true), and a power of 90% (i.e., if the populations really do differ by this amount, we want a 90% chance of rejecting the null hypothesis). This would mean that the two samples would each need to consist of at least 2081 individuals!
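The dependence of sample size on the allele frequency difference, the type I error, and the power can be explored with a short calculation. What follows is only a minimal sketch under stated assumptions, not the procedure behind Table 1: it uses the textbook arcsine square-root approximation for comparing two proportions with a two-sided test, and it treats each diploid individual as two independent gene copies. The function names are illustrative. Because the assumptions behind Table 1 may differ, the numbers this sketch returns need not agree with the tabulated values or with the figure quoted above; it is intended to show how steeply the requirement rises as the difference shrinks.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def gene_copies_needed(p1, p2, alpha=0.05, power=0.90):
    """Approximate gene copies per sample for a two-sided test of p1 vs p2.

    Assumption: arcsine square-root transformation with a normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the type I error
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    h = 2 * (asin(sqrt(p1)) - asin(sqrt(p2)))      # effect size on the transformed scale
    return 2 * ((z_alpha + z_power) / h) ** 2


def diploids_needed(p1, p2, alpha=0.05, power=0.90):
    """Diploid individuals per sample, assuming two independent gene copies each."""
    return ceil(gene_copies_needed(p1, p2, alpha, power) / 2)


if __name__ == "__main__":
    # Compare a frequency of 0.50 with progressively more different frequencies.
    for p2 in (0.55, 0.60, 0.70):
        n = diploids_needed(0.50, p2)
        print(f"0.50 vs {p2:.2f}: about {n} diploids per sample")
```

Halving the difference in allele frequency roughly quadruples the required sample, which is the practical point of the example in the text.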
Table 1. The number of diploid individuals in each of two samples required to detect given differences in allele frequency (Δp).
[Table body not recovered.]

An alternative way of viewing the problem of sample size is to look at what level of discrimination will be achieved for a given sample size. For example, with sample sizes of 100 in each of two diploid populations, only differences in allele frequency of at least 0.1 to 0.2 (depending on the actual allele frequencies) will be detected with a probability of a type I error set at 5% and the power set at 80%. Smaller differences in allele frequency between populations will be indistinguishable from sampling error even with such large sample sizes. Similarly, Chakraborty (1992) concluded that sample sizes of 50 are required to detect alleles with frequencies of p > 0.05 at hypervariable loci with >95% probability.

The situation with respect to estimation of variance in allele frequencies appears somewhat less demanding. For small numbers of populations, the sample size (n) needed to detect a given level of differentiation at a diploid locus among populations (using GST; see Nei, 1973) at least 50% of the time (i.e., a power of 0.5) with a type I error of 5% is approximately

[equation not recovered]

(Chakraborty and Leimar, 1987; Slatkin and Barton, 1989). For example, to detect a GST value of 0.05, samples of just 10 diploids per locality may be adequate.

The above examples apply to diploid loci where the only information extracted is on allele frequencies. The power of tests for population subdivision may be greater where they incorporate information on the molecular differences between alleles as well as their frequency (e.g., Excoffier et al., 1992). Based on simulation studies, Hudson et al. (1992a) concluded that such sequence-based statistics (e.g., NST; Lynch and Crease, 1990) are more powerful than statistics that consider only allele frequency when mutation rates are high and sample sizes small. Conversely, χ² tests based on frequency-based statistics (Chapter 10) usually will have greater power for detecting population subdivision when mutation rate and thus allelic diversity is low, that is, when

[equation not recovered]

where HT is the estimate of allelic diversity for the total population and n1 and n2 are the sample sizes for the populations compared (Hudson et al., 1992a). These results have implications not only for choosing the most appropriate statistics, but also for determining the most cost-effective way of measuring variation in the first place (see Chapter 8 for further details). In a similar vein, Lynch and Milligan (1994) found that the use of dominant RAPD markers to estimate population genetic parameters requires 2-10 times more individuals and more loci compared to codominant genetic markers (e.g., RFLPs, microsatellites).

The number and geographic pattern of localities that ultimately need to be sampled will depend to a large extent on the actual scale of substructuring, which may not be apparent until after the first round of sampling. For example, if following the first round of sampling the entire population conforms to a panmictic unit, it may be decided to conduct no further sampling. Alternatively, there may be no obvious geographic structuring, but a deficiency of heterozygotes (under HWE expectations) may be observed, prompting sampling at a finer geographic scale (e.g., Richardson, 1981; Johnson and Black, 1984). Therefore, the budget for the program should foreshadow the possibility of additional rounds of sampling. Weir (Chapter 10) recommends that at least five localities with n > 20 be sampled to provide for statistical testing via resampling methods.

It may prove useful to use a spatially hierarchical sampling strategy in both the pilot and subsequent studies, especially where the geographic scale of gene flow in the species is not apparent a priori. For example, Lavery et al. (1995a) sampled coconut crabs from different islands within an archipelago, islands from different archipelagos within an ocean, and islands in different oceans and found qualitatively distinct population structures at each scale. Other aspects of the species' biology may dictate further sampling requirements. For example, the strict two-year breeding cycle of some salmonids requires odd- and even-year breeders to be sampled and analyzed separately.

Repeat sampling of at least some localities is highly desirable to permit a direct assessment of sampling variance in allele frequencies or to detect artifacts arising from non-random sampling of a gene pool. The latter could arise if there are spatially or temporally separate groups, such as schools of siblings or genetically distinct cohorts (Richardson et al., 1986). Changes in allele frequency between generations, other than through sampling error, also permit an estimate of the effective population size, Ne, assuming that the alleles are selectively neutral (Waples, 1989). Also, the overall mean sampling variance can be used to estimate the size of a genetically uniform neighborhood in a continuously distributed species (Richardson et al., 1986).

Studies of Species Boundaries and Hybridization

Background
There has been considerable debate in the literature concerning the most appropriate definition of a species (see Endler, 1989; O'Hara, 1993). A commonly held view is the evolutionary species concept, according to which a species is "a single lineage of ancestral-descendant populations which maintains its identity from other such lineages and which has its own evolutionary tendencies and historical fate" (Wiley, 1978). For sympatric, sexually reproducing species, this reduces to the biological species concept (Mayr, 1969), according to which a species consists of a group of individuals capable of exchanging genetic material with each other but which are reproductively isolated from all other such groups.

There are at least five situations in which morphological data alone will be inadequate for defining species boundaries. First, two species may be sympatric (overlapping) or parapatric (abutting), but be so similar in morphology that their specific status goes undetected (e.g., Donnellan and Aplin, 1989). Second, two allopatric
(geographically separate) populations may be morphologically different, but their status as (biological) species is questionable. Third, two parapatric populations may be morphologically distinct, but show clinal variation or broad hybridization (e.g., Jackman and Wake, 1994). Fourth, two morphologically distinct forms may represent polymorphisms within a single interbreeding population (e.g., Titus et al., 1989; Hillis et al., 1991b). Fifth, an asexual species complex may have morphologically similar forms that arose independently from sexual species (e.g., Moritz and Heideman, 1993).

Of the various molecular genetic approaches that may be brought to bear on the problem, allozyme electrophoresis (Chapter 4) appears to remain the most generally applicable and efficient, although cytogenetic (Chapter 5) and DNA fragment (Chapter 8) analysis often can be useful as well (see the Summary in Chapter 12). With regard to defining species under the phylogenetic species concept, Davis and Nixon (1992) suggested that non-recombining loci such as mtDNA have advantages, although Moritz et al. (1992a) argued against the use of mtDNA phylogenies to define species boundaries because of the potential for differing patterns of geographic variation in nuclear versus organellar genes.

Different species usually have a fixed allelic difference at some of the loci screened in protein electrophoretic studies. Thus, for predominantly outcrossing species, the presence of sympatric cryptic species can be tested by looking for variable loci that lack heterozygotes, while the status of sympatric morphotypes can be evaluated by testing for significant differences in genotype or allele frequencies (Chapter 11). For allopatric populations and asexual populations, the aim is to assess the extent of genetic divergence between the populations being tested in relation to geographic variation within species. In all cases it is more important to maximize the number of loci screened than to maximize the number of individuals examined.

Pilot Studies, Sample Sizes, and Sampling Strategies
For sympatric outcrossing species, very small samples are adequate so long as they include both species. For example, let us assume that a sample of 10 individuals is collected. The null hypothesis under test is that all specimens belong to a single random-mating population. At one locus, 6 specimens are homozygous for one allele and 4 specimens are homozygous for a different allele; there are no heterozygotes. The best estimates of the allele frequencies at this variable locus are p = 0.6 and q = 0.4. The expected proportion of heterozygotes assuming Hardy-Weinberg equilibrium for a random-breeding population is 2pq, which in this case is 0.48. The probability of an individual not being a heterozygote is therefore 1 - 2pq = 0.52. The probability of all 10 individuals not being heterozygous is (0.52)^10 = 0.0014. Clearly, the null hypothesis is under serious challenge. If an additional locus is found that shows the same pattern of fixed differences involving the same individuals, then the hypothesis is that two species are involved. Other hypotheses that deserve consideration are that the species is actually asexual, that it is haploid, or that it has a very high level of self-fertilization.

In practice, investigators should aim for a minimum of two loci showing patterns of fixed differences that are consistent between individuals. This is necessary because an apparent lack of heterozygotes at a locus can result from other effects. For example, variation may not be under simple genetic control (e.g., Ldh-B in Mus domesticus; Shows and Ruddle, 1968), there may be null alleles, there may be ontogenetic variation (e.g., hemoglobin in vertebrates), or, at least in theory, there may be very strong selection against heterozygotes.

The above argument rests on the assumption that fixed differences will be found. Clearly, the more loci that are screened, the greater the chance of finding such loci if two genetically distinct species really are represented. Consequently, part of the aim of the pilot study should be to screen as many loci as possible. For allozyme analysis, it may be necessary to try different tissues and different treatments on a limited number of specimens to detect additional loci (see Chapter 4).

If the aim is to test whether two previously identified sympatric groups (e.g., distinctive morphotypes or chromosome races) are reproductively isolated, loci with shared polymorphisms also may be useful. These can be examined one locus at a time, testing for significant differences in allele frequencies (Table 1) or significant deficiencies of heterozygotes (Chapter 10). Alternatively, several loci can be examined simultaneously using disequilibrium statistics (e.g., Ryman et al., 1979; see Chapter 10). Such analyses usually will require much larger sample sizes than is the case if loci with fixed differences are used (A.D.H. Brown, 1975). Moreover, disequilibrium statistics should be interpreted with caution because significant disequilibrium can result from many factors other than the presence of two reproductively isolated groups (Hartl and Clark, 1989).

Once the presence of two species has been indicated, a follow-up study usually is required in order to find morphological features diagnostic for the species. Such studies often require multivariate analyses. It is unlikely that the original sample will be sufficiently large for a full multivariate morphometric analysis, especially when the possible effects of age and sex are taken into account. However, additional specimens need to be "typed" only for the diagnostic loci. The sample sizes required for this part of the study will depend on the subtlety of any morphological differences between the species.

Methods for assessing whether allopatric populations represent distinct species are controversial (e.g., Frost and Hillis, 1990; Davis and Nixon, 1992). For example, some have suggested that a certain level of genetic divergence is required for populations to be considered as separate species (e.g., Baverstock et al., 1977; Highton et al., 1989), although this approach has been strongly criticized (Avise and Aquadro, 1982; Frost and Hillis, 1990). Another approach is to compare genetic divergence between two allopatric populations suspected of representing distinct species with that between similarly separated populations within each form (e.g., Jackson and Pounds, 1979; Moritz et al., 1993). It has been argued elsewhere that for studies of species boundaries and relationships, the proportion of fixed differences between two samples is the most appropriate measure of genetic divergence (Richardson et al., 1986; see also Davis and Nixon, 1992). Using this approach, shared polymorphisms with very different allele frequencies may be incorrectly scored as fixed differences, but, from the point of view of assessing genetic divergence between allopatric populations, very different allele frequencies indicate high genetic divergence and are operationally equivalent to fixed differences. Once again, however, because the variance of the estimate of between-population variation depends mainly upon the number of loci (Nei, 1978), every attempt should be made to maximize the number of loci screened.

The pilot study should consist of screening about five individuals of each of the two geographic forms. If no fixed differences are found, there is no point in screening additional individuals or additional populations since increasing the population sample size can only reduce the estimate of fixed differences. Any additional effort should focus on examining additional loci. Only where potentially diagnostic differences are found should additional sampling be contemplated. Here, small samples (about five individuals for diploid organisms) should be screened, including samples from each of several geographically widespread populations of each of the morphological forms.

Hybrid Zones
The population genetics of hybrid zones is most readily investigated if fixed genetic differences are found between the parental taxa involved in hybridization, although some information can be obtained from polymorphic markers (see Chapters 4 and 8). Consequently, every effort should be made in the pilot study to discover such fixed differences rather than rely on allele frequency differences.

Three additional features of hybrid zones are salient to the project planning stage. First, genetic markers frequently show introgression over much broader geographic areas than might be predicted from morphology alone. Second, different genetic markers frequently show different levels of introgression. Third, uniparentally inherited non-recombining markers (such as mtDNA and cpDNA) provide information of a different kind from diploid nuclear markers (Chapter 8).

As a consequence of these considerations, it is useful to screen both nuclear diploid and uniparental haploid loci for fixed differences. Moreover, the pilot samples should be taken from localities well away from the hybrid zone itself, and should involve several populations of each of the parental taxa.

Phylogenetic Relationships

Background
The single most important component of the project planning stage of a phylogenetic analysis is the decision as to which method(s) or sequence(s) are appropriate to the phylogenetic question at hand. The method chosen must yield sufficient variation to be phylogenetically informative, but not so much variation that convergences and parallelisms overwhelm informative changes (see Chapter 12).

There is a considerable body of evidence suggesting that the rate of evolution at the molecular level is at least similar (i.e., within an order of magnitude) across most groups for a particular gene or set of genes. As a consequence, the method chosen will depend to a large extent on the time frame over which divergence has occurred for the study group (see Chapter 12). When the phylogenetic study begins, the time scale for the group in question probably will be largely unknown. Guesses based exclusively on morphology of extant forms are likely to be quite misleading because rates of morphological evolution vary enormously between groups (Cherry et al., 1982; Baverstock and Adams, 1987). Fossil data also must be interpreted with caution (A.C. Wilson et al., 1977). Therefore, the prime purpose of the pilot study should be to determine which molecular technique or techniques are appropriate to the study group.

Pilot Study
For the pilot study, it is desirable to sample individuals from taxa representing the two extremes of differentiation (i.e., two pairs of closely related taxa and two pairs of distantly related taxa). Again, it is desirable to evaluate the distribution and nature of variation within and between groups for different types of loci (e.g., allozymes; slowly versus rapidly evolving genes). It may be that some approaches work at higher taxonomic levels and others work at lower levels. The number of specimens examined per group can be quite small (even one), unless shared polymorphisms among species are a likely possibility (e.g., closely related species). If shared polymorphism is a reasonable possibility, then at least two larger samples (n = 10) should be included. Multiple populations should be examined for closely related pairs (Smouse et al., 1991).

In principle, these specimens could be subjected in the first instance to any one of the treatments discussed in Chapters 4-9 to obtain some idea of the most appropriate method/gene to be used in the main study. However, it would be most efficient to start with a method already available in the laboratory or to begin with a relatively cheap and fast method. If no technique is available locally, it would be wise to see if another laboratory with one of the techniques already established will run the pilot specimens rather than establish a method de novo that ultimately turns out to be inappropriate for the major study.

The pilot study will determine which technique(s) or gene(s) are most appropriate for the group, and hence how many specimens are needed for the main study, which tissues should be collected, and how they should be stored. The preliminary data can be used to project the size of the final data set that will likely be needed to achieve a well-supported estimate of the phylogeny (see the section "Hypothesis Testing and the Parametric Bootstrap," Chapter 12). It also should be possible at this stage to estimate the cost of the study in terms of both consumables and time.

Sample Sizes and Sampling Strategy
The number of specimens and populations needed per group to resolve relationships among groups depends critically on the amount of polymorphism relative to the extent of divergence. If, for a given method or sequence, the study indicates that virtually all of the variation occurs among groups, then it is appropriate to use small samples per group (e.g., Gorman and Renzi, 1979). However, even here it is necessary to include multiple populations of closely related species, particularly if non-recombining sequences such as mtDNA or cpDNA are being used (see Neigel and Avise, 1986; Smouse et al., 1991). Thus, it may be most efficient to conduct the sampling and analysis in two steps: the first to identify clades of closely related taxa and the second to add geographically remote populations for each of the members of such clades. If the chosen method and sequence reveal appreciable polymorphism (relative to divergence) in some or all taxa, then larger sample sizes will be needed to estimate phylogeny (Archie et al., 1989). In this case, correct choice of method of analysis (see Chapter 12) becomes even more important. For example, different methods of coding polymorphisms as character states are subject to very different levels of sampling error (Swofford and Berlocher, 1987; Chapter 11).

The number of species that must be included to obtain an accurate phylogeny represents a trade-off between sampling enough so that character changes can be accurately reconstructed (e.g., splitting long branches; Chapter 11), but not so many that phylogenetic analysis becomes unwieldy (e.g., M.W. Chase et al., 1993). It has long been recognized that phylogenetic inference is sensitive to the number and phylogenetic distribution of species included (e.g., Lanyon, 1985; Lecointre et al., 1993). Sampling of species is likely to be an iterative procedure, adding new taxa within groups as discussed above in relation to populations (e.g., Moritz et al., 1992a). At the minimum, we suggest that there should be replication of samples one level below the level of inference. For example, if the relationship of families is being examined, at least two (non-sister) genera should be examined per family where this is possible.

At least one outgroup taxon should be included in the analysis to root the tree (W.P. Maddison et al., 1984). In the absence of a suitable outgroup, the data for the ingroup can be used to produce an unrooted tree, which provides valuable information, but it is usual to aim for a rooted tree (Chapter 11). The use of more than one outgroup will be useful for jackknifing the final data set (Chapter 11). The outgroups should be as closely related as possible to the ingroup (preferably including at least one member of its sister group), but one must be certain that the outgroup taxa are indeed outside of the group under study. Including multiple members of the sister group is useful for reducing long-branch attraction problems (A.B. Smith, 1994), and multiple successive outgroups may provide a minimal test of ingroup monophyly.

Another important issue, particularly relevant to assessment of phylogenies by DNA sequencing or RFLP analysis, is how genes should be sampled. One consideration is the number and distribution of nucleotides that should be sampled within a single linkage group. Comparisons of phylogenies produced from various subsamples of whole vertebrate mtDNA genomes have indicated that blocks of contiguous sites are less likely to reproduce the whole-genome tree than samples of equivalent size drawn from nucleotide sites dispersed throughout the genome, apparently because of heterogeneity among regions in variability and base composition (Cummings et al., 1995). Thus, restriction site analysis or sequencing of multiple short stretches from sequence-tagged sites can provide more power than sequencing of longer contiguous segments. Consistent with this, Hillis et al. (1994a) found that restriction sites performed better than a similar number of variable sites within a continuous sequence at estimating a known phylogeny of T7 viruses. In recombining genes, sequencing of long continuous stretches will also increase the likelihood of spanning sites of recombination and thus obtaining reticulate gene trees.

A second consideration is the number of genes that should be analyzed. Particularly for closely related species, any one gene tree can differ from the species tree because of retained ancestral polymorphisms (Pamilo and Nei, 1988; Wu, 1991; Doyle, 1992; Hey and Kliman, 1993). In a study of pinniped seals, Slade et al. (1994) found that trees for individual nuclear (intron) genes differed, but that a tree based on concatenated nuclear sequences was congruent with both the mtDNA tree and the traditional phylogeny. One conclusion drawn from this study was that, for a given amount of effort, it may be preferable to combine several short sequences from unlinked nuclear genes rather than to maximize the information obtained from a single gene or linkage group.
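Combining short sequences from several unlinked genes, as in the concatenated analysis mentioned above, amounts to joining the per-gene alignments taxon by taxon. The following is a minimal sketch of that bookkeeping only; the function name, the toy sequences, and the use of "?" for an unsampled gene are all illustrative assumptions, and the sketch says nothing about how the combined data should then be analyzed.

```python
def concatenate_alignments(genes, missing="?"):
    """Join per-gene alignments ({taxon: sequence}) into one sequence per taxon.

    Taxa absent from a gene are padded with `missing` so that every row of the
    combined matrix has the same length.
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    combined = {taxon: [] for taxon in taxa}
    for name, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {name!r} differ in length")
        width = lengths.pop()
        for taxon in taxa:
            combined[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in combined.items()}


# Hypothetical toy data: three short, unlinked loci scored for three taxa.
genes = {
    "locusA": {"sp1": "ACGT", "sp2": "ACGA", "sp3": "ACTA"},
    "locusB": {"sp1": "GGC", "sp2": "GGT"},  # sp3 was not sequenced here
    "locusC": {"sp1": "TTAAC", "sp2": "TTAAC", "sp3": "TTGAC"},
}
supermatrix = concatenate_alignments(genes)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Keeping the genes in separate alignments until this final step also makes it straightforward to analyze each gene on its own and compare the resulting trees, as was done in the study cited above.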
Finally, the methods discussed in Chapters 4-9 fall into two broad categories: those that, by their very nature, yield distance data (e.g., DNA hybridization), and those that can yield character-state data (e.g., allozymes, chromosomes, fragment methods, and nucleic acid sequences). Methods that yield distance data alone require a different project strategy from those that yield character-state data. For character-state data, the cost (of both time and chemicals) goes up linearly with the number of taxa, whereas for distance data (where a matrix is required), the cost goes up with the square of the number of taxa. Therefore, for methods that yield distance data a sensible strategy might be to divide the project into several matrices, one providing the major branches for the group and others dealing with lower-order relationships.
CONCLUDING REMARKS
We have attempted to highlight the necessity of proper project planning in the use of molecular methods in systematics. Some of the common pitfalls can be avoided by careful project planning and the judicious use of pilot studies. A particularly common pitfall is to attempt to include all three applications of molecular methods in systematics (population structuring, species boundaries, and phylogenetic reconstruction) in a single study. Yet these three applications, although using similar techniques, have such different strategies that attempts to combine them are almost certain to be inefficient, or at worst fail on all three.
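The cost comparison made in this chapter for character-state and distance data can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it counts one comparison per pair of taxa, n(n - 1)/2, as a stand-in for a cost that grows with the square of the number of taxa, and it assumes that the backbone matrix uses a single representative from each subgroup. The function names and the example of 40 taxa in four groups of 10 are hypothetical.

```python
def pairwise_comparisons(n_taxa):
    """Number of distinct taxon pairs in an n x n distance matrix."""
    return n_taxa * (n_taxa - 1) // 2


def split_design_comparisons(group_sizes):
    """Comparisons needed when the project is divided into several matrices.

    Assumption (not from the text): one backbone matrix with a single
    representative per group, plus one full matrix within each group.
    """
    backbone = pairwise_comparisons(len(group_sizes))
    within = sum(pairwise_comparisons(size) for size in group_sizes)
    return backbone + within


if __name__ == "__main__":
    groups = [10, 10, 10, 10]  # hypothetical: 40 taxa in 4 groups of 10
    total_taxa = sum(groups)
    # Character-state data: effort grows linearly with the number of taxa.
    print("character-state units:", total_taxa)
    # Distance data in a single matrix: effort grows with the square.
    print("single matrix comparisons:", pairwise_comparisons(total_taxa))
    # Distance data divided into a backbone matrix and within-group matrices.
    print("split design comparisons:", split_design_comparisons(groups))
```

Under these assumptions, 40 taxa require 780 comparisons in a single matrix but 186 when the project is divided, which is the kind of saving the divided strategy aims for.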
Chapter 3
Collection and Storage of Tissues
Herbert C. Dessauer, Charles J. Cole, and Mark S. Hafner
INTRODUCTION
Research in molecular systematics requires plasmid, cell, or tissue samples in
which proteins and nucleic acids are maintained in the structurally intact physi-
ologically active state. In field work to obtain such material, the collector is con-
fronted with unique challenges. Field equipment often must include unusual
items such as liquid nitrogen tanks or dry ice, because freezing is the most effec-
tive method for preserving the widest variety of tissue constituents. Even after
obtaining the samples, collectors may encounter difficulties transporting them.
For example, airlines often are reluctant to accept liquid nitrogen as baggage.
Also, collectors working in foreign countries may find that customs officials re-
quire special permits for the importation of biologically active materials. In this
chapter we offer advice on meeting the challenges of collecting, packaging, and
preserving tissues, and we include a listing of collections of material for molecu-
lar studies and emphasize the need to develop such synoptic collections of these
materials.
30 Chapter 3 / Dessauer, Cole & Hafner
Figure 1. Materials and supplies that are useful for preparing and storing frozen tissues. (A) A plastic "French Straw" (right), assorted plastic containers, and a waterproof ink marker (top). The marker works on paper, plastic, and aluminum foil; the ink withstands freezing even in liquid nitrogen, so it can be used for both external and internal (backup) labels. The plastic straw and tubes can be used in liquid nitrogen if the lids are sealed well, but the small tubes with pop-off lids (bottom row) should be packaged in tightly wrapped foil or a larger tube for maximum security. (B) Plastic bags; these are acceptable for storage in electric ultracold freezers but are not recommended for use in liquid nitrogen. (C) Packets made by folding aluminum foil (see text). After tissues are sealed and folded within these packets (H, heart; L, liver; In, intestine and stomach; K, kidney; Sm, skeletal muscle), the package is wrapped in extra-heavy-duty aluminum foil, labeled by pressing the foil lightly with a ballpoint pen, and dropped into liquid nitrogen. (D) Glass tubes for collecting blood, sealant, and an example (bottom) of capped tubes being slid into a labeled piece of corrugated cardboard (see text for freezing instructions). (E) Hand-foot centrifuge for field work in areas without electricity (Dessauer et al., 1983). (F) Plastic box (with gasket) for long-term storage of samples in an electric ultracold freezer.
within a sheet of extra-heavy-duty aluminum foil. Such packets are readily customized to individual needs, they can be folded by assistants in advance of field work, and they transport efficiently in the flat (unopened) state.
Documentation of Samples
Careful documentation of samples is critical in all phases of work. A sample in a cryotube or other package is essentially useless if it has lost its label. It is important to: (1) label samples and specimens so that no information is lost in wrapping, transport, storage, and entering of data into permanent records; (2) cross-reference the tissue sample with field collection data for the original specimen; (3) label containers, laboratory notebooks, and experimental samples during study; and (4) list specimens examined in research publications. Ideally this citation will include the museum catalogue number for the voucher specimen (e.g., study skin, skull, preserved or dried body, pressed leaves) housed in a permanent repository. Although the museum or herbarium number may be assigned long after the tissue sample was collected, all records pertaining to the sample (e.g., field catalogue data, notes, and photographs) should be cross-referenced with the permanent voucher number.
We recommend that individuals collecting tissue specimens in the field continue to use traditional collector's catalogues (e.g., Remsen, 1977; Herman, 1980). These usually are organized so that each specimen receives a unique number preceded by the collector's initials. The catalogue entry should indicate the type(s) of tissues sampled. The package or tube containing the tissues should be marked clearly with the collector's initials and field number. The name of the specialist who provided identification of a specimen is an important part of the documentation.
Great care should be taken in labeling tubes and packages containing tissue samples. The following items have been reliable: (1) high-quality bond paper and a drafting pen with permanent, waterproof, non-smearing ink; (2) felt-tip pen with permanent ink that adheres to plastic tubes or packages; and (3) a ballpoint pen that leaves a clear impression in aluminum foil. Experiment by exposing your materials to liquid nitrogen or other ultracold conditions followed by thawing. In addition, we recommend use of a backup system (e.g., number written in ink and also etched on the tube with a diamond-tipped scribe; labels both inside and outside the package).
On occasion, it is not possible to cross-reference a tissue sample with a permanently preserved voucher specimen, such as when a blood sample is taken from an individual in a zoo or from one that will be released after temporary restraint in the field. Under such circumstances, photograph the individual to document its identification and record its tag, band, or other identifying number, if known.
Preservation
As soon as possible after collection and packaging, most tissues should be dropped directly into liquid nitrogen or covered with dry ice. Field workers should be aware that liquid nitrogen is potentially hazardous. Quick-freezing in liquid nitrogen generally shatters fragile glass hematocrit and microtubes filled with tissue fluids; such tubes must be frozen slowly before being subjected to ultracold temperature. In emergencies, a salt-ice mixture will substitute as a temporary refrigerant. Fragile capillary tubes and microtubes can be inserted into the slots of corrugated cardboard (Figure 1D) or into a plastic straw such as those used for sperm storage (Figure 1A) for protection during long-term storage.
Fresh, unfrozen tissue gives the highest yield of animal mtDNA (Chapter 8). Tissues have been maintained successfully for 7-10 days unfrozen, immersed in a mannitol-sucrose buffer containing 100 mM EDTA. Soft tissues and especially oocytes, which contain 100 times more mitochondria per cell than somatic cells, are the best sources of mtDNA (Lansman et al., 1981; J. C. Avise, personal communication).
Cryopreservation is not required for tissues collected for certain purposes. Although not recommended for long-term preservation, immersion of tissues in an aqueous solution containing 2% 2-phenoxyethanol preserves the physicochemical
a lightweight plastic centrifuge (Figure 1E; Dessauer et al., 1983) are useful for separating blood cells from plasma under field conditions. Methods for collecting hemolymph are described in bulletins 2, 5, and 15 (crustaceans), 2 (molluscs), and 37 (Limulus) of the Serological Museum (A. Boyden, 1948-1978).
Venom Collection
Collecting venom from snakes and other organisms is a dangerous task that should not be taken lightly. The fangs of the snake are inserted through a rubber or plastic membrane stretched across a collecting container. Many snakes discharge venom upon piercing the film. Additional venom may be obtained by gently massaging the region over the venom glands, avoiding undue compression, which may injure the glands and cause bleeding. If carefully handled, snakes can be "milked" repeatedly at 3-week intervals (Russell, 1980).
Procedures Unique to Plant Tissue Collection
Leaves, pollen, seeds, fern spores, and tubers of vascular plants have been preserved successfully for subsequent use in studies at the molecular level (Jensen and Fairbrothers, 1983). Many papers in the Serological Museum Bulletin (A. Boyden, 1948-1978) include valuable information on collecting and handling plant material. As with animal tissues, plant tissue samples must be packaged properly and labeled with the collector's field number.
Seeds, pollen, and fern spores should be harvested only when mature. Viability of mistletoe seeds was greatest when collected during their period of dehiscence (Nickrent et al., 1984). Fairbrothers (in Dessauer et al., 1984) recommended the following protocol for preserving seeds from most plants: (1) remove fleshy portion; (2) dry; (3) place in vacuum-sealed container; and (4) store in the dark at or below freezing. Pollen and fern spores may be treated in the same manner after screening to remove debris.
Young, actively growing vegetative tissues are more valuable for molecular study than are mature leaves (see Werth et al., 1985a). Leaves of certain taxa show a "senescence" phenomenon wherein their proteins disappear during seasonal aging. Immediately upon collection, leaf cuttings should be rinsed in distilled water, packaged, and frozen rapidly. Rapid freezing is particularly important when collecting leathery leaves, which tend to rot rather than dry. Proteins in leaves ranging from ferns to oaks have survived freezing for at least 3-4 years.
For maximum yield of DNA, leaves should be pressed, dried overnight at 42°C, then stored at room temperature (see also Chapter 9). The most important role of drying plant tissues may be the prevention of rotting, rather than the preservation of DNA. Although this method appears to preserve the integrity of DNA for several months, dried samples should be frozen at -70°C for long-term storage (Doyle and Dickson, 1987).
Chemical treatment to remove lipids and phenolic substances from freshly collected, vegetative tissue prior to long-term storage probably should be avoided. Doyle and Dickson (1987) found that treatment with preservatives used in anatomical studies, or with solvents such as ethanol and chloroform-ethanol, tended to cause degradation of DNA. Similarly, Coradin and Giannasi (1980) found that chemical treatment interfered with subsequent analyses of flavonoids.
Collecting Cell Lines
Cryopreservation of living cells requires special collecting, freezing, and storage procedures if cells are to survive intact (Stowell, 1965; Watson, 1978; Hay, 1979; Hay and Gee, 1984). Cell damage is most likely to occur during the freezing and thawing process. Some cells will survive freezing and thawing, if pretreated with a cryoprotectant such as glycerol or DMSO. For every species, tissue, and freezing system there is an optimum cryoprotectant concentration and freezing rate (Mazur, 1970). The cryoprotectant must be concentrated enough to protect the cells from freeze damage, yet dilute enough to avoid chemical injury to cells. Ideally, the rate of freezing should be controlled precisely, with rate depending on variables such as species, tissue type, size of the sample, cryoprotectant used, and the system used to
freeze the sample. Generally, a cooling rate that is too fast results in death due to formation of ice crystals within the cells; too slow a rate causes death from the chemical consequences of solute concentration. Nevertheless, it is possible to store and recover low yields of viable cells without a highly specialized freezing procedure.
Semen samples and tissue biopsies are easily obtained under field conditions without permanent injury to the donor animal. The equipment and supplies needed to establish proper freezing conditions in the field are not elaborate: alcohol, freezing medium, and liquid nitrogen in an appropriate tank (Maure, 1978). Plastic "French Straws" have been useful as containers for storage of semen. If the freeze rates used are less than optimal, manipulation during the thawing process may increase the yield of viable cells (Mazur, 1970).
The following tissue biopsy protocol is recommended by Hay and Gee (1984) for use in the field. The tissue is collected aseptically, minced into fragments of about 1 mm diameter, and placed in a culture medium containing 10% serum, antibiotics, and 10 to 12% DMSO. The tissue is allowed to equilibrate with this "freeze medium" for 2-3 hours in an ice bath or refrigerator, if available. The temperature is then lowered slowly to -50°C (approximately 1 degree per minute), after which time the tissue container is dropped into liquid nitrogen. The gradual cooling procedure may be carried out in the nitrogen dewar by suspending the sample in the cold space above the liquid nitrogen. Successful cultures of some tissues have been established using samples equilibrated in the freeze medium and frozen immediately in liquid nitrogen (H. A. Taylor et al., 1978; R. J. Baker, personal communication).
TRANSPORT OF TISSUES FROM FIELD TO LABORATORY OR BETWEEN LABORATORIES
Shipping Regulations
Tissues are usually transported either in styrofoam boxes packed with dry ice or in liquid nitrogen containers. Small samples are conveniently carried in personal baggage in thermos bottles containing refrigeration packs. Dry ice (referred to as "carbon dioxide, solid") and liquid nitrogen are classified as Restricted Articles by the International Air Transport Association, and the shipper must be aware of all pertinent regulations (see Dangerous Goods Regulations, 30th Edition, effective 1 January 1989).
Dry ice containers are usually accepted as baggage by the airline agent if the "Shipper's Certification for Restricted Articles" is attached to the package. Dry ice is designated as Hazard Class ORM-A, and packages containing dry ice must be so marked. No more than 200 kg of dry ice may be shipped in a single package.
Liquid nitrogen in nonpressurized, metal dewar flasks is also authorized for shipment by air. No more than 50 kg per flask can be shipped on an aircraft carrying passengers. The flask must be marked "Nitrogen (Liquid, Nonpressurized)" and must be further labeled to discourage loading or handling in any position other than upright. The upright position should be indicated prominently by arrows and the wording KEEP UPRIGHT placed at 120-degree intervals around the container. It must also be prominently marked DO NOT DROP - HANDLE WITH CARE. For shorter trips, the liquid nitrogen can be poured out and the tank checked as baggage. Most standard dewars are so well insulated that they will maintain a large mass of tissues frozen for many hours, sometimes days, after the liquid nitrogen has been poured off. If a tank contains few specimens, add plastic tubes filled with water to the tank to provide supercooled ice before pouring off excess nitrogen. A "Dry Shipper," which contains an absorbent that keeps nitrogen from spilling, alleviates problems of air transport of nitrogen in dewars; storage in some models is effective for up to 3 weeks. For maximum duration it is necessary to saturate the "Dry Shipper" initially with liquid nitrogen and keep it in the upright position. Although there is no danger of spills in the horizontal position, the boil-off rate is greater so that holding time is reduced.
Sources of Liquid Nitrogen and Dry Ice
Scientists conducting field work often have difficulty locating sources of liquid nitrogen and dry
ice. Universities, hospitals, welding supply companies, industrial gas suppliers, and mining operations are useful contacts in seeking sources of nitrogen and dry ice. A partial listing of such sources in different areas of the world is given in Dessauer and Hafner (1984).
STORAGE OF TISSUES ON RETURN FROM THE FIELD
The majority of tissue samples are stored in an electric ultracold freezer (-70 to -150°C), on dry ice, or in liquid nitrogen. Ease of access of samples within the freezer or nitrogen tank is of special importance. Samples can be stored in numbered moisture-proof boxes (Figure 1F). For easy retrieval, all specimens of a given taxon should be stored together; color codes on the outside of boxes facilitate the identification of contents. Within boxes each sample is identified by the collector's field number. A listing of the holdings in each freezer, complete with box number, contents of the box, and location in the freezer, is maintained and routinely updated as samples are moved, used, granted, or discarded.
Ultracold storage space is very expensive to purchase and maintain, and it is important that materials be stored in a space-efficient manner. Thus, access and inventory procedures for frozen tissue collections should be extremely well organized. Freezers should be opened as rarely as possible; ultracold freezers are sensitive to even brief periods of temperature warm-up, and every second that a freezer door is open while one searches for a particular sample is energy-consuming and could eventually contribute to freezer failure. One must know exactly where each sample is located before opening the freezer. A map on the door of the freezer facilitates location of individual boxes. A "working freezer" should be used for storage of tissues that are currently being analyzed.
On a cost-per-sample basis, an ultracold freezer provides the most convenient and efficient method for long-term storage of large numbers of samples. Of the two designs (chest and upright models), chest models maintain more constant temperatures during use and are less prone to mechanical failure; upright models occupy less floor space and freezer boxes are more easily retrieved. Freezers should be monitored at least once each day and should be equipped with both local and remote systems that will sound alarms in the event of electrical or mechanical failure. During holiday periods, special arrangements must be made to ensure that freezers are monitored daily. Readily visible notices should be posted in freezer areas, indicating persons (and telephone numbers) to be contacted in case of freezer malfunction. Some form of backup storage system (other freezers, liquid nitrogen, or dry ice) should be available in the event of freezer failure, and a generator should be available in the event of a prolonged blackout.
Ideally, liquid nitrogen is better than ultracold freezers for long-term storage because of its much colder temperature (-196°C). However, if large numbers of samples are stored in the collection, it may be difficult to retrieve individual samples. Also, continual replenishment of evaporated liquid nitrogen may become costly. A small number of samples stored in liquid nitrogen tanks is relatively easy to organize for efficient retrieval. Liquid nitrogen freezers are available that are large enough to organize up to 15,000 2-ml cryotubes with easy access to any tube.
Household refrigerators and freezers may be used to store freeze-dried blood fractions, acetone powders, seeds and pollen, and enzymes in strong salt solutions or glycerol. Isolated samples of DNA and bacterial cultures containing DNA for cloning purposes also may be stored in a household freezer (Sambrook et al., 1989). However, frost-free appliances should be avoided, because of the danger that biomolecules may degrade during warming cycles.
STABILITY OF MACROMOLECULES DURING LONG-TERM STORAGE
All biological macromolecules spontaneously decompose (Lindahl, 1993). Many proteins, however, are far more stable than is generally assumed (Sensabaugh et al., 1971a,b; Dessauer and Menzies, 1984). For example, remnants of blood samples used in Nuttall's (1904) classical immunological study of mammalian relationships
museums and herbaria worldwide are accumulating such collections (see listings at the end of this chapter). The goal is to develop collections that are as synoptic and diverse as traditional skin, skeletal, dried, and spirit collections.
have helped answer many research questions. Currently, individual scientists distribute samples of frozen tissues throughout the research community on an informal basis. It is hoped that, in time, a network of institutions will accept the responsibility for long-term maintenance, development, and distribution of such material. Unique curatorial problems posed by frozen tissue collections have been addressed at workshops organized by the Association of Systematics Collections in 1983 (Dessauer and Hafner, 1984) and in 1988, and a set of guidelines has been proposed for their curation (Dessauer et al., 1988).
The principle of obtaining large quantities of tissues when opportunity permits (rather than a small sample for a few experiments by a single scientist) greatly enhances the value of collecting efforts. The following principles (Dessauer et al., 1988) are intended to guide the development of synoptic collections of tissues representing the world's biota:
1. When collecting, do not limit efforts only to the specific material needed. Within the limits of permits, make general collections when opportunities arise, with emphasis on the unusual, the difficult to obtain, and on filling gaps in existing collections. Individuals who seek grants of tissue from large collections should also help to develop such collections.
2. When preparing specimens, discard as little material as possible. Obtain tissue samples for the research, select appropriate anatomical material to document the specimen by traditional means, then freeze as much of the specimen as possible for a synoptic frozen tissue repository.
museum symbolic codes). Examples of organisms from which tissues may be accepted without vouchers include large ungulates, proboscideans, and marine mammals such as whales; photographs can substitute as vouchers in such cases.
3. In the laboratory, use only as much of any sample as is necessary to complete the experiments. Conserve as much of the sample as possible for future use.
4. After completing the experiments, place the remaining samples in a formal synoptic tissue repository.
5. Personal collections should be discouraged, as they are often lost to science. If a personal research collection is maintained, arrange for its conservation and timely transfer to an appropriate institutional collection in the event of disability or death.
EXISTING COLLECTIONS
The following is a list of collections of materials useful for research in molecular biology, updated from the list of Dessauer and Hafner (1984). We welcome any additions and comments for improving this list in future editions of this book.
Collections are listed alphabetically by country and (for the United States) by state. Each collection cited includes collection location, person(s) in charge, specific information on collection size, materials, taxa, and geographic regions represented. Investigators who may wish grants from any of these collections should first read the preceding section, "Development and Support of Synoptic Tissue Collections," and, in particular, be prepared to assist in future collection growth or to reciprocate in other ways.
AMERICAN TYPE CULTURE COLLECTION
Department of Molecular Biology, 12301 Parkland Drive, Rockville, MD 20852
Size: 5; material: 1-3; taxa: 5; regions: A-F
Strengths: national repository for human and mouse DNA, probes, and libraries
Contact: Bill Nierman
UNIVERSITY OF MARYLAND
Department of Zoology, College Park, MD 20742
Size: 5; material: 6, 7; taxa: 2; regions: A
Strengths: Plethodon
Contact: Richard Highton
CAPTIVE PROPAGATION RESEARCH GROUP
Patuxent Wildlife Research Center, U.S. Department of the Interior, Laurel, MD 20708-4019
Size: 3; material: 6, 7, 8; taxa: 4; regions: A
Strengths: Semen and blood of cranes
Contact: George F. Gee
NATIONAL CANCER INSTITUTE
National Institutes of Health, P.O. Box B, Frederick, MD 21701
Size: 3; material: 1, 5-7; taxa: 5; regions: A-F
Strengths: Felidae; other carnivores; primates; marsupials
Contact: Stephen J. O'Brien
Michigan
WAYNE STATE UNIVERSITY
Department of Anatomy and Cell Biology, School of Medicine, Detroit, MI 48201
Size: 2; material: 1, 3, 6, 8; taxa: 5; regions: B-D
Contact: Morris Goodman
New Jersey
RUTGERS UNIVERSITY
Center of Theoretical and Applied Genetics, New Brunswick, NJ 08903-0231
Size: 5; material: 6; taxa: 1, 7; regions: A-F
Strengths: Deep-sea hydrothermal vent invertebrates
Contact: Robert C. Vrijenhoek
New Mexico
MUSEUM OF SOUTHWESTERN BIOLOGY
University of New Mexico, Albuquerque, NM 87131
Size: 6; material: 3-7; taxa: 1, 3-6; regions: A-C, E
Strengths: Mammals of southwestern USA and Bolivia
Contact: Terry L. Yates or William L. Cannon
New York
AMERICAN MUSEUM OF NATURAL HISTORY
Molecular Laboratory, 79th Street at Central Park West, New York, NY 10024-5192
Size: 3; material: 1, 6, 7; taxa: 2, 3, 5, 7; regions: A
Contact: Ward Wheeler or Rob DeSalle
UNIVERSITY OF ROCHESTER
Department of Biology, Rochester, NY
Size: 2; material: 5; taxa: 8; regions: A-C
Strengths: E. coli strains characterized phenotypically and genetically
Contact: Howard Ochman
BRONX ZOO
Department of Wildlife Management Services, Bronx, NY 10460-1099
Size: 4; material: 1, 3, 6, 7; taxa: 2-5; regions: A-F
Strengths: zoo and aquarium specimens, Tibet and Patagonia
Contact: Dan Wharton or George Amato
Ohio
CINCINNATI MUSEUM OF NATURAL HISTORY
University of Cincinnati, Cincinnati, OH 45202-1401
Size: 3; material: 6, 7; taxa: 2-5, 9; regions: E (Philippines only)
Contact: Robert S. Kennedy
Oregon
U.S. NATIONAL FISH AND WILDLIFE SERVICE
Forensic Laboratory, 1490 E. Main St., Ashland, OR 97520
Size: 5; material: 6, 7; taxa: 2, 3, 5; regions: A-D
Strengths: herps from Morocco and Spain; Eumeces; cervids; ursids, manatees
Contact: Stephen D. Busack (herps), Wayne F. Ferguson (mammals)
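Each entry in the list above summarizes a collection in a compact line of the form "Size: ...; material: ...; taxa: ...; regions: ...". For readers who wish to tabulate these entries, the following sketch shows one possible way to expand such a line into explicit lists of codes. It assumes that a hyphen denotes an inclusive range and that parenthetical remarks can be dropped; the function names are hypothetical, and the meaning of each code must still be taken from the key to the codes, which this sketch does not reproduce.

```python
import re


def expand_codes(spec):
    """Expand a code list such as '1,3-6' or 'A-C, E' into individual codes."""
    codes = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (piece.strip() for piece in part.split("-", 1))
            if low.isdigit() and high.isdigit():
                # Numeric range, assumed inclusive (e.g., 3-7).
                codes.extend(range(int(low), int(high) + 1))
            else:
                # Single-letter range, assumed inclusive (e.g., A-C).
                codes.extend(chr(c) for c in range(ord(low), ord(high) + 1))
        else:
            codes.append(int(part) if part.isdigit() else part)
    return codes


def parse_entry(line):
    """Parse one 'Size: ...; material: ...; taxa: ...; regions: ...' line."""
    # Drop parenthetical remarks such as "(Philippines only)".
    line = re.sub(r"\([^)]*\)", "", line)
    record = {}
    for field in line.split(";"):
        key, _, value = field.partition(":")
        key = key.strip().lower()
        value = value.strip()
        if key == "size":
            record[key] = int(value)
        elif key:
            record[key] = expand_codes(value)
    return record


if __name__ == "__main__":
    print(parse_entry("Size: 6; material: 3-7; taxa: 1,3-6; regions: A-C, E"))
```

A dictionary per collection makes it straightforward to filter the list, for example for all collections whose region codes include a given letter.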
INTRODUCTION
Protein electrophoresis, the migration of proteins under the influence of an electric field, is among the most cost-effective methods of investigating genetic phenomena at the molecular level. Since the origin of starch gel electrophoresis (Smithies, 1955) and the histochemical visualization of enzymes on gels (Hunter and Markert, 1957), and the classic studies of H. Harris (1966), Hubby and Lewontin (1966), and Lewontin and Hubby (1966), a major revolution in understanding micro- and macroevolutionary processes has occurred. Using enzymatic and nonenzymatic proteins, numerous investigations have focused on enzyme efficiency, estimating and understanding genetic variability in natural populations, gene flow, hybridization, recognition of species boundaries, and phylogenetic relationships, among other problems. The frequency of such investigations has not waned in recent years, but rather has increased as refinements and new methods have been developed.
Two general forms of protein data can be gathered simultaneously using electrophoretic methods. One is derived from isozymes, which are all functionally similar forms of enzymes, including all polymers of subunits produced by different gene loci or by different alleles at the same locus (Markert and Moller, 1959). The other data set consists of allozymes, a subset of isozymes, which are variants of polypeptides representing different allelic alternatives of the same gene locus. Both forms of data are important in molecular systematics, and both involve proteins that can be separated on the basis of net charge and size.
52 Chapter 4 / Murphy, Sites, Buth & Haufler
Here we provide a review of applications, step-by-step instructions on how to establish a horizontal starch gel electrophoresis laboratory, perform protein electrophoresis, stain for specific enzymatic and non-enzymatic proteins, and interpret the resultant gels. Although other methods of protein electrophoresis exist, using media such as cellulose acetate gels (Richardson et al., 1986), we have chosen to detail horizontal starch gel methods because of their widespread use and efficiency. Ways of avoiding or recovering from common pitfalls are described. The electrophoretic principles and methods described are applicable to all organisms.
Where possible, we provide inexpensive alternatives to costly equipment and methods, but not at the expense of increased health risk. As with most molecular methods used in systematics, some aspects of the data gathering pose extreme health risks, both acute and chronic. Therefore, the appropriate level of caution, as known to us, is always given the highest priority.
PRINCIPLES AND COMPARISON OF METHODS
General Principles
they are attracted to neither the (positive) anode nor the (negative) cathode.
Uncharged amino acids are either non-polar and hydrophobic or polar. These amino acids can become hydrogen-bonded to one another, resulting in folding (β-structure) or helical (α-helix) configurations, termed secondary structure. Depending on the primary and secondary structure, the molecule usually undergoes additional folding to form its tertiary structure. The shape and size of a protein also may have an effect on protein migration, depending on the pore size of the electrophoresis matrix. To some extent the shape of a particular protein is determined by the relative charges of adjacent amino acids because of the effect of like charges repelling and different charges attracting. Finally, many proteins contain more than one polypeptide chain (subunit) bound together by hydrogen bonds, van der Waals forces, ionic bonds, disulfide bridges, and/or hydrophobic interactions. Proteins having more than one polypeptide (multimeric) have a quaternary structure (Darnell et al., 1986).
Some forms of electrophoresis separate proteins on the basis of net protein charge Q, shape as measured by radius r, strength of the electric field d, and viscosity of the suspension medium n, as given by the following equation:
in the absence of selection, drift, and migration, the frequencies of alleles in a randomly
mating population will maintain a stable equilibrium with genotype frequencies of AA = p²,
Aa = 2pq, and aa = q², where p is the frequency of allele A, and q is the frequency of the
alternative allele a. Nonconformity to the prediction of Hardy-Weinberg equilibrium indicates
that the phenotypic variation has a non-genetic basis or that one or more of the Hardy-Weinberg
assumptions is not met in the population. Thus, for example, the individuals may not be randomly
mating, or some natural selective force may be acting on the species, or genes from neighboring
populations may be migrating into the study site. If these principles of biochemistry, genetics,
and gel interpretation are followed, electrophoresis can yield many valuable insights for the
evolutionary biologist.

A major assumption in the use of allele frequency data to infer population structure is that
alternative alleles at a given locus are selectively equivalent or neutral (Kimura, 1983a,b), or
nearly neutral (Ohta, 1992). Exceptions to this assumption are known (see below), and accepting
neutrality for most protein polymorphisms also requires accepting largely untested or poorly
tested null hypotheses. However, in the absence of evidence for selection at a particular locus,
it has been suggested that studies begin with neutrality as a working assumption (Allendorf and
Phelps, 1981).

Comparison of the Primary Methods

The four primary methods of electrophoresis differ by the nature of the support medium: starch
gel (including both horizontal and vertical systems), polyacrylamide gel, agarose gel, and
cellulose acetate gel. Each method will be briefly described and discussed in terms of specific
advantages and limitations. Less frequently used methods of resolving protein variants are not
discussed herein; these include paper electrophoresis (Freifelder, 1982), isoelectric focusing
(Whitmore, 1990), immunoelectrophoresis, and two-dimensional electrophoresis (Harris and
Hopkinson, 1976; Hames and Rickwood, 1981).

Advantages of the four basic methods are compared in Table 1, although choice of method will
often be determined by availability of equipment and expertise.

Starch Gel Electrophoresis (SGE)

Hydrolyzed starch is heated in an ionic buffer solution and allowed to cool, thereby forming a
gel. The ratio of starch to buffer can be varied to alter the size of the gel pores. Pore size
allows for a sieving effect in the gel. Thus, these gels can separate on the basis of both size
(shape) and charge.

Two forms of SGE exist: horizontal and vertical. In horizontal SGE, a poured gel is allowed to
cool in a gel mold without further preparations. Vertical starch gels are poured into
double-sided molds having a "gel comb" or "well former" that makes the "gel wells" for holding
tissue extracts (Brewer, 1970; Morizot and Schmidt, 1990; see also Chapter 8). In general, the
vertical system requires a greater amount of starch and larger quantities of tissue extract,
allows for fewer samples to be run per gel, and is thus more costly. The advantages of vertical
SGE include the avoidance of the phenomenon known as electrodecantation: as proteins migrate on
horizontal gels, enzymes of high molecular weight tend to drop toward the bottom of the gel.
This may make slices from the upper regions of the horizontal gel inferior or inadequate for
resolving these proteins. Nevertheless, the method of horizontal SGE is used almost exclusively
in our laboratories and in the vast majority of other laboratories; the vertical system will
not be discussed further (for more information, see Siciliano and Shaw, 1976; Morizot and
Schmidt, 1990).

Polyacrylamide Gel Electrophoresis (PAGE)

Polyacrylamide gels are formed by the catalytic polymerization of monomeric forms of acrylamide
and bisacrylamide. It allows the separation of proteins on the basis of both size and charge
(Chrambach and Rodbard, 1971). The pore size of acrylamide gels can be controlled by altering
concentrations of acrylamide and/or bisacrylamide. This sieving attribute has made PAGE one of
the methods of choice in molecular biology laboratories examining nucleic acid sequences
because, unlike most other forms of gel electrophoresis, it allows for the accurate, controlled
separation of charged particles on the basis of molecular weight (Chapters 8 and 9). General
references to this system are found in Hames and Rickwood (1981).

Proteins: Isozyme Electrophoresis 55

Table 1
Comparison of the attributes of the four primary methods of protein electrophoresis
on gel support media
Attribute SGE PAGE CAGE AGE

Cellulose Acetate Gel Electrophoresis (CAGE)

Electrophoresis can be carried out on preformed cellulose acetate gels or strips. The gel form
of cellulose acetate is preferred because of repeatability of experiments (Harris and Hopkinson,
1976). A major advantage is that electrophoresis can be carried out with very small quantities
of tissue homogenate. Although the gel itself is premade, it must be soaked in the appropriate
buffer prior to electrophoresis. Due to the large pore size, CAGE has no sieving effect;
proteins are separated on the basis of net charge only and this may reduce the number of
variants identified (M.A. Riley et al., 1992). In addition, the large pores also cause
electroendosmosis, a "backwash" of buffer solution caused by gel charge groups that accelerates
the mobility of cationic isozymes but retards or reverses the anionic isozymes. Although this
problem occurs with SGE and AGE, it is more pronounced with CAGE (Harris and Hopkinson, 1976).
CAGE has been discussed in detail by Richardson et al. (1986).

Agarose Gel Electrophoresis (AGE)

Agar and agarose gels are prepared much in the same way as starch gels. Pure "agar" gels have a
relatively high concentration of acidic groups (carboxyls and sulfates), resulting in
considerable electroendosmosis and occasional adsorption of proteins, although adsorption
problems may be overcome by use of highly purified agarose (Harris and Hopkinson, 1976).
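
Returning to the Hardy-Weinberg expectation stated under General Principles (AA = p², Aa = 2pq, aa = q²), the comparison of observed and expected genotype counts can be sketched as below. This is a minimal illustration, not a procedure given in the chapter: the function name and the genotype counts are hypothetical, and the goodness-of-fit chi-square statistic is one common way to make the comparison, assumed here for a single locus with two alleles.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Return (p, expected_counts, chi_square) for one two-allele locus.

    p is the observed frequency of allele A; expected counts follow
    AA = p^2 * N, Aa = 2pq * N, aa = q^2 * N.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # each AA carries two A alleles, each Aa one
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    # Sum of (observed - expected)^2 / expected over the three genotype classes
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi_square


# Hypothetical sample that matches the expectation exactly
print(hardy_weinberg_chi_square(25, 50, 25))
# Hypothetical sample with no heterozygotes at all (strong deficiency)
print(hardy_weinberg_chi_square(50, 0, 50))
```

With two alleles the test has one degree of freedom, so a statistic above about 3.84 would reject Hardy-Weinberg proportions at the 5% level. As the text notes, such a rejection does not by itself say which assumption has failed.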
56 Chapter 4 / Murphy, Sites, Buth & Haufler
and FST = -0.142 for population G; the latter population was subject to a higher rate of troop
failure and recolonization. This pattern also correlates with significant population differences
in within-troop heterozygosity (FIS = -0.136 for W, FIS = -0.064 for G), although troops in both
populations displayed excess heterozygotes, a pattern predicted from ecological and behavioral
observations of mating and dispersion. The importance of obtaining independent estimates of
dispersal can be illustrated by K.L. Brown's (1985) study of the demographic and genetic
characteristics of dispersal in mosquitofish (Gambusia affinis) in a thermally heated pond on
the Savannah River Reservation, South Carolina. Genotype frequencies of the dispersers were
non-randomly distributed throughout the pond, and associations of genetic distance values and
geographical distances between collection sites indicated that the dispersers did not
constitute a random intermixing of refuge groups. Counter to intuitive expectations, dispersal
in these populations resulted in an increase in allelic differentiation between sites and an
increase in mean levels of intrapopulation heterozygosity.

Very few plant or animal species have adequate demographic data for estimating effective
genetic dispersal (Endler, 1979) and Ne. In the house sparrow (Passer domesticus), Fleischer
(1983) used demographic data to predict FST (Wright, 1943), and then tested the prediction with
allozyme data and concluded that these birds approximated a stepping stone model (Kimura and
Weiss, 1964) of genetic structure. However, a number of indirect approaches are now available
for inferring gene flow patterns entirely from the geographic distribution of allozyme
frequencies, and these appear to be robust for some classes of dispersal patterns (reviewed by
Slatkin, 1985, 1987, 1993; see also Slatkin and Barton, 1989; Lessa, 1990; A.H. Porter, 1990;
and Chapter 10).

Some population genetic surveys have revealed striking examples of heterozygote deficiency,
which could result from (1) strong selection against heterozygous genotypes, (2) inbreeding, or
(3) a Wahlund (1928) effect (the inclusion of two or more genetically distinct units into a
single population sample). In a number of studies, the inbreeding explanation is favored (see
O'Brien et al., 1983, 1985b, 1987; Knight and Waller, 1987; Crouau-Roy, 1988; Van Treuren et
al., 1991).

Paternity Studies

Allozyme studies combining ecological data on dispersal with genotypic data that establish
paternity of offspring have allowed assessment of the relative importance of several factors
affecting genetic structure in some mammals. An experimental removal and colonization study of
pocket gophers (Thomomys bottae; Patton and Feder, 1981) revealed that migration (i.e.,
recolonization) was nearly random with respect to the available source populations. This
movement depressed between-field genetic heterogeneity, but this was restored within a single
generation due to a high variance in male reproductive success. Juvenile dispersal apparently
was responsible for the maintenance of intrapopulation variability in highly socially
structured breeding groups of a Neotropical cave-dwelling bat (McCracken and Bradbury, 1977) and
colonies of yellow-bellied marmots (Schwartz and Armitage, 1980). Genetic markers revealed that
inbreeding was avoided by the near-total dispersal of male offspring from their natal colonies
in both of these species.

Paternity of specific groups of offspring has been studied in several groups using detectable
allozyme markers. For example, Tilley and Hausman (1976) collected female dusky salamanders
(Desmognathus ochrophaeus) and their broods, and showed that at least 7% of all individual
clutches were sired by more than one male. Insemination and fertilization are therefore
effectively uncoupled, allowing the opportunity for sperm competition, which has been shown in
controlled laboratory matings in D. ochrophaeus (Houck et al., 1985). Other studies of colony
breeding structure, based on either relatedness or parentage of specific broods as inferred
from allozyme markers, include Evarts and Williams (1987), Harry and Briscoe (1988), Quellar et
al. (1988), and Price et al. (1989). Among studies of plants, Stanton (1986) summarized the
ongoing problems in trying to assess the contribution of different fathers to successive
generations, and Murawski and Hamrick (1990) investigated the effect of density of flowering
individuals on the mating systems of nine
species of trees. The latter authors discovered that there was (not unexpectedly) great
variation among the density of flowering individuals, which in itself varied annually, and in
years of lower flowering-tree density there was greater heterogeneity and/or more selfing.

Species Boundaries

Because isozyme electrophoresis is a cost-effective method for screening a large number of
single-copy nuclear gene loci (see Appendix I), it will continue to be especially helpful in
multiple-population sampling efforts designed to determine species boundaries. Isozyme data
readily can be used as diagnostic markers in the sense described by Davis and Nixon (1992;
fixed differences between samples) for a priori identification of the basic units (species) of
phylogenetic analyses; this is especially critical in view of the potential for oversplitting
taxa defined exclusively by rapidly evolving portions of the animal mitochondrial genome
(Moritz et al., 1992a). Isozyme studies often reveal discordant geographic patterns between
levels of genetic divergence and taxonomic boundaries inferred from morphological data,
especially for geologically old and morphologically conservative radiations (see Wake et al.,
1983; Wake and Larson, 1987). For example, Larson (1989) has summarized 19 examples of pairs of
cryptic or morphologically very similar species of plethodontid salamanders differing by from 1
to 14 fixed allozyme markers, and similar cases are known in other salamanders (e.g., D.A. Good
et al., 1987; D.A. Good, 1989). Ranker and Schnabel (1986) used isozyme evidence to demonstrate
the genetic differentiation of two lily species whose only clear separation was a difference in
flowering time. J. Shaw et al. (1987) demonstrated the clear genetic differentiation of two
moss varieties, as did Odrzykoski and Szweykowski (1991) for the thallose liverwort. However,
the reverse pattern is also well documented: morphologically distinct taxa sometimes show
little or no genetic divergence (B.J. Turner, 1974; Echelle and Dowling, 1992). Nevertheless,
the many examples given above highlight the power of this approach for diagnosis of the basic
units of analysis, and these sometimes have profound conservation implications (Daugherty et
al., 1990; see also Crother, 1992; T.E. Dowling et al., 1992).

Ecological Genetics

The neutral theory of molecular polymorphism (Kimura, 1968; King and Jukes, 1969) has
questioned the primacy of natural selection as an agent of molecular evolution (Lewontin, 1974;
Nei and Koehn, 1983). Several statistical studies have concluded that most alternative alleles
are selectively equivalent and may represent transient stages of replacement, with fixation
probability being a function of mutation rate and effective population size (see Kimura,
1983a,b). However, these studies are based upon several largely untested assumptions, and may
not distinguish among various processes of neutrality and selection (W.B. Watt, 1985).
Alternatively, several elegant multidisciplinary studies have investigated the selective basis
for specific enzyme polymorphisms.

W.B. Watt (1985, 1986) outlined a bioenergetic approach for investigating possible functional
and ecological differences between alternate allozymes. Documentation of adaptive differences
in allozymes requires demonstration of (1) differences in a catalytic function, (2)
allozyme-based catalytic differences having physiological effects, and (3) fitness differences
in natural environments between physiological effects (see also Koehn, 1978; Powers, 1987).
Watt (1977) identified four common GPI alleles in several natural butterfly (Colias eurytheme)
populations, and demonstrated genotypic differences in survivorship, flight activity, and
mating success (Watt, 1983; Watt et al., 1985, 1986). Similar studies have been carried out on
bivalves (Koehn et al., 1980, 1988; Koehn and Immermann, 1981; Koehn and Siebenaller, 1981;
Hilbish et al., 1982; Hilbish and Koehn, 1985a,b; McDonald and Siebenaller, 1989), fishes
(Powers et al., 1979; DiMichele and Powers, 1982a,b; Allendorf et al., 1983; Leary et al.,
1984; Crawford and Powers, 1989; Ropson and Powers, 1989; Van Beneden and Powers, 1989; Ropson
et al., 1990), Drosophila (Heinstra et al., 1986; Barnes and Laurie-Ahlberg, 1986; M.W. White
et al., 1988), and marine amphipod crustaceans (J.H. McDonald, 1989; Patarnello and Battaglia,
1992). However, Eanes (1987) points out that many of these studies were not designed to
separate the effects of a single locus upon individual fitness relative to the contribution of
linked loci; future efforts will require strongly integrated approaches combining in vivo and
in vitro analyses and biochemical properties of segregating allozymes (Eanes et al., 1990).

Interspecific Applications

Phylogenetic Systematics

Allozyme data (and to a lesser extent, isozyme data) have been used extensively to investigate
phylogenetic relationships. Some recent reviews of phylogenetic applications include M.W. Smith
et al. (1982, 1994) for vertebrates, Matson (1984) and N.K. Johnson et al. (1984) for birds,
D.J. Crawford (1989, 1990) for plants, and Kilias (1987) for lichens. Buth (1984a) reviewed the
application of isozyme and allozyme data to systematic problems in general, and Mabee and
Humphries (1993) and Murphy (1993) provide recent evaluations of methods and suggestions for
phylogenetic analysis (see also Chapter 11). The methods of data analysis used for phylogenetic
analysis of these data vary widely and are highly controversial. Undoubtedly, the methods are
still in a relatively early stage of refinement and much remains to be developed. What seems
critical is that the assumptions associated with each method of analysis not be violated.
Because each method of analysis has its limitations, and some commonly used methods are simply
invalid (Buth, 1984a; Murphy, 1993; Chapter 11), the citations in the general reviews (and
below) should not necessarily be considered exemplary. Here, we restrict our comments to the
use of allozymes; isozyme characters such as the presence of duplicate loci, patterns of gene
expression, and the ability to form heteropolymers are considered later, in the section "Gene
Expression and Gene Duplication."

Allozyme characters are subject to many of the same limitations as other forms of systematic
data. For example, morphologically distinct species may show very low levels of divergence, and
so differ by few phylogenetically informative characters even when many loci are screened
(e.g., Murphy et al., 1983a). At the other extreme, allozyme divergence may have proceeded to
the point where too few electromorphs are shared, and many of those that are shared are
convergent (e.g., Sites et al., 1984; Derr et al., 1987; Dimmick, 1987).

There are many groups, however, for which allozyme divergence provides information appropriate
for analysis of intra- or intergeneric relationships. Allozyme electrophoresis is appropriate
for analyzing intergeneric phylogeny in birds (Gutierrez et al., 1983; N.K. Johnson et al.,
1988), snakes (Murphy, 1988) and, occasionally, other groups as well (e.g., Hafner and Nadler,
1988). Most studies have focused on intrageneric relationships (see reviews cited above), and
these are only informative when the individual loci are analysed as discrete characters
(Murphy, 1993; Chapter 11). Among other things, this has the advantage that the evidence for
each node is made explicit and therefore can be related back to the primary data (e.g.,
Crabtree, 1987; Sites et al., 1990; Wiens and Titus, 1991).

Modes of Speciation

Allozymes can be used to explicate the patterns and mechanisms by which new species are formed.
Changes in allozyme frequency have been used to identify incipient species (Aradhya et al.,
1991; Gottlieb, 1973; McPheron et al., 1988), study sibling species (Anderson and Oakeshott,
1984), analyze how adaptation has contributed to the process of speciation (Allegrucci et al.,
1967), differentiate between competing hypotheses for the origin of new species (Mayden, 1986;
Small et al., 1992), and explore the role that speciation plays in evolution (Mindell et al.,
1989, 1990). Even though DNA analyses can be applied to these kinds of questions, the ease with
which electrophoresis can be used to survey large numbers of individuals for genetically
informative features makes allozyme analysis remain the tool of choice for many studies.

Paleobiogeography

Phylogenetic hypotheses formed from allozyme data can be applied to answering questions of
paleobiogeography. A primary method is Brooks
parsimony analysis, or BPA (Wiley, 1988a,b), also known as co-speciation analysis. In the first
step of this method, cladograms are constructed for the taxa in question (Brooks, 1981, 1990).
Next, geographic areas in which the species occur are designated as if they were taxa. Using
geological evidence, an area cladogram is constructed showing the historical connections among
the study areas. Next, the taxa are treated as if they comprised a completely polarized
multistate transformation series in which each taxon and each internal branch of the tree are
numbered. The taxa are then coded using non-redundant linear coding (O'Grady and Deets, 1987)
and the species names are replaced by their area names. A new area cladogram is constructed
based on the phylogenetic relationships of the species, and this new area cladogram is presumed
to represent the historical involvement of areas in the evolution of the species. Although this
is the preferred method of paleobiogeographic analysis, it has rarely been applied to allozyme
data (see also Kluge, 1988). A less preferable alternative is described below.

Rates of Evolution

Questions of evolutionary tempo are important considerations, especially if applying the
molecular clock (see Chapter 12) or examining relative rates among different kinds of data from
the same taxa, for example, allozymes versus morphology. Rosen and Buth (1980) provided a
protocol for examining evolutionary tempo using allozyme data that included the calculation of
ancestral genetic distance between all examined taxa and their hypothetical common ancestor.
Murphy and Crabtree (1985a) applied this method to rattlesnakes and found that rates of
divergence had been equal, although they were unable to confidently calibrate the clock.
However, comparisons of relationships revealed by methods that assume equal rates (e.g., UPGMA)
and distance (similarity) methods that do not (e.g., distance-Wagner trees; Chapter 11) most
frequently reveal marked variation among lineages in rates of change (e.g., Baverstock et al.,
1979; Hillis, 1985). In general, the evidence for an allozyme clock is weak (Avise and Aquadro,
1982; Chapter 12). Finally, a phylogenetic approach has been developed recently where
evolutionary tempo can be examined from the perspective of cladograms (Mindell et al., 1989,
1990; Murphy and Lovejoy, 1995).

Tempo questions may also be applied to (1) estimated dates of dispersal or vicariance events,
(2) the relative arrival times of taxa on oceanic islands, (3) relative roles of colonization,
extinction and historical factors in island biogeography, and (4) prediction of the time of
origin of populations or geographic areas, such as islands, in the absence of supporting
geological data or radioisotope dating, such as 14C. For example, Murphy (1983a) used genetic
distance data from presumed sister taxa of a number of reptile taxa presumably isolated by the
same geological vicariant events in Baja California, Mexico and found a good correlation
between geological dates and genetic similarity. Genetic distance data were then used to
predict the age of one island in the Gulf of California, Isla Santa Catalina. Unfortunately,
most of the sister taxa were presumed rather than tested using more preferable cladistic
methods, such as that of Mindell et al. (1989, 1990). Nevertheless, genetic similarity data
were extended to test the applicability of the MacArthur and Wilson (1963, 1967) theory of
island biogeography as it relates to reptiles on islands in the Gulf of California (Murphy,
1983b), and to the colonization of islands by some rattlesnakes (Murphy and Crabtree, 1985a).

Hybridization

Ideally, studies of interspecific hybridization should incorporate three features, including:
(1) phylogenetic analysis of the taxa involved to allow inferences to be drawn about the origin
of the hybrid zone (primary versus secondary; e.g., Hillis, 1985), (2) identification of
autapomorphic electromorphs in each of the hybridizing species, which provides the most
unambiguous genetic markers for gene flow inferences (Murphy et al., 1984), and (3)
identification of at least three unlinked markers (fixed or nearly fixed electromorph
differences) between hybridizing taxa (see Chapter 2). With three or more single-copy markers,
F1 individuals (heterozygous for parental electromorphs at all markers) can clearly be
distinguished from most F2 or backcross classes, which will be heterozygous for some but not all
markers (R.J. Baker et al., 1989; D.A. Good, 1989; Wake et al., 1989; Arévalo et al., 1993;
Sites et al., 1993). Few of the studies conducted to date satisfy all of these criteria, but
most have contributed to a better understanding of the structure and dynamics of hybrid zones.
Least informative are studies in which hybridizing populations are not characterized by any
fixed allozyme differences (Greenbaum, 1981; Frykman and Bengtsson, 1984; Halkka et al., 1987).
Hybridizing taxa distinguished by several fixed differences (e.g., Patton et al., 1984; D.A.
Good, 1989; Szymura and Barton, 1986, 1991; Wake et al., 1989; Dessauer and Cole, 1991) offer
greater potential for inferring the extent and symmetry of introgressed nuclear genes.

Several recent studies have used allozyme markers to infer the extent of introgression of other
classes of genetic markers (M.L. Arnold et al., 198%; Harrison et al., 1987; Nelson et al.,
1987; Marchant et al., 1988; R.J. Baker et al., 1989; Klier et al., 1991; Arévalo et al.,
1993), and occasionally hybridization is assessed in a historical context (Dowling and
DeMarais, 1993). Isozymes have also been used to study developmental stability as manifested by
morphological asymmetry (Graham and Felley, 1985), and the origin and distribution of rare or
unique alleles (called hybrizymes by D.S. Woodruff, 1989; see also Hunt and Selander, 1973;
Sage and Selander, 1979; Greenbaum, 1981; Barton et al., 1983; Case and Williams, 1984; Murphy
et al., 1984; Kocher and Sage, 1986; Gollmann et al., 1988; Bradley et al., 1993), the genetic
status of threatened and endangered taxa (Echelle and Conner, 1989; Dowling and Childs, 1992),
and to address issues of hybrid speciation (M.L. Arnold et al., 1990; Meagher and Dowling,
1991; DeMarais et al., 1992). Future studies of hybridization that merge ecological and
molecular genetic approaches in appropriate phylogenetic and biogeographic contexts offer great
potential for understanding processes involved in genome divergence, gene flow, and adaptation
to alternative stable equilibria (Hewitt, 1988; Barton and Hewitt, 1989; Harrison, 1990).

Parentage of Unisexual Biotypes

Allozyme electrophoresis is a powerful method for identifying the bisexual parent taxa involved
in the hybrid origins of various unisexual taxa. Most carefully examined unisexual vertebrates
appear to be of hybrid origin (reviewed in Dawley and Bogart, 1989). Typically, unisexual taxa
are characterized by higher levels of multilocus heterozygosity than either parental form
because of their hybridity and the absence of segregation. Laboratory studies have confirmed
patterns of clonal inheritance of fixed heterozygosity in some unisexual lizards (Dessauer and
Cole, 1986). In cases of multiple ploidy levels among different unisexuals of hybrid lineages,
allozymes frequently show different staining intensities due to alleles that are represented
unequally in the genome (Dessauer and Cole, 1984, 1989; Dawley et al., 1985; Kraus, 1991).
Ideally, an analysis of suspected hybridization events should be carried out in a phylogenetic
context that will permit the identification of uniquely derived (autapomorphic) markers in the
parental species; this will eliminate ambiguities arising from the use of shared ancestral
(symplesiomorphic) alleles to define bisexual taxa involved in hybridization events (W.H.
Wagner, 1983; Murphy et al., 1984; Funk, 1985; Moritz, 1987; Sites et al., 1990).

Allozyme data are useful in the estimation of clonal diversity within gynogenetic or
parthenogenetic populations which arise through recurrent hybridization (Moritz et al., 1989c;
Vrijenhoek, 1989), mutation (Parker and Selander, 1976; Spinella and Vrijenhoek, 1982), limited
recombination (Asher, 1970; Bogart et al., 1987), or some combination of these factors. The
matrilineal clones are frequently not a random representation of the possible genotypic
diversity (B.J. Turner et al., 1983); interclonal selection may produce habitat or trophic
specialists, or hybridogens with different life history characteristics, and these kinds of
differences may be "frozen" during the origin of new clones (Vrijenhoek, 1989). Others have
addressed questions of genome replacement in some of the unisexual salamanders (Spolsky et al.,
1992) and have used isozymes in combination with other genetic markers to provide evidence for
semi-independent segregation of unisexual allochthonous genomes during hybridogenetic meiosis
in some populations of the salamander genus Ambystoma (Kraus and Miyamoto, 1990).
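
The Hybridization criterion of at least three unlinked diagnostic markers can be illustrated with a short calculation. The sketch below assumes simple Mendelian segregation at markers fixed for alternative electromorphs ("A" and "B") in the two parental taxa, with no linkage or selection. The function names, genotype strings, and class labels are hypothetical and are not taken from the chapter.

```python
from fractions import Fraction


def prob_all_heterozygous(k):
    """Chance that an F2 or first-generation backcross individual is
    heterozygous at all k unlinked diagnostic markers (1/2 per marker)."""
    return Fraction(1, 2) ** k


def classify(genotypes):
    """Assign a coarse hybrid class from genotypes at diagnostic markers.

    genotypes: one string per marker, each 'AA', 'AB', or 'BB'.
    """
    if all(g == "AB" for g in genotypes):
        return "F1-like"        # heterozygous for parental electromorphs at every marker
    if all(g == "AA" for g in genotypes):
        return "parental A"
    if all(g == "BB" for g in genotypes):
        return "parental B"
    return "F2, backcross, or later-generation hybrid"


for k in (1, 2, 3, 5):
    print(k, prob_all_heterozygous(k))
print(classify(["AB", "AB", "AB"]))
print(classify(["AA", "AB", "AB"]))
```

Under these assumptions, one in eight F2 or backcross individuals would still be heterozygous at all three markers and so look like an F1. This is consistent with the statement that F1 individuals can be distinguished from most, but not all, F2 or backcross classes, and each additional marker halves that residual fraction.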
Origin of Polyploid Plants

As in studies of hybridization, isozymes have been valuable in identifying the parents of
polyploid plants. Isozymes have supported hypotheses based on other lines of evidence (Roose
and Gottlieb, 1976; Werth et al., 1985a) and differentiated among alternative hypotheses
(Holsinger and Gottlieb, 1988; Gastony, 1986). Allozymes have shown that some polyploids have a
single origin (Werth et al., 1985b) and that some are autopolyploids (Soltis and Rieseberg,
1986). In the process of exploring the origin of allopolyploids, allozymes have been used to
predict the existence of, and ultimately to discover, new diploid species (Pryer and Haufler,
1993). Diversification and speciation in polyploid lineages have occurred through gene
silencing (Werth and Windham, 1987; Gastony, 1991). However, if gene silencing regularly leads
to diploidization of entire polyploid genomes (Haufler, 1987), then the value of allozymes for
assessing ploidy must be questioned, especially in phylogenetically ancient groups.

Gene Expression and Gene Duplication

The expression of gene products is subject to both temporal (ontogenetic) and spatial
(cells/tissues) variation in organisms. The predominance of products of different L-lactate
dehydrogenase loci in different tissues of vertebrates (e.g., Ldh-A in skeletal muscle, Ldh-B
in heart) is a classic example of this phenomenon (reviewed by Markert, 1983). An example of
the evolutionary consequences of regulatory divergence in gene expression is provided by the
third L-lactate dehydrogenase locus (Ldh-C) in the bony fishes. Fishes of several
morphologically primitive orders express Ldh-C in many tissues, whereas in most teleosts Ldh-C
expression is limited to eye or liver tissue (Shaklee et al., 1973; Whitt et al., 1975; Shaklee
and Whitt, 1981). To ensure relevant comparisons of homologous gene products, extracts from
homologous tissues/organs must be prepared and specimens at similar developmental stages
compared.

The duplication of genes via aneuploidy, polyploidization, and regional gene duplications
(Ohno, 1970; MacIntyre, 1976; B.J. Turner et al., 1980) can produce isozyme loci that often
diverge markedly in their developmental expression (Whitt, 1981). Differences in gene number
may either serve as characters useful in systematic studies (Gottlieb, 1982b; Whitt, 1983,
1987; Buth, 1984a), or have little value because of extensive homoplasy (Sites and Murphy,
1991). These differences can arise through gains of new genes (duplication) or losses (gene
silencing); both conditions may be considered as derived relative to an ancestral state. For
example, many groups of fishes (Buth, 1983) and plants (Gottlieb and Weeden, 1979; Gottlieb,
1982a) have extra loci encoding enzyme systems, suggesting that gene duplication events have
played an important role in their evolution, perhaps in the acquisition of novel gene functions
(Ohno, 1970; Markert et al., 1975; Fisher et al., 1980). In fishes, tetraploidization is
followed by a shift from tetrasomic (pairing of homologous chromosomes in tetrads) back to
disomic (pairing in dyads) patterns of inheritance. During this "rediploidization," some 50% of
the duplicated loci are silenced either by fixation of new mutations or the deletion of some
codons (Allendorf and Utter, 1973; Ferris and Whitt, 1977a,b; W.H. Li, 1980). Patterns of
malate dehydrogenase (MDH) meiotic segregation during rediploidization in the recently evolved
tetraploid frog Hyla versicolor suggest polymorphic (tetrasomic, disomic, and
tetrasomic-disomic) inheritance thought to be a transitory phase between complete tetrasomy and
complete disomy (Danzmann and Bogart, 1982a). A phylogenetic evaluation of the catostomid fish
Moxostoma lachneri suggests that "retetraploidization" of the second glucose-6-phosphate
isomerase locus, Gpi-B, is due to reactivation, or postpolyploidization regional duplication
(Buth, 1982a).

Isozyme staining intensities may be used to investigate ploidy levels. Danzmann and Bogart
(1982b) and Dessauer and Cole (1984) found that gene dosages, and thus ploidy levels (2n, 3n,
or 4n), could be inferred accurately from staining intensities because the subunit interactions
were additive.

As discussed earlier, many enzymes are multimeric, composed of subunits that must be assembled
in order for the enzyme to function. Multiple isozymes of multimers can be produced by
combining different kinds of subunits in heterozygotes (heteromers) and by the interactions
among multimers of duplicated genes in a multilocus isozyme system producing interlocus
heteropolymers. Heteropolymer formation may be non-random because regulatory differences may
suppress the formation of some or all of the possible heteromers, e.g., the heterotetramers of
L-lactate dehydrogenase of some lizards (Gorman, 1971; Sites et al., 1986), fishes (Buth et
al., 1980), and snakes (Murphy, 1988).

The isozyme characters (sensu Whitt, 1983, 1987; Buth, 1984b; Murphy and Crabtree, 1985b) of
gene number, tissue specificity of expression (gene regulation) and posttranslational
modification, and heteropolymer assembly can be of systematic value only if they vary at a
taxonomic level useful to the investigator. These characters may be useful for intraspecific,
intrageneric, or intrafamilial comparisons depending upon the

Table 2
Evolutionary patterns of creatine kinase gene expression in fishes
Character state

and Murphy, 1991) and glucose-6-phosphate isomerase in the Leguminosae (Weeden et al., 1989)
does not correlate strongly with phylogenetic relationships.

Limitations

Taxonomic Limits

Studies of population structure, breeding biology, and other intraspecific applications require
sufficient levels of intraspecific variability. Allozymes are not sufficiently variable in some
organisms, making other molecular methods, such as RFLP studies (Chapter 8), more appropriate.
For example, DeSalle et al. (1987b) examined the distribution of mtDNA haplotypes in
populations of Drosophila mercatorum distributed along a short altitudinal transect near
Kamuela, Hawaii, and found statistically significant spatial and temporal heterogeneity in the
absence of isozyme divergence. Intraspecific studies of birds are often hampered by very low
levels of isozyme polymor-
group (Buth, 1984b).However, the few studies of phisnz (Barrowclougl~et al., 1985),yet Quinu and
enzyme systems reveal certain limited group White (198713) demonstrated extensive genolnic
trends. Studies of creatine kinase (CK) expression DNA RFLP variability in the snow goose (A~zserc.
in fishes by Ferris and Whitt (1978b), Fisher and caerulescens; see also Haig et al., 1993, for another
Whitt (1978,1979), and otlzers permit the general- example). Similarly, Sites and Davis (1989) found
izations for CK isozyn~echaracters listed in Table many more variable markers using r e s t r ~ c t ~ o n
2 , Three of the four evolutionary patterns in Table sites in both rntDNA and nuclear ribosomal DNA
2 appear to hold for amphibians and reptiles than they found using allozylnes among central
(But11 et al., 2985). In contrast, LDH expression in Mexican chromosome races of the Iizard Sccloyoms
sea snakes and cobras (Murphy, 19881, and the grainmicus. Tlzese and other studies (Wettan et al.,
number of loci encoding glycerol-3-phosphate de- 1987; Karl et al., 1992; Karl and Avise, 1993) sl~owa
hydrogenase (G3PDH) in squamate reptiles (Sites definite lower taxonomic limit to the resoivlng
64 Chapter 4 / Murphy, Sites, Buth & Haufler
power of protein electrophoresis (which may vary among groups; Kessler and Avise, 1985b).
At the opposite extreme, some taxa have diverged to the extent that they share virtually no alleles. For example, Sites et al. (1984) surveyed 17 genera of batagurine turtles and found the taxa to be so divergent and homoplasy so extensive that they could not recover well-corroborated branches for most basal stems of the cladogram. High levels of divergence found among congeneric fern species average D = 1.1 (I = 0.33) (Soltis and Soltis, 1989), which approach or exceed the limits of resolution of isozyme electrophoresis. Nei (1987: 251-252) offered as a general rule that if genetic distance D (Nei, 1972, 1978) is greater than 1.0, then the frequencies of back/parallel mutations will be high, and the variance of D large, even if numerous loci are assayed. The hierarchical taxonomic level at which phylogenetic utility is lost (D ≥ 1.0) will vary with taxonomic assignments and taxon-specific rates of molecular evolution (birds appear to be decelerated; Avise and Aquadro, 1982), but generally the greatest phylogenetic utility for isozymes will be at the level of species or closely related genera (Nei, 1987).
Sampling Limitations
Several kinds of limitations of isozyme techniques are recognized, including limits to the number of (1) loci resolved, (2) alleles per locus, and (3) individuals required for population or phylogenetic studies. The total number of loci that can now be visualized with histochemical staining techniques is in excess of 300 (D.A. Wright et al., 1983; Morizot and Siciliano, 1984; Manchenko, 1994), but this is still only a very small sample of the total genome. However, given the size of most eukaryotic genomes (summarized in Cavalier-Smith, 1985b; Loomis, 1988), this is a constraint common to most molecular techniques and will not be elaborated further here. What is apparent is that, in general, one needs to resolve about three times as many loci as there are taxa in order to have a reasonable chance of resolving most nodes of a cladogram in a character-state evaluation.
LIMITS TO DETECTION OF SEGREGATING ALLELES Hubby and Lewontin (1966) recognized that gel bands represented enzyme phenotypes, and not necessarily all underlying allelic variation. King and Ohta (1975) introduced the term electromorph to label allozymes of the same mobility as different classes of alleles. Allendorf (1977) stressed that electromorph identity did not mean identity in DNA base sequence; homology is a conditional concept for isozyme phenotypes.
Because accurate estimation of allelic variation has important implications for many evolutionary questions (Coyne, 1982), the problem of hidden heterogeneity (G.B. Johnson, 1977) fostered several studies to determine how accurately conventional electrophoretic techniques estimate genetic variability. R.S. Singh et al. (1976) used a sequential assay of four different electrophoretic conditions, termed sequential electrophoresis, and heat stability tests to examine Xdh-A variation in Drosophila pseudoobscura. They resolved 37 alleles where only 6 had previously been identified by conventional protocols. Other approaches to detecting cryptic alleles include thermostability analysis (e.g., Chambers et al., 1981), peptide mapping (Ayala, 1982), and the use of polyacrylamide gels of varying pore sizes to produce a sieving effect for separation by size or molecular weight (G.B. Johnson, 1976, 1979).
Although these methods show that conventional isozyme electrophoresis may underestimate variability, they do not reveal what proportion of alleles may remain undetected. Ramshaw et al. (1979) examined several human hemoglobin variants of known amino acid sequence using both standard and sequential acrylamide electrophoresis (varying conditions of pH and pore size). Three experiments determined what types and proportions of substitutions could be resolved by these methods. First, 8 and 17 hemoglobin variants out of 20 were detected by the two procedures, respectively. Second, groups of variants with the same amino acid substitutions in different parts of the molecule were screened by two approaches and revealed 77% and 90% of the known variants, respectively. Third, 4 of 5 pairs of hemoglobins differing by charge-equivalent substitutions in the same positions were separated by both procedures. There was no class of commonly indistinguishable substitutions, and Ramshaw et al. (1979) concluded that the standard protocol of electrophoresis was a powerful method for identifying most variants.
McLellan (1984) examined 14 whale myoglobins of known sequence by sequential polyacrylamide electrophoresis (five pH values) and was able to separate 13 of the 14 variants. No further resolution was obtained by altering concentration or composition of the gels, or by screening with other techniques such as urea denaturation or isoelectric focusing (McLellan and Inouye, 1986).
Aquadro and Avise (1982a) used several starch and acrylamide conditions, gel-sieving, isoelectric focusing, and thermal stability tests to screen for cryptic alleles at three loci (sAat-A, sMdh-A, and Est-1) in five populations of Peromyscus maniculatus. sAat-A (their Got-1) was previously known to segregate for two alleles across most of the range, sMdh-A (their Mdh-1) was essentially monomorphic throughout the range, and Est-1 was highly polymorphic, with eight alleles resolved in earlier studies. None of the techniques uncovered any further variation in either sAat-A or sMdh-A. In contrast, sequential electrophoresis (five additional starch gel conditions) resolved 23 variants in Est-1, which were further resolved into 35 variants by heat denaturation, although the allelic nature of the latter group was not determined. Aquadro and Avise (1982b) also uncovered additional sMDH isozymes among ten orders of birds using multiple buffers.
Bradley et al. (1993) sequenced the entire coding regions from multiple individuals of gophers (Geomys) that expressed several combinations of three Adh-1 electromorphs. They found that the three electromorphs were encoded by a total of six alleles at the nucleotide level and five alleles at the amino acid level. However, each electromorph class contained only alleles that were phylogenetically closely related. Thus, the electromorphs represented natural groups of alleles that would be expected to be informative about phylogenetic relationships, even though all of the nucleotide variation was not apparent in the allozymic differences.
Clearly, hidden heterogeneity is pervasive, and one cannot always rely on any single method to resolve all alleles. Equally important, however, are findings that (1) some loci are much more likely than others to harbor cryptic alleles, especially systems originally resolved as highly polymorphic by conventional methods, and (2) conventional methods will resolve most or all variation at the more conservative loci. Further, a number of classes of studies will be largely unaffected by this phenomenon (Coyne, 1982). Fixed differences between populations or species detected by conventional methods are real and the differences can only increase by resolution of additional alleles. Similarly, between-population allele frequency heterogeneity is also real, regardless of any underlying heterogeneity in electromorphs, because such differences in electromorph classes should also reflect the same deviations of cryptic alleles. Other kinds of studies (e.g., absolute estimates of heterozygosity) may be more affected by cryptic allelic variability, but to an unknown degree. Obviously, any problem addressed with isozyme techniques will be better understood by more accurate descriptions of allelic variation. Where time and resources permit, we suggest that at least loci showing extensive variation under standard conditions be screened sequentially with additional buffers to maximize separation. Notwithstanding, at least one study (C.D. Chase et al., 1991) suggests allozymes correlate well with DNA RFLP data (Chapter 8). For phylogenetic studies involving extensive radiations likely including multiple monophyletic groups, conservative loci also should be screened with multiple buffers for resolution of additional electromorphs likely to define basal splits (S.B. Hedges, 1989; Burnell and Hedges, 1990). To this we would add that all hypothesized synapomorphic allozymes, identified as such through a preliminary phylogenetic analysis, should be subjected to sequential electrophoresis. However, the sequencing study by Bradley et al. (1993; discussed above) lent support to the hypothesis that even if electromorphs are encoded by different alleles at the nucleotide level, allozymes nevertheless are likely to represent related groups of
alleles, which are thus informative about evolutionary relationships.
NULL ALLELES AND ISOLOCI Other phenomena cause deviation from codominant expression of allozymes. Null alleles (those with reduced or no expression of a protein product) are detected by reduced staining intensity of some single isozymes on the same gel; complete absence of activity may indicate null homozygotes (see Utter et al., 1987). These interpretations are often ambiguous and require confirmation by breeding studies (e.g., Stoneking et al., 1981). In the absence of breeding studies, quantification of a null allele cannot be made reliably. Apparent heterozygote deficiencies may be due to null heterozygotes being scored as active homozygotes (Foltz, 1986). Heterozygotes for null alleles are more readily detected if they either form partial heteropolymer isozymes in polymorphic single-locus systems (Burkhart et al., 1984) or are expressed in multilocus, multimeric proteins (e.g., Engel et al., 1973; Allendorf et al., 1984; Utter et al., 1987; Gastony, 1991). In both cases, reduced intensities of one or more multiple bands provide additional visual clues to the presence of null alleles.
Another difficulty may occur when isozymes with identical electrophoretic mobilities represent the products of two different loci of the same multilocus enzyme system (Utter et al., 1987). These isoloci may present rather complicated isozyme patterns, and, if allelic variation is present, determination of which locus is polymorphic (or whether both are) may not be possible. Isoloci may be individually identifiable if their respective encoded loci are synthesized at different levels in different tissues, but this appears to be uncommon (Allendorf and Thorgaard, 1984). Under some circumstances, different staining intensities are expected (see Utter et al., 1987), but often such distinctions are difficult or impossible to make. Changing electrophoresis buffers often results in the separation of isoloci.
Other Sources of Phenotypic Variation of Isozymes
The phenomena described above may either limit the resolving ability of isozyme electrophoresis, or complicate isozyme interpretations. Several non-Mendelian factors also may complicate isozyme phenotypes via in vivo or in vitro environmental conditions, or through the action of modifier loci.
POSTTRANSLATIONAL MODIFICATIONS OF ENZYMES Polypeptide synthesis involves (1) translation, (2) polymerization, (3) termination, and (4) processing of the final protein product. Only the first step involves the direct coding of nucleotide sequences into primary protein structure, while the others are posttranslational processes that give a final structure to the product. These latter processes change the 20 primary amino acids specified by the genetic code as monomeric building blocks in polypeptide assembly into about 140 amino acids and derivatives in completed proteins (Uy and Wold, 1977). On gels, a number of these epigenetic events may produce conformational isozymes, or multiple forms of a single gene product that differ in secondary or tertiary structure (also called secondary isozymes or subbands; Richardson et al., 1986) and/or variants that differ in thermal stability (Lebherz, 1983). In some cases, modifying genes have been shown to be polymorphic for alleles that differ in their influence on electrophoretic mobilities of the protein products (Cochrane and Richmond, 1979; Womack, 1983; Dykhuizen et al., 1985). In other cases, altered mobilities appear to be restricted to specific tissues (Murphy and Crabtree, 1985b), or to be a function of environmental conditions and/or the physiological state of the organism (McGovern and Tracy, 1981; van Tets and Cowan, 1966; Fields et al., 1989). For example, in a cryptic species of the freshwater clam genus Corbicula, the synthesis of an enzyme seems to be a seasonal event in an entire population (Hillis and Patton, 1982). Such alterations of mobility may lead to incorrect hypotheses about the number of loci encoding an enzyme system (Hickey et al., 1989).
Mobilities of some proteins are also susceptible to protease degradation associated with repeated freezing and thawing (Harris and Hopkinson, 1976; Richardson et al., 1986), or long- and short-term aging of the sample (Walter et al., 1965; Kobayashi et al., 1984). Moore and Yates (1983) showed that many of the loci frequently screened
in population and systematic studies were resistant to mobility modification when kept at room temperature up to 12 hours after death. Posttranslational effects can frequently be determined by evaluating relative intensity of isozyme staining; alternate segregating alleles usually give constant patterns of expression, while breakdown effects are likely to give a full range of expression of relative strengths (Richardson et al., 1986).
[...] respectively, as a function of varying enzyme dilution. This dilution effect, which occurs because GTDH molecules associate with one another in the presence of coenzymes and purine nucleotides, is known from GTDH only. Although the phenomenon of greater mobility with increasing dilution occurs neither in all taxa nor on all buffer systems, it remains a variable to be considered.
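The genetic distance D and genetic identity I quoted in the discussion of taxonomic limits (congeneric ferns averaging D = 1.1, I = 0.33) are linked by Nei's (1972) standard relation D = -ln I. The following is a minimal sketch of that calculation, assuming allele (electromorph) frequencies are already tabulated per locus; the function names and the dictionary layout are illustrative choices, not part of any published program, and the small-sample correction of Nei (1978) is not included.

```python
import math

def nei_identity(pop_x, pop_y):
    """Nei's (1972) standard genetic identity I between two populations.

    pop_x, pop_y: lists with one dict per locus (same locus order in both),
    each dict mapping an allele label to its frequency at that locus.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("both populations need the same, non-empty set of loci")
    jx = jy = jxy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        jx += sum(p * p for p in freqs_x.values())
        jy += sum(q * q for q in freqs_y.values())
        jxy += sum(p * freqs_y.get(allele, 0.0) for allele, p in freqs_x.items())
    if jx == 0.0 or jy == 0.0:
        raise ValueError("allele frequencies must not all be zero")
    # Averaging each sum over loci would cancel in the ratio, so sums suffice.
    return jxy / math.sqrt(jx * jy)

def nei_distance(identity):
    """Nei's standard genetic distance D = -ln(I)."""
    if identity <= 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)

if __name__ == "__main__":
    # Hypothetical two-locus data set with a fixed difference at the second locus.
    x = [{"a": 1.0}, {"a": 1.0}]
    y = [{"a": 1.0}, {"b": 1.0}]
    print(nei_identity(x, y))            # 0.5
    print(round(nei_distance(0.33), 2))  # 1.11, matching the D = 1.1 quoted for I = 0.33
```

Because D grows without bound as I approaches zero, taxa that share virtually no alleles give very large and highly variable D values, which is consistent with the rule of thumb that phylogenetic utility declines once D exceeds about 1.0.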
Major equipment
Freezer    Manual defrost    1    1    6
Refrigerator    >12 cu ft    1    1    6,8
Analytical balance    0.1 mg to 100 g    1    1    1,2,6
pH meter    0.01 pH, with Tris probe    1    1    2,6
Fume hood    1    1    2,4,6
Water deionizer and filter    1    1    1,2,4,6
Power supplies    0-500 V; 0-100 mA    1    10+    4
Refrigerated, high-speed centrifuge    >10,000 g    0    1    1
Centrifuge rotor    Fixed angle; 24-36 place
Refrigerated chamber or walk-in refrigerator
Ultracold freezer, -70°C
CO2 (or LN2) backup for ultracold freezer
Incubator
Tissue homogenizer    High speed
Sonicator/cell disrupter
Water bath
Microwave oven
Pipetters, set    Adjustable: 1 µl-5 ml
Single lens reflex camera    With macro lens and yellow filter
Ice machine    In lieu of blue ice
Minor equipment
Gel molds    1    >20
Buffer wells (trays)    1 pair    >20 pairs    Y
Desiccators    2    4    6
Spatula (stainless steel)    Large and small    12 of each    >12    6
Magnetic stirrer    Preferably with hot plate    1    1    6
Magnetic stirring bars    Various sizes    1 pkg    1 pkg    6
Aspirator/vacuum line    1    1    2
Aspiration safety shield    1    1    2
Bunsen burner    1000 Cal    1    1    2
Heat gloves    1 pair    2 pairs    2
Gel slicer    1    1    5
Polystyrene stain boxes    10    >200    6
Hazardous chemical    1    2    6
Proteins: Isozyme Electrophoresis 69
Table 3 (continued)
Basic equipment and non-chemical supplies necessary and desirable for starch gel
electrophoresisa
able, then Blue Ice™ packs can be used during electrophoresis without having deleterious effects.
Most staining gels are placed in an incubator set at 37°C. Alternatively, gel staining can be carried out in dark cabinets or drawers, the only effect being a longer stain reaction time.
It may be necessary or desirable to construct some of the equipment, especially gel molds, buffer wells, gel origin guide, gel slicer, slicing tray, and aspiration shield. Plans and examples of equipment are provided in Figures 1-5 and detailed assembly instructions will be provided upon request to R. W. Murphy. Buffer well plans are designed to prevent accidental electrocution (see E.W. Spencer et al., 1966). Gel molds, buffer wells, and gel origin guides are constructed from high-quality acrylic plastic (transparent polymethyl methacrylate) sheets, such as Plexiglas™ G. The pieces of plastic are glued using either methylene chloride, chloroform, or a commercially available solvent (methylene chloride containing dissolved plastic).
Figure 1 Plans for two types of gel molds used in horizontal starch gel electrophoresis. (A) Simple gel mold that requires the use of a sponge wick. (B) A wickless gel mold. The construction material is %-inch acrylic plastic. All measurements are in millimeters.
Table 4 lists the chemicals necessary to establish an allozyme electrophoresis (specifically, SGE) laboratory having a capacity to use many different buffer combinations for running and staining of most enzyme systems that have been adapted to eukaryotes.
PROJECT PLANNING
The problems to be solved in preliminary studies are (1) what is the optimal buffer system? and (2) how does expression vary among tissues and which tissues are best for analysis? This technical development phase can be combined with a pilot study (see Chapter 2) to determine the efficiency of the approach. We have found that most frequently the optimal gel buffer systems for particular proteins vary among taxa and are not transferable. Also, impurities in water can affect differences in electrophoretic conditions, making interlaboratory protocols vary. Unless multiple gel buffer systems are initially tried for each enzyme or general protein system, much of the variation may be unresolved (see above). Before the isozyme data are gathered, it is highly desirable, if not essential, to determine independently which of the various gel buffer systems are useful. Therefore, we have avoided suggesting buffer and stain combinations, although Kephart (1990) has summarized commonly used combinations for plants.
Figure 2 Design for an electrophoresis buffer tray that prevents accidental electrocution. (A) Base. (B) Cover. Construction material is %-inch acrylic plastic. All measurements are in millimeters. [Figure labels: male banana plug, female banana plug, wire.]
With five gel setups, in a few days it is possible to determine optimal electrophoretic conditions by surveying a few specimens for a wide array of enzymes on virtually all commonly used gel buffer systems. Each of the five gels is made from a different buffer and can be cut into 5 x 5 minislices, allowing the rapid survey of 30 or more enzyme systems. Five minigels representing five different buffers are simultaneously stained in the same stain box making the protocol cost- and time-efficient. The specimens examined can represent the taxonomic diversity to be studied (see Chapter 2), a range of different tissue types, or both.
Figure 3 Gel slicing apparatus. (A) Bow slicer (constructed from 'h-inch aluminum bar). (B) Gel slicing tray (constructed from %-inch acrylic plastic). All measurements are in millimeters. [Figure labels: 6 mm steel rod, music wire.]
It may be important to have a mix of relatively rapidly and slowly evolving loci, especially if one study is to be compared to another, or if different hierarchical taxonomic levels are being examined. Some enzymes, such as those involved with glycolysis, tend to be relatively conservative in both number of loci and amount of allelic variability within loci (J.H. Gillespie and Kojima, 1968; Gottlieb, 1982a).
The final stage of planning involves the electrophoresis of allozymes from numerous individuals on established buffer systems and from known tissues to generate data on allozyme variation. Gel runs must be well planned in advance in order to avoid unnecessary reruns. Richardson et al. (1986) have detailed many variables that should be taken into consideration in the planning stages. Some of the more important considerations are as follows:
Enzyme systems sensitive to freezing and thawing (e.g., HBDH, G3PDH, IDDH, etc.) should be resolved first, preferably before freezing the tissues or extracts.
Table 4
Chemicals required for electrophoresis, use, location of storage, and health hazard information
Chemical (use)a    Locationb    Reference numberc    Health and safetyd
L-Lactic dehydrogenase (AK, ALAT, CK, ENO, GUK, HAGH, PK, UK)    r
L-Leucyl-L-alanine (general PEP)    r
L-Leucylglycylglycine (PEP)    f
L-Leucine β-naphthylamide HCl (CAP)    f
L-Leucyl-L-leucyl-L-leucine (PEP)    f
Lithium hydroxide (buffer)    s
Magnesium acetate (CK)    s
Magnesium chloride (general)    s
Magnesium sulfate (ALP, PK, buffers)    s
Maleic acid (buffers)    s
DL-Malic acid (buffer, MDH, MDHP, ME)    s
Malic dehydrogenase (FUMH)    r
D-Mannose-6-phosphate (MPI)    f
2-Mercaptoethanol (PBP, NTP, PFK)    r
Methylglyoxal (HAGH, LGL)    r
Methyl alcohol    s
4-Methylumbelliferyl acetate (EST)    f
4-Methylumbelliferyl-N-acetyl-β-D-galactosamide (βGALA)    f
4-Methylumbelliferyl-N-acetyl-β-D-glucosamide (βGA)    f
4-Methylumbelliferyl-α-L-arabinoside (αARAB)    f
4-Methylumbelliferyl-α-D-galactoside (αGAL)    f
4-Methylumbelliferyl-β-D-galactoside (βGAL)    f
4-Methylumbelliferyl-α-D-glucoside (αGLUS)    f
4-Methylumbelliferyl-β-D-glucoside (βGLUS)    f
4-Methylumbelliferyl-β-D-glucuronide (βGLUR)    f
4-Methylumbelliferyl-α-D-mannopyranoside (αMAN)    f
Molybdic acid ammonium tetrahydrate (GLAL)    s
MTT (tetrazolium salt) (general)    r
β-NAD (nicotinamide adenine dinucleotide) (general)    f
β-NADH (general)    r
β-NADP (general)    f
β-NADPH (GSR)    f
Naphthol AS-BI β-D-glucuronic acid (βGLUR)    f
Naphthol blue black (amido black) (GP)    s
α-Naphthyl acetate (EST)    f
β-Naphthyl acetate (EST)    f
β-Naphthyl acid phosphate (ALP)    f
Nitro blue tetrazolium (NBT) (general)    r
Nucleoside phosphorylase (ADA)    r
1-Octanol (ADH, ODH)    s
D-Octopine (OPDH)    f
1-Pentanol (ADH)    s
buffer recipes use 2-mercaptoethanol, a sulfhydryl reducing agent, to reduce subbands. However, at least in reptiles, this ingredient significantly reduces the activity levels of many enzyme systems.
Phenolic compounds in many plant tissues form complexes with proteins upon homogenization. The addition of polyvinylpyrrolidone to the extraction solution usually reduces this problem; some plants also require other ingredients (see Werth, 1985; Kephart, 1990).
There are several ways of extracting enzymatic proteins from cells including (1) simple maceration of tissue(s) with scissors followed by freezing, (2) use of a hand-held ground-glass homogenizer, (3) hand grinding with a glass test tube or rod sanded on its base and a porcelain spot plate (Werth, 1985; Kephart, 1990), (4) motorized plastic (e.g., Teflon®) pestle and plastic (centrifuge tube) mortar, or (5) a high-speed tissue homogenizer with a generator blade (Figure 6). Homogenization using devices designed not to disrupt cell membranes (Figure 6) may require that the samples be subjected to sonication or refreezing for 10 min at -20°C. All of the methods work very well, even without sonication; the latter, and initially most expensive method, is the fastest. If samples are not to be used immediately, refreeze, preferably in an ultracold freezer.
Figure 6 Homogenization of tissue extracts using a high-speed homogenizer. See text for other methods.
1. Dissect out desired tissues or retrieve previously dissected tissue samples from the freezer and place them in a clean grinding tube.
2. Dilute the samples 3-5 fold with grinding solution. The ice-cold grinding solution may be either distilled, deionized water or one of many solutions described in the literature (e.g., Selander et al., 1971; Harris and Hopkinson, 1976; Werth, 1985; Kephart, 1990). If enzyme activity levels are to be surveyed, the tissue samples must be weighed precisely and diluted (Klebe, 1975; Kettler and Whitt, 1986; Kettler et al., 1986).
3. Mechanically homogenize the mixture of tissue and grinding solution. The mixture should be kept ice-cold during the homogenization process.
4. Just prior to use, centrifuge the homogenate, [...]
Protocol 2: Preparation of Starch Gels
(Time: 2-3 hr/gel)
Gel cooking involves either the boiling of hydrolyzed starch in gel buffer (below) or the addition of starch to hot gel buffer (e.g., Micales et al., 1986). Hydrolyzed potato starch may be made following the method of Smithies (1955) or purchased. Although relatively expensive, Connaught Medical Laboratories (Toronto, Ontario, Canada) starch has a longstanding reputation for consistently producing very high-quality gels. Electrostarch Co. (Madison, Wisconsin) starch is relatively inexpensive, but variable in quality, and sometimes requires the addition of Connaught starch to make it usable. Starch from Starch Art Corp. (P.O. Box 268, Smithville, Texas 75957 U.S.A.) produces highly satisfactory gels and is moderately priced. As with Electrostarch, a free sample is available on request. Other sources of hydrolyzed potato starch include various chemical (e.g., Sigma) and biological supply companies; these are invariably the most expensive and usually obtain their stock from the sources above.
Typically, starch gels are made in concentrations of 9-18% (w/v) starch in gel buffer, depending on the quality of starch, preferred texture of the gel, and desired sieving effect obtained during electrophoresis. The appropriate concentrations are determined by trial (and error).
Three problems may occur during gel preparation: undercooking, overcooking, and burning. Undercooking can be recognized by soft, wet gels that are difficult to handle following slicing; undercooking is rare. Overcooking is easily recognized during four stages: aspiration, cooling, loading, and slicing. During aspiration, overcooked gels may boil out of the flask. Vigorous shaking during aspiration may be required if the gel is to be saved, although this is sometimes ineffective. During cooling, deep crevasses or circular or octagonal patterns may form in the surface.
Overcooked gel mixtures tend to stick to the gel molds, often splitting during loading or removal for slicing following electrophoresis; gel slices are tacky and sometimes impossible to handle. Burning can occur without overcooking. It results from not swirling the mixture vigorously enough during cooking and can be recognized by brown-black, burned starch on the bottom of the flask and/or dark flecks in the gel. Burning frequently results in tacky gels. Improperly cooked gels should be discarded.
Most types of gels can be cooked, poured, left overnight, and run the following day. However, Tris-citrate/borate, Tris-citrate III, lithium-borate/Tris-citrate and Tris-HCl gels tend to crack during electrophoresis if used after this period of storage.
Finally, there are a number of peculiarities associated with some gel buffers. Tris-borate-EDTA II gels tend to stick to the flask after cooking. The problem can be overcome by lowering the percentage of starch by 0.5-1 percent and/or preparing an extra 20 ml of gel. Tris-citrate/borate and lithium-borate/Tris-citrate gels tend to split apart at the origin during running (see Protocol 4). Borate gels tend to be difficult to aspirate; slight undercooking and/or vigorous shaking during aspiration reduce these problems (see also Protocol 4).

1. Locate a stable, horizontal surface to hold gel molds until gels are cool enough to move (≈1 hr). The surface should be near the aspirator.
2. Prepare gel molds for receiving hot starch: unglued wick molds (e.g., Micales et al., 1986) must have the edges clamped; use masking tape and seal the open portions of the legs of wickless molds. Place molds on the table or bench on top of a paper towel. Label the paper towel (or masking tape) noting the type of gel buffer to be poured and the date.
3. Weigh out 40 g (or appropriate weight) starch, place into a 1000-ml glass Erlenmeyer flask (narrow mouth, heavy duty rim), and add 400-ml gel buffer. Swirl contents until starch is well emulsified.
4. Cook gel using one of the following methods:
a. While wearing eye protection and insulated glove(s), continuously swirl flask above a 1,000-Cal Bunsen burner (Figure 7A). The mixture will become viscous and then quite rapidly much less viscous. As boiling begins (after about 3-4 min) stop heating.
b. Use a magnetic stirring hot plate and large magnetic stirring bar to heat the starch-buffer mixture until the mixture becomes too viscous for the stirring bar to swirl. Remove the flask and occasionally swirl by hand until the mixture becomes less viscous once again, in about 1 min. Return the flask to the stirrer and continue heating until boiling as above. This procedure takes about 20 min.
c. Cut the bottom out of a microwave oven having a stainless-steel interior in order to accommodate a magnetic stirring plate. While stirring, heat the starch-buffer mixture until it becomes less viscous. Stop heating. (We have not used this method.)
5. Using an insulated glove, quickly transfer molten gel to the aspiration shield, set flask on a heat pad and cover the open hole of the T-connector to apply vacuum for about 15 sec (Figures 5 and 7B). The mixture will resume boiling. Swirling of the flask may be required during the first few seconds to avoid aspirating the gel out of the flask. Slowly release the vacuum.
6. Rapidly pour the hot mixture into gel mold filling evenly and almost overflowing (Figure 7C); avoid dribbles.
7. Immediately (within 1 min) remove any remaining air bubbles from the molten gel using a Pasteur pipette and pipette bulb.
8. Rinse used flask in hot running water before remaining mixture solidifies.
9. After cooking all gels, and while they are cooling, fill buffer wells (trays).
10. Allow the gel to cool to ambient temperature, about 45-60 min, and gently cover with plastic food wrap. With both hands, hold the wrap at one end. Allow the opposite free end to contact one edge of the gel. Slowly lower
Proteins: Isozyme Electrophoresis 81
the wrap allowing it to drop on the gel. If air bubbles begin to form, lift the wrap and lower it again; air bubbles induce malformations in the gel surface. Pulling/tugging of the wrap should be avoided as this can cause splits in the forming gel matrix. Gently write the name of the gel buffer on the wrap using a felt-tip marker.
11. Place gel in refrigerator for 1 hr, or allow to continue to cool at ambient temperature for 2 hr.

Figure 7 (A) Cooking, (B) aspirating, and (C) pouring a starch gel.

Protocol 3: Gel Loading
(Time: 10-20 min per gel)
The inoculation of protein extracts into horizontal gels is generally accomplished by the use of sample wicks, rectangular pieces of filter paper (Whatman No. 3) measuring 2-4 mm in width and 1 mm taller than the gel mold. Wicks can be hand-cut or purchased. The following protocol is used for loading multiple gels, and for right-handed operators.
1. Before loading, make sure that the buffer wells have been filled and labeled.
2. If applicable, remove frozen homogenized samples from freezer and initiate thawing, and recentrifuge if desirable; keep thawed samples chilled.
3. Number a piece of filter paper from 1 to the number of samples being applied to a gel, including tracking dye. Tape the paper to the table to the right of the operator.
4. Make stacks of wicks on the numbered filter
tray buffer electrolytes can be observed to migrate through the gel.
Certain kinds of gels have peculiarities, especially the discontinuous buffer systems. In many buffer systems, such as Tris-HCl, the amperage (electric current) drops as electrophoresis proceeds. Consequently, if running time is to be minimized the voltage should be progressively raised to the maximum level (Table 5) about every half hour but without exceeding 75 mA.
Tris-citrate/borate, and lithium-borate/Tris-citrate gels tend to split apart at the origin during electrophoresis, especially if the gels were cooked a day in advance or run overnight (i.e., run slowly without supervision). Splitting typically occurs when the tray buffer electrolytes pass through the origin. There are three remedies to this problem. First, slightly overcook the gels during preparation. Second, push the gel halves together after the borate line has passed through the origin. Third, following 1-2 hr of electrophoresis, wedge plastic drinking straws or thin glass rods between the gel and inside edge of gel mold thereby forcing the gel halves together. These gels should be checked at the midpoint of electrophoresis to assure that splitting has not occurred. Splits can be repaired by pushing the two halves of the gel back together.
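The half-hourly voltage adjustment described for buffer systems in which the current falls during a run can be sketched as a small decision rule. This is a minimal illustration only: the function names, the 25 V step, and the 18 cm distance are hypothetical, the 75 mA ceiling is the figure given in the text, and the projection assumes the gel behaves as a simple resistor at the moment of the check, which a warming gel will not do exactly.

```python
def voltage_from_gradient(v_per_cm, distance_cm):
    """Convert a potential gradient (V/cm) to a power-supply setting (V).

    distance_cm is an assumed gel length between buffer contacts.
    """
    return v_per_cm * distance_cm


def next_voltage(current_v, measured_ma, max_v, step_v=25.0, limit_ma=75.0):
    """Suggest the voltage setting at a periodic check.

    step_v is a hypothetical increment; limit_ma is the 75 mA ceiling
    quoted in the text.
    """
    if current_v <= 0:
        raise ValueError("current_v must be positive")
    if measured_ma > limit_ma:
        # Already over the ceiling: back off one step.
        return max(current_v - step_v, 0.0)
    candidate = min(current_v + step_v, max_v)
    # Project the current at the new setting, assuming resistance is unchanged.
    projected_ma = measured_ma * candidate / current_v
    return candidate if projected_ma <= limit_ma else current_v
```

For example, at 200 V and 60 mA with a 300 V maximum, the rule suggests 225 V (projected 67.5 mA); at 200 V and 70 mA it holds at 200 V because the projected 78.75 mA would exceed the ceiling.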
Table 5
Recommended electrophoretic conditions for the wickless system described herein,
including electric potential in V/cm and average duration
Buffer combination    V/cm    Duration
It is frequently possible to run gels much more rapidly than recommended, down to as little as four hours. Because of the sieving effect of starch gels, however, rapid running usually results in less well-defined protein bands following staining. Moreover, as gels begin to heat up, resistance increases and further heating will likely occur, to the extent of melting gels!
1. If wickless gel molds are used, remove the masking tape from the legs.
2. Place the gel mold in the buffer well box orienting the narrow end towards the cathode (negative, black terminal). If wick molds are used, a sponge cloth must be used to complete the electric circuit between the gel and buffer wells. While wearing rubber gloves, dip the sponge cloth into the well buffer and place it so that one end is in the buffer, and one on the gel surface 1 cm onto the gel and under the plastic food wrap.
3. Place either an aluminum tray filled with crushed ice, or a frozen package of Blue Ice™ on the gel ensuring that the plastic wrap completely covers the gel and separates it from the ice pack. If Blue Ice™ is used, it is advisable to place a paper towel and glass plate between the gel and ice pack in order to prevent freezing of the gel surface.
4. Plug the well box top into the bottom, i.e., connect the buffer well electrodes to the power supplies (Figure 9).
5. Turn the power supply on, allow it to warm up for a few minutes, and adjust to desired voltage/amperage levels (Table 5). Amperage should not be allowed to exceed 100 mA, and preferably 75 mA, as overheating of the gel will likely occur.
6. After 25 min of electrophoresis, check tracking dye by examining edge of gel mold to assure that the gel was properly oriented in the buffer well. If not, reverse polarity of the electrodes at the power supply.
7. Check ice levels every 2 hr if not running gels in refrigeration.
8. When tracking dye has reached the end of the gel, turn power supply off and remove gel (and gel mold) from buffer well box.

Figure 9 Horizontal starch gel apparatus during electrophoresis. The electrophoresis buffer tray is a slightly more complex version of that shown in Figure 3. The gel is being cooled by using Blue Ice™.

Protocol 5: Gel Slicing
(Time: 5-10 min/gel)
Once electrophoresis has been completed, the gels need to be sliced and the slices placed in stain
boxes. A number of methods have been developed including, among others, the use of bow slicers and slicing trays (Figures 3 and 10A), multiple slicers (B.J. Turner, 1980; Figure 10B and C), and nylon string (thread) (Micales et al., 1986). Gel slicing and handling should be carried out while wearing protective gloves, even though this increases the difficulty of the operations.

Figure 10 Gel slicing. (A) Use of a simple bow slicer (see Figure 4). (B) A multiple slicer (plans available on request). (C) A gel sliced with a multiple slicer. (D) Gel slice handling. The multiple slicer does not require the use of a slicing tray. Note that when handling a gel slice, the fingers of both hands are touching to prevent stretching of the gel. Top glass (or plastic) plate in (A) has been removed for clarity.

Several problems can occur during slicing, the most common of which is that of splitting or tearing the slices. Once a split has formed in a gel, it can be extremely difficult to transfer slices from the tray to the stain box; splits usually result from bending the gel too much while transferring it from one slicing tray to another. The easiest solution is to completely separate the split slice and transfer the two parts separately. Improperly cooked gels are difficult to handle by hand. Transfer these slices by using plastic food wrap as a carrying medium.
1. Using masking tape, label stain boxes with the gel number, enzyme system or locus to be stained, gel buffer, and date. (This step usually is completed during electrophoresis.)
2. Using a microspatula and gel origin guide, cut away the anodal and cathodal 1 cm of the gel (or legs of the wickless gel), 3-5 mm of the edges of the gel, and notch the left anodal and cathodal corners of the gel. Remove these edges and notched pieces from the mold, leaving the greater portion of the gel in the mold.
3. Separate halves at the origin and remove wicks. The gel may be more difficult to move for some buffers (e.g., lithium-borate/Tris-citrate). Using a paper towel, gently dry the top of the gel, arrange the two pieces so that they form a V separated by about 1 cm at one end, and cover with a piece of plate glass or a slicing tray.
4. Invert the sandwiched gel and gently dry the bottom of the gel. Choose the appropriate thickness of slicing tray (if applicable), center it upside down on the bottom gel surface with the tray ridges aligned with the origin, and turn the gel right side up again.
5. Remove air bubbles between the gel and slicing surface. Failure to remove bubbles may result in holes in the gel slice and/or render the remaining gel incapable of being sliced. Re-cover top of the gel with second slicing tray (or glass plate).
6. Clean slicer wire with damp towel or steel wool.
7. Orient the gel so that the apex of the V is furthest away from the operator. Brace the (bottom) slicing tray to prevent it from moving toward the operator during slicing. Place the wire of the slicer on the raised ridges of the slicing tray, press downward on the slicer, and in one continuous operation slowly (about 3 cm per second) pull the wire through the gel. Gels usually move toward the operator slightly during slicing; do not stop pulling if this is observed, and DO NOT PRESS DOWN ON THE TOP TRAY/PLATE.
8. Clean slicer wire with damp towel, or steel wool. Do not immerse wire slicers in water.
9. Remove top tray/plate, carefully separate the gel from the bottom slice, and transfer the gel to the second slicing tray allowing for the V to have the opposite orientation (apex near operator). Similarly, move the anodal top slice but use both hands to support opposite sides of the gel. Lift anodal top gel slice to second tray, forming a V.
10. Open a plastic staining box and carefully transfer the anodal and cathodal bottom slices (Figure 10D) to the stain box. If an agar overlay, UV fluorescing, or limited volume stain is to be applied, then it is important that no bubbles occur underneath the slice. Relatively expensive or critical stains should be made on slices cut from the bottom of the gel.
11. Repeat slicing. Always initiate subsequent slices from opposite ends of the gel to prevent uneven thinning. It may be necessary to repeat steps 4-5 if the remaining gel slides easily on the slicing tray. The top slice can be inverted and used, although it is preferable to stain with an agar overlay. Remaining portions of gels can be temporarily saved (24+ hr) by wrapping.

Protocol 6: Histochemical Staining
(Time: 2 min to 6 hr/stain)
The distance of migration of specific proteins through a starch gel is visualized by histochemical staining. These stains (Appendix 1) consist of a substrate on which a specific enzyme reacts, and a detection mechanism such as a dye or substance that fluoresces under long-wave (340 nm) UV. The common mechanisms for detection include (1) the formation of a purple precipitate (formazan) by the reduction of NBT or MTT using PMS or DCIP as the intermediate electron carrier or reducer, respectively; (2) the non-fluorescence of NAD, which is formed from fluorescent NADH; (3) fluorescence of methylumbelliferone; (4) fast diazo dye (e.g., esterases); and (5) the oxidized form of o-dianisidine diHCl producing an insoluble brown precipitate. Many stains also contain cofactors, coupling enzymes, and other requisite molecules. Details of how each of these systems work are provided by Harris and Hopkinson (1976), Richardson et al. (1986), and Manchenko (1994). A complete understanding of the concepts is desirable but not absolutely necessary, although such understanding greatly facilitates the resolution of staining problems when they occur.
Some stains (e.g., for PGM) are best applied to the gels in the form of an agar overlay, or an agar-based gel containing stain components; agarose may be preferred over agar because the
latter inhibits the activity of some proteins through binding (Harris and Hopkinson, 1976). Most laboratories use agar because it is much less expensive. The overlays serve the function of containing the precipitating dye, which prevents it from either diffusing over a broad area of the gel or becoming too diffuse to be observed.
Several UV-fluorescing stains (e.g., β-GLU) may be applied to the gel slices as filter paper overlays, the overlays being cut from Whatman 1MM or other thin filter paper (Harris and Hopkinson, 1976). However, we have not noticed an advantage over simply applying these stains directly to the gel.
The quantity of specific chemicals in some recipes in Appendix 1 varies from amounts specified in other sources (e.g., Selander et al., 1971). These amounts are the minimum required to resolve these protein systems from the maximum diversity of taxa. Often these quantities can be reduced by applying less stain to a gel, especially once the region of activity has been identified. Most agar overlay stains can be easily accomplished using as little as 10 ml of stain solution.
Of the two dyes used in formazan-based stains, MTT is cheaper, more toxic, and precipitates more rapidly than NBT but tends to diffuse and is less stable. The two dyes can be used in concert. If NBT is yielding only faint bands initially, the addition of MTT during staining may help to intensify the isozymes.
For formazan stains, three components are particularly sensitive to light: PMS, MTT, and NBT. Therefore, the stock liquid solutions and staining gel slices must be kept out of light. Stock solutions should be stored in either amber glass bottles and/or bottles wrapped in aluminum foil.
All stains can be safely and conveniently prepared in Erlenmeyer flasks. Because some stains contain liquid components only (e.g., LDH), these may be mixed directly in the stain box so long as the stain buffer is applied first.
1. Dry chemicals should be weighed and placed in a 125-ml Erlenmeyer flask.
2. Add the liquid components. Liquid components can be handled safely using pipetting devices. In some cases adjustment of pH will be necessary (e.g., HADH). When mixing formazan-based stains, all powdered ingredients should be dissolved in the stain buffer and pH adjustments should be made before adding cofactors, PMS, and NBT (or MTT). Once completely mixed, pour the stain onto the gel and gently shake the box freeing the gel from the bottom. Agar overlays are prepared by bringing a 0.7% (w/v) mixture of agar/stain buffer to a boil, allowing it to set until all agar grains have disappeared, cooling to just below 50°C, adding remaining staining components, and pouring onto the gel slice. For the typical 50-ml stain, 35-40 ml of stain buffer is mixed with 0.35 g agar in a 125-ml Erlenmeyer flask; the remaining 10-15 ml of stain components are added to the warm agar just prior to covering the gel. Under ideal conditions, the agar is prepared in advance of slicing and staining by bringing the mixture to a boil in a microwave oven. The flask is corked or covered with aluminum foil and kept in a 50°C water bath until used. Cooling of hot, molten agar can be made rapid by the use of ice and an accurate thermometer. The molten agar forms a gel at around 42°C. Some fluorescent stains are prepared as small agar overlays. Do not view the UV light or fluorescing gel without the use of a UV light shield or protective glasses. Short wave lights are not necessary and should not be used because of the additional health hazard.
3. Most stains should be incubated at 37°C following staining.
4. Staining gel slices must be continuously monitored to prevent overstaining, which results in unresolvable, diffused, or smeared bands. Some stains must be scored and documented as soon as they are ready, sometimes within 5 min of staining. Stains using insoluble precipitates can be preserved (see below) and scored following the completion of all staining, even on the following day.
5. If the stain has been applied as a liquid, and not an agar overlay, siphon off the stain solution and save for appropriate hazardous
waste disposal. Completely cover the gel slice with fixing solution (about 50 ml; Appendix 2) and refrigerate. If MTT is used as the dye, do not flood the gel slice with fixative or the formazan dye will wash out of the gel; apply only enough fixative to wet the gel slice (about 20 ml).

Troubleshooting
A number of problems may be encountered following application of the stain, the most common of which is the absence of enzyme activity on a gel. This may have several causes. (1) If the duration of electrophoresis is too long or short, the enzymes may have migrated off of the gel or remained in wicks in the origin, respectively. (2) It is possible that one (or more) of the stain components were omitted from the stain recipe. Successful staining may be possible by adding the missing component to the stain. (3) Very weak expression typically results from too little of a given component, or the use of a partially degraded solution of coenzymes. Under these conditions, it will be necessary to add additional stain components, or reorder the coenzyme. If more than one stain is resolving inadequately, check for common stain components, such as G6PDH. Coenzyme activity can be checked by electrophoresis and staining a small amount of the coenzyme along with tissue extracts where activity has been previously resolved. (4) A change in starch lot can result in the necessity to change the conditions of electrophoresis. (5) Shifts to a high pH can result in the conversion of NAD(P) to NAD(P)H. Check the pH of the final stain solution. (6) Finally, the addition of too much substrate or coenzyme can suppress enzyme activity.
Smeared isozymes may result from use of the wrong electrophoresis buffer conditions, too high a current (overheating), high concentrations of lipids in the tissue extracts, or (rarely) improper formation of the gel matrix.
Diffuse isozymes can indicate overstaining, less than ideal electrophoresis conditions and/or that an agar overlay stain should have been applied. In the latter case, if light shaking of the gel results in disturbance of the formazan precipitate on top of the gel slice, then the stain should be applied as an agar overlay. Overstaining results in very dense isozyme banding patterns. Occasionally, background "ghost bands" may be observed. These bands result from the ability of an enzyme to act on an alternative substrate (e.g., LDH acting on DL-glyceric acid, the substrate of GLYDH), presence of sufficient substrate in the tissue extract, or contamination by bacteria, molds, and yeasts (e.g., ethanol and ADH). LDH, ADH and other isozymes can be identified either by counterstaining, or by inclusion of the end product of the reaction, a procedure termed end-product suppression. For example, pyruvic acid suppresses (but does not stop) LDH, and pyrazole inhibits ADH. For some enzyme systems (e.g., GLYDH), use of one or more suppressors is required.

Protocol 7: Drying of Agar Overlays
(Time: 6 hr)
Agar overlays can be dried on filter paper and saved as documentation as follows:
1. Cut filter paper (e.g., Whatman No. 1) to dimensions allowing it to fit into a stain box (12 x 17 cm) and label it with the enzyme system, gel number, and buffer conditions.
2. Decant or vacuum excess fixative from the stain box.
3. Cut the agar free from the edges of the gel slice using a microspatula.
4. Carefully overlay the filter paper on the agar overlay and then slowly lift the filter paper while separating the agar overlay from the gel slice using a microspatula (Figure 11).
5. Place the filter paper on a few paper towels agar-side up and allow to dry (several hours).
6. Once dry, curled overlays can be pressed flat and wrapped in plastic for safe handling. They should be stored in the dark and with light pressure to avoid recurling. BECAUSE THE AGAR WILL RETAIN DANGEROUS CHEMICALS FOR YEARS, OVERLAYS SHOULD NEVER BE HANDLED WITHOUT WEARING PROTECTIVE GLOVES AND/OR UNLESS THEY ARE WRAPPED.
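The weight-per-volume figures quoted in the gel and stain protocols (40 g of starch in 400 ml of gel buffer; a 0.7% (w/v) agar overlay for a 50-ml stain) reduce to the same arithmetic, sketched here. The function names are illustrative, and reading the advice to lower the starch by 0.5-1 percent as percentage points of w/v is an assumption.

```python
def percent_w_v(mass_g, volume_ml):
    """Percent weight/volume: grams of solute per 100 ml of solution."""
    return 100.0 * mass_g / volume_ml


def mass_for_w_v(percent, volume_ml):
    """Grams of solute needed for a given percent (w/v) and volume."""
    return percent * volume_ml / 100.0


# 40 g starch in 400 ml buffer is a 10% (w/v) gel.
standard = percent_w_v(40, 400)

# Assumption: "lowering by 0.5-1 percent" means 9.5% or 9.0% (w/v).
reduced = [mass_for_w_v(standard - drop, 400) for drop in (0.5, 1.0)]

# 0.7% (w/v) agar for a 50-ml stain.
agar_g = mass_for_w_v(0.7, 50)
```

Under these assumptions the reduced starch weights are 38 g and 36 g per 400 ml, and the agar weight is 0.35 g, which matches the quantity given in Protocol 6.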
INTERPRETATION AND TROUBLESHOOTING
The interpretation of the band patterns comprising the zymogram requires the knowledge of the subunit structure and the genetic control of the enzyme system. As discussed in the gene expression section, the tissue examined for enzyme activity may limit the number of gene products or subunits expressed. These variables may be manipu-

Figure 12 Photograph exhibiting triallelic variation at the phosphoglucomutase locus (Pgm-A) in muscle extracts from the cyprinid fish Luxilus cardinalis. Specimens 1, 2, and 4 are homozygous expressing only the 82 homomer; specimen 3 is heterozygous expressing both 82 and 100 homomers; specimens 5, 6, and 7 are also heterozygous expressing both 68 and 82 homomers.
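Genotype calls such as those in the Figure 12 caption can be converted to allele frequencies by direct gene counting, and Hardy-Weinberg expected genotype numbers follow from those frequencies. The sketch below is illustrative: the function names are hypothetical, alleles are labelled by their relative mobilities as in the caption, and seven specimens are far too few for a meaningful test of fit.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Count alleles across diploid genotypes and return their frequencies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def hardy_weinberg_expected(freqs, n_individuals):
    """Expected genotype numbers: n*p^2 for homozygotes, n*2pq for heterozygotes."""
    alleles = sorted(freqs)
    expected = {}
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            product = freqs[a] * freqs[b]
            expected[(a, b)] = n_individuals * (product if a == b else 2 * product)
    return expected


# Genotypes as read from the Figure 12 caption (specimens 1-7).
figure12 = [(82, 82), (82, 82), (82, 100), (82, 82), (68, 82), (68, 82), (68, 82)]

freqs = allele_frequencies(figure12)
expected = hardy_weinberg_expected(freqs, len(figure12))
```

Here allele 82 has a frequency of 10/14, allele 68 of 3/14, and allele 100 of 1/14; the expected genotype numbers sum to the sample size of seven.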
case of multimeric enzymes, the two allelic products are also produced in equal quantity in a given tissue but the products will usually randomly assemble to form all expected heteromers, in addition to homomers. It is usually the case that the subunits of multimeric enzymes form homomers and heteromers at random, yielding banding patterns in predictable ratios. Because heteromers of similar composition can be assembled in several ways, the ratio of expected intensity of enzyme activity differs among isozymes according to the subunit structure of the enzyme (Figure 13). This variation is detailed in Table 6.

Figure 13 (panel labels: Homozygote, Heterozygote, Homozygote; row label: Monomer)

The situation is more complex for multilocus enzyme systems. Multimeric gene products of multiple loci in an enzyme system often retain their ability to form heteromers, and the number of isozymes formed can be considerable where heterozygosity occurs. Harris and Hopkinson (1976) provided the following equation for the computation of the expected number of isozymes (i) under these circumstances:

Hardy-Weinberg expectations for the distribution of allozyme products of a given locus is usually a safe one. Violation of this assumption suggests that additional study is necessary, beginning with a reassessment of the scoring of that enzyme system. Scoring only clear bands and omitting smeared zones may overestimate the frequency of homozygotes. Frequently the report of 50% allele 1 and 50% allele 2 for a given locus in a table of allele frequencies is the result of incorrect scoring of an entire sample (n > 5) as heterozygotes.
Difficulty in scoring gels can occur when any of the other assumptions discussed previously are violated. Exceptions to expected subunit interactions and genetic control are often encountered. The random association of subunits of multimeric enzymes sometimes is restricted, yielding fewer zones of activity than expected. For example, creatine kinase is a dimer in all vertebrates but the
Table 6
Subunit structures of homomeric and heteromeric isozymes in heterozygotes (a)
Monomer Dimer Trimer Tetramer
Homomer
Heteromers
Homomer
(a) Modified from Harris and Hopkinson (1976). Two alleles at this single locus determine polypeptide units 1 and 2 respectively. Random combination of subunits of multimeric enzymes is assumed.
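Under the random-combination assumption stated in the Table 6 footnote, and with the two allelic products present in equal quantity, the expected relative intensities of the isozymes in a heterozygote are binomial coefficients: 1:1 for a monomer, 1:2:1 for a dimer, 1:3:3:1 for a trimer, and 1:4:6:4:1 for a tetramer. The sketch below computes these ratios and also counts the distinct subunit compositions possible when several subunit types associate freely. The function names are hypothetical, and the count is standard combinatorics (combinations with repetition) offered as an assumption about what is being counted, not as a transcription of the Harris and Hopkinson (1976) equation.

```python
from math import comb


def heterozygote_band_ratios(n_subunits):
    """Relative intensities of isozymes in a heterozygote.

    Assumes two allelic subunits made in equal amounts and combining at
    random into molecules of n_subunits subunits each.
    """
    return [comb(n_subunits, k) for k in range(n_subunits + 1)]


def isozyme_count(subunit_types, n_subunits):
    """Number of distinct subunit compositions (order within a molecule ignored).

    Counts multisets of size n_subunits drawn from subunit_types kinds.
    """
    return comb(subunit_types + n_subunits - 1, n_subunits)
```

With two subunit types in a tetramer the count is 5, the number of LDH isozymes anticipated in the Figure 15 caption; with three subunit types it is 15.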
heterodimer is not formed in heterozygotes at the Ck-A locus in teleost fishes (Ferris and Whitt, 1978b). The subunit structure of enzymes often is quite conservative across taxa; however, some enzymes have been reported to have a variety of structures in different groups of organisms (Manchenko, 1988). These reports may reflect either real structural differences among taxa or the restriction of heteromer formation misinterpreted as structural differences. Rigorous testing (e.g., Ferris and Whitt, 1978b) should be applied in these cases. On rare occasions, allelic products have different catalytic properties and expected ratios of isozyme expression are not realized. Examination of a large series of individuals that resolve all heterozygous and homozygous categories should allow the correct interpretation of such variation. Epigenetic effects yield isozymes of different electrophoretic mobilities in different tissues and can suggest the action of more structural loci than are actually present. If only a single locus is active in this case, apparent heterozygosity or homozygosity should be correlated among tissues of an individual (e.g., Murphy and Crabtree, 1983). The probability for unlinked multiple loci to covary in such a way can be addressed statistically (see Hartl and Clark, 1989).
In all studies that deal with questions of whether mobilities of electromorphs are equivalent or whether an individual is heterozygous at a locus, the resolution of discrete zones of enzyme activity on a gel is essential. If multiple buffer systems are not used or if tissue extracts no longer provide sufficient enzyme activity, the resolution may be inadequate. Interpretation of these suboptimal gels results in dubious data sets. For example, overstaining will obscure the subtle differences in relative activity of isozymes. In spite of the resolution of discrete zones of enzyme activity and efforts to limit enzyme expression to primary isozymes, some non-genetic subbanding may confound the interpretation of gels. The production of these secondary isozymes, or subbands, may vary by tissue location and age, en-
Figure 14 Photograph exhibiting variation at glucose-6-phosphate isomerase loci Gpi-A and Gpi-B in muscle extracts from the cyprinid fishes Luxilus cardinalis (lanes 1-6) and Luxilus zonatus (lanes 7-10). Genotypes are listed for each locus. Notice the subbanding. (Gel labels: Gpi-A products; Gpi-B products plus subbands.)

Figure 15 Photograph demonstrating a gel buffer screen from rattlesnakes for the enzyme system L-lactate dehydrogenase. (A) Tris-citrate III pH 7.0. (B) Tris-citrate/borate pH 8.2. (C) Tris-citrate II pH 8.0. (D) Tris-citrate-EDTA pH 7.0. (E) Phosphate-citrate pH 7.0. Increasing the pH also increases the net charge and relative mobility of the isozymes. Three or four isozymes are observed, depending on the buffer, but in no case are all of the anticipated five isozymes resolved (see Figures 13 and 16). Buffer (A) suppresses the activity of the more anionic system, products of the heart-predominating Ldh-B locus. Isozymes of the slower skeletal-muscle-predominating system Ldh-A cannot be resolved adequately on systems (D) and (E). Note that the minislices are uniquely notched.
desirable, as in the case of observing superoxide dismutase (SOD) following staining for glycerol-3-phosphate dehydrogenase (G3PDH; Figure 18), or undesirable (Figure 19).
Figure 20 documents the necessity of choosing the correct array of tissues to be surveyed. Finally, Figure 21 shows unacceptable resolution of an isozyme system. Optimal resolution of enzyme activity will facilitate a correct genetic interpretation of the zymograms. Documentation of results through the publication of either gel photographs or zymograms is recommended strongly.

Figure 18 Photograph showing the resolution of superoxide dismutase (SOD) isozymes (light bands) as background on a gel stained for glycerol-3-phosphate dehydrogenase (G3PDH) in spring peeper frogs, Hyla (Pseudacris) crucifer.

Figure 19 Photograph showing extensive variability in mannose-6-phosphate isomerase (MPI) isozymes among some hylid frogs along with background resolution of L-lactate dehydrogenase products (LDH), which may be misinterpreted as a second MPI locus. All individuals except those indicated by arrows are Hyla (Pseudacris) crucifer. The more anodal isozymes in species 2, H. cadaverina, are Ldh-B products.

ENZYME AND LOCUS NOMENCLATURE
The effective communication of data derived from protein electrophoresis is critical to a clear understanding of any study. Consequently, there is a need for a reasonably standard system of enzyme
larly, numbered alleles should be separated by a forward slash, as in Ldh-A(200/125).
Particular interlocus isozymes resulting from subunit interactions in multilocus systems are designated using enzyme system notation (capital letters) followed by subscripts that designate subunits, e.g., LDH-A3B1, or simply abbreviated A3B1. In heterozygous individuals, polymeric enzymes (enzymes composed of multiple subunits) yield multiple isozymes. Intralocus polymeric isozymes, formed from the interactions of different alleles at a particular locus in a heterozygote, are denoted by locus designation with allelic subscript notation showing the number of constituent allelic subunits, e.g., Ldh-A(a3b1), or sSod-A2(a1b1). [Note that in all allozyme studies, a heterozygous condition is simply denoted as Ldh-A(a/b) or sSod-A2(a/b).] Finally, in polymeric, multilocus systems, it is possible for subunits produced by different loci (e.g., A or B) and alternative alleles at a particular locus [e.g., A(a) or A(b)] to combine within a single tissue yielding a multitude of distinguishable isozymes [see Gorman and Shochat (1972) for the resolution of 15 LDH isozymes.] For this we suggest combining system and allelic notation as follows: LDH-A3(a2b1)B1, where two subunits of Ldh-A(a), and one subunit each of A(b) and B(a), combine to form a single LDH isozyme.

(Compiled by Donald G. Buth and Robert W. Murphy)

Formulas for enzyme stains frequently are modified and republished, often as compilations for specific groups of organisms or even for single species. Textbook treatments often provide a limited introduction to the vast array of stains available, whereas listings for specific groups of organisms often are limited to those systems well known or expressed only in those groups. Our listing is not meant to be all-inclusive; our selection is biased toward economical systems in use by botanists and zoologists. Our reference to stain sources usually extends only to the secondary literature wherein modifications are already noted.
With few exceptions, the enzyme names and Enzyme Commission (EC) numbers used in this compilation are those recommended by the International Union of Biochemistry (IUBNC, 1984). Abbreviations of enzyme names are placed in capital letters; abbreviations are developed from the IUBNC (1984) recommended names and sometimes differ from common usage by the addition of letters for clarity. The listing of named loci controlling each enzyme system is beyond the scope of this appendix and other abbreviations are defined in the glossary.
The quaternary structures for enzymes listed herein were taken from those reported by Harris and Hopkinson (1976), D.E. Soltis et al. (1983), Richardson et al. (1986), Aebersold et al. (1987), Manchenko (1988), and personal communications from a number of researchers. For some enzymes, these structures are well documented and conservative across taxa. For others (e.g., catalase and glucose-6-phosphate dehydrogenase), several quaternary structures have been reported (Manchenko, 1988). Whether these and other enzymes actually exist in multiple structural forms or have a conserved single multimeric structure that is expressed as restricted subunit combinations remains to be investigated.
Many of the biochemicals used in enzyme stains are marketed in a number of forms. In some cases, ultrapurity is not required and considerable savings can be achieved through the purchase of a lesser grade. In some cases, the choice of a particular salt may be critical. We have listed (Table 4) the product number of many of these stain components keyed to the catalog of the Sigma Chemical Company (P.O. Box 14508, St. Louis, Missouri 63178 U.S.A.) to allow the reader to evaluate the kind and cost of these biochemicals. This choice does not necessarily represent our endorsement of these products.
Most of the stains below are based on a standard volume of 50 ml suitable for gel slices from most horizontal starch gel apparatus (scaled down from stain formulas for 100-ml volume used commonly for the macroscale vertical apparatus of earlier studies). Some investigators have reduced the
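The scaling between 100-ml and 50-ml stain volumes mentioned above is simple proportional arithmetic. The following is a minimal sketch, assuming every component of a recipe is scaled by the same factor; the function name and the recipe values are hypothetical and are not taken from the appendix.

```python
def scale_recipe(recipe, from_volume_ml, to_volume_ml):
    """Scale every quantity in a stain recipe by to_volume_ml / from_volume_ml."""
    if from_volume_ml <= 0 or to_volume_ml <= 0:
        raise ValueError("volumes must be positive")
    factor = to_volume_ml / from_volume_ml
    # Each entry maps a component name to (amount, unit); units are unchanged.
    return {name: (round(amount * factor, 4), unit)
            for name, (amount, unit) in recipe.items()}


# Hypothetical 100-ml recipe (illustrative numbers only).
recipe_100_ml = {
    "buffer": (100.0, "ml"),
    "substrate": (0.30, "g"),
    "dye stock": (2.0, "ml"),
}

recipe_50_ml = scale_recipe(recipe_100_ml, 100, 50)
for name, (amount, unit) in recipe_50_ml.items():
    print(f"{name}: {amount} {unit}")
```

Enzyme quantities given in units can be scaled the same way, although in practice the amount of a costly coupling enzyme is often adjusted empirically rather than strictly in proportion.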
sodium acetate (NaC2H3O2·3H2O)    0.33 g
β-naphthyl acid phosphate    0.15 g
fast garnet GBC    0.03 g
H2O    50 ml

The pH of the staining solution is about 5.5 so further adjustment is usually unnecessary; the pH should be 5.0-6.0. The stain was modified from Shaw and Prasad (1970). Werth (1985) recommended the use of a stock solution of the substrate β-naphthyl acid phosphate in 70% acetone (use 1 ml of a 1% solution). D.E. Soltis et al. (1983) and Werth (1985) recommended the use of fast garnet GBC salt as a substitute for black K salt. Sigma fast black K salt has not provided satisfactory results in studies of reptiles. This stain was modified from Harris and Hopkinson (1976) and does not resolve red-cell ACP in many vertebrates, including many reptiles. Monomeric red-cell ACP isozymes, also known as erythrocytic acid phosphatase (EAP; $$), may be stained as follows:

0.05 M citrate buffer, pH 6.0    50 ml
phenolphthalein diphosphate    0.2 g

Incubate for 1 hr, decant the staining solution and spray the gel surface with a concentrated solution of ammonium hydroxide. Zones of activity will appear as pink bands. This same stain was described by Harris and Hopkinson (1976) who recommended 4 hr of incubation at 37°C. In some vertebrates tissue ACP isozymes can also be resolved.

Aconitate Hydratase (ACOH) $$$
(EC 4.2.1.3)
Monomer This enzyme was known formerly as aconitase (ACO or ACON). Mitochondrial and supernatant/cytosolic forms are known (Harris and Hopkinson, 1976). This stain may be prepared as an agar overlay.

0.2 M Tris-HCl, pH 8.0    50 ml
1.0 M MgCl2 (see note below)    1.5 ml
0.1 M cis-aconitic acid, pH 8.0    5 ml
isocitric dehydrogenase    3 U
NADP    0.01 g
5 mg/ml MTT    1 ml
5 mg/ml PMS    1 ml

This stain was modified from Harris and Hopkinson (1976) and Siciliano and Shaw (1976). Note that the magnesium chloride used is 1.0 M, not 0.1 M as is used in many other enzyme stains. Cis-aconitic acid stock solutions should be made in small quantities as it seems to decompose in 1-2 months.

Adenosine Deaminase (ADA) $$
(EC 3.5.4.4)
Monomer This stain may be prepared as an agar overlay.

0.2 M Tris-HCl, pH 8.0    15 ml
H2O    35 ml
adenosine    0.04 g
arsenic acid    0.08 g
xanthine oxidase    0.4 U
nucleoside phosphorylase    1.8 U
5 mg/ml MTT    1 ml

This stain was modified from Spencer et al. (1968).

Adenylate Kinase (AK) $$$
(EC 2.7.4.3)
Monomer This stain may be prepared as an agar overlay.

0.2 M Tris-HCl, pH 8.0    50 ml
0.1 M MgCl2    6 ml
adenosine 5'-diphosphate    0.03 g
D(+)glucose    0.1 g
hexokinase    200 U
G6PDH    40 NAD U
10 mg/ml NAD    2 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was described by Buth and Murphy (1980) as modified from Fildes and Harris (1966). A more sensitive, but more expensive, fluorescent stain modified from Harris and Hopkinson (1976) may also be used.
ground stain will be lost eventually. This stain was modified from Brewer (1970). Siciliano and Shaw (1976) recommended a longer incubation period (15 min) for the initial solution. D.E. Soltis et al. (1983) and Werth (1985) noted that up to 1 ml of glacial acetic acid may have to be added to the KI solution to induce or improve staining. An alternative CAT stain was described by Harris and Hopkinson (1976) and Aebersold et al. (1987). This enzyme cannot be resolved on high pH gels.

Dimer The following stain may also yield adenylate kinase (AK) gene products. A control slice from the same gel must be stained specifically for AK to ascertain, by a process of elimination, which zones of activity are CK. This stain may be prepared as an agar overlay.

0.2 M Tris-HCl, pH 7.0    50 ml
0.1 M MgCl2    1 ml
adenosine 5'-diphosphate    0.03 g
glucose    0.05 g
hexokinase    200 U
phosphocreatine    0.05 g
G6PDH    40 NAD U
10 mg/ml NAD    1 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was described by Buth and Murphy (1980) as modified from Shaw and Prasad (1970). A more sensitive, but more expensive, fluorescent stain was described by Harris and Hopkinson (1976).

Monomer This enzyme was known formerly as leucine aminopeptidase (LAP); the current IUBNC (1984) name and EC number may be changed as more is learned about peptidases. Richardson et al. (1986) refer to this enzyme as Pep-E (see Peptidase). A conservative approach would be to consider this enzyme under the category of generic aminopeptidases EC 3.4.-.-.

Incubate the gel slice in the following solution for 30-60 min:

0.1 M KH2PO4 buffer, pH 7.0    50 ml
0.1 M MgCl2    1 ml
10 mg/ml L-leucine-naphthylamide HCl    0.1 ml

Then add:

fast garnet GBC (dissolved in a small quantity of water)    0.03 g

Continue incubation.
This stain was modified from those of Brewer (1970), Shaw and Prasad (1970) and Ayala et al. (1972). Some of these staining methods involve a preincubation step in a boric acid solution which may not be necessary.

Monomer or Dimer This enzyme was known formerly as diaphorase (DIA) and lipoamide dehydrogenase (EC 1.6.4.3; see Muramatsu et al., 1978).

0.2 M Tris-HCl, pH 8.0    50 ml
2 mg/ml 2,6-dichlorophenol-indophenol    1 ml
NADH    0.01 g
5 mg/ml MTT    1 ml

Zones of enzyme activity will appear pink/purple against the blue background of the gel. The blue DCIP color will clear overnight if the developed gel is kept refrigerated (dark) yielding a white gel with purple isozymes. This stain was modified from those of Kaplan and Beutler (1967) and Brewer (1970). Harris and Hopkinson (1976) used this stain to resolve NADH diaphorase (a synonym of cytochrome-b5 reductase; EC 1.6.2.2). Aebersold et al. (1987) noted that this stain may also resolve xanthine oxidase (XO) gene products as well as those of a variety of other enzymes.
102 Chapter 4 / Murphy, Sites, Buth & Haufler
Dimer? This enzyme was known formerly as hexose-6-phosphate dehydrogenase (H6PDH). The following stain may also yield LDH. Either a control slice from the same gel or the addition of 0.05 g pyruvic acid may be necessary.

0.05 M potassium phosphate buffer, pH 7.0    50 ml
D(+)glucose    9 g
10 mg/ml NAD    2 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was modified by Berg and Buth (1984) from that described by Harris and Hopkinson (1976).

Dimer? NADP (0.02 g in 400 ml) should be added to the gel before electrophoresis.

0.2 M Tris-HCl, pH 8.0    50 ml
0.1 M MgCl2    1 ml
D-glucose-6-phosphate    0.3 g
NADP    0.03 g
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was modified from Brewer (1970). At least for amphibians, the quantities of NADP and D-glucose-6-phosphate may be reduced by 60% or more

D-fructose-6-phosphate    0.04 g
G6PDH    40 NAD U
10 mg/ml NAD    2 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was described by Buth and Murphy (1980) as modified from DeLorenzo and Ruddle (1969).

α-Glucosidase (αGLUS) $$$
(EC 3.2.1.20)
Tetramer

0.1 M phosphate-citrate buffer, pH 4.0    5 ml
4-methylumbelliferyl-α-D-glucoside    0.01 g

Monitor the development of expression under UV light (long wavelength). Zones of activity will appear as bright areas. To enhance fluorescence, spray the gel slice with a concentrated solution of ammonium hydroxide. This stain was modified from Harris and Hopkinson (1976). Aebersold et al. (1987) recommended a stain buffer pH of 8.0.

β-Glucosidase (βGLUS) $
(EC 3.2.1.21)
Subunit structure uncertain

0.1 M phosphate-citrate buffer, pH 4.0    5 ml
4-methylumbelliferyl-β-D-glucoside    0.01 g

Incubate for approximately 30 min and then view under UV light (long wavelength). Zones of activity will appear as bright areas. To enhance fluorescence, spray the gel slice with a concentrated ammonium hydroxide solution. This stain was modified from Harris and Hopkinson (1976).
This stain was modified from Aebersold et al. (1987). Alternative stains are discussed by Brewer (1970). P.T. Chippindale (personal communication) obtains better results by applying the stain without agar and in only 13 ml of Tris/HCl. Dihydrolipoamide dehydrogenase (DDH; EC 1.8.1.4) may also be resolved as a second, relatively slower system (see Harris and Hopkinson, 1976) because it appears if glutathione is omitted from the stain. In addition, because Aebersold et al. (1987) noted that the DDH stain may also resolve xanthine oxidase (XO) gene products as well as those of a variety of other enzymes, this may also be true for GR. FAD may not be necessary if NADPH is used instead of NADH. However, the use of NADPH may result in the resolution of NADPH dehydrogenase (EC 1.6.99.1).

gene products. A control slice from the same gel may be necessary.

0.2 M Tris-HCl, pH 8.0    50 ml
DL-glyceric acid    0.2 g
pyruvic acid    0.05 g
pyrazole    0.05 g
10 mg/ml NAD    2 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was modified from Siciliano and Shaw (1976).
Tetramer

0.2 M Tris-HCl, pH 8.0    50 ml
1.0 M lithium lactate, pH 8.0 (see below)    8 ml
20 mg/ml NAD    1 ml
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

The stock substrate solution may be prepared using either DL-lactic acid or lactic acid solution; the pH should be adjusted to 8.0 with the addition of LiOH. This stain was modified from Shaw and Prasad (1970).

Dimer This enzyme was known formerly as glyoxalase I (GLO).

0.2 M potassium phosphate buffer, pH 6.8    50 ml
methylglyoxal (40% solution)    0.9 ml
glutathione, reduced    0.25 g
5 mg/ml MTT    1 ml

Incubate the gel slice in this solution at 37°C for 40 min and then add:

2 mg/ml 2,6-dichlorophenol-indophenol    1 ml

Areas of activity will be seen as white zones on a blue background. This stain was modified from Harris and Hopkinson (1976).

5 mg/ml NBT
5 mg/ml PMS

This stain was modified from Brewer (1970).

Tetramer The following stain is for NADP-dependent malate dehydrogenase. The convention for name abbreviation of this kind of enzyme (+P to MDH) follows Aebersold et al. (1987). This enzyme was known formerly as malic enzyme (ME). Mitochondrial and supernatant/cytosolic forms are known (Harris and Hopkinson, 1976). NADP (0.02 g in 400 ml) should be added to the gel before electrophoresis.

0.2 M Tris-HCl, pH 8.0    50 ml
0.1 M MgCl2    1 ml
2.0 M DL-malic acid, pH 8.0    5 ml
NADP (see note below)    0.02 g
5 mg/ml NBT    1 ml
5 mg/ml PMS    1 ml

This stain was modified from those of Ayala et al. (1972) and Cross et al. (1979). It is important that NADP be used in solid form in this stain. There is often sufficient breakdown of NADP to NAD in liquid stocks in prolonged storage that NAD-dependent MDH activity will be resolved in addition to MDHP. If there is any doubt as to the identity of MDHP, a control slice from the same gel should be stained specifically for MDH to ascertain, by a process of elimination, which zones of activity are MDHP. Aebersold et al. (1987) recommended adding 0.02 g oxaloacetic acid to an MDHP stain of 50 ml. Use caution when preparing the DL-malic acid substrate as the solution becomes extremely hot while adjusting the pH with NaOH.
0.2 M Tris-HCl, pH 8.0    50 ml
0.1 M MgCl2    1 ml
D-mannose-6-phosphate    0.05 g
glucose-6-phosphate isomerase    50 U
G6PDH    40 NAD U
10 mg/ml NAD    2 ml
5 mg/ml MTT    1 ml
5 mg/ml PMS    1 ml

Products of LDH may appear as faint bands following staining. LDH activity can be suppressed by adding 0.05 g of pyruvic acid. This stain was described by Buth and Murphy (1980) as modified from E. Nichols et al. (1973).

keep the gel slice moist (Harris and Hopkinson, 1976; Aebersold et al., 1987). After incubation, remove the substrate solution from the slice but do not rinse. Add the following visualization solution:

L-ascorbic acid    0.5 g
ammonium molybdate solution (see below)    2 ml

Incubate at 37°C in a fume hood in the dark. The ammonium molybdate solution can be prepared as a stock (2.5 g ammonium molybdate, 8 ml concentrated H2SO4, 92 ml H2O). This stain was modified from Aebersold et al. (1987).
This stain was modified from Shaklee and Keenan (1986). This enzyme is known only from invertebrates.

Subunit structure variable The terms dipeptidase (EC 3.4.13.11) and tripeptide aminopeptidase (EC 3.4.11.4) are recommended over the more generic term peptidase (IUBNC, 1984). However, the multiple substrate affinities of these enzymes and problematic assignment of their homology makes the exact assignment of EC numbers difficult. Exceptions are those of proline dipeptidase (Pep-D; EC 3.4.13.9) and perhaps cytosol aminopeptidase (Pep-E; EC 3.4.11.1). Recommended substrates for the resolution of products of seven peptidase loci described from vertebrates follow Frick (1981, 1983, personal communication), Richardson et al. (1986), and/or Matson (1989). The tissue distribution of these gene products is often restricted (e.g., Frick, 1983; Matson, 1989). Pep-B frequently appears upon staining for Pep-F, as does Pep-C on Pep-A making counterstaining necessary.

Pep-A: glycyl-L-leucine [dimer] $$$
Pep-B: L-leucylglycylglycine [monomer or dimer] $$$
Pep-C: glycyl-L-leucine or DL-alanyl-DL-methionine [monomer] $$$
Pep-D: L-phenylalanyl-L-proline [dimer] $$$
Pep-E: see cytosol aminopeptidase [monomer]
Pep-F: L-leucyl-L-leucyl-L-leucine [subunit structure unknown] $$$$
Pep-S: glycyl-L-leucine [tetramer?] $$$

Snake venom is used in this stain as a source of L-amino acid oxidase. The substitution of a less purified but adequate source of this enzyme (via snake venom) was advantageous financially at one time but may no longer be so. Several stain formulas list specifically the venom of the eastern diamondback rattlesnake (Crotalus adamanteus) for use in peptidase stains (e.g., Siciliano and Shaw, 1976). We have tried the less expensive venom of a closely related rattlesnake, the western diamondback (C. atrox, recommended herein), and found that it yielded equivalent results. These stains may disappear quickly and thus should be scored and photographed promptly. For rapidly developing peptidases (e.g., Pep-A), it may be advantageous to incubate these gels at room temperature to slow staining.

Subunit structure uncertain

3-amino-9-ethyl carbazole    0.04 g
N,N-dimethylformamide    2.5 ml

Then add:

0.05 M sodium acetate buffer, pH 5.0    5 ml
0.1 M calcium chloride    1 ml
3% hydrogen peroxide    1 ml

Incubate the gel slice in a refrigerator, usually for 30-60 min. This stain was modified from Shaw and Prasad (1970) by D.E. Soltis et al. (1983). See Brewer (1970) and Siciliano and Shaw (1976) for additional PER stains.
Trimer This enzyme was known formerly as nucleoside phosphorylase (NP). This stain should be prepared as an agar overlay

This stain was described by Buth and Murphy (1980) as modified from Brewer (1970). This stain may also resolve adenylate kinase gene products.
buffer systems listed herein followed by comparisons using similar buffers, if necessary, for optimal resolution of enzymes. Additional buffers are listed by Brewer (1970), Selander et al. (1971), Clayton and Tretiak (1972), Harris and Hopkinson (1976), Steiner and Joslyn (1979), Shaklee and Tamaru (1981), Conkle et al. (1982), D.E. Soltis et al. (1983), Cheliak and Pitel (1984), Werth (1985), Micales et al. (1986), Selander et al. (1986), Shaklee and Keenan (1986), Aebersold et al. (1987), and Morizot and Schmidt (1990). Several of those listed for use in cellulose acetate electrophoresis by Richardson et al. (1986:153-154) may be applicable to starch gel work. The reader must remain aware that buffer formulas are usually derived empirically and additional modification should be encouraged. Sambrook et al. (1989) presented a useful appendix for the preparation of phosphate buffers.
Our buffer accounts include a descriptive name of the system, molarities of components in solution, exact gram measures of components in one liter equivalents, formulas for stock solutions as well as dilutions for electrode chambers and the gels, and references. We have resisted listing the electrical potential for each of these buffer systems, although many other compilations of buffer formulas provide such information. Among these, only Brewer (1970) identified correctly the fact that such potentials are related to the length of the gel mold and should be expressed as volts per linear centimeter of gel. We find most published voltages to be at or beyond the high end of applicability and improved resolution (as well as lab planning) can be gained with electrophoretic runs for longer duration at lower voltages.

is not necessary to use non-denatured ethanol although this may be preferable to keep methyl and isopropyl alcohol out of the gel.

Stock solution:
(0.04 M) citric acid monohydrate    8.4 g/liter
Adjust to desired pH by adding ≈10-15 ml/liter N-(3-aminopropyl)-morpholine
Electrode: Undiluted stock solution
Gel: 1:19 dilution of stock solution

These gels are hazardous and should be handled only with protective gloves. This buffer was described by Clayton and Tretiak (1972). Werth (1985), Shaklee and Keenan (1986), and Aebersold et al. (1987) recommended its use at pH 6.1, 6.0, and 7.0, respectively. Its range of use may be pH 6.0-8.0 (D.E. Campton, personal communication). Aebersold et al. (1987) suggested the inclusion of 0.01 M EDTA in the stock solution.

Amine-Citrate (Propanol)
Stock solution:
(0.04 M) citric acid monohydrate    8.4 g/liter
Adjust to the desired pH by adding ≈10-15 ml/liter bis(dimethylamino)-2-propanol
Electrode: Undiluted stock solution
Gel: 1:19 dilution of stock solution

This buffer was described by Clayton and Tretiak (1972). It may be optimal at pH 57.5.
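The arithmetic behind these buffer accounts (molarity to grams per liter, stock dilution, and potential per centimeter of gel) can be sketched as follows. This is a minimal illustration, not part of the original appendix: the function names are hypothetical, the molecular weights are standard reference values, a 1:N dilution is read as 1 part stock plus N parts water (consistent with the "50 ml stock solution + 200 ml H2O (1:4 dilution)" entries in these accounts), and the 300-V, 20-cm run is an invented example.

```python
# Molecular weights (g/mol) are standard reference values, not from the appendix.
MOLECULAR_WEIGHT = {
    "citric acid monohydrate": 210.14,
    "Tris": 121.14,
}


def grams_per_liter(component, molarity):
    """Mass of a component needed for one liter at the given molarity."""
    return molarity * MOLECULAR_WEIGHT[component]


def split_dilution(final_volume_ml, parts_water):
    """Volumes for a 1:N dilution, read as 1 part stock plus N parts water."""
    stock_ml = final_volume_ml / (1 + parts_water)
    return stock_ml, final_volume_ml - stock_ml


def volts_per_cm(applied_volts, gel_length_cm):
    """Express an applied potential per linear centimeter of gel."""
    if gel_length_cm <= 0:
        raise ValueError("gel length must be positive")
    return applied_volts / gel_length_cm


citrate_g = grams_per_liter("citric acid monohydrate", 0.04)  # about 8.4 g/liter
stock_ml, water_ml = split_dilution(500, 19)  # 25 ml stock + 475 ml water
gradient = volts_per_cm(300, 20)  # hypothetical run: 15 V/cm

print(round(citrate_g, 1), stock_ml, water_ml, gradient)
```

Reporting the gradient rather than the raw voltage makes a run comparable between gel molds of different lengths, which is the point attributed to Brewer (1970) above.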
Stock solution A:
(0.19 M) boric acid    11.8 g/liter
(0.03 M) lithium hydroxide (LiOH·H2O)    1.26 g/liter
Adjust to pH 8.1
Stock solution B:
(0.05 M) Tris    6.06 g/liter
(0.008 M) citric acid monohydrate    1.68 g/liter
Adjust to pH 8.4
Electrode: Undiluted stock solution A
Gel: 1:9 mixture of stock solutions A:B, final pH 8.3

This discontinuous buffer is the lithium hydroxide buffer described by Selander et al. (1971),

Stock solution:
(0.90 M) Tris    9.0 g/liter
(0.50 M) boric acid    0.9 g/liter
(0.02 M) disodium EDTA    6.7 g/liter
Adjust to pH 8.6 with NaOH (pellets)
Electrode:
Anode: 35 ml stock solution + 215 ml H2O (1:6 dilution)
Cathode: 50 ml stock solution + 200 ml H2O (1:4 dilution)
Gel: 1:19 dilution of stock solution

This is a modification of the buffer described by Boyer et al. (1963) referred to as EBT by F.R. Wilson et al. (1973) and Shaklee and Keenan (1986), and as TBE by Aebersold et al. (1987). Shaklee and Keenan (1986) recommended the use of 7.4 g/liter tetrasodium EDTA in this buffer.
Stock solution:
(0.50 M) Tris    60.6 g/liter
(0.65 M) boric acid    40.2 g/liter
(0.02 M) disodium EDTA    6.7 g/liter
Adjust to pH 8.0
Electrode: Undiluted stock solution
Gel: 1:9 dilution of stock solution

These gels tend to be thick and are particularly difficult to aspirate, pour, and slice. Stock solutions are not suitable for long-term storage and better results may be found using fresh solutions. This is the TVB (Tris-versene-borate) buffer of Selander et al. (1971) and Siciliano and Shaw (1976). Another version of this buffer is described by Brewer (1970): 0.21 M Tris, 0.15 M boric acid, and 6 mM disodium EDTA adjusted to pH 8.0 for the electrode and 21 mM Tris, 20 mM boric acid, and 0.68 mM disodium EDTA adjusted to pH 8.6 for the gel. Werth (1985) described another system, termed "salamander 0" and attributed to S.I. Guttman, which uses the same stock solution and concentrations for the electrode and gel buffers: 84 mM Tris, 7.9 mM boric acid, and 0.86 mM disodium EDTA adjusted to pH 5.1 with HCl.

Tris-Borate-EDTA-Lithium
Stock solution:
(0.9 M) Tris    109 g/liter
(0.4 M) lithium hydroxide (LiOH·H2O)    16.8 g/liter
(0.5 M) boric acid    30.9 g/liter
(0.1 M) EDTA free acid    29.2 g
H2O    to 1000 ml
Adjust to pH 9.1 with NaOH
Electrode:
stock solution    225 ml
sucrose    100 g
H2O    to 2000 ml
Gel:
sucrose
H2O

This buffer system was described by B.J. Turner (1973).

Tris-Citrate II
Stock solution:
(0.687 M) Tris    83.2 g/liter
(0.157 M) citric acid monohydrate    33.0 g/liter
Adjust to pH 8.0
Electrode: Undiluted stock solution
Gel: 1:29 dilution of stock solution

This is the continuous Tris-citrate II buffer of Selander et al. (1971). D.E. Soltis et al. (1983) listed other modifications of the Tris-citrate buffers of Shaw and Prasad (1970) including (1) 0.135 M Tris, 0.032 M citric acid, pH 8.0, diluted 1:14 for the gel, (2) 0.135 M Tris, 0.017 M citric acid, pH 8.5, diluted 1:14 for the gel, (3) 0.223 M Tris, 0.086 M citric acid, pH 7.5, diluted 1:27.5 for the gel, and (4) 0.223 M Tris, 0.065 M citric acid, pH 7.2, diluted 1:27.5 for the gel. Other variations include 0.13 M Tris, 0.043 M citric acid, pH 7.0, diluted 1:14 for the gel (Siciliano and Shaw, 1976), 0.094 M Tris, 0.0235 M citric acid, pH 8.6, diluted 1:5 for the gel (Harris and Hopkinson, 1976), and 0.22 M Tris, 0.086 M citric acid, pH 5.8, diluted 1:27.5 for the gel (Shaklee and Keenan, 1986).

Stock solution:
(0.75 M) Tris    90.8 g/liter
(0.25 M) citric acid monohydrate    52.5 g/liter
Adjust to pH 7.0 with NaOH (pellets).
Electrode:
Anode: 35 ml stock solution + 215 ml H2O (1:6 dilution)
Cathode: 50 ml stock solution + 200 ml H2O (1:4 dilution)
Gel:
General Principles
This chapter concerns the analysis of microscopically visible aspects of the molecular structure of chromosomes. The term "chromosome" was introduced in 1888 by Wilhelm Waldeyer, and the chromosomal theory of inheritance was put forward and elaborated by Theodore Boveri, Walter S. Sutton, and Thomas H. Morgan in the first part of this century. Ever since that time the study of chromosomes (known as either karyology or cytogenetics) has occupied a prominent place in genetics in both clinical and academic applications, as well as in comparative biology and phylogenetic studies. Comparative cytogenetics is thus an old field with diverse schools of interpretation concerning chromosome structure, function, and evolution.
The history of cytogenetic research can be divided into several eras, each cou-
pled with major technological innovations that triggered revolutions in analytical
approaches (Hsu, 1979). The modern era in cytogenetics, including the incorpo-
ration of molecular methods, was initiated by the development of four main tech-
nological breakthroughs: (1) the discovery that hypotonic treatment spreads
metaphase chromosomes, allowing accurate assessments of chromosome num-
bers and morphology; (2) the development of chromosome banding techniques,
which allows the identification of homologs (within karyotypes of the same
122 Chapter 5 / Sessions
Figure 1 Radioisotopic in situ hybridization of ribosomal probe to chromosomes of salamanders of the genus Plethodon. (A) P. dunni showing labeling over the short arm of a medium-sized submetacentric chromosome. (B) P. vehiculum showing four distinctly labeled sites. (C) P. glutinosus showing four sites of hybridization, including one homologous pair (chromosome no. 2) and two single chromosomes. (D) P. larselli showing heavy labeling on the smallest pair of chromosomes (no. 14) and scattered labeling over all the chromosomes. (From Macgregor and Sherwood, 1979, with permission.)
species) and homoeologs (between karyotypes of different species); (3) the development of techniques for in situ hybridization (ISH) of nucleic acid probes to cytological preparations of chromosomes, by which specific DNA sequences can be localized to particular chromosomes and parts of chromosomes (Gall and Pardue, 1969; John et al., 1969; Hsu, 1979; Macgregor, 1993); and (4) the use of immunochemistry, in conjunction with ISH, to allow the non-radioisotopic detection of hybridized probes with various fluorochromes in a process known as chromosome painting (Plate 1), used not only for mapping sequences on chromosomes, but for identifying chromosomal homologies (synteny) between species (Lichter and Ward, 1990; Trask, 1991; Sasavage, 1992; Wienberg et al., 1992; Luke and Verma, 1993; Therman and Susman, 1993).
Chromosomes: Molecular Cytogenetics 123
The field of molecular cytogenetics is centered on the technique of in situ hybridization of nucleic acids using radioactive or non-radioactive probes, and this methodology will be discussed in detail in this chapter. The basis of in situ nucleic acid hybridization is the annealing of labeled, mobile probe molecules and stationary target molecules to form base-paired duplexes. Comparative studies using in situ hybridization have traditionally involved locating repetitive sequences, such as satellite DNA, ribosomal gene clusters, or the extensively reduplicated genes of polytene chromosomes using highly radioactive molecular probes detected via autoradiography (Plate 1). In situ hybridization can also be used to locate specific RNA transcripts on lampbrush chromosomes (Figure 1; Diaz et al., 1981; Varley et al., 1980), and techniques have been developed for reliably locating single-copy DNA sequences on mitotic chromosomes (Figure 2; Harper and Saunders, 1984; Steinmuller et al., 1993). Almost all molecular cytogenetics done these days utilizes non-isotopic in situ hybridization (NISH) techniques. The most prominent of these is fluorescence in situ hybridization (FISH, Plate 1), in which non-radioactive, biotinylated hybridized probes are visualized via fluorescently labeled monoclonal and/or polyclonal antibodies which, when amplified with avidin-biotin and subjected to computer-assisted image processing, result in a degree of specificity, resolution, and versatility that exceeds even the best autoradiographic ISH (Langer et al., 1981; Manuelidis et al., 1982; Pinkel et al., 1986; Frommer et al., 1988; C.A. Porter et al., 1991, 1994; Therman and Susman, 1993). High-resolution detection of biotinylated probes can be achieved by using FISH in conjunction with confocal laser scanning microscopy (CLSM), or by using gold-conjugated antibodies and transmission electron microscopy (TEMISH; Fetni et al., 1992;
[Figure 2: plot of expected and observed silver grains; horizontal axis: chromosome number.]
Figure 2 Localization of human insulin gene. Composite label from 35 cells hybridized with tritiated probe (0.2 pg/ml for 11 hr and exposed for 11 days). Label is concentrated at telomere of short arm of chromosome 11. Expected number of silver grains calculated as silver grains per unit length, assuming random distribution. (After Harper et al., 1981.)
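The "expected" values in a plot of this kind follow from the stated assumption that grains fall at random, so each chromosome receives grains in proportion to its length. A minimal sketch of that calculation is given below; the chromosome names, relative lengths, and grain counts are hypothetical and are not data from Harper et al. (1981).

```python
def expected_grains(total_grains, chromosome_lengths):
    """Distribute total_grains across chromosomes in proportion to length."""
    total_length = sum(chromosome_lengths.values())
    if total_length <= 0:
        raise ValueError("total chromosome length must be positive")
    return {name: total_grains * length / total_length
            for name, length in chromosome_lengths.items()}


def enrichment(observed, expected):
    """Ratio of observed to expected grains for each chromosome."""
    return {name: observed[name] / expected[name] for name in expected}


# Hypothetical relative lengths and observed grain counts (illustration only).
lengths = {"chr A": 4.0, "chr B": 2.0, "chr C": 2.0}
observed = {"chr A": 30, "chr B": 40, "chr C": 10}

expected = expected_grains(sum(observed.values()), lengths)
ratios = enrichment(observed, expected)
for name in lengths:
    print(name, expected[name], ratios[name])
```

A chromosome whose ratio is well above 1 carries more label than random scatter would predict, which is the pattern used to assign a probe to a chromosome; a formal test of significance would be a separate step.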
concentrations in egg white (streptavidin, a simi- ture conditions have been used for various
lar protein made by the bacterium Streptomyces species, and often the tissue culture requirements
nvidirzii, is sometimes used instead) that non-im- must be worked out for particular species of m-
munologically binds four molecules of t l ~ evita- terest. One great advantage of in vitro culture is
min biotin. This property of avidin-biotin binding that it may be possible to synchronize cell cycles
allows the amplification of hybridization signal to increase the yield of cl~romosomes,decrease the
when used in co~~junction with biotin-labeled (= variation in chromosome condensation between
biotinylated) probes, and anti-avidin and/or spreads, and decrease the required dose of the im-
anti-biotin antibodies which are themselves con- totrc arrest agent (Watt and Stephen, 1986).
jugated with biotin or fluorochromes. In general, both in vivo and in vitro chromo-
some cultures require a mitotic spindle inhibitor
Comparison of the Primary Methods to block cells in mitotic metaphase. The most com-
monly used spindle inhibitors are colchicine, its
Prepamtion of Mitotic Metaphase Chvomosomes synthetic analog colcemid (deacetylmethyl-
The preparation of useful spreads of mitotic colchicine),and vinblastine (Tji0 and Levan, 1956;
metaphase chromosomes involves five steps: (1) Macgregor and Varley, 1983; Watt and Stephen,
selection of tissues with a high mitotic activity (or 1986).For in vivo culture, relatively high conccn-
stimulation of such activity), (2) in vivo or in vitro trations of colchicine arc injected directly into the
treatment with a mitotic arresting agent (with or body cavity or under the skin (colcemid is sub-
without cell cycle synchronization), (3) hypotonic stantially more potent than colchicine and is uscd
treatment of tissues or cells, (4) fixing (and stor- at lower concentrations). Some organisms, such as
mg) tissues or cells, and (5) making permanent aquatic amphibian larvae, or tissues, such as plant
chromosome preparations on slides. Specific pro- root tips, can simply be immersed in a solution of
tocols for the production of mitotic chromosome colchicine. The optimal treatment time depends
spreads are given later in this chapter. on the cell cycle of the species used and on the de-
The simplest method for obtaining mitotic sired level of chromosome contraction. Cell cycle
chromosomes is to select a tissue that has an in- time is proportional to genome size (C-value, the
trinsically high rate of mitotic cell division in vivo. haploid amount of nuclear DNA), at least in poik-
Root tips in plants, and developing embryos, lar- ilotherms. The treatment time is short (1-2 hr) for
vae, or regenerating blastemas in animals are organisms with small C-values (most vertebrates
good in this regard. In adult vertebrates, high mi- and invertebrates), but is much longer (24-72 hr)
totic rates may be found in bone marrow, intesti- for species with very large C-values (e.g., lung-
nal epithelium, corneal epithelium, kidneys, fish, salamanders, and certain species of flower-
spleen, and gonads, depending on the species. ing plants). Much smaller amounts of colchicine
Mitotic proliferation can sometimes be stimulated (or colcemid) are needed for cells in tissue culture.
in vivo by subcutaneous or intraperitoneal injec-
tion of a mitogen such as phytohemagglutinin Squash and Splash Techniques
(PHA) or pokeweed mitogen (PWM), or even ac- There are two main techniques for the production
tivated yeast suspension. Use of these tissues of permanent chromosome preparations suitable
avoids the need for tissue culture, which can be for molecular hybridization studies: the "squash"
expensive and unpredictably time consuming and "splash" techniques. Both are designed to
when many species are studied. The disadvantage achieve optimal flattening and spreading of cyto-
of the in vivo technique is that specimens gener- logical material on microscope slides. In the
ally must be killed to harvest the tissue. squash technique, small pieces of tissue (thick cell
In vitro methods involve culturing peripheral suspensions can also be used) are macerated
blood or cells from explants of various other kinds and/or finely minced on a slide and then firmly
of tissues (see Freshney, 1994, for general tissue squashed beneath a siliconized coverslip. In the
culture methods). Many different media and cul- splash technique, a thick suspension of cells is
128 Chapter 5 / Sessions
placed onto a slide from a pipette, usually from a obtained during the process of screening using
distance, and spreading of cells and chromosomes phase contrast optics. A disadvantage of the
occurs via surface tension. For both techniques it squash technique is that the chromosomes can be
is critical that the tissues be exposed to hypotonic damaged or lost during the process of making the
solution (either distilled water or dilute KCl) be- slides permanent. Another common problem with
fore fixation, and then fixed in freshly prepared, the squash technique is that the preparation can
ice-cold 3:1 fixative made with 3 parts absolute be ruined by the slightest sideways movement of
ethanol (for squashing) or methanol (for splash- the coverslip during squashing or by the inclusion
ing) and 1 part glacial acetic acid. of large bits of tissue, lint, or air bubbles beneath
A useful feature of the 3:1 fixative is that tis- the coverslip. The squash technique works best
sues (or cell pellets) can be stored in it for years so for organisms with very large chromosomes, es-
long as they are kept in tightly sealed vials at pecially salamanders, plants, and insect polytene
-20°C. If such long-term storage is necessary (or chromosomes, and is not recommended for
desirable) it is important to make sure that the tis- species with very small chromosomes such as
sues have been well fixed in at least two changes mammals, birds, and reptiles because of the diffi-
of 3:1 for at least 15 min (to remove all water) and culty in obtaining sufficiently flattened chromo-
then stored in fresh 3:1. For preparations that will somes.
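As a worked illustration of the 3:1 fixative described above (3 parts alcohol to 1 part glacial acetic acid, with ethanol for squashing and methanol for splashing), the following minimal sketch splits a desired total volume into its two components. The function names and the 40-ml example volume are assumptions chosen for illustration only; they are not part of any protocol in this chapter.

```python
# Hypothetical helper: names and example volume are illustrative assumptions.
ALCOHOL_BY_TECHNIQUE = {
    "squash": "absolute ethanol",  # ethanol is named in the text for squashing
    "splash": "methanol",          # methanol is named in the text for splashing
}


def fixative_volumes(total_ml, alcohol_parts=3, acid_parts=1):
    """Return (alcohol_ml, acid_ml) for a fixative mixed by volume parts."""
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    total_parts = alcohol_parts + acid_parts
    alcohol_ml = total_ml * alcohol_parts / total_parts
    acid_ml = total_ml * acid_parts / total_parts
    return alcohol_ml, acid_ml


def describe_fixative(total_ml, technique):
    """Describe a freshly mixed 3:1 fixative for the given technique."""
    alcohol_ml, acid_ml = fixative_volumes(total_ml)
    alcohol = ALCOHOL_BY_TECHNIQUE[technique]
    return f"{alcohol_ml:g} ml {alcohol} + {acid_ml:g} ml glacial acetic acid"


print(describe_fixative(40, "splash"))
```

Because the text stresses that the fixative must be freshly prepared, a calculation of this kind is mainly useful for mixing only as much as a single session requires.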
eventually be used for in situ hybridization, it is The splash technique involves preparing a
advisable to keep the 3:1 fixative ice-cold to mini- suspension of cells fixed in 3:1 methanol:acetic
mize hydrolysis. Storage at -20°C is necessary be- acid. Methanol is used because it evaporates
cause this fixative decomposes rapidly at room faster than ethanol. Cells are collected from a cul-
temperature. Cytological tissues fixed and stored ture medium, dispersed, and then incubated in a
in this manner have been used successfully not hypotonic solution (0.075 M KCl). The cells are
only for in situ hybridization but also for the ex- then centrifuged out of the hypotonic solution
traction of high-molecular-weight DNA se- and re-suspended in fixative (preferably ice-cold),
quences (P.E. Barker et al., 1986). washed several times in fresh fixative, and finally
To make slide preparations using the squash re-suspended in a small volume of fixative. The
technique, small pieces of tissue are removed resulting concentrated cell suspension can either
from the 3:1 fixative and briefly soaked in 45% be stored at -20°C, or used immediately to make
acetic acid (this treatment hydrolyzes cytoplasmic slides. The concentrated cell suspension is pipet-
components and can eventually reduce the tissue ted (splashed) onto slides using various tech-
to a nuclear suspension). Plant cells usually have niques designed to maximize spreading of cells
to be hydrolyzed in warm HCl before the 45% and chromosomes. One commonly used method
acetic acid step, to soften the cell wall. The soft- is to splash several drops of the cell suspension
ened tissue is then removed in a small drop of onto ice-cold slides wet with distilled water from
acetic acid to a very clean, subbed (gelatinized) a height of 0.5 m or more and drying them on a
microscope slide, minced as finely as possible, slide warmer (40°C). Another technique is to
covered with a siliconized coverslip, and flame-dry the slides by holding the slide over a
squashed firmly with the thumb on a cushion of bunsen burner and letting the methanol ignite
absorbent paper. The preparation is now either and burn off. Yet another technique is to use a
examined directly, or made permanent by freez- 1-ml pipetter and a plastic pipette tip to pipette a
ing on dry ice. The slides can then be stored in- cell suspension up and down at several different
definitely if kept desiccated at 4°C. The main ad- spots on a slide that has been warmed to 60°C on
vantages of the squash technique are that it is a a slide warmer. Each time the cell suspension is
very quick and efficient way to analyze a particu- drawn back into the pipette, cells are left sticking
lar specimen without building up a large number to the slide in concentric rings. The suspension is
of unnecessary slides. Also, excellent photomicro- drawn back into the pipette before moving to the
graphs of selected chromosome spreads can be next spot. An advantage of this latter technique is
Plate 1 Examples of the use of FISH and confocal laser scanning microscopy (CLSM).
(A-E) Marsupial (PtK1) chromosomes observed with CLSM. (From Robert-Fortel, 1993.)
(A) Propidium iodide staining of the X chromosome shows secondary constriction (arrow).
(B) Fluorescence in situ hybridization (FISH) with rDNA probe with labeling superimposed
on phase contrast. (C) Silver staining visualized by reflection imaging. (D) Higher
magnification of the X chromosome with FISH. (E) Silver staining of the X chromosome.
(F) 5S rDNA hybridized to chromosomes of the tetraploid frog, Odontophrynus americanus,
observed with standard epifluorescent microscope. (Compliments of Maria Jose Martinez
Exposito and Martina Guttenbach, Institute of Human Genetics, University of Würzburg.)
Chromosomes: Molecular Cytogenetics 129
that several spots can be placed in controlled po- than cell number, and involves endoreplication of
sitions on a single slide, which facilitates screen- their chromosomes. Polytene chromosomes are
ing the slides for good chromosome spreads (M. easiest to prepare from salivary glands, which are
Schmid, personal communication). Splash slide found near the anterior end of the larvae. The
preparations can be stored indefinitely if they are glands can be exposed quickly by removing the
kept desiccated at 4°C. larva's head with watchmaker's forceps or nee-
A disadvantage of the splash technique is that dles and removing the adhering fat bodies; the
it is sometimes difficult to see or photograph good glands can then be squashed on a subbed slide
examples of chromosome morphology until after under a siliconized coverslip.
the preparations have been stained and/or cover-
slipped (although chromosome spreads can be lo- Preparation of Meiotic Chromosomes
cated using phase contrast optics or using a defo- from Spermatocytes
cused condenser under bright field optics). Meiotic chromosome preparations can be rela-
Whichever technique is used to obtain chromo- tively easily made from testes of most animals
some preparations for in situ hybridization, it is and from pollen mother cells (PMCs) in plants.
important that the actual preparations are made The squash technique is used for amphibians and
near one end of the slide for ease of handling later, plants with large chromosomes, while the splash
especially during autoradiography. technique usually is used for amniotes, which
Cytological preparations for in situ hy- have much smaller chromosomes. For amphib-
bridization should be made on slides that have ians, birds, fish, and reptiles, testes are simply re-
been coated with a thin layer of gelatin (subbed) moved, cut or sliced, and fixed directly in 3:1 fixa-
to minimize loss of material during processing. tive. In salamanders, analysis of meiosis is
Subbed slides are particularly important for au- facilitated by the fact that meiosis occurs in a cau-
toradiography to prevent the nuclear track emul- docephalic wave in the testes (Kezer et al., 1989).
sion from slipping during developing, fixing, A brief hypotonic treatment (e.g., 10 min in dis-
washing, and staining. tilled water) appears to strip away enough of the
chromatin matrix to visualize all four individual
Preparation of Polytene Chromosomes chromatids in diplotene bivalents (Kezer et al.,
Polytene chromosomes are somatic chromosomes 1989). In mammals, the gametes are produced in
that have undergone many rounds of endorepli- seminiferous tubules that contain all stages of
cation (DNA replication without division of the meiosis. These tubules can be teased out of dis-
cell or nucleus) such that each chromosomal ele- sected testes into a dish of hypotonic solution,
ment consists of hundreds to thousands of unsep- fixed in 3:1 methanol:acetic acid, and hydrolyzed
arated chromatids. Polytene chromosomes are to a cell suspension which is used to make splash
found in the cells of dipteran insect larvae, in preparations.
collembolans (springtails), and in certain other in- The chromosomes of certain species of insects
vertebrates (M.J.D. White, 1973). The familiar (e.g., grasshoppers), amphibians (e.g., salaman-
bands of polytene chromosomes are formed by ders), fish (e.g., lungfish), and plants (e.g., lilies
chromomeres (densely packed chromatid fibers) and orchids) are large enough to provide easily
that are found along the length of each chromatid. visible meiotic configurations, such as bivalents at
Polytene chromosomes are particularly useful for pachytene and diplotene of prophase I and
gene mapping and comparative studies because metaphase I. In organisms with much smaller
of their large size and banding patterns. chromosomes the analysis of meiotic configura-
Polytene chromosomes are very easy to pre- tions is facilitated by preparing silver-stained
pare from dipteran larvae. Good examples are synaptonemal complexes (SCs) of pachytene bi-
midges of the genus Chironomus and fruitflies of valents, which are examined with electron mi-
the genus Drosophila. In both of these organisms, croscopy (Jones and de Azkue, 1993; Macgregor,
larval growth occurs by increase in cell size rather 1993; Peterson et al., 1994). This technique in-
volves lysing cells in a mild detergent to cause thread of DNA double helix, extend from many of
surface spreading of pachytene bivalents at the the chromomeres and are sites of active RNA syn-
air-water interface. These are then dried down thesis (Callan, 1986).
onto a plastic film on a glass slide, fixed in The largest and most easily studied lamp-
paraformaldehyde, and stained with a concen- brush chromosomes occur in the oocytes of sala-
trated AgNO3 solution. Selected regions of the manders. A generalized method for obtaining
plastic film are floated onto a water bath and lampbrush chromosomes from amphibian
picked up onto copper grids for electron mi- oocytes is included in the protocol section of this
croscopy. The advantage of this technique is that it chapter (Protocol 12; see also Callan et al., 1987;
allows very high resolution of synaptic configura- Macgregor, 1993). The procedure involves manu-
tions, including the XY bivalent, and has been ally isolating and opening the nucleus ("germinal
particularly useful in analyzing meiosis in translo- vesicle") of immature oocytes with needles and/
cation and inversion heterozygotes (Johannisson or very sharp watchmaker's forceps in an un-
and Winking, 1994). buffered salt solution ("isolation medium," IM)
(Figure 8). The nuclear contents (including the
Preparation of Lampbrush Chromosomes lampbrush chromosomes) are then transferred to
Lampbrush chromosomes represent bivalents at and dispersed in "dispersal medium" (DM) in an
diplotene stage in female meiotic cells; they are observation chamber on a specially designed
found in the oocytes of most animals (see Callan, slide. The optimal salt concentration of the IM
1986, for a recent review of lampbrush chromo- and DM varies among species and must be deter-
some structure and function). Lampbrush chro- mined empirically. IM consisting of a 5:1 mixture
mosomes consist of two duplicated homologous of 0.1 M KCl and 0.1 M NaCl and DM consisting
chromosomes held together at regions of crossing of IM + 0.5% paraformaldehyde works fine for
chromosome of the bivalent consists of an axis most amphibians (Macgregor, 1993).
cl-iromosome of the bivalent consists of an axis The lampbrush chromosomes gradually sink
formed by the two closely associated sister chro- to the bottom of the observation chamber and can
matids that connects a series of ellipsoid chromo- then be observed and photographed with phase
meres. Lateral loops, each consisting of a single contrast optics. The traditional observation cham-
ber consists of a regular glass microscope slide
with a hole bored in it and a coverslip sealed to
Figure 7 Two lampbrush bivalents of Ambystoma
macrodactylum. (From Kezer et al., 1980.)
Figure 8 Steps in the preparation of lampbrush chromosomes. Oocytes are isolated in
isolation medium in a dish (A). An oocyte is opened with fine forceps (B) and the yolk
is extruded (C), revealing the oocyte nucleus. The nucleus is sucked in and out of a
pipette to remove yolk (D), and then transferred to a dispersion chamber (E, F). The
nuclear envelope is peeled off the nucleus in the dispersion chamber (G), releasing the
nuclear contents (H). Finally, a coverslip is added to the preparation (I).
the bottom with paraffin wax. The disadvantage seal it to the top of a slide with paraffin (Figure 8).
of this kind of chamber is that once settled, the For in situ hybridization and/or immunocyto-
chromosomes can only be observed with an in- chemistry, the preparation must be centrifuged to
verted phase contrast microscope. A simple alter- firmly attach the lampbrush chromosomes to the
native, allowing the use of a regular phase con- coverslip at the bottom of the chamber. One way
trast microscope, is to punch a hole in a plastic to do this is to use a plexiglass disc observation
coverslip (using a regular paper hole punch) and chamber designed so that it will fit on an epoxy
1986). The disadvantage of Q-banding is that the
bands are visible only with UV optics and they
fade quite rapidly. G-banding is also simple and
involves brief treatment with trypsin or NaOH
and staining with Giemsa (or similar dyes) in a
phosphate buffer. The result is alternating light
and dark bands (G-bands, or GTG bands), the lat-
ter representing primarily AT-rich regions and
thus corresponding to most Q-bands.
Whereas Q- and G-banding require little or no
pretreatment of the chromosomes, R-banding and
C-banding require a stringent extraction step re-
sulting in significant loss of chromosomal DNA
(at least 60%; Pathak and Arrighi, 1973). R-band-
ing, or "reverse banding", involves pretreatment
with hot (80-90°C) alkali and subsequent staining
with Giemsa (RHG bands) or with a fluorochrome
such as acridine orange (RFA bands). This results
in a banding pattern that is the reverse of G-band-
ing (RHG bands) or of Q-banding (RFA bands). A
Figure 9 Centrifuge tube fitted with an epoxy plug for much less stringent method for obtaining fluores-
centrifuging lampbrush preparations (Protocol 12). (A) cent R-bands is described below. For C-bands,
Dispersion chamber consisting of a plastic disk with a chromosomes are treated with a strong base at an
7-mm hole bored in the center and a coverslip attached
to the bottom with paraffin. (B) Polymerized epoxy elevated temperature, incubated in a sodium cit-
plug in 30-ml centrifuge tube. (C) Dispersion chamber rate solution at high temperature, and stained in
is positioned on the epoxy plug by raising the plug a concentrated Giemsa solution. This results in the
with a probe inserted through a hole in the bottom of extraction of almost all of the non-C-band chro-
the centrifuge tube.
matin, leaving only constitutive heterochromatin
(Figure 10), which usually contains rapidly reas-
sociating repeated sequences (Comings et al.,
plug inside a large centrifuge tube (Figure 9). Af- 1973). Methods for Q-, G-, R-, and C-banding are
ter centrifugation, the dispersal medium can be given in the protocol section.
removed and the lampbrush chromosomes fixed Various specialized banding procedures have
and dehydrated. also been developed (see Rooney and Czepul-
kowski, 1986; Verma and Babu, 1989; Therman and
Chromosome Banding Susman, 1993). Some of the most useful methods
The four most common methods for banding are fluorescence banding using various fluo-
chromosomes are Q-banding, G-banding, R- rochromes (e.g., chromomycin A3, Hoechst 33258,
banding, and C-banding (Bickmore and Sumner, and DAPI), differential-replication banding using
1989; Sumner, 1990; Therman and Susman, 1993). bromodeoxyuridine (BrdU), and staining for nu-
The simplest of these is Q-banding, which in- cleolar organizer region (NOR) using silver ni-
volves soaking the slides in a buffer and then trate. Chromomycin A3 can generate "R-bands" in
staining with quinacrine mustard or quinacrine mammalian chromosomes, and BrdU banding is
dihydrochloride. This produces fluorescent bands particularly useful for species that are difficult to
bel probes for nucleic acid hybridization are 32P,
125I, and 3H; choosing one for in situ hybridization
involves a trade-off between sensitivity and reso-
lution. 32P yields the highest specific activity and
is extensively used in transfer-hybridization ex-
periments (Chapter 8) but is not used in in situ
hybridization because its high energy disintegra-
tions result in poor resolution. Tritium is usually
considered the best radioisotope for in situ hy-
bridization because of the extremely low energy
of β particles emitted (0.018 MeV; Pardue, 1985).
The low-energy β particles emitted by 3H travel
only about 1 µm through autoradiographic emul-
sion, which results in close spatial correspondence
between silver grains and hybridized target (Mac-
gregor and Varley, 1983; Pardue, 1986). The disad-
vantage of using tritium is that it often necessi-
tates long autoradiographic exposure times,
depending on the specific radioactivity of the
probe. Shorter exposure times are achieved with
125I, but the radiation emitted is significantly more
energetic than that of tritium with the danger of
less precise resolution and higher background
(Pardue, 1986).
High specific radioactivities are achieved by
in vitro labeling (Macgregor and Varley, 1983; see
protocol section). In vitro transcription of RNA by
E. coli polymerase for in situ hybridization in-
Figure 13 Restriction endonuclease banding in apes
and human. Top row: human chromosome 1 and ape
homologs. Middle row: human chromosome 2 and ape
homologs. Bottom row: human Y chromosome and ape
homologs. Columns: Human, Chimpanzee, Gorilla,
Orangutan. (Reproduced with permission from Ferrucci
et al., 1987.)
cytological slide preparation, covering with a cov- 0.5 µm from the source, although 1-2% of the par-
erslip, and incubating in a moist chamber at 37°C ticles may travel up to 3 µm. 125I produces a
for approximately 12 hr. A suitable moist chamber greater scatter of grains, up to 16 µm from the
is a large petri dish, plastic freezer box, or similar source, although approximately 90% of the grains
container, lined with paper towels that have been will fall within a 3.5-µm radius and at least half of
soaked in buffer. Microscope slides are supported the grains will be at the same distance as those
on broken pipettes or glass rods laid side-by-side produced by tritium (A.S. Henderson, 1982).
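The scatter figures quoted above for tritium and 125I can be tabulated to show how the choice of isotope limits resolution. The following is a minimal sketch under stated assumptions: the class, field, and function names are hypothetical, and the "bulk radius" is simply the distance within which the text says most grains fall (0.5 µm for 3H, 3.5 µm for about 90% of 125I grains).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrainScatter:
    """Silver-grain scatter for one isotope, in micrometres (values from the text)."""
    isotope: str
    bulk_radius_um: float  # radius containing most of the grains
    max_radius_um: float   # farthest scatter mentioned


SCATTER = [
    GrainScatter("3H", bulk_radius_um=0.5, max_radius_um=3.0),
    GrainScatter("125I", bulk_radius_um=3.5, max_radius_um=16.0),
]


def isotopes_resolving(feature_um):
    """Isotopes whose bulk scatter radius does not exceed the feature size."""
    return [s.isotope for s in SCATTER if s.bulk_radius_um <= feature_um]


print(isotopes_resolving(1.0))  # a 1-µm feature
print(isotopes_resolving(5.0))  # a 5-µm feature
```

A comparison of this kind captures only the resolution side of the trade-off; exposure time, which favors 125I, would have to be weighed separately.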
on the bottom of the chamber. Several different nuclear track emulsions are
After an appropriate incubation time, the hy- available with different sensitivities. The single
bridized cytological preparations must be washed most important property of the emulsion is the in-
to remove probe molecules and their degradation trinsic background of silver grains formed in the
products that are not bound to complementary absence of exposure. Therefore, it is necessary to
sequences on nuclei or chromosomes. This proce- test each batch of new emulsion as it arrives from
dure is necessary to remove both weakly hy- the supplier by developing coated blank slides
bridized molecules and unbound or non-specifi- and examining them under a microscope. A back-
cally bound molecules and is essential to reduce ground of less than 50 grains per field of view un-
background signal. This step can involve different der a 100× oil objective is considered very good,
levels of stringency depending on the nature of but a grain count of over 100 is unacceptable
the hybrid. Washing usually involves incubation (Macgregor and Varley, 1983). Unacceptable emul-
in 2× SSC at a temperature that is slightly lower sion should be returned to the supplier for a re-
than that used for the hybridization reaction, in placement. Background can also be controlled by
addition to treatment with 50% formamide (or 5% careful handling and storage of the emulsion.
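The batch test for nuclear track emulsion described above reduces to a simple threshold rule, sketched below. The thresholds (fewer than 50 grains per field is very good, more than 100 is unacceptable) come from the text; averaging counts over several fields and the label "intermediate" for counts between the two thresholds are assumptions added here, since the text does not classify that range.

```python
from statistics import mean

VERY_GOOD_BELOW = 50       # grains per field, 100x oil objective (from the text)
UNACCEPTABLE_ABOVE = 100   # grains per field (from the text)


def classify_background(grain_counts):
    """Classify an emulsion batch from grain counts on developed blank slides.

    Averaging over fields and the "intermediate" label are assumptions.
    """
    if not grain_counts:
        raise ValueError("at least one field count is required")
    average = mean(grain_counts)
    if average < VERY_GOOD_BELOW:
        return "very good"
    if average > UNACCEPTABLE_ABOVE:
        return "unacceptable"
    return "intermediate"


print(classify_background([30, 42, 38]))
print(classify_background([120, 130]))
```

A batch classified as unacceptable would, following the text, be returned to the supplier for replacement.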
cold TCA, in the case of radioactive probes). If an After the slides are coated with emulsion, the
RNA probe is used, washing includes a mild di- preparations are exposed for a length of time that
gestion with ribonuclease. Following the washing must be determined empirically. The objective is
step, the slides are usually dehydrated in ethanol to obtain a sufficient number of silver grains to
and air-dried. The slides are now ready for au- detect hybridization unambiguously but not so
toradiography or immunochemistry. many that cytological detail is obscured. The best
exposure time for a particular in situ hybridiza-
Autoradiography tion experiment can be determined by including
Visualization of sites of hybridization between a several replicates or cytologically suboptimal
radioactive probe and its cytological target is preparations that can be used as test slides. One
achieved by autoradiography. This procedure in- test slide is developed at a given interval to deter-
volves coating the slides with a thin layer of nu- mine whether exposure has been adequate. Expo-
clear track photographic emulsion consisting of sure times can vary from days to months, de-
silver halide crystals suspended in a gelatin ma- pending on the concentration and specific activity
trix. Radiation from light or from radioactivity of the hybridized probe molecules.
sensitizes the crystals to form a "latent image,"
which is visualized when the crystals are reduced Chromosome Painting Using FISH
to metallic silver by photographic developer. The Standard ISH utilizing radioisotopes has been
resulting grain density is highest immediately largely supplanted by fluorescent in situ hy-
over the source of radiation and decreases sym- bridization (FISH), which is now widely ac-
metrically on each side of the source in increasing knowledged as the method of choice for localiz-
distance. The rate of decrease of grain density ing specific chromosomal sequences in clinical as
from the source determines the resolution, and is well as comparative cytogenetics. One great ad-
dependent on the radioisotope that is used. The vantage of FISH is that it is possible to map mul-
vast majority of silver grains produced by the β tiple probes simultaneously by detection with dif-
particles emitted by tritium will be located within ferent fluorochromes. Up to seven different
probes have been visualized simultaneously on a tension in the presence of labeled nucleotides
single preparation (Freshney, 1994). Another ad- (e.g., FITC-12-dUTP), using the chromosome tar-
vantage is that non-radioisotopic probes eliminate get as the template (Koch et al., 1991; Volpi and
the need for autoradiography and long exposure Baldini, 1993). This technique results in specific
times, so the procedure is relatively rapid. Also, banding patterns, and can be used to map both
the use of digital imaging systems, such as confo- dispersed and localized repeated sequences. The
cal laser scanning microscopy, makes FISH partic- advantages of PRINS over traditional ISH are that
ularly good for data manipulation and storage, it is very fast, allowing good signals to be ob-
with a degree of sensitivity and localization that tained from repeated DNA sequences in less than
cannot be achieved with standard isotopic ISH one hour and from unique DNA sequences in less
(Freshney, 1994; Plate 1). Finally, non-isotopic than three hours. A useful modification of the
probes are stable for long periods of time and PRINS technique is the use of multiple probes de-
large quantities may be produced at one time and tected with different fluorochromes, called MUL-
stored at -20°C for many months or years. There TIPRINS (Volpi and Baldini, 1993).
are a large number of different techniques that
come under the category of FISH, and new meth- Immunochemistry
ods are being developed at a rapid rate. The most The use of immunochemistry has been vastly sim-
commonly used technique involves the use of bi- plified by the commercial availability of numer-
otin-conjugated nucleotides to label nucleic acid ous polyclonal and monoclonal antibodies and
probes (Langer et al., 1981; Ausubel et al., 1992). detection kits. Specific antibodies (e.g., anti-biotin
The probe is detected immunochemically with bi- or anti-avidin) are usually purchased already con-
otin-specific antibodies, avidin, or streptavidin, jugated with biotin, fluorochrome, or peroxidase.
which are conjugated either to a fluorochrome Five immunochemical systems are most com-
(e.g., fluorescein isothiocyanate, FITC) that is vi- monly used to visualize non-radioisotopically la-
sualized with a UV microscope, or to enzymatic beled hybridized probes (Figures 16 and 17). In
reagents (e.g., alkaline phosphatase or horserad- each case the probe usually has been either bi-
ish peroxidase) that can be reacted with a sub- otinylated or BrdU-substituted via nick transla-
strate to form a cytologically visible stain. Several tion. The direct fluorescence method involves the
other techniques for FISH have been developed, use of a single, fluorochrome-conjugated anti-bi-
including CISS (chromosome in situ suppres- otin monoclonal antibody (mab) which binds di-
sion hybridization) and PRINS (primed in situ rectly to the biotin side-groups on the probe. The
labeling). CISS utilizes probes from DNA libraries advantage of this system is that it is relatively fast,
of flow-sorted chromosomes to search for DNA simple, and inexpensive. The main disadvantage
sequence homology of the entire length of the tar- is that, unless the target is a reiterated sequence,
get chromosomes while suppressing the repetitive the signal may be too weak to detect. The indirect
DNA sequences of the other chromosomes by al- fluorescence method involves an anti-biotin mab
lowing the repeated sequences of the probe itself as a primary antibody, followed by reaction with
to reanneal in the hybridization mixture prior to a fluorochrome-conjugated secondary antibody
the actual hybridization reaction (Luke and which recognizes the primary mab as antigen
Verma, 1993). CISS results in the labeling of whole (since mab's are made in mouse cells, the sec-
chromosomes or parts of chromosomes, and can ondary antibody should be a polyclonal anti-
be used to identify homeologous chromosomes in mouse antibody, usually made in rabbit or goat).
different species or in hybrids (Wienberg et al., The main advantage of the indirect method is that
1990, 1992; Luke and Verma, 1993). PRINS in- each secondary antibody can bind with two pri-
volves in situ hybridization of unlabeled oligonu- mary antibodies, thus amplifying the signal. The
cleotide probes (oligos) to complementary se- main disadvantage is that it is both somewhat
quences on fixed chromosomes. The oligos serve more expensive and time-consuming than the di-
as primers for in situ DNA polymerase-driven ex- rect method.
A third approach is the peroxidase-antiper- is greater sensitivity than either the direct or indi-
oxidase (PAP) method involving at least three rect method; the main disadvantage is that it is
reagents: primary and secondary antibodies, and much more time-consuming. The PAP method
a PAP complex comprised of the enzyme peroxi- has been used to detect BrdU-labeled probes
dase and an antibody against peroxidase (Figure (Frommer et al., 1988).
16). The primary antibody binds to the biotin on The last two methods utilize fluorochrome-
the probe, the secondary, or bridging, antibody conjugated avidin (or streptavidin)-biotin conju-
binds to both the primary and to the PAP complex gates to greatly amplify the signal from hy-
(since both are produced in the same animal bridized biotinylated probes. The first of these
species). The main advantage of the PAP method involves an anti-biotin mab followed by reaction
Figure 16 The primary systems for detecting biotinylated probes using immunochemistry.
(A) Indirect fluorescence: a primary anti-biotin antibody (ab) is recognized by a
fluorochrome-conjugated secondary ab. (B) PAP system: a primary anti-biotin ab is
recognized by a bridging (or linker) ab which also binds to a peroxidase-anti-peroxidase
complex. (C) Direct fluorescence: a fluorochrome-conjugated anti-biotin ab is used alone.
(D) Avidin-biotin conjugated fluorochrome: a biotinylated secondary ab binds to an
anti-biotin primary ab, and fluorochrome-conjugated avidin then binds to the secondary ab.
(Key symbols: fluorochrome, peroxidase, target DNA.)
with a biotinylated anti-mouse polyclonal sec- painting via FISH treats the hybridized biotiny-
ondary antibody, and amplified by adding fluo- lated probes first with FITC-conjugated avidin
rochrome-conjugated avidin, which binds to the which binds to the biotin side groups of the probe,
biotin on the secondary antibody (Figure 17). The followed by reaction with a biotinylated primary
most commonly used procedure for chromosome anti-avidin mab, and then amplification by treat-
[Figure 17 labels: (A) biotin; target DNA; biotinylated primary anti-avidin antibody; amplification with additional fluorochrome-conjugated avidin.]
[Figure labels: fusion; fission.]
though pericentric inversions often can be documented by chromosomal morphology alone (i.e., shifts in centromere position), confirmation of Robertsonian translocation and paracentric inversions usually requires some kind of chromosome markers such as bands, NORs, or hybridized probes. Inversions and translocations can also be detected in lampbrush chromosomes and other meiotic preparations.
Although differences in chromosome structure are often correlated with taxonomic differentiation, the role of cytogenetic change in actual processes of speciation is controversial (Patton and Sherwood, 1983; Sites and Moritz, 1987; M. King, 1993). The fixation of structural rearrangements may be the most important, and easily comprehended, cytogenetic correlate of speciation (cf. Patton and Sherwood, 1982). Some organisms, such as many groups of salamanders, show little or no intra- or interspecific variation in cytologically visible aspects of chromosome structure despite extensive changes in organismal morphology, protein biochemistry, and even DNA sequence structure, suggesting that chromosomal morphology has been strongly constrained (Sessions and Kezer, 1987). The reasons for this cytogenetic stasis are unknown.
Even more controversial is the possible existence of major trends in karyological evolution. In salamanders, for example, it has been argued that primitive karyotypes are asymmetrical (i.e., contain both bi-armed and telocentric chromosomes) and bimodal (i.e., contain both macrochromosomes and much smaller microchromosomes) with high chromosome numbers, whereas derived karyotypes are symmetrical (all bi-armed) and unimodal (no microchromosomes), with lower chromosome numbers (Morescalchi, 1973; 1975). This argument is based on correlations of karyotypic patterns with morphology and reproductive biology. A similar correlation has been noted in frogs (Morescalchi, 1973). A hypothesized mechanism for such a trend involves pericentric inversions to produce telocentrics, followed by Robertsonian translocations (Morescalchi, 1975; Green, 1983). Chromosome painting and other current methods employing in situ hybridization should eventually generate the kind of high resolution data to test this kind of hypothesis within a phylogenetic context.
The easiest and most successful application of in situ hybridization concerns the localization of any moderately long sequence that is repeated more than 100 times at one place in the genome (Macgregor and Varley, 1983). Consequently, most comparative studies have focused on repetitive sequences such as those coding for ribosomal RNA, tRNA, and histones, as well as highly repeated satellite DNA sequences. Single-copy genes have always been easily detected in dipteran polytene chromosomes because all gene sequences are multiplied several hundred times and are localized and concentrated.
Although there have been many comparative studies of sequence localization using ISH, these data rarely have been used for estimating phylogeny. Phylogenetic analyses are possible using such characters as the location(s) of sequences (e.g., various repeat families) among and within chromosomes, sequence structure and copy number, spatial relationships among identified genes and other sequences and to specific bands or other markers, and the localization of functional versus non-functional NORs. These kinds of studies have been particularly important for the identification of homologies among chromosomes or parts of chromosomes among distantly related species for phylogenetic analysis (e.g., Drosophila, Steinemann et al., 1984; Wienberg et al., 1992).
The use of such characters is predicated on our understanding of their evolution. Two different (but not mutually exclusive) views concerning the mode of evolutionary change in the molecular structure of chromosomes are the chromosome repatterning hypothesis (Mancino et al., 1977; Cremisi et al., 1988) and the homosequentiality hypothesis (Figure 19; Macgregor and Sherwood, 1979). According to the repatterning hypothesis, interspecific differences in the chromosomal location of certain repetitive DNA sequences (e.g., ribosomal RNA genes) reflect the redistribution of chromosomal elements within karyotypes. A prediction based on this view is that evolutionary changes in sequence location should be relatively conservative (i.e., slow, unique, and irreversible). The homosequentiality hypothesis, on the other hand, postulates that differences in the apparent location of various sequences reflect localized amplifications or diminution of sequences with fairly
or structure of identified sequences, especially in terms of the number, kinds, and locations of various repetitive sequences. Cytogenetic mechanisms such as unequal crossing over, inversions, and translocations have clearly played a dominant role, and we are just beginning to understand the role of transposons, retroposons, and the phenomenon of gene conversion in chromosomal evolution (W.F. Doolittle, 1985; Deininger and Daniels, 1986; Baker and Wichman, 1990; Hillis et al., 1991c).
It is clear that we have very little understanding of the relationship between the molecular structure and function of chromosomes. For example, clusters of ribosomal sequences have been found on almost every single chromosome (in addition to a stable nucleolus organizer region) in the European newt, Triturus vulgaris (Andronico et al., 1985), and certain simple-sequence satellite DNA sequences are transcribed by lampbrush chromosomes in salamander oocytes (Varley et al., 1980). These results make it difficult to make testable predictions concerning rates, constraints, and directions of evolutionary change, and indicate that full and proper use of molecular cytogenetic information for phylogenetic analyses will require a better understanding of the molecular basis of chromosome structure and function.
Limitations
One of the most serious limitations of molecular cytogenetics concerns the reliability of chromosome identification. Ideally, this identification should be based on banding patterns or some other chromosome-specific markers, independent of the localization of particular sequences. The chromosomes of most mammals and various other organisms are readily G-banded and show complex, chromosome-specific banding patterns. Other organisms, such as amphibians, have seemingly G-band-resistant chromosomes, and unambiguous chromosome identification is more difficult and requires a variety of specialized banding techniques. The application of FISH should help to solve this problem, and is limited only by the availability of suitable probes.
Limitations of in situ hybridization mainly concern the sensitivity and efficiency of the hybridization reaction. The sensitivity of in situ hybridization using radioactive probes depends on three main parameters: (1) the specific radioactivity of the probe; (2) the efficiency of the hybridization reaction; and (3) the autoradiographic procedure. For many years these parameters limited most in situ hybridization studies to repetitive sequences that can be localized with poorly defined probes of low specific activity and suboptimal hybridization conditions (A.S. Henderson, 1982). The specific radioactivity of a probe is limited only by the specific activity of the nucleotide precursors used in the synthesis of the probe. For clusters of repeated sequences, including polytene chromosomes, RNA probes labeled with [3H]UTP at 50 Ci/mmol are sufficiently radioactive (Pardue, 1985). For smaller targets, the specific radioactivity of the probe can be increased by using additional 3H-labeled nucleotides.
The efficiency of hybridization depends on numerous factors, including the concentration of the probe, the ionic strength of the hybridization mixture, the incubation temperature for the hybridization reaction, the type of chromosomes, and the complexity of the site, as well as the method used to prepare the slides and the age of the slides. Ideally, all available complementary target sites will hybridize with the probe at saturation concentrations. This is precluded, however, by the nature of cytological preparations, including the difficulty in obtaining complete denaturation of chromosomal DNA, loss of chromosomal DNA during denaturation, and the possibility of steric hindrance by chromosomal proteins (A.S. Henderson, 1982). Overall, the efficiency of hybridization has been estimated to be 6-10% (Macgregor and Varley, 1983).
Some of these limitations of the hybridization reaction have been counteracted by using dextran sulfate in the hybridization reaction. Dextran sulfate is essential for the detection of single-copy chromosomal sequences (Harper and Saunders, 1984). It is possible that the signal can also be enhanced by vector sequences that are attached to cloned probes. These sequences are radiolabeled and free to participate in network formation, thus contributing to the overall signal at the hybridization sites.
The main limitation with non-isotopic ISH,
such as FISH, is that problems are often encountered in the accessibility of chromosomal target DNA to the reagents, and halos of signal are often seen around chromosomal targets that represent diffuse strands of DNA that are more accessible than the compact DNA in the interior of the chromosomes (Pinkel et al., 1986). These problems can be minimized by careful preparation and storage of the prehybridized slides, and (in the case of biotinylated probes) amplifying the
signal by using multiple layers of avidin (Pinkel et al., 1986). Other problems concern the immunochemical procedures. Care must be taken in choosing appropriate primary and secondary antibodies, in performing staining steps in the correct order, and in preventing cross-contamination. Background staining from non-specific binding of one or more of the antibodies is often a problem, and appropriate controls must be performed. Many fluorochromes (e.g., FITC) fade quickly under UV, making photographic documentation difficult or impossible. Confocal laser scanning microscopy (CLSM) in conjunction with computer-assisted image processing has greatly enhanced the resolution of FISH preparations. However, it is expensive and thus not always available.
LABORATORY SETUP
The most essential piece of equipment for cytogenetic studies is a compound microscope equipped with high-quality phase-contrast objective lenses. Other equipment needed, for banding procedures, molecular techniques, and even tissue culture, is commonly found in most laboratories. Additional specialized (and expensive) equipment that allows the highest quality cytogenetic work includes a confocal scanning microscope, CCD video camera, and computer hardware and software for digitized image analysis. Access to an electron microscope facility is also an advantage. Table 2 lists the major equipment necessary to set up a molecular cytogenetics laboratory. Some useful references are: Nierman and Maglott, 1993 (ATCC/NIH repository catalogue of human and mouse DNA probes and libraries), Haugland, 1992-1994 (a catalogue of fluorescent probes and research chemicals), Ausubel et al., 1992 (short protocols in molecular biology), Rost, 1992 (a description and review of fluorescence microscopy), Freshney, 1994 (a manual of animal tissue culture techniques), Macgregor, 1993 (an introduction to animal cytogenetics), and Therman and Susman, 1993 (a description and review of human chromosome technology). For a complete listing of worldwide suppliers of equipment and supplies, see the most recent issue of The Biotechnology Directory, Stockton Press, New York.
1. Subbed slides
2. Mitotic chromosomes from gut epithelium
3. Mitotic chromosomes from plant root tips
4. Squash technique for mitotic and meiotic chromosomes
5. Yeast method for mitotic chromosomes from small vertebrates
6. Splash technique for slide preparations of mitotic chromosomes
7. Mitotic chromosomes from peripheral blood in vertebrates
8. Mitotic chromosomes from fibroblast cultures (reptiles)
9. Mitotic chromosomes from corneal epithelium
10. Mitotic chromosomes from insect embryos
11. Polytene chromosomes from dipteran salivary glands
12. Lampbrush chromosomes
13. C-banding
14. Q-banding
15. G-banding
16. Fluorochrome R-banding with chromomycin A3
17. AgNOR banding
18. Differential replication banding with BrdU
19. Modification of BrdU banding for salamander embryos
20. Labeling probes for ISH via nick translation
21. Radioisotopic ISH for reiterated sequences using a DNA probe
22. Radioisotopic ISH using an RNA probe
23. Radioisotopic localization of single copy sequences
24. Autoradiography
25. Chromosome painting using FISH
26. FISH with single-copy genomic probe
Protocol 1: Subbed Slides
(Time: ≈1 hr handling plus 24 hr incubation)
Microscope slides should be very clean; washing in hot water and detergent is recommended, but at a minimum slides can be cleaned by soaking them in 95% ethanol to which several drops of glacial acetic acid have been added, and then air-dried. Subbing coats slides with a thin gelatin film. Reference: Macgregor and Varley (1983).
1. Wash slides in hot water and detergent, and rinse copiously in hot water and then distilled water.
2. After a final rinse in distilled water, dip the slides into the subbing solution (Appendix).
3. Drain the slides and dry in a rack overnight at 60°C. Subbed slides can be stored indefinitely in a slide box.
Protocol 2: Mitotic Chromosomes from Gut Epithelium
(Time: incubation from 2-48 hr or more, depending on species, plus ≈1 hr handling)
This technique (from Kezer and Sessions, 1979) works best for amphibians with large chromosomes, but will work for just about any vertebrate (and could easily be modified for invertebrates as well).
1. Give healthy, well-fed animals an intraperitoneal injection of 1.0-5.0% aqueous colchicine, approximately 0.1 ml/g body weight.
2. Let animal incubate at a physiologically comfortable temperature for about 1 hr (mammals), 4 hr (reptiles), 12-24 hr (frogs), or 24-48 hr (salamanders).
3. Kill the animal by overanesthesia (or by preferred method).
4. Remove the stomach and intestines, squeeze out any contents, and open lengthwise using fine pointed scissors. Also remove the spleen, kidneys, and (if male) gonads and make small cuts with scissors.
5. Submerge tissues (separately from each specimen) in a large volume (e.g., 50 ml) of distilled water in a flask or beaker for 10-15 min with agitation. The water should be changed if it becomes cloudy or full of debris.
6. Remove tissues from water, blot briefly on paper towels, and submerge in 50-100 ml of freshly prepared, ice-cold 3:1 fixative (3 parts ethanol, 1 part glacial acetic acid) for at least 15 min on ice (this first volume of 3:1 fixative can be reused for all specimens during a particular fixing session if kept cold).
7. Transfer fixed tissues to a vial with fresh, cold 3:1 fixative. Glass, 20-ml scintillation vials with plastic cone inserts in the caps are ideal for storing tissues fixed in 3:1 (do not use foil liners, as they will decompose into the fixative). The tissues can now be stored indefinitely (10 years or more) at -20°C, or used immediately to make slides using the squash or splash techniques (see Protocols 4 and 6). These tissues even remain suitable for DNA extraction.
Protocol 3: Mitotic Chromosomes from Plant Root Tips
(Time: incubation 12-48 hr or more, depending on species, plus ≈1 hr handling)
Root tips may be obtained either from germinated seeds or from the cleaned roots of adult plants. For potted plants it is best to water liberally 1 or 2 days before taking root tips. Seeds may be germinated on moist filter paper in a petri dish, and roots can be obtained from bulbs by suspending them with toothpicks over dishes of water so that they are partially submerged. Healthy growing root tips are brittle, translucent white, with opaque, tapered tips. The most rapidly dividing
cells are located in the embryonic tissue (meristem) just proximal to the tip. If roots are not available, it is possible to use young leaves or the mitotically active ovary or ovule wall of developing flowers or fruits, or premeiotic cells of pollen mother cells (Dyer, 1979).
1. Immerse roots in a solution of 1.0-5.0% colchicine at room temperature for 12-48 hr (germinated seeds can be left intact). The volume of colchicine can be minimized by using ziplock bags.
2. Cut off distal ends (0.5-1.0 mm) of the root tips and fix the tips in freshly prepared 3:1 ethanol:acetic acid for at least 15 min.
3. Macerate the tissues in 1 N HCl at 60°C for 5 min.
4. Soak the root tips in 45% acetic acid for 1-5 min.
5. Transfer a root tip to a drop of 45% acetic acid on a clean slide, cut off the terminal 1 mm of the root tip (containing the meristem) and discard the rest, and crush and mince very thoroughly with a scalpel or razor blade (do not let the preparation dry out).
6. Make squash preparations (Protocol 4).
Protocol 4: Squash Technique for Mitotic and Meiotic Chromosomes
(Time: <5 min per preparation)
There is a certain amount of art in making good chromosome preparations, and this is particularly true of squashes; practice and perseverance usually are necessary. Once the technique is mastered it is very fast, and it is convenient to set up the slide making station adjacent to the microscope so that each preparation can be examined immediately. This technique is recommended for organisms with large chromosomes. Reference: Kezer and Sessions (1979).
1. Remove a small piece of tissue from 3:1 fixative and submerge it in 45% acetic acid in a small glass dish for at least 2 min (tissue will disintegrate after prolonged exposure).
2. Put a small bit of tissue (e.g., 1 mm²) in a single drop of 45% acetic acid toward one end of a clean slide (subbed slides are recommended, especially if the preparations are to be made permanent).
3. Mince tissue as finely as possible using sharp forceps, scalpel, or razor blade. The result should be a cloudy suspension of cells and small clumps of cells. Remove any clumps, chunks, lint, or other solid bits of debris.
4. Cover the cell suspension with a clean, siliconized, 22-mm² coverslip (coverslips can be siliconized by rubbing with commercially available siliconized paper wipes). To avoid bubbles, the coverslip should be lowered gradually by placing one edge down first, in contact with the suspension on the slide, and then slowly lowering the coverslip with forceps.
5. To squash the cells, put the slide between layers of absorbent paper (e.g., paper towels, or bibulous paper pads), stabilize the coverslip by pushing down firmly with thumb and index finger on the top layer of paper near two edges of the coverslip, and push down very hard with the thumb of the other hand in the center of the coverslip. Slipping of the coverslip, which may ruin the preparation, can sometimes be avoided by tapping gently on the coverslip with a pencil eraser before squashing.
6. The slide can now be examined directly with phase-contrast optics to check for suitable chromosome spreads. A gross phase contrast effect can be obtained with regular bright field optics by defocusing the condenser lens. It is useful at this stage to record the coordinates of particularly good spreads, and, if working with large chromosomes (e.g., salamanders), to photograph selected examples. Photography is particularly useful at this stage if it is a rare specimen and good chromosome spreads are difficult to find, since subsequent treatment of the slides may destroy or degrade chromosome morphology.
7. The slide can be made permanent with the dry ice technique (Conger and Fairchild, 1953) by placing the slide on a block of dry ice (or into a -70°C freezer) for at least 5 min,
then quickly prying off the coverslip with the point of a razor blade and plunging the slide into 95% ethanol for at least 2 min. The slides are then air-dried, and can be stored indefinitely if kept in a sealed slide box with a cotton-stoppered vial of desiccant at 4°C.
Protocol 5: Yeast Method for Mitotic Chromosomes from Small Vertebrates
(Time: >24 hr incubation time, plus ≈3 hr handling)
This technique is based on the stimulation of white blood cell proliferation in bone marrow. For mammals, sufficient bone marrow can be obtained from the long bones of the limbs. For small lizards, bone marrow may also be obtained by removing and crushing the spine (C. Moritz, personal communication). The volumes given are based on tissues obtained from an adult laboratory mouse; they may be reduced or increased for substantially smaller or larger amounts of tissue. This technique may also work without the yeast treatment (especially in lab-raised animals). Reference: Lee and Elder (1980).
1. Inject animals with active yeast suspension (subcutaneously in dorsal region, or directly into body cavity), 0.5 ml/25 g body weight. One injection followed by a 24-hr incubation period is adequate for subadults and newly caught animals, but two or three consecutive injections at 24-hr intervals may be required for others.
2. After the yeast incubation period, inject the animal with 1 mg/ml colchicine, 0.1 ml/10 g body weight, and incubate for 1 hr (shorter incubation times of 20-40 min will yield less condensed mitotic chromosomes as well as fewer spreads).
3. Kill the animal (e.g., anesthetize with halothane or CO2 followed by cervical dislocation) and dissect the upper leg bones (femur) and upper arm bones (humerus) and remove as much soft tissue as possible to expose the ends of the bones.
4. Cut off both ends of each long bone and use a syringe full of hypotonic KCl (0.075 M) inserted in one end to flush out the marrow into a small volume (approximately 3 ml) of hypotonic solution in a 15-ml centrifuge tube. Flick the tube to disperse the cells and, if necessary, add more hypotonic solution to bring the volume up to 6 ml (the solution should be cloudy).
5. Let the cell suspension incubate in the hypotonic solution for 15-20 min at room temperature.
6. Add an equal volume of freshly prepared ice-cold 3:1 (3 parts methanol plus 1 part glacial acetic acid), mixing constantly, and centrifuge at 100 g for 2 min.
7. Discard the supernatant and flick the tube vigorously to loosen the pellet (or use a vortex mixer), then slowly re-suspend the pellet in 4-6 ml of fresh 3:1 fixative with constant mixing, and let fix for at least 10 min on ice.
8. Centrifuge at 100 g for 2 min.
9. Repeat step 7, but re-suspend in <0.5 ml of 3:1 methanol:acetic acid to give a finely dispersed cell suspension.
10. Check the cell density by making a slide (via the splash technique, Protocol 6) and examining under the microscope.
11. Cells can now be stored in fixative in the freezer, or can be used immediately to make slide preparations.
Protocol 6: Splash Technique for Slide Preparations of Mitotic Chromosomes
(Time: <1 min per slide)
Nearly every lab has a slightly different method for obtaining splash chromosome preparations, indicating that many of the parameters are matters of preference rather than necessity. The following is a generalized protocol that usually works.
1. Take a clean, ice-cold, wet slide (slides can be stored in a coplin jar of distilled water on ice), shake it, hold it with one hand at an approximately 30° angle over a trash can or towels, and splash several drops of a fixed cell sus-
pension from a height of 0.5 m or more onto the slide.
2. Gently blow on the slide surface, and dry the slide on a slide warmer or hot plate at 40°C. Alternatively, slides may be flame-dried by holding over a bunsen burner or alcohol lamp to ignite the alcohol.
3. Check cell density on one or two test slides and adjust the cell concentration if necessary by diluting, or spinning down and re-suspending the cells in a smaller volume of fixative. Cells should be evenly spread and not touching.
Protocol 7: Mitotic Chromosomes from Peripheral Blood in Vertebrates
(Time: 72 hr incubation plus 2-3 hr handling)
This protocol can be used for a variety of mammalian, avian, and reptilian species. Exact volumes depend on the amount of blood used. The culture medium will depend on the species. Standard DMEM or RPMI with 10% fetal bovine serum (FBS) plus antibiotics works well for mammals and birds. L-15 medium is often used for amphibian tissue culture, with the advantage that the cells can be cultured in the absence of artificial CO2. Serum enriched for leukocytes may be obtained by allowing the blood to coagulate at 4°C for 2 hr and then collecting the supernatant (serum plus leukocytes). Lymphocytes in whole blood may be stimulated with a mitogen, usually phytohemagglutinin (PHA) at a concentration of 50 µg/ml in the medium. Reference: Rooney and Czepulkowski (1984).
Part A. Setting Up Cultures
1. Using sterile techniques, dispense 5 ml of culture medium into sterile culture tubes.
2. Add 1-10 drops of blood into each tube.
3. Incubate tubes on their sides, capped ends slightly elevated with caps loosened, at 36-37°C for 72 hr, mixing the contents at least daily.
4. Add additional antibiotic if cultures become cloudy (contaminated).
5. Thirty to 60 min before harvesting, add one drop of 0.025% colchicine to each tube.
Part B. Harvesting and Fixation
1. Centrifuge tubes for 5 min at 200 g.
2. Carefully remove (and discard) supernatant with pipette, down to just above the pellet (do not disturb the pellet).
3. Loosen the cell pellet by flicking the tube (or buzzing it with a vortex mixer).
4. Add warm (37°C) hypotonic solution (0.075 M KCl) to produce a dilute cell suspension:
a. Add just enough so that it is just possible to see through the suspension (usually 6-8 ml, but check after 3 ml).
b. Add the hypotonic solution vigorously to suspend pellet.
c. Let cells incubate 5 min at room temperature.
5. Centrifuge at 200 g for 5 min.
6. Discard supernatant with pipette, and flick tube to loosen pellet.
The next four steps should be done rather quickly.
7. Add more hypotonic solution (approximately 2 ml), followed immediately by 4 drops of freshly made fixative (3:1 methanol:glacial acetic acid).
8. Now mix the cells by bubbling air gently into the solution with a glass pipette (do not suck the cells into the pipette).
9. Centrifuge at 200 g for 5 min.
10. Discard the supernatant with a pipette, and flick very vigorously to loosen pellet.
11. Add a small amount of fresh fixative vigorously, and flick hard to re-suspend pellet.
12. Wash down the inside walls of the tube with two more pipettefuls of fixative.
13. Let cells sit for 10 min in the fixative at room temperature.
Part C. Washing
1. Centrifuge at 200 g for 5 min.
2. Discard the supernatant, and flick gently to loosen pellet (but be careful to keep cells from flying up and sticking to the sides of the tube).
3. Add fresh fixative, rinsing down the walls of the tube to keep cells down.
4. Let sit for 10 min at room temperature.
5. Centrifuge at 200 g for 5 min.
6. Repeat washing steps 1-5 once.
7. Now re-suspend cells in a small amount of new fixative (less than 1/2 pipetteful).
8. The fixed cell suspension may now be stored in fixative in a freezer, or slides can be prepared following the splash technique (Protocol 6). [Note: If slides are to be G-banded (Protocol 15), they should be dried for 24 hr in an oven at 90°C.]
Part D. Staining Slides
1. Place air-dried slides into a clean coplin jar with 50 ml of phosphate buffer (pH 6.8).
2. Add 2.5 ml of Giemsa stain, and squirt up and down to mix well.
3. Let stain for 10 min.
4. Quickly flood out stain with tap water, then rinse once quickly with either distilled water or phosphate buffer.
5. Shake off excess fluid, and allow slides to air-dry.
Protocol 8: Mitotic Chromosomes from Fibroblast Cultures (Reptiles)
(Time: >2 days incubation, plus ≈3 hr handling)
This technique (from Yonenaga-Yasuda et al., 1988) could probably be used for any vertebrate, with appropriate modifications in culture media, incubation times, etc.
1. Sterilize hind legs by successive treatment with 70% ethanol, ether, and merfene.
2. Remove muscles aseptically and place in a small sterile bottle with 5 ml of L15 medium with 5% fetal bovine serum (FBS) and 50 mg/ml gentamycin, for 24 hr at room temperature.
3. Transfer muscles to a petri dish, cut into small fragments, and culture in 2 ml of Dulbecco's medium with 20% FBS and 50 mg/ml neomycin in a culture flask at 30°C (cultures should be gassed with air plus 5% CO2 when the phenol red in the medium indicates a rise in pH).
4. When confluent sheets of cells are seen (>24 hr), add 0.02 ml of 0.16% colchicine, and incubate for 1 hr.
5. Harvest cells by detaching them with 0.125% trypsin in 0.02% EDTA (withdraw medium and discard; rinse cells once with trypsin solution, and discard rinse; add fresh trypsin and incubate 15-30 sec, then withdraw and discard; incubate until cells round up [5-15 min], then disperse in fresh medium; Freshney, 1994).
6. Harvest the cells as in Protocol 7, steps 6-10.
Protocol 9: Mitotic Chromosomes from Corneal Epithelium of Vertebrates
(Time: 8-18 hr incubation plus several hours handling)
This is a reliable technique for obtaining good mitotic chromosome preparations from anurans, and probably works on fish, reptiles, birds, and mammals as well (it does not work very well on salamanders, however, because of their longer cell cycle times). The specific protocol described here is from David M. Green and is designed for frogs and toads (references: Bogart, 1972; Iizuka et al., 1991).
1. Inject animal with 0.1% colchicine in distilled water, using a long 22-gauge needle. Insert the needle under the skin of the upper thigh and work it up the back under the skin into the dorsal lymphatic sac. The needle passes through the membrane dividing the dorsal sac from the sac on the upper thigh. Fill the dorsal lymphatic sac until the skin between the eyes bulges (amount depends upon the size of the frog). Incubate 8 hr (Eleutherodactylus, Hyla), 10 hr (Bufo), 14 hr (Rana), or up to 18 hr (Leiopelma) depending upon temperature and metabolic rate of the frog.
2. Clean a spot plate with ethanol, and fill one well for each eye with distilled water.
3. Kill the frog by immersion in anesthetic solution (1% tricaine methosulfonate) or by applying a glob of benzocaine ointment (Anbesol™) on the top of the head. Alternatively, the frog can be pithed.
4. Dissect out the eyes using a fine No. 11 scalpel, being careful not to puncture the eye. To begin, insert the blade under the eye between the eye and the lower eyelid using the blunt, back edge of the blade. Then carefully cut around the eye's connections at the front and back and to the upper eyelid. When free from the eyelids and peripheral muscles to the side, maneuver the scalpel around the back of the eyeball to sever the optic nerve and musculature. When the eye is almost completely free, remove it from any remaining connections and lift it from its socket using fine forceps.
5. Place each eye in distilled water in the spot plate wells, and let sit for one hour.
6. To fix the tissue, pick up the eye with fine forceps and position it so that it can be held with cornea facing down (do this by holding onto the stubs of the muscles at the back of the eye). Suspend the eye for 1 min, 1-2 mm above the surface of a watch glass filled with glacial acetic acid. The fumes will fix the cornea. Do not allow the eye to touch the acetic acid. Place the eye back into its well in the spot plate to check for proper fixation. The eye surface should be cloudy. If it isn't, re-suspend it over the fumes until it is.
7. Arrange at least three or four clean, siliconized coverglasses on the spot plate under a dissecting microscope. Place the eye, cornea up, in a central well. Under low power, use forceps to hold the eye and use a blunt scalpel to carefully scrape off the fixed cornea. Transfer the tissue to a coverglass. Divide the tissue into equal portions and distribute to coverglasses using the scalpel and forceps. Place a drop (or two) of 70% acetic acid onto the tissue on each coverglass.
8. Apply a clean slide to a coverglass in order to pick it up. Turn it right-side up and place it on a black surface (so you can see the tissue). Use the wooden end of a match stick to lightly tap the top of the coverslip to remove bubbles and distribute the tissue over the slide. Place the slide on some absorbent bibulous paper, cover it with another strip of bibulous paper, and hold it all in place with the thumb and index finger of your left hand (if you are right-handed). Put your right thumb firmly on top between your left thumb and finger and press with considerable force to squash the whole thing. Remove thumb and paper and seal the coverglass around the edges with rubber cement.
9. Examine with phase-contrast.
10. To make the slide permanent, peel off the rubber cement and freeze the slide by immersing it in liquid nitrogen. Scrape off the remaining coverglass cement, pop the coverslip off with a razor blade or scalpel, and fix the slide in 95% ethanol in a coplin jar. This will not work if the coverglass is not siliconized. Transfer to fresh 95% ethanol after 5 min and then air-dry. The chromosomes can now be banded, hybridized, or stained.

Protocol 10: Mitotic Chromosomes from Insect Embryos
(Time: ≈8 hr)
This technique is modified from Zhan et al. (1984) for orthopterans.
1. Place eggs separately in a petri dish containing filter paper soaked with Mark's M-20 insect culture medium (Gibco) with 7.5% FBS and 5.0 mg/ml actinomycin D, and incubate at 37°C for 4 hr.
2. Transfer eggs to another petri dish containing fresh M-20 medium with 0.16 mg/ml colcemid and incubate for 1 hr at 37°C.
3. Dissect embryos out of eggs in plain M-20 medium.
4. Transfer intact embryos to a centrifuge tube containing 0.075 M KCl hypotonic (≈0.5 ml/embryo) for 30 min at 37°C.
Chromosomes: Molecular Cytogenetics 155
5. Dissociate embryos by gentle pipetting, then centrifuge 2 min at 100 g.
6. Discard supernatant and re-suspend cells in a large volume of fresh 3:1 methanol:acetic acid, and fix for 20 min at room temperature.
7. Centrifuge 2 min at 100 g, decant supernatant, and re-suspend in fresh fixative. Repeat once, re-suspending the final cell pellet in approximately 1 ml of fixative.
8. Use final cell suspension to make splash and/or squash preparations.

Protocol 11: Polytene Chromosomes from Dipteran Salivary Glands
(Time: 5-10 min/preparation)
This technique works best with large, healthy, well-fed larvae of many species of dipteran flies. Third instar Drosophila larvae are usually found crawling up the sides of the culture jar. The paired salivary glands are clear or slightly opaque, somewhat zucchini-shaped, and have pieces of glistening fat body attached. The glands can be seen clearly by using understage lighting on a dissecting microscope. Good polytene chromosomes for in situ hybridization should be flat and gray with no refractivity, and the banding pattern should be clearly recognizable (Macgregor and Varley, 1983; Pardue, 1986).
1. Remove a large third instar larva and place in 45% acetic acid (or in isotonic saline) in the middle of a clean slide.
2. Use needles and/or watchmaker's forceps to pinch off the anterior end of the larva just behind the head segment. It is best to hold the head steady and pull the rest of the body away until paired salivary glands emerge from the anterior opening (if they don't appear immediately, discard and select a fresh larva).
3. Tease off as much fat body as possible without damaging the glands.
4. Transfer (by sliding) the glands to a small drop of 45% acetic acid near one end of the slide, and fix for 1-2 min.
5. Cover with a clean siliconized coverslip and tap lightly on the coverslip directly over the glands with a pencil eraser, to help spread the chromosomes. Monitor the spreading with a phase microscope.
6. When the chromosomes appear well spread, make slides permanent using the squash and dry ice techniques (Protocol 4).

Protocol 12: Lampbrush Chromosomes
(Time: Part A, 15 min; Part B, >1 hr; Part C, several hr; Part D, ≈1 hr)
This technique works for salamanders, and easily can be modified for frogs, reptiles, fishes, and birds. Generally, medium-sized yolky oocytes (i.e., neither the largest nor the smallest) yield the best lampbrush chromosome (LBC) preparations; large, more mature oocytes usually have condensed, featureless LBCs. The best dispersion medium (DM) varies among taxa (J. Kezer, personal communication; Macgregor and Varley, 1983; Callan, 1986). The DM and IM given here are general "all-purpose," and should be tried first.

Part A. Preparing Ovaries
1. Anesthetize or kill animal (e.g., in 0.1-0.2% MS222).
2. Remove one or both ovaries through an incision in the ventral body wall.
3. Transfer ovaries immediately to a dry, clean embryological watch glass or small dish, and keep covered. Ovaries can be stored "dry" at 4°C for 2-3 days if dish is sealed with parafilm.

Part B. Isolation of Nucleus
1. Submerge a small piece of ovary into 5:1 isolation medium (5 parts 0.1 M KCl: 1 part 0.1 M NaCl) in a clean dish.
2. Using watchmaker's forceps, tear open ovary and remove an oocyte. Grasp the oocyte with two forceps and pull laterally to break open the oocyte. The yolky contents will spill out (take care to keep the preparation completely submerged).
3. Locate the translucent nucleus (≈0.3-0.4 mm in diameter in salamanders; 0.1 mm in lizards), and suck in and out of a small-bore, flame-polished Pasteur pipette several times to remove the adherent yolk (the nucleus is sturdy and can be bounced off the bottom of the dish to dislodge adherent bits of yolk).

Part C. Dispersal of Chromosomes
1. Transfer the cleaned nucleus to a dispersion chamber (if permanent preparations are desired, e.g., for ISH, use a bored circular plastic disc with a paraffin-attached coverslip on the bottom) completely filled with dispersion medium (5:1 plus 0.5% paraformaldehyde) so that there is a convex meniscus on top of the chamber.
2. Using a black background under a dissecting microscope, grasp the nuclear envelope at the top of the nucleus with one pair of forceps, then take hold with a second pair very near the first, and pull the two forceps apart with a slightly downward motion (nuclear contents should emerge as a gelatinous mass completely separated from the envelope, which should remain attached to one or both forceps). [Note: Abandon the preparation immediately if the nuclear contents begin to extrude spontaneously through a small hole in the envelope; such preparations will yield only fragmented chromosomes.]
3. Cover the preparation with a coverslip. To avoid the disruptive effects of surface tension, the coverslip must be dropped so that the surface of the coverslip is parallel to the surface of the slide.
4. It takes several hours for the chromosomes to settle onto the floor of the dispersion chamber. Keep the slides refrigerated in a humid chamber during this time.

Part D. To Make Permanent Preparations
1. Place the dispersion chamber into a centrifuge tube fitted with an epoxy plug.
2. Centrifuge, using a swinging bucket rotor, in a prechilled table-top centrifuge that allows speed to be gradually increased over a 3-min period to 2000-3000 g. If the centrifuge is not refrigerated, it can be prechilled by placing dry ice in the chamber for approximately 30 min before use. Centrifuge at 2000-3000 g for at least 15 min.
3. Remove the chambers from the centrifuge tubes, immerse in dispersion medium, and use a razor blade to remove the coverglass on which the chromosomes now rest. Gently swish the coverslip around in the medium to wash away any remaining nucleoplasm.
4. Fix the preparation in 70% ethanol for 5 min.
5. Remove the preparation to fresh 70% ethanol for at least 15 min, then dehydrate in 95% ethanol (2 x 10 min) and air-dry. The preparations are now ready for ISH, but may be stored desiccated at 4°C until needed.

Protocol 13: C-Banding
(Time: 1 day pretreatment, plus ≈2.0 hr)
This method works for mitotic and meiotic chromosomes of most organisms, including plants, insects, urodeles, anurans, birds, reptiles, fish, and mammals (Schmid et al., 1979).
1. Bake permanent, unstained, air-dried chromosome preparation slides for 1 day in a 60°C oven.
2. Place slides in coplin jar with prewarmed (30°C), saturated barium hydroxide for 5 min at 30°C.
3. Rinse very briefly in 0.1 N HCl, followed by a thorough rinse in distilled water (e.g., fill and empty the coplin jar six times).
4. Place slides in coplin jar with prewarmed 2x SSC (Appendix) for 1 hr at 60°C.
5. Rinse in distilled water (2 min).
6. Stain slides in 8.0% Giemsa in phosphate buffer (Appendix), pH 6.8, for 5 min. Load slides into a coplin jar and add 50 ml buffer, then add 4 ml Giemsa and quickly pipette up and down until thoroughly mixed.
7. Rinse out Giemsa stain by flooding coplin jar
with distilled water or fresh buffer to avoid contamination of slides with film that forms on the surface of Giemsa staining solution.
8. Air-dry the slides and mount with a coverslip in a xylene-based mounting medium (e.g., DePex™ or Permount™).

Protocol 14: Q-Banding
(Time: ≈15 min/preparation)
This protocol is from Benn and Perle (1986).
1. Place slides in 0.5 mg/ml quinacrine dihydrochloride stain for 10 min at room temperature.
2. Rinse briefly in distilled water to remove excess stain.
3. Soak in a coplin jar of McIlvaine's buffer (Appendix) for 1 min.
4. Mount in a few drops of buffer, aquamount, or 100% glycerol using a thin glass coverslip.
5. Examine and photograph immediately with fluorescent optics using a filter combination appropriate for fluorescein (FITC; e.g., Zeiss filter set No. 9, BP 450-490 nm, FT 510 nm, and LP 520 nm); the fluorescent image fades quickly.

Protocol 15: G-Banding
(Time: ≈7 min/slide plus 1 hr drying time)
This protocol is from Benn and Perle (1986).
1. Age slides by placing them in a hot oven (60°C) overnight.
2. Agitate slides for a few seconds in 0.005% trypsin in PBS (Appendix); optimal time varies widely for different preparations.
3. Rinse in three changes of ice-cold PBS (dip consecutively in each coplin jar).
4. Stain for 5 min in 5% Giemsa solution in phosphate buffer, pH 6.8.
5. Remove Giemsa by flooding under a gentle stream of water.
6. Air-dry slides and cover with a xylene-based mounting medium (DePex™ or Permount™).

Protocol 16: Fluorochrome R-Banding with Chromomycin A3
(Time: ≈30-45 min)
This stain produces reverse banding (relative to G-banding) in mammals, and stains NORs in salamanders, fishes, and some plants (Hack and Lawce, 1980; Sessions and Kezer, 1987).
1. Place air-dried slides in a humid chamber and flood each preparation with at least 50 µl of 5 µg/ml chromomycin A3 in chromomycin buffer (see Appendix), cover with a coverslip, and stain for 20 min in the dark at room temperature.
2. Rinse off the chromomycin with distilled water, and place slides (no more than three at a time) in coplin jar with methyl green counterstaining solution (2-ml stock solution in 50 ml phosphate buffer, pH 6.8) for 6 min.
3. Rinse in distilled water.
4. Air-dry, then mount in 100% glycerine or aquamount and examine under UV epifluorescence optics using an appropriate filter combination (see Protocol 14, "Q-Banding").

Protocol 17: AgNOR Banding
(Time: ≈1 min/slide)
This fast, easy, and reliable technique was published by Hsu (1981), and seems to work for all organisms. Use aged (at least 1 day), air-dried slides.
1. Mix 2 parts of 50% (w/v in distilled water) silver nitrate solution and 1 part developer (Appendix) in a glass vial (allowing at least 150 µl for each preparation), and mix thoroughly.
2. Add 3 drops to each preparation and quickly add a coverslip.
3. Incubate at 90°C for 30-60 sec (or until staining solution has turned muddy yellowish brown).
4. Rinse off coverslip with distilled water (using a squirt bottle, or rinse in a beaker of water).
5. Air-dry slides, and mount in oil or permanent mounting medium.

Protocol 18: Differential Replication Banding with BrdU
(Time: several days incubation, plus 1 full working day)
This technique can be used to obtain complex chromosomal banding patterns in organisms in which more conventional banding methods do not work (Dutrillaux, 1975; Benn and Perle, 1986).
1. Set up tissue culture cells.
2. Five to seven hours before addition of colcemid, add 0.01 M bromodeoxyuridine (BrdU) and 0.01 M deoxycytidine to make final concentration of 10^-4 M each.
3. One hour before harvest add colcemid to final concentration of 0.1%.
4. Harvest and make slides via the splash or squash technique (see above).
5. Soak air-dried slides in PBS for 5 min at room temperature.
6. Stain in 0.5 µg/ml Hoechst 33258 for 10 min at room temperature.
7. Mount in McIlvaine's buffer (Appendix).
8. Irradiate slides at approximately 5 cm from a 15-W UV light source at 50°C for 15 min or under a 75-W growlamp for 24 hr.
9. Rinse coverslip away with distilled water, and incubate slides for 15 min in 2x SSC at 65°C.
10. Stain slides in 8% Giemsa in phosphate buffer, pH 6.8, for 5-10 min.
11. Air-dry slides and mount in a xylene-based medium.

Protocol 19: Modification of BrdU-Banding for Salamander Embryos
(Time: 32 hr incubation, 3-5 days slide aging, plus ≈3 hr)
This technique yields complex banding patterns comparable to G-bands in salamanders (Kuro-o et al., 1986; Kohno et al., 1991).
1. Wash dejellied embryos in several changes of sterile amphibian saline.
2. Transfer embryos to culture dish (35-mm diameter) containing 1.5 ml of 60% Eagle's minimum essential medium (MEM) with 20% FBS, 20% sterile-filtered water, and 400 µg/ml BrdU.
3. Disrupt embryo with a sterile Pasteur pipette, and incubate cells in a darkened, humidified incubator under a constant flow of air with 5% CO2 for 24 hr at 20°C.
4. Add another 1.5 ml of medium containing 1.0% colchicine and incubate 8 hr at 20°C.
5. Centrifuge cells and medium at 120 g for 7 min.
6. Discard supernatant, re-suspend cells in 10 ml hypotonic solution (amphibian saline diluted 3:7 with distilled water), and incubate for 1 hr at room temperature.
7. Add 0.5 ml fresh 3:1 methanol:acetic acid fixative and fix for 10 min.
8. Centrifuge at 420 g for 5 min, replace supernatant with fresh fixative, and fix for 5 min.
9. Repeat step 8 twice, but re-suspend final cell pellet in approximately 1 ml fixative, and use to make splash and/or squash preparations.
10. Age slides for 3-5 days at room temperature.
11. Stain with 50 µg/ml Hoechst 33258 in calcium- and magnesium-free PBS for 15 min.
12. Rinse briefly in distilled water, mount in PBS, and expose to UV light at a distance of 10 cm for 30 min.
13. Remove coverslips, rinse briefly in distilled water, then incubate in 2x SSC for 30 min at 60°C.
14. Rinse slides in running water, then stain in 3% Giemsa in PBS at pH 6.5 for 4 min.
15. Air-dry slides, and mount in a xylene-based mounting medium.
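Protocol 19 gives its centrifugation steps in g (120 g and 420 g), whereas many table-top centrifuges are set in rpm. The following is a minimal sketch of the standard conversion, RCF = 1.118 x 10^-5 x r x N^2, with r the rotor radius in cm and N the speed in rpm. The function names and the 10-cm radius in the usage lines are illustrative assumptions and are not part of the protocol; use the radius of the rotor actually in the centrifuge.

```python
import math

# Standard relation between relative centrifugal force (in g) and rotor speed:
#   RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (g) produced at a given speed and radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius_cm must be > 0")
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf_g < 0 or radius_cm <= 0:
        raise ValueError("rcf_g must be >= 0 and radius_cm must be > 0")
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 10.0  # hypothetical rotor radius, for illustration only
    for target_g in (120, 420):  # the two spins called for in Protocol 19
        print(f"{target_g} g is about {rpm_for_rcf(target_g, radius_cm):.0f} rpm")
```

Because RCF scales with the square of the speed, small errors in the dial setting matter more at the higher g values used for lampbrush preparations than at the gentle spins used here.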
Protocol 20: Labeling Probes for ISH via Nick Translation
(Time: 3-4 hr)
Nick translation is the most commonly used method for both radioisotopic and non-isotopic labeling of probes for in situ hybridization. The radioisotopic method is designed to label probes with tritium, which are then detected with autoradiography (Macgregor and Varley, 1983; Pardue, 1985, 1986; Malcolm et al., 1986). The protocol for non-isotopic nick translation is designed for biotin-avidin labeling, used in most FISH (fluorescence in situ hybridization) and chromosome-painting protocols (Ausubel et al., 1992). HPLC-purified nucleoside triphosphates have limited shelf life in solution, but are stable for up to 1 year when stored as aliquots at -20°C. Deoxyribonucleoside triphosphates (dNTPs) can be purchased as ready-made 100 mM solutions, or they can be purchased in lyophilized form (Ausubel et al., 1992). Nick translation kits are commercially available.

Nick Translation for Tritium-Labeled Probes
This method utilizes 1 µg of DNA and produces enough probe for at least 10 slides, with a specific activity of 2-6 x 10^6 cpm/µg.
1. Mix 2 x 10" pmol each of tritiated precursors (dNTPs) in a microcentrifuge tube.
2. Aliquot 18 µl of the mixture into ethanol-washed microcentrifuge tubes; quickly freeze-dry under vacuum to prevent radiolysis.
3. Add to a tube containing the dried, tritiated dNTPs: 10 µl of nick translation buffer (Appendix), 5 µl of DNA (1 µg), and glass-distilled water to make a total of 94 µl.
4. Incubate the mixture at 15°C for 10 min, then chill the tube in ice water.
5. Add 5 µl (12.5 U) of DNA polymerase I to make a total volume of 99 µl.
6. Add 1 µl of diluted (1 µg/ml) DNase I (1 mg/ml) stock (dilute stock immediately before use).
7. Incubate at 15°C for 1 hr.
8. Slow the reaction by placing the tube on ice, and determine the percentage incorporation of radioactive nucleotides with the following procedure:
a. Mix 5 µl of the reaction mixture with 995 µl of TCA/BSA in a microcentrifuge tube, and keep on ice for 15 min.
b. Pass 5 ml of ice-cold 5% TCA through a 2.5-cm-diameter Whatman GF/C glass fiber filter followed by the TCA/BSA reaction mixture.
c. Wash the filter three times with 5 ml of cold 5% TCA, and dry the filter at 65°C for 20 min.
d. Measure the radioactivity of the filter in a scintillation counter using a toluene-based scintillation fluid.
e. To measure total incorporated and unincorporated radioactivity in the reaction mixture, take another 5-µl sample of the reaction mix and put it directly on a clean filter without TCA, dry it, and count it.
f. The percentage incorporation of radioisotopes into the probe is determined by comparing counts between the two filters. The TCA-treated filter should have 20-60% of the counts obtained from the untreated filter. The DNA should not be used if it shows less than 10% incorporation.
9. Stop the nick translation reaction by adding 100 µl of water-saturated phenol and mixing well with a Pasteur pipette.
10. Centrifuge at 5000 g for 5 min.
11. Unincorporated nucleotides can be removed by loading the aqueous supernatant directly onto a Sephadex G-50 column (see Chapter 8) that has been prewashed with distilled water.
12. Elute with distilled water and collect consecutive fractions of 30 drops each. Count 5 µl of each fraction in a scintillation counter (using a tergitol scintillator) and combine the fractions containing the first peak of radioactivity to come off the column.
13. Freeze-dry these combined fractions and redissolve them in 50 µl of distilled water.
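The incorporation check in step 8 of the tritium protocol reduces to a ratio of two filter counts taken from equal 5-µl samples. The sketch below shows that arithmetic under stated assumptions: the function names and status strings are hypothetical, background counts are assumed to be already subtracted, and the protocol itself only defines the 20-60% target range and the 10% rejection threshold, so the label given to values between those limits is our own.

```python
def percent_incorporation(tca_filter_cpm: float, untreated_filter_cpm: float) -> float:
    """Percentage of label incorporated into TCA-precipitable DNA.

    Both counts come from equal-volume samples of the reaction mixture,
    so they can be compared directly.
    """
    if untreated_filter_cpm <= 0:
        raise ValueError("untreated filter count must be positive")
    if tca_filter_cpm < 0:
        raise ValueError("TCA filter count cannot be negative")
    return 100.0 * tca_filter_cpm / untreated_filter_cpm


def assess_probe(tca_filter_cpm: float, untreated_filter_cpm: float) -> str:
    """Classify a labeling reaction using the thresholds stated in step 8f."""
    pct = percent_incorporation(tca_filter_cpm, untreated_filter_cpm)
    if pct < 10.0:
        return "do not use"  # stated in the protocol
    if 20.0 <= pct <= 60.0:
        return "within expected range"  # stated in the protocol
    return "outside expected range"  # assumption: protocol gives no rule here


if __name__ == "__main__":
    # Illustrative counts only, not measured data.
    print(percent_incorporation(30_000, 100_000), assess_probe(30_000, 100_000))
```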
2. Add 150 µl of formamide stock (Appendix) to make a final concentration of 50% and mix well.
3. Add 60 µl of 20x SSC (to make final concentration of 4x SSC) and mix well.
4. Add 30 µl of distilled water, mix well, and cool on ice for 5 min.
5. Add 30 µl of 0.1 M HCl (i.e., enough to exactly titrate the 0.1 M NaOH) and mix well.
6. Keep the hybridization reaction mixture on ice and use within 10-15 min.

Part C. The Hybridization Reaction
1. Place pretreated, air-dried slides horizontally in humid chambers (again using black paper).
2. Place 30 µl of the hybridization reaction mixture (which has been kept on ice) in the middle of the preparation of each slide.
3. Place a 22-mm² glass coverslip over each preparation, avoiding bubbles and making sure that the entire preparation is covered with reaction mixture.
4. Cover the humid chambers and incubate at 37°C for 6-12 hr.

Part D. Washing the Slides
1. Lift each slide from the humid chamber and remove the coverslip by dipping into a large volume of 2x SSC.
2. Place the slides in a coplin jar of fresh 2x SSC at 65°C for 15 min.
3. Wash in 2x SSC, 2 x 10 min at room temperature.
4. Place the slides in a coplin jar of 5% TCA at 5°C for 5 min.
5. Wash in 2x SSC, 2 x 10 min at room temperature.
6. Wash in 70% and 95% ethanol, 2 x 10 min each.
7. Air-dry. The slides are now ready for autoradiography (Protocol 24).

Protocol 22: Radioisotopic ISH Using an RNA Probe
(Time: 6-12 hr incubation plus 6-8 hr)
This protocol is based on Macgregor and Varley (1983) and Pardue (1986).
1. Preparation of RNA probe: lyophilized or ethanol-precipitated RNA should be dissolved in 2x SSC or in 4x SSC/50% formamide to provide a total of 2-3 x 10" cpm/ml and 30 µl per slide.
2. Slide pretreatment and hybridization reaction: same as for DNA-DNA hybridization (Protocol 21).
3. Remove coverslips by dipping in large volume of 2x SSC.
4. Place slides in fresh 2x SSC for 15 min at room temperature.
5. Treat each slide with ribonuclease mixture (Appendix) at 37°C for 1 hr.
6. Wash slides in 2x SSC, 2 x 15 min.
7. Place slides in 5% TCA at 5°C for 5 min.
8. Wash slides in 2x SSC, 2 x 10 min.
9. Wash slides in 70% and 95% ethanol, 2 x 10 min each.
10. Air-dry the slides; they are now ready for autoradiography (Protocol 24).

Protocol 23: Radioisotopic Localization of Single-Copy Sequences
(Time: 8-16 hr incubation plus 5-6 hr)
This protocol is from Harper and Saunders (1984). Use recombinant bacteriophage or plasmid DNA containing single-copy sequences of interest. Probes should be labeled with tritiated dNTPs by nick translation to 20-40 x 10^6 cpm/µg (see Protocol 20).
1. Pretreat slides with RNase (as in Protocol 21), rinse in 2x SSC, and dehydrate in ethanol.
2. Dissolve probe in 50% formamide, 2x SSC, 10% dextran sulfate, pH 7.0, along with 500-fold excess sonicated salmon sperm DNA carrier.
3. Denature probe mixture (85°C for 3-15 min, then chill quickly on ice).
4. Apply chilled, denatured probe mixture to slide preparations and cover with a coverslip.
5. Incubate in a humid chamber at 37°C for 8-16 hr.
6. Rinse thoroughly (e.g., 3 x 5 min each) in 2x SSC/50% formamide, pH 7.0, and then 2x SSC, pH 7.0, at 39°C, followed by dehydration in ethanol (e.g., 70 and 95%, 2 x 10 min each).
7. The slides are now ready for autoradiography (Protocol 24). Slides require an exposure time of 5-22 days at 4°C; keep the slides in a light-proof slide box (e.g., taped with black electric tape) along with desiccant in a cotton-stoppered vial.
8. G-band chromosomes with Wright stain (see Protocol 24).

Protocol 24: Autoradiography for Detection of Radioisotopic ISH
(Time: Part A, 30 min; Part B, 2-4 hr; Part C, ≈30 min; Part D, ≈30 min)
Autoradiography requires three main tasks: diluting and aliquoting a new batch of emulsion, coating the slides, and developing the exposed autoradiographs (Macgregor and Varley, 1983).

Part A. Diluting, Aliquoting, and Storing Emulsion
1. Open package of emulsion (e.g., Kodak NTB2) in the darkroom either in complete darkness or under a safelight (e.g., Kodak 8152-2525) and warm the bottle for 30 min in a 45°C water bath along with a 200-ml flask of distilled water and an empty 500-ml beaker.
2. After the emulsion has melted, pour the entire contents of the bottle very slowly down the side of the prewarmed 500-ml beaker and return beaker to the water bath.
3. Fill the empty plastic emulsion bottle with prewarmed distilled water from the Erlenmeyer flask, mix gently, and pour the contents slowly down the side of the 500-ml beaker.
4. Thoroughly mix the contents of the beaker by swirling gently so as to prevent the formation of bubbles.
5. Dispense the diluted emulsion into scintillation vials, approximately 10 ml per vial. This is enough emulsion to coat approximately 30 slides.
6. Wrap each vial in aluminum foil, place them in a light-proof box, and store at 3-5°C in a refrigerator that is never used for radioisotopes or organic solvents. Stored in this way, the emulsion may be good for at least 5 years.

Part B. Coating and Exposing the Slides
1. Working in complete darkness or under a safelight, place the sealed vial of emulsion and a slide dipping chamber into the 45°C water bath for 15-20 min (the dipping chamber can be stood in a beaker or diagonally in a coplin jar filled with water, and should be immersed to within 0.5 cm of its top edge).
2. Fill the dipping chamber by slowly pouring the emulsion down its side to avoid bubbles (a small funnel is useful).
3. Dip the slides slowly and smoothly into the chamber, one at a time (taking care not to touch the emulsion with fingers), withdraw and drain briefly against the edge of the chamber, and place in a slide rack to dry.
4. Air-dry the slides for at least 2 hr in complete darkness.
5. Store the slides for exposure in light-proof slide boxes sealed with black electric tape. Moisture during the exposure time can cause the latent image to fade, so it is important to place a vial of desiccant into each slide box. The vial of desiccant should be loosely plugged with cotton and can be held in place with a blank microscope slide. The slide boxes should be stored at 4°C for the appropriate exposure time (since the exposure time is determined empirically, it is important to include some expendable test slides).
6. After the required exposure time, the slides should be warmed to room temperature and developed according to the following procedure (all solutions must be at the same temperature, 15-20°C, to avoid cracking or wrinkling the emulsion).

Part C. Developing the Autoradiographs
1. In the dark: gently rock the preparation in freshly mixed developer (e.g., Kodak D-19), 2.5 min at 20°C (a single coplin jar can be used if developing 10 or fewer slides).
2. Pour out developer and replace with fixer; fix for 5 min at 20°C (lights can come back on after 2 min in fixer).
3. Pour out fixer and rinse slides in distilled water at least five times, 2 min each at 20°C.
4. Air-dry the slides.

Part D. Post-Autoradiography Staining with Wright's Stain
This procedure sometimes results in G-banding in mammalian (especially human) chromosomes (Chandler and Yunis, 1978; cited in Pardue, 1985).
1. Stain for 5 min in 5% Giemsa solution in phosphate buffer, pH 6.8.
2. Remove Giemsa by flooding under a gentle stream of water.
3. Air-dry slides and cover with a xylene-based mounting medium (DePex™ or Permount™).
For G-bands (mammalian chromosomes):
1. Place the slides in a solution of Wright's stain (15 ml in 45 ml of phosphate buffer, pH 6.8) for 8-10 min.
2. Rinse briefly in distilled water.
3. Enhance staining contrast by destaining slides in 95% ethanol (2 min), chloroform (15 sec), 95% ethanol plus 1% HCl (30 sec), 100% methanol (2 min), and then restaining in Wright's stain (6-8 min); repeat at least once.
4. Rinse in distilled water, air-dry, and mount in Permount™ or other xylene-based mounting medium.

Protocol 25: Chromosome Painting Using FISH
(Time: Part A, 2-3 hr; Part B, 12-18 hr; Part C, ≈5 hr)
This technique has been used to locate ribosomal DNA on chromosomes of vertebrates, and can be used for chromosome painting using chromosome-specific probes (many are commercially available). Probes are labeled with biotin or digoxigenin (Protocol 20), and the signal is amplified and detected using avidin-biotin and immunochemistry (C. A. Porter et al., 1991).

Part A. Preparation and Denaturation
1. Use air-dried or flame-dried permanent slide preparations.
2. Treat with RNase (100 µg/ml in 2x SSC, pH 7.0) for 1 hr at 37°C.
3. Rinse 3 x 3 min in 2x SSC.
4. Dehydrate in 70, 80, and 95% ethanol.
5. Denature for 2-4 min (determine empirically, starting with 2 min) at 70°C with prewarmed 70% formamide (Kodak ACS) in 2x SSC (pH 7.0).
6. Wash at least 3 times in ice-cold 70% ethanol.
7. Dehydrate in 80 and 95% ethanol, and air-dry.

Part B. Hybridization
1. Make hybridization reaction mix:
a. biotinylated probe DNA, 1-3 µg/ml in 2x SSC
b. 500 µg/ml E. coli carrier DNA
c. 30% formamide
2. Denature the probe by heating hybridization mix to 70°C for 5 min, then immediately cool by placing on ice.
3. Add 30 µl of hybridization reaction mixture to the preparation on each slide, cover with coverslips (22 mm²).
4. Seal coverslips with rubber cement.
5. Incubate at 37°C for 12-18 hr in a humid chamber.
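The hybridization mix in Part B is specified by final concentrations and a per-slide volume, so scaling it to a batch of slides is simple arithmetic. The sketch below is one way to do that scaling. It assumes that the stated concentrations are final concentrations in the mix, that 30% formamide means volume per volume, and that a 10% pipetting overage is wanted; none of these points is stated in the protocol, and the function and key names are hypothetical.

```python
import math

UL_PER_SLIDE = 30.0          # from Part B, step 3
CARRIER_UG_PER_ML = 500.0    # from Part B, step 1b
FORMAMIDE_FRACTION = 0.30    # from Part B, step 1c (assumed v/v)


def fish_hyb_mix(n_slides: int, probe_ug_per_ml: float = 2.0,
                 overage: float = 0.10) -> dict:
    """Scale the Part B hybridization mix to a number of slides.

    probe_ug_per_ml must lie in the 1-3 ug/ml range given in step 1a.
    overage is an assumed allowance for pipetting losses.
    """
    if n_slides < 1:
        raise ValueError("need at least one slide")
    if not 1.0 <= probe_ug_per_ml <= 3.0:
        raise ValueError("probe concentration should be 1-3 ug/ml")
    total_ul = n_slides * UL_PER_SLIDE * (1.0 + overage)
    total_ml = total_ul / 1000.0
    return {
        "total_ul": total_ul,
        "probe_ng": probe_ug_per_ml * total_ml * 1000.0,
        "carrier_ug": CARRIER_UG_PER_ML * total_ml,
        "formamide_ul": FORMAMIDE_FRACTION * total_ul,
    }


if __name__ == "__main__":
    for name, value in fish_hyb_mix(10).items():
        print(f"{name}: {value:.1f}")
```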
6. Add primary antibody (polyclonal anti-biotin or anti-digoxigenin, in blocking serum), 100 µl per slide (dilute the antibody according to supplier's directions, or determine empirically), cover with coverslip, and incubate in moist chamber for 1 hr at 37°C.
7. Remove coverslip (as above), and wash 3 x 3 min in PBST at room temperature.
8. Add secondary, FITC-conjugated IgG in blocking serum (diluted according to supplier's directions, or determined empirically), and incubate 30 min at 37°C in a moist chamber. [Note: Be sure to use correct antibody combinations; for example, rabbit polyclonal antibody to biotin as primary, followed by FITC-conjugated goat anti-rabbit IgG.]
9. Soak off coverslip as before, and wash 3 x 3 min in PBS at room temperature.
10. Counterstain with PI or DAPI as in Protocol 25.
11. Mount in anti-fade mount for fluorochromes (e.g., Aquamount™).
12. View slides and photograph using epifluorescence microscopy and appropriate filters (see Protocol 25).

INTERPRETATION AND TROUBLESHOOTING

Chromosome Bands
Chromosome bands can be scored in terms of their position within and between chromosomes, as well as their relative sizes (Figure 4). The terminology used for the position of bands and other markers differs for different organisms, but has been standardized for dipterans (especially Drosophila; Sturtevant and Novitski, 1941) and various species of mammals (Paris Conference, 1971; CSKRN, 1973; ISCN, 1981; Rooney and Czepulkowski, 1986). Banding data usually are presented as a karyotype constructed of chromosomes cut from a photomicrograph. It is helpful to include an idiogram, which indicates relative lengths of chromosomes and positions of bands, especially if the banding pattern is complex.
Chromosome preparation and banding can be capricious, depending on the particular organism and the kind of banding. Usually it is easiest to obtain good chromosome preparations from freshly caught, healthy, well-fed individuals (although there are always exceptions). Colchicine may be light-sensitive when in solution, so it is advisable to make it up fresh just before use, and to keep it refrigerated. Colchicine powder should be kept refrigerated and desiccated.
Among available banding techniques, C-banding is perhaps the most foolproof, reliable method for most organisms, although G-banding works reliably for most species of mammals. Failure to band using the C-banding protocol may be due to poorly aged slides, inferior Giemsa, or the absence of stainable heterochromatin in the chromosomes. It is best, therefore, to make sure that the procedure works on an organism known to have good C-bands before trying it on untested species. Also, if banding is not produced the first time, it sometimes can be induced by treating the same slides a second time; this is especially important for rare or small organisms from which few preparations are available. Sometimes the same slides may be used for several different banding procedures, using the less stringent methods first (e.g., fluorochrome banding then G-banding then C-banding and/or AgNOR banding). If fluorochrome banding does not work, then either the dye is no good (e.g., it is too old or incorrectly prepared), the wrong excitation filter was used, or the chromosomes are devoid of the kinds of sequences for which the fluorochrome is specific.

In Situ Hybridization
There are many reasons why ISH may fail to work. The most common problem is that the hybridization signal is too weak or the background signal is too high. Ideally, there should be sufficient signal (silver grains or fluorescence) to locate sites of hybridization unambiguously, but not so much that details of chromosome structure are obscured. For radioisotopically labeled repetitive sequences, hybridized sites are often visibly obvious (Plate 1), but demonstration of single-
166 Chapter 5 / Sessions
SCP (1x)
0.12 M NaCl
0.015 M sodium citrate
0.02 M NaPO4
Adjust to pH 6.0, if necessary.

3.0 M NaCl
0.30 M sodium citrate
Adjust pH to 7.0 with 10 N NaOH.

10 mM Tris
10 mM MgCl2
Adjust to pH 7.5.

TPBS
0.15 M NaCl
4 mM NaHPO4
4 mM Tris-HCl
Adjust to pH 7.6.
INTRODUCTION
This chapter focuses on "in solution" hybridization for the quantitative assessment of relatedness of biological species using nuclear DNA. We therefore ignore filter hybridization, which has not yet been shown to be useful for the quantitative evaluation of relationships. In addition to comparing and reviewing different DNA hybridization protocols, we also consider laboratory practice, DNA reassociation kinetics, the significance of genome organization to DNA hybridization data, the interpretation of melting curves, and the application of DNA hybridization data to systematics. Recently, the latter topics have come under close methodological and analytical scrutiny. We believe it is important to give due attention to these issues in a volume on molecular systematics.
Early DNA hybridization/DNA kinetics studies include those of Wetmur and Davidson (1968), Kohne (1970), Kohne and Britten (1971), Bonner et al. (1973), and Britten et al. (1974). These authors provided a sound description and theoretical underpinning for the DNA hybridization technique as well as the underlying reassociation kinetics of DNA. They also introduced several metrics for DNA hybridization data. Although these studies were not primarily concerned with systematics, the distance metrics that they suggested are still in use today.
Large-scale application of the DNA hybridization technique to problems in systematics was pioneered by Charles Sibley and Jon Ahlquist. Their rapid progress was made possible by the construction of an automated thermal elution
170 Chapter 6 / Werman, Springer & Britten
tion in homoduplex versus heteroduplex reactions can also be measured, although the factors that influence percentage reassociation are not as easy to disentangle.

Summary of the DNA Hybridization Techniques and Data Analysis
Briefly, double-stranded DNA is isolated and then purified to remove RNA and protein. Long-stranded DNA is then fragmented to short pieces to permit separation of repetitive and single-copy DNA and to reduce viscosity and gel formation. Fractionation of single-copy DNA from repetitive sequences is accomplished most easily using reassociation kinetic techniques developed by Britten et al. (1974). These methods facilitate the construction of Cot plots (Figure 1), which present the percentage of single-stranded DNA versus the log of Cot (Cot = initial concentration of DNA in moles of nucleotides per liter multiplied by time of incubation in seconds). Cot plots, in turn, allow one to determine a Cot value (under specific incubation conditions) at which repetitive sequences have reassociated and can be separated from single-stranded, single-copy DNA by hydroxyapatite column chromatography (Kohne and Britten, 1971). Fractionated single-copy DNA from one species is then radioactively labeled (tracer) and hybridized with unlabeled DNA (driver) from the same species (homoduplex reaction) and from different species (heteroduplex reactions). When the hybridization is complete, melting profiles and the extent of reaction are then determined. Melting profiles, in turn, permit the quantification of median and/or modal melting temperatures. Differences in these parameters between homoduplex and heteroduplex curves are then used as the estimates of genetic distance, ΔTm and ΔTmode. The extent of hybridization for an interspecies heteroduplex measurement may be divided by that for the homoduplex control and multiplied
[Figure 1: two panels, (A) and (B), plotting the percentage of single-stranded DNA against Cot on a logarithmic axis.]
Figure 1 (A) Ideal reassociation curve (Cot curve) for a single class of DNA (i.e., single-copy or a single frequency class of repetitive DNA). The curve tracks the loss of single-stranded DNA (determined by the formula 1/(1 + kCot); see text) and the formation of duplex DNA over log intervals of Cot as expressed in [moles of nucleotide/liter] x sec. The "half-Cot" is the Cot value (here = 1) at which 50% of the DNA has reassociated. In an ideal reaction, 80% of the DNA reassociates over 2 log intervals; thus the Cot value at which 90% of the DNA has reassociated is 10 times the half-Cot. (B) Hypothetical reassociation curve for genomic DNA that includes a mixture of highly repetitive, moderately repetitive, and single-copy components. The individual reassociation curve for the slowest (higher Cot) component is shown, with the approximate half-Cot identified by (a). Horizontal dashed lines approximate the percentage of the genome that each class comprises (20% highly repetitive, 20% moderately repetitive, and 60% single-copy). Since single-copy DNA is the last component to reassociate (half-Cot = 1,000), it can be fractionated from the repetitive DNA (over hydroxyapatite) by reassociating the total DNA to a Cot value of 100. At this point, 90% of the single-copy DNA is single-stranded.
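The numbers quoted in the Figure 1 caption can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python, assuming the ideal second-order form 1/(1 + kCot) given in the caption, together with two rules of thumb stated in the chapter's discussion of factors affecting hybridization (k roughly 10^6 divided by genome size in bp, and Cot roughly 10 x U x H x A). All function names are hypothetical and are not part of any published software.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: S = 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)


def half_cot(k):
    """Cot at which S = 0.5, which is simply 1 / k."""
    return 1.0 / k


def rate_constant_from_genome_size(genome_size_bp):
    """Rule of thumb from the chapter: k is about 1e6 / genome size (bp)."""
    return 1.0e6 / genome_size_bp


def cot_from_conditions(ug_per_ul, hours, acceleration=1.0):
    """Rule of thumb from the chapter: Cot is about 10 x U x H x A,
    with U in micrograms per microliter, H in hours, and A the
    acceleration factor for the incubation conditions."""
    return 10.0 * ug_per_ul * hours * acceleration


if __name__ == "__main__":
    # Panel B: single-copy component with half-Cot = 1,000 (k = 0.001).
    # At Cot = 100 roughly 90% of it should still be single-stranded.
    print(fraction_single_stranded(100.0, 0.001))

    # Mammalian-sized genome (3e9 bp): half-Cot of about 3000.
    k_mammal = rate_constant_from_genome_size(3.0e9)
    print(half_cot(k_mammal))

    # 1 ug/ul for 24 hours without acceleration: Cot of 240 per day.
    print(cot_from_conditions(1.0, 24.0))
```

Real reassociation curves deviate from this ideal form (for example, the S1 nuclease assay follows a different exponent), so the sketch is useful for planning incubations rather than for fitting data.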
by 100 to obtain a normalized percentage of hybridization (NPH). NPH values are generally considered to be a measure of the fraction of the DNA that has diverged to the point where it will no longer form stable interspecies duplexes under criterion conditions (see below). However, NPH may also be influenced by (1) sequences in the single-copy genome of the tracer species that are deleted in the single-copy genome of the driver species, and (2) kinetic effects (i.e., rates of reassociation decrease as interspecies sequence divergence increases; Bonner et al., 1973).
Finally, ΔTm and NPH are sometimes incorporated into yet another distance measure, ΔT50H. These different measures of genetic distance can then be used in phylogenetic analysis. Typically, complete matrices in which each taxon has been labeled and compared to all other taxa are used for this purpose. Algorithms for phylogenetic analysis with distance data (see Chapter 11) include phenetic methods (Sneath and Sokal, 1973), best-fit methods (Fitch and Margoliash, 1967; Cavalli-Sforza and Edwards, 1967), minimum-length tree methods (Farris, 1972, 1981; Faith, 1985; Saitou and Nei, 1987), and maximum likelihood methods (Felsenstein, 1987).

Properties of Hybridization Data
Genome structure, sequence evolution, population history, and the DNA hybridization technique all affect the content of distance matrices. The task at hand is to understand how these factors affect DNA hybridization data and then to select appropriate tree-construction algorithms. When rates of sequence evolution are not the same in all lineages, for example, UPGMA clustering is an inappropriate tree-construction algorithm if we desire trees that accurately reflect phylogeny. Many tree-construction methods, however, make no assumption about equal rates of change (see Chapter 11). Partly for this reason, methods that do not assume a molecular clock have become increasingly popular with distance data. On the other hand, most of these methods do assume that distance data are additive (at least in expectation). Indeed, an evolutionary interpretation of branch lengths on a best-fit tree requires an underlying additive matrix (Farris, 1981). When distance data are non-additive in expectation, a simple evolutionary path length interpretation of distances on best-fit trees is confounded.
How well do DNA hybridization data fare under the assumption of additivity? Unfortunately, several factors may compromise the additivity of DNA hybridization data (Springer and Krajewski, 1989). These factors include homoplasy (i.e., parallelisms, reversals, multiple hits), sequences that are too divergent to form heteroduplexes, pairing between paralogous sequences, horizontal gene transfer, measurement error, the distribution of rates of sequence change for different sequences, and the history of genetic variation in different lineages. Some of these factors also affect sequence data. On the other hand, violations of additivity do not necessarily preclude the accurate recovery of branching order, even if they cast doubt on the validity of branch lengths (Chapter 11). In some instances (e.g., homoplasy), there are even remedies for sources of non-additivity; these remedies make it possible to approach truly additive data. In other instances, further work will be required to improve the interpretation of DNA hybridization distances. For example, we can use the Jukes and Cantor (1969) model to correct DNA hybridization distances for homoplasy, but this requires that we know the conversion between melting point depression and percent sequence mismatch (as well as requiring that the assumptions of the model are met). Evidence for a linear relationship between the thermal stability of imperfect hybrids and the extent of sequence divergence shows that about one degree change in melting temperature corresponds to 1% sequence divergence (Bonner et al., 1973). Other estimates of this relationship are discussed in a following section.

Factors Affecting DNA Hybridization
The kinetics of DNA hybridization are affected by several factors, including genome size, copy number, DNA fragment size, and base composition. Genome size is significant because the rate of reassociation or hybridization is inversely proportional to the number of different sequences in the genome. Since most of the DNA is single-copy (that is, most sequences are different from each other),
Nucleic Acids I: DNA-DNA Hybridization 173
the rate of hybridization is determined primarily by the genome size or DNA content per haploid set of chromosomes. Genome size varies widely among taxa, ranging from 10^6 to about 10^11 nucleotide pairs among eukaryotes (Britten and Davidson, 1969; Cavalier-Smith, 1985). For bacteria the DNA content is much smaller and for viruses it can be less than 10^4 bp. This represents a 10 million-fold range in size, and thus rate of hybridization cannot be ignored and is a central part of experimental design.
Repetitive DNA makes up a significant minority of the genome of all eukaryotes (e.g., Figure 1B) and some prokaryotes and greatly influences the dynamics of hybridization. Repetitive DNA typically shows a large amount of divergence within the genome of an individual and does not usually evolve at the same rate as single-copy DNA. Thus, it is practical and necessary to remove the repeats. Very few data exist for interspecies hybridization of repetitive DNA and it has not been used for effective resolution of systematic issues, although the evolution of the repeats themselves is of some interest.
There are two essential problems in the separation of repetitive and single-copy DNA. First, repeats are interspersed throughout the single-copy DNA, requiring that the DNA be sheared to small fragments of a few hundred nucleotide pairs (bp). Second, it is not practical to separate low-frequency repeats from single-copy DNA. The more rapid rate of reassociation of repeats, compared to single-copy sequences, has been the only means for separating these two classes (see hydroxyapatite procedure described below). However, separation based on reassociation rate is never absolute, and small numbers of copies of each repeat family will remain in the "single-copy" DNA. This problem has never caused significant uncertainty in the interpretation of hybridization data because the quantity of DNA involved is small. The only situation in which this source of error is likely to be significant is at great evolutionary distances when very small amounts of hybridization are observed.
To fractionate the repetitive and single-copy components, it is practical to reassociate short-fragment DNA to the Cot required for 10% reassociation of single-copy DNA and remove all reassociated fragments (which include most repetitive elements) on hydroxyapatite. Precision among multiple hybridizations, with prepared single-copy DNA, will be improved by knowing the required Cot and accurately controlling the DNA concentrations, purities, and fragment sizes. For fragmented DNA in solution (500 nucleotide-long fragments at 60°C in 0.12 M neutral phosphate buffer), the fraction that remains single-stranded (i.e., has no duplexed regions and does not bind to hydroxyapatite) can be simply expressed as follows:

S = 1/(1 + kCot)

where k is the practical reaction rate constant (10^6 divided by the genome size in bp). The Cot is most easily calculated as 10 x U x H x A or 1/2 x OD x H x A, where A is an acceleration factor that depends on incubation conditions (= 1.0 at 0.12 M PB at 60°C), U is micrograms per microliter, H is hours, and OD is optical density at 260 nm for a 1-cm path.
For mammalian DNA with 3 x 10^9 bp a Cot of about 3000 is required for half reassociation. With 1 µg/µl of DNA under these conditions the Cot is only 240 per day. To accelerate the process, the DNA concentration and the phosphate buffer concentration can be raised (10 µg/µl is about the practical limit for the former; Britten et al., 1974). These modifications can be used to obtain a Cot of about 30,000 (or 10 times the half-Cot) in a day and a half. At this Cot, the single-stranded fraction is only about 10%, which is sufficient for most applications. Near the end of the reaction an increase of a factor of 10 in Cot reduces the amount of unreacted DNA by a factor of 10. In practice, DNA degradation may occur and it may not be profitable to use more than a few days of incubation, although chaotropic agents may help by reducing the temperature of incubation.

The Criterion and Precision of Reassociation
The precision of the partially matching duplexes that can form during reassociation is determined by the temperature and ionic strength of the incubation buffer. Together they establish the criterion
(stringency of reassociation) that is usually described as the difference between the Tm of perfect duplexes (about 85°C in most cases for 0.18 M Na+ or 0.12 M phosphate buffer) in the incubation buffer and the temperature of incubation (Britten et al., 1974: 366). The optimum rate of reassociation occurs at about 25°C below the Tm of the duplexes being formed (Bonner et al., 1973). If the conditions are too stringent (i.e., if the temperature is too high and the salt concentration too low), the NPH is reduced and all of the duplexes have high melting temperatures. The maximum ΔTm under such conditions is quite small, reducing the resolving power of the method. On the other hand, if the conditions are too relaxed, the rate of reassociation is reduced and dissimilar sequences may form unstable duplexes. Under these circumstances, even very distant sequences form duplexes, possibly to the exclusion of more closely related sequences. It is not known what fraction of duplexes are between non-orthologous sequences, and this potentially may provide misleading information on evolutionary history. However, under more suitable conditions (25-40°C below the Tm) practically all of the duplexes that form are thought to involve orthologous sequences.
Hybridizations in phosphate buffer (PB) are carried out at 60°C in about 0.48 M PB to accelerate the reaction (see above) and thereafter diluted to 0.12 M PB for thermal denaturation. Upon dilution, the effective temperature of incubation is reduced to about 53°C. Consequently, any tracer that elutes below this latter temperature is not due to denaturation of duplexes formed during the hybridization reaction and therefore can be considered an artifact.
Finally, the base composition of duplexes can affect their individual melting temperatures. Since G-C pairs share three hydrogen bonds and A-T pairs share two, DNA double strands that are G-C rich will melt at a slightly higher temperature than an A-T-rich fragment in standard phosphate buffer. This factor tends to increase the width of the melting curve for mixed fragments (i.e., when the tracer is not from a source of cloned fragments) in intra- or interspecies comparisons where the total single-copy fraction is used. The effect of base composition can be eliminated by the use of chaotropic solvents (e.g., TEACL, described below) for duplex denaturation.

Comparison of the Primary Methods

Hydroxyapatite
In 0.12 M PB at temperatures from 45 to 60°C, double-stranded DNA binds efficiently to hydroxyapatite (HAP) whereas single strands do not. Further, HAP continues to bind double-stranded DNA until the melting temperature is reached. Thus, the separation of single- and double-stranded DNA is a simple procedure. The practical capacity of HAP for DNA, including divergent sequence duplexes, is about 100 µg/400 mg HAP in buffer, although native DNA may be bound at much higher levels. However, small amounts of duplex DNA are slightly eluted near the melting temperature in 0.12 M PB (G.M. Fox et al., 1980b; Martinson, 1973). Thus, it pays to use consistent flow rates and elution volumes for high accuracy.
Many substances, such as 0.3 M NaCl or NaAc, 7 M NaClO4, or 8 M urea, can be present and do not interfere with the binding to HAP. These have little influence on the separation of single- from double-stranded DNA, although they do influence the Tm. However, other substances, such as small amounts of CsCl, protein, and some metallic ions, interfere with duplex binding. For a detailed account of the HAP procedure see Britten et al. (1974).
The major advantage of HAP for DNA hybridization studies is that it is the most explored technique. It has been investigated extensively at the physicochemical level and applied the most widely to systematic problems. Disadvantages, however, include (1) melting curves that are broader than those obtained with the TEACL method, and (2) the time involved in running individual columns and large numbers of taxa that may require automated procedures.

S1 Nuclease and Precipitation to Assay Melting
Many hybridization experiments (Benveniste and Todaro, 1976; Benveniste, 1985; O'Brien et al., 1985a) have used a straightforward procedure in which, after hybridization, the DNA is treated
with the single-strand-specific S1 nuclease and precipitated. S1 nuclease degrades single-stranded, but not double-stranded, DNA. The effective criterion is established by the rigor of the enzyme treatment and these authors have shown that different degrees of S1 digestion can change the criterion significantly. With this procedure the extent of hybridization is never as large as with HAP because the effective kinetics of reassociation are different since all unduplexed regions are digested, and because the unpaired tails of regions containing duplexes bind to HAP but are digested by S1. The result is a kinetic curve such as S = (1 + kCot)^-0.44. Ultimately, there is likely to be steric hindrance and further reduction in rate of reassociation so that the practical extent of completion for high quality tracer and driver from the same species is likely to be only about 70% S1 resistant. For hybridizations between moderately distant species, the NPH is reduced compared to that observed with HAP. There is apparently a proportionality between NPH and ΔTm, with a slope that depends on the degree of S1 digestion (Benveniste, 1985). Thus, the activity of particular S1 nuclease preparations must be assayed and applied in a consistent fashion.

The Use of Tetraethylammonium Chloride
Tetraethylammonium chloride (TEACL) is a chaotropic solvent that essentially eliminates the effect of base composition on hybrid melting temperature at a concentration of 2.4 M (Melchior and Von Hippel, 1973; Hutton and Wetmur, 1973). With the use of TEACL the observed width of a precise duplex melting curve will decrease by a factor of 10, from about 14°C (in 0.12 M PB) to about 1.5°C. The relative technical advantages of TEACL for DNA hybridization are discussed in Powell and Caccone (1990). Additionally, since TEACL interferes with hydroxyapatite chromatography, single-stranded DNA cannot be removed except by digestion with S1 nuclease (see below).
The TEACL procedure has been used effectively to detect intraspecific polymorphism in single-copy sequence divergence (Britten et al., 1978) and evolutionary relationships of closely related Drosophila (Caccone et al., 1988a; Powell and Caccone, 1989). This procedure, however, is somewhat more complicated and time consuming than the standard PB/HAP system. At present, the method is most useful for closely related species.
TEACL is usually combined with S1 nuclease methods for systematic problems. The advantages of this system include (1) elimination of the need for HAP columns, (2) many samples can be run at a single sitting (depending on the size of the heating block), (3) melting curves are narrow as compared to HAP, and (4) TEACL compensates for A-T, G-C differences. The disadvantages, as compared to HAP procedures, are that (1) S1 nuclease activity must be carefully standardized between assays, (2) the criterion depends on enzyme treatment, and (3) the NPH is more difficult to control. Although we include a representative TEACL protocol below (Protocol 11), other variations are in use. For details of similar methods see Powell and Caccone (1990) and Caccone and Powell (1992).

Melting Curves Combining the Advantages of TEACL and Hydroxyapatite
In 2.4 M TEACL, precisely paired DNA melts at about 62°C regardless of base composition. The width of the melting curve is about 1.5°C. However, this procedure has two disadvantages. First, the criterion of reassociation cannot be set for widely divergent duplexes since such duplexes are digested by S1 almost as fast as single strands. Second, the combination of the HAP technique and TEACL melting characteristics has been restricted by the fact that TEACL and phosphate buffers tend to form precipitates or two-phase solutions under a variety of conditions. However, we have observed recently that in the presence of high concentrations of TEACL, the phosphate concentration that permits duplexes to bind but allows single strands to pass through HAP is much reduced.
Although we have not exhaustively varied the concentrations and conditions, a good compromise is 2.0 M TEACL and 0.013 M PB (PT). This solvent is stable from 4°C to at least 75°C. Precise duplexes bind HAP very well from room temperature upward and are eluted from HAP only as they are melted by increasing tempera-
ture. However, single strands bind below 50°C, so this method is restricted to comparisons of relatively closely related species.
Native long DNA melts in PT at 68°C with a width of about 3°C, and sonicated fragments of DNA (500 bp average) melt at 65°C (as determined by elution from HAP in PT). As an example of fractionation, we have used the method to isolate precisely paired repeat duplexes by incubation to Cot 100 in PT at 60°C and passing the solution over HAP at this temperature. The temperature is raised in steps to 64°C with washes of PT; then the temperature is dropped and the HAP washed with low concentrations of PB to remove the TEACL. The duplexes are eluted with 0.4 M PB for analysis.
PT is a good solvent for DNA reassociation because it has a rate acceleration factor of more than 10 (as compared to 0.12 M PB at 60°C) at its optimum temperature of 40°C. The acceleration factor drops to about 1.0 at 65°C or about 3°C below the Tm. It gives narrow accurate melting curves where long native duplexes melt at about 68°C.
A method is now possible in which DNA is melted in 2.4 M TEACL and then the samples are diluted 40-fold into 0.12 M PB and passed over HAP to separate duplex from single strands (C. Hsiao, personal communication). Those who test this promising method further will have to ascertain that the small concentration of TEACL does not elute some divergent DNA duplexes and perhaps test for the best HAP temperature and the PB concentration for ideal separation of double from single strands.

Automated Melting Assay
In the earliest eukaryotic interspecies DNA hybridizations (Kohne, 1970), HAP elution was carried out with a pump and the temperature was raised with a control on a single column. The accuracy was increased by using two isotopes and an internal reference DNA. Sibley and Ahlquist (1987b and references therein) used an automatic machine to process 25 HAP columns simultaneously. A good compromise might be made between accuracy and efficiency of measurements by avoiding iodination, incorporating the use of an internal standard, and using a large number of columns in an automatic machine. Of course, tracer and driver fragment sizes and concentrations (as well as other critical variables) should be controlled carefully in automated procedures as in manual methods. There are no automated machines as yet available commercially. Most of those in use are based on individual requirements and design.

APPLICATIONS AND LIMITATIONS
Investigations into the kinetics of DNA reassociation form the foundations of DNA hybridization as a tool for questions of systematic and evolutionary relationships. Studies regarding the rate of reassociation of sheared, total genomic single-stranded DNA have provided quantitative estimates of the degree of sequence repetition, length of repeated sequences, and the interspersion pattern of these sequences throughout the genome (Britten and Kohne, 1967, 1968; Britten et al., 1974). Highly repeated sequences reassociate far more rapidly than single-copy sequences, and by varying the fragment length of the reassociating DNA, it is possible to estimate repeat length and interspersion patterns. Thus, kinetic studies have yielded a wealth of information on genome organization and structure in prokaryotes and eukaryotes, as well as providing a method useful for the separation of specific sequence classes. Britten and Kohne (1968), as well as Hood et al. (1974:56), provided explanations of reassociation kinetics and Cot analysis.
There are at least two other observations derived from kinetic studies that are important to systematic applications of this technique. First, the observed reduction in the thermal stability of reassociated hybrid DNA (ΔTm) is directly proportional to the sequence difference (in percentage base pair mismatch) between reannealed single strands. Second, this sequence divergence can reduce the rate at which sequences reassociate (Bonner et al., 1973). In other words, as divergence increases between sequences, their reassociation rate slows. This issue is discussed in a later section of this chapter.
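The proportionality between ΔTm and percentage mismatch can be turned into a small worked calculation. The sketch below is illustrative only: it assumes the conversion quoted earlier in the chapter from Bonner et al. (1973), roughly 1% mismatch per 1°C of ΔTm (other estimates exist, so the factor is left as a parameter), and then applies the standard Jukes and Cantor (1969) correction that the chapter mentions as one possible remedy for homoplasy. The function names are hypothetical.

```python
import math


def percent_mismatch_from_delta_tm(delta_tm_c, percent_per_degree=1.0):
    """Convert a melting-point depression (degrees C) to percent mismatch.

    percent_per_degree = 1.0 is an assumed conversion factor; substitute
    another calibration if one is preferred.
    """
    return delta_tm_c * percent_per_degree


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed mismatch fraction p.

    d = -(3/4) * ln(1 - (4/3) * p), defined only for 0 <= p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("mismatch fraction must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    delta_tm = 10.0  # degrees C, an arbitrary illustrative value
    p = percent_mismatch_from_delta_tm(delta_tm) / 100.0
    print("observed mismatch fraction:", p)
    print("Jukes-Cantor corrected distance:", jukes_cantor_distance(p))
```

The corrected distance always exceeds the observed mismatch, and the gap widens as ΔTm grows, which is why the correction matters most for the more divergent comparisons.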
Several studies at the intraspecific level have utilized hybridization techniques to assess interindividual and interpopulational sequence divergence and variation. Britten et al. (1978) determined the magnitude of single-copy sequence polymorphism among individuals of the sea urchin Strongylocentrotus purpuratus and, surprisingly, found it to be about 5%. Similarly, divergence estimates also have been made for isogenic (parthenogenetic) strains of Drosophila mercatorum (Caccone et al., 1987). Both of these studies employed the use of TEACL to decrease the effective width of the melting curves for more accurate determinations of ΔTm over standard phosphate buffer conditions. Others have used the latter system to determine intraspecific variation in sea stars (M.J. Smith et al., 1982), herons (Sheldon, 1987), diprotodont marsupials (Springer, 1988), cave crickets (Caccone and Powell, 1987), and Drosophila (Powell and Caccone, 1989). The magnitude of intraspecific variation may be important to consider in determining the relationships among closely related species (Chapter 11).
The majority of hybridization studies, as applied to systematics, have involved species and higher taxon relationships, up to family and ordinal level comparisons. The effective limits of resolution depend primarily on the degree of divergence (as related to rate) among taxa under investigation (see below). Powell and Caccone (1989) noted that the smallest interspecific difference accurately resolved in their studies with TEACL was a Tm reduction of 0.27°C. Recent studies on invertebrates include Drosophila (Caccone et al., 1992), sea urchins (M.J. Smith et al., 1990), and sand dollars (Marshall, 1992; Marshall and Swift, 1992). Interspecific comparisons with a primary focus on phylogenetic relationships among birds include Sheldon (1987), Sheldon et al. (1992), Sibley and Ahlquist (1987a,b and references therein), and Krajewski (1989). Mammalian studies include Bledsoe (1987), Kirsch et al. (1990a, 1990b, 1991), Springer (1988), Springer et al. (1990, 1992b), and Springer and Kirsch (1989, 1991).
As with other molecular techniques used to obtain information useful to phylogenetic reconstruction and systematic relationships, DNA hybridization has limitations. Since many of these details are discussed throughout this chapter, we provide a brief list (arbitrarily ordered) of major points.
1. Direct sequence data are not uncovered, and the data are in the form of distance information.
2. Comparisons are restricted to the single-copy fraction of the genome.
3. Dramatic differences in the size of the single-copy fraction between species pairs could produce errors in reciprocal measurements of NPH, although this has not yet been documented.
4. Large amounts of intraspecific polymorphism can be problematic in the estimation of phylogenetic relationship of closely related species.
5. The upper limit of divergence between species where relationships can be determined by this method is set by the conditions of DNA reassociation. For example, with HAP procedures at standard conditions of incubation it is generally difficult to estimate relationships with reasonable certainty if the NPH falls below 50% and the ΔTm is greater than 20°C. However, Kirsch et al. (1991) have used NPH values of less than 50% to look at interordinal relationships among marsupials. Marshall and Swift (1992), using a NaCl-S1-nuclease assay, provided evidence that phylogenetic comparisons can be made where reactions fall below these NPH and ΔTm values.
6. DNA hybridization is relatively expensive as compared to other techniques and involves the use of radioisotopes.
7. In many cases milligram quantities of DNA are required to permit reasonable comparisons with replication. Thus, comparisons among individuals are restricted to organisms from which the required amount of DNA can be extracted. However, the PERT procedure (Protocol 10) requires less DNA, allowing studies to be accomplished with several hundred micrograms.
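As a rough illustration of point 5, the normalized percentage of hybridization defined earlier in the chapter (heteroduplex extent divided by homoduplex extent, multiplied by 100) can be computed and screened against the quoted guideline values. This is a sketch only: the function names are hypothetical, the thresholds of 50% NPH and 20°C ΔTm are those quoted for HAP procedures at standard conditions, and treating either threshold on its own as a warning sign is a conservative assumption, since the studies cited above show that useful comparisons can sometimes be made beyond these limits.

```python
def normalized_percent_hybridization(hetero_extent, homo_extent):
    """NPH = 100 * (heteroduplex extent) / (homoduplex control extent)."""
    if homo_extent <= 0:
        raise ValueError("homoduplex extent must be positive")
    return 100.0 * hetero_extent / homo_extent


def within_standard_hap_guideline(nph, delta_tm_c,
                                  min_nph=50.0, max_delta_tm_c=20.0):
    """Return True when a comparison sits inside the quoted guideline range.

    Assumption: a comparison is flagged if EITHER value is out of range.
    """
    return nph >= min_nph and delta_tm_c <= max_delta_tm_c


if __name__ == "__main__":
    # Illustrative values only: 60% of tracer hybridized to the other
    # species' driver versus 80% in the homoduplex control.
    nph = normalized_percent_hybridization(60.0, 80.0)
    print("NPH:", nph)
    print("inside guideline:", within_standard_hap_guideline(nph, 12.0))
```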
PROTOCOLS

LABORATORY SETUP
Most of the supplies required for DNA hybridization are those commonly used in other DNA isolation, manipulation, and characterization techniques, including cloning and sequencing (Chapter 9). General supplies include centrifuge tubes, culture tubes, assorted glassware, ceramic mortar and pestle, pipettes (with microliter to milliliter delivery), filters, razor blades, liquid scintillation vials, etc. Supplies that may be unique to DNA hybridization include capillary tubes (10-100 µl volumes) in which hybridization reactions are carried out and polystyrene disposable chromatography columns with filter discs, 6.5 ml capacity (Figure 2). Equipment needed is shown in Table 1.

Figure 2 Diagrammatic frontal view of a multicolumn apparatus for fractionating tracers that have been self-reacted to remove repeats and/or foldback DNA. The system consists of an acrylic plastic box with legs, through which heated water can be circulated. Holes drilled in the top and bottom allow for the insertion of disposable plastic columns, with filter discs, sealed by rubber grommets. A thermometer placed in the last tube (with water) is used to measure column temperature. The apparatus is hooked up to an externally circulating heated water bath as shown in Figure 3.

1. DNA isolation and purification
2. Sheared drivers from long native DNA
3. Tracer preparation with 32P and 3H
4. Tracer self reaction and repeat removal
5. Fractionation of single-copy tracer over hydroxyapatite
6. Estimation of tracer fragment length
7. Preparing tracers by iodination
8. DNA hybridization with hydroxyapatite and phosphate buffer
9. Hydroxyapatite column preparation
10. Phenol emulsion reassociation technique (PERT)
11. Hybrid stability using the S1 nuclease-TEACL assay
Nucleic Acids I: DNA-DNA Hybridization 179
Table 1
Primary equipment used in DNA hybridization
Equipment Use
2. Rapidly dissolve powder in ice-cold SEDTA (Appendix). Use 10-100 volumes of SEDTA to tissue depending on DNA content; e.g., sperm requires a larger volume than blood. The resulting solution should be viscous.
3. Rapidly dissolve all lumps of tissue, immediately add 20% SDS to a final concentration of 1%, and stir gently to avoid shearing the DNA.
4. Add 1/5 volume of 5 M sodium perchlorate solution and mix.
5. Immediately add an equal volume of equilibrated phenol and mix for 30-60 min. Mix with just enough force to keep the emulsion from separating. Centrifuge at 5000 g for 15 min at 4°C.
6. The phenol phase should be on top because of the density of the perchlorate solution. If the phases do not separate, add a small volume of SEDTA and recentrifuge. If you add too much SEDTA, the phenol layer may end up on the bottom. Remove the phenol phase.
7. To the aqueous phase, add an equal volume of 24:1 chloroform:isoamyl alcohol, and mix at room temperature for 30 min. Centrifuge as above and save the aqueous phase (should be on the top). Leave the milky interface. Repeat steps 5-7.
8. The DNA is now ready for spooling. Place the aqueous phase in an acid-washed beaker large enough to hold four times the sample volume. Carefully layer 2 volumes of ice-cold 95% ethanol onto the DNA solution by pouring it down the side. Keep the two solutions from mixing.
9. Take a long acid-washed glass rod and rotate it slowly with a slow mixing action just below the interface of the two solutions. The DNA should wind onto the rod and form a mass large enough so that no more DNA will cling. Remove the glass rod and gently squeeze excess ethanol out of the wound DNA against the side of the beaker. Slice the DNA with a razor blade along the axis of the rod and remove it to a 15-ml sterile tube. Repeat the winding until no more DNA sticks to the rod and the two layers are nearly completely mixed.
10. Add 5 ml of TE (or about 0.5 ml/mg DNA) to the DNA and let it swell overnight at 4°C. If the DNA does not dissolve, add more TE as appropriate.
11. (Optional, if there is RNA and protein contamination) To the re-suspended DNA add 20 µg/ml DNase-free RNase and incubate in a water bath at 37°C for 1 hr. Remove and add 100 µg/ml of proteinase-K solution, 1/10 volume 3.0 M sodium acetate, and 1/100 volume of 25% SDS. Mix and incubate at 60°C for 1 hr. Extract once with phenol, then twice with 24:1 chloroform:isoamyl alcohol. Add 2 volumes of 95% ethanol to precipitate the DNA. Wash once with 70% ethanol and partially dry the DNA pellet under vacuum. Re-suspend at 2-3 mg/ml in 0.1 mM EDTA over a few drops of chloroform.
12. In a spectrophotometer, check the optical density (OD) of a dilution of the DNA preparation at 230, 260, and 280 nm. At 260 nm, 50 µg of DNA in 1 ml solution will give an OD of 1.0. The ratio of ODs at 260/280 provides an indication of RNA contamination. The ratio should be close to 1.8. The more RNA present, the higher the value. A low ratio of ODs at 260/230 indicates protein contamination; this value should be greater than or equal to 2.3.
13. Electrophorese 500 ng of the DNA solution on a 0.6% neutral agarose gel with DNA markers and stain with ethidium bromide (Chapter 8). The majority of the DNA should migrate as a large band close to the origin.

Protocol 2: Preparing Sheared Drivers from Long Native DNA
(Time: 6 hr)

For interspecies hybridizations both driver and tracer DNA fragments must be approximately 500 bases in length (denatured) to provide for the separation of repetitive and single-copy fractions, since repeats are dispersed throughout the genome at some frequency.
Shearing small samples of DNA is best accomplished using an ultrasonic cell disruptor or sonicator. DNA samples (>20 ml) can be sheared in a motorized tissue homogenizer following Britten et al. (1974) and J.A. Hunt et al. (1981). A protocol for sonicating DNA (in solution) is outlined below.

1. To 400 µl sterile water add 50 µl 3.0 M sodium acetate and 100 µg of DNA (= 50 µl at a concentration of 2 mg/ml) in a 2-ml sterile glass screwcap vial. Mix gently and cool on ice for 15 min.
2. Sonicate for 30 sec at 80-90% maximum power with the tip of the sonicator probe just below the surface of the solution. Put on ice for 30 sec and repeat four more times. Place on ice and set up a small chelating resin column (see step 3) to filter out any metallic ions or particles introduced during sonication. Before step 3, it may be desirable to go to step 4 (below) and check the size of the DNA in case additional sonications are required.
3. Clamp a 1-ml pipette tip to a ring stand and push a small piece of sterile glass wool into the tip. Add 0.5 ml of chelating resin (equilibrated to pH 7.0 with 0.3 M sodium acetate) and rinse several times with 1 ml of 0.3 M sodium acetate. As the final rinse of sodium acetate passes through the chelating resin, add the DNA sample and collect it after it passes through the column. Add 250 µl sodium acetate to the column to wash out any DNA and combine with the previously collected sample.
4. Divide sample into 500 µl aliquots in 1.5-ml microcentrifuge tubes and to each add 1 ml of cold 95% ethanol to precipitate the DNA. Spin in a microcentrifuge at high speed for 10 min at 4°C. Wash pellets in 1 ml of 70% ethanol and repeat the spin. Decant the ethanol and partially dry the pellets under vacuum. Resuspend the pellets in 20-30 µl 0.1 mM EDTA or in an appropriate volume to obtain a concentration of 5 µg/µl (5 mg/ml).
5. Electrophorese 500 ng of the sonicated DNA on a 2% alkaline agarose mini-gel (Protocol 6) at 40 V for 2-4 hr with pBR/HinfI marker (or some other suitable marker for 500-bp fragments). Neutralize gel in 500 mM Tris (pH 7.5) and stain with ethidium bromide to visualize the DNA. If the sheared DNA is much longer than 500 bases in length, then it must be sonicated again. If the DNA is much shorter than 500 bases, then it cannot be used for driver. Thus, it is best not to overshear the DNA at first.

Protocol 3: Tracer Preparation with 32P or 3H
(Time: 3 hr)

Radioactively labeled tracer DNA can be prepared by standard nick translation procedures (see also Chapter 8), although iodination has been used extensively in previous systematic applications of DNA hybridization (Sibley and Ahlquist 1987a, and references therein). Iodination procedures (Protocol 7) are somewhat difficult to establish and may require practice to achieve good tracers on a routine basis. An advantage of an iodinated tracer is a long half-life (about 60 days). However, there are also advantages in the use of 32P- or 3H-labeled tracers, which can be counted in a beta counter. 32P can be counted Cherenkov without the use of scintillation fluid, can be detected by a hand-held Geiger counter, and has a high counting efficiency (95%). Tritium does not share these advantages but has a very long half-life and is a lower energy emitter; consequently, tracers made with 3H have an extremely long shelf life. 32P has a half-life of about 14 days; consequently tracers lose their activity and detectability rather quickly. Also, 32P tracers degrade within a few days to a week if labeled to very high specific activity (>1 × 10^6 cpm/µg). Below is a protocol for synthesizing a 32P genomic tracer. 3H can be substituted or used in combination with 32P, so that a 3H tracer can be tracked easily.
Nick translation of long native genomic DNA is preferred since one has more control in the resulting fragment size by varying the quantity of DNase added to the reaction. If starting with sheared DNA (500 bp), the resulting fragment size will, on average, be considerably smaller, possibly too small to be used as tracer. Additional details regarding nick translation procedures are presented in Sambrook et al. (1989).
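The practical effect of the half-lives quoted above (about 14 days for 32P, about 60 days for an iodinated tracer) can be estimated with the standard exponential decay law. The following is a minimal sketch, not part of the original protocol; the function names and the 28-day example are illustrative assumptions, and tritium is left out because the text gives no numerical half-life for it.

```python
import math

# Half-lives in days, as quoted in the text. These are the only values
# taken from the protocol; everything else here is illustrative.
HALF_LIFE_DAYS = {"32P": 14.0, "125I": 60.0}


def fraction_remaining(isotope: str, days: float) -> float:
    """Fraction of the starting activity left after `days` of decay."""
    return 0.5 ** (days / HALF_LIFE_DAYS[isotope])


def days_until_fraction(isotope: str, fraction: float) -> float:
    """Days needed for activity to fall to `fraction` of its starting value."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return HALF_LIFE_DAYS[isotope] * math.log(fraction) / math.log(0.5)


if __name__ == "__main__":
    # Hypothetical example: a tracer stored for four weeks before use.
    for isotope in HALF_LIFE_DAYS:
        left = fraction_remaining(isotope, 28.0)
        print(f"{isotope}: {left:.1%} of the activity remains after 28 days")
```

This only models radioactive decay. It says nothing about the chemical degradation of highly labeled 32P tracers mentioned above, which can limit shelf life sooner.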
1. In a 1.5-ml microcentrifuge tube add 5 µg of long native DNA to be nick translated. Then add
   5 µl 10x nick translation buffer (Appendix)
   2 µl each 1 mM dATP, dGTP, dTTP
   1 µl 1 mM dCTP (or less, for higher specific activity)
   5 µl [32P]dCTP (50 µCi at 800 Ci/mM)
   1 µl DNA polymerase
   0.5 µl DNase (10^6 dilution of a 10 mg/ml stock)
   Sterile water to 50 µl total
   For 3H, use several labeled triphosphates without dilution. Incubate at 12-14°C for 2 hr. Add 1/20 volume 0.5 M EDTA and place on ice. Remove 1 µl and dilute to 500 µl with water in another 1.5-ml tube to check 32P incorporation.
2. To check the amount of radioactive nucleotide incorporated, take 10 µl of the 500 µl sample and dot it onto a Whatman GF/C glass filter disc and set aside. Take another 10 µl and mix it with 5 ml 10% ice-cold trichloroacetic acid (TCA) and 50 µg of sheared salmon sperm DNA. Put on ice for 15 min.
3. Filter the 5 ml of sample plus DNA through a 2.4 cm GF/C glass filter disc and wash 5 times with 5 ml 10% TCA followed by two washes with 95% ethanol. Set the filter aside in a fume hood behind a shield and let it dry; remember both filters are radioactive, as are the wash solutions. Place both filters into separate scintillation vials and add 10 ml of scintillation cocktail (for 3H) or count Cherenkov (no fluid added) for 32P. The unwashed filter represents the total radioactivity added; the washed filter represents the amount of 32P (or 3H) incorporated into the DNA. Incorporation should reach 30-40%.
4. The unincorporated nucleotides must be removed from the nick translation reaction. This can be accomplished by the "spun column technique" outlined in Chapter 8. However, we prefer to use the glassmilk elution procedure described in Davis et al. (1986:123), which is available commercially as the Geneclean® kit (BIO 101 Inc., P.O. Box 2284, La Jolla, CA 92038-2284). The general procedure is outlined below.
5. To the nick translation reaction add 10 µl of the glass beads solution (50% slurry in water, = glassmilk of the Geneclean® kit) and gently mix. Add 150 µl sodium iodide solution (Appendix), mix gently, and set aside at room temperature for 5 min.
6. Spin at high speed in a microcentrifuge for 5-10 sec. Discard the supernatant (radioactive) and wash the glass pellet three times with 500 µl of the ethanol-Tris wash solution (see ethanol wash in Appendix). Spin 5-10 sec between washes and discard supernatant at each step. Be sure to resuspend the glass pellet with each new wash. On the last wash remove as much of the ethanol solution as possible.
7. Elute the nick translated DNA from the glassmilk by adding 25 µl of TE or 0.48 M PB and placing it into a 50°C water bath for 15 min. Spin down the glass for 30 sec and remove the supernatant to a new tube. The tracer DNA should be free of unincorporated nucleotides and other impurities. It can now be self-reacted to remove repetitive DNA and any "snapback" DNA formed during the nick translation procedure (Protocol 4).

Protocol 4: Tracer Self-Reaction and Repeat Removal
(Time: 1-2 days)

A preliminary hybridization reaction and fractionation must be carried out on the newly labeled tracer. An appropriate Cot must be chosen to reassociate the repeat DNA while leaving the single-copy component single-stranded so as to separate these components using hydroxyapatite chromatography. The Cot necessary can be calculated by the following formula:

Cot = (µg DNA/µl sample volume) × 10 × AF × time (hr)
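The Cot formula above is simple enough to rearrange for the incubation time, which is usually the quantity one needs at the bench. The sketch below is illustrative only: the function names are hypothetical, AF is assumed to be the acceleration factor of the chosen buffer relative to 0.12 M PB (so AF = 1 in 0.12 M PB), and the example concentration and target Cot are made-up numbers rather than recommended values.

```python
def cot(ug_per_ul: float, accel_factor: float, hours: float) -> float:
    """Cot = (ug DNA / ul sample volume) x 10 x AF x time (hr)."""
    return ug_per_ul * 10.0 * accel_factor * hours


def hours_for_cot(target_cot: float, ug_per_ul: float, accel_factor: float) -> float:
    """Incubation time (hr) needed to reach `target_cot`."""
    if ug_per_ul <= 0 or accel_factor <= 0:
        raise ValueError("concentration and acceleration factor must be positive")
    return target_cot / (ug_per_ul * 10.0 * accel_factor)


if __name__ == "__main__":
    # Hypothetical example: DNA at 2 ug/ul, AF assumed to be 1.0.
    conc, af = 2.0, 1.0
    print(f"Cot after 5 hr: {cot(conc, af, 5.0):g}")
    print(f"Hours to reach Cot 100: {hours_for_cot(100.0, conc, af):g}")
```

Because Cot scales linearly with concentration, halving the reaction volume at constant DNA mass halves the required incubation time, which is why small reaction volumes are preferred.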
Tracers below 500 bases in length have the effect of lowering the Tm by (500/tracer length) in degrees C. Thus, a tracer of length 250 will lower the Tm by 2°C. Tracer size must be estimated in the denatured state and this can be accomplished with alkaline agarose gel electrophoresis as described below.

1. Prepare a 2% agarose gel by adding 3 g agarose to 150 ml of gel buffer (50 mM NaCl, 1 mM EDTA) and microwave or boil to dissolve. Pour into an appropriately sized gel mold and let cool.
2. Place the gel into a submerged electrophoresis chamber and add enough alkaline running buffer to cover the gel with 0.5 cm of buffer. Let the gel equilibrate for 1 hr. Remove enough running buffer so that only about 2 or 3 mm of buffer lies on top of the gel.
3. Add 1/20 vol 1 M NaOH to 1000-2000 cpm of tracer and incubate at 37°C for 10 min. Add 1/5 vol loading dye (Chapter 8, Appendix), mix, and load onto the gel in the second or third lane. Treat 500 ng of appropriate marker (pBR/HinfI) in the same fashion and load onto the first lane. Run gel at 35 V for 6-10 hr.
4. Following electrophoresis, cut the lanes into strips separating the marker lane from the tracer lane(s). Neutralize marker lane in 0.5 M Tris-HCl, pH 7.5, and stain with ethidium bromide. Visualize and photograph (include a ruler) on a UV transilluminator.
5. Cut the tracer lane into 0.5-cm segments, starting from the origin (loading well), and place each piece into a separate scintillation vial in order; add 10 ml of scintillation fluid and count for 5 min each. Graphically compare the position of the modal cpm with the measured marker fragments to estimate the average tracer size. Use this size to adjust the Tm if required. The primary reason for sizing is to avoid degraded tracers or those that are exceptionally long.

Unlike the nick translation procedure, iodination of DNA for hybridization experiments in systematics generally is carried out on sheared, single-copy DNA. Therefore, we will describe a procedure in which single-copy DNA is first isolated and then radiolabeled. This procedure is derived from the general protocols given in Commerford (1971), Davis (1973), Tereba and McCarthy (1973), Orosz and Wetmur (1974), Scherberg and Refetoff (1975), Chan et al. (1976), Anderson and Folk (1976), Prensky (1976), and Sibley and Ahlquist (1981a). The primary result of the iodination reaction is the replacement of a hydrogen atom at the C-5 position of cytidine by an iodine atom. Iodination is much more efficient when DNA is single-stranded.
A few words of caution should be mentioned for investigators contemplating the use of radioiodine. The temperature, acidic pH, and presence of an oxidizing agent in the iodination reaction all contribute to the volatilization of a fraction of the radioiodine (Prensky, 1976). This danger requires rigid measures of monitoring and protection. The review paper of Prensky (1976) is particularly relevant.
Much larger quantities of DNA are generally labeled with radioiodine. When DNA samples are in short supply, this is a significant consideration. Also, radioiodine must be assayed in a gamma counter.

Part A. Preparation of Samples for Iodination with 125I

1. Boil 1.0-1.5 mg of sheared, native DNA in 0.48 M phosphate buffer for 10 min and incubate at 60°C to a Cot value at which repeated sequences have reannealed. Dilute the sample to 0.12 M phosphate buffer and apply to a hydroxyapatite column at 55°C. Elute the single-stranded, single-copy DNA with 20 ml of 0.12 M phosphate buffer.
2. Dialyze the single-copy fraction of DNA against deionized water for 48 hr to remove phosphate buffer. Change water frequently.
3. Transfer dialyzed sample to a serum bottle and freeze at -20°C. Lyophilize sample for 24 hr until DNA sample appears like cotton.
4. Refrigerate sample until subsequent iodination (not more than 24 hr).
5. Rehydrate lyophilized sample in a small volume (50-100 µl) of 0.2 M NaAc adjusted to pH 7.5 with glacial acetic acid. It is convenient to carry out the rehydration on a piece of parafilm.
6. Transfer the sample to a 1.5-ml microfuge tube. Vortex for 15 sec and centrifuge in a microfuge for 30 sec to remove insoluble debris. Determine the concentration of DNA using a spectrophotometer. One to 2 µl of the sample diluted in 2 ml of water generally is sufficient for this purpose.
7. Combine an aliquot of the sample containing 100 µg of DNA with 0.2 M NaAc (pH 5.7) to bring the total volume to 130 µl in a 1-ml stoppered serum vial. Add 6 µl of 2 mM KI and 11 µl of bromcresol green dye (BGD is a pH indicator). Adjust the pH of the reaction mixture to 4.7-4.8 with 0.2 M NaAc (pH 4.0), using pH color standards. Place sample on ice.

Part B. Preparation of Iodine

It is convenient to carry out iodinations for several samples at once. For sample reactions prepared as above, eight DNA samples can be radiolabeled with 5 mCi of 125I. The following protocol is thus designed for the simultaneous iodination of eight samples with 5 mCi of 125I, although the basic protocol can be adapted to any number of samples by using more or less iodine.
All of the manipulations described below should be carried out under a hood while wearing two pairs of latex gloves.

1. Start with 5 mCi of 125I in a volume of 10 µl. Vent the rubber seal of the iodine container with a 23-gauge needle. Using a 23-gauge needle and a 1-ml syringe, dilute the iodine solution with 340 µl of 0.2 M NaAc and 10 µl of 1 mM KI and allow to equilibrate for 1 hr. Do not remove the rubber top on the iodine container; iodine is highly volatile at this stage.
2. Using a 23-gauge needle and a 1-ml syringe, carefully draw out all of the iodine solution and add 40 µl of isotope (0.625 mCi) to each of the eight samples. Do not remove the rubber stoppers from the serum vials.
3. Add 60 µl of 18 mM thallium chloride (TlCl) to each sample. Again, use a 23-gauge needle and 1-ml syringe to deliver the TlCl through the rubber stopper that caps each sample reaction.
4. Incubate samples at 60°C for 15 min in a temperature block.
5. Place samples on ice for 5 min.
6. Use a 23-gauge needle and 1-ml syringe to add 30 µl of 1.0 M Tris (base) to each sample.
7. Heat samples for 10 min at 60°C.
8. Place samples on ice for 5 min.
9. Transfer samples to dialysis bags and dialyze overnight against a 4-liter solution of 0.4 M NaCl, 0.01 M phosphate buffer, and 0.2 mM EDTA.
10. Transfer samples from dialysis bags to screw-top vials using Pasteur pipettes.
11. Determine concentration of DNA using a spectrophotometer (see Protocol 1, step 12). Cuvettes committed for this purpose will remain radioactive and should not be used for other laboratory work.
12. Count 1 µl of each sample in a gamma counter.
13. Store iodinated DNA tracers at -20°C until needed for hybridization.

Protocol 8: DNA Hybridization with Hydroxyapatite and Phosphate Buffer
(Time: 2 hr setup; up to 2 weeks incubation)

The reassociation of DNA hybrids and their melting properties in neutral phosphate buffer (PB) has been used extensively to study genome structure and systematic relationships. The description given here is for the simplest manual procedure,
which is adequate but less accurate than some automated methods (e.g., Kohne et al., 1972; Britten et al., 1974).
In setting up hybridizations in the PB system with properly prepared tracer, careful attention must be given to the following: (1) The concentrations of the PB stock solutions, the hybridization reaction mix, and the PB used to elute single-stranded DNA from the hydroxyapatite column are critical and must be known with accuracy. (2) The reaction volume must be as small as possible (10-100 µl) with a DNA mass:volume ratio of at least one. Long periods of time are required to achieve high Cot if the volume is large and driver concentration is small (see below). (3) The driver must be from 1,000- to 10,000-fold in excess over the tracer mass for total single-copy reactions to prevent significant self-reassociation of the tracer. (4) For a set of interspecific measurements using a particular tracer, three reactions must be set up for thermal fractionation: (a) tracer × driver DNA of the same individual or species; (b) tracer × driver DNA from different species; and (c) tracer × greatly divergent DNA (to control for self-reaction of tracer). Hybridizations using 32P or 3H are set up as follows (125I hybridizations require approximately 250,000 cpm of tracer):

1. In a 1.5-ml microcentrifuge tube combine 500-1,000 cpm of tracer (mass of tracer can be calculated from its specific activity) with a 1000-fold excess of sheared (500 bp) driver (e.g., for 5 ng tracer add 5 µg or more of driver). Remember the tracer will be in 0.12 M PB and the driver will be in water or 0.1 mM EDTA.
2. Adjust the final PB concentration to 0.60 M with 2.4 M stock (pH 6.8), mix, and draw the solution into a sterile glass capillary tube leaving at least 1 cm of air space at each end. Flame seal the ends and inspect under a dissection microscope to ensure the integrity of the seal.
3. Repeat for homoduplex and control reactions.
4. Mark tubes with water-resistant tape and boil at 100°C for 2 min. Place immediately into a 50-ml screw cap glass tube filled with water at 60°C and submerge in a 60°C water bath. Incubate for the appropriate length of time to achieve the desired Cot. This can be calculated as follows:

   Cot = (µg driver/µl reaction volume) × 10 × AF × hr of incubation

   PB concentrations over 0.12 M accelerate the reaction (AF is the acceleration factor relative to 0.12 M PB): e.g., 0.48 M = 5.6 times faster and 0.60 M = 6.5 times faster (see Britten et al., 1974).
5. Following the incubation, remove the capillary tube and break off each end by first filing a small groove and snapping it manually. Immediately dilute to 0.12 M PB. This is accomplished by calculating the volume of water necessary to dilute the 0.60 M reaction mix to 0.12 M and adding this volume of water to 500 µl 0.12 M PB. The hybridization solution can be dissolved directly into this solution. Do not dilute the reaction mix into pure water, as denaturation may take place prematurely. PB must be present.
6. If the hybridizations are not to be fractionated immediately, they should be taken out of the 60°C bath and placed directly into a dry ice-ethanol bath and quick frozen. They then can be stored at -20°C for a few days. Slow freezing is disastrous as everything binds to HAP.
7. Once the hybridizations have been diluted to 0.12 M PB they can be loaded on a column for thermal denaturation (Protocols 9-10).

Protocol 9: Hydroxyapatite Column Preparation
(Time: 3 hr)

1. Rinse the column (Figure 3) twice with distilled water (all solutions must be blown through the column under air pressure) and load 400 mg dry hydroxyapatite (HAP). Rinse the HAP three times with 3 ml water followed by two rinses with 3 ml 0.12 M PB. Load each column with 3 ml 0.12 M PB and increase the circulating water temperature to 50°C. Blow this solution through the column into individual scintillation vials and use them as the "blanks" for background counts.
2. Load the hybridization solution (now in about 600 µl of 0.12 M PB) onto the HAP bed and add 2.4 ml of 0.12 M PB. Gently stir the HAP with a glass rod to remove trapped air and record the temperature of the column. Remove the thermometer and blow the PB through the column into a new vial. Add 3 ml of 0.12 M PB to the column, let it equilibrate to 50°C, and blow through and collect. Repeat the 0.12 M PB wash twice more at this temperature.
3. Raise the column temperature at 3-5°C intervals to 100°C, collecting a 3 ml 0.12 M PB wash at each interval. Be sure to gently mix the HAP at each interval and allow the temperature to remain constant for 3-4 min. Check and record the temperature before blowing the wash through.
4. When the fractions are collected, add scintillation fluid and count each vial twice for 5 min, or longer (10 min) if counts are low (below 40 cpm).
5. The first fraction collected is the blank. The load fraction and the three 50°C washes are used to determine the proportion of unreacted single-stranded tracer. The remaining fractions at the temperatures above 50°C are used to determine the Tm (see "Interpretation and Troubleshooting").

Protocol 10: Phenol Emulsion Reassociation Technique (PERT)
(Time: 2 hr, plus a 1-5 day incubation)

The PERT system (Kohne et al., 1977) is a method that achieves high Cot values at room temperature by using a phenol emulsion phase to accelerate the reassociation reaction as much as 10,000-fold. We have found this to be a high-stringency reassociation system, and it is essentially unexplored for studies of relationships. Advantages of this system include high Cot in a short period of time (half Cot of Drosophila scnDNA is about 10 min), the elimination of a low temperature foot on remelting curves (due to degraded tracer, etc.), and the reduction in the amount of driver necessary (optimal acceleration at about 5 µg). However, the system is limited in that the stringency criteria probably cannot be manipulated and the NPH falls more quickly with increasing evolutionary distance than with the standard PB system. The systematic value of this technique has yet to be determined, but the protocol is included here to promote further study.

1. Combine 5 µg of sheared driver and 500-1000 cpm tracer in a 1.5-ml microcentrifuge tube and adjust the PB concentration to 0.48 M using 2.4 M stock. Add 0.48 M PB to a final volume of 1 ml, mix, and place in 100°C water bath for 3-5 min to denature the DNA.
2. Cool the mixture to room temperature and add 100 µl equilibrated phenol and vortex. Shake continuously to keep the phases from separating for one to several days. Mixing can be accomplished by attaching the tubes to a "wrist action" flask shaker, which shakes the tubes in a vigorous up-and-down motion.
3. Remove from shaking and add 400 µl ether and vortex to extract the phenol. Remove the ether and discard.
4. In a sterile 15-ml tube combine 3 ml water, 1 ml 0.12 M PB, and add exactly 1 ml of the hybridization solution. This will yield 5 ml of a 0.12 M PB solution which can be fractionated over HAP as in Protocols 8 and 9, except for the following: instead of adding 0.12 M PB to the loaded sample to bring it to 3 ml, simply add the 5 ml to the column, mix, and blow it through the column, dividing the sample equally into two scintillation vials. Wash three times with 0.12 M PB at 50°C and continue as in Protocol 9, step 3.

Protocol 11: Analysis of Hybrid Thermal Stability Using the S1 Nuclease-TEACL Assay
(Time: 3 hr setup, incubation up to two weeks, 6 hr fractionation)

This procedure requires the heating of a number of samples simultaneously in a temperature gradient from 45 to 65°C. This is accomplished with the use of an aluminum block with heat exchangers at each end that can be connected to two separate temperature-controlled circulating water baths. A series of 11-mm diameter holes drilled into the block in a staggered pattern can accommodate 1.5-ml microcentrifuge tubes after the holes have been partially filled with water. The block must be well insulated with styrofoam (six sides) and the resulting temperature gradient along the block is linear and should be accurate within 0.1°C (Britten et al., 1978). This should be checked with a precision thermometer moved from hole to hole.

1. Tracer and driver DNA must be in 0.1 mM EDTA. Tracer should be at least 200 cpm/µl and sheared driver must be at a concentration of 5 µg/µl. Add about 8,000-10,000 cpm tracer to 300-400 µg driver (e.g., 30 µl tracer to 70 µl driver). To this mixture add an equal volume of 3.0 M TEACL to reach a final concentration of 1.5 M TEACL. Seal in a 200-µl capillary tube as described above (in the PB system), and incubate in a boiling water bath for 2 min. Due to the large bore of the capillary tube, it is useful to break a smaller diameter tube (10-20 µl) and insert a small 0.5-cm piece into the ends of the 200-µl tube before flaming the ends. This will ensure a good seal.
2. After denaturation, place the tube in a 45°C water bath for a sufficient length of time to achieve an appropriate Cot. The acceleration factor for incubations in 1.5 M TEACL is four times that of 0.12 M PB. Thus, Cot can be calculated using the formula: Cot = [(µg/µl) × 10 × 4 × hr incubation].
3. Following the reassociation incubation, cool the sample to room temperature, add an equal volume of 2x S1 nuclease buffer to decrease the pH to 4.4 and the TEACL concentration to 0.75 M. Add S1 nuclease (1-2 µl of a 5 U/µl stock) for 95% single-strand digestion (appropriate incubation times may have to be determined empirically). Vortex and incubate at 37°C for 10 min. Remove, vortex again to remove lumps, and incubate at 37°C for 50 min.
4. Chill the sample on ice and add 1/10 volume of 0.30 M EDTA to stop the reaction. Remove
10-20 µl of the mixture and purify with the glassmilk elution procedure (Protocol 3) or the spun column technique (Chapter 8) and determine the duplex fragment size by electrophoresis through an alkaline agarose gel, as outlined in Protocol 7.
5. To remove digested single-stranded DNA from the remainder of the sample and bring it to 2.4 M TEACL, load the sample on to a 3-ml Sephadex G-100 column previously equilibrated with 2.4 M TEACL and wash through with 10-15 ml of 2.4 M TEACL. Collect 20- to 30-drop fractions and count Cherenkov in a scintillation counter to identify the exclusion peak (maximum concentration of duplex DNA, which may be in more than one of the fractions). Bring the peak fraction to 1.8 ml with 2.4 M TEACL, and determine the precise concentration by means of a refractometer.
6. Remove 100-µl aliquots from the 1.8-ml sample and place into 16 microcentrifuge tubes. Place 14 of these in the heating block (with a little water in each hole). Keep one of the remaining samples unheated and heat the other to 70°C for 30 min, prior to S1 digestion. Heat the samples in the heating block for 30 min in the thermal gradient described above. These samples will determine the melting curve. To determine the total amount of radioactivity in each sample, count 100 µl of the 1.8-ml sample in liquid scintillation fluid.
7. After heating, place all the samples on ice and add 100 µl of sterile water, 200 µl of 2x S1 nuclease buffer, and 10 µl of S1 enzyme solution (Appendix); use enough S1 nuclease to digest 99% of single strands. Vortex and incubate at 37°C for 1 hr. Put on ice and add 1/10 volume 0.5 M EDTA (= 40.5 µl) to each tube to stop the reaction. At this point the remaining duplex DNA must again be separated from the digestion products (both of which are radioactive).
8. Prepare 16 individual (3-ml) Sephadex G-100 columns that have been equilibrated with 1.0 mM EDTA. Load the 16 samples and elute with 3 ml 1.0 mM EDTA and collect the excluded fraction. Mix with an appropriate volume of scintillation fluid (10 ml) and count for 5-10 min per vial. This excluded fraction is the portion of DNA remaining as undigested duplex. The melting curve can be constructed by plotting the increasing fraction of the sample digested by the S1 nuclease versus increasing temperature. The cpm of duplex in the unheated sample divided by the total cpm of the predigested 100-µl sample (x 100) provides the extent of reassociation in percent.
9. As an alternative to the final Sephadex G-100 fractionation, the DNA in duplex can be precipitated with cetyltrimethylammonium bromide (CETAB) and counted separately from the digested single-stranded DNA remaining in the supernatant (Hereford and Rosbash, 1977; T.J. Hall et al., 1980). Add 150 µg of calf thymus DNA and 1/2 volume of 9% CETAB and NaAc to a final concentration of 0.1 M. Spin in a microcentrifuge for 10 min and resuspend the pellet in a small volume of 1.0 mM EDTA; count this sample separately from the supernatant.

INTERPRETATION AND TROUBLESHOOTING

Calculation of Melting Curves from Raw Counts: An Example

Several different measures are possible for estimating evolutionary distance from interspecific DNA hybridizations and melting curves: ΔTmode, ΔTm, percentage hybridization (ΔNPH), and ΔT50H (which combines ΔTm and ΔNPH into a single number).
The Tm is the interpolated temperature at which 50% of the hybrids that were formed remain in duplexes. Tm can be determined easily by inspection of an integral melting curve, or non-linear least-squares regression methods can be used to increase accuracy. The T50H is an estimate of the temperature at which 50% of the DNA remains in duplexes; this measure differs from Tm if not all DNA fragments form duplexes. The Tmode depends on the determination of the interpolated temperature at which the maximum number of hybrids melt (the peak in a differential plot of a melting curve).
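The definition of Tmode as the peak of a differential melting curve can be illustrated with a short Python sketch. It is only one possible approach, under stated assumptions: the function name `t_mode` is hypothetical, the elution temperatures are assumed to be equally spaced, and the peak is refined by fitting a parabola through the highest fraction and its two neighbours. The counts in the example are invented for illustration and are not taken from any table in this chapter.

```python
def t_mode(temps, counts):
    """Estimate Tmode from a differential melting curve.

    temps  -- elution temperatures (assumed equally spaced, ascending)
    counts -- blank-corrected cpm eluted at each temperature

    The highest fraction is located, then the peak position is refined by
    the vertex of a parabola through that point and its two neighbours.
    """
    if len(temps) != len(counts) or len(temps) < 3:
        raise ValueError("need at least three matching temperature/count pairs")

    peak = max(range(len(counts)), key=counts.__getitem__)

    # A peak at either end of the curve cannot be refined; return it as is.
    if peak == 0 or peak == len(counts) - 1:
        return float(temps[peak])

    left, centre, right = counts[peak - 1], counts[peak], counts[peak + 1]
    curvature = left - 2 * centre + right
    if curvature == 0:
        return float(temps[peak])

    step = temps[peak + 1] - temps[peak]
    return temps[peak] + 0.5 * step * (left - right) / curvature


if __name__ == "__main__":
    # Hypothetical counts for washes at 5 degree intervals (illustration only).
    example_temps = [60, 65, 70, 75, 80, 85, 90]
    example_counts = [5, 20, 60, 110, 90, 30, 5]
    print(f"Tmode is about {t_mode(example_temps, example_counts):.1f} C")
```

With broad, flat curves the three-point fit becomes sensitive to scatter in individual fractions, which is one reason narrower temperature intervals are preferable.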
The following example (Tables 2-4) is based on a PERT, interspecies tracer-driver reassociation, with a homoduplex control. The curves are derived from actual data and are of good quality. Poor curves, influenced by a variety of factors, are illustrated and described at the end of this section.
The tracer has reacted to the extent of 87% with driver from which the tracer was originally made (homoduplex control); this degree of hybridization is not uncommon for PERT reactions. Thermal denaturation of the duplex DNA is as described in Protocol 9, where 15 fractions (vials) are collected and counted, with washes at 5°C intervals (narrower intervals would be preferable). The fractions and the cpm tracer in each fraction, based on 5 min counting of each in a liquid scintillation counter (beta), are presented in Table 2 (a gamma counter is needed for 125I).
Adding the blank-corrected cpm from #2 to #15 gives the total cpm (= 638.4). The proportion of tracer which did not bind to the HAP on loading is the sum of fractions 2 through 5 divided by the total cpm x 100 (84.6/638.4 x 100 = 13.25% unbound). The bound fraction, or the extent of reaction, is the percentage of tracer that formed stable hybrids during the reassociation and is represented by the sum of the counts from vials 6 through 15 divided by the total, and expressed as a percentage [(553.8/638.4) x 100 = 86.74%].
To obtain the data to draw an integral melting curve normalized to 100% reactivity, the cpm from vials 6-15 are added sequentially and each total is divided by the total cpm bound and multiplied by 100 to give the total percentage eluted at each temperature (Table 3). The curve (Figure 4) is generated by plotting percentage eluted (y axis) versus temperature (x axis). The Tm of this curve is the point on the temperature scale where the curve intersects 50% eluted.
In a heteroduplex DNA hybridization reaction involving the tracer (in the first example) and a driver from a different species, the curve (normalized to 100%) can be determined and the Tm identified as in the example above. However, for T50H estimates the homoduplex curve is normalized to 100%, as above, but the heteroduplex
Table 2
Raw counts data for homoduplex hybridization for the calculation of NPR and an integral melting curve
Temperature (°C)   Vial   cpm eluted   cpm blank

Table 3
Counting data for calculating an integral melting curve normalized to 100% reactivity
Temperature (°C)   cpm blank   cpm eluted (%)
Table 4
Calculation of interspecies melting curve data (a)

Vial       A        B        C         D         E     Elution temperature (°C)
  1      15.7      0.0
  2      46.4     30.7
  3     129.6    113.9
  4      26.0     10.3
  5      13.4      0.0               0.00     12.18     50
  6      17.0      1.3      1.3      0.27     13.05     55
  7      20.4      4.7      6.0      1.25     13.89     60
  8      25.4      9.7     15.7      3.26     15.65     65
  9      53.0     37.3     53.0     11.02     22.41     70
 10      97.8     82.1    135.1     28.09     37.30     75
 11     150.4    134.7    269.8     56.10     61.71     80
 12     155.0    139.3    409.1     85.07     86.98     85
 13      82.8     67.1    476.2     99.02     99.14     90
 14      20.4      4.7    480.9    100.00    100.00     95
 15      11.8      0.0    480.9    100.00    100.00    100

(a) Data are from raw counts normalized to 100% reaction (column D) and normalized with respect to the homoduplex control (column E). Column A, raw counts; B, raw counts minus blank; C, cumulative sum of counts; D, % of total bound (= 480.9) versus temperatures; and E, counts normalized to homoduplex control.
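The arithmetic behind columns C and D of Table 4, and the reading of Tm and T50H from the curves, can be sketched in Python as follows. This is a minimal illustration rather than the authors' own procedure: the helper `interpolate_crossing` is a hypothetical name, the 50% points are found by simple linear interpolation between adjacent fractions (the text notes that non-linear least-squares fitting can be more accurate), and column E is taken as tabulated rather than recomputed.

```python
from itertools import accumulate

# Elution temperatures (C) for vials 5-15 of Table 4.
TEMPS = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Column B: blank-corrected counts eluted at each temperature (vials 5-15).
COL_B = [0.0, 1.3, 4.7, 9.7, 37.3, 82.1, 134.7, 139.3, 67.1, 4.7, 0.0]

# Column E: heteroduplex curve normalized to the homoduplex control, as tabulated.
COL_E = [12.18, 13.05, 13.89, 15.65, 22.41, 37.30, 61.71, 86.98, 99.14, 100.00, 100.00]


def interpolate_crossing(temps, percents, level=50.0):
    """Return the temperature at which an integral curve first reaches `level`.

    Linear interpolation between the two fractions that bracket the level.
    """
    points = list(zip(temps, percents))
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        if p0 < level <= p1:
            return t0 + (t1 - t0) * (level - p0) / (p1 - p0)
    raise ValueError("curve never crosses the requested level")


# Column C: cumulative sum of counts; column D: percent of total bound.
col_c = list(accumulate(COL_B))
total_bound = col_c[-1]
col_d = [100.0 * c / total_bound for c in col_c]

tm = interpolate_crossing(TEMPS, col_d)     # curve normalized to 100% reaction
t50h = interpolate_crossing(TEMPS, COL_E)   # curve normalized to homoduplex control

if __name__ == "__main__":
    print(f"total cpm bound = {total_bound:.1f}")
    print(f"Tm   is about {tm:.1f} C")
    print(f"T50H is about {t50h:.1f} C")
```

Because the heteroduplex curve in column E starts above zero (the tracer that failed to hybridize is treated as already melted), its 50% point falls at a lower temperature than the Tm of the same data normalized to 100%.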
Figure 5 Homoduplex (○) and heteroduplex (*) curves illustrating T50H and ΔT50H. The homoduplex curve is normalized to 100% whereas the heteroduplex curve is normalized relative to the homoduplex curve (see text for explanation). The x axis is temperature (°C), with Tm and T50H marked.
Tm, Tmode, NPH, and T50H would also be apparent and each of the others could be calculated if one were measured. However, it will be a while before that much information is available for other taxonomic groups, although Sheldon and Bledsoe (1989) have investigated the relationship between these distance measures for bird data and Kirsch et al. (1989) have presented data for marsupials. Below, we list some characteristics of each of these measures.

Tmode
ΔTmode has been advocated as the distance measure of choice for phylogenetic applications by Sarich et al. (1989). Like ΔTm, ΔTmode is moderately precise and does not suffer from the high standard error that NPH and ΔT50H do (Bledsoe, 1987; Kirsch et al., 1989). This is clearly an advantage of the former two measures for the range over which they can be measured. Another putative attribute of ΔTmode is that the criterion conditions may not form a boundary at which differences become compressed; the mode may simply shift below criterion (with increasing interspecies divergence) and cease to be a property of melting curves. Thus, ΔTmode may index comparable suites of sequences over its range rather than the increasingly smaller subsets indexed by ΔTm; theoretically, this would render ΔTmode values more additive than ΔTm values. An exception to this pattern is noted in T.J. Hall et al. (1980), where slowly evolving components of the genome in distant interspecies comparisons may retain a Tmode.
Tmode measures a peak in the distribution of diverging DNA. If the distribution of rates were Gaussian, the variance were small, and if the technique were good enough to determine the peak accurately, then Tmode would be the measure of choice. However, melting curves spread with increasing divergence and it becomes increasingly difficult to determine Tmode accurately (Figure 7B); false modes may result from the scatter of individual measurements if the curve is broad and flat. Also, different algorithms for Tmode determination (e.g., modified Fermi-Dirac curve fitting, parabolic curve fitting, graphic methods; see Sheldon and Bledsoe, 1989) do not always give the same answer.
It is worth noting that much of the width of HAP melting curves is due to the range of base composition present in the DNA and the failure of complete washing at each temperature. It has never been possible to draw conclusions about the range of rates of divergence from HAP measurements for this reason. If the distribution of rates of divergence were broad and uniform, and if there were no variance in base composition, no mode could exist. In reality, however, spectrophotometric and hydroxyapatite melting curves for native DNA are sigmoidal owing to the effects of base composition, and in the case of hydroxyapatite, incomplete washing. Thus, one always observes a mode and it is not yet possible to know how such artifacts affect the apparent mode in a particular measurement. It may be the case, however, that the mode indexes that component of the genome that has the modal base composition as well as the modal rate of divergence.

NPH
The normalized percent hybridization (NPH) falls for heteroduplex comparisons compared to homoduplex controls, even if the heteroduplex comparison involves closely related species. If there is a rapidly evolving class of DNA, NPH is a measure of that class. However, it is not yet known how much of the reduction in NPH is due to kinetic effects; rates of reassociation differ in homoduplex versus heteroduplex reactions and the amount of driver DNA available for hybridization may be used up before the divergent tracer-driver sequences can hybridize. Clearly, there is a need for measurements of this kinetic effect.
The rate of reassociation for interspecies hybrid formation is retarded by a factor of two per 10°C divergence in Tm (Bonner et al., 1973). Late in the incubation there is less driver DNA available to complete the slower formation of more divergent tracer-driver duplexes. One published measurement (Galau et al., 1976) suggests a large kinetic effect for a Xenopus interspecies hybridization but it has not been confirmed. Recent measurements with Drosophila DNA (Werman et al., 1990) and primate DNA (Bonner et al., 1980) suggest that the kinetic effect is small for modest (6%) divergence. The underlying process is complex
since the duplexes of driver DNA form between randomly terminated fragments and single-stranded regions remain available at their ends. Thus, the concentration of the single-stranded part of the driver falls at a rate of $(1 + kC_0t)^{-0.44}$, so that there is some single-stranded driver available very late in the incubation (Britten and Davidson, 1985). The kinetic effect therefore can be reduced by incubation to high Cot, as is usually done. Since there are no satisfactory measurements, anyone planning to use NPH for major phylogenetic work is advised to make some determinations of the kinetic effect under the conditions of incubation. The obvious method is to rehybridize the non-hybridizing fraction in a standard incubation.
In many measurements (Sibley and Ahlquist, 1981a, 1983; Kirsch et al., 1989; Powell and Caccone, 1990) NPH is not accurately determined. However, in other work it apparently has been determined more reproducibly (Hall et al., 1980; Benveniste, 1985). The technical problems that are due to limited Cot and to variations in length and concentration of tracer and driver preparations are undoubtedly solvable, so we may look forward to more precise determinations of NPH. At large evolutionary distances where a majority of the DNA can no longer form interspecies duplexes at the criterion temperature, NPH is the primary available measure of interspecies relationships. Tmode and Tm, in turn, are not useful at such high levels of divergence. Recently, Marshall and Swift (1992) have used 1/NPH as a distance metric for sand dollars, where observed NPHs are less than 50%. While their resulting phylogenies derived from ΔTm and 1/NPH show identical branching patterns, they point out that the utility of this approach is based on highly reproducible NPH values.

Tm
For fairly closely related species the Tm is a good measure of the amount of the DNA that hybridizes. The Tm falls steadily with increasing divergence until it reaches about halfway between the criterion temperature and the Tm of precise duplexes. At greater divergences, the amount of DNA that hybridizes continues to fall while the Tm changes very little. There may be some additional decrease in Tm if there is a well-defined peak in the distribution of rates of DNA sequence change since the peak may move down toward the criterion. Thus, there is a linear relationship between Tm reduction and divergence over short evolutionary time. At increasing divergence the curve begins to level off. This causes compression of the estimates of greater divergence.

T50H
The T50H measure was devised by Kohne et al. (1972) to correct for the reduction in normalized percentage hybridization (NPH) that occurs even for closely related species and to remedy the compression of ΔTm values that is forced by criterion conditions. The method of calculation is shown in Figure 5 and Table 4. As discussed, T50H is a measure of the median sequence divergence between species (T.J. Hall et al., 1980). Obviously, if median sequence divergence could be accurately estimated, the result would be independent of the criterion used in a particular measurement. It was shown by M.J. Smith et al. (1982) that a good compensation for different temperatures of incubation can be achieved by using T50H. The compensation for different criteria even extends to S1 methods. The S1 nuclease digests the more divergent duplexes and the undigested DNA is the better paired fraction, so the observed reduction in Tm with S1 is less than with HAP. However, the NPH is also less and as a result the T50H observed with S1 is about the same as with hydroxyapatite. Also, T50H gives a more linear relationship with sequence divergence than do the other measures of distance (Britten, 1986).
ΔT50H, however, has its own limitations. Some of the initial reduction is due to kinetic effects (see the discussion of NPH) and this could exaggerate the actual amount of divergence. The T50H continues to fall more or less linearly with increasing divergence until the NPH falls below 50%, at which point it becomes difficult to determine and requires extrapolation beyond the observed melting curve. Estimates of T50H obtained using extrapolation must be regarded as less reliable than those that are obtained directly from melting curves. Another problem with T50H results from the error in determining NPH. The effect of this error in measuring NPH has much less of an impact on T50H when the slope of the melting curve
is very steep (as in close relationships measured with 2.4 M TEACL).
One approach to remedy the kinetic problems of ΔT50H is to calculate the expected decrease in NPH (for a heteroduplex reaction) based solely on kinetics and add this to the observed NPH value before calculating ΔT50H. However, data are not yet available to make such a correction and such measurements would be valuable. A practical suggestion is to carry out hybridization reactions to high Cot values in an attempt to minimize the NPH differences caused by kinetic effects.

T25H
An alternative for very distant species would be to use the T25H, or the temperature at which 25% of the hybridizable tracer remains in duplexes, but this has not yet been tested. In the case of melting curves for two distant species of sea urchin studied by T.J. Hall et al. (1980) and Angerer et al. (1976), only about 20% of the DNA hybridized. The reduction in Tmode and Tm, in turn, was only a few degrees since the DNA that hybridized was dominated by a high melting temperature component. The 20% NPH is a good measure for this case but a T25H estimate for the distant species would be easier to combine with other data for more closely related species.

Hybridization Data in Phylogenetic Reconstruction
Phylogenetic reconstruction is discussed in Chapter 11, so the discussion here is restricted to factors that specifically apply to DNA hybridization. If DNA evolutionary changes were additive, so that Buneman's (1971) four-point metric was satisfied and all base pair changes were accurately indexed over all pairwise comparisons, then reconstruction of phylogeny would be a trivial operation. Optimality criteria such as those developed by Fitch and Margoliash (1967) and Cavalli-Sforza and Edwards (1967) provide unambiguous criteria for choosing among competing topologies when distances are additive: the correct topology will exhibit a perfect fit to the matrix of distances. Springer and Krajewski (1989) have proved a Perfect-Fit Theorem to substantiate this argument. An unambiguous outgroup taxon then allows one to root the topology and specify net amounts of shared derived and uniquely derived change on that topology. Furthermore, such an analysis produces a topology that is equivalent to the topology that one would obtain using the individual characters and a parsimony algorithm (see Chapter 11). In reality, however, several factors may compromise the additivity of DNA hybridization data and destroy the precise correspondence between trees derived from distance data and trees derived from parsimony analysis of individual characters. Some of these factors result from processes of DNA evolution and also influence sequence data; others are peculiar to different measures of genetic distance derived from DNA hybridization data. At close distances, for example, NPH is somewhat inaccurate but is preferable to the use of ΔTmode or ΔTm. At larger distances, ΔTmode is subject to error in its determination and ΔTm exhibits a saturation, making ΔT50H the method of choice. At even larger distances ΔT50H cannot be determined and NPH remains perhaps the only meaningful measure.
Resampling techniques (jackknifing and bootstrapping) have been utilized in DNA-DNA hybridization studies to determine the confidence levels associated with particular topological arrangements in tree construction (see Krajewski and Dickerman, 1990 and Chapter 11). The jackknifing method of Lanyon (1985) is particularly sensitive to between-cell internal inconsistency in a distance matrix whereas the bootstrapping method of Krajewski and Dickerman (1990) is sensitive to within-cell imprecision. Trees assessed by these two techniques are largely robust with ΔTm and ΔT50H (Springer et al., 1990; Kirsch et al., 1990a; Springer and Kirsch, 1991; Caccone et al., 1992; Sheldon et al., 1992), but not (at least in one case) with ΔTmode (Kirsch et al., 1990a).

Sources of Non-Additivity and Error
Below we consider sources of error relevant to DNA-DNA hybridization data: homoplasy, uneven distribution of rates of change, measurement error, paralogous sequences, differences in genome size, and intraspecific variation.

HOMOPLASY Homoplasy (i.e., reversals and parallelisms) causes observed sequence differences
to underrepresent actual amounts of sequence divergence. As a consequence both DNA sequence data and DNA hybridization data are non-additive in expectation, because they do not index all base pair changes that have taken place. Furthermore, the accumulation of homoplastic changes is non-linear and becomes progressively more important for increasingly divergent sequences. Indeed, the accumulation of homoplasy in DNA sequences has been studied extensively and several mathematical models have been developed to describe the effect of accumulated homoplasy on sequence divergence. One of the simplest models is based on a Poisson process and is given as follows:

$$T = -\tfrac{3}{4}\,\ln\!\left[1 - \tfrac{4}{3}D\right]$$

where D is the observed fraction of sequences that are different for any pairwise comparison and T is the expected sum (expressed as a fraction) of homoplastic changes plus observed differences (Jukes and Cantor, 1969). Since this model makes unrealistic assumptions about DNA sequence evolution, more sophisticated models have been developed to account for biased codon usage, synonymous versus non-synonymous substitutions, position-dependent differences in substitution probabilities, and base-dependent differences in substitution probabilities (Fitch, 1971a, 1976a, 1986; Kimura, 1980, 1981; Golding, 1983; Tajima and Nei, 1984; W.H. Li et al., 1985b; J.H. Gillespie, 1986b; Nei and Gojobori, 1986; Nei, 1987). [Shoemaker and Fitch (1989), however, argue that all of these models are too conservative since not all nucleotide positions are replaceable.] Unfortunately, these models require actual DNA sequences and cannot be used in conjunction with DNA hybridization data. Even so, for observed sequence differences up to 50%, all of these models provide estimates of T that are in excellent agreement with the Jukes and Cantor model; discrepancies become important only at larger distances. Since DNA hybridization distances are generally much less than 50% divergence, the Jukes and Cantor model is therefore appropriate, albeit slightly conservative.
It should also be noted that the expected amount of homoplasy is deterministic but that stochastic influences lead to variance around this expectation. For DNA sequence data, the stochastic component is much more important than for DNA hybridization data. Indeed, the variance component associated with the expected amount of homoplasy is trivial when the entire single-copy genome is under comparison (Nei, 1987). This is an advantage of DNA hybridization data over DNA sequence data. Militating against this putative advantage is the increased measurement error associated with DNA hybridization.
There is also a need to investigate the consequences of homoplasy for phylogenetic reconstruction if a correction for homoplasy is not employed. Most importantly, branch lengths on resulting topologies will be too short and the relative proportionality of branch lengths will be distorted. Thus, homoplasy cannot be ignored if the relative timing of branching events is of interest. The sequence of branching events on a topology is much less affected by homoplasy, however, and in most instances, corrections for homoplasy do not affect the sequence of branching events (Springer and Kirsch, 1989; Springer and Krajewski, 1989). This results from the deterministic fashion in which homoplasy accumulates (i.e., homoplasy is a function of divergence) when a large number of nucleotides are under comparison, as is the case for DNA hybridization data. In contrast, no one has ever proposed (or documented) that homoplasy among morphological characters is such a predictable, deterministic function of divergence. When homoplasy is not a function of divergence, or when the variance associated with this function is extremely large, homoplasy is much more of an obstacle to phylogenetic reconstruction.
Application of the Jukes and Cantor correction requires that we know the conversion between delta values and percent mismatch. Empirical estimates from the literature range from 0.7% to 2.0% base pair mismatch per 1°C depression in ΔTm (Bautz and Bautz, 1964; Laird et al., 1969; Kohne, 1970; Hutton and Wetmur, 1973; Britten et al., 1974; Caccone et al., 1988b). The conversion most often used is that 1% sequence mismatch
corresponds to 1°C of Tm depression, which is partly a matter of convenience and standardization. The recent estimate of 1.7% sequence divergence per degree of Tm depression (Caccone et al., 1988b) may well be correct for the ribosomal sequences studied since these sequences have conserved regions and clustered substitutions. It may not apply to typical single-copy DNA since most of this DNA is non-coding and might be expected to exhibit a more random distribution of substitutions. To test this, Springer et al. (1992b) compared the known sequence divergence of a 7.1-kb segment of the primate €-globin pseudogene to the thermal stability of heteroduplexes. They found a 1.18% sequence divergence per degree centigrade. Because the €-globin pseudogene region is non-coding, it presumably evolves in a similar fashion to the majority of the single-copy DNA. Thus, this value is significant in the conversion of percentage sequence divergence to ΔTm in the majority of DNA-DNA hybridization studies. Further work will be required, however, to determine the mean value of the conversion between percent sequence divergence and melting temperature depression for a population of sequences (i.e., the single-copy genome) that undoubtedly exhibits great variation in the clustering of substitutions.

THE DISTRIBUTION OF RATES OF SEQUENCE CHANGE A desirable property of any DNA hybridization distance measure is that it represents the mean amount of sequence divergence between equivalent portions of all genomes under comparison. However, once a rapidly evolving fraction of the DNA has diverged such that its Tm is less than the temperature of its reassociation, the average or mean divergence can no longer be measured. The median can be measured out to about 50% NPH and perhaps estimated further as mentioned above. It seems likely that median distances would then need to be corrected only for homoplasy to generate reliable estimates of additive genetic distance, but this has not yet been shown. The shape of the distribution of rates of DNA sequence change (see Springer and Krajewski, 1989) is also important. If this distribution were Gaussian, for example, ΔTmode would provide a reliable estimate of mean sequence divergence until it went off scale. If kinetic effects were accounted for, and if deletions were an unimportant source of NPH reduction (as they very well may be; see Meyerowitz and Martin, 1984), then ΔT50H should also converge on the same value as ΔTmode. In reality, we do not know the distribution of rates of change for the suite of sequences in the single-copy genome, but it is most likely to differ among taxonomic groups. Thus, it is unclear if modal or median values of sequence divergence provide better estimates of mean sequence divergence. We hope that this issue can be evaluated quantitatively in the future.

MEASUREMENT ERROR Measurement error is potentially the single biggest problem with DNA hybridization distances. Springer and Krajewski (1989) discussed such error in the context of imprecision and inaccuracy, where precision refers to the repeatability of replicate measurements and accuracy refers to the reliability of a measurement as an estimate of some quantity. Reciprocity is also a useful concept for dealing with matrices of DNA hybridization distances. Sarich and Cronin (1976) defined the percentage non-reciprocity for a pairwise comparison as [(distance AB - distance BA)/(distance AB + distance BA)] x 100. The average percentage of non-reciprocity for a distance matrix is then the mean value of this parameter over all pairwise comparisons.
Precision of DNA hybridization measurements is generally indexed as the standard error or standard deviation of replicate measurements. Sibley et al. (1987) reported an average standard deviation of 0.35 degrees for ΔTm measurements. Krajewski (1989), in turn, reported an average standard deviation of 0.48 degrees for ΔTm measurements of cranes. [Furthermore, Sibley et al. (1987) found that the standard deviation for ΔTm values increases as a function of sample size up to n = 5 and then remains stable. Also, standard deviation does not depend on the magnitude of ΔTm values (Sibley et al., 1987; Springer et al., 1990; Krajewski, 1989).] Finally, average percent non-reciprocities for matrices of ΔTm values generally fall between 3 and 10% (Sheldon, 1987; Springer and Kirsch, 1989; Springer et al., 1990).
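Two of the calculations described in this section, the Jukes and Cantor correction applied to a ΔTm value and the percentage non-reciprocity of a distance matrix, can be sketched in Python. The function names are hypothetical and the matrix is invented for illustration. Two assumptions are made loudly: the default conversion of 1% mismatch per 1°C is only the conventional value discussed above (estimates range from 0.7% to 2.0%), and the absolute value of the AB minus BA difference is used so that opposite-signed pairs do not cancel when averaged, which the formula as quoted does not state explicitly.

```python
import math


def jukes_cantor(observed_fraction):
    """Jukes and Cantor correction: T = -(3/4) ln[1 - (4/3) D]."""
    if not 0.0 <= observed_fraction < 0.75:
        raise ValueError("observed fraction must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * observed_fraction)


def corrected_divergence(delta_tm, percent_per_degree=1.0):
    """Convert a delta Tm (degrees C) to corrected percent divergence.

    percent_per_degree is an assumed conversion factor (default 1% per degree).
    """
    observed = delta_tm * percent_per_degree / 100.0
    return 100.0 * jukes_cantor(observed)


def percent_nonreciprocity(d_ab, d_ba):
    """Percentage non-reciprocity for one pair (absolute value assumed)."""
    return abs(d_ab - d_ba) / (d_ab + d_ba) * 100.0


def mean_percent_nonreciprocity(matrix):
    """Average percentage non-reciprocity over all pairs i < j of a square matrix."""
    n = len(matrix)
    values = [
        percent_nonreciprocity(matrix[i][j], matrix[j][i])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical delta Tm matrix; rows are tracers, columns are drivers.
    example = [
        [0.0, 10.0, 20.0],
        [9.0, 0.0, 22.0],
        [21.0, 19.0, 0.0],
    ]
    print(f"corrected divergence for delta Tm = 10: {corrected_divergence(10.0):.2f}%")
    print(f"mean non-reciprocity: {mean_percent_nonreciprocity(example):.2f}%")
```

At the small distances typical of DNA hybridization the correction is modest: an observed 10% difference becomes roughly 10.7%, and the adjustment grows non-linearly as divergence increases.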
Similarly, ΔTmode values are moderately precise (Kirsch et al., 1989; Bledsoe, 1987). NPH and ΔT50H values, on the other hand, exhibit more scatter for replicate measurements (Sheldon, 1987; Krajewski, 1989; Kirsch et al., 1989). This measurement error may obscure branching patterns revealed by ΔTm and ΔTmode matrices (see Kirsch et al., 1989). An alternative strategy is to use a regression equation to convert ΔTm values into ΔT50H values. This approach is much less sensitive to the effects of measurement error, yet it allows one to reduce the effects of compression that plague ΔTm values and obtain better estimates of branch lengths on output topologies. If our intent is to use DNA hybridization distances to estimate the timing of branching events, this issue cannot be overlooked. Catzeflis et al. (1987) and Springer et al. (1990) have developed exponential regressions of ΔT50H on ΔTm for DNA hybridization data on rodents and marsupials, respectively; additional equations would have to be developed for other groups. A major disadvantage of this approach is that it may prove intractable for some taxonomic groups, e.g., a consistent relationship between ΔTm and ΔT50H may not hold for all taxa under study because of differences in genome size or variation in the amount of rapidly evolving DNA. A second disadvantage is that NPH and ΔT50H are most useful when ΔTm can no longer provide resolution, but a regression equation should only be used over a range where ΔTm and ΔT50H are both monotonically increasing functions of sequence divergence.
In contrast to imprecision, inaccuracy is often caused by systematic biases that affect a whole suite of measurements. One such bias deserves mention. Most workers who have used 125I tracers are familiar with a compression of ΔTm values associated with specific tracers (Springer and Kirsch, 1989). Short tracer fragments are probably the culprit. Compression can increase the average percent non-reciprocity in a distance matrix. The effects of compression, however, can be reduced through the use of an algorithm developed by Springer and Kirsch (1989), which, in turn, is a modification of an earlier algorithm developed by Sarich and Cronin (1976) for immunological distances. For an uncorrected matrix of ΔTm values given in Springer and Kirsch (1989), the average percent non-reciprocity was 3.12%. After several iterations of the correction algorithm, this value was reduced to 1.05%.
Both imprecision and inaccuracy affect the internal inconsistency of distance data and reduce the fit between observed distances and distances on an output topology. This internal inconsistency casts doubt on the validity of branching arrangements when clades are united by short branch lengths (see Chapter 12).

DIFFERENCES IN PARALOGOUS SEQUENCES AND GENOME SIZE Sequences whose differences are a consequence of independent evolutionary change arising after speciation are referred to as orthologous sequences (Fitch, 1976a). In contrast, paralogous sequences evolve in parallel in a single line of descent subsequent to their origin through gene duplication (see Chapter 1). A salient point is that cross-matched paralogous sequences from two different species may contain differences that predate speciation.
Fox and Schmid (1980) and Sarich et al. (1989) have argued that such cross-matched hybrids may be present in significant numbers when the conditions of reassociation are too relaxed, and that these paralogous hybrids form a low melting temperature component characteristic of many melting curves. Furthermore, they argue that this low melting temperature component seriously compromises the phylogenetic value of certain hybridization distances, such as ΔTm and ΔT50H. However, there are no measurements in any species that precisely quantify the number of such low-copy number elements in the single-copy fraction of the genome. Significant quantities of hybridizing paralogous sequences may be present only under relaxed reassociation, although the situation may be different for polyploid genomes.
In addition, for iodinated tracers, which constitute most of the melting curves to which Sarich et al. (1989) refer, short fragment size is often a contributing cause (if not the most important cause) of the low melting temperature component. Furthermore, if paralogous sequences are shown to exist, it is easy to calculate a Tm for a higher temperature component and discard the
Nucleic Acids I: DNA-DNA Hybridization 201
INTRODUCTION
The polymerase chain reaction (PCR) has become one of the standard colors on the systematist's palette. It is a tool of unrivaled power, but as is so often the case, this power is linked to unrivaled complexity. The source of this complexity is the PCR reaction itself: a myriad of ionic interactions, kinetic constants, and enzymatic activities, all taking place repeatedly and, hopefully, perfectly, in a few hours' time. The fact that it works so well and for so many people is one of the most astonishing things about it.
This chapter covers some of the basic events of PCR and describes how these events are critical to the success of a particular amplification. The goal is to familiarize the reader with the process of PCR, so that PCR can be used to its fullest extent. Even to veteran molecular biologists, PCR amplification often remains a mystery. Why does one set of cycles work while others do not? Why does one set of primers work while others do not? To help solve these problems, this chapter emphasizes an important aspect of PCR that is frequently overlooked: a PCR machine is an experimental tool, not just a troublesome gadget designed to produce a DNA product. Using this tool allows an investigator to master PCR, instead of the other way around.
206 Chapter 7 / Palumbi
ity. DNA synthesis can begin at that point, copying the template by primer extension in the 3' direction (Figure 1).
The polymerase chain reaction uses this synthetic process to copy a specific target sequence over and over again. Mixtures of oligonucleotides, usually called primers because they prime DNA synthesis, are used in the reaction to initiate DNA synthesis at specific places on the template. The two primers are designed to anneal close to one another (within several thousand base pairs) but on different strands, and they are oriented to copy the DNA strand lying between them. For each cycle of heat denaturation/annealing/synthesis, the region between the primers is copied and its abundance in the reaction mixture doubles. During successive cycles, this doubling proceeds until the DNA bridging the primers comes to dominate the mixture. Moreover, most of the copies produced in later cycles are of exactly the same length: the length of the DNA between primers.
Although this chain reaction (the exponential increase of DNA through successive cycles) was first described in the early 1970s (Kleppe et al., 1971), it was not until purification and use of a heat-stable polymerase (Mullis and Faloona, 1987) that the chain reaction became practical. The treatments that denature dsDNA to ssDNA (heat, high pH) also destroy most enzyme activity. Thus, with typical heat-sensitive polymerases, new polymerase had to be added after every denaturation step to maintain a chain reaction. Moreover, the temperatures at which most DNA polymerases are active (<45°C) also tended to allow too much non-specific annealing of primers with template DNA.
The heat-stable polymerase was isolated from a hot springs bacterium, Thermus aquaticus, which normally grows at high temperatures. Evolution had led to the adaptation of this DNA polymerase to be active at high temperatures, and just as importantly, it is stable at even higher temperatures. Heating a reaction to 94°C can denature DNA, but the DNA polymerase of T. aquaticus is not destroyed (or at least, not immediately; see later).
The combination of the idea of temperature cycles to denature/renature DNA and the use of the stable activity of Taq polymerase (i.e., T. aquaticus polymerase, hereafter referred to as Taq) throughout this cycle, led quickly to the development of simple thermal cyclers to guide the polymerase chain reaction to completion. Since the first use of Taq in PCR, several additional heat-stable polymerases have been isolated and used in the reaction. Some, such as Vent polymerase, have a 3' → 5' exonuclease activity (Erlich et al., 1991; Erlich and Arnheim, 1992; or see the brochures published by biotechnology companies). This allows the enzyme to "back up" over the last bases that it synthesized and replace them if they were incorrect. As a result, these enzymes work more slowly and have a much lower error incorporation rate than does Taq polymerase. Polymerase errors do not play a strong role in analysis of PCR products by direct sequencing, but sequencing of cloned PCR products may lead to the incorporation of such errors in the results.
A typical PCR reaction, then, has all the components required for an in vitro synthesis of DNA: enzyme, appropriate buffers, ample dNTPs, template DNA, primers, and cofactors such as magnesium. The mix is allowed to work repeatedly, copying the DNA strand between the primers, with a reaction speed and specificity determined largely by temperature. Successful amplification within this reaction mix depends on efficient interaction of all these components, many of which can be optimized for a given target DNA.
The Cycle
The PCR cycle consists of three major phases: denaturation, annealing, and extension. The molecular events occurring at each of these stages, and why they are important to PCR, were discussed briefly above. To control the events at each phase, the experimenter needs to make decisions about the temperature of the phase, its duration, and how quickly this temperature is approached.
Denaturation
In this phase, heat is used to stop all enzymatic reactions (for example, the synthesis that was occurring during a previous extension phase) and denature the DNA from double to single strands. Usually 94°C is the temperature used, although some recent protocols have suggested 92°C. Too low a temperature, or too short a denaturation phase, may fail to completely disassociate high-molecular-weight, genomic DNA. However, although Taq polymerase is resistant to heat denaturation, it is not immune to it, and excessive denaturation will reduce enzyme activity. For example, after 30 incubations at 94°C for 60 seconds each, Taq loses about half of its activity. In general, 30-second denaturations at 94°C seem to strike a good balance between complete denaturation and destruction of enzyme. Some protocols recommend a longer first denaturation step (i.e., of 60-120 seconds), because this is the cycle in which full genomic disassociation is critical.
Annealing
In this phase, the temperature is lowered so that oligonucleotide primers can bind to the appropriate sites in the template DNA. This is the most critical phase, because if primers bind correctly to only the target positions in the template, then there is a good probability that the expected synthesis product will result. However, there are many factors that interfere with this perfect union of primers and targets.
Consider first that the primers do not know what it is the experimenter wants to happen. Annealing is a random process that depends critically on the concentration of primer, the availability of annealing sites, and the presence of competing, non-ideal annealing positions. As the temperature is lowered from the denaturation phase, primers are jiggling around the PCR mixture, driven by Brownian motion. Ionic bonds between the single-stranded primers and the single-stranded template are constantly formed and broken. The most stable ionic bonds last a little longer, and as the temperature drops they last for greater and greater periods of time. Simultaneously, every other single-stranded piece of DNA is forming transient bonds with every other piece, with the exception of sections which cannot bind because they are too close together. However, larger pieces of DNA do not move as quickly by Brownian motion (because different sections of the molecule are trying to move in different directions). Thus, small molecules like primers have the best chance of jiggling randomly into exactly the right position to form ionic bonds with the targeted annealing site. Of course, they are also jiggling next to every other possible priming site, and they will bind to these sites as well if the ionic attraction of the site is greater than the forces breaking these attractions. Note that if the template DNA is degraded, then the small pieces of template can act as a suite of random primers, all of which have a perfect match somewhere in the target genome! This is one of the reasons why high-molecular-weight DNA and low-molecular-weight DNA do not mix well as templates.
Although every primer is different, there are some simple rules to approximate the temperature at which the ionic attraction of a primer to its binding site is balanced by the forces of Brownian motion pushing it away. The relevant measure is called the Tm, or the temperature at which half of the potential binding sites are thought to have primer bound to them. A long primer, or one with greater GC content, has a higher Tm (because of the greater number of hydrogen bonds). A common rule of thumb is that the Tm (in degrees centigrade) of a perfect primer (that is, one that has a perfect sequence match to the template) is four times the number of G's and C's plus two times the number of A's and T's in the primer sequence.
Above the Tm, few primers are bound (although if primer concentration is very high, there can still be a lot of coming and going at annealing sites). Below the Tm, most of the perfect annealing sites are occupied, but the primer is also binding to a greater number of non-perfect sites (e.g., those that do not have the exact sequence of the primer). Here lies one of the great tradeoffs of PCR. If annealing temperature is too high, not enough primer is bound, but if it is too low, then multiple sites are used and many PCR artifacts are generated. If annealing temperature is very low, the genomic DNA will reanneal to itself and "self-prime" (that is, it will form its own double-strand/single-strand steps to start synthesis; see Figure 1).
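
The rule of thumb for Tm given in the annealing discussion above (four degrees for each G or C, two degrees for each A or T) is easy to turn into a small calculation. The following is a minimal Python sketch, not part of the original chapter: the function name and the example primer sequence are hypothetical, and the rule itself is only a rough starting estimate for a primer that matches its template perfectly.

```python
def rule_of_thumb_tm(primer: str) -> int:
    """Approximate Tm in degrees C: 4 per G or C plus 2 per A or T.

    Assumes a plain A/C/G/T primer with a perfect template match;
    degenerate or modified bases are rejected rather than guessed at.
    """
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


# Hypothetical 20-base primer with 10 G/C and 10 A/T bases.
example_primer = "AGCTTGACCGATTACGGATC"
print(example_primer, rule_of_thumb_tm(example_primer), "degrees C (estimate)")
```

An estimate of this kind only suggests where to begin; as the text notes, the annealing temperature that actually works has to be found by experiment, typically by trying temperatures a few degrees on either side of the estimate.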
Nucleic Acids II: The Polymerase Chain Reaction 209
Other problems can occur as well. Genomic DNA often has vast stretches of similar sequences (satellite DNAs), and these will quickly reanneal. In addition, in some cases there are stretches of DNA that are the inverse of an adjoining stretch and can bind to this upstream or downstream stretch to form a hairpin structure. This is especially true of genes for ribosomal RNA, which are designed to fold into a series of loops and stems. Often, the DNA in the loop of this structure cannot efficiently bind a primer, or if it can, a polymerase cannot synthesize past the stem of the hairpin. Thus these sites, even though they exist in the template DNA, are invisible to the primers. This is thought to be why some sets of perfect primers work better than others on the same gene.
How long should the annealing reaction continue? Again, there are important tradeoffs. The chief advantages the primers have in annealing before the template DNA reassociates are concentration and speed. Both are eroded by long annealing times, which give other, bulkier DNA molecules a chance to find one another and anneal. Long annealing times also are thought to give the primers time to "find" imperfect matches in the genome, although a little reflection shows that such matches are found as fast as perfect matches. The biggest difference is that the residence time of the primer bound to such imperfect sites is lower because the ionic bonds are weaker, and this lowered ionic attraction has little to do with time.
Nevertheless, shorter annealing times seem to provide greater specificity in the PCR reaction than longer ones. Generally, annealing times of 30-60 seconds are most common, although times as short as 15 seconds often work well at high annealing temperatures with perfect primers.
Extension
This phase allows the enzyme to work, synthesizing the target DNA segment. Taq polymerase works well at about 72°C, and this is the temperature usually chosen for the extension reaction. The enzyme is active at lower temperatures, however, and this is important for the success of most amplifications. Most primers have a Tm well below 72°C, and so most will not be bound very tightly to their target sequences once the extension temperature is reached. How do they stay attached? As the temperature rises slowly from the annealing temperature to 72°C, polymerization begins, albeit slowly. However, this slow polymerization is enough to add a few extra bases to the primer, increasing the stability of the primer-template complex. Thus, by the time the extension temperature is reached (typically in about 30 seconds or so), the primer is already part of a growing DNA daughter strand.
Extension time is another important variable. Under ideal conditions, Taq polymerase will synthesize thousands of bases a minute. As a result, PCR products under 500 bp do not require much time for complete synthesis. For such short products, 30 seconds is ample. For longer products, however, longer periods of time are best. Typically, a 30-second extension is adequate for products under 500 bp, 60 seconds is needed for products between 500 and 1500 bp, and 90 seconds is required for longer products. However, optimization of these times may be required because unnecessarily long extension times appear to increase the likelihood of PCR artifacts.
Choosing Reaction Conditions
Because each template and each primer pair is different, and because molecular systematists tend to use a wide variety of taxa or primers in a single research program, PCR reactions need to be carefully optimized. This means that a certain amount of trial and error is an integral part of the PCR experience.
For molecular systematics, most attention has been focused on the primers and how strongly they anneal to the template DNA. If primer annealing is inefficient, or if the primers anneal at unexpected sites, then the amplification might not proceed, or alternative products could be produced. To guard against this occurrence, the simplest procedure is to anneal the primers at high stringency. In this way, only well-matched primer/template couplings will occur and the amplification will be highly specific. Unfortunately, these high temperatures often preclude use of universal primers (those primers designed to
be effective in a wide variety of taxa; e.g., Kocher et al., 1989). Such primers are seldom perfect matches in a target sequence. They are usually designed in highly conserved regions that vary only slightly among taxa, but variation of a few bases in 20 is common. As a result, use of high annealing temperatures may prevent efficient primer-template binding.
Choices of conditions are even more complex because most PCR machines allow different cycles to be joined together to produce a complex amplification profile. For example, in order to amplify a product with universal primers, which are not likely to be perfect when used on a particular target, it may be practical to use 5 cycles with a low annealing temperature (45°C or so), followed by 30-35 cycles at a high annealing temperature (55°C). In this case, the amplification starts at low stringency, allowing imperfect primers to anneal and start synthesis. During the first 5 cycles, however, PCR products are produced with perfect ends (the DNA sequences at the ends of the PCR fragment are identical to the reaction primers because the primers have been physically incorporated into the DNA). As a result, subsequent amplifications can occur at a higher annealing temperature. In this case, the products made in the first 5 cycles are the only templates for amplification in the last 30-35 cycles. Although use of 40 cycles at a 45°C annealing temperature would also help mismatched primers function, repeated use of this low annealing temperature often causes many more PCR artifacts.
Paradoxically, exactly the opposite strategy is sometimes used to promote amplification with imperfect primers. The first cycles are performed at high annealing temperature, which assures that only the correct products are made (although there are few of them, since this is an inefficient set of cycles). Subsequent cycles are performed at low annealing temperature because this increases reaction efficiency and by this time, the correct PCR amplified segment makes up most of the reaction template. This strategy tends to minimize alternative products without sacrificing reaction efficiency.
As these two examples illustrate, amplification protocols are not rigidly set but instead have become enormously creative: they are basic outlines that can be adapted to a large number of slightly different purposes. By understanding the basic kinetics of the complex PCR reaction, the most effective set of cycle parameters can usually be achieved. For any given set of reactions, this set of "best" parameters must be discovered by experimentation.
PCR Components
The chemical environment of the PCR reaction is very important to the specificity and efficiency of the amplification. This does not mean, however, that only a single reaction mix will work: often a variety of reaction mixes will give satisfactory results. A short, handy guide to some of the common PCR variations was presented by Carbonari (1993). Note that it is seldom possible to predict precisely the effect of changing a particular reaction component. In general, a good starting place is to use the buffer recipe suggested by the supplier of the thermostable enzyme you will use. Because these enzymes have different origins, they have slightly different requirements. Nevertheless, experience has shown that many buffer recipes can substitute for one another. In addition to the buffer, reactions must include the raw material for synthesis (the dNTPs), the enzyme, primers, templates, and Mg2+ (a cofactor required by the enzyme to function). Only the primers and dNTPs are consumed during the reaction, and these are added in enormous excess, so synthesis is rarely limited by these components (Table 1).
In the case of the dNTPs, high concentrations enhance reaction speed because the enzyme acts most quickly if the substrates are in concentrations so high that the time until the correct nucleotide triphosphate diffuses into the catalytic site of the enzyme is short. In the case of the primer, excess is required to ensure that all possible annealing sites have "access" to a primer molecule during the annealing step. In fact, the amount of nucleic acid added as primer is often very similar to the amount added as template (about 1 µg per 100 µl). This means that the length of the template DNA and the total length of all the primers combined is nearly the same.
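
To make the "enormous excess" of primer concrete, the arithmetic behind a statement such as "about 1 µg of primer per reaction" can be sketched in a few lines of Python. This is an illustrative calculation only, not a reconstruction of Table 1: the average mass of about 330 g/mol per nucleotide of single-stranded DNA is an assumed textbook approximation, the 20-base primer length is a hypothetical example, and the function names are invented for this sketch.

```python
AVOGADRO = 6.022e23            # molecules per mole
NT_MASS_G_PER_MOL = 330.0      # assumed average mass of one nucleotide in ssDNA


def primer_molecules(mass_ug: float, length_nt: int) -> float:
    """Number of single-stranded oligonucleotides in mass_ug micrograms."""
    grams = mass_ug * 1e-6
    molar_mass = length_nt * NT_MASS_G_PER_MOL   # g/mol for the whole oligo
    return grams / molar_mass * AVOGADRO


def strand_mass_ug(strands: float, length_nt: int) -> float:
    """Mass in micrograms of `strands` single strands of length_nt bases."""
    grams = strands * length_nt * NT_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e6


# 1 µg of a hypothetical 20-base primer, each molecule extended into
# one strand of a 1000-base product (one primer is used per new strand).
n_primers = primer_molecules(1.0, 20)
print(f"primer molecules: {n_primers:.2e}")
print(f"product if all primers were used: {strand_mass_ug(n_primers, 1000):.0f} µg")
```

Under these assumptions the product mass is simply the primer mass multiplied by the ratio of product length to primer length, which is why the primers are so rarely the limiting component.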
Table 1
Amounts of a 1000-bp DNA product that could be produced by complete synthesis in a typical PCR reaction (100 µl volume)
Component | Initial concentration | Initial number of molecules | If the component is used completely: Number of product strands | If the component is used completely: Weight of product
The cofactor Mg2+ is not consumed in the synthesis reaction and is impervious to the heat extremes of the amplification cycle, so initial and final concentrations are the same. The ion is an important cofactor in enzymatic catalysis of the synthesis reaction, so adequate concentrations speed up the reaction considerably. However, Mg2+ also interacts with the negative phosphate groups of the dNTPs strongly enough that Mg2+ ionically attracted to PO4- groups is less available to act as an enzymatic cofactor. For this reason, Mg2+ concentrations need to be higher than dNTP concentrations. Note that the template DNA also probably interacts with Mg2+, but there is usually not enough of this to play any important role in sequestering Mg2+.
Varying Mg2+ concentration has been a popular method of tinkering with PCR reaction conditions. In general, 1.5 mM MgCl2 is added to most reactions. However, for a particular reaction, titrating MgCl2 concentration (that is, performing a controlled experiment in which MgCl2 is varied) can often increase yield, reduce unwanted products, and increase reaction efficiency. Maximum concentrations seem to be about 6 mM. Above this level, Taq activity tends to decline.
Enzyme concentration can also be altered in PCR reactions. The recommended amounts of polymerase are often far in excess of the amounts required for amplification, and increasing enzyme concentration does not automatically increase PCR quality. Adding extra enzyme can cause a previously recalcitrant reaction to work, but there are usually reasons for poor PCR yields other than low enzyme concentrations.
Many other additives have been proposed to enhance PCR reactions. Some of the most common are BSA (bovine serum albumin), gelatin, NP-40, Tween-20, Triton X-100, glycerol, and DMSO. These additives are thought to stabilize the enzyme (BSA and gelatin), reduce secondary structure problems (the detergents), or favor precise annealing. In general, providing moderate amounts of these additives may make some reactions proceed more easily. Again, trial and error seems to work best. Note that in almost every case, too much additive will kill a reaction. Generally, concentrations range from about 0.1% to 1% in the final PCR cocktail. Also, all three detergents are rarely used together, and when any additives are used, reactions tend to have either BSA or gelatin (not both) and DMSO or glycerol (not both).
The Thermal Cycler
All too often, the thermal cycler (PCR machine) is viewed simply as a way to produce a product needed to conduct research. One fills the tubes with complex reagents and hopes the desired product is present at the end.
But frequently, the product is not there, especially when first embarking on a PCR experiment, or when first using primers obtained through the mail from someone who designed them for other organisms. In this circumstance, it is important to treat the PCR process as an experimental opportunity. This requires two elements: a good experimental design with proper controls, and feedback about the results of each experimental manipulation.
Negative controls are fairly common in PCR; these generally are reaction tubes made without template DNA. Presence of a PCR product in such a case usually means contamination in one or (usually) more reagents. However, positive controls are also important. These are reactions that are guaranteed to work as long as the basic PCR cocktail is functional. Usually they involve using a sample of genomic DNA known to give good consistent results. Sometimes a previous PCR product is used. Sometimes a cloned segment of DNA that contains both primers is used. Failure of the positive control means failure of the basic cocktail (e.g., perhaps Taq was inadvertently left out of the reaction). In this case, failure of the remaining reactions is expected. If, however, the positive control works, then the basic cocktail is functional and failure of other reactions must be considered to be a sign of other problems associated with the template DNA and its relationship to the primers used.
Discerning the nature of these problems often requires additional experiments. First, however, a hypothesis about the nature of the problem must be formulated using the results of the previous experiment. The troubleshooting section at the end of this chapter explains some of the causes of some of the common problems seen in PCR reactions. The most important point is that even if the products are not what you expected, you should examine the results carefully. An agarose gel is not "blank" if there are obvious primer-dimers, or smears running from the wells, or evidence of degraded DNA.
Once a hypothesis is formed about the nature of the problem, a remedy can be devised and tested. Often the remedy involves trying different reaction conditions (annealing temperature, Mg2+ concentration, amount of template), and often a range of conditions should be tested at the same time (obviously annealing temperature cannot be treated this way). In cases where two variables need to be varied, a traditional block design experiment (e.g., three different template concentrations run with each of four different Mg2+ concentrations) should be performed, with appropriate positive and negative controls for each set of conditions. By varying conditions in this careful way, optimal reactions often can be obtained that greatly enhance the product of the reaction. Even a modest 10% increase in efficiency at every cycle will lead, over 30 cycles, to a 15-fold increase in product.
Primers and Primer Design
Primers that amplify a given section of DNA in a wide range of taxa (so-called universal primers) have been extremely useful in molecular systematics. The primary reason is that universal primers allow amplification of a DNA segment in a species that has never before been the subject of molecular genetic study.
The most common universal primers are for animal mtDNA, plant cpDNA (chloroplast DNA), and nuclear ribosomal RNA genes, although universal primers for conserved exons of nuclear genes are becoming more commonplace. What are the attributes of good universal primers? How are they designed?
The most straightforward design method is to align homologous sequences from as many different taxa as possible. For protein-coding genes, identical amino acid sequences over a 7-9 amino acid stretch are convenient locations for universal primers. In such regions, nucleotides in third-base positions can vary widely, and it is variation at such positions that creates most of the primer mismatch during PCR. This problem can be reduced in several ways. First, some amino acids are encoded by two codons. In these two-fold codons, the third position usually can be either (T or C) or it can be (A or G). Other, four-fold codons can have any of the four bases at the third position. Designing a primer to include as many one- and two-fold codons as possible greatly reduces the potential variation. Second, not all codons are used with equal frequency. In some genomes, like insect mtDNA, there are strong nucleotide biases which result in non-random distribution of third-base positions. In insects and crustaceans, 95% of these bases are A or T. This means that two-fold
codons are very conserved evolutionarily (for example, a cysteine, TGT or TGC, will most frequently be a TGT), and that even four-fold codons do not vary wildly.
Besides nucleotide bias, many genes, especially nuclear genes, show codon bias. In these cases, all four versions of a four-fold codon might theoretically be used, but in most cases only one or two of the possibilities are commonly used. Alignment of many homologous sequences usually can reveal the nature and extent of codon bias. For an example of protein alignment and design of PCR primers, consult the section on cytochrome c primers at the end of this chapter.
Primer Length
Primers can be as short as 13 bp (even shorter for RAPDs; see below) and as long as 80 bp. In most cases 18-24 bp primers are sufficient. The longer the primer, the higher the annealing temperature can be and the greater the specificity. However, unpurified primers and long primers have greater amounts of non-specific primer products present in the primer mixture. In a typical oligonucleotide synthesizer, bases are added to a synthetic oligonucleotide with about a 98% efficiency. This means that for every nucleotide added to the primer, 2% of the product is not synthesized properly. If, for instance, the primer is 20 bp long, then only 68% of the oligonucleotides in solution are the desired primer product. The remaining 32% are non-specific primers, approximately 2% of which are 1 bp shorter, 2% are 2 bp shorter, etc. In general, these non-specific oligonucleotides do not interfere with amplification. However, in many cases, primer artifacts (dimers) and non-specific amplification occur. If you need to make long primers, product purification is recommended. Otherwise this does not seem to be necessary.
Nucleotide Composition
Primers can be any sequence. The ideal primer has roughly equal numbers of each nucleotide without internal repeats or self-similarity. For instance, a primer with the sequence AAATTTAAATTT may lead to self-priming and primer-dimer products. GC-rich primers can withstand higher annealing temperatures, but are also subject to greater amounts of self-annealing.
Primer-Template Match
Specificity is obtained through maximizing sequence similarity between the primer and template. However, amplification products are obtained even when the primer and template do not have perfect similarity. Single internal mismatches have little effect on PCR product yield when the primers are long and there are 6-10 matched bases on either side of the mismatch. By contrast, single mismatches at or near the 3' end of the primer can significantly decrease amplification. Of the mismatches at the 3' end, A:G, G:A, and C:C reduce yields about 100-fold, whereas A:A mismatches reduce yields 20-fold. T's appear to be able to base pair with all three other bases fairly well, and this suggests some tips for design of universal primers.
Another suggestion for primer design is to examine closely the 3' end of the primer, where the polymerase binds initially. The primer in this region needs to be firmly annealed to the template for efficient polymerase binding. If the 3' end of the primer is a third codon position, then this last, critical base will frequently be mismatched and poor amplification will result. For this reason, primers are usually ended at second codon positions (the positions which are least likely to vary). In addition, the primer is enhanced if the third codon position closest to the 3' end (usually within 3 bases of the 3' end) is a two-fold position or is a degenerate base in the primer (that is, the base at this position is one of two or four different nucleotides). This will produce a primer likely to be a good match at the five bases nearest the primer's 3' end.
Common Modifications to Primers
Restriction sites (see Chapter 8) can be incorporated into the primers, thus allowing easier cloning of PCR products. Some researchers suggest adding an additional 3-5 bases to the 5' end of the primer (after the restriction site) because this greatly increases the digestion efficiency. Incorporation of a biotinylated nucleotide to the 5' end during primer synthesis allows solid-phase
sequencing and non-isotopic detection of amplified products. Recently, fluorescent
tags have been added to nucleotides as well; these form the basis of detection systems
in many automatic DNA sequencers.

[Figure 2 diagram: double-stranded templates (GAG/CTC or GAA/CTT) shown paired with
coding-strand primers. The primer GAA does not anneal well with the sequence CTC; the
primer GAG can anneal well with the sequence CTT because a G:T bond can form.]
Figure 2 Design of primers to reduce redundancy by taking advantage of the ability of
G-T bonds to form in primer-template interactions. The G-T bond is not as stable as an
A-T bond, but is much more stable than an A-C bond.

Design of Universal Primers
There are several ways primers can be designed to make them more useful with unknown
template sequences. Oligonucleotide synthesizers allow equal molar ratios of two,
three, or four different bases to be added at a particular position. This type of
synthesis creates so-called degenerate primers, which are in reality complex mixes of
oligonucleotides of different sequences. The advantage of this type of design is that,
theoretically, there will be an exact match of a target sequence to something in the
primer mix. A disadvantage is that, depending on the degree of degeneracy, the
concentration of this perfect primer is low. Also, because a complex mixture of
primers will show a spectrum of affinities for the template, it is difficult to judge
the best concentration of primers to use without careful titration experiments.

A way to reduce primer degeneracy is to take advantage of the ability of some
"mismatched" base pairs to form a partial bond. Although G binds to C best, G-T bonds
can also form. This unusual bond suggests a strategy for designing primers, especially
for animal mtDNA in which most substitutions are transitions.

Suppose we need to design a primer in a stretch of DNA that includes a two-fold codon,
such as for glutamic acid (GAG, GAA). If we include the sequence GAA in the primer, it
will anneal well only with the perfect complement, CTT. However, if we use the
sequence GAG instead, this will anneal perfectly with the perfect complement, CTC, but
it will also anneal well with the imperfect complement, CTT (Figure 2). Thus, for
primers on the coding strand, we can use G in every position in which there is a
potential for either an A or G in the template. Similar logic suggests using T in each
position that has a T or a C in the template.

After a universal primer is designed and used successfully, often it is a good idea to
design a new set of primers that work well only on the particular taxon under study.
These new primers typically are located just inside the universal set, and tend to
provide cleaner, more consistent amplifications than even the best universal primers.

ASSUMPTIONS

The biggest assumption made about PCR is that the product produced is the product
desired. It is easy to be skeptical about this assumption: within the billions and
billions of base pairs of the typical genome there is likely to be a region that will
anneal (although maybe only at low stringency) with virtually any short primer. In
fact, this is the basis for the development of random-primed PCR analysis (see the
section on RAPDs). However, random priming also can lead to PCR artifacts that must be
identified.

Random priming is reduced by using long primers with reasonably high annealing
temperatures, and by using pairs of primers known to be in a given orientation and a
given distance apart. In this case, the most direct indication that the PCR product is
the correct one is its size: if the product is the predicted size, it is probably the
one for which the primers were designed.

Given that the product is not a random one, there are still several assumptions that
typically are made about it. The most important one is that the gene segment amplified
is from the orthologous locus (see Chapter 1). As an example, the
globin genes occur in a small, multigene family. If primers were designed to a
conserved region of globin, they may well amplify a gene segment from several loci,
both functional and non-functional (i.e., pseudogenes). In amplifications from a
number of species, PCR products of the predicted size might be derived from several of
these loci. In order to use such gene segments in phylogenetic studies of species,
some evidence that they are orthologous is required.

It generally is assumed that this problem is less severe when using animal
mitochondrial DNA because this genome does not have multiple loci (although, of
course, multiple copies are the hallmark of this genome). Even with mtDNA, however,
care must be taken that nuclear pseudogenes are not amplified. There are mitochondrial
gene segments that have been transferred into the nuclear genome. These nuclear copies
have been characterized (some minimally) in several taxa, including locusts (Gellisen
et al., 1983), sea urchins (Jacobs et al., 1983), birds (Quinn, 1992), rodents (M.F.
Smith et al., 1992), crabs, and corals (S. Romano, personal communication).

Quinn (1992) found that mtDNA sequences from snow geese were variable from population
to population, but that this analysis was complicated by the amplification of a
nuclear pseudogene similar to the control region mtDNA sequences that were the true
target of PCR. Moreover, these pseudogenes tended to dominate amplifications of some
samples but not others, probably because some samples were from blood (rich in nuclear
DNA) whereas others were from liver (with more mitochondrial DNA).

Quinn solved this problem by comparing amplifications from pure mtDNA and blood
samples from the same individuals. Mitochondrial-specific primers allowed unambiguous
amplification of the correct locus. Other questions may be asked to help distinguish
nuclear pseudogenes from mitochondrial targets: Is the transition:transversion ratio
appropriate for mitochondrial versus nuclear genes? Are the sequences monophyletic
among closely related species? Are there odd insertions or deletions or stop codons
that would indicate that the gene is non-functional? Do Southern hybridizations using
the PCR product as a probe yield the expected results?

A separate assumption made when using PCR for population analyses is that all alleles
are being faithfully and equally amplified. This may be untrue if alleles differ in
the sequences that anneal to the primers, or if some alleles do not copy well due to
secondary structure. (Note that if the PCR products are to be cloned, then unbiased
cloning is another important assumption; see Chapter 9). Another problem is the
production of recombinant PCR products by template "jumping" (Saiki et al., 1988;
Scharf et al., 1988a,b; Paabo et al., 1989; Scharf, 1990). These recombinant products
are produced when partially extended DNA from one site of primer attachment (e.g., one
allele at a heterozygous locus, or one gene in a repeated family) attaches at a second
site (e.g., the alternate allele at the heterozygous locus, or another gene in the
family) during a subsequent extension cycle. The resulting products are likely to
contain some stretches of DNA from one allele or locus and other stretches of DNA from
the other allele or locus. Some polymerases (especially Taq) appear to have "pause
sites" at particular DNA sequences that are likely to promote recombinant PCR products
in heterozygotes (R. D. Bradley and D. M. Hillis, personal communication). In these
cases, switching to a different thermostable polymerase may reduce the production of
recombinants. The problem also can be reduced by amplifying short stretches of DNA or
by using long extension times. In some cases, it may be possible to design
allele-specific primers. However, if allelic-specific sequences need to be determined,
the likelihood of recombinant products in amplifications from heterozygous individuals
should always be considered.

APPLICATIONS AND LIMITATIONS

Types of Amplifications and Types of Data

The most common use of PCR in molecular analysis has been for amplification and
sequencing of homologous genes in related organisms. However, several other types of
data can be derived from PCR products, and each of these types of data can be gathered
from several different types
of amplified DNA. Methods for obtaining these data (RFLPs, length variants for
microsatellites, mobility variants for denaturation gels) are covered in Chapter 8 and
will only be mentioned here. Instead, this chapter concentrates on some of the common
PCR targets.

Animal Mitochondrial DNA
The existence of full sequences of mtDNAs from several phyla has encouraged the
development of a suite of so-called universal primers for this genome (e.g., Kocher et
al., 1989). These primers allow access to the mitochondrial genomes of species
otherwise unknown to molecular biology, and encourage the sequencing and comparison of
homologous genes of closely related species and of populations within species.

In animal mtDNA, two sets of universal primers have been widely used for ribosomal
genes and two sets have been used for protein-coding genes. The ribosomal primers are
highly conserved (see Appendix), yet span a region that includes enough variation to
be phylogenetically useful at the species level and below. Overall, the 12S rRNA gene
is shorter than the 16S rRNA gene, but the former has been subjected to more careful
analysis of secondary structure (e.g., Simon et al., 1990). The 12S gene evolves at
about the same rate as the average for the entire mitochondrial genome (Simon et al.,
1990).

Among protein-coding genes in mtDNA, there is a wide range of levels of conservation.
Some proteins are so variable that it is difficult to align homologous amino acids
(e.g., ATPase 6, 8). Others are so highly conserved that it may be difficult to detect
any amino acid change among genera (e.g., cytochrome oxidase I). As expected from the
neutral theory, when four-fold degenerate sites are examined in these proteins, there
is no relationship between rate of silent substitution and degree of amino acid
conservation (Kessing, 1991). Thus, even the most highly conserved gene (at the amino
acid level) has as great a rate of silent change as does the most variable gene. This
makes the design of universal primers easier for highly conserved genes. Of course,
amino acid evolution is faster in less conserved genes, and for phylogenetic
reconstructions this type of data might often be the most informative (see Chapter
12). Other approaches are to target conserved sections of genes for primers, and to
use these primers to bracket more variable sections. For example, cytochrome b has a
wide variety of conserved and variable domains that are associated with the function
of this gene in the mitochondrial membrane (see references in Irwin et al., 1991;
Martin and Palumbi, 1993a). Kocher et al. (1989) designed primers that anneal to
sections of the gene coding for conserved regions, yet also span a less conserved
section.

Recently, a different approach to mtDNA amplification has been developed by M.J. Smith
and co-workers (1993). In their strategy, primers span gene junctions in mtDNA, and
PCR products contain the 5' and 3' ends of adjacent genes, along with intervening tRNA
genes. This approach has been used to quickly estimate mtDNA gene order for novel
taxa, as gene order includes important phylogenetic information at higher taxonomic
levels (M.J. Smith et al., 1993). Another use for this approach has been to span
highly variable segments like the vertebrate control region by using conserved primers
in the flanking cytochrome b and 12S RNA genes (Martin et al., 1992a). This stretch of
DNA generally is too long to be sequenced with most double-stranded or asymmetric PCR
methods, but it can be restriction digested or cloned and sequenced.

Amplifications of mtDNA profit from the multiple copies of this compact genome in
animal cells. Most somatic cells have thousands of copies of mtDNA. Large oocytes have
hundreds of thousands of copies. From a practical standpoint, this provides a large
number of starting copies for PCR, an advantage shared only with chloroplast DNA or
multicopy nuclear loci like the ribosomal RNA genes. Moreover, DNA extractions can be
adjusted to yield a greater ratio of mtDNA to chromosomal DNA. Differential
centrifugation has long been used to isolate mitochondria (Lansmann et al., 1981; see
also Chapter 8). Such mitochondrial preparations have up to 50% mtDNA;
ultracentrifugation can increase this to near 100%, greatly easing amplifications, and
easing the problems of identifying true mitochondrial genes.
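
The bracketing strategy described above for mtDNA (primers placed in conserved stretches that flank a more variable stretch) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the three aligned sequences are invented toy data, the function names are hypothetical, and "conserved" is taken to mean identical in every aligned sequence. The sketch only locates candidate sites; a real reverse primer would be the reverse complement of the downstream site, and annealing temperature, self-complementarity, and 3' end rules would still need to be checked.

```python
def column_identity(alignment):
    """Return a list of booleans: True where every sequence has the same base."""
    return [len(set(column)) == 1 for column in zip(*alignment)]


def conserved_windows(alignment, primer_len):
    """Start positions of windows of primer_len columns that are fully conserved."""
    conserved = column_identity(alignment)
    last_start = len(conserved) - primer_len
    return [i for i in range(last_start + 1) if all(conserved[i:i + primer_len])]


def bracketing_pairs(alignment, primer_len, min_variable):
    """Pairs (upstream_start, downstream_start) of conserved windows that
    enclose at least min_variable variable columns between them."""
    conserved = column_identity(alignment)
    starts = conserved_windows(alignment, primer_len)
    pairs = []
    for upstream in starts:
        for downstream in starts:
            if downstream < upstream + primer_len:
                continue  # windows overlap or are in the wrong order
            between = conserved[upstream + primer_len:downstream]
            if between.count(False) >= min_variable:
                pairs.append((upstream, downstream))
    return pairs


# Hypothetical toy alignment: conserved flanks around a variable middle.
toy_alignment = [
    "ACGTACGGATTCTTGCAA",
    "ACGTACGAACTCTTGCAA",
    "ACGTACGGTTCCTTGCAA",
]
candidate_pairs = bracketing_pairs(toy_alignment, primer_len=6, min_variable=3)
```

With real data, the alignment would come from several taxa, and requiring perfect identity could be relaxed to tolerate the two-fold and degenerate positions discussed under primer design.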
amplification. Because mRNAs are usually found in many copies in a cell, and because
introns have been edited out, cDNA amplifications of particular coding regions are
sometimes simpler than the corresponding genomic amplifications. It is even possible
to amplify from a cDNA preparation using only one specific primer if the downstream
primer used is an oligo dT. (This technique has been modified into the RACE protocol
of Frohman et al., 1988.)

However, drawbacks of cDNA amplifications include the extreme care with which samples
must be treated before use. The mRNAs upon which this method depends are finicky
templates prone to fast decay if not preserved carefully. In addition, some genes are
only poorly expressed in some tissues, and are thus largely unavailable as mRNAs.

It is important to note that many genes are not truly single-copy. Instead they exist
as part of small gene families that have 2-10 expressed loci and might have additional
copies as pseudogenes. Use of highly conserved primers may well amplify a suite of
products from these independent loci, and care should be exercised in their analysis.

ANONYMOUS SINGLE-COPY SEQUENCES Recently, Karl and Avise (1993) have developed an
approach to single-copy amplifications that they call "anonymous single-copy RFLPs." A
genomic library is screened for single-copy sequences (see Quinn and White, 1987b).
These regions are sequenced and specific primers that anneal to the ends of the region
are constructed. Amplifications from genomic DNA produce a homolog of the cloned
fragment, which can be assayed by restriction digestion or sequencing. Advantages of
this technique are that it provides a large number of independent loci, and it can be
applied to any species. A serious disadvantage is the large effort involved in
screening and confirming single-copy clones, an effort that must be repeated with
every new species. In addition, nothing is known about the sequences produced, which
makes identification of homologous sections in related species difficult and
complicates analysis of the results.

MICROSATELLITE DNA A similar approach has led to development of an entirely different
strategy for obtaining population genetic data. Genomic clones are screened for
homology to probes constructed from dinucleotide repeats (e.g., CACACA, etc.). These
clones are sequenced, and PCR primers are constructed that flank the repeated
segments. The number of tandem dinucleotide repeats tends to vary from individual to
individual due to unequal crossing-over, slip mismatch replication, and other genetic
mechanisms (Queller et al., 1993). Thus, the PCR products will be of slightly
different sizes (differing by 2, 4, 6 bp, etc.). By electrophoresing these products on
an acrylamide gel, the number of repeats that an individual possesses (at both
alleles) can be estimated. Population characterization consists of documenting the
frequency of different length variants in each population. Advantages of this
technique are its speed and accuracy once the appropriate PCR primers are known, and
that a great deal of polymorphism tends to be visible with this technique. This is
especially important for comparison of large numbers of individuals in a population
(Queller et al., 1993).

Disadvantages include uncertainty about the functional role of nucleotide repeat
variation (one such trinucleotide-repeat polymorphism gives rise to the fragile X
syndrome in humans; Verkerk et al., 1990), the work required to develop primers for
each new species examined (but see Schlotterer et al., 1991), and the fact that only a
few allelic states are possible, enhancing the chance of parallel evolution of a
particular length variant (see FitzSimmons et al., 1994). This latter problem means
that phylogenetic analysis of the alleles discovered is difficult or misleading. The
strengths and limitations of microsatellites are discussed further in Chapter 8.

RANDOM PRIMER AMPLIFICATIONS Because one of the most time-consuming aspects of the
above methods is primer design, and because this design process must often be repeated
for each new taxon studied, efforts have been made to develop amplification systems
that sidestep primer design. RAPDs (see Hadrys et al., 1992)
accomplish this by using a large set of short, random oligonucleotides. Even random
primers anneal with some probability in any given genome, and by screening a large
number of primer pairs it is possible, by chance, to find some that produce useful
products.

These products may not occur on every chromosome or in every individual. If the primer
sites are polymorphic, then the existence of a particular product may be a good
Mendelian character which is typically (but not always) dominant. In this case the
frequency of this product in amplifications from a large number of individuals can be
used as a population marker.

A great advantage of RAPDs is that they require no foreknowledge about any particular
gene in a target taxon. Given a large bank of random primers, some useful products are
likely to be amplified from virtually any species. A second advantage of the method is
that it is random with respect to the genome. In a large number of different products,
there are likely to be some that amplify a section of every chromosome. As a result,
RAPDs can generate molecular markers that can then be correlated with other phenotypic
traits (such as pesticide resistance in insects).

Another powerful use of RAPDs is in maternity and paternity exclusion analysis. Once a
particular product is known to be inherited as a Mendelian character, it can be used
to screen a set of adults to test to see which are the parents of a particular
offspring. Use of a large number of RAPD products in this way can be a precise way of
documenting offspring-parent relationships, dispersal distances, or multiple paternity
of a brood (Levitan and Grosberg, 1993).

Disadvantages of RAPDs are numerous, however. First and foremost is that it is
difficult to distinguish many of the polymorphisms apparent using this technique from
PCR artifacts. PCR is not always a precise process that gives exactly the same results
every time. Sometimes minor differences in template quality or abundance cause
artifactual differences in whether a particular product is seen from individual to
individual. Because some of the RAPD results come from negative evidence (e.g., lack
of a band on a gel), precise control over the amplification condition is critical to
confident interpretation of the results.

Another problem is that absence of a product in a particular reaction could be caused
by many genomic differences, such as nucleotide substitutions in one or both primer
sites, unequal recombination or replication slippage between sites creating a long
insertion that is not amplified well, or inversion of a primer site.

In addition, controlled matings have sometimes shown the appearance of novel bands in
PCR experiments, making inference about the parents of offspring problematical. This
type of problem needs to be solved by very careful experiments that show the Mendelian
inheritance of the bands used in an analysis (e.g., Levitan and Grosberg, 1993).

Lastly, homologous loci are very difficult to identify, making RAPDs difficult to use
in interpopulational or interspecific comparisons (Hillis, 1994a; J.J. Smith et al.,
1994). This problem can be alleviated by cloning and sequencing the RAPD PCR product,
and using redesigned, locus-specific primers to amplify homologous loci from other
individuals or species.

EXON-PRIMED INTRON-CROSSING (EPIC) AMPLIFICATIONS Many nuclear genes play important
metabolic roles, and their products have amino acid sequences that are highly
conserved among taxa. For example, the amino acid sequence of actin, a protein
involved in muscle action and the cellular skeleton, is up to 95% identical among
different animal phyla. In addition to such structural proteins, many catalytic
proteins exhibit highly conserved active sites.

These conserved gene segments often are useful for designing PCR primers, but the
resulting gene segment is likely to be too conserved to be phylogenetically
informative except in distant taxonomic comparisons. A solution to this problem is to
design these conserved primers so that the gene segment between them crosses an
intron. This strategy works well only if the intron positions are known and intron
sizes are small enough to be amplified efficiently, but genomic DNA sequences have
provided this information for a wide array of genes. Intron positions can evolve,
but in some cases the placement of an intron in a gene has remained constant over a
long period of time. For example, in virtually all actin genes, there is an intron at
amino acid position 41 (Kowbel and Smith, 1989). Within smaller taxonomic groups (such
as mammals), many genes are conserved at the amino acid level and are known to include
introns in conserved positions. The basic approach of intron amplification has been
described by Lessa (1992), Lessa and Applebaum (1993), Slade et al. (1993), and
Palumbi and Baker (1994).

Primer design and initial amplifications are similar to procedures used for other
types of amplifications. However, intron sizes can be highly variable among species,
and it is difficult to predict what size PCR product to expect. In addition, many
conserved genes occur in small gene families with several loci or pseudogenes. Thus,
initial amplifications at low stringency are expected to produce multiple products.
These products, plus the inevitable artifacts produced by PCR, usually result in a
confused initial picture of the amplification results. Typically, a few strong bands
appear, along with several minor bands or smears.

In many cases, the smallest, strongest bands in an amplification represent processed
pseudogenes, from which the introns have been removed (by mRNA processing before
insertion into the genome). To distinguish true introns from PCR artifacts, and to
identify pseudogenes or separate loci, it is a good idea to amplify genomic DNA from
several closely related species, and clone the entire original PCR products (Marchuk
et al., 1991; see Chapter 9). A variety of inserts are then selected and sequenced,
including representatives of the strongest bands from the initial amplification.

Ideally, primers are designed so that the PCR product includes a short stretch of the
amino acids that flank the intron. True introns are identified by sequencing the
intron/exon junctions of cloned inserts. Confirmed introns include the predicted amino
acid flanking regions as well as the conserved intron splice signals. Different loci
are identified in several ways. For a given locus, sequences between closely related
species should be more similar than sequences between loci (unless the loci are
undergoing concerted evolution). Second, no more than two different sequences
(representing two alleles possible at this locus) should occur in a diploid organism.
Presence of more than two different sequences indicates presence of more than a single
locus or recombinational artifacts (note, however, that PCR errors can create minor
differences in cloned sequences but that the frequency of these transition errors is
only about 1 in 500 bp).

Once the loci are identified, locus-specific and species-specific primers can be
designed that give single PCR products at high stringency. Amplifications from
multiple individuals or multiple species can then be cloned (to separate alleles) and
sequenced. Alternatively, RFLP analysis on amplified introns can be used to quickly
screen population patterns (Slade et al., 1993; Palumbi and Baker, 1994).

Ancient DNA
The polymerase chain reaction makes it possible to analyze the tiny amounts of DNA
that are preserved in some fossil and subfossil material. In general, the DNA
extracted from such samples is very small (100-400 bp), in low abundance, and shows
extensive oxidative damage. Paabo (1988) estimated yields of 1-200 pg DNA per gram of
starting material. Cooper et al. (1992) reported that samples extracted from soft
tissues tended to be smaller than those recoverable from bone.

A major advantage of PCR analysis of fossil DNA is the direct recovery of ancestral
DNA sequences which can be used to clarify phylogenetic relationships (Higuchi et al.,
1989; W.K. Thomas et al., 1990; Cooper et al., 1992; DeSalle et al., 1992; Janczewski
et al., 1992). A second valuable result is the demonstration of previous levels of
genetic variation or population subdivision in modern species that have undergone
contemporary bottlenecks or range shifts (W.K. Thomas et al., 1990). The small size of
DNA fragments recovered from ancient samples makes it difficult to form strong
conclusions about close phylogenetic relationships on the basis of a single genetic
locus. As a result, multiple loci need to be examined for high resolution (e.g.,
Janczewski et al., 1992).
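
Returning to the locus-identification rule given for intron amplifications (no more than two different sequences per diploid individual, with PCR errors of roughly 1 in 500 bp creating minor differences among clones), the check can be sketched in Python. This is a rough illustration, not a published procedure: the function names, the greedy grouping, and the tolerance rule (twice the expected number of errors per clone, rounded up) are assumptions made for the example, and the clone sequences are invented and assumed to be aligned to equal length.

```python
import math


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distinct_sequences(clones, error_rate=1 / 500):
    """Greedily group clones whose differences are small enough to be PCR error.

    Two clones of the same allele may each carry errors, so the tolerance is
    twice the expected number of errors in one clone, rounded up (an assumed
    rule of thumb, not a value taken from the text).
    """
    representatives = []
    for seq in clones:
        tolerance = math.ceil(2 * error_rate * len(seq))
        if not any(hamming(seq, rep) <= tolerance for rep in representatives):
            representatives.append(seq)
    return representatives


def consistent_with_single_locus(clones, ploidy=2):
    """True if the clones collapse to no more distinct sequences than alleles."""
    return len(distinct_sequences(clones)) <= ploidy


# Hypothetical cloned inserts from one diploid individual.
allele_a = "ACGTACGTACGTACGTACGT"
allele_a_with_error = "ACGTACGTACGTACGTACGA"  # one mismatch, treated as PCR error
allele_b = "ACGTTCGTACCTACGTACGT"             # two mismatches from allele_a
third_sequence = "TCGTACGAACGTAGGTACGT"       # three mismatches from allele_a

two_alleles_ok = consistent_with_single_locus(
    [allele_a, allele_a_with_error, allele_b])
extra_sequence_ok = consistent_with_single_locus(
    [allele_a, allele_a_with_error, allele_b, third_sequence])
```

In this toy case the first set of clones collapses to two sequences, whereas adding the third sequence gives more than two, which by the rule above would point to a second locus or a recombinant artifact.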
Forensic Identification of Small Tissue Samples
In some cases, it is impossible to identify a species from the morphological
information available. Sometimes the available material is only bits and pieces from
museum collections (Higuchi et al., 1989), fossil digs (Cooper et al., 1992;
Janczewski et al., 1992), fish markets (Baker and Palumbi, 1994), or crime scenes.
Sometimes individuals are so small (e.g., bacteria) that individual identification is
impossible (DeLong, 1990). Sometimes different species (e.g., marine larvae, insect
larvae) are so similar morphologically at some stage of their life history that they
cannot be distinguished (R.R. Olsen et al., 1991). In these cases, positive
identification often can be obtained by RFLP or sequence analysis of PCR products
(Silberman and Walsh, 1992). These analyses differ only slightly from standard
analyses; the biggest changes are in sample preparation. Very small samples (e.g.,
individual copepods, sperm suspensions, single larvae) often can be amplified directly
(that is, without prior DNA extraction). This limits the chance of contamination
during extraction, and speeds analysis of large numbers of samples. Larger samples
(e.g., bone, dried skin, blubber, canned meat) can be extracted first, and then
amplified.

Long PCR
A number of recent reports laid the groundwork for amplifications of large sections of
DNA (see Cheng et al., 1994a,b and references therein). The methods they describe do
not differ dramatically from the protocols used to amplify short sections of DNA.
However, the addition of two types of enzymes to the PCR reaction, one of which
includes a 3' → 5' exonuclease activity, seems to enhance the yield of long PCR
reactions, possibly by correcting mistakes made by Taq polymerase (which does not
proofread) (see W.M. Barnes, 1994 and Chapter 9, Protocol 17 for a more complete
discussion).

Such long PCR amplifications may be particularly useful for mtDNA because the two
widely used 16S rDNA primers (16Sa and 16Sb) could be used to amplify the entire mtDNA
of a given species. In addition, long amplification allows protein-coding sequences to
be obtained from genomic DNA without recourse to cDNA libraries. Such amplifications
would cross large introns, and may provide a convenient source of material for RFLP
analysis of nuclear DNA variation.

LABORATORY SETUP

Few pieces of specialized equipment are required for PCR. Compared to the expensive
commitment to ultracentrifuges that characterizes some molecular work (e.g., for
isolation of whole mtDNA), the basic PCR systematist's lab is far simpler. The
laboratory setup described in Chapter 9 is more than sufficient for all PCR needs. The
minimum needs for PCR include a thermal cycler, a variety of automatic pipetters
(e.g., at least one in each of the following size ranges: 1-20 µl, 10-200 µl, and
50-1000 µl), agarose gel apparatus and power supplies (see Chapter 8), a UV light
source to view the PCR products, and a Polaroid™ camera for photodocumentation of
gels. To help solve contamination problems, a powerful UV source (a UV cross-linker)
can be used to irradiate PCR solutions and hardware. Other equipment listed in Chapter
9, such as microcentrifuges, is required for some of the following protocols.

1. DNA isolation for PCR
2. Polymerase chain reaction
3. PCR from RNA

Protocols used to amplify DNA in molecular systematics may require considerable
improvisation. The exploration of different taxa, many of which have never before been
the subject of molecular study, requires a flexibility not normally needed in
laboratories specializing in well-understood model organisms. Solution of the special
problems of phlox phenolics, molluscan mucus, or simian pseudogenes requires both
imagination and common sense. Remember that imaginative
4. The solution prepared in step 3 contains only about 0.1% DNA; the rest is cellular
   debris. To enhance the relative fraction of DNA, centrifugation can be used to
   separate nuclei and mitochondria from the cellular fragments and from unbroken
   tissues. Centrifuge the homogenate at about 1,000 rpm in a microcentrifuge to
   pellet debris. One minute is usually sufficient, but longer spins may be
   appropriate for viscous solutions. The supernatant should be cloudy but should not
   contain obvious particulate matter.
5. To separate the organelles (containing the DNA) from the soluble proteins,
   carbohydrates, etc., microcentrifuge this supernatant for ≈3 min at 14,000 rpm.
   Longer spins may be important for some homogenates. The mitochondrial pellet should
   be a velvety gray-tan. [Note: It is possible to improve the isolation of
   mitochondrial DNA at this point by first performing a low-speed spin (2,000-3,000
   rpm for 2 min) to remove nuclei, and then performing the high-speed spin to pellet
   mitochondria. Different tissues require slightly different conditions, so check a
   few test pellets and supernatants by viewing them in a microscope. At 400x, nuclei
   usually are visible as round, clear bubbles with a definite diameter. Mitochondria
   can barely be seen as dark dots moving around by Brownian motion.]
6. Pour off the supernatant gently. Resuspend the pellet in the original volume with
   TE. If the velvet part of the pellet is in a ring around a darker looking center,
   try to re-suspend the velvet ring and transfer it to a new tube, leaving the dark
   center behind. The re-suspended pellet should again be cloudy.
7. Add enough 20% SDS to bring the re-suspended pellet to 1% SDS. The solution should
   immediately clear as the membranes dissolve. If too much tissue was used, the
   solution will be extremely viscous. Ideally, it should be slightly viscous, but not
   stringy with DNA. Samples that are too dense with DNA at this point have an
   extremely high viscosity and exhibit impressive viscoelasticities, which hinder
   subsequent separation procedures. Dilute such samples in TE and mix until the
   viscoelastic behavior of the solution is greatly reduced.
8. Proceed with phenol/chloroform extraction detailed in Chapter 9 (Protocol 1).

Part B. Variations on the Basic Extraction
PRESERVED TISSUES In many circumstances, tissues are not fresh when obtained. For
tissues that are frozen, or stored in preserving chemicals like alcohol, DMSO/EDTA, or
high salt solutions, initial DNA isolation procedures might differ from those listed
above.

For frozen tissues, the thawing process often releases active DNases that degrade
nucleic acids. To avoid this problem, grind the tissue with a mortar and pestle while
it is still frozen in liquid nitrogen or dry ice. A small coffee bean grinder does a
good job of powdering tissue when the frozen tissue is mixed with equal parts
particulate dry ice.

After the tissue has been ground, proceed with DNA isolation, either using the basic
procedure outlined in Part A, or more generally, the proteinase K isolation described
in Chapter 9.

For tissues immersed in preservative chemicals, first remove the chemicals. This
usually can be done by blot-drying a specimen with paper toweling, or if alcohol was
the preservative, the sample can be dried in a vacuum. Most of the preservatives used
in storing samples for DNA work will not interfere too much with DNA extraction, so
exhaustive removal of these chemicals is not required. The major exception to this
statement is formalin. Formalin cross-links DNA and proteins, and prevents efficient
extraction of high-molecular-weight DNA. Some formalin-preserved specimens remain
useful for PCR, however, especially those that have been "hardened" in formalin for
only a short time before being placed in ethanol. For formalin-preserved specimens,
high dithiothreitol (DTT) concentrations (up to 100 mM) in the proteinase K extraction
buffer seem to help break the protein-DNA cross links.

BONE Bone contains substantial amounts of DNA trapped in the cells that formed the calci-
224 Chapter 7 / Palumbi
um carbonate matrix, and in the central marrow. DNA from plants for PCR. If the DNA is still not
Altl~ougl~there are several published reports of pure, the following protocol (contributed by S.
DNA extraction from bone (even fossil bone), Miller, U.S. Fish and Wildlife Service) allows pre-
contaminants that interfere with PCR have cipitation of DNA away from muc~polysacclza-
plagued this field (Hoss and Paabo, 1993). The rides that inhibit PCR.
following protocol was suggested by Hoss and 1. Heat genomic DNA extraction at 60°C for 1
Paabo (1993) to alleviate this problem. hr.
1. Remove a 1-mm layer from the outside of the 2. Add 1/2 volume of 8 M LiC1.
bone with a drilling machine to reduce conta- 3. Let stand for an hour, then centrifuge at high
mination. speed in a microcentrifuge to pellet the pre-
2, Grind the sample to a fine powder in a freezer cipitated DNA. Mucopolysaccharides remain
mill (see Hoss and Paabo, 1993) with liquid in solution, so remove supernatant and resus-
nitrogen, or use a coffee grinder and dry ice. pend the pellet in water or TE.
3. Add 0.5 g bone powder to 1 ml of 10 M
guanidinium thiocyanate (GuSCN), 0.1 M
Tris-HC1 pH 6.4, 0.02 M EDTA pH 8.0, and RESCUING SAMPLES THAT WILL NOT AMPLIFY On
1.3%Triton-X 100. occasion, some precious sample will not give
4. Incubate at 60°C for 1-3 hr with occasional good amplifications, no matter how malzy per-
agitation. mutations of the amplification protocol are tried
(see "Troubleshooting"). Sometimes cleaning up
5. Centrifuge for 5 min at about 5000 rpm in a
the DNA solution can solve this problem.
rnicrocentrifuge.
6. Extract the supernatant with glass milk (see Filtration:
Chapter 9). I. Use an ultrafiltration tube (e.g., Centricon 30
or 100)to wash the DNA. Add 1/3 volume of
EXTRACTION CONTROLS When using small 7.5 M ammonium acetate (pH 7.5) to the
amounts of precious tissues like fossils, hair, or DNA sample and add enough 2.5 M ammo-
larvae, it is important to include extraction con- nium acetate to fill the filtration tube.
trols. These are extractions of nothing, using the 2. Spin using the recommendations of the ven-
same chemicals and conditions used to extract dor to reduce volume to about 10 pl.
the small samples. The products of these shadow-
3. Add sterile distilled water to fill the tube and
extractions should be included in subsequent
repeat the centrifugation,
PCR amplifications to certify that no exogenous
DNA was in the extraction chemicals. 4. Repeat step 3.
5. Dilute sample to original volume.
PROBLEMS CAUSED BY PHENOLICS OR MUCUS In
some cases, huge excesses of carbohydrate, pig- Re-precipitate:
ment, or other chemicals are released from a tis- 6. Precipitate the DNA a second time to rid it of
sue. Worse, these offending molecules often co- unwanted salts, Add 1 / 3 volume of 7.5 M
purify with the DNA, preventing subsequent ammonium acetate (pH 7.5) to the DNA sam-
amplification. To help limit these problems, a ple and add enough 2.5 M ammonium acetate
number of exotic extractions have been devel- to increase the volume to ten times the origi-
oped. The most popular uses a detergent called nal amount.
CTAB that complexes with carbohydrates and 7. Add one volume of isopropanol or two vol-
can be phenol extracted (see Chapter 9). Fangan umes of EtOH, and collect pellet as described
et al. (1994) give a good review of extracting in Chapter 9 (Protocol 1).
Part C. Amplifying from Tissues
For very small amounts of tissue, cellular contents released into the amplification cocktail do not seem to prevent the polymerase chain reaction. In these cases, the tissues are added directly to the PCR cocktail, and subjected to thermal cycling, avoiding the DNA extraction step entirely. It is a good idea to use a PCR buffer that includes some detergent (Triton X-100 or NP-40) when attempting amplifications directly from tissues. Some people have had more success when the tissues are placed in the PCR cocktail (minus enzyme) and are incubated in the refrigerator overnight. Others incubate tissues at 94°C for a few minutes, centrifuge the mixture and use an aliquot for PCR. Whether a particular tissue will allow this type of amplification will require some experimentation. Below is a protocol for the direct amplification from bacterial colonies that is very useful in facilitating the screening of plasmid or phagemid clones:
1. Label colonies growing on agar in petri dishes by drawing circles around them on the back of the plates.
2. Prepare 25 µl PCR cocktails for each labeled colony using primers that are upstream and downstream of the point in the vector into which the inserts have been cloned (e.g., use M13 and M13-reverse sequencing primers).
3. Using a sterile loop, pick a colony and swirl the loop into a PCR reaction tube. Do not use too much of the colony or the amplifications will not work. A light touch is all that is needed.
4. Amplify at 55°C annealing, using an extension time appropriate for the size of the insert (see Protocol 2).

Protocol 2: The Polymerase Chain Reaction
(Time: 1 hr to set up, 2-4 hr for a 40-cycle amplification)
PCR is a means to an end: the genetic characterization of a species or a population or an individual. Given a particular target DNA, the goal is to produce a large amount of that product and only that product.
The following PCR protocols work well with a variety of different DNAs. There are a number of different buffers being used in other labs and in the literature that may work as well as or better than ours. In particular, the amount of MgCl2 used in the PCR reactions should be adjusted for every different DNA template being used. Some recent reports suggest adding other organic solvents such as DMSO, formamide, or PEG up to about 1% concentration to the reaction buffer. If you are having problems, explore other buffers using the experimental approach outlined above. Table 2 lists some inhibitors of the PCR reaction. Above the levels listed, PCR reactions tend to be inhibited. But below the levels shown, such additives often can help adjust yields and reaction specificity. Note that some of these levels are surprisingly high (e.g., for urea). Finally, careful adjustment of reaction conditions appears to favor PCR of long fragments (W.M. Barnes, 1994; Cheng et al., 1994a). W.M. Barnes (1994) reported the first amplification of very long DNA fragments through use of several different polymerases in the same mix.

Table 2
Levels above which various solvents inhibit PCR
Solvent      Inhibitory level
Ethanol      10%
Urea         1.5 M
DMSO         1%
Formamide    10%
SDS          0.01%
From Gelfand and White, 1990

Part A. Double-Stranded DNA Amplifications
This is the most common type of amplification; it can produce DNA products for many uses. The goal is to produce a large amount of double-stranded DNA copied from a particular gene region. Start by preparing or assembling the following solutions:
1. DNA template (genomic DNA, cpDNA, mtDNA, etc.).
2. 10x polymerase buffer (Appendix).
3. 10x dNTPs (Appendix). These are the raw materials for DNA synthesis, and are usually used at 200 µM. The 10x solution is thus 2 mM for each dNTP.
4. Oligonucleotide primers, diluted to 10 µM.
5. MgCl2 in water (usually 150 mM).
6. Taq polymerase (kept cold).
7. Distilled and sterilized water.

BASIC REACTION The following protocol is for 25 µl reactions. It can be scaled up as needed.
1. Mix the following in a microcentrifuge tube: 2.5 µl 10x Taq buffer, 2.5 µl 10x dNTPs (i.e., 2 mM each of dATP, dGTP, dCTP, dTTP), 1.2 µl each of two primers (10 µM stock solutions), 0.5-1 U Taq polymerase (this is a lot less than recommended by many suppliers but it is sufficient). Add ddH2O to make 24 µl per reaction (≈17 µl). The order in which these are added is not very important as long as the enzyme comes last. Make certain the enzyme solution, which typically is in a heavy glycerol stabilizing buffer, is well mixed, but do not create any froth in the tube. Keep the enzyme on ice or in the freezer until needed; remove it only to take the small amount of fluid needed for the reaction. Using a larger cocktail (see below) solves problems with measuring small amounts of solutions.
2. Add about 1 µl template DNA (in ddH2O or 0.1x TE). This should be about 1-2 ng of mtDNA or 1-2 µg genomic DNA. If in doubt, use less.
3. Add any additives, such as extra Mg2+, DMSO, etc.
4. Add 1 drop of mineral oil (common pure drug store variety) to prevent evaporation of the sample. If condensation forms on the top of the PCR tubes during cycles the reactions may not work. Spin in microcentrifuge for ≈5 sec. [Note that larger reactions need more oil because the surface of the fluid is larger for high volume solutions in conical tubes.]
5. Place tube in thermal cycler and start run (see "Thermal Cycling," below).

MULTIPLE REACTIONS Very seldom are PCR reactions run singly. If nothing else, running a single reaction violates the maxim of always using positive and negative controls. In fact, several reactions, identical except for the template, typically are run side by side. In this case, it is convenient to make a PCR cocktail that includes everything but the template by multiplying all the volumes in step 1 above by the number of reactions to be run. This limits measurement errors, especially of small sample volumes, and it ensures that all the reagent concentrations in each tube are exactly the same.
1. Multiply volumes in step 1 under "Basic Reaction."
2. Add reagents together in any order, except that the enzyme is added last. (See note about the care of enzyme above.)
3. Mix enzyme gently and aliquot the reaction to separate sterile 500-µl tubes.
4. Add template (see note above about amounts).
5. Add oil. Cap and label tubes.
6. Spin tubes briefly in a microfuge, and place in thermal cycler (see "Thermal Cycling").

THERMAL CYCLING The PCR cycle is relatively simple and is composed of three major steps reviewed below (a more complete description is presented earlier in this chapter).
1. Denaturation. 94°C for 30 sec seems to work well, but shorter times have also been recommended. If the melting temperature is too low, or the time too short, the double-stranded DNA may not denature, thereby reducing the efficiency of the reaction. This is especially true for the first cycle, in which the goal is to denature high-molecular-weight DNA. Some protocols suggest a long initial denaturation (usually 60 sec). However, Taq polymerase loses some activity with each denaturation step. So, there needs to be a balance between denaturation of the DNA and that of the enzyme.
2. Annealing. Standard temperatures are about 55°C for 30-60 sec for good primers. If you are having problems with getting any product at this annealing temperature, lower the temperature in stages (of about 2°) to 45-48°C (although temperatures as low as 37°C sometimes work). For long, perfect primers, annealing temperatures of 60-65°C or higher may be used.
3. Extension. The Taq polymerase works best at temperatures of about 72-75°C. The enzyme synthesizes thousands of bases a minute, so long extensions generally are not needed. Thirty sec is adequate for products up to about 500 bp; 60 and 90 sec are needed for products of 500-1500 and >1500 bp, respectively.

Part B. Variations in the Cycle
If a high annealing temperature is not practical (usually because of an imperfect match between primers and template), the simplest way to enhance PCR success is to lower the annealing temperature. The following types of temperature cycles generally are appropriate when dealing with various DNAs, primers, and annealing temperatures (Table 3).

Table 3
Combinations of cycle temperatures that can be used for different matches of primer to template
Cycle temperatures    Used for    Comments

HIGH-STRINGENCY BOUNCES When the primers are very good, the annealing temperature is high, and the product is under about 500-750 bp, it is possible to "bounce" from the denaturation to the annealing to the extension very quickly, with only a 15-sec pause at each step. These amplifications proceed very quickly, and often produce very clean, single products.

LOW-STRINGENCY SHUFFLES Lowering the annealing temperature is a commonly used method of encouraging a reluctant PCR reaction. There are few rules to go by, but there are two general approaches. The first is to lower the annealing temperature by a few degrees in subsequent PCR experiments until a product is produced. The other is to drop to very low temperatures (40-45°) and observe how many products (e.g.,
DNA strands of different sizes) are produced. Then the annealing temperature is increased by stages to eliminate the extra bands. These extra bands also can be manipulated by increasing MgCl2 concentrations: often fewer bands are produced at 3 mM MgCl2.
Note that when the annealing temperature is very low, it is a good idea to drop the extension temperature as well. This is so that the primer does not fall off the template before the enzyme has had a chance to begin synthesis. Alternatively, this is one instance when it is sometimes appropriate to change from one temperature to another slowly. This type of temperature change is called a "ramp" and is a basic part of the programming options on some thermal cyclers (but not all). In these cases, after annealing, the temperature is increased slowly toward the extension temperature, allowing the enzyme to work (albeit slowly) to extend the primer. This extension solidifies the primer's "grip" on the template, allowing the extension temperature to be raised even further. Typically, ramp times of about a minute are used to "lock" a bad primer in place.
Some investigators believe that it is useful to extend the annealing time in addition to (or as opposed to) extensively lowering the annealing and extension temperatures in steps 2 and 3. This gives a primer more of a chance of finding its complement and allows the Taq polymerase (even though not at optimal temperature) to extend the primer sequence a little, thus "locking" it to its complement on the template. The trade-off is that this also greatly reduces specificity of the reaction, and non-specific products can result.

ADDITION OF A THIRD PRIMER Adding a third primer to the cocktail can greatly enhance the yield of product from the other two. This third primer should anneal outside of the other two, and be added in about one-fifth the normal amount. This procedure often enhances the subsequent production of single-stranded template with asymmetric amplifications.

STEP-UP PROCEDURE A "step-up" involves running the first few cycles (3-5) at a low annealing temperature, and then switching to high stringency to finish the amplification. This procedure is meant to favor annealing of poor primers in the early cycles. In later cycles, only the products made in the first few cycles will amplify. This is because PCR products have incorporated the synthetic primers, so the free primers are a perfect match for the new products. A typical step-up is: 5 cycles of 94-45-72 for 30 sec each, followed by 35 cycles of 94-55-72 for the same times.
Obviously, longer extension times might be needed for long products. Possible variations include altering the number of low stringency cycles (3-10), altering the annealing temperatures (based on the Tm of the primers), and decreasing the extension temperature of the low-stringency cycles.

TOUCH DOWN PROCEDURES Another approach, called a "touch down," works best if the primer is a good match to the template, but has alternative binding sites as well. High-stringency annealing steps favor binding only to the correct sites. Later, when the mixture is dominated by PCR products from the initial few cycles, lower stringency annealing temperatures are less likely to result in binding at the alternative sites. A typical touch down sequence is:
2 cycles at 94-60-72 for 30 sec at each step
2 cycles at 94-58-72
2 cycles at 94-56-72
2 cycles at 94-52-72
32 cycles at 94-50-72

Part C. Single-Stranded Amplifications
In single-stranded or asymmetric PCR amplifications, ssDNA template for sequencing is produced by limiting one of the two primers. During the initial cycles of a PCR run, dsDNA is produced as in a normal amplification. However, during the final cycles the limiting primer runs out. The other primer continues to initiate amplification, but only single-stranded products are produced with each cycle. The trick to this method is in adjusting the concentration of the limiting primer so that it runs out after enough double-stranded product has been produced and before the PCR run is complete.
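The switch from exponential double-stranded synthesis to linear single-stranded synthesis described above can be illustrated with a toy calculation. The sketch below is an idealization, not a description of real reaction kinetics: it assumes 100% efficiency in every cycle, ignores enzyme decay and plateau effects, and counts molecules as plain integers. The function name and all input numbers are hypothetical.

```python
def simulate_asymmetric_pcr(template_copies, limiting_primer, excess_primer, cycles):
    """Idealized asymmetric PCR with perfect efficiency.

    Returns a list of (cycle, duplexes, single_strands) tuples, where
    single_strands counts only strands primed by the excess primer that
    have no complementary partner.
    """
    duplexes = template_copies
    single_strands = 0
    history = []
    for cycle in range(1, cycles + 1):
        # After denaturation, each duplex exposes one site for each primer.
        used_limiting = min(duplexes, limiting_primer)
        used_excess = min(duplexes, excess_primer)
        limiting_primer -= used_limiting
        excess_primer -= used_excess
        # A new duplex needs one new strand from each primer; strands made
        # only from the excess primer stay single-stranded.
        paired = min(used_limiting, used_excess)
        single_strands += used_excess - paired
        duplexes += paired
        history.append((cycle, duplexes, single_strands))
    return history


if __name__ == "__main__":
    # Hypothetical numbers: 1 template, 7 limiting primers, a large excess of the other.
    for cycle, ds, ss in simulate_asymmetric_pcr(1, 7, 1_000_000, 5):
        print(cycle, ds, ss)
```

In this model the duplex count doubles until the limiting primer is exhausted, after which each cycle adds one single strand per duplex. This is why the limiting primer must last long enough to build up double-stranded template but run out before the final cycles.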
Although asymmetric amplifications used to be the primary means for generating DNA sequencing template from PCR products, there are now several other methods, such as double-stranded sequencing, cloning, solid-phase sequencing, and cycle sequencing (see Chapter 9). These other methods tend to be more reliable and less prone to the failures so common in asymmetric PCR. These failures can be resolved, however, for given templates, and in these cases, the method can work very well to provide sequencing template.
The use of a third primer, which anneals inside those used in the original double-stranded amplification, often gives superior single-stranded amplifications. This is probably because use of the third primer adds a new layer of specificity to the single-stranded reaction that is not added by using one of the previous primers. This appears to be important for ribosomal RNA gene primers, or others that tend to give non-specific double-stranded products.
Single-stranded DNA can be produced from pure mtDNA, or from a previous double-stranded PCR amplification. Typical reaction mixtures are in 100 µl instead of 25 µl.

SINGLE-STRANDED AMPLIFICATIONS FROM PURE mtDNA For pure mtDNA samples with good primers, an asymmetric amplification can be used. Make the PCR cocktail as above, with the following differences. One of the primers (the limiting primer) is one hundredth the concentration used in double-stranded amplification. Thus, use a 0.1 µM primer solution (instead of the normal 10 µM solution). Add 1-5 ng of pure mtDNA as template.

SINGLE-STRANDED AMPLIFICATIONS FROM DOUBLE-STRANDED AMPLIFICATIONS Use one of the three methods outlined below to generate single-stranded product from initial double-stranded amplifications. In all three methods, primer annealing temperatures are stringent (55-60°C), regardless of the conditions originally needed to generate the double-stranded product. In general, 100 µl reactions are used, increasing the production of template.
Method A: Use 1-5 µl of the double-stranded amplification in a new reaction with only one primer (use 5 µl of 10 µM solution of primer, as usual). The amount of template added is determined by the amount of double-stranded DNA in the first amplification.
Method B: Take a one-hundredth dilution of the double-stranded PCR product and use that in an asymmetric single-stranded amplification as described above.
Method C: Gel purify the double-stranded band. Run the double-strand amplification on a regular 2% agarose gel. Cut out the appropriate band with a sterile blade. (This method is ideal if the double-stranded amplification has multiple products.) Some investigators recommend using TAE instead of TBE in these agarose gels. Soak the gel slice in 1 ml sterile water for 1-3 hr, then replace with 50 µl sterile water and freeze the gel slice. Thaw it immediately, and repeat the freeze/thaw cycle two more times. Use 1 µl of this solution in the single-stranded amplifications. Alternatively, take a tiny slice out of the middle of the gel band that contains the DNA and use that directly in a PCR solution as template. Low-melt agarose can also be used in the original gel; in this case, cut out the band and melt it at ≈80°C. The agarose can be removed by phenol extraction, followed by a chloroform extraction and EtOH precipitation (see Chapter 9, Protocol 1).

Protocol 3: PCR From RNA
(Time: 3 hr for first strand synthesis; 3 hr for standard PCR)
In some cases, the most appropriate starting material for PCR is not DNA, but is instead an RNA transcript. To amplify from RNA, an additional step is required. This is because Taq polymerase can only use DNA as a template. The enzyme reverse transcriptase (RT) is used to reverse transcribe RNA into DNA.
PCR amplifications from cDNAs can be done with no modification of the basic protocol. First-strand synthesis of the cDNA is accomplished using RT primed with an oligo dT. This primer (a string of T's) anneals to the poly-A tail that is added (by the cell) to most messenger RNA before it is translated. The result is an RNA/DNA hybrid
that can also be used directly in PCR reactions. To clone these products into a cDNA library, second-strand synthesis is required. This is the replacement of the first RNA strand with a DNA strand. Protocols for extracting RNA from animal and plant tissue are given in Chapter 9.
1. Dry about 1 µg of mRNA in a lyophilizer. Do not overdry it because then it will re-suspend poorly.
2. To 30 pmol of primer add RNase-free water up to a total volume of 11 µl.
3. Heat to 70°C in a heating block for 10 min and freeze on dry ice. Let the sample warm to room temperature slowly (about 10 min).
4. Add 4 µl of reaction buffer (usually supplied with RT enzyme), 2 µl 0.1 M DTT, and 1.5 µl dNTPs (10 mM each).
5. Incubate at 37°C for 10 min.
6. Add 1.5 µl of reverse transcriptase and incubate at 37°C for 60-90 minutes.
7. Amplify the target sequence from cDNA using the standard PCR protocol (Protocol 2).

TROUBLESHOOTING

Avoiding PCR Problems: PCR Hygiene
Because PCR products are so concentrated and easily volatilized (by opening a microfuge tube or pipetting, for instance), cross-contamination of samples is potentially a serious problem. Certain simple precautions can be taken to avoid contamination or at least minimize it if it occurs.
1. Aliquoting solutions makes it possible to contain and help resolve contamination problems that do arise. Each person working in the lab should have his or her own set of solutions. PCR reagents prepared in large amounts should be distributed in 1.5-ml microfuge tubes and stored at -20°C.
2. Water used for PCR reagents, DNA, and primers should be double-distilled, sterilized, and then distributed in 1.5-ml microfuge tubes and stored at -20°C.
3. When primers are made, the stock solution usually is highly concentrated. From this highly concentrated stock solution, it is desirable to make a 100 µM stock solution which can then be used in making 10 µM solutions for individual use. The different stock solutions are stored separately. In this way massive, laboratory-wide contamination problems can be avoided and any contamination problems that do arise can be contained.
4. Different sets of pipetters should be designated for different procedures. One set of pipetters should be designated for preparing PCR reactions. These pipetters should never come in contact with any amplified DNA. Another set of pipetters can be designated for post-PCR use. One pipetter should be designated to be used only in loading samples in agarose gels. Another set of pipetters should be designated for use with radiation only.

Some Common Problems With PCR
Problem: No PCR product, not even in positive controls.
Possible remedies:
1. Repeat the experiment.
2. Check buffer, dNTPs, and primer recipes and concentrations. Remake any questionable solutions.
3. Try a different set of primers or a different positive control.
4. Try a new batch of enzyme (this is seldom the problem unless the enzyme is very old).
5. Was oil added to the reactions?
6. Check the thermal cycler by watching it go through 2-3 cycles.

Problem: Positive control works, but otherwise there is no product.
Possible remedies:
1. Run 5 µl of the stock DNA solution on a 1% agarose gel. If there is a large amount of high-molecular-weight DNA, try diluting the starting template DNA (try dilutions of 1:10 or
Nucleic Acids II: The Polymerase Chain Reaction 231
product from which to produce single-stranded templates. Try the following remedies to produce high-quality single-stranded product.
1. Try a dilution series with the double-stranded product to determine the template concentration that works the best.
2. Re-do the double-stranded amplification at higher stringency. It may be that the original double-stranded amplification is tainted with non-specific products. This is especially helpful when the single-stranded products are "smeary" on the gels, indicating a non-specific product.
3. Adjust the concentration of the limiting primer in the initial double-stranded amplification.
4. Try gel-purifying the double-stranded templates. Gel-purification, in principle, should eliminate all excess primers. Therefore, both primers must be added to the single-strand reactions. The concentration of the limiting primer should be about one-hundredth the concentration of the second primer. However, run a dilution series with the limiting primer to determine the concentration that works best.
5. Try using an internal primer to generate single-stranded template (see "Method C").
6. Try a different sequencing technique. Although asymmetric PCR was once the most common way to sequence PCR products, other (often more reliable) methods are available. See Chapter 9.

USEFUL PRIMERS
Included herein are primers that have been useful across a broad range of taxa for targets in nuclear, chloroplast, and mitochondrial genomes. Several other sources provide access to primers that may work on specific taxa, including Palumbi et al. (1991), Simon et al. (1994), T.J. White et al. (1990), and Kocher et al. (1989). These primers are all written in the 5' → 3' direction. This means that the "downstream" primers are the reverse complements of the coding sequences to which they anneal.
For most gene regions, primer sequences are given and aligned with published sequences from a variety of taxa. A reference map is also included that shows the location of the primers relative to each other. The labels 5' and 3' indicate "upstream" and "downstream" primers, respectively. The following standard degeneracy symbols are used: Y = C or T; R = G or A; Z = C or G; S = C or A; Q = A or T; M = A, T, or C; D = G or T.

Nuclear Ribosomal Gene Primers
Three of the eukaryotic nuclear ribosomal RNA genes are organized in a cluster that includes a small subunit gene (16S to 18S, where S stands for Svedberg units, a measure of sedimentation rate), a large subunit gene (26S to 28S), and the 5.8S gene. In addition, two internal transcribed spacers (ITS-1 and ITS-2) lie between these genes and there is an external transcribed spacer (ETS) at the 5' end of the transcribed RNA. These six components make up the basic cluster; they are repeated in a tandem array in the eukaryotic genome up to hundreds or thousands of times. Between each cluster in the array is a non-transcribed spacer (NTS) that serves to separate individual repeats from one another on the chromosome.
In general, the genes are more highly conserved than the transcribed spacers, which are more highly conserved than the non-transcribed region. Within even the conserved genes, however, there are regions of very high sequence similarity among taxa, and regions of low conservation (Hillis and Dixon, 1991). Thus, the entire array is a patchwork of evolutionary rates.
PCR primers have been designed to span each of the ribosomal genes. Generally, the 5.8S gene is too short for most purposes, but primers designed for this small gene can be used to amplify the adjacent ITS regions. The primers shown in Figure 3 are a sample of the primers listed by Hillis and Dixon (1991). Maps of sequence conservation based on comparisons of mammalian sequences to those of other phyla can be found in Figures 3-8 of Hillis and Dixon (1991). The map in Figure 3 is very approximate; the reader should consult Hillis and Dixon (1991) for details.
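Because every primer in this section is written 5' → 3', a "downstream" primer has to be reverse-complemented before it can be located on the coding strand of a published sequence. The following is a minimal sketch of that conversion. The helper name is hypothetical, and it accepts only unambiguous bases (A, C, G, T); degenerate symbols are deliberately rejected rather than guessed at. The example sequence is the 5.8S downstream primer as it appears in the Figure 3 listing.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence written 5'->3'."""
    seq = seq.upper()
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as err:
        # Degenerate or non-DNA symbols are not handled in this sketch.
        raise ValueError(f"unsupported base: {err.args[0]!r}") from None


if __name__ == "__main__":
    downstream_primer = "GTGCGTTCGAAGTGTCGATGATCAA"  # as printed in Figure 3
    # The result is the coding-strand sequence that the primer anneals to.
    print(reverse_complement(downstream_primer))
```

Applying the function twice returns the original primer, which is a quick sanity check when copying sequences by hand.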
I II III IV V VI
→ → → → → →
Map Sequence
position Primer/Taxa position
I. 18e-5' CTGGTTGATCCTGCCAGT 5
Mammal . . . . . . . . . . . . . . . . .
Frog
Urchin
Fruit fly
Rice
Yeast
Protist
IV . 28y-5' CTAACCAGGATTCCCTCAGTAACGGCGAGT
Mammal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Frog . . . . . . . . . . . . . . .C . . . . . . . . . . . . . .
Urchin . . . . . . . . . . . . . . .C . . . . . . . . . . . . . .
Fruit fly . . . . . A . . . . . . TT . . T . . . . G . . . . . . . .
Nematode . . , . AA . . . . . . . . . . T . . . . . . . . . . . .
Rice . . .T . .G . . . . . . . . C T . . . . . . . . . . . . .
Yeast .C G...T.............
V. 28ee-5' ATCCGCTAAGGAGTGTGTAACAACTCACC 1795
Mammal . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Frog . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Urchin . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Fruit fly . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Nematode . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rice . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Yeast . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 3 Map of the nuclear rDNA array showing approximate positions of primers used to amplify regions of the 18S, 5.8S, and 28S genes, as well as the intervening ITS regions (see Hillis and Dixon, 1991 for details). Individual primer sequences are listed with an alignment to other taxonomic groups. Sequence position refers to the starting position of the primer in the relevant Mus musculus gene.
I    II    III    IV    V    VI
→    →    →    →    →    →
[ NTS | ETS | 18S | ITS-1 | 5.8S | ITS-2 | 28S ]
←    ←    ←    ←    ←
VII    VIII    IX    X    XI
Map Sequence
position Primer/Taxa position
VI. 28~-5' AAGGTAGCCAAATGCCTCATC 3429
Mammal
Frog
Fruit fly . . . . . . . . . . . . . . . . . . . . .
Nematode . . . . . . . . . . . . . . . . . . . .T
Rice . . . . . . . . . . . . . . . . . . . . .
Yeast . . . . . . . . . . . . . . . . . . . . .
Protist . . . . . . . . . . . . . . . . .T . .
VII. 18h-3' AGGGTTCGATTCCGGAGAGGGAGCCTGAGAGAAA 420
Mammal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Frog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Fruit fly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Nematode . . . . . . . . . .C . . . . . . . . . . . . . . . . . . . . .
Rice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Yeast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Protist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
VIII. 5.8~-3' GTGCGTTCGAAGTGTCGATGATCAA 85
Mammal . . . . . . . . . . . . . . . . . . . . . . . . .
Frog . . . . . . . . . . . . . . . . . . . . . . . . .
Loach . . . . . . . . . . . . . . . . . . . . . . . . .
Urchin . . . . . . . . . . .A . . . . . . . . . . . . .
Silkworm . . . . . . . . . . .A . . . . . . . . T . . . .
Rice T . . . . . . . A . . . A C . . . . . . G . TC .
I →
[ 12S rRNA ]
← II
Map Sequence
position Primer/Taxa position
I. 12SA-5' AAACTGGGATTAGATACCCCACTAT
Human . . . . . 1067
Frog . . . . . . . 2486
Urchin . CA . TGT.. 491
Fruit fly . A. .T T . 14612
12Sai-5' AAACTAGGATTAGATACCCTATTAT
Human . . G. . . . . C C . 1067
Urchin . C . . . . . . . G . 491
Fruit fly . . 14588
II. 12SB-3' GAGGGTGACGGGCGGTGTGT
Human . . . . . . . . . 1478
Frog . . . . . . . . . 2898
Urchin A . . . . . .A. 853
Fruit fly A .A C . . . . .A . 14211
12Sbi-3' AAGAGCGACGGGCGATGTGT
Human G G I G 1478
Urchin G . T 855
Fruit fly 14214
Figure 4 12S rRNA primers. Many of these primers were first published by Kocher et al. (1989). They tend to work for most animal phyla, although there are exceptions. For vertebrates it is useful to use 12SA-5' with 16SA-3' (Figure 5). This is a large fragment (≈1425 bp) that can be subjected to restriction digestion or sequenced from both directions. New primers then can be designed to walk through most of the 16S gene and the 3' part of the 12S gene. The 12Sai and 12Sbi primers were made for insects by Chris Simon. They have a higher ratio of A's and T's than the other primers and require a slightly lower initial annealing temperature. These latter primers also work for most crustaceans (except the barnacles) and many mollusks.
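The remark that the A+T-rich insect primers need a lower annealing temperature can be illustrated with the Wallace rule of thumb, which estimates melting temperature as 2 °C per A or T plus 4 °C per G or C. That rule is not given in this chapter and is brought in here as an assumption; it is only a rough guide for short oligonucleotides, and the function name is illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Only unambiguous bases are counted; degenerate positions are ignored.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Sequences as listed for the 12S rRNA 5' primers.
    for name, seq in [("12SA-5'", "AAACTGGGATTAGATACCCCACTAT"),
                      ("12Sai-5'", "AAACTAGGATTAGATACCCTATTAT")]:
        print(name, wallace_tm(seq))
```

Under this approximation the insect primer comes out several degrees lower than 12SA-5', which is consistent with the lower initial annealing temperature recommended in the caption.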
Map Sequence
position Primer /Taxa position
I. 16Sar-5' CGCCTGTTTATCAAAAACAT
Human . . . . . . . . . .C . . . . . . . . . 2510
Frog . . . . .C . .GC.T . . . . . . . . 3999
Urchin . . . . . . . . . .C . . . . . . . . . 5092
Fruit fly . . . . . . . . . .A . . . . . . . . . 13398
mitochondrial genome as a whole. Because the amplified fragment of the 16Sar and 16Sbr primers (Figure 5) is larger than the 12S fragment (about 550 bp compared to 400 bp), it is slightly more useful in phylogenetic reconstruction. Also, there is enough variation in some species to be useful in population level studies.
Most investigators who have used the primers shown in Figure 5 successfully on organisms distantly related to the species on which the primers were based (e.g., corals, hydroids, gastropods) have used preliminary sequences from initial amplifications done at low stringency to design taxon-specific primers.

Cytochrome Oxidase I
This gene is a subunit of the cytochrome oxidase complex that is part of the electron transport chain. Its amino acid sequence is highly conserved across phyla, making it easy to align sequences to one another, and making it possible to design useful universal primers (Figure 6). Because it is so highly conserved, amino acid substitutions are rare within species, but silent changes are just as common as they are in other mtDNA genes with lower constraints on amino acid sequence (Kessing, 1991). Amino acid sequences are useful in phylogenetic reconstruction of deep evolutionary branches (Palumbi and Benzie, 1991).

Cytochrome b Primers
Cytochrome b is a protein in the electron transport chain. It is the only protein product of the mitochondrial genome that is a fully functional monomer; that is, it is not a subunit of a large enzyme complex, as are all the other mitochondrial-
[Map: Cytochrome oxidase I: Vertebrates; primer II (→) is drawn above the map and primer VII below it]
Map Sequence
position Primer/Taxa position
I. COIe-3' CCA GAG ATT AGA GGG AAT CAG TG
Human . .T . . . .A. . .G . . A . . . . . . . . 7110
Frog . . . . TA . A . . AC . . . . . . . . . . . 8602
Urchin . . . . . . .AG . . G . . A . . C . . . . . 6992
Fruit fly . . . . T A . .A. . AT . . . T. . . . . . . 2672
encoded proteins. The chemistry and action of cytochrome b are well known, and a secondary structure of the protein has been proposed. Irwin et al. (1990) examined the molecular evolution of cytochrome b in mammals, and Martin and Palumbi (1993a) compared these results to the evolution of this protein in sharks. Both studies noted that the level of amino acid conservation varies significantly in different parts of the cytochrome b gene. The most variable part is at the 3' end of the sense strand. There are several sections of the gene that are highly conserved among taxa and are thought to be important in the function of the protein.
Figure 7 shows primers for cytochrome b, primarily for use in vertebrates. Kocher et al. (1989) listed two universal cytochrome b primers that amplify a short section from a wide variety of taxa. This short section shows some variation within some populations (e.g., fish) and between species. It is so short that robust phylogenies are sometimes difficult to produce. To obtain a longer
[Map: E | Cytochrome b | T; primers I, II, and VI are drawn above the map, primers III, V, and VII below it]
Map Sequence
position Primer/Taxa position
I. GLU-5' TGA TAT GAA AAA CCA TCG TTG
Human . C . . . T . . 14724
Frog C C
Carp CT G C
Chicken C G CT G T
Shrimp C AT T G T AT
Frog . . . . . . A . . A . . . . . . . . . . . . . .
Urchin AG. . . . A . . G . . C . . . . . C . . C . .
V. CB3-3' GGC AAA TAG GAA RTA TCA TTC
Human . . . . . . G . . A . . . . . . . . . . . 15560
Frog . . . G. . . . . . . . . . . . . . . . . 17065
Stingray . . . . . . . . . . . . . . . . . . . . .
Sturgeon . . . . . . G . . A. . . . . . . . . . .
Urchin . . . G.. ..A . . . . . . C.. . . .
Fruit fly A . . . . . . .A A . . . . . . . . . . . 11325
[Map: T | P | Control region | F; primers V and I are drawn above the map, primers II and IV below it]
Map Sequence
position Primer/Taxa position
I. PRO-5' CTA CCT CCA ACT CCC AAA GC
Human C A T T G A 15980
Frog C A TTG C
Sturgeon TC C TT. . . . .
Urchin TAC A T . G . . .
II. PHE-3' TCT TCT AGG CAT TTT CAG TC
Human C G AA . . . . . . . 625
Frog .A CA
Urchin C TG A
a third strand. In taxa other than mammals, the control region is organized very differently, often without an obvious D-loop. In sea urchins, this region is under 200 bp long. In fish, it tends to be very long and is often full of repeated sequences. In insects, it is called the AT-rich region and can also be long and full of repeated sequences.
In the control region there is usually a set of conserved sequence blocks that presumably are important in controlling mtDNA replication and transcription. See Attardi (1985) for a review. Interspersed in this region, often flanking the conserved areas, are sections of non-coding DNA that seem to be free to vary. These regions contain many polymorphic sites, and have been the focus of concerted efforts to understand the population biology of mammals through mtDNA sequencing. Attention to the control regions of other groups for similar purposes has lagged behind work on mammals.
The map shown in Figure 8 is for mammals. It is a useful guide for fishes as well, in which rates of control region substitution can be up to 40 times higher than in the cytochrome b gene (W. O. McMillan and S. R. Palumbi, unpublished data). However, this region appears to be rearranged in birds (T. Quinn, personal communication).

Chloroplast DNA Primers

Among chloroplast genes that have been sequenced extensively, rbcL clearly leads in terms of the number of species examined (see R.G. Olmstead et al., 1992; also, volume 80 of the Annals of the Missouri Botanical Garden).

rbcL
The large subunit of ribulose 1,5-bisphosphate carboxylase is encoded in the chloroplast genome, and is part of a large enzyme that cat-
Primer/Taxa
rbcLla-5' GGCCGTCGACATGTCACCACAAACAGARACTAAAGC
Barley . . . . . . . . . . . . . . . . .A . . . . . . . .
Rice . . . . . . . . . . . . . . . . .A . . . . . . . .
Vicia . . . . . . . . . . . . . . . . .A . . . . . . . .
Galium . . . . . . . . . . . . . . . . .G . . . . . . . .
rbcLlb-5' ATGTCACCACAAACAGAAACTAAAGCAAGT
rbcL12-3' CTCGSAGCTCCTTTTAGTAAAAGATTGGGCCGAG
Spinach . . . . . . . . . . . . . . . . . . . . . . . .
Tobacco . . . . . . . . . . . . . . . . . . . . . . . .
Pea . . . . . . . . . . . . . . . . . . . . . . . . .
Cotton . . . . . . . . . . . . . . . . . . . . .C . .
Alfalfa . . . . . . T . . . . . . . . . . . . . .T . .
OW-3' ACTACAGATCTCATACTACCCC
Rice . . . . . . . . . . . . . . . . . . . . . .
Figure 9 Chloroplast rbcL primers. The underlined sections of rbcLla and rbcLl2 correspond to restriction sites that have been built into these primers (Sal I and Sac I). The rbcLla primer is identical to 25 sequences in GenBank (note that the matches listed here are from a BLAST search, and represent the taxa with the highest similarity). The 3' primers actually anneal to a conserved section of the chloroplast DNA downstream of the rbcL gene (see R.G. Olmstead et al., 1992). These primers have been used extensively on angiosperms, but work on some non-angiosperms as well.
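The built-in restriction sites mentioned in the Figure 9 caption can be located with a short search. The sketch below assumes the standard recognition sequences GTCGAC (Sal I) and GAGCTC (Sac I), which are not spelled out in the caption, and it treats IUPAC ambiguity codes in a primer as matching any base they can represent; the function and table names are illustrative.

```python
# Bases represented by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_site(primer: str, site: str) -> list:
    """Return 0-based start positions where `site` could occur in `primer`."""
    primer, site = primer.upper(), site.upper()
    hits = []
    for start in range(len(primer) - len(site) + 1):
        window = primer[start:start + len(site)]
        # A position matches if the site base is allowed by the primer's code.
        if all(base in IUPAC[code] for code, base in zip(window, site)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    five_prime = "GGCCGTCGACATGTCACCACAAACAGARACTAAAGC"   # first 5' primer as listed
    three_prime = "CTCGSAGCTCCTTTTAGTAAAAGATTGGGCCGAG"    # first 3' primer as listed
    print("Sal I:", find_site(five_prime, "GTCGAC"))
    print("Sac I:", find_site(three_prime, "GAGCTC"))
```

The same check is useful before cloning a product, since a site that also occurs inside the amplified fragment would be cut along with the primer-borne site.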
Amino acid
Primer/Taxa position
CK6-5' GAC CAC CTC CGA GTC ATC TCZ ATG
Mouse . . . . . . . . . . .C . .G . . . . . . . . .
Chicken . . . . . T A CC A . . . . G . . . . . C . . . 248
Fish . . . . . . . . . . . . . . . . . . ..G . . . 233
Urchin . . T . . . ACT . . G . . T . . T . . C . . . 273
Lobster . .T . . . . . . ..C A.. . . . .T . . 168
bility, potential high rates of variability, and well-understood genetic background of this nuclear allelic system make it attractive for molecular studies of population structure and genetic diversity. Below are a few exon-priming, intron-crossing (EPIC) primer pairs that seem to have broad applicability. Use of these primers requires a bit of patience and willingness to troubleshoot.

Creatine Kinase
Creatine kinase (CK) supplies muscles with energy by transferring a phosphate group from creatine phosphate to ADP. This enzyme is only known in vertebrates and echinoderms, but a very similar enzyme occurs in the protostome phyla, where it is called arginine kinase (ARK). Creatine kinase has three unlinked loci in most vertebrates known, and occurs in a tandem triple repeat in sea urchins. There are a variety of loci of arginine kinase in insects and crustaceans.
Many electrophoretic surveys have included CK or ARK, and so there may be protein data for alleles at these loci for a particular set of species. It would be interesting to compare electromorphs and intron differences in these taxa. Primers are shown in Figure 11.

Actin
Genes in this small family are known to be highly conserved across animal phyla (Foran et al., 1985), and codon bias limits heterogeneity at four-fold and two-fold sites. Most importantly, some intron positions are also conserved. For example, the first intron occurs near amino acid position 41 in most species examined (Kowbel and Smith, 1989), which allows exon primers to be developed both upstream and downstream from the intron (see Palumbi and Baker, 1994). Both cytoplasmic and muscle types of actin appear to be amplified with these primers (Figure 12).

Cytochrome c
Cytochrome c has long been of interest to molecular evolutionists. It was studied extensively using protein sequencing before large-scale DNA sequencing became practical (see Dicker-
Amino acid
Primer/Taxa position
ACTI-5' GCT GTT TTC CCG TCG ATT GT
Starfish C  C C . 29
Starfish M  . G A T 29
Urchin A
Mouse . .. ,
Fruit fly C A G T
Yeast A T
Frog G T T
Chicken A T . T . ..
Brine shrimp . C .T C . T
Nematode C G . A . T
ACTII-3' GTC CTT CTG CCC CAT ACC SAC CAG
Starfish C  . . . A 51
Starfish M  . . G 51
Urchin T T . C .
Mouse . T .. C .
Chicken . C ..
Carp T ... A .
Nematode .. . . . T . G . ..
Figure 12 Actin intron primers. These primers generally produce multiple bands between 200 and 2000 bp in length. Successful amplifications have been obtained for some insects, spiders, all mammals tested, and sea urchins. See Baker and Palumbi (1994) for details.
Amino acid
Primer/Taxa position
R/K C A/L Q C H T
cytC-C-5' AAG TGT GCY CAR TGC CAC AC
Human C . .G . 19
Insect CGC , C . C G 18
Yeast GA . CTA A 18
Rice C C . G . , , 22
M K T G P I Y K K
cytC-B-3' CAT CTT GGT GCC GGG GAT GTA TTT CTT
Insect ... . C 77
Rat . . . . . . . C. 72
Yeast ... A A . A A .. 77
Rice . . T . A
Figure 13 Cytochrome c intron primers. Note that cytC-C is a degenerate primer but that degeneracy in cytC-C has been reduced by using the mismatch rules discussed in the text. The nucleotide at position 10 of cytC-B (listed as a G here to match the insect sequence) might be changed for plants or fungi. Also note that a second intron occurs near the 3' end of cytC-B in some plants, and the last two codons (both corresponding to lysines) should be omitted. Although the plant introns seem to be large and might be useful in population or systematic studies, the vertebrate introns are small, and have not provided enough sequence data to be useful. Some of these latter amplifications might be processed pseudogenes, which are known for mammals. However, initial amplifications from fish have shown larger intron sizes.
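To see what the degeneracy of cytC-C amounts to in practice, the ambiguity codes in a primer can be expanded into the individual oligonucleotides present in the synthesized mixture. This is a minimal sketch using the standard IUPAC codes; the function name is illustrative and the expansion says nothing about which variants actually prime well.

```python
from itertools import product

# Bases represented by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    cytc_c = "AAGTGTGCYCARTGCCACAC"  # cytC-C-5' as listed, spaces removed
    variants = expand_degenerate(cytc_c)
    print(len(variants), "variants")
    for v in variants:
        print(v)
```

With one Y and one R, the primer is a mixture of only four sequences, which is the kind of low degeneracy the caption describes.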
trons in all taxa. EFO was designed to anneal upstream of several other potential intron positions. Note that in humans there are many processed pseudogenes of EF1α (Uetsuki et al., 1989).

DQα
The primers shown in Figure 16 amplify a section of the hypervariable protein-coding domain of the MHC locus DQA, and were developed by Slade et al. (1993). Note that these primers are not designed to target introns, but instead provide a section of a very polymorphic nuclear coding gene. The amplified product is 800-1000 bp in most mammals tested. The primers do not appear to work on birds, reptiles, or invertebrates, and so are probably only useful within the mammals.

Aldolase
Lessa and Applebaum (1993) demonstrated the usefulness of combining EPIC amplifications and denaturing gradient gel electrophoresis. They de-
Amino acid
Primer/Taxa position
INTA-5' AAC CTT CAC AAC AAY GAG GC
Human . . . . . . . . . . . 197
Mouse . . . . . .
Frog . .T .G . . . . C C
Fruit fly .T G . . . . . . . C .. 21
Nucleotide
Primer/Taxa position
DQA1-5' CCGGATCCCAGTACACCCATGAATTTGATGG 492
Human . . . . . . . . . . . .
DQA2-3' CCGGATCCCCAGTGCTCCACCTTGCAGTC
Human 1336
Rat
Sheep
Figure 16 DQα intron primers. Both primers have an eight-base Bam HI linker at the 5' ends. They amplify a section of the DQA gene, as well as the intron between exons two and three.
Nucleotide
Primer/Taxa position
Aldl-5' TGTGCCCAGTATAAGAAGGATGG 5323
Rat A
Rat B T C
Figure 17 Aldolase intron primers. Primers work well for loci A and C in mammals, reptiles, and birds. Some rodents are known to have a pseudogene that also amplifies (see Lessa and Applebaum, 1993; Slade et al., 1993). Nucleotide positions correspond to those in the mouse sequence MUSALDAA, as given in Lessa and Applebaum (1993).
genes had a size consistent with a lack of an intron and showed stop codons in every reading frame. Introns between exons 2 and 4 are similar in position in humans, chickens, and Drosophila; they range in size from 180 to 1370 bp. The intron between exons 4 and 5 can be bigger (>2500 bp in chickens), or not occur at all (Drosophila). See Slade et al. (1993) for details.

Beta Tubulin
Alpha and beta tubulin are related proteins that form heterodimers to make up the bulk of microtubules. Like actin, the proteins are highly conserved, and occur in a small gene family: Drosophila has four loci, Chlamydomonas has two. Yeast and chicken proteins are about 70% similar at the amino acid level. The primers shown in Figure 19 were designed by Tom Duda, University of Hawaii.

More Information about PCR
Many different guides and lists of protocols are available that describe the PCR process. Some of the most useful include Innis et al. (1990), Erlich et al. (1989), Sambrook et al. (1989), and Zimmer et al. (1993). Recent reviews include Erlich et al. (1991) and Erlich and Arnheim (1992). Trade jour-
Map Nucleotide
position Primer/Taxa position
I. H2A6-5' GCTGGGCCGGTAAGGCTGGNAAGG 19
Cow G . G
II. H2A2-5' GAAGAGT TGGATTCCCTCATCAA
Chicken
Dog A
III. H2A5-3' TGTGGATGTGTGGAATGACACCT 320
Human
Cow
Figure 18 Histone H2AF intron primers. Designed by Slade et al. (1993), these primers seem to work only on vertebrates, and they produce many pseudogene amplifications. According to Slade et al., use of their H2A2 and H2A5 primers gave the fewest pseudogene amplifications, although the primers may not work well outside of placental mammals.
Map Amino acid
position Primer/Taxa position
I. Tub1-5' CAG GCT GGT CAA TGT GGY AAY CA
Human C C
Chicken A G G C C C
Fruit fly C C G C C C
Nematode A  A C A C T T
Nematode B  A C C
Figure 19 Beta-tubulin intron primers. Tub1 and Tub2 do not work well together, but Tub1 and Tub4 work on all of the taxa tested, including vertebrates, sea urchins, mollusks, and flies. Tub3 and Tub4 also work well together on the same set of taxa, and in this region vertebrates tend to have an intron near amino acid position 92, whereas urchins, insects, and nematodes have an intron near amino acid 130. Introns are marked by vertical lines. Taxa possessing a given intron are listed to the right: c = chicken; f = fly; h = human; n = nematode; u = urchin. All sequences are from GenBank.
These recipes are for 1× solutions, but they usually are made at the concentrations shown and diluted for use.
Once the solution is made, test it immediately; if it passes, it can be aliquoted into 1.5-ml tubes and frozen, where it will last up to one year.
[Figure 1 graphic not recoverable. Row labels: Type of sequence change; Assays; Levels of application (Broad → Intraspecific); Phylogeny +++ - - - -]
Figure 1 Schematic of techniques for analyzing fragments and their applicability to different problems in molecular systematics. Time and expense required for implementation are not considered. RAPDs are more useful for estimating selfing rates directly from family data but not indirectly from genotype frequencies in natural populations.
(Chapter 9) offers extremely high resolution and yields character data that also can be converted to estimates of sequence divergence if so desired (Chapter 11). The development of reliable methods for direct sequencing via PCR (Chapters 7 and 9) has simplified the generation of sequence data.
Sequence variation also can be examined indirectly by electrophoretically comparing DNA fragments to look for variation in their number, size, or conformation (Figure 1; reviewed by Lessa and Applebaum, 1993; Grompe, 1993). Although fragment analysis offers less resolution than nucleotide sequencing in some respects, it is a powerful and cost-effective alternative where large numbers of individuals or loci or large segments of a genome are being screened. A theme developed in this chapter (see also Chapter 12) is that fragment methods are a powerful complement to sequencing studies. The nucleotide sequence provides a detailed restriction site map that allows for precise interpretation of fragment changes (e.g., Cann et al., 1984; Liston, 1992; Wugall et al., 1994) and fragment methods can be used with great efficiency to screen populations or species for specific changes in sequence (Slade et al., 1993).

Forms of Fragment Variation
Differences among individuals in the number and/or pattern of DNA fragments can arise through a number of distinct processes, including changes in the amount of DNA, the structure of the DNA, or the number or distribution of restriction sites. Also, where assays use gene amplification, there may be variation in the ability to amplify a specific DNA segment. The types of polymorphisms related to these processes are reviewed below.

VARIATION IN FRAGMENT SIZE Fragments can vary in size because of unique insertions or deletions or because of changes in the number of copies of tandemly repeated sequences. The tandem repeats with the highest rate of change in copy number are very short, 2 to 5 bp in the case of microsatellites (Tautz, 1989; Weber and May, 1989; Goodfellow, 1993; Morgante and Olivieri, 1993; Queller et al., 1993; Figure 2A), or some-
Figure 2 Examples of variation at (A) single microsatellite loci and (B) multiple minisatellite loci. In (A), the two microsatellite loci (B29, B123) were amplified from each of 20 bridled nailtail wallabies (lanes A to T) and run adjacent to a known sequence (sm) to identify alleles by their size. Both loci show the shadowing typical of dinucleotide microsatellite loci, but this is very consistent from sample to sample, enabling accurate scoring of genotypes. In (B), the multilocus minisatellite autoradiogram (provided by D. Lambert; see Lambert et al., 1994) illustrates variation within and between families of pukeko detected using probes pV47-2 and 3'HVR. The pukeko families have a complex mating structure with α, β, and γ individuals ranked according to their dominance. The autoradiograph illustrates both the complexity of the profiles and the power for determining parentage.
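Microsatellites of the kind shown in Figure 2A are runs of a 2 to 5 bp unit repeated in tandem, so candidate loci can be found in sequence data with a simple pattern search. The sketch below is one possible approach, not a method from this chapter; the minimum repeat count of four and the function name are arbitrary assumptions.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4) -> list:
    """Return (start, unit, copies) for tandem repeats of a 2-5 bp unit.

    Homopolymer runs (e.g. AAAA...) are skipped because they are not
    microsatellites in the sense used here.
    """
    # Capture a 2-5 bp unit, then require it to recur at least min_repeats-1 times.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    example = "TTG" + "CA" * 6 + "GGT"  # hypothetical sequence with a (CA)6 repeat
    print(find_microsatellites(example))
```

Alleles at such a locus differ in the number of copies, which is why they can be scored by fragment size on a gel.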
Table 1
Properties of selected microsatellite and minisatellite loci in humans

Locus(a)             N     Heterozygosity   Number of alleles(b)   Mutation rate(c)
Minisatellite loci
D5S43                125   0.890            NR                     0 (0-0.003)
D12S11               125   0.970            NR                     0 (0-0.003)
D7S22                125   0.980            NR                     0.003 (0.0006-0.009)
D7S21                125   0.980            NR                     0.007 (0.003-0.015)
D1S7                 125   1.000            NR                     0.052 (0.038-0.072)
Microsatellite loci
HUMHPRTB [AGAT]n     417   0.753            9                      0.000054
HUMTH01 [AATG]n      370   0.720            5                      0.000035
HUMRENA4 [ACAC]n     374   0.442            6                      0.000023
HUMFABP [AAT]n       152   0.574            7                      0.000045
HUMARA [AGC]n        97    0.892            14                     0.000159

(a) Data for minisatellite loci are from Jeffreys et al., 1988; those for microsatellite loci are from Edwards et al., 1992.
(b) NR = not reported.
(c) For minisatellites, observed mutation rates among 344 offspring are reported with 95% confidence intervals in parentheses. For the microsatellites, no mutations were observed and the estimates are derived from diversity levels using a maximum likelihood method that simultaneously estimates Ne and mutation rate (μ). A more recent study reports observed mutation rates for 28 human microsatellite loci at between 0 and 0.008 with a mean of 0.0012 mutations per gamete per generation (Weber and Wong, 1993).
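The heterozygosity column in Table 1 can be related to allele frequencies through the usual expected-heterozygosity formula, H = 1 - Σp². The table does not state how its values were calculated, so the sketch below is an illustration under that assumption, with hypothetical frequencies and illustrative function names.

```python
def expected_heterozygosity(freqs) -> float:
    """Expected heterozygosity under random mating: 1 - sum(p_i ** 2)."""
    freqs = list(freqs)
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def max_heterozygosity(n_alleles: int) -> float:
    """Upper bound on expected heterozygosity, reached when alleles are equally common."""
    return 1.0 - 1.0 / n_alleles


if __name__ == "__main__":
    # Hypothetical three-allele locus.
    print(expected_heterozygosity([0.5, 0.3, 0.2]))
    # A five-allele locus cannot exceed this value.
    print(max_heterozygosity(5))
```

The bound explains why loci with more alleles can reach higher heterozygosity: each microsatellite value in Table 1 lies below 1 - 1/k for its reported number of alleles k.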
Table 2
Examples of recognition sequences and types of end produced

Enzyme    Recognition sequence    End type
et al., 1992; Ruano et al., 1994) and provide a simple means of identifying distinct sequences. Heteroduplex analysis also increases the sensitivity of DGGE (Myers et al., 1989). These techniques have the potential to detect single base-pair changes and have been developed to assay variation in PCR products, although SSCPs also can be detected using genomic DNA (e.g., Orita et al., 1989).

RESTRICTION SITE VARIATION Base substitutions or insertion/deletion (indel) events have often been detected using restriction endonucleases (REs): enzymes isolated from bacteria that cut DNA at a constant position within a specific recognition sequence, typically 4-6 bp long (e.g., Table 2). Thousands of REs have been isolated and characterized, with all of the information stored in a database (REBASE; Roberts and Macelis, 1993). Each cleaves DNA at a characteristic recognition sequence that usually is symmetrical and, when cleaved, leaves ends with a 5' overhang, a 3' overhang, or no overhang (Table 2). In some cases, enzymes isolated from different bacteria have the same recognition sequence (= isoschizomers) and some recognition sequences overlap (see below). The specificity of cleavage by REs means that complete digestion of a particular DNA allele will yield a reproducible array of fragments.
The variations in fragment pattern revealed following digestion with restriction enzymes are referred to as restriction fragment length polymorphisms (RFLPs). Base substitutions (or small indels) can create or eliminate cleavage sites for a particular enzyme, thereby altering the number and size of fragments detected by that enzyme alone (e.g., Figure 3). Larger indels or rearrangements typically alter fragment patterns for several restriction enzymes simultaneously (see the section on "Interpretation and Troubleshooting"), resulting in correlated change in restriction fragments and, thus, non-independence of fragment characters (Figure 3).

RAPDS The approaches discussed above all refer to changes within a specific, deliberately targeted segment of DNA. An alternative method of detecting variation, specific to PCR, is to detect the presence or absence of randomly amplified polymorphic DNAs (RAPDs; J.G.K. Williams et al., 1990; see Chapter 7). This technique is designed to detect sequence changes within PCR priming sites; base substitutions within either priming site will affect the efficiency of amplification, changing the profile of fragments produced by a given primer (Caetano-Anollés and Bassam, 1993). However, the ability to amplify a specific segment also will be affected by large insertions or deletions between priming sites and by template DNA quality and other factors affecting the PCR reaction (Hadrys et al., 1992; Black, 1993; Ellsworth et al., 1993). The method was originally developed using very short (e.g., 10 bp) primers and low annealing temperatures, but longer "semi-random primers" also can be
[Figure 3 graphic not recoverable. Panel (A): restriction maps labeled Reference, Deletion, Duplication (direct, tandem), and Inversion, with fragments a-h; panel (B): gel banding patterns; panel (C): calibration curve, axis label "Distance from origin"]
Figure 3 The effect of different kinds of sequence change on RFLPs. (A) DNA fragments (a-h) are generated by RE digestion and (B) electrophoretically separated by size. (C) Fragment sizes are determined using a calibration curve based on a sample with fragments of known size run on each gel (lane S = size standard). Vertical arrows indicate cleavage sites and asterisks indicate the boundaries of rearrangements.
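The two steps summarized in Figure 3, generating fragments by digestion and sizing them against a standard, can be sketched in a few lines. Everything in the example is hypothetical: the sequences, the use of the Eco RI site GAATTC with its cut one base into the site, and the size standard are assumptions chosen for illustration, and the interpolation simply assumes that log(size) is linear in migration distance between neighboring standard bands.

```python
import math


def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Fragment lengths from complete digestion of a linear molecule."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def estimate_size(distance: float, standards: list) -> float:
    """Interpolate fragment size (bp) from migration distance.

    `standards` is a list of (distance, size_bp) pairs from the size-standard lane.
    """
    points = sorted(standards)
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance <= d2:
            fraction = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + fraction * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the range of the standards")


if __name__ == "__main__":
    reference = "A" * 100 + "GAATTC" + "C" * 200 + "GAATTC" + "T" * 50
    mutant = reference.replace("GAATTC" + "T" * 50, "GAATTT" + "T" * 50)  # site lost
    print(digest(reference, "GAATTC", 1))  # three fragments
    print(digest(mutant, "GAATTC", 1))     # two fragments after the substitution
    print(round(estimate_size(15.0, [(10.0, 1000), (20.0, 100)])))
```

A single base substitution that destroys a site merges two adjacent fragments into one, which is the simplest RFLP pattern; rearrangements change several fragments at once.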
used (e.g., to target intron-exon boundaries; Weining and Langridge, 1991).

Principles of Electrophoresis
The fragments produced by PCR amplification or by digestion of DNA with REs are sorted according to their size and/or conformation by gel electrophoresis. At neutral pH, the sugar-phosphate backbone of the DNA is negatively charged, causing the molecule to migrate through an electric field. The gel media used, agarose or polyacrylamide, form a dense matrix through which smaller fragments can move more easily than large fragments. For double-stranded DNA, the distance migrated typically is inversely proportional to the logarithm of the molecular weight. Fragments of known size are run on each gel to act as an internal standard against which the sizes of other fragments are estimated by interpolation from a calibration curve (Figure 3).
The migration of fragments through neutral polyacrylamide gels is also affected by their conformation (G. Singh et al., 1987). Single-stranded, or even partially denatured, DNA migrates more slowly than does double-stranded DNA. The mobility of single-stranded DNA also depends on patterns of local folding or renaturation. These properties provide the basis for detecting heteroduplexes, SSCPs, and, using gradients of denaturing chemicals (e.g., urea) or temperature, the
DGGE method. These methods are discussed in al., 1993). For microsatellite analysis, null alleles
more detail below. may fail to amplify because of large increases in
the size of the product or mutations in the flank-
~ssumptions ing primer sites. This is a potential problem be-
cause heterozygotes for null alleles will appear as
~ e r i t n b i l i t Repeatability,
y~ and Independence homozygotes for the amplified allele, biasing esti-
The most basic assumptions of fragment and RE mates of genotype and allele frequencies. Such
are that the characters in question are null alleles have been reported for humans, using
heritable, repeatable, and independent. These as- primers designed from human sequences (Callen
sumption~are considered in turn below. Violation et al., 2993) and may be more frequent when using
of any one could have significant effects on phy- primers designed from different species (C. Moritz
logenetic analyses and population genetic studies. and A. Heideman, unpublished data). N d alleles
The assuinption of heritability has two ele- also have been reported for minisatellite loci (Bru-
ments: fidelity of transmission, and mode of inheritance. Fidelity of transmission is most likely to be violated when using rapidly evolving characters such as VNTR loci. Because of their high mutation rates (e.g., Jeffreys et al., 1988), such loci may be inappropriate for comparisons among divergent populations or distinct species. RE sites recognized by methylation-sensitive enzymes could also violate the assumption of heritability, as variation in the state of methylation would mimic the gain or loss of cleavage sites. Methylation-induced artifacts can be detected by comparing the fragment patterns produced by isoschizomers that differ in sensitivity to methylation (e.g., MspI and HpaII; Groot and Kroon, 1979), but the frequency of these artifacts is unknown. Methylation does not appear to be a problem for RFLP analysis of mtDNA and cpDNA (Palmer, 1985a), but can result in hypervariation and apparent homoplasy of specific sites in nuclear sequences (G.N. Wilson et al., 1984; Jorgensen and Cluster, 1988). It does not affect RFLP analysis of PCR products because these are produced without methylation.

Knowledge of the mode of inheritance is also critical, especially for RAPDs, which are often dominantly expressed (Hadrys et al., 1992). Without this information, it may be impossible to distinguish alleles of a single codominant locus from independent products of non-homologous loci (Riedy et al., 1992; J.J. Smith et al., 1995), thereby affecting estimates of population genetic parameters (Clark and Lanigan, 1993; Lynch and Milligan, 1994). In these instances, it is essential to examine heritability using appropriately designed crossing experiments (e.g., Carlson et al., 1991; Rieseberg et

ford et al., 1992) and could occur where the probed minisatellite sequence contains a site for the restriction enzyme used.

It generally has been assumed that animal mtDNA and (in many cases) cpDNA are strictly maternally inherited. Recent studies have identified paternal leakage of mtDNA in some animals; however, the effect of such paternal contributions is unclear. Birky et al. (1989) demonstrated that low-frequency leakage (such as that seen in Mus; U.B. Gyllensten et al., 1991) would have little impact, but more frequent paternal transmission (e.g., Mytilus; Hoeh et al., 1991; Zouros et al., 1992) could have a substantial impact on the evolutionary dynamics of the system. Biparental inheritance of cpDNA is suspected in close to 20% of flowering plant species (Corriveau and Coleman, 1988; Harris and Ingram, 1991), and predominately paternal inheritance has been reported for conifer cpDNA (Szmidt et al., 1987; D.B. Wagner et al., 1987). Thus, deviations from strict maternal inheritance appear to be relatively frequent in plants, making it necessary to verify mode of inheritance of plastid DNA using appropriately designed breeding experiments or cytological tests (Milligan, 1992).

Repeatability is more of a problem for DGGE, SSCP, and RAPDs. The subtle nature of changes detected by DGGE and SSCP makes strict adherence to specific parameters (e.g., gel conditions, temperatures, run times) essential for consistent resolution (Lessa, 1993). RAPDs are sensitive to reaction conditions and often produce spurious and unrepeatable products if parameters are not carefully standardized (e.g., Ellsworth et al., 1993; Muralidharan and Wakeland, 1993).
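The consequence of dominant expression for allele-frequency estimates, noted above for RAPDs, can be illustrated with a minimal sketch. It assumes a single biallelic locus in Hardy-Weinberg proportions, so that only band-absent individuals (null homozygotes) reveal their genotype. The function names are hypothetical, and this is the simple square-root estimator rather than any corrected estimator from the studies cited.

```python
import math


def null_allele_frequency(n_band_absent: int, n_total: int) -> float:
    """Estimate q, the frequency of the recessive (band-absent) allele.

    Assumption: Hardy-Weinberg proportions, so the band-absent phenotype
    (null homozygotes) has expected frequency q**2.
    """
    if n_total <= 0 or not 0 <= n_band_absent <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_band_absent <= n_total")
    return math.sqrt(n_band_absent / n_total)


def hidden_heterozygote_fraction(q: float) -> float:
    """Expected fraction of band-present individuals that are heterozygotes.

    Band-present genotypes are AA (p**2) and Aa (2pq); a dominant marker
    scores both as the same phenotype.
    """
    p = 1.0 - q
    if p <= 0.0:
        raise ValueError("no band-present individuals are expected when q = 1")
    return (2.0 * p * q) / (p * p + 2.0 * p * q)


if __name__ == "__main__":
    q = null_allele_frequency(n_band_absent=25, n_total=100)
    print("estimated q:", q)
    print("heterozygotes among band-present:", hidden_heterozygote_fraction(q))
```

With a codominant marker the heterozygotes would be counted directly; here they can only be inferred through the Hardy-Weinberg assumption, which is why departures from that assumption bias the estimate.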
The assumption of independence of characters has both technical and biological (e.g., linkage) dimensions. The assumption of independence is potentially violated at the technical level for characters generated by several of these methods. Where the genetic basis of specific fragments is unknown (e.g., multilocus DNA fingerprints, RAPDs), it is impossible to assign fragments of specific mobility to a particular locus without setting up laboratory crosses or cloning/hybridization experiments. Non-independence can affect RFLPs if there is overlap in the recognition sequences of the REs used. This effect may be obvious, as in the case of MboI (GATC) and BamHI (GGATCC), in which the cleavage sites of the latter are a subset of cleavage sites of the former. Or, the non-independence may be subtle, as exemplified by a C-T transition that eliminates an EcoRI site (GAATTC) but produces an HinfI site (GANTC). Such non-independence can cause significant errors in estimating sequence divergence or phylogeny (e.g., Hillis et al., 1994a; Hugall et al., 1994).

Homology of DNA Segments and Alleles

If two individuals exhibit fragments with identical mobilities, it is assumed that these fragments identify homologous stretches of DNA. For anonymous or rapidly mutating segments of DNA, this assumption may not hold. In RAPD analysis, comigrating products may not be homologous, particularly in comparisons between species (Black, 1993; J.J. Smith et al., 1995) and should be verified by transfer hybridization using the specific RAPD product in question as a hybridization probe (Hadrys et al., 1992) or by cleaving gel-isolated products with restriction enzymes and observing congruent band profiles (e.g., Fritsch and Rieseberg, 1992). Similarly, bands of identical mobility in the complex profiles typically produced by multilocus minisatellites are not necessarily homologous (Lynch, 1988; Burke et al., 1991).

Under some circumstances, the fragments compared may be paralogous rather than orthologous (see Chapter 1). This is especially a problem for RE analysis of PCR products, where multiple copies and/or pseudogenes, or even non-homologous products, may be amplified by a single set of primers (e.g., Slade et al., 1993; see Chapter 7). Given the increasing number of instances of mitochondrial pseudogenes discovered in the nuclear genome (e.g., Gellisen et al., 1983; Jacobs et al., 1983; Quinn, 1992; M.F. Smith et al., 1992; Lopez et al., 1994), the orthology of putative mtDNA amplimers should be tested rather than assumed! Given the assumption that copies of mtDNA residing in the nucleus probably are incomplete, one way to ensure that the amplified products are derived from the mitochondrion is to amplify the whole genome from a subset of samples using "long PCR" (Cheng et al., 1994a,b), then use this product as a template for subsequent amplifications. Another is to use highly purified mtDNA (see below) as template, although trace contamination with nuclear sequences can make even this ineffective (M.F. Smith, personal communication).

For RE analysis, fragments of identical mobility tend to be homologous for sequences from closely related individuals and perhaps even from most intraspecific comparisons (depending upon the rate of evolution of the DNA sequence in question). However, the likelihood of convergence (that is, two samples having fragments of the same size but produced by different cleavage sites) increases as sequences become more different (Upholt, 1977). Convergent fragments are readily identified by mapping cleavage sites.

In comparing restriction sites, the investigator assumes that it is possible to identify positional homologs; however, the accuracy of restriction maps is limited by error in the measurement of fragments. Since the magnitude of measurement error is proportional to the size of the fragment in question, such errors are minimized by using small fragments when constructing cleavage maps. In addition, all sites of questionable homology should be checked using side-by-side comparisons of each sample, with the site in question isolated on as small a fragment as possible. Accuracy of cleavage maps and checks of positional homology are improved considerably by the use of polyacrylamide gels to visualize small fragments.

Comigrating homologous fragments in different individuals are assumed to represent products that are identical by descent. In practice, this assumption may not hold because of technical limitations or convergence of allelic state. For exam-
ple, DGGE and SSCP can distinguish sequences differing by a single base change (e.g., Lessa, 1992; T.A. Smith et al., 1992; reviewed in Lessa and Applebaum, 1993); however, under a given set of conditions these methods may not identify all variants (Sarkar et al., 1992; Fan et al., 1993; Norman et al., 1994). For minisatellite loci, the identification of separate alleles is limited by the accuracy of estimating fragment size and variation in mobility among lanes on a gel, and alleles of similar mobility are usually pooled for analysis (Budowle et al., 1991; resolution can be improved using internal size markers in each lane, see Burke et al., 1991; Taggart and Ferguson, 1994). Convergence of alleles at VNTR loci could also occur if the mutation rate is high and the number of possible character states is finite (Valdés et al., 1993). However, these parameters are largely undetermined. FitzSimmons et al. (1995) found several insertion-deletion events in the sequences flanking homologous microsatellite loci from different species, so that alleles of the same length from these species would not be identical by descent.

Precisely mapped (and therefore homologous) RE cleavage sites also may be convergent because of multiple base substitutions within the recognition sequence (Upholt, 1977). However, there is some debate about the level of sequence divergence at which this occurs (Templeton, 1983a; Nei and Tajima, 1985). This level may vary among taxonomic groups and is likely to depend on factors such as mutation rates and base composition. The probability of convergent site loss is far greater than that of convergent site gain because a site loss is caused by any point mutation within a cleavage site whereas a site gain requires a specific base substitution at a particular base pair (Templeton, 1983a; DeBry and Slade, 1985; W.-H. Li, 1986). These inequalities should be considered when restriction site data are used for phylogenetic analysis (see Chapter 11).

Conversely, fragments of different mobility may actually be homologous. For example, SSCP may represent one sequence as two or more mobility variants, suggesting alternate stable conformations for the same sequence (Hayashi, 1991a). G. Singh et al. (1987) demonstrated that changes in mobility of some human mtDNA fragments were due to base substitutions affecting the degree of bending of DNA rather than changes in the flanking restriction sites being assayed. Such changes will affect the fragment patterns produced by all REs that cleave on either side of the conformation mutation and can be misinterpreted. Other changes, such as sequence rearrangements (e.g., Palmer et al., 1985; Jansen and Palmer, 1987a), duplications (e.g., Moritz and Brown, 1987; Moritz, 1991a; Zevering et al., 1991) and minor length variants (e.g., Cann and Wilson, 1983; Densmore et al., 1985; Palmer et al., 1985) can also cause change in mobility of fragments defined by homologous restriction sites. Misinterpreting such changes as gain/loss of cleavage sites leads to gross errors in subsequent analyses (e.g., Hugall et al., 1994). Therefore, it is essential to establish fragment homology by mapping cleavage sites.

Comparison of the Primary Methods

The choice of technique should be based upon the type of variation and gene(s) appropriate to the problem (see the section on "Application and Limitations" and Chapter 12). The choices to be made among techniques relate to (1) method for isolation of DNA, (2) selection of restriction enzymes (where appropriate), (3) medium for electrophoresis, and (4) method used to detect fragments (Figure 4). Again, one approach is not exclusive of another; different approaches can be combined as needed.

Method of DNA Preparation

The optimal method of DNA preparation depends on the type of tissue available, sequences to be assayed, and whether or not PCR will be used. However, even for PCR, there is no substitute for highly purified DNA of known quality and concentration for initial experiments (see Chapter 7). For multilocus minisatellite analyses, DNA with consistently high quality and uniform concentration is critical (Bruford et al., 1992). Preparations of total cellular DNA can be used for analysis of any sequences, and in many cases (e.g., in practically all current studies of plant molecular systematics) are preferred for reasons relating to yield, flexibility, and adaptability (Palmer et al., 1988b). For hybridization methods or PCR ap-
[Figure 4. Flow chart comparing the primary methods by treatment, electrophoretic conditions, type of data (RFLPs and RE site maps, PCR-RFLPs, DGGE, microsatellites, SSCPs), and sensitivity.]
times be too high to visualize DNA via end-labeling. Maximum purity can be achieved using neutral CsCl equilibrium gradients with intercalating dyes such as ethidium bromide or propidium iodide. Invertebrate and fungal mtDNAs and some algal cpDNAs (but not vertebrate mtDNAs or plant cpDNAs) typically have a strong bias toward A and T (G.M. Brown, 1985; Palmer, 1985b; Wolstenholme, 1992) and can be separated from unbiased or GC-rich DNA in neutral CsCl gradients, especially with the addition of dyes such as bisbenzimide (Hoechst 33258), which preferentially bind AT-rich regions (Hudspeth et al., 1980; Gargouri, 1989). Care also must be taken to avoid copurification with nuclear satellite sequences (e.g., Arnason and Widegren, 1984).

The time and expense involved in obtaining highly purified organellar DNAs by ultracentrifugation is considerable, but justified for whole genome RFLP analysis in which fragments are to be detected by staining or end-labeling (e.g., animal mtDNA). Numerous alternatives to purification via CsCl gradients have been proposed (e.g., Chapman and Powers, 1984; Powell and Zuniga, 1983; Palva and Palva, 1985; C.S. Jones et al., 1988; DeSalle et al., 1993). These generally are cheaper and faster and seem adequate for particular tissues or organisms and for detection methods of low sensitivity. However, they are not as generally applicable as CsCl banding, and any nuclear contamination can lead to misinterpretation of fragment patterns, particularly where highly repeated nuclear sequences are common (see "Interpretation and Troubleshooting"). If such alternatives are to be used, we recommend that mtDNA be extracted via CsCl gradients from a subset of samples to act as nDNA-free controls. A promising new development, long PCR, can efficiently amplify segments of DNA >20 kb (W.M. Barnes, 1994) and allows the purification of whole mtDNA genomes by gene amplification (Cheng et al., 1994b). This could prove to be a practical and efficient alternative to the use of CsCl gradients.

Choice of Restriction Enzymes

Depending on the application, REs are selected on the basis of how many sites they are likely to recognize, the range of salt conditions under which they are active (broad for double-digest mapping), the types of ends produced (5' overhang or blunt for end-labeling), and cost (Table 3). In general, REs that cleave at 4-bp sites will cleave more often than those that cleave at 6-bp sequences. The base content of the recognition sequences is also important: REs with GC-rich recognition sequences will cleave in fewer places in sequences that have low GC content (e.g., Table 3). Sensitivity to methylation also may be relevant where a mixed sample of nDNA and organellar DNA is analyzed. Palmer (1986a) lists methylation-sensitive REs that cut plant nDNA rarely, but cpDNA sufficiently, to permit RFLP analysis. Analysis of large sequences such as entire cpDNAs (~150 kb) typically uses 6-bp-recognizing REs that produce fragments that mostly range from 1-5 kb, although REs that produce more, smaller fragments can be used to compare closely related sequences. Closely related animal mtDNAs can be compared using 4-bp-recognizing REs (e.g., W.M. Brown, 1980; Moritz, 1991b; Dowling and Childs, 1992; Dowling and Brown, 1993), but beyond ~2% sequence divergence it becomes too difficult to identify individual gains or losses of cleavage sites (but see Hillis et al., 1992 for a different approach) and mtDNAs are best analyzed by mapping cleavage sites for 5- or 6-bp-recognizing REs.

Even for large sequences or entire genomes, it has been recognized that restriction enzymes vary in their efficiency for generating RFLPs (Whitkus et al., 1994). For example, RFLP studies of low-copy number anonymous nuclear sequences in plants suggest that larger restriction fragments are more likely to be polymorphic than smaller fragments due to the high frequency of insertions/deletions observed at these loci (McCouch et al., 1988; Miller and Tanksley, 1990a). Thus, enzymes that cut less frequently, and therefore generate larger fragments, tend to detect more fragment length polymorphism. For analysis of multilocus minisatellites, several REs, usually 4-bp-recognizing enzymes (e.g., HinfI, HaeIII, MboI) are tested to find those that reveal polymorphism in fragments of appropriate size (Bruford et al., 1992).

The above discussion assumes that REs are selected at random. This is efficient for the analysis of large sequences with approximately even base content, in which case most enzymes are likely to cleave one or more times. However, di-
Table 3 (continued). Properties of several commonly used restriction endonucleases(a)

                                      Number of sites
Enzyme   Recognition sequence   Tobacco cpDNA(b)   Human mtDNA(c)   Honeybee mtDNA(d)
PvuII    CAG↓CTG                       12                 1                 1
RsaI     GT↓AC                        286                35                 9
SacI     GAGCT↓C                       21                 2                 0
SacII    CCGC↓GG                        7                 2                 0
SalI     G↓TCGAC                       11                 0                 0
ScrFI    CC↓NGG                       239                22                 5
SmaI     CCC↓GGG                       16                 0                 0
SpeI     A↓CTAGT                       28                 9                 1
SspI     AAT↓ATT                      137                11                95
StuI     AGG↓CCT                       16                13                 0
StyI     C↓C(A/T)(A/T)GG              100                22                 0
TaqI     T↓CGA                        639                29                30
XbaI     T↓CTAGA                       49                 5                 3
XhoI     C↓TCGAG                       24                 1                 1

(a) Best digestion is achieved using buffers and conditions supplied by the manufacturer. For the recognition sequence, the location of cuts is indicated by an arrow (↓) and the bases filled in by end-labeling are underlined. "NA" indicates that the information was not available.
(b) Number of tobacco cpDNA fragments produced (= number of cleavage sites minus one inverted repeat segment, plus one). Data from Shinozaki et al., 1986.
(c) Number of human mtDNA fragments produced (= number of cleavage sites). Data from S. Anderson et al., 1981.
(d) Number of honeybee mtDNA fragments produced (= number of cleavage sites). Data from Crozier and Crozier, 1993.
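The dependence of cleavage frequency on recognition-sequence length and base composition, which the counts in Table 3 illustrate, can be sketched numerically. The following is a rough approximation only: it assumes bases occur independently, that G and C (and A and T) are equally frequent, and that the sequence is long relative to the site. The function names and the 150-kb length are illustrative choices, not values or methods taken from the table.

```python
def site_probability(recognition: str, gc: float = 0.5) -> float:
    """Probability that a random position starts a match to the site.

    Assumptions: independent bases, p(G) = p(C) = gc / 2 and
    p(A) = p(T) = (1 - gc) / 2. 'N' matches any base and 'W' matches A or T.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must lie between 0 and 1")
    per_base = {
        "G": gc / 2.0,
        "C": gc / 2.0,
        "A": (1.0 - gc) / 2.0,
        "T": (1.0 - gc) / 2.0,
        "N": 1.0,        # any base
        "W": 1.0 - gc,   # A or T
    }
    probability = 1.0
    for base in recognition.upper():
        probability *= per_base[base]
    return probability


def expected_sites(recognition: str, length_bp: int, gc: float = 0.5) -> float:
    """Expected number of sites in a sequence of length_bp (edge effects ignored)."""
    return length_bp * site_probability(recognition, gc)


if __name__ == "__main__":
    length = 150_000  # illustrative length, roughly that of a chloroplast genome
    for name, site in [("TaqI", "TCGA"), ("SacI", "GAGCTC"),
                       ("SmaI", "CCCGGG"), ("SspI", "AATATT")]:
        even = expected_sites(site, length, gc=0.5)
        at_rich = expected_sites(site, length, gc=0.3)
        print(f"{name:5s} {site:7s} even base content: {even:8.1f}   30% GC: {at_rich:8.1f}")
```

Under these assumptions a 4-bp site is expected once every 256 bp and a 6-bp site once every 4,096 bp when base content is even, and a GC-rich site such as CCCGGG becomes much rarer than an AT-rich site such as AATATT as GC content falls. Observed counts will depart from such expectations because real sequences are not random.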
gestion of PCR-amplified segments with randomly selected REs is usually inefficient because the sequence is relatively short (i.e., <2 kb) and most enzymes either do not cleave or have uniform fragment patterns (e.g., Karl and Avise, 1993). In this case, Slade et al. (1993) recommended "targeted" digestion (i.e., using restriction enzymes shown from preliminary sequence data to have multiple or variable cleavage sites).

It is also possible to create restriction sites to assay specific nucleotide variants that diagnose two alleles or phylogenetic lineages previously identified by DNA sequencing. This makes use of specifically designed mismatch primers to alter the sequence of the PCR product just 5' of the variable nucleotide site such that one of the two alleles now has a restriction site (C. Strobeck and J. Sved, personal communication; e.g., Figure 5A). This approach requires an additional amplification for each allele to be tested, but greatly extends the use of REs to screen large population samples for alleles revealed by sequencing.

Electrophoresis of RE Fragments and VNTR Loci

Fragments may be sorted according to size by electrophoresis through agarose gels, polyacrylamide gels, or both. The range over which the size of fragments can be estimated accurately varies with gel concentration and buffer. Agarose gels are commonly used at a concentration of between 0.6% and 2.0% agarose w/v, using TAE or TBE buffers, and these gels allow accurate estimation of fragment sizes over the range 300 bp to 20 kb. TAE provides better buffering capacity and allows better separation of large fragments, but poorer resolution of small fragments. Low (e.g., 0.8%) concentration agarose gels and long run times are used for analysis of multilocus minisatellites to
maximize resolution (Bruford et al., 1992). Polyacrylamide gels typically are composed of between 3.5% and 6.0% acrylamide, and provide for analysis of fragments ranging from 10 bp to 1 kb.

Agarose gels are simpler to prepare than polyacrylamide gels and may be run horizontally or vertically. Horizontal gels are easier to prepare but are thicker than vertical gels. This is an advantage where large amounts of DNA are to be loaded per lane (e.g., when staining with intercalating dyes or using hybridization methods), because the DNA concentration at the gel interface should not exceed 1 µg/mm². However, the thinner (1 mm) vertical gels are easier to dry onto filter paper for autoradiography of end-labeled fragments. Only polyacrylamide gels allow high resolution of very small fragments (<0.2 kb) and are used in conjunction with a variety of detection techniques, such as end-labeling (W.M. Brown, 1980) and some hybridization methods (Kreitman and Aguade, 1986). The difference between the resolving power of agarose and polyacrylamide gels is particularly evident where small products generated by digestion of PCR products are visualized by staining (Figure 5). However, polyacrylamide gels must be handled with caution, as unpolymerized acrylamide is a potent neurotoxin.

Figure 5. Use of a mismatch primer (A) to develop RFLP assays to diagnose variable nucleotide sites (B). The primer, which extends from the ATT to the left, creates a new AseI site in the Ca allele but not the Cb allele, thereby providing the basis for the rapid diagnostic test shown on the acrylamide gel. (C) An example of high resolution separation of small fragments produced by digesting an ~340-bp fragment of marine turtle control region with MseI (see Norman et al., 1994).

For some methods that produce a wide range of fragment sizes (e.g., end-labeling of animal mtDNA RFLPs), it is useful to run each sample on both types of gel. Use of both agarose and polyacrylamide gels produces extremely accurate cleavage site maps and ensures that all fragments are visualized; it also can improve the resolution of fragments in the region of overlap (~1.5-0.4 kb for 1.2% agarose and 3.5% polyacrylamide) and may reveal conformation-induced changes in fragment mobility, as these are restricted to the polyacrylamide gels (G. Singh et al., 1987).

For the analysis of microsatellites, amplification products typically are separated on high percentage (e.g., 6%) denaturing acrylamide gels, as are used for sequencing (Chapter 4). This enables reliable separation of alleles that differ by as little as 2 bp in fragments of 100-400 bp and eliminates artifacts that are due to differences in conformation. Typically, products of sequencing reactions from some DNA (e.g., M13) are included on each gel to provide precise measures of allele length (Figure ?A). Thus, microsatellite alleles can be scored by their absolute length, thereby avoiding the practical limitations of methods based on relative migration (e.g., DGGE, SSCPs, minisatellites; see Chapter 2). A further advantage of microsatellites for large-scale studies is that their electrophoresis and detection can be greatly simplified using multiplexing and automated DNA sequencing technology. For this purpose, primers are resynthesized with a fluorescent dye attached; this allows different microsatellite loci to be analyzed simultaneously using different dyes (Schwengel et al., 1994).

Heteroduplexes, SSCP, and DGGE

These methods, which are most appropriate to small (<600 bp) PCR products, vary in complex-
ity and sensitivity (Table 4; see also Chapter 9, Protocol 20). Heteroduplex and SSCP analyses are technically straightforward, involving electrophoresis of PCR products through native polyacrylamide gels to test for variation in conformation. Heteroduplexes can be formed by coamplification of different alleles (e.g., a standard vs. variant mtDNA) or by denaturing and renaturing mixed PCR products. According to Hayashi (1991a,b), the separation of alleles by SSCP is sensitive to variations in temperature and buffer conditions and is enhanced by the addition of glycerol and the use of gels with low crosslinking (i.e., low bisacrylamide/acrylamide ratio). Fan et al. (1993) found that resolution was also reduced when conditions were less than optimal, with transitions more difficult to detect than transversions. To optimize resolution, gels should be cooled and run at low power (10 W), with a minimum migration of the smallest fragments of 16-18 cm.

Table 4. Comparative resolving power of heteroduplex analysis, SSCP, and DGGE methods for detecting different DNA sequences
Method    Fragment size    Medium    Sensitivity(a)

DGGE is a specialized form of electrophoresis that uses polyacrylamide gels with gradients of denaturants (e.g., urea) to detect differences in the stability of PCR products (reviewed by Myers et al., 1989; Lessa, 1993). The double-stranded product moves through the gel until it reaches a urea concentration where it begins to become single-stranded, at which point migration is retarded. Sequences that differ by as little as a single base substitution begin to denature at a different point within the gradient, resulting in different mobilities at the completion of electrophoresis (Figure 6). This technique is most efficient for fragments of 400 bp and is enhanced by the use of a primer with a 40-bp GC-clamp attached (Myers et al., 1989; Sheffield et al., 1992) and by heteroduplex formation. The GC-clamp converts the sequence into a single domain with respect to denaturing properties and holds the denatured strands together for staining. Myers et al. (1989) also suggested that longer PCR products be digested prior to DGGE, although in this case it would not be possible to attach the GC-clamp to all fragments. Considerable care is needed to ensure that the denaturation gradient (whether temperature or urea) is optimized for the alleles concerned. The appropriate conditions can be determined empirically using a parallel gradient prior to loading multiple samples on a perpendicular gradient (Figure 6). More details on the nuances of DGGE are available in reviews by Myers et al. (1989) and Lessa (1993).

The power of these methods lies in screening large numbers of individuals for known or novel sequence variants. Although they do not reveal the location of the differences in the PCR product, use of these techniques in combination with limited sequencing provides efficient and sensitive assays of genetic polymorphism in natural populations (e.g., Hayashi, 1991b; Lessa, 1992; Norman et al., 1994). Lessa (1992) was able to reamplify and sequence fragments extracted from a DGGE polyacrylamide gel; thus, DGGE can be used to purify an allele for direct sequencing, an important application for rare alleles that are not homozygous among the samples analyzed.
beled primers (rather than incorporation of labeled dNTPs) gives better results because of the higher specific activity and the absence of base effects (Hayashi, 1991a).

End-labeling of fragments produced by restriction enzymes involves adding α-labeled 32P (or 35S) nucleotides (dNTPs) to the ends produced by cleavage with REs and, again, can only be applied effectively to highly purified sequences. Because each fragment has the same number of ends, this technique has the advantage that fragment intensity is independent of size (i.e., a 10-bp fragment should be as intense as a 10-kb fragment). This method also is highly sensitive: end-labeled fragments of any size can be visualized from 1-5 ng of digested DNA (Figure 4). However, end-labeling of RE-digested PCR products (with all four α-32P-dNTPs) tends to reveal a large number of spurious amplification products not visible in EB-stained gels, making interpretation difficult. We suspect, but have not demonstrated, that these represent incomplete amplification products with long single-stranded stretches that make ideal templates for polymerization.

With end-labeling, it is preferable to use REs that produce 5' overhangs or blunt ends (Table 2). The large (Klenow) fragment of E. coli polymerase has both polymerase and 3' exonuclease functions. The polymerase will add radioactive nucleotides using the 5' overhang as a template. The 3' exonuclease can convert blunt ends or 3' overhangs to 5' overhangs but is relatively inefficient. Ends can be labeled with any α-32P- or 35S-dNTP so long as the nucleotide occurs among the bases to be inserted (Table 3). However, if several different REs are being used, it is most efficient to use all four 32P- or 35S-dNTPs. This makes end-labeling a relatively expensive approach unless there is economy of scale.

Transfer hybridization (Southern, 1975) has two basic steps. First, a membrane-bound replica of the gel is made by transferring electrophoretically separated fragments to a nylon or nitrocellulose membrane. Second, labeled single-stranded DNA probe is allowed to hybridize with complementary membrane-bound sequences. The probe usually is labeled with radioactive (32P or 35S) dNTPs by nick-translation (Rigby et al., 1977) or random priming (Feinberg and Vogelstein, 1983), although non-radioactive methods also have been developed (e.g., biotin-streptavidin, J.J. Leary et al., 1983; AMPPD and alkaline phosphatase, Cate et al., 1991; Bronstein et al., 1991). The labeled strands will hybridize to complementary membrane-bound single-stranded sequences, allowing them to be detected by exposure to film (radioactive probes, chemoluminescence) or staining (biotin probes). The amount of base pair mismatch permitted (stringency) can be controlled by varying temperature, salt, and formamide concentration (Sambrook et al., 1989).

This approach has many useful properties. Any sequence for which there is a probe can be analyzed from heterogeneous (total cellular) DNA. Thus, hybridization is the only practical approach where the sequence cannot be readily purified, either directly or by amplification. The method is highly sensitive, allowing detection of picogram (pg) quantities of a fragment. It also is an efficient approach to assaying multiple sequences (e.g., Quinn and White, 1987a; Sites and Davis, 1989), making it an especially valuable tool for gene mapping (e.g., Palmer et al., 1988b) or sequential analysis of single-locus minisatellites (Bruford et al., 1992). Multiple (>10) probes can be applied sequentially to membrane-bound DNA by dissociating the probe and target strands under conditions in which the latter remain attached to the filter.

Transfer hybridization does have some pitfalls. Using standard methods it is difficult to detect fragments smaller than 250 bp, making the technique less able to detect the gain or loss of closely spaced cleavage sites and limiting the detection of variants at VNTR loci. However, Kreitman and Aguade (1986) used electrophoretic transfer of digested DNA from denaturing polyacrylamide gels, combined with high-sensitivity hybridization conditions (Church and Gilbert, 1984), to detect fragments smaller than 100 bp.

A second potential pitfall of hybridization is the danger of detecting fragments other than those from the sequence to be assayed (e.g., paralogous genes). In most cases, this problem can be minimized by hybridizing at the highest possible stringency and using cloned probes. For organelle sequences, this problem can arise because of the
Table 5. Evolutionary properties of different genomes and lineages(a)

Genome    Lineage       Inheritance   Point mutations   Size range          Rearrangements
mtDNA     Animals       Maternal      High              14- >30 kb          Very rare
mtDNA     Land plants   Maternal      Low               200-2,500 kb        Very frequent
mtDNA     Fungi         All(b)        Low               20-200 kb           Frequent
cpDNA     Land plants   All(b)        Low               120-217 kb          Rare
nDNA(c)   Animals       Biparental    Moderate          1-1000 × 10^5 kb    Frequent
nDNA(c)   Land plants   Biparental    Moderate          1-1000 × 10^5 kb    Frequent
nDNA(c)   Fungi         Biparental    Not known         0.1-10 × 10^5 kb    Frequent

(a) Data summarized from Cavalier-Smith, 1985; Palmer, 1985a,b; Moritz et al., 1987; Wolfe et al., 1987, 1989a,b; Palmer and Herbon, 1988; and Neale and Sederoff, 1988. For properties of hypervariable minisatellite and microsatellite loci, see text.
(b) The term "all" refers to maternal, paternal, and biparental modes of inheritance.
(c) nDNA = single-copy nuclear loci.
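The inheritance column of Table 5 bears directly on how many gene copies are effectively transmitted each generation. A minimal sketch of that comparison follows, under idealized assumptions: a diploid nuclear locus with the standard two-sex effective size Ne = 4NfNm/(Nf + Nm), and a strictly maternally inherited, effectively haploid organelle genome with no heteroplasmy. The function names are hypothetical.

```python
def nuclear_gene_copies(n_females: int, n_males: int) -> float:
    """Effective number of gene copies at a diploid, biparentally inherited locus.

    Uses the standard two-sex effective size Ne = 4*Nf*Nm / (Nf + Nm);
    each of the Ne individuals carries two copies.
    """
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both sexes must be present")
    ne = 4.0 * n_females * n_males / (n_females + n_males)
    return 2.0 * ne


def maternal_gene_copies(n_females: int) -> float:
    """Effective number of copies of a strictly maternally inherited,
    effectively haploid genome: one per breeding female (idealized)."""
    if n_females <= 0:
        raise ValueError("at least one female is required")
    return float(n_females)


def nuclear_to_organelle_ratio(n_females: int, n_males: int) -> float:
    """Ratio of nuclear to maternally inherited effective gene copies."""
    return nuclear_gene_copies(n_females, n_males) / maternal_gene_copies(n_females)


if __name__ == "__main__":
    # Equal numbers of breeding males and females.
    print(nuclear_to_organelle_ratio(50, 50))
    # Strongly female-biased breeding population.
    print(nuclear_to_organelle_ratio(90, 10))
```

With equal numbers of males and females the ratio is four; with a skewed breeding sex ratio it can differ substantially, which is one reason comparisons of uniparentally and biparentally inherited markers are informative.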
both cpDNA and mtDNA have relatively low substitution rates (Palmer, 1985a). In contrast, the very high mutation rate of VNTRs (Table 1) makes these ideal for studies of mating systems and structure of closely related populations (Queller et al., 1993), but is expected to confound phylogenetic information as evolutionary distance increases.

The mode of inheritance has a strong effect on the dynamics of genetic variation within species and on the inferences that can be drawn. The inheritance of mtDNA and cpDNA is usually uniparental and effectively haploid. This results in a four-fold reduction in the effective number of genes when males and females are equally frequent (Birky et al., 1989), which increases the effect of drift and the rate of turnover within populations (Avise et al., 1984, 1988). This property is valuable for analyses of population structure in that it tends to increase the proportion of variation distributed among populations and, together with a high mutation rate, provides for more rapid sorting of ancestral alleles within and between species (e.g., Slade et al., 1994). Comparisons of genetic markers with uniparental versus biparental inheritance also can be used to detect differences in behavior between the two sexes (e.g., M.L. Arnold et al., 1991; Karl et al., 1992a). Dominant versus codominant inheritance also is an important consideration with regard to the choice of sequences. Sequences with biparental codominant inheritance (e.g., low-copy-number nuclear RFLPs, VNTRs) often display multiple alleles per locus and heterozygotes are easily identified. In contrast, RAPD loci are usually biallelic, with typically dominant inheritance patterns (J.G.K. Williams et al., 1990). Thus, RAPDs provide less genetic information on a per locus basis than codominant loci when applied to questions of population genetic structure (Lynch and Milligan, 1994), paternity (Lewis and Snow, 1992), outcrossing rates (Fritsch and Rieseberg, 1992), or hybridization (Rieseberg and Ellstrand, 1993).

Another mode of inheritance is shown by some repeated sequences: this is concerted evolution, the tendency for copies of such sequences to become homogeneous, first among gene copies within genomes and then among individuals within populations (Zimmer et al., 1988; Hancock and Dover, 1990; reviewed by Arnheim, 1983). If sufficiently rapid, concerted evolution could enhance among-population variation (by reducing within-population variation), making these genes especially useful for defining population boundaries and potentially simplifying sampling for phylogenetic studies (Hillis and Davis, 1988). However, the same process would make these sequences inappropriate for quantifying long-term gene flow, effective population size, and deviations from Hardy-Weinberg equilibrium.
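The four-fold reduction in the effective number of genes for uniparentally inherited, effectively haploid genomes can be checked with a little arithmetic. The Python sketch below is illustrative only: it assumes an idealized population, the standard two-sex effective size Ne = 4NmNf/(Nm + Nf) for autosomal loci, and strictly maternal transmission with no heteroplasmy. The function names are hypothetical and do not come from the works cited above.

```python
def nuclear_gene_copies(n_males: float, n_females: float) -> float:
    """Effective number of autosomal gene copies, 2 * Ne.

    Assumes the standard two-sex effective size
    Ne = 4 * Nm * Nf / (Nm + Nf) for an idealized population.
    """
    ne = 4.0 * n_males * n_females / (n_males + n_females)
    return 2.0 * ne


def maternal_organelle_gene_copies(n_females: float) -> float:
    """Effective number of copies of a strictly maternal, effectively
    haploid genome: one transmitted copy per breeding female (assumption)."""
    return float(n_females)


def reduction_factor(n_males: float, n_females: float) -> float:
    """Ratio of nuclear to maternally inherited effective gene copies."""
    return nuclear_gene_copies(n_males, n_females) / maternal_organelle_gene_copies(n_females)


if __name__ == "__main__":
    # Equal numbers of males and females: 200 nuclear copies vs 50 organelle copies.
    print(reduction_factor(50, 50))
    # A strongly female-biased breeding population changes the ratio.
    print(reduction_factor(10, 90))
```

With equal numbers of breeding males and females the ratio is 4, matching the statement in the text. The second call shows that the ratio depends on the sex ratio, so the four-fold figure should be read as the equal-sex-ratio case rather than a constant.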
The rate of sequence rearrangement also is a consideration in selecting a sequence for fragment analysis (Table 5). Rearrangements, unless carefully characterized, complicate RE analysis and can lead to gross overestimates of sequence diversity. However, once identified, the rearrangements are themselves potential sources of phylogenetic information. Most cpDNA and animal mtDNA sequences are stable in this regard, although there are exceptions (e.g., Moritz and Brown, 1987; Palmer et al., 1987, 1988a; Palmer, 1991). In contrast, plant mtDNAs (Palmer and Herbon, 1988) and, to a lesser extent, fungal mtDNAs (Bruns and Palmer, 1989) are in general unsuitable for whole-genome fragment analysis because of their high rate of structural change. Transposable elements are common in nuclear genomes, and their insertion or excision will modify fragment patterns revealed by hybridization assays (e.g., Aquadro et al., 1986), potentially complicating fragment analyses. J.G. Lawrence et al. (1989) used the location of transposons to estimate relationships among closely related strains of E. coli, but noted that transposons moved too frequently for such characters to be of use among more divergent isolates.

In the following sections, we illustrate by example some of the strengths and weaknesses of fragment analysis of the more commonly used types of sequences for studies of intraspecific variation, hybridization, and interspecific and higher level phylogeny. This is not intended to be exhaustive: the application of molecular information to problems in evolutionary biology and ecology has expanded far too rapidly to contemplate such a review.

Population-Level Comparisons

Fragment variation in cpDNA, animal mtDNA, and unique nuclear sequences has provided useful genetic markers for the analysis of variation within species. For animals, the most commonly used systems have been mtDNA and nuclear VNTR loci, especially in the form of multilocus fingerprints. For plants, cpDNA and anonymous low-copy-number nuclear sequences have been most commonly used. As the technology becomes more accessible, more researchers are examining fragment variation at individual nuclear loci, either by hybridization or through analysis of PCR products (e.g., RAPDs or microsatellites). Applications are broad and include estimating the extent of variation within and among populations, levels of gene flow and effective population size, patterns of historical demography and biogeography, and analyses of parentage and relatedness.

Animal Mitochondrial DNA

A great deal of attention has been given to RFLP analysis of animal mtDNA within and among populations (for recent reviews, see Harrison, 1989; Avise, 1994; Moritz, 1994). The popularity of mtDNA for studies of animal populations is due to a combination of its maternal, clonal inheritance, its relatively rapid rate of base substitution, and the relative ease with which it is isolated and analyzed. Much of the variation in animal mtDNA is due to base substitution, with transitions greatly outnumbering transversions (W.M. Brown et al., 1982). However, in many groups of animals length variation also is common, with differences occurring within (i.e., heteroplasmy) as well as between individuals (reviewed in Moritz et al., 1987; Rand, 1993). Significantly, the rates and patterns of mtDNA evolution vary considerably among taxa (Martin and Palumbi, 1993; Rand, 1994) as well as among genes within taxa (W.M. Brown, 1985; Adachi et al., 1993; Kondo et al., 1993; Zhu et al., 1994).

The amount and distribution of variation within versus among populations depends on population sizes and rates of gene flow, both historical and contemporary. The most common pattern, at least in terrestrial vertebrates (for marine counter examples, see Palumbi, 1992), is to have less variation within than among geographic populations, indicative of population structuring (Smouse et al., 1991; reviewed by Avise et al., 1987). Variants detected within populations typically have low levels (<1%) of sequence divergence and are best detected by digestion with 4-bp REs that sample a larger fraction of the genome. However, it is often possible to distinguish numerous mtDNA haplotypes within populations if sufficient enzymes are used to sample
Nucleic Acids III: Analysis of Fragments and Restriction Sites 269
the genome (e.g., Cann et al., 1984; Dowling and Childs, 1992; Dowling and Brown, 1993). In several studies (e.g., Avise et al., 1992a; Moritz and Heideman, 1993; Tibbets and Dowling, 1995), highly divergent mtDNA types have been found at the same location, suggesting either large long-term effective population size or immigration from a previously isolated population.

An interesting recent development is the use of the distribution of pairwise differences (e.g., the number of different restriction sites) between mtDNA alleles to make inferences about historical demography. For example, a Poisson distribution (i.e., a star-like phylogeny) may indicate a period of rapid population expansion (Slatkin and Hudson, 1991; Rogers and Harpending, 1992). Applied to humans, this method suggests rapid population growth (DiRenzo and Wilson, 1991). Applied to Pacific Ocean populations of coconut crabs, it also indicates exponential population growth from about 130,000 years ago, a remarkable result given that the species has declined to the point where it is now threatened with extinction (Lavery et al., 1996a). The coconut crab and other examples serve to emphasize that the genetic signature recorded in mtDNA phylogeny reflects historical more than contemporary population processes, particularly where population size or connectivity fluctuates (Avise, 1994; Moritz, 1995).

The tendency for variation among populations to be significantly higher than that within populations allows mtDNA to be used to estimate phylogenies of populations, and thus to investigate patterns of historical biogeography (Bermingham and Avise, 1986; Bowen et al., 1989; Moritz and Heideman, 1993; Moritz et al., 1993; Joseph and Moritz, 1994). The phylogeographic approach (Avise et al., 1987) provides a qualitative assessment of genetic population structure within species. A more quantitative approach, using fixation indices (e.g., Takahata and Palumbi, 1985; Excoffier et al., 1992) or alternative statistical methods (e.g., Slatkin and Maddison, 1989; Neigel and Avise, 1993), is needed where the variation within populations is substantial relative to that among populations. Excoffier et al. (1992) found that variation among human populations was more marked when information on divergence among alleles was included as well as their frequency and distribution; however, the converse was true for coconut crabs (Lavery et al., 1996a), suggesting that the most sensitive form of analysis will vary (see Hudson et al., 1992a).

The unique characteristics of mtDNA also result in some disadvantages. The lack of recombination makes mtDNA comparable to a single allozyme locus with many alleles. Consequently, estimates of gene diversity obtained from mtDNA are expected to exhibit a larger variance than comparable estimates made using a large number of nuclear loci. The extremely rapid evolution of some mtDNA genes can result in convergence, confounding phylogenetic relationships even within some species (e.g., Lansman et al., 1983; Dowling and Brown, 1989). This problem seems to be particularly severe for length variants where the number of character states is discrete and finite. In general, length variants should only be used as markers for the analysis of population subdivision where there is evidence for their stable inheritance, and even then with caution.

Chloroplast DNA

The chloroplast genome is conservative in most respects (Table 5; reviewed by Palmer 1985a,b; Wolfe et al., 1987, 1989b; Palmer, 1991). Although the range of cpDNA sizes among photosynthetic land plants is large (120-217 kb), most of this variation is due to a few exceptional genomes and length mutations typically are short (<1 kb) and are of restricted occurrence. The order and arrangement of chloroplast genes is nearly invariant. Most land plant cpDNAs have identical organization, and most variants that do occur stem from one or a few simple inversions. The rate of nucleotide substitutions in cpDNA appears to be less than that of animal and plant nDNAs, and much less than for animal mtDNAs (Table 5). Perhaps the most variable feature of cpDNA is its mode of inheritance, which may be strictly maternal (most angiosperms), biparental (≈20% of angiosperm species; Corriveau and Coleman, 1988; Harris and Ingram, 1991), or paternal (conifers; reviewed in Sears, 1980; Neale and Sederoff, 1988). However, even with biparental inheritance, trans-
mission is essentially clonal as recombination has not been observed in land plants.

The slow rate of change in cpDNA sequence and structure (Table 5) is reflected in the low levels of within- and among-population variation apparent from most of the early studies (Banks and Birky, 1985; D.B. Wagner et al., 1987; Neale et al., 1988). However, a number of more recent studies of flowering plant species have found substantial levels of within-population variation (reviewed in D.E. Soltis et al., 1992). In many of these cases, however, the intraspecific cpDNA variation does not appear to have arisen in situ, but instead appears to result from cpDNA introgression (reviewed in Rieseberg and Soltis, 1991). In many studies principally aimed at clarifying interspecific relationships, cpDNA polymorphisms were either absent or rare among conspecific populations (reviewed in Palmer, 1987; D.J. Crawford, 1989; D.E. Soltis et al., 1992). The ultimate utility of cpDNA as a population marker remains unclear, and is likely to vary from species to species, depending on extrinsic and intrinsic factors (i.e., the number of restriction sites and average length of restriction fragments surveyed, the age of the species and its populations and their rate of cpDNA evolution). Point mutation rates in cpDNA may vary severalfold among closely related taxa (Palmer et al., 1988b), whereas rates of rearrangements and length mutations have been shown to be highly variable, principally due to differences in the amount of short dispersed and tandem repeats, respectively (Palmer et al., 1985, 1987).

Nuclear Sequences

Sequences in the nuclear genome provide an effectively inexhaustible supply of genetic markers, if that variation can be accessed efficiently. For analyses of intraspecific variation, attention has so far focused on single-copy sequences, particularly those that are hypervariable (VNTRs), and on repeated sequences such as rDNA cistrons. These vary widely in the form and rate of mutation, which has important implications for how they are used. An important distinction is that analyses of VNTR loci provide information on allele distribution, whereas analyses of sequence variation at single-copy or repeated loci also can provide information on allele phylogeny. A significant advantage of working with identified, rather than anonymous, nuclear loci is the greater potential for examining variation at the same loci in other species and thus benefitting from the synergy between studies of molecular evolution and systematics (Chapter 1).

MINISATELLITE SEQUENCES The discovery of hypervariable minisatellite sequences and their use in DNA fingerprinting revolutionized the analysis of population-level variation, particularly the assessment of parentage, contributing to studies of sexual selection, mating behavior, and population ecology (e.g., Jeffreys et al., 1985a,b; Burke, 1989; Burke et al., 1991; Gibbs et al., 1990; Rogstad et al., 1991a; Haig et al., 1993; Wolff et al., 1994).

Minisatellites typically are analyzed by digestion with REs (which do not cleave within the tandem repeats) and transfer hybridization, using minisatellite sequences, an entire hypervariable sequence, or synthetic oligonucleotides as probes. Two distinct strategies have been applied. Core minisatellite probes have been used to reveal variation at a large number of hypervariable loci (e.g., Jeffreys et al., 1985b). The result is complex multifragment patterns that are usually unique to an individual and are extremely powerful for testing parentage when the putative parents can be lined up next to the individual in question. A major advantage of the method is that many of the probes can be applied across a wide spectrum of plants and animals. The alternative approach is to assay hypervariable loci one at a time using synthetic oligonucleotides or cloned sequences as probes (e.g., Nakamura et al., 1987; Bentzen et al., 1991; Prodohl et al., 1994; Scribner et al., 1994; Verheyen et al., 1994). However, cloning of minisatellite loci for probes is a relatively complex process (Bruford et al., 1992). This approach requires more effort than multilocus fingerprinting, and the variation at individual loci tends to be taxonomically restricted (Gray and Jeffreys, 1991; but see Hanotte et al., 1992), but it has the major advantage that alleles can be assigned to specific loci and genotypes can be identified. In some cases it is possi-
ble to amplify individual minisatellite loci by PCR (e.g., Boerwinkle et al., 1989). Indeed, Jeffreys et al. (1991) combined PCR with RE digestion to examine the fine structure of variation at a human minisatellite locus, increasing the resolution of the technique still further.

Although comparison of the multifragment patterns generated by the minisatellite probes remains a popular method for testing parentage, there are several technical and statistical difficulties with using this method (see Lynch, 1988; Burke et al., 1991; Bruford et al., 1992). These include: (1) assigning specific fragments to a particular locus and thus identifying alleles and determining genotypes; (2) potential comigration of non-homologous fragments (convergence), inflating the variance of the estimate of similarity; and (3) correlations among loci due to linkage. Nonetheless, measures of similarity and population structure have been devised (Lynch, 1990, 1991a; Reeve et al., 1992), and several studies have revealed variation in band-sharing coefficients consistent with known relationships, current population sizes, or inferred population history (e.g., Wayne et al., 1991b; D.A. Gilbert et al., 1990, 1991; Degnan, 1993a). The use of single-locus minisatellites for the study of natural populations is at an early stage but overcomes most of the above difficulties and appears very promising (e.g., May et al., 1993; Scribner et al., 1994).

Figure 7 Schematic of an approach for isolating microsatellite loci and developing primers for amplification using a basic protocol (left; see, e.g., Rassman et al., 1991) or a non-radioactive procedure that enriches cloned fragments containing repeat arrays (right; after Armour et al., 1994). [Panel titles: Basic protocol; Enrichment method]

MICROSATELLITE SEQUENCES Microsatellite assays have rapidly become established as a powerful tool for the analysis of mating systems and population structure (reviews in Queller et al., 1993; Bruford and Wayne, 1993; Schlotterer and Pemberton, 1994). Positive features include: (1) high variability (e.g., Table 1), even in species lacking polymorphism at allozyme loci (Hughes and Queller, 1993); (2) the ability to score codominant genotypes with exact allele sizes; and (3) access via PCR, making it possible to work from extinct as well as extant populations (e.g., A.C. Taylor et al., 1994; Roy et al., 1994). One drawback is the need to develop new sets of primers for each group of species, although this may be less of a problem than for minisatellites (see Chapter 9, protocol 26). Conservation of polymorphism at microsatellite loci has been reported across cetaceans, which diverged 10 million years ago (Schlotterer et al., 1991); marine turtles, which separated up to 150 million years ago (FitzSimmons et al., 1995); and divergent taxa of birds (Hanotte et al., 1994). In any case, the methods for developing microsatellite primers from random, size-selected clones are reasonably straightforward (Figure 7; Rassman et al., 1991; Armour et al., 1994; see also Chapter 9). The
most frequent microsatellites in mammals are AC-TG repeats, followed closely in number by AG-TC repeats (Morgante and Oliveri, 1993). In contrast, repeats of the AT type are by far the most common in plants, with repeats of the AC-TG type occurring very infrequently (Morgante and Oliveri, 1993).

Most applications to date have concerned analysis of mating systems (e.g., Amos et al., 1993; Morin et al., 1994), where microsatellites are particularly powerful because of the large number of genotypes observed. Analyses of microsatellite variation among populations are just beginning to appear (e.g., Edwards et al., 1992; Roy et al., 1994; Bowcock et al., 1994; Paetkau and Strobeck, 1994) and, despite concerns over the impact of convergence and selection, results seem promising. One interesting application is to examine differences among populations for a Y-chromosome-specific microsatellite, permitting direct analysis of male-mediated gene flow (Santos et al., 1993). Much more needs to be learned about the dynamics of microsatellite variation within and between species before their potential can be fully realized (Bruford and Wayne, 1993; Valdés et al., 1993). Nonetheless, microsatellites provide an alternative to allozymes for some, but not all, applications where information on allele frequencies at nuclear loci is needed (see also Chapter 12).

One limitation peculiar to microsatellite analysis is that the use of primers developed in one species on other species may bias estimates of allelic diversity, with the species from which the primers were designed having higher values than "non-source" species (Bruford and Wayne, 1993). Clones are selected on the basis of having long uninterrupted arrays of repeats (typically >12 uninterrupted repeats; Figure 7) because of the expectation that these are the most likely to be polymorphic (Weber, 1990). In other species, arrays at some of these loci are likely to be shortened or interrupted, presumably reducing the mutation rate and thus the diversity, irrespective of population size. The extent of this bias remains to be determined. The effect was evident in cross-species amplifications of marine turtles (FitzSimmons et al., 1995), but not in wombats (A.C. Taylor et al., 1994).

IDENTIFIED SINGLE-LOCUS SEQUENCES RFLP analysis has been used to examine variation within and among populations for genes of known function [e.g., alcohol dehydrogenase (Aquadro et al., 1986; Kreitman and Aguade, 1986; G.M. Simmons et al., 1989) and several other genes (Begun and Aquadro, 1993) in Drosophila]. These studies typically have revealed a wealth of polymorphism and have provided information concerning the evolutionary history and diversity of populations, the action of selection and drift on the sequences concerned, or both. Similarly, RFLP analyses of major histocompatibility (MHC) loci have provided important insights into population processes and molecular evolution (reviewed by Slade, 1992). Other fragment separation methods, such as heteroduplex analysis, SSCP, and DGGE are being widely used to screen for diagnostic polymorphisms in several genes associated with human diseases (reviewed by Grompe, 1993) and to a more limited extent in surveys of natural populations (Lessa, 1992; Aguade et al., 1994).

Because they are expected to be more variable than coding regions, there have been several studies of variation in introns using PCR primers located in conserved regions of flanking exons (Lessa, 1992; Slade et al., 1993; Palumbi and Baker, 1994). This is a very promising approach because the design of primers that can be used across taxonomically divergent species (see Chapter 7) will promote use of the same loci in different species. Not surprisingly, nuclear introns tend to be less variable than mtDNA, so that digestion of PCR-amplified introns with randomly selected REs is relatively inefficient. Thus, studies to date have combined sequencing with DGGE (Lessa, 1992) or targeted digestion (Slade et al., 1993; Palumbi and Baker, 1994).

ANONYMOUS SINGLE-COPY SEQUENCES Randomly cloned low-copy-number sequences have been examined for RFLPs via transfer hybridization (e.g., Quinn and White, 1987a; Degnan, 1993b) and have been shown to provide useful information on patterns of intraspecific polymorphism. Probes for RFLP analysis of anonymous nuclear loci typically are obtained from either cDNA or genomic libraries. From these libraries, clones are selected that hybridize to single or low-copy-
number sequences. In practice, this requires selecting clones that hybridize to only one or two restriction fragments. There appears to be no correlation between clone size and polymorphism (McCouch et al., 1988; Miller and Tanksley, 1990a), but cDNA clones typically reveal considerably more polymorphism than genomic clones, regardless of the enzyme used to generate the library (Landry et al., 1987; McCouch et al., 1988; Miller and Tanksley, 1990a).

Perhaps the most common application of RFLP analyses of single-copy nuclear loci has been to assess patterns of relationships among populations or accessions of cultivated plants and their wild relatives (Kochert et al., 1991; Aldrich and Doebley, 1992; Neuhaussen, 1992; Liu and Furnier, 1993; reviewed in Whitkus et al., 1994). Typically, populations or accessions are examined for the presence/absence or frequency of RFLP "alleles" and then subjected to phylogenetic analysis. Although this approach generally provides more resolution than isozyme analysis, these RFLP studies often suffer from inadequate population sampling. Moreover, in many instances the genetic basis of the RFLP variation is poorly understood. Thus, genetic alleles represented by more than one fragment may be scored more than once, thereby biasing estimates of relationships. Nonetheless, if correctly used, RFLP analysis of low-copy-number anonymous nuclear loci can be a powerful tool for intraspecific systematics. This is aptly demonstrated by Aldrich and Doebley (1992), who use nuclear RFLP data to support the origin of domesticated sorghum from wild sorghum of central-northeastern Africa. Other applications include cultivar and individual identification (Smith and Smith, 1991; Vaccino et al., 1993), estimation of levels and partitioning of genetic diversity (Aldrich and Doebley, 1992), and parentage determination (Quinn and White, 1987b).

Karl and Avise (1993) modified this approach for PCR, developing primers for random single-copy clones and screening populations for RFLPs in the amplification products. This technique has provided significant insights into marine turtle population structure (Karl et al., 1992) and patterns of genetic differentiation in oysters (Karl and Avise, 1992), but even proponents regard it as inefficient. Of 15 primer pairs developed for turtles, only seven reliably amplified products of the expected size and only five were polymorphic.

Genetic studies of many species are often limited by the quantity and quality of tissue available for analysis, and by the number of variable loci that can be assayed in a cost-effective manner. PCR-generated RAPDs have proven effective for efficiently surveying numerous polymorphic loci from small amounts of tissue. However, as discussed above, RAPDs have significant limitations as well. Although some of these difficulties can be overcome with appropriate experimental design (such as that shown in Figure 8), the intrinsic technical and conceptual limitations of RAPDs have caused many to have substantial reservations about their use: in some instances, this information might be obtained more reliably using other sets of markers (e.g., microsatellites or allozymes).

Nonetheless, with these caveats and at the appropriate taxonomic level, RAPDs can be a powerful tool for studies of the genetics, systematics, and ecology of populations. By far the most common use of RAPDs has been to identify and discriminate among individuals, cultivars, varieties, and species (M.L. Smith et al., 1992; Fani et al., 1993; Mailer et al., 1994; Hsiao and Rieseberg, 1994). For example, Smith et al. used RAPDs to identify the spatial distribution and size of one of the largest and oldest living organisms, an individual of Armillaria bulbosa. This represents an ideal use of RAPDs, because reproducible differences in RAPD phenotypes are all that is required to identify and differentiate clonal genotypes; knowledge of the genetic basis of the RAPD phenotypes is not essential unless relationships among clones must be ascertained. Another common use of RAPD data has been to describe patterns of relationships among populations or accessions of cultivated plants and their wild relatives (Liu and Furnier, 1993; Adams et al., 1993; Yu and Nguyen, 1994). To date, too many of these studies have been limited by inadequate sample sizes and inadequate knowledge of fragment heritability and homology.

RAPDs also have been used successfully for estimation of parentage, contributing to studies of reproductive biology in both plants and animals
[Figure (flowchart of steps): Search for polymorphism; Determine mode of inheritance (if possible); Conduct survey; Verify homology; Analyze data]
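Because RAPD bands are typically dominant, a band-present individual may be either homozygous or heterozygous for the amplifiable allele, which is why a dominant locus carries less information than a codominant one. The Python sketch below illustrates this with the simplest square-root estimator of the null (band-absent) allele frequency. It assumes a biallelic locus in Hardy-Weinberg equilibrium, is not necessarily the estimator used in the studies cited in this chapter, and uses hypothetical function names.

```python
import math


def null_allele_frequency(n_band_absent: int, n_individuals: int) -> float:
    """Estimate q, the frequency of the recessive null allele, as sqrt(x),
    where x is the observed fraction of band-absent individuals.

    Assumes a biallelic dominant locus in Hardy-Weinberg equilibrium.
    """
    if n_individuals <= 0 or not 0 <= n_band_absent <= n_individuals:
        raise ValueError("counts must satisfy 0 <= n_band_absent <= n_individuals, n > 0")
    return math.sqrt(n_band_absent / n_individuals)


def expected_genotype_frequencies(q: float) -> dict:
    """Hardy-Weinberg genotype frequencies implied by null-allele frequency q."""
    p = 1.0 - q
    return {
        "band_homozygote": p * p,
        "band_heterozygote": 2.0 * p * q,
        "null_homozygote": q * q,
    }


def hidden_heterozygote_fraction(q: float) -> float:
    """Fraction of band-present individuals expected to be heterozygous.

    2pq / (p^2 + 2pq) simplifies to 2q / (1 + q) when p = 1 - q.
    """
    return 2.0 * q / (1.0 + q)


if __name__ == "__main__":
    # Illustrative counts only: 36 of 100 individuals lack the band.
    q = null_allele_frequency(36, 100)
    print(q)
    print(expected_genotype_frequencies(q))
    print(hidden_heterozygote_fraction(q))
```

In this made-up example, most band-present individuals would be heterozygotes that a dominant marker cannot distinguish from band homozygotes. The square-root estimator is also sensitive to sampling error when band-absent individuals are rare, so it is best treated as a rough first approximation.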
should expect to survey more than 50 RAPD loci for each offspring for most applications of paternity exclusion analysis. Nonetheless, by choosing a subset of markers with high recessive allele frequencies, RAPD loci provide nearly as much power as biallelic codominant loci (Lewis and Snow, 1992). Furthermore, several loci can be assayed per primer, and considerable automation of the technique is possible (R. Sederoff, personal communication). Thus, RAPDs may be useful for estimation of parentage in systems that are genetically uncharacterized, and where the availability of variable codominant markers is limited.

REPEATED SEQUENCES The repeated nuclear gene most commonly assayed is the ribosomal RNA cistron. This spans both variable and conserved regions, and a few studies have demonstrated intraspecific RFLP variation, often due to length variation in the non-transcribed spacer region (e.g., Learn and Schall, 1987; Sites and Davis, 1989; Hillis et al., 1991~). In a few cases, useful intraspecific variation in the internal transcribed spacer has been revealed by PCR sequencing, which could be the basis for screening via a fragment method.

Studies of Interspecific Hybridization

Studies of hybridization play an important role in our understanding of evolutionary processes (reviewed by Hewitt, 1988; Harrison, 1990). Where distinct taxa currently hybridize, it is possible to examine speciation and the evolution of reproductive isolation. Typical studies of hybridization involve a population genetic approach to quantify patterns of gene exchange among extant taxa (reviewed by Barton and Hewitt, 1989; Harrison, 1990). The very nature of these studies requires examination of large numbers of individuals for several independent markers (e.g., different nuclear genes, mtDNA, cpDNA), making direct sequencing impractical and inappropriate. Many individuals can be screened relatively quickly and cheaply for fragment length or site polymorphisms, making these methods ideally suited for studies of hybridization (reviewed by Harrison, 1990; Rieseberg and Ellstrand, 1993).

In addition to its use as a tool in evolutionary studies, RE variation has been important for examining the significance of hybridization. Studies of RE variation have been used to identify shifts in the position of contact zones (e.g., M.L. Arnold et al., 1987a; Dowling and Hoeh, 1993) and the role of hybridization in the production of new plant and animal taxa (reviewed by M.L. Arnold, 1992; Dowling and DeMarais, 1993). The detection of hybridization in past evolutionary events is best achieved by rigorous phylogenetic analysis, which involves using several independent data sets to identify mosaics of characteristics contributed by different taxa (Rieseberg and Brunsfeld, 1992). This can be accomplished readily by combining direct sequencing and restriction site analyses (for generating a large number of characters and assessing intra- and interpopulational variation, respectively).

At this time, mtDNA and cpDNA, often in combination with allozyme variants, have been the markers most frequently used in such studies, due mainly to their ease of application. The typically maternal mode of inheritance for these characters makes them particularly useful for studies of hybridization, because they provide a means for identifying the maternal form involved in the production of hybrids and the assessment of directionality of introgression. Y-chromosome-specific sequences can provide a similar haploid marker for tracing the male contribution (Vanlerberghe et al., 1986; Tucker et al., 1992; Lundrigan and Tucker, 1994), although their use has been limited. Autosomal nuclear DNA markers, while used less often than organelle systems, also have proven to be informative (e.g., rDNA: R.J. Baker et al., 1989; RAPDs: M.L. Arnold et al. 1991; and anonymous nDNA loci: Degnan, 1993b; Parsons et al., 1993). We expect that nuclear DNA markers will be applied more regularly in the future, especially for organisms in which limited allozyme variation exists. With the above caveats, RAPD markers potentially will be of great use for future studies of hybridization because of the efficiency with which species-specific markers can be developed (Rieseberg and Ellstrand, 1993). This is primarily due to the "universality" of RAPD primers across taxonomic groups, combined with the many loci typically amplified by each primer. In
addition, bulked segregant analysis (Michelmore et al., 1991) can increase the efficiency of marker detection. This method involves pooling the DNA from individuals of each parental species and screening the bulked samples for polymorphisms.
DNA-based characters are not without their limitations. Although haploid markers allow for direct examination of the maternal/paternal component of hybridization, they are useful only when used in conjunction with other character sets, such as allozymes (Chapter 4) or other DNA markers. When using diploid markers, it is essential that their mode of inheritance is understood. For example, when applying rDNA characters, it is important to consider the effect concerted evolution could have on the distribution of variants within individuals and populations and the estimation of deviations of observed numbers of hybrids relative to those expected (M.L. Arnold et al., 1987a; Hillis et al., 1991~). The phenotypic nature of RAPDs (multiple bands and typically dominant expression of alleles) makes it imperative that breeding studies be conducted in order to understand patterns of heritability.
Species-Level Comparisons
As with other levels of comparison, the ideal is to find characters that vary among, but not within, the groups being studied. Further, differences among groups should not be so large that convergence, parallelism, or non-homology obscure the true phylogeny. The choice of characters or sequences for analysis is critical in achieving this balance. Due to the rapid rate of evolution and the finite number of character states, homoplasy is likely to be common for length-associated characters such as those assayed by microsatellites and minisatellites. Because of the problems of repeatability and homology discussed previously, RAPDs are unlikely to be useful for phylogenetic studies (see also Clark and Lanigan, 1993; Hillis, 1994; J.J. Smith et al., 1995). Therefore, the discussion below will focus on the use of RE characters for phylogenetic inference.
Sequences with rapid evolutionary rates yet moderate to low intraspecies polymorphism are most appropriate for analyzing relationships among closely related species, whereas those with slower evolutionary rates may provide useful characters for studying relatively ancient divergences. Levels of variation should be assessed in a pilot study (see Chapter 2), with final design including adequate samples below the level of interest to assess the impact of geographic variation (Smouse et al., 1991). Population sampling should take into account other characters as well as geographic information, with special effort made to include samples from morphologically/genetically distinctive as well as geographically dispersed populations.
An ideal approach to such a study would combine direct sequencing (see Chapter 9) and RE analysis. Nucleotide sequences for each major lineage can be used as a guide for surveys of RE site variation, allowing for fast and efficient quantification of levels of variation within lineages. It should be restated that the phylogenies produced are of the molecules and may differ from the organismal phylogeny for various reasons, including introgression, gene conversion, and sorting of polymorphism (reviewed by Avise, 1994).
Animal Mitochondrial DNA
The application of mtDNA RFLPs to phylogenetic analysis of congeneric species has been reviewed extensively (A.C. Wilson et al., 1985; Birley and Croft, 1986; Avise, 1986; Moritz et al., 1987; Avise, 1994). In general, the approach has proven useful for resolving relationships of closely related species. Phylogenetic analysis of mtDNA restriction sites also has identified the bisexual species that acted as the maternal parent of hybrid-parthenogenetic species (W.M. Brown and Wright, 1979; reviewed in Avise et al., 1992b; Moritz et al., 1992b) and the existence of past introgression (Dowling and DeMarais, 1993).
The main problems encountered in such studies stem from sorting of polymorphism, where recently separated species are being compared, and from high levels of noise (homoplasy), where distantly related species are examined. Using simulation studies, Neigel and Avise (1986) showed that sequences from recently separated monophyletic sister taxa appear polyphyletic initially,
then paraphyletic, and then monophyletic, as the original polymorphic lineages are terminated and replaced by variants unique (i.e., apomorphic) to each taxon. The simulations indicated that, for a haploid marker such as mtDNA, this process may take 4N_e generations, where N_e is the effective population size. However, the time frame is also likely to be affected by the amount and distribution of polymorphism within each species, the geographic mode of speciation, and the demographic history of the two species (see also Avise et al., 1984, 1988). This problem is not restricted to mtDNA or recently separated taxa. W.S. Moore (1995) and Slade et al. (1994) found that ancestral polymorphisms in mtDNA are eliminated more rapidly than those at nDNA loci, presumably because of the reduced effective population size and higher mutation rate of the former. Theoretical studies (Pamilo and Nei, 1988) indicate that if an ancestral taxon was highly polymorphic and multiple speciation events occurred over a short time relative to effective population size, then the probability of obtaining the correct topology from a single sequence is low. This has undoubtedly contributed to the debate over the phylogeny of higher hominids as deduced from mtDNA and other sequences (reviewed by Holmquist et al., 1988a,b) and created difficulties in resolving relationships among African rift lake cichlids (Moran and Kornfield, 1993). There seems no obvious solution to the problem of polymorphism. However, it does stress the need for adequate sampling of geographic populations and gene loci for phylogenetic analyses. If there is a strong geographic component to intraspecific polymorphism, inadequate sampling may lead to erroneous phylogenies (Smouse et al., 1991).
Homoplasy can be a substantial problem where distantly related taxa are compared. This is true particularly where comparisons are restricted to fragment sizes rather than mapped cleavage sites. The upper limit to useful RFLP comparisons of mtDNA presumably is set by constraints on sequence evolution. Sequence comparisons indicate that primate mtDNAs reach a plateau of sequence divergence at about 25% (W.M. Brown et al., 1982). It is important to note, however, that the position of this plateau may vary among taxa. For example, Drosophila (DeSalle et al., 1987a) and minnow (Dowling et al., 199210) mtDNA sequences seem to plateau at only 8% and 10% sequence divergence, respectively. Once these levels are reached, further base substitutions are concentrated at positions that have already changed, which is likely to increase homoplasy among RFLPs. Since the plateau point varies among taxonomic groups and among genes (Zhu et al., 1994), so will the ability to resolve phylogenetic relationships. Therefore, it is essential to complete a carefully designed pilot study (Chapter 2) before embarking on a large-scale study. Where homoplasy does appear to obscure relationships, it may be possible to improve the signal-to-noise ratio by restricting comparisons to a slowly evolving region (Dowling and Brown, 1989).
Chloroplast DNA
Nucleotide sequence divergence values for cpDNAs of congeneric species typically range up to 2.0% (see references in Palmer, 1987; Palmer et al., 1988b; D.J. Crawford, 1989). Given a typical genome size of 150 kb, sampling with 10-20 REs that cleave from 20-100 times each will allow coverage of 1-5 kb of sequence, which usually is adequate to produce a highly resolved phylogeny. Thus far, such phylogenies have been relatively untroubled by problems of homoplasy (Givnish and Sytsma, 1995) and have contributed to a better understanding of a host of phylogenetic problems, including the identification of crop plant origins from wild species, identification of the maternal and paternal ancestry of a number of hybrid and polyploid species, detection of unsuspected cases of introgression, and identification of the progenitor genus of a putatively monotypic, morphologically isolated genus (reviewed in Palmer, 1987; Palmer et al., 1988b; D.J. Crawford, 1989; Olmstead and Palmer, 1994; Soltis and Soltis, 1994; Sytsma and Hahn, 1994).
In situations where the quantity of DNA is limiting, where extensive population sampling is required, or where rearrangements make mapping difficult, digestion of PCR-amplified cpDNA fragments with frequent-cutting enzymes may be the method of choice (e.g., Rieseberg et al., 1992; Liston, 1992). Because the entire chloroplast
genome has been sequenced for several plant species, it is now possible to generate universal primers for almost any portion of the genome. Thus, rapidly evolving non-coding sequences can be chosen for comparison of closely related species, whereas more slowly evolving sequences can be amplified for more divergent taxa. Nonetheless, this method is limited by several factors including: (1) the difficulty of amplifying large cpDNA regions (2-5 kb) in sufficient quantity for digestion (although this may be overcome using "long-PCR"); (2) the laborious nature of double-digest mapping of complex fragment profiles generated by four-cutter enzymes; and (3) the limited phylogenetic resolution typically obtainable from a single region. Mapping efficiency and choice of restriction endonucleases can be greatly enhanced by the availability of sequence data for at least one of the taxa under study. In addition, several regions can be amplified and digested to increase phylogenetic resolution.
Early studies of cpDNA restriction site variation within a genus were accomplished by direct inspection of restriction fragment patterns of purified cpDNA. However, most current efforts use a transfer hybridization approach in which cloned cpDNA fragments are hybridized sequentially to filter blots containing digests of genomic DNA (see protocols). Although more laborious, this approach has two main advantages. The use of total DNA as compared to cpDNA has major advantages with respect to yield (therefore much less starting material is required) as well as extraction flexibility and adaptability (see Palmer et al., 1988b for a fuller discussion). By probing with cloned portions of the chloroplast genome, the complexity of the fragment patterns is greatly reduced, allowing a more critical analysis of fragment differences in terms of discrete mutations and often permitting the direct mapping of restriction fragments and sites. Fortunately, many complete clone banks are readily available for a wide range of land plant cpDNAs (reviewed in Palmer, 1986a; Palmer et al., 1988b; Olmstead and Palmer, 1994), with a well-characterized bank from the completely sequenced genome of Nicotiana generally being the most useful (Olmstead and Palmer, 1992).
Studies using restriction site variation of cpDNA for interspecific phylogenetic analysis have encountered several problems. As with animal mtDNA, the sorting of ancestral polymorphisms can lead to discordance between cpDNA trees and organismal phylogenies (reviewed in D.E. Soltis et al., 1992; Doyle, 1992). This problem appears to be less severe for cpDNA than for animal mtDNA, however, because of the low rate of cpDNA evolution, low effective population sizes of many plant groups, and resulting low levels of intrapopulational and intraspecific cpDNA polymorphism. Conversely, due to the high potential for interspecific gene flow in plants, hybridization and introgression may be a greater problem for phylogenetic analysis of cpDNA variation than animal mtDNA (Rieseberg and Soltis, 1991). Fortunately, this problem is readily solved by adequate geographic sampling for phylogenetic analysis and by comparing cpDNA trees with phylogenetic hypotheses based on other data sets. Another common problem for interspecific phylogeny is the conservative nature of cpDNA evolution, which often limits resolution among closely related species (e.g., Schilling and Jansen, 1989; Rieseberg et al., 1991; D.E. Soltis et al., 1991; Mummenhoff and Koch, 1994). In some instances, resolution can be increased by sampling the genome with additional restriction endonucleases, particularly endonucleases with four-base recognition sites (four-cutters), which typically cut cpDNA much more frequently (Olmstead and Palmer, 1994).
Nuclear Genes
Some single-copy nDNA sequences have been compared among species by RFLP analysis (e.g., ADH among Drosophila; Langley et al., 1981; Bishop and Hunt, 1988; Y-chromosome markers in Mus; Tucker et al., 1989), but the data are too few for particular advantages and limitations to be identified. RFLP analysis of multigene families is exemplified by studies of globin variation among primates (Zimmer et al., 1980; Barrie et al., 1981). Analysis of multigene families requires particular care when using heterologous probes because low stringency hybridization is likely to detect variation in duplicate copies as well as the target se-
quence. It then becomes important to distinguish between variation in orthologous (shared by descent) and paralogous (duplicate) copies for phylogenetic analysis. Even if this distinction can be made (e.g., by relative intensities of hybridization; see Barrie et al., 1981), gene conversion among members of a multigene family (e.g., Slightom et al., 1987) could still cause the gene tree to differ from the species tree.
There are also a number of studies that have used RFLP variation at numerous anonymous nuclear loci for phylogenetic inference among species (Song et al., 1988; Miller and Tanksley, 1990b; Kesseli et al., 1991; Jena and Kochert, 1991). Because each locus can be considered a potentially independent estimator of phylogenetic relationship, this approach greatly reduces the problems of phylogenetic sorting and hybridization associated with gene trees. However, much nuclear RFLP variation appears to result from insertions, deletions, and rearrangements. Thus, fragment profiles generated by different endonucleases with the same probe are often correlated, suggesting that it may be best to use many probes with a single enzyme each. Anonymous nuclear loci also will be subject to the problem of orthologous versus paralogous variation discussed in the preceding paragraph.
The most comprehensive studies of nuclear RFLP variation are on crop plants and their relatives and involve Brassica (Song et al., 1988, 1990), tomato (Miller and Tanksley, 1990b), and lettuce (Kesseli et al., 1991). For example, phylogenetic analysis of eight species of tomato with 40 RFLP loci generated two clusters, corresponding to self-incompatible and self-compatible species. In addition, red-fruited tomato species formed a cluster within the self-compatible species group. It is noteworthy that an earlier cpDNA-based phylogeny for the group did not resolve species into self-incompatible and self-compatible clades (Palmer and Zamir, 1982), but did support a clade of red-fruited species. Which tree represents the "true" phylogeny for tomatoes remains unclear.
Of repeated genes, the rDNA cistrons have been used most widely in interspecific comparisons (e.g., Coen et al., 1982; G.N. Wilson et al., 1984; M.L. Arnold et al., 1987a; Mindell and Honeycutt, 1990; Rieseberg, 1991). The variation revealed in these studies was typically, although not exclusively, in the transcribed or non-transcribed spacers and was due to length mutations or to the gain or loss of cleavage sites. The phylogenetic information obtained from these studies typically has been consistent with previous studies. However, M.L. Arnold et al. (198%) found that divergence of a highly repeated sequence was inconsistent with other evidence on the relationships among subspecies of Caledia captiva, and attributed the discrepancy to historical introgression.
Higher-Level Systematics
Investigations at this level have used both changes in cleavage sites and gross structural rearrangements as characters for phylogenetic analysis. However, in contrast to sequence data (see Chapter 9), there have been relatively few applications of RFLPs to higher-level systematics.
Animal Mitochondrial DNA
Although sequence evolution of animal mtDNA typically is rapid, certain aspects are highly conserved. These include mtDNA structure (Bridge et al., 1992), gene order, genetic code, and the secondary structure of tRNA and rRNA sequences (reviewed by Wolstenholme, 1992). The order of mtDNA genes varies considerably among phyla, with the position of tRNA genes more variable than other coding sequences. There are some indications of minor variations within classes or phyla (e.g., in vertebrates, Paabo et al., 1991; Desjardins and Morais, 1991; Seutin et al., 1994), making it imperative to further investigate within-group diversity before applying gene order as a tool for estimating relationships among phyla (W.M. Brown, 1985; e.g., Sankoff et al., 1992; M.J. Smith et al., 1993).
Aside from structural changes, some coding sequences (reviewed in Brown, 1985) may be conservative enough to provide characters useful for phylogenetic analysis among genera. However, sequencing rather than fragment analysis is clearly the method of choice here.
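The lineage-sorting time scale quoted earlier in this section from Neigel and Avise (1986), roughly 4N_e generations for a haploid marker such as mtDNA, can be converted to calendar time once a generation length is assumed. The following is a minimal sketch of that arithmetic only. The function name, the population size, and the generation time are hypothetical illustration values, not figures taken from the studies cited.

```python
def sorting_time(effective_size, generation_years, multiplier=4.0):
    """Return (generations, years) for lineage sorting to run to completion.

    multiplier=4 follows the ~4*N_e generations quoted in the text for a
    haploid marker; effective_size and generation_years are user-supplied
    assumptions, not values from the cited studies.
    """
    if effective_size <= 0 or generation_years <= 0:
        raise ValueError("effective size and generation time must be positive")
    generations = multiplier * effective_size
    return generations, generations * generation_years


# Hypothetical example: N_e = 10,000 and a 2-year generation time.
generations, years = sorting_time(10_000, 2.0)
print(f"{generations:.0f} generations, about {years:.0f} years")
```

As the text notes, the real time frame also depends on the amount and distribution of polymorphism, the geographic mode of speciation, and demographic history, so a figure computed this way is at best an order-of-magnitude guide.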
analyses of higher phylogeny using rDNA, as opposed to hundreds of sequencing studies.)
LABORATORY SETUP
Major equipment items needed for analysis of DNA fragments are included in Chapters 7 and 9. The most expensive of these is an ultraspeed centrifuge with appropriate rotors needed to prepare mtDNA of high purity. A programmable thermal cycler is needed for gene amplification reactions (Chapter 7) and is now routine and affordable equipment for a molecular systematics laboratory. DGGE requires a specialized electrophoresis chamber with recirculating temperature-controlled water. Chambers for routine gel electrophoresis can be made in-house at low cost (see below). Other essential items include an autoclave (or access to one), a fume hood, and a source of high-purity water. Single-distilled or deionized water can be used for rinsing glassware and making up electrophoresis buffers, but solutions used for preparing or manipulating DNA require even greater purity (i.e., sterile double-distilled or sterile distilled-deionized water). Standard laboratory items that are used include glassware, including various sizes of beakers, graduated cylinders and pipettes, Erlenmeyer flasks, side-arm flasks, and bottles. High-strength, acid/solvent-resistant centrifuge tubes are needed for many applications. Disposable supplies include gloves, pipette tips, Pasteur pipettes, and microcentrifuge tubes.
Reagents generally should be of analytical reagent grade or better, although there are some exceptions (see below). In particular, chemicals used in the preparation and manipulation of DNA must be of high quality, as must the media used for electrophoresis. Commonly used reagents include: Tris base, sodium chloride, ethylenediaminetetraacetic acid (EDTA, disodium, dihydrate), sucrose, sodium dodecyl sulfate (SDS, ultrapure), cesium chloride (technical grade), propidium iodide, light mineral oil, hydrochloric acid, sodium hydroxide, isopropyl or isobutyl alcohol, proteinase (e.g., proteinase K or pronase), phenol (ultrapure), chloroform, isoamyl alcohol, ethidium bromide, RNase A, DNase I, DNA polymerases (e.g., Klenow fragment, Kornberg enzyme, Taq polymerase), restriction enzymes (see Table 3), bovine serum albumin (crude for addition to hybridization solutions, ultrapure for other applications), ethanol, sodium acetate, β-mercaptoethanol (BME), sorbitol, hexadecyltrimethylammonium bromide (CTAB), ammonium acetate, potassium chloride, dithiothreitol (DTT), agarose (ultrapure), acrylamide (ultrapure), bisacrylamide (ultrapure), ammonium persulfate (ultrapure), N,N,N',N'-tetramethylethylenediamine (TEMED), boric acid, urea, and sodium citrate.
PROTOCOLS
We describe protocols for the basic operations in fragment analysis. The more complex procedures for optimization and application of DGGE are described clearly and in detail by Myers et al. (1989) and Lessa (1993). Methods for SSCPs and the critical variables are described by Hayashi et al. (1991a,b), and the basic method is described in Chapter 9, Protocol 20. The experimental approach to, and methods for, PCR are elucidated in Chapter 7. The protocols given here conclude with a detailed exposition of restriction site mapping by double digestion or sequential hybridization.
1. Isolation of animal mtDNA using CsCl-PI gradients
2. Isolation of cpDNA using sucrose step and CsCl-EB gradients
3. Digestion of DNA with restriction endonucleases
4. Agarose and polyacrylamide gel electrophoresis
5. Staining with ethidium bromide
6. 32P-3' end-labeling of restriction fragments
7. Primer for microsatellite analysis
8. Transfer hybridization
9. Mapping restriction sites
in STES buffer (Appendix). However, storage of tissue in this buffer softens tissue considerably, making membranes more susceptible to breakage. This may be a function of the high concentration of EDTA, since Avise and coworkers (Lansman et al., 1981; Ball et al., 1988) report good results using their buffer, which contains less EDTA. Therefore, this strategy should be tested for the different combinations of tissue, species, and buffers. Yields from ethanol-preserved tissues are poor, possibly because of damage to the mitochondrial membranes (S. Palumbi, personal communication).
Part A. Preparation of Crude mtDNA
1. Sacrifice or, if frozen, partially thaw animals and remove tissues. If using only cells (e.g., blood), pellet and begin at step 7.
2. Homogenize thoroughly in cold STES buffer (see Appendix: 12 ml buffer/g tissue; 12 ml minimum). The concentration of EDTA may be adjusted, depending upon levels of DNase activity. EDTA inhibits DNases by chelating divalent cations required for their function. A good starting concentration is 100 mM EDTA. Isolations of mtDNA from organisms with high levels of DNase activity (e.g., mollusks) have been more successful using 200 mM EDTA in their grinding buffer, while initial studies of teiid lizards and terrestrial mammals worked well with 1 mM EDTA. It is important to note that increasing EDTA concentration decreases the stability of membranes. High EDTA concentrations limit the loss of mtDNA due to degradation, but mtDNA is lost through membrane breakage, which results in an inability to recover the molecules from the supernatant. Therefore, it may be necessary (particularly when working with small amounts of tissue) to determine empirically which EDTA concentration provides the best yields.
3. Centrifuge homogenate for 5 min at 1200 g, 4°C, to pellet nuclei and large cellular debris. This pellet may be saved for nuclear DNA extraction. Repeat this step, when using large samples (>1 g), until the pellet is the same size in two consecutive spins. For small-scale preparations (<5 g), go to step 6.
4. Transfer the supernatant to a 50-ml polypropylene or polyallomer screw cap centrifuge tube. Centrifuge at 23,000 g, 4°C, for 20 min to pellet mitochondria and other remaining cellular debris. Decant supernatant and drain pellet.
5. (OPTIONAL; for large amounts of tissue) Purify mitochondrial fraction on a 1.0 M/1.5 M sucrose step gradient as follows:
a. Re-suspend pellet in 20 ml 0.25 M sucrose (in ThE, Appendix).
b. Make the sucrose gradient by underlayering 10 ml of 1 M sucrose (in ThE) with 8 ml of 1.5 M sucrose.
c. Carefully overlayer the sample onto the gradient.
d. Centrifuge at 25,000 rpm (81,000 g), 4°C, for 1 hr (no brake) in a Beckman SW28 rotor (or equivalent).
e. After centrifugation, aspirate off the top of the gradient and carefully remove the mitochondrial fraction (appears as a band at the 1.0-1.5 M interface; see Figure 10A).
f. Re-suspend mitochondrial fraction in three volumes of ThE and centrifuge at 23,000 g to pellet.
6. Re-suspend the pellet (from step 4 or 5f) in 1.0 ml ThE at room temperature and mix vigorously. If the pellet volume is greater than 0.3 ml, re-suspend in 4 volumes of ThE.
7. Add 0.125 ml (1/8 re-suspended volume) 20% SDS (w/v in H2O) to lyse membranes, mix gently, and leave at room temperature for at least 10 min.
8. Add 0.188 ml (1/6 volume) CsCl-saturated water to precipitate nuclear DNA-SDS-CsCl, mix gently, and place on ice for at least 15 min. Larger samples (>1 g tissue) may require longer incubation times (i.e., overnight) to complete precipitation. (This mixture can be stored at 4°C overnight or longer at this point.)
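The reagent volumes in steps 6 to 8 scale with the re-suspended volume, so they can be worked out once per sample. The sketch below assumes two readings of the protocol that the text does not state outright: that "4 volumes" in step 6 means the buffer volume is four times the pellet volume, and that the "1/6 volume" in step 8 is taken on the volume after SDS has been added (this reproduces the 0.188 ml quoted for a 1.0-ml sample, since 1.125/6 = 0.1875). The function name and dictionary keys are hypothetical.

```python
def lysis_volumes(pellet_ml):
    """Scale the reagent volumes of steps 6-8 to the pellet volume (ml)."""
    if pellet_ml <= 0:
        raise ValueError("pellet volume must be positive")
    # Step 6: 1.0 ml of buffer, or 4 pellet volumes if the pellet exceeds 0.3 ml
    # (assumption: the buffer volume is treated as the re-suspended volume).
    buffer_ml = 4.0 * pellet_ml if pellet_ml > 0.3 else 1.0
    # Step 7: 20% SDS at 1/8 of the re-suspended volume.
    sds_ml = buffer_ml / 8.0
    # Step 8: CsCl-saturated water at 1/6 of the volume after SDS addition
    # (assumption inferred from the 0.188 ml figure in the protocol).
    cscl_water_ml = (buffer_ml + sds_ml) / 6.0
    return {"buffer_ml": buffer_ml, "sds_ml": sds_ml, "cscl_water_ml": cscl_water_ml}


for pellet in (0.2, 0.5):
    volumes = lysis_volumes(pellet)
    print(pellet, {name: round(ml, 3) for name, ml in volumes.items()})
```

For pellets of 0.3 ml or less the function returns the fixed volumes given in the protocol; larger pellets scale all three volumes in proportion.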
Table 6
Approximate amounts of PI and CsCl to adjust sample densities to 1.40 g/ml
of 2 mg/ml PI in TE). Samples may be stored for months at -20°C by adding only the CsCl. The PI is added just prior to ultracentrifugation.
2. Add 0.23 ml of 2 mg/ml PI stock (in TE). Check the density of each sample by: (a) repeatedly weighing 1 ml of the solution, (b) accurately measuring the sample volume and weighing the sample, or (c) using a refractometer. Adjust to 1.40 g/ml by addition of water (if too heavy) or solid CsCl (if too light).
3. Place samples in ultracentrifuge tubes and check the volume of each. To form a step gradient, carefully underlayer the sample with 1.33 ml of 1.70 g/ml solution (Appendix) per ml of sample. Overlayer the step gradient with mineral oil to within 1-3 mm of the top (there should be at least 2 mm oil).
4. Put tubes into rotor buckets, carefully hook buckets onto rotor, and place rotor on drive shaft. For a Beckman SW60Ti rotor or equivalent, set run parameters to: temperature = 21°C; maximum temperature = 35°C (if adjustable); speed = 36,000 rpm (140,000 g); running time = 20-24 hr. (The running time is dependent upon the amount of DNA in the sample; the more DNA, the longer it takes for the sample to attain equilibrium. Larger samples may require more than 24 hr to reach equilibrium.) Now go to step 8 (see below).
5. For small volume gradients (total volume <2.5 ml; to be run in a Beckman TLS-55 rotor or equivalent) or for large initial volumes (>1.5 ml of sample; to be run in a Beckman SW60Ti rotor or equivalent), measure volume of supernatant from part A and adjust density to 1.52-1.57 g/ml by adding the amount of CsCl indicated in Table 7 (this includes the volume of PI to be added later).
6. Just prior to centrifugation, add the amount of 2 mg/ml PI needed to bring final concentration to 350 µg/ml (Table 7) and mix. Measure density of the solution. Final density should be 1.52-1.57 g/ml. If necessary, adjust by adding water or solid CsCl.
Table 7
Approximate amounts of PI and CsCl to adjust sample densities to 1.55 g/ml
V(initial) (ml)*   PI (ml)   CsCl (g)
1.0                0.21      0.93
1.1                0.23      1.01
1.2                0.25      1.11
1.3                0.27      1.20
1.4                0.29      1.29
1.5                0.31      1.39
1.6                0.33      1.48
1.7                0.35      1.57
1.8                0.37      1.66
1.9                0.39      1.76
2.0                0.41      1.85
2.1                0.43      1.94
2.2                0.46      2.04
2.3                0.48      2.13
2.4                0.50      2.22
2.5                0.52      2.32
*V(initial) is the volume of sample prior to the addition of PI and CsCl.
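For sample volumes that fall between the rows of Table 7, one option is to interpolate linearly between neighbouring rows. The sketch below does only that: the values are transcribed from Table 7, while the function name and the choice of linear interpolation are assumptions made for illustration. The density of the final solution should still be measured as described in step 6.

```python
from bisect import bisect_left

# Values transcribed from Table 7: (initial volume in ml, PI in ml, CsCl in g).
TABLE_7 = [
    (1.0, 0.21, 0.93), (1.1, 0.23, 1.01), (1.2, 0.25, 1.11), (1.3, 0.27, 1.20),
    (1.4, 0.29, 1.29), (1.5, 0.31, 1.39), (1.6, 0.33, 1.48), (1.7, 0.35, 1.57),
    (1.8, 0.37, 1.66), (1.9, 0.39, 1.76), (2.0, 0.41, 1.85), (2.1, 0.43, 1.94),
    (2.2, 0.46, 2.04), (2.3, 0.48, 2.13), (2.4, 0.50, 2.22), (2.5, 0.52, 2.32),
]


def pi_and_cscl(volume_ml):
    """Return (PI ml, CsCl g) for a sample volume within the range of Table 7.

    Exact table volumes return the tabulated row; other volumes are
    interpolated linearly between the two neighbouring rows (an assumption).
    """
    volumes = [row[0] for row in TABLE_7]
    if not volumes[0] <= volume_ml <= volumes[-1]:
        raise ValueError("volume outside the 1.0-2.5 ml range of Table 7")
    i = bisect_left(volumes, volume_ml)
    if volumes[i] == volume_ml:
        return TABLE_7[i][1], TABLE_7[i][2]
    (v0, p0, c0), (v1, p1, c1) = TABLE_7[i - 1], TABLE_7[i]
    fraction = (volume_ml - v0) / (v1 - v0)
    return p0 + fraction * (p1 - p0), c0 + fraction * (c1 - c0)


pi_ml, cscl_g = pi_and_cscl(1.25)
print(f"PI: {pi_ml:.2f} ml, CsCl: {cscl_g:.2f} g")
```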
7, Place samples in tubes and fill to within 1-3 Part F). VeIocity Cexstrifugatinn on a Step
mm of the top with light mineral oil. Balance Gradient
tubes to within k0.02 g of each other. Run pa-
rameters for a Beckman TLS-55 rotor (or 1. Measure the volume of the sample collected
equivalent) are: 50,000 rpm (140,000 g), 2I0C, from the equilibrium gradient and add an
and >20 hr. For the larger Beckman SW60Ti equal volume of TE (at least 2/3 of the sam-
rotor (or equivalent), parameters are as in step ple volume) and mix. Addition of TE reduces
4, except that minimum run time is 36 hr. the density of the sample below 1.40 g/ml, al-
lowing it to be layered over the step gradient.
8. To end the run, push "stop" with brake on Failure to add a sufficient volume of TE re-
and remove tubes from buckets. duces efficiency of velocitization by prevent-
ing discrete overlayering of sample on the
gradient. The combined volume of sample
X7art C , Recovcsy of DNA
and TE should be less than 1 ml.
In room light, the nuclear DNA (actually all linear
and relaxed circular DNA, i.e., including damaged 2. The sample is overlayered onto a step gradi-
mtDNA) should be visible as an intense red band. ent consisting of two layers, 0.7 ml of 1.70
The band containing undamaged mtDNA, which g/ml solution (Appendix) and a quantity of
is from 2-6 mm below the nuclear DNA band, 1.40 g/ml solution (Appendix) determined by
probably will not be visible (Figure 10B).Bands of the volume of the diluted sample. The
carbohydrate are white to light pink in room light amount of 1.40 g/ml solution is calculated us-
and may be present below the mtDNA band. RNA ing the following formula:
is found at or near the bottom of the gradient.
I. Wear safety glasses or face shield and gloves. Volume (in ml) of 1.40 g/ml solution =
Using a long-wave (305 nm) UV light source, 3.8 ml - volume of diluted sample -
locate the mtDNA band. The mtDNA band is 0.7 ml of 1.70 g/ml solution
often not visible. In such cases, collect the area
2-6 mm below the main band. 3. Add the correct amount of 1.40 g/ml solution
to an ultracentrifuge tube. Using a Pasteur
2. Puncture the tube bottom with an 18-21-gauge pipette, underlayer this with 0.7 ml of 1.70
syringe needle with a wire inserted in it (appa- g/ml solution.
ratus in Figure 11).Use the wire to dislodge (by 4. Carefully layer the diluted sample on top of
pushing up) the small plastic plug that may the gradient, add light mineral oil to within
clog the needle, then remove the wire from the 1-3 mm of the top and balance tubes to
needle. The flow can be regulated by placing a within 0.02 g.
gloved finger over top of the tube.
5. Put the tubes into rotor buckets and place ro-
3. Collect the mtDNA fraction in a 1.5-ml micro- tor (Beckman SW6OTi or equivalent) into ul-
centrifuge tube. If the mtDNA is to be further tracentrifuge. Centrifuge at 45,000 rpm, 21°C
purified (parts D and E) and the mtDNA for 3.5 hr, with no brake.
bands are faint (or invisible), include the first
drop of nuclear DNA as a reference point for
further gradients. Otherwise, avoid contami- Part E. Sample Recovery and Final Equilibrium
nating the mtDNA fraction with any DNA Gradient
from the top band. The top band DNA also 1. Puncture tubes as in step 2, part C. Collect the
can be collected and usually is adequate for a
bottom 1.4 ml of the step gradient into a 1.5-
variety of uses (e.g., transfer-hybridization ml microcentrifuge tube.
analysis of nuclear and mtDNA sequences,
template for PCR amplification). Proceed to 2. Put 1 ml of 1.55 g/ml solution (Appendix) into
extraction and dialysis (part F), unless further an ultracentrifuge tube, add the sample, and
purification (parts D and E) is desired.
288 Chapter 8 / Dowling, Moritz, Palmer & Rieseberg
[Figure: side view of the tube-puncturing apparatus: a 21-gauge, 1-5/8 in. needle inserted through a threaded brass fitting that screws into the base.]
mix. Add light mineral oil and balance tubes as above.
3. Use the same centrifugation conditions as in step 7, part B, with run time reduced to 18-20 hr.
4. Recover sample as described in part C.

Part F. Extraction of Dye and Dialysis
1. To remove PI from a sample, extract with isopropyl alcohol (saturated with CsCl; top layer is the isopropyl alcohol) and spin briefly in a microcentrifuge. The saturated alcohol forms the top layer (pinkish from the dye) and is discarded after each extraction. Repeat this process until the sample (lower layer) is clear.
2. Place samples into 8-mm dialysis tubing (for preparation, see Appendix) and tie or clip tightly.
3. Dialyze against two changes of 2 L 0.5x TE for 24 hr.
4. Remove and store purified mtDNA (should be in 0.2-0.5 ml) at -20°C.
Nucleic Acids III: Analysis of Fragments and Restriction Sites 289
Protocol 2: Isolation of cpDNA Using Sucrose Step and CsCl-EB Gradients
(Time: Part A: 3 hr; Part B: 6-18 hr)

This method involves two steps: (1) purification of intact and broken chloroplasts using a sucrose step gradient, and (2) purification of the cpDNA released from the organelles, together with any contaminating nDNA and mtDNA, using a CsCl gradient with the intercalating dye ethidium bromide (EB). Although the sucrose gradient procedure does not give as pure cpDNA as the DNase I procedure of Kolodner and Tewari (1975), it is much more applicable to a wide range of plants for which it is difficult or impossible to prepare intact, DNase I-resistant chloroplasts, or for which tissue quantities are limiting. For details and modifications of this procedure, and for discussion of alternative procedures for purifying cpDNA, see Palmer (1986a) and Palmer et al. (1988b).

Part A: Isolation of Chloroplasts and Lysis
1. Use young, unexpanded green leaves if at all possible since they will have smaller cells than older fully expanded leaves, and hence will yield more DNA. If practical, prior to extraction, place plants in the dark for 1-4 days to reduce chloroplast starch levels. This usually is not essential.
2. Cut leaves into small pieces, 2-10 cm2 in surface area. Wash cut leaves in tap water (if visibly dirty).
3. Place 10-100 g of cut leaves in 50-400 ml of ice-cold cpDNA isolation buffer (Appendix).
4. Homogenize in a blender for 3-5 5-sec bursts at high speed.
5. Filter through four layers of cheesecloth (with squeezing).
6. Centrifuge filtrate at 1000 g for 15 min at 4°C.
7. Re-suspend the pellet from 10-50 g of starting material in 5-8 ml of ice-cold wash buffer (Appendix) using a soft paint brush and vigorous swirling.
8. Load the re-suspended pellet onto a step gradient consisting of 17 ml of 52% sucrose overlayed with 8 ml of 30% sucrose, both in 50 mM Tris-HCl, pH 8.0, 25 mM EDTA. The overlay should be added with sufficient mixing to create a diffuse interface and thereby prevent trapping of nuclear material in the band of chloroplasts that form at the 30%-52% interface.
9. Centrifuge the step gradients at 25,000 rpm (81,000 g) for 30-60 min at 4°C in a SW-27 (Beckman) or AH-627 (Sorvall) rotor.
10. Remove the chloroplast band from the 30%-52% interface (Figure 10) using a wide bore pipette, dilute with 3-10 volumes wash buffer, and spin at 1,500 g for 15 min at 4°C.
11. Re-suspend the chloroplast pellet in 1-2 ml wash buffer (or 15 ml if to be further purified).
12. Add 1/20 volume of a 20 mg/ml solution of self-digested (2 hr at 37°C) proteinase K and incubate for 2 min at room temperature.
13. Gently add 1/5 volume of lysis buffer (Appendix). Slowly invert tube several times over a period of 10-15 min at room temperature, then make the CsCl gradient (part B, below).
14. A cpDNA-enriched "total" DNA preparation can be prepared by re-suspending the pellet of the sucrose gradient in 1.5 ml of wash buffer, lysing (steps 12 and 13), conducting a clearing spin (10 min, 1,750 g), and CsCl banding (see below).

Part B. CsCl-EB Purification of cpDNA
This method is described for cpDNA, but is applicable to any crude DNA preparation. A smaller volume, more rapid protocol is described by Weeks et al. (1986).
1. Bring the DNA sample (e.g., chloroplast lysate, or re-suspended isopropanol pellet from a total DNA CTAB extraction, Chapter 9) to a volume of roughly 3 ml. Add 3.35 g of freshly powdered CsCl and dissolve by gentle mixing. Add EB to a final concentration of 200 µg/ml and enough distilled H2O to bring sample to a final volume of 4.45 ml and a final density of 1.55 g/ml.
2. Centrifuge for 4-16 hr at 220,000-290,000 g at 20°C in a vertical rotor (e.g., Sorvall TV-865, Beckman 65Vti).
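
The quantities in Part B, step 1 (CsCl-EB gradient) can be cross-checked numerically. The sketch below is illustrative only: it uses the stated 3.35 g CsCl, 4.45 ml final volume, 1.55 g/ml final density, and 200 µg/ml EB, but the EB stock concentration of 10 mg/ml is an assumption that does not appear in the protocol, and the function name is hypothetical.

```python
# Values stated in Part B, step 1
FINAL_VOLUME_ML = 4.45
FINAL_DENSITY_G_PER_ML = 1.55
CSCL_G = 3.35
EB_FINAL_UG_PER_ML = 200.0


def cscl_eb_gradient(eb_stock_mg_per_ml=10.0):
    """Summarize the gradient; eb_stock_mg_per_ml is an assumed stock strength."""
    total_mass_g = FINAL_VOLUME_ML * FINAL_DENSITY_G_PER_ML
    # Mass contributed by sample, EB solution, and water (everything but CsCl)
    non_cscl_mass_g = total_mass_g - CSCL_G
    eb_ug = EB_FINAL_UG_PER_ML * FINAL_VOLUME_ML
    # 1 mg/ml equals 1 ug/ul, so ug divided by (mg/ml) gives ul
    eb_stock_ul = eb_ug / eb_stock_mg_per_ml
    return {
        "total_mass_g": total_mass_g,
        "non_cscl_mass_g": non_cscl_mass_g,
        "eb_ug": eb_ug,
        "eb_stock_ul": eb_stock_ul,
    }


for name, value in cscl_eb_gradient().items():
    print(f"{name}: {value:.3f}")
```

Because the final density fixes the total mass, weighing the filled tube gives a quick consistency check on the gradient before centrifugation.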
3. Remove any scum (this will be considerable in the case of a directly banded chloroplast lysate) from the top of the gradient using a 1-ml pipette tip with the end cut off. Use a second 1-ml pipette tip with the end cut off obliquely to remove the visible band of DNA. This should be removed in as small a volume as possible (i.e., 0.5-1.0 ml).
4. If the DNA fraction is visibly dirty after the first gradient (as is often the case with direct banding of chloroplast lysates), it can be banded a second time. Simply bring the DNA/CsCl fraction to a volume of 4.45 ml by adding a premixed 1.55 g/ml solution of CsCl with 100 µg/ml EB and TE, and repeat steps 2 and 3.
5. Remove EB by three extractions with isopropanol (uppermost layer) as described in Protocol 1, part F.
6. There are two ways to remove the CsCl. Either dialyze (Protocol 1, part F) or ethanol-precipitate as described below.
a. Remove the aqueous layer from the third isopropanol extraction and add two volumes of H2O to dilute the CsCl. Mix gently and add 6 volumes of ice-cold ethanol to precipitate DNA. Place at -20°C for 30 min to overnight. Do not place at -80°C or the CsCl will precipitate.
b. Centrifuge at >1,750 g for 10 min to collect the DNA precipitate.
c. Wash pellet with 70% ethanol. Spin at >1,750 g for 2 min to collect the DNA.
d. Re-suspend pellet in 0.1-0.5 ml of TE.
7. Store the DNA at 4°C for short-term use and at -20°C for long-term use.

Protocol 3: Digestion of DNA with Restriction Endonucleases
(Time: Part A: 2-6 hr; Part B: 2-6 hr)

The activity of REs varies with temperature, pH, and salt (Na+, K+, Mg2+) concentration. However, it is usually possible to achieve acceptable levels of activity using a small range of buffers (supplied by the manufacturer) that differ in the final concentration of Na+. Digestion conditions for a particular enzyme may vary depending upon its source; therefore, the manufacturer's conditions should be consulted prior to use. REs vary widely in stability: those that denature rapidly are best used at relatively high concentration, whereas stable REs can be used at lower concentrations (1-2 U/sample) for extended periods (e.g., Crouse and Amorese, 1986). Digestion also can be improved by addition of bovine serum albumin (BSA), and the addition of spermidine has been found to assist in digestion of DNA samples containing impurities (e.g., Jeffreys, 1982). Because many REs are heat sensitive, they should be stored at -20°C, preferably in a frost-free freezer, and removed for as short a period as possible. The enzymes are stored in 50% glycerol to prevent denaturation by freezing. The glycerol can affect RE activity if present at greater than 5% of the final reaction mixture. Thus, the volume of RE added to a reaction should always be less than 10% of the total.

Part A. Digestion of Single Samples
1. For each sample, the final reaction volume should be 5-30 µl. For a single digest, add the following to a sterile microcentrifuge tube:
a. BSA (100 mg/ml solution) and appropriate buffer stock (typically provided by supplier as 10x stock) are added at 1/10 final volume.
b. Water (sterile, deionized, distilled) is added to dilute the reaction mixture to the calculated final volume (see below).
c. DNA according to amount required: 1-5 ng for end-labeling, 0.1-10 µg for staining or transfer-hybridization, depending on the sequence assayed and the size of fragment to be detected. The volume depends on concentration (e.g., mtDNA purified according to Protocol 1 can usually be used at 1-10 µl per digest for end-labeling).
d. 1-2 U of the appropriate RE. More may be needed for large amounts of DNA (>1 µg) or for heat-labile REs.
Example:
1 µl 10x buffer stock (1/10 final volume)
1 µl 1 mg/ml BSA stock (1/10 final volume)
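
The volume bookkeeping for a single digest (Protocol 3, part A, step 1) can be sketched as follows. This is an illustration under stated assumptions, not the authors' worksheet: the function name and the example volumes of DNA and enzyme are hypothetical, while the 1/10-volume rule for buffer and BSA, the 50% glycerol enzyme stock, and the 5% glycerol ceiling come from the text.

```python
def single_digest(final_ul, dna_ul, enzyme_ul, enzyme_glycerol_fraction=0.5):
    """Return component volumes (ul) for one restriction digest.

    Buffer stock and BSA stock are each added at 1/10 of the final volume;
    water makes up the remainder. Raises ValueError if glycerol carried in
    with the enzyme exceeds 5% of the reaction.
    """
    buffer_ul = final_ul / 10
    bsa_ul = final_ul / 10
    glycerol_fraction = enzyme_ul * enzyme_glycerol_fraction / final_ul
    if glycerol_fraction > 0.05:
        raise ValueError("enzyme volume brings glycerol above 5% of the reaction")
    water_ul = final_ul - buffer_ul - bsa_ul - dna_ul - enzyme_ul
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {
        "buffer_ul": buffer_ul,
        "bsa_ul": bsa_ul,
        "dna_ul": dna_ul,
        "enzyme_ul": enzyme_ul,
        "water_ul": water_ul,
        "glycerol_fraction": glycerol_fraction,
    }


# Hypothetical 10-ul digest with 5 ul of DNA and 0.5 ul of enzyme
print(single_digest(final_ul=10, dna_ul=5, enzyme_ul=0.5))
```

With a 50% glycerol stock, the 5% glycerol ceiling is the same constraint as keeping the enzyme at no more than one tenth of the reaction volume.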
[Figure 12 drawing not reproduced: gel molder, buffer tanks, and combs shown in side, top, and end views, dimensioned in inches.]
Figure 12 Plans for a non-submarine type, horizontal agarose gel electrophoresis unit with agarose wicks (1 unit = 1 gel holder plus 2 tanks). The design of two types of gel combs is also shown. Gel rig plans are based on the published plan of McDonell et al. (1977), with modification by M. Murray, W. Thompson, R. Jorgensen, and J. Palmer. (Figure courtesy of Nanette Mussy and Jim Manhart.)
[Figure drawing not reproduced: side views of a vertical gel rig and combs, dimensioned in inches, with notch depth and width marked; combs have teeth 1 mm or 2 mm thick.]
Figure 14 Plans for an adjustable vertical gel rig. Glass plates are 3.2-mm double-strength glass, 16.5 cm x 19 cm and 16.5 cm x 44.5 cm; in sets of two, where one has a notch in the top that is 1.9 cm deep x 14 cm wide (centered). Spacers for the agarose gel (small gel) are 2.0 mm thick. Spacers for the polyacrylamide gel (large gel) are 0.75 mm thick. Combs routinely have 16 wells for both gels.

gether. The bottom is sealed using tape or by pouring an agarose plug while the mold unit is held vertically in a stand with a central well.
2. Mix agarose, 10x stock of gel buffer (usually TBE or TAE, see Appendix), and distilled water. For example, to make 200 ml of a 1% gel, combine 2 g of agarose, 20 ml of 10x buffer, and make up to 200 ml with H2O. Mix the ingredients thoroughly in a flask and boil vigorously with intermittent swirling. If using a microwave, add a teflon-coated stir bar to avoid superheating. The preparation is ready when all of the particles have gone into solution. When cooking agarose (especially in a microwave oven), loss of water due to evaporation can be significant. Check the final volume, add water to replace that which has boiled away, and reheat briefly to ensure that the agarose is well mixed and dissolved. Molten agarose may be stored for several days at 70°C (or allowed to set at room temperature), or after sufficient cooling (when the flask is no longer too hot to handle; ≈50-55°C), can be poured into the vertical or horizontal mold. Pouring agarose that is too hot will crack the plates or warp the plexiglass mold.
3. Pour the slightly cooled agarose into the level mold. For horizontal gels, the comb should be in place prior to pouring. For vertical gels, insert the comb immediately after pouring and fix it in place by clamping the comb to the back plate. Let the gel set until it is cool to the touch and opaque.
4. Carefully remove the comb to prevent tearing of the wells or the teeth separating them. For vertical gels, squirting a small amount of buffer between the gel and each tooth of the comb often helps. Remove the tape from the mold, place the gel in the rig and submerge in buffer to prevent the gel from desiccating. For vertical gels, squirt molten agarose between the plate and rig before clamping together to provide a good seal against buffer leakage. The gel is now ready to use, or it may be kept as is for at least 1 day, as long as the wells remain immersed in buffer.

Prior to electrophoresis, wells should be tested by preloading with dilute (1x) running dye (Appendix) and electrophoresed for several minutes. In the case of vertical gels, thin layers of agarose need to be removed from the wells manually (Hamilton syringes work well for this) and by gentle rinsing.

Connect the electrical leads to the gel apparatus. DNA migrates to the anodal (positive) pole, therefore the wells should be closest to the cathodal pole (for vertical gels, anode at the bottom, cathode at the top).

Add 1/5 volume of loading solution (Appendix) to the sample (which should already be end-labeled if necessary). A size standard (e.g., HindIII- or AvaI/BglII-digested λ bacteriophage DNA) must be included on each gel. For analysis of minisatellites, an internal size marker revealed by hybridization can be included in each lane (Burke et al., 1991).

Using a Hamilton syringe or adjustable micropipettor, load each sample into the well, splitting samples between the agarose and acrylamide gels if both are used. Fragments are best resolved using low voltages (1.0-1.5 V/cm), although much higher voltages (≈10 V/cm) sometimes are used for rapid running of minigels. Full-length (20 cm) agarose gels are usually run overnight. Electrophoresis typically is stopped when the dye front (equivalent to ≈500 bp in a 1% gel) has reached the end of the gel. The gel mold is removed from the apparatus and the gel is treated to visualize the fragments (see "Staining" and "Gel Drying," below).
POLYACRYLAMIDE GELS Polyacrylamide gels are prepared at varying concentrations (typically 3.5-6.0%) and are used for visualizing small fragments (<1000 bp). Unlike agarose, polyacrylamide gels are run only vertically (Figure 14), and transfer from polyacrylamide gels to a hybridization filter must be done electrophoretically (Church and Gilbert, 1984; Kreitman and Aguadé, 1986). The following instructions are for the long (44.5 cm) 4% gels used for electrophoresis of end-labeled RE products; the procedures are similar for the short 6-8% gels used for analysis of digested PCR products.

CAUTION: Unpolymerized acrylamide is a cumulative neurotoxin and must be handled with extreme care. Always use gloves, and handle the powder in a fume hood while wearing a face mask.

1. Wash plates (one notched as for agarose gels) with ethanol. If the gel consistently sticks to both plates, spread silane (Appendix) on the top (notched) plate. An appropriate substitute is water-repellent for windshields. This product is much easier to use and can be purchased at many auto parts stores for a fraction of the cost of silane.
2. Place spacers between the plates and clamp tightly in place. Tape the bottom of the plate to complete the mold.
3. Wearing gloves, mix the appropriate amounts of bis:acrylamide, buffer, and distilled water (Appendix) in a flask.
4. Add 10% ammonium persulfate and TEMED to the mixture, mix by swirling, and immediately pour between glass plates. While pouring, make sure that no large bubbles form, as these will interfere with migration of fragments. When the mold is full, lay flat on a raised surface. Insert comb approximately 1-2 cm into the gel (depending upon sample volume to be loaded) and fix by clamping the two plates over the comb with a large binder clip. This minimizes the amount of polymerized acrylamide in the wells.
5. After 40-60 min, carefully remove the comb from the gel. If the walls of the wells break, they usually may be put back in place and still provide a good barrier between lanes.
6. Place the gel in the apparatus (Figure 14) as for vertical agarose gels. Fill the buffer tanks to prevent desiccation. The gel may be stored as is overnight or used immediately.
7. The remaining steps are as described for agarose gels, using a standard with fragments of appropriate size (e.g., HaeIII-digested φX174 RF DNA). Polyacrylamide gels can be run in a minimum of 4 hr or overnight. However, application of strong current (typically >300 V for a long gel) can severely distort the migration front due to differential heating. When the dye front has migrated the appropriate distance (28 cm for a 40 cm 3.5% gel), the gel mold is removed from the apparatus. The gel is then treated accordingly (see below).

Part B. Gel Drying and Autoradiography
When using gels to separate end-labeled fragments, it is best to dry the gels to a piece of chromatography paper (Whatman 3MM) before autoradiography. Dried gels are easier to handle and the fragment patterns much sharper.

CAUTION: For 32P or 35S end-labeled fragments, the solution in the bottom tank of the gel rig contains the unincorporated nucleotides. Therefore, this solution is highly radioactive, requiring caution in handling and proper disposal.

1. Remove the gel mold from the apparatus. Remove a side spacer and carefully split the top (notched) plate away from the bottom with a spatula. For polyacrylamide gels, the gel will sometimes stick to both plates. The gel can be removed from either plate by gently squirting with water as the plates are separated.
2. For agarose gels, gently rinse the exposed side of the gel with water to remove excess nucleotide and reduce background contamination. Drain by tilting, allowing the water to run off. Excess water should be removed by gentle blotting with an absorbent wipe. This procedure is not typically necessary for polyacrylamide gels, but if performed, do not blot the gel dry as the gel will stick to the wipe.
3. Remove the gel from the glass plate by adhesion to the filter paper.
4. Rinse and blot the opposite side of the gel as previously described (step 2) and place a second piece of filter paper the same size as the first beneath the gel and filter paper. Cover the gel with plastic wrap, trim the plastic wrap and filter paper to the size of the gel, and place in the gel dryer. Apply vacuum and turn on heat. 1.5-mm thick vertical gels usually dry in 30-45 min; thicker, horizontal gels take considerably longer. The gel is now fixed to the top piece of filter paper.
5. Remove plastic wrap and extra filter paper and dispose of as radioactive waste. Load the dried gels and film into an autoradiograph cassette. The number of intensifying screens to be used is determined by monitoring the gel with a Geiger counter (for 32P labeling) and past experience. At -70°C, intensifying screens enhance the intensity of the image (including the background contamination) by a factor of four (one screen) to ten (two screens). However, the use of two screens reduces the crispness of image. If one intensifying screen is used, the orientation is intensifying screen (shiny side up), film, and the dried gel (gel side towards film). If two screens are used, the orientation is intensifying screen (shiny side up), film, intensifying screen (shiny side down), and the gels (gel side facing the film).
6. After exposure for the appropriate length of time (dependent upon the amount of DNA labeled, the efficiency of the labeling reaction, age and type of nucleotide used [32P or 35S], and the number of intensifying screens), the autoradiograph is developed, fixed, and allowed to dry.

method for silver staining is provided by Bassam and Caetano-Anollés (1993).

Protocol 5: Staining with Ethidium Bromide
(Time: 30-45 min)

Fragments may be visualized using UV fluorescing dyes such as ethidium bromide (EB, a powerful carcinogen) which bind to the DNA molecule. This is used to observe RFLPs where large amounts of purified or amplified sequence are available (e.g., Figure 5). Staining is also an important step in the transfer-hybridization method (Protocol 8). The method below is used to stain gels after electrophoresis. Alternatively, EB can be included in the gel mix or added to the electrophoresis buffer. The EB solution needs to be disposed of according to regulations for carcinogenic compounds and should be replenished every 1-3 days depending on usage.

1. Trim the gel (e.g., at slots and 4 cm below the bromophenol blue) and place on an acrylic plastic sheet.
2. Stain gel in 500 ml distilled H2O with 0.5 µg/ml EB for 10-20 min. Shake gently. Pour off EB solution and rinse for 1 min in distilled H2O.
3. Shake gel in second rinse of distilled H2O for 5-30 min to remove excess EB from gel.
4. Photograph gel (using a Polaroid™ camera or other instant visualization system) with a plastic ruler next to the size marker. If the photograph is to be enlarged or published, save a negative (for Polaroid™ negatives: wash with water, then sodium sulfite, then rinse with water).
enabling more digests per sample. 32P-labeled dNTPs are used most frequently because their high-energy emission results in relatively short exposure times for autoradiography. The alternative, 35S, has a longer half-life (half-life of 87 days compared to 14 for 32P) and produces crisper images, but requires much longer exposure times and contamination is more difficult to detect in the laboratory, requiring swipes and scintillation counts. Where several different REs, each with its own type of end, are used, it is simplest to use all four radiolabeled dNTPs for end-labeling.

The reaction uses the large (Klenow) fragment of DNA polymerase I which has 5' → 3' polymerase and 3' exonuclease functions (see "Methods of Detection"). The polymerase function is far more active than the 3' exonuclease. Labeling generally is carried out at room temperature or at 4°C. However, fragments with blunt ends or 3' overhangs (Table 2) are best labeled at 37°C to maximize the exonuclease activity. Under these conditions, randomly sheared fragments also may be labeled, thereby increasing background. This can be reduced by adding only the first nucleotide to be inserted (e.g., for RsaI digests, just add 32P-dTTP).

1. Prepare a labeling mix to be added to each sample. This consists of 10x label buffer (Appendix), radioactive dNTPs, the large (Klenow) fragment of DNA polymerase I, and distilled water. The amount of 10x label buffer added must take into account the volume of the digests as well as the labeling mix itself. For example, if 5 µl of label mix is to be added to each of 16 tubes (e.g., 15 digested DNA samples and a size standard) which already contain 10 µl of digest, the total volume, including an aliquot for pipetting error, is 17 x 15 = 255 µl. For this example, this mix would include:

25.5 µl 10x label buffer
5 U (≈0.25 U/sample) Klenow polymerase
2 µl of 800 Ci/mM α32P-dNTPs (@ 0.5 µl = 5 µCi each)
ddH2O to 85 µl (i.e., 17 aliquots @ 5 µl each)

2. Add 5 µl of label mix to each sample and leave at appropriate temperature (see above) for 20-30 min.
3. Add 1/5 volume of loading dye to each sample. This can be mixed by vortexing or by gentle aspiration in the Hamilton syringe during loading.
4. Load samples into wells, splitting each between agarose and acrylamide gels.

Protocol 7: Primer Labeling for Microsatellite Analysis
(Time: 40 min)

Microsatellite loci typically are analyzed via PCR and new primers should be designed and optimized as described in Chapter 7. A protocol for cloning microsatellite loci to determine appropriate primer sequences is given in Chapter 9. The PCR products are electrophoresed through denaturing polyacrylamide gels (as used for sequencing; see Chapter 9) and are best visualized by radioactive labeling, or, with an automated sequencing apparatus, using fluorescence. The protocol below is for preparing radiolabeled primers that can be used in combination with cold primers in the PCR reaction. The same method is used to prepare primers for cycle sequencing (Chapter 9).

The protocol requires γ-labeled ATP. Either γ-33P-dATP (1000-3000 Ci/mmol) or γ-32P-dATP (3000 Ci/mmol) nucleotides are used; γ-35S-dATP is not recommended because of the reduced efficiency of polynucleotide kinase (PNK) with this isotope.

1. Calculate volumes for an end-labeling reaction as follows: for each PCR reaction, select one primer to label and use a mix of 3:1 unlabeled to labeled primer (the second primer is unlabeled). Note that the labeling reaction will result in a primer stock at 0.1x the original concentration. Calculate the amount of labeled primer required for the number of PCR reactions and proceed with the labeling reaction.
2. For example, to prepare 10 µl of labeled primer, combine the following ingredients in an 0.5-ml microcentrifuge tube:
1.0 µl primer (10 µM) (i.e., 10 pmoles)
1.5 µl γ-labeled dATP (i.e., 10 pmoles)
1.0 µl 10x T4 polynucleotide kinase buffer
0.625 µl T4 polynucleotide kinase (i.e., 5 U)
5.875 µl ddH2O

Mix, then incubate at 37°C [...], then denature the [...] at 65°C [...] min. Note: if using γ-35S-dATP, use 20 U PNK and incubate for much longer (e.g., 4 hr) at 37°C. Spin briefly to [...] any [...]. End-labeled primers can be stored at -20°C for as long as one month and can be used directly without any further preparation.

3. To use the labeled primer in a set of PCR reactions, for example 20 reactions each of 6.25 µl, prepare the following master mix, using precautions against cross contamination (Chapter 7):

64.6 µl ddH2O
2.5 µl 10 mM dNTPs
2.5 µl 10x Taq polymerase buffer
7.5 µl 25 mM MgCl2 (adjust as necessary)
4.0 µl 10 µM unlabeled primer 1
3.0 µl 10 µM unlabeled primer 2
10.0 µl 1 µM labeled primer 2 (from step 2)
20 µl DNA template (i.e., 1 µl per reaction)

Aliquot, add DNA template (including negative control) and proceed with thermal cycling. It is efficient to use multiwell trays for the reactions. PCR products can be stored at -20°C prior to electrophoresis on denaturing acrylamide gels.

Protocol 8: Transfer Hybridization
(Time: Part A: 3 hr to overnight; Part B: 1 hr to overnight; Part C: 6-24 hr; Part D: 2 hr plus exposure time)

This method consists of five basic steps that follow digestion and electrophoresis on agarose gels: (1) transfer of the DNA from the gel onto a filter; (2) prehybridization of the filter; (3) labeling the probe DNA, making it single-stranded, and hybridization; (4) washing the filter; and (5) autoradiography. Significant variations include transfer by vacuum instead of capillary action ("vacuum blotting"), transfer under alkaline conditions (Reed and Mann, 1985), production of radioactive probes by random priming instead of nick translation (Feinberg and Vogelstein, 1983), and modifications of the stringency of hybridization and washing (see Hames and Higgins, 1986). For some lower-sensitivity applications (e.g., excluding dextran sulfate from hybridization mix) or some types of membranes (e.g., non-charged membranes), the prehybridization step can be omitted without a substantial increase in background.

The most important aspect of hybridization is determining the appropriate conditions (stringency). When using heterologous probes, some base-pair mismatches must be permitted, with the amount of mismatch required dependent upon similarity of probe and target DNAs. Stringency can be reduced by lowering temperatures, by increasing salt concentration, and by decreasing formamide concentration. For more precise description of manipulation of these parameters, see Sambrook et al. (1989) and Hames and Higgins (1986).

Part A. Transfer of DNA to the Membrane
The electrophoresed fragments are made single-stranded by alkaline treatment and are then transferred in the same orientation from the gel to a binding membrane to which they are bound.

1. After agarose electrophoresis, stain the gel with EB and photograph with a ruler next to the size marker to allow fragment sizes to be determined from the final autoradiograph. Trim the gel to minimum size, slicing at the origin and 1-2 mm from the outside DNA lanes. For RE assays of mtDNA, cut the bottom at the 150-200 bp position. For minisatellites, gels are run until fragments of ≈2 kb are at the bottom of the gel. For large-scale survey work, the sizes of gels are calculated so that two (or sometimes three or four) fit precisely onto a single piece of film: e.g., two 20 cm x 12.5 cm gels will result in membranes that can be exposed together on a standard size (20 cm x 25 cm) piece of film.
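
Two pieces of arithmetic from the labeling protocols above lend themselves to a short check: the label mix of Protocol 6, step 1 (one extra aliquot for pipetting error, 10x buffer scaled to the whole reaction volume) and the 3:1 unlabeled-to-labeled primer mix of Protocol 7. The sketch below is illustrative only. Function names are hypothetical, and the primer concentrations are read as 10 µM for the unlabeled stock and 1 µM for the labeled stock, which is an interpretation of the garbled units in the scanned text.

```python
def label_mix(n_tubes, digest_ul=10.0, mix_per_tube_ul=5.0, extra_aliquots=1):
    """Scale the end-labeling mix, adding spare aliquots for pipetting error."""
    aliquots = n_tubes + extra_aliquots
    # 10x buffer must cover digest plus label mix in every tube
    total_reaction_ul = aliquots * (digest_ul + mix_per_tube_ul)
    return {
        "aliquots": aliquots,
        "total_reaction_ul": total_reaction_ul,
        "label_buffer_10x_ul": total_reaction_ul / 10,
        "mix_volume_ul": aliquots * mix_per_tube_ul,
    }


def primer_volumes(total_pmol, unlabeled_to_labeled=3.0,
                   unlabeled_uM=10.0, labeled_uM=1.0):
    """Split one primer into unlabeled and labeled stock volumes (ul).

    1 uM equals 1 pmol/ul, so pmol divided by uM gives ul.
    """
    labeled_pmol = total_pmol / (unlabeled_to_labeled + 1)
    unlabeled_pmol = total_pmol - labeled_pmol
    return {
        "unlabeled_ul": unlabeled_pmol / unlabeled_uM,
        "labeled_ul": labeled_pmol / labeled_uM,
    }


print(label_mix(16))        # 16 tubes, as in the Protocol 6 example
print(primer_volumes(40))   # 40 pmol of primer 2 in total
```

Under these assumptions the second call reproduces the 3.0 µl of unlabeled and 10.0 µl of labeled primer 2 listed in the master mix, and 40 pmol matches the 4.0 µl of 10 µM primer 1.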
Figure 15 Setup of transfer according to the two-sided, dry-blot method.
4-cm paper towel trimmed to fit dry milk powder to distilled H 2 0 to nearly
plexiglass or glass plate weight

Any bubbles or creases will cause uneven transfer and should be removed by rolling with a Pasteur pipette or other suitable cylindrical object. Allow transfer to proceed for 3 hr to overnight.

12. Disassemble the gel blot, taking care to mark the filters at any spots where they are to be cut. Also mark the filter to define orientation relative to the gel.

13. Shake membrane in 2x SSC for 10 min.

14. Air-dry for 30-120 min on filter paper. Some types of membrane may require further drying in a vacuum oven for 30 min at 80°C to irreversibly bind the single-stranded DNA fragments to the membrane. Alternatively, DNA can be cross-linked to the membrane by exposure to ultraviolet light, with the necessary energy being approximately 1.200 J. Note, however, that overexposure can result in reduced hybridization efficiency (Reed and Mann, 1985).

15. Trim off any extra filter outside of the desired image. Remember, filters must be an appropriate size for autoradiography. Slice filters into smaller strips as necessary. Store new filters at room temperature or 4°C until needed for hybridization.

Part B. Prehybridization of the Filter

1. Wet filters in 500 ml 2x SSC for 5 min.

2. If appropriate, remove probe from the previous hybridization by shaking the filter in 500 ml of boiling 0.1x SSC for 3-5 min. New membranes should be washed in 500 ml of 0.1x SSC, 0.5% SDS at 65°C for 1 hr to minimize background on subsequent hybridizations.

3. Remove excess solution from membranes and place in a heat-sealed plastic bag.

4. Prepare the hybridization solution (4x SSC, 1% SDS, 0.5% nonfat dry milk; or alternative, e.g., Church and Gilbert, 1984). First add the
full volume and mix gently, then add SDS (from 20% stock solution, millipore-filtered), and finally add the SSC (20x stock solution, also millipore-filtered). Cover flask and heat for 2 hr in a 65°C H2O bath. One or two filters in a plastic bag can be prehybridized (and hybridized) in 10 ml of solution. For large scale experiments, 100 ml of solution will suffice for approximately 20 filters (each 12.5 x 20 cm) hybridized together in a plastic tub or in plastic bags, either individually or in small groups.

5. For tubs, add the hybridization solution and cover with a lid. For bags, add the hybridization solution, carefully remove any bubbles, and heat-seal 2-3 cm from the edge of the filter. This will leave room for additional sealing after adding the probe. Place all bags together in a single plastic tub with lid.

6. Shake gently for 2 hr to overnight at 65°C.

Part C. Labeling of Probes and Hybridization

Here we describe nick-translation (Rigby et al., 1977), one of several methods available for labeling DNA for use as probe in transfer hybridization. Random priming of DNA (Feinberg and Vogelstein, 1983) also is commonly used (this reaction is best performed using commercially available kits). It also should be possible to generate large quantities of labeled probe by incorporating α32P-dNTPs into PCR reactions. In any case, unincorporated nucleotides may be removed using steps described below (4-8) or any commercially available spin columns.

1. Prepare 10 µl of nick-translation buffer cocktail per reaction:
   3.0 µl 10x nick-translation NT buffer (Appendix)
   0.5 µl DNA Polymerase I (10 U/µl)
   1.0 µl each of 5 mM dTTP, dATP, dGTP
   1.0 µl α32P-dCTP (i.e., 20 µCi of 3000 Ci/mM stock)
   1.0 µl DNase I (0.1 µg/ml stock)
   3.5 µl ddH2O
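As a worked illustration of the hybridization solution in Part B, step 4 (4x SSC, 1% SDS, 0.5% nonfat dry milk, made from 20x SSC and 20% SDS stocks), the stock volumes follow from C1 x V1 = C2 x V2. The sketch below is only an illustration: the function names are hypothetical, the milk percentage is assumed to be weight per volume, and the volume contributed by the milk powder is ignored.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed to reach final_conc in final_volume_ml (C1*V1 = C2*V2)."""
    return final_conc * final_volume_ml / stock_conc


def hybridization_mix(final_volume_ml):
    """Amounts for 4x SSC, 1% SDS, 0.5% nonfat dry milk (hypothetical helper)."""
    ssc_ml = stock_volume_ml(4, 20, final_volume_ml)   # 4x SSC from 20x stock
    sds_ml = stock_volume_ml(1, 20, final_volume_ml)   # 1% SDS from 20% stock
    milk_g = 0.5 * final_volume_ml / 100               # 0.5% assumed to be w/v
    water_ml = final_volume_ml - ssc_ml - sds_ml       # powder volume ignored
    return {"ssc_20x_ml": ssc_ml, "sds_20pct_ml": sds_ml,
            "milk_g": milk_g, "water_ml": water_ml}


if __name__ == "__main__":
    # 100 ml is the volume the protocol suggests for about 20 filters.
    print(hybridization_mix(100))
```

For 100 ml this gives 20 ml of 20x SSC, 5 ml of 20% SDS, 0.5 g of milk, and 75 ml of water; the same helper scales to the 10 ml used for one or two filters in a bag.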
302 Chapter 8 / Dowling, Moritz, Palmer & Rieseberg
DNA at one end only and then to generate a series of partial digests either by varying digestion time or by serial dilution (Danna, 1980; Ausubel, 1987). Hillis et al. (1992) used a modification of this method for generating restriction maps of the T7 phage lineages used in their experimental phylogenies. In their analysis, partial digestion products were visualized by hybridization with synthesized oligonucleotide probes specific to each end of two conserved fragments (four in all).

Multiple digestion experiments compare the fragment patterns of REs used alone and in combination. This enables the cleavage sites of the different REs to be located relative to one another (Figure 16). This approach works well for sequences of up to 30 kb (e.g., animal mtDNA and nuclear rDNA) and for REs that make relatively few cuts.

The third approach, sequential hybridizations (e.g., Figure 9), is particularly useful for large (>30-kb) sequences such as entire cpDNAs (Palmer, 1982, 1986a), although it also can be used to map cleavage sites in smaller sequences (e.g., Sites and Davis, 1989). The basic strategy is to sequentially hybridize a series of radioactive fragments ("probes"), which together make up the entire sequence, to fragments produced by single and double digests after the latter have been transferred to nylon membranes. Adjacent fragments will hybridize to the same probe whereas physically separated fragments will not. By hybridizing to both single and double digests, it is possible to deduce the order of fragments produced by each RE and the relative location of cleavage sites for different REs (see Palmer, 1982). Methods for double digesting and sequential hybridization are given below.

Part A. Double Digestion Experiments

1. Determine which samples and REs need to be mapped. REs typically are selected by cost, expected number of cleavage sites, and compatibility of buffers in double digests. To start, the fragment pattern for each RE is compared across representative samples (e.g., one per locality). Note that not all samples will need to be mapped for all sites.

2. Perform double digests (Protocol 3) for the reference sample to map all sites relative to each other. Typically, the best sample to use as a reference is the one that exhibits the most cleavage sites, as this maximizes the ability to infer losses in other samples. One strategy is to begin with all pairwise combinations of three to five REs that cleave only a few sites (i.e., up to three) and which cut in at least two buffers. Fragments are end-labeled (Protocol 6), separated by electrophoresis through agarose and polyacrylamide gels (Protocol 4), and visualized by autoradiography (Protocol 4).

As an example, consider a hypothetical circular molecule of 10 kb. Assume that this molecule was digested with enzymes A, B, C, and D (see Table below) in all possible single- and double-digest combinations, and that fragments were separated by electrophoresis along with appropriate size standards. Migration distance for each fragment was measured and the number of fragments found in each lane was noted. Fragment sizes were estimated from calibration curves generated by plotting migration distances of the size-standard fragments against their known fragment lengths (Figure 3).
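The calibration step described for the example gel can be sketched in code. The text only says that migration distances of the size standards are plotted against their known lengths; the version below assumes, as a common approximation, that log10(size) is roughly linear in distance between neighboring standards. The function name and the standard values in the usage example are hypothetical, not data from the book.

```python
import math


def estimate_size_kb(distance_mm, standards):
    """Estimate fragment size from migration distance.

    standards: (distance_mm, size_kb) pairs read from the size-standard lane.
    Interpolates linearly in log10(size) between the two bracketing standards
    (an assumed calibration form, not the book's stated method).
    """
    points = sorted(standards)
    if not points[0][0] <= distance_mm <= points[-1][0]:
        raise ValueError("distance lies outside the range of the standards")
    for (d0, s0), (d1, s1) in zip(points, points[1:]):
        if d0 <= distance_mm <= d1:
            frac = (distance_mm - d0) / (d1 - d0)
            log_size = math.log10(s0) + frac * (math.log10(s1) - math.log10(s0))
            return 10 ** log_size


if __name__ == "__main__":
    # Made-up standards: a 10-kb band at 20 mm and a 1-kb band at 40 mm.
    made_up_standards = [(20.0, 10.0), (40.0, 1.0)]
    print(round(estimate_size_kb(30.0, made_up_standards), 2))
```

Fragments that migrate beyond the smallest or largest standard are rejected rather than extrapolated, since the calibration is only supported within the range of the standards.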
Gel lane          1     2     3     4     5     6     7     8     9     10
Enzymes           A     A-B   A-C   A-D   D     B     B-D   B-C   C     C-D
Number of sites   2     3     4     3     1     1     2     3     2     3
Sizes (kb)        6.0   4.0   3.5   6.0   10    10    5.5   5.0   6.5   6.0
                  4.0   4.0   3.0   2.5               4.5   3.5   3.5   3.5
                        2.0   2.5   1.5                     1.5         0.5
                              1.0
Estimated size    10    10    10    10    10    10    10    10    10    10
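One way to check a candidate map against a table like this is to simulate the digests. The sketch below cuts a circular molecule at a set of site positions and returns the fragment sizes. The function name is hypothetical, and the site positions are a reconstruction chosen to be consistent with the table (B at 0; C at 1.5 and 5.0; D at 5.5; A at 4.0 and 8.0); they are not copied from the book's diagrams, and the mirror-image map fits equally well.

```python
def digest_circular(length_kb, *site_lists):
    """Fragment sizes (largest first) after cutting a circle at all given sites."""
    sites = sorted({s % length_kb for site_list in site_lists for s in site_list})
    if not sites:
        return [length_kb]  # an uncut circle stays in one piece
    # Pair each site with the next one around the circle; a single site
    # pairs with itself and yields the full-length linear molecule.
    fragments = [(b - a) % length_kb or length_kb
                 for a, b in zip(sites, sites[1:] + sites[:1])]
    return sorted((round(f, 3) for f in fragments), reverse=True)


# Assumed site positions (kb) on the 10-kb example molecule.
SITES = {"A": [4.0, 8.0], "B": [0.0], "C": [1.5, 5.0], "D": [5.5]}

if __name__ == "__main__":
    for combo in ["A", "A-B", "A-C", "A-D", "D", "B", "B-D", "B-C", "C", "C-D"]:
        enzymes = combo.split("-")
        print(combo, digest_circular(10.0, *(SITES[e] for e in enzymes)))
```

Every lane produced this way sums to 10 kb, which is the same consistency check the text applies to the measured fragments.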
Nucleic Acids III: Analysis of Fragments and Restriction Sites 305
Molecule size and number of fragments should be consistent across each lane and predictable from single digests. For example, enzymes A and B cut twice and once, respectively; therefore, the A-B double digest (lane 2 in the table) should contain three fragments that sum to 10 kb. Fewer fragments than expected may be observed in two situations: (1) when fragments comigrate, or (2) when sites are very close together. Comigration would be indicated if the estimated size of the molecule is smaller than expected (with the size difference the same as the comigrating fragment). In addition, the comigrating fragment will be more intense than other fragments in the same lane. When sites are close together, the fragment produced may be so small as to migrate off the gel. In this situation, the molecule size will be approximately the same as expected, and double digests with other enzymes will place the two sites proximate to each other.

To construct the map from the measured fragment sizes, select enzymes that cleave the fewest times and progressively add enzymes that exhibit more complex fragment patterns. For our example gel from above, we would start with enzymes B and D, thereby producing the following map (sizes are not to scale):

[Map of B and D sites]

Note that location of the zero point (in this case, the cleavage site for enzyme B) and orientation

The correct orientation is determined by examination of the A-D double digest relative to the map produced for B and D. Upon incorporation of enzyme D, the two possible alternatives are:

[Orientation 1 and Orientation 2 diagrams]

Based on fragment sizes obtained from the A-D double digest, the correct alignment of sites for enzymes A, B, and D is provided by orientation 1.

The sites for enzyme C then can be mapped. Enzyme B cleaves the 6.5-kb C fragment into two pieces, 1.5 kb and 5.0 kb. When mapping C relative to B, the two possible alternatives are:

[Orientation 1 and Orientation 2 diagrams]

Again, the correct orientation is determined by incorporating enzyme D in the two alternatives:

Orientation 1
B --1.5 kb-- C --3.5 kb-- C --0.5 kb-- D --4.5 kb-- (B)
Orientation of sites and distances among them is verified by looking at the fragment sizes produced by the A-C double digest.

3. The map for this reference sample is completed by adding more enzymes, using the three previously mapped enzymes as a backbone for construction of additional maps. Choice of backbone REs is determined by spacing of sites (evenly spaced is best) and buffer conditions. To continue the example from above, such a gel might look like:

Lane
(1) B       (7) A-G              (12) C
(2) B-E     (8) A-E              (13) C-F
(3) B-F     (9) size standard    (14) C-G
(4) B-G     (10) A-F             (15) C-E
(5) G       (11) F               (16) E
(6) A

The three new enzymes (E, F, and G) would be mapped independently relative to the backbone enzymes (A, B, and C), essentially following the logic applied above.

4. This procedure is performed until all enzymes are mapped. Where REs produce complex fragment patterns (i.e., exhibit 5 or more cleavage sites), several mapping gels may be necessary for accurate map construction.

5. Once the reference map is completed for all enzymes, relative positions of closely spaced sites (typically less than 100 bp) are tested by selected double digests. In the example above, one could test the order of sites A-C-D using side-by-side comparison of A-C and A-D double digests. This order is verified by the smaller size of the A-C fragment of interest relative to homologous piece produced by A-D (1.0 versus 1.5 kb).

6. Strategy for mapping additional samples depends upon the extent of divergence.

a. Where samples differ by only a few site changes, maps are readily generated with few additional digests. Single-site gains usually can be placed with only one double digest. Where fragment patterns are identical or differ by one or two site losses from the reference sample, sites can be fully mapped with no further digests. When a pair of fragments unique to the pattern in one sample sum in size to a third, larger fragment unique to the pattern found in a second sample, those two fragments must be adjacent in the first genome. There are two weaknesses to inferring site changes by this approach. First, it is not possible to discriminate between small length mutations and site mutations near the end of a fragment, although this problem can be alleviated to a large extent by using polyacrylamide gels to visualize small fragments (Figure 17). Second, for circular molecules such as animal mtDNA, it is difficult to differentiate samples with no sites for a particular enzyme (but some molecules linearized by damage) from those with one site. In addition, different samples cleaved once but in different positions will be indistinguishable. This is not a problem for linear genomes because the number of sites is one less than the number of fragments, and samples which are uncut will exhibit only a single fragment. The only way to alleviate these problems for circular DNAs is to directly map the sites in question. In practice, all samples suspected of possessing zero or one sites for a particular RE must be critically reexamined using an appropriate double digest.

b. Each sample that shows extreme differences (fragment patterns not interpretable in terms of single site gains or losses, typically >3-5% sequence divergence) must be mapped separately.

c. For intermediate haplotypes, some sites may be inferred relative to the reference map (as described above) while others (e.g., REs that exhibit single sites or inferred site gains relative to the reference sample) will require placement by double digestion.

7. After site maps have been generated for all samples, it is essential to test for site homology across samples. In situations where sites appear fairly close (within 100 bp), they often are assumed to be homologous, especially if they have been mapped with small fragments resolved on polyacrylamide gels. When sites in different samples appear to be within 100-500 bp of each other or enzymes produce small fragments, site homology across samples must be tested in side-by-side comparisons.
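The rule in step 6a (two fragments unique to one sample that sum to a larger fragment unique to a second sample) can be sketched as a simple comparison of fragment lists. This is a minimal illustration, not the authors' procedure: the function name and the size tolerance are assumptions, and the example patterns are partial, hypothetical lists built around the 14.2-kb versus 9.0 + 5.2-kb BstXI values discussed later in the chapter.

```python
from itertools import combinations


def infer_site_changes(pattern_1, pattern_2, tol=0.05):
    """Return (small_a, small_b, large) triples consistent with one site change.

    Fragments shared by both patterns (within tol kb) are set aside first;
    then every pair of fragments unique to pattern_1 is tested against each
    fragment unique to pattern_2.
    """
    unique_1, unique_2 = list(pattern_1), list(pattern_2)
    for frag in list(unique_1):
        match = next((g for g in unique_2 if abs(frag - g) <= tol), None)
        if match is not None:  # shared fragment: remove one copy from each side
            unique_1.remove(frag)
            unique_2.remove(match)
    return [(a, b, large)
            for a, b in combinations(unique_1, 2)
            for large in unique_2
            if abs(a + b - large) <= tol]


if __name__ == "__main__":
    sample_with_site = [9.0, 5.2, 1.5]   # hypothetical partial pattern
    sample_without_site = [14.2, 1.5]    # hypothetical partial pattern
    print(infer_site_changes(sample_with_site, sample_without_site))
```

As the text cautions, a match of this kind cannot by itself separate a site change near the end of a fragment from a small length mutation, so it is a screening step before mapping rather than a replacement for it.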
Figure 17 RE fragment patterns and maps of HindIII mtDNA haplotypes from Luxilus cornutus (types A and B) and L. chrysocephalus (type C). Panels (A) and (B): agarose and polyacrylamide gels, respectively, with haplotypes designated by letters A-C; the size standard (S) is a combination of λ bacteriophage DNA digested with HindIII and φX174 RF DNA digested with HaeIII. Panel (C): network of haplotypes showing RE sites and changes (maps from Dowling et al., 1992). Whereas the site loss differentiating pattern B from A (at position 2.2) could be inferred from a complete map of A and knowledge of the fragment changes (panel A), the converse (i.e., inferring the site gain in A relative to a complete map of B) is not possible. Small fragments (such as the 450-bp fragment in haplotypes A and B produced by RE sites at 7.5 and 8.0), are best examined and mapped using polyacrylamide gels (e.g., panel B).

Fragment size (kb):   X --2.0-- Y1 --0.3-- Y2 --1.0-- Z

           X    Y1   Y2   Z     Fragment sizes (kb)
Taxon 1    +    +    +    +     2.0   1.0
Taxon 2    +    +    -    +     2.0   1.3
Taxon 3    +    -    +    +     2.3   1.0
Taxon 4    +    -    +    +     2.3   1.0

Sizes of X-Y1 fragments would indicate that site Y1 is found in taxa 1 and 2 but not 3 and 4. Likewise, comparison of Z-Y2 would indicate that only taxa 1, 3, and 4 share site Y2.
Figure 19 Effects of a heteroplasmic 3.9-kb deletion in Cnemidophorus mtDNA. (A) Comparison of end-labeled fragments of the standard 17.6-kb mtDNA (S) and the heteroplasmic 17.6/13.2-kb sample (D). The two types of genome in the heteroplasmic sample are present in approximately equal quantities. The first pair of lanes are partial digests showing two size classes of relaxed circles and linear molecules in the D sample. The other eight pairs of lanes are digests with: (2) BamHI, (3) SacII, (4) BclI, (5) EcoRV, (6) NciI, (7) XbaI, (8) EcoRI, and (9) AvaI. The bars indicate fragments from which 3.9 kb was deleted in the 13.2-kb genome. The size marker is λ DNA digested with HindIII. (B) Map of the location of the deletions (3.9 and 0.5 kb) in relation to the S genome cleavage map. For REs that have no site within the deletion, one fragment simply gets smaller (e.g., lanes 2, 3, 7). For REs that cut within the deleted region, two fragments are missing and have been replaced by a larger fragment equal to their sum minus the deletion (e.g., lane 5, 10.7 and 6.7 replaced by 13 kb; lane 9, 10.0 and 3.0 replaced by 8.6 kb). Abbreviations: a = AvaI, b = BamHI, c = BclI, e = EcoRI, h = HindIII, l = SalI, n = NciI, p = PvuII, s = SacII, v = EcoRV, and x = XbaI. The saw tooth region indicates a set of small tandem repeats that have been reduced in copy number in the deletion genome.
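The arithmetic in the Figure 19 caption can be checked with a short calculation. The sketch below is only an illustration with a hypothetical function name. It assumes that both deletions (3.9 + 0.5 = 4.4 kb) lie between the two flanking sites in each of the quoted lanes; that assumption is a reading of the caption's numbers, not something the caption states.

```python
def fused_fragment_kb(frag_a_kb, frag_b_kb, deleted_kb):
    """Size of the fragment that replaces two fragments when the site between
    them is lost together with deleted_kb of sequence."""
    return round(frag_a_kb + frag_b_kb - deleted_kb, 2)


if __name__ == "__main__":
    total_deleted_kb = 3.9 + 0.5  # assumed: both deletions fall in the fragment
    print(fused_fragment_kb(10.7, 6.7, total_deleted_kb))  # lane 5 in the caption
    print(fused_fragment_kb(10.0, 3.0, total_deleted_kb))  # lane 9 in the caption
    print(round(17.6 - total_deleted_kb, 2))               # size of the deleted genome
```

Under that assumption the values come out at 13.0 and 8.6 kb for the two lanes and 13.2 kb for the shorter genome, in line with the sizes quoted in the caption.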
Figure 20 Contamination of CsCl-isolated mtDNA. Samples are mtDNAs isolated from Rutilus alburnoides and digested with HindIII and BamHI, with fragments visualized by (A) end-labeling and (B) transfer hybridization. Lanes: 1 = mtDNA preparation contaminated with nuclear DNA (background smear in panel A); 2 = mtDNA contaminated with exogenous DNA (extra fragments in both digests of panel A); 3 = standard mtDNA preparation; S = combination of λ DNA digested with HindIII and φX174 RF DNA digested with HaeIII. (Figure courtesy of M. J. Alves.)
(e.g., in Hyla [Pseudacris] crucifer mtDNA; Moritz et al., 1987). Incomplete digestion can be caused by technical artifacts (see below) such that not all molecules are cleaved at all sites, or by true heteroplasmy for a restriction site that results in large fragments equal to the sum of two or more smaller fragments present in complete digests (e.g., Hauswirth and Laipis, 1984; Bentzen et al., 1988; Gold and Richardson, 1990). A crude but simple test is to repeat the digest with a new batch of the RE or to digest with an excess of enzyme for extended time periods. Typically partial digestion produces a variety of larger fragments, whereas site heteroplasmy results in only one larger fragment. However, there also may be considerable heterogeneity among sites in their cleavage rate. A more sophisticated way to discriminate between partial digestion and site heteroplasmy is to clone from the sample (see Chapter 9) and test for the presence of both restriction types among the clones. Bands suspected of representing partial digests should be measured to ensure that they do indeed represent the sum of two or more smaller fragments. All too often, interesting phenomena such as duplications or deletions are missed because additional bands were assumed to be due to
ment, must extend past the end of the 10.6-kb probe by at least 5 kb. So too must one of the two smaller fragments (9.0, 5.2 kb) produced by site gain within the 14.2-kb fragment. Given their sizes and relative hybridization intensities, the only possible interpretation is that the 5.2-kb fragment is internal to the 10.6-kb probe and that the 9.0-kb one extends across its end. Similarly, in DNAs 6 and 8, the 6.3-kb fragment, which hybridizes only weakly with the 10.6-kb probe, must overlap the probe only slightly and the 2.7-kb fragment must lie internal to it. This logic established BstXI fragment orders of 1.5-5.2-9.0 in DNAs 1, 3, 5, and 7 and 1.5-5.2-2.7-6.3 in DNAs 6 and 8. The second way to establish these fragment orders is by using as hybridization probes fragments adjacent to the 10.6-kb fragment. One of these two fragments should hybridize to the mutated fragments, of 9.0 and 6.3 kb, that span the junction between it and the 10.6-kb probe.

DNA 2 differs from all other DNAs in Figure 21 in lacking the 1.5-kb BstXI fragment and featuring instead a fragment of 1.3 kb. In the absence of any other information one cannot distinguish between two alternative explanations for this fragment difference: (1) a deletion of 0.2 kb in this region in DNA 2 relative to all others, and (2) a site mutation occurring 0.2 kb from one end of the 1.5-kb fragment, with the additional 0.2-kb fragment in DNA 2 having gone undetected, probably being run off the end of the gel. Length mutations can be recognized by aligning the fragment maps constructed for each enzyme and observing correlated size changes overlapping the variable fragment in question (see above).

The general form of the site mapping analysis used in cpDNA studies consists of three steps:

1. Construct a reasonably complete map of the cpDNA of one of the taxa under study (the reference genome) for each enzyme used. Logical choices for the reference genome could be the one used as the source of cloned hybridization probes or one which has been completely sequenced (e.g., tobacco; Shinozaki et al., 1986) and for which computer-generated maps can be made easily. If such genomes do not fall within the study group then the choice of reference genome should be limited to taxa for which purified cpDNA is available, as this will facilitate the mapping effort. Alignment of each enzyme map for the reference genome is greatly aided by including on each enzyme gel or set of gels a double digest of the reference genome with the enzyme specific to that gel and an enzyme used in common in all the double digests. Draw the maps on two sets of sheets. On one sheet draw all of the aligned maps for the reference genome, one atop the other. Use this for step 3. Draw the reference map for a single enzyme on a second sheet. Include below this one-line map the aligned map of the clones used as hybridization probes. Use this set of maps for step 2.

2. Group the autoradiograms by enzyme and by order of probe. Using the single enzyme map sheets, draw the map for each taxon on a separate line above the reference map. The completeness of the mapping information needed will depend on the amount of variation detected. For divergent genomes, it may be necessary to write down each site and fragment size for the whole genome; for similar genomes simply writing the variable sites and fragments may suffice; for genomes of intermediate variability writing all the sites and just the variable sizes may be enough. This step will identify all clear-cut site mutations and all cases of ambiguity regarding length mutation/site mutations near ends of fragments.

3. Regroup the autoradiograms according to probe fragment. Analyze these together with the unified reference genome map to resolve length mutation/site mutation ambiguities as described in the preceding paragraph.

Troubleshooting

Problems encountered in RE analyses typically occur at any of three stages: (1) during DNA isolation and storage, (2) during digestion and electrophoresis, and (3) during transfer hybridization. Alternatively, inherent properties of the sequence studied could present problems. A brief list of problems that may be encountered and some solutions follow.
Problem: Weaker signals with time and reuse
Remedy: Don't wash filters excessively; use low-copy-number and more divergent probes first, high-copy-number and conserved probes last

Problem: Unequal hybridization efficiency to different fragments of the same digest
Remedy: A probe produced by random-priming (Feinberg and Vogelstein, 1983) may not be labeling randomly, so try a nick-translated probe; or a heterologous probe may not be binding equally well to all fragments, so use a probe more similar to the target DNA or reduce the stringency of wash conditions

Intrinsic Biological Problems

Problem: Contamination of sample with exogenous DNA (Figure 20)
Remedy: Characterize and exclude exogenous fragments when analyzing data

Problem: Heteroplasmy
Remedy: Characterize and account for variable-length fragments when analyzing data

Problem: Cross-hybridization to another genome or to repeated sequences within the genome (Figure 24)
Remedy: Switch probes; use subportions of the probe lacking the cross-hybridizing region; make sure probe is free of contaminating DNA

Problem: Distinguishing point mutations from small-length mutations
Remedy: Use both agarose and polyacrylamide gels to visualize all possible fragments

Problem: Comigrating, non-identical bands
Remedy: Establish homology by constructing restriction maps

Microsatellites

For the most part, interpretation of microsatellites as codominant alleles is straightforward. Samples are run adjacent to known sequences so that the precise size of each product can be determined. For most loci, there is considerable shadowing (e.g., Figure 2A) thought to be due to replication slippage during the PCR reaction, although somatic mutation is another possibility. Whatever the cause, the shadowing usually is very consistent across samples and between amplifications so that alleles can be reliably and repeatably scored. Phenotypes should be inspected carefully to identify heterozygotes for closely spaced alleles: e.g., if the typical pattern for a homozygote is a dominant band and two sub-bands with 2-bp spacing, a heterozygote will have two similar intensity bands followed by two sub-bands.

In some cases, we have observed more pronounced "stuttering" and/or weaker amplification in larger than in smaller alleles, especially where allele sizes are bimodal and widely separated. In such circumstances, tests for Hardy-Weinberg equilibrium (Chapter 10) or direct tests of inheritance are useful to verify that the different fragments are allelic. These tests also should be performed to test for high-frequency null alleles: variants that fail to amplify because of large-scale expansion or mutations in the priming sites (see "Assumptions").

Alleles are scored by their absolute size relative to size markers, which usually are known sequence ladders that are loaded on the same gel. These should be run on either side (3-4 spaces from the edge of the gel) and in the middle as well. To check for consistency of scoring, a small number of samples should be run on different gels as internal controls.

Troubleshooting

Most technical difficulties encountered with microsatellites are the same as for general PCR (see Chapter 7) or for running of sequencing gels (Chapter 9).

Problem: Excessive shadowing, making reliable scoring of microsatellite alleles impossible
Remedy: Optimize PCR conditions, particularly primer concentration, Mg2+ concentration, and annealing temperature. In general, minimize annealing and extension times and maximize annealing temperature
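The Hardy-Weinberg check suggested above for detecting null alleles can be sketched for the simplest case of a locus with two scored alleles. This is a minimal illustration under stated assumptions, not the tests described in Chapter 10: the function name is hypothetical, the genotype counts in the usage example are invented, and the statistic is the ordinary chi-square goodness-of-fit value with one degree of freedom.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed genotype counts at a two-allele locus.
    Both alleles must be present, otherwise an expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Invented counts showing a deficit of heterozygotes (40 seen, 50 expected).
    print(hwe_chi_square(30, 40, 30))
```

With one degree of freedom, a value above about 3.84 corresponds to P < 0.05. A deficit of heterozygotes is the pattern a frequent null allele would be expected to produce, because heterozygotes carrying the null are scored as homozygotes; other causes of such a deficit are not excluded by this test.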
Figure 24 The effect of contaminating nuclear DNA in hybridization probes. Contamination can be obvious, as portrayed in panels (A) and (B), or subtle, as demonstrated in panels (C) and (D). (A) Tursiops truncatus total DNA samples digested with HinPI and hybridized with Tursiops mtDNA contaminated with nuclear DNA. S = λ bacteriophage DNA digested with HindIII. (B) Same filter as (A), hybridized with a Tursiops mtDNA sample lacking nuclear DNA contamination. (C) Tursiops truncatus mtDNAs digested with MboI and end-labeled with 32P. Note the faint band (indicated by the arrow) of nuclear origin (identified as nuclear by EB-staining of total DNA digests; gel not shown). (D) Tursiops truncatus total DNA samples digested with MboI and hybridized with Tursiops mtDNA, contaminated with nuclear DNA. The arrow identifies a fragment of nuclear origin (see panel C). M, φX174 RF DNA digested with HaeIII.

Problem: Poor resolution of closely spaced alleles
Remedy: Run fragments into bottom 1/4 of gel to increase separation
Remedy: Redesign primers to generate products of <200 bp (this can reduce shadowing as well because of increased efficiency of amplification)
TEMED
Use fresh from the bottle

Nick Translation Buffer:
50 mM Tris-HCl, pH 7.2
10 mM magnesium sulfate
1 mM dithiothreitol

3 M sodium chloride
1 M dithiothreitol (DTT)
2 M Tris, pH 7.4
1 M magnesium chloride
1 M potassium chloride
14 M β-mercaptoethanol
deionized, distilled water (ddH2O)

10 mM Tris-HCl, pH 8.0
1 mM EDTA
100 mM sodium chloride
0.2% SDS
Nucleic Acids IV:
Sequencing and Cloning
David M. Hillis, Barbara K. Mable, Allan Larson,
Scott K. Davis, and Elizabeth A. Zimmer
analysis is considerably lower. Of course, in order to use nucleotide sequence positions in phylogenetic studies, homologous sequences must be aligned. The number and size of homologous sequences that can be aligned will differ depending on the level of comparison, but for most studies potentially useful variation is essentially inexhaustible in the near future.

Nucleic acid sequence data are compiled continuously in several data bases; GenBank (maintained by the U.S. National Center for Biotechnology Information at the National Library of Medicine) is perhaps the best known and most widely used. Another well-known data base is compiled by the European Molecular Biology Laboratory. These compilations represent, in effect, the largest comparative data sets ever collected. When the first edition of this book was published in 1990, there were fewer than 10^8 nucleotides in GenBank. As of 1995, there are more than three times as many nucleotides (approximately 3 x 10^8) in the database, from about 3.5 x 10^5 sequences. The genomes of many viruses have been sequenced in their entirety, and the complete sequence of the genome of a free-living organism (Haemophilus influenzae) has been reported (Fleischmann et al., 1995). With an increased number of laboratories collecting nucleotide sequence data, and with continued advances in sequencing technology, the rapid increase is likely to continue. Although only a tiny fraction of these data are collected for systematic studies, many of the data can be used for this purpose. Unfortunately, most of the available data have been collected from a very few species. Figure 1 shows the taxonomic distribution of the sequences currently in GenBank, and it is clear that chordates are heavily overrepresented and most other eukaryotes are seriously underrepresented compared to extant species diversity. In fact, a large fraction of the sequences are from about ten species of commercial or medical importance to humans (Hillis, 1987). Thus, although sequence data collected for other purposes are a useful starting point for some systematic studies, systematists usually must collect comparative data for relevant species of interest.

Although there are a number of strategies for obtaining sequence data for use in systematics, all methods have four basic steps. First, a particular target sequence must be identified that contains an appropriate amount of variation across species or individuals for the problem that is to be addressed (this step is discussed below, under "Applications and Limitations"; also see Chapter 8). Second, many copies of the target sequence must be isolated and purified from each individual to be examined. Third, the purified DNA (or, rarely, RNA) must be sequenced. Finally, homologous sequences must be aligned (alignment is discussed in the section on "Interpretation and Troubleshooting"). The various methods differ primarily in how the nucleic acid is isolated: "direct" methods involve either directly amplifying the target DNA or isolating abundant RNA transcripts; cloning methods involve the preparation and isolation of viral and/or bacterial vectors that contain copies of the sequence of interest. Each of the methods has distinct advantages and disadvantages and is particularly appropriate under certain conditions. In the next few sections, we outline the differences of the most widely used methods and discuss the relative merits of each approach.

Figure 2 Sequencing and cloning flow chart. Numbers refer to protocols.

Isolating Target Sequences

In molecular systematics, three methods are commonly used to isolate nucleic acids for sequencing (Figure 2). Cloning using recombinant DNA technology is the most widely used approach in molecular biology, although systematists often are reluctant to use this strategy because of the perceived time and effort involved. Alternatively, target DNA sequences can be amplified from whole genomic DNA or cDNA (Chapter 7) and sequenced directly, or RNA transcripts of the genes of interest can be isolated and sequenced. Amplification of target DNA sequences by the polymerase chain reaction has become the most widely used approach for comparative studies. Direct sequencing of RNA can be used to sequence nuclear-encoded ribosomal RNAs or (occasionally) messenger RNAs that are particularly abundant in a particular tissue (Weisman et al., 1986). For most applications, however, direct sequencing of RNA has been replaced by amplification (of genomic or cDNA) and sequencing; the exceptions are cases where a target sequence is difficult to amplify in a particular taxon.

Cloning

Whole genomic DNA can be used to construct genomic libraries, which contain virtually all of an organism's DNA cloned in pieces into a viral host, usually one of several derivatives of the lambda (λ) bacteriophage. The library often contains 10^8 or more copies of packaged viral DNA, each with a fragment of the original organism's genome. In order to use the library, an investigator must find the viruses that contain the gene or region of interest and grow additional copies of these viruses (Figure 2). Although DNA can be
324 Chapter 9 / Hillis, Mable, Larson, Davis & Zimrneu
isolated and sequenced at this point, it is usually desirable to subclone the target sequence into a sequencing vector first (usually a bacterial plasmid or the virus M13 and its derivatives). Plasmid clones can be stored indefinitely in a freezer and grown in quantity whenever desired, and the target DNA can be easily isolated and sequenced.
Bacteriophage lambda is one of the most extensively used cloning vectors for initial cloning steps, such as construction of genomic and subgenomic libraries (for reviews, see Frischauf, 1987 and Sambrook et al., 1989). Numerous lambda derivatives have been constructed, each of which has advantages for certain cloning applications. This bacteriophage is a double-stranded DNA virus of approximately 50 kilobases (kb) in length, with single-stranded complementary ends that allow the lambda DNA to circularize after entering a bacterial host. In the host bacterium, the lambda DNA is replicated by one of two pathways (lytic and lysogenic cycles). In lytic growth, many copies of the bacteriophage DNA are produced via rolling circle replication, and are then packaged into a protein coat that consists of a head (that contains the DNA) and a tail. The host bacterium that contains these mature viruses is then lysed and the progeny phage are released. In a petri dish, this lysis can be visualized against a background of a bacterial lawn as a plaque (a clear spot in which the bacteria have been lysed).
Modifications of lambda bacteriophage for cloning have deletions of a central, non-essential (for lytic growth) portion of the genome, into which foreign DNA may be inserted. They also have been selected to have only one or two restriction sites for a given restriction enzyme, which are the target sites for cloning. In vectors with two cloning sites (known as replacement vectors), a fragment of the bacteriophage DNA is replaced with the foreign DNA; in vectors with a single site (known as insertion vectors), the foreign DNA is inserted into the bacteriophage. Numerous vectors have been constructed that contain a diversity of cloning sites and accommodate a relatively large range of foreign fragments. Many of these are commercially available as predigested phage arms, so that one only needs to digest the target DNA with an appropriate restriction enzyme, ligate the DNA into the cloning vector, and package the resulting recombinant lambda DNA with commercial packaging extracts to produce subgenomic libraries. (They are subgenomic libraries because restriction fragments of inappropriate sizes will not be represented.) The choice of cloning vector will depend upon the desired target site and upon the size range of inserts to be cloned. Lambda vectors generally are limited to inserted fragments of less than 23 kb, and many vectors can incorporate only much smaller fragments (typically in the range of 2-15 kb). If larger fragments must be cloned, then the libraries should be constructed in cosmids, which are specialized cloning vectors designed to accommodate fragments of up to 45 kb in length (see DiLella and Woo, 1987, or Sambrook et al., 1989 for more information).
In order to use the gene library, one must screen the recombinant lambda (or other) clones to find the particular gene or DNA region of interest. Numerous methods have been developed for screening gene libraries (see Berger and Kimmel, 1987; Ausubel, 1989). Most of these methods involve either hybridizing with a nucleic acid probe (see Protocol 8) or immunoscreening for expressed proteins of interest. Hybridization is the more general procedure, although it is not as efficient for screening for protein-coding genes that can be expressed in vivo.
If the library is composed of random fragments of average size I from a genome of size G, the number of independent clones (N) that must be screened to isolate a single-copy fragment of interest with a probability of P can be calculated by the formula

$$N = \frac{\ln(1 - P)}{\ln(1 - I/G)}$$

As a rough approximation, one must screen approximately five times the number of base pairs of DNA that are in the genome [e.g., (I × N/G = 5)] to have a 99% chance of locating a specific single-copy gene (Seed et al., 1982). However, the total number of clones screened can be much smaller for sequences that are present in high copy number (e.g., rRNA genes, mtDNA, cpDNA,
Nucleic Acids IV: Sequencing and Cloning 325
and highly repeated heterochromatic sequences). Screening efficiency can be increased greatly by cleaving the genomic DNA with appropriate restriction enzymes (rather than random shearing) and selecting lambda vectors that only accept restriction fragments in the size range of the desired target sequence. Of course, for this approach, a restriction map must be obtained before the library can be constructed (see Chapter 8).
Once the appropriate DNA fragment has been cloned and isolated in a lambda vector, it can be subcloned into a plasmid or the virus M13. Subcloning is accomplished by cleaving the target sequence from the lambda with the same restriction enzyme with which it was originally cloned, and then ligating the ends of the DNA with plasmid or M13 DNA that has been cleaved with a restriction enzyme to produce compatible ends (see Protocol 10 and Chapter 8). The subcloning vector is then introduced into a bacterial host in a process known as transformation (see Protocols 11-13). This allows the cloned fragment to be grown in quantity, easily isolated, and sequenced. DNA amplified in vitro (i.e., PCR; see Chapter 7 and below) also can be cloned in this manner in order to purify heterogeneous amplification products, produce a stable clone (so that the isolated sequence can be used for other purposes or by other individuals), and facilitate sequencing. Until recently, M13 was the most widely used sequencing vector because it allowed single-stranded sequencing, which provided superior autoradiographs compared to double-stranded sequencing. However, new double-stranded sequencing protocols have greatly improved sequencing of plasmid clones, and the greater ease with which plasmids are grown, manipulated, and stored is a strong point in their favor. Also, some recently developed plasmids have single-stranded forms that can be used for single-stranded sequencing. Improved methods for insertion of PCR products into plasmid vectors (TA cloning, incorporation of restriction sites in PCR primers) also have made direct cloning more accessible (see Protocol 18).

In Vitro Amplification
Direct sequencing from complex genomic DNA has been made possible with the development of the polymerase chain reaction (PCR) technique (Kleppe et al., 1971; Mullis and Faloona, 1987; Ochman et al., 1988; Saiki et al., 1988; see also Chapter 7). Starting with virtually any amount of DNA, it is possible to amplify a target sequence up to microgram quantities. Under some circumstances (see some of the limitations discussed in Chapter 7), this amplified DNA is sufficiently homogeneous and of sufficiently high quality for sequencing (Figures 2 and 3). Double-stranded DNA produced by PCR amplification can be sequenced directly or by generating single-stranded DNA from the amplification product (see review by Bevan et al., 1992). Single-stranded DNA is generated by asymmetric reamplification using an excess of one of the primers (Gyllensten and Erlich, 1988; Chapter 7), by treatment with exonuclease (Higuchi and Ochman, 1989), or by use of biotinylated primers (Mitchell and Merrill, 1989).
The basic requirement for PCR is that the sequences of the regions flanking the target sequence are known so that primers to these regions can be constructed (however, a method called inverted PCR can be used to amplify outside of a known region; Ochman et al., 1988). Using this methodology, it is possible to sequence DNA isolated from a wide variety of sources, including preserved museum specimens and (under special circumstances) fossils or subfossils (Pääbo et al., 1988; T.J. White et al., 1989; Golenberg et al., 1990; DeSalle et al., 1992; Cano et al., 1993; DeSalle, 1994; Cano and Borucki, 1995). However, at least some "fossil" sequences that have been reported (e.g., Woodward et al., 1994) are clearly contaminants from recent humans (Hedges and Schweitzer, 1995; Allard et al., 1995; Zischler et al., 1995) or other sources.

RNA Isolation
The third approach to isolating target sequences is to isolate the transcribed RNA and sequence it using reverse transcriptase (Hamlyn et al., 1978; Protocol 23). Before PCR, this method was used extensively in sequencing the ribosomal RNA genes, and the method is still used to sequence RNAs that prove difficult to amplify. Ribosomal RNA is relatively easy to sequence directly because it accounts for a large fraction of the total cellular
RNA. Some regions of rRNA sequences are conserved throughout most living organisms, and several universal primers have been constructed that are complementary to these regions (Lane et al., 1985; Hillis and Dixon, 1991). These primers can be used to sequence several regions of rRNA from virtually any organism. This technique has had a major impact on systematic studies of prokaryotes (e.g., G.E. Fox et al., 1980) and has been applied throughout metazoan and plant groups as well (e.g., Field et al., 1988). However, in many studies, the approach has been replaced by PCR amplification of the rRNA genes (Chapter 7).

Nucleic Acid Sequencing
Although protein sequencing became a routine (albeit costly and labor-intensive) method for the study of protein molecular evolution by the late 1950s, nucleic acid sequencing did not become commonplace in studies of molecular systematics until the 1980s. In fact, until the mid-1970s, only stretches of DNA 15-20 base pairs in length had been sequenced. Breakthroughs in nucleic acid sequencing were published almost simultaneously by Maxam and Gilbert (1977) and Sanger et al. (1977). These procedures are outlined in Figures 4 and 5, respectively.

Figure 3 Amplification of a conserved region of mtDNA via the polymerase chain reaction. The primers shown amplify a region of the mitochondrial cytochrome b gene in vertebrates and some invertebrates (Kocher et al., 1989).
[Figure 3 diagram: a double-stranded template with the two primer-binding regions separated by about 300 bp of unknown sequence, carried through three steps: heat at 95°C to separate strands; cool to 50°C to anneal primers; warm to 70°C for DNA replication (primer extension via Taq polymerase).]
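The requirement noted under In Vitro Amplification, that the sequences flanking the target must be known before primers can be built, can be made concrete with a small in-silico sketch in Python. The template, the flank sequences, and the function name `find_amplicon` are hypothetical and chosen only for illustration; real primer design also has to weigh melting temperature, mismatches, and secondary structure, none of which is modeled here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_amplicon(template, forward_primer, reverse_primer):
    """Return the stretch of `template` bounded by the two primers, or None.

    `template` is the top strand written 5'->3'. The forward primer has the
    same sequence as the top strand; the reverse primer anneals to the top
    strand, so its reverse complement is what appears in `template`.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical template: two known flanks around a region treated as unknown.
left_flank = "GGATCCATCAGC"
unknown = "TTTACGTACGTAGGCTA"
right_flank = "CTGAAGAATGCA"
template = "AAAC" + left_flank + unknown + right_flank + "GGTT"

forward = left_flank                        # same sense as the top strand
reverse = reverse_complement(right_flank)   # anneals to the top strand

amplicon = find_amplicon(template, forward, reverse)
print(amplicon)       # both flanks plus the region between them
print(len(amplicon))  # 41
```

Only the two flanks enter the search; the region between them is recovered without being specified, which is the sense in which PCR gives access to unknown sequence bounded by known sequence.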
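Figure 4 Maxam-Gilbert (chemical cleavage) sequencing. See text for explanation.
[Figure 4 diagram: the duplex 5'-GATCAGGCTTAAGCA-3' / 3'-CTAGTCCGAATTCGT-5'; the end-labeled strand 32P-GATCAGGCTTAAGCA is run in four lanes (G, A+G, T+C, C); the bases called from the gel, listed from top to bottom, are A C G A A T T C G G A C T A.]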
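Reading a Maxam-Gilbert autoradiograph amounts to a small lookup over the four lanes named in Figure 4 (G, A+G, T+C, C): a band present in both G and A+G marks a G, a band in A+G alone marks an A, a band in both T+C and C marks a C, and a band in T+C alone marks a T. The Python sketch below encodes that rule. The function names and the dictionary representation of the gel are assumptions made for illustration, and the ladder is simulated from the labeled strand of Figure 4 rather than read from a real gel.

```python
# Bases cleaved in each of the four chemical reactions (lane labels as in Figure 4).
LANE_BASES = {"G": "G", "A+G": "AG", "T+C": "TC", "C": "C"}

# Which combination of lanes identifies each base.
CALLS = {
    frozenset({"G", "A+G"}): "G",
    frozenset({"A+G"}): "A",
    frozenset({"T+C", "C"}): "C",
    frozenset({"T+C"}): "T",
}


def simulate_ladder(strand):
    """Map fragment length -> set of lanes showing a band at that length.

    `strand` is the 5' end-labeled strand. Cleavage at 0-based position i
    removes that base and leaves a labeled fragment of length i, so the
    5'-most base (i = 0) leaves no visible fragment.
    """
    ladder = {}
    for i, base in enumerate(strand):
        if i == 0:
            continue
        ladder[i] = {lane for lane, bases in LANE_BASES.items() if base in bases}
    return ladder


def read_ladder(ladder):
    """Call bases from the shortest fragment (gel bottom) to the longest (top)."""
    return "".join(CALLS[frozenset(ladder[length])] for length in sorted(ladder))


strand = "GATCAGGCTTAAGCA"
called = read_ladder(simulate_ladder(strand))
print(called)  # ATCAGGCTTAAGCA
```

Read from the bottom of the gel upward, the calls reproduce the labeled strand minus its 5'-most base; reversed, they give the top-to-bottom order shown in Figure 4.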
quencing (Eckert, 1987) greatly facilitated the procedure; these vectors allow selective end-labeling, so that either strand can be sequenced without separation into single-stranded fragments. Sequencing is accomplished by dividing the target DNA into four subsamples and then treating the subsamples with a series of base-specific chemical reagents that partially cleave the DNA. The first sample is treated with dimethyl sulfate, which methylates a few percent of the guanines in the sequence, and piperidine, which displaces the methylated guanines and thereby cleaves the DNA at these sites. The second sample is treated with formic acid, which protonates a few percent of purine-ring nitrogens, and piperidine, which then displaces the affected purines (adenine and guanine). The third sample is treated with hydrazine, which removes cytosine and thymine from the DNA and leaves ribosylurea. The DNA is then cleaved at these sites with piperidine. The fourth sample is treated like the third, except that the hydrazine treatment is conducted in the presence of NaCl, which suppresses the reaction of thymine with hydrazine (so the DNA is cleaved only at cytosines). In all of these subsamples, chemical cleavage is carried out under conditions in which only a few of the respective nucleotides in any given fragment are affected. However, because the cleavage is random and the population of DNA fragments examined is large, some fragments will be cleaved at each of the nucleotide positions (Figure 4). The radioactively labeled fragments from the four subsamples are then electrophoretically separated by size on a denaturing polyacrylamide gel and visualized by exposing the dried gel to X-ray film to produce an autoradiograph. The sequence of the DNA sample can then be read directly from the autoradiograph (Figure 4).

Sanger Dideoxy Sequencing
Sanger sequencing, or controlled interruption of enzymatic DNA replication, uses dideoxynucleotide analogs in primer-directed DNA extension to produce discrete DNA fragments (Figure 5). The double-stranded DNA is first denatured to produce single-stranded DNA (or single-stranded DNA is isolated from single-stranded vectors). Next, a short segment of DNA (typically 15-25 bp) known to be complementary to a segment on the target DNA (or in the adjacent sequencing vector) is annealed to the target sample; this short fragment is known as a primer. The sample is then divided into four subsamples, to each of which is added the four deoxynucleotides (i.e., dATP, dCTP, dGTP, and dTTP, at least one of which is radioactively labeled) and DNA polymerase. In addition, one of four dideoxynucleotides (ddNTP) is added to each of the tubes (ddATP, ddCTP, ddGTP, or ddTTP, respectively). The primer has a free 3' OH group on its deoxyribose, to which additional nucleotides can be attached. The DNA sequence is extended by the DNA polymerase using the target DNA as a template (Figure 5). On some strands in the sequencing reaction, a given ddNTP will be incorporated into the growing strand, at which point the polymerization is terminated because the ddNTP lacks a 3' OH group. The radioactive fragments in the four subsamples are then separated by denaturing polyacrylamide gel electrophoresis and visualized by autoradiography, as with Maxam-Gilbert sequencing. The fragments in each subsample will terminate with the corresponding ddNTP (which is complementary to the dNTP on the template sequence), and the sequence of the target DNA can be read directly from the autoradiograph (Figure 5).

Cycle Sequencing
Cycle sequencing is based on the dideoxynucleotide chain-termination method of Sanger et al. (1977) but utilizes a linear polymerase reaction to amplify labeled DNA that is complementary to the target DNA (Murray, 1989; Craxton, 1991). As discussed for the Sanger method, an appropriate primer molecule is annealed to a complementary single-stranded segment of DNA in the presence of deoxynucleotide triphosphates (dNTPs) and dideoxynucleotide triphosphates (ddNTPs). Enzymatically controlled DNA synthesis is initiated at the 3' OH terminus of the annealed primer and continues until chain growth is terminated by incorporation of one of the four dideoxynucleotides. In thermal cycle sequencing, DNA synthesis is catalyzed by a thermostable DNA polymerase
[Figure 5: sequencing gel with four lanes labeled A, C, G, T.]
(e.g., Taq or Vent DNA polymerase). Heat denaturation of double-stranded template allows labeled primers access to a single strand and subsequent extension by the polymerase. Successive cycles of denaturation, annealing, and synthesis result in a linear amplification of labeled product (thus it is not a chain reaction). As in the Sanger method, radioactive fragments in the four subsamples are separated by denaturing polyacrylamide electrophoresis and visualized by autoradiography.
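The chain-termination logic shared by Sanger and cycle sequencing, and the difference between linear and exponential amplification, can be sketched in a few lines of Python. This is an idealized illustration rather than a protocol: it assumes every possible termination product is present, ignores band intensity and gel compressions, assumes perfect efficiency in every cycle, and uses hypothetical function names. Fragment lengths are counted from the 3' end of the primer.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3to5):
    """Template written 3'->5'; the new strand is built 5'->3' by base pairing."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def termination_lanes(template_3to5):
    """Map each ddNTP lane (A, C, G, T) to the extension lengths ending in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand(template_3to5), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the new strand from the shortest fragment to the longest."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


def linear_products(templates, cycles):
    """Cycle sequencing: one primer, so each template yields one product per cycle."""
    return templates * cycles


def exponential_products(templates, cycles):
    """Two-primer PCR under ideal doubling, shown for contrast."""
    return templates * 2 ** cycles


template = "CTAGTCCGAATTCGT"        # written 3'->5'
lanes = termination_lanes(template)
print(lanes["G"])                    # [1, 6, 7, 13]
print(read_gel(lanes))               # GATCAGGCTTAAGCA
print(linear_products(1000, 25))     # 25000
print(exponential_products(1000, 25))
```

The sequence read from the gel is that of the newly synthesized strand, which is complementary to the template; the two yield functions show why a single-primer reaction accumulates product far more slowly than a chain reaction.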
Although labeling of template can be achieved through incorporation of an alpha-labeled deoxynucleotide triphosphate into the nascent chain (α-32P or α-35S), the efficiency of the reaction can be increased greatly by using end-labeled primers. In this type of reaction, T4 polynucleotide kinase (PNK) is used to add a γ-32P (or γ-33P) from rATP to the 5' end of a primer molecule. Primers also can be biotinylated for chemiluminescent DNA sequencing (Creasey et al., 1991) or end-labeled with fluorescent dyes for automated DNA sequencing (Tracey and Malcahy, 1991; see below).

Automated Sequencing
There are a number of types of automated DNA sequencing, but most use Sanger sequencing with fluorescently labeled (rather than radioactively labeled) DNA fragments. These fragments are detected during electrophoresis with the use of a tunable laser. The laser is stationary with respect to the electrophoresis apparatus, and fragments are recorded as they pass a single point. The process is "automatic" in that one does not visually inspect an autoradiograph and manually record the results; instead, the sequence is recorded directly into a computer or onto paper in the form of a chromatograph (Plate 2), which may be interpreted (by computer software or visually) into a DNA sequence. Because of their expense and maintenance requirements, automated sequencing machines often are maintained as part of a shared institutional facility with supervising technicians hired to operate them. However, ownership and operation of automated sequencers by individual laboratories is becoming more common as costs decrease. It is likely that most large-scale DNA sequencing will be done by automated systems at some point in the future (Hunkapiller et al., 1991), although the automated sequencing machines of tomorrow will likely be based on radically different technologies from those of today (e.g., Jett et al., 1989; Soper et al., 1991; Harding and Keller, 1992).
Imaging systems that read information from autoradiographs of manually produced sequencing gels constitute an alternative to automated sequencing. They also permit sequence information to be read from a gel directly into a computer without the need for visual inspection of an autoradiogram. Imaging systems also are expensive and usually are acquired as shared institutional facilities, but they have the advantage of utility for analyzing diverse kinds of electrophoretic data in addition to sequencing sets. Detailed protocols are available from the manufacturers of both automated sequencers and imaging systems so we will not duplicate them here.

Assumptions
In all broad-scale comparative studies, certain implicit assumptions are made by the investigators. In the case of DNA sequencing, these include assumptions about biochemical methodology and about genome and organismal evolution. It is important to realize the limitations placed upon interpretation of sequence data that arise from such assumptions.
At the biochemical level, the homogeneity of input DNA and the fidelity of DNA replication are issues of primary importance. Contamination of template DNA (prepared either in vivo via cloning or in vitro via PCR) or polymorphism in the sequence based on interallelic variation or pooling of individual samples may lead to uninterpretable or incorrect sequences. Contamination is a potentially serious issue with PCR because even a single strand of foreign DNA can be amplified and thereby confound results. Contamination problems in PCR are most likely to occur for DNA samples that are difficult to amplify because the sample contains chemical impurities, the target DNA is degraded, or the primers do not produce a good match to the template. In these cases, small amounts of contaminating DNA that otherwise would have been overwhelmed by the target DNA may be amplified in place of the target DNA. High fidelity of DNA replication is important both in the amplification of the input DNA (done either in vivo or in vitro) and in the various methods that constitute controlled interruption of enzymatic replication. When sequencing amplified DNA, one assumes that the sequence determined is the original one isolated from the genome and not a variant produced during DNA replication. Some polymerases (e.g., Taq) have a high error rate (incorporation of the incorrect nucleotide during DNA replication), so sequencing of multiple isolates is needed to confirm a sequence.
Another implicit assumption when using DNA sequence information for phylogenetic studies is that homologous genes can be reliably identified for comparison. If primers designed for PCR recognize and anneal at more than one locus in a particular genome, this assumption can be violated. Recent studies have indicated that tandem duplications sometimes occur in mitochondrial DNAs (e.g., Moritz and Brown, 1987; Zevering et al., 1991) and that some mtDNA genes have putative non-functional nuclear copies (e.g., M.F. Smith et al., 1992). Single-copy nuclear genes also have been used recently for systematic studies but the potential for amplification of paralogous loci and pseudogenes is an even higher risk. PCR amplification can also produce recombinant sequences (i.e., combinations of nucleotides not found on a single strand of DNA in the individual amplified) if individuals are heterozygous for a particular gene locus or primers anneal to multiple sites (Saiki et al., 1985; Scharf et al., 1988a,b; Scharf, 1990; Bradley et al., 1993). The amplification of multiple products is a common result of amplifications involving nuclear genes and may be due to the presence of processed pseudogenes or other duplicated gene copies. In addition, formation of heteroduplexes between the target gene and other fragments (Zorn and Krieg, 1991; Valentine et al., 1992) can make isolation of uniform sequences difficult. When target genes are part of a gene family, identifying orthologous sequences can be a problem if different gene copies cannot be distinguished by size, and amplification of multigene families is subject to both selection and drift during the PCR process (A. Wagner et al., 1994). Most of these problems can be overcome by (1) cloning, (2) using rapid screening techniques to detect sequence heterogeneity (reviewed by Lessa and Applebaum, 1993), (3) using procedures that discourage amplification of recombinant sequences (see Chapter 7), and/or (4) examining gene families in which known copies exist as distinctly sized products (Lessa, 1992; Palumbi and Baker, 1994; see "Interpretation and Troubleshooting").
When the target sequences are part of a gene family in which copies within an individual are homogenized (i.e., undergo concerted evolution), it is important to use a typical or consensus repeat sequence. It also is desirable to determine in advance that the rate of concerted evolution (gene family homogenization) of the repeats in that family significantly exceeds that of speciation for the groups of organisms compared (Hillis and Davis, 1988; Sanderson and Doyle, 1992). A preliminary restriction endonuclease cleavage analysis (Chapter 8) can produce an estimate of the homogeneity of the gene family to be compared and this may dictate the appropriate sampling strategy (Chapter 2). Assumptions of level of homology are important to comparative sequence studies (see Hillis, 1994a); for most phylogenetic studies except study of gene phylogenies, it is critical that orthologous (rather than paralogous) genes are being compared (see Chapter 1). However, families of genes undergoing concerted evolution (and thereby exhibiting plerology; Patterson, 1988) also can be used to reconstruct relationships among lineages, as long as the branches in the tree are separated by enough time to allow homogenization to occur (Sanderson and Doyle, 1992).
With respect to analyses based on nucleic acid sequences, both alignment and phylogenetic inference steps involve making either implicit or explicit assumptions about evolutionary processes. Alignment algorithms are designed to maximize percent sequence similarity while minimizing the number of insertion/deletion events (see "Interpretation and Troubleshooting"). Thus, base substitutions are assumed to be more frequent during evolution and are penalized less severely by the alignment algorithms. Furthermore, alternative alignments may be equally good, and the extent of reciprocal illumination between alignment and phylogeny reconstruction procedures has not been standardized. Choice of methods for phylogenetic inference and character weighting strategies also rely to varying degrees on assumptions about evolutionary processes (see Chapter 11).
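To illustrate how an alignment algorithm can penalize insertion/deletion events more heavily than base substitutions, the following Python sketch scores a global alignment in the style of Needleman and Wunsch. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions made for the example, not values recommended in this chapter, and changing them can change which alignment is preferred.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b (dynamic programming)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per position.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap   # a[i-1] aligned to a gap
            gap_in_a = score[i][j - 1] + gap   # b[j-1] aligned to a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)
    return score[-1][-1]


reference = "GATTACA"
print(global_alignment_score(reference, "GATTACA"))  # 7: identical
print(global_alignment_score(reference, "GATCACA"))  # 5: one substitution
print(global_alignment_score(reference, "GATACA"))   # 4: one deletion
```

Under these assumed penalties a single substitution lowers the score less than a single indel does, which is the weighting the paragraph above describes; setting the gap penalty equal to the mismatch penalty would remove that preference.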
Comparison of the Primary Techniques

Cloning versus Direct Sequencing
Development of the PCR method has had a major impact on systematics, because it is faster and sometimes easier to amplify DNA using PCR than to clone (see Chapter 7). The disadvantages are that (1) the sequences of the flanking regions usually must be known; (2) the individual should be homozygous or otherwise homogeneous for the sequence of interest (or else steps must be taken to control in vitro recombination); (3) only relatively short fragments can be sequenced directly; (4) no clone is produced for verification or use in further work; (5) linear amplified DNA can be harder to sequence than is circular cloned DNA; and (6) the cost of Taq polymerase can be quite high. Because of the need for known flanking regions, the technique has been used mainly for regions for which complete and closely related sequences are available (e.g., Wrischnik et al., 1987) or regions that are flanked by highly conserved sequences (see the section on "Useful Primers" in Chapter 7). However, methods are available for sequencing outside (rather than inside) of known flanking regions (Ochman et al., 1988; Loh et al., 1989), although technical difficulties with implementation prevent the widespread use of inverse PCR. Most of the disadvantages of PCR (except for the requirement of known flanking regions and the high cost) can be overcome by combining DNA amplification and cloning strategies; cloning from the amplified DNA products is considerably faster than cloning from whole genomic DNA (see Protocol 18), especially since cloning methods have been designed specifically for PCR products [e.g., TA cloning (Marchuk et al., 1991), blunt-end ligations (Liu and Schwartz, 1992), or incorporation of restriction sites into PCR primers]. However, sequencing directly from PCR amplification products reduces problems associated with errors made by Taq polymerase, because errors in all but the earliest rounds of amplification will appear as ambiguities. Length polymorphisms of any size in the target DNA cause the greatest difficulties for PCR, because two offset sequences produce unreadable sequence gels. However, these problems can be reduced by gel-isolation of the target bands using low-melting-point agarose and then extracting the DNA using one of several purification methods (see Protocol 19 and "Troubleshooting").
Another method for purifying amplification products involves using two internested sets of primers. After the initial amplification round has been completed using the two external primers, the DNA is reamplified with a set of primers internal to the first. This helps to purify the amplification product and reduces ambiguity (Mullis and Faloona, 1987). It also is possible to reduce the cost of Taq polymerase by isolating and purifying recombinant Taq enzyme following overproduction in E. coli (Engelke et al., 1990), although some applications of Taq polymerase are covered by patents that require purchase of the enzyme from a licensed source. Moreover, because of the time and effort required to isolate and purify Taq polymerase, many laboratories probably will prefer to continue to use commercially prepared DNA polymerases.
One major advantage of PCR is that the method can be used to obtain sequences from alcohol-preserved tissues (Kocher et al., 1989) or, rarely, from fossil or subfossil specimens (Pääbo et al., 1988; T.J. White et al., 1989; Golenberg et al., 1990; DeSalle et al., 1992; Cano et al., 1993; Thomas and Pääbo, 1993; DeSalle, 1994; but see Zischler et al., 1995). Because nucleic acids from such sources tend to be degraded, PCR is the only approach to sequencing that typically is applicable under these conditions. It is thought that PCR can amplify fragmented DNA by "reassembling" the target from the fragments through successive cycles of annealing and extension, with each partially extended sequence acting as a primer for the next fragment. Under these conditions, it is highly likely that the reassembled DNA is composed of fragments of sequence from many different copies of the target locus, but this is not usually a problem for high-copy, homogeneous targets such as most animal mtDNA. Application of methods for extracting DNA from minute tissue samples that have been developed for forensic purposes (e.g., Chelex extractions; see Protocol 3) has greatly improved the potential for obtaining usable sequence information from ancient or fragmented DNA.
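The earlier point in this comparison, that a directly sequenced PCR product is a population average whereas a clone derives from a single molecule, can be given rough numbers. The Python sketch below computes the share of product strands that carry one particular misincorporation, depending on the cycle in which it arose and on the number of starting templates. It rests on idealized assumptions stated here and in the comments: perfect doubling in every cycle, a single misincorporation event, and no selection among molecules. The function names are hypothetical.

```python
from fractions import Fraction


def error_fraction(start_duplexes, error_cycle):
    """Fraction of product strands carrying an error first made in `error_cycle`.

    Assumes ideal doubling: after cycle k there are 2 * N0 * 2**k strands,
    exactly one of which is the new erroneous strand, and every strand is
    copied faithfully in every later cycle, so the fraction stays constant.
    """
    return Fraction(1, start_duplexes * 2 ** (error_cycle + 1))


def simulate_fraction(start_duplexes, error_cycle, total_cycles):
    """Brute-force count of mutant and total strands, as a cross-check."""
    total, mutant = 2 * start_duplexes, 0
    for cycle in range(1, total_cycles + 1):
        mutant *= 2                # each mutant strand is copied
        if cycle == error_cycle:
            mutant += 1            # one new strand acquires the error
        total *= 2                 # each strand is copied once per cycle
    return Fraction(mutant, total)


print(error_fraction(1, 1))      # 1/4: single starting molecule, first cycle
print(error_fraction(1000, 1))   # 1/4000: many starting templates
print(error_fraction(1, 10))     # 1/2048: same molecule, later cycle
```

Under these assumptions a given error makes up a noticeable share of the pooled product only when it arises in the first few cycles from very few starting templates, whereas any single cloned molecule that carries it will show it as an unambiguous base.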
The other direct sequencing technique, RNA sequencing, also is faster and easier than cloning and sequencing. Direct RNA sequencing is used to obtain sequence information for nuclear-encoded ribosomal RNA for phylogeny reconstruction and rapid identification of microorganisms. Direct RNA sequencing has the disadvantages of (1) only having a single strand available (so verification requires use of overlapping primers on a single strand rather than two-strand verification); (2) requiring fresh (or in some cases frozen) tissues; (3) providing access only to transcribed regions; (4) having greater difficulties in regions of strong secondary structure; (5) having difficulties in polymorphic regions; and (6) not producing a stable and verifiable clone. The regions that are accessible to direct RNA sequencing (primarily the nuclear ribosomal genes) are also accessible to PCR; thus, sequencing the amplified products of PCR is the preferred direct sequencing method because of the greater accuracy and the possibility of using poorer quality tissue samples.

Maxam-Gilbert versus Sanger Sequencing
Although both Maxam-Gilbert and Sanger sequencing have been used extensively for determining DNA sequences, we emphasize Sanger sequencing in this chapter for several reasons. Modifications of Sanger sequencing can be used to sequence both DNA and RNA, whereas Maxam-Gilbert sequencing is applicable only to DNA. Furthermore, Maxam-Gilbert sequencing requires prior knowledge of the restriction map of the target sequence, because it is necessary to cleave the DNA into manageable size pieces for sequencing. Sanger sequencing requires no prior knowledge of the sequence, because primers can be complementary to the cloning vectors or, in the case of PCR, amplification primers can also be used as sequencing primers. Modifications of Sanger sequencing are used in cycle sequencing and in automated sequencers. Finally, it is possible to read more sequence information per gel with Sanger than with Maxam-Gilbert sequencing, because of better band resolution. However, Sanger sequencing can present problems in sequencing regions with strong DNA secondary structure, which is not a problem with Maxam-Gilbert sequencing. Occasionally, a section of DNA will be highly resistant to Sanger sequencing and can be sequenced only using chemical degradation. In addition, Maxam-Gilbert sequencing is easily adaptable to some massive sequencing efforts (see below).

Cycle Sequencing versus Other Methods
One of the main advantages of cycle sequencing is that it reduces the amount, and, to some extent, the quality of template necessary for sequencing. Enough template can be generated from a single 25-µl PCR reaction to result in clear resolution of DNA sequences. This is due to both the increased efficiency of end-labeled primers and to a modest linear amplification of the initial template DNA. Although most protocols recommend purification of PCR templates prior to sequencing, direct sequencing of unpurified PCR products can be achieved when clean amplification without evidence of length (or other) heterogeneity is apparent. Sequences also may be obtained directly from phage plaques and bacterial colonies without purification (Krishnan et al., 1991; Young and Blakesley, 1991).
Another advantage of cycle sequencing is that direct labeling of double-stranded template circumvents problems usually associated with direct sequencing of double-stranded DNA or with generation of single-stranded templates from double-stranded DNA segments. In cycle sequencing, more of the labeled primer is extended by the DNA polymerase than in standard double-stranded DNA sequencing protocols, which results in less wastage of primer and shorter autoradiograph times (Murray, 1989). Many sequencing artifacts are removed by the thermal cycling procedure and therefore are not as apparent. Single-stranded DNA sequencing may result in less ambiguous sequences (i.e., fewer stops and more readable sequences close to the primer) than double-stranded sequencing, but the generation of single-stranded DNA can be a time-consuming and labor-intensive procedure (Tripathi, 1991). Single-stranded DNA may be produced from double-stranded templates by cloning into a vector such as M13 or through asymmetric
PCR. However, cloning is time-consuming, and generation of sufficient
asymmetric product for sequencing can be problematic. Cycle sequencing
utilizes double-stranded template and therefore reduces the time of
preparation of sequencing samples and can result in cleaner sequences
compared with other double-stranded sequencing methods.

Cycle sequencing reactions are quick and efficient and leave less room for
experimental error because most of the procedure is performed in a thermal
cycling machine. The efficiency of the reactions can be increased further by
using such time-saving measures as automatic pipetters and microtiter plates
(V. Smith et al., 1993). The reaction is also very versatile. It can be used
for single-stranded or double-stranded DNA, can utilize cloned products or
PCR products, and can generate templates for direct or automated sequencing.

The major limitation of the procedure is that it incorporates some of the
problems associated with the polymerase chain reaction. These include
specificity of primer design, standardization of reaction conditions across
thermocycler machines, establishment of optimal reaction conditions for new
taxa or new sets of primers, potential incorporation of Taq polymerase
errors, and high cost of thermally stable polymerases.

Other Methods of DNA Sequencing

Several other modifications to DNA sequencing have been developed, each of
which has specialized applications. Genomic sequencing (Church and Gilbert,
1984) can be used to assess sequence information directly from genomic DNA.
In this procedure, fragments from completely restricted whole genomic DNA are
partially cleaved using chemical degradation. These fragments are then
separated on an acrylamide gel and electrophoretically transferred to a nylon
membrane. The fragments on the membrane are then hybridized to a series of
specific probes so that DNA sequences can be visualized. The technique
thereby eliminates cloning or in vitro amplification steps. Another advantage
to this approach is that information on DNA methylation is preserved, whereas
this information is lost during cloning and amplification steps. However,
this procedure is not generally applicable to most systematic studies,
because each visualized sequence requires a separate probe and an
appropriately located restriction site. In addition, the low copy number of
each target sequence results in a weak signal on the hybridized blot (Church
and Kieffer-Higgins, 1988). A related technique known as multiplex
sequencing, described by Church and Kieffer-Higgins (1988), has been used for
massive sequencing of entire prokaryotic genomes (and similarly large
sequencing efforts). Although several variations are possible, the key to
multiplex sequencing is the combined sequencing of numerous clones on a
single gel, each of which is incorporated in a distinct vector. The DNA is
then transferred to a nylon membrane as in genomic sequencing, and the
sequences of the individual clones are visualized by successive hybridization
to vector-specific probes. The advantage of the technique is the reduction of
repetition of many of the sequencing steps by a factor of twenty or more.
However, the technique is unlikely to be incorporated in most smaller scale
systematic studies because of the need for cloning into a large number of
distinct vectors.

Sequencing technology is continuing to advance rapidly, driven in part by the
current interest in sequencing whole genomes. One especially promising
technology involves cleaving nucleotides (using exonucleases) from a single
strand of DNA in a flow cytometer, then detecting and identifying these
nucleotides in order in a constant stream (Jett et al., 1989; Shera et al.,
1990; Soper et al., 1991; Harding and Keller, 1992). Detection and
identification can be accomplished by laser spectroscopy of the fluorescently
labeled nucleotides (Shera et al., 1990). The technique has considerable
promise for very rapid sequencing of large DNA molecules, but it still has
numerous technical limitations that need to be overcome before the method can
be widely used (L.M. Davis et al., 1992). Whether or not this particular
method becomes generally applicable, radically new approaches such as this
are needed to undertake massive sequencing projects such as the human genome
initiative.
sufficiently closely related organism. In the latter case, cloned or
synthetic DNA can be used to select the sequence of interest from a genomic
library made from a sample of total cellular DNA (Protocol 8); alternatively,
synthetic DNA can be used to amplify the sequence of interest from a
preparation of total cellular DNA using the polymerase chain reaction
(Chapter 7). Sequences located in organellar genomes are present in
sufficiently high frequency in cellular DNA preparations that cloning and
amplification are relatively routine.

For nuclear DNA, cloning and amplification are routine for sequences that are
extensively repeated relative to the total size of the nuclear genome. The
genes encoding ribosomal RNA, for example, are in this category. If the DNA
sequence is transcribed, direct sequencing of the RNA transcript is possible
where the frequency of the transcript in total cellular RNA (or in the
isolated polyadenylated or non-polyadenylated RNA fractions) is sufficient to
permit direct sequencing. Weisman et al. (1986) showed that this latter
approach is feasible even for a non-repetitive gene whose transcript is
abundant only in a certain tissue at a certain stage of the life cycle, and
where prior knowledge of homologous sequences is fragmentary. Previously, it
was necessary to study the sequence evolution of alleles at most loci by
first cloning the gene and its flanking regions (e.g., Ponath et al.,
1989a,b). However, improvements in PCR and sequencing technology have made it
possible to compare sequences from cDNA generated using reverse transcriptase
amplifications and to directly amplify even single-copy nuclear DNA (e.g.,
Bradley et al., 1993).

Sequence information also can provide more heretofore unavailable details
about the molecular processes that are responsible for many evolutionary
phenomena. For instance, several distinct hypotheses have been proposed for
the origin of rare alleles of enzymes in hybrid zones, including
intracistronic crossing over, hybrid dysgenesis (the release of previously
controlled transposable elements), and differential selection in hybrid zones
(reviewed by D.S. Woodruff, 1989; see Chapter 4). Sequencing of the rare and
parental alleles has helped to distinguish among these hypotheses (Bradley et
al., 1993). Other studies of gene evolution at the DNA sequence level are
beginning to reveal considerable information about the processes of
substitutional mutations (e.g., Gojobori et al., 1982; W.-H. Li et al., 1984;
Moriyama et al., 1991; Bull et al., 1993a), the origin of new loci (Prager
and Wilson, 1988), the evolution of gene families (e.g., Goodman et al.,
1979), and convergence at the molecular level due to selection (Stewart and
Wilson, 1987). In addition, phylogenetic analyses of sequences are being used
to reconstruct the sequences of ancestral genes, promoters, and proteins,
which can then be synthesized in vitro and tested in vivo (Adey et al., 1994;
Jermann et al., 1995; Stewart, 1995). Moreover, study of the evolution of
genes is also beginning to have an important impact on our understanding of
related fields in molecular, cellular, and developmental biology (e.g.,
Schubert et al., 1993; Doyle, 1994).

Intraspecific Diversity

With improvements in PCR and sequencing techniques, intraspecific sequence
variation at the species level now can be used in studies of epidemiology of
diseases (e.g., Ou et al., 1992; Hillis and Huelsenbeck, 1994a), gene flow
(e.g., Slatkin and Maddison, 1989, 1990; Hudson et al., 1992b), geographic
variation (e.g., Vigilant et al., 1991; Hayes and Harrison, 1992; D.R.
Maddison et al., 1992), and hybridization (e.g., Moritz et al., 1992b; Moritz
and Heideman, 1993; Arévalo et al., 1994). However, there is a trade-off
between the detailed information at one or a few loci in a sequencing study
and the less detailed information at many loci in studies of allozymes
(Chapter 4), RAPDs (Chapter 7), or restriction sites (Chapter 8).

Animal mitochondrial DNA offers one of the best opportunities for applying
DNA sequencing to the study of population genetic processes. Amplification
and sequencing can be used to characterize the mtDNA haplotypes present in a
population or species and to reconstruct the gene phylogeny that relates
them. Because animal mtDNA is maternally transmitted (at least most of the
time, in most species) and non-recombining, all parts of the molecule share
the same historical
pattern of common descent (A.C. Wilson et al., 1985). The use of these gene
phylogenies of mtDNA together with geographic information on the populations
sampled provides a means for examining the genetic structure of populations
(termed "intraspecific phylogeography" by Avise et al., 1987). Use of gene
phylogenies forms the basis for the approach to population genetics known as
coalescent theory. Coalescent theory provides a means for measuring gene flow
among populations (Slatkin and Maddison, 1989, 1990; Hudson et al., 1992b;
Templeton, 1993, 1995) and for evaluating the phenotypic effects of allelic
substitutions (Templeton et al., 1988, 1992). Use of DNA sequence data for
population genetic studies of plants tends to be more difficult than for
animals because their organellar genomes (especially cpDNA) are less
variable, thereby providing fewer markers for studying intraspecific
variation; however, some instances of intraspecific variation have been
reported for cpDNA (D.E. Soltis et al., 1992). Sequence surveys of plant
mtDNA have been much less comprehensive than for animals, so it is possible
that this molecule will prove useful in plant population genetics as well
(Palmer, 1992).

Sequencing studies have found a place within the field of conservation
genetics as well (reviewed by Avise, 1994). Studies of sequence variation
(primarily within the mtDNA and cpDNA genomes) have focused on issues such as
inbreeding depression, reduced heterozygosity in small populations,
introgression, and identification of commercial products made from endangered
species.

Use of nuclear loci for intraspecific studies has received far less attention
than has use of the mtDNA genome. However, studies of nuclear intron
sequences (e.g., Lessa, 1992; Lessa and Applebaum, 1993; Slade et al., 1993)
and spacer regions of rRNA genes (Kambhampati and Rai, 1991) appear promising
for such applications. The search for a paternally inherited counterpart to
the maternally inherited mtDNA of animals has focused on introns in coding
sequences of sex chromosomes (in species in which the male is the
heterogametic sex), such as the Y chromosome of mammals. It remains to be
seen if the variation in these loci is sufficient to justify their regular
use in population genetics applications. In humans, no variation was found in
a sample of 38 males in an intron of the ZFY gene on the Y chromosome,
although variation was detected in other primate species (Dorit et al.,
1995).

Interspecific Diversity

Interspecific studies include investigations across an immense span of time,
from a geological instant through about 4 billion years of the history of
life. Naturally, a target sequence that provides resolution at one end of
this continuum is unlikely to be useful at the other end. Some of the genes
or regions that have proved especially versatile include the ribosomal DNA
arrays and the mitochondrial and chloroplast genomes. Various nuclear targets
(other than rDNA) have been useful within particular groups, although gene
duplication and the presence of pseudogenes have caused considerable
difficulties for some applications.

There are many and varied applications of interspecific phylogenies. In many
cases, the phylogeny is of intrinsic interest in its own right, as it informs
us about the course of evolution and provides the basis for taxonomy and
classification. However, interspecific phylogenies also are estimated to
directly study such topics as speciation (e.g., Moritz and Heideman, 1993;
Patton and Smith, 1994), biogeography (Page, 1993a, 1994), and co-speciation
(Page, 1991; Brooks and McLennan, 1993; Chapela et al., 1994; Hinkle et al.,
1994). Interspecific phylogenies are also needed to control for historical
effects in studies of behavior, ecology, physiology, and other comparative
organismal fields (Felsenstein, 1985c; Baum and Larson, 1991; Brooks and
McLennan, 1991; Harvey and Pagel, 1991; Garland et al., 1992, 1993).

The nuclear and mitochondrial genes encoding ribosomal RNA have been
particularly important for inferring species phylogenies because they are
easily accessible, collectively demonstrate a wide range of evolutionary
rates, and therefore have the potential to provide resolution across a large
time scale. Until recently, phylogenetic sequence comparisons have
concentrated on the coding portions of the ribosomal genes and their
RNA products (see review by Hillis and Dixon, 1991). The nuclear-encoded
ribosomal RNA cluster demonstrates an unusual pattern of evolution, featuring
the interspersion of relatively rapidly evolving sequences with some of the
most highly conserved macromolecular sequences known (Gerbi, 1985). The most
highly conserved sequences have been useful for investigating the oldest
divergences in the history of life (G.E. Fox et al., 1980; Kunzel and Kochel,
1981; D.F. Spencer et al., 1984; Hasegawa et al., 1985a; Lane et al., 1985;
G.J. Olsen, 1987; Field et al., 1988; Cedergren et al., 1988; Lake, 1988;
Ghiselin, 1988). Comparisons of mitochondrial and chloroplast ribosomal genes
with prokaryotic ribosomal genes have helped resolve relationships among
these eukaryotic organelles and their prokaryotic relatives (D. Yang et al.,
1985; S. Turner et al., 1989).

Ribosomal RNA genes (especially the small subunit) have provided considerable
insight on the relationships of prokaryotes; in fact, much of what is known
of prokaryote phylogeny has been derived from rDNA analyses. Ribosomal RNA
genes have also been used both to support (e.g., Pace et al., 1986; Woese and
Olsen, 1986; Gouy and Li, 1989a) and refute (e.g., Lake, 1987b) the monophyly
of archaeans (formerly called archaebacteria). Within the Archaea, studies
such as those of Gupta et al. (1983), Woese et al. (1984a), Lechner et al.
(1985), and Leffers et al. (1987) initiated a new area of understanding of
the broad outlines of diversity in this group, and many Archaea that are
resistant to culturing are known only from their rDNA sequences. Scores of
ribosomal DNA sequencing studies have also indicated the major evolutionary
relationships among the Eubacteria (e.g., Woese et al., 1984b,c, 1985;
Weisburg et al., 1989a,b).

Sequence studies of the rRNA genes, from the conserved regions to the rapidly
evolving regions known as divergent domains or expansion segments, have
proven useful for investigating evolutionary divergences that occurred
throughout the history of the metazoans (Sogin et al., 1986; Patterson, 1989;
Lake, 1990a; Larson, 1991a). The more conserved regions have been useful for
looking at relationships among the major phyla (e.g., Elwood et al., 1985;
Field et al., 1988; Sogin, 1989; Gouy and Li, 1989b; Wainwright et al., 1993;
Halanych et al., 1995). The relatively variable regions within the rRNA genes
make them useful for examining relationships within more closely related
groups, such as various groups of algae (Jupe et al., 1980; Perasso et al.,
1989; Eschbach et al., 1991; Larson et al., 1992), angiosperms (Hamby and
Zimmer, 1988, 1992; Wolfe et al., 1989b; Zimmer et al., 1989), fungi (Guadet
et al., 1989; Watanabe et al., 1989; Gargas et al., 1995), mollusks
(Ghiselin, 1988), arthropods (Hancock et al., 1988; Abele et al., 1989; W.C.
Wheeler, 1989; Kim and Abele, 1990; Spears et al., 1992), echinoderms (Raff
et al., 1988; A. Smith, 1989), and vertebrates (Hillis and Dixon, 1989;
Larson and Wilson, 1989; S.B. Hedges et al., 1990; Larson, 1991b; Larson and
Dimmick, 1993; Hillis et al., 1991a, 1993b). The internal transcribed spacer
(ITS) regions are useful for examining relationships among closely related
species (Gonzalez et al., 1990; Phillips and Pleyte, 1991; Lee and Taylor,
1992; Pleyte et al., 1992; C.E. Ritland et al., 1993; Schlötterer et al.,
1994; Vogler and DeSalle, 1994; Vilgalys and Sun, 1994). The nuclear-encoded
ribosomal RNA sequences can be studied by sequencing cloned copies of the
ribosomal genes (Ware et al., 1983; Hassouna et al., 1984; Elwood et al.,
1985), by sequencing PCR-amplified regions (Sogin, 1990; Weisburg et al.,
1991; see primer compilation by Hillis and Dixon, 1991), or by sequencing the
ribosomal RNA directly (Hamlyn et al., 1978; Youvan and Hearst, 1979; Qu et
al., 1983; Hamby et al., 1988; Larson and Wilson, 1989; Bachellerie and Qu,
1993). The latter approach is facilitated by the fact that the majority of
the RNA in any cell is nuclear-encoded ribosomal RNA.

Sequence comparisons of selected regions of animal mtDNA (including the mtDNA
ribosomal genes) are useful for inferring phylogenetic relationships of
species whose divergences are more recent than those accessible using
nuclear-encoded ribosomal RNA (Moritz et al., 1987). Phylogenetic analyses
have been conducted based on mtDNA sequences of many of the major phyla of
animals, although the greatest number of studies have been conducted on
arthropods (e.g., Alexander, 1991; Simon, 1991; Ballard et al., 1992;
Cunningham et al., 1992; DeSalle, 1992; Cameron,
1993; Simon et al., 1994) and vertebrates (e.g., W.M. Brown et al., 1982;
Higuchi et al., 1984; Hixson and Brown, 1986; Hayasaka et al., 1988;
Holmquist et al., 1988a; Miyamoto et al., 1990; Ruvolo et al., 1991; S.B.
Hedges et al., 1991, 1992; Allard et al., 1992; Ammerman and Hillis, 1992;
Block et al., 1993; Titus and Larson, 1995). The mtDNA sequences that have
received the most attention are the genes for ribosomal RNA (12S and 16S),
cytochrome oxidase I and II, and cytochrome b, as well as the control region,
but other regions are proving to be useful as well.

Most comparative plant DNA sequencing studies have utilized the chloroplast
genome (Clegg et al., 1990; Clegg and Zurawski, 1992; Jansen et al., 1992).
Complete chloroplast DNA (cpDNA) sequences have been obtained for several
species of plants (the first two were sequenced to completion by Shinozaki et
al., 1986 and Ohyama et al., 1986). However, a large fraction of studies of
cpDNA have focused on sequence variation of rbcL, the gene that encodes the
large subunit of ribulose-1,5-bisphosphate carboxylase (e.g., J. Aldrich et
al., 1986a,b; Ritland and Clegg, 1987; Zurawski and Clegg, 1987; S. Turner et
al., 1989; Morden and Golden, 1989; D.E. Soltis et al., 1990; Donoghue et
al., 1992; R.G. Olmstead et al., 1992; M.W. Chase et al., 1993). Taken
together, these studies represent one of the largest sequence data bases (in
terms of number of species) available for phylogenetic analysis.

Many other loci have been cloned and sequenced from several species and used
for phylogeny reconstruction. Loci for which considerable comparative
sequence information is available include (in addition to the rRNA genes and
mtDNA genome discussed above) the globin loci in primates (e.g., Koop et al.,
1986; Goodman et al., 1987; Miyamoto et al., 1987; Holmquist et al., 1988a),
the immunoglobulin genes in rodents (e.g., Ponath et al., 1989a,b), the
alcohol dehydrogenase loci in fruit flies (e.g., Bodmer and Ashburner, 1984;
Schaeffer and Aquadro, 1987; Bishop and Hunt, 1988a), and the aldolase and
DQα genes in both mammals (Lessa, 1992) and skinks (Slade et al., 1994).
Intron-containing regions of single-copy nuclear genes (see Chapter 7) have
been used to compare closely related groups of taxa (Lessa, 1992; Lessa and
Applebaum, 1993; Slade et al., 1993, 1994; Palumbi and Baker, 1994). The
number of loci examined in a comparative manner and the number of taxa for
which such information is available should increase dramatically over the
next several years.

SUMMARY

From the preceding discussion, it should be clear that nucleic acid
sequencing can be used to study virtually any systematic problem, from
studies of evolutionary processes to the phylogeny of life. However, this
does not mean that sequencing is necessarily the best approach to any
problem. Since it is not always a cost- or time-effective method for
obtaining relevant data (see Chapters 2 and 12), other techniques are best
used for many systematic applications. This is especially true of studies
that require examination of many individuals or loci, such as many studies of
intraspecific variation (e.g., geographic variation, reproductive modes, and
heterozygosity estimates) and many studies of closely related species
(hybridization, cryptic species, and recent phylogeny). However, for problems
of phylogeny reconstruction over relatively ancient spans of time (greater
than 50 million years), no other molecular technique is as likely to be
informative as appropriate nucleotide sequence data.

LABORATORY SETUP

Although all nucleic acid sequencing work requires a relatively sophisticated
laboratory, needs will vary depending on the method(s) chosen for isolating
the target sequence. In general, cloning requires somewhat more equipment
(and experience) than PCR amplification or direct RNA sequencing. Table 1
provides a rough idea of the requirements of a typical laboratory for cloning
and sequencing.

Beyond the equipment listed in Table 1, the supply needs for a sequencing
laboratory are similar to those described in Chapter 8 for restriction enzyme
analysis. Cloning work requires a few
Table 1. Primary equipment for sequencing and cloning (columns: Equipment; Use, with protocols in parentheses)
additional supplies, such as disposable petri dishes, culture tubes, and wire
loops for spreading bacterial colonies. Oligonucleotide primers for
sequencing are commercially available for the various cloning vectors, or
specific primers can be made to order by many companies and centralized
institutional facilities. If an oligonucleotide synthesizer is available in
the laboratory, primers can be designed and constructed with little time or
effort (see W.M. Barnes, 1987).

In addition to the materials described for restriction site analysis (Chapter
8), cloning and sequencing require a few specialized enzymes, antibiotics,
and other chemicals. Sanger dideoxy sequencing requires one of several DNA
polymerases, or reverse transcriptase for RNA sequencing (see Protocols 22
and 23). The thermostable polymerases (e.g., Taq or Vent) used for PCR can
also be used as effective polymerases for DNA sequencing. DNA ligase is
needed in cloning to ligate compatible fragments (see Protocols 10 and 18).
Plasmid subcloning requires the use of various antibiotics and substrates (to
ensure the presence of plasmids and for screening recombinant from
non-recombinant plasmids; see Protocols 12 and 18). Vector DNA and host
bacterial strains are available commercially or through exchange with other
laboratories. The remaining reagents are the same as described for
restriction site analysis (Chapter 8).

PROTOCOLS

1. DNA isolation from animals, protists, and prokaryotes
2. DNA isolation from plants, fungi, and algae
3. DNA isolation from minute quantities of tissue
4. RNA isolation from animals
5. RNA isolation from plants
6. Construction of gene banks in lambda bacteriophage vectors
7. Growing bacteriophage
8. Screening
9. Miniprep isolation of lambda bacteriophage DNA
10. Subcloning into plasmids or M13
11. Preparation of frozen competent cells for transformation
12. Transformation of plasmid DNA
13. Transformation of M13 bacteriophage DNA
14. Isolation of plasmid DNA
15. Miniprep isolation of M13 DNA
16. Preparing permanent frozen stocks of plasmid clones
17. Preparation of PCR products for sequencing
18. Cloning of PCR products
19. Purification of PCR products
20. Screening methods for detecting variation in sequencing templates
21. Preparing a sequencing gel
22. DNA sequencing reactions
23. RNA sequencing reactions
24. Cycle sequencing
25. Running a sequencing gel
26. Microsatellites

mals, a few drops of blood can be diluted directly into STE in step 2 (much
greater quantities of mammalian blood are needed, because mammalian
erythrocytes lack nuclei and the DNA must be isolated from the leukocytes).
Extraction of DNA from many species of mollusks (especially gastropods) has
proven difficult; it is usually necessary to experiment with several
different tissues (the gonads work well in many species), or see the
variations of Protocol 1 in Chapter 7. Most insects can be processed whole
after removal of the digestive tract. If isolating DNA from ethanol-preserved
samples, the tissue should be lyophilized first. See Chapter 8 for protocols
for isolating mtDNA and cpDNA and Chapter 7 for alternative methods for
isolating nuclear DNA.

For many tissues, particularly most vertebrate tissues, high-quality DNA can
be isolated routinely using abbreviated procedures. The basic extraction
method (part A) can be used for most tissues, including those that contain
pigments that must be removed from the DNA sample. The alternate method (part
B) is a more condensed procedure that works best with muscle tissue or other
tissues that are not heavily pigmented.
6. Add an equal volume of PCI (Appendix), mix gently but thoroughly and
    incubate at room temperature for 5 min. If the phases separate, gently
    mix again.
7. Centrifuge for 5 min at ≈7000 g (or at high speed in a microcentrifuge).
8. Carefully remove the aqueous layer with a micropipette and a wide bore tip
    and transfer to a clean tube. The aqueous layer is usually the top layer,
    although high salt concentrations can cause inversion of the phases. Be
    careful not to disturb the cellular debris on the interface.
9. Re-extract the aqueous phase with PCI (repeat steps 6-8).
10. Add an equal volume of CI (Appendix), mix gently, and incubate at room
    temperature for 2 min. Re-mix once a minute to prevent the phases from
    separating.
11. Centrifuge for 3 min at ≈7000 g.
12. Carefully remove the upper (aqueous) layer with a micropipette and a wide
    bore tip and transfer to a clean tube. Be careful not to disturb the
    interface.
13. Re-extract the aqueous phase with CI (repeat steps 10-12).
14. Add 1/10 the sample volume (about 45 µl) of 2 M NaCl (or 3 M NaAc or 5 M
    NH4Ac) and 2.5 times the sample volume of ice-cold 95% ethanol.
15. Precipitate the DNA at -20°C for at least two hours (overnight is
    preferable).
16. Centrifuge the precipitate for 10 min at ≈7000 g. Wash the pellet twice
    with 70% ethanol and dry in a vacuum centrifuge.
17. Re-suspend the pellet in 250 µl (the volume will depend on the pellet
    size and desired concentration) of 1x TE or in
    diethylpyrocarbonate-treated distilled H2O. Incubating the sample at
    45-50°C can facilitate dissolution of the pellet.
18. Check the concentration and purity of the sample in a spectrophotometer
    by taking readings at 260 nm and 280 nm. An optical density of 1 at 260
    nm corresponds to a double-stranded DNA concentration of approximately
    50 µg/ml. The ratio of the readings at 260 nm/280 nm should be
    approximately 1.8; lower readings indicate contamination with protein
    and/or phenol. Relative concentration can also be determined by
    electrophoresing the sample on a denaturing polyacrylamide gel or an
    agarose gel and comparing to a standard.
19. Dilute the sample to the desired working concentration with TE.
20. Electrophorese 2 µl of the sample on a minigel to check for degradation
    and determine if it will be necessary to treat the sample to remove RNA.

Part B. Alternative DNA Extraction Method

Steps 1-4: Perform as in part A.
5. Add 1/10 the sample volume of 5 M NaCl and place on ice for 1 hr.
6. Centrifuge at ≈7000 g for 10 min.
7. Transfer the supernatant to a clean tube and shake well.
8. Centrifuge at ≈7000 g for 5 min.
9. Add 2-3 times the sample volume of 95% ethanol. The DNA should precipitate
    immediately.

Protocol 2: DNA Isolation from Plants, Fungi, and Algae
(Time: day 1: ≈2 hr; day 2: ≈30 min)

Many methods have been developed for isolation of high-molecular-weight DNA
from plants (Zimmer et al., 1981; Saghai-Maroof et al., 1984; Rogers and
Bendich, 1985; Doyle and Dickson, 1987). These methods differ primarily with
respect to their requirements for input material (fresh, frozen, or
lyophilized; gram or hundreds of gram quantities) and the use or non-use of
ultracentrifugation steps. The protocol given below is relatively simple and
is useful for the preparation of the small samples of DNA needed for
applications in this chapter. See Chapter 8 for protocols for isolating cpDNA
and mtDNA.
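
The arithmetic behind the spectrophotometer check (Protocol 1, part A, step 18) and the precipitation volumes (step 14) applies to DNA recovered by any of these extraction protocols, and can be sketched in a few lines of Python. This is a minimal illustration, not part of the published protocol: the function names, the `dilution_factor` parameter, and the 1.7 purity cutoff are assumptions introduced here; only the conversion factor (an optical density of 1 at 260 nm corresponds to approximately 50 µg/ml of double-stranded DNA), the target ratio of about 1.8, and the 1/10 and 2.5-volume proportions come from the text.

```python
"""Bench arithmetic for DNA quantification and precipitation (illustrative)."""

# From the text: an optical density of 1 at 260 nm corresponds to
# approximately 50 ug/ml of double-stranded DNA.
UG_PER_ML_PER_A260_DSDNA = 50.0


def dsdna_concentration(a260, dilution_factor=1.0):
    """Approximate dsDNA concentration (ug/ml) of the undiluted sample.

    dilution_factor is a hypothetical convenience parameter: use 20 if
    the reading was taken on a 1:20 dilution of the sample.
    """
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * UG_PER_ML_PER_A260_DSDNA * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; the text gives approximately 1.8 as the target."""
    if a280 <= 0:
        raise ValueError("a280 must be > 0")
    return a260 / a280


def looks_contaminated(a260, a280, cutoff=1.7):
    """Flag a ratio below `cutoff`.

    The 1.7 cutoff is an assumed working threshold, not a value from the
    text, which says only that lower readings indicate contamination
    with protein and/or phenol.
    """
    return purity_ratio(a260, a280) < cutoff


def precipitation_volumes(sample_ul):
    """Return (salt_ul, ethanol_ul): 1/10 volume salt, 2.5 volumes ethanol."""
    if sample_ul <= 0:
        raise ValueError("sample_ul must be > 0")
    return sample_ul / 10.0, sample_ul * 2.5


if __name__ == "__main__":
    # Example: a 1:20 dilution reads A260 = 0.25 and A280 = 0.14.
    print("ug/ml:", dsdna_concentration(0.25, dilution_factor=20))
    print("A260/A280:", round(purity_ratio(0.25, 0.14), 2))
    print("salt, ethanol (ul):", precipitation_volumes(450))
```

For a 450-µl aqueous phase, `precipitation_volumes(450)` gives 45 µl of salt solution, which matches the "about 45 µl" quoted in step 14, and 1125 µl of ethanol.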
1. Grind leaf or flower tissue to a fine powder in liquid nitrogen with a mortar and pestle. (Note: Be very careful while powdering tissue as the mortar and pestle can shatter due to the extreme cold.)
2. Add β-mercaptoethanol (βME) to 2x CTAB extraction buffer (Appendix) to a final concentration of 0.2%. Heat the CTAB plus βME solution in a 60°C water bath for 5 min.
3. Aliquot 500 µl of the buffer plus βME into a 1.5-ml microcentrifuge tube. Add ≈100 mg fine nitrogen powdered tissue and place in a 60°C water bath for 45 min.
4. Add 500 µl of CI. Close the tube and extract by gently inverting the tube. Extract for 10 min.
5. Centrifuge for 5 min at ≈7000 g (high speed) in a microcentrifuge.
6. Transfer the upper (aqueous) phase to a fresh tube using a wide bore pipette tip and re-extract with CI. Centrifuge as above and transfer the aqueous phase to a fresh tube.
7. Add 1 ml of absolute ethanol and allow the DNA to precipitate at -20°C for at least 30 min. (Note: Precipitating overnight substantially increases the yield).
8. Centrifuge for 1-5 min at ≈7000 g to pellet the DNA.
9. Decant the ethanol and briefly dry the pellet in a vacuum centrifuge.
10. Redissolve the pellet in 100 µl of 1x TE. Add 10 µl of 3 M sodium acetate and 2.5 volumes of 95% ethanol. Precipitate at -20°C for 30 min.
11. Centrifuge for 5 min at ≈7000 g. Decant the ethanol and add 1 ml of 70% ethanol to wash the pellet. Recentrifuge for 2 min at 7000 g. Two 70% ethanol washes may be necessary to remove traces of CTAB or chloroform.
12. Dry the pellet in a vacuum centrifuge until all visible traces of ethanol are gone. Do not overdry the pellet. Redissolve in 200 µl of 1x TE. Determine concentration and purity of the sample as in Protocol 1 (part A), step 18.

Protocol 3: Isolation of DNA from Minute Quantities of Tissue
(Time: ≈1.5 hr)

With continued improvements in PCR techniques, even very minute quantities of tissue are sufficient to allow reliable amplification of DNA segments. The previously described methods for DNA extraction usually require at least several hundred micrograms of tissue. Recent advances in forensic techniques have made it possible to extract DNA from single hairs or tissue scrapings. Application of these techniques to molecular systematic problems has allowed extraction of DNA from some fossil tissues and preserved specimens. These methods also have been used to allow non-destructive sampling of populations for survey purposes.

The following protocol is based on Singer-Sam et al. (1989). This method uses BioRad Chelex 100™, which was designed for ion resin exchange. It is a very simple procedure but usually results in template usable for PCR. Only a very small amount of tissue is required and exceeding the recommended quantity will result in decreased extraction efficiency. Optimization may be required in terms of quantity of tissue used and incubation times. However, the basic procedure has been found to work for single drops of blood from vertebrates (especially other than mammals); single hairs from mammals; fossil tissue scrapings; and alcohol and formalin-preserved specimens (see Walsh et al., 1991). Throughout this procedure it is essential that all solutions, instruments, and tubes are sterile because even trace quantities of contaminants can result in serious contamination problems. Instruments should be flamed in alcohol between use and it is a good idea to do control reactions to check for contamination.

1. Scrape a sliver of tissue from a frozen sample (less than 1 mg) with alcohol-flamed forceps and place in a sterile microcentrifuge tube containing 500 µl of 5% Chelex. Wash forceps with water and then alcohol, followed by flaming, in between samples. If using blood, drop several microliters into a tube containing Chelex.
Plate 2 Chromatographs from an automated DNA sequencer. The height of each of the four colored lines indicates the relative intensity of fluorescence that corresponds to each of the four labeled dideoxynucleotides. Hence, the peaks may be read directly as DNA sequences (indicated above the chromatograph). The sequence near the primer (top panel) is clear and easy to read, with well-defined peaks that are easy to distinguish from background noise. Further along the sequence (second panel), lower signal-to-noise results in occasional ambiguities (e.g., base position 186). Within another 100 nucleotides, the peaks become less well defined and ambiguities increase (third panel). Eventually, the peaks are poorly defined and the sequence is unreliable (bottom panel). The length of reliable reads will depend on many factors, including the model of the sequencer, the quality of the template, and the details of the sequencing reaction.
Nucleic Acids IV: Sequencing and Cloning 345
2. Incubate tubes at 56°C for 45 min to overnight, until most of the tissue has disintegrated (times will vary with type of tissue used).
3. Vortex at maximum for 10-15 sec.
4. Heat at 95-100°C for 15 min.
5. Vortex at maximum for 10-15 sec.
6. Store at 4°C.
7. Prior to using for PCR, centrifuge the extracts in a microcentrifuge to pellet the Chelex beads.

5. Pour the guanidinium isothiocyanate/β-mercaptoethanol solution into the mortar and stir until the mixture freezes.
6. Place the mortar in a 60°C water bath until the mixture melts. Stir, then pour the mixture into the centrifuge tube and place it in a beaker of water in the 60°C water bath.
7. Draw the mixture into a syringe (10-ml volume, fitted with an 18-gauge needle) and forcibly eject it into the centrifuge tube. Repeat until the viscosity of the mixture is reduced.
21. Heat to 60°C. Add 1/2 volume (from step 19) of phenol heated to 60°C and mix. Add 1/2 volume (from step 19) of CI; mix for 10 min at 60°C.
22. Cool on ice and centrifuge at 4°C for 10 min (4000 g).
23. Recover aqueous (top) phase (with siliconized Pasteur pipette) into a new 50-ml centrifuge tube.
24. Repeat steps 21-23.
25. Extract twice with CI at room temperature (as in steps 21-23, except for temperature).
26. Add 2-2.5 volumes of absolute ethanol and store at -20°C overnight.
27. Repeat steps 17 and 18.
28. Re-suspend the pellet in DEP-treated, distilled water (≤1 ml).
29. Take optical density readings on a 1/50 dilution (10 µl sample in 490 µl DEP ddH2O) at 260 nm and 280 nm in a spectrophotometer. An optical density of 1 at 260 nm corresponds to ≈40 µg/ml for RNA. Pure samples of RNA have a ratio of optical density readings at 260 nm/280 nm of ≈2.0; lower readings indicate contamination by protein and/or phenol.
30. Separate the sample equally into two microcentrifuge tubes, precipitate one of them with 2-2.5 volumes absolute ethanol, store at -80°C (for long-term storage).
31. To the other tube of sample add dithiothreitol (DTT) and RNasin as follows: 1 µl of 2.5 M DTT per 500 µl of sample (or 10 µl of 0.25 M DTT per 500 µl). Centrifuge, vortex, and centrifuge again. Add 12.5 µl RNasin per 500 µl of sample. Centrifuge, vortex, and centrifuge. Store at -80°C.

Protocol 5: Isolation of RNA from Plants
(Time: day 1: 8-10 hr; day 2: ≈4 hr)

This technique is the most effective for isolation of RNA from plants and algae, as well as from many insects. It is a modification of the procedure of T.C. Hall et al. (1978; for further information, see Hamby et al., 1988).

1. Place 5-10 g of liquid-nitrogen-powdered tissue in a 50-ml polypropylene tube. Add 25 ml hot (90-95°C) borate buffer (Appendix) and homogenize the sample in three 10-sec bursts.
2. Filter the extract through sterile cheesecloth into a fresh tube. Add 0.3 ml of 10 mg/ml proteinase K solution. Incubate for one hr at 37°C. Add 1 ml of 2 M KCl to the tube and chill on ice for 5-10 min.
3. Centrifuge at 16000 g in a swinging bucket rotor for 10 min at 4°C. Filter the supernatant through a double layer of laboratory wipes into a 30-ml glass centrifuge tube. Add 1/4 volume of 10 M LiCl. Freeze the sample on dry ice for 30 min and then keep at 4°C for 2-4 hr.
4. Centrifuge at 13000 g in a swinging bucket rotor for 15 min at 4°C. Pour off the supernatant immediately as the RNA pellet will be loose.
5. Wash and re-suspend the pellet with 5 ml of cold 2 M LiCl. Centrifuge as in step 4.
6. Re-suspend the pellet in at least 2 ml of 2 M potassium acetate, pH 5.5. This pellet often requires extensive vortexing and some warming to re-dissolve. Add 2.5 volumes of ice-cold ethanol and store at least 4 hr at -20°C.
7. Centrifuge at 12000 g in a swinging bucket rotor for 15 min at 4°C.
8. Air-dry the pellet and dissolve in 5 ml STE. (Remove aliquots of 20-50 µl here and after steps 9 and 10 to assay RNA integrity on agarose minigels). Add 5 ml of PCI and extract the sample with thorough mixing for 2-5 min. Let stand on ice for 10 min.
9. Repeat step 7. Remove the top layer and put in a fresh glass centrifuge tube (remove minigel aliquot) and then add 1 ml of 4 M ammonium acetate and 10 ml ice-cold absolute ethanol. Mix well and store 4 hr to overnight at -20°C.
10. Repeat step 7. Dry the pellet in a vacuum centrifuge. Dissolve in 1-2 ml of 1x TE (use more TE with larger pellets). Determine RNA concentration and purity as in Protocol 4, step 29.
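
The spectrophotometric check in Protocol 4, step 29 comes down to a short calculation. The following is a minimal sketch of that arithmetic, assuming the conversion factor (about 40 µg/ml of RNA per unit of optical density at 260 nm) and the 1/50 dilution stated in that step. The function names and the example readings are hypothetical and serve only as illustration.

```python
# Sketch of the Protocol 4, step 29 arithmetic. Function names and the example
# readings are hypothetical; the two constants are taken from the step itself.

RNA_UG_PER_ML_PER_A260 = 40.0   # an optical density of 1 at 260 nm ~ 40 ug/ml RNA
DILUTION_FACTOR = 50.0          # 10 ul sample + 490 ul water = 1/50 dilution


def rna_concentration(a260, dilution_factor=DILUTION_FACTOR):
    """Return the RNA concentration of the undiluted sample in ug/ml."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the 260 nm/280 nm ratio; about 2.0 is expected for pure RNA."""
    return a260 / a280


if __name__ == "__main__":
    a260, a280 = 0.25, 0.13   # made-up readings taken on the diluted sample
    print(f"RNA concentration: {rna_concentration(a260):.0f} ug/ml")
    print(f"260/280 ratio: {purity_ratio(a260, a280):.2f}")
```

A ratio noticeably below 2.0 would point to protein or phenol carry-over, as noted in step 29; the sketch reports the ratio and leaves the judgment to the reader.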
Protocol 6: Preparation of Partial Gene Libraries in λ Bacteriophage Vectors
(Time: day 1 [steps 1-6]: ≈6-10 hr; day 2 [steps 7-9]: ≈2-3 hr)

The following protocol presents an example of lambda cloning that is typical for many commercially available lambda cloning vectors. One can also grow and purify lambda DNA and make extracts rather than using commercial preparations (see Berger and Kimmel, 1987, for details). In step 1, the DNA may be digested to completion with a particular endonuclease that is known to flank the region of interest. If this information is not known, it is usually preferable to digest the DNA partially with an endonuclease that has a short recognition sequence (e.g., MboI) to generate fragments of the desired size.

1. Digest 1 µg of target DNA with the desired cloning enzyme (e.g., EcoRI).
2. Ethanol-precipitate the restriction digest by the addition of 1/10 volume of 2 M NaCl and 2.5 volumes of absolute ethanol. (Note: 3 M NaAc can be used in place of 2 M NaCl, but small traces of NaAc seem to be more detrimental to ligation efficiency).
3. Incubate two or more hours at -20°C, then centrifuge at ≈7000 g for 20 min. Decant the ethanol and dry the pellet in a vacuum centrifuge.
4. Re-suspend the pellet in 10 µl of water.
5. Assay 5 µl of the digested DNA on a minigel with standard lanes containing 0.1 and 0.5 µg of DNA. This will verify that the restriction digest and subsequent ethanol precipitation were successful.
6. Add an equal molar ratio of target DNA to lambda phage arms. For instance, if the approximate average target cloning size is 8 kb and the lambda arms total 40 kb, then add 0.2 µg of the digested DNA (≈2 µl) to 1 µg of lambda phage arms and bring the total volume to 3.5 µl. Next add 0.5 µl of 10x ligation buffer, 0.5 µl of T4 DNA ligase (2 Weiss units), and 0.5 µl of 10 mM ATP, pH 7.5. Mix the ligation reaction thoroughly and incubate for one hour at room temperature, then overnight at 4°C.
7. Package the ligation using a commercial packaging extract, following the manufacturer's instructions. (This step varies slightly with various packaging extracts, but usually involves simply adding the ligation mixture directly to a freeze-thaw lysate and a sonic extract and incubating at room temperature for a few hours).
8. Dilute the packaged phage with 0.5-1 ml of PDB (Appendix). The resulting gene library should contain lo6-loy recombinant phage (depending on the efficiency of the packaging extract used and the quality of the DNA ligation) with inserts from 1 to 23 kb (depending on the cloning vector used).
9. Plate serial dilutions (1 µl-0.1 µl-0.01 µl) of the gene bank to determine the titer and recombination efficiency (see Protocol 7).

Protocol 7: Growing Bacteriophage
(Time: day 1: ≈10 min; day 2: ≈1 hr [plus incubation time])

Recombinant lambda bacteriophage are grown by adding aliquots or serial dilutions of the phage library to appropriate host bacteria, then plating the bacteria and selecting the resulting plaques. For titering libraries, it is usually desirable to plate several 10-fold serial dilutions of the stock to determine the concentration. If relatively few recombinant phage are obtained, or if larger quantities of the library are desired, the library can be amplified (see Berger and Kimmel, 1987). However, it should be cautioned that some recombinant bacteriophage will replicate much faster than others (because of the size of the insert), and that the amplified library will therefore overrepresent some clones and underrepresent others. Therefore, usually it is best not to amplify the library unless absolutely necessary.

For growing lambda bacteriophage, strains of bacteria are selected that do not allow recombination among the phage (recA- strains); these strains are typically supplied with the phage arms and the recA- phenotype can usually be maintained by antibiotic selection. Systems for detection of recombinant versus reconstituted lambda bacterio-
phage also vary with different host strains; some systems use color selection by IPTG/X-Gal (see Berger and Kimmel, 1987) and others use bacteria that only allow recombinant lambda growth. The basic protocol for growing bacteriophage is given below; variations may be required for particular bacterial host strains.

1. Pick a single colony of the host strain from a plate that contains the antibiotic that allows selection for the recA- phenotype, and add to L-broth (Appendix) plus 0.2% maltose plus 10 mM MgSO4 using sterile technique (250 ml of L-broth is enough for most applications). Grow overnight with vigorous shaking (≈300 rpm) at 37°C.
2. Centrifuge in sterile tubes at 1000 g for 10 min to pellet the cells.
3. Re-suspend the cells in one half of the original volume of sterile 10 mM MgSO4.
4. Remove L-broth plates from 4°C refrigerator and warm in incubator (37°C).
5. Mix 200 µl of cells for a 100-mm plate or 450 µl of cells for a 150-mm plate with the phage stock in a sterile culture tube. Incubate at 37°C for 15 min with gentle (≈100 rpm) shaking.
6. While the cells plus phage are incubating, melt L-broth top agarose in a microwave oven and allow it to cool to 48°C. Hold at 48°C in water bath. (Top agarose is preferable to top agar, because the former will not stick to filter lifts as readily).
7. After the infection is complete, add 3 ml (100-mm plate) or 7 ml (150-mm plate) of 48°C top agarose to the culture tube, vortex gently, and pour over the surface of the plate. Tilt the plate to spread the agarose evenly. Grow 6 hr to overnight in a 37°C incubator, until plaques are approximately 1 mm in diameter.

Protocol 8: Screening Bacteriophage Libraries
(Time: step 1: see Protocol 7; step 2: ≈2 hr; steps 3-7: ≈2.5 hr; step 8: ≈2 days; steps 9-10: see Protocol 7; steps 11-12: ≈4 days; total time: ≈1 week)

To find the particular gene or DNA region of interest, one must screen the gene library by plating the phage at an appropriate density (typically 2,000-50,000 plaques/plate), transferring the phage DNA to a binding membrane (filter lift), and hybridizing the filter lift with an homologous probe. This procedure is relatively easy if the gene is present in high copy number (e.g., the rRNA genes, heterochromatic repeats, or mtDNA fragments) and is flanked by appropriate restriction sites for the library that has been constructed. Single-copy genes require screening of many more plaques (often as many as 10^6); this may require plating on larger plates than in the protocol below or use of a lambda strain that accepts larger fragments.

This protocol is among the simplest for identifying clones of interest, although numerous other techniques are more applicable in particular situations. For a review of the various methods, see Berger and Kimmel (1987) or Ausubel (1989).

1. Plate out the phage at a density where the plaques cover the majority of the plate, but do not overlap significantly. Square plates are preferable to round plates, as square filter lifts save film during autoradiography. For a 100-mm square plate, approximately 2,000-10,000 plaques can be screened efficiently. Incubate plates for ≈8 hours at 37°C.
2. Cool the plates for several hours at 4°C to harden the top agarose.
3. Carefully lay a nylon (or nitrocellulose) filter onto the surface of the plate and wait about 2 min for it to absorb moisture (and phage DNA) from the plate. No bubbles should be trapped under the nylon or areas of the plate will not transfer well. While waiting, stick a hypodermic needle containing waterproof ink through the filter into the plate in three to five places. This should mark both the filter and the plate with ink dots so that they can be realigned later.
4. Carefully peel the nylon filter off and place into denaturing solution (Appendix) for ≈2 min. Meanwhile, lay a second filter on the plate and repeat the process, this time waiting ≈4 min before removing the filter. Mark the
Figure 7 (A) Gel for checking a series of lambda bacteriophage clones (even lanes) and their plasmid subclones (odd lanes) digested with EcoRI. Lane 9 is lambda DNA digested with HindIII. The two larger fragments in the even lanes correspond to the two arms of the lambda bacteriophage; the smaller fragment in each of the odd lanes is the linearized plasmid vector. (B) Autoradiograph of Southern blot from check gel shown in Figure 8A, hybridized with an homologous probe to verify clones.
add 7 ml top agar (or top agarose) at 48°C (as in Protocol 7, steps 6-7) and plate on a 150 mm L-broth + MgSO4 + maltose plate (Appendix). Grow 6-8 hr at 37°C. The plaques should be confluent or nearly so.
2. Add 5 ml of PDB (Appendix) to the plate and shake gently at 4°C overnight.
3. Remove the PDB with a Pasteur pipette and transfer it to a glass or polypropylene centrifuge tube. Add 200 µl of chloroform and mix.
4. Spin down the debris at 7500 g for 10 min at 4°C.
5. Collect the supernatant, transfer it to a clean glass or polypropylene centrifuge tube, and add 1 µg/ml of DNase I and RNase A (normally kept as 1 mg/ml stocks).
6. Incubate 30 min at 37°C.
7. Add an equal volume of PEG stock (Appendix) and mix gently.
8. Incubate 1 hr on ice.
9. Pellet the precipitated phage by centrifugation at 12,000 g for 20 min at 4°C.
10. Decant the supernatant and allow the inverted tube to drain thoroughly. (Note: A white precipitate should be clearly visible).
11. Re-suspend the pellet in 0.5 ml of PDB in a 1.5-ml microcentrifuge tube. Add 5 µl of 0.5 M EDTA.
12. Incubate at 65°C for 15 min.
13. Extract twice with an equal volume of PCI as described in Protocol 1 (steps 6-9). A large amount of PEG will collect at the interface during these extractions.
14. Extract twice with an equal volume of CI as described in Protocol 1 (steps 10-13).
15. Add 50 µl of 2 M NaCl and 1 ml of ethanol to precipitate the DNA.
16. Centrifuge at 7500 g for 10 min to pellet the DNA.
17. Decant the ethanol, dry the pellet in a vacuum centrifuge and re-suspend the DNA in 250 µl of 1x TE. Check concentration and purity spectrophotometrically as described in Protocol 1, step 18. 10 µl of this stock should be ample for a test restriction or a subcloning experiment.
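
Step 17 asks for a concentration check before a 10 µl aliquot is taken for a test digest. The sketch below shows one way to work out how much DNA such an aliquot holds. It assumes the commonly quoted conversion of about 50 µg/ml of double-stranded DNA per unit of optical density at 260 nm, which should be checked against Protocol 1, step 18; the function names, the dilution, and the reading are hypothetical.

```python
# Sketch only: names and readings are hypothetical, and the 50 ug/ml factor is
# the commonly quoted value for double-stranded DNA, assumed here.

DSDNA_UG_PER_ML_PER_A260 = 50.0


def dna_concentration(a260, dilution_factor=1.0):
    """Return the concentration of the undiluted stock in ug/ml."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def dna_mass_ug(concentration_ug_per_ml, volume_ul):
    """Return the mass of DNA (ug) contained in volume_ul of stock."""
    return concentration_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    conc = dna_concentration(a260=0.2, dilution_factor=25)  # made-up reading
    print(f"stock concentration: {conc:.0f} ug/ml")
    print(f"DNA in a 10 ul aliquot: {dna_mass_ug(conc, 10):.2f} ug")
    print(f"DNA in the full 250 ul: {dna_mass_ug(conc, 250):.1f} ug")
```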
9. Re-suspend each pellet in 7 ml of cold 0.1 M CaCl2.
10. Incubate the cells on ice overnight.
11. Add 3 ml of ice-cold 50% glycerol/50 mM CaCl2 to each tube. Mix gently.
12. Aliquot 0.5 ml of cells/tube into pre-chilled tubes and quick-freeze in liquid nitrogen. Store the frozen cells at -80°C. Cells prepared in this manner retain ≥90% of their original competency for up to one year.

Protocol 12: Transformation of E. coli with Plasmid DNA
(Time: ≈2 hr to step 9)

The following protocol is used to isolate and screen plasmid clones created in Protocol 10. The plasmids are introduced into competent E. coli cells (produced in Protocol 11). Because the plasmid carries a gene for antibiotic resistance (typically ampicillin or tetracycline), the transformed bacteria can be isolated by growing the cells with the appropriate antibiotic. However, cells with both recombinant as well as non-recombinant plasmids will grow under these conditions, so a second screening condition is usually imposed. For some plasmids, this involves a second gene for a different antibiotic resistance that is disrupted by cloning into the target site. Recombinant plasmids are then separated from non-recombinant plasmids by replicate plating on plates with one and with both antibiotics. Most plasmid vectors, however, use color screening for recombinant plasmids. The most common system involves a β-galactosidase gene that bridges the cloning site. By adding appropriate substrates to the plates (X-Gal and IPTG), bacterial colonies that contain non-recombinant plasmids will produce blue colonies, whereas colonies with recombinant plasmids (which have non-functional β-galactosidase genes) will produce white colonies. The following protocol assumes that a plasmid with blue/white screening is used. (For information on alternative screening methods, see Berger and Kimmel, 1987.)

1. Thaw frozen competent cells on ice.
2. Aliquot 200 µl of cells into a sterile tube on ice. This should be enough cells to allow efficient transformation with the DNA from a subcloning experiment.
3. Add ligated DNA in up to 50 µl total volume and sufficient 1.0 M CaCl2 to keep the Ca2+ concentration at 0.1 M.
4. Mix thoroughly and incubate on ice for 30 min.
5. Heat shock the cells at exactly 42°C for 2-3 min.
6. Allow the cells to cool to room temperature, then add 1 ml of L-broth.
7. Incubate the cells at 37°C for 30 min to allow the expression of drug resistance.
8. Spread the cells on L-broth + 1% agar plates. The plates should contain the appropriate antibiotic for the plasmid (e.g., 100 mg ampicillin/l), as well as 50 mg IPTG and 40 mg X-Gal per liter of broth.
9. Grow overnight at 37°C. Colonies that contain recombinant plasmids will be white; colonies that contain non-recombinant plasmids will be blue. DNA can be isolated from white colonies for screening by using a scaled-down version of Protocol 14, part A. After the correct clone is identified (Figure 7), it should be streaked onto a new plate (with the appropriate antibiotic), grown in volume for DNA isolation (Protocol 14), and frozen for permanent storage (Protocol 16).

Protocol 13: Transformation of M13 Bacteriophage DNA
(Time: ≈1 hr to step 9)

This protocol should be followed for transformation of E. coli with M13 clones. Blue/white screening for recombinant DNA (as described in Protocol 12) is used for M13 phage. One-tenth of a subcloning reaction involving 1 µg of M13 DNA will yield sufficient recombinant phage for analysis. Ten 100-mm plates will be sufficient for a transformation involving up to 0.1 µg of vector DNA.

1. Thaw frozen competent cells on ice.
2. Aliquot 200 µl of cells into a sterile tube on ice.
3. Add the ligated DNA and sufficient 1.0 M CaCl2 to keep the Ca2+ concentration at 0.1 M.
4. Mix thoroughly and incubate on ice for 30 min.
5. Heat shock the cells at exactly 42°C for 2-3 min.
6. Allow the cells to cool to room temperature, then aliquot the transformation into a number of sterile culture tubes equal to the number of plates desired.
7. Add 100 µl of a fresh overnight culture of an appropriate strain of E. coli to each tube.
8. Add 4 ml of warm (48°C) top agar and immediately spread on a L-broth + 1% agar plate containing 50 mg IPTG and 40 mg X-Gal per liter of medium.
9. Grow overnight at 37°C. See comments under Protocol 12, step 9.

Protocol 14: Isolation of Plasmid DNA
(Time: Part A: day 1: ≈10 min; day 2: ≈2 hr. Part B: day 1: ≈10 min; day 2: ≈3 hr; day 3: ≈6 hr)

The following protocol contains two parts. Clean preparations of plasmid DNA suitable for most purposes (including sequencing) can be obtained by following part A of the protocol. If further purification is necessary, the CsCl protocol (part B) can be used, but part B requires an ultracentrifuge. Either part can be scaled up or down as needed. For alternative protocols and modifications, see H. Miller (1987).

suspension to 50-ml polypropylene centrifuge tubes. Incubate at room temperature for 5 min.
4. Add 8 ml of freshly made 0.2 M NaOH plus 1% SDS. Mix by hand and incubate at room temperature for 5-15 min. The solution should become less viscous during this time.
5. Add 6 ml of 5 M KAc, pH 4.8. Vortex thoroughly, then incubate on ice for 5 min.
6. Centrifuge at 7500 g for 10 min at 4°C. Carefully transfer the supernatant to a new tube.
7. Add an equal volume of PCI and vortex. Centrifuge for 1 min at 7500 g. Transfer the aqueous (top) phase to a new tube.
8. Add an equal volume of ether and vortex. Centrifuge for 10-20 sec. Remove ether (top layer), and save lower layer in tube.
9. Add 2.5 volumes of cold absolute ethanol, mix thoroughly, and incubate 10 min or longer at -20°C.
10. Centrifuge at 8000 g for 5 min at 4°C to pellet the plasmid DNA. Decant the ethanol. Add 1 ml of 70% ethanol and transfer the DNA pellet plus 70% ethanol to a microcentrifuge tube. Spin in microcentrifuge for 1 min, decant ethanol, and dry the DNA in a vacuum centrifuge until the ethanol has just evaporated.
11. Dissolve DNA in 1 ml of 1x TE. Add 10 µg/ml of RNase A and incubate at 37°C for 30 min.
12. Add 0.1 ml 5 M KAc. Repeat steps 7-10.
13. Dissolve DNA in up to 1 ml of 1x TE, and check concentration and purity.
loop and spread on a plate with appropriate antibiotics.

Protocol 17: Isolation of PCR Products for Sequencing
(Time: ≈5 hr for double-stranded template, ≈10 hr for single-stranded template)

A complete discussion of PCR techniques is given in Chapter 7; see Protocol 2 of that chapter for the basic amplification procedure. Among parameters that can be varied to optimize amplification are the concentrations of DNA template, primers, dNTPs, Mg2+, KCl, and Taq polymerase, as well as the length and temperature of the annealing and extension cycles (see Gyllensten and Erlich, 1988; Lawyer et al., 1989; T.J. White et al., 1989; Kocher and White, 1989; Chapter 7). There are a number of DNA polymerases that may be used for PCR amplifications (e.g., Taq, Vent, DeepVent, Pfu).

be cloned into a sequencing vector (often required for fragments larger than ≈600 bp), it is helpful to incorporate a restriction enzyme recognition site on the 5' end of the primers. A primer so constructed should have an additional 2-4 bases 5' to the restriction site. Mismatches at the 5' end of the primer will not usually impede the amplification process, although an absolute match of the primer to the target DNA is preferable. Although primers as short as 17 bp have been used effectively, it is usually desirable to use primers in the range of 25-35 bp. Care should be taken to match the melting temperatures (Tm) of the two primers: Tm = [4 x (#G's + C's)] + [2 x (#A's + T's)]. Primer mixtures with up to 256-fold degeneracy have been used successfully in PCR amplification, although more than 32-fold primer degeneracy often results in highly heterogeneous amplification products, which may result in ambiguities in sequences produced. Degeneracy should be no more than two-fold at any one site.
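
The melting-temperature rule and the degeneracy limits quoted above are easy to check for a candidate primer pair. The sketch below is one way to do so, assuming primers are written with the standard IUPAC nucleotide codes. The function names and the example sequences are hypothetical; only the 4/2 rule and the degeneracy thresholds come from the text.

```python
# Sketch of the two primer checks described in the text. Function names and
# example primers are hypothetical; the rule Tm = 4(G+C) + 2(A+T) and the
# degeneracy limits are the ones quoted in the protocol introduction.

# Number of bases that each IUPAC nucleotide code can stand for.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def melting_temperature(primer):
    """Return Tm (deg C) = 4 x (G + C) + 2 x (A + T) for a non-degenerate primer."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 4 * gc + 2 * at


def degeneracy(primer):
    """Return (total fold degeneracy, largest degeneracy at any one site)."""
    folds = [IUPAC_FOLD[base] for base in primer.upper()]
    total = 1
    for fold in folds:
        total *= fold
    return total, max(folds)


if __name__ == "__main__":
    forward = "ATGCGTACCGGATCCAT"   # made-up 17-mer
    reverse = "TTAGGCCATGCAGTCAA"   # made-up 17-mer
    print("Tm forward:", melting_temperature(forward))
    print("Tm reverse:", melting_temperature(reverse))
    total, worst_site = degeneracy("ATGGAYCCNTGG")  # made-up degenerate primer
    print("fold degeneracy:", total, "largest per-site:", worst_site)
```

A primer containing N is four-fold degenerate at that position, so the per-site value returned by `degeneracy` flags it against the two-fold guideline even when the total stays well under 32-fold.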
Choice of a polymerase may vary depending on the size of the target fragment, the denaturing temperature to be used, and the necessity to have or not to have proofreading activity associated with the enzyme. For example, Vent polymerase is stable at higher temperatures than Taq polymerase and is recommended for target sequences that may have strong secondary structure, because the denaturing temperature can be raised to 99°C. "Long-ranging PCR" (i.e., amplification of large fragments) has varied performance with different polymerases and works best using a combination of enzymes, one of which has proofreading activity (see W.M. Barnes, 1994). Remember that each enzyme has its own requirements for buffer composition (e.g., Vent uses MgSO4, whereas Taq uses MgCl2) and when combining enzymes, one buffer may work better than the other. For example, Taq works well in Vent buffer (in fact, it sometimes works better in Vent buffer) but not vice versa.

Careful primer design is critical to success of PCR amplification. The two primers should be complementary to opposite strands, and should flank the target sequence at a distance of up to 4 kb (larger fragments can be amplified with diminishing success). If the amplified fragment is to

Several methods have been developed for sequencing DNA from PCR reactions. Parts A and B describe two methods for sequencing the amplified product. Sequencing single-stranded template through asymmetric reamplification (part A; Gyllensten and Erlich, 1988) usually is limited to short (≤600 bp) amplified fragments; longer fragments can be sequenced directly using part B. Both direct sequencing options (parts A and B) require homogeneous amplification product. If the PCR product is heterogeneous, or a clone is desired, then the DNA should be inserted into a sequencing vector for analysis (see Protocol 18).

Part A. Asymmetric Reamplification

1. Purify amplified DNA samples using low-melting-point agarose gels (Protocol 19) or Centricon 30™ cartridges (Amicon Corp., Danvers, MA, USA). Wash the Centricon 30™ cartridges by applying 2 ml of TE and centrifuging at 4800 g for 10 min at 4°C. Then add the 2 ml of amplified DNA solution to the cartridge and centrifuge as above for 15 min. Discard the solution in the reservoir. Collect the purified DNA sample by inverting the cartridge and centrifuging at 200 g for 2 min. The final volume of DNA sample should be
approximately 100 pl. The yield of DNA Marchuk et al., 1991).Blunt-ended cloning may be
should be approx~mately7-10 pg. used but it is first necessary to enzymatically re-
2. Repeat the amplification (Chapter 7), but with move the 3' overhang using an enzyme with 3' -+
a 1:100ratio of the two primers (some experi- 5' exonuclease activity (see Scharf, 1990; Marchuk
~l~el~tation
in primer ratio may be necessary). et al., 1991j. T4 DPdA polymerase can be used to
Xttpr asymmetric amplification, the low-con- blunt both 5' and 3' overhanging ends in the same
centration primer can be used to sequence the reaction. The exonuclease activity of the enzyme
fragment (see Protocols 21,22, and 25). digests the 3' overhangs to create a blunt end and
the polyn~eraseactivity end fills the 5' overhangs.
If there is a restriction site overhang, Klenow frag-
X'JI ! 1%'i~,olafionof Dijui)lc-Stranbcd 13Nn for
ment can be used because it will cleave the single
> ~ C ! L - ~ ~ C ~ J Z P ,
nucleotide overhang without digesting further.
1 Concentrate the PC17 product to -25 pl total The blunt-ended product then needs to be phos-
voluine in a vacuum centriI%ge/concentrator. phorylated (cold kinasing reaction) prior to liga-
2. Prepare a SepharoseTM CL-6B column (Boeh- tion to the vector. Blunt-end cloning tends to be
ringer Mannl~eim)by mixing thoroughly. Re- less efficient than sticky-end cloning, so screening
move the top cap, then the bottom cap, and with a color selection vector is recommended.
drain excess buffer from the column. Spin the Vectors such as ~ l u e s c r i ~(Stratagene,
t'~ La Jolla,
column 2.5 min at 1100 g.Discard the buffer CA, USA) work on the principle that vectors with-
and repeat spin. out an insert will have a functional fi-galactosi-
3. Vslng a new collection tube, add the sample dasc gene and transformed bacterial colonies will
i1.01~step 1 ( 4 5 ,ul)
to the middle of the col- turn blue, whereas vectors with an insert will not
urnn. S p ~ nfor 10.5min at 1 1 0 0 to
~ recover the have the correct enzyme and colonies will remain
purified DNA. white (see Protocol 12). Other methods are avail-
able that claim to increase the efficiency of blunt-
4. To prepare DNA template for sequencing, use
end ligations (see Liu and Schwarz, 1992) but will
≈1-2 µg of the purified DNA (≈10 µl) in Protocol
not be described here.
22, part A. Sequence the product using
A more efficient method of cloning PCR products
modified T7 DNA polymerase (Tabor and
exploits the A-overhangs created by the action
Richardson, 1987) as described in Protocol 22.
of Taq polymerase without the necessity for
further modification of the template prior to ligation
(Marchuk et al., 1991). Although several commercially
Protocol 18: Cloning Methods for PCR Products
prepared TA-cloning kits are available
(e.g., Invitrogen™, Novagen™), the procedure is
(Time: Part A, sections I-III: ≈5-6 hr; section IV: 2 relatively straightforward. A T-vector is created by
hr, plus overnight; sections V-VI: 2-5 hr, plus digesting an appropriate plasmid (e.g., Bluescript™)
overnight. Part B: 2 days) with a restriction enzyme that has only a
For templates that are difficult to amplify or when single restriction site in the vector (e.g., EcoRV).
heterogeneity of sequences within a size class is The digested vector is then incubated with Taq
suspected, cloning of PCR products can lead to polymerase and dTTP. The absence of any other
better quality and less ambiguous sequences. Several nucleotides in the mixture results in the addition
cloning methods have been developed specifically of a single thymidine at the 3' end of each fragment.
for PCR products and have increased the efficiency The vector and PCR product then have
of cloning. One of the difficulties with complementary single-base 3' overhangs. The 3'
cloning PCR products has to do with the terminal T-overhang inhibits self-ligation of the vector, and
transferase property of Taq polymerase, which results the unphosphorylated 5' end prevents ligation of
in the addition of a single nucleotide (usually PCR products to each other. One important piece
adenosine) to the 3' end of the sequence (see of information that is not specified in most of the
Nucleic Acids IV: Sequencing and Cloning 357
TA-cloning kit protocols is that ligations using T-vectors 7. Add 5 µl (1/10 volume) 3 M NaAc (do not
require higher concentrations of ligase use NH4Ac because it interferes with kinase
than is normally required (≈4×). Transformation, activity).
color selection, and screening can be performed as 8. Add 150 µl (3 volumes) of ethanol.
for blunt-end reactions. The efficiency of this 9. Precipitate for 20 min at -20°C.
method is thought to be 100-fold that of blunt-end
cloning when using unmodified PCR products. It 10. Vortex. Centrifuge for 5 min at 4°C.
also requires fewer steps than blunt-end reactions 11. Decant the ethanol.
and is quicker to perform. 12. Dry, re-suspend in 15 µl ddH2O, and place at
Whichever method is used, once the PCR 37°C to dissolve the DNA.
products have been cloned, screening for inserts of
the appropriate size may be accomplished in several SECTION II. COLD KINASING OF PCR PRODUCTS
ways. A quick method (e.g., part A, section V) PRIOR TO CLONING
may be used as a rapid screen to search for positive
1. Combine:
clones. PCR can also be utilized to amplify the
region of the vector that contains the insert. These 15.0 µl ddH2O with DNA (blunt-ended)
methods both rely on the ability to distinguish between 2.0 µl 10× kinase buffer
clones with and without inserts on the basis 1.0 µl DTT (reducing agent to prevent enzyme
of differential migration through minigels. Digesting from oxidizing)
the vector DNA with restriction enzymes 2.0 µl rATP (phosphate donor)
greatly improves the ability to detect differences in 0.5 µl T4 DNA kinase
migration of positive versus self-ligated clones. 2. Incubate mixture at 37°C for 30-45 min.
However, preparation of templates for sequencing
3. Add 30 µl ddH2O (to 50 µl) and phenol-extract
is best performed using a more thorough preparative
as above. Centrifuge for 2 min (~7000 g).
method (i.e., minipreps). There are many
miniprep methods available. Two methods were 4. Precipitate with 5 µl of 3 M NaAc, 150 µl
described in Protocols 14 and 15 and we have included ethanol, and 0.5 µl of 20 µg/µl tRNA.
a rapid method here (part A, section VI). 5. Re-suspend in 12 µl ddH2O.
Although the STET method (part A, section VI) is 6. Vortex and centrifuge briefly. Place in heating
faster, alkaline lysis/SDS methods (such as in Protocol block at 37°C for 5 min to dry pellet.
14, part A) appear to produce plasmid preparations
7. Add 2 µl to a new tube. Add 10 µg (1 µl of
of more consistent quality.
stock solution) of RNase to remove tRNA.
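The ethanol precipitation steps in this protocol use fixed proportions: 1/10 volume of 3 M NaAc and 3 volumes of ethanol, both reckoned against the starting sample volume (5 µl and 150 µl for a 50 µl sample). A minimal Python sketch of that arithmetic follows. The function name and the idea of scaling to other sample volumes are illustrative assumptions, not part of the protocol.

```python
def ethanol_precipitation_volumes(sample_ul, salt_fraction=0.1, ethanol_volumes=3.0):
    """Return (salt_ul, ethanol_ul) for a sample of the given volume.

    Both amounts are computed from the starting sample volume, which is
    how the protocol's 50 µl example works out (5 µl NaAc, 150 µl ethanol).
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    salt_ul = sample_ul * salt_fraction
    ethanol_ul = sample_ul * ethanol_volumes
    return salt_ul, ethanol_ul


salt, ethanol = ethanol_precipitation_volumes(50.0)
print(f"3 M NaAc: {salt:.1f} µl, ethanol: {ethanol:.1f} µl")
```

For other sample volumes the same call applies, but the protocol itself only specifies the 50 µl case.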
Part A. Blunt-End Cloning 8. Heat at 37°C for 5 min.
SECTION I. BLUNT-END REACTION 9. Electrophorese a sample of the reaction on a
1. Combine: minigel to check quality and relative quantity
of DNA.
7.5 µl ddH2O + DNA
1.0 µl 10× polymerase buffer 10. Use 5 µl of the kinased DNA in ligation reactions.
1.0 µl 5 mM dNTP
0.5 µl T4 DNA polymerase
2. Incubate mixture for 30 min at 37°C.
1. Combine:
3. Bring volume to 50 µl with ddH2O.
11.5 µl insert DNA plus ddH2O
4. Add an equal volume of 25:24:1 phenol:chloroform:isoamyl 2.0 µl blunt-ended vector DNA (approximately
alcohol.
20 ng)
5. Vortex. Centrifuge for 15 min (~7000 g). 2.0 µl 10× ligation buffer
6. Extract top layer. 2.0 µl 10 mM rATP
358 Chapter 9 / Hillis, Mable, Larson, Davis & Zimmer
µl of the sample on a minigel with uncut vector dence of size polymorphisms, centricon filtration
as a standard to check digestion. (e.g., Millipore™ MC40) is the simplest and most
5. To add T-overhang, combine: efficient method. The method is very straightforward
and will not be described here. However, it
85 µl vector digestion can result in sequence ambiguities close to the
10 µl Taq 10× buffer primer if primer-dimer bands are present. Direct
2 µl dTTP (100 mM) purification of PCR products using a silica binding
3 µl Taq polymerase (2 U/µl) matrix (glassmilk) can reduce these problems
Add 75 µl mineral oil overlay. because it tends to remove small segments of
6. Incubate at 70°C for 2 hr in a thermal cycler or DNA. However, the most reliable results often are
a heating block. obtained by gel-purifying target bands using low-
melting-point agarose (LMP). The agarose can be
7. Extract with PCI (Appendix) and chloroform
eliminated by phenol extraction but cleaner templates
(see Protocol 1, steps 6-8 and 10-12). and better yield can be achieved by using
8. Ethanol-precipitate (see Protocol 1, steps one of the other purification methods described in
14-16) 15-20 min at -80°C. this protocol. Gel purification is preferred when
9. Re-suspend pellet in 100 µl 0.1× TE. This PCR primers result in amplification of multiple
should be enough for 50 ligations. Store at products of different sizes.
4°C. LMP gels use the same buffers and procedures
as do normal agarose gels but normally are
run at a relatively high percentage of agarose
SECTION II. LIGATION REACTION
(1.2-2%) to allow maximum separation of bands
1. Combine: and an easier handling consistency of gels. TAE
4 µl ddH2O buffer is recommended over TBE because borate
1 µl 10× ligation buffer (Appendix) is thought to interfere with sequencing reactions.
1 µl 5 mM ATP Although there are many methods that have been
2 µl T-vector (prepared in section I) used to extract DNA from agarose gels, we will
1 µl PCR product (fresh is better; purification describe two here. The first is based on isolation
may be unnecessary) of DNA on a silica binding matrix as described by
1 µl T4 DNA ligase (3-4 U/µl) L.G. Davis et al. (1986). Kits based on this procedure
2. Incubate overnight at 12°C. (or slight modifications thereof) are available
from several companies. The method recovers up
3. Transform with DH5α competent cells (Protocol to 90% of the initial DNA template and results in
12) as for blunt-end cloning (see part A, the elimination of excess proteins, salts, unincorporated
section IV). nucleotides, primers, and other residual
4. Screen for inserts using X-Gal color selection impurities (e.g., small RNAs, ethidium bromide,
followed by screening methods (see part A, and phenol). The second procedure allows direct
sections V-VI). recovery of DNA from agarose gels by migration
into a well that contains a high-salt buffer (Zhen
and Swank, 1993). This method has been found to
Protocol 19: Purification of PCR recover up to 98% of the initial DNA. It has been
Products for Sequencing used to isolate fragments ranging in size from 200
(Time: ≈1-2 hr) to 5000 bp. Although the original protocol recommended
subsequent phenol extraction prior to sequencing,
The efficiency of sequencing reactions by any the high salt does not seem to interfere
method can be improved by purifying PCR-generated with sequencing reactions using either direct sequencing
templates prior to sequencing. If unambiguous or cycle sequencing procedures. Both of
PCR products are generated with no evi- these methods work well and choice of method
may depend on personal preference. The glassmilk 10. Add 500 µl Tris-ethanol wash buffer (Appendix).
procedure (part A) requires a little more time
but the well method (part B) requires more attention 11. Mix and spin in a microcentrifuge for 30 sec.
while the gel is running.
12. Remove the supernatant with a pipette and
If multiple products are produced in a PCR discard it.
reaction, the smallest products usually will be the
most concentrated. If the target products are large, 13. Repeat steps 10-12 one or two times.
yield may be improved by gel-isolating the larger 14. On the last wash, it is important to ensure
products and then using the purified template in that no liquid remains.
a reamplification reaction. If the band excised 15. Add 10-15 µl 1× TE. The volume used will
from a gel actually contains two products of similar depend on the final concentration of product
size, a second gel-purification step may be necessary desired. Higher yield is obtained by performing
following reamplification. two elutions than by a single elution of
larger volume.
Part A. Glassmilk Purification Method 16. Mix well and incubate at 40°C for 5 min.
1. Electrophorese the PCR product on a 1.5% 17. Mix and spin in a microcentrifuge for 30 sec.
LMP gel in 0.5× TAE buffer. The volume of 18. Remove the liquid (DNA plus TE) with a
product loaded onto the gel will depend on pipette and save to a new tube.
the concentration desired and the thickness of
the gel. Large-volume samples either can be 19. Repeat steps 15-18 to increase yield.
precipitated and concentrated into a smaller 20. Spin the tube containing the final sample in a
volume or products can be divided into microcentrifuge for 10 sec to ensure that all
smaller volumes and loaded in several lanes matrix is out of solution. Before using for sequencing,
of the gel. it is best to centrifuge again because
2. Stain the gel with ethidium bromide and excise the matrix can inhibit DNA accessibility.
the target band with a scalpel or razor
blade. Place excised bands into 1.5-ml microcentrifuge
tubes. If the sample was divided 1. Electrophorese samples at a low voltage (e.g.,
among lanes, the excised bands from the appropriate 5 V/cm) on a 1.5% LMP gel in 0.5× TAE containing
lanes can be combined into the 0.2-0.5 µg/ml ethidium bromide. The
same tube.
buffer should just reach the edges of the gel
3. Add 3 volumes of NaI binding buffer (Appendix) but not cover it.
to tubes containing the gel bands. 2. After samples have run halfway, place the gel
4. Incubate at 40°C for 5-10 min to melt gel. rig on a UV-light box (long wave). (Remember
5. Vortex matrix (i.e., glassmilk) thoroughly and to wear appropriate eye and skin protection.)
add 10 µl to gel plus binding buffer. Excise wells directly in front of target
6. Mix well by inverting and incubate at room bands. Wells should be slightly wider than
temperature for 5-10 min. The efficiency of the bands.
binding can be increased by frequent inversion 3. Add 250-400 µl 15% PEG/TAE to the well.
of tubes. The purpose of the PEG is to retard migration
7. Mix and spin in a microcentrifuge for 30 sec. of DNA through the trough buffer so that the
target migrates as a discrete band.
8. Draw off the supernatant with a pipette or aspirator
and discard. 4. Electrophorese the target bands into the excised
wells. It is important to watch carefully
9. Add 500 µl NaI binding buffer. Repeat steps at this step. Run at a low current. When the
6-8. band has moved into the well, transfer the liquid
from the well into a 1.5-ml microcentrifuge et al., 1992). The method described here is based
tube with a pipette. The current may on Wilson and Schulles (1992) and was designed
be reversed if target bands have migrated too for screening M13 plaques. See Lessa and Applebaum
far, but the DNA yield may be reduced. (1993) for recommendations about optimization.
cipitate to remove salt. DGGE (denaturing gradient gel elec-
trophoresis) utilizes double-stranded DNA and is
able to separate homoduplex molecules based on
rsdstacol 20: Screening Methods for differences in migration through a gel containing
linear gradients of denaturants (i.e., urea) at con-
Detecting Variatiort in DNA Sequences stant temperature (see Chapter 8). The basic prin-
Although screening of cloned products by se- ciple is that DNA segments will partially denature
quencing multiple clones can be used to detect at a point along the gradient determined by its
polymorphisms in sequences under comparison, melting point. Mutations will alter the melting
there are a number of other methods that can de- point and result in changes in the rate of migra-
tect single base-pair changes within or among tion. This is the most sensitive method and can
samples prior to sequencing. Lessa and Apple- detect nearly every mutation in DNA fragments
baum (I 993) recently reviewed these methods and up to 500 bp. It also can be combined with het-
their applicability to population biology, so only a eroduplex analysis (Lessa and Applebaum, 1993)
brief summary will be given here (see also Chapter and efficiency can be improved by adding a GC
8). A detailed protocol is given here only for SSCP clamp to PCR products (Myers et al., 1989;
because it is the simplest procedure that results in Sheffield et al., 1992). Other advantages are that
good resolution of sample heterogeneity. optimization of gradients and conditions can be
The simplest of these methods, heteroduplex standardized across templates, it is more versatile
analysis, is designed to test whether a sample than the other methods, and it can be preparative
contains one or two types of DNA by the differ- (i.e., bands can be cut from the gel and used for
ences in mobility expected between heteroduplex sequencing) as well as analytical (see Lessa, 1993).
and homoduplex molecules on acrylamide gels. It However, it also requires the most specialized
is very simple but relatively limited in its sensi- equipment and most complicated procedures.
tivity and applicability. Methods for pouring gradient gels and maintain-
SSCP (single-strand conformational poly- ing constant temperature electrophoresis will dif-
morphism; protocol below) is a technique de- fer depending on the apparatus used. Therefore,
signed to identify allelic variation at a given locus the reader is best directed elsewhere for detailed
(Orita et al., 1989). The technique is useful for detecting protocols. Detailed descriptions of methods and
variation in short fragments of DNA. It equipment required are given by Myers et al.
utilizes the differences in migration in a gel matrix (1986, 1989).
caused by conformational changes of single-
stranded DNA that result from point substitutions, SSCP Protocol
insertions, and deletions. It is relatively 1. Add 1 µl of [α-32P]dATP to the last ten cycles
simple and conditions may be optimized to maximize of PCR reaction.
variation detectability, but optimization may
vary among samples and may require some trial 2. Combine 10 µl of PCR product with 5 µl of
and error (see Lessa and Applebaum, 1993). It is formamide sample buffer (95% formamide, 20
estimated to detect 99% of single base-pair mM EDTA, 0.05% bromophenol blue, and
changes for fragments of 100-300 bp and 89% of 0.05% xylene cyanol).
changes for fragments of 300-450 bp (Hayashi, 3. Heat-denature at 95°C (or place in a boiling
1991a,b), but longer sequences may be used when water bath) for 5 min.
combined with endonuclease digestion (Iwahana 4. Immediately cool on ice.
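The SSCP sensitivity estimates quoted in this protocol (99% of single base-pair changes for fragments of 100-300 bp, 89% for 300-450 bp; Hayashi, 1991a,b) can be kept as a small lookup when deciding whether a fragment should first be digested into shorter pieces. The Python sketch below is illustrative only: the function name, the assignment of a fragment of exactly 300 bp to the shorter class, and the use of None outside the quoted range are assumptions.

```python
def sscp_expected_detection(fragment_bp):
    """Approximate fraction of single base-pair changes SSCP detects.

    Values are the estimates quoted in the text. A fragment of exactly
    300 bp is assigned to the shorter class (an assumption). Returns
    None outside the 100-450 bp range covered by those estimates.
    """
    if 100 <= fragment_bp <= 300:
        return 0.99
    if 300 < fragment_bp <= 450:
        return 0.89
    return None


for size in (150, 400, 800):
    rate = sscp_expected_detection(size)
    if rate is None:
        print(f"{size} bp: no estimate; consider digesting into shorter fragments")
    else:
        print(f"{size} bp: about {rate:.0%} of single base-pair changes detected")
```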
Protocol 21: Preparing a Sequencing Gel
(Time: ≈1-2 hr)
The details of the following protocol will vary de-
pending on the style of sequencing apparatus
used; a simple sequencing apparatus is shown in
Figure 8. The gel spacers can vary in thickness
from 0.2-0.8 mm. If spacers of uniform thickness
are used, the bands at the bottom of the gel will
be widely spaced, whereas those at the top will be
very close together. Much longer sequences can be
read from gels that take advantage of field gradients
produced with wedge-shaped spacers (Ansorge
and Labeit, 1984). With wedge-shaped spacers,
bands will be much more evenly spaced along
the length of the gel. Wedge-shaped spacers can
be obtained commercially, but are expensive and
often not uniform. An effective alternative is to Figure 8 A basic sequencing gel unit. The gel is
combine two layers of spacers at the bottom of a poured between the two glass plates (E1 and E2), which
gel, with only a single layer of spacers at the top. are separated by the teflon spacers (H). Note that the
Experimentation will be required to find the optimal front plate (E2) is slightly longer than the back plate
gradient for a particular sequencing system, (E1) to allow contact between the gel and buffer in the
lower tray. A sharkstooth comb (I) is inserted at the top
but a gradient of 0.2 to 0.8 mm usually is quite effective. of the gel (see Protocol 21). The two plates are held together
by clamps (heavy-duty paper clamps work well)
Reading long sequences (>600 bp) requires and the gel is inserted into the lower well (C) where it is
long sequencing gels (>80 cm), wedge-shaped held in place by a plexiglass bar (B). The top of the gel
spacers, and use of 35S-labeled or 33P-labeled is clamped to the side ears of the upper tank (A); note
that the front of the upper tank is open to allow contact
(rather than 32P-labeled) nucleotides. (Some proprietary between the buffer and the gel. A rubber gasket (G)
acrylamide solutions [e.g., Long Ranger™ forms a seal between the upper tank and the eared
of AT Biochem] also increase the length of readable glass plate (E1). An aluminium plate (F) is clamped to
sequence.) Pouring and handling very long the glass plates to ensure even heating. The electrodes
gels presents some difficulties. An alternative are constructed from platinum wire to prevent corrosion.
The stand (D) can be modified to permit height
pouring strategy to the one given below is to slide adjustment of the upper buffer tray so that gels of many
the plates together, pouring the acrylamide gel different lengths can be accommodated. Sequencing
mixture ahead of the leading edge of the top gels are typically 40-100 cm long and 20-40 cm wide.
plate. With practice, gels without any bubbles can of the gel apparatus and plates used. TEMED
be prepared from very long plates with this technique. should always be added last.]
Another technique for pouring long gels 5. Pour the gel solution between the plates using
involves injecting the acrylamide through a small a 25-ml pipette and a regulating pipette
hole in the bottom of one of the two glass plates bulb or pour slowly and constantly from a
(Slightom et al., 1987). For handling long gels, it beaker. Allow the solution to run down one
may be preferable to bind the gel to one of the edge and fill from the bottom. Avoid forming
plates (using bind-silane [γ-methacryloxypropyltrimethoxysilane]) bubbles between the plates.
rather than to transfer the gel 6. Insert a sharkstooth comb backwards between
to filter paper for vacuum drying. Gels attached the plates at the top, aligning the holes
to the glass can be dried with a hot-air blower or in the comb with the edge of the back plate
in a drying cabinet (see Protocol 25). (shorter one). Allow gel solution to cover the
1. Prepare the inner surfaces of the gel plates (after outer surface of the comb. Clamp into place
cleaning) using 2% dimethyldichlorosilane and allow to rest for one hour at an incline.
solution in 1,1,1-trichloroethane (add ≈1 full Remove the clamps after the gel sets.
Pasteur pipette of silane per surface and 7. Pour diluted electrode buffer on the comb
spread with a lab tissue; polish surface until with a Pasteur pipette. Remove the comb and
smooth). CAUTION: Wear gloves and prepare rinse it clean in distilled water. Re-insert the
the plates in a fume hood, as the silane solution comb with the teeth pointing inward so that
is highly toxic. the tips of the teeth barely touch the surface
2. Clamp the plates together with spacers between of the gel.
them. Be sure that the spacer covers the 8. Cut the tape from the bottom edge of the gel
complete length of the gel plate. For this with a razor blade.
method, do not use a spacer across the bottom 9. Clamp the gel onto the gel-running apparatus.
of the plates.
3. Tape all sides of the gel plates except the top, 10. Fill the upper and lower reservoirs with 1×
making sure that all edges are tightly sealed. TBE buffer.
Re-clamp the sides of the taped plates. An alternative
method involves using a spacer at 11. Use a syringe or micropipetter to clear the
the bottom and clamping the sides and bottom wells formed by the sharkstooth comb. The
of the gel rather than taping. wells are the spaces between the teeth.
4. Mix the following gel solution in a 500-ml 12. Fill the wells with 4.5 µl of stop buffer. Pre-run
flask (for a 4% gel): the gel, setting the current not to exceed
25 mA and the voltage not to exceed 2,000 V.
60 ml urea mix (Appendix) Use a micropipetter with microthin tips or
20 ml 20% acrylamide (Appendix) capillary tubes drawn to a fine tip to load the
20 ml 1× TBE buffer (Appendix) gel. (Length of pre-run = 15-30 min.)
400 µl 10% ammonium persulfate
50 µl TEMED
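The 4% recipe in step 4 follows from simple dilution: 20 ml of 20% acrylamide stock in roughly 100 ml of gel solution gives 4%. The Python sketch below applies the same C1·V1 = C2·V2 relation to other gel percentages and scales the ammonium persulfate and TEMED volumes with gel size. It is a hedged illustration: the function names are hypothetical, the small catalyst volumes are ignored in the total, and how the remaining volume is divided between urea mix and buffer must still follow the Appendix recipes.

```python
def acrylamide_stock_ml(target_percent, total_ml=100.0, stock_percent=20.0):
    """Volume of acrylamide stock for a gel of the target percentage (C1*V1 = C2*V2)."""
    if not 0 < target_percent <= stock_percent:
        raise ValueError("target must be positive and no higher than the stock")
    return target_percent * total_ml / stock_percent


def scaled_catalysts_ul(total_ml, aps_ul_per_100ml=400.0, temed_ul_per_100ml=50.0):
    """Scale the 10% ammonium persulfate and TEMED volumes from the 100-ml recipe."""
    if total_ml <= 0:
        raise ValueError("gel volume must be positive")
    factor = total_ml / 100.0
    return aps_ul_per_100ml * factor, temed_ul_per_100ml * factor


print(acrylamide_stock_ml(4.0))   # the recipe in step 4
print(acrylamide_stock_ml(6.0))   # upper end of the 4-6% range given for DNA gels
print(scaled_catalysts_ul(50.0))  # a half-size gel
```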
[Note: The concentration of acrylamide for Protocol 22: DNA Sequencing
DNA gels should be 4-6%. For RNA gels, use Reactions
8% acrylamide. Gels can also be poured using (Time: Part A: ≈1.5 hr; Part B: ≈30 min; Part C:
a stock solution of the desired percentage of 1 hr)
acrylamide (e.g., 6% working solution, see
Appendix) and adding 10% ammonium persulfate The conditions of DNA sequencing reactions can
and TEMED just prior to pouring the be varied according to (1) the length of sequence
gel. The volume used will depend on the size to be determined; (2) whether single- or double-
stranded DNA is to be sequenced; (3) the base For amplified DNA from Protocol 17, part B,
composition of the primer sequence; (4) the base add 4 µl of 1% acrylamide
composition of the target sequence; and (5) the sequencing
enzyme to be used. In general, these 4. Add 150 µl absolute ethanol and mix.
variations are noted in the following protocol, except 5. Precipitate DNA at -80°C for 45 min or more.
that particular conditions for the various 6. Pellet DNA for 20 min in refrigerated microcentrifuge.
sequencing enzymes should follow the manufacturer's
recommendations. The common sequencing 7. Wash pellet with 70% ethanol (approximately
enzymes are Klenow fragment, modified 150 µl), centrifuge for 10 min in a refrigerated
bacteriophage T7 DNA polymerase (Sequenase™) microcentrifuge.
(Tabor and Richardson, 1987), and Taq polymerase.
Although good results can be obtained with any 8. Wash pellet with absolute ethanol (approximately
of these enzymes, modified T7 DNA polymerase 150 µl), spin 10 min in refrigerated microcentrifuge.
usually provides superior results, especially in regions
of strong secondary structure. If problems 9. Dry the pellet in a vacuum centrifuge. [Note:
arise in sequencing regions of high GC content, This DNA may be stored at -20°C dry before
such as compressions or "stop bands" (i.e., strong proceeding to the next step.]
bands in more than one lane at a given nucleotide
position), it may also be desirable to substitute Part B. Preparation of Solutions and
dITP or 7-deaza dGTP for dGTP in step 1 of part B Termination Tubes
(W.M. Barnes et al., 1983; Gough and Murray,
1983; Mizusawa et al., 1986). These problems also 1. Label four tubes per reaction with G, A, T,
may be reduced by using both Taq polymerase and and C. To each tube add 2.5 µl of the respective
modified T7 DNA polymerase in the sequencing ddNTP mixture. All four mixtures contain
reactions (Austin, 1995; see section on "Interpretation 80 µM dGTP, 80 µM dATP, 80 µM dCTP,
and Troubleshooting"). For sequencing double-stranded 80 µM dTTP, and 50 mM NaCl. In addition,
DNA, the DNA should be denatured each contains 8 µM of the respective ddNTP.
(part A; Haltiner et al., 1985) before starting part B. For dITP sequencing, substitute 160 µM dITP
Single-stranded DNA (e.g., M13 or asymmetric for 80 µM dGTP in each mixture. dNTPs also
PCR products) can be used directly in part C. can be reduced to sequence close to the
primer (i.e., the ratio of ddNTPs to dNTPs can
be altered to adjust readability length).
Part A. Denaturation and Neutralization of
Double-Stranded DNA Template 2. Prepare labeling mix depending on sequencing
distance from primer. Stock: 7.5 µM dGTP
1. Bring 1-3 µg of RNased plasmid DNA (or (or 15 µM dITP), 7.5 µM dCTP, 7.5 µM dTTP.
other double-stranded template) to a volume Dilute stock 1:10 for sequencing close to
of 20 µl with deionized, distilled water. Add 2 µl primer, 1:5 for sequencing 25-300 bp from
of 2 N NaOH. The exact amount of DNA primer, and use undiluted for greater than
will vary depending on the size of the template. 300 bp from primer.
A 1:1 molar ratio should be maintained
between primer and template. 3. Prepare DNA polymerase according to manufacturer's
directions.
2. Incubate at 65°C for 5 min, then place on ice
and allow to cool.
3. Add neutralizing salt mix of: Part C. Primer Annealing and Sequencing
Reaction
2 ,d8 8 NNH4Ac
3 PI 3 3 NaOAc 1. For DNA produced in part A, re-suspend
20 µl ddH2O template in 8 µl of primer (2.5 ng/µl) in a 0.5-
ml microcentrifuge tube. For single-stranded
template from asymmetric PCR, add 7 µl 8. Prior to loading on gel, heat samples at 80°C
DNA + 1 µl 10 µM primer. for 2 min.
2. Add 2 µl of 5× sequencing buffer (e.g., 200
mM Tris-HCl, pH 7.5, 100 mM MgCl2, 250 Part D. Modifications for Microtiter Plate
mM NaCl; this may vary with the DNA polymerase 1-3. As in part C.
used).
4. Preheat two heating blocks, one at 37°C and
3. If the (G+C)/(A+T) ratio of the primer is approximately
the other at 95°C (the 95°C block will heat
0.5 or more, heat the tube to 65°C
faster if it is covered). Use blocks that have
for 2 min, then allow the tube to cool down at been drilled to hold a microtiter plate surrounded
a rate of approximately 1°C/min to 35°C. If by a raised edge.
the (G+C)/(A+T) ratio is less than 0.5, hold
the tube at 37°C for 15 min. Some experimentation 5. Cover two columns of wells on a microtiter
will be required for specific primers. plate with lab tape. Label each row according
Samples can be frozen at this step if desired. to the template to be sequenced. Label each
column G, A, T, C from left to right.
4. Prepare the sequencing cocktail in a microcentrifuge
tube, with the following amounts 6. Add 2.5 µl of termination mix (ddG, ddA,
of reagents for each template to be sequenced ddT, ddC) to each well in the appropriate
(add enzyme just before required): row. Cover plate and place on ice.
2.0 µl dGTP labeling mix (1:20 dilution*) 7. Prepare the sequencing cocktail as in part C,
1.0 µl 0.1 M dithiothreitol (DTT) step 4.
0.5 µl [α-32P]dATP, [α-33P]dATP, or 8. Add 5.8 µl of sequencing cocktail to each of
[α-35S]thio-dATP the template/primer solutions, carefully placing
2.0 µl DNA polymerase in buffer (1:8 dilution the sequencing cocktail on the side of the
with enzyme dilution buffer) tube about 1/3 down from the lip.
*or dilute as recommended in part B 9. Centrifuge the tubes briefly, vortex, and spin
above briefly again. Allow the extension reaction to
proceed for 30 sec to 10 min (see part C, step
Centrifuge briefly, vortex, and centrifuge 5). Place tubes on ice.
briefly again. The sequencing cocktail should 10. Add water to the 37°C block so that a thin
be kept on ice. layer of water covers the depression in the
block. Place the microtiter plate with the
5. For each reaction, add 5.5 µl of sequencing dideoxynucleotides on the block so that water
cocktail, and briefly centrifuge. Allow the extension contacts the bottom of the wells in the
reaction to proceed at room temperature plate. Incubate for 30-60 sec.
for 30 sec to 10 min, depending on how
close or far from the primer you wish to sequence 11. Start a timer. For each template, pipette 3.4 µl
(short extension times allow sequences from the tubes prepared in step 9 into the four
to be read close to the primer, wells (G, A, T, C) from left to right in one row,
whereas longer extension times accentuate beginning with the top row. Mix the contents
bands farther downstream; a happy medium in each well by pumping the solution once
is 2 min). Place tubes on ice. with the pipettor.
6. Add 3.5 µl of this reaction mixture to each of 12. After pipetting the termination reactions, wait
the termination tubes prepared in part B, step until at least 5 min have elapsed on the timer
1 (pre-warmed to 37°C). Centrifuge briefly (it takes about 5-6 min to pipette 10 templates
and incubate at 37°C for 2 min. if no problems arise). Begin placing 4 µl of
formamide dye in the wells in the same order
7. Add 4-7 µl stop buffer (Appendix), centrifuge (e.g., left to right, row by row, top to bottom)
briefly, and place on ice or freeze up to 7 days.
as the termination reactions were pipetted. 6. To the G tube add 1 µl of 1.5 mM ddGTP; to
Work methodically and you should come to the A tube add 1 µl of 8 mM ddATP; to the T
the end of the plate in about the same time tube add 1 µl of 5 mM ddTTP; and to the C
that it took to pipette the termination reactions. tube add 1 µl of 2 mM ddCTP.
Place the plate on ice or freeze. 7. Prepare "Reaction Mixture 1" in the following
13. Just before loading the sequencing gel, denature manner for each RNA sample to be sequenced:
samples on the 95°C block. Adjust the
water level to form a thin layer covering the 3 µl dNTP mix (5 mM each dATP, dCTP,
depression on the block. This may lower the dGTP, dTTP)
temperature of the block somewhat; templates 3 µl ddH2O
will denature at 85°C. Water should 3 µl reverse transcriptase
contact the bottom of the wells on the microtiter
plate. Denature for 3 min and immediately Vortex, spin in a microcentrifuge for several
place the plate on ice. seconds, and store on ice until needed.
8. Add 2.1 µl of the solution from step 4 to each
of the four tubes.
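The annealing rule in Protocol 22, part C, step 3 depends only on the (G+C)/(A+T) ratio of the primer, so it can be checked quickly from the primer sequence. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the threshold is treated as exactly 0.5 although the protocol says "approximately", and a primer with no A or T is simply treated as above the threshold.

```python
def gc_at_ratio(primer):
    """Return (G+C)/(A+T) for a primer given as a string of A, C, G, T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc
    if at == 0:
        return float("inf")  # no A or T: treated as above the threshold
    return gc / at


def annealing_plan(primer, threshold=0.5):
    """Pick the annealing treatment described in part C, step 3."""
    if gc_at_ratio(primer) >= threshold:
        return "heat to 65°C for 2 min, then cool at about 1°C/min to 35°C"
    return "hold at 37°C for 15 min"


# Example primers chosen for illustration only.
print(annealing_plan("GTAAAACGACGGCCAGT"))
print(annealing_plan("ATTATAATTTAAATTAT"))
```

As the protocol notes, some experimentation is still required for specific primers, so the output is a starting point rather than a rule.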
Protocol 23: RNA Sequencing
Reactions 9. Add 2 µl of "Reaction Mixture 1" to each tube
(Time: ≈3 hr) (G, A, T, C), vortex, and spin for several seconds
in a microcentrifuge.
A common problem in sequencing rRNA with reverse 10. Incubate at 48°C for 40 min.
transcriptase is sequencing through regions
of strong secondary structure. One method that 11. Prepare "Reaction Mixture 2" (during step
may help to resolve such sequences involves the 10).
addition of terminal deoxynucleotidyl transferase 3.0 µl dNTP mix
(TdT) following the completion of the reverse 3.0 µl reverse transcriptase
transcriptase extension reactions (DeBorde et al., Vortex, spin in a microcentrifuge for several
1986). This procedure is indicated below as an optional seconds, and store on ice until needed.
step 14.
1. Add 6 p1 of solution of the RNA to be se- 12. Add 1 ml of "Reaction Mix 2" to each tube.
quenced to a microcentrifuge tube. Vortex and then spin for several seconds in a
microcentrifuge.
2. Heat the RNA to >90°C for 5 min in a heating
block. Cool in ice water, and then spin for sev- 13. Incubate at 48OC for 40 rnin. Spin for several
eral seconds in a microcentrifuge. seconds in a microcentrifuge following incu- .
bation.
3. Add 1 ml of 20x reverse transcription buffer
(Appendix) and 2 ml of labeled primer (work- 14. (Optional; see comments above) Add 1 pl of a
ing stock = 0.5 pmol/pl) and 1.5 pl of RNasin mixture of dATP, dCTP, dTTP, and dGTP
(2000 U/ml) in this order, vortex, and then (each at 1 mM) and 10 U of terminal de-
spin for several seconds in a microcei~trifuge. oxynucleotidyl transferase to each tube. Incu-
bate at 37OC for 30 rnin.
4. Incubate at 42OC for 30 min. Spin for several
secoitds in a microcentrifuge following incu- 15. Add 4 ml of stop buffer (Appendix) to each
bation. (Note: To save time, the next three tube.
steps are performed during this incubation.) 16. Heat for 5 min to >90°C. Cool on ice, vortex,
5. For each RNA sample to be sequenced, pre- and then spin for several seconds in a micro-
pare four microcentrifuge tubes marked GI A, centrifuge. Store on ice until use.
T, and C.
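Because "Reaction Mixture 1" in Protocol 23 is specified per RNA sample, preparing it for several samples at once is a matter of scaling each volume. The following is a minimal sketch of that arithmetic, not part of the published protocol: the function name `scale_mix` and the optional `overage` allowance for pipetting loss are assumptions introduced here for illustration.

```python
# Per-sample volumes (in microliters) for "Reaction Mixture 1" as listed in Protocol 23.
REACTION_MIXTURE_1_UL = {
    "dNTP mix (5 mM each)": 3.0,
    "ddH2O": 3.0,
    "reverse transcriptase": 3.0,
}


def scale_mix(per_sample_ul, n_samples, overage=0.0):
    """Scale a per-sample recipe to n_samples.

    overage is a hypothetical extra fraction (e.g., 0.1 for 10%) to cover
    pipetting loss; the protocol itself does not specify one.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_samples * (1.0 + overage)
    return {reagent: round(volume * factor, 2) for reagent, volume in per_sample_ul.items()}


if __name__ == "__main__":
    for reagent, volume in scale_mix(REACTION_MIXTURE_1_UL, n_samples=4).items():
        print(f"{reagent}: {volume} µl")
```

For one sample the mixture totals 9 µl, of which 8 µl is consumed in step 9 (2 µl to each of the G, A, T, and C tubes), so any overage beyond the recipe is a convenience rather than a requirement.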
Nucleic Acids IV: Sequencing and Cloning 367

Protocol 24: Thermal Cycle Sequencing
(Time: ≈4 hours)
As for polymerase chain reaction procedures, optimization of reaction conditions and concentrations of reagents may depend on the quality, quantity, and nature of the template DNA, the particular model of thermal cycling machine, and the specific primers used in the sequencing reaction. The same rules for primer design that are outlined for PCR and Sanger sequencing also apply to cycle sequencing. As for PCR, when beginning cycle sequencing of a new taxon or when using new primers, some trial and error may be necessary to result in optimal sequence production (see Chapter 7).
Although unambiguous sequences may be obtained directly from unpurified PCR products or from bacterial colonies or plaques, further purification may improve resolution of sequences obtained. For example, quality of sequences obtained from plaques may be improved by performing a short asymmetric PCR reaction following the labeling reaction (Mason, 1992). For PCR products, gel isolation of target fragments (see Protocol 19) may be used to reduce ambiguities in sequences caused by primer-dimers or length heterogeneity. Centricon filtration of PCR products may also be used but tends to result in more ambiguities in sequences close to the primer. Efficiency of sample recovery may also be increased by using methods to remove oil from the reactions (Whitehouse and Spears, 1991) or by changing the type of oil used (Ross and Leavitt, 1991). Choice of a purification procedure may require experimentation on the system being used. When heterozygosity of sequences within a particular size fragment is suspected, cloning into plasmid vectors and screening for heterogeneity may be necessary (see Protocols 18 and 20). Less template is required than in standard chain termination methods, but quality is very important.
Many methods exist for cycle sequencing and a number of commercially available kits are on the market. The concentration of radioisotope used for end-labeling should be optimized to achieve an optimal balance between ATP concentration and specific activity, depending on the type of radioisotope used (i.e., γ-33P or γ-32P) and how old it is. Although it is possible to use γ-35S for end-labeling, MJ Research (a manufacturer of thermal cyclers) recently warned that radioactive H2S may be formed when 35S is used in thermal cyclers and therefore it should not be used for cycle sequencing. γ-33P is nice to work with because it is relatively stable and can be used for end-labeling for several months, it does not result in significant degradation of end-labeled primers, sequencing samples can be run for up to three months, and it tends to result in better resolution than γ-32P (Evans and Read, 1992). However, it is about four times more expensive. γ-32P results in degradation of end-labeled primers and sequencing reactions within several weeks. However, it is much cheaper and sequences of almost the same quality as γ-33P can be generated by reducing the amount of γ-32P in the end-labeling reaction, by reducing the amount of end-labeled primer in the sequencing reaction, or by using old γ-32P (i.e., about one half-life after manufacture). Quantity of labeled template loaded onto sequencing gels and times for autoradiograph exposure may also be varied with the reaction conditions and isotope used. Specific cycling conditions may vary among thermal cycler models and may need to be adjusted accordingly. Efficiency of reactions may be increased by using microtiter plates for sequencing.

Part A. End-Labeling Reactions
1. For each primer, combine:
2.5 µl 10x kinase buffer
1 µl 10 µM primer (0.4 µM final concentration)
1 µl T4 polynucleotide kinase (0.4 U/µl final concentration)
3 µl γ-33P ATP (1.2 µCi/µl; 1.2 µM ATP final concentration)
Add sterile ddH2O to a final volume of 25 µl.
2. Centrifuge briefly and incubate at 37°C for 30 min.
3. Incubate at 95°C for 5 min to denature the kinase.
4. Stop reaction by placing on ice or at -20°C.
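The final concentrations quoted for the end-labeling reaction follow from simple dilution (stock concentration times volume added, divided by the final 25 µl). The sketch below checks the primer figure and back-calculates consistent stocks for the other two reagents. The kinase stock of 10 U/µl and the labeled-ATP stock of 10 µCi/µl are assumptions inferred from the stated final concentrations, not values given in the protocol, and the function name is hypothetical.

```python
def final_concentration(stock_conc, added_volume_ul, total_volume_ul):
    """Return the concentration after dilution: C_final = C_stock * V_added / V_total.

    The result is in the same units as stock_conc.
    """
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    if not 0 <= added_volume_ul <= total_volume_ul:
        raise ValueError("added volume must lie between 0 and the total volume")
    return stock_conc * added_volume_ul / total_volume_ul


TOTAL_UL = 25.0  # final reaction volume from step 1

# 1 µl of 10 µM primer, as stated in the protocol.
primer_um = final_concentration(10.0, 1.0, TOTAL_UL)

# Assumed stocks (not stated in the protocol), chosen to match the quoted finals.
kinase_u_per_ul = final_concentration(10.0, 1.0, TOTAL_UL)   # assumed 10 U/µl stock
label_uci_per_ul = final_concentration(10.0, 3.0, TOTAL_UL)  # assumed 10 µCi/µl stock

if __name__ == "__main__":
    print(f"primer: {primer_um:.2f} µM")
    print(f"kinase: {kinase_u_per_ul:.2f} U/µl")
    print(f"label:  {label_uci_per_ul:.2f} µCi/µl")
```

The same function can be reused when the amount of label is reduced, as the text suggests for γ-32P, to see how the final ATP concentration changes.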
18. If the gel was bound to the glass plates using bind-silane, dry with a hot-air blower or in a drying cabinet.
19. Place the gel in a film cassette with autoradiography film for ≈24 hr (the exact length of exposure will vary). Exposure time can be estimated after scanning the gel with a Geiger counter (the relationship between how radioactive the gel appears and exposure time will vary among laboratories). If 35S was used, do not place plastic wrap between the gel and the film.
20. Develop the film for 5 min in a developer tank. Rinse in stop bath and place in fixer for 5 min. Wash well in running tap water and hang to dry.

Protocol 26: Microsatellites
(Time: ≈1 week)
Microsatellites are tandemly repeated DNA sequences of one to six bases in length. Variation in the number of repeats generates length polymorphisms that can be visualized and scored following PCR and polyacrylamide gel electrophoresis. Microsatellites appear to be uniformly distributed throughout human (A.E. Hughes, 1993), mouse (Stallings et al., 1991), chicken (Haberfeld et al., 1991), whale (Tautz, 1989), and insect (Choudhary et al., 1993) genomes. Abundance (every 50 kb in mammals), uniform distribution, and high polymorphism make these markers useful in population genetic and gene mapping studies. Microsatellites have been developed for vespid wasps (Choudhary et al., 1993), chickens (Haberfeld et al., 1991), cattle (Barendse et al., 1994; M.D. Bishop et al., 1994), pigs (Ellegren et al., 1993; Rohrer et al., 1994; Wintero et al., 1992), horses (Ellegren et al., 1992; Marklund et al., 1994), dogs (Holmes et al., 1993; Ostrander et al., 1993), cats (O'Brien, 1993), mice (Cornall et al., 1991; Dietrich et al., 1992, 1993), rats (Serikawa et al., 1992), and, of course, humans (Litt and Luty, 1989; Tautz, 1989; Weber and May, 1989). Canine microsatellites already have been applied in genetic differentiation and hybridization studies of wolves and coyotes, and have been amplified successfully in golden jackals (Roy et al., 1994). Bovine microsatellites have been used successfully in sheep and goats (S.S. Moore et al., 1991). The following protocol has been used to isolate (GT)n microsatellites from cattle, horses, turtles, and lizards.
The average insert size in libraries constructed using the following protocol is approximately 600 bp. The (GT)15 probe hybridizes to 2.9% of insert-bearing colonies in a bovine library and 66% of these contain (GT)n microsatellites. A weaker signal on the grid indicates a reduced likelihood that an insert contains a useful repeat. Longer repeats are more likely to be polymorphic (Weber and May, 1989).
In 5% of the microsatellites that have been isolated by this protocol, the repeating unit is too close to the Sau3AI site to design primers. Another 20% of the microsatellites fail to amplify, and about 20% of the amplified loci are monomorphic. Thus, the yield of useful microsatellite markers typically is about 1 per 100 white colonies. This ratio has been observed in mammals and reptiles, but in birds the ratio appears to be much lower: approximately one useful microsatellite marker per 1000 white colonies screened.
1. Digest 10 µg of total genomic DNA to completion with Sau3AI in a 100-µl reaction.
2. Load the entire digest onto a 0.8% agarose minigel with a low-molecular-weight size standard.
3. After electrophoresis, stain the gel with ethidium bromide and excise fragments between 300 and 800 bp with a sterile razor blade.
4. Recover the DNA from these gel fragments by crushing, freezing, and thawing them several times and then pelleting the agarose by centrifugation.
5. Precipitate the DNA from the supernatant for 1 hr at -20°C by the addition of 1/10 volume of 2 M NaCl and 2.5 volumes of ethanol.
6. Re-suspend the pellet in 90 µl of 10 mM Tris-HCl (pH 8.3), and 10 µl calf intestinal phosphatase (CIP) buffer containing 4 U of CIP.
7. Incubate at 37°C for 1 hr. This treatment removes the 5' phosphate groups from genomic
DNA so that the fragments cannot self-ligate and create chimeric inserts.
8. Remove the CIP by incubating the mixture with 0.5% SDS, 5 mM EDTA (pH 8.0), and 100 µg/ml proteinase K for 30 min at 55°C.
9. Extract the sample with an equal volume of PCI (Appendix) and again with chloroform.
10. Combine the size-selected phosphatized genomic DNA with an equal molar ratio of plasmid DNA that has been cut by BamHI to produce compatible ends.
11. Co-precipitate the mixture by the addition of 1/10 volume of 2 M NaCl and 2.5 volumes of ethanol.
12. Dry the pellet from the co-precipitation and re-suspend it into a 25 µl ligation reaction mix containing 7.5 U of T4 DNA ligase (in ligation buffer plus 0.5 mM ATP) and incubate overnight at 12°C.
13. Transform the ligations by aliquots into DH5α competent E. coli cells (2 µl of ligation mix/50 µl cells). Ligations may be stored frozen at -20°C for many months and will still effectively transform cells.
14. Plate transformed cells on LB plates with ampicillin (200 µg/ml), X-gal (40 µg/ml), and IPTG (100 µg/ml) for blue-white color selection. A low ratio of white:blue colonies usually indicates insufficient genomic DNA in the ligation. A low number of transformants indicates either an excess of genomic DNA in the ligation or poor quality competent cells.
15. Pick white colonies onto a fresh plate covered with a gridded nylon membrane and onto a master plate numbered to correspond to the grid.
16. Allow the cells to grow on the surface of the nylon membrane for 12-18 hr.
17. Lift the membrane off the plates and place sequentially onto saturated blot paper in the bottom of Pyrex™ dishes soaked in: 10% SDS for 3 min; denaturing solution (Appendix) for 5 min; neutralizing solution (Appendix) for 5 min; and 2x SSC (Appendix) for 5 min. This will denature the plasmid DNA and bind it to the membrane.
18. Permanently bind the DNA to the membranes with an overnight incubation in a 65°C dry oven.
19. Hybridize membranes with an end-labeled (GT)15 oligonucleotide (see Chapter 8 for details).
20. Isolate DNA from the colonies that hybridized to the probe and those from the master plates (Protocol 14). Sequence the inserts using the appropriate plasmid primers.
21. Design PCR primer pairs in the regions flanking the microsatellite based on the sequence (see Chapter 7).

INTERPRETATION AND TROUBLESHOOTING

Autoradiograph Interpretation
Although reading autoradiographs of sequence data is relatively straightforward (see Figures 4, 5, and 9A), some practice is required to record the data accurately and to identify and solve problems. When sequencing DNA, it is strongly advisable to sequence both strands, as this provides a check against reading errors. When sequencing RNA, only one strand can be sequenced, so it is necessary to sequence broadly overlapping regions in order to verify the sequence.
Reading sequences from autoradiographs is greatly simplified by use of one of various gel-readers-digitizers coupled directly to a computer. Use of a gel reader reduces human error compared to recording a sequence and then inputting the sequence via a keyboard. Most gel readers and software packages allow previously input sequences to be verified, thus further reducing error. Various automated gel readers have been and continue to be developed, and eventually may replace manual reading of autoradiographs altogether. However, experience in gel reading usually allows higher accuracy of manual sequence interpretation compared to the present automated autoradiograph reading technology. Software for automated reading of chromatographs produced by automated sequencers
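The advice to sequence both DNA strands as a check against reading errors can be illustrated with a short comparison routine. This is only a sketch under simplifying assumptions: both reads are taken to cover exactly the same region with no missing or extra bases, and the function names are hypothetical rather than taken from any gel-reading package.

```python
# Map each base to its complement; upper and lower case are both handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_discrepancies(forward_read, reverse_read):
    """List 0-based positions (on the forward read) where the two strands disagree.

    Assumes both reads span the same region and have equal length.
    """
    expected_forward = reverse_complement(reverse_read)
    if len(expected_forward) != len(forward_read):
        raise ValueError("reads must cover the same region and be the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(forward_read.upper(), expected_forward.upper()))
        if a != b
    ]


if __name__ == "__main__":
    forward = "GATTACA"
    reverse_ok = "TGTAATC"   # exact reverse complement of the forward read
    reverse_bad = "TGTAGTC"  # one base misread on the opposite strand
    print(strand_discrepancies(forward, reverse_ok))
    print(strand_discrepancies(forward, reverse_bad))
```

Any position reported by such a comparison marks a place where the autoradiograph should be re-examined; for RNA, where only one strand is available, the analogous check would compare overlapping reads instead.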
tase extension reactions is helpful (see Protocol 23 and DeBorde et al., 1986). In sequencing DNA, use of bacteriophage T7 DNA polymerase or Taq polymerase (or both; see Figure 10) rather than Klenow fragment resolves many problems with secondary structure; extreme cases can be resolved by using dITP (or 7-deaza dGTP) rather than dGTP in the sequencing reactions (see Protocol 22; also W.M. Barnes et al., 1983; Gough and Murray, 1983).

Sequence Comparison and Alignment
Once the sequence has been obtained, it must be related to other sequences to be of use in systematics. Sequences either can be aligned with known orthologs (or paralogs and xenologs if the evolution of gene families is the object of study), or similarity searches (often incorrectly termed homology searches) can be performed by matching the sequence to all other sequences in a databank such as GenBank. In the latter case, the best matches (or the most interesting ones) are then extracted and aligned for phylogenetic analysis. Alignments may be simple for closely related protein genes, but may be extremely difficult or ambiguous if the sequences are distantly related or come from non-protein-coding regions.

Database Searches
Local alignment algorithms find all subsequence matches above a certain defined threshold. Search of data banks makes use of these algorithms, such as the BLAST algorithm of Altschul et al. (1990), the FASTP algorithm of Lipman and Pearson (1985), or the FASTA algorithm of Pearson and Lipman (1988). An implementation of the BLAST algorithm is distributed as part of the Entrez CD-ROM package (National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD, USA). It also can be accessed across the Internet (to register and obtain client software, contact netinfo@ncbi.nlm.nih.gov); a version is also available on the World Wide Web (http://www.ncbi.nlm.nih.gov/). Versions of other local alignment algorithms are distributed as part of most commercial sequence analysis software packages. It has become commonplace to search the nucleotide databases for the best match of virtually any sequenced segment of DNA, especially if the identity of the fragment is unknown or uncertain (R.F. Doolittle, 1990b). Because some sequence will always represent a best match, it is desirable to know if the match is significantly different from a random match. Lipman and Pearson (1985) described a z statistic for this purpose, which is derived from the particular similarity score used in the search procedure. Briefly, the z statistic equals the difference between the similarity score and the mean similarity score from the database scan, divided by the standard deviation of the similarity scores from the database scan. They suggested the following guidelines: z > 3, possibly significant; z > 6, probably significant; and z > 10, significant. Other approaches to similarity significance testing have been described by Kanehisa (1984), Lipman et al. (1984), T.F. Smith et al. (1985), and Pearson (1990).
The program Entrez is a very useful tool for exploring the nucleotide and protein databases, as well as the associated literature. It is available on CD-ROM (address above), or a network version is available across the Internet (contact netinfo@ncbi.nlm.nih.gov to register and obtain client software). A World Wide Web version is also available (http://www.ncbi.nlm.nih.gov/). After genes of interest have been identified, Entrez may be used to locate many more similar sequences very rapidly. This speed is possible because Entrez contains precompiled lookup tables of connections among similar sequences from prior BLAST searches. It also has direct connections among nucleotide entries, protein entries, and literature references from the various databases, which are distributed with the program on CD-ROM or are accessible across the Internet. Searches of the databases are possible through Boolean queries of almost any of the information associated with the databases. For instance, searching may be done by taxonomic group, text terms in titles and abstracts, key words, author names, accession numbers, Enzyme Commission numbers, sequence ID numbers, medical subject headings, gene names, chemical substances, or MEDLINE ID numbers.
Producers of nucleotide sequences should make their findings available to the public by de-
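The z statistic of Lipman and Pearson (1985), as summarized in the Database Searches discussion, can be sketched in a few lines. The version below is an illustration rather than the published implementation: it assumes the population standard deviation of the scan scores (the text does not say which estimator is used), the function names are hypothetical, and the "not significant" label for values at or below 3 is added here only to complete the guideline table.

```python
from statistics import mean, pstdev


def z_statistic(score, database_scores):
    """z = (score - mean of scan scores) / standard deviation of scan scores.

    Uses the population standard deviation; this choice is an assumption.
    """
    sigma = pstdev(database_scores)
    if sigma == 0:
        raise ValueError("database scores have zero variance")
    return (score - mean(database_scores)) / sigma


def interpret_z(z):
    """Apply the guideline thresholds quoted in the text (z > 3, > 6, > 10)."""
    if z > 10:
        return "significant"
    if z > 6:
        return "probably significant"
    if z > 3:
        return "possibly significant"
    return "not significant"  # label added for completeness, not from the source


if __name__ == "__main__":
    scan_scores = [8, 10, 12]   # toy similarity scores from a database scan
    z = z_statistic(30, scan_scores)
    print(round(z, 2), interpret_z(z))
```

In a real search the scan would contain thousands of scores, so the choice between population and sample standard deviation would make little practical difference.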
Figure 32 Matrix comparison of a portion of the 28S rRNA genes of a frog (Xenopus laevis, vertical axis; Ware et al., 1983) and a mouse (Mus musculus, horizontal axis; Hassouna et al., 1984). The deflections along the diagonal represent insertion/deletion events (see Hillis and Davis, 1987). The letters represent percent similarity over blocks of 30 bp; A: 100%, B: 98-99%, C: 96-97%, etc. All matches of 65% or higher similarity are shown. Note the regions of similarity between GC-rich regions at positions 500-1000 and 2500-3200.

should be identified. For more information on pairwise alignment, see Waterman et al. (1991).

Multiple Alignments
For most phylogenetic studies, sequences must be aligned among multiple taxa, individuals, or genes. In principle, the method of Needleman and Wunsch (1970) could be extended to multiple dimensions, but this approach would be computationally impractical. Many of the recent advances in alignment have been concerned with ways of solving the problem of multiple alignment (e.g., Feng and Doolittle, 1987, 1990; Hein, 1989a,b; S.C. Chan et al., 1992; Higgins et al., 1992; Wheeler and Gladstein, 1992, 1994).
One approach to multiple alignment is to make pairwise alignments, and then add the sequences together by inserting additional gaps as needed. However, the final alignment will be order-dependent, meaning that different alignments will be achieved depending on the order of the pairwise alignments. Feng and Doolittle (1987, 1990) proposed to obtain the order of the pairwise alignments from clusters in an initial tree produced from a matrix of distances across all pairwise alignments. This strategy is implemented in
(A)
22 100 22 120 22 140
Mus           GTCAGCCAGGACTCTCTACCCGCTCACGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Rattus        GTCAGCCAGGACTCTCTACCCGCTCACGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Homo          GTCAGCCAGGACTCTCTACCCGCTCGCGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Rhineura      GTCAGCCAGGATTCTCTATCCGCTCGCGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Cacatua       GTCAGCCAGGATTCGCTATCCGCTCGCGGCAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Xenopus       GTCAGCCAGGATTCTCTACCCGCTCGCGGCAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Rhyacotriton  GTCAGCCAGGATTCTCTATCCGCTCGCGGCAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Typhlonectes  GTCAGCCAGGATTCTCTATCCGCTCGCGGCAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Latimeria     GTCAGCCAGGATTCTCTACCCGCTTGCGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAGC
Cyprinella    GTCAGTCCAGGATTCCTACCCGCTGGCGGTCAAGCCTTCCCTCCGGCTACCGGAGGCAGC
* * * * *
* ( I ** ** ** * * * * *
(B)
22 100 22 120 22 140
Mus           GTCAG-CCAGGACTCTCTACCCGCTCACGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Rattus        GTCAG-CCAGGACTCTCTACCCGCTCACGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Homo          GTCAG-CCAGGACTCTCTACCCGCTCGCGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Rhineura      GTCAG-CCAGGATTCTCTATCCGCTCGCGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAAC
Cacatua       GTCAG-CCAGGATTCGCTATCCGCTCGCGG-CAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Xenopus       GTCAG-CCAGGATTCTCTACCCGCTCGCGG-CAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Rhyacotriton  GTCAG-CCAGGATTCTCTATCCGCTCGCGG-CAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Typhlonectes  GTCAG-CCAGGATTCTCTATCCGCTCGCGG-CAAGCCTTCCCTGCCCGCTACCGGAGGCAAC
Latimeria     GTCAG-CCAGGATTCTCTACCCGCTTGCGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAGC
Cyprinella    GTCAGTCCAGGATTC-CTACCCGCTGGCGGTCAAGCCTTCCCT-CCGGCTACCGGAGGCAGC
* * * * *
Figure 14 Alignment of a portion of the 28S rRNA genes of various species of vertebrates sequenced by Hadjiolov et al. (1984), Gonzalez et al. (1985), Hassouna et al. (1984), Hillis and Dixon (1989), Larson and Wilson (1989), and Ware et al. (1983). The numbers refer to the nucleotide positions of the Mus 28S rRNA gene; sites that are variable among species are marked with an asterisk. Insertions are indicated by dashes. (A) Alignment with no insertions added. Note that this alignment requires 20 variable sites. (B) Alignment with four insertion/deletion events. There are only 10 variable sites in this alignment, including the insertions.
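The effect of gap placement on the number of variable sites, which Figure 14 illustrates, can be explored with a simple column count. The sketch below is illustrative only: it uses just the Mus and Cyprinella rows as transcribed in this text (so its counts are for that pair, not for the full ten-species alignment in the caption), it treats a gap as one more character state, and the function name is hypothetical.

```python
def count_variable_sites(aligned_sequences):
    """Count alignment columns in which the sequences do not all share one state.

    A gap ("-") is treated as an ordinary state; sequences must be equal length.
    """
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return sum(1 for column in zip(*aligned_sequences) if len(set(column)) > 1)


# Rows transcribed from Figure 14 (A), no gaps inserted.
MUS_UNGAPPED = "GTCAGCCAGGACTCTCTACCCGCTCACGGCAAGGCTTCCCTGCCCGCTACCGGAGGCAAC"
CYP_UNGAPPED = "GTCAGTCCAGGATTCCTACCCGCTGGCGGTCAAGCCTTCCCTCCGGCTACCGGAGGCAGC"

# Rows transcribed from Figure 14 (B), with gaps inserted.
MUS_GAPPED = "GTCAG-CCAGGACTCTCTACCCGCTCACGG-CAAGGCTTCCCTGCCCGCTACCGGAGGCAAC"
CYP_GAPPED = "GTCAGTCCAGGATTC-CTACCCGCTGGCGGTCAAGCCTTCCCT-CCGGCTACCGGAGGCAGC"

if __name__ == "__main__":
    print("ungapped:", count_variable_sites([MUS_UNGAPPED, CYP_UNGAPPED]))
    print("gapped:  ", count_variable_sites([MUS_GAPPED, CYP_GAPPED]))
```

Passing all ten rows of each panel to the same function gives the column count for the whole alignment, subject to any transcription errors in the sequences.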
A solution of chloroform and isoamyl alcohol, in the ratio 24:1.

2x CTAB Extraction Buffer
10 g CTAB [hexadecyltrimethylammonium bromide]
140 ml 5 M sodium chloride
25 ml 2 M Tris-HCl, pH 8.0
20 ml 0.5 M EDTA

10x Cycle Sequencing Buffer
300 mM Tris-HCl, pH 9.0
50 mM magnesium chloride
300 mM potassium chloride
0.25% (w/v) NP40
0.25% (w/v) Tween 20

Denaturing Solution
1.5 M sodium chloride
0.5 M sodium hydroxide

Add 0.1% diethylpyrocarbonate to water, wait 12 hr, and autoclave. Used to inhibit RNase.

Dialysis Tubing
To prepare 2 L of solution to boil tubing:
40 g Na2CO3
4 ml 0.5 M disodium EDTA
ddH2O to 2 L

0.5 M EDTA (Ethylenediaminetetraacetic Acid), pH 8.0
186.1 g disodium EDTA
ddH2O to 1 L
Dissolve and pH with sodium hydroxide. Autoclave.

0.05 M glucose
0.025 M Tris-HCl, pH 8.0
0.01 M EDTA

60 ml 5 M potassium acetate
11.5 ml glacial acetic acid
28.5 ml ddH2O

10 g tryptone
10 g sodium chloride
5 g yeast extract
ddH2O to 1 L
Adjust pH to 7.2 with sodium hydroxide, then autoclave. For L-broth plates, add 15 g agar/L before autoclaving. For L-broth + MgSO4 + maltose plates, add sterile MgSO4 to 10 mM final concentration and sterile-filtered maltose to 0.2% after autoclaving. Sterile-filtered antibiotics, IPTG, and X-Gal should also be added after autoclaving, after agar has cooled down to below 50°C. For L-broth top agarose, add 7 g/L agarose to L-broth before autoclaving.
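Recipes stated as molarities, such as the 0.5 M EDTA stock, can be cross-checked by converting molarity to grams of solute. The sketch below assumes that the salt weighed out is disodium EDTA dihydrate with a molecular weight of about 372.24 g/mol; that form of the salt is an assumption, since the recipe does not name the hydrate, and the function name is hypothetical.

```python
def grams_required(molarity_mol_per_l, volume_l, molecular_weight_g_per_mol):
    """Mass of solute needed: grams = mol/L * L * g/mol."""
    if min(molarity_mol_per_l, volume_l, molecular_weight_g_per_mol) <= 0:
        raise ValueError("all arguments must be positive")
    return molarity_mol_per_l * volume_l * molecular_weight_g_per_mol


# Assumed molecular weight of disodium EDTA dihydrate (not stated in the recipe).
MW_DISODIUM_EDTA_DIHYDRATE = 372.24

if __name__ == "__main__":
    grams = grams_required(0.5, 1.0, MW_DISODIUM_EDTA_DIHYDRATE)
    print(f"0.5 M EDTA, 1 L: {grams:.1f} g")
```

If a different salt form is used, substituting its molecular weight gives the corresponding mass.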
BIOLOGICAL CONTEXT
This chapter considers genetic variation within species. The general goals of pop-
ulation genetic studies are to account for and characterize the extent of genetic
variation within species. Variation provides the raw material for future evolu-
tionary change, and different levels of variation in different populations may pro-
vide evidence for different evolutionary events in the past. Much of the chapter is
directed towards estimation and interpretation of F statistics. For human popu-
lations, these measures are to be used in the proposed human gene diversity pro-
ject (Cavalli-Sforza et al., 1991), and they already have been used to adjust allele
frequencies in forensic calculations (Nichols and Balding, 1991). There has been
confusion in the literature over the use of these statistics to estimate migration
rates (Slatkin and Barton, 1989) and over the use of fixed or random statistical
models (Chakraborty and Danker-Hopfe, 1991).
The characterization of variation rests on phenotypic observations. It will be
assumed here that there is a direct relation between phenotype and genotype,
meaning that only discrete data are being considered. Further, it will be assumed
that the crosses necessary to demonstrate that a band on an electrophoretic gel
(for example) does indeed correspond to an allelic form of a single gene have
been carried out. Different genetic entities, whether allozymes, restriction site pat-
terns, repeat copy numbers, or nucleotide subsequences, will be taken to be sub-
stantially independent. Although means for accommodating associations be-
tween Mendelizing units are available (e.g., Weir and Cockerham, 1989a), they
are beyond the scope of this chapter.
The first analyses of genetic data are those that rest simply on the genotypic
state. Avise (1994) has been able to make statements about the phylogeny of
386 Chapter 10 / Weir
species on the basis of mitochondrial genotypes observed in different populations. In this field of "phylogeography," individuals are genotyped and assigned to maternal lineages, and the resulting phylogeny is related to patterns of geographic distribution. In a study of pocket gophers, Geomys pinetis, Avise et al. (1979) found that most mitochondrial haplotypes in their sample were localized geographically, and that related genotypes tended to be geographically contiguous or overlapping.
At the next level, counts of genotypes lead to simple measures of variation such as the number of alleles per locus or the allelic frequencies. The numbers of alleles may be sufficient for the purpose of establishing that there is variation, but it is difficult to use them to compare levels of variation between different populations. For such purposes, allelic frequencies are better suited and appropriate statistics will be discussed. More complex functions of frequencies, such as gene diversity (e.g., Weir, 1989), defined as one minus the sum of squared allelic frequencies, can be used to address the mechanisms for the maintenance of variation.
The main theme of this chapter is that statistical analyses must be based on biological models. The classic model is that of an ideal population, infinite in size and mating at random for a locus at which there are no disturbing forces such as mutation, migration, or selection. Such a model leads to predictions of equality, for example, the relationship between gene diversity and heterozygosity (defined as the frequency of heterozygotes). A significant difference between two quantities may indicate the violation of one or more assumptions of the model.
One of the first steps to introduce realism into the classic model is to suppose that the population is finite, although still mating at random. This model allows quantities such as the change in frequency of heterozygotes to be related to population size and may suggest a means for estimating population size (e.g., Laurie-Ahlberg and Weir, 1979). Unfortunately, the large sampling variance of heterozygosity makes such attempts of limited use in samples that are measured only in the hundreds. Statistics constructed from the joint frequencies of pairs of loci have even more severe problems. Linkage disequilibrium refers to the departure of such joint frequencies from the products of single frequencies, and theory relates the expected value of squared linkage disequilibrium to population size and recombination rates between the loci. Although it would be of great advantage to be able to estimate either of these quantities from frequency data, once again the large sampling variances of linkage disequilibria make this unlikely (Hill and Weir, 1994), unless ancestral combinations of alleles can be inferred (Kaplan et al., 1995). It is the variance caused by the stochastic nature of the evolutionary forces acting on the population, rather than that caused by sampling of individuals for observation, that causes difficulties. The variance introduced by the evolutionary forces cannot be reduced simply by taking larger sample sizes. Another problem with basing inferences on linkage disequilibrium is that most applications have assumed relations that hold only in populations in equilibrium for the joint effects of drift and recombination. Most natural populations are unlikely to be in such equilibrium.
Further refinements of the classic model allow for specified forces of selection, migration, or mutation. Each of these leads to changes in allelic frequencies over time. When data are available from several generations, it may be possible to make inferences about the evolutionary events acting within the species. Good discussions have appeared previously in the literature. Prout (1965) detailed the difficulties in estimating the strengths of selection acting at different stages of the life cycle, while Christiansen and Frydenberg (1973) showed the power of having data collected for mother-offspring pairs. Estimating migration or mutation rates in a population generally proceeds under the assumption that the population is at equilibrium (e.g., Chakraborty and Leimar, 1987), so that the various evolutionary forces are no longer changing the quantities being measured. Although theory relates allelic frequencies and functions of allelic frequencies to mutation or migration rates, care must be taken in the analyses. Estimation of migration rates among local populations, for example, should use models that rec-
Intraspecific Differentiation 387
ognize that migration prevents allele frequencies in different populations from being independent. Easteal (1986) was able to estimate migration rates without an assumption of equilibrium. He took advantage of the known history of different populations of giant toads, Bufo marinus. Comparisons among allelic frequencies between introgressed populations and the original populations allowed rates of admixture to be estimated.
This chapter is concerned with survey data from one or several natural populations. It is becoming increasingly easy to collect molecular data on many loci for many individuals in many populations. Temporal information is not generally available, so direct evidence for selection, for example, is not obtained. What the data do allow, however, is a characterization of the relationships between genes at various levels in a hierarchy.
The degrees of relatedness of genes within individuals, between individuals within subpopulations, between subpopulations within populations, and so on, can all be estimated. Comparisons between estimates allow inferences to be made about the forces acting within a species. If genes within individuals appear to be related in a population, there may be departures from random mating. If genes within individuals appear related to a different extent in different populations, the possibility of different mating systems or population sizes in those populations needs to be investigated. If different loci show different degrees of relatedness for genes within individuals within the same population, the possibility of selective forces acting on those loci needs to be considered. Note that although the degrees of relationship may be expressed in terms of measures of inbreeding, the most general interpretation regards them as correlations (Cockerham, 1973), as detailed below.
Estimation of levels of relatedness of genes in a hierarchical sampling scheme has been considered by many authors for a wide variety of species. Schoen (1982) looked at populations of

pling scheme of regions and populations within regions for the same species. These authors were all concerned either with aspects of genetic heterogeneity among populations within a single species or with variation in mating structure.
The growing availability of molecular information has allowed more detailed studies of within-species variation. For example, measures of population structure for human VNTR loci have been estimated by Weir (1994). Gaut and Clegg (1993a,b) studied nucleotide variation at the Adh-1 locus in the genus Zea (maize and teosinte) and in pearl millet, Pennisetum glaucum. They were interested in the evolution of that locus, and estimated divergence times between maize alleles to be as long as 2 million years. Schaeffer and Miller (1991) looked at sequence variation at the Adh locus in Drosophila pseudoobscura and used that information to infer the time of divergence between populations in Colombia and California. Statistical procedures have been developed to allow sequence data to provide information about factors such as geographic subdivision (Hudson et al., 1992a; Slatkin and Maddison, 1990) and selection (McDonald and Kreitman, 1991).

Genetic and Statistical Sampling
Unless a population is absolutely uniform for the loci being studied, different samples from the population will show different levels of genetic variation. This is simply a consequence of the "statistical sampling" that results in each sample having a different set of individuals. Statistical sampling variation can be accommodated in analyses set up to allow statements to be made about the population, based on the sample at hand. Consider, for example, estimating the mean, $\mu$, for some variable. If observations of the variable are denoted by $X$, and the sample mean of $n$ values by $\bar{X}$, then $\mu$ is estimated by $\bar{X}$. Variation between samples is anticipated by assigning a variance of $\sigma^2/n$ to this estimate, where $\sigma^2$ is the
the annual plant Gilia achilleifolia, whereas Guries variance of the original variable. As the sample
and Ledig (1982) considered the longer living tree size gets larger, the population is better reprc-
Pinus rigida. Foltz and Hoogland (1983) looked at sented by the sample and different samples be-
populations of the prairie dog Cynomys ludovi- come more similar (i.e., the among-sample vari-
cianus, and Chesser (1983) looked at a nested Sam- ance of the sample mean decreases).
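The shrinking among-sample variance of the sample mean can be checked numerically. The following is a minimal sketch, not part of the original treatment: the 0/1 "population" (allele A at frequency 0.3), the function name, and the replicate count are all illustrative assumptions. It draws repeated samples of size n with replacement and compares the empirical variance of the sample means with $\sigma^2/n$.

```python
import random
import statistics


def sample_mean_variance(population, n, replicates=2000, seed=1):
    """Empirical variance of the sample mean over repeated samples of size n.

    Sampling is with replacement, which mimics drawing from a very large
    population with the same composition (an assumption of this sketch).
    """
    rng = random.Random(seed)
    means = [statistics.fmean(rng.choices(population, k=n))
             for _ in range(replicates)]
    return statistics.pvariance(means)


# Hypothetical indicator variable: 1 marks allele A (frequency 0.3), 0 any other allele.
population = [1] * 300 + [0] * 700
sigma2 = statistics.pvariance(population)  # variance of the original variable

for n in (10, 40, 160):
    # Empirical variance of the mean next to the theoretical sigma^2 / n.
    print(n, sample_mean_variance(population, n), sigma2 / n)
```

The two columns should agree more closely as the number of replicates grows; the agreement reflects statistical sampling only, since a single fixed population is being resampled.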
In population genetics there is another level of sampling to be considered. Each generation of a population is formed by the union of gametes chosen from among those produced by the previous generation. This "genetic sampling" process would cause the population to look different if the formation of a new generation was replicated. Genes can also be altered by mutation, and pairs of genotypes make differential contributions to succeeding generations because of selection. These forces are stochastic because they include a random element: the particular genes affected cannot be specified in advance. Population genetic theory depends on the concept of replicate populations which are maintained under the same conditions, but which will differ because of genetic sampling. It is possible to derive variances for statistics of interest that include both types of variation.

One use for such total variances is in predicting future values. For a particular population, it is possible to predict the expected value of a statistic, such as allelic frequency, heterozygosity, or linkage disequilibrium, in a future generation but not to specify the actual value of the statistic. For a neutral gene (i.e., a gene unaffected by selection) in a finite population, for example, it is known that the allelic frequencies have constant expected values over time, although in any particular population the frequency may have drifted to any value between 0 and 1. Statements about the statistic in some future sample therefore must take into account the variation between replicate populations, as well as that between replicate samples from any one population.

A difficulty arises in that the magnitude of between-population variation cannot be estimated with a sample from a single population. One way around this problem is sometimes afforded by the availability of several unlinked loci in the data set. Although genes at different loci are never completely independent (since they are carried on the same gametes between generations), they may have frequencies that are nearly independent and therefore are functionally equivalent to separate populations. Genes at these loci have each been exposed to the same genetic forces between generations, but can have different pedigrees in the same way as do genes in replicate populations.

Fixed and Random Models

The previous discussion also can be phrased in terms of fixed and random effects, to show how the intended scope of inference affects the sampling properties of genetic statistics. If there is interest only in the particular population sampled, then prior genetic sampling is not of consequence. It is necessary only to take account of the statistical sampling for repeated samples from this one "fixed" population. Future samples would be taken from the same population. Comparisons between different fixed populations can be phrased in terms of means, and it will be shown that numerical procedures of permutation and resampling are of use in comparing populations.

A different situation arises when the sample is to be used to make inferences about the species as a whole. In this case there is less interest in the particular population sampled, which can now be regarded as being "random." Future samples may very well be drawn from a different population, so both statistical and genetic sampling variation need to be considered. The distinction between fixed and random effects arises in statistics. In the analysis of variance context, it is easier to detect differences between means in a fixed-effects situation because a smaller variance is used in the denominator for the test statistic. It is only one specific set of means (fixed effects) that are being compared, and not some population of means (random effects) for which the means at hand are just a sample.

The distinction between fixed and random models in the genetic context has been made previously by Cockerham and Weir (1986). They stress that the random model considers each population to be a replicate sample of the evolutionary process. Chakraborty and Danker-Hopfe (1991) restrict attention to a fixed model, and point out that "in such formulations no assumption is needed regarding the evolutionary mechanism that determines the process of genetic differentiation within and between subpopulations." This seems unlikely to be useful for evolutionary studies.

As an example of the differences in ranges of inference that follow from the choice of fixed or
The sum over heterozygous classes, indicated by $\sum_{j \neq i}$, involves every heterozygote $A_iA_j$ with allele $A_i$, and by convention the sum includes each $A_iA_j$ just once.

Sample genotypic values will be distinguished from the population values they estimate by tildes, $\tilde{P}$. The population value refers to the particular population in the fixed model, or to all replicate populations in the random case. Expected values will be indicated by the symbol $\mathcal{E}$.

The sample variances (obtained by using observed genotypic frequencies in this last equation) or their square roots (i.e., the standard deviations) can be presented along with the sample frequencies.

If the locus has several alleles, the number of genotypic classes is large and the data may be summarized better with allelic frequencies. For codominant alleles such as those found for allozyme markers or VNTR loci, allelic numbers can be found directly from the genotypic numbers:
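As a minimal sketch of this counting step, assuming diploid genotypes recorded as unordered pairs of allele labels: each homozygote contributes two copies of its allele and each heterozygote one copy of each of its alleles. The function names and the genotype counts below are illustrative assumptions, not data from the chapter.

```python
from collections import Counter


def allele_counts(genotype_counts):
    """Allele numbers from genotype numbers.

    genotype_counts maps (allele_i, allele_j) pairs to numbers of individuals.
    A homozygote (i == j) adds 2 to its allele; a heterozygote adds 1 to each.
    """
    counts = Counter()
    for (a, b), n in genotype_counts.items():
        counts[a] += n
        counts[b] += n
    return counts


def allele_frequencies(genotype_counts):
    """Sample allelic frequencies: allele numbers divided by 2n genes."""
    counts = allele_counts(genotype_counts)
    total = sum(counts.values())  # 2n for n diploid individuals
    return {allele: c / total for allele, c in counts.items()}


# Hypothetical sample of 71 individuals at a two-allele codominant locus.
genotypes = {("A", "A"): 41, ("A", "a"): 27, ("a", "a"): 3}
print(allele_counts(genotypes))       # A: 2*41 + 27 = 109, a: 27 + 2*3 = 33
print(allele_frequencies(genotypes))
```

Nothing in this count assumes Hardy-Weinberg proportions; it is simply a tally of the genes carried by the sampled individuals.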
390 Chapter 10 / Weir
the least frequent classes. This problem increases with the number of alleles, but even for two alleles a sample of size 100 is expected to have only four individuals in the $aa$ class when allelic frequencies are $p_A = 0.8$, $p_a = 0.2$. Conventional wisdom says that goodness-of-fit $\chi^2$ tests should not be performed on classes with expected counts less than five, although this is probably too conservative. Koehler and Larntz (1980) gave rules relating sample size to numbers of cells. As mentioned in the HWE testing section, doubts about the effect of small expected counts are best resolved by looking at the contribution of each class to the test statistic. When HWE is assumed, it is sufficient to compare the allele arrays in each of the populations. The contingency table is then only $v \times r$, and problems of small expected counts are less likely.

NUMERICAL RESAMPLING An alternative to the contingency table approach is provided by numerical resampling (Efron, 1982). This is a means of making inferences about population allele frequencies from the sample frequencies. Variances can be estimated and confidence intervals can be constructed, both referring to repeated sampling from the same populations. Briefly, numerical resampling mimics the drawing of new samples from the original sample for each population. Two methods are commonly used: jackknifing and bootstrapping. They were described in a genetic context by Dodds (1986) but only bootstrapping will be considered here since it provides much more information about the parameter being estimated than does jackknifing. It leads to an estimate of the sampling distribution of the estimate.

For bootstrapping, from the original set of $n$ observations a new sample of the same size is constructed by random sampling with replacement. In other words, each of the original observations is equally likely to be selected to constitute any one of the members of this new sample. The bootstrap sample therefore is likely to have some of the original observations represented many times, and some of them not represented at all. Drawing the sample requires the use of (pseudo)random numbers (see Weir, 1990 for an example of a random number generator program). The parameter $\phi$ is estimated from the new sample, and the process is repeated many times, perhaps 1000 or more. In place of the single estimate from the original sample, bootstrapping provides as many new estimates as desired. While this collection of new estimates could be used to provide an estimated variance for $\hat{\phi}$, it is more informative to work with the whole distribution of estimates. For example, a 95% confidence interval for $\phi$ can be constructed as the limits between which the middle 95% of the bootstrap estimates lie.

Bootstrapping within each of the $r$ samples, therefore, will provide confidence intervals for the population allele frequencies, without having to invoke the binomial theorem and therefore without having to assume Hardy-Weinberg equilibrium. Two populations can be judged to have different allele frequencies if the estimated frequencies have non-overlapping confidence intervals. How wide should these confidence intervals be in order to have a specified confidence in the statement that the populations have different frequencies? An approximate answer can be based on normal distribution theory (Weir, 1992b).

An answer that does not invoke normality can be based on Chebyshev's inequality, which states that the probability of a random variable being more than $k$ standard deviations from its mean is less than $1/k^2$. For population I, if the sample frequency for allele $A_i$ has variance $\sigma_i^2$, this means that

$$\Pr\left(|\tilde{p}_{i\mathrm{I}} - p_{i\mathrm{I}}| \ge k\sigma_i\right) \le \frac{1}{k^2}$$

with a similar equation for the sample frequency $\tilde{p}_{i\mathrm{II}}$ in population II. Under the hypothesis that the two populations have the same allele frequencies, $p_{i\mathrm{I}} = p_{i\mathrm{II}}$, and the same variances of sample frequencies, Chebyshev's inequality is

$$\Pr\left(|\tilde{p}_{i\mathrm{I}} - \tilde{p}_{i\mathrm{II}}| \ge k\sqrt{2}\,\sigma_i\right) \le \frac{1}{k^2}$$

For this last probability to be 0.05, corresponding to a 95% confidence interval on the difference of frequencies, it is necessary that $k^2 = 20$. However, if this corresponds to non-overlapping con-
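The bootstrap procedure described in this section can be sketched in a few lines. This is an illustrative sketch rather than the program referred to in Weir (1990): the function names, the genotype representation, and the sample itself are assumptions. Individuals (genotypes) rather than single genes are resampled, so no Hardy-Weinberg assumption enters, and the interval is taken as the middle 95% of the sorted bootstrap estimates.

```python
import random


def allele_freq(genotypes, allele):
    """Sample frequency of `allele` among the 2n genes of n diploid genotypes."""
    return sum(g.count(allele) for g in genotypes) / (2 * len(genotypes))


def bootstrap_ci(genotypes, allele, replicates=1000, level=0.95, seed=1):
    """Percentile bootstrap interval for an allele frequency.

    Each bootstrap sample draws n individuals with replacement from the
    original n, so some individuals appear several times and others not at all.
    """
    rng = random.Random(seed)
    n = len(genotypes)
    estimates = sorted(
        allele_freq(rng.choices(genotypes, k=n), allele)
        for _ in range(replicates)
    )
    lower = estimates[int((1 - level) / 2 * replicates)]
    upper = estimates[int((1 + level) / 2 * replicates) - 1]
    return lower, upper


# Hypothetical sample: 41 AA, 27 Aa and 3 aa individuals.
sample = [("A", "A")] * 41 + [("A", "a")] * 27 + [("a", "a")] * 3
print(allele_freq(sample, "A"), bootstrap_ci(sample, "A"))
```

Running the same function on a second sample gives a second interval, and the overlap of the two intervals can then be inspected as discussed above.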
PERMUTATION TESTS A distribution-free approach can be based on permutation. Extending the previous analogy of a deck of cards to two populations, the deck now has a card for each allele in the two populations. The hypothesis of equality of frequencies in the two populations is equivalent to alleles being distributed independently into two samples. After the permutation is completed, the first $2n_{\mathrm{I}}$ cards are taken to represent sample 1, and the remaining $2n_{\mathrm{II}}$ cards to represent sample 2, where $n_{\mathrm{I}}$ and $n_{\mathrm{II}}$ were the original sample sizes. The proportion of permutations in which the difference in allele $A_i$ counts between the two samples is as great or greater than the original observed value provides the significance level for rejecting the hypothesis of equality. This test is local to allele $A_i$. If HWE is assumed, a global test over all alleles would be based on the conditional probability of the two arrays:

where $\bar{p} = \sum_i \tilde{p}_i / r$ is the average sample frequency of the allele over the samples. If the samples were of unequal sizes, $n_i$, weighted means and variances should be used and then

with

$$\bar{p} = \frac{\sum_i n_i \tilde{p}_i}{\sum_i n_i}, \qquad \bar{n} = \frac{\sum_i n_i}{r}$$

It is true that this quantity increases as the sample allele frequencies diverge, but it is difficult to assess the significance of divergence. Using the $F_{ST}$ statistic to test for gene frequency differences would require knowledge of its sampling properties. In the fixed-population framework, it is possible to relate $F_{ST}$ to the contingency-table $\chi^2$ test statistic. Suppose attention is focused on only one allele, $A$, and its alternative allele(s), $\bar{A}$. If the sample frequency of allele $A$ in the $i$th sample is $\tilde{p}_i$, and the sample size is $n_i$, the $\chi^2$ statistic for comparing allele counts over populations is
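A hedged sketch of such a comparison follows. It assumes the usual $2 \times r$ contingency-table form, in which sample $i$ contributes a total of $N_i$ genes ($N_i = 2n_i$ if $n_i$ counts diploid individuals) and the statistic is $\sum_i N_i(\tilde{p}_i - \bar{p})^2 / [\bar{p}(1 - \bar{p})]$ with $\bar{p}$ the weighted mean frequency. Significance is then assessed by the card-shuffling permutation procedure described earlier in this section rather than by $\chi^2$ tables. The function names and example counts are illustrative assumptions.

```python
import random


def allele_chi_square(counts):
    """2 x r contingency chi-square for allele A versus all other alleles.

    counts is a list of (number of A genes, total genes) pairs, one per sample.
    Undefined (division by zero) if A is absent or fixed in the pooled data.
    """
    total_a = sum(a for a, _ in counts)
    total = sum(t for _, t in counts)
    p_bar = total_a / total  # weighted mean frequency over samples
    return sum(t * (a / t - p_bar) ** 2 for a, t in counts) / (p_bar * (1 - p_bar))


def permutation_p_value(counts, permutations=2000, seed=1):
    """Proportion of shuffles whose statistic is at least the observed value."""
    rng = random.Random(seed)
    observed = allele_chi_square(counts)
    deck = []  # one "card" per gene: 1 for allele A, 0 otherwise
    for a, t in counts:
        deck += [1] * a + [0] * (t - a)
    sizes = [t for _, t in counts]
    hits = 0
    for _ in range(permutations):
        rng.shuffle(deck)
        start, permuted = 0, []
        for size in sizes:  # deal the shuffled deck back into samples
            permuted.append((sum(deck[start:start + size]), size))
            start += size
        if allele_chi_square(permuted) >= observed - 1e-12:
            hits += 1
    return hits / permutations


# Hypothetical allele counts in two samples of 100 genes each.
example = [(90, 100), (50, 100)]
print(allele_chi_square(example), permutation_p_value(example))
```

Shuffling genes rather than individuals corresponds to the case in which HWE is assumed; shuffling whole genotypes would be the analogous procedure without that assumption.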
pected value is not desirable, although for very large sample sizes $F_{ST}$ is expected to be close to zero when populations have the same frequencies.

In the fixed model, the only variation being considered is that between samples from the same population. This is the variation giving rise to the distribution of the test statistic, and is to be contrasted to the random model case considered below and by Bowcock et al. (1991). A quite different approach was adopted by Chakraborty and Danker-Hopfe (1991). They defined $F_{ST}$ in terms of the population frequencies, $p_i$. No provision is made for statistical or genetic sampling, and so there is no basis for making statements about the size of the parameter.

Random Populations

Under the random model, the populations sampled may be considered to represent the species and therefore to have a common evolutionary ancestry. Even though the populations may have been distinct for some time, the analysis is built on the assumption that there is a single ancestral population. The expectations referred to by means and variances now refer to repeated samples from the populations and to replicate populations. In the absence of disturbing forces, such as selection, all populations are expected to have the same allelic frequencies.

Underlying the analysis of differentiation in the random model is the notion that genetic sampling causes different genes in a population to be dependent, or related. Even though individuals or genes may be sampled randomly, the process of taking expectations must recognize that they are dependent through their shared ancestry. Another essential concept for the analysis is that the relationships between various genes are relative to the least-related genes in the data. It is generally assumed that these least-related genes are independent; the data do not allow measures of relationship to be estimated otherwise.

Interest can be centered on the extent to which different populations within the species have differentiated over the time since the ancestral population. The action of evolutionary forces, or genetic sampling, will result in intraspecific differentiation, and this differentiation is conveniently quantified with the F statistics of Wright (1951), or the analogous measures of Cockerham (1969, 1973). These quantities measure the degree of relatedness of various pairs of genes. Cockerham (1969) described the three basic quantities in the situation when diploid individuals are sampled from a series of populations as follows: (1) the overall inbreeding coefficient $F$ (Wright's $F_{IT}$) is the correlation of genes within individuals over all populations; (2) the coancestry $\theta$ (Wright's $F_{ST}$) is the correlation of genes of different individuals in the same population; and (3) $f$ (Wright's $F_{IS}$) is the correlation of genes within individuals within populations. Because the populations are assumed to have been isolated since the ancestral population, genes in one population are independent of those in another.

Haploid Data

If data are available on genes directly, then the analysis is in terms of allelic frequencies and is phrased conveniently in terms of a set of indicator variables. For gene $j$ in a sample from population $i$, a variable $x_{ij}$ can be defined by $x_{ij} = 1$ if the gene is allele $A$, or $x_{ij} = 0$ if the gene is not allele $A$.

The expected value of $x_{ij}$ over samples and replicate populations is therefore the allelic frequency $p_A$ common to these populations. Under a model assuming no forces such as mutation or selection, this frequency is also that in the ancestral population. The sample allelic frequency $\tilde{p}_{A_i}$ in a sample of $n_i$ genes from the $i$th population can be written as

$$\tilde{p}_{A_i} = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$$

For haploid data, there is only one F statistic. It measures the relationship between different genes in the same population, relative to the zero relationship between genes of different populations. The quantity is written here as $\theta$ and was termed $F_{ST}$ by Wright (1951). It allows the expected value of a sample allelic frequency to be written as
Table 1
Analysis of variance layout for haploid data
df    Sum of squares*    Expected mean square
Although the allelic frequency $p_A$ is assumed to remain constant, for finite populations the coancestry will increase over time as inbreeding accrues in each population. In other words, $\theta$ measures the extent of differentiation between populations. It is worth stressing that this measure of between-population differentiation is a consequence of the relatedness of genes within populations. As individuals become more related within populations, the independent populations become more differentiated.

Estimation of $\theta$ can proceed by the method of moments, with the various statistics being conveniently organized in an analysis of variance format (Table 1). [Other methods of estimation were reviewed by Spielman et al. (1977).] Data are considered to be available from $r$ populations, with different numbers of genes, $n_i$, from each being allowed. The weighted average frequency of allele $A$ over all the samples is written as

$$\tilde{p}_{A\cdot} = \frac{\sum_i n_i \tilde{p}_{A_i}}{\sum_i n_i}$$

The mean square for between populations is written as MSP and for genes within populations as MSG:

The value of $n_c$ is given in Tables 1 and 2.

It may be convenient to identify two terms as the components of variance for genes within populations,

and between populations,
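The moment estimation just outlined can be sketched numerically. The displays for MSP, MSG and the variance components are not reproduced above, so the sketch assumes the standard one-way analysis of variance quantities for the 0/1 indicator variable: MSP $= \sum_i n_i(\tilde{p}_{A_i} - \tilde{p}_{A\cdot})^2/(r-1)$, MSG $= \sum_i n_i\tilde{p}_{A_i}(1 - \tilde{p}_{A_i})/\sum_i(n_i - 1)$, and $n_c = (\sum_i n_i - \sum_i n_i^2/\sum_i n_i)/(r-1)$. These forms, and the function name, are assumptions to be checked against Table 1 of the original.

```python
def theta_haploid(samples):
    """Moment estimate of theta for one allele from haploid (gene) data.

    samples is a list of (count of allele A, number of genes n_i) pairs.
    Assumes the standard one-way ANOVA mean squares for the indicator x_ij.
    Undefined (division by zero) when every population is fixed for the same allele.
    """
    r = len(samples)
    n_total = sum(n for _, n in samples)
    p_bar = sum(a for a, _ in samples) / n_total  # weighted mean frequency
    n_c = (n_total - sum(n * n for _, n in samples) / n_total) / (r - 1)
    msp = sum(n * (a / n - p_bar) ** 2 for a, n in samples) / (r - 1)
    msg = sum(n * (a / n) * (1 - a / n) for a, n in samples) / (n_total - r)
    sigma2_g = msg                # component for genes within populations
    sigma2_p = (msp - msg) / n_c  # component for between populations
    return sigma2_p / (sigma2_p + sigma2_g)


# Hypothetical counts of allele A in three samples of unequal size.
print(theta_haploid([(30, 100), (45, 80), (70, 120)]))
# Identical sample frequencies give a slightly negative estimate.
print(theta_haploid([(50, 100), (50, 100)]))
```

Writing the estimate as a ratio of variance components makes it plain why it can be negative: the between-population component is a difference of two mean squares.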
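Table 2
Analysis of variance layout for diploid data
Source: Between populations
df: $r - 1$
Sum of squares: $2\sum_{i=1}^{r} n_i(\tilde{p}_{A_i} - \tilde{p}_{A\cdot})^2 = 2(r - 1)\bar{n}s_A^2$
Expected mean square: $p_A(1 - p_A)[(1 - F) + 2(F - \theta) + 2n_c\theta]$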
fixed model. There is a difference between arriving at this equation as an approximate expression for the estimator of a parameter, and taking the equation as the definition of a quantity of interest.

Estimation of $\theta$ has been presented in terms of one of the alleles, $A$, at a locus. If there are only two alleles at the locus, then the same estimate would be obtained if the other allele is used. For more than two alleles, however, a different estimate will result with every allele. Since the parameter $\theta$ is the same for every allele under a model of no disturbing forces, all of these estimates refer to the same quantity and an appropriate average is needed to give the best single estimate. For the $u$th allele, the estimate could be written as

and then an overall estimate with the desirable properties of low bias and small variance is given by combining the information from all $v$ alleles

There is an additional extension to cover the case where several loci are scored. Once again, under a model with no disturbing forces, every allele at every locus provides an estimate of the same quantity. Indexing the loci by $l$ and alleles within loci by $u$, then the individual estimates are

and the overall estimate from all $m$ loci is

With $\theta$ estimated, there is a quantification of the degree of divergence among a set of $r$ populations. Equal allele frequencies in all populations will cause $\theta$ to be zero and, for a pair of populations, $\theta$ may serve as a measure of genetic distance. Under the drift model, $-\ln(1 - \theta)$ will increase linearly with the time since divergence of the two populations from the ancestral population. When all populations are fixed at all loci scored, the estimator is undefined, since the equation becomes the ratio of zeros. Indeed, unless there is additional information on the molecular structure of the fixed alleles, there is no information in this case on how long the populations have been fixed or on the time of divergence of the populations.

INFERENCES ABOUT $\theta$ Making inferences about $\theta$ beyond simply estimating it may be accomplished by numerical resampling. Confidence intervals follow from bootstrapping. Because $\theta$ is a parameter appropriate for the random model, numerical resampling cannot be performed by resampling genes at a locus within populations, as was done in the fixed model. The resampling must mimic both the genetic sampling that causes replicate populations to differ, and the sampling of genes for observation from each population. Two possibilities are suggested.

In the first place, resampling may be done over loci. For a study in which $m$ loci are scored, bootstrapping over loci provides confidence intervals for $\theta$. Each bootstrap sample consists of a set of $m$ pairs of values,

drawn with replacement from the $m$ calculated values, and the combined estimate formed from this new collection of $T$'s. As before, the middle 95% of these new estimates will provide a 95% confidence interval. The hypothesis that $\theta$ has some specified value can be rejected, at the 5% significance level, if the confidence interval does not include that value. Data sets with overlapping (90%) confidence intervals provide no evidence that the corresponding $\theta$ values differ (at the 5% level).

It is also possible to bootstrap over populations; this may be done for each locus separately. In this way the estimates of $\theta$ for each locus can be compared. Loci that do not give overlapping confidence intervals for $\theta$ may be affected by differential disturbing forces such as selection. Bootstrapping over populations requires that there are several populations, just as resampling over loci
supposes that several loci have been scored. In practice, at least five populations or loci appear to be necessary.

A promising method for making inferences about $\theta$ may be suggested by recent work of Brownie and Boos (1994). These authors showed that analysis of variance methodology may be quite robust to non-normality. If this suggestion, also hinted at by D'Agostino et al. (1988), holds for zero-one variables like the indicator variables $x_{ij}$ used here, then the ratio of mean squares in Table 1 could be used to test the hypothesis that $\theta$ is zero. The mean squares themselves may be taken to be $\chi^2$ variables, and this used to construct confidence intervals without the need for numerical resampling. The current availability of computing power, however, means that numerical resampling does not impose an undue burden on the data analyst.

It is necessary to comment on the possibility that the estimate of $\theta$ may be negative. Two situations are likely to give this outcome. It may be that the true value of $\theta$ is positive but small. Since the estimate has low bias, the estimate is about as likely to be below as above the true value. In this case, estimates less than the true value will often be negative. The second situation is that the parameter may be negative. In statistical language, this corresponds to a negative intraclass correlation, and indicates the advantage of adopting the perspective of correlation coefficients. In genetic language, it means that genes are more related between than within populations. This could result from some forms of migration that violate the assumption of the populations having remained distinct since an ancestral population. A more complete discussion of biological causes for a negative component of variance between populations, $\theta p_A(1 - p_A) < 0$, was given by Cockerham (1973) for the analysis of diploid data. Any mating system, such as the avoidance of self-mating, that causes genes to be more alike between individuals than within individuals, can cause this phenomenon.

Diploid Data

When observations are made on diploid individuals, the analysis should be performed at the diploid level. The same general approach is followed, but it is now possible to estimate the degree of relationship between genes within individuals, $F$, as well as that between genes of different individuals, $\theta$. The preceding haploid analysis essentially dropped the distinction between $F$ and $\theta$. Differentiation between independent populations is still measured in terms of $\theta$, reflecting the relatedness of individuals within populations, but the analysis also provides an estimate of the overall inbreeding coefficient, $F$. Remember that, in Wright's notation, $F = F_{IT}$ and $\theta = F_{ST}$. The degree of inbreeding within populations, $f$, or $F_{IS}$, can be expressed as

$$f = \frac{F - \theta}{1 - \theta}$$

Only $f$ can be estimated from data from a single population, by means of a moment-estimator such as

(Weir, 1990). The estimate of $f$ provided by the program given in Weir (1990) is an average value applying to all populations sampled.

The expected value of the squared sample allelic frequencies now must reflect the two levels of relatedness of different genes within populations; both are a consequence of prior genetic sampling:

Once again, the estimation procedure follows naturally from an analysis of variance layout. There are now three sources of variation: populations, individuals within populations, and genes within populations. The sums of squares are constructed with gene and genotypic frequencies, as shown in Table 2, and follow from the usual nested analysis of variance sums of squares for the indicator variable $x_{ijk}$ for the $k$th gene ($k = 1, 2$) in the $j$th in-
$$S_2 = \tilde{p}_{A\cdot}(1 - \tilde{p}_{A\cdot}) - \frac{\bar{n}}{r(\bar{n} - 1)}\left\{\frac{r(\bar{n} - n_c)}{\bar{n}}\,\tilde{p}_{A\cdot}(1 - \tilde{p}_{A\cdot}) - \frac{1}{\bar{n}}\left[(\bar{n} - 1) + (r - 1)(\bar{n} - n_c)\right]s_A^2 - \frac{\bar{n} - n_c}{4n_c^2}\,\tilde{H}_{A\cdot}\right\}$$

$$S_3 = \frac{n_c}{2\bar{n}}\,\tilde{H}_{A\cdot}$$

but the use of computers for data analysis makes these common levels of approximation unnecessary.

For (monoecious) populations mating at random, genes are equally related whether they are within or between individuals. In this case $F = \theta$ or $f = 0$. Therefore, estimates of $F$ and $\theta$ that differ significantly may indicate departures from random mating. Any avoidance of mating of relatives will cause $\theta$ to exceed $F$ and $f$ to be negative. In the language of variance components, the component for individuals within populations will then be negative. Recall that this component is actually
the difference of two positive quantities. It is not being claimed that there is a variance that is negative. Different patterns of differences for the two estimates (i.e., $F$ and $\theta$) at different loci indicate that there are forces other than non-random mating affecting these loci.

The effects of selection on the F statistics were detailed by Cockerham (1973). Different selective forces in different populations, tending to increase their differences, will increase the value of $\theta$. Within a population, Lewontin and Cockerham (1959) showed that selection at a locus gives negative $f$ values unless the viability of a heterozygote is less than or equal to the geometric mean of the viabilities of the two homozygotes.

quencies no longer remain constant over time, and it is not appropriate to separate $p_A$ and $\theta$ in the expectation of $\tilde{p}_A^2$. In this case, it is sufficient to work with measures of allelic similarity within and between populations. Migration is allowed between the populations. Using notation from Cockerham and Weir (1987), $Q_2$ is the probability that two genes within a population have the same allelic state, whereas $Q_3$ is the chance that two genes, one in each of two separate populations, are in the same state. The expected mean squares in the analysis of variance layout, after summing over alleles, become
(Kimura and Crow, 1964). Note that different populations are considered here to be independent.

Often it has been suggested that migration rates could be estimated from $\theta$ values (e.g., Slatkin, 1985). There is an infinite-island model, corresponding to the infinite-alleles mutation model. Each generation, any gene sampled from a population has probability $m$ of having migrated from any one of an infinite number of other populations. When these various "islands" are of finite size $N$, an equilibrium is again established between loss of variation due to drift within islands, and gain of variation by migration from other islands. The equilibrium value of $\theta$ is simply

Although this suggests a means for estimating $Nm$, there are complications in practice because the infinite-island model is unrealistic. As soon as a finite number $n$ of islands are postulated, there is a non-zero probability that two islands receive migrant genes from the same island. The island populations are not independent and there is a need to distinguish between $\theta$ and $\beta$. The quantity $\beta$ can be estimated (although it should not be called $\theta$ or $F_{ST}$), and was shown by Cockerham and Weir (1987) to have expectation in an equilibrium population of

where, for mutation rate $\mu$, $c = (1 - \mu)^2$, $d = (1 - ma)^2$, and $a = n/(n - 1)$. For a large number of islands

The drawback of using $\theta$ or related quantities to estimate migration rates is that it is assumed that population differentiation is due to gene flow. In the absence of direct observations on gene flow, this may be the only alternative but it does mean that alternative evolutionary scenarios leading to the same pattern of gene frequencies cannot be eliminated.

Population Subdivision

It is often the case that individuals are sampled in a nested sampling scheme. J.S.F. Barker et al. (1986) sampled Drosophila buzzatii from transects and sites within transects; Ferrari and Taylor (1981) sampled Drosophila subobscura from subdivisions, regions within subdivisions, and demes within regions. Each recognizable level in such hierarchies adds another level to the degrees of relatedness of genes. The analysis of intraspecific differentiation can be carried out by looking at the hierarchy of relationships between pairs of genes, from the most closely related pairs (within individuals) to the most distantly related (between largest sampling units). This hierarchy is conveniently recognized and organized in nested analyses of variance layouts. These layouts then provide the means for estimating the various measures of differentiation. Details for both three- and four-level hierarchies were given by Weir (1990).
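Returning to the infinite-island model discussed earlier in this section, the link between $\theta$ and $Nm$ can be illustrated with a short sketch. The equilibrium expression is not reproduced above, so the sketch assumes the widely quoted approximation $\theta \approx 1/(1 + 4Nm)$ for diploid islands of size $N$ with small migration rate $m$ and negligible mutation; the function names are illustrative. The cautions stated above apply in full: inverting the relation gives an estimate of $Nm$ only if the island model, and not some other history, produced the observed differentiation.

```python
def theta_island(N, m):
    """Approximate equilibrium theta under the infinite-island model.

    Assumes diploid islands of size N, small migration rate m, and no
    mutation: theta ~ 1 / (1 + 4Nm). This form is an assumption of the sketch.
    """
    return 1.0 / (1.0 + 4.0 * N * m)


def nm_from_theta(theta):
    """Invert the approximation to get Nm; requires theta strictly between 0 and 1."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return (1.0 / theta - 1.0) / 4.0


# One migrant every fourth generation (Nm = 0.25) gives theta of about one half.
print(theta_island(N=100, m=0.0025))
print(nm_from_theta(0.5))
```

Because $\theta$ depends on $N$ and $m$ only through their product, very different combinations of island size and migration rate lead to the same expected differentiation.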
APPLICATIONS
words, what is the frequency that the perpetrator has genotype $A_iA_j$, given that the suspect in the crime has that type? The best way of presenting the evidence of a match between suspect and evidentiary material is to compare probabilities of the match under two hypotheses, $C$: suspect $S$ and perpetrator $P$ are the same person, and $\bar{C}$: suspect and perpetrator are different people. For heterozygotes, the comparison is given by the likelihood ratio

whereas the single-individual frequencies are

There is almost no empirical information on the higher-order measures $\gamma$, $\delta$, and $\Delta$. However, for low levels of relatedness, numerical work has shown that the following approximations given by Balding and Nichols (1994) for the conditional frequencies are satisfactory (they are somewhat too small for heterozygotes and too large for homozygotes):
Table 3
Multilocus genotype frequencies for the MNS blood group system in samples from
ten political districts of the Papago Indians of Arizona

District  MMSS  MMSs  MMss  MNSS  MNSs  MNss  NNSS  NNSs  NNss  Total
    1        9    14    18     7    12     8     2     1     0     71
    2        3     7     8     1     3     4     1     1     1     29
    3       12    31    12     3    14     7     0     0     1     80
    4        3     9     8     3    12     9     1     2     1     48
    5        4    14    18     0     6    10     1     0     2     55
    6        1     8     7     4     2     2     0     2     2     28
    7        8    31     4     3    14     4     0     3     1     68
    8       21    52    55     7    24    21     0     4     3    187
    9        4    16     7     5    10     6     0     2     1     51
   10       23    14     4     3    10     3     1     1     1     60

From Workman and Niswander, 1970.
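As a rough illustration of how the counts in Table 3 feed single-locus tests, the following Python sketch collapses the nine two-locus counts for a district into three genotype counts at one locus and computes the usual chi-square goodness-of-fit statistic for Hardy-Weinberg proportions (1 degree of freedom for two alleles). The function and variable names are our own, introduced only for illustration, and the sketch does not implement the permutation (exact) test, which is preferable for the smaller samples.

```python
# Two-locus genotype counts from Table 3 (Workman and Niswander, 1970).
# Column order: MMSS MMSs MMss MNSS MNSs MNss NNSS NNSs NNss
TABLE3 = {
    1: [9, 14, 18, 7, 12, 8, 2, 1, 0],
    2: [3, 7, 8, 1, 3, 4, 1, 1, 1],
    3: [12, 31, 12, 3, 14, 7, 0, 0, 1],
    4: [3, 9, 8, 3, 12, 9, 1, 2, 1],
    5: [4, 14, 18, 0, 6, 10, 1, 0, 2],
    6: [1, 8, 7, 4, 2, 2, 0, 2, 2],
    7: [8, 31, 4, 3, 14, 4, 0, 3, 1],
    8: [21, 52, 55, 7, 24, 21, 0, 4, 3],
    9: [4, 16, 7, 5, 10, 6, 0, 2, 1],
    10: [23, 14, 4, 3, 10, 3, 1, 1, 1],
}


def single_locus_counts(row, locus):
    """Collapse nine two-locus counts to (hom1, het, hom2) at locus 'M' or 'S'."""
    if locus == "M":  # MM, MN, NN: sum over the three S-locus genotypes
        return tuple(sum(row[3 * g:3 * g + 3]) for g in range(3))
    if locus == "S":  # SS, Ss, ss: sum over the three M-locus genotypes
        return tuple(sum(row[g::3]) for g in range(3))
    raise ValueError("locus must be 'M' or 'S'")


def hardy_weinberg_chi2(n11, n12, n22):
    """Chi-square goodness-of-fit statistic for Hardy-Weinberg proportions."""
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)  # sample frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    for district, row in TABLE3.items():
        for locus in ("M", "S"):
            counts = single_locus_counts(row, locus)
            print(district, locus, counts, round(hardy_weinberg_chi2(*counts), 2))
```

With two alleles the statistic is referred to a chi-square distribution with 1 degree of freedom (5% critical value 3.84). For samples as small as those from districts 2 and 6, the resulting significance levels should be treated with caution.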
which detailed tables of genotypic frequencies are presented. Table 3, which is
part of Table 2 in Workman and Niswander's paper, shows the two-locus counts for
the MNS blood group system in samples from the ten districts. The data will be
treated here as though there are two codominant loci M and S, each with two
alleles: M, N and S, s.

Hardy-Weinberg testing may be conducted for each locus in each district. The
usual χ² goodness-of-fit tests are probably adequate and the test statistics are
shown in Table 4, but with samples as small as 29, it is preferable to conduct
exact tests. The permutation method described above was used to find the
significance levels (also shown in Table 4). Workman and Niswander had found only
the S locus in district 7 to be out of Hardy-Weinberg equilibrium. They commented
that since there was a single significant value in 20 tests, "it may be concluded
that the observed proportions do not differ from those expected." Certainly one
significant result in 20 replicate tests is to be expected when there is true
equilibrium and a 5% significance level is used. However, in this instance it is
not clear that there are 20 replicate tests. If there is equal interest in each
locus and each district, then no test is replicated. It would be of interest to
explain why the S locus has such an excess of heterozygotes in District 7. (This
excess results in two-locus genotypic frequencies not being consistent with the
appropriate products of allele frequencies in District 7, but there is no other
evidence of two-locus associations.)

Table 4
Significance levels of tests for Hardy-Weinberg equilibrium
M S
District χ² Permutation χ² Permutation

To judge the interdistrict heterogeneity in genotype frequencies, a 10 x 3
contingency table (10 populations by 3 genotypes) test can be conducted for each
locus. The test statistics are 25.80 for M and 73.35 for S. The second value is
highly significant, with the largest differences being in districts 5 (ss), 7 (Ss
and ss), and 10 (SS and ss). Both loci show significant heterogeneity in allelic
frequencies, with the 10 x 2 contingency tables (10 populations by 2 alleles)
having χ² values of 18.49 for M and 50.82 for S. These analyses are from a fixed
population viewpoint, and make no appeal to any genetic mechanism.

With a genetic random model supposing a common origin for the ten districts, it
is appropriate to estimate F statistics. Using the DIPLOID.FOR program (Weir,
1990), the bootstrap confidence intervals for θ for two loci (bootstrapping over
districts) are:

The low value for M is consistent with the homogeneity of genotypic frequencies,
but not with the heterogeneity of allele frequencies. There is a higher θ value
for S, and it is on the borderline of being significantly different from zero.
Mention was made earlier of it being easier to detect differences in a fixed
population situation than in a random situation where inferences are being made
in a wider context and therefore there is a need to take into account larger
variances.

CONCLUSION

As with all population genetic analyses, quantifying the effects of variation
among populations within species requires methods tailored to the intended scope
of inference. For detecting differences between specific populations, standard
statistical techniques involving contingency table χ² tests or permutation tests
are appropriate. It is possible to measure the extent of differentiation with the
F statistic, FST, but difficult to ascribe a genetic meaning to this quantity in
the fixed-model analysis. If the sampled populations are regarded as having been
sampled from a set of populations subjected to evolutionary events, then the full
set of F statistics provides an appropriate parameterization. Moment estimators
are convenient for supplying numerical values, and resampling over loci may allow
inferences to be made about differences among populations.
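The fixed-population heterogeneity tests applied to the Papago data can be sketched in a few lines of Python. The sketch below recomputes the 10 x 3 (genotype) and 10 x 2 (allele) contingency-table χ² statistics from the two-locus counts of Table 3. The helper names are hypothetical, and only the test statistic and degrees of freedom are computed (no significance levels). If the table has been transcribed correctly, the genotype statistics should land close to the values quoted in the text.

```python
# Two-locus genotype counts from Table 3, districts 1-10.
# Column order: MMSS MMSs MMss MNSS MNSs MNss NNSS NNSs NNss
COUNTS = [
    [9, 14, 18, 7, 12, 8, 2, 1, 0],
    [3, 7, 8, 1, 3, 4, 1, 1, 1],
    [12, 31, 12, 3, 14, 7, 0, 0, 1],
    [3, 9, 8, 3, 12, 9, 1, 2, 1],
    [4, 14, 18, 0, 6, 10, 1, 0, 2],
    [1, 8, 7, 4, 2, 2, 0, 2, 2],
    [8, 31, 4, 3, 14, 4, 0, 3, 1],
    [21, 52, 55, 7, 24, 21, 0, 4, 3],
    [4, 16, 7, 5, 10, 6, 0, 2, 1],
    [23, 14, 4, 3, 10, 3, 1, 1, 1],
]


def genotype_table(counts, locus):
    """Rows = districts, columns = (hom1, het, hom2) at locus 'M' or 'S'."""
    if locus == "M":
        return [[sum(r[3 * g:3 * g + 3]) for g in range(3)] for r in counts]
    return [[sum(r[g::3]) for g in range(3)] for r in counts]


def allele_table(genotypes):
    """Convert genotype counts to counts of the two alleles per district."""
    return [[2 * n11 + n12, n12 + 2 * n22] for n11, n12, n22 in genotypes]


def contingency_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


if __name__ == "__main__":
    for locus in ("M", "S"):
        genotypes = genotype_table(COUNTS, locus)
        print(locus, "genotypes:", contingency_chi2(genotypes))
        print(locus, "alleles:  ", contingency_chi2(allele_table(genotypes)))
```

These are fixed-model tests: they ask only whether the ten sampled districts differ, and say nothing about the evolutionary process that generated the differences.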
Phylogenetic Inference
David L. Swofford, Gary J. Olsen,
Peter J. Waddell, and David M. Hillis
INTRODUCTION
Inferring phylogenetic relationships from molecular data requires the selection of
an appropriate method from the many techniques that have been described.
Unfortunately, phylogenetic analysis is frequently treated as a black box into which
data are fed and out of which "The Tree" springs. Our goal in this chapter is to
provide more than a cursory description of the available analytical methods;
rather, we hope to develop a conceptual framework for understanding the
theoretical and practical distinctions among alternative methodologies. Phylogenetic
analysis of molecular data is in the midst of a remarkable transformation. The
most striking theme in this shift is an increased emphasis on the use of methods
that are based on models of evolutionary change. Moreover, users of methods that
do not require explicit models are now much more likely to incorporate
modifications based on reasonable assumptions about the evolutionary process than
when the first edition of Molecular Systematics appeared only six years ago. We
view this trend as a positive one and have reorganized our chapter accordingly.
Regrettably, we cannot accomplish all of the above objectives and at the same
time provide an exhaustive review of the voluminous literature on phylogenetic
reconstruction; however, Felsenstein (1982, 1988a, 1993) and Hillis et al. (1993a)
have presented general reviews of methods for inferring phylogenies. Instead,
we will focus on methods that are currently in widespread use or that are likely
to be used in the foreseeable future. We will also avoid the temptation to cite
every relevant paper, limiting our citations to papers that are either of
fundamental importance to the development of a method or that provide the clearest
explanations of that method.
As any reader even moderately familiar with the current state of affairs in
phylogenetics already knows, debates among proponents of rival methodologies are
often intense and sometimes unnecessarily acrimonious. Consequently, we will offer
recommendations where we deem them appropriate, but will deliberately avoid
taking strong positions on or making controversial assertions about issues where
there is room for legitimate disagreement. Instead, we hope to provide sufficient
background so that readers will be able to make informed decisions regarding the
techniques most appropriate for their own data. Our treatment in this chapter will
be limited to the inference of the phylogenetic history of the genes under study.
For a variety of reasons, these "gene trees" may fail to reflect the relationships
of the organisms from which the genes were sampled. A discussion of these and
related issues is presented in Chapter 1; we will not address them further here.

Algorithms versus Optimality Criteria

Inferring a phylogeny is an estimation procedure; we are making a "best estimate"
of an evolutionary history based on the incomplete information contained in the
data. In the context of molecular systematics, we generally do not have direct
information about the past; we only have access to contemporary species and
molecules. Because we can postulate evolutionary scenarios by which any chosen
phylogeny could have produced the observed data, we must have some basis for
selecting one or more preferred trees from among the set of possible phylogenies.
Phylogenetic inference methods seek to accomplish this goal in one of two ways:
(1) by defining a specific sequence of steps (an algorithm) that leads to the
determination of a tree; or (2) by defining a criterion for comparing alternative
phylogenies to one another and deciding which is better (or that they are equally
good).

Purely algorithmic methods combine tree inference and the definition of the
preferred tree into a single statement. These methods include all forms of
pair-group cluster analysis (e.g., UPGMA) and some other distance methods such as
neighbor joining (discussed later in this chapter). The methods tend to be
computationally fast because they proceed directly toward the final solution
without requiring evaluation of large numbers of competing trees.

The second class of methods has two logical steps. The first step is to define an
optimality criterion (formally described by an objective function) for evaluating
a given tree; that is, a score is assigned and subsequently used for comparing one
tree to another. The second is to use specific algorithms for computing the value
of the objective function and for finding the trees that have the best values
according to this criterion (a maximum or minimum value, as appropriate). Thus,
the evolutionary assumptions made in the first step are decoupled from the
computer science of the second step. The price of this logical clarity is that the
methods tend to be much slower than those of the first class, a consequence of
having to search for the tree(s) with the best score. For data sets containing
more than about 8 to 20 taxa, the search for the best tree is usually not exact
(because of the large number of possible solutions), and thus we must add caveats
regarding the thoroughness of the search for the optimal tree. These issues are
covered in detail below.

It is important to distinguish between the uses of algorithms in the two
approaches. In a purely algorithmic method, the algorithm defines the tree
selection criterion and takes on fundamental importance. In a criterion-based
method, however, the algorithms are merely tools used in evaluating the objective
function and searching for trees that optimize it.* Because criterion-based
methods can assign scores to every tree examined, phylogenies can be ranked in
order of preference according to the chosen criterion. This is an enormous
advantage over purely algorithmic methods. If a criterion-based method finds that
there are thousands or millions of trees that explain the data about equally well,
the user of the method

*Actually, the same algorithm may be used in both approaches, albeit for very
different goals. For instance, an algorithm used to specify a final tree in a
purely algorithmic method may be used to find an initial tree for a
criterion-based method (e.g., as a starting point for branch-swapping
rearrangement algorithms).
will not be misled into believing that any particular tree is well-specified. On
the other hand, when a purely algorithmic method determines a single tree, the
user will have no immediate knowledge about the strength of support for that tree.
Some workers (e.g., Hedges et al., 1992b) have argued that algorithmic methods can
be rescued by using statistical methods such as nonparametric bootstrapping (see
the section "Reliability of Inferred Trees," later in this chapter) to assess the
confidence in a tree found using an algorithmic method. This position fails to
address the criticism that algorithmic methods generally do not address the
operational evolutionary assumptions. As an extreme example, consider an algorithm
that chooses trees independently of the data, for example by labeling the tips of
a maximally asymmetric tree in alphabetic order of the species names. Repeated
analyses using different re-samplings of the data will always generate the same
tree, leading to the obviously absurd conclusion that the tree is extraordinarily
reliable.

Use of Models and Assumptions in Phylogenetics

Although we will deal extensively with specific models of the evolutionary change
of molecules, a preliminary discussion of the relevance of models in general is in
order at the outset. Phylogenetic inferences are premised on the inheritance of
ancestral characteristics, and on the existence of an evolutionary history defined
by changes in these characteristics. The stable inheritance of characteristics is
mediated by the genome. Differences due to epigenetic or environmental factors do
not provide useful phylogenetic information and must be specifically avoided; all
characteristics of interest are genetically mediated. Therefore, the data for
phylogenetic inference reflect, more or less directly, genomic information. From
this reductionistic perspective, a complete evolutionary history is synonymous
with an event-by-event accounting of fixed mutations in every genomic lineage of
interest. This view of the problem provides a common framework, albeit a purely
conceptual one, for analyzing and comparing types of molecular data and analysis
techniques.

If a phylogenetic inference method could be based upon a complete knowledge of the
evolutionary process, it would be free of systematic error (i.e., if enough data
were obtained, the method would consistently obtain the true phylogeny). Even in
the absence of such complete knowledge, hypothetical models of the evolutionary
process could be used to derive (or otherwise justify) tree inference methods that
would be free of systematic error, if the assumed model were correct. A variety of
inference techniques have been formulated on the basis of explicit evolutionary
assumptions. These model-based methods are not necessarily invalidated when one or
more of their assumptions is violated; a model does not have to be perfect in
order to be useful. That is, although the assumptions may be sufficient to ensure
the validity of a technique, under special circumstances they might not all be
necessary, and the method may be robust to violation of its assumptions.
Furthermore, model assumptions need not be accepted in a vacuum; data can and
should be allowed to reject the model if the model is inadequate.

Although almost all methods accept the appropriateness of a tree-like model of
evolution (a strong assumption in itself), many commonly used methods of
phylogenetic inference are not explicitly based on a set of evolutionary
assumptions. However, the lack of stated assumptions does not mean that a method
is assumption-free; the assumptions are simply implicit rather than explicit. For
example, the widely used method of maximum parsimony does not depend on a precise
model, but believing its results does require one to believe that plausible
evolutionary scenarios that could cause it to fail have not taken place. It is
often argued that it is circular to model character change for the purpose of
estimating a phylogeny because we cannot begin to understand the processes of
character change without first knowing the tree. We prefer, instead, to think of
the problem as one of "reciprocal illumination" (Hennig, 1966): having some idea
of the phylogeny is relevant to the development of good models, but ever-improving
models can also lead to better phylogenetic inferences. Thus, both classes of
methods are useful and important. We
see it as unfortunate that some workers, in their zeal to avoid circularity, limit
themselves to "model-free" methods that may be more likely to violate their
(implicit) assumptions than the methods they reject, for which the assumptions are
more explicit.

One assumption implicit in this general view concerns the uniqueness of the
genomic lineage. The potential confusion due to lateral gene transfer has received
much recent attention. When transfer is common among the lineages of interest, a
population genetic analysis (Chapter 10) is most appropriate. Our presentation is
appropriate for cases in which interspecies differences are large compared to
intraspecific variation.

Definitions of Terms

Most of the analytical techniques that we will discuss result in the inference of
an unrooted tree or unrooted phylogeny, that is, a phylogeny in which the earliest
point in time (the location of the common ancestor) is not identified. (We
generally use tree and phylogeny interchangeably.) Also, biologists often refer to
an unrooted tree as a network; however, this usage conflicts with the definition
applied to that term by mathematicians and should be avoided (the section "Split
Decomposition" uses network in the correct sense). When we find it necessary to
distinguish between rooted and unrooted phylogenies or trees, we will do so
explicitly.

The components of a phylogenetic tree go by a variety of names. The contemporary
taxa correspond to terminal nodes or tips, also called leaves or external nodes.
The branch points within a tree are called internal nodes. Nodes are called
vertices or points by some authors. The branches connecting (incident to) pairs of
nodes are also called edges, links, or segments. We will use the terms peripheral
branches to refer to branches that end at a tip and interior branches (or, in the
case of a tree with four terminal nodes, central branch) to refer to branches that
are not incident to a tip.

If just three branches connect to an internal node, then the node represents a
bifurcation, or dichotomy. If there are more than three branches connected to an
internal node, then the node represents a multifurcation, or polytomy. A tree in
which all internal nodes represent bifurcations is said to be binary, fully
resolved, or strictly bifurcating. A tree that contains a single internal node is
called a star tree.

An unrooted, fully resolved tree has T terminal nodes (corresponding to the taxa)
and T - 2 internal nodes. The tree has 2T - 3 branches, of which T - 3 are interior
and T are peripheral. The total number of distinct unrooted, strictly bifurcating,
trees for T taxa is

$$B(T) = \prod_{i=3}^{T} (2i - 5)$$

(Felsenstein, 1978b). Adding a root adds one more internal node and one more
interior branch. Since the root can be placed along any of the 2T - 3 branches, the
number of possible rooted trees is increased by a factor of 2T - 3.

TYPES OF DATA

All of the experimental data gathered by the techniques in this volume fall into
one of two broad categories: discrete characters, and similarities or distances. A
discrete character provides data about an individual species or sequence.
Character data are often transformed into similarity or distance values
representing quantitative comparisons of two species or sequences; each such
measure describes a pairwise relationship. Of the methods discussed in this book,
only DNA-DNA hybridization data are collected directly in the form of pairwise
distance comparisons. Appropriate distance measures and transformations for
DNA-DNA hybridization data are discussed in Chapter 6. Our discussion here focuses
on character data.

Discrete character data are those for which a data matrix X assigns a character
state x_ij to each taxon i for each character j. Although systematists sometimes
disagree about the terminological distinction between character and character
state, we prefer to think of characters as independent variables whose possible
values are collections of mutually exclusive character states.
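Returning briefly to the node, branch, and tree counts given under "Definitions of Terms," the short Python sketch below evaluates them for a given number of taxa T. It assumes the standard count of unrooted, strictly bifurcating trees as the product of the odd numbers 1 x 3 x 5 x ... x (2T - 5), which we take to be the expression attributed to Felsenstein (1978b); the function names are our own.

```python
def tree_components(T):
    """Node and branch counts for an unrooted, fully resolved tree with T taxa."""
    if T < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    return {
        "terminal_nodes": T,
        "internal_nodes": T - 2,
        "branches": 2 * T - 3,
        "interior_branches": T - 3,
        "peripheral_branches": T,
    }


def num_unrooted_trees(T):
    """Number of distinct unrooted, strictly bifurcating trees for T taxa."""
    if T < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    count = 1
    for i in range(3, T + 1):
        count *= 2 * i - 5  # the i-th taxon can join any of 2i - 5 branches
    return count


def num_rooted_trees(T):
    """A root can be placed on any of the 2T - 3 branches of each unrooted tree."""
    return num_unrooted_trees(T) * (2 * T - 3)


if __name__ == "__main__":
    for T in (4, 5, 8, 10, 20):
        print(T, num_unrooted_trees(T), num_rooted_trees(T))
```

The rapid growth of these numbers (already over two million unrooted trees for 10 taxa) is why exhaustive evaluation of every tree quickly becomes impractical.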
The assumption of independence among characters is common to most character-based
methods of analysis. When we cannot assume independence, we are forced to take
covariances among characters into account, and the computational methods become
considerably more complicated. Furthermore, the assumption of independence enables
us to treat each position separately in certain time-consuming stages of
computational algorithms, thereby allowing problems to be subdivided into a number
of much simpler subproblems. (For example, numbers of substitutions can be
minimized separately position-by-position and then summed over positions in a
parsimony algorithm, or probabilities can be multiplied over positions in a
maximum likelihood approach.)

A second assumption required of character data is that the characters be
homologous. As articulated in Chapter 1, the concept of homology is complicated by
the variety of meanings that have been applied to the term. In general, by
homology we mean that a character must be defined in such a way that all of the
states observed over taxa for that particular character must have been derived,
perhaps with modification, from a corresponding state observed in the common
ancestor of those taxa. When we are interested in relationships among species
rather than among genes, we further restrict this definition to include only
orthologous, as opposed to paralogous or xenologous, genes.

In general, character data are either qualitative, in which case the possible
states are two or more discrete values; or quantitative, in which case the
characters vary continuously and are measured on an interval scale. Qualitative
characters may be further subdivided into binary (two possible states) and
multistate (three or more possible states). Binary characters typically represent
the presence or absence of some item, such as the recognition sequence for a
restriction endonuclease at a certain map location (restriction site) or a
particular allele at an isozyme locus.

Multistate characters may be ordered or unordered, depending on whether an
ordering relationship is imposed upon the possible states (Figure 1). For example,
nucleotide sequence data are generally treated as unordered multistate characters,
since there is no a priori reason to assume, for instance, that state C is
intermediate between states A and G. In the context of phylogenetic analysis, we
say that any state is allowed to transform directly into any other state. If, on
the other hand, we are willing to make assumptions involving the relationships
among the states of a character, we can rank the character states into an ordered
series (i.e., a linearly ordered character) or a branching diagram (partially
ordered character or character-state tree). Multistate ordered characters are not
commonly encountered in molecular data sets, but they are sometimes used in the
analysis of allozyme data.

Figure 1. Ordered and unordered characters. (A) Ordered multistate character
(transformation between any two states that are not directly connected implies
passage through one or more intermediate states). (B) Unordered multistate
character (any state can transform directly into any other state). (C) Ordered
multistate characters in which the polarity is indicated (the ordering relation is
the same in all three cases but the ancestral state differs).

The concepts of character order and character polarity should not be confused. The
former defines the allowed character-state transformations, whereas the latter
refers to the direction of character evolution. Estimation of character polarity
generally involves an assessment of the observed character state most likely to
represent the ancestral condition (i.e., the state found in the most recent common
ancestor of the taxa under study). An excellent discussion of character ordering
and polarity (in a non-molecular context) can be found in Mabee (1989). We will
return to the subject of character polarity in the discussion of parsimony
methods.

Quantitative characters are less commonly used as character data in molecular
systematics. The prominent exception occurs when polymorphic characters such as
allelic isozymes or mtDNA haplotypes are coded as frequencies.

Sequence Data

In principle, the use of sequence data as characters for phylogenetic analysis is
straightforward. Given a set of sequences, the characters are represented by
corresponding positions (offsets) in the sequences, and the character states are
the nucleotide or amino acid residues observed at those positions. For example, if
nucleotide A is observed to occur at position 139 in a sequence, "position 139" is
the character and "A" is the state assigned to that character. To simplify our
exposition, we will usually confine our descriptions to nucleotide sequences
unless the distinction is important.

Unfortunately, this simplicity is deceiving. In addition to requiring the use of
homologous molecules (see Chapter 1), phylogenetic analysis of sequence data
requires positional homology. That is, the nucleotides observed at a given
position in the taxa under study should all trace their ancestry to a single
position that occurred in a common ancestor of those taxa. Except for highly
conserved sequences, insertion and deletion events must nearly always be
postulated in order to make believable the assumption that nucleotides at
corresponding positions in the various sequences are in fact homologs. An
alignment of the sequences is obtained by inserting gaps, which correspond to
insertions or deletions, into one or more of the sequences in order to place
positions inferred to be homologous into the same column of the data matrix.
Alignment is often the most difficult and least understood component of a
phylogenetic analysis from sequence data. Methods for alignment are discussed in
Chapter 9.

Restriction Endonuclease Data

Restriction endonuclease analysis provides character data in one of two forms,
both of which lead to a set of binary characters for each taxon. Ideally, the
characters are map locations and character states are presences or absences of the
recognition sequences for particular endonucleases at those locations
(restriction-site data). However, because the construction of restriction maps is
time-consuming (see Chapter 8), some workers simply treat the presence or absence
of restriction fragments of a given length as character states
(restriction-fragment data).

We do not recommend the use of restriction-fragment data for input to phylogenetic
analysis, primarily because these data violate the assumption of independence
among characters. If a new site evolves between two preexisting sites, one
(longer) fragment disappears and two new (shorter) ones appear. Thus, even though
two species may share two of the three restriction sites, they have no fragments
in common, a potentially serious source of error. Some authors (e.g., B. Bremer,
1991) have recognized this difficulty and argue that it can be overcome by looking
at "enough" fragment data so that each occurrence of this kind of error will be
swamped by other data. We are unconvinced by this argument, however, because there
is no guarantee that if something is done inappropriately enough times, all will
work out in the end (and the amount of systematic error introduced by this
shortcut will increase substantially with increasing divergence among the taxa in
the analysis). A second and related problem with fragment data is that insertions
or deletions are difficult to handle. For example, the insertion of a length of
DNA long enough to alter the mobility of the fragment (but not containing a
restriction site) requires the worker to assert that a species lacks a fragment
found in one or more other species, even though the restriction sites responsible
for the fragment are at homologous points on the map (see Chapter 8).
undersLood component of a phylogenetic analysis Even when sites are mapped, restriction en-
Phylogenefic Inference 413
donuclease data are problematic for phylogenetic With the development of character-based
due to the asymmetry in the probabilities methods, however, came a second controversy,
of gaining and losing sites. If a particular sequence this one involving the importance of allele fre-
of six base pairs is only one substitution away quency information. Some authors (e.g., Micke-
from equalling the recognition sequence of a par- vich and Johnson, 1976) argued that the presence
tic~darendonuclease (a "one-off" site), then given or absence of an allele was of more fundamental
that a substitution occurs *withinthe six-base se- evolutionary importance than was its frequency
quence, only one of the 18 possible substitutions (which was subject to modification by drift
of one base for another will convert the sequence and/or selection), and that frequency information
to a restriction site. On the other hand, if the six- should therefore be discarded. These authors pre-
base sequence is already a restriction site, then a ferred to recast the data into presence/absence
substitution at any of the six positions will cause form. However, other authors (e.g., Swofford and
the site to be lost. Thus, losing an existing restric- Berlocher, 1987) have argued that there is no rea-
tion site is much more likely than gaining a site at son to ignore frequency information in analyzing
a particular location. (For more complete discus- allozyme data.
sions, see Templeton, 1983a, 1983b and DeBry and The earliest attempts to use allozyme charac-
Slade, 1985.)Note that this argument applies only ters directly in a phylogenetic analysis generally
to particular sites in the genome; it does not imply treated the allele as the character and either its
a net loss of restriction sites during evolution. Be- presence/absence (e.g., Mickevich and Johnson,
cause of these gain-loss asymmetries, special handling may be required for restriction-site data.

Isozyme Data

Allozyme (allelic isozyme) data represent the only type of isozyme data routinely used in phylogenetic analysis (but see Buth, 1984a, and Chapter 4 for a discussion of other data types). These data are usually presented as a three-dimensional array that specifies the frequency of each allele at each locus in each population or taxon.* Two controversial issues confront the researcher attempting to estimate phylogenies from allozyme data. The first concerns whether or not to transform the data to genetic similarities or distances. Probably due more to inertia than anything else, the predominant mode of analysis throughout the 1970s and into the 1980s was to compute a matrix of pairwise similarities or distances between taxa that served as the input to cluster analysis or additive-tree methods. The stereotypical way in which these data were treated tended to retard the development of approaches that made direct use of the character information.

1976) or its frequency (e.g., Buth, 1979b; Simon, 1979) as the character state. This procedure, however, is open to the same criticism leveled at the use of restriction fragment data: the assumption of independence of characters is violated. Specifically, since the frequencies of the alleles at a locus in a given taxon are constrained to sum to one, if the frequency of one allele increases, the frequency of at least one other allele must decrease. This property leads to problems, for example, when allele-as-character data are subjected to maximum parsimony analysis, where ancestors are often inferred to contain no alleles at all (presence/absence coding) or frequencies that do not sum to one (frequency coding) for some loci.

Because of these difficulties, Buth (1984a) and others have advocated an approach that recognizes the locus as the character and the allelic composition at the locus in each taxon (i.e., allele or combination of alleles present) as the character state. For example, if some taxa are fixed either for allele a or for allele b, whereas others are polymorphic for both alleles, then three states would be recognized: "only a," "only b," and "a plus b."

*It is customary to refer to loci as putative or presumptive and to use the term electromorphs rather than alleles because of the indirect nature of the data and the usual absence of crossing experiments to confirm the mode of inheritance. For our purposes here, the simpler terms suffice.
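The locus-as-character coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`allelic_state`, `code_locus`), the input layout (one dictionary of allele frequencies per taxon), and the four example taxa are hypothetical and do not come from any published program.

```python
def allelic_state(freqs):
    """Return the set of alleles observed (frequency > 0) at one locus in one taxon."""
    # Frequencies at a locus are constrained to sum to one.
    if abs(sum(freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies at a locus must sum to one")
    return frozenset(allele for allele, f in freqs.items() if f > 0)


def code_locus(freq_by_taxon):
    """Code one locus as one character: each distinct allelic combination is a state."""
    combos = {taxon: allelic_state(f) for taxon, f in freq_by_taxon.items()}
    # Integer labels are assigned in a fixed (sorted) order; the states themselves
    # are treated as unordered.
    labels = {c: i for i, c in enumerate(sorted(set(combos.values()), key=sorted))}
    return {taxon: labels[c] for taxon, c in combos.items()}, labels


# Hypothetical frequencies for alleles a and b in four taxa.
locus = {
    "T1": {"a": 1.0, "b": 0.0},    # "only a"
    "T2": {"a": 0.0, "b": 1.0},    # "only b"
    "T3": {"a": 0.6, "b": 0.4},    # "a plus b"
    "T4": {"a": 0.99, "b": 0.01},  # also "a plus b"; the frequencies are ignored
}
states, labels = code_locus(locus)
print(states)
```

Note that T3 and T4 receive the same state even though their frequency arrays differ considerably, because this coding records only which alleles are present.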
The resulting discrete character states ("particulate data") are either left unordered or ordered into some logical progression (see Buth, 1984a, for details) for subsequent analysis.

Despite its intuitive appeal, several factors limit the utility of the particulate data, locus-as-character approach. When many different alleles occur in various combinations across taxa, the number of unique combinations may approach or even equal the number of taxa. Such characters will contain little or no information if the character states are left unordered. Ordering the character states helps somewhat, but the ordering criteria often seem subjective and arbitrary.

Buth (1984a) distinguished qualitative coding, in which observed combinations of alleles are used regardless of frequencies, and quantitative coding, in which estimated allele frequencies are used to assess "whether the states expressed by two taxa are statistically identical." Obviously, qualitative coding is extremely susceptible to sampling error. Consider the example in the above paragraph. Taxa that were in reality polymorphic for alleles a and b would often be incorrectly scored as "fixed" if one allele were rare, unless sample sizes were large. (Swofford and Berlocher, 1987, give a table showing the probability of failing to detect low-frequency alleles in samples of various sizes; see also Chapter 2). Even if allele frequencies could somehow be determined without error, it would be unreasonable to argue that allele frequencies are so irrelevant that the distinction between allele frequency arrays of, say, [0.01, 0.99] and [0.99, 0.01] is unimportant.

Quantitative coding presumably makes use of contingency-table analysis to test whether two or more samples could have come from a single homogeneous population. In most cases involving interspecific comparisons, however, we know beforehand or from the analysis of other loci that such is not the case, even if the difference between the allele frequency arrays of two taxa at a particular locus is not deemed significant. Furthermore, the power of these tests to detect heterogeneity is weak unless sample sizes are large. Therefore, failure to reject the null hypothesis of homogeneity should not usually be taken as evidence that the taxa are "statistically identical." Because of these considerations, methods that require re-coding of allele frequency arrays into discrete states should be used only when levels of polymorphism are low, with problematic loci excluded from the data set.

J.S. Rogers (1984, 1986) and Swofford and Berlocher (1987) have developed methods of analysis that use the observed allele frequencies directly in character-based analyses rather than requiring their recoding as discrete states (see the section on "Parsimony on Allozyme Data"). Felsenstein's (1981b) maximum likelihood method for continuous characters evolving under a Brownian motion process can also be applied to gene frequency data (after an appropriate transformation).

Gene Order Data

Phylogenetic inference based on the structural arrangement of genes, particularly in organellar genomes, provides a useful alternative to the more traditional comparison of the sequences of one or more genes (or indirect measures thereof). Although we will not discuss the use of gene-order data in detail, there is growing evidence that such data will provide important information on relationships, particularly when trying to resolve ancient divergences. Sankoff et al. (1992) used gene-order comparisons to estimate a phylogeny for 16 taxa, including fungi and other eukaryotes, and obtained a tree highly compatible with our current understanding of metazoan and fungal relationships. More recently, Boore et al. (1995) have used gene-order data to address longstanding questions regarding arthropod relationships. They were able to draw strong conclusions about relationships that previously had been highly ambiguous. Boore et al. (1995), Downie and Palmer (1992b), and others have argued that gene rearrangements are potentially more informative because they occur less frequently (and hence are less subject to parallelism and convergence) than sequence data, and because the large number of possible character states makes it unlikely that the same gene order will evolve independently in different lineages. Thus, while gene-order characters typically are insufficient to obtain a fully resolved
tree, one can generally have high confidence in the groups that are supported.

Phylogenetic analysis of gene-order data is in its infancy (although the problems are similar to those encountered in the analysis of chromosomal inversions and other rearrangements). A serious complication is that the characters can no longer be assumed to evolve independently, because it is the relationships of the genes to each other that define the characters. Sankoff et al. (1992) have developed and implemented a method for minimizing the number of evolutionary events (inversions, transpositions, insertions, and deletions) required to convert one circular genome into another. This quantity then serves as the basis for a distance metric. Others (e.g., Boore et al., 1995) have performed parsimony analysis on special codings of the gene order data, despite the non-independence of the data. It is likely that methods of analysis for gene-rearrangement comparisons will be an active area of research for the next few years.

OPTIMALITY CRITERIA I: PARSIMONY METHODS

Of the existing numerical approaches to inferring phylogenies directly from character data, methods based on the principle of maximum parsimony have been the most widely used by far. Most biologists are familiar with the usual notion of parsimony in science, which essentially maintains that simpler hypotheses are preferable to more complicated ones and that ad hoc hypotheses should be avoided whenever possible. Methods for estimating trees under the criterion of parsimony equate "simplicity" with the explanation of attributes shared among taxa as due to their inheritance from a common ancestor (e.g., Sober, 1989). When character conflicts occur, however, ad hoc hypotheses cannot be avoided if the observed character distributions are to be explained, and assumptions of homoplasy (convergence, parallelism, or reversal) must be invoked.

In general, parsimony methods for inferring phylogenies operate by selecting trees that minimize the total tree length: the number of evolutionary steps (transformations from one character state to another) required to explain a given set of data. For example, the steps might be base substitutions for nucleotide sequence data, or gain and loss events for restriction-site data. Obviously, a tree that minimizes the total number of steps also minimizes the number of extra steps (homoplasies) needed to explain the data.

In more mathematical terminology, we can define the general maximum parsimony problem as the following. From the set of all possible trees, find all trees τ such that

$$L(\tau) = \sum_{k=1}^{B} \sum_{j=1}^{N} w_j \cdot \mathrm{diff}(x_{k'j},\, x_{k''j})$$

is minimal, where L(τ) is the length of tree τ, B is the number of branches, N is the number of characters, k′ and k″ are the two nodes incident to each branch k, x_{k′j} and x_{k″j} represent either elements of the input data matrix or optimal character-state assignments made to internal nodes, and diff(y,z) is a function specifying the cost of a transformation from state y to state z along any branch. The coefficient w_j assigns a weight to each character; it is often set to 1, but this need not be the case. Note also that diff(y,z) need not equal diff(z,y), although for methods that yield unrooted trees, diff(y,z) = diff(z,y). As discussed below, the definition of optimal character-state assignments may include restrictions on the nature of permissible character-state changes.

Any discussion of parsimony methods must distinguish between the optimality criterion (minimal tree length under a specified set of restrictions on permissible character-state changes) and the actual algorithm used to search for optimal trees. Early descriptions of parsimony methods (e.g., Farris, 1970) were presented in a way that tended to obscure the boundaries between criteria and algorithms. Biologists attempting to understand a method should not become so mired in algorithmic details that they lose track of the underlying biological principles and assumptions (Felsenstein, 1982). Algorithms tend to have short life spans, because better ones are constantly being invented. For example, Farris's (1970) algorithm for estimating minimum-length trees under the Wagner parsimony criterion is not, to our knowledge, used in any modern, widely used parsimony computer program (e.g., Farris's Hennig86, Felsenstein's PHYLIP-MIX, or Swofford's PAUP), but his criterion forms the basis for all of them. For these reasons, the conceptual framework in which we will discuss parsimony (and other) criteria assumes that the problem of finding optimal trees is not at issue. We assume, for the moment, that every possible tree can be evaluated, optimizing each one according to the chosen criterion and ranking them according to that criterion. We will take up the matter of searching for optimal trees in a subsequent section.

A common misconception regarding the use of parsimony methods is that they require a priori determination of character polarities (see above). In morphologically based studies, character polarity is often inferred using the method of outgroup comparison, and the resulting "polarized" characters form the basis of the analysis. Furthermore, since a "hypothetical ancestor" is implied by the polarity assignments, the output of an analysis of polarized characters is a rooted tree. Whereas specification of polarities provides a sufficient basis for obtaining rooted (rather than unrooted) trees, it is by no means prerequisite to the use of parsimony methods. This circumstance is fortunate, since the estimation of character polarity is both more difficult and less meaningful for most kinds of molecular data. All that is required to obtain rooted trees from parsimony analysis is to include in the data set one or more assumed outgroup taxa. The location at which the outgroup joins the unrooted tree implies a root with respect to the ingroup taxa. We emphasize, however, that the assignment of taxa to the outgroup constitutes an assumption that the remaining taxa (the ingroup) are monophyletic (an assumption that hopefully is justified by evidence extrinsic to the data at hand). If this assumption is wrong, the tree will be rooted incorrectly.

Parsimony analysis actually comprises a group of related methods, united by the goal of minimizing some evolutionarily significant quantity but differing in their underlying evolutionary assumptions. We will now address each of these methods in turn. The methods are presented in a logical progression rather than in chronological order of their introduction into the literature. In describing the procedures used to compute the minimum length required by a tree under a particular optimality criterion, we will consider a single character (position) in isolation from the rest. Because of the assumption of independence among characters, we can compute the overall tree length by summing, over all characters, the lengths required by each individual character. For the simplest procedures (Fitch and Wagner parsimony), we provide pencil-and-paper algorithms for computing tree lengths and determining optimal character-state assignments. Again, we are concerned only with calculating the length of a single tree, which is taken as a given; this tree may not be a most-parsimonious arrangement for our example character (or even over all characters); it is simply a tree that we wish to evaluate.

Fitch and Wagner Parsimony

These are the simplest parsimony methods, imposing no (Fitch) or minimal (Wagner) constraints on permissible character-state changes. The Wagner method, formalized by Kluge and Farris (1969) and Farris (1970), assumes that characters are measured on an interval scale; thus it is appropriate for binary, ordered multistate, and continuous characters. Fitch (1971b) generalized the method to allow unordered multistate characters (e.g., nucleotide and protein sequences). Wagner parsimony assumes that any transformation from one character state to another also implies a transformation through any intervening states, as defined by the ordering relationship. Fitch parsimony allows any state to transform directly to any other state. Both methods permit free reversibility; that is, change of character-states in either direction is assumed to be equally probable, and character states may transform from one state to another and back again. A consequence of reversibility is that the tree may be rooted at any point with no change in the tree length.
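As a minimal sketch of how the tree-length criterion and the two cost functions fit together, the following Python evaluates the length of one fully specified tree: every node, internal nodes included, already carries a state for each character. The function names, the branch-list representation, and the small five-taxon example are illustrative assumptions, not part of any parsimony program.

```python
def wagner_diff(y, z):
    """Ordered (interval-scale) states: cost is the distance along the ordering."""
    return abs(y - z)


def fitch_diff(y, z):
    """Unordered states: any state changes directly to any other in one step."""
    return 0 if y == z else 1


def tree_length(branches, states, diff, weights=None):
    """Sum, over branches and characters, of the weighted cost between the two end nodes.

    branches: list of (node, node) pairs, one per branch
    states:   dict mapping each node name to a sequence of character states
    diff:     cost function diff(y, z)
    weights:  optional per-character weights (default 1 for every character)
    """
    n_chars = len(next(iter(states.values())))
    if weights is None:
        weights = [1] * n_chars
    total = 0
    for k1, k2 in branches:
        for j in range(n_chars):
            total += weights[j] * diff(states[k1][j], states[k2][j])
    return total


# Hypothetical tree: terminal taxa A-E, internal nodes X, Y, Z, one character.
branches = [("A", "Z"), ("Z", "X"), ("Z", "Y"),
            ("X", "B"), ("X", "C"), ("Y", "D"), ("Y", "E")]
states = {"A": [0], "B": [0], "C": [2], "D": [1], "E": [3],
          "X": [1], "Y": [1], "Z": [1]}

print(tree_length(branches, states, wagner_diff))  # ordered character
print(tree_length(branches, states, fitch_diff))   # same states treated as unordered
```

Because both cost functions are symmetric, listing a branch as ("Z", "A") instead of ("A", "Z") gives the same length, which is the sense in which the root position does not affect tree length under these two criteria.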
To determine the minimum length required by a given character j under either the Wagner or Fitch criteria, only a single pass over the tree is required, proceeding from the tips toward the arbitrary root. Computer scientists call this pass a postorder traversal. Although the computation can be performed in other ways, we recommend rooting the tree at one of the terminal taxa, denoted r, as shown in Figure 2. The algorithm for computing the length of a strictly bifurcating tree under the Wagner parsimony criterion then proceeds as follows (see Swofford and Maddison, 1987, for a more rigorous presentation).

1. To each terminal node i (including the one at the root), assign a state set S_i containing the character state assigned to the corresponding taxon in the input data matrix (= x_i). Initialize the tree length to zero.

2. Visit an internal node k for which a state set S_k has not been defined but for which the state sets of k's two immediate descendants have been defined. Let i and j represent k's two immediate descendants. Assign to k a state set S_k according to the following rules:

2a. If the intersection of the state sets assigned to nodes i and j is non-empty (S_i ∩ S_j ≠ ∅), let k's state set equal this intersection (i.e., S_k = S_i ∩ S_j). The intersection can be represented as a closed interval [a_k, b_k].

2b. Otherwise (S_i ∩ S_j = ∅), let k's state set equal the smallest closed interval [a_k, b_k] containing an element from each of the state sets assigned to i and j. Increase the tree length by b_k − a_k.

3. If node k is located at the basal fork of the tree (i.e., the immediate descendant of the terminal node placed at the root), the traversal has been completed; proceed to step 4. Otherwise, return to step 2.

4. If the state assigned to the terminal node at the root of the tree (x_r) is not contained in the state set just assigned to the node at the basal fork of the tree (S_k), increase the tree length by the distance from x_r to S_k. (This distance equals a_k − x_r if x_r < a_k or x_r − b_k if x_r > b_k.)

Figure 2 Steps in the algorithm for computing the length of an ordered character under Wagner parsimony. (A) The unrooted tree and character states. (B) Tree obtained by rooting at terminal node A and initial state sets assigned to terminal nodes. (C) State sets computed for interior nodes (bold). (D) Reconstruction obtained according to the algorithm described in the text. (E) An alternative, equally parsimonious reconstruction.

An application of the above algorithm is presented in Figure 2. We wish to compute the length of the unrooted tree of Figure 2A. (Although the more usual situation for molecular data would involve binary rather than multistate characters, we treat the multistate case to demonstrate the generality of the algorithm. Binary characters are simply a special case.) We first re-root the tree at node
A (although we could have chosen any node), yielding the rooted tree shown in Figure 2B. Also shown in Figure 2B are the state sets assigned to the terminal nodes according to step 1 of the algorithm. Visiting internal node X in the first invocation of step 2, we observe that S_B ∩ S_C = {0} ∩ {2} = ∅, and hence assign the interval [0,2] to S_X, adding 2 − 0 = 2 to the tree length. Similarly, we let S_Y = [1,3] in the second invocation of step 2, and add 3 − 1 = 2 to the length, which is now 4. In the third and final invocation of step 2, we observe that the intersection S_X ∩ S_Y = [0,2] ∩ [1,3] is not empty, and therefore assign the interval [1,2] to S_Z. The situation as we arrive at step 4 is shown in Figure 2C. Since x_r = 0 is not an element of S_Z = [1,2], we add an additional 1 − 0 = 1 to the length. Thus, evolution of this character requires a minimum of five steps on our given tree.

The procedure outlined above is sufficient to obtain the minimal length required by any character on a given tree. However, it does not actually assign optimal character states to the hypothetical ancestors (internal nodes) of the tree to yield a most-parsimonious reconstruction (MPR). To obtain such a reconstruction we can make a second pass over the tree, this time proceeding from the root toward the tips (a preorder traversal):

5. Visit an internal node k for which an optimal state assignment x_k has not yet been made but for which such an assignment has been made to k's immediate ancestor, denoted m. (Note that the first time this step is invoked, k corresponds to the node at the basal fork of the tree and m = r, the terminal taxon at the root of the tree.)

6. Assign to k the state from the state set computed in the first pass, S_k (= [a_k, b_k]), that is closest to x_m. Specifically, if x_m is contained in S_k, we let x_k = x_m. Otherwise, we let x_k = a_k if x_m < a_k or x_k = b_k if x_m > b_k.

7. If all internal nodes have been visited, stop. Otherwise return to step 5.

Applying steps 5-7 to the example of Figure 2, we first assign state 1 (the closest state in [1,2] to 0) to node Z. We then assign state 1 (the closest state in [0,2] to 1) to node X; likewise we assign state 1 (the closest state in [1,3] to 1) to node Y. The resulting reconstruction is shown in Figure 2D, and confirms the value of 5 as the minimum length for this character.

It is important to remember that this method finds only a single MPR, although others may exist. For instance, the reconstruction in Figure 2E also requires 5 steps. Swofford and Maddison (1987) described an exact algorithm for obtaining all MPRs for discrete character data under the Wagner parsimony criterion.

Simple modifications of the above algorithm provide for the treatment of multistate unordered characters (e.g., nucleotide sequence positions) under the Fitch (1971b) parsimony criterion. In the initial pass (computation of state sets and tree lengths), modify steps 2 and 4 as follows:

2a'. If the intersection of the state sets assigned to nodes i and j is non-empty (S_i ∩ S_j ≠ ∅), let k's state set equal this intersection (i.e., S_k = S_i ∩ S_j).

2b'. Otherwise (S_i ∩ S_j = ∅), let k's state set equal the union of the state sets assigned to nodes i and j (S_i ∪ S_j), and increase the tree length by 1.

4'. If the state assigned to the terminal node at the root of the tree (x_r) is not contained in the state set just assigned to the node at the basal fork of the tree (S_k), increase the tree length by 1.

In order to obtain an MPR, modify step 6 above as follows:

6'. If x_m is contained in the state set assigned to k in the first pass (S_k), assign this state to k as well. Otherwise, arbitrarily assign any state from S_k to k.

An example of the application of the above algorithm is shown in Figure 3. We are interested in computing the length required by a single character on the unrooted tree of Figure 3A. As before, we re-root the tree arbitrarily at node A, yielding the tree shown in Figure 3B. The state sets assigned to the terminal nodes are indicated on the figure. Visiting node X in the first invocation of step 2', we see that {A} ∩ {C} = ∅, and hence assign the union {A,C} as the state set S_X and set the
*Some readers, familiar with "Dollo's Law of Irreversibility," may be confused at this point. The Dollo parsimony model does not assume complete irreversibility, only that a derived character state cannot be lost and then regained. The Camin-Sokal model does not permit a derived character state to return to the ancestral condition.
node), sacrificing the guarantee of global optimality but providing greater tractability. Nanney et al. (1989) described and programmed a more approximate, but much faster, procedure that operates by assuming that lengths of insertions and deletions are sufficiently small to allow alignment within a local "window" rather than obtaining a global alignment for any triplet of sequences. Hein (1990a,b) and Wheeler and Gladstein (1994) have developed useful programs for simultaneous alignment and tree optimization (see Chapter 9 for details).

Parsimony on Protein Sequences

Because this book does not specifically deal with amino acid sequencing, our discussion of parsimony methods for treating these sequences will be brief. Three general procedures have been used. The first, and simplest, is to minimize the number of amino acid replacements by using Fitch parsimony as described above (i.e., each position in the aligned sequences is a multistate unordered character, of which the possible states are the 20 possible amino acid residues). This approach, apparently used first by Eck and Dayhoff (1966), ignores the genetic code by failing to consider the minimal number of nucleotide substitutions required for the replacement of one amino acid by another (i.e., some replacements require a single nucleotide substitution, whereas others require two or even three substitutions).

Goodman, Moore, and their colleagues developed a more sophisticated approach (reviewed by Goodman, 1981) that seeks trees requiring the fewest number of nucleotide substitutions at the mRNA level. They produced an algorithm that generalizes the Fitch parsimony approach to codons, taking into account the degeneracy of the genetic code and guaranteeing that one obtains the minimum number of nucleotide substitutions required by any given tree. (A highly readable presentation of the algorithm, including a worked example, appears in G.W. Moore, 1976; see also Goodman et al., 1979). A more recent modification to their algorithm, by J. Czelusniak, permits the mixture of amino acid and nucleotide sequences (when available) in the same analysis (Goodman, 1981). Despite its elegance, the Moore-Goodman-Czelusniak algorithm may be overkill in the sense that it pays too much attention to silent substitutions (e.g., substitutions at third positions that do not change the corresponding amino acid). If silent substitutions occur so frequently that information from third positions quickly reaches saturation, then these positions would contribute mainly noise (or worse, systematic error) and should therefore be ignored. Weighting methods presumably could be used to minimize the contribution of third positions without ignoring them entirely. To our knowledge, however, such methods have not been used.

A third approach, intermediate between the first two, has been implemented by Felsenstein (1993) in his PROTPARS program from the PHYLIP package but has yet to be formally described in the literature. Unlike the Eck-Dayhoff approach, it does consider the genetic code, but it also deviates from the Moore-Goodman-Czelusniak method by ignoring silent substitutions. Although ignoring silent substitutions sounds like extra work, the required bookkeeping is in fact simplified considerably because the program does not need to consider all the potential mRNA codons responsible for a particular amino acid residue or all of the potential synonymous codon assignments to the interior nodes. For example, PROTPARS would assign one step to a change from lysine to arginine (e.g., AAA → AGA), but two steps to a change from lysine to proline (e.g., AAA → CAA (glutamine) → CCA). Changes such as phenylalanine to glutamine require three nucleotide substitutions (e.g., AAA → GAA (leucine) → GAT (leucine) → GTT) but are counted as only two steps, since the middle substitution is silent.

One could take Felsenstein's argument a step further. Because of the biochemical properties of the various amino acids, there is often little selection against changes between amino acids having similar properties (e.g., between aspartic and glutamic acids). If changes between similar residues occur very frequently, perhaps we should ignore them as well (or at least give them less weight). The generalized parsimony method can be used to implement this strategy (Marsh et al., 1994), with the weights derived from the matrices presented by Dayhoff (1978) or Henikoff and Henikoff (1992).
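To make concrete the statement that some amino acid replacements require one nucleotide substitution while others require two or three, the sketch below computes the minimum number of nucleotide differences between any codon for one residue and any codon for another, using the standard genetic code written with DNA bases. The function names are illustrative assumptions. The count is the raw minimum number of base differences; it is not the PROTPARS step count, which, as described above, discounts silent substitutions along the path.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code for codons in the order TTT, TTC, TTA, TTG, TCT, ...
# One-letter amino acid codes; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_substitutions(aa1, aa2):
    """Fewest nucleotide differences between any codon of aa1 and any codon of aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa1)
        for c2 in codons_for(aa2)
    )


print(min_substitutions("K", "R"))  # lysine -> arginine
print(min_substitutions("K", "P"))  # lysine -> proline
print(min_substitutions("F", "Q"))  # phenylalanine -> glutamine
```

A table of such minimum counts for all pairs of residues could serve as one possible cost matrix for a generalized parsimony analysis, although the weights mentioned in the text are derived from empirical replacement matrices rather than from the genetic code alone.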
*An alternative position is that parsimony is required as a method of scientific inquiry regardless of any considerations about whether it is more or less likely to recover the true phylogeny than other methods. Some proponents of this view hold that since the truth is essentially unknowable, we should abandon the search for it and simply choose the most parsimonious solution for its own sake. We do not subscribe to this position. Although the true phylogeny may be "unknowable," it can nonetheless be estimated, and we view phylogenetic methods as means to that end rather than an end in themselves.
†Parsimony in the traditional sense, i.e., "uncorrected parsimony"; see the end of this section.
*Felsenstein's results have often been criticized (e.g., Farris, 1983, 1986b) because they are based on unrealistic and restrictive models of the evolutionary process. This criticism is unjustified, however, as the point could equally well be made with much more general and believable models, but requiring more complex mathematics. Farris's (1983, 1986b) point that a maximum likelihood method will guarantee consistency only if evolution proceeds according to the assumed model is of course true, a point to which we will return later.
cussed in more detail below.) For the moment, also assume that the substitution rate is the same throughout the tree (we will see later that this assumption is not necessary; it merely allows us to think of branch lengths as amounts of evolutionary time). The observation that all eight descendants of ancestor 2 have nucleotide A is most consistent with change being rare, so postulated histories with fewer changes are more plausible than histories with more changes. Thus, from a maximum likelihood perspective, ancestor 2 would have an A in those histories (ancestral state reconstructions) having the highest probability of giving rise to the observed nucleotides. Although histories in which ancestor 2 had a C, G, or T would also contribute to the overall probability of the specified tree having generated the observed data, if all branches in the subtree were very short, histories with an A at ancestor 2 would contribute the vast majority of the total probability. This is as close as maximum likelihood gets to saying "ancestor 2 had an A."

We now move to ancestor 1. The branches connected to this ancestor lead to ancestor 2 (probably an A) and to sequences known to possess a C and a G (the outgroup), respectively. Ignoring the G for the moment, let us consider whether ancestor 1 is more likely to have possessed an A or a C, given the topology of the tree and the nucleotides found in the tip sequences. If ancestor 2 indeed possessed an A as expected, at least one change must have occurred along the path between ancestor 2 and the tip having a C (i.e., branches α and β). Because branch lengths represent the expected number of character-state changes along a branch, when a branch is short, there is a relatively low probability of a single change occurring along that branch, and an almost negligible probability of more than one change. Thus, given that a character change (probably) occurred somewhere along branches α or β, it is far more likely to have occurred along the long branch β than the short branch α. Thus, ancestor 1 is much more likely to have possessed an A than a C. Remember, however, that the estimate of A at ancestor 1 is a probabilistic statement. When the same configuration of tip states arises at different sites, the nucleotide found in the actual ancestor will usually be an A, but it would sometimes be a C, and occasionally it would even be a G or a T.

Returning our attention to the full tree, we know that at least two changes must have occurred, and since change is rare in this example, histories with three or more changes are less likely than those with only two changes. But on which two of the three branches (α, β, or γ) are the changes most likely to have occurred? Because branch α is so short, it is much more likely that the two changes have occurred on branches β and γ than on any pair of branches involving branch α. Therefore, histories with an A in ancestor 1 are more likely than others to have generated the observed data, and due to the greater length of branch γ, histories with a C in ancestor 1 are more likely than are those with a G. It seems extremely unlikely under our model that ancestor 1 would have possessed a T. Thus, we obtain a clear ordering of preferences for all residues. An important practical consequence is that, unlike parsimony, this sequence position would be informative with respect to the placement of a new sequence containing a C at the site, biasing the decision toward connecting this new sequence to branch β (Figure 9C).

It is important to remember that the only reason for the appropriate predisposition toward tree 9C is the short length of branch α and the low overall rate of change. In this case, an improbable substitution along branch α is avoided by placing the change along the branch leading to the tips with nucleotide C in Figure 9C. For either of the trees of Figure 9B and 9D, avoiding a substitution along the original branch α would require parallel A → C changes along the lineages terminating at taxa possessing nucleotide C. These parallelisms would be improbable events if the rate of change is low, but they become more probable as rates increase. Thus, as branch α becomes longer and the rates of change grow faster, the preference for tree 9C will decrease.

In summary, whereas parsimony ignores information on branch lengths when evaluating a tree, maximum likelihood considers that changes are more likely along long branches than short ones, and estimation of branch lengths is an important component of the method. This difference explains the consistency of maximum likelihood
under many situations in which parsimony is inconsistent. In the example of Figure 8, maximum likelihood will not be fooled by the "misinformative" pattern JV, because this pattern is very likely to occur even on the true tree. Distance methods that adequately account for unobserved substitutions will also succeed in this case, although they tend to be less efficient, requiring more data to achieve the same level of accuracy (e.g., see Hillis et al., 1994b; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995a,b).

Maximum Likelihood Methods
Maximum likelihood methods of phylogenetic inference evaluate a hypothesis about evolutionary history in terms of the probability that a proposed model of the evolutionary process and the hypothesized history would give rise to the observed data. It is conjectured that a history with a higher probability of giving rise to the current state of affairs is a preferable hypothesis to one with a lower probability of reaching the observed state. Maximum likelihood estimation was first used in phylogenetic inference by Cavalli-Sforza and Edwards (1967). However, because they did not use sequence data, this work remained relatively obscure. Felsenstein (1981a, 1993) brought the maximum likelihood framework to nucleotide-based phylogenetic inference. Later, maximum likelihood was applied to amino acid sequence data as well (Kishino et al., 1990; Adachi and Hasegawa, 1992).
In addition to its consistency properties, maximum likelihood is useful because it often yields estimates that have lower variance than other methods (i.e., it is frequently the estimation method least affected by sampling error). It also tends to be robust to many violations of the assumptions used in its models. Part of its power in this respect is that many models of sequence evolution that assume identical distributions across sites can safely assume that the actual substitution processes taking place at different sites have much in common, even if they are not exactly identical. Consequently, the major components determining the evolution of sequences can be described by just a few parameters. The overall result of both improved compensation for superimposed changes and of sampling variance is that even with very short sequences, maximum likelihood tree inference tends to outperform alternative methods (e.g., parsimony or additive distances) when evaluated under many models of sequence evolution (see, e.g., Hasegawa and Fujiwara, 1993; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995a).
Several areas of biological research, notably genetic mapping and clinical testing, routinely use maximum likelihood methods for testing hypotheses. However, the perceived and actual complexities of obtaining maximum likelihood solutions to problems that involve numerous alternative hypotheses have inhibited the more general use of these techniques. The following discussion attempts to outline the elements of a maximum likelihood formulation of phylogenetic inference. For additional perspective, Goldman (1990) provides a very accessible introduction to these concepts.

Objective
Phylogenetic analysis seeks to infer the history (or set of histories) that is most consistent with a set of observed data. In the present case, the data are observed nucleotide (or protein) sequences; the unknowns are the branching order and branch lengths of the tree. To apply a maximum likelihood approach, a concrete model of the evolutionary process that accounts for the conversion of one sequence into another must be specified. This model may be fully defined; alternatively, it may contain many parameters that are to be estimated from the data. A maximum likelihood approach to phylogenetic inference evaluates the probability that the chosen evolutionary model will have generated the observed sequences (the probability of the data under the model); phylogenies are then inferred by finding those trees that yield the highest likelihoods.
The basic principles involved in calculating the likelihood of a tree are introduced in Figure 10. Figure 10A shows a set of aligned nucleotide sequences for four taxa. Suppose we want to evaluate the likelihood of the unrooted tree shown in Figure 10B; that is, what is the probability that this tree could have generated the data of Figure 10A under our chosen model? Because most of the models currently used are time-re-
where the rows (and columns) correspond to the bases A, C, G, and T, respectively. The factor $\mu$ represents the mean instantaneous substitution rate. This mean rate is modified by the relative rate parameters a, b, c, ..., l, which correspond to each possible transformation from one base to a different base. The product of a relative rate parameter and the mean instantaneous substitution rate constitutes a rate parameter. The remaining parameters, $\pi_A$, $\pi_C$, $\pi_G$, and $\pi_T$, are frequency parameters that correspond to the frequencies of the bases A, C, G, and T, respectively (Z. Yang, 1994a). We assume that these frequencies remain constant over time (i.e., they are always at equilibrium), and that the rate of change to each base is proportional to the equilibrium frequency but independent of the identity of the starting base. The diagonal elements of Q are always chosen so that the elements in the corresponding row sum to zero. It is sometimes convenient to decompose Q into two matrices R and $\Pi$, where

$$R = \begin{pmatrix} - & \mu a & \mu b & \mu c \\ \mu g & - & \mu d & \mu e \\ \mu h & \mu j & - & \mu f \\ \mu i & \mu k & \mu l & - \end{pmatrix}$$

and

$$\Pi = \begin{pmatrix} \pi_A & 0 & 0 & 0 \\ 0 & \pi_C & 0 & 0 \\ 0 & 0 & \pi_G & 0 \\ 0 & 0 & 0 & \pi_T \end{pmatrix}$$
The off-diagonal elements of Q are then equal to the off-diagonal elements of the
matrix product $R\Pi$, and the diagonal elements of Q are once again set to the negative
of the sum of the off-diagonal elements for the corresponding row. Analogous
matrices can be defined for protein sequence data, except that there are 20
states rather than 4.
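As an illustration of the decomposition just described, the following minimal sketch builds Q from a relative-rate matrix R and a list of base frequencies. The function name `rate_matrix` and all numerical values are hypothetical, chosen only for the example; the base order A, C, G, T follows the text, and the example R happens to be symmetric.

```python
def rate_matrix(R, freqs):
    """Build Q from a relative-rate matrix R (already scaled by mu) and frequencies.

    Off-diagonal Q[i][j] equals (R * Pi)[i][j] = R[i][j] * freqs[j]; each diagonal
    entry is then set to the negative sum of the other entries in its row.
    """
    n = len(freqs)
    Q = [[R[i][j] * freqs[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])  # the diagonal is still 0.0 here, so this sums the off-diagonals
    return Q


# Hypothetical example values (base order A, C, G, T); not taken from the text.
mu = 1.0
a, b, c, d, e, f = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
R = [[0.0, mu * a, mu * b, mu * c],
     [mu * a, 0.0, mu * d, mu * e],
     [mu * b, mu * d, 0.0, mu * f],
     [mu * c, mu * e, mu * f, 0.0]]
freqs = [0.1, 0.2, 0.3, 0.4]
Q = rate_matrix(R, freqs)
```

Each row of the resulting Q sums to zero, as required. Passing the frequencies as a plain list rather than a full diagonal matrix is only a convenience; multiplying R by a diagonal $\Pi$ scales column j by the frequency of base j, which is what the list comprehension does.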
Almost all of the DNA substitution models proposed to date are special cases
of matrix (3). It is usually assumed that the overall rate of change from base i to
base j in a given length of time is the same as the rate of change from base j to
base i. Such models are said to be time-reversible. This corresponds to the rate
parameter restrictions g = a, h = b, i = c, j = d, k = e, and l = f. One byproduct of
time reversibility is that the likelihood of a tree generally does not depend on how
the tree is rooted. Consequently, as for most of the parsimony methods discussed
above, maximum likelihood estimation is usually limited to the inference of un-
rooted trees, and other assumptions must be invoked to convert an unrooted tree
into a rooted one. Although it is possible to relax the time-reversibility assump-
tion, this relaxation introduces additional computational complications, includ-
ing the need to consider rooted trees. Thus, we will only consider symmetric R
matrices of the form
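$$Q = \begin{pmatrix} -\mu(a\pi_C + b\pi_G + c\pi_T) & \mu a\pi_C & \mu b\pi_G & \mu c\pi_T \\ \mu a\pi_A & -\mu(a\pi_A + d\pi_G + e\pi_T) & \mu d\pi_G & \mu e\pi_T \\ \mu b\pi_A & \mu d\pi_C & -\mu(b\pi_A + d\pi_C + f\pi_T) & \mu f\pi_T \\ \mu c\pi_A & \mu e\pi_C & \mu f\pi_G & -\mu(c\pi_A + e\pi_C + f\pi_G) \end{pmatrix} \qquad (4)$$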
Figure 11 Relationship between special cases of the general time-reversible family of substitution models. [Diagram not reproduced. Model names surviving in the extracted text: TrN, SYM, HKY85, F81, JC. Surviving restriction labels: 3 substitution types (transversions, 2 transition classes); 3 substitution types (transitions, 2 transversion classes); 2 substitution types (transitions vs. transversions); single substitution type; equal base frequencies.]
(Lanave et al., 1984; Tavaré, 1986; Barry and Hartigan, 1987; Rodriguez et al., 1990). Most of the remaining models commonly used either for maximum likelihood tree inference or estimation of pairwise evolutionary distances can be obtained by restricting the parameters in matrix (4), as shown in Figure 11. For instance, if the substitution types are divided into transversions, transitions between purines, and transitions between pyrimidines, we obtain the model of Tamura and Nei (1993; TrN) by requiring that a = c = d = f. Similarly, we can obtain Kimura's (1981) three-substitution-type (K3ST) model by requiring that all bases occur in equal frequency ($\pi_A = \pi_C = \pi_G = \pi_T = 0.25$) and dividing the substitution types into transitions (b = e), A $\leftrightarrow$ T or C $\leftrightarrow$ G transversions (c = d), and A $\leftrightarrow$ C or G $\leftrightarrow$ T transversions (a = f). Zharkikh (1994) described a model (SYM) that is equivalent to GTR except that it assumes equal base frequencies. Any other restriction of the relative rates from the general time-reversible model (e.g., a = c, e = f) is possible; all such models are also time-reversible.
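The restrictions listed above can be written down as data. The sketch below is an illustration only: the names `RATE_TIES`, `EQUAL_FREQUENCIES`, `substitution_types`, and `is_special_case` are hypothetical. The ties for TrN, K3ST, and SYM come from this passage. Those for HKY85, K2P, F81, and JC are taken from the model descriptions elsewhere in the chapter and from the labels of Figure 11, and are assumptions to that extent.

```python
# Relative rates of matrix (4): a = A<->C, b = A<->G, c = A<->T,
# d = C<->G, e = C<->T, f = G<->T. Each string lists rates forced to be equal.
RATE_TIES = {
    "GTR":   [],
    "SYM":   [],
    "TrN":   ["acdf"],
    "K3ST":  ["be", "cd", "af"],
    "HKY85": ["acdf", "be"],
    "K2P":   ["acdf", "be"],
    "F81":   ["abcdef"],
    "JC":    ["abcdef"],
}
EQUAL_FREQUENCIES = {"SYM", "K3ST", "K2P", "JC"}


def substitution_types(model):
    """Number of distinct relative-rate classes left after applying the ties."""
    groups = {p: {p} for p in "abcdef"}
    for tied in RATE_TIES[model]:
        merged = set().union(*(groups[p] for p in tied))
        for p in merged:
            groups[p] = merged
    return len({frozenset(g) for g in groups.values()})


def is_special_case(special, general):
    """True if every constraint imposed by `general` also holds in `special`."""
    def same_class(model, p, q):
        return any(p in tied and q in tied for tied in RATE_TIES[model])

    ties_ok = all(same_class(special, tied[0], q)
                  for tied in RATE_TIES[general] for q in tied[1:])
    freqs_ok = general not in EQUAL_FREQUENCIES or special in EQUAL_FREQUENCIES
    return ties_ok and freqs_ok
```

For example, `substitution_types("TrN")` returns 3 and `is_special_case("K2P", "HKY85")` returns `True`, whereas `is_special_case("HKY85", "K2P")` returns `False` because HKY85 does not require equal base frequencies.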
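Further restrictions on the parameters in matrix (4) lead to more familiar models. If we assume that the equilibrium frequencies of all bases are the same ($\pi_A = \pi_C = \pi_G = \pi_T = 0.25$) and that all substitutions occur at the same rate (a = b = c = d = e = f = 1), the model reduces to that of Jukes and Cantor (JC) (1969):

$$Q = \begin{pmatrix} -\tfrac{3}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu \\ \tfrac{1}{4}\mu & -\tfrac{3}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu \\ \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & -\tfrac{3}{4}\mu & \tfrac{1}{4}\mu \\ \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & -\tfrac{3}{4}\mu \end{pmatrix}$$

The base frequency and substitution rate are typically combined into a single parameter $\alpha = \mu/4$, leading to the simpler form:

$$Q = \begin{pmatrix} -3\alpha & \alpha & \alpha & \alpha \\ \alpha & -3\alpha & \alpha & \alpha \\ \alpha & \alpha & -3\alpha & \alpha \\ \alpha & \alpha & \alpha & -3\alpha \end{pmatrix}$$

Kimura's (1980) two-parameter model (K2P) takes into account the common observation that transitions and transversions occur at different rates, but still assumes equal base frequencies. Thus we set a = c = d = f = 1 and b = e = $\kappa$ and obtain

$$Q = \begin{pmatrix} -\tfrac{1}{4}\mu(\kappa + 2) & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu\kappa & \tfrac{1}{4}\mu \\ \tfrac{1}{4}\mu & -\tfrac{1}{4}\mu(\kappa + 2) & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu\kappa \\ \tfrac{1}{4}\mu\kappa & \tfrac{1}{4}\mu & -\tfrac{1}{4}\mu(\kappa + 2) & \tfrac{1}{4}\mu \\ \tfrac{1}{4}\mu & \tfrac{1}{4}\mu\kappa & \tfrac{1}{4}\mu & -\tfrac{1}{4}\mu(\kappa + 2) \end{pmatrix}$$

Letting the transition rate $\alpha = \mu\kappa/4$ and the transversion rate $\beta = \mu/4$, the above can be rewritten as

$$Q = \begin{pmatrix} -(\alpha + 2\beta) & \beta & \alpha & \beta \\ \beta & -(\alpha + 2\beta) & \beta & \alpha \\ \alpha & \beta & -(\alpha + 2\beta) & \beta \\ \beta & \alpha & \beta & -(\alpha + 2\beta) \end{pmatrix}$$

Note that $\kappa = \alpha/\beta$ represents the transition bias. When $\kappa = 1$, there is no preference for transitions and the model reduces to the JC model. However, because there are twice as many kinds of transversions as transitions, the expected transition:transversion ratio is 1:2. Similarly, if $\kappa = 4$, we would then expect twice as many transitions as transversions.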
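The relationship between the transition bias and the expected transition:transversion ratio can be checked by direct enumeration. The sketch below is an illustration under the K2P assumptions stated above (equal base frequencies, transition rate $\mu\kappa/4$, transversion rate $\mu/4$); the function names are hypothetical.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """A change between two purines or between two pyrimidines."""
    return x != y and ((x in PURINES) == (y in PURINES))


def k2p_rates(kappa, mu=1.0):
    """Instantaneous K2P rate for every ordered pair of distinct bases."""
    return {(x, y): (mu * kappa / 4.0 if is_transition(x, y) else mu / 4.0)
            for x in BASES for y in BASES if x != y}


def expected_ts_tv_ratio(kappa):
    """Expected transitions per expected transversion at equilibrium (frequencies 1/4)."""
    rates = k2p_rates(kappa)
    ts = sum(0.25 * r for (x, y), r in rates.items() if is_transition(x, y))
    tv = sum(0.25 * r for (x, y), r in rates.items() if not is_transition(x, y))
    return ts / tv
```

With these definitions `expected_ts_tv_ratio(1.0)` gives 0.5 (the 1:2 ratio of the JC case) and `expected_ts_tv_ratio(4.0)` gives 2.0. In general the ratio is $\kappa/2$, because each base has one transition partner but two transversion partners.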
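The K2P model can easily be generalized to allow unequal equilibrium base frequencies (Hasegawa et al., 1985b). The instantaneous rate matrix for this model (HKY85) is then given by

$$Q = \begin{pmatrix} -\mu(\kappa\pi_G + \pi_Y) & \mu\pi_C & \mu\kappa\pi_G & \mu\pi_T \\ \mu\pi_A & -\mu(\kappa\pi_T + \pi_R) & \mu\pi_G & \mu\kappa\pi_T \\ \mu\kappa\pi_A & \mu\pi_C & -\mu(\kappa\pi_A + \pi_Y) & \mu\pi_T \\ \mu\pi_A & \mu\kappa\pi_C & \mu\pi_G & -\mu(\kappa\pi_C + \pi_R) \end{pmatrix}$$

where $\pi_R = \pi_A + \pi_G$ and $\pi_Y = \pi_C + \pi_T$.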
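The Felsenstein (1981a) model, which allows unequal base frequencies but assumes a single substitution type (F81 in Figure 11), was also described as the "equal input" model by Tajima and Nei (1982).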
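Felsenstein (1984) used a different method to accommodate unequal base frequencies in a two-parameter model (the F84 model, formally described in Kishino and Hasegawa, 1989). The F84 model divides the substitution process into two components: a general substitution rate capable of producing all types of substitutions, and a within-group substitution rate that produces only transitions. The instantaneous rate matrix for the F84 model can be obtained from matrix (4) by setting a = c = d = f = 1, b = (1 + K/$\pi_R$), and e = (1 + K/$\pi_Y$):

$$Q = \begin{pmatrix} -\mu[(1 + K/\pi_R)\pi_G + \pi_Y] & \mu\pi_C & \mu(1 + K/\pi_R)\pi_G & \mu\pi_T \\ \mu\pi_A & -\mu[(1 + K/\pi_Y)\pi_T + \pi_R] & \mu\pi_G & \mu(1 + K/\pi_Y)\pi_T \\ \mu(1 + K/\pi_R)\pi_A & \mu\pi_C & -\mu[(1 + K/\pi_R)\pi_A + \pi_Y] & \mu\pi_T \\ \mu\pi_A & \mu(1 + K/\pi_Y)\pi_C & \mu\pi_G & -\mu[(1 + K/\pi_Y)\pi_C + \pi_R] \end{pmatrix}$$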
(e.g., Cox and Miller, 1977; Hasegawa et al., 1985b; Z. Yang, 1994a). The exponen-
tial can be evaluated by decomposing the instantaneous rate matrix Q into its
eigenvalues and eigenvectors (we omit the details of how this is done, but see
Lewis et al., 1996, for an introductory explanation of the techniques used). For
several models, simple expressions exist for the eigenvalues, allowing direct an-
alytic calculation of the elements of the substitution probability matrix. For ex-
ample, in the K2P model of DNA substitution, there are only three probabilities to
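consider: the probability of a transversion-type substitution; the probability of a
transition-type substitution; and the probability of no substitution. These
probabilities are:

$$P_{ij}(t) = \begin{cases} \tfrac{1}{4} + \tfrac{1}{4}e^{-\mu t} + \tfrac{1}{2}e^{-\mu t(\kappa + 1)/2} & (i = j) \\ \tfrac{1}{4} + \tfrac{1}{4}e^{-\mu t} - \tfrac{1}{2}e^{-\mu t(\kappa + 1)/2} & (i \neq j, \text{ transition}) \\ \tfrac{1}{4} - \tfrac{1}{4}e^{-\mu t} & (i \neq j, \text{ transversion}) \end{cases}$$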
Substitution probabilities for some other DNA models are as follows (see Lewis et
al., 1996):
¹We refer to this matrix as the substitution probability matrix rather than the more traditional
transition probability matrix to avoid confusion with "transition" in the sense of a change
between two purines or between two pyrimidines.
where $A = 1 + \Pi_j(\kappa - 1)$ for the HKY85 model and $A = K + 1$ for the F84 model, with $\Pi_j = \pi_A + \pi_G$ if base j is a purine (A or G) and $\Pi_j = \pi_C + \pi_T$ if base j is a pyrimidine (C or T). Substitution probabilities for the remaining models can be calculated by numerical evaluation of the eigenvalues and eigenvectors of Q using standard algorithms (Z. Yang, 1994a; Lewis et al., 1996).
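To make the numerical route concrete, the sketch below computes the substitution probability matrix as the matrix exponential of Qt and compares it with the standard closed-form K2P probabilities. It is an illustration under stated assumptions. To stay within the Python standard library it uses a scaling-and-squaring Taylor series in place of the eigenvalue decomposition described in the text. The function names and the values $\kappa = 2$, $t = 0.5$ are hypothetical, and the K2P rates assume $\mu\kappa/4$ for transitions and $\mu/4$ for transversions.

```python
import math


def k2p_rate_matrix(kappa, mu=1.0):
    """K2P instantaneous rates, base order A, C, G, T (A<->G and C<->T are transitions)."""
    ts, tv = mu * kappa / 4.0, mu / 4.0
    diag = -(ts + 2.0 * tv)
    return [[diag, tv, ts, tv],
            [tv, diag, tv, ts],
            [ts, tv, diag, tv],
            [tv, ts, tv, diag]]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(M, squarings=10, terms=12):
    """exp(M) by scaling and squaring: exp(M) = (exp(M / 2**s)) ** (2**s)."""
    n = len(M)
    scale = 2.0 ** squarings
    A = [[x / scale for x in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]   # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):                                    # Taylor series of exp(A)
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def substitution_probabilities(Q, t):
    """P(t) = exp(Q t)."""
    return mat_exp([[q * t for q in row] for row in Q])


def k2p_closed_form(kappa, t, mu=1.0):
    e1 = math.exp(-mu * t)
    e2 = math.exp(-mu * t * (kappa + 1.0) / 2.0)
    return {"none": 0.25 + 0.25 * e1 + 0.5 * e2,
            "transition": 0.25 + 0.25 * e1 - 0.5 * e2,
            "transversion": 0.25 - 0.25 * e1}


P = substitution_probabilities(k2p_rate_matrix(2.0), 0.5)
closed = k2p_closed_form(2.0, 0.5)
```

Here `P[0][0]`, `P[0][2]`, and `P[0][1]` (A to A, A to G, A to C) should agree with `closed["none"]`, `closed["transition"]`, and `closed["transversion"]` to within rounding error, and every row of `P` should sum to 1.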
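$$\text{Poisson:} \quad P_{ij}(t) = \begin{cases} \tfrac{1}{20} + \tfrac{19}{20}e^{-\mu t} & (i = j) \\ \tfrac{1}{20} - \tfrac{1}{20}e^{-\mu t} & (i \neq j) \end{cases}$$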
The assumption of equal amino acid frequencies is clearly unreasonable for protein sequence data. If substitution rates are still assumed to be equal, an analog to the Felsenstein (1981a) model would have the same basic form as the instantaneous rate matrix of (ti), but with 20 states instead of 4. This model has been called the proportional model by Hasegawa and Fujiwara (1993). The corresponding change probabilities are the same as (7):
where $\pi$ now represents amino acid frequencies rather than base frequencies. Although this model is preferable to the Poisson model, it still assumes that the relative frequencies of the amino acids are constant across sites. This assumption is clearly violated as well (e.g., hydrophobic amino acids predominate in some regions of a protein, while hydrophilic amino acids predominate in others).
A large body of empirical evidence demonstrates that an amino acid is more likely to be replaced by a physicochemically similar amino acid than would be predicted by an equal-change-probability model (Dayhoff et al., 1978). Kishino et al. (1990) were able to derive a maximum likelihood method analogous to the general time-reversible model for DNA sequences by using an instantaneous rate matrix derived from Dayhoff et al.'s (1978) empirical substitution matrix. This model has been implemented as the Dayhoff model in the PROTML program of the MOLPHY package (Adachi and Hasegawa, 1992). More recently, a model (JTT) based on the updated empirical substitution matrix of D.T. Jones et al. (1992) has been added to PROTML; preliminary evidence indicates that this modification provides a better model for the evolution of diverse proteins than the Dayhoff model (Cao et al., 1994).
Protein-coding DNA sequences can be analyzed using either the original DNA sequences or the translated proteins (with some complications). Some information is lost in the translation to protein sequences. On the other hand, an obvious limitation to use of the original DNA sequences is that the assumption of equal rates of change for all sites is violated due to the degeneracy of the genetic code; a greater proportion of synonymous changes allows third positions to evolve at a much more rapid rate than first and second positions. This problem is easily corrected by allowing relative rates to be specified on a site-specific basis (see below). However, selection at the amino acid or codon level will cause the assumption of independence among sites to be violated as well. Consequently, maximum likelihood analyses of protein-coding DNA sequences probably should be conducted at the protein level unless the sequences are not very divergent (see Reeves, 1992, for a discussion of these and related issues). An alternative is to use a model of codon evolution with 61 states (Muse and Gaut, 1994; Goldman and Yang, 1994), retaining the full information content of the DNA sequences. Unfortunately, codon-based models are still in their infancy and are much more computationally intensive than 4-state (or even 20-state) models.
THE RELATIONSHIP BETWEEN SUBSTITUTION RATE AND TIME For all of these models, the probability of a change from state i to state j depends on the interaction of the duration of time t and the substitution rate $\mu$ only through their product $\mu t$ (Felsenstein, 1981a). Thus, a branch could be "long" either because it represents a long period of evolutionary time or because the rate of substitution has been high. In general, it is impossible to tease these two components apart unless one is willing to assume a perfect molecular clock. Consequently, the mean substitution rate $\mu$ is usually set to 1 and the relative rate parameters a, b, ..., f are scaled so that the average rate of substitution at equilibrium is 1 (e.g., Z. Yang, 1994a). The length of a branch then represents the expected number of substitutions per site along that branch, with no implication as to the actual amount of evolutionary time it represents.
These models allow the expected number of substitutions to be different for each branch of the tree. As noted above, one consequence of this freedom is that the likelihood of a tree can be calculated independently of the location of the root. If one is willing to assume that the substitution rate is approximately homogeneous across lineages, then the likelihood can be estimated under a molecular clock model by estimating branching times rather than the lengths of each branch (Bishop and Friday, 1985; Felsenstein, 1993). (Note that this model then requires evaluation of rooted rather than unrooted trees.) Because the clock model requires estimation of only about half as many parameters as the unconstrained model [$(T - 1)/(2T - 3)$], it will be more efficient (in the sense of requiring less data to achieve the same level of accuracy) if the clock assumptions are valid. Felsenstein (1993) outlined a likelihood ratio test of the molecular clock that compares the likelihoods of the more constrained clock model to the unconstrained-branch-length model.

goodness-of-fit statistic and then search for a model that maximizes this statistic without adding unnecessary parameters that do little more than explain random fluctuations in the data. If we can assume that sites in the sequence evolve independently, then the data represent a multinomial sample, so goodness-of-fit statistics such as a $\chi^2$ or the log likelihood ratio test (e.g., G of Sokal and Rohlf, 1981) can be used to measure the fit of the observed data to the predictions of the model (see Navidi et al., 1991 for a general discussion, and Ritland and Clegg, 1987 for examples). In phylogenetics it is more common to use the likelihood ratio statistic, which (unlike the $\chi^2$ statistic) does not require the expected probability of all distinct nucleotide patterns to be calculated. As with a contingency table analysis, we expect that with a large amount of data, the G statistic will behave like a $\chi^2$-distributed random variable, assuming the model is correct. (Likelihood-ratio tests of model fit are further described in the section on "Reliability of Inferred Trees.") A related measure, the Akaike information criterion (Akaike, 1974), can also be used to choose the most appropriate model (e.g., Kishino and Hasegawa, 1990), although in practice this measure is similar to a variety of other model selection criteria (see A.J. Miller, 1990). It is also important to avoid overconfidence when one model fits the data much better than another if the overall fit is not good, since both models could be quite inadequate.
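The parameter counts behind the clock comparison, and the Akaike criterion mentioned above, reduce to a few lines of arithmetic. The following sketch is illustrative only: the function names are hypothetical, and the log-likelihood values in the usage note are invented for the example. It assumes T taxa on a fully bifurcating tree, so that the unconstrained model has 2T - 3 branch lengths and the clock model has T - 1 branching times.

```python
def n_branch_parameters(n_taxa, clock):
    """Branching times under a clock (T - 1) or free branch lengths (2T - 3)."""
    return n_taxa - 1 if clock else 2 * n_taxa - 3


def clock_likelihood_ratio(lnL_clock, lnL_free, n_taxa):
    """Return the likelihood ratio statistic and its degrees of freedom (T - 2).

    The statistic is 2 * (lnL_free - lnL_clock); it would be compared with a
    chi-square distribution with the returned degrees of freedom.
    """
    statistic = 2.0 * (lnL_free - lnL_clock)
    df = n_branch_parameters(n_taxa, clock=False) - n_branch_parameters(n_taxa, clock=True)
    return statistic, df


def aic(lnL, n_parameters):
    """Akaike information criterion: smaller values indicate a preferred model."""
    return -2.0 * lnL + 2.0 * n_parameters
```

For instance, with 10 taxa and hypothetical log-likelihoods of -1010.0 (clock) and -1000.0 (unconstrained), `clock_likelihood_ratio(-1010.0, -1000.0, 10)` returns `(20.0, 8)`. The ratio of parameter counts, 9/17, is the "about half" referred to in the text.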
typically involve an iterative approach in which each branch is optimized separately by Newton's method (e.g., Kishino et al., 1990; G.J. Olsen et al., 1992; Tillier, 1994; Lewis et al., 1996). This method is guaranteed to find globally optimal branch lengths for a given tree topology only if there is at most one maximum on the likelihood surface. Although Fukami and Tateno (1989) claimed to have proved this to be the case, Steel (1994b) presented a simple counterexample demonstrating that multiple optimality peaks could occur and found the error in Fukami and Tateno's proof. Steel's example was artificial, but preliminary results (J.S. Rogers and D.L. Swofford, unpublished data) indicate that the problem can occur with real data sets as well. So far, local optima seem to occur only on trees that provide extremely poor explanations of the data (e.g., random trees).
It is important to emphasize that the method for calculating likelihoods described in this section does not require calculation of the probabilities of each possible reconstruction of ancestral states as was shown in the conceptual example of Figure 10. The two methods are in fact equivalent, but if we were indeed required to consider all possible reconstructions, the problem would become essentially intractable, as there are $4^{T-2}$ possible reconstructions for DNA sequence data and $20^{T-2}$ possible reconstructions for protein sequence data. For example, a data set of 20 taxa and DNA sequences of length 2000 would require calculation of the probabilities of $1.4 \times 10^{14}$ reconstructions for a given topology and set of branch lengths, and adjustment of even one branch length would require recalculation of all of them. It is extremely fortuitous that the probability summations can be rearranged into forms like equation (9) (corresponding to the "pruning" algorithm of Felsenstein, 1981a).
Evaluation of the likelihood of a tree and counting the number of changes of a tree under the general parsimony criterion are similar in several respects. The cost of a given change under parsimony is analogous to the likelihood of the given change from the substitution matrix, P(t). In parsimony, the cost of placing a given state at an internal node is the sum of the costs of deriving both of the daughter trees from that state, whereas the likelihood of an ancestral state is the product of the likelihoods of the state giving rise to the daughter trees. In parsimony, the total cost of the tree is the sum of the costs at each position, whereas the net log-likelihood of a tree is the sum of the log-likelihoods of the evolution at each sequence position. Essential differences between the general parsimony approach and the maximum likelihood approach include: the cost of a change in parsimony is not a function of branch length, unlike maximum likelihood; and maximum parsimony looks only at the single, lowest cost solution, whereas maximum likelihood looks at the combined likelihood for all solutions (ancestral states) consistent with the tree and branch lengths (see the discussion of integrated likelihood in Goldman, 1990). Felsenstein has used the relationship between likelihood and parsimony to gain several insights into the parsimony criterion, including the discovery of the potential for inconsistency due to unequal rates (Felsenstein, 1978a) and the inference of a character-weighting rationale (Felsenstein, 1981~).

Accommodating Rate Heterogeneity across Sites
The maximum likelihood models described above all assume that every site evolves at the same rate. Violation of this assumption can have devastating consequences. For instance, Gaut and Lewis (1995) showed that maximum likelihood inference under the assumption of rate homogeneity can become inconsistent when the true evolutionary process exhibits site-to-site rate variation, even when all other aspects of the process are modeled perfectly. If there is strong variation in rates across sites, sites that are resistant to change (e.g., due to strong selective constraints) can hide the actual amount of change that has occurred at more rapidly evolving sites. This causes maximum likelihood to underestimate the number of multiple changes; the longer the branch the greater the underestimation. Thus, maximum likelihood can become "positively misleading" (Felsenstein, 1978a) for exactly the same reasons as parsimony (Figure 8): highly divergent sequences will appear to be more closely related than they actually are (see Lockhart et al., 1995a, for a probable example of this problem with real data).
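The equivalence between the pruning calculation and the explicit sum over ancestral reconstructions, discussed earlier in this section, can be checked on a small case. The sketch below is a minimal illustration, not the implementation of any particular program. It assumes the JC model with branch lengths in expected substitutions per site (hence the factor 4/3 in the exponent). The four-taxon tree, its branch lengths, the site pattern, and all function names are hypothetical.

```python
import itertools
import math

BASES = "ACGT"


def jc_prob(i, j, length):
    """JC probability of state i -> j over a branch of `length` substitutions per site."""
    e = math.exp(-4.0 * length / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def conditional(node, site):
    """Pruning step: L[x] = P(tip data below node | node has state x)."""
    if isinstance(node, str):                      # a tip: the observed base is certain
        return [1.0 if BASES[x] == site[node] else 0.0 for x in range(4)]
    partial = [1.0] * 4
    for child, length in node:                     # internal node: (child, branch length) pairs
        below = conditional(child, site)
        for x in range(4):
            partial[x] *= sum(jc_prob(x, y, length) * below[y] for y in range(4))
    return partial


def pruning_likelihood(tree, site):
    return sum(0.25 * lx for lx in conditional(tree, site))


def brute_force_likelihood(site, lengths):
    """Sum over every assignment of states to the three internal nodes of the rooted tree."""
    tips = {name: BASES.index(base) for name, base in site.items()}
    total = 0.0
    for root, left, right in itertools.product(range(4), repeat=3):
        p = 0.25 * jc_prob(root, left, lengths["left"]) * jc_prob(root, right, lengths["right"])
        p *= jc_prob(left, tips["t1"], lengths["t1"]) * jc_prob(left, tips["t2"], lengths["t2"])
        p *= jc_prob(right, tips["t3"], lengths["t3"]) * jc_prob(right, tips["t4"], lengths["t4"])
        total += p
    return total


def n_reconstructions(n_taxa, n_sites, n_states=4):
    """Reconstructions to evaluate without pruning: n_states**(T - 2) per site."""
    return n_states ** (n_taxa - 2) * n_sites


# Hypothetical four-taxon example, rooted arbitrarily on the internal branch.
lengths = {"t1": 0.1, "t2": 0.2, "t3": 0.15, "t4": 0.1, "left": 0.03, "right": 0.02}
left = (("t1", lengths["t1"]), ("t2", lengths["t2"]))
right = (("t3", lengths["t3"]), ("t4", lengths["t4"]))
tree = ((left, lengths["left"]), (right, lengths["right"]))
site = {"t1": "A", "t2": "A", "t3": "C", "t4": "G"}
```

The two functions should return the same site likelihood up to rounding error. `n_reconstructions(20, 2000)` evaluates to 137,438,953,472,000, which is the $1.4 \times 10^{14}$ quoted in the text. The brute-force version enumerates the root as a third internal node only because the example tree is stored in rooted form.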
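One remedy developed in this section is the "discrete gamma" approximation of Z. Yang (1994b). The sketch below illustrates how the category rates could be computed under stated assumptions: a gamma distribution with shape $\alpha$ and mean 1, categories of equal probability, and each category represented by its mean rather than its median. It uses only the Python standard library, so the incomplete gamma function is evaluated by its power series and quantiles are found by bisection. All function names are hypothetical.

```python
import math


def reg_lower_gamma(a, x, max_terms=500):
    """Regularized lower incomplete gamma function P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / a
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gamma_quantile(p, alpha):
    """Rate r such that P(rate <= r) = p for a gamma with shape alpha and mean 1."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(alpha, alpha * hi) < p:
        hi *= 2.0
    for _ in range(200):                           # bisection on the cumulative distribution
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(alpha, alpha * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, n_categories=4):
    """Mean rate of each equal-probability category of the mean-1 gamma distribution."""
    k = n_categories
    cuts = [0.0] + [gamma_quantile(i / k, alpha) for i in range(1, k)]
    # Because the mean is 1, the integral of r * f(r) up to c equals P(alpha + 1, alpha * c).
    cumulative = [reg_lower_gamma(alpha + 1.0, alpha * c) for c in cuts] + [1.0]
    return [k * (cumulative[i + 1] - cumulative[i]) for i in range(k)]


def mixture_site_likelihood(likelihood_given_rate, rates):
    """Average the site likelihood over equally probable rate categories."""
    return sum(likelihood_given_rate(r) for r in rates) / len(rates)
```

With $\alpha = 0.5$ and four categories the rates come out at roughly 0.033, 0.252, 0.820, and 2.894, and they average to 1, so branch lengths keep their interpretation as expected substitutions per site. `mixture_site_likelihood` expects any function that returns the likelihood of one site at a given relative rate.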
Rate heterogeneity can be incorporated into likelihood analyses by including an additional relative rate component, r, into the substitution probability expressions. In the JC model, for example, we let

$$P_{ij}(t, r) = \begin{cases} \tfrac{1}{4} + \tfrac{3}{4}e^{-\mu r t} & (i = j) \\ \tfrac{1}{4} - \tfrac{1}{4}e^{-\mu r t} & (i \neq j) \end{cases}$$

If the relative rates r are scaled so that the mean substitution rate remains 1, branch lengths will still reflect the number of substitutions per site. In the simplest case, we simply assign a rate $r_j$ to each site j. Typically, the basis for this assignment would be some a priori classification of sites into functional categories and assignment of relative rates to the categories. Categorizations might be first, second, and third positions of a protein-coding gene, or paired versus unpaired sites for a ribosomal RNA gene. It is also possible to assign sites to rate categories based on the observed pattern of residue change. Van de Peer et al. (1993) proposed a way to do this by observing the frequency with which sequence pairs differ at each site as a function of the distance between the sequence pair. G.J. Olsen has written a program (DNArates; see Appendix) that performs a maximum likelihood estimate of the rate at each site for a given phylogenetic tree.
Several stochastic models that explicitly incorporate site-to-site rate variation are available. In these models, each site has a certain probability of evolving at any rate contained in some probability distribution, which may either be discrete or continuous. For a discrete rate distribution, the full likelihood for a given site is obtained by summing over rate categories the likelihoods of the site given each rate, weighted by the probability that the site is drawn from each category (Felsenstein, 1981a). Site likelihoods are calculated analogously for a continuous rate distribution except that the likelihoods must be integrated over the entire distribution.
The simplest model based on a discrete distribution is an invariable-sites model that assumes some fraction of the sites is incapable of accepting substitutions (perhaps due to strong functional constraint), but that the remaining sites all vary at the same rate (Hasegawa et al., 1985b; Churchill et al., 1992; Reeves, 1992; Sidow et al., 1992). In this case, when r = 0, $P_{ii}(t, r) = 1$ and $P_{ij}(t, r) = 0$ for all $i \neq j$. The proportion of invariable sites must either be estimated separately (see below) or treated as a parameter that is optimized for each tree. There is no reason in principle to restrict the rate of one of the categories to 0 (no change), or to limit the number of categories to 2, but estimation of the proportion of sites within each category and the relative rates among categories becomes much more complicated otherwise.
The most commonly used continuous distribution for modeling rate heterogeneity is the gamma ($\Gamma$) distribution (e.g., Z. Yang, 1993; Steel et al., 1993~). The $\Gamma$ distribution has two parameters, a shape parameter $\alpha$ and a scale parameter $\beta$. By setting $\beta$ to $1/\alpha$, a distribution with a mean rate of 1 is obtained, and a wide variety of rate distributions can be obtained by varying $\alpha$ (Figure 13).

Figure 13 The gamma distribution for four different values of the shape parameter ($\alpha$). When $\alpha$ is small, most of the sites evolve very slowly, but a few sites have moderate-to-high rates. As $\alpha$ increases, the distribution becomes more peaked and symmetrical about a mean rate of 1.0. When $\alpha$ is infinity, all sites have relative rate 1.0, so that an equal-rates model can be obtained as a special case of the gamma model. [Plot not reproduced; horizontal axis: Rate.]

The shape parameter $\alpha$ is equal to the inverse of the squared coefficient of variation of the substitution rate, so that as $\alpha$ increases, the distribution converges to an equal-rates model. Obtaining likelihoods by integrating over the $\Gamma$ distribution (or any other continuous distribution) is usually extremely computationally intensive (Z. Yang, 1993; see the section on Hadamard conjugation for a fast method under some models). Z. Yang (1994b) evaluated an alternative procedure in which the $\Gamma$ distribution is divided into several rate categories by finding boundaries in the distribution such that each category has equal probability. The mean (or median) of each category is then used to represent all of the rates within that category. Z. Yang (1994b) found that this "discrete gamma" model can provide a good approximation with as few as four rate categories. The advantage of using a discrete model is that it requires only a tiny fraction of the computer time needed for the continuous model. The discrete $\Gamma$ distribution, like the continuous case, only adds one extra parameter to the model (the shape parameter), no matter how many rate categories are considered.
In some situations, mixtures of rate heterogeneity models may be appropriate. For example, Gu et al. (1995) and Waddell and Penny (1996a) have proposed an "invariant + gamma" model, in which some fraction of the sites, $\theta$, are invariable, with the remaining rates distributed according to a $\Gamma$ distribution with shape parameter $\alpha$.

Estimating Model Parameters
The models described above contain a variety of parameters that must be estimated from the data or supplied on the basis of extrinsic evidence. These parameters include: the tree topology; the branch-length estimates (which are specific to each topology); the relative rate parameters of the substitution models (a, b, ..., f in matrix (4) or related parameters such as $\kappa$ and K); the base-frequency parameters ($\pi_A$, $\pi_C$, $\pi_G$, and $\pi_T$); and the parameters used in modeling rate heterogeneity (gamma shape parameter, proportion of invariable sites, etc.). Ideally, we would search for globally optimal values of these parameters in the n-dimensional parameter space. That is, we would consider every possible tree and optimize (jointly) all parameters of the model for each tree, choosing the resulting tree(s) of highest likelihood. For a given tree, one could perform a multidimensional optimization using Newton's method (e.g., A.W.F. Edwards, 1972). Unfortunately, this approach is difficult to implement because it requires knowledge of the first and second partial derivatives (and second cross-derivatives) of the likelihood function with respect to each of the parameters. Even when these derivatives are available, their computation can be quite slow.
In the section "Calculating the Likelihood of a Tree," we described a procedure that finds branch lengths that are at least locally optimal, given the values of any other parameters in the model. For any model more complex than the
JC/Poisson models, the values of additional para- for the data (e.g., a parsimony tree, or a maxi-
meters should be simultaneously optimized. mum likelihood tree inferred under the model of
When the model contains only one additional pa- Jukes and Cantor, 1969) and then "fix" the result-
rameter (e.g., $\kappa$ in the K2P model or the shape pa- ing estimates in a search for better trees under the
rameter in the JC+$\Gamma$ model), it is relatively easy to desired model. A successive approximations ap-
plot the likelihood function evaluated at various proach might work very well in this case. That is,
values of the parameter of interest and thereby if a tree of higher likelihood is found, the para-
find a value that approximately maximizes the meters could be re-optimized on this new tree
likelihood (e.g., Felsenstein, 1993). Obviously, this and fixed for yet another search, alternating be-
procedure can be quite tedious. tween estimation and tree-searching until the
A method that has worked well for one of us same tree is found in successive iterations. Al-
(DLS) is the use of derivative-free methods for though this strategy seems quite promising, its
function minimization developed by Brent (1973) effectiveness needs to be confirmed in empirical
for a single variable and M.J.D. Powell (1964; as studies. Note that one of the limitations ascribed
modified by Brent, 1973) for two or more vari- to the use of successive approximations in parsi-
ables. The procedure implemented in PAUP* mony character weighting is not relevant in this
(Swofford, 1996) is to use the Brent-Powell meth- case, because the likelihood function provides an
ods to find optimal values for all parameters other objective function that is comparable across para-
than branch lengths. When these algorithms need meter values and trees.
to evaluate the likelihood function, optimal An alternative to the methods presented
branch lengths (conditional on the current values above is to estimate the model parameters using
of the other parameters) are obtained using New- methods other than likelihood. For example, the
ton's method as described above. Thus, optimal rshape parameter can be approximated by fitting
values of all parameters are obtained when the al- a negative binomial distribution to a frequency
gorithm converges. (As for all heuristic methods, distribution of the number of changes required at
howevcr, there is no guarantee that the resulting each site under the parsimony criterion (e.g.,
solution is globally optimal.) For small data sets Uzzell and Corbin, 1971; Kocher and Wilson,
(4-8 taxa), this strategy can be used for every tree 1991; Wakeley, 1993; Sullivan et al., 1995a).A sim-
evaluated due to the small size of the trees and ilar approach can be used to estimate the propor-
the modest number of topologies tested. How- tion of invariable sites using the Poisson distribu-
ever, optimization of all model parameters on tion (Fitch and Markowitz, 1970; Markowitz,
every tree tested dramatically slows the search us- 1970).Sidow et al. (1992) described another inter-
ing larger data sets. Z. Yang and coworkers (Yang, esting method for estimating the proportion of in-
1994a,b,c; Yang et al., 1994) have suggested that variable sites based on a mark-recapture model
parameter estimates are fairly stable across tree (Seber, 1982).These estimates require different as-
topologies as long as the trees are not "too wrong" sumptions than maximum likelihood tree models
(Yang, 11995). Estimates of the shape parameter for and can be calculated quickly, so they may be use-
the r model of site-to-site rate variation appear to ful as a first approximation for selecting a model,
be somewhat more sensitive to the tree topology obtaining starting parameter values for maximum
than substitution-rate parameters (Yang, 1995; likelihood estimation, or examining the effect of
Sullivan et al., 1995b), although these conclusions tree topology on parameter estimates (e.g., Sulli-
are largely based on comparison of trees that van et al., 1995a).
probably fall into the "too wrong" category (e.g.,
random trees or star trees). Maximum Likelihood Methods for
As long as parameter estimates are not Other Data Types
wildly unstable across tree topologies, a poten- Maximum likelihood methods also can be applied
tially useful method would be to estimate the to other data types, such as gene frequencies
model parameters on some reasonably good tree (Felsenstein, 1981b) or restriction sites (Felsen-
446 Chapter 11 / Swofford, Olsen, Waddell & Hillis
stein, 1992b).The basic approach is the same as likelihood methods have consistently outper-
that described above for sequence data: one for- formed distance methods in choosing the correct
mulates a model of evolutionary change and cal- tree (e.g., Kuhner and Felsenstcin, 1994; Z. Yang,
culates the probabihty that tlze observed data (111 1994c; Huelsenbeck, 1995a).Although some other
this case, restriction site presences/absences or ar- studies have reported better performance of some
rays of gene frequencies) would have been gener- distance methods (Saitou, 1988; Saitou and Iinan-
ated by a particular tree topology under the ishi, 1989; Tateno et al., 19941, these results have
model. The mechanics of estimating branch subsequently been shown to be based on inade-
lengths and other model parameters are essen- quate computer programs and/or inappropriate
tially equivalent; the differences lie in the form of comparisons (Hasegawa et al., 1991; Z. Yang,
the models and how clzange probabilities are cal- 1994c; Huelsenbeck, 1995b).
culated. For some sources of data, including im-
munology and nucleic acid hybridization, there is
no alternative to the use of distance methods. For
Pairwise Distance Methods other types of data, i~zcludingmacromolecular se-
A critical point made in the comparison of parsi- quence, restriction site, and allozyme data, dis-
mony and likelihood metlzods above was that tances can provide a way to take advantage ~f
parsimony methods seek solutions that minimize models of evolutionary change when likelil.rood
the amount of evolutionary cl~angerequired to methods are either unavailable or intractable. Un-
explain the data, whereas liltelihood methods at- til recently, computers have been too slow and al-
tempt to estimate the actual amount of clzange ac- gorithms too inefficient to exploit fully the advan-
cording to an evolutionary model. This distinction tages of maximum likelil~oodtechniques, and
is reIevant because as mutations are fixed in the distance methods played a more important role.
genome, there is an ever-increasing chance of su- Even with the availability of faster maximum like-
perimposed changes occurring at a single se- lihood computer programs (see Appendix), dis-
quence position: changes at a particular site along tance methods remain useful, particularly for the
a lineage of the phylogeny may mask earlier analysis of large data sets, where their increased
changes at that sitc, and parallel or convergent speed allows more thorough testing of alternative
changes may occur at the same site in different tree topologies.
lineages. Thus, estimates of the amount of evolu- The negative side of reducing character data
tionary change implied by parsimony will be un- to pairwise distances is that information is lost in
derestimates of the true amount of change, unless the transformation. For instance, Penny (1982) has
the actual rate of change is extremely small. shown examples in which several different sets of
An alternative to the use of likelihood for sequences yield tlze same distance matrix, but
minimizing the impact of the underestimation given only the distances it is impossible to go
problem is the use of corrected distances that ac- back to the original sequences. Although this loss
count for superimposed changes by estimating of information probably explains tlze better per-
tlze number of unseen events using the same sorts formance of character-based maximum likelihood
of models employed in maximum Iikelihood inference, it clearly is not devastating. In fact,
analysis. The corrected distances are then esti- Inany sequence data sets yield identical conclu-
mates of the true evolutionary distance, whch re- sions with character-based and distance-based
flects the actual mean number of changes per site analyses (e.g., G.J. Olsen, 1987).Another draw-
that have occurred between a pair of sequences back to d~stanceanalysis is that it does not lend it-
since their divergence from a common ancestor. self to the combination of different kinds of data
Thus, following Cavaili-Sforza and Edwards into the same analysis, as is possible for character-
(1967), we view distance methods as less desirable based analyses (e.g., Miyamoto, 1985). Finally,
appraximations to a full maximum likelihood ap- only through character-based analysis can a re-
proach. In recent simulation studies, maximum searcher identify particularly informative charac-
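The underestimation problem described in this section (superimposed changes at a site hiding earlier changes) can be illustrated with a small simulation. This is only a sketch under simple assumptions that are not taken from the text: substitutions fall on sites uniformly at random, each substitution replaces the current base with one of the other three with equal probability, and the function name `simulate_lineage` is hypothetical.

```python
import random

BASES = "ACGT"

def simulate_lineage(n_sites, n_substitutions, seed=0):
    """Apply substitutions one at a time to a random ancestral sequence.

    Returns (actual_per_site, observed_per_site): the number of substitution
    events per site that really occurred, and the fraction of sites at which
    the descendant differs from the ancestor.
    """
    rng = random.Random(seed)
    ancestor = [rng.choice(BASES) for _ in range(n_sites)]
    descendant = list(ancestor)
    for _ in range(n_substitutions):
        site = rng.randrange(n_sites)
        # Replace the current base with one of the three other bases.
        descendant[site] = rng.choice([b for b in BASES if b != descendant[site]])
    observed = sum(a != d for a, d in zip(ancestor, descendant))
    return n_substitutions / n_sites, observed / n_sites

if __name__ == "__main__":
    for events in (50, 500, 2000):
        actual, observed = simulate_lineage(1000, events)
        print(f"actual {actual:.2f} changes/site, observed difference {observed:.3f}")
```

Because a site can differ only if at least one event hit it, the observed difference can never exceed the actual number of events per site, and the shortfall grows as divergence increases. Corrected distances are meant to recover that shortfall.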
Additive Distances
If we could determine exactly the true evolutionary distance implied by a given amount of observed sequence difference between each pair of taxa under study, these distances would have the very useful property of tree additivity (Figure 14): the evolutionary distance between each pair of taxa would be equal to the sum of the lengths of each branch lying on the path between the members of each pair. (The branch lengths also represent evolutionary distances between pairs of sequences, but at least one member of the pair is a hypothetical ancestral taxon.) Additive distances satisfy the four-point metric condition (Buneman, 1971): for any four taxa A, B, C and D,

$$d_{AB} + d_{CD} \le \max(d_{AC} + d_{BD},\; d_{AD} + d_{BC}) \qquad (10)$$

where $d_{ij}$ is the distance between taxa $i$ and $j$, and "max" is the maximum value function. Conceptually, this simply means that of the three sums of distances $d_{ij} + d_{kl}$ where $i \ne j \ne k \ne l$, one of these must be as small or smaller than the other two, and these other two must be equal. For example, in Figure 14A:

$$d_{AB} + d_{CD} = v_1 + v_2 + v_4 + v_5$$
$$d_{AC} + d_{BD} = (v_1 + v_3 + v_4) + (v_2 + v_3 + v_5) = v_1 + v_2 + v_4 + v_5 + 2v_3$$
$$d_{AD} + d_{BC} = (v_1 + v_3 + v_5) + (v_2 + v_3 + v_4) = v_1 + v_2 + v_4 + v_5 + 2v_3$$

Figure 14 Additive and ultrametric trees. (A) An additive tree relating four taxa, A, B, C, and D. It also lists the relationships between the six taxon-to-taxon distances ($d_{AB}$ through $d_{CD}$) and the five branch lengths ($v_1$ through $v_5$). Additive properties: $d_{AB} = v_1 + v_2$; $d_{AC} = v_1 + v_3 + v_4$; $d_{AD} = v_1 + v_3 + v_5$; $d_{BC} = v_2 + v_3 + v_4$; $d_{BD} = v_2 + v_3 + v_5$; $d_{CD} = v_4 + v_5$. Additive distances and trees do not make any assumption about the rooting; hence the relationships are displayed in an unrooted format. All sets of pairwise distances that satisfy the four-point condition (see text) can be represented as a unique additive tree. (B) An ultrametric tree relating three taxa, A, B, and C. Additive properties: $d_{AB} = v_1 + v_2 + v_3$; $d_{AC} = v_1 + v_2 + v_4$; $d_{BC} = v_3 + v_4$. Ultrametric properties: $v_3 = v_4$; $v_1 = v_2 + v_3 = v_2 + v_4$. In addition to having additive properties (all taxon-to-taxon distances are the total of the branch lengths joining them), every common ancestor is equidistant from all its descendants. Thus, the most recent common ancestor of B and C is $v_3$ from B and $v_4$ from C, therefore $v_3 = v_4$. Likewise, the common ancestor of A and B is $v_1$ from A and $v_2 + v_3$ from B, therefore $v_1 = v_2 + v_3$.

Tree-additive distances can be fitted to an unrooted tree such that all pairwise distances are equal to the sum of the lengths of the branches along the path connecting the corresponding taxa (Figure 14A). Unfortunately, due to the finite amount of available data, stochastic (random) errors will cause deviation of the estimated evolutionary distances from perfect tree additivity even
when evolution proceeds exactly according to the model used for distance correction. Many methods have been described that derive a tree and an associated set of branch lengths that comes closest (in some sense) to being additive for a matrix of pairwise distances. These methods typically, but not always, attempt to optimize an objective function that quantifies the degree of "distortion" between the path length and observed distances. The original descriptions of these methods often confound the choice of an optimality criterion with the algorithms used to select an optimal tree, but we will separate these two components, deferring the latter to the "Searching for Optimal Trees" section.

Additive-Tree Methods
A complete record of all genetic events would constitute a set of perfectly additive distances. We will treat the experimentally derived distances, which estimate the (unknown) number of genetic events that have actually occurred from the number of differences actually observed between each pair of taxa, as approximations of this ideal. To emphasize the uncertainty in the values, we will call them distance estimates. We can now address the problem of choosing a tree from the following conceptual perspective: We have uncertain data that we want to fit to a particular mathematical model (an additive tree) and find the optimal value for the adjustable parameters (the branching pattern and the branch lengths). The fit of a tree to the distance estimates can be measured by

$$E = \sum_{i<j} w_{ij}\,\lvert d_{ij} - p_{ij}\rvert^{\alpha} \qquad (11)$$

where $w_{ij}$ is the weight applied to the comparison of taxa $i$ and $j$, $d_{ij}$ is the distance estimate, $p_{ij}$ is the length of the path connecting $i$ and $j$ on the tree, the vertical bars represent the absolute value, and $\alpha$ = 1 or 2. A value of $\alpha$ and a weighting scheme must be chosen.

Setting $\alpha$ to 2 represents a weighted least-squares criterion; the weighted squared deviation of the path-length distances from the distance estimates will be minimized. If $\alpha$ = 1, then the weighted absolute differences will be minimized. If the errors in the distance estimates are distributed uniformly across the data, then the least-squares criterion is preferred. If some estimates are apt to be particularly bad, there are two considerations. First, if the identities of the least certain estimates are known, this knowledge can be accommodated in the least-squares method by assigning particularly low weights to these uncertain values. If, however, it is not known a priori which estimates are apt to be erroneous, then using the minimum absolute deviations will reduce the overall perturbation caused by spurious data values. This last condition might pertain to direct experimental determinations of the distance data, a situation in which unrecognized experimental artifacts could substantially flaw some values.

The four most commonly used weighting schemes are:
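The displayed list of weighting schemes is not preserved in this text. As a hedged reconstruction, the discussion that follows it is consistent with four standard choices: equal weights, weights inversely proportional to the distance estimate or to its square, and inverse-variance weights. Only the label (12d) for the last form is confirmed by later references to it; the order and numbering of the other three are assumptions.

```latex
w_{ij} = 1, \qquad
w_{ij} = \frac{1}{d_{ij}}, \qquad
w_{ij} = \frac{1}{d_{ij}^{2}}, \qquad
w_{ij} = \frac{1}{\sigma_{ij}^{2}} \quad \text{(12d)}
```

Here $\sigma_{ij}^{2}$ denotes the expected variance of the distance estimate $d_{ij}$. Weights of $1/d_{ij}$ correspond to uncertainties proportional to the square root of the distance, and weights of $1/d_{ij}^{2}$ to uncertainties proportional to the distance itself.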
viewed as a compromise that assumes the uncertainties are proportional to the square roots of the values (Felsenstein, 1993). Note that missing data can correctly be handled by setting the corresponding weight to zero; that is, if $d_{ij}$ is unknown, setting $w_{ij} = 0$ will cause this observation to be ignored (although most currently available software does not allow specification of individual pairwise weights).

If there is a rational method for estimating $\sigma_{ij}^{2}$, then use of equation (12d) is preferable. Theoretical variance formulas are available for most of the model-based distances described below (although space limitations preclude their inclusion here, they are available in the original references). These theoretical variances can be used for DNA and protein sequence data, restriction site data, and gene frequency data. An important property of these formulas is that they explicitly state the dependence of uncertainty on the amount of data; e.g., for sequence-based distances, the variance is inversely proportional to the sequence length $N$. A problem, however, is that if two sequences are identical, the estimated uncertainty will be zero, which causes equation (12d) to be undefined and would be a questionable conclusion in any case. A practical treatment is to assume that the minimum measurable dissimilarity is one-half of a substitution, yielding (approximately) $1/(2N^{2})$ as a minimum value to be imposed on the estimated variance.

For other kinds of data, including indirect methods such as DNA hybridization or immunological distances, random errors can be estimated by comparing replicate experiments or using reciprocal comparisons (where appropriate; see Chapter 6). These concepts are discussed in the corresponding experimental chapters.

For an unrooted tree of $T$ taxa, there are $2T - 3$ independent branches that define the $p_{ij}$ values, and there are $T(T - 1)/2$ distinct pairwise distances. To represent mathematically the relationships between the branch lengths, $v_k$, and the path lengths between pairs of taxa, we need an appropriate representation of the tree topology. Let $\mathbf{A}$ be a matrix of $T(T - 1)/2$ rows and $2T - 3$ columns such that the element $A_{(ij)k}$ is equal to 1 if branch $k$ is part of the path connecting taxon $i$ to taxon $j$, and is otherwise equal to 0. With this definition it follows that

$$p_{ij} = \sum_{k=1}^{2T-3} A_{(ij)k}\, v_k$$

Thus, a system of equations such as that of Figure 14A can be represented in matrix notation as

$$\begin{pmatrix} p_{AB} \\ p_{AC} \\ p_{AD} \\ p_{BC} \\ p_{BD} \\ p_{CD} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix}, \quad \text{that is,} \quad \mathbf{p} = \mathbf{A}\mathbf{v} \qquad (13)$$

If the distances were additive, then $p_{ij} = d_{ij}$ for all $(i, j)$ pairs, and we could solve (13) directly. In general, however, due to the imperfect additivity of the distances, we must use (13) to eliminate $p_{ij}$ from (11) and seek a solution to the $v_k$ that minimizes $E$. This minimization can be accomplished using special-purpose linear or quadratic programming algorithms (e.g., Barrodale and Roberts, 1973), by iterative successive refinement techniques ("alternating least-squares;" Felsenstein, 1993), or, when $\alpha = 2$ and $w_{ij} = 1$, by using ordinary linear algebra (e.g., Cavalli-Sforza and Edwards, 1967; Kidd and Sgaramella-Zonta, 1971; G.J. Olsen, 1988) using the equation:

$$\mathbf{v} = (\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T}\mathbf{d} \qquad (14)$$

For weighted least-squares criteria like that of Fitch and Margoliash (1967), the linear algebraic solution is

$$\mathbf{v} = (\mathbf{A}^{T}\mathbf{W}\mathbf{A})^{-1}\mathbf{A}^{T}\mathbf{W}\mathbf{d} \qquad (15)$$
where $\mathbf{W}$ is a $T(T - 1)/2 \times T(T - 1)/2$ matrix with diagonal elements equal to the weights associated with each pairwise comparison and all off-diagonal elements equal to 0.

The methods in the previous paragraph fit the data to a specific tree topology, and thus assume that an appropriate search strategy will be used to find the best topology. In an alternative approach described by De Soete (1983a,b), the values of $p_{ij}$ are initially set to the observed distances ($d_{ij}$), and then they are gradually adjusted by an optimization regimen that keeps them at a local minimum of equation (11), while improving their fit to inequality (10) for all sets of four taxa. At the end of the process, all sets of $p_{ij}$ satisfy inequality (10), so they will perfectly fit some additive tree, and they are at a minimum of equation (11).

A problem that sometimes arises with the above methods is that full minimization of equation (11) requires that some of the $v_k$ be negative. A negative branch length does not correspond to any meaningful biological process and should probably be avoided (e.g., Kidd and Sgaramella-Zonta, 1971). Allowing branches to have negative values when $E$ is evaluated is probably inappropriate because some highly suboptimal trees can use negative values to produce a low apparent error. Several methods for dealing with negative branch lengths have been proposed. Some authors (e.g., Cavalli-Sforza and Edwards, 1967; Kidd and Sgaramella-Zonta, 1971) have favored outright rejection of any tree that requires a negative optimal value for any branch. This extreme approach runs the risk of rejecting the correct tree in certain realistic situations. An alternative strategy (Felsenstein, 1993) is to constrain the optimization process so that negative branch lengths are disallowed; a solution that optimizes $E$ under the constraint that all branch lengths be non-negative is obtained. If (14) or (15) is used to determine least-squares branch lengths, the only alternative is simply to set any negative branch lengths to zero and then calculate $E$ without readjusting the other branches. This method gives exact values of $E$ for trees that have no negative branch lengths and overestimates the value of $E$ otherwise. The amount of overestimation is small as long as there are no large negative branch lengths.

Table 1 summarizes the results of a least-
Table 1. Optimal 5S rRNA tree by weighted least-squares criterion
Sequence pair | Estimated distance | Expected distance | Distance difference | Expected uncertainty | Error contribution
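The additive-tree machinery described above can be sketched in a few lines for the four-taxon tree of Figure 14A. This is an illustration only, not the procedure of any particular program: it builds the path matrix from the relationships listed in Figure 14A, checks the four-point condition, and obtains least-squares branch lengths by solving the normal equations, which is the standard linear-algebra solution for α = 2. The function names, the example branch lengths, and the small Gaussian-elimination helper are all assumptions made for this sketch.

```python
# Rows: pairs AB, AC, AD, BC, BD, CD. Columns: branches v1..v5 of Figure 14A.
A = [
    [1, 1, 0, 0, 0],  # d_AB = v1 + v2
    [1, 0, 1, 1, 0],  # d_AC = v1 + v3 + v4
    [1, 0, 1, 0, 1],  # d_AD = v1 + v3 + v5
    [0, 1, 1, 1, 0],  # d_BC = v2 + v3 + v4
    [0, 1, 1, 0, 1],  # d_BD = v2 + v3 + v5
    [0, 0, 0, 1, 1],  # d_CD = v4 + v5
]

def path_lengths(v):
    """Path-length distances p = A v implied by branch lengths v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def four_point_ok(d, tol=1e-9):
    """Four-point condition: the two largest of the three sums are equal."""
    dAB, dAC, dAD, dBC, dBD, dCD = d
    sums = sorted([dAB + dCD, dAC + dBD, dAD + dBC])
    return abs(sums[2] - sums[1]) <= tol

def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(m)
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def least_squares(d, w=None):
    """Branch lengths minimizing sum of w * (d - p)^2 via (A^T W A) v = A^T W d."""
    w = w or [1.0] * len(d)
    rows, cols = range(len(A)), range(len(A[0]))
    normal = [[sum(w[r] * A[r][k] * A[r][l] for r in rows) for l in cols] for k in cols]
    rhs = [sum(w[r] * A[r][k] * d[r] for r in rows) for k in cols]
    return solve(normal, rhs)

def error(d, v, w=None, alpha=2):
    """Weighted misfit between distance estimates d and path lengths A v."""
    w = w or [1.0] * len(d)
    return sum(wi * abs(di - pi) ** alpha for wi, di, pi in zip(w, d, path_lengths(v)))
```

With exactly additive distances the recovered branch lengths match the ones used to generate them and the error is zero; with perturbed distances the four-point check fails and a positive error remains. A fitted branch length can come out negative for poorly fitting data, which is the situation the text discusses.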
length criterion consistently outperformed least-squares criteria based on (11) (Kidd and Cavalli-Sforza, 1971). Apparently unaware of this work, Rzhetsky and Nei (1992a) described a method based on essentially the same criterion, calling it the minimum evolution method:

$$L = \sum_{k=1}^{2T-3} v_k \qquad (17)$$

The only difference between the two methods is that Rzhetsky and Nei drop the absolute values in equation (16), which has the seemingly undesirable property of allowing negative branch lengths to improve the apparent goodness-of-fit of the tree. In practice, however, the two methods are little different, because the branch lengths are usually non-negative (or very close to zero if negative) (Swofford, unpublished observations) on trees scoring well according to equation (16). The choice of the name "minimum evolution" is unfortunate, as the same name had been used earlier for a quite different method (Cavalli-Sforza and Edwards, 1967; Thompson, 1973). Because the earlier method was never widely used and the Rzhetsky-Nei method is becoming very popular, it seems best to refer to the methods defined by equations (16) and (17) as the minimum evolution (ME) method.

Rzhetsky and Nei (1992b) have provided a theoretical argument for the superiority of the ME method over the Fitch-Margoliash (FM) and related methods due to a bias in the latter methods when the variance of the estimated distances is high (e.g., due to large differences between short sequences). Although their computer simulations appeared to reinforce this conclusion, the actual reason for the better performance of ME is unclear, as the bias quickly becomes inconsequential as sequence length increases. It seems more plausible that the enhanced ME performance is due to a reduced impact of negative branch lengths in the ME method. Kidd et al. (1974) reported that if trees containing negative branch lengths are automatically rejected, the ME and FM methods give essentially identical results. Felsenstein and Kuhner's (1994) simulations also demonstrated a striking improvement in the performance of the FM method when branch lengths were constrained to be non-negative; in their study the performance of the FM method slightly surpassed an approximate method closely related to ME (the neighbor-joining method; see below), but only if negative branch lengths were disallowed.

Ultrametric Distances
Ultrametric distances are more constrained than tree-additive distances. Mathematically, ultrametric distances are defined by satisfaction of the three-point condition, which requires that for any three taxa A, B, and C,

$$d_{AC} \le \max(d_{AB},\; d_{BC}) \qquad (18)$$

This inequality simply states that two of the three pairwise distances between three taxa are equal and at least as large as the third. Phylogenetically, ultrametric distances will precisely fit a tree so that the distance between any two taxa is equal to the sum of the branches joining them, and the tree can be rooted so that all of the taxa are equidistant from the root (Figure 14B). The first half of this description defines an additive tree (and implies that ultrametric distances are additive). The second half of the description corresponds to the concept of a molecular clock that runs at the same rate in all lineages at any given moment. Two potential surprises may emerge, however. First, even with ultrametric data, there is no guarantee that the amount of divergence is linear in time. In particular, superimposed sequence changes, which decrease the observed molecular divergence, do not destroy the ultrametric property. Second, obtaining ultrametric data is extremely unlikely; even if the underlying substitution rate is perfectly constant, any finite sample will yield statistical fluctuations in the measured divergences. Consequently, even a universal substitution rate would not give ultrametric data without an infinitely large sample. The closest experimental approximations of infinite samples are genome hybridization measurements (Chapter 6), although measurement errors limit the effective amount of data (Felsenstein, 1987).

If data are nearly ultrametric by equation (18),
which is rarely the case, methods that assume a molecular clock can be more efficient (require less data to achieve the same probability of inferring the correct tree). Felsenstein's (1993) KITSCH program uses the same criterion as equation (11) (with $\alpha$ = 2), but constrains the lengths of the branches so that the total length from the root of the tree to each terminal taxon is the same. Cluster analysis methods (described below) are also appropriate under the assumption of a molecular clock, and are very fast. Colless (1970) provided a precise definition of how much deviation from ultrametricity can be tolerated without causing the estimation of the tree to become inconsistent. However, there is little practical reason to use cluster analysis because related methods such as neighbor joining are applicable to more general additive distances, require very little additional computation, and are often more efficient in simulation studies under a molecular clock model (Sourdis and Krimbas, 1987; Charleston, 1994) unless rates of substitution are high.

Distance Transformations for Sequence Data
MEASUREMENT OF SEQUENCE DISSIMILARITY By far the most common method of summarizing the relationship between two sequences is by their fractional (or percentage) similarity or dissimilarity. In its simplest form, the sequence dissimilarity is equal to the number of aligned sequence positions containing non-identical residues (bases or amino acids) divided by the number of sequence positions compared (in mathematics this distance is called the Hamming distance). However, we must explicitly address several subtleties and potential ambiguities: alternatives to limiting the comparison to identical residues; terminal length variation of molecules; alignment gaps; and treatment of ambiguities. The following sections assume that the sequence alignment has already been defined (see "Sequence Data" in the section "Types of Data" above, and Chapter 9).

It is frequently of interest to define the similarity of two molecules in terms of a more relaxed criterion than the fraction of identical residues, thereby changing our definition of sequence dissimilarity to the number of aligned sequence positions containing "non-synonymous" residues divided by the number of sequence positions compared. For example, "conservative substitutions" are commonly ignored when comparing proteins by pooling the amino acids into six groups: acidic (D, E), aromatic (F, W, Y), basic (H, K, R), cysteine, non-polar (A, G, I, L, P, V), and polar (M, N, Q, S, T). Residues within each group are considered synonymous; residues in different groups are considered non-synonymous.

As discussed above, if the evolution of a gene includes insertions and/or deletions, then gaps must be inserted to adjust for the internal length changes when aligning the contemporary sequences. Although the character state "gap" is sometimes treated as a fifth base or twenty-first amino acid, the processes responsible for base substitution and for insertion and deletion are evolutionarily and mechanistically distinct. Because a proper treatment is not obvious, sequence positions with gaps are usually omitted from analyses in one of two ways (e.g., Kumar et al., 1993; Swofford, 1996). The first (pairwise deletion) omits sites in which one or both sequences have a gap for each affected comparison. This option is appropriate when gaps are short and distributed approximately at random (Kumar et al., 1993). A second treatment (complete deletion) deletes a site from all pairwise comparisons if any of the sequences in the data set have a gap at that site. Although the complete deletion method discards more information, it may be more appropriate when some regions of a sequence (e.g., more rapidly changing regions) are more prone to insertion/deletion events than others, in which case pairwise deletion could introduce a bias. Alignment gaps are usually positioned to maximize the alignment of identical residues in sequences. Thus, additional insertion/deletion events could systematically raise the apparent similarity. Once again we emphasize that regions of the sequence alignment that contain substantial numbers of alignment gaps should be omitted from the analysis; positional homology is too uncertain for reliable estimates to be made from these regions.
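The two gap treatments described above (pairwise deletion and complete deletion) can be compared on a toy alignment. The sketch below is illustrative only: it assumes gaps are coded as "-", that all sequences have equal length, and that dissimilarity is the plain fraction of differing sites. The function names and the example alignment are hypothetical.

```python
def dissimilarity(x, y):
    """Fraction of differing sites, skipping sites gapped in either sequence."""
    pairs = [(a, b) for a, b in zip(x, y) if a != "-" and b != "-"]
    return sum(a != b for a, b in pairs) / len(pairs)

def pairwise_deletion(alignment):
    """Each comparison drops only the sites gapped in that particular pair."""
    names = list(alignment)
    return {(m, n): dissimilarity(alignment[m], alignment[n])
            for i, m in enumerate(names) for n in names[i + 1:]}

def complete_deletion(alignment):
    """A site gapped in any sequence is removed from every comparison."""
    seqs = list(alignment.values())
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    stripped = {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
    return pairwise_deletion(stripped)

EXAMPLE = {
    "X": "ACGTACGTAC",
    "Y": "ACGTTCGTAC",
    "Z": "AC--ACGAAC",
}

if __name__ == "__main__":
    print(pairwise_deletion(EXAMPLE))
    print(complete_deletion(EXAMPLE))
```

In this example the X-Y comparison uses all ten sites under pairwise deletion but only eight under complete deletion, so its dissimilarity changes even though neither X nor Y contains a gap.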
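where $n_{ij}$ is the number of times sequence X has state $i$ aligned next to state $j$ in sequence Y, and $N = \sum n_{ij}$. Let us represent this matrix as

$$F_{xy} = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & o & p \end{pmatrix}$$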
identity) in the sequences being compared. For example, counting a purine (R) as synonymous with A and G and non-synonymous with C and T will tend to overestimate the similarity between the affected sequence comparisons. One approach (Swofford, 1996) is to distribute differences between sites with ambiguities based on the frequencies of differences at unambiguous sites. For instance, suppose that a site has an A in sequence X and an R in sequence Y. If for this comparison there are 450 sites that have an A in both sequences, and 50 sites that have an A in one sequence and a G in the other, then the site would contribute 450/500 = 0.9 to the value of $(F_{xy})_{AA}$, and 0.1 to $(F_{xy})_{AG}$. Maximum likelihood distances (see below) can deal with the ambiguities directly (e.g., Felsenstein, 1993) by considering the likelihood of each possible resolution of the ambiguity.
The uncorrected distance, often referred to as the dissimilarity (D) or p-distance (e.g., Kumar et al., 1993), is simply the total number of differences divided by the total number of available sites:

$$\text{p-distance:} \quad d_{xy} = b + c + d + e + g + h + i + j + l + m + n + o = 1 - (a + f + k + p)$$
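As a rough illustration of the p-distance, the sketch below tabulates the 4 × 4 matrix of aligned-state proportions for two sequences and takes one minus the sum of its diagonal. It also applies the standard Jukes-Cantor correction, d = -(3/4) ln(1 - 4D/3), which is not shown in this excerpt and is included here as an assumption about the formula that the surrounding remarks refer to. The function names and example sequences are hypothetical, and sites with anything other than A, C, G, or T are simply skipped.

```python
import math

BASES = "ACGT"

def divergence_matrix(x, y):
    """Proportion of sites with state i in x aligned to state j in y."""
    counts = {(i, j): 0 for i in BASES for j in BASES}
    for a, b in zip(x, y):
        if a in BASES and b in BASES:  # skip gaps and ambiguity codes
            counts[(a, b)] += 1
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no comparable sites")
    return {pair: c / n for pair, c in counts.items()}

def p_distance(f):
    """One minus the sum of the diagonal (identical-state) proportions."""
    return 1.0 - sum(f[(b, b)] for b in BASES)

def jc_distance(p):
    """Jukes-Cantor corrected distance, or None where it is undefined."""
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:  # logarithm undefined once p reaches 0.75
        return None
    return -0.75 * math.log(arg)

if __name__ == "__main__":
    x = "ACGT" * 5
    y = "ACGT" * 4 + "TGCA"
    p = p_distance(divergence_matrix(x, y))
    print(round(p, 3), jc_distance(p))
```

The corrected value always exceeds the uncorrected one for p > 0, and the function returns `None` rather than raising when the dissimilarity reaches 0.75.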
Note that the maximum expected dissimilarity is 0.75; if D equals or exceeds this value, the distance becomes undefined because the argument of the logarithm is no longer positive. A distance for the model of Felsenstein (1981a), which relaxes the assumption of equal base frequencies, is given by
mates of the number of transition versus transversion substitutions and use a weighted combination of these as the estimate of the evolutionary distance (Schoniger and von Haeseler, 1993; Goldstein and Pollock, 1994; Tajima and Takezaki, 1994). These methods appear to be much more reliable for tree inference than the usual K2P distance (Pollock and Goldstein, 1994).

PROTEIN-CODING DNA SEQUENCES In principle, knowledge of the gene sequence should be more informative than the corresponding protein sequence. In practice, at least two factors call this assertion into question. First, silent substitutions in protein-coding genes are much more frequent than replacement substitutions; thus the third codon positions tend to become randomized quickly and convey very little information about distant phylogenetic relationships. Second, the base composition of the third codon position appears to vary systematically between some species, thereby indicating that it can be subject to at least a moderately strong selective force that is different in different lineages. The presence of directional selection can lead to profound sequence convergences and consequent errors in inferred relationships. With these considerations in mind, three relatively simple strategies can be used to analyze protein-coding sequences, and a host of moderately to extremely complex alternatives exists.

The simplest method of calculating distances between sequences for protein-coding genes is to apply the distance formulas above directly to the gene sequence without special treatment. This method is reasonable, or even preferred, when the total amount of divergence is very small, in which case the resulting trees are based primarily on silent substitutions in the genes. The main drawback is that a systematic undercorrection for superimposed substitutions will result, since the assumption that all positions are equally subject to change will clearly be violated. If the amount of sequence divergence is truly small, then superimposed changes will be rare and the undercorrection will be negligible.

The second approach is to restrict the analysis to the first two nucleotides of each codon. This strategy is appropriate when a substantial sequence divergence is apparent. The rationale is that the third codon position will be largely randomized and hence phylogenetically uninformative. This approach, by definition, also circumvents the problem of the third codon position changing more rapidly than the first two and reduces the degree of violation of the assumption that all sites are changing at the same rate.

The third basic method is to infer the protein sequence from the gene sequence and perform the phylogenetic analysis at the protein level. This approach has two merits: (1) the protein is the most biologically relevant aspect of the gene (taken as a whole); and (2) the sequence can be compared with homologous molecules that were sequenced at the protein level, for which nucleotide sequences are therefore unknown. In addition to the distances for the Poisson and Proportional models described above, PHYLIP (Felsenstein, 1993) provides a distance under the Dayhoff model.

The more complex methods involve estimating the numbers of synonymous (silent) and non-synonymous (replacement) substitutions separately. When the maximum divergence between taxa is low, distances based on synonymous changes may reduce the effect of among-site rate variation, as synonymous substitutions are largely neutral (Kumar et al., 1993). For more distantly related taxa, restriction to non-synonymous changes tends to minimize the impact of noise contributed by a large number of silent changes. Many methods have been proposed for estimating synonymous versus non-synonymous substitutions (W.-H. Li et al., 1985b; Nei and Gojobori, 1986; W.-H. Li, 1993b; and references cited therein). These methods differ in the details of how they deal with multiple substitution pathways when two codons are more than one substitution apart and how they account for different levels of degeneracy (e.g., a site in a sequence is twofold degenerate if one of the three possible changes is synonymous and fourfold degenerate if all possible changes at the site are synonymous).

MAXIMUM LIKELIHOOD DISTANCES The most straightforward (and computationally intensive) method for estimating evolutionary distances is
to apply maximum likelihood according to the models described under "Models of Sequence Evolution." As noted above, the "tree" in this case is a single branch, and we estimate the branch length (expected number of substitutions per site) that maximizes the probability of one sequence evolving from the other. (Because of the time-reversibility of the models, it makes no difference which sequence is considered ancestral.) Felsenstein's (1993) DNADIST program obtains maximum likelihood estimates of distance under the JC, K2P (with or without a gamma-correction for among-site rate variation), and F84 models, but the same approach easily could be adapted to accommodate other models. Many (but not all) of the distance formulas presented above are maximum likelihood estimators (e.g., see Zharkikh, 1994). However, direct use of maximum likelihood to calculate the distance has a number of advantages. Most importantly, it allows model parameters, such as the transition:transversion ratio, to be maintained at a consistent value across all pairwise comparisons (e.g., although the standard K2P distance formula is a maximum likelihood estimate when estimating the transition:transversion ratio independently for every pair, the distance must be numerically evaluated using maximum likelihood in order to use a fixed ratio as a means of reducing sampling variance). Maximum likelihood estimation also provides a very clean way of handling missing or ambiguous data, as the probability of observing each of the bases allowed by the ambiguity can be explicitly evaluated.

Although maintenance of substitution-model parameters at a consistent value is an advantage of maximum likelihood distances, it adds the burden of specifying their values. One possible way of estimating these parameters is to perform phylogenetic analyses using a range of parameter values, then choose the parameter settings that maximize the additivity of the distances on the best tree(s) found (e.g., that minimize the value of E in equation 11). Alternatively, parameters may be estimated using maximum likelihood on a few "reasonable" trees obtained using simpler distances. If the parameter estimates are reasonably similar across these trees, it is probably safe to use their mean value (or the value from the tree that had the highest likelihood) as the parameter value for calculating distances as input to a tree search using a distance criterion. This hybrid approach can be an effective compromise between a full search under the maximum likelihood criterion (which may be computationally infeasible) and an arbitrary choice of parameter values using a distance criterion.

TREATMENT OF UNDEFINED VALUES Distance values become undefined if the apparent sequence divergence exceeds the maximum possible (true) distance under the assumed model of evolution. For example, in the JC model, complete randomization of sequences would lead to D = 0.75 (i.e., even for two random sequences, one-fourth of the nucleotides are expected to be identical by chance). If the observed dissimilarity equals or exceeds 0.75 due to sampling error or violation of the model, the logarithm in equation (20) cannot be taken. In this situation, it is probably wise not to proceed without taking steps to avoid problems due to excessive saturation. If only one or two sequences are causing the problem, they can be eliminated from the analysis. If the problem is mostly due to high rates of transition-type differences, transversion-only distances (or maximum likelihood distances with a high transition:transversion ratio) can be employed. As a last resort, any undefined distances can be replaced by an arbitrarily large distance value, such as twice the maximum observed distance.

ACCOMMODATING AMONG-SITE RATE VARIATION IN DISTANCE CORRECTIONS Distance corrections that assume equal rates of change across sites will be affected by the same problem that complicates maximum likelihood analysis when among-site rate heterogeneity exists: distances will underestimate the actual number of substitutions (Golding, 1983). Fortunately, this rate heterogeneity can be accommodated without too much difficulty. For maximum likelihood distances, any of the model variations described under "Accommodating Rate Heterogeneity Across Sites" in the "Maximum Likelihood Methods" section can be applied directly. If rates are assumed to follow a gamma distribution, special modifications of the distances described
above are available for the JC and K2P models (Jin and Nei, 1990) and TrN models (Tamura and Nei, 1993). Although not noted by these authors, these "gamma" distances can be obtained from the usual distances simply by replacing the function $\ln(x)$ with $\alpha(1 - x^{-1/\alpha})$ in the original formulas, where $\alpha$ is the shape parameter of the gamma distribution (this function is the inverse of the moment generating function for the distribution). In fact, this method also works for most (if not all) of the other time-reversible distances (Waddell and Steel, 1995; Lewis and Swofford, unpublished; see Swofford, 1996). For example, the general time-reversible distance with a distribution of rates across sites can be written as $d_{xy} = -\mathrm{trace}\{\Pi M^{-1}(\Pi^{-1}F_{xy})\}$, where $M^{-1}$ is the same function used for the Jin-Nei and Tamura-Nei distances in the case of the gamma distribution, but can be the inverse of the moment-generating function for other distributions as well (Waddell and Steel, 1995). The value of $\alpha$ must be determined independently using one of the methods outlined above. Choice of an $\alpha$ value based on results from previous studies is also an option (e.g., Kumar et al., 1993), although evidence is accumulating that levels of rate heterogeneity vary widely among different genes, regions of genes, and organisms.

The invariable sites model (see above) can also be applied to distance estimation by removing a certain fraction of the constant sites from the data matrix. The easiest way to accomplish this is to subtract the constant $\phi N/4$ from the diagonal entries of $F_{xy}$ in matrix (19) (and adjusting N accordingly) before calculating the distance, where $\phi$ is the desired proportion of invariable sites and N is the total number of sites (Waddell, 1995). If base frequencies are unequal, it is preferable to subtract $\pi_k \phi N$ from the kth diagonal element of the divergence matrix, where $\pi_k$ is the frequency of base k. When base composition is not homogeneous throughout the tree, or in other situations where constant sites have a different composition than the variable sites, the base frequencies used in this correction should be estimated from the constant sites alone.

LOG-DETERMINANT DISTANCES The models described above for maximum likelihood and distance estimation assume that the substitution probability matrices remain constant throughout the tree (i.e., they are stationary) and that they have the property of time reversibility (which jointly imply that base frequencies remain at a constant, equilibrium value). The LogDet (Steel, 1994a; Lockhart et al., 1994) or paralinear distance (Lake, 1994) is a transformation that yields additive distances under a much wider set of models. Perhaps most importantly, this transformation is robust to changing base composition (e.g., GC bias) among the taxa being studied, a potential source of systematic error if stationary models are assumed. The LogDet transformation will yield an additive distance (in expectation) under any Markov model of evolution (see above) as long as sites evolve identically and independently and rates of substitution are equal across sites. This general Markov model is described by a rooted tree, where the root can have any base composition (as long as all states have a non-zero frequency). There are no constraints on the parameters in each substitution probability matrix P(t) (all 12 substitutions are free to occur at different rates), and P(t) can be different for each branch or at different points along the same branch. Each P(t) matrix implies its own set of stationary base composition values, so these are also allowed to vary throughout the tree. These assumptions correspond to those of the maximum likelihood model proposed by Barry and Hartigan (1987a).

The basic form of the log-determinant distances is

$$d_{xy} = -\ln[\det F_{xy}]$$

(Steel, 1994a), where "det" refers to the determinant* of a matrix and $F_{xy}$ is an r x r divergence matrix for sequences X and Y (e.g., equation 19).
*The definition of the determinant of a matrix is beyond the scope of this chapter. Introductions to matrix algebra can be found in many statistics texts or any linear algebra text. An excellent introduction for biologists is Bulmer (1994, p. 298 ff.).
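A log-determinant distance can be computed directly from a divergence matrix. The sketch below is illustrative only: it assumes the frequency-scaled form attributed to Lockhart et al. (1994), d = -(1/r)[ln det F - (1/2) ln det(Pi_x Pi_y)], takes the character-state frequencies from the row and column sums of F, and returns `None` when a determinant is not positive. All function names are hypothetical.

```python
import math

BASES = "ACGT"

def divergence_matrix(seq_x, seq_y, states=BASES):
    """F[i][j] = proportion of sites with state i in X and state j in Y."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {s: k for k, s in enumerate(states)}
    r = len(states)
    counts = [[0.0] * r for _ in range(r)]
    n = 0
    for a, b in zip(seq_x, seq_y):
        if a in index and b in index:        # skip gaps and ambiguity codes
            counts[index[a]][index[b]] += 1.0
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return [[v / n for v in row] for row in counts]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda i: abs(a[i][c]))
        if abs(a[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
    return det

def logdet_distance(f):
    """Frequency-scaled LogDet distance; None if a determinant is not positive."""
    r = len(f)
    pi_x = [sum(row) for row in f]                              # state frequencies in X
    pi_y = [sum(f[i][j] for i in range(r)) for j in range(r)]   # state frequencies in Y
    det_f = determinant(f)
    det_pi = math.prod(pi_x) * math.prod(pi_y)
    if det_f <= 0.0 or det_pi <= 0.0:
        return None
    return -(math.log(det_f) - 0.5 * math.log(det_pi)) / r
```

As a check on the scaling, a divergence matrix generated under the JC model with equal base frequencies gives the same value as the JC distance, and a state that is absent from both sequences makes the determinant zero, so the sketch reports the distance as undefined.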
$$d_{xy} = -\frac{1}{r}\ln\left[\frac{\det F_{xy}}{\sqrt{\det \Pi_x \Pi_y}}\right] \qquad (23)$$

where $\Pi_x$ and $\Pi_y$ are diagonal matrices of the character-state frequencies in sequences X and Y, respectively (Lockhart et al., 1994). The expected value of this distance will be equal to the mean number of substitutions per site if base frequencies are all equal, in which case

$$\ln\sqrt{\det \Pi_x \Pi_y} = -r \ln r$$

Otherwise, it will overestimate the evolutionary distance by a constant factor that becomes larger as base composition becomes more unequal (Waddell, 1995). Note that equation (23) is equivalent to Lake's (1994) paralinear distance except for the scaling by 1/r. (Lake did observe, however, that his paralinear distance was approximately equal to r times the mean number of substitutions per site.) For non-stationary models, (23) tends to overestimate the mean number of substitutions, but it can also be an underestimate, depending on the base composition at internal points of the tree. Even under non-stationary models, however, the LogDet distance often provides better estimates of the number of substitutions per site than any of the standard distance [...] frequencies, the variance of the LogDet distance (Lockhart et al., 1994) becomes equal to that calculated by the usual variance formulas. Furthermore, four-taxon computer simulations (D.L. Swofford, P.O. Lewis, and P.J. Waddell, unpublished) show that when data are simulated according to any of the models in the GTR family (Figure 11), the minimum evolution method using LogDet distances leads to recovery of the correct tree about as often as using other distance measures (including the distance specific to the simulation model) for all but very short sequences (<200 bases).

The LogDet can be applied to amino acid sequences (Lake, 1994 gave a four-taxon example), or even using each of the 61 non-stop codons as character states. The variance of the LogDet may become more of a problem in these situations, so it may be useful to group some states together (e.g., into the six main amino acid classes). Another problem is that a state may be entirely absent in one or more of the sequences. In this case, the determinants of $F_{xy}$ and of $\Pi_x$ and/or $\Pi_y$ will be zero (yielding an undefined distance when the log is taken). The best way to deal with this situation remains to be determined; possible solutions include removing the state from the $F_{xy}$ matrix altogether (if the state is absent from all of the sequences), pooling this state with another, or setting the corresponding elements of $F_{xy}$ to some small value such as 1/(2N).

Lockhart et al. (1994) found that use of LogDet distances yielded more believable trees in three examples for which nucleotide composition was variable over taxa. However, a weakness of the standard LogDet transform in real applications is that it is no more robust to unequal substitution rates at different sites than are other distance measures (Barry and Hartigan, 1987b; Lockhart et al., 1994; Lake, 1994). Lockhart et al. (1994) reported that for some data sets, reasonable trees could be obtained only after eliminating sites that were uninformative according to the parsimony criterion, and suggested that inclusion of sites that were highly unlikely to change might be the cause of the problem. Unfortunately, unlike the less general distance transformations, LogDet distances cannot be directly modified to take account of a specific distribution of rates such as the gamma distribution.

Waddell (1995) has shown that by subtracting an appropriate proportion of invariant (constant) sites from the diagonal elements of $F_{xy}$ (see "Accommodating Among-Site Rate Variation in Distance Corrections," above), LogDet distances can become nearly additive even if the true distribution of rates across sites follows a continuous distribution such as the gamma. Methods of estimating the proportion of invariable sites for maximum likelihood and other distance transformations perform well, whereas simple removal of parsimony-uninformative sites tends to be too severe. However, as base composition becomes more heterogeneous over taxa, sites with different rates of change also change base composition with respect to each other. Thus, it may be important to estimate base frequencies using only the constant sites, rather than the full data set, when calculating the proportion of sites to remove from the diagonal elements of $F_{xy}$. Removing constant sites is helpful and may adequately correct for the problem of rate heterogeneity plus shifting base composition (Waddell, 1995), but a better strategy may be to classify sites into a few distinct rate classes, apply the LogDet transform to each, and sum these separate estimates to obtain the final distance.

WHICH SEQUENCE DISTANCE TRANSFORMATION IS BEST? As the above discussion indicates, distance analysis of sequence data requires choosing a distance transformation from a rather overwhelming number of possibilities. Ideally, we would always choose the most general distance available, as this distance has the smallest chance that assumptions corresponding to particular restrictions of the underlying model will be violated. Currently, this criterion would lead to a tradeoff between the LogDet/paralinear distance (which requires special treatment if there is substantial among-site rate variation) and the GTR (general time-reversible) distance with an appropriate correction for rate heterogeneity (Waddell and Steel, 1995). However, generality often comes at the price of increased variance, and many simulation studies have indicated that simpler distances based on models that are known to be violated may nonetheless perform better for phylogenetic inference than distances based on the same model being used to generate the data (e.g., see Nei, 1991 and references cited therein). For example, when sequences are relatively short, use of simple dissimilarity (p-distance) or the JC distance can lead to correct recovery of the true tree more often than the K2P distance, even when there is a fairly strong transition/transversion bias.

It is difficult to provide simple prescriptions for the choice of a distance measure (but see Kumar et al., 1993, for one such set of recommendations). In general, we believe that additional studies will confirm preliminary simulations that indicate little variance-inflation problem with LogDet/paralinear distances when all sites evolve at the same rate (see "Log-Determinant Distances," above). Because of their generality (including their robustness to base composition biases), log-determinant distances are probably preferable to other, more restricted, distances that do not incorporate corrections for among-site rate variation. Beyond that, we offer Kumar et al.'s (1993, p. 29) rule of thumb: "As a general rule, if two distance measures give similar distance values for a set of data, use the simpler one because it has a smaller variance." Of course, the longer the sequence length, the less variance considerations
dominate the choice of a distance. With long sequences (e.g., >2000 bases), it may be more profitable to emphasize closer modeling of the substitution process than to worry too much about variance.

[...] tion to $D_N$ to alleviate the problems created by non-uniform rates of change:

[...]

within-taxon heterozygosity (S. Wright, 1978; Hillis, 1984); the distance between two taxa that are fixed for alternate alleles exceeds that between two taxa in which one or both are heteroallelic but have no alleles in common.

An alternative Euclidean measure that overcomes this limitation is the arc distance of Cavalli-Sforza and Edwards (1967), which is given by

[...]

Nei and Li's (1979) method for estimating the number of nucleotide substitutions that have occurred since divergence of a pair of taxa X and Y from a common ancestor is typically used. An estimate of the proportion of ancestral restriction sites that have remained unchanged until the present is given by
tion according to a gamma distribution (see "Accommodating Among-Site Rate Variation in Distance Corrections," above). S is used to estimate the proportion of restriction sites that have been preserved by a pair of species, and $S^{1/r}$ then represents the corresponding fraction of similarity at each of the r sites in the recognition sequence. The distance value that predicts this fraction of similar sites under the chosen model and parameter settings is then estimated by maximum likelihood.

The methods described above are appropriate when all restriction endonuclease recognition sites are the same length. For studies involving enzymes with different sizes of recognition sequences, more complicated methods developed by Nei and Tajima (1983) can be used, although we will not describe them here.

Nei and Li (1979) also addressed the problem of estimating nucleotide substitutions from restriction fragment data. However, these estimates are reliable only if the actual number of substitutions has been low (e.g., the samples are restricted to conspecific populations). Consequently, we will not describe their procedures for dealing with fragment data; the interested reader can consult their paper directly.

Immunological and Nucleic Acid Hybridization Data

When analyzing immunological measurements, it is usually assumed that, within certain limits, the measured immunological distance (ID) increases linearly with the number of amino acid differences in the proteins being compared. The constant of proportionality depends on the number of independent binding domains and on the fraction of amino acid changes that alter a domain sufficiently to inhibit antibody binding. Thus, there is significant uncertainty in the exact scaling. If we knew the scaling, we would apply a correction for superimposed amino acid replacements. This is of little practical importance, however, since the amount of divergence being measured is usually small, so any correction would also be small. We suggest equating evolutionary distance to the immunological distance; that is, assume that d = ID for each pair of proteins.

Hybridization data and their transformation to amount of difference in the DNAs are discussed extensively in Chapter 7. These data can be corrected for superimposed base changes by the methods discussed above.

Model-Based Corrections for Character Data: Hadamard Conjugation

The Hadamard conjugation, or spectral analysis (Hendy and Penny, 1993), offers another framework for taking superimposed changes into account. It will not be possible to provide a complete description and justification of this family of methods in the space available, so we will instead try to provide a clear explanation of the basic methodology. We begin by describing another model of character change introduced formally by Cavender and Felsenstein (1987). The Cavender-Felsenstein model is essentially a two-state equivalent of the Jukes-Cantor (1969) model. Each of the two states (0 and 1) is assumed to occur at equal frequency, and the probability of change from state 0 to state 1 is equal to the probability of change in the opposite direction. For example, this model might apply if we pool the purines (A and G) into one character state (0) and the pyrimidines (C and T) into another character state (1).

Revisiting the Felsenstein Zone

Consider the problem of calculating the probabilities of obtaining the various character patterns on a tree such as that shown in Figure 16A, which corresponds to one of the examples used by Felsenstein (1978a) to demonstrate the potential inconsistency of parsimony. Let $P_{ijkl}$ represent the probability of each possible pattern, where i, j, k, and l are the states (0 or 1) found in taxa 1, 2, 3, and 4, respectively. These pattern probabilities can be determined using the same system described under "Calculating the Likelihood of a Tree." As an example, let us evaluate the probability that the pattern of Figure 16B (0011) will evolve under the conditions of the Cavender-Felsenstein model. We first note that because of the time-reversibility assumption, we can re-root the tree at
an arbitrary internal node (Figure 16C), and then sum the probabilities of each of the four configurations of states at the two internal nodes (Figure 16D). That is, for each scenario, we multiply the prior probability of the basal state (= 1/2 in this case) times the product of the probabilities of the various changes (or non-changes) implied by each reconstruction. Because of the symmetry of the branch lengths used here, the probability ($P_{1100}$) of the other pattern that supports the tree of Figure 16A is equal to $P_{0011}$. Thus, the probability of a character pattern evolving that supports the true tree is

[...]

where x is the probability of a character-state change along the "long" branches and y is the corresponding probability for the "short" branches. Equation (26) is equivalent to one given by Felsenstein (1978a). A similar derivation reveals that the probability of a pattern evolving that supports the tree grouping taxa 1+3 and 2+4 is

[...]

Felsenstein (1978a) used these results to show that for many values of x and y (x > y), the probability of evolving character patterns that favor an incorrect tree exceeds that of patterns supporting the

Figure 16 Calculation of the probability of observing a given pattern of character states on a tree. (A) An unrooted tree for four taxa with probabilities of character differences x or y along each branch. (B) Tips of tree labeled by character states in the pattern of interest. (C) Tree re-rooted at an arbitrary internal node. (D) Calculation of the probability of the pattern shown in (B). (E) Calculation of expected proportion of characters that favor tree (A). (F) Calculation of expected proportion of characters that favor the tree grouping taxa 1 and 3.
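The summation over internal-node states described for Figure 16 can be sketched as a brute-force enumeration. Several things here are assumptions rather than statements from the text: taxa 1 and 3 are placed on the long branches (change probability x) and taxa 2 and 4 and the internal branch on short ones (y), and the values x = 0.30 and y = 0.06 are chosen because they reproduce the "78 versus 48" counts per 1000 characters quoted in the chapter. The function names are illustrative.

```python
from itertools import product

def edge_probability(p_change, parent, child):
    """Probability of the child state given the parent state on one branch."""
    return p_change if parent != child else 1.0 - p_change

def pattern_probability(pattern, x, y):
    """Probability of a 0/1 pattern for taxa (1, 2, 3, 4) on the tree ((1,2),(3,4)).

    Assumed layout: taxa 1 and 3 have long branches (x); taxa 2 and 4 and the
    internal branch are short (y).
    """
    t1, t2, t3, t4 = pattern
    total = 0.0
    # u joins taxa 1 and 2 and serves as the root; v joins taxa 3 and 4.
    for u, v in product((0, 1), repeat=2):
        total += (0.5                          # prior probability of the root state
                  * edge_probability(x, u, t1)
                  * edge_probability(y, u, t2)
                  * edge_probability(y, u, v)
                  * edge_probability(x, v, t3)
                  * edge_probability(y, v, t4))
    return total

x, y = 0.30, 0.06   # assumed values (see the note above)
support_true = (pattern_probability((0, 0, 1, 1), x, y)
                + pattern_probability((1, 1, 0, 0), x, y))
support_13 = (pattern_probability((0, 1, 0, 1), x, y)
              + pattern_probability((1, 0, 1, 0), x, y))
print(round(1000 * support_true), round(1000 * support_13))   # prints: 48 78
```

Under these assumed values the patterns grouping the two long-branched taxa are expected more often than those supporting the true tree, which is the inconsistency the passage describes.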
Thus, a sample of 1000 characters will, on average, contain 30 more characters favoring an incorrect tree than the true tree (78 versus 48).

The tedious strategy outlined in the above paragraph could be used to calculate the probabilities of any of the $2^4 = 16$ possible character patterns for the four terminal taxa. Furthermore, it could in principle be generalized to trees of any size. But as there are $2^T$ distinct character patterns and $2^{T-2}$ ways of generating each, this algebraic approach quickly becomes unmanageable.

[...] above, m = 8, and the corresponding Hadamard matrix is

[...]

Figure 17 Definition of Hadamard matrices. A Hadamard matrix H is a square matrix whose entries are all 1 or -1, and with every row (and column) orthogonal to every other row (and column). (A) Basic form of a Hadamard matrix, and recursive formula for generating the next larger matrix. (B) Example calculation of a matrix with four rows and columns from the previous matrix with two rows and columns.
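A recursive construction of the kind described in the Figure 17 caption can be sketched as follows. The figure itself is not reproduced here, so the sketch assumes the standard Sylvester recursion, in which each larger matrix is built from four copies of the previous one with the lower-right copy negated; the function name is illustrative.

```python
def hadamard(m):
    """Return an m x m Hadamard matrix for m a power of two (Sylvester recursion).

    Starts from [[1]] and repeatedly forms [[H, H], [H, -H]].
    """
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    h = [[1]]
    while len(h) < m:
        top = [row + row for row in h]                      # [H, H]
        bottom = [row + [-v for v in row] for row in h]     # [H, -H]
        h = top + bottom
    return h

if __name__ == "__main__":
    for row in hadamard(8):
        print(" ".join(f"{v:2d}" for v in row))
```

Because distinct rows are orthogonal and every row has squared length m, the product of this matrix with itself is m times the identity, so its inverse is simply the matrix divided by m.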
*This section assumes some familiarity with matrix algebra; see many statistics texts or any linear algebra text for introductions. Bulmer (1994, p. 293 ff.) provides an accessible overview for biologists. For now, note that the product of a matrix A and a vector b, denoted Ab, can be obtained as in the following example:
[Figure 18 (fragment): index and bipartition. 0: {}, {1,2,3,4}; 1: {1}, {2,3,4}; 2: {2}, {1,3,4}; ...]
where $q_i$ is the expected number of changes per site along branch i. Note that this is just a special case of the general Poisson-correction formula (21) with $b = 1/2$.

Define the branch-length spectrum q(T) as

$$q(T) = (-1.108040, \ldots)'$$

Now let s(T) be the expected sequence spectrum, a vector where each element $s_k$ is the predicted proportion of the characters supporting each possible bipartition of the taxa (division into two subsets; see Figure 18 for how bipartitions are indexed). For example, $s_3$ is equivalent to Felsenstein's (1978a) $P_{0011} + P_{1100}$, and $s_5$ is equivalent to $P_{0101} + P_{1010}$. The values of s(T) can be obtained using the following Hadamard conjugation.
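A conjugation of this kind can be sketched in a few lines. The displayed equation did not survive in this excerpt, so the sketch assumes the usual Hendy-Penny form s = H^{-1} exp(H q), with H^{-1} = H/m and the exponential applied element by element. It also assumes that bipartitions are indexed by binary codes over taxa 1 to 3 (so index 3 is {1,2} and index 5 is {1,3}), that taxa 1 and 3 carry the long branches, and that x = 0.30 and y = 0.06; these values are inferred because they reproduce the -1.108040 entry and the 48 and 78 counts quoted in the text.

```python
import math

def hadamard_entry(a, b):
    """Entry (a, b) of the Sylvester-type Hadamard matrix: (-1)^popcount(a & b)."""
    return -1 if bin(a & b).count("1") % 2 else 1

def expected_spectrum(q):
    """Assumed conjugation s = H^-1 exp(H q), with H^-1 = H / m."""
    m = len(q)
    rho = [sum(hadamard_entry(a, b) * q[b] for b in range(m)) for a in range(m)]
    r = [math.exp(v) for v in rho]                 # element-wise exponential
    return [sum(hadamard_entry(a, b) * r[b] for b in range(m)) / m for a in range(m)]

def branch_length(p):
    """Expected changes per site on a branch with change probability p (two states)."""
    return -0.5 * math.log(1.0 - 2.0 * p)

x, y = 0.30, 0.06            # assumed values (see the note above)
q = [0.0] * 8
q[1] = branch_length(x)      # pendant branch of taxon 1
q[2] = branch_length(y)      # pendant branch of taxon 2
q[4] = branch_length(x)      # pendant branch of taxon 3
q[7] = branch_length(y)      # pendant branch of taxon 4, coded as {1,2,3}
q[3] = branch_length(y)      # internal branch separating {1,2} from {3,4}
q[0] = -sum(q[1:])           # first entry is minus the total tree length

s = expected_spectrum(q)
print(round(q[0], 5), round(1000 * s[3]), round(1000 * s[5]))
```

The entries s[3] and s[5] agree with the values obtained by summing over internal-node states, and the whole spectrum sums to 1 because the first entry of q is minus the total tree length.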
(The simple form of the inverse of a Hadamard matrix, shown above, is an important advantage of the method.) For our example,

[...]

The probabilities corresponding to $P_{1100} + P_{0011}$ and $P_{0101} + P_{1010}$ ($s_3$ and $s_5$, respectively) correspond exactly to those calculated algebraically in the preceding section. Hadamard conjugation has strong advantages over such algebraic calculations
The ŝ vector below shows the results of a random sample of 1000 characters (using the pseudorandom number generator in Mathematica®) according to the expected sequence spectrum in equation (33).

[...]

CHOOSING A TREE With real data sets, the picture is seldom as clear as the above section suggests, and we must use one of several methods to choose an optimal tree based on the transformed data represented by the γ̂ vector. The closest tree procedure (Hendy, 1991) is one commonly recommended method. For a given tree T containing K branches, it is straightforward to find a vector q(T) that minimizes the Euclidean distance from q(T) to γ̂. The squared distance can be obtained [without the need to form q(T) explicitly] using the formula
[...] of γ̂. The elements of γ̂ are used as character weights, and a minimum-length tree under the weighted parsimony criterion is sought. As noted above, some elements of γ̂ may be negative due to lack of model fit or sampling error; these values are typically set to 0 before proceeding. Corrected parsimony is always consistent under the Cavender-Felsenstein model (Steel et al., 1993a), unlike standard parsimony. Corrected parsimony chooses the correct tree in our four-taxon example, because the weight of character patterns supporting the true tree (γ̂_3) is greater than that of character patterns favoring alternative trees (γ̂_5 and γ̂_6). The simulation studies of Charleston (1994) suggest that corrected parsimony can be highly effective in some situations, and in general tends to outperform the closest tree and other methods described below.

An analogous method of corrected character compatibility also can be employed. This method searches for the largest weighted clique for the same data matrix and weights used for corrected parsimony. A clique is simply a set of mutually compatible characters that can all fit on the same evolutionary tree without homoplasy (e.g., Le Quesne, 1982; Estabrook, 1983). Standard graph theory algorithms exist for exact solution of the weighted clique problem (e.g., Bron and Kerbosch, 1973).

A final method is actually a hybrid of the closest tree and character compatibility approaches. Remember that when evolution proceeds exactly according to the model and there is no sampling error, 2T - 3 of the elements in γ will be positive; the remainder (except for γ_0) will equal 0. Thus, for any particular tree, the squared deviations from 0 of the elements of γ̂ that correspond to bipartitions not found on the tree is a least-squares measure of the lack of fit:

[...]

Note that equation (39) is equal to the first term on the right-hand side of (38). The second term in (38), although different for each tree, appears not to contribute greatly to the discrimination among trees (Waddell, 1995), and dropping it from the optimality criterion allows us to use character compatibility methods to minimize (39). Specifically, after setting any negative values in γ̂ to 0, we square each element and find a maximum weighted clique; solution of this problem is then equivalent to minimizing the sum of squared deviations for the excluded partitions from their expected value of 0. This method seems especially promising when each γ̂_i is divided by its estimated sampling error before proceeding (yielding a rescaled vector), which gives a form of weighted least-squares tree selection (Waddell, 1995).

DATA EXPLORATION Apart from their use in estimating trees, spectral analysis methods are useful as aids in understanding the peculiarities of particular data sets. Strong contradictory signals in the γ̂ vector allow the data to reject the model, and we should explore the reasons that the correct data are not treelike if this occurs. Lack of fit to a tree may indicate that our model is too simple (e.g., we are not accounting adequately for rate heterogeneity across sites, or the substitution model is too restrictive). Alternatively, there may be multiple signals due to recombination or to non-independence among sites.

It is helpful to plot the inferred branch lengths (γ̂ values) divided by their estimated standard errors to see how much statistical support the "signals" really have (Waddell et al., 1994; Waddell, 1995). Another useful way of viewing the corrected sequences is to plot the magnitude of each signal in the conjugate spectrum against the sum of its pairwise incompatibilities with all other sequence patterns (a support/conflict spectrum; see Lento et al., 1995). These graphical representations of noise in the data set allow exploration of the factors responsible for conflicts in different regions of the tree and suggest which hypotheses of relationship should be subjected to further scrutiny. The paper by Lento et al. (1995) provides good examples of this approach.
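The clique-based selection step described above (set negative entries to 0, square, then find a maximum weighted clique of mutually compatible bipartitions) can be sketched with an exhaustive search. Everything concrete here is an assumption for illustration: bipartitions are coded as bit masks over taxa 1 to T-1, the exhaustive search is only practical for a handful of taxa, and the γ̂ values in the example are invented, not taken from any data set in the chapter.

```python
from itertools import combinations

def compatible(a, b):
    """Two bipartitions (bit masks over taxa 1..T-1) fit on one tree if their
    coded sides are disjoint or nested."""
    common = a & b
    return common == 0 or common == a or common == b

def max_weight_clique(gamma_hat):
    """Exhaustive maximum weighted clique over bipartitions with positive weight.

    gamma_hat maps bipartition index -> estimated value; index 0 is ignored.
    Negative entries are set to 0 and all entries are squared before the search.
    """
    weight = {k: max(v, 0.0) ** 2 for k, v in gamma_hat.items() if k != 0}
    candidates = [k for k, w in weight.items() if w > 0.0]
    best, best_score = (), 0.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                score = sum(weight[k] for k in subset)
                if score > best_score:
                    best, best_score = subset, score
    return set(best), best_score

# Hypothetical four-taxon spectrum (made-up numbers for illustration only).
example = {0: -1.10, 1: 0.45, 2: 0.03, 3: 0.05, 4: 0.44, 5: 0.02, 6: -0.01, 7: 0.04}
splits, score = max_weight_clique(example)
print(sorted(splits))   # prints: [1, 2, 3, 4, 7]
```

In this made-up example the four pendant bipartitions are compatible with everything, and of the three mutually incompatible internal bipartitions (3, 5, and 6) the search keeps index 3, the one with the largest squared weight. For realistic numbers of taxa an exact clique algorithm such as that of Bron and Kerbosch (1973) would replace the exhaustive loop.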
the delta method approximation (Waddell et al., 1994). The simulations by Waddell et al. (1994) showed that the covariance matrix derived in this way gives nearly unbiased results, whereas bootstrap resampling tends to yield overestimates. As long as a pattern occurs five or more times in the observed data, it is reasonable to treat the corresponding corrected pattern (or branch length) as normally distributed, resulting in straightforward confidence intervals, or tests of the hypothesis that its true value is zero. The covariances of corrected patterns can also be thought of as covariances of tree branch length estimates. Generally, the more changes per site there are on the tree, the more strongly branch lengths become either positively or negatively correlated (Waddell et al., 1994). (These interdependencies tend to make the iterative search for a maximum likelihood solution slower.) Another conclusion from this study is that long branches, even when not biasing the topology of the tree, nonetheless cause a large increase in the variance of internal branch length estimates, reducing the reliability of tree selection. It is possible to estimate a confidence interval on transition:transversion ratios or the shape parameter of distributions used to model among-site rate variation (Waddell, 1995).

The Distance Hadamard
The last part of the Hadamard conjugation (from ρ to γ) can also begin from a matrix of pairwise distances (either corrected or uncorrected) (Hendy and Penny, 1993). We would like to estimate a branch-length spectrum (now called γ_d) and choose an optimal tree from this spectrum, analogously to the procedure used for sequence data. We input the distances at the level of the generalized distance vectors ρ (formula 35). However, a complication arises because these vectors include elements corresponding to path sets involving more than two taxa; see Hendy and Penny (1993) for a method of estimating these path-set lengths. The γ_d vector resulting from formula (37) then serves as the basis for choosing a tree as described above.

Simulations and analytic calculations have shown that the variances of entries in the γ_d vector resulting from this approach are lower than the γ̂ values from the usual Hadamard conjugation under the same model, because the distance method of estimating path-set lengths involving more than two taxa has lower variance (Waddell, 1995). Consequently, tree selection using this vector tends to be more reliable (Charleston, 1994). However, the distance Hadamard does not seem to be as sensitive as the Hadamard conjugation at detecting violations of the model's expectations. The studies of Lento et al. (1995) and Lockhart et al. (1995b) suggest that this method is a useful exploratory tool when trying different distance transformations, although more study is needed on how directly a pattern from the distance Hadamard can be treated as evidence for specific sequence patterns.

Lake's Method of Invariants

Rationale
As discussed earlier in this chapter, the presence of more than one long, unbranched lineage in an analysis can lead to systematic error in the absence of perfect compensation for superimposed substitutions. In the context of parsimony, the homoplasies along the long branches can overwhelm the informative character changes along the internal branch(es) of the tree (see Figure 8 and the section "Parsimony and Inconsistency").

Ideally, we would like to distinguish informative changes from homoplasies. In parsimony and maximum likelihood analyses, the addition of new sequences whose branch points subdivide the longest lineages (i.e., representation of taxa that are specifically related to the most divergent taxa already in the tree) will tend to accomplish this goal. The effect is illustrated in Figure 19, where adding sequences A' and B' to the tree would reduce the effects of homoplasies along the branches leading to A and B. Of course, the practical utility of this approach requires that appropriate taxa exist, that their identities are known, and that the corresponding sequence data exist or can be generated. A second method of reducing the effects of homoplasy is to confine the analysis to the most conserved sequences (both on the basis of the overall conservation of the molecule and
by selecting the most conserved portions of the molecule). In distance-based analyses, estimates of the superimposed substitutions (which include the homoplasies) can also be included.

Lake (1987a) suggested an alternative method, which he called evolutionary parsimony, for analyzing the branching pattern linking four nucleotide sequences. The analysis can be derived from the following assumptions: (1) substitutions at a given sequence position are independent; (2) a balance exists among specific classes of transversions (a sufficient condition for this balance is that transversions are equally likely to yield each of the two possible substitution products, so that C is equally likely to change to A or G, etc.); and (3) insertions or deletions can be safely ignored. An advantage of the method is that it does not assume anything about rate equality over sites; each site is free to evolve at a different rate than all other sites.

If the assumptions are satisfied, then parallel transversions in the two branches of a tree produce equal numbers of similar (type 1 in Figure 20) and dissimilar (type 2 in Figure 20) nucleotides. Thus, the net effect of peripheral branch transversions could be cancelled if the type 2

Figure 19 Adding new taxa to a parsimony or maximum likelihood tree to reduce the effects of homoplasy. Given the unrooted tree shown in heavy lines, the long lineages leading to A and B would have the greatest tendency to artifactually group due to parallel or convergent changes in sequence. Adding taxa A' and B' would reduce this effect by subdividing the long lines.

Methodology
Lake's method can be described by the following sequence of steps:

1. Choose a quartet of aligned sequences; call them A, B, C, and D.

2. Find the alignment positions in which two sequences have purines and two have pyrimidines.

3. Consider the three possible groupings of sequences (see Figure 21): AB/CD (A with B, C with D), AC/BD, and AD/BC. Call these branching patterns X, Y, and Z, respectively.

4. Using the sequence positions at which sequences A and B are both purines or both pyrimidines (and sequences C and D are both of the opposite class of base), use the rules in Table 2 to count the number of positions that support and the number that counter branching order X. Call these totals X+ and X-, respectively. Similarly, find the support (Y+) and countersupport (Y-) for branching order Y, using the sequence positions at which sequences A and C have the same class of base, and B and D have the opposite class. Finally, find the support (Z+) and countersupport (Z-) for branching pattern Z. If the counting has been done correctly, the total of X+, X-, Y+, Y-, Z+, and Z- will be equal to the total number of positions with two purines and two pyrimidines, as found in the second step.

5. The net supports for branching patterns X, Y, and Z are

   X = X+ - X-,   Y = Y+ - Y-,   Z = Z+ - Z-   (40)

   The support for two of the branching patterns should be near zero, while the remaining branching pattern may or may not be supported by a significantly non-zero score.
476 Chapter 11 / Swofford, Olsen, Waddell & Hillis
6. Lake (1987a) suggested that statistical significance be evaluated by a one degree of freedom χ² test:

(1988a) correctly pointed out that the χ² approximation is inadequate when counts are low and recommended the use of the exact binomial test instead.
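The counting steps above can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of Lake's program: the rules of Table 2 are not shown in this section, so the sketch assumes that a position supports a grouping when the two members of both pairs are identical or both pairs differ, and counters it when exactly one pair differs. The four short sequences are hypothetical, and the exact binomial test assumes that support and countersupport are equally likely under the null hypothesis.

```python
from math import comb

PURINES = set("AG")
PYRIMIDINES = set("CT")

def lake_counts(a, b, c, d):
    """Count support (+) and countersupport (-) for groupings X=AB/CD, Y=AC/BD, Z=AD/BC."""
    counts = {key: 0 for key in ("X+", "X-", "Y+", "Y-", "Z+", "Z-")}
    for w, x, y, z in zip(a, b, c, d):
        site = (w, x, y, z)
        if any(base not in PURINES | PYRIMIDINES for base in site):
            continue                                   # gaps and ambiguities are ignored
        is_purine = [base in PURINES for base in site]
        if sum(is_purine) != 2:
            continue                                   # step 2: need two purines, two pyrimidines
        if is_purine[0] == is_purine[1]:               # step 3: which sequence shares A's class?
            label, pairs = "X", ((w, x), (y, z))
        elif is_purine[0] == is_purine[2]:
            label, pairs = "Y", ((w, y), (x, z))
        else:
            label, pairs = "Z", ((w, z), (x, y))
        differing = sum(p != q for p, q in pairs)      # assumed stand-in for Table 2
        counts[label + ("-" if differing == 1 else "+")] += 1
    return counts

def net_supports(counts):
    """Step 5: support minus countersupport for each branching pattern."""
    return {g: counts[g + "+"] - counts[g + "-"] for g in "XYZ"}

def exact_binomial_p(support, counter):
    """Two-sided exact binomial test of support = countersupport (p = 1/2)."""
    n = support + counter
    if n == 0:
        return 1.0
    k = max(support, counter)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical aligned quartet (illustration only).
A, B, C, D = "AAAAAAG", "AGGCCAG", "CCCATAT", "CCTCGCT"
counts = lake_counts(A, B, C, D)
print(counts, net_supports(counts), exact_binomial_p(counts["X+"], counts["X-"]))
```

As step 4 requires, the six counts sum to the number of positions with two purines and two pyrimidines (six of the seven positions in this toy alignment).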
such outcomes are most likely to be the result of selective pressure or some other non-random process.

TRANSITIONS AND TRANSVERSIONS The phylogenetic information provided by Lake's method is based entirely on transversion substitutions, so positions with two purines and two pyrimidines are required. If there are no transversions, there will be no signal. On the other hand, transition substitutions decrease the signal. In particular, peripheral branch transitions convert informative (supportive) positions into countersupport, suggesting that the method might be particularly sensitive to the ratio of transitions to transversions. If transitions are indeed substantially more frequent than transversions, then it is difficult to accumulate a sufficient number of transversions to infer the branching pattern without having the signal randomized by transitions (see W.-H. Li et al., 1987b). As noted above, generalized parsimony (character-state weighting), transversion parsimony, and transversion-based distance methods provide alternative methods of coping with a high transition:transversion ratio. Under many conditions, these methods are much more efficient than Lake's method at finding the correct tree (Hillis et al., 1994b).

Interestingly, transversion parsimony (as defined in this chapter, which differs from Lake's use of the term) applied to four sequences seeks the tree, X, Y, or Z, with the largest value of X+ + X-, Y+ + Y-, and Z+ + Z-. By examining the equations in (40), it can be seen that transversion parsimony uses the same data but adds the terms that look like a peripheral branch transition (and a central branch transversion) rather than subtracting them as does Lake's method.

Performance
Despite its intuitive appeal, the drawback of Lake's method is inefficiency. Especially when rates of change are high, simulation studies suggest that it requires vastly more data to achieve the same probability of inferring the correct phylogeny as other methods. For example, in four-taxon simulations using the K2P model under long-branch-attraction conditions, Hillis et al. (1994b) found that Lake's method required about 10^8 nucleotides before its probability of selecting the correct tree exceeded 1/3 (= the probability of a randomly chosen tree). Maximum likelihood analysis, on the other hand, achieved 95% success at only 5000 nucleotides under the same conditions. Lake's method can be consistent under conditions in which maximum likelihood (as currently implemented) is inconsistent, so given enough data, it remains a potentially useful method. Unfortunately, "enough data" may be vastly more than the amount available.

Rooting Revisited
Most of the methods discussed above do not specify the location of the root. If, as is generally the case, a rooted tree is desired, the root must be located using extrinsic information. As mentioned above, the most commonly used method is to include one or more taxa that are assumed to lie cladistically outside of a presumed monophyletic group. We recommend including more than one outgroup taxon as a means of testing the assump-
Figure 23 Enumeration of all 15 possible unrooted trees for five taxa (see text)
order to find a globally optimal solution. A simple algorithm, outlined in Figure 23, can be used to perform this enumeration. Initially, we connect the first three taxa in the data set to form the only possible unrooted tree for these taxa (Figure 23, row 1). In the next step, we add the fourth taxon to each of the three branches of the three-taxon tree, thereby generating all three possible unrooted trees for the first four taxa (Figure 23, row 2). We continue in a similar fashion, adding the ith taxon to each branch of every tree (containing i - 1 taxa) generated during a previous step. Thus, for example, row 3 of Figure 23 contains all 15 possible trees for the first five taxa, obtained by adding the fifth taxon to each of the five possible branches for the three trees obtained at the four-taxon stage. This makes clear the rationale for expression (1) for counting the number of possible unrooted bifurcating trees for T taxa: for each of the possible trees for i - 1 taxa, there are 2(i - 1) - 3 = 2i - 5 branches to which the ith taxon can be connected. Note that the order of addition is immaterial; we could have just as easily chosen taxa at random for next addition at each step.

Evaluation of expression (1) for several possible values of T quickly reveals why exhaustive
search procedures are useful only for small numbers of taxa. There are 945 possible unrooted trees for only 7 taxa, over 2 x 10^6 trees for 10 taxa, and over 2 x 10^20 possible trees for 20 taxa (Felsenstein, 1978b; see Table 2 in Chapter 12). Thus, exhaustive enumeration of all possible trees typically is feasible only for 11 or fewer taxa (34,459,425 trees).

Branch-and-Bound Methods
Fortunately, an exact algorithm for identifying all optimal trees that does not require exhaustive enumeration is available for any criterion whose value is known to be non-decreasing as additional taxa are connected to a tree. The branch-and-bound method, frequently used to solve problems in combinatorial optimization, was first applied to evolutionary trees by Hendy and Penny (1982). The branch-and-bound method closely resembles the exhaustive search algorithm described above. In this procedure, we traverse a search tree in a depth-first sequence, as illustrated in Figure 24. The root of the search tree (A) contains the only
possible tree for the first three taxa. We first construct one of the three possible trees obtained by connecting taxon 4 to tree A, yielding tree B1. Then, to this tree, we connect taxon 5, yielding tree C1.1. (If there were more than five terminal taxa, we would continue to join additional taxa in this manner until a tree containing all T taxa had been completed.) Now, we backtrack one node on the search tree (i.e., back to tree B1) and generate the second tree resulting from the addition of taxon 5 to tree B1 (= tree C1.2). When all five of the trees derivable from tree B1 (C1.1-C1.5) have been constructed, we backtrack all the way to tree A of the search tree and take the second path away from this node, leading to tree B2. As before, all five trees derivable from tree B2 (C2.1-C2.5) are constructed in turn. Then we backtrack once again to tree A and proceed down the third path, toward trees C3.1-C3.5. Eventually we will have constructed all of the possible trees, culminating with tree C3.5. If the score of each tree containing all five taxa were evaluated at the time of its construction, then the search would be an exhaustive one equivalent to that described in the above section. However, a branch-and-bound search differs by eliminating parts of the search tree that only contain suboptimal solutions.

Let L represent an upper bound on the optimal value of the chosen optimality criterion. (We assume that we want to minimize this criterion, just as we minimize the tree length under a parsimony criterion or minimize the sum of squared deviations in an additive-tree distance method.) For the present, we can obtain L by evaluating a random tree; if we know that a tree of score L exists, then the score of the optimal tree(s) cannot exceed this value. As we are moving along a path of the search tree toward its tips (containing all T taxa), if we encounter a tree whose score exceeds L, then there is no need to proceed further along this path; connecting additional taxa cannot possibly decrease the score. Thus, we can dispense with the evaluation of all (phylogenetic) trees that descend from this node in the search tree and immediately backtrack and proceed down a different path. By cutting off portions of the search tree in this manner, we can greatly reduce the number of trees that must actually be evaluated.

If we reach the end of a path on the search tree and obtain a tree whose score is equal to the upper bound L, then this tree is a candidate for optimality. If this score is less than L, then this is the best tree found so far, and we have improved the upper bound on the score of the optimal tree(s). This improvement is important, as it may enable other search paths to be terminated more quickly. When the entire search tree has been traversed, all optimal trees will have been identified.

The branch-and-bound method is extremely effective for many criteria, permitting exact solutions for 20 or more taxa, depending on the efficiency of the implementation, the speed of the available computer, and the "messiness" of the data. The method can be used to search for optimal trees under parsimony, maximum likelihood, and additive distance criteria in programs such as PAUP* (see Appendix).

The above presentation of the branch-and-bound method, although correct, is an oversimplification of the algorithms actually used in state-of-the-art computer programs. Refinements in the algorithm that greatly speed the computations usually are implemented. These refinements, designed to promote earlier cut-offs in the traversal of the search tree, include: (1) using heuristic methods (discussed below) to obtain a near-optimal tree whose score is used as the initial upper bound; (2) designing the search tree so that divergent taxa are added early, thereby increasing the length of the initial trees in the search path; and (3) using pairwise incompatibility to improve the lower bound on the length that will ultimately be required by trees descending from a tree at a given node of the search tree. These methods are discussed in more detail in Hendy and Penny (1982) and Swofford (1996).

An obvious question may have occurred to the reader at this point. Since the branch-and-bound method requires evaluation of all trees as its worst possible case, why would we ever want to perform an exhaustive search? In fact, if we were interested only in the optimal trees, the branch-and-bound algorithm would indeed be the preferred means of finding them. However, exhaustive searches permit the researcher to examine the frequency distribution of tree lengths.
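The enumeration by stepwise insertion and the branch-and-bound cut-off can be sketched together. The code below is a minimal illustration under several assumptions that are not part of the text: trees are nested tuples with taxon 0 kept implicitly at the root, Fitch parsimony length stands in for the optimality criterion, the initial bound is infinity rather than the score of a random tree, and the five short sequences are hypothetical.

```python
def insertions(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to one branch of `tree`."""
    yield (tree, leaf)                       # the branch above this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)

def all_trees(n_taxa):
    """Exhaustive enumeration: add taxa 3, 4, ... to every branch of every tree."""
    trees = [(1, 2)]                         # the only tree for taxa 0, 1, 2
    for taxon in range(3, n_taxa):
        trees = [t for old in trees for t in insertions(old, taxon)]
    return trees

def fitch_length(tree, seqs):
    """Parsimony length of the taxa present in `tree` (plus taxon 0 at the root)."""
    length = 0
    for site in range(len(seqs[0])):
        def state_set(node):
            nonlocal length
            if not isinstance(node, tuple):
                return {seqs[node][site]}
            a, b = state_set(node[0]), state_set(node[1])
            if a & b:
                return a & b
            length += 1                      # a change is required at this node
            return a | b
        if seqs[0][site] not in state_set(tree):
            length += 1
    return length

def branch_and_bound(seqs):
    """Depth-first search that abandons any path whose partial tree exceeds the bound."""
    n = len(seqs)
    bound, best, evaluated = float("inf"), [], 0

    def search(tree, next_taxon):
        nonlocal bound, best, evaluated
        score = fitch_length(tree, seqs)
        evaluated += 1
        if score > bound:
            return                           # adding taxa cannot decrease the score
        if next_taxon == n:                  # a complete tree
            if score < bound:
                bound, best = score, []      # improved upper bound
            best.append(tree)
            return
        for t in insertions(tree, next_taxon):
            search(t, next_taxon + 1)

    search((1, 2), 3)
    return bound, best, evaluated

# Hypothetical data for five taxa (illustration only).
SEQS = ["AACCA", "AACCG", "GGCCA", "GGTTG", "GGTTG"]
print(len(all_trees(5)), len(all_trees(7)))
print(branch_and_bound(SEQS))
```

Because only paths whose score strictly exceeds the bound are abandoned, trees that tie with the best score are retained, which is what allows all optimal trees to be reported.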
Figure 25 Heuristic tree selection using star decomposition method. At each step, the optimality criterion is evaluated for each possible joining of a pair of lineages leading away from the central node. The best tree found during each step becomes the starting point for the next step.

the current situation rather than attempting to see more broadly into the future. Thus, one placement of a taxon may be best given the taxa currently on the tree, but that placement may become suboptimal upon the addition of subsequent taxa. Once a decision has been made to connect a taxon to a certain point, however, we must usually accept the consequences of that decision for the remainder of the stepwise addition process, perhaps ending up in a local optimum as a result.

Star Decomposition Methods
An alternative to stepwise addition is the star decomposition method, a divisive pairwise clustering method (see "Cluster Analysis," below). The algorithm can be used with any criterion that can be evaluated on a non-binary (polytomous) tree. To begin, we connect all of the terminal taxa in a "star tree" containing a single internal node (Figure 25, step 1). Next, we evaluate the optimality criterion for all possible trees that can be constructed by joining two of the terminal nodes into a new group (Figure 25, step 2). The tree from this stage that scores best according to the criterion is saved for the next step. Each time we form a new group, we reduce by one the number of branches connected to the central node. The process continues until the step in which all generated trees are binary (Figure 25, step 3), and we choose the best of these (again according to the chosen optimality criterion).

The most commonly used star decomposition method is the neighbor-joining algorithm of Saitou and Nei (1987; see below). Saitou (1990), Adachi and Hasegawa (1992), and Z. Yang (1995) have also implemented the method for both DNA and protein maximum likelihood. Star decompo-
Figure 26 Branch swapping by nearest-neighbor interchanges (NNIs). Each interior branch of the tree defines a local region of four subtrees connected by the interior branch. Interchanging a subtree on one side of the branch with one from the other constitutes an NNI. Two such rearrangements are possible for each interior branch.

Branch Swapping
Because of the excessive greediness and susceptibility to local optima problems, stepwise addition and star decomposition algorithms generally do not find optimal trees unless the number of taxa is small or the data are very clean. However, it may be possible to improve the initial estimate by performing sets of predefined rearrangements, a technique commonly referred to as branch swapping. In general, any one of these rearrangements amounts to a "stab in the dark," but the hope is that if a better tree exists, one of the rearrangements will find it. Examples of three kinds of rearrangements used in current branch-swapping algorithms are shown in Figures 26 through 28.

Of course, the globally optimal tree(s) may be several rearrangements away from the starting tree. If a rearrangement is successful in finding a better tree, a round of rearrangements is initiated on this new tree. As long as each round of rearrangements is successful in finding an improved tree (according to its score under the optimality criterion), then we will eventually arrive at the global optimum. However, if the intermediate trees would require us to pass through trees that are inferior to the one(s) already obtained, we will once again find ourselves trapped in a local optimum unless an option is provided for branch swapping on suboptimal trees (e.g., the "KEEP" option in PAUP*; Swofford, 1993, 1996). A related problem concerns plateaus on the optimality surface. It may be the case, for example, that an optimal tree lies several rearrangements away from the current tree, and that these rearrangements all correspond to trees having equal scores under the optimality criterion. If the intermediate trees are discarded because they are "not better," then the optimal tree will not be found. A few programs do not retain equally good trees because they have no protection against cycling (alternation between two trees, each of which can be rearranged to yield the other); these programs will not be effective if plateaus are encountered, since they are unable to traverse the plateau.

Figure 27 Branch swapping by subtree pruning and regrafting. A subtree is pruned from the tree (e.g., the subtree containing terminal nodes A and B as indicated). The subtree is then regrafted to a different location on the tree. All possible subtree removals and reattachment points are evaluated.
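Branch swapping by nearest-neighbor interchange, with a new round of rearrangements started whenever a better tree is found, can be sketched as follows. The representation is an assumption of this sketch rather than anything prescribed in the text: trees are nested tuples with taxon 0 kept implicitly at the root, Fitch parsimony length is used as the score, and the sequences and starting tree are hypothetical. Only strictly better trees are accepted, so the sketch can stall on plateaus and in local optima in the way described above.

```python
def nni_neighbors(tree):
    """Yield every tree one nearest-neighbor interchange away from `tree`."""
    if not isinstance(tree, tuple):
        return
    left, right = tree
    if isinstance(left, tuple):              # interior branch leading to `left`
        a, b = left
        yield ((a, right), b)                # exchange b with the sibling subtree
        yield ((right, b), a)                # exchange a with the sibling subtree
    if isinstance(right, tuple):             # interior branch leading to `right`
        a, b = right
        yield (a, (left, b))                 # exchange a with the sibling subtree
        yield (b, (a, left))                 # exchange b with the sibling subtree
    for t in nni_neighbors(left):            # interior branches deeper in the tree
        yield (t, right)
    for t in nni_neighbors(right):
        yield (left, t)

def fitch_length(tree, seqs):
    """Parsimony length of `tree`, with taxon 0 attached at the root."""
    length = 0
    for site in range(len(seqs[0])):
        def state_set(node):
            nonlocal length
            if not isinstance(node, tuple):
                return {seqs[node][site]}
            a, b = state_set(node[0]), state_set(node[1])
            if a & b:
                return a & b
            length += 1
            return a | b
        if seqs[0][site] not in state_set(tree):
            length += 1
    return length

def swap_until_stuck(start, score):
    """Repeat rounds of NNIs, restarting from each improved tree, until none improves."""
    current, best = start, score(start)
    improved = True
    while improved:
        improved = False
        for candidate in nni_neighbors(current):
            s = score(candidate)
            if s < best:                     # strictly better: start a new round here
                current, best, improved = candidate, s, True
                break
    return current, best

# Hypothetical data and starting tree (illustration only).
SEQS = ["AACCA", "AACCG", "GGCCA", "GGTTG", "GGTTG"]
START = ((1, 3), (2, 4))
print(fitch_length(START, SEQS))
print(swap_until_stuck(START, lambda t: fitch_length(t, SEQS)))
```

For five taxa there are two interior branches, so each tree has four NNI neighbors, in agreement with the two rearrangements per interior branch noted in the caption of Figure 26.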
Testing for Convergence
Because of the limitations of heuristic approaches, some way of evaluating the success of the chosen method in obtaining a globally optimal solution is needed. The obvious strategy in this regard is to begin from different starting points and ask whether the same result is always obtained. For example, a set of random sequences for the addition of taxa can be used to generate initial trees for input to branch swapping. Since, for reasonably noisy data at least, the starting trees will vary depending on the addition sequence, convergence to a common optimal tree (or set of trees) is encouraging. (A more extreme approach, using random trees rather than random addition sequences, could be adopted; however, the starting trees are, on average, so far from the optimal trees that this strategy seems to be less effective.) Even if rearrangements of different starting trees do not converge to the same end point, the use of several starting trees is a good idea; if multiple peaks on

Prob[accepting solution z(t_{i+1})]

where k is a parameter that can vary over time.

In the Great Deluge method (Dueck, 1990; Dueck and Scheuer, 1990), the probability of accepting a new solution t_{i+1} is 1 if z(t_{i+1}) > w_i, where w_i is a bound that increases slowly with time, so that if t_{i+1} is accepted, then w_{i+1} = w_i + c[z(t_{i+1}) - z(t_i)]. The constant c is usually about 0.01 to 0.05. These methods of determining the acceptability of a new solution offer an efficient means of improving the performance of heuristic searches (M. Charleston, personal communication), and there are many other variants, including the use of a "tabu list" (Glover, 1989) that prevents the search from revisiting any solutions it has just tried (the list usually contains about 5 to 10 solutions).
Algorithmic and Other Methods

The methods for tree searching described in the above sections are appropriate when an optimality criterion that can be evaluated for any given tree is chosen. The problem is then reduced to finding an optimal tree given the chosen criterion. The methods described below do not cleanly fit into this framework, either because they are defined solely on the basis of an algorithm or because the task of finding an optimal tree cannot be cleanly separated from that of evaluating a specific tree.

Cluster Analysis
Cluster analysis is a family of related techniques for representing similarity or distance data (we will use distances) in the form of an ultrametric tree (Sneath and Sokal, 1973). If the data themselves are ultrametric, then the representation on the tree will be exact. It should be obvious that if the distance data themselves are not ultrametric, then they cannot be fit exactly to such a tree, and therefore errors might be introduced.

The method of cluster analysis is conceptually simple. The raw data are provided as a table of distances between all pairs of taxa. Call d_ij the distance between taxa i and j. The tree is constructed by linking the least distant pairs of taxa, followed by successively more distant taxa, or groups of taxa. When two taxa are linked, they lose their individual identities and are subsequently referred to as a single cluster. Initially, each taxon constitutes its own cluster. At each stage in the process, as two clusters are merged into one, the number of clusters declines by one. The process is complete when the last two clusters are merged into a single cluster containing all of the original taxa. The steps of the method are as follows:

1. Given a matrix of pairwise distances, find the clusters (taxa) i and j such that d_ij is the minimum value in the table.

2. Define the depth of the branching between i and j (l_ij) to be d_ij/2.

3. If i and j were the last two clusters, the tree is complete. Otherwise, create a new cluster called u.

4. Define the distance from u to each other cluster (k, with k ≠ i or j) to be an average of the distances d_ki and d_kj.

5. Go back to step 1 with one less cluster; clusters i and j have been eliminated, and cluster u has been added.

The variants are primarily in the details of step 4. The most commonly used clustering method is UPGMA (unweighted pair group method using arithmetic averages), in which the averaging of the distances in step 4 is based on the total number of taxa in the clusters. That is, if cluster i contains T_i taxa, and cluster j contains T_j taxa, then d_ku = (T_i d_ki + T_j d_kj)/(T_i + T_j). If the simple average [d_ku = (d_ki + d_kj)/2] is used instead, the technique is called WPGMA (weighted PGMA). Other variants include using the maximum distance [d_ku = max(d_ki, d_kj), called complete linkage], or the minimum distance [d_ku = min(d_ki, d_kj), called single linkage]. These alternatives all give the same results when the data are ultrametric, but they can differ in their inferences when the data are not ideal.

An example of using UPGMA to infer a tree of five taxa (5S rRNA sequences) is given in Figure 29. The figure presents the upper right half of the pairwise distance matrix at each stage of the cluster analysis. Starting with the first table, the smallest distance, the 0.1715 substitutions per sequence position separating Bsu and Bst, is indicated in bold face. Thus, the first inferred branching unites these taxa at a depth of 0.1715/2 = 0.0858. These two taxa are merged into a cluster in the next table, and their distances to all other taxa are averaged. For example, the distance from the Bsu-Bst group to Lvi is (0.2147 + 0.2991)/2 = 0.2569. The smallest distance in the second table joins the Bsu-Bst cluster with Mlu at a depth of 0.1096 (= 0.2192/2). The distances of the Bsu-Bst-Mlu cluster to the other taxa are then computed by the unweighted method. For example, the distance to Lvi is (2 x 0.2569 + 0.3943)/3 = 0.3027. Notice that this value is identical to (Bsu:Lvi + Bst:Lvi + Mlu:Lvi)/3, where A:B is the distance from taxon A to taxon B. Each taxon in the original data table contributes equally to the averages, which is why the method is called unweighted. The
smallest distance in the third table unites Lvi and Amo at a depth of 0.1398. The distance between the Bsu-Bst-Mlu and Lvi-Amo clusters is then (3 x 0.3027 + 3 x 0.3593)/6 = 0.3310. Thus the implied root of the tree joins these two clusters at a depth of 0.1655. The complete tree is shown in Figure 15B.

Note that cluster analysis cannot join two taxa (sometimes called operational taxonomic units or OTUs) unless at least one pairwise distance links them. Thus, missing data within a group can force one or more members out of the group in the inferred tree, a problem discussed in greater detail under "Similarity and Distance Data."

Cluster analysis has historically been very popular for several reasons. First, the principal assumption is that the data are approximately ultrametric. This assumption is of course a very strong one, but it is seductive to believe that a simple stringent assumption can be satisfied more easily than a long list of (what might be) less restrictive assumptions. Second, the idea of grouping the taxa that are least different, regardless of any finer points of consideration, has a strong intuitive appeal. The extreme of this view is the phenetic perspective, in which it is asserted that nothing but the extent of similarity matters biologically and that consideration of the historical branching order is of purely secondary interest. A third reason is the availability of programs to do cluster analysis and the relative speed of the calculations, thereby enabling large numbers of taxa to be analyzed.

As emphasized above, simple cluster analysis has drawbacks. First, it is just an algorithm (or family of algorithms) with no objective definition of what constitutes an optimal tree when the data are not ideal. In particular, because genes do not diverge uniformly in all organisms or organelles (Chapters 8, 9, and 12), systematic errors are likely to be introduced into cluster analysis reconstructions. Finally, alternative, rapid methods are available that will work for all additive trees, not just those that are ultrametric.

Algorithmic Methods for Additive Trees
A variety of algorithmic methods related to cluster analysis have been proposed that will correctly reconstruct additive trees, whether the data are ultrametric or not. These methods fall into three primary categories. Those in the first category transform any additive distance matrix into an ultrametric matrix and then use cluster analysis to infer the tree. They include the transformed distances method of W.-H. Li (1981), the present-day ancestor method of Klotz and Blanken (1981), and, in a less obvious sense, the neighbor-joining method of Saitou and Nei (1987). The second category comprises methods that form the clusters consistent with the largest fraction of taxon-quartets, using a relaxed definition of additivity for a four-taxon tree. These methods include those of Sattath and Tversky (1977) and Fitch (1981). Methods of the third class, which includes the distance Wagner method (Farris, 1972), build an additive

         Bsu     Bst     Lvi     Amo     Mlu
Bsu       -    0.1715  0.2147  0.3091  0.2326
Bst               -    0.2991
Lvi                       -    0.2795  0.3943
Amo                               -    0.4289
Mlu                                       -

           Bsu-Bst   Lvi     Amo     Mlu
Bsu-Bst       -    0.2569  0.3245  0.2192
Lvi                   -    0.2795  0.3943
Amo                           -    0.4289
Mlu                                   -

              Bsu-Bst-Mlu   Lvi     Amo
Bsu-Bst-Mlu        -      0.3027  0.3593
Lvi                          -    0.2795
Amo                                  -

              Bsu-Bst-Mlu  Lvi-Amo
Bsu-Bst-Mlu        -       0.3310
Lvi-Amo                       -

Figure 29 Cluster analysis (UPGMA) of 5S rRNA evolutionary distance estimates. Abbreviations correspond to Figure 15. Each table represents the pairwise distances (estimated nucleotide substitutions per sequence position) for one round of clustering (only the upper right half of the symmetrical matrix is shown). The minimum distance value in each table is in bold. The corresponding pair of taxa (or clusters) are merged into a single cluster in the next table. The bold distance value is twice the depth of the branch point separating the clusters merged. A diagram of the inferred tree is in Figure 15B.
representation of the tree by sequential addition of taxa. The transformed distance approaches all have a computational complexity that is proportional to $T^3$; therefore, any problem that is tractable with standard cluster analysis can also be solved with these methods. We present a version of the neighbor-joining method below.

Unlike cluster analysis, additive-tree methods yield unrooted trees, which are adequate for some purposes. If a root is to be placed, however, it must be based on an ancillary criterion. Usually, one or more taxa that are assumed to lie outside a monophyletic group of interest are included in the analysis. The location at which these taxa join the tree defines the root with respect to the ingroup. Another method, midpoint rooting, depends on an assumption of rate uniformity that is somewhat weaker than assuming a molecular clock across the entire tree. If the two most divergent lineages have evolved at the same rate, then the appropriate root is at the midpoint of the path connecting these taxa.

THE NEIGHBOR-JOINING METHOD Neighbor joining (Saitou and Nei, 1987) is conceptually related to traditional cluster analysis, but removes the assumption that the data are ultrametric. In practical terms, it does not assume that all lineages have diverged equal amounts. However, it does assume that the data come close to fitting an additive tree, so correction for superimposed substitutions is important for data that might include lineage-to-lineage differences in average rate.

The neighbor-joining algorithm is a special case of the star decomposition method described earlier. In contrast to cluster analysis, neighbor joining keeps track of nodes on a tree rather than taxa or clusters of taxa. The raw data are provided as a distance matrix, and the initial tree is a star tree. A modified distance matrix is constructed in which the separation between each pair of nodes is adjusted on the basis of their average divergence from all other nodes (conceptually, this adjustment has the effect of normalizing the divergence of each taxon for its average clock rate). The tree is constructed by linking the least-distant pair of nodes as defined by this modified matrix. When two nodes are linked, their common ancestral node is added to the tree and the terminal nodes with their respective branches are removed from the tree. This pruning process converts the newly added common ancestor into a terminal node on a tree of reduced size. At each stage in the process, two terminal nodes are replaced by one new node (corresponding to an internal node on the final tree). The process is complete when two nodes remain, separated by a single branch.

The steps of the method (modified from Studier and Keppler, 1988) are as follows:

1. Given a matrix of pairwise distances ($d$), for each terminal node $i$ calculate its net divergence ($r_i$) from all other taxa using the formula

$$r_i = \sum_{k=1}^{N} d_{ik}$$

where $N$ is the number of terminal nodes in the current matrix. Note the assumption that $d_{ii} = 0$; otherwise the summation would need to skip over $k = i$.

2. Create a rate-corrected distance matrix ($M$) in which the elements are defined by

$$M_{ij} = d_{ij} - \frac{r_i + r_j}{N - 2}$$

for all $i$ and with $j > i$ (the matrix is symmetrical, and the case of $i = j$ is not interesting). Only the values $i$ and $j$ for which $M_{ij}$ is minimum need be recorded; saving the entire matrix is unnecessary.

3. Define a new node $u$ whose three branches join nodes $i$, $j$, and the rest of the tree. Define the lengths of the tree branches from $u$ to $i$ and $j$:

$$v_{iu} = \frac{d_{ij}}{2} + \frac{r_i - r_j}{2(N - 2)}, \qquad v_{ju} = d_{ij} - v_{iu}$$

4. Define the distance from $u$ to each other terminal node (for all $k \neq i$ or $j$)

$$d_{ku} = \frac{d_{ik} + d_{jk} - d_{ij}}{2} \qquad (43)$$
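The steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `neighbor_joining`, the internal labels `node1`, `node2`, ..., and the four-taxon distance matrix are all assumptions made for the example. The toy distances were generated from an additive tree ((A,B),(C,D)) with branch lengths A = 1, B = 2, C = 4, D = 5 and an internal branch of 3; they are not the 5S rRNA data of the worked example. The loop stops when two nodes remain, following the description in the text, and the last pairwise distance is used as the final branch length.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the edges (node_a, node_b, length) of an unrooted NJ tree.

    names: list of taxon labels.
    dist:  dict of dicts holding a symmetric matrix with a zero diagonal.
    """
    d = {a: dict(dist[a]) for a in names}   # working copy of the matrix
    nodes = list(names)                     # current terminal nodes
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # Step 1: net divergence r_i = sum over k of d_ik (d_ii = 0).
        r = {i: sum(d[i][k] for k in nodes) for i in nodes}
        # Step 2: pick the pair minimizing M_ij = d_ij - (r_i + r_j)/(N - 2).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: d[p[0]][p[1]] - (r[p[0]] + r[p[1]]) / (n - 2),
        )
        # Step 3: new node u and the branch lengths from u to i and j.
        u = f"node{next_id}"
        next_id += 1
        v_iu = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        v_ju = d[i][j] - v_iu
        edges.append((i, u, v_iu))
        edges.append((j, u, v_ju))
        # Step 4: distance from u to every remaining terminal node k.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (i, j):
                d_ku = (d[i][k] + d[j][k] - d[i][j]) / 2
                d[u][k] = d_ku
                d[k][u] = d_ku
        # Prune i and j; u becomes a terminal node of the reduced tree.
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    # Two nodes remain: their distance is the length of the last branch.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical additive distances (see the lead-in for the generating tree).
names = ["A", "B", "C", "D"]
dist = {
    "A": {"A": 0.0, "B": 3.0, "C": 8.0, "D": 9.0},
    "B": {"A": 3.0, "B": 0.0, "C": 9.0, "D": 10.0},
    "C": {"A": 8.0, "B": 9.0, "C": 0.0, "D": 9.0},
    "D": {"A": 9.0, "B": 10.0, "C": 9.0, "D": 0.0},
}

for a, b, length in neighbor_joining(names, dist):
    print(a, b, length)
```

Because the toy distances are exactly additive, the generating branch lengths are recovered. In the first round the pairs (A, B) and (C, D) tie for the minimum rate-corrected distance, which is expected for any four-node table; this sketch simply takes the first tied pair.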
property of these corrected distances is that they are negative; therefore, finding the minimum distance means finding the most negative value. In the first table, the minimum value is the -0.5689 relating Amo and Lvi. Both this value and the corresponding uncorrected distance, 0.2795, are in boldface. Thus, Amo and Lvi are joined to one another and to the rest of the taxa through a new node, called node 1 in this example. The two lines below the table illustrate the calculation of the branch lengths from the two taxa to the node. Amo and Lvi are then removed from the distance table, and the distances from node 1 to the remaining taxa are calculated using equation (43). For example, the Bsu to node 1 distance is (0.2147 + 0.3091 - 0.2795)/2 = 0.1222. The second table, which now relates only four terminal nodes, is treated just as the first table. Looking at the corrected distances, we find two pairs with the lowest value, -0.4278. This is not a coincidence: if Bsu and node 1 are sister nodes, then Bst and Mlu must also be sister groups. (If this observation is unclear, try drawing the unrooted tree of four taxa.) The remaining arithmetic will yield identical trees regardless of which of these two pairs are joined at this step. In this example, node 2 is added to the tree, joining Bsu, node 1, and the rest of the tree. The branch lengths from Bsu and node 1 to node 2 are calculated below the table. The third table eliminates Bsu and node 1, and adds node 2. In this table, which relates three peripheral nodes, all three rate-corrected distances are identical. As in the previous step, this result is not a coincidence: only one possible unrooted tree can link three taxa. The choice of the pair to be joined is arbitrary; the ultimate outcome will be the same. Adding node 3 to the tree so that it links Bst and node 2 to the rest of the tree (which is only Mlu at this point) gives one more pair of branch lengths and a "tree" containing node 3 and Mlu. Their pairwise distance is used directly as the length of the segment joining them. The tree is completed. The results are shown in Figure 15A.

As the neighbor-joining algorithm seeks to represent the data by an additive tree, it can assign a negative length to a branch. Kuhner and Felsenstein (1994) modified the algorithm so that when a negative branch length occurred, it was set to zero, and the difference was transferred to the adjacent branch length so that the total distance between an adjacent pair of terminal nodes was unaffected. This change does not alter the topology of the tree found by the algorithm; it just guarantees non-negativity of branch lengths (e.g., for interpreting branch lengths as estimated numbers of substitutions).

Neighbor joining is classified as an algorithmic method because it constructs only one tree and does not explicitly optimize any objective function (the branch-length estimates from neighbor joining are not, in general, optimal for the minimum evolution criterion). We believe that it should be thought of as a means of getting a starting tree for more thorough searches using branch swapping under the minimum evolution or other additive-tree criteria, not as a method for choosing a final tree.

SPLIT DECOMPOSITION All of the methods described above will select a tree regardless of how non-treelike the data appear. When the data do not conform to a treelike model, criterion-based methods may provide some indication of a problem, for example, by discovering some nearly optimal trees that are quite different in topology. Algorithmic methods such as neighbor joining provide little or no indication that the data do not conform to the model. Split decomposition (Bandelt and Dress, 1992) is a method for graphically representing trends in distance data. The method detects well-supported groupings when they occur, but also identifies conflicting (incompatible) groups that may also have strong support in the data. These conflicts can arise from sources such as inadequate correction for superimposed changes in the distance transformation, convergence driven by natural selection, or reticulate evolution. We will not give a complete description of this method, but will outline the basic ideas using a simple example.

The method is based on the four-point metric (formula 10) (Buneman, 1971), which states that if taxa $i$, $j$, $k$, and $l$ (a quartet) are related by a tree $((i, j), (k, l))$ and the distances are tree-additive, then the minimum sum will be $d_{ij} + d_{kl}$, while the larger sums $d_{ik} + d_{jl}$ and $d_{il} + d_{jk}$ will be equal. With real data (i.e., imperfectly additive distances), the relationship $d_{ik} + d_{jl} = d_{il} + d_{jk}$ will not hold. Although we could hope that $d_{ij} + d_{kl} < d_{ik} + d_{jl}$ and $d_{ij} + d_{kl} < d_{il} + d_{jk}$, which forms the basis of the Sattath-Tversky (1977) and Fitch (1981) "neighborliness" methods, even this relationship will usually be violated by some quartets. Split decomposition adopts the working assumption that at the very least, $d_{ij} + d_{kl}$ will not be the largest of the three sums. Usually, phylogenetic methods assume that if $d_{il} + d_{jk}$ exceeded both other sums, then there is no support in the data for the tree $((i, l), (j, k))$. However, we can also ask whether there is relatively unambiguous support for one of the other two trees. For example, if $d_{ij} + d_{kl}$ and $d_{ik} + d_{jl}$ are nearly equal, but both are distinctly smaller than $d_{il} + d_{jk}$, conflicting support is evident. The closer one of these two sums approaches $d_{il} + d_{jk}$, the more consistent is the support for the tree corresponding to the other sum.

Figure 31 (A) Distance matrix for split decomposition example. (B) Graphical representation (network) of splits implied by matrix (A). (C) Poisson-corrected distance matrix. (D) Correct tree inferred from matrix (C).

We illustrate this procedure using the hypothetical example of Figure 31. The distances in Figure 31A are the observed or uncorrected distances that would be expected from the example used to illustrate the Hadamard conjugation (i.e., calculated using the relationship $d = (1 - r)/2$; see equation 31). The three relevant distance sums are:

Thus, we reject the tree ((1,4),(2,3)) and calculate an isolation index representing support for each of the other partitions (splits) as

The observation that support for the ((1,2),(3,4)) split is nearly half that of the support for the ((1,3),(2,4)) split suggests that there is conflicting support for two different groupings in the data set. This conflict is represented by drawing the tree as a network showing the amount of support for each of the two supported groupings (Figure 31B). A standard tree-building method such as neighbor joining would, in contrast, select the tree ((1,3),(2,4)) but give no indication of the support in the data set for the alternative tree ((1,2),(3,4)).

In this example, we can identify the cause of the conflict as failure to account for superimposed changes, which in this case would cause selection of an incorrect tree using neighbor joining or other additive-tree methods. However, using the standard Poisson correction (equation 27), we can obtain the corrected distance matrix shown in Figure 31C. (Note that the elements of this matrix are equal to one-half of the appropriate elements in the corrected generalized distance vector $\rho$ of equation 30.) For this corrected matrix, we have

When split decomposition is performed using the corrected distances, the box in Figure 31B indicating conflicting support disappears because $|(d_{14} + d_{23}) - (d_{13} + d_{24})|/2 = 0$, and the correct tree is inferred (Figure 31D).

For a tree of more than four taxa, the deviation from the additive four-point metric condition is measured for all possible subsets of four taxa. Bandelt and Dress (1992) showed that only a certain number of the implied splits can be portrayed on a planar graph (the split decomposable portion); the proportion which cannot is referred to as the split-prime residue. Bandelt and Dress (1992) suggested that the majority of the random noise contained in a data set is transferred to the split-prime residue (which also contains some systematic biases that are only locally uniform in their direction). Remaining random noise and systematic error is retained in the split-decomposable component and is observed on the resulting network as incompatibilities between splits (or unresolved nodes; see below).

When using split decomposition on substantial numbers of taxa, the resulting graph often appears more like an unresolved tree than a network with many boxes. Distant outgroups, for example, can show large random fluctuations and also different systematic biases, tending to hide the information on ingroup systematic bias (as all three quartet relations may be optimal depending on which taxa are used). When this happens, localized (but possibly strong) systematic error is lost in the split-prime residue and the graph loses both "boxiness" and resolution. One solution to this problem is to look for systematic errors by restricting the analysis to smaller subsets of taxa (4-10).

Because it is based on distances and not characters, split decomposition by itself does not allow one to detect when conflicting splits are due to events such as horizontal transfer of DNA or recombination. Such claims should be evaluated with more sensitive character- and sequence-based methods (e.g., Stephens, 1985; Hein, 1990a, 1993). A more straightforward use of the method is in the choice of a distance transformation (e.g., allowing more substitution parameters, unequal rates across sites, and/or unequal base compositions). Split decomposition can give some idea of whether these transformations are improving the "treelikeness" of the graph or making it worse (visualized as a more "boxy" network; e.g., see Lockhart et al., 1995b).

Split decomposition analysis will not necessarily detect some kinds of departures from predictions of a model, again because we are starting from distances rather than characters. For example, unlike the Hadamard conjugation, split decomposition will not recognize an excess of patterns supporting all three four-taxon trees, as would happen if there were more superimposed changes than the model predicts. Like the Hadamard conjugation, we need a means of determining whether a conflicting "signal" is really present or is simply due to sampling error causing inequality of $d_{ik} + d_{jl} = d_{il} + d_{jk}$ by chance. Unfortunately, this question has received little attention, but with small data sets it is possible to determine analytically how many standard deviations separate the three sums of distances. A bootstrap approach (see below) to assessing the reliability of features in the split decomposition graph is also feasible, but will probably be conservative. The relationship between split decomposition and the distance Hadamard is not well understood; both methods should be considered useful because they give different insights.

METHODS BASED ON A RELAXED FOUR-POINT METRIC The methods of Sattath and Tversky (1977) and Fitch (1981) are also based on a relaxation of the four-point metric condition of Buneman (1971). However, they are based on a somewhat stricter criterion than split decomposition. These methods operate by creating a similarity matrix $s_{ij}$ that counts the number of times each pair of taxa $i$ and
$j$ satisfy the conditions $d_{ij} + d_{kl} < d_{ik} + d_{jl}$ and $d_{ij} + d_{kl} < d_{il} + d_{jk}$ over all pairs $(k, l)$. This matrix forms the basis for a cluster analysis. We begin by choosing the pair $(i, j)$ for which $s_{ij}$ is maximal, and form the corresponding cluster. These two taxa are merged into a single object and distances are recalculated as in UPGMA. The quartet-based scoring of pairs of taxa is then repeated, and the cycle continues until all taxa have been clustered. (The Sattath-Tversky and Fitch methods differ slightly in the details of how averaging is performed in preparation for the next clustering cycle.)

The Sattath and Tversky (1977) and Fitch (1981) methods have not been widely used. Furthermore, simulations by Charleston (1994) indicate that these methods (and other transformed distance methods, such as that of W.-H. Li, 1981) are less effective in identifying the correct tree than methods such as neighbor joining or closest tree (applied to the distance Hadamard). They are also more computationally intensive (requiring time proportional to $T^5$, as opposed to $T^3$ for neighbor joining).

DISTANCE WAGNER AND RELATED METHODS The conceptual perspective of Fitch-Margoliash methods and neighbor joining is that the estimated pairwise distances are to be fit to an additive tree, with some of the estimates (observations) being greater than the true values and some of them being smaller than the true values. An alternative view is one in which the sequence (or other) differences are not corrected for superimposed changes and thus provide lower bounds for the actual evolutionary distance. In this framework, the length of the path connecting any pair of taxa must equal or exceed the corresponding observed distance. In analogy to character-based parsimony, the desired tree is the one that minimizes the total of all branch lengths in the tree, while using the pairwise distances as lower bounds on the path-length distances. Beyer et al. (1974) and Waterman et al. (1977) have described exact methods for accomplishing the desired minimization on a given tree. Farris's (1972) distance Wagner algorithm can be thought of as a heuristic approach to the same problem. Modifications to the distance Wagner procedure have subsequently been proposed by Swofford (1981) and Tateno et al. (1982). As with neighbor joining, if the experimentally determined distances are additive, then the optimal solution will always be found. However, when the fit is not exact, the behavior is not intuitively obvious.

RELIABILITY OF INFERRED TREES

Systematic Versus Random Error

In any statistical analysis, two kinds of error (systematic and random) need to be distinguished. We define random error as deviation between a parameter of a population and an estimate of that parameter, due strictly to a limited sample size used to make the estimate. By definition, random error disappears in infinite samples. In contrast, systematic error is deviation between a parameter of a population and an estimate of that parameter, due to incorrect assumptions in the estimation method. Systematic error persists (and may intensify) as sample sizes increase and become infinite.

Throughout this chapter, we have discussed various conditions under which systematic error arises in phylogenetic analyses. In general, systematic error occurs when the evolutionary process violates the assumptions of a phylogenetic method in a critical way. Under these conditions, a bias may be introduced into the evaluation of alternative phylogenies, favoring some branching patterns and decreasing the support for others. If the bias becomes sufficiently great, it may overcome the legitimate support for the correct tree and lead the researcher to an incorrect conclusion. Because the effect is systematic, the addition of more data will tend to solidify the incorrect conclusion (and the method is said to be inconsistent or positively misleading under these conditions; Felsenstein, 1978a). For a mistake to occur in phylogenetic estimation of the branching order as a result of systematic error, the magnitude of the bias must exceed the valid support for the correct tree. Furthermore, the bias must be in
the direction of an erroneous tree, as it is possible for systematic bias to increase apparent support for the historically correct tree. Thus, the presence of a bias does not necessarily lead to wrong answers, but it does cast doubt upon the validity of the inference process.

Even if evolution occurred exactly as assumed by a particular analytical method, an incorrect tree may be inferred with finite data due to chance events (which introduce random error). For example, convergent substitutions might be expected to occur (in a given situation) only once per 100 nucleotide sites, but because of sampling error we might observe three convergent substitutions in a single sample of 100 nucleotide sites. This type of error occurs even when the presumed model is correct. By analogy, the observation of 20 consecutive "heads" in a coin-tossing experiment might lead us to conclude that the coin is two-headed, but of course this outcome has a finite probability of occurring (approximately $10^{-6}$) even if the coin is fair. In inferential statistics, we generally choose a certain probability (typically 0.05) below which an outcome is improbable enough (assuming that random error accounts for the deviation) to warrant rejection of a null hypothesis.

Random error does not necessarily produce a random effect on the outcome of an analysis, however. For instance, for many methods of calculating pairwise distances, small distances and large distances are affected differently by sampling error. Under some conditions, this leads to a sample-size-dependent bias in methods that are nonetheless consistent for the model under consideration (see Hillis et al., 1994b for an example). In other words, even if a method is consistent and will lead to the correct tree if given an infinite amount of data, it nonetheless may be biased with finite data, even if its assumptions are met perfectly.

Realistically, both random and systematic error are expected in any given study. Random error occurs in any finite data set (since the expected proportions of different character patterns are real numbers), so the sensitivity of the results to the presence of random error needs to be assessed. Because systematic error is expected when the assumptions of a method are violated, the assumptions should be tested, the effects of potential sources of bias should be explored, and methods should be used to reduce the effects of systematic error in the analysis.

Systematic Error

Conditions That Lead to Systematic Error

Fortunately, the situations likely to lead to systematic error under most of the methods we have described are relatively well understood. We have discussed some of these conditions in the sections describing each of the methods; here we present a brief review for the major classes of analysis.

GENERAL ASSUMPTIONS Almost all methods assume that the characters analyzed are vertically inherited (rather than horizontally acquired). This assumption is usually met for molecular data, and so probably only rarely introduces systematic error into molecular systematic studies. The other general assumption of most methods is that characters are independent with respect to probability of change. If, for example, a change in one nucleotide position makes a change in a second position more likely, then this assumption is violated (see Wheeler and Honeycutt, 1988, and Korber et al., 1993 for examples). If methods do not explicitly account for this non-independence, it may lead to systematic error.

PARSIMONY If the number of actual sequence changes per sequence position in a macromolecule is always small (zero or one), then parsimony will correctly reconstruct the phylogeny given enough data (Felsenstein, 1978a). As the number of changes increases, the proportion of those changes that are homoplastic (parallel, convergent, or reversed) increases. If the tree is relatively dense (i.e., branch lengths are short enough so that the expected number of changes on any one branch is small), these homoplastic changes usually will be detected as such. However, parsimony analyses do not detect multiple changes on long unbranched lineages, thereby creating the potential for bias if a mixture of long and short branches are present in an analysis (Felsenstein, 1978a).
ADDITIVE-TREE TECHNIQUES The additive-tree techniques discussed in this chapter are free of systematic error if the distance data are additive (satisfy the four-point condition) and no distance values between sister taxa are missing from the data matrix. This internal consistency of the technique places the burden of accuracy on the estimation and transformation of the distance data as opposed to the actual tree inference procedure. Specifically, the model used to correct for superimposed changes must reflect the underlying evolutionary processes. To the extent that it does not, additive-tree methods are susceptible to systematic error.

MAXIMUM LIKELIHOOD If the model of evolution used to evaluate the likelihood of given trees does not reflect the actual evolutionary processes, then maximum likelihood analyses will be subject to systematic error. In general, maximum likelihood appears to be more robust to violations of its assumptions than are additive-tree methods (Huelsenbeck, 1995b). In principle, maximum likelihood models can be made arbitrarily complex to account for particular evolutionary processes, but the cost in terms of computational time may be severe. Moreover, complex models may be more sensitive to random error than are simple models (because more parameters need to be estimated from the same amount of data).

CLUSTER ANALYSIS If the assumption of ultrametricity is satisfied and no distance values between sister taxa are missing from the data matrix, cluster analysis will be free of systematic error. However, if two lineages are not equally distant from a third, more diverged lineage (i.e., if the pairwise distances are not ultrametric), then systematic error will be introduced. As pointed out above, satisfaction of the three-point condition establishes that the distances are ultrametric. In practice, this condition is rarely satisfied by real data.

Recognizing Systematic Error

There is no foolproof method for identifying artifacts in phylogenetic trees that result from systematic error. There are, however, a few techniques that can help in evaluating the extent of systematic error, and for assessing the expected effects of identified systematic error.

TESTS OF MODEL FIT Often there are tradeoffs between model complexity (which provides consistency under a wide range of conditions) and both computational complexity and sensitivity to random error. Therefore, in using a method that assumes an explicit model of evolution, it is important to choose a model that is complex enough to explain the observed data, but not so complex as to be subject to impractically long computation or require impractically large data sets. Choosing a model, therefore, requires a test to compare the fit of one model of evolution against another for a particular data set. Furthermore, we need to know if the best model provides an adequate explanation of the observed data. Reeves (1992) and Goldman (1993a,b) have described tests for this purpose.

To compare two models of evolution, Goldman (1993a) suggested using the likelihood ratio test statistic, $\delta$:

$$\delta = 2(\ln L_1 - \ln L_0)$$

where $\ln L_1$ is the log likelihood under the more complex (parameter-rich) model and $\ln L_0$ is the log likelihood under the simpler model. This statistic will always take on a value greater than or equal to zero because the likelihood under the complex model will always be equal to or higher than the likelihood under the simple model. To test whether the more complex model provides a significantly better explanation of the observed data, Goldman (1993a) suggested that the null distribution of the statistic $\delta$ be determined using simulation. The tree and the parameters of the model are estimated under the null hypothesis that the simpler model of evolution is correct, and this estimated tree and parameterized model are then used to simulate many replicate data sets of the same size as the original. Maximum likelihood scores are then calculated under both the simple and complex models to produce a null distribution for the test statistic $\delta$. If $\delta$ (from the original data) is
greater than 95% of scores from the simulated data, then the simpler model of evolution is rejected. Note, however, that rejection of the null hypothesis only indicates that the simpler model is inadequate to explain the observations; it does not necessarily indicate that the more complex model is adequate. The more complex model is now the null model and is subject to further testing.

Typically, one can conduct tests to see if a given parameter that can be added to a model provides a significant improvement in the optimality score. For instance, many models assume a difference in the probabilities of transitions and transversions. To test if this parameter (transition:transversion ratio) is necessary, one could test the Kimura two-parameter model against the Jukes-Cantor one-parameter model of DNA substitution (see the section on "Maximum Likelihood"). In this example, the log likelihood under the Kimura model would be $\ln L_1$ and the log likelihood under the Jukes-Cantor model would be $\ln L_0$.

To test the adequacy of a given model of evolution, Goldman (1993a) suggested that the log likelihood under the multinomial distribution ($\ln L_1$) be tested against the model of interest ($\ln L_0$). This test is very stringent, however, and under a wide variety of circumstances the model of interest will be rejected as an "adequate" explanation of the observed data. This does not mean that the model is inadequate to provide a reasonable estimate of phylogeny, but it does mean that the model fails to provide a perfect description of the underlying evolutionary processes. Since we never expect models of evolution to be correct in every detail, the test is perhaps best used to estimate how far the assumed model deviates from the underlying processes. The greater the deviation, the more attention one should pay to discovering those aspects of the evolutionary process that have not been adequately incorporated into the model.

In applying the likelihood ratio test, the number of tests being conducted needs to be considered. For example, in comparing the likelihoods of ultrametric trees (i.e., assuming a "constant clock") to trees in which a given lineage is allowed to change at a different rate, it is tempting to perform the test on the most deviant lineages (those with the greatest and/or least total length in a rooted additive tree). Alternatively, some authors have simply varied the assumed rate for each branch or subtree, one after the other. In either case, the approach amounts to multiple hypothesis testing, and lowers the significance below that of a single likelihood-ratio test with the same value of $\delta$.

Another approach for testing model fit has been proposed by Rzhetsky and Nei (1995). They derive linear invariants that are independent of evolutionary time and phylogeny and reflect the constraints on a restricted model relative to more general time-reversible models. By testing whether the deviations of these invariants from their expected values are greater than would be expected by chance if a particular model were true, a test of whether that model is applicable to a particular data set is obtained. Goldman's (1993a) method has some theoretical advantages, but Rzhetsky and Nei's (1995) method is much more computationally feasible.

ASSESSING THE EFFECT OF A POTENTIAL BIAS In some cases, a model of evolution may be adequate for the majority of taxa, but not applicable to all taxa. For instance, if a model incorrectly assumes that the same equilibrium base frequencies exist in all lineages, then systematic error will be introduced into the analysis. The problem may be particularly severe if the differences in base composition do not follow phylogenetic lines. If base composition is affected by ecological or physiological factors, then the potential for convergence in base composition exists. For instance, Pettigrew (1994) argued that the metabolic constraints of flying bias the base composition of microchiropterans (the mostly small, echolocating bats) and the megachiropterans (flying foxes and their relatives) toward a higher AT content, and that this bias misleads phylogenetic analyses of many different mitochondrial and nuclear genes (an effect he called the "flying DNA hypothesis"). He argued that the numerous studies that support the monophyly of these two bat groups (e.g., Bennett et al., 1988; Adkins and Honeycutt, 1991; Mindell et al., 1991; Ammerman and Hillis, 1992; Bailey et al., 1992; Stanhope et al., 1992) can all be explained by this base compositional bias. Instead of bat monophyly, Pettigrew (1986, 1991a,b, 1994) has argued (primarily on the basis of neuroanatomy) that megachiropterans are more closely related to primates than to microchiropterans. Therefore, two

to the remaining eukaryotes. Furthermore, if invariant sites are taken into account, then the support shifts strongly in favor of microsporidians as the most basal eukaryotic lineage (Waddell, 1995).

Similar tests can be conducted to examine the potential effects of any hypothesized systematic bias. For example, both Gouy and Li (1989a) and
different explanations have been presented for Olsen and Woese (1989) have argued that if the
the apparent support from DNA sequences for tree of life proposed by Lake (1988)-in which Ar-
bat monophyly: either the two "bat" groups are chaea is paraphyletic or polyphyletic-were cor-
phylogenetically related, or the results are rect, a systematic bias due to "attraction" of long
accounted for by systematic bias. Van Den branches would not be sufficient to yield the trees
Bussche et al. (1996) have tested Pettigrew's fly- observed by the former groups (in which Archaea
ing DNA hypothesis for the relevant data sets is monophyletic). Gouy and Li (1989a) and Olsen
through simulation, and have shown that the and Woese (1989) interpreted these results as
support for bat monophyly cannot be explained grounds to reject the proposal of Lake as being in-
on the basis of base composition bias alone. Even consistent with their observations, a conclusion
if Pettigreds phylogenetic hypothesis is correct, that is contested by Lake (1990b).
and every substitution in the two bat lineages
went to an A or a T, then the bias would still not SENSITIVITY TO SPECIFIC TAXA IN THE TREE If the
be sufficient to explain the observed support for data and tree inference technique were ideal,
bat monophyly. Furthermore, analyses that are analyzing any two subsets of taxa would yield
better at taking different base composition congruent trees (i.e., the trees would be identical
among lineages into account (such as LogDet after pruning taxa absent from one or both trees).
analyses) still support bat rnonophyly. Therefore, In practice tlus is not the case. (Otherwise, find-
the analyses show that the particular bias is not a ing optimal trees would be almost trivial, since
sufficient ex~lanationfor the data. This does not
I
constructing a tree by sequential addition of taxa
mean that the data have no systematic bias, but would always lead directly to the globally opti-
it does mean that the hypothesized bias is not an mal tree, regardless of the order of addition.)
explanation for the results in this case. Both systematic and random error can distort the
In other cases, base composition does have a tree so that the inferred branching order is
demonstrable effect on phylogenetic analyses (see dependent on the taxa included. Because the
Rzhetsky and Nei, 1995, for a test to detect signif- to& error contains both systematic and random
icant base compositional differences). For in- components, variation with the sampling of taxa
stance, Leipe et al. (1993) and Hasegawa and does not necessarily indicate an effect of system-
IIashimoto (1993) have suggested that early eu- atic error, but it is suggestive. Most sources of
karyote evolution is especially difficult to analyze systematic error are expected to increase with
because of unequal base composition (e.g., the Gi- branch length; therefore, if the changes in tree
ardia genome is about 70% G+C, whereas the av- topology are specific to the most diverged taxa,
erage microsporidian genome is 35% G+C). Ob- then there is again reason to suspect that system-
served distances, transformed distances, standard atic error is having a significant effect on the
parsimony, and current maximum likelihood analysis.
models all support Giardia as the sister lineage to Lanyon (1985) described a jackknife method
other eukaryotes with high bootstrap support that evaluates taxon stability by computing T
[based on Gouy and Li's (1989) small-subunit trees, each time leaving out one taxon. By com-
rIWA data set]: However, phylogenetic analysis puting a strict consensus of these trees using a
of LogDet distances shows equal support for ei- method that allows different subsets of taxa to be
ther ~ i a r d i aor microsporidians as the sister group contained on each of the rival trees, the investiga-
498 Chapter 11 / Swofford,Olsen,Waddell b Hillis
tor can determine which relationslups are consis- If a conflict cannot be expIained by random
tent. Felsenstein (1988a) suggested that this error associated with finite sampling, then one of
method may not have the properties of a statisti- the following possible explanations should be
cally valid jackknifing procedure, but it nonethe- considered: the inadvertent use of non-ortholo-
less provides a useful index of which groups are gous genes (e.g., a tree with mouse and rabbit a-
most stable to taxon selection. globin and rat ,!?-globin; paralogy); reticulation of
lineages due to hybridization or lateral gene
CONTRIBUTION OF INDIVIDUAL TAXA TO THE OPTI- transfer (xenology); or the presence of significant
MALIm CRITERION If the placement of a particu- levels of systematic error (leading to inconsistent
lar taxon is problematic (due to systematic error), conditions) in one or both trees.
removal of that taxon from the analysis will fre-
quently make a disproportiolzate clzange in a NONPARAMETRIC APPROACHES Nonparametric
measure of tree quality, such as the least-squares tests may provide an additional source of guid-
criterion in a distance tree, the estimated homo- ance in evaIuating a tree inferred fro111 distance
plasy of a tree derived by parsimony, or the data (or for which pairwise distance estimates
overall likelihood ratio statistic 6. However, such can be generated from the character data), In
measures are correlated with the number of taxa practice, the usefulness of these tests is depen-
in an analysis, so one must confirm that the dent on the details of the tree inferred, and in
change in a given statistic is significantly greater many circumstances the tests may not be able to
than would be predicted by the removal of an distinguish alternatives. An illustration of a case
average taxon (in many cases this will require a in which they might be useful is provided by the
simulation study). trees in Figure 32. A comparison of the paths
from A and B to D yields the expectation that dAD
INFERENCES BASED O N D I F F E R E N T MOLECULES > dgD for all three trees. Let us assume that this
Pl~ylogeneticrelationships inferred from two or trend is significantly supported by the data (for
more different molecules sl~ould,in theory, be example, the trend is verified by bootstrap sam-
congruent if the molecules had the same overall plings of sequence positions). If we now consider
history. If the inferred reIationships are different, the relationships ol C to A and B, we expect that
the reasons for the differences should be investi- dAC > dsc in trees 1 and 3 (an expectation that
gated (Bull et al., 1993b).It is important to avoid could also be true of a minor variant of tree 2),
confusing differences between the optimal trees whereas dBc > dAC is only consistent with tree 2.
with the conclusion that the results are signifi- Again, we can examine the data directly to see if
cantly incongruent: the former might simgy be one of these inequalities is significantly support-
d u e to random errors in one or both trees, ed. In particular, if we observe that dBc > dAC,
whereas the latter asserts tlze existence of a sig- then we must conclude that trees 1 and 3 are
nificant conflict. One method for deciding incorrect, leaving tree 2 by elimination. Yet, if
between these two possibilities is to fit each data tree 2 were historically correct, systematic error
set to the tree(s) derived froin the other data could have biased the tree inference procedure to
set(s). Most modern programs allow the input of group the long branches leading to C and D,
user-defined trees for evaluation under a partic- leading to the incorrect choice of tree 1. The rea-
ular optimality criterion. For example, suppose son that it is possible to infer tree 2 from the data
tree 1 is optimal for data set A and tree 2 is opti- and yet to find certain distances significantly
mal for data set B. If tree 2 is nearly as good as inconsistent with that tree lies in the particular
tree 1 for data set A, and if tree 1 is nearly as ratios of branches and in the fact that the latter
good as tree 2 for data set B, then there is no-real test does not need to examine the most underes-
conflict, just inadequate information. This result timated distance (i.e., that separating C and Dl.
can sometimes occur even though tlze two trees In contrast, the tree inference procedures dis-
differ substantially in their topologies! cussed would include the distance from C to D
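The bootstrap check of a distance inequality mentioned in the nonparametric discussion (for example, whether the distance from A to C exceeds that from B to C) can be illustrated with a minimal sketch. It assumes aligned sequences of equal length and uses the uncorrected proportion of differing sites as the distance; the text does not prescribe a particular distance measure, and the function names `p_distance` and `bootstrap_support` are hypothetical.

```python
import random


def p_distance(seq_x, seq_y, columns):
    """Proportion of the sampled alignment columns at which two sequences differ."""
    differences = sum(1 for c in columns if seq_x[c] != seq_y[c])
    return differences / len(columns)


def bootstrap_support(seq_a, seq_b, seq_c, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which d(A, C) > d(B, C).

    Alignment columns are resampled with replacement, so each replicate
    has the same number of sites as the original alignment.
    """
    n_sites = len(seq_c)
    if n_sites == 0 or len(seq_a) != n_sites or len(seq_b) != n_sites:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    hits = 0
    for _ in range(replicates):
        columns = [rng.randrange(n_sites) for _ in range(n_sites)]
        if p_distance(seq_a, seq_c, columns) > p_distance(seq_b, seq_c, columns):
            hits += 1
    return hits / replicates
```

A fraction near 1 would suggest that the inequality is well supported by the sequence positions, and a fraction near 0 that the reverse inequality is. What threshold counts as significant is left open here, as it is in the text.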
Figure 32 Three alternative trees (Tree 1, Tree 2, Tree 3) relating four taxa that can be distinguished by a non-parametric test on the distance data. See text.

(directly or indirectly) and potentially be misled by this value.

Reducing Systematic Error

Several strategies are available to minimize systematic error and its effects on a phylogenetic analysis.

CHANGING THE ASSUMPTIONS One obvious way to reduce the chance of having systematic error lead to inconsistency is to change the assumptions of the analysis to better match the observed data (e.g., see "Tests of Model Fit," above). One example has already been given: if base composition is thought to vary significantly among taxa, then pairwise distances can be corrected using the LogDet transformation. However, the source of the systematic error may not always be so obvious, or a method may not have been devised for dealing with an identified bias. The following techniques may be useful in these cases.

REMOVING LONG BRANCHES A practical consideration in the inference of trees from pairwise distance data is that the effects of systematic error are expected to be worse with larger than with smaller distances. As noted in the discussion of the Fitch-Margoliash technique, pairwise distance methods include all measurements in the calculations as though they were independent. Therefore, having many long distances in a tree will tend to compound errors. In order to work around this problem, the use of outgroup sequences should be kept to a minimum when using a pairwise distance method. However, the substitution of different outgroup taxa, one or two at a time, can still be used to evaluate the reliability in the position of the root.

Ironically, the effect of multiple outgroups in parsimony is almost exactly the opposite. The use of multiple species of an outgroup taxon will tend to divide the longest branch in the tree, thereby decreasing its tendency to attract other long branches (Felsenstein, 1978a; Hendy and Penny, 1989; A.B. Smith, 1994). To be most effective, however, additional outgroups should be chosen so as to divide long branches reasonably evenly; adding an extremely close relative of a very distant outgroup will gain little. Of course, the benefits of adding additional taxa are not limited to the outgroup. Long branches (sparse regions) within the ingroup can also contribute to systematic error, and multiple substitutions are more easily detected in dense regions. A somewhat paradoxical phenomenon results. With large numbers of taxa, correctly inferring every aspect of the true topology is extremely difficult, but if we were interested in the relationships of only, say, four taxa, we would be much better off to compute a tree for 20 taxa (interspersed among the four of interest) and prune 16 of them from the tree than to compute the tree for only the four taxa.

ELIMINATING UNRELIABLE DATA Another practical consideration concerns the fact that a branch is long because a large number of substitutions have occurred in the sequences being compared. Limiting an analysis to those sequence regions in which positional homology is most certain tends to exclude the most variable positions in sequences, thereby shortening branches and decreasing the sensitivity of the analysis to multiple substitutions. This concept can be pushed
further. If hypervariable regions can be identified in a set of sequences, then they might be eliminated from the analysis, even if their positional homology is not in doubt. This phenomenon provides one motivation for character weighting.

Subjective elimination of data is sometimes criticized as being too arbitrary (e.g., Gatesy et al., 1993). Although we share the concerns of these authors, we take the position that data are excluded from the moment one chooses a particular gene, set of genes, or gene region to use in a systematic study. Most researchers would agree that certain genes are evolving at an inappropriate rate for the level of a study, and would avoid those genes in an attempt to minimize saturation effects and other problems (see, e.g., Simon et al., 1994). It seems unreasonable to argue that just because sequence data have been obtained (perhaps even accidentally) for a region that is evolving too rapidly to be reliable in a study, we are forced to retain them at all costs. It is unrealistic to think that subjectivity in a molecular systematic study can be entirely avoided; for example, one could almost always sequence additional taxa relevant to a question, and it is a subjective decision when to stop. We believe that the benefits of excluding clearly unreliable regions, however subjectively determined, outweigh the dangers.

The above paragraph notwithstanding, we look forward to the development of methods that allow a more objective assessment of which positions in a sequence are worth retaining. One promising approach is the elision method of W.C. Wheeler et al. (1995), which attempts to identify stable versus unstable alignment regions by asking which positions align consistently over a wide range of alignment parameters.

CHARACTER WEIGHTING Obviously, all characters are not equally informative with respect to the evolutionary history of the taxa under study. Some characters are both informative and reliable; they are telling us the truth about their past. Other characters may be reliable but uninformative: although they are not actively misleading us, they are not telling us anything very useful either. The reason that phylogenetic analysis is so difficult lies in a third category of characters: those that are misinformative. These observations lead us to the rationale for character weighting. If we could somehow deduce which characters were in fact the unreliable ones, the task of reconstructing evolutionary trees would be greatly simplified, because we could minimize their influence in the analysis by giving them less weight.

Identification of unreliable characters is also an effective way to avoid systematic error. By assigning lower weight to the characters that either violate the assumptions of a method or are known to predispose the method to inconsistency, we can minimize the likelihood that systematic error will occur. For instance, parsimony methods are much more likely to be consistent if character change is low, and consequently work best if the events being minimized (i.e., homoplastic changes) are in fact the rare events. If the rapidly evolving characters are recognized as such and given little weight in the analysis, the problem of attraction of long branches due to chance convergences will be minimized. Unfortunately, beyond the use of alignment difficulty as a criterion for macromolecular sequences, methods for assessment of character reliability have received little attention.

One extreme form of weighting is the elimination of characters, as discussed above. By assigning one set of characters the maximum weight (unity) and another set of characters the minimum weight (zero), we essentially assert that there are two classes of characters, one comprising characters that, at least on an a priori basis, are all equally reliable, the other containing characters that are worthless for the analysis in question. If we believed that characters actually behaved in this way, we would use a method of analysis known as character compatibility (Felsenstein, 1981b), which searches for the largest "clique": a set of mutually compatible characters that can all fit on the same evolutionary tree without homoplasy (e.g., Le Quesne, 1982; Estabrook, 1983). Compatibility methods are no longer in widespread use, probably because of their implicit adherence to an unrealistic model that asserts that once a character has been excluded from the largest clique, it no longer conveys any useful information whatsoever.

An approach that uses compatibility as an objective weighting criterion (rather than to infer phylogeny directly) was developed by Penny and Hendy (1985, 1986). Sharkey (1989), apparently unaware of the work of Penny and Hendy, described a related approach, but limited to binary characters. The strategy of these workers is to count the observed number of incompatibilities ($O_j$) between each character ($j$) and each other character. (For methods to test the pairwise compatibility of unordered multistate characters, see Estabrook and Landrum, 1975; Fitch, 1975, 1977; Sneath et al., 1975.) To convert this number to a weight, Penny and Hendy recommended computing the number of incompatibilities expected by chance ($E_j$) if the distribution of states for each character were independent of that for other characters (i.e., free of any non-independence imposed by their evolution on a common phylogeny). Penny and Hendy (1985) tested several weighting functions, but seem to have settled on the simple relationship

$$ w_j = 1 - \frac{O_j}{E_j} $$

Thus, a character that is compatible with all other characters is assigned the maximum weight (unity), whereas a character that is incompatible with as many characters as would be expected by chance alone is assigned zero weight. More importantly, characters that fall between these two extremes are assigned intermediate weights. (Note that if the observed number of incompatibilities actually exceeded the expected number, a negative weight would be assigned unless the weights are constrained to be non-negative.) This method of weighting thus uses hierarchical structure in the data to assign weights, but does not base weights on any specific tree. Unfortunately, these methods remain relatively untested.

Another approach to character weighting is to estimate optimal weights by successive approximation (Farris, 1969). An initial set of weights (perhaps uniform weights) is used to obtain an initial estimate of the tree. From some measure of the fit of the characters to this tree, a new set of weights is derived, which are then used to estimate a second tree. The iterative rederivation of weights and recomputation of trees continues until the solution stabilizes (i.e., the tree derived from a new set of weights is identical to the tree that was used to derive those weights). Farris (1969) used reweighting functions based on the consistency index (Kluge and Farris, 1969), defined as $r_j / l_j$, where $r_j$ is the range of character $j$ (defined as the minimum number of steps that the character would require on any possible tree) and $l_j$ is the length required by the character on the tree at hand. Thus, characters that change the minimum possible number of times have perfect consistency (1.0), whereas characters that change more often have lower consistencies (approaching zero in the limit). Farris also noted that more extreme forms of weighting might be more effective than the use of the consistency index in successive weighting procedures.

One danger inherent in any successive approximations (a posteriori) approach is the likelihood of the search becoming trapped in a local optimum that depends on the starting tree (see also Neff, 1986). It is easy to see that a character that is inconsistent with the initial tree and downweighted as a result will have less influence in the second iteration than it did in the first. But there are some trees on which the character would have been perfectly consistent, and would therefore have been given maximum weight. Farris (1969) tested the effectiveness of his successive approximations method by adding random noise to a data set containing otherwise perfectly compatible characters and testing whether the noisy characters were in fact the ones assigned little weight in successive iterations (they were). We suggest that one not become overconfident upon seeing this kind of result, however, as characters in real data sets do not fall cleanly into "completely reliable" versus "random noise" categories. Nonetheless, recent model-based simulation studies and studies of well-supported phylogenies (J. McGuire and J. Huelsenbeck, personal communication) indicate that successive approximation approaches can be effective, although not as they are usually implemented. McGuire and Huelsenbeck observed little or no improvement in accuracy over the initial parsimony estimate when successive weighting was performed using a character's average consistency index across all of the most-parsimonious trees as the reweighting criterion. However, they found that successive weighting
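The two weighting quantities described above can be illustrated with a short sketch. It assumes that the compatibility-based weight takes the form 1 - O/E, which matches the behavior stated in the text (unity with no incompatibilities, zero at the chance expectation, negative beyond it), and that the consistency index is the minimum possible number of steps divided by the steps required on the tree at hand. The function names and the `non_negative` switch are hypothetical, not taken from any published program.

```python
def compatibility_weight(observed, expected, non_negative=True):
    """Weight of one character from its incompatibility counts: 1 - O / E.

    observed: number of other characters it is incompatible with.
    expected: number of incompatibilities expected by chance (must be > 0).
    non_negative: if True, clamp negative weights to zero (an assumed option;
    the text notes only that weights may be constrained in this way).
    """
    if expected <= 0:
        raise ValueError("expected number of incompatibilities must be positive")
    weight = 1.0 - observed / expected
    return max(weight, 0.0) if non_negative else weight


def consistency_index(min_steps, observed_steps):
    """Consistency index of one character on a given tree: r / l."""
    if min_steps <= 0 or observed_steps < min_steps:
        raise ValueError("need 0 < min_steps <= observed_steps")
    return min_steps / observed_steps


# A character incompatible with half as many characters as expected by chance
# receives an intermediate weight.
half_weight = compatibility_weight(observed=5, expected=10)

# A binary character (minimum one step) that changes four times on the tree.
ci = consistency_index(min_steps=1, observed_steps=4)
```

In a successive approximations scheme, a value such as `ci` (or a more extreme function of it) would be recomputed for every character after each round of tree estimation and used as that character's weight in the next round.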
did increase the accuracy of the estimated phylogeny when used with more extreme forms of weighting (such as the inverse of the total number of character-state changes raised to the tenth power) and when the best observed index value for a character across all of the most-parsimonious trees is used (as suggested by Campbell and Frost, 1993).

Another problem with successive weighting approaches similar to Farris's (1969) method is that there is no objective criterion for comparing any two trees (D.R. Maddison, 1990). That is, if a tree is found to be optimal by the successive approximations algorithm, one cannot say how much worse (if at all) an alternative tree is. Goloboff (1993) has developed a method for weighting characters based on their implied homoplasy that avoids this limitation by defining a weighting function and optimality criterion that can be evaluated for any tree and compared across trees. The idea is promising, although the method needs to be more thoroughly evaluated.

Simon et al. (1994) have written an excellent review of character-weighting strategies that is both more data-oriented and more comprehensive than the discussion here; readers are urged to consult their paper for additional insights into issues concerning weighting in distance and character-based contexts.

CHARACTER-STATE WEIGHTING In character weighting, entire characters (e.g., nucleotide positions in a gene) are weighted differentially. In contrast, character-state weighting provides different weights for different character-state transformations within a character (see the section on "Generalized Parsimony"). Differential character-state weighting provides a mechanism for increasing both the consistency and the efficiency of parsimony analyses when the relative probabilities of character-state transformations differ, especially at high rates of evolution (Huelsenbeck and Hillis, 1993; Hillis et al., 1994a,b). The method works by giving greater weight to rare changes, which are less likely to be homoplastic (especially at overall high rates of character change) and hence more likely to be reflective of phylogenetic history (Williams and Fitch, 1989).

Character-state weights can be implemented by use of the step matrices described in the section on "Generalized Parsimony." Several methods have been proposed for determining appropriate weights. If we knew the actual probabilities for each type of transformation (e.g., for DNA sequence data, A → C, A → G, A → T, etc.), then an appropriate transformation would be

$$ C_{ij} = -\ln P_{ij} $$

where $C_{ij}$ is the cost of a state change from state $i$ to state $j$ and $P_{ij}$ is the relative probability that state $i$ will change to state $j$ across a given branch or tree (Felsenstein, 1981c; W.C. Wheeler, 1990a). If the entire probability matrix of state changes is converted in this way into a step matrix of change costs (including the diagonals, which represent the probability that a state will not change), then the most-parsimonious reconstructions of ancestral states represent maximum Bayesian probability estimates for these states (D.R. Maddison, 1990; Maddison and Maddison, 1992).

How can the relative probability matrix of state changes be estimated? If we can assume a constancy of processes across characters, then it is possible to estimate the probability matrix from the observed data. For instance, with DNA sequences, we might assume that the relative probabilities of substitutions are affected in the same way across sites by exposure to the same mutagens and repair mechanisms. Given this assumption, one way to estimate relative probabilities of change is to base the calculation on the ratio of expected to observed changes in all pairwise comparisons of the sequences, taking the relative base frequencies of each base into account (Thomas and Beckenbach, 1989; Knight and Mindell, 1993). However, the various pairwise comparisons are not evolutionarily independent, so the calculations will be biased by the underlying phylogeny. One way to account for this non-independence is to reconstruct all most-parsimonious ancestral states in an initial estimate of the tree, and then use this information to produce a change-and-stasis matrix (Maddison and Maddison, 1992). Of course, the reconstruction requires an initial tree, which (if estimated by parsimony) requires an initial matrix of change costs, so the estimate may be biased by the initial assumptions. In practice, the relative frequencies of the various changes usually are not biased greatly by the initial tree, and subsequent rounds of tree estimation can also involve reweighting of the character-state changes until a stable solution is reached (a procedure called dynamic weighting by Williams and Fitch, 1989). Alternatively, matrices of change costs can be estimated for several alternative hypotheses to examine directly the extent to which the starting tree biases the estimates of change costs.

One problem with the above approach is that the most-parsimonious reconstructions are not the only changes possible. Ideally the relative probability matrix should be based on summed probabilities across all possible character-state histories. This can be accomplished using maximum likelihood (e.g., Z. Yang, 1994a; Z. Yang et al., 1994). But if the relative probability matrix is estimated using likelihood methods, then what is the advantage of using weighted parsimony methods over an explicit likelihood estimation procedure to estimate the tree? One of the principal advantages is one of computation time: complex maximum likelihood models typically constrain an investigator's ability to search tree space thoroughly, so only a very small portion of the potential solution space can be explored. Weighted parsimony procedures often provide a close approximation to the likelihood solutions, and the calculations are much faster. Thus, one strategy is to estimate the relative probability of change matrix using likelihood, and then use weighted parsimony to explore the solution space as thoroughly as computational limits permit. Once optimal or near-optimal solutions have been found (under the weighted parsimony criterion), they can be used as input trees and evaluated under the likelihood model. Given a fixed and finite amount of computation time, this procedure often finds better solutions under the likelihood criterion than does a direct search of tree space under the likelihood criterion, at least for moderately large data sets.

Step matrices used for character-state weighting can be either symmetric (e.g., the cost of a change from A to G will equal the cost of a change from G to A) or asymmetric (in which case the reciprocal costs will differ, with Dollo parsimony being the most extreme form). Under most circumstances, the reciprocal costs should be symmetric, so that any part of the tree can be rooted (by inclusion of an outgroup, for instance) without changing the length of the tree. If asymmetrical step matrices are used, then the various rootings of a tree will differ in tree length, so rooted trees must be examined to determine the tree length of the potential solutions. Since small asymmetries in the estimated matrix are expected from random error associated with finite sample sizes, one would not want to root the tree on the basis of this random error alone. However, if the asymmetries of change among states are strong and obvious (as with some RNA viruses; Moriyama et al., 1991), then the use of an asymmetrical step matrix may be justified (e.g., see Hillis et al., 1994a).

The assumption of constant substitutional processes operating across sites can be violated for any number of reasons, including dependence on the state of neighboring bases (Randall et al., 1987; Schaaper and Dunn, 1987), codon usage in protein-coding genes (W.-H. Li et al., 1985b), strand bias (Wu and Maeda, 1987; Thomas and Beckenbach, 1989), mutation bias (Loeb and Preston, 1986), secondary structural constraints (Gerbi, 1985; Dixon and Hillis, 1993; Tillier and Collins, 1995), and other non-phylogenetic sources of covariation among sites (Fitch and Markowitz, 1970; Korber et al., 1993). Therefore, in some situations, it may be necessary to divide the data set (e.g., into first, second, and third positions of codons) for the purpose of computing separate step matrices to provide differential weighting of state changes among the various sites in the sequence.

Random Error

The only way to avoid random error is to obtain an infinite amount of data; this practice will guarantee the correct result as long as the method is consistent. This option is unrealistic, however, so it is important to maximize the extraction of phylogenetic information by using the most efficient
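The conversion of a relative probability matrix into a step matrix of change costs, described in the discussion of character-state weighting, can be illustrated with a small sketch. It assumes the negative-logarithm transformation (cost equals minus the natural log of the relative probability). The probability values in `EXAMPLE_PROBABILITIES` are invented for illustration only (transitions more probable than transversions), and the names `step_matrix` and `BASES` are hypothetical.

```python
import math

BASES = "ACGT"

# Hypothetical relative probabilities P[i][j] of state i becoming state j.
# Diagonals are the probability of no change; each row sums to 1.
# Transitions (A<->G, C<->T) are given 0.06, transversions 0.02.
EXAMPLE_PROBABILITIES = {
    "A": {"A": 0.90, "C": 0.02, "G": 0.06, "T": 0.02},
    "C": {"A": 0.02, "C": 0.90, "G": 0.02, "T": 0.06},
    "G": {"A": 0.06, "C": 0.02, "G": 0.90, "T": 0.02},
    "T": {"A": 0.02, "C": 0.06, "G": 0.02, "T": 0.90},
}


def step_matrix(probabilities):
    """Convert relative change probabilities into costs, cost = -ln(P)."""
    costs = {}
    for i in BASES:
        costs[i] = {}
        for j in BASES:
            p = probabilities[i][j]
            if not 0.0 < p <= 1.0:
                raise ValueError(f"probability for {i}->{j} must be in (0, 1]")
            costs[i][j] = -math.log(p)
    return costs


costs = step_matrix(EXAMPLE_PROBABILITIES)
```

With these invented values the rarer transversions receive a higher cost than transitions, and stasis (the diagonal) receives the lowest cost, which is the sense in which rare changes are given greater weight. Because the example matrix is symmetric, the resulting step matrix is symmetric as well.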
methods that are applicable to the available data. In any case, methods must be used to estimate the sensitivity of the results to finite sampling. Penny and Hendy (1986), Felsenstein (1988a), Li and Gouy (1991), Hillis et al. (1993a), and Li and Zharkikh (1995) have presented reviews of the many methods available. Here we present and discuss a few of the more common methods.

Testing for Hierarchical Structure
Even if a data set were constructed by randomly assigning character states to taxa, some random covariation would be expected due to the stochastic nature of the sampling process. This random covariation would lead phylogenetic reconstruction methods to prefer some trees over others even though true hierarchical structure in the data was absent. Thus, it is worthwhile to ask whether a data set contains more hierarchical structure than would be expected purely by chance.

One way to assess the non-randomness of hierarchical structure is through permutation tests, which provide a means for approximating the distribution of a test statistic under a given null hypothesis by permuting (randomizing) the observed data. In a phylogenetic context, permuted data sets are created by randomizing character states among taxa, while holding the total number of occurrences of any state constant. Thus, any correlation among character states that results from actual phylogenetic structure is destroyed. By comparing the null distribution of a test statistic generated from a series of permuted data sets with the observed value of the statistic from the original data, one can determine whether the null hypothesis of no phylogenetic structure can be rejected. If the test statistic does not lie in the extreme (say 5%) tail(s) of the null distribution, then there is a reasonably good chance that it could have arisen by chance in the absence of meaningful hierarchical structure, and further analysis of the data would seem ill-advised. It is important to remember, however, that although significant hierarchical structure may be due to phylogenetic signal, other sources of structure (such as base compositional bias, or convergence) may also lead to rejection of the null hypothesis. Statistics that have been used in permutation tests include tree length and measures of character consistency or homoplasy (Archie, 1989a,b; Faith, 1990, 1991; Faith and Cranston, 1991).

An alternative approach to permutation is the examination of the shape of the distribution of tree lengths for either all possible trees or a random sample of them (Hillis, 1991; Hillis and Huelsenbeck, 1992). Fitch (1979, 1984) observed that data sets with little or no hierarchical structure tended to produce relatively symmetric tree-length frequency distributions. Hillis and Huelsenbeck (1992) showed that as the amount of hierarchical structure was increased, these distributions became more left-skewed. The degree of skewness can be quantified using the standard g1 statistic. For n trees of length T, g1 is calculated as

\[ g_1 = \frac{\sum_{i=1}^{n} (T_i - \bar{T})^3}{n s^3} \]

where \bar{T} is the mean and s is the standard deviation of the tree lengths (Sokal and Rohlf, 1981). Strong skewness can be misleading, however, as very localized structure can lead to highly asymmetric tree-length frequency distributions. (For example, a purely random data set can produce a highly skewed tree-length distribution if one taxon is duplicated, as trees consistent with the monophyly of the duplicated pair will be much shorter than the remaining trees.) Hillis (1991) suggested a procedure for detecting those groupings most responsible for the observed structure by calculating the g1 statistic after successive restrictions of the sample space of trees. He used random character states (rather than permuted states from the observed matrix) to estimate the null distribution. The latter approximation is computationally much faster than permutation (in fact, it need only be calculated once for a given number of taxa and characters), but it is sensitive to deviations in the frequencies of the observed character states.

Tests for Comparing Two Trees
Many tests have been described to compare two hypothesized trees: Is tree A significantly better (under a given optimality criterion) than tree B, or are the differences within the expectations of random error? Such tests have been devised for each of the major optimality criteria.

PARSIMONY The first analytical tests for parsimony were devised by Cavender (1978, 1981), who studied the case of a four-taxon tree. Felsenstein (1985b) extended these results to include an assumption of a constant molecular clock, and Steel et al. (1993b) extended Felsenstein's test to take into account unequal nucleotide frequencies among the taxa. Li and Zharkikh (1995) noted that these tests could, in principle, be extended to more than four taxa, but that the tests are expected to have very low power. Therefore, we concentrate here on related heuristic tests that can be used with any number of taxa.

Templeton (1983b) devised a nonparametric test for comparing two trees. The test utilizes a Wilcoxon signed-ranks test of the relative number of steps required by each character on each of the respective trees. If the characters are uniformly weighted and require no more than one additional change on either of the trees, then the test can be simplified into the "winning sites" test of Prager and Wilson (1988). This simple test compares the number of characters that favor each of the two trees and tests the results against a binomial distribution. Under the assumption that random noise will be equally likely to favor either of the two trees, the test asks whether the support for one hypothesis is significantly better than would be expected from random variation among the characters. Although the assumption is usually not met exactly (because the size of the relevant subgroups in the two trees is expected to differ), the effect of the violation is likely to be small and the test gives an easy approximation of the probability that the observed difference is due to random error.

Kishino and Hasegawa (1989) devised a parametric test for comparing two trees, under the assumption that all nucleotide sites are independently and identically distributed. This test uses the difference in lengths of the two trees (D) as a test statistic, where D = ΣD(i) and D(i) is the difference in the minimum number of nucleotide substitutions on the two trees at the ith informative site. The expectation for D (under the null hypothesis that the two trees are not significantly different) is zero, and the sample variance of D is

\[ s_D^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( D_{(i)} - \frac{D}{n} \right)^2 \]

where n is the number of informative sites. The null hypothesis that D = 0 can be tested with a paired t-test with n - 1 degrees of freedom, where

\[ t = \frac{D/n}{s_D/\sqrt{n}} \]

If there is no a priori reason to suspect that tree 1 is better than tree 2, the test should be two-tailed. If there is an a priori reason to suspect that one tree is better than the other (for instance, if one tree is the optimal tree found in a search, and it is being compared against nearby suboptimal trees), then the expectation for D is no longer zero. For this reason, the test is strictly valid only when the two trees being compared are selected on an a priori basis.

DISTANCE TESTS Rzhetsky and Nei (1992a, 1993) proposed a test for comparing two trees under the minimum evolution criterion. In this test, D is the difference in the sum of the branch lengths for the two trees as estimated by the least-squares method, and the variance of D is either estimated by bootstrapping (Nei, 1991) or computed analytically (Rzhetsky and Nei, 1992a). Rzhetsky and Nei suggested a search strategy for solutions under the minimum evolution criterion by comparing the neighbor-joining approximation to all trees that differ from the neighbor-joining tree by up to four symmetric-difference distance units (dSD), and accepting all trees that are not significantly worse than the neighbor-joining tree. They restricted the comparisons to trees within 4 dSD because studies based on six taxa showed that it is unlikely for the optimal solution to be any more distant from the neighbor-joining tree under these conditions, at least if the number of characters examined is large. However, this search strategy is likely to miss many solutions that are equal to or better than the neighbor-joining tree if there are greater numbers of taxa. For instance, one of us (DLS) has found more than 27,000 trees that are equal to or better than the neighbor-joining tree (under the minimum evolution criterion) for the distance matrix examined by S.B. Hedges et al. (1992b; based on the data of Vigilant et al., 1991). All but 345 of these equal or better solutions are more than 4 dSD from the neighbor-joining tree, and better solutions are as much as 30 dSD from the neighbor-joining tree. Therefore, a neighbor-joining estimate (with a search of nearby trees) is a poor substitute for a thorough search of tree space for near-optimal solutions. If the number of taxa is very small (the conditions under which this search strategy is likely to be successful), an exact search (exhaustive or branch-and-bound) is computationally simple and will always find the optimal solutions.

An alternative approach to testing the difference between two trees is to use a measure called the generalized least-squares sum of squares, which is similar to a weighted least-squares measure but takes covariances between distances (e.g., shared branches in the tree) into account. This statistic can be compared against a χ² distribution (see Bulmer, 1991 for examples).

LIKELIHOOD If one tree is a subset of a second, more fully resolved tree, then the two hypotheses can be compared with a standard likelihood ratio test, using twice the difference of the log likelihoods of the two trees as a test statistic (δ). This statistic is compared against the χ² distribution, with the degrees of freedom equal to the difference in the number of parameters of the two hypotheses (in this case, the number of additional branches in the more fully resolved tree). Unfortunately, we would usually like to compare two trees that are not subsets of one another. In a strict sense, the likelihood ratio test is invalid under these conditions, because the number of parameters in the two hypotheses is equal, so we have zero degrees of freedom. Felsenstein (1988a) has suggested that in cases where two tree topologies differ by a single branch rearrangement, we could test one topology against the other by pretending that there was one degree of freedom and using the likelihood ratio test.

Other approaches have been used to estimate the significance of a difference in log likelihoods. One is the application of the Kishino and Hasegawa (1989) test (discussed above, under the parsimony criterion). An alternative is to generate the expected distribution of δ (rather than assuming a χ² distribution) through simulation of the null hypothesis (i.e., the tree with the lower likelihood). The likelihood analysis already provides the expected branch lengths given the topology of the null hypothesis, under an explicit model of character evolution. Thus, this parameterized tree can be simulated under the assumed model of evolution, and the simulated data sets can be analyzed under the maximum likelihood criterion. The expected distribution of differences in log likelihood scores (or twice the differences, if the standard test statistic is maintained) between the optimal tree and null tree can then be generated under the assumption that the null hypothesis is true. If the difference in the test statistic for the trees being compared exceeds 95% of the simulated differences, then the two trees are significantly different at p < 0.05, and the null hypothesis can be rejected. An example of this approach (which could be used with any optimality criterion) is presented in Chapter 12. The primary limitation to its implementation is the computation time involved, which can be considerable when the data sets are large and the optimality criterion is maximum likelihood.

Assessing the Reliability of Individual Branches
In many situations, it is desirable to assess the reliability of the individual internal branches of an estimated tree. Many methods have been suggested for this purpose. For instance, several methods have been proposed for testing whether a particular internal branch length is significantly greater than zero in an additive-distance tree (see Li and Gouy, 1991). Here we describe two nonparametric approaches that have been widely used for testing the degree of support for particular branches.
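The two parsimony-based comparisons described under Tests for Comparing Two Trees can be sketched in a few lines. The following is a minimal Python illustration under stated assumptions, not the implementation used by any program listed in this chapter: it assumes that the number of steps each informative character requires on each tree has already been computed, and the input lists steps_a and steps_b and both function names are hypothetical. It applies the winning-sites idea as a two-tailed binomial test and computes the paired t statistic from the per-site differences. The t value would still have to be referred to a t distribution with n - 1 degrees of freedom.

```python
import math


def winning_sites_test(steps_a, steps_b):
    """Winning-sites comparison: two-tailed binomial test on site counts.

    steps_a[i] and steps_b[i] are the steps character i requires on
    tree A and tree B (hypothetical inputs, computed elsewhere).
    """
    wins_a = sum(1 for a, b in zip(steps_a, steps_b) if a < b)
    wins_b = sum(1 for a, b in zip(steps_a, steps_b) if b < a)
    n = wins_a + wins_b  # tied characters favor neither tree
    if n == 0:
        return wins_a, wins_b, 1.0
    k = min(wins_a, wins_b)
    # Probability of a split at least this uneven when either tree is
    # equally likely to be favored (p = 0.5), doubled for two tails.
    lower_tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2.0 * lower_tail)


def paired_sites_t(steps_a, steps_b):
    """Paired t statistic for D, the summed per-site length difference."""
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two informative sites")
    d_total = sum(diffs)                      # D = sum of D(i)
    mean = d_total / n                        # D / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all per-site differences are identical")
    t = mean / math.sqrt(var / n)
    return d_total, t, n - 1                  # statistic D, t, degrees of freedom


if __name__ == "__main__":
    # Made-up step counts for ten informative characters.
    tree_a = [1, 1, 2, 1, 1, 2, 1, 1, 1, 2]
    tree_b = [2, 1, 2, 2, 1, 3, 2, 1, 2, 2]
    print(winning_sites_test(tree_a, tree_b))
    print(paired_sites_t(tree_a, tree_b))
```

As the text cautions, both calculations are strictly appropriate only when the two trees were chosen a priori rather than selected because one of them was the best tree found in a search.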
DECAY/SUPPORT INDICES AND T-PTP TESTS In parsimony, a useful index of support for a monophyletic group may be obtained by calculating the difference in tree lengths between the shortest trees that contain versus lack that group (K. Bremer, 1988). This statistic has been referred to as the decay index (Donoghue et al., 1992) or the support index (K. Bremer, 1994). A difficulty with this index is that it is not clear how large a value must be for the group to be considered well supported. Faith (1991) extended permutation approaches to test for the monophyly of a given group of taxa. His a priori T-PTP (topology-dependent permutation tail probability) test uses as a test statistic the difference in the lengths of the shortest trees in which a particular group is non-monophyletic and monophyletic, respectively. This statistic is equivalent to the support/decay indices described above, suggesting that it might provide a useful means of assessing their significance. The null distribution of the test statistic is determined by evaluating the corresponding length differences of trees calculated from permuted data sets. Faith's a posteriori T-PTP test uses the same test statistic as the a priori T-PTP test but uses a different method for generating the null distribution. After permutation of the data matrix, one calculates the length difference for all groups of the same size as the group of interest and picks the greatest length difference between the shortest tree in which the group of interest is non-monophyletic and the shortest tree in which the group is monophyletic. Unfortunately, these tests are sensitive to structure in the data set that is unrelated to the specific hypothesis of monophyly being evaluated (Thorne et al., 1996). Simulations of Faith's topology-dependent cladistic permutation tail probability (T-PTP) tests (Huelsenbeck et al., 1995; Thorne et al., 1996) demonstrate that they do not accurately test for monophyly of the specified group, so the question of how to assess the significance of a support/decay index remains unanswered.

NONPARAMETRIC RESAMPLING METHODS The bootstrap and the jackknife (Efron, 1982; Efron and Gong, 1983; Efron and Tibshirani, 1993) can be used to estimate the variance associated with a statistic for which the underlying sampling distribution is either unknown or difficult to derive analytically. These methods are called resampling techniques because they operate by estimating the variance of the sampling distribution by repeatedly resampling data from the original data set. Under certain reasonable assumptions (Efron, 1982), the variance of the statistic of interest can be approximated from the distribution of the sample estimate over replications of the resampling process. These resampling methods were first used in a phylogenetic context by Mueller and Ayala (1982), Felsenstein (1985a), and Penny and Hendy (1985).

The bootstrap and the jackknife differ in the way in which resampling is performed. In the bootstrap, data points are sampled randomly, with replacement, from the original data set until a new data set containing the original number of observations is obtained. Thus, some data points will not be included at all in a given bootstrap replication, others will be included once, and still others twice or more. For each replication, the statistic of interest is computed. The jackknife, on the other hand, resamples the original data set by dropping k data points at a time and recomputing the estimate from the remaining n - k observations (see R.G. Miller, 1974). We describe bootstrapping here because it is much more commonly used in phylogenetic applications, but much of the discussion applies to jackknifing as well.

Figure 33 illustrates the bootstrapping procedure in a phylogenetic context. History (the true phylogeny) has given us one actual distribution of characters among taxa for any given data set of interest. The ideal way to examine the effects of random error would be to replay the evolutionary tape many times; this would allow us to examine sampling variance in our data directly (see Figure 33A). However, this is not possible due to the singularity of evolutionary history. Instead, bootstrapping allows us to generate a series of pseudosamples (by resampling the unique data set with replacement; Figure 33B), which we can use in place of the actual samples to estimate sampling variance. Typically, the pseudosamples are analyzed individually, and the proportion (P) of the pseudosamples that support a given internal branch on a tree is recorded.

Figure 33 (A) If phylogenies were repeatable experiments, it would be possible to generate many independent samples of characters for a given gene and taxa of interest. In this case, the sampling variance about the true phylogeny could be calculated directly from estimates based on these independent samples. (B) Because phylogenies are not usually repeatable, it is not possible to draw more than one sample of characters for a given gene and taxa of interest. Therefore, bootstrapping is used to generate pseudosamples from the unique sample, and sampling variance is calculated from estimates based on these pseudosamples.

How many pseudosamples must be generated to obtain a precise estimate of P? The sampling variance of P follows the binomial distribution, such that σ² = P(1 - P)/n, where n is the number of pseudosamples (S.B. Hedges, 1992). For instance, if we draw 100 pseudosamples, the sample variance of P ranges from a maximum of 0.0025 (when P is 50%) to a minimum of 0 (when P is 0 or 100%). However, this just tells us how similar the estimate of P is likely to be to what we would obtain if we could analyze an infinite number of pseudosamples. It does not tell us anything about the interpretation of P.

Felsenstein (1985a) originally suggested that P could be used as a measure of repeatability, or the probability that a specified internal branch would be found in an analysis of a new, independent sample of characters (assuming we could replay the evolutionary tape). More recently, Felsenstein and Kishino (1993) have suggested that P can be interpreted as a measure of accuracy, or the probability that the specified branch is contained in the true tree (assuming that the phylogenetic method is consistent).

Hillis and Bull (1993) examined these two interpretations of bootstrap proportions, using both simulated and known experimental phylogenies. They found that bootstrap proportions provide relatively unbiased, but highly imprecise, estimates of repeatability. They also found that bootstrap proportions provide biased estimates of accuracy (a result that was also found analytically by Zharkikh and Li, 1992a,b, for four-taxon trees both with and without a molecular clock). When the phylogenetic method is consistent, bootstrapping gives underestimates of accuracy at high bootstrap values, and overestimates of accuracy at low bootstrap values. The extent of the bias depends (at least) on the number of taxa, the number of characters, and the location of the internal branch in the tree (Hillis and Bull, 1993; Zharkikh and Li, 1995; Li and Zharkikh, 1995).

Two corrections have been proposed to recalibrate bootstrap proportions to account for this bias. Rodrigo (1993) proposed using an iterated bootstrap (Hall and Martin, 1988). This involves bootstrapping each of the pseudosamples obtained in the first round of bootstrapping, and thus is computationally very intensive. Zharkikh and Li (1995) showed that a simpler correction can be obtained with just two rounds of bootstrapping on the original sample (with one set of pseudoreplicates the same size as the original data matrix, and the other set of pseudoreplicates with reduced character matrices). The estimates from the two sets of pseudoreplicates can be combined along with a correction for sample size to produce a corrected estimate of phylogenetic accuracy. The simulations of Zharkikh and Li (1995) indicate that this complete-and-partial bootstrap technique can be effective for reducing the bias of bootstrap proportions, at least if the number of informative characters in the original data set is large (≥100).

As with other methods, for a valid test using bootstrapping the null hypothesis should be specified in advance. Otherwise, we run into a multiple-tests problem similar to the one arising in a posteriori comparison of means following an analysis of variance: inflation of the type I error rate above the nominal level. (The problem may be circumvented to some degree if the researcher interprets the frequency in which a group appears in replicate trees as an index of support rather than as a statistical statement, but this interpretation is far from satisfactory.) If we are interested in testing more than one internal branch or if we are unable to pre-specify the branch(es) of interest, we can adjust the significance level to allow for the fact that we are conducting more than one test (e.g., by dividing the significance level by the number of tests implied). However, if the branches of interest cannot be pre-specified, the number of potential branches is often so large that an almost hopelessly low alpha level would be required in order to maintain an overall type I error rate of, say, 0.05.

Another concern is the assumption that the sequence positions are changing independently of one another. To the extent that this is not true, the pseudosamples will be too large, and the bootstrap values will be higher than they would be otherwise. It is also important to note that the bootstrap can only assume that the data at hand are representative of the underlying distribution and thereby estimate the variation that would be obtained by sampling additional data from that distribution. If the data are not representative or if the reconstruction method makes an inconsistent estimate of the phylogeny, then bootstrapping will not remove this bias.

Bootstrapping and jackknifing can be used either with methods that operate on characters directly or with methods in which character data are first transformed into distances. In character-based methods, weighting vectors corresponding to the number of times each character is sampled can be constructed and input to the analysis. For distance methods, the resampling is conducted prior to calculation of the distance matrix; each replication is then performed using a different input matrix corresponding to the replicate sample. However, an additional source of bias exists with methods that make non-linear transformations of sequence data (including distance corrections). Under these conditions, the bootstrap will (in expectation) overestimate the variance of the corrected data (e.g., Waddell et al., 1994), which leads to conservative tests of significance.

Finally, the bootstrap replicates should be evaluated under an optimality criterion rather than just a tree-building algorithm. Otherwise, any bias of the algorithm will artificially inflate the bootstrap proportions. Imagine, for example, an algorithm that clustered taxa solely on the basis of their input order in the data matrix. Even with no data, such an algorithm would find the same tree for every pseudoreplicate. However, the resulting 100% bootstrap proportions would bear no relation to any measure of phylogenetic accuracy.
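The character-based resampling described above can be illustrated with a short sketch. This is a minimal Python example under stated assumptions rather than the procedure of any particular program: the function names are illustrative, and supports_branch is a hypothetical callback standing in for a full analysis of each pseudosample under an optimality criterion (the demonstration uses a deliberately trivial rule in its place). The sketch builds the weighting vectors that record how many times each character is drawn, records the proportion P of pseudosamples supporting a branch, and evaluates the binomial variance P(1 - P)/n and the per-test significance level obtained by dividing the overall level by the number of tests.

```python
import random


def bootstrap_weights(n_chars, rng):
    """Draw n_chars characters with replacement; return per-character counts."""
    weights = [0] * n_chars
    for _ in range(n_chars):
        weights[rng.randrange(n_chars)] += 1
    return weights


def bootstrap_proportion(n_chars, n_reps, supports_branch, rng):
    """Proportion P of pseudosamples for which supports_branch(weights) is true.

    supports_branch is a hypothetical callback; in practice it would run a
    tree search on the reweighted characters and check for the branch.
    """
    hits = sum(
        1 for _ in range(n_reps)
        if supports_branch(bootstrap_weights(n_chars, rng))
    )
    return hits / n_reps


def proportion_variance(p, n_reps):
    """Binomial sampling variance of P for n_reps pseudosamples."""
    return p * (1.0 - p) / n_reps


def per_test_alpha(overall_alpha, n_tests):
    """Significance level per test when n_tests branches are examined."""
    return overall_alpha / n_tests


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy stand-in for a real analysis: +1 marks a character favoring the
    # branch of interest, -1 marks a character conflicting with it.
    signal = [1, 1, 1, -1, 1, -1, 1, 1, -1, 1]

    def toy_support(weights):
        return sum(w * s for w, s in zip(weights, signal)) > 0

    p = bootstrap_proportion(len(signal), 100, toy_support, rng)
    print("P =", p)
    print("variance of P =", proportion_variance(p, 100))
    print("alpha per test for 10 branches =", per_test_alpha(0.05, 10))
```

The variance computed here describes only how closely P approximates the value that an unlimited number of pseudosamples would give; as discussed above, it says nothing about how P should be interpreted.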
APPENDIX: PROGRAMS AND SOFTWARE PACKAGES AVAILABLE FOR CONDUCTING PHYLOGENETIC AND POPULATION GENETIC ANALYSES
Some of this information was extracted from a file compiled by J. Felsenstein and
distributed as part of the PHYLIP documentation in the file main.doc. That file
should be consulted for recent updates on availability and information about new
programs.
Program/Package (author): MALIGN (W. C. Wheeler and D. Gladstein)
Operating system or source code: Macintosh OS, DOS, Unix, and C source code
Applications: Simultaneous alignment of multiple sequences and construction of parsimony trees. Code for implementation on parallel architectures is available
Availability: Contact W. C. Wheeler (Department of Invertebrates, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024-5192, USA) or source code is available by anonymous ftp from ftp.amnh.org

Program/Package (author): MARKOV (G. Pesole and C. Saccone)
Operating system or source code: FORTRAN source code
Applications: To compute distance measures and substitution matrices under a stationary Markov model of DNA substitution. Bootstrapping is included to assess the reliability of the results
Availability: Contact C. Lanave at lar;[email protected],it

Program/Package (author): MEGA (S. Kumar, K. Tamura, and M. Nei)
Operating system or source code: DOS
Applications: Calculation of nucleotide and protein pairwise distances, and calculation of trees using the neighbor-joining and UPGMA algorithms. Also searching capabilities under the parsimony criterion using stepwise addition, local branch-swapping, or branch-and-bound algorithms. Includes bootstrapping and tests for comparing the length of two additive trees
Availability: Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park, Pennsylvania 16802 USA ([email protected])

Program/Package (author): METREE
Operating system or source code: DOS
Applications: Search for trees under minimum evolution criterion; with standard errors and significance tests
Availability: Contact M. Nei (same address as MEGA)

Program/Package (author): Molevol (W. Fitch)
Operating system or source code: DOS, FORTRAN source code
Applications: A package of about 20 programs for estimating parsimony and distance trees, dynamic weighting, alignment, searching for secondary structure, and other analyses of molecular data
Availability: Contact W. Fitch at [email protected]

Program/Package (author): MOLPHY (J. Adachi and M. Hasegawa)
Operating system or source code: C source code
Applications: A package of programs for maximum likelihood analyses with either nucleotide (NUCML) or protein (PROTML) sequences, basic statistics of nucleotide (NUCST) and protein (PROTST) sequences, and neighbor-joining analysis (NJDIST)
Availability: By anonymous ftp from sunmh.ism.ac.jp

Program/Package (author): MUST and 3s (H. Philippe)
Operating system or source code: DOS
Applications: Sequence management, analysis of taxon sampling effects, and estimation of appropriate sequence lengths for a given analysis
Availability: Contact H. Philippe at hpG8bio4.bc4.u-psud.fr

Program/Package (author): NONA (P. Goloboff)
Operating system or source code: DOS
Applications: For parsimony analyses using Hennig86 data file format but with no limit on the number of taxa and characters
Availability: Contact P. Goloboff at Department of Entomology, American Museum of Natural History, Central Park West at 79th St., New York, New York 10024

Program/Package (author): ODEN (Y. Ina)
Operating system or source code: C source code
Applications: For distance matrix analyses on nucleotide or protein sequences
Availability: By anonymous ftp from bioslave.uio.no (in directory pub/oden)

Program/Package (author): PAML (Z. Yang)
Operating system or source code: C source code
Applications: A package mostly for maximum likelihood analyses with either nucleotide or protein sequences. Includes programs for reconstruction of ancestral sequences and conducting analyses of multiple genes (baseml, codonml) and simulating trees (mcml) under maximum likelihood. Also includes a parsimony program (pamp) for estimating substitution matrices, intersite variability of rates of evolution, and ancestral states
Availability: By anonymous ftp from ftp.bio.indiana.edu (in directory molbio/evolve)

Program/Package (author): PARBOOT (P. Roux and T. Littlejohn)
Operating system or source code: C source code
Applications: For parallel processing of bootstrapped data sets in conjunction with PHYLIP
Availability: By anonymous ftp from megasun.bch.umontreal.ca

Program/Package (author): PAUP* (D. L. Swofford)
Operating system or source code: Macintosh OS, DOS, Unix, VAX/VMS
Applications: For finding and evaluating trees under the minimum evolution, DNA maximum likelihood, and parsimony (including generalized parsimony) criteria. Includes branch swapping, branch-and-bound, and exhaustive searches. Reliability of trees may be assessed with permutation tests, decay/support indices, bootstrapping, invariant tests, or maximum likelihood scores. Includes extensive pairwise distance calculations, consensus techniques, and reconstruction of ancestral states using parsimony and likelihood methods
Availability: Sinauer Associates, Sunderland, Massachusetts 01375 USA ([email protected])

Program/Package (author): Pee-Wee (P. Goloboff)
Operating system or source code: DOS
Applications: For parsimony analyses using character weights determined by their homoplasy during tree search
Availability: Contact P. Goloboff at Department of Entomology, American Museum of Natural History, Central Park West at 79th St., New York, New York 10024

Program/Package (author): PHYLIP (J. Felsenstein)
Operating system or source code: DOS, Windows, Macintosh OS, C source code
Applications: A package of 30 programs, including parsimony, methods of invariants, maximum likelihood (for nucleotide, protein, and restriction site data), distance methods, and compatibility analysis. Searching by stepwise addition, branch swapping, and the branch-and-bound algorithm for some methods. Includes bootstrapping, tree drawing, assessment of independent contrasts, various statistical tests of trees, and consensus analysis
Availability: By anonymous ftp from evolution.genetics.washington.edu (in directory pub/phylip) or from the World Wide Web site: (http://evolution.genetics.washington.edu/phylip.html)

Program/Package (author): Random Cladistics (Mark Siddall)
Operating system or source code: DOS
Applications: For conducting permutation tests, bootstrapping, or jackknifing in conjunction with Hennig86
Availability: By anonymous ftp from zoo.utoronto.ca/pub (random.doc and random.exe)

Program/Package (author): RAPDistance (J. S. Armstrong, A. J. Gibbs, R. Peakall, and G. Weiller)
Operating system or source code: DOS, Windows
Applications: For computing distance matrices in RAPD analyses
Availability: By anonymous ftp from life.anu.edu.au (in directory pub/RAPDistance)

INTRODUCTION
From the preceding chapters, it should be clear that the diversity of molecular
techniques available to systematists is considerable and that the problems that
can be addressed with these techniques span an enormous range, from relationships
among genes within populations to the phylogeny of life. The rapid development
and power of these techniques has produced a euphoria in evolutionary
biology; because so many new problems can be addressed, it is a commonly held
misconception that all evolutionary problems are solvable with molecular data.
This is clearly not the case. Worse, inappropriate techniques are often applied (at
a considerable waste of time and expense) to particular problems that could be
effectively addressed with alternative techniques. In other cases, the technique
chosen may not be the most cost-effective choice. Therefore, we provide some
guidelines in this chapter to aid in matching techniques to problems.
In the first edition of this book, we concluded that the field of molecular
systematics was in its early stages, with much unexplored potential. The vast
increase in volume of the bibliography (which provides only the briefest overview
of the available literature) demonstrates that the use of molecular techniques in
evolutionary biology has increased dramatically in the last five years.
516 Chapter 12 / Hillis, Mable & Moritz
Sanderson et al. (1993) conducted an extensive survey of phylogenetic analyses and
concluded that the years from 1989-1991 were characterized by a rapid accumulation of
phylogenetic data, encompassing a broad scope of topics and applications. Although
almost half of the studies they assessed were based on morphological data, the use of
molecular data (sequence data, followed by restriction site and allozyme data, and
then DNA-DNA hybridization data) saw a proportionately higher increase. Among the
journals publishing phylogenetic information most often, Journal of Molecular
Evolution and Molecular Biology and Evolution were the top two, emphasizing the
critical role that phylogenetic inference is playing in studies of molecular evolution
(Sanderson et al., 1993; see also Chapter 1). The years since 1991 have seen an even
more pronounced increase in the diversity of studies using molecular systematic
techniques. In discussing the issue of choosing an appropriate molecular technique
for addressing a given problem, we mention some of the many applications of molecular
systematics. This is by no means an exhaustive summary and should be regarded as a
composite illustration rather than a complete picture of recent advances (for
additional examples, see Chapters 2, 4-9). For a more thorough review of recent
applications, a good starting place is Molecular Markers, Natural History and
Evolution (Avise, 1994).

In addition to choosing an appropriate molecular technique, selection of method of
analysis is an equally important decision. Recent advances have resulted in improved
methods for analyzing molecular data, but there are many areas that remain
controversial and probably will be subject to further attention in the near future.
Therefore, we include a brief overview of areas of current development in methods of
analysis, and discuss where we expect additional advances will occur.

CHOOSING A TECHNIQUE FOR A PARTICULAR PROBLEM

Advances in technology often promote various kinds of data chauvinism. When isozyme
electrophoresis and microcomplement fixation began to be applied widely to systematic
problems in the 1960s and early 1970s, the new biochemical data were immediately
promoted by some as "better" than traditional morphological data. The development of
DNA hybridization techniques and restriction fragment analysis was accompanied by new
assertions of superiority, and individuals who worked with isozyme electrophoresis
were chastised (in review of grants and publications, for instance) for being
"old-fashioned." Most recently, some proponents of sequencing have suggested that all
other techniques are superfluous and outdated (e.g., Wilson et al., 1989). We disagree
strongly with these assertions; certain techniques are better than others for
answering particular problems, but no technique is best under all circumstances. As
Avise (1994:xii-xiii) pointed out, sometimes timing is everything:

    Imagine for the sake of argument that DNA sequencing methods had been widely
    employed for the past 30 years and that only recently had protein-electrophoretic
    approaches been introduced. No doubt a headlong rush into allozyme techniques
    would ensue, on justifiable rationales that (a) the methods are cost-effective and
    technically simple, (b) the variants revealed reflect independent Mendelian
    polymorphisms at several loci scattered around the genome (rather than as linked
    polymorphisms in a single stretch of DNA), and (c) the amino acid replacement
    substitutions uncovered by protein electrophoresis (as opposed to the silent base
    changes often revealed in DNA assays) might bring molecular evolutionists closer
    to the real "stuff" of adaptive evolution. To carry the argument farther, suppose
    that molecular genetic methods had been employed throughout the last century but
    that an entrepreneurial scientist finally ventured into the world of nature and
    discovered organismal phenotypes and behaviors. Finally, the interface of gene
    products with the environment would have been revealed! Imagine the sense of
    excitement and research prospects!

The point is that all approaches provide interesting and important insights into
biodiversity and evolution, and it makes little sense to think of one technique as
being inherently superior to another. Rather than promoting the latest technique as a
panacea, it is worthwhile to consider which technique(s) are best suited for a
particular problem.
Applications of Molecular Systematics 517
Morphological data are clearly superior to molecular data under certain conditions
(e.g., for studies of long-extinct species, or for looking at some interactions with
and adaptations to the environment), just as the reverse is true under other
conditions (Chapter 1; Hillis, 1987). Only by combining data from various
morphological, behavioral, physiological, and molecular techniques is it possible to
obtain a comprehensive view of evolution. We think that the recent trend toward
technique overspecialization in graduate studies is harmful to the field of
evolutionary biology (not to mention the rest of biology). Of course students should
know how to sequence a gene if that is relevant to their research, but that shouldn't
preclude them from examining proteins, chromosomes, behavior, or morphology. Any lab
that is limited to only one technique is going to be restricted to a relatively narrow
set of evolutionary questions and problems.

Given the above caveats, we will now address the issue of choosing a molecular
technique to address a particular problem. In doing so, we will try to emphasize that,
in many cases, techniques should be used in combination. Table 1 lists some of the
common applications of molecular techniques in systematics. We roughly classify each
technique into one of four categories for each of the problems listed: the technique
is (1) inappropriate for the problem; (2) appropriate under limited conditions;
(3) appropriate but not usually cost-effective; or (4) appropriate under most
conditions. By inappropriate, we mean that considerable time, money, and effort can be
wasted by attempting to answer the given problem using a particular technique, with
little likely fruition. Assuming that there are no technical barriers, the most common
reason for such a failure is that there is too little or too much variation to address
the question of interest. A technique is listed in the second category (appropriate
under limited conditions) for a particular problem if success has been obtained under
some conditions (when levels of variability are appropriate), but alternative
techniques are more likely to yield more robust results for equal or less effort. The
third category (appropriate but not cost-effective) indicates that
Table 1. Applications of various molecular techniques to problems in systematics
Problem    Isozymes    Cytogenetics    DNA hybridization    Restriction analysis    Fragment analysis(a)    DNA/RNA sequencing
the given technique may be used to address the problem, but that other techniques will
probably be just as effective for much less effort and/or money. In other words,
except under unusual circumstances, a technique is only recommended for a particular
problem if it falls in the fourth category (appropriate under most conditions). One
final caveat: the times of divergence given in Table 1 are very rough. Because rates
of molecular divergence can be very different among lineages and among molecules (see
below), and because the limitations of some methods are still to be explored, the
times should be used only as a first approximation.

Many studies of gene evolution require DNA sequencing, because no other technique
provides the necessary information to infer relationships among individual alleles.
However, studies of functional gene duplications and linkage studies can be conducted
very efficiently with isozyme techniques (e.g., Buth, 1983; Morizot and Siciliano,
1984; Morizot, 1990). Restriction site and fragment studies can be very useful to
screen many individuals or tandemly repeated loci to study processes such as biased
gene conversion or unequal crossing over (Seperack et al., 1988). Molecular
cytogenetic analyses also can be used in combination with these techniques to study
the distribution of genes across the nuclear genome (e.g., Wichman et al., 1985, 1991;
Baker and Wichman, 1990; Hillis et al., 1991~).

Isozyme electrophoresis, restriction site analysis, and fragment analysis (e.g., DNA
fingerprinting, microsatellites, RAPDs) are applicable to a large number of
population-level problems (see Chapters 4 and 7-8). DNA sequencing also is applicable
at this level, but most studies of population genetics require examination of large
numbers of individuals over large numbers of loci (see Chapters 2 and 10). Although it
has become easier to obtain sequences from many individuals for certain loci
(particularly the mitochondrial genome) by amplifying the DNA (see Chapter 7), it is
still inordinately expensive, time-consuming, and difficult to obtain sequence
information from multiple Mendelian nuclear loci among numerous individuals. However,
sequencing and fragment analyses can be combined with excellent results: useful
markers can be identified through DNA sequencing and then appropriate rapid screening
techniques (fragment analyses or isozyme electrophoresis) can be designed to examine
this variation across many individuals and loci (see Chapters 8 and 9). Therefore,
studies that combine approaches such as DNA sequencing and isozyme or fragment
analyses (e.g., R.J. Baker et al., 1989; Bradley et al., 1993; R.S. Thorpe et al.,
1994) can maximize effectiveness by combining high resolution with broad coverage of
individuals and/or loci.

For studies of mating systems, population structure, and heterozygosity, isozyme
electrophoresis remains one of the best techniques available. These studies usually
require information from many individuals at many loci, and are suited perfectly to
the kind of data provided by isozyme electrophoresis, although microsatellites are
being used increasingly for this purpose. Cytogenetic analysis, particularly of
meiotic configurations, can reveal significant changes in the genetic system (e.g.,
clonal inheritance, polyploidy, interchange heterozygosity) that are important in
their own right and also affect interpretation of other types of genetic markers.
Studies of individual relatedness require analysis of variation at large numbers of
loci as well, and under certain conditions, isozymes may provide this information. The
various methods that access variation in mini- and microsatellite loci (see Chapter 8)
are perhaps the most powerful for inferring individual relatedness, but in most cases
such studies should be restricted to inferences about close relatives (Lynch, 1988).
DNA fingerprinting techniques that employ gene amplification (as well as DNA
sequencing studies) can use non-destructive sampling of tiny tissue samples, an
especially important point in the field of conservation genetics, where collecting
whole specimens may endanger the study populations (e.g., Garza and Woodruff, 1992;
Morin et al., 1994; A.C. Taylor et al., 1994). Theoretically, DNA sequencing can be
used with high precision to examine individual relatedness, but only if many loci are
examined from each individual.

Geographic variation within species, detection of clonal diversity, the origin of unisexual
species, hybridization, and discovery of cryptic species are all effectively studied
with isozyme electrophoresis, cytogenetics, restriction site analysis, and some form
of DNA fragment studies, or with a combination of these approaches (e.g., D.D. Shaw et
al., 1990; Scribner et al., 1994). Analyses of cpDNA and mtDNA, which are maternally
inherited in most species, can be combined with studies of nuclear loci (e.g.,
allozymes) to provide information on both the degree and biases in direction of
hybridization. The two kinds of data also can be combined (often with cytogenetic data
as well) to determine not only the species involved in initial hybridization events
that gave rise to unisexual species, but also the sexes of each species involved in
the hybridization event(s) (e.g., J.W. Wright et al., 1983; Avise et al., 1991; Moritz
and Heideman, 1993; Radtke et al., 1995).

The detection of morphologically cryptic species is often accidental; with any
molecular technique, one should be open to the possibility that previous perceptions
of species boundaries may have been wrong. In many cases, systematists choose to
investigate a suspicious "polymorphic" taxon; allozyme electrophoresis is used
commonly in these cases. However, many other cryptic species have been discovered
accidentally. In addition to examples from studies of isozymes, cryptic species also
have been discovered by cytogenetic techniques (e.g., Moritz, 1983) and by
immunological techniques (e.g., Scanlan et al., 1980). In the former case, a
morphologically variable nominal species of gecko, Heteronotia binoei, was shown to
consist of several cryptic bisexual species and numerous parthenogenetic lineages of
hybrid origin. These conclusions were supported subsequently by analysis of isozymes
(Moritz et al., 1989a,b). In the immunological example, Scanlan et al. (1980) showed
that some individuals of the nominal frog species Gastrotheca riobambae appeared to be
more closely related to other species than to other individuals of G. riobambae. This
led Duellman and Hillis (1987) to examine this group with allozyme electrophoresis,
which extended the findings from immunology to suggest that six species in two
different species groups had been confused under the name G. riobambae. After the
species boundaries became clearer, diagnostic morphological traits were found for each
of the species. As in this case, information on species boundaries from molecular data
is often invaluable for separating intraspecific morphological polymorphisms from
diagnostic characters.

Perhaps the most common application of molecular techniques in systematics is to
estimate phylogeny. All of the techniques discussed in this book have been applied
successfully to questions of phylogeny, although the appropriate techniques will vary
from study to study. In order for a technique to be useful for reconstructing
phylogeny, enough variation must exist among the species examined for application of
phylogenetic reasoning, but not so much that the characters under study are saturated
by change. To a first-order approximation, useful ranges of divergence can be
predicted for each major technique except for cytogenetics, where change is not
strongly correlated with time (Table 1). However, these ranges are very rough; some
groups show much less variation for certain characters, and application of a given
technique may be extended further back into time for such groups (see the section
"Predictions of Time from Molecular Data," below). For example, mtDNA and many
commonly examined isozyme loci can be used to study relationships among higher
taxonomic levels of birds than is possible within most other groups of vertebrates
(Kessler and Avise, 1985b). In groups that have never been studied, some
experimentation may be required to find a technique suitable for a particular
phylogenetic problem (see Chapter 2). For most studies, however, Table 1 will provide
a guide to selection of an appropriate technique, at least for a pilot study.

In addition to a need for rapidly evolving sequences (see Table 1), tracking
relationships of individuals within populations often requires methods that allow for
reticulation of lineages (see "Trees versus Networks," below). However, there are
several cases in which methods that assume a bifurcating tree are appropriate for
looking at intraspecific phylogenies. For instance, phylogenies of organellar genomes
usually can be assumed to be largely free of reticulations. Studies of intraspecific
maternal phylogenies of mitochondrial
DNA have become commonplace for some species, such as human populations (e.g.,
Vigilant et al., 1991; D.R. Maddison et al., 1992). Also, relationships within asexual
species (or species in which recombination is rare) can be examined using methods that
build bifurcating trees. Phylogenetic studies also have become increasingly important
in epidemiological and evolutionary studies of viruses. These studies are possible
because viruses often evolve rapidly enough to produce sufficient variation for
phylogenetic analysis over the course of just a few years or decades (e.g., R.F.
Doolittle et al., 1990; Fitch et al., 1991; R.A. Olmstead et al., 1992; Nichol et al.,
1993a; Eickbush, 1994; Korber et al., 1994; Crandall, 1995a,b). In one well-publicized
case (Ou et al., 1992), phylogenetic analyses were used to identify a series of
patients infected with HIV in a dental practice. Phylogenetic analyses also have been
used to identify viruses associated with outbreaks of previously unidentified
diseases, in some cases even before the viruses have actually been isolated (Nichol et
al., 1993b).

Closely related species (diverged within the past 5 million or so years) are best
studied by examining relatively fast-evolving isozyme loci (see Chapter 4), nuclear
spacer regions and introns or, in animals, the mitochondrial genome (see Chapters
7-9). Other techniques have, on occasion, proved useful in this range, but in the
majority of cases are not sensitive enough to detect sufficient changes over such a
short time scale. There has been some recent effort to find intron regions of
protein-coding nuclear genes that could be useful to study divergence of closely
related species and of populations within species (Lessa, 1992; Slade et al., 1993;
Palumbi and Baker, 1994). In some cases, substantial variation and geographic
structure have been observed (e.g., Palumbi and Baker, 1994; Burton and Lee, 1994).
However, a 729-bp intron of the human Y chromosome showed no variation among 38 human
males (Dorit et al., 1995). The few studies that have compared intron sequences
between species also have found surprisingly little variation between closely related
taxa (e.g., Slade et al., 1994); in myobatrachine frogs there tends to be little
variation within genera but major differences between genera (B. Mable, personal
observation). Problems associated with few numbers of nucleotide changes can be
overcome by using multiple nuclear loci (see Slade et al., 1994), but at present this
is a labor-intensive undertaking because each gene and each taxon may require specific
pilot studies to resolve problems with amplification of pseudogenes and multiple gene
copies (Chapters 7-8). Intron size can vary widely between taxonomic groups, and
primers that work well in one group may not be useful in another (e.g., Slade et al.,
1994). The internal transcribed spacer (ITS) regions of the ribosomal RNA gene array
may prove more useful as a rapidly evolving nuclear gene target for fine-scale
comparisons, although complications can arise if there is extensive variation among
copies within individuals. Some studies (e.g., Gonzales et al., 1990; Pleyte et al.,
1992) have used the ITS regions for comparisons of closely related species, but rates
of nucleotide change vary widely across the region, the size of the region varies
widely among taxa (e.g., Gonzales et al., 1990), and there is virtually no obvious
sequence homology between more distantly related taxonomic groups (e.g., Pleyte et
al., 1992). The non-transcribed spacer regions of ribosomal DNA have not been examined
as thoroughly, but tend to be too variable to provide meaningful phylogenetic signal.
Even with these rapidly evolving sequences, it remains impractical to resolve very
recent (e.g., post-Pleistocene) divergences because it is difficult to sort uniquely
derived character states from random fixation of ancestral polymorphisms (E.N. Arnold,
1981; Neigel and Avise, 1986).

The most common timeframe of divergence studied by systematists (roughly 5-50 million
years) is within the range of study of most of the techniques discussed in this book.
Further back into time (50-500 million years: Table 1) most of the techniques are
relatively ineffective, except for sequencing of relatively conserved genes (Chapter
9) and perhaps comparing changes in organization of organelle genomes (Chapter 8).
Beyond 500 million years divergence, only sequencing the most conserved genes has been
effective for phylogeny reconstruction. In this range, adequate resolution of closely
spaced divergence points becomes highly unlikely using any technique.

If several techniques are appropriate for addressing a particular problem, cost and the avail-
for laboratory set-up and operating expenses vary considerably, but in general isozyme
electrophoresis and cytogenetics are the least expensive techniques per specimen
examined, whereas DNA hybridization and restriction analysis are several times as
expensive, and nucleic acid sequencing is the most costly approach. However, this
doesn't mean that a given problem will always be answered with less money by the less
expensive techniques, because considerable money (and time) can be wasted by trying to
apply an inappropriate technique to a particular question. All heritable information
is potentially accessible to DNA sequencing, whereas only subsets of this information
are accessible to the other techniques. Often, the choice of technique will depend
upon the resolution required to address the question of interest.

For many problems, it will be useful to use more than one approach. For example,
simultaneous examination of chromosomes, allozymes or microsatellites, and mtDNA to
investigate population structure, clonal diversity, or hybridization phenomena can
provide qualitatively [...] information than would be obtained from the use of any one
approach. For phylogenetic studies, it is useful to study several sequences that
evolve at different rates to resolve different parts of the phylogeny. An investigator
may compare allozymes to identify groups and to obtain some phylogenetic information
within and between groups; rapidly evolving sequences (e.g., animal mtDNA) to resolve
relationships within groups; and slowly evolving sequences (e.g., rDNA) to resolve
among-group relationships or to root the tree by comparison to outgroup taxa.

DATA ANALYSIS: ISSUES AND CONTROVERSIES

Trees versus Networks

Most phylogenetic methods produce trees, which in the language of graph theory are
restricted to graphs without cycles (cycles are commonly called reticulations by
systematists). Trees can be either rooted or unrooted (the latter are undirected with
respect to time). Unrooted trees are sometimes mistakenly called networks by
systematists, but the term network actually refers to graphs with cycles (see Figure
1). Of course, some biological phenomena can only be represented by networks. Examples
include recombination events between genes, hybridization events between lineages, and
processes of horizontal gene transfer such as retrotransposition. In situations where
such phenomena are likely to occur (e.g., in many intraspecific studies, or in groups
where hybridization is common), methods that build networks rather than trees should
be used.

Figure 1 (A) Unrooted tree (not directed with respect to time, but contains no
cycles). (B) Rooted tree showing same branching relationships as in (A), but rooted
along the branch leading to F. This implies a direction of time, from the root toward
the tips. The branch lengths in this example are arbitrary. (C) An unrooted network
with one cycle. If this network were rooted along lineage F, the cycle might be
interpreted as a recombination or hybridization event between the lineages leading to
A and C, which gave rise to lineage B. The graph could also be interpreted as
ambiguous placement of A, B, C.

The principal problem with building networks is to detect recombination events (or
other reticulations). One of the simplest (yet often effective) procedures for
detecting reticulations that result from hybridization events involves producing a
tree, then searching for branches with excessive homoplasy (Buth, 1984a; Funk, 1985).
More recently, Hein (1990, 1993) developed an explicit algorithm for detecting
recombination. Other methods (e.g., Bandelt and Dress, 1992; see Chapter 11) examine
support for alternative solutions and present the results as a network that represents
the potential ambiguity. Templeton et al. (1992) developed a method that uses Hein's
algorithm to detect recombination events, and then represents parsimonious and
near-parsimonious solutions in a network. This latter method is most effective when
the average number of changes among haplotypes is small, a situation in which most
other methods are least effective (Crandall, 1994).

To date, methods that assume a tree are used much more commonly than methods that
assume a network. However, under certain circumstances, an assumption of a network is
much more realistic (Crandall et al., 1994; Crandall, 1995a,b; Crandall and Templeton,
1996). Given the recent interest in intraspecific applications of gene evolution,
further development and more widespread use of network methods is expected.

Combined versus Separate Analyses of Multiple Data Sets

In any field of science, a question arises whenever multiple studies have been
conducted to address the same problem: If the results of the individual studies
differ, what is the best way to reach a general conclusion? The general term for such
a combined study is a meta-analysis, which was originally defined as a "statistical
analysis of a large collection of analysis results from individual studies for the
purpose of integrating the findings" (Glass, 1976). However, the term is often used in
a more restrictive sense to describe a particular method of analyzing multiple studies
(Hedges and Olkin, 1985; Olkin, 1990; Mann, 1990; Dickersin and Berlin, 1992). The
typical meta-analysis (restrictive sense) consists of a weighted, combined analysis of
all the data from across studies; the result of this combined analysis is then
compared (statistically) to the results of the separate analyses of the individual
studies. It is not unusual to find that the mean estimate from the combined studies
falls within the confidence limits of the estimates of each of the individual studies,
even though the point estimates of these individual studies differ. In this case, the
differences among the studies may be ascribed to stochastic variation, and the grand
mean can be accepted as the best estimate of the parameter in question.

In a phylogenetic context, the multiple data sets may represent different genes,
different kinds of data (e.g., sequence data and morphological data), or even
different process classes within a single gene (e.g., first, second, and third
positions of codons). Debates about how (or whether) data from these multiple data
sets should be combined in phylogenetic analyses have paralleled the debates about
meta-analyses in general (see Hillis, 1995). Hillis (1987) suggested that the best
estimate of phylogeny is derived from a combined analysis of all relevant data, but
that congruence among independent data sets provides convincing evidence that the
underlying phylogeny is being correctly estimated. This position is consistent with
the basic idea of a meta-analysis. Kluge (1989) argued that the relevant data sets
should always be combined for analysis, but that the combined analysis makes the
individual results irrelevant, an approach he called "total evidence." Kluge argued
that a combined analysis maximizes the explanatory power of all the data, whether or
not the individual results are consistent with the combined result. This is equivalent
to the first part of a standard meta-analysis. Miyamoto and Fitch (1995) took the
opposite position, and argued that the individual data sets (or process partitions)
always should be analyzed separately, because the separate analyses are likely to
provide insights into the individual data sets, and different results could indicate
violation of underlying assumptions in one or more of the analyses. This is equivalent
to the second part of a standard meta-analysis. In practice, most systematists do both
separate and combined analyses (Hillis, 1987; Olmstead and Sweere, 1994), although the
final step (asking whether the combined result is within the confidence limits of the
individual studies) is rarely attempted. The procedures for
establishing whether or not a given result is within the confidence set of trees for a
given analysis are still under development (see Sanderson, 1989; Swofford, 1991;
Rodrigo et al., 1993; de Queiroz, 1993; Lanyon, 1993; Hillis, 1995). However, examples
of this approach are beginning to appear (e.g., Omland, 1994), and phylogenetic
meta-analyses of multiple data sets will likely become more common in the near future.

A combined phylogenetic analysis makes two important assumptions: first, that the same
underlying tree is being reconstructed in each of the studies; and second, that the
chosen method of analysis is appropriate for each of the individual data sets. If a
test for homogeneity among data sets fails, this is an indication that one or both of
these assumptions has been violated (Bull et al., 1993b; de Queiroz, 1993). A
violation of the first assumption can occur because individual gene trees may differ
from the species tree that contains them (because of lineage sorting or
non-orthology). A violation of the second assumption may occur because a given method
may be inconsistent (or otherwise biased) for one or more of the data sets, or because
the different data sets are evolving in response to different processes (e.g., rates
of substitution may differ dramatically). Under these conditions, the combined
analysis may be less informative or even misleading compared to one or more of the
individual analyses (see Bull et al., 1993b for some examples). Modifying the method
of combined analysis (e.g., by differential weighting; Hillis, 1987; Chippindale and
Wiens, 1994; or by making the model underlying the analysis more generally applicable)
will solve the problem in some cases. However, in other cases it is possible to show
that a given data set is uninformative at best and misleading at worst, no matter how
it is analyzed (see Huelsenbeck et al., 1995 for an example). In such cases, the
combined analysis clearly should exclude the problem data set.

Hypothesis Testing and the Parametric Bootstrap

Most methods for testing the reliability of phylogenetic results concern testing the
reliability of the data as a whole (is there information content in

assign some measure of reliability to each of the internal branches in a tree (see
Chapter 11). Such approaches are designed with hypothesis-generating (rather than
hypothesis-testing) studies in mind (see also Chapter 1). A tree is often
reconstructed with no a priori hypothesis of phylogeny to be tested: the investigator
simply wants a reliable estimate of phylogeny for the group. Under these conditions,
we need some measure of the reliability of the various reconstructed branches.
Measures such as bootstrap proportions and support indices are designed to provide
this information. However, as detailed in Chapter 11, the interpretations of these
measures are not always straightforward in this context. Furthermore, in many cases,
it is the overall tree structure (rather than a particular branch) that suggests that
the null hypothesis is incorrect. We can imagine a situation in which no single branch
is particularly well supported, and yet the cumulative effects of many branches
contain enough phylogenetic signal to reject a particular null hypothesis. Under these
conditions, the method of parametric bootstrapping (Efron, 1982; Bull et al., 1993a;
Huelsenbeck et al., 1995) can be extremely useful.

Bootstrapping methods are a general set of methods for creating pseudoreplicate data
sets in situations where true resampling is impractical or impossible. (The name
"bootstrapping" refers to pulling one's self up by the bootstraps in this
statistically difficult situation.) In the case of phylogenetics, we only have a
single instance of each taxon. Yet, we know that the distribution of characters we
observe is influenced by stochastic effects. The pseudoreplicate data sets generated
by bootstrapping allow an investigator to assess whether or not these stochastic
effects are likely to have influenced the results (in the phylogenetic context, the
branching order of the tree). In phylogenetic analyses, nonparametric bootstrapping
(usually simply called "bootstrapping" in systematics) is the most commonly used
method: the pseudoreplicate data sets are generated by randomly sampling the original
character matrix with replacement to create new character matrices of the same size as
the original (Efron, 1979, 1982; Felsenstein, 1985a; see Chapter 11). The frequency
with which a given branch is found upon analy-
the data set, or just random noise?) or attempt: to sis of these pseudoreplicate data sets is recorded
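The resampling step of the nonparametric bootstrap described above can be sketched in a few lines of Python. This is only an illustrative sketch: the toy alignment, the function names, and the fixed seed are assumptions made for the example, and the tree-building step that would be applied to each pseudoreplicate is not shown.

```python
import random

def bootstrap_matrix(matrix, rng):
    """Resample the characters (columns) of a taxon-by-character matrix
    with replacement, keeping the matrix the same size as the original."""
    n_chars = len(matrix[0])
    # Draw n_chars column indices with replacement.
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[c] for c in cols] for row in matrix]

def bootstrap_replicates(matrix, n_reps, seed=0):
    """Generate n_reps pseudoreplicate matrices (seed is fixed for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_matrix(matrix, rng) for _ in range(n_reps)]

# Hypothetical toy alignment: three taxa, eight characters.
alignment = [list("ACGTACGT"), list("ACGTTCGA"), list("ACCTTCGA")]
reps = bootstrap_replicates(alignment, n_reps=100)
```

Each matrix in `reps` would then be analyzed with the same method used for the original data, and the proportion of replicates that recover a given branch would be tallied.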
524 Chapter 12 / Hillis, Mable & Moritz
Table 2
The number of distinct, unrooted, bifurcating trees as a function of the number of taxa

Number of taxa    Number of trees
10                2 × 10^6
22                3 × 10^23
50                3 × 10^74
100               2 × 10^182
1,000             2 × 10^2,860
10,000            8 × 10^38,658
100,000           1 × 10^486,663
1,000,000         1 × 10^5,866,723
10,000,000        5 × 10^68,667,340
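The entries in Table 2 are consistent with the standard count of distinct unrooted bifurcating trees for n taxa, (2n - 5)!! = 3 × 5 × ... × (2n - 5). The following sketch, with function names chosen only for this example, computes the exact count for small n and its base-10 logarithm for large n.

```python
import math

def num_unrooted_trees(n_taxa):
    """Exact number of distinct unrooted bifurcating trees, (2n - 5)!!, for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count

def log10_num_unrooted_trees(n_taxa):
    """log10 of (2n - 5)!!, using (2n - 5)!! = (2n - 5)! / (2^(n-3) * (n - 3)!)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    ln_count = (math.lgamma(2 * n_taxa - 4)        # ln (2n - 5)!
                - (n_taxa - 3) * math.log(2)       # ln 2^(n-3)
                - math.lgamma(n_taxa - 2))         # ln (n - 3)!
    return ln_count / math.log(10)

print(num_unrooted_trees(10))  # 2027025, i.e. about 2 × 10^6
for n in (22, 50, 100, 1000):
    log_count = log10_num_unrooted_trees(n)
    exponent = math.floor(log_count)
    print(n, f"{10 ** (log_count - exponent):.1f} x 10^{exponent}")
```

Working with the logarithm avoids building integers with millions of digits for the larger entries in the table.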
problems in which a clear a priori hypothesis exists. However, this may not be as limiting as it first seems. If an initial estimate suggests a potential source of systematic bias (such as long branch attraction or skewed base composition), parametric bootstrapping can be used to assess whether the observed conditions are sufficient to affect the results. Another good use for parametric bootstrapping is to predict the amount of data that will be needed to obtain reliable phylogenetic resolution, given a preliminary data set and preliminary tree estimation (Huelsenbeck et al., 1995). Thus, parametric bootstrapping can be used not only for testing alternative hypotheses, but also as a tool to guide in study design (Chapter 2) and troubleshooting.

Phylogenetic Accuracy
The number of possible solutions in any phylogenetic analysis increases remarkably quickly as a function of the number of taxa. Even if only bifurcating solutions are considered, there are more possible branching orders for 50 taxa than there are atoms in the universe (Table 2). If we consider the number of possible phylogenies for all living species, the size of the potential solution set is beyond normal comprehension. For 10 million taxa (well below most estimates of the number of ex-
rooted bifurcating trees. Although it is unlikely that anyone would attempt an analysis of this size, analyses have already appeared that consider the relationships among hundreds of species (Chase et al., 1993), and analyses of hundreds or even thousands of human mitochondrial sequences are likely to be attempted. Obviously, the number of possible solutions in these cases makes exhaustive examination of the solution space impractical. Given the number of possible solutions, it seems unreasonable to expect that we could find the one correct history out of all the possible phylogenies. In fact, how do we know that any of our estimated phylogenies are accurate? Accuracy of phylogenetic methods can be assessed in several ways, including simulations, experimental phylogenies, statistical tests, and congruence studies. Statistical tests and congruence studies are discussed in Chapter 11 and elsewhere in this chapter. Below, we present a discussion of the use of simulations and experimental studies for assessing accuracy in phylogenetic analyses (for additional discussion of this topic, see Hillis, 1995).

Simulations and Performance Criteria for Phylogenetic Methods
The most widely used (and abused) approach for assessing phylogenetic performance is numerical simulation under an explicitly stated evolutionary
portant, consistency has received a surprising amount of attention in contrast to efficiency. Efficiency is a measure of how quickly a method converges on the correct solution as more data become available to the method. For many methods, there is a tradeoff between consistency and efficiency.
For instance, Lake's (1987) method of invariants is consistent under a wide variety of conditions, and has little bias even under extreme conditions of branch-length heterogeneity. However, the method is also extremely inefficient under many circumstances. Hillis et al. (1994b) presented an illustrative simulation of this point, in which the most efficient methods found the correct tree 100% of the time with as few as 200 nucleotides, whereas Lake's method of invariants required >10^9 nucleotides to achieve the same level of performance with the same data sets. It is probably more important for most investigators to know that a chosen method can find the correct solution with a limited data set than to know that it would find the correct solution if they had an infinite data set.
There also are tradeoffs in some cases between robustness and efficiency. A method is robust if it is relatively insensitive to violations of its assumptions. A method may be both consistent and efficient under a given model of evolution, and yet if the assumptions of the method are violated, the method may quickly become inefficient and/or inconsistent. This suggests an excellent use for simulations, namely to explore the sensitivity of a method to its various assumptions by systematically violating them. A related point is that complex models may be needed to achieve efficiency under some conditions, and yet model complexity also comes at a cost (see Chapter 11).
One of the most obvious tradeoffs is between computational speed and discriminating ability. Single-tree algorithms (e.g., neighbor joining, UPGMA, various stepwise addition algorithms; see Chapter 11) are very fast for finding a point estimate of a tree, but they do not guarantee an optimal solution, and they do not permit the comparison of alternative solutions. Many of these algorithms are good for finding a starting point for a more thorough search of tree space under a given optimality criterion, but they should not be viewed as a final solution. Strangely, some investigators (e.g., S.B. Hedges et al., 1992b; Stoneking et al., 1992) consider the single-tree output an advantage of using these methods. They may give a single "answer," but they clearly do not guarantee that this answer is even among the best solutions. For instance, using the same data set for which S.B. Hedges et al. (1992b) calculated a single neighbor joining tree, David Swofford (personal communication) found more than 27,000 better trees under the minimum evolution criterion (which is the appropriate criterion for neighbor-joining according to both authors of the algorithm; Nei, 1991; Saitou, 1991). The accepted standards in the field (as of this writing) appear to be very different depending on the criterion chosen. Point estimates are almost never accepted by investigators who choose parsimony or maximum likelihood criteria, at least without some search of tree space for better or equally good solutions. (It was not always so; old parsimony programs such as Wagner78 output point estimates only, and these results were widely reported in the 1970s.) Unfortunately, this rigor often does not extend to investigators who use a distance criterion such as minimum evolution; many papers still appear each year with only neighbor-joining trees, without any attempt to optimize the solution. Whether this is a result of lax standards or ignorance on the part of investigators, reviewers, and editors is unclear, but the evidence suggests that many investigators do not realize that neighbor-joining trees are only approximate solutions.
Among the various optimality criteria, there is also a tradeoff in many cases between efficiency and computational speed. For instance, when their assumptions are met, maximum likelihood methods are often more efficient than other methods (Hillis et al., 1994b; Huelsenbeck, 1995a,b). However, the much greater computational complexity of maximum likelihood limits the thoroughness of tree searches for large data sets compared to parsimony or distance criteria. Thus, an investigator with a large data set needs to decide if choice of criterion is more important than a thorough search of tree space for optimal or near-optimal solutions.
Methods also differ greatly in their versatility; in other words, what kind of information can be incorporated into the analysis? The popularity of parsimony methods (Sanderson et al., 1993) stems in part from their great versatility. Not only can all kinds of character data (whether morphological, behavioral, ecological, or molecular) be analyzed using parsimony, but almost any information on evolutionary processes can be incorporated into the analyses as well. For instance, each site in a gene could be weighted differently based on a pri-
in Hillis et al., 1993a). Parsimony analyses also include estimating ancestral character states as well as estimating branch lengths and branching order, whereas many other methods of analysis exclude at least one of these procedures.
In summary, the choice of method will depend on which of the various performance criteria are of greatest importance for a given application. Simulations can be useful for evaluating some of these criteria under particular circumstances, but the results of a given simulation study should not be overgeneralized. It is important to realize that no method is best for all performance criteria, and that there are tradeoffs among many of the criteria. In choosing a method, an investigator should identify the goals of his or her study, and then evaluate which of the methods is best suited to meet those goals. Beware of anyone who recommends a method without asking about the goals and details of the study!
ferred phylogeny (constructed blindly with respect to knowledge of the true tree).
Experimental studies of phylogeny have just begun to appear, so there have been only limited tests of the predictions from simulations. Already, however, it appears that experimental studies will be useful for identifying assumptions of models that are likely to be violated when applied to real organisms and for testing assertions of method superiority. For instance, a maximum likelihood model for restriction site variation has been shown to perform rather poorly compared to other methods in an experimental phylogeny (Hillis et al., 1994a). Also, coding of restriction fragments (rather than restriction sites) has been shown to be positively misleading in some experimental cases (M.E. White et al., 1991; Hillis et al., 1994a), contrary to the recommendations of some recent authors (e.g., B. Bremer, 1991).

Predictions of Time from Molecular Data
A common application (and an area of considerable controversy) of molecular systematics is the prediction of time from molecular divergence data. It is clear that molecular divergence is roughly correlated with time of divergence; however, there is considerable debate about constancy of rates of divergence and how much error is associated with predictions of divergence times from measures of molecular similarity (Chapter 1). The "molecular clock hypothesis" holds that the rate of molecular change is constant enough (within the bounds of particular genes and taxa) to be useful in predicting times of divergence. However, despite the many applications of the molecular clock hypothesis, there have been few serious attempts to determine confidence limits of the estimates of time derived from molecular divergences. In this section, we suggest that much greater rigor and caution is needed in estimating divergence times from molecular data than is commonly exercised.
Zuckerkandl and Pauling (1962, 1965) were the first to suggest that genes and their protein products might evolve at rates constant enough that measures of molecular divergence could be used to calibrate a "molecular clock." Recent advocates of this hypothesis view molecular differentiation not as a metronome, but as a Poisson process with regularity of the same order of magnitude as radioactive decay (Fitch, 1976b; A.C. Wilson et al., 1977, 1987a). This has promoted the use of molecular divergence measures to provide a timeframe for phylogenies, particularly where there are insufficient data on fossils (e.g., Sarich and Wilson, 1967; A.C. Wilson et al., 1974, 1975; Beverley and Wilson, 1985; A.C. Wilson et al., 1987b; Bowen et al., 1989; Vigilant et al., 1991; Wayne et al., 1991b).
The "molecular clock controversy" is really several different controversies. Therefore, we break the controversy down into its various components in the following sections.

Is There a Universal Molecular Clock?
Heterogeneity of rates across different nucleotide positions, different genes, different genomic regions, or different genomes within an organismal lineage (for instance, nuclear versus organellar genomes) is undeniable (for a review, see Li and Graur, 1991). However, "universal" molecular clocks have been proposed for many individual genes or genomic regions across a wide spectrum of taxa. For instance, a clock is often claimed for animal mtDNA, which is supposed to evolve at about 2% sequence divergence per million years between pairs of taxa (W.M. Brown et al., 1979). A.C. Wilson et al. (1985) stated that "no major departures from this rate are known for the molecule as a whole." However, since then, many studies have shown considerable rate heterogeneity of mtDNA within and between various animal groups (DeSalle and Templeton, 1988; Hasegawa and Kishino, 1989; Martin et al., 1992b; Avise et al., 1992c). Molecular clocks have been proposed for numerous nuclear genes as well, but in most cases the evidence suggests that rates of substitution vary among taxonomic groups (reviewed by W.-H. Li, 1993; Avise, 1994). There is, however, an important caveat: all estimates of rates of divergence must ultimately trace back to dates derived from the fossil record (or, more dubiously, to vicariance events estimated from biogeography) and these are often open to different interpretations (Marshall, 1990; Easteal and Collett, 1994).
The suggested reasons for the lack of generality of molecular clocks include differences in
metabolic rate (Thomas and Beckenbach, 1989; Avise et al., 1992c; Martin et al., 1992b); differences in DNA repair efficiency (Britten, 1986); differences in exposure to mutagens (Adelman et al., 1988); differences in nucleotide generation times (Martin and Palumbi, 1993b); differences in number of DNA replications in germ line cells (Wu and Li, 1985); and differences in organismal generation times (Laird et al., 1969; Kohne, 1970; Catzeflis et al., 1987; W.-H. Li et al., 1987; Sibley et al., 1988; Gaut et al., 1992). Martin and Palumbi (1993b) usefully combined the effects of metabolic rate and generation time into a single concept of "nucleotide generation time."
This apparent heterogeneity leads many advocates of molecular clocks to argue for "local" molecular clocks (see W.-H. Li, 1993). The argument is that among closely related species with similar life histories, metabolic rates, generation times, etc., rates of evolution for a particular gene are likely to be stable. Therefore, according to this argument, predictions of time can be made if we calibrate the rate of evolution separately for each gene in each taxonomic group. Putting issues of practicality aside for the moment, it is a useful exercise to accept that such local clocks exist and turn to the issue of what we should expect from the perfect molecular clock.
with substitutions accumulating following a Poisson distribution (A.C. Wilson et al., 1987a). Therefore, the only variation we will observe is stochastic. For the sake of discussion, we will set the rate at one substitution (within the gene of interest) per million years.
2. Rate of change is equal across all positions compared and across all lineages.
3. The phylogenetic tree can be reconstructed without error, and each branch in the tree can be analyzed independently.
4. The number of substitutions along each lineage in the tree can be reconstructed without error.
5. Calibration dates for all times of divergence used to calculate the rate of the molecular clock are known without error.
6. A regression of time on number of substitutions can be conducted without error.
Under these conditions (none of which is realistic, of course), we would be able to construct a molecular clock like the one shown in Figure 7. The confidence limits for individual data points can be easily calculated under this model based on the Poisson expectations:

$$P = \frac{e^{-\mu}\,\mu^{Y}}{Y!}$$

where P is the expected frequency of Y substitutions and μ is the mean number of expected substitutions. Thus, on our perfect clock (with a mean substitution rate of one substitution per million years), for lineages of 15 million years we would expect on average to observe 15 substitutions, and we would expect 95% of such lineages to have between 8 and 22 substitutions, inclusive (Figure 7).

Figure 7 The sampling expectations of a perfect molecular clock, arbitrarily set at one substitution (throughout the sequence examined) per million years. The model assumes that change is linear with time, substitutions follow a Poisson process, there is no error in calibration times or collection of data, all substitutions are observed, and all lineages are evolutionarily independent. Under these conditions, the 95% confidence limits for individual data points would be as shown. For instance, 95% of lineages isolated for 15 million years would be expected to exhibit 8-22 substitutions, inclusive. [Horizontal axis: number of substitutions, 0 to 25. Annotation on plot: 95% confidence limits, 15 ± 7 substitutions.]
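The 8-22 range quoted for a lineage of 15 million years can be checked directly from the Poisson expectation. The short sketch below, with function names chosen only for this example, sums the Poisson probabilities for 8 through 22 substitutions when the expected number is 15.

```python
import math

def poisson_pmf(y, mu):
    """Probability of exactly y substitutions when mu are expected."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def poisson_interval_prob(lo, hi, mu):
    """Probability that the number of substitutions falls in [lo, hi], inclusive."""
    return sum(poisson_pmf(y, mu) for y in range(lo, hi + 1))

mu = 15  # one substitution per million years over 15 million years
p_8_22 = poisson_interval_prob(8, 22, mu)
print(f"P(8 <= Y <= 22 | mu = {mu}) = {p_8_22:.3f}")
```

The result is close to 0.95, which is why 15 ± 7 substitutions serves as an approximate 95% interval even for this idealized clock.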
Limitations of Molecular Clock Calibrations
In the real world, it is never possible to satisfy all the conditions of the perfect molecular clock model. In particular, there are a number of problems that occur in calibrating molecular clocks. For instance, rather than using independent lineages in a phylogenetic tree to calibrate a clock, most investigators use all pairwise divergences among taxa within a given group. These values are not independent of one another because many are based on shared portions of the phylogenetic tree (see also Lynch and Jarrell, 1993). This lack of independence can only inflate the perceived correlation between divergence and time (Figure 9). There are also numerous problems associated with determining dates from fossil evidence. Ideally, we would need accurately dated fossils from just above and below the splitting event we wish to date (Figure 10A). However, it is more likely that the fossils are not direct ancestors, but simply branched off the tree before and after the splitting event (Figure 10B; Marshall, 1990). These latter dates will tend to underestimate the age of the splitting event. Even with an outstanding fossil record for a group, it is exceedingly difficult to pinpoint the age of the last common ancestor of a group of living species (S.S. Carlson et al., 1978).
Biogeographic data also have been used to calibrate molecular clocks, but there are difficul-
Although we believe this basic approach is flawed for the reasons outlined above, it is useful to calculate the confidence limits of these clocks based on the regression model that has been used to establish the calibrations. In other words, we will ignore the problems of non-independence of the pairwise divergence estimates for a moment to estimate the confidence limits. Even giving these calibrations the benefit of the doubt, it can be shown that the confidence limits for new estimates of time based on the models are so large as to make the clocks highly imprecise.
The regression model is usually a simple weighted linear regression of time on molecular divergence, with the constraint that the intercept of the regression line is the origin. The common calibration technique of dividing the average time of divergence by the average molecular divergence will produce the correct slope of this regression under the assumption that the residual error of the regression is proportional to molecular divergence (Snedecor and Cochran, 1989). In other words, this method is acceptable if there is proportionally greater deviation about the regression line at high levels of molecular divergence than at low levels of molecular divergence. This seems to be a reasonable assumption, and plots of time versus molecular divergence generally follow this pattern (Figures 12-14).
Calculating confidence limits for molecular clocks based on the expected error in the measure of molecular divergence is inadequate, because this source of error is trivial compared with the residual error of the regression. It is more rigorous to assume that the error associated with the molecular measure is trivial and calculate confidence limits for predictions of time based on the regression (although this will result in a slight underestimate of the confidence limits). Figure 12 shows two 95% confidence limits for a regression of time on molecular divergence. The data are based on percent divergence (corrected for expected multiple substitutions; C%) of silent substitutions in coding regions of several genes compared among various pairs of mammals (from Britten, 1986). A weighted linear regression of time (Y) on divergence (X) gives Y = 1.39X (as represented by line A in Figure 12). The variance of the residuals under this model of regression is given by

$$s^{2}_{y \cdot x} = \frac{\sum \left( Y^{2}/X \right) - \left( \sum Y \right)^{2} / \sum X}{n - 1}$$

and the estimated standard error of the slope (b = 1.39) is given by

$$s_{b} = \frac{s_{y \cdot x}}{\sqrt{\sum X}}$$
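The regression through the origin described above, with residual error proportional to molecular divergence, can be sketched as follows. The function names and the example numbers are hypothetical, not the Britten (1986) data. The slope is the sum of the times divided by the sum of the divergences, and the residual variance and standard error of the slope are the usual weighted least-squares results for this model. The standard error for a new prediction is likewise a standard weighted least-squares result and is included here as an assumption, not as a formula taken from the text.

```python
import math

def clock_regression(times, divergences):
    """Weighted regression of time (Y) on divergence (X) through the origin,
    assuming the residual variance is proportional to X (weights 1/X)."""
    n = len(times)
    sum_x = sum(divergences)
    sum_y = sum(times)
    b = sum_y / sum_x  # slope: average time / average divergence
    # Residual variance: [sum(Y^2 / X) - (sum Y)^2 / sum X] / (n - 1)
    s2 = (sum(y * y / x for x, y in zip(divergences, times))
          - sum_y ** 2 / sum_x) / (n - 1)
    se_b = math.sqrt(s2 / sum_x)  # standard error of the slope
    return b, s2, se_b

def prediction_se(x0, s2, sum_x):
    """Standard error of a new time estimate at divergence x0 (assumed
    standard result): residual scatter at x0 plus uncertainty in the slope."""
    return math.sqrt(s2 * (x0 + x0 ** 2 / sum_x))

# Hypothetical calibration points: divergence (X) and time (Y).
divergences = [1.0, 2.0, 3.0]
times = [1.0, 3.0, 2.0]
b, s2, se_b = clock_regression(times, divergences)
print(b, s2, se_b, prediction_se(2.0, s2, sum(divergences)))
```

Multiplying the prediction standard error by the appropriate t value (n - 1 degrees of freedom) would give confidence limits for a new estimate of time under these assumptions.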
Y =SIX/s
(Snedecor and Cochran, 1989).* The 95% confidence limits of new estimates of time from the data presented in Figure 12 are represented by lines C1 and C2. These limits are quite large; for instance, at C% = 50, the 95% confidence interval is 69.5 ± 65.34 million years.
The above approach can be used to calculate
estimated. However, even these underestimated confidence limits are so great as to render the clock estimates minimally useful. Nonetheless, if one is interested in applying molecular clock models to questions of time-since-divergence, then the error associated with the estimates of time cannot be ignored.
The data that have been used to calibrate two other molecular clocks are plotted in Figures 13 and 14. In Figure 13, data on mtDNA sequence divergence is plotted against time-since-divergence information derived from the fossil record of primates (from W.M. Brown et al., 1979). This calibration is widely used as a standard mtDNA clock (A.C. Wilson et al., 1985), although Moritz et al. (1987) stressed likely errors associated with paleontological calibrations and from variation among lineages. Although the confidence interval for this calibration is smaller than that in Figure 12, it is still large enough to be quite limiting for most applications of a molecular clock.
The calibration of albumin divergence based on immunological comparisons among birds (Figure 14; Prager et al., 1974) shows that the confidence limits of new predicted values of time may be so large as to not exclude any reasonable possibility. Note, however, that the confidence limits for the slope of this calibration for birds (B1 and B2 in Figure 3) do not include the rate reported for mammals (D in Figure 3), as Prager et al. (1974) correctly concluded. This highlights the necessity of calibrating molecular clocks within the group of interest.
The values of time-since-divergence used in Figures 12-14 are by no means universally accepted; indeed, the extreme difficulty with which such data may be garnered from the fossil record is a major obstacle to calibrating molecular clocks. We have used the original data upon which these calibrations were based in order to provide confidence limits for estimates derived from the calibrations. New calibrations based on new values of time are possible, but these calibrations should be accompanied by newly calculated confidence limits.
It is difficult to find relevant data to calculate confidence limits for an allozyme (Nei's genetic distance, D) clock. As noted by Avise and Aquadro (1982), "...the major obstacle to critical tests of the electrophoretic protein clock is the almost total lack of reliable independent information about times of speciation." Nonetheless, there has been an enormous range of estimated rates

*It is important in calculating these confidence limits to recall that the regression model assumes the residual error of the regression is proportional to the molecular divergence and that the regression line runs through the origin. Confidence limit calculations that assume the residual error is the same for all values of molecular divergence (e.g., S.S. Carlson et al., 1978) seriously underestimate the actual confidence limits.

Figure 14 Regression of estimated time since separation on immunological distance. The data points are the same that were used by Prager et al. (1974) to calculate the rate of albumin evolution in birds (A); hence, some of the points are averages for comparisons of several species. Key to confidence limits is the same as for Figure 12. D indicates the reported relationship between time since divergence and albumin immunological distance for mammals (Sarich and Wilson, 1967). Confidence limits of D cannot be calculated because of an insufficient number of data points. [Horizontal axis: immunological distance.]
al. (1986) used a molecular clock based on nucleotide substitutions among isolates of influenza A viruses collected across 50 years to predict the origin of a Russian influenza strain that resulted from accidental release in an inoculation program. However, the influenza isolates collected through the 50-year sampling period clearly are not evolutionarily independent, and essentially the same result can be obtained by simply assuming that the isolates are part of a single lineage sampled through time. The rate of evolution doesn't even need to remain constant through time for this approach to work.
Yet another approach is to use coalescent models to estimate times of divergence. These models require a number of limiting assumptions, but under certain circumstances, it may be possible to obtain reasonable estimates of divergence time and calculate reasonable confidence limits for the estimate. For instance, Dorit et al. (1995) reported a complete lack of variation among 38 human males for a region of the Y chromosome. Assuming random mating, equilibrium population size, exponentially distributed bifurcation times, and a mutation rate estimated from other great apes, they used a coalescent model to estimate that the last common male ancestor for these individuals existed between 0 and 800,000 years ago (95% confidence limits). However, the distribution of expected coalescence times for sequences with no detected base substitutions is exponential (Tajima, 1983) so the mean estimate of coalescence time derived from the uniform Y chromosome sequences is a poor description of the likely outcomes. Despite this uncertainty, the mean estimate of 270,000 years ago presented by Dorit et al. has been widely reported as "fitting" the estimate of the last common female ancestor for human mtDNA (e.g., Pääbo, 1995). Vigilant et al. (1991) reported the latter date at 166,000-249,000 years ago, but this range of dates takes only the experimental error (and not the sampling error of the molecular clock calibration) into account. Based on other primate mtDNA clocks (e.g., Figure 13), we could approximate confidence limits of about 0-1,000,000 years ago for the date of the last common ancestor of human mtDNA. Although the confidence limits for the mtDNA and Y chromosome data clearly overlap, it is difficult to imagine any data that would not have "fit" such broad confidence limits. Furthermore, both estimates appear to be consistent with most current hypotheses of modern human origins.
The above discussion suggests that considerable caution is needed in predicting absolute times of divergence from molecular data. However, this in no way impedes the calculation of relative times of divergence, because many methods of phylogenetic inference are relatively insensitive to differences in rates of divergence (Chapter 11). As an example, although we may have limited confidence in the absolute time that the orangutan lineage diverged from the common ancestor of humans, chimps, and gorillas, molecular data leave little doubt that this event occurred before the latter three lineages diverged (Slightom et al., 1987). Fortunately, the vast majority of applications of molecular systematics do not depend upon calibrations (or even the existence) of molecular clocks. Differences in rates of divergence among lineages detract only from methods of analysis that require clocklike behavior of molecules, and alternative methods of analysis exist for all applications of molecular systematics except for the absolute estimation of time.

APPLICATION OF PHYLOGENIES FOR ANALYZING MACROEVOLUTIONARY PATTERNS: COMPARATIVE METHODS
The ultimate goal of most systematic studies is to provide insight into the historical structure of groups of organisms and the evolutionary processes that underlie diversity. Historically, these goals tended to be separated into the work of taxonomists and the work of population biologists. However, improvements in the confidence with which phylogenies can be estimated have been essential to the development of comparative studies that use statistical tests to account for the non-independence of taxa caused by common ancestry. Although previous tests have been proposed to compensate for relatedness of organisms
under study it is only relatively recently that lack of independence has been recognized by non-systematists as a major concern in comparative studies. Felsenstein (1985c) was among the first to propose a statistical test for comparisons of continuous traits between organisms that used a phylogenetic hypothesis as a structural framework. He proposed using a series of independent contrasts to search for correlations in traits among terminal taxa and their ancestors.
The major limitation of this approach is the necessity for a reliable phylogenetic hypothesis that includes an estimate of branch lengths (expressed in units of expected variance of change) with which to standardize contrasts (Grafen, 1989). Felsenstein (1985c; 1988b) proposed methods to account for incompletely resolved phylogenies and to estimate branch lengths but this has been the subject of considerable controversy (see Grafen, 1989, 1992; Harvey and Pagel, 1991; Pagel, 1992; Pagel and Harvey, 1992; Garland et al., 1991, 1992; Losos, 1994). Grafen (1989; 1992) proposed the "phylogenetic regression," which is based on Felsenstein's method but uses a likelihood approach to simultaneously estimate relationships
weighted squared-change parsimony to encompass varying models of evolution, and Garland et al. (1992) developed a method for assessing whether contrasts have been adequately standardized (i.e., that underlying models of evolution have been adequately accounted for). Losos (1994) proposed that the sensitivity of statistical comparisons to phylogenetic history could be evaluated by comparing results based on a large number of conceivable phylogenies (varying dichotomous branchings and branch lengths) for a given group to determine the importance of violation of these assumptions.
Many other methods exist for the analysis of correlated continuous characters based on phylogenetic history (e.g., autocorrelation: Cheverud et al., 1985; minimum evolution: Huey and Bennett, 1987, Martins and Garland, 1991; Garland et al., 1991; nested analysis of covariance: Bell, 1989). The reader is directed elsewhere for detailed comparisons of these methods (e.g., Harvey and Pagel, 1991) and for evaluations of their relative performance in simulation studies (e.g., Martins and Garland, 1992; Garland et al., 1992, 1993; Gittleman and Luh, 1992). Methods for analysis of
between standardized independent contrasts and correlations in discrete characters also have been
Lo transform branch lengths. This method has developed (Ridley 1983; Maddison, 1990; see also
been contrasted (Grafen, 1992; Page1 and Harvey, Harvey and Pagel, 1991). The choice of method
1992) with an alternative generalization of Felsen- may depend on whether the emphasis is on iden-
stein's approach for analysis of incompletely re- tifying patterns of correlation in traits among
solved phylogenies that was discussed by Harvey closely related taxa or in more specific hypothesis
and Page1 (1991; fully described in Pagel, 1992).A testing, for which adequate statistical power and
more complete review and additional suggestions at least some knowledge of the underlying distri-
for the treatment of "hard" versus "soft" poly- bution of character change is more critical.
tomies was given by Purvis and Garland (1993). Some questions require reconstruction of an-
The concern in these studies is that contrasts have cestral states for prediction of direction of changes
been adequately standardized to account for (e.g., Donoghue, 1989; W.P. Maddison, 1990,
changes along branches of differing lengths and 1991).Ryan and Rand (1995)reconstructed ances-
that errors associated with interpreting poly- tral calls of frogs of the genus Physalaemus based
tomies have been reduced. on a phylogenetic analysis of mitochondria1DNA
Another assumption of Felsenstein's method sequences (Figure 16). They then synthesized
that has been the subject of discussion is that evo- these calls electronically, and tested preferences
lution proceeds through Brownian motion, so that for extant and ancestral calls among the various
expected variance of change in a trait is propor- species. Ancestral gene and promoter sequences
tional to time. Martins and Garland (1991) per- have also been reconstructed (literally) from in-
formed simulation studies using null distribu- ferred ancestral states from phylogenetic analyses,
tions to evaluate the reliability of Felsenstein's and then tested in living systems (Adey et al.,
method (compared to alternative methods) using 1994; Jermann et al., 1995; Stewart, 1995). Such
studies have been highly successful, and results from experimental phylogenies suggest that
these reconstructions are likely to be highly accurate (Hillis et al., 1992). The ability to
reconstruct ancestral genes, behaviors, and phenotypes provides a powerful tool for
investigating functional changes through time.

[Figure 16: phylogenetic tree with sonagrams for P. enesefae, P. ephippifer, P. species A,
P. pustulosus, P. petersi, P. species B, P. coloradorum, and P. pustulatus; sonagram time axis
in msec.]

Figure 16 Reconstruction of ancestral advertisement calls in frogs of the Physalaemus
pustulosus and P. ephippifer groups (adapted from Ryan and Rand, 1995). The branch lengths are
estimates of the numbers of changes in the mitochondrial 12S rDNA gene (averaged across all
most-parsimonious reconstructions). The graph in the lower right indicates the scale of the
axes for the sonagrams. See Ryan and Rand (1995) for details of the analysis.

Other studies are concerned with comparisons among terminal taxa only. Lynch (1991b) developed
a phylogeny-based likelihood method for partitioning mean phenotypes of taxa into heritable
phylogenetic effects and non-heritable residual components to be used to infer constraints on
macroevolutionary and microevolutionary processes, respectively. McLennan (1991) suggested that
phylogenetic systematics could be used to uncover macroevolutionary patterns, which could be
used for experimental investigation of microevolutionary processes. Brooks and McLennan (1991)
provide a review of the types of questions that have been addressed using these methods.

Whichever method is used, the important point is that combining traditional comparative
studies with phylogenetics has greatly increased the potential to predict and interpret
patterns of evolution among organisms at various taxonomic levels. This approach has increased
communication among biologists in a wide variety of fields and recent applications are too
diverse to cover in this brief overview. However, applications have included such widely
divergent topics as genome size evolution (Sessions and Larson, 1987), coadaptation of
physiological constraints and behavioral preferences (Huey and Bennett, 1987; Martins and
Garland, 1991; Garland et al., 1991), allometry (e.g., Pagel and Harvey, 1989), experimental
ethology of mating systems (Brooks and McLennan, 1991; McLennan, 1991), sexual selection on
secondary sexual characters (Ryan and Rand, 1995), developmental genetics of homeobox
genes (Doyle, 1994), relationships between phylogenetic pattern and developmental processes
(DeSalle and Grimaldi, 1993), use of well-characterized model systems for comparison of gene
and gene system evolution (Kellogg and Bircher, 1993), evolution of coordinated regulation of
genes encoding intermediate metabolic enzymes (Clark and Wang, 1994), and many others. The
number and diversity of these kinds of studies should see a substantial increase with
improvements in methods of phylogenetic reconstruction. However, it must be re-emphasized that
these inferences will only be as sound as the phylogenies on which they are based. Highly
sophisticated analyses of character evolution will not make up for non-rigorous estimation of
the phylogeny of the taxa compared.

THE FUTURE OF MOLECULAR SYSTEMATICS

Molecular systematics has undergone a number of remarkable changes during the past three
decades. These changes include not only technological developments and refinements (e.g.,
discovery and isolation of Type II restriction enzymes, development of DNA sequencing,
discovery of a heat-stable DNA polymerase and its use in DNA amplification) but also major
advances in issues of analysis. We expect these advances to continue, and we hope that
discussions of the current limitations of data collection and analysis presented throughout
this book will stimulate consideration of these issues. Improvements can come from
technological developments per se, as well as from increased sophistication in the use of
current methods. The interplay between our increased understanding of the evolutionary dynamics
of molecules and their use as markers in systematic studies is fundamental to developing more
efficient approaches to sampling, assaying variation, and interpreting results.

The next few decades will continue to be an exciting time for molecular systematics. The common
ground between molecular population genetics and phylogenetics will continue to be explored as
large databases for intraspecific variation in multiple genes are developed for model taxa and
coalescence theory matures. We will almost certainly see many additional entire prokaryote
genomes sequenced and compared, which will provide us with opportunities to examine questions
of whole genome evolution (see Fleischmann et al., 1995). The push to sequence the human genome
will continue to produce advancements in sequencing technology and foster comparative
sequencing projects within eukaryotes. As more data are accumulated on the processes of genome
evolution, this information can be incorporated into better and more reliable methods for
estimating phylogenies. Molecular biology will continue to provide new information on the
molecular basis of development, so that a true synthesis of molecular and morphological data
can occur.

All levels of systematics are enjoying a renascence, as the importance of understanding
historical relationships in interpreting patterns throughout biology is beginning to be widely
appreciated. In addition, the recent emphasis and concern for biodiversity and conservation has
placed more national and international attention on systematics (A.C. Wilson, 1986; Q.D.
Wheeler, 1995). This emphasis and attention can have either a positive or a negative effect on
systematics. The effect will be negative if, in the rush to obtain systematic information,
scientific rigor is abandoned. The effect will be positive if, in the need for accurate
information, a premium is placed on rigorous data collection and analysis. We hope that this
book will help to stimulate the latter course of action.
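As a worked illustration of the independent-contrasts approach discussed under comparative
methods earlier in this chapter, the following is a minimal sketch, not the published
implementation. It assumes a fully dichotomous tree, branch lengths proportional to expected
variance of change (the Brownian motion assumption), and a hypothetical nested-dictionary tree
format; the four taxa, branch lengths, and trait values are invented for illustration only.

```python
import math


def independent_contrasts(node):
    """Return (estimated trait value, adjusted branch length, contrasts) for a node.

    A tip is a dict with "value" and "length"; an internal node is a dict with
    "children" (exactly two) and, except at the root, "length".
    """
    if "children" not in node:
        return node["value"], node["length"], []

    left, right = node["children"]
    x1, v1, c1 = independent_contrasts(left)
    x2, v2, c2 = independent_contrasts(right)

    # Contrast between the two daughters, standardized by its expected
    # standard deviation under Brownian motion (variance ~ branch length).
    contrast = (x1 - x2) / math.sqrt(v1 + v2)

    # Ancestral value: average of the daughters weighted by 1 / branch length.
    value = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)

    # Lengthen the branch below this node to reflect uncertainty in `value`.
    length = node.get("length", 0.0) + (v1 * v2) / (v1 + v2)

    return value, length, c1 + c2 + [contrast]


# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) with invented trait values.
tree = {
    "children": [
        {"length": 1.0, "children": [
            {"value": 1.0, "length": 1.0},   # taxon A
            {"value": 3.0, "length": 1.0},   # taxon B
        ]},
        {"length": 1.0, "children": [
            {"value": 6.0, "length": 1.0},   # taxon C
            {"value": 10.0, "length": 1.0},  # taxon D
        ]},
    ]
}

root_value, _, contrasts = independent_contrasts(tree)
print(root_value, contrasts)
```

A tree of n terminal taxa yields n - 1 contrasts. To test for correlated evolution of two
traits, the contrasts for each trait would be computed on the same tree and then compared, for
example by regression through the origin.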
Acknowledgments
Chapter 1 / Craig Moritz and David M. Hillis
We thank John Avise, John Gillespie, Mark Kirkpatrick, Barbara Mable, and the
late Allan Wilson for comments on various drafts of this chapter.
We are grateful to S. Lavery and C. James for their comments and suggestions.
We thank Sheldon I. Guttman, Mia Molvray, and Elizabeth A. Zimmer for assis-
tance with the botanical section. Carol R. Townsend developed the folded alu-
minum foil packets and cardboard sleeves for packaging tissues from small
organisms. Robert M. Zink and K. Elaine Hoagland read and gave valuable
advice on the manuscript. This chapter is an outgrowth of the Workshop on
Frozen Tissue Collections and Management supported by the National Science
Foundation (Dessauer and Hafner, 1984).
Frick, Ronald G. Garthwaite, Carla Hass, Gennady P. Manchenko, Ronald H.
Matson, Donald C. Morizot, and James B. Shaklee. We are indebted to Donald
E. Campton and Ronald H. Matson who provided us with their own lists of
buffer formulas which we have consulted liberally. We also thank Herb
Dessauer, Stephen D. Ferris, James B. Shaklee, and Gregory S. Whitt for many
helpful comments over the years. Ross MacCulloch, Lisa Gilhooley, Marty
Rouse, Cathy Rutland, Cynthia Horkey, Scotty Allen, and Karen Ditz greatly
assisted with the preparation of the manuscript. Paul Chippindale, David
Hillis, Maurice Ringuette, Gregory S. Whitt and Ronald H. Matson provided
valuable editorial comments. Brian Thompson prepared the equipment dia-
grams.
We thank Charles G. Sibley for a helpful review of an early draft of the manu-
script. We also thank Adalgisa Caccone for suggestions and information regard-
ing TEACL methods.
This chapter has benefitted from the suggestions and input of numerous indi-
viduals, including Jim Bull, Mike Charleston, Keith Crandall, Mike Hendy,
Peter Lockhart, Barbara Mable, David Maddison, Wayne Maddison, Jim
McGuire, David Penny, and Mike Steel. Joe Felsenstein, John Huelsenbeck, and
Paul Lewis have been especially helpful with issues related to maximum likeli-
hood inference and statistical testing. In addition, we thank the students in the
molecular evolution courses at the Woods Hole Oceanographic Laboratory and
the University of Texas for useful comments on material in this chapter.
We thank John Avise, Jim Bull, John Gillespie, Mark Kirkpatrick, and the late
Allan Wilson for comments on various drafts of this chapter. Mike Ryan provid-
ed the file for the preparation of Figure 16. Support for research discussed in
this chapter was provided by the National Science Foundation and the Centers
for Disease Control and Prevention.
A  ampere; unit of electrical current, defined as the current which, if maintained in two straight parallel conductors of infinite length, of negligible circular cross-section, and placed 1 m apart in a vacuum, would produce between these two conductors a force equal to 2 x 10^-7 N/m. Equivalent to the current that passes in a resistance of 1 Ω when a potential difference of 1 V is applied.
°C  degrees Celsius (centigrade scale); unit of temperature based on scale in which 0°C = ice point of water, 100°C = steam point of water (at atmospheric pressure).
C  coulomb; unit of charge; defined as that quantity of charge that flows across any cross-section of wire in 1 sec when there is a steady current of 1 A
cal  calorie; unit of heat; the amount of heat necessary to raise the temperature of 1 g of water from 14.5°C to 15.5°C when the water is at atmospheric pressure
Cal  kilocalorie; 1 Cal = 10^3 cal
Ci  curie; unit of radioactivity; 1 Ci = 3.70 x 10^10 disintegrations/sec
cm  centimeter; 1 cm = 10^-2 m
cpm  counts per minute; detected (typically by a Geiger counter or a scintillation counter) disintegrations per minute (see Ci)
g  gram; unit of mass; originally the mass of 1 cm^3 of water at 4°C; now defined by reference to a standard kilogram, a 4-cm-high, 4-cm-wide cylinder of platinum-iridium kept at the Bureau International des Poids et Mesures in France
g  unit of gravitational force; 1 g = the gravitational force of Earth
hr  hour; 1 hr = 60 min
J  joule; unit of work; 1 J = 1 N·m
kg  kilogram; 1 kg = 10^3 g
km  kilometer; 1 km = 10^3 m
L  liter; unit of liquid volume; 1 L = 1,000 cm^3
M  molar; unit of concentration; moles of solute per liter of solution
m  meter; unit of length; defined as 1,650,763.73 wavelengths of the orange-red radiation of ^86Kr
mA  milliampere; 1 mA = 10^-3 A
mCi  millicurie; 1 mCi = 10^-3 Ci
μg  microgram; 1 μg = 10^-6 g
mg  milligram; 1 mg = 10^-3 g
min  minute; 1 min = 60 sec
μl  microliter; 1 μl = 10^-6 L
ml  milliliter; 1 ml = 10^-3 L
μM  micromolar; 1 μM = 10^-6 M
mM  millimolar; 1 mM = 10^-3 M
μm  micrometer; 1 μm = 10^-6 m
μmol  micromole; 1 μmol = 10^-6 mol
mmol  millimole; 1 mmol = 10^-3 mol
mol  mole; the amount of substance that contains the same number of formula units as there are ^12C atoms in 12.00000 g ^12C (6.0225 x 10^23, or Avogadro's number)
N  Newton; unit of force; 1 N = 1 kg·m/sec^2
N  normal; a solution containing one equivalent weight of the constituent in question in 1 L of solution
ng  nanogram; 1 ng = 10^-9 g
nm  nanometer; 1 nm = 10^-9 m
nmol  nanomole; 1 nmol = 10^-9 moles
Ω  ohm; unit of resistance; 1 Ω = 1 V/A
pg  picogram; 1 pg = 10^-12 g
pmol  picomole; 1 pmol = 10^-12 moles
rpm  revolutions per minute
S  Svedberg unit; unit of sedimentation rate
sec  second; unit of time; originally 1/86,400 of a mean solar day; now defined as 9,192,631,770 vibrations of radiation from ^133Cs
U  enzyme unit
V  volt; unit of electric potential difference; 1 V = 1 J/C
W  watt; unit of power; 1 W = 1 J/sec
Glossary and Abbreviations
A  1. In DNA or RNA sequences: adenine. 2. In protein sequences: alanine.
AAR  Amino acid replacement.
Acetone powders  Preparations obtained by grinding tissues in ice-cold acetone and allowing the acetone to evaporate from the resultant solids.
Additive distances  A set of distances between pairs of sequences or taxa that will precisely fit a unique, additive phylogenetic tree. Defined mathematically by satisfying the four-point condition (see Chapter 11).
Additive tree  A phylogenetic tree in which the distance between any two points is the sum of the lengths of the branches along the path connecting two points.
AGE  Agarose gel electrophoresis.
Alignment  The juxtaposition of amino acids or nucleotides in homologous molecules to maximize similarity or minimize the number of inferred changes among the sequences. Alignment is used to infer positional homology (qv) prior to or concurrent with phylogenetic analysis (see Chapters 9 and 11).
Allele  A particular form of a gene at a particular locus.
Allele genealogy  See gene tree.
Allopatric  Occurring in geographically separate areas. See sympatric, parapatric.
Allozyme  An allele of an enzyme.
Alu repeat  The most abundant interspersed repeated DNA family of primates.
AMPPD  Disodium 3-(4-methoxyspiro-[1,2-dioxetane-3-2'-tricyclo[3.3.1.1^3,7]decan]-4-yl)phenyl phosphate.
Anion  A negatively charged molecule.
Anode  The positive electrode in an electrolytic cell (such as an electrophoresis chamber) toward which anions migrate.
ANS  Anilino naphthalene sulphonate.
Antibody  A large protein made in response to a foreign antigen (generally a protein).
Antigen  Any molecule that elicits an antibody response.
Antigenic site  A region of 5 to 10 amino acids on an antigen to which antibodies can be elicited.
Apomorphy  A derived character state.
Area cladogram  A tree that depicts historical relationships among geographic areas.
ATE  Acetate-Tris-EDTA buffer (see Chapter 9, Appendix).
Autapomorphy  A derived character state unique to a particular taxon.
Autoradiograph  An image produced on X-ray film by placing a radioactive object (such as a gel containing labeled DNA fragments) next to film in a light-proof container.
Autosome  A chromosome other than a sex chromosome.
Avidin-biotin  Glycoprotein-vitamin complex used for histochemical staining with non-radioactively labeled probes. Avidin non-immunologically binds four molecules of biotin, which allows amplification of hybridization signal when used in conjunction with biotin-labeled probes, and anti-avidin and/or anti-biotin antibodies which are themselves conjugated with biotin or fluorochromes (see Chapter 5).

B  In DNA or RNA sequences: any nucleotide except adenine.
Base composition  The relative proportions of the four respective nucleotides in a given sequence of DNA or RNA.
BCIP  5-Bromo-4-chloro-3-indolylphosphate.
BGD  Bromcresol green dye.
Bifurcation  A node in a tree that connects exactly three branches. If the tree is directed (rooted), then one of the branches represents an ancestral
lineage and the other two branches represent descendent lineages. Synonym: dichotomy.
Biotinylated  Labeled with biotin. Biotinylated probes are used for non-radioactive histochemical staining or filter hybridization visualization (see Chapter 5). The technique also may be used to label oligonucleotide primers used for PCR and sequencing (see Chapter 7).
BN buffer  Bicarbonate-nonidet buffer (see Chapter 6, Appendix).
Bootstrapping  See nonparametric bootstrapping and parametric bootstrapping.
bp  Base pair.
BPA  Brooks parsimony analysis (cospeciation analysis). A method for encoding and combining information from several independent phylogenetic trees for the purpose of inferring coevolutionary patterns.
BrdU  Bromodeoxyuridine.
BSA  Bovine serum albumin.
Buffy coat  A thin layer of white blood cells that lies above the erythrocytes after vertebrate blood is centrifuged.
Bulked segregate analysis  Pooling DNA from each parental species and screening bulked sample for polymorphisms using RAPD markers (see Chapter 8).

C  1. In DNA or RNA sequences: cytosine. 2. In protein sequences: cysteine.
Cathode  The negative electrode in an electrolytic cell (such as an electrophoresis chamber) toward which cations migrate.
Cation  A positively charged molecule.
C-bands  Dark bands on chromosomes produced by strong alkaline treatment at high temperature, followed by incubation in sodium citrate solution, followed by Giemsa staining. C-bands generally correspond to regions of constitutive heterochromatin.
cDNA  See complementary DNA.
Central branch  The interior branch connecting the two internal nodes of an unrooted phylogenetic tree of four taxa.
Chaotropic agent  In DNA-DNA hybridization, an agent that reduces thermal stability of base pairing in DNA.
Character  A variable feature that in any given taxon or sequence takes one out of a set of two or more different states (e.g., eye color or amino acid position 12 of a particular protein).
Character compatibility  A method of phylogenetic analysis that seeks the largest clique of characters that can be fitted to a common tree so that each character state arises only once (see Chapter 11).
Character polarity  The inferred direction of change of a character state in a phylogenetic tree; usually determined by reference to the character state in an outgroup.
Character state  The specific value taken by a character in a specific taxon or sequence (e.g., green eyes or glycine at position 12 of a particular protein). See character.
Character-state tree  A description of the transitions among the states of a multistate character, especially when the transitions do not define a linear series of states.
Chiasmata  Sites of the mutual switching of non-sister chromatids of homologous chromosome segments observed during prophase and metaphase of meiosis I.
Chromatid  The eukaryotic chromosome prior to replication, or one of the two longitudinal subunits of a chromosome after replication.
Chromomere  A region on a chromosome of densely packed chromatid fibers that produces a dark band (as on a polytene chromosome).
Chromosome painting  A method for the non-radioisotopic detection of hybridized chromosomal probes using various fluorochromes. The method combines immunochemistry with in situ hybridization. It is used for mapping sequences on chromosomes and for identifying chromosomal homologies between species (see Chapter 5).
Chromosome repatterning hypothesis  One of two hypotheses concerning the mode of evolutionary change in the molecular structure of chromosomes. According to this hypothesis, interspecific differences in the chromosomal location of certain repetitive DNA sequences reflect the redistribution of chromosomal elements within karyotypes. This hypothesis predicts that evolutionary changes in sequence location should be relatively conservative. See homosequentiality hypothesis and Chapter 5.
CI  Chloroform-isoamyl alcohol (see Chapter 9, Appendix).
CIC  See cold-induced constriction.
CISS  Chromosome in situ suppression hybridization. A method for fluorescence in situ hybridization (see FISH) that utilizes probes from DNA libraries of flow-sorted chromosomes to search for DNA sequence homologies of whole or partial chromosomes. CISS is used to identify homeologous chromosomes in different species or in hybrids (see Chapter 5).
Cladogram  A tree that depicts inferred historical branching relationships among entities. Unless otherwise stated, the depicted branch lengths in a
cladogram are arbitrary; only the branching order is significant. See phylogram.
Cluster analysis  A rapid method of hierarchically grouping taxa or sequences on the basis of similarity or distance.
Coalescence  The evolutionary process viewed backward through time, so that allelic diversity is traced back through mutations to ancestral alleles. Coalescent theory can be used to make predictions about effective population sizes, ages and frequencies of alleles, selection, rates of mutation, or time to common ancestry of a set of alleles.
Coancestry coefficient  The correlation of genes of different individuals in the same population; a measure of the relatedness of individuals within populations (symbolized by θ or FST); see Chapter 10.
Cold-induced constriction  A chromosome-specific constriction induced in certain species with large chromosomes by prolonged treatment of the organism at 0.5-2.5°C in the presence of colchicine.
Complementary DNA  DNA reverse transcribed from an RNA template.
Concerted evolution  The generation and maintenance of homogeneity among members of a family of DNA repeats within a species or population.
Conformational isozymes  Multiple forms of a single gene product that differ in secondary or tertiary structure.
Congruence  Agreement among data or data sets.
Consistency  In the context of statistical inference: convergence on the correct answer using a particular method as the sample size becomes infinite. Lack of consistency indicates a critical departure from the assumptions (explicit or implicit) of the method of analysis. All methods are consistent when their assumptions are met; all methods become inconsistent when their assumptions are violated sufficiently. Contrast with efficiency.
Consistency index  A measure of the amount of homoplasy exhibited by a character or set of characters on a tree, defined as the sum of the minimum individual character ranges divided by the observed number of changes. If there is no homoplasy, these quantities will be equal, so that the consistency index achieves its maximum value of one.
Constitutive heterochromatin  Regions on chromosomes consisting mostly of highly repeated, non-coding sequences.
C0t  Initial concentration of DNA in a DNA reassociation experiment (in mol/L) multiplied by time of incubation in seconds.
C0t plot  A plot of percentage of single-stranded DNA versus log of C0t.
cpDNA  Chloroplast DNA.
Criterion  In DNA-DNA hybridization, the stringency of reassociation of single-stranded DNA measured by the difference between the Tm of perfect duplexes in the incubation buffer and the temperature of incubation (see Chapter 6).
Cryptic allele  An undetected (by a particular technique) variant at a gene locus.
CTAB  Hexadecyltrimethylammonium bromide.
C-value  A measure of haploid DNA content per cell.

D  1. In DNA or RNA sequences: any nucleotide except cytosine. 2. In protein sequences: aspartic acid.
DAB  Diaminobenzidine tetrahydrochloride.
DAPI  Diamidino-2-phenylindole.
Degenerate primers  Oligonucleotides designed to include a mixture of different sequences to allow for variation at particular nucleotide positions in a target sequence.
Dendrogram  Any branching, treelike diagram.
DEP  Diethyl pyrocarbonate.
DGGE  Denaturing gradient gel electrophoresis. A specialized form of electrophoresis using polyacrylamide gels with gradients of denaturants, used to detect differences in the stability of PCR products. The double-stranded product moves through the gel until a point in the denaturing gradient at which it becomes single-stranded. DNA segments differing by a single nucleotide often can be distinguished using this technique.
Dichotomy  See bifurcation.
Diplotene bivalents  Pairs of homologous chromosomes associated via chiasmata during the diplotene substage (when chiasmata are first formed) of prophase I of meiosis.
Direct fluorescence method  A method for non-radioactive immunochemical visualization of hybridized probes. See indirect fluorescence method and Chapter 5.
Disequilibrium coefficient  A term that describes the difference between the joint frequency of two or more alleles and the product of the frequencies of the separate alleles.
Dissimilarity  A generic measure of the difference between two objects, usually measured on a scale of 0 to 1.
Distance  A measure of the difference between two objects, usually measured on a scale of 0 to infinity.
Distance estimates  A phrase used to emphasize the potentially imperfect reflection of evolutionary history in distance values inferred from experimental or sequence data.
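The definition of the disequilibrium coefficient given on this page can be made concrete for the
simplest case of two loci with two alleles each, where D is the joint frequency of alleles A and
B minus the product of their separate frequencies. The sketch below is illustrative only: the
function name, the dictionary layout, and the haplotype counts are all hypothetical.

```python
def disequilibrium_coefficient(haplotype_counts, allele_1="A", allele_2="B"):
    """Return D = P(A,B) - P(A) * P(B) from counts of two-locus haplotypes.

    `haplotype_counts` maps (allele at locus 1, allele at locus 2) to a count.
    """
    total = sum(haplotype_counts.values())
    # Joint frequency of the two alleles of interest.
    p_joint = haplotype_counts.get((allele_1, allele_2), 0) / total
    # Marginal frequency of each allele at its own locus.
    p_1 = sum(n for (a, _), n in haplotype_counts.items() if a == allele_1) / total
    p_2 = sum(n for (_, b), n in haplotype_counts.items() if b == allele_2) / total
    return p_joint - p_1 * p_2


# Invented counts: alleles A and B co-occur more often than expected by chance.
associated = {("A", "B"): 40, ("A", "b"): 10, ("a", "B"): 10, ("a", "b"): 40}
# Invented counts: no association between the loci.
independent = {("A", "B"): 25, ("A", "b"): 25, ("a", "B"): 25, ("a", "b"): 25}

d_associated = disequilibrium_coefficient(associated)
d_independent = disequilibrium_coefficient(independent)
print(d_associated, d_independent)
```

In the first data set both alleles have frequency 0.5 but occur together with frequency 0.4, so D
is positive; in the second the joint frequency equals the product of the separate frequencies and
D is zero.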
DMSO  Dimethylsulfoxide.
DNA fingerprinting  In the broad sense, any fine-scale DNA analysis that allows identification of samples at the level of the individual. The term has been specifically applied to analyses such as VNTRs, RAPDs, microsatellites, and minisatellites.
DNA polymerase  An enzyme that catalyzes synthesis of DNA under direction of single-stranded DNA template.
dNTP  Deoxyribonucleotide.
Downstream  3' of the target sequence.
Driver  Unlabeled, fractionated single-copy DNA used in DNA-DNA hybridization experiments. See tracer.
dsDNA  Double-stranded DNA.

E  In protein sequences: glutamic acid.
EB  Ethidium bromide.
EDTA  Ethylene diamine tetra-acetate.
Efficiency  In the context of statistical inference: A measure of how quickly a particular method converges on the correct solution as more data are applied to the problem.
Electrodecantation  The settling of proteins of high molecular weight toward the bottom of a horizontal gel during electrophoresis.
Electroendosmosis  Movement of ionized buffer solution through a gel caused by gel charge groups.
Electromorph  An electrophoretically indistinguishable class of isozymes. Electromorphs represent alleles if all differences between variants result in changes in electrophoretic migration rate.
Electrophoresis  The separation of molecules in an electric field.
EPIC  Exon priming, intron crossing. Refers to primers, designed to amplify intron regions, that are based on conserved exon sequences flanking the target introns.
Epigenetic  All processes relating to the expression and interaction of genes.
Evolutionarily significant unit  Historical groups of populations recognized for the purpose of prioritizing conservation actions and determining long-term strategies. Equivalent to species under most lineage concepts of species. See species.
Evolutionary distance  An idealized measure of the evolutionary separation of sequences or taxa, such as the total number of substitutional events. They are defined so that the values are additive and hence will precisely fit an additive evolutionary tree.
Exon  A segment of an interrupted gene that is represented in the mature mRNA.

F  In protein sequences: phenylalanine.
FACS  Fluorescence activated cell sorting. A technique that allows large scale isolation of particular chromosomes of a given karyotype (see Chapter 5).
FCS  Fetal calf serum.
Felsenstein zone  A region (in parameter space) of inconsistency for a given phylogenetic method under a given evolutionary model.
FIS  See inbreeding coefficient.
FIT  See inbreeding coefficient.
FISH  Fluorescence in situ hybridization. Refers to a number of methods for non-isotopic in situ hybridization (i.e., methods using non-radioactively labeled probes). See ISH and Chapter 5.
FITC  Fluorescein isothiocyanate.
Fixation index  See inbreeding coefficient.
Fourfold degenerate codons  Codons for which the third base position can be occupied by any of the four nucleotides without altering the encoded amino acid.
FST  See coancestry coefficient.
F statistics  A set of coefficients that describes how genetic variation is partitioned within and among populations and individuals. See coancestry coefficient, inbreeding coefficient, and Chapter 10.

G  1. In DNA or RNA sequences: guanine. 2. In protein sequences: glycine.
Gaps  Editing symbols that are inserted into sequences in the process of alignment in order to compensate for presumptive insertion and deletion events.
G-bands  Dark bands on chromosomes produced by Giemsa staining. G-bands occur primarily in AT-rich regions.
Gene conversion  A genetic process by which one sequence replaces another at an orthologous or paralogous locus. May result from mismatch repair in heteroduplexes.
Gene tree  A branching diagram that depicts the known or (usually) inferred relationships among an historically related group of genes or other nucleotide or amino acid sequences.
Genomic library  A mixture of cloned DNA fragments (usually in viral or cosmid vectors) that together represent virtually all of an organism's DNA. Partial or subgenomic libraries contain only restriction fragments of a certain size range.
GTE  Glucose-Tris-EDTA buffer (see Chapter 9, Appendix).
Glossary and Abbreviations 553
H 1. In DNA or RNA sequences: any nucleotide except guanine. 2. In protein sequences: histidine.
H Symbol for average heterozygosity.
HAP Hydroxyapatite. HAP is used in columns to separate single-stranded DNA from double-stranded DNA (see Chapter 6).
Hardy-Weinberg equilibrium An equilibrium of genotypes achieved in populations of infinite size (in which there is no immigration, emigration, selection, or mutation) after one generation of panmictic mating. With two alleles A and B of frequency p and q, respectively, the Hardy-Weinberg equilibrium frequencies of the genotypes AA, AB, and BB are p², 2pq, and q², respectively.
Heterochromatin Chromosomal segments or whole chromosomes that generally exhibit a condensed state throughout interphase and late replication. See constitutive heterochromatin.
Heteroduplex A hybrid DNA-DNA molecule formed between (presumably homologous) sequences.
Heterologous Homologous molecule from a species other than that which is being examined.
Heteroplasmy The containment by one cell or individual of more than one type of a particular organellar DNA (e.g., mtDNA or cpDNA).
Heteropolymer A multimeric protein formed from products of multiple alleles.
Heterospecific Adjective used to describe a probe derived from a species other than the species under study, or to refer to any reaction between homologous molecules from different species.
Heuristic method Any analysis procedure that does not guarantee finding the optimal solution to a problem (usually used to obtain a large increase in speed over exact methods).
HKY model The DNA substitution model of M. Hasegawa, H. Kishino, and T. Yano (1985; see Chapter 11).
Homeologs Chromosomes that are homologous among species.
Homoduplex A hybrid DNA-DNA molecule formed between sequences from the same individual (or sometimes, species).
Homogeneous Markov process A process that follows a Markov model (qv) and does not vary through time (e.g., in different parts of a phylogenetic tree).
Homology Common ancestry of two or more genes or gene products.
Homomeric isozyme An enzyme composed of multiple identical polypeptide chains.
Homoplasy A collection of phenomena that leads to similarities in character states for reasons other than inheritance from a common ancestor. These include convergence, parallelism, and reversal.
Homosequentiality hypothesis One of two hypotheses concerning the mode of evolutionary change in the molecular structure of chromosomes. This hypothesis holds that differences in the apparent chromosomal locations of various sequences reflect localized amplification or diminution of sequences with fairly stable chromosomal locations and predicts rapid and reversible changes in the cytologically visible chromosomal location of certain kinds of sequences. See chromosome repatterning hypothesis and Chapter 5.
Homospecific Adjective used to describe a probe derived from the same species that is under study, or to refer to any reaction between homologous molecules from the same individual or species.
HWE Hardy-Weinberg equilibrium.
Hybrizymes Alleles found in hybrid zones which are rare or absent in populations of the parental (non-hybrid) species.
Hybridoma Cell line formed by fusing a B lymphocyte with a myeloma (a tumor cell line derived from a lymphocyte). Used in the production of monoclonal antibodies.

I In protein sequences: isoleucine.
IgG A type of antibody commonly used for immunochemical staining.
Immunoglobulin (Ig) An antibody composed of two identical light chains and two identical heavy chains of amino acids.
Inbreeding coefficient The correlation of genes within individuals (symbolized by F or FIT; this is the overall inbreeding coefficient), or the correlation of genes within individuals within populations (symbolized by f or FIS; this is the within-population inbreeding coefficient; see Chapter 10). FIS is also known as the fixation index. Both FIS and FIT are measures of deviation from Hardy-Weinberg proportions; positive values indicate a deficiency of heterozygotes whereas negative values indicate an excess of heterozygotes.
Indel Insertion/deletion event.
Indirect fluorescence method A method for nonradioactive immunochemical visualization of hybridized probes (see Chapter 5).
Ingroup An assumed monophyletic group, usually comprising the taxa of primary interest.
In situ hybridization The annealing of a mobile, labeled nucleic acid probe to a stationary nucleic acid target (often whole chromosomes) to form base-paired duplexes.
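The Hardy-Weinberg equilibrium and inbreeding coefficient entries can be illustrated with a short calculation. The sketch below is only an illustration for a single locus with two alleles: the function names are hypothetical, and the estimator F = 1 - H_obs / H_exp is the simplest textbook form, not necessarily the estimator used in Chapter 10.

```python
def hardy_weinberg(p):
    """Expected frequencies of genotypes AA, AB, BB when allele A has frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def inbreeding_coefficient(n_aa, n_ab, n_bb):
    """Simple estimate F = 1 - H_obs / H_exp from genotype counts at one locus.

    Positive values indicate a deficiency of heterozygotes relative to
    Hardy-Weinberg proportions; negative values indicate an excess.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no individuals sampled")
    p = (2 * n_aa + n_ab) / (2 * n)       # frequency of allele A
    h_exp = hardy_weinberg(p)[1]          # expected heterozygosity, 2pq
    if h_exp == 0.0:
        raise ValueError("locus is monomorphic; F is undefined")
    h_obs = n_ab / n                      # observed heterozygosity
    return 1.0 - h_obs / h_exp
```

For example, `inbreeding_coefficient(36, 48, 16)` is close to zero because those counts match Hardy-Weinberg proportions for p = 0.6, whereas counts with fewer heterozygotes at the same allele frequency give a positive value.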
Interior branches Branches in a phylogenetic tree that do not connect to a tip of the tree.
Interior nodes The branch points in a phylogenetic tree. If the tree is rooted, the root node is also an interior node.
Intron Non-coding region of an interrupted gene that is transcribed into RNA but is excised during processing of the primary transcript into a mature mRNA.
Isoelectric point The pH at which the positive and negative charges of a protein are equal.
Isoloci Two or more loci of a multilocus enzyme system that produce products of the same electrophoretic mobility.
Isology Sequence similarity of aligned nucleic acids or polypeptides; the similarity may be due to homology or convergence.
Isomers Molecules with the same chemical formula but different molecular structures.
Isoschizomer Restriction endonuclease with the same recognition sequence as another restriction endonuclease.
Isozyme Enzyme with the same chemical function as another enzyme, but differing in primary, secondary, tertiary, and/or quaternary structure.

Jackknifing A statistical method of numerical resampling based on deleting a portion of the original observations in subsequent samples (see Chapters 10 and 11).
Jukes-Cantor model The DNA substitution model of T. H. Jukes and C. R. Cantor (1969) that assumes all possible nucleotide substitutions are equally likely (see Chapter 11). The Jukes-Cantor model is a special case of the Kimura model (qv).

K 1. In DNA or RNA sequences: guanine or thymine (uracil in RNA). 2. In protein sequences: lysine.
KAc Potassium acetate.
Karyotype A pictorial or diagrammatic representation of the metaphase chromosomes of the complement of an individual or a species.
kb Kilobase pairs, or 1000 base pairs of DNA.
Kimura model The DNA substitution model of M. Kimura (1980) that assumes all transitions are equally likely and all transversions are equally likely (see Chapter 11).

L In protein sequences: leucine.
Lambda (λ) bacteriophage A virus of the bacterium E. coli that is widely used as a cloning vector in molecular biology. It is a double-stranded DNA virus approximately 50 kb in length, with single-stranded complementary ends that allow the virus to circularize after entering its bacterial host. The DNA is packaged into a protein coat in the mature virus.
Lampbrush chromosome A bivalent at diplotene stage in a female meiotic cell; found in the oocytes of most animals.
L-broth Luria broth (see Chapter 9, Appendix).
LINE Acronym for long interspersed element. An interspersed repetitive DNA sequence usually >5,000 bp.
Linearly ordered character A multistate character in which the allowed transitions between states form a linear chain.
Linkage disequilibrium Departure from the predicted frequencies of multiple locus gamete types assuming alleles are randomly associated.
Lyophilization Drying from the frozen state.
Lysogenic cycle A cycle of phage growth in which the phage become a stable prophage component of the bacterial genome.
Lytic cycle A cycle of phage growth in which the phage are replicated many times, resulting in eventual destruction of the host bacterial cell and release of the progeny phage.

M 1. In DNA or RNA sequences: adenine or cytosine. 2. In protein sequences: methionine.
M13 A filamentous bacteriophage of the bacterium E. coli that is widely used for cloning and sequencing. The genome of M13 is circular and approximately 6,500 bp in length. M13 occurs in both a double-stranded replicative form (used for cloning small fragments) and a single-stranded form (used for Sanger dideoxy-sequencing); see Chapter 9.
MAB Monoclonal antibody.
Management unit Demographically independent sets of populations recognized for the purpose of management of exploited or endangered species, e.g., for population monitoring and manipulation. Broadly equivalent to "stocks" and recognized as sets of populations showing significant divergence in allele frequencies from other conspecific sets.
Markov model A model in which the probability of a change from one state to another does not depend on the previous history of the state.
Maximum likelihood A criterion for estimating a parameter from observed data under an explicit model. In phylogenetic analysis, the optimal tree under the maximum likelihood criterion is the tree that is the most likely to have occurred given the observed data and the assumed model of evolution.
Maximum parsimony A criterion for estimating a parameter from observed data based on the principle of minimizing the number of events needed to explain the data. In phylogenetic analysis, the optimal tree under the maximum parsimony criterion is the tree that requires the fewest number of character-state changes (which may be differentially weighted across characters and/or character states). Often simply called parsimony.
MEM Eagle's minimum essential medium.
Methylation The chemical process of adding a methyl group to a molecule.
Microsatellites A subset of VNTRs characterized by very short (2-5 bp) tandem repeats with a high rate of variation in copy number among individuals. These loci tend to be randomly distributed throughout the genome and are subject to replication slippage that leads to length variation (see Chapter 8).
Minimum evolution 1. Originally, a name applied to a phylogenetic optimality criterion developed by Cavalli-Sforza and Edwards (1967). 2. The name applied by Rzhetsky and Nei (1992a) to a phylogenetic optimality criterion that was originally described by Kidd and Sgaramella-Zonta (1971). The optimal tree under this criterion is the tree with the smallest sum of branch lengths as estimated under the least-squares criterion, with negative branch lengths disallowed.
Minisatellites A subset of VNTRs characterized by tandem repeat units of approximately 20 bp. These loci tend to be concentrated close to telomeres and vary in length and sequence because of intramolecular or interallelic recombination and gene conversion (see Chapter 8).
Mitogen A substance that stimulates mitosis.
Molecular clock hypothesis The hypothesis that molecules evolve in direct proportion to time, so that differences between homologous DNA sequences or proteins can be used to estimate the time elapsed since the two molecules (or the taxa that contain them) last shared a common ancestor.
Molecular systematics The detection, description, and explanation of molecular biological diversity, both within and among species.
Monoclonal antibody A single antibody produced in quantity by cultured lines of hybridoma cells.
Monomeric protein A protein that contains a single polypeptide chain.
Monophyletic A group of taxa that contains an ancestor and all of its descendants.
Most-parsimonious reconstruction (MPR) Any assignment of ancestral states to characters on a tree so that the change of each character is minimized (subject to any constraints being enforced).
MPR Most-parsimonious reconstruction.
mRNA Messenger RNA.
mtDNA Mitochondrial DNA.
MTT 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide.
Multifurcation A node in a tree that connects more than three branches. If the tree is directed (rooted), then one of the branches represents an ancestral lineage and the remaining branches represent descendent lineages. A multifurcation may represent a lack of resolution because of too few data available for inferring the phylogeny (in which case it is said to be a soft multifurcation) or it may represent the hypothesized simultaneous splitting of several lineages (in which case it is said to be a hard multifurcation). Synonym: polytomy.
Multimeric protein A protein that contains multiple polypeptide chains.
Multiplexing Any process that conducts repetitive tasks simultaneously on many objects. In the context of DNA sequencing, multiplexing refers to combined sequencing of numerous clones on a single gel, each of which is incorporated into a distinct vector. Multiplexing is also used to refer to methods of analysis such as simultaneous amplification of several microsatellite loci via PCR.
MULTIPRINS Multiple primed in situ hybridization. Modification of the PRINS technique (qv) for use with multiple probes detected with different fluorochromes (see Chapter 5).

N 1. In DNA or RNA sequences: an unknown nucleotide. 2. In protein sequences: asparagine.
NaAc Sodium acetate.
NAD β-nicotinamide adenine dinucleotide.
NADH β-nicotinamide adenine dinucleotide, reduced form.
NADP β-nicotinamide adenine dinucleotide phosphate.
NBT Nitro blue tetrazolium.
nDNA Nuclear DNA.
Ne Effective population size.
Neighbor joining A heuristic method for obtaining a point estimate of a minimum evolution tree (see Chapter 11).
Network A graph that depicts relationships among entities and contains cycles (reticulations).
Nonidet Non-ionic detergent (e.g., NP-40).
Nonparametric bootstrapping A statistical method based on repeated random sampling with replacement from an original sample to provide a collection of new pseudoreplicate samples, from which sampling variance can be estimated (see Chapters 10 and 11).
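The resampling step described under nonparametric bootstrapping can be sketched in a few lines. This is a minimal illustration that assumes the "original sample" is the set of columns (sites) of a sequence alignment, as is common in phylogenetic applications; the function name and the list-of-strings alignment format are hypothetical choices made for the example.

```python
import random


def bootstrap_columns(alignment, n_replicates, seed=None):
    """Build pseudoreplicate alignments by sampling columns with replacement.

    alignment: list of equal-length strings, one aligned sequence per taxon.
    Returns a list of n_replicates alignments with the same dimensions.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        # Draw n_sites column indices with replacement from the original sites.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        replicates.append(["".join(seq[c] for c in cols) for seq in alignment])
    return replicates
```

Each pseudoreplicate would then be analyzed in the same way as the original data, and the variation among the resulting estimates gives a measure of sampling variance.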
Non-synonymous substitution A nucleotide substitution that results in an amino acid replacement.
NOR Nucleolar organizer region.
NP-40 Nonidet P-40, a nonionic detergent.
NPH Normalized percentage of hybridization (see Chapter 6). Defined as the extent of hybridization in a heteroduplex comparison divided by that for the homoduplex control, expressed as a percentage.
Nuclear genome The portion of the genome contained in the nucleus of eukaryotes, i.e., the chromosomes.
Nucleolar organizer region A region on a chromosome that contains the ribosomal RNA genes and associated spacers.
Null allele An allele that produces either no protein product or a non-functional protein product (under the conditions analyzed).
Neutrality The state of being free from the effects of selection.

Objective function A function that defines how well data fit a particular hypothesis (as, for instance, a particular phylogenetic tree).
OD Optical density, as measured in a spectrophotometer. Used to estimate concentration and purity of DNA solutions (see Chapters 7-9).
Oligonucleotide A short chain of nucleotides, often produced in the laboratory.
Optimality criterion Same as objective function.
Ordered character A multistate character for which the changes between states are constrained; not all states can be reached directly from any other.
Organellar genome The DNA contained in cytoplasmic organelles (i.e., mtDNA and cpDNA).
Orthology Homology that arises via speciation.
OTU Operational taxonomic unit. Synonymous with terminal taxon in this book.
Outgroup One or more taxa assumed to be phylogenetically outside the ingroup.
Outgroup comparison A method that can be used for assigning the direction of change to character-state transformations and for determining the root of a phylogenetic tree (see Chapter 11).

P In protein sequences: proline.
Pachytene A substage of prophase of meiosis I in which the homologous chromosomes are paired from end to end.
PAGE Polyacrylamide gel electrophoresis.
PAP Peroxidase-antiperoxidase complex. Used in a method for visualizing non-radioactively labeled hybridized probes (see Chapter 5).
Paracentric inversion An inversion of a region of a chromosome that does not include the centromere.
Paralogy Homology that arises via gene duplication.
Parametric bootstrapping A method for producing independent pseudoreplicates of a data set by estimating parameters from the observed data, using the estimates to produce a model, and using the model to simulate replicate data sets. See Chapter 12.
Parapatric Adjacent but non-overlapping distributions. See allopatric, sympatric.
Parsimony See maximum parsimony.
Partially ordered character A multistate character that is ordered, but for which the permitted state transitions do not form a linear series.
PB Phosphate buffer, used in DNA-DNA hybridization experiments (see Chapter 6, Appendix).
PBS Phosphate-buffered saline (see Chapter 6, Appendix).
PCI Mixture of phenol, chloroform, and isoamyl alcohol, used in DNA extraction protocols (see Chapter 9, Appendix).
PCR Polymerase chain reaction.
PDB Phage dilution buffer (see Chapter 9, Appendix).
PEG Polyethylene glycol.
Perfect primer An oligonucleotide designed to be exactly complementary to a target DNA sequence.
Pericentric inversion An inversion of a region of a chromosome that includes the centromere.
Peripheral branches The branches on a phylogenetic tree that connect to a terminal taxon or sequence.
PERT Phenol emulsion reassociation technique (see Chapter 6).
PHA Phytohaemagglutinin, a mitogen.
Phenogram A branching diagram that links entities by estimates of overall similarity. Usually constructed using UPGMA cluster analysis.
Phylogeny The historical relationships among lineages of organisms or their parts (e.g., genes).
Phylogeography The study of biogeography as revealed by a comparison of estimated phylogenies of populations or species with their geographic distributions.
PI Propidium iodide.
Phylogram A tree that depicts inferred historical relationships among entities. Differs from a cladogram in that the branches are drawn proportional to the amount of inferred character change.
Plaque A clear spot on a bacterial lawn (in a petri plate) that results from lysis of the resident bacteria by bacteriophage.
Plasmid A self-replicating extrachromosomal circular DNA.
Plerology Homology of repeated sequences that are subject to concerted evolution.
Plesiomorphy An ancestral character state.
PMS Phenazine methosulfate.
PNK Polynucleotide kinase. Used in end-labeling of primers.
Polymerase chain reaction A process for amplifying a target DNA sequence manyfold, in which a series of thermal cycles each result in denaturation of a double-stranded target, annealing of oligonucleotide primers to the resulting single strands, and primer extension catalyzed by a thermostable DNA polymerase.
Polytene chromosome A somatic chromosome that has undergone many rounds of endoreplication such that each chromosomal element consists of hundreds to thousands of unseparated chromatids.
Polytomy See multifurcation.
Positional homology The relationship among the columns of nucleotides or amino acids in correctly aligned DNA or protein sequences. The nucleotides or amino acids in a single column of the alignment are inferred to have been derived from a single ancestral nucleotide or amino acid, with or without intervening substitutions or replacements.
Posttranslational modification Any process that modifies a polypeptide after its translation from RNA.
PPS A solution of phenoxyethanol-phosphate-sucrose, used to preserve proteins.
Primary antibody An antibody produced directly in response to a particular antigen.
Primers Oligonucleotides used to initiate synthesis of DNA by a DNA polymerase or reverse transcriptase. A primer anneals to a complementary sequence in a single-stranded DNA or RNA template, and the polymerase then extends the complementary sequence from the primer.
PRINS Primed in situ hybridization. A method for fluorescence in situ hybridization (see FISH) which utilizes unlabeled oligonucleotide probes to complementary sequences on fixed chromosomes, with subsequent extension by DNA polymerase and labeled nucleotides (see Chapter 5).
Pseudogene A usually non-functional copy of a protein-coding gene inserted at another location in the genome. Most pseudogenes result from retroposition of processed mRNAs, and therefore typically lack introns and the regulatory sequences necessary for expression.
PWM Pokeweed mitogen.

Q In protein sequences: glutamine.
Q-bands Fluorescent (under UV light) bands on chromosomes produced by quinacrine staining. Q-bands are brightest in AT-rich regions.

R 1. In DNA or RNA sequences: adenine or guanine. 2. In protein sequences: arginine.
Random error Deviation between a parameter of a population and an estimate of that parameter, due strictly to a limited sample size used to make the estimate. By definition, random error disappears in infinite samples.
RAPDs Random amplified polymorphic DNAs (see Chapter 8).
R-bands Bands on chromosomes that exhibit the reverse pattern of Q- or G-bands.
rDNA Ribosomal DNA, which contains the genes for ribosomal RNA and the associated spacer regions.
RE Restriction enzyme.
Reciprocity The degree to which reciprocal measures of divergence (e.g., A to B versus B to A) agree.
Recombination Exchange of gene segments between non-sister chromatids through the physical process of exchange of (usually) homologous strands of DNA.
Restriction endonuclease An enzyme that cleaves double-stranded DNA. Type I restriction endonucleases are not sequence-specific; Type II restriction endonucleases cleave DNA at particular recognition sequences (typically 4-6 bp palindromes).
Restriction fragment length polymorphism (RFLP) A polymorphism in an individual, population, or species defined by restriction fragments of a distinctive length. Usually caused by gain or loss of a restriction site, but may result from an insertion or deletion of a fragment of DNA between two conserved restriction sites.
Retroposition Reverse transcription of RNA to DNA with subsequent integration of the DNA at a new genomic site.
Retroposon A transposable retroelement that neither constructs virion particles nor is flanked by terminally redundant sequences.
Reverse transcriptase An enzyme that transcribes RNA into DNA.
RFLP See restriction fragment length polymorphism.
Robertsonian translocation Fission or fusion of chromosomes at their centromeres.
rRNA Ribosomal RNA, the nucleic acid component of ribosomes, which functions in translation of proteins from mRNA.
RT Reverse transcriptase.

S 1. In DNA or RNA sequences: guanine or cytosine. 2. In protein sequences: serine.
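The restriction endonuclease entry describes Type II recognition sequences as short palindromes, meaning that the site reads the same on both strands in the 5' to 3' direction. The sketch below illustrates that idea; the function names are hypothetical, and GAATTC (the EcoRI recognition sequence) is used only as a familiar example, not as something taken from this glossary.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(site):
    """True if the site equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of site in sequence."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)  # allow overlapping matches
    return positions
```

Gain or loss of one of the positions reported by `find_sites` between two individuals is the kind of difference that produces a restriction fragment length polymorphism.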
S1 nuclease An enzyme that digests single-stranded DNA.
Satellite DNA Highly repeated DNA sequences that band apart from most nuclear DNA in CsCl ultracentrifugation.
SC Synaptonemal complex.
scnDNA Single-copy nuclear DNA.
SCP Saline citrate-phosphate (see Chapter 5, Appendix).
SDS Sodium dodecyl sulfate (= sodium lauryl sulfate).
Secondary isozyme A conformational isozyme.
SEDTA Saline EDTA (see Chapter 6, Appendix).
Sequential electrophoresis The use of a series of different electrophoretic conditions to uncover hidden heterogeneity in isozyme electrophoresis.
SGE Starch gel electrophoresis.
Similarity A generic measure of the resemblance between two objects, usually on a scale from 1 to 0.
SINE Acronym for short interspersed element. An interspersed repetitive DNA sequence of <500 bp.
Single-strand conformational polymorphism A polymorphism detected by differential migration of DNA fragments in a gel matrix caused by conformational changes of single-stranded DNA resulting from point substitutions, insertions, and deletions (see Chapters 8 and 9).
Southern blot A membrane onto which DNA has been transferred directly from an electrophoretic gel. Named after the blotting technique devised by E. M. Southern (1975).
Species A cohesive historical lineage of ancestral-descendent populations of organisms that maintains its identity from other such lineages. A species comes into being at a branching event (one lineage becomes two or more lineages) and ceases to exist either at a branching event (when it gives rise to new species) or when the lineage is terminated through extinction.
Specific reactivity See specificity.
Specificity The degree to which antibodies react with multiple antigenic sites. Initially antibodies are monospecific, but with longer periods of immunization they become more cross reactive (react with more antigenic sites).
SSC Saline sodium citrate (see Chapter 5, Appendix).
SSCP Single-strand conformational polymorphism.
ssDNA Single-stranded DNA.
Star tree A tree that contains a single internal node.
State set A mathematical set of character states, as used during a parsimony analysis to keep track of the states that are consistent with the potential most-parsimonious reconstructions.
STE Sodium chloride-Tris-EDTA buffer (see Chapter 9, Appendix).
STES Sodium chloride-Tris-EDTA-sucrose buffer (see Chapter 8, Appendix).
Streptavidin Protein made by the bacterium Streptomyces avidinii. Streptavidin binds biotin and is often used in place of avidin in histochemical staining procedures. See avidin-biotin.
Stringency In DNA-DNA or DNA-RNA hybridization, the conditions of the hybridization (such as temperature and concentration of chemical additives) that determine the degree of similarity that will result in formation of hybrid molecules.
Subbands Non-allelic bands in isozyme electrophoresis that represent the electrophoretic location of conformational isozymes.
Superimposed changes Changes at a particular site along a lineage of the phylogeny that mask earlier changes at that site, as well as parallel or convergent changes that occur at the same site in different lineages.
Sympatric Occurring in the same place. See allopatric, parapatric.
Symplesiomorphy A shared ancestral character state.
Synapomorphy A shared derived character state that is indicative of a phylogenetic relationship among two or more OTUs.
Synaptonemal complex A set of proteinaceous parallel strands that occur coaxial to paired chromosomes during prophase I of meiosis, which function to hold the paired chromosomes together and facilitate recombination.
Synonymous substitution A nucleotide substitution that does not result in an amino acid replacement.
Synteny Genetic linkage of loci to the same chromosome.
Systematic error Deviation between a parameter of a population and an estimate of that parameter, due to incorrect assumptions in the estimation method. Systematic error persists (and may intensify) as sample sizes increase and become infinite.

T 1. In DNA sequences: thymine. 2. In protein sequences: threonine.
TAE Tris-acetic acid-EDTA buffer (see Chapter 8, Appendix).
Taq polymerase A thermostable DNA polymerase from Thermus aquaticus, a thermophilic bacterium. Used for amplification of DNA via the polymerase chain reaction.
TCA Trichloroacetic acid.
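The state set entry can be made concrete with a small example. The sketch below uses one standard procedure for unordered characters, the first (tip-to-root) pass of Fitch's algorithm, on a rooted binary tree. The algorithm is named here as an illustration and is not attributed to this glossary; the nested-tuple tree format and the function name are hypothetical.

```python
def fitch_state_sets(tree, states):
    """Return (state_set, n_changes) for the node at the root of `tree`.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping each leaf name to its observed character state.
    The character is treated as unordered with equal costs for all changes.
    """
    if isinstance(tree, str):
        # A terminal node's state set holds only its observed state.
        return {states[tree]}, 0

    left_set, left_changes = fitch_state_sets(tree[0], states)
    right_set, right_changes = fitch_state_sets(tree[1], states)

    shared = left_set & right_set
    if shared:
        # Descendants agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # No state in common: take the union and count one additional change.
    return left_set | right_set, left_changes + right_changes + 1
```

With leaf states a = A, b = A, c = G, d = G, the tree `(("a", "b"), ("c", "d"))` requires one change, while `(("a", "c"), ("b", "d"))` requires two, so the first tree is preferred under the parsimony criterion for this character.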
TE Tris-EDTA buffer (see Chapter 9, Appendix).
TEACL Tetraethylammonium chloride, a chaotropic agent used to reduce the effects of differential base composition on hybrid melting temperature in DNA-DNA hybridization (see Chapter 7).
TEMED N,N,N',N'-Tetramethylethylenediamine.
Terminal nodes Tips of a phylogenetic tree at which OTUs (terminal taxa) are placed.
Thermal cycler Machine used to produce the controlled temperature cycles required for PCR.
Time-reversible model A model in which the probability of change from state A to state B is the same as the probability of change from B to A. Thus, in the context of phylogenetic analysis, evaluation of a given tree under a time-reversible model is independent of the root of the tree.
Titer The concentration of a substance as determined by the amount of a known reagent required to bring about a given effect in a test solution.
T50H The interpolated temperature along a DNA melting curve at which 50% of the DNA is double-stranded. T50H differs from Tm (below) when all DNA in a DNA-DNA hybridization reaction does not form duplexes. The difference in T50H between homoduplex and heteroduplex curves is ΔT50H.
Tm The interpolated temperature along a DNA melting curve at which 50% of the duplex DNA formed in a DNA-DNA hybridization reaction is double-stranded. The difference in Tm between homoduplex and heteroduplex curves is ΔTm.
Tmode The interpolated temperature of the peak of a differential plot of a DNA melting curve. The difference in Tmode between homoduplex and heteroduplex curves is ΔTmode.
TPBS Tris-phosphate buffered saline.
Tracer Radioactively labeled, fractionated single-copy DNA used in DNA-DNA hybridization experiments. See driver.
Transition A nucleotide substitution from one purine to another purine (e.g., A → G), or from one pyrimidine to another pyrimidine (e.g., T → C).
Transposable element A genomic element that can move from site to site in the genome of an organism, either through direct DNA copying (at least in prokaryotes) or reverse transcription from an RNA intermediate (probably the usual mechanism in eukaryotes).
Transposon A segment of DNA flanked by transposable elements that is capable of moving its location in the genome.
Transversion A nucleotide substitution from a purine to a pyrimidine (e.g., A → C), or vice versa (e.g., T → G).
Tree length The sum of the estimated or actual branch lengths in a tree.
Two-fold degenerate codons Codons for which the third base pair can be occupied by either purine or by either pyrimidine (i.e., it can be degenerate for either T/C or A/G) without altering the encoded amino acid.

U In RNA sequences: uracil.
Ultrametric distances Pairwise distance values that precisely fit a rooted tree with a constant molecular clock. Defined mathematically by satisfying the three-point condition (see Chapter 11).
Unequal crossing over Physical crossover between imperfectly aligned repeats of a multigene family, which results in one smaller and one larger DNA molecule.
Unequal sister chromatid exchange See unequal crossing over.
Unordered character A character for which any state can change directly to any other state.
Universal primer An oligonucleotide designed to be complementary to target sequences that are conserved over a wide range of taxa.
Unrooted tree A phylogenetic tree that is not directed with respect to time.
UPGMA Unweighted pair-group method of arithmetic averages. A cluster analysis technique.
Upstream 5' of the target sequence.

V 1. In DNA or RNA sequences: adenine, cytosine, or guanine (not thymine or uracil). 2. In protein sequences: valine.
Variable number tandem repeat loci Genomic locations that contain variable numbers of short tandemly repeated sequences (see Chapter 8).
VNTR Variable number tandem repeat.

W 1. In DNA sequences: adenine or thymine. 2. In RNA sequences: adenine or uracil. 3. In protein sequences: tryptophan.
WPGMA Weighted pair-group method of arithmetic averages. A cluster analysis technique.

Xenology Homology that arises via lateral gene transfer between unrelated species (e.g., by retroviruses).

Y 1. In DNA or RNA sequences: cytosine or thymine (uracil in RNA). 2. In protein sequences: tyrosine.

Zymogram The pattern on an allozyme electrophoresis gel visualized by histochemical staining.
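The transition and transversion entries, together with the Kimura model entry earlier in the glossary, can be tied together with a short calculation. The sketch below classifies each difference between two aligned DNA sequences and applies the usual two-parameter distance formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitions and transversions. The function names are hypothetical, and sites with gaps or ambiguity codes are simply skipped, which is an assumption made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Classify an aligned pair of bases as identical, transition, or transversion."""
    if a == b:
        return "identical"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def kimura_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    kinds = [classify(a, b) for a, b in pairs]
    p = kinds.count("transition") / len(pairs)     # proportion of transitions
    q = kinds.count("transversion") / len(pairs)   # proportion of transversions
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

The corrected distance is larger than the raw proportion of differing sites because it allows for superimposed changes at the same site.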
Literature Cited
Abele, L. G., W. Kim and B. E. Felgenhauer. 1989. Molecular evidence for the inclusion of the phylum Pentastomida in the Crustacea. Mol. Biol. Evol. 6:685-691.
Aboitiz, F. 1987. Letter to the editor. Cell 51:515-516.
Adachi, J. and M. Hasegawa. 1992. MOLPHY: Programs for molecular phylogenetics I-PROTML: Maximum likelihood inference of protein phylogeny. Computer Science Monographs, No. 27. Institute of Statistical Mathematics, Tokyo.
Adachi, J., Y. Cao and M. Hasegawa. 1993. Tempo and mode of mitochondrial DNA evolution in vertebrates at the amino acid sequence level: rapid evolution in warm-blooded vertebrates. J. Mol. Evol. 36:270-281.
Adams, R. P., T. Demeke and H. H. Abufatih. 1993. RAPD DNA fingerprints and terpenoids: Clues to past migrations of Juniperus in Arabia and East Africa. Theor. Appl. Genet. 87:22-26.
Adelman, R., R. L. Saul and B. N. Ames. 1988. Oxidative damage to DNA: Relation to species metabolic rate and life span. Proc. Natl. Acad. Sci. USA 85:2706-2708.
Adey, N. B., T. O. Tollefsbol, A. B. Sparks, M. H. Edgell and C. A. Hutchison III. 1994. Molecular resurrection of an extinct ancestral promoter for mouse L1. Proc. Natl. Acad. Sci. USA 91:1569-1573.
Adkins, R. M. and R. L. Honeycutt. 1991. Molecular phylogeny of the superorder Archonta. Proc. Natl. Acad. Sci. USA 88:10317-10321.
Aebersold, P. B., G. A. Winans, D. J. Teel, G. B. Milner and F. M. Utter. 1987. Manual for starch gel electrophoresis: A method for the detection of genetic variation. NOAA Tech. Report NMFS No. 61.
Aguadé, M., W. Meyers, A. D. Long and C. H. Langley. 1994. Single-strand conformation polymorphism analysis coupled with stratified DNA sequencing reveals reduced sequence variation in the su(s) and su(wa) regions of the Drosophila melanogaster X chromosome. Proc. Natl. Acad. Sci. USA 91:4658-4662.
Akaike, H. 1974. A new look at the statistical model identification. IEEE Trans. Autom. Contr. AC-19:716-723.
Aldrich, J., B. Cherney, E. Merlin and J. D. Palmer. 1986a. Sequence of the rbcL gene for the large subunit of ribulose bisphosphate carboxylase-oxygenase from petunia. Nucl. Acids Res. 14:9534.
Aldrich, J., B. Cherney, E. Merlin and J. D. Palmer. 1986b. Sequence of the rbcL gene for the large subunit of ribulose bisphosphate carboxylase-oxygenase from alfalfa. Nucl. Acids Res. 14:9535.
Aldrich, P. R. and J. Doebley. 1992. Restriction fragment variation in the nuclear and chloroplast genomes of cultivated and wild Sorghum bicolor. Theor. Appl. Genet. 85:293-302.
Alexander, B. A. 1991. Phylogenetic analysis of the genus Apis (Hymenoptera: Apidae). Ann. Entomol. Soc. Am. 84:137-149.
Allard, M. W., M. M. Miyamoto, L. Jarecki, F. Kraus and M. R. Tennant. 1992. DNA systematics and evolution of the artiodactyl family Bovidae. Proc. Natl. Acad. Sci. USA 89:3972-3976.
Allard, M. W., D. Young and Y. Huyen. 1995. Detecting dinosaur DNA. Science 268:1192.
Allegrucci, G., D. Cesaroni and V. Sbordoni. 1987. Adaptation and speciation of cave crickets (Orthoptera, Rhaphidophoridae): Geographic variation of morphometric indices and allozyme frequencies. Biol. J. Linnean Soc. 31:151-160.
Allendorf, F. W. 1977. Electromorphs or alleles. Genetics 87:821-822.
Allendorf, F. W. and S. R. Phelps. 1981. Use of allelic frequencies to describe population structure. Can. J. Fish. Aquat. Sci. 38:1507-1514.
Allendorf, F. W. and G. H. Thorgaard. 1984. Tetraploidy and the evolution of salmonid fishes, pp. 1-53. In B. J. Turner (ed.), Evolutionary Genetics of Fishes. Plenum, New York.
Allendorf, F. W. and F. M. Utter. 1973. Gene duplication within the family Salmonidae: Disomic inher-
itance of two loci reported to be tetrasomic in rainbow trout. Genetics 74:647-654.
Allendorf, F. W., K. L. Knudsen and R. F. Leary. 1983. Adaptive significance of differences in the tissue-specific expression of a phosphoglucomutase gene in rainbow trout. Proc. Natl. Acad. Sci. USA 80:1397-1400.
Allendorf, F. W., G. Stahl and N. Ryman. 1984. Silencing of duplicate genes: A null polymorphism for lactate dehydrogenase in rainbow trout. Mol. Biol. Evol. 1:238-248.
Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Ammerman, L. K. and D. M. Hillis. 1992. A molecular test of bat relationships: Monophyly or diphyly? Syst. Biol. 41:222-232.
Amos, B., C. Schlotterer and D. Tautz. 1993. Social structure of pilot whales revealed by analytical DNA profiling. Science 260:670-672.
Anderson, D. M. and W. R. Folk. 1976. Iodination of DNA. Studies of the reaction and iodination of papovavirus DNA. Biochemistry 15:1022-1030.
Anderson, J. O., J. Nath and E. J. Harner. 1978. Effect of freeze-preservation on some pollen enzymes. Cryobiology 15:469-477.
Anderson, P. R. and J. G. Oakeshott. 1984. Parallel geographical patterns of allozyme variation in two sibling Drosophila species. Nature 308:729-731.
Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. DeBruijn, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schreier, A. J. H. Smith, R. Staden and I. G. Young. 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457-465.
Andronico, F., S. De Luccini, F. Graziani, I. Nardi, R. Batistoni and G. Barsacchi-Pilone. 1985. Molecular organization of ribosomal RNA genes clustered at variable chromosomal sites in Triturus vulgaris meridionalis (Amphibia, Urodela). J. Mol. Biol. 186:219-229.
Angerer, R. C., E. H. Davidson and R. J. Britten. 1976. Single copy DNA and structural gene sequence relationships among four sea urchin species. Chromosoma 56:213-226.
Ansorge, W. and S. Labeit. 1984. Field gradients improve resolution on DNA sequencing gels. J. Biochem. Biophys. Meth. 10:237-243.
Appels, R. and J. Dvorak. 1982. The wheat ribosomal DNA spacer region: Its structure and variation in populations and among species. Theor. Appl. Genet. 63:337-348.
Appels, R. and R. L. Honeycutt. 1987. rDNA: Evolution over a billion years, pp. 81-135. In S. K. Dutta (ed.), DNA Systematics. CRC Press, Boca Raton, FL.
Aquadro, C. F. and J. C. Avise. 1982a. An assessment of "hidden" heterogeneity within electromorphs at three loci in deer mice. Genetics 102:269-284.
Aquadro, C. F. and J. C. Avise. 1982b. Evolutionary genetics of birds. VI. A reexamination of protein divergence using varied electrophoretic conditions. Evolution 36:1003-1019.
Aquadro, C. F. and B. D. Greenberg. 1983. Human mitochondrial DNA variation and evolution: Analysis of nucleotide sequences from seven individuals. Genetics 103:287-312.
Aquadro, C. F., S. F. Deese, M. M. Bland, C. H. Langley and C. C. Laurie-Ahlberg. 1986. Molecular population genetics of the alcohol dehydrogenase gene region of Drosophila melanogaster. Genetics 114:1165-1190.
Aradhya, K. M., D. Mueller-Dombois and T. A. Ranker. 1991. Genetic evidence for recent and incipient speciation in the evolution of Hawaiian Metrosideros (Myrtaceae). Heredity 67:129-138.
Archie, J. W. 1989a. A randomization test for phylogenetic information in systematic data. Syst. Zool. 38:239-252.
Archie, J. W. 1989b. Phylogenies of plant families: A demonstration of phylogenetic randomness in DNA sequence data derived from proteins. Evolution 43:1796-1800.
Archie, J. W., C. Simon and A. Martin. 1989. Small sample size does decrease the stability of dendrograms calculated from allozyme-frequency data. Evolution 43:678-683.
Arctander, P. 1988. Comparative studies of avian DNA by restriction fragment length polymorphism analysis: Convenient procedures based on blood samples from live birds. J. Ornithologie 129:205-216.
Arévalo, E., S. K. Davis, G. Casas, G. Lara and J. W. Sites, Jr. 1993. Parapatric hybridization between chromosome races of the Sceloporus grammicus complex (Phrynosomatidae): Structure of the Ajusco transect. Copeia 1993:320-340.
Arévalo, E., S. K. Davis and J. W. Sites, Jr. 1994. Mitochondrial DNA sequence divergence and phylogenetic relationships among eight chromosome races of the Sceloporus grammicus complex (Phrynosomatidae) in central Mexico. Syst. Biol. 43:387-418.
Armour, J. A. L. and A. J. Jeffreys. 1992. Biology and applications of human minisatellite loci. Curr. Opin. Genet. Dev. 2:850-856.
Armour, J. A. L., R. Neumann, S. Gobert and A. J. Jeffreys. 1994. Isolation of human simple repeat
loci by hybridisation selection. Human Mol. Genet. 3:599-605.
Arnason, U. and A. Gullberg. 1994. Relationships of baleen whales established by cytochrome b gene sequence comparison. Nature 367:726-728.
Arnason, U. and B. Widegren. 1984. Different rates of divergence in highly repetitive DNA of cetaceans. Hereditas 101:171-177.
Arnheim, N. 1983. Concerted evolution of multigene families, pp. 38-61. In M. Nei and R. K. Koehn (eds.), Evolution of Genes and Proteins. Sinauer, Sunderland, Massachusetts.
Arnheim, N., E. M. Prager and A. C. Wilson. 1969. Immunological prediction of sequence differences among proteins. Chemical comparisons of chicken, quail, and pheasant lysozymes. J. Biol. Chem. 244:2085-2094.
Arnheim, N., D. Treco, B. Taylor and E. M. Eicher. 1982. Distribution of ribosomal DNA length variants among mouse chromosomes. Proc. Natl. Acad. Sci. USA 79:4677-4680.
Arnold, E. N. 1981. Estimating phylogenies at low taxonomic levels. Z. Zool. Syst. Evolut.-forsch. 19:1-35.
Arnold, M. L. 1992. Natural hybridization as an evolutionary process. Annu. Rev. Ecol. Syst. 23:237-261.
Arnold, M. L., D. D. Shaw and N. Contreras. 1987a. Ribosomal RNA-encoding DNA introgression across a narrow hybrid zone between two subspecies of grasshopper. Proc. Natl. Acad. Sci. USA 84:3946-3950.
Arnold, M. L., P. Wilkinson, D. D. Shaw, A. D. Marchant and N. Contreras. 1987b. Highly repeated DNA and allozyme variation between sibling species: Evidence for introgression. Genome 29:272-279.
Arnold, M. L., J. L. Hamrick and B. D. Bennett. 1990. Allozyme variation in Louisiana irises: A test for introgression and hybrid speciation. Heredity 65:297-306.
Arnold, M. L., C. M. Buckner and J. L. Robinson. 1991. Pollen mediated introgression and hybrid speciation in Louisiana irises. Proc. Natl. Acad. Sci. USA 88:1398-1402.
Arrand, J. E. 1985. Preparation of nucleic acid probes, pp. 17-45. In B. D. Hames and S. J. Higgins (eds.), Nucleic Acid Hybridisation: A Practical Approach. IRL Press, Oxford.
Asher, J. H. 1970. Parthenogenesis and genetic variability. II. One locus model for various diploid populations. Genetics 66:369-391.
Atchley, W. R. and W. M. Fitch. 1991. Gene trees and the origins of inbred strains of mice. Science 254:554-558.
Attardi, G. 1985. Animal mitochondrial DNA: An extreme example of genetic economy. Int. Rev. Cytol. 93:93-145.
Austin, C. C. 1995. A new method of bi-polymerase sequencing prevents "stop-bands." Mol. Biotechnol. 4500-101.
Ausubel, F. M. (ed.). 1989. Current Protocols in Molecular Biology. John Wiley and Sons, New York.
Ausubel, F. M., R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith and K. Struhl. 1992. Short Protocols in Molecular Biology. 2nd ed. John Wiley and Sons, New York.
Avise, J. C. 1974. Systematic value of electrophoretic data. Syst. Zool. 23:465-481.
Avise, J. C. 1976. Genetic differentiation during speciation, pp. 106-122. In F. J. Ayala (ed.), Molecular Evolution. Sinauer, Sunderland, Massachusetts.
Avise, J. C. 1986. Mitochondrial DNA and the evolutionary genetics of higher animals. Phil. Trans. Roy. Soc. London B 312:325-342.
Avise, J. C. 1989. Gene trees and organismal histories: A phylogenetic approach to population biology. Evolution 43:1192-1208.
Avise, J. C. 1994. Molecular Markers, Natural History, and Evolution. Chapman and Hall, New York.
Avise, J. C. and C. F. Aquadro. 1982. A comparative summary of genetic distances in the vertebrates. Evol. Biol. 15:151-158.
Avise, J. C. and G. B. Kitto. 1973. Phosphoglucose isomerase gene duplication in the bony fishes: An evolutionary history. Biochem. Genet. 8:113-132.
Avise, J. C. and R. A. Lansman. 1983. Polymorphism of mitochondrial DNA in populations of higher animals, pp. 165-190. In M. Nei and R. K. Koehn (eds.), Evolution of Genes and Proteins. Sinauer, Sunderland, Massachusetts.
Avise, J. C., J. J. Smith and F. J. Ayala. 1975. Adaptive differentiation with little genic change between two native California minnows. Evolution 29:411-426.
Avise, J. C., C. Giblin-Davidson, J. Laerm, J. C. Patton and R. A. Lansman. 1979. Mitochondrial DNA clones and matriarchal phylogeny within and among geographic populations of the pocket gopher, Geomys pinetis. Proc. Natl. Acad. Sci. USA 76:6694-6698.
Avise, J. C., J. E. Neigel and J. Arnold. 1984. Demographic influences of mitochondrial DNA lineage survivorship in animal populations. J. Mol. Evol. 20:99-105.
Avise, J. C., J. Arnold, R. M. Ball, E. Bermingham, T. Lamb, J. E. Neigel, C. A. Reeb and N. C. Saunders. 1987. Intraspecific phylogeography: The mito-
chondrial bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Avise, J. C., R. M. Ball and J. Arnold. 1988. Current versus historical population sizes in vertebrate species with high gene flow: A comparison based on mitochondrial DNA lineages and inbreeding theory for neutral mutations. Mol. Biol. Evol. 5:331-344.
Avise, J. C., B. W. Bowen and T. Lamb. 1989. DNA fingerprints from hypervariable mitochondrial genotypes. Mol. Biol. Evol. 6:258-269.
Avise, J. C., J. C. Trexler, J. Travis and W. S. Nelson. 1991. Poecilia mexicana is the recent female parent of the unisexual fish P. formosa. Evolution 45:1530-1533.
Avise, J. C., R. T. Alisauskas, W. S. Nelson and C. D. Ankney. 1992a. Matriarchal population genetic structure in an avian species with female natal philopatry. Evolution 46:1084-1096.
Avise, J. C., J. M. Quattro and R. C. Vrijenhoek. 1992b. Molecular clones within organismal clones: Mitochondrial DNA phylogenies and evolutionary history of unisexual vertebrates. Evol. Biol. 26:225-246.
Avise, J. C., B. W. Bowen, T. Lamb, A. B. Meylan and E. Bermingham. 1992c. Mitochondrial DNA evolution at a turtle's pace: Evidence for low genetic variability and reduced microevolutionary rate in the Testudines. Mol. Biol. Evol. 9:457-473.
Ayala, F. J. 1982. Genetic variation in natural populations: Problem of electrophoretically cryptic alleles. Proc. Natl. Acad. Sci. USA 79:550-554.
Ayala, F. J. 1986. On the virtues and pitfalls of the molecular evolutionary clock. J. Hered. 77:226-235.
Ayala, F. J., J. R. Powell, M. L. Tracey, C. A. Mourao and S. Perez-Salas. 1972. Enzyme variability in the Drosophila willistoni group. IV. Genic variation in natural populations of Drosophila willistoni. Genetics 70:113-139.

Baba, M. L., M. Goodman, H. Dene and G. W. Moore. 1975. Origins of the Ceboidea viewed from an immunological perspective. J. Human Evol. 4:89-102.
Bachellerie, J.-P. and L.-H. Qu. 1993. Direct ribosomal RNA sequencing for phylogenetic studies. Meth. Enzymol. 224:349-357.
Bailey, W. J., J. L. Slightom and M. Goodman. 1992. Rejection of the "flying primate hypothesis" by phylogenetic evidence from the ε-globin gene. Science 256:86-89.
Baker, A. J. 1992. Genetic and morphometric divergence in ancestral European and descendent New Zealand populations of chaffinches (Fringilla coelebs). Evolution 46:1784-1800.
Baker, A. J. and A. Moeed. 1987. Rapid genetic differentiation and founder effect in colonizing populations of Common Mynas (Acridotheres tristis). Evolution 41:523-538.
Baker, C. S. and S. R. Palumbi. 1994. Which whales are hunted? A molecular genetic approach to monitoring whaling. Science 265:1538-1539.
Baker, M. C., D. B. Thompson, G. L. Sherman, M. A. Cunningham and D. F. Tomback. 1982. Allozyme frequencies in a linear series of song dialect populations. Evolution 36:1020-1029.
Baker, R. J. and H. A. Wichman. 1990. Retrotransposon Mys is concentrated on the sex chromosomes: Implications for copy number containment. Evolution 44:2083-2088.
Baker, R. J., S. K. Davis, R. D. Bradley, M. J. Hamilton and R. A. Van Den Bussche. 1989. Ribosomal-DNA, mitochondrial-DNA, chromosomal, and allozymic studies on a contact zone in the pocket gopher, Geomys. Evolution 43:63-75.
Baker, R. J., R. L. Honeycutt and R. A. Van Den Bussche. 1991a. Examination of the monophyly of bats: Restriction map of the ribosomal DNA cistron, pp. 42-53. In T. A. Griffiths and D. Klingener (eds.), Contributions to Mammalogy in Honor of Karl F. Koopman. American Museum of Natural History, New York.
Baker, R. J., M. J. Novacek and N. B. Simmons. 1991b. On the monophyly of bats. Syst. Zool. 40:216-231.
Balding, D. J. and R. A. Nichols. 1994. DNA profile match probability calculations: How to allow for population stratification, relatedness, database selection and single bands. Forensic Sci. Int. 64:125-140.
Ball, R. M., Jr., S. Freeman, F. C. James, E. Bermingham and J. C. Avise. 1988. Phylogeographic population structure of red-winged blackbirds assessed by mitochondrial DNA. Proc. Natl. Acad. Sci. USA 85:1558-1562.
Ball, R. M., Jr., J. E. Neigel and J. C. Avise. 1990. Gene genealogies within the organismal pedigree of random-mating populations. Evolution 44:360-370.
Ballard, J. W. O., G. J. Olsen, D. P. Faith, W. A. Odgers, D. M. Rowell and P. W. Atkinson. 1992. Evidence from 12S ribosomal RNA sequences that onychophorans are modified arthropods. Science 258:1345-1348.
Bandelt, H.-J. and A. W. M. Dress. 1992. Split decomposition: A new and useful approach to phylogenetic analysis of distance data. Mol. Phylogenet. Evol. 1:242-252.
Banks, J. A. and C. W. Birky Jr. 1985. Chloroplast DNA diversity is low in a wild plant, Lupinus texensis. Proc. Natl. Acad. Sci. USA 82:6950-6954.
Barendse, W., et al. 1994. A genetic linkage map of the bovine genome. Nature Genetics 6:227-235.
Barker, J. S. F., P. D. East and B. S. Weir. 1986. Temporal and microgeographic variation in allozyme frequencies in a natural population of Drosophila buzzatii. Genetics 112:577-611.
Barker, P. E., J. R. Testa, N. Z. Parsa and R. Snyder. 1986. High molecular weight DNA from fixed cytogenetic preparations. Am. J. Human Genet. 39:661-668.
Barnes, P. T. and C. C. Laurie-Ahlberg. 1986. Genetic variability of flight metabolism in Drosophila melanogaster. III. Effects of GPDH allozymes and environmental temperature on power output. Genetics 112:267-294.
Barnes, W. M. 1987. Sequencing DNA with dideoxyribonucleotides as chain terminators: Hints and strategies for big projects. Meth. Enzymol. 152:538-556.
Barnes, W. M. 1994. PCR amplification of up to 35-kb DNA with high fidelity and high yield from λ bacteriophage templates. Proc. Natl. Acad. Sci. USA 91:2216-2220.
Barnes, W. M., M. Bevan and P. H. Son. 1983. Kilo-sequencing: Creation of an ordered nest of asymmetric deletions across a large target sequence carried on phage M13. Meth. Enzymol. 101:98-122.
Barrodale, I. and F. D. K. Roberts. 1973. An improved algorithm for discrete l1 linear approximation. SIAM J. Numer. Anal. 10:839-848.
Barrett, M., M. J. Donoghue and E. Sober. 1991. Against consensus. Syst. Zool. 40:486-493.
Barrie, P. A., A. J. Jeffreys and A. F. Scott. 1981. Evolution of the β-globin gene cluster in man and the primates. J. Mol. Biol. 149:319-336.
Barrowclough, G. R., N. K. Johnson and R. M. Zink. 1985. On the nature of genic variation in birds, pp. 135-154. In R. F. Johnston (ed.), Current Ornithology. Vol. 2. Plenum, New York.
Barton, N. H. and G. M. Hewitt. 1989. Adaptation, speciation and hybrid zones. Nature 341:497-503.
Barton, N. H., R. R. Halliday and G. M. Hewitt. 1983. Rare electrophoretic variants in a hybrid zone. Heredity 50:139-146.
Barry, D. and J. A. Hartigan. 1987a. Statistical analysis of hominoid molecular evolution. Stat. Sci. 2:191-210.
Barry, D. and J. A. Hartigan. 1987b. Asynchronous distance between homologous DNA sequences. Biometrics 43:261-276.
Bassam, B. J. and G. Caetano-Anollés. 1993. Silver staining of DNA in polyacrylamide gels. Appl. Biochem. Biotechnol. 42:181-188.
Bassam, B. J., G. Caetano-Anollés and P. M. Gresshoff. 1991. Fast and sensitive silver staining of DNA in polyacrylamide gels. Analyt. Biochem. 196:80-83.
Baum, D. 1994. rbcL and seed-plant phylogeny. Trends Ecol. Evol. 9:39-41.
Baum, D. and A. Larson. 1991. Adaptation reviewed. A phylogenetic methodology for studying character macroevolution. Syst. Zool. 40:1-18.
Bautz, E. K. and F. A. Bautz. 1964. The influence of noncomplementary bases on the stability of ordered polynucleotides. Proc. Natl. Acad. Sci. USA 52:1476-1481.
Baverstock, P. R. and M. Adams. 1987. Comparative rates of molecular, chromosomal and morphological evolution in some Australian vertebrates, pp. 175-188. In K. S. W. Campbell and M. F. Day (eds.), Rates of Evolution. Allen and Unwin, London.
Baverstock, P. R., C. H. S. Watts and S. R. Cole. 1977. Electrophoretic comparisons between the allopatric populations of five Australian pseudomyine rodents (Muridae). Australian J. Biol. Sci. 30:471-485.
Baverstock, P. R., S. R. Cole, B. J. Richardson and C. H. Watts. 1979. Electrophoresis and cladistics. Syst. Zool. 28:214-219.
Baverstock, P. R., M. Adams and C. H. S. Watts. 1986. Biochemical differentiation among karyotypic forms of Australian Rattus. Genetica 71:11-22.
Beckman, J. S. and J. L. Weber. 1993. Survey of human and rat microsatellites. Genomics 12:627-631.
Beerli, P., H. Hotz and T. Uzzell. 1996. Geologically dated sea barriers calibrate a protein clock for Aegean water frogs. Evolution (in press).
Begun, D. J. and C. F. Aquadro. 1993. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.
Benjamin, D. C., J. A. Berzofsky, I. J. East, F. R. N. Gurd, C. Hannum, S. J. Leach, E. Margoliash, J. G. Michael, A. Miller, E. M. Prager, M. Reichlin, E. E. Sercarz, S. J. Smith-Gill, P. E. Todd and A. C. Wilson. 1984. The antigenic structure of proteins: A reappraisal. Annu. Rev. Immunol. 2:67-101.
Benn, P. A. and M. A. Perle. 1986. Chromosome staining and banding techniques, pp. 57-84. In D. E. Rooney and B. H. Czepulkowski (eds.), Human Cytogenetics. IRL Press, Oxford.
Bennet, S., L. J. Alexander, R. H. Crozier and A. G. Mackinlay. 1988. Are megabats flying primates? Contrary evidence from a mitochondrial DNA sequence. Aust. J. Biol. Sci. 41:327-332.
Bennett, M. D. 1972. Nuclear DNA content and minimum mitotic time in herbaceous plants. Proc. Roy. Soc. London B 181:109-135.
Bentzen, P., W. C. Leggett and G. G. Brown. 1988. Length and restriction site heteroplasmy in the mitochondrial DNA of American shad (Alosa sapidissima). Genetics 118:509-518.
Bentzen, P., A. S. Harris and J. M. Wright. 1992. Cloning of hypervariable minisatellite and simple sequence microsatellite repeats for DNA fingerprinting of important aquacultural species of salmonids and tilapia, pp. 242-262. In T. Burke, G. Dolf, A. J. Jeffreys and R. Wolff (eds.), DNA Fingerprinting: Approaches and Applications. Birkhauser Verlag, Basel, Switzerland.
Benveniste, R. E. 1985. The contribution of retroviruses to the study of mammalian evolution, pp. 359-417. In R. J. MacIntyre (ed.), Molecular Evolutionary Genetics. Plenum, New York.
Benveniste, R. E. and G. J. Todaro. 1976. Evolution of type C viral genes: Evidence for an Asian origin of man. Nature 261:101-108.
Berg, W. J. and D. G. Buth. 1984. Glucose dehydrogenase in teleosts: Tissue distribution and proposed function. Comp. Biochem. Physiol. 77B:285-288.
Berger, S. L. and A. R. Kimmel (eds.). 1987. Guide to Molecular Cloning Techniques. Meth. Enzymol. 152:1-812.
Berlocher, S. H. and G. L. Bush. 1982. An electrophoretic analysis of Rhagoletis (Diptera: Tephritidae) phylogeny. Syst. Zool. 31:136-155.
Berlocher, S. H. and D. L. Swofford. 1996. Searching for phylogenetic trees under the frequency parsimony criterion: An approximation using generalized parsimony. (unpublished manuscript)
Bermingham, E. and J. C. Avise. 1986. Molecular zoogeography of freshwater fishes in the southeastern United States. Genetics 113:939-965.
Bernardi, G., B. Olofsson, J. Filipski, M. Zerial, J. Salinas, G. Cuny, M. Meunier-Rotival and F. Rodier. 1985. The mosaic genome of warm-blooded vertebrates. Science 228:953-958.
Beutler, E. 1969. Electrophoresis of phosphoglycerate kinase. Biochem. Genet. 3:189-195.
Bevan, I. S., R. Rapley and M. R. Walker. 1992. Sequencing of PCR-amplified DNA. PCR Meth. Applic. 1:222-228.
Beverley, S. M. and A. C. Wilson. 1982. Molecular evolution in Drosophila and higher diptera. I. Micro-complement fixation studies of a larval hemolymph protein. J. Mol. Evol. 18:251-264.
Beverley, S. M. and A. C. Wilson. 1985. Ancient origin for Hawaiian Drosophilinae inferred from protein comparisons. Proc. Natl. Acad. Sci. USA 82:4753-4757.
Beyer, W. A., M. L. Stein, T. F. Smith and S. M. Ulam. 1974. A molecular-sequence metric and evolutionary trees. Math. Biosci. 19:9-25.
Bickmore, W. A. and A. T. Sumner. 1989. Mammalian chromosome banding - an expression of genome organization. Trends Genet. 5:144-148.
Birky, C. W., Jr. 1983. The partitioning of cytoplasmic organelles at cell division. Int. Rev. Cytol. 15:49-89.
Birky, C. W., Jr., T. Maruyama and P. Fuerst. 1983. An approach to population and evolutionary genetic theory for genes in mitochondria and chloroplasts, and some results. Genetics 103:513-527.
Birky, C. W., Jr., P. Fuerst and T. Maruyama. 1989. Organelle gene diversity under migration, mutation, and drift: Equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics 121:613-627.
Birley, A. J. and J. H. Croft. 1986. Mitochondrial DNAs and phylogenetic relationships, pp. 107-137. In S. K. Dutta (ed.), DNA Systematics. CRC Press, Boca Raton, FL.
Birstein, V. J. 1982. Structural characteristics of genome organization in amphibians: Differential staining of chromosomes and DNA structure. J. Mol. Evol. 18:73-91.
Bisbee, C. A., M. A. Baker, A. C. Wilson, I. Hadji-Azimi and M. Fischberg. 1977. Albumin phylogeny for clawed frogs (Xenopus). Science 195:785-787.
Bishop, J. G. and J. A. Hunt. 1988. DNA divergence in and around the alcohol dehydrogenase locus in five closely related species of Hawaiian Drosophila. Mol. Biol. Evol. 5:415-432.
Bishop, M. D., S. M. Kappes, J. W. Keele, R. T. Stone, S. L. F. Sunden, G. A. Hawkins, S. S. Toldo, R. Fries, M. D. Grosz, J. Yoo and C. W. Beattie. 1994. A genetic linkage map for cattle. Genetics 136:619-639.
Bishop, M. J. and A. E. Friday. 1985. Evolutionary trees from nucleic acid and protein sequences. Proc. Roy. Soc. London B 226:271-302.
Black, I. W. C. 1993. PCR with arbitrary primers: Approach with care. Insect Mol. Biol. 2:1-6.
Bledsoe, A. H. 1987. DNA evolutionary rates in nine-primaried passerine birds. Mol. Biol. Evol. 4:559-571.
Block, B. A., J. R. Finnerty, A. F. R. Stewart and J. Kidd. 1993. Evolution of endothermy in fish: Mapping physiological traits on a molecular phylogeny. Science 260:210-214.
Bodmer, M. and M. Ashburner. 1984. Conservation and change in the DNA sequences coding for alcohol dehydrogenase in sibling species of Drosophila. Nature 309:421-430.
Boerwinkle, E., W. Xiong, E. Fourest and L. Chan. 1989. Rapid typing of tandemly repeated hypervariable loci by the polymerase chain reaction: Application to the apolipoprotein B 3' hypervariable region. Proc. Natl. Acad. Sci. USA 86:212-216.
Bogart, J. P. 1972. Karyotypes, pp. 171-195. In W. F. Blair (ed.), Evolution in the Genus Bufo. University of Texas Press, Austin.
Bogart, J. P., L. A. Lowcock, C. W. Zeyl and B. K. Mable. 1987. Genome constitution and reproductive biology of hybrid salamanders, genus Ambystoma, on Kelleys Island in Lake Erie. Can. J. Zool. 65:2188-2201.
Bonnell, M. T. and R. K. Selander. 1974. Elephant seals: Genetic variation and near extinction. Science 184:908-909.
Bonner, T. I., D. J. Brenner, B. R. Neufeld and R. J. Britten. 1973. Reduction in rate of DNA reassociation by sequence divergence. J. Mol. Biol. 81:123-135.
Bonner, T. I., R. Heinemann and G. J. Todaro. 1980. Evolution of DNA sequences has been retarded in Malagasy primates. Nature 286:420-423.
Boore, J. L., T. M. Collins, D. Stanton, L. L. Daehler and W. M. Brown. 1995. Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature 376:163-165.
Bowcock, A. M. and L. Cavalli-Sforza. 1991. The study of variation in the human genome. Genomics 11:491-498.
Bowcock, A. M., J. R. Kidd, J. L. Mountain, J. M. Herbert, L. Carotenuto, K. K. Kidd and L. L. Cavalli-Sforza. 1991. Drift, admixture, and selection in human evolution: A study with DNA polymorphisms. Proc. Natl. Acad. Sci. USA 88:839-843.
Bowcock, A. M., A. Ruiz-Linares, J. Tomfohrde, E. Minch, J. R. Kidd and L. L. Cavalli-Sforza. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455-457.
Bowen, B. W., A. B. Meylan and J. C. Avise. 1989. An odyssey of the green sea turtle: Ascension Island revisited. Proc. Natl. Acad. Sci. USA 86:573-576.
Bowen, B. W., W. S. Nelson and J. C. Avise. 1993. A molecular phylogeny for marine turtles: Trait mapping, rate assessment, and conservation relevance. Proc. Natl. Acad. Sci. USA 90:5574-5577.
Boyden, A. (ed.). 1948-1978. Serol. Mus. Bull. Vols. 1-51.
Boyden, A. 1942. Systematic serology: A critical appraisal. Physiol. Zool. 15:109-145.
Boyden, A. 1964. Perspectives in systematic serology, pp. 75-99. In C. A. Leone (ed.), Taxonomic Biochemistry and Serology. Ronald Press, New York.
Boyden, M. G. 1967. It's about time. Ser. Mus. Bull. 37:7-10.
Boyer, S. H. 1961. Alkaline phosphatase in human sera and placentae. Science 134:1002-1004.
Boyer, S. H., D. C. Fainer and E. J. Watson-Williams. 1963. Lactate dehydrogenase variant from human blood: Evidence for molecular subunits. Science 141:642-643.
Bradley, R. D., J. J. Bull, A. D. Johnson and D. M. Hillis. 1993. Origin of a novel allele in a mammalian hybrid zone. Proc. Natl. Acad. Sci. USA 90:8939-8941.
Brazaitis, P. and M. Watanabe. 1982. The doppler, a new tool for reptile and amphibian hematological studies. J. Herpetol. 16:1-6.
Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42:795-803.
Bremer, B. 1991. Restriction data from chloroplast DNA for phylogenetic reconstruction: Is there only one accurate way of scoring? Plant Syst. Evol. 175:39-54.
Bremer, K. 1994. Branch support and tree stability. Cladistics 10:295-304.
Breneman, J. W., M. J. Ramsey, D. H. Lee, G. G. Eveleth, J. L. Minkler and J. D. Tucker. 1993. The development of chromosome-specific composite DNA probes for the mouse and their application to chromosome painting. Chromosoma 102:591-598.
Brent, R. P. 1973. Algorithms for Minimization Without Derivatives. Prentice-Hall, Englewood Cliffs, New Jersey.
Brewer, G. J. 1970. An Introduction to Isozyme Techniques. Academic Press, New York.
Bridge, D., C. W. Cunningham, B. Schierwater, R. DeSalle and L. W. Buss. 1992. Class-level relationships in the phylum Cnidaria: Evidence from mitochondrial genome structure. Proc. Natl. Acad. Sci. USA 89:8750-8753.
Briscoe, D. A., J. M. Malpica, A. Robertson, G. J. Smith, R. Frankham, R. G. Banks and J. S. F. Barker. 1992. Rapid loss of genetic variation in large captive populations of Drosophila flies: Implications for the genetic management of captive populations. Conserv. Biol. 6:416-425.
Britten, R. J. 1986. Rates of DNA sequence evolution differ between taxonomic groups. Science 231:1393-1398.
Britten, R. J. 1989. Comment on DNA hybridization issues raised at Lake Arrowhead. J. Mol. Evol. 18:163-164.
Britten, R. J. and E. H. Davidson. 1969. Gene regulation for higher cells: A theory. Science 165:349-357.
Britten, R. J. and E. H. Davidson. 1985. Hybridisation Brown, W. M. 1983. Evolution of animal mitochondrial
strategy, pp. 3-15. In B. D. Hames and S. J. DNA, pp. 62-88. In M. Nei and R. K. Koehn
Higgins (eds.), Nucleic Acid Hybridisation: A (eds.), Evolution of Genes and Proteins. Sinauer,
Practical Approach. IRL Press, Oxford. Sunderland, Massachusetts.
Britten, R. J. and D. E. Kohne. 1967. Nucleotide Brown, W. M. 1985. The mitochondrial genome of ani-
sequence repetition in DNA. Carnegie Inst. Wash. mals, pp. 95-130. In R. J. MacIntyre (ed.), Molecular
Yearbook 65:78-106. Evolutionary Genetics. Plenum, New York.
Britten, R. J. and D. E. Kohne. 1968. Repeated Brown, W. M. and J. W. Wright. 1979. Mitochondrial DNA
sequences in DNA. Science 161:529-540. analyses and the origin and relative age of
Britten, R. J., A. Cetta and E. H. Davidson. 1978. The parthenogenetic lizards (genus Cnemidophorus).
single-copy sequence polymorphism of the sea Science 203:1247-1249.
urchin Strongylocentrotus purpuratus. Cell Brown, W. M., M. George, Jr. and A. C. Wilson. 1979.
15:1175-1186. Rapid evolution of animal mitochondrial DNA.
Britten, R. J., D. E. Graham and B. R. Neufeld. 1974. Proc. Natl. Acad. Sci. USA 76:1967-1971.
Analysis of repeating DNA sequences by reassoci- Brown, W. M., E. M. Prager, A. Wang and A. C.
ation. Meth. Enzymol. 29:365-418. Wilson. 1982. Mitochondrial DNA sequences of
Bron, C. and J. Kerbosch. 1973. Algorithm 457: Finding primates: Tempo and mode of evolution. J. Mol.
all cliques of an undirected graph. Comm. ACM Evol. 18:225-239.
16:575-577. Brownie, C. and D. D. Boos. 1994. Type I error robust-
Bronstein, I., J. C. Voyta, K. G. Lazzari, O. Murphy, B. ness of ANOVA and ANOVA on ranks when the
Edwards and L. J. Kricka. 1990. Rapid and sensi- number of treatments is large. Biometrics
tive detection of DNA in Southern blots with 50:542-549.
chemiluminescence. BioTechniques 8:310-312. Bruford, M. W. and R. K. Wayne. 1993. Microsatellites
Brooks, D. R. 1981. Hennig's parasitological method: and their application to population genetic stud-
A proposed solution. Syst. Zool. 30:229-249. ies. Curr. Opin. Genet. Dev. 3:939-943.
Brooks, D. R. 1990. Parsimony analysis in historical Bruford, M. W., O. Hanotte, J. F. Y. Brookfield and T.
biogeography and coevolution: Methodological Burke. 1992. Single-locus and multilocus DNA
and theoretical update. Syst. Zool. 39:14-30. fingerprinting, pp. 225-269. In A. R. Hoelzel (ed.),
Brooks, D. R. and D. A. McLennan. 1991. Phylogeny, Molecular Genetic Analysis of Populations: A
Ecology, and Behavior: A Research Program in Practical Approach. IRL Press, Oxford.
Comparative Biology. University of Chicago Press, Bruns, T. D. and J. D. Palmer. 1989. Evolution of mush-
Chicago. room mitochondrial DNA: Suillus and related
Brooks, D. R. and D. A. McLennan. 1993. Parascript: genera. J. Mol. Evol. 28:348-362.
Parasites and the Language of Evolution. Budowle, B., A. M. Giusti, J. S. Waye, F. S. Baechtel, R.
Smithsonian Institution Press, Washington, D.C. M. Fourney, D. E. Adams, L. A. Presley, H. A.
Broughton, R. E. and T. E. Dowling. 1994. Length vari- Deadman and K. L. Monson. 1991. Fixed-bin
ation in mitochondrial DNA of the minnow, analysis for statistical evaluation of continuous
Cyprinella spiloptera. Genetics 138:179-190. distributions of allelic data from VNTR loci, for
Brown, A. D. H. 1975. Sample sizes needed to detect use in forensic comparisons. Am. J. Human
linkage disequilibrium between two or three loci. Genet. 48:841-855.
Theor. Pop. Biol. 8:184-201. Buffon, G.-L. de L., Comte de. 1753. Histoire Naturelle
Brown, J. K. M. 1994. Probabilities of evolutionary Générale et Particulière. Vol. 4. Imprimerie Royale,
trees. Syst. Biol. 43:78-91. Paris.
Brown, K. L. 1985. Demographic and genetic charac- Bull, J. J., C. W. Cunningham, I. J. Molineux, M. R.
teristics of dispersal in the mosquitofish, Badgett and D. M. Hillis. 1993a. Experimental
Gambusia affinis (Pisces: Poeciliidae). Copeia molecular evolution of bacteriophage T7.
1985:597-612. Evolution 47:993-1007.
Brown, T. A. and K. A. Brown. 1994. Ancient DNA: Bull, J. J., J. P. Huelsenbeck, C. W. Cunningham, D. L.
Using molecular biology to explore the past. Swofford and P. J. Waddell. 1993b. Partitioning
BioEssays 16:719-726. and combining data in phylogenetic analysis.
Brown, W. M. 1980. Polymorphism in mitochondrial Syst. Biol. 42:384-397.
DNA of humans as revealed by restriction Bulmer, M. 1991. Use of the method of generalized
endonuclease analysis. Proc. Natl. Acad. Sci. USA least squares in reconstructing phylogenies from
77:3605-3609. sequence data. Mol. Biol. Evol. 8:868-883.
Bulmer, M. 1994. Theoretical Evolutionary Ecology. Buth, D. G. 1982a. Glucosephosphate-isomerase
Sinauer Associates, Sunderland, Massachusetts. expression in a tetraploid fish, Moxostoma lachneri
Buneman, P. 1971. The recovery of trees from mea- (Cypriniformes, Catostomidae): Evidence for
sures of dissimilarity, pp. 387-395. In F. R. "retetraploidization"? Genetica 57:171-175.
Hodson, D. G. Kendall and P. Tautu (eds.), Buth, D. G. 1982b. Locus assignments for general mus-
Mathematics in the Archaeological and Historical cle proteins of darters (Etheostomatini). Copeia
Sciences. Edinburgh University Press, Edinburgh. 1982:217-219.
Buonagurio, D. A., S. Nakada, J. D. Parvin, M. Krystal, Buth, D. G. 1983. Duplicate isozyme loci in fishes:
P. Palese and W. M. Fitch. 1986. Evolution of Origins, distribution, phyletic consequences, and
human influenza A viruses over 50 years: Rapid, locus nomenclature, pp. 381-400. In M. C. Rattazzi,
uniform rate of change in NS gene. Science J. G. Scandalios and G. S. Whitt (eds.), Isozymes.
232:980-982. Current Topics in Biological and Medical Research, Vol.
Burke, T. 1989. DNA fingerprinting and other methods 10. Genetics and Evolution. A. R. Liss, New York.
for the study of mating success. Trends Ecol. Evol. Buth, D. G. 1984a. The application of electrophoretic
4:139-144. data in systematic studies. Annu. Rev. Ecol. Syst.
Burke, T. and M. W. Bruford. 1987. DNA fingerprint- 15:501-522.
ing in birds. Nature 327:149-152. Buth, D. G. 1984b. Allozymes of the cyprinid fishes:
Burke, T., N. B. Davies, M. W. Bruford and B. J. Variation and application, pp. 561-590. In B. J.
Hatchwell. 1989. Parental care and mating behav- Turner (ed.), Evolutionary Genetics of Fishes.
iour of polyandrous dunnocks Prunella modularis Plenum, New York.
related to paternity by DNA fingerprinting. Buth, D. G. 1990. Genetic principles and the interpre-
Nature 338:249-251. tation of electrophoretic data, pp. 1-21. In D. H.
Burke, T., O. Hanotte and M. W. Bruford. 1991. Whitmore (ed.), Electrophoretic and Isoelectric
Multilocus and single locus minisatellite analysis Focusing Techniques in Fisheries Management. CRC
in population biological studies, pp. 155-168. In T. Press, Boca Raton, Florida.
Burke, G. Dolf, A. J. Jeffreys and R. Wolff (eds.), Buth, D. G. and R. W. Murphy. 1980. Use of nicoti-
DNA Fingerprinting: Approaches and Applications. namide adenine dinucleotide (NAD)-dependent
Birkhäuser Verlag, Basel, Switzerland. glucose-6-phosphate dehydrogenase in enzyme
Burkhart, B. D., E. Montgomery, C. H. Langley and R. staining procedures. Stain Technol. 55:173-176.
A. Voelker. 1984. Characterization of allozyme Buth, D. G., B. M. Burr and J. R. Schenck. 1980.
null and low activity alleles from two natural Electrophoretic evidence for relationships and dif-
populations of Drosophila melanogaster. Genetics ferentiation among members of the percid sub-
107:295-306. genus Microperca. Biochem. Syst. Ecol. 8:297-304.
Burnell, K. L. and S. B. Hedges. 1990. Relationships Buth, D. G., R. W. Murphy, M. M. Miyamoto and C. S.
and biogeography of West Indian Anolis (Sauria: Lieb. 1985. Creatine kinases of amphibians and
Iguanidae): An approach using slow-evolving reptiles: Evolutionary and systematic aspects of
protein loci. Carib. J. Sci. 26:7-30. gene expression. Copeia 1985:279-284.
Burton, R. S. and B.-N. Lee. 1994. Nuclear and mito-
chondrial gene genealogies and allozyme poly- Caccone, A. and J. R. Powell. 1987. Molecular evolu-
morphism across a major phylogeographic break tionary divergence among North American cave
in the copepod Tigriopus californicus. Proc. Natl. crickets. II. DNA-DNA hybridization. Evolution
Acad. Sci. USA 91:5197-5201. 41:1215-1238.
Busack, S. D., B. G. Jericho, L. R. Maxson and T. Caccone, A. and J. R. Powell. 1992. A protocol for the
Uzzell. 1988. Evolutionary relationships of sala- TEACL method of DNA-DNA hybridization, pp.
manders in the genus Triturus: The
view from 385-407. In G. M. Hewitt, A. W. B. Johnston and J.
immunology. Herpetologica 44:307-316. P. W. Young (eds.), Molecular Techniques in
Buth, D. G. 1979a. Creatine kinase variability in Taxonomy. Springer-Verlag, New York.
Moxostoma macrolepidotum (Cypriniformes: Caccone, A., G. D. Amato and J. R. Powell. 1987.
Catostomidae). Copeia 1979:152-154. Intraspecific DNA divergence in Drosophila: A
Buth, D. G. 1979b. Genetic relationships among the study on parthenogenetic D. mercatorum. Mol.
torrent suckers, genus Thoburnia. Biochem. Syst. Biol. Evol. 4:343-350.
Ecol. 7:311-316. Caccone, A., G. D. Amato and J. R. Powell. 1988a.
Buth, D. G. 1980. Staining procedures for D-2-hydrox- Rates and patterns of scnDNA and mtDNA diver-
yacid dehydrogenase as applied to studies of gence within the Drosophila melanogaster sub-
lower vertebrates. Isozyme Bull. 13:115. group. Genetics 118:671-683.
Caccone, A., R. DeSalle and J. R. Powell. 1988b. Cano, R. J., H. N. Poinar, N. J. Pieniazek, A. Acra and
Calibration of the change in thermal stability of G. O. Poinar, Jr. 1993. Amplification and sequenc-
DNA duplexes and degree of base pair mismatch. ing of DNA from a 120-135-million-year-old wee-
J. Mol. Evol. 27:212-216. vil. Nature 363:536-538.
Caccone, A., J. M. Gleason and J. R. Powell. 1992. Cantatore, P., M. N. Gadaleta, M. Roberti, C. Saccone
Complementary DNA-DNA hybridization in and A. C. Wilson. 1987. Duplication and remould-
Drosophila. J. Mol. Evol. 34:130-140. ing of tRNA genes during the evolutionary
Cadle, J. E. 1988. Phylogenetic relationships among rearrangement of mitochondrial genomes. Nature
advanced snakes: A molecular perspective. Univ. 329:853-854.
Calif. Pub. Zool. 119:1-77. Cao, Y., J. Adachi, T. Yano and M. Hasegawa. 1994.
Caetano-Anollés, G. and B. J. Bassam. 1993. DNA Phylogenetic placement of guinea pigs: No sup-
amplification fingerprinting using arbitrary port of the rodent polyphyly hypothesis from
oligonucleotide primers. Appl. Biochem. maximum likelihood analysis of multiple protein
Biotechnol. 42:189-200. sequences. Mol. Biol. Evol. 11:593-604.
Caetano-Anollés, G., B. J. Bassam and P. M. Gresshoff. Carbonari, M. 1993. Optimization of PCR perfor-
1992. Primer-template interactions during DNA mance. Trends Genet. 9:42-43.
amplification fingerprinting with single arbitrary Carlson, J. E., L. K. Tulsieram, J. C. Glaubitz, V. W. K.
oligonucleotides. Mol. Gen. Genet. 235:157-165. Luk, C. Kauffeldt and Rutledge. 1991. Segregation
Callan, H. G. 1966. Chromosomes and nucleoli of the of amplified DNA markers in F1 progeny of
axolotl, Ambystoma mexicanum. J. Cell Sci. 1:85-108. conifers. Theor. Appl. Genet. 83:194-200.
Callan, H. G. 1986. Lampbrush Chromosomes. Springer- Carlson, S. S., A. C. Wilson and R. D. Maxson. 1978.
Verlag, Berlin. Do albumin clocks run on time? Science
Callan, H. G., J. G. Gall and C. A. Berg. 1987. The 200:1183-1185.
lampbrush chromosomes of Xenopus laevis: Carpenter, J. M. 1988. Choosing among multiple
Preparation, identification, and distribution of 5S
DNA sequences. Chromosoma 95:236-250.
Callen, D. F., A. D. Thompson, Y. Shen, H. A. Phillips, Carr, S. M., A. J. Brothers and A. C. Wilson. 1987.
R. I. Richards, J. C. Mulley and G. R. Sutherland. Evolutionary inferences from restriction maps of
1993. Incidence and origin of "null" alleles in the mitochondrial DNA from nine taxa of Xenopus
(AC)n microsatellite markers. Am. J. Human frogs. Evolution 41:176-190.
Genet. 52:922-927. Case, S. M. and M. H. Wake. 1977. Immunological
Cameron, S. A. 1993. Multiple origins of advanced comparisons of Caecilian albumins (Amphibia:
eusociality in bees inferred from mitochondrial Gymnophiona). Herpetologica 33:94-98.
DNA sequences. Proc. Natl. Acad. Sci. USA Case, S. M. and E. E. Williams. 1984. Study of a contact
90:8687-8691. zone in the Anolis distichus complex in the central
Camin, J. H. and R. R. Sokal. 1965. A method for Dominican Republic. Herpetologica 40:118-137.
deducing branching sequences in phylogeny. Casillas, E., J. Sundquist and W. E. Ames. 1982.
Evolution 19:311-326. Optimization of assay conditions for, and selected
Campbell, J. A. and D. R. Frost. 1993. Anguid lizards tissue distribution of alanine aminotransferase
of the genus Abronia: Revisionary notes, descrip- and aspartate aminotransferase of English sole,
tions of four new species, a phylogenetic analysis, Parophrys vetulus Girard. J. Fish. Biol. 21:197-204.
and key. Bull. Amer. Mus. Nat. Hist. 216:1-121. Castora, F. J., N. Arnheim and M. V. Simpson. 1980.
Cann, R. L. and A. C. Wilson. 1983. Length mutations Mitochondrial DNA polymorphism: Evidence
in human mitochondrial DNA. Genetics that variants detected by restriction enzymes dif-
104:699-711. fer in nucleotide sequence rather than in methyla-
Cann, R. L., W. M. Brown and A. C. Wilson. 1984. tion. Proc. Natl. Acad. Sci. USA 77:6415-6419.
Polymorphic sites and the mechanism of evolu- Cate, R. C., C. W. Ehrenfels, M. Wysk, R. Tizard, J. C.
tion in human mitochondrial DNA. Genetics Voyta, O. Murphy III and I. Bronstein. 1991.
106:479-499. Genomic Southern analysis with alkaline-phos-
Cannatella, D. C. and R. O. de Sá. 1993. Xenopus laevis phatase conjugated oligonucleotide probes and
as a model organism. Syst. Biol. 42:476-507. the chemiluminescent substrate AMPPD. Genet.
Cano, R. J. and M. K. Borucki. 1995. Revival and iden- Anal. Tech. Appl. 8:102-106.
tification of bacterial spores in 25- to 40-million- Catzeflis, F. M., F. H. Sheldon, J. E. Ahlquist and C. G.
year-old amber. Science 268:1060-1064. Sibley. 1987. DNA-DNA hybridization evidence
of the rapid rate of rodent DNA evolution. Mol. Chambers, G. K., W. G. Laver, S. Campbell and J. B.
Biol. Evol. 4:242-253. Gibson. 1981. Structural analysis of an elec-
Cavalier-Smith, T. (ed.). 1985a. The Evolution of Genome trophoretically cryptic alcohol dehydrogenase
Size. John Wiley & Sons, New York. variant from an Australian population of
Cavalier-Smith, T. 1985b. Eukaryotic gene numbers, Drosophila melanogaster. Proc. Natl. Acad. Sci. USA
non-coding DNA, and genome size, pp. 69-103. 78:3103-3107.
In T. Cavalier-Smith (ed.), The Evolution of Genome Champion, A. B., E. M. Prager, D. Wachter and A. C.
Size. Wiley, New York. Wilson. 1974. Microcomplement fixation, pp.
Cavalli-Sforza, L. L. and A. W. F. Edwards. 1967. 397-416. In C. A. Wright (ed.), Biochemical and
Phylogenetic analysis: Models and estimation Immunological Taxonomy of Animals. Academic
procedures. Evolution 21:550-570 and Am. J. Press, London.
Hum. Genet. 19:233-257. Champion, A. B., E. L. Barrett, N. J. Palleroni, K. L.
Cavalli-Sforza, L. L., A. C. Wilson, C. R. Cantor, R. M. Soderberg, R. Kunisawa, R. Contopoulou, A. C.
Cook-Deegan and M. C. King. 1991. Call for a Wilson and M. Doudoroff. 1980. Evolution in
worldwide survey of human genetic diversity: A Pseudomonas fluorescens. J. Gen. Micro.
vanishing opportunity for the Human Genome 120:485-511.
Project. Genomics 11:490-491. Chan, H.-C., W. T. Ruyechan and J. G. Wetmur. 1976.
Cavender, J. A. 1978. Taxonomy with confidence. In vitro iodination of low complexity nucleic
Math. Biosci. 40:271-280. acids without chain scission. Biochemistry
Cavender, J. A. 1981. Tests of phylogenetic hypotheses 15:5487-5490.
under generalized models. Math. Biosci. Chan, S. C., A. K. C. Wong and D. K. Y. Chiu. 1992. A
54:217-229. survey of multiple sequence comparison meth-
Cavender, J. A. and J. Felsenstein. 1987. Invariants of ods. Bull. Math. Biol. 54:563-598.
phylogenies in a simple case with discrete states. Chapela, I. H., S. A. Rehner, T. R. Schultz and U. G.
J. Classif. 4:57-71. Mueller. 1994. Evolutionary history of the sym-
Cedergren, R., M. W. Gray, Y. Abel and D. Sankoff. biosis between fungus-growing ants and their
1988. The evolutionary relationships among fungi. Science 266:1691-1694.
known life forms. J. Mol. Evol. 28:98-112. Chapman, R. W. and D. A. Powers. 1984. A method for
Cei, J. M. 1972. Archaeobatrachia versus Neobatrachia: rapid isolation of mtDNA from fishes. Maryland
A first serological approach. Serol. Mus. Bull. Sea Grant Tech. Rep. MD-SG-TS-84-05. 11 pp.
48:1-4. Charleston, M. A. 1994. Factors affecting the perfor-
Chakraborty, R. 1992. Sample size requirements for mance of phylogenetic methods. Ph.D. disserta-
addressing the population genetic issues of foren- tion, Massey University.
sic use of DNA typing. Human Biol. 64:141-159. Charleston, M. A., M. D. Hendy and D. Penny. 1994.
Chakraborty, R. and H. Danker-Hopfe. 1991. Analysis The effects of sequence length, tree topology, and
of population structure: A comparative study of number of taxa on the performance of phyloge-
different estimators of Wright's fixation indices, netic methods. J. Computation. Biol. 1:133-151.
pp. 203-254. In C. R. Rao and R. Chakraborty Chase, C. D., M. Ortega and C. E. Vallejos. 1991.
(eds.), Handbook of Statistics, Volume 8. North- DNA restriction fragment length polymorphisms
Holland, Amsterdam. correlate with isozyme diversity in Phaseolus vul-
Chakraborty, R. and L. Jin. 1993. Determination of garis L. Theor. Appl. Genet. 81:806-811.
relatedness between individuals using DNA fin- Chase, M. W., D. E. Soltis, R. G. Olmstead, D. Morgan,
gerprinting. Human Biol. 65:875-895. D. H. Les, B. D. Mishler, M. R. Duvall, R. A. Price,
Chakraborty, R. and K. K. Kidd. 1991. The utility of H. G. Hills, Y.-L. Qiu, K. A. Kron, J. H. Rettig, E.
DNA typing in forensic work. Science Conti, J. D. Palmer, J. R. Manhart, K. J. Sytsma, H.
254:1735-1739. J. Michaels, W. J. Kress, K. G. Karol, W. D. Clark,
Chakraborty, R. and O. Leimar. 1987. Genetic variation M. Hedren, B. S. Gaut, R. K. Jansen, K. J. Kim, C.
within a subdivided population, pp. 90-120. In N. F. Wimpee, J. F. Smith, G. R. Furnier, S. H. Strauss,
Ryman and F. Utter (eds.), Population Genetics and Q.-Y. Xiang, G. M. Plunkett, P. S. Soltis, S. M.
Fisheries Management. University of Washington Swensen, S. E. Williams, P. A. Gadek, C. J. Quinn,
Press, Seattle. L. E. Eguiarte, E. Golenberg, G. H. J. Learn, S. W.
Chakraborty, R. and M. Nei. 1977. Bottleneck effects Graham, S. C. Barrett, S. Dayanandan and A.
on average heterozygosity and genetic distance Albert. 1993. Phylogenetics of seed plants: An
with the stepwise mutation model. Evolution analysis of nucleotide sequences from the plastid
31:347-356. gene rbcL. Ann. Missouri Bot. Gard. 80:528-580.
Cheliak, W. M. and J. A. Pitel. 1984. Techniques for Church, G. M. and W. Gilbert. 1984. Genomic sequenc-
starch gel electrophoresis of enzymes from forest ing. Proc. Natl. Acad. Sci. USA 81:1991-1995.
trees. Information Report PI-X-42. Petawawa Nat. Church, G. M. and S. Kieffer-Higgins. 1988. Multiplex
Forestry Inst., Canadian Forestry Service. DNA sequencing. Science 240:185-188.
Chen, B.-Y., S.-H. Mao and Y.-H. Ling. 1980. Churchill, G. A., A. von Haeseler and W. C. Navidi.
Evolutionary relationships of turtles suggested by 1992. Sample size for a phylogenetic inference.
immunological cross-reactivity of albumins. Mol. Biol. Evol. 9:753-769.
Comp. Biochem. Physiol. 66B:421-425. Clark, A. G. and C. M. S. Lanigan. 1993. Prospects for
Cheng, S., C. Fockler, W. M. Barnes and R. Higuchi. estimating nucleotide divergence with RAPDs.
1994a. Effective amplification of long targets from Mol. Biol. Evol. 10:1096-1111.
cloned inserts and human genomic DNA. Proc. Clark, A. G. and L. Wang. 1994. Comparative evolu-
Natl. Acad. Sci. USA 91:5695-5699. tionary analysis of metabolism in nine Drosophila
Cheng, S., R. Higuchi and M. Stoneking. 1994b. species. Evolution 48:1230-1243.
Complete mitochondrial genome amplification. Clark, A. G. and T. S. Whittam. 1992. Sequencing
Nature Genetics:350-351. errors and molecular evolutionary analysis. Mol.
Chepko-Sade, B. D. and Z. T. Halpin (eds.). 1987. Biol. Evol. 9:744-752.
Mammalian Dispersal Patterns: The Effects of Social Clegg, M. T. 1993. Chloroplast gene sequences and the
Structure on Population Genetics. University of study of plant evolution. Proc. Natl. Acad. Sci.
Chicago Press, Chicago. USA 90:363-367.
Cherry, L. M., S. M. Case, J. G. Kunkel, J. S. Wyles and Clegg, M. T. and G. Zurawski. 1992. Chloroplast DNA
A. C. Wilson. 1982. Body shape metrics and and the study of plant phylogeny: Present status
organismal evolution. Evolution 36:914-933. and future prospects, pp. 1-13. In P. S. Soltis, D. E.
Chesser, R. K. 1983. Genetic variability within and Soltis and J. J. Doyle (eds.), Molecular Systematics
among populations of the black-tailed prairie of Plants. Chapman and Hall, New York.
dog. Evolution 37:320-331. Clegg, M. T., G. H. Learn and E. M. Golenberg. 1990.
Cheverud, J. M., M. M. Dow and W. Leutenegger. Molecular evolution of chloroplast DNA. In R. K.
1984. The quantitative assessment of phylogenetic Selander, A. G. Clark and T. S. Whittam (eds.),
constraints in comparative analyses: Sexual Evolution at the Molecular Level. Sinauer,
dimorphism in body weight among primates. Sunderland, Massachusetts.
Evolution 39:1335-1351. Clayton, J. W. and D. N. Tretiak. 1972. Amine-citrate
Chilson, O. P., L. A. Costello and N. O. Kaplan. 1965. buffers for pH control in starch gel electrophore-
Effects of freezing on enzymes. Fed. Proc. 24 sis. J. Fish. Res. Board Canada 29:1169-1172.
(s15):555-565. Cochrane, B. J. and R. C. Richmond. 1979. Studies of
Chippindale, P. T. 1989. A high-pH discontinuous esterase 6 in Drosophila melanogaster. I. The genet-
buffer system for resolution of isozymes in starch- ics of posttranslational modification. Biochem.
gel electrophoresis. Stain Technol. 64:61-64. Genet. 17:167-183.
Chippindale, P. T. and J. J. Wiens. 1994. Weighting, Cockerham, C. C. 1969. Variance of gene frequencies.
partitioning, and combining characters in phylo- Evolution 23:72-84.
genetic analysis. Syst. Biol. 43:278-287. Cockerham, C. C. 1971. Higher-order probabilities of
Cho, S., A. Mitchell, J. Regier, C. Mitter, R. Poole, identity by descent. Genetics 69:235-246.
T. Friedlander, and S. Zhao. 1995. A highly con- Cockerham, C. C. 1973. Analyses of gene frequencies.
served nuclear gene for low-level phylogenetics: Genetics 74:679-700.
Elongation factor-1α recovers morphology-based Cockerham, C. C. 1984. Drift and mutation with a
tree for heliothine moths. Mol. Biol. Evol. 12: finite number of allelic states. Proc. Natl. Acad.
650-656. Sci. USA 81:530-534.
Choudhary, M., J. E. Strassmann, C. R. Solis and D. C. Cockerham, C. C. and B. S. Weir. 1983. Variance of
Queller. 1993. Microsatellite variation in social actual inbreeding. Theor. Pop. Biol. 23:85-109.
insects. Biochem. Genet. 31:87-95. Cockerham, C. C. and B. S. Weir. 1986. Estimation of
Chrambach, A. and D. Rodbard. 1971. Polyacrylamide inbreeding parameters in stratified populations.
gel electrophoresis. Science 172:440-451. Ann. Human Genet. 50:271-281.
Christiansen, F. B. and O. Frydenberg. 1973. Selection Cockerham, C. C. and B. S. Weir. 1987. Correlations,
component analysis of natural polymorphisms descent measures: Drift with migration and muta-
using population samples including mother-off- tion. Proc. Natl. Acad. Sci. USA 84:8512-8514.
spring combinations. Theor. Pop. Biol. 4:425-445. Cockerham, C. C. and B. S. Weir. 1993. Estimation of
gene flow from F-statistics. Evolution 47:855-863.
Cocks, G. T. and A. C. Wilson. 1972. Enzyme evolution Biological and Medical Research, Vol. 6. A. R. Liss,
in the Enterobacteriaceae. J. Bacteriol. 110:793-802. New York.
Coen, E., T. Strachan and G. Dover. 1982. Dynamics of Crabtree, C. B. 1987. Allozyme evidence for the phylo-
concerted evolution in regions of ribosomal DNA genetic relationships within the silverside sub-
and histone gene families in the melanogaster family Atherinopsinae. Copeia 1987:860-867.
group of Drosophila. J. Mol. Biol. 158:17-35. Cracraft, J. 1987. DNA hybridization and avian phylo-
Colless, D. H. 1970. The phenogram as an estimate of genetics. Evol. Biol. 21:47-96.
phylogeny. Syst. Zool. 19:352-362. Cracraft, J. 1989. Speciation and its ontology: The
Collier, G. E. 1990. Evolution of arginine kinase within empirical consequences of alternative species con-
the genus Drosophila. J. Hered. 81:177-182. cepts for understanding patterns and processes of
Collier, G. E. and R. J. MacIntyre. 1977. differentiation, pp. 28-59. In D. Otte and J. A.
Microcomplement fixation studies on the evolu- Endler (eds.), Speciation and its Consequences.
tion of a-glycerophosphate dehydrogenase within Sinauer Associates, Sunderland, MA.
the genus Drosophila. Proc. Natl. Acad. Sci. USA Crandall, K. A. 1994. Intraspecific cladogram estima-
74:684-688. tion: Accuracy at higher levels of divergence.
Collins, T. M., F. Kraus and G. Estabrook. 1994a. Syst. Biol. 43:222-235.
Compositional effects and weighting of Crandall, K. A. 1995a. Intraspecific phylogenetics:
nucleotide sequences for phylogenetic analysis. Support for dental transmission of human
Syst. Biol. 43:449-459. immunodeficiency virus. J. Virol. 69:2351-2356.
Collins, T. M., P. H. Wimberger and G. J. P. Naylor. Crandall, K. A. 1995b. Multiple interspecies transmis-
1994b. Compositional bias, character-state bias, sions of human and simian T-cell leukemia/lym-
and character-state reconstruction using parsimo- phoma virus type I sequences. Mol. Biol. Evol. (in
ny. Syst. Biol. 43:482-496. press).
Comings, D. E. 1978. Mechanisms of chromosome Crandall, K. A., A. R. Templeton and C. F. Sing. 1994.
banding and implications for chromosome struc- Interspecific phylogenetics: Problems and solu-
ture. Annu. Rev. Genet. 12:25-46. tions, pp. 273-297. In R. W. Scotland, D. J. Siebert
Commerford, S. L. 1971. Iodination of nucleic acids in and D. M. Williams (eds.), Models in Phylogeny
vitro. Biochemistry 10:1993-2000. Reconstruction. Clarendon Press, Oxford.
Conkle, M. T., P. D. Hodgskiss, L. B. Nunnally and S. Crandall, K. A. and A. R. Templeton. 1996.
C. Hunter. 1982. Starch Gel Electrophoresis of Conifer Applications of intraspecific phylogenetics. In P.
Seeds: A Laboratory Manual. Gen. Tech. Report H. Harvey, A. J. Leigh Brown and J. Maynard
PSW-64. Pacific Southwest Forest and Range Smith (eds.), New Uses for New Phylogenies. Oxford
Experimental Station, Forest Service, U.S. Dept. University Press, Oxford.
Agriculture, Berkeley, California. Crawford, D. J. 1989. Enzyme electrophoresis and
Cooper, A., C. Mourer-Chauviré, G. K. Chambers, A. plant systematics, pp. 146-164. In Soltis, D. E. and
von Haeseler, A. C. Wilson and S. Pääbo. 1992. P. S. Soltis (eds.), Isozymes in Plant Biology.
Independent origins of New Zealand moas and Dioscorides Press, Portland, Oregon.
kiwis. Proc. Natl. Acad. Sci. USA 89:8741-8744. Crawford, D. J. 1990. Plant Molecular Systematics. John
Coradin, L. and D. E. Giannasi. 1980. The effects of Wiley and Sons, New York.
chemical preservatives on plant collections to be Crawford, D. L. and D. A. Powers. 1989. Molecular
used in chemotaxonomic surveys. Taxon 29:33-40. basis of evolutionary adaptation at the lactate
Cornall, R. J., T. J. Aitman, C. M. Hearne and J. A. dehydrogenase-B locus in the fish Fundulus hetero-
Todd. 1991. The generation of a library of PCR- clitus. Proc. Natl. Acad. Sci. USA 86:9365-9369.
analyzed microsatellite variants for genetic map- Crawford, T. J. 1984. What is a population?, pp.
ping of the mouse genome. Genomics 10:874-881. 135-174. In B. Shorrocks (ed.), Evolutionary
Corriveau, J. L. and A. W. Coleman. 1988. Rapid Ecology. The 23rd Symposium of the British Ecological
screening method to detect potential biparental Society. Blackwell, Oxford.
inheritance of plastid DNA and results from over Craxton, M. 1991. Linear amplification sequencing: A
200 angiosperm species. Am. J. Bot. 75:1443-1458. powerful method for sequencing DNA. Methods
Cox, D. R. and H. D. Miller. 1977. The Theory of 3:20-24.
Stochastic Processes. Chapman and Hall, London. Creasey, A., L. D'Angio, T. S. Dunne, C. Kissinger, T.
Coyne, J. 1982. Gel electrophoresis and cryptic protein O'Keeffe, H. Perry-O'Keeffe, L. S. Moran, M.
variation, pp. 1-32. In M. Rattazzi, J. Scandalios Roskey, I. Schildkraut, L. E. Sears and B. Slatko.
and G. Whitt (eds.), Isozymes: Current Topics in 1991. Application of a novel chemiluminescence-
based DNA detection method to single-vector Daly, J. C. 1981. Effects of social organization and envi-
and multiplex DNA sequencing. BioTechnology ronmental diversity on determining the genetic
11:102-109. structure of a population of the wild rabbit,
Cremisi, F., R. Vignali, R. Batistoni and G. Barsacchi. Oryctolagus cuniculus. Evolution 35:689-706.
1988. Heterochromatic DNA in Triturus Dando, P. R., K. B. Storey, P. W. Hochachka and J. M.
(Amphibia, Urodela). II. A centromeric satellite Storey. 1981. Multiple dehydrogenases in marine
DNA. Chromosoma 97:204-211. molluscs: Electrophoretic analysis of alanopine
dehydrogenase, strombine dehydrogenase,
Cronin, J. E. and V. M. Sarich. 1975. Molecular system- octopine dehydrogenase, and lactate dehydroge-
atics of the New World monkeys. J. Human Evol. nase. Marine Biol. Letters 2:249-257.
4:357-375. Danna, K. J. 1980. Determination of fragment order
Cross, T. F., R. D. Ward and A. Abreu-Grobois. 1979. through partial digests and multiple enzyme
Duplicate loci and allelic variation for mitochon- digests. Meth. Enzymol. 65:449-467.
drial malic enzyme in the Atlantic salmon, Salmo Danzmann, R. G. and J. P. Bogart. 1982a. Evidence for
salar L. Comp. Biochem. Physiol. 62B:403-406. a polymorphism in gametic segregation using a
Crother, B. I. 1990. Is "some better than none" or do malate dehydrogenase locus in the tetraploid
allele frequencies contain phylogenetically useful treefrog Hyla versicolor. Genetics 100:287-306.
information? Cladistics 6:277-281. Danzmann, R. G. and J. P. Bogart. 1982b. Gene dosage
Crother, B. I. 1992. Genetic characters, species con- effects on MDH isozyme expression in diploid,
cepts, and conservation biology. Conserv. Biol. triploid, and tetraploid treefrogs of the genus
6:314. Hyla. J. Hered. 73:277-280.
Crouau-Roy, B. 1988. Genetic structure of cave- Darnell, R., H. Lodish and D. Baltimore. 1986.
dwelling beetles populations: Significant deficien- Molecular Cell Biology. Scientific American Books,
cies of heterozygotes. Heredity 60:321-327. New York.
Crouse, J. and D. Amorese. 1986. Stability of restriction Darwin, C. 1859. On the Origin of Species by Means of
endonucleases during extended digestion. Focus Natural Selection. J. Murray, London.
(BRL) 8:1-2. Daugherty, C. H., A. Cree, J. M. Hay and M. B.
Crow, J. F. 1985. The neutrality-selection controversy Thompson. 1990. Neglected taxonomy and con-
in the history of evolution and population genet- tinuing extinctions of tuatara (Sphenodon).Nature
ics, pp. 1-18. In T. Ohta and K. Aoki (eds.), 347:177-179.
Population Genetics and Molecular Evolution. Japan Davies, D. H., R. Lawson, S. J. Burch and J. E. Hanson.
Science Press and Springer-Verlag, Tokyo. 1987. Evolutionary relationships of a "primitive"
Crozier, R. H. and Y. C. Crozier. 1993. The mitochondr- shark (Heterodontus) assessed by micro-comple-
ial genome of the honeybee Apis mellifera: ment fixation of serum transferrin. J. Mol. Evol.
Complete sequence and genome organization. 25:74-80.
Genetics 133:97-117. Davis, J. I. and K. C. Nixon. 1992. Populations, genetic
CSKRN. 1973. Committee for a standardized kary- variation, and the delimitation of phylogenetic
otype of the Norway rat, Rattus norvegicus. species. Syst. Biol. 41:421-435.
Cytogenet. Cell Genet. 12:199-205. Davis, L. G., M. D. Dibner and J. F. Battey. 1986. Basic
Cunningham, C. W., N. W. Blackstone and L. W. Buss. Methods in Molecular Biology. Elsevier Science
1992. Evolution of king crabs from hermit crab Publ., New York.
ancestors. Nature 355:539-542. Davis, L. M., F. R. Fairfield, M. L. Hammond, C. A.
Cummings, M. P., S. P. Otto and J. Wakeley. 1995. Harger, J. H. Jett, R. A. Keller, J. H. Hahn, L. A.
Sampling properties of DNA sequence data in Krakowski, B. Marrone, J. C. Martin, H. L. Nutter,
phylogenetic analysis. Mol. Biol. Evol. 12:814-823. R. R. Ratliff, E. B. Shera, D. J. Simpson, S. A. Soper
and C. W. Wilkerson. 1992. Rapid DNA sequenc-
D'Agostino, R. B., W. Chase and A. Belanger. 1988. The ing based on single-molecule detection. Los
appropriateness of some common procedures for Alamos Sci. 20:280-285.
testing the equality of two independent binomial Davis, M. 13. 1973. Labeling of DNA with 125I. Carnegie
populations. Am. Statist. 42:198-202. Inst. Wash. Yearbook 72:217-221.
Dallas, J. F. 1988. Detection of DNA "fingerprints" of Davison, D. 1985. Sequence similarity ("homology")
cultivated rice by hybridization with a human searching for molecular biologists. Bull. Math.
minisatellite DNA probe. Proc. Natl. Acad. Sci. Biol. 47:437-474.
USA 85:6831-6835.
Dawley, R. M. and J. P. Bogart (eds.). 1989. Evolution gressive hybridization: Implications for evolution
and Ecology of Unisexual Vertebrates. Bull. New and conservation. Proc. Natl. Acad. Sci. USA
York State Museum, Albany. 89:2747-2751.
Dawley, R. M., J. H. Graham and R. J. Schultz. 1985. Dene, H., M. Goodman and W. S. Prychodko. 1978. An
Triploid progeny of pumpkinseed x green sunfish immunological examination of the systematics of
hybrids. J. Hered. 76:251-257. the Tupaioidea. J. Mammal. 59:697-706.
Dawson, D. M., H. M. Eppenberger and N. O. Kaplan. Densmore, L. D. 1983. Biochemical and immunologi-
1967. The comparative enzymology of creatine cal systematics of the order Crocodilia. Evol. Biol.
kinases. II. Physical and chemical properties. J. 16:397-465.
Biol. Chem. 25:210-217. Densmore, L. D., J. W. Wright and W. M. Brown. 1985.
Dayhoff, M. O. 1978. Atlas of Protein Sequence and Length variation and heteroplasmy are frequent
Structure, vol. 5, suppl. 3. Natl. Biomed. Res. in mitochondrial DNA from parthenogenetic and
Found., Silver Spring, Maryland. bisexual lizards (genus Cnemidophorus). Genetics
Dayhoff, M. O., R. M. Schwartz and B. C. Orcutt. 1978. 110:698-707.
A model of evolutionary change in proteins, pp. de Queiroz, A. 1993. For consensus (sometimes). Syst.
345-352. In Dayhoff, M. O. 1978. Atlas of Protein Biol. 42:368-372.
Sequence and Structure, vol. 5, suppl. 3. Natl. Derr, J. N., J. W. Bickham, I. F. Greenbaum, A. G. J.
Biomed. Res. Found., Silver Spring, Maryland. Rhodin and R. A. Mittermeier. 1987. Biochemical
Debeau, L., L. A. Chandler, J. R. Gralow, P. W. Nichols systematics and evolution in the South American
and P. A. Jones. 1986. Southern blot analysis of turtle genus Platemys (Pleurodira: Chelidae).
DNA extracted from formalin-fixed pathology Copeia 1987:370-375.
specimens. Cancer Res. 46:2964-2969. de Sá, R. O. and D. M. Hillis. 1990. Phylogenetic rela-
DeBorde, D. C., C. W. Naeve, M. L. Herlocher and H. tionships of the pipid frogs Xenopus and Silurana:
F. Maassab. 1986. Resolution of a common RNA An integration of ribosomal DNA and morpholo-
sequencing ambiguity by terminal deoxynu- gy. Mol. Biol. Evol. 7:365-376.
cleotidyl transferase. Analyt. Biochem. DeSalle, R. 1992. The phylogenetic relationships of
157:275-282. flies in the family Drosophilidae deduced from
DeBry, R. W. 1992. The consistency of several phyloge- mtDNA sequences. Mol. Phylogenet. Evol.
ny-inference methods under varying evolutionary 1:31-40.
rates. Mol. Biol. Evol. 9:537-551. DeSalle, R. 1994. Implications of ancient DNA for phy-
DeBry, R. W. and N. A. Slade. 1985. Cladistic analysis logenetic studies. Experientia 50:542-550.
of restriction endonuclease cleavage maps within DeSalle, R. and D. Grimaldi. 1993. Phylogenetic pat-
a maximum-likelihood framework. Syst. Zool. tern and developmental process in Drosophila.
34:21-34. Syst. Biol. 42:458-475.
Degnan, S. D. 1993a. Genetic variability and popula- DeSalle, R. and A. R. Templeton. 1988. Founder effects
tion differentiation inferred from DNA finger- accelerate the rate of mitochondrial DNA evolu-
printing in silvereyes (Aves: Zosteropidae). tion in Hawaiian Drosophila. Evolution
Evolution 47:1105-1127. 42:1076-1084.
Degnan, S. D. 1993b. The perils of single gene trees- DeSalle, R., L. V. Giddings and A. R. Templeton. 1986.
mitochondrial versus single-copy nuclear DNA Mitochondrial DNA variability in natural popula-
variation in white-eyes (Aves: Zosteropidae). Mol. tions of Hawaiian Drosophila. I. Methods and lev-
Ecol. 2:219-225. els of variability in D. silvestris and D. heteroneura
Deininger, L. and G. R. Daniels. 1986. The recent populations. Heredity 56:75-85.
evolution of mammalian repetitive DNA ele- DeSalle, R., T. Freedman, E. M. Prager and A. C.
ments. Trends Genet. 2:76-80. Wilson. 1987a. Tempo and mode of sequence evo-
DeLong, E. F. 1990. Archaea in coastal marine environ- lution in mitochondrial DNA of Hawaiian
ments. Proc. Natl. Acad. Sci. USA 89:5685-5689. Drosophila. J. Mol. Evol. 26:157-164.
DeLorenzo, R. J. and F. H. Ruddle. 1969. Genetic con- DeSalle, R., A. R. Templeton, I. Mori, S. Pietscher and
trol of two electrophoretic variants of glu- J. S. Johnson. 1987b. Temporal and spatial hetero-
cosephosphate isomerase in the mouse. Biochem. geneity of mtDNA polymorphisms in natural
Genet. 3:151-162. populations of Drosophila mercatorum. Genetics
DeMarais, B. D., T. E. Dowling, M. E. Douglas, W. L. 116:215-233.
Minckley and P. C. Marsh. 1992. Origin of Gila DeSalle, R., J. Gatesy, W. Wheeler and D. Grimaldi.
seminuda (Teleostei: Cyprinidae) through intro- 1992. DNA sequences from a fossil termite in
Oligo-Miocene amber and their phylogenetic Dessauer, H. C., J. B. Cadle and R. Lawson. 1987.
implications. Science 257:1933-1936. Patterns of snake evolution suggested by their
DeSalle, R., A. K. Williams and M. George. 1993. proteins. Fieldiana Zool. N.S. 34:1-34.
Isolation and characterization of animal mito- Dessauer, H. C., M. S. Hafner, R. M. Zink and C. J.
chondrial DNA. Meth. Enzymol. 224:176-204. Cole. 1988. A national program to develop, main-
Desjardins, P. and R. Morais. 1991. Nucleotide tain, and utilize frozen tissue collections for scien-
sequence and evolution of coding and noncoding tific research. Assoc. Syst. Collections Newsletter
regions of a quail mitochondrial genome. J. Mol. 16.3,9-10.
Evol. 32:153-161. Diaz, M. O., G. Barsacchi-Pilone, K. A. Mahon and J.
De Soete, G. 1983a. A least squares algorithm for fit- G. Gall. 1981. Transcripts from both strands of a
ting additive trees to proximity data. satellite DNA occur on lampbrush chromosome
Psychometrika 48:621-626. loops of the newt Notophthalmus. Cell 24:649-659.
De Soete, G. 1983b. On the construction of Dickersin, K. and J. A. Berlin. 1992. Meta-analysis:
"optimal" phylogenetic trees. Z. Naturforsch. State-of-the-science. Epidemiol. Rev. 24:154-276.
38:156-158. Dickerson, R. E. 1971. The structure of cytochrome c
Dessauer, H. C. and C. J. Cole. 1984. Influence of gene and the rates of molecular evolution. J. Mol. Evol.
dosage on electrophoretic phenotypes of proteins 1:26-45.
from lizards of the genus Cnemidophorus. Comp. Dietrich, W., H. Katz, S. E. Lincoln, H.-S. Shin, J.
Biochem. Physiol. 77B:181-189. Friedman, N. C. Dracopoli and E. S. Lander. 1992.
Dessauer, H. C. and C. J. Cole. 1986. Clonal inheri- A genetic map of the mouse suitable for typing
tance in parthenogenetic whiptail lizard: intraspecific crosses. Genetics 131:423-447.
Biochemical evidence. J. Hered. 773-12. Dietrich, W., J. Miller, H. Katz, I3 Joyce, li. Steen, S.
Dessauer, H. C. and C. J. Cole. 1989. Diversity between Lincoln, M. Daly, M. P. Reeve, A. Weaver, 1'
and within nominal forms of unisexual teiid Anagnostopoulis, N. Goodman, N. Dracopoli and
lizards, pp. 49-71. In R. Dawley and J. P. Bogart E. S. Lander. 1993. SSLP genetic mapping of the
(eds.), Evolution and Ecology of Unisexual mouse (Mus musculus) SN-40, pp. 110-142. In S. J.
Vertebrates. New York State Museum, Albany. O'Brien (ed.), Genetic Maps: Locus Maps of Complex
Dessauer, H. C. and C. J. Cole. 1991. Genetics of whip- Genomes. Book 4. Nonhuman Vertebrates. 6th ed.
tail lizards (Reptilia: Teiidae: Cnemidophorus) in a Cold Spring Harbor Laboratory Press, Cold
hybrid zone in southwestern New Mexico. Spring Harbor, New York.
Copeia 1991:622-637. DiLella, A. G. and S. L. C. Woo. 1987. Cloning large
Dessauer, H. C. and M. S. Hafner (eds.). 1984. segments of genomic DNA using cosmid vectors.
Collections of Frozen Tissues: Value, Management, Meth. Enzymol. 152:199-212.
Field and Laboratory Procedures, and Directory of DiMichele, L. and D. A. Powers. 1982a. LDH-B geno-
Existing Collections. Assoc. Systematics type-specific hatching times of Fundulus heterocli-
Collections, Washington, D.C. tus embryos. Nature 296:563-564.
Dessauer, H. C. and R. A. Menzies. 1984. Stability of DiMichele, L. and D. A. Powers. 1982b. Physiological
macromolecules during long-term storage, pp. basis for swimming endurance differences
17-20. In H. C. Dessauer and M. S. Hafner (eds.), between LDH-B genotypes of Fundulus heterocli-
Collections of Frozen Tissues: Value, Management, tus. Science 216:1014-1016.
Field and Laboratory Procedures, and Directory of Dimmick, W. W. 1987. Phylogenetic relationships of
Existing Collections. Assoc. Syst. Collections, Notropis hubbsi, N. welaka and N. emiliae
University of Kansas Press, Lawrence. (Cypriniformes: Cyprinidae). Copeia
Dessauer, H. C., M. J. Braun and S. Neville. 1983. A 1987:316-325.
simple hand centrifuge for field use. Isozyme Dixon, M. T. and D. M. Hillis. 1993. Ribosomal RNA
Bull. 16:91. secondary structure: Compensatory mutations
Dessauer, H. C., R. A. Menzies and D. E. Fairbrothers. and implications for phylogenetic analysis. Mol.
1984. Procedures for collecting and preserving tis- Biol. Evol. 10:256-267.
sues for molecular studies, pp. 21-24. In H. C. Dobzhansky, T. 1937. Genetics and the Origin of Species.
Dessauer and M. S. Hafner (eds.), Collections of Reprinted 1982, Columbia University Press, New
Frozen Tissues: Value, Management, Field and York.
Laboratory Procedures, and Directory of Existing Dodds, K. G. 1986. Resampling methods in genetics
Collections. Assoc. Syst. Collections, University of and the effect of family structure in genetic data.
Kansas Press, Lawrence. Inst. Statis. Mimeo Series
1684T, North Carolina
State University, Raleigh.
Domingo, E. and J. J. Holland. 1994. Mutation rates Dowling, T. E. and M. R. Childs. 1992. Impact of
and rapid evolution of RNA viruses, pp. 161-184. hybridization on a threatened trout of the south-
In S. S. Morse (ed.), The Evolutionary Biology of western United States. Conserv. Biol. 6:355-364.
Viruses. Raven Press, New York. Dowling, T. E. and B. D. DeMarais. 1993. Evolutionary
Donoghue, M. J. 1989. Phylogenies and the analysis of significance of introgressive hybridization in
evolutionary sequences, with examples from seed cyprinid fishes. Nature 362:444-446.
plants. Evolution 43:1127-1156. Dowling, T. E. and W. R. Hoeh. 1991. The extent of
Donoghue, M. J., R. G. Olmstead, J. F. Smith and J. D. introgression outside the hybrid zone between
Palmer. 1992. Phylogenetic relationships of Notropis cornutus and Notropis chrysocephalus
Dipsacales based on rbcL sequences. Ann. Missouri (Teleostei: Cyprinidae). Evolution 45:944-956.
Bot. Garden 79:249-265. Dowling, T. E., G. R. Smith and W. M. Brown. 1989.
Donnellan, S. C. and K. P. Aplin. 1989. Resolution of Reproductive isolation and introgression between
cryptic species in the New Guinean lizard, Notropis cornutus and Notropis chrysocephalus (fam-
Sphenomorphus jobiensis (Scincidae) by elec- ily Cyprinidae): Comparison of morphology,
trophoresis. Copeia 1989:81-88. allozymes, and mitochondrial DNA. Evolution
Doolittle, R. F. (ed.). 1990a. Molecular Evolution: 43:620-634.
Computer Analysis of Protein and Nucleic Acid Dowling, T. E., B. D. DeMarais, W. L. Minckley, M. E.
Sequences. Methods in Enzymology, 183. Douglas and P. C. Marsh. 1992a. Use of genetic
Academic Press, San Diego. characters in conservation biology. Conserv. Biol.
Doolittle, R. F. 1990b. Searching through sequence 6:7-8.
databases. Meth. Enzymol. 183:99-110. Dowling, T. E., G. R. Smith, W. R. Hoeh and W. M.
Doolittle, R. F., D.-F. Feng, M. A. McClure and M. S. Brown. 1992b. Evolutionary relationships of shin-
Johnson. 1990. Retrovirus phylogeny and evolu- ers in the genus Luxilus (Cyprinidae) as deter-
tion. Curr. Top. Micro. Immunol. 157:1-18. mined by analysis of mitochondrial DNA. Copeia
Doolittle, W. F. 1985. Middle repetitive DNAs, pp. 1992:306-322.
443-487. In T. Cavalier-Smith (ed.), The Evolution Downie, S. R. and J. D. Palmer. 1992a. Restriction site
of Genome Size. John Wiley and Sons, New York. mapping of the chloroplast DNA inverted repeat:
Dorit, R. L., H. Akashi and W. Gilbert. 1995. Absence A molecular phylogeny of the Asteridae. Ann.
of polymorphism at the ZFY locus on the human Missouri Bot. Garden 79:266-283.
Y chromosome. Science 268:1183-1185. Downie, S. R. and J. D. Palmer. 1992b. Use of chloro-
Dover, G. A. 1982. Molecular drive, a cohesive mode plast DNA rearrangements in reconstructing
of species evolution. Nature 299:111-117. plant phylogeny, pp. 14-35. In P. S. Soltis, D. E.
Dover, G. A. 1987. Letter to the editor. Cell 51:515. Soltis and J. Doyle (eds.), Molecular Systematics of
Dover, G. A. and D. Tautz. 1986. Conservation and Plants. Chapman and Hall, New York.
divergence in multigene families: Alternatives to Downie, S. R. and J. D. Palmer. 1994. A chloroplast
selection and drift. Phil. Trans. Roy. Soc. London DNA phylogeny of the Caryophyllales based on
B312:275-289. structural and inverted repeat restriction site vari-
Dover, G. A., S. Brown, E. Coen, J. Dallas, T. Strachan ation. Syst. Bot. 19:236-252.
and M. Trick. 1982. The dynamics of genome evo- Downie, S. R., R. G. Olmstead, G. Zurawski, D. E.
lution and species differentiation, pp. 343-372. In Soltis, P. S. Soltis, J. C. Wa and J. D. Palmer. 1991.
G. A. Dover and R. B. Flavell (eds.), Genome Six independent losses of the chloroplast DNA
Evolution. Academic Press, New York. rpl2 intron in dicotyledons: Molecular and phylo-
Dowling, H. C., R. Highton, G. C. Maha and L. R. genetic implications. Evolution 45:1245-1259.
Maxson. 1983. Biochemical evaluation of colubrid Doyle, J. J. 1992. Gene trees and species trees:
snake phylogeny. J. Zool. (London) 201:309-329. Molecular systematics as one-character taxonomy.
Dowling, T. E. and W. M. Brown. 1989. Allozymes, Syst. Bot. 17:144-163.
mitochondrial DNA, and levels of phylogenetic Doyle, J. J. 1994. Evolution of a plant homeotic multi-
resolution among four species of minnows gene family: Toward connecting molecular sys-
(Notropis: Cyprinidae). Syst. Zool. 38:126-143. tematics and molecular developmental genetics.
Dowling, T. E. and W. M. Brown. 1993. Population Syst. Biol. 43:307-328.
structure of the bottlenose dolphin (Tursiops trun- Doyle, J. J. and E. E. Dickson. 1987. Preservation of
catus) as determined by restriction endonuclease plant samples for DNA restriction endonuclease
analysis of mitochondrial DNA. Marine Mamm. analysis. Taxon 36:715-722.
Sci. 9:138-155. Doyle, J. J., J. I. Davis, R. J. Soreng, D. Gamin and M. J.
Anderson. 1992. Chloroplast DNA inversions and ty of mutation rate: Protein evolution in mam-
the origin of the grass family (Poaceae). Proc. mals is not neutral. Mol. Biol. Evol. 11:643-648.
Natl. Acad. Sci. USA 89:7722-7726. Echelle, A. A. and P. J. Connor. 1989. Rapid, geograph-
Dubin, D. T., C. C. HsuChen and L. E. Tillotson. 1986. ically extensive genetic introgression after sec-
Mosquito mitochondrial transfer RNAs for valine, ondary contact between two pupfish species
glycine and glutamate: RNA and gene sequences (Cyprinodon, Cyprinodontidae). Evolution
and vicinal genome organization. Curr. Genet. 43:717-727.
10:701-707. Echelle, A. A. and T. E. Dowling. 1992. Mitochondrial
Dueck, G. 1990. New optimization heuristics: The DNA evolution of the Death Valley pupfishes
Great Deluge algorithm and the record-to-record (Cyprinodon, Cyprinodontidae). Evolution
travel. Scientific Centre Technical Report, IBM 46:193-206.
Germany. Echelle, A. A., T. E. Dowling, C. Moritz and W. M.
Dueck, G. and T. Scheuer. 1990. Threshold accepting: Brown. 1989. Mitochondrial DNA diversity and
A general purpose optimisation algorithm the origin of the Menidia clarkhubbsi complex of
appearing superior to simulated annealing. J. unisexual fishes (Atherinidae). Evolution
Comp. Physics 90:161-175. 43:984-993.
Duellman, W. E. and D. M. Hillis. 1987. Marsupial Echelle, A. F., A. A. Echelle and D. R. Edds. 1989.
frogs (Anura: Hylidae: Gastrotheca) of the Conservation genetics of a spring-dwelling desert
Ecuadorian Andes: Resolution of taxonomic prob- fish, the Pecos gambusia (Gambusia nobilis,
lems and phylogenetic relationships. Poeciliidae). Conserv. Biol. 3:159-169.
Herpetologica 43:135-167. Eck, R. V. and M. O. Dayhoff (eds.). 1966. Atlas of
Duellman, W. E., L. R. Maxson and C. A. Jesiolowski. Protein Sequence and Structure 1966. Natl. Biomed.
1988. Evolution of marsupial frogs (Hylidae: Res. Found., Silver Spring, Maryland.
Hemiphractinae): Immunological evidence. Eckert, R. 1987. New vectors for rapid sequencing of
Copeia 1988:527-543. DNA fragments by chemical degradation. Gene
Dutrillaux, B. 1975. Discontinued treatment with 51:242-252.
BudR and staining with acridine orange: Edwards, A., H. A. Hammond, J. Li, C. K. Caskey and
Observation of R- or Q- or intermediary banding. R. Chakraborty. 1992. Genetic variation at five
Chromosoma 52:261-273. trimeric and tetrameric tandem repeat loci in four
Dyer, A. F. 1979. Investigating Chromosomes. John Wiley human population groups. Genomics 12:241-253.
and Sons, New York. Edwards, A. W. F. 1972. Likelihood. Cambridge
Dykhuizen, D. E., C. Mudd, A. Honeycutt and D. L. University Press, Cambridge.
Hartl. 1985. Polymorphic posttranslational modi- Efron, B. 1979. Bootstrapping methods: Another look
fication of alkaline phosphatase in Escherichia coli. at the jackknife. Ann. Stat. 21-26.
Evolution 39:1-7. Efron, B. 1982. The Jackknife, the Bootstrap, and Other
Resampling Plans. CBMS-NSF Regional
Eanes, W. F. 1987. Allozymes and fitness: Evolution of Conference Series in Applied Mathematics,
a problem. Trends Ecol. Evol. 2:44-48. Monograph 38. Soc. Indust. Appl. Math.,
Eanes, W. F., L. Katona and M. Longtine. 1990. Philadelphia.
Comparison of in vitro and in vivo activities asso- Efron, B. and G. Gong. 1983. A leisurely look at the
ciated with the G6PD allozyme polymorphism in bootstrap, the jackknife, and cross-validation.
Drosophila melanogaster. Genetics 125:845-853. Am. Statist. 37:36-48.
Easteal, S. 1985. The ecological genetics of introduced Efron, B. and R. J. Tibshirani. 1993. An Introduction to
populations of the giant toad Bufo marinus. II. the Bootstrap. Chapman and Hall, New York.
Effective population size. Genetics 110:107-122. Eickbush, T. 1994. Evolution of retroelements, pp.
Easteal, S. 1986. The ecological genetics of introduced 121-157. In S. S. Morse (ed.), The Evolutionary
populations of the giant toad, Bufo marinus. IV. Biology of Viruses. Raven Press, New York.
Gene flow estimated from admixture in Ellegren, H., M. Johansson, K. Sandberg and L.
Australian populations. Heredity 56:145-156. Andersson. 1992. Cloning of highly polymorphic
Easteal, S. 1990. The pattern of mammalian evolution microsatellites in the horse. Anim. Genet.
and the relative rate of molecular evolution. 23:133-142.
Genetics 124:165-173. Ellegren, H., M. Johansson, B. P. Chowdhary, S.
Easteal, S. and C. C. Collett. 1994. Consistent variation Marklund, D. Ruyter, L. Marklund, Bduner-
in amino-acid substitution rate, despite uniformi- Nielsen, I. Edfors-Lilja, I. Gustavsson, R. K. Juneja
and L. Andersson. 1993. Assignment of 20 Evans, M. R. and C. A. Read. 1992. 32P, 33P and 35S:
microsatellite markers to the porcine linkage map. Selecting a label for nucleic acid analysis. Nature
Genomics 16:431-439. 358:520-521.
Ellsworth, D. L., K. D. Rittenhouse and R. L. Evarts, S. and C. J. Williams. 1987. Multiple paternity
Honeycutt. 1993. Artifactual variation in random- in a wild population of mallards. Auk
ly amplified polymorphic DNA banding patterns. 104:597-602.
BioTechniques 14:214-217. Excoffier, L., P. E. Smouse and J. M. Quattro. 1992.
Elwood, H. J., G. J. Olsen and M. L. Sogin. 1985. The Analysis of molecular variance inferred from met-
small-subunit ribosomal RNA gene sequences ric distances among DNA haplotypes:
from the hypotrichous ciliates Oxytricha nova and Application to human mitochondrial DNA
Stylonychia pustulata. Mol. Biol. Evol. 2:399-410. restriction data. Genetics 131:479-491.
Endler, J. A. 1979. Gene flow and life history patterns.
Genetics 93:263-284. Fairbrothers, D. E. and M. A. Johnson. 1964.
Endler, J. A. 1989. Conceptual and other problems in Comparative serological studies within the fami-
speciation, pp. 625-661. In D. Otte and J. A. lies Cornaceae (dogwood) and Nyssaceae (sour
Endler (eds.), Speciation and its consequences. gum), pp. 305-318. In C. A. Leone (ed.), Taxonomic
Sinauer Associates, Sunderland, MA. Biochemistry and Serology. Ronald Press, New
Engel, W., J. Schmidtke, W. Vogel and V. Wolf. 1973. York.
Genetic polymorphism of lactate dehydrogenase Faith, D. P. 1985. Distance methods and the approxi-
isoenzymes in the carp (Cyprinus carpio) apparent- mation of most-parsimonious trees. Syst. Zool.
ly due to "null alleles." Genetica 8:281-289. 34:312-325.
Engelke, D., A. Krikos, M. E. Bruck and D. Ginsberg. Faith, D. P. 1990. Chance marsupial relationships.
1990. Purification of Thermus aquaticus DNA poly- Nature 345:393-394.
merase expressed in Escherichia coli. Analyt. Faith, D. P. 1991. Cladistic permutation tests for mono-
Biochem. 191:396-400. phyly and nonmonophyly. Syst. Zool. 40:366-375.
Epplen, J. T. 1988. On simple repeated CAC/TA Faith, D. P. and P. S. Cranston. 1991. Could a clado-
sequences in animal genomes: A critical reap- gram this short have arisen by chance alone? On
praisal. J. Hered. 79:409-417. permutation tests for cladistic structure.
Erlich, H. A. 1989. PCR Technology. Stockton Press, Cladistics 7:1-28.
New York. Fan, E., D. B. Levin, B. W. Glickman and D. M. Logan.
Erlich, H. A. and N. Arnheim. 1992. Genetic analysis 1993. Limitations in the use of SSCP analysis.
using the polymerase chain reaction. Annu. Rev. Mutation Res. 288:85-92.
Genet. 26:479-506. Fangan, B. M., B. Stedje, O. E. Stabbetorp, E. S. Jensen
Erlich, H. A., D. Gelfand and J. J. Sninsky. 1991. Recent and K. S. Jakobsen. 1994. A general approach for
advances in the polymerase chain reaction. PCR amplification and sequencing of chloroplast
Science 252:1643-1651. DNA from crude vascular plant and algal tissue.
Eschbach, S., J. Wolters and P. Sitte. 1991. Primary and BioTechniques 16:484-494.
secondary structure of the nuclear small subunit Fani, R., G. Damamiani, C. DiSerio, E. Gallori, A. Grifoni
ribosomal RNA of the cryptomonad Pyrenomonas and M. Bazzicalupo. 1993. Use of random ampli-
salina as inferred from the gene sequence: fied polymorphic DNA (RAPD) for generating
Evolutionary implications. J. Mol. Evol. specific DNA probes for microorganisms. Mol.
32:247-252. Ecol. 2:243-250.
Estabrook, G. F. 1983. The causes of character incom- Farris, J. S. 1969. A successive approximations
patibility, pp. 279-295. In J. Felsenstein (ed.), approach to character weighting. Syst. Zool.
Numerical Taxonomy. NATO ASI Series, Vol. G1, 18:374-385.
Springer-Verlag, Berlin. Farris, J. S. 1970. Methods for computing Wagner
Estabrook, G. F. 1992. Evaluating undirected position- trees. Syst. Zool. 34:21-34.
al congruence of individual taxa between two Farris, J. S. 1972. Estimating phylogenetic trees from
estimates of the phylogenetic tree for a group of distance matrices. Am. Nat. 106:645-668.
taxa. Syst. Biol. 41:172-177. Farris, J. S. 1977. Phylogenetic analysis under Dollo's
Estabrook, G. F. and L. Landrum. 1975. A simple test Law. Syst. Zool. 26:77-88.
for the possible simultaneous evolutionary diver- Farris, J. S. 1981. Distance data in phylogenetic analy-
gence of two amino acid positions. J. Math. Biol. sis, pp. 3-23. In V. A. Funk and D. R. Brooks
4:195-200. (eds.), Advances in Cladistics: Proceedings of the First
Meeting of the Willi Hennig Society. New York size from samples of sequences: Inefficiency of
Botanical Garden, Bronx. pairwise and segregating sites as compared to
Farris, J. S. 1983. The logical basis of phylogenetic sys- phylogenetic estimates. Genet. Res. Camb.
tematics, pp. 7-36. In N. I. Platnick and V. A. Funk 59:139-147.
(eds.), Advances in Cladistics. Columbia University Felsenstein, J. 1992b. Phylogenies from restriction
Press, New York. sites, a maximum likelihood approach. Evolution
Farris, J. S. 1985. Distance data revisited. Cladistics 46:159-173.
1:67-85. Felsenstein, J. 1993. PHYLIP (Phylogeny Inference
Farris, J. S. 1986a. Distances and cladistics. Cladistics Package), version 3.5c. Department of Genetics,
2:14.2-157. University of Washington, Seattle.
Farris, J. S. 1986b. On the boundaries of phylogenetic Felsenstein, J. and H. Kishino. 1993. Is there something
systematics. Cladistics 2:14-27. wrong with the bootstrap on phylogenies? A
Feinberg, A. P. and B. Vogelstein. 1983. A technique for reply to Hillis and Bull. Syst. Biol. 42:193-200.
radiolabelling DNA restriction endonuclease frag- Feng, D.-F. and R. F. Doolittle. 1987. Progressive
ments to high specific activity. Analyt. Biochem. sequence alignment as a prerequisite to correct
132:6-13. phylogenetic trees. J. Mol. Evol. 25:351-360.
Felsenstein, J. 1978a. Cases in which parsimony and Feng, D.-F. and R. F. Doolittle. 1990. Progressive align-
compatibility methods will be positively mislead- ment and phylogenetic tree construction of pro-
ing. Syst. Zool. 27:401-410. tein sequences. Meth. Enzymol. 183:375-387.
Felsenstein, J. 1978b. The number of evolutionary Fernholm, B., K. Bremer and H. Jornvall (eds.). 1989.
trees. Syst. Zool. 27:27-33. The Hierarchy of Life. Elsevier Science Publishers,
Felsenstein, J. 1981a. Evolutionary trees from DNA Amsterdam.
sequences: A maximum likelihood approach. J. Ferrari, J. A. and C. E. Taylor. 1981. Hierarchical pat-
Mol. Evol. 17:368-376. terns of chromosome variation in Drosophila sub-
Felsenstein, J. 1981b. Evolutionary trees from gene fre- obscura. Evolution 35:391-394.
quencies and quantitative characters: Finding Ferris, S. D. and G. S. Whitt. 1977a. Duplicate gene
maximum likelihood estimates. Evolution expression in diploid and tetraploid loaches
35:1229-1242. (Cypriniformes, Cobitidae). Biochem. Genet.
Felsenstein, J. 1981c. A likelihood approach to charac- 15:1097-1112.
ter weighting and what it tells us about parsimo- Ferris, S. D. and G. S. Whitt. 1977b. Loss of duplicate
ny and compatibility. Biol. J. Linnean Soc. gene expression after polyploidization. Nature
16:183-196. 265:258-260.
Felsenstein, J. 1982. Numerical methods for inferring Ferris, S. D. and G. S. Whitt. 1978a. Phylogeny of
evolutionary trees. Quart. Rev. Biol. 57:379-404. tetraploid catostomid fishes based on the loss of
Felsenstein, J. 1984. Distance methods for inferring duplicate gene expression. Syst. Zool. 27:189-203.
phylogenies: A justification. Evolution 38:16-24. Ferris, S. D. and G. S. Whitt. 1978b. Genetic and mole-
Felsenstein, J. 1985a. Confidence limits on phyloge- cular analysis of non-random dimer assembly of
nies: An approach using the bootstrap. Evolution the creatine kinase isozymes of fishes. Biochem.
39:783-791. Genet. 26:811-829.
Felsenstein, J. 1985b. Confidence limits on phylogenies Ferrucci, L., E. Romano and G. F. De Stefano. 1987.
with a molecular clock. Syst. Zool. 34:152-161. The AluI-induced bands in great apes and man:
Felsenstein, J. 1985c. Phylogenies and the comparative Implication for heterochromatin characterization
method. Am. Nat. 125:1-15. and satellite DNA distribution. Cytogenet. Cell
Felsenstein, J. 1986. Distance methods: A reply to Genet. 44:53-57.
Farris. Cladistics 2:130-143. Fetni, R., R. Drouin, N. Lemieux, B. Malfoy, B.
Felsenstein, J. 1987. Estimation of hominoid phyloge- Dutrillaux, P. Messier and C. L. Richer. 1992.
ny from a DNA hybridization data set. J. Mol. Detection of small, single-copy genes on protein-
Evol. 26:123-131. G-banded chromosomes by electron microscopy.
Felsenstein, J. 1988a. Phylogenies from molecular Cytogenet. Cell Genet. 60:187-389.
sequences: Inference and reliability. Annu. Rev. Field, K. G., G. J. Olsen, D. J. Lane, S. J. Giovannoni,
Genet. 22:521-565. M. T. Ghiselin, E. C. Raff, N. R. Pace and R. A.
Felsenstein, J. 1988b. Phylogenies and quantitative Raff. 1988. Molecular phylogeny of the animal
characters. Annu. Rev. Ecol. Syst. 19:445-471. kingdom. Science 239:748-753.
Felsenstein, J. 1992a. Estimating effective population
Fieldes, M. A., P. R. Gaudreault and H. Tyson. 1989. Fitch, W. M. 1984. Cladistic and other methods:
Heritable changes in electrophoretic properties of Problems, pitfalls, and potentials, pp. 221-252. In
flax peroxidases resulting from variation in T. Duncan and T. F. Stuessey (eds.), Cladistic
nutrient level. Genetica 78:81-90. Perspectives on the Reconstruction of Evolutionary
Figueroa, F., M. Kasahara, H. Tichy, E. Neufeld, U. History. Columbia University Press, New York.
Ritte and J. Klein. 1987. Polymorphism of unique Fitch, W. M. 1986. A hidden bias in the estimate of
noncoding DNA sequences in wild and laborato- total nucleotide substitutions from pairwise dif-
ry mice. Genetics 117:101-108. ferences, pp. 315-328. In S. Karlin and E. Nevo
Fildes, R. A. and H. Harris. 1966. Genetically deter- (eds.), Evolutionary Processes and Theory. Academic
mined variation of adenylate kinase in man. Press, Orlando, Florida.
Nature 209:261-263. Fitch, W. M. and W. R. Atchley. 1985. Evolution in
Fink, S. C. and R. W. Brosemer. 1973. Immunochemical
inbred strains of mice appears rapid. Science
studies with glycerol 3-phosphate dehydrogenase 228:1169-1175.
in bees and wasps. Arch. Biochem. Biophys. Fitch, W. M. and W. R. Atchley. 1987. Divergence in
158:30-35. inbred strains of mice: A comparison of three dif-
Fisher, S. E. and G. S. Whitt. 1978. Evolution of ferent types of data, pp. 203-216. In C. Patterson
isozyme loci and their differential tissue expres- (ed.), Molecules and Morphology in Evolution:
sion. Creatine kinase as a model system. J. Mol. Conflict or Compromise? Cambridge University
Evol. 12:25-55. Press, Cambridge, England.
Fisher, S. E. and G. S. Whitt. 1979. Evolution of the cre- Fitch, W. M. and E. Margoliash. 1967. Construction of
atine kinase isozyme system in the primitive ver- phylogenetic trees. Science 155:279-284.
tebrates. Occ. Pap. California Acad. Sci. Fitch, W. M. and E. Markowitz. 1970. An improved
134:142-159. method for determining codon variability in a
Fisher, S. E., J. B. Shaklee, S. D. Ferris and G. S. Whitt. gene and its application to the rate of fixation of
1980. Evolution of five multilocus isozyme sys- mutations in evolution. Biochem. Genet.
tems in the chordates. Genetica 52/53:73-85. 4:579-593.
Filch, W M. 1966. An Improved method of testlng for Fitch, W. M., J. M. E. Leiter, X. Li and I? Palese. 1991.
evolutionary homology. J. Mol. Biol. 1629-16. Positive Darwinian evolution in human influenza
Fitch, W. M. 1970. Distinguishing homologous from A viruses. Proc. Nat. Acad. Sci. USA 88:42704274.
analogous proteins. Syst. Zool. 19:99-113. FitzSimmons, N. N., C. Moritz and S. S, Moore. 1995.
Rtih, W. M. 1971a. Thc non-identity of invariant posi- Conservation and dynamics of microsatellite loci
tions in the cytochrome c of different species. over 300 million years of marine turtle evo1ut.ion.
Blochem. Genet. 5,231- 241. Mol. Biol. Evol. 12:432440.
Flich, W. M. 1971b. Toward defin~ngthe course of evo- Flavell, R. B. 1986. Repetitive DNA and chromosome
lution Minimal change for a specific tree topolo- evolution in plants. Phil. Trans. Roy. Soc. London
gy. Syst Z001.20:406416. B312:227-242.
Rtci?, W. M.1975. Toward f~ndingthe tree of maxi- Flavell, R. B., M. O'Dell, P. Sharp, E. Nevo and A.
mum parsimony, pp. 189-230. In G. F. Estabrook Beiles. 1986. Variation in the intergenic spacer of
(ed.), Proceedings of the Eighth International ribosomal DNA of wild wheat, Triticum
Conference on Numerical Taxonomy. W. H. Freeman, dicoccoides, in Israel. Mol. Biol. Evol. 3:547-558.
San Francisco. Fleischer, R. C. 1983. A comparison of theoretical and
Fitch, W. M. 1976a. The molecular evolution of electrophoretic assessments of genetic structure in
cytochrome c in eukaryotes. J. Mol. Evol. 8:13-40. populations of the house sparrow (Passer domesti-
Fitch, W. M. 1976b. Molecular evolutionary clocks, pp. cus). Evolution 37:1007-1009.
160-178. In F. J. Ayala (ed.), Molecular Evolution. Flint, J., A. V. S. Hill, D. K. Bowden, S. J. Oppenheimer,
Sinauer, Sunderland, Massachusetts. P. R. Sill, S. W. Serjeantson, J. Bana-Koiri, K.
Fitch, W. M. 1977. On the problem of discovering the Bhatia, M. P. Alpers, A. J. Boyce, D. J. Weatherall
most parsimonious tree. Am. Nat. 111:223-257. and J. B. Clegg. 1986. High frequencies of α-tha-
Fitch, W. M. 1979. Cautionary remarks on using gene lassaemia are the result of natural selection by
expression events in parsimony procedures. Syst. malaria. Nature 321:744-750.
Zool. 28:375-379. Foltz, D. W. 1986. Null alleles as a possible cause of
Fitch, W. M. 1981. A non-sequential method for con- heterozygote deficiencies in the oyster Crassostrea
structing trees and hierarchical classifications. J. virginica and other bivalves. Evolution 40:869-870.
Mol. Evol. 18:30-37. Foltz, D. W. and J. L. Hoogland. 1983. Genetic evi-
Literature Cited 581
dence of outbreeding in the black-tailed prairie gene sequences in animals: Initial assessment of
dog (Cynomys ludovicianus). Evolution 37:273-281. character sets from concordance and divergence
Fonatsch, C., C. Gradl, J. Ragoussis and A. Ziegler. studies. Syst. Biol. 43:511-525.
1987. Assignment of the TCP1 locus to the long Frischauf, A.-M. 1987. Construction and characteriza-
arm of human chromosome 6 by in situ tion of a genomic library in lambda. Meth.
hybridization. Cytogenet. Cell Genet. 45:109-112. Enzymol. 152:190-199.
Foran, D. R., P. J. Johnson and G. P. Moore. 1985. Fritsch, P. E and L. H. Rieseberg. 1992. High outcross-
Evolution of two actin genes in the sea urchin ing rates maintain male hermaphrodite individu-
Strongylocentrotus franciscanus. J. Mol. Evol. als in populations of the flowering plant Datisca
22:108-116. glomerata. Nature 359:633-636.
Fox, G. E., E. Stackebrandt, R. B. Hespell, J. Gibson, J. Frohman, M. A., M. K. Dush and G. R. Martin. 1988.
Maniloff, T. A. Dyer, R. S. Wolfe, W. E. Balch, R. S. Rapid production of full-length cDNAs from rare
Tanner, L. J. Magrum, L. B. Zablen, R. Blakemore, transcripts: Amplification using a gene-specific
R. Gupta, L. Bonen, B. J. Lewis, D. A. Stahl, K. R. oligo-nucleotide primer. Proc. Natl. Acad. Sci.
Luehrsen, K. N. Chen and C. R. Woese. 1980. The USA 85:8998-9002.
phylogeny of prokaryotes. Science 209:457-463. Frommer, M., C. Paul and P. C. Vincent. 1988.
Fox, G. M. and C. W. Schmid. 1980. Related single Localization of satellite DNA sequences on human
copy sequences in the human genome. Biochim. metaphase chromosomes using bromodeoxyuri-
Biophys. Acta 609:349-363. dine-labelled probes. Chromosoma 97:11-18.
Fox, G. M., J. Umeda, R. K.-Y. Lee and C. W. Schmid. Frost, D. R. and D. M. Hillis. 1990. Species in concept
1980. A phase diagram of the binding of mis- and practice: Herpetological applications.
matched duplex DNAs to hydroxyapatite. Herpetologica 46:87-104.
Biochim. Biophys. Acta 609:364-371. Frykman, I. and B. O. Bengtsson. 1984. Genetic differ-
Frair, W. 1964. Turtle family relationships as deter- entiation in Sorex. III. Electrophoretic analysis of a
mined by serological tests, pp. 535-544. In C. A. hybrid zone between two karyotypic races in
Leone (ed.), Taxonomic Biochemistry and Serology. Sorex araneus. Hereditas 70:259-270.
Ronald Press, New York. Fu, Y.-X. and W.-H. Li. 1993. Statistical tests of neutral-
Freifelder, D. 1982. Physical Biochemistry: Applications to ity of mutations. Genetics 133:693-709.
Biochemistry and Molecular Biology. 2nd ed. W. H. Fukami, K. and Y. Tateno. 1989. On the maximum like-
Freeman and Co., New York. lihood method for estimating molecular trees:
Frelin, C. and F. Vuilleumier. 1979. Biochemical meth- Uniqueness of the likelihood point. J. Mol. Evol.
ods and reasoning in systematics. Z. Zool. Syst. 28:460-464.
Evolut.-forsch. 17:1-10. Funk, V. A. 1985. Phylogenetic patterns and hybridiza-
Freshney, R. I. 1987. Culture of Animal Cells. Alan R. tion. Ann. Missouri Bot. Gard. 72:681-715.
Liss, New York. Furrer, B., U. Candrian, P. Wieland, J. Luthy. 1990.
Freshney, R. I. 1994. Culture of Animal Cells. 3rd ed. Improving PCR efficiency. Nature 346:324.
Wiley-Liss, New York. Futuyma, D. J. 1986. Evolutionary Biology. 2nd ed.
Frick, L. W. 1981. A biochemical, phylogenetic and Sinauer, Sunderland, Massachusetts.
immunological investigation of the cytosolic di-
and tripeptidases of fishes. Ph.D. dissertation, Galau, G. A., M. E. Chamberlin, B. R. Hough, R. J.
University of Hawaii. Britten and E. H. Davidson. 1976. Evolution of
Frick, L. W. 1983. An electrophoretic investigation of repetitive and nonrepetitive DNA in two species
the cytosolic di- and tripeptidases of fish: of Xenopus, pp. 200-224. In F. J. Ayala (ed.),
Molecular weights, substrate specificities and tis- Molecular Evolution. Sinauer, Sunderland,
sue and phylogenetic distributions. Biochem. Massachusetts.
Genet. 21:309-322. Gall, J. G. and M. L. Pardue. 1969. Formation and
Frieden, C. 1963. Glutamate dehydrogenase. V. The detection of RNA-DNA hybrid molecules in cyto-
relation of enzyme structure to the catalytic func- logical preparations. Proc. Natl. Acad. Sci. USA
tion. J. Biol. Chem. 238:3286-3299. 63:378-383.
Friedlander, T. P., J. C. Regier and C. Mitter. 1992. Gantt, J. S., S. L. Baldauf, P. J. Calie, N. F. Weeden and
Nuclear gene sequences for higher level phyloge- J. D. Palmer. 1991. Transfer of rpl22 to the nucleus
netic analysis: 14 promising candidates. Syst. Biol. greatly preceded its loss from the chloroplast and
41:483-490. involved the gain of an intron. EMBO J.
Friedlander, T. P., J. C. Regier and C. Mitter. 1994. 10:3073-3078.
Phylogenetic information content of five nuclear
Gargas, A., P. T. DePriest, M. Grube, A. Tehler. 1995. Georges, M., A.-S. Lequarre, M. Castelli, R. Hanset
Multiple origins of lichen symbioses in fungi sug- and G. Vassart. 1988. DNA fingerprinting in
gested by SSU rDNA phylogeny. Science domestic animals using four different minisatel-
268:1492-1494. lite probes. Cytogenet. Cell Genet. 47:127-131.
Gargouri, A. 1989. A rapid and simple method for Gerbi, S. A. 1985. Evolution of ribosomal DNA, pp.
extracting yeast mitochondrial DNA. Curr. Genet. 419-517. In R. J. MacIntyre (ed.), Molecular
15:235-237. Evolutionary Genetics. Plenum, New York.
Garland, T., Jr., R. B. Huey and A. F. Bennett. 1991. Ghiselin, M. T. 1988. The origin of molluscs in light of
Phylogeny and coadaptation of thermal physiolo- molecular evidence. Oxford Surv. Evol. Biol.
gy in lizards: A reanalysis. Evolution 45:1969-1974. 5:66-95.
Garland, T., Jr., P. H. Harvey and A. R. Ives. 1992. Gibbs, H. L., P. J. Weatherhead, P. T. Boag, B. N. White,
Procedures for the analysis of comparative data L. M. Tabak and D. J. Hoysak. 1990. Realized
using phylogenetically independent contrasts. reproductive success of polygynous red-winged
Syst. Biol. 41:18-32. blackbirds revealed by DNA markers. Science
Garland, T., Jr., A. W. Dickerman, C. M. Janis and J. A. 250:1394-1397.
Jones. 1993. Phylogenetic analysis of covariance Gilbert, D. A., N. Lehrman, S. J. O'Brien and R. K.
by computer simulation. Syst. Biol. 42:265-292. Wayne. 1990. Genetic fingerprinting reflects pop-
Garza, J. C. and D. S. Woodruff. 1992. A phylogenetic ulation differentiation in the California channel
study of the gibbons (Hylobates) using DNA island fox. Nature 344:764-767.
obtained nondestructively from hair. Mol. Gilbert, D. A., C. Packer, A. B. Pusey, J. C. Stephens
Phylogenet. Evol. 1:202-210. and S. J. O'Brien. 1991. Analytical DNA finger-
Gastony, G. J. 1986. Electrophoretic evidence for the printing in lions: Parentage, genetic diversity and
origin of fern species by unreduced spores. Am. J. kinship. J. Hered. 82:378-386.
Bot. 73:1563-1569. Gilbert, S. F. 1991. Developmental Biology. 3rd ed.
Gastony, G. J. 1991. Gene silencing in a polyploid Sinauer, Sunderland, Massachusetts.
homosporous fern: Paleopolyploidy revisited. Gillespie, J. H. 1984. The molecular clock may be an
Proc. Natl. Acad. Sci. USA 88:1602-1605. episodic clock. Proc. Natl. Acad. Sci. USA
Gatesy, J., R. DeSalle and W. Wheeler. 1993. 81:8009-8013.
Alignment-ambiguous nucleotide sites and the Gillespie, J. H. 1986a. Natural selection and the molec-
exclusion of systematic data. Mol. Phylogenet. ular clock. Mol. Biol. Evol. 3:138-155.
Evol. 2:152-157. Gillespie, J. H. 1986b. Variability of evolutionary rates
Gaut, B. S. and M. T. Clegg. 1993a. Molecular evolu- of DNA. Genetics 113:1077-1091.
tion of the Adh1 locus in the genus Zea. Proc. Natl. Gillespie, J. H. 1986c. Rates of molecular evolution.
Acad. Sci. USA 90:5095-5099. Annu. Rev. Ecol. Syst. 17:637-665.
Gaut, B. S. and M. T. Clegg. 1993b. Nucleotide poly- Gillespie, J. H. 1987. Molecular evolution and the neu-
morphism in the Adh1 locus of pearl millet tral allele theory. Oxford Surv. Evol. Biol. 4:10-37.
(Pennisetum glaucum) (Poaceae). Genetics Gillespie, J. H. 1991. The Causes of Molecular Evolution.
135:1091-1097. Oxford University Press, Oxford.
Gaut, B. S. and P. O. Lewis. 1995. Success of maximum Gillespie, J. H. and K. Kojima. 1968. The degree of
likelihood in the four-taxon case. Mol. Biol. Evol. polymorphism in enzymes involved in energy
12:152-162. production compared to that in nonspecific
Gaut, B. S., S. V. Muse, W. D. Clark and M. T. Clegg. enzymes in two Drosophila ananassae populations.
1992. Relative rates of nucleotide substitution at Genetics 61:582-585.
the rbcL locus of monocotyledonous plants. J. Gillespie, R. G., H. B. Croom and S. R. Palumbi. 1994.
Mol. Evol. 35:292-303. Multiple origins of a spider radiation in Hawaii.
Gauthier, J., A. G. Kluge and T. Rowe. 1988. Amniote Proc. Natl. Acad. Sci. USA 91:2290-2294.
phylogeny and the importance of fossils. Gittleman, J. L. and H.-K. Luh. 1992. On comparing
Cladistics 4:105-205. comparative methods. Annu. Rev. Ecol. Syst.
Gelfand, D. H. and T. J. White. 1990. Thermostable 23:383-404.
DNA polymerases, pp. 129-141. In M. A. Innis, D. Givnish, T. J. and K. J. Sytsma. 1995. Homoplasy in
H. Gelfand, J. J. Sninsky and T. J. White (eds.), molecular vs. morphological data: The likelihood
PCR Protocols. Academic Press, New York. of correct phylogenetic inference. Evolution (in
Gellisen, G., J. Y. Bradfield, B. N. White and G. R. press).
Wyatt. 1983. Mitochondrial DNA sequences in the Glass, G. V. 1976. Primary, secondary and meta-analy-
nuclear genome of a locust. Nature 301:631-634. sis of research. Educ. Res.
5:3-8.
Glover, F. 1989. Tabu search-part I. ORSA J. Comp. Gonzalez, I. L., J. E. Sylvester, T. F. Smith, D.
1:190-206. Stambolian and R. D. Schmickel. 1990. Ribosomal
Goelz, S. E., S. R. Hamilton and B. Vogelstein. 1985. RNA gene sequences and hominoid phylogeny.
Purification of DNA from formaldehyde fixed Mol. Biol. Evol. 7:203-219.
and paraffin embedded human tissue. Biochem. Good, D. A. 1989. Hybridization and cryptic species in
Biophys. Res. Comm. 130:118-126. Dicamptodon (Caudata: Dicamptodontidae).
Gojobori, T., W.-H. Li and D. Graur. 1982. Patterns of Evolution 43:728-744.
nucleotide substitution in pseudogenes and func- Good, D. A., G. Z. Wurst and D. B. Wake. 1987.
tional genes. J. Mol. Evol. 18:360-369. Patterns of geographic variation in allozymes of
Gold, J. R. and L. R. Richardson. 1990. Restriction site the Olympic salamander, Rhyacotriton olympicus
heteroplasmy in the mitochondrial DNA of the (Caudata: Dicamptodontidae). Fieldiana Zool. N.
marine fish Sciaenops ocellatus (L.). Anim. Genet. S. 1374:1-15.
21:313-316. Good, P. 1994. Permutation Tests: A Practical Guide to
Golding, G. B. 1983. Estimates of DNA and protein Resampling for Testing Hypotheses. Springer-Verlag,
sequence divergence: An examination of some New York.
assumptions. Mol. Biol. Evol. 1:125-142. Goodfellow, P. N. 1993. Microsatellites and the new
Golding, G. B. and C. Strobeck. 1983. Increased num- genetic maps. Curr. Biol. 3:149-151.
ber of alleles found in hybrid populations due to Goodman, M. 1961. The role of immunochemical dif-
intragenic recombination. Evolution 37:17-29. ferences in the phyletic development of human
Goldman, N. 1990. Maximum likelihood of phyloge- behavior. Human Biol. 33:131-162.
netic trees, with special reference to Poisson Goodman, M. 1963. Serological analysis of the system-
process models of DNA substitution and to parsi- atics of recent hominoids. Human Biol.
mony analysis. Syst. Zool. 39:345-361. 35:377-424.
Goldman, N. 1993a. Statistical tests of models of DNA Goodman, M. 1981. Decoding the pattern of protein
substitution. J. Mol. Evol. 36:182-198. evolution. Progr. Biophys. Mol. Biol. 37:105-164.
Goldman, N. 1993b. Simple diagnostic tests of models Goodman, M. 1985. Rates of molecular evolution: The
of DNA substitution. J. Mol. Evol. 37:650-661. hominoid slowdown. BioEssays 3:9-14.
Goldman, N. and Z. Yang. 1994. A codon-based model Goodman, M. and G. W. Moore. 1971.
of nucleotide substitution for protein-coding Immunodiffusion systematics of the primates. I.
DNA sequences. Mol. Biol. Evol. 11:725-736. The Catarrhini. Syst. Zool. 20:19-62.
Goldstein, D. B. and D. D. Pollock. 1994. Least squares Goodman, M., J. Barnabas, G. Matsuda and G. W.
estimation of molecular distance-noise abate- Moore. 1971. Molecular evolution in the descent
ment in phylogenetic reconstruction. Theor. Pop. of man. Nature 233:604-613.
Biol. 45:219-226. Goodman, M., J. Czelusniak, G. W. Moore, A. E.
Goldstein, D. B., A. R. Linares, M. W. Feldman and L. Romero-Herrera and G. Matsuda. 1979. Fitting
L. Cavalli-Sforza. 1995. An evaluation of genetic the gene lineage into the species lineage, a parsi-
distance for use with microsatellite data. Genetics mony strategy illustrated by cladograms con-
139:463-471. structed from globin sequences. Syst. Zool.
Golenberg, E. M., D. G. Giannasi, M. T. Clegg, C. J. 28:132-163.
Smiley, M. Durbin, D. Henderson and G. Goodman, M., M. M. Miyamoto and J. Czelusniak.
Zurawski. 1990. Chloroplast DNA sequence from 1987. Pattern and process in vertebrate phylogeny
a Miocene Magnolia species. Nature 344:656-658. revealed by coevolution of molecules and mor-
Gollmann, G., P. Roth and W. Hodl. 1988. phologies, pp. 141-176. In C. Patterson (ed.),
Hybridization between fire-bellied toads Bombina Molecules and Morphology in Evolution: Conflict or
bombina and Bombina variegata in the Karst regions Compromise? Cambridge University Press,
of Slovakia and Hungary: Morphological and Cambridge.
allozyme evidence. J. Evol. Biol. 1:3-14. Gorman, G. C. 1971. Evolutionary genetics of island
Goloboff, P. A. 1993. Estimating character weights dur- lizard populations. Yearbook Am. Philo. Soc.
ing tree search. Cladistics 9:83-91. 1971:318-319.
Gonzalez, I. L., J. L. Gorski, T. J. Campden, D. J. Gorman, G. C. and J. Renzi, Jr. 1979. Genetic distance
Dorney, J. M. Erickson, J. E. Sylvester and R. D. and heterozygosity estimates in electrophoretic
Schmickel. 1985. Variation among human 28S studies: Effects of sample size. Copeia
ribosomal RNA genes. Proc. Natl. Acad. Sci. USA 1979:242-249.
82:7666-7670.
Gorman, G. C. and D. Shochat. 1972. Multiple lactate the primates. Proc. Roy. Soc. London B
dehydrogenase alleles in the lizard Agama stellio. 243:241-253.
Experientia 28:351-353. Graybeal, A. 1993. The phylogenetic utility of
Gorman, G. C., A. C. Wilson and M. Nakanishi. 1971. cytochrome b: Lessons from bufonid frogs. Mol.
A biochemical approach towards the study of rep- Phylogenet. Evol. 2:256-269.
tilian phylogeny: Evolution of serum albumin Graybeal, A. 1994. Evaluating the phylogenetic utility
and lactic dehydrogenase. Syst. Zool. 20:167-185. of genes: A search for genes informative about
Gorman, G. C., D. G. Buth and J. S. Wyles. 1980. Anolis deep divergences among vertebrates. Syst. Biol.
lizards of the eastern Caribbean: A case study in 43:174-193.
evolution. III. A cladistic analysis of albumin Green, D. M. 1983. Evidence for chromosome number
immunological data, and the definition of species reduction and chromosome homosequentiality in
groups. Syst. Zool. 29:143-158. the 24-chromosome Korean frog, Rana dybowskii
Gorzula, S., C. L. Arocha-Pinango and C. Salazar. 1976. and related species. Chromosoma 88:222-226.
A method of obtaining blood by caudal vein from Green, D. M. and S. K. Sessions. 1991. Amphibian
large reptiles. Copeia 1976:838-839. Cytogenetics and Evolution. Academic Press, San
Gottlieb, L. D. 1973. Genetic differentiation, sympatric Diego.
speciation, and the origin of a diploid species of Green, D. M., J. P. Bogart and E. H. Anthony. 1980. An
Stephanomeria. Am. J. Bot. 60:545-553. interactive, microcomputer-based karyotype
Gottlieb, L. D. 1982a. Conservation and duplication of analysis system for phylogenetic cytotaxonomy.
isozymes in plants. Science 216:373-380. Comput. Biol. Med. 10:219-227.
Gottlieb, L. D. 1982b. Isozyme number and phylogeny, Greenbaum, I. F. 1981. Genetic interactions between
pp. 209-221. In U. Jensen and D. E. Fairbrothers hybridizing cytotypes of the tent-making bat
(eds.), Proteins and Nucleic Acids in Plant (Uroderma bilobatum). Evolution 35:305-320.
Systematics. Springer-Verlag, Berlin. Greenberg, B. D., J. E. Newbold and A. Sugino. 1983.
Gottlieb, L. D. and N. F. Weeden. 1979. Gene duplica- Intraspecific nucleotide sequence variability sur-
tion and phylogeny in Clarkia. Evolution rounding the origin of replication in human mito-
33:1024-1039. chondrial DNA. Gene 21:33-49.
Gough, J. A. and N. E. Murray. 1983. Sequence diversi- Grompe, M. 1993. The rapid detection of unknown
ty among related genes for recognition of specific mutations in nucleic acids. Nature Genetics
targets in DNA molecules. J. Mol. Biol. 166:1-19. 5:111-117.
Gouy, M. and W.-H. Li. 1989a. Phylogenetic analysis Groot, G. S. P. and A. M. Kroon. 1979. Mitochondrial
based on rRNA sequences supports the archae- DNA from various organisms does not contain
bacterial rather than the eocyte tree. Nature internally methylated cytosine in -CCGG-
339:145-147. sequences. Biochim. Biophys. Acta 564:355-357.
Gouy, M. and W.-H. Li. 1989b. Molecular phylogeny Guadet, J., J. Julien, J. Lafay and Y. Brygoo. 1989.
of the kingdoms Animalia, Plantae and Fungi. Phylogeny of some Fusarium species, as deter-
Mol. Biol. Evol. 6:109-122. mined by large-subunit rRNA sequence compari-
Grafen, A. 1989. The phylogenetic regression. Phil. son. Mol. Biol. Evol. 6:227-242.
Trans. Roy. Soc. London 326:119-157. Gruenbaum, H., T. Naveh-Many, H. Cedar and A.
Grafen, A. 1992. The uniqueness of the phylogenetic Razin. 1981. Sequence specificity of methylation
regression. J. Theor. Biol. 156:405-423. in higher plant DNA. Nature 292:860-862.
Graham, D. E. 1978. The isolation of high molecular Grula, J. W., T. J. Hall, T. D. Giugni, G. J. Graham, E. H.
weight DNA from whole organisms or large tis- Davidson and R. J. Britten. 1982. Sea urchin DNA
sue masses. Analyt. Biochem. 85:609-613. sequence variation and reduced interspecies dif-
Graham, J. H. and J. D. Felley. 1985. Genomic coadap- ferences of the less variable DNA sequences.
tation and developmental stability within intro- Evolution 36:665-676.
gressed populations of Enneacanthus gloriosus and Gu, X., Y.-X. Fu and W.-H. Li. 1995. Maximum likeli-
E. obesus (Pisces, Centrarchidae). Evolution hood estimation of the heterogeneity of substitu-
39:104-114. tion rate among nucleotide sites. Mol. Biol. Evol.
Gray, G. S. and W. M. Fitch. 1983. Evolution of antibi- 12:546-557.
otic resistance genes: The DNA sequence of a Guillemette, J. G. and P. N. Lewis. 1983. Detection of
kanamycin resistance gene from Staphylococcus subnanogram quantities of DNA and RNA on
aureus. Mol. Biol. Evol. 1:57-66. native and denaturing polyacrylamide and
Gray, I. C. and A. J. Jeffreys. 1991. Evolutionary tran- agarose gels by silver staining. Electrophoresis
sience of hypervariable minisatellites in man and 4:92-94.
Guo, S.-W. and E. A. Thompson. 1992. Performing the populations of the ninespine stickleback,
exact test of Hardy-Weinberg proportion for mul- Pungitius pungitius, pp. 438-452. In R. L. Mayden
tiple alleles. Biometrics 48:361-372. (ed.), Systematics, Historical Ecology, and North
Gupta, R., J. M. Lanter and C. R. Woese. 1983. American Freshwater Fishes. Stanford University
Sequence of the 16S ribosomal RNA from Press, Stanford.
Halobacterium volcanii, an archaebacterium. Haig, S. M., J. R. Belthoff and D. H. Allen. 1993.
Science 221:656-659. Examination of population structure in red-cock-
Guries, R. P. and F. T. Ledig. 1982. Genetic diversity aded woodpeckers using DNA profiles. Evolution
and population structure in pitch pine (Pinus rigi- 47:185-194.
da Mill). Evolution 36:387-402. Halanych, K. M., J. D. Bacheller, A. M. A. Aguinaldo,
Gutierrez, R. J., R. M. Zink and S. Y. Yang. 1983. S. M. Liva, D. M. Hillis and J. A. Lake. 1995.
Genetic variation, systematic and biogeographic Evidence from 18S ribosomal DNA that the
relationships of some Galliform birds. Auk lophophorates are protostome animals. Science
100:33-40. 267:1641-1643.
Gyllensten, U. B., D. Wharton, A. Josefsson and A. C. Halkka, L., Soderlund, U. Skaren and J. Heikkila.
Wilson. 1991. Paternal inheritance of mitochondri- 1987. Chromosomal polymorphism and racial
al DNA in mice. Nature 352:255-257. evolution of Sorex araneus L. in Finland. Hereditas
Gyllensten, U. and H. Erlich. 1988. Generation of sin- 106:257-275.
gle-stranded DNA by the polymerase chain reac- Hall, B. K. (ed.). 1994. Homology: The Hierarchical Basis
tion and its applications to direct sequencing of of Comparative Biology. Academic Press, New York.
the HLA-DQa locus. Proc. Natl. Acad. Sci. USA Hall, P. and M. A. Martin. 1988. On bootstrap resam-
85:7652-7656. pling and iteration. Biometrika 75:661-671.
Hall, T. C., Y. Ma, B. V. Buchbinder, J. W. Pyne, S. M.
Haberfeld, A., A. Cahaner, O. Yoffe, Y. Plotsky and J. Sun and F. A. Bliss. 1978. Messenger RNA for G1
Hillel. 1991. DNA fingerprints of farm animals protein of French bean seed: Cell-free translation
generated by microsatellite and minisatellite and product characterization. Proc. Natl. Acad.
DNA probes. Anim. Genet. 22:299-305. Sci. USA 75:3196-3200.
Hack, M. S. and H. T. Lawce. 1980. The Association of Hall, T. J., J. W. Grula, E. H. Davidson and R. J. Britten.
Cytogenetic Technologists Cytogenetics Laboratory 1980. Evolution of sea urchin non-repetitive DNA.
Manual. University of California Press, San J. Mol. Evol. 16:95-110.
Francisco. Hallick, R. B., L. Hong, R. G. Drager, M. R. Favreau,
Hadjiolov, A. A., O. I. Georgiev, V. V. Nosikov and L. P. A. Monfort, B. Orsat, A. Spielman and E. Stutz.
Yavachev. 1984. Primary and secondary structure 1993. Complete sequence of Euglena gracilis
of rat 28S ribosomal RNA. Nucl. Acids Res. chloroplast DNA. Nucl. Acids Res. 21:3537-3544.
12:3677-3693. Haltiner, M., T. Kempe and R. Tjian. 1985. A novel
Hadrys, H., M. Balick and B. Schierwater. 1992. strategy for constructing clustered point muta-
Application of random amplified polymorphic tions. Nucl. Acids Res. 13:1015-1026.
DNA (RAPD) in molecular ecology. Mol. Ecol. Hamby, R. K. and E. A. Zimmer. 1988. Ribosomal RNA
1:55-63. sequences for inferring phylogeny within the
Hadrys, H., B. Schierwater, S. L. Dellaporta, R. DeSalle 160:29-37.
and L. W. Buss. 1993. Determination of paternity 160:29-37.
in dragonflies by random amplified polymorphic Hamby, R. K. and E. A. Zimmer. 1992. Ribosomal RNA
DNA fingerprinting. Mol. Ecol. 2:79-87. as a phylogenetic tool in plant systematics, pp.
Haeckel, E. 1866. Generelle Morphologie der Organismen: 50-91. In P. S. Soltis, D. E. Soltis and J. J. Doyle
Allgemeine Grundzuge der organischen Formen- (eds.), Molecular Systematics of Plants. Chapman
Wissenschaft, mechanisch begrundet durch die von and Hall, New York.
Charles Darwin reformirte Descendenz-Theorie. Hamby, R. K., L. Sims, L. Issel and E. Zimmer. 1988.
Georg Riemer, Berlin. Direct ribosomal RNA sequencing: Optimization
Hafner, M. S. and S. A. Nadler. 1988. Phylogenetic of extraction and sequencing methods for work
trees support the coevolution of parasites and with higher plants. Plant Mol. Biol. Rep.
their hosts. Nature 332:258-259. 6:175-192.
Haglund, T. R., D. G. Buth and R. Lawson. 1993. Hames, B. D. and D. Rickwood. (eds.) 1981. Gel
Allozyme variation and phylogenetic relation- Electrophoresis of Proteins: A Practical Approach. IRL
ships of Asian, North American, and European Press, Oxford.
Hames, B. D. and S. J. Higgins. 1985. Nucleic Acid genetic marker in population and evolutionary
Hybridization: A Practical Approach. IRL Press, biology. Trends Ecol. Evol. 4:6-11.
Oxford. Harrison, R. G. 1990. Hybrid zones: Windows on evo-
Hamkalo, B. A. and N. J. Hutchison. 1984. In situ lutionary process, pp. 69-128. In D. J. Futuyma
hybridization at the electron microscope level, pp. and J. Antonovics (eds.), Oxford Surveys in
97-115. In R. S. Sparkes and F. F. de la Cruz (eds.), Evolutionary Biology. Vol. 7. Oxford University
Research Perspectives in Cytogenetics. University Press, London.
Park Press, Baltimore. Harrison, R. G., D. M. Rand and W. C. Wheeler. 1987.
Hamlyn, P. H., G. G. Brownlee, C.-C. Cheng, M. J. Gait Mitochondrial DNA variation in field crickets
and C. Milstein. 1978. Complete sequence of con- across a narrow hybrid zone. Mol. Biol. Evol.
stant and 3' noncoding regions of an 4:144-158.
immunoglobulin mRNA using the dideoxynu- Harry, J. L. and D. A. Briscoe. 1988. Multiple paternity
cleotide method of RNA sequencing. Cell in the loggerhead turtle (Caretta caretta). J. Hered.
15:1067-1075. 79:91-99.
Hamrick, J. L. and M. J. W. Godt. 1989. Allozyme Hartl, D. L. and A. G. Clark. 1989. Principles of
diversity in plant species, pp. 43-63. In A. D. H. Population Genetics. 2nd ed. Sinauer, Sunderland,
Brown, M. T. Clegg, A. L. Kahler and B. S. Weir Massachusetts.
(eds.), Plant Population Genetics, Breeding and Hartman, B. K. and S. Udenfried. 1969. A method for
Genetic Resources. Sinauer, Sunderland, immediate visualization of proteins in acrylamide
Massachusetts. gels and its use for preparation of antibodies to
Hancock, J. M. and G. A. Dover. 1990. "Compensatory enzymes. Analyt. Biochem. 30:391-394.
slippage" in the evolution of ribosomal genes. Harvey, P. H. and M. D. Pagel. 1991. The Comparative
Nucl. Acids Res. 18:5949-5954. Method in Evolutionary Biology. Oxford University
Hanotte, O., E. Cairns, T. Robson, M. C. Double and T. Press, Oxford.
Burke. 1992. Cross-species hybridization of a sin- Harvey, P. H., E. C. Holmes, A. O. Mooers and S. Nee.
gle locus minisatellite probe in passerine birds. 1994. Inferring evolutionary processes from mole-
Mol. Ecol. 1:127-130. cular phylogenies, pp. 313-333. In R. W. Scotland,
Hanotte, O., C. Zanon, A. Pugh, C. Greig, A. Dixon D. J. Siebert and D. J. Williams (eds.), Models in
and T. Burke. 1994. Isolation and characterization Phylogeny Reconstruction. Systematics Association
of microsatellite loci in a passerine bird: The reed Special Volume 52, Oxford.
bunting. Mol. Ecol. 3:529-531. Hasegawa, M. and M. Fujiwara. 1993. Relative effi-
Harding, J. D. and R. A. Keller. 1992. Single-molecule ciencies of the maximum likelihood, maximum
detection as an approach to rapid DNA sequenc- parsimony, and neighbor-joining methods for
ing. Trends Biotech. 10:55. estimating protein phylogeny. Mol. Phylogen.
Harper, M. E., A. Ullrich and G. F. Saunders. 1981. Evol. 2:1-5.
Localization of the human insulin gene to the dis- Hasegawa, M. and T. Hashimoto. 1993. Ribosomal
tal end of the short arm of chromosome 11. Proc. RNA trees misleading? Nature 361:23.
Natl. Acad. Sci. USA 78:4458-4460. Hasegawa, M. and H. Kishino. 1989. Heterogeneity of
Harper, M. E. and G. F. Saunders. 1984. Localization of tempo and mode of mitochondrial DNA evolu-
single-copy genes on human chromosomes by in tion among mammalian orders. Japan J. Genet.
situ hybridization of 3H-probes and autoradiogra- 61:243-258.
phy, pp. 117-133. In R. S. Sparkes and F. F. de la Hasegawa, M., Y. Iida, T. Yano, F. Takaiwa and M.
Cruz (eds.), Research Perspectives in Cytogenetics. Iwabuchi. 1985a. Phylogenetic relationships
University Park Press, Baltimore. among eukaryotic kingdoms inferred from ribo-
Harris, H. 1966. Enzyme polymorphism in man. Proc. somal RNA sequences. J. Mol. Evol. 22:32-38.
Roy. Soc. London B 164:298-310. Hasegawa, M., H. Kishino and T. Yano. 1985b. Dating
Harris, H. and D. A. Hopkinson. 1976 et seq. Handbook of the human-ape splitting by a molecular clock
of Enzyme Electrophoresis in Human Genetics. of mitochondrial DNA. J. Mol. Evol. 21:160-174.
North-Holland, Amsterdam. Hasegawa, M., H. Kishino and N. Saitou. 1991. On the
Harris, S. A. and R. Ingram. 1991. Chloroplast DNA maximum likelihood method in molecular phylo-
and biosystematics: The effects of intraspecific genetics. J. Mol. Evol. 32:443-445.
diversity and plastid transmission. Taxon Haslewood, G. A. D. 1967. Bile Salts. Methuen,
40:393-4 London.
Harrison, R. G. 1989. Animal mitochondrial DNA as a Hassouna, N., B. Michot and J.-P. Bachellerie. 1984.
The complete nucleotide sequence of mouse 28S Hedges, S. B. 1989. Evolution and biogeography of
rRNA gene. Implications for the process of size West Indian frogs of the genus Eleutherodactylus:
increase of the large subunit rRNA in higher Slow-evolving loci and the major groups, pp.
eukaryotes. Nucl. Acids Res. 12:3563-3583. 305-370. In C. A. Woods (ed.), Biogeography of the
Haucke, H.-R. and G. Gellissen. 1988. Different mito- West Indies: Past, Present, and Future. Sandhill
chondrial gene orders among insects: Exchanged Crane Press, Gainesville, Florida.
tRNA gene positions in the COII/COIII region Hedges, S. B. 1992. The number of replications needed
between an orthopteran and a dipteran species. for accurate estimation of the bootstrap P value in
Curr. Genet. 14:471-476. phylogenetic studies. Mol. Biol. Evol. 9:366-369.
Haufler, C. H. 1987. Electrophoresis is modifying our Hedges, S. B. and M. H. Schweitzer. 1995. Detecting
concepts of evolution in homosporous pterido- dinosaur DNA. Science 268:1191-1192.
phytes. Am. J. Bot. 74:953-966. Hedges, S. B., K. D. Moberg and L. R. Maxson. 1990.
Haugland, R. P. 1992-1994. Handbook of Fluorescent Tetrapod phylogeny inferred from 18S and 28S
Probes and Research Chemicals. 5th ed. Molecular ribosomal RNA sequences and a review of the
Probes, Eugene, Oregon. evidence for amniote relationships. Mol. Biol.
Hauswirth, W. W. and P. J. Laipis. 1985. Transmission Evol. 7:607-633.
genetics of mammalian mitochondria: A molecu- Hedges, S. B., R. L. Bezy and L. R. Maxson. 1991.
lar model and experimental evidence, pp. 49-59. Phylogenetic relationships and biogeography of
In E. Quagliariello, E. C. Slater, F. Palmieri, C. xantusiid lizards, inferred from mitochondrial
Saccone and A. M. Kroon (eds.), Achievements and DNA sequences. Mol. Biol. Evol. 8:767-780.
Perspectives of Mitochondrial Research. Elsevier, Hedges, S. B., J. Bogart and L. R. Maxson. 1992a.
Amsterdam. Ancestry of unisexual salamanders. Nature
Hauswirth, W. W., L. O. Lim, B. Dujon and G. Turner. 356:708-710.
1987. Methods for studying the genetics of mito- Hedges, S. B., S. Kumar, K. Tamura and M. Stoneking.
chondria, pp. 171-282. In V. M. Darley-Usmar, D. 1992b. Human origins and analysis of mitochon-
Rickwood and M. T. Wilson (eds.), Mitochondria. drial DNA sequences. Science 255:737-739.
A Practical Approach. IRL Press, Oxford. Hedges, L. V. and I. Olkin. 1985. Statistical Methods for
Hay, R. J. 1979. Identification, separation and culture Meta-analysis. Academic Press, Orlando, Florida.
of mammalian tissue cells, pp. 143-318. In E. Reid Hedrick, P. W. 1983. Genetics of Populations. Science
(ed.), Cell Populations, Methodology Surveys (B): Books International, Boston.
Biochemistry. Vol. 8. Wiley and Sons, New York. Hem, J. 1989a. A new method that slinultaneously
Hay, R. J. and G. E Gee. 1984. Procedures for collecting aligns and reconstructs ancestral sequences for
cell. lines under field conditions, pp. 25-26. In H. any number of homologous sequences, when
C. Dessauer and M. S. Hafner (eds.), Collections of phylogeny is given. Mol. Biol. Evol. 6:649-668
Frozen Tissues: Value, Management, Field and Hein, J. 1989b. A tree reconstruction method that is
Laboratory Procedures, and Directory of Existing economical in the number of pairwise compar-
Collections. Assoc. Syst. Collections, University of isons used. Mol. Biol. Evol. 6:669-684.
Kansas Press, Lawrence. Hein, J. 1990a. Reconstructing evolution of sequences
Hayasaka, K., T. Gojobori and S. Horai. 1988. subject to recombination using parsimony. Math.
Molecular phylogeny and evolution of primate Biosci. 98:185-200.
mitochondrial DNA. Mol. Biol. Evol. 5:626-644. Hein, J. 1990b. Unified approach to alignment and
Hayashi, K. 1991a. PCR-SSCP: A simple and sensitive phylogenies. Meth. Enzymol. 183:626-644.
method for detection of mutations in the genomic Hein, J. 1993. A heuristic method to reconstruct the
DNA. PCR Meth. Applica. 1:34-38. history of sequences subject to recombination. J.
Hayashi, K. 1991b. PCR-SSCP: A method for detection Mol. Evol. 36:396-405.
of mutations. GATA 9:73-79. Heinstra, P. W. H., W. J. M. Aben, W. Scharloo and G.
Hayes, J. P. and R. G. Harrison. 1992. Variation in E. W. Thorig. 1986. Alcohol dehydrogenase of
mitochondrial DNA and the biogeographic histo- Drosophila melanogaster: Metabolic differences
ry of woodrats (Neotoma) of the eastern United mediated through cryptic allozymes. Heredity
States. Syst. Biol. 42:331-344. 57:23-29.
Healy, J. A. and M. F. Mulcahy. 1979. Polymorphic Helfman, D. M., J. C. Fiddes and D. Hanahan. 1987.
tetrameric superoxide dismutase in the pike Esox Directional cDNA cloning in plasmid vectors by
lucius L. (Pisces; Esocidae). Comp. Biochem. sequential addition of oligonucleotide linkers.
Physiol. 62B:563-565. Meth. Enzymol. 152:349-359.
Henderson, A. S. 1982. Cytological hybridization to triosephosphate isomerase (TPI) in Isoetes
mammalian chromosomes. Int. Rev. Cytol. (Isoetaceae). Am. J. Bot. 76:215-221.
76:1-46. Higgins, D. G. and P. M. Sharp. 1988. CLUSTAL: A
Henderson, N. S. 1965. Isozymes of isocitrate dehy- package for performing multiple sequence align-
drogenase: Subunit structure and intracellular ment on a microcomputer. Gene 73:237-244.
location. J. Exp. Zool. 158:263-274. Higgins, D. G. and P. M. Sharp. 1989. Fast and sensi-
Hendy, M. D. 1989. The relationship between simple tive multiple sequence alignments on a micro-
evolutionary tree models and observable computer. CABIOS 5:151-153.
sequence data. Syst. Zool. 38:310-321. Higgins, D. G., A. J. Bleasby and R. Fuchs. 1992.
Hendy, M. D. 1991. A combinatorial description of the CLUSTAL V: Improved software for multiple
closest tree algorithm for finding evolutionary sequence alignment. Comput. Appl. Biosci.
trees. Discrete Math. 96:51-58. 8:189-192.
Hendy, M. D. and D. Penny. 1982. Branch and bound Highton, R., G. C. Maha and L. R. Maxson. 1989.
algorithms to determine minimal evolutionary Biochemical evolution in the slimy salamanders
trees. Math. Biosci. 59:277-290. of the Plethodon glutinosus complex in the eastern
Hendy, M. D. and D. Penny. 1989. A framework for the United States. Illinois Biol. Monogr. 57:1-153.
quantitative study of evolutionary trees. Syst. Higuchi, R. G. and H. Ochman. 1989. Production of
Zool. 38:297-309. single-stranded DNA templates by exonuclease
Hendy, M. D. and D. Penny. 1993. Spectral analysis of digestion following the polymerase chain reac-
phylogenetic data. J. Class. 10:5-24. tion. Nucl. Acids Res. 17:5865.
Hendy, M. D., D. Penny and M. A. Steel. 1994. A dis- Higuchi, R. G., B. Bowman, M. Freiberger, O. A. Ryder
crete Fourier analysis for evolutionary trees. Proc. and A. C. Wilson. 1984. DNA sequences from the
Natl. Acad. Sci. USA 91:3339-3343. quagga, an extinct member of the horse family.
Henikoff, S. and J. G. Henikoff. 1992. Amino acid sub- Nature 312:282-284.
stitution matrices from protein blocks. Proc. Natl. Higuchi, R. G., L. A. Wrischnik, E. Oakes, M. George,
Acad. Sci. USA 89:10915-10919. B. Tong and A. C. Wilson. 1987. Mitochondrial
Hennig, W. 1950. Grundzüge einer Theorie der phylo- DNA of the extinct quagga: Relatedness and post-
genetischen Systematik. Deutscher Zentralverlag, mortem change. J. Mol. Evol. 25:283-287.
Berlin. Hilbish, T. J. and R. K. Koehn. 1985a. The physiologi-
Hennig, W. 1966. Phylogenetic Systematics. University cal basis of natural selection at the LAP locus.
of Illinois Press, Urbana. Evolution 39:1302-1317.
Hereford, L. M. and M. Rosbash. 1977. Number and dis- Hilbish, T. J. and R. K. Koehn. 1985b. Dominance in
tribution of polyadenylated RNA sequences in physiological phenotypes and fitness at an
yeast. Cell 10:453-462. enzyme locus. Science 229:52-54.
Herman, S. G. 1980. The Naturalist's Field Journal. Buteo Hilbish, T. J., L. E. Deaton and R. K. Koehn. 1982.
Books, Vermillion, South Dakota. Effect of an allozyme polymorphism on regula-
Hernandez, J. L. and B. S. Weir. 1989. A disequilibrium tion of cell volume. Nature 298:688-689.
coefficient approach to Hardy-Weinberg testing. Hill, W. G. and B. S. Weir. 1988. Variances and covari-
Biometrics 45:53-70. ances of squared linkage disequilibria. Theor.
Hernandez-Juviel, J. M., D. J. Morafka, I. Delgado, G. Pop. Biol. 33:54-78.
D. Scott and R. W. Murphy. 1992. Effect of enzyme Hill, W. G. and B. S. Weir. 1994. Maximum likelihood
dilution on the relative mobility of glutamate estimation of gene location with linkage disequi-
dehydrogenase isozymes in the prairie rat- librium. Am. J. Human Genet. 54:705-714.
tlesnake, Crotalus viridis viridis. Copeia Hillis, D. M. 1984. Misuse and modification of Nei's
1992:1117-1119. genetic distance. Syst. Zool. 33:238-240.
Hewitt, G. M. 1988. Hybrid zones-natural laborato- Hillis, D. M. 1985. Evolutionary genetics of the
ries for evolutionary studies. Trends Ecol. Evol. Andean lizard genus Pholidobolus (Sauria:
3:158-167. Gymnophthalmidae): Phylogeny, biogeography,
Hey, J. and R. M. Kliman. 1993. Population genetics and a comparison of tree construction techniques.
and phylogenetics of DNA sequence variation of Syst. Zool. 34:109-126.
multiple loci within the Drosophila melanogaster Hillis, D. M. 1987. Molecular versus morphological
complex. Mol. Biol. Evol. 10:804-822. approaches to systematics. Annu. Rev. Ecol. Syst.
Hickey, R. J., S. I. Guttman and W. H. Eshbaugh. 1989. 18:23-42.
Evidence for post-translational modification of
Hillis, D. M. 1989. Genetic consequences of partial Molecular Evolution of Physiological Processes.
self-fertilization on populations of the Florida tree Rockefeller University Press, New York.
snail (Liguus fasciatus). Am. Malacol. Bull. 6:7-12. Hillis, D. M. and J. P. Huelsenbeck. 1995. Assessing
Hillis, D. M. 1990. The phylogeny of amphibians: molecular phylogenies. Science 267:255-256.
Current knowledge and the role of cytogenetics, Hillis, D. M. and J. C. Patton. 1982. Morphological and
pp. 7-31. In D. M. Green and S. K. Sessions (eds.), electrophoretic evidence for two species of
Amphibian Cytogenetics and Evolution. Academic Corbicula (Bivalvia: Corbiculidae) in North
Press, San Diego. America. Am. Midl. Nat. 108:74-80.
Hillis, D. M. 1991. Discriminating between phyloge- Hillis, D. M., D. S. Rosenfield and M. Sanchez. 1987.
netic signal and random noise in DNA sequences, Allozymic variability and heterozygote deficiency
pp. 278-294. In M. M. Miyamoto and J. Cracraft within and among morphologically polymorphic
(eds.), Phylogenetic Analysis of DNA Sequences. populations of Liguus fasciatus (Mollusca:
Oxford University Press, New York. Pulmonata: Bulimulidae). Am. Malacol. Bull.
Hillis, D. M. 1994a. Homology in molecular biology, 5:155-159.
pp. 339-367. In B. K. Hall (ed.), Homology: The Hillis, D. M., M. T. Dixon and L. K. Ammerman.
Hierarchical Basis of Comparative Biology. Academic 1991a. The relationships of the coelacanth
Press, New York. Latimeria chalumnae: Evidence from sequences of
Hillis, D. M. 1994b. Phylogenetic searching of molecu- vertebrate 28S ribosomal RNA genes. Environ.
lar data bases. Syst. Biol. 43:461-463. Biol. Fishes 32:119-130.
Hillis, D. M. 1995. Approaches for assessing phyloge- Hillis, D. M., M. T. Dixon and A. L. Jones. 1991b.
netic accuracy. Syst. Biol. 44:3-16. Minimal genetic variation in a morphologically
Hillis, D. M. and J. J. Bull. 1991. Of genes and diverse species (Florida tree snail, Liguus fascia-
genomes. Science 254:528. tus). J. Hered. 82:282-286.
Hillis, D. M. and J. J. Bull. 1993. An empirical test of Hillis, D. M., C. Moritz, C. A. Porter and R. J. Baker.
bootstrapping as a method for assessing confi- 1991c. Evidence for biased gene conversion in
dence in phylogenetic analysis. Syst. Biol. concerted evolution of ribosomal DNA. Science
42:182-192. 251:308-310.
Hillis, D. M. and S. K. Davis. 1986. Evolution of ribo- Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett and
somal DNA: Fifty million years of recorded histo- I. J. Molineux. 1992. Experimental phylogenetics:
ry in the frog genus Rana. Evolution 40:1275-1288. Generation of a known phylogeny. Science
Hillis, D. M. and S. K. Davis. 1987. Evolution of the 255:589-592.
28S ribosomal RNA gene in anurans: Hillis, D. M., M. W. Allard and M. M. Miyamoto.
Phylogenetic implications of length and restric- 1993a. Analysis of DNA sequence data:
tion site variation. Mol. Biol. Evol. 4:117-125. Phylogenetic inference. Meth. Enzymol.
Hillis, D. M. and S. K. Davis. 1988. Ribosomal DNA: 242:456-487.
Intraspecific polymorphism, concerted evolution, Hillis, D. M., L. K. Ammerman, M. T. Dixon and R. O.
and phylogeny reconstruction. Syst. Zool. de Sá. 1993b. Ribosomal DNA and the phylogeny
32:63-66. of frogs. Herpetol. Monog. 7:118-131.
Hillis, D. M. and M. T. Dixon. 1989. Vertebrate phy- Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett and
logeny: Evidence from 28S ribosomal DNA I. J. Molineux. 1993c. Experimental approaches to
sequences, pp. 355-367. In B. Fernholm, K. phylogenetic analysis. Syst. Biol. 42:90-92.
Bremer and H. Jörnvall (eds.), The Hierarchy of Hillis, D. M., J. P. Huelsenbeck and C. W.
Life. Proc. Nobel Symp. 70. Elsevier, Amsterdam. Cunningham. 1994a. Application and accuracy of
Hillis, D. M. and M. T. Dixon. 1991. Ribosomal DNA: molecular phylogenies. Science 264:671-677.
Molecular evolution and phylogenetic inference. Hillis, D. M., J. P. Huelsenbeck and D. L. Swofford.
Quart. Rev. Biol. 66:411-453. 1994b. Hobgoblin of phylogenetics? Nature
Hillis, D. M. and J. P. Huelsenbeck. 1992. Signal, noise, 369:363-364.
and reliability in molecular phylogenetic analy- Hinkle, G., J. K. Wetterer, T. R. Schultz and M. L.
ses. J. Hered. 83:189-195. Sogin. 1994. Phylogeny of the attine ant fungi
Hillis, D. M. and J. P. Huelsenbeck. 1994a. Support for based on analysis of small subunit ribosomal
dental HIV transmission. Nature 369:24-25. RNA gene sequences. Science 266:1695-1697.
Hillis, D. M. and J. P. Huelsenbeck. 1994b. To tree the Hiratsuka, J., H. Shimada, R. Whittier, T. Ishibashi, M.
truth: Biological and numerical simulations of Sakamoto, M. Mori, C. Kondo, Y. Honji, C.-R.
phylogeny, pp. 55-67. In D. M. Fambrough (ed.), Sun, B.-Y. Meng, Y.-Q. Li, A. Kanno, Y.
Nishizawa, A. Hirai, K. Shinozaki and M. Sugiura. analysis of isozyme patterns, pp. 489-508. In C. L.
1989. The complete sequence of the rice (Oryza Markert (ed.), Isozymes. Vol. 1. Academic Press,
sativa) chloroplast genome: Intermolecular recom- New York.
bination between distinct tRNA genes accounts Höss, M. and S. Pääbo. 1993. DNA extraction from
for a major plastid DNA inversion during the Pleistocene bones by a silica-based purification
evolution of the cereals. Mol. Gen. Genet. method. Nucl. Acids Res. 21:3913-3914.
217:185-194. Höss, M., M. Kohn, S. Pääbo, F. Knauer and W.
Hixson, J. E. and W. M. Brown. 1986. A comparison of Schröder. 1992. Excrement analysis by PCR.
the small ribosomal RNA genes from the mito- Nature 359:199.
chondrial DNA of the great apes and humans: Houck, L. D., S. G. Tilley and S. J. Arnold. 1985. Sperm
Sequence, structure, evolution, and phylogenetic competition in a plethodontid salamander:
implications. Mol. Biol. Evol. 3:1-18. Preliminary results. J. Herpetol. 19:420-423.
Hoeh, W. R., K. H. Blakley and W. M. Brown. 1992. Houde, P. and M. J. Braun. 1988. Museum collections
Heteroplasmy suggests limited biparental inheri- as a source of DNA for studies of avian phyloge-
tance of Mytilus mitochondrial DNA. Science ny. Auk 105:773-776.
251:1488-1490. Hsiao, J.-Y. and L. H. Rieseberg. 1994. Population
Hoelzal, R. and G. A. Dover. 1987. Molecular tech- genetic structure of Yushania niitakayamensis
niques for examining genetic variation and stock (Bambusoideae, Poaceae) in Taiwan. Mol. Ecol.
identity in cetacean species. Report of the 3:201-209.
International Whale Commission. Hsu, T. C. 1979. Human and Mammalian Cytogenetics.
Hoey, M. T. and C. R. Parks. 1991. Isozyme divergence Springer-Verlag, Berlin.
between Eastern Asian, North American and Hsu, T. C. 1981. Polymorphism in human acrocentric
Turkish populations of Liquidambar chromosomes and the silver staining method for
(Hamamelidaceae). Am. J. Bot. 78:938-947. nucleolus organizer regions. Karyogram 245.
Holmes, N. G., C. S. Mellersh, S. J. Humphreys, M. M. Hubby, J. L. and R. C. Lewontin. 1966. A molecular
Binns, A. Holliman, R. Curtis and J. Sampson. approach to the study of genic heterozygosity in
1993. Isolation and characterization of microsatel- natural populations. I. The number of alleles at
lites from the canine genome. Anim. Genet. different loci in Drosophila pseudoobscura. Genetics
24:289-292. 54:577-594.
Holmquist, R., M. M. Miyamoto and M. Goodman. Hubby, J. L. and L. H. Throckmorton. 1965. Protein
1988a. Analysis of higher-primate phylogeny differences in Drosophila. II. Comparative species
from transversion differences in nuclear and genetics and evolutionary problems. Genetics
mitochondrial DNA by Lake's methods of evolu- 52:203-215.
tionary parsimony and operator metrics. Mol. Hudson, R. R. 1990. Gene genealogies and the coales-
Biol. Evol. 5:217-236. cent process. Oxford Surv. Evol. Biol. 7:1-44.
Holmquist, R., M. M. Miyamoto and M. Goodman. Hudson, R. R., D. D. Boos and N. L. Kaplan. 1992a. A
1988b. Higher-primate phylogeny-Why can't we statistical test for detecting geographic subdivi-
decide? Mol. Biol. Evol. 5:201-216. sion. Mol. Biol. Evol. 9:138-151.
Holsinger, K. E. and L. D. Gottlieb. 1988. Isozyme vari- Hudson, R. R., M. Slatkin and W. P. Maddison. 1992b.
ability in the tetraploid Clarkia gracilis Estimation of levels of gene flow from DNA
(Onagraceae) and its diploid relatives. Syst. Bot. sequence data. Genetics 132:583-589.
13:1-6. Hudspeth, M. E. S., D. S. Shumard, K. M. Tatti and L.
Holtsford, T. P. and N. C. Ellstrand. 1990. Inbreeding I. Grossman. 1980. Rapid purification of yeast
effects in Clarkia tembloriensis (Onagraceae) popu- mitochondrial DNA in high yield. Biochim.
lations with different natural outcrossing rates. Biophys. Acta 610:221-228.
Evolution 44:2031-2046. Huelsenbeck, J. P. 1995a. Performance of phylogenetic
Honeycutt, R. L., S. W. Edwards, K. Nelson and E. methods in simulation. Syst. Biol. 44:17-48.
Nevo. 1987. Mitochondrial DNA variation and Huelsenbeck, J. P. 1995b. The robustness of two phylo-
the phylogeny of African mole rats (Rodentia: genetic methods: Four-taxon simulations reveal a
Bathyergidae). Syst. Zool. 36:280-293. slight superiority of maximum likelihood over
Hood, L. E., J. H. Wilson and W. B. Wood. 1974. neighbor joining. Mol. Biol. Evol. 12:843-849.
Molecular Biology of Eucaryotic Cells. Vol. 1. W. A. Huelsenbeck, J. P. and D. M. Hillis. 1993. Success of
Benjamin, Menlo Park, California. phylogenetic methods in the four-taxon case.
Hopkinson, D. A. 1975. The use of thiol reagents in the Syst. Biol. 42:247-264.
Huelsenbeck, J. P., D. L. Swofford, C. W. Cunningham, alated deoxyribonucleic acid. Biochemistry
J. J. Bull and P. W. Waddell. 1994. Is character 12:558-563.
weighting a panacea for the problem of data het- Huxley, J. 1942. Evolution: The Modern Synthesis. Allen
erogeneity in phylogenetic analysis? Syst. Biol. and Unwin, London.
43:288-291.
Huelsenbeck, J. P., D. M. Hillis and R. Jones. 1995. Innis, M. A., D. H. Gelfand, J. J. Sninsky and T. J. White.
Parametric bootstrapping in molecular phyloge- 1990. PCR Protocols. Academic Press, New York.
netics: Applications and performance. In J. International Union of Biochemistry: Nomenclature
Ferraris and S. Palumbi (eds.), Molecular Zoology: Committee. 1984. Enzyme Nomenclature, 1984.
Strategies and Protocols. Wiley, New York. Academic Press, Orlando, Florida.
Huey, R. B. and A. F. Bennett. 1987. Phylogenetic stud- Irwin, D. M., T. D. Kocher and A. C. Wilson. 1991.
ies of coadaptation: Preferred temperatures ver- Evolution of the cytochrome b gene of mammals.
sus optimal performance temperatures of lizards. J. Mol. Evol. 32:128-144.
Evolution 41:1098-1115. ISCN. 1981. An international system for human cyto-
Hugall, A., C. Moritz, J. Stanton and D. R. genetic nomenclature-high resolution banding.
Wolstenholme. 1994. Low, but strongly structured Cytogenet. Cell Genet. 31:1-23.
mitochondrial DNA diversity in root knot nema- Iwahana, H., D. Yoshimoto and M. Itakura. 1992.
todes (Meloidogyne). Genetics 136:903-912. Detection of point mutations by SSCP of PCR-
Hughes, A. E. 1993. Optimization of microsatellite amplified DNA after endonuclease digestion.
analysis for genetic mapping. Genomics BioTechniques 12:64-66.
15:433-434.
Hughes, A. L. and M. Nei. 1989. Ancient interlocus Jackman, T. R. and D. B. Wake. 1994. Evolutionary and
exon exchange in the history of the HLA-A locus. historical analysis of protein variation in the
Genetics 122:681-686. blotched forms of salamanders of the Ensatina
Hughes, C. R. and D. C. Queller. 1993. Detection of complex (Amphibia, Plethodontidae). Evolution
highly polymorphic microsatellite loci in a species 48:876-897.
with little allozyme polymorphism. Mol. Ecol. Jackson, J. F. and J. A. Pounds. 1979. Comments on
2:131-138. assessing the dedifferentiating effects of gene
Hunkapiller, T., R. J. Kaiser, B. F. Koop and L. Hood. flow. Syst. Zool. 28:78-85.
1991. Large-scale and automated DNA sequence Jacobs, H. T., J. W. Posakony, J. W. Grula, J. W. Roberts,
determination. Science 254:59-68. J. H. Xin, R. J. Britten and E. H. Davidson. 1983.
Hunt, J. A., T. J. Hall and R. J. Britten. 1981. Mitochondrial DNA sequences in the nuclear
Evolutionary distances in Hawaiian Drosophila genome of Strongylocentrotus purpuratus. J. Mol.
measured by DNA reassociation. J. Mol. Evol. Biol. 165:609-632.
17:361-367. Janczewski, D. N., N. Yuhki, D. A. Gilbert, G. T.
Hunt, W. G. and R. K. Selander. 1973. Biochemical Jefferson and S. J. O'Brien. 1992. Molecular phylo-
genetics of hybridization in European house mice. genetic inference from saber-toothed cat fossils of
Heredity 31:11-33. Rancho La Brea. Proc. Natl. Acad. Sci. USA
Hunter, R. L. and C. L. Markert. 1957. Histochemical 89:9769-9773.
demonstration of enzymes separated by zone Jansen, R. K. and J. D. Palmer. 1987a. Chloroplast
electrophoresis in starch gels. Science DNA from lettuce and Barnadesia (Asteraceae):
125:1294-1295. Structure, gene localization and characterization
Hutchinson, M. N. and L. R. Maxson. 1987a. of a large inversion. Curr. Genet. 11:553-564.
Biochemical studies on the relationships of the Jansen, R. K. and J. D. Palmer. 1987b. A chloroplast
gastric-brooding frogs, genus Rheobatrachus. DNA inversion marks an ancient evolutionary
Amphibia/Reptilia 8:1-11. split in the sunflower family (Asteraceae). Proc.
Hutchinson, M. N. and L. R. Maxson. 1987b. Natl. Acad. Sci. USA 84:5818-5822.
Phylogenetic resolution among Australian tree Jansen, R. K. and J. D. Palmer. 1988. Phylogenetic
frogs (Anura: Hylidae: Pelodryadinae): An implications of chloroplast DNA restriction site
immunological approach. Australian J. Zool. variation in the Mutisieae (Asteraceae). Am. J.
35:61-74. Bot. 75:751-764.
Hutton, J. R. and J. G. Wetmur. 1973. Effect of chemical Jansen, R. K., H. J. Michaels and J. D. Palmer. 1991.
modification on the rate of renaturation of Phylogeny and character evolution in the
deoxyribonucleic acid: Deamination and glyox- Asteraceae based on chloroplast DNA restriction
site mapping. Syst. Bot. 16:98-115.
Jansen, R. K., H. J. Michaels, R. S. Wallace, K.-J. Kim, S. and D. J. Simpson. 1989. High-speed DNA
C. Keeley, L. E. Watson and J. D. Palmer. 1992. sequencing: An approach based upon fluores-
Chloroplast DNA variation in the Asteraceae: cence detection of single molecules. J. Biomol.
Phylogenetic and evolutionary implications, pp. Struct. Dynam. 7:301-309.
252-279. In P. S. Soltis, D. E. Soltis and J. J. Doyle Jiminez-Marin, D. and H. C. Dessauer. 1973. Protein
(eds.), Molecular Systematics of Plants. Chapman phenotype variation in laboratory populations of
and Hall, New York. Rattus norvegicus. Comp. Biochem. Physiol.
Jeanpierre, M. 1987. A rapid method for the purifica- 46B:487-492.
tion of DNA from blood. Nucl. Acids Res. 15:9611. Jin, L. and R. Chakraborty. 1995. Population structure,
Jech, M. S. and N. C. Wheeler. 1984. Laboratory Manual stepwise mutations, heterozygote deficiency and
for Horizontal Starch Gel Electrophoresis. their implications for DNA forensics. Heredity
Weyerhauser Research and Development Report 74:274-285.
#O50-3210/6. Jin, L. and M. Nei. 1990. Limitations of the evolution-
Jeffreys, A. J. 1982. Spermidine and the digestion of ary parsimony method of phylogenetic analysis.
impure DNA. Focus (BRL) 4(3):12. Mol. Biol. Evol. 7:82-102.
Jeffreys, A. J. and D. B. Morton. 1987. DNA finger- Johannisson, R. and H. Winking. 1994. Synaptonemal
prints of dogs and cats. Anim. Genet. 18:1-15. complexes of chains and rings in mice heterozy-
Jeffreys, A. J., V. Wilson and S. L. Thein. 1985a. gous for multiple Robertsonian translocations.
Hypervariable "minisatellite" regions in human Chromosome Res. 2:137-145.
DNA. Nature 314:67-73. John, H., M. L. Birnstiel and K. W. Jones. 1969.
Jeffreys, A. J., V. Wilson and S. L. Thein. 1985b. RNA-DNA hybrids at cytological levels. Nature
Individual-specific "fingerprints" of human 223:582-587.
DNA. Nature 316:76-79. Johnson, A. G., F. M. Utter and H. O. Hodgins. 1970.
Jeffreys, A. J., V. Wilson, R. Kelly, B. A. Taylor and G. Interspecific variation of tetrazolium oxidase in
Bulfield. 1987. Mouse DNA "fingerprints": Sebastodes (rockfish). Comp. Biochem. Physiol.
Analysis of chromosome localization and germ- 37:281-285.
line stability of hypervariable loci in recombinant Johnson, G. B. 1976. Hidden alleles at the α-glyc-
inbred strains. Nucl. Acids Res. 15:2823-2836. erophosphate locus in Colias butterflies. Genetics
Jeffreys, A. J., N. J. Royle, V. Wilson and Z. Wong. 83:149-167.
1988. Spontaneous mutation rates to new length Johnson, G. B. 1977. Assessing electrophoretic similari-
alleles at tandem-repetitive hypervariable loci in ty: The problem of hidden heterogeneity. Annu.
human DNA. Nature 332:278-281. Rev. Ecol. Syst. 8:309-328.
Jeffreys, A. J., A. MacLeod, K. Tamaki, D. L. Neil and Johnson, G. B. 1979. Increasing the resolution of poly-
D. G. Monckton. 1991. Minisatellite repeat coding acrylamide gel electrophoresis by varying the
as a digital approach to DNA typing. Nature degree of crosslinking. Biochem. Genet.
354:204-209. 17:499-516.
Jeffreys, A. J., K. Tamaki, A. MacLeod, D. G. Johnson, M. S. and R. Black. 1984. The Wahlund effect
Monckton, D. L. Neil and J. A. L. Armour. 1994. and the geographical scale of variation in the
Complex gene conversion events in germline intertidal limpet Siphonaria sp. Marine Biol.
mutation at human minisatellites. Nature 79:295-302.
Genetics 6:136-145. Johnson, M. S. and R. F. Doolittle. 1986. A method for
Jena, K. K. and G. Kochert. 1991. Restriction fragment the simultaneous alignment of three or more
length polymorphism analysis of CCDD genome amino acid sequences. J. Mol. Evol. 23:267-278.
species of the genus Oryza L. Plant Mol. Biol. Johnson, N. K., R. M. Zink, G. F. Barrowclough and J.
16:831-839. A. Marten. 1984. Suggested techniques for mod-
Jensen, U. and D. E. Fairbrothers (eds.). 1983. Proteins ern avian systematics. Wilson Bull. 96:543-560.
and Nucleic Acids in Plant Systematics. Springer- Johnson, N. K., R. M. Zink and J. A. Marten. 1988.
Verlag, New York. Genetic evidence for relationships in the avian
Jermann, T. M., J. G. Opitz, J. Stackhouse and S. A. family Vireonidae. Condor 90:428-445.
Benner. 1995. Reconstructing the evolutionary his- Jones, C. S., H. Tegelstrom, D. S. Latchman and R. J.
tory of the artiodactyl ribonuclease superfamily. Berry. 1988. An improved rapid method for mito-
Nature 374:57-59. chondrial DNA isolation suitable for use in the
Jett, J. H., R. A. Keller, J. C. Martin, B. L. Marrone, R. K. study of closely related populations. Biochem.
Moyzis, R. L. Ratliff, N. K. Seitzinger, B. B. Shera Genet. 26:83-88.
Jones, D. T., W. R. Taylor and J. M. Thornton. 1992. The Karl, S. A. and J. C. Avise. 1992. Balancing selection at
rapid generation of mutation data matrices from allozyme loci in oysters: Implications from
protein sequences. Comp. Appl. Biosci. 8:275-282. nuclear RFLPs. Science 256:100-102.
Jones, G. H. and D. de Azkue. 1993. Synaptonemal Karl, S. A. and J. C. Avise. 1993. PCR-based assays of
complex karyotyping: An appraisal based on a Mendelian polymorphisms from anonymous sin-
study of Crepis capillaris. Chromosome Res. gle-copy nuclear DNA: Techniques and applica-
1:197-203. tions for population genetics. Mol. Biol. Evol.
Jones, T. R., A. G. Kluge and A. J. Wolf. 1993. When 10:342-361.
theories and methodologies clash: A phylogenetic Karl, S. A., B. W. Bowen and J. C. Avise. 1992. Global
reanalysis of the North American ambystomatid population genetic structure and male-mediated
salamanders (Caudata: Ambystomatidae). Syst. gene flow in the green turtle (Chelonia mydas):
Biol. 42:92-102. RFLP analysis of anonymous nuclear loci.
Jorgensen, R. A. and P. D. Cluster. 1988. Modes and Genetics 131:163-173.
tempos in the evolution of nuclear ribosomal Keilen, D. and Y. L. Wang. 1947. Stability of hemoglo-
DNA: New characters for evolutionary studies bin and certain non-erythrocytic enzymes in vitro.
and new markers for genetic and population Biochem. J. 41:491-499.
studies. Ann. Missouri Bot. Gard. 75:1238-1247. Kellogg, E. A. and J. A. Birchler. 1993. Linking phyloge-
Joseph, L. and C. Moritz. 1994. Mitochondrial DNA ny and genetics: Zea mays as a tool for phyloge-
phylogeography of birds in eastern Australian netic studies. Syst. Biol. 42:409-414.
rainforests: First fragments. Australian J. Zool. Kemmerer, E. C., M. Lei and R. Wu. 1991. Isolation
42:385-403. and molecular evolutionary analysis of a
Joseph, L., C. Moritz and A. Hugall. 1995. Molecular cytochrome c gene from Oryza sativa (rice). Mol.
support for vicariance as a source of diversity in Biol. Evol. 8:212-226.
rainforest. Proc. Roy. Soc. Lond. (in press). Kempthorne, O. 1957. An Introduction to Genetic
Jouannic, S., C. Kerbourch, B. Kloareg and S. Statistics. Wiley, New York.
Loiseau-de Goer. 1992. Nucleotide sequences of Kephart, S. R. 1990. Starch gel electrophoresis of plant
the atpB and the atpE genes of the brown alga isozymes: A comparative analysis of techniques.
Pylaiella littoralis (L.) Kjellm. Plant Mol. Biol. Am. J. Bot. 77:693-712.
18:819-822. Kesseli, R., O. Ochoa and R. Michelmore. 1991.
Jukes, T. H. and C. R. Cantor. 1969. Evolution of pro- Variation at RFLP loci in Lactuca spp. and origin of
tein molecules, pp. 21-132. In H. N. Munro (ed.), cultivated lettuce (L. sativa). Genome 34:430-436.
Mammalian Protein Metabolism. Academic Press, Kessing, B. D. 1991. Strongylocentrotid sea urchin
New York. mitochondrial DNA: Phylogenetic Relationships
Jupe, E. R., R. L. Chapman and E. A. Zimmer. 1988. and patterns of molecular evolution. Masters the-
Nuclear ribosomal RNA genes and algal phyloge- sis, Department of Zoology, University of Hawaii,
ny-the Chlamydomonas example. BioSystems Honolulu, HI.
21:223-230. Kessler, L. G. and J. C. Avise. 1985a. Microgeographic
lineage analysis by mitochondrial genotype:
Kambhampati, S. and K. S. Rai. 1991. Temporal varia- Variation in the cotton rat (Sigmodon hispidus).
tion in the ribosomal DNA nontranscribed spacer Evolution 39:831-838.
of Aedes albopictus (Diptera: Culicidae). Genome Kessler, L. G. and J. C. Avise. 1985b. A comparative
34:293-297. description of mitochondrial differentiation in
Kanehisa, M. 1984. Use of criteria for screening poten- selected avian and other vertebrate genera. Mol.
tial homologies in nucleic acid sequences. Nucl. Biol. Evol. 2:109-126.
Acids Res. 12:203-213. Kettler, M. K. and G. S. Whitt. 1986. An apparent pro-
Kaplan, J.-C. and E. Beutler. 1967. Electrophoresis of gressive and recurrent evolutionary restriction in
red cell NADH- and NADPH-diaphorases in nor- tissue expression of a gene, the lactate dehydroge-
mal subjects and patients with congenital methe- nase-C gene, within a family of bony fish
moglobinemia. Biochem. Biophys. Res. Comm. (Salmoniformes: Umbridae). J. Mol. Evol.
29:605-610. 23:95-107.
Kaplan, N. L., W. G. Hill and B. S. Weir. 1995. Kettler, M. K., A. W. Ghent and G. S. Whitt. 1986. A
Likelihood methods for locating disease genes in comparison of phylogenies based on structural
non-equilibrium populations. Am. J. Human and tissue-expressional differences of enzymes in
Genet. 56:18-32. a family of teleost fishes (Salmoniformes:
Umbridae). Mol. Biol. Evol. 3:485-498.
Kezer, J. and S. K. Sessions. 1979. Chromosome varia- Kimura, M. and T. Ohta. 1972. On the stochastic model
tion in the plethodontid salamander, Aneides fer- for estimation of mutational distance between
reus. Chromosoma 71:65-80. homologous proteins. J. Mol. Evol. 2:87-90.
Kezer, J., P. León and S. K. Sessions. 1980. Structural Kimura, M. and G. H. Weiss. 1964. The stepping stone
differentiation of the meiotic and mitotic chromo- model of population and the decrease of genetic
somes of the salamander Ambystoma macrodacty- correlation with distance. Genetics 49:561-576.
lum. Chromosoma 81:177-197. King, J. L. and T. H. Jukes. 1969. Non-Darwinian evo-
Kezer, J., S. K. Sessions and P. León. 1989. The meiotic lution. Science 164:788-798.
structure and behavior of the strongly heteromor- King, J. L. and T. Ohta. 1975. Polyallelic mutational
phic X/Y sex chromosomes of neotropical pletho- equilibria. Genetics 79:681-691.
dontid salamanders of the genus Oedipina. King, M. 1993. Species Evolution: The Role of
Chromosoma 98:433-442. Chromosome Change. Cambridge University Press,
Kidd, K. K., P. Astolfi and L. L. Cavalli-Sforza. 1974. Cambridge.
Error in the reconstruction of evolutionary trees, Kirsch, J. A. W., Springer, M. A., Krajewski, C., Archer,
pp. 121-136. In J. F. Crow and C. Denniston (eds.), M., Aplin, K. and A. W. Dickerman. 1990a.
Genetic Distance. Plenum, New York. DNA/DNA hybridization studies of the carnivo-
Kidd, K. K. and L. L. Cavalli-Sforza. 1971. Number of rous marsupials. I: The intergeneric relationships
characters examined and error in reconstruction of bandicoots (Marsupialia: Perameloidea). J. Mol.
of evolutionary trees, pp. 335-346. In F. R. Hodson Evol. 30:434-448.
and P. Tautu (eds.), Mathematics in the Kirsch, J. A. W., Krajewski, C., Springer, M. S. and M.
Archaeological and Historical Sciences. Edinburgh Archer. 1990b. DNA-DNA hybridization studies
University Press, Edinburgh. of carnivorous marsupials. II. Relationships
Kidd, K. K. and L. A. Sgaramella-Zonta. 1971. among dasyurids (Marsupialia: Dasyuridae).
Phylogenetic analysis: Concepts and methods. Australian J. Zool. 38673-696.
Am. J. Human Genet. 23:235-252. Kirsch, J. A. W., Dickcrman, A. W., Reig, 0,A. and M.
Kilias, J. 1987. Protein characters as a taxonomic tool S. Springer. 1991. DNA hybridizatiol~evidence for
in lichen systematics. Bibl. Lichenol. 25445455. the Australian affinity of the American marsupial
Kim, J. 1993. Improving the accuracy of phylogenetic Dvomiciops australts. Proc. Natl. Acad. Sci. USA
estimation by cornbin~ngdifferent methods. Syst. 88:10465-10469.
Biol. 42:331-340. Kishino, H,and M. Hasegawa. 1989. Evaluation of the
Kim, W. and L. G. Abele. 1990. Molecular phylogeny maximum likelihood estimate of the evolutionary
of selected decapod crustaceans based on 18s tree topologies from DNA sequence data, and the
rRNA nucleotide sequences. J. Crustacean Biol. branching order in Mominoidea. J. Mol. Evol.
1O:l-13. 29:170-179.
Kimura, M. 1968. Evolutionary rate at the molecuIar Kishino, H. and M. Hasegawa. 2990. Converting dis-
level. Nature 217624-626. tance to time: Application to human evolution.
Kimura, M. 1980. A simple method for estimating evo- Meth. Enzymol. 183:550-570.
lutionary rate of base substitutions through com- Kishino, H., T. Miyata and M. Hasegawa. 1990.
parative studies of nucleotide sequences. J. Mol. Maximum likelihood inference of protein phy-
Evol. 16:211-229. logeny and the origin of cl~loroplasts.J. Mol. Evol.
Kimura, M. 1981. Estimation of evolutionary distances 31:151-160.
between homologous nucleotide sequences. Proc. Kitto, G. B., P. M. Wasserman and N. 0.Kaplan. 2966.
Natl. Acad. Sci. USA 78:454-458. Enzymatically active conformers of mitochondria1
Kimura, M. 1983a. The neutral theory of molecular malate dehydrogenase. Proc. Natl. Acad. Sci. USA
evolution, pp. 208-233. In M. Nei and R. K. Koehn 56578-585.
(eds.), Evolution of Genes and Proteins. Sinauer, Kjer, K. M., G. D. Baldridge and A. M. Fallon. 1994.
Sunderland, Massachusetts. Mosquito large subunit ribosomal RNA:
Kimura, M. 19831s. The Neutral Theory of Molecular Simultaneous alignment of primary and sec-
Evolution. Cambridge University Press, ondary structure. Biochim. Riophy. Acta:147-155.
Cambridge. Klebe, R. J. 1975.A simple method for the quantifica-
Kimura, M. 1986. DNA and the neutral theory. Phil. tion of isozymes patterns. Biochem. Genet.
Trans. Roy. Soc. London B312:343-354. 13:805-812.
Kimura, M, and J. E Crow. 1964. The number of alleles Klein, J. 1982. Immunology: The Science of Self-Nolzself
that can be maintained in a finite population. Discriminatiotz. John Wiley &Sons, New York.
Genetics 49:725-738.
Klein, J., Y. Satta and C. O'Huigin. 1993. The molecular descent of the major histocompatibility complex. Annu. Rev. Immunol. 11:269-295.
Kleppe, K., E. Ohtsuka, R. Kleppe, I. Molineux and H. G. Khorana. 1971. Studies on polynucleotides XCVI. Repair replication of short synthetic DNA's as catalyzed by DNA polymerases. J. Mol. Biol. 56:341-361.
Klier, K., M. J. Leoschke and J. F. Wendel. 1991. Hybridization and introgression in white and yellow ladyslipper orchids (Cypripedium candidum and C. pubescens). J. Hered. 82:305-318.
Klotz, L. C. and R. L. Blanken. 1981. A practical method for calculating evolutionary trees from sequence data. J. Theor. Biol. 91:261-272.
Kluge, A. G. 1983. Cladistics and the classification of the great apes, pp. 151-177. In R. L. Ciochon and R. S. Corruccini (eds.), New Interpretations of Ape and Human Ancestry. Plenum, New York.
Kluge, A. G. 1984. The relevance of parsimony to phylogenetic inference, pp. 24-38. In T. Duncan and T. Stuessy (eds.), Cladistics: Perspectives on the Reconstruction of Evolutionary History. Columbia University Press, New York.
Kluge, A. G. 1988. Parsimony in vicariance biogeography: A quantitative method and a Greater Antillean example. Syst. Zool. 37:315-328.
Kluge, A. G. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38:7-25.
Kluge, A. G. and J. S. Farris. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1-32.
Kluge, A. G. and R. E. Strauss. 1985. Ontogeny and systematics. Annu. Rev. Ecol. Syst. 16:247-268.
Knight, A. and D. P. Mindell. 1993. Substitution bias, weighting of DNA sequence evolution, and the phylogenetic position of Fea's viper. Syst. Biol. 42:18-31.
Knight, A. and D. P. Mindell. 1995. Weighting of nucleotide sequences: A reply. Syst. Biol. 44:112-116.
Knight, A., D. Styer, S. Pelikan, J. A. Campbell, L. D. Densmore III and D. P. Mindell. 1993. Choosing among hypotheses of rattlesnake phylogeny: A best-fit rate test for DNA sequence data. Syst. Biol. 42:356-367.
Knight, S. E. and D. M. Waller. 1987. Genetic consequences of outcrossing in the cleistogamous annual, Impatiens capensis. I. Population-genetic structure. Evolution 41:969-978.
Knowlton, N., L. A. Weigt, L. A. Solorzano, D. K. Mills and E. Bermingham. 1993. Divergence in proteins, mitochondrial DNA, and reproductive compatibility across the isthmus of Panama. Science 260:1629-1632.
Kobayashi, T., G. B. Milner, D. Teel and F. M. Utter. 1984. Genetic basis for electrophoretic variation of adenosine deaminase in chinook salmon. Trans. Am. Fish. Soc. 113:86-89.
Koch, J., J. Hindkjaer, J. Mogensen, S. Kalvraa and L. Bolund. 1991. An improved method for chromosome-specific labeling of alpha-satellite DNA in situ by using denatured double-stranded DNA probes as primers in a primed in situ labeling (PRINS) procedure. GATA 8:171-178.
Kocher, T. D. 1991. Sequence evolution of mitochondrial DNA in human and chimpanzees: Control region and protein coding region, pp. 391-413. In S. Osawa and T. Honjo (eds.), Evolution of Life: Fossils, Molecules, and Culture. Springer, Tokyo.
Kocher, T. D. and R. D. Sage. 1986. Further genetic analyses of a hybrid zone between leopard frogs (Rana pipiens complex) in central Texas. Evolution 40:21-33.
Kocher, T. D. and T. J. White. 1989. Evolutionary analysis via PCR. In H. A. Erlich (ed.), PCR Technology: Principles and Applications for DNA Amplification. Stockton Press, New York.
Kocher, T. D. and A. C. Wilson. 1991. Sequence evolution of mitochondrial DNA in humans and chimpanzees: Control region and a protein-coding region, pp. 391-413. In S. Osawa and T. Honjo (eds.), Evolution of Life. Springer-Verlag, Tokyo.
Kocher, T. D., W. K. Thomas, A. Meyer, S. V. Edwards, S. Paabo, F. X. Villablanca and A. C. Wilson. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kochert, G., T. Halward, W. D. Branch and C. E. Simpson. 1991. RFLP variability in peanut (Arachis hypogaea L.) cultivars and wild species. Theor. Appl. Genet. 81:565-570.
Koehler, K. and K. Larntz. 1980. An empirical investigation of goodness-of-fit statistics for sparse multinomials. J. Am. Statis. Assoc. 75:336-344.
Koehn, R. K. 1978. Physiology and biochemistry of enzyme variation. The interface of ecology and population genetics, pp. 51-72. In P. Brussard (ed.), Ecological Genetics: The Interface. Springer, New York.
Koehn, R. K. and F. W. Immermann. 1981. Biochemical studies of aminopeptidase polymorphism in Mytilus edulis. I. Dependence of enzyme activity on season, tissue, and genotype. Biochem. Genet. 19:1115-1142.
Koehn, R. K. and J. F. Siebenaller. 1981. Biochemical studies of aminopeptidase polymorphism in Mytilus edulis. II. Dependence of reaction rate on physical factors and enzyme concentration. Biochem. Genet. 19:1143-1162.
Koehn, R. K., R. I. E. Newell and F. Immermann. 1980. Maintenance of an aminopeptidase allele frequency cline by natural selection. Proc. Natl. Acad. Sci. USA 77:5385-5389.
Koehn, R. K., W. J. Diehl and T. M. Scott. 1988. The differential contribution by individual enzymes of glycolysis and protein catabolism to the relationship between heterozygosity and growth rate in the coot clam, Mulinia lateralis. Genetics 118:121-130.
Kohne, D. E. 1970. Evolution of higher-organism DNA. Quart. Rev. Biophys. 33:327-375.
Kohne, D. E. and R. J. Britten. 1971. Hydroxyapatite techniques for nucleic acid reassociation, pp. 500-512. In G. L. Cantoni and D. R. Davies (eds.), Procedures in Nucleic Acid Research. Harper and Row, New York.
Kohne, D. E., J. A. Chiscon and B. H. Hoyer. 1972. Evolution of primate DNA sequences. J. Human Evol. 1:627-644.
Kohne, D. E., S. A. Levison and M. J. Byers. 1977. Room temperature method for increasing the rate of DNA reassociation by many thousandfold: The phenol emulsion reassociation technique. Biochemistry 16:5329-5341.
Kohno, S. I., M. Kuro-o and C. Ikebe. 1991. Cytogenetics and evolution of hynobiid salamanders. In D. M. Green and S. K. Sessions (eds.), Amphibian Cytogenetics and Evolution. Academic Press, San Diego.
Kolodner, R. and K. K. Tewari. 1987. The molecular size and conformation of the chloroplast DNA from higher plants. Biochim. Biophys. Acta 402:372-390.
Kondo, R., S. Horai, Y. Satta and N. Takahata. 1993. Evolution of hominoid mitochondrial DNA with special reference to the silent substitution rate over the genome. J. Mol. Evol. 36:517-531.
Koop, B. F., M. Goodman, P. Xu, K. Chan and J. L. Slightom. 1986. Primate η-globin DNA sequences and man's place among the great apes. Nature 319:234-238.
Koop, B. F., D. A. Tagle, M. Goodman and J. L. Slightom. 1989. A molecular view of primate phylogeny and important systematic and evolutionary questions. Mol. Biol. Evol. 6:580-612.
Korber, B. T. M., R. M. Farber, D. H. Wolpert and A. S. Lapedes. 1993. Covariation of mutations in the V3 loop of human immunodeficiency virus type 1 envelope protein. Proc. Natl. Acad. Sci. USA 90:7176-7180.
Korber, B. T. M., R. F. Smith, K. MacInnes and G. Myers. 1994. Mutational trends in V3 loop protein sequences observed in different genetic lineages of human immunodeficiency virus type 1. J. Virol. 68:6730-6744.
Kornberg, A. 1980. DNA Replication. Freeman, San Francisco.
Kowbel, D. J. and M. J. Smith. 1989. The genomic nucleotide sequences of two differentially expressed actin-coding genes from the sea star Pisaster ochraceus. Gene 72297-308.
Krajewski, C. 1989. Phylogenetic relationships among cranes (Aves: Gruidae) based on DNA hybridization. Auk 106:603-618.
Krajewski, C. and A. W. Dickerman. 1990. Bootstrap analysis of phylogenetic trees derived from DNA hybridization distances. Syst. Zool. 39:383-390.
Kraus, F. 1991. Intra-individual ploidy consistency among unisexual Ambystoma. Copeia 1991:38-43.
Kraus, F. and M. M. Miyamoto. 1990. Mitochondrial genotype of a unisexual salamander of hybrid origin is unrelated to either of its nuclear haplotypes. Proc. Natl. Acad. Sci. USA 87:2235-2238.
Kreitman, M. 1987. Molecular population genetics. Oxford Surv. Evol. Biol. 4:38-60.
Kreitman, M. and M. Aguade. 1986. Genetic uniformity in two populations of Drosophila melanogaster as revealed by filter hybridization of four-nucleotide-recognizing restriction enzyme digests. Proc. Natl. Acad. Sci. USA 83:3562-3566.
Krishnan, B. R., R. W. Blakesley and D. E. Berg. 1991. Linear amplification DNA sequencing directly from single phage plaques and bacterial colonies. Nucl. Acids Res. 19:1153.
Kruskal, J. B. 1983. An overview of sequence comparison, pp. 1-40. In D. Sankoff and J. B. Kruskal (eds.), Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, London.
Kuhner, M. K. and J. Felsenstein. 1994. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Mol. Biol. Evol. 11:459-468.
Kumar, S., K. Tamura and M. Nei. 1993. MEGA: Molecular Evolutionary Genetics Analysis. Version 1.0. Pennsylvania State University, University Park, Pennsylvania.
Küntzel, H. and H. G. Köchel. 1981. Evolution of rRNA and origin of mitochondria. Nature 293:751-755.
Kuro-o, M., C. Ikebe and S. Kohno. 1986. Cytogenetic studies of Hynobiidae (Urodela) IV. DNA replication bands (R-banding) in the genus Hynobius and the banding karyotype of Hynobius nigrescens Stejneger. Cytogenet. Cell Genet. 43:14-18.
Kuro-o, M., C. Ikebe and S. Kohno. 1987. Cytogenetic studies of Hynobiidae (Urodela) VI. R-banding patterns in five pond-type Hynobius from Korea and Japan. Cytogenet. Cell Genet. 44:69-75.

Lacroix, J. C., R. Azzouz, D. Boucher, C. Abbadie, C. K. Pyne and J. Charlemagne. 1985. Monoclonal antibodies to lampbrush chromosome antigens of Pleurodeles waltlii. Chromosoma 92:69-80.
Laird, C. D. 1987. Proposed mechanism of inheritance and expression of human fragile-X syndrome of mental retardation. Genetics 117:587-599.
Laird, C. D., B. L. McConaughy and B. J. McCarthy. 1969. Rate of fixation of nucleotide substitutions in evolution. Nature 224:149-154.
Laird, C. D., E. Jaffe, G. Karpen, M. Lamb and R. Nelson. 1987. Fragile sites in human chromosomes as regions of late-replicating DNA. Trends Genet. 3:274-281.
Lake, J. A. 1987a. Rate-independent technique for analysis of nucleic acid sequences: Evolutionary parsimony. Mol. Biol. Evol. 4:167-191.
Lake, J. A. 1987b. Prokaryotes and archaebacteria are not monophyletic: Rate invariant analysis of rRNA genes indicates that eukaryotes and eocytes form a monophyletic taxon. Cold Spring Harbor Symp. Quant. Biol. 52:839-846.
Lake, J. A. 1988. Origin of the eukaryotic nucleus determined by rate-invariant analysis of rRNA sequences. Nature 331:184-186.
Lake, J. A. 1990a. Origin of the Metazoa. Proc. Natl. Acad. Sci. USA 82763-766.
Lake, J. A. 1990b. Archaebacterial or eocyte tree? Nature 343:418-419.
Lake, J. A. 1994. Reconstructing evolutionary trees from DNA and protein sequences: Paralinear distances. Proc. Natl. Acad. Sci. USA 91:1455-1459.
Lamarck, J.-B.-P.-A. de M. de. 1809. Philosophie Zoologique, ou Exposition des Considérations Relatives à l'histoire Naturelle des Animaux. Dentu, Paris.
Lamb, T., C. Lydeard, R. B. Walker and J. W. Gibbons. 1994. Molecular systematics of map turtles (Graptemys): A comparison of mitochondrial restriction sites versus sequence data. Syst. Biol. 43:543-559.
Lambert, D. M., C. D. Millar, K. Jack, S. Anderson and J. L. Craig. 1994. Single- and multilocus DNA fingerprinting of communally breeding pukeko: Do copulations or dominance ensure reproductive success? Proc. Natl. Acad. Sci. USA 91:9641-9645.
Lamboy, W. F. 1994. The accuracy of the maximum parsimony method for phylogeny reconstruction with morphological characters. Syst. Bot. 19:489-505.
Lanave, C., G. Preparata, C. Saccone and G. Serio. 1984. A new method for calculating evolutionary substitution rates. J. Mol. Evol. 20:86-93.
Landry, B. S., R. Kesseli, H. Leung and R. W. Michelmore. 1987. Comparison of restriction endonucleases and sources of probes for their efficiency in detecting restriction fragment length polymorphisms in lettuce (Lactuca sativa L.). Theor. Appl. Genet. 74:646-653.
Lane, D. J., B. Pace, G. J. Olsen, D. A. Stahl, M. L. Sogin and N. R. Pace. 1985. Rapid determination of 16S ribosomal sequences for phylogenetic analyses. Proc. Natl. Acad. Sci. USA 82:6955-6959.
Langer, P. R., A. A. Waldrop and D. C. Ward. 1981. Enzymatic synthesis of biotin-labeled polynucleotides: Novel nucleic acid affinity probes. Proc. Natl. Acad. Sci. USA 78:6633-6637.
Langley, C. H., E. Montgomery and W. Quattlebaum. 1981. Restriction map variation in the ADH region of Drosophila. Proc. Natl. Acad. Sci. USA 79:5631-5635.
Lansman, R. A., R. O. Shade, J. F. Shapira and J. C. Avise. 1981. The use of restriction endonucleases to measure mitochondrial DNA sequence relatedness in natural populations. J. Mol. Evol. 17:214-226.
Lansman, R. A., J. C. Avise, C. F. Aquadro, J. F. Shapira and S. W. Daniel. 1983. Extensive genetic variation in mitochondrial DNAs among geographic populations of the deer mouse, Peromyscus maniculatus. Evolution 37:1-16.
Lanyon, S. 1985. Detecting internal inconsistencies in distance data. Syst. Zool. 34:397-403.
Lanyon, S. 1993. Phylogenetic frameworks: Towards a firmer foundation for the comparative approach. Biol. J. Linnean Soc. 49:45-61.
Lapointe, F.-J. and P. Legendre. 1992. A statistical framework to test the consensus among additive trees (cladograms). Syst. Biol. 41:158-171.
Larson, A. 1989. The relationship between speciation and morphological evolution, pp. 579-598. In D. Otte and J. A. Endler (eds.), Speciation and Its Consequences. Sinauer, Sunderland, Massachusetts.
Larson, A. 1991a. Evolutionary analysis of length variable sequences: Divergent domains of ribosomal RNA. Pp. 221-248. In M. M. Miyamoto and J. Cracraft (eds.), Phylogenetic Analysis of DNA Sequence Data. Oxford University Press, New York.
Larson, A. 1991b. A molecular perspective on the evolutionary relationships of the salamander families. Evol. Biol. 25:211-277.
Larson, A. and W. W. Dimmick. 1993. Phylogenetic relationships of the salamander families: An analysis of congruence among morphological and molecular characters. Herpetol. Monog. 7:77-93.
Larson, A. and A. C. Wilson. 1989. Patterns of ribosomal RNA evolution in salamanders. Mol. Biol. Evol. 6:131-154.
Larson, A., D. B. Wake and K. Yanev. 1984. Measuring gene flow among populations having high levels of genetic fragmentation. Genetics 106:293-308.
Larson, A., M. M. Kirk and D. L. Kirk. 1992. Molecular phylogeny of the volvocine flagellates. Mol. Biol. Evol. 9:85-105.
Laurie-Ahlberg, C. C. and B. S. Weir. 1979. Allozymic variation and linkage disequilibrium in some laboratory populations of Drosophila melanogaster. Genetics 92:1295-1314.
Lavery, S., C. Moritz and D. R. Fielder. 1995. Changing patterns of population structure and gene flow at different spatial scales in the coconut crab (Birgus latro). Heredity 74:531-541.
Lavery, S., C. Moritz and D. R. Fielder. 1996a. The effects of scale on the population structure of the coconut crab (Birgus latro). (unpublished manuscript)
Lavery, S., C. Moritz and D. R. Fielder. 1996b. Genetic patterns suggest exponential population growth in a declining species. (unpublished manuscript)
Lavin, M., J. J. Doyle and J. D. Palmer. 1990. Evolutionary significance of the chloroplast DNA inverted repeat in the Leguminosae subfamily Papilionidae. Evolution 44:390-402.
Lawrence, C. B. 1990. Use of homology domains in sequence similarity detection. Meth. Enzymol. 183:133-145.
Lawrence, J. G., D. E. Dykhuizen, R. F. DuBose and D. L. Hartl. 1989. Phylogenetic analysis using insertion sequence fingerprinting in Escherichia coli. Mol. Biol. Evol. 6:1-24.
Lawyer, F. C., S. Stoffel, R. K. Saiki, K. Myambo, R. Drummond and D. H. Gelfand. 1989. Isolation, characterization, and expression in Escherichia coli of the DNA polymerase gene from Thermus aquaticus. J. Biol. Chem. 264:6427-6437.
Learn, G. W., Jr. and B. A. Schaal. 1987. Population subdivision for ribosomal DNA repeat variants in Clematis fremontii. Evolution 41:433-438.
Leary, J. J., D. J. Brigati and D. C. Ward. 1983. Rapid and sensitive colorimetric method for visualizing biotin-labeled DNA probes hybridized to DNA or RNA immobilized on nitrocellulose: Bioblots. Proc. Natl. Acad. Sci. USA 80:4045-4049.
Leary, R. F., F. W. Allendorf and K. L. Knudsen. 1984. Major morphological effects of a regulatory gene Pgm1-t in rainbow trout. Mol. Biol. Evol. 1:183-194.
Leberg, P. L. 1992. Effects of population bottlenecks on genetic diversity as measured by allozyme electrophoresis. Evolution 46:477-494.
Lebherz, H. G. 1983. On epigenetically generated isozymes ("pseudo isozymes") and their possible biological relevance, pp. 203-218. In M. C. Rattazzi, J. G. Scandalios and G. S. Whitt (eds.), Isozymes: Current Topics in Biological and Medical Research, Vol. 7. Molecular Structure and Regulation. A. R. Liss, New York.
Lechner, K., G. Wich and A. Bock. 1985. The nucleotide sequence of the 16S rRNA gene and flanking regions from Methanobacterium formicicum: The phylogenetic relationship between methanogenic and halophilic Archaebacteria. Syst. Appl. Microbiol. 6:157-163.
Lecointre, G., H. Philippe, H. L. V. Lê and H. L. Guyader. 1993. Species sampling has a major impact on phylogenetic inference. Mol. Phylogenet. Evol. 2:205-224.
Lee, M. R. and F. F. B. Elder. 1980. Yeast stimulation of bone marrow mitosis for cytogenetic investigations. Cytogenet. Cell Genet. 26:36-40.
Lee, S. B. and J. W. Taylor. 1992. Phylogeny of five fungus-like protoctistan Phytophthora species, inferred from the internal transcribed spacers of ribosomal DNA. Mol. Biol. Evol. 9:636-653.
Leffers, H., J. Kjems, L. Ostergaard, N. Larsen and R. A. Garrett. 1987. Evolutionary relationships amongst Archaebacteria. A comparative study of 23S ribosomal RNAs of a sulphur-dependent extreme thermophile, an extreme halophile and a thermophilic methanogen. J. Mol. Biol. 195:43-61.
Leipe, D. D., J. M. Gunderson, T. A. Nerad and M. L. Sogin. 1993. Small subunit RNA+ of Hexamita inflata and the quest for the first branch in the eukaryotic tree. Mol. Biochem. Parasitol. 59:41-48.
Lento, G. M., R. E. Hickson, G. K. Chambers and D. Penny. 1995. Use of spectral analysis to test hypotheses on the origin of pinnipeds. Mol. Biol. Evol. 12:28-52.
Leone, C. A. 1964. Taxonomic Biochemistry and Serology. Ronald Press, New York.
Leone, C. A. 1968. The immunotaxonomic literature: The animal kingdom. Serol. Mus. Bull. 39:1-28.
LeQuesne, W. J. 1982. Compatibility analysis and its applications. Zool. J. Linnean Soc. 74:267-275.
Lessa, E. P. 1990. Multidimensional analysis of geographic genetic structure. Syst. Zool. 39:242-252.
Lessa, E. P. 1992. Rapid surveying of DNA sequence variation in natural populations. Mol. Biol. Evol. 9:323-330.
Lessa, E. P. 1993. Analysis of DNA sequence variation at population level by polymerase chain reaction and denaturing gradient gel electrophoresis. Meth. Enzymol. 224:419-428.
Lessa, E. P. and C. Applebaum. 1993. Screening techniques for detecting allelic variation in DNA sequences. Mol. Ecol. 2:119-129.
Leu, S., J. Schlesinger, A. Michaels and N. Shavit. 1992. Complete DNA sequence of the Chlamydomonas reinhardtii chloroplast atpA gene. Plant Mol. Biol. 18:613-616.
Levan, A., D. Fredga and A. A. Sandberg. 1964. Nomenclature for centromeric position on chromosomes. Hereditas 52:201-220.
Levin, D. A. 1981. Dispersal versus gene flow in plants. Ann. Missouri Bot. Gard. 68:233-253.
Levitan, D. R. and R. K. Grosberg. 1993. The analysis of paternity and maternity in the marine hydrozoan Hydractinia symbiolongicarpus using randomly amplified polymorphic DNA (RAPD) markers. Mol. Ecol. 2:315-328.
Leviton, A. E., R. H. Gibbs, Jr., E. H. Heal and C. E. Dawson. 1985. Standards in herpetology and ichthyology. Part I: Standard symbolic codes for institutional resource collections in herpetology and ichthyology. Copeia 1985:802-832.
Lewin, B. M. 1987. Genes III. Wiley, New York.
Lewin, R. 1988. Conflict over DNA clock results. Science 241:1598-1600.
Lewis, P., J. P. Huelsenbeck and D. L. Swofford. 1996. Maximum likelihood. In D. L. Swofford, PAUP: version 4.0. Sinauer Associates, Sunderland, Massachusetts.
Lewis, P. O. and A. A. Snow. 1992. Deterministic paternity exclusion using RAPD markers. Mol. Ecol. 1:155-160.
Lewontin, R. C. 1974. The Genetic Basis of Evolutionary Change. Columbia University Press, New York.
Lewontin, R. C. 1986. Population genetics. Annu. Rev. Genet. 19:81-102.
Lewontin, R. C. and C. C. Cockerham. 1959. The goodness-of-fit test for detecting natural selection in random mating populations. Evolution 13:561-564.
Lewontin, R. C. and D. L. Hartl. 1991. Population genetics in forensic DNA typing. Science 254:1745-1750.
Lewontin, R. C. and J. Hubby. 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amounts of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54:595-609.
Li, C. C. 1988. Pseudo-random mating. In celebration of the 80th anniversary of the Hardy-Weinberg law. Genetics 119:731-737.
Li, W.-H. 1980. Rate of gene silencing at duplicate loci: A theoretical study and interpretation of data from tetraploid fishes. Genetics 95:237-258.
Li, W.-H. 1981. A simple method for constructing phylogenetic trees from distance matrices. Proc. Natl. Acad. Sci. USA 78:1085-1089.
Li, W.-H. 1986. Evolutionary change of restriction cleavage sites and phylogenetic inference. Genetics 113:187-213.
Li, W.-H. 1993a. So, what about the molecular clock hypothesis? Curr. Opin. Genet. Dev. 3:896-901.
Li, W.-H. 1993b. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.
Li, W.-H. and M. Gouy. 1991. Statistical methods for testing phylogenies, pp. 249-277. In M. M. Miyamoto and J. Cracraft (eds.), Phylogenetic Analysis of DNA Sequences. Oxford University Press, New York.
Li, W.-H. and D. Graur. 1991. Fundamentals of Molecular Evolution. Sinauer, Sunderland, Massachusetts.
Li, W.-H. and L. A. Sadler. 1991. Low nucleotide diversity in man. Genetics 129:513-523.
Li, W.-H. and M. Tanimura. 1987. The molecular clock runs more slowly in man than in apes and monkeys. Nature 326:93-96.
Li, W.-H. and A. Zharkikh. 1995. Statistical tests of DNA phylogenies. Syst. Biol. 44:49-63.
Li, W.-H., C.-I. Wu and C.-C. Luo. 1984. Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J. Mol. Evol. 21:58-71.
Li, W.-H., C.-C. Luo and C.-I. Wu. 1985a. Evolution of DNA sequences, pp. 1-130. In R. MacIntyre (ed.), Molecular Evolutionary Genetics. Plenum, New York.
Li, W.-H., C.-I. Wu and C.-C. Luo. 1985b. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-171.
Li, W.-H., M. Tanimura and P. M. Sharp. 1987a. An evaluation of the molecular clock hypothesis using mammalian DNA sequences. J. Mol. Evol. 25:330-342.
Li, W.-H., K. H. Wolfe, J. Sourdis and P. M. Sharp. 1987b. Reconstruction of phylogenetic trees and estimation of divergence times under nonconstant rates of evolution. Cold Spring Harbor Symp. Quant. Biol. 52:847-856.
Libby, R. L. 1938. The photronreflectometer - an instrument for the measurement of turbid systems. J. Immunol. 34:71-73.
Lichter, P. and D. C. Ward. 1990. Is non-isotopic in-situ hybridization finally coming of age? Nature 345:93-94.
Lin, C. C., G. Shipmann, W. A. Kittrell and S. Ohno. 1969. The predominance of heterozygotes found in wild goldfish of Lake Erie at the gene locus for sorbitol dehydrogenase. Biochem. Genet. 3:603-607.
Lindahl, T. 1993. Instability and decay of the primary structure of DNA. Nature 362:709-715.
Linnaeus, C. 1758. Systema Naturae. 10th ed. Stockholm.
Lint, D., J. Clayton, L. Postma and R. Lillie. 1988. Evolution of cetaceans. A serum albumin immunological and biochemical perspective. (Abst 1133.21.36). XVI International Congress of Genetics, Toronto.
Lipman, D. J. and W. R. Pearson. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441.
Lipman, D. J., W. J. Wilbur, T. F. Smith and M. S. Waterman. 1984. On the statistical significance of nucleic acid similarities. Nucl. Acids Res. 12:215-226.
Liston, A. 1992. Variation in the chloroplast genes rpoC1 and rpoC2 of the genus Astragalus (Fabaceae); evidence from restriction site mapping of a PCR amplified product. Am. J. Bot. 79:953-961.
Litt, M. and J. A. Luty. 1989. A hypervariable microsatellite revealed by in vitro amplification of dinucleotide repeat within the cardiac muscle actin gene. Am. J. Human Genet. 44:397-401.
Liu, Z.-G. and G. R. Furnier. 1993. Comparison of allozyme, RFLP, and RAPD markers for revealing genetic variation within and between trembling aspen and bigtooth aspen. Theor. Appl. Genet. 87:97-105.
Liu, Z.-G. and L. M. Schwartz. 1992. An efficient method for blunt-end ligation of PCR products. BioTechniques 12:28-30.
Lockhart, P. J., M. A. Steel, M. D. Hendy and D. Penny. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605-612.
Lockhart, P. J., A. W. Larkum, M. A. Steel, P. J. Waddell and D. Penny. 1995a. Evolution of chlorophyll and bacteriochlorophyll: The problem of invariant sites in sequence analysis. Proc. Natl. Acad. Sci. USA (in press).
Lockhart, P. J., D. Penny and A. Meyer. 1995b. Testing the phylogeny of swordtail fishes using split decomposition and spectral analysis. J. Mol. Evol. 41:666-674.
Loeb, L. A. and B. D. Preston. 1986. Mutagenesis by apurinic/apyrimidinic sites. Annu. Rev. Genet. 20:201-230.
Loh, E. Y., J. F. Elliott, S. Cwirla, L. L. Lanier and M. M. Davis. 1989. Polymerase chain reaction with single-stranded specificity: Analysis of T cell receptor δ chain. Science 243:217-220.
Long, E. H. and I. B. Dawid. 1980. Repeated genes in eukaryotes. Annu. Rev. Biochem. 49:727-764.
Loomis, W. F. 1988. Four Billion Years: An Essay on the Evolution of Genes and Organisms. Sinauer, Sunderland, Massachusetts.
Losos, J. 1994. An approach to the analysis of comparative data when a phylogeny is unavailable or incomplete. Syst. Biol. 43:117-123.
Lopez, J. V., N. Yuhki, R. Masuda, W. Modi and S. J. O'Brien. 1994. Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. J. Mol. Evol. 39:174-191.
Lowenstein, J. M. 1985a. Molecular approaches to the identification of species. Amer. Sci. 73:541-547.
Lowenstein, J. M. 1985b. Radioimmune assay of mammoth tissue. Acta Zool. Fennica 170:233-235.
Lowenstein, J. M. and O. A. Ryder. 1985. Immunological systematics of the extinct quagga (Equidae). Experientia 41:1192-1193.
Lowenstein, J. M., V. M. Sarich and B. J. Richardson. 1981. Albumin systematics of the extinct mammoth and Tasmanian wolf. Nature 291:409-411.
Luke, S. and R. S. Verma. 1993. The genomic synteny at DNA level between human and chimpanzee chromosomes. Chromosome Res. 1:215-219.
Lumb, W. V. and E. W. Jones. 1984. Veterinary Anesthesia. 2nd ed. Lea and Febiger, Philadelphia.
Lundberg, J. G. 1972. Wagner networks and ancestors. Syst. Zool. 21:398-413.
Lundrigan, B. L. and P. K. Tucker. 1994. Tracing paternal ancestry in mice, using the Y-linked, sex-determining locus, Sry. Mol. Biol. Evol. 11:483-492.
Lynch, M. 1988. Estimation of relatedness by DNA fingerprinting. Mol. Biol. Evol. 5:584-599.
Lynch, M. 1990. The similarity index and DNA fingerprinting. Mol. Biol. Evol. 7:478-484.
Lynch, M. 1991a. Analysis of population genetic structure by DNA fingerprinting, pp. 113-126. In T. Burke, G. Dolf, A. J. Jeffreys and R. Wolff (eds.), DNA Fingerprinting: Approaches and Applications. Birkhauser, Boston.
Lynch, M. 1991b. Methods for the analysis of comparative data in evolutionary biology. Evolution 45:1065-1080.
Lynch, M. and T. J. Crease. 1990. The analysis of popu- Maddison, D. R. 1990. Phylogenetic inference of his-
lation survey data on DNA sequence variation. torical pathways and models of evolutionary
Mol. Biol. Evol. 7:377-394. change. Ph.D. dissertation, Harvard University.
Lynch, M. and P. E. Jarrell. 1993. A method for calibrat- Maddison, D. R., M. Ruvolo and D. L. Swofford. 1992.
ing molecular clocks and its application to animal Geographic origins of human mitochondrial
mitochondrial DNA. Genetics 135:1197-1208. DNA: Phylogenetic evidence from control region
Lynch, M. and B. G. Milligan. 1994. Analysis of popu- sequences. Syst. Biol. 41:111-124.
lation genetic structure with RAPD markers. Mol. Maddison, W. P. 1989. Reconstructing character evolu-
Ecol. 3:91-100. tion on polytomous cladograms. Cladistics
Mabee, P. M. 1989. Assumptions underlying the use of 5:365-377.
ontogenetic sequences for determining character- Maddison, W. 1990. A method for testing the corre-
state order. Trans. Am. Fish. Soc. 118:151-158. lated evolution of two binary characters: Are
Mabee, P. M. 1993. Phylogenetic interpretation of gains or losses concentrated on certain branches of
ontogenetic change: Sorting out the actual and a phylogenetic tree? Evolution 44:539-557.
artefactual in an empirical case study of centrar- Maddison, W. P. 1991. Squared-change parsimony
chid fishes. Zool. J. Linnean Soc. 107:175-291. reconstructions of ancestral states for continuous
Mabee, P. M. and J. Humphries. 1993. Coding poly- valued characters on a phylogenetic tree. Syst.
morphic data: Examples from allozymes and Zool. 40:304-314.
ontogeny. Syst. Biol. 42:166-181. Maddison, W. P. and D. R. Maddison. 1992. MacClade,
MacArthur, R. H. and E. O. Wilson. 1963. An equilibri- version 3.0. Sinauer, Sunderland, Massachusetts.
um theory of insular zoogeography. Evolution Maddison, W. P., M. J. Donoghue and D. R. Maddison.
17:373-387. 1984. Outgroup analysis and parsimony. Syst.
MacArthur, R. H. and E. O. Wilson. 1967. The Theory of Zool. 33:83-103.
Island Biogeography. Princeton University Press, Maeda, N., C.-I. Wu, J. Bliska and J. Reneke. 1988.
Princeton. Molecular evolution of intergenic DNA in higher
Macgregor, H. C. 1993. An Introduction to Animal primates: Pattern of DNA changes, molecular
Cytogenetics. Chapman and Hall, London. clock, and evolution of repetitive sequences. Mol.
Macgregor, H. C. and S. K. Sessions. 1986. The biologi- Biol. Evol. 5:1-20.
cal significance of variation in satellite DNA and Mailer, R. J., R. Scarth and B. Fritensky. 1994.
heterochromatin in newts of the genus Triturus: Discrimination among cultivars rapeseed
An evolutionary perspective. Phil. Trans. Roy. (Brassica napus L.) using DNA polymorphisms
Soc. London B312:243-259. amplified from arbitrary primers. Theor. Appl.
Macgregor, H. C. and S. Sherwood. 1979. The nucleo- Genet. 87:697-704.
lus organizers of Plethodon and Aneides located by Maiste, P. J. 1993. Comparison of statistical tests for
in situ nucleic acid hybridization with Xenopus independence at genetic loci with many alleles.
3H-ribosomalW A . Chromosoma 72271-250. Ph.D. dissertation, North Carolina State
Macgregor H. C. and J. Varley. 1983. Working with University, Raleigh.
Animal Chromosomes. John Wiley and Sons, New Malcolm, S., J. K. Cowell and B. D. Young. 1986.
York. Specialist techniques in research and diagnostic
Macgregor, H. C., S. K. Sessions and J. W. Arntzen. clinical cytogenetics, pp. 197-226. In D. E. Rooney
1990. An integrative anaiysis of phylogenetic rela- and B. H. Czepulkowski (eds.), Human
tionships among newts of the genus Triturus Cytogenetics. IRLPress, Oxford.
(family Salamandridae), using comparative bio- Maldonado, I. E. 1992. Problems in the identification
chemistry, cytogenetics, and reproductive interac- of XDH in vertebrates. Isozyme Bull. 25:72
tions. J. Evol. Biol. 3329-273. Manchenko, G. P. 1988. Subunit structure of enzymes:
MacIntyre, R. J. 1976. Evolution and ecoiogical value Allozymic data. Isozyme Bull. 21344-158.
of duplicate genes. Annu, Rev. Ecol. Syst. Manchenko, G. 1.' 1994. Handbook of Detection of
7:421468. Enzymes on Electrophoretic Gels. C.R.C. Press, Ann
MacIntyre, R. J. (ed.) 1985.Molecular Evolutionary Arbor.
Genetics. Plenum, New York. Mancmo, G., lvr. liagghianti and S. Bucci-lnnocenti.
MacIntyre, R. j., M. X. Dean and G. Batt. 1978. 1977. Cytotaxonomy and cytogenetlcs in
Evolution of acid phosphatase-1 in the genus European newt species, pp. 411-447. In 3.EI.
Drosopizila. Immunological studies. J. Mol. Evol. Taylor and S. I. Guthnan (eds.), The Reproductive
12:121-142. Biology of Amphibians. Plenum, New York.
Maniatis, T., E. F. Fritsch and J. Sambrook. 1982. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Publications, Cold Spring Harbor, New York.
Manly, B. F. J. 1991. Randomization and Monte Carlo Methods in Biology. Chapman and Hall, New York.
Mann, C. 1990. Meta-analysis in the breech. Science 249:476-480.
Manos, P. S., K. C. Nixon and J. J. Doyle. 1993. Cladistic analysis of restriction site variation within the chloroplast DNA inverted repeat region of selected Hamamelididae. Syst. Bot. 18:551-562.
Manuelidis, L., P. R. Langer-Safer and D. C. Ward. 1982. High-resolution mapping of satellite DNA using biotin-labeled DNA probes. J. Cell Biol. 95:619-625.
Mao, S.-H. and B.-Y. Chen. 1982. Serological relationships of turtles and evolutionary implications. Comp. Biochem. Physiol. 71B:173-179.
Mao, S.-H., B.-Y. Chen, F.-Y. Yin and Y.-W. Guo. 1983. Immunotaxonomic relationships of sea snakes to terrestrial snakes. Comp. Biochem. Physiol. 74A:869-872.
Mao, S.-H., W. Frair, F.-Y. Yin and Y.-W. Guo. 1987. Relationships of some Cryptodiran turtles as suggested by immunological cross-reactivity of serum albumins. Biochem. Syst. Ecol. 15:621-624.
Marchant, A. D., M. L. Arnold and P. Wilkinson. 1988. Gene flow across a chromosomal tension zone. I. Relicts of ancient hybridization. Heredity 61:321-328.
Marchuk, D., M. Drumm, A. Saulino and F. S. Collins. 1991. Construction of T-vectors, a rapid and general system for direct cloning of unmodified PCR products. Nucl. Acids Res. 19:1154.
Markert, C. L. 1983. Isozymes: Conceptual history and biological significance, pp. 1-17. In M. C. Rattazzi, J. G. Scandalios and G. S. Whitt (eds.), Isozymes: Current Topics in Biological and Medical Research, Vol. 7. Molecular Structure and Regulation. A. R. Liss, New York.
Markert, C. L. and F. Moller. 1959. Multiple forms of enzymes: Tissue, ontogenetic, and species-specific patterns. Proc. Natl. Acad. Sci. USA 45:753-763.
Markert, C. L., J. B. Shaklee and G. S. Whitt. 1975. Evolution of a gene. Science 189:102-114.
Marklund, S., H. Ellegren, S. Eriksson, K. Sandberg and L. Andersson. 1994. Parentage testing and linkage analysis in the horse using a set of highly polymorphic microsatellites. Anim. Genet. 25:19-23.
Markowitz, E. 1970. Estimation and testing goodness-of-fit for some models of codon fixation variability. Biochem. Genet. 4:595-601.
Marsden, J. E. and B. May. 1984. Feather pulp: A non-destructive sampling technique for electrophoretic studies of birds. Auk 101:173-175.
Marsh, T. L., C. I. Reich, R. B. Whitelock and G. J. Olsen. 1994. Transcription factor IID in the Archaea: Sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes. Proc. Natl. Acad. Sci. USA 91:4180-4184.
Marshall, C. R. 1990. The fossil record and estimating divergence times between lineages: Maximum divergence times and the importance of reliable phylogenies. J. Mol. Evol. 30:400-408.
Marshall, C. R. 1992. Character analysis and the integration of molecular and morphological data in an understanding of sand dollar phylogeny. Mol. Biol. Evol. 9:309-322.
Marshall, C. R. and H. Swift. 1992. DNA-DNA hybridization phylogeny of sand dollars and highly reproducible extent of hybridization values. J. Mol. Evol. 34:31-44.
Martin, A. P. and S. R. Palumbi. 1993a. Protein evolution in different cellular environments: Cytochrome b in sharks and mammals. Mol. Biol. Evol. 10:873-891.
Martin, A. P. and S. R. Palumbi. 1993b. Body size, metabolic rate, generation time and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Martin, A. P., R. Humphreys and S. R. Palumbi. 1992a. Population genetic structure of the armorhead, Pseudopentaceros wheeleri, in the North Pacific ocean: Application of the polymerase chain reaction to fisheries problems. Can. J. Fish. Aquat. Sci. 49:2386-2391.
Martin, A. P., G. J. P. Naylor and S. R. Palumbi. 1992b. Rates of mitochondrial DNA evolution in sharks are slow compared to mammals. Nature 357:153-155.
Martins, E. P. and T. Garland, Jr. 1991. Phylogenetic analyses of the correlated evolution of continuous characters: A simulation study. Evolution 45:534-557.
Martinson, H. G. 1973. The nucleic acid-hydroxyapatite interaction. II. Phase transitions in the deoxyribonucleic acid-hydroxyapatite system. Biochemistry 12:145-150.
Mason, I. J. 1992. Rapid and direct sequencing of DNA from bacteriophage plaques using sequential linear and asymmetric PCR. BioTechniques 12:60-61.
Massaro, E. J. and C. L. Markert. 1968. Protein staining on starch gels. J. Histochem. Cytochem. 16:380-382.
Matson, R. H. 1984. Applications of electrophoretic data in avian systematics. Auk 101:717-729.
Matson, R. H. 1989. Avian peptidase isozymes: Tissue distributions, substrate affinities, and assignment of homology. Biochem. Genet. 27:137-151.
Maurer, R. R. 1978. Freezing mammalian embryos: A review of techniques. Theriogenology 9:45-68.
Maxam, A. M. and W. Gilbert. 1977. A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74:560-564.
Maxam, A. M. and W. Gilbert. 1980. Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65:499-559.
Maxson, L. R. 1981. Albumin evolution and its phylogenetic implications in toads of the genus Bufo. II. Relationships among Eurasian Bufo. Copeia 1981:579-583.
Maxson, L. R. 1984. Molecular probes of phylogeny and biogeography in toads of the widespread genus Bufo. Mol. Biol. Evol. 1:345-356.
Maxson, L. R. and C. H. Daugherty. 1980. Evolutionary relationships of the monotypic toad family Rhinophrynidae: A biochemical perspective. Herpetologica 36:275-280.
Maxson, L. R. and R. D. Maxson. 1990. Proteins II: Immunological techniques, pp. 127-155. In D. M. Hillis and C. Moritz (eds.), Molecular Systematics. Sinauer, Sunderland, Massachusetts.
Maxson, L. R. and J. D. Roberts. 1985. An immunological analysis of the phylogenetic relationships between two enigmatic frogs, Myobatrachus and Arenophryne. J. Zool. (London) 207:289-300.
Maxson, L. R. and J. M. Szymura. 1984. Relationships among discoglossid frogs: An albumin perspective. Amphibia-Reptilia 5:245-252.
Maxson, L. R. and A. C. Wilson. 1975. Albumin evolution and organismal evolution in tree frogs (Hylidae). Syst. Zool. 24:1-15.
Maxson, L. R., R. Highton and D. B. Wake. 1979. Albumin evolution and its phylogenetic implications in the plethodontid salamander genera Plethodon and Ensatina. Copeia 1979:502-508.
Maxson, L. R., L. S. Ellis and A.-R. Song. 1981. Quantitative immunological studies of the albumins of North American squirrels, family Sciuridae. Comp. Biochem. Physiol. 68B:397-400.
Maxson, R. D. and L. R. Maxson. 1986. Micro-complement fixation: A quantitative estimator of protein evolution. Mol. Biol. Evol. 3:375-388.
May, C. A., J. H. Wetton, P. E. Davis, J. F. Brookfield and D. T. Parkin. 1993. Single-locus profiling reveals loss of variation in inbred populations of the red kite (Milvus milvus). Proc. Roy. Soc. Lond. B 251:165-170.
Mayden, R. L. 1986. Speciose and depauperate phylads and tests of punctuated and gradual evolution: Fact or artifact? Syst. Zool. 35:591-602.
Mayden, R. L. (ed.). 1992. Systematics, Historical Ecology, and North American Freshwater Fishes. Stanford University Press, Stanford, California.
Mayr, E. 1942. Systematics and the Origin of Species. Reprinted 1982, Columbia University Press, New York.
Mayr, E. 1983. The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Harvard University Press, Cambridge, Massachusetts.
Mazur, P. 1970. Cryobiology: The freezing of biological systems. Science 168:939-949.
McBee, K., R. J. Baker and R. L. Honeycutt. 1987. Observations on rates of DNA degradation. Abstr. 87, Ann. Meet., Amer. Soc. Mammalogists, Albuquerque, New Mexico.
McClenaghan, L. R., Jr., M. H. Smith and M. W. Smith. 1985. Biochemical genetics of mosquitofish. IV. Changes of allele frequencies through time and space. Evolution 39:451-460.
McCouch, S. R., G. Kochert, Z. H. Yu, Z. Y. Wang, G. S. Khush, W. R. Coffman and S. D. Tanksley. 1988. Molecular mapping of rice chromosomes. Theor. Appl. Genet. 76:815-829.
McCracken, G. F. and J. W. Bradbury. 1977. Paternity and genetic heterogeneity in the polygynous bat, Phyllostomus hastatus. Science 198:303-306.
McDonald, H. S. 1976. Methods for the physiological study of reptiles, pp. 19-126. In C. Gans and W. R. Dawson (eds.), Biology of the Reptilia, Vol. 5. Academic Press, New York.
McDonald, J. H. 1989. Selection component analysis of the Mpi locus in the amphipod Platorchestia platensis. Heredity 62:243-249.
McDonald, J. H. and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.
McDonald, J. H. and J. F. Siebenaller. 1989. Similar geographic variation at the LAP locus in the mussels Mytilus trossulus and M. edulis. Evolution 43:228-231.
McDonell, M. W., M. N. Simon and F. W. Studier. 1977. Analysis of restriction fragments of T7 DNA and determination of molecular weights by electrophoresis in neutral and alkaline gels. J. Mol. Biol. 110:119-146.
McGovern, M. and C. R. Tracy. 1981. Phenotypic variation in electromorphs previously considered to be genetic markers in Microtus ochrogaster. Oecologia 51:276-280.
McInnes, J. L., P. D. Vize, N. Habili and R. H. Symons. 1987. Chemical biotinylation of nucleic acids with photobiotin and their use as hybridization probes. Focus (BRL) 9:1-4.
McKusick, V. A. 1988. The Morbid Anatomy of the Human Genome. Howard Hughes Medical Institute.
McLellan, T. 1984. Molecular charge and electrophoretic mobility in cetacean myoglobins of known sequence. Biochem. Genet. 22:181-200.
McLellan, T. and L. S. Inouye. 1986. The sensitivity of isoelectric focusing and electrophoresis in the detection of sequence differences in proteins. Biochem. Genet. 24:571-577.
McLennan, D. A. 1991. Integrating phylogeny and experimental ethology: From pattern to process. Evolution 45:1773-1789.
McLennan, D. A., D. R. Brooks and J. D. McPhail. 1988. The benefits of communication between comparative ethology and phylogenetic systematics: A case study using gasterosteid fishes. Can. J. Zool. 66:2177-2190.
McPheron, B. A., D. C. Smith and S. H. Berlocher. 1988. Genetic differences between host races of Rhagoletis pomonella. Nature 336:64-66.
McWright, C. G., J. J. Kearney and J. L. Mudd. 1975. Effect of environmental factors on starch gel electrophoretic patterns of human erythrocyte acid phosphatase, pp. 151-161. In G. Davis (ed.), Forensic Science. Amer. Chem. Soc. Symp. Ser. 13, ACS, Washington, D.C.
Meagher, S. and T. E. Dowling. 1991. Hybridization between the cyprinid fishes Luxilus albeolus, L. cornutus, and L. cerasinus with comments on the proposed hybrid origin of L. albeolus. Copeia 1991:979-991.
Melchior, W. B. and P. H. Von Hippel. 1973. Alteration of the relative stability of dA-dT and dG-dC base pairs in DNA. Proc. Natl. Acad. Sci. USA 70:298-302.
Mellor, J. D. 1978. Fundamentals of Freeze-Drying. Academic Press, New York.
Menken, S. B. J. 1987. Is the extremely low heterozygosity level in Yponomeuta rorellus caused by bottlenecks? Evolution 41:630-637.
Merritt, R. B., J. F. Rogers and B. J. Kurz. 1978. Genetic variability in the longnose dace, Rhinichthys cataractae. Evolution 32:116-124.
Meyer, A. and A. C. Wilson. 1990. Origin of tetrapods inferred from their mitochondrial DNA affiliation to lungfish. J. Mol. Evol. 31:359-364.
Meyerowitz, E. M. and C. H. Martin. 1984. Adjacent chromosomal regions can evolve at very different rates: Evolution of the Drosophila 68C glue gene region. J. Mol. Evol. 20:251-264.
Micales, J. A., M. R. Bonde and G. L. Peterson. 1986. The use of isozyme analysis in fungal taxonomy and genetics. Mycotaxon 27:405-449.
Michelmore, R. W., I. Paran and R. V. Kesseli. 1991. Identification of markers linked to disease-resistance genes by bulked segregant analysis: A rapid method to detect markers in specific genomic regions by using segregating populations. Proc. Natl. Acad. Sci. USA 88:9828-9832.
Mickevich, M. F. and M. S. Johnson. 1976. Congruence between morphological and allozyme data in evolutionary inference and character evolution. Syst. Zool. 25:260-270.
Milinkovitch, M. C. 1995. Molecular phylogeny of cetaceans prompts revision of morphological transformations. Trends Ecol. Evol. 10:328-334.
Miller, A. J. 1990. Subset Selection in Regression. Chapman and Hall, London.
Miller, H. 1987. Practical aspects of preparing phage and plasmid DNA: Growth, maintenance, and storage of bacteria and bacteriophage. Meth. Enzymol. 152:145-170.
Miller, J. C. and S. D. Tanksley. 1990a. Effect of different restriction enzymes, probe source, and probe length on detecting restriction fragment length polymorphism in tomato. Theor. Appl. Genet. 80:385-389.
Miller, J. C. and S. D. Tanksley. 1990b. RFLP analysis of phylogenetic relationships and genetic variation in the genus Lycopersicon. Theor. Appl. Genet. 80:437-488.
Miller, R. G. 1974. The jackknife: A review. Biometrika 61:1-15.
Milligan, B. G. 1992. Is organelle DNA strictly maternally inherited? Power analysis of a binomial distribution. Am. J. Bot. 79:1325-1328.
Milligan, B. G. and C. K. McMurray. 1993. Dominant versus codominant markers in the estimation of male mating success. Mol. Ecol. 2:275-284.
Mindell, D. P. and R. L. Honeycutt. 1990. Ribosomal RNA in vertebrates: Evolution and phylogenetic implications. Annu. Rev. Ecol. Syst. 21:541-566.
Mindell, D. P., J. W. Sites, Jr. and D. Graur. 1989. Speciational evolution: A phylogenetic test with allozymes in Sceloporus (Reptilia). Cladistics 5:49-61.
Mindell, D. P., J. W. Sites, Jr. and D. Graur. 1990. Assessing the relationship between speciation and evolutionary change. Cladistics 6:393-398.
Mindell, D. P., C. W. Dick and R. J. Baker. 1991. Phylogenetic relationships among megabats, microbats, and primates. Proc. Natl. Acad. Sci. USA 88:10322-10326.
Minton, S. A. and S. K. Salanitro. 1972. Serological relationships among some colubrid snakes. Copeia 1972:246-252.
Mitchell, L. G. and C. R. Merril. 1989. Affinity generation of single-stranded DNA for dideoxy sequencing following the polymerase chain reaction. Analyt. Biochem. 178:239-242.
Miyamoto, M. M. 1981. Congruence among character sets in phylogenetic studies of the frog genus Leptodactylus. Syst. Zool. 30:281-290.
Miyamoto, M. M. 1983. Biochemical variation in Eleutherodactylus bransfordii: Geographic patterns and cryptic species. Syst. Zool. 32:43-51.
Miyamoto, M. M. 1985. Consensus cladograms and general classifications. Cladistics 1:186-189.
Miyamoto, M. M. and S. M. Boyle. 1989. The potential importance of mitochondrial DNA sequence data to eutherian mammal phylogeny, pp. 437-450. In B. Fernholm, K. Bremer and H. Jornvall (eds.), The Hierarchy of Life. Elsevier, Amsterdam.
Miyamoto, M. M. and J. Cracraft. 1991. Phylogenetic inference, DNA sequence analysis, and the future of molecular systematics, pp. 3-17. In M. M. Miyamoto and J. Cracraft (eds.), Phylogenetic Analysis of DNA Sequences. Oxford University Press, New York.
Miyamoto, M. M. and W. M. Fitch. 1995. Testing species phylogenies and phylogenetic methods with congruence. Syst. Biol. 44:64-76.
Miyamoto, M. M., J. L. Slightom and M. Goodman. 1987. Phylogenetic relationships of humans and African apes as ascertained from DNA sequences (7.1 kilobase pairs) of the ψη-globin region. Science 238:369-373.
Miyamoto, M. M., F. Kraus and O. A. Ryder. 1990. Phylogeny and evolution of antlered deer determined from mitochondrial DNA sequences. Proc. Natl. Acad. Sci. USA 87:6127-6131.
Miyamoto, M. M., M. W. Allard, R. M. Adkins, L. L. Janecek and R. L. Honeycutt. 1994. A congruence test of reliability using linked mitochondrial DNA sequences. Syst. Biol. 43:236-249.
Mizusawa, S., S. Nishimura and F. Seela. 1986. Improvement of the dideoxy chain termination method of DNA sequencing by use of deoxy-7-deazaguanosine triphosphate in place of dGTP. Nucl. Acids Res. 14:1319-1324.
Moore, D. W. and T. L. Yates. 1983. Rate of protein inactivation in selected animals following death. J. Wildl. Manag. 47:1166-1169.
Moore, G. W. 1976. Proof for the maximum parsimony ("Red King") algorithm, pp. 117-137. In M. Goodman and R. E. Tashian (eds.), Molecular Anthropology. Plenum, New York.
Moore, S. S., L. L. Sargeant, T. J. King, J. S. Mattick, M. Georges and D. J. S. Hetzel. 1991. The conservation of dinucleotide microsatellites among mammalian genomes allows the use of heterologous PCR primer pairs in closely related species. Genomics 10:654-660.
Moore, W. S. 1995. Inferring phylogenies from mtDNA variation: Mitochondrial-gene trees versus nuclear-gene trees. Evolution 49:718-726.
Moran, P. and I. Kornfield. 1993. Retention of ancestral polymorphism in the mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029.
Morden, C. W. and S. S. Golden. 1989. psbA genes indicate common ancestry of prochlorophytes and chloroplasts. Nature 337:382-385.
Morescalchi, A. 1973. Amphibia, pp. 233-348. In A. B. Chiarelli and E. Capanna (eds.), Cytotaxonomy and Vertebrate Evolution. Academic Press, New York.
Morescalchi, A. 1975. Chromosome evolution in the caudate amphibia. Evol. Biol. 8:339-387.
Morgan, K. and C. Strobeck. 1979. Is intragenic recombination a factor in the maintenance of genetic variation in natural populations? Nature 277:383-384.
Morgante, M. and A. M. Olivieri. 1993. PCR-amplified microsatellites as markers in plant genetics. Plant J. 3:175-182.
Morin, P. A., J. J. Moore and D. S. Woodruff. 1992. Identification of chimpanzee subspecies with DNA from hair and allele-specific probes. Proc. Roy. Soc. London B 249:293-297.
Morin, P. A., J. J. Moore, R. Chakraborty, L. Jin, J. Goodall and D. S. Woodruff. 1994. Kin selection, social structure, gene flow, and the evolution of chimpanzees. Science 265:1193-1201.
Moritz, C. 1983. Parthenogenesis in the endemic Australian lizard Heteronotia binoei (Gekkonidae). Science 220:735-737.
Moritz, C. 1987. Parthenogenesis in the tropical gekkonid lizard, Nactus arnouxii (Sauria: Gekkonidae). Evolution 41:1252-1266.
Moritz, C. 1991a. Evolutionary dynamics of mitochondrial DNA duplications in parthenogenetic geckos, Heteronotia binoei. Genetics 129:221-230.
Moritz, C. 1991b. The origin and evolution of parthenogenesis in Heteronotia binoei (Gekkonidae): Evidence for recent and localized origins of widespread clones. Genetics 129:211-219.
Moritz, C. 1994. Applications of mitochondrial DNA analysis on conservation: A critical review. Mol. Ecol. 3:401-411.
Moritz, C. 1995. Uses of molecular phylogenies for conservation. Phil. Trans. Roy. Soc. London (in press).
Moritz, C. and W. M. Brown. 1986. Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science 233:1425-1427.
Moritz, C. and W. M. Brown. 1987. Tandem duplications in animal mitochondrial DNAs: Variation in incidence and gene content among lizards. Proc. Natl. Acad. Sci. USA 84:7183-7187.
Moritz, C. and A. Heideman. 1993. The origin and evolution of parthenogenesis in Heteronotia binoei (Gekkonidae): Reciprocal origins and diverse mitochondrial DNA in western populations. Syst. Biol. 42:293-306.
Moritz, C., T. E. Dowling and W. M. Brown. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.
Moritz, C., M. Adams, S. Donnellan and P. Baverstock. 1989a. The origins and evolution of parthenogenesis in Heteronotia binoei: Genetic diversity among bisexual populations. Copeia 1990:333-348.
Moritz, C., W. M. Brown, L. D. Densmore, J. W. Wright, D. Vyas, S. Donnellan, M. Adams and P. Baverstock. 1989b. Genetic diversity and the dynamics of hybrid parthenogenesis in Cnemidophorus (Teiidae) and Heteronotia (Gekkonidae), pp. 87-112. In R. M. Dawley and J. P. Bogart (eds.), The Biology of Unisexual Vertebrates. New York State Museum, Albany.
Moritz, C., S. Donnellan, M. Adams and P. R. Baverstock. 1989c. The origin and evolution of parthenogenesis in Heteronotia binoei (Gekkonidae): Extensive genotypic diversity among parthenogens. Evolution 43:994-1003.
Moritz, C., C. J. Schneider and D. B. Wake. 1992a. Evolutionary relationships within the Ensatina eschscholtzii complex confirm the ring species interpretation. Syst. Biol. 41:273-291.
Moritz, C., T. Uzzell, C. Spolsky, H. Hotz, I. Darevsky, L. Kupriyanova and F. Danielyan. 1992b. The maternal ancestry and approximate age of parthenogenetic species of Caucasian rock lizards (Lacerta: Lacertidae). Genetica 87:53-62.
Moritz, C., L. Joseph and M. Adams. 1993. Cryptic diversity in an endemic rainforest skink (Gnypetoscincus queenslandiae). Biodiv. Conserv. 2:412-425.
Moritz, C., A. Heideman, N. N. FitzSimmons, A. Hugall and P. Hale. 1996. Microsatellites for macropods (Marsupialia): Cross-species polymorphism and amplification artefacts. (unpublished manuscript)
Moriyama, E. N., Y. Ina, K. Ikeo, N. Shimizu and T. Gojobori. 1991. Mutation pattern of human immunodeficiency virus genes. J. Mol. Evol. 32:360-363.
Morizot, D. C. 1990. Use of fish gene maps to predict ancestral vertebrate genome organization, pp. 207-234. In Z. I. Ogita and C. L. Markert (eds.), Isozymes: Structure, Function, and Use in Biology and Medicine. Wiley-Liss, New York.
Morizot, D. C. and M. E. Schmidt. 1990. Starch gel electrophoresis and histochemical visualization of proteins, pp. 23-80. In D. H. Whitmore (ed.), Electrophoretic and Isoelectric Focusing Techniques in Fisheries Management. CRC Press, Boca Raton, FL.
Morizot, D. C. and M. J. Siciliano. 1982. Linkage of two enzyme loci in fishes of the genus Xiphophorus (Poeciliidae). J. Hered. 73:163-167.
Morizot, D. C. and M. J. Siciliano. 1984. Gene mapping in fishes and other vertebrates, pp. 173-234. In B. J. Turner (ed.), Evolutionary Genetics of Fishes. Plenum, New York.
Morizot, D. C., J. A. Greenspan and M. J. Siciliano. 1983. Linkage group VI of fishes of the genus Xiphophorus (Poeciliidae): Assignment of genes coding for glutamine synthetase, uridine monophosphate kinase, and transferrin. Biochem. Genet. 21:1042-1049.
Moss, D. W. 1982. Isoenzymes. Chapman and Hall, New York.
Motro, U. and G. Thomson. 1982. On heterozygosity and the effective size of populations subject to size changes. Evolution 36:1059-1066.
Mubumbila, M. V., O. Carelse and J. Kempf. 1993. Isolation by asymmetric polymerase chain reaction and partial sequencing of the common bean chloroplast trnL (UAA) gene and pseudogene. Phytochem. Anal. 4:145.
Mueller, L. D. and F. J. Ayala. 1982. Estimation and interpretation of genetic distance in empirical studies. Genet. Res. 40:127-137.
Mulley, J. C. and B. D. H. Latter. 1980. Genetic variation and evolutionary relationships within a group of thirteen species of penaeid prawns. Evolution 34:904-916.
Mullis, K. B. and F. A. Faloona. 1987. Specific synthesis of DNA in vitro via a polymerase catalyzed chain reaction. Meth. Enzymol. 155:335-350.
Mummenhoff, K. and M. Koch. 1994. Chloroplast DNA restriction site variation and phylogenetic relationships in the genus Thlaspi sensu lato (Brassicaceae). Syst. Bot. 19:73-88.
Muona, O., R. Yazdani and D. Rudin. 1987. Genetic change between life stages in Pinus sylvestris: Allozyme variation in seeds and planted seedlings. Silvae Genet. 36:39-42.
Muralidharan, K. and K. E. Wakeland. 1993. Concentration of primer and template qualitatively affects products in random-amplified polymorphic DNA PCR. BioTechniques 14:362-364.
Muramatsu, T., S. Kan and M. Hiraishi. 1978. Isolation and characterization of lipoamide dehydrogenase from mackerel dark muscle. Comp. Biochem. Physiol. 61B:247-252.
Murawski, D. A. and J. L. Hamrick. 1990. The effect of the density of flowering individuals on the mating systems of nine tropical tree species. Heredity 67:167-174.
Murphy, R. W. 1983a. Paleobiogeography and genetic differentiation of the Baja California herpetofauna. Occ. Pap. California Acad. Sci. 137:1-48.
Murphy, R. W. 1983b. The reptiles: Origin and evolution, pp. 130-158. In T. J. Case and M. L. Cody (eds.), Island Biogeography in the Sea of Cortez. University of California Press, Berkeley.
Murphy, R. W. 1988. The problematic phylogenetic analysis of interlocus heteropolymer isozyme characters: A case study from sea snakes and cobras. Can. J. Zool. 66:2628-2633.
Murphy, R. W. 1993. The phylogenetic analysis of allozyme data: Invalidity of coding alleles by presence/absence and recommended procedures. Biochem. Syst. Ecol. 21:25-38.
Murphy, R. W. and C. B. Crabtree. 1985a. Genetic relationships of the Santa Catalina Island rattleless rattlesnake, Crotalus catalinensis (Serpentes: Viperidae). Acta Zool. Mexicana (n.s.) 9:1-16.
Murphy, R. W. and C. B. Crabtree. 1985b. Evolutionary aspects of isozyme patterns, number of loci, and tissue-specific gene expression in the prairie rattlesnake, Crotalus viridis viridis. Herpetologica 41:451-470.
Murphy, R. W. and N. R. Lovejoy. 1995. Punctuated equilibrium or gradualism in the lizard genus Sceloporus? Lost in plesiograms and a forest of trees. Cladistics (in press).
Murphy, R. W. and R. H. Matson. 1986. Gene expression in the tuatara, Sphenodon punctatus. New Zealand J. Zool. 13:573-581.
Murphy, R. W., W. E. Cooper, Jr. and W. S. Richardson. 1983. Phylogenetic relationships of the North American five-lined skinks, genus Eumeces (Sauria: Scincidae). Herpetologica 39:200-211.
Murphy, R. W., F. C. McCollum, G. C. Gorman and R. Thomas. 1984. Genetics of hybridizing populations of Puerto Rican Sphaerodactylus. J. Herpetol. 18:93-105.
Murray, V. 1989. Improved double-stranded DNA sequencing using the linear polymerase chain reaction. Nucl. Acids Res. 17:8889.
Muse, S. V. and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.
Myers, R. M., T. Maniatis and L. S. Lerman. 1986. Detection and localization of single base changes by denaturing gradient gel electrophoresis. Meth. Enzymol. 155:501-527.
Myers, R. M., V. C. Sheffield and D. R. Cox. 1989. Mutation detection, GC-clamps, and denaturing gradient gel electrophoresis, pp. 71-88. In H. A. Erlich (ed.), PCR Technology: Principles and Applications for DNA Amplification. Stockton Press, New York.
Nadeau, J. H., J. Britton-Davidian, F. Bonhomme and L. Thaler. 1988. H-2 polymorphisms are more uniformly distributed than allozyme polymorphisms in natural populations of house mice. Genetics 118:131-140.
Nakamura, Y., M. Leppert, P. O'Connell, R. Wolff, T. Holm, M. Culver, C. Martin, E. Fujimoto, M. Hoff, E. Kumlin and R. White. 1987. Variable number of tandem repeat (VNTR) markers for human gene mapping. Science 235:1616-1622.
Nakanishi, M., A. C. Wilson, A. Nolan, G. C. Gorman and G. S. Bailey. 1969. Phenoxyethanol: Protein preservative for taxonomists. Science 163:681-683.
Nanney, D. L., R. M. Preparata, F. P. Preparata, E. B. Meyer and E. M. Simon. 1989. Shifting ditypic site analysis: Heuristics for expanding the phylogenetic range of nucleotide sequences in Sankoff analysis. J. Mol. Evol. 28:451-459.
Nardi, I., F. Andronico, S. De Lucchini and R. Batistoni. 1986. Cytogenetics of the European plethodontid salamanders of the genus Hydromantes (Amphibia, Urodela). Chromosoma 94:377-388.
Navidi, W. C., G. A. Churchill and A. V. von Haeseler. 1991. Methods for inferring phylogenies from nucleic acid sequence data by using maximum likelihood and linear invariants. Mol. Biol. Evol. 8:128-143.
Neale, D. B. and R. R. Sederoff. 1988. Inheritance and evolution of organelle genomes, pp. 251-264. In J. W. Hanover and D. E. Keathly (eds.), Genetic Manipulation of Woody Plants. Plenum, New York.
Neale, D. B., M. A. Saghai-Maroof, R. W. Allard, Q. Zhang and R. A. Jorgensen. 1988. Chloroplast DNA diversity in populations of wild and cultivated barley. Genetics 120:1105-1110.
Nee, S., E. C. Holmes and P. H. Harvey. 1995. Inferring population processes from molecular phylogenies. Phil. Trans. Roy. Soc. London (in press).
Needleman, S. B. and C. D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443-453.
Neff, N. A. 1986. A rational basis for a priori character ment length polymorphism in Cucumis melo.
weighting. Syst. Zool. 35:110-123. Theor. Appl. Genet. 83:379-384.
Nei, M. 1972. Genetic distance between populations. Nevo, E., A. Beiles and R. Ben-Shlomo. 1984. The evo-
Am. Nat. 106:283-292. lutionary significance of genetic diversity:
Nei, M. 1973. Analysis of gene diversity in subdivided Ecological, demographic and life history corre-
populations. Proc. Natl. Acad. Sci. USA lates. Lec. Notes Biomath. 53:13-213.
70:3321-3323. Nichol, S. T., J. E. Rowe and W. M. Fitch. 1993a.
Nei, M. 1978. Estimation of average heterozygosity Punctuated equilibrium and positive Darwinian
and genetic distance from a small number of indi- evolution in vesicular stomatitis virus. Proc. Nat.
viduals. Genetics 89:583-590. Acad. Sci. USA 90:10424-10428.
Nei, M. 1987. Molecular Evolutionary Genetics. Nichol, S. T., C. F. Spiropoulou, S. Morzunov, P. E.
Columbia University Press, New York. Rollin, T. G. Ksiazek, H. Feldmann, A. Sanchez, J.
Nei, M. 1991. Relative efficiencies of different tree Childs, S. Zaki and C. J. Peters. 1993b. Genetic
making methods for molecular data, pp. 90-128. identification of a hantavirus associated with an
In M. M. Miyamoto and J. Cracraft (eds.), outbreak of acute respiratory illness. Science
Phylogenetic Analysis of DNA Sequences. Oxford 262:914-917.
University Press, New York. Nichols, E., V. M. Chapman and F. H. Ruddle. 1973.
Nei, M. and R. K. Chesser. 1983. Estimation of fixation Polymorphism and linkage for mannosephos-
indices and gene diversification. Ann. Human phate isomerase in Mus musculus. Biochem.
Genet. 47:253-259. Genet. 8:47-53.
Nei, M. and T. Gojobori. 1986. Simple methods for Nichols, R. A. and D. J. Balding. 1991. Effects of popu-
estimating the numbers of synonymous and non- lation structure on DNA fingerprint analysis in
synonymous nucleotide substitutions. Mol. Biol. forensic science. Heredity 66:297-302.
Evol. 3:418-426. Nickrent, D. L. 1986. Genetic polymorphism in the
Nei, M. and R. K. Koehn (eds.). 1983. Evolution of Genes morphologically reduced dwarf mistletoes
and Proteins. Sinauer, Sunderland, Massachusetts. (Arceuthobium, Viscaceae): An electrophoretic
Nei, M. and W.-H. Li. 1979. Mathematical model for study. Am. J. Bot. 73:1492-1501.
studying genetic variation in terms of restriction Nickrent, D. L., S. I. Guttman and W. M. Eshbaugh.
endonucleases. Proc. Natl. Acad. Sci. 1984. Biosystematic and evolutionary relation-
USA 76:5269-5273. ships among selected taxa of Arceuthobium. U.S.
Nei, M. and F. Tajima. 1983. Maximum likelihood esti- Dept. Agriculture Tech. Report IW-111.
mation of the number of nucleotide substitutions Nierman, W. C. and D. R. Maglott (eds.). 1993.
from restriction sites data. Genetics 105:207-217. ATCC/NIH Repository Catalogue of Human and
Nei, M. and F. Tajima. 1985. Evolutionary change of Mouse DNA Probes and Libraries. 7th ed. American
restriction cleavage sites and phylogenetic infer- Type Culture Collection, Rockville, Maryland.
ence for man and apes. Mol. Biol. Evol. 2:189-205. Nixon, K. C. and J. M. Carpenter. 1993. On outgroups.
Nei, M., T. Maruyama and R. Chakraborty. 1975. The Cladistics 9:413-426.
bottleneck effect and genetic variability in popu- Noordermeer, J., F. Meijlink, P. Verrijzer, F. Rijsewijk
lations. Evolution 29:1-10. and O. Destree. 1989. Isolation of the Xenopus
Neigel, J. E. and J. C. Avise. 1986. Phylogenetic rela- homolog of int-1/wingless and expression during
tionships of mitochondrial DNA under various neurula stages of early development. Nucl. Acid
demographic models of speciation, pp. 515-534. Res. 17:11-18.
In E. Nevo and S. Karlin (eds.), Evolutionary Norman, J., C. Moritz and C. I. Limpus. 1994.
Processes and Theory. Academic Press, New York. Mitochondrial DNA control region polymor-
Neigel, J. E. and J. C. Avise. 1993. Application of a ran- phisms: Genetic markers for ecological studies of
dom walk model to geographic distributions of marine turtles. Mol. Ecol. 3:363-373.
animal mitochondrial DNA variation. Genetics Nuttall, G. H. F. 1904. Blood Immunity and Blood Rela-
135:1209-1220. tionship. Cambridge University Press, Cambridge.
Nelson, K., R. J. Baker and R. L. Honeycutt. 1987.
Mitochondrial DNA and protein differentiation O'Brien, S. J. 1993. Domestic cat, pp. 250-253. In S. J.
between hybridizing cytotypes of the white-foot- O'Brien (ed.), Genetic maps. Locus maps of complex
ed mouse, Peromyscus leucopus. Evolution genomes. Book 4. Nonhuman vertebrates. 6th ed.
41:864-872. Cold Spring Harbor Laboratory Press, Cold
Neuhausen, S. L. 1992. Evaluation of restriction frag- Spring Harbor, New York.
O'Brien, S. J., D. E. Wildt, D. Goldman, C. R. Merril Wachter and M. L. Straf (eds.), The future of Meta-
and M. Bush. 1983. The cheetah is depauperate in analysis. Russell Sage Foundation, New York.
genetic variation. Science 221:459-462. Olmstead, R. A., R. Langley, M. E. Roelke, R. M.
O'Brien, S. J., W. G. Nash, D. E. Wildt, M. E. Bush and Goeken, D. Adger-Johnson, J. P. Goff, J. P. Albert,
R. E. Benveniste. 1985a. A molecular solution to C. Packer, M. K. Laurenson, T. M. Caro, L.
the riddle of the giant panda's phylogeny. Nature Scheepers, D. E. Wildt, M. Bush, J. S. Martenson
317:140-144. and S. J. O'Brien. 1992. Worldwide prevalence of
O'Brien, S. J., M. E. Roelke, L. Marker, A. Newman, C. lentivirus infection in wild feline species:
A. Winkler, D. Meltzer, L. Colly, J. F. Evermann, Epidemiologic and phylogenetic aspects. J. Virol.
M. Bush and D. E. Wildt. 1985b. Genetic basis for 66:6008-6018.
species vulnerability in the cheetah. Science Olmstead, R. G. and J. D. Palmer. 1992. A chloroplast
227:1428-1434. DNA phylogeny of the Solanaceae: Subfamilial
O'Brien, S. J., D. E. Wildt, M. Bush, T. M. Caro, C. relationships and character evolution. Ann.
FitzGibbon, I. Aggundey and R. E. Leakey. 1987. Missouri Bot. Garden 79:346-360.
East African cheetahs: Evidence for two popula- Olmstead, R. G. and J. D. Palmer. 1994. Chloroplast
tion bottlenecks? Proc. Natl. Acad. Sci. USA DNA systematics: A review of methods and data
84:508-511. analysis. Am. J. Bot. 81:1205-1255.
Ochman, H., A. S. Gerber and D. L. Hartl. 1988. Olmstead, R. G. and J. A. Sweere. 1994. Combining
Genetic applications of an inverse polymerase data in phylogenetic systematics: An empirical
chain reaction. Genetics 120:621-623. approach using three molecular data sets in the
Odrzykoski, I. J. and J. Szweykowski. 1991. Genetic Solanaceae. Syst. Biol. 43:467-481.
differentiation without concordant morphological Olmstead, R. G., H. J. Michaels, K. M. Scott and J. D.
divergence in the thallose liverwort Conocephalum Palmer. 1992. Monophyly of the Asteridae and
conicum. Plant Syst. Evol. 178:135-151. identification of their major lineages as inferred
O'Grady, R. T. and G. B. Deets. 1987. Coding multi- from DNA sequences of rbcL. Ann. Missouri Bot.
state characters, with special reference to the use Gard. 79:249-265.
of parasites as characters of their hosts. Syst. Zool. Olmstead, R. G., J. A. Sweere and K. H. Wolfe. 1993.
36:268-279. Ninety extra nucleotides in ndhF gene of tobacco
O'Hara, R. J. 1993. Systematic generalization, histori- chloroplast DNA: A summary of revisions to the
cal fate, and the species problem. Syst. Biol. 1986 genome sequence. Plant Mol. Biol.
42:231-246. 22:1191-1193.
Ohno, S. 1970. Evolution by Gene Duplication. Springer- Olsen, G. J. 1987. Earliest phylogenetic branchings:
Verlag, New York. Comparing rRNA-based evolutionary trees
Ohno, S., C. Stenius, L. Christian and G. Schipmann. inferred with various techniques. Cold Spring
1969. De novo mutation-like events observed at Harbor Symp. Quant. Biol. 52:825-837.
the 6PGD locus of the Japanese quail, and the Olsen, G. J. 1988. Phylogenetic analysis using riboso-
principle of polymorphism breeding more poly- mal RNA. Meth. Enzymol. 164:793-838.
morphism. Biochem. Genet. 3:417-428. Olsen, G. J. and C. R. Woese. 1989. A brief note con-
Ohta, N., H. Nagashima, S. Kawano and T. Kuroiwa. cerning archaebacterial phylogeny. Can. J.
1992. Isolation of the chloroplast DNA and the Microbiol. 35:119-123.
sequence of the trnK gene from Cyanidium caldari- Olsen, G. J., R. Overbeek, N. Larsen, T. L. Marsh, M. J.
um strain RK-1. Plant Cell Physiol. 33:657-661. McCaughey, M. A. Maciukenas, W.-M. Kuan, T. J.
Ohta, T. 1977. Extension of neutral mutation drift Macke, Y. Xing and C. R. Woese. 1992. The riboso-
hypothesis, pp. 148-167. In M. Kimura (ed.), mal database project. Nucl. Acids Res. 20
Molecular Evolution and Polymorphism. National (suppl.):2108-2200.
Institute of Genetics, Mishima, Japan. Olsen, R. R., J. A. Runstadler and T. D. Kocher. 1991.
Ohta, T. 1992. The nearly neutral theory of molecular Whose larvae? Nature 351:357-358.
evolution. Annu. Rev. Ecol. Syst. 23:263-286. Omland, K. E. 1994. Character congruence between a
Ohyama, K., H. Fukuzawa, T. Kohchi, H. Shirai, T. molecular and a morphological phylogeny for
Sano, S. Sano, K. Umesono, Y. Shiki, M. Takeuchi, dabbling ducks (Anas). Syst. Biol. 43:369-386.
Z. Chang, S. Aota, H. Inokuchi and H. Ozeki. Orita, M., H. Iwahana, H. Kanazawa, K. Hayashi and
1986. Complete nucleotide sequence of liverwort T. Sekiya. 1989. Detection of polymorphisms of
Marchantia polymorpha chloroplast DNA. Plant human DNA by gel electrophoresis and single-
Mol. Biol. Rep. 4:148-175. strand conformation polymorphisms. Proc. Natl.
Olkin, I. 1990. History and goals, pp. 3-10. In K. W. Acad. Sci. USA 86:2766-2770.
Orosz, J. M. and J. G. Wetmur. 1974. In vitro iodination Page, R. D. M. 1993a. Genes, organisms, and areas:
of DNA. Maximizing iodination while minimiz- The problem of multiple lineages. Syst. Biol.
ing degradation: Use of buoyant density shifts for 42:77-84.
DNA-DNA hybrid isolation. Biochemistry Page, R. D. M. 1993b. On islands of trees and efficacy
13:5467-5473. of different methods of branch swapping in find-
Ostrander, E. A., G. F. Sprague, Jr. and J. Rine. 1993. ing most-parsimonious trees. Syst. Biol.
Identification and characterization of dinucleotide 42:200-210.
repeat (CA)n markers for genetic mapping in Page, R. D. M. 1994. Maps between trees and cladistic
dog. Genomics 16:207-213. analysis of historical associations among genes,
Ou, C.-Y., C. A. Ciesielski, G. Myers, C. I. Bandea, C.- organisms, and areas. Syst. Biol. 43:58-77.
C. Luo, B. T. M. Korber, J. I. Mullins, G. Palmer, J. D. 1982. Physical and gene mapping of
Schochetman, R. L. Berkelman, A. N. Economou, chloroplast DNA from Atriplex triangularis and
J. J. Witte, L. J. Furman, G. A. Satten, K. A. Cucumis sativa. Nucl. Acids Res. 10:1593-1605.
Macinnes, J. W. Curran and K. W. Jaffe. 1992. Palmer, J. D. 1985a. Evolution of chloroplast and mito-
Molecular epidemiology of HIV transmission in a chondrial DNA in plants and algae, pp. 131-240.
dental practice. Science 256:1165-1171. In R. J. MacIntyre (ed.), Molecular Evolutionary
Ouchterlony, O. 1958. Diffusion-in-gel methods for Genetics. Plenum, New York.
immunological analysis. Progr. Allergy 5:1. Palmer, J. D. 1985b. Comparative organization of
chloroplast genomes. Annu. Rev. Genet.
Paabo, S. 1985. Molecular cloning of ancient Egyptian 19:325-354.
mummy DNA. Nature 314:644-645. Palmer, J. D. 1986a. Isolation and structural analysis of
Paabo, S. 1989. Ancient DNA: Extraction, characteriza- chloroplast DNA. Meth. Enzymol. 118:167-186.
tion, molecular cloning, and enzymatic amplifica- Palmer, J. D. 1986b. Chloroplast DNA and phylogenet-
tion. Proc. Natl. Acad. Sci. USA 86:1939-1943. ic relationships, pp. 63-80. In S. K. Dutta (ed.),
Paabo, S. 1995. The Y chromosome and the origin of DNA Systematics. CRC Press, Boca Raton, Florida.
all of us (men). Science 268:1141-1142. Palmer, J. D. 1987. Chloroplast DNA evolution and
Paabo, S., J. A. Gifford and A. C. Wilson. 1988. biosystematic uses of chloroplast DNA variation.
Mitochondrial DNA sequences from a 7000-year Am. Nat. 130:S6-S29.
old brain. Nucl. Acid Res. 16:9775-9787. Palmer, J. D. 1991. Plastid chromosomes: Structure and
Paabo, S., R. Higuchi and A. C. Wilson. 1989. Ancient evolution, pp. 5-53. In L. Bogorad and I. K. Vasil
DNA and the polymerase chain reaction. J. Biol. (eds.), Cell Culture and Somatic Cell Genetics of
Chem. 264:9709-9712. Plants, vol. 7A, The Molecular Biology of Plastids.
Paabo, S., W. K. Thomas, K. M. Whitfield, Y. Academic Press, San Diego.
Kumazawa and A. C. Wilson. 1991. Palmer, J. D. 1992. Mitochondrial DNA in plant sys-
Rearrangements of mitochondrial transfer RNA tematics: Applications and limitations, pp. 36-49.
genes in marsupials. J. Mol. Evol. 33:426-430. In P. S. Soltis, D. E. Soltis and J. J. Doyle (eds.),
Pace, N. R., G. J. Olsen and C. R. Woese. 1986. Molecular Systematics of Plants. Chapman and
Ribosomal RNA phylogeny and the primary lines Hall, New York.
of evolutionary descent. Cell 45:325-326. Palmer, J. D. and L. A. Herbon. 1988. Plant mitochon-
Pagel, M. D. 1992. A method for the analysis of com- drial DNA evolves rapidly in structure, but slow-
parative data. J. Theor. Biol. 156:431-442. ly in sequence. J. Mol. Evol. 28:87-97.
Pagel, M. D. and P. H. Harvey. 1989. Taxonomic differ- Palmer, J. D. and D. Zamir. 1982. Chloroplast DNA
ences in the scaling of brain on body size among evolution and phylogenetic relationships in
mammals. Science 244:1589-1593. Lycopersicon. Proc. Natl. Acad. Sci. USA
Pagel, M. D. and P. H. Harvey. 1992. On solving the 79:5006-5010.
correct problem: Wishing does not make it so. J. Palmer, J. D., R. A. Jorgensen and W. F. Thompson.
Theor. Biol. 156:425-430. 1985. Chloroplast DNA variation and evolution in
Paetkau, D. and C. Strobeck. 1994. Microsatellite Pisum: Patterns of change and phylogenetic
analysis of genetic variation in black bear popula- analysis. Genetics 109:195-213.
tions. Mol. Ecol. 3:489-496. Palmer, J. D., B. Osorio, J. Aldrich and W. F.
Page, R. D. M. 1991. Clocks, clades, cospeciation: Thompson. 1987. Chloroplast DNA evolution
Comparing rates of evolution and timing of among legumes: Loss of a large inverted repeat
cospeciation events in host-parasite assemblages. occurred prior to other sequence arrangements.
Syst. Zool. 40:188-198. Curr. Genet. 11:275-286.
Palmer, J. D., B. Osorio and W. F. Thompson. 1988a. enzyme activity in two species of the genus
Evolutionary significance of inversions in legume Gammarus (Crustacea: Amphipoda). Evolution
chloroplast DNA. Curr. Genet. 14:75-89. 46:1568-1573.
Palmer, J. D., R. K. Jansen, H. J. Michaels, M. W. Chase Patarnello, T., P. M. Bisol and B. Battaglia. 1989.
and J. R. Manhart. 1988b. Chloroplast DNA and Studies on differential fitness of PGI genotypes
plant phylogeny. Ann. Missouri Bot. Gard. with regard to temperature in Gammarus insensi-
75:1180-1206. bilis (Crustacea: Amphipoda). Marine Biol.
Palumbi, S. R. 1992. Marine speciation on a small plan- 102:355-359.
et. Trends Ecol. Evol. 7:114-121. Pathak, S. and F. E. Arrighi. 1973. Loss of DNA follow-
Palumbi, S. R. and C. S. Baker. 1994. Contrasting pop- ing C-banding procedures. Cytogenet. Cell Genet.
ulation structure from nuclear intron sequences 12:414-422.
and mtDNA of humpback whales. Mol. Biol. Patterson, C. (ed.). 1987. Molecules and Morphology in
Evol. 11:426-435. Evolution: Conflict or Compromise? Cambridge
Palumbi, S. R. and J. Benzie. 1991. Large mitochondri- University Press, Cambridge.
al DNA differences among morphologically simi- Patterson, C. 1988. Homology in classical and molecu-
lar penaeid shrimp. Mol. Mar. Biol. Biotechnol. lar biology. Mol. Biol. Evol. 5:603-625.
1:27-34. Patterson, C. 1989. Phylogenetic relations of major
Palumbi, S. R., A. P. Martin, S. Romano, W. O. groups: Conclusions and prospects, pp. 471-488.
McMillan, L. Stice and G. Grabowski. 1991. The In B. Fernholm, K. Bremer and H. Jornvall (eds.),
Simple Fool's Guide to PCR. Special Publ. Dept. The Hierarchy of Life. Elsevier, Amsterdam.
Zoology, University of Hawaii, Honolulu. Patterson, C., D. M. Williams and C. J. Humphries.
Palva, T. K. and E. T. Palva. 1985. Rapid isolation of 1993. Congruence between molecular and mor-
animal mitochondrial DNA by alkaline extraction. phological phylogenies. Annu. Rev. Ecol. Syst.
FEBS Letters 192:267-270. 24:153-188.
Pamilo, P. and M. Nei. 1988. Relationships between Patton, J. L. and J. H. Feder. 1981. Microspatial genetic
gene trees and species trees. Mol. Biol. Evol. heterogeneity in pocket gophers: Non-random
5:568-583. breeding and drift. Evolution 35:912-920.
Pardue, M. L. 1985. In situ hybridization, pp. 179-202. Patton, J. L. and S. W. Sherwood. 1982. Genome evolu-
In B. D. Hames and S. J. Higgins (eds.), Nucleic tion in pocket gophers (genus Thomomys). I.
Acid Hybridization: A Practical Approach. IRL Press, Heterochromatin variation and speciation poten-
Oxford. tial. Chromosoma 85:149-162.
Pardue, M. L. 1986. In situ hybridization to DNA of Patton, J. L. and S. W. Sherwood. 1983. Chromosome
chromosomes and nuclei, pp. 111-137. In D. B. evolution and speciation in rodents. Annu. Rev.
Roberts (ed.), Drosophila: A Practical Approach. IRL Ecol. Syst. 14:139-158.
Press, Oxford. Patton, J. L. and M. F. Smith. 1994. Paraphyly, poly-
Paris Conference. 1971. Standardization in human phyly, and the nature of species boundaries in
cytogenetics. Cytogenetics 11:313-362. pocket gophers (genus Thomomys). Syst. Biol.
Parker, E. D., Jr. and R. K. Selander. 1976. The organi- 43:11-26.
zation of genetic diversity in the parthenogenetic Patton, J. L., M. F. Smith, R. D. Price and R. A.
lizard Cnemidophorus tesselatus. Genetics Hellenthal. 1984. Genetics of hybridization
84:791-805. between the pocket gophers Thomomys bottae and
Parkin, D. T. and S. R. Cole. 1985. Genetic differentia- Thomomys townsendii in northeastern California.
tion and rates of evolution in some introduced Great Basin Nat. 44:431-440.
populations of the House Sparrow, Passer domesti- Pearson, W. R. 1990. Rapid and sensitive sequence
cus in Australia and New Zealand. Heredity comparison with FASTP and FASTA. Meth.
54:15-23. Enzymol. 183:63-98.
Parsons, T. J., S. L. Olson and M. J. Braun. 1993. Pearson, W. R. and D. J. Lipman. 1988. Improved tools
Unidirectional spread of secondary sexual for biological sequence comparison. Proc. Natl.
plumage traits across an avian hybrid zone. Acad. Sci. USA 85:2444-2448.
Science 260:1643-1646. Penny, D. 1982. Towards a basis for classification: The
Patarnello, T. and B. Battaglia. 1992. Glucose-phos- incompleteness of distance measures, incompati-
phate isomerase and fitness: Effects of tempera- bility analysis and phenetic classification. J. Theor.
ture on genotype dependent mortality and Biol. 96:129-142.
Penny, D. and M. D. Hendy. 1985. Testing methods of Pinkel, D., T. Straume and J. W. Gray. 1986.
evolutionary tree construction. Cladistics Cytogenetic analysis using quantitative, high-sen-
1:266-272. sitivity, fluorescence hybridization. Proc. Natl.
Penny, D. and M. Hendy. 1986. Estimating the reliabil- Acad. Sci. USA 83:2934-2938.
ity of evolutionary trees. Mol. Biol. Evol. Pirrotta, V. 1986. Cloning Drosophila genes, pp. 83-110.
3:403-417. In D. B. Roberts (ed.), Drosophila: A Practical
Penny, D. and M. D. Hendy. 1987. TurboTree: A fast Approach. IRL Press, Oxford.
algorithm for minimal trees. CABIOS 3:183-187. Plante, Y., P. T. Boag and B. N. White. 1987.
Penny, D., L. R. Foulds and M. D. Hendy. 1982. Nondestructive sampling of mitochondrial DNA
Testing the theory of evolution by comparing evo- from voles. Can. J. Zool. 65:175-180.
lutionary trees constructed from five different Pleyte, K. A., S. D. Duncan and R. B. Phillips. 1992.
protein sequences. Nature 297:197-200. Evolutionary relationships of the salmonid fish
Penny, D., M. D. Hendy and M. Steel. 1992. Progress genus Salvelinus inferred from DNA sequences of
with methods for constructing evolutionary trees. the first internal transcribed spacer (ITS1) of the
Trends Ecol. Evol. 7:73-79. ribosomal DNA. Mol. Phylogenet. Evol. 1:223-230.
Perasso, R., A. Baroin, L. H. Qu, J.-P. Bachellerie and Pollock, D. D. and D. B. Goldstein. 1994. A comparison
A. Adoutte. 1989. Origin of the algae. Nature of two methods for constructing evolutionary dis-
339:142-144. tances from a weighted contribution of transition
Peterson, D. G., S. M. Stack, J. L. Healy, B. S. Donohoe and transversion differences. Mol. Biol. Evol.
and L. K. Anderson. 1994. The relationship 12:713-717.
between synaptonemal complex length and Ponath, P. D., R. T. Boyd, D. M. Hillis and P. D.
genome size in four vertebrate classes Gottlieb. 1989a. Structural and evolutionary com-
(Osteichthyes, Reptilia, Aves, Mammalia). parisons of four alleles of the mouse Igk-J locus
Chromosome Res. 2:153-162. which encodes immunoglobulin kappa light
Pettigrew, J. D. 1986. Flying primates? Megabats have chain joining (Jk) segments. Immunogenetics
the advanced pathway from eye to midbrain. 29:389-396.
Science 231:1304-1306. Ponath, P. D., D. M. Hillis and P. D. Gottlieb. 1989b.
Pettigrew, J. D. 1991a. Wings or brain? Convergent Structural and evolutionary comparisons of four
evolution in the origins of bats. Syst. Zool. alleles of the mouse immunoglobulin kappa chain
40:199-216. gene, Igk-VSer. Immunogenetics 29:249-257.
Pettigrew, J. D. 1991b. A fruitful wrong hypothesis? Pope, T. R. 1992. The influence of dispersal patterns
Response to Baker, Novacek and Simmons. Syst. and mating system on genetic differentiation
Zool. 40:231-239. within and between populations of the red
Pettigrew, J. D. 1994. Flying DNA. Curr. Biol. howler monkey (Alouatta seniculus). Evolution
4:277-280. 46:1112-1128.
Pettigrew, J. D., B. C. M. Jamieson, S. K. Robson, L. S. Porter, A. H. 1990. Testing nominal species boundaries
Hall, K. I. McAnally and H. M. Cooper. 1989. using gene flow statistics: The taxonomy of two
Phylogenetic relations between microbats, mega- hybridizing admiral butterflies (Limenitis:
bats and primates (Mammalia: Chiroptera and Nymphalidae). Syst. Zool. 39:131-147.
Primates). Phil. Trans. Roy. Soc. B 325:489-559. Porter, C. A., M. J. Hamilton, J. W. Sites, Jr. and R. J.
Pfennig, D. W. and H. K. Reeve. 1993. Nepotism in a Baker. 1991. Location of ribosomal DNA in chro-
solitary wasp as revealed by DNA fingerprinting. mosomes of squamate reptiles: Systematic and
Evolution 47:700-704. evolutionary implications. Herpetologica
Philbrick, C. T. 1993. Underwater cross-pollination in 47:271-280.
Callitriche hermaphroditica (Callitrichaceae): Porter, C. A., M. W. Haiduk and K. de Queiroz. 1994.
Evidence from random amplified polymorphic Evolution and phylogenetic significance of ribo-
DNA markers. Am. J. Bot. 80:391-394. somal gene location in chromosomes of squamate
Phillips, R. B. and K. A. Pleyte. 1991. Nuclear DNA reptiles. Copeia 2:302-313.
and salmonid phylogenies. J. Fish. Biol. 39(suppl. Powell, J. R. and A. Caccone. 1989. Intra- and interspe-
A):259-275. cific genetic variation in Drosophila. Genome
Pierson, E. D., V. M. Sarich, J. M. Lowenstein, M. J. 31:233-238.
Daniel and W. E. Rainey. 1986. A molecular link Powell, J. R. and A. Caccone. 1990. The TEACL
between the bats of New Zealand and South method of DNA-DNA hybridization: Technical
America. Nature 323:60-63. considerations. J. Mol. Evol. 30:267-272.
Powell, J. R. and M. C. Zuniga. 1983. A simplified pro- Prout, T. 1965. The estimation of fitness from genotyp-
cedure for studying mtDNA polymorphisms. ic frequencies. Evolution 19:546-551.
Biochem. Genet. 21:1051-1055. Pryer, K. M. and C. H. Haufler. 1993. Isozymic and
Powell, J. R., A. Caccone, G. D. Amato and C. Yoon. chromosomal evidence for the allotetraploid ori-
1986. Rates of nucleotide substitution in gin of Gymnocarpium dryopteris (Dryopteridaceae).
Drosophila mitochondrial DNA and nuclear DNA Syst. Bot. 18:150-172.
are similar. Proc. Natl. Acad. Sci. USA Purvis, A. and T. Garland, Jr. 1993. Polytomies in
83:9090-9093. comparative analyses of continuous characters.
Powell, M. J. D. 1964. An efficient method for finding Syst. Biol. 42:569-575.
the minimum of a function of several variables
without calculating derivatives. Comp. J. Qu, L. H., B. Michot and J.-P. Bachellerie. 1983.
7:155-162. Improved methods for structure probing in large
Powers, D. A. 1987. A multidisciplinary approach to RNAs: A rapid heterologous sequencing
the study of genetic variation in species, pp. approach is coupled to the direct mapping of
102-134. In M. E. Feder, A. F. Bennett and R. B. nuclease accessible sites. Application to the 5' ter-
Huey (eds.), New Directions in Physiological minal domain of eukaryotic 28S rRNA. Nucl.
Ecology. Cambridge University Press, New York. Acids Res. 11:5903-5920.
Powers, D. A., G. S. Greaney and A. R. Place. 1979. Queller, D. C., J. E. Strassmann and C. R. Hughes.
Physiological correlation between lactate dehy- 1988. Genetic relatedness in colonies of tropical
drogenase genotype and haemoglobin function in wasps with multiple queens. Science
killifish. Nature 277:240-241. 242:1155-1157.
Prager, E. M. and A. C. Wilson. 1971a. The dependence Queller, D. C., J. E. Strassmann and C. R.
of immunological cross-reactivity upon sequence Hughes. 1993. Microsatellites and kinship. Trends
resemblance among lysozymes. I. Micro-comple- Ecol. Evol. 8:285-288.
ment fixation studies. J. Biol. Chem. 246:5978-89. Quinn, T. W. 1992. The genetic legacy of Mother
Prager, E. M. and A. C. Wilson. 1971b. The depen- Goose: Phylogeographic patterns of lesser snow
dence of immunological cross-reactivity upon goose Chen caerulescens caerulescens maternal lin-
sequence resemblance among lysozymes. II. eages. Mol. Ecol. 1:105-117.
Comparison of precipitin and micro-complement Quinn, T. W. and B. N. White. 1987a. Analysis of DNA
fixation results. J. Biol. Chem. 246:7010-17. sequence variation, pp. 163-198. In F. Cooke and
Prager, E. M. and A. C. Wilson. 1976. Congruency of P. A. Buckley (eds.), Avian Genetics. Academic
phylogenies derived from different proteins. J. Press, London.
Mol. Evol. 9:45-57. Quinn, T. W. and B. N. White. 1987b. Identification of
Prager, E. M. and A. C. Wilson. 1988. Ancient origin of restriction fragment length polymorphisms in
lactalbumin from lysozyme: Analysis of DNA and genomic DNA of the lesser snow goose. Mol. Biol.
amino acid sequences. J. Mol. Evol. 27:326-335. Evol. 4:126-143.
Prager, E. M., A. H. Brush, R. A. Nolan, M. Nakanishi Quinn, T. W., J. S. Quinn, F. Cooke and B. N. White.
and A. C. Wilson. 1974. Slow evolution of trans- 1987. DNA marker analysis detects multiple
ferrin and albumin in birds according to micro- maternity and paternity in single broods of the
complement fixation analysis. J. Mol. Evol. lesser snow goose (Anser caerulescens caerulescens).
3:243-262. Nature 326:392-394.
Prager, E. M., A. C. Wilson, J. M. Lowenstein and V. M.
Sarich. 1980. Mammoth albumin. Science Radtke, R. D., S. D. Donnellan, R. N. Fisher, C. Moritz,
209:287-289. K. A. Hanley and T. J. Case. 1995. When species
Prensky, W. 1976. The radioiodination of RNA and collide: The origin and spread of an asexual
DNA to high specific activities, pp. 121-152. In D. species of gecko. Proc. Roy. Soc. London B
M. Prescott (ed.), Methods in Cell Biology. 258:145-152.
Academic Press, New York. Raff, R. A., K. G. Field, M. T. Ghiselin, D. J. Lane, G. J.
Price, D. K., G. E. Collier and C. F. Thompson. 1989. Olsen, A. L. Parks, B. A. Parr, N. R. Pace and E. C.
Multiple parentage in broods of house wrens: Raff. 1988. Molecular analysis of distant phyloge-
Genetic evidence. J. Hered. 80:1-5. netic relationships in echinoderms, pp. 29-41. In
Prodohl, P. A., J. B. Taggart and A. Ferguson. 1994. C. R. C. Paul and A. B. Smith (eds.), Echinoderm
Single locus inheritance and joint segregation Phylogeny and Evolutionary Biology. Oxford
analysis of minisatellite (VNTR) loci in brown University Press, Oxford.
trout (Salmo trutta L.). Heredity 73:556-566.
Ragghianti, M., S. Bucci, G. Mancino, J. C. Lacroix, D. Reed, K. C. and D. A. Mann. 1985. Rapid transfer of
Boucher and J. Charlemagne. 1988. A novel DNA from agarose gels to nylon membranes.
approach to cytotaxonomic and cytogenetic stud- Nucl. Acids Res. 13:7207-7221.
ies in the genus Triturus using monoclonal anti- Reeve, H. K., D. F. Westneat and D. C. Queller. 1992.
bodies to lampbrush chromosomes antigens. Estimating average within-group relatedness
Chromosoma 97:134-144. from DNA fingerprints. Mol. Ecol. 1:223-232.
Rainboth, W. J. and D. G. Buth. 1992. On the costs of Reeves, J. W. 1992. Heterogeneity in the substitution
isozyme electrophoresis: Current prices for process of amino acid sites of proteins coded for
enzyme stains. Isozyme Bull. 25:22-26. by mitochondrial DNA. J. Mol. Evol. 35:17-32.
Rainboth, W. J. and G. S. Whitt. 1974. Analysis of evo- Remsen, J. V., Jr. 1977. On taking field notes. Am. Birds
lutionary relationships among shiners of the sub- 31:946-953.
genus Luxilus (Teleostei, Cypriniformes, Notropis) Reynolds, J., B. S. Weir and C. C. Cockerham. 1983.
with the lactate dehydrogenase and malate dehy- Estimation of the coancestry coefficient: Basis for
drogenase isozyme systems. Comp. Biochem. a short-term genetic distance. Genetics
Physiol. 49B:241-252. 105:767-779.
Ramshaw, J. A. M., J. A. Coyne and R. C. Lewontin. Richardson, B. J. 1981. The genetic structure of rabbit
1979. The sensitivity of gel electrophoresis as a populations, pp. 37-52. In K. Myers and C. D.
detector of genetic variation. Genetics MacInnes (eds.), Proceedings of the World
93:1019-1037. Lagomorph Conference held in Guelph, Ontario,
Rand, D. M. 1993. Endotherms, ectotherms, and mito- August, 1979, Guelph, University of Guelph.
chondrial genome-size variation. J. Mol. Evol. Richardson, B. J. 1983. Distribution of protein varia-
37:281-295. tion in skipjack tuna (Katsuwonus pelamis) from
Rand, D. M. 1994. Thermal habit, metabolic rate and the central and south-western Pacific. Australian
the evolution of mitochondrial DNA. Trends Ecol. J. Marine Freshwater Res. 34:231-251.
Evol. 9:125-131. Richardson, B. J., P. R. Baverstock and M. Adams.
Rand, D. M. and R. G. Harrison. 1986a. Ecological 1986. Allozyme Electrophoresis: A Handbook for
genetics of a mosaic hybrid zone: Mitochondrial, Animal Systematics and Population Structure.
nuclear and reproductive differentiation of crick- Academic Press, Sydney.
ets by soil type. Evolution 43:432-449. Riddle, B. R., R. C. Honeycutt and P. L. Lee. 1993.
Rand, D. M. and R. G. Harrison. 1986b. Mitochondrial Mitochondrial DNA phylogeography in northern
DNA transmission genetics in crickets. Genetics grasshopper mice (Onychomys leucogaster)-the
114:955-970. influence of Quaternary climatic oscillations on
Rand, D. M., M. Dorfsman and L. M. Kann. 1994. population dispersion and divergence. Mol. Ecol.
Neutral and non-neutral evolution of Drosophila 2:183-193.
mitochondrial DNA. Genetics 138:741-756. Rider, C. C. and C. B. Taylor. 1980. Isoenzymes.
Randall, S. K., R. Eritja, B. E. Kaplan, J. Petruska and Chapman and Hall, London.
M. F. Goodman. 1987. Nucleotide insertion kinet- Ridgway, G. J., S. W. Sherburne and R. D. Lewis. 1970.
ics opposite abasic lesions in DNA. J. Biol. Chem. Polymorphism in the esterases of Atlantic her-
262:6864-6870. ring. Trans. Am. Fish. Soc. 99:147-151.
Ranker, T. A. and A. F. Schnabel. 1986. Allozymic and Ridley, M. 1983. The Explanation of Organic Diversity:
morphological evidence for a progenitor-deriva- The Comparative Method and Adaptations for Mating.
tive species pair in Camassia (Liliaceae). Syst. Bot. Oxford University Press, Oxford.
11:433-445. Riedy, M. F., W. J. Hamilton and C. F. Aquadro. 1992.
Rassmann, K., C. Schlatterer and D. Tautz. 1991. Excess of non-parental bands in offspring from
Isolation of simple-sequence loci for use in poly- known primate pedigrees assayed using RAPD
merase chain reaction-based DNA fingerprinting. PCR. Nucl. Acids Res. 20:918.
Electrophoresis 12:113-118. Rieseberg, L. H. 1991. Homoploid reticulate evolution
Reeck, G. R., C. de Haen, D. C. Teller, R. F. Doolittle, in Helianthus: Evidence from ribosomal genes.
W. M. Fitch, R. E. Dickerson, P. Chambon, A. D. Am. J. Bot. 78:1218-1237.
McLachlan, E. Margoliash, T. H. Jukes and E. Rieseberg, L. H. and S. J. Brunsfeld. 1992. Molecular
Zuckerkandl. 1987. "Homology" in proteins and evidence and plant introgression, pp. 151-176. In
nucleic acids: A terminology muddle and a way P. S. Soltis, D. E. Soltis and J. J. Doyle (eds.), Plant
out of it. Cell 50:667. Molecular Systematics. Chapman and Hall, New
York.
Rieseberg, L. H. and N. C. Ellstrand. 1993. What can Roberts, J. W., S. A. Johnson, P. Kier, T. J. Hall, E. H.
morphological and molecular markers tell us Davidson and R. J. Britten. 1985. Evolutionary
about plant hybridization. Crit. Rev. Plant Sci. conservation of DNA sequences expressed in sea
12:213-241. urchin eggs and embryos. J. Mol. Evol. 22:99-107.
Rieseberg, L. H. and D. E. Soltis. 1991. Phylogenetic Roberts, L. 1989. Genome project under way, at last.
consequences of cytoplasmic gene flow in plants. Science 243:167-168.
Evol. Trends Plants 5:65-84. Roberts, R. J. 1984. Restriction and modification
Rieseberg, L. H., S. Beckstrom-Sternberg, A. Liston enzymes and their recognition sequences. Nucl.
and D. Arias. 1991. Phylogenetic and systematic Acids Res. 12:r167-r204.
inferences from chloroplast DNA and isozyme Roberts, R. J. and D. Macelis. 1993. REBASE-restric-
variation in Helianthus sect. Helianthus. Syst. Bot. tion enzymes and methylases. Nucl. Acids Res.
16:50-76. 21:3125-3137.
Rieseberg, L. H., M. A. Hanson and C. T. Philbrick. Robertson, D. L., P. M. Sharp, F. E. McCutchan and B.
1992. Androdioecy is derived from dioecy in H. Hahn. 1995. Recombination in HIV. Nature
Datiscaceae: Evidence from restriction site map- 374:124-126.
ping of PCR-amplified chloroplast DNA frag- Rodrigo, A. G. 1992. Two optimality criteria for select-
ments. Syst. Bot. 17:324-336. ing subsets of most parsimonious trees. Syst. Biol.
Rieseberg, L. H., H. Choi, R. Chan and C. Spore. 1993. 41:33-40.
Genomic map of a diploid hybrid species. Rodrigo, A. G. 1993. Calibrating the bootstrap test of
Heredity 70:285-293. monophyly. Int. J. Parasitol. 23:507-514.
Rigby, P. W. J., M. Dieckmann, C. Rhodes and P. Berg. Rodrigo, A. G., M. Kelly-Borges, P. R. Bergquist and P.
1977. Labelling deoxyribonucleic acid to high spe- L. Bergquist. 1993. A randomisation test of the
cific activity in vitro by nick translation with DNA null hypothesis that two cladograms are sample
polymerase I. J. Mol. Biol. 113:237-251. estimates of a parametric phylogenetic tree. New
Rijsewijk, F., M. Schuermann, E. Wagenaar, P. Parren, Zealand J. Bot. 31:257-268.
D. Weigel and R. Nusse. 1987. The Drosophila Rodriguez, F., J. L. Oliver, A. Marin and J. R. Medina.
homolog of the mouse mammary oncogene int-1 1990. The general stochastic model of nucleotide
is identical to the segment polarity gene wingless. substitution. J. Theor. Biol. 142:485-501.
Cell 50:649-657. Roff, D. A. and P. Bentzen. 1989. The statistical analy-
Riley, M. A., S. R. Kaplan and M. Veuille. 1992. sis of mitochondrial DNA polymorphisms: χ² and
Nucleotide polymorphism at the xanthine dehy- the problem of small samples. Mol. Biol. Evol.
drogenase locus in Drosophila pseudoobscura. Mol. 6:539-545.
Biol. Evol. 9:56-69. Rogers, A. R. and H. Harpending. 1992. Population
Riley, V. 1960. Adaptation of orbital bleeding tech- growth makes waves in the distribution of pair-
nique to rapid serial blood studies. Proc. Soc. Exp. wise genetic differences. Mol. Biol. Evol.
Biol. Med. 104:751-754. 9:552-569.
Ritland, C. E., K. Ritland and N. A. Straus. 1993. Rogers, D. S. and M. D. Engstrom. 1992. Genetic dif-
Variation in the ribosomal internal transcribed ferentiation in spiny pocket mice of the Liomys
spacers (ITS1 and ITS2) among eight taxa of the pictus species-group (family Heteromyidae). Can.
Mimulus guttatus species complex. Mol. Biol. Evol. J. Zool. 70:1912-1919.
10:1273-1288. Rogers, J. S. 1972. Measures of genetic similarity and
Ritland, K. and M. T. Clegg. 1987. Evolutionary analy- genetic distance. Studies in Genet. VII. University
sis of plant DNA sequences. Am. Nat. of Texas Pub. 7213:145-153.
130:S75-S100. Rogers, J. S. 1984. Deriving phylogenetic trees from
Ritland, K. and F. R. Ganders. 1987. Covariation of allele frequencies. Syst. Zool. 33:52-63.
selfing rates with parental gene fixation indices Rogers, J. S. 1986. Deriving phylogenetic trees from
within populations of Mimulus guttatus. allele frequencies: A comparison of nine genetic
Evolution 41:760-771. distances. Syst. Zool. 35:297-310.
Robert-Fortel, I., H. R. Junera, G. Geraud and D. Rogers, S. O. and A. J. Bendich. 1985. Extraction of
Hernandez-Verdun. 1993. Three dimensional DNA from milligram amounts of fresh, herbari-
organization of the ribosomal genes and Ag-NOR um and mummified plant tissues. Plant Mol. Biol.
proteins during interphase and mitosis in PtK1 5:69-76.
cells studied by confocal microscopy.
Chromosoma 102:146-157.
Rogstad, S. H., J. C. Patton and B. A. Schaal. 1988. M13 properties of rattlesnake venom following 26
repeat probe detects DNA minisatellite-like years of storage, Proc. Soc. Exp. Biol. Med.
sequences in gymnosperm and angiosperm. Proc. 103:737-739.
Natl. Acad. Sci. USA 85:9176-9178. Ruvolo, M., T. R. Disotell, M. W. Allard and W. M.
Rogstad, S. H., H. Nybom and B. A. Schaal. 1991a. The Brown. 1991. Resolution of the African hominoid
tetrapod "DNA fingerprinting" M13 repeat probe trichotomy by use of a mitochondrial gene
reveals genetic diversity and clonal growth in sequence. Proc. Natl. Acad. Sci. USA 88:1570-1574.
quaking aspen (Populus tremuloides, Salicaceae). Ryan, M. J. and A. S. Rand. 1995. Female responses to
Plant Syst. Evol. 175:115-123. ancestral advertisement calls in Túngara frogs.
Rogstad, S. H., K. Wolff and B. A. Schaal. 1991b. Science 269:390-392.
Geographical variation in Asimina triloba Dunal Ryman, N. and F. Utter (eds.). 1987. Population Genetics
(Annonaceae) revealed by M13 "DNA finger- and Fishery Management. University of
printing" probe. Am. J. Bot. 78:1391-1396. Washington Press, Seattle.
Rohrer, G. A., L. J. Alexander, J. W. Keele, T. Smith Ryman, N., F. W. Allendorf and G. Stahl. 1979.
and C. W. Beattie. 1994. A microsatellite linkage Reproductive isolation with little genetic diver-
map of the porcine genome. Genetics 136:231-245. gence in sympatric populations of brown trout.
Rollo, F. A., A. Amici, R. Salvi and A. Garbuglia. 1988. Genetics 92:247-262.
Short but faithful pieces of ancient DNA. Nature Rzhetsky, A. and M. Nei. 1992a. A simple method for
335:774. estimating and testing minimum-evolution trees.
Rooney, D. E. and B. H. Czepulkowski. 1986. Human Mol. Biol. Evol. 9:945-967.
Cytogenetics. IRL Press, Oxford. Rzhetsky, A. and M. Nei. 1992b. Statistical properties
Roose, M. L. and L. D. Gottlieb. 1976. Genetic and bio- of the ordinary least-squares, generalized least-
chemical consequences of polyploidy in squares, and minimum-evolution methods of
Tragopogon. Evolution 30:818-830. phylogenetic inference. J. Mol. Evol. 35:367-375.
Ropson, I. J. and D. A. Powers. 1989. The allelic Rzhetsky, A. and M. Nei. 1993. Theoretical foundation
isozymes of hexose-6-phosphate dehydrogenase of the minimum-evolution method of phylogenet-
isolated from Fundulus heteroclitus: Physical char- ic inference. Mol. Biol. Evol. 10:1073-1095.
acters and kinetic properties. Mol. Biol. Evol. Rzhetsky, A. and M. Nei. 1995. Tests of applicability of
6:171-185. several substitution models for DNA sequence
Ropson, I. J., D. C. Brown and D. A. Powers. 1990. data. Mol. Biol. Evol. 12:131-151.
Biochemical genetics of Fundulus heteroclitus (L.).
VI. Geographical variation in the gene frequencies Sackler, M. L. 1966. Xanthine oxidase from liver and
of 15 loci. Evolution 44:16-26. duodenum of the rat: Histochemical localization
Rosen, D. E. and D. G. Buth. 1980. Empirical evolu- and electrophoretic heterogeneity. J. Histochem.
tionary research versus neo-Darwinian specula- Cytochem. 14:326-333.
tion. Syst. Zool. 29:300-308. Sage, R. D. and R. K. Selander. 1979. Hybridization
Ross, J. and S. Leavitt. 1991. Improved sample recov- between species of the Rana pipiens complex in
ery in thermocycle sequencing protocols. central Texas. Evolution 33:1069-1088.
BioTechniques 11:618-619. Saghai-Maroof, M. A., K. M. Soliman, R. A. Jorgensen
Rost, F. W. D. 1992. Fluorescence Microscopy. Vol. 1. and R. W. Allard. 1984. Ribosomal DNA spacer-
Cambridge University Press, Cambridge. length polymorphisms in barley: Mendelian
Roy, M. S., E. Geffen, D. Smith, E. Ostrander and R. K. inheritance, chromosomal location, and popula-
Wayne. 1994. Patterns of differentiation and tion dynamics. Proc. Natl. Acad. Sci. USA
hybridization in North American wolf-like canids 81:8014-8019.
revealed by analysis of microsatellite loci. Mol. Saiki, R. K., S. Scharf, F. Faloona, K. B. Mullis, G. T.
Biol. Evol. 11:553-570. Horn, H. A. Erlich and N. Arnheim. 1985.
Ruano, G., A. S. Deinard, S. Tishkoff and K. K. Kidd. Enzymatic amplification of β-globin genomic
1994. Detection of DNA sequence variation via sequences and restriction site analysis for diagno-
deliberate heteroduplex formation from genomic sis of sickle cell anemia. Science 230:1350-1354.
DNAs amplified en masse in "population tubes". Saiki, R. K., D. H. Gelfand, S. Stoffel, S. J. Scharf, R.
PCR Meth. Applic. 3:225-231. Higuchi, G. T. Horn, K. B. Mullis and H. A. Erlich.
Russell, F. E. 1980. Snake Venom Poisoning. Lippincott, 1988. Primer-directed enzymatic amplification of
Philadelphia. DNA with a thermostable DNA polymerase.
Russell, F. E., J. A. Emery and T. B. Long. 1960. Some Science 239:487-491.
Saitou, N. 1988. Property and efficiency of the maxi- Frequency of insertion-deletion, transversion, and
mum likelihood method for molecular phylogeny. transition in the evolution of 5S ribosomal RNA.
J. Mol. Evol. 27:261-273. J. Mol. Evol. 7:133-149.
Saitou, N. 1990. Maximum likelihood methods. Meth. Sankoff, D., G. Leduc, N. Antoine, B. Paquin, B. F.
Enzymol. 183:584-598. Lang and R. Cedergren. 1992. Gene order com-
Saitou, N. 1991. Molecular Systematics (book review). parisons for phylogenetic inference: Evolution of
Mol. Biol. Evol. 4:559-561. the mitochondrial genome. Proc. Natl. Acad. Sci.
Saitou, N. and T. Imanishi. 1989. Relative efficiencies USA 89:6575-6579.
of the Fitch-Margoliash, maximum-parsimony, Santos, F. R., S. D. J. Pena and J. T. Epplen. 1993.
maximum-likelihood, minimum-evolution, and Genetic and population study of a Y-linked
neighbor-joining methods of phylogenetic tree tetranucleotide repeat DNA polymorphism with a
construction in obtaining the correct tree. Mol. simple non-isotopic technique. Human Genet.
Biol. Evol. 6:514-525. 90:655-656.
Saitou, N. and M. Nei. 1987. The neighbor-joining Sarich, V. M. 1977. Rates, sample sizes, and the neu-
method: A new method for reconstructing phylo- trality hypothesis for electrophoresis in evolution-
genetic trees. Mol. Biol. Evol. 4:406-425. ary studies. Nature 265:24-28.
Salthe, S. N. and N. O. Kaplan. 1966. Immunology and Sarich, V. M. 1985. Rodent macromolecular systemat-
rates of enzyme evolution in the amphibia in rela- ics, pp. 423-452. In W. P. Luckett and J.-L.
tion to the origins of certain taxa. Evolution Hartenberger (eds.), Evolutionary Relationships
20:603-616. Among Rodents. A Multidisciplinary Analysis.
Sambrook, J., E. F. Fritsch and T. Maniatis. 1989. Plenum, New York.
Molecular Cloning. Cold Spring Harbor Press, Sarich, V. M. and J. E. Cronin. 1976. Molecular system-
Cold Spring Harbor, New York. atics of the primates, pp. 141-170. In M. Goodman
Sanderson, M. J. 1989. Confidence limits on phyloge- and R. E. Tashian (eds.), Molecular Anthropology.
nies: The bootstrap revisited. Cladistics 5:113-129. Plenum, New York.
Sanderson, M. J. and J. J. Doyle. 1992. Reconstruction Sarich, V. M. and A. C. Wilson. 1966. Quantitative
of organismal and gene phylogenies from data on immunochemistry and the evolution of primate
multigene families: Concerted evolution, homo- albumins: Micro-complement fixation. Science
plasy, and confidence. Syst. Biol. 41:4-17. 154:1563-1566.
Sanderson, M. J., B. G. Baldwin, G. Bharathan, C. S. Sarich, V. M. and A. C. Wilson. 1967. Immunological
Campbell, C. von Dohlen, D. Ferguson, J. M. time scale for hominid evolution. Science
Porter, M. F. Wojciechowski and M. J. Donoghue. 158:1200-1203.
1993. The growth of phylogenetic information Sarich, V. M., C. W. Schmid and J. Marks. 1989. DNA
and the need for a phylogenetic data base. Syst. hybridization as a guide to phylogenies: A critical
Biol. 42:562-568. analysis. Cladistics 5:3-32.
Sanger, F., S. Nicklen and A. R. Coulson. 1977. DNA Sarkar, G., H.-S. Yoon and S. S. Sommer. 1992.
sequencing with chain-terminating inhibitors. Screening for mutations by RNA single-strand
Proc. Natl. Acad. Sci. USA 74:5463-5467. conformation polymorphism (rSSCP):
Sankoff, D. 1975. Minimal mutation trees of sequences. Comparison with DNA-SSCP. Nucl. Acids Res.
SIAM J. Appl. Math. 28:35-42. 20:871-878.
Sankoff, D. and R. J. Cedergren. 1983. Simultaneous SAS Institute. 1985. SAS User's Guide: Statistics, Version
comparison of three or more sequences related by 5. SAS Institute, Cary, North Carolina.
a tree, pp. 253-263. In D. Sankoff and J. B. Kruskal Sasavage, N. 1992. Painting by the chromosome num-
(eds.), Time Warps, String Edits, and bers. J. NIH Res. 4:44-46.
Macromolecules: The Theory and Practice of Sequence Sattath, S. and A. Tversky. 1977. Additive similarity
Comparison. Addison-Wesley, Reading, trees. Psychometrika 42:319-345.
Massachusetts. Savage, J. M. 1973. The geographic distribution of
Sankoff, D. and P. Rousseau. 1975. Locating the ver- frogs: Patterns and predictions, pp. 351-455. In J.
tices of a Steiner tree in arbitrary space. Math. L. Vial (ed.), Evolutionary Biology of the Anurans:
Prog. 9:240-246. Contemporary Research on Major Problems.
Sankoff, D., C. Morel and R. J. Cedergren. 1973. University of Missouri Press, Columbia.
Evolution of 5S RNA and the non-randomness of Scanlan, B. E., L. R. Maxson and W. E. Duellman. 1980.
base replacement. Nature 245:232-234. Albumin evolution in marsupial frogs (Hylidae:
Sankoff, D., R. J. Cedergren and G. Lapalme. 1976. Gastrotheca). Evolution 34:222-229.
Schaal, B. A., W. J. Leverich and J. Nieto-Sotelo. 1987. of simple sequence DNA. Nucl. Acids Res.
Ribosomal DNA variation in the native plant 20:211-215.
Phlox divaricata. Mol. Biol. Evol. 4:611-621. Schlötterer, C., B. Amos and D. Tautz. 1991.
Schaaper, R. M. and R. L. Dunn. 1987. Spectra of spon- Conservation of polymorphic sequence loci in
taneous mutations in Escherichia coli strains defec- certain cetacean species. Nature 354:63-65.
tive in mismatch correction: The nature of in vivo Schlötterer, C., M. T. Hauser, A. von Haeseler and D.
replication errors. Proc. Natl. Acad. Sci. USA Tautz. 1994. Comparative evolutionary analysis of
84:6220-6224. rDNA ITS regions in Drosophila. Mol. Biol. Evol.
Schaeffer, S. W. and C. F. Aquadro. 1987. Nucleotide 11:513-522.
sequence of the alcohol dehydrogenase region of Schmid, M. and M. Guttenbach. 1988. Evolutionary
Drosophila pseudoobscura: Evolutionary change diversity of reverse (R) fluorescent chromosome
and evidence for an ancient duplication. Genetics bands in vertebrates. Chromosoma 97:101-124.
117:61-73. Schmid, M., J. Olert and C. Klett. 1979. Chromosome
Schaeffer, S. W. and E. L. Miller. 1991. Nucleotide banding in Amphibia III. Sex chromosomes in
sequence analysis of Adh gene estimates the time Triturus. Chromosoma 71:29-55.
of geographic isolation of the Bogota population Schoen, D. J. 1982. Genetic variation and the breeding
of Drosophila pseudoobscura. Proc. Natl. Acad. Sci. system of Gilia achilleifolia. Evolution 36:361-370.
USA 88:6097-6101. Schöniger, M. and A. von Haeseler. 1993. A simple
Schafer, M. and W. Kunz. 1985. rDNA in Locusta migra- method to improve the reliability of tree recon-
toria is very variable: Two introns and extensive structions. Mol. Biol. Evol. 10:471-483.
restriction site polymorphisms in the spacer. Schubert, F. R., K. Nieselt-Struwe and P. Gruss. 1993.
Nucl. Acids Res. 13:1251-1266. The antennapaedia-type homeobox genes have
Scharf, S. J. 1990. Cloning with PCR, pp. 84-91. In M. evolved from three precursors separated early in
A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. metazoan evolution. Proc. Natl. Acad. Sci. USA
White (eds.), PCR Protocols. Academic Press, New 90:143-147.
York. Schwaner, T. D. and H. C. Dessauer. 1982.
Scharf, S. J., C. M. Long and H. A. Erlich. 1988a. Comparative immunodiffusion survey of snake
Sequence analysis of the HLA-DRβ and HLA- transferrins focused on the relationships of the
DQβ loci from three Pemphigus vulgaris patients. natricines. Copeia 1982:541-549.
Human Immunol. 22:61-69. Schwaner, T. D., P. R. Baverstock, H. C. Dessauer and G.
Scharf, S. J., A. Friedman, C. Brautbar, F. Szafer, L. A. Mengden. 1985. Immunological evidence for the
Steinman, G. Horn, U. Gyllensten and H. A. phylogenetic relationships of Australian elapid
Erlich. 1988b. HLA class II allelic variation and snakes, pp. 177-184. In G. Grigg, R. Shine and H.
susceptibility to Pemphigus vulgaris. Proc. Natl. Ehmann (eds.), Biology of Australasian Frogs and
Acad. Sci. USA 85:3504-3508. Reptiles. Royal Zool. Soc., New South Wales.
Scherberg, N. H. and S. Refetoff. 1975. Radioiodine Schwartz, M. K., J. S. Nisselbaum and O. Bodansky.
labeling of ribopolymers for special applications 1963. Procedure for staining zones of activity of
in biology, pp. 343-359. In D. M. Prescott (ed.), glutamic oxaloacetic transaminase following elec-
Methods in Cell Biology, Vol. 10, Academic Press, trophoresis with starch gel. Am. J. Clin. Pathol.
New York. 40:103-106.
Schilling, E. E. and R. K. Jansen. 1989. Restriction frag- Schwartz, O. A. and K. B. Armitage. 1980. Genetic
ment analysis of chloroplast DNA and the sys- variation in social mammals: The marmot model.
tematics of Viguiera and related genera Science 207:665-667.
(Asteraceae: Heliantheae). Am. J. Bot. Schwartz, R. M. and M. O. Dayhoff. 1978. Origins of
76:1769-1778. prokaryotes, eukaryotes, mitochondria, and
Schleif, R. F. and P. C. Wensink. 1981. Practical Methods chloroplasts: A perspective is derived from pro-
in Molecular Biology. Springer-Verlag, Berlin. tein and nucleic acid sequence data. Science
Schlötterer, C. and J. Pemberton. 1994. The use of 199:395-403.
microsatellites for genetic analysis of natural pop- Schwengel, D. A., A. E. Jedlicka, E. J. Nanthakumar, J.
ulations, pp. 203-214. In B. Schierwater, B. Streit, L. Weber and R. C. Levitt. 1994. Comparison of
G. P. Wagner and R. DeSalle (eds.), Molecular fluorescence-based semi-automated genotyping
Ecology and Evolution: Approaches and Applications. of multiple microsatellite loci with autoradi-
Birkhäuser Verlag, Basel, Switzerland. ographic techniques. Genomics 22:46-54.
Schlötterer, C. and D. Tautz. 1992. Slippage synthesis
Schwert, G. W. 1957. Recovery of native bovine serum Rat gene mapping using PCR-analyzed
albumin after precipitation with trichloracetic acid microsatellites. Genetics 131:701-721.
and solution in organic solvents. J. Am. Chem. Sessions, S. K. 1982. Cytogenetics of diploid and
Soc. 79:139-141. triploid salamanders of the Ambystoma jeffersoni-
Scribner, K. T., J. W. Arntzen and T. Burke. 1994. anum complex. Chromosoma 84:599-621.
Comparative analysis of intra- and interpopula- Sessions, S. K. and J. Kezer. 1987. Cytogenetic evolu-
tion genetic diversity in Bufo bufo, using allozyme, tion in the plethodontid salamander genus
single-locus microsatellite, minisatellite and mul- Aneides. Chromosoma 95:17-30.
tilocus minisatellite data. Mol. Biol. Evol. Sessions, S. K. and A. Larson. 1987. Developmental
11:737-748. correlates of genome size in plethodontid sala-
Sears, B. B. 1980. The elimination of plastids during manders and their implications for genome evo-
spermatogenesis and fertilization in the plant lution. Evolution 41:1239-1251.
kingdom. Plasmid 4:233-255. Seutin, G., B. F. Lang, D. P. Mindell and R. Morais.
Seber, G. A. F. 1982. The Estimation of Animal 1994. Evolution of the WANCY region in amniote
Abundance. Charles Griffin and Co., London. mitochondrial DNA. Mol. Biol. Evol. 11:329-340.
Seed, B., R. C. Parker and N. Davidson. 1982. Shaklee, J. B. 1984. Genetic variation and population
Representation of DNA sequences in recombinant structure in the damselfish, Stegastes fasciolatus,
DNA libraries prepared by restriction enzyme throughout the Hawaiian Archipelago. Copeia
partial digestion. Gene 19:201-209. 1984:629-640.
Selander, R. K., M. K. Smith, S. Y. Yang, W. E. Johnson Shaklee, J. B. and C. P. Keenan. 1986. A Practical
and J. R. Gentry. 1971. Biochemical polymorphism Laboratory Guide to the Techniques and Methodology
and systematics in the genus Peromyscus. I. of Electrophoresis and Its Application to Fish Fillet
Variation in the old-field mouse (Peromyscus Identification. CSIRO Marine Laboratories Publ.
polionotus). Stud. Genet. VI. University of Texas 177. Melbourne, Australia.
Pub. 7103:49-90. Shaklee, J. B. and C. S. Tamaru. 1981. Biochemical and
Selander, R. K., D. A. Caugant, H. Ochman, J. M. morphological evolution of Hawaiian bonefishes
Musser, M. N. Gilmour and T. S. Whittam. 1986. (Albula). Syst. Zool. 30:125-146.
Methods of multilocus enzyme electrophoresis for Shaklee, J. B. and G. S. Whitt. 1981. Lactate dehydro-
bacterial population genetics and systematics. genase isozymes of gadiform fishes: Divergent
Appl. Environ. Microbiol. 51:873-884. patterns of gene expression indicate a heteroge-
Sellers, P. 1974. On the theory and computation of evo- neous taxon. Copeia 1981:563-578.
lutionary distances. SIAM J. Appl. Math. Shaklee, J. B., K. L. Kepes and G. S. Whitt. 1973.
26:787-793. Specialized lactate dehydrogenase isozymes: The
Sensabaugh, G. F. 1982. Isozymes in forensic science, molecular and genetic basis for the unique eye
pp. 247-282. In M. Rattazzi, J. Scandalios and G. and liver LDHs of teleost fishes. J. Exp. Zool.
Whitt (eds.), Isozymes: Current Topics in Biological 185:217-240.
and Medical Research, Vol. 6. A. R. Liss, New York. Shaklee, J. B., F. W. Allendorf, D. C. Morizot and G. S.
Sensabaugh, G. F., A. C. Wilson and P. L. Kirk. 1971a. Whitt. 1992. Gene nomenclature for protein-cod-
Protein stability in preserved biological remains. ing loci in fish. Trans. Am. Fish. Soc. 119:2-15.
I. Survival of biologically active proteins in an 8- Sharkey, M. J. 1989. A hypothesis-independent method
year-old sample of dried blood. Int. J. Biochem. of character weighting for cladistic analysis.
2:545-557. Cladistics 5:63-86.
Sensabaugh, G. F., A. C. Wilson and P. L. Kirk. 1971b. Shaw, C. R. 1965. Electrophoretic variation in
Protein stability in preserved biological remains. enzymes. Science 149:936-943.
II. Modification and aggregation of proteins in an Shaw, C. R. and R. Prasad. 1970. Starch gel elec-
8-year-old sample of dried blood. Int. J. Biochem. trophoresis of enzymes-a compilation of recipes.
2:558-568. Biochem. Genet. 4:297-330.
Separack, P., M. Slatkin and N. Arnheim. 1988. Shaw, D. D., A. D. Marchant, M. L. Arnold and N.
Linkage disequilibrium in human ribosomal Contreras. 1987. Chromosomal rearrangements,
genes: Implications for multigene family evolu- ribosomal genes and mitochondrial DNA:
tion. Genetics 119:943-949. Contrasting patterns of introgression across a nar-
Serikawa, T., T. Kuramoto, P. Hilbert, M. Mori, J. row hybrid zone, pp. 121-130. In P. E. Brandham
Yamada, C. J. Dubay, K. Lindpainter, D. Ganten, J. and M. D. Bennett (eds.), Kew Chromosome
-L. Guenet, G. M. Lathrop and J. S. Beckman. 1992. Conference III. Allen and Unwin, .
Shaw, D. D., A. D. Marchant, M. L. Arnold, N. mastodon and woolly mammoth demonstrated
Contreras and B. Kohlmann. 1990. The control of immunologically. Paleobiology 11:429-437.
gene flow across a narrow hybrid zone: A selec- Shows, T. B. and F. H. Ruddle. 1968. Function of the
tive role for chromosomal rearrangement. Can. J. lactate dehydrogenase B gene in mouse erythro-
Zool. 68:1761-1769. cytes: Evidence for control by a regulatory gene.
Shaw, J., T. R. Meagher and P. Harley. 1987. Electro- Proc. Natl. Acad. Sci. USA 61:574.
phoretic evidence of reproductive isolation Shriver, M. D., J. Li, R. Chakraborty and E.
between two varieties of the moss Climacium Boerwinkle. 1993. VNTR allele frequency distrib-
americanum. Heredity 59:337-343. utions under the stepwise mutation model: A
Sheffield, V. C., D. R. Cox and R. M. Myers. 1989. computer simulation approach. Genetics
Attachment of a 40-base pair G+C rich sequence 134:983-993.
(GC-clamp) to genomic fragments by polymerase Sibley, C. G. and J. E. Ahlquist. 1981a. The phylogeny
chain reaction results in improved detection of and relationships of the ratite birds as indicated
single base changes. Proc. Natl. Acad. Sci. USA by DNA-DNA hybridization, pp. 301-335. In G.
86:232-236. G. E. Scudder and J. L. Reveal (eds.), Evolution
Sheffield, V. C., J. S. Beck, E. M. Stone and R. M. Today. Carnegie-Mellon University, Pittsburgh,
Myers. 1992. A simple and efficient method for Pennsylvania.
attachment of a 40 base pair G+C rich sequence to Sibley, C. G. and J. E. Ahlquist. 1981b. Instructions for
PCR amplified DNA. BioTechniques 12:386-387. specimen preservation for DNA extraction: A
Sheldon, F. H. 1987. Rates of single-copy DNA evolu- valuable source of data for systematics. Assoc.
tion in herons. Mol. Biol. Evol. 4:56-69. Syst. Collections Newsletter 9:44-45.
Sheldon, F. H. and A. H. Bledsoe. 1989. Indexes to the Sibley, C. G. and J. E. Ahlquist. 1983. The phylogeny
reassociation and stability of solution DNA and classification of birds based on the data of
hybrids. J. Mol. Evol. 29:328-343. DNA-DNA hybridization, pp. 245-292. In R. F.
Sheldon, F. H., B. Slikas, M. Kinnarney, F. B. Gill, Johnston (ed.), Current Ornithology, Vol. 1.
E. Zhao and B. Silverin. 1992. DNA-DNA hybrid- Plenum, New York.
ization evidence of phylogenetic relationships Sibley, C. G. and J. Ahlquist. 1987a. Avian phylogeny
among major lineages of Parus. Auk 109:173-185. reconstructed from comparisons of the genetic
Shera, E. B., N. K. Seitzinger, L. M. Davis, R. A. Keller material, DNA, pp. 95-121. In C. Patterson (ed.),
and S. A. Soper. 1990. Detection of single fluores- Molecules and Morphology in Evolution: Conflict or
cent molecules. Chem. Phys. Letters 175:553-557. Compromise? Cambridge University Press,
Shields, G. F. and A. C. Wilson. 1987. Calibration of Cambridge.
mitochondrial DNA evolution in geese. J. Mol. Sibley, C. G. and J. E. Ahlquist. 1987b. DNA hybridiza-
Evol. 24:212-217. tion evidence of hominoid phylogeny: Results
Shinozaki, K., M. Ohme, M. Tanaka, T. Wakasugi, N. from an expanded data set. J. Mol. Evol.
Hayashida, T. Matsubayashi, N. Zaita, J. 26:99-121.
Chunwongse, J. Obokata, K. Yamaguchi- Sibley, C. G. and J. E. Ahlquist. 1990. Phylogeny and
Shinozaki, C. Ohto, K. Torazawa, B. Y. Meng, M. Classification of Birds. Yale University Press, New
Sugita, H. Deno, T. Kamogashira, K. Yamada, J. Haven.
Kusuda, F. Takaiwa, A. Kato, N. Tohdoh, H. Sibley, C. G., K. W. Corbin, J. E. Ahlquist and A.
Shimada and M. Sugiura. 1986. The complete Ferguson. 1974. Birds, pp. 89-176. In C. A. Wright
nucleotide sequence of tobacco chloroplast (ed.), Biochemical and Immunological Taxonomy of
genome: Its gene organization and expression. Animals. Academic Press, New York.
EMBO J. 5:2043-2049. Sibley, C. G., J. E. Ahlquist and F. H. Sheldon. 1987.
Shochat, D. and H. C. Dessauer. 1981. Comparative DNA hybridization and avian phylogenetics:
immunological study of albumins of Anolis Reply to Cracraft. Evol. Biol. 21:97-125.
lizards of the Caribbean Islands. Comp. Biochem. Sibley, C. G., J. E. Ahlquist and B. L. Monroe Jr. 1988. A
Physiol. 68A:67-73. classification of living birds based on DNA-DNA
Shoemaker, J. S. and W. M. Fitch. 1989. Evidence from hybridization studies. Auk 105:409-423.
nuclear sequences that invariable sites should be Siciliano, M. J. and C. R. Shaw. 1976. Separation and
considered when sequence divergence is calculat- visualization of enzymes on gels, pp. 185-209. In
ed. Mol. Biol. Evol. 6:270-289. I. Smith (ed.), Chromatographic and Electrophoretic
Shoshani, J., J. M. Lowenstein, D. A. Walz and M. Techniques. Vol. 2. Wm. Heineman Medical Books,
Goodman. 1985. Proboscidean origins of London.
Sidow, A. and A. C. Wilson. 1991. Compositional sta- Sites, J. W., Jr. and S. K. Davis. 1989. Phylogenetic rela-
tistics evaluated by computer simulation, pp. tionships and molecular variability within and
129-146. In M. M. Miyamoto and J. Cracraft among six chromosome races of Sceloporus gram-
(eds.), Phylogenetic Analysis of DNA Sequences. micus (Sauria, Iguanidae), based on nuclear and
Oxford University Press, New York, Oxford. mitochondrial markers. Evolution 43:296-317.
Sidow, A., T. Nguyen and T. P. Speed. 1992. Estimating Sites, J. W., Jr. and C. Moritz. 1987. Chromosome change
the fraction of invariable codons with a capture- and speciation revisited. Syst. Zool. 36:153-174.
recapture method. J. Mol. Evol. 35:253-260. Sites, J. W., Jr. and R. W. Murphy. 1991. Isozyme evi-
Silberman, J. D. and P. J. Walsh. 1992. Species identifi- dence for independently derived, duplicate
cation of spiny lobster phyllosome larvae via G3PDH loci among squamate reptiles. Can. J.
ribosomal DNA analysis. Mol. Mar. Biol. Zool. 69:2381-2396.
Biotechnol. 1:195-205. Sites, J. W., Jr., J. W. Bickham, B. A. Pytel, I. F.
Simmons, G. M., M. E. Kreitman, W. F. Quattlebaum Greenbaum and B. A. Bates. 1984. Biochemical
and N. Miyashita. 1989. Molecular analysis of the characters and the reconstruction of turtle phylo-
alleles of alcohol dehydrogenase along a cline in genies: Relationships among batagurine genera.
Drosophila melanogaster. I. Maine, North Carolina, Syst. Zool. 33:137-158.
and Florida. Evolution 43:392-392. Sites, J. W., Jr., R. L. Bezy and P. Thompson. 1986.
Simmons, N. B., M. J. Novacek and R. J. Baker. 1991. Nonrandom heteropolymer expression of lactate
Approaches, methods and the future of the dehydrogenase isozymes in the lizard family
Chiropteran monophyly controversy: A reply to J. Xantusiidae. Biochem. Syst. Ecol. 14:539-545.
D. Pettigrew. Syst. Zool. 40:239-241. Sites, J. W., Jr., D. M. Peccinini-Seale, C. Moritz, J. W.
Simon, C. 1979. Evolution of periodical cicadas: Wright and W. M. Brown. 1990. The evolutionary
Phylogenetic inferences based upon allozyme history of parthenogenetic Cnemidophorus lemnis-
data. Syst. Zool. 28:22-39. catus (Sauria, Teiidae). I. Evidence for a hybrid
Simon, C. 1991. Molecular systematics at the species origin. Evolution 44:889-905.
boundary: Exploiting conserved and variable Sites, J. W., Jr., S. K. Davis, D. W. Hutchison, B. A.
regions of the mitochondrial genome of animals Maurer and G. Lara. 1993. Parapatric hybridiza-
via direct sequencing of enzymatically amplified tion between chromosome races of the Sceloporus
DNA, pp. 33-71. In G. M. Hewitt, A. W. B. grammicus complex (Phrynosomatidae): Structure
Johnson and J. P. W. Young (eds.), Molecular of the Tulancingo transect. Copeia 1993:341-366.
Techniques in Taxonomy. NATO Advanced Studies Slade, R. W. 1992. Limited MHC polymorphism in the
Institute, H57. Springer, Berlin. southern elephant seal: Implications for MHC
Simon, C., S. Pääbo, T. D. Kocher and A. C. Wilson. evolution and marine mammal population biolo-
1990. Evolution of mitochondrial ribosomal RNA gy. Proc. Roy. Soc. London B 249:163-171.
in insects as shown by the polymerase chain reac- Slade, R. W., C. Moritz, A. Heideman and P. T.
tion, pp. 235-244. In M. Clegg and S. O'Brien Hale. 1993. Rapid assessment of single-copy
(eds.), Molecular Evolution. UCLA Symposium on nuclear DNA variation in diverse species. Mol.
Molecular and Cellular Biology, New Series. Vol. Ecol. 2:359-373.
122. Wiley-Liss, New York. Slade, R. W., C. Moritz and A. Heideman. 1994.
Simon, C., F. Frati, A. Beckenbach, B. Crespi, H. Liu Multiple nuclear-gene phylogenies: Application
and P. Flook. 1994. Evolution, weighting and phy- to pinnipeds and comparison with a mitochondri-
logenetic utility of mitochondrial gene sequences al DNA gene phylogeny. Mol. Biol. Evol.
and a compilation of conserved polymerase chain 11:341-356.
reaction primers. Ann. Entomol. Soc. Am. Slatkin, M. 1985. Gene flow in natural populations.
87:651-701. Annu. Rev. Ecol. Syst. 16:393-430.
Singer-Sam, J., R. C Tanguay and A. D. Riggs. 1989. Slatkin, M. 1987. Gene flow and the geographic struc-
Use of Chelex to improve the PCR signal from a ture of natural populat~ons.Science 236:787-792.
small number of cells. Amplifications 3:11. Slatkin, M. 1991. Inbreeding coefficients and coales-
Singh, G., N. Neckelmann and D. C. Wallace. 1987. cence times. Genet. Res. Camb. 58:167-175.
Conformational mutations in human mitochondr- Slatkin, M. 1993. Isolation by distance in equilibrium
ial DNA. Nature 329:270-272. and non-equilibrium populations. Evolution
Singh, R, S., R. C. Lewontin and A. A. Pelton. 1976. 47:264279.
Genetic heterogeneity within electropl~oretic Slatkin, M. 1995. A measure of population subdivision
"alleles" ol xanthine dehydrogenase in Drosophila based on rnicrosatellite frequencies. Genetics
pseudoobscuua. Genetics 84:609-629. 139:457-462.
Smith, M. J., Boom, J. D. G. and R. A. Raff. 1990.
Slatkin, M. and N. H. Barton. 1989. A comparison of Single-copy DNA distance between two con-
three indirect methods for estimating average lev- generic sea urchin species exhibiting radically dif-
els of gene flow. Evolution 43:1349-1368. ferent modes of development. Mol. Biol. Evol.
Slatkin, M. and R. R. Hudson. 1991. Pairwise compar- 7:315-326.
isons of mitochondrial DNA sequences in stable Smith, M. J., A. Arndt, S. Corski and E. Fajber. 1993.
and exponentially growing populations. Genetics The phylogeny of echinoderm classes based on
129:555-562. mitochondrial gene arrangements. J. Mol. Evol.
Slatkin, M. and W. Maddison. 1989. A cladistic mea- 36:545-554.
sure of gene flow inferred from the phylogenies Smith, M. L., J. N. Bruhn and J. B. Anderson. 1992. The
of alleles. Genetics 123:603-613. fungus Armillaria bulbosa is among the largest and
Slatkin, M. and W. P. Maddison. 1990. Detecting isola- oldest living organisms. Nature 356:428-431.
tion by distance using phylogenies of genes. Smith, M. W., C. F. Aquadro, M. H. Smith, R. K.
Genetics 126:249-260. Chesser and W. J. Etges. 1982. Bibliography of
Slightom, J. L., T. W. Theisen, B. P. Koop and M. Electrophoretic Studies of Biochemical Variation in
Goodman. 1987. Orangutan fetal globin genes. Natural Vertebrate Populations. Texas Tech Press,
Nucleotide sequences reveal multiple gene con- Lubbock.
versions during hominid phylogeny. J. Biol. Smith, T. A., J. Whelan and P. J. Parry. 1992. Detection of
Chem. 262:7472-7483. single-base mutations in mixed population of
Small, E., S. E. Warwick and B. Brookes. 1992. Isozyme cells: A comparison of SSCP and direct DNA
variation and alleged progenitor-derivative rela- sequencing. GATA 9:143-145.
tionships in the Medicago murex complex Smith, T. F., M. S. Waterman and W. M. Fitch. 1981.
(Fabaceae). Plant Syst. Evol. 181:33-43. Comparative biosequence metrics. J. Mol. Evol.
Smith, A. B. 1989. RNA sequence data in phylogenetic 18:38-46.
reconstruction: Testing the limits of its resolution. Smith, T. F., M. S. Waterman and C. Burks. 1985. The
Cladistics 5:321-344. statistical distribution of nucleic acid similarities.
Smith, A. B. 1994. Rooting molecular trees: Problems Nucl. Acids Res. 13:645-656.
and strategies. Biol. J. Linnean Soc. 51:279-292. Smith, V., M. Craxton, A. T. Bankier, C. M. Brown, W.
Smith, C. A., J. M. Jordan and J. Vinograd. 1971. In D. Rawlinson, M. S. Chee and B. G. Barrell. 1993.
vivo effects of intercalating drugs on the superhe- Preparation and fluorescent sequencing of M13
lix density of mitochondrial DNA isolated from clones: Microtiter methods. Meth. Enzymol.
human and mouse cells in culture. J. Mol. Biol. 218:173-187.
59:255-272. Smithies, O. 1955. Zone electrophoresis in starch gels:
Smith, G. R. 1992. Introgression in fishes: Significance Group variations in the serum proteins of normal
for paleontology, cladistics, and evolutionary individuals. Biochem. J. 61:629-641.
rates. Syst. Biol. 41:41-57. Smouse, P. E., T. E. Dowling, J. Tworek, W. R. Hoeh
Smith, J. J., J. S. Scott-Craig, J. R. Leadbetter, G. L. and W. M. Brown. 1991. Effects of intraspecific
Bush, D. L. Roberts and D. W. Fulbright. 1995. variation on phylogenetic inference: A likelihood
Characterization of random amplified polymor- analysis of mtDNA restriction site data in
phic DNA (RAPD) products from Xanthomonas cyprinid fishes. Syst. Zool. 40:393-409.
campestris: Implications for the use of RAPD prod- Sneath, P. H. A. and R. R. Sokal. 1973. Numerical
ucts in phylogenetic analysis. Mol. Phylogenet. Taxonomy. W. H. Freeman, San Francisco.
Evol. 3:135-145. Sneath, P. H. A., M. J. Sackin and R. Amber. 1975.
Smith, J. S. C. and O. S. Smith. 1991. Restriction frag- Detecting evolutionary incompatibilities from
ment length polymorphisms can differentiate protein sequences. Syst. Zool. 24:311-332.
among U.S. maize hybrids. Crop Sci. 31:893-899. Snedecor, G. W. and W. G. Cochran. 1989. Statistical
Smith, M. F., W. K. Thomas and J. L. Patton. 1992. Methods. 8th ed. Iowa State University Press, Ames.
Mitochondrial-like sequence in the nuclear Sober, E. 1983. Parsimony in systematics:
genome of an Akodontine rodent. Mol. Biol. Evol. Philosophical issues. Annu. Rev. Ecol. Syst.
9:204-215. 14:335-357.
Smith, M. J., R. Nicholson, M. Stuerzl and A. Lui. 1982. Sober, E. 1989. Reconstructing the Past: Parsimony,
Single copy DNA homology in sea stars. J. Mol. Evolution, and Inference. MIT Press, Cambridge,
Evol. 18:92-101. Massachusetts.
Sogin, M. L. 1989. Evolution of eukaryotic microor- Soltis, D. E., P. S. Soltis and B. G. Milligan. 1992.
ganisms and their small subunit ribosomal RNAs. Intraspecific chloroplast variation: Systematic and
Amer. Zool. 29:487-499. phylogenetic implications, pp. 117-150. In P. S.
Sogin, M. L. 1990. Amplification of ribosomal RNA Soltis, D. E. Soltis and J. J. Doyle (eds.), Plant
genes for molecular evolution studies, pp. Molecular Systematics. Chapman and Hall, New
307-314. In M. A. Innis, D. H. Gelfand, J. J. York.
Sninsky and T. J. White (eds.), PCR Protocols: A Soltis, P. S. and D. E. Soltis. 1994. Plant molecular sys-
Guide to Methods and Applications. Academic Press, tematics: Inferences of phylogeny and evolution-
San Diego. ary processes. Evol. Biol. 28:139-194.
Sogin, M. L., H. J. Elwood and J. H. Gunderson. 1986. Song, K. M., T. C. Osborn and P. H. Williams. 1988.
Evolutionary diversity of eukaryotic small-sub- Brassica taxonomy based on nuclear restriction
unit rRNA genes. Proc. Natl. Acad. Sci. USA fragment length polymorphisms (RFLPs). 1.
83:1383-1387. Genome evolution of diploid and amphidiploid
Sogin, M. L., J. H. Gunderson, H. J. Elwood, R. A. species. Theor. Appl. Genet. 75:784-794.
Alonso and D. A. Peattie. 1989. Phylogenetic Song, K. M., T. C. Osborn and P. H. Williams. 1990.
meaning of the kingdom concept: An unusual Brassica taxonomy based on nuclear restriction
ribosomal RNA from Giardia lamblia. Science fragment length polymorphisms (RFLPs). 3.
243:75-77. Genome relationships in Brassica and related gen-
Sokal, R. R. and F. J. Rohlf. 1981. Biometry, Second era and the origin of B. oleracea x B. rapa (syn.
Edition. W. H. Freeman and Co., San Francisco. campestris). Theor. Appl. Genet. 79:497-506.
Solignac, M., Génermont, J., Monnerot, M., J.-C. Soper, S. A., L. M. Davis, R. R. Fairfield, M. L.
Mounolou. 1984. Genetics of mitochondria in Hammond, C. A. Harger, J. H. Jett, R. A. Keller, B.
Drosophila: mtDNA inheritance in heteroplasmic L. Marone, J. C. Martin, H. L. Nutter, E. B. Shera
strains of D. mauritiana. Mol. Gen. Genet. and D. J. Simmons. 1991. Rapid DNA sequencing
197:183-88. based on single molecule detection. Proc. Int. Soc.
Soltis, D. E. and L. J. Rieseberg. 1986. Autopolyploidy Opt. Engin. 1435:168.
in Tolmiea menziesii (Saxifragaceae): Genetic Sourdis, J. and C. Krimbas. 1987. Accuracy of phyloge-
insights from enzyme electrophoresis. Am. J. Bot. netic trees estimated from DNA sequence data.
73:310-318. Mol. Biol. Evol. 4:159-168.
Soltis, D. E. and P. S. Soltis. 1989. Polyploidy, breeding Southern, E. M. 1975. Detection of specific sequences
systems, and genetic differentiation in homo- among DNA fragments separated by gel elec-
sporous pteridophytes, pp. 241-258. In D. E. Soltis trophoresis. J. Mol. Biol. 98:503-517.
and P. S. Soltis (eds.), Isozymes in Plant Biology. Spears, T., L. G. Abele and W. Kim. 1992. The mono-
Dioscorides Press, Portland, Oregon. phyly of brachyuran crabs: A phylogenetic study
Soltis, D. E., C. H. Haufler, D. C. Darrow and G. J. based on 18S rRNA. Syst. Biol. 41:446-461.
Gastony. 1983. Starch gel electrophoresis of ferns: Spencer, D. F., M. N. Schnare and M. W. Gray. 1984.
A compilation of grinding buffers, gel and elec- Pronounced structural similarities between the
trode buffers, and staining schedules. Am. Fern J. small ribosomal RNA genes of wheat mitochon-
73:9-27. dria and Escherichia coli. Proc. Natl. Acad. Sci.
Soltis, D. E., P. S. Soltis and B. D. Ness. 1989a. USA 81:493-497.
Chloroplast DNA variation and multiple origins Spencer, E. W., V. M. Ingram and C. Levinthal. 1966.
of autopolyploidy in Heuchera micrantha Electrophoresis: An accident and some precau-
(Saxifragaceae). Evolution 43:650-656. tions. Science 152:1722-1723.
Soltis, D. E., T. A. Ranker and B. D. Ness. 1989b. Spencer, N., D. A. Hopkinson and H. Harris. 1964.
Chloroplast DNA variation in a wild plant, Phosphoglucomutase polymorphism in man.
Tolmiea menziesii. Genetics 121:819-826. Nature 204:742-745.
Soltis, D. E., P. S. Soltis, M. T. Clegg and M. Durbin. Spencer, N., D. A. Hopkinson and H. Harris. 1968.
1990. rbcL sequence divergence and phylogenetic Adenosine deaminase polymorphism in man.
relationships in the Saxifragaceae sensu lato. Proc. Ann. Human Genet. 32:9-14.
Natl. Acad. Sci. USA 87:4640-4644. Spielman, R. S., J. V. Neel and F. H. F. Li. 1977.
Soltis, D. E., P. S. Soltis, T. G. Collier and M. L. Inbreeding estimation from population data.
Edgerton. 1991. Chloroplast variation within and Models, procedures and implications. Genetics
among genera of the Heuchera group 85:355-371.
(Saxifragaceae): Evidence for chloroplast transfer
and paraphyly. Am. J. Bot. 78:1150-1161.
Spinella, D. G. and R. C. Vrijenhoek. 1982. Genetic dis- Stecher, P. G., M. Windholz, D. S. Leahy, D. M. Bolton
section of clonally inherited genomes of and L. G. Eaton. 1968. Merck Index. 8th ed.
Poeciliopsis. II. Investigation of a silent car- Steel, M. 1994a. Recovering a tree from the Markov
boxylesterase allele. Genetics 100:279-286. leaf colourations it generates under a Markov
Spolsky, C., C. A. Phillips and T. Uzzell. 1992. model. Appl. Math. Lett. 7:19-23 (also published
Gynogenetic reproduction in hybrid mole sala- as May 1995, Research Rep. 103, Mathematics
manders (genus Ambystoma). Evolution Dept., University of Christchurch, NZ).
46:1935-1944. Steel, M. A. 1994b. The maximum likelihood point for
Springer, M. S. 1988. The phylogeny of diprotodontian a phylogenetic tree is not unique. Syst. Biol.
marsupials based on single-copy DNA-DNA 43:560-564.
hybridization and craniodental anatomy. Ph.D. Steel, M. A., M. D. Hendy and D. Penny. 1993a.
dissertation, University of California, Riverside. Parsimony can be consistent! Syst. Biol.
Springer, M. S. and J. A. W. Kirsch. 1989. Rates of sin- 42:581-587.
gle-copy DNA evolution in phalangeriform mar- Steel, M. A., P. J. Lockhart and D. Penny. 1993b.
supials. Mol. Biol. Evol. 6:331-341. Confidence in evolutionary trees from biological
Springer, M. S. and J. A. W. Kirsch. 1991. DNA sequence data. Nature 364:440-442.
hybridization, the compression effect, and the Steel, M. A., L. Szekely, P. L. Erdos and P. J. Waddell.
radiation of diprotodontian marsupials. Syst. 1993c. A complete family of phylogenetic invari-
Zool. 40:131-151. ants for any number of taxa under Kimura's 3ST
Springer, M. S. and C. Krajewski. 1989. DNA model. New Zealand J. Bot. 31:289-296.
hybridization in animal taxonomy: A critique Steffen, D. L., G. T. Cocks and A. C. Wilson. 1972.
from first principles. Quart. Rev. Biol. 64:291-318. Micro-complement fixation in Klebsiella classifica-
Springer, M. S., Kirsch, J. A. W., Aplin, K. and T. tion. J. Bacteriol. 110:803-808.
Flannery. 1990. DNA hybridization, cladistics, and Steinemann, M., W. Pinsker and D. Sperlich. 1984.
the phylogeny of phalangerid marsupials. J. Mol. Chromosome homologies within the Drosophila
Evol. 30:298-311. obscura group probed by in situ hybridization.
Springer, M. S., Davidson, E. H. and R. J. Britten. Chromosoma 91:46-53.
1992a. Calculation of sequence divergence from Steiner, W. W. M. and D. J. Joslyn. 1979.
the thermal stability of DNA heteroduplexes. J. Electrophoretic techniques for the genetic study
Mol. Evol. 34:379-382. of mosquitoes. Mosquito News 39:35-54.
Springer, M. S., McKay, G., Aplin, K. and J. A. W. Steinmuller, J., E. Schleiermacher and H. Scherthan.
Kirsch. 1992b. Relations among ringtail possums 1993. Direct detection of repetitive, whole chro-
(Marsupialia: Pseudocheiridae) based on DNA- mosome paint and telomere DNA probes by
DNA hybridisation. Australian J. Zool. 40:423-435. immunogold electron microscopy. Chromosome
St. Louis, V. L. and J. C. Barlow. 1988. Genetic differen- Res. 1:45-51.
tiation among ancestral and introduced popula- Stephen, W. P. 1974. Insects, pp. 303-349. In C. A.
tions of the Eurasian tree sparrow (Passer mon- Wright (ed.), Biochemical and Immunological
tanus). Evolution 42:266-276. Taxonomy of Animals. Academic Press, New York.
Stahl, D. A., D. J. Lane, G. J. Olsen and N. R. Pace. Stephens, J. C. 1985. Statistical methods of DNA
1984. Analysis of hydrothermal vent-associated sequence analysis: Detection of intragenic recom-
symbionts by ribosomal RNA sequences. Science bination or gene conversion. Mol. Biol. Evol.
224:409-411. 2:539-556.
Stallings, R. L., A. F. Ford, D. Nelson, D. C. Torney, C. Stewart, C.-B. 1995. Active ancestral molecules. Nature
E. Hildebrand and R. K. Moyzis. 1991. Evolution 374:12-13.
and distribution of (GT)n repetitive sequences in Stewart, C.-B. and A. C. Wilson. 1987. Sequence con-
mammalian genomes. Genomics 10:807-815. vergence and functional adaptation of stomach
Stanhope, M. J., J. Czelusniak, J.-S. Si, J. Nickerson and lysozymes from foregut fermenters. Cold Spring
M. Goodman. 1992. A molecular perspective on Harbor Symp. Quant. Biol. 52:891-899.
mammalian evolution from the gene encoding Stoneking, M., B. May and J. Wright. 1981. Loss of
interphotoreceptor retinoid binding protein, with duplicate gene expression in salmonids: Evidence
convincing evidence for bat monophyly. Mol. for a null allele polymorphism at the duplicate
Phylogenet. Evol. 1:148-160. aspartate aminotransferase loci in brook trout
Stanton, M. 1986. Unveiling the mystery of plant (Salvelinus fontinalis). Biochem. Genet.
paternity. Trends Ecol. Evol. 1:116-117. 19:1063-1077.
Stoneking, M., S. T. Sherry and L. Vigilant. 1992. Freshwater Fishes. Stanford University Press,
Geographic origin of human mitochondrial DNA Stanford.
revisited. Syst. Biol. 41:384-391. Swofford, D. L. and R. B. Selander. 1981. BIOSYS-1: A
Stowell, R. E. (ed.), 1965. Cryobiology, Fed. Proc. FORTRAN program for the comprehensive analy-
24:Sl-S324. sis of electrophoretic data in population genetics
Strobeck, C. and K. Morgan. 1978. The effect of intra- and systematics. J. Hered. 72:281-283.
genic recombination on the number of alleles in a Sytsma, K. J. 1990. DNA and morphology: Inference of
finite population. Genetics 88:829-844. plant phylogeny. Trends Ecol. Evol. 5:104-110.
Studier, J. A. and K. J. Keppler. 1988. A note on the Sytsma, K. J. and W. J. Hahn. 1994. Molecular
neighbor-joining algorithm of Saitou and Nei. Systematics: 1991-1993. Progr. Botany 55:307-333.
Mol. Biol. Evol. 5:729-731. Székely, L. A., M. A. Steel and P. L. Erdos. 1993.
Sturtevant, A. H. and E. Novitski. 1941. The homolo- Fourier calculus on evolutionary trees. Adv. Appl.
gies of the chromosome elements in the genus Math. 14:200-216.
Drosophila. Genetics 26:517-541. Szmidt, A. E., R. Alden and J.-E. Hällgren. 1987.
Sullivan, J., K. E. Holsinger and C. Simon. 1995a. Paternal inheritance of chloroplast DNA in Larix.
Among-site rate variation and phylogenetic Plant Mol. Biol. 9:59-64.
analysis of 12S rRNA in sigmodontine rodents. Szymura, J. M. and N. H. Barton. 1986. Genetic analy-
Mol. Biol. Evol. 11:261-277. sis of a hybrid zone between the fire-bellied
Sullivan, J., K. E. Holsinger and C. Simon. 1995b. The toads, Bombina bombina and B. variegata, near
effect of topology on estimates of among-site rate Cracow in southern Poland. Evolution
variation. J. Mol. Evol. (in press). 40:1141-1159.
Sumner, A. T. 1990. Chromosome Banding. Unwin Szymura, J. M. and N. H. Barton. 1991. The genetic
Hyman, London. structure of the hybrid zone between the firebel-
Suzuki, H., K. Moriwaka and E. Nevo. 1987. lied toads Bombina bombina and B. variegata:
Ribosomal DNA (rDNA) spacer polymorphism in Comparisons between transects and between loci.
mole rats. Mol. Biol. Evol. 4:602-610. Evolution 45:237-261.
Swofford, D. L. 1981. On the utility of the distance
Wagner procedure, pp. 25-43. In V. A. Funk and Taberlet, P. and J. Bouvet. 1994. MtDNA polymor-
D. R. Brooks (eds.), Advances in Cladistics. Proc. phism, phylogeography and conservation genet-
First Meeting of the Willi Hennig Soc., New York ics of the brown bear Ursus arctos in Europe. Proc.
Bot. Garden, Bronx. Roy. Soc. London B 255:195-200.
Swofford, D. L. 1991. When are phylogeny estimates Tabor, S. and C. C. Richardson. 1987. DNA sequence
from molecular and morphological data incon- analysis with a modified bacteriophage T7 DNA
gruent?, pp. 295-333. In M. M. Miyamoto and J. polymerase. Proc. Natl. Acad. Sci. USA
Cracraft (eds.), Phylogenetic Analysis of DNA 84:4767-4771.
Sequences. Oxford University Press, New York. Taggart, J. B. and A. Ferguson. 1994. A composite
Swofford, D. L. 1993. PAUP: Phylogenetic Analysis DNA size reference ladder suitable for routine
Using Parsimony, version 3.1. Formerly distrib- application in DNA fingerprinting/profiling
uted by Illinois Natural History Survey, studies. Mol. Ecol. 3:271-272.
Champaign, Illinois. Tajima, F. 1983. Evolutionary relationship of DNA
Swofford, D. L. 1996. PAUP*: Phylogenetic Analysis sequences in finite populations. Genetics
Using Parsimony (and Other Methods), version 4.0. 105:437-460.
Sinauer Associates, Sunderland, Massachusetts. Tajima, F. and M. Nei. 1982. Biases of the estimates of
Swofford, D. L. and S. H. Berlocher. 1987. Inferring DNA divergence obtained by the restriction
evolutionary trees from gene frequency data enzyme technique. J. Mol. Evol. 18:115-120.
under the principle of maximum parsimony. Syst. Tajima, F. and M. Nei. 1984. Estimation of evolution-
Zool. 36:293-325. ary distance between nucleotide sequences. Mol.
Swofford, D. L. and W. P. Maddison. 1987. Biol. Evol. 1:269-285.
Reconstructing ancestral character states under Tajima, F. and N. Takezaki. 1994. Estimation of evolu-
Wagner parsimony. Math. Biosci. 87:199-229. tionary distance for reconstructing molecular
Swofford, D. L. and W. P. Maddison. 1992. Parsimony, phylogenetic trees. Mol. Biol. Evol. 11:278-286.
character-state reconstructions, and evolutionary Takahata, N. 1989. Gene genealogy in three related
inferences, pp. 186-223. In R. L. Mayden (ed.), populations: Consistency probability between
Systematics, Historical Ecology, and North American gene and population trees. Genetics 122:957-966.
Takahata, N. and S. R. Palumbi. 1985. Extranuclear dif- sis of natural populations. Genetics
ferentiation and gene flow in the finite island 120:1145-1154.
model. Genetics 109:441-457. Templeton, A. R., K. Shaw, B. Routman and S. K. Davis.
Tammar, A. R. 1974. Bile salts of Amphibia, pp. 67-76. 1989. The genetic consequences of habitat fragmen-
In M. Florkin and B. T. Scheer (eds.), Chemical tation. Ann. Missouri Bot. Garden 77:13-27.
Zoology. Academic Press, New York. Templeton, A. R., K. A. Crandall and C. F. Sing. 1992.
Tamura, K. and M. Nei. 1993. Estimation of the num- A cladistic analysis of phenotypic associations
ber of nucleotide substitutions in the control with haplotypes inferred from restriction endonu-
region of mitochondrial DNA in humans and clease mapping and DNA sequence data. III.
chimpanzees. Mol. Biol. Evol. 10:512-526. Cladogram estimation. Genetics 132:619-633.
Tateno, Y., M. Nei and F. Tajima. 1982. Accuracy of Templeton, A. R., B. Routman and C. A. Phillips. 1995.
estimated phylogenetic trees from molecular data. Separating population structure from population
I. Distantly related trees. J. Mol. Evol. 18:387-404. history: A cladistic analysis of the geographical
Tateno, Y., N. Takezaki and M. Nei. 1994. Relative effi- distribution of mitochondrial DNA haplotypes in
ciencies of the maximum-likelihood, neighbor the tiger salamander, Ambystoma tigrinum.
joining, and maximum-parsimony methods when Genetics 140:767-782.
substitution rate varies with site. Mol. Biol. Evol. Tereba, A. and B. J. McCarthy. 1973. Hybridization of
11:261-277. 125I-labeled ribonucleic acid. Biochemistry
Tautz, D. 1989. Hypervariability of simple sequences 12:4675-4679.
as a general source for polymorphic DNA mark- Therman, E. and M. Susman. 1993. Human
ers. Nucl. Acids Res. 17:6463-6471. Chromosomes: Structure, Behavior, and Effects.
Tavaré, S. 1986. Some probabilistic and statistical prob- Springer-Verlag, New York.
lems on the analysis of DNA sequences. Lec. Thomas, M. R. and N. S. South. 1993. Microsatellite
Math. Life Sci. 17:57-86. repeats in grapevine reveal DNA polymorphisms
Taylor, A. C., W. B. Sherwin and R. K. Wayne. 1994. when analyzed as sequence-tagged sites (STSs).
Genetic variation of microsatellite loci in a bottle- Theor. Appl. Genet. 86:985-990.
necked species: The northern hairy-nosed wom- Thomas, W. K. and A. T. Beckenbach. 1989. Variation
bat Lasiorhinus krefftii. Mol. Ecol. 3:277-290. in salmonid mitochondrial DNA: Evolutionary
Taylor, H. A., S. E. Riley, S. E. Parks and R. E. constraints and mechanisms of substitution. J.
Stevenson. 1978. Longterm storage of tissue sam- Mol. Evol. 29:233-245.
ples for cell culture. In Vitro 14:476-478. Thomas, W. K. and S. Pääbo. 1993. DNA sequences
Tegelstrom, H. 1986. Mitochondrial DNA in natural from old tissue remains. Meth. Enzymol.
populations: An improved routine for the screen- 224:406-419.
ing of genetic variation based on sensitive silver- Thomas, W. K., S. Pääbo, F. Villablanca and A. C.
staining. Electrophoresis 7:226-229. Wilson. 1990. Spatial and temporal continuity of
Templeton, A. R. 1983a. Convergent evolution and kangaroo rat populations shown by sequencing
non-parametric inferences from restriction frag- mitochondrial DNA from museum specimens. J.
ment and DNA sequence data, pp. 151-379. In B. Mol. Evol. 31:101-112.
Weir (ed.), Statistical Analysis of DNA Sequence Thompson, E. A. 1973. The method of minimum evo-
Data. Marcel Dekker, New York. lution. Ann. Human Genet. 36:333-340.
Templeton, A. R. 1983b. Phylogenetic inference from Thorne, J. S., D. L. Swofford, J. Felsenstein and B. S.
restriction endonuclease cleavage site maps with Wiegmann. 1996. The topology-dependent per-
particular reference to the humans and apes. mutation test for monophyly does not test for
Evolution 37:221-244. monophyly. (unpublished manuscript)
Templeton, A. R. 1987. Nonparametric inference from Thorpe, J. P. 1982. The molecular clock hypothesis:
restriction cleavage sites. Mol. Biol. Evol. Biochemical evaluation, genetic differentiation
4:315-319. and systematics. Annu. Rev. Ecol. Syst.
Templeton, A. R. 1993. The "Eve" hypothesis: A genet- 13:139-168.
ic critique and reanalysis. Am. Anthropol. Thorpe, R. S., D. P. McGregor, A. M. Cummings and
95:51-72. W. C. Jordan. 1994. DNA evolution and coloniza-
Templeton, A. R., C. F. Sing, A. Kessling and S. tion sequence of island lizards in relation to geo-
Humphries. 1988. A cladistic analysis of pheno- logical history: mtDNA RFLP, cytochrome b,
type associations with haplotypes inferred from cytochrome oxidase, 12S rRNA sequence, and
restriction endonuclease mapping. II. The analy- nuclear RAPD analysis. Evolution 48:230-240.
Tibbets, C. A. and T. E. Dowling. 1995. Effects of Tucker, P. K., B. K. Lee and E. M. Eicher. 1989. Y chro-
intrinsic and extrinsic factors on population frag- mosome evolution in the subgenus Mus (genus
mentation in three North American minnows Mus). Genetics 122:169-179.
(Teleostei: Cyprinidae). Evolution (in press). Tucker, P. K., R. D. Sage, J. Warner, A. C. Wilson and E.
Tilley, S. G. 1981. A new species of Desmognathus M. Eicher. 1992. Abrupt cline for sex chromo-
(Amphibia: Caudata: Plethodontidae) from the somes in a hybrid zone between two species of
southern Appalachian mountains. Occ. Pap. Mus. mice. Evolution 46:1146-1163.
Zool. University of Michigan 695:1-23. Turner, B. J. 1973. Genetic variation of mitochondrial
Tilley, S. G. and J. S. Hansman. 1976. Allozymic varia- aspartate aminotransferase in the teleost
tion and occurrence of multiple inseminations in Cyprinodon nevadensis. Comp. Biochem. Physiol.
populations of the salamander Desmognathus 44B:89-92.
ochrophaeus. Copeia 1976:734-741. Turner, B. J. 1974. Genetic divergence of Death Valley
Tilley, S. G. and P. M. Schwerdtfeger. 1981. pupfish species: Biochemical versus morphologi-
Electrophoretic variation in Appalachian popula- cal evidence. Evolution 28:281-294.
tions of the Desmognathus fuscus complex Turner, B. J. 1980. A multiple slicer for starch gels.
(Amphibia: Plethodontidae). Copeia Isozyme Bull. 13:113.
1981:109-119. Turner, B. J. 1984. Evolutionary genetics of artificial
Tillier, E. R. M. 1994. Maximum likelihood with multi- refugium populations of an endangered species,
parameter models of substitution. J. Mol. Evol. the desert pupfish. Copeia 1984:364-369.
39:409-417. Turner, B. J., R. R. Miller and E. M. Rasch. 1980.
Tillier, E. R. M. and R. A. Collins. 1995. Neighbor join- Significant differential gene duplication without
ing and maximum likelihood with RNA ancestral tetraploidy in a genus of Mexican fish.
sequences: Addressing the interdependence of Experientia 36:927-930.
sites. Mol. Biol. Evol. 12:7-15. Turner, B. J., J. S. Balsano, P. J. Monaco and E. M.
Timmis, J. N. and N. S. Scott. 1984. Promiscuous DNA: Rasch. 1983. Clonal diversity and evolutionary
Sequence homologies between DNA of separate dynamics in a diploid-triploid breeding complex
organelles. Trends Biochem. Sci. 9:271-273. of unisexual fishes (Poecilia). Evolution
Titus, T. A. and A. Larson. 1995. A molecular perspec- 37:798-809.
tive on the evolutionary radiation of the salaman- Turner, S., T. Burger-Wiersma, S. J. Giovannoni, L. R.
der family Salamandridae. Syst. Biol. 44:125-251. Mur and N. R. Pace. 1989. The relationship of a
Titus, T. A., D. M. Hillis and W. E. Duellman. 1989. prochlorophyte Prochlorothrix hollandica to green
Color polymorphism in neotropical treefrogs: An chloroplasts. Nature 337:380-382.
allozymic investigation of the taxonomic status of
Hyla favosa Cope. Herpetologica 45:17-23. Uetsuki, T., A. Naito, S. Nagata and Y. Kaziro. 1989.
Tjio, J. H. and A. Levan. 1956. The chromosome num- Isolation and characterization of the human chro-
ber of man. Hereditas 42:2-6. mosomal gene for polypeptide chain elongation
Tobler, J. E. and E. H. Grell. 1978. Genetics and physio- factor-1 alpha. J. Biol. Chem. 264:5791-5798.
logical expression of β-hydroxy acid dehydroge- Upholt, W. B. 1977. Estimation of DNA sequence
nase in Drosophila. Biochem. Genet. 16:333-342. divergence from comparison of restriction
Torroni, A., T. G. Schurr, C.-C. Yang, E. J. E. Szathmary, endonuclease digests. Nucl. Acids Res. 4:1257-65.
R. C. Williams, M. S. Schanfield, G. A. Troup, W. Utter, F., P. Aebersold and G. Winans. 1987.
C. Knowler, D. N. Lawrence, K. M. Weiss and B. Interpreting genetic variation detected by elec-
C. Wallace. 1992. Native American mitochondrial trophoresis, pp. 21-46. In N. Ryman and F. Utter
DNA analysis indicates that the Amerind and the (eds.), Population Genetics and Fishery Management.
Nadene populations were founded by two inde- University of Washington Press, Seattle.
pendent migrations. Genetics 130:153-162. Uy, R. and F. Wold. 1977. Posttranslational covalent
Tracey, T. E. and L. I. Mulcahy. 1991. A simple method modification of proteins. Science 198:890-896.
for direct automated sequencing of PCR frag- Uzzell, T. and K. W. Corbin. 1971. Fitting discrete
ments. BioTechniques 11:68-75. probability distributions to evolutionary events.
Trask, B. J. 1991. Fluorescence in situ hybridization: Science 172:1089-1096.
Applications in cytogenetics and gene mapping.
Trends Genet. 7:149-154. Vaccino, P., M. Accerbi and M. Corbellini. 1993.
Tripathi, R. L. 1991. Alternative dideoxy sequencing of Cultivar identification in T. aestivum using highly
double-stranded DNA. BioTechniques 12:390-391. polymorphic DNA probes. Theor. Appl. Genet.
87:833-836.
Valdés, A. M. and D. Piñero. 1992. Phylogenetic esti- Vawter, L. and W. M. Brown. 1986. Nuclear and mito-
mation of plasmid exchange in bacteria. chondrial DNA comparisons reveal extreme rate
Evolution 46:641-656. variation in the molecular clock. Science
Valdés, A. M., M. Slatkin and N. B. Freimer. 1993. 234:194-196.
Allele frequencies at microsatellite loci: The step- Vawter, L. and W. M. Brown. 1993. Rates and patterns
wise mutation model revisited. Genetics of base change in the small subunit ribosomal
133:737-749. RNA gene. Genetics 134:597-608.
Valentine, J. E., M. J. Boyle and W. A. Sewell. 1992. Verheyen, G. R., B. Kempenaers, T. Burke, M. Van Den
Presence of single-stranded DNA in PCR prod- Broeck, C. Van Broeckhoven and A. Dhont. 1994.
ucts of slow mobility. BioTechniques 13:222-224. Identification of hypervariable single locus min-
Van Beneden, R. J. and D. A. Powers. 1989. Structural isatellite DNA probes in the blue tit Parus
and functional differentiation of two clinally dis- caeruleus. Mol. Ecol. 3:137-143.
tributed glucosephosphate isomerase allelic Verkerk, A. and 20 others. 1991. Identification of a
isozymes from the teleost fish Fundulus heterocli- gene (FMR-1) containing a CGG repeat coincident
tus. Mol. Biol. Evol. 6:155-270. with a breakpoint cluster region exhibiting length
Van de Peer, Y., J. M. Neefs, P. De Rijk and R. De variation in Fragile X syndrome. Cell 65:905-914.
Wachter. 1993. Reconstructing evolution from Verma, R. S. and A. Babu. 1989. Human Chromosomes:
eukaryotic small-ribosomal-subunit RNA Manual of Basic Techniques. Pergamon Press, New
sequences: Calibration of the molecular clock. J. York.
Mol. Evol. 37:221-232. Vigilant, L., M. Stoneking, H. Harpending, K. Hawkes
Van Den Bussche, R. A., D. M. Hillis, J. P. Huelsenbeck and A. C. Wilson. 1991. African populations and
and R. J. Baker. 1996. Base compositional bias and the evolution of human mitochondrial DNA.
phylogenetic analyses: A test of the "flying DNA" Science 253:1503-1507.
hypothesis. (unpublished manuscript) Vilgalys, R. and B. L. Sun. 1994. Ancient and recent
Van Laarhoven, P. J. M. and E. H. L. Aarts. 1987. patterns of geographic speciation in the oyster
Simulated Annealing: Theory and Applications. mushroom Pleurotus revealed by phylogenetic
Reidel, Boston. analysis of ribosomal DNA sequences. Proc. Natl.
Vanlerberghe, F., B. Dod, P. Boursot, M. Bellis and F. Acad. Sci. USA 91:4599-4603.
Bonhomme. 1986. Absence of Y-chromosome Vogler, A. P. and R. DeSalle. 1994. Evolution and phy-
introgression across the hybrid zone between Mus logenetic information content of the ITS-1 region
musculus and Mus domesticus. Genet. Res. in the tiger beetle Cicindela dorsalis. Mol. Biol.
48:191-197. Evol. 11:393-405.
van Ooycn, A. V. Kwee and 13. Nusse. 1985. The Volpi, B. V, and A. Baldini. 1993. MULTIPRINS: A
nuclcoilde sequence of the human int-1 mamma- method for multicolor primed in situ labelling.
ry oncogene; evolutionary conservation of coding Chromosome Res. 1:257-260.
and non-coding sequences. EMBO J. 4:2905-2909. Vrijenhoek, R. C. 1989. Genetic diversity and the ecolo-
vanTets, P. and I. M. Cowan. 1966. Some sources of gy of asexual populations, pp.175-197. In K.
variallon in the blood sera of deer (Odocoileus) as Wiihrmann and S. Jain (eds.), Population Biology
revealed by starch gel electrophoresis. Can. J. and Evolufion. Springer-Verlag, New York.
Zool. 44531-647. Vrijenhoek, R. C., M. E. Douglass and G. K. Meffc.
Van Treuren, R., R. Bijlsma, W. Van Delden and N. J. 1985. Conservation genetics of endangered popu-
Ouborg. 1991. The significance of genetic erosion lations in Arizona. Science 229:400-402.
in the process of extinction. I. Genetic differentia-
tion in Salvia pratensis and Scabiosa columbaria in Waddell, P. J. 1995. Statistical methods of phylogenetic
relation to population size. Heredity 66:181-189. analysis, including Hadamard conjugations,
Varley, J. M., H. C. Macgregor, I. Nardi, C. Andrews LogDet transforms, and maximum likelihood.
and H. P. Erba. 1980. Cytological evidence of tran- Ph.D. dissertation, Massey University.
scription of highly repeated DNA sequences dur- Waddell, P. J. and M. D. Hendy. 1995. Families of order
ing the lampbrush stage in Triturus cristatus 2t-1 bipartition invariants under the generalised
carnifex. Chromosoma 80:289-307. Kimura 3P model. Massey University Mathe-
Vassart, G., M. Georges, R. Monsieur, W. Brocas, A. S. matical and Information Sciences Report, Series B.
Lequarre and D. Christophe. 1987. A sequence in Waddell, P. J. and D. Penny. 1996a. Evolutionary trees
M13 phage detects hypervariable minisatellites in of apes and humans from DNA sequences. In A.
human and animal DNA. Science 235:683-684. J. Lock and C. R. Peters (eds.), Handbook of
Symbolic Evolution. Clarendon Press, Oxford (in comparisons of higher plant plastocyanins.
press). Phytochemistry 15:137-141.
Waddell, P. J. and D. Penny. 1996b. Extending Wallace, D. G., M.-C. King and A. C. Wilson. 1973.
Hadamard conjugations to model sequence evo- Albumin differences among ranid frogs:
lution with variable rates across sites. Available Taxonomic and phylogenetic implications. Syst.
by anonymous ftp from onyx.si.edu. Zool. 22:1-13.
Waddell, P. J. and M. A. Steel. 1995. General time Walldorf, U. and B. T. Hovemann. 1990. Apis mellifera
reversible distances allowing a distribution of cytoplasmic elongation factor 1α (EF-1α) is close-
rates across sites. Research Report, Department of ly related to Drosophila melanogaster EF-1α. FEBS
Mathematics and Statistics, Canterbury Letters 267:245-249.
University. Walsh, P. S., D. A. Metzger and R. Higuchi. 1991.
Waddell, P. J., D. Penny, M. D. Hendy and G. Arnold. Chelex 100 as a medium for simple extraction of
1994. The sampling distributions and covariance DNA for PCR-based typing from forensic materi-
matrix of phylogenetic spectra. Mol. Biol. Evol. al. BioTechniques 10:506-513.
11:630-642. Walter, H., W. Selby and J. R. Fransisco. 1965.
Wagner, A., N. Blackstone, P. Cartwright, M. Dick, B. Altered electrophoretic mobilities of some ery-
Misof, P. Snow, G. P. Wagner, J. Bartels, M. Murtha throcytic enzymes as a function of their age.
and J. Pendleton. 1994. Surveys of gene families Nature 208:76-77.
using polymerase chain reaction: PCR selection Waples, R. S. 1989. A generalized approach for esti-
and PCR drift. Syst. Biol. 43:250-261. mating effective population size from temporal
Wagner, D. B., G. R. Furnier, M. A. Saghai-Maroof, S. changes in allele frequency. Genetics 121:379-391.
M. Williams, B. F. Dancik and R. W. Allard. 1987. Ward, R. D., B. J. McAndrew and G. P. Wallis. 1979.
Chloroplast DNA polymorphism in lodgepole Purine nucleoside phosphorylase variation in the
and jack pines and their hybrids. Proc. Natl. brook lamprey, Lampetra planeri (Bloch)
Acad. Sci. USA 84:2097-2100. (Petromyzone, Agnatha): Evidence for a trimeric
Wagner, W. H. 1983. Reticulistics: The recognition of enzyme structure. Biochem. Genet. 17:251-256.
hybrids and their role in cladistics and classifica- Ware, V. C., B. W. Tague, C. G. Clark, R. L. Gourse, R.
tion, pp. 63-79. In N. I. Platnick and V. A. Funk C. Brand and S. A. Gerbi. 1983. Sequence analysis
(eds.), Advances in Cladistics: Proceedings of the of 28S ribosomal DNA from the amphibian
Second Meeting of the Willi Hennig Society. Xenopus laevis. Nucl. Acids Res. 11:7795-7817.
Columbia University Press, New York. Waterman, M. S. 1984. General methods of sequence
Wahlund, S. 1928. The combination of populations comparison. Bull. Math. Biol. 46:473-500.
and the appearance of correlation examined from Waterman, M. S., T. F. Smith and W. A. Beyer. 1976.
the standpoint of the study of heredity. Hereditas Some biological sequence metrics. Adv. Math.
11:65-106. 20:367-387.
Wainwright, P. O., G. Hinkle, M. L. Sogin and S. K. Waterman, M. S., T. F. Smith, M. Singh and W. A.
Stickel. 1993. Monophyletic origins of the meta- Beyer. 1977. Additive evolutionary trees. J. Theor.
zoa: An evolutionary link with fungi. Science Biol. 64:199-213.
260:340-342. Waterman, M. S., J. Joyce and M. Eggert. 1991.
Wake, D. B. and A. Larson. 1987. Multidimensional Computer alignment of sequences, pp. 59-72. In
analysis of an evolving lineage. Science 238:42-48. M. M. Miyamoto and J. Cracraft (eds.),
Wake, D. B., G. Roth and M. H. Wake. 1983. On the Phylogenetic Analysis of DNA Sequences. Oxford
problem of stasis in organismal evolution. J. University Press, Oxford.
Theor. Biol. 101:211-224. Watson, P. R (ed.) 1978. Artificial breeding of non-
Wake, D. B., K. P. Yanev and M. M. Frelow. 1989. domestic animals. Symp. Zool. Soc. London
Sympatry and hybridization in a "ring species": 43:1-376.
The plethodontid salamander Ensatina Watt, J. L. and G. S. Stephen. 1986. Lymphocyte cul-
eschscholtzii, pp. 134-157. In D. Otte and J. A. ture for chromosome analysis, pp. 39-55. In D. E.
Endler (eds.), Speciation and Its Consequences. Rooney and B. H. Czepulkowski (eds.), Human
Sinauer, Sunderland, Massachusetts. Cytogenetics. IRL Press, Oxford.
Wakeley, J. 1993. Substitution rate variation among Watt, W. B. 1972. Intragenic recombination as a source
sites in hypervariable region 1 of human mito- of population genetic variability. Am. Nat.
chondrial DNA. J. Mol. Evol. 37:613-623. 106:737-753.
Wallace, D. G. and D. Boulter. 1976. Immunological
Watt, W. B. 1977. Adaptation at specific loci. I. Natural to isolations from mammalian, insect, higher
selection on phosphoglucose isomerase of Colias plant, algal, yeast, and bacterial sources. Analyt.
butterflies: Biochemical and population aspects. Biochem. 152:376-385.
Genetics 87:177-194. Wegnez, M. 1987. Letter to the editor. Cell 51:516.
Watt, W. B. 1983. Adaptation at specific loci. II. Weining, S. and P. Langridge. 1991. Identification and
Demographic and biochemical elements in the mapping of polymorphisms in cereals based on
maintenance of the Colias PGI polymorphism. the polymerase chain reaction. Theor. Appl.
Genetics 103:691-724. Genet. 82:209-216.
Watt, W. B. 1985. Bioenergetics and evolutionary Weir, B. S. 1989. Sampling properties of gene diversity,
genetics: Opportunities for new synthesis. Am. pp. 23-42. In A. H. D. Brown, M. T. Clegg, A. L.
Nat. 125:118-143. Kahler and B. S. Weir (eds.), Plant Population
Watt, W. B. 1986. Power and efficiency as indices of fit- Genetics, Breeding and Genetic Resources. Sinauer,
ness in metabolic organization. Am. Nat. Sunderland, Massachusetts.
127:629-653. Weir, B. S. 1990. Genetic Data Analysis. Sinauer,
Watt, W. B., P. A. Carter and S. M. Blower. 1985. Sunderland, Massachusetts.
Adaptation at specific loci. IV. Differential mating Weir, B. S. 1992a. Population genetics in the forensic
success among glycolytic allozyme genotypes of DNA debate. Proc. Natl. Acad. Sci. USA
Colias butterflies. Genetics 109:157-175. 89:11654-11659.
Watt, W. B., P. A. Carter and K. Donohue. 1986. Weir, B. S. 1992b. Independence of VNTR alleles
Females' choice of "good genotypes" as mates is defined as fixed bins. Genetics 130:873-887.
promoted by an insect mating system. Science Weir, B. S. 1994. Effects of inbreeding on forensic cal-
233:1187-1190. culations. Annu. Rev. Genet. 28:597-621.
Wayne, R. K., S. K. George, D. Gilbert, P. W. Collins, S. Weir, B. S. and C. C. Cockerham. 1984. Estimating F-
D. Kovach, D. Girman and N. Lehman. 1991a. A statistics for the analysis of population structure.
morphological and genetic study of the island Evolution 38:1358-1370.
fox, Urocyon littoralis. Evolution 45:1849-1868. Weir, B. S. and C. C. Cockerham. 1989a. Complete
Wayne, R. K., B. Van Valkenburgh and S. J. O'Brien. characterization of disequilibrium at two loci, pp.
1991b. Molecular distance and divergence time in 86-110. In M. W. Feldman (ed.), Mathematical
carnivores and primates. Mol. Biol. Evol. 8:297-319. Evolutionary Theory. Princeton University Press,
Weber, J. L. 1990. Informativeness of human Princeton.
(dC-dA)n·(dG-dT)n polymorphisms. Genomics Weir, B. S. and C. C. Cockerham. 1989b. Analysis of
7:524-530. disequilibrium coefficients, pp. 45-51. In W. G.
Weber, J. L. and P. E. May. 1989. Abundant class of Hill and T. F. C. Mackay (eds.), Evolution and
human DNA polymorphism which can be typed Animal Breeding: Reviews on Molecular and
using the polymerase chain reaction. Am. J. Quantitative Genetics Approaches in Honour of Alan
Human Genet. 44:388-396. Robertson. Commonwealth Agricultural Bureaux,
Weber, J. L. and C. Wong. 1993. Mutation of human Slough, United Kingdom.
short tandem repeats. Human Mol. Genet. Weisburg, W. G., M. E. Dobson, J. E. Samuel, G. A.
2:1123-1128. Dasch, L. P. Mallavia, L. Mandelco, J. E. Sechrest,
Weeden, N. F. 1983. Plastid isozymes, pp. 139-158. In E. Weiss and C. R. Woese. 1989a. Phylogenetic
S. D. Tanksley and T. J. Orton (eds.), Isozymes in diversity of the Rickettsiae. J. Bacteriol.
Plant Genetics and Breeding, Part A. Elsevier, 171:4202-4206.
Amsterdam. Weisburg, W. G., J. G. Tully, D. L. Rose, J. P. Petzel, H.
Weeden, N. F. and J. F. Wendel. 1989. Genetics of Oyaizu, D. Yang, L. Mandelco, J. Sechrest, T. G.
plant isozymes, pp. 46-72. In D. E. Soltis and P. S. Lawrence, J. Van Etten, J. Maniloff and C. R.
Soltis (eds.), Isozymes in Plant Biology. Dioscorides Woese. 1989b. A phylogenetic analysis of the
Press, Portland, Oregon. mycoplasmas: Basis for their classification. J.
Weeden, N. F., J. J. Doyle and M. Lavin. 1989. Bacteriol. 171:6455-6467.
Distribution and evolution of a glucosephosphate Weisburg, W. G., S. M. Barns, D. A. Pelletier and D. J.
isomerase duplication in the Leguminosae. Lane. 1991. 16S ribosomal DNA amplification for
Evolution 45:1637-1651. phylogenetic study. J. Bacteriol. 173:697-703.
Weeks, D. F., N. Beerman and O. M. Griffith. 1986. A Weisman, L. S., B. M. Krummel and A. C. Wilson.
small scale five-hour procedure for isolating mul- 1986. Evolutionary shift in the site of cleavage of
tiple samples of CsCl-purified DNA: Application prelysozyme. J. Biol. Chem. 261:2309-2313.
Weller, S. J., D. P. Pashley, J. A. Martin and J. L. White, M. B., M. Carvalho, D. Derse, S. J. O'Brien and
Constable. 1994. Phylogeny of noctuoid moths M. Dean. 1992. Detecting single base substitutions
and the utility of combining independent nuclear as heteroduplex polymorphisms. Genomics
and mitochondrial genes. Syst. Biol. 43:194-211. 12:301-306.
Werman, S. D., Davidson, E. H. and R. J. Britten. 1990. White, M. E., J. J. Bull, I. J. Molineux and D. M. Hillis.
Rapid evolution in a fraction of the Drosophila 1991. Experimental phylogenies from T7 bacterio-
nuclear genome. J. Mol. Evol. 30:281-289. phage, pp. 935-943. In E. Dudley (ed.), The Unity
Werth, C. R. 1985. Implementing an isozyme laborato- of Evolutionary Biology: Proceedings of the Fourth
ry at a field station. Virginia J. Sci. 36:53-76. International Congress of Systematic and
Werth, C. R. and M. D. Windham. 1987. A new model Evolutionary Biology. Dioscorides Press, Portland.
for speciation in polyploid pteridophytes result- White, M. J. D. 1973. Animal Cytology and Evolution.
ing from reciprocal silencing of homoeologous 3rd ed. Cambridge University Press, Cambridge.
genes. Am. J. Bot. 74:713-714. White, M. W., S. D. Mane and R. C. Richmond. 1988.
Werth, C. R., S. I. Guttman and W. H. Eshbaugh. Studies of esterase 6 in Drosophila melanogaster.
1985a. Electrophoretic evidence of reticulate evo- XVIII. Biochemical differences between the slow
lution in the Appalachian Asplenium complex. and fast allozymes. Mol. Biol. Evol. 5:41-62.
Syst. Bot. 10:184-192. White, T. J., N. Arnheim and H. A. Erlich. 1989. The
Werth, C. R., S. I. Guttman and W. H. Eshbaugh. polymerase chain reaction. Trends Genet.
1985b. Recurring origins of allopolyploid species 5:185-189.
in Asplenium. Science 228:731-733. White, T. J., T. Bruns, S. Lee and J. Taylor. 1990.
Wetmur, J. G. and N. Davidson. 1968. Kinetics of Amplification and direct sequencing of fungal
renaturation of DNA. J. Mol. Biol. 31:349-370. ribosomal RNA genes for phylogenetics, pp.
Wetton, J. H., R. E. Carter, D. T. Parkin and D. Walters. 315-322. In M. A. Innis, D. H. Gelfand, J. J.
1987. Demographic study of a wild house spar- Sninsky and T. J. White (eds.), PCR Protocols.
row population by DNA fingerprinting. Nature Academic Press, New York.
327:147-149. Whitehouse, E. and T. Spears. 1991. A simple method
Wheeler, Q. D. 1995. Systematics, the scientific basis for removing oil from cycle sequencing reactions.
for inventories of biodiversity. Biodiv. Conserv. BioTechniques 11:616-628.
4:476-489. Whitkus, R., J. Doebley and J. F. Wendel. 1994. Nuclear
Wheeler, W. C. 1989. The systematics of insect riboso- DNA markers in systematics and evolution, pp.
mal DNA, pp. 307-321. In B. Fernholm, K. Bremer 116-141. In L. Phillips and I. K. Vasil (eds.), DNA-
and H. Jornvall (eds.), The Hierarchy of Life. Based Markers in Plants. Kluwer Academic
Elsevier, Amsterdam. Publishers, Dordrecht, The Netherlands.
Wheeler, W. C. 1990a. Combinatorial weights in phy- Whitmore, D. H. 1990. Isoelectric focusing of proteins,
logenetic analysis: A statistical parsimony proce- pp. 81-105. In D. H. Whitmore (ed.),
dure. Cladistics 6:269-275. Electrophoretic and Isoelectric Focusing Techniques in
Wheeler, W. C. 1990b. Nucleic acid sequence phyloge- Fisheries Management. CRC Press, Boca Raton,
ny and random outgroups. Cladistics 6:363-368. Florida.
Wheeler, W. C. and D. Gladstein. 1992. MALIGN. Whitt, G. S. 1970. Developmental genetics of the lac-
American Museum of Natural History, New York. tate dehydrogenase isozymes of fish. J. Exp. Zool.
Wheeler, W. C. and D. Gladstein. 1994. MALIGN: A 175:1-36.
multiple sequence alignment program. J. Hered. Whitt, G. S. 1981. Evolution of isozyme loci and their
85:417. differential regulation, pp. 271-289. In G. G. E.
Wheeler, W. C. and R. L. Honeycutt. 1988. Paired Scudder and J. L. Reveal (eds.), Evolution Today,
sequence difference in ribosomal RNAs: Proceedings of the Second International Congress of
Evolutionary and phylogenetic implication. Mol. Systematic and Evolutionary Biology. Hunt Inst. Bot.
Biol. Evol. 5:90-96. Documentation, Carnegie-Mellon University,
Wheeler, W. C. and K. Nixon. 1995. A novel method Pittsburgh, Pennsylvania.
for economical diagnosis of cladograms under Whitt, G. S. 1983. Isozymes as probes and participants
Sankoff optimization. Cladistics 10:207-213. in developmental and evolutionary genetics, pp.
Wheeler, W. C., J. Gatesy and R. DeSalle. 1995. Elision: 1-40. In M. C. Rattazzi, J. G. Scandalios and G. S.
A method for accommodating multiple molecular Whitt (eds.), Isozymes: Current Topics in Biological
sequence alignments with alignment-ambiguous and Medical Research, Vol. 10. Genetics and
sites. Mol. Phylogenet. Evol. 4:1-9. Evolution. A. R. Liss, New York.
Whitt, G. S. 1987. Species differences in isozyme tissue Wilkinson, M. 1994. Common cladistic information
patterns: Their utility for systematic and evolu- and its consensus representation: Reduced Adams
tionary analyses, pp. 1-26. In M. C. Rattazzi, J. G. and reduced cladistic consensus trees and pro-
Scandalios and G. S. Whitt (eds.), Isozymes: files. Syst. Biol. 43:343-368.
Current Topics in Biological and Medical Research, Williams, J. G. K., A. R. Kubelik, K. J. Livak, J. A.
Vol. 15. Genetics, Development, and Evolution. A. R. Rafalski and S. V. Tingey. 1990. DNA polymor-
Liss, New York. phisms amplified by arbitrary primers are useful
Whitt, G. S., J. B. Shaklee and C. L. Markert. 1975. as genetic markers. Nucl. Acids Res.
Evolution of the lactate dehydrogenase isozymes 18:6531-6535.
of fishes, pp. 381-400. In C. L. Markert (ed.), Williams, P. L. and W. M. Fitch. 1989. Finding the min-
Isozymes IV: Genetics and Evolution. Academic imal change in a given tree, pp. 453-470. In B.
Press, New York. Fernholm, K. Bremer and H. Jornvall (eds.), The
Whittemore, A. and B. Schaal. 1991. Interspecific gene Hierarchy of Life. Elsevier, Amsterdam.
flow in sympatric oaks. Proc. Natl. Acad. Sci. USA Williams, S. M., R. DeSalle and C. Strobeck. 1985.
88:2540-2544. Homogenization of geographical variants at the
Wichman, H. A., S. S. Potter and D. S. Pine. 1985. Mys, nontranscribed spacer of rDNA in Drosophila mer-
a family of mammalian transposable elements catorum. Mol. Biol. Evol. 2:338-346.
isolated by phylogenetic screening. Nature Williams, S. M., G. R. Furnier, E. Fuog and C. Strobeck.
312:77-81. 1987. Evolution of the ribosomal DNA spacers of
Wichman, H. A., C. T. Payne, O. A. Ryder, M. J. Drosophila melanogaster: Different patterns of vari-
Hamilton, M. Maltbie and R. J. Baker. 1991. ation on X and Y chromosomes. Genetics
Genomic distribution of heterochromatic 116:225-232.
sequences in equids: Implications to rapid chro- Williams, S. M., R. W. DeBry and J. L. Feder. 1988. A
mosomal evolution. J. Hered. 82:369-377. commentary on the use of ribosomal DNA in sys-
Wienberg, J. R., A. Jauch, R. Stanyon and T. Cremer. tematic studies. Syst. Zool. 37:60-63.
1990. Molecular cytotaxonomy of primates by Wilson, A. C., V. M. Sarich and L. R. Maxson. 1974.
chromosomal in situ suppression hybridization. The importance of gene rearrangement in evolu-
Genomics 8:347-350. tion: Evidence from studies of rates of chromoso-
Wienberg, J. R., C. A. Stanyon and T. Cremer. 1992. mal, protein, and anatomical evolution. Proc.
Homologies in human and Macaca fuscata chro- Natl. Acad. Sci. USA 71:3028-3030.
mosomes revealed by in situ suppression Wilson, A. C., G. L. Bush, S. M. Case and M. C. King.
hybridization with human chromosome-specific 1975. Social structuring of mammalian popula-
DNA libraries. Chromosoma 101:265-270. tions and rate of chromosomal evolution. Proc.
Wiens, J. J. and P. T. Chippindale. 1994. Combining Natl. Acad. Sci. USA 72:5061-5065.
and weighting characters and the prior agreement Wilson, A. C., S. S. Carlson and T. J. White. 1977.
approach revisited. Syst. Biol. 43:564-566. Biochemical evolution. Annu. Rev. Biochem.
Wiens, J. J. and D. M. Hillis. 1996. Accuracy of parsi- 46:473-639.
mony analysis using morphological data: A reap- Wilson, A. C., R. L. Cann, S. M. Carr, M. George, Jr., U.
praisal. Syst. Bot. (in press). B. Gyllensten, K. Helm-Bychowski, R. C. Higuchi,
Wiens, J. J. and T. A. Titus. 1992. A phylogenetic analy- S. R. Palumbi, E. M. Prager, R. D. Sage and M.
sis of Spea (Anura: Pelobatidae). Herpetologica Stoneking. 1985. Mitochondrial DNA and two
17:21-28. perspectives on evolutionary genetics. Biol. J.
Wiley, E. O. 1978. The evolutionary species concept Linnean Soc. 26:375-400.
reconsidered. Syst. Zool. 27:17-26. Wilson, A. C., H. Ochman and E. M. Prager. 1987a.
Wiley, E. O. 1982. Phylogenetics: The Theory and Practice Molecular time scale for evolution. Trends Genet.
of Phylogenetic Systematics. Wiley Interscience, 3:241-247.
New York. Wilson, A. C., M. Stoneking, R. L. Cann, E. M. Prager,
Wiley, E. O. 1988a. Vicariance biogeography. Annu. S. U. Ferris, L. A. Wrischnik and R. G. Higuchi.
Rev. Ecol. Syst. 19:513-542. 1987b. Mitochondrial clans and the age of our
Wiley, E. O. 1988b. Parsimony analysis and vicariance common mother, pp. 158-164. In F. Vogel and K.
biogeography. Syst. Zool. 37:271-290. Sperling (eds.), Human Genetics. Proceedings of the
Wilhelmi, R. W. 1942. The application of the precipitin Seventh International Congress, Berlin, 1986.
technique to theories concerning the origin of ver- Springer-Verlag, Berlin.
tebrates. Biol. Bull. 82:179-189.
Wilson, A. C., E. A. Zimmer, E. M. Prager and T. D. Wolfe, K. H., W.-H. Li and P. M. Sharp. 1987. Rates of
Kocher. 1989. Restriction mapping in the molecu- nucleotide substitutions vary greatly among plant
lar systematics of mammals: A retrospective mitochondrial, chloroplast, and nuclear DNAs.
salute, pp. 407-419. In B. Fernholm, K. Bremer Proc. Natl. Acad. Sci. USA 84:9054-9058.
and H. Jornvall (eds.), The Hierarchy of Life. Proc. Wolfe, K. H., W.-H. Li and P. M. Sharp. 1989a. Rates of
Nobel Symp. 70. Elsevier, Amsterdam. synonymous substitution in plant nuclear genes.
Wilson, E. O. 1985. Time to revive systematics. Science J. Mol. Evol. 29:208-211.
230:1227. Wolfe, K. H., M. Gouy, Y.-W. Yang, P. M. Sharp and W.-
Wilson, E. O. 1986. The value of systematics. Science H. Li. 1989b. Date of the monocot-dicot diver-
231:1057. gence estimated from chloroplast DNA sequence
Wilson, F. R., G. S. Whitt and C. L. Prosser. 1973. data. Proc. Natl. Acad. Sci. USA 86:6201-6205.
Lactate dehydrogenase and malate dehydroge- Wolfe, K. H., C. W. Morden and J. D. Palmer. 1992.
nase isozyme patterns in tissues of temperature Function and evolution of a minimal plastid
acclimated goldfish (Carassius auratus). Comp. genome from a nonphotosynthetic parasitic plant.
Biochem. Physiol. 46B:105-116. Proc. Natl. Acad. Sci. USA 89:10648-10652.
Wilson, G. N., M. Knoller, L. L. Szyura and R. D. Wolff, K., S. H. Rogstad and B. A. Schaal. 1994.
Schmickel. 1984. Individual and evolutionary Population and species variation of minisatellite
variation of primate ribosomal DNA transcription DNA in Plantago. Theor. Appl. Genet. 87:733-740.
initiation regions. Mol. Biol. Evol. 1:221-237. Wolstenholme, D. R. 1992. Animal mitochondrial
Wilson, V. G. and G. Schuller. 1992. PCR-SSCP screen- DNA: Structure and evolution. Int. Rev. Cytol.
ing of M13 plaques. Focus (BRL) 16:59-62. 141:173-216.
Wintero, A. K., M. Fredholm and P. D. Thomsen. 1992. Wolstenholme, D. R., Clary, D. O., MacFarlane, J. L.,
Variable (dG-dT)n·(dC-dA)n sequences in the Wahleithner, J. A. and L. Wilcox. 1985.
porcine genome. Genomics 12:281-288. Organization and evolution of invertebrate mito-
Wirz, T., U. Brandle, T. Soldati, J. P. Hossle and J.-C. chondrial genomes, pp. 61-69. In E. Quagliariello,
Perriard. 1990. A unique chicken B-creatine kinase E. C. Slater, F. Palmieri, C. Saccone and A. M.
gene gives rise to two B-creatine kinase isopro- Kroon (eds.), Achievements and Perspectives of
teins with distinct N-termini by alternative splic- Mitochondrial Research. Elsevier, Amsterdam.
ing. J. Biol. Chem. 265:11656-11666. Womack, J. E. 1983. Post-translational modification of
Woese, C. R. and G. J. Olsen. 1986. Archaebacterial enzymes: Processing genes, pp. 175-186. In M. C.
phylogeny: Perspectives on the urkingdoms. Syst. Rattazzi, J. G. Scandalios and G. S. Whitt (eds.),
Appl. Microbiol. 7:161-177. Isozymes: Current Topics in Biological and Medical
Woese, C. R., Maniloff, J. and Zablen, L. B. 1980. Research, Vol. 7. Molecular Structure and Regulation.
Phylogenetic analysis of the mycoplasmas. Proc. A. R. Liss, New York.
Natl. Acad. Sci. USA 77:494-498. Worthington Wilmer, J., C. Moritz, L. Hall and J. Toop.
Woese, C. R., R. Gupta, G. M. Hahn, W. Zillig and J. 1994. Extreme population structuring in the
Tu. 1984a. The phylogenetic relationships of three threatened Ghost Bat, Macroderma gigas: Evidence
sulfur-dependent Archaebacteria. Syst. Appl. from mitochondrial DNA. Proc. Roy. Soc. London
Microbiol. 5:97-105. B 257:193-198.
Woese, C. R., E. Stackebrandt, W. G. Weisburg, B. J. Wong, C., C. E. Dowling, R. K. Saiki, R. G. Higuchi, H.
Paster, M. T. Madigan, V. J. Fowler, C. M. Hahn, P. A. Ehrlich and H. H. Kazazian, Jr. 1987.
Blanz, R. Gupta, K. H. Nealson and G. E. Fox. Characterization of β-thalassaemia mutations
1984b. The phylogeny of purple bacteria: The alpha using direct genomic sequencing of amplified sin-
subdivision. Syst. Appl. Microbiol. 5:315-326. gle copy DNA. Nature 330:384-386.
Woese, C. R., W. G. Weisburg, B. J. Paster, C. M. Hahn, Woodruff, D. S. 1989. Genetic anomalies associated
R. S. Tanner, N. R. Krieg, H.-P. Koops, H. Harms with Cerion hybrid zones: The origin and mainte-
and E. Stackebrandt. 1984c. The phylogeny of nance of new electromorphic variants called
purple bacteria: The beta subdivision. Syst. Appl. hybrizymes. Biol. J. Linnean Soc. 36:281-294.
Microbiol. 5:327-336. Woodruff, R. C. and J. N. Thompson. 1980. Hybrid
Woese, C. R., W. G. Weisburg, C. M. Hahn, B. J. Paster, release of mutator activity and the genetic structure
L. B. Zablen, B. J. Lewis, T. J. Macke, W. Ludwig of natural populations. Evol. Biol. 12:129-162.
and E. Stackebrandt. 1985. The phylogeny of pur- Woodward, S. R., N. J. Weyand and M. Bunnell. 1994.
ple bacteria: The gamma subdivision. Syst. Appl. DNA sequences from Cretaceous Period bone
Microbiol. 6:25-33. fragments. Science 266:1229-1232.
Workman, P. L. and J. D. Niswander. 1970. Population Yang, Z. 1993. Maximum likelihood estimation of phy-
studies on southwestern Indian tribes. II. Local logeny from DNA sequences when substitution
genetic differentiation in the Papago. Am. J. rates differ over sites. Mol. Biol. Evol.
Human Genet. 22:24-29. 10:1396-1401.
Wothe, D. D., H. Charbonneau and B. M. Shapiro. Yang, Z. 1994a. Estimating the pattern of nucleotide
1990. The phosphocreatine shuttle of sea urchin substitution. J. Mol. Evol. 39:105-111.
sperm: Flagellar creatine kinase resulted from a Yang, Z. 1994b. Maximum likelihood phylogenetic
gene triplication. Proc. Natl. Acad. Sci. USA estimation from DNA sequences with variable
87:5203-5207. rates over sites: Approximate methods. J. Mol.
Wright, C. A. (ed.). 1974. Biochemical and Immunological Evol. 39:306-314.
Taxonomy of Animals. Academic Press, New York. Yang, Z. 1994c. Statistical properties of the maximum
Wright, C. A. (ed.). 1978. Biochemical and Immunological likelihood method of phylogenetic estimation and
Taxonomy of Animals. 2nd ed. Academic Press, comparison with distance matrix methods. Syst.
New York. Biol. 43:329-342.
Wright, D. A., C. M. Richards, J. S. Frost, A. M. Yang, Z. 1995. PAML, Phylogenetic Analysis by
Camozzi and B. J. Kunz. 1983. Genetic mapping Maximum Likelihood (PAML), version 1.1. Institute
in amphibians, pp. 287-311. In M. C. Rattazzi, J. of Molecular Genetics, Pennsylvania State
G. Scandalios and G. S. Whitt (eds.), Isozymes: University, University Park.
Current Topics in Biological and Medical Research, Yang, Z., N. Goldman and A. E. Friday. 1994.
Vol. 7. Molecular Structure and Regulation. A. R. Comparison of models for nucleotide substitution
Liss, New York. used in maximum likelihood phylogenetic esti-
Wright, J. W., C. Spolsky and W. M. Brown. 1983. The mation. Mol. Biol. Evol. 11:316-324.
origin of the parthenogenetic lizard Yonenaga-Yassuda, Y., S. Kasahara, T. M. Chu and M.
Cnemidophorus laredoensis inferred from mitochon- T. Rodrigues. 1988. High-resolution RBG-banding
drial DNA analysis. Herpetologica 39:410-416. pattern in the genus Tropidurus (Sauria,
Wright, S. 1943. Isolation by distance. Genetics Iguanidae). Cytogenet. Cell Genet. 48:68-71.
28:114-138. Young, A. and R. Blakesley. 1991. Sequencing plasmids
Wright, S. 1951. The genetical structure of populations. from single colonies with the dsDNA cycle
Ann. Eugen. 15:323-354. sequencing system. Focus (BRL) 13:137.
Wright, S. 1978. Evolution and the Genetics of Youvan, D. C. and J. E. Hearst. 1979. Reverse tran-
Populations. University of Chicago Press, Chicago. scriptase pauses at N2-methylguanine during in
Wrischnik, L. A., R. G. Higuchi, M. Stoneking, H. A. vitro transcription of Escherichia coli 16S ribosomal
Erlich, N. Arnheim and A. C. Wilson. 1987. RNA. Proc. Natl. Acad. Sci. USA 76:3571-3574.
Length mutations in human mitochondrial DNA: Yu, L.-X. and H. T. Nguyen. 1994. Genetic variation
Direct sequencing of enzymatically amplified detected with RAPD markers among upland and
DNA. Nucl. Acids Res. 15:529-542. lowland rice cultivars. Theor. Appl. Genet.
Wu, C.-I. 1991. Inferences of species phylogeny in rela- 87:688-692.
tion to segregation of ancient polymorphisms.
Genetics 127:429-435. Zevering, C. E., C. Moritz, A. Heideman and R. Sturm.
Wu, C.-I. and W.-H. Li. 1985. Evidence for higher rates 1991. Parallel origin of duplications and the for-
of nucleotide substitution in rodents than in man. mation of pseudogenes in mitochondrial DNA
Proc. Natl. Acad. Sci. USA 82:1741-1745. from parthenogenetic lizards (Heteronotia binoei:
Wu, C.-I. and N. Maeda. 1987. Inequality in mutation Gekkonidae). J. Mol. Evol. 33:431-441.
rates of the two strands of DNA. Nature Zhan, T. S., S. Pathak and J. C. Liang. 1984. Induction
327:169-170. of G-bands in the chromosomes of Melanoplus san-
Wulf, J. H. and R. G. Cutler. 1975. Altered protein guinipes (Orthoptera, Acrididae). Can. J. Genet.
hypothesis of mammalian aging processes: I. Cytol. 26:354-359.
Thermal stability of glucose-6-phosphate dehy- Zharkikh, A. 1994. Estimation of evolutionary dis-
drogenase in C57BL/6J mouse tissue. Exp. tances between nucleotide sequences. J. Mol.
Gerontol. 10:101-117. Evol. 39:325-329.
Zharkikh, A. and W.-H. Li. 1992a. Statistical properties
Yang, D., Y. Oyaizu, H. Oyaizu, G. J. Olsen and C. R. of bootstrap estimation of phylogenetic variability
Woese. 1985. Mitochondrial origins. Proc. Natl. from nucleotide sequences. I. Four taxa with a
Acad. Sci. USA 82:4443-4447. molecular clock. Mol. Biol. Evol. 9:1119-1147.
Zharkikh, A. and W.-H. Li. 1992b. Statistical properties Zimmerman, W. 1930. Die Phylogenie der Pflanzen. G.
of bootstrap estimation of phylogenetic variability Fischer, Jena, Germany.
from nucleotide sequences. II. Four taxa without a Zimmerman, W. 1931. Arbeitsweise der botanischen
molecular clock. J. Mol. Evol. 35:356-366. Phylogenetik und anderer
Zharkikh, A. and W.-H. Li. 1993. Inconsistency of the Gruppierungswissenschaften, pp. 941-1053. In E.
maximum-parsimony method. The case of five Abderhalden (ed.), Handbuch der biologischen
taxa with a molecular clock. Syst. Biol. 42:113-125. Arbeitsmethoden. Urban and Schwarzenberg,
Zharkikh, A. and W.-H. Li. 1995. Estimation of confi- Berlin.
dence in phylogeny: The full-and-partial boot- Zimmerman, W. 1934. Research on phylogeny of
strap technique. Mol. Phylogenet. Evol. 4:44-63. species and of single characters. Am. Nat.
Zhen, L. and R. T. Swank. 1993. A simple and high 68:381-384.
yield method for recovering DNA from agarose Zimmerman, W. 1943. Die Methoden der
gels. BioTechniques 14:894-898. Phylogenetik, pp. 20-56. In G. Heberer (ed.), Die
Zhu, D., B. G. M. Jamieson, A. Hugall and C. Moritz. Evolution der Organismen. G. Fischer, Jena,
1994. Sequence evolution and phylogenetic signal Germany.
in control region and cytochrome b sequences of Zorn, A. M. and P. A. Krieg. 1991. PCR analysis of
rainbowfishes (Melanotaeniidae). Mol. Biol. Evol. alternative splicing pathways: Identification of
11:672-683. artifacts generated by heteroduplex formation.
Zischler, H., M. Höss, O. Handt, A. von Haeseler, A. C. BioTechniques 11:181-183.
van der Kuyl, J. Goudsmit and S. Pääbo. 1995. Zouros, E., K. R. Freeman, A. O. Ball and G. H.
Detecting dinosaur DNA. Science 268:1192-1193. Pogson. 1992. Direct evidence for extensive pater-
Zimmer, E. A., S. L. Martin, S. M. Beverly, Y. W. Kan nal mitochondrial DNA inheritance in the marine
and A. C. Wilson. 1980. Rapid duplications and mussel Mytilus. Nature 359:412-414.
loss of genes coding for α chains of hemoglobin. Zuckerkandl, E. and L. Pauling. 1962. Molecular dis-
Proc. Natl. Acad. Sci. USA 77:2158-2162. ease, evolution and genic heterogeneity, pp.
Zimmer, E. A., C. J. Rivin and V. E. Walbot. 1981. A 189-225. In M. Kasha and B. Pullman (eds.),
DNA isolation procedure suitable for most higher Horizons in Biochemistry. Academic Press, New
plant species. Plant Mol. Biol. Newsl. 2:93-96. York.
Zimmer, E. A., E. R. Jupe and V. Walbot. 1988. Zuckerkandl, E. and L. Pauling. 1965. Evolutionary
Ribosomal gene structure, variation and inheri- divergence and convergence in proteins, pp.
tance in maize and its ancestors. Genetics 97-166. In V. Bryson and H. J. Vogel (eds.),
120:1125-1136. Evolving Genes and Proteins. Academic Press, New
Zimmer, E. A., R. K. Hamby, M. L. Arnold, D. A. York.
Leblanc and E. C. Theriot. 1989. Ribosomal RNA Zurawski, G. and M. T. Clegg. 1987. Evolution of high-
phylogenies and flowering plant evolution, pp. er-plant chloroplast DNA-encoded genes.
205-214. In B. Fernholm, K. Bremer and H. Implications for structure-function and phyloge-
Jornvall (eds.), The Hierarchy of Life. Proc. Nobel netic studies. Annu. Rev. Plant Physiol.
Symp. 70. Elsevier, Amsterdam. 38:391-418.
Page numbers in boldface type indicate formulas for stock solutions.

AAT (aspartate aminotransferase), 100
ABl,l-l, 510
Acetate-Tris-EDTA (ATE), 378
Acetone powder, 37-38
N-Acetyl-β-glucosaminidase (βGA), 97
Acid phosphatase (ACP), 97-98
ACOH (aconitate hydratase), 98
ACP (acid phosphatase), 97-98
Acrylamide solutions, 378
AC TC repeats, 271
Actin primer, 241-242
Activators, electrophoretic, 73
ADA (adenosine deaminase), 98
Adaptation, and allozyme variation, 58-59
Addition, stepwise, 482-483
Additive distances, 172, 447-448, 487-493
Additive tree methods, 448-452
  algorithmic, 487-493
  Fitch-Margoliash method, 448-451
  minimum evolution (ME) method, 451-452
  systematic error, 495
Additivity, tree, 447
Additivity assumption, 172
Adenosine deaminase (ADA), 98
Adenylate kinase (AK), 95-98
ADH (alcohol dehydrogenase), 99
Agarose gel electrophoresis (AGE), 55, 262. see also Isozyme electrophoresis
  protocol, 291-297
Agar overlay, 86-89, 97
AGE. see Agarose gel electrophoresis
AgNOR
  banding protocol, 157-158
  developer, 166
AC-TC repeats, 271
AK (adenylate kinase), 98, 100
Akaike information criterion, 440
Alanopine dehydrogenase (ALPDH), 99
ALAT (alanine aminotransferase), 99
Albumin
  freeze-thaw stability, 38
  stability in alcohol, 33
Alcohol dehydrogenase (ADH), 99
Alcohol tissue preservation, 33
Aldolase primer, 243-245
Algorithm(s)
  defined, 408
  exact, 478-482
  "greedy," 482
  vs. optimality criteria, 408-409, 415-416
  single-tree, 529
Algorithmic methods, 486-493
  additive trees, 487-493
  cluster analysis, 486-487
  distance Wagner, 493
  neighbor joining, 488-490
Alignment, 412
  gaps, 453
Alignment algorithms, 331
  global, 375
  local, 374
Alkaline electrophoresis buffers, 201
Alkaline phosphatase (ALP), 99, 139, 142, 265
Allele(s), 51, 413
  as characters, 413
  coalescence, 266
  cryptic, 64
  homology, 256-259
  locus nomenclature, 95
  null, 66, 255
  rare, 67
  segregation, detection limits, 61-66
  sorting, 9-10
  variants, LDH isozyme, 93
  variation, 23
Allele frequency(ies), 389-390, 413-414
  between-population heterogeneity, 65
  gene diversity, 386
  geographic variation, 56
  population structure, 54, 56
  recoding, 414
  space, 425
  variance, 20-21
Allopatric species/populations, 22-24
Allopolyploids, 62
Allozyme(s), 51
  characters, 59
  clock, 60, 538-539
  data, 413
  data, parsimony, 425-426
  electrophoresis, 3, 19
  synapomorphic, 65
Allozyme variation
  adaptive differences, 58-59
  divergence, 59
  historical events, 56
  spatial and temporal, 56
ALP (alkaline phosphatase), 99
ALPDH (alanopine dehydrogenase), 99
Aluminum foil tissue packaging, 30-31
Alu-repeats, preserved fragments, 39
Ambiguities, pairwise sequence comparison, 454-455
Amine-citrate
  morpholine, 117
  propanol, 117
Ammonium acetate (NH4Ac), 378
Ammonium persulfate (APS), 319
Amplification. see also Nuclear DNA amplifications; Polymerase chain reaction; specific types
  direct, 225
  reverse transcriptase, 336
AMPPD [disodium 3-(4-methoxyspiro-[1,2-dioxetane-3-2'-tricyclo(3.3.1.1^3,7)decan]-4-yl)phenyl], 265
Ancestral polymorphisms, 277
Ancient DNA, 228
Anesthesia, 33-34
Aneuploidy, 62
Animal mitochondrial DNA (mtDNA). see Mitochondrial DNA, animal
Animal tissue collection, 33-35
Anion, 53
Annealing
  extension, 227
  PCR cycle, 208-209
  temperature, PCR, 227-228
  thermal cycling, 227
Anode, 52
Anonymous single-copy RFLPs, 218
Anonymous single-copy sequence(s)
  amplification, 218
  population-level comparisons, 272-275
Anti-avidin antibody, biotinylated goat, 166
Antibody(ies). see also Polyclonal; specific types, e.g., Anti-avidin, Biotin, Monoclonal
  chromosome painting, 126
  cross-contamination, 148
  gold-conjugate, 123
  monoclonal, 123, 126
  polyclonal, 123, 126
  primary, 139-140
Antigens, chromosome painting, 126
Apomorphism, 277
APS (ammonium persulfate), 319
ARAB (α-L-arabinofuranosidase), 99
Arc distance measure, 463
Area cladogram, 60
ARK (arginine kinase), 100
ARK primer, 241
Asexual species
  complex, 23
  relationships, 520
Aspartate aminotransferase (AAT), 100
Assumptions, general, and systematic error, 494
Asymmetric reamplification, 325, 355-356
ATE (acetate-Tris-EDTA), 378
AT repeats, 271
Automated sequencers
  gel reading, 371-372
  types and use, 330
Autopolyploids, 62
Autoradiography, 123
  chromosomes, 123, 138, 162-163
  DNA fragments, 302
  DNA sequencing, 328-330, 369-374
  genomic libraries, 349
Avidin, 139. see also Anti-avidin antibody; Avidin-biotin; Fluorescein-avidin
  biochemical properties, 126-127
Avidin-biotin, 123, 136

Background staining, 93-94, 148
Bacterial colony amplification, 225
Bacteriophage lambda. see Lambda (λ) bacteriophage
Bacteriophage library screening protocol, 348-349
Banding. see Chromosome banding; Molecular cytogenetics; specific types
  problems, 372-373
  RFLP, substoichiometric, 310
BankIt, 375
Base compositional bias, 4
  factors affecting, 496
BCIP (5-bromo-4-chloro-3-indolyl phosphate), 86, 142
Behavior, 337
Beta tubulin primer, 245-246
Bias
  assessing effect, 496-497
  base compositional, 496
  codon and nucleotide, 212-213
  simulation, 527
Bifurcation, 410
Binary characters, 411
Biogeographic data, errors, 535
Biogeography, 337
  historical, 269
Biopsy protocol, 36
Biotin
  biochemical properties, 126-127
  conjugated nucleotides, 139
  probes, 265
  specific antibodies, 139
Biotin-avidin label, 123, 136
Biotin label, 126, 135-136, 142
Biotin-labeled probe hybridization buffer, 167
Biotin-streptavidin, 265
Biotinylated goat anti-avidin antibody, 166
Biotinylated nucleotide, 213-214
Biotinylated probes, 127, 136
Biotypes, unisexual, 61
Bisbenzimide, 259
Bivalents, 129-130
BLAST algorithm, 374
Blocking solution, fluorescein-avidin, 166
Blood collection, 33-35
Blotting, vacuum, 299
Blucose, 320
Blunt-end cloning, 357-358
BN (bicarbonate-nonidet) buffer, 166
Bone, PCR extraction, 223-224
Boolean queries, 374
Bootstrapping, 197, 392, 397-398
  nonparametric, 409
  parametric, 523-526
  parametric vs. nonparametric, 523-524
  random error, 507-509
  split decomposition, 492
  stochastic effects, 523
Bootstrap proportion, 523-524
Borate
  buffer, 379
  continuous, 117
  discontinuous, 118
Bottlenecks, 56, 201, 220
Bounces, high-stringency, 227
Bovine serum albumin (BSA), 38, 211
Branch-and-bound methods, 480-482
Branch attraction, long, 478
Branches
  interior (central), 410
  peripheral, 410
  reliability of individual, 506-509
  removing long, 499
Branching
  patterns, unrooted, 477
  sequence, 198
  timing, 198
Branch length(s), 439-440
  covariances, 474
  inferred, 472
  least squares, 450
  LogDet distance, 460
  methods for finding, 441-442
  model, unconstrained, 440
  negative, 450
  spectrum, estimation, 474
  spectrum [$Dl, 468
  substitutions, 440
Branch swapping
  nearest neighbor interchanges (NNIs), 484
  rearrangement algorithm, 408
  subtree pruning and regrafting, 484
  tree bisection and reconnection, 485
BrdU-banding, salamander embryo protocol, 158
BrdU label, 126, 135-136
Breeding structure, 56-57
Breeding studies, 53
Brent-Powell methods, 445
5-Bromo-4-chloro-3-indolyl phosphate (BCIP), 142
Brooks parsimony analysis (BPA), 59-60
Broth, L, 379
BSA (bovine serum albumin), 38, 211
Buffer(s)
  additions, in PCR, 225
  alkaline electrophoresis, 201
  BN, 166
  borate, 379
  chromomycin, 166
  CTAB (hexadecyltrimethylammonium bromide) extraction buffer, 379
  cycle sequencing, 379
  functions, 53
  hybridization, 137
  hybridization for biotin-labeled probes, 167
  incubation temperature/strength, 173
  ionic, 53
  isolation (cpDNA), 319
  isozyme electrophoresis, 116-120
  label, 320
  ligation, 380
  lysis (cpDNA), 319
  McIlvaine's, 167
  multiple, 93
  NaI binding, 380
  nick translation, 167, 202
  nick translation (transfer hybridization), 320
  32P end-label, 320
  phage dilution, 380
  phosphate, 185-186, 167, 202. see also Phosphate buffer
  phosphate hybridization, 174
  reverse transcriptase, 380
  S1 nuclease, 202
  screen, 92-93
  STES (sodium chloride-Tris-EDTA-sucrose), 283-284
  stop (nucleic acid sequencing), 380
  TAE (Tris-acetate-EDTA), 262, 247
  Taq polymerase, 246, 380
  TBE (Tris-borate-EDTA), 247, 262
  TE (Tris-EDTA), 247
  Tris-acetate, 203
  Tris-ethanol wash, 381
  wash (cpDNA), 319
Buffer systems, 69-70, 82-83
Buffer tray, 70-71
Buffer well, 69
Bulked segregate analysis, 276

CAGE (cellulose acetate gel electrophoresis), 55
CAIC, 510
Calcium-binding proteins (CBP), non-specific, 100
Camin-Sokal parsimony, 422
CAP (cytosol aminopeptidase), 101
Carbon dioxide, solid (dry ice), 36-37
Catalase (CAT), 100
Catalytic properties, and isozyme expression, 91
CAT (catalase), 100
Cathode, 52
Cation, 52-53

Character state(s), 410-411
  gap, 453
  isozyme data, 413-414
  optimal assignments, 415
  particulate data, 414
  probability of observing on a tree, 464-465
  reconstructions in Dollo parsimony, 420
  restriction endonuclease data, 412-413
  tree, 411

  karyotypes and idiograms, 165
  rearrangements, 143
  scoring, 165
  specialized procedures, 132-133
  timing, 133-134
  types, 132
Chromosome characters, 144
Chromosome in situ suppression hybridization (CISS), 139
Chromosome painting, 122-124, 142
  antigens and antibodies, 126