The Multispecies Coalescent Over-Splits Species in The Case of Geographically Widespread Taxa
© The Author(s) 2019. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
For Permissions, please email: [email protected]
CHAMBERS AND HILLIS
genetic data analyzed under the multispecies coalescent (MSC) model, and results from these
studies often are regarded as conclusive support for taxonomic changes. However, most MSC-
application of these genetic-based approaches (without due consideration of sampling design, the
population structure) can lead to over-splitting of species. Here, we argue that in many common
species complexes. We consider these points with respect to a historically controversial species
group, the American milksnakes (Lampropeltis triangulum complex), using genetic data from a
recent analysis (Ruane et al. 2014; Syst. Biol. 63:231-250). We show that over-reliance on the
program BPP, without adequate consideration of its assumptions and of sampling limitations,
conclude that the best available evidence supports three, rather than seven, species within this
incorporate thorough analyses of geographic variation and carefully examine putative contact
http://mc.manuscriptcentral.com/systbiol
MULTISPECIES COALESCENT AND SPECIES DELIMITATION
Systematists attempt to understand and organize the diversity of life using two
fundamental concepts: species and trees of relationships among species. Under this framework,
species are viewed as individual, independently evolving metapopulation lineages, within which
2007). Species lineages split and give rise to new independent lineages, forming phylogenetic
trees of species in the process. Within those trees, monophyletic groups of species, or clades,
The boundary between species and clades is not arbitrary, as life is clearly not organized
in a continuum. Instead, there are clear reproductive and genetic breaks that allow different
lineages to evolve on independent evolutionary pathways. Within sexual species, gene flow
typically maintains cohesion such that lineages evolve as units through time (Ghiselin 1974;
Templeton 1989). Ecological circumstances (selection for particular ecological roles) may also
play a role in maintaining species, even in the case of asexual organisms (Fontaneto et al. 2007;
Although the theoretical distinction between species and clades is clear, the origins of
new species are necessarily fuzzy, as are the beginnings of all ontological individuals (Ghiselin
1974; Frost and Hillis 1990; de Queiroz 1998). Species rarely split instantaneously into
descendant lineages, and different biologists may use different operational criteria to detect a
splitting event (de Queiroz 1998, 2007). Widespread, geographically variable, but continuously
distributed species and species complexes present a particularly difficult problem for
scales. In some cases, this variation can be clinal and essentially continuous, with gene flow
across the entire species range (e.g., Slatkin and Maddison 1990; Slatkin 1991). In other cases, a
species complex might consist of multiple geographically, genetically cohesive, parapatric taxa
with little or no gene flow between species where they come into contact (e.g., Hillis 1988).
Intermediate conditions are also possible, such that gene flow is restricted but not entirely
species delimitation (Ensatina salamanders provide a textbook example of such complexity and
Here, we explore the limitations of a commonly used approach for species delimitation
that relies on the multispecies coalescent model (hereafter, MSC-based methods). Despite the
known assumptions and limitations of these methods (Leaché and Fujita 2010; Olave et al. 2014;
Eberle et al. 2016; Luo et al. 2018; Barley et al. 2018), they are often used in isolation for species
delimitation and taxonomic change. We illustrate, using a case study, problems that may arise
from inadequate consideration of a priori group designations, limited sampling, and lack of
attention to contact zones in the context of one MSC-based species delimitation method.
The multispecies coalescent has become an important conceptual framework for inferring
relationships among species (species trees) from relationships among different genes (gene
trees), while taking into account incongruence among gene trees that results from incomplete
lineage sorting (Maddison 1997). Because gene trees are not always monophyletic within
species lineages, the multispecies coalescent was introduced as a way to detect recently divergent
lineages from collections of gene trees (Knowles and Carstens 2007). However, several
biological processes other than incomplete lineage sorting (including hybridization and
geographic structuring of populations) can also contribute to discordance among gene trees, and
the extent to which the multispecies coalescent is able to estimate a species tree depends in part
on how much discordance is limited to the process of incomplete lineage sorting within species
The multispecies coalescent has been implemented in several methods for species
delimitation (e.g., Yang and Rannala 2010; Ence and Carstens 2011; Camargo et al. 2012; Fujita
et al. 2012; Leaché et al. 2014), and some authors have argued that these methods present a more
objective approach for testing species hypotheses compared to traditional methods of species
delimitation (Leaché and Fujita 2010; Fujita et al. 2012). One commonly used method is
Bayesian Phylogenetics and Phylogeography (BPP; Yang and Rannala 2010), which we examine
here. Recently, as the limitations of BPP have been explored (Sukumaran and Knowles 2017;
Barley et al. 2018; Leaché et al. 2018), it has become evident that this method does not
necessarily delimit species boundaries, and may instead identify other kinds of genetic structure
within species.
Many MSC-based methods (including BPP) use clustering algorithms for initial
population-level assignment of individuals to groups, which are subsequently validated using the
MSC-based method (see Carstens et al. 2013 for a full review). The numbers of individuals and
loci sampled play a significant role in ensuring that programs such as Structure or Structurama
(Pritchard et al. 2000; Huelsenbeck et al. 2011) infer appropriate groups for testing (Rittmeyer
and Austin 2012; Olave et al. 2014; Hime et al. 2016). Limited geographic sampling can produce
the appearance of distinct genetic clusters, even when samples are drawn from continuous clines
or geographically structured populations (Hedin et al. 2015; Barley et al. 2018). Consider, for
example, two extreme alternatives. In one case, distinct species lineages have a narrow contact
zone with little to no gene flow or hybridization. In another case, a single species exhibits a
scenarios requires thorough sampling across the cline or contact zone. If sampling is limited and
genetic information is obtained only from geographically distant populations, clustering methods
may be incapable of distinguishing between these two scenarios (Irwin 2002; Schwartz and
McKelvey 2008; Rittmeyer and Austin 2012; Puechmaille 2016; Bradburd et al. 2018).
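The sampling problem described above can be illustrated with a minimal sketch. It rests on assumptions that are ours and not those of the cited studies: a single species whose allele frequency changes linearly along a one-dimensional transect, with a hypothetical transect length and hypothetical function names. The sketch compares the largest allele-frequency step between neighboring sampled localities under dense and sparse sampling of the same cline.

```python
def cline_frequency(x, length=1000.0):
    """Allele frequency at position x (km) on a linear cline running from 0 to 1."""
    return x / length


def largest_step(sample_sites, length=1000.0):
    """Largest allele-frequency difference between neighboring sampled sites."""
    sites = sorted(sample_sites)
    freqs = [cline_frequency(x, length) for x in sites]
    return max(b - a for a, b in zip(freqs, freqs[1:]))


# Dense sampling: one locality every 50 km across the whole transect.
dense = [50.0 * i for i in range(21)]
# Sparse sampling: localities only near the two ends of the same transect.
sparse = [0.0, 50.0, 100.0, 900.0, 950.0, 1000.0]

dense_step = largest_step(dense)
sparse_step = largest_step(sparse)
print(f"largest step, dense sampling:  {dense_step:.2f}")
print(f"largest step, sparse sampling: {sparse_step:.2f}")
```

Under these assumptions the underlying cline is identical in both cases; the apparent break in the sparse design is produced entirely by the unsampled middle of the transect.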
There has been extensive discussion of the limitations of MSC-based methods. Overall,
depending on taxonomic, geographic, and genetic sampling, BPP can yield variable results in
delimitation (Setiadi et al. 2011; Olave et al. 2014; Reid et al. 2014; Zhang et al. 2014; Hime et
al. 2016; Barley et al. 2018). Here, we extend this literature by providing a re-analysis of an
existing dataset from a published study (Ruane et al. 2014) to illustrate the impact of using
limited data on BPP’s ability to delimit species. In particular, we focus on the ramifications of
using nuclear genetic datasets with limited sampling and little phylogenetic signal, combined
species delimitation. We emphasize that our analysis is not necessarily a criticism of the MSC-
based method BPP itself, but rather its application to inappropriate datasets in species
delimitation studies.
Ruane et al. (2014) sought to clarify species boundaries and relationships in the
concluded that genetic evidence supported the recognition of seven species in what had
traditionally been considered a single species (Williams 1988). Based primarily on results from
BPP, Ruane et al. (2014) elevated seven groups in the L. triangulum complex to full species
Using the Ruane et al. (2014) dataset, we show that sparse geographic sampling,
combined with a conflicting signal from interspecific introgression of mitochondrial DNA, led to
the a priori clustering analyses and consider the information that can be inferred from such
analyses, and then examine the insights that can be gained from an examination of gene trees.
We then propose reasons that species splits were recognized despite the lack of supporting
evidence from the clustering analyses or evidence of any genetic or reproductive gaps between
species. Finally, we perform additional tests on two of the newly recognized species that
demonstrate the tendency of BPP to over-split species in the case of limited sampling across
A Priori Grouping
As discussed above, individuals are often assigned to groups (or putative species) before
input into MSC-based methods like BPP. This is usually accomplished using clustering methods
that report the relative support for each individual’s assignment into different clusters. Ruane et
al. (2014) assigned individuals to clusters in two different ways. First, they used the program
Structurama (Huelsenbeck et al. 2011), which searches for deviations from Hardy–Weinberg
equilibrium expectations across sampled gene loci, and then assigns individuals to genetic
groups that minimize these deviations. Ruane et al. (2014) also constructed a mitochondrial
DNA gene tree, from which they identified groups for subsequent population assignment.
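The kind of Hardy–Weinberg signal that such clustering relies on can be sketched with a simple calculation. This is not the algorithm implemented in Structurama; it is only an illustration, for a single biallelic locus and two equal-sized groups (both assumptions of ours), of why pooling genetically different groups produces a heterozygote deficit, whereas pooling genetically similar groups does not. All function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Hardy-Weinberg expected heterozygote frequency for a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def pooled_heterozygosity(p1, p2):
    """Heterozygote frequency observed when two equal-sized, randomly mating
    groups with allele frequencies p1 and p2 are pooled into one sample."""
    return 0.5 * (expected_heterozygosity(p1) + expected_heterozygosity(p2))


def heterozygote_deficit(p1, p2):
    """Hardy-Weinberg expectation for the pooled sample minus the observed value."""
    p_bar = 0.5 * (p1 + p2)
    return expected_heterozygosity(p_bar) - pooled_heterozygosity(p1, p2)


# Two groups with the same allele frequency: pooling causes no deviation.
same = heterozygote_deficit(0.5, 0.5)
# Two strongly differentiated groups: pooling causes a large deficit.
different = heterozygote_deficit(0.1, 0.9)
print(f"deficit, identical groups:      {same:.2f}")
print(f"deficit, differentiated groups: {different:.2f}")
```

A clustering method that seeks groups minimizing such deviations has no reason to separate samples in the first case, which is the situation relevant to groups that are assigned to a single cluster.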
Ruane et al. (2014) found that Structurama did not distinguish between their a priori
repeated the Ruane et al. (2014) Structurama analysis (see online Appendix 1, available on
cluster numbers depending on the run, indicating the data were not informative enough to
provide consistent and robust results across different runs (see online Appendix 1). However,
there were a few consistencies. We observed that Structurama almost always grouped Ruane et
al.’s (2014) nominal taxa polyzona, abnorma, and micropholis into a single cluster; that gentilis
and triangulum were always assigned to the same cluster; and that elapsoides and annulata were
generally shown as composites of multiple genetic clusters (Fig. S1). The fact that Structurama
consistently showed no division between gentilis and triangulum is especially noteworthy, as this
indicates that the samples of these putative taxa, collected thousands of kilometers apart from
equilibrium expectations (for the data examined) across the breadth of North America. Because
of this observation, we will be focusing on these two purported taxa for our subsequent analysis.
Ruane et al. (2014) collected data on 11 nuclear genes and one mitochondrial gene. Using
the same dataset reported by Ruane et al. (2014), we constructed gene trees for all 11 nuclear
genes (one representative nuclear gene is shown in Fig. 1, and the rest are shown in Fig. S2; see
online Appendix 2), as well as the single mitochondrial gene (Fig. S3). The overall amount of
divergence within the 11 nuclear genes was low (0.017–0.076 substitutions per site between
the most divergent samples of the L. triangulum complex), especially compared to the higher
divergence of the single mitochondrial gene (0.225 substitutions per site for samples within the
among all gene trees, and some gene trees would not be expected to be monophyletic within
species lineages. However, there is no evidence of any consistent nuclear genetic divergence, nor
evidence for any reproductive isolation, at the contact zones between some of the purported
species recognized by Ruane et al. (2014). For example, geographically closest individuals of
gentilis and triangulum are genetically indistinguishable at every nuclear locus (Fig. S2). The
lack of any genetic break at the contact zone of these purported species suggests that their
division is an arbitrary split in a population continuum, rather than a break between distinct
species.
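One way to make the idea of a genetic break concrete is to count fixed differences, that is, aligned sites at which two groups share no nucleotide state. The sketch below is a hedged illustration only: the function name and the toy alignments are hypothetical, they are not data from Ruane et al. (2014), and ambiguity codes and gaps are treated as ordinary characters for simplicity.

```python
def fixed_differences(group_a, group_b):
    """Return alignment columns at which the two groups share no character state.

    Each group is a list of equal-length aligned sequences (strings).
    """
    n_sites = len(group_a[0])
    diagnostic = []
    for site in range(n_sites):
        states_a = {seq[site] for seq in group_a}
        states_b = {seq[site] for seq in group_b}
        if states_a.isdisjoint(states_b):
            diagnostic.append(site)
    return diagnostic


# Hypothetical toy alignments for three groups of three individuals each.
west = ["ACGTAC", "ACGTAT", "ACGTAC"]
east = ["ACGTAT", "ACGTAC", "ACGAAC"]
other = ["TCGTAC", "TCGTAC", "TCGTAT"]

print(fixed_differences(west, east))   # polymorphisms are shared; nothing is fixed
print(fixed_differences(west, other))  # column 0 separates the two groups
```

In the toy data, the first pair of groups differs only by polymorphisms that occur in both groups, whereas the second pair is separated by a site that is fixed for different states.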
Closely related species are expected to retain some shared interspecific polymorphisms.
Indeed, humans and chimpanzees are known to share genetic polymorphisms that are thought to
have arisen in their common ancestor (e.g., Fan et al. 1989). Nonetheless, humans and
chimpanzees are also estimated to be diagnostically distinct across 4% of their genomes (Varki
and Altheide 2005). Interspecific differences between humans and chimpanzees (which total
approximately 125 million nucleotides) far exceed all intraspecific polymorphisms, and only a
small percentage of the latter are shared across these species (Varki and Altheide 2005). Georges
et al. (2018) emphasized the importance of such diagnostic differences as evidence for species
boundaries and lineage independence. In contrast, there are no diagnostic nucleotide differences
among the nuclear genes sampled by Ruane et al. (2014) between gentilis and triangulum. Given
that there is no evidence of deviations from Hardy–Weinberg equilibrium expectations (as shown
in the Structurama analyses), and no evidence of even a single nuclear gene that consistently
differs between these two purported species, then what is the basis for hypothesizing the
existence of these species lineages? Is there any reason to expect any biological differences
In contrast to the low levels of nuclear divergence discussed above, upon construction of
the mitochondrial gene tree, Ruane et al. (2014) found clear evidence for multiple captures of L.
alterna mitochondrial DNA within western North American populations of the L. triangulum
complex (i.e., the populations referred to as the forms gentilis and annulata; Fig. S3). These
western populations of L. triangulum have mitochondrial haplotypes that are deeply embedded
within those of L. alterna, which in turn has a mitochondrial genome that is more closely related
to species of the L. getula complex and L. extenuata than to the eastern North American
populations of L. triangulum (Fig. S3). These introgression events appear to have happened
several times and are still ongoing (note the nearly identical mitochondrial DNA haplotypes of L.
alterna and L. triangulum where the two co-exist in Val Verde County, Texas; Fig. S3). Indeed,
the only consistent genetic difference between gentilis and triangulum is that individuals
assigned to gentilis have introgressed mitochondrial DNA from L. alterna, whereas individuals
assigned to triangulum do not. No single nucleotide from any of the sampled nuclear genes
BPP Analysis
Ruane et al. (2014) first ran BPP using Structurama assignments as terminal lineages on
guide trees, resulting in high support for six lineages within the L. triangulum complex (recall
that gentilis and triangulum were initially treated as a single lineage by Ruane et al. [2014] based
on their Structurama assignments). We found the same result when we performed the same
analysis using unguided BPP (Yang and Rannala 2014; PP=99.2%; Table S1; online Appendix
3).
combined triangulum–gentilis lineage (the introgressed haplotypes from L. alterna; Fig. S3),
Ruane et al. (2014) next tested whether BPP would support a division between triangulum and
gentilis, despite their Structurama results. To conduct this test, they ran BPP with a guide tree
generated from their mitochondrial gene tree, assigning these two lineages to different groups.
BPP strongly supported this split as well, while still differentiating L. alterna, thus leading
Given that no nuclear genes (Figs. 1 and S2) show evidence of genetic differentiation
between gentilis and triangulum, and even Structurama fails to separate individuals of these taxa,
the only basis for distinguishing these taxa appears to be the introgressed mitochondrial DNA.
Similar cases of mitochondrial DNA capture have confounded species delimitation in other taxa
(e.g., polar bears versus brown bears: Miller et al. 2012; freshwater mussels: Chong et al. 2016),
and deep intraspecific polymorphisms of mitochondrial DNA have similarly affected species
delimitation in other studies (e.g., Folt et al. 2019). Without the introgressed mitochondrial
DNA, there would have been no basis for testing a split between gentilis and triangulum.
Therefore, we tested if the support from BPP for the division of gentilis and triangulum was
limited to the distribution of the introgressed L. alterna DNA (as examined by Ruane et al. 2014;
see Fig. 2a), or if it was simply a reflection of the geographic proximity of samples taken from
across a broad geographic distribution. In other words, does BPP support any east–west split of
the gentilis–triangulum populations at any point in the combined continental distribution of these
forms, or is the split tested by Ruane et al. (2014) at the break in introgressed mitochondrial
DNA distinctive?
Using the nuclear dataset from Ruane et al. (2014), we tested five east–west splits of the
3 in Fig. 2b), as well as two splits farther west (splits 1 and 2 in Fig. 2b), and two splits farther
east (splits 4 and 5 in Fig. 2b; online Appendix 3). If the support from BPP reported by Ruane et
al. (2014) for split 3 reflects a real split between species, and is not simply a reflection of genetic
would expect much stronger support from BPP for split 3 than for splits 1, 2, 4, or 5. In contrast,
if BPP is simply supporting any split that results in clustering of two groups of geographically
proximate samples from a broad distribution of a single species, we would expect to see support
for all five splits in Fig. 2b. We found the latter result: regardless of the geographic split between
populations, BPP indicated very high support (PP = 100% for splits 1–4, and PP > 96% for split
5; Table S1) for all five of the east–west splits of the gentilis–triangulum cline.
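The logic of this test can be sketched as follows. The sample names, longitudes, and cut points below are hypothetical placeholders (the actual split locations are those shown in Fig. 2b and described in online Appendix 3), and the sketch covers only the construction of alternative two-group assignments, not the delimitation analysis itself.

```python
def east_west_partitions(samples, cut_longitudes):
    """Map each cut longitude to a (west, east) pair of sample-name lists."""
    partitions = {}
    for cut in cut_longitudes:
        west = sorted(name for name, lon in samples.items() if lon < cut)
        east = sorted(name for name, lon in samples.items() if lon >= cut)
        if west and east:  # skip cuts that would leave one side empty
            partitions[cut] = (west, east)
    return partitions


# Hypothetical sample names and longitudes (degrees; more negative = farther west).
samples = {"s1": -112.0, "s2": -105.5, "s3": -99.0,
           "s4": -93.5, "s5": -86.0, "s6": -78.5}
# Five hypothetical cut points, ordered from west to east.
cuts = [-108.0, -102.0, -96.0, -90.0, -82.0]

partitions = east_west_partitions(samples, cuts)
for cut, (west, east) in partitions.items():
    print(cut, west, east)
```

Each resulting partition would then be supplied to the delimitation program as a separate two-group hypothesis, so that support for the different cut points can be compared directly.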
Our empirical results support the simulations of Barley et al. (2018), who demonstrated
that if samples are taken from separated geographic localities from a single species that exhibits
isolation by distance, BPP consistently supports the separated geographic clusters as distinct
species. That result is in contrast to the simulations of Zhang et al. (2011), who simulated a
stepping-stone model and found that only in cases of relatively high migration rates did BPP
falsely recover low support for a single species. As noted by Barley et al. (2018), the results from
theoretical studies depend largely on parameters used in the respective simulations. Our results
suggest that the simulations conducted by Barley et al. (2018) better match the empirical system
studied by Ruane et al. (2014) than do the simulations of Zhang et al. (2011). Note that even if
the split between gentilis and triangulum reported by Ruane et al. (2014) represented an actual
species split, BPP also supports all the other east–west geographic splits shown in Figure 2b.
We do not suggest that any of the alternative species splits in Fig. 2b represent “better”
(2014). Rather, our analysis merely demonstrates that BPP supports virtually any geographic
partition of samples in this potential continental cline as “species.” But clearly, splits 1–5 in Fig.
2b cannot all be true species splits, as they are mutually inconsistent. BPP
does not provide stronger support for the gentilis–triangulum split than it does for other east–
When splits are hypothesized within an otherwise continuous distribution, contact zone
analyses have traditionally been used to assess the degree of genetic isolation and gene flow
between the putative taxa (Barton and Hewitt 1985; Durand et al. 2009). Systematists need to
distinguish between widespread, clinal geographic variation within a species on one hand, and
distinct genetic and reproductive breaks between species on the other. This is especially
important when species are thought to be distributed parapatrically, such that the species contact
one another along narrow zones of potential gene flow. In such cases, the study of contact zones
can reveal if (a) hybridization between the putative species is absent or rare; (b) the contact zones
act as “genetic sinks” (thus restricting gene flow between the putative species); or (c) there is
broad gene flow and integration between the putative species at the contact zone. Case (a) is
uncontroversial, as it is consistent with virtually any concept of species (i.e., there is clear
evidence that the taxa are reproductively isolated, evolutionarily distinct, and independent
lineages). In recent decades, many biologists have argued that case (b), or evidence of a narrow
hybrid zone that acts as a “genetic sink” that strongly restricts gene flow between species, is also
consistent with the hypothesis of distinct species (e.g., Sage and Selander 1979; Hafner et al.
evolving independently from one another (as there are no reproductive or genetic breaks between
Examining the population genetic structure at contact zones can also determine selective
forces that may be playing roles in driving, or maintaining, divergence (Sobel and Streisfeld
2015; Bertrand et al. 2016). Many approaches, genetic and otherwise, have been developed for
examining contact zone interactions (e.g., Gompert and Buerkle 2010; Derryberry et al. 2014),
although many species delimitation studies may require additional sampling for such an analysis.
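As a rough illustration of what a contact zone analysis can distinguish, the sketch below uses a sigmoid (tanh) cline, one functional form commonly used to describe allele-frequency change across a transect. The parameter values and function names are hypothetical, and the sketch is not drawn from any of the studies cited above. In this parameterization, the cline width is the inverse of the maximum slope at the cline center.

```python
import math


def cline(x, center, width):
    """Expected allele frequency at position x (km) for a sigmoid (tanh) cline."""
    return 0.5 * (1.0 + math.tanh(2.0 * (x - center) / width))


def frequency_change(center, width, half_window):
    """Change in allele frequency across a window centered on the contact zone."""
    return (cline(center + half_window, center, width)
            - cline(center - half_window, center, width))


# Narrow cline (10 km wide): nearly complete turnover within 25 km of the center.
narrow = frequency_change(center=0.0, width=10.0, half_window=25.0)
# Broad cline (2000 km wide): almost no change across the same 50 km window.
broad = frequency_change(center=0.0, width=2000.0, half_window=25.0)
print(f"narrow cline: {narrow:.3f}")
print(f"broad cline:  {broad:.3f}")
```

Samples taken close to a purported contact zone are what allow these two situations to be told apart; samples taken only from distant localities would show large frequency differences in both cases.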
Ruane et al. (2014) stated that they followed the “general lineage species concept” of de
Queiroz (1998, 2007). In these two papers, de Queiroz argued that virtually all species concepts
treat species as “separately evolving metapopulation lineages” that simply use different lines of
evidence to assess the independence and isolation of lineages. In other words, virtually all
“species concepts” are conceptualizing the same entities—namely, the individual, independent,
evolving lineages of life, within which organisms typically mate and exchange genes (as we
Species delimitation is typically a two-step process (see Hillis 2019). Taxonomists first
group organisms into putative taxa using one of several criteria. These include (1) correlated
equilibrium (as conducted, for example, by the programs Structure and Structurama); and (3)
multivariate analyses that assess overall divergence, such as principal components analysis.
another. However, differences arise within species as well as between them, so a second step is
needed to assess if the observed differences are evidence of independently evolving lineages, or
if the observed variation simply represents geographic or population variation within species. If
the groups in question come into geographic contact, then taxonomists typically assess lineage
independence by looking for direct or indirect evidence of reproductive barriers between the
groups at contact zones. Indirect evidence may include sharp geographic breaks in suites of
morphological, genetic, and/or behavioral characters at the contact zones; direct evidence may
Multispecies coalescent-based approaches have also been used to assess the evolutionary
independence of lineages (as in Ruane et al. 2014), but as shown in Barley et al. (2018), BPP
does not appear to discriminate adequately between geographic clinal structure and species
boundaries. Although our results appear to be an empirical example of this scenario (Fig. 2b),
without adequate sampling at purported contact zones there is no way to distinguish these two
Despite the inability of BPP to distinguish between clinal variation and speciation, we
can use the data collected by Ruane et al. (2014) to ask if there is any evidence for sharp genetic
or reproductive breaks among the various groups that they examined. Ruane et al. (2014) also
presented an analysis to summarize their data, in the form of a SplitsTree analysis (Fig. 3; Huson
and Bryant 2006). This tree does not represent any single gene tree, but is instead a summary of
support and counter-support for various clusters of individuals examined by Ruane et al. (2014)
across all examined loci. Individuals that are nearly identical across all loci are located adjacent
to one another (separated by small branch lengths) on this tree; in contrast, individuals that differ
between each purported taxon, and ask if geographically adjacent individuals in different
purported taxa exhibit any evidence for the genetic or reproductive breaks that are expected from
separately evolving lineages. If there are none, then there is no reason to hypothesize a break
projected on the Ruane et al. (2014) SplitsTree analysis, with a depiction of the distribution of
the purported taxa. The one-species hypothesis of Williams (1988; shown in Fig. 3a) is refuted
by two lines of evidence: first, there are far larger genetic gaps among subgroups of his L.
triangulum than there are between those subgroups and other well differentiated, sympatric
species (i.e., L. alterna). Second, where these subgroups of Williams’ L. triangulum come into
contact, they are sympatric, and yet maintain large genetic gaps between individuals. Thus, we
agree with Ruane et al. (2014) in rejecting the one-species hypothesis of Williams (1988).
Figure 3b presents an alternative taxonomic hypothesis that addresses the problems noted
above, and divides L. triangulum of Williams (1988) into three distinct species: L. triangulum, L.
elapsoides, and L. polyzona. This hypothesis is almost identical to the arrangement proposed by
Blanchard (1921), although he noted that collections at the time were not sufficient in lower
Central America to firmly establish the relationship between the nominal forms L. polyzona and
L. micropholis, and he tentatively treated those two species as distinct as well, pending further
collection of intermediate populations. There are substantial, consistent genetic breaks across
multiple loci among all three of the species recognized in this hypothesis. In addition, where any
two of these three species come into geographic contact, there are areas of known sympatry,
of parts of Kentucky, Tennessee, Alabama, Georgia, North Carolina, and Virginia in the United
States (indeed, this region of sympatry was discussed by Williams 1988). The known area of
sympatry between L. triangulum and L. polyzona in northern Veracruz, Mexico is much smaller,
with the two species reported together from just a single locality (also reported by Williams
1988). Thus, all the genetic and geographic data appear to support the recognition of these three
species.
In contrast, the remaining taxa recognized by Ruane et al. (2014; Fig. 3c) exhibit no
known areas of sympatry or any evidence of sharp genetic breaks at or near purported contact
zones. Instead, individuals on either side of purported contact zones (other than the ones noted in
Fig. 3b) are genetically much more similar to one another than they are to other geographically
distant individuals in their own taxon. This is not consistent with the expectation for
might expect similarities in occasional genes through incomplete lineage sorting, but we would
still expect large genetic gaps across most loci in comparisons of individuals drawn from
different species. No such genetic gaps exist between geographically adjacent samples of
gentilis–triangulum (Figs. 2b, 3c, and S2). These findings are also largely consistent with the
Structurama results (Fig. S1), except that annulata does appear to show significant Hardy–
deviations are not surprising given the large geographic distance between the samples examined
by Ruane et al. (2014) of annulata versus gentilis–triangulum (for example, the distance between
the closest samples examined for nuclear genes of annulata and gentilis–triangulum is
differences and Hardy–Weinberg deviations between these groups are indicative of geographic
2005; Leaché et al. 2009; Padial et al. 2010; Schlick-Steiner et al. 2010; Fujita et al. 2012;
Derkarabetian and Hedin 2014; Huang and Knowles 2016; Renner 2016), researchers sometimes
use limited data and rely on results generated by a single analysis to delimit species.
at contact zones, is a critical part of testing species hypotheses (Zhang et al. 2011; Edwards and
Knowles 2014; Pante et al. 2015; Solís-Lemus et al. 2015). We recommend against taxonomic
changes on the basis of analyses of limited samples, and demonstrate that BPP analyses of
limited geographic samples can support many groupings that are inconsistent with species.
Figure 3b, we emphasize that the data presented by Ruane et al. (2014) are inadequate to fully
examine the species boundaries in this group. The existing data do appear to support the species
delimited in Figure 3b, but it is certainly possible that additional genetic and geographic
sampling will demonstrate the existence of additional species boundaries in this group. However,
we see no convincing evidence from the data presented by Ruane et al. (2014) to support the
Species delimitation is not a simple process. In well studied, widely distributed taxa,
variation (of genes, morphology, and behavior), reproductive isolation, and gene flow. New
made only after due consideration of all available data (e.g., Setiadi et al. 2011; Barley et al.
2013; Hedin et al. 2015; Pante et al. 2015; Pyron et al. 2016; Folt et al. 2019). Although a
conservative approach to taxonomic change can risk underestimating diversity (Padial et al.
2010), this is preferable to making poorly supported taxonomic changes with each new dataset
SUPPLEMENTARY MATERIAL
Supplementary material, including data and online appendices, is available from the
ACKNOWLEDGMENTS
We thank Sara Ruane and Frank Burbrink for guidance on the milksnake re-analysis. We
also thank Carole Baldwin, Anthony Barley, Peter Beerli, David Cannatella, Kevin de Queiroz,
Harry Greene, Tracy Heath, Aleta Quinn, Jordan Satler, Robert Thomson, and April Wright for
comments and conversation related to the manuscript. Finally, we would like to thank Richard
Glor, Adam Leaché, and 13 anonymous reviewers for helpful suggestions and critique that
REFERENCES
Barley A.J., Brown J.M., Thomson R.C. 2018. Impact of model violations on the inference of
Barley A.J., White J., Diesmos A.C., Brown R.M. 2013. The challenge of species delimitation at
Evolution 67:3556-3572.
Barton N.H., Hewitt G.M. 1985. Analysis of hybrid zones. Ann. Rev. Ecol. Syst. 16:113-148.
Bertrand J.A.M., Delahaie B., Bourgeois Y.X.C., Duval T., García-Jiménez R., Cornuault J.,
Pujol B., Thébaud C., Milá B. 2016. The role of selection and historical factors in driving
29:824-836.
Blanchard F.N. 1921. A revision of the king snakes: genus Lampropeltis. Bull. U.S. Natl. Mus.
114:1-260.
Bradburd G.S., Coop G.M., Ralph P.L. 2018. Inferring continuous and discrete population
Camargo A., Morando M., Avila L.J., Sites J.W., Jr. 2012. Species delimitation with ABC and
Carstens B.C., Pelletier T.A., Reid N.M., Satler J.D. 2013. How to fail at species delimitation.
Chong J.P., Harris J.L., Roe K.J. 2016. Incongruence between mtDNA and nuclear data in the
freshwater mussel genus Cyprogenia (Bivalvia: Unionidae) and its impact on species
Dayrat B. 2005. Towards integrative taxonomy. Biol. J. Linn. Soc. Lond. 85:407-415.
de Queiroz K. 1998. The general lineage concept of species, species criteria, and the process of
D.J., Berlocher S.H., editors. Endless Forms: Species and Speciation. Oxford: Oxford
de Queiroz K. 2007. Species concepts and species delimitation. Syst. Biol. 56:879-886.
Derkarabetian S., Hedin M. 2014. Integrative taxonomy and species delimitation in harvestmen:
Durand E., Jay F., Gaggiotti O.E., François O. 2009. Spatial inference of admixture proportions
Eberle J., Warnock R.C.M., Ahrens D. 2016. Bayesian species delimitation in Pleophylla chafers
(Coleoptera) – the importance of prior choice and morphology. BMC Evol. Biol. 16:94.
Edwards D.L., Knowles L.L. 2014. Species detection and individual assignment in species
delimitation: can integrative data increase efficacy? Proc. R. Soc. Lond. B Biol. Sci.
281:20132765.
Ence D.D., Carstens B.C. 2011. SpedeSTEM: a rapid and accurate method for species
Fan W.M., Kasahara M., Gutknecht J., Klein D., Mayer W.E., Jonker M., Klein J. 1989. Shared
26:107-121.
Folt B., Bauder J., Spear S., Stevenson D., Hoffman M., Oaks J.R., Wood P.L., Jr., Jenkins C.,
Fontaneto D., Herniou E.A., Boschetti C., Caprioli M., Melone G., Ricci C., Barraclough
Biol. 5:914-921.
Fontaneto D., Barraclough T.G. 2015. Do species exist in asexuals? Theory and evidence
Frost D.R., Hillis D.M. 1990. Species in concept and practice: herpetological applications.
Herpetologica 46:87-104.
Fujita M.K., Leaché A.D., Burbrink F.T., McGuire J.A., Moritz C. 2012. Coalescent-based
Georges A., Gruber B., Pauly G.B., White D., Adams M., Young M.J., Kilian A., Zhang X.,
Shaffer H.B., Unmack P.J. 2018. Genomewide SNP markers breathe new life into
Ghiselin M.T. 1974. A radical solution to the species problem. Syst. Zool. 23:536-544.
Hafner J.C., Hafner D.J., Patton J.L., Smith M.F. 1983. Contact zones and the genetics of
differentiation in the pocket gopher Thomomys bottae (Rodentia: Geomyidae). Syst. Zool.
32:1-20.
Hedin M., Carlson D., Coyle F. 2015. Sky island diversification meets the multispecies
Ecol. 24:3467-3484.
Hillis D.M. 1988. Systematics of the Rana pipiens complex: puzzle and paradigm. Annu.
Hillis D.M. 2007. Asexual evolution: can species exist without sex? Curr. Biol. 17:R543-544.
Hime P.M., Hotaling S., Grewelle R.E., O’Neill E.M., Voss S.R., Shaffer H.B., Weisrock D.W.
2016. The influence of locus number and information content on species delimitation: an
delimitation from integrating multiple data types within a single Bayesian approach in
Huelsenbeck J.P., Andolfatto P., Huelsenbeck E.T. 2011. Structurama: Bayesian inference of
Huson D.H., Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Mol.
Irwin D.E. 2002. Phylogeographic breaks without geographic barriers to gene flow.
Evolution 56:2383-2394.
Knowles L.L., Carstens B.C. 2007. Delimiting species without monophyletic gene trees. Syst.
Biol. 56:887-895.
Leaché A.D., Fujita M.K. 2010. Bayesian species delimitation in West African forest geckos
Leaché A.D., Koo M.S., Spencer C.L., Papenfuss T.J., Fisher R.N., McGuire J.A. 2009.
coast horned lizard species complex (Phrynosoma). Proc. Natl. Acad. Sci. U.S.A.
Leaché A.D., Zhu T., Rannala B., Yang Z. 2018. The spectre of too many species. Syst. Biol.
68:168-181.
Luo A., Ling C., Ho S.Y.W., Zhu C. 2018. Comparison of methods for molecular species
Maddison W.P. 1997. Gene trees in species trees. Syst. Biol. 46:523-536.
Mayden R.L. 1997. A hierarchy of species concepts: the denouement in the saga of the
species problem. In: Claridge M.F., Dawah H.A., Wilson M.R., editors. Species:
Miller W., Schuster S.C., Welch A.J., Ratan A., Bedoya-Reina O.C., Zhao F., Kim H.L.,
Burhans R.C., Drautz D.I., Wittekindt N.E., Tomsho L.P., Ibarra-Laclette E., Herrera-
Estrella L., Peacock E., Farley S., Sage G.K., Rode K., Obbard M., Montiel R.,
Bachmann L., Ingólfsson Ó., Aars J., Mailund T., Wiig Ø., Talbot S.L., Lindqvist C.
2012. Polar and brown bear genomes reveal ancient admixture and demographic
footprints of past climate change. Proc. Natl. Acad. Sci. U.S.A. 109:E2382-E2390.
Olave M., Sola E., Knowles L.L. 2014. Upstream analyses create problems with DNA-based
Pante E., Puillandre N., Viricel A., Arnaud-Haond S., Aurelle D., Castelin M., Chenuil A.,
Destombe C., Forcioli D., Valero M., Viard F., Samad S. 2015. Species are hypotheses:
Pritchard J.K., Stephens M., Donnelly P. 2000. Inference of population structure using
Puechmaille S.J. 2016. The program STRUCTURE does not reliably recover the correct
population structure when sampling is uneven: subsampling and new estimators alleviate
Pyron R.A., Hsieh F.W., Lemmon A.R., Lemmon E.M., Hendry C.R. 2016. Integrating
brown and red-bellied snakes (Storeria). Zool. J. Linn. Soc. Lond. 177:937-949.
Reid N.M., Brown J.M., Satler J.D. et al. 2014. Poor fit to the multi-species coalescent model is
Rittmeyer E.N., Austin C.C. 2012. The effects of sampling on delimiting species from multi-
Ruane S., Bryson R.W., Jr., Pyron R.A., Burbrink F.T. 2014. Coalescent species
Sage R.D., Selander R.K. 1979. Hybridization between species of the Rana pipiens complex in
Schlick-Steiner B.C., Steiner F.M., Seifert B., Stauffer C., Christian E., Crozier R.H. 2010.
Entomol. 55:421-438.
Schwartz M., McKelvey K. 2008. Why sampling scheme matters: the effect of sampling scheme
Setiadi M.I., McGuire J.A., Brown R.M., Zubairi M., Iskander D.T., Andayani N., Supriatna J.,
Evans B.J. 2011. Adaptive radiation and ecological opportunity in Sulawesi and
Slatkin M., Maddison W.P. 1990. Detecting isolation by distance using phylogenies of genes.
Genetics 126:249-260.
Sobel J.M., Streisfeld M.A. 2015. Strong premating reproductive isolation drives incipient
Solís-Lemus C., Knowles L.L., Ané C. 2015. Bayesian species delimitation combining multiple
Sukumaran J., Knowles L.L. 2017. Multispecies coalescent delimits structure, not
approach. In: Otte D., Endler J.A., editors. Speciation and its Consequences.
Varki A., Altheide T.K. 2005. Comparing the human and chimpanzee genomes: Searching for
Wiley E.O. 1978. The evolutionary species concept reconsidered. Syst. Zool. 27:17-26.
Williams K.L. 1988. Systematics and natural history of the American milk snake,
Yang Z., Rannala B. 2010. Bayesian species delimitation using multilocus sequence data. Proc.
Yang Z., Rannala B. 2014. Unguided species delimitation using DNA sequence data from
Yanchukov A., Hofman S., Szymura J.M., Mezhzherin S.M., Morozov-Leonov S.Y., Barton
Zhang C., Rannala B., Yang Z. 2014. Bayesian species delimitation can be robust to guide-tree
FIGURE CAPTIONS
Figure 1. Majority-rule consensus gene tree constructed with nuclear gene 2CL8, used as a
tree, it is clear that Central and South American milksnake lineages (polyzona, abnorma, and
recovered as a monophyletic lineage, while remaining U.S. lineages (triangulum, gentilis, and
annulata) are rarely resolved and exhibit no diagnostic differences. Remaining gene trees are
Figure 2. Results and group assignment from five runs of unguided BPP. a) Points represent
samples with nuclear gene data from Ruane et al. (2014), with the same data used here, and
ranges shaded according to the Ruane et al. (2014) final population assignment. b) Population
assignment between each of five runs of BPP (see online Appendix 1 for details).
complex) with SplitsTree networks (Huson and Bryant 2006) and ranges colored based on the
proposed species given in each hypothesis (adapted from Ruane et al. 2014). The Lampropeltis
alterna lineage shown in grey (not indicated on range map) is included because of its relevance
polytypic species across their entire range (Williams 1988); b) Hypothesis 2: three species, L.
Downloaded from https://academic.oup.com/sysbio/advance-article-abstract/doi/10.1093/sysbio/syz042/5513370 by Buffalo State user on 20 July 2019