Ajb 1100570
• Premise of study: To reliably identify lineages below the species level such as subspecies or varieties, we propose an
extension to DNA-barcoding using next-generation sequencing to produce whole organellar genomes and substantial nuclear
ribosomal sequence. Because this method uses much longer versions of the traditional DNA-barcoding loci in the plastid and
ribosomal DNA, we call our approach ultra-barcoding (UBC).
• Methods: We used high-throughput next-generation sequencing to scan the genome and generate reliable sequence of high-copy-number
regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences
for nine genotypes of Theobroma cacao and an individual of the related species T. grandiflorum, as
well as an additional publicly available whole plastid genome of T. cacao.
• Key results: All individuals of T. cacao examined were uniquely distinguished, and evidence of reticulation and gene flow
was observed. Sequence variation was observed in some of the canonical barcoding regions between species, but other
regions of the chloroplast were more variable both within species and between species, as were ribosomal spacers.
Furthermore, no single region provides the level of data available using the complete plastid genome and rDNA.
• Conclusions: Our data demonstrate that UBC is a viable, increasingly cost-effective approach for reliably distinguishing
varieties and even individual genotypes of T. cacao. This approach shows great promise for applications where very closely
related or interbreeding taxa must be distinguished.
Key words: cacao; chloroplast; DNA barcoding; evolution; genomics; Gossypium; Malvaceae; plastid; Theobroma; ultra-
barcoding.
February 2012] KANE ET AL.—ULTRA-BARCODING IN CACAO 321
EET-64 (T. cacao) | USDA (TARS), Puerto Rico | Hybrid between Upper Amazon Forastero (Nacional from Ecuador) and Trinitario (from Venezuela) | PI 275669
Criollo-22 | USDA (SPCL), Beltsville, MD | Pure Criollo variety | Criollo-22
Stahel (T. cacao) | USDA (TARS), Puerto Rico | Trinitario with similarities to lower Amazon Forastero | MIA 27956
Pentagonum (T. cacao) | USDA (TARS), Puerto Rico | Trinitario (Criollo-type) | TARS 12044
Scavina-6 (T. cacao) | USDA (SPCL), Beltsville, MD | Upper Amazon Forastero, Peru | MIA 29885
Amelonado (T. cacao) | USDA (TARS), Puerto Rico | Lower Upper Amazon Forastero | TARS 16542
ICS39 (T. cacao) | USDA (TARS), Puerto Rico | Trinitario | TARS 16664
ICS06 (T. cacao) | USDA (TARS), Puerto Rico | Trinitario | TARS 16658
ICS01 (T. cacao) | USDA (TARS), Puerto Rico | Trinitario | TARS 16656
T. grandiflorum (Cupuaçu) | USDA (TARS), Puerto Rico | Species related to T. cacao. Wild and cultivated in Amazon Basin | 04-0254
Notes: MD, Maryland, USA; TARS, Tropical Agriculture Research Station; SPCL, Sustainable Perennial Crops Laboratory; USDA, U.S. Department of Agriculture.
which was the most closely related published full chloroplast genome at the time of the analysis. The most complete chloroplast assemblies resulted from SOAPdenovo, but both assemblies were aligned to the published chloroplast genome of G. hirsutum (Lee et al., 2006). Adjacent contigs were merged when overlap was greater than 15 bp, and shorter overlaps and small gaps were filled using custom Perl scripts with trimmed Illumina read data as in Dempewolf et al. (2010). The final assembly was quality-checked and error-corrected by aligning the quality-trimmed Illumina reads using the program MOSAIK (see below) for each genotype. Similarly, to generate a reference rDNA, assemblies were compared by BLAST against the existing rDNA sequence for T. cacao (GI:7595560, GI:27447189, GI:133854413) and related species (GI:19919576–19919578; Soltis et al., 2003). Adjacent contigs were again merged when overlap was greater than 15 bp, and gap filling and extension were performed where possible using the unassembled Illumina reads.
The full-length chloroplast sequence was annotated using DOGMA (Dual Organellar GenoMe Annotator; Wyman et al., 2004), with additional information about splice sites and open reading frames provided from comparisons with another completely sequenced T. cacao plastid genome (Jansen et al., 2011) as well as the fully annotated Gossypium chloroplast genomes (Ibrahim et al., 2006; Lee et al., 2006). The completed annotation was illustrated using OGDraw (Organellar Genome Draw; Lohse et al., 2007).
SNP genotyping for each sample—Trimmed, cleaned paired-end reads were mapped to the reference T. cacao chloroplast and rDNA sequence using MOSAIK (Hillier et al., 2008), with a hash size of 12, 12 mismatches allowed, and an ACT score of 35. Sorted alignment files were converted to BAM format, and single nucleotide polymorphisms (SNPs) were called for each sample against the reference using the program SAMtools (Li et al., 2009), with a minimum quality of 20 for SNPs and 50 for insertions and deletions. Additionally, alignments were manually checked and confirmed with the raw reads for all significant differences from the reference.
Sequence accession numbers—All of our WGS Illumina/Solexa reads are available from NCBI’s Short Read Archive (accession SRA048198.1), with our annotated whole plastid genome assemblies also uploaded (GenBank accession HQ244500 for the de novo assembly, and GenBank accessions JQ228379–JQ228389 for the reference-guided assemblies). The rDNA sequences are GenBank accessions JQ228369–JQ228378.
Phylogenetic analysis—An alignment of the rDNA and the plastid sequences (our 10 samples as well as an additional sample of T. cacao, GI 328924764), which included all SNPs observed in the data set, was made using the program MUSCLE (Edgar, 2004) and used for phylogenetic analyses. A phylogeny for the plastid data was inferred under maximum likelihood (ML; Felsenstein, 1973) using the program Garli 2.0 (Zwickl, 2006). The TIM base substitution model was chosen for the ML analysis, based on model testing with the program jModelTest 0.1.1 (Guindon and Gascuel, 2003; Posada, 2008). One thousand bootstrap replicates were performed to obtain support values for the phylogeny. A majority-rule consensus tree was generated, then drawn and rooted using the program FigTree v1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/). The rDNA data set was analyzed with a network-based approach, as rDNA undergoes recombination and is therefore likely to violate the assumption of evolving in a bifurcating manner. The program SplitsTree4 (Huson and Bryant, 2006) was used to visualize the relationships among the T. cacao varieties and to T. grandiflorum, which were displayed as a phylogenetic network. Support values at each node in the network were estimated by running 1000 bootstrap replicates.
Comparisons with Gossypium—Overall structural similarities between the full-length plastid genomes of Theobroma and Gossypium (Lee et al., 2006; Ibrahim et al., 2006) were examined by aligning the two genomes with the blastz algorithm (Schwartz et al., 2003) and illustrated using the program zPicture (Ovcharenko et al., 2004).
RESULTS
Illumina sequencing—We obtained between 1.7- and 4.6-Gbp sequence per sample after removing low-quality reads, or 4.2–11× coverage of the nuclear genome per sample based on an estimated genome size of 416 Mbp (Figueira et al., 1992). This was more than adequate coverage to assemble both plastid and rDNA: 187–1540× average coverage of the plastid and 784–3946× average coverage of the rDNA (Table 2).
TABLE 2. Illumina sequence summary statistics and observed average coverage of the nuclear and chloroplast genome for Theobroma based on Burrows-Wheeler Aligner (BWA) alignments (see text).
Name | Read length (bp) | No. of pairs of reads | No. of pairs after filtering | Total bp | Nuclear genome coverage | Chloroplast genome coverage | rDNA coverage
EET-64 (T. cacao) | 60 | 3.3E+07 | 3.2E+07 | 3.9E+09 | 9.36685 | 871 | 2669.9
Criollo-22 (T. cacao) | 60 | 2.3E+07 | 2.1E+07 | 2.5E+09 | 6.08258 | 850.1 | 900.1
Stahel (T. cacao) | 60 | 3.0E+07 | 2.8E+07 | 3.4E+09 | 8.17719 | 1229.1 | 2377.6
Pentagonum (T. cacao) | 80 | 2.9E+07 | 2.6E+07 | 4.1E+09 | 9.87637 | 1539.8 | 4437.4
Scavina-6 (T. cacao) | 60 | 1.9E+07 | 1.4E+07 | 1.7E+09 | 4.18122 | 186.9 | 783.9
Amelonado (T. cacao) | 60 | 3.5E+07 | 3.4E+07 | 4.1E+09 | 9.89147 | 1291.4 | 3945.4
ICS39 (T. cacao) | 80 | 3.2E+07 | 2.8E+07 | 4.5E+09 | 10.7130 | 1117.6 | 2735
ICS06 (T. cacao) | 80 | 3.2E+07 | 2.8E+07 | 4.6E+09 | 10.9555 | 1324.7 | 3269.8
ICS01 (T. cacao) | 60 | 2.5E+07 | 2.4E+07 | 2.9E+09 | 7.01850 | 944.6 | 3043.6
T. grandiflorum (Cupuaçu) | 60 | 3.6E+07 | 3.4E+07 | 4.1E+09 | 9.86942 | 772.5 | 2662.2
De novo assembly of chloroplast and rDNA—Our de novo chloroplast assembly covered the entire 160 546-bp circular plastid genome (NC_014676, Fig. 1). Sanger sequence of 2162 bp of the plastid genome confirmed the assembly in nine regions (GI: 338190271–338190279), and also confirmed the length of repeats in microsatellites (Yang et al., 2011). The de novo rDNA assembly spanned the entire expressed portion including ETS, 18S, ITS1, 5.8S, ITS2, and 28S (Fig. 2), for a total of 5826 bp.
SNP genotyping for each sample—Numerous SNPs were found both within and between Theobroma species (Figs. 1, 2; Tables 3, 4), with far more variation between than within species for most regions. Three SNPs were confirmed in 95 individuals using Sanger sequence. Each of the examined canonical barcoding regions showed substantial variation between T. cacao and T. grandiflorum (Table 3), but none had enough variation to distinguish any of the major varieties of T. cacao, with only rbcL showing any variation and that only at one nucleotide. The most variable 500-bp region of the chloroplast within T. cacao was in the 3′ end of the ccsA gene, with four SNPs within T. cacao and two SNPs between T. cacao and T. grandiflorum. When using the entire plastid genome, however, we observed orders of magnitude more variation: 78 SNPs segregating within T. cacao and 415 SNPs observed between species. Indeed, there were multiple high-quality SNPs uniquely distinguishing each sample sequenced with the exception of EET-64 and Criollo-22, which were identical. The rDNA showed similarly high variation, particularly within ITS1, ITS2 and the ETS (Table 4). Again, however, the variation when including the entire rDNA sequence was far greater than any one region, enabling unique identification of each individual sequenced, this time including EET-64 and Criollo-22, which differed at several sites.
Phylogenetic analysis—The phylogenetic analyses revealed significant divergence between the major clades of T. cacao. The ML tree inferred by Garli (Fig. 3) showed two strongly supported clades within T. cacao corresponding to the origin of the Forastero and Criollo varieties. Furthermore, the tree shows that maternal lineages of Trinitario samples come from both
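As a rough check on the nuclear coverage figures, average depth can be approximated as total filtered bases divided by genome size. The Python sketch below applies that arithmetic to three Total bp values from Table 2 and the 416-Mbp genome size estimate of Figueira et al. (1992). The function name is hypothetical and this is not the authors' pipeline. Because Table 2 reports Total bp to two significant figures, the results only approximate the tabulated coverage.

```python
# Estimated T. cacao genome size in bp (416 Mbp; Figueira et al., 1992).
GENOME_SIZE_BP = 416e6

# Total filtered bp for three samples, as rounded in Table 2.
TOTAL_BP = {
    "EET-64": 3.9e9,
    "Scavina-6": 1.7e9,
    "ICS06": 4.6e9,
}


def nuclear_coverage(total_bp, genome_size_bp=GENOME_SIZE_BP):
    """Approximate average depth: sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bp / genome_size_bp


if __name__ == "__main__":
    for name, total_bp in TOTAL_BP.items():
        print(f"{name}: about {nuclear_coverage(total_bp):.1f}x")
```

The same division, applied to the unrounded base counts, would presumably reproduce the coverage column exactly; the rounded inputs used here land within roughly 0.1× of it.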
Fig. 1. Map of the annotated circular chloroplast for Theobroma cacao (outer circle). Inner circle: SNPs segregating within T. cacao. Middle circle:
SNPs fixed between T. cacao and T. grandiflorum.
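The two SNP classes mapped in Fig. 1 can be illustrated with a small Python sketch. It assumes the sequences are already aligned to equal length, treats a site as segregating when the T. cacao sequences disagree with one another, and treats it as fixed when they all agree but differ from T. grandiflorum. The function name and the toy sequences are hypothetical; the SNP calls reported in this study came from the SAMtools-based procedure described in the Methods, not from this code.

```python
VALID_BASES = set("ACGT")


def classify_snps(ingroup, outgroup):
    """Return (segregating, fixed) lists of 1-based alignment positions.

    ingroup  -- aligned sequences from one species (e.g. T. cacao samples)
    outgroup -- one aligned sequence from the other species
    Columns containing gaps or ambiguity codes are skipped.
    """
    if not ingroup:
        raise ValueError("ingroup must contain at least one sequence")
    if {len(seq) for seq in ingroup} != {len(outgroup)}:
        raise ValueError("all sequences must have the same aligned length")

    segregating, fixed = [], []
    for index in range(len(outgroup)):
        in_bases = {seq[index].upper() for seq in ingroup}
        out_base = outgroup[index].upper()
        if not in_bases <= VALID_BASES or out_base not in VALID_BASES:
            continue  # gap or ambiguous base: not scored
        if len(in_bases) > 1:
            segregating.append(index + 1)  # varies within the ingroup
        elif out_base not in in_bases:
            fixed.append(index + 1)  # ingroup uniform, outgroup differs
    return segregating, fixed


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    cacao = ["ACGTAC", "ACGTTC", "ACGTAC"]
    grandiflorum = "ACCTAC"
    print(classify_snps(cacao, grandiflorum))
```

In this sketch a site that varies within the ingroup is always counted as segregating, whatever the outgroup carries, so the two lists never overlap.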
Fig. 2. Schematic diagram of the Theobroma cacao ribosomal DNA, including ETS, 18S, ITS1, 5.8S, ITS2, and 28S, with SNPs indicated by
diamonds.
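Tallying SNPs by rDNA region, as in the schematic of Fig. 2, amounts to an interval lookup. The Python sketch below assumes sorted, non-overlapping regions with 1-based inclusive coordinates. The coordinates describe a toy 120-bp repeat and are placeholders, not the boundaries of the 5826-bp T. cacao assembly; the function name and the SNP positions are likewise hypothetical.

```python
from bisect import bisect_right

# Placeholder regions on a toy 120-bp repeat: (name, start, end),
# 1-based inclusive, sorted by start and non-overlapping.
TOY_REGIONS = [
    ("ETS", 1, 20),
    ("18S", 21, 50),
    ("ITS1", 51, 60),
    ("5.8S", 61, 70),
    ("ITS2", 71, 80),
    ("28S", 81, 120),
]


def count_snps_by_region(snp_positions, regions=TOY_REGIONS):
    """Count SNP positions falling in each named region."""
    starts = [start for _, start, _ in regions]
    counts = {name: 0 for name, _, _ in regions}
    for pos in snp_positions:
        # Index of the last region whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i < 0:
            raise ValueError(f"position {pos} precedes the first region")
        name, start, end = regions[i]
        if not start <= pos <= end:
            raise ValueError(f"position {pos} lies outside every region")
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Invented SNP positions, for illustration only.
    print(count_snps_by_region([5, 12, 55, 58, 75, 100]))
```

Binary search keeps the lookup cheap for long SNP lists, although for six regions a simple linear scan would serve equally well.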
LITERATURE CITED
AMBROSE, C. D., AND T. J. CREASE. 2011. Evolution of the nuclear ribosomal DNA intergenic spacer in four species of the Daphnia pulex complex. BMC Genetics 12: 13.
ARGOUT, X., J. SALSE, J. M. AURY, M. J. GUILTINAN, G. DROC, J. GOUZY, ET AL. 2010. The genome of Theobroma cacao. Nature Genetics 43: 101–108.
ARNHEIM, N., M. KRYSTAL, R. SCHMICKEL, G. WILSON, O. RYDER, AND E. ZIMMER. 1980. Molecular evidence for genetic exchanges among ribosomal genes on non-homologous chromosomes in man and apes. Proceedings of the National Academy of Sciences, USA 77: 7323–7327.
BARDAKCI, F., AND D. O. F. SKIBINSKI. 1994. Application of the RAPD technique in tilapia fish: Species and subspecies identification. Heredity 73: 117–123.
BARTLEY, B. G. D. 2005. The genetic diversity of cacao and its utilization. CABI Publishing, Wallingford, UK.
BELLEMAIN, E., T. CARLSEN, C. BROCHMANN, E. COISSAC, P. TABERLET, AND H. KAUSERUD. 2010. ITS as an environmental DNA barcode for fungi: An in silico approach reveals potential PCR biases. BMC Microbiology 10: 189.
BIRKY, W. C. 2001. The inheritance of genes in mitochondria and chloroplasts: Laws, mechanisms and models. Annual Review of Genetics 35: 125–148.
BLEEKER, W., S. KLAUSMEYER, M. PEINTINGER, AND M. DIENST. 2007. Chloroplast DNA variations of cultivated radish and its wild relatives. Plant Science 168: 627–634.
CAMACHO, C., G. COULOURIS, V. AVAGYAN, N. MA, J. PAPADOPOULOS, K. BEALER, AND T. L. MADDEN. 2009. BLAST+: Architecture and applications. BMC Bioinformatics 10: 421.
CAVERS, S., C. NAVARRO, AND A. J. LOWE. 2003. Chloroplast DNA phylogeography reveals colonization history of a Neotropical tree, Cedrela odorata L., in Mesoamerica. Molecular Ecology 12: 1451–1460.
CBOL [Consortium for the Barcode of Life]. 2009. A DNA barcode for land plants. Proceedings of the National Academy of Sciences, USA 106: 12794–12797.
CHEESMAN, E. E. 1944. Notes on the nomenclature, classification and possible relationships of cocoa populations. Tropical Agriculture 21: 144–159.
CHOU, C. H., Y. C. CHIANG, AND T. Y. CHIANG. 1999. Within- and between-individual length heterogeneity of the rDNA-IGS in Miscanthus sinensis var. glaber (Poaceae): Phylogenetic analyses. Genome 42: 1088–1093.
COART, E., S. VAN GLABEKE, M. DE LOOSE, A. S. LARSEN, AND I. ROLDAN-RUIZ. 2006. Chloroplast diversity in the genus Malus: New insights into the relationship between the European wild apple (Malus sylvestris (L.) Mill.) and the domesticated apple (Malus domestica Borkh). Molecular Ecology 15: 2171–2182.
COEN, E., T. STRACHAN, AND G. DOVER. 1982. Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. Journal of Molecular Biology 158: 17–35.
COOMES, O. T. 2004. Rain forest ‘conservation-through-use’? Chambira palm fibre extraction and handicraft production in a land-constrained community, Peruvian Amazon. Biodiversity and Conservation 13: 351–360.
CRONN, R., A. LISTON, M. PARKS, D. S. GERNANDT, R. SHEN, AND T. MOCKLER. 2008. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology. Nucleic Acids Research 36: e122.
DEMPEWOLF, H., N. C. KANE, K. L. OSTEVIK, M. GELETA, M. S. BARKER, Z. LAI, M. L. STEWART, E. BEKELE, J. M. ENGELS, Q. C. B. CRONK, AND L. H. RIESEBERG. 2010. Establishing genomic tools and resources for Guizotia abyssinica (L.f.) Cass.—The development of a library of expressed sequence tags, microsatellite loci and the sequencing of its chloroplast genome. Molecular Ecology Resources 10: 1048–1058.
DEXTER, K. G., T. D. PENNINGTON, AND C. W. CUNNINGHAM. 2010. Using DNA to assess errors in tropical tree identifications: How often are ecologists wrong and when does it matter? Ecological Monographs 80: 267–286.
DOORDUIN, L., B. GRAVENDEEL, Y. LAMMERS, Y. ARIYUREK, T. CHIN-A-WOENG, AND K. VRIELING. 2011. The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies. DNA Research 18: 93–105.
DUGAN, L. E., M. F. WOJCIECHOWSKI, AND L. R. LANDRUM. 2007. A large scale plant survey: Efficient vouchering with identification through morphology and DNA analysis. Taxon 56: 1238–1244.
EDGAR, R. C. 2004. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792–1797.
ELDER, J. F. JR., AND B. J. TURNER. 1995. Concerted evolution of repetitive DNA sequences in eukaryotes. The Quarterly Review of Biology 70: 297–320.
FELSENSTEIN, J. 1973. Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Systematic Zoology 22: 240–249.
FERRI, G., M. ALU, B. CORRADINI, AND G. BEDUSCHI. 2009. Forensic botany: Species identification of botanical trace evidence using a multigene barcoding approach. International Journal of Legal Medicine 123: 395–401.
FIGUEIRA, A., J. JANICK, AND P. GOLDSBROUGH. 1992. Genome size and DNA polymorphism in Theobroma cacao. Journal of the American Society for Horticultural Science 117: 673–677.
FITTER, R., AND R. KAPLINSKY. 2001. Who gains from product rents as the coffee market becomes more differentiated? A value chain analysis. IDS Bulletin 32: 69–82.
GANLEY, A. R. D., AND T. KOBAYASHI. 2007. Highly efficient concerted evolution in the ribosomal DNA repeats: Total rDNA repeat variation revealed by whole-genome shotgun sequence data. Genome Research 17: 184–191.
GLENN, T. C. 2011. Field guide to next-generation DNA sequencers. Molecular Ecology Resources.
GUINDON, S., AND O. GASCUEL. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52: 696–704.
HEBERT, P. D. N., A. CYWINSKI, S. L. BALL, AND J. R. DEWAARD. 2003. Biological identifications through DNA barcodes. Proceedings. Biological Sciences 270: 313–321.
HILLIER, L. W., G. T. MARTH, A. R. QUINLAN, D. DOOLING, G. FEWELL, D. BARNETT, P. FOX, ET AL. 2008. Whole-genome sequencing and variant discovery in C. elegans. Nature Methods 5: 183–188.
HILLIS, D. M., AND S. K. DAVIS. 1988. Ribosomal DNA: Intraspecific polymorphism, concerted evolution, and phylogeny reconstruction. Systematic Zoology 37: 63–66.
HILLIS, D. M., C. MORITZ, C. A. PORTER, AND R. J. BAKER. 1991. Evidence for biased gene conversion in concerted evolution of ribosomal DNA. Science 251: 308–310.
HOWARD, C. 2010. The development of deoxyribonucleic acid (DNA) based methods for the identification and authentication of medicinal plant material. Ph.D. dissertation, De Montfort University, Leicester, UK. Website http://hdl.handle.net/2086/3972.
HUSON, D. H., AND D. BRYANT. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23: 254–267.
IBRAHIM, R. I., J. AZUMA, AND M. SAKAMOTO. 2006. Complete nucleotide sequence of the cotton (Gossypium barbadense L.) chloroplast genome with a comparative analysis of sequences among 9 dicot plants. Genes & Genetic Systems 81: 311–321.
INTERNATIONAL COCOA ORGANIZATION. 2011. About cocoa [online]. International Cocoa Organization, London, UK. Website http://www.icco.org/about/growing.aspx [accessed 20 June 2011].
IRISH, B. I., R. GOENAGA, D. ZHANG, R. SCHNELL, S. BROWN, AND J. C. MOTAMAYOR. 2010. Microsatellite fingerprinting of the USDA-ARS tropical agriculture research station cacao (Theobroma cacao L.) germplasm collection. Crop Science 50: 656–667.
IUCN [International Union for Conservation of Nature]. 2006. 2006 IUCN Red List. Website http://www.iucn.org [accessed December 16, 2011].
JANSEN, R. K., C. SASKI, S. B. LEE, A. K. HANSEN, AND H. DANIELL. 2011. Complete plastid genome sequences of three rosids (Castanea, Prunus, Theobroma): Evidence for at least two independent transfers of rpl22 to the nucleus. Molecular Biology and Evolution 28: 835–847.
KANE, N. C., AND Q. CRONK. 2008. Botany without borders, barcoding in focus. Molecular Ecology 17: 5175–5176.
KRESS, W. J., AND D. L. ERICKSON. 2008. DNA-barcoding—A windfall for tropical biology? Biotropica 40: 405–408.
KRONHOLM, I., O. LOUDET, AND J. DE MEAUX. 2010. Influence of mutation rate on estimators of genetic differentiation—Lessons from Arabidopsis thaliana. BMC Genetics 11: 33.
LAHAYE, R., M. VAN DER BANK, D. BOGARIN, J. WARNER, F. PUPULIN, G. GIGOT, O. MAURIN, S. DUTHOIT, T. G. BARRACLOUGH, AND V. SAVOLAINEN. 2008. DNA barcoding the floras of biodiversity hotspots. Proceedings of the National Academy of Sciences, USA 105: 2923–2928.
LEE, S.-B., C. KAITTANIS, R. K. JANSEN, J. B. HOSTETLER, L. J. TALLON, C. D. TOWN, AND H. DANIELL. 2006. The complete chloroplast genome sequence of Gossypium hirsutum: Organization and phylogenetic relationships to other angiosperms. BMC Genomics 7: 61.
LI, H., B. HANDSAKER, A. WYSOKER, T. FENNELL, J. RUAN, N. HOMER, G. MARTH, G. ABECASIS, AND R. DURBIN, 1000 GENOME PROJECT DATA PROCESSING SUBGROUP. 2009. The Sequence Alignment/Map (SAM) format and SAMtools. Bioinformatics (Oxford, England) 25: 2078–2079.
LOHSE, M., O. DRECHSEL, AND R. BOCK. 2007. OrganellarGenomeDRAW (OGDRAW): A tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Current Genetics 52: 267–274.
LOU, S. K., K. L. WONG, M. LI, P. P. H. BUT, S. K. TSUI, AND P. C. SHAW. 2010. An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database). BMC Genomics 11: 402.
MEYERS, S., AND A. LISTON. 2010. Characterizing the genome of wild relatives of Limnanthes alba (meadowfoam) using massively parallel sequencing. Acta Horticulturae 859: 309–314.
MORIN, P. A., J. J. MOORE, AND D. S. WOODRUFF. 1992. Identification of chimpanzee subspecies with DNA from hair and allele specific probes. Proceedings. Biological Sciences 249: 293–297.
MOTAMAYOR, J. C., P. LACHNEAUD, J. W. DA SILVA E MOTA, R. LOOR, D. N. KUHN, J. S. BROWN, AND R. J. SCHNELL. 2008. Geographic and genetic population differentiation of the Amazonian chocolate tree (Theobroma cacao L). PLoS ONE 3: e3311.
MOTILAL, L. A., D. ZHANG, P. UMAHARAN, S. MISCHKE, M. BOCCARA, AND S. PINNEY. 2009. Increasing accuracy and throughput in large-scale microsatellite fingerprinting of cacao field germplasm collections. Tropical Plant Biology 2: 23–37.
MUELLNER, A. N., H. GREGER, AND C. M. PANNELL. 2009. Genetic diversity and geographic structure in Aglaia elaeagnoidea (Meliaceae, Sapindales), a morphologically complex tree species, near the two extremes of its distribution. Blumea—Biodiversity, Evolution and Biogeography of Plants 54: 207–216.
NEWMASTER, S. G., AND S. RAGUPATHY. 2010. Ethnobotany genomics—Discovery and innovation in a new era of exploratory research. Journal of Ethnobiology and Ethnomedicine 6: 2.
NEWTON, A. C. 2008. Conservation of tree species through sustainable use: How can it be achieved in practice? Oryx 42: 195–205.
NOCK, C. J., D. L. WATERS, M. A. EDWARDS, S. G. BOWEN, N. RICE, G. M. CORDEIRO, AND R. J. HENRY. 2011. Chloroplast genome sequences from total DNA for plant identification. Plant Biotechnology Journal 9: 328–333.
OVCHARENKO, I., G. G. LOOTS, R. C. HARDISON, W. MILLER, AND L. STUBBS. 2004. zPicture: Dynamic alignment and visualization tool for analyzing conservation profiles. Genome Research 14: 472–477.
PALMER, J. D. 1985. Evolution of chloroplast and mitochondrial DNA in plants and algae. In R. J. MacIntyre [ed.], Monographs in evolutionary biology: Molecular evolutionary genetics, 131–240. Plenum, New York, New York, USA.
PARKS, M., R. CRONN, AND A. LISTON. 2009. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biology 7: 84.
PENNISI, E. 2007. Wanted: A barcode for plants. Science 318: 190–191.
PETIT, R. J., AND G. G. VENDRAMIN. 2007. Plant phylogeography based on organelle genes: An introduction. In S. Weiss and N. Ferrand [eds.], Phylogeography of southern European refugia, 23–97. Springer, Dordrecht, Netherlands.
PIREDDA, R., C. S. MARCO, M. ATTIMONELLI, R. BELLAROSA, AND B. SCHIRONE. 2011. Prospects of barcoding the Italian wild dendroflora: Oaks reveal severe limitations to tracking species identity. Molecular Ecology Resources 11: 72–83.
POSADA, D. 2008. jModelTest: Phylogenetic model averaging. Molecular Biology and Evolution 25: 1253–1256.
RAGUPATHY, S., S. G. NEWMASTER, M. MURUGESAN, AND V. BALASUBRAMANIAM. 2009. DNA barcoding discriminates a new cryptic grass species revealed in an ethnobotany study by the hill tribes of the Western Ghats in southern India. Molecular Ecology Resources 9: 164–171.
ROGERS, S. O., AND A. J. BENDICH. 1987. Ribosomal RNA genes in plants: Variability in copy number and in the intergenic spacer. Plant Molecular Biology 9: 509–520.
ROSS, B. C., K. RAIOS, K. JACKSON, AND B. DWYER. 1992. Molecular cloning of a highly repeated DNA element from Mycobacterium tuberculosis and its use as an epidemiological tool. Journal of Clinical Microbiology 30: 942–946.
ROWNTREE, J. K., R. S. COWAN, M. LEGGETT, M. M. RAMSAY, AND M. F. FAY. 2010. Which moss is which? Identification of the threatened moss Orthodontium gracile using molecular and morphological techniques. Conservation Genetics 11: 1033–1042.
SCHAAL, B. A., D. A. HAYWORTH, K. M. OLSEN, J. T. RAUSCHER, AND W. A. SMITH. 1998. Phylogeographic studies in plants: Problems and prospects. Molecular Ecology 7: 465–474.
SCHAAL, B. A., AND G. H. LEARN JR. 1988. Ribosomal DNA variation within and among plant populations. Annals of the Missouri Botanical Garden 75: 1207–1216.
SCHWARTZ, S., W. J. KENT, A. SMIT, Z. ZHANG, R. BAERTSCH, R. C. HARDISON, D. HAUSSLER, AND W. MILLER. 2003. Human-mouse alignments with BLASTZ. Genome Research 13: 103–107.
SIMPSON, J. T., K. WONG, S. D. JACKMAN, J. E. SCHEIN, S. J. JONES, AND I. BIROL. 2009. ABySS: A parallel assembler for short read sequence data. Genome Research 19: 1117–1123.
SOLTIS, D. E., M. A. GITZENDANNER, D. D. STRENGE, AND P. S. SOLTIS. 1997. Chloroplast DNA intraspecific phylogeography of plants from the Pacific Northwest of North America. Plant Systematics and Evolution 206: 353–373.
SOLTIS, D. E., A. E. SENTERS, M. J. ZANIS, S. KIM, J. D. THOMPSON, P. S. SOLTIS, L. P. RONSE DE CRAENE, P. K. ENDRESS, AND J. S. FARRIS. 2003. Gunnerales are sister to other core eudicots: Implications for the evolution of pentamery. American Journal of Botany 90: 461–470.
SOSNICKI, A. A., AND S. NEWMAN. 2010. The support of meat value chains by genetic technologies. Meat Science 86: 129–137.
STEELE, P. R., AND J. C. PIRES. 2011. Biodiversity assessment: State-of-the-art techniques in phylogenomics and species identification. American Journal of Botany 98: 415–425.
STRAUB, S. C. K., M. FISHBEIN, T. LIVSHULTZ, Z. FOSTER, M. PARKS, K. WEITEMIER, R. C. CRONN, ET AL. 2011. Building a model: Developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12: 211.
STRAUB, S. C. K., M. PARKS, K. WEITEMIER, M. FISHBEIN, R. C. CRONN, AND A. LISTON. 2012. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics. American Journal of Botany 99: 349–364.
WARD, J., S. R. GILMORE, J. ROBERTSON, AND R. PEAKALL. 2009. A grass molecular identification system for forensic botany: A critical evaluation of the strengths and limitations. Journal of Forensic Sciences 54: 1254–1260.
WENDEL, J. F., AND V. A. ALBERT. 1992. Phylogenetics of the cotton genus (Gossypium): Character-state weighted parsimony analysis of chloroplast-DNA restriction site data and its systematic and biogeographic implications. Systematic Botany 17: 115–143.
WHITLOCK, M. 2011. G′ST and D do not replace FST. Molecular Ecology 20: 1083–1091.
WHITTALL, J. B., J. SYRING, M. PARKS, J. BUENROSTRO, C. DICK, A. LISTON, AND R. CRONN. 2010. Finding a (pine) needle in a haystack: Chloroplast genome sequence divergence in rare and widespread pines. Molecular Ecology 19 (supplement 1): 100–114.
WOLFE, K. H., W. H. LI, AND P. M. SHARP. 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proceedings of the National Academy of Sciences, USA 84: 9054–9058.
WOOD, G. A. R., AND R. A. LASS. 2001. Cocoa, 4th ed. Longman Group, Blackwell, UK.
WYMAN, S. K., R. K. JANSEN, AND J. L. BOORE. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics (Oxford, England) 20: 3252–3255.
YANG, J. Y., A. LAMBERT, L. A. MOTILAL, H. DEMPEWOLF, K. MAHARAJ, AND Q. C. B. CRONK. 2011. Chloroplast microsatellite primers for cacao (Theobroma cacao). American Journal of Botany 98: e372–e374.
YAO, H., J. SONG, C. LIU, K. LUO, J. HAN, Y. LI, X. PANG, H. XU, Y. ZHU, P. XIAO, AND S. CHEN. 2010. Use of ITS2 region as the universal DNA barcode for plants and animals. PLoS ONE 5: e13102.
YESSON, C., R. T. BÁRCENAS, H. M. HERNÁNDEZ, M. DE LA LUZ RUÍZ-MAQUEDA, A. PRADO, V. M. RODRÍGUEZ, AND J. A. HAWKINS. 2011. DNA barcodes for Mexican Cactaceae, plants under pressure from wild collecting. Molecular Ecology Resources 11:
ZHANG, Y.-J., P.-F. MA, AND D.-Z. LI. 2011. High-throughput sequencing of six bamboo chloroplast genomes: Phylogenetic implications for temperate woody bamboos (Poaceae: Bambusoideae). PLoS ONE 6: e20596.
ZURAWSKI, G., AND M. T. CLEGG. 1987. Evolution of higher-plant chloroplast DNA-encoded genes: Implications for structure–function and phylogenetic studies. Annual Review of Plant Physiology 38: 391–418.
ZWICKL, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. dissertation, University of Texas at Austin, Austin, Texas, USA.