
Gene 396 (2007) 66 – 74

www.elsevier.com/locate/gene

The complete nucleotide sequence of the mitochondrial genome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae)

D.J. Yu a,b,⁎, L. Xu a, F. Nardi c, J.G. Li d, R.J. Zhang b
a Shenzhen Entry-Exit Inspection & Quarantine Bureau, Shenzhen, PR China
b Institute of Entomology & State Key Laboratory for Biocontrol, Sun Yat-Sen University, Guangzhou, PR China
c Department of Evolutionary Biology, University of Siena, Siena, Italy
d Beijing Entry-Exit Inspection & Quarantine Bureau, Beijing, PR China
Received 4 July 2006; received in revised form 30 January 2007; accepted 20 February 2007
Available online 15 March 2007
Received by: G. Pesole

Abstract

The complete mitochondrial genome of the oriental fruit fly Bactrocera dorsalis s.s. has been sequenced, and is here described and compared
with the homologous sequences of Bactrocera oleae and Ceratitis capitata. The genome is a circular molecule of 15,915 bp, and encodes the set
of 37 genes generally found in animal mitochondrial genomes. The structure and organization of the molecule are typical and similar to those of the two
closely related species B. oleae and C. capitata, although it presents an interesting case of putative intra-molecular recombination. The relevance
of the growing comparative dataset of tephritid complete mitochondrial genomes is discussed in relation to the possibility of developing robust assays
for species discrimination in quarantine and agricultural monitoring practices, as well as basic phylogeography/population genetic studies.
© 2007 Elsevier B.V. All rights reserved.

Keywords: Mitochondrial genome; Oriental fruit fly; Bactrocera dorsalis complex; Tephritidae; Intramolecular recombination; Species discrimination

Abbreviations: atp6-8, genes encoding ATP synthase subunits 6 and 8; cob, gene encoding cytochrome b; cox1-3, genes encoding cytochrome oxidase subunits 1-3; CR, control region; nad1-6 and nad4L, genes encoding NADH dehydrogenase subunits 1-6 and 4L; NCR, non-coding region; rrnL and rrnS, genes encoding the large (16S) and small (12S) subunits of ribosomal RNA; LSU and SSU, Large and Small ribosomal RNA subunits; PCG, protein coding gene; trnX, genes encoding transfer RNA molecules, with the corresponding amino acid denoted by the one-letter code and anticodon indicated in parentheses (XXX) when required; tRNA-X, transfer RNA molecules with corresponding amino acids denoted with a one-letter code; Kb, kilobases; nt, nucleotides; PCR, Polymerase Chain Reaction; bp, base pair; mtDNA, mitochondrial DNA.
⁎ Corresponding author. Shenzhen Entry-Exit Inspection and Quarantine Bureau, 2049 Heping Road, 518001, Shenzhen, Guangdong, PR China. Tel.: +86 755 82117990; fax: +86 755 25588630.
E-mail address: [email protected] (D.J. Yu).
0378-1119/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.gene.2007.02.023

1. Introduction

The oriental fruit fly, Bactrocera dorsalis (Hendel), is one of the most economically important fruit fly pests (Clarke et al., 2005). This polyphagous species has been recorded in Asia from 117 host species, in 76 genera and 37 families (Allwood et al., 1999), among which are a number of commercially grown fruits (White and Elson-Harris, 1992) such as cherry, plum and peaches. The species is widespread throughout much of South East Asia, Pakistan, India, Sri Lanka, Sikkim, Myanmar, Southern China, Japan (eradicated from Ryukyu Island), and some Pacific islands including Hawaii (Carroll et al., 2002). In addition, given its high polyphagy and capability of adapting to new areas, B. dorsalis is causing serious concern in terms of its possible introduction in other economically significant fruit growing regions worldwide. As an example, the annual investigation plan for fruit flies in China reported the presence of B. dorsalis in more than 10 provinces, and its range may expand to Southeast and Middle China as a consequence of global climate warming (Hou, 2005).
B. dorsalis s.s. belongs to a large group of morphologically similar species generally referred to as the B. dorsalis complex. Beginning with the revision of Drew and Hancock (1994), this group of species has been subdivided in the last decade to include 75 recognized entities. Among these, a group of sibling

species, hardly distinguishable on morphological grounds, has drawn a lot of attention as it includes, besides some non pest species and B. dorsalis s.s., at least four species of major agricultural importance: Bactrocera carambolae Drew & Hancock, Bactrocera papayae Drew & Hancock, Bactrocera occipitalis (Bezzi) and Bactrocera philippinensis Drew & Hancock.
An efficient and reliable way to discriminate among these (as well as other) species has become crucial in terms of international trade and quarantine controls (Follett and Neven, 2006), and is foreseeable only using molecular methodologies (for a review of earlier methods and recent perspectives: Clarke et al., 2005).
The animal mitochondrial genome is a closed circular molecule ranging in size from 14 to 19 kb (Wolstenholme, 1992; Boore, 1999). The gene content and organization are generally conserved. Mitochondrial genome sequences have frequently been used as molecular markers in studies focusing on the species level (for a comprehensive review see Avise, 2000) although some limitations, such as their dependency on the female lineage only, and the possible detrimental effects of non-neutrality and introgression, are emerging (Ballard and Whitlock, 2004; Ballard and Rand, 2005). Substitution rates are generally higher in the mitochondrial than in the nuclear genome (Brown et al., 1982), providing more resolution at shallow divergence levels. Furthermore, the mitochondrial genome is inherited via the female line only, and generally does not recombine (Scheffler, 1999). Most importantly, the effective population size of the molecules is one fourth of their nuclear counterparts and the mean coalescence time is shorter. Therefore mitochondrial phyletic lineages reach monophyly faster after populations/species are genetically separated (Avise, 2000).
In recent years, as the sequencing of multiple complete mitochondrial genomes is becoming more commonplace, comparative analyses at the genus/species level have been produced that use complete molecules instead of one or a subset of genes (Ballard, 2000a,b; Yukuhiro et al., 2002; Nardi et al., 2003; see also Rand, 2001). About 60 insect mitochondrial genomes are now available (Amiga database: Feijao et al., 2006), 11 among higher Diptera and four (from three species, including the genome of B. dorsalis presented here) from tephritid fruit flies. Therefore the comparative study of complete mitochondrial genomes is becoming a viable approach to the study of certain insect groups at the species level, and could be applied to species diagnostics in key taxa, such as tephritid fruit flies. In the short term, background information on complete mitochondrial genomes in closely related taxa can be used to select one or more genes as the most variable/informative for specific problems. This strategy was applied in Nardi et al. (2003) where, following a comparison of two completely sequenced mitochondrial genomes, the gene nad1 was chosen for a wider phylogeographic analysis in B. oleae (Nardi et al., 2005).
Here we report and describe the complete nucleotide sequence of the mitochondrial genome of B. dorsalis s.s. from a Chinese population, and compare the sequence and genome organization with the two other available complete mitochondrial genomes from tephritid fruit flies, the congeneric B. oleae (Nardi et al., 2003) and C. capitata (Spanos et al., 2000), which belongs to a different tribe in the same subfamily. The possible utility of mitochondrial sequences for species discrimination in the B. dorsalis complex is also discussed.

2. Materials and methods

Larvae of B. dorsalis s.s. were collected from Longan (Dimocarpus longan Lour.) in Shenzhen, Guangdong Province of China, and reared to adulthood under laboratory conditions (Yu et al., 2004). Adult specimens were identified to the species level according to the taxonomic key of Drew and Hancock (1994). Total DNA was isolated from single adult specimens using the DNeasy Tissue Kit (QIAGEN, Germany), and the DNA of one single individual was used for all amplifications and sequencing that resulted in the complete genome sequence. The entire mitochondrial genome was amplified via PCR in 22 overlapping pieces ranging in size from 150 bp to 2 kb (Fig. 1; primer sequences and amplification conditions available upon request from Yu) using a recombinant Taq DNA Polymerase (TaKaRa, Japan). Most primers are based on Simon et al. (1994), with modifications based on the available sequences of C. capitata (GenBank accession no. AJ242872) and B. oleae (GenBank accession no. AY210702); additional primers were designed on available sequences. Most fragments were gel purified (QIAquick Gel Extraction Kit: QIAGEN, Germany) and directly sequenced on both strands using PCR and internal primers. To ensure maximum accuracy, each fragment was sequenced twice independently, and in case of discrepancies a third PCR product was sequenced. The fragment encompassing

Fig. 1. Sequencing strategy. Full lines identify amplification products (1–20 and 22), the double line (21) identifies a fragment that was cloned before sequencing. See
section 2 for details.

the CR and nad2 gene (fragment 21 in Fig. 1) was cloned in a pGEM-T vector (Promega, USA) and three clones were sequenced. For this region cloning was carried out to obtain better sequencing reads in areas rich in homopolymer runs. In both cases discrepancies among different reads (roughly 3‰ of bases) were corrected on a two out of three basis.
Primers 1007SpacerFw (GAGGGTTACCCCCATTTATTG) and 1684SpacerRev (TTGATCGTCACCGATTAAAGC) were used to amplify and sequence a fragment of 878 bp (encompassing the area around trnW/C/Y) in three additional specimens of B. dorsalis s.s. from China (Shanghai, Yunnan and Guangdong provinces) as well as B. dorsalis from U.S.A., Thailand, Vietnam, Japan (laboratory strain), B. papayae from Malaysia, B. carambolae from Thailand, B. philippinensis and B. occipitalis from the Philippines. These additional sequences were used for comparison, and not included in the complete genome sequence.
Sequencing and cloning were performed at Beijing Aoke Biological Limited Inc using an ABI3730 DNA Sequencer (PE Applied Biosystems, USA) and BigDye™ chemistry. Electropherograms were manually inspected for accuracy of the base calls and assembled.
Sequence annotation, and the comparison with C. capitata and B. oleae, were performed using the DNAStar package (DNAStar, USA) and the on-line blast tools available through the NCBI web site (Altschul et al., 1997). Folding of the repeated fragment and free energy values were studied in the trnC-spacer-trnY region and in an equally long area in the CR using the mfold software (default parameters for DNA; Zuker, 2003).
The complete sequence of B. dorsalis mtDNA was deposited in GenBank under accession no. DQ845759, additional sequences of the trnW/C/Y region under accession nos. EF377349–59.

3. Results and discussion

3.1. Genome organization and base composition

The mitochondrial genome of B. dorsalis is a closed circular molecule of 15,915 bp, well in the range of the other two tephritid genomes available (15,815 in B. oleae and 15,980 in C. capitata). The gene content is typical of other metazoan mitochondrial genomes (Fig. 2, Table 1): 13 PCGs (cox1-3 and b, nad1-6 and 4L, atp6 and 8), 22 tRNA genes (one for each amino acid, two for Leucine and Serine) and two for mitochondrial rRNA subunits (rrnS and rrnL). Gene order follows the basic Pancrustacean arrangement (Crease, 1999). Only one long unassigned region is present between rrnS and trnI, and it was deemed homologous to the CR (also known as A + T rich region in insects) by positional homology, general structure and base content.
Genome organization is very compact, with only 218 nucleotides dispersed in 14 intergenic spacers and contiguous genes overlapping at 5 boundaries by a total of 27 bases. Besides the aforementioned CR, two of the intergenic spacers appear of significant length: 66 bp between trnQ and trnM and 45 bp between trnC and trnY. This is unlike the situation in B. oleae, where the longest intergenic spacer is 28 bp, and more similar to C. capitata, where two longer (> 40 bp) spacers are observed on both sides of trnQ. Neither B. oleae nor C. capitata display significant intergenic spacers in the region between trnC and trnY (1 bp and 16 bp, respectively).
Base composition shows both genome wide and strand specific compositional biases. Overall composition is A = 39.3%, C = 16.2%, G = 10.2%, T = 34.3%, giving a total A + T content of 73.6%. This figure is very similar to the congeneric B. oleae (72.6%) and slightly lower than C. capitata (77.5%). The A + T content of isolated protein coding genes, rRNAs, tRNAs and the CR is 71.2%, 77.8%, 75.2% and 88.1%, respectively, again similar to B. oleae and slightly lower than in C. capitata (Table 2).
Considering the two strands separately, an asymmetrical compositional bias can be observed (Hassanin et al., 2005), and is most evident comparing gene sequences on opposite strands: genes encoded on the J strand (nad2-3 and 6, cox1-3 and b, atp6 and 8) have a comparable content in As and Ts (32.9% A; 37.6% T, on average), while genes on the N strand (nad1, 4-5 and 4L, rrnS and rrnL) display a strong bias towards higher A% than T% (48.4% A; 26.5% T).
This parallels observations from B. oleae (Nardi et al., 2003), apart from the fact that here the bias can also be observed in rRNA genes, though to a lesser extent.

Fig. 2. Graphical representation of the mitochondrial genome of Bactrocera dorsalis. Orientation of genes and tRNAs is specified by the direction of arrows. Positive numbers at gene boundaries mark intergenic spacers (in bp, see Table 1), negative numbers gene overlaps. Shading identifies different regions (PCGs, rRNAs and CR).

3.2. Protein coding genes and codon usage

The mitochondrial genome of B. dorsalis encodes the regular set of 13 PCGs found with few exceptions in all animal mitochondrial genomes. All protein coding genes, with the exception of cox1 and atp8, start with a typical ATN initiation codon (cox2, 3 and b, atp6 and nad4 and 4L with ATG; nad2-3
and 5-6 with ATT; nad1 with ATA: Table 1). Initiation codons ATG or ATA, encoding Methionine, are the most typical among insects (Bae et al., 2004; Kim et al., 2005); the ATT codon is less common but often observed among tephritids (Table 3). The gene for cox1 is initiated by a TCG codon, as in B. oleae and C. capitata, and the hexanucleotide ATTTAA, proposed as an initiation signal in mosquitoes and observed in most dipterans analyzed so far, is exactly adjacent to this triplet. The start codon for gene atp8 (GTG) is different from both B. oleae (ATC) and C. capitata (ATT).
Canonical TAA and TAG termination codons are found in five (cox2-3, nad2 and 4L, and atp8) and one (nad4) PCGs, respectively. The remaining genes show an incomplete termination codon (TA in cox1, atp6 and nad6; T in nad3 and 5, cob and nad1: Table 1) likely extended to TAA during the maturation of the transcript, a phenomenon commonly observed in metazoan mitochondrial genes (Clary and Wolstenholme, 1985; Bae et al., 2004; Junqueira et al., 2004). In four cases (cob, nad3 and 6, atp6) a complete stop codon TAA or TAG is present on the genomic sequence, but the last one or two adenines overlap with the neighboring gene, and the codons were considered to be incomplete. Considering only those three genes that strictly do not have complete termination codons in B. dorsalis, one (nad1) is in common with both B. oleae and C. capitata, one (cox1) with B. oleae, and one (nad5) with C. capitata. This is somewhat expected given the close taxonomic proximity of the three species, but not fully justified by the 10–20% of differences observed at the nucleotide level between PCGs in the three species, suggesting some role for selection. A comparison of the length of protein coding genes among the three tephritid species shows that gene length is highly conserved, with a difference of 2 codons at most (nad4L: Table 3).

Table 1
Genes and gene regions in the mtDNA of B. dorsalis
Gene         Span                IGS a   Start   Stop
trnI         1–66                −3      –       –
trnQ         64–132 b            66      –       –
trnM         199–267             0       –       –
nad2 c       268–1290            10      ATT     TAA
trnW c       1301–1369           −8      –       –
trnC c       1362–1424 b         45      –       –
trnY c       1470–1536 b         −2      –       –
cox1 c       1535–3069           0       TCG     TA
trnL(UUR)    3070–3135           4       –       –
cox2         3140–3829           4       ATG     TAA
trnK         3834–3904           0       –       –
trnD         3905–3971           0       –       –
atp8         3972–4133           −7      GTG     TAA
atp6         4127–4803           0       ATG     TA
cox3         4804–5592           9       ATG     TAA
trnG         5602–5666           0       –       –
nad3         5667–6018           0       ATT     T
trnA         6019–6083           7       –       –
trnR         6091–6154           11      –       –
trnN         6166–6230           0       –       –
trnS(AGY)    6231–6298           0       –       –
trnE         6299–6365           18      –       –
trnF         6384–6448 b         0       –       –
nad5         6449–8168 b         15      ATT     T
trnH         8184–8249 b         0       –       –
nad4         8250–9590 b         −7      ATG     TAG
nad4L        9584–9880 b         2       ATG     TAA
trnT         9883–9947           0       –       –
trnP         9948–10,013 b       2       –       –
nad6         10,016–10,539       0       ATT     TA
cob          10,540–11,674       0       ATG     T
trnS(UCN)    11,675–11,741       15      –       –
nad1         11,757–12,696 b     10      ATA     T
trnL(CUN)    12,707–12,771 b     0       –       –
rrnL         12,772–14,104 b     0       –       –
trnV         14,105–14,176 b     0       –       –
rrnS         14,177–14,966 b     0       –       –
CR           14,967–15,915       0       –       –
a Intergenic nucleotides observed after the indicated gene. Positive numbers identify spacers (in bp), negative numbers identify gene overlaps.
b Encoded on the N strand (Simon et al., 1994).
c Region sequenced in additional geographic samples and species from the dorsalis complex.

Table 2
Length and base composition of different genomic regions in B. dorsalis, B. oleae and C. capitata
                                  mtDNA            CR               rRNAs            tRNAs            PCGs
Species       Accession no. b     Size a   AT%     Size   AT%       Size   AT%       Size   AT%       Size     AT%
B. dorsalis   DQ845759            15,915   73.6    949    88.1      2123   77.8      1467   75.2      11,185   71.2
B. oleae      AY210702            15,815   72.6    949    86.9      2116   77.1      1484   75.1      11,188   70.1
C. capitata   AJ242872            15,980   77.5    1004   91.1      2123   80.2      1472   76.8      11,272   75.5
a In base pairs.
b GenBank accession number.

3.3. rRNAs and tRNAs

The genes encoding the large and the small ribosomal subunits are located between trnL(CUN) and trnV, and between trnV and the CR, respectively. The length of B. dorsalis rrnS and rrnL was determined to be 790 bp and 1333 bp, respectively, based on the location of neighboring tRNAs and a comparison with other related sequences and structures (B. oleae, unpublished data). These figures are very similar to B. oleae and C. capitata and well in the range of other dipterans (Kim et al., 2005).
The complete set of 22 tRNAs typical of metazoan mitochondrial genomes is present in B. dorsalis. Their secondary structures generally conform to a regular cloverleaf structure (Table 4). Anticodon sequences are the same as in B. oleae and C. capitata. In tRNA-S(AGN) eight unpaired
nucleotides appear to replace the DHU arm (Table 4), as in B. oleae and C. capitata (personal observation based on sequence AJ242872). In B. dorsalis and C. capitata, but not B. oleae, an alternative structure with a paired stem can be drawn by allowing the suboptimal pairing [CGT]TC[AAG].

Table 3
Position, length, initiation and termination codons in B. dorsalis, B. oleae and C. capitata PCGs
         B. dorsalis                              B. oleae                                 C. capitata
Gene     From     To       Length  Start  Stop    From     To       Length  Start  Stop    From     To       Length  Start  Stop
nad2     268      1290     1023    ATT    TAA     206      1228     1023    ATT    TAA     293      1315     1023    ATT    TAA
cox1     1535     3069     1535    TCG    TA      1428     2961     1534    TCG    T       1527     3062     1536    TCG    TAA
cox2     3140     3829     690     ATG    TAA     3033     3722     690     ATG    TAA     3138     3824     687     ATG    TAA
atp8     3972     4133     162     GTG    TAA     3864     4025     162     ATC    TAA     3974     4135     162     ATT    TAA
atp6     4127     4803     677     ATG    TA      4019     4696     678     ATG    TAA     4129     4806     678     ATG    TAA
cox3     4804     5592     789     ATG    TAA     4696     5484     789     ATG    TAA     4806     5594     789     ATG    TAA
nad3     5667     6018     352     ATT    T       5559     5921     354     ATC    TAG     5666     6019     354     ATA    TAA
nad5     6449     8168     1720    ATT    T       6357     8075     1719    ATT    TAA     6452     8168     1717    ATT    T
nad4     8250     9590     1341    ATG    TAG     8156     9496     1341    ATG    TAA     8260     9600     1341    ATG    TAA
nad4L    9584     9880     297     ATG    TAA     9490     9786     297     ATG    TAA     9600     9890     291     ATG    TAA
nad6     10,016   10,539   524     ATT    TA      9922     10,446   525     ATC    TAA     10,026   10,550   525     ATT    TAA
cob      10,540   11,674   1135    ATG    T       10,446   11,582   1137    ATG    TAG     10,550   11,686   1137    ATG    TAG
nad1     11,757   12,696   940     ATA    T       11,663   12,602   940     ATG    T       11,767   12,706   940     ATT    T

3.4. Intra-molecular recombination and the origin of the intergenic spacers

The longer intergenic spacer (66 bp: Table 1) shows no significant similarity (e-value < 0.01) to other known sequences in the GenBank database or to other regions in the B. dorsalis genome, and therefore it is impossible to formulate hypotheses on its origin. On the other hand, the smaller spacer sequence (45 bp: Table 1) has a clear counterpart in the CR, where the first 33/45 bases of the spacer are found exactly repeated (Blast e-value: e−5). Sequences encompassing the two repeated regions have been independently confirmed by comparison with GenBank entries AB191470–3, which encompass the whole CR in four B. dorsalis s.s. specimens from different locations (Nakahara et al., unpublished data), and by re-sequencing the area around trnW/C/Y in additional specimens.
In detail, bases 1425–1457 correspond to bases 15309–15341 in inverted orientation, while the remaining 12 bp of the spacer (bases 1458–1469) have no resemblance with the co-linear bases 15297–15308 in the CR. Interestingly, this repeated segment lies in regions capable of forming extremely rich secondary structure motifs (Fig. 3). Bases 1425–1469 are surrounded by two tRNA genes, and are predicted to form a small stem (5 bp) themselves (total dG = −11.62). Bases 15297–15341 are predicted to form a long stem structure together with a partially complementary neighboring sequence (dG = −19.76). Secondary structures, formed by tRNA genes or self complementary sequences, are believed to play a major role as hotspots for recombination (Stanton et al., 1994).
We hypothesize that this duplicated sequence could mark a recent recombination event. Intra-mitochondrial recombination has been described in recent years for mitochondrial genomes (reviewed in Dowton and Campbell, 2001). This process seems to be responsible for the formation of the molecular chimeras observed in human mitochondria (Kajander et al., 2000) and in the somatic tissue of the Manila clam (Passamonti et al., 2003), as well as for the apparently abnormal rate of gene rearrangement in certain hymenopteran lineages (Dowton and Austin, 1999). The alternative possibility of the duplication of a large fragment encompassing trnY to CR and subsequent loss, in one copy, of all 7 genes but not of the repeated sequence (duplication and random loss model: Boore, 2000) seems highly unlikely in this case as there is no other trace of this supposed duplication anywhere in the region.
With regard to the direction of the recombination, not enough comparative data are available to draw firm conclusions, but since the spacer copy of the sequence is absent in the closely related B. oleae and C. capitata, while the CR copy is present (25 out of 33 identical bases) at least in B. oleae, we hypothesize that the CR copy is the original one. This could have been duplicated as a second insertion between trnC and trnY.
In order to better determine the lineage where the recombination took place, the occurrence of both copies of the repeated sequence was explored across available Bactrocera species. The CR copy is clearly recognizable in B. carambolae (30 bp identity, Blast e-value < 0.005) and B. papayae (28/29 bp identity, Blast e-value < 0.005), but also, with decreasing similarity, across all Bactrocera species for which sequence data is available (AF033920–34: Hoeben and Ma, unpublished data), including the aforementioned B. oleae (Nardi et al., 2003). This is consistent with the hypothesis that this is the original copy.
The spacer copy is present with minor modifications (1/3 changes) across all geographic samples and species studied from the dorsalis complex (sequences obtained in this study: B. dorsalis s.s., B. papayae, B. carambolae, B. philippinensis and B. occipitalis) but absent in B. oleae. Therefore the recombination event can be placed, to the best of our knowledge, on the lineage leading to B. dorsalis after the separation from B. oleae, but before the diversification of the B. dorsalis species group. Additional sequence data from species of Bactrocera inside and outside the dorsalis complex would be useful to pinpoint this event with more precision.
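The spacer and overlap figures quoted in Section 3.1, and the two long spacers discussed in Section 3.4, can be rechecked directly from the gene spans in Table 1. The Python sketch below is an illustration added for that purpose and is not part of the original analysis; the helper name `boundary_gaps` is hypothetical, and the coordinates are copied from Table 1 in genome order.

```python
# Gene spans (1-based, inclusive) copied from Table 1, in genome order.
FEATURES = [
    ("trnI", 1, 66), ("trnQ", 64, 132), ("trnM", 199, 267),
    ("nad2", 268, 1290), ("trnW", 1301, 1369), ("trnC", 1362, 1424),
    ("trnY", 1470, 1536), ("cox1", 1535, 3069), ("trnL(UUR)", 3070, 3135),
    ("cox2", 3140, 3829), ("trnK", 3834, 3904), ("trnD", 3905, 3971),
    ("atp8", 3972, 4133), ("atp6", 4127, 4803), ("cox3", 4804, 5592),
    ("trnG", 5602, 5666), ("nad3", 5667, 6018), ("trnA", 6019, 6083),
    ("trnR", 6091, 6154), ("trnN", 6166, 6230), ("trnS(AGY)", 6231, 6298),
    ("trnE", 6299, 6365), ("trnF", 6384, 6448), ("nad5", 6449, 8168),
    ("trnH", 8184, 8249), ("nad4", 8250, 9590), ("nad4L", 9584, 9880),
    ("trnT", 9883, 9947), ("trnP", 9948, 10013), ("nad6", 10016, 10539),
    ("cob", 10540, 11674), ("trnS(UCN)", 11675, 11741), ("nad1", 11757, 12696),
    ("trnL(CUN)", 12707, 12771), ("rrnL", 12772, 14104), ("trnV", 14105, 14176),
    ("rrnS", 14177, 14966), ("CR", 14967, 15915),
]


def boundary_gaps(features):
    """Return (left gene, right gene, gap) for each pair of adjacent features.

    A positive gap is an intergenic spacer in bp, a negative gap is an
    overlap, and zero means the two features abut. The CR/trnI junction
    across the origin is not included (the CR ends at base 15,915 and
    trnI starts at base 1, so that gap is zero).
    """
    gaps = []
    for (left, _, left_end), (right, right_start, _) in zip(features, features[1:]):
        gaps.append((left, right, right_start - left_end - 1))
    return gaps


gaps = boundary_gaps(FEATURES)
spacers = [g for g in gaps if g[2] > 0]
overlaps = [g for g in gaps if g[2] < 0]

print(len(spacers), sum(g[2] for g in spacers))     # number and total length of spacers
print(len(overlaps), sum(-g[2] for g in overlaps))  # number and total length of overlaps
print(sorted(spacers, key=lambda g: -g[2])[:2])     # the two longest spacers
```

The gap is computed as the start of the right-hand feature minus the end of the left-hand feature minus one, which matches the IGS convention of Table 1 (positive for spacers, negative for overlaps).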
Table 4
Sequence, with secondary structure landmarks, of B. dorsalis tRNAs
tRNA Acceptor arm DHU arm Anticodon arm TψC arm Acceptor arm
trnI [AATGAAT] TG [CCT] GATAAA [AGG] G [TTACC] TT GAT AG [GGTAA] ATAAT [GCAAT] TAGT [ATTGC^ ATTCATT] A
trnQ [TATATTT] TG [GTGTA] TGA [TGCAC] A [AAAGT] TT TTG AT [ACTTT] TAGA [AATAG] TTTAATT [CTATT^ AAATATA] A
trnM [AAAAAGA] TA [AGCT] AATTA [AGCT] A [CTGGG] TT CAT AC [CCCAT] TTAT [AAAGG] TTCTAAT [CCTTT^ TCTTTTT] A
trnW [AAGGCTT] TA [AGTT] AATTA [AACT] A [ATAGC] CT TCA AA [GCTAT] AAAT [ATAAG] TATAAAT [CTTTT^ AAGCCTT] A
trnC [GGCTTTA] TA [GTCA] ATAA [TGAC] A [TTAGA] CT GCA AT [TCTAA] AGGA [GTAA] TAAA [TTAC^ TAAGGCT] T
trnY [GATTAAA] TG [GCT] GAAGTTTA [GGC] G [ATAGA] CT GTA AA [TCTAT] TTAT [AAGAA] TTTA [TTCTT^ TTTAATC] A
trnL(UUR) [TCTAATA] TG [GCA] GATTAG [TGC] A [ATGGA] TT TAA GC [TCCAT] ATAT [AAAGTA] TTT [TACTTT^ TATTAGA] A
trnK [CATTAGA] TG [ACT] GAAAGCA [AGT] A [CTGGT] CT CTT AA [ACCAT] CTTAT [AGTAA] ATTAGCAC [TTACT^ TCTAATG] G
trnD [AAAAAAT] TA [GTTA] AAAAA [TAAC] A [TTAGT] AT GTC AA [ACTAA] AATT [ATTAAA] CTA [TTTAAT^ ATTTTTT] G
trnG [ATCTATA] TA [GTAT] ATAA [GTAT] A [TTTGA] CT TCC AA [TCATA] AGGT [CTACT] AATT [AGTAG^ TATAGAT] A
trnA [AGGGTTG] TA [GTTA] ATTA [TAAC] A [TTTGA] TT TGC AT [TCAAA] AAGT [ATTGA] AATA [TCAAT^ CTACCTT] A
trnR [GAATATG] AA [GCGA] TTTA [TTGC] A [ATTAG] TT TCG AC [CTAAT] CTTA [GGTAT] TTT [ATGCC^ CTTATTC] T
trnN [TTAATTG] AA [GCC] AAAAAGA [GGC] T [TATCA] CT GTT AA [TGATA] GAACT [GAGA] TTGA [TCTC^ CAATTAA] G
trnS(AGY) [GAAGTAT] GA CGTTCAAG a A [AAAAG] CT GCT AA [CTTTT] ATCTTT [TAATGG] TTAAACT [CCATTT^ GTACTTC] T
trnE [GTTTATA] TA [GTTT] AATAA [AAAC] A [TTACA] TT TTC AT [TGTAA] AAAT [AAAAT] TTTCT [ATTTT^ TATAAAT] T
trnF [ATTCAAG] TA [GCT] TAAAATAG [AGC] A [TAACA] CT GAA GA [TGTTA] GGGT [AATT] GAAT [AATT^ CTTGGAT] G
trnH [ATCTAAA] TA [GTTT] ATTTA [AAAT] A [TTGAT] TT GTG GT [GTCAA] TGAA [ATGA] GGTTTA [TCAT^ TTTAGAT] C
trnT [GTTTTAA] TA [GTTT] AATAA [AAAC] A [TTGGT] CT TGT AA [ACCAA] AAAT [AAGAT] TTT [GTCTT^ TTAAAAC] T
trnP [CAGGAGG] TA [GTTT] ATTTA [AAAT] A [TTAAT] TT TGG GG [ATTAA] TGAT [AAAG] TTATTT [CTTT^ TCTCTTG] A
trnS(UCN) [AGTTAAT] GA [GCTT] GAAC [AAGC] A [TATGT] TT TGA AA [ACATA] AGAT [AGAAA] TCAACC [TTTCT^ ATTGACT] T
trnL(CUN) [ACTATTT] TG [GCA] GATTAG [TGC] A [ATAAA] TT TAG AA [TTTAT] TTAT [GTAAT] TTAT [ATTAC^ AAATAGT] A
trnV [CAATTTA] AA [GCTT] ATTTAGTA [AAGC] A [TTTCA] TT TAC AT [TGAAA] AGAT [TTTTG] TGCAAAT [CAATA^ TAAATTG] A
a
Non paired nucleotides replacing the DHU arm.
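The bracketed arms in Table 4 can be checked for complementarity with a few lines of code. The following Python sketch is an added illustration under stated assumptions: the function names are hypothetical, only strict Watson-Crick pairs are counted as matches (so a G-T pair, which would be a G-U wobble pair in the RNA, is reported as a mismatch), and the three acceptor arms are copied from the trnI, trnQ and trnF rows of Table 4.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stem_mismatches(five_prime: str, three_prime: str) -> int:
    """Count positions where the 3' half of a stem differs from the exact
    reverse complement of the 5' half (Watson-Crick pairs only)."""
    if len(five_prime) != len(three_prime):
        raise ValueError("both halves of the stem must have the same length")
    expected = reverse_complement(five_prime)
    return sum(1 for e, o in zip(expected, three_prime.upper()) if e != o)


# Acceptor arm halves (5' half, 3' half) as listed in Table 4.
ACCEPTOR_ARMS = {
    "trnI": ("AATGAAT", "ATTCATT"),
    "trnQ": ("TATATTT", "AAATATA"),
    "trnF": ("ATTCAAG", "CTTGGAT"),
}

for name, (five, three) in ACCEPTOR_ARMS.items():
    print(name, stem_mismatches(five, three))
```

Under these assumptions the trnI and trnQ acceptor arms pair perfectly, while trnF shows a single position that is not a Watson-Crick pair.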


Fig. 3. Hypothetical secondary structures in the regions surrounding the repeated sequence (in bold) between trnC and trnY (panel A) and in the CR (panel B).
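The coordinate correspondence given in Section 3.4 between the spacer copy and the CR copy of the repeated sequence can be written as a simple mapping. The Python sketch below is an added illustration: the function names are hypothetical, and "inverted orientation" is taken here to mean only that positions run in opposite directions in the two regions, so the first base of the spacer (1425) faces the last base of the CR segment (15341).

```python
SPACER = (1425, 1469)         # the whole 45 bp spacer between trnC and trnY
SPACER_REPEAT = (1425, 1457)  # the 33 spacer bases that are repeated in the CR
CR_REPEAT = (15309, 15341)    # the corresponding 33 bases in the CR


def spacer_to_cr(pos: int) -> int:
    """Map a spacer position to the facing CR position (inverted orientation)."""
    if not SPACER[0] <= pos <= SPACER[1]:
        raise ValueError("position lies outside the trnC/trnY spacer")
    return CR_REPEAT[1] - (pos - SPACER_REPEAT[0])


def in_repeat(pos: int) -> bool:
    """True if the spacer position belongs to the 33-base repeated segment."""
    return SPACER_REPEAT[0] <= pos <= SPACER_REPEAT[1]


for pos in (1425, 1457, 1458, 1469):
    print(pos, spacer_to_cr(pos), in_repeat(pos))
```

With this mapping the repeated bases 1425–1457 land on 15341–15309, and the remaining spacer bases 1458–1469 land on 15308–15297, the co-linear CR bases that the text reports as showing no resemblance.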

3.5. Levels of variability and informativeness of tephritid mitochondrial genome sequences

Comparing the nucleotide sequence of the 13 PCGs, 2 rRNA subunits and the CR between B. dorsalis and the other two published tephritid mitochondrial genomes (Table 5), the highest similarity is observed, for most genes, between B. dorsalis and B. oleae. This is as expected given the taxonomic proximity of these species, which belong to the same genus, compared to C. capitata, which is in a different tribe within the subfamily Dacinae (White and Elson-Harris, 1992). It is also consistent with phylogenetic analyses incorporating these three species (Smith et al., 2003).
The most conserved sequences appear to be the two rRNA genes, with identities above 92% between the two Bactrocera species. The cytochrome oxidase and the ATPase/NADH dehydrogenase complexes of genes are slightly less conserved (85–86% and 81–91% identity, respectively). The CR, with 77% identity between the two Bactrocera species, is the most variable region. Amino-acid identities (Table 5) follow the same trend, with slightly higher values for the cytochrome oxidase complex (97–98% identity) than for the ATPase and NADH dehydrogenase complexes (88–97%).
The percentage of identity across gene regions for the three tephritid species is rather similar, meaning that at this shallow level of divergence the observed differences in variability across genes/genomic regions are limited. This observation does not contradict the notion that some genes in the mitochondrial genome, namely the cytochrome oxidase complex, are more conservative than others, such as the NADH dehydrogenase and ATPase complexes (Simon et al., 1994), but rather indicates that the limited number of differences observed in comparisons between congeneric species is not sufficient for this difference to arise. This parallels what has been observed in a comparison between two genomes from different geographic samples of the same species (B. oleae), where the distribution of mutations across genomic regions did not significantly depart from randomness (Nardi et al., 2003).
These estimates can provide useful guidelines for gene selection if sequences from one or a limited number of genes are to be applied for studying evolutionary phenomena at the species/genus level. Two additional factors important for choosing a gene to be used as a molecular marker are the presence of mitochondrial pseudogenes in the nuclear genome and the possibility of efficiently modeling sequence evolution parameters. One would tend to exclude the nad2/cox1 region in any case, as the presence of pseudogenes was reported in the closely related B. oleae (Nardi et al., 2003).
If sequences are to be used in a phylogenetic framework, i.e. applying methods that involve some model of sequence change through time, the CR is unlikely to be useful due to the presence of stretches of identical nucleotides, short microsatellites, and other low complexity sequences, that may complicate comparative sequence analysis. The most variable PCGs would be more appropriate and, taking into account only genes that provide a significant amount of sequence (> 1 kb), nad4–5 would be the most suitable markers.

Table 5
Percentage of identity at the nucleotide level between B. dorsalis, B. oleae and C. capitata calculated in different gene/gene regions. Corresponding amino acid identities are given in parentheses
Region   Bd a/Bo b       Bd/Cc c         Bo/Cc
atp6     85.69(96.89)    84.22(97.78)    81.42(96.44)
atp8     87.65(90.57)    77.78(79.25)    80.25(79.25)
cox1     85.34(98.04)    85.48(97.45)    84.51(98.23)
cox2     86.52(96.89)    82.75(94.62)    82.90(94.71)
cox3     86.82(98.47)    88.09(96.56)    84.28(96.95)
cob      85.58(97.62)    83.63(97.09)    85.84(95.77)
nad1     88.62(94.57)    87.34(92.83)    87.55(92.18)
nad2     81.72(88.69)    80.35(85.80)    78.98(84.08)
nad3     84.18(92.31)    84.46(94.02)    82.20(89.74)
nad4     88.07(95.07)    85.61(91.48)    84.71(88.57)
nad4L    91.25(98.98)    85.19(91.67)    85.19(92.71)
nad5     85.99(92.57)    85.81(90.21)    85.57(89.03)
nad6     81.52(81.98)    78.29(79.07)    78.86(79.89)
rrnS     93.44(N/A)      89.13(N/A)      87.39(N/A)
rrnL     92.65(N/A)      90.08(N/A)      88.26(N/A)
CR       77.19(N/A)      65.88(N/A)      65.65(N/A)
a B. dorsalis.
b B. oleae.
c C. capitata.
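The identity percentages reported in Table 5 are of the kind obtained by counting matching columns in a pairwise alignment. The Python sketch below illustrates one common way to do this; it is not the DNAStar procedure used in the study, the function name is hypothetical, the treatment of gaps (columns with a gap in either sequence are skipped) is an assumption, and the two short sequences are toy examples rather than B. dorsalis data.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence carries a gap character are ignored
    (an assumption; other conventions count gaps as differences).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical): two differences over ten columns.
print(percent_identity("ATGATTAATC", "ATGACTAATT"))
# A gapped column is skipped, leaving six identical columns.
print(percent_identity("ATG-TTA", "ATGATTA"))
```

The same function applies to aligned amino acid sequences, which is how the values in parentheses in Table 5 would be obtained under the same gap convention.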

If sequences are to be used to distinguish among species/geographic groups, and therefore the sheer number of differences is at a premium, the CR and/or intergenic spacers are likely the best option. Care must be taken to avoid overweighting mutations such as unit gain/loss in homopolymer runs or microsatellites that, being much more frequent than regular point mutations, would likely lead to problems of convergence. The informativeness of the CR is supported by some preliminary sequence information (Nakahara et al., unpublished data: GenBank accession nos: AB191470/3) from four B. dorsalis specimens, which display a difference of 3.4% at the nucleotide level. In contrast, few point mutations were observed in the 45 bp intergenic spacer across five species and seven geographical samples of the dorsalis complex (this study: GenBank accession nos. EF377349–59), but an 11 bp discrete insertion appears as an autapomorphy of B. carambolae. This is a very promising marker for species diagnostics and, like other possible discrete genomic changes, could be easily targeted in a PCR assay. Nevertheless, additional population data will be needed to confirm that this insertion is fixed and exclusive to the species.

For fruit flies in general, the cox1 gene is today the most commonly used molecular marker in phylogenetics (Barr and McPheron, 2006; Jamnongluk et al., 2003; Smith-Caldas et al., 2001) and phylogeography (Mun et al., 2003), and the amount of available cox1 information is likely to rise dramatically with the advent of DNA barcoding projects (Hebert et al., 2003). Focusing on species discrimination for import/export controls and general biosecurity, most available tests are based on PCR–RFLP assays targeting sequences from the nuclear ribosomal cluster (Armstrong et al., 2000) and the mitochondrial CR, rrnS and rrnL, but none of these tests is currently capable of discriminating between all the key species. Attempts are under way to develop more sensitive and efficient tests, for example by taking advantage of barcoding data (Armstrong and Ball, 2005) or in the form of a micro-array biochip (Frey and Pfunder, 2006).

Nevertheless, both phylogeography/population genetics studies and tests for species diagnosis depend heavily on the availability of basic comparative data on intra- and inter-specific variability, and on the informativeness (that is, the presence of discriminating mutations) of the gene to be targeted in the assay. In this regard, complete mitochondrial genome sequences are likely to play a major role in the near future (Cameron et al., 2007), and the completion of the genome sequencing of B. dorsalis is a significant piece added to the puzzle.

Acknowledgements

We wish to thank all colleagues who helped with sample collection and determination: Mr Chenzhilin, Dr Haymer, Mr Jiangxiaolong, Dr Muraji and Mr Yejun. We also wish to thank the Editor and three anonymous Reviewers for their useful comments and suggestions, and Prof. Baldari for her assistance. This research was supported by funds of the 2008 Beijing Olympic Game (2004BA904B06) and 973 program (2002CB111405) of the Ministry of Science and Technology of P. R. China, and the National Natural Sciences Foundation of China (No. 30471162). F.N. was supported by a grant of the Monte dei Paschi di Siena Foundation.

References

Allwood, A.J., et al., 1999. Host plant records for fruit flies (Diptera: Tephritidae) in South-East Asia. Raffles Bull. Zool. 7, 92 Supp.
Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Armstrong, K.F., Ball, S.L., 2005. DNA barcodes for biosecurity: invasive species identification. Philos. Trans. R. Soc. Lond., B. Biol. Sci. 360, 1813–1823.
Armstrong, K.F., Cameron, C.M., Frampton, E.R., 2000. Fruit fly (Diptera: Tephritidae) species identification: a rapid diagnostic technique for quarantine application. Bull. Entomol. Res. 87, 111–118.
Avise, J.C., 2000. Phylogeography, the History and Formation of Species. Harvard University Press, Cambridge, Massachusetts, USA.
Bae, J.S., Kim, I., Sohn, H.D., Jin, B.R., 2004. The mitochondrial genome of the firefly, Pyrocoelia rufa: complete DNA sequence, genome organization, and phylogenetic analysis with other insects. Mol. Phylogenet. Evol. 32, 978–985.
Ballard, J.W.O., 2000a. Comparative genomics of mitochondrial DNA in members of the Drosophila melanogaster subgroup. J. Mol. Evol. 51, 48–63.
Ballard, J.W.O., 2000b. Comparative genomics of mitochondrial DNA in Drosophila simulans. J. Mol. Evol. 51, 4–75.
Ballard, J.W.O., Rand, D.M., 2005. The population biology of mitochondrial DNA and its phylogenetic implications. Annu. Rev. Ecol. Evol. Syst. 36, 621–642.
Ballard, J.W.O., Whitlock, M.C., 2004. The incomplete natural history of mitochondria. Mol. Ecol. 13, 729–744.
Barr, N.B., McPheron, B.A., 2006. Molecular phylogenetics of the genus Ceratitis (Diptera: Tephritidae). Mol. Phylogenet. Evol. 38, 216–230.
Boore, J.L., 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27, 1767–1780.
Boore, J.L., 2000. The duplication/random loss model for gene rearrangement exemplified by mitochondrial genomes of deuterostome animals. In: Sankoff, D., Nadeau, J.H. (Eds.), Comparative genomics. Kluwer Academic Publishers, Dordrecht.
Brown, W.M., Prager, E.M., Wang, A., Wilson, A.C., 1982. Mitochondrial DNA sequences of primates: tempo and mode of evolution. J. Mol. Evol. 18, 225–239.
Cameron, S.L., Lambkin, C., Barker, S.C., Whiting, M.F., 2007. A mitochondrial genome phylogeny of Diptera: whole genome sequence data accurately resolve relationships over broad timescales with high precision. Syst. Ent. 32, 40–59.
Carroll, L.E., White, I.M., Freidberg, A., Norrbom, A.L., Dallwitz, M.J., Thompson, F.C., 2002 onwards. Pest Fruit Flies of the world. http://delta-intkey.com.
Clarke, A.R., et al., 2005. Invasive phytophagous pest arising through a recent tropical evolutionary radiation: the Bactrocera dorsalis complex of fruit flies. Annu. Rev. Entomol. 50, 293–319.
Clary, D.O., Wolstenholme, D.R., 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22, 252–271.
Crease, T.J., 1999. The complete sequence of the mitochondrial genome of Daphnia pulex (Cladocera: Crustacea). Gene 233, 89–99.
Dowton, M., Austin, A.D., 1999. Evolutionary dynamics of a mitochondrial rearrangement ‘hotspot’ in the Hymenoptera. Mol. Biol. Evol. 16, 298–309.
Dowton, M., Campbell, N.J.H., 2001. Intramitochondrial recombination: is it why some mitochondrial genes sleep around? TREE 16, 269–271.
Drew, R.A.I., Hancock, D.L., 1994. The Bactrocera dorsalis complex of fruit flies in Asia. Bull. Entomol. Res. Supp. Ser. 2, 1–68.
Feijao, P.C., Neiva, L.S., de Azeredo-Espin, A.M., Lessinger, A.C., 2006. AMiGA: the arthropodan mitochondrial genomes accessible database. Bioinformatics 22, 902–903.
Follett, P.A., Neven, L.G., 2006. Current trends in quarantine entomology. Annu. Rev. Entomol. 51, 359–385.
Frey, J.E., Pfunder, M., 2006. Molecular techniques for identification of quarantine insects and mites: the potential of microarrays. In: Rao, J.R., Fleming, C.C., Moore, J.E. (Eds.), Molecular Diagnostics: Current Technology and Applications. ch. 6.

Hassanin, A., Leger, N., Deutsch, J., 2005. Evidence for multiple reversals of asymmetric mutational constraints during the evolution of the mitochondrial genome of metazoa, and consequences for phylogenetic inferences. Syst. Biol. 54, 277–298.
Hebert, P.D., Ratnasingham, S., deWaard, J.R., 2003. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. Biol. Sci. 270, S96–S99.
Hou, B.H., 2005. Risk analysis for Oriental Fruit Fly, Bactrocera dorsalis (Hendel) (Diptera: Tephritidae). PhD dissertation, Sun Yat-Sen University. 1–103.
Jamnongluk, W., Baimai, V., Kittayapong, P., 2003. Molecular evolution of tephritid fruit flies in the genus Bactrocera based on the cytochrome oxidase I gene. Genetica 119, 19–25.
Junqueira, A.C., et al., 2004. The mitochondrial genome of the blowfly Chrysomya chloropyga (Diptera: Calliphoridae). Gene 339, 7–15.
Kajander, O.A., et al., 2000. Human mtDNA sublimons resemble rearranged mitochondrial genomes found in pathological states. Hum. Mol. Genet. 9, 2821–2835.
Kim, I., et al., 2005. The complete nucleotide sequence and gene organization of the mitochondrial genome of the oriental mole cricket, Gryllotalpa orientalis (Orthoptera: Gryllotalpidae). Gene 353, 155–168.
Mun, J., Bohonak, A.J., Roderick, G.K., 2003. Population structure of the pumpkin fruit fly Bactrocera depressa (Tephritidae) in Korea and Japan: Pliocene allopatry or recent invasion? Mol. Ecol. 12, 2941–2951.
Nardi, F., Carapelli, A., Dallai, R., Frati, F., 2003. The mitochondrial genome of the olive fly Bactrocera oleae: two haplotypes from distant geographical locations. Insect Mol. Biol. 12, 605–611.
Nardi, F., Carapelli, A., Dallai, R., Roderick, G.K., Frati, F., 2005. Population structure and colonization history of the olive fly, Bactrocera oleae (Diptera, Tephritidae). Mol. Ecol. 14, 2729–2738.
Passamonti, M., Boore, J.L., Scali, V., 2003. Molecular evolution and recombination in the gender-associated mitochondrial DNAs of the Manila clam Tapes philippinarum. Genetics 164, 603–611.
Rand, D., 2001. Mitochondrial genomics flies high. Trends Ecol. Evol. 16, 2–4.
Scheffler, I.E., 1999. Mitochondria. John Wiley & Sons, Inc., New York.
Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., Flook, P., 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87, 651–701.
Smith-Caldas, M.R.B., McPheron, B.A., Silva, J.G., Zucchi, R.A., 2001. Phylogenetic relationships among species of the fraterculus group (Anastrepha: Diptera: Tephritidae) inferred from DNA sequences of mitochondrial cytochrome oxidase I. Neotropical Entomol. 30, 565–573.
Smith, P.T., Kambhampati, S., Armstrong, K.A., 2003. Phylogenetic relationships among Bactrocera species (Diptera: Tephritidae) inferred from mitochondrial DNA sequences. Mol. Phylogenet. Evol. 26, 8–17.
Spanos, L., Koutroumbas, G., Kotsyfakis, M., Louis, C., 2000. The mitochondrial genome of the Mediterranean fruit fly, Ceratitis capitata. Insect Mol. Biol. 9, 139–144.
Stanton, D.J., Daehler, L.L., Moritz, C.C., Brown, W.M., 1994. Sequences with the potential to form stem-and-loop structures are associated with coding region duplications in animal mitochondrial DNA. Genetics 137, 233–241.
White, I.M., Elson-Harris, M.M., 1992. Fruit Flies of Economic Significance: Their Identification and Bionomics. CAB International, Wallingford, UK.
Wolstenholme, D.R., 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141, 173–216.
Yu, D.J., Zhang, G.M., Chen, Z.L., Zhang, R.J., Yin, W.Y., 2004. Rapid identification of Bactrocera latifrons (Dipt. Tephritidae) by real-time PCR using SYBR Green chemistry. J. Appl. Entomol. 128, 670–676.
Yukuhiro, K., Sezutsu, H., Itoh, M., Shimizu, K., Banno, Y., 2002. Significant levels of sequence divergence and gene rearrangements have occurred between the mitochondrial genomes of the wild Mulberry Silkmoth, Bombyx mandarina, and its close relative, the domesticated Silkmoth, Bombyx mori. Mol. Biol. Evol. 19, 1385–1389.
Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.
