Evolution of The Chloroplast Genome


Published online 12 December 2002

Evolution of the chloroplast genome

Christopher J. Howe*, Adrian C. Barbrook, V. Lila Koumandou, R. Ellen R. Nisbet, Hamish A. Symington and Tom F. Wightman
Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QW, UK
We discuss the suggestion that differences in the nucleotide composition between plastid and nuclear
genomes may provide a selective advantage in the transposition of genes from plastid to nucleus. We show
that in the adenine, thymine (AT)-rich genome of Borrelia burgdorferi several genes have an AT-content
lower than the average for the genome as a whole. However, genes whose plant homologues have moved
from plastid to nucleus are no less AT-rich than genes whose plant homologues have remained in the
plastid, indicating that both classes of gene are able to support a high AT-content. We describe the
anomalous organization of dinoflagellate plastid genes. These are located on small circles of 2–3 kbp, in
contrast to the usual plastid genome organization of a single large circle of 100–200 kbp. Most circles
contain a single gene. Some circles contain two genes and some contain none. Dinoflagellate plastids have
retained far fewer genes than other plastids. We discuss a similarity between the dinoflagellate minicircles
and the bacterial integron system.
Keywords: nucleotide composition; transposition; dinoflagellate; plastid; integron

* Author for correspondence ([email protected]).
One contribution of 21 to a Discussion Meeting Issue ‘Chloroplasts and mitochondria: functional genomics and evolution’.
Phil. Trans. R. Soc. Lond. B (2003) 358, 99–107. © 2002 The Royal Society. DOI 10.1098/rstb.2002.1176

1. INTRODUCTION

It is now accepted that the first plastids arose from endosymbiosis between a photosynthetic bacterium and a non-photosynthetic host (Howe et al. 1992). Many workers favour a monophyletic model with one primary endosymbiosis involving a single endosymbiont and a single host (see, for example, Palmer 1993). However, it has also been argued that the sequence-based trees taken to support the monophyletic model are unreliable because of the effects of systematic biases in the sequence data, and that more evidence is needed before more complex models can be ruled out (e.g. Lockhart et al. 1998). Such models might include independent acquisition of: (i) closely related endosymbionts by closely related hosts; (ii) closely related endosymbionts by distantly related hosts; (iii) distantly related endosymbionts by closely related hosts; and (iv) distantly related endosymbionts by distantly related hosts. Even if reliable sequence-based trees were available, it would be difficult to distinguish many of these models from a monophyletic origin, depending on whether the trees were based on endosymbiont or host genes, as summarized in table 1. Regardless of whether there was a single primary endosymbiosis or several, it is clear that a large fraction of the original endosymbiont genome has either been lost or transferred to the nucleus (Martin et al. 1998). Plastids typically contain of the order of 100–200 genes (Sugiura 1992; Glöckner et al. 2000), whereas the original endosymbiont, as an oxygenic photosynthetic bacterium, would probably have contained a similar number of genes to present-day cyanobacteria. For example, the genome of Synechocystis sp. PCC6803 contains some 3200 genes (Nakamura et al. 1998). There has been considerable debate over the questions of why this transfer of genetic material should have occurred, and why it should have been limited to a fraction of the endosymbiont genome.

Table 1. Likely origins of plastids inferred under monophyly or different models of polyphyly.

                symbionts          hosts              host gene trees   symbiont gene trees
monophyletic
                unique             unique             monophyletic      monophyletic
polyphyletic
                closely related    closely related    monophyletic      monophyletic
                closely related    distantly related  polyphyletic      monophyletic
                distantly related  closely related    monophyletic      polyphyletic
                distantly related  distantly related  polyphyletic      polyphyletic

It has been suggested that genes in the plastid are exposed to high levels of reactive, and potentially mutagenic, species generated during the electron transfer reactions of photosynthesis. Such reactive species would include oxygen free radicals (Allen & Raven 1996; Race et al. 1999). It has also been suggested that movement of genes to the nucleus is beneficial in offering improved repair mechanisms, as well as placing the genes in a sexually reproducing (and therefore recombining) population (Allen & Raven 1996; Race et al. 1999). Although the location in a sexual population should offer increased fitness through the principle known as ‘Muller’s ratchet’, it remains to be seen if repair processes in the nucleus are inherently more effective than those in the plastid. Repair processes have been demonstrated in the chloroplast, and at least some repair activity is dependent on a homologue of the RecA protein (an important component of bacterial repair) that has been shown to be present in plastids (Cerutti et al. 1993, 1995; Cannon et al. 1995). We discuss below an additional selective advantage proposed for the movement of genes to the nucleus (Howe et al. 2000).

If there are advantages in moving genes from the plastid to the nucleus, there is then the question of why not all genes have been transposed. It is likely that the gene for tRNA-Glu is needed for activation of glutamate in tetrapyrrole biosynthesis (Howe & Smith 1991). However, given that the tRNA-Glu could be transcribed by a nucleus-encoded plastid polymerase, this would not require the retention of protein genes in the plastid. Two main suggestions have been advanced to explain this. The first is that certain plastid proteins may be inherently difficult to transport across the plastid envelope. It would therefore be difficult to relocate the genes for such proteins to the nucleus, with consequent synthesis of the proteins in the cytosol and post-translational import into the organelle. However, several studies have shown that individual plastid genes can be artificially introduced into the nucleus and, if the genes have been modified by the fusion of a region encoding a plastid-targeting sequence to the coding sequence for the mature protein, the resulting protein can be re-imported effectively into the organelle (Cheung et al. 1988; Kanevski & Maliga 1994). Although such studies show that these proteins can, in principle, be re-imported into the organelle, they do not exclude the possibility that the need for import may cause a minor reduction in the fitness of the organism. A second suggestion for the retention of genes by the plastid is that it allows a rapid regulation of expression in response to the redox state of the organelle (Allen 1993; Pfannschmidt et al. 1999). We will discuss the much-reduced dinoflagellate plastid genome, and argue that its residual gene content is consistent with Allen’s proposal.

2. TRANSPOSITION OF GENES TO THE NUCLEUS

One of the notable features of plastid genomes is their high AT-content, both in coding regions and in non-coding regions. The high AT-content is not restricted to green plastids; rather it is seen across the whole spectrum of pigment types, and is higher than generally seen in cyanobacteria. For example, the plastid genomes of Nicotiana tabacum, Porphyra purpurea and Odontella sinensis are ca. 62%, 67% and 68% AT, respectively (Reith & Munholland 1995; Kowallik et al. 1995). The genome of Synechocystis sp. PCC6803 is ca. 52% AT (Nakamura et al. 1998). Although one cannot exclude the possibility that the plastid originated from a bacterial species with a more AT-rich genome, it seems more likely that the plastid genome has become AT-rich since endosymbiosis. This shift in nucleotide composition could be due to the nature of DNA damage occurring in the plastid, or to a tendency of the plastid DNA polymerase to mis-incorporate A and T rather than G and C in replication, or to a bias in the DNA repair machinery. (Interestingly, mitochondrial genomes are also rather AT-rich (Lang et al. 1999).) It is worth noting that such a bias in nucleotide composition causes serious problems for phylogenetic inference based on plastid genes. Many of the techniques used assume that nucleotide composition remains constant over time, and violation of this can result in organisms with similar nucleotide compositions being grouped artefactually closely, with implications for attempts to discriminate between monophyly and polyphyly of plastid origins (Lockhart et al. 1992). Notwithstanding the degeneracy of the genetic code, the biased nucleotide composition affects the amino-acid composition of the proteins encoded. This can be seen particularly clearly by looking at proteins that are plastid encoded in some species, and nuclear encoded in others. A good example is the plastid SecA, which is involved in translocation of lumenal proteins across the thylakoid membrane. SecA is encoded in the nucleus of green plants and algae, and in the plastid of non-green algae. Table 2 shows the predicted content of several amino acids for SecA proteins from different sources (Barbrook et al. 1998). The codons for the amino acids alanine, glycine and proline are GC-rich, in that they have necessarily to contain at least two G or C residues (and may contain three), and the codons for phenylalanine, isoleucine, lysine, asparagine and tyrosine are AT-rich, in that they have necessarily to contain at least two A or T residues (and may contain three). The SecA protein from the red and brown algae can be seen to be relatively depleted in the GC-rich codons for (A,G,P) and enriched in the AT-rich codons (F,I,K,N,Y). Thus, if a protein remains encoded in the plastid, there will be a shift in its amino-acid composition. For many proteins, this may be detrimental to their function, and there may therefore be a selective advantage in transfer of the gene to the nucleus. Genes that have moved to the nucleus have been described as ‘molecular refugees’ moving to a less oppressive coding environment (Howe et al. 2000). This potential driving force for the transfer of genes to the nucleus may not act independently of the other driving forces proposed, such as Muller’s ratchet, but it may act synergistically with them. (Although this argument has been presented in terms of a relatively GC-rich organelle genome becoming AT-rich, if the plastid originated from an AT-rich bacterium and genes have the possibility of becoming more GC-rich by transposition to the nucleus, the same argument applies.)

If some proteins were particularly seriously affected by the shift in amino-acid composition resulting from an increased AT-content of their genes, there might be a greater advantage in the transfer of those genes to the nucleus. (Conversely, the shift towards AT-richness might even be advantageous for some genes and proteins, making it less favourable for the genes to be transposed to the nucleus.) Differing effects of a shift in amino-acid composition on different proteins might therefore offer an explanation of why some genes have been retained by the plastid, some have been transferred to the nucleus, and
others have different locations in different species. If individual genes were affected in different ways by the shift in nucleotide composition, this might become apparent by looking at homologues in bacterial genomes with biased nucleotide compositions. In an AT-rich bacterial genome, genes whose products were adversely affected by the high AT-content (under the ‘molecular refugees’ hypothesis, homologues of nuclear genes for plastid proteins) might be expected to have adopted a lower AT-content than genes whose products were able to tolerate a high AT-content (i.e. homologues of plastid genes for plastid proteins). We have tested this using the genome of the bacterium Borrelia burgdorferi and a set of genes for polypeptides of the ribosome and of the ATP synthase complex (figure 1). The overall AT-content of the genome is high (ca. 71%), and the test set of genes had a lower AT-content than this (ca. 68%). The lower AT-content of the genes was more marked when the first and second codon positions only were considered (63%). However, when the test genes were divided into groups according to whether the plastid homologue was plastid or nuclear encoded, no clear pattern was seen. Indeed, the B. burgdorferi genes with plastid-encoded plastid homologues seemed to be slightly more GC-rich than those with nuclear-encoded plastid homologues.

Table 2. Content (%) of GC-rich (A,G,P) and GC-poor (F,I,K,N,Y) codons of the secA gene from green plants (Pisum sativum, Spinacia oleracea), oxygenic photosynthetic bacteria (Anacystis nidulans R2, Anabaena variabilis, Phormidium laminosum, Synechocystis sp. PCC6803, Prochloron didemni, Prochlorothrix hollandica) and plastid genomes of algae (Antithamnion sp., Porphyra purpurea, Heterosigma carterae, Odontella sinensis and Pavlova lutherii).
(Averages shown in italics for a group of organisms are above the overall average. Those underlined are below the overall average. Modified from Barbrook et al. (1998).)

                             GC-rich              GC-poor
                             A     G     P        F      I        K      N      Y
                             GCN   GGN   CCN      TTT/C  ATA/T/C  AAA/G  AAT/C  TAT/C
nuclear or bacterial
  Pisum sativum              5.0   4.0   2.0      4.0    7.9      7.9    3.0    0.0
  Spinacia oleracea          5.9   3.0   2.0      4.0    8.9      7.9    4.0    0.0
  Anacystis nidulans R2      6.9   5.9   3.0      3.0    5.9      5.9    4.0    2.0
  Anabaena variabilis        6.9   4.0   2.0      2.0    8.9      5.9    3.0    2.0
  Phormidium laminosum       5.9   1.0   3.0      2.0    5.0      5.9    5.9    3.0
  Synechocystis sp.          4.0   2.0   2.0      2.0    6.9      6.9    5.9    2.0
  Prochloron didemni         6.9   0.0   3.0      2.0    4.0      6.9    5.9    3.0
  Prochlorothrix hollandica  5.9   2.0   4.0      2.0    6.9      5.0    4.0    3.0
  average                    5.9   2.7   2.6      2.6    6.8      6.5    4.2    1.9
organellar
  Antithamnion sp.           4.0   0.0   0.0      1.0    13.9     15.8   10.9   5.0
  Porphyra purpurea          4.0   0.0   2.0      2.0    11.9     7.9    5.0    5.9
  Heterosigma carterae       7.9   0.0   2.0      7.9    13.9     12.9   7.9    4.0
  Odontella sinensis         5.0   0.0   4.0      4.0    11.9     6.9    7.9    3.0
  Pavlova lutherii           3.0   2.0   0.0      4.0    10.9     12.9   7.9    5.9
  average                    4.8   0.4   1.6      3.8    12.5     11.3   7.9    4.8
average overall              5.5   1.8   2.2      3.0    9.0      8.4    5.6    3.0

[Figure 1: bar chart; y-axis AT (%), 0–100; bars for total genome, test genes, Nuc, Plas, Nuc (1,2) and Plas (1,2).]

Figure 1. AT content of genes in Borrelia burgdorferi. The figure shows the AT content of the complete genome (total genome) and a subset (test genes) of the coding regions for ribosomal proteins and subunits of ATP synthase, whose plant homologues are either encoded in the nucleus in land plants generally (rps1,5,6,9,10,13,17,20,21 rpl1,3-5,9-13,15,17,25,27-31,34, atpC,D) or in the plastid (rps2-4,7,8,11,12,14,18,19 rpl14,16,20,22,23,33 atpA,B,E). The AT-contents of the former and latter sets separately are shown for the complete coding regions (Nuc and Plas, respectively) and for the 1st and 2nd codon positions alone (Nuc(1,2) and Plas(1,2)).

It therefore seems that in an AT-rich genome, genes will maintain a lower AT-content than the genome as a whole, and there may well be an advantage in movement of genes to a less AT-rich genome. There is no evidence that genes whose homologues have moved to the nucleus in plants and algae have a lower AT-content in B. burgdorferi than genes whose homologues have remained in the plastid. This indicates that although the shift in AT-content may provide a driving force for movement of genes generally to the nucleus, it may not be able to account for which genes are transferred. (However, it remains possible that
individual positions within the proteins encoded may be particularly affected, and that this was masked in the analysis of B. burgdorferi sequences by considering the coding region as a whole. Recognizing this would require the identification of residues that are generally conserved and seeing to what extent these have resisted the effects of biased nucleotide composition in genomes such as B. burgdorferi.) It is possible that considerations such as the need for redox control may be responsible for determining which genes are retained in the organelle.

3. A REDUCED PLASTID GENOME IN DINOFLAGELLATES

The general pattern of plastid genome organization is for the complement of 100–200 genes to be located on a large circular molecule (e.g. Sugiura 1992; Glöckner et al. 2000). However, a striking exception to this has been shown in several species of peridinin-containing dinoflagellate algae. The organization of plastid genes has been best characterized in Heterocapsa triquetra, Amphidinium operculatum and A. carterae (Zhang et al. 1999; Barbrook & Howe 2000; Barbrook et al. 2001; Hiller 2001). These species appear to lack a conventional plastid genome and have instead several small circular DNA molecules, typically about 2–3 kbp in size, which generally contain a single gene. Although plasmid-like DNAs have been reported from plastids of some green algae, they seem to be in addition to the ‘main’ chloroplast genome and may not encode functional genes (e.g. La Claire & Wang 2000). The difficulty of isolating intact dinoflagellate plastids means that these minicircles have not yet been shown directly to be located in the plastid. However, the indirect evidence for this location appears strong. The minicircle genes encode products that, in all other species, are plastid-encoded, and these include rRNA (see below). Furthermore, the predicted protein products do not include organellar targeting sequences, and no other copies of the minicircle genes have been detected.

Remarkably, only a few genes have been identified so far on the putative plastid minicircles. The following have been reported on minicircles from one or more species: atpA, atpB, petB, petD, psaA, psaB, psbA, psbB, psbC, psbD, psbE, 16S rRNA and 23S rRNA (Zhang et al. 1999; Barbrook & Howe 2000; Barbrook et al. 2001; Hiller 2001). It is remarkable that no evidence has yet been found of RNA polymerase subunit genes, ribosomal protein genes or tRNA genes. As discussed above, one proposal for the retention of a plastid genome is to allow rapid regulation of important genes in response to redox processes in the plastid (Pfannschmidt et al. 1999). The fact that all the protein genes identified so far encode major subunits of the complexes of the light reactions of oxygenic photosynthesis seems to be consistent with this. Several features of the dinoflagellate plastid minicircles are worthy of comment.

(i) Minicircles contain a conserved ‘core’ region. This region is similar between minicircles of a given species carrying different genes. There are sections within the core that are essentially completely conserved across all minicircles of a given species, and others that are moderately well conserved: in many cases across some of the minicircles of a species but not all of them. The sequences of core regions of closely related species are very different from each other. Strikingly, the coding regions of minicircles with different genes are always in the same orientation with respect to the core region. Given the number of minicircles that have now been studied, it seems unlikely that this conservation of orientation is by chance. We return to this observation later.

(ii) Some minicircles contain more than one gene. For example, the petB and atpA genes are on a single minicircle in both A. carterae and A. operculatum (Barbrook et al. 2001; Hiller 2001). The same is true for psbD and psbE (Hiller 2001; R. E. R. Nisbet, unpublished data). These arrangements are not seen across all species. For example, the petB and atpA genes of H. triquetra are on different minicircles (Zhang et al. 1999). The arrangements also do not reflect the general pattern in conventional plastid genomes, where petB, atpA, psbD and psbE genes are located at different positions, so it is unlikely that these two-gene minicircles can have been derived simply by fragmentation of a conventional genomic circle. Northern analysis of RNA from A. operculatum indicates that the atpA and petB genes are either not co-transcribed or form part of an unstable dicistronic transcript that is very rapidly cleaved into monocistronic ones (Barbrook et al. 2001).

(iii) Some minicircles contain gene fragments, or no genes at all. Several minicircles have been reported from A. operculatum and A. carterae that contain fragments of coding regions, or no identifiable coding regions at all, although they retain a recognizable minicircle core (Barbrook et al. 2001; Hiller 2001; V. L. Koumandou and R. E. R. Nisbet, unpublished data). More complex minicircles have been reported from H. triquetra that contain fragments of more than one gene, and it has been proposed that these originated by fusion of two separate minicircles followed by deletion (Zhang et al. 2001).

(iv) The coding regions of the minicircles show unusual features. One of the most striking features revealed by inspection of the coding regions is the apparent use of anomalous initiation codons. For example, GTA has been proposed as an initiation codon for the psaA and psbB genes, and possibly also psbC of A. operculatum (Barbrook & Howe 2000; Barbrook et al. 2001). In the case of psbB, the predicted N-terminus of the protein aligns closely with well-conserved sequences from other plastids (figure 2). This makes the assignment of GTA as initiation codon reasonably convincing, although there are as yet no direct protein sequence data to confirm it. A limited number of studies using RT–PCR have so far failed to detect editing of dinoflagellate plastid transcripts, although the possible existence either of a low level of edited transcripts or of heavily modified transcripts (which therefore escaped amplification in RT–PCR) cannot be excluded. If GTA is indeed used as an initiation codon, this would be very unusual for organelle genomes generally (Edqvist et al. 2000).
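The remark under (i), that a shared orientation of coding region and core is unlikely to arise by chance, can be made concrete with a simple null model. The sketch below is only an illustration and is not a calculation made in the text: it assumes that each of n minicircles independently takes either orientation with probability 1/2, so the chance that all n agree (in either direction) is 2 × (1/2)^n. The function name and the values of n are hypothetical.

```python
def prob_all_same_orientation(n: int) -> float:
    """Probability that n minicircles all share one orientation.

    Assumed null model (not from the text): each minicircle picks one of
    two orientations independently with probability 1/2.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Two possible shared orientations, each with probability (1/2)**n.
    return 2.0 ** (1 - n)


if __name__ == "__main__":
    # Hypothetical numbers of minicircles, chosen only for illustration.
    for n in (2, 5, 10, 13):
        print(n, prob_all_same_orientation(n))
```

Under this assumption the probability halves with every additional minicircle examined, which is the sense in which a growing sample makes a chance explanation less plausible.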

Amphidinium operculatum    VRLPWFRVHIVVLNDPGRLISVHLMHTGLISGWAGLMALYELIVTDP
Heterocapsa triquetra      MRLPWFRVHIVILNDPGRLISVHIMHTALVAGWAAVMTLYELIILDP
Guillardia theta           MGLPWYRVHTVVLNDPGRLIAVHLMHTALVAGWAGSMALYELAVFDP
Odontella sinensis         MALPWYRVHTVVLNDPGRLIAVHLMHTALVAGWAGSMALYELAVFDP
Cyanidium caldarium        MALPWYRVHTVVLNDPGRLISVHLMHTALVSGWAGSMALYELAVFDP
Anabaena sp.               MGLPWYRVHTVVLNDPGRLISVHLMHTALVAGWAGSMALYELAIYDP
Chlamydomonas reinhardtii  MGLPWYRVHTVVINDPGRLISVHLMHTALVSGWAGSMALFEISVFDP
Marchantia polymorpha      MGLPWYRVHTVVLNDPGRLIAVHLMHTALVSGWAGSMALYELAVFDP
Euglena gracilis           MGLPWYRVHTVVLNDPGRFISVHLMHTALVSGWAGSMALYELAIFDP
                           : ***:*** *::*****:*:**:***.*::***. *:*:*: : **

Figure 2. Aligned predicted N-termini of psbB from a range of plants and algae.

F TTT  29  (20)   S TCT 117  (70)   Y TAT  60  (76)   C TGT  10  (16)
F TTC 139 (163)   S TCC  75   (2)   Y TAC  45  (31)   C TGC  12   (6)
L TTA   1 (157)   S TCA   7 (130)   * TAA   2   (5)   * TGA   0   (0)
L TTG  33   (4)   S TCG  37   (1)   * TAG   4   (1)   W TGG  65  (75)

L CTT 122  (87)   P CCT  43  (44)   H CAT  47  (62)   R CGT  57  (60)
L CTC  75  (56)   P CCC   1   (0)   H CAC  32  (19)   R CGC   3   (2)
L CTA  45   (1)   P CCA  37  (51)   Q CAA  22  (50)   R CGA   7   (1)
L CTG  19   (0)   P CCG  17   (5)   Q CAG  58  (27)   R CGG   0   (0)

I ATT  84 (139)   T ACT  45  (76)   N AAT  34  (26)   S AGT  15  (29)
I ATC  87  (67)   T ACC  43   (0)   N AAC  50  (54)   S AGC   6  (15)
I ATA   1   (4)   T ACA  52  (47)   K AAA   1   (4)   R AGA   0  (49)
M ATG  50  (58)   T ACG   5   (0)   K AAG  55  (61)   R AGG  32   (1)

V GTT  60 (109)   A GCT  73 (145)   D GAT  57  (77)   G GGT 192 (259)
V GTC  51  (76)   A GCC  26   (7)   D GAC  25  (14)   G GGC  26  (13)
V GTA  56   (9)   A GCA  81  (58)   E GAA  18  (66)   G GGA  17   (4)
V GTG  50   (3)   A GCG  62   (4)   E GAG  62  (25)   G GGG   2   (1)

Figure 3. Codon preferences for the psaA, psbA,B,C, atpA and petB genes of Amphidinium operculatum and (in parentheses) Heterocapsa triquetra.
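A codon table of the kind shown in figure 3, and AT-contents restricted to particular codon positions as in figure 1, can both be derived by tallying codons in an in-frame coding sequence. The following is a minimal sketch under stated assumptions, not the procedure used to produce the figures: the function names are hypothetical, the input is assumed to be a single in-frame DNA coding sequence, and the toy sequence is made up.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence.

    Any trailing partial codon is ignored; RNA 'U' is read as 'T'.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def at_fraction(counts: Counter, positions=(0, 1, 2)) -> float:
    """Fraction of A or T at the chosen codon positions (0-based),
    weighted by how often each codon occurs."""
    at = total = 0
    for codon, n in counts.items():
        for p in positions:
            total += n
            if codon[p] in "AT":
                at += n
    return at / total if total else 0.0


if __name__ == "__main__":
    toy = "ATGGGTTTCTCTGGTTAA"  # made-up sequence, for illustration only
    counts = codon_counts(toy)
    print(dict(counts))
    print("AT, all positions:", at_fraction(counts))
    print("AT, positions 1 and 2:", at_fraction(counts, positions=(0, 1)))
    print("AT, position 3:", at_fraction(counts, positions=(2,)))
```

Calling `at_fraction` with `positions=(0, 1)` corresponds to considering the first and second codon positions only, and `positions=(2,)` isolates the third position, where a preference for A or T would be expected in conventional plastid genes.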

Figure 3 shows the codon preference for A. operculatum and H. triquetra over a set of genes, psaA, psbA, psbB, psbC, petB and atpA, which have been characterized from both. The codon preference is heavily biased, although there does not appear to be a consistent pattern, such as a preference for A or T at the third codon position. So, for example, there is a strong preference in A. operculatum for GGT (Gly) over GGC/A/G and TCT (Ser) over TCC/A/G or AGT/C, yet TTC (Phe) is much preferred over TTT. There are also clear differences in the bias between the species. For example, although H. triquetra has the same preference for TTC and GGT, TCA (which was only rarely used in A. operculatum) is much preferred as a serine codon. This pattern of codon preferences is rather different from that seen in other plastids, where there is a consistent preference for A or T at the third codon position. The pattern of dinoflagellate preferences is arguably rather more similar to cyanobacteria, where the preferred nucleotide at the third position differs among different codon families (compare, for example TTT/C with GGT/C/A/G) as shown in figure 4. It will be interesting to see how patterns of codon preference vary across a broader range of dinoflagellates. If any are found to have a ‘conventional’ plastid genome organization, it will be particularly interesting to see if they also have the conventional plastid preference for third position A or T.

Why should the dinoflagellate plastid genome be organized in this way? The limited number of genes identified makes it tempting to suggest that this represents a plastid genome in the final stages of gene transfer to the nucleus, with only those genes left that are essential for effective regulation in response to redox or other requirements. It is of course possible that additional genes will be discovered. However, the results of PCR with primers for genes that are generally located in the chloroplast, together with sequencing of randomly selected clones, indicate that the number of additional genes found will be low. In support of this, the gene for the large subunit of ribulose bisphosphate carboxylase has been shown to be located in the nucleus in the dinoflagellate Gonyaulax polyedra (although the gene encodes a different form of the enzyme from that usually found in plastids (Morse et al. 1995)). The ribulose bisphosphate carboxylase–oxygenase large subunit gene is plastid located in other algae and plants. Why the dinoflagellates should be in such an advanced state of gene loss is not clear. It is also not clear how the minicircles are generated. They have a superficial resemblance to the small circular DNA species found in plant mitochondria, which are derived by fragmentation of a ‘master’ chromosome by recombination across repeated sequences (Lonsdale et al. 1984). However, there is as yet no evidence of a master chromosome in dinoflagellate plastids. In addition, in the rare examples where there are two genes on the same minicircle, these genes are not generally adjacent in other plastid genomes,
indicating that the two-gene minicircles are not generated by simple fragmentation of a larger minicircle.

F TTT 154  (80)   S TCT  78  (33)   Y TAT  61  (34)   C TGT   8  (14)
F TTC  29 (102)   S TCC   7  (74)   Y TAC  29  (51)   C TGC   4   (4)
L TTA 186  (35)   S TCA  24   (5)   * TAA   6   (2)   * TGA   0   (0)
L TTG  21  (91)   S TCG   5   (6)   * TAG   0   (4)   W TGG  78  (79)

L CTT  67  (23)   P CCT  71  (21)   H CAT  76  (21)   R CGT  60  (37)
L CTC   1  (53)   P CCC   4  (85)   H CAC  11  (68)   R CGC   9  (23)
L CTA  10  (15)   P CCA  46   (8)   Q CAA  92  (61)   R CGA  10   (3)
L CTG   1  (59)   P CCG   6  (15)   Q CAG   7  (37)   R CGG   1  (41)

I ATT 146  (99)   T ACT  92  (36)   N AAT  63  (36)   S AGT  49  (27)
I ATC  25  (87)   T ACC  10  (99)   N AAC  24  (58)   S AGC  12  (35)
I ATA  30   (1)   T ACA  47   (4)   K AAA  72  (58)   R AGA  32   (3)
M ATG  64  (81)   T ACG   3  (18)   K AAG  12  (22)   R AGG   4   (2)

V GTT 111  (54)   A GCT 170  (89)   D GAT  85  (57)   G GGT 155 (145)
V GTC   3  (40)   A GCC   6 (144)   D GAC  15  (52)   G GGC  12  (75)
V GTA  80  (45)   A GCA  78   (8)   E GAA 120  (89)   G GGA  97  (28)
V GTG   7  (59)   A GCG   5  (23)   E GAG   9  (26)   G GGG  12  (37)

Figure 4. Codon preferences for the psaA, psbA,B,C, atpA and petB genes in the plastid of the liverwort Marchantia polymorpha and (in parentheses) the cyanobacterium Synechocystis sp. PCC6803.

[Figure 5: schematic with labels: cassette; integration/excision; integrase; promoter; cassette integrated and expressed.]

Figure 5. Insertion and expression of a bacterial integron cassette.

The minicircles demonstrate an intriguing similarity to the cassettes of bacterial integrons. The latter are naturally occurring gene capture systems, in which circular molecules carrying an open reading frame and a conserved 59 bp core region are inserted into the bacterial chromosome by recombination across a similar sequence in the bacterial chromosome. This integration is carried out by an integrase (IntI) encoded adjacent to the integration site. The integration places the open reading frame under the control of a promoter also present at the insertion site (Rowe-Magnus & Mazel 1999, 2001). This results in the expression of the open reading frame in the incoming cassette (figure 5). These systems are widely used by bacteria, especially as a way of acquiring and exchanging antibiotic resistance. Several cassettes can integrate at a given site, to generate a ‘super-integron’. Because expression of the cassette gene relies on integration placing it under the control of the promoter at the insertion site, the open reading frame of any cassette is always in the same orientation with regard to the 59 bp core region. This resembles the situation in dinoflagellate plastid minicircles, where the coding region is always in the same orientation with regard to the core sequence. The 59 bp core elements of integrons are flanked by imperfect inverted repeats (Hall et al. 1991), and it is interesting that the core regions of dinoflagellate minicircles also generally contain imperfect inverted repeats (Barbrook et al. 2001; Zhang et al. 2002). At least in the case of the core region of A. operculatum, the inverted repeats and the region between them are remarkably similar in size (55 bp) to the 59 bp of the integron core region (Barbrook et al. 2001). Although some of the inverted repeats in dinoflagellate core regions are larger than this, the same is true for the cores of integron cassettes (Recchia & Hall 1997).

Once inserted, integron cassettes can be re-excised by a reversal of the insertion event. How cassettes are generated initially is not clear. It has been proposed that they may be produced by reverse transcription of mRNA species. This would account for the absence of a promoter. Given the high error rate of reverse transcriptase, it is interesting to note that the dinoflagellate minicircle genes are highly diverged from their homologues in other plastids (Zhang et al. 2000), which might be expected if they were generated by an error-prone mechanism. One possibility for the origin of the 59 bp element of bacterial integrons is that it derives from stem-loops at the end of the transcripts (i.e. from a bacterial transcription terminator). Transcripts in plastids of green plants and algae typically end with stem-loops, although they are generally processing sites rather than transcription terminators (Stern et al. 1991). An alternative proposal for the origin of the 59 bp element of bacterial integrons is that it is a separate element added either to the RNA or the cDNA after reverse transcription (Recchia & Hall 1997).
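The imperfect inverted repeats mentioned above, in both integron core elements and minicircle cores, are the kind of feature that a simple sequence scan can flag. The sketch below is an assumption-laden illustration rather than the method of any of the cited studies: the function names, the arm length, the spacer limit and the mismatch tolerance are all hypothetical parameters, and the example sequences are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=6, max_spacer=10, max_mismatch=1):
    """Return (left_start, right_start, mismatches) for every pair of arms
    where the right arm matches the reverse complement of the left arm
    with at most max_mismatch differences (an 'imperfect' inverted repeat).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left_rc = reverse_complement(seq[i:i + arm])
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            mismatches = sum(a != b for a, b in zip(left_rc, seq[j:j + arm]))
            if mismatches <= max_mismatch:
                hits.append((i, j, mismatches))
    return hits


if __name__ == "__main__":
    # Made-up sequence: arm GACCAT, 4-base spacer, then ATGGTA, which is
    # one mismatch away from the perfect partner ATGGTC.
    example = "CCGACCATTTTTATGGTACC"
    for left, right, mism in find_inverted_repeats(example):
        print(left, right, mism, "span:", right + 6 - left)
```

The printed span (left arm, spacer and right arm together) is the quantity that would be compared with a figure such as the 55 bp reported for the A. operculatum core; real core regions would need longer arms and a tolerance chosen to suit the data.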

Phil. Trans. R. Soc. Lond. B (2003)


Chloroplast genome evolution C. J. Howe and others 105

There are therefore many interesting similarities between the dinoflagellate minicircles and bacterial integron systems. Maybe in the same way that integrons are used for lateral transfer of coding regions between different genomes, the minicircles are part of a system for transfer of coding regions between different genomes within the same cell, i.e. for moving genes from the chloroplast to the nucleus. On this basis, one might expect to find integrated copies of minicircles in the nuclear genome, perhaps in association with the integrase gene. It may be that the minicircles remaining in the plastid are ones that have failed to be incorporated into the nucleus, because their continued presence in the plastid is required for efficient regulation (Pfannschmidt et al. 1999). These minicircles may have acquired replication origins to facilitate their continued propagation in the plastid.

There are several important questions remaining to be answered. Perhaps the most significant is whether the minicircles are indeed located in the plastid, or whether they are actually nuclear (in addition or instead). Do all dinoflagellates have this anomalous gene organization, and is there also a ‘master’ circle that has not yet been found? How many additional genes are located on minicircles, and where are the tRNA genes? How significant is the resemblance to bacterial integrons? Are there integrated copies of the minicircles, and if so, in which compartments? How are the minicircles replicated, and how are the genes they contain transcribed? Is there any editing of transcripts, or is the primary amino-acid sequence of plastid proteins the same as that predicted from the open reading frames in the minicircles? How are the minicircles with only fragments of genes, or no genes at all, generated, and do they have a function?

We thank BBSRC, The Broodbank Trust, Corpus Christi College and the Cambridge European Trust for financial support.

REFERENCES

Allen, J. F. 1993 Control of gene expression by redox potential and the requirement for chloroplast and mitochondrial genomes. J. Theor. Biol. 165, 609–631.
Allen, J. F. & Raven, J. A. 1996 Free-radical-induced mutation vs redox regulation: costs and benefits of genes in organelles. J. Mol. Evol. 42, 482–492.
Barbrook, A. C. & Howe, C. J. 2000 Minicircular plastid DNA in the dinoflagellate Amphidinium operculatum. Mol. Gen. Genet. 263, 152–158.
Barbrook, A. C., Lockhart, P. J. & Howe, C. J. 1998 Phylogenetic analysis of plastid origins based on secA sequences. Curr. Genet. 34, 336–341.
Barbrook, A. C., Symington, H., Nisbet, R. E. R., Larkum, A. & Howe, C. J. 2001 Organisation and expression of the plastid genome of the dinoflagellate Amphidinium operculatum. Mol. Genet. Genomics 266, 632–638.
Cannon, G. C., Hedrick, L. A. & Heinhorst, S. 1995 Repair mechanisms of UV-induced DNA damage in soybean chloroplasts. Plant Mol. Biol. 29, 1267–1277.
Cerutti, H., Ibrahim, H. Z. & Jagendorf, A. T. 1993 Treatment of pea (Pisum sativum L.) protoplasts with DNA-damaging agents induces a 39-kilodalton chloroplast protein immunologically related to Escherichia coli RecA. Plant Physiol. 102, 155–163.
Cerutti, H., Johnson, A. M., Boynton, J. E. & Gillham, N. W. 1995 Inhibition of chloroplast DNA recombination and repair by dominant negative mutants of Escherichia coli RecA. Mol. Cell. Biol. 15, 3003–3011.
Cheung, A. Y., Bogorad, L., Van Montagu, M. & Schell, J. 1988 Relocating a gene for herbicide tolerance: a chloroplast gene is converted into a nuclear gene. Proc. Natl Acad. Sci. USA 85, 391–395.
Edqvist, J., Burger, G. & Gray, M. W. 2000 Expression of mitochondrial protein-coding genes in Tetrahymena pyriformis. J. Mol. Biol. 297, 381–393.
Glöckner, G., Rosenthal, A. & Valentin, K. 2000 The structure and gene repertoire of an ancient red algal plastid genome. J. Mol. Evol. 51, 382–390.
Hall, R. M., Brookes, D. E. & Stokes, H. W. 1991 Site-specific insertion of genes into integrons: role of the 59-base element and determination of the recombination cross-over point. Mol. Microbiol. 5, 1941–1959.
Hiller, R. G. 2001 ‘Empty’ minicircles and petB/atpA and psbD/psbE (cytb559 α) genes in tandem in Amphidinium carterae plastid DNA. FEBS Lett. 505, 449–452.
Howe, C. J. & Smith, A. G. 1991 Plants without chlorophyll. Nature 349, 109.
Howe, C. J., Beanland, T. J., Larkum, A. W. D. & Lockhart, P. J. 1992 Plastid origins. Trends Ecol. Evol. 7, 378–383.
Howe, C. J., Barbrook, A. C. & Lockhart, P. J. 2000 Organelle genes: do they jump or are they pushed? Trends Genet. 16, 65–66.
Kanevski, I. & Maliga, P. 1994 Relocation of the plastid rbcL gene to the nucleus yields functional ribulose-1,5-bis-phosphate carboxylase in tobacco chloroplasts. Proc. Natl Acad. Sci. USA 91, 1969–1973.
Kowallik, K. V., Stoebe, B., Schaffran, I., Kroth-Pancic, P. & Freier, U. 1995 The chloroplast genome of a chlorophyll a + c-containing alga, Odontella sinensis. Plant Mol. Biol. Rep. 13, 336–342.
La Claire, J. W. & Wang, J. 2000 Localization of plasmid-like DNA in giant-celled marine algae. Protoplasma 213, 157–164.
Lang, B. F., Gray, M. W. & Burger, G. 1999 Mitochondrial genome evolution and the origin of eukaryotes. A. Rev. Genet. 33, 351–397.
Lockhart, P. J., Howe, C. J., Bryant, D. A., Beanland, T. J. & Larkum, A. W. D. 1992 Substitutional bias confounds inference of cyanelle origins from sequence data. J. Mol. Evol. 34, 153–162.
Lockhart, P. J., Steel, M. A., Barbrook, A. C., Huson, D. H., Charleston, M. A. & Howe, C. J. 1998 A covariotide model explains apparent phylogenetic structure of oxygenic photosynthetic lineages. Mol. Biol. Evol. 15, 1183–1188.
Lonsdale, D. M., Hodge, T. P. & Fauron, C. M. 1984 The physical map and organization of the mitochondrial genome from the fertile cytoplasm of maize. Nucleic Acids Res. 12, 9249–9261.
Martin, W., Stoebe, B., Goremykin, V., Hansmann, S., Hasegawa, M. & Kowallik, K. V. 1998 Gene transfer to the nucleus and the evolution of chloroplasts. Nature 393, 162–165.
Morse, D., Salois, P., Markovic, P. & Woodland Hastings, J. 1995 A nuclear-encoded form II RuBisCo in dinoflagellates. Science 268, 1622–1624.
Nakamura, Y., Kaneko, T., Hirosawa, M., Miyajima, N. & Tabata, S. 1998 Cyanobase, a www database containing the complete nucleotide sequence of the genome of Synechocystis sp. strain PCC6803. Nucleic Acids Res. 26, 63–67.
Palmer, J. D. 1993 A genetic rainbow of plastids. Nature 364, 762–763.
Pfannschmidt, T., Nilsson, A. & Allen, J. F. 1999 Photosynthetic control of chloroplast gene expression. Nature 397, 625–628.


Race, H. L., Herrmann, R. G. & Martin, W. 1999 Why have organelles retained genomes? Trends Genet. 15, 364–370.
Recchia, G. D. & Hall, R. M. 1997 Origins of the mobile gene cassettes found in integrons. Trends Microbiol. 5, 389–394.
Reith, M. & Munholland, J. 1995 Complete nucleotide sequence of the Porphyra purpurea chloroplast genome. Plant Mol. Biol. Rep. 13, 333–335.
Rowe-Magnus, D. A. & Mazel, D. 1999 Resistance gene capture. Curr. Opin. Microbiol. 2, 483–488.
Rowe-Magnus, D. A. & Mazel, D. 2001 Integrons: natural tools for bacterial genome evolution. Curr. Opin. Microbiol. 4, 565–569.
Stern, D. B., Radwanski, E. R. & Kindle, K. 1991 A 3′ stem/loop structure of the Chlamydomonas chloroplast atpB gene regulates mRNA accumulation in vivo. Plant Cell 3, 285–297.
Sugiura, M. 1992 The chloroplast genome. Plant Mol. Biol. 19, 149–168.
Zhang, Z., Green, B. R. & Cavalier-Smith, T. 1999 Single gene circles in dinoflagellate chloroplast genomes. Nature 400, 155–159.
Zhang, Z., Green, B. R. & Cavalier-Smith, T. 2000 Phylogeny of ultra-rapidly evolving dinoflagellate chloroplast genes: a possible common origin for sporozoan and dinoflagellate plastids. J. Mol. Evol. 51, 26–40.
Zhang, Z., Cavalier-Smith, T. & Green, B. R. 2001 A family of selfish minicircular chromosomes with jumbled chloroplast genes from a dinoflagellate. Mol. Biol. Evol. 18, 1558–1565.
Zhang, Z., Cavalier-Smith, T. & Green, B. R. 2002 Evolution of dinoflagellate unigenic minicircles and the partially concerted divergence of their putative replicon origins. Mol. Biol. Evol. 19, 489–500.

Discussion

R. Fray (Plant Science Division, University of Nottingham, Nottingham, UK). Have you looked at the transcripts from these minicircles? Do you get a defined-length transcript or does the RNA polymerase just keep going round these circles many times?

C. J. Howe. That is a good question. We do find defined-length transcripts, and they correspond essentially with the size that one would expect from the coding region itself, with a little bit added on at each end. They do not correspond to the size of the whole minicircle, and in the cases where we have two genes in the minicircle we seem to find separate transcripts for each gene. Now that does not rule out the possibility of a much larger transcript that is processed very rapidly, so we do not pick up the intermediate dicistronic transcript, but we seem to find just single-sized transcripts.

C. J. Leaver (Department of Plant Sciences, University of Oxford, Oxford, UK). Does the copy number of these circles or those genes vary relative to each other, or when you culture these, or is it constant?

C. J. Howe. We have looked at that to some extent. It is curious that one of the genes that you seem to pick up quite often when you are cloning these minicircles is the psbA minicircle. We thought everything needs lots of psbA because it is turning over very rapidly, so maybe the dinoflagellates have many copies of the psbA compared with the other genes, so they make lots and lots of protein. But like all the best hypotheses, that turned out to be wrong. Copy number experiments that we have done suggest that the minicircles all seem to occur in fairly similar numbers of copies. There is not a specific overrepresentation of the psbA, although we have not yet looked at variation in copy numbers during culturing, during life cycle and so on.

J. E. Walker (MRC-Dunn Institute of Human Nutrition, Cambridge, UK). There is a related question to your initial question ‘why should genes move from the chloroplast to the nucleus?’, and that is: why should genes stay in the chloroplast? Do you know? I cannot comment on chloroplasts, but certainly in the case of mitochondria of higher organisms, the 13 proteins that are encoded there have one very clear, common feature: they are very hydrophobic, and therefore it is reasonable to assume that they have been kept there because they would be difficult to transport into the mitochondrion; they would probably need to be kept soluble during that process and this would require the addition of an enormously long polar import sequence. One is encouraged that that may be so by subunit c, which is about 80 amino acids long; it is a nuclear gene product and it has an import sequence, if I remember correctly, in excess of 60 extremely polar amino acids, so certainly in the case of these mitochondria and their genomes that may be one of the overriding factors.

C. J. Howe. In the chloroplast system it has certainly been shown in a number of cases that one can take a chloroplast gene and put it into the nucleus with a chloroplast import sequence that has nothing particularly special attached to it and that import sequences will take the gene product back into the chloroplast. So some genes could, at least in principle, be moved into the nucleus.

J. E. Walker. Similar experiments have been done by Howard Jacobs with the ATPase 6 gene, which has been moved into the nucleus, and he has had to attach an extremely long polar import sequence in order to achieve that.

C. J. Howe. John Allen has, of course, theories on the retention of genes. I have a theory on the retention of one gene, which is the glutamate tRNA that is used to activate glutamate for haem biosynthesis. If you are not really able to import tRNAs into the organelle, but at least the early stages of haem biosynthesis are carried out in the organelle, then you will at least have to retain that glutamate tRNA gene. Obviously, that leaves a lot of others to be explained.

J. C. Gray (Department of Plant Sciences, University of Cambridge, Cambridge, UK). Integrons have a relatively short recognition sequence, and not the 500 base pairs that you have as your common region. Surely that would suggest your common region is doing something else rather than providing an integration-recognition site?

C. J. Howe. Yes, it may well be doing something else. In classic integrons, they talk about a 59 bp element (which in fact can be anything from 59 to about 150 bp). Within the minicircle core region there are regions of greater conservation and lesser conservation, so it is possible that one part of the core might be functioning as that kind of attachment site while the rest could be doing other things, such as acting as a promoter or indeed supplying a replication origin, allowing the whole thing to replicate independently; this is just speculation. There are quite a lot of short inverted repeats in these core regions, which is interesting because one of the features of the 59 bp elements in the integron model is that they also have inverted repeats bounding them, so there is an extension of that similarity.


A. E. Douglas (Department of Biology, University of York, York, UK). I have a speculative comment/question. Dinoflagellate plastids have been replaced by alternative photosynthetic symbionts in some taxa. I wonder if this could be related to the fact that their genome may be degenerate or prone to collapse. Do you want to make any comment?

C. J. Howe. Only that it is a very interesting observation. I do not know. I would emphasize that dinoflagellates are an incredibly broad group of organisms, and the people looking at them have just picked out a small number. I think we need to look more broadly within the group and see what we can find.

T. Cavalier-Smith (Department of Zoology, University of Oxford, Oxford, UK). Can I comment on the last question? The fundamental reason why dinoflagellates can relatively easily undergo chloroplast replacement, which has happened at least twice, maybe three times, is that they are the only group of chromalveolates that has retained the membrane vesicle targeting mechanism and photosynthesis, and the ability to phagocytose. All the chromists underwent that fusion of the endoplasmic reticulum with the epiplastid membrane. Therefore, if a chromist ate a foreign alga and failed to digest it, it would not already have a vesicle targeting mechanism that could be used to enslave the new plastid; whereas dinoflagellates have it ready-made, they just have to slightly alter the v-SNARE so that they can re-target it to the new food vacuole membrane. It would be relatively easy. That, I think, is the reason.

C. J. Howe. It sounds extremely plausible to me, yes.

J. F. Allen (Plant Biochemistry, Lund University, Lund, Sweden). Concerning hydrophobicity, in response to John Walker, I intended to send the hydrophobicity hypothesis packing yesterday. This is something that one should discuss, but it does not really come in the context of Chris Howe’s talk.

R. G. Herrmann (Department of Biology, Ludwig-Maximilians Universität, Munich, Germany). I can make a comment on the import of highly hydrophobic proteins. We have imported a protein with 11-transmembrane-spanning protein segments into the chloroplast. The problem does not appear to be one of import, but of interaction with the chaperones, so the proteins are sometimes not correctly inserted in the membrane. If one can modify the chaperone/hydrophobicity protein in its action, it should be possible to have chloroplast function of such a hydrophobic protein encoded by the nucleus.

J. F. Allen. I think that John Walker’s point was that hydrophobicity may not be an obstacle to import in plastids, but it could be in mitochondria.

R. E. Blankenship (Department of Chemistry and Biochemistry, Arizona State University, AZ, USA). Is there evidence against full-sized chloroplast genomes in these organisms? In other words, are you sure there is no larger structure?

C. J. Howe. That is clearly an extremely important question. We have tried very hard to find one. We have tried, for example, ‘PCRing’ with primer pairs for genes that would be adjacent on any respectable large chloroplast DNA to see if we can obtain PCR products with them both on, but we cannot. Clearly there could be other trivial reasons for that, so it needs to be looked at.

GLOSSARY

A: adenine
C: cytosine
G: guanine
RT–PCR: reverse transcription–polymerase chain reaction
T: thymine
