Origins
Virology
journal homepage: www.elsevier.com/locate/yviro
Review
Article info

Article history:
Received 27 January 2015
Returned to author for revisions 19 February 2015
Accepted 20 February 2015
Available online 12 March 2015

Keywords:
Evolution of viruses
Transposable elements
Polintons
Bacteriophages
Recombination
Functional gene modules

Abstract

Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize on all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, predicted to also form virus particles, most likely, were the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses including the proposed order “Megavirales” that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources along with additional acquisitions of diverse genes.

Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Contents
Introduction
The contrasting viromes of prokaryotes and eukaryotes
Evolutionary scenarios for the origin of eukaryotes and their impact on the reconstruction of virus evolution
Origins of the major classes of eukaryotic viruses and evolutionary relationships between viruses of prokaryotes and eukaryotes
    A general perspective on RNA virus evolution: Out of the primordial RNA world?
    Positive-strand RNA viruses: Assembly from diverse prokaryotic progenitors and gene exchanges leading to enormous diversification
    dsRNA viruses: Multiple origins from positive-strand RNA viruses
    Negative-strand RNA viruses: The emerging positive-strand connection
    Synopsis on eukaryotic RNA virome
    Retroelements and retroviruses: Viruses as derived forms
Corresponding author.
E-mail addresses: [email protected] (E.V. Koonin),
[email protected] (V.V. Dolja), [email protected] (M. Krupovic).
http://dx.doi.org/10.1016/j.virol.2015.02.039
0042-6822/Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
E.V. Koonin et al. / Virology 479-480 (2015) 2–25 3
Introduction

A major discovery of environmental genomics over the last decade is that the most common and abundant biological entities on earth are viruses, in particular bacteriophages (Edwards and Rohwer, 2005; Rohwer, 2003; Rohwer and Thurber, 2009; Suttle, 2005, 2007). In marine, soil and animal-associated environments, virus particles consistently outnumber cells by one to two orders of magnitude. Viruses are major ecological and even geological agents that in large part shape such processes as energy conversion in the biosphere and sediment formation in water bodies by killing off populations of abundant, ecologically important organisms such as cyanobacteria or eukaryotic algae (Fuhrman, 1999; Rohwer and Thurber, 2009; Suttle, 2007). With the possible exception of some highly degraded intracellular parasitic bacteria, viruses and/or other selfish elements, such as transposons and plasmids, parasitize on all cellular organisms. Complementary to their physical dominance in the biosphere, viruses collectively appear to encompass the bulk of the genetic diversity on earth (Hendrix, 2003; Kristensen et al., 2010, 2013). The ubiquity of viruses in the extant biosphere and the results of theoretical modeling indicating that emergence of selfish genetic elements is intrinsic to any evolving system of replicators together imply that virus-host coevolution had been the mode of the evolution of life ever since its origin (Szathmary and Demeter, 1987; Takeuchi and Hogeweg, 2007, 2012; Takeuchi et al., 2011).

All cellular life forms possess genomes consisting of double-stranded (ds) DNA and employ the same, standard scheme for replication and expression. In contrast, viruses and other selfish elements exploit all theoretically conceivable inter-conversions of nucleic acids, with the genome represented by either RNA or DNA that can be either single-stranded or double-stranded, either circular or linear, and consists of either a single or multiple molecules (Agol, 1974; Baltimore, 1971; Koonin, 1991a). Typical viral genomes are small compared to genomes of cellular life forms but over the past few years the discovery of several groups of giant viruses has dramatically expanded the viral genome size range that now spans 3 orders of magnitude, from about 2 kilobases (kb) to over 2 megabases (Mb). The genomes of giant viruses are larger than the genomes of numerous bacteria and archaea, obliterating the gulf between cells and viruses in terms of genome size and complexity (Claverie and Abergel, 2009; Claverie et al., 2006; Legendre et al., 2014; Philippe et al., 2013; Raoult et al., 2004).

Given the fundamental differences in the reproduction strategy between viruses and cellular organisms, along with the prominence of viruses in the biosphere, it has been proposed that all organisms be classified into two primary “empires”, the ribosome-encoding (cellular) organisms and the capsid-encoding organisms (viruses) (Raoult and Forterre, 2008). This division captures some of the essential distinctions between cells and viruses but, due to the focus on capsids as a positive, defining trait of the virus empire, fails to reflect the full complexity of the evolutionary relationships among selfish genetic elements. Indeed, comparative genomic analyses make it increasingly clear that the evolutionary connections between viruses and various capsid-less elements are multifarious, involve all major groups of viruses and encompass multiple transitions from capsid-less elements to bona fide viruses and vice versa (Koonin and Dolja, 2013, 2014). Thus, any reconstruction of virus evolution that fails to take into account the evolutionary relationships with non-viral selfish elements is bound to be substantially incomplete. The capsid-less elements as well as many viruses differ in their extent of integration with the host cells: some insert into the cell genome and are transmitted mainly vertically through the host generations, others are largely autonomous, and many combine both strategies mixed in different proportions.

Viruses and other selfish elements certainly have not evolved from a single common ancestor: indeed, not a single gene is conserved across the entire “greater virus world” or even in the majority of selfish elements (Holmes, 2011; Koonin et al., 2006). However, these elements form a dense evolutionary network in which genomes are linked through different shared genes (Koonin and Dolja, 2014; Krupovic and Koonin, 2015; Yutin et al., 2013). This type of evolutionary relationship results from extensive exchange of genes and gene modules, in some cases between widely different elements, as well as parallel capture of homologous genes from the hosts by distinct elements. Viruses with large genomes possess numerous genes that were acquired from the hosts at different stages of evolution; such genes typically are restricted in their spread to a narrow group of viruses. However, a small group of viral hallmark genes that encode key proteins involved in genome replication and virion formation and are shared by overlapping sets of diverse viruses ensures the connectivity of the evolutionary network in the virus world (Holmes, 2011; Koonin and Dolja, 2014; Koonin et al., 2006). Virus hallmark genes have no obvious ancestors in cellular life forms, suggesting that virus-like elements evolved at a pre-cellular stage of the evolution of life.

The viromes and mobilomes (i.e. the supersets of viruses and other selfish elements) of the three domains of cellular life (bacteria, archaea and eukaryotes) are fundamentally different. Although several families of dsDNA viruses are represented in both bacteria and archaea, no viruses are known to be shared by eukaryotes with any of the other two cellular domains, even at the family or order level (King et al., 2011). The evolutionary connections between viruses of eukaryotes and those that infect bacteria and archaea are distant and complex. In this review article, we quantify the differences between the prokaryotic and eukaryotic viromes, summarize the existing evidence on putative prokaryotic ancestry of the major classes of eukaryotic viruses and virus-like elements, and delineate the likely key events in the evolution of each class.

The contrasting viromes of prokaryotes and eukaryotes

The high level classification of viruses that was introduced by Baltimore in 1971 (largely inspired by his co-discovery, with Temin, of reverse transcription in animal tumor viruses) is based on the replication-expression strategies and in particular on the form of nucleic acid that is incorporated into virions (obviously, this criterion is only applicable to bona fide viruses) (Baltimore,
1971). The following 7 classes have been delineated under this approach (Koonin, 1991a): (i) positive-strand RNA viruses (virions contain RNA of the same polarity as mRNA), (ii) negative-strand RNA viruses (virions contain RNA molecules complementary to the mRNA), (iii) dsRNA viruses, (iv) reverse-transcribing viruses with positive-strand RNA genomes, (v) reverse-transcribing viruses with dsDNA genomes (these were characterized subsequent to the seminal publication of Baltimore), (vi) ssDNA viruses, (vii) dsDNA viruses.

The viromes of prokaryotes and eukaryotes dramatically differ with respect to the contribution of the different Baltimore classes to the overall viral diversity (Fig. 1). In both bacteria and archaea, the vast majority of the viruses possess dsDNA genomes, mostly within the range of 10 to 100 kb. The second most common class includes small ssDNA viruses. Positive-strand RNA and dsRNA viruses are extremely rare, and no retroviruses are known (reverse-transcribing elements exist but are not highly abundant) (Fig. 1).

In contrast to bacteria and archaea, eukaryotes host numerous, highly diverse RNA viruses (particularly of the positive-strand class) as well as reverse-transcribing elements and retroviruses that typically integrate into the host genome and are extremely abundant, comprising a substantial fraction of the genome in many groups of eukaryotes (Goodier and Kazazian, 2008; Kazazian, 2004). Collectively, the diversity and abundance of RNA viruses and retroviruses in eukaryotes exceeds the diversity and abundance of DNA viruses (Fig. 1; in this comparison, we refer to bona fide viruses because the prevalence of capsid-less elements is much more difficult to quantify).

The comparison in Fig. 1 that uses the number of recognized viral genera from each of the Baltimore classes infecting prokaryotes and eukaryotes as the measure of diversity most likely fails to pay full justice to the actual prevalence of the dominant classes, in particular dsDNA viruses, in the case of prokaryotes, and retroelements in eukaryotes. In the first instance, this appears to be the case given the existence of numerous unclassified bacteriophages and undoubtedly an even much greater number of phages that remain to be discovered. As a case in point, 39 new genera have been recently proposed within the bacteriophage family Siphoviridae (Adriaenssens et al., 2014). Despite the rapid accumulation of bacteriophage sequences, the diversity of phage genes does not show any signs of saturation, suggestive of a vast phage supergenome that so far has been barely tapped into (Kristensen et al., 2013). In the case of eukaryotes, the diversity of retroelements is not captured by the existing classification of viruses, resulting in a severe underestimate of the true impact of this class of genomic parasites. Thus, the actual discrepancy between the prokaryotic and eukaryotic viromes is likely to be even greater than suggested by the data in Fig. 1.

The biological causes of the dramatic difference in the composition of the virome between eukaryotes and prokaryotes remain unclear. It stands to reason that the emergence of the eukaryotic nucleus severely shrunk the niche for dsDNA virus reproduction by creating a barrier for the access of viral DNA to the sites of host genome replication and transcription, and complicating the process of virus maturation. Notably, the majority of dsDNA viruses of eukaryotes replicate in the cytoplasm (see below) suggesting that those few groups of dsDNA viruses that replicate in the nucleus have evolved specific adaptations to overcome the barriers. Conversely, the cytosolic compartment of eukaryotic cells, with its elaborate intracellular membrane system, might provide a fertile niche for the reproduction of RNA viruses (Belov, 2014; den Boon and Ahlquist, 2010; Greninger, 2015; Nagy and Pogany, 2012). With respect to the dramatic proliferation of retroelements, an accommodating niche could have been provided by the expanding genomes of eukaryotes and their greater tolerance to insertion of mobile elements compared to genomes of prokaryotes (Lynch, 2007; Lynch and Conery, 2003).

Regardless of the underlying causes, reconstruction of the evolution of the eukaryotic virome, with its dramatic differences from the viromes of bacteria and archaea and comparatively greater diversity, is a major challenge in the study of virus evolution. In the following sections of this article, we discuss the evolutionary scenarios that have been developed for different classes of eukaryotic viruses over the last few years and how the evolutionary relationships between viruses of prokaryotes and eukaryotes become apparent in these scenarios.

Evolutionary scenarios for the origin of eukaryotes and their impact on the reconstruction of virus evolution

The origin of eukaryotes is a major problem in evolutionary biology that is generally considered to be unresolved. It is now clear that nearly all extant eukaryotes possess membrane-bounded, energy-converting organelles, the mitochondria or partially degraded derivatives thereof (such as mitosomes or hydrogenosomes), and the few known cases of actual loss of mitochondria are secondary (Hjort et al., 2010; van der Giezen, 2009; van der Giezen and Tovar, 2005). Accordingly, the Last Eukaryotic Common Ancestor (LECA) is believed to have been a typical, mitochondriate eukaryotic cell (Embley and
[Figure 1: bar chart. Vertical axis: number of virus genera (0 to 160). Horizontal axis: (+)RNA, (-)RNA, dsRNA, Retro, ssDNA, dsDNA. Series: Prokaryotes, Eukaryotes.]
Fig. 1. Representation of different “Baltimore classes” of viruses in prokaryotes and eukaryotes. The bars show the number of genera in the respective classes according to the latest ICTV report (King et al., 2011). Unclassified viruses are disregarded. The numbers for ssDNA viruses also include those for papillomaviruses and polyomaviruses.
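As a rough illustration of how the seven replication-expression classes listed in the text could relate to the six categories on the horizontal axis of Fig. 1, the following Python sketch encodes each class by the nucleic acid packaged in virions and by whether it relies on reverse transcription. The tuple layout, the `figure_label` and `figure_labels` helpers, and the grouping of both reverse-transcribing classes under "Retro" are assumptions made for illustration only; the source does not spell out this mapping.

```python
# Seven classes in the order (i)-(vii) given in the text.
# Each entry: (description, nucleic acid packaged in virions, uses reverse transcription).
# The short nucleic-acid codes are illustrative labels, not notation from the source.
BALTIMORE_CLASSES = [
    ("positive-strand RNA viruses", "ssRNA(+)", False),
    ("negative-strand RNA viruses", "ssRNA(-)", False),
    ("dsRNA viruses", "dsRNA", False),
    ("reverse-transcribing viruses with positive-strand RNA genomes", "ssRNA(+)", True),
    ("reverse-transcribing viruses with dsDNA genomes", "dsDNA", True),
    ("ssDNA viruses", "ssDNA", False),
    ("dsDNA viruses", "dsDNA", False),
]

# Assumed correspondence between virion nucleic acid and the Fig. 1 axis labels.
AXIS_LABEL = {
    "ssRNA(+)": "(+)RNA",
    "ssRNA(-)": "(-)RNA",
    "dsRNA": "dsRNA",
    "ssDNA": "ssDNA",
    "dsDNA": "dsDNA",
}


def figure_label(nucleic_acid, reverse_transcribing):
    """Return the assumed Fig. 1 axis label for one class."""
    # Assumption: classes (iv) and (v) are pooled in the single "Retro" category.
    if reverse_transcribing:
        return "Retro"
    return AXIS_LABEL[nucleic_acid]


def figure_labels():
    """Collect the distinct axis labels in the order the classes are listed."""
    labels = []
    for _description, nucleic_acid, reverse_transcribing in BALTIMORE_CLASSES:
        label = figure_label(nucleic_acid, reverse_transcribing)
        if label not in labels:
            labels.append(label)
    return labels


if __name__ == "__main__":
    for description, nucleic_acid, reverse_transcribing in BALTIMORE_CLASSES:
        print(f"{description} -> {figure_label(nucleic_acid, reverse_transcribing)}")
```

The sketch does not model the special case noted in the caption, where papillomaviruses and polyomaviruses are counted together with the ssDNA viruses.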
Martin, 2006; Lane and Martin, 2010, 2012). Another well established, key piece of information pertinent for the origin of eukaryotes is the sharp split of the evolutionarily conserved eukaryotic genes into the genes with an archaeal evolutionary affinity and those with a bacterial affinity (along with some with no detectable prokaryotic homologs) (Brown and Doolittle, 1997; Esser et al., 2004; Yutin et al., 2008). The archaeal ancestry is apparent primarily for genes encoding components of informational systems along with some key components of the cytoskeleton and the cell division machinery (Koonin and Yutin, 2014), whereas operational genes, such as metabolic enzymes, appear to be largely of bacterial origin.

Within the constraints set by these key observations, two distinct classes of scenarios for the origin of eukaryotes are currently considered; the scenarios within each class differ in detail but the classes are sharply differentiated by the postulated nature of the organism that played host to the protomitochondrial endosymbiont (Embley and Martin, 2006). The historically first scenario postulates a lineage of primary amitochondrial eukaryotes (sometimes called archaezoa) that are perceived to have evolved as a sister group of archaea or possibly as a sister group of one of the major archaeal branches, such as the ‘TACK (Thaumarchaeota–Aigarchaeota–Crenarchaeota–Korarchaeota) superphylum’ (Guy et al., 2014). Under this scenario, the hypothetical amitochondrial ancestor of eukaryotes possessed the principal features of the eukaryotic cellular architecture such as the advanced cytoskeleton and endomembrane system including the nucleus (Kurland et al., 2006; Poole et al., 1999; Poole and Penny, 2007). These features would facilitate engulfment of the protomitochondrial endosymbiont (and bacteria in general) which is conceivably the strongest aspect of the primary amitochondrial scenario (hereinafter protoeukaryote scenario). The obvious weakest point of this scenario is the lack of any evidence of the existence of primary amitochondrial eukaryotic forms despite intensive search. The proponents of the protoeukaryotic scenario thus have to postulate that such forms are either extinct or exceedingly rare. Furthermore, there is no precedent for the evolution of large, internally compartmentalized cells among prokaryotes, and it has been argued that emergence of such cells is unfeasible without highly efficient cellular energetics that is provided by the multiple mitochondria residing within a single cell (Lane and Martin, 2010, 2012).

The alternative, symbiogenetic scenario (Embley and Martin, 2006; Martin et al., 2007), obviously fueled by the ubiquity of mitochondria and related organelles in eukaryotes, postulates that the host of the proto-mitochondrial endosymbiont was not a protoeukaryote endowed with the key features of the eukaryotic cellular organization, including the nucleus, but rather a regular archaeon, most likely a mesophilic form that could comprise a deep branch within the TACK superphylum or possibly a sister group thereof (Koonin and Yutin, 2014). The symbiogenetic scenario implies a plausible succession of events leading to the key innovations of the eukaryotic cell such as the endomembrane system including the nucleus, the cytoskeleton, the ubiquitin-centered signaling system and pre-mRNA splicing (Koonin, 2006; Martin and Koonin, 2006). The weakness of the symbiogenetic scenario is the extreme rarity of endosymbiosis among prokaryotes (although bacteria living inside other bacteria have been described; Husnik et al., 2013; von Dohlen et al., 2001) and the apparent absence of mechanisms, such as phagocytosis, that would facilitate engulfment of bacteria. The proponents of this scenario therefore are forced to postulate an (extremely) rare event at the root of eukaryogenesis. However, the recent discovery of archaeal homologs (and putative ancestors) of key elements of the eukaryotic cytoskeleton, cell division systems and ubiquitin machinery provides for an amended symbiogenetic scenario. Under this hypothesis, the archaeal ancestor of eukaryotes, the host of the protomitochondrial endosymbiont, could have possessed relatively complex intracellular organization that would facilitate engulfment of bacteria and evolution of the compartmentalized eukaryotic cell (Guy et al., 2014; Koonin and Yutin, 2014; Yutin et al., 2009).

In the following sections, we examine the implications of each of these scenarios of the evolution of eukaryotes for the origin of different classes of eukaryotic viruses.

Origins of the major classes of eukaryotic viruses and evolutionary relationships between viruses of prokaryotes and eukaryotes

A general perspective on RNA virus evolution: Out of the primordial RNA world?

According to the widely accepted RNA world hypothesis, the RNA-only replication cycle antedates reverse transcription and DNA-based replication (Bernhardt, 2012; Gilbert, 1986; Neveu et al., 2013; Robertson and Joyce, 2012). Under this premise, the RNA viruses and related selfish elements whose replication relies on RNA-dependent RNA polymerase (RdRp) are the only major group of organisms (apart from small, non-coding parasitic RNAs such as viroids; Diener, 1989) that could be direct descendants of RNA world inhabitants. Because RdRp is the only viral hallmark protein that is universally conserved in RNA viruses (Kamer and Argos, 1984; Koonin and Dolja, 1993; Koonin et al., 2006), this enzyme is the key to reconstructing their evolutionary histories. Together with distantly related RNA-dependent DNA polymerases or reverse transcriptases (RT), viral RdRps represent a deeply branching lineage within the ancient superfamily of palm domain-containing polymerases and primases (Iyer et al., 2005). As is typical of viral hallmark genes (Koonin et al., 2006), cellular organisms encode no homologs of viral RdRps with the same enzymatic activity. The only known family of RdRps encoded in cellular genomes, those involved in the amplification of small interfering RNAs in eukaryotes, are homologs of the DNA-dependent RNA polymerases (Iyer et al., 2003; Salgado et al., 2006).

Based on the structure of the encapsidated genome and genome replication/expression cycles, the ‘RNA only’ viruses are divided into three Baltimore classes: positive-strand, double-strand and negative-strand (+RNA, dsRNA and –RNA, respectively). All non-defective viruses from each of these classes employ virus-encoded RdRps for genome replication and often for the distinct process of genome transcription to generate viral subgenomic mRNAs. Early comparative analyses identified 6 signature amino acid sequence motifs that are conserved in RdRps of diverse +RNA viruses infecting bacteria, plants and animals, suggesting their monophyletic origin (Kamer and Argos, 1984; Koonin, 1991b; Xiong and Eickbush, 1990). It has been further demonstrated that similar motifs were present in RdRps of dsRNA viruses and the RTs (Kamer and Argos, 1984; Koonin et al., 1989; Xiong and Eickbush, 1990). Although the RdRps of the –RNA viruses possess certain motifs resembling those conserved in +RNA and dsRNA viruses (Tordo et al., 1988; Xiong and Eickbush, 1990), the overall level of similarity is extremely low, making the evolutionary connection between the –RNA viruses and the rest of RNA viruses tenuous at best.

In addition to protein sequence analysis, reconstruction of the RdRp evolution is substantially aided by the comparisons of their atomic structures. It has been found that RdRps from diverse +RNA and dsRNA viruses of bacteria and animals possess a characteristic ‘right-handed’ fold, comprising palm, fingers, and thumb domains (Choi and Rossmann, 2009; Ferrer-Orta et al., 2006; Kidmose et al., 2010; Monttinen et al., 2014). A long-awaited first atomic structure of the RdRp of a –RNA virus, bat influenza A virus, helped to demystify the origins of these viruses by revealing a high level of structural similarity to RdRps of both +RNA and dsRNA viruses (Pflug et al., 2014). Thus, the three classes of RNA viruses share the
homologous core enzyme that is responsible for their replication and, by implication, related origins.

Under the symbiogenetic scenario for the origin of eukaryotes, it seems natural to assume that RNA viruses of eukaryotes originate from either RNA bacteriophages or RNA viruses of Archaea. This assumption, however, is challenged by the striking scarcity of bacterial and archaeal RNA viruses compared to the flourishing genomic and ecological diversity of their eukaryotic counterparts (see above). Indeed, there are only a handful of the +RNA bacteriophages all of which belong to the family Leviviridae infecting primarily enterobacteria and some other proteobacteria (Bollback and Huelsenbeck, 2001). Likewise, only a few dsRNA bacteriophages of the family Cystoviridae that infect γ-proteobacteria of the genus Pseudomonas are currently known (Mindich, 2004) although efforts on new virus isolation might expand this range (Mantynen et al., 2015). The targeted search for extant archaeal RNA viruses so far has netted only a single +RNA virus candidate that appears to represent a novel virus family but whose host range remains to be validated (Bolduc et al., 2012). Thus, the very existence of archaeal RNA viruses remains an open question. Finally, there is no evidence of –RNA viruses infecting prokaryotes. The protoeukaryotic scenario would imply a different narrative on the origins of the RNA viruses of eukaryotes whereby the remarkable diversity of these viruses evolved within the ancient protoeukaryotic lineage due to the features of the (proto)eukaryotic cell organization, such as an intracellular membrane system, that might be conducive to RNA virus reproduction. Should that be the case, the search for bacterial or archaeal ancestry would be futile in principle. Below we discuss how the available data on the origins of different genes of RNA viruses bear on these distinct origin scenarios.

Positive-strand RNA viruses: Assembly from diverse prokaryotic progenitors and gene exchanges leading to enormous diversification

Large-scale phylogenomic analysis of the +RNA viruses of eukaryotes was initiated over two decades ago and yielded conclusions that withstood the test of time remarkably well (Goldbach and Wellink, 1988; Koonin, 1991b; Koonin and Dolja, 1993). These studies have identified three major evolutionary lineages that collectively encompass the vast majority of the +RNA viruses infecting eukaryotes: picornavirus-like, alphavirus-like and flavivirus-like superfamilies (Fig. 2). This classification is based on a combination of evidence from the RdRp phylogeny with signature genes and gene arrangements that have been identified for the picornavirus-like and alphavirus-like superfamilies (see below). The congruence between the two lines of evidence is crucial because the high sequence divergence of the RdRp that is dictated by the overall high mutation rate of RNA viruses, despite the essentiality of the polymerase, hampers the construction of fully reliable phylogenetic trees (Zanotto et al., 1996).

The picornavirus-like superfamily is by far the largest, most diverse and most widely represented across the diversity of the eukaryotic hosts. In addition to a distinct RdRp lineage, the picornavirus-like superfamily is defined by the presence of a conserved array of signature genes, which encode a superfamily 3 helicase (S3H), a small genome-linked protein (VPg), a distinct chymotrypsin-like protease 3CPro and a single beta-barrel jelly-roll capsid protein (JRC), and are represented, some losses and replacements notwithstanding, in most members of this superfamily (Koonin and Dolja, 1993; Koonin et al., 2008).

The global ecology of the picornavirus-like superfamily, which spans a broad range of multicellular and unicellular eukaryotic hosts (Supplementary Table S1) points to an early origin of these viruses antedating the radiation of the eukaryotic supergroups. The core of the picornavirus-like superfamily is represented by the order Picornavirales that encompasses 5 families, several floating [...] superfamily. Furthermore, all these viruses express their genomes via polyprotein processing (in some groups, there are two polyproteins, one encompassing the structural proteins and the other the proteins involved in replication) and package the genomic RNA into characteristic icosahedral virions with a pseudo-T = 3 symmetry. Notably, Picornavirales include viruses infecting a broad range of hosts from three supergroups of eukaryotic organisms, Unikonts (vertebrates, insects), Plantae (angiosperms) and Chromalveolates (diatoms, raphidophytes, thraustochytrids), as well as viruses from marine environments with unidentified hosts (Le Gall et al., 2008).

The family of vertebrate viruses Caliciviridae is closely related to Picornavirales, sharing a conserved S3H-VPg-3CPro-RdRp-JRC gene array and differing only in the structure of their true T = 3 capsid. Strikingly, in the phylogenetic tree of the RdRp, caliciviruses confidently cluster with the members of Totiviridae, a family of dsRNA viruses that infect fungi (Unikonts) as well as Kinetoplastids, Trichomonads and Diplomonads, all of which belong to a distinct supergroup of unicellular eukaryotes, the Excavates. Because the clade that unites Caliciviridae and Totiviridae is lodged inside the picornavirus-like RdRp tree, it seems likely that this family of dsRNA viruses is a highly derived off-shoot of the picornavirus-like superfamily of +RNA viruses. The viruses in the remaining three major evolutionary lineages of picornavirus-like viruses (Fig. 2) encompass only subsets of the five picornaviral signature genes or, in the case of the family Partitiviridae, only the picornavirus-type RdRp. Each of these groups also includes viruses infecting hosts that belong to two or three eukaryotic supergroups (Koonin et al., 2008).

Thus, the evolutionary scenario best compatible with the superimposition of the phylogenetic trees of eukaryotes and picorna-like viruses involves early diversification antedating the divergence of eukaryotic supergroups. The alternative, i.e. emergence of the ancestors of each of the 6 lineages of the picornavirus-like superfamily in one of the eukaryotic supergroups followed by horizontal virus transfer (HVT) to hosts from other supergroups, appears to be decidedly less parsimonious because such a scenario would require numerous HVT events involving organisms with widely different lifestyles and ecological niches (Koonin et al., 2008). However, HVT could have played an important role in the subsequent evolution of the picorna-like viruses (Dolja and Koonin, 2011). One case in point is the phylogeny of partitiviruses in which fungal and plant viruses intermix, pointing to multiple occurrences of HVT between two widely different host taxa (Nibert et al., 2013). Another example involves the closely related plant Potyviridae and fungal Hypoviridae (Koonin et al., 1991a). The HVT between plants and fungi appears to be particularly plausible given close associations between plants and their ubiquitous fungal pathogens and symbionts.

In contrast to the picornavirus-like superfamily, the alphavirus-like and flavivirus-like superfamilies exhibit much less diversity in terms of both the numbers of included families and even more so their global ecologies (Dolja and Koonin, 2011). The alphavirus-like superfamily includes the order Tymovirales along with several other families of plant viruses and two families of animal viruses (Supplementary Table S1 and Fig. 2). All these viruses are unified by a conserved array of replication-associated genes which encode capping enzyme, superfamily 1 helicase and the RdRp (Koonin and Dolja, 1993). A recent in-depth comparative analysis of viral protein sequences has revealed a highly derived variant of the capping enzyme in the nodaviruses, an abundant family of animal +RNA viruses with small genomes (Ahola and Karlin, 2015). The RdRp of nodaviruses does not show an affinity with the alphavirus-like superfamily but rather had been tentatively included in the picorna-like superfamily on the basis of limited conservation of some sequence motifs (Koonin, 1991b; Koonin and Dolja, 1993; Koonin et al., 2008). However, there is no strong objective
genera and many unclassified viruses (Le Gall et al., 2008). The support for this affinity. Although nodaviruses, similar to other þRNA
viruses within this order share all the signature genes of the viruses with small genomes, lack a helicase, the presence of the
E.V. Koonin et al. / Virology 479-480 (2015) 2–25 7
Fig. 2. Origin of the major groups of RNA viruses of eukaryotes. The depicted evolutionary reconstruction is predicated on the symbiogenetic scenario of eukaryogenesis. The
host ranges of viral groups are color-coded as shown in the inset. Icons of virion structures are shown for selected groups. Ancestor-descendant relationships that are
considered tentative are shown with dotted lines, and particularly weak links are additionally indicated by question marks (see text for details). Key horizontal gene transfer
events are shown by gray, curved arrows. Abbreviations: CII FP, Class II fusion protein; CP, capsid protein; CPf, capsid protein of filamentous viruses; JRC, jelly roll capsid
(protein); MP, movement protein; RT, reverse transcriptase; S2H, Superfamily 2 helicase; S3H, Superfamily 3 helicase.
predicted capping enzyme suggests their inclusion in the alphavirus-like superfamily as a deep, perhaps basal branch (Fig. 2). This affiliation is compatible with the observation that nodaviruses share a distinct variant of the JRC containing an autoprocessing domain with tetraviruses and birnaviruses that appear to share a common ancestor and are included in the alphavirus-like superfamily on the basis of the RdRp phylogeny (Wang et al., 2012a). Unlike the picorna-like viruses, the great majority of which possess JRC-based icosahedral
capsids (with the exception of filamentous potyviruses and capsid-less hypoviruses), capsid architectures of alphavirus-like viruses are extremely diverse. These architectures include: (i) icosahedral virions built of either JRC or unrelated proteins; (ii) helical rod-shaped or flexible filamentous virions formed by a distinct family of four-helix bundle capsid proteins; (iii) membrane-enveloped virions. The host ranges of alpha-like viruses are limited almost exclusively to plants, where these viruses reach remarkable diversity, and animals. Only the family Endornaviridae that consists of capsid-less elements has a broader host range including “viruses” of plants and fungi, and a single “virus” of a plant-parasitic oomycete, potentially a result of HVT from a host plant (Koonin and Dolja, 2014; Roossinck et al., 2011).
The flavivirus-like superfamily is the smallest of the three major groups of the +RNA viruses of eukaryotes and encompasses only two families that appear to be rather odd bedfellows (Fig. 2). The Flaviviridae are enveloped animal viruses that encode a specific lineage of RdRp, a superfamily 2 helicase as well as a protease and a capping enzyme that are distinct from the functionally analogous proteins of the picornavirus-like and alphavirus-like superfamilies, respectively (Koonin and Dolja, 1993). None of these genes except for RdRp is conserved in Tombusviridae, viruses with small icosahedral capsids built of JRC that infect plants (with the exception of a single marine virus that presumably infects a unicellular eukaryotic host) (Culley et al., 2006; Dolja and Koonin, 2011). Thus, the flavivirus-like superfamily is held together only by the phylogenetic affinity of the RdRps. Although this association is consistently observed in multiple, independent phylogenetic analyses (Koonin and Dolja, 1993), the lack of additional support from signature genes makes this superfamily a tenuous group. It is not inconceivable that Flaviviridae and Tombusviridae would be best treated as separate superfamilies of +RNA viruses.
In accordance with a major, general trend of virus evolution (see also below), the histories of the three superfamilies of +RNA viruses were not completely independent but rather involved multiple gene exchanges. A striking case in point is the family Potyviridae, the largest family of plant viruses (Gibbs and Ohshima, 2010) that are confidently included in the picornavirus-like superfamily on the basis of a combination of several features including the RdRp phylogeny, the presence of two additional signature genes, namely the picornavirus-like protease and VPg, and the mode of protein expression via polyprotein processing. However, two other signature genes of the picornavirus-like superfamily, namely the S3H and the JRC, are replaced in the potyviruses, respectively, by a Superfamily 2 helicase most closely related to the homologous helicase of flaviviruses and by a four-helix bundle capsid protein related to that of filamentous plant viruses in the alphavirus-like superfamily (e.g. potexviruses) (Dolja et al., 1991; Koonin and Dolja, 1993; Koonin et al., 2008). Thus, evolution of the potyviruses involved substantial modification of the picornavirus-like scaffold (and consequently, the virion structure) through contributions from the other two superfamilies of +RNA viruses (Fig. 2). Other notable cases of intersuperfamily gene exchange include the apparent transfer of the serine protease gene between flaviviruses and togaviruses in which, strikingly, the protease was recruited for the capsid protein function (Gorbalenya et al., 1989b); spread of the genes for movement proteins between plant-infecting viruses from all three superfamilies (Mushegian and Koonin, 1993); and spread of class II fusion proteins among flaviviruses, togaviruses and bunyaviruses (Modis, 2014; Vaney and Rey, 2011).
A notable complementary trend in the evolution of +RNA viruses is the parallelism between the designs of the viral genomes in the three superfamilies. Indeed, apart from the RdRp and the CP, most of the viruses in the picorna-like and alpha-like superfamilies and the animal viruses in the flavi-like superfamily encode proteins with two types of functionality, helicases and proteases (Koonin and Dolja, 1993). The presence of these domains most likely is dictated by functional requirements such as the requirement of a helicase for the replication of (relatively) large RNA genomes. The existence of such a requirement is suggested by the clear threshold for the presence of the helicase gene which is found in all +RNA viruses with genomes larger than approximately 6 kb but not in viruses with smaller genomes (Gorbalenya and Koonin, 1989). Strikingly, however, both the helicases and the proteases in the three viral superfamilies belong to different protein families (Koonin and Dolja, 1993 and see above). Whether these analogous designs of the viral genomes evolved in parallel from a common ancestor that lacked the helicase and the protease or through displacement of the corresponding ancestral domains is difficult to ascertain.
Elucidation of the exact evolutionary relationships among the three superfamilies of +RNA viruses of eukaryotes requires in-depth phylogenetic analyses of their RdRps, which is a daunting task given the high sequence divergence of this protein outside the conserved motifs. Expansion of the collection of RdRp structures and refinement of methods for structure-based phylogeny could lead to progress. Nonetheless, the available evidence seems to support evolutionary primacy of the picornavirus-like superfamily. Most importantly, the host ranges of alphavirus-like and flavivirus-like superfamilies are limited almost exclusively to vertebrates, their arthropod parasites, and flowering plants, that is, only three groups of multicellular organisms. These narrow host ranges could point to relatively late evolutionary origins of the viruses of these superfamilies, perhaps concomitant with the emergence of the respective host groups. Furthermore, HVT, in particular via insect vectors, could have played an important role in the evolution of these viral superfamilies. In contrast, the broad host range of picorna-like viruses encompasses four eukaryotic supergroups and a great variety of both unicellular and multicellular organisms. Furthermore, multiple host-specific and metagenomic studies of marine RNA viruses (most of them demonstrated or thought to infect diverse unicellular eukaryotes) have recovered a large number of novel picorna-like viruses but only one tombus-like virus and no alpha-like viruses (Culley et al., 2006, 2014; Culley and Steward, 2007; Koonin et al., 2008).
The three-superfamily classification of +RNA viruses does not readily accommodate the distinct order Nidovirales which includes viruses with the largest known RNA genomes and several unique genomic features. Notably, none of these viruses encodes a JRC and, consistently, none forms icosahedral virions. Instead, members of the Nidovirales have enveloped virions which vary from roughly spherical to rod-shaped, depending on the organization of the helical nucleocapsids (Gorbalenya et al., 2006; Koonin and Dolja, 1993). However, a certain evolutionary affinity between RdRps of picornavirus-like viruses and nidoviruses, together with the presence of distantly related proteases responsible for polyprotein processing in both of these virus groups (Gorbalenya et al., 2006; Koonin and Dolja, 1993), suggests that nidoviruses could be highly derived off-shoots of the picornavirus-like superfamily.
Thus, the extreme diversity of the picorna-like viruses, with respect to both the host range and the genome architecture, suggests that picornaviral ancestors have evolved concomitantly with or shortly after the emergence of eukaryotes, rapidly diversified and spawned the ancestors of the alphavirus-like and flavivirus-like superfamilies as well as the Nidovirales (that are known to infect only vertebrates, insects and crustaceans), perhaps later in evolution (Fig. 2).
If the picornavirus-like superfamily indeed represents the ancestral viral reservoir from which the rest of the eukaryotic +RNA viruses evolved (with some notable exceptions discussed below), then the problem of the origin of eukaryotic +RNA viruses boils down to the origin of the ancestral picorna-like virus. This question has been addressed through a focused search for potential prokaryotic roots of picorna-like viruses (Koonin et al., 2008). In addition to validating the tight relationship between the three superfamilies of
the eukaryotic positive-strand RNA viruses, in-depth sequence analysis of the RdRps of the picornavirus-like superfamily has revealed remarkably high similarity of picornavirus-like RdRps to the reverse transcriptases (RTs) of the bacterial group II retroelements (self-splicing introns), in contrast to the much lower similarity to the RdRps of RNA bacteriophages (Koonin et al., 2008). Considering the wide spread of the group II retroelements in bacteria (Lambowitz and Zimmerly, 2004, 2011), in contrast to the scarcity of RNA bacteriophages, it appears plausible that the prokaryotic RTs were the ancestors of picornavirus-like RdRps. Search for the closest homologs of the 3CPro confidently identified bacterial and mitochondrial proteases of the HtrA family (Gorbalenya et al., 1989a; Koonin et al., 2008), suggesting direct descent of the viral protease from the bacterial endosymbiont of the emerging eukaryotic cell. The exact origins of the other picornaviral signature genes, S3H, JRC and VPg, proved much more difficult to trace. Nevertheless, S3H is encoded in some dsDNA bacteriophages and bacterial rolling-circle plasmids (see below) whereas the single β-barrel JRC of the picorna-like variety is present in ssDNA bacteriophages of the family Microviridae (McKenna et al., 1992; Roux et al., 2012). Additionally, the JRC-like β-barrel fold is found in various carbohydrate-binding proteins including those from bacteria (Norris et al., 1994; Wong et al., 2000), and some non-viral β-barrel proteins, such as tumor necrosis factor, are even known to form virus-like particles (Liu et al., 2002). These cellular jelly-roll proteins are considerably more compact than CPs of microviruses and thus might be more likely to have been the ancestors of JRC of RNA viruses. Consequently, bacterial origins for these genes are conceivable as well, leading to an evolutionary scenario in which the ancestral picorna-like virus was assembled from diverse building blocks derived from the proto-mitochondrial endosymbiont during eukaryogenesis (Koonin et al., 2008) (Fig. 2). Clearly, this scenario is most plausible within the framework of the symbiogenetic scenario for the origin of eukaryotes. Under the protoeukaryote scenario, the ancestral picorna-like virus could be construed as a direct descendant of the primordial RNA world that survived and thrived in the protoeukaryotic lineage (Fig. 2). In this case, the RdRp of the picorna-like viruses would be viewed as the primordial replicase, and S3H and JRC accordingly would be considered ancestral forms of the respective proteins. The ancestral picorna-like virus thus could resemble the extant nodaviruses that possess a “minimal” genome within the picornavirus-like superfamily encoding only the RdRp and the JRC. Incidentally, the only reported putative RNA virus of archaea shows a similar genome architecture although it is premature to discuss its possible role in the evolution of the viruses of eukaryotes until the archaeal host range is validated (Bolduc et al., 2012). The 3CPro, for which the bacterial origin appears undeniable, could be a later acquisition concurrent with the symbiogenesis.
Although the only known group of +RNA bacteriophages, the leviviruses, apparently have not contributed to the origin of the bulk of the eukaryotic +RNA viruses, they did give rise to two distinct, small lineages of the eukaryotic viruses (Fig. 2). Searches for the most closely related homologs of the leviviral RdRps identified the RdRps of these two narrow groups, fungal Narnaviridae and plant Ourmiavirus, as the eukaryotic descendants of the leviviruses. The narnaviruses hardly meet the narrow definition of viruses because they neither are infectious nor possess an extracellular encapsidated form (Hillman and Cai, 2013). The entire replication cycle of the narnaviruses of the genus Mitovirus takes place within fungal mitochondria. Given the origin of the mitochondria from an alphaproteobacterial endosymbiont, it appears most likely that the ancestral narnavirus evolved from an RNA bacteriophage brought along by the protomitochondrion, by losing the capsid and thus switching to the status of a mitochondrial RNA plasmid. In contrast, plant ourmiaviruses are full-fledged, infectious, encapsidated +RNA plant viruses. Because their RdRps are related to those of narnaviruses, whereas the intercellular movement and possibly capsid proteins are related to respective proteins of tombusviruses, it has been proposed that ourmiaviruses evolved via recombination between a narnavirus-like element from a plant-pathogenic fungus and a tombusvirus (Rastgou et al., 2009).

dsRNA viruses: Multiple origins from positive-strand RNA viruses

The dsRNA viruses of eukaryotes appear to be much less diverse than +RNA viruses as follows from the numbers of currently recognized families (10 versus 31, respectively; Supplementary Table S2). However, the recent accelerated pace of discovery of new, diverse dsRNA viruses might soon challenge this perception (Liu et al., 2012a, 2012b). Early phylogenetic analyses of the RdRps led to the conclusion that the dsRNA viruses originated on multiple occasions, mainly from different groups of +RNA viruses (Koonin, 1992; Koonin et al., 1989). The inclusion of two families of dsRNA viruses, Totiviridae and Partitiviridae, into the picornavirus-like superfamily is in full accord with this evolutionary scenario. The viruses in the family Birnaviridae share an unusual permuted RdRp, a genome-linked protein and a distinct variant of the JRC with some of the tetraviruses (the family Tetraviridae has been recently split into three distinct families, namely Alphatetraviridae, Carmotetraviridae and Permutotetraviridae; Table S1), supporting a common origin of these families of dsRNA and +RNA viruses at an early stage of the evolution of the alphavirus-like superfamily (Fig. 2) (Gorbalenya et al., 2002; Zeddam et al., 2010). Notably, the divergence of birnaviruses from tetraviruses has apparently occurred following the acquisition of the JRC protein gene by their common ancestor from a nodavirus (Wang et al., 2012a). The family of capsid-less viruses Endornaviridae that is currently classified with dsRNA viruses clearly evolved from an alphavirus-like ancestor as indicated by the conservation of a signature set of core replication genes (Koonin and Dolja, 2014).
Evolutionary scenarios based on the phylogenetic analysis of viral replication proteins often deviate from those centered on the evolution of other functional modules, in particular those of viral capsid proteins (Krupovic and Bamford, 2008, 2009). Thus, for a comprehensive reconstruction of virus evolution that would reflect the intrinsic modularity of this process, it is essential to complement phylogenetic and comparative genomic analyses with the analysis of structural data (Koonin et al., 2009). The emerging picture of the evolution of dsRNA viruses is among the best illustrations of this general principle.
Structural analyses have shown that eukaryotic dsRNA viruses from the families Picobirnaviridae, Chrysoviridae, Totiviridae, Partitiviridae, Reoviridae and bacteriophages of the family Cystoviridae employ related capsid proteins to build their unique T = 1 icosahedral capsids from 60 asymmetrical CP dimers (El Omari et al., 2013; Janssen et al., 2015; Luque et al., 2014; Poranen and Bamford, 2012). Based on comparisons of the virion and CP structures, it has been proposed that reoviruses are most closely related to cystoviruses whereas picobirnaviruses, partitiviruses, and totiviruses form another, distant branch of dsRNA viruses (El Omari et al., 2013); additionally, the CP of chrysoviruses has been concluded to be most closely related to that of totiviruses (Luque et al., 2014). Thus, bacterial cystoviruses appear to have contributed the structural genes to most of the dsRNA viruses infecting eukaryotes. The reoviruses, the largest family of dsRNA viruses that infect diverse eukaryotic hosts (Fig. 2 and Supplementary Table S2), appear to be direct descendants of the cystoviruses. In contrast, in the evolution of picobirnaviruses, partitiviruses, totiviruses, chrysoviruses and the related megabirnaviruses the pivotal event was recombination (or more likely, multiple, independent recombination events) with members of the picornavirus-like superfamily of +RNA viruses, resulting in chimeric genomes encoding cystovirus-derived capsid proteins and picornavirus-like RdRps (Fig. 2).
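The subunit count implied by a T = 1 capsid built from 60 asymmetrical CP dimers can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard Caspar-Klug convention in which an icosahedral lattice with triangulation number T = h^2 + hk + k^2 offers 60T positions; the figure of one dimer per position is taken from the text.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2 (assumed convention)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsid_protein_copies(t_number: int, copies_per_position: int = 1) -> int:
    """Total CP chains: 60 * T lattice positions, each holding `copies_per_position` chains."""
    if t_number < 1 or copies_per_position < 1:
        raise ValueError("both arguments must be positive")
    return 60 * t_number * copies_per_position


# T = 1 lattice (h = 1, k = 0) with an asymmetrical CP dimer at each position
t = triangulation_number(1, 0)
print(capsid_protein_copies(t, copies_per_position=2))  # 60 dimers -> 120 CP chains
```

Under these assumptions the shared capsid architecture corresponds to 120 CP chains per particle, twice the 60 chains of a conventional T = 1 shell.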
The global ecology of the dsRNA viruses appears rather peculiar. Unlike most of the families of +RNA viruses that are confined to relatively narrow host ranges (e.g., arthropods for Iflaviridae, vertebrates for Picornaviridae and plants for Secoviridae), extremely diverse hosts are often infected by the dsRNA viruses from the same family. As a case in point, the family Reoviridae includes viruses that infect vertebrates, arthropods, mollusks, fungi, flowering plants and a unicellular green alga. Likewise, Partitiviridae infect fungi, flowering plants and an apicomplexan unicellular eukaryote, whereas the host range of Totiviridae includes fungi and several unicellular eukaryotic parasites from the Excavate supergroup (King et al., 2011). Such ecological patterns, including two or three supergroups of eukaryotic hosts for each of the three largest families of the dsRNA viruses, point to their ancient origins from the dsRNA bacteriophage and picornavirus-like ancestors as discussed above (Fig. 2).
The role of HVT in the evolution of the dsRNA viruses is most apparent for the family Endornaviridae where the plant and fungal virus branches in the phylogenetic trees of viral RdRps often intermingle within the same cluster (Roossinck et al., 2011). A contribution of HVT appears likely also in the evolution of reoviruses many of which, both from vertebrates and from plants, are also capable of infecting their arthropod vectors (Ng and Falk, 2006; Quito-Avila et al., 2012) that could serve as HVT intermediaries. Thus, phylogenetic, structural, and host range analyses converge in supporting the major theme in the evolution of the dsRNA viruses: ancient polyphyletic origin from dsRNA bacteriophages or distinct groups of +RNA virus ancestors, or via recombination between these distinct types of ancestors. The current spread of the dsRNA viruses, however, could have been substantially affected by more recent HVT events.

Negative-strand RNA viruses: The emerging positive-strand connection

Negative-strand RNA viruses of eukaryotes include the order Mononegavirales that consists of three related virus families with non-segmented genomes and 5 families of viruses with segmented genomes (Supplementary Table S3). For a long time, the evolutionary origin of the –RNA viruses had been veiled in mystery due to the highly derived sequences of their RdRps (Tordo et al., 1988; Xiong and Eickbush, 1990) and the lack of readily identified homologs for other proteins, with the exception of the capping enzyme of Mononegavirales, which is also extremely diverged from all homologs (Bujnicki and Rychlewski, 2002; Li et al., 2008). The narrow host ranges of –RNA viruses, limited to animals and plants, imply relatively recent evolutionary origin. Furthermore, it has been proposed that –RNA viruses of plants were acquired from animals via HVT (Dolja and Koonin, 2011). This scenario is compatible with the markedly higher diversity and prevalence of the animal –RNA viruses compared to the relative scarcity of these viruses in plants. The protein sequences, as well as virion and genome architectures, are highly similar between animal and plant viruses in the families Rhabdoviridae and Bunyaviridae. Furthermore, arthropod parasites of animals and plants could have readily served as HVT vehicles because both plant and animal rhabdoviruses and bunyaviruses are transmitted by and replicate in their arthropod vectors (Ammar el et al., 2009; Guu et al., 2012). The discovery of four –RNA viruses that infect soybean cyst nematodes further expands the ecological reach of these viruses within the animal lineage of evolution (Bekal et al., 2011). This finding suggests a potential major route of animal-to-plant HVT of –RNA viruses given that the nematodes, many of which are plant parasites, are the most numerous animals on earth (Blaxter et al., 1998). Notably, two of these novel viruses are most closely related to bunyaviruses, and one to rhabdoviruses, the two –RNA virus families that include members infecting either animals or plants.
A major insight into the origin of –RNA viruses came from the recently solved crystal structure of the Influenza A virus RdRp that has revealed striking similarity to the structure of the flavivirus RdRps (Pflug et al., 2014). This finding strongly suggests that –RNA viruses evolved from a +RNA ancestor of the flavivirus-like superfamily but diverged from the ancestral forms beyond recognition at the sequence level due to the switch to a radically different replication cycle. Although influenza RdRp is also structurally similar to the RdRp of dsRNA bacteriophages (cystoviruses), a direct evolutionary connection seems unlikely given the significantly lower similarity than that with the flavivirus RdRp and the apparent relatively late emergence of the –RNA viruses (see above). This reasoning is further buttressed by the recent identification of a nematode-infecting flavi-like virus (Bekal et al., 2014) which suggests that nematodes could have played the role of a melting pot in which the progenitor of the –RNA viruses was conceived and that also played a key role in the spread of these viruses to new hosts. Further in-depth phylogenetic and structural analyses of the proteins encoded by flavi-like viruses and –RNA viruses are required to develop the proposed evolutionary scenario in more detail.
Given the accumulating evidence of the origin of both dsRNA viruses and –RNA viruses from different groups of +RNA viruses, the ancestor of the picorna-like viruses appears to have been the ultimate progenitor of the great majority of eukaryotic RNA viruses. Whether this ancestral picorna-like virus was assembled from several distinct building blocks of bacterial origin during eukaryogenesis (Fig. 2) or evolved as a continuous lineage from the primordial gene pool is an intriguing and important question. The answer critically depends on the choice of the scenario for the origin of eukaryotes that hopefully will be informed by the further advances of archaeal and bacterial genomics. Regardless of the impending solution to this key problem, a limited footprint of RNA bacteriophages on the evolution of eukaryotic RNA viruses is apparent in the origin of narnaviruses and ourmiaviruses from leviviruses, and most likely, reoviruses from cystoviruses.

Synopsis on eukaryotic RNA virome

To recapitulate the key points on the eukaryotic RNA virome, the enormous diversity of RNA viruses is a hallmark of the eukaryotic part of the virus world. We are far from a full understanding of the underlying causes of this remarkable bloom of RNA viruses but it stands to reason that the eukaryotic cytosol, with its extensive endomembrane system, provides a niche that is highly conducive to RNA replication. There is sufficient evidence to derive the great majority of eukaryotic RNA viruses from a common, positive-strand ancestor that might have been assembled from several components with distinct roots in prokaryotes including a reverse transcriptase. In contrast, several isolated groups of eukaryotic RNA viruses derive directly from bacterial RNA viral ancestors. The striking diversification of RNA viruses in eukaryotes, in part, depended on switches in genome replication-expression strategies (from positive-strand to double-stranded and negative-stranded genomes) and multiple exchanges of genes between far diverged groups of viruses.

Retroelements and retroviruses: Viruses as derived forms

An extremely common and abundant class of selfish elements in eukaryotes consists of reverse-transcribing elements (or retroelements for short), including retroviruses. Similar to the case of RNA viruses, the single common denominator of these extremely diverse elements is the polymerase involved in their replication, in this case, the reverse transcriptase (RT) which defines the key feature of the
reproduction cycle, namely reverse transcription of RNA into DNA (Eickbush and Jamburuthugoda, 2008; Finnegan, 2012; Kazazian, 2004; Xiong and Eickbush, 1990). Beyond this unifying step, retroelements show all conceivable reproduction strategies: some behave like mobile elements that jump around host genomes via reverse transcription and integration, and regularly degrade to become integral parts of the host genomes; others behave as DNA or RNA plasmids; yet others, the best-characterized ones, are bona fide viruses that pack in the virions either RNA or DNA, or even a DNA–RNA hybrid, and go through an essential or facultative stage of integration into the host genome during virus replication. Although all retroelements are relatively small, their genomic complexity varies greatly, from solo RT to sophisticated build-ups of viral genomes with over 10 genes, for example in the case of HIV.
Given that the RT is the only universal gene among the retroelements, a natural approach to the reconstruction of their evolution involves using a phylogenetic tree of the RT as a framework. Phylogenetic analysis (Gladyshev and Arkhipova, 2011) divides the RTs into four major branches that include: (1) retroelements from prokaryotes including Group II self-splicing introns and retrons, (2) LINE-like elements, (3) Penelope-like elements, (4) reverse-transcribing viruses and related retrotransposons that contain Long Terminal Repeats (LTR) (Fig. 3). Historically, all retroelements, with the exception of reverse-transcribing viruses and their relatives, are often called non-LTR retrotransposons. The 4 main branches of RTs as well as several branches within each of them (see below) are well resolved but the position of the root is not known.
The archaeal and bacterial retroelements that comprise one of the 4 major clades in the RT tree (Fig. 3) include 3 well-characterized groups of bacterial retroelements (represented also in some archaea): (i) Group II introns, (ii) retrons and (iii) diversity-generating retroelements (DGR) (Robart and Zimmerly, 2005; Simon and Zimmerly, 2008; Toro and Nisa-Martínez, 2014). The fourth group in this clade of RTs includes the so-called retroplasmids that replicate in fungal mitochondria, and given the endosymbiotic origin of the mitochondria, are likely to be of bacterial origin (Griffiths, 1995). In addition, analysis of bacterial and archaeal genomes revealed many RTs of unclear provenance that are likely to constitute or derive from uncharacterized retroelements (Simon and Zimmerly, 2008).
The Group II self-splicing introns are by far the most common retroelements in archaea and bacteria representing over 70% of the RTs detected by a survey of bacterial and archaeal genomes, and are the only group of prokaryotic retroelements with demonstrated independent horizontal mobility (Lambowitz and Zimmerly, 2004, 2011; Simon and Zimmerly, 2008). In addition to bacteria and some archaea, Group II introns are commonly present in mitochondrial genomes of fungi, plants and some protists. The large protein encoded in Group II introns, in addition to the RT, encompasses an endonuclease domain that is involved in transposition. This endonuclease domain belongs to the HNH family which is one of the nucleases frequently encoded also in Group I introns (Stoddard, 2005). Thus, from the evolutionary standpoint, Group II introns are likely to have evolved from self-splicing, endonuclease-encoding introns (similar in architecture to Group I introns but with a distinct
Fig. 3. Evolution of retroelements and reverse-transcribing viruses. Genomic organizations of selected representatives of the major groups of retroelements overlay the
phylogenetic tree of the reverse transcriptases. The topology of the tree is from (Gladyshev and Arkhipova, 2011). Abbreviations: DGR, diversity-generating
retroelements; X/D/E, maturase, DNA binding, and endonuclease domains, respectively, of the intron-encoded protein; mtd, major tropism determinant; atd, accessory
tropism determinant; brt, bacteriophage reverse transcriptase; LINE, long interspersed nuclear elements; END, endonuclease; ZK, zinc knuckle; gag, group-specific
antigen; env, envelope; pol, polymerase; PR, aspartate protease; RT, reverse transcriptase; RH, RNase H; INT, integrase; CHR, chromodomain; MA, matrix protein; CA/Cp,
capsid protein; NC, nucleocapsid; 6, 6-kDa protein; vif, vpr, vpu, tat, rev, and nef, regulatory proteins encoded by spliced mRNAs; gp120 and gp41, the 120- (surface) and
41-kDa (transmembrane) glycoproteins; ATF, aphid transmission factor; VAP, virion-associated protein; TT/SR, translation trans-activator/suppressor of RNA
interference; TP, terminal protein; P, polymerase; PreS, pre-surface protein (envelope); PX/TA, protein X/transcription activator; trbd, telomerase RNA-binding domain;
cc, coiled-coil.
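The Pol-related abbreviations in Fig. 3 (PR, RT, RH, INT, TP) lend themselves to a simple presence/absence comparison. The following is a minimal sketch, assuming the ancestral PR-RT-RH-INT arrangement and the family-level losses and gains proposed in this review; the table layout and the function name `domain_changes` are illustrative assumptions, not part of the original analysis.

```python
# Hypothetical illustration: Pol domain content per family, taken from the
# scenario discussed in the review (not from a sequence analysis).
ANCESTRAL_POL = ("PR", "RT", "RH", "INT")  # proposed ancestral arrangement

POL_DOMAINS = {
    "Retroviridae": {"PR", "RT", "RH", "INT"},
    "Metaviridae": {"PR", "RT", "RH", "INT"},
    "Caulimoviridae": {"PR", "RT", "RH"},    # integrase lost
    "Hepadnaviridae": {"TP", "RT", "RH"},    # integrase and protease lost, TP gained
}


def domain_changes(ancestor, descendant):
    """Return (lost, gained) domain lists relative to the ancestral architecture."""
    lost = [d for d in ancestor if d not in descendant]
    gained = sorted(d for d in descendant if d not in ancestor)
    return lost, gained


if __name__ == "__main__":
    for family, domains in POL_DOMAINS.items():
        lost, gained = domain_changes(ANCESTRAL_POL, domains)
        print(f"{family}: lost={lost}, gained={gained}")
```

Under these assumptions, only the RT and RH domains are retained in every family, which matches the strictly conserved RT-RH module described in the text.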
12 E.V. Koonin et al. / Virology 479-480 (2015) 2–25
ribozyme structure) that acquired an RT gene resulting in a more
autonomous reproduction strategy.
Retrons are retroelements that consist of a solo RT gene and
are vertically inherited in bacteria, suggestive of some ‘normal’
function(s) in bacterial cells; to date, however, there is no indication
of the nature of such a presumptive function of the
retrons (Lampson et al., 2005). The RT of the retrons makes
multiple copies of a branched RNA-DNA hybrid but accumulation
of these unusual molecules does not result in any discernible
phenotype in the bacteria.
The DGRs are unusual retroelements that are present in some
bacteriophage and bacterial genomes and have been shown to
employ the RT to modify specific target genes and accordingly
their protein products in a specific fashion resulting in changes in
phage receptor specificity, helping the phage to evade bacterial
resistance (Medhekar and Miller, 2007).
Bacterial retroelements, primarily Group II introns, have reached
substantial diversity, with several distinct groups revealed by
phylogenetic analysis, and invaded most of the bacterial divisions
(Simon et al., 2008). In contrast, in archaea, the spread of these
elements is restricted to a few groups of mesophiles, such as
Methanosarcina, that appear to have acquired numerous bacterial
genes via HGT. The same route has been proposed for the retroelements
(Rest and Mindell, 2003).
In stark contrast to the prokaryotic retroelements that are rather
sparsely represented among bacteria, are rare in archaea and do not
reach high copy numbers, diverse eukaryotic genomes are replete
with retroelements of different varieties. By conservative estimates,
retroelement-derived sequences account for over 50% of mammalian
genomes (mostly non-LTR elements) and over 75% of some plant
genomes, e.g. maize (Defraia and Slotkin, 2014; Lee and Kim, 2014;
Solyom and Kazazian, 2012). Although usually not reaching such
extravagant excesses, retroelements are abundant also in genomes of
diverse unicellular eukaryotes (Bhattacharya et al., 2002; Lorenzi et
al., 2008). The eukaryotic retroelements show limited diversity of the
RT sequences compared to the prokaryotic retroelements, which is in
sharp contrast with the enormous diversity of genome organizations
and reproduction strategies. We discuss these elements in accord
with their branching in the phylogenetic trees of the RTs (Fig. 3).
Penelope-like retroelements (PLE) are simple retrotransposons
that typically encode a single large protein that in the originally
discovered group of PLE is a fusion of the RT with a GIY-YIG
endonuclease (Fig. 3) (Evgen’ev, 2013; Lyozin et al., 2001). This
complete form of PLE so far has been identified only in animals.
However, shorter PLE variants that lack the endonuclease are
integrated in subtelomeric regions of chromosomes in a broad variety
of eukaryotes (Gladyshev and Arkhipova, 2011). In the phylogenetic
tree of the RT, the PLE confidently cluster with the telomerase RT
(TERT), a pan-eukaryotic enzyme that is essential for the replication of
the ends of linear chromosomes (Chan and Blackburn, 2004). This
relationship implies that the PLE-like branch of retroelements antedates
the LECA although the complete, endonuclease-encoding PLE
apparently evolved later. The recruitment of the PLE-related RT for the
telomerase function clearly was an early, pivotal event during the
evolution of the eukaryotic cell. Remarkably, several groups of
eukaryotes, in particular insects, have lost the TERT gene and instead
use a distinct variety of non-LTR retrotransposons as telomeric repeats
(Pardue and DeBaryshe, 2011). Thus, it seems that retroelements
provide for the replication of chromosome ends in all eukaryotes
thanks to their intrinsic ability to generate sequence repeats.
The GIY-YIG endonuclease domains are widely represented in
Group I introns and are also present in the repair endonuclease
UvrC that is strongly conserved among bacteria (Aravind et al.,
1999). These endonuclease domains are small and highly diverged,
so establishing evolutionary relationships is difficult. Nevertheless,
it is interesting to note that the Penelope endonuclease domain
shows the strongest similarity to GIY-YIG endonucleases from
Group I introns of some large DNA viruses such as phycodnaviruses
(Van Etten, 2003). Thus, the complete forms of PLE found
in animals might have evolved by fusing a viral intron-encoded
endonuclease domain to the ancestral RT.
The LINE elements (Long Interspersed Nuclear Elements) comprise
another group of simple retroelements that appear to be
both the most common retroelements in eukaryotes, being represented
in the genomes of diverse organisms of all major eukaryotic
groups, and the most abundant among the extant retroelements as
they reach extremely high copy numbers in animal genomes (de
Koning et al., 2011; Kazazian, 2004). Most of these LINE elements
are inactivated and decaying but a small fraction remains active
and spawns new copies. In addition, the active LINE RTs mediate
the retrotransposition of SINEs (such as the Alu elements that are
extremely abundant in primate genomes), small elements that lack
any protein-coding genes but still follow the retrotransposon life
style and propagate to extremely high numbers in animal genomes
(de Koning et al., 2011).
A typical, complete vertebrate LINE consists of two genes, one of
which encodes the RT and endonuclease domains whereas the
second one encodes an RNA-binding domain that is required for
transposition. The RTs of the LINEs form two distinct branches in the
phylogenetic tree (Fig. 3), and the respective elements also encode
distinct endonucleases. The ‘classic’ LINEs including all elements
found in mammals encode an apurinic/apyrimidinic (AP) endonuclease
that also possesses RNase H activity and is essential for transposition.
In contrast, a subset of LINEs from diverse eukaryotes
encode a bona fide RNase H (Fig. 3). Although some phylogenetic
analyses suggest that RNase H is a late acquisition in the history of
non-LTR retroelements (Malik, 2005), it does not appear possible to
rule out that this is the ancestral architecture among the LINEs.
Another branch of LINEs encodes an RLE (Restriction-like Endonuclease)
domain that, similar to the AP endonuclease, introduces a
nick into the target and thus initiates transposition (Mandal et al.,
2004; Yang et al., 1999). Furthermore, comparative analysis of the
LINEs in plants has shown that, in addition to the AP endonuclease, a
group of these elements acquired a distinct RNase H domain,
surprisingly, of apparent archaeal origin (Smyshlyaev et al., 2013).
In the phylogenetic tree of the RT (Fig. 3), the LINEs cluster (albeit
with limited statistical support) with a recently discovered distinct
group of RTs (denoted RVT) that contain no identifiable domains other
than the RT proper, are not currently known to behave as mobile
elements, are present in a single copy in the genomes of diverse
eukaryotes, and hence are likely to fulfill some still uncharacterized
function(s) in eukaryotic cells. Members of the RVT group have been
identified also in several bacterial genomes, suggesting the possibility
of horizontal gene transfer, the direction of which remains uncertain
(Gladyshev and Arkhipova, 2011).
Among the RT-elements, bona fide viruses, with genomes
encased in virus particles, and typical infection cycles including an
extracellular phase, are a minority (Supplementary Table S4). Importantly,
capsid-less retroelements are found in all major divisions of
cellular organisms, and by inference, are ancestral to this entire class
of genetic elements. By contrast, reverse-transcribing viruses are
derived forms that apparently evolved at an early stage in the
evolution of eukaryotes (see below).
The reproduction strategy of the retroviruses (family Retroviridae)
partly resembles that of RNA viruses, combining aspects
analogous to both positive-strand RNA viruses and negative-strand
RNA viruses. The retroviruses are effectively RNA viruses that have
evolved the capacity to convert to DNA, integrate into the host
genome and then exploit the host replication and transcription
machinery. In addition to the typical infectious retroviruses, vertebrate
genomes carry numerous endogenous retroviruses that are
largely transmitted vertically and are often inactivated by
mutation but, until that happens, have the potential to get
activated and yield infectious virus (Stoye, 2012; Weiss, 2013).
The two other families of reverse-transcribing viruses, Hepadnaviridae
infecting animals and Caulimoviridae infecting plants (collectively
often denoted pararetroviruses), have ventured further into the DNA
world: these viruses package the DNA form of the genome (or
sometimes a DNA–RNA hybrid, in the case of hepadnaviruses) into the virions
but retain the reverse transcription stage in the reproduction cycle
(Nassal, 2008; Rothnie et al., 1994; Seeger and Hu, 1997). In contrast to
the retroviruses, for viruses of these families, integration into the host
genome is not an essential stage of the reproduction cycle although
apparent spurious integration is common among caulimoviruses
(Harper et al., 2002; Staginnus and Richert-Poggeler, 2006). The
remaining two families of reverse-transcribing viruses, Metaviridae
and Pseudoviridae, include RT-encoding elements that are traditionally
not even considered viruses but rather retrotransposons because they
normally do not infect new cells, although it has been suggested that
Gypsy elements of Drosophila are infectious (Kim et al., 1994; Song
et al., 1994). In any case, these elements, e.g. Gypsy/Ty3-like elements
(Metaviridae) in animals or Copia/Ty1-like elements in fungi (Pseudoviridae),
encode virion proteins and form particles, and thus meet the
definition of a virus.
Among all retroelements, the reverse-transcribing viruses possess
the most complex genomes (Fig. 3). All retroviruses share 3 major
genes that are traditionally denoted pol, gag and env, and in many
cases, also additional, variable genes. The retrovirus RT is a domain of
the Pol polyprotein. In the viral branch of retroelements, the strictly
conserved module consists of the RT together with the RNase H (RH)
domain that is essential for the removal of the RNA strand during the
synthesis of the DNA provirus. Two other domains, integrase and
aspartic protease, are found only in a subset of Pol polyproteins.
However, superposition of the domain architectures of the Pol
polyproteins over the phylogenetic tree of the RTs strongly suggests
that the common ancestor of the reverse-transcribing viruses
encoded the complex form of Pol, most likely one with the PR-RT-RH-INT
arrangement that is shared between retroviruses and metaviruses
(Fig. 3). The phylogenies of the RT, RH and INT domains of
reverse-transcribing viruses appear to be concordant and cluster
metaviruses with retroviruses to the exclusion of pseudoviruses
(Malik and Eickbush, 1999), in agreement with the RT phylogeny in
Fig. 3 and the above evolutionary scenario. Under this scenario,
caulimoviruses have lost the integrase domain whereas hepadnaviruses
have lost both the integrase and the protease but acquired
the terminal protein domain that is involved in the initiation of DNA
synthesis.
A more complete phylogenetic analysis of the RNase H that
involved also the RH from non-LTR retroelements of the LINE branch
as well as bacterial and eukaryotic RNH I indicated, first, that the
non-LTR retroelements in eukaryotes were older than the LTR
elements, and second, quite unexpectedly, that in retroviruses, the
ancestral RH apparently was secondarily replaced with the eukaryotic
homolog (Malik and Eickbush, 2001). The ultimate origin of the
RH in retroelements is not easy to decipher because, for this short
domain, the topology of the deep branches in the tree is unreliable.
However, a “smoking gun” has been detected that links the RH in
retroelements with eukaryotic homologs, namely a distinct DNA–RNA
hybrid and dsRNA-binding domain that is shared by eukaryotic
RNH I and a subset of the retroelement RH (Majorek et al., 2014;
Smyshlyaev et al., 2013). The presence of this derived shared
character indicates that the retroelements have acquired a eukaryotic
RNH I at an early stage of their evolution.
The INT domain of the LTR retroelements belongs to the DDE
family of transposases (named after the distinct catalytic triad) that
mediate transposition of numerous DNA transposons in eukaryotes
and prokaryotes (Nesmelova and Hackett, 2010). Therefore, it has
been proposed that the founder of the LTR retrotransposon branch
emerged as a result of recombination between a non-LTR retrotransposon
and a DNA transposon (Capy and Maisonhaute, 2002;
Malik and Eickbush, 2001). Notably, the Gypsy/Ty3 retrotransposons
have acquired a chromodomain (a widespread domain involved in
chromatin remodeling in eukaryotes) that is fused to the integrase of
these elements and modulates the specificity of integration (Novikova
et al., 2010).
The aspartic protease of the LTR retroelements is homologous
to the pan-eukaryotic protein DDI1, an essential, ubiquitin-dependent
regulator of the cell cycle whereas DDI1 itself appears
to have been derived from a distinct group of bacterial aspartyl
proteases (Krylov and Koonin, 2001; Sirkis et al., 2006). Thus,
strikingly, the ancestral Pol polyprotein of the LTR retroelements
seems to have evolved through assembly from 4 distinct components,
only one of which, the RT, derives from a pre-existing
retroelement.
Apart from the Pol polyprotein, the relationships between genes
in different groups of reverse-transcribing viruses are convoluted.
The capsid protein domain of the Gag polyprotein is conserved
between retroviruses and the Ty3/Gypsy metaviruses. The conserved
region of the nucleocapsid (NC) protein consists of a distinct C2HC
Zn-knuckle that at least in retroviruses is involved in RNA and DNA
binding (Darlix et al., 2014). The retroviral capsid (CA) protein
contains a conserved C-terminal α-helical domain known as SCAN
that mediates protein dimerization (Ivanov et al., 2005). Phylogenetic
analysis of the conserved portion of Gag suggests that the 3 classes of
retroviruses evolved from 3 distinct lineages of metaviruses as
suggested by the so-called “three kings” hypothesis (Llorens et al.,
2008). However, it is unclear whether the Gag-like protein of Copia/Ty1
(pseudoviruses) is homologous as well, and the
ultimate origin of this protein outside of the retroelements is equally
unclear. Although
homologs of the Gag proteins in animals have been discovered and
shown to be important in development, the respective genes apparently
have been transferred from retroviruses to the host genomes
(Kaneko-Ishino and Ishino, 2012).
Strikingly, in the evolution of retroviruses, the env genes have
been apparently acquired by LTR retrotransposons on at least three
independent occasions from different groups of RNA and DNA
viruses: gypsy/metaviruses have acquired their env-like gene from
insect baculoviruses (dsDNA viruses); the envelope genes of the
Cer retroelements in the Caenorhabditis elegans genome appear to
derive from a phlebovirus (negative-strand RNA virus) source; and the Tas
retroviral envelope (Ascaris lumbricoides) might have been obtained
from herpesviruses (dsDNA viruses) (Malik et al., 2000). The origin
of the env genes of the vertebrate retroviruses that appear not to
be homologous to any of the above env genes remains obscure.
Interestingly, however, in vertebrate retroviruses, such as HIV, the
gp41 domain of env is a class I fusion protein which is also found
in many RNA viruses, including orthomyxoviruses, paramyxoviruses,
coronaviruses, filoviruses and arenaviruses (Kielian and
Rey, 2006; White et al., 2008). Thus, despite the lack of a readily
traceable ancestral relationship, it is conceivable that vertebrate
retroviruses assembled their env proteins from preexisting
protein domains of other eukaryotic viruses.
Caulimoviruses and especially hepadnaviruses are highly derived
forms that apparently have lost and/or displaced several genes of the
ancestral reverse-transcribing virus, with the exception of RT and RH,
and also PR in the case of caulimoviruses (Fig. 3). In addition, the
capsid proteins of caulimoviruses share the C2HC Zn-knuckle with the
NCs of retroviruses and metaviruses (Covey, 1986). Thus, at least one
domain of the ancestral nucleocapsid protein of reverse-transcribing
viruses survives in caulimoviruses. In contrast, the core protein of
hepadnaviruses shows no significant sequence similarity to capsid
proteins of retroviruses or caulimoviruses, and might be a displacement
of uncertain provenance. However, based on similar dimerization
principles and sequence conservation patterns, it has been suggested
that the capsid protein of hepadnaviruses and the C-terminal
domain of retroviral CA actually are distant homologs (Steven
et al., 2005).
The origins of the family-specific genes of reverse-transcribing
viruses remain uncertain, with the notable exception of the
movement protein (MP) of caulimoviruses. The MP is conserved
in a great variety of plant viruses including positive-strand RNA
viruses, negative-strand RNA viruses and ssDNA viruses. Clearly,
the MP gene horizontally spread among diverse viruses driven by
selection for the ability to cross plasmodesmata and hence cause
systemic infection in plants (Koonin et al., 1991b; Melcher, 2000;
Mushegian and Elena, 2015; Mushegian and Koonin, 1993). A
much better known, textbook case of viral genes with a clear
provenance are the oncogenes of numerous animal retroviruses
(e.g. such thoroughly characterized oncogenes as v-src, v-ras or
v-abl) which are mutated versions of host genes involved in cell
cycle control that cause cell transformation when expressed from
an integrated DNA copy of the viral genome (Maeda et al., 2008).
Most likely, retroelements have been an integral part of biological
systems since the stage of the primordial replicators when they gave
rise to the first DNA genomes (Koonin, 2009). Indeed, under the RNA
World scenario, the transition to DNA genomes would necessarily
require reverse transcription, with the implication that some varieties
of retroelements already existed at that stage of evolution. However, in
prokaryotes, retroelements maintain a low profile and never attain
complex genomic architectures. In eukaryotes, the fortunes of retroelements
have turned around: they proliferated dramatically, have
become a defining factor of genome evolution and spawned several
families of reverse-transcribing viruses. The wide spread of each of the
major groups of retroelements across the diversity of eukaryotes
indicates that the principal events in the evolution of retroelements
occurred before the radiation of the eukaryotic supergroups. The PLE
appear to be the best candidates for the role of the founder eukaryotic
retroelements that gave rise to other simple, widespread non-LTR
elements, such as the LINEs, as well as fully ‘domesticated’ RTs such as
TERT and RVT that are conserved throughout the eukaryotic domain. A
much more complex series of events led to the emergence of the LTR
retroelements (in particular, reverse-transcribing viruses) including
highly derived forms such as caulimoviruses and hepadnaviruses.
The parsimonious version of the scenario for the origin of the
eukaryotic retroelements depends on the scenario for the origin of
eukaryotes. The symbiogenetic scenario would root the entire diversity
of the eukaryotic retroelements in prokaryotic ones, most likely, Group
II introns. This origin of the eukaryotic retroelements appears compatible
with the ancestral relationship between Group II introns and the
eukaryotic spliceosomal introns (that have lost both protein-coding
genes and the self-splicing capacity) as well as the snRNAs, the
catalytic components of the spliceosome (Chalamcharla et al., 2010; Dai
et al., 2008; Lambowitz and Zimmerly, 2011; Robart et al., 2014; Toor
et al., 2008). Remarkably, the essential, highly conserved (yet functionally
poorly characterized) pan-eukaryotic protein subunit of the
spliceosome, Prp8, also is an inactivated RT derivative that most likely
evolved from the Group II intron RT (Dlakic and Mushegian, 2011).
Thus, under the symbiogenetic scenario, prokaryotic retroelements
provide intermediates between the primordial genetic pool and the
diversity of the eukaryotic retroelements. In contrast, the protoeukaryote
scenario implies that both prokaryotic and eukaryotic retroelements
are direct descendants of primordial genetic entities that
adopted distinct routes of evolution in prokaryotes and eukaryotes.
The sequence variability of the prokaryotic RTs is extremely high,
with only the essential motifs of the RT domain conserved throughout,
by far exceeding the variation among the eukaryotic retroelements
(Simon and Zimmerly, 2008). This greater sequence diversity of the
RTs in prokaryotes, despite their relatively low abundance, seems to be
compatible with the origin of all eukaryotic retroelements from a
distinct branch of prokaryotic retroelements, such as Group II introns.
Furthermore, given the apparent origin of the eukaryotic splicing from
Group II introns, the symbiogenetic scenario seems to offer a simpler
evolutionary narrative than the protoeukaryotic scenario. Regardless,
the remarkable diversification of the retroelements in eukaryotes
could have been triggered by the (typically) weaker purifying selection
compared to prokaryotes which allowed for the massive proliferation
of integrated retroelements and provided the playground for their
further evolution (Lynch, 2007; Lynch and Conery, 2003).

Synopsis on eukaryotic retroelements

To summarize, the retroelements enjoyed no less success in
eukaryotes than RNA viruses with which they could share the
ultimate common origin from prokaryotic Group II elements (self-splicing
introns). However, bona fide reverse-transcribing viruses
are derived forms and show limited diversity. Notably, although all
these viruses share a common origin, they seem to have acquired
the envelope proteins from different sources and on independent
occasions. Retroelements including reverse-transcribing viruses
evolve in a much closer integration with the eukaryotic hosts than
RNA viruses and sequences from these elements have been
extensively recruited by eukaryotes for a variety of cellular functions
at all stages of evolution.

Origins of ssDNA viruses of eukaryotes: Multiple crosses between
plasmids and RNA viruses

Viruses with ssDNA genomes are increasingly appreciated as a
rapidly expanding, highly diverse class of economically, medically and
ecologically important pathogens. They infect hosts from all three
domains of cellular life and are present in all conceivable environments,
from near-surface atmosphere (Whon et al., 2012) to soil (Kim
et al., 2008), from freshwater and marine habitats (Labonte and Suttle,
2013; Rosario et al., 2009; Roux et al., 2012; Zawar-Reza et al., 2014) to
the most extreme settings, such as terrestrial hot springs (Mochizuki
et al., 2012). Bacterial and archaeal ssDNA viruses are grouped into
four families, whereas the eukaryotic ssDNA viruses are classified into
6 families, Anelloviridae, Bidnaviridae, Circoviridae, Geminiviridae,
Nanoviridae and Parvoviridae, and one unassigned genus (Bacilladnavirus)
(Supplementary Table S5). Anelloviruses appear to be restricted
to various mammals (Okamoto, 2009); circoviruses are known to
infect different avian species and pigs (Delwart and Li, 2012); nanoviruses
and geminiviruses infect plants (Grigoras et al., 2014; Hanley-Bowdoin
et al., 2013); parvoviruses replicate in vertebrates and
arthropods (Cotmore et al., 2014); bidnaviruses are restricted to
insects (Hu et al., 2013); bacilladnaviruses replicate in marine algae
(Nagasaki et al., 2005), whereas members of the proposed genus
“Gemycircularvirus” infect fungi (Jiang et al., 2013). Thus, ssDNA
viruses prey on a wide range of eukaryotic hosts; however, numerous
metagenomic and paleovirological studies suggest that the host range
of eukaryotic ssDNA viruses might be even considerably broader
(Labonte and Suttle, 2013; Rosario et al., 2012).
All eukaryotic ssDNA viruses, except for the members of the
family Bidnaviridae (see below), replicate their genomes using a
rolling-circle (or rolling-hairpin) mechanism which involves nicking
of the viral genome by a virus-encoded rolling-circle replication
initiation endonuclease, RC-Rep. The same replication mechanism is
also used by most prokaryotic ssDNA viruses, many plasmids and
some transposons (Chandler et al., 2013; Krupovic, 2013; Krupovic
and Forterre, 2015; Rosario et al., 2012). Perhaps unexpectedly, the
RC-Reps of eukaryotic ssDNA viruses bear only limited similarity to
the RC-Reps of bacterial and archaeal ssDNA viruses. The RC-Reps of
eukaryotic ssDNA viruses show a distinct two-domain organization
(Koonin and Ilyina, 1993) (Fig. 4): the N-terminal endonuclease
domain is followed by the S3H domain which is required for genome
replication as well as other processes, such as viral genome
encapsidation (King et al., 2001). By contrast, none of the known
prokaryotic ssDNA viruses encodes an S3H domain, whereas the
endonuclease domains are not significantly similar to those of
eukaryotic viruses, except for the short regions encompassing the
three diagnostic sequence motifs that are common to all endonucleases
of the HUH superfamily (Chandler et al., 2013; Ilyina and
Koonin, 1992; Koonin and Ilyina, 1993) and the overall shared
structural fold (Fig. 4). Thus, it appears extremely unlikely that ssDNA
viruses of eukaryotes are direct descendants of their prokaryotic
counterparts; the distantly related endonuclease domains involved in
the mechanistically similar replication initiation processes probably
were acquired independently and from different sources.
In contrast, the eukaryotic ssDNA viruses share the endonuclease-helicase
domain architecture with the RC-Reps of various bacterial
plasmids (Fig. 4). Furthermore, RC-Reps from different families of
eukaryotic ssDNA viruses are typically more similar to homologs
from different groups of bacterial plasmids than they are to each
other, suggesting a close evolutionary relationship between bacterial
plasmids and eukaryotic ssDNA viruses (Koonin and Ilyina, 1992). In
particular, RC-Reps of geminiviruses and fungal gemycircularviruses
cluster in phylogenetic trees with the homologous proteins encoded
by plasmids of phytoplasmas (parasitic wall-less bacteria replicating
in plant and insect cells) rather than the RC-Reps of other plant
or animal ssDNA viruses, such as nanoviruses and circoviruses
(Krupovic et al., 2009; Liu et al., 2011). Accordingly, it has
been hypothesized that geminiviruses have evolved from bacterial
replicons (Koonin and Ilyina, 1992), and specifically, from phytoplasmal
plasmids (Krupovic et al., 2009). In contrast, RC-Reps of
circoviruses show closer similarity to proteins from a different group
of bacterial plasmids, represented by the plasmid p4M of Bifidobacterium
pseudocatenulatum (Gibbs et al., 2006; Krupovic et al., 2009).
Furthermore, phylogenetic analysis of the RC-Rep encoded by an
uncultivated Gastropod-associated circular ssDNA virus (GaCSV),
isolated from the mollusk Amphibola crenata, showed that the viral
protein is nested within the clade containing RC-Reps of bacterial
origin (Dayaram et al., 2013). A striking, independent finding that is
compatible with an evolutionary relationship between bacterial RC
replicons and eukaryotic ssDNA viruses is that genomes of certain
plant geminiviruses retain functional bacterial promoters and can
replicate in different bacterial cells in an RC-Rep-dependent manner
(Rigden et al., 1996; Selth et al., 2002; Wang et al., 2013; Wu et al.,
2007). Although it is usually difficult to pinpoint the exact origin of
viral RC-Reps, the above examples strongly suggest that RC-Reps of
eukaryotic ssDNA viruses are polyphyletic and their roots are in
different groups of bacterial plasmids.
The key step in the transformation of a plasmid into a virus is the
acquisition of the genetic determinants allowing genome encapsidation
and inter-cellular transfer. Indeed, some cryptic bacterial RC
plasmids encode a single protein, the RC-Rep, and thus the only
qualitative difference between such plasmids and the simplest eukaryotic
ssDNA viruses, such as circoviruses, is the presence of a capsid
protein (CP) gene in the latter (Krupovic and Bamford, 2010). All
Fig. 4. The conserved RC-Rep proteins of ssDNA viruses and their homologs: key motifs, domain architectures and structures. (A) The catalytic motifs of the nicking
endonuclease and superfamily 3 helicase (S3H) domains. Note the absence of the S3H domain in the prokaryotic ssDNA viruses and the inactivation of the endonuclease
domain in the dsDNA-containing papillomaviruses and polyomaviruses. (B) Homologous structures of the endonuclease domains. The structures are colored according to the
secondary structure elements: α-helices, blue; β-strands, green. Abbreviations and PDB accession numbers: PCV2, porcine circovirus type 2 (2HW0); TYLCV, tomato yellow
leaf curl virus (1L2M); FBNYV, faba bean necrotic yellows virus (2HWT); AAV5, adeno-associated virus type 5 (1M55); SV40, simian virus 40 (1TBD); BPV, bovine papilloma
virus (1F08).
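The domain-architecture comparison summarized in Fig. 4 can be expressed as a small lookup table. This is a minimal sketch, assuming only the presence/absence statements made in the text and in the caption; the labels, the `RC_REP_DOMAINS` table and the `group_by_architecture` helper are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical summary table: state of the two RC-Rep domains per group,
# following the text and the Fig. 4 caption (not derived from sequence data).
RC_REP_DOMAINS = {
    "eukaryotic ssDNA viruses": {"endonuclease": "active", "helicase": "present"},
    "bacterial RC plasmids (various)": {"endonuclease": "active", "helicase": "present"},
    "prokaryotic ssDNA viruses": {"endonuclease": "active", "helicase": "absent"},
    "papillomaviruses and polyomaviruses": {"endonuclease": "inactivated", "helicase": "present"},
}


def group_by_architecture(table):
    """Group replicons that share the same (endonuclease, helicase) state."""
    groups = defaultdict(list)
    for replicon, domains in table.items():
        key = (domains["endonuclease"], domains["helicase"])
        groups[key].append(replicon)
    return dict(groups)


if __name__ == "__main__":
    for architecture, replicons in group_by_architecture(RC_REP_DOMAINS).items():
        print(architecture, "->", replicons)
```

Grouping in this way places the eukaryotic ssDNA viruses with the bacterial plasmids rather than with the prokaryotic ssDNA viruses, which is the pattern the text takes as evidence for a plasmid origin of the viral RC-Reps.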
eukaryotic ssDNA viruses, for which structural information is available or the fold of the CP could be inferred using in silico analyses, possess structurally similar CPs with the jelly-roll fold (Krupovic, 2012, 2013). As discussed above, the jelly-roll fold is the most common fold in the CPs of icosahedral +RNA viruses and is also found in CPs of some dsRNA viruses (Fig. 5) (Koonin et al., 2008; Krupovic, 2013; Rossmann and Johnson, 1989). Strikingly, CPs of some ssDNA viruses are more similar to the CPs of +RNA viruses than they are to the CPs of other ssDNA viruses, mirroring the relationships between the viral and plasmid RC-Reps. For example, the CP of geminiviruses is most closely related to the CP from satellite tobacco necrosis virus (STNV; Fig. 5) (Bottcher et al., 2004; Krupovic et al., 2009; Zhang et al., 2001). Thus, the genomes of eukaryotic ssDNA viruses appear to be chimeras composed of RC-Rep genes inherited from bacterial plasmids and CP genes derived from different groups of +RNA viruses (Fig. 6). The exact circumstances under which bacterial plasmids crossed paths with eukaryotic +RNA viruses and gave rise to ssDNA viruses remain obscure. It is clear, however, that each such event would involve recombination between two unrelated RNA and DNA replicons. Recent findings discussed below indicate that such RNA-DNA recombination occasionally does take place and indeed is likely to play an important role in the emergence of new virus types.

Metagenomic exploration of viral diversity in the Boiling Springs Lake (BSL) at Lassen Park, California, has led to the discovery of a novel ssDNA viral genome (Diemer and Stedman, 2012). This virus, named BSL RDHV (RNA-DNA hybrid virus), encodes an RC-Rep closely related to those of circoviruses and a CP which, unexpectedly, is not related to circoviral CPs but instead has a domain organization specific to CPs of icosahedral +RNA viruses of the family Tombusviridae (Diemer and Stedman, 2012). Subsequent discovery of many additional BSL RDHV-like genomes enabled a more detailed analysis of this peculiar virus group, dubbed chimeric viruses (CHIV) (Roux et al., 2013). It has been shown that in the history of the CHIV group, there was a single event of CP gene acquisition from an RNA virus, followed by a recurrent replacement of the RC-Rep genes as well as gene fragments in CHIVs with distant counterparts from diverse ssDNA viruses representing three families, Circoviridae, Nanoviridae and Geminiviridae (Krupovic et al., 2015; Roux et al., 2013). Thus, recombination between contemporary RNA and DNA viruses appears to be relatively common, and a similar event or, more likely, several independent events involving different groups of bacterial RC plasmids and RNA viruses, gave rise to the ancestors of eukaryotic ssDNA viruses (Krupovic, 2013; Stedman, 2013) (Fig. 6).

Once in existence, eukaryotic ssDNA viruses have undergone substantial diversification, giving rise to several new groups of viruses and other mobile genetic elements. One of the most striking examples of such diversification is presented by members of the family Bidnaviridae. Bidnaviruses do not encode RC-Reps and accordingly do not replicate by the rolling-circle mechanism; instead, these viruses encode protein-primed family B DNA polymerases (Hu et al., 2013). Recent reconstruction of the evolutionary history of these insect viruses has shown that in all likelihood, they evolved from an insect-infecting parvovirus ancestor (Krupovic and Koonin, 2014). The key event in the evolution of bidnaviruses involved replacement of the typical parvovirus-like RC-Rep gene with a family B DNA polymerase gene acquired from large, virus-like DNA transposons of the Polinton/Maverick superfamily (see below), followed by acquisition of additional genes from insect baculoviruses that have dsDNA genomes and reoviruses that contain segmented dsRNA genomes (Krupovic and Koonin, 2014). Evolution of bidnaviruses from genes of four widely different groups of viruses is a striking example emphasizing the central role of recombination and genomic plasticity in virus evolution.

Many groups of prokaryotic and eukaryotic ssDNA viruses have the ability to integrate into the genomes of their cellular hosts. In bacterial and archaeal viruses, this process is mediated by dedicated integrases or transposases. By contrast, integration of eukaryotic ssDNA virus
Fig. 5. Homologous single jelly-roll structures of the capsid proteins of RNA and DNA viruses of eukaryotes. The structures are colored according to the secondary structure
elements: α-helices, blue; β-strands, green. Depicted structures and their PDB accession numbers: Tymoviridae, turnip yellow mosaic virus (1AUY); Picornaviridae, rhinovirus
16 (1ND2); Satellite virus, satellite tobacco necrosis virus (2BUK); Birnaviridae, infectious bursal disease virus (1WCD); Circoviridae, porcine circovirus type 2 (3R0R);
Parvoviridae, adeno-associated virus type 2 (1LP3); Papillomaviridae, human papillomavirus 16 (1DZL); Polyomaviridae, simian virus 40 (1SVA).
Fig. 6. Evolution of ssDNA viruses of eukaryotes: polyphyletic origin from different plasmids and multiple cases of recombination with ssRNA viruses. Abbreviations: JRC,
jelly roll capsid protein; pPolB, protein-primed DNA polymerase of family B; RC-Rep, rolling circle replication protein. Different colors of JRC and RC-Rep denote distinct
variants of the respective genes.
genomes primarily depends on the endonuclease activity of their RC-Reps (Krupovic and Forterre, 2015; Liu et al., 2011). Whereas most groups of eukaryotic ssDNA viruses integrate only sporadically, some have evolved towards more aggressive proliferation within host genomes, akin to transposable elements. For example, a group of parvovirus-like transposons, encoding both CP and RC-Rep proteins, has been discovered in the genome of the acorn worm, Saccoglossus kowalevskii, where these putative transposons are present in over 50 copies per genome (Liu et al., 2011). Some ssDNA viruses have apparently abandoned the virus-like propagation in favor of the transposon-like lifestyle: elements encoding parvoviral RC-Reps (but lacking the CP genes) and flanked by typical terminal inverted repeat sequences have been identified in the genomes of Hydra magnipapillata and Schmidtea mediterranea in over 400 and 100 copies per genome, respectively (Liu et al., 2011).

Yet another distinct evolutionary trajectory leads from ssDNA viruses to small dsDNA viruses of the families Papillomaviridae and Polyomaviridae. From their ssDNA virus ancestors, members of both these families inherited genes for capsid and replication proteins (Figs. 4 and 5), albeit both underwent major modifications (see below in the section on the origin of eukaryotic dsDNA viruses).

Synopsis on ssDNA virus origins

Taken together, the results of comparative genomic analysis clearly indicate that eukaryotic ssDNA viruses evolved on several independent occasions from bacterial plasmids via acquisition of CP genes from pre-existing +RNA viruses (Fig. 6). This scenario is neutral with respect to the two eukaryogenesis scenarios outlined above because it predicts de novo origin of ssDNA viruses postdating the emergence of eukaryotes. Considering that plasmid-carrying bacteria often establish mutualistic and parasitic interactions with diverse modern eukaryotes or simply serve as a food source for the latter (in the case of grazing protists), different groups of ssDNA viruses probably emerged at different time points during eukaryal evolution. Some groups, such as parvoviruses, could have arisen before the radiation of major eukaryotic kingdoms, whereas other lineages, such as bidnaviruses, have a more recent history. Mixing-and-matching of different functional modules from widely different plasmid and virus groups representing both RNA and DNA virospheres is an ongoing process which continues to generate new groups of ssDNA viruses (Krupovic, 2013; Stedman, 2013). The extent of gene shuffling is such that it can completely obliterate the ancestral evolutionary signal, as in the case of CHIVs, where original genes for both CP and RC-Rep have been replaced in some of the viruses. Furthermore, during the course of evolution, ssDNA viruses have taken different evolutionary paths which allowed them to explore diverse replication mechanisms, including a switch to dsDNA genomes, expand the host range and occasionally step away from the bona fide viral propagation and switch to transposon-like lifestyles, reversibly or otherwise.

Origins and primary diversification of eukaryotic dsDNA viruses: The bacteriophage and transposable element connections

Compared to RNA viruses and retroelements, dsDNA viruses and mobile elements are somewhat less diverse and less abundant in eukaryotes but nevertheless have been identified in all major eukaryotic groups. All in all, there are 18 formally recognized families of dsDNA viruses and many unclassified viruses that infect a broad spectrum of unicellular and multicellular hosts and span almost the entire range of viral genome sizes, from about 4 kb to almost 2.5 Mb (Supplementary Table S6).

By far the largest and most common group of DNA viruses in eukaryotes (Supplementary Table S6) consists of 7 families of large and giant viruses including mimiviruses and pandoraviruses, with genomes in the megabase range. All these viruses that infect diverse eukaryotes including animals and a variety of protists are thought to share a common ancestry as indicated by the conservation of a substantial number of genes encoding essential proteins involved in viral genome replication and virion formation. Although only 5 genes are strictly conserved in all viruses of this group, maximum likelihood evolutionary reconstructions led to the inference of an ancestral gene set consisting of approximately 50 genes (Iyer et al., 2001, 2006; Koonin and Yutin, 2010). This major group of eukaryotic viruses has become known as the Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) (Iyer et al., 2001) or more recently, the proposed order “Megavirales” (Colson et al., 2013).

The viruses of the family Mimiviridae are hosts to a distinct class of satellite viruses, the virophages, that reproduce within the viral “factories” inside protist cells infected by the giant virus and depend on the latter for their replication (Claverie and Abergel, 2009; Desnues et al., 2012; Krupovic and Cvirkaite-Krupovic, 2011; La Scola et al., 2008). Recently, an evolutionary connection between the virophages and large eukaryotic dsDNA transposons of the
Polinton/Maverick group (hereinafter Polintons) has been identified (Fischer and Suttle, 2011; Yutin et al., 2013). The Polintons are common in diverse unicellular protists and animals (Krupovic and Koonin, 2015), indicative of their ancient origin, perhaps concomitant with the origin of eukaryotes. Recently, it has been shown that the majority of the Polintons encode two proteins homologous to the version of the JRC that is typical of the capsids of icosahedral dsDNA viruses that infect bacteria, eukaryotes and some archaea (double beta-barrel) (Krupovic et al., 2014). All key structural elements of the capsid proteins are preserved in the Polinton-encoded homologs, suggesting that these proteins are indeed functional. The Polintons also encode two proteins that are essential for morphogenesis in members of the “Megavirales”, namely an FtsK-like ATPase and a Ulp1-like protease. The presence of these genes, together with those for capsid proteins, leaves little doubt that, under some still unknown conditions, the Polintons actually produce virions that might possess the ability to infect new hosts (Krupovic et al., 2014). Thus, the Polintons, perhaps to be renamed Polintoviruses (the term we use hereinafter), combine central features of viruses and transposons, and seem to represent the second major group of eukaryotic dsDNA viruses, after the “Megavirales”, that infect numerous hosts across the entire eukaryotic diversity (Krupovic and Koonin, 2015).

Polintoviruses share blocks of homologous genes with diverse viruses, transposons and plasmids (Krupovic and Koonin, 2015). In particular, bacteriophages of the family Tectiviridae, Polintons and the Mavirus virophage all share 4 genes encoding two capsid proteins, DNA-packaging ATPase and protein-primed DNA polymerase (pPolB). The Polintoviruses share two additional genes with the Mavirus, namely those for the capsid maturation protease and the RVE integrase, whereas the rest of the virophages also encode the capsid proteins, ATPase and protease, but lack pPolB and the integrase (Yutin et al., 2013). Adenoviruses join this network of related viruses through pPolB, the two capsid proteins and the protease, whereas the much larger “Megavirales” connect through the capsid proteins, the ATPase and the protease. Thus, the morphogenetic module is the common denominator that links all these diverse families of viruses. The yeast linear cytoplasmic plasmids (Klassen and Meinhardt, 2007) provide additional connections between Polintons and the incomparably more complex members of the “Megavirales”: these plasmids lack the morphogenetic module but encode pPolB along with four key proteins required for cytoplasmic transcription that are conserved in most of the “Megavirales”.

The multiple connections between the Polintoviruses and various other groups of viruses and plasmids have prompted a unifying scenario under which Polintoviruses were the first group of eukaryotic dsDNA viruses that, on different occasions, gave rise to several groups of eukaryotic viruses, transposons and plasmids (Fig. 7) (Krupovic and Koonin, 2015). The Polintoviruses most likely evolved from bacteriophages of the family Tectiviridae that entered the protoeukaryotic cell along with the α-proteobacterial endosymbiont, the ancestor of the mitochondria (Fig. 7). This scenario is compatible with the presence of linear plasmids that encode pPolB in fungal mitochondria (Handa, 2008). In phylogenetic trees, these pPolBs form a deep branch that is distinct from the rest of the eukaryotic plasmids and viruses, suggestive of early divergence of the descendants of the ancestral tectivirus into mitochondrial and cytoplasmic or nuclear lineages of mobile elements (Krupovic and Koonin, 2015).

The key event in the evolution of the Polintoviruses from the ancestral tectivirus apparently was the acquisition of the RVE family integrase and the Ulp1-like cysteine protease, conceivably via a single recombination event with a eukaryotic Ginger 1-like transposon (Bao et al., 2010; Krupovic and Koonin, 2015) (Fig. 7). The capture of the integrase was pivotal in the evolution of the Polintoviruses, endowing them with the ability to combine two alternative lifestyles, those typical of transposable elements and bona fide viruses. This “bet-hedging” strategy, that is also characteristic of Mu-like bacteriophages and eukaryotic Ty1-copia retrotransposons (pseudoviruses) and Ty3-gypsy retrotransposons (metaviruses) (Koonin and Dolja, 2014; Krupovic et al., 2011a; Sandmeyer and Menees, 1996) (and see above), would provide the flexibility of parasite-host relationships that conceivably underlies the diversification and successful spread of Polintoviruses in diverse eukaryotes.

Some Polintoviruses apparently abandoned the virus lifestyle after losing the genes involved in virion formation and became pure transposons (it seems prudent to reserve the term Polintons for these elements) (Krupovic and Koonin, 2015). Adenoviruses followed the opposite course of evolution, having lost the integrase gene and thereby committing to the strict viral lifestyle. Polintons also contributed the pPolB gene to the evolution of a distinct family of ssDNA viruses, the Bidnaviridae, which emerged via extensive gene shuffling between four groups of selfish elements (Krupovic and Koonin, 2014) (and see above).

The “Megavirales”, the largest, most diverse group of eukaryotic dsDNA viruses, apparently inherited from the Polintoviruses the virion morphogenesis module including the major and minor capsid proteins, genome packaging ATPase and maturation protease (Krupovic and Koonin, 2015). Among the numerous double-JRC proteins, the predicted major capsid protein of the Polintoviruses is most similar to the capsid proteins of phycodnaviruses (Krupovic et al., 2014), suggesting a direct evolutionary link between Polintoviruses and the “Megavirales”. Although the packaging ATPases and the maturation proteases are highly diverged, the topologies of the respective phylogenetic trees are compatible with the Polintovirus-“Megavirales” link (Yutin et al., 2013).

Polintons reside in the nucleus of the host cell, and most likely, their predicted viral forms, the Polintoviruses, also reproduce in the nucleus and thus rely on the host enzymatic machinery for transcription. A key event in the evolution of the “Megavirales” was the escape from the nucleus, most likely concomitant with the acquisition of the RNA polymerase and the capping apparatus from the host. The escaped element that would replicate in the cytoplasm using the ancestral Polinton pPolB spawned two groups of mobile elements, namely cytoplasmic plasmids (surviving in fungi) and the “Megavirales” that share with these plasmids the distinct three-domain capping enzyme, two RNA polymerase subunits and the D11-like helicase (Krupovic and Koonin, 2015). The cytoplasmic plasmids retain pPolB but have lost the morphogenesis module and are thus restricted to the intracellular lifestyle. By contrast, evolution of the “Megavirales” took the route of increasing complexity and autonomy from host functions. The major events in the evolution of “Megavirales” from the putative cytoplasmic Polintovirus-like ancestor include the displacement of pPolB with an RNA/DNA-primed PolB and acquisition of the D5-like helicase-primase (Krupovic and Koonin, 2015). It seems likely that pPolB that initiates DNA replication at genome termini cannot efficiently replicate genomes above a certain threshold (probably, about 45 kb, as in adenoviruses). Replication of larger genomes would become efficient upon the recruitment of a dedicated primase-helicase. Some Polintons encode divergent D5-like primases-helicases that typically cluster in phylogenetic trees with the primases-helicases of the “Megavirales” (Yutin et al., 2013). Several additional genes that belong to the inferred ancestral gene set of the “Megavirales” are also shared with various Polintons (Yutin et al., 2013). Thus, Polintoviruses could have donated a substantial fraction of the ancestral genes of the “Megavirales”. A notable exception is the PolB gene that replaced the ancestral pPolB and most likely was acquired from the eukaryotic host (Yutin and Koonin, 2012). The acquisition of this form of PolB, jointly with the primase-helicase, provided the opportunity for almost unlimited genome expansion in the “Megavirales”, yielding the giant viruses.
Fig. 7. Evolution of large dsDNA viruses of eukaryotes from two distinct groups of bacteriophages. The dotted line with a question mark shows a tenuous evolutionary
relationship. The host ranges of the eukaryotic virus groups are color-coded as shown in the inset. The hatched yellow square for the virophages indicates that these viruses
parasitize on the giant viruses of the family Mimiviridae which themselves infect amoeba and other protists. For each family of large eukaryotic viruses, a simplified schematic
depiction of the virion structure is included.
A radically different scenario of the origin of the giant viruses among the “Megavirales”, such as the mimiviruses and pandoraviruses, has been proposed on the strength of their microbe-like size and genomic complexity, and most important, the presence of genes encoding some components of the translation system, such as several aminoacyl-tRNA synthetases, that are universally present in cellular life forms (Koonin, 2003). The initial and subsequent phylogenetic analysis of these universal genes has suggested that the giant viruses did not fall into any of the three domains of cellular life (bacteria, archaea and eukaryotes) and prompted the hypothesis that these viruses evolved by reductive evolution from a hypothetical (conceivably, extinct) cellular domain (Colson et al., 2012, 2011; Nasir et al., 2012; Raoult et al., 2004). However, independent phylogenetic studies that employed representative sets of cellular life forms from the three domains and more advanced phylogenetic methods have effectively refuted the fourth domain hypothesis by showing that nearly all the universal genes of the giant viruses were nested within the eukaryotic domain of the respective phylogenetic trees (Williams et al., 2011; Yutin et al., 2014). Moreover, in different groups of giant viruses, these genes were affiliated with different eukaryotes, suggestive of independent acquisition. Consistent with this conclusion, reconstruction of the evolution of the gene repertoire of the “Megavirales” indicates that the giant viruses most likely evolved from smaller viruses in this group via the acquisition of numerous genes from different sources and gene duplication (Filee, 2013; Yutin and Koonin, 2013; Yutin et al., 2014). Thus, notwithstanding their complexity that is unprecedented in the virus world, the giant viruses share a common history with the rest of the “Megavirales” and thus ultimately appear to have evolved from Polintoviruses.

The virophages retain many ancestral features of the Polintoviruses, in particular the complete morphogenesis module. Unlike the ancestors of the “Megavirales”, these smaller viruses have not
acquired the molecular machinery required for the reproduction in the cytoplasm of the host cells and instead evolved to parasitize on their giant relatives by exploiting their transcription apparatus and other functions (Claverie and Abergel, 2009; Desnues et al., 2012; Fischer and Suttle, 2011; Krupovic and Cvirkaite-Krupovic, 2011).

Ten recognized families of eukaryotic dsDNA viruses do not show a clear evolutionary relationship to the Polintovirus-centered assemblage of the eukaryotic dsDNA viruses (Supplementary Table S6) (Koonin et al., 2015). All these viruses have narrow host ranges compared to the “Megavirales”, mostly infecting members of a particular animal phylum such as chordates or arthropods. The evolution of these viruses so far has not been reconstructed in a comprehensive manner as has been the case with the “Megavirales”. Nevertheless, some general trends have become apparent.

Five families of large eukaryotic dsDNA viruses, namely Baculoviridae, Hytrosaviridae, Nimaviridae, Nudiviridae, and Polydnaviridae, so far have been isolated exclusively from arthropods. Although these viruses, particularly the latter three families, mostly encode highly diverged (presumably, fast-evolving) protein sequences and are currently represented by only a few genomes each, phylogenomic analysis suggests that they comprise a monophyletic group, with several signature genes that are not found in other viruses (Jehle et al., 2013; Wang et al., 2012b; Wang and Jehle, 2009). Polydnaviruses represent a unique group of viruses that are only vertically transmitted, with the virus genomes permanently integrated in the genomes of the insect hosts. Nevertheless, even in this unusual case, phylogenetic analysis of the retained viral genes indicates that polydnaviruses are highly derived descendants of nudiviruses (Herniou et al., 2013; Theze et al., 2011). Preliminary phylogenetic analysis of several essential genes that are shared by all these arthropod viruses and the “Megavirales”, such as PolB, RNAP subunits, helicase-primase and thiol oxidoreductase, has suggested that this group of viruses might be a highly derived offshoot of the “Megavirales” (Wang et al., 2012b) (Fig. 7). However, this remains but a tentative clue until a comprehensive study on the evolution of these unusual viruses is performed.

The highly diversified order Herpesvirales is of special interest from the standpoint of virus evolution because of a distinct connection with tailed viruses of the order Caudovirales which includes three families, namely Siphoviridae, Podoviridae and Myoviridae. Caudovirales are nearly ubiquitous in Bacteria (Ackermann and Prangishvili, 2012) and are also present in diverse orders of Archaea, including the deeply branching archaeal phylum Thaumarchaeota (Krupovic et al., 2011b). The putative bacterial or archaeal virus ancestors of the herpesviruses are unrelated to the tectiviruses, the likely ancestors of the Polintovirus-related majority of eukaryotic dsDNA viruses (Fig. 7). Herpesviruses share with the Caudovirales homologous major capsid proteins of the HK97 fold that is unrelated to the double jelly-roll fold present in the capsid proteins of numerous groups of icosahedral viruses (including the Polintovirus-centered assemblage), terminases (packaging ATPases-nucleases), and capsid maturation proteases as well as several other proteins (Pietila et al., 2013; Selvarajan Sigamani et al., 2013; Baker et al., 2005; Krupovic and Bamford, 2011; Krupovic et al., 2010; Rixon and Schmid, 2014). Thus, tailed prokaryotic viruses and herpesviruses share a complex and unique virion assembly and maturation program which is not found in other dsDNA viruses.

The apparent bacteriophage origin of the herpesvirus morphogenesis module that consists of a capsid protein, an ATPase and a protease is a striking parallel with the similar evolutionary route of the Polintovirus ancestor but the actual proteins involved are unrelated (or in the case of the ATPase, distantly related). This evolutionary parallelism clearly reflects a general trend in the origins of the largest, most complex viruses of eukaryotes. Somewhat ironically, bacteriophages of the order Caudovirales, which are the most common viruses on Earth, gave rise to a single (even if diverse) group of eukaryotic dsDNA viruses, whereas the bulk of eukaryotic dsDNA viruses seem to originate from the narrowly spread tectiviruses. Conceivably, the key event behind the success of the Polintoviruses that defined the wide spread of their descendants was the acquisition of the transposase (see above). Furthermore, the fact that herpesviruses seem to be limited to animal hosts might indicate that this group of viruses emerged relatively late in the course of eukaryotic evolution, with the ancestor bacteriophage coming not from the proto-mitochondrion but from a distinct (perhaps transient) bacterial symbiont of early animals. Paradoxically, however, the proto-mitochondrial symbiont apparently did contain a provirus derived from a tailed bacteriophage and this provirus had a significant effect on the evolution of mitochondria: in modern mitochondria, ancestral bacterial genes for RNA polymerase, DNA polymerase and DNA primase have all been replaced with the counterparts from the resident prophage early in eukaryogenesis (Filee and Forterre, 2005; Shutt and Gray, 2006).

Finally, the two families of dsDNA viruses with small, circular genomes, Papillomaviridae and Polyomaviridae, appear to have evolved via a route that is completely distinct from the origins of all larger dsDNA viruses of eukaryotes. The capsids of papillomaviruses and polyomaviruses are constructed from JRC proteins homologous to those of eukaryotic ssDNA viruses (Fig. 5). Furthermore, the single multidomain replicative protein of these viruses, known as the large T antigen in polyomaviruses and the E1 protein in papillomaviruses, is homologous to the replication proteins of ssDNA viruses, such as circoviruses, nanoviruses, parvoviruses and geminiviruses (Fig. 4 and see above). This large protein has a typical domain architecture consisting of an S3H and a rolling circle replication initiation endonuclease that, however, is inactivated in papillomaviruses and polyomaviruses (Fig. 4). This inactivation of the key enzyme of RCR is concomitant with the switch from rolling circle to the “theta-like” replication mode and from ssDNA to dsDNA genome (Ilyina and Koonin, 1992; Iyer et al., 2005). Thus, the small dsDNA viruses of eukaryotes apparently are derivatives of ssDNA viruses which themselves evolved via recombination of bacterial rolling circle-replicating plasmids and ssRNA viruses (see above).

Synopsis of dsDNA virus evolution

Overall, the emerging picture of the origin of dsDNA viruses of eukaryotes reveals three readily identifiable bacterial roots (Fig. 7; see also Fig. 6). Two of these lines of descent come from distinct groups of bacteriophages and gave rise to the majority of large eukaryotic viruses, whereas the third one comes from plasmids and yielded the two families of small dsDNA viruses that actually are derivatives of ssDNA viruses. There is no evidence of a direct contribution of viruses infecting archaea to the emergence of the eukaryotic virome, despite the remarkable diversity and abundance of archaeal dsDNA viruses (Prangishvili, 2013; Prangishvili et al., 2006a, 2006b) (a caveat to be addressed in future studies is that most of the current knowledge on archaeal viruses comes from hyperthermophilic Crenarchaeota, not from mesophilic members of the TACK superphylum which seem to be the likely ancestors of eukaryotes). Given this demonstrable bacterial ancestry, the reconstruction of the evolution of eukaryotic dsDNA viruses seems to be best compatible with the symbiogenetic scenario of eukaryogenesis. Acquisition of DNA polymerases and primases from the eukaryotic hosts opened the route of genome expansion to the evolving dsDNA viruses, resulting in acquisition of numerous genes from the hosts and exaptation (recruitment) of the acquired genes for virus-host interaction.

Conclusions

The recent dramatic expansion of the collection of viral genome sequences, combined with the concerted efforts in evolutionary genomics, translates into a new level of understanding of the origins
of the major groups of eukaryotic viruses and the key events in their evolution. We can now delineate both the major general trends in the evolution of eukaryotic viruses and specific scenarios for different virus classes. One of the most striking trends is the distinct composition of the eukaryotic virome compared to the viromes of archaea and bacteria, namely, the high prevalence and enormous diversity of RNA viruses. It might be tempting to directly derive the eukaryotic RNA virome from the hypothetical primordial RNA world, but the plausibility of this link depends on the adopted scenario for the origin of eukaryotes. The primordial origin of eukaryotic RNA viruses appears to be compatible with the protoeukaryotic but not with the symbiogenetic scenario. If, under the latter scenario, the host of the mitochondrial endosymbiont was a typical archaeon, the existence of a diverse RNA virome in such an organism appears exceedingly unlikely. Instead, a more circuitous path to the eukaryotic RNA virome would have to be postulated, with traceable contributions from bacterial retroelements as well as bona fide bacterial genes. This type of chimeric origin is a pervasive theme in the evolution of all classes of eukaryotic viruses and is particularly apparent in the emerging histories of dsRNA viruses, ssDNA viruses and dsDNA viruses. Strikingly, in each of these cases, the morphogenetic and replication-expression modules appear to be of different evolutionary provenances, and recombination between these distinct modules gave rise to a novel type of virus. At least in some cases, the recombination of modules and the spread of individual genes, such as the movement protein gene in plants, seem to have a clear adaptive value by opening up a major new niche for viruses with different replication-expression strategies and virion structures.

Another major trend in the evolution of the viruses of eukaryotes is the pervasive evolutionary connection between bona fide viruses and non-viral mobile genetic elements, such as transposons and plasmids. These non-viral elements appear to have made major contributions to the evolution of all classes of eukaryotic viruses as well as the hosts. Furthermore, elements with a dual life style, such as metaviruses and pseudoviruses as well as polintoviruses (polintons), appear to have played central roles in the evolution of the retroviruses and large dsDNA viruses of eukaryotes, respectively. Perhaps the most remarkable aspect of the evolution of the viruses of eukaryotes is that it seems to be tractable, at least in its central features.

Acknowledgments

The authors thank David Karlin and Tero Ahola for the kind permission to cite the results of their work before publication. EVK is supported by the intramural funds of the US Department of Health and Human Services (National Library of Medicine).

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.virol.2015.02.039.
Covey, S.N., 1986. Amino acid sequence homology in gag region of reverse
transcribing elements and the coat protein gene of cauliflower mosaic virus.
References Nucleic Acids Res. 14 (2), 623–633.
Culley, A.I., Lang, A.S., Suttle, C.A., 2006. Metagenomic analysis of coastal RNA virus
communities. Science 312 (5781), 1795–1798.
Ackermann, H.W., Prangishvili, D., 2012. Prokaryote viruses studied by electron Culley, A.I., Mueller, J.A., Belcaid, M., Wood-Charlson, E.M., Poisson, G., Steward, G.F.,
microscopy. Arch. Virol. 157 (10), 1843–1849. 2014. The characterization of RNA viruses in tropical seawater using targeted
Adriaenssens, E.M., Edwards, R., Nash, J.H., Mahadevan, P., Seto, D., Ackermann, H.W., PCR and metagenomics. MBio 5 (3), e01210–e01214.
Lavigne, R., Kropinski, A.M., 2014. Integration of genomic and proteomic analyses Culley, A.I., Steward, G.F., 2007. New genera of RNA viruses in subtropical seawater,
in the classification of the Siphoviridae family. Virology. inferred from polymerase gene sequences. Appl. Environ. Microbiol. 73 (18),
Agol, V.I., 1974. Towards the system of viruses. Biosystems 6 (2), 113–132. 5937–5944.
Ahola, T., Karlin, D.G., 2015. Sequence analysis reveals a conserved extension in the Dai, L., Chai, D., Gu, S.Q., Gabel, J., Noskov, S.Y., Blocker, F.J., Lambowitz, A.M.,
methyltransferase guanylyltransferase of the alphavirus supergroup, and a Zimmerly, S., 2008. A three-dimensional model of a group II intron RNA and its
homologous domain in the nodavirus supergroup. Biol. Direct, in press. interaction with the intron-encoded reverse transcriptase. Mol. Cell 30 (4),
Ammar el, D., Tsai, C.W., Whitfield, A.E., Redinbaugh, M.G., Hogenhout, S.A., 2009. 472–485.
Cellular and molecular aspects of rhabdovirus interactions with insect and Darlix, J.L., de Rocquigny, H., Mauffret, O., Mely, Y., 2014. Retrospective on the all-in-
plant hosts. Annu. Rev. Entomol. 54, 447–468. one retroviral nucleocapsid protein. Virus Res. 193, 2–15.
Dayaram, A., Goldstien, S., Zawar-Reza, P., Gomez, C., Harding, J.S., Varsani, A., 2013. site is internally permuted in viral RNA-dependent RNA polymerases of an
Novel ssDNA virus recovered from estuarine Mollusc (Amphibola crenata) ancient lineage. J. Mol. Biol. 324 (1), 47–62.
whose replication associated protein (Rep) shares similarities with Rep-like Greninger, A.L., 2015. Picornavirus-host interactions to construct viral secretory
sequences of bacterial origin. J. Gen. Virol. 94 (Pt 5), 1104–1110. membranes. Prog. Mol. Biol. Transl. Sci. 129, 189–212.
de Koning, A.P., Gu, W., Castoe, T.A., Batzer, M.A., Pollock, D.D., 2011. Repetitive Griffiths, A.J., 1995. Natural plasmids of filamentous fungi. Microbiol. Rev. 59 (4),
elements may comprise over two-thirds of the human genome. PLoS Genet. 7 673–685.
(12), e1002384. Grigoras, I., Ginzo, A.I., Martin, D.P., Varsani, A., Romero, J., Mammadov, A.,
Defraia, C., Slotkin, R.K., 2014. Analysis of retrotransposon activity in plants. Huseynova, I.M., Aliyev, J.A., Kheyr-Pour, A., Huss, H., Ziebell, H., Timchenko,
Methods Mol. Biol. 1112, 195–210. T., Vetten, H.J., Gronenborn, B., 2014. Genome diversity and evidence of
Delwart, E., Li, L., 2012. Rapidly expanding genetic diversity and host range of the recombination and reassortment in nanoviruses from Europe. J. Gen. Virol. 95
Circoviridae viral family and other Rep encoding small circular ssDNA genomes. (Pt 5), 1178–1191.
Virus Res. 164 (1–2), 114–121. Guu, T.S., Zheng, W., Tao, Y.J., 2012. Bunyavirus: structure and replication. Adv. Exp.
den Boon, J.A., Ahlquist, P., 2010. Organelle-like membrane compartmentalization Med. Biol. 726, 245–266.
of positive-strand RNA virus replication factories. Annu. Rev. Microbiol. 64, Guy, L., Saw, J.H., Ettema, T.J., 2014. The archaeal legacy of eukaryotes: a
241–256. phylogenomic perspective. Cold Spring Harb. Perspect. Biol. 6 (10), a016022.
Desnues, C., Boyer, M., Raoult, D., 2012. Sputnik, a virophage infecting the viral Handa, H., 2008. Linear plasmids in plant mitochondria: peaceful coexistences or
domain of life. Adv. Virus Res. 82, 63–89. malicious invasions? Mitochondrion 8 (1), 15–25.
Diemer, G.S., Stedman, K.M., 2012. A novel virus genome discovered in an extreme Hanley-Bowdoin, L., Bejarano, E.R., Robertson, D., Mansoor, S., 2013. Geminiviruses:
environment suggests recombination between unrelated groups of RNA and masters at redirecting and reprogramming plant processes. Nat. Rev. Microbiol.
DNA viruses. Biol. Direct 7, 13. 11 (11), 777–788.
Diener, T.O., 1989. Circular RNAs: relics of precellular evolution? Proc. Natl. Acad. Harper, G., Hull, R., Lockhart, B., Olszewski, N., 2002. Viral sequences integrated into
Sci. U.S.A. 86 (23), 9370–9374. plant genomes. Annu. Rev. Phytopathol. 40, 119–136.
Dlakic, M., Mushegian, A., 2011. Prp8, the pivotal protein of the spliceosomal Hendrix, R.W., 2003. Bacteriophage genomics. Curr. Opin. Microbiol. 6 (5), 506–511.
catalytic center, evolved from a retroelement-encoded reverse transcriptase. Herniou, E.A., Huguet, E., Theze, J., Bezier, A., Periquet, G., Drezen, J.M., 2013. When
RNA 17 (5), 799–808. parasitic wasps hijacked viruses: genomic and functional evolution of poly-
Dolja, V.V., Boyko, V.P., Agranovsky, A.A., Koonin, E.V., 1991. Phylogeny of capsid dnaviruses. Philos. Trans. R. Soc. London, B: Biol. Sci. 368 (1626), 20130051.
proteins of rod-shaped and filamentous RNA plant viruses: two families with Hillman, B.I., Cai, G., 2013. The family narnaviridae: simplest of RNA viruses. Adv.
distinct patterns of sequence and probably structure conservation. Virology 184 Virus Res. 86, 149–176.
(1), 79–86. Hjort, K., Goldberg, A.V., Tsaousis, A.D., Hirt, R.P., Embley, T.M., 2010. Diversity and
Dolja, V.V., Koonin, E.V., 2011. Common origins and host-dependent diversity of reductive evolution of mitochondria among microbial eukaryotes. Philos. Trans.
plant and animal viromes. Curr. Opin. Virol. 1 (5), 322–331. R. Soc. London, B: Biol. Sci. 365 (1541), 713–727.
Edwards, R.A., Rohwer, F., 2005. Viral metagenomics. Nat. Rev. Microbiol. 3 (6), Holmes, E.C., 2011. What does virus evolution tell us about virus origins? J. Virol. 85
504–510. (11), 5247–5251.
Eickbush, T.H., Jamburuthugoda, V.K., 2008. The diversity of retrotransposons and Hu, Z.Y., Li, G.H., Li, G.T., Yao, Q., Chen, K.P., 2013. Bombyx mori bidensovirus: the
the properties of their reverse transcriptases. Virus Res. 134 (1–2), 221–234. type species of the new genus Bidensovirus in the new family Bidnaviridae.
El Omari, K., Sutton, G., Ravantti, J.J., Zhang, H., Walter, T.S., Grimes, J.M., Bamford, D. Chin. Sci. Bull. 58, 4528–4532.
H., Stuart, D.I., Mancini, E.J., 2013. Plate tectonics of virus shell assembly and Husnik, F., Nikoh, N., Koga, R., Ross, L., Duncan, R.P., Fujie, M., Tanaka, M., Satoh, N.,
reorganization in phage phi8, a distant relative of mammalian reoviruses. Bachtrog, D., Wilson, A.C., von Dohlen, C.D., Fukatsu, T., McCutcheon, J.P., 2013.
Structure 21 (8), 1384–1395. Horizontal gene transfer from diverse bacteria to an insect genome enables a
Embley, T.M., Martin, W., 2006. Eukaryotic evolution, changes and challenges. tripartite nested mealybug symbiosis. Cell 153 (7), 1567–1578.
Nature 440 (7084), 623–630. Ilyina, T.V., Koonin, E.V., 1992. Conserved sequence motifs in the initiator proteins
Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., Sebastiani, F., Gelius-Dietrich, G., for rolling circle DNA replication encoded by diverse replicons from eubacteria,
Henze, K., Kretschmann, E., Richly, E., Leister, D., Bryant, D., Steel, M.A., eucaryotes and archaebacteria. Nucleic Acids Res. 20 (13), 3279–3285.
Lockhart, P.J., Penny, D., Martin, W., 2004. A genome phylogeny for mitochon- Ivanov, D., Stone, J.R., Maki, J.L., Collins, T., Wagner, G., 2005. Mammalian SCAN
dria among alpha-proteobacteria and a predominantly eubacterial ancestry of domain dimer is a domain-swapped homolog of the HIV capsid C-terminal
yeast nuclear genes. Mol. Biol. Evol. 21 (9), 1643–1660. domain. Mol. Cell 17 (1), 137–143.
Evgen’ev, M.B., 2013. What happens when Penelope comes? An unusual retro- Iyer, L.M., Aravind, L., Koonin, E.V., 2001. Common origin of four diverse families of
element invades a host species genome exploring different strategies. Mob. large eukaryotic DNA viruses. J. Virol. 75 (23), 11720–11734.
Genet. Elem. 3 (2), e24542. Iyer, L.M., Balaji, S., Koonin, E.V., Aravind, L., 2006. Evolutionary genomics of nucleo-
Ferrer-Orta, C., Arias, A., Escarmis, C., Verdaguer, N., 2006. A comparison of viral cytoplasmic large DNA viruses. Virus Res. 117 (1), 156–184.
RNA-dependent RNA polymerases. Curr. Opin. Struct. Biol. 16 (1), 27–34. Iyer, L.M., Koonin, E.V., Aravind, L., 2003. Evolutionary connection between the
Filee, J., 2013. Route of NCLDV evolution: the genomic accordion. Curr. Opin. Virol. 3 catalytic subunits of DNA-dependent RNA polymerases and eukaryotic RNA-
(5), 595–599. dependent RNA polymerases and the origin of RNA polymerases. BMC Struct.
Filee, J., Forterre, P., 2005. Viral proteins functioning in organelles: a cryptic origin? Biol. 3, 1.
Trends Microbiol. 13 (11), 510–513. Iyer, L.M., Koonin, E.V., Leipe, D.D., Aravind, L., 2005. Origin and evolution of the
Finnegan, D.J., 2012. Retrotransposons. Curr. Biol. 22 (11), R432–R437. archaeo-eukaryotic primase superfamily and related palm-domain proteins:
Fischer, M.G., Suttle, C.A., 2011. A virophage at the origin of large DNA transposons. structural insights and new members. Nucleic Acids Res. 33 (12), 3875–3896.
Science 332 (6026), 231–234. Janssen, M.E., Takagi, Y., Parent, K.N., Cardone, G., Nibert, M.L., Baker, T.S., 2015.
Fuhrman, J.A., 1999. Marine viruses and their biogeochemical and ecological effects. Three- dimensional structure of a protozoal double-stranded RNA virus that
Nature 399 (6736), 541–548. infects the enteric pathogen Giardia lamblia. J. Virol. 89 (2), 1182–1194.
Gibbs, A., Ohshima, K., 2010. Potyviruses and the digital revolution. Annu. Rev. Jehle, J.A., Abd-Alla, A.M., Wang, Y., 2013. Phylogeny and evolution of Hytrosavir-
Phytopathol. 48, 205–223. idae. J. Invertebr. Pathol. 112 (Suppl) S62-7.
Gibbs, M.J., Smeianov, V.V., Steele, J.L., Upcroft, P., Efimov, B.A., 2006. Two families Jiang, D., Fu, Y., Guoqing, L., Ghabrial, S.A., 2013. Viruses of the plant pathogenic
of rep-like genes that probably originated by interspecies recombination are fungus Sclerotinia sclerotiorum. Adv. Virus Res. 86, 215–248.
represented in viral, plasmid, bacterial, and parasitic protozoan genomes. Mol. Kamer, G., Argos, P., 1984. Primary structural comparison of RNA-dependent
Biol. Evol. 23 (6), 1097–1100. polymerases from plant, animal and bacterial viruses. Nucleic Acids Res. 12
Gilbert, W., 1986. The RNA world. Nature 319, 618. (18), 7269–7282.
Gladyshev, E.A., Arkhipova, I.R., 2011. A widespread class of reverse transcriptase- Kaneko-Ishino, T., Ishino, F., 2012. The role of genes domesticated from LTR
related cellular genes. Proc. Natl. Acad. Sci. U.S.A. 108 (51), 20311–20316. retrotransposons and retroviruses in mammals. Front. Microbiol. 3, 262.
Goldbach, R., Wellink, J., 1988. Evolution of plus-strand RNA viruses. Intervirology Kazazian Jr., H.H., 2004. Mobile elements: drivers of genome evolution. Science 303
29 (5), 260–267. (5664), 1626–1632.
Goodier, J.L., Kazazian Jr., H.H., 2008. Retrotransposons revisited: the restraint and Kidmose, R.T., Vasiliev, N.N., Chetverin, A.B., Andersen, G.R., Knudsen, C.R., 2010.
rehabilitation of parasites. Cell 135 (1), 23–35. Structure of the Qbeta replicase, an RNA-dependent RNA polymerase consisting
Gorbalenya, A.E., Donchenko, A.P., Blinov, V.M., Koonin, E.V., 1989a. Cysteine of viral and host proteins. Proc. Natl. Acad. Sci. U.S.A. 107 (24), 10884–10889.
proteases of positive strand RNA viruses and chymotrypsin-like serine pro- Kielian, M., Rey, F.A., 2006. Virus membrane-fusion proteins: more than one way to
teases. A distinct protein superfamily with a common structural fold. FEBS Lett. make a hairpin. Nat. Rev. Microbiol. 4 (1), 67–76.
243 (2), 103–114. Kim, A., Terzian, C., Santamaria, P., Pelisson, A., Purd’homme, N., Bucheton, A., 1994.
Gorbalenya, A.E., Donchenko, A.P., Koonin, E.V., Blinov, V.M., 1989b. N-terminal Retroviruses in invertebrates: the gypsy retrotransposon is apparently an
domains of putative helicases of flavi- and pestiviruses may be serine proteases. infectious retrovirus of Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 91
Nucleic Acids Res. 17 (10), 3889–3897. (4), 1285–1289.
Gorbalenya, A.E., Enjuanes, L., Ziebuhr, J., Snijder, E.J., 2006. Nidovirales: evolving Kim, K.H., Chang, H.W., Nam, Y.D., Roh, S.W., Kim, M.S., Sung, Y., Jeon, C.O., Oh, H.M.,
the largest RNA virus genome. Virus Res. 117 (1), 17–37. Bae, J.W., 2008. Amplification of uncultured single-stranded DNA viruses from
Gorbalenya, A.E., Koonin, E.V., 1989. Viral proteins containing the purine NTP- rice paddy soil. Appl. Environ. Microbiol. 74 (19), 5975–5985.
binding sequence pattern. Nucleic Acids Res. 17 (21), 8413–8440. King, A.M.Q., Lefkowitz, E., Adams, M.J., Carstens, B. (Eds.), 2011. Virus Taxonomy:
Gorbalenya, A.E., Pringle, F.M., Zeddam, J.L., Luke, B.T., Cameron, C.E., Kalmakoff, J., Ninth Report of the International Committee on Taxonomy of Viruses. Amster-
Hanzlik, T.N., Gordon, K.H., Ward, V.K., 2002. The palm subdomain-based active dam. Elsevier.
King, J.A., Dubielzig, R., Grimm, D., Kleinschmidt, J.A., 2001. DNA helicase-mediated virion architecture and assembly with tailed viruses of bacteria. J. Mol. Biol. 397
packaging of adeno-associated virus type 2 genomes into preformed capsids. (1), 144–160.
EMBO J. 20 (12), 3282–3291. Krupovic, M., Koonin, E.V., 2014. Evolution of eukaryotic single-stranded DNA
Klassen, R., Meinhardt, F., 2007. Linear protein-primed replicating plasmids in viruses of the Bidnaviridae family from genes of four other groups of widely
eukaryotic microbes. Microbiol. Monogr. 7, 188–216. different viruses. Sci. Rep. 4, 5347.
Koonin, E.V., 1991a. Genome replication/expression strategies of positive-strand Krupovic, M., Koonin, E.V., 2015. Polintons: a hotbed of eukaryotic virus, transposon
RNA viruses: a simple version of a combinatorial classification and prediction of and plasmid evolution. Nat. Rev. Microbiol. 13 (2), 105–115.
new strategies. Virus Genes 5 (3), 273–281. Krupovic, M., Prangishvili, D., Hendrix, R.W., Bamford, D.H., 2011a. Genomics of
Koonin, E.V., 1991b. The phylogeny of RNA-dependent RNA polymerases of bacterial and archaeal viruses: dynamics within the prokaryotic virosphere.
positive-strand RNA viruses. J. Gen. Virol. 72 (Pt 9), 2197–2206. Microbiol. Mol. Biol. Rev. 75 (4), 610–635.
Koonin, E.V., 1992. Evolution of double-stranded RNA viruses: a case for poly- Krupovic, M., Ravantti, J.J., Bamford, D.H., 2009. Geminiviruses: a tale of a plasmid
phyletic origin from different groups of positive-stranded RNA viruses. Semin. becoming a virus. BMC Evol. Biol. 9, 112.
Virol. 3, 327–339. Krupovic, M., Spang, A., Gribaldo, S., Forterre, P., Schleper, C., 2011b. A thaumarch-
Koonin, E.V., 2003. Comparative genomics, minimal gene-sets and the last universal aeal provirus testifies for an ancient association of tailed viruses with archaea.
common ancestor. Nat. Rev. Microbiol. 1 (2), 127–136. Biochem. Soc. Trans. 39 (1), 82–88.
Koonin, E.V., 2006. The origin of introns and their role in eukaryogenesis: a Krupovic, M., Zhi, N., Li, J., Hu, G., Koonin, E.V., Wong, S., Shevchenko, S., Zhao, K.,
compromise solution to the introns-early versus introns-late debate? Biol. Young, N.S., 2015. Multiple layers of chimerism in a single-stranded DNA virus
Direct 1, 22. discovered by deep sequencing. Genome Biol. Evol., in press, http://dx.doi.org/
Koonin, E.V., 2009. On the origin of cells and viruses: primordial virus world 10.1093/gbe/evv034.
scenario. Ann. N.Y. Acad. Sci. 1178, 47–64. Krylov, D.M., Koonin, E.V., 2001. A novel family of predicted retroviral-like aspartyl
Koonin, E.V., Choi, G.H., Nuss, D.L., Shapira, R., Carrington, J.C., 1991a. Evidence for proteases with a possible key role in eukaryotic cell cycle control. Curr. Biol. 11
common ancestry of a chestnut blight hypovirulence-associated double- (15), R584–R587.
stranded RNA and a group of positive-strand RNA plant viruses. Proc. Natl. Kurland, C.G., Collins, L.J., Penny, D., 2006. Genomics and the irreducible nature of
Acad. Sci. U.S.A. 88 (23), 10647–10651. eukaryote cells. Science 312 (5776), 1011–1014.
Koonin, E.V., Dolja, V.V., 1993. Evolution and taxonomy of positive-strand RNA La Scola, B., Desnues, C., Pagnier, I., Robert, C., Barrassi, L., Fournous, G., Merchat, M.,
viruses: implications of comparative analysis of amino acid sequences. Crit. Rev. Suzan-Monti, M., Forterre, P., Koonin, E., Raoult, D., 2008. The virophage as a
Biochem. Mol. Biol. 28 (5), 375–430. unique parasite of the giant mimivirus. Nature 455 (7209), 100–104.
Koonin, E.V., Dolja, V.V., 2013. A virocentric perspective on the evolution of life. Labonte, J.M., Suttle, C.A., 2013. Previously unknown and highly divergent ssDNA
Curr. Opin. Virol. 3 (5), 546–557. viruses populate the oceans. ISME J. 7 (11), 2169–2177.
Koonin, E.V., Dolja, V.V., 2014. Virus world as an evolutionary network of viruses Lambowitz, A.M., Zimmerly, S., 2004. Mobile group II introns. Annu. Rev. Genet. 38,
and capsidless selfish elements. Microbiol. Mol. Biol. Rev. 78 (2), 278–303. 1–35.
Koonin, E.V., Gorbalenya, A.E., Chumakov, K.M., 1989. Tentative identification of Lambowitz, A.M., Zimmerly, S., 2011. Group II introns: mobile ribozymes that
RNA-dependent RNA polymerases of dsRNA viruses and their relationship to invade DNA. Cold Spring Harb. Perspect. Biol. 3 (8), a003616.
positive strand RNA viral polymerases. FEBS Lett. 252 (1–2), 42–46. Lampson, B.C., Inouye, M., Inouye, S., 2005. Retrons, msDNA, and the bacterial
Koonin, E.V., Ilyina, T.V., 1992. Geminivirus replication proteins are related to genome. Cytogenet. Genome Res. 110 (1–4), 491–499.
prokaryotic plasmid rolling circle DNA replication initiator proteins. J. Gen. Lane, N., Martin, W., 2010. The energetics of genome complexity. Nature 467 (7318),
Virol. 73 (Pt 10), 2763–2766. 929–934.
Koonin, E.V., Ilyina, T.V., 1993. Computer-assisted dissection of rolling circle DNA Lane, N., Martin, W.F., 2012. The origin of membrane bioenergetics. Cell 151 (7),
replication. Biosystems 30 (1–3), 241–268.
1406–1416.
Koonin, E.V., Krupovic, M., Yutin, N., 2015. Evolution of double-stranded DNA
Le Gall, O., Christian, P., Fauquet, C.M., King, A.M., Knowles, N.J., Nakashima, N.,
viruses of eukaryotes: from bacteriophages to transposons to giant viruses.
Stanway, G., Gorbalenya, A.E., 2008. Picornavirales, a proposed order of
Ann. N.Y. Acad. Sci., in press, http://dx.doi.org/10.1111/nyas.12728.
positive-sense single-stranded RNA viruses with a pseudo-T ¼ 3 virion archi-
Koonin, E.V., Mushegian, A.R., Ryabov, E.V., Dolja, V.V., 1991b. Diverse groups of
tecture. Arch. Viro.
plant RNA and DNA viruses share related movement proteins that may possess
Lee, S.I., Kim, N.S., 2014. Transposable elements and genome size variations in
chaperone-like activity. J. Gen. Virol. 72 (Pt 12), 2895–2903.
plants. Genomics Inform. 12 (3), 87–97.
Koonin, E.V., Senkevich, T.G., Dolja, V.V., 2006. The ancient Virus World and
Legendre, M., Bartoli, J., Shmakova, L., Jeudy, S., Labadie, K., Adrait, A., Lescot, M.,
evolution of cells. Biol. Direct 1, 29.
Poirot, O., Bertaux, L., Bruley, C., Coute, Y., Rivkina, E., Abergel, C., Claverie, J.M.,
Koonin, E.V., Wolf, Y.I., Nagasaki, K., Dolja, V.V., 2008. The Big Bang of picorna-like
2014. Thirty-thousand-year-old distant relative of giant icosahedral DNA
virus evolution antedates the radiation of eukaryotic supergroups. Nat. Rev.
viruses with a pandoravirus morphology. Proc. Natl. Acad. Sci. U.S.A. 111 (11),
Microbiol. 6 (12), 925–939.
4274–4279.
Koonin, E.V., Wolf, Y.I., Nagasaki, K., Dolja, V.V., 2009. The complexity of the virus
Li, J., Rahmeh, A., Morelli, M., Whelan, S.P., 2008. A conserved motif in region v of
world. Nat. Rev. Microbiol 7 (3), 250.
the large polymerase proteins of nonsegmented negative-sense RNA viruses
Koonin, E.V., Yutin, N., 2010. Origin and evolution of eukaryotic large nucleo-
that is essential for mRNA capping. J. Virol. 82 (2), 775–784.
cytoplasmic DNA viruses. Intervirology 53 (5), 284–292.
Liu, H., Fu, Y., Li, B., Yu, X., Xie, J., Cheng, J., Ghabrial, S.A., Li, G., Yi, X., Jiang, D., 2011.
Koonin, E.V., Yutin, N., 2014. The dispersed archaeal eukaryome and the complex
archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. 6 (4), a016188. Widespread horizontal gene transfer from circular single-stranded DNA viruses
Kristensen, D.M., Mushegian, A.R., Dolja, V.V., Koonin, E.V., 2010. New dimensions to eukaryotic genomes. BMC Evol. Biol. 11, 276.
of the virus world discovered through metagenomics. Trends Microbiol. 18 (1), Liu, H., Fu, Y., Xie, J., Cheng, J., Ghabrial, S.A., Li, G., Peng, Y., Yi, X., Jiang, D., 2012a.
11–19. Evolutionary genomics of mycovirus-related dsRNA viruses reveals cross-family
Kristensen, D.M., Waller, A.S., Yamada, T., Bork, P., Mushegian, A.R., Koonin, E.V., horizontal gene transfer and evolution of diverse viral lineages. BMC Evol. Biol.
2013. Orthologous gene clusters and taxon signature genes for viruses of 12, 91.
prokaryotes. J. Bacteriol. 195 (5), 941–950. Liu, H., Fu, Y., Xie, J., Cheng, J., Ghabrial, S.A., Li, G., Yi, X., Jiang, D., 2012b. Discovery
Krupovic, M., 2012. Recombination between RNA viruses and plasmids might have of novel dsRNA viral sequences by in silico cloning and implications for viral
played a central role in the origin and evolution of small DNA viruses. Bioessays diversity, host range and evolution. PLoS One 7 (7), e42147.
34 (10), 867–870. Liu, Y., Xu, L., Opalka, N., Kappler, J., Shu, H.B., Zhang, G., 2002. Crystal structure of
Krupovic, M., 2013. Networks of evolutionary interactions underlying the poly- sTALL-1 reveals a virus-like assembly of TNF family ligands. Cell 108 (3),
phyletic origin of ssDNA viruses. Curr. Opin. Virol. 3 (5), 578–586. 383–394.
Krupovic, M., Bamford, D.H., 2008. Virus evolution: how far does the double beta- Llorens, C., Fares, M.A., Moya, A., 2008. Relationships of gag-pol diversity between
barrel viral lineage extend? Nat. Rev. Microbiol. 6 (12), 941–948. Ty3/Gypsy and Retroviridae LTR retroelements and the three kings hypothesis.
Krupovic, M., Bamford, D.H., 2009. Does the evolution of viral polymerases reflect BMC Evol. Biol. 8, 276.
the origin and evolution of viruses? Nat. Rev. Microbiol. 7 (3), 250. Lorenzi, H., Thiagarajan, M., Haas, B., Wortman, J., Hall, N., Caler, E., 2008. Genome
Krupovic, M., Bamford, D.H., 2010. Order to the viral universe. J. Virol. 84 (24), wide survey, discovery and evolution of repetitive elements in three Enta-
12476–12479. moeba species. BMC Genomics 9, 595.
Krupovic, M., Bamford, D.H., 2011. Double-stranded DNA viruses: 20 families and Luque, D., Gomez-Blanco, J., Garriga, D., Brilot, A.F., Gonzalez, J.M., Havens, W.M.,
only five different architectural principles for virion assembly. Curr. Opin. Virol. Carrascosa, J.L., Trus, B.L., Verdaguer, N., Ghabrial, S.A., Caston, J.R., 2014. Cryo-
1 (2), 118–124. EM near-atomic structure of a dsRNA fungal virus shows ancient structural
Krupovic, M., Bamford, D.H., Koonin, E.V., 2014. Conservation of major and minor motifs preserved in the dsRNA viral lineage. Proc. Natl. Acad. Sci. U.S.A. 111 (21),
jelly-roll capsid proteins in Polinton (Maverick) transposons suggests that they 7641–7646.
are bona fide viruses. Biol. Direct 9 (1), 6. Lynch, M., 2007. The frailty of adaptive hypotheses for the origins of organismal
Krupovic, M., Cvirkaite-Krupovic, V., 2011. Virophages or satellite viruses? Nat. Rev. complexity. Proc. Natl. Acad. Sci. U.S.A. 104 (Suppl. 1), 8597–8604.
Microbiol. 9 (11), 762–763. Lynch, M., Conery, J.S., 2003. The origins of genome complexity. Science 302 (5649),
Krupovic, M., Forterre, P., 2015. Single-stranded DNA viruses employ a variety of 1401–1404.
mechanisms for integration into the host genomes. Ann. N.Y. Acad. Sci., in press, Lyozin, G.T., Makarova, K.S., Velikodvorskaja, V.V., Zelentsova, H.S., Khechumian, R.
http://dx.doi.org/10.1111/nyas.12675. R., Kidwell, M.G., Koonin, E.V., Evgen’ev, M.B., 2001. The structure and evolution
Krupovic, M., Forterre, P., Bamford, D.H., 2010. Comparative analysis of the mosaic of Penelope in the virilis species group of Drosophila: an ancient lineage of
genomes of tailed archaeal viruses and proviruses suggests common themes for retroelements. J. Mol. Evol. 52 (5), 445–456.
Maeda, N., Fan, H., Yoshikai, Y., 2008. Oncogenesis by retroviruses: old and new paradigms. Rev. Med. Virol. 18 (6), 387–405.
Majorek, K.A., Dunin-Horkawicz, S., Steczkiewicz, K., Muszewska, A., Nowotny, M., Ginalski, K., Bujnicki, J.M., 2014. The RNase H-like superfamily: new members, comparative structural analysis and evolutionary classification. Nucleic Acids Res. 42 (7), 4160–4179.
Malik, H.S., 2005. Ribonuclease H evolution in retrotransposable elements. Cytogenet. Genome Res. 110 (1–4), 392–401.
Malik, H.S., Eickbush, T.H., 1999. Modular evolution of the integrase domain in the Ty3/Gypsy class of LTR retrotransposons. J. Virol. 73 (6), 5186–5190.
Malik, H.S., Eickbush, T.H., 2001. Phylogenetic analysis of ribonuclease H domains suggests a late, chimeric origin of LTR retrotransposable elements and retroviruses. Genome Res. 11 (7), 1187–1197.
Malik, H.S., Henikoff, S., Eickbush, T.H., 2000. Poised for contagion: evolutionary origins of the infectious abilities of invertebrate retroviruses. Genome Res. 10 (9), 1307–1318.
Mandal, P.K., Bagchi, A., Bhattacharya, A., Bhattacharya, S., 2004. An Entamoeba histolytica LINE/SINE pair inserts at common target sites cleaved by the restriction enzyme-like LINE-encoded endonuclease. Eukaryot. Cell 3 (1), 170–179.
Mantynen, S., Laanto, E., Kohvakka, A., Poranen, M.M., Bamford, J.K., Ravantti, J.J., 2015. New enveloped dsRNA phage from freshwater habitat. J. Gen. Virol.
Martin, W., Dagan, T., Koonin, E.V., Dipippo, J.L., Gogarten, J.P., Lake, J.A., 2007. The evolution of eukaryotes. Science 316 (5824), 542–543, author reply 542-3.
Martin, W., Koonin, E.V., 2006. Introns and the origin of nucleus-cytosol compartmentation. Nature 440, 41–45.
McKenna, R., Xia, D., Willingmann, P., Ilag, L.L., Krishnaswamy, S., Rossmann, M.G., Olson, N.H., Baker, T.S., Incardona, N.L., 1992. Atomic structure of single-stranded DNA bacteriophage phi X174 and its functional implications. Nature 355 (6356), 137–143.
Medhekar, B., Miller, J.F., 2007. Diversity-generating retroelements. Curr. Opin. Microbiol. 10 (4), 388–395.
Melcher, U., 2000. The ‘30K’ superfamily of viral movement proteins. J. Gen. Virol. 81 (Pt 1), 257–266.
Mindich, L., 2004. Packaging, replication and recombination of the segmented genome of bacteriophage Phi6 and its relatives. Virus Res. 101 (1), 83–92.
Mochizuki, T., Krupovic, M., Pehau-Arnaudet, G., Sako, Y., Forterre, P., Prangishvili, D., 2012. Archaeal virus with exceptional virion architecture and the largest single-stranded DNA genome. Proc. Natl. Acad. Sci. U.S.A. 109 (33), 13386–13391.
Modis, Y., 2014. Relating structure to evolution in class II viral membrane fusion proteins. Curr. Opin. Virol. 5, 34–41.
Monttinen, H.A., Ravantti, J.J., Stuart, D.I., Poranen, M.M., 2014. Automated structural comparisons clarify the phylogeny of the right-hand-shaped polymerases. Mol. Biol. Evol. 31 (10), 2741–2752.
Mushegian, A.R., Elena, S.F., 2015. Evolution of plant virus movement proteins from the 30 K superfamily and of their homologs integrated in plant genomes. Virology 476C, 304–315.
Mushegian, A.R., Koonin, E.V., 1993. Cell-to-cell movement of plant viruses. Insights from amino acid sequence comparisons of movement proteins and from analogies with cellular transport systems. Arch. Virol. 133 (3–4), 239–257.
Nagasaki, K., Tomaru, Y., Takao, Y., Nishida, K., Shirai, Y., Suzuki, H., Nagumo, T., 2005. Previously unknown virus infects marine diatom. Appl. Environ. Microbiol. 71 (7), 3528–3535.
Nagy, P.D., Pogany, J., 2012. The dependence of viral RNA replication on co-opted host factors. Nat. Rev. Microbiol. 10 (2), 137–149.
Nasir, A., Kim, K.M., Caetano-Anolles, G., 2012. Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya. BMC Evol. Biol. 12, 156.
Nassal, M., 2008. Hepatitis B viruses: reverse transcription a different way. Virus Res. 134 (1–2), 235–249.
Nesmelova, I.V., Hackett, P.B., 2010. DDE transposases: structural similarity and diversity. Adv. Drug Delivery Rev. 62 (12), 1187–1195.
Neveu, M., Kim, H.J., Benner, S.A., 2013. The “strong” RNA world hypothesis: fifty years old. Astrobiology 13 (4), 391–403.
Ng, J.C., Falk, B.W., 2006. Virus-vector interactions mediating nonpersistent and semipersistent transmission of plant viruses. Annu. Rev. Phytopathol. 44, 183–212.
Nibert, M.L., Tang, J., Xie, J., Collier, A.M., Ghabrial, S.A., Baker, T.S., Tao, Y.J., 2013. 3D structures of fungal partitiviruses. Adv. Virus Res. 86, 59–85.
Norris, G.E., Stillman, T.J., Anderson, B.F., Baker, E.N., 1994. The three-dimensional structure of PNGase F, a glycosylasparaginase from Flavobacterium meningosepticum. Structure 2 (11), 1049–1059.
Novikova, O., Smyshlyaev, G., Blinov, A., 2010. Evolutionary genomics revealed interkingdom distribution of Tcn1-like chromodomain-containing Gypsy LTR retrotransposons among fungi and plants. BMC Genomics 11, 231.
Okamoto, H., 2009. TT viruses in animals. Curr. Top. Microbiol. Immunol. 331, 35–52.
Pardue, M.L., DeBaryshe, P.G., 2011. Retrotransposons that maintain chromosome ends. Proc. Natl. Acad. Sci. U.S.A. 108 (51), 20317–20324.
Pflug, A., Guilligay, D., Reich, S., Cusack, S., 2014. Structure of influenza A polymerase bound to the viral RNA promoter. Nature 516 (7531), 355–360.
Philippe, N., Legendre, M., Doutre, G., Coute, Y., Poirot, O., Lescot, M., Arslan, D., Seltzer, V., Bertaux, L., Bruley, C., Garin, J., Claverie, J.M., Abergel, C., 2013. Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. Science 341 (6143), 281–286.
Pietila, M.K., Laurinmaki, P., Russell, D.A., Ko, C.C., Jacobs-Sera, D., Hendrix, R.W., Bamford, D.H., Butcher, S.J., 2013. Structure of the archaeal head-tailed virus HSTV-1 completes the HK97 fold story. Proc. Natl. Acad. Sci. U.S.A. 110 (26), 10604–10609.
Poole, A., Jeffares, D., Penny, D., 1999. Early evolution: prokaryotes, the new kids on the block. Bioessays 21 (10), 880–889.
Poole, A., Penny, D., 2007. Eukaryote evolution: engulfed by speculation. Nature 447 (7147), 913.
Poranen, M.M., Bamford, D.H., 2012. Assembly of large icosahedral double-stranded RNA viruses. Adv. Exp. Med. Biol. 726, 379–402.
Prangishvili, D., 2013. The wonderful world of archaeal viruses. Annu. Rev. Microbiol. 67, 565–585.
Prangishvili, D., Forterre, P., Garrett, R.A., 2006a. Viruses of the Archaea: a unifying view. Nat. Rev. Microbiol. 4 (11), 837–848.
Prangishvili, D., Garrett, R.A., Koonin, E.V., 2006b. Evolutionary genomics of archaeal viruses: unique viral genomes in the third domain of life. Virus Res. – (1), 52–67.
Quito-Avila, D.F., Lightle, D., Lee, J., Martin, R.R., 2012. Transmission biology of Raspberry latent virus, the first aphid-borne reovirus. Phytopathology 102 (5), 547–553.
Raoult, D., Audic, S., Robert, C., Abergel, C., Renesto, P., Ogata, H., La Scola, B., Suzan, M., Claverie, J.M., 2004. The 1.2-megabase genome sequence of Mimivirus. Science 306 (5700), 1344–1350.
Raoult, D., Forterre, P., 2008. Redefining viruses: lessons from Mimivirus. Nat. Rev. Microbiol. 6 (4), 315–319.
Rastgou, M., Habibi, M.K., Izadpanah, K., Masenga, V., Milne, R.G., Wolf, Y.I., Koonin, E.V., Turina, M., 2009. Molecular characterization of the plant virus genus Ourmiavirus and evidence of inter-kingdom reassortment of viral genome segments as its possible route of origin. J. Gen. Virol. 90 (Pt 10), 2525–2535.
Rest, J.S., Mindell, D.P., 2003. Retroids in archaea: phylogeny and lateral origins. Mol. Biol. Evol. 20 (7), 1134–1142.
Rigden, J.E., Dry, I.B., Krake, L.R., Rezaian, M.A., 1996. Plant virus DNA replication processes in Agrobacterium: insight into the origins of geminiviruses? Proc. Natl. Acad. Sci. U.S.A. 93 (19), 10280–10284.
Rixon, F.J., Schmid, M.F., 2014. Structural similarities in DNA packaging and delivery apparatuses in Herpesvirus and dsDNA bacteriophages. Curr. Opin. Virol. 5, 105–110.
Robart, A.R., Chan, R.T., Peters, J.K., Rajashankar, K.R., Toor, N., 2014. Crystal structure of a eukaryotic group II intron lariat. Nature 514 (7521), 193–197.
Robart, A.R., Zimmerly, S., 2005. Group II intron retroelements: function and diversity. Cytogenet. Genome Res. 110 (1–4), 589–597.
Robertson, M.P., Joyce, G.F., 2012. The origins of the RNA world. Cold Spring Harb. Perspect. Biol. 4, 5.
Rohwer, F., 2003. Global phage diversity. Cell 113 (2), 141.
Rohwer, F., Thurber, R.V., 2009. Viruses manipulate the marine environment. Nature 459 (7244), 207–212.
Roossinck, M.J., Sabanadzovic, S., Okada, R., Valverde, R.A., 2011. The remarkable evolutionary history of endornaviruses. J. Gen. Virol. 92 (Pt 11), 2674–2678.
Rosario, K., Duffy, S., Breitbart, M., 2009. Diverse circovirus-like genome architectures revealed by environmental metagenomics. J. Gen. Virol. 90 (Pt 10), 2418–2424.
Rosario, K., Duffy, S., Breitbart, M., 2012. A field guide to eukaryotic circular single-stranded DNA viruses: insights gained from metagenomics. Arch. Virol. 157 (10), 1851–1871.
Rossmann, M.G., Johnson, J.E., 1989. Icosahedral RNA virus structure. Annu. Rev. Biochem. 58, 533–573.
Rothnie, H.M., Chapdelaine, Y., Hohn, T., 1994. Pararetroviruses and retroviruses: a comparative review of viral structure and gene expression strategies. Adv. Virus Res. 44, 1–67.
Roux, S., Enault, F., Bronner, G., Vaulot, D., Forterre, P., Krupovic, M., 2013. Chimeric viruses blur the borders between the major groups of eukaryotic single-stranded DNA viruses. Nat. Commun. 4, 2700.
Roux, S., Krupovic, M., Poulet, A., Debroas, D., Enault, F., 2012. Evolution and diversity of the Microviridae viral family through a collection of 81 new complete genomes assembled from virome reads. PLoS One 7 (7), e40418.
Salgado, P.S., Koivunen, M.R., Makeyev, E.V., Bamford, D.H., Stuart, D.I., Grimes, J.M., 2006. The structure of an RNAi polymerase links RNA silencing and transcription. PLoS Biol. 4 (12), e434.
Sandmeyer, S.B., Menees, T.M., 1996. Morphogenesis at the retrotransposon-retrovirus interface: gypsy and copia families in yeast and Drosophila. Curr. Top. Microbiol. Immunol. 214, 261–296.
Seeger, C., Hu, J., 1997. Why are hepadnaviruses DNA and not RNA viruses? Trends Microbiol. 5 (11), 447–450.
Selth, L.A., Randles, J.W., Rezaian, M.A., 2002. Agrobacterium tumefaciens supports DNA replication of diverse geminivirus types. FEBS Lett. 516 (1–3), 179–182.
Selvarajan Sigamani, S., Zhao, H., Kamau, Y.N., Baines, J.D., Tang, L., 2013. The structure of the herpes simplex virus DNA-packaging terminase pUL15 nuclease domain suggests an evolutionary lineage among eukaryotic and prokaryotic viruses. J. Virol. 87 (12), 7140–7148.
Shutt, T.E., Gray, M.W., 2006. Bacteriophage origins of mitochondrial replication and transcription proteins. Trends Genet. 22 (2), 90–95.
Simon, D.M., Clarke, N.A., McNeil, B.A., Johnson, I., Pantuso, D., Dai, L., Chai, D., Zimmerly, S., 2008. Group II introns in eubacteria and archaea: ORF-less introns and new varieties. RNA 14 (9), 1704–1713.
Simon, D.M., Zimmerly, S., 2008. A diversity of uncharacterized reverse transcriptases in bacteria. Nucleic Acids Res. 36 (22), 7219–7229.
Sirkis, R., Gerst, J.E., Fass, D., 2006. Ddi1, a eukaryotic protein with the retroviral protease fold. J. Mol. Biol. 364 (3), 376–387.
Smyshlyaev, G., Voigt, F., Blinov, A., Barabas, O., Novikova, O., 2013. Acquisition of an Archaea-like ribonuclease H domain by plant L1 retrotransposons supports modular evolution. Proc. Natl. Acad. Sci. U.S.A. 110 (50), 20140–20145.
Solyom, S., Kazazian Jr., H.H., 2012. Mobile elements in the human genome: implications for disease. Genome Med. 4 (2), 12.
Song, S.U., Gerasimova, T., Kurkulos, M., Boeke, J.D., Corces, V.G., 1994. An env-like protein encoded by a Drosophila retroelement: evidence that gypsy is an infectious retrovirus. Genes Dev. 8 (17), 2046–2057.
Staginnus, C., Richert-Poggeler, K.R., 2006. Endogenous pararetroviruses: two-faced travelers in the plant genome. Trends Plant Sci. 11 (10), 485–491.
Stedman, K., 2013. Mechanisms for RNA capture by ssDNA viruses: grand theft RNA. J. Mol. Evol. 76 (6), 359–364.
Steven, A.C., Conway, J.F., Cheng, N., Watts, N.R., Belnap, D.M., Harris, A., Stahl, S.J., Wingfield, P.T., 2005. Structure, assembly, and antigenicity of hepatitis B virus capsid proteins. Adv. Virus Res. 64, 125–164.
Stoddard, B.L., 2005. Homing endonuclease structure and function. Q. Rev. Biophys. 38 (1), 49–95.
Stoye, J.P., 2012. Studies of endogenous retroviruses reveal a continuing evolutionary saga. Nat. Rev. Microbiol. 10 (6), 395–406.
Suttle, C.A., 2005. Viruses in the sea. Nature 437 (7057), 356–361.
Suttle, C.A., 2007. Marine viruses—major players in the global ecosystem. Nat. Rev. Microbiol. 5 (10), 801–812.
Szathmary, E., Demeter, L., 1987. Group selection of early replicators and the origin of life. J. Theor. Biol. 128 (4), 463–486.
Takeuchi, N., Hogeweg, P., 2007. The role of complex formation and deleterious mutations for the stability of RNA-like replicator systems. J. Mol. Evol. 65 (6), 668–686.
Takeuchi, N., Hogeweg, P., 2012. Evolutionary dynamics of RNA-like replicator systems: a bioinformatic approach to the origin of life. Phys. Life Rev. 9 (3), 219–263.
Takeuchi, N., Hogeweg, P., Koonin, E.V., 2011. On the origin of DNA genomes: evolution of the division of labor between template and catalyst in model replicator systems. PLoS Comput. Biol. 7 (3), e1002024.
Theze, J., Bezier, A., Periquet, G., Drezen, J.M., Herniou, E.A., 2011. Paleozoic origin of insect large dsDNA viruses. Proc. Natl. Acad. Sci. U.S.A. 108 (38), 15931–15935.
Toor, N., Keating, K.S., Taylor, S.D., Pyle, A.M., 2008. Crystal structure of a self-spliced group II intron. Science 320 (5872), 77–82.
Tordo, N., Poch, O., Ermine, A., Keith, G., Rougeon, F., 1988. Completion of the rabies virus genome sequence determination: highly conserved domains among the L (polymerase) proteins of unsegmented negative-strand RNA viruses. Virology 165 (2), 565–576.
Toro, N., Nisa-Martínez, R., 2014. Comprehensive phylogenetic analysis of bacterial reverse transcriptases. PLoS One 9 (11), e114083.
van der Giezen, M., 2009. Hydrogenosomes and mitosomes: conservation and evolution of functions. J. Eukaryot. Microbiol. 56 (3), 221–231.
van der Giezen, M., Tovar, J., 2005. Degenerate mitochondria. EMBO Rep. 6 (6), 525–530.
Van Etten, J.L., 2003. Unusual life style of giant chlorella viruses. Annu. Rev. Genet. 37, 153–195.
Vaney, M.C., Rey, F.A., 2011. Class II enveloped viruses. Cell. Microbiol. 13 (10), 1451–1459.
von Dohlen, C.D., Kohler, S., Alsop, S.T., McManus, W.R., 2001. Mealybug beta-proteobacterial endosymbionts contain gamma-proteobacterial symbionts. Nature 412 (6845), 433–436.
Wang, Q., Han, Y., Qiu, Y., Zhang, S., Tang, F., Wang, Y., Zhang, J., Hu, Y., Zhou, X., 2012a. Identification and characterization of RNA duplex unwinding and ATPase activities of an alphatetravirus superfamily 1 helicase. Virology 433 (2), 440–448.
Wang, W.C., Hsu, Y.H., Lin, N.S., Wu, C.Y., Lai, Y.C., Hu, C.C., 2013. A novel prokaryotic promoter identified in the genome of some monopartite begomoviruses. PLoS One 8 (7), e70037.
Wang, Y., Bininda-Emonds, O.R.P., Jehle, J.A., 2012b. Nudivirus genomics and phylogeny. In: Garcia, M. (Ed.), Molecular Structure, Diversity, Gene Expression Mechanisms and Host-Virus Interactions. InTech, Rijeka.
Wang, Y., Jehle, J.A., 2009. Nudiviruses and other large, double-stranded circular DNA viruses of invertebrates: new insights on an old topic. J. Invertebr. Pathol. 101 (3), 187–193.
Weiss, R.A., 2013. On the concept and elucidation of endogenous retroviruses. Philos. Trans. R. Soc. London, B: Biol. Sci. 368 (1626), 20120494.
White, J.M., Delos, S.E., Brecher, M., Schornberg, K., 2008. Structures and mechanisms of viral membrane fusion proteins: multiple variations on a common theme. Crit. Rev. Biochem. Mol. Biol. 43 (3), 189–219.
Whon, T.W., Kim, M.S., Roh, S.W., Shin, N.R., Lee, H.W., Bae, J.W., 2012. Metagenomic characterization of airborne viral DNA diversity in the near-surface atmosphere. J. Virol. 86 (15), 8221–8231.
Williams, T.A., Embley, T.M., Heinz, E., 2011. Informational gene phylogenies do not support a fourth domain of life for nucleocytoplasmic large DNA viruses. PLoS One 6 (6), e21080.
Wong, T.Y., Preston, L.A., Schiller, N.L., 2000. ALGINATE LYASE: review of major sources and enzyme characteristics, structure-function analysis, biological roles, and applications. Annu. Rev. Microbiol. 54, 289–340.
Wu, C.Y., Yang, S.H., Lai, Y.C., Lin, N.S., Hsu, Y.H., Hu, C.C., 2007. Unit-length, single-stranded circular DNAs of both polarity of begomoviruses are generated in Escherichia coli harboring phage M13-cloned begomovirus genome with single copy of replication origin. Virus Res. 125 (1), 14–28.
Xiong, Y., Eickbush, T.H., 1990. Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J. 9 (10), 3353–3362.
Yang, J., Malik, H.S., Eickbush, T.H., 1999. Identification of the endonuclease domain encoded by R2 and other site-specific, non-long terminal repeat retrotransposable elements. Proc. Natl. Acad. Sci. U.S.A. 96 (14), 7847–7852.
Yutin, N., Koonin, E.V., 2012. Hidden evolutionary complexity of Nucleo-Cytoplasmic Large DNA viruses of eukaryotes. Virol. J. – (1), 161.
Yutin, N., Koonin, E.V., 2013. Pandoraviruses are highly derived phycodnaviruses. Biol. Direct 8, 25.
Yutin, N., Makarova, K.S., Mekhedov, S.L., Wolf, Y.I., Koonin, E.V., 2008. The deep archaeal roots of eukaryotes. Mol. Biol. Evol. 25 (8), 1619–1630.
Yutin, N., Raoult, D., Koonin, E.V., 2013. Virophages, polintons, and transpovirons: a complex evolutionary network of diverse selfish genetic elements with different reproduction strategies. Virol. J. 10, 158.
Yutin, N., Wolf, M.Y., Wolf, Y.I., Koonin, E.V., 2009. The origins of phagocytosis and eukaryogenesis. Biol. Direct 4, 9.
Yutin, N., Wolf, Y.I., Koonin, E.V., 2014. Origin of giant viruses from smaller DNA viruses not from a fourth domain of cellular life. Virology 466–467, 38–52.
Zanotto, P.M., Gibbs, M.J., Gould, E.A., Holmes, E.C., 1996. A reevaluation of the higher taxonomy of viruses based on RNA polymerases. J. Virol. 70 (9), 6083–6096.
Zawar-Reza, P., Arguello-Astorga, G.R., Kraberger, S., Julian, L., Stainton, D., Broady, P.A., Varsani, A., 2014. Diverse small circular single-stranded DNA viruses identified in a freshwater pond on the McMurdo Ice Shelf (Antarctica). Infect. Genet. Evol. 26, 132–138.
Zeddam, J.L., Gordon, K.H., Lauber, C., Alves, C.A., Luke, B.T., Hanzlik, T.N., Ward, V.K., Gorbalenya, A.E., 2010. Euprosterna elaeasa virus genome sequence and evolution of the Tetraviridae family: emergence of bipartite genomes and conservation of the VPg signal with the dsRNA Birnaviridae family. Virology 397 (1), 145–154.
Zhang, W., Olson, N.H., Baker, T.S., Faulkner, L., Agbandje-McKenna, M., Boulton, M.I., Davies, J.W., McKenna, R., 2001. Structure of the Maize streak virus geminate particle. Virology 279 (2), 471–477.