Warthog Genomes Resolve an Evolutionary Conundrum
and Reveal Introgression of Disease Resistance Genes
Genís Garcia-Erill,†,1 Christian H.F. Jørgensen,†,1 Vincent B. Muwanika,2 Xi Wang,1
Malthe S. Rasmussen,1 Yvonne A. de Jong,3 Philippe Gaubert,4 Ayodeji Olayemi,5 Jordi Salmona,4
Thomas M. Butynski,3 Laura D. Bertola,1 Hans R. Siegismund,1 Anders Albrechtsen,1
and Rasmus Heller*,1
1 Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
2 Department of Environmental Management, Makerere University, PO Box 7062, Kampala, Uganda
3 Eastern Africa Primate Diversity and Conservation Program & Lolldaiga Hills Research Programme, PO Box 149, Nanyuki 10400, Kenya
4 Laboratoire Évolution & Diversité Biologique, Université Toulouse III Paul Sabatier, 31062 Toulouse, France
Downloaded from https://academic.oup.com/mbe/article/39/7/msac134/6627297 by guest on 06 April 2024
5 Natural History Museum, Obafemi Awolowo University, HO 220005 Ile Ife, Nigeria
*Corresponding author: E-mail: [email protected].
†These authors contributed equally to this work.
Associate editor: Dr Maria C. Ávila-Arcos
Abstract
African wild pigs have a contentious evolutionary and biogeographic history. Until recently, desert warthog (Phacochoerus aethiopicus) and common warthog (P. africanus) were considered a single species. Molecular evidence surprisingly suggested they diverged at least 4.4 million years ago, and possibly outside of Africa. We sequenced the first whole genomes of four desert warthogs and 35 common warthogs from throughout their range. We show that these two species diverged much later than previously estimated, 400,000–1,700,000 years ago depending on assumptions of gene flow. This brings the genetic estimate into agreement with the paleontological record. We found that the common warthog originated in western Africa and subsequently colonized eastern and southern Africa. During this range expansion, the common warthog interbred with the desert warthog, presumably in eastern Africa, underlining this region’s importance in African biogeography. We found that immune system–related genes may have adaptively introgressed into common warthogs, indicating that resistance to novel diseases was one of the most potent drivers of evolution as common warthogs expanded their range. Hence, we solve some of the key controversies surrounding warthog evolution and reveal a complex evolutionary history involving range expansion, introgression, and adaptation to new diseases.
Key words: Phacochoerus evolution, introgression, disease resistance, African phylogeography, population structure.
Article
Introduction
The genus Phacochoerus has a contentious and complex taxonomic history. Two species of warthog, the desert warthog (P. aethiopicus) and the common warthog (P. africanus), were recognized since 1788. Subsequent zoologists, however, erroneously assumed that all extant warthogs belonged to P. africanus until Grubb (1993) established that the extant Somali warthog (P. aethiopicus delamerei) is conspecific with the extinct Cape warthog (P. aethiopicus aethiopicus), meaning there are in fact two extant species of warthog: the common and the desert warthog. Two studies added to this evolutionary conundrum by estimating a surprisingly ancient divergence time between the two species of warthogs: 4.4 Ma (Randi et al. 2002) and 8.8–5.7 Ma (Randi et al. 2002; Gongora et al. 2011). These dates are inconsistent with the fossil evidence, which indicates that Phacochoerus first appeared either around 1.0–0.8 Ma (White and Harris 1977; Harris and White 1979; Harris and Cerling 2002; Souron 2017), or around 2.2–2.0 Ma (Hopwood and Hollyfield 1954; Ewer 1956; Cooke 1994; Pickford 2006, 2012, 2013a, b; Pickford and Gommery 2016). This inconsistency between the genetic data and the fossil record has shrouded the evolutionary relationship between the two species of warthog in controversy, showcasing a more general conflict between evolutionary chronology based on genetic data and fossils (Yang and Donoghue 2016). Moreover, the unresolved divergence time has raised fundamental biogeographic questions, as the earliest known Suinae fossil from Africa is dated to 5.1 Ma (Pickford and Gommery 2016, 2020). This means that the existing genetic estimates for the appearance of the two extant species of Phacochoerus overlap with or predate the earliest known occurrence of Suinae in Africa. This, controversially, suggests that the two extant warthogs evolved outside Africa (Gongora et al. 2011).
In addition to the controversial evolutionary relationship between the two warthog species, they remain
© The Author(s) 2022. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Open Access
Mol. Biol. Evol. 39(7):msac134 https://doi.org/10.1093/molbev/msac134
Garcia-Erill et al. · https://doi.org/10.1093/molbev/msac134 MBE
understudied from a population genetic perspective. Muwanika et al. (2003) used microsatellites and mtDNA to infer three common warthog refugia (western, eastern, and southern Africa). This is consistent with the prevailing phylogeographic pattern for large African mammals (Hewitt 2004; Lorenzen et al. 2012). Lorenzen et al. (2012) built on these results and suggested that eastern Africa was colonized from a southern African refugium, a pattern emerging in many other savanna ungulates including impala (Aepyceros melampus; Nersting and Arctander 2001; Lorenzen et al. 2006), wildebeest (Connochaetes spp.; Arctander et al. 1999), greater kudu (Tragelaphus strepsiceros; Nersting and Arctander 2001), and plains zebra (Equus quagga; Pedersen et al. 2018).
Warthogs, in contrast to all other suids, are highly adapted to the open grasslands of the African savannas (Cumming 2013; Grubb and D’Huart 2013; Butynski and De Jong 2018; De Jong and Butynski 2018). Their open-country adaptations include longer legs for cursorial locomotion, a highly specialized dentition, and a large head with broad vision. The adaptation to African savanna habitat from an ancestral forest-adapted stock mirrors that of early hominins relative to their great ape ancestors, and the possible correlation between African suid and hominid evolution and biogeography has been pointed out (White and Harris 1977). Hence, an increased understanding of warthog phylogeography is desirable to enhance our understanding of the biogeographic theater in which humans evolved.
Of the two extant species of warthog, the common warthog occupies a wide range of habitats and has by far the greatest present distribution (fig. 1A). Four subspecies of common warthog are currently recognized (africanus, aeliani, massaicus, and sundevallii; Grubb 1993, 2005), but the limits of their geographic distributions are poorly understood and their taxonomic arrangement is in need of validation (Butynski and De Jong 2018). Desert warthogs are currently restricted to Ethiopia, Kenya, and Somalia where they are sympatric with common warthogs in several regions (fig. 1A; De Jong and Butynski 2018, 2021; Butynski and De Jong 2021; De Jong et al. in press). Desert warthogs are adapted to low-lying, arid habitats, whereas common warthogs are typically associated with more moist savanna habitats and open woodlands (De Jong and Butynski 2018). The literature is conflicted as to whether hybridization between common and desert warthogs occurs. Souron (2016) found atypical warthog skulls in the Horn of Africa that might indicate hybrids, but De Jong and Butynski (2018) argued that their ancient divergence time makes hybridization unlikely. These unresolved questions and conflicting observations currently prevent any resolution of the evolutionary history of Phacochoerus, as well as pose outstanding questions regarding African suid biogeography and conservation.
In this study, we present the first whole-genome data from warthogs and use it to resolve some of these outstanding questions regarding warthog evolution. To do so, we investigate the genetic structure and the phylogeography within Phacochoerus. We particularly focus on estimating the divergence time between the two extant warthog species and identifying possible introgression between them.

Results
After sample filtering, we analyzed 35 common warthogs and four desert warthogs (supplementary material table S1, Supplementary Material online, fig. 1A). Six samples, including one desert warthog, were sequenced to high-depth (∼17×) and the remaining samples were sequenced to low-depth (∼3×). An mtDNA tree with two deeply divergent clades confirmed the taxonomic status of our samples as desert and common warthogs (supplementary material fig. S1, Supplementary Material online). After strict filtering of genomic sites, excluding repeats, regions with low mappability, sites with outlying depth patterns, and regions inferred to be exclusively heterozygous, we retained 1.3 Gbp of the autosomal genome for further analysis.

Population Structure and Phylogeography of the Common Warthog
From the filtered autosomal sites, we visualized the structure within common warthog populations using PCAngsd. The observed genetic structure clustered the samples in discrete groups according to the country from which the samples were obtained (fig. 1B). Ghana was separated from all other populations in PC1, suggesting a deep split between Ghana and the other localities. The structure was also recovered in NGSadmix, where each inferred admixture component corresponds to a country at K = 5, with the exception of the single Zambian sample (fig. 1C) which appears as genetically intermediate between Tanzania and Zimbabwe. This is expected from biogeography and is in line with the Principal Component Analysis (PCA). In the Kenyan population there is some genetic substructure, with one sample being modeled as a mixture of the other Kenyan and the Tanzanian clusters, in accordance with the two sampling localities (fig. 1A; supplementary material table S1, Supplementary Material online). The mixed ancestry in one of the Kenyan samples and the single Zambian sample are likely the result of having too few samples from these localities, rather than true admixture between the populations. Evaluation of the admixture proportions using evalAdmix corroborated that five clusters are needed to accurately model population structure (supplementary material figs. S2 and S3, Supplementary Material online). It also indicates that there is some substructure or cryptic relatedness within most of the inferred populations, especially among the Zimbabwean samples. The admixture analyses did not converge for values of K > 5. This might be due to sample sizes that are too small to characterize more subtle substructure.
Based on the inferred population structure, we used the countries as population groupings for TreeMix. We found the Ghanaian population to be the most basal split within common warthogs, with a progressive splitting of the
FIG. 1. Sampling localities and population structure. (A) Sampling localities and number of individuals remaining after filtering (see supplementary material table S1, Supplementary Material online). Approximate geographic limits of the four currently recognized subspecies of common warthog are shown. Species and subspecies ranges based on Vercammen and Mason (1993), Muwanika et al. (2003), Butynski and De Jong (2018), De Jong and Butynski (2018), De Jong et al. (2018, in press). (B) Plot of common warthog samples, colored by sample country, on the first two principal components inferred with PCAngsd. (C) Admixture proportions of common warthog samples, estimated with NGSadmix assuming five ancestral clusters. (D) Genome-wide heterozygosity of all desert and common warthog samples, calculated from genotype likelihoods with realSFS. Similar levels were obtained for the high-depth individuals with genotype calls (supplementary material fig. S4, Supplementary Material online). Above the plot, we show the topology of the TreeMix tree without migrations (see supplementary material fig. S8, Supplementary Material online for full TreeMix result).
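Panel D of figure 1 reports per-sample heterozygosity derived from a site frequency spectrum (SFS). The following is a minimal sketch of that calculation, assuming a one-sample SFS with three bins for a single diploid individual (sites with 0, 1, or 2 derived alleles), so that heterozygosity is the middle bin divided by the total. The function name and the toy counts are illustrative assumptions and are not taken from the study.

```python
def heterozygosity_from_sfs(sfs):
    """Return the proportion of heterozygous sites from a one-sample SFS.

    `sfs` is assumed to hold three expected site counts for one diploid
    individual: [homozygous ancestral, heterozygous, homozygous derived].
    """
    if len(sfs) != 3:
        raise ValueError("expected three SFS bins for one diploid sample")
    total = sum(sfs)
    if total <= 0:
        raise ValueError("SFS must contain a positive number of sites")
    return sfs[1] / total


# Toy counts (hypothetical, not from the paper).
toy_sfs = [998_000.0, 1_500.0, 500.0]
print(f"heterozygosity: {heterozygosity_from_sfs(toy_sfs):.4f}")
```

Because the estimate is a ratio of site counts, any excess of false heterozygous calls, for example from a higher per-base error rate, inflates it directly.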
remaining populations from east to south. This progressive splitting is accompanied by a decrease in heterozygosity along an axis from west to east and southwards (fig. 1D, supplementary material fig. S4, Supplementary Material online). This suggests serial founder events as the common warthog colonized new areas. Two outlying samples, one each in Kenya and Namibia, diverged from this pattern by having higher heterozygosity than expected given the position of their population in the expansion. We correlated sample heterozygosity with error rates. This revealed that the high heterozygosity of these two samples is an artifact of their higher per base sequencing error rates (supplementary material fig. S5, Supplementary Material online).
We summarize the phylogeographic interpretation of our analyses in figure 2A. We inferred a range expansion in common warthogs from an origin in western Africa, colonizing first eastern Africa and subsequently southern Africa. The range expansion followed a semi-circular path circumnavigating the Central African rainforest, as shown by the geographically aware population structure analysis which identifies the rainforest as a strong barrier to gene flow (fig. 2B). This stepwise range expansion was corroborated by a directionality test (supplementary material table S2, Supplementary Material online). We quantified the genetic distance between populations using FST and obtained similar results when using single high-depth–sequenced individuals as when using genotype likelihoods (GLs) for groups of low-depth–sequenced individuals (fig. 2C). The inferred FST between common warthog populations was as low as 0.06 between Kenya and Tanzania and up to 0.34 between Namibia and Ghana. Overall, genetic differentiation between Ghana and all other common warthog populations (“eastern and southern African” or “ESA” in the following) was high, in agreement with the PCA and mtDNA tree (supplementary material fig. S1, Supplementary Material online). Despite nominally belonging to different subspecies (fig. 1A), Tanzania and Zimbabwe had a low FST of 0.08. This is almost on par with that between Zimbabwe and Namibia (FST = 0.07), which nominally belong to the
FIG. 2. Warthog phylogeographic synthesis. (A) Summary of the main phylogeographic findings in the present study. Solid arrows show the inferred directionality of the range expansions; their width is proportional to inferred genetic diversity. The transparent arrow marks the introgression between desert warthogs and common warthogs. Also shown is an approximate current outline of the Central African rainforest (FAO 2012). (B) Posterior mean migration rates among common warthog localities, estimated with EEMS. (C) Population pairwise FST values based on single, high-depth individuals (above diagonal), and several low-depth individuals (below diagonal) per population. (D) FST between pairs of common warthog populations, estimated with realSFS, against the geographical distance between corresponding pairs of localities. Geographic distance is the shortest distance when either taking into account the Central African rainforest as a barrier (squares) or taking the great circle distance among localities (circles). Notice the improved linear fit when including the rainforest as a barrier, compared with the great circle distance.
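The comparison in panel D of figure 2 contrasts two ways of measuring geographic distance between localities. The sketch below illustrates the idea under stated assumptions: great-circle distance uses the haversine formula, the distance around a barrier is approximated as a route through one or more waypoints, and the quality of the linear fit is summarized by R². All function names and coordinates are hypothetical, and this is not the procedure used to produce the figure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def route_km(points):
    """Length of a route that visits `points` in order, e.g. around a barrier."""
    return sum(great_circle_km(p, q) for p, q in zip(points, points[1:]))


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y ~ x.

    Assumes both x and y vary (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical coordinates, for illustration only (not the study's localities).
west = (8.0, -1.0)
south = (-20.0, 17.0)
east_waypoint = (0.0, 37.0)

direct = great_circle_km(west, south)
around = route_km([west, east_waypoint, south])
print(f"direct: {direct:.0f} km, around barrier: {around:.0f} km")
```

Given a list of pairwise FST values and the two corresponding distance lists, calling `r_squared` once per distance definition would show which one gives the better linear fit.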
same subspecies (fig. 1A). Finally, the range expansion is further supported by a linear relationship between pairwise FST and geographic distance when following a trajectory around the Central African forest (fig. 2D). Common warthog phylogeography is therefore closely tied to the distribution of savanna habitat, and it is probable that extensive Central African forest cover prevented a southward expansion of common warthogs for prolonged periods of the Pleistocene.
Desert warthog samples in general showed lower genetic diversity than common warthogs, although their heterozygosity values overlap with common warthogs from southern African populations (fig. 1D). FST was 0.71–0.77 between desert and common warthog populations (fig. 2C). Although high, these FST values indicate that both species still share a substantial amount of genetic variation despite their presumed ancient divergence time.

Introgression between Species
A population structure analysis combining common and desert warthog samples did not identify recent hybrids between the two species (supplementary material fig. S6, Supplementary Material online). We investigated historical gene flow by using D-statistics and found a strong signal of gene flow between desert warthog and all ESA common warthog populations relative to Ghana (fig. 3A). We verified that desert warthog gene flow was not an artifact of gene flow from an out-group into Ghana (supplementary material table S3, Supplementary Material online). The similar magnitude of introgression in all ESA common warthog populations suggests that most of this admixture occurred in a population ancestral to the ESA common warthog populations, that is, at an early stage of their range expansion. One low-depth sample (sample ID 8931) from Tsavo West National Park, an
FIG. 3. Introgression between desert warthogs and common warthogs. (A) D-statistics when using desert warthog samples as P3 and common
warthog samples as P1 and P2, grouped by the country of origin of the two common warthog samples. D-statistics are estimated both from
called genotypes for the high-depth samples, using qpDstat, and by single read sampling for all possible combinations of low-depth samples
from the corresponding P1, P2, and P3 populations, using ANGSD. (B) Cartoon visualization of an admixture graph fitted to a reduced data
set, showing the relationships among common warthogs from the three main lineages and the desert warthog. The admixture graph was estimated with qpGraph from called genotypes using a single high-depth sample per population. See supplementary material fig. S7, Supplementary Material online, for estimates of branch length for the graph and for all other compatible admixture graphs for the same set of populations identified with qpBrute.
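The D-statistics in figure 3A compare how often P1 and P2 share derived alleles with P3. As a minimal sketch of the underlying ABBA-BABA calculation, the function below computes Patterson's D from per-site derived-allele frequencies, assuming the out-group is fixed for the ancestral allele at every site used. The function name, the sign convention (positive D meaning excess sharing between P2 and P3), and the toy frequencies are assumptions for illustration; this is not the qpDstat or ANGSD implementation.

```python
def d_statistic(p1, p2, p3):
    """Patterson's D from derived-allele frequencies in P1, P2, and P3.

    Each argument is a sequence with one frequency per site, polarized so
    that the out-group carries the ancestral allele (frequency 0).
    """
    abba = 0.0  # P2 and P3 share the derived allele, P1 does not
    baba = 0.0  # P1 and P3 share the derived allele, P2 does not
    for f1, f2, f3 in zip(p1, p2, p3):
        abba += (1 - f1) * f2 * f3
        baba += f1 * (1 - f2) * f3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Toy data (hypothetical): four sites, single haploid draws per population.
p1 = [0, 1, 0, 0]
p2 = [1, 0, 1, 0]
p3 = [1, 1, 1, 1]
print(f"D = {d_statistic(p1, p2, p3):.3f}")
```

Swapping `p1` and `p2` flips the sign of D, which is why the assignment of populations to P1 and P2 matters when reading such plots.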
area of sympatry in Kenya, shows excessive allele sharing with desert warthogs compared with other ESA common warthog samples (fig. 3A). This might suggest gene flow that is too old to be detectable in the admixture analysis, but recent enough to occur after the divergence between Kenyan populations.
Using qpGraph, we identified 28 admixture graphs without significant outliers (|Z| < 3). When using a topology consistent with the inferred phylogeography, where Ghana is an out-group to all other common warthog populations, all admixture graphs show gene flow between common warthogs and desert warthogs (supplementary material fig. S7, Supplementary Material online). We chose the best-fitting graph as the one with the lowest maximum absolute Z-score of the difference between observed and fitted f4 statistics. This graph includes 3% introgression from a desert warthog population into a population ancestral to Tanzania and Namibia, and 13% introgression from a ghost common warthog population into Tanzania (fig. 3B). Similarly, when migrations were modeled in TreeMix, we found migration edges between desert and common warthogs at m = 2 and m = 3, albeit with inconsistent placement of the migration edges (supplementary material fig. S8, Supplementary Material online).

Demographic History
We estimated historical effective population sizes for the high-depth individuals using the Pairwise Sequentially Markovian Coalescent (PSMC). The Ghanaian sample showed a different demographic history than the ESA common warthog populations (fig. 4A), even in the distant past. We attribute this to admixture between ESA common warthogs and desert warthogs. The desert warthog showed slightly different effective population sizes compared with common warthog populations around 900–200 thousand years ago (kya), but after 200 kya, the desert warthog and ESA common warthogs had very similar population sizes. In contrast, from 200 kya onwards, the PSMC strongly suggests that Ghana had an increase in effective population size, whereas all other populations underwent population declines.
To better understand the time frame of the major evolutionary events in Phacochoerus, we used fastsimcoal2 to estimate divergence times and other demographic parameters based on two-dimensional site frequency spectra (2dSFS). We initially evaluated a set of three models with different assumptions on the admixture between common and desert warthogs, based on the qpGraph results (supplementary material fig. S9, Supplementary Material online). Out of these three models, the “1-admixture” model (fig. 4B), which corresponds to the best-fitting qpGraph, had the best fit (supplementary material table S4, Supplementary Material online). The time of the basal split between desert and common warthogs was estimated under this model to be 473 kya (95% confidence interval [CI] 390–624 kya; fig. 4B, supplementary material table S5, Supplementary Material online). However, qpGraph cannot model bidirectional gene flow. Since some accepted admixture graphs included gene flow from common warthogs to desert warthogs (supplementary
FIG. 4. Demographic history of desert warthogs and common warthogs. (A) Changes in effective population size of common warthog and desert warthog populations across time, estimated from high-depth samples using PSMC. (B) Schematic diagram depicting the fastsimcoal2 demographic model and parameter point estimates, when using the best-fitting qpGraph to fix the topology of the admixture graph and admixture proportions (1-admixture model). All inferred demographic parameters and 95% confidence intervals are shown in supplementary material table S5, Supplementary Material online. (C) Schematic diagram depicting a more general demographic model, where the admixture proportions are estimated parameters and bidirectional introgression, including from common warthogs into desert warthogs, is allowed (3-admixture model). All inferred demographic parameters and 95% confidence intervals are shown in supplementary material table S6, Supplementary Material online.
material fig. S7, Supplementary Material online), we could not rule out that bidirectional gene flow had occurred. For this reason, we decided to explore an additional demographic model with fastsimcoal2 that allows for bidirectional and asymmetric introgression between the two species, which we called “3-admixture” (fig. 4C, supplementary material fig. S9, Supplementary Material online). This resulted in estimated admixture proportions of 15% from desert warthog to ancestral ESA common warthog and 8% in the opposite direction. Based on this model, the estimated desert-common divergence time is significantly older (1,364 kya; 95% CI 1,023–1,683 kya; supplementary material table S6, Supplementary Material online). The split between the western African and ESA common warthog lineages was estimated to be 226 kya (95% CI 193–260 kya) or 108 kya (95% CI 87–165 kya), assuming unidirectional or bidirectional gene flow, respectively. The most recent divergence between eastern (Tanzania) and southern (Namibia) populations of common warthog was 29 kya (95% CI 16–44 kya) or 45 kya (95% CI 31–51 kya), respectively. Estimates of current effective population sizes for each population under either model closely resembled the results from PSMC, with the highest Ne inferred in Ghana and the lowest Ne in the desert warthog and common warthogs from Namibia (supplementary material tables S5 and S6, Supplementary Material online). The inferred ancestral population sizes were, however, more variable across methods.
The 3-admixture model has a considerably better likelihood than the 1-admixture model, even when allowing the admixture proportions of the latter to be estimated parameters instead of fixing them (supplementary material table S4, Supplementary Material online). Due to limitations on the use of composite likelihoods for model selection, however, and the sensitivity of SFS-based divergence time estimates to demographic model assumptions, we consider both model estimates as plausible for describing the divergence history of Phacochoerus. Moreover, we corroborated, by estimating the population pairwise FST resulting from each of these two models, that both of them can capture the basic characteristics of the population structure in warthogs (supplementary material table S7, Supplementary Material online). Finally, we complemented the SFS-based fastsimcoal2 divergence time estimates with the TT method. This is based on single samples from each population and makes different demographic model assumptions than fastsimcoal2. This method, likewise, suggested an older divergence time of 1.2–1.3 million years between desert and common warthogs (supplementary material table S8, Supplementary Material online). To reflect the uncertainty associated
with the demographic modeling, we conservatively conclude that the species divergence time is 400–1,700 kya. Of note, these divergence times are based on the most reasonable available mutation rate for suids (see Materials and Methods).

Adaptive Introgression Scan
We used genotype calls from the high-depth samples to estimate regions of putative adaptive introgression between desert warthog and ESA common warthog populations. We used the fd statistic, which estimates the fraction of the genome shared with a putatively introgressed ancestry, and is suited for local analyses and to detect adaptive introgression by an outlier-based approach. A Manhattan plot and histogram of the genome-wide distribution show outliers of high fd values (figs. 5A and B). The signals in the top 0.1% fd windows are shown in supplementary material table S4, Supplementary Material online. Most of these windows also show exceptionally low values of FST between desert warthogs and ESA common warthogs compared with the genomic averages (fig. 5C, supplementary material fig. S5, Supplementary Material online). Local genomic analyses are more sensitive to mapping and genotyping errors. We tried to alleviate this problem by filtering out all genomic windows that showed an excess of problematic sites, based on the proportion of sites filtered by the reference genome filtering (see Materials and Methods). Although this does not definitively exclude all local biases, we consider the top outlier windows to be the best candidates for adaptive introgression that can be identified with the available resources. This makes them worthy of exploration here and of validation in future studies. We did not pursue simulations to obtain P-values of our selection peaks under neutrality. This was because of the many assumptions needed to accurately simulate both the population demography and the sequencing process, which we believe would make such P-values difficult to interpret. Instead, we chose to only explore the top outliers from our scan.
Nine of the 22 top outlier windows contain prominent immune system–related genes and gene clusters, including four windows in the major histocompatibility complex (MHC). In addition, the Fc gamma receptor (FCGR) locus was also among the top regions, as well as genes Mx1, Mx2, PTGS2, JAKMIP1, and ST3GAL1 (supplementary material table S4 and fig. S4, Supplementary Material online), all of which have well-characterized immune system–related functions. The MHC is well known for its role in the adaptive immune system and, thereby, in pathogen resistance (Hill 1998). The FCGRs bind to immunoglobulin G and are hence important modulators of the immune response (Nimmerjahn and Ravetch 2006). PTGS2 encodes cyclooxygenase-2 (COX-2), which is essential for synthesizing prostaglandins, contributing to the innate immune system and inflammatory reactions (Ricciotti and FitzGerald 2011; Sander et al. 2017). The deadly African swine fever virus evades the pig immune response partly by inhibiting the expression of COX-2 (Granja et al. 2009). The two myxovirus-resistance genes (Mx1 and Mx2) are involved in resistance against viruses in general (Verhelst et al. 2013) and classical swine fever (Zhou et al. 2018). JAKMIP1 is involved in T-cell differentiation, specifically in the differentiation of virus-specific memory T cells and, therefore, in the adaptive immune system (Libri et al. 2008). ST3GAL1 plays a role both in T- and B-cell differentiation (Giovannone et al. 2018). Host variants in this gene are associated with the severity of influenza A infection (Maestri et al. 2015). Another gene in the outlier regions, MUC19, encodes a secreted mucin which is the major gel-forming mucin in pig saliva (Chen et al. 2004), and is perhaps also involved in immune system functions (Hasnain et al. 2013; McBride et al. 2018).

Discussion
A Revision of Warthog Phylogeography and its Broader Implications
Here we provide the first genome-level analyses of Phacochoerus and detailed insight into its previously contentious evolutionary history. We did not find support for three continental refugia during the Pleistocene (Muwanika et al. 2003), nor for a colonization of eastern Africa from southern Africa (Lorenzen et al. 2012). Instead, we found consistent evidence that the extant common warthog originated in western Africa, followed by an expansion circumnavigating the central African rainforest, hence first colonizing eastern Africa and later southern Africa. Our results do not support three common warthog subspecies (P. a. africanus, P. a. massaicus, and P. a. sundevallii) across the sampled populations, as there is very low differentiation and a shared demographic history until the late Pleistocene between eastern and southern populations. In contrast, the Ghanaian population, representing P. a. africanus, is highly differentiated from the ESA populations, diverged long ago (108–226 kya), and has a markedly different demographic history with a much higher effective population size. Together with the exclusive desert warthog gene flow into ESA common warthogs, these differences might warrant the distinction of western and ESA common warthog populations as different subspecies or Evolutionarily Significant Units (Moritz 1994). The large sampling gap between Ghana and East Africa, however, prevents us from making firm conclusions about whether the Ghana population represents a highly distinct population or the edge of a continuum.
Although our results revise several of the conclusions of previous warthog studies (Muwanika et al. 2003; Lorenzen et al. 2012) using lower numbers of genetic markers, they support some key biogeographical hypotheses. First, they show that major dispersal events in African savanna-adapted mammals alternate between an east–west axis in the northern savanna bioregion and a north–south axis in the southern savanna bioregion, the inverted L-shape coined by Kingdon (2013). Second, they corroborate East Africa as the intersection of these two savanna bioregions, serving as a “melting pot” of secondary contact
Garcia-Erill et al. · https://doi.org/10.1093/molbev/msac134 MBE
FIG. 5. Adaptive introgression scan. (A) Manhattan plot of fd values for 100 kb windows, estimated from called genotypes using the
four high-depth common warthog samples from eastern and southern Africa with desert warthog ancestry as P2, the Ghana common warthog
as P1, the desert warthog sample as P3, and the domestic pig as out-group. For windows with high desert warthog admixture, the names of the
genes overlapping them are shown. (B) Distribution of the fd values plotted in A, indicating the 99.9% quantile used as a threshold to detect
outliers. (C) Plot of fd and FST within two outlying FST windows and their surrounding genomic regions, together with their annotated protein coding
genes. Light gray-shaded areas indicate sites excluded from the analyses. Supplementary material fig. S10, Supplementary Material online shows
local context plots for all outlying windows in the genome scan.
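As a rough illustration of the thresholding shown in panel B, the following Python sketch flags windows whose fd value lies above an empirical quantile. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the (chromosome, window start, fd) tuple layout, and the linear-interpolation quantile rule are all hypothetical choices made for the example.

```python
from typing import List, Tuple

# Hypothetical record layout for one genomic window: (chromosome, window start, fd).
Window = Tuple[str, int, float]


def empirical_quantile(values: List[float], q: float) -> float:
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def outlier_windows(windows: List[Window], q: float = 0.999) -> List[Window]:
    """Keep windows whose fd is strictly above the q-quantile of all fd values."""
    threshold = empirical_quantile([fd for _, _, fd in windows], q)
    return [w for w in windows if w[2] > threshold]
```

With q = 0.999 this keeps roughly the top 0.1% of windows. Windows tied with the threshold are excluded by the strict comparison, which is a design choice of this sketch rather than something stated in the caption.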
between previously vicariant lineages, as well as a mosaic refugium for species (Lorenzen et al. 2012). Herein, we provide the most detailed phylogeographic validation of these two cornerstones of African biogeography, and demonstrate that adaptive introgression may be a previously underappreciated feature within the East African contact zone. This has wide-ranging implications for understanding the biogeographic role played by eastern Africa, including that of early hominins, of which at least three species coinhabited the region during the early Pleistocene (Antón et al. 2014).
Reconciling the Fossil and Genetic Records
Our demographic modeling is inconsistent with the previously suggested molecular divergence time of 4.4–8.8 Ma (Randi et al. 2002; Gongora et al. 2011). We estimate a much more recent divergence time that ranges within 390–624 or 840–1,728 kya, assuming either unidirectional or asymmetric bidirectional gene flow between ancestral desert warthog and common warthog populations. A species divergence time within the 400–1,700 kya interval reconciles the genetic evidence with the fossil record, which finds reliable evidence of desert warthog and common warthog coexisting only since around 400 kya (Cooke and Wilkinson 1978). The earliest fossils attributed to Phacochoerus have been dated at 2.2–2.0 Ma (Pickford 2013a, b; Pickford and Gommery 2016). Moreover, our inferred divergence time refutes the possibility that the two warthog species diverged in Eurasia before moving to Africa (Randi et al. 2002; Gongora et al. 2011), leaving no fossil traces for millions of years in Africa. The inconsistency between the genetic and fossil sources of evidence has obscured the evolutionary relationship between the two extant species of warthog, for example, by leading researchers to assume that any interbreeding between the two taxa was highly unlikely (De Jong and Butynski 2018).
The discrepancy between our coalescent-based estimate of divergence time and previous phylogenetic estimates can probably be attributed to two factors. First, genomic divergence predates species divergence by an expected 2Ne generations due to the presence of polymorphism in the ancestral population (Edwards and Beerli 2000), but this distinction can only explain a minor proportion of the discrepancy. Second, most of the literature (e.g., Frantz et al. 2013, 2015, 2016; Groenen 2016) used node calibrations that originated in the first phylogenetic studies based on mtDNA and a few nuclear markers (Randi et al. 2002; Gongora et al. 2011). Any early overestimations of lineage split times would, therefore, have been propagated in subsequent studies. A recent study highlights the possibility that suid evolutionary rates were highly overestimated (Zhang et al. 2022). Collectively, these results demonstrate the challenges of inferring mutation rates that are accurate over evolutionary time scales, and
the effect of mutation rate uncertainty on the dating of evolutionary events.
Adaptive Introgression of Disease Resistance
We show that the ESA lineage of common warthogs admixed with desert warthogs after splitting from the western African lineage 87–260 kya. The majority of this gene flow is inferred to be from desert warthogs to the ESA common warthog lineage. We hypothesized that, similar to Neanderthal (Homo neanderthalensis) introgression into humans (Sankararaman et al. 2014), desert warthog introgression contributed genetic variation beneficial to the ESA common warthogs. The Zambezian (eastern and southern African) savanna biomes differ from the Sudanian (western) savanna biome in vegetation, climate, and other aspects (Happold and Lock 2013). We, therefore, searched for candidates for adaptive introgression in ESA common warthog populations. The prevalence of immune system–related genes among the fd outliers was striking, with the MHC locus being particularly noteworthy. Pathogen resistance has previously been identified as a major driver of positive selection on genomic segments that introgressed from Neanderthals to modern humans (Dannemann et al. 2016; Deschamps et al. 2016; Quach et al. 2016; Enard and Petrov 2018). Pathogens are a major driver of selection in many species, perhaps particularly when a species expands its range and encounters exotic pathogens to which it has no pre-existing resistance (Karlsson et al. 2014). In such cases, adaptive introgression of resistance-conferring variants from a closely related native species is a plausible evolutionary scenario compared with selection on standing variation or de novo mutations (Hedrick 2013). It is likely that common warthogs encountered new pathogens as they moved eastwards and southwards through Africa, whereas desert warthogs may have had a head start of hundreds of thousands of years in adapting to such pathogens. One well-known and highly lethal suid pathogen naturally endemic to eastern and southern Africa, but not western Africa (Taylor 1977; Costard et al. 2009; Jori et al. 2013; Zhu et al. 2019), is African swine fever virus (Zhu et al. 2019; Wang et al. 2020). Intriguingly, three genes identified as adaptively introgressed in our study, the two Mx genes and PTGS2, play key roles in the immune response to African and classical swine fever viruses (Netherton et al. 2009; Zhou et al. 2018; Fan et al. 2020). This raises the possibility that African swine fever, or a similarly deadly pathogen, drove adaptive introgression in warthogs.
The signal of ancient introgression, together with our estimate of a shorter time of divergence, opens up the possibility that hybridization between the two warthog species can still occur. Hybridization would be important from a species conservation perspective. Although we did not detect hybrids, one common warthog from an area of sympatry in Kenya showed signs of increased desert warthog ancestry. Based on our findings, rare but on-going hybridization cannot be excluded, especially where the two species are broadly sympatric (Butynski and De Jong 2021; De Jong et al. in press).
Conclusions
We solve the long-standing riddle of the time of divergence of the two extant species of warthog and show how an eastward range expansion in common warthogs brought common warthog and desert warthog into contact. We found evidence of introgression between the two species. This occurred either in the Sudanian savanna region or in eastern Africa. Our results suggest that pathogen resistance played a prominent role in driving introgressed genomic segments from desert warthogs to high frequencies in eastern and southern common warthogs.
Materials and Methods
Sample Collection and Laboratory Protocol
Fifty-five samples of warthog tissue, mainly dried skin, were used in this study (supplementary material table S1, Supplementary Material online). They were collected during 1994–1999 in Ghana, Kenya, Tanzania, Zambia, Zimbabwe, and Namibia, and most were previously analyzed by Muwanika et al. (2003), who describe sample collection and storage. Six of the samples are from desert warthogs. The samples from common warthog cover three of the four currently recognized subspecies of common warthog. We also included samples from a giant forest hog (Hylochoerus meinertzhageni), a red river hog (Potamochoerus porcus), a bush pig (Po. larvatus), and two domestic pigs (Sus scrofa) from Africa (supplementary material table S1, Supplementary Material online). Following the manufacturer’s protocol, the QIAGEN DNeasy Blood and Tissue Kit (QIAGEN, Valencia, CA, USA) was used for DNA extraction. Subsequently, RNase was added to the samples to ensure they consisted of RNA-free genomic DNA. DNA concentrations were then measured with a Qubit 2.0 Fluorometer and a Nanodrop before using gel electrophoresis to check the quality of the genomic DNA.
Sequencing
All samples were sequenced using Illumina paired-end 150 bp reads. Forty-nine were sequenced to low depth (about 2–5X depth of coverage) on the Illumina NovaSeq platform. Ten samples were sequenced to medium-high depth (about 15–20X) on the Illumina HiSeq2500 platform. A total of 9.07 billion raw reads were generated and analyzed. Before mapping we assessed the quality of the raw reads using FastQC (Andrews 2010) and MultiQC (Ewels et al. 2016). We also downloaded publicly available whole-genome sequencing data of a babirusa (Babyrousa babyrussa) (Liu et al. 2019).
Processing and Mapping of Sequencing Data
Prior to mapping, we processed the paired-end reads with NGmerge (Gaspar 2018) to merge all read pairs that overlapped by at least 11 bp. We then mapped nonmerged paired-end reads and merged single-end reads with bwa mem v0.7.17 (Li and Durbin 2010) separately in paired-end
and single-end mode, using default settings and mapping to the domestic pig genome assembly Sscrofa 11.1. We marked and removed duplicate reads with samtools v. 1.9 (Li et al. 2009), and removed other low quality alignments using the samtools flag -F 3852 and nonproperly paired reads with samtools flag -f 3. Finally, we used samtools to combine nonmerged and merged aligned reads for each sample into a single BAM file.
Data Filtering
Throughout all analyses we excluded bases with a base call quality below 30, as well as reads with a mapping quality below 30. Furthermore, unless otherwise stated, we restricted all analyses to autosomal chromosomes and excluded regions annotated as repeats in the reference genome (Sscrofa 11.1). In addition, we conducted a series of sample-filtering steps and site-filtering steps that we explain in the following sections.
Sample Filtering
We first filtered the samples to exclude those with excessive per base sequencing error rates as well as closely related or duplicate individuals.
Error Rate Estimation
We used the “perfect-individual” method described in Orlando et al. (2013) and incorporated in ANGSD (Korneliussen et al. 2014) to estimate the per base sequencing error rate for each individual. This method measures error rates as excess mismatches between each sample and the out-group (domestic pig reference), relative to the mismatches between the out-group and the “perfect-individual.” This excess corresponds to the per base sequencing error rate assuming that all individuals are equally distant to the out-group and that the consensus sequence from the “perfect-individual” has few errors. Sample 1257 was chosen as the “perfect-individual” because it has higher depth. This sample can, therefore, be used to create a consensus sequence with few errors. This was done using ANGSD (-doFasta 2) and strict quality filters, including a minimum base quality of 35, minimum mapping quality of 35, minimum sequencing depth per site of 10, and keeping only uniquely mapping reads. Per base sequencing error rates were then estimated using all bases (-doAncError 1). Based on the results, we excluded sample ID 6436 due to excessive errors (supplementary material fig. S11, Supplementary Material online). After all site filters were applied (see below), we re-estimated the per-sample error rates using the same approach but only considering the sites that passed site filtering. This was used to correlate sample heterozygosity with sample error rate (supplementary material fig. S5, Supplementary Material online).
Identification and Removal of Close Relatives
We used the allele frequency–free method described in Waples et al. (2019) to identify duplicate or closely related samples based on the R0, R1, and KING-robust statistics. These statistics are based on identity by state (IBS) between pairs of samples. We estimated GLs with ANGSD for all desert warthogs and common warthogs jointly, using the GATK model (-GL 2; McKenna et al. 2010), and called single nucleotide polymorphisms (SNPs) using a P-value threshold of 10⁻⁶ and a MAF filter of 0.05. We then used the GLs as input for NGSrelate (Hanghøj et al. 2019; Waples et al. 2019), which implements the estimation of the three IBS statistics. We identified three groups of sample duplicates. The duplicated samples, all of which came from Ghana, had KING-robust kinship values >0.46, R1 values >6.37, and R0 values <2 × 10⁻⁶. We excluded all but one sample from each duplicated group, retaining samples 7152, 7155, and 6274 (supplementary material table S10, Supplementary Material online). For some analyses, we pooled the sequencing data for each duplicated individual to obtain higher depth. We also identified five pairs of first-degree relatives and removed one sample from each pair (supplementary material table S10, Supplementary Material online). The final data set for analysis consisted of four desert warthogs and 35 common warthogs (supplementary material tables S1 and S10, Supplementary Material online).
Site and Reference Sequence Filtering
We performed a series of quality controls on the reference sequence, as ambiguous regions can impact downstream analyses (Pečnerová et al. 2021). For these site-filtering steps, we used only the samples retained after the sample-filtering steps outlined above.
Mappability Filter
We estimated mappability with GENMAP (v1.2.0; Pockrandt et al. 2020) conservatively using 100 bp k-mers with up to two mismatches (and otherwise default settings) to compute the mappability scores for each site. Sites with a score <1 were then excluded from further analyses.
Global Depth Filter
We estimated global depth per site separately across low- and high-depth desert warthog and common warthog samples using ANGSD (Korneliussen et al. 2014). We then estimated the median depth per site across each of the low- and high-depth data sets, excluding sites that had a depth below half the median or above 1.5 times the median for either of the two sets.
Excess Heterozygosity (Hardy–Weinberg Equilibrium) Filter
We identified regions with excess of heterozygosity, which is indicative of mapping problems, using the Hardy–Weinberg equilibrium (HWE) likelihood ratio test built into the PCAngsd framework (Meisner and Albrechtsen 2018, 2019). This accounts for population structure in estimating per site inbreeding coefficients. Inbreeding coefficients (F) take values in the range from −1, when there is a total excess of heterozygous genotypes, to 1, indicating a total excess of homozygous genotypes, whereas 0 corresponds to having genotypes in HWE proportions within
each ancestry. It can, therefore, be used to identify regions with a strong excess of heterozygosity.
We first used ANGSD to estimate GLs for all common warthogs that passed sample quality control. We did not use any site filtering, except basic base quality and mapping quality filters, calling SNPs with MAF ≥0.05 and SNP P-value <10⁻⁶ and using the GATK GL model implemented in ANGSD. We used the GLs as input for PCAngsd, using the first three principal components (PCs) to obtain the individual allele frequencies used to correct for population structure (Meisner and Albrechtsen 2019). We considered sites with F < −0.95 and P-value <10⁻⁶ to be exclusively heterozygous in the ancestral populations that are polymorphic and excluded all sites within 10 kb of such sites.
All the site filters were used in subsequent analyses unless stated otherwise.
Data Analyses
GLs, SNPs, and Genotype Calling
We used GLs or single read sampling for all analyses in which the low-depth samples were analyzed in order to account for genotype uncertainty (da Fonseca et al. 2016).
ANGSD was used to estimate the GLs by applying the GATK model (-GL 2), inferring the major and minor allele from the GLs (-doMajorMinor 1), estimating the allele frequencies (-doMaf 1) and, where applicable, calling SNPs using the default likelihood ratio test (-SNP_pval 1e-6) and a minimum allele frequency filter of 0.05.
We called genotypes for the high-depth samples using bcftools v. 1.10 consensus caller (-c) (Li et al. 2009), and used bcftools v. 1.10 throughout to manipulate the called genotype files (Danecek et al. 2021). We called genotypes per sample for all sites, both variable and fixed, using only the base quality and read mapping quality filters. Further filtering of sites depended on the analyses and is described in the section corresponding to each of those analyses.
mtDNA Analyses
To check the taxonomic status of our samples, we initially performed a mitochondrial genome mapping and consensus calling for each of our warthog and out-group samples. We also included the published mtDNA sequence of a domestic pig, a red river hog, and a babirusa. To generate the consensus sequences, we mapped all desert warthog and common warthog samples, and the giant forest hog sample, to the common warthog mitochondrial genome (NC_008830; Wu et al. 2007), the bush pig sample to the red river hog mitochondrial genome (NC_020737; Hassanin et al. 2012), and the babirusa to the domestic pig mitochondrial genome (NC_000845; Lin et al. 1999). In all cases, we used the same pre- and post-processing of reads as with the nuclear genome mapping. We used the mapped data to build consensus mtDNA sequences for each sample using ANGSD (Korneliussen et al. 2014), by keeping the consensus base in each position (-doFasta 2). We removed heteroplasmic sites by masking (as “N”) any mitochondrial site where <95% of reads carried the same base.
After consensus calling, we aligned the mitochondrial genomes using BioEdit (Hall et al. 2011) and used the alignment in a BEAST v.1.8.4 (Drummond and Rambaut 2007) phylogenetic analysis. We used a GTR + G + I model and a coalescent BSP prior to avoid restricting the tree prior by imposing a narrow population size range. The population size prior was set to a uniform range between 10³ and 10⁶ to reflect that we use it as a nuisance parameter. Default priors were used for the other parameters. We used a single node calibration prior on the time of the most recent common ancestor of Suinae (all individuals except the babirusa). The prior was normally distributed with mean 10⁷ years and standard deviation 10⁶ years based on Frantz et al. (2016). We ran the MCMC chain for 10⁷ steps, sampling trees and parameters every 1000 steps. Convergence and proper mixing were assessed by visual inspection and by estimating parameter ESSs using TRACER (Rambaut et al. 2018). We used TreeAnnotator (Helfrich et al. 2018) to make a Maximum Clade Credibility (MCC) tree, discarding the first 1000 trees. Finally, we used iTOL (Letunic and Bork 2021) to visualize the MCC tree together with node posteriors and a time scale.
Population Structure
Principal Component Analysis
We used PCAngsd (Skotte et al. 2013; Meisner and Albrechtsen 2018) to estimate the genotype covariance matrix of the 35 common warthogs left after sample filtering. We used two PCs to estimate the individual allele frequencies used to estimate the covariance matrix. This was identified as the optimal number of PCs to model the population structure based on Velicer’s minimum average partial test implemented in PCAngsd.
Analyzing and Evaluating Population Admixture
We estimated admixture proportions for 35 common warthogs using NGSadmix (Skotte et al. 2013) based on GLs. We ran NGSadmix from K = 2 to K = 7 until either the results converged, which we defined as a maximum difference of two log-likelihood units between the top three maximum likelihood results, or 100 independent runs finished without convergence. For the runs that converged, we subsequently evaluated the model fit using evalAdmix (Garcia-Erill and Albrechtsen 2020). A positive correlation of the pairwise residuals indicates a poor model fit and can be used to identify the best-fitting value of K.
EEMS
We used ANGSD to generate an IBS matrix for 34 common warthogs for which we had sample location data (excluding the Zambian sample) by sampling a single read at each SNP (called with -SNP_pval 1e-6 and -minMaf 0.05) where we had data for both individuals (-doIBS 1 -makeMatrix 1).
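The single-read sampling behind such an IBS matrix can be illustrated with a short Python sketch. This is not ANGSD's implementation: the function name, the per-site list-of-bases input format, and the seeded random generator are assumptions made for illustration. For each site where both individuals have at least one read, one base is drawn per individual, the pair is scored 0 if the bases match and 1 otherwise, and the scores are averaged.

```python
import random
from typing import List, Optional, Sequence


def ibs_distance(reads_a: Sequence[List[str]],
                 reads_b: Sequence[List[str]],
                 rng: Optional[random.Random] = None) -> float:
    """Average single-read mismatch (0 or 1) over sites covered in both samples.

    reads_a[i] and reads_b[i] hold the bases observed at site i
    (a hypothetical input format chosen for this example).
    """
    rng = rng or random.Random(0)
    mismatches = 0
    used_sites = 0
    for bases_a, bases_b in zip(reads_a, reads_b):
        if not bases_a or not bases_b:
            continue  # skip sites lacking data in either individual
        base_a = rng.choice(bases_a)  # one randomly sampled read per individual
        base_b = rng.choice(bases_b)
        mismatches += int(base_a != base_b)
        used_sites += 1
    if used_sites == 0:
        raise ValueError("no sites with data in both individuals")
    return mismatches / used_sites
```

Applying such a function to every pair of individuals would fill a symmetric distance matrix with zeros on the diagonal.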
Thus, the distance between samples is calculated at each site (0 or 1) and averaged over all used sites.
This matrix was used as input for Estimated Effective Migration Surfaces (EEMS) (Petkova et al. 2016) along with coordinates of the sample origin, and analyzed using runeems_snps. We used 10 million steps and a burn-in of 2 million steps with 400 demes and 10,176,836 sites in total. The results were visualized using rEEMSplots.
Population Differentiation
We used ANGSD to generate per-population site allele frequency (saf) files from GLs for every population with more than one sample. For the Kenyan population, we excluded the sample modeled as a mixture of different clusters in NGSadmix. For each pair of populations, we then used realSFS (Nielsen et al. 2012) to estimate population pairwise 2dSFS from saf files based on randomly sampled blocks of contiguous nonfiltered sites until reaching 200 Mbp. We then used these 2dSFS as a prior for estimating genome-wide FST from saf files estimated from all autosomal sites (except those removed by the previously described site filters), using Hudson’s FST estimator (Bhatia et al. 2013). We also estimated FST between pairs of single high-depth samples, estimated from the 2dSFS between each pair of individuals using the called genotypes as input. This has been shown to be accurate in similar situations (Pečnerová et al. 2021).
Directionality of Range Expansion
To corroborate the direction of the range expansion, we estimated a directionality index among common warthog populations (Peter and Slatkin 2013). We included in this analysis those populations with a high-depth sample and known coordinate information of the sampling locality. This resulted in keeping four populations, each with one sample. We based the analyses on called genotypes, applying the same filters as when estimating D-statistics from called genotypes (see below) and keeping only sites variable within the four common warthog samples. We used the red river hog sample as an out-group to assign ancestral and derived states to each SNP. We then used the R package “rangeexpansion” to infer the directionality index psi between each population pair.
Heterozygosity
We estimated the genome-wide heterozygosity for all samples using ANGSD and realSFS by estimating the individual SFS and then dividing the number of heterozygous sites by the total number of sites. For the six high-depth samples, we also estimated the genome-wide heterozygosities from the genotypes called with bcftools by counting the total number of heterozygous sites of each sample and dividing by the total number of sites remaining after filtering.
D-Statistics
We used D-statistics (Green et al. 2010) to investigate presence of gene flow between desert warthogs and common warthog populations, and to investigate potential introgression of an out-group to both warthog species that might confound the analyses. We estimated D-statistics for both low- and high-depth samples, using different approaches for each.
For the low-depth samples, we applied the single read sampling approach implemented in ANGSD (-doAbbaBaba 1). Only sites that passed all reference genome filters, with a domestic pig (sample ID 1003) as out-group, were used. We used a block jackknife approach to estimate standard errors and calculate Z-scores (Busing et al. 1999), using blocks of 5 Mbp size. We also estimated D-statistics based on single read sampling for a subset of common warthogs from Tanzania and Ghana, with the domestic pig samples as P3 and a babirusa as the out-group. This was to investigate potential confounding gene flow of an out-group of all extant African suids into the Ghanaian common warthog population.
For the high-depth samples, we estimated D-statistics based on called genotypes. We used bcftools to filter the called genotypes with the following criteria. We removed sites with mappability below one and sites within annotated repeats in the reference genome as in other analyses. Due to the inclusion of out-group samples that had not been considered in the filters based on depth and excess heterozygosity, we did not use the previous site filters. Instead, we filtered based on the set of genotype calls, removing sites where any of the samples had a depth below 10 or above 50, kept only biallelic SNPs, removed sites where all samples were called as heterozygous, and removed sites where any sample had a heterozygous call with fewer than three reads supporting either of the two alleles. Based on the called genotypes, we used the qpDstat program from the package AdmixTools (Patterson et al. 2012) to calculate the counts of ABBA and BABA sites for trees where desert warthog was P3, domestic pig was the out-group, and with all possible combinations of common warthog populations as P1 and P2. We furthermore estimated D-statistics with the Ghanaian common warthog population as P1, the desert warthog or Tanzanian common warthog as P2, the red river hog sample as P3 and the domestic pig as out-group. This was done to investigate potential gene flow of an out-group of the Phacochoerus genus to the Ghanaian common warthog that could confound the results. We applied block jackknife to estimate standard errors and calculate Z-scores, using 461 blocks of 5 cM and assuming a uniform conversion of 1 cM/Mbp.
TreeMix and qpGraph
We ran TreeMix (Pickrell and Pritchard 2012) on 34 common warthogs (excluding the Kenyan sample modeled as admixed in the NGSadmix analyses) grouped by the country of origin of the samples and on four desert warthogs and included the two domestic pig samples as an out-group. We generated the input allele counts per population by first estimating the likelihood of sample allele frequency for each population with ANGSD. We used only sites that passed all site filters, polarized using the
reference domestic pig genome as the ancestral state. We then merged all populations by keeping only sites where all populations had data, and called within each population the maximum likelihood sample frequency as the allele count. This procedure resulted in approximately 17 million sites as input data. For each possible migration from 0 to 3, we ran 25 differently seeded replicates of TreeMix and chose the run with the highest likelihood, breaking ties at random. We set domestic pig as an out-group, and used a block size of 28 k input sites, corresponding to an assumed maximum LD of 5 Mbp.
We ran qpGraph from the package AdmixTools (Patterson et al. 2012) on the high-depth individuals from a subset of the populations (Desert, Ghana, Tanzania, and Namibia) using the heuristic graph search tool qpBrute (Ní Leathlobhair et al. 2018; Liu et al. 2019). We used the same set of genotype calls as used to estimate the D-statistics from high-depth samples. Z-scores were estimated with block jackknife using blocks of 5 Mbp. We ran qpGraph on all SNPs from the filtered sites and, otherwise, used the default settings.
Adaptive Introgression Scan
We scanned the genome of the common warthog samples from eastern and southern Africa for regions showing strong signals of introgression from desert warthogs. We based this analysis on the same genotype call set as used for estimating D-statistics (see above). We estimated local admixture proportions between pooled high-depth ESA samples (i.e., we pooled the four high-depth samples from Tanzania, Zambia, Zimbabwe, and Namibia) and the high-depth desert warthog sample using the ABBA-BABA based fd statistic (Martin et al. 2015) in windows of 100 kb. The fd statistic measures the fraction of the genome shared due to introgression between populations P2 and P3. This is measured as the fraction of excess sharing of derived alleles between P2 and P3 relative to between a P1 and the P3 population. This P1 population must be sister to P2 and assumed to not have received any introgression. An out-group is used to polarize derived alleles. This statistic allows for the possibility of bidirectional introgression between P2 and P3 (Martin et al. 2015). It has been shown that this statistic is powerful for detecting adaptive introgression when selection is intermediate or
bottom 5% of missing data, which resulted in keeping windows with information in at least 34.2% of sites. We selected the top 0.1% of the remaining fd values and lumped all outlier windows within 1 Mbp of each other into a single signal. We considered these regions as candidates of adaptive introgression and extracted the genes contained in these windows from the domestic pig reference genome annotation. We acknowledge that our adaptive introgression scan is sensitive to the relatively small sample sizes, as well as possibly biased by mapping issues given that we map our data to the distantly related domestic pig genome. We addressed the latter bias by imposing a strict set of filters for the inclusion of sites in the analyses (see above).
Demographic History Inference
Mutation Rates and Dating
Previous studies used a rate of 2.5 × 10⁻⁸ mutations per site per generation for demographic analyses in suids (Groenen et al. 2012). This was based on now-obsolete high estimates of the human mutation rate and appears excessively high compared with rates inferred from phylogenomic analyses in ruminants (Chen et al. 2019). Given that mutation rates are notoriously difficult to estimate, in our main results, we used the mean rate of 2.48 × 10⁻⁹ mutations per site per year phylogenetically estimated across wild ruminants (Chen et al. 2019), which have life histories comparable with warthogs. This rate is, furthermore, almost equal to the mean annual rate of 2.2 × 10⁻⁹ mutations per site per year inferred across a broad range of mammals (Kumar and Subramanian 2002). We assumed a generation time of 6 years in accordance with Pacifici et al. (2013) and therefore a per generation mutation rate of 1.49 × 10⁻⁸. The consensus mutation rate for cattle is 1.17 × 10⁻⁸ per generation estimated using pedigrees and a reproductive age of 5 years (Harland et al. 2017). This, again, is similar to the rate we assumed for warthogs. Recently, Zhang et al. (2022) used domestic pig pedigrees to estimate a much lower mutation rate of 3.6 × 10⁻⁹ per site per generation, but mutation rates estimated through pedigrees are very sensitive to filtering choices as they depend on the ability to successfully distinguish between low numbers of true de novo mutations and sequencing errors or other artifacts
strong (Racimo et al. 2017). We used the high-depth sam- (Bergeron et al. 2022). We note that inferred dates scale
ple from Ghana as P1 and a domestic pig (sample ID 1003) about linearly with the assumed mutation rate, so our dat-
as an out-group. We also estimated FST in sliding windows ing estimates can be readily converted should more accur-
between desert warthogs and the pooled ESA common ate estimates of the suid mutation rate become available.
warthog samples using both low- and high-depth samples.
We estimated FST in genomics windows with ANGSD by PSMC
estimating safs for both populations, and then using these Using PSMC (Li and Durbin 2011), we estimated the effect-
saf files d to estimate a 2dSFS. The saf file was then also ive population sizes back through time. We ran PSMC on
used to estimate FST in 100 kb windows, using the the six high-depth–sequenced samples and added two
Hudson estimator and with the 2dSFS as prior high-depth samples from Ghana obtained by merging sev-
(Korneliussen et al. 2013). eral duplicate low-depth samples (see above and
We restricted these sliding window fd and FST analyses to supplementary material table S1, Supplementary
sites retained after the reference genome filtering outlined Material online). We applied the reference filters and
above. Furthermore, we removed any windows in the called genotypes using bcftools v1.10. We removed sites
13
Garcia-Erill et al. · https://doi.org/10.1093/molbev/msac134 MBE
covered by less than ten reads or more than two times convert model estimates from coalescence units to abso-
each sample mean depth, as well as sites called as hetero- lute values (i.e., years).
zygous where any of the two alleles was in less than three
reads. We used default settings for all PSMC parameters. Split Time Estimates with the TT Method
Results were scaled to real time using a generation time We applied the TT method (Schlebusch et al. 2017; Sjödin
of 6 years following (Pacifici et al. 2013) and a mutation et al. 2021) as a complement and validation of the split
rate of 1.49 × 10−8 per generation (as discussed above). times inferred with fastsimcoal2. This method is based
on modeling the probability of the genotype combina-
tions from two diploid samples from each population,
Fastsimcoal as a function of several parameters that include diver-
The demographic history of a representative subset of the gence times. It does not make any assumption on the
common warthog populations (Ghana, Tanzania, and population sizes after the split, but it assumes a constant
Namibia) and desert warthogs was further investigated ancestral population size and a clean split without pos-
terior gene flow. We obtained the genotype combina-
Downloaded from https://academic.oup.com/mbe/article/39/7/msac134/6627297 by guest on 06 April 2024
using a coalescent simulation based method implemented
in fastsimcoal2 v2.6.0.3 (Excoffier et al. 2013). To minimize tions as the 2DSFS between the high-depth desert
potential bias arising when determining ancestral allelic warthog sample and each of the five high-depth common
states, we used the folded 2dSFS based on randomly sub- warthog samples, and polarized the ancestral state as the
sampled 200 Mbp (see FST estimation above) as input for allele in the domestic pig reference. In addition, we ex-
the inference. Five plausible demographic models were cluded from the analyses those sites for which the babi-
tested (supplementary material fig. S5, Supplementary rusa and red river hog samples were not homozygous
Material online). As a baseline model, the “No-admixture” for the reference allele.
scenario is a model without any admixture events and fol-
lows a population tree equivalent to the TreeMix tree with- Supplementary Material
out any migration edges. We included a ghost population Supplementary data are available at Molecular Biology and
for consistency with the two other models. The 1-admixture Evolution online.
scenario adds two admixture events with admixture pro-
portions fixed to the results from the best-fitting
qpGraph. The “2-admixture” models a scenario in which Acknowledgments
there is admixture between the extinct Cape warthog, mod- The authors thank Amal Al-Chaer for her laboratory work
eled here as a ghost population branching off the desert related to the study. This work was made possible by the
warthog, and Namibia. We also considered a model 3-ad- active and expert collaboration of the researchers from
mixture similar to 1-admixture, but where admixture be- the Global-North and Africa in the Global-South. This long-
tween desert warthogs and ancestral ESA common term collaboration has grown into a network of colleagues
warthogs is bidirectional rather than unidirectional. across countries sharing and exchanging research ideas and
Finally, to check whether the better fit of the bidirectional joint funding opportunities that would not have been pos-
gene flow model 3-admixture relative to 1-admixture was sible otherwise. R.H., G.G.E., and X.W. were supported by a
due to not fixing the admixture proportions, we included Danmarks Frie Forskningsfond Sapere Aude research grant
an additional version of the 1-admixture model where the (DFF8049-00098B). R.H. and L.D.B. were supported by an
unidirectional admixture is not restricted to the values in- European Research Council Starting Grant (No. 853442).
ferred by qpGraph. A.A. and M.S.R. received funding from the Novo Nordisk
For each model, we ran 50 independent runs to find the Foundation (NNF20OC0061343) and the Danmarks Frie
best-fitting parameters yielding the highest likelihood, with Forskningsfond (DFF-0135-00211B).
100,000 coalescent simulations per likelihood estimation
(-n100000) and 20 conditional maximization algorithm cy-
cles (-L20). The model fits were assessed by comparing Data Availability
maximum likelihoods (Excoffier et al. 2013), despite limita- The raw sequencing data generated for this project have
tions due to using composite likelihoods. In order to evalu- been deposited in FastQ format in SRA with BioProject ac-
ate the fit of the unidirectional and bidirectional gene flow cession code PRJNA837362. Scripts for data analyses were
models 1-admixture and 3-admixture to the real data sets, developed using R (R Core Team 2018), python3 (Van
we computed Hudsons FST based on the simulated joint SFS Rossum and Drake 2009), and snakemake (Mölder et al.
for the maximum likelihood parameters. To obtain the 95% 2021) and can be found in the github page https://
CI, we generated 100 parametric bootstraps based on the github.com/GenisGE/warthogscripts.
maximum likelihood parameters estimated under the
best model and ran 50 independent runs for each boot-
strap, using the same settings as for the analyses of the ori- References
ginal data set. A mutation rate of 1.49 × 10−8 per site per FAO. 2012. Global Ecological Zones for FAO Forest Reporting: 2010
generation (see Mutation Rates and Dating) and a gener- Update. Rome, Italy: Food and Agriculture Organization of the
ation time of 6 years (Pacifici et al. 2013) were used to United Nations.
Andrews S. 2010. FastQC: a quality control tool for high throughput sequence data [cited 2022 Jun 21]. Available from: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
Antón SC, Potts R, Aiello LC. 2014. Evolution of early Homo: an integrated biological perspective. Science. 345:1236828.
Arctander P, Johansen C, Coutellec-Vreto MA. 1999. Phylogeography of three closely related African bovids (tribe Alcelaphini). Mol Biol Evol. 16:1724–1739.
Bergeron LA, Besenbacher S, Turner T, Versoza CJ, Wang RJ, Price AL, Armstrong E, Riera M, Carlson J, Chen H-Y, et al. 2022. The Mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. Elife. 11:e73577.
Bhatia G, Patterson N, Sankararaman S, Price AL. 2013. Estimating and interpreting FST: The impact of rare variants. Genome Res. 23:1514–1521.
Busing FMTA, Meijer E, Van Der Leeden R. 1999. Delete-m jackknife for unequal m. Stat Comput. 9:3–8.
Butynski TM, De Jong YA. 2018. Common warthog (Phacochoerus africanus). In: Melletti M, Meijaard E, editors. Ecology, conservation and management of wild pigs and peccaries. Cambridge, UK: Cambridge University Press. p. 85–100.
Butynski TM, De Jong YA. 2021. Sympatry between desert warthog Phacochoerus aethiopicus and common warthog Phacochoerus africanus in Kenya, with particular reference to Laikipia County. Suiform Soundings. 20:33–44.
Chen L, Qiu Q, Jiang Y, Wang K, Lin Z, Li Z, Bibi F, Yang Y, Wang J, Nie W, et al. 2019. Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits. Science. 364(6446):eaav6202.
Chen Y, Zhao YH, Kalaslavadi TB, Hamati E, Nehrke K, Le AD, Ann DK, Wu R. 2004. Genome-wide search and identification of a novel gel-forming mucin MUC19/Muc19 in glandular tissues. Am J Respir Cell Mol Biol. 30:155–165.
Cooke HBS. 1994. Phacochoerus modestus from Sterkfontein Member-5. S Afr J Sci. 90:99–100.
Cooke HBS, Wilkinson AF. 1978. Suidae and Tayassuidae. In: Maglio VJ, Cooke HBS, editors. Evolution of African Mammals. Cambridge, MA: Harvard University Press. p. 435–482.
Costard S, Wieland B, de Glanville W, Jori F, Rowlands R, Vosloo W, Roger F, Pfeiffer DU, Dixon LK. 2009. African swine fever: how can global spread be prevented? Philos Trans R Soc Lond B Biol Sci. 364:2683–2696.
Cumming DHM. 2013. Phacochoerus africanus, common warthog. In: Kingdon J, Hoffman M, editors. Mammals of Africa. London (GB): Bloomsbury. Volume 6: Pigs, hippopotamuses, chevrotain, giraffes, deer and bovids. p. 54–60.
da Fonseca RR, Albrechtsen A, Themudo GE, Ramos-Madrigal J, Sibbesen JA, Maretty L, Zepeda-Mendoza ML, Campos PF, Heller R, Pereira RJ. 2016. Next-generation biology: sequencing and data analysis approaches for non-model organisms. Mar Genomics. 30:3–13.
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, et al. 2021. Twelve years of SAMtools and BCFtools. Gigascience. 10(2):giab008.
Dannemann M, Andrés AM, Kelso J. 2016. Introgression of neandertal- and denisovan-like haplotypes contributes to adaptive variation in human toll-like receptors. Am J Hum Genet. 98:22–33.
De Jong YA, Butynski TM. 2018. Desert warthog (Phacochoerus aethiopicus). In: Melletti M, Meijaard E, editors. Ecology, conservation and management of wild pigs and peccaries. Cambridge, UK: Cambridge University Press. p. 101–113.
De Jong YA, Butynski TM. 2021. New desert warthog records for Laikipia County, central Kenya [cited 2022 Jun 21]. Available from: https://www.wildsolutions.nl/desert-warthog-laikipia/.
De Jong YA, d’Huart JP, Butynski TM. in press. Biogeography and conservation of desert warthog (Phacochoerus aethiopicus) and common warthog (Phacochoerus africanus) in the Horn of Africa. Mammalia.
Deschamps M, Laval G, Fagny M, Itan Y, Abel L, Casanova J-L, Patin E, Quintana-Murci L. 2016. Genomic signatures of selective pressures and introgression from archaic hominins at human innate immunity genes. Am J Hum Genet. 98:5–21.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.
Edwards SV, Beerli P. 2000. Perspective: gene divergence, population divergence, and the variance in coalescence time in phylogeographic studies. Evolution. 54:1839–1854.
Enard D, Petrov DA. 2018. Evidence that RNA viruses drove adaptive introgression between Neanderthals and modern humans. Cell. 175:360–371.e13.
Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 32:3047.
Ewer RF. 1956. The fossil suids of the Transvaal caves. Proc Zool Soc Lond. 127:527–544.
Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M. 2013. Robust demographic inference from genomic and SNP data. PLoS Genet. 9:e1003905.
Fan W, Jiao P, Zhang H, Chen T, Zhou X, Qi Y, Sun L, Shang Y, Zhu H, Hu R, et al. 2020. Inhibition of African swine fever virus replication by porcine type I and type II interferons. Front Microbiol. 11:1203.
Frantz LAF, Meijaard E, Gongora J, Haile J, Groenen MAM, Larson G. 2016. The evolution of suidae. Ann Rev Anim Biosci. 4:61–85.
Frantz LAF, Schraiber JG, Madsen O, Megens H-J, Bosse M, Paudel Y, Semiadi G, Meijaard E, Li N, Crooijmans RPMA, et al. 2013. Genome sequencing reveals fine scale diversification and reticulation history during speciation in Sus. Genome Biol. 14:R107.
Frantz LAF, Schraiber JG, Madsen O, Megens H-J, Cagan A, Bosse M, Paudel Y, Crooijmans RPMA, Larson G, Groenen MAM. 2015. Evidence of long-term gene flow and selection during domestication from analyses of Eurasian wild and domestic pig genomes. Nat Genet. 47:1141–1148.
Garcia-Erill G, Albrechtsen A. 2020. Evaluation of model fit of inferred admixture proportions. Mol Ecol Resour. 20:936–949.
Gaspar JM. 2018. NGmerge: Merging paired-end reads via novel empirically-derived models of sequencing errors. BMC Bioinform. 19:536.
Giovannone N, Antonopoulos A, Liang J, Geddes Sweeney J, Kudelka MR, King SL, Lee GS, Cummings RD, Dell A, Barthel SR, et al. 2018. Human B cell differentiation is characterized by progressive remodeling of O-linked glycans. Front Immunol. 9:2857.
Gongora J, Cuddahee RE, do Nascimento FF, Palgrave CJ, Lowden S, Ho SYW, Simond D, Damayanti CS, White DJ, Tay WT, et al. 2011. Rethinking the evolution of extant sub-Saharan African suids (Suidae, Artiodactyla): Evolution of extant African Suidae. Zoologica Scripta. 40:327–335.
Granja AG, Sánchez EG, Sabina P, Fresno M, Revilla Y. 2009. African swine fever virus blocks the host cell antiviral inflammatory response through a direct inhibition of PKC-theta-mediated p300 transactivation. J Virol. 83:969–980.
Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH-Y. 2010. A draft sequence of the Neandertal genome. Science. 328:710–722.
Groenen MAM. 2016. A decade of pig genome sequencing: A window on pig domestication and evolution. Genet Sel Evol. 48:23.
Groenen MAM, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, Rogel-Gaillard C, Park C, Milan D, Megens H-J, et al. 2012. Analyses of pig genomes provide insight into porcine demography and evolution. Nature. 491:393–398.
Grubb P. 1993. The Afrotropical suids (Phacochoerus, Hylochoerus and Potamochoerus). Taxonomy and description. In: Oliver WLR, editor. Pigs, peccaries and hippos: Status survey and conservation action plan. Gland, Switzerland: IUCN. p. 66–75.
Grubb P. 2005. Order Artiodactyla. In: Wilson DER, editor. Mammal species of the world: A taxonomic and geographic reference. Vol. 1. Baltimore (MD): The Johns Hopkins University Press. p. 637–722.
Grubb P, D’Huart J-P. 2013. Phacochoerus aethiopicus Desert Warthog. In: Kingdon J, Hoffmann M, editors. Mammals of Africa: Volume VI: Pigs, hippopotamuses, chevrotain, giraffes, deer and bovids. London: Bloomsbury Publishing. p. 51–53.
Hall T, Biosciences I, Carlsbad C. 2011. BioEdit: An important software for molecular biology. GERF Bull Biosci. 2:60–61.
Hanghøj K, Moltke I, Andersen PA, Manica A, Korneliussen TS. 2019. Fast and accurate relatedness estimation from high-throughput sequencing data in the presence of inbreeding. Gigascience. 8:giz034.
Happold D, Lock JM. 2013. The biotic zones of Africa. In: Kingdon J, Happold DCD, Butynski TM, Hoffman M, Happold M, Kalina J, editors. Mammals of Africa. London: Bloomsbury Publishing. Volume 1: Introductory chapters and Afrotheria. p. 57–74.
Harland C, Charlier C, Karim L, Cambisano N, Deckers M, Mni M, Mullaart E, Coppieters W, Georges M. 2017. Frequency of mosaicism points towards mutation-prone early cleavage cell divisions in cattle. BioRxiv [Preprint]. doi:10.1101/079863.
Harris JM, Cerling TE. 2002. Dietary adaptations of extant and Neogene African suids. J Zool. 256:45–54.
Harris JM, White TD. 1979. Evolution of the Plio-Pleistocene African Suidae. Trans Am Philos Soc. 69:1.
Hasnain SZ, Gallagher AL, Grencis RK, Thornton DJ. 2013. A new role for mucins in immunity: Insights from gastrointestinal nematode infection. Int J Biochem Cell Biol. 45:364–374.
Hassanin A, Delsuc F, Ropiquet A, Hammer C, van Vuuren B J, Matthee C, Ruiz-Garcia M, Catzeflis F, Areskoug V, Nguyen TT, et al. 2012. Pattern and timing of diversification of Cetartiodactyla (Mammalia, Laurasiatheria), as revealed by a comprehensive analysis of mitochondrial genomes. C R Biol. 335:32–50.
Hedrick PW. 2013. Adaptive introgression in animals: Examples and comparison to new mutation and standing variation as sources of adaptive variation. Mol Ecol. 22:4606–4618.
Helfrich P, Rieb E, Abrami G, Lücking A, Mehler A. 2018. TreeAnnotator: Versatile visual annotation of hierarchical text relations. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) [cited 2022 Jun 21]. Available from: https://aclanthology.org/L18-1308.
Hewitt GM. 2004. Genetic consequences of climatic oscillations in the quaternary. Philos Trans R Soc Lond B Biol Sci. 359:183–195.
Hill AV. 1998. The immunogenetics of human infectious diseases. Ann Rev Immunol. 16:593–617.
Hopwood AT, Hollyfield JP. 1954. An annotated bibliography of the fossil mammals of Africa (1742-1950). London: British Museum (Nat. Hist.).
Jori F, Vial L, Penrith ML, Pérez-Sánchez R, Etter E, Albina E, Michaud V, Roger F. 2013. Review of the sylvatic cycle of African swine fever in sub-Saharan Africa and the Indian ocean. Virus Res. 173:212–227.
Karlsson EK, Kwiatkowski DP, Sabeti PC. 2014. Natural selection and infectious disease in human populations. Nat Rev Genet. 15:379–393.
Kingdon J. 2013. Mammalian evolution in Africa. In: Kingdon J, Happold DCD, Butynski TM, Hoffman M, Happold M, Kalina J, editors. Mammals of Africa. Introductory chapters and Afrotheria. Vol. 1. London: Bloomsbury Publishing. p. 75–100.
Korneliussen TS, Albrechtsen A, Nielsen R. 2014. ANGSD: Analysis of next generation sequencing data. BMC Bioinform. 15:356.
Korneliussen TS, Moltke I, Albrechtsen A, Nielsen R. 2013. Calculation of Tajima’s D and other neutrality test statistics from low depth next-generation sequencing data. BMC Bioinform. 14:289.
Kumar S, Subramanian S. 2002. Mutation rates in mammalian genomes. Proc Natl Acad Sci U S A. 99:803–808.
Letunic I, Bork P. 2021. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49:W293–W296.
Li H, Durbin R. 2010. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 26:589–595.
Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature. 475:493–496.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Subgroup 1000 Genome Project Data Processing. 2009. The sequence alignment/map format and SAMtools. Bioinformatics. 25:2078–2079.
Libri V, Schulte D, van Stijn A, Ragimbeau J, Rogge L, Pellegrini S. 2008. Jakmip1 is expressed upon T cell differentiation and has an inhibitory function in cytotoxic T lymphocytes. J Immunol. 181:5847–5856.
Lin CS, Sun YL, Liu CY, Yang PC, Chang LC, Cheng IC, Mao SJ, Huang MC. 1999. Complete nucleotide sequence of pig (Sus scrofa) mitochondrial genome and dating evolutionary divergence within Artiodactyla. Gene. 236:107–114.
Liu L, Bosse M, Megens H-J, Frantz LAF, Lee Y-L, Irving-Pease EK, Narayan G, Groenen MAM, Madsen O. 2019. Genomic analysis on pygmy hog reveals extensive interbreeding during wild boar expansion. Nat Commun. 10:1992.
Lorenzen ED, Arctander P, Siegismund HR. 2006. Regional genetic structuring and evolutionary history of the impala Aepyceros melampus. J Hered. 97:119–132.
Lorenzen ED, Heller R, Siegismund HR. 2012. Comparative phylogeography of African savannah ungulates. Mol Ecol. 21:3656–3670.
Maestri A, Sortica VA, Tovo-Rodrigues L, Santos MC, Barbagelata L, Moraes MR, Alencar de Mello W, Gusmão L, Sousa RCM, Emanuel Batista Dos Santos S. 2015. Siaα2-3Galβ1- receptor genetic variants are associated with influenza A(H1N1)pdm09 severity. PLoS One. 10:e0139681.
Martin SH, Davey JW, Jiggins CD. 2015. Evaluating the use of ABBA–BABA statistics to locate introgressed loci. Mol Biol Evol. 32:244–257.
McBride KE, Cheemarla NR, Guerrero-Plata MA. 2018. Role of mucin 19 in the respiratory tract. J Immunol. 200:60.8.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297–1303.
Meisner J, Albrechtsen A. 2018. Inferring population structure and admixture proportions in low-depth NGS data. Genetics. 210:719–731.
Meisner J, Albrechtsen A. 2019. Testing for Hardy–Weinberg equilibrium in structured populations using genotype or low-depth next generation sequencing data. Mol Ecol Resour. 19:1144–1152.
Mölder F, Jablonski KP, Letcher B, Hall MB, Tomkins-Tinch CH, Sochat V, Forster J, Lee S, Twardziok SO, Kanitz A, et al. 2021. Sustainable data analysis with snakemake. F1000Res. 10:33.
Moritz C. 1994. Defining “Evolutionarily Significant Units” for conservation. Trends Ecol Evol. 9:373–375.
Muwanika VB, Nyakaana S, Siegismund HR, Arctander P. 2003. Phylogeography and population structure of the common warthog (Phacochoerus africanus) inferred from variation in mitochondrial DNA sequences and microsatellite loci. Heredity. 91:361–372.
Nersting LG, Arctander P. 2001. Phylogeography and conservation of impala and greater kudu. Mol Ecol. 10:711–719.
Netherton CL, Simpson J, Haller O, Wileman TE, Takamatsu H-H, Monaghan P, Taylor G. 2009. Inhibition of a large double-stranded DNA virus by MxA protein. J Virol. 83:2310–2320.
Nielsen R, Korneliussen T, Albrechtsen A, Li Y, Wang J. 2012. SNP calling, genotype calling, and sample allele frequency estimation from new-generation sequencing data. PLoS One. 7(7):e37558.
Nimmerjahn F, Ravetch JV. 2006. Fcgamma receptors: Old friends and new family members. Immunity. 24:19–28.
Ní Leathlobhair M, Perri AR, Irving-Pease EK, Witt KE, Linderholm A, Haile J, Lebrasseur O, Ameen C, Blick J, Boyko AR, et al. 2018. The evolutionary history of dogs in the Americas. Science. 361:81–85.
Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature. 499(7456):74–78.
Pacifici M, Santini L, Di Marco M, Baisero D, Francucci L, Marasini GG, Visconti P, Rondinini C. 2013. Generation length for mammals. Nat Conserv. 5:89.
Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D. 2012. Ancient admixture in human history. Genetics. 192:1065–1093.
Pečnerová P, Garcia-Erill G, Liu X, Nursyifa C, Waples RK, Santander CG, Quinn L, Frandsen P, Meisner J, Stæger FF, et al. 2021. High genetic diversity and low differentiation reflect the ecological versatility of the African leopard. Curr Biol. 31:1862–1871.e5.
Pedersen C-ET, Albrechtsen A, Etter PD, Johnson EA, Orlando L, Chikhi L, Siegismund HR, Heller R. 2018. A southern African origin and cryptic structure in the highly mobile plains zebra. Nat Ecol Evol. 2:491–498.
Peter BM, Slatkin M. 2013. Detecting range expansions from genetic data. Evolution. 67:3274–3289.
Petkova D, Novembre J, Stephens M. 2016. Visualizing spatial population structure with estimated effective migration surfaces. Nat Genet. 48:94–100.
Pickford M. 2006. Synopsis of the biochronology of African neogene and quaternary suiformes. Trans R Soc S Afr. 61:51–62.
Pickford M. 2012. Ancestors of Broom’s pigs. Trans R Soc S Afr. 67:17–35.
Pickford M. 2013a. The diversity, age, biogeographic and phylogenetic relationships of Plio-Pleistocene suids from Kromdraai, South Africa. Ann Ditsong Natl Museum Nat History. 3:11–32.
Pickford M. 2013b. Locomotion, diet, body weight, origin and geochronology of Metridiochoerus andrewsi from the Gondolin karst deposits, Gauteng, South Africa. Ann Ditsong Natl Museum Nat History. 3:33–47.
Pickford M, Gommery D. 2016. Fossil Suidae (Artiodactyla, Mammalia) from Aves Cave I and nearby sites in Bolt’s Farm Palaeokarst System, South Africa. Estudios Geologicos (Madrid). 72:059.
Pickford M, Gommery D. 2020. Fossil suids from Bolt’s Farm Palaeokarst System, South Africa: Implications for the taxonomy of Potamochoeroides and Notochoerus and for biochronology. Estudios Geologicos (Madrid). 76:127.
Pickrell J, Pritchard J. 2012. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8:e1002967.
Pockrandt C, Alzamel M, Iliopoulos CS, Reinert K. 2020. GenMap: Ultra-fast computation of genome mappability. Bioinformatics. 36:3687–3692.
Quach H, Rotival M, Pothlichet J, Loh Y-HE, Dannemann M, Zidane N, Laval G, Patin E, Harmant C, Lopez M, et al. 2016. Genetic adaptation and neandertal admixture shaped the immune system of human populations. Cell. 167:643–656.e17.
Racimo F, Marnetto D, Huerta-Sánchez E. 2017. Signatures of archaic adaptive introgression in present-day human populations. Mol Biol Evol. 34:296–317.
R Core Team. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing [cited 2022 Jun 21]. Available from: https://www.R-project.org/.
Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. 2018. Posterior summarization in Bayesian phylogenetics using tracer 1.7. Syst Biol. 67:901–904.
Randi E, D’Huart J-P, Lucchini V, Aman R. 2002. Evidence of two genetically deeply divergent species of warthog, Phacochoerus africanus and P. aethiopicus (Artiodactyla: Suiformes) in East Africa. Mamm Biol. 67:91–96.
Ricciotti E, FitzGerald GA. 2011. Prostaglandins and inflammation. Arterioscler Thromb Vasc Biol. 31:986–1000.
Sander WJ, O’Neill HG, Pohl CH. 2017. Prostaglandin E2 as a modulator of viral infections. Front Physiol. 8:89.
Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, Pääbo S, Patterson N, Reich D. 2014. The genomic landscape of Neanderthal ancestry in present-day humans. Nature. 507:354–357.
Schlebusch CM, Malmström H, Günther T, Sjödin P, Coutinho A, Edlund H, Munters AR, Vicente M, Steyn M, Soodyall H, et al. 2017. Southern African ancient genomes estimate modern human divergence to 350,000 to 260,000 years ago. Science. 358:652–655.
Sjödin P, McKenna J, Jakobsson M. 2021. Estimating divergence times from DNA sequences. Genetics. 217(4):iyab008.
Skotte L, Korneliussen TS, Albrechtsen A. 2013. Estimating individual admixture proportions from next generation sequencing data. Genetics. 195:693–702.
Souron A. 2016. On specimens of extant warthogs (Phacochoerus) from the Horn of Africa with unusual basicranial morphology: rare variants of Ph. africanus or hybrids between Ph. africanus and Ph. aethiopicus? Suiform Soundings. 15:86–92.
Souron A. 2017. Diet and ecology of extant and fossil wild pigs. In: Melletti MME, editor. Ecology, conservation and management of wild pigs and Peccaries. Cambridge, UK: Cambridge University Press. p. 29–38.
Taylor HC. 1977. Aspects of the ecology of the Cape of Good Hope Nature Reserve in relation to fire and conservation. US Forest Service, Washington Office, General Technical Report. p. 483–487.
Van Rossum G, Drake FL. 2009. Python 3 reference manual. Scotts Valley, CA: CreateSpace.
Vercammen P, Mason DR. 1993. The warthogs (Phacochoerus africanus and P. aethiopicus). In: Oliver WLR, editor. Pigs, peccaries and hippos: status survey and action plan. Gland, Switzerland: IUCN. p. 75–84.
Verhelst J, Hulpiau P, Saelens X. 2013. Mx proteins: Antiviral gatekeepers that restrain the uninvited. Microbiol Mol Biol Rev. 77:551–566.
Wang S, Zhang J, Zhang Y, Yang J, Wang L, Qi Y, Han X, Zhou X, Miao F, Chen T, et al. 2020. Cytokine storm in domestic pigs induced by infection of virulent African Swine Fever Virus. Front Vet Sci. 7:601641.
Waples RK, Albrechtsen A, Moltke I. 2019. Allele frequency-free inference of close familial relationships from genotypes or low-depth sequencing data. Mol Ecol. 28:35–48.
White TD, Harris JM. 1977. Suid evolution and correlation of African hominid localities. Science. 198:13–21.
Wu G-S, Yao Y-G, Qu K-X, Ding Z-L, Li H, Palanichamy MG, Duan Z-Y, Li N, Chen Y-S, Zhang Y-P. 2007. Population phylogenomic analysis of mitochondrial DNA in wild boars and domestic pigs revealed multiple domestication events in East Asia. Genome Biol. 8:R245.
Yang Z, Donoghue PCJ. 2016. Dating species divergences using rocks and clocks. Philos Trans R Soc Lond B Biol Sci. 371:20150126.
Zhang M, Yang Q, Ai H, Huang L. 2022. Revisiting the evolutionary history of pigs via de novo mutation rate estimation in a three-generation pedigree. Genomics Proteomics Bioinf. doi:10.1016/j.gpb.2022.02.001.
Zhou J, Chen J, Zhang X-M, Gao Z-C, Liu C-C, Zhang Y-N, Hou J-X, Li Z-Y, Kan L, Li W-L, et al. 2018. Porcine Mx1 protein inhibits classical swine fever virus replication by targeting nonstructural protein NS5B. J Virol. 92:e02147–17.
Zhu JJ, Ramanathan P, Bishop EA, O’Donnell V, Gladue DP, Borca MV. 2019. Mechanisms of African swine fever virus pathogenesis and immune evasion inferred from gene expression changes in infected swine macrophages. PLoS One. 14:e0223955.
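
The rate conversion described under Mutation Rates and Dating can be illustrated with a short sketch. It is not taken from the authors' scripts: the function names are hypothetical, and the rescaling helper assumes that an inferred date is inversely proportional to the mutation rate used, which is one reading of the statement that dates scale about linearly with the assumed rate. Only the numeric inputs (2.48 × 10⁻⁹ per site per year, 6 years per generation) come from the text.

```python
# Illustrative sketch only; helper names are hypothetical, not from the paper.

def per_generation_rate(per_year_rate: float, generation_time_years: float) -> float:
    """Convert a per-site, per-year mutation rate to a per-generation rate."""
    return per_year_rate * generation_time_years


def rescale_date(date_years: float, assumed_rate: float, new_rate: float) -> float:
    """Rescale an inferred date to a different mutation rate.

    Assumption: the date is inversely proportional to the rate, so halving
    the rate doubles the date. Both rates must be in the same units.
    """
    return date_years * assumed_rate / new_rate


MU_PER_YEAR = 2.48e-9    # mean rate across wild ruminants (Chen et al. 2019)
GENERATION_TIME = 6.0    # years (Pacifici et al. 2013)

mu_per_generation = per_generation_rate(MU_PER_YEAR, GENERATION_TIME)
print(f"{mu_per_generation:.3e}")  # 1.488e-08, quoted as 1.49e-8 in the text
```

Under the same assumption, `rescale_date(d, mu_per_generation, new_rate)` would convert any date `d` reported here to an alternative per-generation rate.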
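
The fd statistic used in the Adaptive Introgression Scan can likewise be sketched from its published definition (Martin et al. 2015): the excess of ABBA over BABA site patterns, divided by the excess expected if P2 were replaced at each site by a donor population carrying the higher derived-allele frequency of P2 and P3. The code below is a minimal sketch under stated assumptions, not the authors' pipeline. The function name is hypothetical, inputs are per-site derived-allele frequencies already polarized by the out-group (so the out-group frequency is taken as 0), and the example frequencies are made up.

```python
import math
from typing import Sequence


def fd_statistic(p1: Sequence[float], p2: Sequence[float], p3: Sequence[float]) -> float:
    """Compute fd for one window from derived-allele frequencies in P1, P2, P3."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c in zip(p1, p2, p3):
        # Observed excess of ABBA over BABA patterns at this site.
        numerator += (1 - a) * b * c - a * (1 - b) * c
        # Donor frequency: the larger of the P2 and P3 frequencies.
        d = max(b, c)
        # Excess expected under complete introgression from the donor.
        denominator += (1 - a) * d * d - a * (1 - d) * d
    if denominator == 0.0:
        return math.nan  # window carries no informative sites
    return numerator / denominator


# Made-up frequencies for four sites in one window.
example = fd_statistic(
    p1=[0.0, 0.0, 0.5, 0.0],
    p2=[1.0, 0.5, 0.5, 0.0],
    p3=[1.0, 1.0, 1.0, 1.0],
)
print(round(example, 3))  # 0.429
```

In a genome scan, this function would be applied separately to the sites falling in each 100 kb window.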