Chapter 2. Dynamic Models of Language Evolution: The Linguistic Perspective
Andrew D. M. Smith
University of Stirling
Abstract
This chapter gives an overview of language variation and how the dynam-
ics of language are explored through formal models. It briefly outlines the
dimensions over which language structures can vary, then looks at some
of the very different ways in which language change has been investi-
gated (sociolinguistics, historical linguistics, evolutionary linguistics). It
describes how dynamic models of change have been successfully used in
all of these fields, and how they have shed light on many aspects of lan-
guage dynamics, from the properties of language change through phylo-
genetic analyses of language history to computational and experimental
models of cultural evolution.
2.1 Introduction
Language is probably the key defining characteristic of humanity, an immensely
powerful tool which provides its users with an infinitely expressive means of
representing their complex thoughts and reflections, and of successfully com-
municating them to others. It is the foundation on which human societies
have been built and the means through which humanity’s unparalleled intel-
lectual and technological achievements have been realised. Although we have
a natural intuitive understanding of what a language is, the specification of
a particular language is nevertheless remarkably difficult, if not impossible,
to pin down precisely. All languages contain many separate yet integral sys-
tems which work interdependently to allow the expression of our thoughts and
the interpretation of others’ expressions: each has, for instance, a set of basic
meaningless sounds (e.g. [e], [l], [s]) which can be combined to make dif-
ferent meaningful words and parts of words (e.g. else, less, sell, -less); these
meaningful units can be combined to make complex words (e.g. spinelessness,
selling), and the words themselves can then be combined in very many complex
ways into phrases, clauses and an infinite number of meaningful sentences; fi-
nally each of these sentences can be interpreted in dramatically different ways,
depending on the contexts in which it is uttered and on who is doing the in-
terpretation. Languages can be analysed at any of these different levels, which
make up many of the sub-fields of linguistics, and the primary job of linguistic
theorists is to identify the rules which best explain these complex
combinations. The hallmarks of human language, which distinguish it from
less powerful communication systems used by animals, are frequently charac-
terised in terms of lists of so-called design features, such as those itemised over
half a century ago by Hockett (1960) and others. Language is unique in its
semanticity, productivity and mode of transmission: it is a system spread through
cultural learning which consists of relatively fixed mappings between form and
meaning, yet can effortlessly accommodate infinite novelty of expression. Lan-
guage gets its massive expressive power from its duality of patterning or double
articulation (Martinet, 1949): linguistic utterances are assembled from small
meaningful units (morphemes) combined according to a set of morphosyn-
tactic rules, while the morphemes themselves are built from a second set of
meaningless sounds (phonemes) using a different set of phonological rules.
Linguistic knowledge is fundamentally symbolic, made up of an inventory of
arbitrary associations between forms and meaning (de Saussure, 1916), which
can be composed into complex signals whose meanings can be derived from the
meanings of their component parts and the way these are combined (Krifka,
2001). Human language is also characterised by recursion, in which forms can
contain embedded components such that the part and the whole share a syntactic
category (Kinsella, 2009); this particular feature has famously been proposed
as the core component of the putative innate faculty of language (Hauser
et al., 2002). These (and perhaps other) design features effectively delimit the
differences between human language on one hand and animal communica-
tion systems on the other; yet they also underplay one of the most significant
features of language, that it is characterised at every level by dynamism, mas-
sive diversity and constant change. Evans and Levinson (2009) point out that
‘[w]e are the only known species whose communication system varies funda-
mentally in both form and content’ (Evans and Levinson, 2009, p.431), and
we are likewise the only known species whose communication system is not
fixed, but on the contrary is constantly changing. Notably, all languages ex-
ist in two separate manifestations: internally, as relatively persistent (though
changing) representations inside human brains, and externally, as ephemeral,
transitory utterances which are produced and received by interlocutors; much
recent work on the evolution of language focuses on how properties of lan-
guage can be seen as adaptive to the need to alternate between these internal
and external manifestations. In the remainder of this chapter I explore some
of the ways in which linguistic diversity is realised, and some of the perspec-
tives we have gained on understanding its dynamic nature: in Section 2.2, I
describe diverse dimensions along which languages can vary; in Section 2.3
I investigate different approaches to language change on different timescales;
in Section 2.4, I explore some of the formal models which have been used to
explore the dynamic nature of language.
2.2 Language diversity
There are around 7000 languages spoken in the world nowadays (Lewis,
2005); the precise tally is, perhaps surprisingly, impossible to calculate ac-
curately, for two important reasons. Firstly, the criteria used to determine
whether any system of linguistic expressions, or linguistic variety, should count
as a different language from another system of linguistic expressions are not
based solely on objective linguistic grounds, but are tied up with political
decisions and aspirations which can draw different conclusions about the ‘lan-
guagehood’ of particular linguistic varieties. Linguistic criteria, for instance,
are most frequently expressed in terms of the mutual intelligibility of vari-
eties: can speakers of variety A understand speakers of variety B (and vice
versa) based on their linguistic knowledge of their own variety? If they can,
then the varieties can reasonably be considered to be dialects of the same lan-
guage; if not, then they can be considered different languages. In the absence
of major physical obstacles which reduce opportunities for people to mix with
each other, of course, adjacent communities frequently speak mutually intel-
ligible varieties, and sometimes such communities can form chains or dialect
continua across very large areas, such as the unbroken chain of mutual intel-
ligibility running from Portugal through Spain and France to the foot of Italy.
In communities in areas around the political borders, however, people often
consider that they speak different languages from their neighbours across the
border (e.g. French or Italian), even though their varieties are linguistically
almost identical. The opposite situation is also true, where political consider-
ations can lead varieties which are not mutually intelligible to be widely con-
sidered dialects of a single language. Due to the historical and cultural unity
of China, for instance, the major varieties of Sinitic languages there (e.g. Man-
darin, Cantonese) are widely considered to be dialects (Goddard, 2005), yet
they are linguistically as distinct from each other as the separate languages of
Italian, French, Catalan, Spanish and Portuguese in Europe (Coulmas, 2013).
Changes in political identities can have linguistic repercussions, too, as can
be seen following the break-up of Yugoslavia in the 1990s, which led to the
demise of the unified language formerly known as Serbo-Croat and the active
building in the new countries of multiple distinct successor languages (Serbian,
Croatian, Bosnian, Montenegrin), so that linguistic identities could better re-
flect the new and developing political situation (Greenberg, 2004).
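The counting problem raised by dialect continua can be made concrete with a small sketch. It assumes a purely hypothetical chain of five varieties (A to E) and the simplifying rule that two varieties are mutually intelligible only when they are adjacent in the chain; neither the names nor the rule come from any real survey. Under these assumptions every neighbouring pair is mutually intelligible, yet the two ends of the chain are not, so mutual intelligibility cannot by itself partition the varieties into a definite number of languages.

```python
# Hypothetical dialect continuum: varieties listed in geographical order.
varieties = ["A", "B", "C", "D", "E"]

def mutually_intelligible(i, j, reach=1):
    """Illustrative assumption: two varieties are mutually intelligible
    when they lie at most `reach` steps apart in the chain."""
    return abs(i - j) <= reach

def is_transitive(n, reach=1):
    """Check whether intelligibility is transitive over n varieties,
    as it would have to be to split them cleanly into languages."""
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if (mutually_intelligible(i, j, reach)
                        and mutually_intelligible(j, k, reach)
                        and not mutually_intelligible(i, k, reach)):
                    return False
    return True

n = len(varieties)
neighbours_ok = all(mutually_intelligible(i, i + 1) for i in range(n - 1))
ends_ok = mutually_intelligible(0, n - 1)
transitive = is_transitive(n)
print(neighbours_ok, ends_ok, transitive)
```

Because the relation is not transitive in this toy chain, any line drawn between "languages" has to come from somewhere other than intelligibility alone, which is where the political considerations described above enter.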
Secondly, despite the heroic efforts of field linguists who devote their ca-
reers to describing, documenting and cataloguing languages across the world,
surprisingly few languages are studied in great detail, and in fact many remain
completely unknown. Those languages we do know are dying out dismayingly
quickly; almost half the languages ever known to us have disappeared in the
last five hundred years (Nettle and Romaine, 2000), and estimates suggest
that perhaps only a tenth of the languages spoken today will still be spoken at
the end of the century (Krauss, 1992). This matters not just for the complete-
ness of linguistic catalogues, but because languages are major repositories of
the cultural knowledge of the communities which use them (Maffi, 2001), and
the death of an undocumented language can mean the irrevocable loss of the
traditional knowledge stored within its vocabulary and grammatical distinctions
(Evans, 2010).
The pervasiveness of language diversity is regularly underestimated, how-
ever, and its significance underplayed. Below I will give a flavour of some of
the countless ways in which languages vary, from their sounds to the way they
assemble words, from their grammatical rules to the meanings they encode.
2.2.1 Sounds
One of the fundamental properties which gives language its expressive power
is the duality of patterning described above (Martinet, 1949). All languages
contain meaningless phonemes which can be used to build meaningful mor-
phemes, but they differ widely in their number and type, in how the space of
possible phonemes is divided up, and in the rules for combining the sounds
together into larger units. I discuss each of these differences briefly below.
In terms of sound inventory size alone, some languages in Southern Africa
differentiate up to 144 different phonemes,1 while Rotokas, spoken in Papua
New Guinea, manages with just 11. Some common sounds, such as the voice-
less stops [p],[t],[k] or the nasals [m],[n], occur in the vast majority of lan-
guages across the world,2 while other sounds have an extremely restricted
geographical range, such as the labiodental flap [ⱱ] which is used only in par-
ticular areas of Central Africa. Whole groups of sounds are subject to similarly
wide variation: while fricatives like [f],[v],[s],[z] form an important part of
the sound inventory of the vast majority of human languages, they are not
found at all in Australian Aboriginal languages such as Dyirbal (Dixon, 1972),
Kayardild (Round, 2009) and Yorta Yorta (Bowe and Morey, 1999). Click
consonants (such as the tut-tut sound used to express disapproval by English
speakers), on the other hand, are found as phonemes only in Khoisan and Bantu
languages in southern and eastern Africa (Clements, 2000).
Footnote 1: In many languages, the precise number of phonemes is a matter for debate, primarily
because it can be unclear whether some complex articulatory combinations should count as
one consonant or as a sequence of consonants (Zsiga, 2013).
Footnote 2: Although not without some striking exceptions such as Arabic, which lacks /p/, and
Tahitian, which lacks /k/.
Even when languages do use the same sounds as each other, their phonolog-
ical status within each language can be different. For all speakers of English,
for instance, the sounds [s] and [ʃ] (the latter pronounced like ‘sh’) count as
separate sounds, because they contrast with each other: changing from one
sound to the other can produce a different word with a different meaning (e.g.
sip-ship; gas-gash). Japanese speakers also use these sounds, but they
do not contrast in Japanese: instead they are predictable variants
of the same sound, which is usually pronounced as [s], but is systematically
pronounced as [ʃ] whenever it occurs before the vowel /i/, as for instance
in the word sushi (Goddard, 2005). These contrastive distinctions are vitally
important to understanding and speaking a language, yet they are essentially
arbitrary and can be extremely difficult to discern for non-native speakers: in
Polish, for instance, there are two contrasting sounds [ɕ] and [ʂ] (e.g. [koɕ]
‘mow’ - [koʂ] ‘basket’), which both sound extremely similar (roughly like ‘sh’)
to English speakers who do not distinguish them.
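The Japanese pattern described above, in which [s] and [ʃ] are predictable variants of one sound, can be sketched as a single context-sensitive rule. The function below is only an illustration: the name realise_s and the representation of a word as a list of phoneme symbols are assumptions made here, and the rule deliberately ignores everything else about Japanese phonology.

```python
def realise_s(phonemes):
    """Map underlying phonemes to surface sounds:
    /s/ is realised as [ʃ] before /i/, and as [s] elsewhere."""
    surface = []
    for idx, p in enumerate(phonemes):
        following = phonemes[idx + 1] if idx + 1 < len(phonemes) else None
        if p == "s" and following == "i":
            surface.append("ʃ")
        else:
            surface.append(p)
    return surface

# Underlying /susi/ surfaces with [s] before /u/ but [ʃ] before /i/.
sushi = realise_s(["s", "u", "s", "i"])
print(sushi)
```

Since the choice between the two sounds is fully determined by the following vowel, it can never distinguish two words, which is what it means for the sounds not to contrast. No comparable rule could be written for English, where sip and ship differ in exactly this sound in the same context.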
All languages make use of changes in pitch, for instance to mark the bound-
aries of syntactic units, but again their status can vary quite remarkably: in
English, we can raise the pitch at the end of a sentence to make a question out
of a statement, but in languages like Mandarin, every word has a distinctive
pitch associated with it, as can be seen in Table 2.1, which shows four different
pronunciations of the sequence [ma], each with its own specific meaning. Lan-
guages like Mandarin in which pitch variations are used to distinguish different
words are called tone languages, yet these also differ in the number of distinct
tones they make use of, from simple tone languages like Shona which differ-
entiate only two, to more complex ones which make many more distinctions
(Ladefoged and Johnson, 2011).
(Butskhrikidze, 2002). Although all spoken languages have sound systems,
therefore, the ways in which these are organised can be extremely variable, and
the phonological distinctions which languages make appear to be arbitrarily
fine (Pierrehumbert et al., 2000).
2.2.2 Words
Linguistic diversity is also particularly noticeable in morphology, the study of
the minimally meaningful parts, or morphemes, from which words are made.
Languages differ in terms of how much information each word contains, how
their morphemes are combined into words, and how clearly defined the mor-
phemes are within a word, each of which will be explored briefly in turn.
In an isolating language like Thai, illustrated in (2.1), almost every word
contains just a single morpheme, while in a synthetic language like Swahili
(2.2), words are considerably more complex, frequently containing multiple
morphemes packaged together inside a single word. Most languages in fact lie
somewhere between these extremes, and are characterised by a mixture of the
two types of words. English, for instance, contains not only many monomor-
phemic words like cat, build or good, but also a considerable number of more
complex multimorphemic words such as girl-ish-ly or over-pay-ment.
(2.2) Swahili
tu-li-mw-on-a
we-past-him-see-ind
‘we saw him’
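The hyphenated gloss in (2.2) pairs each morpheme with its meaning one for one, which makes it easy to count morphemes per word. The sketch below assumes only that form and gloss use the same hyphen separator; the function name align_gloss is hypothetical.

```python
def align_gloss(form, gloss, sep="-"):
    """Pair each morpheme of a segmented word with its gloss label."""
    morphs = form.split(sep)
    labels = gloss.split(sep)
    if len(morphs) != len(labels):
        raise ValueError("form and gloss have different numbers of morphemes")
    return list(zip(morphs, labels))

pairs = align_gloss("tu-li-mw-on-a", "we-past-him-see-ind")
for morph, label in pairs:
    print(f"{morph:>3}  {label}")
print(len(pairs), "morphemes in one word")
```

Applied across a text, the average number of morphemes per word gives a rough index of where a language sits between the isolating and synthetic extremes.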
We can also classify morphemes into roots, which cannot be further broken
down, and which normally contain the main content of the word’s meaning,
and affixes which are attached to the root and modify its meaning in some
way. The English word unhappiness, for instance, contains the root happy and
two affixes un- and -ness which add additional semantic content to the core
meaning.3 The order of roots and affixes is extremely variable, although
a detailed analysis of the inflectional morphology of almost 1000 different lan-
guages (Dryer, 2013b) shows that while languages vary across the spectrum
from those which make exclusive use of suffixes (e.g. Central Yup’ik) to those
which make almost exclusive use of prefixes (e.g. Kihunde), there is a clear
cross-linguistic inclination towards suffixing.4 This has been attributed to pro-
cessing constraints: although both kinds of affix arise originally from distinct
words which have fused with adjoining words, Hall (1988) shows that suf-
fixes are likely to be preferentially understood by hearers because they allow
easier and quicker identification of the content words. In addition to the use
of prefixes and suffixes, however, some languages also show even more com-
plex morphological patterns, where morphemes are more tightly intermeshed
with each other: affixes can be placed either inside roots (e.g. Tagalog bili
‘buy’, b-um-ili ‘bought’; sulat ‘write’, s-in-ulat ‘was written’), surrounding them
(e.g. German kauf-en ‘buy’, ge-kauf-t ‘bought’), or intermeshed in more complex
ways. Semitic languages like Arabic and Hebrew, for example, use template
morphology based on roots, typically made up of three consonants, into which
vowels and other consonants are inserted in different ways to derive specific
meanings. The Arabic root ktb ‘write’, for instance, yields both specific forms
of the verb such as katab ‘he wrote’ and biyiktib ‘he is writing’ and separate
words such as kitaab ‘book’, kaatib ‘writer’ and maktaba ‘library’, along with
many others.
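Template morphology of this kind can be sketched as slotting the consonants of a root into a pattern. In the illustration below the capital letter C marks a consonant slot and everything else is copied literally; this notation and the template strings are assumptions chosen to reproduce the forms cited above, not a standard analysis of Arabic.

```python
def apply_template(root, template, slot="C"):
    """Insert the consonants of `root`, in order, into the slots of `template`."""
    if template.count(slot) != len(root):
        raise ValueError("template slots and root consonants do not match")
    consonants = iter(root)
    return "".join(next(consonants) if ch == slot else ch for ch in template)

# Hypothetical template shapes for the forms of the root ktb given in the text.
templates = {
    "CaCaC": "he wrote",
    "biyiCCiC": "he is writing",
    "CiCaaC": "book",
    "CaaCiC": "writer",
    "maCCaCa": "library",
}

forms = {apply_template("ktb", t): meaning for t, meaning in templates.items()}
for form, meaning in forms.items():
    print(form, "-", meaning)
```

The same templates could in principle be applied to other three-consonant roots, although whether a given combination is an actual word of the language is a lexical fact that this sketch does not model.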
The third dimension along which we find considerable morphological di-
versity is the level of fusion within the forms, or how clearly distinguishable the
individual morphemes are. In agglutinating languages, the boundaries between
the morphemes are sharply defined, as can be seen in the Swahili example (2.2)
above and in the Hungarian example (2.3), where a single word barátságosabban
is built up from a series of clearly visible suffixes added in sequence to the
root barát ‘friend’. By contrast, in fusional forms such as the Latin adjective
bon-us ‘good’, the shape of the individual morphemes is not clear; the gram-
matical information specifying the gender (masculine), number (singular) and
case (nominative) of the adjective is all conflated into a single suffix -us which
cannot be separated into distinct morphemes.
(2.3) Hungarian
barát-ság-os-abb-an
friend-ship-adj-comp-ind
‘in a more friendly manner’
2.2.3 Grammar
On the grammatical level, too, individual languages show an enormous range
of diversity, from the order in which words are assembled into sentences to
the syntactic categories and patterns of agreement which are obligatorily ex-
pressed in a language, only a few of which can be mentioned here.
Word order is the most famous and widely cited typological feature of lan-
guages, and indeed the development of the whole field of linguistic typology
can reasonably be traced back to Greenberg’s famous paper on the basic word
order in declarative sentences across languages (Greenberg, 1963). The three
main elements of a sentence — subject (S), object (O) and verb (V) — are
found in all six logically possible orders in different languages, although there
is a notable preponderance of the two orders SOV and SVO, which together
make up 89% of the languages which have a dominant word order in Dryer’s
(2013a) cross-linguistic survey of almost 1400 different languages.5 In English,
of course, word order is used to indicate which role (e.g. the doer of
the activity, the person to whom the activity is done) each noun phrase
is playing in the sentence and is thus relatively fixed. Many other languages,
however, use a very different grammatical means, known as case marking, in
which morphological affixes or function words accompany the noun phrases
and/or the verb to mark these roles. In (2.4) we can see that each noun phrase
in Japanese must be followed by a special functional postposition (here ga, ni,
o) which provides crucial information about the relationship between the noun
phrase and the verb ageta: the nominative marker ga follows the subject, the
accusative marker o follows the direct object, and the dative marker ni follows
the indirect object or beneficiary. In Swahili (2.5), by contrast, agreement prefixes
(here a-, -ki-) are added to different parts of the verb complex, to specify
and classify both the subject and object of the verb.6
(2.4) Japanese
‘the teacher gave the book to the student.’
(2.5) Swahili
(2.6) Swahili
mtu a-li-lal-a
person subj.1-past-sleep-ind
‘the person slept.’
To complicate things further, many languages use both the Swahili-style
nominative-accusative system and the Basque-style ergative-absolutive system
to mark their arguments, and they even differ in the circumstances when each
system is used. In languages like Lakhota and Guaraní, for instance, the mark-
ing of the transitive subject A may depend on a number of different semantic
features such as the degree of voluntary control the subject is considered to
have over the activity being performed, or whether the sentence is considered
to be an event or a state of affairs (Mithun, 1991). In the Australian language
Dyirbal, the marking is determined by the semantic referent of the subject, with
participants in the speech act (I and you) using the nominative-accusative system,
but other participants (third person pronouns, nouns) using the ergative-absolutive
system. In Georgian, however, the determining factor is the tense
of the sentence: the ergative-absolutive is used in the past but the nominative-
accusative is used in the present (Song, 2001).
2.2.4 Meaning
Linguists have also begun to document the enormous diversity of ways in which
meaning is conceptualised and encoded in languages. Space prohibits a full
exploration of this vast area, but to give some idea of the scale of cross-linguistic
semantic variation, I will concentrate on a fundamental and relatively well-
defined part of meaning structure, the conceptualising of spatial relationships
between objects.
In a major work exploring how spatial distinctions are expressed in around
a dozen languages, Levinson and Wilkins (2006) found surprisingly profound
diversity in the conceptualisation and linguistic encoding of topological rela-
tions, motion, and frames of reference, yet also that this diversity appears nev-
ertheless to be constrained by underlying abstract dimensions of apparently
universal relevance. Languages like English encode basic locative construc-
tions, or responses to a question like ‘Where is the X?’, using a noun phrase
to represent the figure, or focus of the scene, a form of the verb to be, and a
prepositional phrase to represent the (back)ground of the scene, as shown in
(2.9). Other languages, however, structure their basic locative constructions
in very different ways, using case marking or spatial nouns instead of preposi-
tions,7 using multiple different locative verbs (e.g. equivalents of ‘stand’, ‘sit’,
‘lie’, ‘hang’) depending on the shape or function of the figure, even using no
verbs at all, or using highly specific dispositional predicates which precisely
orientate the figure and ground (Brown, 2006).
Footnote 7: Or rather adpositions, a cover term encompassing both prepositions, which occur before
their arguments as in English, and postpositions, which occur after their arguments in languages
like Japanese.
(2.9) the apple   is   in the bowl
      NP          BE   PP
      figure           ground
      ‘the apple is in the bowl.’
After a detailed investigation of the kinds of scenes which can be described
using a language’s basic locative construction (BLC), Levinson and Wilkins (2006)
propose a topological space hierarchy in which all languages in their sample
encode core scenes like ‘cup on table’ and ‘ball under chair’ using their BLC,
but only some encode adhesion (‘stamp on letter’) in this way, fewer encode
scenes in which the ground or the figure is pierced (‘arrow in apple’, ‘apple
on skewer’) and fewer still encode scenes in which the ground is a human or
other animate being (‘ring on finger’). Looking at adpositions alone, the same
eight scenes are encoded using between zero and seven different lexical items
in the sample languages, with a wide range of distinctions being made (e.g.
both Dutch and Yélî Dnye require seven adpositions, but while Dutch encodes
different types of surface contact (‘stamp on letter’, ‘cup on table’) identically,
Yélî Dnye conflates adhesion (‘stamp on letter’, ‘ring on finger’) under the same
adposition). They conclude that abstract concepts like ‘contact’ and ‘horizontal
support’ are more likely candidates for universality than are concepts like
in and on, contrary to conclusions which had been drawn after perusal of
European languages alone (Levinson and Wilkins, 2006, pp. 519–520).
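The topological space hierarchy described above can be read as an implicational scale: a language that uses its BLC for a scene type lower on the scale would be expected to use it for every type higher up. The sketch below assumes that strict reading, which is a simplification adopted here for illustration, and uses hypothetical labels for the four scene types; the two example languages are invented.

```python
# Scene types ordered from most to least commonly encoded with the BLC.
# The labels are hypothetical shorthand for the scene types in the text.
HIERARCHY = ["core", "adhesion", "pierced", "animate_ground"]

def consistent_with_hierarchy(blc_scenes):
    """True if the scene types covered by a language's BLC form an
    unbroken stretch from the top of HIERARCHY downwards."""
    seen_gap = False
    for scene in HIERARCHY:
        if scene not in blc_scenes:
            seen_gap = True
        elif seen_gap:
            # A lower scene type is covered although a higher one is not.
            return False
    return True

language_x = {"core", "adhesion"}   # invented example, fits the scale
language_y = {"core", "pierced"}    # invented example, skips a level
print(consistent_with_hierarchy(language_x))
print(consistent_with_hierarchy(language_y))
```

A check of this kind only tests whether a pattern of coverage is compatible with the scale; it says nothing about which adpositions a language uses within the scenes its BLC does cover.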
Famously, Talmy (2000) presented evidence of a major split between ways
in which simple motion events are linguistically encoded, which can be seen
even in relatively closely related languages such as English and French. In
English (2.10), motion events are commonly expressed using verbs which de-
scribe the manner in which the motion takes place (e.g. slither, crawl, slide)
and prepositional phrases describing its direction or path, while in French
(2.11) the same concepts are expressed using verbs specifying the direction
of the movement (e.g. enter, climb, descend) and an optional participle de-
scribing its manner (Israel, 2014).
(2.10) the children   jumped   down the stairs
                      manner   direction
‘the schoolchildren jumped down the stairs’
concepts we want to refer to (e.g. selfie, vaping) to more obscure and unin-
tentional cases of reanalysis, where the internal structure of a construction is
understood in a different way from that which was originally intended (e.g.
the original structure of the word hamburg-er was reanalysed as ham-burger,
leading to novel coinages like cheese-burger) (Trask, 1996). Some changes in
pronunciation can be understood straightforwardly in terms of a general ten-
dency to reduce the effort required for articulation, by reducing the amount
of movement required within the mouth (e.g. the pronunciation of words like
better with a glottal stop instead of [t]), or by pronouncing sounds which oc-
cur close together more like one another (e.g. the pronunciation of words like
input as im-put, with the nasal being articulated in the same part of the mouth
as the following consonant).
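The second kind of change, in which a nasal takes on the place of articulation of the following consonant, can be sketched as a simple rewrite rule. The example below works on spelling as a rough stand-in for pronunciation and covers only the case mentioned above, where n becomes m before a sound made with the lips; the set of triggering consonants is an assumption made for illustration.

```python
# Consonants articulated with both lips (illustrative, spelling-based).
BILABIALS = {"p", "b", "m"}

def assimilate_nasal(word):
    """Replace n with m whenever the next segment is a bilabial consonant."""
    segments = list(word)
    for i in range(len(segments) - 1):
        if segments[i] == "n" and segments[i + 1] in BILABIALS:
            segments[i] = "m"
    return "".join(segments)

print(assimilate_nasal("input"))
print(assimilate_nasal("intake"))
```

The rule leaves intake untouched because t is not a bilabial, which is the sense in which the change is conditioned by neighbouring sounds rather than applying everywhere.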
The terms language change and language evolution are both wide-ranging
and frequently overlapping in their denotation: sometimes synonymous, yet
also often set in opposition to each other, both are frequently used to refer
to wildly different notions on diverse timescales, from specific changes in the
way a particular word is used at a particular point in time, to the biological
evolution of particular cognitive capacities which have allowed us to learn and
use language. In the following sections I will distinguish in turn three different
aspects of language change which models have successfully been used to ex-
plore: the ubiquity of linguistic variation, the ultimate source of the constant
state of flux in which language exists and an inevitable consequence of the way
human communication works; the identification of historical relationships be-
tween languages and their classification into metaphorical family trees; and
interactions between biological and cultural evolution from which language
initially emerged.
2.3.1 Variation
People speak differently from the way in which their parents speak, and dif-
ferently again from the way in which their children speak, even while they
consider themselves (and are considered by others) to be speakers of the same
language. Even within the same generation, variation in people’s pronunciation
of certain sounds or in their usage of particular words and constructions
gives other people crucial information about their identity: where they come
from, their socio-economic status and ambitions, the groups of people they
spend time with, and much more.
The systematic study of linguistic variation was originally spearheaded by
sociolinguists such as Labov (1972) and Trudgill (1972), who introduced the
idea of a formal linguistic variable which can be realised as multiple different
variants in different contexts. Labov’s (1972) pioneering work was focussed
on the isolated island of Martha’s Vineyard in Massachusetts, where islanders
often used particularly idiosyncratic pronunciations of certain diphthongs. He
found correlations between the extent to which these pronunciations were used
and social variables such as age, occupation and geography, and showed that
the idiosyncratic pronunciations were being used as a linguistic marker of the
speaker’s identity, in particular their commitment to remaining
on the island, in the context of a collapsing fishing-based local economy
and improvements in travel which had led both to large growth in the num-
ber of casual visitors to the island and to increasing opportunities for islanders
to move to the mainland for work. At the same time, Trudgill (1972) was
one of the first not only to quantify the striking connections between social
class and the use of prestige variants, but also to show how the speech context
itself plays an important role, demonstrating that all speakers use
fewer overtly prestigious forms as the context becomes increasingly informal.
He also showed systematic differences between men’s and women’s speech
which have been echoed in many subsequent studies: men consistently use
more vernacular forms and women use more standard forms. Many explana-
tions for these differences have been suggested, from claims that women are
more aware of the social consequences of the way they speak to suggestions
that the increased use of vernacular forms by men (particularly working-class
men) is an expression of their masculinity (Holmes, 2013). Frequently, how-
ever, variation is better explained in terms of social network properties, with
denser social networks where everyone knows each other tending to inhibit lin-
guistic change and looser social networks being more receptive to innovation.
As Milroy (1980) put it in her famous study of working-class communities in
Belfast, ‘the closer an individual’s network ties are with his local community,
the closer his language approximates to localized vernacular norms’ (Milroy,
1980, p.175). Interestingly, Milroy also found more complex interactions between
gender and social networks, showing that as young women rather
than men came to be chief breadwinners in the face of recession and the loss
of traditional heavy industry which had been strongly male-dominated, their
social networks became closer than in earlier generations, and their language
reflected what had previously been considered markers of masculine speech.
The inevitability of variation in language stems from the fact that language
is fundamentally a social interactional phenomenon (Croft, 2000) between
people who belong to different, though potentially overlapping, social com-
munities and have different individual experiences. Social communities are
internally cohesive and externally distinctive collections of people who have
come together because they have shared practices, beliefs or knowledge which
are lacked by non-members of the community (Trousdale, 2010). We each be-
long to multiple interactive communities and sub-communities defined by our
work, family, friends and recreational activities, in which we negotiate and
express facets of our identity. Bolinger (1975) highlights that ‘there is no limit
to the number of ways in which human beings league themselves together for
self-identification, security, gain, amusement, worship, or any of the other pur-
poses that are held in common; consequently there is no limit to the number
and variety of speech communities that are to be found in society’ (Bolinger,
1975, p.333). Our decisions about language use are a crucial part of this iden-
tity creation and maintenance, both within and outwith these communities.
Moore (2004), for instance, shows how particular aspects of grammatical vari-
ation (e.g. using non-standard ‘I were’, ‘he were’) became used increasingly
frequently by members of an emerging group of ‘townie’ girls in a school in
north-west England, as they sought to signal their group identity as different
from other established groups of girls at the school. In many multilingual so-
cieties, indeed, there are clear demarcation lines between different domains
in which different languages are used, often with an official, prestigious lan-
guage being used in more formal and religious contexts, and vernacular lan-
guages being used in informal contexts with friends, in the market and at home
(Holmes, 2013). In truth, though, monolingual societies also display similar
kinds of linguistic division between domains of use, albeit that they are more
subtle because they are expressed through the use of different varieties of the
same language rather than through different languages. The individuality and
variability of our complex social networks and of our experiences within them
is reflected in the individuality and variability of our linguistic repertoires,
both in the ephemeral external linguistic behaviour we produce, and in the
longer-lasting internal linguistic representations stored within our brains.
The process of linguistic communication itself is also inherently variable.
Communication has often been thought of as a simple deterministic computa-
tional process in which the speaker’s thoughts are encoded into an utterance,
conveyed to the hearer, and then decoded back into meanings; assuming that
both parties have the same encoding/decoding algorithms, then communica-
tion is successful (Shannon, 1948). Much work in the philosophy of language
and in what is now known as pragmatics (Grice, 1975), however, showed that
such a model cannot account for the nuances and detail of real-life communi-
cation, where the same utterance can be interpreted in radically different ways
depending on the context in which it occurs (and, indeed, where different utter-
ances can be interpreted identically). Communication is therefore not a simple
encoding-decoding process, nor, even if we relax the assumption that interlocutors
have the same algorithms, a process of ‘reverse engineering’ through which
the hearer reassembles the speaker’s meaning (Mufwene, 2002; Brighton et al.,
2005), but rather it is a system based on the complementary and co-operative
processes of ostension and inference (Sperber and Wilson, 1995; Scott-Phillips,
2014; Smith and Höfler, 2015), which relies crucially on the interlocutors be-
ing able to recognise and exploit their mutual common ground (Clark, 1996).
Common ground is both wide-ranging and multi-faceted, including not only
the interlocutors’ fundamental shared recognition of each other as a potential
conversation partner, but also a shared understanding of each other’s inten-
tions and of the joint goal of their conversation (Tomasello et al., 2005). It
encompasses an understanding of relevant material from the communicative
context, as well as shared attitudes, shared beliefs and shared conventions.
These conventions, crucially for linguistic communication, include knowledge
of the conventional (and of course arbitrary) meanings of particular words and
constructions. Much common ground derives, of course, from the shared com-
munities to which people belong, the distinctive behaviours and specialised
vocabulary they share, but indeed every pair of interlocutors also has their
own interpersonal common ground derived from their previous shared experi-
ences together, and from the things one person knows their interlocutor knows
about.
Common ground allows people to communicate successfully by providing
a backdrop against which both ostension and inference can take place. The
speaker8 can act out an appropriate ostensive act, one whose deliberate and
atypical nature both marks it as being intended as communicative and
encourages the hearer to interpret it in an opportune way. Having identified
the signal as communicative, by its ostensive nature, the hearer then uses the
evidence contained within the speaker’s signal and the broader communica-
tive context to inferentially construct a relevant meaning which makes sense
of the signal. This process of meaning construction is not deterministic at all;
rather it is inexact, ambiguous and underspecified, based on creative processes
of inference and conceptual blending (Fauconnier and Turner, 2002; Höfler
and Smith, 2009) which are inherently unstable and equivocal. Moreover, the
individual nature of our cognitive representations means that, no matter how
much common ground interlocutors share, there will inevitably be some (how-
ever minor) variation between their understandings of the same communica-
tive episode and therefore between their internal linguistic representations;
it is this variation which leads inexorably to the ubiquitous variation which
pervades language (Smith and Höfler, 2015).
8 Communication is not only spoken, of course, but I use the terms speaker and hearer in
a general sense to indicate the person performing the communicative act and the person to
whom the act is addressed.
2.3.2 History
The fundamental goals of historical linguistics are to identify historical rela-
tionships between languages, particularly in cases where there is no corrobo-
rating written evidence, to describe the histories of languages and groups of
languages, and, ultimately, to develop a comprehensive theory of language
change (Harrison, 2005). When a language is spread over a reasonably large
geographical area, the changes it undergoes will inevitably be different from
area to area, yielding different dialects of the language; eventually (particu-
larly when travel between dialect groups is difficult, as in pre-modern times),
these dialects tend to separate into different, mutually unintelligible
languages. Languages which were once dialects of the same language (e.g.
French and Spanish were originally dialects of Latin), but have diverged in
this way, are said to belong to the same language family. This genealogical
analogy is very widespread in historical linguistic terminology, so languages
in the same family are said to be ‘genetically’ related to each other and to their
‘common ancestor’ language, and language families and higher groupings such
as language ‘phyla’ are conventionally illustrated by linguistic ‘family trees’,
but it is important to note that this terminology merely reflects an extended
metaphor, and does not imply any true biological connection (although see
subsection 2.4.2 and subsection 2.4.3 for further discussion of work seeking to
draw connections between genetic and linguistic features).
Such ‘genetic’ relationships among languages can be
established when a particular feature the languages share is unlikely to have
arisen independently or been borrowed between them, and can therefore best
be explained in terms of the feature’s ‘inheritance’ from a common ancestor
language. In order to reduce the possibility of independent identical innovation
in the languages under investigation, both natural resemblances (e.g.
onomatopoeia like ‘cuckoo’) and chance resemblances (e.g. in the unrelated
languages Hawai’ian and Greek, the word for honey is meli (Trask, 1996)) must
be excluded from the analysis. This is primarily done by finding frequent and
systematic correspondences between lexical items in the languages, through the
painstaking technique of the comparative method, the ‘gold standard’ (McMa-
hon and McMahon, 2005) of historical analysis. Table 2.2, for example, shows
a small part of a system of regular correspondences in the initial consonants of
words in English, Dutch and German: in every row of the table (and in many
more examples not shown), the words have the same meaning, and there is a
systematic relationship in their form where initial [p] in English corresponds
to [p] in Dutch and to [pf] in German.
[Table 2.2 approximately here]
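The search for systematic correspondences can be illustrated with a minimal sketch. The word triples below are illustrative orthographic forms chosen for this example, not the rows of Table 2.2, and the helper names (initial_segment, correspondence_counts) are hypothetical. The sketch simply tallies which combination of initial consonants recurs across sets of words with the same meaning; a real application of the comparative method works on phonetic forms and requires expert judgement about which items to include.

```python
from collections import Counter

# Illustrative (English, Dutch, German) word triples with matching meanings.
# These are assumptions for the sketch, not the data of Table 2.2.
WORD_TRIPLES = [
    ("path", "pad", "Pfad"),
    ("pepper", "peper", "Pfeffer"),
    ("pound", "pond", "Pfund"),
    ("plant", "plant", "Pflanze"),
    ("pipe", "pijp", "Pfeife"),
]


def initial_segment(word, digraphs=("pf",)):
    """Return the initial consonant, treating listed digraphs as one unit."""
    lowered = word.lower()
    for digraph in digraphs:
        if lowered.startswith(digraph):
            return digraph
    return lowered[0]


def correspondence_counts(rows):
    """Count how often each combination of initial segments recurs."""
    return Counter(tuple(initial_segment(w) for w in row) for row in rows)


counts = correspondence_counts(WORD_TRIPLES)
pattern, support = counts.most_common(1)[0]
print(pattern, support)
```

A pattern that recurs across many unrelated meanings is much harder to attribute to chance than a single resemblance, which is why the method asks for frequent and systematic correspondences rather than isolated look-alikes.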
Secondly, the Neogrammarian axiom of the comparative method is that
sound change is assumed to be regular in each daughter language, and affects
all items with the relevant sound in the relevant context (e.g. the initial *p in
Table 2.2) simultaneously and with the same result. In fact, due to the ubiquity
of variation in language that we have already seen in subsection 2.3.1, and par-
ticularly the discovery that sound changes actually proceed through a language
a few words at a time, through a process of lexical diffusion (Chen and Wang,
1975), we know that this axiom is not warranted, at least in its strongest form.
Over the long timescales usually under consideration by historical linguists,
however, the result of a sound change progressing through lexical diffusion is
‘virtually indistinguishable’ (McMahon and McMahon, 2005) from a change
occurring universally and simultaneously, and the axiom can thus be recast
as an approximation which holds sufficiently for the comparative method to
remain valid.
Thirdly, and more problematically, however, the construction of language
family trees must necessarily exclude contact-induced changes, or borrowings,
which can cause spurious correspondences between unrelated languages and
potentially undermine the comparative method (McMahon and McMahon,
2003). Borrowings need to be excluded from consideration before comparisons
are undertaken, but it is unfortunately not always a trivial task to identify them:
whenever linguistically diverse cultures come into contact with each other, a
degree of multilingualism always ensues, which in turn facilitates the prop-
agation of linguistic features from one group to the other. Contact-induced
change is usually motivated by social factors like prestige, power and trade
relationships described in subsection 2.3.1, which can have wide-ranging ef-
fects on different linguistic systems. Although lexical items are usually thought
to be the area most susceptible to borrowing (Sankoff, 2004), phonological,
morphological and grammatical structures can also be borrowed in the right
circumstances (Heine and Kuteva, 2005, 2006). The most common approach
to weeding out borrowings from an analysis is to focus on basic and (near-)
universal vocabulary items (e.g. kinship terms like mother, father, body parts
like head, eye, and universal natural phenomena like river, sun) which are puta-
tively ‘less subject to replacement than other kinds of vocabulary’ (Campbell,
2004, p.178), and these are the basis for many of the phylogenetic models
discussed in subsection 2.4.2.
Finally, the evidence on which the comparative method depends is in-
evitably eroded by the same language change over time that it seeks to exploit
to find the historical relationships. The more time that has passed since an an-
cestor language was spoken, the less evidence of shared ancestry will remain in
its descendants, and thus the more problematic that ancestry will be to prove.
Importantly, the rate of linguistic change is variable to such an extent that it
is all but impossible to date linguistic changes without independent physical
evidence such as written inscriptions of some sort (McMahon and McMahon,
2000).
These issues clearly show that the comparative method is certainly not a
panacea for historical linguists, but it nevertheless remains an extremely use-
ful tool which can shed considerable light on relationships between languages,
as long as careful consideration is given to the data to which it is applied,
and every effort is made to exclude inappropriate data to minimise the po-
tential problems. Unfortunately, the necessary decisions on such data exclu-
sions often require the prospective analyst to have a very considerable level of
knowledge of the languages to be compared, and this high barrier to the valid
use of the comparative method has led to the development of other super-
ficially attractive but fundamentally flawed classificatory techniques such as
mass comparison (often now called multilateral comparison) (Greenberg, 1987;
Ruhlen, 1991, 1994). This highly controversial method has
been severely criticised (Campbell, 1988; McMahon and McMahon, 1995) due
to its methodology, which deliberately short-circuits the painstaking rigour of
the comparative method in favour of collecting basic vocabulary items from
a massive range of different languages, tabulating them and simultaneously
comparing them to ‘automatically’ identify language groupings by noting the
patterns which will inevitably be found. The twin requirements of frequent
systematic correspondences to control against accidental similarity and of the
comparative reconstruction of ancestral forms through regular and plausible
sound change are both cast aside in favour of undefined criteria for determin-
ing matches in both form and meaning, which serve to render the technique
of mass comparison devoid of scientific validity and its conclusions indistin-
guishable from chance (Campbell, 2004). As Aitchison (1996, p.172) puts
it: ‘[c]hance resemblances are easy to find among different languages if only
vague likenesses among shortish words are selected.’ Most breathtakingly, in
response to complaints that much of the primary data used in the initial appli-
cations of mass comparison was strewn with errors, it has even been claimed
with hopeless optimism that inaccurate or incomplete data supposedly has
‘merely a randomising effect’ (Greenberg, 1987, p.29) on mass comparison
and does not impact on its reliability.
Mass comparison’s extremely liberal acceptance of any evidence in favour
of languages being related, no matter how vague or tenuous, makes it a method-
ology which is ‘very good at finding patterns, but no good at all at telling
us whether those patterns mean anything’ (McMahon and McMahon, 2005,
p.22). The key issue is that mass comparison conflates the process of hypoth-
esis generation with that of hypothesis testing: although historical linguists
have indeed always begun with observing resemblances between languages
and wondering whether they might be related, this is far from assuming that an
occasional resemblance actually proves that a historical relationship existed.
Ostensibly comprehensive hierarchical linguistic classifications have been pro-
duced through the application of mass comparison (Ruhlen, 1994), though
they contain extremely contentious language families such as Amerind (‘[this]
classification and its attendant methods must be rejected’ (Campbell, 1988,
p.610)) and Indo-Pacific (‘This idea lacks any substance’ (Dixon, 1997, p.35)),
and even more controversial groupings of language families into macrofami-
lies such as Eurasiatic, finally culminating in the wildly speculative and un-
founded ‘last common ancestor’ of all existing languages, Proto-World, angrily
condemned as ‘at best a hopeless waste of time’ (Campbell and Poser, 2008,
p.393).
Although mass comparison has effectively no linguistic validity, it is never-
theless not without appeal outside the field of historical linguistics, precisely
because of the illusory linguistic classifications it can produce in areas where
historical linguists acknowledge that the data is no longer robust enough for
the comparative method to work. One of the most famous uses such classifica-
tions have found is in the comparison of genetic and linguistic trees (Cavalli-
Sforza et al., 1988) discussed in subsection 2.4.2, which sought to reconstruct
human phylogeny by finding correlations between trees built from measures of
genetic diversity between human populations and pseudo-historical linguistic
trees based on mass comparison (Greenberg, 1987).
2.3.3 Evolution
The primary aim of evolutionary linguistics, on the other hand, is to explain
the transition to language, how our unique communication system emerged
from a non-linguistic system. Historically, the predominant view of language
has been as an autonomous module within the brain, a language organ con-
taining the universal structures which form the basis for all human languages
(Chomsky, 1965). This view has been traditionally underpinned by the argu-
ment from the ‘poverty of the stimulus’, the claim that the linguistic evidence
to which children are exposed is not sufficient to acquire the grammar they end
up with (Chomsky, 1980) (although see Pullum and Scholz (2002) for a de-
tailed empirical assessment which found no convincing support for the claim).
An evolutionary account of this nativist view of language was developed by
Pinker and Bloom (1990), based on the assumption that the language organ
evolved biologically through natural selection for language learning, and making
use of the Baldwin effect, through which traits which are originally acquired
through learning can become encoded genetically, when learning changes the
evolutionary environment so that genes which encode the learnt behaviour
explicitly are selected for (Deacon, 1997). Although the Baldwin effect has
been demonstrated in some evolutionary models where genotype and pheno-
type are directly connected (Turkel, 2002), it is not found in more biologically
plausible scenarios (Yamauchi, 2001), and the encoding of specific linguistic
parameters as envisaged by Pinker and Bloom (1990) can only occur when the
linguistic target is fixed (Munroe and Cangelosi, 2002). When language is
changing even very slowly, selection actually works in favour of neutral genes
rather than specialised genes which encode linguistic principles directly, ef-
fectively because biological evolution is not able to keep pace with a moving
target (Chater and Christiansen, 2009; Chater et al., 2009).
As a result of this, attention in evolutionary linguistics has largely shifted
to the cultural evolution of language as an explanatory mechanism, turning the
problem on its head to suggest that the languages themselves have evolved in
order to become more learnable by human brains, and thus persist over time
among human populations, rather than our brains evolving to learn language
(Christiansen and Chater, 2008). According to cultural accounts, the require-
ment for language to oscillate continually between its internal and external
manifestation as it is learnt and used leads to the emergence of languages
which have adapted to be learnable by humans (Kirby, 2000; Zuidema, 2003;
Smith, 2008). The burden of evolutionary explanation thus shifts to identify-
ing the minimal set of cognitive capacities which can support the sharing of
inferable cultural conventions (Smith and Kirby, 2008); these capacities need
not be specific to language at all, but could have evolved biologically for some
other purpose; they may be universal pressures such as memory or processing
constraints but they may only apply in particular circumstances (or niches)
such as populations with small numbers (Nettle, 1999) or with a relatively
high number of second-language learners (Lupyan and Dale, 2010), as we
will see in subsection 2.4.3.
Explanations of both the biological evolution of relevant cognitive capac-
ities and the cultural evolution of language in dynamic populations of cogni-
tively adapted individuals are required for a full explanation of the emergence
and evolution of language, and it has become increasingly clear that the in-
teraction between biology and culture, too, is likely to be vital: biology pro-
vides cognitive adaptations which influence how we interact with each other,
while these cultural interactions can cause qualitative structural changes to
language itself. One profitable way of conceptualising this is to consider language
as a complex adaptive system (Gell-Mann, 1994; Beckner et al., 2009),
a system where the linguistic structure is a set of emergent properties derived
from communicative interactions, without any system-wide central guidance
or optimisation (Hopper, 1987; Bybee, 2006). The very different evolutionary
timescales involved, however, mean that we are probably dealing with at least
three separate complex adaptive systems, which interact with each other in
interesting yet complex ways (Kirby and Hurford, 2002).
reflecting their different experiences. The same exemplar categories can also
be seen to form a distribution from which individuals’ production of linguistic
variants is drawn, and indeed each production yields another exemplar to be
added to the category. In an evolutionary model, some of these variants must
be selected according to some criteria, so that they are differentially produced
over other competing variants. Wedel (2006) uses a series of exemplar mod-
els like this to show in detail how genetic drift results in the random fixation
(when the whole community uses the same variant) of variants by pruning
variation, how small increases in selection bias can result in the inhibition of
sound changes where they would eliminate a functional contrast like those de-
scribed in subsection 2.2.1 and, conversely, how contrast can be shifted across
segments in cases of low functional contrast.
Baxter et al. (2009)’s framework focuses explicitly on the types of selection
mechanisms which can work on variants of a linguistic variable. Their model,
which is based on Hull (1988)’s general analysis of selection in evolutionary
systems as developed into Croft (2000)’s theory of utterance selection, makes
a crucial distinction between the replicators, the linguistic entities which are
replicated, and the interactors, the language users whose interaction with each
other causes the differential replication of the linguistic entities, yielding four
qualitatively different selection mechanisms, dubbed neutral evolution, neutral
interactor selection, weighted interactor selection and replicator selection. Neutral
evolution is directly equivalent to the solely random process of genetic drift,
which can, as Wedel (2006) showed, produce random fixations without any
need for true selection at all. Their mechanisms of interactor selection natu-
rally focus on how the structure of the networks and communities in which the
speakers interact can have substantial impacts on the replication of variants:
in neutral interactor selection, the frequency of interaction between speakers is
the only relevant factor; in weighted interactor selection, certain interactors
are preferred over others, due to different social valuations on them and the
social groups they belong to, and so the variants they use are indirectly pre-
ferred. In the final mechanism, replicator selection, the linguistic variants
themselves have different social valuations which lead directly to their differ-
ential replication. Baxter et al. (2009) used their framework to investigate a
notable theory of new dialect formation in isolated colonial communities such
as New Zealand after the arrival of British settlers (Trudgill, 2008), which held
that sociolinguistic features like identity and prestige play no part, but that the
new society effectively begins with a blank slate in which linguistic variants
have no pre-existing social values attached to them, and so the adoption of new
variants is driven solely by the frequency of interaction between interlocutors
(i.e. neutral interactor selection). Their analysis showed that although the
linguistic characteristics of New Zealand English were indeed consistent with
change simply through accommodation to interlocutors and frequency of in-
teraction, as Trudgill had suggested, it was inconceivable for this mechanism
alone to have been able to produce the level of dialect homogeneity which
was actually seen in New Zealand in the available timescale, and therefore
that some differential social valuation of either the interlocutors (weighted in-
teractor selection) or the variants they used (replicator selection) would have
been required.
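The claim that purely random copying can produce fixation without any selection can be illustrated with a very small simulation. This is a Wright-Fisher-style sketch under simplifying assumptions (a fixed population, one variant per speaker, uniform random copying each generation); it is not the utterance selection model of Baxter et al. (2009), and the function name and parameter values are hypothetical.

```python
import random


def drift_to_fixation(n_speakers=20, variants=("a", "b"), seed=0,
                      max_generations=100_000):
    """Copy variants at random each generation until only one remains.

    Returns the surviving variant and the number of generations taken,
    or (None, max_generations) if the cap is reached first.
    """
    rng = random.Random(seed)
    # Start with the variants shared out as evenly as possible.
    population = [variants[i % len(variants)] for i in range(n_speakers)]
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return population[0], generation
        # Neutral copying: each speaker adopts the variant of a randomly
        # chosen member of the previous generation, with no preference.
        population = [rng.choice(population) for _ in range(n_speakers)]
    return None, max_generations


winner, generations = drift_to_fixation()
print(winner, generations)
```

Which variant wins depends only on the random seed, since nothing in the copying step favours one variant over another. The time taken grows with the size of the population, which is in the spirit of the finding that neutral mechanisms alone struggle to produce homogeneity quickly in a large community.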
Blythe and Croft (2012) used the same evolutionary framework to investi-
gate the general mechanisms of propagation in language change, or how a new
linguistic variant is diffused through a community and becomes convention-
alised so that it replaces the original convention. Such replacement events, in
common with many natural processes, are characterised by a trajectory which
follows an approximate sigmoid S-curve, progressing from a slow beginning
through a phase of rapid acceleration and a slow approach to completion. In
particular, they focus on a systematic exploration of Baxter et al. (2009)’s
selection mechanisms, to see which of them can yield the characteristic dif-
fusion curve, finding that replicator selection is ‘almost certainly an essential
mechanism for language changes that follow an S-curve’ (Blythe and Croft,
2012, p.293). Interestingly, they find that the mere existence of any differen-
tial valuation of variants is enough: it is not required for the emergence of an
S-curve for individuals in the community to give variants the same values, nor
for those values to remain constant. Their hedging of replicator selection as al-
most certainly essential stems from their finding that although they did manage
to simulate an appropriate trajectory with weighted interactor selection alone,
this was only possible under rather implausible conditions where the population
was increasing exponentially over time and each new group entering the
population weighted their immediate predecessors highly. A similar finding
was reported by Gong et al. (2012), using a slightly different kind of evolu-
tionary model based on the Price equation for the description of evolutionary
processes (Price, 1970) in combination with the Pólya urn model of dynam-
ics from epidemiology, which confirmed that replicator selection (which they
term variant prestige) is the key selective pressure which drives the adoption
of new variants in a population.
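Why a differential valuation of variants tends to produce an S-curve can be seen in a deliberately simple deterministic sketch. It assumes, purely for illustration, that a new variant is replicated with a constant relative advantage s over the old one; the update rule is the standard discrete-time selection recursion and is not the model used by Blythe and Croft (2012) or Gong et al. (2012). The function name and parameter values are hypothetical.

```python
def replicator_trajectory(x0=0.01, s=0.2, steps=80):
    """Frequency of a new variant replicated with relative weight (1 + s).

    x0 is the initial frequency of the new variant and s its assumed
    advantage; both values are illustrative.
    """
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        # The new variant's share is reweighted by (1 + s) and renormalised.
        xs.append(x * (1 + s) / (1 + s * x))
    return xs


xs = replicator_trajectory()
increments = [later - earlier for earlier, later in zip(xs, xs[1:])]
peak = increments.index(max(increments))
print(f"fastest change at step {peak}, frequency {xs[peak]:.2f}")
```

The per-step increase is small while the variant is rare, largest when the two variants are roughly equally frequent, and small again as the new variant approaches fixation, which is the slow-fast-slow profile of the sigmoid curve.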
A more specific dynamic model of language change is presented by Kandler
et al. (2010), who are interested in modelling the process of language shift,
where members of a multilingual speech community abandon their original
language for another language, frequently because the new language is seen
as more useful in achieving social mobility or in providing access to greater
economic opportunities (McMahon, 1994). Kandler et al. (2010) focus in par-
ticular on the shift from Celtic languages to English in Britain and Ireland over
the last couple of centuries, and intriguingly show that, although monolingual
speakers of the language with lower prestige always disappear, the minority
language itself can still persist in a bilingual community, as long as there are
both sufficient bilingual speakers and sufficient pressure on monolinguals in
the high-status language to become bilingual. Applying this to the current sit-
uation of Gaelic in Scotland, they suggest, perhaps optimistically, that around
860 English speakers need to become bilingual in Gaelic each year for the lan-
guage to become stable. See also the papers by Grin (Chapter 21), Wickström
(Chapter 22) and Uriarte (Chapter 23) in this volume for further discussion of
minority languages.
variation and ambiguity we saw in Section 2.2, however, can lead to severe
problems in deciding which forms should be included (McMahon and McMa-
hon, 2005). Many languages, for instance, have unrelated near-synonyms (e.g.
English small/little) which could both equally well be used for the same item,
while for other languages there is no simple one-to-one mapping: they may
have multiple specific words which together cover one meaning on the list
(Navajo, for instance, has no single word for water, but instead separate words
for drinking water, rain water and stagnant water (Campbell, 2004)), or they may
have one single word which covers more than one item on the list (e.g. bark
and skin). Assuming such problems can be satisfactorily overcome, a cognate
coding matrix is then created from the completed list, coding all attested cog-
nates (i.e. words which can be shown to derive from a common ancestor) for
each meaning with the same state. It is clear, therefore, that phylogenetic mod-
els rely not only on the existence of a validated Swadesh-style list, but also on
the reliable prior application of the comparative method to accurately identify
the cognates and to successfully exclude as many borrowings as possible.
The simplest lexicostatistical models quantify the level of relatedness be-
tween languages simply by counting the number of cognate items across the
list: the higher the percentage of cognate vocabulary items, the more closely
related the languages are assumed to be. More sophisticated models focus
not just on a simple distance measure between languages (see Chapter 6 by
Ginsburg and Weber for a detailed account of how linguistic distances can be
calculated), but on producing an evolutionarily plausible route by which re-
lated languages have derived from a common ancestor language, finding the
best tree for the largest number of cognate items. The most straightforward
of these methods uses an assumption of parsimony similar to that assumed in
the comparative method: the best tree is simply the one which minimises the
number of evolutionary changes which are required to arrive at the observed
data. Maximum parsimony models have been successfully used to analyse the
internal history of the Bantu language family by, for instance, Holden (2002).
Maximum compatibility trees are similar to maximum parsimony trees, but re-
quire in addition that languages with the same state for a particular meaning
are represented as a single group within the tree, therefore ruling out the in-
dependent (convergent, in biological terms) evolution of cognates in different
lineages of languages; such a model has been used by Nakhleh et al. (2005)
to analyse Indo-European. In both cases, the analysis frequently results in
several possible best-scoring trees, and so consensus trees which amalgamate
these together are often used in order to visualise the results: strict consensus
trees include only those splits which are in every one of the best-scoring trees,
while majority consensus trees contain those which occur in more than half of
them.
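Both the simple lexicostatistical count and the parsimony criterion can be sketched on a toy cognate coding matrix. The four languages (A to D), their cognate classes and the function names below are all hypothetical, invented only to show the mechanics; the parsimony count uses Fitch's algorithm for a fixed tree, which is one common way of computing the minimum number of changes and is not necessarily the procedure used in the studies cited above.

```python
# Hypothetical cognate coding matrix: for each language, one cognate-class
# label per meaning on the list (same label = cognate forms).
MATRIX = {
    "A": [1, 1, 1, 2],
    "B": [1, 1, 2, 2],
    "C": [1, 2, 2, 1],
    "D": [2, 2, 2, 1],
}


def shared_cognate_percentage(lang_a, lang_b):
    """Lexicostatistical similarity: percentage of meanings with cognates."""
    same = sum(x == y for x, y in zip(MATRIX[lang_a], MATRIX[lang_b]))
    return 100.0 * same / len(MATRIX[lang_a])


def fitch_changes(tree, meaning):
    """Minimum number of state changes for one meaning on a binary tree."""
    def visit(node):
        if isinstance(node, str):  # leaf: a language name
            return {MATRIX[node][meaning]}, 0
        (left_states, left_cost) = visit(node[0])
        (right_states, right_cost) = visit(node[1])
        common = left_states & right_states
        if common:
            return common, left_cost + right_cost
        # No shared state: at least one change is needed on this split.
        return left_states | right_states, left_cost + right_cost + 1
    return visit(tree)[1]


def parsimony_score(tree):
    """Total number of changes over all meanings for a given tree."""
    n_meanings = len(next(iter(MATRIX.values())))
    return sum(fitch_changes(tree, m) for m in range(n_meanings))


# The three possible groupings of four languages, as nested pairs.
TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
scores = {name: parsimony_score(tree) for name, tree in TREES.items()}
best = min(scores, key=scores.get)
print(shared_cognate_percentage("A", "B"), scores, best)
```

With only four languages every candidate tree can be scored exhaustively. The number of possible trees grows so quickly with the number of languages that real analyses rely on heuristic search, which is one reason several equally good trees are often returned.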
Other models make explicit assumptions about a particular hypothesis of
linguistic change, and then estimate the evolutionary history of the languages
or language families based on that hypothesis. A maximum likelihood analysis
tries to find a single tree and model parameters which maximise the proba-
bility of producing the observed data, while Bayesian methods (Pagel and
Meade, 2004) produce a probability distribution over the set of trees, allowing
the explicit representation of phylogenetic uncertainty. Gray and Atkinson
(2003), for instance, used a Bayesian model to establish the relationship be-
tween the frequency of usage and the rate of lexical change in four separate
Indo-European languages. Unfortunately, the space of possible trees is enor-
mous and highly skewed towards trees with low likelihood values, and there is
no existing algorithm which can guarantee that the best tree will be found in
reasonable computing time (Schmidt and von Haeseler, 2009); the best way
round this is to use Markov Chain Monte Carlo sampling to move through the
space until a stationary distribution is reached, and then build a consensus tree
from the trees within the stationary distribution.
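The sampling idea can be illustrated with a toy Metropolis sampler. The three candidate trees and their unnormalised weights are hypothetical stand-ins for a posterior distribution, and the uniform proposal is a simplification: real phylogenetic samplers propose small rearrangements of the current tree and score each proposal against the data under an explicit model of change.

```python
import random
from collections import Counter

# Hypothetical unnormalised posterior weights for three candidate trees.
WEIGHTS = {
    "((A,B),(C,D))": 6.0,
    "((A,C),(B,D))": 3.0,
    "((A,D),(B,C))": 1.0,
}


def metropolis(weights, n_samples=20_000, burn_in=1_000, seed=1):
    """Sample states in proportion to their weights by Metropolis updates."""
    rng = random.Random(seed)
    states = list(weights)
    current = rng.choice(states)
    counts = Counter()
    for step in range(burn_in + n_samples):
        proposal = rng.choice(states)  # symmetric (uniform) proposal
        # Always accept a better state; accept a worse one with a
        # probability equal to the ratio of the two weights.
        if rng.random() < min(1.0, weights[proposal] / weights[current]):
            current = proposal
        if step >= burn_in:  # discard samples from before stationarity
            counts[current] += 1
    return {state: counts[state] / n_samples for state in states}


freqs = metropolis(WEIGHTS)
print(freqs)
```

After the burn-in period the chain visits each tree roughly in proportion to its weight, so the sampled frequencies approximate the normalised distribution without every possible tree ever having to be enumerated or the normalising constant computed.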
One key issue with all such phylogenetic models, however, is their accu-
racy, which of course depends upon the encoding and analysis of the data.
Problematically, because we have only incomplete information about the true
history of most languages, it can be difficult to evaluate the different models,
although this is done to some extent by calibrating them on well-established
language families for which extremely strong written evidence or historical
records exist. Another, more subtle, problem with family trees as a model
of language history, as mentioned in subsection 2.3.2, is that they force an
idealised view of language change which deliberately ignores all kinds of bor-
rowing or influence from unrelated languages, and is completely unable, for
instance, to represent the formation of creole languages which have multiple
parents. For languages with extensive borrowing, network models are needed
in order to be able to represent these conflicting relationships by reticulated
joins between branches. Holden and Gray (2006), for instance, used a net-
work model to try to resolve some of the outstanding problems in Bantu his-
tory which had proved intractable to tree-based analysis, and found that while
West Bantu scattered very quickly into a number of different branches, East
Bantu was characterised by extensive borrowing early in its development. The
complexity of network models, however, and in particular their sensitivity to
changes in internal parameter settings, can make them extremely difficult to
interpret (McMahon and McMahon, 2005).
Much effort in the past few decades has been expended on connecting evi-
dence about language histories with evidence from other disciplines, in order
to shed further light on aspects of human history, seeking to correlate linguistic
family trees or models more generally with archaeological or genetic evidence.
The earliest and most famous of these was Cavalli-Sforza et al.’s (1988) at-
tempted reconstruction of human phylogeny by finding correlations between
linguistic trees and trees of genetic diversity, which was based on the inno-
vative and intuitively appealing idea that when populations split and merge,
both their genes and their languages could have been affected by common pro-
cesses. There are clear and seductive parallels not only between direct genetic
and historical linguistic inheritance, but also between processes of diffusion
such as gene exchange through marriage and language convergence through
borrowing. Despite this, the work has met with a considerable degree of crit-
icism from many different quarters, not only due to its reliance on many of
the extremely contentious linguistic groupings described in subsection 2.3.2
above, but also due to methodological shortcomings in the production of the
genetic trees and apparent analytical sleights of hand. The genetic trees were
in fact phenograms reflecting overall genetic similarity between populations,
which were simply assumed to be equivalent to the phylogeny of the popu-
lations without any further evidence (Bateman et al., 1990). The individual
population groupings used were also problematic, with a minority of them
being explicitly defined on the basis of the language spoken in their commu-
nity, calling into question the independence of the genetic and linguistic trees
which is required for any correlations to be valuable (McMahon and McMa-
hon, 2005). Most seriously of all, the ‘remarkable correspondence’ between
the genetic and linguistic trees claimed by Cavalli-Sforza et al. (1988, p. 6002)
appears under closer examination to be illusory and largely due to the way
in which the trees were visually presented: almost half of the linguistic fami-
lies are matched with just a single population, and thus have no effect on the
congruence of the trees at all (McMahon and McMahon, 1995); most of the
other linguistic families are in fact split across different genetic populations;
and neither of the putative linguistic superphyla in the linguistic trees
(Nostratic and Eurasiatic) corresponds to any population aggregate on the
genetic tree (Bateman et al., 1990).
Less controversially, linguistic evidence from phylogenetic analysis has been
used to evaluate competing archaeological hypotheses about human history.
Gray and Jordan (2000), for instance, used a maximum parsimony model to
derive a tree of 77 Austronesian languages which they showed to be signif-
icantly congruent with another tree representing the ‘express train’ model of
the human colonisation of the Pacific, supporting the theory of colonisation by
an original population in Taiwan which spread through a series of migrations
to Polynesia. Holden (2002) also used a similar parsimony analysis to
reconstruct the family history of the Bantu languages, showing that the
resultant tree closely mirrored existing archaeological evidence about the spread
of farming in sub-Saharan Africa in the Neolithic and Early Iron Age.
Phylogenetic models have also been used for the purposes of glottochronol-
ogy, a technique originally developed through analogy with carbon dating to
estimate the dates of events in language family trees, particularly language
splits. As in radioactive decay, the method assumed a ‘statable regularity’
(Lees, 1953, p. 113) in the rate at which basic vocabulary items would be re-
placed by new words. By averaging results derived from control data based on
pairs of languages which could be independently dated (e.g. Latin and Spanish;
Old Norse and Modern Swedish) and for which cognacy judgements were car-
ried out using the standard comparative method, Lees (1953) derived a value
of 0.8048 (± 0.0176) per millennium for this glottochronological constant.
Dating language splits is inherently troublesome because they represent grad-
ual processes rather than precise historical events, but more problematically,
by analysing further pairs of languages, Bergsland and Vogt (1962) were able
to show that the glottochronological constant was actually an illusion, with
widely varying rates of language change depending on a range of different
factors such as the relative isolation of the language community and their specific
cultural practices;[10] additional work has since shown that languages tend
to change more quickly in smaller communities (Nettle, 1999), and that the
emergence of a distinct language itself is associated with rapid, punctuated
bursts of change followed by much longer periods of slower changes as the
new language diverges from its ancestor (Atkinson et al., 2008). More recent
phylogenetic models have begun to take account of varying rates of change,
again borrowing methods from biology (Drummond and Suchard, 2010) to
incorporate both probabilistic variation across the whole tree (relaxed clock
models) and a series of different rates for different regions of the tree (random
local clock models). In a famous example of this, Gray and
Atkinson (2003) used a maximum likelihood model with a relaxed clock and
smoothed variation across the tree, which they used to estimate an emergence
date for Indo-European at around 8000–9500 years ago, strongly consistent
with an Anatolian origin for the language and its subsequent expansion in as-
sociation with the spread of agriculture.
[10] They describe a word taboo in East Greenlandic, for instance, where the name of a dead
person cannot be mentioned during a mourning period; as people’s names are often words in
the language, new words need to be deliberately (and frequently) created to replace the taboo
items.
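The arithmetic behind the constant-rate assumption can be made concrete with a short Python sketch. It uses the retention value reported by Lees (1953) and the usual constant-rate dating formula, which is not spelled out in the text above and is included here as an assumption: if each language independently retains a proportion r of its basic vocabulary per millennium, two related languages are expected to share a proportion r^(2t) after t millennia, so that t = ln(c) / (2 ln(r)) for an observed shared proportion c. The function names are illustrative only, and the sketch embodies exactly the uniform rate that later work called into question.

```python
import math

# Retention rate per millennium reported by Lees (1953).
LEES_RETENTION = 0.8048


def retained_fraction(millennia, retention=LEES_RETENTION):
    """Expected share of basic vocabulary a language keeps after `millennia`."""
    return retention ** millennia


def divergence_time(shared_cognates, retention=LEES_RETENTION):
    """Millennia since two languages split, given their share of cognates.

    Assumes both lineages replace vocabulary independently at the same
    constant rate, so the expected shared proportion is retention ** (2 * t).
    """
    if not 0.0 < shared_cognates <= 1.0:
        raise ValueError("shared_cognates must be in (0, 1]")
    return math.log(shared_cognates) / (2.0 * math.log(retention))
```

Under these assumptions, a pair of languages sharing a proportion of `LEES_RETENTION ** 2` of their basic vocabulary would be dated to a split one millennium ago; any variation in the true rate feeds directly into the estimated date.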
2.4.3 Models of language evolution
Early models of language evolution were dominated by simple computer sim-
ulations exploring the biological evolution of the critical period for language
learning (Hurford, 1991) and of symbols themselves (Hurford, 1989). Pro-
ponents of cultural explanations of the evolution of language, on the other
hand, have shown that language acquisition is underpinned by general learn-
ing strategies rather than a language organ, showing for instance that learners
can extract sufficient information from the transitional probabilities between
words to be able to make successful grammaticality judgements in complex
sentences (Reali and Christiansen, 2005) and that distributional information
of this sort can be integrated with probabilistic phonological cues to create
accurate representations of lexical categories (Reali et al., 2003). Such simu-
lations are typically agent-based models, involving a simulated population of
individuals (agents) initially endowed with some specified cognitive capacities
and then left to interact with each other and update their linguistic knowledge
over thousands of interactions and perhaps over multiple generations of agents
dying and being replaced. This kind of modelling has been used to explore a
wide range of issues from the emergence of phonological structure and the
duality of patterning described in Section 2.2 (de Boer, 2001; Oudeyer, 2006;
Zuidema and de Boer, 2009) to the evolution of vocabulary (Smith, 2004) and
syntactic structure (Kirby, 2000). More recently, a whole research programme
(Steels, 2011) has been developed on the implementation of agent-based lan-
guage game models, which model the emergence of aspects of language through
the interactive negotiation of co-ordinated behaviour between agents, not only
general concepts like the development of shared lexicons (Steels, 1999; Smith,
2005; Steels and Belpaeme, 2005) and the design feature of compositionality
(Vogt, 2005), but also more focused grammatical and morphological features
such as the emergence of case systems (van Trijp, 2012) and agreement (Beuls
and Steels, 2013) through processes of grammaticalisation.
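The interactive negotiation at the heart of language game models can be sketched in a few lines of Python. What follows is a minimal naming game about a single object, written purely for illustration: it is a simplification and not a reimplementation of any of the models cited above, and the function name, the update rule and the invented word forms (w0, w1, ...) are all assumptions.

```python
import random


def play_naming_game(n_agents=10, max_rounds=100_000, seed=0):
    """Run a minimal naming game about a single object.

    Returns (rounds_played, inventories), where each inventory is the set of
    candidate words an agent currently holds for the object.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    invented = 0
    for round_number in range(1, max_rounds + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            # A speaker with no word for the object invents a new one.
            inventories[speaker].add(f"w{invented}")
            invented += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents drop all competing words.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # Failure: the hearer adopts the word as a new candidate.
            inventories[hearer].add(word)
        # Stop once every agent holds the same single word.
        if all(inv == inventories[0] and len(inv) == 1 for inv in inventories):
            return round_number, inventories
    return max_rounds, inventories
```

No agent is given a lexicon in advance and there is no central control; a shared convention emerges only from repeated local interactions, with successful words crowding out their competitors.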
Probably the primary agent-based model, which has now outgrown its com-
putational origins to be used frequently in both mathematical and experimental
models as well, is the iterated learning model (Kirby and Hurford, 2002; Smith
et al., 2003b), which allows explorations of how the structure of language
evolves through its transmission over a diffusion chain, with agents learning
language through observing the linguistic usage of other agents who learnt
their language in the same way. Simple iterated learning models have been
instrumental in demonstrating the importance of cultural evolution in the evo-
lution of language, showing how random languages can be transformed into
stable, syntactically complex languages simply by being learnt over genera-
tions through a so-called transmission bottleneck. This term refers to the fact
that learners have to learn from only partial experience of the language, and
so have not explicitly learnt how to express certain meanings. The transmission
bottleneck forces learners to generalise from the data they have encountered to
represent such unobserved meanings, and this pressure biases the language to-
wards compositionality and systematicity, rather than the idiosyncrasy which
can persist without the bottleneck (Kirby, 2001; Smith et al., 2003a).
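A toy version of this process can be written in Python. The sketch below is not the model of Kirby and Hurford (2002) or of any other cited study: the meaning space (three shapes by three colours), the two-letter signals, and in particular the learner's generalisation strategy (reuse the letter most often seen with each feature value, and invent a letter otherwise) are simplifying assumptions chosen only to make the role of the bottleneck visible.

```python
import random
from collections import Counter, defaultdict
from itertools import product

SHAPES = range(3)
COLOURS = range(3)
MEANINGS = list(product(SHAPES, COLOURS))
ALPHABET = "abcdefgh"


def random_language(rng):
    """An unstructured language: a random two-letter signal per meaning."""
    return {m: "".join(rng.choice(ALPHABET) for _ in range(2)) for m in MEANINGS}


def learn(observations, rng):
    """Induce a full language from observed (meaning, signal) pairs."""
    memory = dict(observations)
    # Tally which letter accompanies each feature value in each position.
    first = defaultdict(Counter)
    second = defaultdict(Counter)
    for (shape, colour), signal in observations:
        first[shape][signal[0]] += 1
        second[colour][signal[1]] += 1

    def guess(counter):
        # Reuse the most frequent letter seen, or invent one if none was seen.
        if counter:
            return counter.most_common(1)[0][0]
        return rng.choice(ALPHABET)

    language = {}
    for meaning in MEANINGS:
        if meaning in memory:
            language[meaning] = memory[meaning]  # rote-learnt item
        else:
            shape, colour = meaning
            language[meaning] = guess(first[shape]) + guess(second[colour])
    return language


def is_systematic(language):
    """True if each shape and each colour is always marked by the same letter."""
    shape_ok = all(len({language[(s, c)][0] for c in COLOURS}) == 1 for s in SHAPES)
    colour_ok = all(len({language[(s, c)][1] for s in SHAPES}) == 1 for c in COLOURS)
    return shape_ok and colour_ok


def iterate(generations, bottleneck, seed=0):
    """Pass a language down a chain of learners through a bottleneck."""
    rng = random.Random(seed)
    language = random_language(rng)
    for _ in range(generations):
        # Each learner sees only `bottleneck` of the meaning-signal pairs.
        seen = rng.sample(MEANINGS, bottleneck)
        language = learn([(m, language[m]) for m in seen], rng)
    return language
```

When `bottleneck` equals `len(MEANINGS)`, every item is learnt by rote and the initial random language is passed on unchanged; with a smaller bottleneck, unseen meanings must be expressed by generalisation, and `is_systematic` can be used to check whether a given chain has settled on a regular mapping.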
In the last decade, a set of experimental techniques has become popular
in work on the dynamics of language evolution. The iterated learning model
itself was transferred to this paradigm in a celebrated language experiment in
which an unstructured ‘alien’ language of random words, with no connection
between meanings and forms, was taught to participants; the words produced
for the meanings by this generation were then taught to the next generation of
participants (Kirby et al., 2008). Over cumulative generations, the languages
adapted so that they were easier to learn, essentially by tolerating massive am-
biguity and using the same word for multiple meanings; as a result of this the
language’s expressive power was lost. A second experiment reintroduced ex-
pressivity into the language by the explicit exclusion of ambiguity, and this re-
sulted in the languages adapting differently, still becoming increasingly learn-
able, but this time by developing compositionality so that aspects of meanings
were systematically reflected in the words produced, so that expressivity too
was maximised.
In another, graphically-based experiment, the competing pressures of learn-
ability and expressibility were explored differently: Garrod et al. (2007) asked
participants to play a Pictionary-style game in which one person draws an os-
tensive representation of some meaning, and the other infers what the mean-
ing is. When participants play this game repeatedly in pairs, they develop
a communicative system which is extremely expressive but in which the sig-
nals become increasingly simple after repeated use; the participants no longer
need to draw all the detail, but can simply provide shorthand simplified cues
to the meanings which effectively reside in their shared common history of use
(Caldwell and Smith, 2012). The cues themselves are idiosyncratic and arbi-
trary, showing clearly how common ground and shared experience can drive
the ‘drift to the arbitrary’ (Tomasello, 2008); this makes languages more in-
ternally efficient, but at the expense of learnability. In a further experiment,
Theisen et al. (2010) introduced a pressure for learnability by embedding the
game in an iterated learning chain, with new generations learning from the
drawings produced by the previous generation; in this case the drawings still
became increasingly conventionalised and arbitrary, but also developed compositionality,
making them more learnable for future generations. More recently,
Kirby et al. (2015) used both computational simulations and laboratory
experiments to show convincingly that linguistic structure emerges culturally,
as a direct result of the interactions between the twin pressures of expressivity
and learnability.
Experimental techniques have also been used to explore the emergence of
signals themselves: in a pioneering use of this technique, Galantucci (2005)
investigated the co-ordination and conventionalisation of signalling systems in
an artificial context where existing communication systems were rendered use-
less but with other novel communicative opportunities; participants learnt in
many cases to exploit the opportunities and developed shared expressive communicative
systems. This paradigm was pared back further by Scott-Phillips
et al. (2009) to a situation where participants could only move in a strictly con-
trolled fashion around a small grid, yet needed to communicate successfully in
order to co-ordinate their behaviour. Despite this extremely challenging situ-
ation, participants were able to develop successful communication systems, if
they were able to signal their communicative intent by moving ostensively in a
distinctive manner contrary to their expectations.
The controversy over Cavalli-Sforza et al.’s (1988) work, described in sub-
section 2.4.2, cast a long shadow over work comparing genetics and linguistics
for a considerable period of time, but over the last few years it has become
clear that genes can have potentially surprising effects on linguistic structures.
Dediu and Ladd (2007) first demonstrated the existence of a fascinating
population-level relationship between the frequencies of particular alleles
of two genes involved in brain development and the likelihood that the
languages spoken in the population have the tonal contrasts described
in subsection 2.2.1. Intriguingly, the relevant variants of the two genes in
question, ASPM and Microcephalin, have both emerged relatively recently in
human evolution and are
spreading quickly across the world, suggesting that they are favoured by nat-
ural selection and therefore that the correlation reflects some kind of small
cognitive bias whose effects on language structure only become detectable
after many generations of cultural evolution, although Dediu and Ladd are
careful not to speculate on the precise details of this bias. The significance of
the relationship was demonstrated by taking advantage of new large genetic
and linguistic databases (Haspelmath et al., 2005) to test many thousands of
other possible relationships, showing that although there is generally no re-
lationship between genetic markers and linguistic features, the relationship
between these two genes and tonal languages remained very strong. Simi-
lar correlations have been noted before, notably between linguistic diversity
and biodiversity (Nettle, 1999; Maffi, 2001), but increases in the availability
and ease of use of such databases have led over recent years to considerable
growth in correlational studies searching for relationships between combina-
tions of linguistic variables and other cultural traits, although researchers must
be careful to avoid simplistic uncontrolled data dredging over many differ-
ent variables, which will inevitably uncover spurious relationships (Roberts
and Winters, 2013). Many correlational studies have nevertheless found evidence
for the evolutionary adaptation of aspects of languages to
particular ecological and social niches in their dynamic environment. Lupyan
and Dale (2010), for instance, showed an inverse correlation between popula-
tion size and morphological complexity, finding evidence in support of Wray
and Grace’s (2007) theory that exoteric (outward-facing) languages used in
large, disparate communities (such as English and Swahili) face pressure to be
more easily learnable by non-native speakers and to be easily used by people
from different backgrounds with little non-linguistic common ground, while
esoteric (inward-facing) languages in small communities (such as Tatar and
Elfdalian) face contrasting pressures to become reliable social markers of community
membership, and thus show increases in morphological complexity
and opacity to ensure that only people able to spend a great deal of effort would
be able to use them. Sometimes the niche is physiological: Everett (2013)
found that the languages spoken in populations living at high altitudes have
more ejective consonants, and hypothesised that this may be because lower air
pressure makes ejectives easier to articulate, or because ejectives mitigate
the dehydrating effects of exhalation in drier climates. This line
of inquiry has been extended more recently, with Everett et al. (in press) show-
ing that extremely cold and dry regions constrain the emergence and spread
of tonal languages, because the reduced air quality appears to compromise the
fine control needed to maintain the tonal distinctions.
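The warning above about uncontrolled data dredging rests on simple arithmetic, which the following Python sketch makes explicit. It assumes, purely for illustration, that every test is independent and uses the same significance level; real linguistic and cultural variables are rarely independent, so the figures should be read as an indication of scale rather than as an exact rate. The function names are hypothetical.

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one false positive in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of spurious 'significant' results when no effect is real."""
    return n_tests * alpha
```

With a conventional threshold of 0.05, a hundred unrelated comparisons already make at least one spurious correlation almost certain, which is why a relationship that survives comparison against many thousands of alternatives is considerably more persuasive than one found in isolation.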
2.5 Conclusion
The question of what makes human language special has been long debated
and analysed, with its expressive power in particular often regarded with a
reverential sense of awe and wonder. This chapter has focused on one crucial
characteristic of language: its dynamic nature, which marks it as unique and
rather strange in comparison to other communication systems. All languages
are in a state of constant flux, characterised by dynamic variation and massive
diversity at all levels of analysis. In Section 2.2, I gave a detailed overview of
some of the striking ways in which this diversity is manifested across different
languages, looking at the sounds they use and the way these are organised,
the ways in which words are created from their component parts and then
assembled into full sentences, and the conceptualisation of simple spatial re-
lationships between objects.
In Section 2.3, I set out three of the main timescales on which research into
language change and evolution is carried out. The fundamental basis of lin-
guistic change is its pervasive cumulative variation, which is primarily caused
by the complex social functions for which language is used, and the nature
of language in use, in particular its constant oscillation between internal lin-
guistic representations and external linguistic behaviour, mediated by the co-
operative communicative processes of ostension and inference. This pervasive
variation leads to the constant creation of new linguistic varieties, which, over
time and space, develop into separate languages stemming from a common
source; much early work in linguistics as a scientific discipline, in fact, was in
the identification of relationships between modern languages and in inferring
and describing their shared and separate histories in terms of family trees and
networks. On a yet longer timescale, evolutionary linguistics is interested in
the initial emergence of language from a pre-linguistic communication system,
particularly in how interactions between biological and cultural evolution can
provide explanations of this emergence by investigating language as a com-
plex adaptive system with properties emerging from language use in a social
context.
Formal models have proved over the last few decades to be extremely important
in the exploration and systematic testing of many aspects of the dynamic
nature of language. In Section 2.4, I presented a number of recent models
in this vein, including evolutionary models of the propagation of linguistic
variants, phylogenetic models of language family histories and their use in
wider debates about human history, and formal and experimental models of
cumulative cultural evolution showing the adaptive nature of language itself
under pressures of learnability and expressivity. It is clear that both the scope
and variety of models have increased enormously over recent years, with many
different techniques being developed to explore an increasing range of linguis-
tic characteristics; the increased availability of large datasets has led, for in-
stance, to many comparative studies exploring correlations between specific
linguistic characteristics and genetic or geographic information.
2.6 References
J. Aitchison (1996) The Seeds of Speech: Language Origin and Evolution (Cam-
bridge: Cambridge University Press).
Q. Atkinson, A. Meade, C. Venditti, S. Greenhill, and M. Pagel (2008) ‘Lan-
guages evolve in punctuational bursts’, Science, 319, 588.
R. Bateman, I. Goddard, R. O’Grady, V. Funk, R. Mooi, J. Kress, and P. Cannell
(1990) ‘Speaking of forked tongues: the feasibility of reconciling human
phylogeny and the history of language’, Current Anthropology, 31, 1-24.
G. Baxter, R. Blythe, W. Croft, and A. McKane (2009) ‘Modeling language
change: an evaluation of Trudgill’s theory of the emergence of New
Zealand English’, Language Variation and Change, 21, 257-296.
C. Beckner, R. Blythe, J. Bybee, M. Christiansen, W. Croft, N. Ellis, J. Hol-
land, J. Ke, D. Larsen-Freeman, and T. Schoenemann (2009) ‘Language is
a complex adaptive system: Position paper’, Language Learning, 59, 1-26.
K. Bergsland and H. Vogt (1962) ‘On the validity of glottochronology’, Current
Anthropology, 3, 115-153.
K. Beuls and L. Steels (2013) ‘Agent-based models of strategies for the emergence
and evolution of grammatical agreement’, PLoS ONE, 8, e58960.
R. Blythe and W. Croft (2012) ‘S-curves and the mechanisms of propagation in
language change’, Language, 88, 269-304.
D. Bolinger (1975) Aspects of Language (New York: Harcourt Brace Jo-
vanovich).
H. Bowe and S. Morey (1999) The Yorta Yorta (Bangerang) Language of the
Murray Goulburn, Including Yabula Yabula (Canberra: Pacific Linguistics).
H. Brighton, K. Smith, and S. Kirby (2005) ‘Language as an evolutionary sys-
tem’, Physics of Life Reviews, 2, 177-226.
P. Brown (2001) ‘Learning to talk about motion up and down in Tzeltal: is
there a language-specific bias for verb learning?’ In M. Bowerman and
S. Levinson (eds.) Language Acquisition and Conceptual Development (Cam-
bridge: Cambridge University Press).
N. Chomsky (1965) Aspects of the Theory of Syntax (Cambridge, MA: MIT Press).
N. Chomsky (1980) Rules and Representations (London: Basil Blackwell).
D. Dediu and R. Ladd (2007) ‘Linguistic tone is related to the population fre-
quency of the adaptive haplogroups of two brain size genes, ASPM and mi-
crocephalin’, Proceedings of the National Academy of Sciences of the United
States of America, 104, 10944-10949.
G. Deutscher (2010) Through the Language Glass: How Words Colour Your World
(London: William Heinemann).
R. Dixon (1997) The Rise and Fall of Languages (Cambridge: Cambridge Uni-
versity Press).
M. Dryer (2013a) ‘Order of subject, object and verb’ In M. Dryer and M. Haspel-
math (eds.) The World Atlas of Linguistic Structures Online (Leipzig: Max
Planck Institute for Evolutionary Anthropology).
M. Dryer (2013b) ‘Prefixing vs. suffixing in inflectional morphology’ In
M. Dryer and M. Haspelmath (eds.) The World Atlas of Linguistic Structures
Online (Leipzig: Max Planck Institute for Evolutionary Anthropology).
N. Evans (2010) Dying Words: Endangered Languages and What They Have to
Tell Us (Singapore: Wiley-Blackwell).
C. Everett, D. Blasi, and S. Roberts (in press) ‘Climate, vocal folds, and tonal
languages: connecting the physiological and geographic dots’, Proceedings
of the National Academy of Sciences.
M. Gell-Mann (1994) The Quark and the Jaguar (New York: Freeman).
C. Goddard (2005) The Languages of East and Southeast Asia (Oxford: Oxford
University Press).
R. Gray and F. Jordan (2000) ‘Language trees support the express-train se-
quence of Austronesian expansion’, Nature, 405, 1052-1055.
J. Greenberg (1963) ‘Some universals of grammar with particular reference
to the order of meaningful elements’ In J. Greenberg (ed.) Universals of
Language (Cambridge, MA: MIT Press), 2nd edition.
P. Grice (1975) ‘Logic and conversation’ In P. Cole and J. Morgan (eds.) Syntax
and Semantics, volume 3 (New York: Academic Press).
S. Harrison (2005) ‘On the limits of the comparative method’ In B. Joseph and
R. Janda (eds.) The Handbook of Historical Linguistics (Oxford: Blackwell).
M. Haspelmath, M. Dryer, D. Gil, and B. Comrie (eds.) (2005) The World Atlas
of Linguistic Structures (Oxford: Oxford University Press).
B. Heine and T. Kuteva (2005) Language Contact and Grammatical Change (Cam-
bridge: Cambridge University Press).
B. Heine and T. Kuteva (2006) The Changing Language of Europe (Oxford: Ox-
ford University Press).
C. Holden (2002) ‘Bantu language trees reflect the spread of farming across
sub-Saharan Africa: a maximum-parsimony analysis’, Proceedings of the
Royal Society B, 269, 793-799.
C. Holden and R. Gray (2006) ‘Rapid radiation, borrowing, and dialect con-
tinua in the Bantu languages’ In P. Forster and C. Renfrew (eds.) Phylo-
genetic Methods and the Prehistory of Languages (Cambridge: MacDonald
Institute Press).
M. Israel (2014) ‘Semantics: how language makes sense’ In C. Genetti (ed.) How
Languages Work: An Introduction to Language and Linguistics (Cambridge:
Cambridge University Press).
S. Kirby and J. Hurford (2002) ‘The emergence of linguistic structure: an
overview of the iterated learning model’ In A. Cangelosi and D. Parisi (eds.)
Simulating the Evolution of Language (London: Springer Verlag).
P. Lewis (ed.) (2005) Ethnologue: Languages of the World (Dallas: SIL Interna-
tional).
A. Martinet (1949) ‘La double articulation linguistique’, Travaux du Cercle
Linguistique de Copenhague, 5, 30-37.
D. Nettle (1999) ‘Is the rate of linguistic change constant?’, Lingua, 108, 119-
136.
D. Nettle and S. Romaine (2000) Vanishing Voices: The Extinction of the World’s
Languages (Oxford: Oxford University Press).
S. Pinker and P. Bloom (1990) ‘Natural language and natural selection’, Behav-
ioral and Brain Sciences, 13, 707-784.
M. Ruhlen (1994) On the Origin of Languages: Studies in Linguistic Taxonomy
(Stanford, CA: Stanford University Press).
A. Smith and S. Höfler (2015) ‘The pivotal role of metaphor in the evolution of
human language’ In J. Díaz Vera (ed.) Metaphor and Metonymy across Time
and Cultures (Berlin: Mouton de Gruyter).
K. Smith (2011) ‘Why formal models are useful for evolutionary linguists’ In
K. Gibson and M. Tallerman (eds.) Oxford Handbook of Language Evolution
(Oxford: Oxford University Press).
K. Smith and S. Kirby (2008) ‘Cultural evolution: implications for understand-
ing the human language faculty and its evolution’, Philosophical Transac-
tions of the Royal Society of London, series B – Biological Sciences, 363, 3591-
3603.
J.-J. Song (2001) Linguistic Typology: Morphology and Syntax (Harlow: Long-
man).
G. Trousdale (2010) An Introduction to English Sociolinguistics (Edinburgh: Ed-
inburgh University Press).
P. Trudgill (1972) ‘Sex, covert prestige, and linguistic change in the urban
British English of Norwich’, Language in Society, 1, 179-196.
R. van Trijp (2012) ‘The evolution of case systems for marking event structure’
In L. Steels (ed.) Experiments in Cultural Language Evolution (Amsterdam: John
Benjamins).
A. Wedel (2006) ‘Exemplar models, evolution and language change’, The Lin-
guistic Review, 23, 247-274.
W. Zuidema (2003) ‘How the poverty of the stimulus solves the poverty of the
stimulus’ In S. Becker, S. Thrun, and K. Obermayer (eds.) Advances in Neu-
ral Information Processing Systems 15 (Proceedings of NIPS ’02) (Cambridge,
MA: MIT Press).
Bio
Andrew Smith is Lecturer in Language Studies at the University of Stirling. He
received his PhD in Language Evolution from the University of Edinburgh. His
main research interests are in evolutionary and cognitive linguistics, focusing
in particular on grammaticalisation, the inferential socio-cultural and cogni-
tive bases of communication, metaphor, cultural evolution and word learning
mechanisms. His most recent book is the edited collection New Directions in
Grammaticalization Research (with Graeme Trousdale and Richard Waltereit),
published by John Benjamins in 2015.
Index Terms
General
comparative method, complex adaptive system, cultural evolution, evolution
of language, iterated learning, language change, language diversity, language
evolution, lexicostatistics, linguistic variation, model, morphology, phonology,
phylogeny, simulation