The Vocal Tract
[Figure: Sub-division of the tongue, with regions labelled from tip to back]
f. The teeth (upper and lower) are usually shown in diagrams like
Fig.1 only at the front of the mouth, immediately behind the lips.
This is for the sake of a simple diagram, and you should remember
that most speakers have teeth to the sides of their mouths, back
almost to the soft palate. The tongue is in contact with the upper side
teeth for many speech sounds. Sounds made with the tongue
touching the front teeth are called dental.
g. The lips are important in speech. They can be pressed together
(when we produce the sounds [p], [b]), brought into contact with the
teeth (as in [f], [v]), or rounded to produce the lip-shape for vowels
like [u:]. Sounds in which the lips are in contact with each other are
called bilabial, while those with lip-to-teeth contact are called
labiodental.
The production of any speech sound requires the creation of an
airstream in the vocal tract. The airstream may be created by either
compressing or rarefying the air in the tract. In English the airstream is
initiated by the lungs. When the lungs contract, they push air out, creating
an out-flowing airstream. We call this airstream pulmonic egressive: pulmonic
because it is initiated by the lungs, and egressive because the air flows outward.
The pulmonic egressive airstream, as it passes through the larynx,
may be modified by the vocal cords, through the introduction of voice.
Without voice, speech would be reduced to an inaudible whisper.
When the vocal cords are brought together, air passing out from the
lungs causes them to vibrate, and voice is produced. Sounds produced
with the vocal cords vibrating are called voiced. If the vocal cords are
pulled back, they cannot vibrate. Sounds produced without the vocal
cords vibrating are called voiceless. When we breathe, the vocal cords
are pulled back allowing the air to pass freely in and out of the lungs.
Phonetic Transcription
Since the sixteenth century, efforts have been made to devise a
universal system for transcribing the sounds of speech. The best
known system is the International Phonetic Alphabet (IPA). In this
alphabet the relationship between symbol and sound is one to one.
English Consonants
Symbols Examples
Place/Point of Articulation
As the airstream passes through the vocal tract, it may be modified by
the movement of the articulators, that is, by the lips and the tongue
obstructing its passage through the vocal tract to varying degrees. This
process is called articulation. The obstruction of the airstream may
occur at any point in the vocal tract, and is the result of an active
articulator moving towards a passive articulator. The active
articulators are the lips and the tongue, and the passive articulators are
the locations on the roof of the mouth, for example the alveolar ridge,
hard palate, etc.
Bilabial sounds are made with both
lips. There are five such sounds
possible in English: [p] pat, [b] bat,
[m] mat, [w] with, and [wh] where
(present only in some dialects). We
could say that the lower lip is the
active articulator and the upper lip the
passive articulator, though the upper
lip usually moves too, at least a little.
Labiodental consonants are made with
the lower lip against the upper front
teeth. English has two labiodentals: [f]
as in fat and [v] as in vat. The lower
lip is the active articulator and the
upper teeth are the passive articulator.
Interdentals are made with the tip of
the tongue between the front teeth. There
are two interdental sounds in English:
[θ] thigh and [ð] thy.
Just behind your upper front teeth
there is a small ridge called the
alveolar ridge. English makes seven
sounds at or near this ridge: [t] tab, [d]
dab, [s] sip, [z] zip, [n] noose, [l] loose
and [r] red.
If you let your finger glide back along
the roof of your mouth you will note
that the anterior portion is hard while
the posterior portion is soft. Sounds
made near the hard part of the roof of
the mouth are said to be palatal.
English makes five sounds in the
palatal region: [ʃ] leash, [ʒ] measure,
[tʃ] church, [dʒ] judge and [j] yes.
The soft part of the roof of the mouth
behind the hard palate is called the
velum. Sounds made near the velum
are said to be velar. There are three
velar sounds in English: [k] kill, [g]
gill, and [ŋ] sing.
The space between the vocal folds is
the glottis. English has two sounds
made at the glottis. The first is easy to
hear: [h] as in high and history. The
second is called a glottal stop and it is
written phonetically as [ʔ] (a question
mark without the dot). This sound
occurs before each vowel sound in uh-oh.
Table 1
English Consonants
Manner of      State of   Place of Articulation
Articulation   Glottis    Bilabial  Labiodental  Interdental  Alveolar  Palatal  Velar  Glottal
Stop           -voi       p                                   t                  k      ʔ
               +voi       b                                   d                  g
Fricative      -voi                 f            θ            s         ʃ
               +voi                 v            ð            z         ʒ
Affricate      -voi                                                     tʃ
               +voi                                                     dʒ
Nasal          +voi       m                                   n                  ŋ
Liquid         +voi                                           l, r
Glide          -voi                                                                     h
               +voi       w                                             j
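The classification in Table 1 can be treated as a small lookup table. The following is a minimal sketch in Python, not part of the original notes: the names Consonant, CONSONANTS, describe and natural_class are hypothetical, the symbols are standard IPA, and [h] is entered as a voiceless glottal glide, which is an assumption about how the table should be read.

```python
from typing import Dict, List, NamedTuple, Optional


class Consonant(NamedTuple):
    voiced: bool   # state of the glottis: True = +voi, False = -voi
    place: str     # place of articulation
    manner: str    # manner of articulation


# Feature values follow Table 1. Treating [h] as voiceless is an assumption.
CONSONANTS: Dict[str, Consonant] = {
    "p": Consonant(False, "bilabial", "stop"),
    "b": Consonant(True, "bilabial", "stop"),
    "t": Consonant(False, "alveolar", "stop"),
    "d": Consonant(True, "alveolar", "stop"),
    "k": Consonant(False, "velar", "stop"),
    "g": Consonant(True, "velar", "stop"),
    "ʔ": Consonant(False, "glottal", "stop"),
    "f": Consonant(False, "labiodental", "fricative"),
    "v": Consonant(True, "labiodental", "fricative"),
    "θ": Consonant(False, "interdental", "fricative"),
    "ð": Consonant(True, "interdental", "fricative"),
    "s": Consonant(False, "alveolar", "fricative"),
    "z": Consonant(True, "alveolar", "fricative"),
    "ʃ": Consonant(False, "palatal", "fricative"),
    "ʒ": Consonant(True, "palatal", "fricative"),
    "tʃ": Consonant(False, "palatal", "affricate"),
    "dʒ": Consonant(True, "palatal", "affricate"),
    "m": Consonant(True, "bilabial", "nasal"),
    "n": Consonant(True, "alveolar", "nasal"),
    "ŋ": Consonant(True, "velar", "nasal"),
    "l": Consonant(True, "alveolar", "liquid"),
    "r": Consonant(True, "alveolar", "liquid"),
    "w": Consonant(True, "bilabial", "glide"),
    "j": Consonant(True, "palatal", "glide"),
    "h": Consonant(False, "glottal", "glide"),
}


def describe(symbol: str) -> str:
    """Return the three-part label: voicing, place, manner."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "voiceless"
    return f"{voicing} {c.place} {c.manner}"


def natural_class(place: Optional[str] = None,
                  manner: Optional[str] = None,
                  voiced: Optional[bool] = None) -> List[str]:
    """List every consonant that matches all the features given."""
    return [
        symbol for symbol, c in CONSONANTS.items()
        if (place is None or c.place == place)
        and (manner is None or c.manner == manner)
        and (voiced is None or c.voiced == voiced)
    ]


print(describe("b"))                              # voiced bilabial stop
print(natural_class(manner="nasal"))              # ['m', 'n', 'ŋ']
print(natural_class(place="velar", voiced=False)) # ['k']
```

Each consonant is identified by three values, so a group of sounds sharing a feature (for example, all nasals) can be picked out by filtering on that feature alone.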
Manner of Articulation
So far, we have concentrated on describing consonant sounds in terms
of where they are articulated. We can also describe the same sounds in
terms of how they are articulated. This is called the manner of articulation.
Based on the manner of articulation, English consonants may be
grouped into six groups, namely: stops, fricatives, affricates, nasals,
liquids, and glides.
Stops are made by totally obstructing the airstream. Notice
that when you say [p] and [b] your lips are closed together for
a moment, stopping the air flow. [p] and [b] are bilabial stops.
[b] is a voiced bilabial stop. [t], [d], [k], and [g] are also stops.
Fricatives are made by forming a nearly complete stoppage of
the airstream. The opening through which the air escapes is so
small that friction is produced (much as air escaping from a
punctured tire makes a ‘hissing’ noise). [ʃ] is made by almost
stopping the air with the tongue near the roof of the mouth. It
is a voiceless palatal fricative. [f], [v], [θ], [ð], [s], [z], and [ʒ]
are also fricatives.
The affricates are a special group of sounds that are formed by
combining a stop and a fricative. In English, only one pair of
sounds occurs in this category, [tʃ] as in chain and rich and
[dʒ] as in Jane and ridge. Notice that in pronouncing [tʃ], one
seems to pronounce [t] followed by [ʃ]. Similarly, [dʒ] is
much like a phonetic compound, consisting of [d] followed
by [ʒ].
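The idea of an affricate as a stop plus a fricative can be written down directly. This is only an illustrative sketch in Python: the names AFFRICATES, VOICED and is_voiced are hypothetical, and the symbols are standard IPA.

```python
# Each affricate is stored as its (stop, fricative) components.
AFFRICATES = {
    "tʃ": ("t", "ʃ"),   # as in chain, rich
    "dʒ": ("d", "ʒ"),   # as in Jane, ridge
}

# Voiced sounds among the symbols used here; the rest are voiceless.
VOICED = {"d", "ʒ", "dʒ"}


def is_voiced(symbol: str) -> bool:
    return symbol in VOICED


for affricate, (stop, fricative) in AFFRICATES.items():
    # The stop and the fricative agree in voicing with the whole affricate.
    agrees = is_voiced(stop) == is_voiced(fricative) == is_voiced(affricate)
    print(affricate, "=", stop, "+", fricative, "| voicing agrees:", agrees)
```

In both English affricates the two components share the same voicing: voiceless [t] combines with voiceless [ʃ], and voiced [d] with voiced [ʒ].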
In English, the three nasals, [m, n, ŋ], are made with the lips
and the tongue in the same respective positions as they are for
[p, t, k]; however, air pressure does not build up as it does in
the stops. Instead, the velum (the soft palate, which controls the
opening to the nasal passage) is lowered, allowing the air to flow through
the nose. In English, the nasals are always voiced. Whereas
[m] and [n] may occur at the beginning as well as at the end of
a syllable in English, as in mom and nun, [ŋ] occurs only at
the end of a syllable, as in sing.
The consonants [l] and [r], as heard in lilt and roar, are called
liquids. Both sounds are normally voiced. An [l] sound is
formed by touching the tip of the tongue to the alveolar ridge
and allowing air to escape to each side. The [r] sound in
English is formed by curling the tip of the tongue up behind
the alveolar ridge and flipping it forward and upward without
actually touching the alveolar ridge.
Glides are made with only a slight closure of the articulators.
In fact, if the vocal tract were any more open you would
produce a vowel sound. [w] is made by raising the back of the
tongue toward the velum while rounding your lips at the same
time; it is thus classified as a voiced bilabial glide. (Notice the
similarity in the way you articulate the [w] in woo and then the [u]
vowel in this word; the only change is that you open your lips
a little more for [u].) The [j] glide, much like the [w], is
formed with the tongue and lips in the same position as they
are when making the sound ‘ee’ (as in bee).
Both [w] and [j] always appear either before or after a vowel
in English. In both cases, the sound glides rapidly to or from
the articulatory position for that vowel. Since [w] and [j]
possess certain vowel-like properties (for example, they lack a
definite constriction of the oral cavity), they are not true
consonants and are often called semivowels.
English Vowels
symbols Examples
We can change the shape of the vocal tract, and thus change vowel
quality, in various ways:
1. we can raise or lower the body of the tongue
2. we can advance or retract the body of the tongue
Tongue Height
If you repeat to yourself the vowel sounds of seat, set, sat (transcribed
[i], [ε], and [æ]), you will find that you naturally open your mouth a
little wider as you change from [i] to [ε], and then a little wider still as
you change from [ε] to [æ]. These varying degrees of openness
correspond to different degrees of tongue height: high for [i], mid for
[ε], and low for [æ].
High vowels like [i] are made with the front of the mouth less open
because the tongue body is raised, or high. The high vowels of English
are [i, u, ɪ, ʊ] as in leap, loop, lip, look. Conversely, low vowels like
[æ] in sat are pronounced with the front of the mouth open and the
tongue lowered. [æ, a], as in cat and cot, are the English low vowels.
Mid vowels like [ε] in set are produced with an intermediate tongue
height; in English, these mid vowels include [e, ε, ʌ, ə, ɔ, o] as in bait,
bet, but, about, caught, boat.
Tongue Advancement
Besides being held high, mid, or low, the tongue can also be pushed
forward or pulled back within the oral cavity. For example, in the high
front vowel [i], the body of the tongue is raised and pushed forward
just under the hard palate. The high back vowel [u] in boot, on the
other hand, is made by raising the body of the tongue in the back of
the mouth, toward the velum. The tongue is advanced or pushed
frontward for all the front vowels, [i, ɪ, e, ε, æ], as in see, Mick, take,
Fred, back, and retracted or pulled back for the back vowels [u, ʊ, o, ɔ,
a] as in you, look, so, soft, doc. Central vowels require neither fronting
nor retraction of the tongue.
Lip Rounding
Vowel quality also depends on lip position. When you say the [u] in
two, your lips are rounded. For the [i] in tea, they are unrounded.
English has four rounded vowels: [u, ʊ, o, ɔ], as in you, could, go,
wrong. All other vowels in English are unrounded. In the vowel chart,
the rounded vowels are enclosed in a dotted-line rectangle.
Vowels that are called tense are produced with an extra degree of
muscular effort. Lax vowels lack this extra effort. For example, tense
front vowels are made with a stronger or more extreme tongue
fronting gesture than lax front vowels, which are produced with a
weaker fronting movement: compare tense [i] in meet with lax [ɪ] in
mitt, or tense [e] in late with lax [ε] in let. Tense rounded vowels are
made with stronger or tighter lip rounding than their lax counterparts:
compare tense [u] in boot with lax [ʊ] in put.
         front     central   back
high     i  ɪ                ʊ  u
mid      e  ε      ə  ʌ      ɔ  o
low      æ                   a
(rounded vowels: u, ʊ, o, ɔ)
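The three dimensions described above (tongue height, tongue advancement, lip rounding) can be recorded as features for each vowel. The sketch below is illustrative Python, not part of the original notes: the names Vowel, VOWELS, TENSE_LAX_PAIRS, describe_vowel and vowels_where are hypothetical, and placing [ʌ] and [ə] in the central column is an inference from the text, which lists them as mid vowels but not as front or back.

```python
from typing import Dict, List, NamedTuple


class Vowel(NamedTuple):
    height: str        # "high", "mid" or "low"
    advancement: str   # "front", "central" or "back"
    rounded: bool      # lip rounding


VOWELS: Dict[str, Vowel] = {
    "i": Vowel("high", "front", False),
    "ɪ": Vowel("high", "front", False),
    "e": Vowel("mid", "front", False),
    "ε": Vowel("mid", "front", False),
    "æ": Vowel("low", "front", False),
    "ʌ": Vowel("mid", "central", False),  # central by inference
    "ə": Vowel("mid", "central", False),  # central by inference
    "u": Vowel("high", "back", True),
    "ʊ": Vowel("high", "back", True),
    "o": Vowel("mid", "back", True),
    "ɔ": Vowel("mid", "back", True),
    "a": Vowel("low", "back", False),
}

# (tense, lax) pairs named in the text: meet/mitt, late/let, boot/put.
TENSE_LAX_PAIRS = [("i", "ɪ"), ("e", "ε"), ("u", "ʊ")]


def describe_vowel(symbol: str) -> str:
    v = VOWELS[symbol]
    rounding = "rounded" if v.rounded else "unrounded"
    return f"{v.height} {v.advancement} {rounding} vowel"


def vowels_where(**features) -> List[str]:
    """List the vowels whose features all match, e.g. height="low"."""
    return [
        symbol for symbol, v in VOWELS.items()
        if all(getattr(v, name) == value for name, value in features.items())
    ]


print(describe_vowel("u"))          # high back rounded vowel
print(vowels_where(rounded=True))   # ['u', 'ʊ', 'o', 'ɔ']
print(vowels_where(height="low"))   # ['æ', 'a']
```

In this table each tense/lax pair has identical height, advancement and rounding, which is why tenseness is needed as a further distinction between, for example, [i] and [ɪ].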
Long Vowels
There are five long vowels in English. Each symbol consists of one
vowel symbol plus a length mark made of two dots, [:]. Thus, we
have [i:, ɜ:, ɑ:, ɔ:, u:]. We will now look at each of these long vowels.
At this point, we still have not described the vowel sounds of English
words such as hide, loud, and coin. Unlike the simple vowels described
above, the vowels of these words are diphthongs, two-part vowel
sounds consisting of a vowel and a glide in the same syllable. If you
say eye slowly, concentrating on how you make this vowel, you
should find that your tongue starts out in the position for [a] and
moves toward the position for the vowel [i] or the palatal glide [j]. This
diphthong, which consists of two articulations and two corresponding
sounds, is written with two symbols: [aj], as in [hajd] hide. To make
the vowel of loud, the tongue and the lips start in the position for [a] and
move toward the position for [u] or [w], so this diphthong is written
[aw], as in [lawd] loud. In the vowel of coin, the movement is from the [o]
position toward the position for [ɪ] or [j], so the vowel of coin is written [oj],
as in [kojn] coin.
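Because each diphthong is written as a vowel symbol followed by a glide symbol, it can be located in a transcription by looking for those two-symbol sequences. The following Python sketch is illustrative only: the names DIPHTHONGS and find_diphthongs are hypothetical, and the code assumes a one-symbol-per-character transcription in which the vowel and the glide belong to the same syllable.

```python
from typing import List

# Each diphthong maps to its (vowel, glide) parts.
DIPHTHONGS = {
    "aj": ("a", "j"),   # hide
    "aw": ("a", "w"),   # loud
    "oj": ("o", "j"),   # coin
}


def find_diphthongs(transcription: str) -> List[str]:
    """Scan left to right and collect every vowel + glide sequence."""
    found = []
    i = 0
    while i < len(transcription) - 1:
        pair = transcription[i:i + 2]
        if pair in DIPHTHONGS:
            found.append(pair)
            i += 2   # skip past both symbols of the diphthong
        else:
            i += 1
    return found


for word in ("hajd", "lawd", "kojn"):
    print(word, find_diphthongs(word))
```

A sequence such as [a] followed by [j] across a syllable boundary would also be matched by this simple scan, so a fuller treatment would need syllable information.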
Symbols   Examples
aj/aɪ     bite, sight, by, die, Stein, aisle, choir, liar, island,
          height, sign
aw/aʊ     about, brown, doubt, coward
oj/ɔɪ     boy, doily
eɪ  oʊ
Figure 4
English Diphthongs
centring (ending in ə):   ɪə  εə  ʊə
closing (ending in ɪ):    εɪ  aɪ  ɔɪ
closing (ending in ʊ):    əʊ  aʊ
The most complex English sounds of vowel type are the triphthongs.
They can be rather difficult to pronounce, and very difficult to
recognize. A triphthong is a glide from one vowel to another and then
to a third, all produced rapidly and without interruption. For example,
a careful pronunciation of the word ‘hour’ begins with a vowel quality
similar to [ɑ:], goes on to a glide towards the back close rounded area
(for which we use the symbol ʊ), then ends with a mid-central vowel
(schwa, ə). We use the symbols aʊə to represent the way we
pronounce ‘hour’, but this is not always an accurate representation of
the pronunciation.
