The Word 'Phonetics' Defined: Chapter Three, Understanding Phonetics


CHAPTER THREE

UNDERSTANDING PHONETICS

I learned by watching my favorite shows. I would just rewind and say the words back, until they
sounded right to me. I never studied the American accent, in terms of getting a teacher or taking
phonetics classes. I've always been a good mimic. It really wasn't that hard for me.
(Adelaide Kane)

This chapter discusses three essential topics in phonetics: the definition of phonetics, human
speech sounds from a phonetic perspective, and phonetics for teaching the pronunciation of English
sounds. These topics are presented in turn below.

1. The Word ‘Phonetics’ Defined.

To begin the discussion, we should know clearly what phonetics is. This understanding can be
enriched by definitions drawn from a variety of online sources and from linguists. The online
sources offer a number of definitions of phonetics, such as:
a. http://www.dictionary.com defines phonetics as the science or study of speech sounds and
their production, transmission, and reception, and their analysis, classification, and
transcription.
b. http://www.yourdictionary.com points out that phonetics is the study of the sounds of
human speech using the mouth, throat, nasal and sinus cavities, and lungs.
c. https://en.oxforddictionaries.com gives a simple definition of phonetics: it is the study
and classification of speech sounds.
d. Another definition is put forward by http://www.baap.ac.uk: phonetics is the systematic
study of speech and the sounds of language.
e. http://dictionary.cambridge.org also proposes that phonetics is the study of the sounds
made by the human voice in speech.
f. Besides, http://www.chegg.com defines phonetics as the study of sound in speech that
focuses on how speech is physically created and received, including study of the human
vocal and auditory tracts, acoustics, and neurology.
g. In addition, http://www.phon.ox.ac.uk puts forward the definition that phonetics is a
study dealing with the production of speech sounds by humans, often without prior
knowledge of the language being spoken.
h. Finally, https://en.wikibooks.org defines phonetics as the systematic study of the human
ability to make and hear sounds which use the vocal organs of speech, especially for
producing oral language. It is usually divided into the three branches of (1) articulatory, (2)
acoustic and (3) auditory phonetics.

In addition to the definitions from the online sources, some clear definitions of phonetics have
been offered by prominent linguists, for example:
a. Ladefoged, P. (1975) states that phonetics is concerned with the speech sounds that occur in
the languages of the world. It seeks to know what the sounds are, how they fall into
patterns, and how they change in different circumstances.
b. Catford (1992) defines phonetics as the study of the physiological, aerodynamic, and
acoustic characteristics of speech sounds.
c. Fromkin, V., Rodman, R., & Hyams, N. (2003) point out that phonetics is the study of
speech sounds, which are utilized by all human languages to represent meanings.
d. Becker, A., and Bieswanger, M. (2004) hold that phonetics is concerned with
the wide variety of sounds used by speakers of human languages.
e. O'Grady et al. (2005) define phonetics as a branch of linguistics that comprises the study
of the sounds of human speech, or in the case of sign languages the equivalent aspects of
sign.
f. Roach, P. (2009) claims that phonetics is the scientific study of speech, which is
concerned with discovering how speech sounds are produced, how they are used in spoken
language, how we can record speech sounds with written symbols and how we hear and
recognize different sounds.
g. Lanpher (2011) asserts that phonetics is the study of speech sounds in any language
and has three branches, namely articulatory, acoustic, and auditory phonetics.
Articulatory phonetics deals with how sounds are articulated, with the mouth, tongue, and
lungs as parts of the system. Acoustic phonetics deals with the physical properties of the
sounds once they are articulated. Auditory phonetics is about how the sounds are perceived
by the ear and the brain.
h. Last but not least, Nordquist, R. (2016) holds that phonetics is the branch of linguistics
that deals with the sounds of speech and their production, combination, description, and
representation by written symbols.

From the definitions above, we can identify three shared underlying concepts of phonetics: speech
organs, speech sounds, and the production of speech sounds. These three concepts yield another
definition: phonetics is the scientific study of the human organs of speech, speech sounds, and
speech production. These basic components of phonetics are discussed as follows.

a. Speech organs (articulators)

When speaking, people communicate messages using speech sounds produced by air flowing from the
lungs to the mouth. To vary the quality of the speech sounds, they need articulation, that is, the
action of producing sounds using their speech organs (articulators). Articulators are the organs
of speech that take part in the production of human speech sounds. They include the lips, teeth,
alveolar ridge, hard palate, velum (soft palate), uvula, glottis and various parts of the tongue (e.g.
tip, middle, and back part). The articulators are shown in Figure 11 as a diagram of a human
head. Learners should study it carefully to get a picture of the shapes of the articulators used
to make speech sounds in their mouths.
Figure 11
Articulators to produce speech sounds
Source: https://www.pinterest.com/pin/514254851179328077/

b. Speech sounds

Speech sounds may be understood as the set of distinctive sounds produced when people speak a
particular language. A speech sound is, plainly, a sound uttered by speakers of a language. For
example, the word ‘book’ has four letters but three sounds, /b/, /ʊ/, and /k/: /b/ corresponds to
the letter ‘b’, /ʊ/ to the letters ‘oo’, and /k/ to the letter ‘k’. It must therefore be fully
understood that speech sounds are not the letters used to spell words. They concern what you
hear, not which letters are used in spelling. In other words, English speech sounds do not
correspond one-to-one to the letters of the English alphabet. The problem, then, is how to
transcribe and distinguish the sounds heard in communication. Since sounds differ from letters,
a separate set of symbols is needed to represent speech sounds. Since 1888, an effort has been
made to create a universal system for transcribing speech sounds, to help linguists, language
teachers, and even learners transcribe languages consistently and accurately. This system is
known as the International Phonetic Alphabet (IPA). The IPA is used not only to transcribe speech
sounds; it also provides symbols to represent the sounds of every language in the world.
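
The difference between counting letters and counting sounds can be sketched in a few lines of
Python. This is only an illustration: the TRANSCRIPTIONS table and the count_letters_and_sounds
helper are hypothetical names introduced here, and the transcriptions are the ones used for these
words in this chapter.

```python
# Hypothetical mini-lexicon: each word maps to its speech sounds, written as a
# list of IPA symbols (transcriptions as given in this chapter).
TRANSCRIPTIONS = {
    "book": ["b", "ʊ", "k"],
    "ship": ["ʃ", "ɪ", "p"],
    "sing": ["s", "ɪ", "ŋ"],
    "judge": ["ʤ", "ʌ", "ʤ"],
}


def count_letters_and_sounds(word):
    """Return (number of letters, number of speech sounds) for a known word."""
    sounds = TRANSCRIPTIONS[word]
    return len(word), len(sounds)


if __name__ == "__main__":
    for word in TRANSCRIPTIONS:
        letters, sounds = count_letters_and_sounds(word)
        print(f"{word}: {letters} letters, {sounds} sounds")
```

Each sound is stored as one list element, so a single sound written with a two-character symbol
still counts as one sound, while the spelling is counted letter by letter.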

c. The production of speech sounds

The production of speech sounds is the process of making speech sounds with the articulators. It
begins with air flowing from the lungs; the air passes through the glottis in the larynx and is
then modified in the vocal tract (the oral and nasal cavities) into sounds of different
qualities, i.e. consonants and vowels. Consonant sounds are made by blocking the airflow
completely or partially during speech. Their production is mainly determined by two factors: the
place where the sound is produced (place of articulation) and the way it is articulated (manner
of articulation). For example, to make a [p] sound, a speaker brings the two lips tightly
together, blocking the airflow for a moment, and then suddenly releases the air pressure built up
in the mouth, which produces a plosive sound. The place of articulation of this sound is
therefore called bilabial, and the manner is called stop (also known as plosive). Vowel sounds,
on the other hand, are made differently from consonants. They do not require any obstruction of
the airflow, but rather a continuous airflow. Phoneticians describe the production of different
vowel sounds in terms of HAR, which stands for Height, Advancement, and Rounding. Height refers
to the height of the jaw, i.e. whether its position is open, mid, or near close. Advancement
refers to how far front or back the tongue is. Rounding refers to whether the lips are rounded or
unrounded. For example, when the jaw is near close, the sounds produced include /i/ as in ‘he’,
/ɪ/ as in ‘with’, and /ʊ/ as in ‘would’.

Furthermore, phonetics, as the study of the speech sounds of spoken languages, seeks to describe
two important things: what physical properties the sounds of spoken languages have (acoustic
phonetics) and how the sounds are produced (articulatory phonetics). Using the acoustic
approach, phonetics tells us that a language mainly consists of two kinds of sounds, i.e.
segmental and supra-segmental sounds. Segmental sounds are the smallest units in a sequence of
sounds, i.e. consonant and vowel sounds. Supra-segmental sounds, on the other hand, are aspects
of pronunciation that go beyond the production of individual (segmental) sounds, for example
stress and intonation (Crane, L. B., Yeager, E., & Whitman, R. L. 1981).

2. The Production of Segmental Sounds


2.1. English Consonant sounds

This section describes three kinds of information related to the production of the first type of
segmental sound, English consonant sounds: which parts of the mouth are used to make a particular
English consonant sound (place of articulation), how the consonant sound is produced (manner of
articulation), and whether or not the vocal cords vibrate (distinctive feature). This information
helps learners of English produce English consonant sounds naturally.

2.1.1. Place of Articulation


Place of articulation may be understood as the specific location in the mouth at which two
speech organs work together to produce a consonant sound. For example, the tongue and the
teeth come together to form a dental sound. Six speech organs are used to utter consonant
sounds: the lips, the teeth, the alveolar ridge, the hard palate, the soft palate (velum), and
the tongue, of which four parts are distinguished (the tip, the blade, the body, and the root).
In terms of place of articulation, consonant sounds in American English can be classified into
nine groups, which can easily be remembered using the following acronym: BILA DAPAT PAVEGLOLA
No   Acronym   Consonant sounds
1    BI        Bilabial
2    LA        Labiodental
3    D         Dental
4    A         Alveolar
5    PAT       Palatal
6    PA        Palato Alveolar
7    VE        Velar
8    GLO       Glottal
9    LA        Labiovelar

a. Bilabial sounds: [p], [b], [m]


These consonant sounds are produced by bringing the two lips together, as in sound [b] in the
word ‘buy’ (baɪ), [p] in the word ‘pay’ (peɪ), and [m] in the word ‘my’ (maɪ).

b. Labiodental sounds: [f] and [v]


These consonant sounds are produced by bringing the lower lip into contact with the upper teeth,
as in sound [f] in the word ‘five’ (faɪv) and [v] in the word ‘villa’ (ˈvɪlə).

c. Dental sounds: [θ] and [ð]


These consonant sounds are made when the tongue tip (for sound [θ]) or the tongue blade, the
part just behind the tip (for sound [ð]), contacts the upper teeth, as in the two ‘th’ sounds:
sound [θ] as in the word ‘Thursday’ (ˈθɜrzˌdeɪ) and sound [ð] as in the word ‘this’ (ðɪs).

d. Alveolar sounds: [t], [d], [n], [l], [s], [z]


These consonant sounds are made when the tongue tip contacts the alveolar ridge (the gums
just behind the teeth), as sound [t] in the word ‘tie’ (taɪ), sound [d] in the word ‘die’ (daɪ),
sound [n] as in the word ‘night’ (naɪt), or sound [l] as in the word ‘lie’ (laɪ); or tongue blade
contacts the alveolar ridge, as sound [s] as in the word ‘sight’ (saɪt) or sound [z] as in the
word ‘zeal’ (zil).
e. Palatal sounds: [ʤ], [ʧ], [j]
These sounds are produced when the blade of the tongue (the middle of the tongue) touches or
approaches the hard palate, e.g. sound [ʤ] as in the word ‘judge’ (ʤʌʤ), sound [ʧ] as in the
word ‘chew’ (ʧu), and sound [j] as in the word ‘young’ (jʌŋ).
f. Palato Alveolar sounds: [ʃ], [ʒ], [r]
These sounds are articulated when the tip or the blade of the tongue approaches or touches
the alveolar ridge and the main body of the tongue approaches the hard palate, e.g. sound [ʃ]
as in the word ‘ship’ (ʃɪp), in which the tip of the tongue is used, sound [ʒ] as in the word
‘casual’ (ˈkæʒəwəl), in which the blade of the tongue is used, and sound [r] as in the word
‘remember’ (rɪˈmɛmbər), in which the tip of the tongue approaches (but never touches) the area
of the mouth between the alveolar ridge and the hard palate. The sounds are called palato
alveolar because their production involves two places of articulation, i.e. the alveolar ridge
as the primary articulation and the hard palate as the secondary articulation.
g. Velar sounds: [k], [g], [ŋ]
These consonant sounds are produced when the back of the tongue touches the soft palate
(the velum) as sound [k] in the word ‘cup’ (kʌp), sound [g] as in the word ‘gap’ (gæp),
and sound [ŋ] as in the word ‘sing’ (sɪŋ).
h. Glottal sound: [h]
This consonant sound is produced when the flow of air is restricted by a constriction
(narrowing) of the glottis and then released, e.g. sound [h] as in the words ‘how’ (haʊ) and
‘who’ (hu).

i. Labio-Velar sound: [w]


This consonant sound is produced when the back part of the tongue is raised toward the soft
palate and the lips also come close to each other, e.g. sound [w] as in the word ‘what’ (wʌt).

2.1.2. Manner of Articulation

Manner of articulation describes two things: first, how the different organs of speech interact
with one another in producing consonant sounds; second, how the airflow is obstructed to affect
the quality of the consonant sounds. The manner of articulation is thus a determinant that
distinguishes consonant sounds in the English language, because it is possible to make several
different consonant sounds at the same place of articulation. Manner of articulation gives the
basic distinctions of how consonant sounds are made through six categories: Stops, Fricatives,
Affricates, Nasals, Laterals, and Approximants. The six manners of articulation are easily
remembered by the following acronym: STOFANASLA

No   Acronym   Manner of Articulation
1    STO       Stop
2    F         Fricative
3    A         Affricate
4    NAS       Nasal
5    L         Lateral
6    A         Approximant

a. Stops or Plosives
In this manner of articulation, consonant sounds are produced by closing off both the oral and
the nasal cavity with the speech organs so that the airstream is blocked. This builds up air
pressure in the oral cavity. When the air pressure is suddenly released, a burst of air
(plosion) occurs. The English consonant sounds produced through this manner of articulation are
[p] as in the word ‘pen’, [t] as in the word ‘tie’, [k] as in the word ‘key’, [b] as in the
word ‘buy’, [d] as in the word ‘die’, and [g] as in the word ‘go’.

b. Fricatives
This manner of articulation produces consonant sounds by narrowing the passage of the airstream
in the mouth without making a complete closure (a complete closure belongs to a stop), so that
the air moves through the mouth and produces audible friction, e.g. [s] as in the word
‘cycle’, [z] as in the word ‘zoom’, [f] as in the word ‘fan’, [v] as in the word ‘vast’, [θ] as
in the word ‘method’, [ð] as in the word ‘then’, [ʃ] as in the word ‘ship’, [ʒ] as in the word
‘pleasure’ and [h] as in the word ‘hot’.

c. Affricates
This manner of articulation produces consonant sounds by blocking the airstream briefly
with the tongue in the mouth; in contrast to a stop, however, the blocked airstream is not
released suddenly but slowly, which causes audible friction, e.g. sound [tʃ] as in the word
‘check’ and [dʒ] as in the word ‘jump’.

d. Nasals
This manner of articulation produces consonant sounds by blocking the airstream in the oral
cavity while the velum (soft palate) is lowered, so that the air can only pass through the
nasal cavity, e.g. sounds [m] as in the word ‘man’, [n] as in the word ‘night’ and [ŋ] as in
the word ‘sing’.

e. Laterals (Liquid)
This manner of articulation produces a consonant sound by placing the tip of the tongue
against the alveolar ridge so that the air passes along both sides of the tongue, e.g. sound
[l] as in the word ‘leap’.

f. Approximants (Glides)
This manner of articulation produces consonant sounds by bringing two articulators close
together, e.g. the tongue and the alveolar ridge, without their actually touching. In English,
there are three approximants, i.e. sound [j] as in the word ‘you’, sound [w] as in the word
‘world’ and sound [r] as in the word ‘rise’.

2.1.3. The distinctive features of the English consonant sounds

One definition of distinctive feature is available on http://www.yourdictionary.com, in which a
distinctive feature is defined as something unique or different that sets someone or
something apart from the rest. In English, some different consonant sounds are produced at the
same place and with the same manner of articulation, e.g. sounds [p] and [b]. Both are
identified as bilabial stops. To help learners distinguish the two sounds, a distinctive feature
is used: whether the vocal cords vibrate when the consonant sound is made (voiced) or do not
vibrate (voiceless). To test the voicing of a consonant sound, learners may place their fingers
on the voice box (i.e. the location of the Adam's apple in the upper throat) to feel whether the
vocal cords vibrate when the sound is made. For example, when learners produce sound [p] with
their fingers on the voice box, they should not feel any vibration. Sound [p] is therefore a
voiceless sound, which is its distinctive feature relative to sound [b]. In producing sound [b],
on the other hand, the learners feel the vibration, so sound [b] is called a voiced sound. In a
nutshell, sound [p] is phonetically described as a voiceless bilabial stop and sound [b] as a
voiced bilabial stop.
Having discussed the types of English consonant sounds in terms of place of articulation,
manner of articulation, and distinctive features, we now summarize the discussion in the form
of a chart.

Place of articulation

Manner       Bilabial  Labiodental  Dental  Alveolar  Palatal  Palato Alveolar  Velar  Glottal  Labio Velar
Stop         p b                            t d                                 k g
Fricative              f v          θ ð     s z                ʃ ʒ                     h
Affricate                                             tʃ dʒ
Nasal        m                              n                                   ŋ
Lateral                                     l
Approximant                                           j        r                                w

Where two phonemes share a cell, the first is voiceless and the second is voiced. [h] is
voiceless; all other phonemes standing alone in a cell are voiced.
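
The three-part description of a consonant (voicing, place, manner) can also be sketched as a
small lookup table in Python. This is a minimal sketch covering only part of the chart. The names
CONSONANTS, describe, and same_place_and_manner are hypothetical and introduced only for
illustration. The voicing values for [p] and [b] come from this chapter; the rest follow the
standard phonetic descriptions and are filled in here as an assumption.

```python
# Each consonant maps to (voicing, place of articulation, manner of articulation).
# Only a subset of the chart is included; affricates and approximants are omitted.
CONSONANTS = {
    "p": ("voiceless", "bilabial", "stop"),
    "b": ("voiced", "bilabial", "stop"),
    "t": ("voiceless", "alveolar", "stop"),
    "d": ("voiced", "alveolar", "stop"),
    "k": ("voiceless", "velar", "stop"),
    "g": ("voiced", "velar", "stop"),
    "f": ("voiceless", "labiodental", "fricative"),
    "v": ("voiced", "labiodental", "fricative"),
    "θ": ("voiceless", "dental", "fricative"),
    "ð": ("voiced", "dental", "fricative"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
    "h": ("voiceless", "glottal", "fricative"),
    "m": ("voiced", "bilabial", "nasal"),
    "n": ("voiced", "alveolar", "nasal"),
    "ŋ": ("voiced", "velar", "nasal"),
    "l": ("voiced", "alveolar", "lateral"),
}


def describe(symbol):
    """Return a description such as 'voiceless bilabial stop'."""
    return " ".join(CONSONANTS[symbol])


def same_place_and_manner(first, second):
    """True if two consonants differ, at most, in voicing."""
    return CONSONANTS[first][1:] == CONSONANTS[second][1:]


if __name__ == "__main__":
    print(describe("p"))
    print(describe("b"))
    print(same_place_and_manner("p", "b"))
```

Storing voicing as the first element means that two sounds with identical place and manner, such
as [p] and [b], can be compared by ignoring that first element.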

2.2. The Production of Vowel Sounds


The vowel sound is the second type of segmental sound discussed in this section. To produce
vowel sounds, native speakers of English also rely on a place of articulation and a manner of
articulation. Place of articulation here refers to the speech organs required to produce vowel
sounds, which consist only of the mouth, the tongue, and the lips. Manner of articulation, on
the other hand, refers to how these speech organs function to produce vowel sounds in terms of
height, frontness, and rounding. Williamson, G. (2015) holds that the production of vowel sounds
in English (both American and British) involves five main parameters: openness of the mouth,
tongue elevation, position of tongue elevation, lip shape, and length of vocalization. The first
four parameters affect the size and shape of the oral cavity, which alters the type of vowel
sound (vowel quality) produced. The last one, length of vocalization, influences the duration of
production.

a. Openness of the mouth

As stated earlier, the mouth is one of the speech organs that alter the quality of vowels.
According to Williamson, G. (2015), vowel sounds differ in quality from one another according
to the extent to which the jaws are open or close (not ‘closed’, as a complete closure would
prevent the free flow of air out of the mouth). For example, when a native speaker of English
says the word ‘palm’, the speaker produces the vowel sound /ɑ/. If we observe how the vowel is
produced, we will see that the speaker’s jaws are wide apart. This is what Williamson, G. (2015)
calls a relatively open mouth posture, and an open mouth posture produces open vowel sounds.
He suggests that we can see, and even feel, the relative openness or closeness of our mouths by
alternating these vowels in quick succession (/i/ – /ɑ/ – /i/ – /ɑ/ – /i/ – /ɑ/) while
observing ourselves in a mirror.

b. Tongue elevation

The tongue also plays a decisive role in influencing the quality of vowel sounds by altering
its position in the mouth. In phonetics, the position of the tongue can be described in terms
of three vertical positions, i.e. high, mid, and low. For instance, when the tongue is placed in
a high position, it produces the vowel sound /i/ as in the word ‘bee’, which is also known as a
close vowel in terms of the openness of the mouth. In contrast, when the tongue moves downward
to a low position, it produces different vowel sounds, e.g. /æ/ as in the word ‘trap’. A
different vowel sound again is produced when the tongue is placed about halfway between high and
low, in a mid position, like the sound /ɛ/ as in the word ‘dress’. In a nutshell, tongue
elevation and openness of the mouth are correlated when native English speakers articulate vowel
sounds: close vowels (with the mouth relatively close) are articulated with a relatively high
tongue elevation, and open vowels are typically articulated with a relatively low tongue
position.

c. Position of tongue elevation

The ‘position of tongue elevation’ refers to where this elevation takes place along three
horizontal positions, i.e. front, central, and back. Taking the vowel /i/ again as an example,
we already know that this sound is both a close and a high vowel. If we observe the position of
the tongue horizontally, however, we find that this sound is articulated by raising the front of
the tongue toward the hard palate. This sound is therefore also called a front vowel. Taking the
vowel /ɑ/ in the word ‘palm’ as another example, we can feel that this time the sound is
produced by raising the back of the tongue toward the soft palate. It is therefore called a back
vowel sound.

d. Lips’ shapes

In making English vowel sounds, native English speakers usually use two types of lip shape,
i.e. rounded and unrounded. When a rounded vowel is made, the speaker’s lips form a circular
opening, whereas unrounded vowels are made with the lips in a relaxed position. In most
languages, front vowels tend to be unrounded, and back vowels tend to be rounded
(https://en.wikipedia.org). To see and feel the difference, we can say the word ‘lose’, which
contains the vowel sound /u/, and alternate it with the word ‘knee’, which contains the vowel
sound /i/. When doing this, we will see that our lips are rounded for the production of /u/ but
unrounded (spread) for /i/.

e. Duration of vocalization

The last important parameter of vowel sound production is length of vocalization: native
English speakers produce relatively longer and shorter vowel sounds, depending on the interval
of time taken to produce them. These types of vowel sounds are categorized as long and short.
For example, when a native English speaker says the word ‘goose’, the speaker makes the sound
/u/, which is heard as relatively longer than the sound /æ/ produced when the speaker says the
word ‘trap’. It is important to clarify here that the terms ‘long vowel’ and ‘short vowel’ do
not indicate the length of the vowel alone, but rather its sound (quality). In linguistic
contexts, ‘long’ and ‘short’ vowels are referred to as ‘tense’ and ‘lax’ vowels, respectively.
These two terms are used to indicate the distinctive feature of a vowel sound. According to
Williamson (2015), when it is important to make clear that a particular vowel is a tense (long)
vowel, a colon-like mark (ː) is placed after the symbol for the vowel, e.g. /iː/, /uː/, /ɑː/.
For example, when a native English speaker says the word ‘beat’, the speaker makes the tense
vowel /iː/, which is the counterpart of the lax /ɪ/ in the word ‘bit’. Another example is the
tense vowel /uː/ in the word ‘kook’, which is the counterpart of the lax /ʊ/ in the word ‘cook’.
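
The parameters discussed above can be gathered into one record per vowel, as in the following
Python sketch. It is an illustration only: the Vowel class, the VOWELS table, and the two helper
functions are hypothetical names, and the feature values for /ɪ/, /ʊ/, /ɛ/, and /æ/ that are not
spelled out in this chapter follow the usual textbook descriptions and are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    symbol: str
    height: str    # tongue elevation: "high", "mid", or "low"
    backness: str  # position of elevation: "front", "central", or "back"
    rounded: bool  # lip shape
    tense: bool    # True for tense (long) vowels, False for lax (short) vowels


VOWELS = {
    "i": Vowel("i", "high", "front", False, True),
    "ɪ": Vowel("ɪ", "high", "front", False, False),
    "u": Vowel("u", "high", "back", True, True),
    "ʊ": Vowel("ʊ", "high", "back", True, False),
    "ɛ": Vowel("ɛ", "mid", "front", False, False),
    "æ": Vowel("æ", "low", "front", False, False),
    "ɑ": Vowel("ɑ", "low", "back", False, True),
}


def describe(symbol):
    """Return a description such as 'high front unrounded tense vowel'."""
    v = VOWELS[symbol]
    lips = "rounded" if v.rounded else "unrounded"
    tenseness = "tense" if v.tense else "lax"
    return f"{v.height} {v.backness} {lips} {tenseness} vowel"


def lax_counterpart(symbol):
    """Return the lax vowel sharing height, backness, and rounding, or None."""
    v = VOWELS[symbol]
    if not v.tense:
        return None
    for other in VOWELS.values():
        same_shape = (other.height, other.backness, other.rounded) == (
            v.height, v.backness, v.rounded)
        if same_shape and not other.tense:
            return other.symbol
    return None


if __name__ == "__main__":
    print(describe("i"))
    print(lax_counterpart("i"), lax_counterpart("u"))
```

In this sketch the tense vowel of ‘beat’ and the lax vowel of ‘bit’ differ only in the tense
field, which mirrors the tense and lax pairing described above.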

2.2.1. Vowel Sounds in English

Having seen how vowel sounds are produced in English, it is important to recognize the
categories of English vowel sounds. Williamson (2015) divides English vowel sounds into two
categories, i.e. simple vowels, also known as pure vowels or monophthongs, and complex vowels,
also known as diphthongs.

a. Simple Vowel Sounds


A simple vowel is made when a speaker produces a single vowel configuration without moving the
tongue, the lips, or the mouth; in other words, the speech organs stay in a relatively
unchanging position. For example:
/ɪ/ : it, ink, sink, ill /ɪt, ɪŋk, sɪŋk, ɪl/

/iː/ : seat, east, eagle /siːt, iːst, ˈiːgl/

/ɛ/ : get, set, red, bell /gɛt, sɛt, rɛd, bɛl/

/æ/ : cat, rat, bat, mat /kæt, ræt, bæt, mæt/

/ɑː/ : arm, ask, calm, car /ɑːm, ɑːsk, kɑːm, kɑː/

/ʌ/ : cup, shut, cut /kʌp, ʃʌt, kʌt/

/ə/ : about, ago, along /əˈbaʊt, əˈgəʊ, əˈlɒŋ/

/ɜː/ : pearl, earn, girl /pɜːl, ɜːn, gɜːl/

/ʊ/ : cook, book, look /kʊk, bʊk, lʊk/

/uː/ : spoon, food, fool /spuːn, fuːd, fuːl/

/ɑ/ : pot, lot, lock /pɑt, lɑt, lɑk/

/ɔː/ : call, all, brought /kɔːl, ɔːl, brɔːt/

The following figure presents all the simple vowel sounds used in American and British English.
The vowel sounds are arranged by place of articulation (i.e. tongue elevation and tongue
position) and manner of articulation (i.e. duration of vocalization). The figure shows vowels in
black and in red. If a sound stands alone in black, the vowel occurs in both American English
(AE) and British English (BE). If vowel sounds are written in both black and red, the black one
is for AE and the red one is for BE.
Figure 12
The overall simple vowel sounds in American and British English
Source: Williamson (2015)

b. Complex vowel sounds

A complex vowel sound is formed by the combination of two vowels in a single syllable:
the sound begins as one vowel and changes to another vowel sound within the same syllable.
In producing such a vowel sound, the shape of the speaker’s mouth usually changes. This is
because the speaker has to make two mouth movements, i.e. a starting movement to produce the
first simple vowel sound and an ending movement to glide into the second, so that the two are
combined in the same syllable. For example:
/ɪə/ : hear, cheer, deer /hɪə, ʧɪə, dɪə/

/eə/ : chair, their, air /ʧeə, ðeə, eə/

/ʊə/ : tour, sure, poor /tʊə, ʃʊə, pʊə/

/eɪ/ : pray, say, day /preɪ, seɪ, deɪ/

/aɪ/ : fight, eye, pie /faɪt, aɪ, paɪ/

/ɔɪ/ : voice, boy, oil /vɔɪs, bɔɪ, ɔɪl/

/əʊ/ : slow, go, so /sləʊ, gəʊ, səʊ/

/aʊ/ : out, found, count /aʊt, faʊnd, kaʊnt/

3. The Production of Supra-segmental sounds


3.1. Word stress

Word stress is the emphasis given to a unit of pronunciation (a syllable) in an individual
word, not in a phrase or sentence. As an individual item of a phrase or sentence, each word has
its own stress, which is placed on one of its syllables. In English, short words usually have
one stress. For example, the word ‘sample’ has two syllables (sam·ple), and the stress is placed
on the first syllable [ˈsam-pəl]. Longer words may have two stresses, i.e. a primary stress [ˈ]
and a secondary stress [ˌ], but the secondary stress is always much weaker. For example, the
word ‘democracy’ has four syllables (de·moc·ra·cy); the primary stress is placed on the second
syllable [ˈma], so that the word is pronounced [di-ˈma-krə-sē]. An important note about stress
placement is that many two-syllable words in English can be pronounced in two different ways,
with the position of the stress changing the part of speech of the word. For example:
• Present = (noun) pres·ent \ ˈpre-zᵊnt \ a gift.
• Present = (verb) pre·sent \ pri-ˈzent \ to give something to someone.
It is therefore important to stress the right syllables so that people can understand our
words. We can use a dictionary to check the stress of new words before using them.

In most of the literature on English acoustic phonetics, it is generally recognized that there
are at least three levels of stress, i.e. primary stress, secondary stress, and unstressed,
which are indicated by the following marks:

Primary stress : [ˈ] (1)

Secondary stress : [ˌ] (2)

Unstressed : No mark (0)

(https://www.merriam-webster.com/dictionary/)

Primary stress is the strongest emphasis given to a syllable in a word. Secondary stress is a
weaker emphasis than primary stress, given to another syllable in the word. Unstressed syllables
receive the weakest emphasis. According to Kunter, G., & Plag, I. (2007), primary stress is
indicated by an acute accent, secondary stress by a grave accent, and unstressed syllables are
not marked. The following words illustrate how primary stress, secondary stress, and lack of
stress are placed on the syllables of words.
a. Randomize (verb) : ran·dom·ize (3 syllables)
ˈran-də-ˌmīz
(1) (0) (2)
b. Necessarily (adverb) : nec·es·sar·i·ly (5 syllables)
ˌne-sə-ˈser-ə-lē
(2) (0) (1) (0) (0)

(https://www.merriam-webster.com/dictionary/)
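
The numeric stress levels (1, 2, 0) can be read off a dictionary-style transcription
mechanically, as the following Python sketch suggests. It assumes that syllables are separated
by hyphens and that a stress mark, if any, stands at the start of its syllable, as in the
transcriptions quoted in this chapter. The function name stress_pattern is hypothetical.

```python
# Stress marks as they appear at the start of a syllable. The plain apostrophe
# is accepted as a stand-in for the primary stress mark.
PRIMARY_MARKS = ("ˈ", "'")
SECONDARY_MARKS = ("ˌ",)


def stress_pattern(transcription):
    """Map each hyphen-separated syllable to 1 (primary), 2 (secondary), or 0."""
    pattern = []
    for syllable in transcription.split("-"):
        syllable = syllable.strip()
        if syllable.startswith(PRIMARY_MARKS):
            pattern.append(1)
        elif syllable.startswith(SECONDARY_MARKS):
            pattern.append(2)
        else:
            pattern.append(0)
    return pattern


if __name__ == "__main__":
    print(stress_pattern("ˈran-də-ˌmīz"))  # randomize
    print(stress_pattern("ˈpre-zᵊnt"))     # present (noun)
    print(stress_pattern("pri-ˈzent"))     # present (verb)
```

Comparing the patterns for the noun and the verb ‘present’ shows the stress moving from the
first syllable to the second.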

With regard to word stress, we need to recognize some ground rules, that is, the basic
principles on which the placement of stress on syllables in English words is based. There are
three ground rules of word stress:

a. One word can only have one stress.


When English words are articulated, each generally has only one stress. Although a secondary
stress may accompany the primary one, the secondary stress is only used in long words and is
always much weaker than the primary stress.
b. We can only stress vowels, not consonants.
c. Word stress is complicated in English.

Many linguists and language practitioners have formulated stress rules to help learners of English
understand where to place the stress properly, e.g. Pathare, E. (2006), Kopecky, A. (2010), Bowen,
T. (2017), and Quynhnguyen (2018). However, it is important to remember that there are many
exceptions to every rule, so do not rely on them too much. The positive side of these rules is that
they provide a good framework for understanding that there are some general patterns of syllable
stress in multi-syllabic English words (Kopecky, A., 2010). It is therefore advisable to check the
word stress of new words in a dictionary in addition to knowing the rules.

3.2. Intonation

All languages in the world have their own intonation, including English. According to
Becker, A., & Bieswanger, M. (2004), intonation in English helps to mark the functions and the
boundaries of a syntactic unit, e.g. whether a sentence functions to ask or to inform. For
instance, a statement like ‘she is sleeping’ is marked as giving information when it ends with a
falling pattern of sound (pattern of sound = pitch), but is marked as a yes/no question when it
ends with a rising pitch. Besides, intonation is also used to emphasize new information in an
utterance and, chiefly, to express emotions or attitudes. Intonation differs not only between
languages but also within the same language, e.g. English. It is widely known that English has
dialectal and regional differences in intonation; for example, there are quite a few differences
between British, American, and general English in terms of their intonation. Many factors affect
intonation, e.g. the speaker's voice level (low or high), rate of delivery (fast or slow),
emotions (feeling sad or happy), attitude (positive or negative), and gender (man or woman).
Whereas word stress is concerned with where to place the stress in English words, intonation is
concerned with how speakers say utterances by raising, lowering, or keeping their voice level.
Whether speakers raise, lower, or keep their voice level depends on the meaning or feeling they
want to express. The way they use their voice commonly indicates their mood, such as anger,
surprise, interest, gratitude, boredom, or disappointment.
Many linguists have analyzed English intonation and distinguish at least four relative pitch
levels, i.e. low pitch /1/, normal (flat) pitch /2/, high pitch /3/, and extra high pitch /4/.
Three pitch levels, i.e. low /1/, normal /2/, and high /3/, are commonly used in normal speech,
where they may convey different functions, e.g. making statements or asking questions. The
remaining level, extra high pitch /4/, is commonly used for screaming or shouting, which may
convey other functions such as expressing terrible fear, great happiness, or rising anger. This
section focuses only on the pitch levels in normal speech since these are the most frequent
patterns found in utterances of normal social interactions. To show how native English speakers
use the pitch levels, linguists record the speakers’ utterances and use numbers (an intonation
contour) to indicate the patterns of intonation. The linguists then establish intonation patterns
for normal speech, i.e. falling /2 + 3 + 1/, rising /2 + 3 + 3/, and combining patterns
/2 + 3 + 2 + 1/. The overall patterns of intonation can be indicated by arrow symbols, i.e.
a downward arrow (↓) is used to indicate falling intonation, an upward arrow (↑) is used to
indicate rising intonation, and a rightward arrow (→) is used to indicate flat intonation.
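
The numbered contours described above can be sketched as a simple lookup in Python. This is a minimal illustration only: the names `PATTERNS`, `parse_contour`, and `classify_contour` are hypothetical, and the table contains just the three patterns for normal speech named in this section, so any other sequence is reported as "unlisted".

```python
import re

# The three normal-speech patterns named in the text, keyed by pitch levels.
PATTERNS = {
    (2, 3, 1): "falling",
    (2, 3, 3): "rising",
    (2, 3, 2, 1): "combining",
}


def parse_contour(contour: str) -> tuple[int, ...]:
    """Extract pitch levels 1-4 from a contour written like '/2 + 3 + 1/'."""
    return tuple(int(level) for level in re.findall(r"[1-4]", contour))


def classify_contour(contour: str) -> str:
    """Name the pattern if it is one of the three listed, else 'unlisted'."""
    return PATTERNS.get(parse_contour(contour), "unlisted")


print(classify_contour("/2 + 3 + 1/"))      # falling
print(classify_contour("/2 + 3 + 3/"))      # rising
print(classify_contour("/2 + 3 + 2 + 1/"))  # combining
```

A lookup table is used here, rather than a rule about the final pitch movement, because the text defines the patterns by listing them and does not give a general rule.
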
The identification of the patterns of intonation in English is then used to set general patterns
for describing and illustrating how native English speakers generally raise and lower their voices
in speech. The description is intended to provide an approximate guide to help Indonesian learners
of English improve their knowledge of intonation, which can be applied in speaking.
Nevertheless, it should be noted that noticing and practicing intonation in 'live' conversation
with native English speakers remains the realistic and preferred option, which a written
explanation can never substantially replace. Since intonation is emotional and attitudinal, it is
best acquired naturally through talking and listening to native speakers of English.

3.3. Juncture
The discussion in this section focuses on the last problem, namely recognizing clear-cut
borderlines between words in connected speech, which is also known as juncture. In connected
speech, it is essential to know which segment of a phrase (a phoneme) functions to keep
utterances apart for the sake of conveying a clear meaning in oral communication with our
interlocutors. Often, an utterance we convey in English is not clearly understood by our
interlocutor simply because we do not know where to pause when we are saying some segments
of the utterance. Take, for instance, the two expressions ‘apart’ and ‘a part’. They are extremely
close to each other in spelling, and when we utter them, they are hardly distinguishable because
they sound alike. However, they have very different meanings. ‘Apart’ is an adverb meaning
‘separately’, whereas ‘a part’ is a noun phrase meaning ‘a section or division of a whole’. A good
way to say them differently is to give a brief pause which separates the segment ‘a’ from the
segment ‘part’ when we intend to say a section or division of a whole, so that the words sound
like ‘a part’ to our interlocutor. When we intend to say ‘separately’, we simply need to remove
the pause, so that the word sounds like ‘apart’. In this regard, Hockett, C. F. (1967) is of the
opinion that any difference of sound which functions to keep utterances apart is by definition
part of the phonological system of the language. Therefore, a transition from one segmental
phoneme to another is called juncture and is represented by the [+] mark. The mark indicates
that the speaker is required to make a short pause before uttering the next word. According to
Nicolosi, Harryman & Kresheck (2004), juncture is the manner of moving (transition) or mode of
relationship between two consecutive sounds; it is the relationship between two successive
syllables in speech. In normal conversation, juncture effectively functions as a means of helping
the interlocutor to distinguish between pairs of utterances that sound identical but have
different meanings, for example ‘see Mill’ and ‘seem ill’. These phrases are identical in their
phonetic representation. The listener can only grasp the precise meaning conveyed by the speaker
if the speaker properly uses juncture, for example:

- Did Alex see Mill?
  dɪd ˈæləks si [+] mɪl
- Did Alex seem ill?
  dɪd ˈæləks sim [+] ɪl
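
The role of the [+] mark in this pair can be sketched in Python. This is an illustrative sketch only: the helper names `with_juncture` and `without_juncture` are hypothetical, and each phrase is assumed to be given as a list of already transcribed segments.

```python
JUNCTURE = "[+]"  # the juncture mark used in this section


def with_juncture(segments: list[str]) -> str:
    """Join transcribed segments, marking each boundary with the juncture mark."""
    return f" {JUNCTURE} ".join(segments)


def without_juncture(segments: list[str]) -> str:
    """Join transcribed segments with no pause at all between them."""
    return "".join(segments)


see_mill = ["si", "mɪl"]   # 'see Mill'
seem_ill = ["sim", "ɪl"]   # 'seem ill'

# Without a pause, both phrases come out as the same string of sounds.
print(without_juncture(see_mill) == without_juncture(seem_ill))  # True

# With the juncture marked, the two phrases are kept apart.
print(with_juncture(see_mill))  # si [+] mɪl
print(with_juncture(seem_ill))  # sim [+] ɪl
```

The comparison shows why the pause matters: the sounds are the same in both phrases, and only the position of the juncture tells the listener which one is meant.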
