The Word Phonetics' Defined.: Chapter Three Understanding Phonetics
UNDERSTANDING PHONETICS
I learned by watching my favorite shows. I would just rewind and say the words back, until they
sounded right to me. I never studied the American accent, in terms of getting a teacher or taking
phonetics classes. I've always been a good mimic. It really wasn't that hard for me.
(Adelaide Kane)
This section discusses three essential topics in phonetics: the definition of phonetics, human
speech sounds from the perspective of phonetics, and phonetics for teaching the pronunciation of
English sounds. These topics are presented in that order below.
To begin the discussion, we should know clearly what phonetics is. Our understanding will be
enriched by definitions drawn from a variety of online sources and from linguists. The online
sources offer a number of definitions of phonetics, such as:
a. http://www.dictionary.com defines phonetics as the science or study of speech sounds and
their production, transmission, and reception, and their analysis, classification, and
transcription.
b. http://www.yourdictionary.com points out that phonetics is the study of the sounds of
human speech using the mouth, throat, nasal and sinus cavities, and lungs.
c. https://en.oxforddictionaries.com gives a simple definition: phonetics is the study
and classification of speech sounds.
d. http://www.baap.ac.uk puts forward another definition: phonetics is the systematic
study of speech and the sounds of language.
e. http://dictionary.cambridge.org proposes that phonetics is the study of the sounds
made by the human voice in speech.
f. http://www.chegg.com defines phonetics as the study of sound in speech that
focuses on how speech is physically created and received, including the study of the human
vocal and auditory tracts, acoustics, and neurology.
g. http://www.phon.ox.ac.uk defines phonetics as a study that deals with the production
of speech sounds by humans, often without prior knowledge of the language being spoken.
h. Finally, https://en.wikibooks.org defines phonetics as the systematic study of the human
ability to make and hear sounds which use the vocal organs of speech, especially for
producing oral language. It is usually divided into three branches: (1) articulatory, (2)
acoustic, and (3) auditory phonetics.
In addition to the definitions from the online sources, prominent linguists have offered clear
definitions of phonetics, for example:
a. Ladefoged, P. (1975) states that phonetics is concerned with the speech sounds that occur in
the languages of the world. It seeks to establish what the sounds are, how they fall into
patterns, and how they change in different circumstances.
b. Catford (1992) defines phonetics as the study of the physiological, aerodynamic, and
acoustic characteristics of speech sounds.
c. Fromkin, V., Rodman, R., & Hyams, N. (2003) point out that phonetics is the study of
speech sounds, which are utilized by all human languages to represent meanings.
d. Becker, A., & Bieswanger, M. (2004) hold that phonetics is concerned with
the wide variety of sounds used by speakers of human languages.
e. O'Grady et al. (2005) define phonetics as a branch of linguistics that comprises the study
of the sounds of human speech or, in the case of sign languages, the equivalent aspects of
sign.
f. Roach, P. (2009) claims that phonetics is the scientific study of speech, which is
concerned with discovering how speech sounds are produced, how they are used in spoken
language, how we can record speech sounds with written symbols, and how we hear and
recognize different sounds.
g. Lanpher (2011) asserts that phonetics is the study of speech sounds in any language and
has three branches, namely articulatory, acoustic, and auditory phonetics.
Articulatory phonetics deals with how sounds are articulated, with the mouth, tongue, and
lungs as parts of the system. Acoustic phonetics deals with the physical properties of the
sounds once they are articulated. Auditory phonetics is about how the sounds are perceived
by the ear and the brain.
h. Last but not least, Nordquist, R. (2016) holds that phonetics is the branch of linguistics
that deals with the sounds of speech and their production, combination, description, and
representation by written symbols.
From the definitions above, we can identify three underlying concepts of phonetics that they
share, namely speech organs, speech sounds, and the production of speech sounds. These three
concepts suggest another definition: phonetics is a scientific study which deals with the human
organs of speech, speech sounds, and speech production.
These basic components of phonetics are discussed as follows.
a. Speech organs
When speaking, people communicate messages with speech sounds produced by air flowing from the
lungs to the mouth. To vary the quality of the speech sounds, they need articulation, that is, the
action of producing sounds using their speech organs (articulators). Articulators are any organs
of speech that take part in the production of human speech sounds. They include the lips, teeth,
alveolar ridge, hard palate, velum (soft palate), uvula, glottis and various parts of the tongue (e.g.
tip, middle, and back part). The articulators are shown in Figure 11 in the form of a
diagram of a human head. Learners need to study it carefully to get a picture of the
shapes of the articulators that are used to make speech sounds in their mouth.
Figure 11
Articulators to produce speech sounds
Source: https://www.pinterest.com/pin/514254851179328077/
b. Speech sounds
Speech sounds may be understood as the set of distinctive sounds produced when people speak a
particular language. Put plainly, a speech sound is a sound uttered by speakers of a language. Take
the word ‘book’: it has three different sounds, i.e. /b/, /ʊ/, and /k/, in which /b/ corresponds
to the letter ‘b’, /ʊ/ to the letters ‘oo’, and /k/ to the letter ‘k’. It must therefore be fully
understood that speech sounds are not the letters used to spell words. They are about what you
hear, not which letters are used in spelling. In other words, English speech sounds do not
correspond one-to-one to the letters of the English alphabet. The problem now is how to transcribe
and distinguish the sounds heard in communication. Since sounds are different from letters, a
separate set of symbols is needed to represent speech sounds. Since 1888,
an effort has been made to create a universal system for transcribing speech sounds to help
linguists, language teachers, and even learners transcribe languages consistently and accurately.
This system is known as the International Phonetic Alphabet (IPA). The IPA is intended to
provide symbols for the speech sounds of every language in the world.
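The difference between letters and speech sounds can be illustrated with a small sketch. It is only an illustration under stated assumptions: the transcriptions are entered by hand in broad IPA (the one for ‘book’ follows the text above), and the names TRANSCRIPTIONS and compare are hypothetical helpers, not part of any library.

```python
# Hand-entered broad transcriptions (an illustrative assumption, not a lexicon).
TRANSCRIPTIONS = {
    "book": ["b", "ʊ", "k"],
    "ship": ["ʃ", "ɪ", "p"],
    "check": ["tʃ", "ɛ", "k"],
}


def compare(word):
    """Return (number of letters, number of speech sounds) for a listed word."""
    sounds = TRANSCRIPTIONS[word]
    return len(word), len(sounds)


for word, sounds in TRANSCRIPTIONS.items():
    letters, count = compare(word)
    print(f"{word}: {letters} letters, {count} sounds /{''.join(sounds)}/")
```

Each of the three words is spelled with more letters than it has sounds, which is why a separate set of symbols is needed for transcription.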
c. The production of speech sounds
The production of speech sounds is the process of making speech sounds by using the
articulators. The process starts with air flowing from the lungs; the air passes through
the glottis in the larynx and is then modified in the vocal tract (the oral cavity and the nasal
cavity) into sounds of different qualities, i.e. consonants and vowels. Consonant sounds are
made by blocking the airflow completely or partially during speech. The production of
consonant sounds is mainly determined by two factors, i.e. the place where the sounds are produced
(place of articulation) and the manner in which they are articulated (manner of articulation). For
example, to make a p sound, a speaker brings the two lips tightly together and
blocks the airflow for a moment. The speaker then suddenly releases the air pressure built up in
the mouth, which causes a plosive sound. The place of articulation of this sound is therefore called
bilabial, and the manner is called stop (also known as plosive). Vowel sounds, on the other hand,
are made in a different way from consonants. Vowel sounds do not need any obstruction of the
airflow, but a more continuous airflow. In this regard, phoneticians describe the production of
different vowel sounds in terms of HAR, which stands for Height, Advancement, and Rounding. Height
refers to the height of the jaw, i.e. whether its position is open, mid, or near close. Advancement
refers to the frontness or backness of the tongue (how far front or back the tongue is). Rounding
refers to the shape of the lips, i.e. whether the lips are rounded or unrounded. For example, when
the jaw is near close, it produces the sound /i/ as in ‘he’, /ɪ/ as in ‘with’, /ʊ/ as in ‘would’, etc.
Furthermore, phonetics, as the study of the speech sounds of spoken languages, seeks
to describe two important things about speech sounds, i.e. what physical properties the
sounds of spoken languages have (acoustic phonetics) and how the sounds are produced
(articulatory phonetics). Using the acoustic approach, phonetics tells us that a language
mainly consists of two kinds of sounds, i.e. segmental and supra-segmental sounds. Segmental
sounds are the smallest units in a sequence of sounds, i.e. consonant and vowel sounds.
Supra-segmental sounds, on the other hand, are aspects of pronunciation that go beyond the
production of individual (segmental) sounds, for example stress and intonation (Crane, L. B.,
Yeager, E., & Whitman, R. L., 1981).
This section focuses on three kinds of information related to the production of the first
type of segmental sound, the English consonants: which parts of the mouth are used for making
a particular English consonant sound (place of articulation), how the consonant sound is
produced (manner of articulation), and whether or not the vocal cords vibrate (a distinctive
feature known as voicing). This information helps learners of English to produce English
consonant sounds naturally.
1 BI Bilabial
2 LA Labiodental
3 D Dental
4 A Alveolar
5 PAT Palatal
6 PA Palato Alveolar
7 VE Velar
8 GLO Glottal
9 LA Labiovelar
Manner of articulation describes two things: first, how the different organs of speech
interact with one another in producing consonant sounds; second, how the airflow is obstructed to
affect the quality of the consonant sounds. Manner of articulation is thus a distinctive
feature of consonant sounds in the English language, because it is possible to make several
different consonant sounds at the same place of articulation. It gives the basic distinctions
of how consonant sounds are made through six manners of articulation, i.e. Stops, Fricatives,
Affricates (also called Affricatives), Nasals, Laterals, and Approximants. The six manners of
articulation are easily remembered by the following acronym: STOFANASLA
1 STO Stop
2 F Fricative
3 A Affricate
4 NAS Nasal
5 L Lateral
6 A Approximant
a. Stops or Plosives
In this manner of articulation, consonant sounds are produced by closing off both the oral
and the nasal cavity so that the airstream is blocked. This builds up air pressure in the oral
cavity. When the air pressure is suddenly released, a small burst of air (plosion) occurs.
The English consonant sounds produced in this manner are [p] as in the word
‘pen’, [t] as in the word ‘tie’, [k] as in the word ‘key’, [b] as in the word ‘buy’, [d] as
in the word ‘die’, and [g] as in the word ‘go’.
b. Fricatives
This manner of articulation produces consonant sounds by narrowing the passage of the
airstream in the mouth without making a complete closure (a complete closure belongs to a
stop), so that the air moves through the mouth and produces audible friction, e.g. [s] as in the
word ‘cycle’, [z] as in the word ‘zoom’, [f] as in the word ‘fan’, [v] as in the word ‘vast’, [θ]
as in the word ‘method’, [ð] as in the word ‘then’, [ʃ] as in the word ‘ship’, [ʒ] as in the word
‘pleasure’, and [h] as in the word ‘hot’.
c. Affricates
This manner of articulation produces consonant sounds by blocking the airstream briefly
with the tongue in the mouth. In contrast to a stop, the blocked airstream is not released
suddenly but slowly, which causes audible friction, e.g. [tʃ] as in the word
‘check’ and [dʒ] as in the word ‘jump’.
d. Nasals
This manner of articulation produces consonant sounds by blocking the passage of the
airstream through the oral cavity and lowering the velum (soft palate), so that the
air can only pass through the nasal cavity, e.g. [m] as in the word ‘man’, [n] as in
the word ‘night’, and [ŋ] as in the word ‘sing’.
e. Laterals (Liquid)
This manner of articulation produces a consonant sound by placing the tip of the tongue
against the alveolar ridge so that the air passes along both sides of the tongue, e.g.
[l] as in the word ‘leap’.
f. Approximants (Glides)
This manner of articulation produces consonant sounds by bringing articulators close
together, e.g. the tongue and the alveolar ridge, without their actually
touching. In English, there are three approximants, i.e. [j] as in the word ‘you’,
[w] as in the word ‘world’, and [r] as in the word ‘rise’.
Place of articulation   Stop   Fricative   Affricate   Nasal   Lateral   Approximant
Bilabial                p b    -           -           m       -         -
Labiodental             -      f v         -           -       -         -
Dental                  -      θ ð         -           -       -         -
Alveolar                t d    s z         -           n       l         r
Palatal                 -      -           -           -       -         j
Palato-alveolar         -      ʃ ʒ         tʃ dʒ       -       -         -
Velar                   k g    -           -           ŋ       -         -
Glottal                 -      h           -           -       -         -
Labio-velar             -      -           -           -       -         w
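The consonant chart can also be written as a small lookup table, which makes it easy to see that several different consonants share one place of articulation and differ only in manner. The following is a minimal sketch, not an authoritative inventory: the dictionary CONSONANTS and the helpers describe and sounds_at are hypothetical names, and the placement of [l] and [r] at the alveolar ridge follows the prose descriptions of laterals and approximants in this section.

```python
# symbol -> (place of articulation, manner of articulation), following the chart.
CONSONANTS = {
    "p": ("bilabial", "stop"), "b": ("bilabial", "stop"),
    "t": ("alveolar", "stop"), "d": ("alveolar", "stop"),
    "k": ("velar", "stop"), "g": ("velar", "stop"),
    "f": ("labiodental", "fricative"), "v": ("labiodental", "fricative"),
    "θ": ("dental", "fricative"), "ð": ("dental", "fricative"),
    "s": ("alveolar", "fricative"), "z": ("alveolar", "fricative"),
    "ʃ": ("palato-alveolar", "fricative"), "ʒ": ("palato-alveolar", "fricative"),
    "h": ("glottal", "fricative"),
    "tʃ": ("palato-alveolar", "affricate"), "dʒ": ("palato-alveolar", "affricate"),
    "m": ("bilabial", "nasal"), "n": ("alveolar", "nasal"), "ŋ": ("velar", "nasal"),
    "l": ("alveolar", "lateral"),      # assumption: placed per the prose on laterals
    "r": ("alveolar", "approximant"),  # assumption: placed per the prose on approximants
    "j": ("palatal", "approximant"),
    "w": ("labio-velar", "approximant"),
}


def describe(symbol):
    """Return a short place-and-manner label for one consonant symbol."""
    place, manner = CONSONANTS[symbol]
    return f"[{symbol}] is a {place} {manner}"


def sounds_at(place):
    """Group all consonants made at one place of articulation by manner."""
    grouped = {}
    for symbol, (symbol_place, manner) in CONSONANTS.items():
        if symbol_place == place:
            grouped.setdefault(manner, []).append(symbol)
    return grouped


print(describe("p"))
print(sounds_at("alveolar"))
```

Calling sounds_at("alveolar") returns five different manners for the same place, which is the reason manner of articulation is needed as a second descriptor.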
a. Openness of the mouth
As stated earlier, the mouth is one of the speech organs that alter the quality of
vowels. According to Williamson, G. (2015), vowel sounds differ in quality from
one another according to the extent to which the jaws are open or close (not ‘closed’,
as a complete closure would prevent the free flow of air out of the mouth). For example, when a
native speaker of English says the word ‘palm’, the speaker produces the vowel sound /ɑ/.
If we watch how the vowel is produced, we can see that
the speaker’s jaws are wide apart. This is what Williamson, G. (2015) calls a relatively
open mouth posture, and an open mouth posture produces open vowel sounds.
He therefore suggests that we can see and even feel the relative openness or
closeness of our mouths by alternating these vowels in quick succession (/i/
– /ɑ/ – /i/ – /ɑ/ – /i/ – /ɑ/) while observing ourselves in a mirror.
b. Tongue elevation
The tongue also plays a decisive role in the quality of vowel sounds by changing
its position in the mouth. In phonetics, the position of the tongue in the mouth is
described in terms of three vertical positions, i.e. high, mid, and low. For instance, when the
tongue is placed in a high position, it produces the vowel sound /i/ as in the word ‘bee’, which
is also known as a close vowel from the perspective of the mouth’s openness. In contrast, when
the tongue moves downward to a low position, it produces different vowel sounds, e.g. the vowel
sound /æ/ as in the word ‘trap’. Yet other vowel sounds are produced when the tongue
is placed about halfway between high and low, in a mid position, like the sound /ɛ/ as in
the word ‘dress’. In a nutshell, tongue elevation and openness of the mouth go together when
native English speakers articulate vowel sounds: close vowels (with the mouth
relatively closed) are articulated with a relatively high tongue elevation, and open vowels are
typically articulated with a relatively low tongue position.
c. Position of tongue elevation
The ‘position of tongue elevation’ refers to where this elevation takes
place along three horizontal positions, i.e. front, central, and back. Taking the vowel /i/ again
as an example, we already know that this sound is both a close and a high vowel. If we
look at the position of the tongue horizontally, we find that this sound is articulated
by raising the front of the tongue in the direction of the hard palate. Therefore, this sound
is also called a front vowel. Taking the vowel /ɑ/ in the word ‘palm’ as another
example, we can feel that this sound is produced with the back of the tongue, in
the direction of the soft palate. Therefore, it is called a back vowel.
d. Lips’ shapes
In making the sounds of English vowels, native English speakers usually use two types of lip
shapes, i.e. rounded and unrounded. For a rounded vowel, the speaker’s lips form a
circular opening; unrounded vowels are made with the lips in a relaxed position.
In most languages, front vowels tend to be unrounded, and back vowels tend to be rounded
(https://en.wikipedia.org). To see and feel the difference between them, we can say the
word ‘lose’, which has the vowel sound /u/, and alternate it with the word ‘knee’, which
has the vowel sound /i/. When doing this, we see that our lips are rounded for the
production of the vowel sound /u/, but unrounded (spread) for the vowel
sound /i/.
e. Duration of vocalization
The last important parameter of vowel sound production is the duration of vocalization, which
means that native English speakers produce relatively longer and shorter vowel sounds depending on
the interval of time taken to produce them. These types of vowel sounds are
categorized as long and short. For example, when a native English speaker says the word ‘goose’,
the speaker makes the sound /u/, which is heard as relatively longer than the sound /æ/ in
the word ‘trap’, which is heard as relatively shorter. In addition to
the above explanation, it is important to clarify that the terms ‘long vowel’ and ‘short
vowel’ do not strictly indicate the duration of the vowel, but rather its sound (quality). In
linguistic contexts, ‘long’ and ‘short’ vowels are referred to as ‘tense’ and ‘lax’ vowels,
respectively. These two terms are used to indicate a distinctive feature of vowel sounds.
According to Williamson (2015), when it is important to make clear that a particular vowel is a
tense (long) vowel, a colon-like mark (ː) is placed after the symbol for the vowel, e.g. /iː/,
/uː/, /ɑː/. For example, when a native English speaker says the word ‘beat’, the speaker makes
the tense vowel /iː/, which is the counterpart to the lax /ɪ/ in the word
‘bit’. Another example is the tense vowel /uː/ as in the word
‘kook’, which is the counterpart to the lax /ʊ/ as in the word ‘cook’.
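The vowel parameters described above (tongue elevation, tongue position, lip shape, and tense or lax) can be collected in one record per vowel. The sketch below is illustrative only and rests on several assumptions: the Vowel class and the lax_counterpart helper are hypothetical names, only vowels mentioned in this section are listed, and the features are simplified to the three-way categories used here (strictly speaking, /ɪ/ and /ʊ/ are slightly lower and more central than /iː/ and /uː/).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    symbol: str
    height: str      # tongue elevation: "high", "mid", or "low"
    backness: str    # tongue position: "front", "central", or "back"
    rounded: bool    # lip shape
    tense: bool      # True = tense ("long"), False = lax ("short")
    example: str


# Simplified feature values for the vowels discussed in this section.
VOWELS = [
    Vowel("iː", "high", "front", False, True, "beat"),
    Vowel("ɪ", "high", "front", False, False, "bit"),
    Vowel("uː", "high", "back", True, True, "goose"),
    Vowel("ʊ", "high", "back", True, False, "cook"),
    Vowel("ɛ", "mid", "front", False, False, "dress"),
    Vowel("æ", "low", "front", False, False, "trap"),
    Vowel("ɑː", "low", "back", False, True, "palm"),
]


def lax_counterpart(symbol):
    """Return the lax vowel sharing height, backness and rounding, or None."""
    tense = next((v for v in VOWELS if v.symbol == symbol and v.tense), None)
    if tense is None:
        return None
    for v in VOWELS:
        same_position = (v.height, v.backness, v.rounded) == (
            tense.height, tense.backness, tense.rounded)
        if not v.tense and same_position:
            return v
    return None


for tense_symbol in ("iː", "uː"):
    lax = lax_counterpart(tense_symbol)
    print(f"/{tense_symbol}/ pairs with lax /{lax.symbol}/ as in '{lax.example}'")
```

In this simplified table, a tense vowel and its lax counterpart differ only in the last parameter, which mirrors the ‘beat’ and ‘bit’ pair in the text.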
After learning how vowel sounds are produced in English, it is important to
recognize the categories of English vowel sounds. Williamson (2015) divides English vowel
sounds into two categories, i.e. simple vowels, also known as pure vowels or monophthongs,
and complex vowels, also known as diphthongs.
The following figure presents all the simple vowel sounds used in American and British
English. The vowel sounds are arranged by place of articulation (i.e. tongue elevation and
tongue position) and manner of articulation (i.e. duration of vocalization).
The figure shows vowels in black and in red. A sound that stands alone in
black occurs in both American English (AE) and British English (BE).
Where a vowel sound is written in both black and red, the black symbol is for AE
and the red symbol is for BE.
Figure 12
The overall simple vowel sounds in American and British English
Source: Williamson (2015)
A complex vowel sound is formed by the combination of two vowels in a single syllable,
in which the sound begins as one vowel and changes to another vowel sound in the same syllable.
In producing such a vowel sound, the shape of the speaker’s mouth usually changes. This is because
the speaker has to make two mouth movements, i.e. a starting movement to produce a simple
vowel sound and an ending movement that glides into another
vowel sound within the same syllable. For example:
/ɪə/ : hear, cheer, deer /hɪə, ʧɪə, dɪə/
In most of the literature on the acoustic phonetics of English, it is generally recognized that
there are at least three levels of stress, i.e. primary stress, secondary stress, and unstressed,
which are indicated by the following marks:
(https://www.merriam-webster.com/dictionary/)
Primary stress is the strongest emphasis given to a syllable in a word. Secondary stress is a
weaker emphasis than primary stress and is given to another syllable in the word. Unstressed
syllables carry the weakest emphasis. According to Kunter, G., & Plag, I. (2007), primary stress
is indicated by an acute accent, secondary stress is indicated by a grave accent, and
unstressed syllables are not marked. The following words illustrate how primary stress,
secondary stress, and lack of stress are placed on the syllables of a word.
a. Randomize (verb) : ran·dom·ize (3 syllables)
ˈran-də-ˌmīz
(1) (0) (2)
b. Necessarily (adverb) : nec·es·sar·i·ly (5 syllables)
ˌne-sə-ˈser-ə-lē
(2) (0) (1) (0) (0)
(https://www.merriam-webster.com/dictionary/)
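The numbers under each syllable can be derived mechanically from a dictionary-style pronunciation. The sketch below assumes that syllables are separated by hyphens, that a high vertical mark (ˈ) or a plain apostrophe before a syllable signals primary stress (1), that a low vertical mark (ˌ) signals secondary stress (2), and that an unmarked syllable is unstressed (0). The function name stress_levels is hypothetical.

```python
PRIMARY_MARKS = ("ˈ", "'")   # stress mark, or the plain apostrophe some texts use
SECONDARY_MARK = "ˌ"


def stress_levels(pronunciation):
    """Map each hyphen-separated syllable to 1 (primary), 2 (secondary) or 0."""
    levels = []
    for syllable in pronunciation.split("-"):
        syllable = syllable.strip()
        if syllable.startswith(PRIMARY_MARKS):
            levels.append(1)
        elif syllable.startswith(SECONDARY_MARK):
            levels.append(2)
        else:
            levels.append(0)
    return levels


print(stress_levels("ˈran-də-ˌmīz"))
```

For ‘randomize’ this yields primary stress on the first syllable, no stress on the second, and secondary stress on the third, matching example (a).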
With regard to word stress, we need to recognize the ground rules for placing it. The ground
rules are the basic principles that determine where stress falls on the syllables of English
words. Three ground rules related to word stress i.e.
Many linguists and language practitioners have formulated stress rules to help learners of English
understand where to place stress properly, e.g. Pathare, E. (2006), Kopecky, A. (2010), Bowen,
T. (2017), and Quynhnguyen (2018). However, it is important to remember that there are many
exceptions to every rule, so do not rely on them too much. The positive side of these rules is that
they provide a good framework for understanding that there are some general patterns of syllable
stress in multi-syllabic English words (Kopecky, A., 2010). In addition to knowing the rules, it is
advisable to use a dictionary to check the stress of new words.
3.2. Intonation
All languages in the world have their own intonation, including English. According to
Becker, A., & Bieswanger, M. (2004), intonation in English helps to mark the function and the
boundaries of a syntactic unit, e.g. whether a sentence asks or informs. For
instance, a statement like ‘she is sleeping’ is marked as giving information when it ends with a
falling pitch (pitch = pattern of sound), but as a yes/no question when it
ends with a rising pitch. Intonation is also used to emphasize new information in an
utterance and to express emotions or attitudes. Intonation differs
not only between languages, but also within the same language, e.g. English. It is
widely known that English has dialectal and regional differences in intonation; for example, there
are quite a few differences between British, American, and general English
intonation. Many factors affect intonation, e.g. the speaker’s voice level
(low or high), rate of delivery (fast or slow), emotions (sad or happy), attitude
(positive or negative), and gender (man or woman). Whereas word stress is
about where to place stress properly in English words, intonation
is about how speakers say utterances by making their voice rise, fall, or remain flat.
Whether speakers let their voice rise, fall, or remain flat depends
on the meaning or feeling they want to express. How they use their voice commonly indicates
their mood, such as anger, surprise, interest, gratitude, boredom, disappointment, etc.
Many linguists have analyzed English intonation and distinguish at least four
relative pitch levels, i.e. low pitch /1/, normal (flat) pitch /2/, high pitch /3/, and extra
high pitch /4/. Three pitch levels, i.e. low /1/, normal /2/, and high /3/, are commonly used in
normal speech, where they may convey different functions, e.g. making statements, asking
questions, etc. The remaining level, extra high pitch /4/, is commonly used for screaming or
shouting, which may convey other functions such as expressing terrible fear, great happiness,
rising anger, etc. This section focuses only on the pitch levels of normal speech, since these
are the most frequent patterns found in utterances of normal social interaction. To show how
native English speakers use the pitch levels, linguists record the speakers’ utterances and use
numbers (an intonation contour) to indicate the patterns of intonation. Linguists have thus
established intonation patterns for normal speech, i.e. falling /2 + 3 + 1/, rising /2 + 3 + 3/,
and combining /2 + 3 + 2 + 1/. The overall patterns of intonation can be indicated by arrow
symbols: a downward arrow (↓) indicates falling intonation, an upward arrow (↑)
indicates rising intonation, and a rightward arrow (→) indicates flat intonation.
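The numbered contours can be written down as simple sequences of pitch levels. The sketch below only restates the levels and patterns named in this section; the names LEVEL_NAMES, PATTERNS, describe, pattern_name, and is_normal_speech are hypothetical, and level /2/ is labelled ‘normal’ here.

```python
# Pitch levels and contours as named in this section.
LEVEL_NAMES = {1: "low", 2: "normal", 3: "high", 4: "extra high"}

PATTERNS = {
    "falling": (2, 3, 1),
    "rising": (2, 3, 3),
    "combining": (2, 3, 2, 1),
}


def describe(contour):
    """Spell out a numeric contour as a sequence of pitch-level names."""
    return " -> ".join(LEVEL_NAMES[level] for level in contour)


def pattern_name(contour):
    """Return the name of a listed pattern, or None if the contour is not listed."""
    for name, levels in PATTERNS.items():
        if levels == tuple(contour):
            return name
    return None


def is_normal_speech(contour):
    """True if the contour uses only levels /1/ to /3/, i.e. avoids extra high /4/."""
    return all(level in (1, 2, 3) for level in contour)


for name, contour in PATTERNS.items():
    print(f"{name}: {describe(contour)}")
```

A contour that contains level /4/, such as (2, 4, 1), would be rejected by is_normal_speech, in line with the restriction to normal speech made above.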
The patterns of intonation identified in English are then used to set general patterns
for describing and illustrating how native English speakers generally raise and lower their voice
in speech. The description is intended as an approximate guide to help Indonesian learners
of English improve their knowledge of intonation, which they can apply in speaking.
Nevertheless, it should be noted that noticing and practicing intonation in 'live'
conversation with native English speakers remains the realistic and preferred option, which can
never be fully replaced by a written explanation. Since intonation is emotional and
attitudinal, it is best acquired naturally through talking and listening to native speakers of
English.
3.3. Juncture
This section focuses on the last problem, that is, recognizing clear-cut
boundaries between words in connected speech, which is known as juncture. In connected
speech, it is essential to know which segment of a phrase (a phoneme) functions to keep
utterances apart so that a clear meaning is conveyed to our interlocutors in oral
communication. Often, an utterance we make in English is not clearly understood by our
interlocutor simply because we do not know where to pause between its segments.
Take, for instance, the two expressions ‘apart’ and ‘a part’. They are extremely
close to each other in spelling, and when we say them aloud they are almost
indistinguishable because they sound so similar. However, they have very different meanings.
Apart is an adverb meaning ‘separately’, whereas a part is a noun phrase meaning ‘a section or
division of a whole’. A good way to distinguish them in speech is to make a brief pause that
separates the segment a from the segment part when we intend to say ‘a section or division of a
whole’, so that the words sound like a part to our interlocutor. When we intend to say
‘separately’, we simply leave out the pause, so that the word sounds like apart. In this regard,
Hockett, C. F. (1967) is of the opinion that ‘any difference of sound which functions to keep
utterances apart is by definition part of the phonological system of the language’. A transition
from one segmental phoneme to another is therefore called juncture and is represented by the [+]
mark. The mark indicates that the speaker is required to make a short pause before uttering the
next word. According to Nicolosi, Harryman & Kresheck (2004), juncture is the manner of moving
(transition) or mode of relationship between two consecutive sounds, that is, the relationship
between two successive syllables in speech. In normal conversation, juncture helps
the interlocutor to distinguish between two near-identical pairs which have different meanings,
for example ‘see Mill’ and ‘seem ill’. These phrases sound identical in phonetic representation.
The listener can only grasp the precise meaning conveyed by the speaker if the speaker
uses juncture properly, for example: