On The Recognition of Timbre - A First Step Toward Understanding How Hearing-Impaired People Perceive Timbre

2012 IEEE International Conference on Systems, Man, and Cybernetics

October 14-17, 2012, COEX, Seoul, Korea

Rumi Hiraga and Kazuhiko Otsuka

Department of Industrial Technology
Tsukuba University of Technology
Tsukuba, Japan

Abstract—Many hearing-impaired college students are interested in music, and some of them listen to music every day. What do hearing-impaired people enjoy in music? Among the several elements of music, this experiment focused on timbre. Timbre differs when the same musical score is played with different musical instruments. Several sets of music data were prepared from commercial CDs in which the same phrases were played with different musical instruments, and subjects were asked to listen to and compare the timbres. The subject groups consisted of hearing-impaired people, hearing people with little or no experience in playing music, and hearing people with active experience in playing music. Significant differences were found among the three groups in the ability to differentiate timbre. Hearing-impaired people were good at noticing sets in which all timbres were similar, but they encountered difficulties in differentiating timbres in other cases. The brightness curve was found to be a candidate attribute for explaining these difficulties.

Keywords-timbre; recognition; hearing-impaired people; music activities

I. INTRODUCTION

Quite a few hearing-impaired college students in the Department of Industrial Technology at Tsukuba University of Technology, where hearing-impaired students are qualified to enter after taking an examination, enjoy listening to music. We previously attempted to ascertain how hearing-impaired people recognize the emotions conveyed by percussion performances. Through experiments, we found that there were no differences between hearing-impaired people and hearing people in recognizing certain emotions when listening to percussion performances that were played to convey an emotion [1][2][3].

This raised the question: what makes music attractive to people even when they have a hearing impairment? Music includes several elements such as melody, harmony, tempo, rhythm, and timbre. In this paper, we describe an experiment we conducted on recognizing timbre. Timbre has been defined as “that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar” [4]. Physically, timbre is modeled by integrating the concepts of the color and texture of sound, where color is an instantaneous spectral envelope and texture is the temporal nature of the sound, that is, the sequential changes in color over an arbitrary time scale [5]. Timbre is also subjectively considered a multidimensional attribute, and subjects have been presented with a set of semantic scales with which to rate it. The scales are, for example, light/dark, sharp/dull, and cold/warm [6][7].

Although the timbre generated by even a single musical instrument varies, we recognize different timbres most readily in sounds generated by different types of musical instruments. We notice the difference in timbre between musical performances played on the guitar and on the piano. For example, “Golliwogg’s Cake-Walk” in Claude Debussy’s “Children’s Corner Suite” was originally composed for the piano, but it is also played as a guitar duo with a different arrangement. Many musical pieces are played by several different kinds of instruments, and we can enjoy them while recognizing the differences between the instruments as a new musical experience with a familiar piece.

In this paper, we describe an experiment we conducted on differentiating timbre in the same phrase from two musical pieces. Hearing people enjoy musical performances played using several different arrangements. Is this way of enjoying music common to hearing-impaired people? To find out, we asked three subject groups to listen to several sets of musical performances. The groups were hearing-impaired people, hearing people with little or no experience in playing music, and hearing people with active experience in playing music. Fragments of the same musical phrase were obtained from commercial CDs, and three of these music fragments made a set. Subjects listened to several music fragment sets and judged whether some fragments were played with a similar timbre. The purpose of the experiment was to investigate whether hearing-impaired people can perceive timbre when listening to music. If a certain type of timbre, say “timbre-A”, is easier for a hearing-impaired person to recognize, then is it possible that he or she can enjoy music more when it is performed with timbre-A? If so, we may be able to design and construct a system that helps to retrieve music performed with timbre-A.

The results showed that there were significant differences in the ability to differentiate timbre between the hearing-impaired and the hearing subject groups. Hearing-impaired people were good at noticing sets in which all timbres were similar. However, in other cases they experienced difficulty in differentiating timbres. The brightness curve was found to be a candidate attribute for explaining this difficulty.

II. RELATED WORKS

This research is related to two areas: music recognition in terms of timbre perception, and the music recognition ability of hearing-impaired people.

Though timbre has been studied by several researchers, it has been thought to be an elusive and ill-defined concept [8] in spite of the definition provided by ANSI [4]. Thus, it does not appear to be easy to describe clearly even for hearing people. So far, several researchers have tried to investigate the relationships between timbre attributes and the recognition of timbre. An experiment by Grey [9] used 16 synthesized tones to evaluate the relationships with respect to perceptual similarities, using multidimensional scaling and the labeling of sounds as a learning task. An experiment in multidimensional scaling with subjects by Samson et al. [10] used synthesized sound with a number of spectral and temporal changes. They showed that these two elements are helpful in enabling subjects to recognize timbre. In a brain research experiment, Ilmoniemi et al. prepared synthesized sounds with four attributes and conducted a subjective listening test to determine which of the sound sets they used was the best [11]. These experiments using synthesized sound made it possible to clarify the relationship between the physical attributes of sound and the subjective recognition of timbre. Paul used orchestral tones for subjects to rate how strongly the instruments segregated and for subjects to recognize melodies played with a specific timbre [12].

Some research, though not much, has been done on music perception by hearing-impaired people. A.-A. Darrow teaches and gives music therapy to hearing-impaired children [13][14]. Hayashida and Kato conducted experiments on how hearing-impaired children understand rhythm in music [15]. Outside of research, music activities for hearing-impaired people are also conducted. As examples, the Japanese composer K. Sato teaches music to hearing-impaired children, the British pianist and organist P. Whittaker OBE, who has been deaf all his life, has organized an orchestra and other music activities [16], and the famous percussionist Dame E. Glennie has been profoundly deaf since age 12 [17]. Music activities such as dancing, sign-language songs, and Japanese Taiko drum playing are also popular among hearing-impaired students at Tsukuba University of Technology.

We have conducted experiments on emotion communication in musical performance by hearing-impaired subjects in the past, but all the musical performances used were percussion music in an improvisational style. Our use of commercial CDs in this experiment is closer to the everyday listening experiences of ordinary people.

III. EXPERIMENT

The purpose of the experiment on timbre was to ascertain what hearing-impaired people enjoy in listening to music and whether they could recognize timbre in doing so. Subjects listened to several music fragment sets and were asked to judge whether they could perceive similar timbres within a set.

A. Material

We used commercial music CDs to prepare several sets of musical fragments from the same phrases. We used the beginning four measures of Claude Debussy’s “Clair de Lune” from “Suite Bergamasque” and the chorus from Joe Hisaishi’s main theme of the animation movie “My Neighbor Totoro”. Since the latter is a very well-known movie in Japan, the main theme’s melody is readily recognizable to almost everyone in the country. Figure 1 shows the musical phrases used in the experiment and Table 1 shows the musical fragments and the instruments used to perform them.

Table 1 Music CDs used in experiment

“Clair de Lune”
No.  Timbre                            Player                                             Length
1    Orchestra *1                      E. Ormandy conducting the Philadelphia Orchestra   16
2    Guitar duo                        J. Bream & J. Williams                             15
3    Guitar solo                       K. Muraji                                          21
4    Cello (piano accompaniment)       M. Maisky                                          15
5    Piano solo                        S. François                                        10
6    Flute (orchestra accompaniment)   J. Galway                                          17

“My Neighbor Totoro”
No.  Timbre                                                  Player         Length
1    Orchestra *2                                            unknown        15
2    Violin (guitar accompaniment)                           C. Takashima
3    Music Box                                               unknown        23
4    Guitar solo                                             N. Hiwatari    14
5    Cello (piano, guitar, bass, percussion accompaniment)   K. Kukita      9
6    Piano (ensemble)                                        C. Orrje       14

*1 Because the arrangement is unknown, it is not clear what instrument plays the melody.
*2 The violin plays the melody according to the notes in the CD.

Each set consists of three performances of the same musical phrase, which we call “musical fragments”. We used three types of music fragment sets: in one, all three musical fragments are similar in terms of timbre; in another, all three musical fragments are different in terms of timbre; and in the third, two of the three musical fragments are similar. Due to differences in the arrangements and the pitch ranges of the instruments, the six musical fragments of each musical piece were in different keys. Taking this into account, we made eight fragment sets (labeled C1 to C8) for “Clair de Lune” and nine sets (labeled T1 to T9) for the “My Neighbor Totoro” theme, as shown in Table 2.

B. Subjects

The three subject groups listed below participated in the experiment.

1. Hearing-impaired people comprising eight males and one female ranging in age from 20 to 23. To form this group, we asked for hearing-impaired people who said they liked music and often listened to it. All of
them wore hearing aids, but none of them made use of cochlear implants.
2. Hearing people with little music activity experience, comprising three males and five females ranging in age from 35 to 54, none of whom actively and regularly played or listened to music.
3. Hearing people with music activity experience, comprising two males and five females ranging in age from 43 to 45, all of whom actively and regularly played or listened to music. Some of them were or had been members of an orchestra and some of them often attended classical music concerts.

Figure 1 Music phrases used in experiment (score excerpts of “Clair de Lune” and “My Neighbor Totoro”)

C. Procedure

Musical fragment sets were provided to subjects through a Microsoft Excel worksheet. Subjects listened to three musical fragments labeled a, b, and c in each set and answered the following two questions.

1. How difficult was it for you to recognize timbre similarity? Choose one from “Difficult”, “Neither difficult nor easy”, and “Easy”.
2. How similar did you feel the musical fragments were in terms of timbre of the melody? Choose one from:
1) All three musical fragments were similar.
2) All three musical fragments were different.
3) Two of the three musical fragments were similar.

If the answer was 3), then subjects were asked to name the two musical fragments that they felt were similar. For example, if a subject listened to set T8 of “My Neighbor Totoro” in Table 2 and felt that fragments a and c were similar, then he or she selected a and c.

Table 2 Musical fragment sets
Each set consists of three musical fragments a, b, and c. The name of each set is in the first column.

“Clair de Lune”
Set  a            b            c
C1   Flute        Orchestra    Piano
C2   Flute        Orchestra    Orchestra
C3   Guitar Solo  Guitar Solo  Guitar Duo
C4   Guitar Solo  Guitar Solo  Guitar Solo
C5   Guitar Solo  Cello        Guitar Duo
C6   Cello        Flute        Orchestra
C7   Cello        Orchestra    Orchestra
C8   Piano        Piano        Piano

“My Neighbor Totoro”
Set  a          b          c
T1   Orchestra  Cello      Violin
T2   Piano      Piano      Music Box
T3   Guitar     Guitar     Guitar
T4   Guitar     Orchestra  Violin
T5   Violin     Violin     Violin
T6   Cello      Orchestra  Piano
T7   Piano      Piano      Piano
T8   Cello      Guitar     Cello
T9   Violin     Orchestra  Orchestra

IV. RESULTS

We statistically analyzed the answers to the second question to investigate the differences between subject groups, between the two music phrases, and between the fragment sets of each music phrase. The “anova1” and “multcompare” functions of the Matlab Statistics Toolbox were used for the analysis.

A. Percentage of correct answers

We obtained the percentage of correct answers shown in Table 3 by comparing the subjects’ answers with the timbre combinations shown in Table 2. In addition to cases of exact similarity, the timbres were counted as being similar in the following cases.
• Between the guitar duo and the guitar solo in the musical fragments from “Clair de Lune”.
• Between the orchestra and the violin in the musical fragments from “My Neighbor Totoro”, because the orchestral arrangement gives the melody to a solo violin.

B. Differences between subject groups

One-way analysis of variance (ANOVA) showed there were significant differences between subject groups. The average correct rates for hearing-impaired people (HI), hearing people with little music activity experience (HL), and hearing people with music activity experience (HE) were 0.68, 0.93, and 0.97, respectively. The p-value was 4.60e-07, which shows there were significant differences between HI, HL, and HE. Multiple comparisons showed significant differences between HI and HL as well as between HI and HE. The ANOVA for each piece of music also showed significant differences between subject groups, and the multiple comparisons showed differences between the hearing-impaired group and the two hearing groups.
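The group averages quoted above can be re-derived from the per-set percentages in Table 3. The following Python sketch is illustrative only: it is not the Matlab analysis used in the experiment, and treating each fragment set's percentage as one observation per group is an assumption, since the unit of observation passed to “anova1” is not stated. It therefore reports only an F statistic and does not attempt to reproduce the reported p-value. The function name is hypothetical.

```python
from statistics import mean

# Percentage of correct answers per fragment set, copied from Table 3
# (order: C1..C8 followed by T1..T9).
TABLE3 = {
    "HI": [22.2, 33.3, 77.8, 88.9, 55.6, 33.3, 66.7, 77.8,
           77.8, 88.9, 88.9, 77.8, 100.0, 66.7, 77.8, 44.4, 77.8],
    "HL": [100.0, 87.5, 87.5, 87.5, 100.0, 100.0, 100.0, 100.0,
           87.5, 87.5, 100.0, 87.5, 87.5, 100.0, 87.5, 87.5, 87.5],
    "HE": [100.0, 100.0, 100.0, 100.0, 100.0, 71.4, 85.7, 100.0,
           100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 85.7, 100.0],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Assumption: each fragment-set percentage is one observation.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Average correct rate per group, expressed as a fraction.
group_means = {name: round(mean(vals) / 100, 2) for name, vals in TABLE3.items()}
f_stat, df_b, df_w = one_way_anova_f(list(TABLE3.values()))

print(group_means)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Under these assumptions the averages come out as 0.68, 0.93, and 0.97 for HI, HL, and HE, matching the rates reported above.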

C. Differences between musical phrases

There were no significant differences between the two musical phrases (p-value=0.31).
D. Differences between musical fragment sets
1) Clair de Lune
The p-value obtained by ANOVA for all subjects was 0.15, which means there were no significant differences between fragment sets. On the other hand, the p-value was 0.021 when ANOVA was calculated only for hearing-impaired subjects.
2) My Neighbor Totoro
The p-value was 0.28 for all subjects and 0.25 for hearing-impaired subjects only. For the “My Neighbor Totoro” phrase, there were thus no significant differences between fragment sets in either case.

E. Self-assessment
The first question was on the difficulty in recognizing the
timbre similarity. The self-assessment was one of the three
choices: “Difficult”, “Neither difficult nor easy”, and “Easy”.
By mapping these choices to the numbers 1, 2, and 3, respectively, we obtained the average self-assessment value for each of the musical fragment sets (Figure 2).
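As a small illustration of this mapping, the Python sketch below converts the three answer choices to 1, 2, and 3 and averages them for one fragment set. The answers in the example are hypothetical and are not data from the experiment; the function name is likewise illustrative.

```python
from statistics import mean

# Numeric values assigned to the three self-assessment choices.
SCORE = {"Difficult": 1, "Neither difficult nor easy": 2, "Easy": 3}


def average_self_assessment(answers):
    """Average self-assessment value for one fragment set."""
    return mean(SCORE[answer] for answer in answers)


# Hypothetical answers from four subjects for a single fragment set.
example_answers = ["Difficult", "Easy", "Easy", "Neither difficult nor easy"]
print(average_self_assessment(example_answers))  # (1 + 3 + 3 + 2) / 4 = 2.25
```

A higher average means the subjects found the set easier to judge.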

Figure 2 Self-assessment

Table 3 Percentage of correct answers for each subject group

“Clair de Lune”
Set  HI     HL     HE
C1   22.2   100.0  100.0
C2   33.3   87.5   100.0
C3   77.8   87.5   100.0
C4   88.9   87.5   100.0
C5   55.6   100.0  100.0
C6   33.3   100.0  71.4
C7   66.7   100.0  85.7
C8   77.8   100.0  100.0

“My Neighbor Totoro”
Set  HI     HL     HE
T1   77.8   87.5   100.0
T2   88.9   87.5   100.0
T3   88.9   100.0  100.0
T4   77.8   87.5   100.0
T5   100.0  87.5   100.0
T6   66.7   100.0  100.0
T7   77.8   87.5   100.0
T8   44.4   87.5   85.7
T9   77.8   87.5   100.0

HI: Hearing-impaired subjects
HL: Hearing people with less music activity
HE: Hearing people with more music activity

V. DISCUSSION

The most important result of the experiment was that there were significant differences in recognizing timbre similarity between hearing-impaired people and hearing people. On the basis of our analysis, we attempted to ascertain what makes it difficult for hearing-impaired subjects to recognize timbre.

A. Recognizing similarities and differences in timbre

From Table 2 and the method for determining the percentage of correct answers, we were able to group the fragment sets C1 to C8 and T1 to T9 in terms of the number of similar timbres (from 1 to 3). For example, the number of similar timbres in C1 is 1, that in C2 is 2, and that in C3 is 3.

The ascending order of correct answer percentages for the musical fragment sets of “Clair de Lune” for hearing-impaired subjects is C1, C6=C2, C5, C7, C3=C8, C4 (Table 3). Table 4 shows the average correct rates grouped by the number of similar timbres, those for all subjects in the upper table and those for hearing-impaired subjects only in the lower table. For both musical phrases, the correct answer rates were the highest when all three fragments in a set had similar timbres. This indicates that, for hearing-impaired subjects, similarity is easier to perceive when the number of similar timbres is larger.

B. Correct answer percentages

Table 3 shows the correct answer percentages obtained for the subject groups. Fewer than half of the hearing-impaired subjects gave correct answers for the musical fragment sets C1, C2, C6, and T8.
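The grouping by number of similar timbres and the selection of the difficult sets can be sketched in Python as follows. This is an illustration built from Tables 2 and 3 and the two similarity rules stated with the percentage of correct answers (guitar solo and guitar duo count as similar in “Clair de Lune”; orchestra and violin count as similar in “My Neighbor Totoro”). The function names and the way the rules are encoded are assumptions made for this sketch, not the procedure used in the experiment.

```python
# Timbres of fragments a, b, c for each set, copied from Table 2.
TABLE2 = {
    "C1": ("Flute", "Orchestra", "Piano"),
    "C2": ("Flute", "Orchestra", "Orchestra"),
    "C3": ("Guitar Solo", "Guitar Solo", "Guitar Duo"),
    "C4": ("Guitar Solo", "Guitar Solo", "Guitar Solo"),
    "C5": ("Guitar Solo", "Cello", "Guitar Duo"),
    "C6": ("Cello", "Flute", "Orchestra"),
    "C7": ("Cello", "Orchestra", "Orchestra"),
    "C8": ("Piano", "Piano", "Piano"),
    "T1": ("Orchestra", "Cello", "Violin"),
    "T2": ("Piano", "Piano", "Music Box"),
    "T3": ("Guitar", "Guitar", "Guitar"),
    "T4": ("Guitar", "Orchestra", "Violin"),
    "T5": ("Violin", "Violin", "Violin"),
    "T6": ("Cello", "Orchestra", "Piano"),
    "T7": ("Piano", "Piano", "Piano"),
    "T8": ("Cello", "Guitar", "Cello"),
    "T9": ("Violin", "Orchestra", "Orchestra"),
}

# Correct answer percentages of hearing-impaired subjects, from Table 3.
HI_CORRECT = {
    "C1": 22.2, "C2": 33.3, "C3": 77.8, "C4": 88.9,
    "C5": 55.6, "C6": 33.3, "C7": 66.7, "C8": 77.8,
    "T1": 77.8, "T2": 88.9, "T3": 88.9, "T4": 77.8, "T5": 100.0,
    "T6": 66.7, "T7": 77.8, "T8": 44.4, "T9": 77.8,
}


def canonical(set_name, timbre):
    """Map timbres that are counted as similar onto one shared label."""
    if set_name.startswith("C") and timbre in ("Guitar Solo", "Guitar Duo"):
        return "Guitar"
    if set_name.startswith("T") and timbre in ("Orchestra", "Violin"):
        return "Violin"
    return timbre


def similar_count(set_name):
    """Number of similar timbres in a set: 1 (all differ), 2, or 3."""
    labels = [canonical(set_name, t) for t in TABLE2[set_name]]
    return max(labels.count(label) for label in labels)


# Group the sets by their number of similar timbres.
groups = {1: [], 2: [], 3: []}
for name in TABLE2:
    groups[similar_count(name)].append(name)

# Sets for which fewer than half of the hearing-impaired subjects were correct.
below_half = [name for name, pct in HI_CORRECT.items() if pct < 50.0]

print(groups)
print(below_half)
```

With these rules, the grouping agrees with the set lists in Table 4, and the sets below 50% are C1, C2, C6, and T8.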
1) Timbre attributes
Since timbre can be expressed in terms of several attributes, we used MIRToolbox [18] on Matlab to obtain the brightness curve, the average attack slope, the spectral centroid, and the roll-off frequency of each sound.

Figure 3 Brightness curve

Brightness indicates the amount of sound energy at higher frequencies. Given a cutoff frequency, a MIRToolbox function returns the ratio of energy above that frequency. The envelope, often referred to as the “ADSR envelope” in sound engineering, also affects timbre. “Attack time”, the first stage of the envelope, starts when there is no sound and ends when the sound reaches full volume. “Spectral centroid” and “roll-off” are attributes related to brightness. The roll-off function of MIRToolbox calculates the frequency below which 85% of the energy is contained.
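The two definitions above (the ratio of energy above a cutoff frequency, and the frequency below which 85% of the energy is contained) can be illustrated with a minimal Python sketch. It does not use or reproduce MIRToolbox: the function names are hypothetical, the input is assumed to be a magnitude spectrum given as frequency bins in ascending order, and squared magnitude is assumed to stand for energy. MIRToolbox's own computation may differ in its details.

```python
def bin_energies(magnitudes):
    """Energy per frequency bin (assumption: energy = squared magnitude)."""
    return [m * m for m in magnitudes]


def brightness(freqs, magnitudes, cutoff_hz):
    """Ratio of energy strictly above cutoff_hz to the total energy."""
    energies = bin_energies(magnitudes)
    total = sum(energies)
    if total == 0:
        return 0.0
    high = sum(e for f, e in zip(freqs, energies) if f > cutoff_hz)
    return high / total


def brightness_curve(freqs, magnitudes, cutoffs):
    """Brightness evaluated at a series of cutoff frequencies."""
    return [brightness(freqs, magnitudes, c) for c in cutoffs]


def rolloff(freqs, magnitudes, fraction=0.85):
    """Lowest bin frequency at which the cumulative energy reaches `fraction`.

    Assumes freqs is non-empty and sorted in ascending order.
    """
    energies = bin_energies(magnitudes)
    threshold = fraction * sum(energies)
    cumulative = 0.0
    for f, e in zip(freqs, energies):
        cumulative += e
        if cumulative >= threshold:
            return f
    return freqs[-1]


# Toy spectrum (hypothetical values, not measured from the CDs).
freqs = [100.0, 500.0, 1000.0, 2000.0, 4000.0]
mags = [2.0, 1.0, 1.0, 1.0, 1.0]

print(brightness(freqs, mags, 1500.0))                          # 2 / 8 = 0.25
print(brightness_curve(freqs, mags, [0.0, 500.0, 1500.0, 3000.0]))
print(rolloff(freqs, mags))                                     # 2000.0
```

Evaluating the brightness at increasing cutoff frequencies yields a non-increasing curve, which is the kind of curve a brightness plot compares across timbres.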

2) Flute and orchestra
The musical fragment sets C1, C2, and C6 all include the timbres of flute and orchestra. The third timbre is piano, orchestra, and cello for C1, C2, and C6, respectively. Since the arrangement for the orchestra is not clear, we cannot definitely state which instruments play the melody. From listening, it is apparent that the two musical fragments played by flute and orchestra differ in timbre.

Table 5 shows the correct answers, error cases, and the number of error answers for C1, C2, C6, and T8 obtained for hearing-impaired subjects. Focusing on error case 2 (Flute-Orchestra) for C1, we can say that six subjects perceived the timbres of flute and orchestra as similar. Similarly, error case 3 and the number of error answers for C2 mean that four subjects perceived the three timbres as similar. These results imply that the flute and orchestra timbres (in this arrangement) were perceived as being similar by hearing-impaired people.

Among the attribute values we obtained with MIRToolbox, the brightness curve was a candidate attribute for explaining the confusion between the timbres of flute and orchestra. The brightness curves in Figure 3 show how the energy decreases according to the different frequency values. The higher energy at higher frequency values for the timbres of flute and orchestra, compared to the other timbres, may be one reason for the confusion between the two timbres.

For the other set with a low correct answer rate, i.e., T8, the most frequent error was error case 2 (Guitar-Cello), with three incorrect answers. We have no good explanation for this in terms of the brightness curve, the average attack slope, the spectral centroid, or the roll-off frequency.

Table 4 Average correct rates in terms of number of similar timbres.
The first column shows the number of similar timbres in each musical fragment set.
1: All timbres are different.
2: Two of the three fragments are similar.
3: All three fragments are similar.

All subjects
     “Clair de Lune”              “My Neighbor Totoro”
No.  Sets         Average rate    Sets              Average rate
1    C1, C6       0.71            T6                0.89
2    C2, C5, C7   0.83            T1, T2, T4, T8    0.85
3    C3, C4, C8   0.91            T3, T5, T7, T9    0.92

Hearing-impaired subjects only
     “Clair de Lune”              “My Neighbor Totoro”
No.  Sets         Average rate    Sets              Average rate
1    C1, C6       0.28            T6                0.67
2    C2, C5, C7   0.59            T1, T2, T4, T8    0.72
3    C3, C4, C8   0.82            T3, T5, T7, T9    0.86

Table 5 Correct answers, error cases, and number of answers (hearing-impaired subjects only).

Set  Correct answer  Error case             Number of answers
C1   1               3                      1
                     2 (Flute-Orchestra)    6
C2   2 (Orchestra)   1                      1
                     2 (Flute-Orchestra)    1
                     3                      4
C6   1               2 (Cello-Orchestra)    2
                     2 (Flute-Orchestra)    3
                     3                      1
T8   2 (Cello)       1                      1
                     2 (Guitar-Cello)       3
                     3                      1

C. Self-assessment

From the self-assessment results for “Clair de Lune” in Figure 2, we can observe the following:
1. The results for the two hearing subject groups were similar, while hearing-impaired subjects used a different standard to make their assessments.
2. Hearing subjects felt it was comparatively difficult for them to assess C3 and C5, and comparatively easy for them to assess C8. However, their correct answer rates did not correspond with their assessments.

In studying the relationships between the self-assessments and the correct answer rates for the three subject groups, we could not find any relationships that stood out in particular.

D. Future works

This experiment was intended to be a start toward understanding how hearing-impaired people perceive timbre in music. With the experiment results we have obtained, we can now proceed to the next step, keeping the following points in mind:

1. Musical phrases: We used two musical phrases in our experiment, only one of which is well-known. Other musical phrases, both well-known and less familiar ones, should be used.
2. Musical fragments: Since the brightness curve is a candidate attribute for hearing-impaired people in judging timbre, we should plan to use musical fragment sets that include fragments with close brightness curves. An example of such a set would be one combining piano and cello performances.
3. Hearing-impaired people with and without cochlear implants: None of the hearing-impaired subjects in our experiment made use of a cochlear implant. On the other hand, there have been a few works on the recognition of music by hearing-impaired people with cochlear implants. Gfeller et al. found that training in listening to music played by several instruments improved timbre recognition [19]. Other research has shown that providing music training to children who use cochlear implants effectively helps them to acquire speech language skills [20]. We should deliberately select hearing-impaired persons who use cochlear implants as subjects for future experiments.

VI. CONCLUSION

We have conducted an experiment on the ability of hearing-impaired persons to perceive similarity in timbre and compared the questionnaire answers they provided with those provided by hearing persons. Results showed that this perception ability differed significantly between hearing-impaired subjects and hearing subjects. When hearing-impaired subjects were given musical fragment sets consisting of three similar timbres, the similarity was perceivable to them. However, in other cases they encountered difficulties in perceiving timbre similarity. The brightness curve was found to be a candidate attribute for explaining these difficulties.

ACKNOWLEDGMENT

This research was supported by Special Research Funds from the Tsukuba University of Technology and a Grant-in-Aid for Scientific Research (C).

REFERENCES
[1] R. Hiraga and N. Kato, “Understanding Emotion through Multimedia-comparison between hearing-impaired people and people with normal hearing abilities,” Proc. of ACM ASSETS, pp. 141–148, October 2006.
[2] R. Hiraga and N. Kato, “The catch and throw of music emotion by hearing-impaired people,” Proc. of ICoMCS, p. 115, December 2007.
[3] R. Hiraga, N. Kato, and N. Matsuda, “Recognizing emotion in drum performances with/without visual information by hearing-impaired people,” Proc. of IEEE SMC, pp. 131–136, October 2008.
[4] ANSI S1.1-1994 (American National Standard Acoustical Terminology).
[5] H. Terasawa, “A Hybrid Model for Timbre Perception: Quantitative Representations of Sound Color and Density,” Ph.D. thesis, Stanford University, 2009.
[6] W. A. Sethares, “Tuning, Timbre, Spectrum, Scale,” p. 28, Springer, 2005.
[7] R. L. Pratt and P. E. Doak, “A subjective rating scale for timbre,” Journal of Sound and Vibration, 45-3, pp. 317–328, 1975.
[8] A. Bregman, “Auditory Scene Analysis,” A Bradford Book, 1994.
[9] J. M. Grey, “Multidimensional perceptual scaling of musical timbres,” Journal of the Acoustical Society of America, 61-5, pp. 1270–1277, 1977.
[10] S. Samson, R. J. Zatorre, and J. O. Ramsay, “Multidimensional scaling of synthetic musical timbre: Perception of spectral and temporal characteristics,” Journal of the Acoustical Society of America, 93-4, p. 2402, 1993.
[11] M. Ilmoniemi, V. Välimäki, and M. Huotilainen, “Subjective evaluation of musical instrument timbre modifications,” Proc. of Joint Baltic-Nordic Acoustics Meeting, pp. BNAM2004-1–6, June 2004.
[12] I. Paul, “Auditory stream segregation by musical timbre: Effects of static and dynamic acoustic attributes,” Journal of Experimental Psychology: Human Perception and Performance, 21-4, pp. 751–763, May 1995.
[13] A.-A. Darrow, “The role of music in deaf culture: Deaf students’ perception of emotion in music,” Journal of Music Therapy, XLIII-1, pp. 2–15, March 2006.
[14] A.-A. Darrow, “Students with Hearing Loss,” Chapter 11 in M. S. Adamek and A.-A. Darrow, eds., “Music in Special Education,” pp. 233–266, The American Music Therapy Association, Inc., August 2005.
[15] M. Hayashida and Y. Kato, “Perception and Production of Musical Rhythms by Children and Adults with Hearing Impairments; Tapping Responses and the Effects of Stimulus Presentation,” Journal of the Japanese Association of Special Education, 41-3, pp. 287–296, September 2000 (in Japanese).
[16] Music and the Deaf, http://matd.org.uk/
[17] Evelyn Glennie, http://www.evelyn.co.uk/evelyn-glennie.html
[18] MIRToolbox, https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox
[19] K. Gfeller, S. Witt, M. Adamek, M. Mehr, J. Rogers, J. Stordahl, and S. Ringgenberg, “Effects of Training on Timbre Recognition and Appraisal by Postlingually Deafened Cochlear Implant Recipients,” Journal of the American Academy of Audiology, 13-3, pp. 132–145, March 2002.
[20] T. Torppa, A. Faulkner, M. Vainio, and J. Järvikivi, “Acquisition of focus by normal hearing and Cochlear implanted children; The role of musical experience,” Proc. of the 5th Int. Conf. on Speech Prosody, 2010.

