Sound Design For Emotion and Intention Expression of Socially Interactive Robots
DOI 10.1007/s11370-010-0070-7
Received: 10 May 2010 / Accepted: 15 May 2010 / Published online: 8 June 2010
© Springer-Verlag 2010
Abstract The current concept of robots has been greatly influenced by the image of robots in science fiction. Since robots were introduced into human society as partners, the importance of human–robot interaction has grown. In this paper, we have designed seven musical sounds, five expressing intention and two expressing emotion, for the English teacher robot Silbot. To identify the sound design considerations, we analyzed the sounds of the robots R2-D2 and Wall-E from two popular movies, Star Wars and Wall-E, respectively. From the analysis, we found that intonation, pitch, and timbre are the dominant musical parameters for expressing intention and emotion. To check the validity of the designed sounds, we performed a recognition-rate experiment. The experiment showed that the five designed sounds for intentions and the two for emotions are sufficient to deliver the intended meanings.

Keywords Human–robot interaction · Emotion expression · Robot · Musical sound

E.-S. Jee (B)
Mirae Robot Research Institute, Daejeon, Korea
e-mail: [email protected]

Y.-J. Jeong · C. H. Kim
Human-Robot Interaction Research Center, KAIST, Daejeon, Korea
e-mail: [email protected]

C. H. Kim
e-mail: [email protected]

H. Kobayashi
Graduate School of Art and Technology, Hosei University, Tokyo, Japan
e-mail: [email protected]

Present Address:
C. H. Kim
Agency for Defence Development, Daejeon, Korea

1 Introduction

With the remarkable advances in robotics, the application areas of robots have extended from industrial automation to personal home services such as toys, security guards, teachers, search and rescue, cleaning, and artificial pets. Since robots now coexist with us not as automatic machines but as constituents of human society, they are required to interact with humans. It is one of the ultimate visions of socially intelligent robots to be able to communicate and interact with humans.

Emotion, one of the user affects, has been recognized as one of the most important ways for humans to communicate with each other [1]. Humans express their emotions through facial expressions, gestures, and voice. These can also be useful tools for robots in expressing their feelings; however, they are not sufficient to deliver those feelings vividly. A fundamental aspect of music is its ability to express emotions [2]. Music is widely used to enhance the emotional impact of movies; in horror movies, for example, music and sound effects heighten feelings of fear and anxiety [3]. The power of music to evoke and represent emotions is arguably its most attractive and most perplexing quality [4]. Many studies have investigated the connection between human emotions and voice or facial expressions. Recently, the emotional aspects of music have begun to be studied scientifically and psychologically, owing to the complexity of emotional experiences in music [5]. In addition, a few studies focus on how to express a robot's emotion using speech synthesis, facial expression, and sound [6,7].

In this paper, we produced linguistic and musical sounds for the English teacher robot, Silbot, to express its intentions
200 Intel Serv Robotics (2010) 3:199–206
and emotions. The remainder of the paper is organized as follows. Section 2 summarizes related works and Sect. 3 examines the sounds of two representative robots, R2-D2 and Wall-E, in terms of their expressions of both intentions and emotions. In Sect. 4, we design and produce linguistic and musical sounds for the English teacher robot based on our analysis of the sounds of R2-D2 and Wall-E. The validity of the sounds is tested through an experiment in Sect. 5, and conclusions are presented in Sect. 6.

2 Related works

Have you ever imagined a movie without sound? Sound is an important element of human communication and interaction. Among human activities related to sound, language and music share many things in common from the perspective of cognitive science. For one thing, music, like language, develops its story through sound. As with language, music has a hierarchical structure, and it has even been argued that music has a grammatical structure [8]. So, what is the difference between music and language? Probably the most conspicuous difference is that music can convey genuine and deep emotions through sound [9]. There is no language more powerful than the language of music for delivering vivid emotion. Music is indeed the language of emotion [10].

As an emotionally rich medium, music is well suited to expressing a robot's emotions in human–robot interaction. Numerous studies on music and emotion have been conducted, as it has been one of the most fascinating topics since antiquity. Several studies have developed aesthetic and philosophical discussions of music and emotion [11,12]. Musicologists and music theorists have studied emotional expressiveness not only in western art music but also in pop music [9,13]. Feld [14] and Becker [15] have approached music and emotion from anthropological and ethnomusicological perspectives.

Recently, a great deal of research on the relationship between music and emotion has been carried out with psycho-biological or neuro-psychological approaches. Blood et al. [16] examined the cerebral blood flow changes involved in emotional responses to music through positron emission tomography. They found that music can engage neural mechanisms associated with pleasant or unpleasant emotional states. Baumgartner et al. [3] investigated how music enhances emotion using functional magnetic resonance imaging (fMRI). The brain imaging showed that combined visual and musical stimuli automatically evoke strong emotional feelings and experiences. Juslin and Västfjäll [17] identified underlying mechanisms by which music induces certain emotions and concluded that these mechanisms are not unique to music. Livingstone and Thompson [18] suggested that music generates emotional feelings.

In addition, some studies have examined which musical parameters evoke emotional feelings. Hevner [19–22] researched the emotional meaning of music through psychological experiments and produced the first comprehensive series of publications attempting a systematic explanation of the relationship between musical features and perceived emotion. Juslin [23] studied the utilization of acoustic cues in the communication of emotions in music by a performer and a listener and measured the correlations between them. Gabrielsson and Lindstrom [24] examined more specific musical parameters than Hevner: tempo, mode, loudness, pitch, intervals, melody, harmony, tonality, rhythm, timbre, articulation, amplitude envelope, musical form, and the interactions between parameters. Juslin and Laukka [25] modeled the emotional expression of different music performances through multiple regression analysis to clarify the relationship between emotional descriptions and measured parameters such as tempo, sound level, and articulation. Similarly, Schubert [26] considered the musical parameters of loudness, tempo, melodic contour, texture, and timbre.

3 Sound analysis of R2-D2 and Wall-E

Science fiction offers many examples of robots that can interact with humans. The robots in these movies possess human qualities, including emotions. In this section, we analyze the sounds of R2-D2 from Star Wars and Wall-E from the movie of the same name.

3.1 Characteristics of R2-D2

The mechanical droid R2-D2, the squat cylinder on rollers from Star Wars, is one of the most famous cinematic robots. R2-D2's emotions usually revolve around what is happening to other people. The cuteness of R2-D2 and his loyalty towards humans have raised our expectations of human-like robots. Cuteness is the quality that attracts people and provokes our protective instinct. His body proportions, very similar to a baby's, also make us see R2-D2 as cute: he is short, his half-sphere head is relatively big, and his short limbs are attached to his cylindrical trunk. In addition to his cute appearance, R2-D2 expresses his ideas and feelings mainly by shaking his head.

R2-D2's human quality appears most clearly when he expresses his ideas and emotions through human speech-like sounds. The sound designer Ben Burtt used an analog synthesizer and processed his own vocalizations through sound effects to create the sound of R2-D2. The conspicuous
feature of R2-D2's voice is the metallic quality of its upper layer. The upper-layer notes are generally high and short, while the lower notes are continuous like a human voice.

3.2 Characteristics of Wall-E

In 2008, Wall-E was introduced to the world. This unit has developed sentience and a sense of emotion, particularly curiosity, as shown by his quirky habits. He, too, is cute because of his appearance. At first glance, Wall-E's big eyes remind us of the alien character from the movie ET. His body proportions are also similar to a baby's. His free eye movement is one of his unique ways of communicating, and it sometimes shows Wall-E's curiosity about the world. In addition, Wall-E is a more human-like robot than R2-D2. For instance, Wall-E feels loneliness before he meets the female robot 'Eve' and falls in love. He is also sympathetic, and of his own will he keeps to his work of cleaning under any circumstances. Along with the cuteness of Wall-E's appearance, his baby-like mumbling voice makes him more attractive. When we listen to a baby's mumbling, we can understand what the baby wants or needs, and a baby's mumbling is cute; this seems to be a universal phenomenon. According to Burtt, he tried to make the robot's voice similar to a toddler's, drawing on the universal language of intonation in exclamations such as 'Oh', 'Hm?', and 'Huh', which are effective in delivering emotional feelings. Moreover, since most of Wall-E's voice is produced by transforming the human voice, it sounds natural to us.

3.3 Summary of sound analysis

To analyze the sound used in the Star Wars and Wall-E movies, we sampled 175 sounds from Star Wars and 100 sounds from Wall-E. We categorized these sampled sounds into two groups: intention sounds, which convey meaning or emphasize a situation, and emotional sounds, which express feelings. The intention sounds include expressions such as self-introduction, positive or negative answers, grumbling, movements, and warnings. The emotional sounds include expressions such as happiness, sadness, surprise, embarrassment, a sigh, drowsiness, and laughter.

In the two movies, intonation and pitch were usually used to emphasize intention, expressing the characters' thoughts or conveying meaning such as mood or the intensity of unpleasantness. As with intention, intonation and pitch were also used to control the intensity of emotion. Unlike for intention, however, timbre was used heavily to deliver the intensity of emotion, since timbre has a strong influence on emotion. The other distinct feature is that the more recently released movie, Wall-E, has many more communicative sounds than Star Wars. In Wall-E especially, communicative intentions are maximized by alternating intonation, pitch, and timbre within a single motive to emphasize the situation. Our analysis of the movies shows that sound plays a large role in expressing emotions with intensity; hence, the importance of sound for communication in robots is growing beyond simple sound effects.

The most important thing to keep in mind when creating a robot sound is that the robot language should have universality. To achieve universality, we assumed that intonation, pitch, and timbre are the dominant musical parameters for conveying intention and expressing emotion, since they stood out as musical features in our analysis of the well-designed sounds of the robot characters in the two famous movies, Star Wars and Wall-E. Hence, intonation, pitch, and timbre are treated carefully in this paper.

To determine the design rules for these dominant musical parameters, we analyzed the sounds of five intentions and two emotions, which are applied to our English teacher robot. The five intentions are affirmation, denial, encouragement, introduction, and question. The two emotions are happiness and sadness. The sounds for affirmation, denial, encouragement, happiness, and sadness were sampled from Star Wars; the sounds for introduction and question were sampled from Wall-E. Figure 1 shows representative musical scores of the sounds sampled from the two movies, and Table 1 summarizes the analysis results in terms of the three dominant musical parameters of intonation, pitch, and timbre.

From our analysis of the sounds expressing the intentions and emotions of the robot characters in Star Wars and Wall-E, we derived three design rules for pitch, intonation, and timbre. First, pitch is the basic property of sound that indicates how high or low it is. Although the perceptible pitch range of humans extends from 20 to 20,000 Hz, the range between 100 and 1,500 Hz is appropriate for the sound of a robot: a voice in this range is most common in normal human communication, and most of us feel comfortable hearing sounds within it.

Next, intonation is the variation of pitch in speaking. The contour of the robot sound should be similar to that of the human voice. In human communication, rising intonation conveys a different meaning than falling intonation, and intonation is used in customary ways in specific situations. Usually, when people ask something, the intonation rises; if a robot asked a question with falling intonation, no one would understand its intent.

Finally, timbre is the quality of a sound. In terms of timbre, voice quality is important for showing the characteristics of a character. By listening to someone's voice, we could easily guess
Fig. 1 Representative musical scores sampled from the Star Wars and Wall-E movies. a Musical score for affirmation, Star Wars. b Musical score for denial, Star Wars. c Musical score for encouragement, Star Wars. d Musical score for introduction, Wall-E. e Musical score for question, Wall-E. f Musical score for happiness, Star Wars. g Musical score for sadness, Star Wars
how old the speaker is and whether the speaker is male or female. For instance, R2-D2's metallic beeping suggests that he is not whimsical but simple and honest.

4 Development of linguistic and musical sounds for the English teacher robot

We have analyzed the two robot characters from science fiction movies. Their sounds are limited because they are controlled deliberately according to particular situations. Our robot, however, functions as an English teacher in the classroom, and we could set up several possible classroom situations. Following the scenario of a class situation, we created a set of linguistic sound samples that let the robot express its intentions and emotions at will. The sound group for intentional expressions consists of affirmation, denial, encouragement, introduction, and question; these intention sounds contain human voice samples processed by sound effectors. The sound group for emotional
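The three design rules identified above translate naturally into a parametric synthesizer. The sketch below is an illustrative example, not the authors' implementation: it renders a short intention sound from a pitch contour (intonation) clamped to the 100–1,500 Hz range the paper recommends, with a fixed two-harmonic mix standing in for timbre. The contour values in `CONTOURS` are hypothetical, chosen only to follow the rising-question and falling-affirmation rule described above.

```python
import math

SAMPLE_RATE = 16000
PITCH_MIN, PITCH_MAX = 100.0, 1500.0  # robot pitch range from the paper

# Hypothetical design table: intention -> (start Hz, end Hz).
# A rising contour marks a question and a falling one an affirmation,
# mirroring the paper's intonation rule.
CONTOURS = {
    "question": (300.0, 900.0),     # rising intonation
    "affirmation": (600.0, 300.0),  # falling intonation
}

def synthesize(intention: str, duration: float = 0.4) -> list[float]:
    """Render a mono sample buffer for one intention sound."""
    f_start, f_end = CONTOURS[intention]
    n = int(SAMPLE_RATE * duration)
    samples, phase = [], 0.0
    for i in range(n):
        # Linear pitch glide, clamped to the recommended range.
        f = min(PITCH_MAX, max(PITCH_MIN, f_start + (f_end - f_start) * i / n))
        phase += 2.0 * math.pi * f / SAMPLE_RATE
        # Timbre: fundamental plus a quieter second harmonic.
        samples.append(0.8 * math.sin(phase) + 0.2 * math.sin(2.0 * phase))
    return samples

buf = synthesize("question")
```

The buffer can be written out with the standard `wave` module or fed to any audio backend. Changing only the contour table or the harmonic weights yields further intention and emotion sounds, which is precisely the economy that the three-parameter analysis suggests.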
Acknowledgments This research was performed for the Intelligent Robotics Development Program, one of the 21st Century Frontier R&D Programs funded by the Ministry of Knowledge Economy of Korea.

References

1. Lee C, Lee GG (2007) Emotion recognition for affective user interfaces using natural language dialogs. In: Proceedings of the IEEE international symposium on robot and human interactive communication, Jeju, pp 798–801
2. Berg J, Wingstedt J (2005) Relations between selected musical parameters and expressed emotions: extending the potential of computer entertainment. In: Proceedings of the international conference on advances in computer entertainment technology, Valencia, pp 164–171
3. Baumgartner T, Lutz K, Schmidt CF, Jancke L (2006) The emotional power of music: how music enhances the feeling of affective pictures. Brain Res 1075:151–164
4. Schubert E (2004) Modeling perceived emotion with continuous musical features. Music Percept 21(4):561–585
5. Juslin PN, Sloboda JA (2001) Music and emotion. Oxford University Press, New York
6. Nakanishi T, Kitagawa T (2006) Visualization of music impression in facial expression to represent emotion. In: Proceedings of the Asia–Pacific conference on conceptual modeling, Hobart, pp 55–64
7. Jee E-S, Kim CH, Park S-Y, Lee K-W (2007) Composition of musical sound expressing an emotion of robot based on musical factors. In: Proceedings of the IEEE international symposium on robot and human interactive communication, Jeju, pp 637–641
8. Lerdahl F, Jackendoff R (1983) A generative theory of tonal music. MIT Press, Cambridge
9. Meyer LB (1956) Emotion and meaning in music. University of Chicago Press, Chicago
10. Pratt CC (1948) Music as a language of emotion. Bull Am Musicol Soc 11(1):67–68
11. Kivy P (1999) Feeling the musical emotions. Br J Aesthet 39:1–13
12. Levinson J (1982) Music and negative emotion. Pac Philos Q 63:327–346
13. Cook N, Dibben N (2001) Musicological approaches to emotion. In: Juslin PN, Sloboda JA (eds) Music and emotion: theory and research. Oxford University Press, New York, pp 45–70
14. Feld S (1982) Sound and sentiment: birds, weeping, poetics, and song in Kaluli expression. University of Pennsylvania Press, Philadelphia
15. Becker J (2001) Anthropological perspectives on music and emotion. In: Juslin PN, Sloboda JA (eds) Music and emotion: theory and research. Oxford University Press, New York, pp 135–160
16. Blood AJ, Zatorre RJ, Bermudez P, Evans AC (1999) Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nat Neurosci 2(4):382–387
17. Juslin PN, Västfjäll D (2008) Emotional responses to music: the need to consider underlying mechanisms. Behav Brain Sci 31:556–621
18. Livingstone SR, Thompson WF (2009) The emergence of music from the theory of mind. Music Sci 10:83–115
19. Hevner K (1935) Expression in music: a discussion of experimental studies and theories. Psychol Rev 42:186–204
20. Hevner K (1935) The affective character of the major and minor modes in music. Am J Psychol 47(4):103–118
21. Hevner K (1936) Experimental studies of the elements of expression in music. Am J Psychol 48(2):103–118
22. Hevner K (1937) The affective value of pitch and tempo in music. Am J Psychol 49(4):621–630
23. Juslin PN (2000) Cue utilization in communication of emotion in music performance: relating performance to perception. J Exp Psychol Hum Percept Perform 26(6):1797–1813
24. Gabrielsson A, Lindstrom E (2001) The influence of musical structure on emotional expression. In: Juslin PN, Sloboda JA (eds) Music and emotion: theory and research. Oxford University Press, New York, pp 223–248
25. Juslin PN, Laukka P (2003) Communication of emotions in vocal expression and music performance: different channels, same code? Psychol Bull 129(5):770–814
26. Schubert E (2004) Modeling perceived emotion with continuous musical features. Music Percept 21(4):561–585
27. Jee E-S, Cheong Y-J, Park S-Y, Kim CH, Kobayashi H (2009) Composition of musical sound to express robot's emotion with intensity and synchronized expression with robot's behavior. In: Proceedings of the IEEE international symposium on robot and human interactive communication, Toyama, pp 369–374
28. Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39:1161–1178