Understanding Emotion in Raag An Empirical Study o
2 authors, including:
Parag Chordia
All content following this page was uploaded by Parag Chordia on 20 March 2014.
5.1. Content Analysis of Free Responses

The descriptive language of the free responses indicated the intensity and complexity of listeners’ emotions. The style ranged from lists of words and phrases describing simple emotions to lengthy and even poetic passages. A sampling is shown in Table 4.

Because many of the free responses consisted of fragments or lists of words, highly sophisticated content analysis was not particularly relevant; instead, we focus on lower-level analysis, primarily word histograms. Initially, simple histograms of terms appearing in each free response section were tabulated. Limiting the results to relevant descriptive terms, the most common for each raag were collected and are shown in Table 2. Standard variations of each word (e.g. “happy” and “happiness”) were treated as examples of the same root word.

Terms were also grouped according to various categories of meaning, in an attempt both to give a broader view of the responses and to capture information buried in the “long tail” of infrequently used but clearly related terms. For example, words such as “bliss”, “exultation”, “happy”, “joyful”, and “ecstasy” were grouped together under “Joy & Happiness”. Relevant semantic groupings were based on a modified version of a standard content analysis dictionary. As before, the most common occurrences were tabulated and the results are displayed in Table 3.

The tables clearly show a common pattern of responses for certain raags. Raags Khamaj and Desh are associated with positively valenced concepts such as “joy & happiness,” whereas Darbari and Marwa were consistently associated with “melancholy & sadness.” Interestingly, Yaman was equally associated with “melancholy & sadness” and “joy & happiness.” The results show that, although each raag has a strong valence, a variety of other emotions are also present, suggesting that listeners’ responses are indeed complex.

5.2. Quantitative Responses

Figure 1 shows the distribution of quantitative (slider) responses for each emotion and raag. For happy, sad, tense, and romantic, three raag clusters are apparent. Desh and Khamaj are strongly positively valenced (happy, romantic and not tense) and Marwa and Darbari are negatively valenced (sad, not romantic, tense), with Yaman falling between these poles. Multiple comparison of means confirms that differences between these three clusters for these three emotions are significant (p < .05) but that differences within clusters are not significant. The “longing” values show no statistically significant differences by raag; however, all the raags clearly brought out this emotion. For the “peaceful” emotion there are again three clusters: Marwa is least peaceful, followed by Darbari and then a cluster containing Yaman, Khamaj and Desh.

6. FEATURE SELECTION

In order to explore possible predictors for these emotional reactions, several features were extracted from the audio excerpts. Some of these features have previously been shown to be highly effective in raag classification (Chordia and Rae 2007); thus we sought to investigate their ability to explain emotional responses. We hypothesized that survey responses would in part be predicted by the pitches
Darbari            Desh               Khamaj             Marwa              Yaman
Freq. Term         Freq. Term         Freq. Term         Freq. Term         Freq. Term
75    sad          59    happy        53    happy        74    sad          46    sad
19    peaceful     41    peaceful     36    peaceful     16    serious      42    peaceful
14    deep         19    romantic     31    sad          14    good         38    happy
13    calm         17    sad          24    romantic     14    peaceful     13    good
10    longing      15    soothing     13    soothing     14    longing      12    calm
10    serious      11    nice         10    good         13    deep         11    longing
10    beautiful    10    pleasant     9     longing      10    tense        8     serious
9     good         10    patriotic    9     love         7     emotional    7     romantic
8     happy        9     playful      9     relax        7     happy        7     relax
5     sombre       9     remind       6     calm         6     soothe       6     devotional
Table 3. Semantic category histograms for each Raag. Words with a similar meaning were assigned to a common
semantic category.
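The tabulation described in Section 5.1 (term histograms with word variants merged into a root, followed by grouping into semantic categories) can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used in the study: the root-word map, the membership of "Melancholy & Sadness", and the example responses are all assumptions made up for the sketch. Only the five "Joy & Happiness" members are taken from the text, and no filtering to "relevant descriptive terms" is attempted.

```python
import re
from collections import Counter

# Hypothetical root-word map; the paper does not list its normalization rules.
ROOTS = {"happiness": "happy", "sadness": "sad"}

# "Joy & Happiness" members are quoted in the text; the other category is assumed.
CATEGORIES = {
    "Joy & Happiness": {"bliss", "exultation", "happy", "joyful", "ecstasy"},
    "Melancholy & Sadness": {"sad", "melancholy"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def term_histogram(responses):
    """Count terms across responses, mapping known variants to their root."""
    counts = Counter()
    for response in responses:
        counts.update(ROOTS.get(word, word) for word in tokenize(response))
    return counts


def category_histogram(term_counts):
    """Sum term counts over the members of each semantic category."""
    return Counter({
        name: sum(term_counts[member] for member in members)
        for name, members in CATEGORIES.items()
    })


# Invented toy responses, not survey data.
responses = ["Happy, joyful", "pure happiness and bliss", "a little sad"]
terms = term_histogram(responses)
categories = category_histogram(terms)
print(terms.most_common(3))
print(categories)
```

Grouping at the category level is what lets rare words such as "bliss" contribute to the picture even though they would never appear near the top of a per-term histogram.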
Darbari:
“Emptiness. I’m tumbling down a deep abyss. Weightless, then maybe. Dark and surrounding. Fall.”
“It feels like pain, an agony that is long lasting and is happening at the time that the music is playing, a time of hardship.”

Desh:
“Absolutely fresh.... Clearing all the bad thoughts... Gives fresh meaning to life. Sounds serene. Very Peaceful i felt.”
“Pure, unblemished...Something white as milk, offering to take you in, clean all your sins...Compassionate and loving, but in a distant sort of way.”

Khamaj:
“It so reminds me of the blossoming of flowers and prosperity... the happy chuckles of newborns... the pride of their mothers...”
“spring time. birds chirping. sun is shining. love. a mother is telling her child a story. the child is smiling. good times are upon us.”

Marwa:
“my reaction to this raaga was almost sexual. I felt desire. I felt the urge to look beautiful and dress up in heavy gold jewelery. felt very aware of my body. at times I felt a strange anger and the desire to control somebody else. i felt very powerful. i felt like a woman.”
“very depressing...felt like crying...someone was going away... describes life in some way....the ups and downs.”

Yaman:
“I feel like a butterfly. The wind is streaking past me, and colors are awash in the air. Melodious colors.”
“A combination of moods. It looks relaxed most of the time and ruminating about something. It seems to turn a little angry occasionally as though an unhappy event was inadvertently recollected.”
correlation between each feature and the emotion, which varies between -1 and 1, is also shown.

Additionally, the variance inflation factor (VIF) for each feature is shown, giving a measure of the multicollinearity of the independent variables. This occurs when one of the variables is well approximated by a linear combination of the other variables, as is likely to be the case for PCD values. The VIF is calculated by regressing each variable on the remaining independent variables, and is computed as $\frac{1}{1 - R^2}$. If the $R^2$ value is high, it indicates that the variable is multicollinear. In typical applications, VIF values above a threshold of 10 indicate that a variable should be removed. Interpretation of models with multicollinear variables is difficult, as the relative contribution of each cannot easily be read from the beta values and the reliability of the estimate decreases (the size of the confidence interval increases for that coefficient). In some cases, multicollinearity can reverse the expected sign of the regression coefficient. For example, in Table 5 in the “happy” model, the Major 3rd coefficient is negative despite a strong positive correlation; it can be seen, however, that the variable is highly multicollinear, with a VIF of 7.74. Because scale tones in raags may be highly correlated, it is important to be aware of this potential confound. Stepwise regression partially avoids this problem by automatically selecting features that give the greatest incremental contribution to explaining the variance of the dependent variable. In this way, only one of a set of highly correlated features is typically included. Nevertheless, highly correlated variables may remain in the model, making VIF useful for interpretation.

Table 5 shows that “happy” and “sad” are modeled best, with R² values of .34 and .28 respectively, whereas the model is only weakly predictive in the case of “peaceful” (.11) and “tense” (.16). Unsurprisingly, “happy” responses are negatively correlated with the minor 2nd (β = −.13), minor 3rd (β = −.16), minor 6th (β = −.43), and minor 7th (β = −.37). Surprisingly, the Major 3rd is also negatively correlated, with a β value of −.13. As mentioned before, however, the Major 3rd has a high VIF value; examination of the correlation coefficients shows that it is strongly correlated with the Perfect 4th (.84), which is positively related to “happy” responses. “Sad” responses show an opposite pattern, positively related to the minor 2nd (β = .14), minor 3rd (β = .35), and minor 6th (β = .20), and negatively to the Perfect 4th (β = −.20). Many of the other models loosely conform to the idea that raags with “minor” intervals are negatively valenced (sad, tense) and raags with “major” intervals are positively valenced (happy, peaceful).
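The VIF calculation described above (regress each independent variable on the remaining ones, then take 1/(1 − R²)) can be sketched with NumPy as follows. This is an illustrative sketch only: the function name and the synthetic data are assumptions, and it is not the software used in the study.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (shape: n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        ss_res = residual @ residual
        ss_tot = ((target - target.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Synthetic data (not the PCD features): c is nearly a linear combination of a and b.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + b + 0.1 * rng.normal(size=200)

collinear_vifs = variance_inflation_factors(np.column_stack([a, b, c]))
independent_vifs = variance_inflation_factors(np.column_stack([a, b]))
print(collinear_vifs)
print(independent_vifs)
```

In the collinear case every column is well predicted by the other two, so all three VIFs are large; with only the two independent columns the values stay close to 1.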
“Happy” Regression Model (Total R² = .34)

feature               β       corr    VIF
minor 2nd            -0.13   -0.29    4.23
minor 3rd            -0.16   -0.09    1.86
Major 3rd            -0.13    0.38    7.74
Perfect 4th           0.13    0.37    5.66
minor 6th            -0.43   -0.25    6.02
Major 6th            -0.35   -0.28    2.33
Major 7th            -0.37   -0.03    8.11
sensory dissonance    0.13    0.06    2.74
PCDD entropy          0.18    0.38    2.27

“Sad” Regression Model (Total R² = .28)

feature               β       corr    VIF
minor 2nd             0.14    0.26    2.11
Major 2nd            -0.16   -0.14    1.91
minor 3rd             0.35    0.09    2.12
Perfect 4th          -0.20   -0.29    1.66
minor 6th             0.20    0.28    1.36
Major 6th             0.20    0.25    2.29
spectral centroid    -0.18   -0.10    1.98
note density         -0.15   -0.11    1.65

Table 5. Summary of regression models for “happy” and “sad”. For each feature we report the standardized regression coefficients, the bivariate correlation, and the variance inflation factor measure of multicollinearity.
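For readers unfamiliar with the β and corr columns reported in Table 5, the following is a small sketch of how standardized regression coefficients and bivariate correlations are conventionally obtained. The helper names and the synthetic data are assumptions for illustration; the paper does not describe its implementation, and the stepwise selection step is not reproduced here.

```python
import numpy as np


def zscore(a):
    """Center each column and scale it to unit sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def standardized_betas(X, y):
    """Least-squares coefficients after z-scoring X and y (no intercept needed)."""
    beta, *_ = np.linalg.lstsq(zscore(X), zscore(y), rcond=None)
    return beta


def bivariate_correlations(X, y):
    """Pearson correlation of each column of X with y, ignoring the other columns."""
    X = np.asarray(X, dtype=float)
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])


# Synthetic data, not the survey responses: two independent predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
y = 2.0 * x1 - 1.0 * x2 + 0.5 * rng.normal(size=500)
X = np.column_stack([x1, x2])

betas = standardized_betas(X, y)
corrs = bivariate_correlations(X, y)
print(betas)
print(corrs)
```

When the predictors are nearly uncorrelated, as in this toy case, each β stays close to the corresponding bivariate correlation. The two can diverge, even in sign, once predictors are strongly correlated with one another, which is why the table reports both alongside the VIF.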
“Peaceful” Regression Model (Total R² = .11)

scale deg      beta    corr    VIF
Major 2nd      0.22    0.11    3.32
minor 3rd     -0.24   -0.05    2.76
Major 3rd      0.22    0.22    4.69
Perfect 4th    0.15    0.23    4.63
Perfect 5th    0.15    0.10    1.20
Major 7th      0.23    0.03    1.88
PCDD entropy  -0.16    0.19    3.46

“Tense” Regression Model (Total R² = .16)

scale deg      beta    corr    VIF
minor 2nd      0.10    0.25    1.92
Major 2nd     -0.21   -0.15    1.71
minor 3rd      0.24    0.08    1.55
Major 3rd     -0.14   -0.26    1.45
Perfect 5th   -0.20   -0.18    1.74
minor 6th      0.14    0.12    1.45
The entropy of the PCDD distribution was a significant factor in the “happy” and “peaceful” models, positively related in the former and negatively related in the latter. We had hypothesized that low PCDD entropy values would correspond to tension and longing and higher values to a greater sense of flexibility. The effect shown here is weakly consistent with this prediction.

It is important to note that the relationship between PCDD entropy and the emotional characteristics we are testing is likely non-linear. Values outside the range represented in this study might well be expected to elicit different reactions. One might expect very low values, corresponding to predictability and repetition, to elicit “peaceful” and “sad” reactions. Very high values, on the other hand, might correspond to unpredictability and hence elicit feelings of “tension” and “stress”. However, in the middle of the range, where most of the musical excerpts used here lie, low entropy corresponds to raags with a relatively fixed phraseology, creating a sense of tendency rather than repetition, and high entropy corresponds to raags with a greater degree of flexibility, conveying more variability than instability. If correlations observed in the above models are valid, this is the most likely explanation.

Two additional models were developed to see if a more parsimonious explanation of the data could be given. The first used the total strength of the minor 2nd, minor 3rd, and minor 6th in the raag as the independent variable, a measure of the total degree of “minorness”. The adjusted R²,

$$R_a^2 = 1 - (1 - R^2)\,\frac{n-1}{n-p-1}, \tag{2}$$

where n is the number of observations and p is the number of independent variables, allows comparison between models with a different number of independent variables and is reported in Table 7. The second model considered only the PCD features. The full model was significantly more explanatory. Although the full model suggested that a feature that combined the minor 2nd, minor 3rd and minor 6th in a total measure of “minorness” might capture most of the information in the PCD, this was not the case. The PCD features explained an additional 10-20% of variation as compared with the single “minor” feature.

            Full    PCD only    Minor
Happy       0.34    0.27        0.07
Sad         0.28    0.22        0.06
Peaceful    0.10    0.10        0.02
Tense       0.16    0.13        0.05
Romantic    0.21    0.18        0.08

Table 7. Comparison of three regression models. The total adjusted R² value for each model is shown.

8. DISCUSSION

Survey responses have shown that different raags evoke a clearly differentiated set of emotional reactions. Free responses tended to cluster strongly in particular adjectival categories based on raag, and quantitative responses were significantly different by raag for all emotions except “longing”. Thus, a substantial step has been taken towards empirical verification of the nature and reliability of emotional responses to raag. Importantly, responses did not vary systematically by familiarity with NICM, suggesting that listeners were not simply referring to culturally determined concepts, but responding to underlying features of the music.

The analysis, although preliminary, suggests that responses are in part attributable to pitch-class statistics; the prevalence of certain scale degrees is useful in predicting the valence of the emotional responses. The data suggest that the entropy of the PCDD and the spectral centroid are also important. These are undoubtedly just a few of the many factors that influence listener responses. As more data are collected it will be possible to examine other factors more fully.

It is important to note that these models are currently merely suggestive. In none of the cases were they highly predictive, with a maximum of 34% of the variance accounted for. Because the goal here was explanatory rather than classificatory, the models were not verified on an independent data set. As with any task that forces respondents to verbalize primarily non-verbal mental states, there is significant measurement error due to the inherent unnaturalness of the task and an imperfect ability to map the verbal space. It is also possible that much of the true emotional feel of the music is lost in the projection onto simple emotions such as “happy” and “sad”. Although it is likely that some aspect of raags can be effectively captured by mapping onto these axes, it is also likely that it is a gross simplification of the actual emotional experience.

References

Balkwill, L. L. and W. F. Thompson (1999). A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. Music Perception 17, 43–64.

Bhatkande, V. (1934). Hindusthani Sangeet Paddhati. Sangeet Karyalaya.

Chordia, P. and A. Rae (2007). Raag recognition using pitch-class and pitch-class dyad distributions. In Proceedings of International Conference on Music Information Retrieval.

Duxbury, C., J. P. Bello, M. Davies, and M. Sandler (2003). A combined phase and amplitude based approach to onset detection for audio segmentation. In Proc. of the 4th European Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS–03), London, pp. 275–280.

Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. MIT Press.

Juslin, P. N. and J. A. Sloboda (2001). Music and Emotion: Theory and Research. Oxford University Press.

Kameoka, A. and M. Kuriyagawa (1969). Consonance theory, part i: Consonance of dyads. Journal of the Acoustical Society of America 45(6), 1451–1459.

Sun, X. (2000). A pitch determination algorithm based on subharmonic-to-harmonic ratio. In Proc. of International Conference of Speech and Language Processing.