Brain and Music
About this ebook
- Covers a variety of topics fundamental for music perception, including musical syntax, musical semantics, music and action, music and emotion
- Includes general introductory chapters to engage a broad readership, as well as a wealth of detailed research material for experts
- Offers the most empirical (and most systematic) work on the topics of neural correlates of musical syntax and musical semantics
- Integrates research from different domains (such as music, language, action, and emotion) both theoretically and empirically, to create a comprehensive theory of music psychology
Book preview
Brain and Music - Stefan Koelsch
Preface
Music is part of human nature. Every human culture that we know about has music, suggesting that, throughout human history, people have played and enjoyed music. The oldest musical instruments discovered so far are around 30 000 to 40 000 years old (flutes made of vulture bones, found in the caves Hohle Fels and Geissenklösterle near Ulm in Southern Germany).¹ However, it is highly likely that the first individuals belonging to the species Homo sapiens (about 100 000 to 200 000 years ago) already made musical instruments such as drums and flutes, and that they made music cooperatively in groups. It is believed by some that music-making promoted and supported social functions such as communication, cooperation and social cohesion,² and that human musical abilities played a key phylogenetic role in the evolution of language.³ However, the adaptive function of music for human evolution remains controversial (and speculative). Nevertheless, with regard to human ontogenesis, we now know that newborns (who do not yet understand the syntax and semantics of words and sentences) are equipped with musical abilities, such as the ability to detect changes in musical rhythms, pitch intervals, and tonal keys.⁴ By virtue of these abilities, newborn infants are able to decode acoustic features of voices and prosodic features of languages.⁵ Thus, infants' first steps into language are based on prosodic information (that is, on the musical aspects of speech). Moreover, musical communication in early childhood (such as parental singing) appears to play an important role in the emotional, cognitive, and social development of children.⁶
Listening to music, and music making, engages a large array of psychological processes, including perception and multimodal integration, attention, learning and memory, syntactic processing and processing of meaning information, action, emotion, and social cognition. This richness makes music an ideal tool to investigate human psychology and the workings of the human brain: Music psychology inherently covers, and connects, the different disciplines of psychology (such as perception, attention, memory, language, action, emotion, etc.), and is special in that it can combine these different disciplines in coherent, integrative frameworks of both theory and research. This makes music psychology the fundamental discipline of psychology.
The neuroscience of music is music psychology's tool for understanding the human brain. During the last few years, neuroscientists have increasingly used this tool, which has led to significant contributions to social, cognitive, and affective neuroscience. The aim of this book is to inform readers about the current state of knowledge in several fields of the neuroscience of music, and to synthesize this knowledge, along with the concepts and principles developed in this book, into a new theory of music psychology.
The first part of this book consists of seven introductory chapters. Their main contents are identical to those of a ‘first edition’ of this book (the publication of my PhD thesis), but I have updated the chapters with regard to scientific developments in the different areas. These chapters introduce the ear and hearing, a few music-theoretical concepts, perception of pitch and harmony, neurophysiological mechanisms underlying the generation of electric brain potentials, components of the event-related brain potential (ERP), the history of electrophysiological studies investigating music processing, and functional neuroimaging techniques. The purpose of these introductory chapters is to provide individuals from different disciplines with essential knowledge about the neuroscientific, music-theoretical, and music-psychological concepts required to understand the second part of the book (so that individuals without background knowledge in either of these areas can nevertheless understand the second part). I confined the scope of these chapters to those contents that are relevant to understanding the second part, rather than providing exhaustive accounts of each area. Scholars already familiar with those areas can easily begin right away with the second part.
The second part begins with a chapter on a model of music perception (Chapter 8). This model serves as a theoretical basis for processes and concepts developed in the subsequent chapters, and thus as a basis for the construction of the theory of music psychology introduced in this book. The chapter is followed by a chapter on music-syntactic processing (Chapter 9). In that chapter, I first tease apart different cognitive operations underlying music-syntactic processing. In particular, I advocate differentiating between: (a) processes that do not require (long-term) knowledge, (b) processes that are based on long-term knowledge and involve processing of local, but not long-distance, dependencies, and (c) processing of hierarchically organized structures (including long-distance dependencies). Then, I provide a detailed account on studies investigating music-syntactic processing using the early right anterior negativity (ERAN). One conclusion of these studies is the Syntactic Equivalence Hypothesis which states that there exist cognitive operations (and neural populations mediating these operations) that are required for music-syntactic, language-syntactic, action-syntactic, as well as mathematical-syntactic processing, and that are neither involved in the processing of acoustic deviance, nor in the processing of semantic information.
Chapter 10 deals with music-semantic processing. Here I attempt to tease apart the different ways in which music can either communicate meaning, or evoke processes that have meaning for the listener. In particular, I differentiate between extra-musical meaning (emerging from iconic, indexical, and symbolic sign quality), intra-musical meaning (emerging from structural relations between musical elements), and musicogenic meaning (emerging from music-related physical activity, emotional responses, and personality-related responses). One conclusion is that processing of extra-musical meaning is reflected electrically in the N400 component of the ERP, and processing of intra-musical meaning in the N5 component. With regard to musicogenic meaning, a further conclusion is that music can evoke sensations which, before they are ‘reconfigured’ into words, bear greater inter-individual correspondence than the words that an individual uses to describe these sensations. In this sense, music has the advantage of defining a sensation without this definition being biased by the use of words. I refer to this musicogenic meaning quality as a priori musical meaning.
Chapter 11 deals with neural correlates of music and action. The first part of that chapter reviews studies investigating premotor processes evoked by listening to music. The second part reviews studies investigating action with ERPs. These studies investigated piano playing in expert pianists, with a particular focus on (a) ERP correlates of errors that the pianists made during playing, and (b) processing of false feedback (while playing a correct note). Particularly with regard to its second part, this chapter is relatively short, because only a few neuroscientific studies are available in this area so far. However, I regard the topic of music and action as so important for the neuroscience of music that I felt something would be missing without this chapter.
Chapter 12 is a chapter on music-evoked emotions and their neural correlates. It first provides theoretical considerations about principles underlying the evocation of emotion with music. These principles are not confined to music, but can be extrapolated to emotion psychology in general. I also elaborate on several social functions that are engaged when making music in a group. One proposition is that music is special in that it can activate all of these social functions at the same time. Engaging in these functions fulfils human needs, and can, therefore, evoke strong emotions. Then, a detailed overview of functional neuroimaging studies investigating emotion with music is provided. These studies show that music-evoked emotions can modulate activity in virtually all so-called limbic/paralimbic brain structures. This indicates, in my view, that music-evoked emotions touch the core of evolutionarily adaptive neuroaffective mechanisms, and reflects that music satisfies basic human needs. I also argue that experiences of fun and reward have different neural correlates than experiences of joy, happiness, and love. With regard to the latter emotions, I endorse the hypothesis that they are generated in the hippocampus (and that, on a more general level, the hippocampus generates tender positive emotions related to social attachments). In the final section of that chapter, I present a framework on salutary effects of music making. Due to the scarcity of studies, that framework is thought of as a basis for further research in this area.
In the final chapter, I first provide a concluding account on ‘music’ and ‘language’. I argue that there is no design feature that distinctly separates music and language, and that even those design features that are more prominent in either language or music also have a transitional zone into the respective other domain. Therefore, the use of the words ‘music’ and ‘language’ seems adequate for our everyday language, but for scientific use I suggest the term music-language-continuum.
Then, the different processes and concepts developed in the preceding chapters are summarized, and synthesized into a theory of music perception. Thus, readers with very limited time can skip to page 201 and read only Section 13.3, for these few pages contain the essence of the book. In the final section, the research questions raised in the previous chapters are summarized. That summary is meant as a catalogue of research questions that I find most important with regard to the topics dealt with in the second part of this book. This catalogue is also meant to provide interested students and scientists who are new to the field with possible starting points for research.
The theory developed in this book is based on the model of music perception described in Chapter 8; that model describes seven stages, or dimensions, of music perception. The principles underlying these dimensions are regarded here as so fundamental for music psychology (and psychology in general) that processes and concepts of other domains (such as syntactic processing, musical meaning, action, emotion, etc.) were developed and conceptualized in such a way that they correspond to the dimensions of music perception.
This led to a theory that integrates different domains (such as music, language, action, emotion, etc.) in a common framework, implying numerous shared processes and similarities, rather than treating ‘language’, ‘music’, ‘action’, and ‘emotion’ as isolated domains.⁷ That is, different to what is nowadays common in psychology and neuroscience, namely doing research in a particular domain without much regard to other domains, the music-psychological approach taken in this book aims at bringing different domains together, and integrating them both theoretically and empirically into a coherent theory. In this regard, notably, this book is about understanding human psychology and the human brain (it is not about understanding music, although knowledge about how music is processed in the brain can open new perspectives for the experience of music). In my view, we do not need neuroscience to explain, or understand music (every child can understand music, and Bach obviously managed to write his music without any brain scanner). However, I do believe that we need music to understand the brain, and that our understanding of the human brain will remain incomplete unless we have a thorough knowledge about how the brain processes music.
Many of my friends and colleagues contributed to this book through valuable discussions, helpful comments, and numerous corrections (in alphabetical order): Matthias Bertsch, Rebecca Chambers-Lepping, Ian Cross, Philipp Engel, Thomas Fritz, Thomas Gunter, Thomas Hillecke, Sebastian Jentschke, Carol Lynne Krumhansl, Moritz Lehne, Eun-Jeong Lee, Giacomo Novembre, Burkhard Maess, Clemens Maidhof, Karsten Müller, Jaak Panksepp, Uli Reich, Tony Robertson, Martin Rohrmeier, María Herrojo Ruiz, Daniela Sammler, Klaus Scherer, Walter Alfred Siebel, Stavros Skouras, and Kurt Steinmetzger. Aleksandra Gulka contributed by obtaining the reprint permissions of figures. It is with great joy that I see this book now finalized and in its entirety, and I hope that many readers will enjoy reading this book.
Stefan Koelsch
Leipzig, Germany
1. Conard et al. (2009)
2. Cross & Morley (2008), Koelsch et al. (2010a)
3. Wallin et al. (2000)
4. Winkler et al. (2009b), Stefanics et al. (2007), Perani et al. (2010)
5. Moon et al. (1993)
6. Trehub (2003)
7. See also Siebel et al. (1990).
Part I
Introductory Chapters
1
Ear and Hearing
1.1 The ear
The human ear has striking abilities of detecting and differentiating sounds. It is sensitive to a wide range of frequencies as well as intensities and has an extremely high temporal resolution (for detailed descriptions see, e.g., Geisler, 1998; Moore, 2008; Pickles, 2008; Plack, 2005; Cook, 2001). The ear consists of three parts: The outer, the middle, and the inner ear. The outer ear acts as a receiver and filters sound waves on their way to the ear drum (tympanic membrane) via the ear canal (meatus), amplifying some sounds and attenuating others (depending on the frequency and direction of these sounds). Sound waves (i.e., alternating compression and rarefaction of air) cause the tympanic membrane to vibrate, and these vibrations are subsequently amplified by the middle ear. The middle ear is composed of three linked bones: The malleus, incus, and stapes. These tiny bones help transmit the vibrations on to the oval window of the cochlea, a small membrane-covered opening in the bony wall of the inner ear and the interface between the air-filled middle-ear and the fluid-filled inner ear (Figure 1.1).
Figure 1.1 Top: The major parts of the human ear. In the Figure, the cochlea has been uncoiled for illustration purposes. Bottom: Anatomy of the cochlea (both figures from Kandel et al., 2000).
The cochlea has three fluid-filled compartments, the scala tympani, the scala media, and the scala vestibuli (which is continuous with the scala tympani at the helicotrema). Scala media and scala tympani are separated by the basilar membrane (BM). The organ of Corti rests on the BM and contains the auditory sensory receptors that are responsible for transducing the sound stimulus into electrical signals. The vibration of the stapes results in varying pressures on the fluid in the scala vestibuli, causing oscillating movements of scala vestibuli, scala media (including BM) and scala tympani (for detailed descriptions see, e.g., Geisler, 1998; Pickles, 2008).
The organ of Corti contains the sensory receptor cells of the inner ear, the hair cells (bottom of Figure 1.1). There are two types of hair cells, inner hair cells and outer hair cells. On the apical surface of each hair cell is a bundle of around 100 stereocilia (mechanosensing organelles which respond to fluid motion or fluid pressure changes). Above the hair cells is the tectorial membrane that attaches to the longest stereocilia of the outer hair cells. The sound-induced movement of the scalae fluid (see above) causes a relative shearing between the tectorial membrane and the BM, resulting in a deflection of the stereocilia of both inner and outer hair cells. The deflection of the stereocilia is the adequate stimulus of a hair cell, which then depolarizes (or hyperpolarizes, depending on the direction of deflection) by opening an inward current (for detailed information see Steel & Kros, 2001).
The inner hair cells then release glutamate (Nouvian et al., 2006)¹ at their basal ends where the hair cells are connected to the peripheral branches of axons of neurons whose bodies lie in the spiral ganglion. The central axons of these neurons constitute the auditory nerve. The release of glutamate by the hair cells excites the sensory neurons and this in turn initiates action potentials in the cell's central axon in the auditory nerve. Oscillatory changes in the potential of a hair cell thus result in oscillatory release of transmitter and oscillatory firing in the auditory nerve (for details see, e.g., Pickles, 2008; Geisler, 1998). The duration of an acoustic stimulus is encoded by the duration of activation of an auditory nerve fibre.
Different frequencies of sounds are selectively responded to in different regions of the cochlea. Each sound initiates a travelling wave along the length of the cochlea. The mechanical properties of the basilar membrane vary along the length of the cochlea; the BM is stiff and thin at the basal end (and vibrates more to high frequency sounds, similar to the high e-string on a guitar, which resonates at a sound frequency of ∼330 Hz), whereas at the apex the BM is thicker and less stiff (and resonates at sounds with lower frequencies, similar to the low e-string on a guitar, which resonates at a sound frequency of ∼82 Hz). Different frequencies of sound produce different travelling waves with peak amplitudes at different points along the BM. Higher frequencies result in peak amplitudes of oscillations of the BM that are located nearer to the base of the cochlea, lower frequencies result in oscillatory peaks near the apex of the cochlea (for more details see, e.g., Pickles, 2008; Geisler, 1998).
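The tonotopic mapping described here can be made concrete with Greenwood's (1990) place-frequency function, a standard approximation for the human cochlea that is not part of this chapter. The short Python sketch below (illustrative only) estimates where along the BM the travelling-wave peaks for the two guitar-string frequencies mentioned above would lie.

```python
import math

# Greenwood (1990) place-frequency function for the human cochlea (a standard
# approximation, not taken from this chapter): f = A * (10**(a * x) - k),
# where x is the distance from the apex as a proportion of BM length.
A, a, k = 165.4, 2.1, 0.88

def frequency_at(x: float) -> float:
    """Characteristic frequency (Hz) at relative position x (0 = apex, 1 = base)."""
    return A * (10 ** (a * x) - k)

def place_of(f: float) -> float:
    """Relative BM position (0 = apex, 1 = base) whose characteristic frequency is f."""
    return math.log10(f / A + k) / a

for f in (82.0, 330.0, 4000.0):   # low e-string, high e-string, a higher tone
    x = place_of(f)
    print(f"{f:7.1f} Hz -> travelling-wave peak at ~{100 * x:4.1f}% of the BM length from the apex")
# Lower frequencies peak nearer the apex, higher frequencies nearer the base,
# consistent with the tonotopy described above.
```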
The outer hair cells specifically sharpen the peak of the travelling wave at the frequency-characteristic place on the BM (e.g., Fettiplace & Hackney, 2006). Interestingly, outer hair cells achieve the changes in tuning of the local region in the organ of Corti by increasing or decreasing the length of their cell bodies (thereby affecting the mechanical properties of the organ of Corti; Fettiplace & Hackney, 2006). This change in length is an example of the active processes occurring within the organ of Corti while processing sensory information. Moreover, the outer hair cells are innervated by efferent nerve fibres from the central nervous system, and it appears that the changes in length are at least partly influenced by top-down processes (such processes may even originate from neocortical areas of the brain). Therefore, the dynamics of the cochlea (determining the processing of acoustic information) appears to be strongly influenced by the brain. The dynamic activity of the outer hair cells is a necessary condition for a high frequency-selectivity (which, in turn, is a prerequisite for both music and speech perception).
Corresponding to the tuning of the BM, the frequency-characteristic excitation of inner hair cells gives rise to action potentials in different auditory nerve fibres. Therefore, an auditory nerve fibre is most sensitive to a particular frequency of sound, its so-called characteristic frequency. Nevertheless, an individual auditory nerve fibre (which is innervated by several inner hair cells) still responds to a range of frequencies, because a substantial portion of the BM moves in response to a single frequency. The sound pressure level (SPL, for explanation and medical relevance see, e.g., Moore, 2008) is then encoded (1) by the action potential rate of afferent nerve fibres, and (2) by the number of neighbouring afferent nerve fibres that release action potentials (because the number of neurons that release action potentials increases as the intensity of an auditory stimulus increases). The brain decodes the spatio-temporal pattern consisting of the individual firing rates of all activated auditory nerve fibres (each with its characteristic frequency) into information about intensity and frequency of a stimulus (decoding of frequency information is dealt with in more detail further below).
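As a point of reference for the sound pressure level mentioned here, the standard definition (not spelled out in the text, but found in, e.g., Moore, 2008) is

$$L_{\mathrm{SPL}} = 20\,\log_{10}\!\left(\frac{p}{p_0}\right)\ \mathrm{dB}, \qquad p_0 = 20\ \mu\mathrm{Pa},$$

so that, for example, a sound pressure of p = 0.2 Pa corresponds to 20·log10(0.2/0.00002) = 80 dB SPL.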
1.2 Auditory brainstem and thalamus
The cochlear nerve enters the central nervous system in the brain stem (cranial nerve VIII).² Within the brain stem, information originating from the hair cells is propagated via both contra- and ipsilateral connections between the nuclei of the central auditory pathway (for a detailed description see Nieuwenhuys et al., 2008). For example, some of the secondary auditory fibres that originate from the ventral cochlear nucleus project to the ipsilateral superior olivary nucleus and to the medial superior olivary nucleus of both sides (both superior olivary nuclei project to the inferior colliculus). Other secondary auditory fibres project to the contralateral nucleus of the trapezoid body (that sends fibres to the ipsilateral superior olivary nucleus; see Figure 1.2). The pattern of contra- and ipsilateral connections is important for the interpretation of interaural differences in phase and intensity for the localization of sounds.
Figure 1.2 Dorsal view of nerve, nuclei, and tracts of the auditory system (from Nieuwenhuys et al., 2008).
The inferior colliculus (IC) is connected with the medial geniculate body of the thalamus. The cells in the medial geniculate body send most of their axons via the radiatio acustica to the ipsilateral primary auditory cortex (for a detailed description see Nieuwenhuys et al., 2008). However, neurons in the medial division of the medial geniculate body (mMGB) also directly project to the lateral amygdala (LeDoux, 2000); specifically those mMGB neurons receive ascending inputs from the inferior colliculus and are likely to be, at least in part, acoustic relay neurons (LeDoux et al., 1990). The MGB, and presumably the IC as well, are involved in conditioned fear responses to acoustic stimuli. Moreover, already the IC plays a role in the expression of acoustic-motor as well as acoustic-limbic integration (Garcia-Cairasco, 2002), and chemical stimulation of the IC can evoke defence behaviour (Brandão et al., 1988). It is for these reasons that the IC and the MGB are not simply acoustic relay stations, but that these structures are involved in the detection of auditory signals of danger.
What is often neglected in descriptions of the auditory pathway is the important fact that auditory brainstem neurons also project to neurons of the reticular formation. For example, intracellular recording and tracing experiments have shown that giant reticulospinal neurons in the caudal pontine reticular formation (PnC) can be driven at short latencies by acoustic stimuli, most probably due to multiple and direct input from the ventral (and dorsal) cochlear nucleus (perhaps even from interstitial neurons of the VIII nerve root) and nuclei in the superior olivary complex (e.g., lateral superior olive, ventral periolivary areas; Koch et al., 1992). These reticular neurons are involved in the generation of motor reflexes (by virtue of projections to spinal motoneurons), and it is conceivable that the projections from the auditory brainstem to neurons of the reticular formation contribute to the vitalizing effects of music, as well as to the (human) drive to move to music (perhaps in interaction with brainstem neurons sensitive to isochronous stimulation).³
1.3 Place and time information
The tonotopic excitation of the basilar membrane (BM)⁴ is maintained as a tonotopic structure (also referred to as tonotopy) in the auditory nerve, auditory brainstem, thalamus, and the auditory cortex. This tonotopy is an important source of information about the frequencies of tones. However, another important source is the temporal patterning of the action potentials generated by auditory nerve neurons. Up to frequencies of about 4–5 kHz, auditory nerve neurons produce action potentials that occur approximately in phase with the corresponding oscillation of the BM (although the auditory nerve neurons do not necessarily produce an action potential on every cycle of the corresponding BM oscillation). Therefore, up to about 4–5 kHz, the time intervals between action potentials of auditory neurons are approximately integer multiples of the period of a BM oscillation, and the timing of nerve activity codes the frequency of BM oscillation (and thus the frequency of a tone, or partial of a tone, which elicits this BM oscillation). The brain uses both place information (i.e., information about which part(s) of the BM was/were oscillating) and time information (i.e., information about the frequency/ies of the BM oscillation/s). Note, however, (a) that time information is hardly available at frequencies above about 5 kHz, (b) that place information appears not to be accurate enough to decode differences in frequencies in the range of a few percent (e.g., between a tone of 5000 and 5050 Hz), and (c) that place information alone cannot explain the phenomenon of the pitch perception of tones with missing fundamentals⁵ (for details about the place theory and temporal theory see, e.g., Moore, 2008).
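To illustrate how such timing information can code frequency, here is a deliberately idealized Python toy model (my own illustration, not a description of any particular dataset): it generates spike times that are phase-locked to a 440 Hz oscillation but skip a random number of cycles, and shows that pooling the inter-spike intervals recovers the period.

```python
import random

f = 440.0           # frequency of the BM oscillation (Hz)
period = 1.0 / f    # stimulus period (s)
random.seed(1)

# Idealized phase-locked spike train: the fibre fires near a fixed phase of the
# cycle but skips a random number of cycles between spikes (it cannot fire on
# every cycle), with a little timing jitter added.
spike_times, t = [], 0.0
for _ in range(2000):
    t += random.randint(1, 5) * period + random.gauss(0.0, 0.05 * period)
    spike_times.append(t)

# Inter-spike intervals (ISIs), expressed in units of the stimulus period:
isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
in_periods = [isi / period for isi in isis]

# Most ISIs lie close to an integer number of periods, so dividing each ISI by
# the nearest integer and averaging recovers the period (and hence the frequency).
estimated_period = sum(isi / round(p) for isi, p in zip(isis, in_periods)) / len(isis)
print(f"true period: {1000 * period:.3f} ms, "
      f"estimated from ISIs: {1000 * estimated_period:.3f} ms "
      f"(~{1.0 / estimated_period:.1f} Hz)")
```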
The phenomenon of the perception of a ‘missing fundamental’ is an occurrence of residue pitch,⁶ also referred to as periodicity pitch, virtual pitch, or low pitch. The value of a residue pitch equals the periodicity (i.e., the timing) of the waveform resulting from the superposition of sinusoids. Importantly, dichotically presented stimuli also elicit residue perception, arguing for the notion that temporal coding of sounds beyond the cochlea is important for pitch perception. Such temporal coding has been reported for neurons of the inferior colliculus (e.g., Langner et al., 2002) and the auditory cortex (see below);⁷ even neurons in the (dorsal) cochlear nucleus (DCN) are able to represent the periodicity of iterated rippled noise, supporting the notion that already the DCN is involved in the temporal representation of both envelope periodicity and pitch (Neuert et al., 2005). However, note that two (or more) frequencies that can be separated (or ‘resolved’) by the BM, also generate (intermodulation) distortions on the BM with different frequencies, the one most easily audible having a frequency of f2 - f1 (usually referred to as difference combination tone). Usually both mechanisms (BM distortions generating combination tones, and temporal coding) contribute to the perception of residue pitch, although combination tones and residue pitch can also be separated (Schouten et al., 1962).
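The statement that the residue pitch equals the periodicity of the superposed waveform can be checked numerically. The Python sketch below (a minimal illustration with arbitrarily chosen partials at 600, 800, and 1000 Hz, i.e., harmonics 3 to 5 of a missing 200 Hz fundamental) estimates the waveform's period from its autocorrelation.

```python
import numpy as np

fs = 44_100                       # sampling rate (Hz)
t = np.arange(0, 0.5, 1 / fs)     # 0.5 s of signal

# Harmonics 3, 4 and 5 of a 200 Hz fundamental; the 200 Hz component itself is absent.
partials = [600.0, 800.0, 1000.0]
x = sum(np.sin(2 * np.pi * f * t) for f in partials)

# Autocorrelation of the waveform; its strongest peak beyond very short lags
# marks the period of the superposed waveform.
r = np.correlate(x, x, mode="full")[len(x) - 1:]
min_lag = int(fs / 1000)          # ignore lags shorter than 1 ms
max_lag = int(fs / 50)            # and longer than 20 ms
best_lag = min_lag + int(np.argmax(r[min_lag:max_lag]))

print(f"estimated period: {1000 * best_lag / fs:.2f} ms "
      f"-> residue pitch of ~{fs / best_lag:.0f} Hz (missing fundamental: 200 Hz)")
```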
1.4 Beats, roughness, consonance and dissonance
If two sinusoidal tones (or two partials of two tones with similar frequency) cannot be separated (or ‘resolved’) by the BM, that is, if two frequencies pass through the same equivalent rectangular bandwidth (ERB; for details see Moore, 2008; Patterson & Moore, 1986),⁸ then the two frequencies are added together (or ‘merged’) by the BM. This results in an oscillation of the BM with a frequency equal to the mean frequency of the two components, and an additional beat⁹ (see also von Helmholtz, 1870). Such beats are regular amplitude fluctuations occurring due to the changing phase relationship between the two initial sinusoids, which results in the phenomenon that the sinusoids alternately reinforce and cancel out each other. The frequency of the beat is equal to the frequency difference between the two initial sinusoids. For example, two sinusoidal tones with frequencies of 1000 and 1004 Hz add up to (and are then perceived as) a tone of 1002 Hz, with four beats occurring each second (similar to when turning a volume knob up and down four times in one second). When the beats have higher frequencies (above ∼20 Hz), these beats are perceived as roughness (Plomp & Steeneken, 1968; Terhardt, 1974; 1978), and are a sensory basis for the so-called sensory dissonance (Terhardt, 1976, 1984; Tramo et al., 2001). Western listeners tend to judge two sinusoidal tones as consonant as soon as their frequency separation exceeds about one ERB (Plomp & Levelt, 1965), which is typically between 11% and 17% of the centre frequency.
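The 1000 Hz / 1004 Hz example follows directly from the trigonometric identity for the sum of two sinusoids (a standard derivation, spelled out here for convenience):

$$\sin(2\pi f_1 t) + \sin(2\pi f_2 t) = 2\,\cos\!\left(2\pi\,\tfrac{f_1 - f_2}{2}\,t\right)\sin\!\left(2\pi\,\tfrac{f_1 + f_2}{2}\,t\right).$$

For f1 = 1004 Hz and f2 = 1000 Hz, the second factor is a 1002 Hz tone (the perceived pitch), and the first factor is a 2 Hz envelope whose magnitude peaks twice per cycle, yielding |f1 − f2| = 4 beats per second.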
Ernst Terhardt (1976; 1984) distinguished two components of musical consonance/dissonance, namely sensory consonance/dissonance and harmony.¹⁰ According to Terhardt, sensory consonance/dissonance represents the graded absence/presence of annoying factors (such as beats and roughness). Others (Tramo et al., 2001) argued that consonance is also a positive phenomenon (not just a negative phenomenon that depends on the absence of roughness), one reason being that residue pitches produced by the auditory system contribute to the percept of consonance.¹¹ Tramo et al. (2001) argue that, in the case of consonant intervals, the most common interspike intervals (ISIs) in the distributions of auditory nerve fibre activity correspond (a) to the F0 frequencies of the tones, as well as (b) to the frequency (or frequencies) of the residue pitch(es). Moreover (c), all or most of the partials can be resolved. By contrast, for dissonant intervals, the most common ISIs in the distribution correspond neither (a) to either of the F0s, nor (b) to harmonically related residue pitch(es). Moreover (c), many partials cannot be resolved.
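A rough numerical intuition for this account (a deliberately simplified sketch; the interval frequencies below are chosen for illustration and are not taken from Tramo et al., 2001) is that small-integer frequency ratios imply a common fundamental well within the pitch range, whereas dissonant ratios do not.

```python
from fractions import Fraction

def common_fundamental(f1: float, f2: float, max_den: int = 64) -> float:
    """Largest frequency of which both f1 and f2 are (approximately) integer multiples."""
    ratio = Fraction(f2 / f1).limit_denominator(max_den)   # f2/f1 ~ p/q
    return f1 / ratio.denominator                          # common fundamental = f1/q

intervals = {
    "perfect fifth (3:2)":  (200.0, 300.0),
    "major third (5:4)":    (200.0, 250.0),
    "minor second (16:15)": (200.0, 200.0 * 16 / 15),
}
for name, (f1, f2) in intervals.items():
    f0 = common_fundamental(f1, f2)
    print(f"{name:22s} -> implied common fundamental ~{f0:6.1f} Hz")
# The fifth and the third imply fundamentals well within the pitch range
# (100 Hz and 50 Hz), whereas the minor second implies ~13 Hz, far below it,
# mirroring the presence vs. absence of a harmonically related residue pitch.
```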
Harmony, according to Terhardt, represents the fulfilment, or violation, of musical regularities that, given a particular musical style, govern the arrangement of subsequent or simultaneously sounding tones (‘tonal affinity, compatibility, and fundamental-note relation’, Terhardt, 1984, p. 276).¹² The degree to which harmony is perceived as un/pleasant is markedly shaped by cultural experience, due to its relation to music- (and thus presumably also culture-) specific principles.
Sensory dissonance (i.e., the ‘vertical dimension of harmony’; Tramo et al., 2001) is universally perceived as less pleasant than consonance, but the degree to which sensory consonance/dissonance is perceived as pleasant/unpleasant is also significantly shaped by cultural experience. This notion has recently received support from a study carried out in Cameroon with individuals of the Mafa people who had presumably never listened to Western music before participating in the experiment (Fritz et al., 2009). The Mafa showed a significant preference for original Western music over continuously dissonant versions of the same pieces. Notably, the difference in normalized pleasantness ratings between the original music and the continuously dissonant versions was moderate, and far smaller than the difference observed in a control group of Western listeners. That is, both Western and Mafa listeners preferred more consonant over continuously dissonant music, but whereas this preference was very strong in Western listeners, it was rather moderate in the Mafa. This indicates that the preference for mainly consonant music over continuously dissonant music is shaped by cultural factors.¹³
Beating sensations can not only occur monaurally (i.e., when different frequencies enter the same ear), but also binaurally (i.e., when each ear receives different frequencies, for example one frequency entering one ear, and another frequency entering the other ear). Binaural beats presumably emerge mainly from neural processes in the auditory brainstem (Kuwada et al., 1979; McAlpine et al., 2000), which are due to the continuously changing interaural phase that results from the superposition of two sinusoids, possibly related to sound localization.¹⁴ Perceptually, binaural beats are somewhat similar to monaural beats, but not as distinct as monaural beats. Moreover, in contrast to monaural beats (which can be observed over the entire audible frequency range) binaural beats are heard most distinctly for frequencies between 300 and 600 Hz (and they become progressively more difficult to hear at higher frequencies; for details see Moore, 2008).
1.5 Acoustical equivalency of timbre and phoneme
With regard to a comparison between music and speech, it is worth mentioning that, in terms of acoustics, there is no difference between a phoneme and the timbre of a musical sound (and it is only a matter of convention whether phoneticians use terms such as ‘vowel quality’ or ‘vowel colour’ instead of ‘timbre’).¹⁵ Both are characterized by the two physical correlates of timbre: Spectrum envelope (i.e., differences in the relative amplitudes of the individual harmonics) and amplitude envelope (also referred to as amplitude contour or energy contour of the sound wave, i.e., the way that the loudness of a sound changes, particularly with regard to the attack and the decay of a sound).¹⁶ Aperiodic sounds can also differ in spectrum envelope (see, e.g., the difference between /ʃ/ and /s/), and timbre differences related to amplitude envelope play a role in speech, e.g., in the shape of the attack for /b/ vs. /w/ and /ʃ/ vs. /tʃ/.
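Both physical correlates of timbre can be computed directly from a waveform. The Python sketch below (illustrative only, using a synthetic tone rather than a recorded phoneme or instrument) extracts a rough spectrum envelope via the Fourier transform and an amplitude envelope via the analytic signal.

```python
import numpy as np
from scipy.signal import hilbert

fs = 16_000
t = np.arange(0, 0.3, 1 / fs)

# Synthetic 'instrument-like' tone: harmonics of 220 Hz with decaying amplitudes
# (spectrum envelope) and an exponential attack/decay (amplitude envelope).
harmonics = [(220 * n, 1.0 / n) for n in range(1, 8)]
wave = sum(a * np.sin(2 * np.pi * f * t) for f, a in harmonics)
wave *= (1 - np.exp(-t / 0.01)) * np.exp(-t / 0.15)   # attack and decay

# Spectrum envelope: relative magnitudes of the partials.
spectrum = np.abs(np.fft.rfft(wave))
freqs = np.fft.rfftfreq(len(wave), 1 / fs)
for f, _ in harmonics[:4]:
    bin_idx = int(np.argmin(np.abs(freqs - f)))
    print(f"partial at {f:6.1f} Hz: relative magnitude {spectrum[bin_idx] / spectrum.max():.2f}")

# Amplitude envelope: magnitude of the analytic signal (Hilbert transform).
envelope = np.abs(hilbert(wave))
print(f"amplitude envelope peaks at ~{1000 * t[int(np.argmax(envelope))]:.0f} ms after onset")
```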
1.6 Auditory cortex
The primary auditory cortex corresponds to the transverse gyrus of Heschl (or gyrus temporalis transversus), which is part of the superior temporal gyrus (STG). Most researchers agree that the primary auditory cortex (corresponding to Brodmann’s area 41) consists of three sub-areas, referred to as AI (or A1), R, and RT by some authors (e.g., Kaas & Hackett, 2000; Petkov et al., 2006; see also Figure 1.3), or Te1.0, Te1.1, and Te1.2 by others (Morosan et al., 2001, 2005). The primary auditory cortex (or ‘auditory core region’) is surrounded by auditory belt and parabelt regions that constitute the auditory association cortex (Kaas & Hackett, 2000; Petkov et al., 2006).¹⁷, ¹⁸
Figure 1.3 Subdivisions and connectivity of the auditory cortex. (A) The auditory core region (also referred to as primary auditory cortex) is comprised of the auditory area I (AI), a rostral area (R), and a rostrotemporal area (RT). Area AI, as well as the other two core areas, has dense reciprocal connections with adjacent areas of the core and belt (left panel, solid lines with arrows). Connections with nonadjacent areas are less dense (left panel, dashed lines with arrows). The core has few, if any, connections with the parabelt or more distant cortex. (B) shows auditory cortical connections of the middle lateral auditory belt area (ML). Area ML, as well as the other belt areas, has dense connections with adjacent areas of the core, belt, and parabelt (middle panel, solid lines with arrows). Connections with nonadjacent areas tend to be less dense (middle panel, dashed lines with arrows). The belt areas also have topographically organized connections with functionally distinct areas in the prefrontal cortex. (C) Laterally adjacent to the auditory belt is a rostral (RPB) and a caudal parabelt area (CPB). Both these parabelt areas have dense connections with adjacent areas of the belt and RM in the medial belt (illustrated for CPB by the solid lines with arrows). Connections to other auditory areas tend to be less dense (dashed lines with arrows). The parabelt areas have few, if any, connections with the core areas. The parabelt also has connections with the polysensory areas in the superior temporal sulcus (STS) and with functionally distinct areas in prefrontal cortex. Further abbreviations: CL, caudolateral area; CM, caudomedial area; ML, middle lateral area; RM, rostromedial area; AL, anterolateral area; RTL, lateral rostrotemporal area; RTM, medial rostrotemporal area. Reprinted with permission from Kaas & Hackett (2000).
Figure 1.3 shows these regions and their connectivity according to the nomenclature introduced by Kaas & Hackett (2000).¹⁹ Note that, unlike what is shown in Figure 1.3, Nieuwenhuys et al. (2008) stated that the parabelt region also covers parts of the temporal operculum, that is, part of the medial (and not only the lateral) surface of the STG (p. 613). Nieuwenhuys et al. (2008) also noted that the precise borders of the posterior parabelt region (which grades in the left hemisphere into Wernicke’s area) are not known, but that ‘it is generally assumed that it includes the posterior portions of the planum temporale and superior temporal gyrus, and the most basal parts of the angular and supramarginal gyri’ (p. 613–614).
All of the core areas, and most of the belt areas, show a tonotopic structure, which is clearest in AI. The tonotopic structure of R seems weaker than that of AI, but stronger than that of RT. The majority of belt areas appear to show a tonotopic structure comparable to that of R and RT (Petkov et al., 2006, reported that, in the macaque monkey, RTM and CL have only a weak, and RTL and RM no clear tonotopic structure).
The primary auditory cortex (PAC) is thought to be involved in several auditory processes:

1. The analysis of acoustic features (such as frequency, intensity, and timbral features). Compared to the brainstem, the auditory cortex is capable of performing such analysis with considerably higher resolution (perhaps with the exception of the localization of sound sources). Tramo et al. (2002) reported that a patient with bilateral lesions of the PAC (a) had normal detection thresholds for sounds (i.e., the patient could say whether there was a tone or not), but (b) had elevated thresholds for determining whether two tones have the same pitch or not (i.e., the patient had difficulties detecting minute frequency differences between two subsequent tones).
2. Auditory sensory memory (also referred to as ‘echoic memory’). The auditory sensory memory is a short-term buffer that stores auditory information for a short time (up to several seconds).
3. Extraction of inter-sound relationships. The study by Tramo et al. (2002) also reported that the patient with PAC lesions had markedly increased thresholds for determining the pitch direction (i.e., the patient had great difficulties in saying whether the second tone was higher or lower in pitch than the first tone, even though he could tell that the two tones differed) (see also Johnsrude et al., 2000; Zatorre, 2001, for similar results obtained from patients with right PAC lesions).
4. Stream segregation, including discrimination and organization of sounds as well as of sound patterns (see also Fishman et al., 2001).
5. Automatic change detection. Auditory sensory memory representations also serve the detection of changes in regularities inherent in the acoustic input. Such detection is thought to be reflected electrically as the mismatch negativity (MMN; see Chapter 5), and several studies indicate that the PAC is involved in the generation of the MMN (for an MEG study localizing the MMN generators in the PAC see Maess et al., 2007).
6. Multisensory integration (Hackett & Kaas, 2004), particularly integration of auditory and visual information.
7. The transformation of acoustic features into auditory percepts, that is, the transformation of acoustic features such as frequency and intensity into auditory percepts such as pitch height, pitch chroma, and loudness.²⁰ It appears that patients with (right) PAC lesions have lost the ability to perceive residue pitch (Zatorre, 1988), consistent with animal studies showing that bilateral lesions of the auditory cortex (in the cat) impair the discrimination of changes in the pitch of a missing fundamental (but not changes in frequency alone; Whitfield, 1980). Moreover, neurons in the anterolateral region of the PAC show responses to a missing fundamental frequency (Bendor & Wang, 2005; data were obtained from marmoset monkeys), and magnetoencephalographic data suggest that response properties in the PAC depend on whether or not a missing fundamental of a complex tone is perceived (Patel & Balaban, 2001; data were obtained from humans). In that study (Patel & Balaban, 2001), phase changes of the auditory steady-state response (aSSR) were related to the pitch percept of a sound.²¹
As mentioned above, combination tones emerge already in the cochlea (generated by the nonlinear mechanics of the basilar membrane), and the periodicity of complex tones is coded in the spike pattern of auditory brainstem neurons.²² That is, different mechanisms contribute to the perception of residue pitch on at least three different levels: (1) On the basilar membrane (BM), (2) in the brainstem (due to temporal coding that leads to a periodicity of the neuronal spike pattern), and (3) in the auditory cortex.²³ However, the studies by Zatorre (2001) and Whitfield (1980) suggest that the auditory cortex plays a more prominent role for the transformation of acoustic features into auditory percepts than the brainstem (or the basilar membrane).
It is also worth noting that neurons in AI are responsive to both sinusoidal (‘pure’) tones and complex tones, as well as to noise stimuli, whereas areas outside AI become increasingly unresponsive to pure tones, and respond more strongly (or exclusively) to complex tones and noises. Therefore, it seems most plausible that accurate acoustic feature analysis, sound discrimination and pattern organization, as well as transformation of acoustic features into percepts are the results of close interactions between auditory core and belt areas. In addition, the auditory association cortex fulfils a large array of functions (many of which have just begun to be investigated systematically with neuroscientific methods) such as auditory scene analysis and stream segregation (De Sanctis et al., 2008; Gutschalk et al., 2007; Snyder & Alain, 2007), auditory memory (Näätänen et al., 2010; Schonwiesner et al., 2007), phoneme perception (Obleser & Eisner, 2009), voice perception (Belin et al., 2004), speaker identification (von Kriegstein et al., 2005), perception of the size of a speaker or an instrument (von Kriegstein et al., 2007), audio-motor transformation (Warren et al., 2005; Rauschecker & Scott, 2009), syntax processing (Friederici, 2009), or storage and activation of lexical representations (Lau et al., 2008).
With regard to functional differences between the left and the right PAC, as well as neighbouring auditory association cortex, several studies indicate that the left auditory cortex (AC) has a higher resolution of temporal information than the right AC, and that the right AC has a higher spectral resolution than the left AC (Zatorre et al., 2002; Hyde et al., 2008). Furthermore, with regard to pitch perception, Warren et al. (2003) report that changes in pitch height as well as changes in pitch chroma (see p. 20 for description of the term ‘pitch chroma’) activate PAC, but that chroma changes involve auditory belt areas anterior of the PAC (covering parts of the planum polare) more strongly than changes in pitch height. Conversely, changes in pitch height activated auditory