
Spatial Sound - Technologies and Psychoacoustics: This Tutorial

This document provides an overview of spatial sound, including acoustics, psychoacoustics, and technologies. It discusses directional characteristics of sound sources and how sound propagates in rooms. Head-related transfer functions describe how a listener's ears and body affect the sound field. Psychoacoustics examines how humans perceive spatial attributes like localization, distance, and room characteristics. Technologies aim to recreate these spatial cues for realistic spatial audio reproduction.

Uploaded by

Oscar
Copyright
© Attribution Non-Commercial (BY-NC)

Spatial sound

Technologies and Psychoacoustics


IEEE Winter School 2012, FORTH, Crete
Ville Pulkki
Dept. of Signal Processing and Acoustics, Aalto University
January 2012
This tutorial
Overview of spatial sound
acoustics
psychoacoustics
technologies
Emphasis on reproduction, and on loudspeaker techniques.
Viewpoint is more or less mine/ours.
Spatial sound Technologies and Psychoacoustics 2/145
Pulkki Jan 2012
Aalto University IEEE Winter School 2012
Spatial sound reproduction
Is spatial sound important?
Listening experiment by Rumsey et al. [JASA 118(2)]
Listening to surround (5.1) audio tracks
compare to monophonic downmix
compare to stereo downmix
compare to different low-pass filtered versions in loudspeaker channels
What are the relative contributions of timbral and spatial quality?
Relative contributions to total fidelity
[Figure: bar chart of the relative contributions of timbral and spatial fidelity; the percentage values are not recoverable from the extracted text]
Is this important? Part II
I would say yes, because
Stereo and 5.1 have taken over mono
Concert halls - the spaces for sound
We have two ears, and thus spatial hearing capabilities
This tutorial
Overview of spatial sound
acoustics
psychoacoustics
technologies
Traditional acoustics
Physics of sound in enclosed spaces
Directional characteristics of sound sources
Sound propagation paths in room
Acoustics of the listener: head-related transfer functions
Directional characteristics of sources
Sound sources have frequency-dependent directional patterns
Sound radiation differs between directions
Typically omnidirectional at low frequencies, and something else at high frequencies
Directional characteristics of flute
© Pätynen & Lokki 2009
irregular behavior with direction
Sound propagation paths
© L. Savioja
Source emits sound in different directions
Direct propagation path
Reflections, diffraction, reverberation
The traveling path of sound can be modeled with, e.g., the image source method
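A minimal sketch of the image source idea named above, assuming a rectangular ("shoebox") room with one corner at the origin and first-order reflections only. The function names, the room model and the speed of sound of 343 m/s are illustrative assumptions, not taken from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_image_sources(source, room):
    """Mirror the source across each of the six walls of a shoebox room.

    source: (x, y, z) position in metres
    room:   (Lx, Ly, Lz) room dimensions, walls at 0 and L on each axis
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            # Mirroring across a wall at coordinate `wall` maps x to 2*wall - x.
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def arrival_delay(point_a, point_b):
    """Propagation delay in seconds over the straight path between two points."""
    return math.dist(point_a, point_b) / SPEED_OF_SOUND


if __name__ == "__main__":
    src, mic, room = (1.0, 1.0, 1.0), (4.0, 3.0, 1.5), (5.0, 4.0, 3.0)
    print("direct: %.2f ms" % (1000.0 * arrival_delay(src, mic)))
    for img in first_order_image_sources(src, room):
        # The path length from an image source equals the reflected path length.
        print(img, "%.2f ms" % (1000.0 * arrival_delay(img, mic)))
```

Each image source stands for one specular reflection; higher orders would be obtained by mirroring the images again, which this sketch does not attempt.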
Room impulse response
© L. Savioja
Directional analysis of IR
Intensity vectors plotted with time-frequency IR. © Merimaa & Pulkki 2005
Diffuseness analysis of IR
Diffuseness plotted with time-frequency IR. © Merimaa & Pulkki 2005
Effect of the listener on sound
Presence of the listener changes the acoustical field
Ears on different sides of the head capture different pressure fields
The sound in ears depends on the direction-of-arrival of sound
Transfer function from sound source to ear canal
Head Related Impulse Response (HRIR)
Head Related Transfer Function (HRTF)
© Duda: http://interface.cipic.ucdavis.edu/CIL_tutorial/
HRTF measurements
© Algazi et al.: http://interface.cipic.ucdavis.edu/
HRTF measurements at Aalto University
loudspeaker movable in azimuth and elevation
azimuth rotation is silent
measure HRTFs with a moving loudspeaker
swept-sine measurement
HRIR dependence on sound source direction
[Figure: head-related impulse responses at the left and right ear, panels a) to c), for source directions labeled 0/0, 60/0 and 0/60 degrees; time axis 0 to 2 ms, amplitude axis -0.2 to 0.2]
© M. Karjalainen
HRIR dependence on azimuth angle
© Algazi et al.: http://interface.cipic.ucdavis.edu/
HRIR dependence on elevation
© Algazi et al.: http://interface.cipic.ucdavis.edu/
HRTF spectrum
[Figure: HRTF magnitude spectra at the left and right ear, panels a) to c), for source directions labeled 0/0, 60/0 and 0/60 degrees; frequency axis 10^2 to 10^4 Hz, magnitude axis -40 to 0 dB]
© M. Karjalainen
Conclusions from Acoustics section
Sound arriving at the listener's ears is largely affected by the space
Directional characteristics of the source
Geometry of room, surface materials of the room
HRTFs of listener
This tutorial
Acoustics
Psychoacoustics
Technologies
Psychoacoustics
Basic psychoacoustic attributes
Loudness: silent to loud
Pitch: low to high
Duration: short to long
Timbre: spectrum with time
Spatial attributes: location, room
Which spatial attributes can we perceive?
Localization
Left/right
Up/down/front/back
Distance
Inter-aural coherence
Source extent
Room
Envelopment of the room
Room dimensions
Spaciousness
Some spatial attributes for concert halls
intimacy
liveness
reverberant sound level
diffusion or uniformity
spatial impression
Wine tasting palette
Different experts have different palettes, at least so far
Psychoacoustics
Localization of point sources
Direction
Left/right
Front/up/back/down
Distance
Evolution: to localize danger or to localize prey
Directional resolution in horizontal plane
© M. Karjalainen / J. Blauert
Directional resolution in median plane
© M. Karjalainen / J. Blauert
Directional hearing
All perceived sound objects have location information embedded.
How do we decode the location?
Binaural cues
Inter-aural time difference
Inter-aural level difference
Monaural spectrum
Effect of head rotation on binaural cues
Precedence effect
Binaural cues
Interaural Time Difference
ITD
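To give a feel for the magnitude of the ITD, here is a rough sketch using Woodworth's spherical-head approximation. That formula, the head radius of 8.75 cm and the speed of sound of 343 m/s are assumptions brought in for illustration; the slides do not state them.

```python
import math


def woodworth_itd(azimuth_deg, head_radius=0.0875, speed_of_sound=343.0):
    """Approximate ITD in seconds for a rigid spherical head.

    Woodworth's approximation: ITD = (a / c) * (theta + sin(theta)),
    intended for azimuths between 0 and 90 degrees from the median plane.
    head_radius (a) and speed_of_sound (c) are assumed typical values.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch covers azimuths from 0 to 90 degrees only")
    theta = math.radians(azimuth_deg)
    return head_radius / speed_of_sound * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        print("azimuth %2d deg: ITD = %.3f ms" % (az, 1000.0 * woodworth_itd(az)))
```

With these assumed values the ITD stays below one millisecond for a source directly to the side.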
Frequency dependency of ITD decoding
[Figure: left- and right-ear waveforms showing the ITD between them]
low frequencies (~200 to ~1600 Hz): time/phase delay between carriers
high frequencies (> ~1600 Hz): time delay between envelopes
Dependence of ITD on direction and frequency
[Figure: ITD (axis labeled in ms, ticks from -1 to 1 with a 10^-3 scale factor) as a function of direction (-90 to 90 degrees) and frequency (0.2 to 18.2 kHz)]
Binaural cues
Interaural Level Difference (ILD), in dB
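As a hedged illustration of the ILD, the sketch below computes a broadband level difference in dB from the energies of a left-ear and a right-ear signal. The function name and the broadband (rather than per-frequency-band) analysis are simplifying assumptions; hearing evaluates the ILD in frequency bands.

```python
import math


def ild_db(left, right):
    """Broadband interaural level difference in dB (positive: left ear louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    if energy_left == 0.0 or energy_right == 0.0:
        raise ValueError("both ear signals must contain energy")
    # Energy ratio, hence 10*log10 rather than 20*log10.
    return 10.0 * math.log10(energy_left / energy_right)


if __name__ == "__main__":
    left = [1.0, -1.0, 1.0, -1.0]
    right = [0.5 * x for x in left]  # right ear at half the amplitude
    print("ILD = %.2f dB" % ild_db(left, right))
```

Halving the amplitude at one ear corresponds to a level difference of about 6 dB.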
Dependence of ILD on direction and frequency
Cone of confusion
[Figure: cone of confusion around the interaural axis, with a sound source on its surface]
ITD and ILD determine the cone of confusion on which a sound source lies
Further cues:
the influence of pinna and torso on the monaural spectrum
the effect of head rotation on binaural cues
Effect of head rotation on ITD and ILD
coarse but prominent cue
Influence of body
Ear, head, body
Auditory spectrum and ILD changes
Auditory spectrum in the median plane
[Figure: loudness level spectrum [phon] (-30 to 30) versus frequency (0.2 to 18.2 kHz) for median-plane elevations from -30 to 90 degrees; panels: Subject 1, Subject 2]
The direction-independent part has been removed.
Auditory spectrum in the median plane
[Figure: loudness level spectrum [phon] (-30 to 30) versus frequency (0.2 to 18.2 kHz) for median-plane elevations from -30 to 90 degrees; panels: Subject 3, Subject 4]
Humans adapt to their ears
Perception of sound source extent
limited capabilities
generally humans can perceive a single ITD and ILD in a single frequency band
wide sound sources with short sound bursts are perceived as narrow
sound from different directions is perceived as a wide source when
incoherent sound arrives from different directions
sound from different directions is fused into a too narrow sound object
coherent sound from different directions
different bands of broad-band sound
Perception of sound source distance
Limited capabilities
loudness
direct to reverberant ratio
timbre
short distances: excess ILD
Ventriloquist effects
Vision affects directional hearing a lot
if you see the source, you localize sound towards the visual image
fusing is of the order of 30°
directional hearing in the median plane is bad outside the visual area
the monaural localization system is adapted with visual cues
Perception of coherence
Incoherent narrow-band sound at ear canals
low frequencies: wide, in ears
high frequencies > 1000 Hz: incoherence has very mild effect
Is coherence perception an independent mechanism or not?
Proponents of Jeffress-type auditory models say: yes. Cross-correlation between the ears explains this.
Goupell says: no. Psychoacoustic data fit better to instantaneous cue value changes.
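The cross-correlation view can be made concrete with a small sketch: a normalized interaural cross-correlation maximized over a range of lags, often used as a coherence measure. This only illustrates the cross-correlation school; it does not settle whether hearing works this way. The function name and the lag range are assumptions.

```python
import math


def interaural_coherence(left, right, max_lag):
    """Maximum absolute normalized cross-correlation over lags -max_lag..max_lag.

    Returns a value between 0 (incoherent) and 1 (fully coherent).
    """
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        raise ValueError("both ear signals must contain energy")
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # samples shifted outside the window are ignored
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best


if __name__ == "__main__":
    tone = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    delayed = [0.0] + tone[:-1]  # same signal, one sample later
    print(interaural_coherence(tone, tone, 2))
    print(interaural_coherence(tone, delayed, 2))
```

In practice the lag range would be chosen to cover plausible ITDs (about 1 ms), and the analysis would run per frequency band.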
Precedence effect
Directional cues are relevant only when the direct sound dominates.
Precedence effect
© J. Blauert
Precedence effect, thresholds
Precedence effect
Directions of reflections are not perceived
Mechanism to provide correct localization to the listener
Binaural unmasking
Listening to a signal is easier when noise is not in the same direction
Cocktail-party effect
Noise and signal have to be separated in left/right direction
Binaural detection: both brain hemispheres have auditory memory
Signals entering different ears can be rewound later
Unmasking of a speech source from diffuse noise: 3-5 dB
Unmasking of a low-frequency sinusoid from a point-like noise source: 10-15 dB
Binaural models
Try to predict spatial sound perception by humans
About three different schools
Cross-correlation-based
Equalization-cancellation models
Cross-hemisphere models (some recent activity at TKK)
No consensus yet, although cross-correlation models were dominant for some time
Beyond the scope of this tutorial
How much do we understand of psychoacoustics?
Spatial audio quality
Could we estimate audio quality with binaural auditory models?
localization: ok
timbre: maybe
spaciousness: ???
sense of space: ???
extent of sound sources: no
preference: no
Listening tests
The most comprehensive way is to conduct listening tests
Expensive, time-consuming
Can't derive reproduction algorithms from them
Conclusions from Psychoacoustics section
We can
localize point-like sound sources
perceive source extent to some degree
hear room properties to a certain extent
listen selectively within left/right direction
We can use binaural models to predict some spatial hearing phenomena, though not all attributes can be estimated yet.
This tutorial
Acoustics
Psychoacoustics
Technologies
Spatial sound technologies
Development paradigms
Binaural technologies
Techniques for loudspeaker listening
Coding of spatial audio
Spatial sound technologies not covered in this tutorial
What is the target in development?
Paradigms to develop spatial sound technologies
Sound field reconstruction
Trial-and-error methods
Perceptual hypothesis
Sound field reconstruction
© Fazi, Nelson, ISVR, University of Southampton
Sound field reconstruction
Error term: difference of the reconstructed sound field from the original sound field
Clear, mathematically defined error term
Possible to derive techniques starting from physics
If error vanishes, perfect quality
Non-zero error term does not estimate perceived quality
May lead to overly complicated solutions
Wave field synthesis, Higher-order Ambisonics, LSM method
Trial-and-error approach
Make an educated guess (just try something)
If it works, use it
Try to understand why it works
Amplitude panning
Perceptual hypothesis approach
Knowledge from psychoacoustics
Make assumptions
Derive the method with the assumptions
Listen to the system and tune it
Directional Audio Coding
Rules of thumb for spatial audio development
Don't touch timbre
If you produce coloration in any listening position, forget it
Directional error can be tolerated
as long as a left sound source stays on the left
Make loudspeaker channels as incoherent as possible
reverberation quality is improved
Spatial sound technologies
Development paradigms
Binaural technologies
Techniques for loudspeaker listening
Coding of spatial audio
Spatial sound technologies not covered in this tutorial
Binaural recording, headphone playback
Real head with microphones
Binaural microphone (dummy head)
This looks like a simple, inexpensive and working system!?
Binaural microphones
Dummy heads
Binaural microphones
with real head
Binaural recording, headphone playback
careful microphone and headphone equalization
binaural cues and auditory spectrum reproduced as they were in the recording
in some cases this is an appealing solution
Applications: personalized recording, academic use, noise measurements, augmented reality audio
Binaural recording
Challenges
equalization is problematic
listener head movements do not change the binaural cues as they would in natural listening
inside-head localization
front-back confusions
vision conflicts with audition
works best only with recordings made with your own head
Binaural recording, loudspeaker playback
© CIPIC
Left loudspeaker should correspond to left ear canal, and right equally
Cross-talk is a problem
Binaural recording, cross-talk cancelled playback
© JVC
very small listening area in an anechoic listening room
good in some special cases
back-front confusions
Binaural synthesis
convolve monophonic sound tracks with measured [individual] HRTFs
auditory objects can be positioned in 3D virtual space
inside-head localization, front-back confusions
need for individual HRTFs
head tracking may be used to resolve this
virtual reality, gaming, aviation
© Bill Gardner
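The convolution step of binaural synthesis can be sketched as follows. This is a minimal pure-Python illustration, assuming the HRIR pair for the wanted direction is already available as two lists of samples; the function names and the toy impulse responses are hypothetical, and a real system would use measured HRIRs and a fast (FFT-based) convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_synthesis(mono, hrir_left, hrir_right):
    """Filter a mono track with the left and right HRIR of one direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


if __name__ == "__main__":
    mono = [1.0, 0.5]
    # Toy HRIRs (assumed, not measured): the right ear receives the sound
    # one sample later and at half the amplitude, i.e. a source on the left.
    left, right = binaural_synthesis(mono, [1.0], [0.0, 0.5])
    print(left, right)
```

With head tracking, the HRIR pair would be switched or interpolated as the head turns, which this sketch leaves out.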
Augmented reality audio
acoustically transparent headphones + binaural microphones
reality can be augmented [Härmä et al. 2004]
Spatial sound technologies
Development paradigms
Binaural technologies
Techniques for loudspeaker listening
Coding of spatial audio
Spatial sound technologies not covered in this tutorial
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
Monophony
Stereophony
5.1 Surround
different versions specified: 6.1, 7.1, 7.1, 10.2
3D loudspeaker setups
theaters, installations, 3D cinema
different number, different positioning
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
Time panning

loudspeaker signal delays between about 0.5 and 5 ms create interaural level differences and time differences
creates a spatially spread virtual source
used as an effect, not for positioning
Amplitude panning
Amplitude panning
a loudspeaker amplitude difference turns into an interaural time difference at low frequencies
a loudspeaker amplitude difference turns into an interaural level difference at high frequencies
does not color sound in any position, although the directional effect may be lost
the most used virtual source positioning method in the world, as mixers have it
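One common way to choose the two gains in stereo amplitude panning is the tangent law, sketched below. The slides do not name a panning law, so the tangent law, the ±30° loudspeaker base angle, the sign convention (positive angles towards the left loudspeaker) and the function name are assumptions made for illustration.

```python
import math


def tangent_law_gains(pan_deg, base_deg=30.0):
    """Stereo gains (g_left, g_right) for a virtual source at pan_deg.

    Tangent law: tan(pan) / tan(base) = (g_left - g_right) / (g_left + g_right),
    with loudspeakers at +base_deg (left) and -base_deg (right).
    The gains are normalized so that g_left**2 + g_right**2 == 1.
    """
    if abs(pan_deg) > base_deg:
        raise ValueError("panning direction must lie between the loudspeakers")
    ratio = math.tan(math.radians(pan_deg)) / math.tan(math.radians(base_deg))
    g_left, g_right = 1.0 + ratio, 1.0 - ratio  # satisfies the tangent law
    norm = math.hypot(g_left, g_right)          # constant-energy normalization
    return g_left / norm, g_right / norm


if __name__ == "__main__":
    for pan in (-30.0, -15.0, 0.0, 15.0, 30.0):
        g_l, g_r = tangent_law_gains(pan)
        print("pan %+5.1f deg: g_left = %.3f, g_right = %.3f" % (pan, g_l, g_r))
```

Both loudspeakers receive the same signal, scaled only by these gains, so the spectrum of each loudspeaker signal is unchanged.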
Virtual sources in surround
How to extend amplitude panning from stereo to multichannel?
Matrixing in surround
matrixing, e.g., first-order Ambisonics
coherent sound in all loudspeakers
nearest loudspeaker dominates
source spread is high
Pair-wise panning in surround
nearest loudspeaker dominates less
source spread depends on panning direction
Even spreading in pair-wise panning
Multiple-direction amplitude panning [Pulkki]
gain factors computed for multiple directions
gains averaged for each loudspeaker
has no effect when all panning directions lie between the same loudspeakers
when a loudspeaker lies between the panning directions, the source spreads slightly
Amplitude panning for 3D loudspeaker setups
Matrixing - even more artifacts
triplet-wise panning
Higher order Ambisonics (HOA), discussed later
Triplet-wise panning
Divide setup into non-overlapping triangles
Select the triplet inside which the virtual source lies
Vector base amplitude panning, VBAP
[Figure: loudspeakers n, m and k forming a triangle, with the virtual source inside it]
$$\mathbf{g} = [p_1 \; p_2 \; p_3] \begin{bmatrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \\ l_{31} & l_{32} & l_{33} \end{bmatrix}^{-1}$$
Normalization ||g|| = 1
Normalized barycentric coordinates as gain factors [Pulkki-97]
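The gain formula of this slide can be sketched in a few lines. The sketch assumes that p is the unit vector towards the virtual source and that each row of the 3x3 matrix is the unit vector of one loudspeaker in the triplet; instead of an explicit matrix inverse it uses Cramer's rule, which solves the same system. Function names, the coordinate convention and the example loudspeaker directions are illustrative assumptions.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Cartesian unit vector for a direction (x front, y left, z up; assumed)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def _triple(a, b, c):
    """Scalar triple product a . (b x c), i.e. the determinant with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_gains(p, l1, l2, l3):
    """Solve p = g1*l1 + g2*l2 + g3*l3 (equivalent to g = p L^-1), then set ||g|| = 1.

    A negative gain means the source lies outside this loudspeaker triplet.
    """
    det = _triple(l1, l2, l3)
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker vectors must not lie in one plane")
    g = (_triple(p, l2, l3) / det, _triple(l1, p, l3) / det, _triple(l1, l2, p) / det)
    norm = math.sqrt(sum(x * x for x in g))
    return tuple(x / norm for x in g)


if __name__ == "__main__":
    speakers = (unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45))
    print(vbap_gains(unit_vector(0, 0), *speakers))   # between the two lower loudspeakers
    print(vbap_gains(unit_vector(0, 20), *speakers))  # inside the triangle
```

When the virtual source coincides with a loudspeaker direction, only that loudspeaker receives a non-zero gain.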
Vector base amplitude panning, VBAP
5-10 implementations for different platforms have been quite successful
brutal resilience to different loudspeaker setups
the number of loudspeakers needed is 6-60, which is ok in many cases
left/right direction quality ok
up/down direction quality individual
does not color sound (!!!)
quality degrades smoothly outside the best listening position
does not recreate the sound field or wave field curvature
cannot bring virtual sources inside the array
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
Microphone polar patterns
[Figure: polar patterns of an omnidirectional microphone and a figure-of-eight microphone (positive front lobe, negative back lobe)]
Multiple coincident microphones: polar patterns are additive.
Microphone polar patterns
[Figure: polar patterns of a cardioid microphone and a hypercardioid microphone (positive front lobe, negative back lobe)]
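The four patterns shown on these two slides can be written as a weighted sum of an omnidirectional and a figure-of-eight pattern, which is one way to read the remark that polar patterns of coincident microphones are additive. The sketch below uses the standard first-order form a + (1 - a) cos θ; the weights listed (1, 0.5, 0.25, 0) are the usual textbook values, assumed here rather than taken from the slides.

```python
import math

# Weight a of the omnidirectional component; (1 - a) weights the figure of eight.
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "hypercardioid": 0.25,
    "figure_of_eight": 0.0,
}


def first_order_gain(theta_deg, omni_weight):
    """Directional gain a + (1 - a) * cos(theta); negative values mean inverted phase."""
    return omni_weight + (1.0 - omni_weight) * math.cos(math.radians(theta_deg))


if __name__ == "__main__":
    for name, a in PATTERNS.items():
        row = ["%+.2f" % first_order_gain(theta, a) for theta in (0, 90, 180)]
        print("%-16s front/side/back: %s" % (name, " ".join(row)))
```

The cardioid has its null directly behind, while the hypercardioid and the figure of eight show a negative back lobe.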
Coincident microphones for stereo
implement amplitude panning
point-like virtual sources
directional patterns favor front
tend to decrease reverberation in listening
Spaced microphones for stereo
implement amplitude panning + time panning
spread virtual sources
capture all directions depending on setup
more spacious reverberation
Spaced microphone arrays for multichannel
Decca tree
Spaced microphone arrays for multichannel
Fukada tree
Spaced microphone arrays for multichannel
number of microphones equals the number of loudspeakers
no well-established system
spaced array has to be tuned to the recording space
directions reproduced a bit diffuse
reverberation perceived spacious and enveloping
no/low coloration artifacts
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
B-format recording
B-format microphones
Omni + 3 dipoles on the Cartesian axes
Steerable first-order microphone
Cardioid or hypercardioid for each loudspeaker
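A steerable first-order microphone can be sketched as a weighted sum of the omnidirectional signal and the three dipole signals. The sketch assumes an unscaled omni channel W (no -3 dB convention) and dipoles X, Y, Z with unit gain on their axes; real B-format conventions differ in the scaling of W, so treat the encoder, the function names and the weights as illustrative assumptions.

```python
import math


def _direction(azimuth_deg, elevation_deg):
    """Unit vector (x front, y left, z up; assumed convention)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el)


def encode_plane_wave(signal, azimuth_deg, elevation_deg=0.0):
    """Idealized B-format (W, X, Y, Z) of a plane wave; W is left unscaled here."""
    dx, dy, dz = _direction(azimuth_deg, elevation_deg)
    w = list(signal)
    return w, [dx * s for s in signal], [dy * s for s in signal], [dz * s for s in signal]


def virtual_microphone(w, x, y, z, azimuth_deg, elevation_deg=0.0, omni_weight=0.5):
    """First-order microphone steered to a direction.

    omni_weight = 0.5 gives a cardioid, 0.25 a hypercardioid (usual values).
    """
    dx, dy, dz = _direction(azimuth_deg, elevation_deg)
    return [omni_weight * wi + (1.0 - omni_weight) * (dx * xi + dy * yi + dz * zi)
            for wi, xi, yi, zi in zip(w, x, y, z)]


if __name__ == "__main__":
    b_format = encode_plane_wave([1.0, -0.5], azimuth_deg=90.0)
    for speaker_azimuth in (90.0, 0.0, -90.0):
        print(speaker_azimuth, virtual_microphone(*b_format, azimuth_deg=speaker_azimuth))
```

Pointing one such virtual cardioid at each loudspeaker gives loudspeaker signals that differ only in gain, which is consistent with the high coherence between channels noted for first-order Ambisonics.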
B-format recording
www.soundfield.com
First-order Ambisonics
[Gerzon, 1970s]
A signal for each loudspeaker is decoded from B-format
Loudspeaker channels are relatively coherent
Coloring
OK quality in best listening position, and in good listening room
The nearest loudspeaker dominates outside the best listening position
Higher-order microphone patterns
Higher-order Ambisonics benefits
lower coherence between loudspeaker signals
more stable localization in a larger listening area
better overall quality
Higher-order Ambisonics challenges
high directivity can be obtained in a limited frequency window
low-frequency noise
number of transducers 8-24 at least
quality of transducers has to be high
more expensive microphones
microphones still under development
Commercial higher-order microphone array
www.trinnov-audio.com
Different patterns at different frequencies (order 1-3 ??)
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
Wave field synthesis
[Snow 1955, Berkhout 1988]
Recreate the same acoustical field as in the original space
Original idea: curtain of microphones, curtain of loudspeakers
Nowadays: a dense loudspeaker array around the listeners
Lots of academic interest
Some commercial applications
Wave field synthesis
Figure © IRCAM web pages
Wave field synthesis
Large listening area
Virtual sources can be positioned inside the loudspeaker array in some cases
Wave front curvature is reproduced correctly
perspective is correct
human capability to perceive curvature is limited
Typical horizontal setups consist of about 100-200 loudspeakers
3D setups would require about 100,000 loudspeakers !!!
Wave field synthesis
Complications
very expensive systems
recording is not possible, although it was the original idea
microphone directional patterns should be matched with
loudspeaker directional patterns, which is not possible
positioning tool for virtual sources and for reverberation
a number of artifacts: coloration, WF truncation, etc.
some of the artifacts may exist only in theory
Least squares method
Techniques for loudspeaker listening
Loudspeaker layouts
Virtual source positioning
Microphone techniques
Ambisonics
Wave field synthesis
Signal-dependent methods
Faller's system
Directional Audio Coding (DirAC)
Signal-dependent processing for multichannel
eliminate signal X1 from X2 based on cross correlation
Signal-dependent processing for multichannel
frequency-band processing
Wiener filtering
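The slides give no formula for this step. One common reading, offered here only as an assumption, is a least-squares (Wiener-type) estimate per frequency band: the part of X2 that can be predicted from X1 is found from their cross-correlation and subtracted. The sketch works on the complex short-time spectrum values of a single band; `remove_correlated` is a hypothetical name.

```python
def remove_correlated(x1, x2, eps=1e-12):
    """Subtract from x2 the component that is linearly predictable from x1.

    x1, x2: lists of complex STFT values of one frequency band over
    consecutive time frames. Returns (residual, h), where h is the
    least-squares weight  h = sum(x2 * conj(x1)) / sum(|x1|^2).
    """
    cross = sum(b * a.conjugate() for a, b in zip(x1, x2))
    power = sum(abs(a) ** 2 for a in x1)
    h = cross / (power + eps)          # eps guards against silent bands
    residual = [b - h * a for a, b in zip(x1, x2)]
    return residual, h

# Usage: x2 contains half of x1 plus an independent component.
x1 = [1 + 0j, 1j, -1 + 0j, -1j]
independent = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
x2 = [0.5 * a + n for a, n in zip(x1, independent)]
residual, h = remove_correlated(x1, x2)
```

In a full system the sums would be replaced by running averages so that the weight follows the signal over time, and the procedure would be repeated for every frequency band.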
Signal-dependent processing for multichannel
Arbitrary directional patterns can be formed from B-format recordings
1st-order Ambisonics vs. Faller's system
Faller's system
directional patterns valid for dry, or non-diffuse sound
in a reverberant field, the response is just the original cardioid
in many recording schemes a large advantage over first-order Ambisonics is obtained
Directional Audio Coding (DirAC)
Psychoacoustics: in a single frequency channel we decode
single ITD
single ILD
when two sinusoids from two loudspeakers are near in
frequency, they cannot be localized individually
Humans are also sensitive to interaural coherence.
Directional Audio Coding (DirAC)
Hypothesis: If we reproduce correctly at each frequency band
the direction of sound and
the diffuseness of sound,
the spatial audio should be perceived with high quality.
Directional Audio Coding (DirAC)
[Pulkki07 JAES][Merimaa & Pulkki 05 JAES]
Energetic analysis of sound field
Length of intensity vector is proportional to energy density (for a plane wave)
Direction of sound is opposite to the direction of the intensity vector
Random field
Length of intensity vector is small and varies with time
Direction of intensity vector varies with time
Directional analysis
B-format microphone
Direction vector = -Intensity vector
Diffuseness estimate = 1 - (abs(Intensity) / Energy)
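The directional analysis can be sketched for one frequency band as follows. This is a minimal illustration under stated assumptions: horizontal-only B-format, a 1/sqrt(2) gain on W, physical constants (air density, speed of sound) normalised away so that a single plane wave gives diffuseness 0, and time averaging done as a plain sum over the supplied frames. The name `dirac_analysis` is illustrative.

```python
import math

def dirac_analysis(w, x, y):
    """Estimate direction of arrival and diffuseness for one frequency band.

    w, x, y: lists of complex STFT values over consecutive time frames.
    Returns (azimuth in radians, diffuseness between 0 and 1).
    """
    ix = iy = energy = 0.0
    for wi, xi, yi in zip(w, x, y):
        p = math.sqrt(2.0) * wi          # pressure, undoing the assumed W gain
        ux, uy = -xi, -yi                # particle velocity, normalised units
        ix += (p.conjugate() * ux).real  # intensity vector, summed over frames
        iy += (p.conjugate() * uy).real
        energy += 0.5 * (abs(p) ** 2 + abs(ux) ** 2 + abs(uy) ** 2)
    # Sound arrives from the direction opposite to the intensity vector.
    azimuth = math.atan2(-iy, -ix)
    diffuseness = 1.0 - math.hypot(ix, iy) / energy if energy > 0.0 else 0.0
    return azimuth, diffuseness

# Usage: a single plane wave arriving from 60 degrees.
s = [1 + 0j, 0.5j, -0.3 + 0j, 0.2 - 0.1j]
theta = math.radians(60)
w = [v / math.sqrt(2.0) for v in s]
x = [v * math.cos(theta) for v in s]
y = [v * math.sin(theta) for v in s]
azimuth, diffuseness = dirac_analysis(w, x, y)
```

For two uncorrelated sources of equal power arriving from opposite directions, the summed intensity vector cancels and the diffuseness estimate approaches 1, matching the behaviour described for a random field.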
Synthesis, version I
teleconference application
single audio channel + metadata
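A rough sketch of how a single audio channel plus direction and diffuseness metadata could be turned into loudspeaker signals in one frequency band. The split into a panned non-diffuse stream and an evenly spread diffuse stream is an assumption made for illustration, as are the function names and the pairwise amplitude panning; adjacent loudspeakers are assumed to be less than 180 degrees apart, and the decorrelation of the diffuse stream is left out.

```python
import math

def pan_2d(azimuth, speaker_azimuths):
    """Pairwise amplitude panning gains (power-normalised) on a horizontal ring."""
    n = len(speaker_azimuths)
    px, py = math.cos(azimuth), math.sin(azimuth)
    order = sorted(range(n), key=lambda i: speaker_azimuths[i] % (2 * math.pi))
    gains = [0.0] * n
    for k in range(n):
        i, j = order[k], order[(k + 1) % n]
        ax, ay = math.cos(speaker_azimuths[i]), math.sin(speaker_azimuths[i])
        bx, by = math.cos(speaker_azimuths[j]), math.sin(speaker_azimuths[j])
        det = ax * by - ay * bx
        if abs(det) < 1e-9:
            continue
        gi = (px * by - py * bx) / det   # solve p = gi * a + gj * b
        gj = (ax * py - ay * px) / det
        if gi >= -1e-9 and gj >= -1e-9:  # direction lies between this pair
            norm = math.hypot(gi, gj)
            gains[i], gains[j] = max(gi, 0.0) / norm, max(gj, 0.0) / norm
            return gains
    return gains

def synthesize_band(mono, azimuth, diffuseness, speaker_azimuths):
    """Return (direct, diffuse) loudspeaker signals for one frequency band.
    The diffuse signals should be decorrelated per loudspeaker before the
    two streams are summed; that step is omitted here."""
    n = len(speaker_azimuths)
    gains = pan_2d(azimuth, speaker_azimuths)
    direct_scale = math.sqrt(1.0 - diffuseness)
    diffuse_scale = math.sqrt(diffuseness / n)
    direct = [[direct_scale * g * v for v in mono] for g in gains]
    diffuse = [[diffuse_scale * v for v in mono] for _ in range(n)]
    return direct, diffuse

# Usage: a source halfway between the front and left loudspeakers.
speakers = [math.radians(d) for d in (0, 90, 180, 270)]
direct, diffuse = synthesize_band([2.0], math.radians(45), 0.5, speakers)
```

The two scale factors are chosen so that the total power across all loudspeakers equals the power of the transmitted channel for any diffuseness value.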
DirAC teleconferencing
DirAC Teleconferencing, low-cost microphone
W = mic2; X = mic1 - mic3; Y = mic4 - mic2
Low-cost microphone
Dipole signals X, Y contain prominent self-noise
In traditional techniques the dipole signals would be useless
In DirAC teleconferencing, dipole signals are not used as audio signals
Dipole signals are used as data, to steer the omni signal
High-quality DirAC
High-quality DirAC
can be seen as signal-dependent enhancement of first-order Ambisonics
minimizes coherence between loudspeakers for diffuse and non-diffuse sound
different effects can be applied to diffuse and non-diffuse sound
High-quality DirAC
Listening tests
Auditory virtual reality generated with 25 loudspeaker channels in an anechoic room
B-format recording, reproduction with DirAC
DirAC quality was rated as excellent [unpublished results]
Work in progress: in some settings, some small differences between reference and reproduction can be perceived.
Spatial sound technologies
Topics
Development paradigms
Binaural technologies
Techniques for loudspeaker listening
Coding of spatial audio
Spatial sound technologies not covered in this tutorial
Spatial audio coding
Idea by Faller and Baumgarte in 2002
Encode multichannel audio into mono or stereo with metadata
Metadata consists of inter-channel level and time differences
Evolved into MPEG Surround
Also: Creative with Gerzon vectors
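The encode and decode idea can be illustrated with a toy example for one frequency band: the channels are summed into a mono downmix, and the level difference of each channel relative to the first is stored as metadata in dB. This is only a sketch of the principle under simplifying assumptions (time differences and coherence are ignored, and the function names are hypothetical); it is not the MPEG Surround algorithm.

```python
import math

def encode_band(channels, eps=1e-12):
    """channels: list of equal-length sample lists for one frequency band.
    Returns (mono downmix, level differences in dB relative to channel 0)."""
    downmix = [sum(frame) for frame in zip(*channels)]
    powers = [sum(v * v for v in ch) + eps for ch in channels]
    level_diff_db = [10.0 * math.log10(p / powers[0]) for p in powers]
    return downmix, level_diff_db

def decode_band(downmix, level_diff_db):
    """Redistribute the downmix so that the decoded channels have the
    stored level differences and together keep the downmix power."""
    relative = [10.0 ** (d / 10.0) for d in level_diff_db]
    total = sum(relative)
    gains = [math.sqrt(r / total) for r in relative]
    return [[g * v for v in downmix] for g in gains]

# Usage: the second channel is 6 dB below the first.
channels = [[1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]]
downmix, level_diff_db = encode_band(channels)
decoded = decode_band(downmix, level_diff_db)
```

The decoded channels are scaled copies of the same downmix, so the level relations survive but the original channel waveforms do not; practical coders add further cues to compensate for this.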
MPEG surround encoding
www.mpegsurround.com
Stereo audio + metadata of level differences in 5.1 input
MPEG surround decoding
www.mpegsurround.com
Decode to 5.1, or stereo, or headphones
MPEG Surround vs. DirAC
What is the difference?
MPEG Surround: input is audio tracks
measures differences between the tracks
only an audio coding technique
DirAC: input is pressure and 2D or 3D velocity vector
measured from B-format recording
deals with physical quantities, direction, diffuseness
is also a microphone technique
DirAC in audio coding
Spatial sound technologies
Topics
Development paradigms
Binaural technologies
Techniques for loudspeaker listening
Coding of spatial audio
Spatial sound technologies not covered in this tutorial
Other spatial sound technologies
Reverberators
process [dry] input sound so that the listener perceives the sound as being in a virtual space
DSP structures
Convolving reverberators
Directional microphones
Capture sound selectively in direction
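A convolving reverberator, mentioned above, can be sketched as a direct convolution of the dry signal with a room impulse response, followed by a wet/dry mix. This is a minimal illustration; the function names and the `wet` parameter are assumptions, and a practical implementation would use FFT-based convolution for long responses.

```python
def convolve(dry, impulse_response):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, d in enumerate(dry):
        if d == 0.0:
            continue
        for k, h in enumerate(impulse_response):
            out[n + k] += d * h
    return out

def convolution_reverb(dry, impulse_response, wet=0.3):
    """Mix the dry signal with its convolution with a room impulse response.
    wet = 0.0 returns only the dry signal, wet = 1.0 only the reverberated one."""
    if not impulse_response:
        raise ValueError("impulse_response must not be empty")
    out = [wet * v for v in convolve(dry, impulse_response)]
    for n, d in enumerate(dry):
        out[n] += (1.0 - wet) * d
    return out

# Usage: a unit impulse through a two-tap "room" with one delayed echo.
result = convolution_reverb([1.0, 0.0], [0.0, 0.5], wet=0.5)
```

The output is longer than the input by the length of the impulse response minus one sample, which corresponds to the reverberation tail.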
Acknowledgment
The Academy of Finland (Projects #105780 and #119092) and Fraunhofer
Gesellschaft IIS have supported this work. The research leading to these results
has received funding from the European Research Council under the European
Community's Seventh Framework Programme (FP7/2007-2013) / ERC grant
agreement no [240453].
Suggested reading
Applies to all topics:
J. Blauert, Spatial hearing: the psychophysics of human sound localisation, MIT Press, Cambridge, MA, (1995).
Acoustics
Kuttruff H., Room Acoustics, Applied Science Publishers Ltd, London (1973)
Fahy F.J., Sound intensity Elsevier Applied Science 1989
Savioja L., Huopaniemi J., Lokki T., Väänänen R.: Creating Interactive Virtual Acoustic Environments JAES Volume
47 Issue 9 pp. 675-705; 1999.
Psychoacoustics
Brian C.J. Moore: An Introduction to the Psychology of Hearing Elsevier, 2003
Eds Gilkey R, Anderson T: Binaural and spatial hearing in real and virtual environments Psychology Press, 1997.
Bech S., Zacharov N: Perceptual Audio Evaluation: Theory, Method and Application John Wiley & Sons 2006.
Goupell and Hartmann: Interaural fluctuations and the detection of interaural incoherence: Bandwidth effects, JASA
119(6), 2006
Suggested reading
Binaural technologies
Møller H., Fundamentals of binaural technology, Applied Acoustics, 171-218, (1992).
Begault D. R., 3-D sound for virtual reality and multimedia, (1994).
A. Härmä, J. Jakka, M. Tikander, M. Karjalainen, T. Lokki, J. Hiipakka, and G. Lorho, "Augmented reality audio for
mobile and wearable appliances", Journal of the Audio Engineering Society (JAES), vol. 52, no. 6, pp. 618-639, June
2004.
O. Kirkeby, P.A. Nelson, H. Hamada, The "Stereo Dipole" : Binaural sound reproduction using two closely spaced
loudspeakers, Audio Engineering Society 102nd Convention, pre-print 4463 (1997).
Suggested reading
Loudspeaker techniques
V. Pulkki "Spatial Sound Generation and Perception by Amplitude Panning Techniques" TKK PhD Dissertation 2001
http://lib.tkk.fi/Diss/2001/isbn9512255324/
Berkhout, A. J.: A Holographic Approach to Acoustic Control JAES Volume 36 Issue 12 pp. 977-995; December 1988
Spors S., Rabenstein R., Ahrens, J.: The Theory of Wave Field Synthesis Revisited AES 124th Convention, Paper
Number: 7358. May 2008.
Fazi, Filippo M.; Nelson, Philip A.: The Ill-Conditioning Problem in Sound Field Reconstruction AES Convention: 123
(October 2007) Paper Number: 7244
Pulkki V. Spatial Sound Reproduction with Directional Audio Coding JAES Volume 55 Issue 6 pp. 503-516; June 2007
Merimaa J, Pulkki V: Spatial Impulse Response Rendering I-II JAES Volume 53 Issue 12, Volume 54, Issue 1; 2005
and 2006
Gerzon M: Periphony: With-Height Sound Reproduction JAES Volume 21 Issue 1 pp. 2-10; February 1973
Faller C.,: A Highly Directive 2-Capsule Based Microphone System Paper 7313 AES 123rd Convention 2007.
Villemoes, L; Herre, J; Breebaart, J; Hotho, G; Disch, S; Purnhagen, H; Kjörling, K: MPEG Surround: The Forthcoming
ISO Standard for Spatial Audio Coding 28th AES Conference 2006
Goodwin M., Jot J-M: Binaural 3-D Audio Rendering Based on Spatial Audio Scene Coding, Paper 7277, AES
Convention 123 October 2007