Evaluating Spatial Sound Systems
Mark F. Bocko
How many speakers are enough?
[Figure: NHK 22.2 speaker layout]
Framework
Quantitatively evaluate any spatial sound system
Outline
• Correlograms
• Examples
Human Auditory System
[Figure: cochlea cross-section; labels: Reissner Membrane, Scala Vestibuli, Tectorial Membrane, Organ of Corti, Scala Tympani, Basilar Membrane]
©2013 by American Physiological Society
Meddis Hair Cell Model
~ Firing Probability
Around 3000 inner hair cells along the length of the basilar membrane
Neuron firing is irregular and clustered near signal peaks
Meddis Hair Cell Model
~ Firing Probability
Spontaneous firing rate
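The idea that firing is irregular, clustered near signal peaks, and never drops below a spontaneous rate can be illustrated with a toy sketch. This is not the Meddis hair cell model: it simply treats a half-wave-rectified signal plus a constant floor as a per-sample firing probability and draws random firing events from it. The function name `firing_probability`, the `spontaneous` and `gain` values, the 500 Hz tone, and the 48 kHz sample rate are all assumptions chosen for illustration.

```python
import math
import random


def firing_probability(x, spontaneous=0.01, gain=0.2):
    """Toy per-sample firing probability (NOT the Meddis equations).

    A constant spontaneous floor plus a term proportional to the
    half-wave-rectified input, clipped to 1.
    """
    return min(1.0, spontaneous + gain * max(x, 0.0))


fs = 48000          # assumed sample rate (Hz)
tone_hz = 500.0     # assumed test tone
n_samples = 4800    # 100 msec of signal

rng = random.Random(0)
signal = [math.sin(2.0 * math.pi * tone_hz * n / fs) for n in range(n_samples)]

# Draw one random firing decision per sample.
spikes = [rng.random() < firing_probability(x) for x in signal]

# Count firing events near the positive peaks and in the troughs.
near_peak = sum(1 for x, s in zip(signal, spikes) if s and x > 0.5)
in_trough = sum(1 for x, s in zip(signal, spikes) if s and x < -0.5)
print("events near peaks:", near_peak, "| events in troughs:", in_trough)
```

In this sketch the events in the troughs come only from the spontaneous floor, while the events near the peaks are driven by the signal, so the two counts differ substantially even though individual firing times are random.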
Binaural Fusion Model
[Figure labels: From left ear, From right ear, Low Freq, High Freq, Output, Site of Binaural Fusion, To right cochlea, To left cochlea]
Represent as a bi-directional delay line
Binaural fusion mechanism: 2 msec windowed cross-correlation
[Figure: ear signals xr(t) and xl(t), with a 2 msec window W(T) of width TW applied at times t1, t2, t3]
The lag 𝜏 where the peak in the cross-correlation appears is the Interaural Time Difference
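A minimal sketch of the windowed cross-correlation step, assuming a rectangular 2 msec window, a 48 kHz sample rate, and a synthetic noise source that reaches the right ear 10 samples after the left. The function name `windowed_itd` and all parameter values are hypothetical; the actual window shape W(T) used in the talk is not given here.

```python
import random


def windowed_itd(x_left, x_right, start, window_len, max_lag, fs):
    """Return the lag (in seconds) of the cross-correlation peak for one window.

    A rectangular window of `window_len` samples is applied to the left signal.
    A positive result means the right-ear signal lags the left-ear signal.
    """
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(start, start + window_len):
            m = n + lag
            if 0 <= m < len(x_right):
                total += x_left[n] * x_right[m]
        if total > best_val:
            best_lag, best_val = lag, total
    return best_lag / fs


fs = 48000        # assumed sample rate (Hz)
window_len = 96   # 2 msec at 48 kHz
delay = 10        # assumed true delay at the right ear, in samples

rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(400)]
x_left = source
x_right = [0.0] * delay + source[:-delay]  # same noise, `delay` samples later

itd = windowed_itd(x_left, x_right, start=100,
                   window_len=window_len, max_lag=40, fs=fs)
print(f"estimated ITD = {itd * 1e6:.0f} microseconds")
```

With these assumed values the true delay is 10 samples, or about 208 microseconds, and the peak of the windowed cross-correlation falls at that lag.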
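ITD
[Figure: head of diameter d, with the source at angle θ]
c is the speed of sound
d is the head diameter, as labeled in the figure
ITD = 0 when θ = 0
ITD = (3/2)*(d/c) when θ = 90°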
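To put a number on the two limiting cases, here is a small sketch. The slide only states the values at 0° and 90°; the sin(θ) interpolation between them is an assumption (it matches the common low-frequency approximation), as are the head diameter of 0.18 m and the speed of sound of 343 m/s. The function name `itd_seconds` is hypothetical.

```python
import math


def itd_seconds(theta_deg, d=0.18, c=343.0):
    """ITD = (3/2) * (d / c) * sin(theta).

    The sin(theta) dependence between the two stated end points is an
    assumption; d (head diameter, m) and c (speed of sound, m/s) are
    illustrative defaults.
    """
    return 1.5 * (d / c) * math.sin(math.radians(theta_deg))


for theta in (0, 30, 60, 90):
    print(f"theta = {theta:2d} deg -> ITD = {itd_seconds(theta) * 1e3:.3f} msec")
```

With these assumed dimensions the maximum ITD is a little under 0.8 msec, which is well inside a 2 msec correlation window.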
Demonstration of lateralization as a function of noise burst duration
• Play a series of uncorrelated stereo noise bursts of decreasing duration (2 sec, 1 sec, 0.5 sec, 0.2 sec, 0.1 sec, 50 msec, 20 msec, 10 msec, 5 msec, 2 msec, 1 msec)
• At about 2 msec and less, each burst is identified with a specific location
• The cross-correlation function always has a peak somewhere, but it is different each time
• The auditory percept being computed by the brain is updated about every 2 milliseconds
[Figure: series of uncorrelated 2 msec stereo noise bursts]
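The point that uncorrelated bursts still produce a cross-correlation peak, at a different lag each time, can be checked numerically. This is a sketch under stated assumptions: Gaussian noise, a 48 kHz sample rate, a lag search range of about ±0.75 msec, and a hypothetical helper `peak_lag`. It models only the correlation step, not the listener.

```python
import random


def peak_lag(left, right, max_lag):
    """Return the lag (in samples) at which the cross-correlation is largest."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = sum(left[n] * right[n + lag]
                    for n in range(len(left))
                    if 0 <= n + lag < len(right))
        if total > best_val:
            best_lag, best_val = lag, total
    return best_lag


fs = 48000                    # assumed sample rate (Hz)
burst_len = round(0.002 * fs) # one 2 msec burst
max_lag = 36                  # search about +/- 0.75 msec

rng = random.Random(1)
lags = []
for _ in range(8):
    # Left and right channels are independent noise: no true ITD exists.
    left = [rng.gauss(0.0, 1.0) for _ in range(burst_len)]
    right = [rng.gauss(0.0, 1.0) for _ in range(burst_len)]
    lags.append(peak_lag(left, right, max_lag))

print("peak lag per burst (samples):", lags)
```

Each burst yields some peak lag, which is what would be read as a location, but the lags scatter from burst to burst because nothing ties the two channels together.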
Auditory “Sluggishness”
[Figure: plots with Frequency and Delay axes; panels: “L” click; Stereo speaker pair – center panning (anechoic conditions)]
2-D (ITD & frequency) map encodes source location
Brain decodes these maps to source locations
ITD → lateral position of source
Frequency dependence → source elevation
Procedure
• For a given head model …
• Compute the reference correlograms for all possible sound source directions
• Specify the multi-channel reproduction system, the influence of the room, and the signals driving each speaker (for whatever method you choose)
• Compute the resulting correlogram
• Project the computed correlogram onto the reference set to infer the direction
• One may infer a superposition of source directions
• Specific methods
• Decompose into spherical harmonics (orthogonality helps)
• Error minimization
• Machine learning
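The projection step in the procedure above can be sketched as a nearest-match search. This is only one simple possibility, not necessarily the method used in the talk: each correlogram is reduced to a 1-D profile over lag bins (the frequency axis is ignored), the reference set is synthetic, and the match is scored by cosine similarity. The helpers `make_profile`, `cosine_similarity`, and `infer_direction`, and the assumption that peak lag scales with sin(azimuth), are all hypothetical.

```python
import math


def make_profile(center, lags, width=2.0):
    """Hypothetical stand-in for a correlogram: a Gaussian bump over lag bins."""
    return [math.exp(-0.5 * ((lag - center) / width) ** 2) for lag in lags]


def cosine_similarity(a, b):
    """Normalized inner product of two equal-length profiles."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm


def infer_direction(measured, references):
    """Return the reference direction whose profile best matches `measured`."""
    return max(references,
               key=lambda az: cosine_similarity(measured, references[az]))


lags = list(range(-30, 31))

# Reference set: one profile per candidate azimuth in degrees.
# Assumption: the peak lag scales with sin(azimuth).
references = {az: make_profile(30 * math.sin(math.radians(az)), lags)
              for az in range(-90, 91, 15)}

# "Measured" profile: a broadened image of a source at +30 degrees,
# loosely imitating what a reproduction system in a room might produce.
measured = make_profile(30 * math.sin(math.radians(30)), lags, width=4.0)

estimate = infer_direction(measured, references)
print("inferred azimuth (degrees):", estimate)
```

Returning the full list of similarity scores instead of only the best match would be one way to expose a superposition of source directions.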
So how does the method work? … assessing the effect of reverberation
Aula Carolina (Aachen)
Reverberation broadens the source image
Blauert, J., “Spatial Hearing: The Psychophysics of Human Sound Localization”, MIT Press 1983.
Spatial acuity with one ear!
If you don’t believe the cross-correlation model, look at this!
Modeling Stereo Reproduction
[Figure: left and right speakers S_l and S_r]
$\tilde{R}_{RL}(t,\tau) = R_c(t,\tau) + f^{2}(\omega)\, R_c(t,\tau)$
Main Points
• Integrated a quantitative neurological model into a spatial audio analysis tool
• Randomness of auditory nerve firing events is important
• Predicts measured angular acuity
• Two time scales are in play
• Short ( ~ 2 msec) window for cross correlation in brainstem
• Longer ( ~ 100 msec) histogram “memory” (higher level processing)
• We can predict what a listener will tell you they hear
• Location and spread of sound source
• There’s a lot left to do …
• Integrate with room modeling software for a complete analysis package
• Create synthesis tools – find the designs and algorithms that best reproduce a desired spatial sound effect
• Continue to refine auditory models
• Distance cues
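The two time scales in the main points can be illustrated together: a 2 msec cross-correlation window produces one noisy lag estimate, and about 100 msec of those estimates are pooled into a histogram. This is a sketch under assumptions, not the talk's model: it skips the hair cell stage, uses a rectangular window, adds independent Gaussian noise at each ear to make single-window estimates unreliable, and uses hypothetical names (`peak_lag`, `true_delay`) and a 48 kHz sample rate.

```python
import random
from collections import Counter


def peak_lag(left, right, start, window_len, max_lag):
    """Lag (in samples) of the cross-correlation peak for one short window."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = sum(left[n] * right[n + lag]
                    for n in range(start, start + window_len))
        if total > best_val:
            best_lag, best_val = lag, total
    return best_lag


window_len = 96   # short time scale: 2 msec at an assumed 48 kHz
n_windows = 50    # long time scale: 50 x 2 msec = 100 msec of "memory"
max_lag = 36
true_delay = 12   # assumed delay of the right ear, in samples

rng = random.Random(0)
total_len = n_windows * window_len + 2 * max_lag + true_delay
source = [rng.gauss(0.0, 1.0) for _ in range(total_len)]

# Each ear hears the source plus its own independent noise.
left = [s + rng.gauss(0.0, 1.0) for s in source]
right = [(source[n - true_delay] if n >= true_delay else 0.0)
         + rng.gauss(0.0, 1.0) for n in range(total_len)]

# Pool the per-window peak lags into a histogram.
histogram = Counter()
for w in range(n_windows):
    start = max_lag + w * window_len
    histogram[peak_lag(left, right, start, window_len, max_lag)] += 1

most_common_lag, count = histogram.most_common(1)[0]
print("histogram mode:", most_common_lag, "samples, seen in", count, "windows")
```

Individual windows can land on the wrong lag, but the histogram mode settles on the true delay, and the width of the histogram gives a rough picture of the spread of the source image.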
END
Cochlea
Cross-correlation (similarity of two signals)
[Figure: the sequence [y1 y2 y3] slides past [x1 x2 x3]; at each lag (-2, -1, 0, 1, 2) the overlapping samples are multiplied and summed]
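A worked example of cross-correlation for two three-sample sequences at lags -2 to 2. The function name `cross_correlation`, the sample values, and the sign convention (positive lag means y is a delayed copy of x) are choices made for this sketch.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for each integer lag.

    Samples that fall outside either sequence contribute zero.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                total += xn * y[m]
        result[lag] = total
    return result


x = [3.0, 1.0, 0.0]
y = [0.0, 3.0, 1.0]   # x delayed by one sample

r = cross_correlation(x, y, max_lag=2)
best_lag = max(r, key=r.get)

for lag in sorted(r):
    print(f"lag {lag:+d}: {r[lag]}")
print("peak at lag", best_lag)
```

The values are 0, 0, 3, 10, 3 for lags -2 through 2, so the peak sits at lag 1, the delay that was built into y.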