Summary

This document proposes a framework for quantitatively evaluating any spatial sound reproduction method: (1) specify the listening space and speaker placement; (2) specify the virtual acoustic sources to be created; (3) compute the signals driving each loudspeaker; and (4) compare the reproduced sound field to the target virtual sources to assess performance. The goal is to make spatial audio system design more deterministic and less trial-and-error by incorporating models of human binaural hearing into analysis tools. Key metrics, such as perceived source location, spatial extent, and diffuseness, correspond to what listeners report hearing.


Evaluating Spatial Sound Systems

Mark F. Bocko

Audio & Music Engineering


Audio Engineers love specs …
• Predicting which speakers will sound good …

How many speakers are enough?

(Figure: the NHK 22.2 loudspeaker layout, each speaker marked with a dollar sign)
Framework

• Quantitatively evaluate any spatial sound reproduction method in any space
• Incorporate quantitative models of binaural hearing into audio system design tools
• Identify the computable quantities that correspond to what listeners report they hear (locations, spatial extent of sources, diffuseness)
• Make the design of systems for creating spatial audio more deterministic and less trial and error
  • Both for free-space sound reproduction
  • And for headphone-based reproduction

The evaluation pipeline (sketched in code below):
1. Specify the listening space & speaker placement
2. Specify the virtual acoustic sources to be created
3. Compute the signals driving each loudspeaker (your favorite method)
4. Compute the acoustic field at the listener (directional impulse response)
5. Compute the sound field-listener interaction (head model)
6. Compute percepts (binaural fusion model)
7. Infer the virtual acoustic source properties, then compare & assess against the specified sources
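The seven steps above can be read as a small computational pipeline. Below is a minimal sketch of that loop for a single free-field source, with a spherical-head ITD model standing in for steps 4-5 and a windowed cross-correlation standing in for the binaural fusion model of step 6; all names and parameter values (FS, HEAD_D, etc.) are illustrative assumptions, not taken from the slides.

import numpy as np

FS = 48_000        # sample rate (Hz); assumed
C = 343.0          # speed of sound (m/s)
HEAD_D = 0.175     # ear-to-ear distance (m); assumed typical value

def itd_for_azimuth(theta_deg):
    # Low-frequency spherical-head model: ITD = (3/2)(d/c) sin(theta).
    return 1.5 * (HEAD_D / C) * np.sin(np.radians(theta_deg))

def render_free_field(source, theta_deg):
    # Steps 4-5 stand-in: apply the interaural delay of a free-field source.
    n = int(round(itd_for_azimuth(theta_deg) * FS))
    return np.roll(source, max(n, 0)), np.roll(source, max(-n, 0))

def perceived_azimuth(left, right, max_lag=40):
    # Steps 6-7 stand-in: peak of the interaural cross-correlation -> azimuth.
    lags = np.arange(-max_lag, max_lag + 1)
    xcorr = [np.dot(left, np.roll(right, k)) for k in lags]
    itd = lags[int(np.argmax(xcorr))] / FS
    return np.degrees(np.arcsin(np.clip(itd * C / (1.5 * HEAD_D), -1.0, 1.0)))

rng = np.random.default_rng(0)
src = rng.standard_normal(FS // 10)            # step 2: a 100 ms noise source
left, right = render_free_field(src, 30.0)     # steps 3-5
print(f"perceived azimuth ~ {perceived_azimuth(left, right):.1f} deg")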
Outline

• How the ear works – very briefly


• Meddis hair cell model

• Cross-correlation model of directional hearing

• Audio coherence and spatial hearing

• Interaural time and level differences

• Spectral coloring from source elevation

• Correlograms

• Examples
Human Auditory System
(Figure: cross-section of the cochlea, labeling the Reissner membrane, scala vestibuli, tectorial membrane, organ of Corti, scala tympani, and basilar membrane)
(Figure: ©2013 by American Physiological Society)
Meddis Hair Cell Model

• Model output ~ firing probability
• Around 3000 inner hair cells lie along the length of the basilar membrane
• Neuron firing is irregular and clustered near signal peaks (see the sketch below)
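For readers who want to experiment, here is a minimal sketch of a Meddis-style transmitter-reservoir model (free transmitter q, cleft contents c, reprocessing store w), integrated with forward Euler. The equation structure follows Meddis (1986); the parameter values below are illustrative choices, not necessarily the published set.

import numpy as np

def meddis_hair_cell(stimulus, fs, A=5.0, B=300.0, g=2000.0, y=5.05,
                     l=2500.0, r=6580.0, x=66.31, M=1.0, h=50000.0):
    # Returns an instantaneous firing probability per sample.
    dt = 1.0 / fs
    # Start the reservoirs at their silence (s = 0) steady state.
    k0 = g * A / (A + B)
    c = M * y * k0 / (l * k0 + y * (l + r))
    q = c * (l + r) / k0
    w = c * r / x
    prob = np.empty(len(stimulus))
    for i, s in enumerate(stimulus):
        k = g * max(s + A, 0.0) / (s + A + B)    # membrane permeability
        dq = (y * (M - q) + x * w - k * q) * dt  # replenish + reuptake - release
        dc = (k * q - (l + r) * c) * dt          # release - loss - reprocessing
        dw = (r * c - x * w) * dt                # reprocessing store
        q, c, w = q + dq, c + dc, w + dw
        prob[i] = min(h * c * dt, 1.0)           # firing probability this sample
    return prob

fs = 20_000
t = np.arange(0, 0.1, 1 / fs)
tone = 30.0 * np.maximum(np.sin(2 * np.pi * 500 * t), 0)  # rectified drive
p = meddis_hair_cell(tone, fs)
print(f"spontaneous p: {p[0]:.2e}, onset peak p: {p.max():.2e}")

Running this shows the two behaviors the slides emphasize: a nonzero spontaneous probability before the tone, and an onset peak that adapts as the free-transmitter pool depletes, so firing clusters near signal peaks.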
Meddis Hair Cell Model (continued)

• Model output ~ firing probability
• The model also reproduces the spontaneous firing rate seen in the absence of a stimulus
Binaural Fusion Model

(Figure: low- and high-frequency channels from the left and right ears converge on the site of binaural fusion; outputs are tapped along the way)

Represent the site of binaural fusion as a bi-directional delay line, fed from the left cochlea on one side and the right cochlea on the other.
Binaural fusion mechanism → 2 msec windowed cross-correlation

(Figure: delay lines carrying xr(t) and xl(t) from the right and left ears; a sliding window W(T) of width TW ≈ 2 msec is applied at successive times t1, t2, t3, and the windowed signals are cross-correlated)

The lag τ at which the peak of the cross-correlation appears is the interaural time difference (ITD).

Jeffress, L. A. (1948). A place theory of sound localization. Journal of Comparative and Physiological Psychology, 41(1), 35.
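A minimal sketch of the 2 msec windowed cross-correlation, operating on plain sample streams rather than modeled nerve signals; the window length and the +/-0.8 ms lag range are assumptions.

import numpy as np

def windowed_itd(xl, xr, fs, win_s=0.002, max_itd_s=0.0008):
    # Slide a short window along both ear signals; in each window, take the
    # lag of the cross-correlation peak. Returns one ITD estimate per window.
    win = int(win_s * fs)
    max_lag = int(max_itd_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    itds = []
    for start in range(max_lag, len(xl) - win - max_lag, win):
        seg_l = xl[start:start + win]
        xc = [np.dot(seg_l, xr[start + k:start + k + win]) for k in lags]
        itds.append(lags[int(np.argmax(xc))] / fs)
    return np.array(itds)

fs = 48_000
rng = np.random.default_rng(1)
src = rng.standard_normal(fs)              # 1 s of noise
delay = 24                                 # a 0.5 ms interaural delay
xl, xr = src[delay:], src[:-delay]
print(windowed_itd(xl, xr, fs)[:5])        # estimates cluster near +0.5 ms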
Interaural Time Difference and source direction (in the horizontal plane)

The perceived ITD (and hence the direction to the source) is determined by the location of the peak in the short-time cross-correlation function.

In the low-frequency limit of Rayleigh diffraction around a sphere:

ITD = (3/2) (d/c) sin θ

where d is the distance between the ears, θ is the source azimuth, and c is the speed of sound.

• ITD = 0 when θ = 0
• ITD = (3/2)(d/c) when θ = 90°

Note: the factor of 3/2 is due to diffraction around the listener's head.
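Plugging in typical numbers (d = 0.175 m is an assumed ear spacing, not a value given on the slide):

import numpy as np

C = 343.0    # speed of sound (m/s)
D = 0.175    # assumed distance between the ears (m)

def itd(theta_deg):
    # Low-frequency Rayleigh limit: ITD = (3/2)(d/c) sin(theta).
    return 1.5 * (D / C) * np.sin(np.radians(theta_deg))

for theta in (0, 30, 60, 90):
    print(f"azimuth {theta:2d} deg -> ITD = {itd(theta) * 1e6:5.1f} us")
# The 90 deg case gives the maximum: (3/2)(0.175/343) ~ 765 us.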
Role of coherence in binaural hearing

Demo: 3 sec white noise bursts from two speakers, S1 and S2 (quantified in the sketch below):
• S1 alone
• S2 alone
• S1 + S2, driven by the same noise signal
• S1 + S2, driven by different (uncorrelated) noise signals
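The same/different distinction can be quantified as interaural coherence: the maximum of the normalized cross-correlation of the two ear signals. A sketch under idealized conditions (each ear hears one speaker perfectly; all names are assumptions):

import numpy as np

def coherence(a, b, max_lag=100):
    # Max of the normalized cross-correlation over lags: 1 = fully coherent.
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    vals = [np.dot(a[max_lag:-max_lag], b[max_lag + k:len(b) - max_lag + k])
            for k in range(-max_lag, max_lag + 1)]
    return max(vals) / (len(a) - 2 * max_lag)

rng = np.random.default_rng(2)
n = 48_000 * 3                          # 3 s bursts at 48 kHz
same = rng.standard_normal(n)           # S1 and S2 driven by the same noise
diff = rng.standard_normal(n)           # an independent noise for S2
print(f"same noise:      coherence ~ {coherence(same, same):.2f}")   # ~1.0
print(f"different noise: coherence ~ {coherence(same, diff):.2f}")   # ~0.0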
Demonstration of lateralization as a function of noise burst duration

• Play a series of uncorrelated stereo noise bursts of decreasing duration
  (2 sec, 1 sec, 0.5 sec, 0.2 sec, 0.1 sec, 50 msec, 20 msec, 10 msec, 5 msec, 2 msec, 1 msec)

(Audio: series of uncorrelated 2 msec stereo noise bursts)

• At about 2 msec and less, each burst is identified with a specific location
• The cross-correlation function always has a peak somewhere, but it is in a different place each time (simulated below)
• The auditory percept computed by the brain is updated about every 2 milliseconds
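The "a peak somewhere, but different each time" behavior is easy to reproduce: for independent left/right noise, the lag of the cross-correlation maximum is essentially random from one 2 msec burst to the next. A sketch (lag range assumed):

import numpy as np

fs = 48_000
rng = np.random.default_rng(3)
max_lag = 38                                   # ~0.8 ms, roughly the ITD range
lags = np.arange(-max_lag, max_lag + 1)

for trial in range(5):
    n = int(0.002 * fs)                        # one 2 ms uncorrelated burst
    xl, xr = rng.standard_normal(n), rng.standard_normal(n)
    xc = [np.dot(xl[max_lag:-max_lag], xr[max_lag + k:n - max_lag + k])
          for k in lags]
    itd_ms = lags[int(np.argmax(xc))] / fs * 1e3
    print(f"burst {trial}: peak at {itd_ms:+.2f} ms")   # a different lag each time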
Auditory “Sluggishness”

• How quickly can a listener follow time-varying binaural cues?
• Evidence points to a 200 - 300 msec threshold
• The distribution of 2 msec window ITDs has a “memory” of 100 - 300 msec

(Audio: an “L” click, then series of left-, center-, and right-located clicks alternating every 10 msec, 50 msec, 100 msec, 250 msec, and 500 msec)

Your brain averages over a hundred or more 2 msec windows and constructs a histogram of interaural time differences (a histogram of ITDs; sketched below).
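A sketch of the two-time-scale idea: take one ITD estimate per 2 msec window, keep the most recent ~100 of them (~200 msec of "memory"), and report the histogram mode as the percept. The window count and bin width are illustrative, and the input here is a synthetic ITD stream rather than estimates from real signals.

import numpy as np
from collections import deque

def itd_histogram_tracker(itd_stream_us, memory_windows=100, bin_us=50):
    # Keep the last ~100 window estimates (100 x 2 ms = 200 ms of "memory")
    # and report the histogram mode as the perceived ITD.
    recent = deque(maxlen=memory_windows)
    edges = np.arange(-800, 801, bin_us)        # ITD bins in microseconds
    for itd_us in itd_stream_us:
        recent.append(itd_us)
        counts, _ = np.histogram(recent, bins=edges)
        i = int(counts.argmax())
        yield 0.5 * (edges[i] + edges[i + 1])   # center of the modal bin

# A source that jumps from -400 us to +400 us halfway through, with jitter:
rng = np.random.default_rng(4)
stream = np.concatenate([rng.normal(-400, 80, 150), rng.normal(400, 80, 150)])
modes = list(itd_histogram_tracker(stream))
# Ten windows (20 ms) after the jump the mode still sits near -400 us:
print(f"before jump: {modes[149]:+.0f} us, 20 ms after: {modes[160]:+.0f} us")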
Correlograms – frequency-dependent interaural time differences

(Figure: correlogram surfaces plotted against frequency and delay; the example is a stereo speaker pair with center panning, under anechoic conditions)

• The 2-D (ITD × frequency) map encodes source location; the brain decodes these maps into source locations (sketched below)
• ITD → lateral position of the source
• Frequency dependence of the ITD → source elevation
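A correlogram can be sketched by splitting each ear signal into frequency bands and cross-correlating band by band. Butterworth bandpass filters stand in for a cochlear filterbank here, which is a deliberate simplification of the hair-cell front end described earlier; the band centers and widths are assumptions.

import numpy as np
from scipy.signal import butter, sosfiltfilt

CENTERS_HZ = (250, 500, 1000, 2000)    # assumed band centers

def correlogram(xl, xr, fs, max_itd_s=0.0008):
    # One normalized cross-correlation row per frequency band.
    max_lag = int(max_itd_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    rows = []
    for fc in CENTERS_HZ:
        sos = butter(2, [fc / 1.3, fc * 1.3], btype="bandpass", fs=fs,
                     output="sos")
        bl, br = sosfiltfilt(sos, xl), sosfiltfilt(sos, xr)
        row = np.array([np.dot(bl[max_lag:-max_lag],
                               br[max_lag + k:len(br) - max_lag + k])
                        for k in lags])
        rows.append(row / np.max(np.abs(row)))
    return lags / fs, np.array(rows)    # (delay axis in s, freq x delay map)

fs = 48_000
rng = np.random.default_rng(5)
src = rng.standard_normal(fs // 2)
xl, xr = src[12:], src[:-12]            # a 0.25 ms interaural delay
delays, cmap = correlogram(xl, xr, fs)
for fc, row in zip(CENTERS_HZ, cmap):
    print(f"{fc:4d} Hz band: peak at {delays[row.argmax()] * 1e3:+.2f} ms")

Note that high-frequency bands show repeating peaks spaced by the band's period; this phase ambiguity is one reason the full 2-D map, rather than any single band, encodes source location.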
Procedure
• For a given head model …
• Compute the reference correlograms for all possible sound source directions
• Specify the multi-channel reproduction system, the influence of the room, and
the signals driving each speaker (for whatever method you choose)
• Compute the resulting correlogram
• Project the computed correlogram onto the reference set to infer the direction
• One may infer a superposition of source directions
• Specific methods
  • Decompose into spherical harmonics (orthogonality helps)
  • Error minimization (sketched below)
  • Machine learning

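Of the listed methods, error minimization is the simplest to sketch: stack the reference correlograms as columns and solve a non-negative least-squares problem for the direction weights. Everything below (grid size, names) is illustrative.

import numpy as np
from scipy.optimize import nnls

def infer_directions(computed, references):
    # computed:   flattened correlogram of the reproduced field, shape (m,)
    # references: one flattened reference correlogram per candidate direction,
    #             stacked as columns, shape (m, n_directions)
    # Returns non-negative weights: a superposition of source directions.
    weights, _residual = nnls(references, computed)
    return weights / weights.sum()

# Toy example: 5 candidate directions; the "reproduced" field is a blend of
# directions 1 and 3 (e.g., a phantom image between two speakers).
rng = np.random.default_rng(6)
refs = rng.random((200, 5))
target = 0.7 * refs[:, 1] + 0.3 * refs[:, 3]
print(np.round(infer_directions(target, refs), 2))   # ~[0, 0.7, 0, 0.3, 0]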
So how does the method work? … Assessing the effect of reverberation

(Example room: the Aula Carolina in Aachen)
Reverberation broadens the source image

Note: the random nature of the nerve impulse stream creates a spread of image width, even in a non-reverberant space.
Spatial Blur – experimental measurements

The model reproduces the observed angular acuity; the spread arises from the statistics of the neuronal pulses.

Blauert, J., Spatial Hearing: The Psychophysics of Human Sound Localization, MIT Press, 1983.
Spatial acuity with one ear!

If you don’t believe the cross-correlation model, look at this!

Blauert, J., Spatial Hearing: The Psychophysics of Human Sound Localization, MIT Press, 1983.
Modeling Stereo Reproduction (speaker signals Sl and Sr)

• f̃(ω) describes the frequency dependence of head diffraction

R_RL(t, τ) = R_c(t, τ − τ_d) + f̃(ω)² R_c(t, τ + τ_d)

where τ_d is the left-right ear delay and R_c(t, τ) is the cross-correlation of Sl and Sr.

(Figure: listener facing a symmetric stereo pair, L and R)
Stereo Sweet Spot calculation

• Compute the peak of the distribution of ITDs for a real source at the intended location
• Compute the peak of the distribution of ITDs for the stereo-rendered intended source
• Infer the apparent source direction from the peak of the ITD distribution (sketched below)
• This example is for coherent sources; the formalism can also be used with partially coherent sources, i.e., real signals in reverberant spaces
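A sketch of the last step: invert the ITD-distribution peak back to an apparent azimuth using the spherical-head formula from earlier. The two peak values below are made-up numbers for illustration, not measured results.

import numpy as np

C, D = 343.0, 0.175     # speed of sound (m/s), assumed ear spacing (m)

def apparent_azimuth(itd_peak_s):
    # Invert ITD = (3/2)(d/c) sin(theta) for the apparent source azimuth.
    s = np.clip(itd_peak_s * C / (1.5 * D), -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# Suppose the ITD histogram peaks at 380 us for the real source but at
# 310 us for the stereo-rendered version (hypothetical numbers):
print(f"real source:     {apparent_azimuth(380e-6):.1f} deg")
print(f"stereo rendered: {apparent_azimuth(310e-6):.1f} deg")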
Main Points

• Integrated a quantitative neurological model into a spatial audio analysis tool
• The randomness of auditory nerve firing events is important; it predicts the measured angular acuity
• Two time scales are in play
  • A short (~2 msec) window for cross-correlation in the brainstem
  • A longer (~100 msec) histogram “memory” (higher-level processing)
• We can predict what a listener will report hearing: the location and spread of a sound source
• There’s a lot left to do …
  • Integrate with room modeling software for a complete analysis package
  • Create synthesis tools: find the designs and algorithms that best reproduce a desired spatial sound effect
  • Continue to refine the auditory models (e.g., distance cues)
END

Cochlea
Cross-correlation (similarity of two signals)

(Figure: the sequence [x1 x2 x3] slid past [y1 y2 y3] at lags −2, −1, 0, 1, 2)

• Signals that are correlated but delayed (delay = 0 vs. delay = 30 samples): the cross-correlation has a clear peak at the delay (demonstrated below)
• Uncorrelated signals: no dominant peak in the cross-correlation
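In numpy terms (np.correlate slides one sequence past the other exactly as in the lag diagram above):

import numpy as np

rng = np.random.default_rng(7)
x = rng.standard_normal(1000)

# Correlated but delayed: y is x delayed by 30 samples.
y = np.roll(x, 30)
xc = np.correlate(x, y, mode="full")            # lags -999 .. +999
lag = int(np.argmax(xc)) - (len(x) - 1)
print(f"correlated pair: peak at lag {lag}")     # -30: y lags x by 30 samples

# Uncorrelated: an independent noise sequence gives no dominant peak,
# just statistical fluctuations.
z = rng.standard_normal(1000)
xz = np.correlate(x, z, mode="full")
print(f"uncorrelated pair: peak/std ratio {np.max(np.abs(xz)) / np.std(xz):.1f}")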


Precedence effect

• Law of the first wave-front: direction is inferred from the first wave-front (up to about 30 - 40 msec)
• Haas effect: short delays enhance “spaciousness”

(Audio: click pairs with the delay swept from 0 - 2 msec, 0 - 40 msec, and 0 - 200 msec, in 20 steps each; stimuli sketched below)

Explained by the saturation and recovery time of the hair cell response.
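The demo stimuli are straightforward to generate: a leading click plus a lagging click, with the lag swept over 20 steps. A sketch (click shape, level, and spacing are assumptions):

import numpy as np

def click_pair_series(max_delay_s, fs=48_000, steps=20, gap_s=0.3):
    # Leading click + lagging click, lag swept from 0 to max_delay_s.
    out = []
    for step in range(steps):
        delay = int(max_delay_s * fs * step / (steps - 1))
        frame = np.zeros(int(gap_s * fs))
        frame[0] = 1.0                       # leading click (first wave-front)
        frame[delay] += 0.8                  # lagging "reflection"
        out.append(frame)
    return np.concatenate(out)

for max_ms in (2, 40, 200):                  # the three sweeps from the slide
    sig = click_pair_series(max_ms / 1000)
    print(f"0 - {max_ms} ms sweep: {len(sig) / 48_000:.1f} s of audio")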
Directional impulse responses

Track both the time of arrival and the direction of each room reflection.

(Matlab Demo: Imp_Resp_w_Angle_3.m)
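The slides demonstrate this with a Matlab script (Imp_Resp_w_Angle_3.m). The Python sketch below is an illustrative stand-in for the underlying data structure, not a port of that script: each arrival carries a time, a direction, and an amplitude.

import numpy as np
from dataclasses import dataclass

@dataclass
class Arrival:
    time_s: float        # time of arrival of this reflection
    azimuth_deg: float   # direction it arrives from (horizontal plane)
    elevation_deg: float
    amplitude: float

def omni_ir(arrivals, fs=48_000, length_s=0.5):
    # Collapse a directional IR to an ordinary (direction-blind) IR.
    ir = np.zeros(int(length_s * fs))
    for a in arrivals:
        ir[int(a.time_s * fs)] += a.amplitude
    return ir

room = [Arrival(0.000, 0.0, 0.0, 1.00),      # direct sound
        Arrival(0.012, 55.0, 0.0, 0.45),     # side-wall reflection
        Arrival(0.019, 0.0, 40.0, 0.38),     # ceiling reflection
        Arrival(0.031, -120.0, 0.0, 0.25)]   # rear-wall reflection
print(f"{len(room)} arrivals, IR energy = {np.sum(omni_ir(room) ** 2):.2f}")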
