Spatial Sound Max/MSP
Originally conceptualized [2] for the software Pure Data, ViMiC was recently refined and extended for release to the Max/MSP community. ViMiC (Virtual Microphone Control) is a tool for real-time spatialization synthesis, particularly for concert situations and site-specific immersive installations, and especially for larger or non-centralized audiences. Based on the concept of virtual microphones positioned within a virtual 3D room, ViMiC supports loudspeaker reproduction of up to 24 discrete channels, for which the loudspeakers do not have to be placed uniformly and equidistantly around the audience. Also, through the integrated Open Sound Control protocol (OSC), ViMiC is easily accessed and manipulated.

1. INTRODUCTION

Besides the traditional concepts of pitch, timbre, and temporal structure, composers have long felt the desire to integrate a spatial dimension into their music. First through the static placement and separation of musicians in the concert space, and later through dynamic modification of the sound source position, effects of spatial sound segregation and fusion were discovered. In the 20th century, spatialization was popularized, especially through the invention and integration of microphones and loudspeakers in musical performance.

One of the earliest composers to use the newly available electronic tools was Karlheinz Stockhausen. For his composition Kontakte (1958-60) he developed a rotating table, mounting a directional loudspeaker surrounded by four stationary microphones that receive the loudspeaker signal. The recorded microphone signals were routed to different loudspeakers arranged around the audience. Due to the directivity and separation of the microphones, the recorded audio signals contained Inter-channel Time Differences (ICTDs) and Inter-channel Level Differences (ICLDs). Depending on the velocity of the speaker rotation, the change in ICTDs can create an audible Doppler effect. ViMiC follows this Stockhausen tradition by using the concept of spatially displaced microphones for the purpose of sound spatialization. Relations to pioneering works by Steinberg and Snow [13], Chowning [3], and Moore [8] also apply.

2. SPATIALIZATION TECHNIQUES FOR MAX/MSP

This section briefly overviews available loudspeaker spatialization techniques for Max/MSP. For further details, refer to the indicated references.

Vector Based Amplitude Panning (VBAP) is an efficient extension of stereophonic amplitude panning techniques, applied to multi-loudspeaker setups. In a horizontal plane around the listener, a virtual sound source at a certain position is created by applying the tangent panning law between the closest pair of loudspeakers. This principle was also extended to project sound sources onto a three-dimensional sphere, and assumes that the listener is located in the center of an equidistant speaker setup [11].

Distance Based Amplitude Panning (DBAP) also uses intensity panning, applied to arbitrary loudspeaker configurations without assumptions about the position of the listener. All loudspeakers radiate coherent signals, whereby the underlying amplitude weighting is based on a distance attenuation model between the position of the virtual sound source and each loudspeaker [5].

Higher Order Ambisonics (HOA) extends Blumlein's pioneering idea of coincident recording techniques. HOA aims to physically synthesize a sound field based on its expansion into spherical harmonics up to a specified order. To date, Max/MSP externals up to the 3rd order for horizontal-only or periphonic speaker arrays have been presented in [12] and [14].

The Space Unit Generator, also called the room-within-the-room model, dates back to [8]. Four loudspeakers, represented as open windows, are positioned around the listener and create an inner room, which is embedded in an outer room containing the virtual sound sources. Sound propagation of the virtual source, rendered at the open windows, creates ICTDs and ICLDs. Some early reflections are calculated according to the size of the outer room. A Max/MSP implementation was presented in [17].

Spatialisateur, in development at IRCAM and Espaces Nouveaux since 1991, is a library of spatialization algorithms, including VBAP, first-order Ambisonics, and stereo techniques (XY, MS, ORTF) for up to 8 loudspeakers. It can also reproduce 3D sound for headphones (binaural) or over 2/4 loudspeakers (transaural). A room model is included to create artificial reverberation, controlled by a perceptually based user interface.
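To illustrate the two amplitude-panning families described above, here is a minimal Python sketch (not taken from any of the cited packages): a tangent-law gain pair of the kind VBAP generalizes, and a distance-based weighting in the spirit of DBAP. The rolloff exponent and the power normalization are illustrative assumptions.

```python
import math

def tangent_pan(phi_deg, phi0_deg):
    """Tangent-law gains for a symmetric loudspeaker pair at +/- phi0_deg.
    VBAP generalizes this law to arbitrary speaker pairs and 3D triplets."""
    r = math.tan(math.radians(phi_deg)) / math.tan(math.radians(phi0_deg))
    g1, g2 = 1.0 + r, 1.0 - r        # satisfies (g1 - g2) / (g1 + g2) = r
    norm = math.hypot(g1, g2)        # constant-power normalization
    return g1 / norm, g2 / norm

def dbap_gains(source, speakers, rolloff=1.0):
    """Distance-based amplitude weights for an arbitrary speaker layout:
    every loudspeaker radiates, weighted by its distance to the source."""
    w = [1.0 / max(math.dist(source, s), 1e-6) ** rolloff for s in speakers]
    norm = math.sqrt(sum(x * x for x in w))   # keep total power constant
    return [x / norm for x in w]
```

Panning fully to one speaker (phi = phi0) yields the gain pair (1, 0), while a source equidistant from several speakers receives equal DBAP weights, consistent with the behavior described for each technique.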
3. VIRTUAL MICROPHONE CONTROL

ViMiC is a key part of the network music project SoundWIRE 1 and the Tintinnabulate Ensemble 2 directed by Pauline Oliveros. At the MusiMars Festival 2008 3, Ex Asperis, a composition by Sean Ferguson, featured ViMiC for Max/MSP and integrated gestural controllers to manipulate ViMiC sound rendering processes in various ways.

1 http://ccrma.stanford.edu/groups/soundwire/
2 http://www.myspace.com/tintinnabulate
3 http://www.music.mcgill.ca/musimars/

ViMiC is a computer-generated virtual environment in which the gains and delays between a virtual sound source and the virtual microphones are calculated according to their distances and the axis orientations of their microphone directivity patterns. Besides the direct sound component, a virtual microphone signal can also include early reflections and a late reverb tail, both dependent upon the sound-absorbing and reflecting properties of the virtual surfaces.

3.1. ViMiC Principles

ViMiC is based on an array of virtual microphones with simulated directivity patterns placed in a virtual room.

3.1.1. Source - Microphone Relation

Sound sources and microphones can be placed and moved in 3D as desired. Figure 3 shows an example of one sound source recorded with three virtual microphones. A virtual microphone has five degrees of freedom (X, Y, Z, yaw, pitch) and a sound source has four (X, Y, Z, yaw). The propagation path between a sound source and each microphone is simulated accordingly. Depending on the speed of sound c and the distance d_i between a virtual sound source and the i-th microphone, the time of arrival and the attenuation due to distance are estimated. This attenuation function, seen in Eq. 1, can be greatly modified by changing the exponent q; thus, the effect of distance attenuation can be boosted or softened. The minimum distance to a microphone is limited to 1 meter in order to avoid high amplification.

    g_i = 1 / d_i^q                                    (1)

Further attenuation happens through the chosen microphone characteristic and source directivity (see Fig. 2). For all common microphone characteristics, the directivity Γ for a certain angle of incidence θ can be imitated by calculating Eq. 2 and applying a set of microphone coefficients from Figure 1. Increasing the exponent w to a value greater than 1 produces an artificially sharper directivity pattern. Unlike actual microphone characteristics, which vary with frequency, microphones in ViMiC are designed to apply the concept of microphone directivity without simulating undesirable frequency dependencies.

    Γ(θ) = (a + b · cos θ)^w,    0 ≤ a, b ≤ 1          (2)

    Characteristic     a     b     w
    Omnidirectional    1     0     1
    Subcardioid        0.7   0.3   1
    Cardioid           0.5   0.5   1
    Supercardioid      0.33  0.67  1
    Hypercardioid      0.3   0.7   1
    Figure-of-8        0     1     1

    Figure 1. Common microphone characteristics

Source directivity is known to contribute to immersion and presence. Therefore, ViMiC is also equipped with a source directivity model. For the sake of simplicity, in a graphical control window, the source directivity can be modeled through a frequency-independent gain factor for each radiation angle, to an accuracy of 1°.

3.1.2. Room model

ViMiC contains a shoe-box room model to generate time-accurate early reflections that increase the illusion of the virtual space and the sense of envelopment, as described in the literature [9]. Early reflections are strong auditory cues for encoding sound source distance. According to the virtual room size and the positions of the microphones, adequate early reflections are rendered in 3D through the well-known image method [1]. Each image source is rendered according to its time of arrival, distance attenuation, microphone characteristic, and source directivity, as described in Section 3.1.1. Virtual room dimensions (height, length, width) modified in real time alter the reflection pattern accordingly. The spectral influence of the wall properties is simulated through high-mid-low shelf filters. Because longer propagation paths increase the audible effect of air absorption, early reflections in ViMiC are additionally filtered through a 2nd-order Butterworth lowpass filter with adjustable cut-off frequency.

Also, early reflections must be rendered discretely for each microphone, as the propagation paths differ. For eight virtual microphones, 56 paths are rendered if 1st-order reflections are considered (8 microphones × [6 early reflections + 1 direct sound path]). Although time delays are efficiently implemented through a shared multi-tap delay line, this processing can be computationally intensive.

3.2. Late Reverb

The late reverberant field of a room is often considered nearly diffuse, without directional information. Thus, an efficient late reverb model is used, based on a feedback delay network [4] with 16 modulated delay lines diffused by a Hadamard mixing matrix. By feeding the outputs of the room model into the late reverb, a diffuse reverb tail is synthesized (see Fig. 2), whose timbral and temporal character can be modified. This late reverb can be efficiently shared across several rendered sound sources.

[...] time delay. In this case, the sound paths of the old and the new sound position are cross-faded within 50 ms, in order to avoid strongly audible phase modulations.
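The per-microphone gain and delay computation described in Section 3.1.1 can be sketched in a few lines of Python. This is a minimal illustration, not the actual Max/MSP implementation: it assumes each microphone's axis points from the microphone toward the room origin, and uses the (a, b, w) coefficients from Figure 1.

```python
import math

# (a, b, w) coefficients from Figure 1
MIC_COEFFS = {
    "omnidirectional": (1.0, 0.0, 1),
    "subcardioid":     (0.7, 0.3, 1),
    "cardioid":        (0.5, 0.5, 1),
    "supercardioid":   (0.33, 0.67, 1),
    "hypercardioid":   (0.3, 0.7, 1),
    "figure-of-8":     (0.0, 1.0, 1),
}

def _unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def mic_gain_delay(src, mic, pattern="cardioid", q=1.0, c=344.0):
    """Return (gain, delay_s) for the direct path from a source to one
    virtual microphone: Eq. 1 distance attenuation times Eq. 2 directivity,
    plus the time of arrival d/c.  The mic axis is assumed to point from
    the microphone toward the origin (a simplification for illustration)."""
    d = max(math.dist(src, mic), 1.0)   # minimum distance clamped to 1 m
    delay = d / c                       # time of arrival in seconds
    g_dist = 1.0 / d ** q               # Eq. 1: exponent q shapes the rolloff
    axis = _unit(tuple(-m for m in mic))                    # aim at origin
    to_src = _unit(tuple(s - m for s, m in zip(src, mic)))  # source direction
    cos_theta = sum(x * y for x, y in zip(axis, to_src))
    a, b, w = MIC_COEFFS[pattern]
    gamma = (a + b * cos_theta) ** w    # Eq. 2; w > 1 sharpens the pattern
    return g_dist * gamma, delay
```

For a cardioid microphone at (2, 0, 0) aimed at the origin, a source at the origin arrives on-axis (Γ = 1), so with q = 1 the gain is 1/2 and the delay is 2/344 s; a source directly behind the same microphone is fully rejected.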
[Figure 2. ViMiC processing overview: source parameters (position [xs, ys, zs], orientation, directivity), microphone parameters (quantity N, positions [xi, yi, zi], orientations, directivities, pregains [Gi]), and room model parameters (room size, number of reflections M, wall reflection coefficients, air absorption coefficients) determine (M+1)*N delay and gain values; the monaural audio input is rendered through a multitap delay line to an N-channel output.]

[Figure 3. Example scene: one sound source at (xS, yS, zS) and three virtual microphones, Mic 1-3, at positions (x1, y1, z1) to (x3, y3, z3) with individual orientations, in a room coordinate system with origin (0, 0, 0).]
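The path count quoted in Section 3.1.2 (8 × [6 + 1] = 56) follows directly from the image method: a shoe-box room has exactly six first-order image sources, one per wall. A minimal sketch, assuming walls at 0 and at the room dimension on each axis:

```python
def first_order_images(src, room):
    """Six first-order image sources of a shoe-box room (image method [1]):
    the source reflected across each wall, with walls at 0 and at L, W, H
    on the x, y, z axes respectively."""
    (x, y, z), (L, W, H) = src, room
    return [(-x, y, z), (2 * L - x, y, z),
            (x, -y, z), (x, 2 * W - y, z),
            (x, y, -z), (x, y, 2 * H - z)]

images = first_order_images((1.0, 2.0, 1.5), (6.0, 4.0, 3.0))
num_mics = 8
paths = num_mics * (len(images) + 1)   # 8 * (6 + 1) = 56 rendered paths
```

Each of these image positions is then rendered like a direct source, with its own time of arrival, distance attenuation, microphone directivity, and source directivity, which is why the per-microphone cost grows so quickly with reflection order.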