SPATIAL SOUND RENDERING IN MAX/MSP WITH VIMIC

Nils Peters¹, Tristan Matthews¹, Jonas Braasch², Stephen McAdams¹

¹ Schulich School of Music - Music Technology Area, McGill University, Montreal, CA
² Rensselaer Polytechnic Institute - School of Architecture, Troy, US
[email protected]
CIRMMT - Centre for Interdisciplinary Research in Music Media and Technology

ABSTRACT

Originally conceptualized [2] for the software Pure Data, ViMiC was recently refined and extended for release to the Max/MSP community. ViMiC (Virtual Microphone Control) is a tool for real-time spatialization synthesis, particularly for concert situations and site-specific immersive installations, and especially for larger or non-centralized audiences. Based on the concept of virtual microphones positioned within a virtual 3D room, ViMiC supports loudspeaker reproduction with up to 24 discrete channels, for which the loudspeakers do not necessarily have to be placed uniformly and equidistantly around the audience. Also, through the integrated Open Sound Control (OSC) protocol, ViMiC is easily accessed and manipulated.

1. INTRODUCTION

Besides the traditional concepts of pitch, timbre, and temporal structure, composers have long felt the desire to integrate a spatial dimension into their music. First through the static placement and separation of musicians in the concert space, and later through dynamic modification of the sound source position, effects of spatial sound segregation and fusion were discovered. In the 20th century, spatialization was popularized especially by the invention and integration of microphones and loudspeakers in musical performance.

One of the earliest composers to use the newly available electronic tools was Karlheinz Stockhausen. For his composition Kontakte (1958-60) he developed a rotational table, mounting a directional loudspeaker surrounded by four stationary microphones that receive the loudspeaker signal. The recorded microphone signals were routed to different loudspeakers arranged around the audience. Due to the directivity and separation of the microphones, the recorded audio signals contained Inter-channel Time Differences (ICTDs) and Inter-channel Level Differences (ICLDs). Depending on the velocity of the speaker rotation, the change in ICTDs can create an audible Doppler effect. ViMiC follows this Stockhausen tradition in some respects by using the concept of spatially displaced microphones for the purpose of sound spatialization. Relations to pioneering works by Steinberg and Snow [13], Chowning [3], and Moore [8] also apply.
2. SPATIALIZATION WITH MAX/MSP

This section briefly overviews the loudspeaker spatialization techniques available for Max/MSP. For further details, refer to the indicated references.

Vector Based Amplitude Panning (VBAP) is an efficient extension of stereophonic amplitude panning techniques, applied to multi-loudspeaker setups. In a horizontal plane around the listener, a virtual sound source at a certain position is created by applying the tangent panning law between the closest pair of loudspeakers. This principle was also extended to project sound sources onto a three-dimensional sphere, and it assumes that the listener is located in the center of an equidistant speaker setup [11].
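For reference, the tangent panning law for a single loudspeaker pair fits in a few lines of code. The following Python sketch is illustrative only and is not taken from the externals of [11]:

    import math

    def tangent_law_gains(theta, theta0):
        """Gains for a source at angle theta (radians, |theta| < theta0)
        between a loudspeaker pair at +/-theta0, following the tangent law:
        tan(theta) / tan(theta0) = (g1 - g2) / (g1 + g2)."""
        r = math.tan(theta) / math.tan(theta0)
        g1, g2 = 1.0 + r, 1.0 - r              # any pair with the required ratio
        norm = math.hypot(g1, g2)              # constant-power normalization
        return g1 / norm, g2 / norm            # g1 feeds the +theta0 speaker

    # Source 10 degrees toward the +30 degree speaker of a +/-30 degree pair:
    print(tangent_law_gains(math.radians(10), math.radians(30)))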
Distance Based Amplitude Panning (DBAP) also uses intensity panning, but applies it to arbitrary loudspeaker configurations without assumptions as to the position of the listener. All loudspeakers radiate coherent signals, whereby the underlying amplitude weighting is based on a distance attenuation model between the position of the virtual sound source and each loudspeaker [5].
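The core of this weighting can likewise be sketched. The rolloff exponent and the constant-power normalization below are simplifying assumptions rather than the exact parameterization of [5]:

    import math

    def dbap_gains(src, speakers, rolloff=1.0):
        """Distance-based amplitude panning, reduced to its core idea:
        every loudspeaker receives the signal, weighted by inverse distance
        raised to a rolloff exponent, under constant-power normalization.
        src and speakers are (x, y) positions in meters."""
        w = [1.0 / max(math.dist(src, s), 1e-6) ** rolloff for s in speakers]
        norm = math.sqrt(sum(g * g for g in w))
        return [g / norm for g in w]

    # Four speakers in the corners of a 4 m x 4 m square, source off-center:
    print(dbap_gains((1.0, 1.0), [(0, 0), (4, 0), (0, 4), (4, 4)]))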
Higher Order Ambisonics (HOA) extends Blumlein's pioneering idea of coincident recording techniques. HOA aims to physically synthesize a sound field based on its expansion into spherical harmonics up to a specified order. To date, Max/MSP externals up to the 3rd order for horizontal-only or periphonic speaker arrays have been presented in [12] and [14].

Space Unit Generator, also called the room-within-the-room model, dates back to [8]. Four loudspeakers, represented as open windows, are positioned around the listener and create an inner room, which is embedded in an outer room containing the virtual sound sources. Sound propagation from the virtual source, rendered at the open windows, creates ICTDs and ICLDs. Some early reflections are calculated according to the size of the outer room. A Max/MSP implementation was presented in [17].

Spatialisateur, in development at IRCAM and Espaces Nouveaux since 1991, is a library of spatialization algorithms, including VBAP, first-order Ambisonics, and stereo techniques (XY, MS, ORTF) for up to 8 loudspeakers. It can also reproduce 3D sound for headphones (binaural) or over 2/4 loudspeakers (transaural). A room model is included to create artificial reverberation, controlled by a perceptually based user interface.
3. VIRTUAL MICROPHONE CONTROL

ViMiC is a key part of the network music project SoundWIRE¹ and of the Tintinnabulate Ensemble² directed by Pauline Oliveros. At the MusiMars Festival 2008³, Ex Asperis, a composition by Sean Ferguson, featured ViMiC for Max/MSP and integrated gestural controllers to manipulate ViMiC sound rendering processes in various ways.

¹ http://ccrma.stanford.edu/groups/soundwire/
² http://www.myspace.com/tintinnabulate
³ http://www.music.mcgill.ca/musimars/

ViMiC is a computer-generated virtual environment in which gains and delays between a virtual sound source and virtual microphones are calculated according to their distances and to the axis orientations of their microphone directivity patterns. Besides the direct sound component, a virtual microphone signal can also include early reflections and a late reverb tail, both dependent upon the sound-absorbing and reflecting properties of the virtual surfaces.

3.1. ViMiC Principles

ViMiC is based on an array of virtual microphones with simulated directivity patterns placed in a virtual room.

3.1.1. Source - Microphone Relation

Sound sources and microphones can be placed and moved in 3D as desired. Figure 3 shows an example of one sound source recorded with three virtual microphones. A virtual microphone has five degrees of freedom (X, Y, Z, yaw, pitch) and a sound source has four (X, Y, Z, yaw). The propagation path between a sound source and each microphone is simulated accordingly. Depending on the speed of sound c and the distance d_i between a virtual sound source and the i-th microphone, the time of arrival and the attenuation due to distance are estimated. The attenuation function, seen in Eq. 1, can be greatly modified by changing the exponent q; thus, the effect of distance attenuation can be boosted or softened. The minimum distance to a microphone is limited to 1 meter in order to avoid high amplification.

    g_i = 1 / d_i^q    (1)
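Per source/microphone pair, Eq. 1 and the time of arrival reduce to a few lines. A minimal sketch follows; the sample rate and the function structure are chosen for illustration and are not taken from the ViMiC source:

    import math

    C = 344.0      # speed of sound in m/s, as in Section 4.1
    FS = 44100.0   # sample rate; an assumption for illustration

    def delay_and_gain(source, mic, q=1.0):
        """Time of arrival (in samples) and Eq. 1 gain g_i = 1/d_i^q for one
        source/microphone pair, given as (x, y, z) tuples in meters."""
        d = max(math.dist(source, mic), 1.0)  # distances below 1 m are clamped
        delay_samples = d / C * FS            # time of arrival at this microphone
        gain = 1.0 / d ** q                   # q boosts or softens the attenuation
        return delay_samples, gain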
Further attenuation happens through the chosen microphone characteristic and source directivity (see Fig. 2). For all common microphone characteristics, the directivity Γ at a certain angle of incidence θ can be imitated by calculating Eq. 2 and applying a set of microphone coefficients from the table in Figure 1. Increasing the exponent w to a value greater than 1 produces an artificially sharper directivity pattern. Unlike actual microphone characteristics, which vary with frequency, microphones in ViMiC are designed to apply the concept of microphone directivity without simulating undesirable frequency dependencies.

    Γ(θ) = (a + b cos θ)^w,   0 ≤ a, b ≤ 1    (2)

Characteristic     a     b     w
Omnidirectional    1     0     1
Subcardioid        0.7   0.3   1
Cardioid           0.5   0.5   1
Supercardioid      0.33  0.67  1
Hypercardioid      0.3   0.7   1
Figure-of-8        0     1     1

Figure 1. Common microphone characteristics

Source directivity is known to contribute to immersion and presence. Therefore, ViMiC is also equipped with a source directivity model. For the sake of simplicity, in a graphical control window, the source directivity can be modeled through a frequency-independent gain factor for each radiation angle to an accuracy of 1°.
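Eq. 2 and the coefficients of Figure 1 translate directly into code; a minimal sketch (names are illustrative):

    import math

    # Coefficients (a, b) from Figure 1; w = 1 for the standard patterns.
    PATTERNS = {
        "omnidirectional": (1.0, 0.0),
        "subcardioid":     (0.7, 0.3),
        "cardioid":        (0.5, 0.5),
        "supercardioid":   (0.33, 0.67),
        "hypercardioid":   (0.3, 0.7),
        "figure-of-8":     (0.0, 1.0),
    }

    def directivity(pattern, theta, w=1.0):
        """Eq. 2: (a + b*cos(theta))**w for incidence angle theta (radians).
        w > 1 artificially sharpens the pattern; note that non-integer w only
        makes sense where a + b*cos(theta) is non-negative."""
        a, b = PATTERNS[pattern]
        return (a + b * math.cos(theta)) ** w

    print(directivity("cardioid", math.radians(90)))  # 0.5, i.e. -6 dB at 90 degrees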
3.1.2. Room model

ViMiC contains a shoe-box room model to generate time-accurate early reflections, which increase the illusion of the virtual space and of envelopment, as described in the literature [9]. Early reflections are strong auditory cues for encoding the sound source distance. According to the virtual room size and the positions of the microphones, adequate early reflections are rendered in 3D through the well-known image method [1]. Each image source is rendered according to its time of arrival, distance attenuation, microphone characteristic, and source directivity, as described in Section 3.1.1. Virtual room dimensions (height, length, width) modified in real time alter the reflection pattern accordingly. The spectral influence of the wall properties is simulated through high/mid/low shelf filters. Because longer propagation paths increase the audible effect of air absorption, early reflections in ViMiC are additionally filtered through a 2nd-order Butterworth lowpass filter with adjustable cut-off frequency.

Also, early reflections must be rendered discretely for each microphone, as the propagation paths differ. For eight virtual microphones, 56 paths are rendered if 1st-order reflections are considered (8 microphones × [6 early reflections + 1 direct sound path]). Although the time delays are efficiently implemented through a shared multi-tap delay line, this processing can be computationally intensive.
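For 1st-order reflections in a shoe-box room with one corner at the origin, the image method [1] amounts to mirroring the source across each of the six walls. A minimal sketch, with wall absorption and filtering omitted and the coordinate convention assumed for the example:

    def first_order_images(src, room):
        """Six 1st-order image sources for a shoe-box room spanning (0, 0, 0)
        to room = (Lx, Ly, Lz), after Allen and Berkley [1]: the source is
        mirrored across each of the six walls."""
        x, y, z = src
        Lx, Ly, Lz = room
        return [(-x, y, z), (2 * Lx - x, y, z),
                (x, -y, z), (x, 2 * Ly - y, z),
                (x, y, -z), (x, y, 2 * Lz - z)]

    # Each image plus the direct path is rendered per microphone,
    # e.g. 8 microphones * (6 + 1) = 56 paths, as stated above.
    images = first_order_images((2.0, 3.0, 1.5), (10.0, 30.0, 7.0))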
3.2. Late Reverb

The late reverberant field of a room is often considered nearly diffuse, without directional information. Thus, an efficient late reverb model is used, based on a feedback delay network [4] with 16 modulated delay lines diffused by a Hadamard mixing matrix. By feeding the outputs of the room model into the late reverb, a diffuse reverb tail is synthesized (see Fig. 2), whose timbral and temporal character can be modified. This late reverb can be efficiently shared across several rendered sound sources.
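The structure of such a feedback delay network is compact enough to sketch. The toy version below is not the ViMiC implementation: the delay lengths and feedback gain are invented values, and the delay-line modulation mentioned above is omitted:

    import numpy as np

    def hadamard(n):
        """Sylvester-construction Hadamard matrix, scaled to be orthonormal
        (n must be a power of two)."""
        h = np.array([[1.0]])
        while h.shape[0] < n:
            h = np.block([[h, h], [h, -h]])
        return h / np.sqrt(n)

    def fdn_reverb(x, delays, g=0.97):
        """Feedback delay network after [4]: len(delays) delay lines whose
        outputs are diffused by an orthonormal Hadamard feedback matrix;
        g < 1 sets the decay."""
        n = len(delays)
        mix = g * hadamard(n)
        lines = [np.zeros(d) for d in delays]  # circular delay buffers
        idx = [0] * n
        y = np.zeros(len(x))
        for t in range(len(x)):
            outs = np.array([lines[i][idx[i]] for i in range(n)])
            y[t] = outs.sum() / np.sqrt(n)
            fb = mix @ outs
            for i in range(n):
                lines[i][idx[i]] = x[t] + fb[i]
                idx[i] = (idx[i] + 1) % len(lines[i])
        return y

    # 16 mutually prime delay lengths, matching the reverb's line count:
    taps = [1031, 1171, 1301, 1433, 1559, 1699, 1847, 1979,
            2113, 2243, 2377, 2503, 2647, 2777, 2903, 3041]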

[Figure 2. Flowchart of the Max/MSP processing: source, microphone, and room-model parameters (position, orientation, directivity, pregain, room size, wall reflection and air absorption coefficients, number of reflections M) determine (M+1)*N delay and gain values, which render the monaural audio input through a multi-tap delay into N virtual microphone signals, an FDN late reverb, and an N-channel output.]

[Figure 3. Geometric principles: one sound source S at (xS, yS, zS) recorded by three virtual microphones at (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3), each with its own distance d_i and orientation.]


4. MOVING SOURCES

In Figure 4 the sound source has moved from (x, y, z) to (x', y', z'), changing the propagation paths to all microphones and, with them, the time delays and attenuations. A continuous change in time delay engenders a pitch change (the Doppler effect) that creates a very realistic impression of a moving sound source. Since the Doppler effect might not always be desired, ViMiC accommodates both scenarios.

[Figure 4. Moving source, geometric principles: the sound source S moves from (xS, yS, zS) to (x'S, y'S, z'S), changing the distances d'1, d'2, d'3 and the incidence angles to the microphones.]

4.1. Rendering with Doppler effect

For each changed sound path, the change in time delay is addressed through a 4-pole interpolated delay line, whose perceived quality is significantly better than that of an economical linear interpolation. To save resources, interpolation is applied only while the virtual sound source is moving; otherwise the time delay is rounded to the nearest non-fractional delay value. At fs = 44.1 kHz and a speed of sound of c = 344 m/s, one sample corresponds to c/fs ≈ 7.8 mm, so the round-off error is at most approximately 4 mm. Some discrete reflections might not be perceptually important due to the applied distance law, microphone characteristics, and source directivity. To minimize processor load, an amplitude threshold can be set to prevent the algorithm from rendering these reflections.
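The paper does not specify which 4-point interpolator ViMiC uses; as an illustration, a fractional delay-line read with 4-point cubic Lagrange interpolation can be sketched as follows (buffer layout and naming are invented for the example):

    def read_fractional(buf, pos):
        """Read delay line 'buf' (a list used circularly) at fractional index
        'pos' with 4-point cubic Lagrange interpolation over the neighbors
        at offsets -1, 0, 1, 2."""
        i, f = int(pos), pos - int(pos)
        x0, x1, x2, x3 = (buf[(i + k) % len(buf)] for k in range(-1, 3))
        c0 = -f * (f - 1.0) * (f - 2.0) / 6.0          # Lagrange basis at -1
        c1 = (f + 1.0) * (f - 1.0) * (f - 2.0) / 2.0   # at 0
        c2 = -(f + 1.0) * f * (f - 2.0) / 2.0          # at 1
        c3 = (f + 1.0) * f * (f - 1.0) / 6.0           # at 2
        return c0 * x0 + c1 * x1 + c2 * x2 + c3 * x3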
4.2. Rendering without Doppler effect

This rendering method works without interpolation: the time delays of the rendered sound paths remain static until one of the paths has changed by more than a specified time delay. In that case, the sound paths of the old and the new sound positions are cross-faded within 50 ms, in order to avoid strongly audible phase modulations.
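A minimal sketch of such a cross-fade, assuming equal-power fade curves (the paper specifies only the 50 ms duration, not the curve shape):

    import math

    def crossfade_paths(old_sig, new_sig, fs=44100, ms=50.0):
        """Cross-fade from the signal rendered with the old path delays to
        the one rendered with the new delays over 'ms' milliseconds."""
        n = int(fs * ms / 1000.0)
        out = list(new_sig)
        for t in range(min(n, len(old_sig), len(new_sig))):
            a = t / max(n - 1, 1)                  # 0 .. 1 across the fade
            out[t] = (math.cos(0.5 * math.pi * a) * old_sig[t]
                      + math.sin(0.5 * math.pi * a) * new_sig[t])
        return out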
5. PRACTICAL CONSIDERATIONS

5.1. How to set up the virtual microphones?

Typically, each virtual microphone is associated with one loudspeaker and should be oriented at the same angle as that loudspeaker. The more spaced the microphones are, the bigger the ICTDs will be. The use of virtual microphones is especially interesting for arrays of speakers with different elevation angles, because the time-delay-based panning possibilities help to project elevated sounds. Although ViMiC is a 24-channel system, the number of virtual microphones can be reduced for smaller loudspeaker setups. For surround recordings in the popular ITU 5.1 speaker configuration, Tonmeisters have developed different microphone setups (e.g., [15]) that are applicable in ViMiC. To ease the placing and modifying of the microphone positions, ViMiC provides an extra user interface in which an array of microphones can either be edited graphically⁴ or defined through cartesian and spherical coordinates (Fig. 5).

⁴ The [Ambimonitor] by ICST is used to display the microphones [12].

[Figure 5. Interface to position microphones]
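As an illustration of the typical one-microphone-per-loudspeaker setup, the following sketch places one virtual microphone per loudspeaker azimuth; the 2-meter radius and the purely horizontal circular layout are assumptions for the example:

    import math

    def mics_from_speakers(azimuths_deg, radius=2.0):
        """One virtual microphone per loudspeaker, placed on a horizontal
        circle of 'radius' meters and oriented at its loudspeaker's azimuth
        (0 degrees = front, positive = clockwise)."""
        mics = []
        for az in azimuths_deg:
            rad = math.radians(az)
            pos = (radius * math.sin(rad), radius * math.cos(rad), 0.0)
            mics.append({"pos": pos, "yaw_deg": az})
        return mics

    # ITU 5.1 main channels: C, L, R, Ls, Rs
    print(mics_from_speakers([0, -30, 30, -110, 110]))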
5.2. Controllability

For easier and more flexible controllability, ViMiC was structured as high-level modules using the Jamoma framework for Max/MSP [10]. Jamoma offers a clear advantage in its standardization of preset and parameter handling. The ViMiC parameters have been sorted into three primary namespaces - source, microphone, and room - and are specified with a data range. ViMiC is fully controlled through a GUI and through external OSC messages [16]:

    /source/orientation/azimuth/degree 45
    /microphones/3/directivity/ratio 0.5
    /room/size/xyz 10. 30. 7.
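Because the control interface is plain OSC, any OSC-capable client can send these messages. A minimal sketch using the third-party python-osc package; the host and port are placeholders for wherever the Max/MSP patch listens (not specified in the paper):

    from pythonosc.udp_client import SimpleUDPClient

    # Host and port are placeholders; adjust to the receiving patch.
    client = SimpleUDPClient("127.0.0.1", 9001)
    client.send_message("/source/orientation/azimuth/degree", 45)
    client.send_message("/microphones/3/directivity/ratio", 0.5)
    client.send_message("/room/size/xyz", [10.0, 30.0, 7.0])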
OSC enables the access and manipulation of ViMiC via different hardware controllers, tracking devices, or user interfaces through OSC mapping applications (e.g., [6]) and allows gestural control of spatialization in real time [7].

6. FUTURE WORK

Currently, a ViMiC module renders one sound source. Plans to develop an object that handles multiple sound sources and frequency-dependent source directivity are under discussion. ViMiC for Max/MSP under Mac OS is available in the Subversion repository of Jamoma⁵.

⁵ https://jamoma.svn.sourceforge.net/svnroot/jamoma/branches/active

7. ACKNOWLEDGMENT

This work has been funded by the Canadian Natural Sciences and Engineering Research Council (NSERC) and the Centre for Interdisciplinary Research in Music, Media and Technology (CIRMMT).

References

[1] J. B. Allen and D. A. Berkley. Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am., 65(4):943-950, 1979.
[2] J. Braasch. A loudspeaker-based 3D sound projection using Virtual Microphone Control (ViMiC). In 118th AES Convention, Preprint 6430, Barcelona, Spain, 2005.
[3] J. M. Chowning. The simulation of moving sound sources. JAES, 19(1):2-6, 1971.
[4] J. Jot and A. Chaigne. Digital delay networks for designing artificial reverberators. In 90th AES Convention, Preprint 3030, Paris, France, 1991.
[5] T. Lossius. Sound Space Body: Reflections on Artistic Practice. PhD thesis, Bergen National Academy of the Arts, 2007.
[6] J. Malloch, S. Sinclair, and M. M. Wanderley. From controller to sound: Tools for collaborative development of digital musical instruments. In Proceedings of the International Computer Music Conference, pages 65-72, Copenhagen, Denmark, 2007.
[7] M. Marshall, N. Peters, A. Jensenius, J. Boissinot, M. Wanderley, and J. Braasch. On the development of a system for gesture control of spatialization. In Proceedings of the 2006 International Computer Music Conference, pages 360-366, 2006.
[8] F. R. Moore. A General Model for Spatial Processing of Sounds. Computer Music Journal, 7(6):6-15, 1983.
[9] R. Pellegrini. Perception-based design of virtual rooms for sound reproduction. In 22nd AES International Conference, Preprint 000245, 2002.
[10] T. Place and T. Lossius. Jamoma: A modular standard for structuring patches in Max. In Proceedings of the International Computer Music Conference 2006, New Orleans, US, 2006.
[11] V. Pulkki. Generic panning tools for MAX/MSP. In Proceedings of the International Computer Music Conference, pages 304-307, 2000.
[12] J. C. Schacher and P. Kocher. Ambisonics Spatialization Tools for Max/MSP. In Proceedings of the 2006 International Computer Music Conference, pages 274-277, New Orleans, US, 2006.
[13] J. Steinberg and W. Snow. Auditory Perspective - Physical Factors. Electrical Engineering, 53(1):12-15, 1934.
[14] G. Wakefield. Third-Order Ambisonic Extension for Max/MSP with Musical Applications. In Proceedings of the 2006 International Computer Music Conference, pages 123-126, New Orleans, US, 2006.
[15] M. Williams and G. Le Du. The Quick Reference Guide to Multichannel Microphone Arrays, Part 2: Using Supercardioid and Hypercardioid Microphones. In 116th AES Convention, Preprint 6059, Berlin, Germany, May 8-11, 2004.
[16] M. Wright and A. Freed. Open Sound Control: A New Protocol for Communicating with Sound Synthesizers. In Proceedings of the 1997 International Computer Music Conference, pages 101-104, 1997.
[17] S. Yadegari, F. R. Moore, H. Castle, A. Burr, and T. Apel. Real-time implementation of a general model for spatial processing of sounds. In Proceedings of the 2002 International Computer Music Conference, pages 244-247, San Francisco, CA, USA, 2002.
