Chapter 3

Stereo
Paul Geluso

Stereo Systems
Two-channel stereo has been the mainstay for hi-fidelity recording and playback systems since
the first wave of stereo media was brought to the marketplace in the 1950s. Stereo systems are
designed to create the illusion of a spatial sound scene with directional sound sources localized
between two or more loudspeakers placed in front of the listener. Snow observed that binaural
systems transport the listener to the scene of the recording whereas stereo systems transport the
sound sources to the listener's room (Snow, 1953). This chapter will explore methods used to
capture, create, playback, and enhance stereo programs.

Blumlein's Patent
Alan Blumlein laid the groundwork for modern 2-channel stereo recording and reproduction
systems in his landmark 1933 British patent. He referred to his stereo invention as a binaural
transmission system and explained that a realistic, multi-directional sound impression can be
created by using two acoustic pathways. Further, he stated that acoustic phase and amplitude
information captured by two directional microphones can be reconstructed using just two loud-
speakers (Blumlein, 1933). Based on these principles, Blumlein explained that a 2-channel system
was capable of producing a near-replica of the original directional sound image. More elaborate
directional sound system concepts (see Chapter 2) were developed in his time that used a great
number of speakers, but eventually, 2-channel stereo became standard.
It should be noted that Blumlein's ideas extended beyond 2-channel stereo reproduction. He
suggested methods to capture sound vertically as well as horizontally simultaneously, thus laying
the foundation for the immersive sound systems to come (Eargle, 1986).

Monophonic Systems and Distance Location


When a single loudspeaker is used to reproduce sound, the system is considered monophonic.
Snow described the effect of monophonic systems as if “sound (is) coming through a hole in the
wall” (Snow, 1953, p. 44). Compared to stereo and binaural systems, extreme spatial errors can
exist using monophonic systems. When recording multiple directional sound sources across the
sound stage with a single microphone, sounds arriving from many directions are captured and
fused together into a single signal. When the mono signal is later reproduced through a single
loudspeaker, the directional information from the recording environment may appear to be lost.
But remarkably, a monophonic system can give the illusion of depth and space if the sound
recording has captured enough spatial information such as wall reflections, reverberation, and
location-dependent spectral information. This is certainly the case with many early wax cylinder
and disk recordings where a single transducer was used to record several sound sources. The
high-frequency content and the relative balance between the direct and diffused sound for each
sound source can give the listener a good idea of how far away the sound sources were from the
microphone. In general, sources recorded close are characterized by a high level of timbre detail,
whereas sources recorded from a distance will have a considerable alteration in timbre and a general
lack of definition (Moylan, 2007). Captured early reflections and reverberation can provide
spatial cues with information about the size and treatment of the space in which the recording
was made. Even so, during playback from a monophonic system, the perceived sound image will
stay with the loudspeaker. In other words, monophonic systems are very speaker-centric. Ironically,
unlike stereo and surround sound reproduction, monophonic systems have a wide listening
area where the program balance is perceived correctly.

Stereo Monitoring
A stereo system can create a realistic illusion of directional sounds arriving from across the hori-
zontal plane in front of the listener, bounded by two or more loudspeakers—and even beyond
(see Chapter 5). The ideal stereo listening position for 2-channel stereo is known as the sweet
spot. A listener placed in the sweet spot is centered in between the loudspeakers and is directly
facing the loudspeakers' baseline (Dickreiter, 1989). The ITU's stereo sound evaluation specification
recommends locating the loudspeakers 30 degrees off the center line, at equal distance, and
in front of the listener for 2-channel stereo monitoring (see Figure 3.1). The direction of the
sound image and the sense of spaciousness experienced by a listener outside of the ideal listening
area can be distorted and unstable. Even a subtle turn of the head can alter the acoustic pathways
to the ears enough to affect the spectral, directional, and spatial attributes of a stereo program.
The stereo listening area and stereo imaging can be adapted and optimized by adjusting the loudspeaker
spacing, direction (also known as toe-in angle), directional radiation characteristics, and
the frequency response (Eargle, 1986). The size, design, and acoustic treatment of the listening
room greatly affect the perception of stereo images as well. Griesinger (1985) found that good
low-frequency response is very important for creating the sense of spaciousness and that a poor
choice of speaker location in a room can cause a stereo signal to sound more monaural than
desired. If a listening space suffers from severe spectral filtering due to modal frequencies, early
reflections, or reverberation in the room, it is difficult for the listener to tell if the spectral coloring
and/or room sounds they hear are in the signal or are being generated and superimposed onto the
signal by the listening room. In this situation, headphone monitoring can be used to get a better
sense of what is actually captured in the sound recording. On the other hand, a listening space
can constructively add a sense of spaciousness and immersion to enhance the listener's experience.
Surely, if a stereo program is monitored loud enough in a small room or car, the sound
image can envelop the listener even though sound is being emitted by only two loudspeakers.

Figure 3.1 Stereo monitoring with the loudspeakers positioned +30 and -30 degrees off center. Phantom
images make up the center sound stage in between the loudspeakers. Sounds hard-panned
in the left or right channel only are perceived at the respective loudspeaker. The spacing of
1 kHz high- and low-pressure wave fronts is indicated for reference (solid and dot-dashed
lines) (wavelength = 34.32 cm in air). The direct acoustic pathways to the ears are indicated
as well (dashed).

Directional information can be encoded as inter-channel time differences (ICTDs) and/or inter-channel
level differences (ICLDs) into a stereo program. ICLDs are less effective at low frequen-
cies due to their long wavelengths in comparison to the average size of the human head (see
Chapter 1). For 2-channel stereo systems with speakers located 30 degrees off axis, Cooper suggests
that the amplitudes of frequencies below 327 Hz are not affected by head shadowing based
on wave front head shadow models (Cooper, 1987), as such waves will diffract easily around
the head. However, inter-channel level difference in stereo loudspeaker projection will result in
inter-aural time difference, due to speakers being positioned to the sides of the listener, and this
will produce a directional auditory sensation. This conversion from inter-channel amplitude dif-
ference to inter-aural time difference cannot occur in headphone listening. Of course, directional
sensation will be stronger for non-stationary, transient sounds compared to stationary, continu-
ous sounds (see Chapter 1).
ICTDs alone start to provide effective localization for frequencies below 1 kHz for stereo
systems. This includes the frequencies that have wavelengths in air of more than 34 cm, about
twice the diameter of a human head. Blumlein described a crossover area, centered around
700 Hz, where both phase and level differences provide localization information to the
brain. For higher frequencies above 2 kHz, the brain may produce a false reading of phase
difference as the wavelengths fall below the size of the human head, but at this point,
the head shadow starts to take significant effect and will attenuate higher frequencies at the
opposite ear (Blumlein, 1933). Even so, research suggests that panning systems that use ICTDs
in conjunction with ICLDs provide the most natural sense of source localization (Theile, 1991;
Lee & Rumsey, 2013).
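
The frequency limits quoted above follow from comparing wavelength with head size. A minimal sketch of that arithmetic is given below; the speed of sound of 343.2 m/s is an assumption of this sketch, chosen because it reproduces the 34.32 cm wavelength quoted for 1 kHz, and the function name is purely illustrative.

```python
# Assumed speed of sound in air (m/s); gives 34.32 cm at 1 kHz.
SPEED_OF_SOUND_M_S = 343.2


def wavelength_cm(frequency_hz: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Return the wavelength in centimetres of a tone at frequency_hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 * speed_m_s / frequency_hz


if __name__ == "__main__":
    # Frequencies mentioned in the text: Cooper's 327 Hz limit, Blumlein's
    # 700 Hz crossover, and the 1 kHz and 2 kHz localization boundaries.
    for f in (327.0, 700.0, 1000.0, 2000.0):
        print(f"{f:6.0f} Hz -> {wavelength_cm(f):6.1f} cm")
```

Under this assumption, 327 Hz corresponds to a wavelength of a little over one meter, several times the size of the head, while 2 kHz corresponds to roughly 17 cm, about the diameter of the head itself.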

Phantom Sound Images


Signals of mono sound sources are often mixed into 2-channel stereo programs creating multi-
mono signals delivered through two loudspeakers. By duplicating a mono signal and routing it
to both loudspeakers of a 2-channel stereo system, a phantom centered sound image appears
between the loudspeakers. A phantom image may be perceived as a virtual point source, or
be spread to exhibit some degree of width. Electronic methods to create stereo width of an
object are discussed later in this chapter. Although a phantom image of a source produced
with two loudspeakers is less clear and spatially less precise than the source image produced
by a single loudspeaker, phantom sound sources may appear very realistic to the listener
located precisely in the sweet spot. As the listener moves laterally out of the sweet spot, the
left and right acoustic pathways from the two loudspeakers to the listener's ears are no longer
equal and the phantom image appears to move and follow the listener to the closest speaker
(see Figure 3.2). At some point, when the sound level and arrival times from each loudspeaker
are no longer perceived as equal at the listener's ears, the phantom center program image
will become unstable, less focused, and appear to follow the listener to the closest speaker.
At the extreme, as the listener moves farther to one side of the listening area, all phantom
centered images will now be perceived as coming from only one loudspeaker. Unlike phantom
sound images, produced by stereo loudspeaker pairs, sounds panned hard left or hard right
will behave in a monophonic way, staying with their assigned single loudspeaker despite the
listener's location in the room.

LCR Stereo
The stability of the center portion of a 2-channel stereo image can be greatly improved by the
addition of a dedicated center channel. Left-center-right (LCR) systems are commonly used in
larger venues to accommodate a wide seating area. In film houses, the center channel keeps
dialogue and other on-screen sounds firmly centered for all viewers. For this reason, LCR speaker
configurations are used for the front channels of many surround sound systems (see Chapter 6).
As illustrated in case D in Figure 3.2, the center channel is locked to the center of the sound stage
(and to the center of the picture stage, defined by the screen); therefore, any signal sent to the center
channel will appear in that position regardless of where the listener/viewer is seated. Dialogue
and music can be integrated into an LCR stereo image by simultaneously using the center
channel for mono dialogue signals while routing 2-channel stereo signals like music and some sound
effects to the left and right speakers. In practice, some of the 2-channel music and sound effects
can be mixed into the center as well for stability. In either case, with the center channel employed,
excellent spatial focus, stability, and clarity are achieved for the center material, such as dialogue,
while a wider stereo image is created for music and effects.

Figure 3.2 An illustration representing how phantom images are affected by the listener's location in
the sweet spot (A and C) and outside of the sweet spot (B and D) for 2-channel stereo
systems (A and B), and for 3-channel stereo systems (C and D). The dotted lines represent the
acoustic pathways from the speakers to each ear.

Middle-Side Stereo
Two-channel stereo programs are typically stored and broadcast as paired left and right
signals. Alternatively, a stereo program can be stored as a paired middle signal and side signal. The
middle signal is derived from a unidirectional microphone oriented toward the center of the
sound stage or electronically by summing left and right stereo program signals, whereas the side
signal is derived from a bidirectional microphone oriented laterally with its null facing the center
of the sound stage (see Figure 3.3), or electronically by taking the difference of the left and right
stereo signals. The side signal effectively cancels out sounds arriving from the front and rear
center. The phase information stored in the side signal can be used to decode directional
information across the stereo field when paired with the middle signal by using a sum and difference
matrix (discussed below).
As mentioned above, to convert any 2-channel left-right stereo (XY stereo) program to a
middle-side stereo program (MS stereo), the middle signal is obtained by summing the left and
right signals, whereas the side signal is obtained by polarity-inverting the right signal and summing
it with the left signal, effectively creating the difference between the left and right signals. The relative
intensities of the middle and side signals will vary depending on the stereo width of the program
material.

Middle = Left + Right

Side = Left - Right

Figure 3.3 Polar patterns for a cardioid-based middle-side system. The side signal is bi-polar, a
requirement for middle-side systems.

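In theory, this process can be reversed without signal loss. The left signal can be restored by
summing the middle and the side signals. Similarly, the right signal can be restored by finding the
difference between middle and side signals. With the sums defined above, each result is twice the
original signal, so a gain of one half (about 6 dB of attenuation) restores the original level.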

Left = Middle + Side


Right = Middle - Side

The middle signal emphasizes the center of the stereo image. Any multi-mono signals represent-
ing phantom centered images in the stereo program will be summed perfectly in the middle signal.
At the same time, these perfectly centered signals will be cancelled out of the side signal thus
leaving only lateral, differential, or de-correlated stereo program material like stereo reverbera-
tion, stereo delays, and panned sources in the side signal. Middle-side recording and processing
techniques will be discussed in detail later in this chapter.
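
The sum and difference relationships above can be sketched in a few lines of code. The example below is a minimal illustration rather than a production matrix: the function names are hypothetical, signals are plain lists of samples, and the decode gain of 0.5 is an assumption of this sketch that compensates for the fact that Middle + Side equals twice the left signal.

```python
def lr_to_ms(left, right):
    """Encode left/right samples into middle (sum) and side (difference)."""
    middle = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return middle, side


def ms_to_lr(middle, side, gain=0.5):
    """Decode middle/side samples back to left/right.

    middle + side equals 2 * left and middle - side equals 2 * right,
    so gain=0.5 (an assumption of this sketch) restores the original level.
    """
    left = [gain * (m + s) for m, s in zip(middle, side)]
    right = [gain * (m - s) for m, s in zip(middle, side)]
    return left, right


if __name__ == "__main__":
    # First sample is dual-mono (identical in both channels); the rest differ.
    left = [0.5, -0.25, 0.0, 1.0]
    right = [0.5, 0.25, -0.5, 0.0]
    middle, side = lr_to_ms(left, right)
    print("middle:", middle)
    print("side:  ", side)  # the dual-mono sample cancels to 0.0 in the side signal
    print("decoded:", ms_to_lr(middle, side))
```

The first sample, identical in both channels, sums fully into the middle signal and vanishes from the side signal, which is the behavior described above for phantom centered material.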

Phase Correlation Metering


The term phase is often used in audio to describe the time-based relationship of two or more
elementary audio signals, such as sine waves. For complex signals containing multiple
frequencies, group delay defines relative signal delay, and a signal's polarity determines its waveform
orientation. The term phase flip or phase inversion implies that the polarity of an audio
signal has been reversed, creating a mirror image when viewing the signal's waveform. A phase
shift implies that a displacement in time has occurred. Phase correlation relates to the amount
of, or lack of, phase inversion detected between two signals. For example, an in-phase,
dual-mono signal, meaning the same signal in each channel, will have a phase correlation of +1 (see
Figure 3.4).
If the left signal has little in common with the right signal, the phase correlation value is zero. If
the left signal contains a phase-inverted copy of the right signal, the phase correlation will go to
negative 1 (see Figure 3.5).
In other words, phase correlation describes just how in-phase, or out-of-phase, two signals are;
or, similarly, just how correlated or de-correlated they are; or how similar or dissimilar two
signals are. The amount of correlation measured between the left and right signals can be monitored
with a phase correlation meter (see Figures 3.4, 3.5, and 3.6). The meter range is from -1 through 0 to
+1. For example, a correlation reading of +1 indicates the stereo program is dominated by
dual-mono signals. A reading of +0.5 indicates a mixture of left, right, and phantom centered images.
A reading of 0 indicates a lack of correlation or random correlation, thus indicating a wide stereo
image, for example, when two very different signals are panned to the opposite channels. This
can also occur when there is a high degree of similar musical content in both channels, but not
identical sound recordings, as is the case when a musical part is doubled and each performance
is panned hard to opposite sides. If the program content in each channel is too different, a hole in
the middle of the stereo image may occur due to the absence of a phantom center image caused
by the lack of common elements in the left-right signals. If the phase correlation falls below zero,
it is an indication that a portion of the stereo program is phase-inverted in one of the channels.
When inter-channel phase-inverted portions of the stereo program are present, the mono
down-mix will be compromised and can suffer from spectral coloration and/or severe gain loss. If a
good mixture of phantom center and split stereo program material exists, the phase correlation
meter will hover around +0.5 (see Figure 3.6). In general, this is a good sign indicating that most
likely a wide stereo image without a hole in the center exists.
Since stereo programs can contain a complex mixture of dual-mono, split, and de-correlated
stereo signals, the reading of the correlation meter does not always correspond precisely to the
perception of stereo width. Therefore, it is safe to say that using your ears is still the best way to
adjust the stereo program width during production. Even so, for mixing, mastering, and
broadcast engineers, the correlation meter is a very useful metering device to warn of potential holes in
the middle of the stereo image, an overly mono mix, or mono-compatibility issues.

Figure 3.4 Two signals perfectly in phase with a phase correlation of +1 (Huber, 1992).

Figure 3.5 Two signals perfectly out of phase with a phase correlation of -1 (Huber, 1992).

Figure 3.6 Phase correlation of +0.5 (Huber, 1992).
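
The meter readings described in this section can be approximated in software. The sketch below uses one common definition, the zero-lag normalized cross-correlation of the two channels over a block of samples; this definition, the block-based (rather than continuously averaged) measurement, and the function name are assumptions of the sketch and not a description of any particular hardware meter.

```python
import math


def phase_correlation(left, right):
    """Zero-lag normalized correlation of two equal-length sample blocks.

    Returns a value from -1 to +1, or 0.0 if either channel is silent.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy > 0.0 else 0.0


if __name__ == "__main__":
    n = 64
    sine = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
    cosine = [math.cos(2.0 * math.pi * k / n) for k in range(n)]
    inverted = [-s for s in sine]

    print("dual-mono:      ", round(phase_correlation(sine, sine), 3))
    print("polarity flip:  ", round(phase_correlation(sine, inverted), 3))
    print("90-degree shift:", round(phase_correlation(sine, cosine), 3))
```

Identical channels read +1, a polarity-inverted copy reads -1, and a sine paired with a cosine of the same frequency reads close to 0, matching the three reference points on the meter scale.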

Creating a Stereo Image


Stereo programs can contain complex inter-channel phase, level, and spectral relationships. Ste-
reo microphone techniques and/or signal processing equipment can use one or a combination
of these relationships to encode directional and spatial information into a 2-channel stereo pro-
gram. This spatial information can be used to imitate a natural listening experience or to create
perceived alternative sonic environments for a listener. Surely, the possibilities are endless and a
great degree of artistic license and creative potential exists that should be explored. Fundamental
stereo recording techniques, panning methods, and stereo enhancement techniques will be dis-
cussed next.

Stereo Microphone Techniques


By placing two or more microphones in a sound field, a directional sound image can be
captured. The effective recording area is determined by the directional characteristic, the distance
between, and the angle orientation of the microphones. Below are some general guidelines for
stereo recording.

• Both microphones should be placed to capture an abundance of directional sounds, including
direct sounds, room reflections, and reverberation, to maintain a certain degree of
decorrelation between the microphones.
• The microphones should not be placed in total isolation from one another; all recorded
sounds should be effectively heard by both microphones.
• The pick-up angle between the microphones should be no greater than 180 degrees.

XY Recording
The XY recording technique uses a matched pair of directional microphones positioned so that
level information is captured with minimal acoustic time of arrival differences between the two
capsules. Therefore, XY recording is considered a pure level difference recording system. In prac-
tice, an angle of 90 degrees between the axes of the highest sensitivity of the microphones (their
0°—on axis) is normally used. The angle can be varied at the discretion of the engineer to nar-
row or widen the recorded sound image. Sounds arriving from the center will be captured evenly
by both microphones and will appear as phantom center images when reproduced through a
2-channel stereo loudspeaker system. Sounds arriving off-center will be attenuated at the oppo-
site microphone thus providing a stereo impression when listening to the loudspeaker playback.
The amount of off-center attenuation at each microphone is determined by the angle of incidence
and by directional characteristics of the microphone used.
When the left signals and the right signals from an XY recording are summed, the resulting
mono signal takes on a directional characteristic similar to that of the microphones being used. For
example, the summation of an XY pair of cardioid signals yields a single mono cardioid signal
(see Figure 3.7). Similarly, the summation of an XY pair of figure-of-eights (a Blumlein Pair)
yields a single figure-of-eight signal (Dickreiter, 1989) (see Figure 3.8). When using XY recording
techniques, the mono down-mix signal is without unwanted comb-filter effects (but with boosted
center information), and will be perceived a bit closer to the center sound stage than its stereo
counterpart.
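
The mono sum of a coincident pair can be checked numerically with an idealized microphone model. The sketch below assumes ideal first-order polar patterns of the form a + (1 - a)cos(angle), with a = 0.5 for a cardioid and a = 0 for a figure-of-eight, and axes at 45 degrees either side of center; the model and the function names are assumptions of this sketch, not measurements of real microphones.

```python
import math


def first_order_gain(angle_deg, axis_deg, omni_part):
    """Sensitivity of an ideal first-order microphone aimed at axis_deg.

    omni_part = 0.5 models a cardioid, omni_part = 0.0 a figure-of-eight.
    Angles are in degrees, with 0 at the center of the sound stage.
    """
    offset = math.radians(angle_deg - axis_deg)
    return omni_part + (1.0 - omni_part) * math.cos(offset)


def xy_mono_sum(angle_deg, omni_part, half_angle_deg=45.0):
    """Sum of an XY pair whose axes point at -half_angle_deg and +half_angle_deg."""
    return (first_order_gain(angle_deg, -half_angle_deg, omni_part)
            + first_order_gain(angle_deg, half_angle_deg, omni_part))


if __name__ == "__main__":
    for name, omni_part in (("cardioid pair", 0.5), ("Blumlein pair", 0.0)):
        print(name)
        for angle in (0, 45, 90, 135, 180):
            print(f"  {angle:3d} deg -> {xy_mono_sum(angle, omni_part):+.3f}")
```

In this idealized model the Blumlein sum is an exact forward-facing figure-of-eight, with a null at 90 degrees and inverted polarity at the rear. The cardioid sum is a single forward-facing first-order pattern that is somewhat broader than a cardioid, and sources at the center are boosted relative to those at the sides, in line with the boosted center information noted above.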

MS Recording

Middle-side recording is another coincident stereo recording technique. The system consists of
a directional middle microphone oriented toward the center of the sound stage paired with a
figure-of-eight side microphone oriented laterally 90 degrees with its positive lobe facing the
left side (see Figure 3.9). The middle microphone is typically a cardioid, although virtually any
type of microphone can be used, including an omni-directional microphone. In any case, the side
microphone must have a true bi-directional characteristic for the system to work. The MS signal
can be converted to a 2-channel left-right stereo signal by using a sum and difference MS matrix
(Hibbing, 1989) as discussed earlier.
Using a sum and difference MS matrix, the side signal level can be adjusted to obtain the
desired stereo-base width. If the decoded stereo signal is summed back to mono, all of the side
information is cancelled out, thus totally restoring the middle signal. Therefore, the mono signal
will have significantly less lateral information and appear much closer to the center sound stage
than its stereo counterpart (see Figure 3.9, right side).

Figure 3.7 Cardioid pair XY configuration and its equivalent mono signal (Dickreiter, 1989).

Figure 3.8 Blumlein Pair configuration and its equivalent mono signal (Dickreiter, 1989).

Figure 3.9 MS pair configuration and its equivalent XY and mono signal (Dickreiter, 1989).
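
The width adjustment and the mono behavior of a decoded MS recording can be illustrated with a short sketch. The `side_gain` parameter and the function names are hypothetical, introduced only for this example; the decode follows the sum and difference matrix given earlier in the chapter.

```python
def ms_decode(middle, side, side_gain=1.0):
    """Decode MS samples to left/right, scaling the side signal to set the width.

    side_gain = 0.0 collapses the image to mono; larger values widen it.
    """
    left = [m + side_gain * s for m, s in zip(middle, side)]
    right = [m - side_gain * s for m, s in zip(middle, side)]
    return left, right


def mono_sum(left, right):
    """Sum left and right samples into a single mono signal."""
    return [l + r for l, r in zip(left, right)]


if __name__ == "__main__":
    middle = [1.0, 0.5, -0.25]
    side = [0.25, -0.5, 0.125]
    for gain in (0.0, 1.0, 2.0):
        left, right = ms_decode(middle, side, side_gain=gain)
        # Whatever the width setting, the side terms cancel in the mono sum,
        # leaving twice the middle signal.
        print(gain, left, right, mono_sum(left, right))
```

Whatever value is chosen for the side gain, the side contributions appear with opposite polarity in the two channels and cancel in the mono sum, which is why only the middle signal survives a mono fold-down.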

Spaced Pair Recording


Using a spaced pair of microphones, time-of-arrival differences are captured. A small AB
configuration, with a spacing of 16.5 cm to 30 cm, keeps the ICTDs within the scale of natural hearing.
Because only subtle level differences arise between the closely spaced microphones, small AB is
considered a pure stereo time-of-arrival recording technique. When working with microphones spaced
farther apart, greater amplitude and phase differences are captured. Spacing the AB system 1 to 2
meters or more apart will enhance the ICTDs and introduce greater ICLDs, thus creating a stereo
program with an enhanced width. The placement of microphones used to make an AB recording
essentially imitates the placement of the loudspeakers for 2-channel stereo monitoring (Zielinsky,
2016). The great advantage of using this system is in the ability to use high-quality
omni-directional microphones. Omni-directional microphones are known for their natural sound
quality, lack of proximity effect, and extended bass-frequency response. The spacing of the
microphones provides an exciting spacious stereo image but must be done with care. Poor
microphone placement can cause sound sources to inadvertently jet across the stereo image or lead to
a severe loss of channel correlation, thus creating a hole in the middle of the stereo image. Despite
these challenges, AB recording is a very popular technique among professional classical music
producers (see Figure 3.10).
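
The effect of microphone spacing on the captured time differences can be estimated with a simple geometric model. The sketch below assumes a distant source, so that the wave front arriving at the pair is effectively planar, and a speed of sound of 343.2 m/s; both are assumptions of this sketch and not part of the techniques described above, and the function name is illustrative.

```python
import math

# Assumed speed of sound in air (m/s).
SPEED_OF_SOUND_M_S = 343.2


def ab_time_difference_ms(spacing_m, source_angle_deg, speed_m_s=SPEED_OF_SOUND_M_S):
    """Arrival-time difference in milliseconds between two spaced microphones.

    Plane-wave (far-field) approximation: the extra path to the far microphone
    is spacing * sin(angle), with 0 degrees straight ahead of the pair.
    """
    extra_path_m = spacing_m * math.sin(math.radians(source_angle_deg))
    return 1000.0 * extra_path_m / speed_m_s


if __name__ == "__main__":
    for spacing in (0.165, 0.30, 1.0, 2.0):
        row = ", ".join(
            f"{angle:2d} deg: {ab_time_difference_ms(spacing, angle):5.2f} ms"
            for angle in (0, 30, 60, 90)
        )
        print(f"spacing {spacing:5.3f} m -> {row}")
```

Under these assumptions a small AB pair produces time differences of well under one millisecond even for a source fully to one side, whereas a spacing of 1 to 2 meters produces differences of several milliseconds, which is consistent with the enhanced width described for large AB.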

Figure 3.10 Spaced pair systems. Small AB (left, 16.5 to 30 cm) relies on head-related time of arrival
cues with little level difference being captured between the microphones. Large AB (right) is spaced
farther (1-2 meters), thus enhancing both ICTDs and ICLDs for a more spacious effect.

Near-Coincident Recording
Amplitude and time-of-arrival recording techniques can be combined using a pair of near-coin-
cident directional microphones. Taking a mixed ICLD and ICTD approach, a stereo image with
stable localization and a good sense of depth and spaciousness can be achieved while maintaining
excellent mono compatibility.

O.R.T.F

The O.R.T.F. stereo microphone technique was developed by French broadcasters in the 1960s.
This near-coincident stereo recording technique uses a pair of cardioid microphones spaced 17 cm
and oriented 110 degrees apart (see Figure 3.11 left side). Engineers have relied on this system
for decades to deliver a spacious yet fully mono compatible stereo image. Since 17 cm is close to
the average spacing between human ears, the system delivers a very familiar and natural sense
of spaciousness in headphone and loudspeaker listening. Similarly, the 110-degree orientation
causes sufficient attenuation of off-axis sounds, not unlike the effect of head shadowing,
delivering natural localization cues for directional sounds. When summing the left and right channels
to create a mono signal, the overall level and the direct-to-diffuse sound mix heard in the stereo
program are preserved with minimal spectral coloration, making excellent mono compatibility a
unique feature of the O.R.T.F. system.

N.O.S
The N.O.S. system (see Figure 3.11 right side) is a similar near-coincident microphone technique
developed by the Dutch Broadcasting Foundation. This system consists of two cardioid
microphones spaced 30 cm apart, with an angle of 90 degrees between them. Since 30 cm is the
approximate path from ear to ear going around the head, the system, like O.R.T.F., is related to
the way we naturally hear and therefore delivers a natural sounding recording while maintaining
excellent mono compatibility.

Acoustic Barrier Recording: Sphere and OSS


The stereo image produced by using near-coincident microphone techniques can be enhanced
with the addition of an acoustic barrier placed between the microphones. Theile (1991)
found that working with a solid sphere placed between omni-directional microphones deliv-
ered a natural (related to the human hearing system) inter-aural correlation necessary for a
good sense of depth and space in a sound recording. An acoustic barrier, like a sphere or a
disk placed in between a near-coincident pair of microphones, will cause a human head-like
acoustic shadow. The barrier physically blocks high frequencies while allowing lower fre-
quencies to diffract around it, thus creating a human head shadow-like effect. The amount
of low-pass filtering depends on the angle of incidence and the size of the barrier. Recording
with a binaural microphone system with the addition of pinnae can encode all directional
information including height, but can introduce steep notch filtering into both left and right
signals. Although less precise, recording with an acoustic barrier alone (no pinnae) can offer a
natural sounding direction-dependent low-pass filtering effect without severe coloration (see
Figure 3.12).

Figure 3.11 O.R.T.F. (left) and N.O.S. (right) stereo recording microphone configurations are shown in
relation to the size of the average human head.
[Figure labels: 17 cm and 110° (O.R.T.F.); 30 cm (N.O.S.).]

Figure 3.12 Examples of acoustic barrier-based stereo recording systems using an Optimal Stereo
System (OSS) disk (left), and a solid sphere (right).
[Figure labels: A. Optimal Stereo Signal (OSS); B. Sphere Recording; 16.5 cm; 20°; 30 cm barrier;
solid sphere, 15 to 20 cm diameter.]

LCR Recording
With the addition of a dedicated center microphone, the phantom center image produced by a
spaced microphone system can be enhanced. In addition, the spacing of the left and right micro-
phones can be greater than a conventional 2-channel recording system thus capturing greater
acoustic separation. If the intended playback system has a dedicated center channel, the center
signal can be routed directly into the center loudspeaker, creating a unified wave front working
together with the left and right channels (see Figure 3.13).

Decca Tree and OCT

The Decca Tree system consists of a central microphone flanked by two opposed left and right
microphones. Rather than using cardioid microphones like the OCT system (a similar 3-channel
recording system discussed in detail in Chapter 7), Neumann M50 or M150 microphones are
used. Both microphones have an omni-pressure capsule mounted flush to a small sphere designed
to enhance the directional characteristics of the microphone only in the mid and higher fre-
quencies while maintaining an omni-directional response in the lower frequencies. For orchestral
recording, the system is typically placed directly above the conductor (see Figure 3.14).

LCR Recording With a Coincident Stereo Center Channel

The center mono microphone in an LCR recording system can be substituted with a coincident
stereo pair of microphones like XY, Blumlein Pair, or MS. The center stereo system can provide
excellent control over the width of the center image in post-production. Based on the author's
experience, a coincident stereo system is desirable, as opposed to a spaced pair, because it can
deliver a highly focused uncolored mono signal if desired later during post-production.

Figure 3.13 ABC recording technique. A = left channel, B = center channel, and C = right channel.
The center B signal can be distributed evenly into the left and right channels for 2-channel
stereo reproduction.
[Figure labels: left, center, right; 1 meter, 1 meter.]

Figure 3.14 Microphone configuration for the Decca Tree recording method (Huber, 1992).
[Figure labels: 1.5 meters; 2 meters.]

Flanking Microphones (Outriggers)

An auxiliary pair of widely spaced microphones can be used to augment any mono or stereo
recording system. Normally, a stereo-paired set of microphones should contain a certain
balance of correlated and de-correlated sounds, but when using a pair of flanking microphones, the
goal is to capture highly de-correlated sound to widen the recording area and/or to enhance the
sense of spaciousness. The flanking system should always be used with a center counterpart to
prevent a hole in the middle of the stereo image. Flanking microphones are able to deliver highly
de-correlated low-frequency information that is difficult to obtain using closely spaced
microphones. Typically, the signals from the flanking microphones are panned hard left and hard
right, respectively, and set about 6 dB lower than the main center system (see Figure 3.15).

Figure 3.15 Flanking microphones used with a main center mono or stereo system to capture highly
de-correlated sounds. (Diagram labels: large sound stage; flank left, center (mono or stereo
system), flank right.)
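
As a rough illustration of the balance described above, the following Python sketch mixes a pair of flanking microphone signals into an existing left/right main signal. The function names, the list-of-samples representation, and the exact -6 dB default are assumptions made for this example only.

```python
def db_to_gain(level_db):
    """Convert a level in decibels to a linear amplitude gain."""
    return 10.0 ** (level_db / 20.0)


def mix_with_flanks(main_left, main_right, flank_left, flank_right, flank_db=-6.0):
    """Add hard-panned flanking signals to a main left/right pair.

    The flank-left signal feeds only the left channel and the flank-right
    signal feeds only the right channel. Both are scaled by flank_db, a
    hypothetical default of -6 dB relative to the main system.
    """
    gain = db_to_gain(flank_db)
    left = [m + gain * f for m, f in zip(main_left, flank_left)]
    right = [m + gain * f for m, f in zip(main_right, flank_right)]
    return left, right


# A mono center microphone would first be split equally into both main channels.
left, right = mix_with_flanks([0.2, 0.4], [0.2, 0.4], [1.0, 0.0], [0.0, 1.0])
print(left, right)
```

A level of -6 dB corresponds to a linear gain of roughly 0.5, so the flanking signals enter the mix at about half the amplitude of the main system.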

Stereo Panning

Level Based Panning

A good panning system should produce a sharp phantom image, and provide a smooth and
continuous illusion of fixed or moving sound sources between the loudspeakers without holes or
images bouncing abruptly (Gerzon, 1992). The stereo panning effect can be achieved using
channel level differences, delays, or equalization.
If the level of one side of a dual-mono 2-channel signal is attenuated, the image will start to
migrate to the opposite side. For most practical applications, ICLDs alone provide enough
directional information for music, speech, and most broadband sources. Using a panning
potentiometer, the sine/cosine law can be applied electronically to maintain constant acoustic
power when panning across two channels in the stereo field:

Left signal = cos(w) * Input signal

Right signal = sin(w) * Input signal

where w = the pan pot location expressed as an angle from 0 degrees (hard left) to 90 degrees
(hard right), so that both gains stay between 0 and 1.
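
The sine/cosine law can be checked numerically with a short Python sketch. It assumes the pan angle w is expressed on a 0 to 90 degree scale (hard left to hard right), which keeps both gains non-negative; the function name and the single-sample interface are illustrative choices.

```python
import math


def constant_power_pan(sample, pan_degrees):
    """Return (left, right) for one mono sample using the sine/cosine law.

    pan_degrees is assumed to run from 0 (hard left) to 90 (hard right).
    """
    if not 0.0 <= pan_degrees <= 90.0:
        raise ValueError("pan_degrees must be between 0 and 90")
    w = math.radians(pan_degrees)
    return math.cos(w) * sample, math.sin(w) * sample


# The summed power of the two channels stays constant across the pan range.
for angle in (0.0, 22.5, 45.0, 67.5, 90.0):
    left, right = constant_power_pan(1.0, angle)
    power = left ** 2 + right ** 2
    print(f"{angle:5.1f} deg  L={left:.3f}  R={right:.3f}  power={power:.3f}")
```

Because cos²(w) + sin²(w) = 1, the total power is the same at every pan position. At the center (45 degrees) each channel carries a gain of about 0.707, which is roughly 3 dB below the hard-panned level.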

Griesinger (2002) determined that using the sine/cosine law (with speakers spaced +/-45
degrees) to pan only high or low frequency sound sources tends to overshoot the desired panning
angle. Further, he found that speech-related spectra, in the range of 700 Hz to 1 kHz, dominate
our perception of localization.
A more recent study by Lee and Rumsey (2013) using musical sound sources concluded that
ICLD panning methods alone perform well regardless of the note pitch or duration of a musical
source (see Figure 3.16).

Delay Based Panning


The arrival of the first wave front can determine the location of a sound source (see Chapter 2).
By applying a delay to one side of a phantom-centered dual-mono signal, the phantom image will
migrate away from the side that is delayed. To be effective, the delay must be well below the
threshold of echo detection, otherwise the delayed occurrence will be perceived as a discrete
sound and the panning effect will be lost. In the author's experience, working with short delays
from 0.2 to 2 ms is effective. By using a low-pass filter on the delayed signal, the delay range can
be extended without an audible doubling effect. As mentioned before, the transient nature and
the spectral properties of sounds affect localization accuracy, and thus have an effect on panning
methods as well. Lee and Rumsey (2013) determined that delay-based panning performs well with
musical sources for all but higher sustained pitches (see Figure 3.17). The drawback of using a
delay to pan a signal within a stereo program is that the mono version may have unwanted
audible comb filtering effects.
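
The following Python sketch illustrates delay-based panning and the comb filtering that appears when the two channels are summed to mono. It assumes whole-sample delays and a hypothetical 48 kHz sample rate; the function names are illustrative. When a signal is added to a copy of itself delayed by T seconds, cancellations occur at odd multiples of 1/(2T).

```python
def ms_to_samples(delay_ms, sample_rate=48000):
    """Convert a delay in milliseconds to the nearest whole number of samples."""
    return int(round(delay_ms * sample_rate / 1000.0))


def delay_pan(mono, delay_samples, delay_right=True):
    """Return (left, right) with one channel delayed by whole samples.

    The phantom image moves away from the delayed side.
    """
    delayed = [0.0] * delay_samples + list(mono)
    direct = list(mono) + [0.0] * delay_samples
    return (direct, delayed) if delay_right else (delayed, direct)


def mono_sum_notches(delay_ms, max_hz=20000.0):
    """List the notch frequencies of (direct + delayed) up to max_hz."""
    first = 1.0 / (2.0 * delay_ms / 1000.0)
    notches = []
    k = 0
    while (2 * k + 1) * first <= max_hz:
        notches.append((2 * k + 1) * first)
        k += 1
    return notches


left, right = delay_pan([1.0, 0.5, 0.25], ms_to_samples(1.0))
print(len(left), len(right))
print(mono_sum_notches(1.0, max_hz=3000.0))
```

With a 1 ms delay the first mono-sum notch falls near 500 Hz, which is well within the audible band and explains why the mono compatibility of delay panning needs to be checked by ear.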

Figure 3.16 Lee and Rumsey (2013) ICLD study for speakers +/-30 degrees using musical sources.
Panel (a), overall ICLD results: interchannel level difference (dB) plotted against angle (degrees).

Figure 3.17 Lee and Rumsey (2013) ICTD study for speakers +/-30 degrees using musical sources.
Panel (b), overall ICTD results: interchannel time difference (ms) plotted against angle (degrees).

Combined Level and Delay Panning Methods


As mentioned earlier, using a combination of ICLDs and ICTDs can create a more natural
localization effect than using ICLDs or ICTDs alone (Theile, 1991; Lee & Rumsey, 2013). In natural
listening environments, our ears rarely receive identical acoustic signals, whereas when
monitoring stereo programs through headphones or loudspeakers, it is a common occurrence. As a
sound source moves closer to one side of our head in the natural listening environment, the
opposite ear receives a delayed, attenuated, and colored copy of the acoustic signal. This signal is
known as the cross-talk signal (see Chapter 5). In stereo headphone reproduction, the acoustic
cross-talk signal can be simulated by adding a copy of the signal to the opposite channel with an
appropriate delay and a low-pass filter applied to the copy. Delays up to 0.5 milliseconds best
simulate natural acoustic cross-talk (Nacach, 2014). Unlike headphone monitoring, a listener
monitoring through loudspeakers will experience a double cross-talk signal when head-related
panning delays are introduced (electronically generated and real), since the acoustic cross-talk
signal from the opposite loudspeaker still exists at each ear. Even so, through loudspeakers,
combining level and delay panning methods creates a spacious and natural sounding panning effect.
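
A simulated cross-talk path for headphone reproduction can be sketched in Python as follows. This is only an illustration of the idea of a delayed, attenuated, low-passed copy fed to the opposite channel; the one-pole filter, the 0.3 ms delay, the -6 dB level, and the 2 kHz cutoff are all hypothetical values chosen for the example, not settings taken from the cited study.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """Very simple one-pole low-pass filter (an illustrative filter choice)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in signal:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out


def add_crosstalk(left, right, sample_rate=48000, delay_ms=0.3,
                  gain_db=-6.0, cutoff_hz=2000.0):
    """Feed a delayed, attenuated, low-passed copy of each channel to the other."""
    delay = int(round(delay_ms * sample_rate / 1000.0))
    gain = 10.0 ** (gain_db / 20.0)

    def cross(source):
        # Delay by whole samples while keeping the original length.
        delayed = ([0.0] * delay + list(source))[:len(source)]
        filtered = one_pole_lowpass(delayed, cutoff_hz, sample_rate)
        return [gain * s for s in filtered]

    from_left, from_right = cross(left), cross(right)
    out_left = [l + c for l, c in zip(left, from_right)]
    out_right = [r + c for r, c in zip(right, from_left)]
    return out_left, out_right


# An impulse in the left channel appears later, quieter, and duller on the right.
impulse = [1.0] + [0.0] * 63
silence = [0.0] * 64
out_left, out_right = add_crosstalk(impulse, silence)
print(out_right[:20])
```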
A combined level and delay panning method can use longer delay times as well. Haas (1949,
1951) determined that a 5- to 35-millisecond signal delay applied to one of the separated
loudspeakers changes the perceived panning location of a sound source to the direction of the
first-arriving (precedent) sound. As much as 10 dB of gain is required to balance the perceived
loudness of the later arriving sound through a stereo pair of loudspeakers. When a longer delay is
applied in combination with some or even no make-up gain in the delayed channel, the results
include increased spaciousness, broadening of the source image, and eventually splitting of the
image and the emergence of echo perception. A low-pass filter can be implemented on the delayed
channel to reduce the perception of an echo or a doubling of the sound source caused by panning
delays. In addition, phase inversion, reverberation, and pitch-shift can be introduced into the
delayed channel to further enhance the spatial quality of the stereo image.

Stereo Enhancement
Pseudo Stereo
Mono signals can be processed to create a pseudo stereo spatial impression using middle-side
processing techniques (Faller, 2005). In this scenario, the unprocessed mono source signal becomes
the middle signal of the pseudo stereo middle-side pair. Processing the mono source signal
generates an artificial side signal. The side signal path serves as the de-correlating processor as
indicated in Figure 3.18. The final pseudo stereo signal is generated using a conventional
middle-side to 2-channel (MS to XY) stereo decoder. If the pseudo stereo signal is summed to
mono, the pseudo side signal cancels out completely, thus restoring the original mono signal.
Three methods of stereophonic decorrelation by using a pseudo side signal are discussed below.
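
The signal flow described above can be sketched in a few lines of Python. The sketch assumes a simple sum and difference decoder (left = middle + side, right = middle - side) and uses a short sample delay as a stand-in decorrelator; the function names and the width parameter are illustrative.

```python
def ms_decode(mid, side, width=1.0):
    """Decode a middle-side pair to left/right, scaling the side by width."""
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right


def pseudo_stereo(mono, decorrelate, width=1.0):
    """Use the mono signal as the middle and a processed copy as the side."""
    side = decorrelate(mono)
    return ms_decode(mono, side, width)


def short_delay(signal, samples=3):
    """Hypothetical decorrelator: delay by a few samples, keeping the length."""
    return ([0.0] * samples + list(signal))[:len(signal)]


mono = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0]
left, right = pseudo_stereo(mono, short_delay, width=0.8)

# Summing left and right cancels the side signal and returns the mono source.
mono_sum = [0.5 * (l + r) for l, r in zip(left, right)]
print(left)
print(right)
print(mono_sum)
```

Because the side signal enters the two channels with opposite polarity, it disappears from the sum regardless of which decorrelating process generated it.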

Figure 3.18 Block diagram of a mono to pseudo stereo processor. (Diagram labels: the mono signal
feeds the middle signal directly and feeds the side signal through a decorrelation stage (eq,
delay, and/or reverb) and a width control; left signal = middle signal + side signal; right
signal = middle signal + phase-inverted side signal.)

Split Equalization Effect

a low-passed version of the mono input signal in the left channel and a high-passed version of
the mono input signal in the right channel. A frequency splitting effect can be accomplished
using virtually any type of spectral processor including a multi-band, parametric, or graphic
equalizer.

Split Comb Filter Effect


Using a similar approach, a short delay can be inserted into the pseudo side signal path to create
a stereo split comb filter effect. The first split occurs at Fc = 1/(2 * delay time (sec)) and
continues upward at n*Fc intervals, where n is a whole number: 1, 2, 3, etc. This technique applies
a comb filter to the left signal with a perfectly inverted version of the comb filter applied to the
right signal, thus splitting the spectral content of the input signal at regular intervals between
the left and right stereo channels. If the resulting left and right channels are summed to create a
mono version of the pseudo stereo program, the comb filter effect will be completely cancelled
out, thereby restoring the original mono signal.
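
The complementary comb responses can be verified with a small Python calculation. The sketch assumes an ideal delay in the side path and a unity-gain decoder (left = middle + side, right = middle - side); the 1 ms delay is a hypothetical value.

```python
import cmath
import math


def split_comb_gains(freq_hz, delay_s):
    """Magnitude gains of left (M + S) and right (M - S) when S is M delayed."""
    delayed = cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return abs(1 + delayed), abs(1 - delayed)


delay_s = 0.001                 # hypothetical 1 ms delay in the side path
fc = 1.0 / (2.0 * delay_s)      # first split frequency

# At each multiple of Fc the energy sits entirely in one channel, alternating
# between left and right as n increases.
for n in range(0, 6):
    left_gain, right_gain = split_comb_gains(n * fc, delay_s)
    print(f"n={n}  f={n * fc:7.1f} Hz  left={left_gain:.3f}  right={right_gain:.3f}")
```

Under these assumptions the left channel has its notches at odd multiples of Fc and the right channel has its notches at even multiples (including 0 Hz), so the two responses interleave.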

Delay and Reverberation

Similarly, a pseudo side signal can be created using longer delays, for example in the 20-50
ms range, and/or reverberation. Using these methods, a spacious stereo effect is created.
As with all MS-based pseudo stereo processing effects, the mono version of the program (when
left and right channels are combined) will cancel out the spacious effect and sound much drier
than the stereo version.

Stereo Width Enhancement

MS Processing
Once a 2-channel stereo signal is converted to a middle-side pair (XY to MS conversion), the
stereo width can be adjusted to a certain degree by altering the balance between the middle
and side signals (see Figure 3.19). Boosting the side signal will cause the stereo image to widen
whereas attenuating the side signal will cause the stereo image to narrow. Similarly, the middle
and side signals can be processed independently for a variety of effects, and then converted back
into 2-channel stereo. For example, boosting the high frequencies of the middle signal alone will
emphasize the centered program material when the program is converted back to 2-channel
stereo. Similarly, boosting the low frequencies in the side signal will enhance the sense of space.
Virtually any effect, including equalization, compression, delay, and reverberation, can be applied
discretely to the middle and/or side signals to create a dynamic or static stereo effect. The
effectiveness of middle-side processing is highly program dependent. For example, if a stereo
program is dominated by dual-mono signals, it will have an anemic side signal.

Figure 3.19 2-channel stereo to middle-side encoder (XY to MS converter) (top). A middle-side to
2-channel stereo decoder with stereo width control (MS to XY converter) (bottom). (Diagram
labels, top: middle signal = left signal + right signal; side signal = left signal +
phase-inverted right signal. Bottom: left signal = middle signal + side signal; right signal =
middle signal + phase-inverted side signal, with a width control on the side signal.)
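
The XY to MS conversion and width control described above can be sketched in Python. The 0.5 scaling in the encoder is an assumption made here so that a width of 1.0 returns the original signal unchanged; the function names are illustrative.

```python
def xy_to_ms(left, right):
    """Encode left/right into middle (sum) and side (difference) signals."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def ms_to_xy(mid, side, width=1.0):
    """Decode middle-side back to left/right, scaling the side by width."""
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right


def adjust_width(left, right, width):
    """width < 1 narrows the image, width > 1 widens it, 0 gives mono."""
    mid, side = xy_to_ms(left, right)
    return ms_to_xy(mid, side, width)


left = [1.0, 0.2, -0.5]
right = [0.4, 0.2, 0.5]
print(adjust_width(left, right, 1.5))   # wider
print(adjust_width(left, right, 0.0))   # collapsed to mono

# A dual-mono program has no side signal, so the width control does nothing.
print(adjust_width([0.3, 0.6], [0.3, 0.6], 2.0))
```

The last call shows why the technique is program dependent: when left and right are identical, the side signal is zero and there is nothing for the width control to boost.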

Delay Based Effects and Reverberation

To increase the sense of envelopment, reverberation and delays have been used to enhance
mono and stereo recordings since the 1950s, when engineers began to experiment regularly
with reverberation chambers and tape delays. Often, these time-based effects introduced subtle
pitch variations because unstable tape speeds and reverberation chambers can affect spectral
balance and the perception of pitch. Using a dedicated reverberation and/or
delay channel, dry signals can be panned independently and opposite of the effect to create a
natural stereo width sensation (see Figure 3.20). Phase inversion can be applied to one side
of the stereo processed effects as well to enhance the perceived size of the space. Griesinger
(1985) observed that some phase differences in 2-channel stereo signals are necessary for a good
sense of spaciousness.

Figure 3.20 Reverberation applied to a hard-panned mono source to create a stereo effect.
(Diagram labels: reverb, direct sound, reverberant sound.)

Summary
As discussed in this chapter, immersive effects can be created and perceived in stereo. They can
be captured acoustically by recording a combination of directional, distance, spectral, phase, and
environmental information, or be created electronically using stereo or middle-side processing
techniques in combination with spectral, delay, phase, and reverberation processing. In any case,
spatial and directional stereo effects are based on the listener's binaural perception of sound, and
the illusion of virtual sources and space created with loudspeakers placed in front of the listener,
not by physically surrounding the listener with real acoustic sources. Alternatively, when multiple
loudspeakers surround the listener (see Chapters 6, 7, 8, and 9), a physical immersion in acoustic
signals can exist. Even then, because the number of available loudspeakers is limited, the
perception of immersive effects must still rely on psychoacoustic principles. Therefore, stereo
principles can be applied to more complex multichannel systems if we consider that each pair of
loudspeakers can function as a stereo sub-system within the larger configuration.

Note
1 The International Telecommunication Union is a body coordinating standardization for telecommunications.

References
Blauert, J. (1997). Spatial Hearing: The Psychophysics of Human Sound Localization, Rev. ed. (J. Allen,
Trans.). Cambridge, MA: MIT Press.
Blumlein, A. (1933). British Patent Specification 394,325. Reprinted, Journal of Audio Engineering Society,
6(2), 91.
Cooper, Duane H. (1987). Problems with Shadowless Stereo theory: Asymptotic spectral status. Journal of
Audio Engineering Society, 35(9), 629-642.
Dickreiter, Michael. (1989). Tonmeister Technology. New York: Temmer Enterprises.
Eargle, J. (1986). An analysis of some off-axis stereo localization problems. Presented at the 79th AES
Convention, New York.
Faller, C. (2005). Pseudostereophony revisited. Presented at 118th AES Convention, Barcelona.
Gerzon, M. (1992). Panpot laws for multispeaker stereo. Presented at 92nd AES Convention, Vienna.
Preprint 3309, Audio Engineering Society.
Griesinger, D. (1985). Spaciousness and localization in listening rooms: How to make a coincident record-
ing sound as spacious as a spaced microphone arrays. Presented at 79th AES Convention, New York.
Griesinger, D. (2002). Stereo and surround panning in practice. Presented at the 112th AES Convention,
Munich.
Haas, H. (1949). The influence of a single echo on the audibility of speech. Reprint, Journal of Audio
Engineering Society, 20, 145-159.
Haas, H. (1951). Über den Einfluss eines Einfachechos auf die Hörsamkeit von Sprache. Acustica, 1, 49-58.
Hibbing, M. (1989). XY and MS microphone techniques in comparison. Presented at 86th AES
Convention, Hamburg. Preprint 2811 (A-5).
Huber, D. (1992). Microphone Manual: Design and Application. Waltham, MA: Focal Press.
Janovsky, W. H. (1948). “An apparatus for three dimensional reproduction...” German Federal Republic
Patent No. 973570 (cited in Blauert, 1997).
Lee, H.-K., & Rumsey, F. (2004). Elicitation and grading of subjective attributes of 2-channel phantom
images. Presented at 116th AES Convention.
Lee, H.-K., & Rumsey, F. (2013). Level and time panning of phantom images for musical sources. Journal
of Audio Engineering Society, 61(12).
Moylan, W. (2007). Understanding and Crafting the Mix: The Art of Recording. Burlington: Focal Press.
Nacach, S. (2014). The Duplex Panner: Comparative testing and applications of an enhanced stereo
panning technique for headphone reproduced commercial music. Presented at the 137th AES Convention,
Los Angeles.
Snow, W. (1952). Basic principles of stereophonic sound. Journal of Society of Motion Pictures and
Television Engineers, 61, 567-589.
Snow, W. (1953). Basic principles of stereo sound. Society of Motion Pictures and Television Engineers.
Reprint, Journal of SMPTE, 61, 567-587.
Theile, G. (1991). On the naturalness of two-channel stereo sound. Journal of Audio Engineering Society,
39(10), 761-767.
Zielinsky, G. (2016). Personal communication with the author.
