Altman, Rick

Sound space

Altman, Rick, (1992) "Sound space", Altman, Rick (ed.), Sound theory, sound practice, 46-64, Routledge
© http://www.tandf.co.uk/journals

Staff and students of Napier University are reminded that copyright subsists in this extract and the work
from which it was taken. This Digital Copy has been made under the terms of a CLA licence which allows
you to:

* access and download a copy;
* print out a copy;

Please note that this material is for use ONLY by students registered on the course of study as
stated in the section below. All other staff and students are only entitled to browse the material and
should not download and/or print out a copy.

This Digital Copy and any digital or printed copy supplied to or made by you under the terms of this
Licence are for use in connection with this Course of Study. You may retain such copies after the end of
the course, but strictly for your own personal use.

All copies (including electronic copies) shall include this Copyright Notice and shall be destroyed and/or
deleted if and when required by Napier University.

Except as provided for by copyright law, no further copying, storage or distribution (including by e-mail)
is permitted without the consent of the copyright holder.

The author (which term includes artists and other visual creators) has moral rights in the work and neither
staff nor students may cause, or permit, the distortion, mutilation or other modification of the work, or any
other derogatory treatment of it, which would be prejudicial to the honour or reputation of the author.

This is a digital version of copyright material made under licence from the rightsholder, and its accuracy
cannot be guaranteed. Please refer to the original published edition.

Licensed for use for the course: "LMD08112 - Theory for Film Practice".

Digitisation authorised by Catherine Campbell


ISN: 0415904579
2
Sound Space
Rick Altman

“The real can never be represented; representation alone can be represented.
For in order to be represented, the real must be known, and
knowledge is always already a form of representation.” From this claim,
which I made in an earlier article on the role of technology in the history
of representation (1984), we can deduce several essential principles for
the writing of cinema history. In particular, we readily conclude that the
“reality” which each new technology sets out to represent is in large part
defined by preexistent representational systems. In order for its new mode
of representing to achieve acceptance, photography had to conform not to
reality as such, but to the visual version of this reality imposed by a certain
style of painting and engraving. In the same manner, the early years of
sound cinema were marked by a heavy debt to contemporary arbiters of
sound representation: radio, theater, phonography, and public address.
Such a theory is hardly devoid of problems, however. While it helps
us to understand how a nascent technology leans on preexisting forms, it
remains all too static, offering little insight into the processes whereby a
new form of representation is liberated from its models, eventually offering
to subsequent technologies its own representational norm. A constant
source of debate during the Hollywood thirties, the problem of sound
space provides a particularly clear test case, a unique occasion where a
change in representational norms is carefully discussed, documented, and
even quantified by contemporary technicians.
The single most important question occupying Hollywood sound techni-
cians during the late twenties and early thirties was this: what relationship
should obtain between image scale and sound scale? Disarmingly simple,
this question in fact implies a complex series of related problems. What
type of microphone should be used? Where should it be placed? May it
be moved during a take? Is it appropriate to make multiple image and/or
sound takes simultaneously? What sound take should be paired with what
image? What volume level should be used? Is it appropriate to mix multiple
takes? Under what circumstances must reverberation be added? And many
others. Indeed, as the most perceptive of sound technicians recognized
from the start, the question of sound scale foregrounds to an unexpected
extent problems of audience identification, of spectator pleasure, and of
subject placement.
Concentrating on the broad range of problems implied by the question
of sound scale, this essay will be divided into three parts. First, I will
trace through its various stages the industry’s desire for a match between
image scale and sound scale. Second, I will attempt to reach some general
conclusions about the standard sound practices which evolved in Holly-
wood during the thirties. Third, I will consider the ramifications of the
representational system thus constituted for the placement of the hearing
and seeing subject.

The Dream: Correlating Sound and Image

The question of sound scale, as contemplated by Hollywood technicians
of the early sound years, must be seen as part of a longer chain of attempts
to assure sound localization. Roughly stated, the three main approaches
involve:
1. manipulation at the place of exhibition, largely through speaker place-
ment and switching mechanisms (1927–31);
2. manipulation during production, especially of microphone choice and
placement, along with control of sound levels during editing (1929–
present);
3. development of multi-channel technology, eventually including stereo-
phonic localization capability (1930–present).
While the last approach falls largely outside the scope of this study,
some information on the first stage will provide appropriate background for
the second stage, on which the remainder of this article will concentrate.
That early attempts to localize cinema sound should have concentrated
on the movie house itself is hardly surprising. After all, for a quarter of
a century movie theater owners had been designing the sound accompani-
ment to silent pictures. Even before The Jazz Singer, Lee DeForest was
insisting that loudspeakers be located in the orchestra pit “to simulate a
fifty-piece orchestra” (DeForest, 72). As soon as the spoken word became
a staple of sound cinema, a second horn was added to the standard
orchestra pit speaker, this one above the screen and facing out, whereas
the orchestra horn typically aimed straight up (Rainey, Wilcox, Peck,
Hopkins 1930a and 1930b). The presence of two speakers, each destined
to reproduce a different type of sound, of course required a switching
mechanism. The projectionist, who needed to know the sound track like
an orchestra conductor, would simply switch the sound output to the
appropriate speaker each time there was a change from music to dialogue
or back. This of course assumes sound tracks where music and dialogue
are not mixed, a requirement assured by the difficulty of mixing separate
tracks before late 1930, by which time the separate speaker arrangement
had largely disappeared.
But if sound could be localized either on screen or in the orchestra pit,
then it could also, at least theoretically, be reproduced at different spots
on the screen, depending on where the person speaking is portrayed.
Never to my knowledge realized, this project, reported by J. C. Kroesen
in July, 1928, demonstrates the extent to which early technicians assumed
the necessity of tying sound to the image. “The screen,” said Kroesen,
“should be divided and so arranged that sound will be reproduced only at
or as near the point of action as possible” (Kroesen, 8). Pity the poor
projectionist trying desperately to operate the switching mechanism!
Nor was lateral placement the only possibility afforded by careful
speaker location. In an October, 1930, discussion, the Society of Motion
Picture Engineers reviewed the possibilities afforded by the recently intro-
duced multiple sound track technology. Realistic offstage effects can be
produced, a Mr. Ross pointed out, “by employing separate sound film
having a plurality of sound tracks, each related to a group of loudspeakers
located at points from which the sounds are to be produced. . . .” In this
manner, Ross suggested, depth localization as well as lateral placement
may be assured. “Any sound,” he said, “that one might wish to produce
from points other than the immediate foreground depicted on the screen
may be handled in this manner. Loudspeakers may be placed at remote
portions of the stage or auditorium. There are decided advantages for this
arrangement which are quite evident to anyone who has tried it.”1 Ross’s
protestations notwithstanding, Hollywood was not yet ready for the multi-
ple-channel solution.
Exhibitors were quite ready, however, to dispense with the need for
manual switching. Neither better instructions nor the short-lived introduc-
tion of automatic switching through control tracks could stem the tide. By
1931, the dream of sound localization through speaker placement was a
dead letter. In his description of the Western Electric Reproducing System,
Bell engineer S. K. Wolf clearly opposed the old-fashioned system (“when
it was desired that reproduced music should simulate accompaniment by
a theatre orchestra”) to what he called “modern practice,” which recog-
nized the fact that “in most instances the sound is desired to come from
the screen, and accordingly the horn or horns are placed behind it” (Wolf,
287).
Though the multiple-speaker approach to sound localization died an
early death, driven to its grave like many other innovations by the econom-
ics of exhibition and by the growing complexity of Hollywood’s sound
tracks, the need to stress sound’s spatial characteristics remained at the
center of debate in the limited but influential world of sound technicians.2
Whereas the proponents of localization through speaker placement and
mechanical switching had clearly in mind a theatrical or silent cinema
model, the most influential sound men of the period proposed a more
familiar and seemingly uncontested model: that of nature itself. Like
numerous other technicians of the period, Carl Dreher, chief sound engi-
neer for RKO, stressed the importance of maintaining a “natural” propor-
tionality between image and sound (Dreher 1929a, Dreher 1931, Miller).
Dreher’s appeal to nature, that is, to the apparently natural relationship
that exists between the picture of a speaking person and the voice associ-
ated with it, no doubt overlooked the extent to which such correspondences
differ from culture to culture and thus must be learned by individuals from
other practitioners within their culture, yet it clearly identified the source
and force of early arguments for some sort of sound/image match.
A second group of technician-theoreticians, headed by J. P. Maxfield,
chief of Western Electric’s west coast distribution wing, Electric Research
Products Incorporated (ERPI), reinforced this appeal to nature by a parallel
argument centering on the human body. Already in 1928, Lewis W.
Physioc had insisted that viewers would not accept a lack of auditory
perspective, because their eye/ear coordination would not allow them to
(Physioc, 24–25). Supporting his own argument that sound scale must
always match image scale, Maxfield insisted repeatedly that the eyes and
ears of a person viewing a real scene in real life must maintain “a fixed
relationship” to one another (Maxfield 1930a).
Reference to the human body as a strategy to circumvent history and
culture reached its height in a short but powerful 1930 article by RCA
sound technician John L. Cass. In order to maintain intelligibility of
dialogue, Cass claimed, more and more studios were resorting to the use
of multiple microphones, with a mixer choosing the best, that is, the most
intelligible, sound. “The resultant blend of sound,” asserted Cass, “may
not be said to represent any given point of audition, but is the sound which
would be heard by a man with five or six very long ears, said ears extending
in various directions” (Cass, 325). In other words, the current practice
resulted in the constitution of a monstrous spectator, of a being neither
found in nature nor worthy of existence. Cass thus decried the way in
which current image-editing practices forced the spectator to “jump from
a distant position to an intermediate position, and from there to close-up
positions on important business,” while the practice of mixing multiple
mikes made the sound “run throughout as though heard from the indefinite
position described above” (325). Cass concluded: “Since it is customary
among humans to attempt to maintain constant the distance between the
eye and the ear, these organs should move together from one point to
another in order to maintain our much mentioned illusion [of reality]”
(325). Demonstrating nothing short of missionary zeal in his attempts to
save the spectator from monstrosity, Cass failed to consider the possibility,
to which we shall return later, that spectators do not remain from age to
age the same, that even the body and its functions are culturally determined,
and that spectators who live long enough with monstrosity learn
to consider it not only beautiful, but even, eventually, normal.
Before the normality of a many-eared spectator could be contemplated,
however, numerous other attempts at codifying methods for assuring an
appropriate image/sound match would appear. The key figure here was
again Maxfield, the most powerful voice within the Bell/Western Electric/
ERPI complex that dominated the Hollywood sound scene until the early
thirties. In a series of articles which reiterated numerous times the same
points, Maxfield hammered home the need to limit most takes to a single
microphone (Maxfield 1929, 1930a, 1930b, 1931). Thus solving the
problem posed by Cass, Maxfield showed how the use of a single mike,
placed near the camera’s line of sight, automatically coordinates sight and
sound by providing a sound record of the characters’ movements toward
and away from the camera. Characters approaching the camera automati-
cally approach the microphone as well, thus matching closer image scale
to closer sound scale; conversely, the character who speaks from the
background demonstrates distant characteristics in sound and image alike.
While the use of a single, stable microphone assures the matching of
sound scale with image scale within a shot, a different set of guidelines
was necessary in order to assure proper matching of succeeding shots.
These guidelines, first proposed by Maxfield in 1931 (in the graphic,
empirical style characteristic of Bell’s scientific pretensions in the early
thirties), and reiterated in 1938, indicate proper microphone placement in
varying image situations (Maxfield 1931, Maxfield 1938). Throughout
the decade, Maxfield had been a pioneer in isolating the aspects of sound
which determine the spectator’s perception and evaluation of specific
sound phenomena. Demonstrating that volume alone is an insufficient
marker of distance, Maxfield had earlier revealed the importance of rever-
beration (or more accurately, the ratio of reflected sound to direct sound)
for determining perception of sound scale. In addition, Maxfield showed
that the focusing capabilities of the listening binaural human subject,
permitting humans selectively to cut out a certain percentage of reflected
sound, have no parallel in the monaural sound collection system of cinema.
A monaural correction factor is thus built into all Maxfield’s careful
determinations.3 Taken as a whole, Maxfield’s enormously influential
writings provided a complete program for matching sound scale to image
scale, within and between shots.

The Reality: Mismatching Image Scale and Sound Scale

With Maxfield’s influential articles reprinted in journals of all sorts
throughout the thirties, referred to by one Bell author after another, and
imitated throughout the industry, one might well assume that Hollywood
practice followed Maxfield’s strictures to the letter. A careful survey of
the period’s sound practices is far from bearing out such an assumption.
Unfortunately, space does not permit full treatment of this important
question. A few examples, along with a rough sketch of the period’s
general penchants, must suffice. As in every period, examples abound of
atypical practices, but here I will stress instead what I take to be the
accepted norm from the late twenties through the Second World War. An
appropriate starting point might be Rouben Mamoulian’s Applause, one
of the few films of 1929 to be universally praised for its revolutionary
approach to the sound track. My purpose here, however, is not to dwell
on Mamoulian’s many innovations, like the subjective use of sound levels
in the opening parade scene, but to show that even Applause neglects the
careful matching of sound scale to image scale.
The first stage scene in Applause, for example, exhibits a sound track
of uniform volume and reverberation characteristics. The sound track’s
uniformity is hardly matched by the image track, however, which reveals
a heavily edited theater scene, combining shots of varying scales and
angles. The “Doctor in the house” routine which calls the scene to a close,
for example, is clearly shot with a single microphone, while two cameras
are churning out images of different scales. Once edited together, the two
simultaneous camera takes produce a scene typical of the period. Perhaps
it is fitting to remark here that the term editing, entirely appropriate for
the images, is less so for the sound, since the sound take used is apparently
continuous and uncut. In fact, it would be perfectly correct to say that the
contemporary practice of using a single microphone system synchronized
to two or three cameras fairly begged early editors to use a continuous
sound track as the bench mark to which they edited the various images.4
For obvious economic and technical reasons, multiple-camera shooting
remained the rule throughout the early sound period. Soon, however, an
important change took place in the type of sound record associated with
the multiple-camera arrangement. The condenser mikes used in the late
twenties required extremely close placement in order to provide a distinct
dialogue record (Hunt, 482). Often, the mike had to be so close to the
speaker that it could not be kept out of the field in a medium shot, thus
resulting in the common practice of handling action in medium shot, while
flashing into close-up for the sound record (thereby avoiding revelation of
the microphone, which would be visible in the medium shot of the same
scene). During this period, where a series of dialogue locations were built
into a single shot, preference was often given to a multiple-microphone
setup, with a mixer choosing the clearest sound record. More intimate
scenes easily accommodated the increasingly widespread choice of single-
miking.

[Figure: Maxfield’s 1938 Graph. Relation between focal length of camera
(in mm) and microphone position, for achieving acoustic perspective.]

In the early thirties, however, new microphones became available;
lighter, more compact, and requiring no amplification stage near the mike,
these new units made the microphone boom far more practical (Altman
1985b, Altman 1986a). Whereas the twenties often used what Dreher
dubbed “prop pick-ups” (Dreher 1929b), microphones which had to be
hidden in a prop in order to get close enough to the speaker to achieve
acceptable sound quality, the thirties adopted the mobile mike, suspended
from a boom which could be moved silently about, always pausing at the
appropriate point to capture a perfect rendition of lines which otherwise
might have turned out garbled or fuzzy. Furthermore, the mobile boom
made it relatively easy to stay out of the camera field while remaining at
proper distance for sound recording.5 In short, the combination of multiple-
camera shooting and single-miking with a mobile boom made for an ideal
combination, for two related reasons. First, the boom simplified the sound
problems inherent in the multiple-camera arrangement, thus preserving an
important economy factor (Hunt, 481–82). Second, the boom changed
radically the character of the sound “in the can.” With a single immobile
mike, such as that championed at the turn of the decade by Maxfield, the
spatial characteristics of the pro-filmic scene were already inscribed on
the sound track. A character receding or turning away from the mike was
recorded with a higher ratio of reflected to direct sound; similarly, the
size of the room had its effect on volume, reverberation, and frequency
characteristics. With the new system, however, the microphone is perpetu-
ally kept within approximately the same distance of the speaker, thus
canceling out nearly all the factors which the earlier system retained.
Coupled with devices for adding reverberation, voice equalization,
effort equalization, and so forth, in which the mid-thirties abound, this
new approach assured Hollywood both the economic benefits and the
requisite control associated with a system permitting the construction of
a sound track rather than the direct recording of already constructed
sounds. Parallel to the many image-treatment processes which permitted
the Hollywood of the thirties to exercise control over the image while
reducing the cost of its production, sound construction processes serve to
enhance the ability of the boomed mike to provide a clean, clear, continuous
sound record, oblivious to image scale but attuned to dialogue intelligibility,
story continuity, and freedom of action.6
Perhaps most telling of all is the 1938 article in which Maxfield reiterated
his strictures regarding microphone placement. Still insisting on a
careful matching of image and sound scale, Maxfield again explained his
chart providing proper microphone placement, but this time his instruc-
tions were interspersed with remarks reflecting years of experience watch-
ing technicians use his chart. These remarks reveal a fascinating tendency:

It has been the author’s experience, and that of some of the microphone
men with whom they have discussed the problem, that unless some
such guide is used there is a tendency to set the close-up takes correctly
and to make the microphone positions for the long-shot and semi-long-
shot takes decidedly too close. The use of the curve, of course, helps
to keep the judgment of the operator calibrated. (Maxfield 1938, 672)

Now, Maxfield’s original microphone distance chart was based on so-called
empirical data, generated by the experience of the first three years
of sound film. In 1931, Maxfield explained that his data came from his
own records of “several pictures with which the writer was associated”
(Maxfield 1931, 74). In the seven or eight years since Maxfield’s data
were first collected, however, something had evidently changed. Whereas
the 1931 chart was derived from actual experience, producing the straight-
line function presented in both charts, the 1938 article clearly admitted
that the chart must be used to control and rectify the intuitions of techni-
cians, who without such a guide would always tend to set microphones at
a distance producing close-up sound quality.
In other words, the “gut reaction” of sound technicians had changed
over the course of the decade. Their intuition in 1938 clearly reflected the
changing practice of the thirties; having internalized a new standard, the
technicians no longer sought to match sound scale to image scale through
“correct” microphone placement, but instead sought to produce a continu-
ous sound track of nearly level volume and unbroken close-up characteris-
tics. Throughout the thirties it was for the clarity of their sound tracks that
sound technicians had been praised and rewarded, rather than for their
spatial realism.
What once appeared as monstrosity had now become the norm. In fact,
Maxfield himself recognized the extent to which careful scale-matching
had disappeared. “There are occasions,” he admitted, “when it is necessary
to use several cameras on the same scene simultaneously. Where acoustic
perspective has no dramatic importance, a single close-up track can be
used for all the picture takes, the sound being dubbed at slightly lower
level for the long-shot scenes” (672). This capitulation by the champion
of scale-matching coincides exactly with numerous contemporary remarks
by top sound men. To quote but one, stereo pioneer W. H. Offenhauser
stated categorically that “it is our practice . . . to record all our sound
with the microphone placed close to the sound source” (Offenhauser,
146).
Why this striking change? Sound was not yet in its teens and already
sound technicians had reversed their position about sound space, not only
in theory but also in practice. It is no exaggeration to claim that this
reversal represented a fundamental turnabout in human perception. We
often give lip service to the notion that cinema teaches us to see and to
hear, that the media determine our very notion of reality. Yet we are rarely
privileged to isolate the moment when and the process whereby our
perception changes.
During the early years of sound cinema, theoreticians regularly insisted
that sound be treated according to the model provided by the human body.
While these appeals to nature carry strong rhetorical value, they frequently
disguise other, more important models. According to the theory elaborated
by Maxfield and followed by many early sound men, the sound track must
carry, independently from the image, all the information necessary to
reconstruct the “real” space of the scene (that is, the one represented by
the image). In this approach we easily recognize the technique of a
representational system with a decade’s experience in creating sound
space. Maxfield and his colleagues may have stressed nature and the body,
but their method owed more to radio technique than to the fixed distance
between the eyes and the ears. For where had Hollywood found its sound
technicians? By far the majority, like Carl Dreher, had come from the
radio studios. The early years of sound cinema were thus heavily marked
by the version of reality offered by other modes of representation—first
silent cinema, then radio.
“The real can never be represented; representation alone can be repre-
sented.” Up to now, this theory would appear confirmed: the audio tech-
nique of early sound cinema refers to other systems of representation. Not
only silent cinema for the location of loud speakers and radio for sound
perspective, but other models as well. Due to space limitations I will
outline only one of these here: the acoustics of large public spaces (church,
palace, concert hall), with their continuous reverberation and long decay
time. In spite of the importance of limiting reverberation in order to assure
dialogue intelligibility, Hollywood sound technicians regularly insisted
on reproducing music with a high degree of reverb, corresponding to the
large reverberant halls in which we are accustomed to hearing nineteenth-
century “serious” music played.
Silent cinema, radio, the concert hall: can history be built on the simple
notion that each new representational system derives its initial task from
a previous system? Certainly not, for while such a theory helps us to
understand the early logic of a new representational technology, it fails to
explain eventual modifications in the representational use of that technol-
ogy. In order to understand the alterations in Maxfield’s claims from 1931
to 1938, we need to be able to explain how the construction of sound
tracks changed over that period.
A few examples should help us to understand these changes. The
opening scene from Paramount’s 1939 Union Pacific provides a particu-
larly representative case, directed as it is by one of the period’s most
conservative and exemplary directors, Cecil B. DeMille. This debate on
the floor of the U.S. Senate exhibits a sound track of uniform volume and
reverberation characteristics, with differences in sound level attributable
to the senators’ rhetoric or intensity rather than to any technical considera-
tions. Yet this uniform sound track is matched to a heavily edited image
track, revealing shots of radically differing scales. In particular, one
camera movement stands out for its clear demonstration of the lack of
concern for sound/image scale-matching: during one speech, the camera
tracks constantly back, reducing the scale of the actor, while the senator’s
speech level remains unchanged, with neither volume nor reverb varying
in the slightest. We note, therefore, that the choice of image to accompany
any particular sound is in no case dependent on the spatial characteristics
of the sound. Instead, the choice depends entirely on the narrative charac-
teristics of the sound. The sound track remains uniform throughout, dis-
playing medium close-up characteristics from beginning to end. The image
changes scale repeatedly, however, matching the dramatic effect of the
words uttered. The constant-level sound track thus serves to anchor a
pasted-up, discontinuous image sequence which remains obedient to narra-
tive concerns.
A second example comes from the same year’s Only Angels Have
Wings, made for Columbia by Howard Hawks, another filmmaker known
for thematic rather than technical innovations, and who might thus be
expected to reflect the industry’s standard pre-war practice.7 At the end
of the opening sequence, Jean Arthur is invited by Noah Beery and a
fellow pilot to have a drink. Their conversation is interrupted by the arrival
of the restaurant owner. During the ensuing scene there is a cut-in to a
medium close-up of the owner, without any accompanying change in
sound level. In the exterior conversation that follows, we witness another
cut which eloquently testifies to the period’s presuppositions about the
sound/image match. In the Applause and Union Pacific scenes, image cuts
always occur between speeches or during pauses, like punctuation in a
paragraph. While no change in sound level or characteristics is noticeable,
the practice clearly sets up a rhythmic correspondence between sound and
image, one which might just as well have been used to establish a scale
match between sound and image.
In the exterior conversation from Only Angels Have Wings, however,
the cut is made right smack in the middle of a phrase. Something different
is going on here. Far from matching sound scale to image scale—the
dream of technicians and theoreticians alike in the early thirties—Hawks
uses the uniformity and continuity of the medium-close-up sound track to
cover over a cut. Now, this technique obviously assumes a system in
which no match between sound scale and image is sought. Whereas Union
Pacific’s practice of making image cuts in the silences between phrases
could have attenuated the effect of a change in sound level, the cut during
a speech in Only Angels Have Wings would create a naked juxtaposition
of the two levels if there were to be a match in scale, thus revealing the
processes of image editing and sound mixing alike, thereby foregrounding
an apparatus which Hollywood would rather hide. That cutting during
dialogue has become routine by the late thirties reveals the extent to
which the uniform sound track has become the rule, unmatched to and
independent of the image.8
A second example from the same film further illustrates this fact. When
Noah Beery flies off to his death in the following scene, numerous shots
of the plane accompany a homogeneous, uniform-level sound track of the
plane’s engine noise. With one exception, the sound level remains the
same, whether we see the plane in long shot or just the pilot in medium
shot. As the camera closes in on the plane, no change in sound ties the
sound track to the image scale. Only when an internal auditor is implied
does the sound scale match the image, a situation which occurs when Cary
Grant and Jean Arthur listen to the plane disappear down the far end of
the runway, the fading of the motor sound replicating its growing distance
from the listeners. (More on this special situation in my final section.)
That the practices illustrated by Union Pacific and Only Angels Have
Wings represent the industry standard and continue through the forties is
clearly demonstrated by the authoritative comments of one of Hollywood’s
most knowledgeable and influential sound men, John G. Frayne:

To insure high intelligibility in a sound-stage pickup it is customary
practice to place the microphone as close to the actor as possible, the
distance usually being limited only by the camera angle of the scene.
. . . no attempt is ordinarily made in practice to try to obtain the same
acoustic as visual perspective of the scene. . . . “Panning” of the
microphone by the “boom” man is an accepted technique in production
recording and is omitted only if the physical location of the actors
makes it impossible for the boom man to keep up with the action.
Panning from one actor to another or following the movements of an
actor tends to keep a constant relationship between microphone and
speaker with respect both to distance and to orientation of the sound
pressure axis. Thus a constant-frequency characteristic is preserved,
and a change in tonal inflection of the actor as he moves around in the
sound field is avoided. . . . (Frayne, 52–53)

No longer is there any question of matching sound scale to image scale,
unless it is to show how any indication of sound scale can be avoided.
We have moved a long way from the repeated demands for scale-matching
during sound’s early years.

Cinema’s Bifurcated Subject: Seeing/Hearing

Why did early technicians’ calls for scale-matching fall, as it were, on
deaf ears? Or, rather, why is it that early thirties proponents of auditory
perspective had by the end of the decade abandoned their dedication to
the creation of sound/image proportionality? What factors determined the
sound level practices which dominate Hollywood’s studio years? Two
related considerations come immediately to mind. A third is perhaps less
obvious.
Whether expressed in terms of “story continuity” (Mueller), the “busi-
ness of the play” (Dreher) or the creation of a persuasive illusion (DeFor-
est, Cass, Maxfield, Dreher), the criterion of intelligibility of dialogue
retained its primary importance throughout the period under study (Altman
1985b, Altman 1986a). For Harold B. Franklin, speaking in 1930 as head
of Fox West Coast Theaters for exhibitors everywhere, sound cinema’s
greatest advantage was the “ability to present every word so clearly and
distinctly that no one need strain to hear what is being said, at least when
recording and reproducing is properly conducted. A whisper is clearly
audible from the front row in the orchestra to the last row in the balcony”
(Franklin, 302). Franklin thus echoed one of the creators of the medium,
Lee DeForest, for whom “one of the great advantages of Phonofilm is
that, in common with the ‘Public Address’ system, the voice of the screen
image is far more distinct and clear in the far reaches of the house
and gallery than would be the normal human voice of a speaker on
the stage.”9 For DeForest, cinema was thus an improved megaphone, a
mechanical aide for the hoarse actor and the carnival barker.
That the ideal of intelligibility might contradict the desire for a faithful
matching of sound scale to image scale occurred to many early theoreti-
cians of sound editing. While Bell’s Maxfield tried to dodge the problem,
however, by blithely asserting that proper scale-matching produces intelli-
gibility (Maxfield 1931, 71), RCA’s Dreher openly faced the problem,
recognizing that there are two potentially contradictory requirements of
good recording: “(1) intelligibility of dialog” and “(2) naturalness, or
acoustic fidelity to the original rendition,” within which category, he
included the need to retain the spatial characteristics of the original, and
thus a sound/image match.10 Influencing sound technology throughout
the thirties (especially the development of the microphone boom, sound
collector, and directional mikes), the ideal of intelligibility remained a
central factor throughout the period. Indeed, as I have written elsewhere,
this insistence on intelligibility at the expense of fidelity to the pro-filmic
situation suggests that the referent of Hollywood sound is not the pro-
filmic scene at all, but a narrative constructed as it were “behind” that
scene, a narrative that authorizes and engenders the scene, and of which
the scene itself is only one more signifier (Altman 1985b, Altman 1986a).
A second, related, consideration regards changing standards of reality
during the early years of sound cinema. In the late twenties and early
thirties, as we have seen, the reality standard constantly held up to the
cinema sound track was daily life in the real world. For inventors and
engineers like DeForest, Miller, and Dreher, sound cinema would succeed
only if it was “natural;” other theoretically inclined technicians, like
Physioc, Maxfield, and Cass, stressed instead the natural coordination of
the eyes and ears within the overall system of the human body. In calling
for a careful matching of sound scale to image scale, early theoreticians
clearly assumed that sound cinema needed to match a reality code derived
from daily life, where small-scale people—distant individuals—have
small-scale voices, and close-up people have close-up voices. Competing
with this daily life model, expressed in terms of scale-matching, there ran
throughout the decade another model, this one expressed in terms of
intelligibility.
We find here once again the familiar opposition of intelligibility to
naturalness (or acoustic fidelity), but it was rewritten in terms of differing
codes of reality. On the one side, daily life, on the other the medium that
taught the audiences of the twenties and thirties to expect visual narrative
to provide intelligible dialogue. I speak of course of the theater, that old
enemy of “pure cinema,” back to haunt the faithful once again. For if
sound cinema continued to practice intelligibility in spite of repeated
appeals for acoustic fidelity, it was because cinema continued to find in
the theater a long-consecrated code of reality applicable to audiovisual
narratives. Not even the naturalist theories of an Antoine could radically
alter the theater’s commitment to understandable dialogue, achieved
through such devices as the stage whisper, playing toward the audience,
and the declamatory style (which would be replaced only when sound
cinema’s superior ability to assure intelligibility led theater to borrow the
cinema’s technological means for amplifying dialogue11). To call for
intelligibility in the language of the thirties’ cinema technicians is thus to
call for adherence to the theater as code of reality. With the theater of the
period itself stressing textual comprehension more than ever,12 it is hardly
surprising that Hollywood felt the need to follow suit, abandoning the
image/sound match in favor of intelligibility, the everyday life model in
favor of a code of reality provided by the theater.
A third consideration involves nothing less than the subject placement
implied by the dominant sound model adopted by Hollywood during the
thirties. In order to elucidate this process, I must at this point expand on
the model described earlier. At one point during the scenes I have de-
scribed, an impression of auditory perspective is created by a change in
volume and reverberation levels. In Only Angels Have Wings, when Noah
Beery’s plane disappears into the night at the end of the runway, the next
shot, a two-shot of Cary Grant and Jean Arthur, is matched to the dwindling
sound of an airplane motor in the distance. Situations like these,
which have mistakenly caused some critics to see auditory perspective as
a common aspect of thirties’ sound tracks, are indeed common throughout
the Hollywood studio years, but they must not be confused with the scale-
matching discussed earlier.13 In both these cases we are dealing with what
might be called “point-of-audition” sound, a clumsy term whose only
merit is to recall unfailingly the “point-of-view” shot.
Frequently used to establish spatial relationships among neighboring
spaces which cannot be presented visually in a single master shot, point-
of-audition sound is identified by its volume, reverb level, and other
characteristics as representing sound as it would be heard from a point
within the diegesis, normally by a specific character or characters. In other
words, point-of-audition sound always carries signs of its own fictional
audition. As such, point-of-audition sound always has the effect of luring
the listener into the diegesis not at the point of enunciation of the sound,
but at the point of its audition. Point-of-audition sound thus relates us to
the narrative not as external auditors, identified with the camera and its
position (such as would have been the case with Maxfield’s acoustic
perspective), nor as participant in the dialogue (the standard situation of the
“intelligible” approach), but as internal auditor. When, in 1938, Maxfield
alluded to the potential “dramatic importance” of auditory perspective, he
was referring to those situations where the auditory perspective involved
is that of a character. Whereas in 1931 he was wholly concerned with the
perspective of the external auditor, by 1938 Maxfield—and Hollywood
as a whole—showed increased interest in the internal auditor.14
We are asked not to hear, but to identify with someone who will hear
for us. Instead of giving us the freedom to move about the film’s space
at will, this technique locates us in a very specific place—the body of the
character who hears for us. Point-of-audition sound thus constitutes the
perfect interpellation, for it inserts us into the narrative at the very intersec-
tion of two spaces which the image alone is incapable of linking, thus
giving us the sensation of controlling the relationship between those
spaces.
What is it then that is happening during those numerous moments,
exemplified by the long legislative scene from Union Pacific, where no
such identification is called for, where we find a sound track of uniform
level with no spatial characteristics? First, something important is clearly
not happening here: the auditor is at no point made aware of the sound
track as sound track by the radical changes in volume which would
have to accompany a careful sound/image match. This initial negative
consideration has numerous ramifications, not the least of which is the
dissimulation of the sound apparatus itself. The construction of a uniform-
level sound track, eschewing any attempt at matching sound scale to image
scale, thus takes its place alongside the thirties’ numerous invisible image-
editing devices within the overall strategy of hiding the apparatus itself,
thus separating the spectator from the reality of the representational situa-
tion, thereby making that spectator more available for reaction to the
subject-placement cues provided by the fiction and its vehicle.
Just as the lack of sudden changes in sound level duplicates the self-
effacing effect of contemporary image-editing, so the reverberation char-
acteristics of standard practice sound place the auditor in a manner quite
similar to familiar spectator-placement methods. According to the familiar
subject-placement arguments advanced by Pleynet, Baudry, and Comolli
apropos of the perspective image, we spectators are built into the picture
as source and consumer. Perspective images are always made for us; they
present a sumptuous banquet with an empty chair awaiting the honored
spectator-guest. Now, in order to achieve the continuous close-up sound
quality characteristic of Hollywood’s standard practice, the microphone
must be brought quite close to the speaker, cutting out unwanted set noises
while—and this is the important concern for the present argument—also
radically reducing the level of reverberation.
But what is sound without reverberation? On the one hand, to be sure,
it is close-up sound, sound spoken by someone close to me, but it is also
sound spoken toward me rather than away from me. Sound with low
reverb is sound that I am meant to hear, sound that is pronounced for me.
Like the perspective image, therefore, the continuous-level, low-reverb
sound track comforts the audience with the notion that the banquet is
indeed meant for them. The choice of reverbless sound thus appears to
justify an otherwise suspect urge toward eavesdropping, for it identifies
the sound we want to hear as sound that is made for us. While the image
is carefully avoiding signs of discursivity in order better to disguise
Hollywood’s underlying discourse, the sound track overtly adopts the
discursive approach of low-reverb sound in order better to draw us into a
fabricated narrative.
Hollywood cinema thus established, in the course of the thirties, a
careful balance between a “forbidden” image, which we watch as voyeurs,
and “sanctioned” dialogue, which appears to be addressed directly to the
audience. In terms of movement, a similar complementarity was achieved.
The image displaces us incessantly, offering us diverse angles on objects
located at radically different distances. Our voyeurism consists precisely
in this mobility. Yet we flit about at our own peril, constantly risking
dizziness. Just as we are about to lose our balance, however, the sound
track holds out its hand, offering continuity of scale as an effective
stabilizer. Indeed, if we take the risk of flying about at all, it is certainly
in large part because we know that our bodies are anchored by sound, and
by the single, continuous experience that it offers. It is thus the sound
track that provides a base for visual identification, that authorizes vision
and makes it possible. The identity of Hollywood spectators begins with
their ability to be auditors.
While cinema’s perspective image carries a built-in spectator spot, an
interpellate position ready-made for the theatrical spectator, the varied-
scale editing practices developed during the silent period move the specta-
tor around at a dizzying pace. Far from inheriting a single place, the
spectator must fight to integrate the multiple positions allotted by the film
into a single unified home. While this wanderlust is partially cured by a
learned, and thus historically grounded, ability to insert shots of various
scales into a coherent Gestalt of filmic space, it is only with the aid of a
continuous-level sound track that the spectator finds a comfortable home.
By holding the auditor at a fixed and thus stable distance from all sound
sources (except those treated, for previously discussed reasons, through
a point-of-audition approach), Hollywood uses the sound track to anchor
the body to a single continuous experience. Along with the narrow dynamic
range allowed for background music, this process serves to constitute
more completely the spectator’s unconscious self-identity as auditor,
thus providing a satisfying and comfortable base from which the eyes can
go flitting about, voyeuristically, satisfying our visual desires without
compromising our unity and fixity.

A New Model of Technological History

To summarize, we may say that the development of a stable Hollywood
audio/visual representational system is best understood according to a
tripartite historical model:
1. Multiple identities derived from pre-existing reality codes. In its early
years, sound cinema endeavored to be all things to all people. It
offered the amplification of public address systems, the performers of
vaudeville, the diversity of radio programs, the music of “silent” cin-
ema, the acoustics of the concert hall, and the dialogue consciousness
of theater. Never imitating all of these reality codes at once, sound
cinema nevertheless gained much of its identity from a clear ability to
serve purposes and offer experiences defined by pre-existing representa-
tional systems. This period witnessed varied attempts, typically mod-
eled on pre-existing representational systems, to represent space
through sound.

2. Jurisdictional struggle. For a number of years after Hollywood’s conversion
to sound, cinema remained the site of an ongoing jurisdictional
struggle. To which union would sound projectionists and engineers
belong? Where would sound technicians fit in the studio structure? By
reference to which model would differences of approach be adjudi-
cated? While many solutions were reached through industry-wide com-
promise (such as the adoption of the intelligibility-oriented theatrical
reality code outlined above), others required open warfare among Hol-
lywood personnel (such as the late twenties/early thirties battle between
sound and image technicians15) or the deployment of new apparatus
(such as the directional microphones developed by RCA in the early
thirties16). In some arenas, this jurisdictional struggle was over by the
end of the twenties (for example, the decision to base music volume
on “objective” stable radio or silent film orchestra levels rather than on
the volatile and “subjective” levels of parade music or marching bands),
while in others it dragged out well into the thirties (for example, the
differentiation between the concert hall as the acoustic model for high
music reverb levels and the drawing room as the basis for reduced
dialogue reverb—even when both types of sound are represented by
the image as produced in the same space). During this period, dialogue
volume was commonly modeled on theatrical intelligibility, with pri-
mary attention devoted not to space but to speech and to the narrative
content of which speech is the primary vehicle.

3. Development of new reality codes based on technological specificity.
Laboring hard to emulate diverse already-existing representational sys-
tems, Hollywood directors and sound technicians discovered only very
slowly the special capabilities of cinema sound technology. While
point-of-audition sound appeared very early in the history of sound
cinema, it did not become integrated into a general system of representation
until most of Hollywood’s jurisdictional problems had been solved.
Indeed, curiously, it never received the kind of careful and extended
theoretical discussion devoted during the late twenties and early thirties
to the realism versus intelligibility debate. Nevertheless, the mid-thirties
were an important crucible for the development of a new audio-visual
identity for cinema subjects, dependent both on homogeneous
dialogue levels (authorizing, as we have seen, the classic symbiosis of
forbidden image and sanctioned sound, as well as the comfortable fit
of varying image scale and stable sound scale), and on point-of-audition
sound (enforcing sound-based identification with specific characters).
Together, the intelligibility system (borrowed early on from theatrical
and telephonic precedents) and the point-of-audition system (elaborated
during the thirties out of cinema-specific possibilities) constituted a new
mode of cinematic unity and a new subject position for the Hollywood
audience.
In order to understand Hollywood’s conversion to sound we must grasp
the many ways in which Hollywood attempted to model cinema sound on
other existing uses of sound. If we want to understand Hollywood’s
standard representational system, however, we must do more than this.
We must reckon, as I have tried to do in this essay, with Hollywood’s
jurisdictional struggles and with the new sound structures developed by
Hollywood during the course of the thirties.
Theorists and historians have always concentrated heavily on image
space in their attempts to define Hollywood classical narrative. Perhaps
the arguments presented in this article will make it impossible in the
future to discuss Hollywood’s standard mode of representation without
appropriate consideration of sound space.
