Sound of Silence Mark Liberman
perception a)
Michael F. Dorman, Lawrence J. Raphael, and Alvin M. Liberman
Haskins Laboratories, 270 Crown Street, New Haven, Connecticut 06510
(Received 1 August 1978; revised 5 February 1979)
1518 J. Acoust. Soc. Am. 65(6), June 1979 0001-4966/79/061518-15$00.80 © 1979 Acoustical Society of America
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 137.189.170.231 On: Sat, 20 Dec 2014 19:13:10
FIG. 1. Schematic representation of stimulus patterns sufficient for the perception of [sa], [ta], and [sta]. Adapted from Liberman and Pisoni, 1977. (Schematic spectrograms, frequency against time: fricative noise, silent interval, and a vocalic section with Formant 1 and Formant 2.)
we store the noise, but move it backward in time so as to leave a brief (say 50 ms) interval of silence between it and the vocalic portion of the syllable, we produce a syllable that sounds like [sta] (Bastian, 1962). At one level of interpretation there is no mystery in this: the fricative [s] and the stop [t] have similar places of production, hence similar formant transitions. But it is not so clear why silence is necessary in order for the transition cues to give rise to the perception of a stop--that is, why a stop is not heard when fricative noise and formant transitions are separated by only a brief interval.

Broadly speaking, two interpretations are possible. The one we are inclined to favor is that the silence provides information to a (phonetic) perceiving device that is specialized to make appropriate use of it. To see why that is at least plausible, consider that a speaker cannot produce a stop without closing his vocal tract, and that he cannot close his vocal tract without producing a corresponding period of silence. When the listener hears an insufficiently long period of silence between the fricative noise and the vocalic section, it is, by this account, as if he "knew" that a stop should not be perceived because it was not produced.

An alternative interpretation puts the effect of the silence cue squarely in the auditory domain. Thus, we note about the example just offered, that it conforms to the paradigm for auditory forward masking. Conceivably, the fricative noise masks the transition cues that otherwise would be sufficient for the stops; in that case, the role of silence would be to provide time to evade masking. Or, keeping the interpretation still in the auditory domain, we might suppose that the silence collaborates in some kind of perceptual interaction with the transition cues, the result of the interaction being that experience we call a stop.

Some evidence relevant to these interpretations is already available. Harris (1958), for example, found recognition of the [f]-[θ] contrast to be contingent primarily on the formant transitions that follow the fricative noise. This situation could only arise if the formant transitions had different effects in the auditory domain--that is, if they were not masked by the preceding noise. Evidence from dichotic listening supports this conclusion. Thus, Darwin (1971) found a larger right-ear advantage for fricatives synthesized with appropriate formant transitions following the fricative noise, than for fricatives synthesized without formant transitions. In this instance, too, the transitions must have had different auditory representations when they arrived at the central processing mechanisms responsible for the ear advantage.

Another piece of relevant evidence comes from a study of selective adaptation. Following a now standard adaptation procedure, Ganong (1975) first measured the displacement of the [bɛ-dɛ] boundary caused by adaptation with [dɛ]. Fricative noise was then placed in front of the [dɛ], and the (perceived) [sɛ] that resulted was used as the adapting stimulus. The outcome was a shift in the [bɛ-dɛ] boundary as large as that found when the adapting stimulus was [dɛ]. Patterns that contained the noise, but not the formant transitions, did not produce so large a shift. This indicates not only that the transition cues were getting through, but that they were getting through in full strength.

Thus, we are led to believe that the transition cues make a significant perceptual contribution, whether or not they are preceded by a period of silence. On that view, silence is important, not because it provides time to evade masking, or because it collaborates in an auditory interaction, but because it provides information that is essential to determining how the transitions are to be interpreted in phonetic perception. The experiments in this section are designed to get at that matter via a different--perhaps more direct--route by comparing the effect of the fricative noise on transition cues that are, in one case, in a speech context, and in the other, not. The results will bear, of course, on a masking interpretation, but also on the possibility of auditory interactions, since we will be able to determine whether or not there are qualitative changes in the perception of the nonspeech transition cues depending on the presence or absence of the silence.

A. Experiment 1

Our first experiment was designed (1) to assess the role of silence in the perception of stop manner prevocalically in the syllables [ʃpɛ] and [ʃkɛ], and (2) to determine whether the fricative noise of [ʃ] masks or interacts with information carried on the transition cues for the stops when those are isolated from the rest of the syllable and are heard as nonspeech.
1. Method

Two sets of stimuli were made. Members of the one--to be referred to as the "speech" stimuli--were appropriate for determining the effect of silence on the perception of the stop consonants in [ʃpɛ] and [ʃkɛ]. They were made in the following way. First, the syllables [ʃɛ], [gɛ], and [bɛ] were recorded by a male speaker, then digitized and stored, using the Pulse Code Modulation (PCM) system at Haskins Laboratories. Working from high-resolution oscillograms, and taking advantage of computer control, we next separated the fricative noise of the [ʃ] from the vocalic portion of the syllable [ʃɛ], and removed the syllable-initial bursts from the [gɛ] and [bɛ]. To create the experimental stimuli, we prefixed the [ʃ] noise to what remained of the [bɛ] and [gɛ], leaving silent intervals of 0, 4, 8, 12, 16, 20, 40, 60, 80, and 100 ms between the offset of the fricative noise and the vocalic section appropriate for [gɛ] and [bɛ] [see Fig. 2(a) for a schematic representation of one of the [ʃ] noise plus [gɛ] stimuli]. Four tokens of each stimulus type were produced. These were randomized and recorded on magnetic tape with a 3-s interval between stimuli.

Members of the other set--to be referred to as the "nonspeech" stimuli--were intended to enable us to measure the extent to which the transition cues that distinguish the stops in [ʃpɛ] and [ʃkɛ] are themselves masked by the [ʃ] noise. These stimuli were made in the following way. First, the [bɛ] and [gɛ] patterns of the speech set were bandpass filtered between 0.9 and 3.5 kHz, and truncated so as to include only the first 50 ms of the signal. This procedure eliminated the first formant, producing signals that contained only the second- and third-formant transitions. (Listeners could hear these stimuli as "chirps," and we supposed that with only a few minutes of practice they would be able to identify them by pitch as "low" or "high.") Then, to create a test of the identifiability of these transitions for comparison with the condition in which they were the essential cues for place of articulation, we prefixed the [ʃ] noise, setting the same intervals of silence between it and the chirps that we had used in creating the "speech" stimuli. [See Fig. 2(b) for a schematic representation of the "chirp" stimulus derived from the "speech" stimulus shown in Fig. 2(a).] The resulting signals were randomized and recorded on magnetic tape with a 3-s interval between stimuli.

The subjects were nine volunteers, all undergraduates at Lehman College, who had not previously served in experiments on speech perception. Divided into groups of five and four, they listened in a sound-attenuated room, first to the speech stimuli, and then in a second session, to the "nonspeech" stimuli. In the speech condition, the listeners were told they would hear approximations of the syllables [ʃpɛ], [ʃkɛ], and [ʃɛ], and were asked to indicate on a printed response sheet what they had heard. To provide some "practice," we presented twenty of the stimuli before the experiment proper began; no information was given about the "correctness" of the responses.

In the "nonspeech" condition, the subjects were told they would hear tokens of three stimulus types: [ʃ] noise alone, [ʃ] noise followed by a low-pitched chirp (which they were to call "low"), or [ʃ] noise followed by a high-pitched chirp (which they were to call "high"). They were asked to indicate on their response sheets what they had heard. In this condition, the "practice" consisted of presenting 50 of the stimuli. In order to make sure that the subjects did, in fact, learn to identify the chirps, we provided knowledge of results. To preclude biasing the experimental outcome by experience during the practice sessions, we avoided all short silent intervals--in which the chirps might or might not be heard--presenting only those stimuli in which the noise preceded the chirps by 100 ms. During the experimental session, no information about "correct" responses was given.

In both "speech" and "nonspeech" conditions the stimuli were reproduced via a Revox 1240 tape recorder and AR-4x loudspeaker.

FIG. 2. (a) Schematic representation of one of the speech patterns used in experiment 1. (b) Schematic representation of the corresponding nonspeech ("chirp") pattern. (Axes: frequency against time.)

FIG. 3. Silence as a necessary condition for stop manner; identification of stimulus patterns as [ʃpɛ]-[ʃkɛ] or [ʃɛ]. (Horizontal axis: silent interval.)

2. Results and discussion

The results for the speech condition are shown in Fig. 3. Since the identification functions for [ʃpɛ] and [ʃkɛ] were found on preliminary examination to have similar shapes, we have averaged them; this facilitates comparison with the identification function for [ʃɛ]. We see that when the silent interval was less than 20 ms, listeners reported hearing [ʃɛ]--that is to say, they did not hear a stop. The stops were identified with 75% accuracy only when the silent interval exceeded about 40 ms. Thus, we find silence to be an important condition for the perception of stops in fricative-stop-vowel syllables.
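
The construction of the "speech" stimuli in the Method of experiment 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Haskins PCM procedure itself: the function names, the labels `"be"` and `"ge"`, the sample rate, and the representation of a waveform as a plain list of samples are all hypothetical. Only the ten silent intervals and the four tokens per stimulus type are taken from the text.

```python
import random

# Silent intervals (ms) and token count as listed in the Method of experiment 1.
SILENT_INTERVALS_MS = [0, 4, 8, 12, 16, 20, 40, 60, 80, 100]
TOKENS_PER_STIMULUS = 4
# Hypothetical labels for the burstless vocalic sections taken from [bɛ] and [gɛ].
VOCALIC_SOURCES = ["be", "ge"]


def splice(noise, vocalic, silence_ms, sample_rate_hz):
    """Return fricative noise, then digital silence, then the vocalic section.

    Waveforms are plain lists of samples; sample_rate_hz is an assumption,
    since the text does not give the sampling rate of the PCM system.
    """
    n_silent = round(silence_ms * sample_rate_hz / 1000)
    return list(noise) + [0] * n_silent + list(vocalic)


def build_trial_list(seed=0):
    """Return a randomized list of (vocalic_source, silence_ms) trials."""
    trials = [
        (source, interval)
        for source in VOCALIC_SOURCES
        for interval in SILENT_INTERVALS_MS
        for _ in range(TOKENS_PER_STIMULUS)
    ]
    random.Random(seed).shuffle(trials)  # fixed seed only to make the order repeatable
    return trials
```

Under these assumptions the noise-plus-vocalic stimuli amount to 2 sources x 10 intervals x 4 tokens = 80 trials; whether the test tape held further items is not stated in the text.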
The identification functions shown in Fig. 3 were derived from the responses of seven of the nine subjects. The two other subjects identified the [ʃ] noise plus [gɛ] stimuli in the same manner as the group of seven, but made a total of only one [ʃɛ] response to the [ʃ] noise plus [bɛ] stimuli. To account for that we
the kinds of patterns we used, the more sensitive to variations in the duration of intersyllabic silence.

As in the experiments with prevocalic stops, we thought it useful to provide data relevant to the possibility that the outcome is to be accounted for in terms of masking--backward masking in the case of the postvocalic stops--or auditory interaction. To that end, we determined whether silence is also necessary for the perception of the formant transitions that are sufficient to distinguish the syllable-final stops when those transitions are presented in isolation, and sound like chirps.

In the other experiment (2b), the stimuli were natural speech, not synthetic, and they included not only [bɛbdɛ] and [bɛgdɛ] but also the geminate condition [bɛddɛ].² The use of natural speech will permit a comparison with the results obtained when the stimuli were synthetic. The point of testing the geminate condition is that, in production, the articulatory closure for the geminate stops is longer than that for single stops, and a study by Pickett and Decker (1960) leads us to suspect that the amount of silence necessary for perception may also be longer. A comparison of the two cases of syllable-final stops seemed, therefore, to be in order.

1. Method

To produce stimuli for experiment 2a--the one with synthetic stimuli--we used the Haskins Laboratories parallel-resonance synthesizer to generate two-formant patterns appropriate for the disyllables [bɛbdɛ] and [bɛgdɛ]. A schematic representation of [bɛbdɛ] is shown in Fig. 5. That disyllable differed from the other one [bɛgdɛ] in the second-formant transition, the sole cue in these patterns for the perceived distinction between the syllable-final stops: for [b] the transition is falling, as shown in the figure, while for [g] it is rising. We then introduced periods of silence between the second syllable [dɛ] and the first

"speech" condition.

To produce the corresponding stimuli for the "nonspeech" condition, we simply isolated the second-formant transitions that alone distinguished the [bɛb] and [bɛg] patterns of the "speech" stimuli (falling for [b], rising for [g]), and then produced stimuli that were otherwise identical with those of the "speech" condition--that is, we placed after the isolated transitions the same synthetic [dɛ] that had been used in the "speech" condition, and introduced between it and the transitions the same intervals of silence.

The subjects for experiment 2a were six undergraduates at Lehman College who had previously participated in experiments on speech perception. They were tested individually. Test order ("speech" versus "nonspeech") was counterbalanced across subjects. In the "speech" condition, the subjects were asked to respond [bɛbdɛ], [bɛgdɛ], or [bɛdɛ], and to write their responses. To familiarize the subjects with the stimuli, we had them listen to twenty of the patterns before the experiment began. The stimuli were reproduced on a Revox 1240 tape recorder via TDH 39 headphones.

In the "nonspeech" condition, the subjects were told they would hear a high-pitched chirp followed by [dɛ], a low-pitched chirp followed by [dɛ], or [dɛ] alone. They were asked to respond accordingly. To teach the subjects to identify the chirps, and to make sure they could reliably do so, we first presented 50 [b] and [g] chirps in random order with feedback of results. Then we presented, also in random order, twenty-five [b] and [g] chirps followed in each case, after a 120-ms interval, by [dɛ]. Again, subjects were told the correct answers after they had made their responses. The point of using only the 120-ms interval was to avoid biasing the results by providing "correct" responses in those cases where the [dɛ] syllable was sufficiently close that "masking" might conceivably have occurred. Finally, the test proper was begun.
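
The counterbalancing of test order mentioned in the Method of experiment 2a can be illustrated with a small Python sketch. The text does not say how subjects were assigned to orders; simple alternation is an assumption made here for illustration, and the function name and subject labels are hypothetical.

```python
CONDITIONS = ("speech", "nonspeech")


def counterbalance(subject_ids):
    """Assign each subject a session order, alternating which condition comes first.

    Alternation is an assumed scheme; the paper states only that test order
    was counterbalanced across subjects.
    """
    orders = {}
    for index, subject in enumerate(subject_ids):
        first = CONDITIONS[index % 2]
        second = CONDITIONS[(index + 1) % 2]
        orders[subject] = (first, second)
    return orders


# Six subjects, as in experiment 2a; the labels S1..S6 are hypothetical.
ORDERS = counterbalance(["S1", "S2", "S3", "S4", "S5", "S6"])
```

With six subjects this scheme gives three subjects each order, so that any effect of hearing one condition first is balanced across the group.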
suppose that this difference is due to variation between the conditions in the "settings" of the cues (for stop manner) other than silence, e.g., formant transitions.

Turning now to the comparison between geminate and nongeminate stops, we see in Fig. 9 that subjects needed a longer silent interval to identify syllable-final [d] than [b] or [g];⁴ even at the longest interval the identification of [d] reached only 38% correct. Further research by Repp (1976) suggests that an interval of approximately 200 ms is necessary for listeners to identify the syllable-final stop in a sequence of identical stops (see also Pickett and Decker, 1960; Fujisaki, Nakamura, and Imoto, 1975). This result is then another piece of evidence that speaks against an explanation of the perceptual disappearance of the syllable-final stops in terms of recognition masking, for one would be hard-pressed to explain why syllable-initial [d] should "backward-mask" syllable-final [d] over a period four times longer than it masks [b], or [g].

More direct evidence that syllable-final transitions are not "backward masked" is also to be found in studies by Repp (1976, 1976). Having presented to listeners VCV's that had been synthesized with and without syllable-final transitions, he found, in the case of stimuli without syllable-final transitions, that the time required to identify the medial consonant increased as a function of the duration of the closure interval; in the case of stimuli with syllable-final transitions, however, the time required was more nearly constant (Repp, 1976). Clearly, then, the syllable-final transitions had a perceptual effect even though they were not heard as discrete phonetic events. This same conclusion can be drawn from another experiment by Repp (1976). In that experiment the syllable [dɛ] was preceded, in the one case, by [ad], in the other case by [ab]. In both cases the listeners perceived [adɛ]. Nevertheless, they discriminated between the stimuli at a level slightly better than chance.

Returning now to our own results, we conclude from experiments 2a and 2b that, just as silence is important for the perception of stops in prevocalic position, so also is it important for the perception of stops in postvocalic position. Moreover, the results are consistent with the evidence presented in the Introduction--namely, that silence is important, not because it provides time to evade masking or because it enters into an auditory interaction, but rather because it provides information about the behavior of a vocal tract.

II. SILENCE AS A SUFFICIENT CONDITION BEFORE AND AFTER THE VOWEL; PERCEPTUAL EQUIVALENCE OF SILENCE AND SOUND

In the studies so far described, stops were (or were not) perceived in patterns that contained transition cues appropriate for stop manner. Now we shall turn to cases in which the transition cues are absent, and it is left to the power of the silence cue itself to produce the effect of a stop. We should note that even in the early study by Bastian et al. (1961), silence might have borne the entire burden, but we cannot be sure because the procedures of cutting and splicing the magnetic tape may have introduced a transient, which of itself could contribute to the perception of a stop. We should also note that others (Summerfield and Bailey, 1977), working independently of us, have recently demonstrated the power of silence to cue stop manner prevocalically in the context of fricative-vowel versus fricative-stop-vowel, e.g., [si] versus [ski], where the vocalic section alone is, by perceptual test, not sufficient to produce the stop. At all events, we, too, wish to test the silence cue in such circumstances, and to do it for several positions in the syllable: in prevocalic position ("slit" versus "split"); in intervocalic position ("say shop" versus "say chop," the affricate "ch" [tʃ] being taken here as a stop-initiated fricative); and in postvocalic position ("dish" versus "ditch"). The results may throw more light on the role of silence in the perception of stop manner, since in these instances there are no obvious transition cues to be masked. They will also provide the basis for further investigations into the reasons why silence should have a role in stop perception at all.

To see the point of one of these further investigations we should recall that, as we have supposed, the role of silence might be to tell the listener that the speaker either did or did not close his vocal tract appropriately for the production of a stop consonant. But to make that suggestion is to imply that our perception of speech is constrained to some degree by a device that acts as if it knew what vocal tracts can and cannot do when they make linguistically relevant gestures; or, more generally, that there is, in speech, a link between perception and production. Further evidence for such a link comes, for example, from studies that have established an equivalence in phonetic perception between cues that are very different from an acoustic (and presumably auditory) point of view, but which are the correlated results of the same articulatory gesture. One of the earliest of these is of special interest to us because it dealt with silence, albeit as a cue to voicing rather than manner (Lisker, 1957b). The context was that of "rabid" versus "rapid." The results were (1) that variation in the duration of intersyllabic silence was sufficient to cue the voicing distinction between the two words, and (2) that the location of the voicing boundary on the continuum of intersyllabic silence varied as a function of whether the stimuli were synthesized, say, with or without a transition of the first formant at the end of the first syllable. Thus, cues with different acoustic properties were nevertheless found to be equivalent in phonetic perception: Just as stimuli characterized by the presence of a transition of the first formant and a relatively long silent interval were heard as "rapid," so also were stimuli characterized by the absence of a transition of the first formant and a shorter silent interval. We should ask now why silence should give rise to the same phonetic percept as the frequency modulation of the first-formant transition. The answer is surely hard to find so long as we think in terms of what we know, or can surmise, about auditory perception. But in articulation we find the tie that binds: These acoustically dissimilar events
are both to be found among the many acoustic consequences of the gesture that converts "rabid" to "rapid." There are other, equally diverse acoustic consequences of the gesture, and these, too, according to the results of the early study and its current extensions (Lisker, 1977) have an equivalence in phonetic perception.

Since articulatory gestures commonly have multiple and diverse acoustic consequences, we should expect to find many cases of such perceptual equivalence among acoustically dissimilar cues. To be sure, there is no problem in finding such cases; they abound, and have been studied for all three phonetic dimensions: manner, voicing, and place. (For a review, see Liberman and Studdert-Kennedy, in press). In the third experiment of this section we examine one additional case. Taking advantage of the fact that the stop gesture which differentiates fricative from affricate in "ditch" versus "dish" generates changes in both the duration of the silent closure interval and changes in the onset and duration of the fricative noise, we examine the perceptual equivalence between silence, on the one hand, and, on the other, the rise time of the friction and also its duration.

A. Experiment 3

Our third experiment was designed to determine whether the perception of "split" could be induced by inserting silence between the fricative noise of [s] and the syllable "lit." Is silence, in this sense, a sufficient condition for the perception of stop manner, and, if so, over what range of durations is silence effective? The second question is interesting because we know that neither a very brief nor a very long closure is appropriate for stop manner. A too-brief closure would presumably indicate that the speaker had not closed his vocal tract long enough to have said "split." A too-long closure, on the other hand, would suggest that he had produced the "s," then waited a while, and finally said "lit." That being so, we would suppose that only a limited range of silent intervals would signal the production of stop manner.

1. Method

A male speaker's recordings of the fricative noise of [s] and the syllable "lit" were digitized and stored in computer memory. (Both segments were produced in isolation.) Having listened carefully to these segments, we judged that the noise of the [s] did not end with a stop, nor did the "lit" begin with a stop. Using the editing facilities provided by the Haskins Laboratories PCM system, we then appended the "s noise" to the "lit," separating these two segments by intervals of silence that ranged from 0 to 100 ms in steps of 15 ms, and from 100 to 650 ms in steps of 50 ms. Three tokens of each stimulus were generated. The resulting stimuli were randomized and recorded on audio tape with a 3-s interval between stimuli. The listeners were instructed to label the stimuli as "slit," "split," or "s" followed by "lit." (The last named category is not "slit," but rather "s" plus "lit," with a clearly perceptible period of silence in between.)

The subjects were ten volunteers, all undergraduates at Lehman College who had not previously served in experiments on speech perception. They were tested in two groups of five, each under conditions similar to those of experiment 1. To familiarize the listeners with the stimuli, we had them listen to the entire stimulus continuum before the test sequence began.

FIG. 10. Silence as a sufficient condition of stop manner; identification of [p] in patterns composed of "s" followed by "lit." (Horizontal axis: silent interval, in ms.)

2. Results and discussion

The effect of inserting intervals of silence between the "s-noise" and [lit] is shown in Fig. 10. There we see that at silent intervals of less than 60 ms listeners reported "slit," but at longer intervals--out to about 450 ms--they reported "split." In this case, then, silence is a sufficient condition for stop manner. Notice, however, that at the longest silent interval the stop was not heard; rather, the subjects reported "s-silence-lit." Thus, neither the very brief nor the very long silent intervals produced a stop percept. This outcome accords well with our earlier supposition that only a limited range of silent intervals should signal stop manner.

B. Experiment 4

To this point we have investigated silence as a condition for the perception of stop manner. Now we turn to silence as a condition for affricate manner. To see why, consider that just as a speaker must close his vocal tract to produce the stop that distinguishes, for example, [sta] from [sa], so also must he close his vocal tract to produce the stop-initiated fricative (i.e., affricate) that distinguishes, for example, the phrase "say chop" from "say shop." There is evidence, moreover, that the silence associated with vocal-tract closure is a cue for the affricate-fricative contrast in intervocalic position. This evidence comes from early experiments with synthetic speech (Kuypers, 1955; Truby, 1955). The purpose of the experiment to be described here is to replicate and expand these early findings. Specifically, we aim to determine whether silence can be a sufficient condition for the fricative-affricate contrast in the naturally produced utterances "say shop" and "say chop."
1. Method

A male speaker's recording of "please say shop" was digitized and stored in computer memory. Using the editing facilities provided by the Haskins Laboratories PCM system, we removed the initial 15 ms of [ʃ] noise from "shop." The signal that remained still sounded to us like "shop."

We should note parenthetically that in situations of this kind, where there are presumably a number of different cues for the same distinction, it often happens that relatively extreme "settings" of one of the cues will cause the other cues to be "overridden" in perception. For example, in this case, we have reason to believe that the duration and onset of the frication noise, as well as silence, are cues to the affricate-fricative distinction (see Gerstman, 1957). Very long fricative noise, especially when combined with slow onset, may so bias perception toward the fricative that no amount of the silence cue can be effective.

To generate our experimental stimuli we inserted intervals of silence between the offset of "please say" and the onset of "shop." These intervals covered the range 0 to 400 ms. The steps were 10 ms each from 0 to 100 ms and 50 ms each from 100 to 400 ms. Four tokens of each stimulus were generated. The resulting stimuli were randomized and recorded on audio tape with a 4-s interval between stimuli.

The subjects were ten volunteers, all undergraduates at Lehman College who had not previously participated in experiments on speech perception. They were tested en masse under listening conditions similar to those of experiment 1. The subjects were told they would hear either "please say shop" or "please say chop," and were instructed to write either "shop" or "chop" on their response sheets. To familiarize them with the experimental stimuli, we played twenty of the stimuli before the test sequence began.

2. Results and discussion

The effect of varying the duration of the silent interval between "please say" and "shop" is shown in Fig. 11. We see that "chop" responses begin to appear when the silent interval exceeds about 30 ms; by 70 ms they account for 75% of the responses. Thus, we conclude that silence can be a sufficient cue for distinguishing the affricate [tʃ] from the fricative [ʃ]. We should remark that, according to preliminary research we have done, the contrast between the voiced counterparts of those phones (i.e., [dʒ] and [ʒ]) can also be cued by silence.

Redirecting our attention to the data for the voiceless forms shown in Fig. 11, we see that at the very long intervals of silence there is a tendency for our listeners' perceptions to revert to the fricative [ʃ]. This tendency is similar to that we saw in the case of silence as a cue for stop manner in the contrast "split" versus "slit" (cf. Fig. 10), but it is not so marked. In that connection we note that the longest silent interval for the present experiment with "shop" and "chop" was 400 ms, whereas for the earlier experiment with "slit" and "split" it was 650 ms. When we examine the identification functions for "slit" versus "split," we see that at 400 ms our listeners' responses had only just begun to revert to "s-silence-lit." Presumably, then, in the present experiment, the "chop" responses would have reverted more nearly to "shop" had we carried the silent interval to greater lengths.

Having seen that we convert the utterance "please say shop" into "please say chop" by appropriately increasing the silent interval between "say" and "shop," we should wonder whether we can start with the utterance "please say chop" and convert it to "please say shop" by shortening the silence. The results from preliminary research suggest that this can, indeed, be done, though just how convincingly depends upon the "intensity" of the affricate articulation in "chop" (Raphael and Dorman, 1977). Of course this is analogous to the results obtained in experiments 1 and 2, where too little silence caused stops not to be heard.

C. Experiment 5a and 5b

Having found silence to be sufficient for the perception of affricate manner in syllable-initial position ("shop" versus "chop"), we now wish to determine whether it can be sufficient in syllable-final position, as in "dish" versus "ditch." We also wish in these experiments to examine the effects of two other cues for affricate manner--namely, the duration and rise time of the fricative noise (see Gerstman, 1957)--and to study such relations as there may be between these two cues, on the one hand, and silence on the other.
lOO 1. Method
bined the silence cue with each of two durations of

In the other series we combined the silence cue with each of two different conditions of noise rise time, using for this purpose the second of the recordings referred to above. We produced the two rise times in the following way. For one, we simply used the rise time of the original utterance, which was 35 ms. For the other, we reduced the rise time to 5 ms by removing the first 30 ms of the noise. To compensate in the simplest possible way for the resulting reduction in overall duration of the noise, we added 30 ms of noise to the center. (Given that the rise time was not instantaneous, this operation does not ensure that the durations of the stimuli with the two conditions of rise time were psychologically equal. We should note, however, that they were more nearly so than they would have been if the 30-ms insertion had not been made.)

The subjects for experiment 5a were ten undergraduate volunteers from Arizona State University who had not previously participated in research on speech perception. They were tested en masse in a large sound-attenuated room. The experimental stimuli were reproduced on a Magnecord 1032 tape recorder via a CEI 41-2 loudspeaker. The subjects for experiment 5b were 12 undergraduate volunteers from Lehman College who had not previously participated in research on speech perception. They were tested in groups of four under the conditions described for experiment 1. The subjects in both experiments were given the same instructions. They were told that they would hear either "put it in the dish" or "put it in the ditch" and were instructed to write either "sh" or "ch" on their response sheets. To familiarize the subjects with the experimental stimuli, we had them listen to twenty stimuli before we started the test sequence.

2. Results and discussion

FIG. 12. The relation between silence and sound; identification of [tʃ] for two conditions of fricative-noise duration.

<0.005). That is to say that 14 ms of silence (the difference between 89 and 75 ms) is equivalent in these phonetic perceptions to 160 ms of noise.

In Fig. 13 we see the results of experiment 5b. Since listeners report "dish" at the shortest intervals of silence and "ditch" at the longest intervals, we see, once again, that silence is sufficient to distinguish between fricative and affricate. And here, too, we see a relation between two acoustic cues to the same distinction: silence and rise time of the fricative noise. The boundary between fricative and affricate is at about 57 ms of silence when the rise time is slow (35 ms), but at 37 ms when the rise time is rapid (0 ms). This difference is significant (T = 1, p < 0.005).

We should note that relations of the kind described here can limit the effectiveness of silence as a cue. At one extreme we might have such a long duration of noise, and thus a strong bias toward a fricative, that no amount of silence would be sufficient to overcome it. At the other extreme we might have such a short duration and rise time of the noise, and thus so strong a bias toward the affricate, that even durations of silence near 0 ms would not alter the perception of the affricate. This is consistent with the caveat we mentioned in our earlier discussion. It would apply also in the case of "slit" and "split" to the trading relation between temporal (silence) and spectral cues that have been reported by other investigators (Erickson, Fitch, Halwes, and Liberman, 1977; Liberman and Pisoni, 1977).

Returning now to the main findings of our experiment, we should note that the relations among the effects of the several cues are, in principle, like those that have been reported for numerous others (for a review, see Liberman and Studdert-Kennedy, in press). In all cases, cues that are quite different from an acoustic point of view, nevertheless give rise to the same phonetic percept. It is consistent with our

A. Experiment 6

The purpose of this experiment was to discover whether the effect of intersyllabic silence on the perception of syllable-final stops in the disyllables [bɛb dɛ] and [bɛg dɛ] is different when the syllables are produced by two speakers instead of one.
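The trading relation reported for experiment 5a can be made concrete with a little arithmetic. The sketch below, in Python, takes the two boundaries given in the text (89 and 75 ms of silence, a shift the text equates with 160 ms of noise) and expresses the trade as milliseconds of silence per millisecond of noise. It also shows one common way to locate a category boundary: linear interpolation to the 50% point of an identification function. The text here does not say how the boundaries were computed, so the interpolation method, the function names, and the demonstration percentages are assumptions made for illustration; they are not data or procedures from the experiments.

```python
from typing import Sequence


def boundary_ms(intervals_ms: Sequence[float],
                pct_affricate: Sequence[float],
                criterion: float = 50.0) -> float:
    """Silent interval at which affricate responses first reach `criterion` percent.

    Assumed method: linear interpolation between adjacent points of the
    identification function.
    """
    points = list(zip(intervals_ms, pct_affricate))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("identification function never crosses the criterion")


def silence_per_ms_of_cue(boundary_a_ms: float,
                          boundary_b_ms: float,
                          cue_amount_ms: float) -> float:
    """Boundary shift in ms of silence, per ms of the other cue."""
    return abs(boundary_a_ms - boundary_b_ms) / cue_amount_ms


# Values reported in the text for experiment 5a:
# boundaries at 89 and 75 ms of silence, equated with 160 ms of noise.
ratio_5a = silence_per_ms_of_cue(89, 75, 160)

# Hypothetical identification function (NOT data from the paper).
demo_intervals = [0, 20, 40, 60, 80, 100]
demo_pct = [0, 5, 30, 70, 95, 100]
demo_boundary = boundary_ms(demo_intervals, demo_pct)

print(f"experiment 5a: {ratio_5a:.4f} ms of silence per ms of noise")
print(f"hypothetical boundary: {demo_boundary:.1f} ms")
```

On the reported figures the trade works out to a little under 0.09 ms of silence for each millisecond of noise, which is one way of seeing that silence is by far the more potent of the two cues per unit of time.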
[Figure: abscissa, silent interval (20 to 100 ms); curves for same talker, different talker (n = 8), and different talker (n = 2).]

FIG. 14. Silence as a condition for stop manner when it reflects the behavior of one vocal tract or two: identification of syllable-final stops in [bɛb dɛ]-[bɛg dɛ] in the same- and different-talker conditions.

val of silence, including even the very shortest. For these subjects, it is as if their perceptual machinery "knew" that, with two speakers, intersyllabic silence conveys no useful phonetic information. The remaining two subjects behaved in the different-talker condition almost exactly as they had when there was but a single talker. We cannot be sure why. We may note, however, that a single syllable by each talker provides very little information about the identity of the talker. Conceivably, therefore, the fact that the two syllables were produced by different talkers did not properly "register" with these two subjects. In that connection, it is relevant that one of these two subjects did remark at the end of the experiment, that she thought she had been listening to the same talker speaking on two different pitches. This suggests that the effect we obtained in the different-talker condition was not due solely to the acoustic differences between the voices as such, but rather to their role in informing the listeners that there were, indeed, two sources of speech.

B. Experiment 7

The purpose of this experiment was to determine if the effect of silence in converting "say shop" to "say chop" is different when the words on either side of the silence are produced by two talkers instead of one.

1. Method

The stimuli for this experiment were produced in the same manner as those of experiment 4, except for the addition of a "different-voice" condition. First we digitized and stored in computer memory a male speaker's recording of "please say shop." To produce stimuli for the same-talker [...] a 3-s interval between stimuli. To produce stimuli for the different-talker condition, we first digitized a female's recording of "please say shop." The phrase "please say" was excised from the recording and stored in computer memory. We then appended the male-produced "shop" to the female-produced "please say," leaving intervals of silence between "say" and "shop." These intervals ranged from 0 to 100 ms in steps of 10 ms. Three tokens of each stimulus were generated. The resulting stimuli were randomized and recorded on audio tape with a 3-s interval between stimuli.

The subjects were ten volunteers, all undergraduates at Lehman College who had not previously participated in research on speech perception. For the "same-talker" condition, the subjects were told that they would hear a male voice saying either "please say shop" or "please say chop." For the different-talker condition, the subjects were told that they would hear a female voice saying "please say" and a male voice saying either "shop" or "chop." In both conditions the subjects were asked to write either "sh" (for "shop") or "ch" (for "chop") on their response sheets. The subjects were tested in two groups of five under the listening conditions described in experiment 1. The stimuli of the same- and different-talker conditions were presented in blocks. The order of the blocks was counterbalanced across the two groups of subjects. To familiarize the subjects with the stimuli, we presented 20 stimuli before each trial block.

2. Results and discussion

The results of experiment 7 are shown in Fig. 15. One sees in the same-talker condition a result similar to that we obtained in the analogous condition of experiment 4: the fricative in the word "shop" was heard as the affricate in the word "chop" when the silent interval between it and the immediately preceding word exceeded about 45 ms. In contrast, silence had no effect in the different-talker condition: increases in the silent interval did not convert "shop" to "chop."

[Figure: abscissa, silent interval (20 to 100 ms); curves for same talker and different talker.]

FIG. 15. Silence as a condition for affricate manner when it reflects the behavior of one vocal tract or two: identification of [tʃ] in patterns composed of "please say" and "shop" in the same- and different-talker conditions.

We should note that the utterance "please say shop" used in this experiment should have provided more information about the identity of the talker (or talkers) than did the two syllables of the previous experiment. This may account for the fact that, in this experiment, though not in the other, the effect of the same- versus different-talker conditions was found in every subject. Perhaps, however, the effect would not have been so large had we used other settings of the cues for the fricative-affricate distinction. Obviously, further
research is necessary to determine the limits over which the effect obtains. We should also wonder about the effect in connection with the trading relations among the fricative-affricate cues that we observed in our earlier experiments. Having found, for example in experiment 5, that duration of silence can be traded for friction duration, we might ask whether these cues also trade with the (perceived) magnitude of the difference between the voices.

We should emphasize that in both experiments the two talkers were male and female. Thus, the acoustic difference between the voices was relatively large. We are now conducting experiments contrived specifically for the purpose of helping us to determine whether the phenomenon we have here described depends critically on such an acoustic difference, or, alternatively, on an inference by the listener that he did or did not hear different sources of speech. At this point, we believe it is the latter.⁵

IV. GENERAL DISCUSSION

We should now assemble the results of our experiments in terms of their bearing on the three questions we raised at the very beginning. As for the first question--Is silence a cue to stop manner?--the answer is quite straightforward, and wholly in accord with the results of previous research. Silence is a cue, necessary in some cases, sufficient in others. Thus, given spectral cues appropriate for a stop in absolute initial position (e.g., [gɛ]), silence preceding those cues was found to be necessary if a stop was to be perceived as the second element of a fricative-stop-vowel syllable (e.g., [ʃkɛ]). Similarly, in the case of stops in syllable-final position (e.g., [bɛb]) silence following the spectral cues was necessary if they were to give rise to the perception of a stop when a second syllable was added (e.g., [bɛb dɛ]). More interesting, perhaps, is the finding that even in the absence of sufficient spectral cues, silence did, in some circumstances, produce the perception of a stop or affricate. Thus, prefixing the noise of [s] to the syllable "lit" produced "split" when the correct amount of silence was interposed; inserting silence between the words "say" and "shop" converted them to "say chop."

Our second question asked whether the effect of silence was exclusively auditory, or also phonetic. If auditory, we should expect to find explanations in terms of masking or any one of a variety of interactions. If phonetic, we should assume that silence informs the listener that the speaker did or did not make the closure that is the distinguishing characteristic of the stops, and further that the listener is sensitive to that information, just as he would be if his perception of speech were constrained by knowledge of what a vocal tract must (or must not) do when it makes a linguistically significant gesture. This question is, by its nature, more problematic than the first one, and the answer is correspondingly harder to find. We believe, however, that the pattern of results obtained in the experiments reported here lends support to the assumption that the effect of silence is, to a significant extent, phonetic.

Having presented those data at various places in this paper, we should collect them here.

First, we should consider again the basic fact that silence was an important cue, and then note how difficult it is, given our results, to account for that solely in auditory terms. Thus, we found that the transition cues for the stops were neither appreciably masked nor altered by interaction when, having been isolated from the speech patterns, they were heard as nonspeech chirps. It is also relevant, of course, that, under some conditions, silence was a sufficient cue. There were, in those cases, no other sufficient cues to be masked. It is also telling that silence was effective as a cue only over a limited range, just as we should expect given the assumption that it provides information about a stop closure that lasts for a limited amount of time. Further evidence for a link between perception and production is provided by those of our experiments that showed an equivalence in phonetic perception between duration of silence and duration of friction (or between duration of silence and the rise time of the friction). That result--similar, as we have pointed out, to the results of other investigators--seems easiest to interpret on the assumption that the acoustically different cues give rise to the same phonetic percept because they are normally the correlated (but distributed) acoustic consequences of the same gesture.

Having said that the data of our experiments (and those of others) imply that perception of the silence cue is constrained as if by knowledge of what vocal tracts can do, we should offer a few parenthetical comments about what the data do not imply. First, they most certainly do not imply that a listener can hear only what a vocal tract can do. Indeed, it is for that reason that we have so often added the qualification "when the vocal tract makes linguistically significant gestures." For we know that synthetic speech can be readily perceived (as speech), though it departs, sometimes appreciably, from those acoustic patterns that real vocal tracts can produce. Thus, synthetic patterns sometimes contain only two formants, and the transitions are sometimes made to change direction instantaneously. But such departures, we should note, are not linguistically relevant. Languages cannot enforce a distinction between phones made with two formants and those made with the greater number of formants that real vocal tracts produce, nor can they contrast instantaneous changes in formant slope with those more gradual changes that must characterize the behavior of such real masses as the tongue. In cases like these, an experimenter can take all manner of liberties with the stimulus patterns without destroying or even distorting phonetic perception, provided he manages to include the acoustic information that enables the listener to hear the stimuli as speech. All this is to say that if the speech perceiving mechanism is "tuned" to a vocal tract, as we have implied it might be, then such "tuning" must hold only for those maneuvers that have linguistic significance.

Second, our assumption of a link between perception and production is not meant to imply anything about
the nature of the mechanism that mediates the link, or about the relative contributions of nature and nurture to its formation. In regard to the nature of the mechanism, there are aspects of our results (and those of others) that speak against at least one very simple possibility: feature detectors that have evolved in such a way as to be "tuned" to respond to fixed acoustic consequences of articulatory gestures, and to be "sprung" when those consequences are present in the signal. In that connection, we note, first, that the relations among cues that we have found suggest that the setting of one detector (e.g., the silence detector) must be, in effect, variable and conditioned by the "value" of the other cues (e.g., duration of the noise). We should then note that, according to the results of the experiment on identification of syllable-final stops, a detector for the syllable-final transition cues could not respond directly upon sensing these cues, but would, instead, have to wait until it had information about the next syllable. At the least, it would have to know about that next syllable how far removed in time it was from the syllable containing the target phone and what kinds of phones it comprised. The consequence for a detector model is that it loses much of the appeal that it would otherwise have by virtue of its simplicity.

As for questions about the contributions of nature and nurture to the assumed link between perception and production, we should emphasize that such questions stand apart from those that pertain to the existence of such a link. Our experiments bear only on the latter.

We turn finally to the third question: Whose vocal tract is perception linked to? Given the results of our experiments with same and different talkers, we should suppose that the answer is quite clear: the relevant vocal tract is not that of the listener nor is it that of the speaker; it is rather some very abstract conception of vocal tracts in general. But those same results add support to the view that a link to some vocal tract, however abstract, does figure in the perception of speech.

ACKNOWLEDGMENTS

This research was supported by a grant (HD-01994) from the National Institute of Child Health and Human Development to Haskins Laboratories. We wish to thank Bruno Repp for helpful comments on an earlier version of this paper. We also wish to thank Anthony Levas and Suzi Pollack for their assistance in collecting and tabulating portions of the data. Michael F. Dorman is also at Arizona State University; Lawrence J. Raphael at Herbert H. Lehman College and the Graduate School of the City University of New York; and Alvin M. Liberman at the University of Connecticut and Yale University. Address reprint requests to Michael F. Dorman, 270 Crown Street, New Haven, Connecticut 06511.

¹When native speakers of English produce [ʃpɛ] and [ʃkɛ], [p] and [k] are realized as voiceless inaspirates. It is for this reason that, when the fricative noise is removed from [ʃpɛ] and [ʃkɛ], listeners hear the stops that remain as voiced. In our experiment, it was necessary, therefore, to record [bɛ] and [gɛ] (rather than [pɛ] and [kɛ]), so that, when the fricative noise and vocalic segment were combined, the listeners would hear a normal sounding [ʃpɛ] and [ʃkɛ].

²The term "geminate" is ordinarily used to refer to the doubling of a consonant within a word. Such doubling as we find in English occurs only across word boundaries. We nevertheless here use the term, though our subjects were native speakers of English and were accustomed to consonant doubling only at word boundaries.

³Since writing this paper, a somewhat similar result by Rudnicky and Cole (1977) has come to our attention. Having recorded [ba ga] they found: (1) that after removing the [ga] their listeners heard [bag], (2) that after replacing the [ga] with [da] placed close in time to the first syllable, listeners heard [bai da], and (3) that when the second syllable was separated from the first syllable by a sufficient interval of silence, listeners heard [bag da]. This result is of particular interest from our point of view because, in the condition when the second syllable [da] was close to [ba] and the subjects heard [bai da], it is clear that the transition cues at the end of the first syllable were not being (backward) masked by the second syllable; they were being perceived, but as a glide to [i] rather than as a stop. That result is similar to the findings of Liberman and Pisoni (1977), referred to earlier in this paper, that [ʃ] noise placed close to [gɛ] causes listeners to perceive [ʃjɛ].

⁴We have not commented on the difference between the identification function for [b] and [g] because we have found that difference to change, even to be reversed, depending on the surrounding vocalic environment. We emphasize the geminate versus nongeminate contrast because it remains more nearly stable across vowel environments.

⁵Using stimulus patterns and procedures very different from ours, Darwin and Bethell-Fox (1977) have, nevertheless, obtained results that are quite compatible. After synthesizing a pattern that was heard as an uninterrupted sequence of semivowels and vowels, they found that introducing changes in fundamental frequency at appropriate places in the pattern (without changing formant frequencies) caused the semivowels to be heard as stops. Their interpretation was that the rapid shifts in fundamental caused the sequence to "stream," thus permitting the listener to hear two voices; that, in turn, provided the silence necessary to convert semivowel to stop.

Abbs, M. (1971). "A study of cues for the identification of voiced stop consonants in intervocalic contexts," Doctoral dissertation, University of Wisconsin (unpublished).
Bailey, P., Summerfield, Q., and Dorman, M. "Friction duration and friction offset as cues to stop manner in fricative-stop-vowel sequences," (unpublished).
Bastian, J., Eimas, P., and Liberman, A. (1961). "Identification and discrimination of phonemic contrast induced by silent interval," J. Acoust. Soc. Am. 33, 842 (A).
Bastian, J. (1962). "Silent intervals as closure cues in the perception of stops," Haskins Laboratories, Speech Res. Instrum. 9, Appendix F.
Darwin, C. J. (1971). "Ear differences in the recall of fricatives and vowels," Q. J. Exp. Psychol. 23, 46-62.
Darwin, C. J., and Bethell-Fox, C. (1977). "Pitch continuity and speech source attribution," J. Exp. Psychol.: Hum. Perform. and Percept. 3, 665-672.
Erickson, D., Fitch, H., Halwes, T., and Liberman, A. (1977). "Trading relation in perception between silence and spectrum," J. Acoust. Soc. Am. 61, S46 (A).
Fujisaki, H., Nakamuro, K., and Imoto, T. (1975). "Auditory perception of duration of speech and non-speech stimuli," Auditory Analysis and the Perception of Speech, edited by G. Fant and M. A. A. Tatham (Academic, London).
Ganong, W. (1975). "An experiment on 'phonetic adaptation,'" Research Laboratory of Electronics, MIT, Progress Report 116, 206-210.
Gerstman, L. J. (1957). "Perceptual dimensions for the friction portions of certain speech sounds," Doctoral dissertation, New York University (unpublished).
Harris, K. S. (1958). "Cues for the discrimination of American English fricatives in spoken syllables," Lang. Speech 1, 1-7.
Kuypers, A. (1955). "Affricates in intervocalic position," Haskins Laboratories, Q. Prog. Rep. 15, Appendix 6.
Liberman, A. M., and Pisoni, D. B. (1977). "Evidence in a special speech-processing subsystem in the human," Recognition of Complex Acoustic Signals, edited by T. H. Bullock (Dahlem Konferenzen, Berlin) Life Sciences Research Rep. 5.
Liberman, A. M., and Studdert-Kennedy, M. (1979). "Phonetic perception," in Handbook of Sensory Physiology, edited by R. Held, H. Leibowitz, and H. L. Teuber (Springer-Verlag, Heidelberg), Vol. VIII, "Perception" (in press).
Lisker, L. (1957a). "Closure duration and the voiced-voiceless distinction in English," Language 33, 42-49.
Lisker, L. (1957b). "Closure duration, first-formant transitions and the voiced-voiceless contrast of intervocalic stops," Haskins Laboratories, Q. Prog. Rep. 23, Appendix 1.
Lisker, L. (1977). "Closure hiatus: cue to voicing, manner and place of consonant occlusion," J. Acoust. Soc. Am. 61, S48 (A).
Pickett, J. M., and Decker, L. (1960). "Time factors in perception of a double consonant," Lang. Speech 3, 11-17.
Port, R. (1976). "The influence of speaking tempo on the duration of stressed vowel and medial stop in English trochee words," Doctoral dissertation, University of Connecticut (unpublished).
Raphael, L. J., and Dorman, M. F. (1977). "Perceptual equivalence of cues for the fricative-affricate contrast," J. Acoust. Soc. Am. 61, S45 (A).
Repp, B. (1976). "Perception of implosive transitions in VCV utterances," Haskins Laboratories, Status Rep. Speech Res. SR-48, 209-234.
Repp, B. (1977). "Perceptual integration and selective attention in speech perception: further experiments on intervocalic stop consonants," Haskins Laboratories, Status Rep. Speech Res. SR-49, 37-70.
Rudnicky, A., and Cole, R. (1977). "Vowel identification and subsequent context," J. Acoust. Soc. Am. 61, S39 (A).
Summerfield, A. Q., and Bailey, P. (1977). "On the dissociation of spectral and temporal cues for stop consonant manner," J. Acoust. Soc. Am. 61, S46 (A).
Truby, H. (1955). "Affricates," Haskins Laboratories, Q. Prog. Rep. 11, 7-8.