[31] G. S. Sebestyen, Decision-Making Process in Pattern Recognition. New York: Macmillan, 1962, p. 18.
[32] S. E. G. Ohman, "Coarticulation in VCV utterances: Spectrographic measurements," J. Acoust. Soc. Amer., vol. 39, pp. 151-168, 1966.
[33] L. R. Rabiner, "On creating reference templates for speaker-independent recognition of isolated words," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 34-42, 1978.

N. R. Ganguli was born on September 1, 1939. He received the Engineer's degree in telecommunication engineering from Jadavpur University, India, in 1961.
From 1962-1968, he was engaged in the development of computers in the Indian Statistical Institute and Jadavpur University Joint Computer Project. Since 1969 he has been with the Electronics and Communication Sciences Laboratory, Indian Statistical Institute, Calcutta, India. His current research interests are in the areas of speech analysis, synthesis, and recognition.

A. K. Datta was born in 1935. He graduated with honors in physics from Calcutta University, Calcutta, India, in 1955, and received the M.Sc. degree in pure mathematics in 1963 as an external noncollegiate student.
Since 1955 he has been with the Electronics and Communication Sciences Laboratory, Indian Statistical Institute, Calcutta, India, where he worked in the field of accounting machines, computer memory, and computer hardware before taking up pattern recognition. His present research activities include speech acoustics, speech pattern recognition, handwritten character recognition and robotics.

S. Ray was born on January 2, 1953. He received the M.Stat. degree with specialization in computer science from the Indian Statistical Institute, Calcutta, India, in 1972.
From 1973-1976, he worked as a Programmer in the National Institute of Rural Development, Hyderabad, India, and the Indian Oil Corporation, Calcutta, India. He is presently with the Electronics and Communication Sciences Laboratory, Indian Statistical Institute, Calcutta, as a Programmer. His research interests include computer-oriented statistical methods of pattern recognition.
Abstract-The threshold of audibility of TIM distortion was determined for 68 subjects representing all categories of listeners, from musicians and sound engineers to the average man on the street. Three different music samples were used, and controlled amounts of distortion were produced with a digital stereophonic TIM generator. Two basic experimental methods were used to obtain the approximate threshold, after which the reliability of the detection of the distorted passages was verified with a time localization test. The results show that the audibility varies very much depending on the music sample, listening media, and person. The most sensitive group of listeners could reliably perceive 0.5 percent of momentary TIM. Low values of TIM were generally perceived only as slight changes in the tonal character of the sound, and not as distortion. In a number of cases, a preference was found for the slightly distorted sound.

A LARGE part of available psychoacoustic data on nonlinear distortion is based on measurements using memoryless amplitude-dependent nonlinearities, and investigations on the audibility of high-frequency distortion phenomena in audio equipment have been rare. In spite of the recent interest in these forms of distortion, only two prior publications on this subject are known to the authors.

Levitt et al. [1] used a 12-bit/20 kHz-encoded digitally-recorded monophonic test sentence "Have you seen Bill," and introduced a controlled limit on the maximum signal rise time. Based on experiments using three test people, they found that 0.2 percent of the rms discrepancy between the undistorted and the distorted signal, averaged over a 150 ms period, was discernible in the i of "Bill." The effect was termed "slope overload," following digital telecommunication terminology.

Jung et al. [2] performed monophonic listening tests inserting different operational amplifiers into the signal path and used recorded music as a signal. Inverse correlation was reported between the measured slew rate of the operational amplifier and the perceived sound quality. The effect was termed "slewing-induced distortion," following instrumentation terminology.

In addition to the above, only scattered qualitative remarks on the audibility of this kind of distortion are known to have been published.

Manuscript received January 19, 1978; revised May 30, 1979. This paper was presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing, Tulsa, OK, May 1977.
M. Petri-Larmi is with the University Hospital of Oulu, Oulu, Finland.
M. Otala, E. Leinonen, and J. Lammasniemi are with the Technical Research Centre of Finland, Oulu, Finland.
continued in order to detect the ultimate threshold of audibility for the six most sensitive subjects through the use of improved equipment and test procedures. The results have been published elsewhere [10].

II. TEST EQUIPMENT
The measurement setup is depicted in Fig. 1. An Ortofon SL 15 Q quadrophonic moving-coil pickup with type STM 72 transformer was used. It was mounted on a Garrard Zero 100 S turntable. The signal was amplified using a modified Quad 33 RIAA compensated preamplifier. The limiting of the maximum value of the signal rate of change was produced by a delta modulator circuit, depicted in Fig. 2.

Because of small positive feedback, the very high gain comparator IC 1 operates as a multivibrator with a negligible hysteresis and a frequency of 3 MHz. When the input voltage is higher than the voltage on capacitor C1, the comparator switches to one polarity. When the equilibrium is reached, the comparator returns to balanced oscillation. The voltage across C1 is thus a very close approximation of the input voltage during normal operation.

The maximum value of the signal rate of change is reached when the comparator stays in one of its output polarities and charges or discharges capacitor C1 via R1 with maximum speed. The maximum value of the signal rate of change is controlled by a ganged set of four precision (±0.05 dB) step attenuators, changing the input signal amplitude to the delta modulator without influencing the output level. The distortion generator could be periodically bypassed by a manually controlled reed relay switch, operated by the subject. The clicks and pops of the changeover were -60 dB below the signal level under no-distortion conditions.

The modulator was constructed by K. Riemens of the Philips Research Laboratories, Eindhoven, The Netherlands, using custom-made monolithic integrated circuits.

Fig. 2. The distortion generator. Only one of the two channels is shown.

The distortion was monitored directly from the distortion generator difference output with an oscilloscope. The signal and distortion levels in both channels were also recorded with a 6-channel Watanabe Multicorder-type MC-611-6H strip-chart recorder, preceded by logarithmic full-wave rectifying circuits having 50 ms integration time. The subject had a pushbutton at his disposal, connected directly to one of the free recorder channels.

The generator produced very steeply rising distortion once the maximum value of the signal rate of change was exceeded. Fig. 3 shows the distortion measured with the DIM30 method [7].

The main test records used with all 68 subjects were:
1) Deutsche Marschmusik, Telefunken SLE 14183-P, side 2, band H, Einzug der Gladiatoren (marches played by a large military band).
2) Mantovani's Hit Parade, Decca RDS 6897, side 1, band 1, Delilah (large violin-dominated light music orchestra).
3) Eteläsuomalaisen Osakunnan Laulajat, Finnlevy, SFU-8509, side 1, band 1, Exultate Deo (mixed classical choir).

The results were checked using the following records, with 15 of the more sensitive subjects:
4) Lincoln Mayorga and his Distinguished Colleagues, volume III, Sheffield Lab Album LAB-2, side 1, band 1, You Are the Sunshine of My Life (light pop music).
5) The Missing Linc, Lincoln Mayorga and his Distinguished Colleagues, volume II, Sheffield Lab Album S-10, side 1, band 1, Norwegian Wood (light pop music).
6) I've Got the Music in Me, Thelma Houston and Pressure Cooker, Sheffield Lab SL7/SL8, side 1, band 1, I've Got the Music in Me (soul-type pop music).

The normalized peak signal spectra from the selections used are shown in Fig. 5 as measured with the Rockland 852 1/3-octave filter and the Hewlett-Packard HP 3575 A phase-gain
Fig. 3. The measured TIM distortion of the system, with the distortion generator setting as a parameter. The horizontal axis is the normalized peak-to-peak signal amplitude; the vertical axis is the distortion percentage measured with the DIM30 method.
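
The rate-of-change limiting performed by the distortion generator of Section II can be illustrated numerically. The following is only a minimal sketch in Python, not a model of the delta modulator hardware: it assumes a sampled signal, and the sample rate, the per-sample step limit `MAX_STEP`, and the test tones are hypothetical values chosen for illustration. It shows the qualitative behavior described in the text: the limited signal follows the input exactly until the permitted rate of change is exceeded, after which the difference signal (the distortion) rises steeply.

```python
import math

def slew_limit(samples, max_step):
    """Limit the sample-to-sample change of `samples` to +/- max_step."""
    if not samples:
        return []
    out = [samples[0]]
    for x in samples[1:]:
        prev = out[-1]
        step = x - prev
        if step > max_step:
            out.append(prev + max_step)   # rising as fast as permitted
        elif step < -max_step:
            out.append(prev - max_step)   # falling as fast as permitted
        else:
            out.append(x)                 # input is tracked exactly
    return out

def difference_signal(samples, limited):
    """Distortion taken as the difference between limited and original signal."""
    return [y - x for x, y in zip(samples, limited)]

def sine(freq_hz, amplitude, sample_rate_hz, n):
    """Generate n samples of a sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n)]

SAMPLE_RATE = 48_000   # hypothetical sample rate, Hz
MAX_STEP = 0.05        # hypothetical rate-of-change limit, per sample

slow = sine(100.0, 1.0, SAMPLE_RATE, 480)    # stays below the limit
fast = sine(5000.0, 1.0, SAMPLE_RATE, 480)   # exceeds the limit

slow_err = difference_signal(slow, slew_limit(slow, MAX_STEP))
fast_err = difference_signal(fast, slew_limit(fast, MAX_STEP))

print("peak distortion, 100 Hz tone:", max(abs(e) for e in slow_err))
print("peak distortion, 5 kHz tone:", max(abs(e) for e in fast_err))
```

In this sketch, lowering `MAX_STEP` plays the role that the step attenuators play in the actual setup, which instead scale the input amplitude against a fixed rate-of-change limit.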
Fig. 4. The frequency response of the complete measurement setup up to loudspeaker terminals. The B & K QR 2011 test record.
Fig. 5. The normalized 1/3-octave frequency spectra of test records 1-4 measured from the loudspeaker output terminals with zero distortion
found in test 1 and the music and distortion signals in both channels were recorded on the 6-channel strip-chart recorder. When the subject heard what he considered to be distortion, he pressed a push button and a corresponding signal was registered onto the strip chart. The coincidence of the registered distortion peaks and the push-button operation was used as a verification of the detection. The amount of distortion was decreased until the subject lost reliable detection of the distortion peaks or did not use the button anymore. This test was repeated until the results were consistent.

The results of test 1a) and 1b) differed slightly from each other because of a time delay caused by the question, "was that distortion?" in the subject's mind. Test 2 proved to be accurate and yielded reliable and consistent results, although the distortion on/off switch was not used, and the test was therefore much more difficult for the subjects.

A complete measurement session lasted about 2-3 hours and for some subjects even much more. This created listening fatigue. Based on a limited number of repeated tests, in which some improvement of results was noted, it is probable that somewhat higher sensitivities could have been obtained if the fatigue factor had been eliminated.

If the tests were repeated for the same subject after more than a few weeks' interval, a considerable amount of learning was noted in many cases. It is, therefore, probable that the thresholds reported in this investigation are conservative.

IV. SUBJECTS
A total of 68 subjects was tested. All were first subjected to a clinical hearing investigation consisting of ear canal cleaning, visual ear drum inspection, ear drum mobility check, and manual pure-tone auditory threshold determination in the frequency range of 125 Hz to 8 kHz. Several abnormal ears were discovered, but they did not seem to produce significantly different results in the distortion perception tests. In fact, some of the audio professionals having seriously damaged hearing still detected TIM with ease. Of particular interest is that the results with subjects who had severe hearing loss at frequencies above 3-5 kHz did not differ significantly from the average in their detection threshold.

The subjects were selected for age, sex, school background, and interest in music with the aim of representing an average user of audio equipment. The relevant characteristics of the audience are depicted in Fig. 7. Age or school background seemed to bear no correlation with the results. The group of most sensitive males reported better values than the corresponding group of females. However, as an average, sex did not influence the results.

Those actively engaged with music, either playing an instrument or being keen listeners, did not show values differing from the average, except in cases where some increased sensitivity was found for the very instrument which the subject played. One of the subjects had an absolute-pitch detection ear, but proved to have a higher-than-average threshold of distortion detection. Another, being socially deaf (-30 dB) in one ear, proved to be one of the most sensitive subjects.

Those actively engaged with sound reproduction yielded lower-than-average thresholds. The group of audio professionals was the most sensitive. This group consisted of audio equipment designers and salesmen, broadcast engineers and mixers, recording studio engineers, audio journalists, etc. The group also quickly learned to identify those passages of music which possibly could give rise to distortion and in a few cases the investigators were forced to use test 2 very critically in order to disqualify anticipation.

Fig. 7. The relevant characteristics of the audience.

Both the AR 3a loudspeakers and Koss ESP-9 headphones were used in the experiments. However, the threshold of audibility proved to be 3 to 10 times higher with the use of the headphones. No apparent reason was found for this result and the effect persisted with five other brands of electrostatic and dynamic headphones as well. Distortion, which was clearly noticeable with the loudspeakers, did not degrade the sound quality with the headphones. It was made certain that no equipment malfunction caused this effect and that it did not arise from any difference in the loading of the power amplifiers. Our results are, therefore, presented for the loudspeakers only.

The results are shown in two dimensions: the minimum detectable distortion-to-signal ratio and the minimum requirement for the signal rate of change capability for just detectable distortion.

The minimum detectable rms distortion-to-rms signal ratio was obtained from the reliably and consistently detected distortion peaks in test 2 strip charts for each subject and each sample separately. The total system averaging time was 0.25 s and the results are shown in Fig. 8. For the choir, 0.5 percent of the rms distortion was detectable. The thresholds of all the subjects were concentrated at a low level because of the striking character of the distortion. The large orchestras, with their more crowded signal spectra and, consequently, higher masking, yielded a 1-2 percent threshold of audibility and a much larger spread between the results of individual subjects was noticed.
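
Two of the evaluation steps described in the text, the push-button coincidence check and the rms distortion-to-rms signal ratio averaged over 0.25 s, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used, which relied on analog rectifiers and visual inspection of strip charts. The non-overlapping rectangular averaging windows, the reaction-time tolerance `max_delay_s`, the sample rate, and all example data are hypothetical.

```python
import math

def rms(values):
    """Root-mean-square value of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def windowed_ratio_percent(signal, distortion, sample_rate_hz, window_s=0.25):
    """Percent rms distortion-to-rms signal ratio per averaging window.

    Assumption: non-overlapping rectangular windows of length window_s.
    """
    n = int(round(window_s * sample_rate_hz))
    ratios = []
    for start in range(0, len(signal) - n + 1, n):
        s = rms(signal[start:start + n])
        d = rms(distortion[start:start + n])
        ratios.append(100.0 * d / s if s > 0.0 else float("nan"))
    return ratios

def confirmed_peaks(peak_times_s, press_times_s, max_delay_s=1.0):
    """Distortion peaks followed by a button press within max_delay_s.

    max_delay_s is a hypothetical reaction-time tolerance.
    """
    return [t for t in peak_times_s
            if any(0.0 <= p - t <= max_delay_s for p in press_times_s)]

# Hypothetical example data: a 100 Hz tone with distortion at 0.5 percent
# of the signal amplitude, sampled at 8 kHz for 1 s.
SAMPLE_RATE = 8_000
signal = [math.sin(2 * math.pi * 100.0 * k / SAMPLE_RATE)
          for k in range(SAMPLE_RATE)]
distortion = [0.005 * v for v in signal]
ratios = windowed_ratio_percent(signal, distortion, SAMPLE_RATE)

# Hypothetical strip-chart readings, in seconds.
peaks = [2.0, 7.5, 12.0]
presses = [2.4, 12.3, 20.0]
detected = confirmed_peaks(peaks, presses)

print(ratios)     # four windows, each approximately 0.5 percent
print(detected)   # peaks at 2.0 s and 12.0 s are confirmed
```

A press that precedes a peak, or follows it too late, is not counted, which is one simple way to guard against the anticipation noted for the audio professionals.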
The results with records 4)-6) coincide closely with those presented. Since only 15 subjects participated in the full-length tests using these records and because of the similarity in the thresholds obtained, the results are not presented here.

Comparison with earlier investigations shows basic agreement with the results of Levitt et al. [1], although the measure used by Levitt, the rms slope truncation per sampling period, is not directly comparable with the present results. The requirement for the rate-of-change capability is considerably lower than that estimated by Jung et al. [2].

Fig. 8. The threshold of audibility of the rms distortion averaged over 0.25 s. The horizontal axis is the rms distortion-to-rms signal ratio; the vertical axis is the number of observations.
