IEEE 269 Yr 2002
Published by
The Institute of Electrical and Electronics Engineers, Inc.
3 Park Avenue, New York, NY 10016-5997, USA
25 April 2003
Print: SH95056
PDF: SS95056
Sponsor
Transmission, Access and Optical Systems Committee
of the
IEEE Communications Society
No part of this publication may be reproduced in any form, in an electronic retrieval system or otherwise, without the prior
written permission of the publisher.
The trademark name, Bell Labs (AT&T Labs), is owned by AT&T Corp.
Use of an IEEE Standard is wholly voluntary. The IEEE disclaims liability for any personal injury, property, or other
damage, of any nature whatsoever, whether special, indirect, consequential, or compensatory, directly or indirectly
resulting from the publication, use of, or reliance upon this, or any other IEEE Standard document.
The IEEE does not warrant or represent the accuracy or content of the material contained herein, and expressly
disclaims any express or implied warranty, including any implied warranty of merchantability or fitness for a specific
purpose, or that the use of the material contained herein is free from patent infringement. IEEE Standards documents
are supplied "AS IS."
The existence of an IEEE Standard does not imply that there are no other ways to produce, test, measure, purchase,
market, or provide other goods and services related to the scope of the IEEE Standard. Furthermore, the viewpoint
expressed at the time a standard is approved and issued is subject to change brought about through developments in the
state of the art and comments received from users of the standard. Every IEEE Standard is subjected to review at least
every five years for revision or reaffirmation. When a document is more than five years old and has not been reaffirmed, it
is reasonable to conclude that its contents, although still of some value, do not wholly reflect the present state of the art.
Users are cautioned to check to determine that they have the latest edition of any IEEE Standard.
In publishing and making this document available, the IEEE is not suggesting or rendering professional or other services
for, or on behalf of, any person or entity. Nor is the IEEE undertaking to perform any duty owed by any other person
or entity to another. Any person utilizing this, and any other IEEE Standards document, should rely upon the advice of a
competent professional in determining the exercise of reasonable care in any given circumstances.
Interpretations: Occasionally questions may arise regarding the meaning of portions of standards as they relate to
specific applications. When the need for interpretations is brought to the attention of IEEE, the Institute will initiate
action to prepare appropriate responses. Since IEEE Standards represent a consensus of concerned interests, it is
important to ensure that any interpretation has also received the concurrence of a balance of interests. For this reason,
IEEE and the members of its societies and Standards Coordinating Committees are not able to provide an instant
response to interpretation requests except in those cases where the matter has previously received formal consideration.
Comments for revision of IEEE Standards are welcome from any interested party, regardless of membership affiliation
with IEEE. Suggestions for changes in documents should be in the form of a proposed change of text, together with
appropriate supporting comments. Comments on standards and requests for interpretations should be addressed to:
Note: Attention is called to the possibility that implementation of this standard may require use of subject
matter covered by patent rights. By publication of this standard, no position is taken with respect to the
existence or validity of any patent rights in connection therewith. The IEEE shall not be responsible for
identifying patents for which a license may be required by an IEEE Standard or for conducting inquiries
into the legal validity or scope of those patents that are brought to its attention.
Authorization to photocopy portions of any individual standard for internal or personal use is granted by the Institute of
Electrical and Electronics Engineers, Inc., provided that the appropriate fee is paid to Copyright Clearance Center. To
arrange for payment of licensing fee, please contact Copyright Clearance Center, Customer Service, 222 Rosewood
Drive, Danvers, MA 01923, USA; +1 978 750 8400. Permission to photocopy portions of any individual standard for
educational classroom use can also be obtained through the Copyright Clearance Center.
This standard has been prepared in response to a widely expressed need by the telecommunications
industry for a standard, comprehensive method for testing the transmission performance of telephone
sets, handsets, and headsets. The present standard is a revision of IEEE Std 269™-1992. This revision
adds coverage for a wide range of ear simulators and test signals, and incorporates and updates the
contents of IEEE Std 1206™-1994.
This revision, begun in 1999, was prepared by the Subcommittee on Telephone Instrument Testing of
the Transmission, Access and Optical Systems Committee of the IEEE Communications Society.
Participants
At the time this revised standard was approved, the Subcommittee on Telephone Instrument Testing
had the following membership:
John Bareham, Chair
Glenn Hess, Vice Chair
Steve Graham, Secretary
Roger Britt Dan Foley Ron Magnuson
Chandru Butani Hans W. Gierlich Henry Mar
Rodolfo Ceruti Deborah Gruenhagen Christopher J. Struck
Cliff Chamney Roger Gutzwiller Steve Temme
Paul Coverdale Joe Helms Stephen Whitesell
Fred Dekalb Soren Jonsson Allen Woo
Gijs Dirks Frederick M. Kruger Robert Young
The following members of the balloting committee voted on this revised standard. Balloters may have
voted for approval, disapproval, or abstention.
John Bareham Steve Graham Henry Mar
Chandru Butani Deborah Gruenhagen Christopher J. Struck
Cliff Chamney Roger Gutzwiller Stephen Whitesell
Fred Dekalb Glenn Hess Allen Woo
Dan Foley Frederick M. Kruger Robert Young
Ron Magnuson
*Member Emeritus
Savoula Amanatidis
IEEE Standards Managing Editor
1. Overview
2. References
3. Definitions
5.1.1 Selection
5.1.2 Headset measurements made on ear simulators compared to real ears
5.1.3 Translation from DRP to ERP
5.2 Mouth simulators
5.3.1 Selection
5.3.2 Handset positioning
5.3.3 Headset positioning
5.4 Measurement microphones
6.1 General
6.2 Electrical measurement instruments
6.3 Measurement microphones
6.4 Ear simulator
6.5 Measurement bandwidth and resolution
6.6 Electrical test signals
7.1 General
8.1 General
8.5 Send
8.6 Sidetone
8.7 Overall
9.1 General
9.5 Echo frequency response
Annex A (normative) Ear simulators with flexible pinnas and positioning devices
Annex B (normative) Alternative ear simulators, mouth simulator, and test fixture
G.1 General
G.2 Fast Fourier transform (FFT) and cross-spectrum analysis
Annex H (normative) Loudness rating calculations
J.4.1 Total harmonic distortion (THD) and harmonic analysis
J.4.2 Total harmonic distortion (THD) and noise
J.4.3 Difference-frequency distortion (DF distortion)
J.4.4 Intermodulation distortion (IM distortion)
J.4.5 Alternatives to sine-wave stimulus signals
J.4.6 Test frequencies
J.5 Coherence methods (N/C ratio)
Annex O (normative) Temporally weighted terminal coupling loss measurement method
O.1 General
Annex P (normative) Temporally weighted terminal coupling loss algorithm
Annex S (informative) Use of the free field as the telephonometric reference point
T.1 Conversions for dBV to dBm, and for 600 Ω and 900 Ω
T.2 Conversions for dBmp to dBrnC for electrical noise measurements
1. Overview
Objective or subjective methods can be used to measure telephone transmission performance. This
standard discusses objective procedures utilizing mouth simulators, ear simulators, laboratory
microphones, and test instruments to characterize transmission performance. Subjective procedures
are particularly applicable for rating overall communication connections involving the real voice and
real ear of human subjects. Telephones, handsets, and headsets can be evaluated by purely objective
methods provided the results generally agree with the desirable performance characteristics of
subjective testing.
The relationships that are established between subjective and objective measurements will vary with
the physical constants of the telephone design, such as the size and shape of the handset or headset, the
sound leakage between the receiver and the ear of the user, and the signal processing in the speech
path. Therefore, the correlation between subjective and objective measurements should be established
separately for each telephone, headset, or handset design before measurements obtained using the
techniques covered herein can be interpreted to reflect performance under conditions of actual use.
1.1 Scope
This standard provides the techniques for objective measurement of electroacoustic characteristics of
analog and digital telephones, handsets, and headsets. Application is in the frequency range 100–
8500 Hz.
Although not specifically within the scope of this standard, the methods described are generally
applicable to a wide variety of other communications equipment, including cordless, wireless, and
mobile communications devices.
Telephones with handsfree or loudspeaking features are covered by IEEE Std 1329™-1999.
Due to the various characteristics of these devices and the environments in which they operate, not all
of the test procedures in this standard are applicable to all types of telephones, handsets, or headsets.
Application of the test procedures to atypical telephones should be determined on an individual basis.
1.2 Purpose
The purpose of this standard is to provide practical methods for making laboratory measurements of
the transmission characteristics of analog and digital telephones, handsets, and headsets so that their
performance may be evaluated on a standardized basis.
This is a brief summary of the clauses contained in the standard. The primary measurement
procedures appear in Clause 7 through Clause 9 of the document.
Clause 2, Clause 3, and Clause 4 provide references, definitions, abbreviations, acronyms, and symbols
that will be useful in executing the tests of this standard. These clauses provide a background in the
terminology used for telephone, handset, and headset testing.
Clause 5 specifies the test equipment, test environment, and acoustic impairments. The test equipment
portion includes ear and mouth simulators, test fixtures, and measurement microphones, as well as
procedures for positioning the telephone handset or headset for testing. The test environment includes
both the acoustical and physical characteristics of the test space. Impairments include the acoustic
conditions.
Clause 6 describes the calibration procedures needed to ensure that the equipment is in a known state.
Calibration of the acoustic transducers and electrical interfaces is explained.
Clauses 7, 8, and 9 contain the transmission test procedures, such as send and receive, for analog
telephones, digital telephones, and analog four-wire handsets and headsets, respectively.
The annexes contain additional information or details of procedures referred to from within the
relevant clause. Normative annexes contain information that is considered to be an official part of the
standard. Informative annexes contain information that may be useful, or of general interest, but is
not part of the standard.
2. References
This standard shall be used in conjunction with the following publications. When the following
publications are superseded by an approved revision, the revision shall apply, but the impact on results
should be determined.
ANSI S1.4-1983 (Reaff 2001), American National Standard Specification for Sound Level Meters.
1 ANSI publications are available from the Sales Department, American National Standards Institute, 25 West 43rd Street, 4th
Floor, New York, NY 10036, USA (http://www.ansi.org/).
ANSI S1.6-1984 (Reaff 2001), American National Standard Preferred Frequencies, Frequency Levels,
and Band Numbers for Acoustical Measurements.
ANSI S1.11-1986 (Reaff 1998), American National Standard Specifications for Octave Band and
Fractional-Octave-Band Analog and Digital Filters.
ANSI S1.12-1967 (Reaff 1997), American National Standard Specifications for Laboratory Standard
Microphones.
IEEE Std 661™-1979 (Reaff 1998), IEEE Standard Method for Determining Objective Loudness
Ratings of Telephone Connections.2,3
IEEE Std 743™-1995, IEEE Standard Equipment Requirements and Measurement Techniques for
Analog Transmission Parameters for Telecommunications.
IEEE Std 1329-1999, IEEE Standard Method for Measuring Transmission Performance of Handsfree
Telephone Sets.4
ITU-T Recommendation G.714-1988, Separate Performance Characteristics for the Encoding and
Decoding Sides of PCM Channels Applicable to 4-Wire Voice-Frequency Interfaces.
ITU-T Recommendation G.723-1996, Speech Coders: Dual Rate Speech Coder for Multimedia
Communications Transmitting at 5.3 and 6.3 kbit/s.
2 The IEEE standards or products referred to in Clause 2 are trademarks owned by the Institute of Electrical and Electronics
Engineers, Inc.
3 IEEE publications are available from the Institute of Electrical and Electronics Engineers, 445 Hoes Lane, P.O. Box 1331,
Piscataway, NJ 08855-1331, USA (http://standards.ieee.org/).
4 ASTM publications are available from the American Society for Testing and Materials, 100 Barr Harbor Drive, P.O. Box C700,
West Conshohocken, PA 19428-2959, USA (http://www.astm.org/).
5 ISO publications are available from the ISO Central Secretariat, Case Postale 56, 1 rue de Varembé, CH-1211, Genève 20,
Switzerland/Suisse (http://www.iso.ch/). ISO publications are also available in the United States from the Sales Department,
American National Standards Institute, 11 West 42nd Street, 13th Floor, New York, NY 10036, USA (http://www.ansi.org/).
6 ITU-T publications are available from the International Telecommunications Union, Place des Nations, CH-1211, Geneva 20,
Switzerland/Suisse (http://www.itu.int/).
ITU-T Recommendation G.726-1990, 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modula-
tion (ADPCM).
ITU-T Recommendation O.133-1993, Equipment for Measuring the Performance of PCM Encoders
and Decoders.
ITU-T Recommendation P.360-1998, Efficiency of Devices for Preventing the Occurrence of Excessive
Acoustic Pressure by Telephone Receivers.
3. Definitions
3.1 A-weighted: A measurement made using the A frequency weighting specified in ANSI S1.4-1983.
A-weighted sound pressure level is expressed as dBA, and the reference level is always 20 µPa
(micropascals).
7
The numbers in brackets correspond to those of the bibliography in Annex V.
3.2 acoustic echo path: The acoustic coupling from the handset or headset receiver to the handset or
headset microphone.
3.3 acoustic input: The free-field sound pressure level developed by a mouth simulator at the mouth
reference point. See also: sound pressure level.
3.4 acoustic output: The sound pressure level developed in an ear simulator. See: sound pressure level.
3.5 analog telephone set: A telephone set in which the two-way voice communication interface to the
8
Information on references can be found in Clause 2.
Copyright © 2003 IEEE. All rights reserved.
3.45 sound pressure level: The sound pressure level, in decibels, of a sound is 20 times the logarithm to
the base 10 of the ratio of the pressure of the sound to the reference pressure. For this standard, the
reference pressure is normally 1 pascal (Pa), and sound pressure levels are expressed in dB re 1 Pa
(dBPa). When a reference pressure of 20 µPa is used, the sound pressure level will be expressed as
dBSPL. Unless otherwise indicated, rms values of pressure are used. Most telephony acoustic
measurements are referenced to 1 Pa = 1 N/m² (newton per square meter). However, measurements
such as receive noise and room noise are generally referenced to 20 µPa (micropascals).
NOTE: 0 dBPa ≙ 94 dBSPL, 0 dBSPL ≙ 20 µPa, 1 Pa ≙ 1 N/m². A-weighted sound pressure level in dB (dBSPL, A-weighted) is
often abbreviated as dBA (see ANSI S1.4-1983).
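As an informal illustration of definition 3.45, the following minimal sketch computes a sound pressure level from an rms pressure and converts between the two reference pressures. The function and constant names are illustrative assumptions and are not part of this standard.

```python
import math

P_REF_PA = 1.0      # reference pressure for dBPa: 1 pascal
P_REF_SPL = 20e-6   # reference pressure for dBSPL: 20 micropascals

def spl_db(pressure_rms_pa, p_ref=P_REF_PA):
    """Sound pressure level in dB: 20 * log10(p / p_ref), with p as an rms value."""
    if pressure_rms_pa <= 0:
        raise ValueError("rms pressure must be positive")
    return 20.0 * math.log10(pressure_rms_pa / p_ref)

def dbpa_to_dbspl(level_dbpa):
    """Shift a level from the 1 Pa reference to the 20 micropascal reference."""
    return level_dbpa + 20.0 * math.log10(P_REF_PA / P_REF_SPL)

def dbspl_to_dbpa(level_dbspl):
    """Inverse of dbpa_to_dbspl."""
    return level_dbspl - 20.0 * math.log10(P_REF_PA / P_REF_SPL)
```

The offset between the two scales evaluates to about 93.98 dB, which is the rounded 94 dB figure quoted in the note above.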
3.46 speaker (also loudspeaker): An electroacoustic transducer that converts an electrical signal to
sound and delivers it to the ear from a distance of several centimeters or greater.
3.47 spectrum: A distribution of amplitude (or phase, or some other quantity) as a function of
frequency. It is often expressed in bands. Bands may be of constant percentage width, such as 1/3 or
1/12th octave bands (23% and 6% of the center frequency, respectively). Bands may also be of
fixed width, regardless of center frequency (e.g., 50 Hz). Instead of bands, a spectrum may also be
expressed as spectrum density, which is equivalent to 1 Hz bands.
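The percentage widths quoted in definition 3.47 can be checked with a short calculation. The sketch below assumes base-2 octave ratios and band edges placed geometrically symmetric about the center frequency; the function names are illustrative only.

```python
def fractional_octave_bandwidth(n):
    """Relative width (upper edge minus lower edge, over center) of a 1/n-octave band."""
    half = 1.0 / (2.0 * n)
    return 2.0 ** half - 2.0 ** (-half)

def band_edges(center_hz, n):
    """Lower and upper edge frequencies of a 1/n-octave band centered at center_hz."""
    half = 1.0 / (2.0 * n)
    return center_hz * 2.0 ** (-half), center_hz * 2.0 ** half

if __name__ == "__main__":
    for n in (3, 12):
        print(f"1/{n} octave: {100 * fractional_octave_bandwidth(n):.1f}% of center frequency")
```

Under these assumptions the 1/3-octave and 1/12-octave widths round to the 23% and 6% figures given in the definition.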
3.48 talker sidetone: The direction of speech transmission from mouth to ear of the telephone user.
3.49 telephone set: A device that, when connected to a telephone network, allows two-way voice
communication.
3.50 test head: A fixture containing a mouth simulator and an ear simulator located in a specified
relationship with each other. See: loudness rating guard-ring position.
3.51 two-wire transmission: A transmission method, circuit, or system that provides common paths
(one pair) for signals in the send and receive directions.
3.52 zero transmission level point (0 TLP): An arbitrarily established point relative to which
transmission levels at all other points are specified.
4.1 Abbreviations and acronyms
4.2 Symbols
The letter "G" is used for spectra. This corresponds to common usage, especially in two-channel FFT
analysis literature. The analysis bandwidth shall be specified:
H(f) = frequency response, in dB
H′(f) = response at the new preferred ISO R10 frequency
HEP(f) = echo path frequency response, in dB (V/V)
HLS(f) = listener sidetone frequency response, in dB (Pa/Pa)
HO(f) = overall frequency response, in dB (Pa/Pa)
HR(f) = receive frequency response, in dB (Pa/V)
HS(f) = send frequency response, in dB (V/Pa)
HSD(f) = diffuse field send frequency response, in dB (V/Pa)
HTS(f) = talker sidetone frequency response, in dB (Pa/Pa)
The letter "L" is used for rms levels measured over a wide band, with the bandwidth to be
specified. This corresponds to common usage in sound level measurements, as specified in ANSI
S1.1-1994:
SDE = translation from HATS drum reference point to ear reference point
Other symbols:
Test equipment generally required to test all the devices covered by this standard is covered in this
clause. The specific test equipment required to produce test signals and analyze the resulting output is
determined by the test signal and analysis method chosen. Test circuits, interfaces, and impairments
for analog and digital telephones, as well as four-wire devices such as handsets and headsets, are
described in Clause 7, Clause 8, and Clause 9, respectively.
All equipment should be calibrated in accordance with the recommendations of the manufacturer
before performing the system calibration procedures in Clause 6.
The fundamental purpose of ear simulators is to test a receiver under conditions that most closely
approximate actual use by real persons. The recommendations that follow are based upon the manner
in which the receivers are intended to be used. Modifications to an ear simulator or test procedure shall
not be made. For example, flexible sealing material, such as putty, shall not be used.
5.1.1 Selection
An ear simulator with a flexible pinna shall be used for all measurements, unless the applicable
performance specification requires or allows an alternative. See Annex B. The Type 3.3 ear simulator is
recommended for all devices. The Type 3.4 ear simulator is recommended for all devices except supra-
concha and supra-aural headsets.
The ear simulators shall comply with the specifications given in ITU-T Recommendation P.57-2002.
Type 3.3 and Type 3.4 ear simulators both simulate the acoustical and mechanical characteristics of
real ears. They are likely to give results comparable to the typical listening experience of real persons
for the widest possible variety of handsets or headsets and applications, including nontraditional
designer handsets and headsets. Both types simulate typical leakage and how it changes with position
and/or applied force. There are, however, some differences between the two types, as well as the
positioning devices available for use with them. For more information, see Annex A.
(Type 3.3 was formerly called the soft HATS pinna. It has a hardness of 55 ± 10 degrees Shore-OO,
as measured according to ASTM D2240-2002. It is an anatomically shaped pinna which is structurally
identical to the pinna formerly described as Type 3.3 in ITU-T Recommendation P.57-2002.)
The same ear simulator shall be used for all measurements on a particular device. The choice of ear
simulator and positioning method shall be clearly stated in all test reports.
Type 3.3 and Type 3.4 ear simulators both simulate the acoustical and mechanical characteristics of
real ears. They are likely to give results comparable to the typical listening experience of real persons
for the widest possible variety of headsets. However, the correlation between measurements on ear
simulators and on real ears is better for some headset types than others. For headsets that are in close
proximity to, or in the ear canal, the correlation is not as good as for most other types.
For insert headsets, both Type 3.3 and Type 3.4 ear simulators are likely to provide a greater seal than
on many human subjects, resulting in an overestimation of output at low frequencies. Nonetheless,
both ear simulators are recommended for this application.
Type 3.3 and Type 3.4 ear simulators both measure at the eardrum reference point (DRP).
Measurements collected at the DRP shall be translated to the ERP. This is done because receive and
sidetone specifications are referenced to the ERP. It also permits comparison of measurements made
on the various types of ear simulators.
For all measurements, the translation from DRP to ERP may be fulfilled by using a filter as specified
in Annex C. A filter shall be used for measurements of peak acoustic pressure, and is recommended for
measurements of distortion.
For measurements made with any kind of spectrum analysis, the translation from DRP to ERP may
be performed by using one of the tables in Annex C. Measurement examples include frequency
response, noise, linearity, and distortion. Tables may also be used for frequency response
measurements made with sine waves, if only the fundamental or total response is included.
For measurements of distortion using a sine or narrowband stimulus, a translation table may be
constructed based on one of the tables in Annex C. Separate tables are required for each harmonic or
difference-frequency distortion product, taking into account the frequency offset between the stimulus
frequency and the frequency of the distortion product.
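The table-based translation described above can be sketched as follows. The table values in this example are hypothetical placeholders and are NOT the Annex C values. The sketch also assumes that the correction is expressed in dB and added to the DRP level, and that values between table points are interpolated linearly over log-frequency. None of these choices is specified in this clause. For a harmonic distortion product, the correction is taken at the frequency of the product (order times the stimulus frequency) rather than at the stimulus frequency.

```python
import math
from bisect import bisect_left

# HYPOTHETICAL placeholder table of (frequency in Hz, correction in dB).
# Replace with the applicable table from Annex C before any real use.
EXAMPLE_TABLE = [(100.0, 0.0), (1000.0, -2.0), (4000.0, -10.0)]

def correction_db(freq_hz, table):
    """DRP-to-ERP correction at freq_hz, interpolated linearly over log-frequency."""
    freqs = [f for f, _ in table]
    if not freqs[0] <= freq_hz <= freqs[-1]:
        raise ValueError("frequency outside table range")
    i = bisect_left(freqs, freq_hz)
    if freqs[i] == freq_hz:
        return table[i][1]
    (f0, c0), (f1, c1) = table[i - 1], table[i]
    t = math.log(freq_hz / f0) / math.log(f1 / f0)
    return c0 + t * (c1 - c0)

def erp_level_db(drp_level_db, freq_hz, table):
    """Translate a level measured at the DRP to the ERP (assumed additive in dB)."""
    return drp_level_db + correction_db(freq_hz, table)

def harmonic_erp_level_db(drp_level_db, stimulus_hz, order, table):
    """Translate a harmonic product, using the correction at order * stimulus_hz."""
    return erp_level_db(drp_level_db, order * stimulus_hz, table)
```

With the placeholder table, a second harmonic of a 500 Hz stimulus is corrected with the 1000 Hz table entry, not the value interpolated at 500 Hz.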
The fundamental purpose of mouth simulators is to test a microphone under conditions that most
closely approximate actual use by real persons. The mouth simulator shall comply with the
specifications given in ITU-T Recommendation P.58-1996, unless the applicable performance
specification requires or allows an alternative. See Annex B. This mouth is generally installed in a HATS.
ITU-T Recommendation P.58-1996 does not define a sound field behind the lip plane of the mouth
simulator. However, experience has shown that at least one implementation of the mouth has a sound
field distribution which closely approximates the sound field behind the lip plane of a real human head,
up to at least 4 kHz. The investigated region extends from behind the lip plane to the base of the
rubber ear and equal to or greater than 5 mm above the surface of the HATS cheek. This makes HATS
suitable for testing headsets, cordless and cellular phones, handsfree phones, and traditional corded
handsets. The sound field approximation may extend in frequency range as well as to other regions
around HATS, but these have not yet been verified.
The fundamental purpose of test fixtures is to test a device equipped with a handset or headset under
conditions that most closely approximate actual use by real persons.
5.3.1 Selection
The test fixture shall be a HATS, which complies with ITU-T Recommendation P.58-1996. When
using the Type 3.3 ear simulator, the HATS shall also comply with ITU-T Recommendation P.64-
1999, Annex E. When using the Type 3.4 ear simulator, the HATS shall also comply with ITU-T
Recommendation P.64-1999, Annex D.
The LRGP position was specified in previous editions of this standard. Send frequency response
measurements made on ordinary telephones from 300 to 3400 Hz are expected to give practically
identical results, whether obtained with LRGP or the HATS position. Systematic differences of about
For information about an alternative test fixture, and the ear simulators and mouth simulator with
which it can be used, see Annex B.
The handset receiver shall be nominally placed in the HATS position as specified by Annex D or
Annex E of ITU-T Recommendation P.64-1999. To do this, the ear-cap reference point (ECRP) shall
lie on the axis of motion of the positioning device. This axis is defined by a line that passes through the
ERP of the left and right ears. The ECRP may be inside of or outside of ERP depending on the applied
force and the shape of the receiver.
For the purposes of this standard, the ECRP is the intersection of the external ear-cap reference plane
with a normal axis through the effective acoustic center of the sound outlet ports. Generally, the
acoustic center of the sound outlet ports is at the center of their distribution.
For many handsets, the ear-cap reference plane is parallel to the reference plane of the positioning
device.
For some handsets, the above positioning may not apply, and the position that best represents
intended use shall be utilized.
The receiver shall contact the pinna with a force of 6 N (newtons). This is the default force for all
measurements.
In general, it is desirable that receive frequency response should not change too much as application
force changes. For this reason, the device should also be tested at 2 N and 10 N, which represent
minimum and maximum forces likely to be used by real persons on a long-term basis. These results are
for information, but do not have to be included in the test report.
lightweight receiver placed on his or her pinna, with the ear-cap reference plane horizontal.)
A manufacturer may specify a recommended test position (RTP) on either the Type 3.3 or Type 3.4 ear
simulator. The RTP may specify position with respect to ERP, a specific force, or other aspects of the
test position intended to simulate actual use. The force applied shall be within the range of 2–10 N.
If the phone is tested at the RTP, the definition of the RTP, including evidence of its authorization by
the device manufacturer, shall be included in the test report.
If the RTP is used, the device should also be tested at 2 N, 6 N, and 10 N on the same ear simulator.
These results are for information, but do not have to be included in the test report.
For maximum acoustic output measurements, the device shall be tested at either 6 N or the RTP, and
also at 13 N. The final result shall be an "upper envelope" curve consisting of the maximum output of
each measurement at each frequency. See Figure 1 and Figure 2.
Figure 2—Maximum acoustic output, LERP(f ). Upper envelope (heavy line) and two
individual measurements on one handset (light lines)
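The upper envelope described above is a point-by-point maximum. A minimal sketch follows, assuming that all curves are sampled at the same frequencies and are expressed in dB; the function name is illustrative only.

```python
def upper_envelope(*curves):
    """Maximum level at each frequency across curves measured at the same frequencies."""
    if not curves:
        raise ValueError("at least one curve is required")
    n = len(curves[0])
    if any(len(c) != n for c in curves):
        raise ValueError("all curves must have the same number of points")
    return [max(values) for values in zip(*curves)]
```

For example, passing the curve measured at 6 N (or at the RTP) and the curve measured at 13 N returns the envelope curve for the report.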
Except for maximum acoustic output, the same positioning shall be used for all measurements of any
particular device. The positioning method shall be clearly stated in the test report.
Handsets with carbon microphones require conditioning procedures before positioning for
measurement. See Annex D.
Headsets should be tested in a position that most closely approximates real use. Natural headband
pressure, or other positioning techniques normally used by a real person, shall be used for testing.
If the manufacturer specifies a recommended test position (RTP), the headset shall be tested in that
position. The purpose of the RTP shall be to clarify how to position the headset in a way that
corresponds to real use. The RTP shall be defined geometrically with respect to the MRP, center of the
lip plane, ERP, ear entrance point (EEP), and/or the HATS reference point (HRP). See ITU-T
Recommendation P.58-1996. Facial features, such as the corner of the mouth, shall not be used as
reference points. If no RTP is specified, the test position can be determined by observing the actual
design of the headset and by following any guidelines for positioning provided by the manufacturer.
When positioning a headset on a HATS, it is generally possible to approximate real use in an obvious
way. In any case where the headset does not fit on the HATS and its ear simulator quite in the way
intended for real persons, adjustments may be made so the receiver and microphone are as close as
possible to positions corresponding to real use. The body of the receiver, the headband or any other
non-acoustical component may be positioned as necessary.
A minimum of five measurements of frequency response and loudness rating shall be made on each
individual unit tested. The headset shall be completely removed from the ear simulator and re-
mounted for each trial. The mean and standard deviation of the loudness rating and each point of the
frequency response shall be computed for this group of measurement trials.
The accuracy of the final results shall be considered acceptable if the standard deviation of the
loudness rating is 1 dB or less, and if the standard deviation of the frequency response is 2 dB or less
from 200–1000 Hz, and 1 dB or less from 1000–4000 Hz. If the results of the first five trials do not meet
these criteria, report the results, but label them as reduced accuracy.
The reported results shall include the mean loudness rating and standard deviation, the mean frequency
response and standard deviation, a description of the test position, and the number of trials. Additional
measurements may be made in order to meet the mean and standard deviation criteria.
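The trial statistics and acceptance criteria above can be sketched in Python. This is an illustration under stated assumptions: the sample standard deviation (n − 1 denominator) is used, the 1000 Hz point is assigned to the stricter 1 dB band, and points outside 200–4000 Hz are not checked; the standard does not state these choices. The function name, argument names, and sample data are hypothetical.

```python
from statistics import mean, stdev


def summarize_trials(freqs_hz, responses_db, loudness_db,
                     lr_limit=1.0, low_limit=2.0, high_limit=1.0):
    """Summarize repeated headset trials and check the accuracy criteria.

    freqs_hz      frequency points shared by every trial
    responses_db  one list of response levels (dB) per trial
    loudness_db   one loudness rating (dB) per trial
    """
    if len(responses_db) < 5 or len(loudness_db) != len(responses_db):
        raise ValueError("need at least five trials with matching data")
    # Column-wise statistics: one mean and one deviation per frequency.
    fr_mean = [mean(col) for col in zip(*responses_db)]
    fr_std = [stdev(col) for col in zip(*responses_db)]
    lr_std = stdev(loudness_db)

    ok = lr_std <= lr_limit
    for f, s in zip(freqs_hz, fr_std):
        if 200 <= f < 1000:
            ok = ok and s <= low_limit      # 2 dB limit, 200-1000 Hz
        elif 1000 <= f <= 4000:
            ok = ok and s <= high_limit     # 1 dB limit, 1000-4000 Hz
    return {
        "trials": len(responses_db),
        "lr_mean": mean(loudness_db),
        "lr_std": lr_std,
        "fr_mean": fr_mean,
        "fr_std": fr_std,
        "acceptable": ok,  # False means: report, labelled reduced accuracy
    }


# Hypothetical data: five trials at 500 Hz and 2000 Hz.
result = summarize_trials(
    [500, 2000],
    [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 0.5], [4.0, 0.0]],
    [8.0, 8.2, 7.9, 8.1, 8.0],
)
```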
For maximum acoustic output measurements made using the Type 3.3 or Type 3.4 ear simulator,
at least five measurements shall be made. The final result shall be an ‘‘upper envelope’’ curve consisting
of the maximum output of each measurement at each frequency. All curves shall be reported, for a
total of at least five individual measurements plus the upper envelope curve. See the example in Figure 3
and Figure 4.
Figure 4—Maximum acoustic output, LERP(f ). Upper envelope (heavy line) and five
individual measurements on one receiver (light lines)
For headsets with large hard-cap receivers that are similar to receivers in handsets, 5.3.2 may apply.
These positions may fall behind the lip plane of the mouth. See B.2 for more information.
Pressure gradient microphones (cardioid, noise canceling, etc.) are sensitive to both position and
orientation. It is important that the correct orientation be used to obtain results representing actual
use performance.
For microphones that are not measured at RTP or BMP, or are not on a fixed boom, state the exact
geometric test position following the guidelines in 5.3.3.
The sizes and types of measurement microphones are specified in the clauses where their use is
required. All microphones used to implement this standard shall comply with the relevant
specifications in ANSI S1.12-1967.
Electroacoustic measurements should be conducted in a test environment that will not affect the results
beyond the intended influence of the test fixture and measurement transducers. The test environment
should have a low background noise level, and the test fixtures and device under test should be isolated
from reflections and mechanical disturbances that could cause significant error.
Be sure to record the test environmental conditions of temperature, humidity, and barometric
pressure, in addition to the background noise. Overall A-weighted noise level and octave band sound
levels are defined below.
The background noise level in the test environment shall not exceed the limits shown in Table 1. The
overall level shall not exceed 29 dBA. However, these limits may be relaxed if it can be shown that the
accuracy of the measurement is not impaired.
Background noise measurements shall be made using a 12.5 mm pressure microphone with a
microphone system noise level not exceeding 20 dBA. The individual factory-calibrated frequency
response of the microphone, if available, shall be taken into account.
The test environment should be sufficiently free of reflections. There should be no large objects within
1 m of the MRP. Small objects such as tripods that are used for positioning may be acceptable. Errors
due to the influence of reflections shall not exceed 1.5 dB below 800 Hz. Errors above 800 Hz shall
not exceed 1.0 dB.
A uniform diffuse sound field shall exist in a volume of radius 0.15 m. The diffuse field test point
(DFTP) is at the center of this spherical volume. There shall be no obstacles, including the
loudspeakers, within 0.5 m of the DFTP.
The classical method of creating a diffuse field is to construct a reverberation chamber. If one is
available, it is generally the best method. (Construction and verification of a reverberation room is
outside the scope of this standard.)
For the purposes of this standard, a diffuse field may be approximated by using several loudspeakers
and uncorrelated noise sources. Experience has shown that four to eight speakers and uncorrelated
sources in an ordinary room may be sufficient for measurements in 1/3 octaves. However, more may be
required, especially if measurements are to be made in 1/12 octave resolution.
Diffuse field measurement should be made using a 6.25 mm pressure microphone, but may also be
made using a 12.5 mm random pressure microphone. The individual factory-calibrated frequency
response of the microphone, if available, shall be taken into account.
Diffuse field conditions shall be verified by the following two tests, performed in the same resolution
and bandwidth used for measurements. The test is performed with reference to the DFTP, with no
mouth simulator or other objects present.
Calibration of the diffuse field shall be according to 6.7.1 and 6.7.2, except performed at the DFTP.
In order to test a telephone, handset, or headset realistically, it may be useful to test it in environments
similar to those in which it is expected to operate. Such environments can be considered acoustic
impairments, which may cause the telephone to work differently than in a quiet test space. Two such
impairments are described in this clause, but others may also be relevant for some applications.
The reference corner is one physical setup used for echo, howling, and stability tests. The reference
corner consists of three perpendicular plane, smooth, hard surfaces 0.5 m square, as shown in Figure 5.
A handset shall be placed along the diagonal from the apex of the reference corner to the outside
corner, with the earcap end of the handset 250 mm from the apex. A headset is placed on the surface as
if it was put down briefly by a user, with the receiver 250 mm from the apex.
Hoth noise is random acoustic noise which has a spectrum designed to simulate typical ambient room
noise. See Annex E for details. The noise level shall be specified in dBA.
6. Calibration
6.1 General
Analyzers and level meters shall be calibrated to an accuracy of at least 0.5 dB.
The sensitivity of measurement microphones shall be calibrated prior to each use, to an accuracy of
at least 0.5 dB. An acoustical calibrator with an accuracy of at least 0.2 dB shall be used.
The individual factory-calibrated frequency response, if available, shall be taken into account.
The sensitivity of the ear simulator should be calibrated each day the system is in use. For best
accuracy, it should be calibrated prior to measurements on each new device. An acoustical calibrator
with an accuracy of at least 0.2 dB shall be used, along with the factory-supplied adapters for the ear
simulator to be calibrated. Be sure to apply any correction factor required for any particular
combination of calibrator, adapter, and ear simulator.
The calibration procedures shall be performed using the same format as will be used for measurements.
Format examples are 1/N octave bandwidth analysis, constant bandwidth analysis, and R-series
preferred frequencies. Bandwidth shall be the same as or greater than that which will be used for
measurements. Resolution shall be the same as or finer than that which will be used for measurements.
Amplitude accuracy shall be the same as or better than that which will be used for measurements. The
actual format, bandwidth, resolution, and amplitude accuracy shall be stated. See Annex F for
additional details. Also, review F.9, F.11, G.6, and G.7 for a summary of the test signal parameters,
comparison of test methods and signals, and details about measurement bandwidth and resolution.
Electrical test signals should be calibrated each day the system is in use. For best accuracy, they should
be calibrated prior to measurements of each new device. The analyzer or level meters used shall be
calibrated first (6.2).
Following a calibration, the resistive load is removed and the source is connected to RETP without
further adjustment.
A similar calibration is required for testing handset and headset four-wire devices. See 9.2 for the
source impedance requirements.
The electrical test spectrum is measured across a calibrated resistive load. For sinusoidal test signals,
the spectrum shall be flat within 0.5 dB over the actual measurement bandwidth. Equalization may
be used to meet this requirement.
For all other test signals, the electrical spectrum shall meet the target spectrum and spectrum tolerance
for the type of signal used, as defined in Annex F. If no tolerance is specified in the signal definition,
the default tolerance is ±3 dB from 175 to 4500 Hz (or the 1/3 octave bands from 200 to 4000 Hz), and
+3/−5 dB elsewhere. Equalization may be used to meet this requirement.
The standard electrical test level, nominal LRETP, is −16 dBV rms, ±0.5 dB, for analog telephones.
This test level is recommended for measurements at minimum and reference volume control settings,
and −30 dBV is recommended at maximum volume. Total harmonic distortion shall be less than 1%
for these test conditions.
The standard test level for digital telephones, nominal LRETP, is −18.2 dBV, ±0.5 dB, for a 600 Ω
interface. This corresponds to −16 dBm0. This test level is recommended for measurements at
minimum and reference volume control settings. For measurements at maximum volume control
settings, LRETP is −32.2 dBV. This corresponds to −30 dBm0. Total harmonic distortion of the test
signal shall be less than 1% for these test conditions.
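The relation between the dBm0 and dBV figures for the digital interface can be checked with a short Python sketch. It assumes that 0 dBm0 corresponds to 0 dBm (1 mW) at the 600 Ω interface, i.e., a relative level of 0 dBr; that assumption, and the function name `dbm_to_dbv`, are illustrative and not stated in this clause.

```python
import math


def dbm_to_dbv(level_dbm, impedance_ohms=600.0):
    """Convert a power level in dBm to a voltage level in dBV rms."""
    # 0 dBm is 1 mW and V^2 = P * R, so the offset is 10*log10(R * 0.001).
    return level_dbm + 10.0 * math.log10(impedance_ohms * 1e-3)


reference_dbv = dbm_to_dbv(-16.0)   # minimum and reference volume settings
maximum_dbv = dbm_to_dbv(-30.0)     # maximum volume setting
```

For 600 Ω the offset is about −2.2 dB, which reproduces the two dBV levels quoted for the digital interface.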
The standard test level for handsets and headsets tested as four-wire devices is determined by the
procedure for setting the default receive volume control adjustment in 9.3.2.
For sinusoidal test signals, the level shall be held constant at all test frequencies.
For continuous spectrum test signals, the level shall be measured over the entire spectrum. Out-of-
band signals from 40–20 000 Hz shall add no more than 0.5 dB to this level.
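One way to check the out-of-band requirement is to compare the power sum of the in-band components with the power sum of all components. The Python sketch below assumes the spectrum is available as a list of band levels in dB; the function names and sample levels are hypothetical.

```python
import math


def power_sum_db(levels_db):
    """Combine band levels (dB) by summing their powers."""
    return 10.0 * math.log10(sum(10.0 ** (lv / 10.0) for lv in levels_db))


def out_of_band_increase_db(in_band_db, out_of_band_db):
    """Return how much the out-of-band components raise the total level."""
    total = power_sum_db(list(in_band_db) + list(out_of_band_db))
    return total - power_sum_db(in_band_db)


# Hypothetical example: two in-band components and one out-of-band component.
increase = out_of_band_increase_db([-10.0, -10.0], [-20.0])
within_limit = increase <= 0.5
```

In this example the out-of-band component adds 5% to the in-band power, an increase of about 0.2 dB, so the 0.5 dB limit is met.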
The mouth simulator should be calibrated each day the system is in use. For best accuracy, it should be
calibrated prior to measurements on each new device. The measurement microphone used to calibrate
the mouth simulator shall be calibrated first (6.3).
The acoustic test spectrum is measured at the mouth reference point (MRP). For sinusoidal test
signals, the spectrum shall be flat within 0.5 dB over the actual measurement bandwidth.
Equalization may be used to meet this requirement.
For all other test signals, the acoustic spectrum shall meet the target spectrum and spectrum tolerance
for the type of signal used, as defined in Annex F. If no tolerance is specified in the signal definition,
the default tolerance is ±3 dB from 175 to 4500 Hz (or the 1/3 octave bands from 200 to 4000 Hz), and
+3/−5 dB elsewhere. Equalization may be used to meet this requirement.
The standard acoustic test level for send, LMRP, is −4.7 dBPa rms, ±0.5 dB, at the MRP. Total
harmonic distortion of the mouth simulator shall be less than 2% for this test condition.
For sinusoidal test signals, the level at MRP shall be held constant at all test frequencies.
For continuous spectrum test signals, the level shall be measured over the entire spectrum. Out-of-
band signals from 40 to 20 000 Hz shall add no more than 0.5 dB to this level.
A 6.25 mm pressure or free-field microphone shall be used to calibrate the HATS mouth simulator.
The microphone axis shall be oriented 90 degrees to the mouth axis with the center of the protection
grid at the MRP (see Figure 6). (The HATS manufacturer generally supplies a jig for this purpose.)
If a pressure microphone is used, the results may be used directly. The individual factory-calibrated
frequency response of the microphone, if available, shall be taken into account.
If a free-field microphone is used, the free-field correction curve for 90 degrees shall be taken into
account, and the individual factory-calibrated frequency response of the microphone, if available, shall
be taken into account.
To calibrate the mouth, measure GMRP( f ), the spectrum at the MRP. Adjust the mouth equalization
to meet the target spectrum for the signal being used at a total sound pressure of −4.7 dBPa. This
spectrum is used to calculate the send, sidetone and overall frequency responses.
NOTE—In principle, a very small ideal microphone should be used to calibrate a mouth simulator, so that the physical size of the
microphone does not influence the measurement. In practice, a 6.25 mm laboratory measurement microphone with flat frequency
response in a pressure field may be used to calibrate a HATS mouth simulator to the required accuracy. Some manufacturers
recommend a free-field microphone instead, which typically has less sensitivity to mechanical vibration and results in a better
calibration. The free-field microphone can be compensated to give the same frequency response as a pressure microphone by
using free-field correction curves for the angle of sound incidence. The compensation is on the order of 1 dB at 8 kHz.
7.1 General
Procedures are given in the following clauses for measurement of receive, send, sidetone, and overall
performance characteristics of handset and headset telephones. Parameters include frequency
response, noise, input–output linearity, distortion, and mute. In addition, procedures are given for
measuring telephone set impedance, howling, and maximum acoustic output.
The telephone should be connected to the test circuit(s) described in 7.2. Other test circuits may be
used for specific applications. Because telephone set characteristics are affected by loop impedances,
terminations, loop currents, and operating levels, the measurements should be made using test loops
and other conditions representative of those conditions the telephone is expected to encounter in use.
Records should be kept of the measurement conditions.
The measured frequency responses shall be presented as decibels relative to one pascal per volt
[dB (Pa/V)] for receive, decibels relative to one volt per pascal [dB (V/Pa)] for send, decibels relative to
one pascal per pascal [dB (Pa/Pa)] for sidetone and overall, and decibels relative to one volt per volt
[dB (V/V)] for echo. The stimulus level and signal type shall be reported for each test.
The calibration procedures described in Clause 6 shall be carried out before making any
measurements. The acoustical test environment shall meet the specifications given in 5.5.
In general, multiple test signals and stimulus levels should be used to ensure the telephone is
characterized in realistic, stable, and well-defined states. This is especially the case for telephones with
nonlinear processes such as compression or voice activated switching (VOX) circuitry, etc. See Annex F
and Annex G for further information on test signals and analysis methods.
The standard test signal for all telephones consists of artificial voices defined in ITU-T
Recommendation P.50-1999. See F.6.1.1 for details.
Sinusoidal test signals (F.4.1) may be used for testing telephones, handsets, or headsets, if it can be
shown that they do not have adaptive, nonlinear or dynamic signal processing (e.g., compressors,
AGC, voice activity detection, adaptive echo cancellers, etc.). Such evidence shall be given in the test
report if sinusoidal test signals are used.
Other test signals may be used when it can be shown that they produce results consistent with actual
use. They also may be necessary for some specific purposes as discussed in relevant places within this
standard.
The measurements in this clause shall be performed at the standard test levels specified in 6.6.2
and 6.7.2.
The measurement shall be performed using the same format as was used for calibration. Format
examples are 1/N octave bandwidth analysis, constant bandwidth analysis, and R-series preferred
frequencies. Measurement bandwidth shall be the same as or less than that which was used for
calibration. Measurement resolution shall be the same as or coarser than that which was used for
calibration. The actual bandwidth used shall be stated.
In general, the test signals and analysis methods in this standard cover a frequency range from
approximately 100 to 8500 Hz. The exact range depends on the analysis method, and the test signal
(see G.6 and G.7).
Choose the ear simulator, mouth simulator, and test position according to 5.1, 5.2, and 5.3. This
equipment shall be used for all tests described in Clause 7, unless otherwise specified. The ear
simulator, mouth simulator, and test position used shall be stated.
If the telephone is equipped with a tone control, the tone control shall be set to the manufacturer’s
default setting. This is the default tone control adjustment that shall be used for all measurements.
If no default setting is defined by the manufacturer, the tone control shall be set so that the frequency
response is as close as possible to the center of the required frequency response template. The tone
control shall be set before setting the volume control. If the tone and volume controls interact, an
iterative process for setting these controls may be necessary.
All measurements shall be done at the reference receive volume control setting (3.38). A range of
volume control settings may also be used where appropriate, such as minimum and maximum volume.
7.1.6 Reference send gain control setting
All measurements shall be done at the reference send volume control setting (3.39). A range of volume
control settings may be used where appropriate, such as minimum and maximum volume.
A general-purpose DC feed circuit is shown in Figure 7. Since the parameters of the feed circuit affect
transmission performance, they should be recorded as part of the test setup. If available, parameters
should be obtained from the applicable performance specification. If not, the following values should
be used:
C ≥ 50 µF (microfarads)
L ≥ 5 H (henries) (each)
R = 400 Ω (ohms), including resistance of inductors
V = 50 V (volts)
A = ammeter used to measure current drawn by the telephone under test. Alternatively, the current
can be fixed by a current source, regardless of the R value.
In some cases, ground loops may occur when connecting test equipment to RETP or SETP. The
insertion of a high quality 1:1 audio transformer can usually prevent this. If used, this transformer
shall be included during calibration and when determining the loss of the feed circuit.
The send electrical test point (SETP) is for measuring send output signals. It shall be connected to a
900 Ω load. The receive electrical test point (RETP) is for applying receive input signals. It shall be
connected to a 900 Ω source. (Other terminations may be substituted as defined by applicable
performance specifications.)
The loss of the feed circuit used should be measured. The loss should not be greater than 0.1 dB over
the range 100–8500 Hz. The loss from 20 to 100 Hz should not exceed 1 dB. The circuit of Figure 7,
using ideal components, should just meet this specification.
The following procedure may be used to determine the loss of the feed circuit:
The noise level of the feeding bridge should be low enough not to influence measurement results.
Feed circuit for overall measurements using two phones is shown in Figure 8.
Telephone performance can be influenced by various conditions in the network to which a telephone is
connected. The specific impairments described in 7.3.1 through 7.3.6 should be investigated where
applicable. Other impairments, such as ADSL signals from a high-speed modem, may be relevant for
specific situations. The general method is to make a standard measurement as specified in 7.4 through
7.7, but with the impairment introduced.
Loop current may be varied to determine if there are any detrimental effects. This is especially
important if the telephone is powered from the line rather than from a local power supply.
Network noise can affect nonlinear processes within an analog telephone. Network noise shall be
approximated using white noise, with levels measured in dBmp. Noise shall be inserted at the RETP.
Network termination impedance will affect analog telephone transmission performance. A 900
Other terminations may be used for specific applications. For example, a complex termination more
typical of North American loops, may be useful for sidetone measurements.
A wireline analog telephone should be tested with various lengths of cable or simulated cable.
Recommended loop lengths for testing North American telephones are 0, 2.7, and 4.6 km (0, 9, and
15 kft) of 26 AWG nonloaded cable. The recommended loop simulator circuit is shown in Figure 9 and
components for various lengths are shown in Table 3.
For some measurements, particularly sidetone and howling, real cable may give results more
representative of actual performance compared to the loop simulator of Figure 9 and Table 3.
R2, R3: 109, 174, 312
NOTES
1—All values are ±1%.
2—2.7 km and 4.6 km can be made up of cascaded sections of the above (see footnote d).
a 0.305 km = 1 kft
b 0.914 km = 3 kft
c 1.83 km = 6 kft
d 2.7 km = 9 kft; 4.6 km = 15 kft
The telephone should be tested with a parallel telephone set simulator with suitable dc and ac
characteristics. In general, measurements made with the parallel set simulator should be compared to
the same measurements made without the simulator. The minimum recommended measurements are
send and receive frequency response, loudness ratings, and distortion, each measured with standard
loop lengths.
The parallel telephone set simulator circuit shall have the VI curve as shown in Figure 10, ±0.3 V over
the current range of 0–100 mA. The return loss shall be greater than 10 dB with respect to 600 Ω from
200 to 4000 Hz. Component values may be adjusted to meet these tolerances. One possible
implementation of this is shown in Figure 11.
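The return loss requirement can be evaluated from a measured impedance with the usual relation RL = 20 log10(|Z + Z0| / |Z − Z0|). The Python sketch below is an illustration only; the function name `return_loss_db` and the 900 Ω example impedance are hypothetical, and a real check would be repeated at each frequency from 200 to 4000 Hz.

```python
import math


def return_loss_db(z_load, z_ref=600.0):
    """Return loss (dB) of impedance z_load against reference z_ref."""
    z_load = complex(z_load)
    diff = abs(z_load - z_ref)
    if diff == 0.0:
        return math.inf  # perfect match, no reflected signal
    return 20.0 * math.log10(abs(z_load + z_ref) / diff)


# Hypothetical purely resistive 900-ohm impedance against 600 ohms.
rl = return_loss_db(900.0)
meets_requirement = rl > 10.0
```

Complex impedances can be passed directly, for example `return_loss_db(complex(650.0, -120.0))`.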
A cordless telephone should be tested across the range of expected usage. This should include the
minimum and maximum specified distance the telephone is expected to operate between the base unit
and mobile unit.
7.4 Receive
Receive frequency response is the ratio of sound pressure measured in the ear simulator, referred to the
ear reference point (ERP), to the voltage input at the receive electrical test point (RETP), which is
expressed in decibels. The receive frequency response in dB, HR( f ), is given by Equation (1). HR( f )
may be used to calculate the receive loudness rating (RLR) according to ITU-T Recommendation
P.79-1999. See Annex H.
$$H_R(f) = 20 \log_{10} \frac{G_{ERP}(f)}{G_{RETP}(f)} \quad \text{in dB (Pa/V)} \qquad (1)$$
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
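A minimal Python sketch of the direct (non-cross-spectrum) calculation in Equation (1) follows. It assumes that the two spectra are rms magnitudes in linear units (Pa at the ERP, V at the RETP) sampled at the same frequency points; that interpretation, the function name, and the sample values are assumptions for illustration.

```python
import math


def frequency_response_db(output_spectrum, input_spectrum):
    """Return 20*log10(output/input) at each frequency point."""
    if len(output_spectrum) != len(input_spectrum):
        raise ValueError("spectra must share the same frequency points")
    return [20.0 * math.log10(out / inp)
            for out, inp in zip(output_spectrum, input_spectrum)]


# Hypothetical spectra at three frequency points.
g_erp = [0.5, 1.0, 2.0]    # sound pressure at the ERP, Pa
g_retp = [0.1, 0.1, 0.1]   # voltage at the RETP, V
h_r = frequency_response_db(g_erp, g_retp)  # dB (Pa/V)
```

The same function gives a response in dB (V/Pa) when the output is a voltage spectrum and the input is a sound pressure spectrum.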
Receive noise is internally generated audio frequency noise present at the receiver when no stimulus is
applied. The receiver shall be coupled to the ear simulator with the RETP terminated and with no
signal input. The telephone’s microphone should be isolated from sound input and mechanical
disturbances that would cause significant error. Measure the acoustic output signal, referred to the
ERP, from 100 to 8500 Hz, averaging over a minimum period of 5 s. Receive noise should be measured
with the send mute feature both ‘‘on’’ and ‘‘off.’’
The overall receive noise level is measured with A-weighting in dBA. The measurement may be
implemented directly using an A-weighting filter, or by using single-channel FFT with Hann
windowing or real-time spectrum analysis, followed by an A-weighted power summation.
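The A-weighted power summation can be sketched in Python as follows. The sketch assumes that the spectrum is available as unweighted band levels in dB at known centre frequencies, and it uses the standard analytic A-weighting curve normalized to 0 dB at 1 kHz; the function names and the sample value are hypothetical.

```python
import math


def a_weighting_db(f_hz):
    """A-weighting correction in dB at frequency f_hz (analytic curve)."""
    f2 = f_hz * f_hz
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    # The +2.00 dB term normalizes the curve to 0 dB at 1 kHz.
    return 20.0 * math.log10(num / den) + 2.00


def a_weighted_level(freqs_hz, levels_db):
    """Overall A-weighted level from unweighted band levels (power sum)."""
    total = sum(10.0 ** ((lv + a_weighting_db(f)) / 10.0)
                for f, lv in zip(freqs_hz, levels_db))
    return 10.0 * math.log10(total)


# A single hypothetical 30 dB band at 1 kHz stays close to 30 dBA.
overall = a_weighted_level([1000.0], [30.0])
```

For an FFT spectrum, the per-bin levels would additionally need the correction for the noise bandwidth of the window before summation; that step is not shown here.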
Receive narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone relative to the overall weighted noise level. This test measures the weighted noise
level characteristics in narrow bands of not more than 31 Hz from 100 to 8500 Hz. These
levels can then be compared to the receive noise (7.4.2).
The receiver shall be coupled to the ear simulator with the RETP terminated and with no signal input.
Measure the A-weighted receive noise level, referred to the ERP, using a selective voltmeter or
spectrum analyzer with an effective bandwidth of not more than 31 Hz, over the frequency range of
100–8500 Hz. If FFT analysis is used, then a ‘‘flat top’’ windowing shall be employed.
Receive linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the receive frequency response as specified in 7.4.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
type used to measure frequency response.
If artificial voices or another wideband stimulus are used, the test shall be performed at seven levels,
from −46 to −16 dBV, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is −16 dBV. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the ISO R10 frequencies from 200 through
5000 Hz, at seven levels, from −36 to −6 dBV, in 5 dB intervals. (For ISO R10 preferred frequencies,
see ISO 3:1973.) Smaller intervals and/or a wider range of levels may also be used. The reference
stimulus level is −21 dBV.
Receive distortion is measured at ERP using the standard input level of −16.0 dBV. Other input levels
should be tested covering a range from −30 to 0 dBV. Measurements should also be made over a range
of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher input
levels above 0 dBV, verify that distortion of the test system is less than 1% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
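As an illustration of one common THD definition (the rms sum of the harmonic amplitudes relative to the fundamental), a short Python sketch follows. Annex J may define THD or the set of harmonics differently; the function name and the sample amplitudes are hypothetical.

```python
import math


def thd_percent(fundamental_rms, harmonic_rms):
    """THD in percent: rms sum of harmonics relative to the fundamental."""
    if fundamental_rms <= 0.0:
        raise ValueError("fundamental amplitude must be positive")
    harmonics = math.sqrt(sum(h * h for h in harmonic_rms))
    return 100.0 * harmonics / fundamental_rms


# Hypothetical amplitudes: fundamental 1.0, second and third harmonics.
thd = thd_percent(1.0, [0.006, 0.008])
```

With these values the result is 1% THD, which sits exactly at the test-system limit rather than below it.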
Receive mute is sometimes called ‘‘DTMF mute’’ or ‘‘autodial mute.’’ Receive muting is usually
automatic, but may be manually controlled, and would normally be activated by touch-tone dialing,
line ‘‘hold’’ operation, activating the hold button, or other means. Mute leakage is the amount of
signal measured at the ERP when an electrical stimulus is applied to the RETP.
To measure mute leakage, engage the mute, apply the test signal, and measure the receive noise
according to 7.4.2. The test signal shall be the same as that used for receive frequency response (7.4.1),
at 0 dBV. An additional measurement shall be made using the DTMF tones of the telephone set being
tested. In the case of the DTMF tones of the telephone set, there will not be any control over the level.
Each result is expressed in dBA, the weighted noise level which should be compared to muted receive
noise measured according to 7.4.2 (with no stimulus applied).
NOTE—If a sinusoidal stimulus was used to measure receive frequency response in 7.4.1, the same sinusoidal frequency pattern
shall be used for the mute measurement, but only over the range of 200–4000 Hz, at 0 dBV. The absolute level at each frequency
is measured, not the frequency response. A-weighting should be applied to the result, expressed in dBA as a function of
frequency. The weighting permits more relevant comparison with results obtained with artificial voices.
7.5 Send
Send frequency response is the ratio of voltage output at the send electrical test point (SETP) to the
sound pressure at the mouth reference point (MRP), which is expressed in decibels. The send
frequency response in dB, HS( f ), is given by Equation (2). The send frequency response, HS( f ), may
be used to calculate the send loudness rating (SLR) according to ITU-T Recommendation P.79-1999.
See Annex H.
$$H_S(f) = 20 \log_{10} \frac{G_{SETP}(f)}{G_{MRP}(f)} \quad \text{in dB (V/Pa)} \qquad (2)$$
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Send noise is internally generated audio frequency noise present at the SETP. Measure the electrical
output signal at SETP, averaging over a minimum period of 5 s. The telephone’s microphone should
be isolated from sound input and mechanical disturbances that would cause significant error. Send
noise should be measured with the mute feature both ‘‘on’’ and ‘‘off.’’
Send overall noise shall be measured and reported in units of dBmp. It shall also be measured with
A-weighting (defined in ANSI S1.4-1983), reported in units of dBm(A). Measurements in dBmp and
dBm(A) are generally not the same, and they may not be correlated.
Psophometric measurements are made from 100 to 6000 Hz, while A-weighted measurements are made
from 100 to 8500 Hz. These measurements can be made directly using a psophometrically weighted or
A-weighted noise meter with the correct terminating impedance. The measurement may also be
implemented using a single-channel FFT with Hann windowing, or a real-time spectrum analysis,
followed by a weighted power summation.
Send narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone relative to the overall weighted noise level. This test measures the weighted noise
level characteristics in narrow bands of not more than 31 Hz from 100 to 6500 Hz. These
levels can then be compared to the send noise (7.5.2).
The handset or headset should be isolated from sound input and mechanical disturbances that would
cause significant error. Measure the psophometrically weighted noise level at the SETP with a selective
voltmeter or spectrum analyzer with an effective bandwidth of not more than 31 Hz, over the
frequency range of 100–6500 Hz. If FFT analysis is used, then a ‘‘flat top’’ windowing shall be
employed.
The procedure shall be repeated using A-weighting instead of psophometric weighting, and the
frequency range shall be changed to 100–8500 Hz.
Send linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the send frequency response as specified in 7.5.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
type used to measure frequency response.
If artificial voices or another wideband test signal is used, the test shall be performed at seven levels
from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or
a wider range of levels may also be used. The reference stimulus level is −4.7 dBPa. These levels take
into account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies from 200 through 5000 Hz,
at seven levels, from −24.7 to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of
levels may also be used. The reference stimulus level is −9.7 dBPa.
Send distortion is measured at SETP using the standard input level of −4.7 dBPa. Other input levels
should be tested covering a range from −30 to +10 dBPa. Measurements should also be made over a
range of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher
input levels, verify that distortion of the test system is less than 2% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
The mute function is for voice privacy during line hold and mute. Send muting is often manually
controlled, but may be automatically controlled. Mute leakage is the amount of signal measured at the
SETP when an acoustic stimulus is applied to the handset or headset microphone.
To measure mute leakage, engage the mute, apply the test signal, and measure the send noise
according to 7.5.2. The test signal shall be the same as that used for send frequency response (7.5.1), at
+5 dBPa. The result is expressed in dBmp, a weighted noise level that should be compared to muted
broadband noise measured according to 7.5.2 (with no stimulus).
NOTE—If a sinusoidal stimulus was used to measure send frequency response in 7.5.1, the same sinusoidal frequency pattern
shall be used for the mute measurement, but only over the range of 200–4000 Hz, at +5 dBPa. The absolute level at each
frequency is measured, not the frequency response. Psophometric weighting should be applied to the result, expressed in dBmp as
a function of frequency. The weighting permits more relevant comparison with results obtained with artificial voices.
Send frequency response in a diffuse field is a measure of how much of the noise in the room where a
telephone is being used is transmitted to the network. It is the ratio of voltage output at the send
electrical test point (SETP) to the sound pressure at the diffuse field test point (DFTP, see 5.5.3),
which is expressed in decibels. The diffuse field send frequency response in dB, HSD( f ), is given by
Equation (3).
The diffuse field send frequency response may be sensitive to both the level and type of signal used.
This measurement may be performed in 1/3 octave resolution.
During the measurement, the mouth simulator is present but not active, with the MRP located at the
DFTP. The mouth simulator is not present during calibration.
$$H_{\mathrm{SD}}(f) = 20 \log \frac{G_{\mathrm{SETP}}(f)}{G_{\mathrm{DFTP}}(f)} \ \text{in dB (V/Pa)} \tag{3}$$
where
Send signal-to-noise ratio is a measure of the desired speech transmission relative to unwanted noise in
the room where the talker’s phone is used. See Annex K.
7.6 Sidetone
Talker sidetone frequency response is the ratio of the sound pressure measured in the ear simulator,
referred to the ear reference point (ERP), to the sound pressure at the mouth reference point (MRP),
which is expressed in decibels. The talker sidetone frequency response in dB, HTS( f ), is given by
Equation (4). Talker sidetone frequency response may be used to calculate the sidetone masking rating
(STMR) according to ITU-T Recommendation P.79-1999. See Annex H.
The STMR measured on an open-ear HATS is approximately 24 dB. This represents the effective floor
of STMR measurements on actual telephones.
$$H_{\mathrm{TS}}(f) = 20 \log \frac{G_{\mathrm{ERP}}(f)}{G_{\mathrm{MRP}}(f)} \ \text{in dB (Pa/Pa)} \tag{4}$$
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more information.
Listener sidetone is a measure of the signal present at the receiver due to sound in the room where the
telephone is used. The measurement is similar to talker sidetone, except that the stimulus signal is
generated in the entire test room, and not presented from a mouth simulator.
Listener sidetone frequency response is the ratio of the sound pressure measured in the ear simulator,
referred to the ear reference point (ERP), to the sound pressure from a diffused sound field at the
DFTP (5.5.3), which is expressed in decibels. The listener sidetone frequency response in dB, HLS( f ), is
given by Equation (5).
$$H_{\mathrm{LS}}(f) = 20 \log \frac{G_{\mathrm{ERP}}(f)}{G_{\mathrm{DFTP}}(f)} \ \text{in dB (Pa/Pa)} \tag{5}$$
where
The cross-spectrum method is not recommended for listener sidetone frequency response calculation.
This measurement is conducted using a uniform diffuse sound field as specified in 5.5.3. This
measurement may be performed in 1/3 octave resolution.
The level of the test signal should be in the range of 40–65 dBA. The level and spectrum used should be
reported.
For measurement of listener sidetone, the handset or headset is mounted on an appropriate test
fixture. The mouth simulator is present, but not active, with the MRP at the DFTP.
For the alternate method, listener sidetone response HLS( f ) can be approximated by Equation (6). It is
the talker sidetone response HTS( f ) minus the difference in send frequency responses from the
standard near field method and a similar method using a diffuse noise signal.
To use this alternate method, measure the talker sidetone per 7.6.1, measure the send frequency
response per 7.5.1, then measure the send frequency response in a diffuse field per 7.5.7 and apply
Equation (6).
NOTE—This method may not be valid when the send, receive, or sidetone path has nonlinear characteristics.
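The alternate method described above amounts to a per-band subtraction of responses expressed in dB. The sketch below follows the verbal description only: the function name and argument names are hypothetical, and the three inputs are assumed to be lists of dB values on the same frequency bands, taken from 7.6.1, 7.5.1, and 7.5.7 respectively.

```python
def listener_sidetone_estimate(h_ts_db, h_s_db, h_sd_db):
    """Approximate H_LS(f) per band as H_TS(f) - (H_S(f) - H_SD(f)), all in dB.

    h_ts_db: talker sidetone response, dB (Pa/Pa)
    h_s_db:  near-field send response, dB (V/Pa)
    h_sd_db: diffuse-field send response, dB (V/Pa)
    """
    if not (len(h_ts_db) == len(h_s_db) == len(h_sd_db)):
        raise ValueError("all responses must use the same frequency bands")
    return [ts - (s - sd) for ts, s, sd in zip(h_ts_db, h_s_db, h_sd_db)]
```

For example, a band with a talker sidetone response of -10 dB, a near-field send response of -40 dB, and a diffuse-field send response of -55 dB gives an estimated listener sidetone response of -25 dB.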
Sidetone linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the talker sidetone frequency response as specified in 7.6.1 and applying
the procedures described in Annex I. Linearity shall be measured using the same test method and
stimulus type used to measure frequency response.
If artificial voices or another wideband test signal is used, the test shall be performed at seven levels
from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is −4.7 dBPa. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies from 200 through 5000 Hz,
at seven levels, from −24.7 to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of
levels may also be used. The reference stimulus level is −9.7 dBPa.
Sidetone distortion is measured at ERP using the standard input level of −4.7 dBPa. Other input levels
should be tested covering a range from −30 to +10 dBPa. Measurements should also be made over a
range of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher
input levels, verify that distortion of the test system is less than 2% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
Sidetone delay is measured between the mouth simulator and the ear simulator, using one of the
methods described in Annex L.
If round trip sidetone delay is more than 5 ms, sidetone echo response should be measured. See
Annex M.
7.7 Overall
Overall frequency response is measured on two telephones connected as shown in Figure 8. This is a
simulated end-to-end setup requiring two test fixtures acoustically isolated from each other. The test
conditions should generally be the same as those used for send and receive measurements on the same
telephone(s).
Overall frequency response is the ratio of the sound pressure measured in the ear simulator, referred to
the ear reference point (ERP), on the far-end telephone, to the sound pressure at the mouth reference
point (MRP) for the near-end telephone, which is expressed in decibels. The overall frequency
response in dB, HO( f ), is given by Equation (7). It may be used to calculate the overall loudness rating
(OLR) according to ITU-T Recommendation P.79-1999. See Annex H.
$$H_{\mathrm{O}}(f) = 20 \log \frac{G_{\mathrm{ERP}}(f)}{G_{\mathrm{MRP}}(f)} \ \text{in dB (Pa/Pa)} \tag{7}$$
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Overall linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the overall frequency response as specified in 7.7.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
type used to measure frequency response.
If artificial voices or another wideband test signal is used, the test shall be performed at seven levels
from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is −4.7 dBPa. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies from 200 through 5000 Hz,
at seven levels, from −24.7 to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of
levels may also be used. The reference stimulus level is −9.7 dBPa.
Overall distortion is measured in a similar manner to sidetone distortion. However, this measurement
is between two telephone sets connected across a network connection.
These measurements should be made at the standard test level of −16 dBV.
7.8.1 AC impedance
The impedance, measured at the line terminals of the telephone set, should be determined over the
frequency range 100–8500 Hz.
Return loss is defined by Equation (8), and measured with the circuit in Figure 12:
$$RL(f) = 20 \log \frac{V_A}{V_B} \tag{8}$$
where
Echo return loss can be calculated from return loss according to ITU-T G.122-1993, Annex B,
Clause B.4 (trapezoidal rule).
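As a rough illustration of the two calculations above, the sketch below computes return loss from a pair of voltage magnitudes and then averages the reflected power over a logarithmic frequency axis with the trapezoidal rule. The function names are hypothetical, and the frequency points and band limits in the usage example are illustrative assumptions; the normative points and constants are those of ITU-T G.122 and are not reproduced here.

```python
import math

def return_loss_db(v_a, v_b):
    """Return loss in dB from the two voltage magnitudes of Equation (8)."""
    return 20.0 * math.log10(abs(v_a) / abs(v_b))

def echo_return_loss_db(freqs_hz, return_loss_values_db):
    """Average reflected power over log-frequency using the trapezoidal rule,
    then express the average as a loss in dB."""
    total = 0.0
    for i in range(1, len(freqs_hz)):
        # Convert each return loss to a reflected power ratio.
        p0 = 10.0 ** (-return_loss_values_db[i - 1] / 10.0)
        p1 = 10.0 ** (-return_loss_values_db[i] / 10.0)
        width = math.log10(freqs_hz[i]) - math.log10(freqs_hz[i - 1])
        total += 0.5 * (p0 + p1) * width
    span = math.log10(freqs_hz[-1]) - math.log10(freqs_hz[0])
    return -10.0 * math.log10(total / span)

# Illustrative data: a flat 20 dB return loss gives a 20 dB echo return loss.
erl = echo_return_loss_db([300.0, 1000.0, 3400.0], [20.0, 20.0, 20.0])
```

Normalizing by the total logarithmic span keeps the sketch independent of the band limits; a fixed-band formulation folds that span into a constant instead.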
7.9 Howling
Telephones can experience instability such as howling, acoustic feedback, or oscillation, when
subjected to various loop circuits, receive volume control settings, and physical positioning of the
handset or headset. Instability can be evaluated using the feed circuit described in Clause 7. The
instability should be checked over a range of loop lengths. The position of the handset or headset can
have a major effect on the acoustic stability of the telephone, as nearby acoustic reflecting surfaces can
add to the feedback of the receiver into the microphone.
For each test setup, the telephone shall be evaluated at the reference receive volume control and
reference send gain control settings, at the lowest volume setting, and at the
highest volume setting. Instability can be perceived as an audible howling or whistling from the
telephone receiver, or repetitive fluctuation of the telephone set line current.
In each test loop or receive volume control setting, the handset or headset should be placed in a
minimum of four physical positions: Face up, face down and lying sideways on a hard flat surface, and
in the reference corner shown in Figure 5 of 5.6.1.
7.10 Maximum acoustic output
The testing methods provided in this clause only cover the application of in-band signals, but the same
sound pressure limits may apply if ringing signals appear in the handset or headset receiver while the
telephone set is off-hook. See Annex N for a discussion of maximum pressure limits.
Maximum acoustic output measurements shall be made on the same ear simulator and with the same
positioning and force as used for receive frequency response measurements. For handsets measured on
HATS, an additional measurement with a force of 13 N is required. See 5.3.2 for handsets, and 5.3.3
for headsets. Telephone sets with adjustable receive volume controls shall be adjusted to the maximum
setting.
Acoustic output can be referenced to the ERP, DRP, free field (0 degrees elevation and azimuth), or
to a diffuse field, as required by the appropriate safety standard. This may require measurements
made at one reference point be translated to the required reference point. A filter may be required.
See Annex C.
The maximum acoustic pressure is the maximum steady-state sound pressure emitted from a receiver.
The stimulus for this test is a slow logarithmic sine sweep applied at RETP from 100 to 8500 Hz. The
measurement shall be made with real-time filter analysis (RTA) in 1/12 octave bands, described in G.3.
The detector shall be set to rms fast, which is a 250 ms effective averaging time (equivalent to a 125 ms
time constant). The detector shall be set to hold the maximum level achieved in each band during the
entire sweep.
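The maximum-hold behavior of the detector can be pictured as a per-band running maximum over the successive analyzer readings taken during the sweep. The sketch below assumes the readings are already rms fast levels in dB for each 1/12 octave band; the function name and data layout are hypothetical.

```python
def max_hold(frames_db):
    """Per-band maximum over a sequence of frames.

    frames_db: list of frames, each a list of band levels in dB,
               with the same band order in every frame.
    """
    held = list(frames_db[0])
    for frame in frames_db[1:]:
        held = [max(h, x) for h, x in zip(held, frame)]
    return held
```

Each band keeps the highest level seen at any time during the sweep, regardless of which frame it occurred in.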
The sweep time shall be at least 90 s. A sweep time should be selected that provides consistent results
with no underestimation. That is, the result should be within 0.5 dB at all frequencies for a test
period 30 s.
Additional consideration should be given to the acoustic pressure caused by tones, other audio signals,
or long duration, high amplitude electrical signals applied to power, network, or auxiliary leads of the
telephone.
The peak acoustic pressure is the maximum unweighted peak sound pressure emitted from a telephone
receiver. The stimulus for this test is a surge applied at RETP. The measurement shall be made at
the ear simulator with an unweighted ‘‘peak hold’’ level detector with a rise time equal to, or less than,
50 μs.
The 10/700 μs surge generator specified in 6.2 of IEC 61000-4-5-2001 shall be used. The open circuit
voltage shall be 1000 V, and the short circuit current shall be 25 A.
Measure the peak pressure in the ear simulator while operating the surge generator. An oscilloscope or
a sound level meter, having an unweighted ‘‘peak hold’’ setting is used to make the measurement.
Reverse the telephone set connections and repeat.
8.1 General
The test procedures for digital telephone sets generally follow those for analog telephone sets when
using a reference codec as the digital interface. The procedures in this clause assume a telephone set
equipped with a handset or headset.
Procedures are given in the following clauses for measurement of parameters affecting the receive,
send, sidetone, and overall performance characteristics of digital telephone sets. These parameters
include frequency response, noise, linearity, distortion, delay, and out-of-band signals. In addition,
procedures are given for measuring echo, stability loss, convergence time, discontinuous speech
transmission, and maximum acoustic output.
The telephone should be connected to the test circuit(s) described in 8.2. Other test circuits may be
used for specific applications. Records should be kept of the measurement setup and conditions.
The measured frequency responses shall be presented as decibels relative to one pascal per volt [dB
(Pa/V)] for receive, decibels relative to one volt per pascal [dB (V/Pa)] for send, decibels relative to one
pascal per pascal [dB (Pa/Pa)] for sidetone and overall, and decibels relative to one volt per volt [dB
(V/V)] for echo. The stimulus level and signal type shall be reported for each test.
The calibration procedures described in Clause 6 shall be carried out before making any
measurements. The acoustical test environment shall meet the specifications given in 5.5.
In general, multiple test signals and stimulus levels should be used to ensure the telephone is
characterized in realistic, stable, and well-defined states. This is especially the case for telephones with
nonlinear processes such as compression or voice activated switching (VOX) circuitry, etc. See Annex F
and Annex G for further information on test signals and analysis methods.
The standard test signal for all telephones consists of artificial voices defined in ITU-T
Recommendation P.50-1999. See F.6.1.1 for detail.
Sinusoidal test signals (F.4.1) may be used for testing telephones, handsets, or headsets if it can be
shown that they do not have adaptive, nonlinear, or dynamic signal processing (e.g., compressors,
AGC, voice activity detection, adaptive echo cancellers, etc.). Such evidence shall be given in the test
report if sinusoidal test signals are used.
Other test signals may be used when it can be shown that they produce results consistent with actual
use. They also may be necessary for some specific purposes as discussed in relevant places within this
standard.
The measurements in this clause shall be performed at the standard test levels specified in 6.6.2 and 6.7.2.
The measurement shall be performed using the same format as was used for calibration. Format
examples are 1/N octave bandwidth analysis, constant bandwidth analysis, and R-series preferred
frequencies. Measurement bandwidth shall be the same as or less than that which was used for
calibration. Measurement resolution shall be the same as or coarser than that which was used for
calibration. The actual bandwidth used shall be stated.
In general, the test signals and analysis methods in this standard cover a frequency range from
approximately 100 to 8500 Hz. The exact range depends on the codec, analysis method, and the test
signal (see G.6 and G.7).
Choose the ear simulator, mouth simulator, and test position according to 5.1, 5.2, and 5.3. This
equipment shall be used for all tests described in Clause 8, unless otherwise specified. The ear
simulator, mouth simulator, and test position used shall be stated.
For wideband applications, the Type 1 ear simulator shall not be used, since it is intended for use only
to 4000 Hz.
If the telephone is equipped with a tone control, the tone control shall be set to the manufacturer’s
default setting. This is the default tone control adjustment that shall be used for all measurements.
If no default setting is defined by the manufacturer, the tone control shall be set so that the frequency
response is as close as possible to the center of the required frequency response template. The tone
control shall be set before setting the volume control. If the tone and volume controls interact, an
iterative process for setting these controls may be necessary.
All measurements shall be done at the reference receive volume control setting (3.38). A range of
volume control settings may also be used where appropriate, such as minimum and maximum volume.
All measurements shall be done at the reference send volume control setting (3.39). A range of volume
control settings may be used where appropriate, such as minimum and maximum volume.
If analog test equipment is used, the digital telephone under test shall be connected to the reference
codec through an interface as shown in Figure 13. The interface shall provide all the signaling and
supervisory sequences necessary for the telephone set to work in all test modes. The interface shall also
be capable of converting a digital stream to or from the telephone set under test to a format
compatible with the reference codec in 8.2.2 or 8.2.3.
The send electrical test point (SETP) is for measuring send output signals. It shall be connected to a
600 Ω load. The receive electrical test point (RETP) is for applying receive input signals. It shall be
connected to a 600 Ω source.
If digital test equipment is used, the digital telephone under test shall be connected using a direct
digital interface as shown in Figure 14. In this case, a reference codec is not required, as the
measurements are done in the digital domain. The SETP and RETP would then be located at the
digital translation interface. Digital signals shall be referenced to the analog equivalent as defined
in 8.2.2 or 8.2.3.
For wireless telephones, the interface is the same, except that a radio link is also included in the
interface.
The interfacing for overall response consists of two telephone sets connected back-to-back through the
appropriate digital telephone interface, with or without the ideal codec as necessary (Figure 15).
8.2.2.1 General
A reference codec is used for testing a digital telephone with analog test equipment. The standard
for encoding voice frequency signals in North America is the μ-law, which is defined in ITU-T
Recommendation G.711-1988. The codec defined in this clause is based on that standard. For other
coding schemes, an appropriate codec should be used.
For the digital-to-analog (D/A) converter, a digital test sequence (DTS) representing the pulse-code
modulation (PCM) equivalent of an analog sinusoidal signal whose rms value is 3.17 dB below the
maximum full load capacity of the codec shall generate 0 dBm in a 600 Ω load.
A 0 dBm signal is not the maximum digital code. For μ-law codecs 0 dBm is 3.17 dB below digital full
scale. For A-law codecs 0 dBm is 3.14 dB below digital full scale.
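The level relationships above reduce to a few lines of arithmetic. The sketch below converts a level in dBm across a 600 Ω termination to rms volts and dBV, and gives the peak amplitude of a sine, as a fraction of the full-load sine peak, when its rms level sits a given number of dB below full load. The function names are hypothetical and chosen for illustration only.

```python
import math

def dbm_to_vrms(level_dbm, impedance_ohm=600.0):
    """RMS voltage across the termination for a power level in dBm."""
    power_w = 1e-3 * 10.0 ** (level_dbm / 10.0)
    return math.sqrt(power_w * impedance_ohm)

def dbm_to_dbv(level_dbm, impedance_ohm=600.0):
    """Level in dBV (re 1 V rms) for a power level in dBm."""
    return 20.0 * math.log10(dbm_to_vrms(level_dbm, impedance_ohm))

def sine_peak_fraction(headroom_db):
    """Peak of a sine, relative to the full-load sine peak, when its rms
    level is headroom_db below full load."""
    return 10.0 ** (-headroom_db / 20.0)
```

With these definitions, 0 dBm into 600 Ω is about 0.775 V rms, or about -2.2 dBV, and a 0 dBm sine with 3.17 dB of headroom peaks at roughly 69% of digital full scale.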
The idle channel noise should be less than −84 dBmp when receiving one of the quiet codes or when
the A/D digital output is connected to the D/A digital input. The quantizing distortion of the
reference codec should approach theoretical limits specified in Annex A of ITU-T Recommendation
O.133-1993. The intrinsic error of μ-law PCM encoding limits the signal-to-distortion ratio to
about 38 dB.
8.2.3.1 General
There are a number of wideband codecs being used, including ITU-T Recommendation G.722-1988,
and low bit rate vocoders, such as ITU-T Recommendation G.722.1-1988, ITU-T Recommendation
G.723.1-1996, and ITU-T Recommendation G.729-1996. However, the codec defined in this
clause is based on 16 bit, 16 kHz linear PCM coding or 256 kbit/s. For other coding schemes, an
appropriate codec should be used.
For the digital-to-analog (D/A) converter, a digital test sequence (DTS) representing the pulse-code
modulation (PCM) equivalent of an analog sinusoidal signal whose rms value is 3.17 dB below the
maximum full load capacity of the codec shall generate 0 dBm in a 600 Ω load. This is the same as
prescribed for ITU-T Recommendation G.711-1988.
The nominal 3 dB bandwidth shall be 50–7000 Hz with anti-aliasing filter ripple less than 0.5 dB.
The idle channel noise should be less than −89 dBm unweighted across this same bandwidth when
receiving the quiet code or when the A/D digital output is connected to the D/A digital input.
The most common digital impairments that the phone might encounter include delay, bit errors, frame
or packet loss, and network echo cancellers. Many commercial units are available to induce
these impairments, and they are usually specific to the type of digital transmission system being tested.
Network impairments can vary between types of voice networks. With the introduction of packet
voice transmission, such as Voice over IP (VoIP), new types of impairments have been introduced.
Impairments in ISDN and similar systems are typically limited to speech compression transcoding
(A-law to μ-law conversions, etc.), speech path compression (ITU-T Recommendation G.726-1990
ADPCM compression), and delay.
Some tests are sensitive to these impairments, and it is important to understand the performance of the
telephone and the suitability of the test method in the presence of these impairments. For each
impairment, the affected tests are described.
Network delay, or latency, is the most important impairment for packet voice network devices. If the
device features nonlinear processes (echo cancellation, or voice activity detection) to enhance voice
quality, these processes can be sensitive to network delay. In this case, echo canceller performance (see
8.11 and 8.12) should be checked with simulated network delay of 50, 150, and 300 ms one way to
ensure that performance is not degraded.
Delay in the device or test setup may affect some tests and the methodology used. It is important to
understand how much delay can be tolerated by each particular test performed. Delay should be
measured, and the result used to offset the source and measurement signals where temporal correlation
of these is important to the test. The cross-spectrum method for frequency response is one example
where system delay shall be taken into account.
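One way to measure a stable delay and apply it as an offset is a simple cross-correlation search, sketched below. This is an illustration under stated assumptions, not the method of any annex: the function names are hypothetical, the signals are plain lists of samples at the same sample rate, the stimulus is assumed to be broadband enough to give a clear correlation peak, and the delay is assumed to be constant and no larger than `max_lag` samples.

```python
def estimate_delay_samples(reference, measured, max_lag):
    """Lag, in samples, at which the measured signal best matches the reference."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(measured) - lag)
        if n <= 0:
            break
        # Unnormalized cross-correlation at this lag.
        score = sum(reference[i] * measured[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def align(measured, delay_samples):
    """Drop the leading samples so the measured signal lines up with the reference."""
    return measured[delay_samples:]
```

Dividing the returned lag by the sample rate gives the delay in seconds; the aligned signal can then be passed to an analysis that depends on temporal correlation between source and measurement.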
8.3.2 Jitter
Jitter is a variation in network or device delay due to the late or early arrival of packets in packet based
systems. Jitter can cause problems with some of the test methods. If possible, jitter should be removed
during testing. This can be done by increasing the size of the jitter buffer, resulting in a longer, but
stable delay. This stable delay can then be measured, and the result used to offset the source and
measurement signals where temporal correlation of these is important to the test.
If the jitter cannot be sufficiently controlled, then all tests shall be performed with caution. In this case,
the cross-spectrum method for frequency response shall not be used.
Packet networks can suffer from congestion, causing jitter, buffer underruns/overruns, or packets
arriving out of order. This typically results in lost packets. Some devices may feature packet loss
protection algorithms. It is beyond the scope of this standard to detail a test method for packet loss
protection performance. It is recommended that a perceptually based test of packet loss protection be
used. A suitable example is PESQ (ITU-T Recommendation P.862-2001) with 1%, 5%, and 10%
packet loss, normally distributed with respect to time. Many networks may experience bursty packet
loss; however, it is outside the scope of this standard to define a bursty distribution.
All other measurements in Clause 8 shall be performed with packet loss set to zero.
Network echo cancellers are typically deployed when network delay exceeds 25 ms, one way, and may
also be present in the test interface. Echo cancellers can affect nonlinear speech path quality enhancing
processes due to filters and echo suppression algorithms.
When a network echo canceller is inserted into the system under test, it is recommended that the send,
receive, and overall frequency responses and respective loudness rating, as well as linearity, distortion,
and noise be tested with network echo cancellers both enabled and disabled.
All other measurements should operate transparently in the presence of a network echo canceller and
should not need to be investigated.
transmission direction. The system will then disable the speech path, allowing additional bandwidth
for other network traffic. DTX may cause noise pumping, and both front end speech clipping and
trailing speech clipping.
If the phone and test system support DTX, frequency responses, loudness ratings, linearity, and
distortion should be measured with the feature both enabled and disabled. Speech like test signals shall
be used for frequency response and loudness rating measurements since the DTX algorithm may
interpret steady-state test signals as noise.
8.4 Receive
Receive frequency response is the ratio of sound pressure measured in the ear simulator, referred to the
ear reference point (ERP), to the voltage input at the receive electrical test point (RETP), which is
expressed in decibels. The receive frequency response in dB, HR( f ), is given by Equation (9). The
receive response, HR( f ), may be used to calculate the receive loudness rating (RLR), according to
ITU-T Recommendation P.79-1999. See Annex H.
$$H_{\mathrm{R}}(f) = 20 \log \frac{G_{\mathrm{ERP}}(f)}{G_{\mathrm{RETP}}(f)} \ \text{in dB (Pa/V)} \tag{9}$$
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Receive noise is internally generated audio frequency noise present at the receiver when no stimulus is
applied. Connect the telephone set to the reference codec, and transmit idle code or silence to RETP.
The receiver shall be coupled to the ear simulator. The telephone’s microphone should be isolated
from sound input and mechanical disturbances that would cause significant error. Measure the
acoustic output signal, referred to the ERP, from 100 to 8500 Hz, averaging over a minimum period of
5 s. Receive noise should be measured with the telephone mute feature both ‘‘on’’ and ‘‘off.’’
The receive noise level is measured with A-weighting in dBA. The measurement may be implemented
directly using an A-weighting filter, or by using single-channel FFT with Hann windowing or real-time
spectrum analysis, followed by an A-weighted power summation.
Narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone depending on its level relative to the overall weighted noise level. This test measures
the weighted noise level characteristics in narrow bands of not more than 31 Hz, from 100 to 8500 Hz.
These levels can then be compared to the receive noise level (8.4.2).
The receiver shall be coupled to the ear simulator with idle code or silence at RETP. Measure the
A-weighted receive noise level, referred to the ERP, using a selective voltmeter or spectrum analyzer
with an effective bandwidth of not more than 31 Hz, over the frequency range of 100–8500 Hz. If FFT
analysis is used, then ‘‘flat top’’ windowing shall be employed.
Receive linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the receive frequency response as specified in 8.4.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
used to measure frequency response, except that the analysis bandwidth is different. For the
narrowband codec, the analysis bandwidth is 100–3400 Hz. For the wideband codec, the analysis
bandwidth is typically 100–6800 Hz. If artificial voices or another wideband stimulus are used, the test
shall be performed at seven levels, from −48.2 to −18.2 dBV, in 5 dB intervals, measured in 1/3
octave bands. Smaller intervals and/or a wider range of levels may also be used. The reference stimulus
level is −18.2 dBV. These levels take into account the high crest factor of artificial voices, which
approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies, at seven levels, from −38.2
to −8.2 dBV, in 5 dB intervals. Smaller intervals and/or a wider range of levels may also be used. The
reference stimulus level is −23.2 dBV.
measured using narrowband pseudo-random noise as the stimulus. See J.3 for details of the method.
For the narrowband codec, the stimulus bandwidth is 100–3400 Hz. For the wideband codec, the
stimulus bandwidth is typically 100–6800 Hz. In all cases the analysis bandwidth is 100–8500 Hz.
Receive distortion is measured at ERP using the standard input level of −18.2 dBV. Other input levels
should be tested covering a range from −30 to 0 dBV. Measurements also should be made over a range
of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher input
levels, verify that distortion of the test system is less than 1% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods
are likely to give different results.
Receive mute is sometimes called ‘‘DTMF mute’’ or ‘‘autodial mute.’’ Receive muting is usually
automatic, but may be manually controlled, and would normally be activated by touch-tone dialing,
line ‘‘hold’’ operation, activating the hold button, or other means. Mute leakage is the amount of
signal measured at the ERP when an electrical stimulus is applied to the RETP.
To measure mute leakage, engage the mute, apply the test signal, and measure the receive noise
according to 8.4.2. The test signal shall be the same as that used for receive frequency response (8.4.1),
at 0 dBV. An additional measurement shall be made using the DTMF tones of the telephone set being
tested. In the case of the DTMF tones of the telephone set, the level cannot be controlled.
Each result is expressed in dBA, a weighted noise level which should be compared to muted receive
noise measured according to 8.4.2 (with no stimulus applied).
NOTE—If a sinusoidal stimulus was used to measure receive frequency response in 8.4.1, the same sinusoidal frequency pattern
shall be used for the mute measurement, but only over the range of 200–4000 Hz, at 0 dBV. The absolute level at each frequency
is measured, not the frequency response. A-weighting should be applied to the result, expressed in dBA as a function of
frequency. The weighting permits more relevant comparison with results obtained with artificial voices.
Delay is an important factor for digital telephones and network edge devices. It is a measure of the
time taken for an excitation signal to traverse a given speech path for the device. Some devices may
have delay in excess of 50 ms, as well as a variable delay or jitter.
Receive delay is measured between RETP and the ear simulator.
Receive out-of-band signals are signals that appear outside the specified frequency range for any input
that is inside the specified frequency range. This test is designed to ensure that speech processing,
coding, or compression is properly implemented.
Apply a sine-wave signal at RETP at a level of −18.2 dBV, in the frequency range 300–3400 Hz.
Measure the signal level at the ear simulator of any spurious tones that may appear between 4.6 kHz
and 8.0 kHz. No weighting is applied to the result.
For wideband applications, apply a sine-wave signal at RETP in the frequency range of 150 Hz to 6.7 kHz.
At the ear simulator measure the level of any spurious tones that may appear from 7.2 kHz to
approximately 8.5 kHz (see F.8).
The out-of-band signals shall be compared to the 1 kHz signal level at the ear simulator.
8.5 Send
Send frequency response is the ratio of voltage output at the send electrical test point (SETP) to the
sound pressure at the mouth reference point (MRP), which is expressed in decibels. The send
frequency response in dB, HS( f ), is given by Equation (10). The send frequency response, HS( f ), may
be used to calculate the send loudness rating (SLR) according to ITU-T Recommendation P.79-1999.
See Annex H.
\[ H_S(f) = 20 \log_{10} \left[ \frac{G_{SETP}(f)}{G_{MRP}(f)} \right] \ \text{in dB (V/Pa)} \qquad (10) \]
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
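As an illustration of Equation (10), the following minimal sketch computes the send frequency response band by band from two measured spectra. The function name and the sample values are hypothetical; it assumes both spectra are rms magnitudes (volts at SETP, pascals at MRP) sampled at the same frequencies.

```python
import math

def frequency_response_db(g_out, g_in):
    """Return 20*log10(g_out/g_in) for each band.

    g_out: rms spectrum at the output test point (e.g. volts at SETP).
    g_in:  rms spectrum at the input reference point (e.g. pascals at MRP).
    Both lists are assumed to share the same frequency grid.
    """
    if len(g_out) != len(g_in):
        raise ValueError("spectra must have the same number of bands")
    return [20.0 * math.log10(o / i) for o, i in zip(g_out, g_in)]

# Hypothetical example values, not measured data.
g_setp = [0.01, 0.02, 0.01]   # V
g_mrp = [1.0, 1.0, 0.5]       # Pa
h_s = frequency_response_db(g_setp, g_mrp)  # dB (V/Pa)
```

The same ratio form applies to the other frequency responses in this clause; only the two test points and the units change.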
Send noise is internally generated audio frequency noise present at the SETP. Connect the telephone
set to the reference codec and place it in an active state with no acoustic input. Measure the electrical
output signal at SETP, averaging over a minimum period of 5 s. The telephone microphone should be
isolated from sound input and mechanical disturbances that would cause significant error. Send noise
should be measured with the mute feature both ‘‘on’’ and ‘‘off.’’
Psophometric measurements are made from 100 to 3400 Hz for narrowband codecs, while A-weighted
measurements are made from 100 to 6800 Hz for wideband codecs. These measurements can be made
directly using a psophometrically weighted or A-weighted noise meter with the correct terminating
impedance. The measurement may also be implemented using a single-channel FFT with Hann
windowing, or a real-time spectrum analysis, followed by a weighted power summation.
For narrowband codecs, send overall noise shall be measured and reported in units of dBmp. For
wideband codecs, send overall noise shall be measured with A-weighting (defined in ANSI S1.4-1983),
reported in units of dBm(A). Measurements in dBmp and dBm(A) are generally not the same, and they
may not be correlated.
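The weighted power summation mentioned above can be sketched as follows for the A-weighted (wideband) case. This is an illustration only: it assumes the spectrum is available as per-band power levels in dBm, and it uses the closed-form A-weighting expression commonly quoted for IEC 61672-1 as a stand-in for the ANSI S1.4 curve. The function names are hypothetical.

```python
import math

def a_weighting_db(f_hz):
    """A-weighting in dB at frequency f_hz (closed-form analog response)."""
    f2 = f_hz * f_hz
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00

def weighted_total_db(freqs_hz, levels_db, f_lo=100.0, f_hi=6800.0):
    """Sum per-band power levels on a power basis after A-weighting.

    Bands outside [f_lo, f_hi] are ignored; the default limits are the
    wideband measurement range given in the text.
    """
    total = 0.0
    for f, level in zip(freqs_hz, levels_db):
        if f_lo <= f <= f_hi:
            total += 10.0 ** ((level + a_weighting_db(f)) / 10.0)
    return 10.0 * math.log10(total)
```

A psophometric summation for narrowband codecs would follow the same pattern with the psophometric weighting table in place of `a_weighting_db` and limits of 100–3400 Hz.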
Narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone depending on its level relative to the overall weighted noise level. This test measures
the weighted noise level characteristics in narrow bands of not more than 31 Hz. These levels can then
be compared to the send noise (8.5.2). The handset or headset should be isolated from sound input and
mechanical disturbances that would cause significant error. Measure the psophometrically weighted
noise level at the SETP, using a selective voltmeter or spectrum analyzer, with an effective bandwidth
of not more than 31 Hz, over the frequency range of 100–3400 Hz for narrowband codecs. For
wideband codecs, use A-weighting instead of psophometric weighting, over the frequency range of
100–6800 Hz.
Send linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the send frequency response as specified in 8.5.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
used to measure frequency response, except that the analysis bandwidth is different. For the narrow-
band codec, the analysis bandwidth is 100–3400 Hz. For the wideband codec, the analysis bandwidth is
typically 100–6800 Hz.
If artificial voices or another wideband test signal is used, the test shall be performed at
seven levels from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands.
Smaller intervals and/or a wider range of levels may also be used. The reference stimulus level
is −4.7 dBPa. These levels take into account the high crest factor of artificial voices, which approaches
23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies, at seven levels, from
−24.7 to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of levels may also be
used. The reference stimulus level is −9.7 dBPa.
measured using narrowband pseudo-random noise as the stimulus. See J.3 for details of the
method. For the narrowband codec, the stimulus bandwidth is 100–3400 Hz. For the wideband
codec, the stimulus bandwidth is typically 100–6800 Hz. In all cases the analysis bandwidth is
100–8500 Hz.
Send distortion is measured at SETP using the standard input level of −4.7 dBPa. Other input levels
should be tested covering a range from −30 to +10 dBPa. Measurements should also be made over a
range of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher
input levels, verify that distortion of the test system is less than 2% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
The mute function is for voice privacy during line hold and mute. Send muting is often manually
controlled, but may be automatically controlled. Mute leakage is the amount of signal measured at the
SETP when an acoustic stimulus is applied to the handset or headset microphone.
To measure mute leakage, engage the mute, apply the test signal, and measure the send noise
according to 8.5.2. The test signal shall be the same as that used for send frequency response (8.5.1),
at +5 dBPa. The result is expressed in dBmp, a weighted noise level that should be compared to muted
broadband noise measured according to 8.5.2 (with no stimulus).
NOTE—If a sinusoidal stimulus was used to measure send frequency response in 8.5.1, the same sinusoidal frequency pattern
shall be used for the mute measurement, but only over the range of 200–4000 Hz, at +5 dBPa. The absolute level at each
frequency is measured, not the frequency response. Psophometric weighting should be applied to the result, expressed in dBmp as
a function of frequency. The weighting permits more relevant comparison with results obtained with artificial voices.
Delay is an important factor for digital telephones and network edge devices. It is a measure of the
time taken for an excitation signal to traverse a given speech path for the device. Some devices may
have delay in excess of 50 ms, as well as a variable delay or jitter.
Send delay is measured between the MRP and SETP. The electroacoustic delay between the electrical
input to the MRP and the microphone of the device should be considered unimportant.
Send out-of-band susceptibility is a measure of signals that appear inside the specified frequency range
for any input that is outside the specified frequency range. This test is designed to ensure that speech
processing, coding, or compression, is properly implemented.
Apply a sine-wave signal at the MRP at a level of −4.7 dBPa, in the frequency range 4.5–8.5 kHz.
Measure the signal level at the SETP of any spurious tones that may appear between 300 and 3400 Hz.
No weighting is applied to the result.
For wideband applications, apply a sine-wave signal at the MRP in the frequency range of 7.1–8.5 kHz.
At the SETP measure the level of any spurious tones that may appear from 100 Hz to 6.8 kHz.
Send frequency response in a diffuse field is a measure of how much of the noise in the room where a
telephone is being used is transmitted to the network. It is the ratio of voltage output at the send
electrical test point (SETP) to the sound pressure at the diffuse field test point (DFTP, see 5.5.3),
which is expressed in decibels. The diffuse field send frequency response in dB, HSD( f ), is given by
Equation (11).
The diffuse field send frequency response may be sensitive to both the level and type of signal used.
This measurement may be performed in 1/3 octave resolution.
During the measurement, the mouth simulator is present but not active, with the MRP located at the
DFTP. The mouth simulator is not present during calibration.
\[ H_{SD}(f) = 20 \log_{10} \left[ \frac{G_{SETP}(f)}{G_{DFTP}(f)} \right] \ \text{in dB (V/Pa)} \qquad (11) \]
where
Send signal-to-noise ratio is a measure of the desired speech transmission relative to unwanted noise in
the room where the talker’s phone is used. See Annex K.
8.6 Sidetone
Talker sidetone frequency response is the ratio of the sound pressure measured in the ear simulator,
referred to the ear reference point (ERP), to the sound pressure at the mouth reference point (MRP),
which is expressed in decibels. The talker sidetone frequency response in dB, HTS( f ), is given by
Equation (12). Talker sidetone frequency response may be used to calculate the sidetone masking
rating (STMR) according to ITU-T Recommendation P.79-1999. See Annex H.
The STMR measured on an open-ear HATS is approximately 24 dB. This represents the effective floor
of STMR measurements on actual telephones.
\[ H_{TS}(f) = 20 \log_{10} \left[ \frac{G_{ERP}(f)}{G_{MRP}(f)} \right] \ \text{in dB (Pa/Pa)} \qquad (12) \]
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Listener sidetone is a measure of the signal present at the receiver due to sound in the room in which
the receiver is used. The measurement is similar to talker sidetone, except that the stimulus signal is
generated in the entire test room, and not presented from a mouth simulator.
Listener sidetone frequency response is the ratio of the sound pressure measured in the ear simulator,
referred to the ear reference point (ERP), to the sound pressure from a diffused sound field at the
DFTP (5.5.3), which is expressed in decibels. The listener sidetone frequency response in dB, HLS( f ),
is given by Equation (13).
\[ H_{LS}(f) = 20 \log_{10} \left[ \frac{G_{ERP}(f)}{G_{DFTP}(f)} \right] \ \text{in dB (Pa/Pa)} \qquad (13) \]
where
This measurement is conducted using a uniform diffuse sound field as specified in 5.5.3. This
measurement may be performed in 1/3 octave resolution.
The level of the test signal should be in the range of 40–65 dBA. The level and spectrum used should be
reported.
For measurement of listener sidetone, the handset or headset is mounted on an appropriate test
fixture. The mouth simulator is present, but not active, with the MRP at the DFTP.
For the alternate method, listener sidetone response HLS( f ) is given approximately by Equation (14).
It is the talker sidetone response HTS( f ) minus the difference in send frequency responses from the
standard near field method and a similar method using a diffuse noise signal.
To use this alternate method, measure the talker sidetone per 8.6.1, measure the send frequency
response per 8.5.1, then measure the send frequency response in a diffuse field per 8.5.9, and apply
Equation (14).
\[ H_{LS}(f) = H_{TS}(f) - \left[ H_S(f) - H_{SD}(f) \right] \ \text{in dB (Pa/Pa)} \qquad (14) \]
where
NOTE—This method may not be valid when the send, receive, or sidetone path has nonlinear characteristics.
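The alternate method amounts to a per-band subtraction of responses that are already expressed in decibels. A minimal sketch, assuming the three responses were measured on the same frequency grid (the function name and sample values are hypothetical):

```python
def listener_sidetone_alt(h_ts_db, h_s_db, h_sd_db):
    """Approximate listener sidetone response per Equation (14).

    h_ts_db: talker sidetone response, dB (Pa/Pa), per 8.6.1
    h_s_db:  near-field send response, dB (V/Pa), per 8.5.1
    h_sd_db: diffuse-field send response, dB (V/Pa), per 8.5.9
    """
    if not (len(h_ts_db) == len(h_s_db) == len(h_sd_db)):
        raise ValueError("responses must share the same frequency grid")
    return [ts - (s - sd) for ts, s, sd in zip(h_ts_db, h_s_db, h_sd_db)]

# Hypothetical values for three bands.
h_ls = listener_sidetone_alt([-10.0, -12.0, -15.0],
                             [-40.0, -38.0, -36.0],
                             [-50.0, -47.0, -44.0])
```

In this example the diffuse-field send response is 10, 9, and 8 dB below the near-field response, so the estimated listener sidetone is lower than the talker sidetone by the same amounts.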
Sidetone linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the talker sidetone frequency response as specified in 8.6.1 and applying
the procedures described in Annex I. Linearity shall be measured using the same test method and
stimulus used to measure frequency response, except that the analysis bandwidth is different. For the
narrowband codec, the analysis bandwidth is 100–3400 Hz. For the wideband codec, the analysis
bandwidth is typically 100–6800 Hz.
If artificial voices or another wideband test signal is used, the test shall be performed at seven levels
from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is −4.7 dBPa. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies, at seven levels, from −24.7
to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of levels may also be used.
The reference stimulus level is −9.7 dBPa.
Sidetone distortion is measured at ERP using the standard input level of −4.7 dBPa. Other input levels
should be tested covering a range from −30 to +10 dBPa. Measurements also should be made over a
range of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher
input levels, verify that distortion of the test system is less than 2% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
Sidetone delay is measured between the mouth simulator and the ear simulator, using one of the
methods described in Annex L.
If round trip sidetone delay is more than 5 ms, sidetone echo response should be measured. See
Annex M.
8.7 Overall
The overall response should be measured using two telephone sets connected back-to-back, through
the appropriate digital telephone interface, with or without the reference codec as necessary.
Overall frequency response is measured on two telephones connected back-to-back, using the interface
shown in Figure 15. The test conditions should be chosen according to 8.1, except that two test fixtures
are used. In general, the test conditions should be the same as those used for send and receive
measurements on the same telephone(s).
Overall frequency response is the ratio of the sound pressure measured in the ear simulator, referred to
the ear reference point (ERP), on the far-end telephone, to the sound pressure at the mouth reference
point (MRP) for the near-end telephone, which is expressed in decibels. The overall frequency
response in dB, HO( f ), is given by Equation (15). It may be used to calculate the overall loudness
rating (OLR) according to ITU-T Recommendation P.79-1999. See Annex H.
\[ H_O(f) = 20 \log_{10} \left[ \frac{G_{ERP}(f)}{G_{MRP}(f)} \right] \ \text{in dB (Pa/Pa)} \qquad (15) \]
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Overall linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the overall frequency response as specified in 8.7.1 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
used to measure frequency response, except that the analysis bandwidth is different. For the
narrowband codec, the analysis bandwidth is 100–3400 Hz. For the wideband codec, the analysis
bandwidth is typically 100–6800 Hz.
If artificial voices or another wideband test signal is used, the test shall be performed at seven levels
from −34.7 to −4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is −4.7 dBPa. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies, at seven levels, from −24.7
to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of levels may also be used. The
reference stimulus level is −9.7 dBPa.
Overall distortion can be measured in a manner similar to sidetone distortion (8.6.5), except that the
measurement is made from one set to another connected in a back to back configuration. For the
narrowband codec, the stimulus bandwidth is 100–3400 Hz. For the wideband codec, the stimulus
bandwidth is typically 100–6800 Hz. In all cases the analysis bandwidth is 100–8500 Hz.
Echo frequency response and TCLW are traditional means of evaluating echo in telephones
(and also networks). For an alternative measure that may overcome certain shortcomings in this
method, see 8.9.
Echo frequency response is the ratio of the voltage output at the send electrical test point (SETP)
to the voltage input at the receive electrical test point (RETP), expressed in dB. Echo response in dB,
HE( f ), is given by Equation (16). The inverse of this response is echo path loss, which may be used
to calculate TCLW, the weighted terminal coupling loss, according to ITU-T Recommendation
G.122-1993 Annex B, Clause B.4 (trapezoidal rule).
\[ H_E(f) = 20 \log_{10} \left[ \frac{G_{SETP}(f)}{G_{RETP}(f)} \right] \ \text{in dB (V/V)} \qquad (16) \]
where
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
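The step from echo frequency response to a single weighted coupling loss can be sketched as a trapezoidal average of echo power over a logarithmic frequency axis. The code below is an illustration under stated assumptions, not the normative G.122 text: the function name is hypothetical, and the normalization is derived from the band edges of the supplied data (for 300–3400 Hz it evaluates to about 3.24 dB). Consult ITU-T Recommendation G.122 Annex B for the authoritative formula and frequency limits.

```python
import math

def weighted_coupling_loss_db(freqs_hz, echo_response_db):
    """Log-frequency trapezoidal average of echo power, returned as a loss in dB.

    freqs_hz:         ascending measurement frequencies
    echo_response_db: echo response H_E(f) in dB (V/V) at those frequencies
    """
    if len(freqs_hz) != len(echo_response_db) or len(freqs_hz) < 2:
        raise ValueError("need at least two matching frequency points")
    # Convert the dB response to output/input power ratios.
    power = [10.0 ** (h / 10.0) for h in echo_response_db]
    area = 0.0
    for i in range(1, len(freqs_hz)):
        width = math.log10(freqs_hz[i]) - math.log10(freqs_hz[i - 1])
        area += (power[i] + power[i - 1]) * width
    # Normalizing by twice the log bandwidth makes a flat response of
    # -X dB yield a loss of exactly X dB.
    norm = 2.0 * math.log10(freqs_hz[-1] / freqs_hz[0])
    return -10.0 * math.log10(area / norm)
```

Because the average is taken on a power basis, a narrow peak in the echo response lowers the result more than a dB-domain average would suggest.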
a) Receiver placed on the same ear simulator used for receive frequency response measurements
(8.4.1).
b) Handset or headset is suspended in an anechoic chamber at least 500 mm from any reflecting
objects.
c) Receiver and microphone facing a hard, smooth surface free of any object for 500 mm.
A headset is placed on the surface as if it was put down briefly by a user.
d) In the reference corner of Figure 5 (5.6.1), with the receiver placed 250 mm from the corner.
Telephone sets with adjustable receive volume controls shall be tested at the reference receive volume
control setting.
The recommended test signal for this test is the composite source signal, with a white spectrum for the
noise part (CSS, see F.7.1). The recommended test signal level is −12.2 dBV (−10 dBm0). This level
results in a relatively good signal-to-noise ratio for the measurement. The crest factor of CSS can be
less than 10 dB, allowing more headroom than artificial voices. For devices that incorporate nonlinear
processes, additional measurements using signal levels of −8.2 dBV (−6 dBm0) and −18.2 dBV
(−16 dBm0) may be performed. The measurement and calibration shall be determined during the
‘‘On’’ portions of the signal.
For the narrowband codec, the stimulus and analysis bandwidth is 100–3400 Hz. For the wideband
codec, the stimulus and analysis bandwidth is typically 100–6800 Hz.
Temporally weighted terminal coupling loss (TCLT) is an alternative measure of echo that may
overcome some shortcomings of echo frequency response and TCLW, the traditional means of
evaluating echo in telephones (and also networks). There can be two problems with echo frequency
response and TCLW:
a) If the echo is low enough in level, the signal-to-noise ratio of the measurement can be poor or
even negative. In such a case, it may not be clear if a measured result is echo or noise.
b) The results obtained may not correlate well with perception, particularly if the echo comes in
bursts.
The temporally weighted terminal coupling loss method is described in Annex O, and an algorithm is
given in Annex P. Several results are available from this method, but long-term temporally weighted
terminal coupling loss, single talk (LTCLT) is recommended for single-number description of
telephone echo. LTCLT is comparable to TCLW, except that it incorporates psychoacoustic factors
and separates echo from noise.
The test signal is real speech (F.6.3), with pauses edited so they are less than 20 ms. Synthesized real
speech (F.6.2) may also be suitable, but has not yet been validated. The test level is −18.2 dBV,
measured during the active portions of the speech signal.
The test signal is applied at the RETP for 30 s so that the telephone reaches its steady state. No signal
other than the acoustic return from the receiver is applied to the microphone.
Record the electrical signals at RETP and SETP for the next 1 minute. Align the RETP and SETP
signals in time by adding delay equal to EPD (echo path delay) to the RETP signal. LTCLT
is the difference (in dB) between the signal level at RETP and SETP calculated using the algorithm
in Annex P.
a) Receiver placed on the same ear simulator used for receive frequency response measurements
(8.4.1).
b) Handset or headset is suspended in an anechoic chamber, at least 500 mm from any reflecting
objects.
c) Receiver and microphone facing a hard, smooth surface free of any object for 500 mm. A
headset is placed on the surface as if it was put down briefly by a user.
d) In the reference corner of Figure 5 (5.6.1), with the receiver placed 250 mm from the corner.
Telephone sets with adjustable receive volume controls shall be tested at the reference receive volume
control setting.
The stability measurement is the same as echo (8.8) except the test signal is a sine wave at an
input level greater than or equal to −12.2 dBV (−10 dBm0) and less than or equal to −2.2 dBV
(0 dBm0), at 1/12 octave intervals (or R40) for frequencies from 200 Hz to 4 kHz. Stability loss is the
maximum value of the inverse of echo [Equation (16)]. The measurement is performed under all four
physical configurations specified in 8.8.
For the narrowband codec, the stimulus and analysis bandwidth is 100–3400 Hz. For the wideband
codec, the stimulus and analysis bandwidth is typically 100–6800 Hz.
During the measurements, the operator should monitor the telephone for any sign of howling,
whistling, or other signs of instability.
Some devices may have a nonlinear process, such as an echo canceller, to improve TCLW. The
convergence time is a measure of how fast the full attenuation of the echo signal is achieved.
To measure convergence time, reset the device to a nominal state by initiating a new call in a quiet
environment of less than 30 dBA. Trigger a time capture with the onset of the input signal at RETP for
a duration of 1 s. Capture the trigger signal at RETP and the return signal at SETP. The convergence
time is taken from the onset of the trigger signal at RETP to where 90% of full echo path loss is
achieved at SETP. If the canceller does not appear to converge inside of 1 s, a longer time capture
may be needed.
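One way to extract the convergence time from a capture is sketched below. It assumes the capture has already been reduced to a short-term echo path loss in dB per analysis frame, with time zero at the onset of the trigger signal at RETP. It also assumes that "90% of full echo path loss" is evaluated on the dB values and that the full loss is estimated from the last few frames; both are interpretations, and the function name and parameters are hypothetical.

```python
def convergence_time_s(times_s, loss_db, fraction=0.9, settle_frames=5):
    """Return the first time at which loss reaches `fraction` of the full loss.

    times_s:  frame times in seconds, measured from stimulus onset
    loss_db:  short-term echo path loss per frame, in dB
    Returns None if the target is never reached within the capture.
    """
    if len(times_s) != len(loss_db) or not loss_db:
        raise ValueError("times and losses must be non-empty and equal length")
    n = min(settle_frames, len(loss_db))
    full_loss = sum(loss_db[-n:]) / n          # estimate of converged loss
    target = fraction * full_loss
    for t, loss in zip(times_s, loss_db):
        if loss >= target:
            return t
    return None

# Hypothetical capture: 100 ms frames, canceller settling at 40 dB.
times = [i * 0.1 for i in range(9)]
losses = [10.0, 20.0, 30.0, 38.0, 40.0, 40.0, 40.0, 40.0, 40.0]
t_conv = convergence_time_s(times, losses)
```

If the loss is still rising at the end of the capture, the estimate of the full loss is unreliable and a longer capture is needed, as the text notes.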
CSS at −12.2 dBV is the preferred test signal for this measurement (F.7.1).
For the narrowband codec, the stimulus and analysis bandwidth is 100–3400 Hz. For the wideband
codec, the stimulus and analysis bandwidth is typically 100–6800 Hz.
Discontinuous speech transmission (DTX) is often featured as a voice/speech activity detector (VAD/
SAD) that detects when the speech path is idle in a particular direction. The system will then mute the
speech path, allowing additional bandwidth for data traffic.
DTX may cause noise pumping, and both front end speech and trailing speech clipping, especially if
the device has its own VAD feature working in tandem with a network DTX.
If the device has its own VAD feature, this can be characterized by measuring comfort noise matching,
speech detection switching time, and hangover switching time. Techniques to characterize switching
times can be found in IEEE Std 1329-1999, Clause 10.
Comfort noise matching is a comparison of background or network noise levels heard during active
speech transmission, and inserted replacement noise once the speech path is discontinued.
The comfort noise level introduced to replace the actual background noise should roughly match the
loudness as perceived by the user of the original background noise. This level matching is subjectively
asymmetric, in that there is more likely to be annoyance in the comfort noise loudness being greater
than the original noise than in being less than the original.
8.12.1 General
The receive comfort noise of a digital telephone is the short-term average background noise level
measured at the output of the telephone receiver, with the digital telephone receiving either a silence
indication packet from the transmitting telephone, or no packets from the transmitting telephone, for
some nontransient period of time.
Telephone sets with adjustable receive levels shall be adjusted as close as possible to the reference
receive volume control setting. Use the same ear simulator and positioning that was used for receive
measurements (8.4).
For the narrowband codec, the stimulus and analysis bandwidth is 100–3400 Hz. For the wideband
codec, the stimulus and analysis bandwidth is typically 100–6800 Hz.
With both VAD disabled at the transmitting source and comfort noise generation on the receiving unit
under test turned off, a band-limited white noise test signal should be sent from the transmitting end
such that the receive noise level measured at the receiving telephone is 48 dBA. This test signal, at this
level, will be assigned the level of ‘‘N dB’’ as a calibrated point for the purpose of the comfort noise
test, since it may be generated either as an acoustic signal at a ‘‘golden’’ transmitting telephone (and
measured in dBA), or injected digitally (and measured in dBm0p).
If it is not possible to disable the VAD, then a band-limited white noise signal at −62.2 dBV
(−60 dBm0) is input at RETP with a −12.2 dBV (−10 dBm0), 1 kHz tone. The injected noise level is
measured at the receiver, with the 1 kHz tone filtered out. Remove the 1 kHz tone, and, once the device
has discontinued the speech path, measure the generated comfort noise.
The following test sequence shall be followed for all calibrated test noise levels of ‘‘M dB’’ that will
range from N − 10 to N + 10 dB:
The testing methods provided in this clause only cover the application of in-band signals, but the
same sound pressure limits may apply if ringing signals appear in the handset or headset receiver
with the telephone set in off-hook conditions. See Annex N for a discussion of maximum pressure
limits.
Maximum acoustic output measurements shall be made on the same ear simulator and with the same
positioning and force as used for receive frequency response measurements. For handsets measured on
HATS, an additional measurement with a force of 13 N is required. See 5.3.2 for handsets, 5.3.3 for
headsets. Telephone sets with adjustable receive volume controls shall be adjusted to the maximum
setting.
Acoustic output can be referenced to the ERP, DRP, free field (0 degrees elevation and azimuth), or to
a diffuse field, as required by the appropriate safety standard. This may require that measurements
made at one reference point be translated to the required reference point. A filter may be required.
See Annex C.
The maximum acoustic pressure is the maximum steady-state sound pressure emitted from a receiver.
The measurement shall be made with real-time filter analysis (RTA) in 1/12 octave bands, described in
G.3. The detector shall be set to rms fast, which is a 250 ms effective averaging time (equivalent to a
125 ms time constant). The detector shall be set to hold the maximum level achieved in each band
during the entire sweep.
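The detector behavior described above (exponential rms averaging with a 125 ms time constant, followed by a maximum hold) can be sketched for a single band as follows. This is an illustration only: it assumes the input is the already band-filtered, calibrated pressure signal of one 1/12 octave band, and the function name and parameters are hypothetical. A real-time analyzer would run one such detector per band.

```python
import math

def fast_rms_max_hold_db(samples, fs_hz, tau_s=0.125, ref=1.0):
    """Max-hold of an exponentially averaged mean square, in dB re ref.

    samples: band-filtered signal samples (e.g. pascals)
    fs_hz:   sample rate in Hz
    tau_s:   detector time constant; 0.125 s corresponds to 'fast'
    ref:     reference value for the dB result (same unit as samples)
    """
    alpha = 1.0 - math.exp(-1.0 / (fs_hz * tau_s))
    mean_square = 0.0
    held = 0.0
    for x in samples:
        # First-order exponential average of the squared signal.
        mean_square += alpha * (x * x - mean_square)
        held = max(held, mean_square)
    if held == 0.0:
        return float("-inf")
    return 10.0 * math.log10(held / (ref * ref))

# Hypothetical check: a steady 1 kHz sine of amplitude 1 settles near -3 dB.
fs = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(2 * fs)]
level = fast_rms_max_hold_db(tone, fs)
```

Because the detector needs several time constants to settle, a sweep that moves through a band too quickly will underestimate the held level.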
Additional consideration should be given to the acoustic pressure caused by tones, other audio signals
or long duration, high amplitude electrical signals applied to power, network, or auxiliary leads of the
digital telephone.
For digital telephones, the long duration acoustic pressure shall be determined by applying digital
codes to the receive input. This may be performed by using an analog test set to drive a reference codec
or by use of a digital code generator. If a set other than an ITU-T Recommendation G.711-1988
type set is to be tested, then an analog codec should be used. The analog level shall be set to switch
between the maximum positive and the maximum negative values for the reference codec. The
switching rate shall sweep through the range of 100–3400 Hz for narrowband and 100–6800 Hz for
wideband.
If a G.711 type of set is to be tested, a digital generator may be used. In this case, the codes shall be
switched between the maximum positive and the maximum negative values, defined in ITU-T
Recommendation G.711-1988 (viz. +3.17 dBm0 for μ-law coding and +3.14 dBm0 for A-law coding).
The switching rate shall sweep through a range of 100–3400 Hz for narrowband and 100–6800 Hz for
wideband.
The sweep time shall be at least 90 s. A sweep time should be selected that provides consistent results
with no underestimation. That is, the result should be within 0.5 dB at all frequencies for a test
period 30 s.
The peak acoustic pressure is the maximum unweighted peak sound pressure emitted from a telephone
receiver. The stimulus for this test is a series of very short sweeps applied at RETP. The short sweeps
are to avoid activating any long-term nonlinear processes, such as AGC, that may be operating in the
device. The measurement shall be made at the ear simulator with an unweighted ‘‘peak hold’’ level
detector having a rise time equal to or less than 50 ms.
Additional consideration should be given to the peak acoustic pressure caused by tones or short
duration, high amplitude electrical pulses applied to power, network, or auxiliary leads of the digital
telephone.
For digital telephones, the short duration acoustic pressure shall be determined by applying digital
codes to the receive input. This may be performed by using an analog test set to drive a reference
codec, or by use of a digital code generator. If a set other than a G.711 type set is to be tested, then an
analog codec should be used. The analog level shall be set to switch between the maximum positive
and the maximum negative values for the reference codec. The switching rate shall sweep through the
range of 100–3400 Hz for narrowband and 100–6800 Hz for wideband.
If a G.711 type of set is to be tested, a digital generator may be used. In this case the codes shall be
switched between the maximum positive and the maximum negative values, defined in ITU-T
Recommendation G.711-1988 (viz. +3.17 dBm0 for µ-law coding and +3.14 dBm0 for A-law coding).
The switching rate shall sweep through a range of 100–3400 Hz for narrowband and 100–6800 Hz for
wideband.
The duration of the ON codes shall be a number of complete cycles approximating but not exceeding
500 ms. The ON codes shall be followed by a quiet interval of at least 500 ms before repeating the
codes, as shown in Figure 16.
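The burst timing rule above can be sketched in Python as follows. This is only an illustration: the function name `on_burst` and its parameters are assumptions introduced here, not part of this standard. It picks the largest whole number of switching cycles whose total duration does not exceed 500 ms.

```python
import math

def on_burst(switch_rate_hz, max_on_s=0.5):
    """Return (cycles, duration_s): the largest whole number of cycles
    at switch_rate_hz whose duration does not exceed max_on_s."""
    if switch_rate_hz <= 0:
        raise ValueError("switching rate must be positive")
    # Small epsilon guards against floating-point rounding just below an integer.
    cycles = math.floor(switch_rate_hz * max_on_s + 1e-9)
    if cycles < 1:
        raise ValueError("rate too low for one complete cycle within max_on_s")
    return cycles, cycles / switch_rate_hz

# Hypothetical example: a 333 Hz switching rate
cycles, duration = on_burst(333.0)
print(cycles, round(duration, 4))
```

Each ON burst would then be followed by a quiet interval of at least 500 ms before the codes are repeated.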
NOTE—It is advisable to repeat some tests more than one time, to ensure that the protection system is not damaged.
9.1 General
Procedures are given in this clause for measurement of send and receive performance characteristics of
handsets and headsets tested as four-wire devices, which are not connected to a complete telephone.
Parameters include frequency response, noise, input–output linearity, distortion, ac impedance, and dc
resistance. In addition, procedures are given for measuring echo frequency response and maximum
acoustic output.
Loudness ratings (RLR and SLR) should not be used for four-wire handsets and headsets as they are
only defined for complete telephone systems. It is possible to calculate loudness ratings for handsets
and headsets, but the results can only be used to compare similar devices since they are not generally
meaningful. In this case the numbers shall be referred to as ‘‘relative RLR’’ or ‘‘relative SLR.’’
The handset or headset shall be connected to the appropriate test circuit(s) described in 9.2. Other test
circuits may be used for specific applications. Because four-wire devices are affected by changes in
voltage, current, and impedance, the measurements should be made over the conditions that are
expected in actual use. Records should be kept of the measurement conditions.
The measured frequency responses shall be presented as decibels relative to one pascal per volt
[dB (Pa/V)] for receive, decibels relative to one volt per pascal [dB (V/Pa)] for send, and decibels
relative to one volt per volt [dB (V/V)] for echo. The stimulus level and signal type shall be reported for
each test.
The calibration procedures described in Clause 6 shall be carried out before making any
measurements. The acoustical test environment shall meet the specifications given in 5.5.
In general, multiple test signals and stimulus levels should be used to ensure the handset or headset is
characterized in realistic, stable, and well-defined states. This is especially the case for devices with
nonlinear processes such as compression or voice activated switching (VOX) circuitry, etc. See Annex F
and Annex G for further information on test signals and analysis methods.
The standard test signal for all handsets and headsets consists of artificial voices defined in ITU-T
Recommendation P.50-1999. See F.6.1.1 for details.
Sinusoidal test signals may be used for testing handsets or headsets if it can be shown that they do not
have adaptive, nonlinear, or dynamic signal processing (e.g., compressors, AGC, voice activity
detection, adaptive echo cancellers, etc.). Such evidence shall be given in the test report if sinusoidal
test signals are used.
Other test signals may be used when it can be shown that they produce results consistent with actual
use. They also may be necessary for some specific purposes as discussed in relevant places within this
standard.
The measurements in this clause shall be performed at the standard test level for send specified in 6.7.2,
and at the receive stimulus level determined by the procedure in 9.3.2.
The measurement shall be performed using the same format as was used for calibration. Format
examples are 1/N octave bandwidth analysis, constant bandwidth analysis, and R-series preferred
frequencies. Measurement bandwidth shall be the same as or less than that which was used for
calibration. Measurement resolution shall be the same as or coarser than that which was used for
calibration. The actual bandwidth used shall be stated.
In general, the test signals and analysis methods in this standard cover a frequency range from
approximately 100 to 8500 Hz. The exact range depends on the analysis method and the test signal
(see G.6 and G.7).
Choose the ear simulator, mouth simulator, and test position according to 5.1, 5.2, and 5.3. This
equipment shall be used for all tests described in Clause 9, unless otherwise specified. The ear
simulator, mouth simulator, and test position used shall be stated.
If the handset or headset is equipped with a tone control, the tone control shall be set to the
manufacturer’s default setting. This is the default tone control adjustment that shall be used for all
measurements.
If no default setting is defined by the manufacturer, the tone control shall be set so that the frequency
response is as close as possible to the center of the required frequency response template. The tone
control shall be set before setting the volume control. If the tone and volume controls interact,
an iterative process for setting these controls may be necessary.
All measurements shall be done at the default receive volume control setting (9.3.2) and default send
gain adjustment (9.4.1). These default settings for handsets and headsets are defined differently than
the reference receive volume control setting for complete telephones. A range of control settings may
also be used where appropriate, such as minimum and maximum.
The test circuits are terminated into a load greater than 100 kΩ. This termination is the send electrical
test point (SETP) for measuring send output signals. This same termination is also the receive
electrical test point (RETP) for applying receive input signals.
In the microphone test circuits, Figure 17, Figure 18, and Figure 19, the values for voltage V,
capacitance C, resistances R1 and R2, inductance L, and diode D should simulate the range of
operating parameters of the headset or handset interface. These values are intended to provide support
for both dc and ac characteristics.
The effective load impedance provided by these test circuits shall be equal to the range of operating
impedances of the headset or handset interface. For electret microphones, the effective load impedance
ZL = (R1 × R2)/(R1 + R2). This assumes C is large enough so that its impedance is small compared to
R1 and R2 at the lowest frequency tested. In many cases, R2 is infinite, so the effective load impedance
ZL = R1.
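As a rough illustration of the effective load impedance relationship for electret microphones, the Python sketch below computes the parallel combination of R1 and R2 and the reactance of C at the lowest test frequency. The function names and the component values in the example are hypothetical and are not taken from this standard.

```python
import math

def effective_load_impedance(r1_ohm, r2_ohm=math.inf):
    """Parallel combination of R1 and R2; reduces to R1 when R2 is infinite."""
    if math.isinf(r2_ohm):
        return float(r1_ohm)
    return (r1_ohm * r2_ohm) / (r1_ohm + r2_ohm)

def capacitor_reactance(c_farad, f_hz):
    """Magnitude of the capacitor impedance, 1 / (2*pi*f*C), in ohms."""
    return 1.0 / (2.0 * math.pi * f_hz * c_farad)

# Hypothetical values: R1 = 2.2 kOhm, R2 = 10 kOhm, C = 10 uF, lowest frequency 100 Hz
zl = effective_load_impedance(2200.0, 10000.0)
xc = capacitor_reactance(10e-6, 100.0)
print(round(zl, 1), round(xc, 1))
```

Comparing the reactance with R1 and R2 at the lowest frequency tested gives a quick check of whether the assumption about C is reasonable for a given circuit.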
In the case of a completely self-powered microphone system, or a microphone system powered by its
intended host, the circuit of Figure 18 may be used. The microphone should be connected to its
intended host or suitable simulation.
The effective impedance ZS in the receiver test circuit of Figure 20 shall be equal to the specified
nominal impedance of the receiver under test. The effective impedance ZS should take into account
both the output impedance of the signal generator and any other added impedances.
Figure 20—Receiver test circuit (ZTERM > 100 kΩ, used for calibration only)
In case of a receiver system which needs to be powered, the headset or handset should be connected to
its intended host or suitable simulation.
The circuit of Figure 21 may be used for measurement of dc characteristics.
9.3 Receive
9.3.1 General
Receive characteristics of handsets and headsets are measured with the receiver sound port terminated
in the appropriate ear simulator, as defined in Clause 5.
Receive measurements should be taken with the handset or headset driven from a source equivalent to
the interface circuitry as specified in 9.2.
If a headset or handset is equipped with a receive volume control, it shall be set to the manufacturer’s
default setting. For frequency response measurements, LRETP shall be adjusted so that
LERP = -14 dBPa.
If no default setting is defined by the manufacturer, the following procedure shall be followed to
determine the default receive volume control setting, using the test signal chosen for subsequent receive
measurements. If a sine-wave signal is used, the frequency shall be 1 kHz:
a) Set the volume control to maximum. Adjust LRETP so that LERP = -14 dBPa. Record this level
and call it LMAX.
b) Set the volume control to minimum. Adjust LRETP so that LERP = -14 dBPa. If this is not
possible, move the control up slightly. Record this level and call it LMIN.
c) Calculate the halfway point in dB between LMIN and LMAX, and call it LMID. Set LRETP to
LMID. This value of LRETP shall be used for frequency response measurements. Adjust the
volume control so that LERP = -14 dBPa. This is the default receive volume control setting
which shall be used for all measurements unless otherwise specified.
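The midpoint calculation in step c) can be sketched as below. This is a minimal illustration with hypothetical level values; the function names are assumptions. Because the halfway point is taken in dB, it corresponds to the geometric mean of the two voltages rather than their arithmetic mean.

```python
import math

def midpoint_level_db(l_min_db, l_max_db):
    """Halfway point in dB between two recorded levels."""
    return (l_min_db + l_max_db) / 2.0

def db_to_volts(level_dbv):
    """Convert a level in dBV to an rms voltage in volts."""
    return 10.0 ** (level_dbv / 20.0)

# Hypothetical recorded levels in dBV
l_max = -30.0   # LRETP recorded with the volume control at maximum
l_min = -10.0   # LRETP recorded with the volume control at minimum
l_mid = midpoint_level_db(l_min, l_max)
print(l_mid, db_to_volts(l_mid))
```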
Receive frequency response is the ratio of sound pressure measured in the ear simulator, referred to
the ear reference point (ERP), to voltage input at the receive electrical test point (RETP), which
is expressed in decibels. The receive frequency response in dB, HR( f ), is given by Equation (17).
\[ H_R(f) = 20 \log_{10} \frac{G_{\mathrm{ERP}}(f)}{G_{\mathrm{RETP}}(f)} \quad \text{in dB (Pa/V)} \qquad (17) \]
where
GERP( f ) is the rms spectrum at ERP,
GRETP( f ) is the rms spectrum at RETP.
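The ratio in Equation (17) can be evaluated band by band as in the following sketch. It assumes the two rms spectra are already available as equal-length lists on the same frequency grid; the function name and the sample values are hypothetical.

```python
import math

def frequency_response_db(g_out, g_in):
    """20*log10 of the ratio of two rms spectra, element by element.

    g_out: rms spectrum at the output point (for receive: Pa at ERP)
    g_in:  rms spectrum at the input point (for receive: V at RETP)
    """
    if len(g_out) != len(g_in):
        raise ValueError("spectra must have the same number of points")
    return [20.0 * math.log10(o / i) for o, i in zip(g_out, g_in)]

# Hypothetical rms values at three analysis frequencies
g_erp = [0.20, 0.25, 0.10]    # Pa
g_retp = [0.05, 0.05, 0.05]   # V
print([round(h, 2) for h in frequency_response_db(g_erp, g_retp)])
```

The same function applies to the send and echo responses by substituting the appropriate output and input spectra.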
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Receive noise is internally generated audio frequency noise present at the handset or headset receiver
when no stimulus is applied. The receiver shall be coupled to the ear simulator with the RETP
terminated and with no signal input. The handset or headset microphone should be isolated from
sound input and mechanical disturbances that would cause significant error. Measure the acoustic
output signal, referred to the ERP, from 100 to 8500 Hz, averaging over a minimum period of 5 s.
Receive noise should be measured with the send mute feature both ‘‘on’’ and ‘‘off.’’
The receive noise level is measured with A-weighting in dBA. The measurement may be implemented
directly using an A-weighting filter, or by using single-channel FFT with Hann windowing or real-time
spectrum analysis, followed by an A-weighted power summation.
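An A-weighted power summation might be sketched as follows. The weighting curve uses the commonly published analytic A-weighting expression; the function names are assumptions, and the sketch assumes that the band levels passed in are already per-band power levels in dB with any window or bandwidth corrections applied.

```python
import math

def a_weighting_db(f_hz):
    """A-weighting in dB at frequency f_hz (analytic form, 0 dB at 1 kHz)."""
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00

def a_weighted_level_db(freqs_hz, band_levels_db):
    """Apply A-weighting to each band level and sum the powers."""
    if len(freqs_hz) != len(band_levels_db):
        raise ValueError("need one level per frequency")
    total = sum(
        10.0 ** ((lvl + a_weighting_db(f)) / 10.0)
        for f, lvl in zip(freqs_hz, band_levels_db)
    )
    return 10.0 * math.log10(total)

# Hypothetical band levels (dB re 20 uPa) at three band centers
print(round(a_weighted_level_db([125.0, 1000.0, 4000.0], [30.0, 25.0, 20.0]), 1))
```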
Receive narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone relative to the overall weighted noise level. This test measures the weighted noise
level characteristics in narrow bands of not more than 31 Hz, from 100 to 8500 Hz. These
levels can then be compared to the receive noise (9.3.4).
The receiver shall be coupled to the ear simulator with the RETP terminated and with no signal input.
Measure the A-weighted receive noise level, referred to the ERP, using a selective voltmeter, or a
spectrum analyzer with an effective bandwidth of not more than 31 Hz, over the frequency range of
100–8500 Hz. If FFT analysis is used, then ‘‘flat top’’ windowing shall be employed.
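When FFT analysis is used, the bandwidth limit constrains the FFT length. The sketch below is one way to estimate the minimum length, assuming the effective bandwidth equals the window's equivalent noise bandwidth (ENBW, in bins) multiplied by the bin spacing. The ENBW of a flat-top window depends on the particular window design, so the value of 3.77 bins and the 48 kHz sample rate in the example are illustrative assumptions only.

```python
import math

def min_fft_length(sample_rate_hz, enbw_bins, max_bandwidth_hz=31.0):
    """Smallest FFT length N such that enbw_bins * fs / N <= max_bandwidth_hz."""
    if sample_rate_hz <= 0 or enbw_bins <= 0 or max_bandwidth_hz <= 0:
        raise ValueError("all arguments must be positive")
    return math.ceil(enbw_bins * sample_rate_hz / max_bandwidth_hz)

# Hypothetical setup: 48 kHz sampling, flat-top window with an assumed ENBW of 3.77 bins
n = min_fft_length(48000.0, 3.77)
print(n, round(3.77 * 48000.0 / n, 2))
```

In practice the length would usually be rounded up further to a convenient size supported by the analyzer.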
Receive linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the receive frequency response as specified in 9.3.3 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
type used to measure frequency response.
The default receive volume and tone control setting shall be used. If artificial voices or another
wideband test signal are used, the test shall be performed in 1/3 octave bands. If sine-wave signals are
used, they shall be applied at the R10 frequencies from 200 through 5000 Hz.
For any test signal, the reference stimulus level is LMID, as determined according to the procedure in
9.3.2. The test shall be performed at seven levels, from LMID - 15 dB to LMID + 15 dB, in 5 dB intervals.
Smaller intervals and/or a wider range of levels may also be used.
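The set of stimulus levels for the linearity test can be generated as in this small sketch. The function name and the example reference level are hypothetical; the defaults reflect a span of 15 dB either side of the reference in 5 dB steps.

```python
def linearity_levels(reference_db, span_db=15.0, step_db=5.0):
    """Stimulus levels from reference - span to reference + span in equal steps."""
    if span_db <= 0 or step_db <= 0:
        raise ValueError("span and step must be positive")
    steps = int(round(2.0 * span_db / step_db))
    return [reference_db - span_db + i * step_db for i in range(steps + 1)]

# Hypothetical LMID of -20 dBV
print(linearity_levels(-20.0))
```

A smaller `step_db` or larger `span_db` yields the finer or wider sets of levels that are also permitted.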
Receive distortion is measured at ERP using an input level of LMID, as determined according to the
procedure in 9.3.2. Other input levels should be tested covering a range from at least -30 to +5 dBV,
and above if necessary, until obvious clipping or limiting occurs. Measurements should also be made
over a range of frequencies within the telephone band, such as the ISO R10 preferred frequencies. For
higher input levels, verify that distortion of the test system is less than 1% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
9.3.8 AC impedance
Mount the receiver to the appropriate ear simulator. Connect an impedance bridge to the receive
circuitry described in 9.2. Measure the impedance at each frequency of interest.
9.3.9 DC resistance
The resistance of the receive circuit should be obtained by the current–voltage method shown in
Figure 21. This measurement may be taken for various dc supply voltages, but use caution to avoid
damaging the receive circuitry.
9.4 Send
If a headset or handset is equipped with a send gain adjustment, the gain control shall be set to the
manufacturer’s default setting. This is the default send gain control adjustment that shall be used for
send frequency response measurements.
Copyright © 2003 IEEE. All rights reserved.
If no default setting is defined by the manufacturer, the following procedure shall be followed to
determine the default send gain control setting, using the test signal chosen for subsequent send
measurements. If a sine-wave signal is used, the frequency shall be 1 kHz:
a) Set the gain adjustment to maximum. Set LMRP to -4.7 dBPa, then measure LSETP. Record this
level and call it LMAX.
b) Set the gain adjustment to minimum. Set LMRP to -4.7 dBPa, then measure LSETP. Record this
level and call it LMIN. If this is not possible, move the control up slightly, then repeat the
procedure.
c) Calculate the halfway point in dB between LMAX and LMIN, and call it LMID. Set LMRP to
-4.7 dBPa, then measure LSETP. Adjust the send gain control so that LSETP = LMID. This is the
default send gain control adjustment that shall be used for all measurements unless otherwise
specified.
Send frequency response is the ratio of voltage output at the send electrical test point (SETP) to the
sound pressure at the mouth reference point (MRP), which is expressed in decibels. The send
frequency response in dB, HS( f ), is given by Equation (18).
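\[ H_S(f) = 20 \log_{10} \frac{G_{\mathrm{SETP}}(f)}{G_{\mathrm{MRP}}(f)} \quad \text{in dB (V/Pa)} \qquad (18) \]
where
GSETP( f ) is the rms spectrum at SETP,
GMRP( f ) is the rms spectrum at MRP.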
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Send noise is internally generated audio frequency noise present at the microphone terminals or
circuitry with no stimulus applied. Measure the electrical output signal at SETP, averaging over a
minimum period of 5 s. The handset or headset microphone should be isolated from sound input and
mechanical disturbances that would cause significant error. Send noise should be measured with the
mute feature both ‘‘on’’ and ‘‘off.’’
Send overall noise shall be measured with psophometric weighting, and reported in units of dBV(p).
It shall also be measured with A-weighting and reported in units of dBV(A). Measurements in dBV(p)
and dBV(A) are generally not the same, and they may not be correlated. Units of dBmp or dBm(A)
can then be calculated based on the method described in Annex T.
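The general relationship between a voltage level and a power level into a resistive load is sketched below for orientation. The method in Annex T is not reproduced here and may differ in detail; the 600 Ω reference impedance in the example and the function name are assumptions.

```python
import math

def dbv_to_dbm(level_dbv, impedance_ohm):
    """Convert dBV (re 1 V rms) to dBm (re 1 mW) into a resistive load.

    P = V^2 / R, so dBm = dBV + 10*log10(1000 / R).
    """
    if impedance_ohm <= 0:
        raise ValueError("impedance must be positive")
    return level_dbv + 10.0 * math.log10(1000.0 / impedance_ohm)

# Hypothetical: -70 dBV(p) noise referred to an assumed 600 ohm load
print(round(dbv_to_dbm(-70.0, 600.0), 2))
```

The weighting suffix carries over unchanged, so a dBV(p) input gives a result in dBmp and a dBV(A) input gives dBm(A).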
Psophometric measurements are made from 100 to 6000 Hz, while A-weighted measurements are made
from 100 to 8500 Hz. These measurements can be made directly using a psophometrically weighted or
A-weighted noise meter with the correct terminating impedance. The measurement may also be
implemented using a single-channel FFT with Hann windowing, or a real-time spectrum analysis,
followed by a weighted power summation.
Send narrowband noise, including single frequency interference (SFI), is an impairment that can be
perceived as a tone relative to the overall weighted noise level. This test measures the A-weighted noise
level characteristics in narrow bands of not more than 31 Hz, from 100 to 8500 Hz.
The handset or headset should be isolated from sound input and mechanical disturbances that would
cause significant error. Measure the A-weighted noise level across R2 with a selective voltmeter, or a
spectrum analyzer with an effective bandwidth of not more than 31 Hz, over the frequency range of
100–8500 Hz. If FFT analysis is used, then ‘‘flat top’’ windowing shall be employed.
The same procedure may be applied, but with psophometric weighting, if specified by a performance
standard.
Send linearity is a measure of how the frequency response changes with input level.
The test consists of measuring the send frequency response as specified in 9.4.2 and applying the
procedures described in Annex I. Linearity shall be measured using the same test method and stimulus
type used to measure frequency response.
If artificial voices or another wideband stimulus is used, the test shall be performed at seven levels,
from -34.7 to -4.7 dBPa, in 5 dB intervals, measured in 1/3 octave bands. Smaller intervals and/or a
wider range of levels may also be used. The reference stimulus level is -4.7 dBPa. These levels take into
account the high crest factor of artificial voices, which approaches 23 dB.
If sine-wave signals are used, they shall be applied at the R10 frequencies from 200 through 5000 Hz
for seven levels, from -24.7 to +5.3 dBPa, in 5 dB intervals. Smaller intervals and/or a wider range of
levels may also be used. The reference stimulus level is -9.7 dBPa.
Send distortion is measured using the standard input level of -4.7 dBPa. Other input levels should be
tested covering a range from -30 to +10 dBPa. Measurements should also be made over a range of
frequencies within the telephone band, such as the ISO R10 preferred frequencies. For higher input
levels, verify that distortion of the test system is less than 2% THD.
For information about THD and other distortion measurement methods and test signals, and the
conditions under which they may be used, see Annex J. Different distortion measurement methods are
likely to give different results.
Send frequency response in a diffuse field is a measure of how much of the noise in the room where a
telephone is being used is transmitted to the network. It is the ratio of voltage output at the
send electrical test point (SETP) to the sound pressure at the diffuse field test point (DFTP, see 5.5.3),
which is expressed in decibels. The diffuse field send frequency response in dB, HSD( f ), is given by
Equation (19).
The diffuse field send frequency response may be sensitive to both the level and type of signal used.
This measurement may be performed in 1/3 octave resolution.
During the measurement, the mouth simulator is present but not active, with the MRP located at the
DFTP. The mouth simulator is not present during calibration.
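\[ H_{SD}(f) = 20 \log_{10} \frac{G_{\mathrm{SETP}}(f)}{G_{\mathrm{DFTP}}(f)} \quad \text{in dB (V/Pa)} \qquad (19) \]
where
GSETP( f ) is the rms spectrum at SETP,
GDFTP( f ) is the rms spectrum at DFTP.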
Send signal-to-noise ratio is a measure of the desired speech transmission relative to unwanted noise in
the room where the talker’s phone is used. See Annex K.
9.4.9 AC impedance
Connect the headset or handset receiver according to Figure 17, Figure 18, or Figure 19. Temporarily
disconnect R2 and measure the electrical output for an input level representing the magnitude of
typical voice signals. Connect R2 and adjust its resistance to cause a 6 dB drop in output voltage level.
This resistance value is the magnitude of the impedance of the microphone circuit, which may
include R1, at each frequency of interest.
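The 6 dB criterion follows from the voltage divider formed by the source impedance and R2, which the sketch below illustrates. The function name and impedance values are hypothetical. For a purely resistive source the drop is 6.02 dB exactly when R2 equals the source resistance; with a strongly reactive source the reading is only an approximation of the impedance magnitude.

```python
import math

def loaded_drop_db(r2_ohm, z_source):
    """Change in output level (dB) when load R2 is connected across a
    source with internal impedance z_source (real or complex, in ohms)."""
    return 20.0 * math.log10(abs(r2_ohm / (r2_ohm + z_source)))

# Hypothetical resistive source of 2 kOhm loaded with R2 = 2 kOhm
print(round(loaded_drop_db(2000.0, 2000.0), 2))

# Hypothetical reactive source with the same magnitude (|1200 + 1600j| = 2000)
print(round(loaded_drop_db(2000.0, complex(1200.0, 1600.0)), 2))
```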
9.4.10 DC resistance
The resistance of a dynamic type microphone can be measured directly. The resistance of electret and
carbon type microphones should be obtained from the current–voltage characteristics. This
measurement may be taken for various dc supply voltages, but use caution to avoid damaging the
microphone circuitry. The microphone should be isolated from sound input and mechanical
disturbances for these measurements.
Echo frequency response is the ratio of the voltage output at the send electrical test point (SETP) to
the voltage input at the receive electrical test point (RETP), expressed in dB. Echo response in dB,
HE( f ), is given by Equation (20). The inverse of this response is echo path loss.
Echo path loss may be used to calculate TCLW, the weighted terminal coupling loss, according to
ITU-T Recommendation G.122-1993, Annex B, Clause B.4 (trapezoidal rule). For handsets and
headsets this calculation shall be labeled as ‘‘relative TCLW,’’ since true TCLW is defined only for
complete telephones.
The TCLW may be normalized to nominal RLR and SLR target specifications, corrected from relative
SLR and relative RLR. It shall then be labeled ‘‘normalized TCLW,’’ and the method of normalization
shall be stated.
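A trapezoidal-rule weighting of echo path loss might be sketched as follows. This is an illustration of the general technique only: it averages the power ratio over a logarithmic frequency axis and expresses the result as a loss in dB. The band limits, frequency spacing, and exact constants should be taken from ITU-T Recommendation G.122 itself; the function name and sample data here are assumptions.

```python
import math

def weighted_coupling_loss_db(freqs_hz, loss_db):
    """Trapezoidal average of the power ratio over log frequency, as a loss in dB.

    freqs_hz: ascending frequencies in Hz
    loss_db:  echo path loss in dB at each frequency (the negative of HE(f))
    """
    if len(freqs_hz) != len(loss_db) or len(freqs_hz) < 2:
        raise ValueError("need at least two matching points")
    acc = 0.0
    for i in range(1, len(freqs_hz)):
        a_prev = 10.0 ** (-loss_db[i - 1] / 10.0)   # power ratio at the previous point
        a_cur = 10.0 ** (-loss_db[i] / 10.0)        # power ratio at the current point
        width = math.log10(freqs_hz[i]) - math.log10(freqs_hz[i - 1])
        acc += 0.5 * (a_prev + a_cur) * width
    span = math.log10(freqs_hz[-1]) - math.log10(freqs_hz[0])
    return -10.0 * math.log10(acc / span)

# Hypothetical echo path loss values over an assumed 300 to 3400 Hz band
freqs = [300.0, 500.0, 1000.0, 2000.0, 3400.0]
loss = [45.0, 42.0, 38.0, 40.0, 44.0]
print(round(weighted_coupling_loss_db(freqs, loss), 1))
```

Because the average is taken over power ratios, frequencies with low loss dominate the result.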
\[ H_E(f) = 20 \log_{10} \frac{G_{\mathrm{SETP}}(f)}{G_{\mathrm{RETP}}(f)} \quad \text{in dB (V/V)} \qquad (20) \]
where
GSETP( f ) is the rms spectrum at SETP,
GRETP( f ) is the rms spectrum at RETP.
In some cases, frequency response calculation may be performed with cross-spectrum or related
techniques. Justification for such techniques shall be given in the test report. See G.1 for more
information.
Echo frequency response shall be measured under the following two conditions:
a) Receiver placed on the same ear simulator used for receive measurements (9.3).
b) Handset or headset is suspended in the anechoic chamber at least 500 mm from any reflecting
objects.
Echo frequency response may be measured under the following two conditions:
c) Receiver and microphone facing a hard, smooth surface free of any object for 500 mm. Handset
receiver and microphone facing down. Headset is placed on the surface as if it had been put down
briefly by a user.
d) In the reference corner of Figure 5 in 5.6.1, with the receiver placed 250 mm from the corner.
The recommended test signal for this test is the composite source signal, with a white spectrum for the
noise part (CSS, see F.7.1). The recommended test signal level is LMID þ 6 dB. (This level is intended
to result in a test roughly comparable to an echo test with the same handset or headset installed in a
complete telephone. It also results in an improved signal to noise ratio for the measurement.)
The testing methods provided in this clause only cover the application of in-band signals, but the same
sound pressure limits may apply if ringing signals appear in the handset or headset receiver while the
telephone set is off-hook. See Annex N for a discussion of maximum pressure limits.
Maximum acoustic output measurements shall be made on the same ear simulator and with the same
positioning and force as used for receive frequency response measurements. For handsets measured on
HATS, an additional measurement with a force of 13 N is required. See 5.1, as well as 5.3.2 for
handsets, and 5.3.3 for headsets. Handsets and headsets with adjustable receive volume controls shall
be adjusted to the maximum setting.
Acoustic output can be referenced to the ERP, DRP, free field (0 degrees elevation and azimuth), or to
a diffuse field, as required by the appropriate safety standard. This may require measurements made at
one reference point be translated to the required reference point. A filter may be required. See Annex C.
The maximum acoustic pressure is the maximum steady-state sound pressure emitted from a receiver.
The stimulus for this test is a slow logarithmic sine sweep applied at RETP from 100 to 8500 Hz.
The measurement shall be made with real-time filter analysis (RTA) in 1/12 octave bands, described
in G.3. The detector shall be set to rms fast, which is a 250 ms effective averaging time (equivalent to a
125 ms time constant). The detector shall be set to hold the maximum level achieved in each band
during the entire sweep.
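The maximum-hold behavior of the detector can be sketched as below, assuming the analyzer delivers a sequence of per-band level readings (one list per time step) during the sweep. The function name and the sample readings are hypothetical, and the rms fast averaging itself is assumed to have been applied by the analyzer.

```python
def max_hold(frames):
    """Return the per-band maximum over a sequence of band-level frames."""
    held = []
    for frame in frames:
        if not held:
            held = list(frame)
        elif len(frame) != len(held):
            raise ValueError("all frames must have the same number of bands")
        else:
            held = [max(h, x) for h, x in zip(held, frame)]
    return held

# Hypothetical readings (dBPa) for three bands at three instants during the sweep
readings = [
    [10.0, 2.0, -5.0],
    [4.0, 12.0, -3.0],
    [1.0, 6.0, 9.0],
]
print(max_hold(readings))
```

The overall maximum acoustic pressure would then be read from the highest held band level.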
The sweep time shall be at least 90 s. A sweep time should be selected that provides consistent results
with no underestimation. That is, the result should be within ±0.5 dB at all frequencies for a test
period ±30 s.
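A logarithmic sine sweep of the kind used as the stimulus can be described by its instantaneous frequency and phase, as in the sketch below. The function names are assumptions, and the defaults (100 to 8500 Hz over 90 s) are simply the minimum-duration case; a longer sweep time may be needed to obtain consistent results.

```python
import math

def log_sweep_frequency(t_s, f_start_hz=100.0, f_stop_hz=8500.0, sweep_time_s=90.0):
    """Instantaneous frequency of a logarithmic sweep at time t_s."""
    return f_start_hz * (f_stop_hz / f_start_hz) ** (t_s / sweep_time_s)

def log_sweep_phase(t_s, f_start_hz=100.0, f_stop_hz=8500.0, sweep_time_s=90.0):
    """Phase in radians: the time integral of 2*pi*f(t) from 0 to t_s."""
    k = math.log(f_stop_hz / f_start_hz)
    return 2.0 * math.pi * f_start_hz * sweep_time_s / k * (
        math.exp(k * t_s / sweep_time_s) - 1.0
    )

def log_sweep_sample(t_s, amplitude=1.0, **kwargs):
    """One sample of the sweep waveform at time t_s."""
    return amplitude * math.sin(log_sweep_phase(t_s, **kwargs))

# Frequency at the start, middle, and end of a 90 s sweep
print([round(log_sweep_frequency(t), 1) for t in (0.0, 45.0, 90.0)])
```

With a logarithmic sweep, equal time is spent in each octave, which suits analysis in fractional-octave bands.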
The peak acoustic pressure is the maximum unweighted peak sound pressure emitted from a receiver.
The stimulus for this test is a surge applied to the receive terminals of the handset or headset. The
measurement shall be made at the ear simulator with an unweighted ‘‘peak hold’’ level detector, with a
rise time equal to or less than 50 µs.
Connect the positive terminal of the surge generator to the positive terminal of the receive circuitry
(see 7.10.2). Measure the peak pressure in the ear simulator while operating the surge generator.
An oscilloscope or a sound level meter having an unweighted ‘‘peak hold’’ setting is used to make the
measurement. Reverse the connection and repeat.
Annex A
(normative)
Type 3.3 and Type 3.4 ear simulators have a soft pinna which deforms when the receiver is
pressed against it. The resulting leak depends on force or position, as well as the exact shape of
the receiver. The relationship between position and force will vary depending on the shape of the
receiver.
The change in force or position of the receiver against the ear simulator will cause the acoustic leak to
vary. The leak will generally introduce variations in the frequency response, especially at the lower
frequency range of the receiver, just as it does on a human ear.
The variation of leak with force or position is often not linear, especially at very low forces (2 N or less)
or very high forces (13 N or more).
Both ears have acoustical characteristics similar to the average human adult ear.
The Type 3.3 ear simulator is shaped like a real human ear, while the Type 3.4 ear simulator has a
simplified shape.
The measured results obtained by the Type 3.3 and Type 3.4 ear simulators may differ:
a) The receiver position, or force applied, may result in leaks that are slightly different. In order to
achieve a similar leak on the two different ear simulators with a handset, different forces may
have to be applied.
b) The acoustical input impedance of the two simulators is not identical. In general, the impedance
of the Type 3.3 ear simulator is slightly higher than that of the Type 3.4 ear simulator. For
measurements with similar leakage, the effect is that the receive loudness rating calculated from
measurement on a Type 3.3 ear simulator could be one to two dB lower (louder) than that
obtained from the Type 3.4 ear simulator.
Regardless of these differences, both the Type 3.3 and Type 3.4 ear simulators are generally the most
realistic way to measure handsets and headsets in a way that relates to the actual experience of real
listeners. The choice between the Type 3.3 and Type 3.4 ear simulator is up to the user. However,
measurements using the two simulators cannot be expected to be exactly equal.
The recommendations in this standard for using the Type 3.3 and Type 3.4 ear simulators reflect the
currently available equipment. When new or revised simulators become available, their use should be
carefully considered in view of the principles expressed in this standard as well as the information and
recommendations provided by the equipment manufacturer.
In principle, positioning of a handset on the Type 3.3 or Type 3.4 ear simulator can be specified either
by position relative to the ERP or by the applied force. The two are related, since greater applied force
results in moving the receiver inward toward the center of the head. However, the relationship between
applied force and position may be nonlinear.
The positioning device currently available for the Type 3.3 ear simulator can hold the receiver by
position relative to the ERP or by force on the pinna. Positioning relative to ERP is typically very
repeatable. The recommended procedure is to begin by placing the receiver in the positioning device
without contacting the pinna, then gradually moving the receiver inward so as to increase the force,
stopping at the target force or position.
When using the positioning device currently available for the Type 3.3 ear simulator, placement by
force is typically somewhat less repeatable than placement by position relative to the ERP. In addition,
there can be a large difference in pinna deformation at a given force reading depending on whether the
force has been increased from a low value to arrive at the target, or decreased from a high value. In
other words, whether the receiver has been positioned from outside the ERP and moved in toward the
center of the head, or the reverse. The recommended procedure is to begin by placing the receiver in
the positioning device without contacting the pinna, and to gradually move the receiver inward so as to
increase the force, stopping at the target force. The default target force is 6 N.
The positioning device currently available for the Type 3.4 ear simulator can hold the receiver by force
on the pinna. The positioning by force is typically very repeatable. There can be a difference in pinna
deformation at a given force reading depending on whether the force has been increased from a low
value to arrive at the target, or decreased from a high value. In other words, whether the receiver has
been positioned from outside the ERP and moved in toward the center of the head, or the reverse. The
recommended procedure is to begin by placing the receiver in the positioning device without
contacting the pinna, and to gradually move the receiver inward so as to increase the force, stopping at
the target force. The default target force is 6 N.
When using the positioning device currently available for the Type 3.4 ear simulator, it is not possible
to hold the receiver by position relative to the ERP.
The recommendations in this standard for positioning handsets or headsets on the Type
3.3 or Type 3.4 ear simulators reflect the currently available equipment. When new or revised
positioning devices become available, their use should be carefully considered in view of the principles
expressed in this standard as well as the information and recommendations provided by the equipment
manufacturer.
Annex B
(normative)
The following specialized ear simulators may be used as alternates if the applicable performance
specification requires or allows it, and if the following application requirements are met:
a) The Type 1 ear simulator may be used for large, supra-aural or supra-concha, hard-cap,
conically symmetrical receivers, which naturally seal to the simulator rim, in the band of
100–4000 Hz. (The frequency range may be extended to 8500 Hz, but only if the performance
specification requires it. However, the accuracy or relevance of the results in this extended range
is not assured.) These receivers should also be tested in a realistic unsealed condition using the
Type 3.3 or Type 3.4 ear simulator as specified in this subclause.
b) The Type 2 ear simulator may be used for sealing or nonsealing receivers that are inserted into
the ear canal.
c) The Type 3.1 ear simulator may be used for intra-concha receivers designed for sitting on the
bottom of the concha cavity.
d) The Type 3.2 ear simulator with a high- or low-grade leak may be used for large, supra-aural or
supra-concha, hard-cap, receivers, which naturally seal to the simulator rim, in the band of
100–8000 Hz. The low leak is intended for receivers that are pressed firmly to the ear, while the
high leak is intended for loosely coupled receivers.
The Type 1 ear simulator measures at the ear reference point (ERP), while all the other ear simulators
measure at the eardrum reference point (DRP). Measurements collected at the DRP shall be translated
to the ERP. This is done because receive and sidetone specifications are referenced to the ERP. It also
permits comparison of measurements made on the various type ear simulators.
For Types 2, 3.1, 3.3, and 3.4 ear simulators, DRP to ERP translation shall be performed according
to Annex C.
For measurements of receive or talker sidetone using the Type 1 ear simulator, a leakage
correction is often applied to the loudness rating calculation. Follow the applicable performance
standard for the correction and how to apply it. The leakage correction is not applied to the frequency
response.
B.2.1 General
When an alternative ear simulator described in B.1 is used, an alternative mouth simulator may be used.
When measurements are being made exclusively in the send direction, an alternative mouth simulator
may also be used. The mouth simulator recommended in 5.2 is usually installed in a HATS, but the
alternative ear simulators generally cannot be mounted to a HATS. Alternative mouth simulators and
ear simulators are generally installed on a test head. The alternative mouth shall comply with the
specification given in ITU-T Recommendation P.51-1996, whereas the mouth recommended in 5.2 shall
comply with ITU-T Recommendation P.58-1996. There are minor differences between these
specifications, so there may be small differences between the simulators.
The alternative mouth is suitable for measurements at or in front of the lip plane only. Traditionally,
it has been used for measuring corded telephone handsets.
Neither ITU-T Recommendation P.51-1996 nor ITU-T Recommendation P.58-1996 defines a sound
field behind the lip plane. However, practical experience has shown that the sound field distribution in
the region between the HATS mouth and ear closely approximates the sound field around a real
human head up to at least 4 kHz. The region extends from beyond the lip plane to the base of the
rubber ear, at a distance of 5 mm or more above the surface of the HATS cheek. This makes HATS
suitable for testing headsets, cordless and cellular phones, handsfree phones, and traditional corded
handsets. The approximation may also hold over a wider frequency range and in other regions
around HATS, but this has not yet been verified.
In principle, a 6.25 mm free-field microphone should be used to calibrate the mouth simulator.
In practice, the mouth simulator may be calibrated at the MRP using a 12.5 mm free-field microphone
oriented at 0 degrees to the mouth axis with the center of the protection grid at the MRP (Figure B.1).
The calibrated frequency response of the microphone should be taken into account. Subtract 0.6 dB
from the measurement to give the actual sound pressure at the MRP. This compensates for the
acoustic center of the microphone being slightly in front of the protection grid. The method is valid
over the entire frequency range covered in this standard.
An alternate method is to calibrate at the MRP using a 12.5 mm pressure microphone oriented at
90 degrees to the mouth axis with the center of the protection grid at the MRP. The calibrated
frequency response of the microphone should be taken into account. This method is valid only to
5 kHz.
To calibrate the mouth, measure GMRP( f ), the spectrum at the MRP. Adjust the mouth equalization
to meet the target spectrum for the signal being used at a total sound pressure of –4.7 dBPa. This
spectrum is used to calculate the send, sidetone, and overall frequency responses.
The test fixture shall implement the HATS position defined in ITU-T Recommendation P.64-1999,
Annex E. The HATS position may be implemented on a standard test head.
The LRGP position was specified in previous editions of this standard. Send frequency response
measurements made on ordinary telephones at 300–3400 Hz are expected to give practically identical
results, whether obtained with LRGP or the HATS position. Systematic differences of about 1–2 dB in
send frequency response measurements on pressure gradient microphones have to be expected from
the upwards tilted speaking direction of about 19 degrees using the LRGP position. See ITU-T
Recommendation P.64-1999, Annex F.
Annex C
(normative)
All ear simulators except Type 1 are made with the measurement microphone in a position corresponding
to the eardrum, so measurements are made at the drum reference point (DRP). For telephony
measurements, the ear reference point (ERP) is used for loudness rating calculations and to maintain
comparability to historical measurements. The measurements collected at the DRP are therefore
generally translated to the ERP, or to another suitable acoustical terminal, depending on the application.
For all measurements, the translation may be implemented by using a minimum-phase filter based on
one of the tables or other transfer functions referred to in this annex. The magnitude of the filter response
shall match the transfer function within a tolerance of 2 dB. A tolerance of 1 dB is preferred.
A filter shall be used for measurements of peak acoustic pressure. (Peak measurements should be made
on the actual waveform at the desired acoustic terminal. Both the magnitude and phase of the transfer
function are necessary to best preserve the waveshape for a proper measurement of its peak value.)
For measurements made with any kind of spectrum analysis, the translation may be performed with
post-processing using one of the tables or other transfer functions referred to in this annex.
Measurement examples include frequency response, noise, linearity, and distortion. These tables may
also be used for frequency response measurements made with sine waves, if only the fundamental or
total response is included.
The translations given or referred to in this annex may be interpolated to match the frequency format
of the measurement to which they are applied.
The DRP to ERP translation, SDE, shall be added to the data measured at the DRP in order to
translate to the ERP. The effect is to remove a broad frequency response peak of about 10 dB in
the region of 3000 Hz. The DRP to ERP translation in this annex is from ITU-T Recommendation
P.57-2002. See Figure C.1, Table C.1, and Table C.2.
224 0.1 750 1.1 2500 9.3 8500 18.6
236 0.1 800 1.1 2650 9.9 9000 —
250 0.2 850 1.2 2800 10.6 9500 —
265 0.3 900 1.3 3000 10.7 10000 —
280 0.3 950 1.4 3150 10.4
300 0.2 1000 1.6 3350 9.6
315 0.2 1060 1.9 3550 8.5
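The translation step described in this annex can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the function names are hypothetical, and the table values are placeholders chosen for the example, not values from Table C.1 or Table C.2. It interpolates the translation on a logarithmic frequency axis to match the measurement frequencies, then adds it to the levels measured at the DRP.

```python
import math

def interpolate_sde(table, freq_hz):
    """Interpolate the DRP-to-ERP translation (dB) linearly on a log-frequency axis.

    table: list of (frequency_hz, sde_db) pairs sorted by increasing frequency.
    """
    if not table[0][0] <= freq_hz <= table[-1][0]:
        raise ValueError("frequency outside the range of the translation table")
    for (f_lo, s_lo), (f_hi, s_hi) in zip(table, table[1:]):
        if f_lo <= freq_hz <= f_hi:
            frac = math.log(freq_hz / f_lo) / math.log(f_hi / f_lo)
            return s_lo + frac * (s_hi - s_lo)
    return table[0][1]  # only reached for a single-row table

def drp_to_erp(drp_levels, table):
    """Add the interpolated translation to each (frequency_hz, level_db) pair."""
    return [(f, level + interpolate_sde(table, f)) for f, level in drp_levels]

# Placeholder translation values (NOT taken from the standard).
example_table = [(1000.0, -1.0), (2000.0, -5.0), (4000.0, -7.0)]
# Hypothetical levels measured at the DRP, in dB.
measured_drp = [(1000.0, 80.0), (1000.0 * 2 ** 0.5, 82.0), (2000.0, 86.0)]
erp = drp_to_erp(measured_drp, example_table)
```

Interpolating on a logarithmic axis is a design choice made for this sketch because the tables are given at logarithmically spaced frequencies; the standard only requires that the translation match the frequency format of the measurement.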
Translation from DRP to free field at 0 degrees azimuth and 0 degrees elevation, or diffuse field, or any
other similar acoustical terminal shall be made using the transfer function supplied by the
manufacturer of the ear simulator, if available. Alternatively, the transfer functions specified in
ITU-T Recommendation P.58-1996 may be used. Transfer functions with resolution of at least 1/12
octave or R40 format shall be used if available. Report the transfer function used.
Annex D
(normative)
The orientation of a carbon transmitter during test, and the treatment it receives immediately prior to
a test, can have a significant influence on test results. Conditioning should be applied before making
any measurement, and the measurement should start within 10 s after conditioning. Because of the
wide possible variation in handset geometries and test fixtures, general guidelines are given for
conditioning, rather than detailed specifications. For tests between different locations, it is
recommended that identical procedures, as nearly as possible, be used to reduce differences and to
make results comparable.
For best reproducibility, automatic mechanical conditioning should be used. Connect the telephone
set terminals as required to the feed circuit and the appropriate terminating load. Turn the feed current
on. After 5 s, condition the microphone by rotating it smoothly. Rotation is made so that the plane of
the granule bed moves through an arc of at least 180 degrees and back. The rotation procedure is
repeated twice with the handset coming to rest in the test position without jarring the carbon granules.
The time of each rotation cycle should lie within the range of 2–12 s.
The final handset position should be 45 degrees face-up for all transmission testing, i.e., send, receive,
sidetone, overall.
NOTE—The axis of rotation for conditioning may be arbitrarily located with respect to the transmitter axis. In practice, one
orientation that provides the proper motion for many existing telephone sets is to have the axis of rotation coaxial with the axis
of the mouth simulator.
The performance of existing types of handset receivers is independent of the position (vertical,
horizontal face-up, or down) of the handset, but carbon transmitter resistance may affect receiving
output. In this case, the conditioning procedure in this annex should be followed.
Annex E
(normative)
Hoth noise can be described as acoustic random noise that has a power density spectrum
corresponding to that specified in Table E.1. The spectrum of Hoth noise is designed to simulate
typical ambient room noise over time.
Table E.1 gives the spectrum density adjusted in level to produce a reading of 50 dBA. Figure E.1
shows a plot of this spectrum. The shape of the spectrum is independent of level: for other overall
levels, every 1/3 octave band shifts by the same amount.
At low frequencies, sound levels are somewhat difficult to control due to both the size of the test
chamber, and the introduction of external noise (air-conditioning/heating, etc.). The test chamber
should be designed to minimize undesirable low frequency sound levels.
For optimum ambient noise simulation in the test chamber, it is best to have a diffuse source for Hoth
noise. This can best be achieved by having somewhat reflective walls, and multiple sound sources. A
compromise can be made with either the room, or the number of sound sources.
Annex F
(normative)
Test signals
F.1 General
The test signal should place the telephone in a well-defined, reproducible state for the period of the
measurement. It should ensure that the transfer function of the unit remains stable during the
measurement period, and yet provide a suitable signal for the specific measurement. The choice of
the signal will be a balance between one that correctly stimulates the processing algorithms in the
telephone, and one that is suitable for the specific measurement.
Unless otherwise stated, test signal levels are specified as long-term rms levels during at least one
complete period of the active part of the signal.
The long-term rms level of artificial voices (F.6.1.1), speech-like signals with pauses less than 20 ms,
and signals with sinusoidal or pseudo-random modulation (F.3.2, F.3.3, and F.6.1.3) shall be
measured during at least one complete cycle of the modulation pattern.
The level of signals with square-wave modulation (F.3.1) may be measured during the entire signal and
then corrected to account for the duty cycle. For example, a 250 ms ‘‘on’’ period followed by a 150 ms
‘‘off’’ period would have a duty cycle of 5/8, which corresponds to 2.04 dB. A long-term rms
measurement of such a signal, including the pauses, would underestimate the active level by 2.04 dB.
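The duty-cycle correction in the example above can be checked with a few lines of Python. This is a minimal sketch; the function name and the measured level are hypothetical.

```python
import math

def duty_cycle_correction_db(on_ms, off_ms):
    """Return the amount (dB) by which a long-term rms measurement that
    includes the pauses underestimates the level of the active part."""
    duty = on_ms / (on_ms + off_ms)
    return -10.0 * math.log10(duty)

correction = duty_cycle_correction_db(250.0, 150.0)  # duty cycle 5/8

# Hypothetical long-term rms level measured over the whole signal, in dBV.
long_term_level = -20.0
active_level = long_term_level + correction
```

For the 250 ms / 150 ms pattern the correction evaluates to about 2.04 dB, in agreement with the figure quoted in the text.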
The level of random signals (F.5) shall be measured for long enough (large enough time-bandwidth
product) to ensure that the measurement error (one standard deviation) is 0.5 dB or less.
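The relationship between time-bandwidth product and measurement error can be estimated as follows. The sketch assumes Gaussian noise and uses the common approximation that the normalized standard error of a mean-square estimate is 1/sqrt(BT), where B is the analysis bandwidth and T the averaging time. This approximation, the function names, and the 25 Hz example bandwidth are assumptions of the sketch, not requirements of this standard.

```python
import math

def level_error_db(bandwidth_hz, averaging_time_s):
    """Approximate +1 standard deviation error (dB) of a level estimate."""
    eps = 1.0 / math.sqrt(bandwidth_hz * averaging_time_s)
    return 10.0 * math.log10(1.0 + eps)

def min_averaging_time_s(bandwidth_hz, max_error_db=0.5):
    """Shortest averaging time that keeps the error at or below max_error_db."""
    eps_max = 10.0 ** (max_error_db / 10.0) - 1.0
    return 1.0 / (bandwidth_hz * eps_max ** 2)

# Example: a 25 Hz analysis band needs a time-bandwidth product of roughly 67.
t_min = min_averaging_time_s(25.0)
bt_product = 25.0 * t_min
```

Narrower analysis bands need proportionally longer averaging times to reach the same error.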
The level of speech-like signals with pauses (F.6.1.2, F.6.2, and F.6.3) shall be measured with a speech
voltmeter or other algorithm which meets the specifications of ITU-T Recommendation P.56-1993,
Method B.
The level of compound signals (F.7) shall be measured according to the principles of this clause.
The details vary according to the exact signal. Clause F.7 offers some additional guidelines.
F.2 Classifications
The various types of signals are divided into several groups, as discussed below. The classical
measurement signals can be separated into deterministic signals and continuous random signals. More
complex random signals include modulated random signals and speech-like signals that characterize
human speech. Finally, there are compound signals composed of two sources: one for biasing the unit
into a stable state, and the other being the actual test signal itself.
Several types of modulation may be applied to deterministic or random signals. This is done in order
to approximate the syllabic rhythm of real speech.
Test signals may be modulated in various ways to correctly stimulate a telephone, depending on the
signal processing actually used in the phone. For example, a modulated noise signal is often an
appropriate stimulus for a send circuit with a noise-guard feature. In the presence of a continuous
signal over a few hundred milliseconds in duration, the noise-guard process reduces gain substantially.
On the other hand, a continuous noise signal is often an appropriate stimulus for a receive circuit with
automatic gain control (AGC).
Square-wave modulation is an on–off pattern. The recommended pattern is 250 ms ‘‘on’’ and 150 ms
‘‘off,’’ ±10 ms. This pattern is common in many telephone testing methods. It is close to the
modulation rate of real speech. Other timing patterns may also be used.
In some cases, a periodic pulse pattern of this type will not correctly activate the telephone circuit. In
such cases, a randomly varied pulse pattern may be used. The average ‘‘on’’ and ‘‘off ’’ times should
approximate 250 ms and 150 ms respectively.
With this type of modulation, all measurements are to be performed during the ‘‘on’’ part of the
pattern. For other types of modulation, the signal is to be measured during the entire presentation time.
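As an illustration of square-wave modulation and of measuring only during the ‘‘on’’ part, the following Python sketch gates a sine carrier with the recommended 250 ms / 150 ms pattern. The sample rate, carrier frequency, and function names are hypothetical choices for the example.

```python
import math

def square_wave_gate(n_samples, sample_rate_hz, on_s=0.250, off_s=0.150):
    """Return a list of 1.0/0.0 gate values for a periodic on-off pattern."""
    on_n = int(round(on_s * sample_rate_hz))
    off_n = int(round(off_s * sample_rate_hz))
    period_n = on_n + off_n
    return [1.0 if (i % period_n) < on_n else 0.0 for i in range(n_samples)]

def on_part_rms(signal, gate):
    """rms of the signal computed only over samples where the gate is on."""
    on_samples = [s for s, g in zip(signal, gate) if g > 0.0]
    return math.sqrt(sum(s * s for s in on_samples) / len(on_samples))

fs = 8000                      # hypothetical sample rate, Hz
n = int(0.8 * fs)              # two complete 400 ms periods
gate = square_wave_gate(n, fs)
carrier = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]
burst = [c * g for c, g in zip(carrier, gate)]

active_rms = on_part_rms(burst, gate)                      # 'on' part only
overall_rms = math.sqrt(sum(s * s for s in burst) / n)     # includes pauses
```

Working in whole sample counts avoids rounding problems at the pattern boundaries. The rms over the whole signal is lower than the ‘‘on’’ rms by the duty-cycle factor.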
Sine-wave modulation may be used to produce a simple and smooth speech amplitude envelope. The
recommended rate is 4 Hz. Modulation depth should be at least 50%, but not so great as to introduce
distortion.
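A sine-wave amplitude envelope of this kind might be generated as sketched below. The sketch assumes that modulation depth is the conventional amplitude-modulation index m, so that the envelope is 1 + m·sin(2π·f_m·t); the function name and sample rate are hypothetical.

```python
import math

def sine_modulated(carrier, sample_rate_hz, rate_hz=4.0, depth=0.5):
    """Apply a sinusoidal amplitude envelope 1 + depth*sin(2*pi*rate*t)."""
    return [
        c * (1.0 + depth * math.sin(2.0 * math.pi * rate_hz * i / sample_rate_hz))
        for i, c in enumerate(carrier)
    ]

fs = 1600                               # hypothetical sample rate, Hz
flat = [1.0] * fs                       # 1 s of constant amplitude, to expose the envelope
envelope = sine_modulated(flat, fs)     # 4 Hz rate, 50% depth
```

With a depth of 50% the envelope swings between 0.5 and 1.5 times the unmodulated amplitude, so headroom of about 3.5 dB is needed to avoid clipping.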
Pseudo-random modulation may be used to produce a relatively speech-like amplitude envelope. The
modulation spectrum should cover from approximately 1 to 10 Hz, with the center at approximately
4 Hz. The extremes of the modulation spectrum should be rolled off gradually.
Deterministic (periodic) signals can always be used to measure the frequency response of linear, time
invariant telephones. When modulated, they can be used to measure the response of telephones with
many, but not all, speech-processing features.
In addition to use in measuring the frequency response of linear, time invariant telephones, sine waves
are useful for measurements of harmonic and difference-frequency distortion. This signal can be
modulated by square-wave, sine-wave, and pseudo-random signals.
The target spectrum for sine-wave signals is flat, which means equal amplitude at all frequencies.
F.4.2 Pseudo-random
A pseudo-random signal has a periodic structure in the time domain. In the frequency domain, almost
any magnitude and phase spectrum is possible. When used with FFT types of analysis, the period of
the pseudo-random signal is generally matched in length and synchronously triggered at the start of
the analysis period. When used with an MLS analyzer, the period of the MLS signal shall be matched
to the analysis period. This signal can be modulated by square-wave, sine-wave, and pseudo-random
signals. If square-wave modulation is used, the ‘‘on’’ time shall correspond to one or more complete
period(s) of the pseudo-random signal.
The target spectrum for pseudo-random signals can be white (F.5.1), pink (F.5.2), or P.50 (F.5.3).
Random signals can be described by their statistical characteristics, such as the long-term
power spectral density and probability density functions. These signals are not periodic, but are
stationary as far as these statistical characteristics are concerned. When measuring such signals, a
sufficient number of averages should be taken to obtain a given accuracy in estimating the long-term
spectrum.
In practice, many noise generators produce pseudo-random signals, typically with a very long
period. If the period of such signals is very long compared to the analysis period, and if the analysis
period is not correlated to the generator period, then these signals can be considered random.
White noise has a constant spectral density per hertz. The amplitude distribution is typically truncated
Gaussian, with a crest factor of 12 dB ± 2 dB. This signal can be modulated by square-wave,
sine-wave, and pseudo-random signals.
The target spectrum for white noise is flat, when analyzed in fixed bandwidths. When analyzed in
constant percentage bandwidths, this is equivalent to a spectrum with band levels rising at 3 dB per
octave.
Pink noise has a power spectral density that decreases 3 dB per octave. The amplitude distribution is
typically truncated Gaussian, with a crest factor of 12 dB ± 2 dB. This signal can be modulated by
square-wave, sine-wave, and pseudo-random signals.
The target spectrum for pink noise is flat, when analyzed in constant percentage bandwidths. When
analyzed in fixed bandwidths, this is equivalent to a spectrum with band levels falling at 3 dB per
octave.
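The 3 dB per octave relationships for white and pink noise can be verified numerically. The sketch below integrates an idealized power spectral density over 1/3 octave bands; the function names and the choice of 1/3 octave bands are assumptions made for the example.

```python
import math

def band_power(f_lo_hz, f_hi_hz, kind):
    """Power in a band for an idealized spectral density.

    'white': density constant per hertz.
    'pink' : density proportional to 1/f (falls 3 dB per octave).
    """
    if kind == "white":
        return f_hi_hz - f_lo_hz
    if kind == "pink":
        return math.log(f_hi_hz / f_lo_hz)
    raise ValueError("kind must be 'white' or 'pink'")

def third_octave_level_db(f_center_hz, kind):
    """Relative level (dB) in the 1/3 octave band centered on f_center_hz."""
    f_lo = f_center_hz * 2.0 ** (-1.0 / 6.0)
    f_hi = f_center_hz * 2.0 ** (1.0 / 6.0)
    return 10.0 * math.log10(band_power(f_lo, f_hi, kind))

# Change in band level when the center frequency rises by one octave.
white_step = third_octave_level_db(2000.0, "white") - third_octave_level_db(1000.0, "white")
pink_step = third_octave_level_db(2000.0, "pink") - third_octave_level_db(1000.0, "pink")
```

In constant percentage bands the white noise level rises by about 3.01 dB per octave, while the pink noise level is unchanged.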
The spectrum of P.50 noise is the same as that of the artificial voices (F.6.1.1). The amplitude distribution is
typically truncated Gaussian, with a crest factor of 12 dB ± 2 dB. This signal can be modulated by
square-wave, sine-wave, and pseudo-random signals.
The target spectrum for P.50 noise is the column ‘‘Sound pressure level (third octave)’’ in Table 1 of
ITU-T Recommendation P.50-1999. The table can be used directly for the acoustic test spectrum at an
overall level of –4.7 dBPa. A constant can be added in all frequency bands to give other overall levels.
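Shifting a band spectrum to another overall level by adding a constant can be sketched as follows. The band levels used here are hypothetical placeholders, not the values of ITU-T Recommendation P.50-1999, Table 1, and the function names are illustrative.

```python
import math

def overall_level_db(band_levels_db):
    """Overall level obtained by summing the band powers."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in band_levels_db))

def shift_to_overall(band_levels_db, target_overall_db):
    """Add one constant to every band so that the overall level hits the target."""
    offset = target_overall_db - overall_level_db(band_levels_db)
    return [level + offset for level in band_levels_db], offset

# Hypothetical third-octave band levels, in dBPa.
bands = [-12.0, -8.0, -10.0, -15.0]
shifted, offset = shift_to_overall(bands, -10.0)   # hypothetical target overall level
```

Because the same constant is added to every band, the shape of the spectrum is preserved and only the overall level changes.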
Speech-like signals include ITU-T Recommendation P.50-1999 artificial voices, ITU-T Recommendation
P.59-1993 artificial conversational speech, simulated speech generator (SSG), as well as
synthesized and real speech signals. When long-term averaging is used, these signals place the
telephone in a well-defined reproducible state, ensure that the transfer function of the unit remains
stable, and provide a suitable signal for the specific measurement.
Typical parameters of simulated speech include long-term average spectrum, short-term spectrum,
instantaneous amplitude distribution, speech waveform structure, and the syllabic envelope.
The target spectrum for simulated speech is the column ‘‘Sound pressure level (third octave)’’ in
Table 1 of ITU-T Recommendation P.50-1999. The table can be used directly for the acoustic test
spectrum at an overall level of –4.7 dBPa. A constant can be added in all frequency bands to give other
overall levels.
NOTE—The P.50 signals published on the CD-ROM will have to be equalized to meet the target spectrum.
At least one complete segment of both male and female artificial voices should be used as the standard
test signal. The male and female artificial voices are each approximately 10.5 s long. The combined test
signal should consist of the male followed by the female artificial voices, resulting in a signal length of
approximately 21 s. No gap in the combined test signal should exceed 100 ms.
NOTE—The SSG signal published on the CD-ROM will have to be equalized to meet the target spectrum.
Speech-like signals may be produced using a digital processing technique rather than applying one of
the signal sources described above. Conversational speech can be sampled, digitized, processed, and
reproduced as synthesized speech. It also may be created from complex multiple tones that simulate
the talk-spurts, pauses, and activity factors associated with speech characteristics.
The target spectrum for synthesized speech is the original spectrum produced by the synthesis
procedure.
Speech-like signals are not limited to signal sources or synthesized digital processing, but also may
include real speech signals. This is often done by recording conversational speech, preferably in a
digital format, to avoid signal degradation with use. These real speech recordings are then reproduced
using a playback device as the signal source. See F.10, ‘‘Test signals published on CD-ROM,’’ for one
source of these and other test signals.
The target spectrum for real speech is the original spectrum of the recorded speech.
The signals described above rely on one signal source to place the telephone in a well-defined
reproducible state, ensure that the transfer function of the unit remains stable, and provide a suitable
signal for the specific measurement. By applying two signal sources, one can be used specifically for
‘‘biasing’’ the unit into a stable, reproducible state, while the other is the actual test signal required for
measurements. These compound signals include those where the two sources are applied in sequence,
and those where both sources are applied simultaneously.
Compound test signals can provide extra test flexibility and solve problems which are difficult or
impossible using simple test signals. The bias signal can be a signal that, by itself, is unsuitable or very
inconvenient for the actual measurement. The measurement signal can be a signal that, by itself, is
unsuitable as a bias signal.
If desired, the measurement signal can be presented so as not to have a substantial effect on the action
of the bias signal. This can be done by adjusting the temporal and/or level relationships between the
two signals. The bias signal can be changed to put the telephone in different states with minor or even
no change in the measurement signal.
This class of test signals is characterized by the separation of the bias and analysis signals in time. The
bias signal is presented until the telephone is in a stable state. Once a stable state is reached,
the appropriate analysis signal is applied and a measurement is performed. The analysis should
be completed while the telephone is still in its stable state. The CSS is one example of this type of
signal.
The composite source signal (CSS) is a compound signal using a voiced signal to simulate the voice
properties, followed by a noise-like signal for measuring the transfer functions, and an inserted pause
to provide amplitude modulation. The noise-like signal has either a flat or speech shaped power
density spectrum. It has the advantage of short measurement periods and duplex operation where,
using an uncorrelated double-talk signal, the test signals can be applied from the talking and listening
directions at the same time. See ITU-T Recommendation P.501-2000 for the definition of this signal.
See Clause F.10 for one source of this and other test signals.
If the signal includes pauses, calibration and measurements are to be performed during the ‘‘on’’ part
of the pattern.
The target spectrum of the voiced part of this signal is defined in ITU-T Recommendation P.501-2000.
The noise part may have various target spectra, according to the application. The target spectrum may
be white noise (F.5.1), pink noise (F.5.2), or P.50 noise (F.5.3).
This class of test signals is characterized by presentation of the bias and analysis signals at the same
time. Some conditioning of the telephone may be required before beginning the analysis. The bias and
analysis signals shall be separable by the analysis method. A synchronous analysis method is usually
required. The P.50 Burst with Sine Sweep is one example of this type of signal.
The target spectrum of the total signal (bias plus analysis) is the same as the target spectrum of the bias
signal.
The analysis part of the signal (for example, the TDS sweep) may have to be shaped to fulfill this
requirement, but there is no other requirement on the spectrum of the analysis part of the signal.
The target spectrum is the column ‘‘Sound pressure level (third octave)’’ in Table 1 of ITU-T
Recommendation P.50-1999. The table can be used directly for the acoustic test spectrum at an overall
level of –4.7 dBPa. A constant can be added in all frequency bands to give other overall levels.
The target spectrum is the column ‘‘Sound pressure level (third octave)’’ in Table 1 of ITU-T
Recommendation P.50-1999. The table can be used directly for the acoustic test spectrum at an overall
level of –4.7 dBPa. A constant can be added in all frequency bands to give other overall levels.
The target spectrum is the original spectrum produced by the synthesis procedure.
The target spectrum is the column ‘‘Sound pressure level (third octave)’’ in Table 1 of ITU-T
Recommendation P.50-1999. The table can be used directly for the acoustic test spectrum at an overall
level of –4.7 dBPa. A constant can be added in all frequency bands to give other overall levels.
The target spectrum is the column ‘‘Sound pressure level (third octave)’’ in Table 1 of ITU-T
Recommendation P.50-1999. The table can be used directly for the acoustic test spectrum at an overall
level of –4.7 dBPa. A constant can be added in all frequency bands to give other overall levels.
The target spectrum is the original spectrum produced by the synthesis procedure.
In general, the test signals and analysis methods in this standard cover a frequency range from
approximately 100–8500 Hz. The exact range depends on the analysis method, and perhaps also the
test signal (see G.6). The lower limit is the practical lower limit of the mouth simulator, while the upper
limit is determined by the range of the DRP-to-ERP translation curve (Annex C). For digital phones,
the exact range may also be determined by the codec.
Some signals, such as SSG (F.6.1.3), are defined only for a smaller bandwidth, and cannot be used
outside their defined range.
Table F.1 defines the bandwidth and maximum analysis interval for the various test signals identified
in Annex F. Signals may be analyzed with finer resolution if desired. These parameters shall be applied
to both the calibration and test procedures.
F.5.1                 White noise (a)                         White            25 Hz bands          1/12 octave bands
F.5.2                 Pink noise (a)                          Pink             1/12 octave bands    25 Hz bands
F.6.1.1               P.50 Artificial voices                  P.50             1/12 octave bands    25 Hz bands
F.6.1.2               P.59 Artificial conversational speech   P.50             1/12 octave bands    25 Hz bands
F.6.1.3               Simulated speech generator              P.50             1/12 octave bands    25 Hz bands
F.6.2                 Synthesized speech                      As synthesized   1/12 octave bands    25 Hz bands
F.6.3                 Real speech                             As recorded      1/12 octave bands    25 Hz bands
F.7.1                 Composite source signal                 See F.7.1        25 Hz bands          1/12 octave bands
F.7.2.1 to F.7.2.5    TDS Sweep with bias                     Same as bias     50 Hz bands          1/12 octave bands
F.7.2.6 to F.7.2.10   Pseudo-random noise with bias           Same as bias     50 Hz bands          1/12 octave bands
(a) Modulation may be required depending on the application. See F.3.
The artificial voices according to ITU-T Recommendation P.50-1999, as well as a large speech
database, are included on a CD-ROM published as ITU-T Recommendation P.50-1998, Appendix 1,
Test Signals. Other specialized signals, including the composite source signal (CSS), are published on a
CD-ROM included with ITU-T Recommendation P.501-2000.
Table F.2 identifies the various test signals described previously. The corresponding test methods and
conditions are shown for each signal. The various method classifications are described in Annex G.
Column headings: Sec. ref.; Test method; Signal type (Deterministic signal, Random signal, Speech-like signal, Compound signal); Anechoic chamber needed?
Sine-based
5.4.1   Discrete tone             R N N N N N N N Y Y
5.4.2   Swept sine                R N N N N N N Y Y Y
5.4.3   Time delay spectrometry   R N N N N N N Y Y Y (a)
(a) Anechoic chamber is required unless simulated free field methods are used.
Annex G
(normative)
Analysis methods
G.1 General
Various analysis techniques are available for electroacoustic measurements. Each technique has
inherent advantages and limitations. A particular method can be better suited for use with certain
stimulus signals. Certain methods, in fact, rely upon the use of a synchronized or otherwise unique
stimulus signal. This clause describes the most common techniques and their application to
measurements of analog and digital telephones using handsets and headsets.
The recommended method for calculating frequency response is based on dividing one rms spectrum
by another. See Equation (1) as an example in the case of receive frequency response. This method is
satisfactory and accurate in the great majority of cases. It applies to methods that measure an rms
spectrum such as single-channel FFT (G.2.2) and real-time filter analysis (G.3). It also applies to
stepped or swept sine methods (G.4.1 and G.4.2), if an rms detector insensitive to jitter is used.
In some cases it may be useful or desirable to use an alternative measurement method which calculates
frequency response by use of a cross-spectrum or similar process. These methods include dual-channel
FFT (G.2.1), MLS (G.2.3), TDS (G.4.3), and stepped or swept sine methods (G.4.1 and G.4.2) when
a quadrature or similar detector is used. Such methods can sometimes reduce measurement time,
reduce the influence of noise, or offer other benefits.
Cross-spectrum and related methods are usually sensitive to jitter or other unstable phase or time
relationships that can exist between input and output of a telephone with some kinds of digital
processing. The test report shall include sufficient justification for use of cross-spectrum methods, such
as demonstrating that the delay and phase response are repeatable.
If cross-spectrum or related methods are used, the receive frequency response in dB, HR( f ), is given by
Equation (G.1):
$$H_R(f) = 20 \log \frac{G_{(\mathrm{RETP})(\mathrm{ERP})}(f)}{G_{\mathrm{RETP}}(f)} \quad \text{in dB (Pa/V)} \qquad \text{(G.1)}$$
where
G(RETP)(ERP) ( f ) is the cross spectrum between RETP and ERP,
GRETP ( f ) is the rms spectrum at RETP.
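A cross-spectrum response calculation of the kind described above might look like the following Python sketch. It rests on several assumptions that are not taken from this standard: the input spectrum term is read as the input autospectrum (the conventional estimator of cross spectrum divided by input autospectrum), a single record is used without averaging, and the device under test is simulated as a pure gain of 0.5 with a circular delay of two samples. All function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform (adequate for short records)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def response_db_cross_spectrum(x, y):
    """Return 20*log10|G_xy / G_xx| for bins 1 .. N/2 - 1.

    x: samples at the input terminal (for example, RETP).
    y: samples at the output terminal (for example, ERP).
    """
    big_x, big_y = dft(x), dft(y)
    result = []
    for k in range(1, len(x) // 2):
        g_xx = (big_x[k].conjugate() * big_x[k]).real   # input autospectrum
        g_xy = big_x[k].conjugate() * big_y[k]          # cross spectrum
        result.append(20.0 * math.log10(abs(g_xy) / g_xx))
    return result

n = 32
stimulus = [0.0] * n
stimulus[3] = 1.0                     # impulse: equal energy in every bin
# Simulated device: gain 0.5 and a circular delay of 2 samples.
output = [0.5 * stimulus[(t - 2) % n] for t in range(n)]

h_db = response_db_cross_spectrum(stimulus, output)
```

The magnitude is unaffected by the fixed delay, which only changes the phase of the cross spectrum. A delay that varies between records (jitter) would reduce the averaged cross spectrum, which is why these methods need a repeatable delay and phase response.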
The Fourier transform is a mathematical operation that decomposes a time signal into its complex
frequency components. The inverse Fourier transform reverses the process, reconstructing the time
signal from its Fourier components. By applying the FFT algorithm to a sampled time signal, a
spectrum can be computed. This is a parallel analysis resulting in a narrow-band (constant bandwidth)
frequency spectrum. Low frequency resolution can be limited. Here, blocks of time data are analyzed.
Care should be taken in the proper windowing of the data (i.e., Hanning, flat-top, etc.), overlap
processing, and the number of averages, to ensure an accurate analysis. The record length and window
type determine the frequency resolution. The frequency range and time resolution are inversely related.
Because the data is discrete, the highest frequency that can be measured is determined by the sampling
frequency. Some degree of data processing is usually available in both the time domain and in the
frequency domain. An FFT analyzer can also have a zoom capability, for increased frequency
resolution across a restricted bandwidth.
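The basic relationships between sampling frequency, record length, and resolution can be illustrated as follows. The sample rate and record length are hypothetical, and the effective noise bandwidth of 1.5 bins is the usual figure for a Hanning window; other windows have other values.

```python
def fft_parameters(sample_rate_hz, record_length, enbw_bins=1.0):
    """Return basic FFT analysis parameters.

    enbw_bins: effective noise bandwidth of the window in bins
               (1.0 for a rectangular window, 1.5 for a Hanning window).
    """
    bin_spacing = sample_rate_hz / record_length
    return {
        "record_duration_s": record_length / sample_rate_hz,
        "bin_spacing_hz": bin_spacing,
        "effective_bandwidth_hz": bin_spacing * enbw_bins,
        "max_frequency_hz": sample_rate_hz / 2.0,   # Nyquist limit
    }

# Hypothetical analyzer setup: 48 kHz sampling, 4096-point record, Hanning window.
params = fft_parameters(48000.0, 4096, enbw_bins=1.5)
```

Doubling the record length halves the bin spacing but doubles the time needed to acquire each record, which is the trade-off between frequency resolution and time resolution.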
When analyzing a periodic signal such as pseudo-random noise or a segment of real or artificial speech
or artificial voices, the averaging time shall be at least one full period of the signal. Averaging time
shall be stated for all measurements.
A dual-channel FFT analyzer performs simultaneous measurements of the telephone input and output.
This type of measurement is optimized for system analysis. Most FFT analyzers calculate the frequency
response from the cross spectrum and either the input or output autospectrum. In this way, different
response estimators can be used to minimize noise at the system input or output. This also enables
computation of other functions such as coherence, phase, group delay, coherent power, and noncoherent
power. Extensive data processing is normally available in both the time and frequency domains. It is
possible to improve measurement S/N by averaging and delay compensation. Special care is needed
when applying this method to telephones that are time variant or employ nonlinear signal processing.
Without cross-spectrum capabilities, the system input and output are measured separately. These
response measurements require control of the excitation spectrum and/or a two-pass analysis.
Therefore, measurement S/N due to noise at the system input or output is not improved. Any post-
processing features available will apply only to the directly measured spectra, not to the response
function. Special care is needed when applying this method to telephones that are time variant or
employ nonlinear signal processing. This method requires the stimulus to be stable between
measurement of the system input (or calibration) and measurement of the system output.
The MLS technique employs a large (typically 16 K) well-defined pseudo-random pulse excitation. The
length of the excitation signal is equal to the correlation length, eliminating leakage. The MLS
excitation and analysis are inherently synchronized. The received response signal is cross-correlated
with the MLS signal, typically using a fast Hadamard transform, to obtain the time response. An FFT
is then used to obtain the frequency response. This also enables computation of coherence, phase,
group delay, coherent power, and noncoherent power. Some nonlinear analysis capabilities and
post-processing are available. This method can improve measurement S/N.
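The correlation property that makes MLS analysis work can be shown with a very short sequence. This is a sketch only: it uses an order-4 sequence (15 samples) rather than the typical 16 K length, the feedback taps (4, 3) are an assumption chosen to give a maximum-length sequence for that order, and a direct circular correlation stands in for the fast Hadamard transform.

```python
def mls_bits(order=4, taps=(4, 3)):
    """Generate one period (2**order - 1 bits) from a shift register with XOR feedback."""
    state = [1] * order
    bits = []
    for _ in range(2 ** order - 1):
        bits.append(state[-1])             # output the last register stage
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]     # XOR of the tapped stages
        state = [feedback] + state[:-1]    # shift the register by one stage
    return bits


def circular_correlation(x, y):
    """Circular cross-correlation of two equal-length sequences."""
    n = len(x)
    return [sum(x[i] * y[(i + lag) % n] for i in range(n)) for lag in range(n)]


excitation = [1 - 2 * b for b in mls_bits()]            # map bits 0/1 to +1/-1
autocorrelation = circular_correlation(excitation, excitation)

# A hypothetical system that only delays the excitation by 3 samples.
response = [excitation[(i - 3) % len(excitation)] for i in range(len(excitation))]
time_response = circular_correlation(excitation, response)
print(time_response.index(max(time_response)))          # sample index of the peak
```

The autocorrelation is a single peak with a small constant offset at all other lags, so the cross-correlation with the received signal approximates the time response of the system.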
Real-time analysis is essentially a parallel filter bank, usually implemented digitally. This results in a
constant percentage (logarithmic) frequency resolution. The analysis is carried out in parallel and the
signal is processed continuously. The filters shall be 1/12 or 1/24 octave filters that comply with
ANSI S1.11-1986. The statistical accuracy of real-time measurements is usually determined by the
averaging time or the confidence level. This type of analysis is optimized for single-port acoustical
measurements (i.e., no control of the system input).
When analyzing a periodic signal such as pseudo-random noise or a segment of real or artificial speech
or artificial voices, the averaging time shall be at least one full period of the signal. Averaging time
shall be stated for all measurements.
Two channels enable simultaneous measurement of the system input and output, for direct
computation of the frequency response (output/input). This method does provide limited harmonic
distortion measurement capability, and some direct post-processing of the data.
A single-channel real-time analyzer requires separate measurements of the system input and output.
Response measurements will require control of the excitation spectrum and/or a two-pass analysis.
This method requires the stimulus to be stable between measurement of the system input (or
calibration) and measurement of the system output. This method does provide limited harmonic
distortion measurement capability, and some direct post-processing of the data.
Sinusoidal excitation provides a high measurement S/N ratio and high degree of frequency selectivity.
The analysis is performed serially using either a quadrature or rms detector. This often includes a
tracking filter for noise suppression and selective measurements of distortion components. The
quadrature detector multiplies the response signal by a synchronized (and appropriately delayed) sine
and cosine signal. This enables measurement of the complex, steady-state frequency response (i.e.,
magnitude and phase, real and imaginary parts). Complex averaging algorithms can be employed to
improve the measurement S/N ratio. The use of an rms detector requires a separate phase meter to
obtain phase information.
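The quadrature detector described above can be sketched as follows. This is a minimal illustration, assuming the record spans a whole number of periods of the test tone and that no delay compensation is needed; the function name and example values are hypothetical.

```python
import math


def quadrature_detect(samples, frequency_hz, sample_rate_hz):
    """Return (magnitude, phase in radians) of the component at frequency_hz."""
    n = len(samples)
    in_phase = 0.0
    quadrature = 0.0
    for k, x in enumerate(samples):
        angle = 2.0 * math.pi * frequency_hz * k / sample_rate_hz
        in_phase += x * math.sin(angle)      # product with the reference sine
        quadrature += x * math.cos(angle)    # product with the reference cosine
    in_phase *= 2.0 / n                      # averaging acts as the low-pass filter
    quadrature *= 2.0 / n
    return math.hypot(in_phase, quadrature), math.atan2(quadrature, in_phase)


fs, f0 = 8000.0, 1000.0
response = [0.5 * math.sin(2.0 * math.pi * f0 * k / fs + 0.3) for k in range(800)]
magnitude, phase = quadrature_detect(response, f0, fs)
```

Because both products are averaged, components at other frequencies are suppressed, which is the source of the frequency selectivity mentioned above.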
Discrete tone testing allows a measurement to be performed at precisely defined frequencies. These
frequencies can be at the ANSI/ISO preferred numbers or in other user-defined formats. See ISO
3:1973 and ANSI S1.6-1984 for preferred number series. The actual frequency interval (not resolution)
used in the measurement shall be stated. In addition to frequency response measurements,
intermodulation and difference frequency distortion testing are often carried out using this method.
Additionally, phase and group delay information is provided. These tests normally require an anechoic
room, although tone-burst techniques can be used with gating to obtain simulated free field results.
Measurement S/N can be improved using complex averaging.
This technique is similar to discrete tone testing, but instead employs a continuous linear or
logarithmic sine sweep excitation. The measurement is typically slow due to sweep rate limitations.
This method is well suited for frequency response and harmonic distortion measurements. An
anechoic room is generally required, although tone-burst techniques can be used with gating to obtain
simulated free field results.
TDS, as classically implemented, utilizes a linearly swept sine excitation signal that is synchronized to
the measuring instrument. With this signal, a one-to-one relationship is established between time and
frequency and simulated free field measurements can be performed. The measured response signal is
multiplied with an appropriately delayed version of the excitation. This, in turn, is fed to a selectable
constant bandwidth tracking filter and a detector.
In practice, TDS can be implemented by many modern techniques. For example, post-processing
algorithms can substitute for the tracking filter. TDS can also be implemented using a logarithmic
sweep followed by convolution.
Like other simulated free field techniques, the effective time window determines frequency resolution
and the lowest valid frequency. The time window is determined by the time between the arrival of the
direct sound and the arrival of the first reflection.
The TDS method also is well suited for harmonic distortion, and provides phase, group delay, and
time response information. This method may be implemented using an analog or digital process. In the
latter case, refinements and corrections for deterministic errors in the measurement process may be
incorporated. It is possible to improve measurement S/N through complex averaging or delay
compensation. This method allows post-processing of the data. Special care is needed when applying
this method to telephones that are time variant or employ nonlinear signal processing.
Simulated free field techniques employ some method of time windowing the measured response. Time
windowing enables the direct sound in a measurement to be separated from its reflections, producing a
simulated free field condition. In this case, the frequency resolution is the reciprocal of the applied time
window. Both gating and post-process windowing can be used on measurements in ordinary rooms.
As discussed previously, MLS and TDS are inherently simulated free field techniques. Dual-channel
FFT analysis can also be used. The time windowing may be performed as a part of the data collection
or as a post-processing window operation.
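The link between the usable time window and the frequency resolution can be illustrated with a short calculation. This is a sketch under stated assumptions: the path lengths are hypothetical, the speed of sound is taken as 343 m/s, and the window is treated as rectangular.

```python
def longest_window_s(direct_path_m, reflected_path_m, speed_of_sound_m_s=343.0):
    """Time between arrival of the direct sound and arrival of the first reflection."""
    return (reflected_path_m - direct_path_m) / speed_of_sound_m_s


def window_resolution_hz(window_s):
    """Frequency resolution as the reciprocal of the applied time window."""
    return 1.0 / window_s


window = longest_window_s(direct_path_m=0.5, reflected_path_m=2.215)
print(f"{window * 1000:.1f} ms window, {window_resolution_hz(window):.0f} Hz resolution")
```

A reflecting surface closer to the device shortens the window and raises the lowest valid frequency accordingly.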
In general, the test signals and analysis methods in this standard cover a frequency range from
approximately 100 Hz to 8500 Hz. The lower limit is determined by the mouth simulator, whose practical
lower limit is approximately 100 Hz for general use. The upper limit is determined by the range of the
DRP-to-ERP translation curve (Annex C). These limits may be somewhat modified when using
standardized test signals that specify a particular bandwidth. The exact range also depends on the
analysis method. For measurements of frequency response, the analysis should cover the same
bandwidth as the test signal.
For example, if artificial voices (F.6.1.1) are analyzed in 1/12th octave bands, the range should include
the bands centered from 91.7 through 7286 Hz. In ITU-T Recommendation P.50-1999, the test signal
is defined for the 1/3 octave bands from 100 through 8000 Hz. The corresponding 1/12th octave bands
extend from 91.7 through 8660 Hz. However, the version of artificial voices currently published in
ITU-T Recommendation P.50-1999, Appendix 1 is sampled at 16 kHz, thereby limiting the useful
upper band to 7286 Hz. Other implementations of the artificial voices would have to be evaluated on a
case-by-case basis with respect to sampling rate and other characteristics.
If signals such as CSS (F.7.1) are analyzed in linear format, the range includes the lowest band at
approximately 100 Hz, through approximately 8500 Hz.
Digital telephones with a sampling rate of 8 kHz have an upper cutoff frequency just below 4 kHz.
When testing telephones known to be of this type, the high frequency limit of test signals should be
reconsidered. When using artificial voices or any signal with a speech-like spectrum, the full bandwidth
should be used (up to approximately 8500 Hz). Artificial voices and other speech-like signals have little
long-term power above 4 kHz, so only a few hundredths of a dB total stimulus power is lost due to a
cutoff slightly below 4 kHz. However, when using test signals with a relatively flat or pink spectrum
(F.4 or F.5), the test signal should only extend to approximately 4 kHz.
The standard frequency pattern for sinusoidal test signals is the R40 sequence. (See Table G.1 and
Table G.2, as well as ISO 3:1973 and ANSI S1.6-1984.) However, when testing digital devices, or
devices that have any internal digital processing, some of these frequencies should be adjusted by up
to 1% so that they do not coincide with the sampling frequency, typically 8000 Hz, or submultiples
thereof. An example would be to use 1004 Hz instead of 1000 Hz as a test tone.
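One way to flag test frequencies that need adjusting is sketched below. The function names, the tolerance, and the fixed 0.4% shift (chosen only because it reproduces the 1004 Hz example) are assumptions; any shift of up to 1% that avoids the coincidence would serve.

```python
def coincides_with_sampling(frequency_hz, sampling_hz=8000.0, tolerance=1e-6):
    """True if frequency_hz is the sampling frequency or an integer submultiple of it."""
    ratio = sampling_hz / frequency_hz
    return ratio >= 1.0 - tolerance and abs(ratio - round(ratio)) < tolerance


def adjusted_tone(frequency_hz, sampling_hz=8000.0, shift=0.004):
    """Shift a coinciding frequency upward by a small fraction (assumed 0.4%)."""
    if coincides_with_sampling(frequency_hz, sampling_hz):
        return frequency_hz * (1.0 + shift)
    return frequency_hz


for f in (800.0, 1000.0, 1250.0, 3150.0):
    print(f, adjusted_tone(f))
```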
The R10 frequency pattern is used for calculating loudness ratings. (See Annex H.)
Constant-percentage bandwidth filters with 1/3 or 1/12 octave bandwidth have center frequencies and
passband upper and lower limit frequencies that are calculated by specific equations. See Table G.1
and Table G.2 for a complete list of 1/3 and 1/12 octave band frequencies within the scope of this
standard.
Exact center frequencies of 1/3 octave filters can be calculated according to Equation (G.2). The
frequencies are actually based on 10 bands per decade.
$$ f = 10^{n/10} \tag{G.2} $$
where
$n$ is the band number,
$f$ is the frequency, in Hz.
The 1/3 octave passband upper and lower limit frequencies can be calculated according to
Equation (G.3).
For 1/12 octave bands, the formulas are similar, except the centers are shifted one-half a band. This is
done so that four 1/12 octave bands cover exactly the same range as the 1/3 octave band encompassing
them. The frequencies are actually based on 40 bands per decade, according to Equation (G.4).
$$ f = 10^{(n + 0.5)/40} \tag{G.4} $$
Example: 1/12 octave band number 80 has a center frequency of 102.92 Hz.
The 1/12 octave passband upper and lower limit frequencies can be calculated according to
Equation (G.5).
(G.5)
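The center-frequency formulas can be checked numerically. The sketch below implements Equations (G.2) and (G.4) directly. The band-edge functions are an assumption, since Equations (G.3) and (G.5) are not reproduced here: they place the edges half a band step either side of the center, which is the choice that makes four 1/12 octave bands tile one 1/3 octave band as described above.

```python
def third_octave_center(n):
    """Equation (G.2): 10 bands per decade."""
    return 10.0 ** (n / 10.0)


def twelfth_octave_center(n):
    """Equation (G.4): 40 bands per decade, centers shifted by half a band."""
    return 10.0 ** ((n + 0.5) / 40.0)


def third_octave_edges(n):
    """Assumed edges: half a band step below and above the center."""
    return 10.0 ** ((n - 0.5) / 10.0), 10.0 ** ((n + 0.5) / 10.0)


def twelfth_octave_edges(n):
    """Assumed edges: half a band step below and above the center."""
    return 10.0 ** (n / 40.0), 10.0 ** ((n + 1.0) / 40.0)


print(round(twelfth_octave_center(80), 2))   # the example above: band number 80
print(third_octave_edges(20))                # the 100 Hz 1/3 octave band
print(twelfth_octave_edges(78)[0], twelfth_octave_edges(81)[1])
```

Under this assumption, 1/12 octave bands 78 through 81 span the same range as 1/3 octave band 20, and band 78 has the 91.7 Hz center quoted earlier for the lowest band.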
R40 (Hz)    1/12 octave center (Hz)    1/3 octave exact center (Hz)    1/3 octave nominal, R10 (Hz)
-           344.75                     -                               -
355         -                          -                               -
-           365.17                     -                               -
375         -                          -                               -
-           386.81                     -                               -
400         -                          398.11                          400
-           409.73                     -                               -
425         -                          -                               -
-           434.01                     -                               -
450         -                          -                               -
-           459.73                     -                               -
475         -                          -                               -
-           486.97                     -                               -
500         -                          501.19                          500
-           515.82                     -                               -
530         -                          -                               -
-           546.39                     -                               -
560         -                          -                               -
-           578.76                     -                               -
600         -                          -                               -
-           613.06                     -                               -
630         -                          630.96                          630
-           649.38                     -                               -
670         -                          -                               -
-           687.86                     -                               -
710         -                          -                               -
-           728.62                     -                               -
750         -                          -                               -
-           771.79                     -                               -
800         -                          794.33                          800
-           817.52                     -                               -
850         -                          -                               -
-           865.96                     -                               -
900         -                          -                               -
-           917.28                     -                               -
950         -                          -                               -
-           971.63                     -                               -
1000        -                          1000.00                         1000
-           1029.20                    -                               -
1060        -                          -                               -
-           1090.18                    -                               -
1120        -                          -                               -
Annex H
(normative)
ISO R10 format data is required for calculating loudness ratings according to ITU-T Recommendation
P.79-1999. Measured frequency responses (receive, send, sidetone, etc.) should be directly
converted to R10 format for this purpose.
Although it has been common practice to remeasure at the R10 frequencies only for the purpose of
calculating loudness ratings, this practice is neither necessary nor desirable. The conversion procedure
in this annex makes remeasurement unnecessary. Measurement at the R10 points is not always
desirable, since undersampling can occur. While this is not likely to introduce much error when the
frequency response is smooth, when the frequency response is irregular the undersampling error can be
larger. Irregular frequency response is not generally desirable, but it may be more likely in devices
running digital signal processing than in some types of simple analog systems.
Leakage correction is not used for Type 2 and Type 3 ear simulators. Historically, a leakage correction
was used to calculate loudness ratings on a Type 1 ear simulator.
Measurements may be performed in various frequency formats, depending upon the analysis method
employed. Response measurements can contain numerous peaks and dips. This conversion, therefore,
should be performed using ‘‘band-averaging.’’ The measured points within a particular 1/3 octave
band are ‘‘power averaged’’ according to Equation (H.1), and assigned to the R10 frequency at the
band center.
where
For the lowest frequency within the band, i = 1. For the highest included frequency, i = N. The 1/3
octave passband limit frequencies can be calculated according to Equation (H.2):
Example: For the 100 Hz band, the band number = 20; for the 125 Hz band, the band number = 21,
etc. See also G.7.
For measured data at frequencies coinciding with a band-edge frequency (i = 1 and/or i = N), reduce
the value by 3 dB, and use that data point in both the upper and lower frequency band calculations.
For constant percentage bandwidth measurements, there will always be the same number of points for
each converted band (4 or 8, for 1/12 or 1/24 octave bands, respectively). For constant bandwidth data
(e.g., FFT) on a log frequency axis, the measurement data will appear undersampled at low
frequencies and oversampled at higher frequencies.
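The band-averaging conversion can be sketched as follows. Equations (H.1) and (H.2) are not reproduced here, so this is an illustration under assumptions rather than the normative procedure: power averaging is taken as the mean of the linear power values, the band limits are taken as half a band step either side of the center, and the function names are hypothetical.

```python
import math


def band_limits_hz(band_number):
    """Assumed 1/3 octave limits: half a band step either side of 10**(n/10)."""
    return 10.0 ** ((band_number - 0.5) / 10.0), 10.0 ** ((band_number + 0.5) / 10.0)


def band_average_db(points, band_number, edge_tolerance=1e-6):
    """Power-average measured (frequency_hz, level_db) points within one band."""
    lower, upper = band_limits_hz(band_number)
    powers = []
    for frequency_hz, level_db in points:
        on_edge = (abs(frequency_hz - lower) <= edge_tolerance * lower
                   or abs(frequency_hz - upper) <= edge_tolerance * upper)
        if on_edge:
            powers.append(10.0 ** ((level_db - 3.0) / 10.0))   # band-edge point, reduced by 3 dB
        elif lower < frequency_hz < upper:
            powers.append(10.0 ** (level_db / 10.0))
    if not powers:
        raise ValueError("no measured points fall inside this band")
    return 10.0 * math.log10(sum(powers) / len(powers))


measured = [(95.0, 0.0), (105.0, 10.0), (200.0, 50.0)]
print(round(band_average_db(measured, 20), 2))   # value assigned to the 100 Hz R10 point
```

Averaging power rather than decibel values keeps a narrow dip from dominating the converted band level.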
Annex I
(normative)
Linearity
Linearity is a measure of how frequency response changes with input level. The test consists of
measuring the relevant frequency response, but performing the measurement at several different stimulus
levels. If the telephone is linear, the frequency response should be the same regardless of the stimulus
level. Frequency responses are to be measured according to Clauses 7–9 (for example, 7.4.1).
The purpose of this method is to give a complete overview of the linearity of a device over a wide
frequency and dynamic range, all in one graph. The method is a particular combination of
measurements, post-processing and display procedures.
The stimulus intervals and frequency patterns for linearity measurements have been specified in the
body of this standard (for example, 7.4.4). These parameters have been selected to reveal typical
nonlinearities over the basic frequency and dynamic range of typical devices, without taking too much
measurement time. For additional investigation of specific behaviors, these parameters may be altered.
For example, it may be useful to use a much smaller stimulus interval, say 1 dB, for a more detailed
look at the dynamic behavior of a device. If sharp resonances are to be investigated, a denser
frequency pattern, such as 1/12th octaves or R40, might be useful.
Linearity shall be measured using the same stimulus type used to measure frequency response (send,
receive, sidetone, or overall). When using artificial voices, the linearity measurement includes the
effects of anything nonlinear, whatever the cause. Nonlinearities could be intentional or unintentional
compression or expansion, distortion of various kinds, or other nonlinear processes. The linearity
measurement shows if nonlinearity occurs, as well as the level and frequency range where it occurs. To
analyze the cause, further investigation is required. However, the common patterns are shown in the
figures in this annex.
The linearity test shall be performed at seven levels, in 5 dB intervals. Smaller intervals and/or a wider
range of levels may also be used. The reference stimulus level shall be specified.
For a linear phone measured with artificial voices, the result is seven parallel lines at levels from 0 to
−30 dB relative to the reference stimulus (see Figure I.1). If the measurement is made with sine waves,
the result is seven parallel lines at levels from −15 to +15 dB relative to the reference stimulus (see
Figure I.2). Nonlinearities are displayed as variations from the parallel lines (see Figure I.4, Figure I.5,
Figure I.6, and Figure I.7).
Each displayed curve is a relative frequency response which shows any deviations from
linearity. Each curve is displaced vertically by the amount the stimulus level differs from the
reference stimulus. The linearity information for the entire frequency and dynamic range is shown in
one graph.
Figure I.1—Linear phone measured with artificial voices
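The construction of the displayed curves can be sketched as follows. This is a minimal illustration, assuming each frequency response is available as a list of dB values on a common frequency pattern; the function name and the sample data are hypothetical.

```python
def linearity_curves(responses_db, stimulus_levels_db, reference_level_db):
    """Build display curves from frequency responses measured at several levels.

    responses_db maps each stimulus level (dB) to a list of response values (dB).
    """
    reference = responses_db[reference_level_db]
    curves = {}
    for level in stimulus_levels_db:
        offset = level - reference_level_db   # vertical displacement of this curve
        curves[level] = [h - h_ref + offset
                         for h, h_ref in zip(responses_db[level], reference)]
    return curves


levels = [-15 + 5 * k for k in range(7)]                       # seven levels, 5 dB apart
responses = {level: [-2.0, 0.0, 1.5] for level in levels}      # a perfectly linear device
curves = linearity_curves(responses, levels, reference_level_db=0)
```

For the linear device in this example every curve is flat at its own offset, giving seven parallel lines; compression or expansion would show up as curves that crowd together or spread apart.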
If an imaginary vertical line were drawn through all the curves of Figure I.1 or Figure I.2 at a
particular frequency, it would intersect the points typically displayed in a one-frequency input/output
curve. In that case, the intersected points would be the y values, and the stimulus levels would be the x
values, as in Figure I.3. In Figure I.1 and Figure I.2, the same information is displayed at all
frequencies in one graph.
Figure I.4, Figure I.5, Figure I.6, and Figure I.7 show examples of nonlinearities measured according
to this method.
Figure I.5—Wideband compressor with 1.5 to 1 ratio, measured with artificial voices
Annex J
(normative)
Distortion
J.1 Overview
Distortion is a measure of unwanted signals which appear at the output of a device at frequencies not
present in the input. Distortion is a function of input level, frequency, and the type of signal. Because
of this, different methods cannot necessarily be expected to correlate with each other.
The recommended method for all telephones is signal-to-distortion-and-noise ratio (SDN), defined in
J.3. It uses a narrowband pseudo-random noise as the stimulus, and analysis of THD + noise with a
weighted notch filter.
Distortion test methods using sine-wave stimulus may be suitable for use on many handsets and
headsets and on some telephones. Sine methods and extensions of sine methods are described in J.4.
Continuous spectrum distortion methods may be a suitable alternative under some conditions where
artificial voices or other continuous-spectrum test signals are used, and cross-spectrum methods are
valid. See J.5.
Subjective predictors, such as algorithms which estimate mean opinion scores (MOS), may also be
useful in identifying distortions and degradations peculiar to digital processing. These algorithms have
generally been developed primarily for measurements of distortions found in networks, and may not
be completely applicable to telephones or headsets. The results may not correlate directly with other
measures. However, their use is encouraged as a supplemental investigation. One example is PESQ
(ITU-T Recommendation P.862-2001).
To test the suitability of a proposed distortion test signal, the signal should be applied at each
distortion test frequency using the standard level. The frequency response should then be measured
at those test frequencies. If the result is within 2 dB of the comparable values previously
obtained in the complete frequency response measurement, then the proposed distortion test signal is
suitable.
Distortion does not have to be measured using the same test signal as is used for measuring frequency
response, but the suitability test shall be fulfilled.
The recommended distortion test method for this standard is signal-to-distortion-and-noise ratio. This
method uses a narrowband pseudo-random noise as the stimulus, and analyzes THD + noise with a
weighted notch filter. See Equation (J.1) and Equation (J.2).
The narrowband pseudo-random noise should have an effective bandwidth of 25–50 Hz. Out-of-band
signals should add no more than 0.5 dB to the overall level of the test signal. The periodic nature of
this signal will provide some modulation effect, depending on how the signal is constructed. The period
should be at least 250 ms, with frequency components no more than 4 Hz apart. The crest factor
should be 9 ± 3 dB.
The output fundamental is measured with a bandpass filter or equivalent algorithm. Measurement is
made using an A-weighting filter according to ANSI S1.4-1983, but with a notch added to eliminate
the test signal. (Send distortion may be measured using the psophometric weighting if required by the
relevant performance standard.)
Output from the notched filter includes harmonic and nonharmonic products, as well as both
continuous noise and modulation noise. The notch filter output is divided by the fundamental and
expressed in percent, using Equation (J.1) and Equation (J.2). The result is the A-weighted signal-to-
distortion-and-noise ratio.
The notch shall attenuate the test signal by at least 50 dB. This will result in a distortion floor of 0.3%,
permitting measurements of distortion from 1% and above with 6% or better accuracy.
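The relationship between notch attenuation and distortion floor can be checked with a short calculation. The floor follows directly from the attenuation; the error estimate is a sketch that assumes the residual test signal adds to the true distortion on a power (rms) basis, which is an assumption and not stated in the text.

```python
import math


def notch_floor_percent(attenuation_db):
    """Residual test signal after the notch, as a percentage of the fundamental."""
    return 100.0 * 10.0 ** (-attenuation_db / 20.0)


def reading_error_percent(true_distortion_percent, floor_percent):
    """Relative over-reading, assuming the residual adds on a power basis."""
    reading = math.hypot(true_distortion_percent, floor_percent)
    return 100.0 * (reading - true_distortion_percent) / true_distortion_percent


floor = notch_floor_percent(50.0)
print(round(floor, 2), round(reading_error_percent(1.0, floor), 1))
```

Under this assumption a 50 dB notch leaves a floor of roughly 0.3%, and a true distortion of 1% is over-read by less than 6%.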
Measurements should be made over a range of frequencies within the telephone band, such as the ISO
R10 preferred frequencies from 315 to 3150 Hz. Test frequencies over one half the upper frequency
limit of the telephone may not be useful for evaluation of harmonic distortion. For high acoustic test
levels, verify that the distortion of the test system is less than 2%.
Total harmonic distortion is the ratio of the power sum of all the harmonics to the fundamental. It is
usually expressed as a percentage, according to Equations (J.3)–(J.6).
Harmonics may also be expressed separately to give diagnostic information in addition to THD.
Alternatively,
$$ \%\,\mathrm{THD} = 100 \times \frac{\text{power sum of included harmonics}}{\text{unfiltered total output}} \tag{J.5} $$
$$ \%\,\mathrm{THD} = 100 \times \sqrt{\frac{(A_2)^2 + (A_3)^2 + \cdots + (A_n)^2}{(A_1)^2 + (A_2)^2 + (A_3)^2 + \cdots + (A_n)^2}} \tag{J.6} $$
where
$A_n$ is the amplitude of the nth product.
Total harmonic distortion + noise is the ratio of the rms amplitude of the residual harmonics and
noise to the rms amplitude of the fundamental, harmonics and noise combined (Equations (J.7) and
(J.8)). It is usually expressed as a percent.
Total harmonic distortion and noise is measured by use of a notch (bandstop) filter to eliminate the
fundamental. This measurement will be equivalent to total harmonic distortion, with an error of less
than 5%, if the magnitude of the distortion does not exceed 30%, and if there is no significant noise
component.
The notch shall attenuate the test signal by at least 50 dB. This will result in a distortion floor of 0.3%,
permitting measurements of distortion from 1% and above with 6% or better accuracy.
$$ \%\,\mathrm{THD} + \text{noise} = 100 \times \frac{\text{output from notch filter}}{\text{unfiltered total output}} \tag{J.7} $$
$$ \%\,\mathrm{THD} + \text{noise} = 100 \times \sqrt{\frac{(A_2)^2 + (A_3)^2 + \cdots + (A_n)^2 + (A_{noise})^2}{(A_1)^2 + (A_2)^2 + (A_3)^2 + \cdots + (A_n)^2 + (A_{noise})^2}} \tag{J.8} $$
where
$A_n$ is the amplitude of the nth product,
$A_{noise}$ is the amplitude of wideband noise and nonharmonic products.
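Equations (J.6) and (J.8) can be evaluated directly from measured amplitudes. The sketch below is a minimal illustration; the function names and the example amplitudes are hypothetical.

```python
import math


def thd_percent(amplitudes):
    """Equation (J.6): amplitudes[0] is the fundamental A1, the rest are A2..An."""
    harmonic_power = sum(a * a for a in amplitudes[1:])
    total_power = sum(a * a for a in amplitudes)
    return 100.0 * math.sqrt(harmonic_power / total_power)


def thd_plus_noise_percent(amplitudes, noise_amplitude):
    """Equation (J.8): as (J.6), with noise and nonharmonic products included."""
    residual_power = sum(a * a for a in amplitudes[1:]) + noise_amplitude ** 2
    total_power = amplitudes[0] ** 2 + residual_power
    return 100.0 * math.sqrt(residual_power / total_power)


print(round(thd_percent([1.0, 0.03, 0.04]), 2))
print(round(thd_plus_noise_percent([1.0, 0.03, 0.04], 0.02), 2))
```

With no noise component the two functions return the same value. Because the denominator includes the harmonics, a second harmonic at 30% of the fundamental reads as about 28.7%, which is within the 5% equivalence stated above.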
Difference-frequency distortion is measured by using two stimulus signals, typically spaced from 20 to
200 Hz apart. A complex group of distortion products results, consisting of odd and even order products
(Equations (J.9) and (J.10)). It is essentially the same as the production of sidebands in a mixer or
modulator.
Difference-frequency distortion tests may be the best way to evaluate a telephone above 1000 Hz,
where the harmonics of a single tone (or narrowband pseudo-random noise signal) lie above the set’s
cutoff frequency.
$$ \%\,\text{total DF distortion} = 100 \times \frac{\text{power sum of included products}}{\text{power sum of both stimulus signals}} \tag{J.9} $$
$$ \%\,\text{total DF distortion} = 100 \times \sqrt{\frac{(A_2)^2 + (A_3)^2 + (A_3)^2 + (A_4)^2 + (A_5)^2 + (A_5)^2 + \cdots}{(A_{f1})^2 + (A_{f2})^2}} \tag{J.10} $$
where
$A_n$ is the amplitude of the nth product,
$A_{fn}$ is the amplitude of the nth stimulus signal.
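Equation (J.10) can be evaluated as sketched below. The product frequencies listed are the usual second-order and third-order difference products of two tones, which is an assumption here because the text does not list the included products; the function names and example values are hypothetical.

```python
import math


def difference_products_hz(f1_hz, f2_hz):
    """Lowest-order difference products of two tones (assumed set, f2 > f1)."""
    return {
        "second order, f2 - f1": f2_hz - f1_hz,
        "third order, 2*f1 - f2": 2.0 * f1_hz - f2_hz,
        "third order, 2*f2 - f1": 2.0 * f2_hz - f1_hz,
    }


def df_distortion_percent(product_amplitudes, stimulus_amplitudes):
    """Equation (J.10): power sum of products over power sum of both stimulus signals."""
    product_power = sum(a * a for a in product_amplitudes)
    stimulus_power = sum(a * a for a in stimulus_amplitudes)
    return 100.0 * math.sqrt(product_power / stimulus_power)


print(difference_products_hz(1000.0, 1100.0))
print(round(df_distortion_percent([0.03, 0.04], [1.0, 1.0]), 2))
```

The third-order products fall close to the two stimulus frequencies, so they remain inside the passband even when harmonics of the tones would lie above the cutoff frequency.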
Intermodulation distortion measurement typically uses one test tone at a fixed low frequency, such as
60 Hz, together with a second tone stepped or swept through the band of the device. Intermodulation
distortion measurement is not recommended for use with telephone products operating in the normal
speech band. It may be usable in wideband telephony, but that has not been studied for use in this
standard.
Harmonic and difference-frequency distortion measurement methods can be extended for more
appropriate application to telephone and headset testing, where a sinusoidal stimulus is not always
suitable.
One alternative is to use modulated sine waves as the stimulus. A square-wave, sine-wave, or a pseudo-
random modulation can be used to modulate the sine-wave signals. Refer to F.3 for details. One
modulated sine wave is used for harmonic distortion, while two are used for difference-frequency
distortion.
Another alternative is to use a narrowband pseudo-random noise signal as the stimulus. One
narrowband noise signal is used for harmonic distortion measurements, while two narrowband noise
signals are used for difference-frequency distortion. The total stimulus level is calculated on a power
basis. See J.3 for details about the narrowband noise stimulus.
When using these alternative test signals, analysis is with rms detectors and bandpass filters, or
equivalent algorithm.
Measurements should be made over a range of frequencies within the telephone band, such as the ISO
R10 preferred frequencies from 315 to 3150 Hz. Test frequencies over one half the upper frequency
limit of the telephone may not be useful for evaluation of harmonic distortion. For high acoustic test
levels, verify that the distortion of the test system is less than 2%.
Conventional techniques for measuring harmonic and intermodulation distortion are not usable in
continuous spectrum methods. An alternative is to use the ratio of noncoherent to coherent power
(N/C), each summed over the most important part of the telephone bandwidth of 300–3300 Hz
(Equation (J.11)). This method is suitable if, and only if, the telephone or headset under test has a
stable coherent frequency response. (Magnitude and phase are stable.)
Coherent power is the power in the output spectrum that is linearly related to the input. Noncoherent
power is the remainder. The following can cause this nonlinear remainder:
An analyzer cannot distinguish among these factors, so care is needed in setting up the measurement
and in interpreting the results. Factors 3, 4, and 5 can be largely eliminated by proper measurement
setup.
A separate measurement of noise in the device under test, summed over the telephone bandwidth of
300–3300 Hz, should be made with the continuous spectrum test signal deactivated. If this noise is
significantly less than the noncoherent power, then the noncoherent power is due to nonlinearity in the
device and Equation (J.12) is valid.
Another method for interpreting the N/C ratio is to perform the measurement at different levels and
compare the results. For example at moderate levels, the N/C ratio will usually be at its lowest,
indicating relatively low noise as well as relatively low nonlinearity. At low levels, the N/C ratio
typically increases due to noise. At high levels, the N/C ratio normally increases due to nonlinearity.
where
$$ \text{coherence} = \frac{|G_{ab}|^2}{G_{aa}\, G_{bb}} $$
$G_{aa}$ is the input autospectrum,
$G_{bb}$ is the output autospectrum,
$G_{ab}$ is the cross spectrum.
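The N/C ratio can be sketched from the three spectra defined above. This is an illustration under assumptions: coherent power at each frequency line is taken as coherence times the output autospectrum and noncoherent power as the remainder, the result is returned as a plain power ratio, and the function name and data layout are hypothetical.

```python
def noncoherent_to_coherent_ratio(freqs_hz, g_aa, g_bb, g_ab,
                                  f_low_hz=300.0, f_high_hz=3300.0):
    """Ratio of summed noncoherent power to summed coherent power over the band."""
    coherent = 0.0
    noncoherent = 0.0
    for f, aa, bb, ab in zip(freqs_hz, g_aa, g_bb, g_ab):
        if f_low_hz <= f <= f_high_hz:
            coherence = abs(ab) ** 2 / (aa * bb)
            coherent += coherence * bb              # output power linearly related to input
            noncoherent += (1.0 - coherence) * bb   # the remainder
    return noncoherent / coherent


freqs = [100.0, 1000.0, 2000.0]
ratio = noncoherent_to_coherent_ratio(
    freqs,
    g_aa=[1.0, 1.0, 1.0],
    g_bb=[4.0, 4.0, 4.0],
    g_ab=[2.0 + 0j, complex(3.6 ** 0.5, 0.0), complex(0.0, 3.6 ** 0.5)],
)
```

In this example the 100 Hz line is outside the summation band and the two remaining lines each have a coherence of 0.9, so the ratio is about 0.11.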
Annex K
(normative)
Send signal-to-noise ratio, SendSNR( f ), is a measure of the desired speech transmission relative to
unwanted noise in the room where the talker’s phone, handset, or headset, is used. The measurement
is intended to apply to both passive and active systems. SendSNR( f ) is given by Equation (K.1).
Two test signals are used for this measurement. The first is the desired speech signal, presented from
the mouth simulator. The signal and positioning should be the same as used to determine send
frequency response (7.5.1). The second is a noise signal presented in a diffuse field (5.5.3). This noise
signal may be Hoth noise (Annex E) or any other noise signal representative of actual working
conditions. The DFTP and the MRP shall coincide.
The desired speech signal is presented together with the diffuse noise signal to obtain G_SETP(S+N)(f).
The diffuse noise signal is presented alone, to obtain G_SETP(N)(f).
The results are sensitive to the relative levels of both signals, and may be sensitive to the absolute levels
and types of signals used. The results may also be sensitive to the spectrum of the test signals. The
recommended noise spectrum is Hoth noise, at −4.7 dBPa. The recommended speech signal test level is
also −4.7 dBPa.
$$ \mathrm{SendSNR}(f) = 10 \log_{10}\left(10^{\left(G_{\mathrm{SETP(S+N)}}(f) - G_{\mathrm{SETP(N)}}(f)\right)/10} - 1\right) \ \text{in dB} \tag{K.1} $$
for
$$ G_{\mathrm{SETP(S+N)}}(f) > G_{\mathrm{SETP(N)}}(f) $$
where
G_SETP(S+N)(f) is the rms spectrum at SETP with both the mouth simulator and noise sources active,
G_SETP(N)(f) is the rms spectrum at SETP with only the noise source active. The mouth simulator is
present, but inactive.
Weighted send signal-to-noise ratio, SendSNRW, is a single number that results from applying an
intelligibility weighting WSNR (Table K.1) to the SendSNR (Equation (K.2)).
$$ \mathrm{SendSNRW} = \sum_{f=200}^{5000} \mathrm{SendSNR}(f)\, \mathrm{WSNR} \tag{K.2} $$
1600 0.112
2000 0.114
2500 0.102
3150 0.102
4000 0.072
5000 0.060
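The two-step calculation can be sketched as follows. This is an illustration under assumptions: both spectra are taken to be band levels in dB, so the level difference is divided by 10 before conversion to a power ratio, and the weights shown are only the Table K.1 rows visible above. A real calculation needs the complete table from 200 Hz upward. The function names and SNR values are hypothetical.

```python
import math


def send_snr_db(level_signal_plus_noise_db, level_noise_db):
    """Equation (K.1) for one band, with both levels in dB."""
    if level_signal_plus_noise_db <= level_noise_db:
        raise ValueError("signal-plus-noise level must exceed the noise-only level")
    difference_db = level_signal_plus_noise_db - level_noise_db
    return 10.0 * math.log10(10.0 ** (difference_db / 10.0) - 1.0)


def weighted_send_snr(snr_by_band_db, weights):
    """Equation (K.2): sum of SendSNR(f) multiplied by the weighting for each band."""
    return sum(snr_by_band_db[f] * weights[f] for f in weights)


# Partial weights only: the rows of Table K.1 from 1600 Hz to 5000 Hz.
partial_weights = {1600: 0.112, 2000: 0.114, 2500: 0.102,
                   3150: 0.102, 4000: 0.072, 5000: 0.060}
example_snr = {f: 10.0 for f in partial_weights}   # hypothetical 10 dB in every band
partial_sum = weighted_send_snr(example_snr, partial_weights)
```

Subtracting 1 inside the logarithm removes the noise contribution from the combined measurement, which is why the combined level must exceed the noise-only level.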
Annex L
(normative)
Delay
L.1 General
Delay can be measured in several ways, many of which are described in this clause. Electroacoustical
delays in the test equipment, such as the mouth simulator, can generally be ignored. The range of delay
that can be measured by the test equipment shall exceed the expected delay in the device under test, or
time domain aliasing may occur. The method used should be stated with the measurement.
Delay can be measured using a captured pulse. The pulse can be a swept sine or a gated sine. The
recommended timing for a pulse is 30–50 ms on and 500–800 ms off. This timing allows
measuring equipment, such as a digital storage oscilloscope, to acquire sufficient data for a clean
measurement. The pulse is delivered to the input test point and triggers the time capture. Record the
difference in time between the start of the input pulse and the start of the measured pulse at the output
test point.
Measure the impulse response. Delay between channels is the time at which the magnitude of the
impulse response is at its maximum. The delay between two events is the time difference between the
maxima of the two impulse responses.
The magnitude of the impulse response is calculated as the square root of the sum of the squares of the
impulse response (real part) and the Hilbert transform of the impulse response (imaginary part).
Measure the cross-correlation. Delay between channels is the time at which the cross-correlation
coefficient is at its maximum. The delay between two events is the time difference between the maxima
of the two cross-correlation functions. If available on the analyzer, the magnitude of the cross-correlation
should be used rather than the real part.
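A delay measurement by cross-correlation can be sketched as follows. This is a minimal illustration with a direct time-domain correlation; it takes the absolute value of the real correlation as a simple stand-in for the magnitude, whereas an analyzer would normally derive the magnitude from the Hilbert transform as described for the impulse response. The function name and test data are hypothetical.

```python
def delay_by_cross_correlation(input_samples, output_samples, sample_rate_hz, max_lag):
    """Return the delay in seconds at which the cross-correlation magnitude peaks."""
    best_lag = 0
    best_value = 0.0
    for lag in range(max_lag + 1):
        value = sum(x * output_samples[i + lag]
                    for i, x in enumerate(input_samples)
                    if i + lag < len(output_samples))
        if abs(value) > best_value:          # track the largest magnitude, either polarity
            best_lag, best_value = lag, abs(value)
    return best_lag / sample_rate_hz


stimulus = [1.0, -2.0, 3.0, -1.0, 0.5]
# Hypothetical device: 5 samples of delay with an inverting gain of 0.8.
received = [0.0] * 5 + [-0.8 * x for x in stimulus] + [0.0] * 10
delay_s = delay_by_cross_correlation(stimulus, received, 8000.0, max_lag=12)
```

The search range max_lag has to exceed the expected delay of the device, in line with the requirement on measurable delay range in L.1.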
Annex M
(normative)
Sidetone echo
In some phones there may be an audible delay in the sidetone path. This delay may be heard as an
unnatural quality and/or as an echo. The perceived quality can depend on the amount of delay, the
amplitude and spectrum of the delayed sidetone, and the amplitude and spectrum of the local
(acoustic) sidetone.
Sidetone delay is measured between the mouth simulator and the ear simulator, using one of the
methods described in Annex L. If the delay is 5 ms or less, talker sidetone may be measured in the
standard way (7.6.1 or 8.6.1).
If the delay exceeds 5 ms, the local (undelayed) sidetone and the sidetone echo should be measured
separately, using one of the simulated free field techniques described in G.5. In this application the
time window is used to separate the local sidetone from the sidetone echo, not necessarily to simulate a
free field.
To measure local sidetone, the window should begin at approximately 0 ms, depending on the
exact shape of the time window. The window should be as long as possible without including the
sidetone echo.
To measure sidetone echo, the window should begin just before the onset of the echo, depending on
the exact shape of the time window. The window should be long enough to include the
sidetone echo.
The true frequency resolution of a simulated free field measurement will be determined by the time
window chosen. The effective time window should be at least 5.7 ms, which corresponds to a frequency
resolution (lowest measurable frequency) of 175 Hz.
Both local sidetone frequency response and sidetone echo frequency response are defined similarly to
Equation (4) in 7.6.1. The exact formula depends on the method chosen.
Annex N
(informative)
N.1 Abstract
Both North American and European acoustic pressure limits for telephone headsets are under review.
Two new limits at ERP (ear reference point) and DRP (eardrum reference point) are proposed. The
new limits are based on the generally accepted 85 dBA 8 h TWA (time-weighted-average) free-field
exposure limit. The TWA allows the exposure limit to increase 3 dB for each time the exposure
duration is cut in half, e.g., 88 dBA for 4 h, 91 dBA for 2 h and so on and so forth. With a 2 s duration
(as specified in ITU-T Recommendation P.360-1998) the allowable free-field exposure level is 127 dBA.
Subtract 4 dB from 127 dBA to compensate for narrower telephony bandwidth compared to the free-
field broad frequency bandwidth. The maximum allowable exposure level for telephone for a 2 s
duration is 123 dBA. The new proposed acoustic pressure limits for headset at ERP and DRP are then
obtained by applying the ERP and DRP transfer functions to the 123 dBA free-field limit across the
frequency bandwidth. The proposed limits also suggest adding the A-weighting coefficients to simplify
actual tests.
The proposal contained in this Annex is a procedure for deriving new telephone headset acoustic
pressure limits that combine the best aspects of both current North American and European limits.
The specific numbers and coefficients, such as the selection of the transfer functions and the damage
risk factor, should be further examined and discussed. It is hoped that this proposal will help to resolve
years of differences over the proper telephone headset acoustic pressure limit on both sides of the
Atlantic Ocean.
N.2 Introduction
The two most common telephone headset acoustic pressure limits are the North American frequency-
dependent curves at ERP and DRP and the European flat (independent of frequency) limit of 118 dBA
at ERP.
The North American limit curves were based on United States OSHA (Occupational Safety
and Health Administration) 90 dBA 8 h TWA free-field noise exposure limit. OSHA allows the
exposure limit to increase 5 dB for each time the exposure duration is cut in half, e.g.,
95 dBA for 4 h, 100 dBA for 2 h, and so on and so forth. With a 15 min duration the allowable
free-field exposure level is 115 dBA. OSHA regulates the maximum free-field exposure limit at
115 dBA.
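The exchange-rate arithmetic used in this annex can be checked with a short calculation. This is a sketch only; the function name and parameter names are assumptions, and the rule is applied as a fixed number of decibels per halving of duration.

```python
import math


def allowable_level_dba(base_level_dba, base_duration_s, duration_s, exchange_rate_db):
    """Allowable level after trading duration for level at a fixed dB-per-halving rate."""
    halvings = math.log2(base_duration_s / duration_s)
    return base_level_dba + exchange_rate_db * halvings


EIGHT_HOURS_S = 8 * 3600

# OSHA rule: 90 dBA for 8 h, 5 dB per halving, evaluated at 15 min.
osha_15_min = allowable_level_dba(90.0, EIGHT_HOURS_S, 15 * 60, 5.0)

# 85 dBA for 8 h with a 3 dB exchange rate, evaluated at a 2 s duration.
twa_2_s = allowable_level_dba(85.0, EIGHT_HOURS_S, 2.0, 3.0)
```

With these inputs the OSHA case gives 115 dBA, and the 2 s case gives about 126.4 dBA (about 126.6 dBA if the equal-energy value of 10 log 2, approximately 3.01 dB, is used), which is close to the 127 dBA figure quoted in this annex.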
In 1980, Bell Labs published two telephone headset acoustic pressure limits for ERP and DRP in its
PUB 48006. The limits were obtained by transferring the OSHA 115 dBA free-field limit to ERP and
DRP and adding A-weighting coefficients across the frequency bandwidth. Presently, the Bell Labs
limits are known as the North American telephone headset acoustic pressure limits. These limits are
shown in Figure N.1 and Figure N.2.
ITU-T Recommendation P.360-1998 explains the derivation of the European 118 dBA limit. This limit
was based on an 85 dBA 8 h TWA free-field noise exposure limit. (This is 5 dB lower than OSHA’s
90 dBA limit.) The allowable limit increases 3 dB for every halving of exposure duration. (OSHA uses
a 5 dB increment.) The following additional assumptions have been made in ITU-T Recommendation
P.360-1998 to adapt these limits to telephone usage
Figure N.3—Current European telephone headset acoustic pressure limit at ERP
Both the North American and European limits have their strengths and shortcomings. The North
American ERP and DRP limits were based on OSHA’s occupational noise exposure limits. The
90 dBA 8 h TWA free-field limit has been called too high. Neither the 5 dB increment for every halving
of duration nor the absolute maximum of 115 dBA is widely accepted. Nevertheless, the North
American method of using frequency-dependent transfer functions to transfer the free-field limit
to ERP and DRP limits is correct.
The European limit is based on an 85 dBA 8 h TWA free-field limit, which is generally accepted. Its increment
of 3 dB for every halving of duration is more widely accepted than OSHA’s 5 dB increment. However,
transferring the free-field limit to ERP by simply adding 5 dB without frequency dependency is hardly
justifiable. Subtracting 10 dB from the limit for ‘‘nonoccupational exposure’’ is also questionable.
N.3 Proposal
Since 85 dBA 8 h TWA free-field exposure limit is more accepted globally, it should be adopted for the
new limits. The 3 dB increment for every halving of duration is also more generally accepted and
Figure N.4 and Figure N.5 show the preliminary proposed limits at ERP and DRP and the limits with
a 3 dB and 6 dB safety margin.
This proposal offers a procedure for deriving new telephone headset acoustic pressure limits that
combine the best aspects of both current North American and European limits. The specific numbers
and coefficients, such as the selection of the transfer functions and safety margin, should be further
examined and discussed.
Annex O
(normative)
O.1 General
The temporally weighted terminal coupling loss (TCLT) measurement method is described for single-
talk application. This method requires that the echo and the source signal be recorded over the
duration of the measurement, and post processing be used. Real-time measurement techniques are
possible, but are not described in this standard.
Freezing the canceller is not recommended for TCL tests. Test results with nonstationary signals have
shown that convergence times and subsequent converged TCL when ‘‘thawed’’ depend upon the point
in time at which the canceller was frozen.
a) Provide a measure of time-dependent echo return loss with peaky behavior, psychoacoustically
weighted
b) Provide an estimate of the number of potentially objectionable echo bursts, and the psycho-
acoustically weighted echo return loss during the bursts
c) Provide several other useful parameters describing echo, including long-term temporally
weighted terminal coupling loss, single talk (LTCLT)
An example test algorithm in pseudo code is detailed in Annex P. The rest of this clause defines the
method and gives some background information.
The echo signal is first filtered to model the frequency sensitivity of human hearing at low levels.
The echo and stimulus files are synchronized. Noise subtraction may then be applied, if it can be
assumed that the noise is stationary and not correlated to the echo. Echo and source are converted
into 4 ms power averaged frames allowing adequate resolution and immunity to synchronization
errors.
If the stimulus is inactive, the algorithm simply skips that frame, and moves on to the next echo
and stimulus frames. If the stimulus is declared active, the echo frame is compared with a threshold
to determine if an echo event occurs. The period of echo activity between inactive echo states is
termed an echo ‘‘event.’’ These events are then weighted using psychoacoustic modeling.
By using a threshold of −67.2 dBV (−65 dBm) (5 dB above the µ-law noise floor), TCLT can be determined.
In modeling echo audibility, the algorithm accounts for three fundamental aspects of human hearing
behavior: frequency weighting, temporal combination, and temporal weighting.
The frequency sensitivity of human hearing at a loudness level of 30 Phons is approximated. (30 Phons
is equivalent to 30 dB at 1 kHz.)
Thirty (30) Phons was chosen as it represents echo levels that result from terminals that just fail
handset terminal coupling loss specifications (determined using loss planning analysis). Variation from
20 to 50 Phons provides essentially the same weighting within the telephony band. An A-weighting filter
is used. See ANSI S1.4-1983.
O.3.2 Temporal combination
Temporal combination is the ear’s tendency to combine the loudness of sequential signals even though
they may be discrete in time. This typically occurs when the two signals are separated by less than
about 20 ms. The exact time is a complex function of many variables, but 20 ms is a suitable value in
this application. This is sometimes referred to as the Haas effect.
If two bursts of echo are separated by a period of inactivity less than 20 ms, they are considered as one
longer echo event as far as loudness is concerned. This continues until the gap between events is at least
20 ms, at which time the echo event is declared over. This can be thought of as a 20 ms hangover for
the current echo event. During this hangover period, echo and stimulus powers are not included as
part of the event. An example of temporal combination follows in Figure O.1.
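The temporal combination rule can be sketched as follows, assuming the echo activity has already been reduced to one flag per 4 ms frame. The function name and the representation of events as frame-index pairs are choices made for this illustration.

```python
FRAME_MS = 4
HANGOVER_MS = 20


def combine_echo_events(active_frames):
    """Group per-frame echo activity flags into events using a 20 ms hangover.

    Returns (first_frame, last_frame) index pairs; the last frame is the final
    active frame, so the closing hangover is not part of the event.
    """
    hangover_frames = HANGOVER_MS // FRAME_MS   # 5 frames of 4 ms
    events = []
    start = last_active = None
    for index, active in enumerate(active_frames):
        if active:
            if start is None:
                start = index
            last_active = index
        elif start is not None and index - last_active >= hangover_frames:
            # The gap has reached 20 ms: the event is declared over.
            events.append((start, last_active))
            start = last_active = None
    if start is not None:
        events.append((start, last_active))
    return events


# Two bursts 12 ms apart merge; a third burst after a 24 ms gap is a separate event.
flags = [1, 1, 0, 0, 0, 1, 1] + [0] * 6 + [1]
events = combine_echo_events(flags)
```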
Temporal weighting models the listener’s reduced sensitivity to sounds as their duration decreases
below 750 ms. The exact time is a complex function of many variables, but 750 ms is a suitable value in
this application.
The duration of the total echo event after temporal combination is measured. The total duration
includes any gap(s) between events that are captured by temporal combination, but not the final 20 ms
hangover. If the total duration is less than 750 ms, the level of the event is reduced to account for the
temporal integration behavior of human hearing. If the duration is longer than 750 ms, the level of the
total event is left unweighted. Test results have shown echo bursts less than 750 ms to be common
occurrences from cancellers.
A simplified equation (Equation O.1) describing the relationship was derived based upon audition
studies with noise. (Tones result in a slightly different relationship, but it was felt that noise was a
much closer approximation to the true nature of the echo than a sine.)
$$\text{Temporal integration weighting} = -23 + 8 \log_{10}(t) \ \text{dB} \tag{O.1}$$
where
t is the total duration of the echo event, in milliseconds
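A minimal sketch of the temporal weighting follows. It assumes that the constant in Equation (O.1) is −23 and that t is in milliseconds, which makes the weighting approximately 0 dB at 750 ms, consistent with the surrounding text; the function name is hypothetical.

```python
import math

FULL_INTEGRATION_MS = 750.0


def temporal_weighting_db(duration_ms):
    """Level adjustment in dB for an echo event of the given duration.

    Assumes the form -23 + 8*log10(t) with t in milliseconds, which is
    approximately 0 dB at 750 ms; events of 750 ms or longer are left unweighted.
    """
    if duration_ms >= FULL_INTEGRATION_MS:
        return 0.0
    return -23.0 + 8.0 * math.log10(duration_ms)


weight_100_ms = temporal_weighting_db(100.0)   # -23 + 8*2 = -7 dB
weight_1_s = temporal_weighting_db(1000.0)     # unweighted
```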
Traditional TCL methods refer the echo power during the duration of measurement to the source
power during the duration of measurement to arrive at the terminal coupling loss. In the TCLT
method, the final weighted power of echo during each event is referred to the power of the source
signal during the same event, to arrive at the ‘‘Active TCLT,’’ or ATCLT, of each event. The echo is
referred to the source signal during the event only, as this is the way in which our ear would compare
the echo. This parameter can be statistically analyzed to give information about echo events during the
entire test sequence.
A long term average of the weighted active echo return loss is found by summing the power of all
weighted echo during active events, and comparing to the power of the source as seen during all events
only. The result is the ‘‘Active Long Term TCLT,’’ or ALTCLT.
For comparison with traditional TCL methods, the power of all weighted echo during events is
summed, then referred to the total source power as measured for the entire duration of the
measurement. The result is the ‘‘Long Term TCLT,’’ or LTCLT.
The terminology for TCLT results was chosen to be consistent with the nomenclature of ITU-T
recommendation P.56-1993.
Other statistics compiled by the algorithm in Annex P include minimum, maximum, mean, and
standard deviation of ATCLT, the total number of echo events, the number of echo events per minute,
the percentage of echo event free speech, the number of events < 750 ms, the average length of an
event, and the duration of source inactivity.
Annex P
(normative)
P.1 General
This algorithm is an example of how to implement the measurement of TCLT as defined in Annex O.
The algorithm is provided as an assistance to the test developer, but it is not the definition of TCLT.
Modifications to this algorithm may be made, and may be necessary, to completely fulfill the intent of
Annex O.
TCLT is a newly proposed method for evaluating the echo return loss of a terminal using
psychoacoustic modeling and for predicting the occurrences of potentially objectionable echoes. The
principles are defined in Annex O. It incorporates three fundamental aspects of human audition:
frequency weighting, temporal combination, and temporal weighting.
Speech based stimulus signals are recommended as their results are most representative of real world
usage. The measured output from the telephone is always some echo or noise making its way through
the system uncancelled.
It may be useful to record the stimulus and echo in digital format. Echo and stimulus frames shall be
calibrated according to the principles of this standard. It is possible to apply this method to two-wire
analog sets by use of a test hybrid. (See IEEE Std 1329-1999.)
The stimulus and the echo files will be processed as power values averaged over 4 ms frames. The
successive stimulus file frames will be termed xi, and the echo frames will be denoted yi, where i = 1, 2, 3, ...
is the actual frame index. Intermediate frames conforming to an ‘‘echo event’’ will be noted as xk and
yk, where k = 1, 2, 3, ... is the echo event index, which is reset when the event ends and a new one commences.
Statistics compiled during the TCLT measurement include the active long term TCLT (ALTCLT), long
term TCLT (LTCLT), minimum and maximum active TCLT (MINTCLT, MAXTCLT), its standard
deviation (sigma) and mean, the total number of echo events (NEVENTS), the number of echo events
per minute (NEVMIN), the percentage of echo event free speech (PER), number of events < 750 ms
(N750), the average length of an event (AVGEVENT), and the duration stimulus was inactive (DUR).
The terminology for TCLT results was chosen to be consistent with the nomenclature of ITU-T
recommendation P.56-1993. The duration of stimulus inactivity is not included in the time-based
results.
Stimulus and measured results shall be calibrated according to the requirements in Clauses 6–9.
Calculate the correlation of the stimulus and echo files to fine-tune the EPD (echo path delay). See Annex L for
methods. Use the criterion that the present correlation peak occurs at EPD unless a following
correlation peak has a magnitude at least 10 dB greater. This approximate guideline is based upon
subjective studies on delay detection with multiple impulses.
Align the echo and stimulus files in time by removing delay equal to EPD from the echo file.
The individual echo samples are filtered with an A-weighting filter. (See ANSI S1.4-1983.)
If it can be assumed that the noise in the echo path is stationary and uncorrelated with the echo, the
noise is measured for 2 s after the stop of source and echo activity. The noise is then subtracted, on a
power basis, from the echo plus noise to arrive at a better estimate of the echo alone. This procedure
shall be performed only if the echo plus noise is at least 3 dB greater than the noise alone.
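The noise subtraction step can be sketched as below. The function name is hypothetical, and returning the input unchanged when the 3 dB condition is not met is an assumption of this sketch; the text only states when the subtraction is to be performed.

```python
import math


def subtract_noise_power(echo_plus_noise_power, noise_power, min_margin_db=3.0):
    """Subtract stationary noise power from an echo-plus-noise power value.

    The subtraction is applied only when the echo plus noise exceeds the noise
    alone by at least `min_margin_db`; otherwise the input is returned unchanged.
    """
    if noise_power <= 0.0 or echo_plus_noise_power <= 0.0:
        return echo_plus_noise_power
    margin_db = 10.0 * math.log10(echo_plus_noise_power / noise_power)
    if margin_db < min_margin_db:
        return echo_plus_noise_power
    return echo_plus_noise_power - noise_power


cleaned = subtract_noise_power(4.0e-6, 1.0e-6)      # about 6 dB margin: subtract
unchanged = subtract_noise_power(1.5e-6, 1.0e-6)    # about 1.8 dB margin: leave as is
```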
Samples are converted to absolute numbers using the calibration data. The stimulus samples are
combined into 4 ms power averaged frames denoted as xi. The weighted, noise filtered echo samples are
combined into 4 ms power averaged frames denoted as yi.
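As a sketch of the framing step, calibrated samples can be reduced to 4 ms power averages as follows. The function name is hypothetical, and dropping an incomplete trailing frame is an assumption of this example.

```python
def power_frames(samples, sample_rate_hz, frame_ms=4.0):
    """Average the squared samples over consecutive frames of `frame_ms`."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        block = samples[start:start + frame_len]
        frames.append(sum(v * v for v in block) / frame_len)
    return frames


# At 8 kHz a 4 ms frame holds 32 samples.
x = power_frames([0.5] * 32 + [1.0] * 32, sample_rate_hz=8000.0)
```

The same function would be applied to the stimulus samples to obtain xi and to the weighted, noise-filtered echo samples to obtain yi.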
Initialize variables:
i = 0 (frame counter)
j = 0 (frame counter for inactive signal duration)
nk = 0 (number of frames in current echo event; initial value, k = 0)
NSAMPS = 0 (accumulated number of frames for all events)
HAAS = 0 (counter up to 20 ms)
ei = 0 (running summation of all echo power for all events after weighting,
as seen at frame counter i; initial value, i = 0)
pi = 0 (running summation of all stimulus power during the measurement, as
seen at frame counter i; initial value, i = 0)
ek = 0 (running summation of echo power during the particular echo event
after weighting, as seen at event frame counter k; initial value, k = 0)
LEVENT = 0 (echo return loss level of most recent event, after weighting)
NEVENT = 0 (total number of echo events)
N750 = 0 (total number of echo events < 750 ms)
MINTCL = 75 (minimum echo return loss level of all events)
MAXTCL = 0 (maximum echo return loss level of all events)
EVENT[NEVENT] = 0 (initialize array for all event loss levels (in dB) to
zero; used to calculate sigma)
TEMPSK = 0 (running sum of stimulus power during all events)
SUM = 0 (used in calculating sigma)
SQ = 0 (used in calculating sigma)
Increment frame counter and read in 4 ms averaged echo power yi, and 4 ms averaged stimulus power,
xi; if there are no more valid inputs and either measurement file is complete, go to Step 8: calculate
parameters.
pi = pi + xi
Is stimulus loud enough for a valid echo loss calculation? If not, disregard
present frame and move to next frame.
NSAMPS = NSAMPS + nk
Increment the counter for the number of events that were temporally weighted
N750 = N750 + 1
Else
Store the echo return loss of the most recent event in dB for future sigma
calculation
EVENT(NEVENT) = LEVENT
Reconvert the echo return loss of the most recent event into linear;
recalculate weighted linear echo power
ek = sk/(10**(LEVENT/10))
Accumulate all the echo event powers for future use in calculating ALTCLt and
LTCLt
ei = ei + ek
Accumulate all the stimulus powers during events for future use in calculating
ALTCLt
TEMPSK = TEMPSK + sk
Reset echo event variables
k = 0
nk = 0
WEIGHT = 0
HAAS = 0
ek = 0
sk = 0
Go to 1
Calculate active long term TCLt (ALTCLt), long term TCLt (LTCLt), the number of echo events per
minute (NEVMIN), the percentage of echo event free speech (PER), the average length of an event
(AVGEVENT), and duration during which speech was inactive (DUR).
NOTE—Zero check ei before computing; if ei = 0, set ALTCLt and LTCLt to 100 dB.
ALTCLt = 10*log10(TEMPSK/ei)
LTCLt = 10*log10(pi/ei)
NEVMIN = 60*NEVENT/((i - j)*0.004) {number of events per minute}
PER = 100*((i - j) - NSAMPS)/(i - j) {percentage of echo free speech}
AVGEVENT = NSAMPS*4/NEVENT {average length of an event in
milliseconds}
DUR = j*0.004 {duration of stimulus inactivity in seconds}
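The closing calculations can be sketched in Python as shown below. This sketch assumes that the frame counts combine as (i − j), that DUR is j multiplied by the 4 ms frame length, and that a zero event count is guarded against; the function name, argument names, and sample counters are hypothetical.

```python
import math

FRAME_S = 0.004  # 4 ms frames


def summarize(i, j, nsamps, nevent, tempsk, ei, pi):
    """Compute the closing statistics from the accumulated counters.

    i: total frames read; j: frames with inactive stimulus;
    nsamps: frames inside echo events; nevent: number of events;
    tempsk: stimulus power during events; ei: weighted echo power;
    pi: total stimulus power.
    """
    active_frames = i - j
    if ei == 0:
        altclt = ltclt = 100.0           # zero check from the NOTE
    else:
        altclt = 10.0 * math.log10(tempsk / ei)
        ltclt = 10.0 * math.log10(pi / ei)
    return {
        "ALTCLT": altclt,
        "LTCLT": ltclt,
        "NEVMIN": 60.0 * nevent / (active_frames * FRAME_S),
        "PER": 100.0 * (active_frames - nsamps) / active_frames,
        # Guard against division by zero when no event occurred (assumption).
        "AVGEVENT": nsamps * 4.0 / nevent if nevent else 0.0,
        "DUR": j * FRAME_S,
    }


# Hypothetical counters: 60 s of frames, 15 s of which had inactive stimulus.
stats = summarize(i=15000, j=3750, nsamps=500, nevent=10,
                  tempsk=2.0e-3, ei=2.0e-7, pi=4.0e-2)
```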
Calculate sigma by analyzing the EVENT array which contains the echo return loss of each event;
each event, regardless of duration, is given equal weighting in the sigma calculation; the suggestion is
that it is the transition between discrete events and not their duration that is most objectionable.
Print ALTCLt, LTCLt, MINTCL, MAXTCL, NEVENT, NEVMIN, PER, N750, AVGEVENT, DUR,
SIGMA, MEAN
Annex Q
(normative)
The main signal consists of eight 1024-point pseudo-random noise segments. Each segment has the
same magnitude spectrum but a different phase spectrum with the phase randomized within and
between the segments uniformly from 0 to 360 degrees, in order to randomize the interaction between
the intermodulation products of the harmonically related spectral components. The duration of each
segment is 80 ms. They are merged with each other through a raised cosine window, with an additional
80 ms merging segment between them. The simultaneous fade-out of the previous segment and the
fade-in of the following segment eliminate the transients, which would occur at the segment
boundaries. The complete main signal thus consists of eight pseudo-random segments interleaved with
eight merging segments, each of 80 ms duration, giving a total length of 1.28 s. A simple filter at the
output provides the desired frequency shaping to approximate an average speech spectrum.
Measurements show that a Gamma distribution with parameter m = 0.545 provides a good
approximation to the instantaneous amplitude distribution of continuous speech. The syllabic
characteristics can be represented by a low pass response that is practically flat up to about 4 Hz
(the 3 dB point) followed by 6 dB per octave roll-off.
The final wave shape of the modulating signal was derived empirically from the Gamma distribution.
Varying the period of this pulse in a pseudo-random manner and adjusting its rise and fall time ratio
results in a satisfactory approximation to the spectrum of the modulation envelope of real speech.
In order to extend the repetition time of the final signal and to spread more evenly the maxima of the
modulating signal over the repeated sequence of the Gaussian signal, the ratio between the sampling
clock frequencies of both signals was chosen to be 4/255. Thus the clocking frequency of the main
signal is 12,800 Hz, and the clock frequency for the modulating signal is about 200.8 Hz. The repetition
times are: 1.28 s for the Gaussian signal, 10.2 s for the modulating signal, and 326.4 s for the final
modulated signal.
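The repetition times can be checked with exact rational arithmetic. The sketch below assumes a 16 × 1024-sample main signal and a 2048-sample modulating signal, as described elsewhere in this annex; the names are hypothetical, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
import math

MAIN_CLOCK_HZ = Fraction(12800)
CLOCK_RATIO = Fraction(4, 255)
MAIN_SAMPLES = 16 * 1024        # 16 segments of 1024 points
MODULATOR_SAMPLES = 2048

modulator_clock_hz = MAIN_CLOCK_HZ * CLOCK_RATIO              # about 200.8 Hz
main_period_s = MAIN_SAMPLES / MAIN_CLOCK_HZ                  # 1.28 s
modulator_period_s = MODULATOR_SAMPLES / modulator_clock_hz   # 10.2 s


def common_period(a, b):
    """Smallest duration that is a whole multiple of both periods."""
    return Fraction(math.lcm(a.numerator, b.numerator),
                    math.gcd(a.denominator, b.denominator))


combined_period_s = common_period(main_period_s, modulator_period_s)   # 326.4 s
```

The combined period equals 255 repetitions of the main signal and 32 repetitions of the modulating signal.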
The Gaussian signal is made up of 16 segments. The odd-numbered segments are generated by filling a
2-by-n array with zeros and then filling in the desired real and imaginary spectrum components using
Equations (Q.1) and (Q.2). The first entry is zero, i.e., there is no DC component, and there are no components
above 5500 Hz.
$$X_r(\omega) = \cos(2\pi a) \tag{Q.1}$$
The inverse FFT is then taken to transfer the signal to the time domain.
$$x(n) \Leftrightarrow X_r(\omega) + X_i(\omega) \tag{Q.3}$$
The even-numbered segments S(n) are
$$S_i(n) = S_i(n-1) \cdot 0.5\left(1 + \cos\frac{\pi(i-0.5)}{1024}\right) + S_i(n+1) \cdot 0.5\left(1 - \cos\frac{\pi(i-0.5)}{1024}\right)$$
i = 1 to 1024
n = 2, 4, ..., 16; for n + 1 > 16 use n + 1 − 16
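The merging operation can be sketched as a raised-cosine cross-fade. This example assumes that the cosine argument is π(i − 0.5)/1024, so that the previous segment fades fully out while the following segment fades fully in over the 1024 samples; the function names are hypothetical.

```python
import math

SEGMENT_LENGTH = 1024


def merge_weights(i):
    """Fade-out and fade-in weights for sample i (1..1024) of a merging segment."""
    phase = math.pi * (i - 0.5) / SEGMENT_LENGTH
    fade_out = 0.5 * (1.0 + math.cos(phase))   # applied to the previous segment
    fade_in = 0.5 * (1.0 - math.cos(phase))    # applied to the following segment
    return fade_out, fade_in


def merging_segment(previous, following):
    """Cross-fade two equal-length segments, sample by sample."""
    merged = []
    for i in range(1, SEGMENT_LENGTH + 1):
        w_out, w_in = merge_weights(i)
        merged.append(previous[i - 1] * w_out + following[i - 1] * w_in)
    return merged


def neighbour_indices(n, total_segments=16):
    """Odd-numbered neighbours of even segment n, wrapping n + 1 past 16 to 1."""
    nxt = n + 1 if n + 1 <= total_segments else n + 1 - total_segments
    return n - 1, nxt
```

Because the two weights always sum to 1, merging two identical segments reproduces that segment, which is one way to check an implementation.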
For the Gamma function, the 2048 samples are divided into 21 random-length pulse periods (number
of samples). The periods are 167, 43, 63, 119, 48, 57, 78, 88, 93, 107, 51, 71, 259, 60, 67, 207, 143, 54,
130, 45, and 98. Each period is divided into rise time of one-third and a fall time of two-thirds. That is,
rise and fall times are in 1:2 ratio.
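A quick check of the pulse periods, together with one possible way of splitting each period into rise and fall sections, is sketched below. Rounding the rise time to the nearest sample is an assumption of this example, and the function name is hypothetical.

```python
PERIODS = [167, 43, 63, 119, 48, 57, 78, 88, 93, 107, 51,
           71, 259, 60, 67, 207, 143, 54, 130, 45, 98]


def rise_fall_split(period):
    """Split a pulse period into rise and fall sample counts in a 1:2 ratio.

    Rounding the rise time to the nearest sample is an assumption; the text
    does not say how fractional samples are handled.
    """
    rise = round(period / 3)
    return rise, period - rise


total_samples = sum(PERIODS)                       # 2048
splits = [rise_fall_split(p) for p in PERIODS]     # first entry: (56, 111)
```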
The cubic interpolating spline function is used to model the rising and falling section of each segment.
where
n is the number of samples in the rising (or falling) section
s(i) is the value of the ith data point in the period
Annex R
(normative)
The bias signal consists of P.50 noise (F.5.3). For send measurements, it is presented in bursts at a 4 Hz
rate and 50% duty cycle (125 ms ‘‘ON,’’ 125 ms ‘‘OFF’’). The bias is presented at the standard test
level during the ‘‘ON’’ bursts.
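The burst timing can be expressed as a simple gating function. This sketch assumes that the pattern starts with an ‘‘ON’’ burst at time zero; the function name is hypothetical.

```python
BURST_RATE_HZ = 4.0
DUTY_CYCLE = 0.5


def bias_is_on(t_ms):
    """True while the bias burst is ON at time t_ms (milliseconds from start)."""
    period_ms = 1000.0 / BURST_RATE_HZ          # 250 ms
    on_ms = DUTY_CYCLE * period_ms              # 125 ms
    return (t_ms % period_ms) < on_ms


pattern = [bias_is_on(t) for t in (0, 100, 125, 200, 250, 374, 375)]
```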
For receive measurements, the bias may be presented either continuously or in the burst pattern.
Continuous presentation may be the most appropriate bias of a telephone with a simple AGC
function, but burst presentation may be better for telephones with more complex functions. Ideally,
both ways should be measured to determine which gives the most typical results. The telephone will be
measured in its average state during the entire measurement.
The measurement signal is a series of sine sweeps from 100 to 8500 Hz, at any rate suitable for time
delay spectrometry (TDS) measurements. The sweeps are not synchronized with the bias pulses. The
sweep spectrum may approximate the P.50 spectrum. At 315 Hz, the level of the measurement signal is
15 dB below the overall level of the bias signal.
The measurement is performed by TDS (G.4.3). The sweep length and number of averages are
adjusted to obtain a satisfactory signal-to-noise ratio in the measurement. Typically, a measurement
time (sweep length times number of averages) in the range of 16–128 s gives good results.
The true frequency resolution of the TDS measurement will be determined by the time window chosen,
not by the frequency interval in the analyzer. The minimum effective time window is 5.7 ms, which
corresponds to a frequency resolution (lowest measurable frequency) of 175 Hz.
In principle, this method can be used with any desired bias signal, including any of the speech-like
signals (see F.6).
Copyright © 2003 IEEE. All rights reserved.
Annex S
(informative)
Current performance requirements are based on measurements referred to the ERP. Future
requirements may be based on measurements referred to the free field. This annex provides
background information on this concept.
One goal of a telephonic experience is to simulate a conversation where two people are 1 m apart,
talking to each other. Now insert a complete telephone system between our two talkers. In a perfect
world, the quality of the conversation would be the same with a telephone system and in free space.
This is called the orthotelephonic reference.
Consider a loudspeaker with a perfectly flat free field frequency response through the audio band, as
shown in Figure S.1.
Play the same speaker into a HATS ear simulator, and the result is a 17 dB peak at 2.8 kHz. (For more
complete data, see ITU-T Recommendation P.57-2002 and ITU-T Recommendation P.58-1996.) See
Figure S.2.
The HATS ear simulator replicates the resonances which occur in a typical human pinna and ear canal
system, and measures at the (ear) drum reference point or DRP. It is because of the pinna and
the resonances in the ear canal that a loudspeaker with a flat free field response will not measure flat
into a HATS, at the DRP. Therefore, if a telephone receiver or headset is to sound the same as
a hypothetical flat speaker in the free field, the frequency response at the DRP should follow the free
Figure S.3—Phone that sounds the same as flat loudspeaker, measured at DRP
Most telephone companies are more familiar with the Type 1 ear simulator. This type of simulator
uses the ear reference point (ERP) rather than the DRP, which results in a different frequency curve
shape. Using the above hypothetical receiver tested into a Type 1 ear simulator yields a frequency
response which looks like Figure S.4.
Figure S.4—Phone that sounds the same as flat loudspeaker, measured at ERP
The important thing to remember is that the above curves, using either Type 1 simulators or HATS,
can all be referenced to a free field response. Another way of looking at it is that if you want your
handset or headset to sound like a flat loudspeaker in the free field, e.g., simulating the orthotelephonic
reference, the frequency response should look like either the ERP or DRP to free field transfer
function curves above.
The complete orthotelephonic response is due to the combination of frequency responses in the send,
network, line, and receive paths of an overall (end-to-end) connection. The exact distribution of
frequency response shaping in these paths is outside the scope of this Annex.
Annex T
(informative)
T.1 Conversions for dBV to dBm, and for 600 Ω and 900 Ω
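$$\mathrm{dBV} = 10 \log_{10} V^2 = 20 \log_{10} V$$
For R = 600 Ω
$$P = V^2 / R$$
therefore
$$\mathrm{dBm} = 10 \log_{10}\left(\frac{V^2 / 600}{0.001}\right) = \mathrm{dBV} + 2.2$$
For R = 900 Ω
$$P = V^2 / R$$
therefore
$$\mathrm{dBm} = 10 \log_{10}\left(\frac{V^2 / 900}{0.001}\right) = \mathrm{dBV} + 0.46$$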
Correction (dB) = 10 log (|Z1|/|Z2|), i.e., 10 times the log of the ratio of the magnitudes of the impedances, when
converting from impedance Z1 to Z2.
Depending on the impedance being used, conversion factors can be applied dB for dB to the measured
or calculated result.
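These conversions can be sketched in a few lines, assuming a purely resistive load and a 1 mW reference for dBm. The function names are hypothetical.

```python
import math


def dbv_to_dbm(level_dbv, impedance_ohms):
    """Convert a voltage level in dBV to power in dBm across a resistive load."""
    # P = V^2 / R, referred to 1 mW: dBm = dBV + 10*log10(1000 / R)
    return level_dbv + 10.0 * math.log10(1000.0 / impedance_ohms)


def impedance_correction_db(z1_ohms, z2_ohms):
    """Correction in dB when converting from impedance Z1 to Z2."""
    return 10.0 * math.log10(abs(z1_ohms) / abs(z2_ohms))


offset_600 = dbv_to_dbm(0.0, 600.0)     # about +2.2 dB
offset_900 = dbv_to_dbm(0.0, 900.0)     # about +0.46 dB
```

For example, a level of −67.2 dBV across 600 Ω corresponds to about −65 dBm.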
Two weighted noise measurement units have typically been used in telephony, dBmp and dBrnC. The
main differences between these two measurement units are the shape of the weighting filter and the
reference unit. The weighting filter for dBrnC is described in IEEE 743-1995.
The differences in the weighting functions are so slight as to be insignificant; thus the
conversion between the two units can be expressed as
dBrnC = dBmp + 90
Conversion from loudness ratings defined in ANSI/IEEE Std 661-1979 to those defined in ITU-T
Recommendation P.79-1993, as specified by ANSI/TIA/EIA-810-A-2000, is as follows:
The above conversions should be used as an approximation only. These conversions are based upon
approximated frequency response curves as specified in ANSI/TIA/EIA-810-A-2000. Proper
conversion may depend upon actual measurements being made with each measurement standard
where frequency responses deviate significantly from the norm.
where
0 dBPa ≙ 94 dBSPL, and 0 dBSPL ≙ 20 µPa, 1 Pa = 1 N/m²
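The relationship between these units can be verified numerically. This is a minimal sketch with hypothetical function names; it shows that 1 Pa corresponds to about 93.98 dBSPL, conventionally rounded to 94.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6   # 0 dBSPL corresponds to 20 micropascals


def pascal_to_dbspl(pressure_pa):
    """Sound pressure level in dBSPL for an RMS pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


def dbpa_to_dbspl(level_dbpa):
    """Convert a level referred to 1 Pa into a level referred to 20 micropascals."""
    return level_dbpa + pascal_to_dbspl(1.0)


one_pascal_dbspl = pascal_to_dbspl(1.0)
```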
Annex U
(informative)
U.1 Introduction
The results of a subjective loudness balance test procedure may be used to estimate the receive
loudness in those cases where objective measurements do not correlate well with real use performance.
This loudness balance subjective test procedure differs in specific test details but is similar to the CCM
laboratory ‘‘Contra-Lateral Balance’’ procedure. The procedure has been used to obtain loudness
differences between a reference headset receiver and four test headset receivers. All of the headsets had
on-ear type receivers. The standard deviations of the loudness balances obtained from 10 subjects
ranged from 1.8 to 4.9 dB, and averaged 2.7 dB over 23 trials. (A trial consisted of four loudness
balances for each of 10 subjects for one sound source and one test headset.) The accuracy of the
average loudness differences obtained in the tests for the four test headsets was represented by 95%
confidence intervals about the average of 2.1 dB.
The methods described in this clause have been successfully used with headsets. In principle, similar
methods can be used with handsets.
The loudness balance procedure is used to obtain loudness differences between a test and reference
headset. The receiver in the reference headset shall have objectively measured performance that
correlates well with its real-use receive performance. With the type of artificial ears currently available,
this requires a tight acoustic seal between the receiver and artificial ear during objective measurements,
and between the receiver and human ear in real use.
The loudness balance tests should be performed in a quiet room with background noise no greater
than 40 dBA. A loudness balance between the reference and test headsets is obtained by allowing
the subject to adjust the signal level to the test receiver until loudness of the sound from the test
receiver is judged equal to the loudness of the sound from the reference receiver. During this
determination, the test receiver is on one ear and the reference receiver is on the other ear. After
the loudness balance is determined, the loudness difference between the test and reference receivers
is represented by the difference in signal levels to the two receivers. To counteract the effects of
hearing acuity differences between the subject’s left and right ears, the tests should be repeated
with the test and reference headset receivers reversed on the subject’s ears. The results of the two
trials are averaged to determine the loudness difference. To obtain reasonably reliable data, a
minimum of 10 subjects should be used in the tests. These test subjects should have ‘‘clinically
normal’’ hearing. That is, the magnitude of measured hearing loss at any test frequency shall
be less than 30 dB. If possible, each test subject should have approximately equal hearing in
both ears.
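As an illustration only, the screening criterion and the averaging of the two receiver placements described above can be sketched in Python. The function names, the dictionary layout of the audiogram, and the numerical values are hypothetical and are not taken from this standard.

```python
def is_eligible(hearing_loss_db_by_freq, limit_db=30):
    """Return True if the magnitude of hearing loss is below the limit
    at every audiometric test frequency (keys in Hz, values in dB)."""
    return all(abs(loss) < limit_db for loss in hearing_loss_db_by_freq.values())


def subject_loudness_difference(ld_first_placement_db, ld_reversed_placement_db):
    """Average the loudness differences from the two receiver placements
    to reduce the effect of left/right hearing acuity differences."""
    return (ld_first_placement_db + ld_reversed_placement_db) / 2


# Hypothetical audiogram and trial results, for illustration only.
audiogram = {500: 10, 1000: 15, 2000: 25}
print(is_eligible(audiogram))                 # True: every loss is below 30 dB
print(subject_loudness_difference(2.0, 1.0))  # 1.5
```

A subject with a 35 dB loss at any frequency would be rejected by `is_eligible` under the 30 dB limit stated above.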
The loudness differences should be determined for six different signal sources consisting of 1/3 octave
band noise centered at the following frequencies: 315 Hz, 500 Hz, 800 Hz, 1250 Hz, 2000 Hz, and
3150 Hz. Narrowband noise is preferred over pure tones because sounds that are normally
heard are more complex than pure tones. Furthermore, subjects may adapt to pure tones after a short
period of listening, which could result in inaccurate measurements. This adaptation is less likely when
narrowband noise is used.
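For reference, the approximate band edges of the six 1/3 octave bands can be computed as in the following Python sketch. It assumes the common base-2 definition, in which the edges lie a factor of 2^(1/6) below and above the center frequency, applied directly to the nominal center frequencies listed above. This clause does not state which band-edge definition applies, so the result is an approximation for orientation only.

```python
# Nominal 1/3 octave center frequencies named in the text, in Hz.
CENTER_FREQS_HZ = [315, 500, 800, 1250, 2000, 3150]


def third_octave_edges(fc_hz):
    """Return (lower, upper) edge frequencies of a base-2 1/3 octave band."""
    factor = 2 ** (1 / 6)  # half of one third of an octave
    return fc_hz / factor, fc_hz * factor


for fc in CENTER_FREQS_HZ:
    lower, upper = third_octave_edges(fc)
    print(f"{fc:>5} Hz band: {lower:7.1f} Hz to {upper:7.1f} Hz")
```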
Two loudness balances are made for each of the six signals for each ear. The signal sources are
presented in a random order to the subject. The subject determines a loudness balance by adjusting an
attenuator that controls signal level to the test headset receiver, while alternating the signal between
the test and reference receivers with a switch. With each new signal, the starting signal level in the test
receiver should always be below that of the reference receiver. That is, the subject should always
initially need to increase the test receiver level to arrive at a loudness balance. After completing the
loudness balances for the first receiver–ear placement, the receivers are then reversed on the subject’s
ears and the tests repeated.
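The trial sequence described above (six signals, two balances each, for each receiver placement, in random order) could be generated as in the following Python sketch. The placement labels and the choice to shuffle all twelve trials of a placement together are assumptions made for illustration; the text requires only that the order be random.

```python
import random

SIGNALS_HZ = [315, 500, 800, 1250, 2000, 3150]


def build_schedule(rng, repetitions=2):
    """Return a list of (placement, center_frequency_hz) trials.

    All trials for the first receiver placement are run before the
    receivers are reversed; within a placement the order is random.
    """
    schedule = []
    for placement in ("first_placement", "reversed_placement"):
        block = SIGNALS_HZ * repetitions
        rng.shuffle(block)
        schedule.extend((placement, fc) for fc in block)
    return schedule


schedule = build_schedule(random.Random())
print(len(schedule))  # 24 trials per subject
```

Passing a seeded `random.Random` instance makes a given subject's schedule reproducible for later review.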
Loudness differences between some test headsets and the reference headset may be similar for the six
test sounds. The subject may thus learn during the test to set the balance attenuator at a specific
location to achieve a loudness balance. The subject’s final decision may then be influenced more
by what he or she thinks is the correct position to produce a balance than by the actual balance itself.
To prevent any such biasing of the results, a means should be incorporated in the test design to
randomly shift the loudness balance point.
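One possible way to shift the balance point is to draw a new experimenter-controlled offset for every trial, as in the following Python sketch. The 0 dB to 10 dB range and the 1 dB step are hypothetical values chosen for illustration; this clause does not specify them.

```python
import random


def draw_offsets(n_trials, max_offset_db=10, step_db=1, rng=None):
    """Draw one random experimenter offset (in dB) per trial.

    Changing the offset between trials moves the control position at
    which balance occurs, so a memorized position is of no help.
    """
    rng = rng or random.Random()
    steps = range(0, max_offset_db + 1, step_db)
    return [rng.choice(steps) for _ in range(n_trials)]


offsets = draw_offsets(24)
print(offsets)
```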
Before the tests begin, the subject should be given ample time to adjust the headsets to his or her ears.
The importance of proper receiver-to-ear coupling should be stressed to the subject and directions
given not to change the positioning of the receivers once the tests begin. Each test subject should adjust
the signal level to the reference receiver for his or her preferred listening level. After the receivers have
been properly positioned, the 1250 Hz sound source should be directed to the reference headset and the
subject instructed to adjust an attenuator until the sound is at his or her preferred level. This level for
the reference headset should then remain constant for all sound sources for that subject. (When the test
and reference receivers are reversed on the subject’s ears, the subject is again asked to adjust the
attenuator for preferred listening level.) In those cases where the test headset incorporates receive
compression, it is necessary to determine, in pre-tests with the test and reference receivers, the signal
level for the reference receiver. This level, which will probably be below the preferred level of most
subjects, should be such that it prevents the acoustic output of the test receiver from being limited for
at least 10 dB or so above the expected balance point for the six signal sources.
A block diagram of an example test circuit for implementing the loudness balance tests is given in
Figure U.1.
Amplifiers 2 and 4 convert from the circuit impedance to the headset receiver impedance, which is
300 Ω for this example. Switch 1 is a hand-held push button switch, which enables the subject to
alternate the signal between the test and reference headsets. Attenuator 1 is adjusted by the subject to
attain preferred listening levels in the reference headset receiver. Attenuator 2 is adjusted by the
experimenter to randomly shift the balance point. Attenuator 3 is adjusted by the subject to attain a
loudness balance between the test and reference receivers.
For circuit line-up, the test and reference headset receive jacks are terminated at 300 Ω (in the
example). Using a 1000 Hz tone, the gain of the amplifiers is adjusted, such that when the gain of
Amplifier 3 is numerically equal to the sum of the losses of Attenuators 2 and 3, the voltage levels at
TP1 and TP2 are equal.
After a loudness balance has been attained by the subject, the loudness difference between the test and
reference headset receivers is represented by the difference in the voltage levels at TP1 and TP2. The
loudness difference is also represented by the difference between the sum of the dB losses of
Attenuators 2 and 3 and the dB gain of Amplifier 3. For example, assume an amplifier gain of 15 dB
and a total loss of 16 dB for Attenuators 2 and 3. The loudness difference would be 16 − 15 = 1 dB; that is, the
test receiver is 1 dB louder than the reference receiver. The gain of Amplifier 3 should be determined in
pre-tests with the test and reference headsets so that the combination of gain in Amplifier 3 and loss in
Attenuators 2 and 3 provide a maximum range of adjustment on either side of the estimated loudness
balance point.
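The arithmetic above can be expressed as a short Python sketch. The function name is hypothetical, and the split of the 16 dB total loss between Attenuators 2 and 3 is assumed for illustration; only the total loss and the amplifier gain come from the example in the text.

```python
def loudness_difference_db(atten2_db, atten3_db, amp3_gain_db):
    """Loudness difference in dB: total loss of Attenuators 2 and 3
    minus the gain of Amplifier 3. A positive value means the test
    receiver is louder than the reference receiver."""
    return (atten2_db + atten3_db) - amp3_gain_db


# Example from the text: 15 dB gain and 16 dB total loss.
print(loudness_difference_db(6, 10, 15))  # prints 1

# A different experimenter setting of Attenuator 2 moves the subject's
# setting of Attenuator 3 but leaves the result unchanged.
print(loudness_difference_db(9, 7, 15))   # prints 1
```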
To estimate the receive characteristics of the test headset, the receive characteristics of the reference
headset shall first be objectively measured. The measurement bands are the same as specified for the
loudness balance procedure: 315 Hz, 500 Hz, 800 Hz, 1250 Hz, 2000 Hz, and 3150 Hz. The desired
results from the objective measurements are the receiver output pressures, in dB SPL, at the six test
frequencies.
The loudness difference between the test and reference receivers, at each of the test frequencies, is
calculated by averaging the 40 loudness differences (2 repetitions × 2 ears × 10 subjects) obtained at each
test frequency. The estimated output pressure for the test headset at each test frequency (assuming the
same input voltage had been used to objectively measure the reference receiver) is calculated by
TREP = RROP + LD (U.1)
where
TREP is the estimated output pressure of the test receiver in dB Pa,
RROP is the reference receiver objective pressure in dB Pa,
LD is the loudness difference between test and reference receivers in dB.
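As an illustration, Equation (U.1) can be applied at one test frequency as in the following Python sketch. The function name and all numerical values are hypothetical; in practice the list would hold the 40 measured loudness differences for that frequency.

```python
from statistics import mean


def estimate_test_pressure(rrop_db, loudness_diffs_db):
    """Apply Equation (U.1): TREP = RROP + LD, where LD is the mean of
    the individual loudness differences at one test frequency."""
    ld_db = mean(loudness_diffs_db)
    return rrop_db + ld_db


# Hypothetical data: 40 loudness differences at one test frequency.
diffs_db = [1.0] * 20 + [2.0] * 20
print(estimate_test_pressure(10.0, diffs_db))  # prints 11.5
```

The calculation is repeated independently for each of the six test frequencies, using the objectively measured reference receiver pressure at that frequency.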
Annex V
(informative)
Bibliography
[B1] Forsythe, G. E., Malcolm, M. A., and Moler, C. B., Computer Methods for Mathematical
Computations, Englewood Cliffs, NJ, Prentice-Hall, Inc., 1977.
[B2] IEEE 100, The Authoritative Dictionary of IEEE Standards Terms, Seventh Edition.9
9 The IEEE standards or products referred to in Annex V Bibliography are trademarks owned by the Institute of Electrical and
Electronics Engineers, Inc.