Measuring and Monitoring Speech Quality For Voice Over IP With PO

The document describes a study that evaluates three objective speech quality metrics - POLQA, ViSQOL, and P.563 - using a new dataset called TCD-VoIP that contains speech clips suffering from common degradations in Voice over IP (VoIP) calls like background noise, competing speakers, echo effects, clipping effects, and choppy speech. The results show that the full-reference metrics POLQA and ViSQOL are capable of accurately predicting a variety of common VoIP degradations, while highlighting the need for an accurate single-ended, no-reference metric to monitor speech quality for VoIP scenarios.

Technological University Dublin

ARROW@TU Dublin

Conference papers School of Computing

2015

Measuring and Monitoring Speech Quality for Voice over IP with POLQA, ViSQOL and P.563

Andrew Hines
Technological University Dublin

Eoin Gillen
University of Dublin, Trinity College

Naomi Harte
University of Dublin, Trinity College

Follow this and additional works at: https://arrow.tudublin.ie/scschcomcon

Part of the Digital Communications and Networking Commons, and the Signal Processing Commons

Recommended Citation
Hines, A., Gillen, E. & Harte, N. (2015). Measuring and Monitoring Speech Quality for Voice over IP with POLQA, ViSQOL and P.563. Interspeech Conference, Dresden, Germany, September 6-10. doi:10.21427/t1sg-k177

This Conference Paper is brought to you for free and open access by the School of Computing at ARROW@TU Dublin. It has been accepted for inclusion in Conference papers by an authorized administrator of ARROW@TU Dublin.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License

Measuring and Monitoring Speech Quality for Voice over IP with POLQA, ViSQOL and P.563

Andrew Hines (1,2), Eoin Gillen (2), Naomi Harte (2)
(1) School of Computing, Dublin Institute of Technology, Ireland
(2) Sigmedia, School of Engineering, Trinity College Dublin, Ireland

Abstract

There are many types of degradation which can occur in Voice over IP (VoIP) calls. Of interest in this work are degradations which occur independently of the codec, hardware or network in use. Specifically, their effect on the subjective and objective quality of the speech is examined. Since no dataset suitable for this purpose exists, a new dataset (TCD-VoIP) has been created and has been made publicly available. The dataset contains speech clips suffering from a range of common call quality degradations, as well as a set of subjective opinion scores on the clips from 24 listeners. The performances of three objective quality metrics: POLQA, ViSQOL and P.563, have been evaluated using the dataset. The results show that full-reference metrics are capable of accurately predicting a variety of common VoIP degradations. They also highlight the outstanding need for a wideband, single-ended, no-reference metric to accurately monitor speech quality for degradations common in VoIP scenarios.

Index Terms: Speech Quality, VoIP, POLQA, P.563, ViSQOL

1. Introduction

The growth in high speed mobile and fixed broadband has seen Voice over Internet Protocol (VoIP) services adopted by both consumer and business users as a viable alternative and potential replacement for Public Switched Telephone Networks (PSTNs) for domestic and international voice calls [1, 2]. Services such as Google Hangouts and Skype provide a variety of free and paid VoIP services from two-way voice-only calls to multi-party video and voice conferences. Measuring and monitoring speech quality for VoIP applications are different goals and are fulfilled by different objective speech quality models. During development and testing of VoIP systems, objective metrics can be used to measure and predict speech quality. For deployed systems, realtime monitoring is essential to provide predictions of the actual speech quality experienced by users of the VoIP applications. While subjective testing is considered the gold standard for evaluating speech quality, objective metrics are essential to the application developers and network system operators to ensure changes to the platform do not have a negative impact on users' quality of experience (QoE).

Metrics can be classified into two categories: parametric and signal-based models. The main types of signal-based models are full-reference and no-reference. A parametric model estimates quality using rule-based analysis of parameters of the network, signal and degradation. Parametric models are useful for network planning and can estimate the quality based on the relationship between factors, e.g. speech signal bit-rate and network bandwidth available. Full-reference signal-based metrics use a clean, undistorted signal as a reference, and the degraded signal under evaluation is compared to the reference to predict quality. No-reference metrics predict the speech quality based on an evaluation of the degraded signal only and as such are sometimes referred to as single-ended. Full-reference metrics are generally more accurate and are useful for measuring speech quality in the development and evaluation of VoIP services, while no-reference metrics can be deployed for realtime monitoring where access to a clean reference signal may not be practical [3]. Whether full-reference or no-reference, signal-based metrics are usually designed to predict the quality of a speech signal using a 5-point rating that mimics the mean opinion scores (MOS) from subjective listener tests. This paper evaluates two full-reference metrics, namely POLQA and ViSQOL, and one no-reference metric, P.563. These are described in more detail in section 3.

Many speech quality databases have been developed to evaluate speech quality and to develop and train objective speech metrics (e.g. see [4] for a list of databases benchmarked in the standardisation of POLQA). However, due to their valuable proprietary nature, many databases are not publicly available. As such, a new database of wideband speech with a wide variety of degradations that are common to VoIP scenarios, called TCD-VoIP [5], was developed by the authors. It is used here to evaluate the speech quality predictions of the POLQA, ViSQOL and P.563 metrics for a variety of VoIP scenarios. There have been some studies evaluating POLQA for packet loss and jitter in VoIP conditions [6, 7] and evaluating Skype with POLQA [8]. To our knowledge, to date no work has examined the robustness of these metrics within this specific and increasingly important domain. The results also have wide applicability as the degradations used (background noise, competing speaker, echo effects, clipping effects and choppy speech) occur independently of the codec, network or hardware.

The paper is laid out as follows. Section 2 describes the TCD-VoIP database and section 3 introduces the objective metrics that were benchmarked in this study. Section 4 describes the evaluation method and section 5 presents and discusses the results before concluding remarks are made in section 6.

2. TCD-VoIP

The TCD-VoIP dataset [5] is a publicly available dataset containing degradations common to VoIP (available for download from www.sigmedia.tv/resources). It contains five categories of degradation: background noise, competing speaker, echo effects, clipping effects and choppy speech. The degradations are platform-independent, i.e. they are conditions that occur independently of the codec, network or hardware.
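
To make these degradation categories concrete, the following is a minimal Python sketch of how clipping, echo, additive noise (or a competing speaker) at a target SNR, and chop could be applied to a list of samples. It follows the verbal descriptions of the degradations in Section 2 but is not the authors' generation code: the function names, the single-tap echo, the power-based SNR definition and the reading of the three chop modes are all assumptions made for illustration.

```python
import math


def clip(signal, multiplier, max_amp=1.0):
    """Amplify by `multiplier`, then hard-limit samples to +/- max_amp."""
    return [max(-max_amp, min(max_amp, s * multiplier)) for s in signal]


def echo(signal, alpha, delay_ms, sample_rate):
    """Add one delayed copy of the signal scaled by alpha (assumed single tap)."""
    d = int(round(delay_ms * sample_rate / 1000.0))
    out = list(signal)
    for i in range(d, len(signal)):
        out[i] += alpha * signal[i - d]
    return out


def mix_at_snr(signal, interferer, snr_db):
    """Scale `interferer` so the signal-to-interferer power ratio is snr_db, then add."""
    p_sig = sum(s * s for s in signal) / len(signal)
    p_int = sum(n * n for n in interferer) / len(interferer)
    gain = math.sqrt(p_sig / (p_int * 10 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(signal, interferer)]


def chop(signal, rate_hz, period_s, sample_rate, mode):
    """Chop `period_s` seconds of audio `rate_hz` times per second.

    Assumed reading of the three modes:
      insert    - chopped samples are replaced by zeros
      delete    - chopped samples are removed (output gets shorter)
      overwrite - chopped samples are replaced by the samples just before them
    """
    if rate_hz <= 0:
        return list(signal)
    n = int(round(period_s * sample_rate))     # samples per chop
    step = int(round(sample_rate / rate_hz))   # samples between chop starts
    if n >= step:
        raise ValueError("chop period must be shorter than the chop interval")
    out, pos = [], 0
    for start in range(step, len(signal), step):
        end = min(start + n, len(signal))
        out.extend(signal[pos:start])
        if mode == "insert":
            out.extend([0.0] * (end - start))
        elif mode == "overwrite":
            out.extend(signal[start - (end - start):start])
        elif mode != "delete":
            raise ValueError("mode must be insert, delete or overwrite")
        pos = end
    out.extend(signal[pos:])
    return out
```

For instance, `clip(samples, 5.0)` saturates any sample whose amplified magnitude exceeds full scale, and `chop(samples, 2, 0.02, 48000, "overwrite")` repeats the preceding 20 ms of audio twice per second.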
[Figure 1: five bar-chart panels plotting MOS score (y-axis, 0 to 5) against condition number (x-axis): (a) NOISE, conditions 1 to 24; (b) COMPSPKR, conditions 1 to 14; (c) ECHO, conditions 1 to 20; (d) CLIP, conditions 1 to 14; (e) CHOP, conditions 1 to 24.]

Figure 1: The subjective MOS results from the five degradation types. The MOS value for a condition is the average score given by all listeners to the four speech samples affected by that condition. The error bars are 95% confidence intervals obtained using the method in ITU-T Rec. P.1401 [9]. The 1st condition in each figure represents the clean reference condition. The MNRUs are highlighted in white. The bar charts highlight that conditions covering the ACR quality range were tested across each degradation type.
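
As a rough illustration of how a per-condition MOS and its error bar can be derived from raw listener ratings, the sketch below averages all ratings collected for a condition and attaches a 95% interval. It is not the ITU-T Rec. P.1401 procedure used for Figure 1: the interval here is a plain normal approximation (1.96 standard errors), and the function names and the `(condition, score)` input layout are assumptions made for illustration.

```python
import math
import statistics
from collections import defaultdict


def mos_with_ci(scores, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes the ratings are independent and uses a normal approximation;
    this is a simplification, not the P.1401 method cited in the caption.
    """
    mos = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mos, half_width


def per_condition_mos(ratings):
    """Group (condition, score) pairs and summarise each condition."""
    grouped = defaultdict(list)
    for condition, score in ratings:
        grouped[condition].append(score)
    return {c: mos_with_ci(s) for c, s in grouped.items()}
```

With 24 listeners each rating the four samples of a condition, every call to `mos_with_ci` would receive 96 scores.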

The database has been subjectively labelled with listener tests complying with the ACR methodology presented in ITU-T Rec. P.800 [10]. The TSP speech database from McGill University in Canada [11] provided the reference speech material. It was recorded in an anechoic chamber and consists of speakers reading sentences from the Harvard test sentence list. The speech samples in the TSP speech database are 16-bit WAV files sampled at 48 kHz.

Full information on the degradation types in the TCD-VoIP dataset can be found in [5], with a summary presented here in Table 1. The database was designed to have each type of degradation spanning the full MOS range, i.e. from Bad to Excellent. The per-condition results, grouped by degradation class, are presented in Figure 1. For chop, the degradation varied according to whether zeros were inserted to replace samples, or whether they were deleted or overwritten with earlier samples. The chop period refers to the length of the chopped sample; the rate refers to how often the samples were chopped. To create clipped samples, a multiplier was applied to the original signal and values over the maximum value were simply clipped to that level. Competing speakers are treated as a separate issue from large crowd babble noise, as the speaker is intelligible and this is a common VoIP call scenario. The gender and SNR level of the competing speaker were varied. For echo, the alpha value is the amplitude of the first delayed version of the signal as a fraction of the original. The delay parameter is the number of ms before the first delayed version of the signal relative to the original. Four types of noise were used: speech babble noise, car noise, road noise and office noise. The SNR was also varied for these noise degradations.

Aside from the VoIP conditions, Modulated Noise Reference Unit (MNRU) conditions were also included in the tests (see ITU-T Rec. P.810 [12] and [5] for further details). 24 listeners were used in all experiments and each condition was tested with 4 speakers (2 male and 2 female).

Table 1: Summary of Degradations and Parameters used in TCD-VoIP

Degradation         Conditions  Parameters   Range
Chop                20          Rate         0-6 chops/s
                                Period       0.02-0.04 s
                                Mode         Insert, Delete, Overwrite
Clip                10          Multiplier   1-55
Competing Speaker   10          Gender code  1-5
                                SNR          10-50 dB
Echo                20          Alpha        0-0.5
                                Delay        0-220 ms
Noise               20          Noise Type   Car, Street, Office, Babble
                                SNR          5-55 dB
MNRUs               4           SNR (Q)      48, 36, 24, 12

3. Objective Speech Quality Metrics

As mentioned in the introduction, different application scenarios necessitate different speech quality models. The ITU G.107 E-Model [13, 14] is an example of a parametric model and is not investigated here as this paper is focused on evaluating signal-based models. Full-reference models can produce accurate measurements of speech quality by comparing a reference and test signal. Monitoring models are no-reference and estimate the speech quality from the test signal without access to a clean reference to compare against. With access to more information, full-reference metrics can generally produce more accurate predictions than no-reference metrics but are not suitable for deployment as realtime monitoring metrics in VoIP applications.

The performances of three objective metrics: POLQA, ViSQOL and P.563, have been compared in this study. Prior to the development of POLQA, ITU-T Rec. P.862 [15] presented PESQ, a full-reference metric designed to estimate the quality of narrowband (300 to 3,400 Hz) speech and networks. It first aligns the degraded and reference signals, and then compares both signals using a perceptual model. A subsequent revision to P.862 (P.862.2) [16] enabled PESQ to rate wideband (50 to 7,000 Hz) signals. PESQ is widely used; however, it is inaccurate in some scenarios: suboptimal listening levels, loudness loss, delays in conversational tests, talker echo or sidetone. These limitations are acknowledged in the recommendation.

POLQA, introduced in ITU-T Rec. P.863 [4], is seen as a successor to PESQ, and was designed to conform to newer
industry requirements and address acknowledged shortcomings of PESQ. The extended PESQ revision (P.862.2) added wideband support to the model (50 to 7,000 Hz) while POLQA can rate signals with bandwidths of 50 to 14,000 Hz (superwideband signals). The basic philosophy used in POLQA is the same as that used in PESQ, i.e. the algorithm first aligns the degraded and reference signals, and then compares both signals using a perceptual model. The POLQA algorithm also contains some additional steps and considerations designed to improve prediction accuracy of the metric. Despite this, some of the limitations of PESQ (specifically in the cases of delays in conversational tests, talker echo or sidetone) are still present in POLQA [17]. In this paper, we only assess the performance of POLQA for the VoIP scenarios, as its performance should be superior to PESQ.

ViSQOL [18] evolved from NSIM (Neurogram Similarity Index Measure), a tool developed by Hines and Harte [19] to predict speech intelligibility for hearing-impaired listeners. An overview of NSIM and ViSQOL is given in Hines et al. [20], where the performance of ViSQOL is compared to that of PESQ for two common VoIP issues (clock drift and jitter). NSIM is a full-reference metric which compares neurograms created from the reference and degraded signals. A simplified algorithm is used in ViSQOL, which compares spectrograms created from Short-term Fourier Transforms (STFT) of both signals. NSIM outputs a similarity score from 0 to 1, which is mapped to a MOS-LQO scale. Metrics such as POLQA and ViSQOL are useful when designing new VoIP systems or measuring and evaluating the performance of systems for particular scenarios. ViSQOL and POLQA have also recently been adapted for audio quality evaluation [21]. This work builds on prior work where POLQA, PESQ and ViSQOL were evaluated under a variety of narrowband speech scenarios with different datasets [22, 23, 24].

P.563, introduced in ITU-T Rec. P.563 [25], is a no-reference metric designed to estimate the quality of narrowband (100 to 3,100 Hz) speech. Sometimes referred to as single-ended, no-reference metrics like P.563 predict speech quality without access to a clean reference signal. This class of metric is particularly useful in realtime monitoring scenarios where no reference signal is available to compare against. P.563 was designed to account for the full range of degradations present in PSTNs. To rate a speech signal without a reference, P.563 makes use of a large number of characteristic speech parameters, which can be split into 6 categories: basic speech descriptors, vocal tract analysis, speech statistics, static SNR, segmental SNR and interruptions/mutes. The output score is based on these parameters. Output scores have been mapped to MOS values using a set of speech clips and subjective test results. Some of its limitations include talker echo, sidetones and singing voice as well as limited testing with amplitude clipping during standardisation [3].

The metrics chosen for this test are the current recommended full-reference (POLQA) and no-reference (P.563) metrics and the full-reference metric ViSQOL that was developed to specifically target VoIP scenarios. It should be noted that other speech quality metrics have been developed and are in common use but were not tested here (e.g. [26, 27, 16]). The three chosen metrics provide a baseline benchmark for objective full-reference and no-reference metrics on this dataset.

4. Metric Evaluation

The subjective listener test mean opinion scores (MOS-LQS) for the database were compared with predictions from the three objective metrics. POLQA was tested using its super-wideband mode and ViSQOL was tested using its wideband speech mode. As P.563 is a narrowband metric, the degraded signals were resampled at 8 kHz for testing with the no-reference metric. The test evaluated 384 sample speech files. With 4 samples (2 male and 2 female speakers) tested for each condition, this gives 96 conditions. The objective mean opinion score (MOS-LQO) for the given condition was computed. The per-condition MOS-LQS were used to benchmark the metrics.

5. Results and Discussion

The results for the objective metrics POLQA, ViSQOL and P.563 on the dataset (denoted by MOS-LQO) are compared to the subjective results (denoted by MOS-LQS) in Figure 2, broken down by degradation condition. A statistical analysis of the objective metric prediction accuracy compared with the subjective listener test results is presented in Table 2. It reports Pearson correlation coefficients (ρ_pearson), Spearman rank order coefficients (ρ_spearman) and root mean squared error (RMSE) for each metric. The results are further broken down by condition type. As ViSQOL performed poorly with the CHOP data, two aggregated condition totals are displayed, including and excluding the CHOP conditions.

[Figure 2: three scatter-plot panels of MOS-LQO (y-axis, 1 to 5) against MOS-LQS (x-axis, 1 to 5), with points labelled by degradation class: CHOP, CLIP, COMPSPKR, ECHO, NOISE.]

Figure 2: Scatter plots for POLQA, ViSQOL and P.563

Table 2: Pearson correlation coefficients, Spearman rank correlations and RMSE per condition. The results are broken down by degradation class with a grouped result for all conditions and a final grouping for all conditions excluding the chop condition.

                     CHOP   CLIP   COMPSPKR  ECHO   NOISE  ALL    ALL (excluding CHOP)
ρ_pearson   ViSQOL   0.485  0.978  0.968     0.950  0.927  0.834  0.939
            POLQA    0.968  0.843  0.986     0.957  0.960  0.900  0.885
            P.563    0.628  0.823  0.661     0.640  0.547  0.630  0.638
ρ_spearman  ViSQOL   0.543  0.987  0.952     0.979  0.886  0.836  0.939
            POLQA    0.955  0.890  0.965     0.953  0.952  0.903  0.891
            P.563    0.617  0.719  0.626     0.520  0.469  0.561  0.552
RMSE        ViSQOL   0.381  0.244  0.204     0.557  0.533  0.772  0.593
            POLQA    0.074  0.387  0.263     0.385  0.466  0.480  0.537
            P.563    0.732  0.694  0.685     0.961  0.885  0.813  0.838

POLQA performs well in predicting quality across all the degradation types tested. With careful review of Figure 2, it can be seen that POLQA has a general trend of over-estimating quality for noise, echo and competing speakers. It tends to underestimate for clip, with more consistent performance for chop. Examining the statistics in Table 2 confirms POLQA's ability to predict speech quality accurately across all condition classes in the TCD-VoIP database, with high scores both in terms of Pearson correlation coefficients and Spearman rank correlations.

The quality predictions from ViSQOL are well-correlated with the subjective results in the clip, competing speaker, echo and noise tests. There is a general trend visible that ViSQOL underestimates quality for echo conditions. ViSQOL's scores on the choppy speech are not well-matched to the subjective scores. An analysis of the individual conditions found that the insert and delete conditions account for the over-estimated cluster (above the diagonal), while the underestimated cluster of 7 chop conditions seen significantly below the diagonal was composed exclusively of the overwrite chop sub-condition. When portions of the signal are overwritten, ViSQOL's comparison algorithm can be tricked into aligning speech segments with the overwritten repetition segment rather than the original segment. This causes mis-aligned comparisons with the reference spectrogram and results in a low speech quality estimate. Conditions using the other two chop modes do not cause this problem; in fact, ViSQOL marginally overestimated quality for these conditions. Overall, the correlation statistics reveal that the performance of ViSQOL and POLQA is close, particularly if the chop conditions are not taken into account. This is useful for researchers as ViSQOL is a freely available speech quality metric.

P.563 was the only no-reference metric tested in this work. As a no-reference metric, its scores were not expected to be as well-matched as those of POLQA or ViSQOL. However, as can be seen in Figure 2, P.563's predictions bear almost no relation to the subjective results. It appears that the lowest (MOS ≤ 2) subjective results also obtain the lowest P.563 results, but no further relationship can be discerned. Almost all of P.563's results lie between MOS values of 2.5 and 3.5. Looking at Table 2, the only test in which a positive trend can be seen is the clipped speech test. This was somewhat of a surprise as it was noted earlier that amplitude clipping was a condition with limited testing during the metric's development.

These results show that POLQA is capable of predicting the subjective MOS value of any condition in the TCD-VoIP dataset, although its predictions for some clipped speech may be slightly low. ViSQOL is capable of predicting the subjective MOS values for speech with clipping, noise, competing speakers or echo, but struggles with choppy speech, specifically in cases where portions of the signal have been overwritten.

The results for P.563 show that it is incapable of predicting subjective MOS values for conditions in the dataset. P.563's use cases (listed by Möller et al. [3]) are mostly for detecting signal warping or network effects. Also, two of its limitations are echo and clipping. From this, it can be concluded that P.563 is unsuitable for the task of rating clips in TCD-VoIP. A gap exists for a wideband speech quality metric capable of monitoring VoIP applications, as none of the no-reference objective quality metrics currently available were specifically developed with this task in mind.

6. Conclusion

This paper reports benchmarking results of three speech quality metrics on a VoIP speech database. Two full-reference signal-based metrics were evaluated to establish their accuracy as measurement tools for speech quality. The results showed that for the classes of VoIP degradation tested, full-reference speech quality metrics can provide accurate predictions of speech quality and could be used in developing and testing. The results for the ITU recommended POLQA metric were consistent across all degradation classes, further validating its capabilities in a wide range of speech scenarios. The tests also highlighted alignment problems for ViSQOL when choppy data uses an overwrite strategy repeating a previous segment of the speech. This will be investigated further in future development of the ViSQOL metric. For monitoring applications, the results showed that the no-reference metric tested, P.563, could not predict quality to a usable level of accuracy. This highlights an important unaddressed requirement for VoIP applications, namely the need for a no-reference wideband speech quality metric capable of monitoring VoIP applications. The authors are currently using the findings presented to address the challenge of monitoring VoIP quality with a realtime, wideband no-reference metric.

7. Acknowledgements

Thanks to Google, Inc. and Enterprise Ireland for funding.

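
For readers who want to compute statistics of the kind reported in Table 2 on their own per-condition scores, the following is a minimal sketch of the Pearson correlation, the Spearman rank correlation and the RMSE between subjective (MOS-LQS) and objective (MOS-LQO) values. It uses only the Python standard library. The function names are illustrative, tied values receive average ranks, and the RMSE is the plain root mean squared difference, which may differ from any mapped or adjusted variant used to produce the published figures.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean squared difference between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

A call such as `pearson(mos_lqs, mos_lqo)` over the conditions of one degradation class would correspond to one cell of the table, where `mos_lqs` and `mos_lqo` are hypothetical lists of per-condition scores.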
8. References

[1] Alcatel-Lucent, “PSTN industry analysis and service provider strategies: Synopsis,” http://goo.gl/tTPFes, Alcatel-Lucent, Paris, France, Tech. Rep. Bell Labs Analysis for BT, 2013.
[2] L. K. Vanston and R. L. Hodges, “Forecasts for the us telecommunications network,” Telektronnik, vol. 104, no. 3/4, pp. 18–28, 2008.
[3] S. Möller, W.-Y. Chan, N. Côté, T. H. Falk, A. Raake, and M. Waltermann, “Speech quality estimation: Models and trends,” Signal Processing Magazine, IEEE, vol. 28, no. 6, pp. 18–28, 2011.
[4] ITU, “Perceptual objective listening quality assessment,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. P.863, 2011.
[5] N. Harte, E. Gillen, and A. Hines, “TCD-VoIP, a research database of degraded speech for assessing quality in voip applications,” in Quality of Multimedia Experience (QoMEX), Costa Navarino, Greece, 2015.
[6] M. Soloducha and A. Raake, “Speech quality of VoIP: bursty packet loss revisited,” in Speech Communication; 11. ITG Symposium; Proceedings of, Sept 2014, pp. 1–4.
[7] J. Holub and O. Slavata, “Impact of IP channel parameters on the final quality of the transferred voice,” in Wireless Telecommunications Symposium (WTS), 2012, April 2012, pp. 1–5.
[8] J. Zhu, R. Vannithamby, C. Rodbro, M. Chen, and S. Vang Andersen, “Improving QoE for Skype video call in mobile broadband network,” in Global Communications Conference (GLOBECOM), 2012 IEEE, Dec 2012, pp. 1938–1943.
[9] ITU, “Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models,” Int. Telecomm. Union, Geneva, Switzerland, Tech. Rep. ITU-T Rec. P.1401, 2012.
[10] ITU, “Methods for subjective determination of transmission quality,” Int. Telecomm. Union, Geneva, Switzerland, Tech. Rep. ITU-T Rec. P.800, 1996.
[11] P. Kabal, “Tsp speech database,” McGill University, Quebec, Canada, Tech. Rep. Database Version 1.0, 2002.
[12] ITU, “Modulated noise reference unit (mnru),” Int. Telecomm. Union, Geneva, Switzerland, Tech. Rep. ITU-T Rec. P.810, 1996.
[13] ITU, “The E-model, a computational model for use in transmission planning,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. G.107, 2009.
[14] ITU, “Wideband E-model,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. G.107.1, 2011.
[15] ITU, “Perceptual evaluation of speech quality (PESQ): an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. P.862, 2001.
[16] ITU, “Wideband extension to recommendation P.862 for the assessment of wideband telephone networks and speech codecs,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. P.862.2, 2005.
[17] ITU, “Perceptual objective listening quality assessment,” Int. Telecomm. Union, Geneva, Switzerland, Tech. Rep. ITU-T Rec. P.863, 2011.
[18] A. Hines, J. Skoglund, A. C. Kokaram, and N. Harte, “Visqol: an objective speech quality model,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2015:13, May 2015.
[19] A. Hines and N. Harte, “Speech intelligibility prediction using a neurogram similarity index measure,” Speech Communication, vol. 54, no. 2, pp. 306–320, 2012.
[20] A. Hines, J. Skoglund, A. Kokaram, and N. Harte, “ViSQOL: The virtual speech quality objective listener,” in Acoustic Signal Enhancement; Proceedings of IWAENC 2012; International Workshop on, Sept 2012, pp. 1–4.
[21] A. Hines, E. Gillen, J. Skoglund, A. Kokaram, and N. Harte, “Visqolaudio: An objective audio quality metric for low bitrate codecs,” The Journal of the Acoustical Society of America, vol. 137:6, June 2015.
[22] A. Hines, J. Skoglund, A. Kokaram, and N. Harte, “Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA,” in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 2013.
[23] A. Hines, P. Pocta, and H. Melvin, “Detailed analysis of PESQ and VISQOL behaviour in the context of playout delay adjustments introduced by VoIP jitter buffer algorithms,” in Quality of Multimedia Experience (QoMEX), Klagenfurt am Wörthersee, Austria, 2013.
[24] P. Pocta, H. Melvin, and A. Hines, “An analysis of the impact of playout delay adjustments introduced by VoIP jitter buffers on speech quality,” Acta Acoustica united with Acustica, vol. 101, no. 2, May–June 2015.
[25] ITU, “Single-ended method for objective speech quality assessment in narrow-band telephony applications,” Int. Telecomm. Union, Geneva, Switzerland, ITU-T Rec. P.563, 2011.
[26] ANSI ATIS, “0100005-2006: Auditory non-intrusive quality estimation plus (ANIQUE+): Perceptual model for non-intrusive estimation of narrowband speech quality,” 2006.
[27] V. Grancharov, D. Zhao, J. Lindblom, and W. Kleijn, “Low-complexity, nonintrusive speech quality assessment,” Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 6, pp. 1948–1956, Nov 2006.

12 pages
Methodology
No ratings yet
Methodology
11 pages
Objective Speech Quality Measures For Internet Telephony
No ratings yet
Objective Speech Quality Measures For Internet Telephony
9 pages
A Handbook For Successful VOIP
No ratings yet
A Handbook For Successful VOIP
13 pages
WP Pesq Introduction
No ratings yet
WP Pesq Introduction
15 pages
Assessment and Prediction of Speech Quality in Telecommunications
No ratings yet
Assessment and Prediction of Speech Quality in Telecommunications
16 pages
Gap Model QOE
No ratings yet
Gap Model QOE
12 pages
Voice Quality Snom Whte Paper 2015-08-06 Approved Content
No ratings yet
Voice Quality Snom Whte Paper 2015-08-06 Approved Content
10 pages
Common Voip Service Quality Thresholds: Tektronix Reference Chart
No ratings yet
Common Voip Service Quality Thresholds: Tektronix Reference Chart
2 pages
Altran Otn WP 01 Q3fy18 New
No ratings yet
Altran Otn WP 01 Q3fy18 New
22 pages
Voice Over IP (VoIP) Speech Quality Measurement With Open-Source Software Components
No ratings yet
Voice Over IP (VoIP) Speech Quality Measurement With Open-Source Software Components
4 pages
ITN Module 1
No ratings yet
ITN Module 1
56 pages
Quality Speech Measure GSM
No ratings yet
Quality Speech Measure GSM
5 pages
TMOS Administration: (Total Questions: 166)
No ratings yet
TMOS Administration: (Total Questions: 166)
84 pages
Subjective Evaluation of Voice Quality Over GSM Network For Quality of Experience (QoE) Measurement
No ratings yet
Subjective Evaluation of Voice Quality Over GSM Network For Quality of Experience (QoE) Measurement
5 pages
Towards Non-Intrusive Speech Quality Assessment For Modern Telecommunications
No ratings yet
Towards Non-Intrusive Speech Quality Assessment For Modern Telecommunications
6 pages
Call Quality and Its Parameter Measurement in Telecommunication Networks
No ratings yet
Call Quality and Its Parameter Measurement in Telecommunication Networks
4 pages
VOIP Troubleshooting Whitepaper Final
No ratings yet
VOIP Troubleshooting Whitepaper Final
8 pages
Voice Quality Degradation Recognition Using The Call Lengths
No ratings yet
Voice Quality Degradation Recognition Using The Call Lengths
6 pages
Solving Qos in Voip:: A Formula For Explosive Growth?
No ratings yet
Solving Qos in Voip:: A Formula For Explosive Growth?
4 pages
Anite Network Testing - Testing of Voice Quality Using PESQ and POLQA Algorithms - Whitepaper
No ratings yet
Anite Network Testing - Testing of Voice Quality Using PESQ and POLQA Algorithms - Whitepaper
17 pages
POLQA White Paper 1011
No ratings yet
POLQA White Paper 1011
8 pages
Measuring Voice Quality
No ratings yet
Measuring Voice Quality
3 pages
Perceived Speech Quality Prediction For Voice Over IP-based Networks
No ratings yet
Perceived Speech Quality Prediction For Voice Over IP-based Networks
5 pages
Speech Quality Measurement Tools For Dynamic Network Management
No ratings yet
Speech Quality Measurement Tools For Dynamic Network Management
6 pages
Mean Opinion Score: Codecs Bandwidth PCM Modulation
No ratings yet
Mean Opinion Score: Codecs Bandwidth PCM Modulation
4 pages
RDCM Manual 1v1
No ratings yet
RDCM Manual 1v1
22 pages
Training - TEMS Pocket
No ratings yet
Training - TEMS Pocket
101 pages
Network Design: Architecting With Google Cloud Platform: Design and Process
100% (1)
Network Design: Architecting With Google Cloud Platform: Design and Process
32 pages
Telemovel Samsung P310
No ratings yet
Telemovel Samsung P310
138 pages
PESQ - Perceptual Evaluation of Speech Quality: Principle
No ratings yet
PESQ - Perceptual Evaluation of Speech Quality: Principle
2 pages
Quality Assurance Versus Quality Control Testing Is A Quality Control Activity
No ratings yet
Quality Assurance Versus Quality Control Testing Is A Quality Control Activity
5 pages
Agilent Technologies
No ratings yet
Agilent Technologies
20 pages
FALLSEM2024-25 SWE4005 ETH VL2024250103359 2024-07-26 Reference-Material-I
No ratings yet
FALLSEM2024-25 SWE4005 ETH VL2024250103359 2024-07-26 Reference-Material-I
56 pages
Altai IX600 Catalog Eng 20211222
No ratings yet
Altai IX600 Catalog Eng 20211222
2 pages
Acn Udp
No ratings yet
Acn Udp
11 pages
VoIP Book
No ratings yet
VoIP Book
52 pages
EC8491 2marks
No ratings yet
EC8491 2marks
34 pages
Vehicle Tracking by GPS - GSM
No ratings yet
Vehicle Tracking by GPS - GSM
29 pages
Park Warning: Shri Sant Gajanan Maharaj College of Engineering Shegaon - 444203 (M.S.)
No ratings yet
Park Warning: Shri Sant Gajanan Maharaj College of Engineering Shegaon - 444203 (M.S.)
19 pages
APN Problem Troubleshooting
No ratings yet
APN Problem Troubleshooting
6 pages
PAWM Vs PAPM Reference1
No ratings yet
PAWM Vs PAPM Reference1
9 pages
Only. This File Is Illegal.: Analysis and Simulation of Fractal Antenna For Mobile Wimax
No ratings yet
Only. This File Is Illegal.: Analysis and Simulation of Fractal Antenna For Mobile Wimax
8 pages
Bluetooth and LRC
No ratings yet
Bluetooth and LRC
2 pages
4lecture 9 Private Network VPN
No ratings yet
4lecture 9 Private Network VPN
6 pages
AW Bone Conduction Hearing For Children
No ratings yet
AW Bone Conduction Hearing For Children
4 pages
SRX Getting Started - Troubleshooting Commands
No ratings yet
SRX Getting Started - Troubleshooting Commands
3 pages
Mastering FreeSWITCH
From Everand
Mastering FreeSWITCH
Anthony Minessale II
No ratings yet
Ultimate Cisco Collaboration Infrastructure for Enterprise Solutions: Unlock the True Potential of Cisco Collaboration Infrastructure for Deploying and Managing Solutions for Enterprises (English Edition)
From Everand
Ultimate Cisco Collaboration Infrastructure for Enterprise Solutions: Unlock the True Potential of Cisco Collaboration Infrastructure for Deploying and Managing Solutions for Enterprises (English Edition)
Lalit Pamnani
No ratings yet
CCNA Voice Study Guide: Exam 640-460
From Everand
CCNA Voice Study Guide: Exam 640-460
Andrew Froehlich
No ratings yet
Designing and Implementing Linux Firewalls and QoS using netfilter, iproute2, NAT and l7-filter
From Everand
Designing and Implementing Linux Firewalls and QoS using netfilter, iproute2, NAT and l7-filter
Lucian Gheorghe
No ratings yet
VoIP Telephony and You: A Guide to Design and Build a Resilient Infrastructure for Enterprise Communications Using the VoIP Technology (English Edition)
From Everand
VoIP Telephony and You: A Guide to Design and Build a Resilient Infrastructure for Enterprise Communications Using the VoIP Technology (English Edition)
Rashmi Nanda
No ratings yet
Configuring IPCop Firewalls: Closing Borders with Open Source
From Everand
Configuring IPCop Firewalls: Closing Borders with Open Source
Barrie Dempster
No ratings yet
Groovy for Domain-Specific Languages, Second Edition: Extend and enhance your Java applications with domain-specific scripting in Groovy
From Everand
Groovy for Domain-Specific Languages, Second Edition: Extend and enhance your Java applications with domain-specific scripting in Groovy
Fergal Dearle
No ratings yet
Internet Communications Using SIP: Delivering VoIP and Multimedia Services with Session Initiation Protocol
From Everand
Internet Communications Using SIP: Delivering VoIP and Multimedia Services with Session Initiation Protocol
Henry Sinnreich
3/5 (1)
Voice Technologies and Systems: Definitive Reference for Developers and Engineers
From Everand
Voice Technologies and Systems: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet


This Conference Paper is brought to you for free and

This work is licensed under a Creative Commons

Abstract undistorted signal as a reference, and the degraded signal under

Figure 2: Scatter plots for POLQA, ViSQOL and P.563

Metric              CHOP   CLIP   COMPSPKR  ECHO   NOISE  ALL    ALL (excluding CHOP)
ρ_spearman ViSQOL   0.543  0.987  0.952     0.979  0.886  0.836  0.939
RMSE ViSQOL         0.381  0.244  0.204     0.557  0.533  0.772  0.593
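The two statistics reported above, Spearman rank correlation and RMSE, can be illustrated with a minimal sketch. It assumes the metric's predictions and the subjective MOS scores are already on the same scale; the paper may apply a mapping step before computing RMSE that is not reproduced here. The function names and the sample scores are hypothetical and are not taken from the TCD-VoIP dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(predicted, target):
    """Root-mean-square error between predicted and target scores."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical scores for five clips (illustration only).
subjective_mos = [1.5, 2.0, 3.0, 4.0, 4.5]
predicted_mos = [1.0, 2.5, 3.0, 3.5, 5.0]

print("Spearman:", spearman(predicted_mos, subjective_mos))
print("RMSE:", rmse(predicted_mos, subjective_mos))
```

Spearman correlation rewards a metric for ordering the clips correctly even when its absolute scores are offset, while RMSE penalises the offset itself, which is why a row can show a high correlation alongside a comparatively large error.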
