Measurement and Analysis of Video Streaming Performance in Live UMTS Networks

Ralf Weber#1, Mauricio Guerra*2, Salil Sawhney*3, Leonid Golovanevsky*4, Ming Kang*5

# QUALCOMM CDMA Technologies GmbH, Engineering Services Group (ESG)
Nordostpark 98, 90411 Nuremberg, Germany
1 [email protected]

* QUALCOMM Incorporated, Engineering Services Group (ESG)
5775 Morehouse Drive, San Diego CA 92121, USA
2 [email protected], [email protected], [email protected], [email protected]

Abstract: This paper discusses an approach to assess Video Streaming (VS) performance over live UMTS networks. The suggested methodology can be used to evaluate the service performance of Release 99 (Rel’99) or HSDPA network implementations. In this paper, measurement and analysis results of VS performance in a deployed Rel’99 UMTS network are presented. The measurements were carried out through field tests in an unloaded network cluster. RTP statistics and video quality metrics are shown for single-user VS tests in both stationary and mobile environments. The results indicate that RTP performance metrics are a good means to correlate RF conditions and protocol observations with terminal and network related behaviors, whereas the suggested set of video quality metrics can be used to evaluate the corresponding user perception.

Keywords: UMTS, Release 99, HSDPA, Multimedia, Video Streaming, Network Performance

1 Introduction

Currently, Video Streaming (VS) is emerging as a new differentiating service in networks based on the Universal Mobile Telecommunications System (UMTS) [1]. This type of service can be offered over either UMTS Release 99 (Rel’99) or its evolution based on High Speed Downlink Packet Access (HSDPA). While Rel’99 can guarantee dedicated resources in terms of bandwidth through a fixed assignment of either interactive or streaming bearers, the data rate of HSDPA can vary depending on radio conditions and the number of users if a given Quality-of-Service (QoS) cannot be supported by the network.

Although quality assessments of video codec performance have been an active research area for more than a decade (see, e.g., [2-17] and references therein), most previous work focused on the impairments introduced by source coding and compression algorithms [19]. On the other hand, transmission errors in the form of packet delay, jitter (packet inter-arrival variations), and packet loss can have a major impact on the end-user experience. This is particularly true in a cellular network environment, where channel conditions can vary dramatically due to fading and other network effects such as handover. Understanding the impact of these transmission impairments on the overall quality seen by end users is important for a successful deployment of VS services over cellular networks such as UMTS.

The recent study by Lo, Heijenk and Niemegeers [19] addressed this topic with the help of a network simulator to compare video quality. That study compared radio link control (RLC) acknowledged mode (AM) and unacknowledged mode (UM) in terms of average peak signal-to-noise ratio (PSNR) (with a simple mapping to the subjective mean opinion score, MOS), jitter, and video frame error rate. In this paper, we investigate VS performance in a live UMTS network in order to examine the effects of different network issues (such as RF condition changes, data rate switching, etc.) on the overall quality.

Performance considerations for VS assessment over mobile radio networks should include the following aspects:
o Packet loss, jitter, delay, audio-video synchronization, and the occupancy of the media buffers that mitigate delay/jitter problems for the audio and video components of a clip
o An encoded clip can have variable instantaneous data rates
o User perception of a real-time application like VS is more sensitive to transmission errors than that of non-real-time applications like FTP, HTTP, etc. (given the use of an unreliable transport protocol like UDP, coupled with tighter delay requirements)
o The resulting video metrics are clip dependent, making the selection of clip content important
o Correlation of RF-related problems with video impairments is not straightforward due to the layered link-layer retransmission schemes inherent in mobile communication systems

The remainder of this paper is organized as follows: Section II is a brief overview of some basic concepts and definitions used in this paper for assessing the quality of VS.

TO BE PUBLISHED IN PROCEEDINGS OF WPMC 2006 CONFERENCE, SAN DIEGO, CA, USA, SEPT. 17-20, 2006 – PAGE 1 OF 5
Section III describes the overall assessment process. Section IV describes the test scenarios and Section V presents the performance results. Section VI summarizes the paper with some concluding remarks.

Figure 1 Typical VS Protocol Architecture

2 Video Streaming Assessment Concept

2.1. General considerations and protocol architecture

In this work, we focus on the performance of streaming MPEG-4 encoded videos over UMTS Rel’99 networks in the packet switched (PS) domain. We assume that one interactive Radio Access Bearer (RAB) class [20] of 64, 128, or 384 kbps is used for content delivery. The typical protocol architecture of Figure 1 is considered. Control information is sent via the Real Time Streaming Protocol (RTSP) over TCP/IP to allow clients to request, set up, play, pause, record, and tear down VS sessions. The media data stream and receiver feedback statistics are delivered using the Real-time Transport Protocol/Real-time Transport Control Protocol (RTP/RTCP) [21-24] over UDP/IP. The RLC protocol is configured in Acknowledged “No-Discard” Mode [25]. A target downlink (DL) transport channel Block Error Rate (BLER) of 1% is configured, and a media de-jitter buffer of 2.5 s is employed in the application layer to offset the inter-arrival variations.

2.2. Quality metrics

We consider two types of quality metrics:
1) RTP statistics and application layer metrics, including the RTP throughputs (total, video, and audio), RTP video/audio jitter, video/audio buffer occupancy, numbers of lost video/audio packets, etc. These RTP metrics provide insight into the impact of network conditions on the resulting trends in audio and video quality.
2) Perceptual video quality metrics (see [1-16]), including average PSNR [19], Structural Similarity [16] (SSIM, which is based on the difference in the mean and variance between the reference and impaired video clips as well as the correlation coefficients between the two clips), PIQE (Psycho-visually-based Image Quality Evaluator) Similarity [6], and PIQE Blockiness [6]. SSIM and PIQE Similarity are full-reference estimators. While SSIM focuses on luminance, contrast and structural elements, PIQE Similarity focuses on the remaining edge correspondence between original and degraded video frames. PIQE Blockiness primarily estimates the blockiness artifacts in a video frame due to block-based video compression that can result from transmission errors and packet losses. To assess the PSNR degradations introduced by the radio transmission, we define the PSNR calculation based on the encoded MPEG-4 clip (not the raw, uncompressed source) originally transmitted and the impaired clip received by the user equipment (UE) client. To distinguish this metric from the commonly defined PSNR [19] between the raw source clip and the encoded clip, we denote it here as E-PSNR.

3 Video Assessment Process

The applied principal video assessment process is depicted in Figure 2.

Figure 2 Applied VS Assessment Process

Independent of the assessment methodology, it is important to mention that the encoding rate, the display size
(resolution) as well as the clip content (e.g. high motion and scene complexity) can impact VS performance. Furthermore, audio and video settings should be defined consistently between the media server and the clients. For this, setup and configuration (parameters) need to be specified and the selected encoder implementation has to meet pre-defined quality requirements.

4 Test Scenarios

The following tests were executed:
1) Two test scenarios were used to assess single-user performance in stationary environments:
- Mid Cell Scenario – allows an evaluation of streaming performance in common RF environments
- Far Cell Scenario – allows an evaluation of streaming performance in RF environments that could be seen by users in weak or marginal RF conditions (cell-edge scenario).
2) To assess single-user performance in mobility, a metric drive route was selected, which allows an evaluation of streaming performance in mobile environments with both good and weak RF conditions.

Table 1 summarizes the RF conditions at the selected mid cell and far cell locations. In both cases, the average Active Set Size was one.

Table 1 RF Environment in Unloaded Test Locations

RF Metrics (Average) | Mid-Cell Location | Far-Cell Location
Combined Ec/No [dB] | -3.4 | -9.7
Rx Power [dBm] | -83.3 | -100.7
Tx Power [dBm] | -17.6 | 5.7

The cluster of cells had minimal intra-cell interference (unloaded cell) and inter-cell interference (small cluster). The selected metric route intentionally included mixed RF environments with near, mid and far cell conditions. Statistics were obtained over the whole metric route, including streaming of 3 identical clips in sequence. Each clip consisted of 4503 frames and had a duration of 5 minutes.

The selected clips had an audio coding rate of 24 kbps (AAC format) and a video coding rate of 80 kbps (MPEG-4 format), resulting in a nominal total encoding rate of 104 kbps. No bit rate adaptation was employed. The average speed along the metric route was 34 km/h.

5 Performance Results

Both RTP layer statistics and video quality metrics were considered to assess VS performance. Table 2 and Table 3 summarize the obtained RTP statistics and perceptual video quality metrics, respectively, indicating trends for perceived VS performance of the respective scenarios.

Table 2 RTP and Buffer Performance Comparison

Average Values | Performance Metrics for 104 kbps Clip | Mid-Cell Location | Far-Cell Location | Mobility
Misc Metrics | SDU Throughput [kbps] | 96.2 | 89.7 | 92.4
Misc Metrics | Audio/Video Sync Offset [ms] | -30.2 | -29.3 | -30.1
Video Metrics | RTP Throughput [kbps] | 62.6 | 61.9 | 60.5
Video Metrics | Total # RTP Packets Dropped | 13 (0.4%) | 188 (4.3%) | 438 (3.3%)
Video Metrics | Jitter [ms] | 26.7 | 33.6 | 39.2
Video Metrics | Differential RTP Packet Delay [ms] | 677.6 | 1460.7 | -58.0
Video Metrics | Buffer Occupancy [s] | 4.8 | 4.9 | 4.7
Audio Metrics | RTP Throughput [kbps] | 25.0 | 24.3 | 23.8
Audio Metrics | Total # RTP Packets Dropped | 34 (1.0%) | 427 (7.0%) | 1017 (5.5%)
Audio Metrics | Jitter [ms] | 21.6 | 30.7 | 30.9
Audio Metrics | Total # Re-buffering Events | 0 | 2 | 4
Audio Metrics | Re-buffering Events Duration [s] | - | 13.9 | 7.4
Audio Metrics | Differential RTP Packet Delay [ms] | 868.1 | 1587.9 | 17.9
Audio Metrics | Buffer Occupancy [s] | 4.4 | 4.4 | 4.2

Table 3 Perceptual Video Quality (104 kbps Clip)

Average Performance Metrics | Mid-Cell Location | Far-Cell Location | Mobility
Total # Dropped Frames | 13 (~0%) | 170 (3.8%) | 409 (3.0%)
PIQE Blockiness | 0.0 | 0.03 | 0.02
PIQE Similarity | 1.0 | 0.95 | 0.96
Structural Similarity | 1.0 | 0.96 | 0.98

RTP statistics show a clear downward trend as the RF environment degrades, going from the mid-cell to the far-cell and mobility scenarios. Since all re-buffering events were triggered by reduced audio buffer occupancy, no re-buffering events are listed as part of the video metrics.

As in the stationary cases, significant impairments were observed in the mobility scenario when the assigned RAB rate was lower than the total clip encoding rate (down-switch to 64 kbps). Correlations were noticed between audio re-buffering events and RAB changes.

From the RF environment shown in Figure 3, it is apparent that just before the down-switch from the 384 kbps RAB to the 64 kbps RAB, the Ec/No and RSSI decrease significantly while the UE Tx power increases. At the same time, the sudden increase in DL BLER indicates that the
Node-B seems to hit the maximum DCH power threshold for this RAB and thus is unable to transmit more power during this weak RF environment. As a result, the RNC switches the UE to a 64 kbps RAB.

Figure 4 shows the impact of the RAB down-switch on the trend in audio and video quality metrics. The frame render rate goes to zero during the re-buffering period while the audio and video buffers are being replenished.

Figure 3 RF Conditions and RAB assignment

Figure 4 RAB down-switch impacts on Audio and Video Quality

As soon as a 64 kbps RAB is assigned, the frame render rate is always lower than the expected 15 fps. This is due to the fact that frames can only be rendered as fast as they are received, and the restricted 64 kbps air interface rate is not sufficient to accommodate the nominal audio plus video clip encoding rate of 104 kbps.

The E-PSNR values shown in Figure 4 experience a significant reduction during the 64 kbps RAB interval, while before and after the down-switch no degradations (E-PSNR value set to 255 dB) are observed.

According to Figure 4, PIQE Similarity and SSIM have values equal to or close to 1 before and after the RAB down-switch, indicating an error-free situation. However, during the RAB down-switch, they also show significant degradations (PIQE Similarity more than SSIM). PIQE Blockiness likewise indicates no errors (values equal to or close to 0) before and after the RAB down-switch, while significant degradations occur during the 64 kbps RAB period. In contrast to E-PSNR, the blockiness and similarity metrics undergo high variations within this interval, which makes it difficult to correlate the degradations with the performance perception during this time.

Figure 5 depicts the RAB down-switch impact on RTP quality metrics. A number of audio and video RTP packets are lost continuously during the 64 kbps RAB assignment. Given that all RABs are configured here in RLC-AM “No Discard” Mode, the interface between RNC and UE can be considered reliable. Hence, the RTP packet loss is assumed to occur over the RNC-SGSN-GGSN server interface.

Figure 5 RAB down-switch impacts on RTP Quality Metrics

As can be seen from Figure 5, RTP audio packet jitter is low and has limited variance during the 64 kbps RAB. However, jitter spikes can be observed around the time of the down-switch from the 384 kbps to the 64 kbps RAB, after which the jitter stabilizes. High values of RTP audio packet jitter are also apparent during the time intervals when re-buffering is in progress. Further spikes are seen after the completion of re-buffering.

The synchronization offset between audio and video RTP packets (A/V Sync Offset) shows spikes not only around the time RTP packet losses are clustered, but also just before a RAB down-switch occurs, as well as before and after re-buffering events (see Figure 5).

Dropped frames are a result of the consecutive loss of RTP packets (see Figure 4 and Figure 5). This causes the user to see some intermittent freeze frames. The same effect happens during re-buffering periods, when rendering is stopped completely.

Each audio re-buffering event freezes video rendering to ensure that the audio buffer fills up to a specified level, so that a continuous synchronous playback of audio and video is possible afterwards. However, re-buffering always results in a strong negative impact on user perception.

As shown above, a higher number of video impairments was observed when the network down-switched the interactive RAB to 64 kbps. This usually resulted in a higher
number of re-buffering events owing to a depletion of the media buffer, as the buffer outflow rate (playback rate) exceeded the buffer inflow rate. As expected, the lowest number of dropped frames was observed at the mid cell location, where the best overall RF conditions were present.

To mitigate the impact of RAB down-switching, several approaches could be considered. One of them would be to implement bit rate adaptation techniques driven either by the media server or the media client. Dynamic streaming rate adaptations can improve streaming performance by reducing the effect of RAB down-switching. Another approach would be to employ clips with a nominal coding rate lower than the 64 kbps RAB. The tradeoff of the latter approach is that it could decrease the quality of the clip, while the number of supported VS users (cell capacity) can increase. To accommodate higher clip rates at the expense of reduced coverage, the DCH power thresholds for the 128 kbps RAB could be optimized to disallow or minimize switching to the 64 kbps RAB. It is up to the operators to decide which technology (Rel’99 or HSDPA) and which strategy to follow in order to provide adequate video quality without sacrificing cell capacity.

6 Conclusions

This paper demonstrated a practical approach to assess video streaming performance over mobile radio networks. Real-time measurement results obtained in a deployed Rel’99 UMTS network were shown. The same assessment concept can also be used for VS services over HSDPA.

RTP statistics and video quality metrics were employed to study single-user VS performance of a 104 kbps encoded clip in stationary mid/far-cell as well as in mobility conditions. Attention was paid to interactions of underlying protocol layers and mechanisms.

It turned out that the video and audio performance was impacted mainly when the network down-switched the radio bearer to 64 kbps and thus below the effective clip rate. This resulted in a drastic increase of re-buffering events leading to perceived visual impairments. The latter were reflected in reduced E-PSNR, higher blockiness as well as lower similarity. While the resulting E-PSNR showed only marginal variations during the impaired time intervals, all other video metrics had much higher variations and did not indicate a clear quality trend. As long as the bearer rate was 128 kbps or higher, no visual impairments were observed with a Rel’99 RAB target downlink transport channel BLER of about 1%. In that case, the few dropped/freeze frames were always randomly distributed across the 5-minute clip, with minimal effect on the user-perceived quality.

It could be shown that the applied assessment concept allows a combined investigation of video quality and network related issues, including radio conditions and RTP protocol aspects. The achieved correlation of user-perceived impairments with network related events can be used to improve and optimize UMTS systems. The applied approach might need to be revised for VS assessments when a bit rate adaptation concept is implemented.

References

[1] 3GPP Specifications, www.3gpp.org
[2] P. Marziliano et al., “A no-reference perceptual blur metric”, ICIP’02.
[3] Y. Yoshida et al., “Parameter estimation of uniform image blur using DCT”, IEICE Trans. Fundamentals, pp. 1154-1157, July 1993.
[4] X. Marichal et al., “Blur determination in the compressed domain using DCT information”, ICIP’99.
[5] J. Caviedes et al., “No-reference sharpness metric based on local edge Kurtosis”, ICIP’02.
[6] A. Leontaris and A.R. Reibman, “Comparison of blocking and blurring metrics for video compression”, ICASSP, 2005.
[7] R. W. Chan and P. B. Goldsmith, “A psychovisually-based image quality evaluator for JPEG images”, ICSMC’00.
[8] J. Yang et al., “Noise estimation for blocking artifacts reduction in DCT coded images”, IEEE Trans. CSVT, pp. 1116-1134, Oct. 2000.
[9] S. Minami et al., “An optimization approach for removing blocking effects in transform coding”, IEEE Trans. on CSVT, pp. 74-82, Apr. 1995.
[10] W. Gao et al., “A de-blocking algorithm and a blockiness metric for highly compressed images”, IEEE Trans. on CSVT, pp. 1150-1159, Dec. 2002.
[11] H. R. Wu et al., “A generalized block-edge impairment metric for video coding”, IEEE SP Let., Nov. 1997.
[12] S. Suthaharan, “Perceptual quality metric for digital video coding”, IEE El. Let., Mar. 2003.
[13] T. Vlachos, “Detection of blocking artifacts in compressed video”, IEE El. Let., Jun. 2000.
[14] S. Liu et al., “Efficient DCT-domain blind measurement and reduction of blocking artifacts”, IEEE Trans. on CSVT, pp. 1139-1149, Dec. 2002.
[15] Z. Wang et al., “Blind measurement of blocking artifacts in compressed video”, ICIP’00.
[16] Z. Wang et al., “Image quality assessment: From error visibility to structural similarity”, IEEE Trans. Image Processing, pp. 600-612, Apr. 2004.
[17] A.B. Watson, “Toward a perceptual video quality metric”, Human Vision, Visual Processing, and Digital Display VIII, 3299, pp. 139-147, 1998.
[18] R. Weber, “Multimedia Performance Assessments in Deployed UMTS Networks”, IEEE Proceedings of ICMSC 2005 Conference, Montreal, Canada, Aug. 14-17, 2005.
[19] A. Lo, G. Heijenk, I. Niemegeers, “Performance Evaluation of MPEG-4 Video Streaming over UMTS Networks using an Integrated Tool Environment”, Proc. of 2005 Int. Symp. on Performance Evaluation of Computer and Telecommunication Systems (SPECTS’05), Philadelphia, PA, USA, July 24-28, 2005.
[20] A.N. Netravali and B.G. Haskell, Digital Pictures: Representation, Compression, and Standards (2nd Ed), Plenum Press, NY, 1995.
[21] 3GPP Technical Specification: Group Services and System Aspects: QoS Concept and Architecture (3G TS 23.107 version 3.0.0)
[22] RFC 3551, Standard 65, RTP Profile for Audio and Video Conferences with Minimal Control
[23] RFC 3550, Standard 64, RTP: A Transport Protocol for Real-Time Applications
[24] RFC 1890, Obsolete, RTP Profile for Audio and Video Conferences with Minimal Control
[25] RFC 1889, Obsolete, RTP: A Transport Protocol for Real-Time Applications; 3GPP Technical Specification: Radio Link Control (RLC) protocol specification (3G TS 25.322)
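
The buffer depletion argument made in Section 5 and in the conclusions (after a down-switch to 64 kbps, the playback outflow exceeds the inflow) can be illustrated with a rough calculation. The Python sketch below is not part of the measurement tooling used in this paper. It assumes a constant-rate clip and an idealized bearer whose full rate carries media payload, and the function and parameter names are hypothetical. It estimates how long a given amount of buffered media lasts before re-buffering becomes necessary.

```python
def seconds_until_underrun(buffered_media_s, clip_rate_kbps, bearer_rate_kbps):
    """Estimate the wall-clock time until the media buffer runs empty.

    Simplifying assumptions (not taken from the paper): constant clip rate,
    no protocol overhead, no retransmission delay.
    Returns None when the bearer delivers media at least as fast as it plays.
    """
    if buffered_media_s < 0 or clip_rate_kbps <= 0 or bearer_rate_kbps < 0:
        raise ValueError("rates and buffer level must be non-negative, clip rate positive")
    if bearer_rate_kbps >= clip_rate_kbps:
        return None  # inflow keeps up with playback, the buffer never drains
    # One second of playback consumes 1 s of media, while the bearer
    # delivers only bearer_rate / clip_rate seconds of media in that time.
    net_drain_per_s = 1.0 - bearer_rate_kbps / clip_rate_kbps
    return buffered_media_s / net_drain_per_s


if __name__ == "__main__":
    # 104 kbps clip on a 64 kbps RAB, starting from 4.4 s of buffered media
    # (the average audio buffer occupancy reported in Table 2).
    print(seconds_until_underrun(4.4, 104, 64))
    # The same clip on a 128 kbps RAB does not drain the buffer.
    print(seconds_until_underrun(4.4, 104, 128))
```

Under these assumptions a 4.4 s buffer is exhausted after roughly 11 s on the 64 kbps RAB, which is in line with the observation that re-buffering events cluster in the 64 kbps intervals; on a real bearer, protocol overhead and RLC retransmissions would shorten this time further.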

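
The E-PSNR metric defined in Section 2.2 compares the encoded clip as transmitted with the clip as received, rather than the raw source with the encoded clip. The following Python sketch shows one way such a per-frame value could be computed. It is an illustration under stated assumptions, not the implementation used for the results above: frames are assumed to be flat sequences of 8-bit samples, the function and constant names are hypothetical, and the value returned for identical frames follows the 255 dB setting mentioned in the discussion of Figure 4.

```python
import math

# Value reported when the received frame is identical to the transmitted one
# (the paper states that E-PSNR is set to 255 dB when no degradation occurs).
NO_DEGRADATION_DB = 255.0


def e_psnr_db(transmitted_frame, received_frame, peak=255):
    """PSNR in dB between a decoded frame of the transmitted (encoded) clip
    and the corresponding frame of the received clip.

    Both frames are flat sequences of sample values in the range 0..peak.
    """
    if len(transmitted_frame) != len(received_frame) or not transmitted_frame:
        raise ValueError("frames must be non-empty and of equal size")
    squared_error = sum((a - b) ** 2 for a, b in zip(transmitted_frame, received_frame))
    mse = squared_error / len(transmitted_frame)
    if mse == 0:
        return NO_DEGRADATION_DB  # identical frames, PSNR would be unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    reference = [100, 100, 100, 100]
    print(e_psnr_db(reference, reference))             # identical frames
    print(e_psnr_db(reference, [101, 101, 101, 101]))  # one grey level off everywhere
```

Averaging the per-frame values over a clip would give a figure comparable to the average PSNR used elsewhere in the paper; how dropped or frozen frames are aligned before the comparison is left open here.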