2016 International Siberian Conference on Control and Communications (SIBCON)

Contemporary Video Compression Standards: H.265/HEVC, VP9, VP10, Daala

M.P. Sharabayko, N.G. Markov


Tomsk Polytechnic University, 634050, Tomsk, Russia

Abstract— In this paper we compare the compression efficiency of the latest video coding standards H.265/HEVC, VP9, VP10 and Daala to H.264/AVC using the available reference video encoders. Experimental results show that H.265/HEVC on average provides 37–40% bitrate savings, VP9 provides 9–11% and VP10 provides 10–12% bitrate savings, while Daala provides 8–9% bitrate overhead on average.

Keywords—video compression standards; H.265/HEVC; H.264/AVC; VP9; VP10; Daala

I. INTRODUCTION

Digital video compression has a history of more than 30 years. Today video compression is applied in digital and cable television, in Video on Demand (VoD) services, in conferencing and surveillance systems, etc. The amount of video data tends to increase due to the greater availability of video recording devices and an increased demand for visual quality. Contemporary systems increasingly use high definition (HD) and ultra-high definition (UHD) resolutions.

At the moment most video compression systems are based on the H.264/AVC video compression standard. With its first version accepted in 2003, the standard targeted the demand for compression efficiency of that time and the computing power expected to become available in the following years. At that time it provided superior compression efficiency, and it became one of the most widely used standards. Today, with the increased amount of video data and an emerging transition to UHD systems within the capacity of existing data transmission channels and storage media, there is an escalating need for more efficient video compression.

In 2014 the next generation proprietary standard H.265/HEVC was released. The standard puts emphasis on high-resolution video coding and potentially provides up to two times better compression efficiency compared to H.264/AVC [1]. The standard extends the selection of tools that can be used in compression [2]. It offers three times more intra prediction modes, four times larger block sizes with adaptive partitioning, temporal motion vector prediction, etc. This increase in compression efficiency comes at the cost of higher computational complexity, which slows down its introduction into commercial systems.

At the same time, the use of H.265/HEVC, as well as H.264/AVC, requires licensing fees, so it is almost impossible to use them in free open-source solutions. The royalties are even more significant for companies whose business is video on demand or systems with large amounts of video to be stored and transmitted, such as Google with YouTube, Netflix, and so on. All this creates a demand for royalty-free codecs such as Google VP9, VP10, Xiph Daala, etc., which are being developed to provide competitive compression efficiency free of charge.

Royalty-free encoders avoid the use of patented compression tools, which makes it hard to achieve compression rates competitive with proprietary encoders. One representative example is the elegant bypass of bidirectionally predicted frames (B-frames) applied in the Google VP9 video encoder [3], one of the most notable royalty-free codecs at the time. VP9 provides compression rates superior to H.264/AVC and inferior to H.265/HEVC [1]. This still makes it possible to replace H.264/AVC in some fields of use. A further increase in compression efficiency is expected from the emerging VP10 codec being developed by Google [4].

On the other hand, there is Daala, which has been developed by the Xiph.org open source community since 2010. The goal of the project is to achieve royalty-free compression efficiency superior to H.265/HEVC. Among the key distinctive features of Daala are lapped transforms instead of non-overlapped block-based DCT, lifting pre- and post-filtering instead of deblock filtering, frequency domain intra prediction, Time/Frequency Resolution Switching, etc.

In our previous experiments [1,5] H.265/HEVC appeared to provide the best compression efficiency, making it possible to save up to 50% of bitrate compared to H.264/AVC. Because VP9 has to work around patented compression techniques, it was less efficient even in intra-frame coding, but still showed better results than H.264/AVC. Google started the development of VP10 [4] to further improve the compression efficiency of the techniques used in VP9. Daala, being at an early development stage at the time of the previous experiments, showed poor results: almost 10 times higher bitrates compared to H.264/AVC at the same distortion levels.

In this paper we study the compression efficiency of the contemporary video compression standards H.264/AVC and H.265/HEVC and of the video encoders Google VP9, VP10 and Xiph Daala to outline the current state of royalty-free codecs.

II. ENCODER IMPLEMENTATIONS USED

In this paper we aim to compare the maximum video compression efficiency provided by the latest compression standards, to get an updated view of the capabilities of modern mainstream video compression techniques. This is one of the reasons we prefer the reference test model implementations of the encoders to commercial versions.

978-1-4673-8383-7/16/$31.00 ©2016 IEEE


A. H.264/AVC

The reference JM encoder for the H.264/AVC standard implements a quasi-full Rate-Distortion Optimization (RDO) model for coding decisions. This makes the JM encoder the best choice for our experiments. The reference model has many configuration properties, and the most common property sets are combined in several configuration files shipped with the encoder sources.

We use JM ver. 19.0 with the default configuration “JM_LB_HE”, which sets up a hierarchical B-frames structure, disables rate control and enables all the available prediction block sizes. Additional options are passed by the following command line arguments:

    lencod -p FrameRate=<FR> -p QPISlice=<QP> -p QPPSlice=<QP> -p QPBSlice=<QP> -p SourceWidth=<W> -p SourceHeight=<H> -p OutputWidth=<W> -p OutputHeight=<H>

The command line sets a fixed quantization parameter QP to be used for each frame and defines the frame rate and video resolution. Additional tweaking is done to disable the QP offset that is applied to hierarchical B-frames by default.

B. H.265/HEVC Encoder

We use the reference HM encoder ver. 16.5 with its simplified RDO model to estimate the compression efficiency of the H.265/HEVC standard. The “Low Delay Main” configuration is provided with the source code. To get a constant QP on each frame we modified the 'Qpoffset' values of the GOP structure in the configuration file. The “Low Delay Main” configuration defines a hierarchical B-frames structure and enables almost all available coding tools.

C. VP9 Encoder

To test VP9 compression efficiency we use the open source 'libvpx' encoder ver. 1.5.0 provided by the WebM project [6]. The encoder provides a command line interface to configure most of the coding options. The encoder is run with the following parameters:

    vpxenc --codec=vp9 --fps=<FR> --i420 --min-q=<Q> --max-q=<Q> --cq-level=<Q> --kf-min-dist=1000 --kf-max-dist=1000 --passes=1 -w=<W> -h=<H>

Parameters '--fps', '-w' and '-h' define the frame rate, width and height of the input video sequence. Parameters '--min-q', '--max-q' and '--cq-level' define the range of available quantization values; setting all three to the same value forces constant quantization mode. Parameters '--kf-min-dist' and '--kf-max-dist' specify the key frame distance, and we force a key frame to be used only at the start of the sequence, as with the other encoders.

D. VP10 Encoder

The open source 'libvpx' encoder ver. 1.5.0 also has an implementation of VP10. The encoder is at an early development stage, which makes it possible to analyze the compression efficiency of its most recent coding tools. The options are the same as those used for VP9, except for the parameter '--codec', which needs to be set to 'vp10' instead of 'vp9'.

    vpxenc --codec=vp10 --fps=<FR> --i420 --min-q=<Q> --max-q=<Q> --cq-level=<Q> --kf-min-dist=1000 --kf-max-dist=1000 --passes=1 -w=<W> -h=<H>

E. Daala Encoder

The Xiph Daala encoder and decoder implementations [7] are used in our experiments (master version of January 21, 2016). The configuration used for testing forces a key frame to be placed only at the start of the sequence and uses four B-frames. The quantization (quality) level is controlled by the parameter '-v'.

    daala -k 9999 -b 4 -v <Q>

III. RESULTS AND DISCUSSION

Experiments are carried out on the JCT-VC test sequences [8]. The test set provides diverse video sequences specific to video conferencing, surveillance systems, desktop capturing and other fields of application of video compression.
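As a rough illustration of how such constant-quantizer runs can be scripted, the sketch below assembles the vpxenc argument list given in Section II for a sweep over codecs and quantizer values. It only builds the argument lists and does not launch the encoder. The function names, the quantizer values and the frame rate and resolution in the usage note are hypothetical. The input and output file arguments are not shown in the paper's command line, so they are left out here as well.

```python
def vpxenc_args(codec, fps, width, height, q):
    """Build the vpxenc argument list from Section II for one constant-quantizer run.

    Only the flags quoted in the paper are used; input/output file arguments
    are not given there and are intentionally omitted.
    """
    return [
        "vpxenc",
        f"--codec={codec}",
        f"--fps={fps}",
        "--i420",
        # Equal min/max/cq values force constant quantization mode.
        f"--min-q={q}",
        f"--max-q={q}",
        f"--cq-level={q}",
        # A large key frame distance keeps a single key frame at the start.
        "--kf-min-dist=1000",
        "--kf-max-dist=1000",
        "--passes=1",
        f"-w={width}",
        f"-h={height}",
    ]


def sweep(codecs, quantizers, fps, width, height):
    """Return {(codec, q): argument list} for every codec/quantizer pair."""
    return {
        (codec, q): vpxenc_args(codec, fps, width, height, q)
        for codec in codecs
        for q in quantizers
    }
```

For example, `sweep(["vp9", "vp10"], [20, 32, 44, 56], "30", 2560, 1600)` yields eight argument lists, one rate-distortion point per codec and quantizer value (the quantizer values here are placeholders, not the ones used in the experiments).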

Fig. 1. Bitrate-PSNR plot for Traffic test sequence

Fig. 2. Bitrate-SSIM plot for Traffic test sequence

Compression efficiency is compared in terms of distortion levels at the same bitrate values. Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index (SSIM) are used as two measures of distortion. The first one corresponds to the metric used in the Rate-Distortion Optimization module of most encoders. SSIM, on the other hand, correlates better with the subjective distortion perception of the human visual system, which should make a notable difference for Daala, as it is partially focused on improved reduction of blockiness effects.

Table I provides the experimental results in terms of BD-Rate (average bitrate difference on the common interval of distortion levels) [9]. As can be seen, the HM encoder provides results superior to all other codecs. There are many comparisons of H.265/HEVC to H.264/AVC, so we focus less on that and more on the comparison of royalty-free codecs to the JM implementation of H.264/AVC. Another obvious observation is that all the codecs except Daala have almost the same compression results in terms of PSNR and SSIM. This once again highlights the fact that those codecs are based on largely the same techniques, which produce similar distortion features.

Test sequences of Classes A–D share the features of photo-realistic video and may be a good benchmark for, e.g., video surveillance content.

Class A test sequences have the highest resolution in the test set, 2560×1600 pixels. They mostly share the features of video surveillance content, with highway traffic (Traffic) and a crowd of people passing a crossroad (PeopleOnStreet). Both videos contain plain textured regions as well as regions with smooth borders between objects.

The comparison results for the sequences of this class are peculiar. Fig. 1 and Fig. 2 show rate-distortion curves for the Traffic test sequence with PSNR and SSIM distortion measures respectively. Daala compression efficiency in terms of rate–SSIM is very close to HM and is superior to the rest of the encoders tested. At the same time, the efficiency of both VP9 and VP10 is almost the same as that of JM.
TABLE I. COMPRESSION EFFICIENCY COMPARED TO JM IN TERMS OF BD-RATE (ΔRate, %)

Class  Sequence                     HM vs JM         VP9 vs JM        VP10 vs JM       Daala vs JM
                               PSNR     SSIM     PSNR     SSIM     PSNR     SSIM     PSNR     SSIM
A      Traffic               -37.80   -40.59    -9.03    -5.03   -10.38    -6.32    -0.77   -30.71
A      PeopleOnStreet        -53.00   -54.65    -8.94    -7.97   -10.38    -9.54    17.93    -9.92
B      Kimono                -42.23   -48.07   -27.24   -33.15   -28.83   -34.53   -13.03   -30.33
B      ParkScene             -35.66   -41.40   -10.69   -10.41   -12.35   -12.15     3.81   -23.23
B      Cactus                -35.54   -37.32    -8.14    -5.43   -10.27    -7.92    22.09    -5.28
B      BQTerrace             -45.02   -52.90   -12.71   -14.09   -14.67   -16.76     6.29   -33.26
B      BasketballDrive       -48.38   -50.76   -27.80   -25.30   -29.62   -27.13    10.81   -14.93
C      RaceHorses            -26.33  -31.026    -8.15    -5.78    -9.41    -7.54    28.42     7.47
C      BQMall                -32.58   -35.61    -5.85    -0.51    -7.09    -2.67    30.21     1.49
C      PartyScene            -30.96   -32.57    -6.79    -3.06    -7.76    -3.87    14.48    -7.22
C      BasketballDrill       -42.00   -42.45   -12.84   -12.77   -13.30   -12.78    17.15   -11.51
D      RaceHorses            -25.77   -27.27    -6.79    -1.23    -7.35    -1.85    23.69     6.31
D      BQSquare              -39.14   -39.98   -17.15    -9.16   -18.48    -9.99     2.28   -32.98
D      BlowingBubbles        -27.40   -30.00    -5.46    -2.50    -6.47    -3.34    15.75    -7.06
D      BasketballPass        -28.32   -31.12     2.56    10.38     2.61    11.20    11.30   -21.29
E      FourPeople            -35.82   -35.77     2.03     6.27     0.71     4.62    29.36    -3.54
E      Johnny                -47.46   -53.70   -11.84   -21.17   -14.09   -22.91    23.44   -14.16
E      KristenAndSara        -45.34   -47.02   -11.06   -15.73   -13.30   -18.71    29.94   -14.18
F      BasketballDrillText   -40.49   -39.96   -10.62   -10.49   -11.19   -10.48    23.13    -7.54
F      ChinaSpeed            -37.16   -39.31     5.25   -10.70     4.65   -11.25    96.08     9.26
F      SlideEditing          -26.37   -29.35    -7.16    -8.59    -5.63    -6.71   137.69   109.56
F      SlideShow             -37.15   -31.63   -27.40   -23.66   -27.51   -23.15   155.25   113.34
On average:                  -37.27   -39.66   -10.26    -9.55   -11.37   -10.63    40.54     8.63
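To make the BD-Rate figures easier to interpret, the following is a minimal sketch of the classic cubic-fit form of the Bjøntegaard calculation. Each rate-distortion curve is given by four (bitrate, quality) points, a cubic polynomial is fitted to log10(bitrate) as a function of quality, and the two curves are averaged over their common quality interval. The exact fitting variant used to produce Table I is not stated in the text, and the function names are hypothetical. Negative values mean bitrate savings relative to the anchor, matching the sign convention of the table.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial that passes through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def integrate_cubic(xs, ys, lo, hi):
    """Integrate the cubic through four points over [lo, hi].

    Two-point Gauss-Legendre quadrature is exact for degree <= 3.
    """
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    node = half / math.sqrt(3.0)
    return half * (lagrange_eval(xs, ys, mid - node)
                   + lagrange_eval(xs, ys, mid + node))


def bd_rate(anchor, test):
    """Average bitrate difference (%) of `test` against `anchor`.

    Both arguments are lists of four (bitrate, quality) pairs with
    distinct quality values, e.g. (kbit/s, PSNR in dB).
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    a_q = [q for _, q in anchor]
    a_r = [math.log10(r) for r, _ in anchor]
    t_q = [q for _, q in test]
    t_r = [math.log10(r) for r, _ in test]
    # Common interval of distortion levels covered by both curves.
    lo, hi = max(min(a_q), min(t_q)), min(max(a_q), max(t_q))
    if hi <= lo:
        raise ValueError("the curves do not overlap in quality")
    avg_log_diff = (integrate_cubic(t_q, t_r, lo, hi)
                    - integrate_cubic(a_q, a_r, lo, hi)) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0
```

As a sanity check on made-up data, a test encoder that reaches each quality level at exactly half the anchor's bitrate gives a BD-Rate of -50%.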
Different results are obtained on Class B test sequences with a resolution of 1920×1080 pixels. The sequences contain a lot of detail and noticeable motion. Compared by PSNR to JM, the HM encoder provides 41% bitrate savings, VP9 provides 17% bitrate savings and VP10 provides 19%, while Daala provides 16% bitrate overhead by PSNR and 15% bitrate savings by the SSIM distortion metric.

Fig. 3 and Fig. 4 show rate-distortion plots of the compared encoders for the BQTerrace test sequence, whose results deserve a closer look. When distortion is measured by the PSNR metric, Daala provides 6.29% bitrate overhead compared to JM, while based on the SSIM metric Daala provides 33.26% bitrate savings (very close to HM). Daala looks better at lower bitrates, which should be due to lapped transforms and pre- and post-filtering. At higher bitrate levels VP9 and VP10 have better results compared to Daala. Even compared to HM, both VP9 and VP10 show competitive results with SSIM-based distortion measurement.

Fig. 3. Bitrate-PSNR plot for BQTerrace test sequence

Fig. 4. Bitrate-SSIM plot for BQTerrace test sequence

Class C sequences have a resolution of 832×480 pixels. VP9 and VP10 provide slightly better compression efficiency compared to JM (7–13% bitrate savings by PSNR). The compression efficiency of Daala on the RaceHorses and BQMall sequences is slightly worse compared to JM by SSIM. However, 7–10% bitrate savings are achieved on the PartyScene and BasketballDrill video sequences.

Class D test sequences have the smallest resolution, 416×240 pixels, which is not a target use case of contemporary compression systems but still needs to be considered. On the BQSquare and BasketballPass test sequences Daala shows results very close to HM with SSIM distortion measurement. The results for the remaining two Class D sequences are close to JM efficiency, as are the results for VP9 and VP10.

Class E test sequences represent the video conferencing use case with a resolution of 1280×720 pixels, which may be considered one of the target uses of royalty-free video codecs. On this test set VP9 and VP10 have good results for Johnny and KristenAndSara, and not very good results for the FourPeople sequence. SSIM-based results of Daala are comparable to VP9 and VP10.

The Class F test set contains video sequences with fully or partially artificial content: desktop capture (SlideShow, SlideEditing), video game (ChinaSpeed) and subtitles (BasketballDrillText). The results in Table I show that Daala performs poorly on these test sequences, as it tends to smooth texture edges. On artificial content this feature of the Daala tools does not work as well as it does for photo-realistic content.

IV. CONCLUSION

In this paper we studied the compression efficiency of the contemporary video compression standards and of candidates for the next generation video coding standard. The results showed the superior compression efficiency of H.265/HEVC coding tools over H.264/AVC tools and the studied royalty-free encoders. The compression efficiency of the royalty-free codecs is not very stable and lies between the efficiency of the HM and JM encoders. However, VP9 and potentially VP10 encoders may be considered a good substitute for H.264/AVC-based encoders. Daala compression efficiency is the most unstable: the best results are achieved on photo-realistic content, while compression of artificial content with sharp edges is a weak point of Daala.

Rate-PSNR evaluation of the compression results of the Daala encoder is not competitive with the results of the other encoders, while rate-SSIM measurements show relatively good results. To further assess Daala efficiency, a study of the compression distortion of the Daala encoder compared to H.264/AVC, H.265/HEVC, VP9 and VP10 using subjective distortion measurements needs to be carried out.

REFERENCES

[1] M.P. Sharabayko, Next Generation Video Codecs: HEVC, VP9 and Daala, In Youth and Contemporary Information Technologies, Tomsk Polytechnic University: Tomsk, Russia, 2013; Vol. 13, 35-37.
[2] Recommendation H.265: High efficiency video coding, ITU-T, April 2015.
[3] M.P. Sharabayko, O.G. Ponomarev, R.I. Chernyak, Intra Compression Efficiency in VP9 and HEVC, Applied Mathematical Sciences, 7 (2013), 6803-6824. http://dx.doi.org/10.12988/ams.2013.311644.
[4] D. Mukherjee, H. Su, J. Bankoski, A. Converse, J. Han, Z. Liu, Y. Xu. An Overview of new Video Coding Tools under consideration for VP10 the successor to VP9. Proc. SPIE 9599, Applications of Digital Image Processing XXXVIII, 95991E. September 22, 2015.
[5] A. Grange, H. Alvestrand. A VP9 Bitstream Overview (Internet-Draft), Google, August 2013.
[6] The WebM Project. URL: http://www.webmproject.org/ (07.02.2015).
[7] Xiph.org Daala video. URL: https://xiph.org/daala/ (07.02.2015).
[8] F. Bossen. Common test conditions and software reference configurations. In Document of ITU-T Q.6/SG16 JCTVC-K1100. ITU-T: Shanghai, CN, 2012.
[9] G. Bjontegaard, Improvements of the BD-PSNR model. ITU-T SG16/Q6, 35th VCEG Meeting Doc. VCEG-AI11, Berlin, Germany, 16-18 July, 2008.
