Efficient Video Encoder Autotuning Via Offline Bayesian Optimization and Supervised Learning Paper

Efficient Video Encoder Autotuning via Offline Bayesian Optimization and Supervised Learning

Roberto Azevedo (DisneyResearch|Studios), Yuanyi Xue (Disney Entertainment & ESPN Tech.), Xuewei Meng (Disney Entertainment & ESPN Tech.), Wenhao Zhang (Disney Entertainment & ESPN Tech.), Scott Labrozzi (Disney Entertainment & ESPN Tech.), Christopher Schroers (DisneyResearch|Studios)

Abstract—Modern video encoders are complex software containing dozens of parameters, which allows them to be configured to different scenarios, requirements, or specific titles or scenes. Besides the number of parameters, the inter-dependency between them adds to the complexity of finding a per-title optimized combination of encoding parameters. Even though good practices in the industry have emerged, with the definition of presets per content type (e.g., film vs. cartoon), such practices are suboptimal for specific titles or scenes. Indeed, finding the best encoding parameters for a piece of content is currently a mix of best practices and trial-and-error artwork. We propose an efficient video encoder autotuner based on offline Bayesian optimization and supervised machine learning. Our proposal uses Bayesian optimization to search for a per-title best encoding parameter set offline to generate a dataset. Then, we use the generated dataset to train machine learning models that can map features extracted from the content to the best encoding parameters. Our experiments show that our generated dataset can find a combination of parameters that improves up to approximately −14.49% BD-Rate (0.77 BD-PSNR) and −11.59% BD-Rate (2.12 BD-VMAF) when optimizing for PSNR and VMAF, respectively. In comparison, our prediction models can recover ∼80% of such performance while requiring only one fast encoding (compared to hundreds of encodes of a search optimization).

Index Terms—Video encoder, Encoding parameters, Bayesian Optimization, Deep Learning

I. INTRODUCTION

Video is one of the most important media for communication and entertainment in today’s digital world, dominating global internet traffic. Video compression standards [1]–[3] provide the key technologies that support the successful deployment of digital video. Modern video encoders have many parameters that can be tuned to specific scenarios (e.g., on-demand vs live video) or in a per-content/scene manner (e.g., cartoon vs live action content). Examples of encoding parameters include the number of reference frames, rate-distortion optimization mode, adaptive quantization mode and strength, number of B-frames, motion estimation range, deblocking filter strength, etc. However, finding the best encoding parameters for a specific content/scene is non-trivial.

A naive approach is to brute-force all the combinations of values for different parameters, encode the content with such combinations, and then choose the one with the best quality (according to a specific quality metric). The huge number of available parameters and the exponential number of their combinations, however, make such an approach impractical. An improved method is using optimization methods (e.g., genetic algorithms [4] or Bayesian optimization [5]) to guide the search for the best encoding parameters per title. Sharma et al. [6] is an example of such a work that uses genetic algorithms to find the best encoding parameters for H.265/HEVC (High-Efficiency Video Coding) [7]. However, even though such approaches can provide an approximation of the optimum encoding parameter values per content/scene, they still require hundreds of encodings for each title during inference.

Brute-force-based approaches have also been proposed for bitrate-ladder construction for HTTP-based dynamic adaptive streaming (HAS) [8]–[10]. More recently, data-driven methods have been explored for such a problem [11]–[14]. In HAS, the video content is split into short segments. Each segment is then encoded at different resolutions and quality levels, which constitutes a bitrate ladder. During streaming, based on network conditions, display resolution, etc., the client can dynamically decide which representation to download for each segment. One key issue of bitrate-ladder optimization is to predict for each target bitrate which resolution provides the best quality. Such a problem can be seen as a specific case of video encoder parameter autotuning, in which resolution is the only parameter being selected.

In this paper, we propose a data-driven method that enables a significantly faster decision process than previous video encoder autotuning methods. Our proposal is based on i) generating an offline dataset through optimization methods (e.g., Bayesian optimization or genetic algorithms) and then ii) using such data to learn a model that predicts the best encoding parameters. The main goal is for the model to learn the best encoding parameters found on such a dataset and, by imitating such behavior, extrapolate to predict the best encoding parameters on new samples. Our proposal is also non-invasive since it does not need any change on the encoder. Compared to previous approaches, our proposed method allows for a better exploration of the search space (due to the per-title dynamic exploration of the space of parameters) and faster inference time (due to the learned prediction models).

We use H.264/AVC (Advanced Video Coding) [1], [15] as our target codec and evaluate our method on the PSNR (Peak Signal-to-Noise Ratio) and VMAF (Video Multimethod Assessment Fusion) [16] quality metrics. However, our overall proposal is encoder-agnostic, and it can be easily applied to different encoders, encoder parameter sets, and target quality metrics. Experimental results show that our dataset generation method supports (without any changes on the encoder itself) an improvement of −14.49% BD-Rate (0.77 BD-PSNR) and −11.59% BD-Rate (2.12 BD-VMAF) when optimizing for PSNR and VMAF, respectively. Our prediction models can recover ∼80% of that performance with just one faster encoding process (compared to the hundreds of encodings of optimization-based approaches).

II. PROBLEM FORMULATION

We consider the encoder as a function $E$ that takes the frames of a video $V = \{F_1, F_2, \ldots, F_n\}$, the specific encoding parameters $p = \{p_1, p_2, \ldots, p_p\}$, and the target bitrate $b$ as input. The outputs of the encoder are a set of encoded frames $V' = \{F'_1, F'_2, \ldots, F'_n\}$, i.e.,

\[ V' = E(V, p, b). \tag{1} \]

Given the set of encoding parameters, the encoder tries its best to encode $V$ into $V'$ while keeping the final bitrate as close as possible to $b$. The final achieved quality and bitrate depend on the encoder heuristics themselves and how the user controls such heuristics based on $p$.

Given an objective quality metric $M(V, V')$, finding the best set of encoding parameters can then be defined as

\[ \max_{p_1, p_2, \ldots \in P_1, P_2, \ldots} M(V, V'), \tag{2} \]

where $V'$ is given by Eq. (1). Common examples of $M$ are PSNR, SSIM [17], and VMAF [16]. The goal of the above problem is to find the parameter set $p \in P$ that maximizes the output quality produced by the encoder.

As aforementioned, a straightforward solution for the problem above is using optimization algorithms, e.g., genetic algorithms [6], simulated annealing [18], and Bayesian optimization [5]. Such approaches require hundreds of function evaluations to converge to the maximum. However, since the evaluation of the encoder function (i.e., encoding the content and computing the objective quality) is an expensive process, running such an optimization search approach per title/scene during inference time is prohibitive.

III. PROPOSED METHOD

We first use a Bayesian optimization-based approach to generate an offline dataset (Subsection III-A). The dataset is then used to train machine learning models (Subsection III-B) in a supervised way to approximate the best encoding parameters solution found in the offline dataset. Fig. 1 overviews the proposed method.

Fig. 1. Overview of the proposed approach.

A. Dataset generation

For each video in the source dataset, we perform a Bayesian optimization-based approach to guide the search for the “best” encoding parameters for that video. Also, for each video, we extract features that can characterize its content. These features can later be used to predict the ground-truth encoding parameters found by the optimization search.

1) Bayesian Optimization: Bayesian optimization is an approach to optimizing objective functions that take a long time to evaluate. It uses the accumulated knowledge in the known area of the search space to guide sampling in the remaining area in an iterative process. For that, it builds a surrogate for the objective and quantifies the uncertainty in that surrogate using a Gaussian process regression, and then uses an acquisition function to decide where to sample. Bayesian optimization has been used in many optimization tasks when the function to be optimized needs to be treated as a black box, e.g., as hyperparameter search for deep learning [19], [20] or in parameter tuning of compilers [21].

Bayesian optimization is a good fit for our problem because it does not assume first or second-order derivatives, and thus can work with the encoder as a black-box function. However, our overall proposal is not dependent on Bayesian optimization and could work equally well with other optimization search algorithms, e.g., genetic algorithms. One drawback of genetic algorithms, however, is that they require many more encode runs to converge when compared to Bayesian optimization.

For our implementation of Bayesian optimization, we assume that there is a default preset $p_{\mathrm{def}}$ that we want to improve upon and only consider samples that have a better quality than $p_{\mathrm{def}}$. For each target bitrate $b$, each video sample goes through the above Bayesian optimization approach, being encoded a maximum of $N$ times. Algorithm 1 details such a process.

Algorithm 1 Pseudo-code for the proposed Bayesian optimization-based dataset generation.
 1: for all V in the source content dataset D do
 2:   Place a Gaussian process prior on f
 3:   n ← 0
 4:   p_best(V) ← p_def
 5:   Observe f at the p_def point
 6:   while n ≤ N do
 7:     Update the posterior probability distribution on f using the data gathered so far
 8:     Let x_n be a maximizer of the acquisition function over x, where the acquisition function is computed using the current posterior distribution
 9:     Observe y_n = f(x_n)
10:     n ← n + 1
11:   end while
12:   Save the point evaluated with the maximum f(x) as the best encoding parameter for V, p_best(V)
13: end for
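The inner loop of Algorithm 1 can be illustrated with a small, dependency-free sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the squared-exponential kernel and its length scale, the expected-improvement acquisition function, maximizing the acquisition over random candidates, parameters normalized to [0, 1], and the hypothetical `toy_quality` function, which stands in for the expensive "encode, then measure quality" objective f.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two parameter vectors (assumed kernel)."""
    return math.exp(-0.5 * sum((u - v) ** 2 for u, v in zip(a, b)) / length ** 2)

def cholesky(A):
    """Lower-triangular factor L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def forward(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, noise=1e-4):
    """Condition a constant-mean Gaussian process on the data gathered so far."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = backward(L, forward(L, [v - mean for v in y]))
    return L, alpha, mean

def predict(gp, X, x):
    """Posterior mean and standard deviation of the surrogate at point x."""
    L, alpha, mean = gp
    k = [rbf(a, x) for a in X]
    v = forward(L, k)
    var = max(1.0 - sum(t * t for t in v), 1e-12)
    return mean + sum(ki * ai for ki, ai in zip(k, alpha)), math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def autotune(f, p_def, n_max=20, n_candidates=100, seed=0):
    """One video: observe f at p_def, then run n_max surrogate-guided evaluations."""
    rng = random.Random(seed)
    X, y = [list(p_def)], [f(p_def)]
    for _ in range(n_max):
        gp, best = fit_gp(X, y), max(y)
        candidates = [[rng.random() for _ in p_def] for _ in range(n_candidates)]
        x_n = max(candidates,
                  key=lambda c: expected_improvement(*predict(gp, X, c), best))
        X.append(x_n)
        y.append(f(x_n))
    i_best = max(range(len(y)), key=y.__getitem__)
    return X[i_best], y[i_best]

def toy_quality(p):
    """Hypothetical stand-in for 'encode with p, then measure quality'."""
    return 40.0 - 10.0 * ((p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2)

P_DEF = [0.9, 0.1]  # hypothetical default preset in normalized coordinates
best_p, best_q = autotune(toy_quality, P_DEF)
print(best_p, best_q)
```

Because the default preset is always the first observation, the returned point can never be worse than the default, which mirrors the role of p_def in Algorithm 1. In a real setting each call to `f` would be a full encode plus a metric computation, so the evaluation budget dominates the cost.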

Fig. 2. Examples of the Bayesian optimization search performed for different samples from Inter4K optimized for PSNR (a)–(b) and VMAF (c)–(d).
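For panels (a)–(b), the objective M of Eq. (2) is PSNR. As a reminder of what is being maximized, here is a minimal sketch of PSNR for 8-bit luma planes given as nested lists. Averaging per-frame PSNR over the video is only one common convention and is an assumption here; the paper does not state which variant its tooling computes.

```python
import math

def frame_mse(ref, dist):
    """Mean squared error between two equally sized luma planes (lists of rows)."""
    total, count = 0.0, 0
    for row_ref, row_dist in zip(ref, dist):
        for a, b in zip(row_ref, row_dist):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(ref, dist, peak=255.0):
    """PSNR in dB for one frame; infinite when the frames are identical."""
    mse = frame_mse(ref, dist)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def video_psnr(ref_frames, dist_frames):
    """Average of per-frame PSNR values (assumed aggregation convention)."""
    scores = [psnr(r, d) for r, d in zip(ref_frames, dist_frames)]
    return sum(scores) / len(scores)

reference = [[0, 0], [0, 0]]
distorted = [[0, 0], [0, 16]]   # one pixel off by 16, so MSE = 256 / 4 = 64
print(round(psnr(reference, distorted), 3))
```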

2) Feature extraction: Aiming at predicting the best encoding parameters, we extract features from the video samples to characterize them and to be used as input by machine learning algorithms. Specifically, we extract the following features:

a) Spatial Information (SI) and Temporal Information (TI)¹: SI is computed as the Root Mean Square (RMS) difference between the Sobel maps of each of the frames [22],

\[ \mathrm{SI}(v, u) = \sqrt{\frac{1}{w \times h} \sum_{i,j} |s_{ij}|^2}, \tag{3} \]

where $w$ and $h$ are the width and height of the $u$ and $v$ frames and

\[ s = S(v) - S(u), \tag{4} \]

\[ S(z) = \sqrt{(G_1 * z)^2 + (G_1^T * z)^2}, \tag{5} \]

where $*$ denotes the 2-dimensional convolution operation, and $G_1$ is the vertical Sobel filter. TI is based on the motion between adjacent frames, $M_t(i, j)$, defined as the difference of the pixel luminance at the same location, at time $t$, i.e.,

\[ M_t(i, j) = F_t(i, j) - F_{t-1}(i, j), \tag{6} \]

where $F_t(i, j)$ is the pixel at position $(i, j)$ of the $t$-th frame. TI is computed as the maximum over time of the standard deviation over space of $M_t(i, j)$ over all $i$ and $j$.

b) Energy-based Video Complexity Features: We compute the per-frame average spatial energy (E) and average temporal energy (h), following the definition of [23]².

c) First pass features from x264³: Together with the above features, we also run a fast encode of x264, which allows us to extract the following features⁴: Q: average of macroblock QPs before adaptive quantization; AQ: average of macroblock QPs after adaptive quantization decided by the rate control; MV: bits used by the motion vectors; Tex: number of bits used by the texture component; and Misc: bits spent on other signaling, e.g., slice header and skip flags.

SI, TI, Energy-based Video Complexity, and 1st-pass features are computed per frame, and then statistics on those features (mean, standard deviation, minimum, and maximum) are computed for each video sample.

¹ https://github.com/Telecommunication-Telemedia-Assessment/SITI
² https://github.com/cd-athena/VCA
³ https://www.videolan.org/developers/x264.html
⁴ A more detailed description of those features can be found in the x264 documentation.

B. Prediction Models

Given a dataset that maps features to the best-found encoding parameters, machine learning methods can then be trained in a supervised way to predict such values. The goal is to learn a model $M_\theta$ such that $M_\theta(V) \approx p_{\mathrm{best}}(V)$ for any $V$, where $\theta$ are the model’s parameters. Two main approaches are possible: classification or regression. With a small number of parameter combinations, it is straightforward to train a classification model. However, this limits the approach to a pre-defined number of presets. Since our found best encoding parameters are not pre-defined, i.e., the values are dynamically chosen based on the optimization search approach, we opt for using a regression approach. In our experiments, we focused mainly on XGBoost and Multi-Layer Perceptron (MLP) models, but our general proposal is not restricted to them. We also experimented with SVM (Support Vector Machines) and Random Forest models, but omitted these results here since XGBoost and MLP consistently performed better in our experiments.

IV. EXPERIMENTS

A. Datasets generation and analysis

We experiment with the freely available dataset Inter4K⁵, which is composed of one thousand 4K videos of 5 seconds duration each. We downsampled all the sample videos to 1920x1080 resolution and used that as our source dataset for the following experiments. This source dataset is named Inter4K-HD in the rest of this document. We generated three versions of this initial dataset, which we named Inter4K-HD/PSNR, Inter4K-HD/VMAF, and Inter4K-HD/VMAF-MultiRes. Inter4K-HD/PSNR and Inter4K-HD/VMAF use, respectively, PSNR and VMAF as the target metric for the Bayesian optimization discussed in Section III, whereas Inter4K-HD/VMAF-MultiRes is similar to Inter4K-HD/VMAF but also allows the resolution of the output video to be configured as an additional encoding parameter.

⁵ https://alexandrosstergiou.github.io/datasets/Inter4K/

For all three dataset variants above, we focus on H.264/AVC as our target codec, using the very-slow preset from x264 as our default preset $p_{\mathrm{def}}$. Table I details the range of encoding parameters and the default values used in the Bayesian optimization. The “resolution” parameter is only used for the Inter4K-HD/VMAF-MultiRes dataset.

TABLE I
H.264/AVC ENCODING PARAMETERS AND RANGES USED DURING BAYESIAN OPTIMIZATION FOR OUR DATASET GENERATION.

Parameter        Range                        Default
aq_mode          (0, 2)                       1
aq_strength      (0, 1.0)                     1.0
bframes          (0, 16)                      3
deblock_alpha    (−6.0, 6.0)                  −1.0
deblock_beta     (−6.0, 6.0)                  −1.0
ipratio          (0, 1.6)                     1.4
mbtree           (0, 1)                       1
merange          (4, 32)                      16
qcomp            (0, 1.0)                     0.6
ref              (1, 16)                      4
subme            (1, 8)                       7
target-bitrate   (fixed)                      1Mbps–5Mbps
max-rate         (fixed)                      1.5 × target-bitrate
bufsize          (fixed)                      2.0 × target-bitrate
psy-rd           (fixed)                      1
psy-trellis      (fixed)                      0.15
VMAF-MultiRes only:
resolution       {1080p, 720p, 540p, 360p}    1080p

For each target bitrate in our dataset (1Mbps, 2Mbps, 3Mbps, 4Mbps, and 5Mbps), each video sample goes through the above Bayesian optimization approach, being encoded a maximum of N = 200 times. In total, for each version of the Inter4K-HD dataset, we generate 5 target bitrates × 200 encodings × 1000 videos, i.e., 1 million unique encodes. Finally, for each video and target bitrate, we selected the best metric found from these encodings as the ground truth of our offline datasets, following Algorithm 1. For illustrative purposes, Fig. 2 shows examples of performing our Bayesian optimization approach for sample titles in the dataset, comparing the performance of the default preset to the best encoding parameter found during the optimization.

Table II and Fig. 3(a) show the statistics of the Inter4K-HD/PSNR dataset, in which we can find a parameter set that provides up to +0.91 PSNR on average in the low bitrate regime compared to the default preset. Table III and Fig. 3(b) show the statistics of the Inter4K-HD/VMAF dataset, in which we find a parameter set supporting up to +4.70 VMAF on average in the lower bitrate regime when compared to the default preset. Finally, Table IV and Fig. 3(c) show the statistics of the Inter4K-HD/VMAF-MultiRes dataset.

TABLE II
INTER4K-HD/PSNR DATASET STATISTICS. VALUES ARE REPORTED IN THE FORMAT: “AVG. PSNR (STANDARD DEVIATION)”.

Inter4K-HD/PSNR
Bitrate   Default          Best             Avg. ∆PSNR
1Mbps     33.56 (5.76)     34.47 (5.88)     +0.91 (2.56)
2Mbps     37.05 (5.82)     37.81 (5.91)     +0.76 (0.48)
3Mbps     39.02 (5.81)     39.70 (5.89)     +0.68 (0.44)
4Mbps     40.07 (5.25)     40.70 (5.36)     +0.64 (0.44)
5Mbps     41.45 (5.75)     42.06 (5.86)     +0.61 (0.45)

TABLE III
INTER4K-HD/VMAF DATASET STATISTICS. VALUES ARE REPORTED IN THE FORMAT: “AVG. VMAF (STANDARD DEVIATION)”.

Inter4K-HD/VMAF
Bitrate   Default          Best             Avg. ∆VMAF
1Mbps     60.65 (18.23)    65.35 (16.66)    +4.70 (2.56)
2Mbps     79.19 (13.13)    81.25 (12.36)    +2.06 (1.50)
3Mbps     86.72 (10.21)    88.02 (9.67)     +1.30 (1.14)
4Mbps     90.69 (8.25)     91.64 (7.83)     +0.95 (0.93)
5Mbps     93.06 (6.82)     93.82 (6.49)     +0.75 (0.78)

TABLE IV
INTER4K-HD/VMAF-MULTIRES DATASET STATISTICS. VALUES ARE REPORTED IN THE FORMAT: “AVG. VMAF (STANDARD DEVIATION)”.

Inter4K-HD/VMAF-MultiRes
Bitrate   Default          Best             Avg. ∆VMAF
1Mbps     60.65 (18.23)    70.01 (13.86)    +9.37 (2.56)
2Mbps     79.19 (13.13)    82.75 (10.69)    +3.57 (1.50)
3Mbps     86.72 (10.21)    88.59 (8.59)     +2.57 (1.14)
4Mbps     90.69 (8.25)     91.90 (7.05)     +1.21 (0.93)
5Mbps     93.06 (6.82)     93.97 (5.88)     +0.91 (0.78)

Our results show that the improvements supported by the Bayesian optimization decrease as the target bitrate increases, which is expected since at lower bitrates choosing the encoding parameters carefully is more important. Also, comparing Table III and Table IV shows significantly larger VMAF improvements at lower target bitrates in Inter4K-HD/VMAF-MultiRes. Such improvements come from the ability to choose a lower resolution to compensate for stronger compression artifacts (see Fig. 4).
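Each point sampled from the ranges of Table I has to be translated into encoder options before it can be evaluated. The paper does not say how x264 was invoked, so the following is only a sketch under stated assumptions: it targets the `x264` command-line tool, maps "max-rate" and "bufsize" to `--vbv-maxrate` and `--vbv-bufsize`, passes psy-rd and psy-trellis together through `--psy-rd`, rounds the deblocking offsets to integers, and uses hypothetical file names.

```python
import shlex

# Default values from Table I, keyed by the names used in that table.
TABLE_I_DEFAULTS = {
    "aq_mode": 1, "aq_strength": 1.0, "bframes": 3,
    "deblock_alpha": -1.0, "deblock_beta": -1.0, "ipratio": 1.4,
    "mbtree": 1, "merange": 16, "qcomp": 0.6, "ref": 4, "subme": 7,
}

def x264_args(params, bitrate_kbps, src="input.y4m", dst="output.264"):
    """Build an x264 command line for one candidate parameter set (assumed mapping)."""
    alpha = round(params["deblock_alpha"])  # rounded: assumed integer offsets
    beta = round(params["deblock_beta"])
    args = [
        "x264", "--preset", "veryslow",
        "--bitrate", str(bitrate_kbps),                   # target bitrate in kbit/s
        "--vbv-maxrate", str(int(1.5 * bitrate_kbps)),    # max-rate = 1.5 x target
        "--vbv-bufsize", str(int(2.0 * bitrate_kbps)),    # bufsize = 2.0 x target
        "--psy-rd", "1.0:0.15",                           # psy-rd : psy-trellis (fixed)
        "--aq-mode", str(params["aq_mode"]),
        "--aq-strength", f'{params["aq_strength"]:.2f}',
        "--bframes", str(params["bframes"]),
        "--deblock", f"{alpha}:{beta}",
        "--ipratio", f'{params["ipratio"]:.2f}',
        "--merange", str(params["merange"]),
        "--qcomp", f'{params["qcomp"]:.2f}',
        "--ref", str(params["ref"]),
        "--subme", str(params["subme"]),
    ]
    if not params["mbtree"]:
        args.append("--no-mbtree")   # mbtree is on by default in x264
    return args + ["-o", dst, src]

cmd = x264_args(TABLE_I_DEFAULTS, bitrate_kbps=1000)
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run` unchanged. Keeping the options explicit after `--preset` lets the searched values override whatever the preset would otherwise set.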
Fig. 4. Histogram of best chosen resolution for Inter4K-HD/VMAF-MultiRes.

Finally, Fig. 5 depicts the average rate-distortion curve of the default preset and of the per-content optimum found for each of the three dataset variants, while the first row of Table V shows the improvement in terms of BD-Rate and BD-Metric.

Fig. 5. Inter4K-HD rate-distortion curves when optimizing for PSNR (a), VMAF (b), and VMAF/MultiRes (c).

B. Prediction model results

We independently trained prediction models on the different dataset variants: Inter4K-HD/PSNR, Inter4K-HD/VMAF, and Inter4K-HD/VMAF-MultiRes. The data was split into 80% for training and 20% for validation (referred to as the test set below), and the same split was used for all the dataset variants in the results presented below. When splitting the dataset into train/test, we make sure that, for a given content, all the target bitrate data appears either only in the training set or only in the test set. Thus, we guarantee that our tests are only performed on video content that was not seen during training. All the features extracted from the source content and the encoding parameters are min/max normalized. For the XGBoost model, we use the default Python xgboost library⁶ with max_depth = 0, while the MLP is composed of 5 layers, each with 512 neurons. For the MLP training, we use an adaptive learning rate starting at $1 \times 10^{-3}$, divided by 5 every time 2 consecutive epochs fail to decrease the loss function, until the tolerance of $1 \times 10^{-7}$.

⁶ https://github.com/dmlc/xgboost

Table V shows the BD-Rate and BD-PSNR/VMAF for evaluating the different trained models (XGBoost and MLP) on the three dataset variants we generated. It also reports the BD-Rate and BD-PSNR/VMAF computed on the whole dataset and such metrics computed only on the test set. The “Best (test set)” values are the expected upper bound for the metrics of the prediction models, while “Best (full dataset)” is kept just for reference. The percentages in parentheses give the share of the “Best (test set)” value recovered by each model. From the results, it is clear that the prediction models are able to recover most of the performance of the search optimization, while requiring just one fast encoding and feature extraction step.

TABLE V
BD-RATE/METRIC COMPARING THE BEST IN THE DATASET AND PREDICTED PARAMETERS. DEFAULT PRESET (X264, VERY SLOW) IS USED AS ANCHOR TO COMPUTE BD-RATE/METRIC.

                      Inter4K-HD/PSNR                  Inter4K-HD/VMAF                  Inter4K-HD/VMAF-MultiRes
Model                 BD-Rate ↓        BD-PSNR ↑       BD-Rate ↓        BD-VMAF ↑       BD-Rate ↓        BD-VMAF ↑
Best (full dataset)   -14.49           0.77            -11.60           2.32            -16.25           3.76
Best (test set)       -13.47           0.71            -11.56           2.08            -16.27           3.70
XGBoost               -10.89 (80.8%)   0.56 (78.9%)    -9.21 (79.7%)    1.64 (78.8%)    -13.25 (81.4%)   3.27 (88.4%)
MLP                   -10.98 (81.5%)   0.58 (81.7%)    -8.73 (75.5%)    1.57 (75.5%)    -13.44 (82.6%)   3.17 (85.7%)

V. CONCLUSION

We introduce a video encoder autotuning framework that takes advantage of Bayesian optimization to search the space of encoder parameters and build an offline dataset which is then used to train supervised machine learning methods. Our method supports an automated and efficient search of encoding parameters while offering better performance than previous fixed parameter search methods. Moreover, after training, our method provides an efficient solution, which only requires one fast encoding of the content plus a pass of feature extraction. Specifically, we demonstrate that using x264, we are able to find a parameter set that achieves up to 14.49% and 11.59% BD-Rate reduction compared to the very-slow encoding parameter preset when optimizing for PSNR and VMAF, respectively, and can recover ∼80% of such performance with a much more efficient prediction model. Our proposed framework also opens up new avenues for future work. Although we focus only on simple hand-designed features and more traditional machine learning algorithms in our experiments, more advanced features (e.g., deep learning-based ones) and models (e.g., Transformers) can be easily integrated into our framework. Experimenting with our method on other codecs (e.g., HEVC and AV1) is another interesting direction for future work.

REFERENCES

[1] T. Wiegand, G. J. Sullivan et al., “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003.
[2] B. Bross, Y.-K. Wang et al., “Overview of the Versatile Video Coding (VVC) standard and its applications,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736–3764, 2021.
[3] J. Han, B. Li et al., “A technical overview of AV1,” Proceedings of the IEEE, vol. 109, no. 9, pp. 1435–1462, 2021.
[4] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1st ed. USA: Addison-Wesley Longman Publishing Co., Inc., 1989.
[5] P. I. Frazier, “A tutorial on Bayesian optimization,” arXiv preprint arXiv:1807.02811, 2018.
[6] R. R. Sharma and K. V. Arya, “Parameter optimization for HEVC/H.265 encoder using multi-objective optimization technique,” in 2016 11th International Conference on Industrial and Information Systems (ICIIS). Roorkee, India: IEEE, Dec. 2016, pp. 592–597.
[7] G. J. Sullivan, J.-R. Ohm et al., “Overview of the High Efficiency Video Coding (HEVC) standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012.
[8] A. Aaron, Z. Li et al., “Per-title encode optimization,” Tech. Rep., 2015. [Online]. Available: https://netflixtechblog.com/per-title-encode-optimization-7e99442b62a2
[9] K. S. Durbha, H. Tmar et al., “Bitrate ladder construction using visual information fidelity,” arXiv preprint arXiv:2312.07780, 2023.
[10] A. Telili, W. Hamidouche et al., “Bitrate ladder prediction methods for adaptive video streaming: A review and benchmark,” arXiv preprint arXiv:2310.15163, 2023.
[11] A. V. Katsenou, J. Sole, and D. R. Bull, “Efficient bitrate ladder construction for content-optimized adaptive video streaming,” IEEE Open Journal of Signal Processing, vol. 2, pp. 496–511, 2021.
[12] F. Nasiri, W. Hamidouche et al., “Multi-preset video encoder bitrate ladder prediction,” ser. ViSNext ’22. New York, NY, USA: ACM, 2022, p. 8–13.
[13] A. Telili, W. Hamidouche et al., “Efficient per-shot transformer-based bitrate ladder prediction for adaptive video streaming,” in 2023 IEEE ICIP. IEEE, 2023, pp. 1835–1839.
[14] J. Yang, M. Guo et al., “Optimal transcoding resolution prediction for efficient per-title bitrate ladder estimation,” arXiv preprint arXiv:2401.04405, 2024.
[15] “Recommendation ITU-T H.264. Advanced video coding for generic audiovisual services,” 2021.
[16] “VMAF: Video multimethod assessment fusion.” [Online]. Available: https://github.com/Netflix/vmaf
[17] Z. Wang, A. C. Bovik et al., “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[18] D. Bertsimas and J. Tsitsiklis, “Simulated annealing,” Statistical Science, vol. 8, no. 1, pp. 10–15, 1993.
[19] J. Wu, X.-Y. Chen et al., “Hyperparameter optimization for machine learning models based on Bayesian optimization,” Journal of Electronic Science and Technology, vol. 17, no. 1, pp. 26–40, 2019.
[20] A. H. Victoria and G. Maragatham, “Automatic tuning of hyperparameters using Bayesian optimization,” Evolving Systems, vol. 12, no. 1, pp. 217–223, 2021.
[21] A. H. Ashouri, G. Mariani et al., “COBAYN: Compiler autotuning framework using Bayesian networks,” ACM Transactions on Architecture and Code Optimization (TACO), vol. 13, no. 2, pp. 1–25, 2016.
[22] “Recommendation P.910: Subjective video quality assessment methods for multimedia applications,” 2023.
[23] V. V. Menon, C. Feldmann et al., “VCA: video complexity analyzer,” in Proceedings of the 13th ACM Multimedia Systems Conference (MMSys ’22). New York, NY, USA: ACM, 2022, p. 259–264.

No ratings yet
800T/H 30.5 MM Push Buttons NEMA Push Button Specifications: Approximate Dimensions
2 pages
Belimo EF Installation-Instructions En-Us
No ratings yet
Belimo EF Installation-Instructions En-Us
10 pages
x264 PRO and x264 PRO User Guide
No ratings yet
x264 PRO and x264 PRO User Guide
10 pages
MCC1106 - Industrial Automation and Robotics - MTech SEM 1
No ratings yet
MCC1106 - Industrial Automation and Robotics - MTech SEM 1
1 page
In Communications Based On MATLAB: A Practical Course
No ratings yet
In Communications Based On MATLAB: A Practical Course
10 pages
Class 9 Cbse Board Syllabus
No ratings yet
Class 9 Cbse Board Syllabus
7 pages
Oracle Recommended Patches R12.ATG - PF.B
No ratings yet
Oracle Recommended Patches R12.ATG - PF.B
32 pages
Digital Video Transcoding: Jun Xin, Chia-Wen Lin, Ming-Ting Sun
No ratings yet
Digital Video Transcoding: Jun Xin, Chia-Wen Lin, Ming-Ting Sun
14 pages
An Introduction To Combustion - Pyronics PDF
No ratings yet
An Introduction To Combustion - Pyronics PDF
4 pages
H.264 Considerations
No ratings yet
H.264 Considerations
2 pages
Structural Cals For UCW
No ratings yet
Structural Cals For UCW
11 pages
Image Recognition Using CIFAR 10
100% (1)
Image Recognition Using CIFAR 10
56 pages
Design of Unde Ground Water Tank
100% (2)
Design of Unde Ground Water Tank
18 pages
Asco Numatics ATEX General Info
No ratings yet
Asco Numatics ATEX General Info
18 pages
Andy Clark - Associative Engines
100% (3)
Andy Clark - Associative Engines
248 pages
12 Wcdma Hsdpa RRM and Parameters
No ratings yet
12 Wcdma Hsdpa RRM and Parameters
67 pages
HELMKE Plus: Three-Phase Low Voltage Squirrel Cage Motors
No ratings yet
HELMKE Plus: Three-Phase Low Voltage Squirrel Cage Motors
28 pages


