Volume 8, Issue 11, November – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

A Survey on Video Coding Optimizations using Machine Learning

Mahesh Pawaskar
School of Engineering, Career Point University, Kota, Rajasthan, India

Dr. Gaurav Vijay
School of Engineering, Career Point University, Kota, Rajasthan, India

Abstract:- Video is presently the most common type of data used globally. Its volume has been rising explosively as a result of the rapid development of video applications and the growing demand for higher-quality video services, posing the biggest challenge to multimedia processing, transmission, and storage. Although the compression ratio has grown over the last three decades, gains from conventional video coding have become somewhat saturated. Deep learning algorithms offer new possibilities for improving video coding technologies, since they can make data-driven predictions and learn from vast amounts of unstructured data. This paper surveys machine learning-based video encoding optimization, laying groundwork for further advancements in video coding. The designer of a video service must choose a suitable video coding scheme to satisfy criteria such as efficiency, complexity, rate-distortion performance, and flexibility. This article also presents challenges associated with machine learning-based video coding optimization. The survey is organized around two key aspects: first, low-complexity optimization with the help of advanced learning tools, such as feed-forward CNN, deep RL, and deep NN; and second, learning-based visual quality assessment (VQA).

Keywords:- Video Coding, Deep Learning, Machine Learning, High-Efficiency Video Coding Standard (HEVC), Versatile Video Coding (VVC), Visual Quality Assessment (VQA).

I. INTRODUCTION

Numerous video applications, including TV broadcasting, movies, video-on-demand, video conferencing, mobile video, video surveillance, remote control, robotics, 3D video, and free-viewpoint TV, have emerged with the development of multimedia computing, communication, and display technologies. These applications are used extensively in many aspects of daily life, including industry, communication, national security, the military, education, medicine, and entertainment. Video now accounts for the majority of data transmitted over the internet, and its volume increases dramatically every year. YouTube is used extensively to share information through video.

Digital video has many advantages over traditional analog video, which has led to its replacement. Digital video is also compatible with text and audio data. To store and transmit visual information properly, an effective video coding system is required. Digital video consumes large amounts of data, and without proper compression it would be very difficult to store and transmit. Although data storage capacity, network bandwidth, and computing power have increased tremendously, the demand for better-quality video has never stopped growing.

YouTube is a video-sharing website that lets users watch videos online. It received 112.9B visits in October 2023, with an average session duration of 35:09, and its traffic increased by 19.00% within two months [1].

Video coding is one of the key technologies in video applications: it makes it possible to compress and organize video data more efficiently for computing, transmission, and storage. Machine learning is among the most active research directions for improving video compression efficiency. Machine learning analyzes data to find hidden patterns and drive decisions. Owing to this exceptional ability to learn from data, a number of recent studies have significantly improved video coding results by incorporating machine learning algorithms into the process.

The idea of perceptual redundancy rests on the fact that not all video distortions are visible to the Human Visual System (HVS), which is the ultimate receiver of most videos. The eyes and the brain are the two functional components of the HVS. HVS research based on physiological (eye) and psychological (brain) studies has identified numerous visual characteristics and redundancies [2]. The notion of Just Noticeable Difference (JND) arises because very fine-scale variations in pixel values produce distortion that is, in most cases, unnoticeable. The eyes are responsible for these physiological perceptual redundancies. In addition, perceptual sensitivity differs with the video's content, the viewer's awareness, and the viewer's region of interest (ROI), which corresponds to psychological processing in the brain. The goal of video coding is to preserve visual quality while exploiting signal and perceptual redundancy as much as possible.
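
The perceptual-redundancy idea can be made concrete with a small sketch. The Python snippet below is only an illustration and not a model taken from the surveyed literature: it assumes one hypothetical JND threshold (`jnd_threshold`) shared by every pixel, whereas practical JND models vary the threshold with local luminance and texture. Errors at or below the threshold are treated as invisible, so only the excess contributes to the perceptual distortion.

```python
def mse(original, reconstructed):
    """Plain mean squared error between two equally sized pixel lists."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def jnd_weighted_mse(original, reconstructed, jnd_threshold=3):
    """MSE that ignores errors at or below a JND threshold.

    jnd_threshold is a hypothetical constant chosen for illustration only.
    """
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for o, r in zip(original, reconstructed):
        # Only the part of the error that exceeds the threshold is "visible".
        visible_error = max(abs(o - r) - jnd_threshold, 0)
        total += visible_error ** 2
    return total / len(original)


# Toy 4-pixel example: three small errors and one large error.
original = [100, 102, 98, 101]
reconstructed = [101, 100, 98, 111]

print(mse(original, reconstructed))               # 26.25
print(jnd_weighted_mse(original, reconstructed))  # 12.25
```

In this toy case the three small errors vanish from the perceptual measure and only the large error on the last pixel remains, which is the sense in which an encoder can spend fewer bits on distortion that the viewer would not notice.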

IJISRT23NOV1819 www.ijisrt.com 2537

II. EXISTING STANDARDS

In this paper, we overview key challenging issues in video coding and recent advances in machine learning-based video coding optimization. The Moving Picture Experts Group (MPEG) of ISO/IEC and the Video Coding Experts Group (VCEG) of ITU-T play a vital role in standardizing video coding and advancing coding technologies. Five leading standards from different generations are popular: H.261, MPEG-4, H.264/AVC, H.265/HEVC, and VVC [3] [4].

H.261 was an early video coding standard developed for video conferencing applications. It used basic compression techniques such as motion compensation and the discrete cosine transform (DCT). While these techniques are less advanced than those of modern video codecs, they were innovative at the time. H.261 laid the foundation for subsequent video coding standards and contributed to the evolution of video compression technology [5].

MPEG-4 is part of the MPEG suite of standards and was introduced as a successor to earlier video coding standards such as MPEG-2. It was first published in 1998 and has gone through several revisions and extensions over the years. It uses advanced video compression techniques, including motion compensation, transform coding such as DCT, and quantization, to reduce data size while maintaining video quality [6].

H.264 has had a profound impact on the multimedia industry and is one of the most widely adopted video compression standards. Digital Multimedia Broadcasting (DMB), Digital Video Broadcasting-Handheld (DVB-H), and the iPod are just a few of the applications that have adopted this standard. It offers several feature sets of coding algorithms designed to satisfy particular application requirements. A recent study of the market share of the leading online video codecs and containers shows that in 2018, 82% of online video streams were encoded with AVC/H.264 [7].

High Efficiency Video Coding (HEVC), also referred to as H.265, is the successor to H.264/AVC. The HEVC standard was developed and standardized collaboratively by the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC MPEG. With certain modifications from previous standards, block-based motion-compensated hybrid video coding serves as the foundation of the video coding layer architecture [8]. A few market players have incorporated HEVC into their product lines. However, its market share is not particularly high and already seems to be saturated. Business uncertainty resulted from unexpected complexity and delay in establishing the full cost of licensing for implementations. According to recent research on the market share of top online video codecs and containers, only 12% of video streams on the internet were coded with HEVC in 2018 [7].

In July 2020, the Joint Video Experts Team (JVET) finalized the Versatile Video Coding (VVC) standard. It increases compression capability by adding new tools, reaching approximately 50% bit-rate reduction for equivalent video quality compared with HEVC. It is useful for emerging applications such as HDR/WCG video, 360° immersive video, and screen content coding [9].

The main objective of video coding standards is to minimize the bit rate without significantly damaging visual quality. Maintaining low complexity while meeting bit-rate and visual-quality targets is a challenge. Researchers are now focusing on learning-based approaches to improve video coding performance. Learning-based approaches can be applied in a number of different areas. This article considers two of them: low-complexity coding optimization and high-quality coding optimization. Subsequent sections discuss existing learning-based approaches.

III. LEARNING BASED LOW COMPLEXITY CODING OPTIMIZATION

In predictive coding, refined variable block size partitioning can improve prediction accuracy, which in turn lowers the coding residue and improves coding efficiency. However, variable block size partitioning and a larger number of prediction modes increase coding complexity. The number of intra-prediction modes differs between standards: for example, there are 1, 4/9, 35, and 67 prediction modes in MPEG-2, H.264/AVC, H.265, and VVC, respectively. In inter prediction, refined modes, Transform Units (TU), Motion Estimation (ME), and loop filtering techniques are incorporated to increase coding efficiency. Selecting an appropriate mode in minimum time is desirable to speed up video coding. Additionally, multiple decision layers must be explored recursively. A number of conventional approaches have been adopted for mode decision. These approaches are fast and have very limited complexity, but they have two drawbacks: 1) very few features are exploited, which restricts their ability to discriminate between modes; and 2) because of limited statistical analysis, their thresholds may not be optimal.

Mode decision can be formulated as a classification problem and learned from data. This paper explores research on machine learning-based mode decision approaches; this section discusses four of them.

Decision trees, binary classifiers including the support vector machine (SVM), Back Propagation Neural Networks (BPNN), and unsupervised learning are among the machine learning models that have been applied to skip some modes or to select the best mode among the seven modes of H.264/AVC.

Eduardo Martinez-Enriquez et al. proposed a two-level classification-based approach for inter-mode decisions in AVC. The first classifier decides whether the SKIP/DIRECT mode should be chosen, whereas the second determines whether to use small modes such as 8×8, 8×4, 4×8, and 4×4
or large modes such as 16×16, 16×8, 8×16, and 8×8. The experimental results showed that, compared to other methods, this method saves 60% of the total encoding time, at the cost of a small loss in rate-distortion performance [10].

Yu-Huan Sung et al. proposed a multi-phase nearest-mean classification based on RD cost clustering for fast mode decisions. It is an unsupervised clustering model. This method achieved a 68% reduction in encoding time at the expense of a slight increase in bit rate [11].

Jui-Chiu Chiang et al. proposed a fast stereo video encoding algorithm. The algorithm is based on hierarchical two-stage neural classification, comprising fast prediction source determination and fast block partition selection [12].

Paula Carrillo et al. suggested a machine learning-based approach with a three-level topology for inter-mode decision. The first level improves speed through an early SKIP decision. The second level separates inter 8×8 and its sub-modes from inter 16×16 and its sub-modes. If the latter leaf is selected, a third classification between the inter 16×16 sub-modes and intra 4×4 is evaluated [13].

H.265/HEVC has a large number of decision modes, which makes the task more challenging. Its decision computation is more complex than that of H.264/AVC, and it includes recursive quad-tree CU mode decision as well as multi-class CU and TU mode decision. The following paragraphs discuss machine learning-based HEVC INTRA and INTER coding optimization. In recent years, Deep Neural Networks (DNN) have been widely used in visual signal processing, and researchers are exploring end-to-end deep learning-based decision schemes.

Zhenyu Liu et al. used a Convolutional Neural Network (CNN) to analyse image texture and reduce the number of candidate CU modes. Only the remaining CU modes go through the exhaustive Rate-Distortion Optimization (RDO) process. In this encoder, the CNN analyses the texture of the CU and then identifies the optimal CU/PU configuration. The quantization parameter is incorporated into the CNN architecture. This method saves 63% of intra-coding time at the cost of a 2.66% BDBR increase [14].

Thorsten Laude et al. developed a deep learning-based intra-prediction mode decision process for H.265/HEVC. The sample values of the block to be coded are fed through a deep convolutional neural network. The choice of intra-prediction mode is expressed as a classification problem, without RD optimization over all feasible modes [15].

Tianyi Li et al. proposed a complexity reduction approach for INTRA mode. Their approach trains a deep convolutional neural network to predict the CTU partition instead of running RDO. They established a large-scale database with diverse CTU partition patterns and modeled the partition as a three-level classification problem. To solve this classification problem, they used convolution kernels of various sizes and trainable parameters. Experimental results showed that their approach reduced intra-encoding time with a negligible increase in Bjøntegaard delta bit-rate [16].

Because it relies on a brute-force search for rate-distortion optimization, the quad-tree partition of the Coding Unit (CU) accounts for much of the encoding complexity. Mai Xu et al. first collected a large-scale database of CU partition data for the intra and inter modes of HEVC, which enables deep learning of the CU partition. Next, they used a hierarchical CU partition map (HCPM) to represent the CU partition of a whole coding tree unit and proposed an early-terminated hierarchical CNN (ETH-CNN) that learns to predict the HCPM. By using ETH-CNN to determine the CU partition instead of a brute-force search, the encoding complexity of intra-mode HEVC can be significantly decreased. Third, an ETH-LSTM is proposed to learn the temporal correlation of the CU partition. The combination of ETH-LSTM and ETH-CNN is used to predict the CU partition, which reduces HEVC complexity in inter mode. Experimental results showed that their method outperformed other state-of-the-art approaches in terms of complexity reduction [17].

To reduce complexity in H.264 to HEVC transcoding, Jingyao Xu et al. suggested a deep learning-based approach to replace the brute-force search for rate-distortion optimization. They built a large-scale transcoding database and then determined the correlation between the HEVC CTU partition and H.264 features. This relation helps reveal temporal and spatial-temporal similarities of the CTU partition. Next, they proposed a hierarchical long short-term memory (H-LSTM) architecture that predicts the CTU partition of HEVC. The performance of H-LSTM is compared with that of other methods [18].

Discussion on learning based low complexity coding optimization:

Low-complexity optimization becomes more important as coding complexity grows exponentially. At the same time, the complexity of each mode decision problem increases in VVC. Advanced learning tools, such as feed-forward CNN, deep RL, and deep NN, are better options for solving such complicated decision problems.

IV. LEARNING-BASED VISUAL QUALITY ASSESSMENT (VQA)

The aim of video coding is to minimize distortion (D) or increase quality (Q). The quality Q is measured by PSNR, based on the pixel-by-pixel difference between the original and reconstructed pictures, and the distortion D is measured by MSE. However, there is no guarantee that PSNR and MSE reflect the quality actually perceived by the HVS. A number of Visual Quality Assessment (VQA) metrics have been developed, such as SSIM, FSIM, and Multi-Scale SSIM. Creating a useful visual quality metric that is in line with human perception is difficult. Through the extraction of visual elements from data and the development
of data-driven solutions, machine learning opens up new possibilities. VQA metrics can be categorized as No Reference (NR), Reduced Reference (RR), and Full Reference (FR). An FR metric has the full reference frame available, an RR metric needs partial side information, and an NR metric has no reference frame or side information. Owing to the progress of deep learning in image recognition, researchers have recently attempted to apply deep learning to VQA.

Le Kang et al. developed a Convolutional Neural Network for No-Reference image quality assessment. They combined feature learning and regression into one complete optimization process, which made it possible to employ new training techniques to boost performance. The approach delivers state-of-the-art performance on common Image Quality Assessment (IQA) datasets and produces predictions of image quality that are highly correlated with human perception. They also demonstrated that the method can estimate quality in local regions [19].

Sebastian Bosse et al. developed a deep neural network-based approach for Image Quality Assessment (IQA). The proposed network comprises 10 convolution layers and 5 pooling layers for feature extraction, and two fully connected layers for regression. The model is applicable to both No-Reference and Full-Reference IQA, and it allows for joint learning of local quality and local weights. It was evaluated on the LIVE, CSIQ, and TID2013 databases as well as the wild image quality database, and it shows superior performance to state-of-the-art NR and FR IQA methods [20].

Chunling Fan et al. developed a multi-expert Convolutional Neural Network (CNN) based NR IQA algorithm. The network consists of distortion type classification, CNN-based IQA algorithms, and a fusion algorithm. They first introduce a distortion-type classifier to determine the type of distortion present in the input image. Then, they provide an IQA method based on a multi-expert CNN for each kind of distortion. Finally, a fusion technique combines the multi-expert CNN-based image quality predictions with the distortion-type classification results. The model was assessed on the LIVE II database and cross-dataset on the CSIQ database. The proposed algorithm shows an improvement for NR IQA [21].

Hak Gu Kim et al. proposed a deep learning-based virtual reality image quality assessment method. The proposed deep network consists of a virtual reality (VR) quality score predictor and a human perception guider. By encoding the positional feature and visual feature of a patch on the omnidirectional image, the VR quality score predictor learns the positional and visual properties of the image. The patch weight and patch quality score are estimated from the encoded positional and visual features. The image quality score is then predicted by aggregating all of the patch scores and weights. Using adversarial learning, the human perception guider assesses the predicted quality score by comparing it to the human subjective score. The experimental results demonstrate that, for omnidirectional images, the proposed method outperforms both the state-of-the-art and the current two-dimensional image quality models [22].

One of the difficult problems is that deep learning-based methods need a huge volume of labelled data for training. If the training dataset is insufficient or does not accurately represent real-world videos, the learned model can have trouble handling a variety of contents and distortions. Very few labelled images are available for quality assessment.

Discussion on Learning-based Visual Quality Assessment:

To use a VQA algorithm as the quality objective in coding, adapting it to operate on blocks is preferable to image- or video-level algorithms. Another point is that rate-distortion theory was originally developed using the MSE, so it should be re-evaluated when the distortion measure is adapted. One way to address this issue is to establish a mathematical relationship between VQA and MSE before applying it in video coding. Computational complexity is another difficult problem. It rises dramatically when sophisticated feature extraction methods and trained classifiers are used in quality prediction. Because learning-based VQA algorithms, particularly deep learning-based schemes, would be invoked very frequently in the RDO, the coding algorithm would become exceedingly complex. It is worthwhile to investigate how to include VQA, particularly learning-based VQA with superior performance, in video coding with manageable complexity.

V. CONCLUSION

We reviewed the development of several video coding standards in this paper. VVC is an advanced video coding standard that has been shown to compress video more effectively. The high precision of the HEVC and VVC algorithms has been made possible by growth in computing power. This article also presents challenges associated with machine learning-based video coding optimization. The survey is organized around two key aspects: first, low-complexity optimization with the help of advanced learning tools, such as feed-forward CNN, deep RL, and deep NN; and second, learning-based visual quality assessment (VQA). In each case, the problem formulation, advantages, and challenging issues are presented. To sum up, learning-based coding optimizations offer many benefits, and they are a promising direction for both academia and industry.

Factors other than coding efficiency also influence the industry's choice of video coding technology for goods and services. Appropriate licensing conditions are crucial when selecting video coding options.
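
The learning-based mode decision approaches surveyed in Section III share one pattern: a classifier prunes the candidate modes, and the rate-distortion search runs only on what remains. The Python sketch below illustrates that pattern under loudly stated assumptions. The distortion and rate figures are made-up numbers, `toy_partition_classifier` is a hypothetical threshold rule standing in for a trained model, and the Lagrangian cost J = D + λR is the usual textbook formulation rather than the cost function of any specific cited paper.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def best_mode(candidates, allowed_modes, lagrange_multiplier):
    """Return (mode with the lowest RD cost, number of modes evaluated)."""
    best, best_cost, evaluated = None, float("inf"), 0
    for mode in allowed_modes:
        distortion, rate_bits = candidates[mode]
        cost = rd_cost(distortion, rate_bits, lagrange_multiplier)
        evaluated += 1
        if cost < best_cost:
            best, best_cost = mode, cost
    return best, evaluated


def toy_partition_classifier(texture_variance, threshold=50.0):
    """Hypothetical stand-in for a trained classifier.

    Smooth blocks are sent to the large partitions, textured blocks to the
    small ones. The feature and threshold are illustrative assumptions.
    """
    if texture_variance < threshold:
        return ["16x16", "16x8", "8x16"]
    return ["8x8", "4x4"]


# Made-up (distortion, rate in bits) per mode. A real encoder obtains these
# by actually coding the block, which is the expensive step being avoided.
candidates = {
    "16x16": (120.0, 20),
    "16x8": (110.0, 28),
    "8x16": (112.0, 27),
    "8x8": (90.0, 60),
    "4x4": (70.0, 110),
}
LAMBDA = 1.0

exhaustive = best_mode(candidates, list(candidates), LAMBDA)
fast = best_mode(candidates, toy_partition_classifier(12.0), LAMBDA)
print(exhaustive)  # ('16x8', 5)
print(fast)        # ('16x8', 3)
```

Here the pruned search reaches the same mode with three evaluations instead of five. If the classifier sends a block to the wrong group, the chosen mode has a higher RD cost than the exhaustive optimum, which corresponds to the small bit-rate increases that the surveyed methods report in exchange for their time savings.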

REFERENCES

[1]. https://www.semrush.com/website/youtube.com/overview/
[2]. Zhang, Yun, Sam Kwong, and Shiqi Wang. "Machine learning based video coding optimizations: A survey." Information Sciences 506 (2020): 395-423.
[3]. https://mpeg.chiariglione.org/who-we-are
[4]. Richardson, Iain E. The H.264 advanced video compression standard. John Wiley & Sons, 2011.
[5]. Akramullah, Shahriar. Digital video concepts, methods, and metrics: quality, compression, performance, and power trade-off analysis. Springer Nature, 2014.
[6]. T. Sikora, "The MPEG-4 video standard verification model," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 1, pp. 19-31, Feb. 1997, doi: 10.1109/76.554415.
[7]. "Market share of top online video codecs and containers worldwide from 2016 to 2018," Statista, New York, 2019. [Online]. Available: https://www.statista.com/statistics/710673/worldwide-
[8]. G. J. Sullivan, J. Ohm, W. Han and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012, doi: 10.1109/TCSVT.2012.2221191.
[9]. B. Bross et al., "Overview of the Versatile Video Coding (VVC) Standard and its Applications," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736-3764, Oct. 2021, doi: 10.1109/TCSVT.2021.3101953.
[10]. E. Martinez-Enriquez, A. Jimenez-Moreno, M. Angel-Pellon and F. Diaz-de-Maria, "A Two-Level Classification-Based Approach to Inter Mode Decision in H.264/AVC," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 11, pp. 1719-1732, Nov. 2011, doi: 10.1109/TCSVT.2011.2134010.
[11]. Y.-H. Sung and J.-C. Wang, "Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering," in IEEE Transactions on Multimedia, vol. 14, no. 3, pp. 693-702, June 2012, doi: 10.1109/TMM.2012.2186793.
[12]. J.-C. Chiang, W.-C. Chen, L.-M. Liu, K.-F. Hsu and W.-N. Lie, "A Fast H.264/AVC-Based Stereo Video Encoding Algorithm Based on Hierarchical Two-Stage Neural Classification," in IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 2, pp. 309-320, April 2011, doi: 10.1109/JSTSP.2010.2066956.
[13]. P. Carrillo, Tao Pin and H. Kalva, "Low complexity H.264 video encoder design using machine learning techniques," 2010 Digest of Technical Papers International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 2010, pp. 461-462, doi: 10.1109/ICCE.2010.5418749.
[14]. Z. Liu, X. Yu, Y. Gao, S. Chen, X. Ji and D. Wang, "CU Partition Mode Decision for HEVC Hardwired Intra Encoder Using Convolution Neural Network," in IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5088-5103, Nov. 2016, doi: 10.1109/TIP.2016.2601264.
[15]. T. Laude and J. Ostermann, "Deep learning-based intra prediction mode decision for HEVC," 2016 Picture Coding Symposium (PCS), Nuremberg, Germany, 2016, pp. 1-5, doi: 10.1109/PCS.2016.7906399.
[16]. T. Li, M. Xu and X. Deng, "A deep convolutional neural network approach for complexity reduction on intra-mode HEVC," 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, China, 2017, pp. 1255-1260, doi: 10.1109/ICME.2017.8019316.
[17]. M. Xu, T. Li, Z. Wang, X. Deng, R. Yang and Z. Guan, "Reducing Complexity of HEVC: A Deep Learning Approach," in IEEE Transactions on Image Processing, vol. 27, no. 10, pp. 5044-5059, Oct. 2018, doi: 10.1109/TIP.2018.2847035.
[18]. J. Xu, M. Xu, Y. Wei, Z. Wang and Z. Guan, "Fast H.264 to HEVC Transcoding: A Deep Learning Method," in IEEE Transactions on Multimedia, vol. 21, no. 7, pp. 1633-1645, July 2019, doi: 10.1109/TMM.2018.2885921.
[19]. L. Kang, P. Ye, Y. Li and D. Doermann, "Convolutional Neural Networks for No-Reference Image Quality Assessment," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 2014, pp. 1733-1740, doi: 10.1109/CVPR.2014.224.
[20]. S. Bosse, D. Maniry, K.-R. Müller, T. Wiegand and W. Samek, "Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment," in IEEE Transactions on Image Processing, vol. 27, no. 1, pp. 206-219, Jan. 2018, doi: 10.1109/TIP.2017.2760518.
[21]. C. Fan, Y. Zhang, L. Feng and Q. Jiang, "No Reference Image Quality Assessment based on Multi-Expert Convolutional Neural Networks," in IEEE Access, vol. 6, pp. 8934-8943, 2018, doi: 10.1109/ACCESS.2018.2802498.
[22]. H. G. Kim, H.-T. Lim and Y. M. Ro, "Deep Virtual Reality Image Quality Assessment With Human Perception Guider for Omnidirectional Image," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 4, pp. 917-928, April 2020, doi: 10.1109/TCSVT.2019.2898732.

T E V V: A S S T C A: HE Volution of Olumetric Ideo Urvey of Mart Ranscoding and Ompression Pproaches
11 pages
Implementation of P-N Learning Based Compression in Video Processing
No ratings yet
Implementation of P-N Learning Based Compression in Video Processing
3 pages
New Trends in Image and Video Compression
No ratings yet
New Trends in Image and Video Compression
7 pages
lFinalGroup-8
No ratings yet
lFinalGroup-8
62 pages
Compusoft, 2 (5), 127-129 PDF
No ratings yet
Compusoft, 2 (5), 127-129 PDF
3 pages
H.264 Video Encoder Standard - Review
No ratings yet
H.264 Video Encoder Standard - Review
5 pages
A Reconfigurable Multiple Transform Selection Architecture For VVC
No ratings yet
A Reconfigurable Multiple Transform Selection Architecture For VVC
12 pages
Unit - 3
No ratings yet
Unit - 3
104 pages
2013 - 3D Video Coding For Embedded Devices
No ratings yet
2013 - 3D Video Coding For Embedded Devices
219 pages
Basic Prediction Techniques in Modern Video Coding Standards PDF
No ratings yet
Basic Prediction Techniques in Modern Video Coding Standards PDF
90 pages
In Communications Based On MATLAB: A Practical Course
No ratings yet
In Communications Based On MATLAB: A Practical Course
10 pages
Wa0004.
No ratings yet
Wa0004.
3 pages
The VISIONE Video Search System Explotting Off The Shelf Text Search Engines For Large Scale Video Retrieval
No ratings yet
The VISIONE Video Search System Explotting Off The Shelf Text Search Engines For Large Scale Video Retrieval
26 pages
Gob Spic 20
No ratings yet
Gob Spic 20
29 pages
Research Public Journals
No ratings yet
Research Public Journals
13 pages
2018, Energy efficient embedded video processing systems _ a hardware-software collaborative approach Henkel, Jörg_ Khan, Muhammad Usman Karim_ Shafique, Muhammad-Springer
No ratings yet
2018, Energy efficient embedded video processing systems _ a hardware-software collaborative approach Henkel, Jörg_ Khan, Muhammad Usman Karim_ Shafique, Muhammad-Springer
242 pages


IJISRT23NOV1819 www.ijisrt.com 2537

