

Learning-Driven Lossy Image Compression: A Comprehensive Survey

Sonain Jamil, Md. Jalil Piran, and MuhibUr Rahman

arXiv:2201.09240v1 [eess.IV] 23 Jan 2022

Abstract—In the realm of image processing and computer vision (CV), machine learning (ML) architectures are widely applied. Convolutional neural networks (CNNs) solve a wide range of image processing problems and can also address the image compression problem. Compression of images is necessary due to bandwidth and memory constraints. Helpful, redundant, and irrelevant information are the three different forms of information found in images. This paper surveys recent techniques for lossy image compression using ML architectures, including different auto-encoders (AEs) such as convolutional auto-encoders (CAEs), variational auto-encoders (VAEs), and AEs with hyper-prior models, as well as recurrent neural networks (RNNs), CNNs, generative adversarial networks (GANs), principal component analysis (PCA), and fuzzy means clustering. We divide all of the algorithms into several groups based on architecture and cover still image compression. Key findings are emphasized, and possible directions for future research are outlined. Open research problems such as out of memory (OOM), striped region distortion (SRD), aliasing, and the compatibility of frameworks with both the central processing unit (CPU) and the graphics processing unit (GPU) are explained. The majority of the publications surveyed in the compression domain are from the previous five years and use a variety of approaches.

Index Terms—learned image compression, deep learning, JPEG, end-to-end image compression, machine learning.

S. Jamil is with the Department of Electronics Engineering, Sejong University, Seoul, South Korea (e-mail: [email protected]).
M. J. Piran is with the Department of Computer Engineering, Sejong University, Seoul, South Korea (e-mail: [email protected]).
M. Rahman is with the Department of Electrical Engineering, Polytechnique Montreal, Montreal, QC H3T 1J4, Canada (e-mail: [email protected]).
Manuscript received Month xx, xxxx; revised Month xx, xxxx.

I. INTRODUCTION

IN this modern era of big data, the size of the data is the biggest concern for scientists and researchers. Due to the limited bandwidth of the channel and limited memory space, data compression is required for the successful transmission and storage of data without losing significant information [1]. Data compression can be performed in several ways: audio compression, image compression, video compression, and document compression [2]. There are three types of information in an image: useful, redundant, and irrelevant. Irrelevant information can be ignored for the compression of images. Redundant information is crucial for highlighting details in images, whereas useful information is neither redundant nor irrelevant. Without accurate information, we cannot reconstruct or decompress images properly [3]. In image compression, there are two major categories. The first is lossless image compression, where no information is lost, and the second is lossy image compression. Lossless image compression techniques are very efficient for small-size data. Lossless techniques such as Huffman coding, run-length encoding (RLE), arithmetic coding, Lempel-Ziv-Welch (LZW) coding, and JPEG-LS are efficient for small data [14], [15]. The major drawback of lossless compression techniques is their lower compression efficiency compared to lossy compression techniques. That is why many researchers are working on image compression using ML.

There are many surveys focused on image compression. Many address the pros and cons of conventional image compression algorithms based on the discrete cosine transform (DCT) and the discrete wavelet transform (DWT). Shukla et al. in [5] surveyed prediction- and transform-based image compression algorithms. This survey highlighted the importance and application of prediction- and transform-based conventional algorithms for image compression. The prediction-based algorithms were based on edge detection, gradient, and block-based prediction, while the transform-based algorithms were based on wavelets. The survey presented a comparative analysis of entropy coding. However, it lacked a discussion of end-to-end image compression architectures using ML. In [6], the authors studied DCT and DWT-based algorithms; the survey did not address recent learned image compression techniques. Likewise, in [13] the authors surveyed several lossy and lossless algorithms for image compression. The article highlighted the advantages and drawbacks of predictive, entropy coding, and discrete Fourier transform-based image compression frameworks, but did not address ML-based image compression architectures. In short, several surveys highlight the importance of conventional DCT, DWT, and entropy-based approaches for image compression, yet there remains a requirement to survey ML-based still image compression techniques.

A plethora of techniques have been proposed by researchers in recent years to address the problem of still image compression using ML [4]. So, there is a need to survey ML-based image compression techniques and highlight the open research problems. In this survey, we focus on recent ML-based image compression techniques. The following open research questions need to be addressed.
• Which end-to-end ML-based still image compression framework gives better compression efficiency?
• Which learned image compression frameworks give a better visual representation of the reconstructed image?
• Which compression technique saves GPU memory?
• Which compression technique has a fast response time?

TABLE I
LIST OF ACRONYMS.

Acronym | Meaning
AEs | Autoencoders
BGAN+ | Binary Generative Adversarial Network
BPG | Better Portable Graphics
bpp | Bits per pixel
CAEs | Convolutional Autoencoders
CLIC | Challenge on Learned Image Compression
CNNs | Convolutional Neural Networks
CV | Computer Vision
CPU | Central Processing Unit
DCT | Discrete Cosine Transform
DNN | Deep Neural Network
DWT | Discrete Wavelet Transform
GANs | Generative Adversarial Networks
GDN | Generalized Divisive Normalization
GMM | Gaussian Mixture Module
GPU | Graphics Processing Unit
HDR | High Dynamic Range
HEVC | High Efficiency Video Coding
IGDN | Inverse Generalized Divisive Normalization
JPEG | Joint Photographic Experts Group
JPEG XR | JPEG Extended Range
KL | Kullback-Leibler
LSTM | Long Short-Term Memory
LZW | Lempel-Ziv-Welch
ML | Machine Learning
MNIST | Modified National Institute of Standards and Technology
MSE | Mean Squared Error
MS-SSIM | Multi-scale Structural Similarity Index Measure
NL | Non-Local
NLAM | Non-Local Attention Module
NLD | Non-linear Decoder
NLE | Non-linear Encoder
NN | Neural Network
OOM | Out Of Memory
PCA | Principal Component Analysis
PNG | Portable Network Graphics
PSNR | Peak Signal-to-Noise Ratio
RD | Rate Distortion
RLE | Run Length Encoding
RNNs | Recurrent Neural Networks
SCT | Stop Code Tolerant
SRD | Striped Region Distortion
SSIM | Structural Similarity Index Measure
SWTA-AE | Stochastic Winner-Take-All Autoencoder
VAEs | Variational Autoencoders
VVC | Versatile Video Coding
This article presents a detailed survey of the use of ML algorithms for still image compression, considering these questions. As presented in Fig. 1, the rest of this paper is organized as follows. Section II presents the related work. In Section III, still image compression with conventional compression techniques is presented. The use of ML techniques for image compression is explained in Section IV. Section V gives an insight into future research directions. Finally, Section VI draws the conclusions.

II. RELATED WORK

Still image compression has been a concern for researchers for many years. Initially, image compression was performed using conventional frameworks; with the introduction of ML, algorithms began to use ML models for image compression. Several surveys have focused on image compression algorithms. In [7], the authors presented a comprehensive study of image compression models based on neural networks. This study lacked a discussion of other ML-based end-to-end image compression methods, and the survey covered only older algorithms. Shum et al. in [8] surveyed DCT and DWT-based image compression and representation algorithms. The authors in [5] and [9] conducted studies on conventional DCT and DWT-based frameworks. Similarly, the authors in [10] surveyed entropy-based image compression algorithms, and [6] and [11] surveyed DCT and DWT-based image compression techniques in 2014. Likewise, in 2017, Setyaningsih et al. [12] surveyed hybrid compression techniques. In [13], the authors surveyed predictive, entropy encoding, and discrete Fourier transform-based image compression architectures. In [14] and [15], the authors surveyed conventional image compression algorithms: [14] surveyed arithmetic entropy coding techniques, while [15] presented a survey of image compression standards such as the joint photographic experts group (JPEG) standard, JPEG-2000, portable network graphics (PNG), and WebP. Table II presents a comprehensive summary of the related surveys.

After analyzing Table II, it is evident that all the existing surveys deal with conventional compression techniques. However, there is now a plethora of learning-driven lossy image compression frameworks, and these architectures have their own limitations, pros, and cons. There is therefore a need to survey learning-driven lossy image compression models. In this survey, we provide an in-depth discussion of still lossy image compression. We present the analysis, comparison, pros, and cons of the state-of-the-art techniques used for still lossy image compression, and we outline several open research problems and trends in this field for substantial future research. The upcoming section explains still image compression.

III. STILL IMAGE COMPRESSION

The concept of image compression started in 1992 when Wallace proposed the JPEG standard for still image compression [16]. In the JPEG compression standard, the DCT is used, which can be calculated as follows [17]:

$$F(u, v) = \frac{1}{\sqrt{2N}}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, \cos\!\left(\frac{(2x+1)u\pi}{2N}\right) \cos\!\left(\frac{(2y+1)v\pi}{2N}\right), \quad (1)$$

where x is the pixel row index, for integer 0 ≤ x ≤ N − 1, y is the pixel column index, for integer 0 ≤ y ≤ N − 1, F(u, v) is the DCT coefficient at coordinates (u, v), f(x, y) is the pixel value at coordinates (x, y), and

$$C(u) = \begin{cases} \frac{1}{\sqrt{N}}, & \text{for } u = 0 \\ 1, & \text{otherwise.} \end{cases} \quad (2)$$

The basic framework of JPEG compression and decompression is shown in Fig. 2.

JPEG was widely used for still image compression until an updated and improved version of JPEG, known as JPEG-2000, was proposed in 2001 [18]. JPEG-2000 uses the DWT instead of the DCT. The basic framework of JPEG-2000 is shown in Fig. 3.
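As a concrete illustration of Eqs. (1)-(2) as reconstructed above, the following minimal NumPy sketch computes the 2D DCT of a block directly from the formula; the normalization follows the reconstructed equations, and the 8 x 8 block size is the one JPEG uses. It is an unoptimized reference, not a production codec.

```python
import numpy as np

def dct2(block):
    """Direct 2D DCT of an N x N block, following Eqs. (1)-(2)."""
    n = block.shape[0]
    u = np.arange(n)
    # C(u) = 1/sqrt(N) for u = 0 and 1 otherwise, Eq. (2)
    c = np.where(u == 0, 1.0 / np.sqrt(n), 1.0)
    # basis[u, x] = cos((2x + 1) u pi / (2N))
    basis = np.cos((2.0 * u[None, :] + 1.0) * u[:, None] * np.pi / (2.0 * n))
    # F = (1/sqrt(2N)) C(u) C(v) sum_x sum_y f(x, y) cos(...) cos(...), Eq. (1)
    return (1.0 / np.sqrt(2.0 * n)) * np.outer(c, c) * (basis @ block @ basis.T)

block = np.arange(64, dtype=np.float64).reshape(8, 8)  # an 8 x 8 block, as in JPEG
coeffs = dct2(block)
print(coeffs[0, 0])  # the DC coefficient
```

In JPEG, these coefficients are then quantized and entropy coded; the energy compaction of the DCT is what makes aggressive quantization of the high-frequency coefficients tolerable.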

Fig. 1. Organization of the paper (Introduction; Still Image Compression; Learning-Driven Image Compression: AEs, VAEs, CNNs, RNNs, GANs, PCA, Fuzzy Means Clustering; Research Gaps and Future Directions; Conclusion).

TABLE II
SUMMARY OF THE IMAGE COMPRESSION SURVEYS.

Survey | Year | AE | VAE | CNN | RNN | GAN | Contributions and limitations
[7] | 1999 | × | × | × | × | × | Survey of NN-based frameworks only
[8] | 2003 | × | × | × | × | × | Survey of DCT and DWT-based image representation and compression algorithms
[5] | 2010 | × | × | × | × | × | Survey of predictive and transform-based image compression methods
[9] | 2012 | × | × | × | × | × | Survey of entropy-based encoding, JPEG, and JPEG-2000 methods
[10] | 2013 | × | × | × | × | × | Survey of entropy-based encoding methods
[6] | 2014 | × | × | × | × | × | Survey of DCT and DWT-based architectures only
[11] | 2014 | × | × | × | × | × | Survey of JPEG and DCT-based architectures
[12] | 2017 | × | × | × | × | × | Survey of hybrid DCT and DWT-based compression techniques
[13] | 2018 | × | × | × | × | × | Survey of predictive, entropy coding, and discrete Fourier transform-based architectures
[14] | 2019 | × | × | × | × | × | Survey of arithmetic encoding-based architectures only
[15] | 2021 | × | × | × | × | × | Survey of conventional JPEG, JPEG-2000, WebP, and PNG-based architectures
This survey | 2021 | ✓ | ✓ | ✓ | ✓ | ✓ | Survey of end-to-end ML-based image compression frameworks; new outlook on the open research gaps

Fig. 2. Block diagram of JPEG compression and decompression (encoder: RGB to Y Cr Cb conversion, 8 x 8 block DCT, quantizer, zig-zag ordering, and entropy encoding using the quantization and entropy tables, producing the compressed data; decoder: entropy decoding, zig-zag re-ordering, de-quantizer, inverse DCT, and Y Cr Cb to RGB conversion).

The entropy encoding scheme used in JPEG-2000 is Huffman encoding [19], RLE [20], or arithmetic encoding [21].

The JPEG and JPEG-2000 standards have been widely used. Later on, researchers developed JPEG extended range (JPEG XR) [22] and JPEG XT [23] to overcome the limitations of the JPEG standard. However, both of these algorithms failed due to hardware incompatibility. Recently, WebP [24], better portable graphics (BPG) [25], and JPEG XL [26] were proposed for the compression of high dynamic range (HDR) images. Recently, high-efficiency video coding (HEVC) has displayed the most impressive performance. In [27], the authors explained the use of HEVC for high bit-depth medical image compression and also proposed a novel algorithm to reduce the complexity of HEVC. In [28], the authors presented a comparative analysis of all these algorithms. These conventional algorithms still have limitations, and ML has addressed several of the remaining issues, such as visual quality improvement for image compression. The next section presents learning-driven lossy image compression techniques.

IV. LEARNING-DRIVEN LOSSY IMAGE COMPRESSION

ML has a significant influence on every field of research. There has been a revolution in image processing due to the CNN's powerful feature extraction capability. Inspired by this, many ML architectures have been proposed to compress images. We categorize the architectures according to their ML models.

A. Compression with AE

The most widely used learning-driven lossy image compression model was proposed by Ballé in 2016 [29]. The backbone of this model is an AE [38]. An AE can be divided into three parts: input, bottleneck, and output. The bottleneck in a CAE is where the latent space is represented.

AEs gained popularity in image compression because the basic purpose of an AE is to reduce the dimensions of the input image. Ollivier in [39] used an AE to compress data by minimizing the code length; this method used the generative property of AEs. In [40], the authors used an AE with a Kalman filter for the compression of modified national institute of standards and technology (MNIST) images. The model failed to perform well on red-green-blue (RGB) images. Similarly, in [41], the authors used the Kodak dataset to train an AE and achieved better performance than JPEG, JPEG-2000, and WebP. In [42] and [43], the authors used AEs for the compression of images. In [44], the authors proposed an AE for image compression; they used generalized divisive normalization (GDN) and inverse generalized divisive normalization (IGDN) layers instead of rectified linear unit (ReLU) layers to speed up the training process. Cheng et al. in [45] and Alexandre et al. in [46] deployed AEs for the compression of high-resolution images.
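As a minimal sketch of the input/bottleneck/output structure described above, the following PyTorch snippet builds a small convolutional auto-encoder. The layer sizes are illustrative assumptions, not those of any surveyed model, and the GDN/IGDN layers of [44] are replaced by plain ReLUs for brevity.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Toy CAE: the 16-channel feature map after the encoder is the bottleneck."""
    def __init__(self):
        super().__init__()
        # Encoder: two stride-2 convolutions shrink a 3 x 64 x 64 image
        # to a 16 x 16 x 16 latent (the bottleneck / latent space).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 16, kernel_size=5, stride=2, padding=2),
        )
        # Decoder mirrors the encoder with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 32, kernel_size=5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):
        latent = self.encoder(x)   # representation to be quantized and entropy coded
        return self.decoder(latent)

x = torch.randn(1, 3, 64, 64)
print(ConvAE()(x).shape)  # torch.Size([1, 3, 64, 64])
```

In an actual learned codec, the bottleneck is additionally quantized and entropy coded, and the network is trained with a joint rate-distortion objective rather than reconstruction loss alone.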
B. Compression with VAE

The VAE is a version of the NN that belongs to the family of probabilistic graphical models and variational Bayesian methods. Compression with a VAE incorporates a mean and a variance for the latent space distributions. The results achieved by VAEs are better than those of simple AEs. A simple illustration of image compression using a VAE is shown in Fig. 4.

Several researchers have used VAEs for still image compression. In [47], the authors demonstrated the use of a VAE with a non-linear transform and a uniform quantizer for image compression. The challenge on learned image compression (CLIC) dataset was used for training the model, which was complex due to its massive number of training parameters. Recently, Chen et al. in [48] proposed a VAE-based architecture for high-resolution image compression. The authors introduced a non-local attention module (NLAM) to boost the training process; however, the model complexity increased significantly. Similarly, Larsen et al. in [49] proposed a VAE to compress images. In [50], the authors used a VAE to compress images and achieved a bits per pixel (bpp) of 4.10; however, the model was very complex due to its architecture.
authors used VAE to compress images and achieved bits per
parts input, bottleneck, and output. A bottleneck in CAE is
pixel (bpp) of 4.10. However, the model was very complex
the place where latent space is represented.
due to its architecture.
AEs gained popularity in image compression as the basic
purpose of an AE is to reduce the dimensions of the input
image. Ollivier et al. in [39], used AE to compress data by C. Compression with CNN
minimizing the code length. This method used the generative CNN has prime importance in image processing due to its
property of AE. In [40], the authors used AE with Kalman feature extraction characteristics. CNN is also used for image
filter for the compression of the modified national institute compression. The simple illustration of the image compression
of standards and technology (MNIST) images. The model and artifacts reduction is shown in Fig. 5.

Fig. 3. Block diagram of JPEG-2000 compression and decompression (encoder: image tiling, DWT, quantization of the wavelet coefficients, and entropy encoding into the code stream; decoder: entropy decoding, inverse quantization, inverse DWT, and reorganization of the image tiles into the reconstructed image).

Fig. 4. Image compression with VAE (1: original image; 2: encoder; 3: mean; 4: variance; 5: latent state distributions; 6: samples from the distributions; 7: decoder; 8: reconstructed image).

In [51]-[55], the authors used CNNs to compress images. These CNN-based image compression models outperformed JPEG and JPEG-2000 in terms of the structural similarity index measure (SSIM) and the peak signal-to-noise ratio (PSNR). In [56], the authors proposed a CNN-based framework for the compression of gigapixel images, using unsupervised learning to train the neural network. Similarly, in [57], the authors proposed a state-of-the-art optimized deep CNN-based technique for the compression of still images. The proposed method used an attention module to adaptively compress certain regions of the images with different numbers of bits. The approach outperformed JPEG and JPEG-2000; however, the results of Ballé [29] are better.

In [58], the authors also used a CNN for the compression of high-resolution images. They used the CLIC 2018 dataset for training and achieved a 7.81% BD-rate reduction over BPG and JPEG-2000. In [59], the authors introduced the new concept of a multi-spectral transform for multi-spectral image compression using a CNN; it achieved better compression efficiency than state-of-the-art anchors like JPEG-2000. Likewise, in [60], the authors proposed a data reduction CNN for the compression of hyperspectral images. Similarly, in [61], the authors proposed a CNN-based end-to-end compression architecture for multi-spectral image compression; they extracted spatial features by partitioning and achieved better performance in terms of PSNR than JPEG-2000. Similarly, [62]-[65] also used CNN-based frameworks.
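Since PSNR is the metric these CNN-based models are most often compared on, a small sketch of its computation (ten times the base-10 logarithm of the squared peak value over the mean squared error) may be a useful reference; the peak of 255 assumes 8-bit images.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two images with the given peak value."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
noisy = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, noisy):.2f} dB")
```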
D. Compression with RNN

The RNN can be utilized for image compression too. A simple RNN-based image compression architecture has some convolutional layers, a GDN layer, RNN modules, binarized convolutional layers, and an IGDN layer. A simple illustration of image compression using an RNN, as proposed in [66], is shown in Fig. 6. Covell et al. proposed an RNN-based compression architecture that uses stop code tolerant (SCT) training; they achieved a bpp of 0.25 and a PSNR of 27 dB on the Kodak and ImageNet datasets, respectively, while the training dataset used in the method was the CelebA dataset [67]. Likewise, in [68], an RNN-based method was proposed to compress still images. The training dataset used in this article was Kodak, and the method achieved a bpp of 0.5 and an SSIM of 0.77. The model's performance was superior to JPEG, JPEG-2000, and WebP.

Fig. 5. Image compression and artifact reduction with CNN (the input image passes through convolutional layers performing (a) feature extraction, (b) shrinking, (c) enhancement, (d) mapping, and (e) reconstruction to produce the output image).

Fig. 6. Image compression with RNN (analysis block: convolutional layer, GDN, RNN, and binarized convolutional layer applied to the original image; synthesis block: convolutional layer, RNN, IGDN, and convolutional layer producing the compressed image).

Toderici et al. in [69] demonstrated the use of an RNN with entropy encoding to compress images. They achieved a bpp of 0.5, a PSNR of 33.59 dB, an SSIM of 0.8933, and a multi-scale structural similarity index (MS-SSIM) of 0.9877. In [70], the authors proposed a DL-based end-to-end image compression model that compresses images based on semantic analysis. The authors used an RNN for the compression and achieved better results than conventional compression techniques like JPEG; a PSNR of 32 dB was obtained at a compression rate of 0.75 bpp using this method. Furthermore, [71]-[74] also used RNNs for image compression and achieved better compression ratios than anchors such as JPEG and JPEG-2000.
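RNN codecs of this family are progressive: each iteration encodes the residual left by the previous reconstructions, so more iterations mean more bits and higher quality. Below is a minimal sketch of that loop, with plain convolutions standing in for the recurrent analysis/synthesis blocks and a sign() binarizer standing in for the learned one; both stand-ins are simplifying assumptions, not the layers of [66]-[69].

```python
import torch
import torch.nn as nn

enc = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # stand-in analysis block
dec = nn.Conv2d(8, 3, kernel_size=3, padding=1)  # stand-in synthesis block

def progressive_encode(x, iterations=4):
    """Each step codes the residual of the previous step; codes accumulate."""
    residual, codes, recon = x, [], torch.zeros_like(x)
    for _ in range(iterations):
        bits = torch.sign(enc(residual))   # binarized code for this step
        codes.append(bits)
        recon = recon + dec(bits)          # additive reconstruction
        residual = x - recon               # what is still left to code
    return codes, recon

x = torch.randn(1, 3, 32, 32)
codes, recon = progressive_encode(x)
print(len(codes), recon.shape)
```

Truncating the code list at any iteration yields a valid, lower-rate bitstream, which is what makes these models variable-rate by construction.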
Fig. 7. Image compression with GAN (the input image passes through the encoder and quantizer to form the compressed representation, which the decoder turns into the decoded image).

E. Compression with GAN

GANs are algorithmic architectures that use two NNs, pitting one against the other to generate new, synthetic data instances that can pass for actual data. They consist of a generator part and a discriminator part. GANs are also utilized for image compression. A simple flow diagram of image compression using a GAN is shown in Fig. 7, where the encoder and decoder are GAN modules.

Torfason et al. in [75] illustrated the use of a GAN for the compression and classification of semantic data. Similarly, [76] proposed an extreme image compression architecture using a GAN. They used two types of modules for image compression. The first module was generative compression, where no semantic label maps were required; the second was selective generative compression, where semantic label maps were used. They provided insight towards full-resolution image compression and targeted low bit-rates, i.e., less than 0.1 bpp. In [77], the authors demonstrated the use of a unified binary GAN (BGAN+) for image compression and image retrieval. The model performed better than JPEG and JPEG-2000, and the visual quality of the reconstructed image was much improved over JPEG and JPEG-2000 at low bit-rates such as 0.15 bpp. In [78]-[83], the authors proposed GAN-based architectures for image compression and achieved high compression efficiency. The drawback of the GAN-based architectures is the cost of deployment.
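In GAN-based codecs the decoder doubles as a generator: alongside the distortion term, a discriminator is trained to tell reconstructions from originals, and the decoder is rewarded for fooling it. The sketch below shows this two-player objective in its simplest form; the toy discriminator, the MSE distortion term, and the weight lam are all illustrative assumptions, and the surveyed models differ considerably in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

disc = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.LeakyReLU(),
                     nn.Flatten(), nn.Linear(8 * 16 * 16, 1))  # toy discriminator

def generator_loss(x, x_hat, lam=0.01):
    # Distortion plus adversarial term: the decoder tries to make D(x_hat) look "real".
    adv = F.binary_cross_entropy_with_logits(disc(x_hat), torch.ones(x.size(0), 1))
    return F.mse_loss(x_hat, x) + lam * adv

def discriminator_loss(x, x_hat):
    # The discriminator learns to score originals high and reconstructions low.
    real = F.binary_cross_entropy_with_logits(disc(x), torch.ones(x.size(0), 1))
    fake = F.binary_cross_entropy_with_logits(disc(x_hat.detach()), torch.zeros(x.size(0), 1))
    return real + fake

x, x_hat = torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32)
print(generator_loss(x, x_hat).item(), discriminator_loss(x, x_hat).item())
```

The adversarial term is what lets these codecs hallucinate plausible texture at the extreme rates (below 0.1 bpp) targeted by [76], at the cost of strict pixel fidelity.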

F. Compression with PCA

PCA is also used for image compression. In [84], the authors utilized vector quantization and PCA for the compression of hyperspectral images; the technique achieved better performance than JPEG-2000. Similarly, in [85], the authors used PCA and DCT to compress hyperspectral images. They used the PSNR metric for the evaluation of image quality, and the method applies to hyperspectral images only. In [86], the authors used weighted PCA and JPEG-2000 to compress hyperspectral images. They demonstrated that the hybrid approach of weighted PCA and JPEG-2000 achieved better PSNR than JPEG and JPEG-2000. Similarly, the authors of [87] proposed PCA for lossy compression and achieved a high-quality reconstructed image. However, the method depended on the number of components, and the performance decreased with the increase of PCA components.
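As a sketch of the PCA idea behind these methods: treat each row (or, for hyperspectral data, each pixel's spectrum) as a vector, keep only the top-k principal components, and reconstruct from the truncated representation. The snippet below does this via a truncated SVD, which is equivalent to PCA on the centered data; the matrix size and choices of k are illustrative.

```python
import numpy as np

def pca_compress(img, k):
    """Keep the top-k principal components of a 2D matrix (rows as samples)."""
    mean = img.mean(axis=0)
    u, s, vt = np.linalg.svd(img - mean, full_matrices=False)
    # Store only k scores per row and k basis vectors instead of the full matrix.
    scores, basis = u[:, :k] * s[:k], vt[:k]
    return scores @ basis + mean  # reconstruction from the truncated representation

img = np.random.rand(128, 128)
for k in (4, 16, 64):
    err = np.mean((img - pca_compress(img, k)) ** 2)
    print(f"k={k:3d}  MSE={err:.5f}")  # error shrinks as more components are kept
```

The rate/quality trade-off is controlled entirely by k, which is consistent with the observation in [87] that performance depends on the number of components retained.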
G. Compression with fuzzy means clustering

Di Martino et al. in [88] proposed a fast algorithm to improve the performance of the multilevel fuzzy transform image compression method [89]. The authors demonstrated an improvement in the visual quality of the reconstructed image.

Moreover, in [90] the authors proposed a conventional technique using a novel curve-fitting hyperbolic function for compression. The main advantages of this approach were strengthening the edges of the image, removing the blocking effect, improving the SSIM, and increasing the PSNR up to 20 dB.

Tables III and IV present a detailed summary of the techniques used for image compression. Fig. 8 shows the percentage of the different ML architectures present in the literature for still image compression. It is evident that 24% of the architectures in the literature are based on AEs, while 12% are based on VAEs. The majority of the architectures, 32%, are based on CNNs.

Fig. 8. Percentage of architectures present for still image compression.
H. Comparative Analysis of State-of-the-art Techniques

This subsection presents a comparative analysis of state-of-the-art techniques in terms of compression ratio, computational time, PSNR, MS-SSIM, bpp, and RD rate. Fig. 9 shows the minimum bpp of the learning-driven lossy image compression models, while Fig. 10 and Fig. 11 show the corresponding MS-SSIM and PSNR achieved at the minimum bpp by each reference, respectively. Ayzik et al. [51] achieved a PSNR of 30 dB and an MS-SSIM of 0.925 at the minimum bpp of 0.025. The comparison of the three context models presented in [91], which are the masked context [92], causal context, and causal context plus causal global prediction models, shows that the encoding time is the same for all three, because encoding can be performed in parallel. The decoding times, however, differ: the masked context model takes less time for decoding, while the combined causal context and causal global prediction take approximately 39 seconds, as explained in [91]. The comparison of the encoding and decoding times of all three models is shown in Fig. 12. The decoding time of masked context-based CAE architectures used for still image compression is less than that of the other causal and global context models, while the encoding time for all three types of architectures is the same. The use of masked convolution [93] has also impacted the visual representation of the reconstructed image: the reconstructed image has better visual quality in [91] than with other state-of-the-art techniques. Table V presents a comparison of the performance of recent deep learning-based learned image compression models.

We also compare the run-time complexity of various benchmark learning-driven algorithms with standard codecs. Table VI shows the comprehensive comparison.

I. Impact of Context-Based Entropy Modeling

In current learned lossy image compression algorithms, the rate loss is generally the entropy of the codes. To reduce entropy and improve joint rate-distortion performance, a precise estimate of the probabilistic distribution of the codes is critical. Most learning-driven architectures have utilized context-based entropy modeling to improve the RD performance of the codecs. The architectures consider local as well as non-local (NL) contexts.

The use of global similarity among pixels was initially proposed in image denoising using NL approaches. DNN-based image processing approaches then incorporated it into DNNs to exploit global information and improve performance in various tasks. The authors in [99] investigated pixel self-similarity and presented the NL means for image denoising based on a content-weighted NL average of all pixels in the picture. Wang et al. in [100] defined the non-local operation as a uniform block, which they used in DNNs to mix local and NL data for object detection. In [101], Liu et al. suggested an NL recurrent network for image restoration that includes NL operations in a recurrent network. The authors of [102] used the NL operation to create attention masks that capture long-range dependence between pixels and pay greater attention to the challenging sections in picture restoration.

Li et al. [103] proposed a learned context to model the entropy block in an end-to-end image compression framework. The analysis as well as synthesis blocks utilized U-Net [104], [105] blocks for the encoding and decoding operations, and the NL operation was introduced in the entropy block to incorporate global similarities.
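The masked convolution [93] at the heart of these context models constrains each output position to depend only on already-decoded (causal) neighbors, which is what makes autoregressive entropy estimation well defined. Below is a minimal PixelCNN-style sketch of such a layer; the mask layout is the standard causal one and the layer is an illustration, not the exact module of [91]-[93].

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Convolution whose kernel only sees pixels above / to the left of the center."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kh, kw = self.kernel_size
        mask = torch.zeros(1, 1, kh, kw)
        mask[..., :kh // 2, :] = 1        # rows strictly above the center
        mask[..., kh // 2, :kw // 2] = 1  # same row, strictly left of the center
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Zero out the non-causal weights before every convolution.
        return nn.functional.conv2d(x, self.weight * self.mask, self.bias,
                                    self.stride, self.padding)

ctx = MaskedConv2d(1, 8, kernel_size=5, padding=2)  # causal context over a latent plane
print(ctx(torch.randn(1, 1, 16, 16)).shape)
```

Because every position's prediction uses only the causal half-plane, training can run in parallel over the whole latent map, while decoding must proceed position by position, which is exactly the serial-decoding bottleneck discussed in Section V.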

TABLE III
COMPARISON OF THE LEARNING-DRIVEN LOSSY IMAGE COMPRESSION MODELS.

Research | Technique | bpp | PSNR [dB] | SSIM | MS-SSIM | Contributions and limitations
[39] | AE | – | – | – | – | Generative property of AEs, compressing by minimizing code length; limited to a theoretical perspective
[40] | AE | – | – | – | – | Combination of a non-recurrent three-layer architecture with a Kalman filter; limited to grayscale images
[41] | AE | 0.4 | 29 | 0.83 | 0.94 | Compressive AE with MSE loss trained with a residual network; non-differentiability of the quantization noise reduced using rounding-based quantization; visual artifact problem at low bit rates
[42] | AE | 0.2 | – | – | 0.92 | Vector quantization preferred over scalar quantization; joint optimization of latent space representation learning; worse performance than BPG
[43] | AE | 0.1 | 31.5 | – | – | Global rate-distortion (RD) constraint in a stochastic winner-take-all autoencoder (SWTA-AE) framework; worse performance than JPEG-2000; changing the strides of the convolutional layers harms the NN
[44] | AE | 0.2 | 31 | – | – | GDN and IGDN layers; worse performance than HEVC intra
[45] | AE | 1.2 | 33 | – | – | Parametric ReLU activation function; uses a CAE instead of a transform and inverse transform, along with max-pooling/up-sampling layers, to produce low-dimensional feature maps; performance comparable with the Ballé model for grayscale images only
[46] | AE | 0.126 | 29.30 | – | 0.924 | Feature map coding block with Importance Net; uses skip-connection AEs trained with MSE and MS-SSIM loss; lower PSNR than BPG
[47] | VAE | 0.15 | 30.76 | – | 0.955 | Non-linear encoder (NLE) transform, uniform quantizer, non-linear decoder (NLD) transform, and post-processing module form the VAE framework of the article
[48] | VAE | 0.2 | 30 | – | 0.7768 | Non-local (NL) operations to consider local as well as global context; better performance than the Ballé model
[49] | VAE | – | – | – | – | Kullback-Leibler (KL) loss function; limited to image generation
[50] | VAE | 4.10 | – | – | – | Recurrent VAE for learning the latent space; limited to conceptual compression
[51] | CNN | 0.025 | – | – | 0.925 | Initial decoded image and side information require more complexity and storage space; cloud-based model; visual artifacts at low bpp
[52] | CNN | 0.25 | 30 | – | – | Coarse-to-fine hyperprior-based model; lacks parallel decoding
[53] | CNN | 0.0726 | 23.93 | 0.8118 | – | Loss function consisting of MSE, adversarial, and layer-wise losses
[54] | CNN | 0.519 | 33.62 | – | 0.981 | Lagrange multiplier for joint rate-distortion; considerable run-time complexity due to the attention modules
[55] | CNN | 0.2 | 31 | – | 0.7878 | Gaussian mixture module (GMM) along with JointIQ-Net to outperform the JPEG-2000, BPG, and versatile video coding (VVC) intra schemes
[56] | CNN | – | – | – | – | Two-step CNN-based framework with unsupervised training; limited to histopathology images
[57] | CNN | 0.5 | – | 0.77 | – | Better performance than JPEG and JPEG-2000 but worse than Ballé [29]

TABLE IV
COMPARISON OF THE LEARNING-DRIVEN LOSSY IMAGE COMPRESSION MODELS (CONTINUED).

Research | Technique | bpp | PSNR [dB] | SSIM | MS-SSIM | Contributions and limitations
[58] | CNN | 0.2 | – | – | 0.93 | Trained with perceptual as well as adversarial loss for the generation of sharp details; outperforms JPEG, JPEG-2000, BPG, and WebP
[59] | CNN | – | – | – | – | Multi-spectral transform for multi-spectral image compression
[60] | CNN | – | – | – | – | Introduction of a data reduction block to save GPU memory
[61] | CNN | – | – | – | – | Feature extraction by partitioning; better performance than JPEG-2000
[67] | RNN | 0.25 | 27 | – | – | RNN trained using SCT; limited to adaptive encoding
[68] | RNN | 0.5 | – | 0.77 | – | Focuses on 32 × 32 thumbnail images; better than JPEG, JPEG-2000, and WebP
[69] | RNN | 0.5 | 33.59 | 0.8933 | 0.9877 | Long short-term memory (LSTM)-based progressive encoding of thumbnail pictures (32 × 32); decent compression performance at low bit rates; pixel RNN for entropy coding
[70] | RNN | 0.75 | 32 | – | – | Semantic analysis; better performance than JPEG
[75] | GAN | 0.0983 | 28.54 | 0.85 | 0.973 | Combined training for compression and classification for semantic understanding
[76] | GAN | 0.033 | – | – | – | Full-resolution image compression; applicable only when semantic labels are available
[77] | BGAN+ | 0.15 | – | – | 0.95 | Reconstructed image without distorting the small objects present in the image at low bit-rates such as 0.15 bpp
[84] | PCA | – | – | – | – | Better performance than JPEG-2000; limited to hyperspectral data
[88], [89] | Multilevel fuzzy transform | – | – | – | – | Reconstructed image with better visual quality; limited to a maximum image resolution of 1024 × 1024
[90] | Curve fitting with hyperbolic function | – | 20 | – | – | Removed blocking effects and improved SSIM; results depend on block size and compression ratio

All the learning-driven lossy image compression models have limitations and research gaps. The upcoming section highlights the research gaps of learning-driven lossy image compression frameworks.

V. RESEARCH GAPS AND FUTURE DIRECTIONS

There are several techniques for still image compression, and all frameworks have their pros and cons. After a detailed survey, we find that there are still research gaps: ML-based architectures have problems such as SRD, parallel acceleration, and OOM.

A. SRD in Reconstructed Image

SRD is the striped region distortion in images reconstructed using ML-based architectures. SRD in the reconstructed image is one of the open research problems in ML-based image compression techniques. There is no technique available in the literature that addresses this problem, so it needs to be solved in future research.

B. Standardization of Architecture

The second research gap is the standardization of the end-to-end compression architecture, which is very challenging for researchers. The deep learning architectures present in the literature work well only when both training and testing are performed on a GPU. Similarly, when both training and testing are done on a CPU, the results are good. However, if training is performed on the GPU and testing is done on the CPU, or vice versa, the image cannot be reconstructed correctly. There is a requirement for a standard generalized model that gives an optimal solution to this problem.

Fig. 9. Minimum bpp achieved by different models.

Fig. 10. MS-SSIM achieved at minimum bpp by different models.

C. Parallel Acceleration

The third research gap is that, due to serial decoding, parallel acceleration of the autoregressive entropy model is impossible. So, there is a requirement for a framework to address the problem of parallel acceleration. In [106], the authors proposed a learned block-based image compression framework to address this problem. However, the framework suffers from block artifacts at meager bit rates, so this is still an open research problem.

D. OOM Problem

The fourth research gap is that, given limited GPU resources, full-resolution inference frequently produces OOM problems, especially for high-resolution pictures. Block partitioning is a good option for dealing with the concerns mentioned above, but it introduces new challenges in terms of decreasing duplication between blocks and removing block effects.

E. Aliasing Effect in Reconstructed Image

The fifth research gap is the aliasing effect in images reconstructed by learning-driven lossy image compression frameworks. The reconstructed images of CNN and CAE-based architectures have variations in the directionality of the patterns, which is called aliasing. This problem needs to be addressed.

Fig. 11. PSNR achieved at minimum bpp by different models.

TABLE V
SUMMARY OF THE PERFORMANCE OF THE END-TO-END IMAGE COMPRESSION FRAMEWORKS.

Research | Model | Minimum bpp | PSNR (dB) | MS-SSIM
[41] | AE | 0.4 | 29 | 0.94
[42] | AE | 0.2 | – | 0.92
[47] | VAE | 0.15 | 30.76 | 0.955
[48] | VAE | 0.2 | 30 | 0.7768
[48] | VAE (MSE loss function) | 0.1276 | 34.63 | 0.9738
[48] | VAE (MS-SSIM loss function) | 0.1074 | 32.54 | 0.9759
[53] | CNN | 0.0726 | 23.93 | 0.8118
[54] | CNN | 0.519 | 33.62 | 0.981
[55] | CNN | 0.2 | 31 | 0.7878
[69] | RNN | 0.5 | 33.59 | 0.9877
[75] | GAN | 0.0983 | 28.54 | 0.973
[94] | Hyperprior-based AE | 0.304 | 34.4 | –
[95] | Autoregressive model | 0.01921 | 18.95 | –
[96] | Invertible NN (MSE loss function) | 0.130 | 31.51 | 0.972
[96] | Invertible NN (MS-SSIM loss function) | 0.124 | 28.01 | 0.978
[97] | Conditional autoencoder | 0.1697 | 32.2332 | 0.9602
[98] | Asymmetric gained VAE (MSE loss function) | 0.107 | 30.68 | 0.93
[98] | Asymmetric gained VAE (MS-SSIM loss function) | 0.109 | 27.76 | 0.94

Fig. 12. Comparison of encoding and decoding time of masked context, causal context, and causal context plus causal global prediction (encoding takes approximately 3.4-3.5 s for all three models; decoding takes 6.7 s for the masked context, 7.9 s for the causal context, and 38.7 s for the causal context plus causal global prediction).

TABLE VI
RUN-TIME COMPLEXITY OF VARIOUS LEARNING-DRIVEN ALGORITHMS WITH STANDARD CODECS.

Codecs | Encoding Time (ms) | Decoding Time (ms)
JPEG | 18.600 | 13.000
JPEG-2000 | 367.400 | 80.400
WebP | 67.000 | 83.700
[29] | 242.120 | 338.090
[30] | 64.700 | 12.100
[31] | 8.600 | 9.900
[32] | 3.500 | 4.000
[33] | 75.120 | 73.230
[34] | 79.500 | 17.400
[35] | 42.000 | 32.000
[36] | 74.000 | 984.000
[37] | 24.000 | 32.000
[69] | 1606.900 | 1079.300

These are five major issues in still image compression that can contribute to future research. The upcoming section concludes the survey.

VI. CONCLUSION

In the modern era, where the visual quality of images plays a vital role, images contain three types of information: helpful, redundant, and useless. On the other hand, more memory is required to store and more bandwidth is required to transmit high-resolution images. To resolve the problem of memory storage, we perform image compression. Initially, image compression was based on conventional transform and entropy encoding techniques such as JPEG and JPEG-2000. However, since the evolution of ML, many learned image compression models have been proposed in the literature. This paper surveyed learned image compression techniques based on AEs, VAEs, CNNs, RNNs, GANs, PCA, and fuzzy means clustering. The majority of the architectures are based on CNNs and AEs. These models achieve better compression efficiency than JPEG-2000, which is an anchor in image compression. We also highlighted five significant research gaps in ML-based image compression models: SRD, architecture standardization, parallel acceleration, aliasing, and OOM. These problems are yet to be addressed in still image compression.

REFERENCES

[1] U. Jayasankar, V. Thirumal and D. Ponnurangam, “A survey on data compression techniques: From the perspective of data quality, coding schemes, data type and applications,” Journal of King Saud University - Computer and Information Sciences, vol. 33, no. 2, pp. 119–140, Feb. 2021.
[2] S. Chen, S. Zhang, X. Zheng and X. Ruan, “Layered adaptive compression design for efficient data collection in industrial wireless sensor networks,” Journal of Network and Computer Applications, vol. 129, pp. 37–45, Mar. 2019.
[3] M. J. Piran, S. M. Riasulislam, Q. V. Pham, D. Y. Suh and Z. Han, “Multimedia Communication over Cognitive Radio Networks from QoS/QoE Perspective: A Comprehensive Survey,” Journal of Network and Computer Applications, vol. 172, pp. 1–44, Dec. 2020.
[4] O. Gupta and R. Raskar, “Distributed learning of deep neural network over multiple agents,” Journal of Network and Computer Applications, vol. 116, pp. 1–8, Aug. 2018.
[5] J. Shukla, M. Alwani and A. K. Tiwari, “A survey on lossless image compression methods,” in Proc. of 2010 2nd International Conference on Computer Engineering and Technology, vol. 6, pp. V6-136, 2010.
[6] M. Rehman, M. Sharif and M. Raza, “Image compression: A survey,” Research Journal of Applied Sciences, Engineering and Technology, vol. 7, no. 4, pp. 656–672, Jul. 2014.
[7] J. Jiang, “Image compression with neural networks - A survey,” Signal Processing: Image Communication, vol. 14, no. 9, pp. 737–760, Jul. 1999.
[8] H.-Y. Shum, S. B. Kang and S.-C. Chan, “Survey of image-based representations and compression techniques,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 11, pp. 1020–1037, Nov. 2003.
[9] J. Vrindavanam, S. Chandran and G. K. Mahanti, “A survey of image compression methods,” in Proc. of International Conference and Workshop on Emerging Trends in Technology, pp. 12–17, 2012.
[10] G. Vijayvargiya, S. Silakari and R. Pandey, “A survey: Various techniques of image compression,” International Journal of Computer Science and Information Security (IJCSIS), vol. 11, no. 10, Oct. 2013.
[11] A. M. Raid, W. M. Khedr, M. A. El-Dosuky and W. Ahmed, “JPEG image compression using discrete cosine transform - A survey,” International Journal of Computer Science & Engineering Survey (IJCSES), vol. 5, no. 2, Apr. 2014.
[12] E. Setyaningsih and A. Harjoko, “Survey of hybrid image compression techniques,” International Journal of Electrical and Computer Engineering, vol. 7, no. 4, pp. 2206, Aug. 2017.
[13] A. J. Hussain, A. Al-Fayadh and N. Radi, “Image compression techniques: A survey in lossless and lossy algorithms,” Neurocomputing, vol. 300, pp. 44–69, Jul. 2018.
[14] M. Rahman and M. Hamada, “Lossless image compression techniques: A state-of-the-art survey,” Symmetry, vol. 11, no. 10, pp. 1274, Oct. 2019.
[15] M. Rahman, M. Hamada and J. Shin, “The Impact of State-of-the-Art Techniques for Lossless Still Image Compression,” Electronics, vol. 10, no. 3, pp. 360, Feb. 2021.
[16] G. K. Wallace, “The JPEG still picture compression standard,” IEEE Transactions on Consumer Electronics, vol. 38, no. 1, pp. xviii–xxxiv, Feb. 1992.
[17] E. Y. Lam and J. W. Goodman, “A mathematical analysis of the DCT coefficient distributions for images,” IEEE Transactions on Image Processing, vol. 9, no. 10, pp. 1661–1666, Oct. 2000.
[18] A. Skodras, C. Christopoulos and T. Ebrahimi, “The JPEG 2000 still image compression standard,” IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 36–58, Sep. 2001.
[19] H. Kasban and S. Hashima, “Adaptive Radiographic Image Compression Technique using Hierarchical Vector Quantization and Huffman Encoding,” Journal of Ambient Intelligence and Humanized Computing, vol. 10, pp. 2855–2867, Jul. 2019.
[20] Y. Qin, Z. Wang, H. Wang and Q. Gong, “Binary image encryption in a joint transform correlator scheme by aid of run-length encoding and QR code,” Optics & Laser Technology, vol. 103, pp. 93–98, Jul. 2018.

[21] L. Xiang, Y. Li, W. Hao, P. Yang and X. Shen, “Reversible natural language watermarking using synonym substitution and arithmetic coding,” Computers, Materials & Continua, vol. 55, no. 3, pp. 541–559, Jun. 2018.
[22] F. Dufaux, G. J. Sullivan and T. Ebrahimi, “The JPEG XR image coding standard [Standards in a Nutshell],” IEEE Signal Processing Magazine, vol. 26, no. 6, pp. 195–204, Nov. 2009.
[23] A. Artusi, R. K. Mantiuk, T. Richter et al., “Overview and evaluation of the JPEG XT HDR image compression standard,” Journal of Real-Time Image Processing, vol. 16, pp. 413–428, Apr. 2019.
[24] G. Ginesu, M. Pintus and D. D. Giusto, “Objective assessment of the WebP image coding algorithm,” Signal Processing: Image Communication, vol. 27, no. 8, pp. 867–874, Sep. 2012.
[25] S. P. Mohanty, E. Kougianos and P. Guturu, “SBPG: Secure Better Portable Graphics for Trustworthy Media Communications in the IoT,” IEEE Access, vol. 6, pp. 5939–5953, Jan. 2018.
[26] J. Cho, O.-J. Kwon and S. Choi, “Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure,” IEEE Access, vol. 9, pp. 113213–113225, Aug. 2021.
[27] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano and V. Adzic, “High Bit-Depth Medical Image Compression With HEVC,” IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, Mar. 2018.
[28] T. H. Mandeel, M. Imran Ahmad, N. A. A. Khalid and M. N. Md Isa, “A Comparative Study on Lossless Compression Mode in WebP, Better Portable Graphics (BPG), and JPEG XL Image Compression Algorithms,” in Proc. of 2021 8th International Conference on Computer and Communication Engineering (ICCCE), pp. 17–22, 2021.
[29] J. Ballé, V. Laparra and E. P. Simoncelli, “End-to-end optimized image compression,” arXiv preprint arXiv:1611.01704, 2016.
[30] J. Ballé, D. Minnen, S. Singh, S. J. Hwang and N. Johnston, “Variational image compression with a scale hyperprior,” in Proc. of International Conference on Learning Representations, 2018.
[31] O. Rippel and L. Bourdev, “Real-Time Adaptive Image Compression,” in Proc. of the 34th International Conference on Machine Learning, vol. 70, pp. 2922–2930, 2017.
[32] D. Mishra, S. K. Singh and R. K. Singh, “Wavelet-based deep auto encoder-decoder (WDAED)-based image compression,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 4, pp. 1452–1462, Apr. 2021.
[33] C. Cai, L. Chen, X. Zhang and Z. Gao, “End-to-End Optimized ROI Image Compression,” IEEE Transactions on Image Processing, vol. 29, pp. 3442–3457, 2020.
[34] C. Cai, L. Chen, X. Zhang and Z. Gao, “Efficient Variable Rate Image Compression with Multi-scale Decomposition Network,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 12, pp. 3687–3700, Dec. 2019.
[35] A. K. Ashok and N. Palani, “Autoencoders with Variable Sized Latent Vector for Image Compression,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018.
[36] M. Li, K. Ma, J. You, D. Zhang and W. Zuo, “Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression,” IEEE Transactions on Image Processing, vol. 29, pp. 5900–5911, Apr. 2020.
[37] M. Li, W. Zuo, S. Gu, J. You and D. Zhang, “Learning Content-Weighted Deep Image Compression,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3446–3461, Oct. 2021.
[38] V. Alves de Oliveira, M. Chabert, T. Oberlin, C. Poulliat, M. Bruno, C. Latry, M. Carlavan, S. Henrot, F. Falzon and R. Camarero, “Reduced-Complexity End-to-End Variational Autoencoder for on Board Satellite Image Compression,” Remote Sensing, vol. 13, no. 3, pp. 447, Jan. 2021.
[39] Y. Ollivier, “Auto-encoders: reconstruction versus compression,” CoRR, 2014.
[40] A. Sento, “Image Compression with Auto-encoder Algorithm using Deep Neural Network (DNN),” in Proc. of 2016 Management and Innovation Technology International Conference (MITicon), IEEE, pp. MIT-99, 2016.
[41] L. Theis, W. Shi, A. Cunningham and F. Huszar, “Lossy image compression with compressive autoencoders,” in Proc. of International Conference on Learning Representations, 2017.
[42] E. Agustsson, F. Mentzer, M. Tschannen, L. Cavigelli, R. Timofte, L. Benini and L. V. Gool, “Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations,” in Proc. of Advances in Neural Information Processing Systems, pp. 1141–1151, 2017.
[43] T. Dumas, A. Roumy and C. Guillemot, “Image compression with Stochastic Winner-Take-All Auto-Encoder,” in Proc. of 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1512–1516, 2017.
[44] T. Dumas, A. Roumy and C. Guillemot, “Autoencoder Based Image Compression: Can the Learning be Quantization Independent?” in Proc. of 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1188–1192, 2018.
[45] Z. Cheng, H. Sun, M. Takeuchi and J. Katto, “Deep Convolutional AutoEncoder-based Lossy Image Compression,” in Proc. of 2018 Picture Coding Symposium (PCS), IEEE, pp. 253–257, 2018.
[46] D. Alexandre, C.-P. Chang, W.-H. Peng and H.-M. Hang, “An autoencoder-based learned image compressor: Description of challenge proposal by NCTU,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2539–2542, 2018.
[47] L. Zhou, C. Cai, Y. Gao, S. Su and J. Wu, “Variational Autoencoder for Low Bit-rate Image Compression,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
[48] T. Chen, H. Liu, Z. Ma, Q. Shen, X. Cao and Y. Wang, “End-to-End Learnt Image Compression via Non-Local Attention Optimization and Improved Context Modeling,” IEEE Transactions on Image Processing, vol. 30, pp. 3179–3191, Feb. 2021.
[49] A. B. L. Larsen, S. K. Sønderby, H. Larochelle and O. Winther, “Autoencoding beyond pixels using a learned similarity metric,” in Proc. of the 33rd International Conference on Machine Learning (ICML'16), vol. 48, pp. 1558–1566, 2016.
[50] K. Gregor, F. Besse, D. J. Rezende, I. Danihelka and D. Wierstra, “Towards Conceptual Compression,” in Proc. of Advances in Neural Information Processing Systems, pp. 3549–3557, 2016.
[51] S. Ayzik and S. Avidan, “Deep image compression using decoder side information,” in Proc. of Computer Vision - ECCV 2020: 16th European Conference, Glasgow, UK, pp. 699–714, 2020.
[52] Y. Hu, W. Yang, Z. Ma and J. Liu, “Learning end-to-end lossy image compression: A benchmark,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Mar. 2021.
[53] S. K. Raman, A. Ramesh, V. Naganoor, S. Dash, G. Kumaravelu and H. Lee, “CompressNet: Generative Compression at Extremely Low Bitrates,” in Proc. of the IEEE Winter Conference on Applications of Computer Vision, pp. 2325–2333, 2020.
[54] Z. Cheng, H. Sun, M. Takeuchi and J. Katto, “Learned image compression with discretized gaussian mixture likelihoods and attention modules,” in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7939–7948, 2020.
[55] J. Lee, S. Cho and M. Kim, “An end-to-end joint learning scheme of image compression and quality enhancement with improved entropy minimization,” arXiv preprint arXiv:1912.12817, 2020.
[56] D. Tellez, G. Litjens, J. van der Laak and F. Ciompi, “Neural Image Compression for Gigapixel Histopathology Image Analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 2, pp. 567–578, Feb. 2021.
[57] W. Li, W. Sun, Y. Zhao, Z. Yuan and Y. Liu, “Deep Image Compression with Residual Learning,” Applied Sciences, vol. 10, no. 11, pp. 4023, Jun. 2020.
[58] H. Liu, T. Chen, Q. Shen, T. Yue and Z. Ma, “Deep Image Compression via End-to-End Learning,” in Proc. of Computer Vision and Pattern Recognition, 2018.
[59] J. Li and Z. Liu, “Multispectral transforms using convolution neural networks for remote sensing multispectral image compression,” Remote Sensing, vol. 11, no. 7, pp. 759, Mar. 2019.
[60] T. M. Zeegers, D. M. Pelt, T. van Leeuwen, R. van Liere and K. J. Batenburg, “Task-Driven Learned Hyperspectral Data Reduction Using End-to-End Supervised Deep Learning,” Journal of Imaging, vol. 6, no. 12, pp. 132, Dec. 2020.
[61] F. Kong, K. Hu, Y. Li, D. Li and D. Zhao, “Spectral-Spatial Feature Partitioned Extraction Based on CNN for Multispectral Image Compression,” Remote Sensing, vol. 13, no. 1, pp. 9, Dec. 2020.
[62] J. Cai, Z. Cao and L. Zhang, “Learning a single tucker decomposition network for lossy image compression with multiple bits-per-pixel rates,” IEEE Transactions on Image Processing, vol. 29, pp. 3612–3625, Jan. 2020.
[63] A. Prakash, N. Moran, S. Garber, A. DiLillo and J. Storer, “Semantic Perceptual Image Compression using Deep Convolution Networks,” in Proc. of 2017 Data Compression Conference (DCC), pp. 250–259, 2017.
[64] P. Akyazi and T. Ebrahimi, “Learning-based image compression using convolutional autoencoder and wavelet decomposition,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[65] Y. Xue and J. Su, “Attention Based Image Compression Post-Processing Convolutional Neural Network,” in Proc. of CVPR Workshops, 2019.
[66] K. Islam, L. M. Dang, S. Lee and H. Moon, “Image Compression With Recurrent Neural Network and Generalized Divisive Normalization,” in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 1875–1879, 2021.

[67] M. Covell, N. Johnston, D. Minnen, S. Jin Hwang, J. Shor, S. Singh, D. Vincent and G. Toderici, “Target-Quality Image Compression with Recurrent, Convolutional Neural Networks,” arXiv, abs/1705.06687, 2017.
[68] G. Toderici, S. M. O'Malley, S. J. Hwang, D. Vincent, D. Minnen, S. Baluja, M. Covell and R. Sukthankar, “Variable Rate Image Compression with Recurrent Neural Networks,” CoRR, vol. abs/1511.06085, 2016.
[69] G. Toderici, D. Vincent, N. Johnston, S. Jin Hwang, D. Minnen, J. Shor and M. Covell, “Full Resolution Image Compression with Recurrent Neural Networks,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5306–5314, 2017.
[70] C. Wang, Y. Han and W. Wang, “An End-to-End Deep Learning Image Compression Framework Based on Semantic Analysis,” Applied Sciences, vol. 9, no. 17, pp. 3580, Sep. 2019.
[71] A. Punnappurath and M. S. Brown, “Learning raw image reconstruction-aware deep image compressors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 4, pp. 1013–1019, Apr. 2020.
[72] N. Johnston, D. Vincent, D. Minnen, M. Covell, S. Singh, T. Chinen, S. J. Hwang, J. Shor and G. Toderici, “Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4385–4393, 2018.
[73] Y. Hu, W. Yang, M. Li and J. Liu, “Progressive Spatial Recurrent Neural Network for Intra Prediction,” IEEE Transactions on Multimedia, vol. 21, no. 12, pp. 3024–3037, Dec. 2019.
[74] A. G. Ororbia, A. Mali, J. Wu, S. O'Connell, W. Dreese, D. Miller and C. L. Giles, “Learned Neural Iterative Decoding for Lossy Image Compression Systems,” in Proc. of 2019 Data Compression Conference (DCC), pp. 3–12, 2019.
[75] R. Torfason, F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte and L. V. Gool, “Towards Image Understanding from Deep Compression Without Decoding,” in Proc. of International Conference on Learning Representations, 2018.
[76] E. Agustsson, M. Tschannen, F. Mentzer, R. Timofte and L. V. Gool, “Generative adversarial networks for extreme learned image compression,” in Proc. of the IEEE/CVF International Conference on Computer Vision, pp. 221–231, 2019.
[77] J. Song, T. He, L. Gao, X. Xu, A. Hanjalic and T. H. Shen, “Unified Binary Generative Adversarial Network for Image Retrieval and Compression,” International Journal of Computer Vision, vol. 128, no. 8, pp. 2243–2264, Feb. 2020.
[78] X. Zhang and X. Wu, “Near-lossless L-infinity constrained Multi-rate Image Decompression via Deep Neural Network,” CoRR, 2018.
[79] B. Kang, S. Tripathi and T. Q. Nguyen, “Toward Joint Image Generation and Compression using Generative Adversarial Networks,” arXiv preprint arXiv:1901.07838, 2019.
[80] L. Wu, K. Huang and H. Shen, “A GAN-based tunable image compression system,” in Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2334–2342, 2020.
[81] D. J. Im, C. D. Kim, H. Jiang and R. Memisevic, “Generating images with recurrent adversarial networks,” CoRR, vol. abs/1602.05110, 2016.
[82] E. M. Tolunay and A. Ghalayini, “Generative Neural Network Based Image Compression,” 2018.
[83] L. Galteri, L. Seidenari, M. Bertini and A. Del Bimbo, “Deep Generative Adversarial Compression Artifact Removal,” in Proc. of the IEEE International Conference on Computer Vision, pp. 4826–4835, 2017.
[84] D. Báscones, C. González and D. Mozos, “Hyperspectral image compression using vector quantization, PCA and JPEG2000,” Remote Sensing, vol. 10, no. 6, pp. 907, Jun. 2018.
[85] R. J. Yadav and M. S. Nagmode, “Compression of hyperspectral image using PCA-DCT technology,” in Proc. of Innovations in Electronics and Communication Engineering, pp. 269–277, 2018.
[86] A. C. Karaca and M. K. Güllü, “Target preserving hyperspectral image compression using weighted PCA and JPEG2000,” in Proc. of International Conference on Image and Signal Processing, pp. 508–516, 2018.
[87] A. H. Abbas, A. Arab and J. Harbi, “Image compression using principal component analysis,” Mustansiriyah Journal of Science, vol. 29, no. 2, Jan. 2018.
[88] F. Di Martino, I. Perfilieva and S. Sessa, “A Fast Multilevel Fuzzy Transform Image Compression Method,” Axioms, vol. 8, no. 4, pp. 135, Dec. 2019.
[89] F. Di Martino and S. Sessa, “Multi-level fuzzy transforms image compression,” Journal of Ambient Intelligence and Humanized Computing, vol. 10, no. 7, pp. 2745–2756, Jul. 2019.
[90] W. Khalaf, D. Zaghar and N. Hashim, “Enhancement of Curve-Fitting Image Compression Using Hyperbolic Function,” Symmetry, vol. 11, pp. 291, Feb. 2019.
[91] Z. Guo, Z. Zhang, R. Feng and Z. Chen, “Causal Contextual Prediction for Learned Image Compression,” IEEE Transactions on Circuits and Systems for Video Technology, Jun. 2021, doi: 10.1109/TCSVT.2021.3089491.
[92] D. Minnen, J. Ballé and G. D. Toderici, “Joint autoregressive and hierarchical priors for learned image compression,” in Proc. of Advances in Neural Information Processing Systems, pp. 10771–10780, 2018.
[93] X. He, Q. Hu, X. Zhang, C. Zhang, W. Lin and X. Han, “Enhancing HEVC Compressed Videos with a Partition-Masked Convolutional Neural Network,” in Proc. of 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 216–220, 2018.
[94] Y. Hu, W. Yang and J. Liu, “Coarse-to-fine hyper-prior modeling for learned image compression,” in Proc. of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 11013–11020, 2020.
[95] D. Minnen and S. Singh, “Channel-Wise Autoregressive Entropy Models for Learned Image Compression,” in Proc. of 2020 IEEE International Conference on Image Processing (ICIP), pp. 3339–3343, 2020.
[96] Y. Xie, K. L. Cheng and Q. Chen, “Enhanced invertible encoding for learned image compression,” in Proc. of the 29th ACM International Conference on Multimedia, pp. 162–170, 2021.
[97] Y. Choi, M. El-Khamy and J. Lee, “Variable rate deep image compression with a conditional autoencoder,” in Proc. of the IEEE/CVF International Conference on Computer Vision, pp. 3146–3154, 2019.
[98] Z. Cui, J. Wang, S. Gao, T. Guo, Y. Feng and B. Bai, “Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation,” in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10532–10541, 2021.
[99] A. Buades, B. Coll and J.-M. Morel, “A non-local algorithm for image denoising,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, IEEE, pp. 60–65, 2005.
[100] X. Wang, R. Girshick, A. Gupta and K. He, “Non-local neural networks,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803, 2018.
[101] D. Liu, B. Wen, Y. Fan, C. C. Loy and T. S. Huang, “Non-local recurrent network for image restoration,” in Proc. of Advances in Neural Information Processing Systems, pp. 1673–1682, Dec. 2018.
[102] Y. Zhang, K. Li, K. Li, B. Zhong and Y. Fu, “Residual non-local attention networks for image restoration,” in Proc. of International Conference on Learning Representations, 2019.
[103] M. Li, K. Zhang, J. Li, W. Zuo, R. Timofte and D. Zhang, “Learning Context-Based Nonlocal Entropy Modeling for Image Compression,” IEEE Transactions on Neural Networks and Learning Systems, Aug. 2021.
[104] O. Ronneberger, P. Fischer and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proc. of International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 234–241, 2015.
[105] X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu and P.-A. Heng, “H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes,” IEEE Transactions on Medical Imaging, vol. 37, no. 12, pp. 2663–2674, Dec. 2018.
[106] Y. Wu, X. Li, Z. Zhang, X. Jin and Z. Chen, “Learned Block-based Hybrid Image Compression,” IEEE Transactions on Circuits and Systems for Video Technology, Oct. 2021.
