Learning-Driven Lossy Image Compression: A Comprehensive Survey
Abstract—In the realm of image processing and computer vision (CV), machine learning (ML) architectures are widely applied. Convolutional neural networks (CNNs) solve a wide range of image processing issues and can also address the image compression problem. Compression of images is necessary due to bandwidth and memory constraints. Useful, redundant, and irrelevant information are three different forms of information found in images. This paper surveys recent (mostly lossy) image compression techniques using ML architectures, including different auto-encoders (AEs) such as convolutional auto-encoders (CAEs), variational auto-encoders (VAEs), and AEs with hyper-prior models, as well as recurrent neural networks (RNNs), CNNs, generative adversarial networks (GANs), principal component analysis (PCA), and fuzzy means clustering. We divide all of the algorithms into several groups based on architecture, and we cover still image compression in this survey. Various findings are emphasized, and possible future directions for researchers are discussed. Open research problems such as out of memory (OOM), striped region distortion (SRD), aliasing, and the simultaneous compatibility of frameworks with the central processing unit (CPU) and graphics processing unit (GPU) are explained. The majority of the publications surveyed in the compression domain are from the previous five years and use a variety of approaches.

Index Terms—learned image compression, deep learning, JPEG, end-to-end image compression, machine learning.

I. INTRODUCTION

In this modern era of big data, the size of the data is the biggest concern for scientists and researchers. Due to the limited bandwidth of the channel and limited memory space, there is a requirement for data compression for the successful transmission and storage of data without losing significant information [1]. Data compression can be performed in several ways: audio compression, image compression, video compression, and document compression [2]. There are three types of information in an image: useful, redundant, and irrelevant. Irrelevant information can be ignored for the compression of images. Redundant information is crucial for highlighting details in images, whereas useful information is neither redundant nor irrelevant. Without accurate information, we cannot reconstruct or decompress images properly [3]. In image compression, there are two major categories. The first is lossless image compression, where no information is lost, and the second is lossy image compression. Lossless image compression techniques are very efficient for small-size data. Lossless techniques such as Huffman coding, run-length encoding (RLE), arithmetic coding, Lempel-Ziv-Welch (LZW) coding, and JPEG-LS are efficient for small data [14]-[15]. The major drawback of lossless compression techniques is their lower compression efficiency compared to lossy compression techniques. That is why many researchers are working on image compression using ML.

There are many surveys focused on image compression. Many surveys address the pros and cons of conventional image compression algorithms based on the discrete cosine transform (DCT) and discrete wavelet transform (DWT). Shukla et al. in [5] surveyed prediction- and transform-based image compression algorithms. This survey highlighted the importance and application of prediction- and transform-based conventional algorithms for image compression. The prediction-based algorithms were based on edge detection, gradient, and block-based prediction, whereas the transform-based algorithms were based on wavelets. The survey presented a comparative analysis of entropy coding; however, it lacked a discussion of end-to-end image compression architectures using ML. In [6], the authors studied DCT- and DWT-based algorithms; that survey did not address recent learned image compression techniques. Likewise, in [13], the authors surveyed several lossy and lossless algorithms for image compression. The article highlighted the advantages and drawbacks of predictive, entropy coding, and discrete Fourier transform-based image compression frameworks, but it did not address ML-based image compression architectures. Several surveys highlight the importance of the conventional DCT, DWT, and entropy-based approaches for image compression. There is a requirement to survey ML-based still image compression techniques.

A plethora of techniques have been proposed by researchers in recent years to address the problem of still image compression using ML techniques [4]. So, there is a need to survey ML-based image compression techniques and highlight the open research problems. In this survey, we focus on recent ML-based image compression techniques. The following open research questions need to be addressed.
• Which end-to-end still image compression technique using ML gives better compression efficiency?
• Which learned image compression frameworks give a better visual representation of the reconstructed image?
• Which compression technique saves GPU memory?
• Which compression technique has a fast response time?

S. Jamil is with the Department of Electronics Engineering, Sejong University, Seoul, South Korea (e-mail: [email protected]).
M. J. Piran is with the Department of Computer Engineering, Sejong University, Seoul, South Korea (e-mail: [email protected]).
M. Rahman is with the Department of Electrical Engineering, Polytechnique Montreal, Montreal, QC H3T 1J4, Canada (e-mail: [email protected]).
Manuscript received Month xx, xxxx; revised Month xx, xxxx.
Fig. 1. Organization of the survey: introduction; still image compression; learning-driven image compression (AEs, VAEs, CNNs, RNNs, GANs, PCA, and fuzzy means clustering); research gaps and future directions; conclusion.
TABLE II
SUMMARY OF THE IMAGE COMPRESSION SURVEYS.
Fig. 2. JPEG encoder and decoder: the encoder converts RGB to Y Cr Cb, splits the image into 8×8 blocks, applies the DCT, quantizes the coefficients using a quantization table, performs a zig-zag scan, and entropy-encodes the result into the compressed data; the decoder inverts these steps using the entropy and quantization tables.
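As a minimal, illustrative Python sketch of the transform-coding steps in Fig. 2 (the quantization table below is the standard JPEG luminance table; zig-zag scanning and entropy coding are omitted, and the image sides are assumed to be multiples of 8):

    import numpy as np
    from scipy.fftpack import dct, idct

    # Standard JPEG luminance quantization table (quality ~50).
    Q = np.array([[16, 11, 10, 16,  24,  40,  51,  61],
                  [12, 12, 14, 19,  26,  58,  60,  55],
                  [14, 13, 16, 24,  40,  57,  69,  56],
                  [14, 17, 22, 29,  51,  87,  80,  62],
                  [18, 22, 37, 56,  68, 109, 103,  77],
                  [24, 35, 55, 64,  81, 104, 113,  92],
                  [49, 64, 78, 87, 103, 121, 120, 101],
                  [72, 92, 95, 98, 112, 100, 103,  99]], dtype=np.float64)

    def dct2(b):   # 2-D DCT with orthonormal scaling
        return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

    def idct2(b):  # 2-D inverse DCT
        return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

    def jpeg_like_roundtrip(img):
        """Quantize each 8x8 block in the DCT domain and reconstruct.
        `img` is a grayscale uint8 array whose sides are multiples of 8."""
        out = np.empty(img.shape, dtype=np.float64)
        for y in range(0, img.shape[0], 8):
            for x in range(0, img.shape[1], 8):
                block = img[y:y+8, x:x+8].astype(np.float64) - 128.0  # level shift
                coeffs = np.round(dct2(block) / Q)      # the lossy step
                out[y:y+8, x:x+8] = idct2(coeffs * Q) + 128.0
        return np.clip(out, 0, 255).astype(np.uint8)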
The entropy encoding scheme used in JPEG-2000 is Huffman encoding [19], RLE [20], or arithmetic encoding [21]. The JPEG and JPEG-2000 standards have been widely used. Later on, researchers developed JPEG extended range (JPEG XR) [22] and JPEG XT [23] to overcome the limitations of the JPEG standard. However, both of these algorithms failed due to hardware incompatibility. Recently, WebP [24], better portable graphics (BPG) [25], and JPEG XL [26] were proposed for the compression of high dynamic range (HDR) images. More recently, high-efficiency video coding (HEVC) has displayed the most impressive performance. In [27], the authors explained the use of HEVC for high bit-depth medical image compression and also proposed a novel algorithm to reduce the complexity of HEVC. In [28], the authors presented a comparative analysis of all these algorithms. These conventional algorithms still have limitations. ML has addressed several issues, such as visual quality improvement, for image compression. The next section presents learning-driven lossy image compression techniques.
IV. LEARNING-DRIVEN LOSSY IMAGE COMPRESSION

ML has a significant influence on every field of research. There has been a revolution in image processing due to CNNs' compelling feature-extraction property. Inspired by this, many ML architectures have been proposed to compress images. We categorize the architectures according to their ML models.

A. Compression with AE

The most widely used learning-driven lossy image compression model was proposed by Ballé in 2016 [29]. The backbone of this model is the AE [38]. An AE can be divided into three parts: input, bottleneck, and output. The bottleneck of a CAE is where the latent space is represented.

AEs gained popularity in image compression because the basic purpose of an AE is to reduce the dimensions of the input image. Ollivier in [39] used an AE to compress data by minimizing the code length; this method used the generative property of the AE. In [40], the authors used an AE with a Kalman filter for the compression of modified national institute of standards and technology (MNIST) images, but the model failed to perform better for red-green-blue (RGB) images. Similarly, in [41], the authors used the Kodak dataset to train an AE and achieved better performance than JPEG, JPEG-2000, and WebP. In [42] and [43], the authors used AEs for the compression of images. In [44], the authors proposed an AE for image compression that uses generalized divisive normalization (GDN) and inverse generalized divisive normalization (IGDN) layers instead of rectified linear unit (ReLU) layers to speed up the training process. Cheng et al. in [45] and Alexandre et al. in [46] deployed AEs for the compression of high-resolution images.
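The input-bottleneck-output structure described above can be sketched as follows; the layer sizes are illustrative assumptions rather than those of any surveyed model, and learned codecs such as [44] would replace the ReLU activations with GDN/IGDN layers and quantize the bottleneck:

    import torch
    import torch.nn as nn

    class ConvAutoencoder(nn.Module):
        """Minimal CAE: the encoder maps the input image to a small
        bottleneck (the latent space); the decoder reconstructs it."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(128, 32, 5, stride=2, padding=2))   # bottleneck
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(32, 128, 5, 2, 2, output_padding=1), nn.ReLU(),
                nn.ConvTranspose2d(128, 64, 5, 2, 2, output_padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 5, 2, 2, output_padding=1), nn.Sigmoid())

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = ConvAutoencoder()
    x = torch.rand(1, 3, 256, 256)               # dummy RGB image in [0, 1]
    loss = nn.functional.mse_loss(model(x), x)   # distortion-only training loss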
B. Compression with VAE

The VAE is a version of an NN that belongs to the family of probabilistic graphical models and variational Bayesian methods. Compression with a VAE incorporates a mean and variance for the latent space distributions. The results achieved by VAEs are better than those of simple AEs. A simple illustration of image compression using a VAE is shown in Fig. 4.

Several researchers used VAEs for still image compression. In [47], the authors demonstrated the use of a VAE with a non-linear transform and a uniform quantizer for image compression. The Challenge on Learned Image Compression (CLIC) dataset was used for training the model, which was complex due to its massive number of training parameters. Recently, Chen et al. in [48] proposed a VAE-based architecture for high-resolution image compression and demonstrated a non-local attention module (NLAM) to boost the training process; however, the model complexity increased significantly. Similarly, Larsen et al. in [49] proposed a VAE to compress images. In [50], the authors used a VAE to compress images and achieved a bits per pixel (bpp) of 4.10, but the model was very complex due to its architecture.
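The mean-and-variance latent modeling can be illustrated with a generic VAE bottleneck using the reparameterization trick; the channel counts are assumptions for illustration, not the architecture of [47]-[50]:

    import torch
    import torch.nn as nn

    class VAEBottleneck(nn.Module):
        """Generic VAE latent stage: predict a mean and log-variance per
        latent, sample with the reparameterization trick, and return the
        KL divergence to a standard normal prior (the 'rate' term)."""
        def __init__(self, feat_ch=128, latent_ch=32):
            super().__init__()
            self.to_mu = nn.Conv2d(feat_ch, latent_ch, 1)
            self.to_logvar = nn.Conv2d(feat_ch, latent_ch, 1)

        def forward(self, features):
            mu, logvar = self.to_mu(features), self.to_logvar(features)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
            return z, kl

    # Training objective in rate-distortion form: distortion + beta * kl.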
C. Compression with CNN

CNNs have prime importance in image processing due to their feature extraction characteristics, and they are also used for image compression. A simple illustration of image compression and artifact reduction is shown in Fig. 5.
Fig. 3. JPEG-2000 encoder: the input image is tiled, the DWT produces wavelet coefficients, and quantization followed by entropy encoding yields the code stream.
In [51]-[55], the authors used CNNs to compress images. These CNN-based image compression models outperformed JPEG and JPEG-2000 in structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR). In [56], the authors proposed a CNN-based framework for the compression of gigapixel images and used unsupervised learning for the training of the neural network. Similarly, in [57], the authors proposed a state-of-the-art optimized deep CNN-based technique for the compression of still images. The proposed method used an attention module to adaptively compress certain regions of the images with different numbers of bits. The approach outperformed JPEG and JPEG-2000; however, the results of Ballé [29] are better.

In [58], the authors also used a CNN for the compression of high-resolution images. They used the CLIC 2018 dataset for training and achieved a 7.81% BD-rate reduction over BPG and JPEG-2000. In [59], the authors introduced a new concept of a multi-spectral transform for multi-spectral image compression using a CNN and achieved better compression efficiency than state-of-the-art anchors like JPEG-2000. Likewise, in [60], the authors proposed a data reduction CNN for the compression of hyperspectral images. Similarly, in [61], the authors proposed a CNN-based end-to-end compression architecture for multi-spectral image compression; they extracted spatial features by partitioning and achieved better performance in terms of PSNR than JPEG-2000. Similarly, [62]-[65] also used CNN-based frameworks.
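End-to-end codecs of this kind are usually trained by minimizing a joint rate-distortion objective R + λD with a Lagrange multiplier (as noted for [54] in Table III). The sketch below uses additive uniform noise as a differentiable stand-in for quantization and a crude rate proxy; both are common simplifications, not the exact formulation of any surveyed paper:

    import torch

    def quantize_train(latent):
        """Training-time surrogate for rounding: additive uniform noise keeps
        the loss differentiable (hard rounding is used at test time)."""
        return latent + torch.empty_like(latent).uniform_(-0.5, 0.5)

    def rate_distortion_loss(x, x_hat, latent, lam=100.0):
        """Joint R + lambda*D objective. Real codecs estimate the rate R with
        a learned entropy model; a crude L1 proxy on the latent is used here."""
        rate = latent.abs().mean()
        distortion = torch.nn.functional.mse_loss(x_hat, x)
        return rate + lam * distortion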
D. Compression with RNN

RNNs can be utilized for image compression too. A simple RNN-based image compression architecture has some convolutional layers, a GDN layer, RNN modules, binarized convolutional layers, and an IGDN layer. A simple illustration of image compression using an RNN, as proposed in [66], is shown in Fig. 6. Covell et al. proposed an RNN-based compression architecture trained using stop code tolerant (SCT) bit assignments. They achieved a bpp of 0.25 and a PSNR of 27 dB for the Kodak and ImageNet datasets, respectively; the dataset used in the method was the CelebA dataset [67]. Likewise, in [68], an RNN-based method was proposed to compress still images. The training dataset used in this article was Kodak; the method achieved a bpp of 0.5 and an SSIM of 0.77, and the model's performance was superior to JPEG, JPEG-2000, and WebP.
Fig. 5. Image compression and artifact reduction with a CNN: (a) feature extraction, (b) shrinking, (c) enhancement, (d) mapping, and (e) reconstruction.

Fig. 6. RNN-based image compression [66]: an analysis block (convolutional layers, a GDN layer, and RNN modules) encodes the input image, a quantizer produces the compressed representation, and a synthesis block decodes the compressed image.
In [69], the authors proposed LSTM-based progressive encoding and achieved a bpp of 0.5, a PSNR of 33.59 dB, an SSIM of 0.8933, and a multi-scale structural similarity index (MS-SSIM) of 0.9877. In [70], the authors proposed a DL-based end-to-end image compression model that compresses images based on semantic analysis. The authors used an RNN for the compression and achieved better results than conventional compression techniques like JPEG; a compression rate of 0.75 bpp was obtained using this method. Furthermore, [71]-[74] also used RNNs for image compression and achieved better compression ratios than anchors such as JPEG and JPEG-2000.
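The progressive behavior that lets such RNN codecs trade bits for quality can be sketched as an additive residual loop; the single-layer modules below are toy stand-ins for the convolutional/GDN/RNN blocks of Fig. 6:

    import torch
    import torch.nn as nn

    class Binarizer(nn.Module):
        """Straight-through binarization to {-1, +1}: sign() in the forward
        pass, identity gradient in the backward pass."""
        def forward(self, x):
            x = torch.tanh(x)
            return x + (torch.sign(x) - x).detach()

    def progressive_encode(image, encoder, decoder, binarizer, steps=4):
        """Each iteration codes the residual left by the previous iterations,
        so the bitstream can be truncated after any step for a lower rate."""
        recon = torch.zeros_like(image)
        codes = []
        for _ in range(steps):
            code = binarizer(encoder(image - recon))   # code the current residual
            codes.append(code)
            recon = recon + decoder(code)              # additive refinement
        return codes, recon

    # Toy single-layer stand-ins for the blocks of Fig. 6:
    enc = nn.Conv2d(3, 8, 3, padding=1)
    dec = nn.Conv2d(8, 3, 3, padding=1)
    codes, recon = progressive_encode(torch.rand(1, 3, 64, 64), enc, dec, Binarizer())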
E. Compression with GAN

GANs are algorithmic architectures that use two NNs, pitting one against the other to generate new, synthetic data instances that can pass for actual data; one network is the generator and the other is the discriminator. GANs are also utilized for image compression. A simple flow diagram of image compression using a GAN is shown in Fig. 7, where the encoder and decoder are GAN modules.

Fig. 7. Image compression with GAN: the encoder maps the input image to a compressed representation, and the decoder reconstructs the decoded image.
Torfason et al. in [75] illustrated the use of a GAN for the compression and classification of semantic data. Similarly, [76] proposed an extreme image compression architecture using a GAN with two types of modules for image compression. The first module was generative compression, where no semantic label maps were required; the second was selective generative compression, where semantic label maps were used. They provided insight towards full-resolution image compression and targeted low bit rates, i.e., less than 0.1 bpp. In [77], the authors demonstrated the use of a unified binary GAN (BGAN+) for image compression and image retrieval. The model performed better than JPEG and JPEG-2000, and the visual quality of the reconstructed image was much improved over JPEG and JPEG-2000 at low bit rates such as 0.15 bpp. In [78]-[83], the authors proposed GAN-based architectures for image compression and achieved high compression efficiency. The drawback of GAN-based architectures is the cost of deployment.
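The adversarial training behind these codecs can be sketched as follows, with hypothetical single-layer modules standing in for the encoder, decoder (generator), and discriminator:

    import torch
    import torch.nn.functional as F

    def gan_compression_step(x, encoder, decoder, disc, lam_adv=0.01):
        """One conceptual training step: the decoder acts as generator, and a
        discriminator pushes reconstructions toward natural-looking images."""
        x_hat = decoder(encoder(x))                    # codec reconstruction
        real, fake = disc(x), disc(x_hat)
        ones, zeros = torch.ones_like(real), torch.zeros_like(real)
        # Discriminator: separate real images from reconstructions.
        d_loss = (F.binary_cross_entropy_with_logits(real, ones)
                  + F.binary_cross_entropy_with_logits(fake.detach(), zeros))
        # Codec: distortion plus a "fool the discriminator" term.
        g_loss = (F.mse_loss(x_hat, x)
                  + lam_adv * F.binary_cross_entropy_with_logits(fake, ones))
        return d_loss, g_loss

    # Hypothetical stand-ins for the GAN modules of Fig. 7:
    enc = torch.nn.Conv2d(3, 8, 3, padding=1)
    dec = torch.nn.Conv2d(8, 3, 3, padding=1)
    disc = torch.nn.Conv2d(3, 1, 3, padding=1)
    d_loss, g_loss = gan_compression_step(torch.rand(1, 3, 64, 64), enc, dec, disc)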
F. Compression with PCA

PCA is also used for image compression. In [84], the authors utilized vector quantization and PCA for the compression of hyperspectral images; the technique achieved better performance than JPEG-2000. Similarly, in [85], the authors used PCA and DCT to compress hyperspectral images.
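The PCA-based approach can be illustrated by a rank-k truncation of an image's principal components via the SVD; this minimal NumPy sketch (with column-wise mean removal as an assumption) is not the exact pipeline of [84] or [85]:

    import numpy as np

    def pca_compress(img, k):
        """Rank-k PCA approximation of a grayscale image: storing
        U[:, :k], S[:k], Vt[:k] and the column mean instead of the full
        image is the compression; larger k -> higher quality, more storage."""
        mean = img.mean(axis=0)
        U, S, Vt = np.linalg.svd(img - mean, full_matrices=False)
        approx = (U[:, :k] * S[:k]) @ Vt[:k] + mean
        return np.clip(approx, 0, 255)

    # Example: keep 32 principal components of a 256x256 image.
    recon = pca_compress(np.random.rand(256, 256) * 255, k=32)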
TABLE III
COMPARISON OF THE LEARNING-DRIVEN LOSSY IMAGE COMPRESSION MODELS.

Ref. | Model | bpp | PSNR (dB) | SSIM | MS-SSIM | Remarks
[41] | AE | 0.4 | 29 | 0.83 | 0.94 | Compressive AE with MSE loss trained via a residual network; non-differentiability of the quantization noise reduced using rounding-based quantization; visual artifacts at low bit rates.
[42] | AE | 0.2 | – | – | 0.92 | Vector quantization preferred over scalar quantization; joint optimization of the latent space representation; worse performance than BPG.
[44] | AE | 0.2 | 31 | – | – | GDN as well as IGDN layers; worse performance than HEVC intra.
[46] | AE | 0.126 | 29.30 | – | 0.924 | Feature map coding block with Importance Net; skip-connection AEs trained with MSE and MS-SSIM losses; lower PSNR than BPG.
[47] | VAE | 0.15 | 30.76 | – | 0.955 | Non-linear encoder (NLE) transform, uniform quantizer, non-linear decoder (NLD) transform, and post-processing module form the VAE framework of the article.
[48] | VAE | 0.2 | 30 | – | 0.7768 | Non-local (NL) operations to consider local as well as global context; better performance than the Ballé model.
[50] | VAE | 4.10 | – | – | – | Recurrent VAE for learning the latent space; limited to conceptual compression.
[51] | CNN | 0.025 | – | – | 0.925 | Initial decoded image and side information require more complexity and storage space; cloud-based model; visual artifacts at low bpp.
[52] | CNN | 0.25 | 30 | – | – | Coarse-to-fine hyperprior-based model; lacks parallel decoding.
[53] | CNN | 0.0726 | 23.93 | 0.8118 | – | Loss function consisting of MSE, adversarial, and layer-wise losses.
[54] | CNN | 0.519 | 33.62 | – | 0.981 | Lagrange multiplier for joint rate-distortion; considerable run-time complexity due to the attention modules.
[55] | CNN | 0.2 | 31 | – | 0.7878 | Gaussian mixture model (GMM) along with JointIQ-Net to outperform the JPEG-2000, BPG, and versatile video coding (VVC) intra schemes.
[56] | CNN | – | – | – | – | Two-step CNN-based framework with unsupervised training; limited to histopathology images.
[57] | CNN | 0.5 | – | 0.77 | – | Better performance than JPEG and JPEG-2000 but worse than Ballé [29].
TABLE IV
COMPARISON OF THE LEARNING-DRIVEN LOSSY IMAGE COMPRESSION MODELS (CONTINUED).

Ref. | Model | bpp | PSNR (dB) | SSIM | MS-SSIM | Remarks
[58] | CNN | 0.2 | – | – | 0.93 | Trained with perceptual as well as adversarial losses for the generation of sharp details; outperforms JPEG, JPEG-2000, BPG, and WebP.
[67] | RNN | 0.25 | 27 | – | – | RNN trained using SCT; limited to adaptive encoding.
[68] | RNN | 0.5 | – | 0.77 | – | Focuses on 32×32 thumbnail images; better than JPEG, JPEG-2000, and WebP.
[69] | RNN | 0.5 | 33.59 | 0.8933 | 0.9877 | Long short-term memory (LSTM)-based progressive encoding of thumbnail pictures (32×32); decent compression performance at low bit rates; pixel RNN for entropy coding.
[75] | GAN | 0.0983 | 28.54 | 0.85 | 0.973 | Combined training for compression and classification for semantic understanding.
[76] | GAN | 0.033 | – | – | – | Full-resolution image compression; applicable only when semantic labels are available.
[77] | BGAN+ | 0.15 | – | – | 0.95 | Reconstructs images without distorting the small objects present in them at low bit rates such as 0.15 bpp.
[88], [89] | Multilevel fuzzy transform | – | – | – | – | Reconstructed images with better visual quality; limited to a maximum resolution of 1024×1024.
[90] | Curve fitting with hyperbolic function | – | 20 | – | – | Removed blocking effects and improved SSIM; results depend on block size and compression ratio.
All the learning-driven lossy image compression models have limitations and research gaps. The upcoming section highlights the research gaps of learning-driven lossy image compression frameworks.

V. RESEARCH GAPS AND FUTURE DIRECTIONS

There are several techniques present for still image compression. All frameworks have their pros and cons. After a detailed survey, we find that there are still research gaps. All ML-based architectures have problems with SRD, parallel acceleration, and OOM.

A. SRD in Reconstructed Image

SRD is striped-region distortion in images reconstructed by ML-based architectures. SRD in the reconstructed image is one of the open research problems in ML-based image compression techniques: no technique available in the literature addresses this problem, and it needs to be solved in future research.

B. Standardization of Architecture

The second research gap is the standardization of the end-to-end compression architecture, which is very challenging for researchers. The deep learning architectures present in the literature work well only when both training and testing are performed on a GPU; similarly, when both training and testing are done on a CPU, the results are also good. However, if training is performed on the GPU and testing is done on the CPU, or vice versa, the image cannot be reconstructed correctly. There is a requirement for a standard generalized model that gives an optimal solution to this problem.
Fig. 9. Minimum bpp achieved by different models (x-axis: surveyed models [38]-[74], labeled by first author and year; y-axis: minimum bpp).
Fig. 10. MS-SSIM achieved at minimum bpp by different models (x-axis: surveyed models [38]-[74], labeled by first author and year; y-axis: MS-SSIM).
Fig. 11. PSNR achieved at minimum bpp by different models (x-axis: surveyed models [38]-[74], labeled by first author and year; y-axis: PSNR in dB).
TABLE V
SUMMARY OF THE PERFORMANCE OF THE END-TO-END IMAGE COMPRESSION FRAMEWORKS.

Ref. | Model | bpp | PSNR (dB) | MS-SSIM
[98] | Asymmetric Gained VAE (MSE loss function) | 0.107 | 30.68 | 0.93
[98] | Asymmetric Gained VAE (MS-SSIM loss function) | 0.109 | 27.76 | 0.94
[21] L. Xiang, Y. Li, W. Hao, P. Yang and X. Shen, "Reversible natural language watermarking using synonym substitution and arithmetic coding," Computers, Materials & Continua, vol. 55, no. 3, pp. 541–559, Jun. 2018.
[22] F. Dufaux, G. J. Sullivan and T. Ebrahimi, "The JPEG XR image coding standard [Standards in a Nutshell]," IEEE Signal Processing Magazine, vol. 26, no. 6, pp. 195–204, Nov. 2009.
[23] A. Artusi, R. K. Mantiuk, T. Richter et al., "Overview and evaluation of the JPEG XT HDR image compression standard," Journal of Real-Time Image Processing, vol. 16, pp. 413–428, Apr. 2019.
[24] G. Ginesu, M. Pintus and D. D. Giusto, "Objective assessment of the WebP image coding algorithm," Signal Processing: Image Communication, vol. 27, no. 8, pp. 867–874, Sep. 2012.
[25] S. P. Mohanty, E. Kougianos and P. Guturu, "SBPG: Secure Better Portable Graphics for Trustworthy Media Communications in the IoT," IEEE Access, vol. 6, pp. 5939–5953, Jan. 2018.
[26] J. Cho, O.-J. Kwon and S. Choi, "Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure," IEEE Access, vol. 9, pp. 113213–113225, Aug. 2021.
[27] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano and V. Adzic, "High Bit-Depth Medical Image Compression With HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, Mar. 2018.
[28] T. H. Mandeel, M. Imran Ahmad, N. A. A. Khalid and M. N. Md Isa, "A Comparative Study on Lossless Compression Mode in WebP, Better Portable Graphics (BPG), and JPEG XL Image Compression Algorithms," in Proc. of 2021 8th International Conference on Computer and Communication Engineering (ICCCE), pp. 17–22, 2021.
[29] J. Ballé, V. Laparra and E. P. Simoncelli, "End-to-end optimized image compression," arXiv preprint arXiv:1611.01704, 2016.
[30] J. Ballé, D. Minnen, S. Singh, S. J. Hwang and N. Johnston, "Variational image compression with a scale hyperprior," in Proc. of International Conf. on Learning Representations, 2018.
[31] O. Rippel and L. Bourdev, "Real-Time Adaptive Image Compression," in Proc. of the 34th Int. Conf. on Machine Learning, vol. 70, pp. 2922–2930, 2017.
[32] D. Mishra, S. K. Singh and R. K. Singh, "Wavelet-based deep auto encoder-decoder (WDAED)-based image compression," IEEE Trans. on Circuits and Systems for Video Technology, vol. 31, no. 4, pp. 1452–1462, Apr. 2021.
[33] C. Cai, L. Chen, X. Zhang and Z. Gao, "End-to-End Optimized ROI Image Compression," IEEE Trans. on Image Processing, vol. 29, pp. 3442–3457, 2020.
[34] C. Cai, L. Chen, X. Zhang and Z. Gao, "Efficient Variable Rate Image Compression with Multi-scale Decomposition Network," IEEE Trans. on Circuits and Systems for Video Technology, vol. 29, no. 12, pp. 3687–3700, Dec. 2019.
[35] A. K. Ashok and N. Palani, "Autoencoders with Variable Sized Latent Vector for Image Compression," in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018.
[36] M. Li, K. Ma, J. You, D. Zhang and W. Zuo, "Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression," IEEE Transactions on Image Processing, vol. 29, pp. 5900–5911, Apr. 2020.
[37] M. Li, W. Zuo, S. Gu, J. You and D. Zhang, "Learning Content-Weighted Deep Image Compression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3446–3461, Oct. 2021.
[38] V. Alves de Oliveira, M. Chabert, T. Oberlin, C. Poulliat, M. Bruno, C. Latry, M. Carlavan, S. Henrot, F. Falzon and R. Camarero, "Reduced-Complexity End-to-End Variational Autoencoder for on Board Satellite Image Compression," Remote Sensing, vol. 13, no. 3, pp. 447, Jan. 2021.
[39] Y. Ollivier, "Auto-encoders: reconstruction versus compression," CoRR, 2014.
[40] A. Sento, "Image Compression with Auto-encoder Algorithm using Deep Neural Network (DNN)," in Proc. of 2016 Management and Innovation Technology Int. Conf. (MITicon), IEEE, pp. MIT–99, 2016.
[41] L. Theis, W. Shi, A. Cunningham and F. Huszár, "Lossy image compression with compressive autoencoders," in Proc. of International Conf. on Learning Representations, 2017.
[42] E. Agustsson, F. Mentzer, M. Tschannen, L. Cavigelli, R. Timofte, L. Benini and L. V. Gool, "Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations," in Proc. of Advances in Neural Information Processing Systems, pp. 1141–1151, 2017.
[43] T. Dumas, A. Roumy and C. Guillemot, "Image compression with Stochastic Winner-Take-All Auto-Encoder," in Proc. of 2017 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1512–1516, 2017.
[44] T. Dumas, A. Roumy and C. Guillemot, "Autoencoder Based Image Compression: Can the Learning be Quantization Independent?" in Proc. of 2018 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1188–1192, 2018.
[45] Z. Cheng, H. Sun, M. Takeuchi and J. Katto, "Deep Convolutional AutoEncoder-based Lossy Image Compression," in Proc. of 2018 Picture Coding Symposium (PCS), IEEE, pp. 253–257, 2018.
[46] D. Alexandre, C.-P. Chang, W.-H. Peng and H.-M. Hang, "An autoencoder-based learned image compressor: Description of challenge proposal by NCTU," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition Workshops, pp. 2539–2542, 2018.
[47] L. Zhou, C. Cai, Y. Gao, S. Su and J. Wu, "Variational Autoencoder for Low Bit-rate Image Compression," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun. 2018.
[48] T. Chen, H. Liu, Z. Ma, Q. Shen, X. Cao and Y. Wang, "End-to-End Learnt Image Compression via Non-Local Attention Optimization and Improved Context Modeling," IEEE Transactions on Image Processing, vol. 30, pp. 3179–3191, Feb. 2021.
[49] A. B. L. Larsen, S. K. Sønderby, H. Larochelle and O. Winther, "Autoencoding beyond pixels using a learned similarity metric," in Proc. of the 33rd Int. Conf. on Machine Learning (ICML'16), vol. 48, pp. 1558–1566, 2016.
[50] K. Gregor, F. Besse, D. J. Rezende, I. Danihelka and D. Wierstra, "Towards Conceptual Compression," in Proc. of Advances in Neural Information Processing Systems, pp. 3549–3557, 2016.
[51] S. Ayzik and S. Avidan, "Deep image compression using decoder side information," in Proc. of Computer Vision–ECCV 2020: 16th European Conf., Glasgow, UK, pp. 699–714, 2020.
[52] Y. Hu, W. Yang, Z. Ma and J. Liu, "Learning end-to-end lossy image compression: A benchmark," IEEE Trans. on Pattern Analysis and Machine Intelligence, Mar. 2021.
[53] S. K. Raman, A. Ramesh, V. Naganoor, S. Dash, G. Kumaravelu and H. Lee, "CompressNet: Generative Compression at Extremely Low Bitrates," in Proc. of the IEEE Winter Conf. on Applications of Computer Vision, pp. 2325–2333, 2020.
[54] Z. Cheng, H. Sun, M. Takeuchi and J. Katto, "Learned image compression with discretized Gaussian mixture likelihoods and attention modules," in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition, pp. 7939–7948, 2020.
[55] J. Lee, S. Cho and M. Kim, "An end-to-end joint learning scheme of image compression and quality enhancement with improved entropy minimization," arXiv preprint arXiv:1912.12817, 2020.
[56] D. Tellez, G. Litjens, J. van der Laak and F. Ciompi, "Neural Image Compression for Gigapixel Histopathology Image Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 2, pp. 567–578, Feb. 2021.
[57] W. Li, W. Sun, Y. Zhao, Z. Yuan and Y. Liu, "Deep Image Compression with Residual Learning," Applied Sciences, vol. 10, no. 11, pp. 4023, Jun. 2020.
[58] H. Liu, T. Chen, Q. Shen, T. Yue and Z. Ma, "Deep Image Compression via End-to-End Learning," in Proc. of Computer Vision and Pattern Recognition (CVPR) Workshops, 2018.
[59] J. Li and Z. Liu, "Multispectral transforms using convolution neural networks for remote sensing multispectral image compression," Remote Sensing, vol. 11, no. 7, pp. 759, Mar. 2019.
[60] T. M. Zeegers, D. M. Pelt, T. van Leeuwen, R. van Liere and K. J. Batenburg, "Task-Driven Learned Hyperspectral Data Reduction Using End-to-End Supervised Deep Learning," Journal of Imaging, vol. 6, no. 12, pp. 132, Dec. 2020.
[61] F. Kong, K. Hu, Y. Li, D. Li and D. Zhao, "Spectral-Spatial Feature Partitioned Extraction Based on CNN for Multispectral Image Compression," Remote Sensing, vol. 13, no. 1, pp. 9, Dec. 2020.
[62] J. Cai, Z. Cao and L. Zhang, "Learning a single Tucker decomposition network for lossy image compression with multiple bits-per-pixel rates," IEEE Trans. on Image Processing, vol. 29, pp. 3612–3625, Jan. 2020.
[63] A. Prakash, N. Moran, S. Garber, A. DiLillo and J. Storer, "Semantic Perceptual Image Compression using Deep Convolution Networks," in Proc. of 2017 Data Compression Conf. (DCC), pp. 250–259, 2017.
[64] P. Akyazi and T. Ebrahimi, "Learning-based image compression using convolutional autoencoder and wavelet decomposition," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[65] Y. Xue and J. Su, "Attention Based Image Compression Post-Processing Convolutional Neural Network," in Proc. of CVPR Workshops, 2019.
[66] K. Islam, L. M. Dang, S. Lee and H. Moon, "Image Compression With Recurrent Neural Network and Generalized Divisive Normalization," in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1875–1879, 2021.
[67] M. Covell, N. Johnston, D. Minnen, S. Jin Hwang, J. Shor, S. Singh, D. Vincent and G. Toderici, "Target-Quality Image Compression with Recurrent, Convolutional Neural Networks," arXiv preprint arXiv:1705.06687, 2017.
[68] G. Toderici, S. M. O'Malley, S. J. Hwang, D. Vincent, D. Minnen, S. Baluja, M. Covell and R. Sukthankar, "Variable Rate Image Compression with Recurrent Neural Networks," arXiv preprint arXiv:1511.06085, 2016.
[69] G. Toderici, D. Vincent, N. Johnston, S. Jin Hwang, D. Minnen, J. Shor and M. Covell, "Full Resolution Image Compression with Recurrent Neural Networks," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 5306–5314, 2017.
[70] C. Wang, Y. Han and W. Wang, "An End-to-End Deep Learning Image Compression Framework Based on Semantic Analysis," Applied Sciences, vol. 9, no. 17, pp. 3580, Sep. 2019.
[71] A. Punnappurath and M. S. Brown, "Learning raw image reconstruction-aware deep image compressors," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 42, no. 4, pp. 1013–1019, Apr. 2020.
[72] N. Johnston, D. Vincent, D. Minnen, M. Covell, S. Singh, T. Chinen, S. J. Hwang, J. Shor and G. Toderici, "Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 4385–4393, 2018.
[73] Y. Hu, W. Yang, M. Li and J. Liu, "Progressive Spatial Recurrent Neural Network for Intra Prediction," IEEE Trans. on Multimedia, vol. 21, no. 12, pp. 3024–3037, Dec. 2019.
[74] A. G. Ororbia, A. Mali, J. Wu, S. O'Connell, W. Dreese, D. Miller and C. L. Giles, "Learned Neural Iterative Decoding for Lossy Image Compression Systems," in Proc. of 2019 Data Compression Conf. (DCC), pp. 3–12, 2019.
[75] R. Torfason, F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte and L. V. Gool, "Towards Image Understanding from Deep Compression Without Decoding," in Proc. of Int. Conf. on Learning Representations, 2018.
[76] E. Agustsson, M. Tschannen, F. Mentzer, R. Timofte and L. V. Gool, "Generative adversarial networks for extreme learned image compression," in Proc. of the IEEE/CVF International Conference on Computer Vision, pp. 221–231, 2019.
[77] J. Song, T. He, L. Gao, X. Xu, A. Hanjalic and H. T. Shen, "Unified Binary Generative Adversarial Network for Image Retrieval and Compression," International Journal of Computer Vision, vol. 128, no. 8, pp. 2243–2264, Feb. 2020.
[78] X. Zhang and X. Wu, "Near-lossless L-infinity constrained Multi-rate Image Decompression via Deep Neural Network," CoRR, 2018.
[79] B. Kang, S. Tripathi and T. Q. Nguyen, "Toward Joint Image Generation and Compression using Generative Adversarial Networks," arXiv preprint arXiv:1901.07838, 2019.
[80] L. Wu, K. Huang and H. Shen, "A GAN-based tunable image compression system," in Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2334–2342, 2020.
[81] D. J. Im, C. D. Kim, H. Jiang and R. Memisevic, "Generating images with recurrent adversarial networks," arXiv preprint arXiv:1602.05110, 2016.
[82] E. M. Tolunay and A. Ghalayini, "Generative Neural Network Based Image Compression," 2018.
[83] L. Galteri, L. Seidenari, M. Bertini and A. Del Bimbo, "Deep Generative Adversarial Compression Artifact Removal," in Proc. of the IEEE Int. Conf. on Computer Vision, pp. 4826–4835, 2017.
[84] D. Báscones, C. González and D. Mozos, "Hyperspectral image compression using vector quantization, PCA and JPEG2000," Remote Sensing, vol. 10, no. 6, pp. 907, Jun. 2018.
[85] R. J. Yadav and M. S. Nagmode, "Compression of hyperspectral image using PCA–DCT technology," in Proc. of Innovations in Electronics and Communication Engineering, pp. 269–277, 2018.
[86] A. C. Karaca and M. K. Güllü, "Target preserving hyperspectral image compression using weighted PCA and JPEG2000," in Proc. of International Conference on Image and Signal Processing, pp. 508–516, 2018.
[87] A. H. Abbas, A. Arab and J. Harbi, "Image compression using principal component analysis," Mustansiriyah Journal of Science, vol. 29, no. 2, Jan. 2018.
[88] F. Di Martino, I. Perfilieva and S. Sessa, "A Fast Multilevel Fuzzy Transform Image Compression Method," Axioms, vol. 8, no. 4, pp. 135, Dec. 2019.
[89] F. Di Martino and S. Sessa, "Multi-level fuzzy transforms image compression," Journal of Ambient Intelligence and Humanized Computing, vol. 10, no. 7, pp. 2745–2756, Jul. 2019.
[90] W. Khalaf, D. Zaghar and N. Hashim, "Enhancement of Curve-Fitting Image Compression Using Hyperbolic Function," Symmetry, vol. 11, pp. 291, Feb. 2019.
[91] Z. Guo, Z. Zhang, R. Feng and Z. Chen, "Causal Contextual Prediction for Learned Image Compression," IEEE Transactions on Circuits and Systems for Video Technology, Jun. 2021, doi: 10.1109/TCSVT.2021.3089491.
[92] D. Minnen, J. Ballé and G. D. Toderici, "Joint autoregressive and hierarchical priors for learned image compression," in Proc. of Advances in Neural Information Processing Systems, pp. 10771–10780, 2018.
[93] X. He, Q. Hu, X. Zhang, C. Zhang, W. Lin and X. Han, "Enhancing HEVC Compressed Videos with a Partition-Masked Convolutional Neural Network," in Proc. of 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 216–220, 2018.
[94] Y. Hu, W. Yang and J. Liu, "Coarse-to-fine hyper-prior modeling for learned image compression," in Proc. of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 11013–11020, 2020.
[95] D. Minnen and S. Singh, "Channel-Wise Autoregressive Entropy Models for Learned Image Compression," in Proc. of 2020 IEEE International Conference on Image Processing (ICIP), pp. 3339–3343, 2020.
[96] Y. Xie, K. L. Cheng and Q. Chen, "Enhanced invertible encoding for learned image compression," in Proc. of the 29th ACM International Conference on Multimedia, pp. 162–170, 2021.
[97] Y. Choi, M. El-Khamy and J. Lee, "Variable rate deep image compression with a conditional autoencoder," in Proc. of the IEEE/CVF International Conference on Computer Vision, pp. 3146–3154, 2019.
[98] Z. Cui, J. Wang, S. Gao, T. Guo, Y. Feng and B. Bai, "Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation," in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10532–10541, 2021.
[99] A. Buades, B. Coll and J.-M. Morel, "A non-local algorithm for image denoising," in Proc. of IEEE Conf. Comput. Vis. Pattern Recog., vol. 2, IEEE, pp. 60–65, 2005.
[100] X. Wang, R. Girshick, A. Gupta and K. He, "Non-local neural networks," in Proc. of IEEE Conf. Comput. Vis. Pattern Recog., pp. 7794–7803, 2018.
[101] D. Liu, B. Wen, Y. Fan, C. C. Loy and T. S. Huang, "Non-local recurrent network for image restoration," Neural Inf. Process. Syst., pp. 1673–1682, Dec. 2018.
[102] Y. Zhang, K. Li, K. Li, B. Zhong and Y. Fu, "Residual non-local attention networks for image restoration," in Proc. of Int. Conf. Learning Representations, 2019.
[103] M. Li, K. Zhang, J. Li, W. Zuo, R. Timofte and D. Zhang, "Learning Context-Based Nonlocal Entropy Modeling for Image Compression," IEEE Transactions on Neural Networks and Learning Systems, Aug. 2021.
[104] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. of Int. Conf. Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 234–241, 2015.
[105] X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu and P.-A. Heng, "H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes," IEEE Trans. Medical Imaging, vol. 37, no. 12, pp. 2663–2674, Dec. 2018.
[106] Y. Wu, X. Li, Z. Zhang, X. Jin and Z. Chen, "Learned Block-based Hybrid Image Compression," IEEE Transactions on Circuits and Systems for Video Technology, Oct. 2021.