Abstract— Deep neural networks (DNNs) are increasingly being researched and employed as a solution to various image and video processing tasks. In this paper we address the problem of digital image compression using DNNs. We use two different DNN architectures for image compression: one employing logistic sigmoid neurons and the other hyperbolic tangent neurons. Experiments show that the network employing the hyperbolic tangent neurons outperforms the one with the sigmoid neurons. Results indicate that the hyperbolic tangent neurons not only improve the PSNR of the reconstructed images by a significant 2~5 dB on average but also converge several orders of magnitude faster than the logistic sigmoid neurons.

Keywords— Deep neural networks; image compression; artificial neurons; logistic sigmoid neurons; hyperbolic tangent neurons

I. INTRODUCTION

Digital image compression plays a vital role in the transmission and storage of digital image data. It allows the transmission of image data at extremely low bandwidths and minimizes the memory required to store this data. Besides, the algorithms designed for digital image compression can be effectively extended to compress video data, as videos are simply consecutive frames of still images.

Artificial neural networks (ANNs) [1][2][3] are simplified models of the biological neuron system. They tend to mimic the manner in which the human brain performs calculations and makes decisions. The complexity of the real nervous system is highly abstracted when modeling an ANN.

Several studies have been proposed to address the problem of digital image compression via ANNs. The most elemental and simple network, i.e. the single structured neural network, is described in [6]. The authors use a 3-layer … compression is described in [11]. The authors in [12] and [13] provide a summary of different neural network models and of the mathematical/statistical techniques with which neural networks can be complemented to improve the compression results.

In this paper we address digital image compression with deep neural networks (DNNs), i.e. ANNs having multiple hidden layers of neurons between the input and output layers [4]. The extra layers enable the network to represent a significantly larger set of functions more compactly than single-layered networks. Besides this, the choice of activation function strongly dictates the complexity and performance of a neural network [5]. We achieve still image compression by employing DNNs with sigmoid neurons and with hyperbolic tangent neurons. We demonstrate experimentally that the hyperbolic tangent neurons outperform the sigmoid neurons in terms of both reconstruction quality and convergence speed of the network.

The rest of the paper is organized as follows. Section II introduces the image compression problem. Section III gives a brief insight into neural networks and states the preliminaries. Section IV describes the proposed DNNs in detail. Section V reports the experimental results. Section VI concludes the paper.

II. IMAGE COMPRESSION

The process of image compression can be formulated as designing a compressor and a decompressor module as shown in fig. 1.

Fig. 1. Image compression pipeline: the compression module (encoder) maps the original image I of size M bits to the compressed data IC of size N bits, and the decompression module (decoder) produces the reconstructed image I'.
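As a minimal, purely illustrative sketch of this formulation (not the DNN codec proposed later in Section IV), the two modules can be viewed as a pair of functions. The toy subsampling codec below is hypothetical and only demonstrates the I → IC → I' data flow and how a compression ratio arises from the sizes M and N:

```python
import numpy as np

def compress(image: np.ndarray) -> np.ndarray:
    """Toy compressor: keep every other pixel in each dimension, so the
    compressed data IC is about 4x smaller than the original image I."""
    return image[::2, ::2].copy()

def decompress(code: np.ndarray) -> np.ndarray:
    """Toy decompressor: upsample IC back to the original size,
    producing the reconstructed image I'."""
    return np.repeat(np.repeat(code, 2, axis=0), 2, axis=1)

I = np.random.randint(0, 256, (512, 512), dtype=np.uint8)  # stand-in original image
IC = compress(I)                                           # compressed data (N bits)
I_rec = decompress(IC)                                     # reconstructed image I'
print("compression ratio (M bits / N bits):", I.nbytes / IC.nbytes)
```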
III. NEURAL NETWORK PRELIMINARIES

A. The Artificial Neuron

Fig. 2. An artificial neuron: the inputs X0 … Xn are weighted by W0 … Wn, a summer adds the weighted inputs and the bias, and an activation function produces the output.

These artificial neurons form the basis of artificial neural networks. An artificial neuron works as follows: it has n inputs and a bias. Each input to the neuron has its own associated weight, illustrated by the circles in fig. 2. Before entering the neuron, each input is multiplied by its respective weight. The artificial neuron sums all these weighted inputs and passes the weighted sum through an activation function, which determines the activation, or output, of the neuron. This output signal is typically sent as an input to another neuron.
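A minimal sketch of this weighted-sum-and-activation computation (the function name, the example values and the choice of tanh as activation are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def artificial_neuron(x, w, b, activation=np.tanh):
    """Single artificial neuron as in fig. 2: the summer forms the weighted
    sum of the n inputs plus the bias, and the activation function maps
    that sum to the neuron's output."""
    s = np.dot(w, x) + b      # weighted sum of the inputs plus the bias
    return activation(s)      # output, typically fed to other neurons

# Example with four inputs
x = np.array([0.2, -0.5, 0.1, 0.9])    # inputs x0..x3
w = np.array([0.4, 0.3, -0.8, 0.05])   # weights w0..w3
print(artificial_neuron(x, w, b=0.1))
```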
B. Architecture

An ANN is set up by creating connections between these artificial neurons, analogous to the connections between the biological neurons in the human nervous system. Fig. 3 shows a generalized architecture for an ANN (a DNN). The neurons are tightly interconnected and organized into different layers. The input layer receives the input and the output layer produces the final output. Usually one or more hidden layers are present between the input and output layers.

Fig. 3. Deep neural network architecture for image compression.

C. Learning

These networks are trained on various observed data sets (input-output pairs) from the training data, the task being to find a reasonable estimate of the underlying process and to model it so that it generalizes to a large set of test data. This is achieved by rigorously training the network with the help of learning algorithms. These training processes are characterized by comparing a given output to the predicted output and adapting all parameters according to this comparison. The parameters of a neural network are its weights. This paper uses back propagation for training the artificial neural network. Back propagation is a supervised learning algorithm and is especially suitable for feed-forward networks. The term back propagation is an abbreviation of "backwards propagation of errors" and implies that the errors (and therefore the learning) propagate backwards from the output nodes to the inner nodes. Hence back propagation is used to calculate the gradient of the error with respect to the network's modifiable weights. This gradient is then used in a simple gradient descent algorithm to find weights that minimize the error. The term back propagation refers to the entire procedure encompassing both the calculation of the gradient and its use in gradient descent. Back propagation requires the activation function used by the artificial neurons (or "nodes") to be differentiable. Once the optimum weights for the network are found, it can be actively employed for the particular application and produce independent results.
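As a minimal sketch of one such training step (the layer sizes, learning rate, squared-error loss, sigmoid activation and the omission of biases are illustrative assumptions, not details given in the paper), a single backward pass and gradient-descent update might look as follows:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 3, 4                 # illustrative layer sizes
W1 = rng.normal(0, 0.1, (n_hid, n_in))       # input -> hidden weights
W2 = rng.normal(0, 0.1, (n_out, n_hid))      # hidden -> output weights
eta = 0.5                                    # learning rate

x = rng.random(n_in)                         # one training input
t = x                                        # desired output (for compression, target = input)

# Forward pass
h = sigmoid(W1 @ x)                          # hidden layer activations
y = sigmoid(W2 @ h)                          # network output

# Backward pass: the error propagates from the output nodes to the inner nodes
e = y - t                                    # output error (gradient of 0.5*||y - t||^2 w.r.t. y)
delta_out = e * y * (1 - y)                  # error signal at the output layer
delta_hid = (W2.T @ delta_out) * h * (1 - h) # error signal at the hidden layer

# Gradient descent on the modifiable weights
W2 -= eta * np.outer(delta_out, h)
W1 -= eta * np.outer(delta_hid, x)
```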
D. Deep Neural Networks

Deep neural networks are ANNs having multiple hidden layers of neurons between the input and output layers, as shown in fig. 3. These networks can model complex non-linear relationships more efficiently. Due to the additional layers the network can capture the highly variant features of the input and learn higher, more abstract representations of the data. This contributes to the better generalization property of DNNs.

IV. PROPOSED DEEP NEURAL NETWORKS FOR IMAGE COMPRESSION

Compact internal data representations of the original image are generated for the purpose of still image compression by two different DNNs. The two DNNs differ in the activation function (neurons) they deploy. Researchers conclude that the performance and the complexity of a neural network are characterized by the type of neurons, i.e. the activation function, it employs [5]. The activation function influences the mapping of the input data to the output data and determines the boundedness of the network to a unit interval. The computations required to calculate the activation function and its derivative dictate the speed of the neural network. We implemented two DNNs, one with sigmoid neurons and the other with hyperbolic tangent neurons. These units are described as follows.

A. Logistic Sigmoid Unit

The sigmoid unit is one of the most widely used activation functions and is given by $\sigma(x) = 1/(1 + e^{-x})$, where $x$ is the input to the neuron. This function limits the amplitude of the output of a neuron to the range 0 to +1. The gradient of the sigmoid function vanishes as $x$ increases or decreases. This activation function is especially advantageous in neural networks trained by back-propagation algorithms.
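A small sketch of this unit together with the derivative that back propagation uses; the closed form $\sigma'(x) = \sigma(x)(1 - \sigma(x))$ is standard, and the code is illustrative only:

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: output lies strictly between 0 and +1."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """Derivative used by back propagation: sigma(x) * (1 - sigma(x)).
    It vanishes for large |x|, i.e. on the flat tails of the sigmoid."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_prime(0.0))   # 0.5 0.25
```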
B. Hyperbolic Tangent Unit

The hyperbolic tangent transfer function is given by $\mathcal{T}(x) = (e^{x} - e^{-x}) / (e^{x} + e^{-x})$. The hyperbolic tangent function produces a scaled output over the range -1 to +1. This function shares many properties of the sigmoid function, but because its output space is broader it may be more efficient for modeling complex non-linear relations. In our analysis of these networks the hyperbolic tangent neurons show superior performance to the sigmoid neurons.
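The corresponding sketch for this unit; the derivative $\mathcal{T}'(x) = 1 - \tanh^2(x)$ is the standard closed form, and the code is illustrative only:

```python
import numpy as np

def tanh(x):
    """Hyperbolic tangent: (e^x - e^-x) / (e^x + e^-x), output between -1 and +1."""
    return np.tanh(x)

def tanh_prime(x):
    """Derivative used by back propagation: 1 - tanh(x)^2."""
    return 1.0 - np.tanh(x) ** 2

print(tanh(0.5), tanh_prime(0.5))
```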
The DNN with these non-linear neurons used for still image compression is shown in fig. 3. The network consists of an input layer, n hidden layers and an output layer. As this network targets image compression/decompression it must have an equal number of input and output neurons, N. The number of neurons in the input layer or the output layer corresponds to the size of the image block to be compressed. Compression is achieved by allowing the number of neurons at the last hidden layer, K, to be less than the number of neurons at the input and output layers (K < N). The number of hidden layers and hidden neurons is determined by the number of input and output neurons as well as by the desired compression ratio. The compression ratio of this DNN is the ratio of the number of input neurons to the number of neurons in the last of the hidden layers.

The training of the DNN is carried out with a set of images selected from the training set. The training images are divided into non-overlapping blocks of W by W pixels. The pixels in each of these blocks are normalized by a normalizing function f. These normalized blocks are fed into the input layer of the network in random order; each neuron in the input layer corresponds to one pixel, i.e. N = W × W. As in any supervised learning setting the desired output of the network is known in advance, which for image compression is the same as the input to the network: we train the network to produce at its output what it sees at its input. As our DNN uses back propagation for training, the difference between the actual output and the desired output is calculated and the error is propagated backwards to adjust the parameters of the network accordingly. With the new weights the output is again calculated and compared with the desired one, the errors are re-propagated, the parameters are readjusted and the process continues in an iterative fashion. In our implementation the training is stopped when the iterations reach their maximum limit or when the average mean square error drops below a certain threshold. Once the training of the network is completed, the parameters of the network are saved.

With these finalized weights we utilize the DNN to compress and decompress the test images. The test image to be compressed is divided into non-overlapping blocks, and each block is fed into the input of the network after normalization. The input layer and the n hidden layers act as the compressor module and perform a non-linear, non-orthogonal transformation V. The compressed data is found at the output of the last hidden layer. The output layer acts as the decompressor module and reconstructs the normalized input data block by performing a second transformation T. The decompressed image block is found at the output neurons of the output layer. The dynamic range of the reconstructed data block is restored by applying the inverse normalization function $f^{-1}$. The transformations V and T are optimized by training the DNN on several training images.
V. EXPERIMENTAL RESULTS

Experiments were performed on test images taken from the standard set of images: lena, baboon, cameraman, peppers and boats. The size of the test images was 512 by 512 pixels. The performance of the network is measured by the peak signal to noise ratio (PSNR) of the reconstructed images at the output layer, defined as

$$PSNR = 10 \log_{10}\left(\frac{MAX_I^2}{MSE}\right)$$

where $MAX_I$ is the maximum possible pixel value of the image and $MSE$ is the mean square error given by

$$MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[I_O(i,j) - I_R(i,j)\right]^2$$

where $m$ and $n$ represent the size of the image in the horizontal and vertical dimensions respectively, $I_O$ is the original image and $I_R$ is the reconstructed image.
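A minimal end-to-end sketch tying Sections IV and V together: the test image is split into W × W blocks, normalized, passed through the encoder (transformation V) and decoder (transformation T), and the reconstruction is scored with the MSE and PSNR defined above. The block size, the [0, 1] normalization and the random, untrained stand-in weights are assumptions made purely for illustration; in the paper the weights come from back-propagation training.

```python
import numpy as np

W = 8                    # block size (assumed); the paper does not state it
N, K = W * W, 16         # N input/output neurons, K < N, i.e. a CR of N:K = 4:1 here

def f(block):            # normalizing function f: pixel values -> [0, 1]
    return block / 255.0

def f_inv(block):        # inverse normalization: [0, 1] -> pixel range
    return np.clip(block * 255.0, 0, 255)

rng = np.random.default_rng(0)
# Untrained stand-in weights; in the paper these are learned by back propagation.
V_enc = rng.normal(0, 0.1, (K, N))   # compressor: transformation V (input + hidden layers)
T_dec = rng.normal(0, 0.1, (N, K))   # decompressor: transformation T (output layer)

def compress_block(block):
    return np.tanh(V_enc @ f(block).ravel())            # compressed data, K values

def decompress_block(code):
    return f_inv(np.tanh(T_dec @ code)).reshape(W, W)   # reconstructed W x W block

def mse(original, reconstructed):
    return np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)

def psnr(original, reconstructed, max_i=255.0):
    return 10.0 * np.log10(max_i ** 2 / mse(original, reconstructed))

# Round-trip over all non-overlapping blocks of a stand-in 512 x 512 test image
image = rng.integers(0, 256, (512, 512)).astype(np.uint8)
recon = np.zeros(image.shape)
for i in range(0, image.shape[0], W):
    for j in range(0, image.shape[1], W):
        block = image[i:i + W, j:j + W]
        recon[i:i + W, j:j + W] = decompress_block(compress_block(block))

print("MSE :", mse(image, recon))
print("PSNR:", psnr(image, recon), "dB")
```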
In this paper the degree of compression achieved is expressed in terms of the compression ratio (CR), defined as the ratio of the number of input neurons to the number of neurons in the last of the hidden layers. The number of neurons in the input layer, the output layer and the last hidden layer were adjusted to achieve different CRs, i.e. 4:1, 8:1 and 16:1. Tables I-III show the compression performance, in terms of PSNR, achieved on the set of five standard real-world images compressed at the different CRs by the sigmoid DNN and the tangent DNN. There is a significant performance improvement for the hyperbolic tangent units compared to the sigmoid units.

TABLE I
PSNR OF THE RECONSTRUCTED IMAGES AT A CR OF 8:1

Sequence     Sigmoid (dB)   Tangent (dB)
lena         24.37          28.67
baboon       20.59          22.78
cameraman    23.24          25.78
peppers      25.54          27.82
boats        24.54          29.60

TABLE II
PSNR OF THE RECONSTRUCTED IMAGES AT A CR OF 4:1

Sequence     Sigmoid (dB)   Tangent (dB)
lena         27.30          30.16

TABLE III
PSNR OF THE RECONSTRUCTED IMAGES AT A CR OF 16:1

Sequence     Sigmoid (dB)   Tangent (dB)
lena         22.78          27.58
baboon       20.06          22.73

Fig. 4. Comparison of epoch vs. mse for sigmoid neurons and hyperbolic tangent neurons at a CR of 8:1.
Fig. 8. (a) Original image (b) image reconstructed by the deep neural network employing sigmoid neurons at a CR of 4:1 (c) image reconstructed by the deep neural network employing hyperbolic tangent neurons at a CR of 4:1.

Fig. 9. (a) Original image (b) image reconstructed by the deep neural network employing sigmoid neurons at a CR of 16:1 (c) image reconstructed by the deep neural network employing hyperbolic tangent neurons at a CR of 16:1.

VI. CONCLUSIONS

In this paper we used deep neural architectures for the purpose of still image compression. As a general conclusion from the experimental results obtained, the DNN with hyperbolic tangent neurons performed better than the one with sigmoid neurons. Simulations showed that the hyperbolic tangent neurons not only markedly increase the compression performance of the network but also cause faster convergence of the learning algorithm than the sigmoid neurons. We conclude that hyperbolic tangent neurons are the more suitable choice for the image compression task.

REFERENCES

[1] S. O. Haykin, Neural Networks and Learning Machines, 3rd ed., Prentice Hall, 2008.
[2] Y. S. Abu-Mostafa, M. Magdon-Ismail and H.-T. Lin, Learning from Data, AMLBook, 2012.
[3] P. H. Sydenham and R. Thorn, Handbook of Measuring System Design, vol. 3, John Wiley & Sons, 2005, pp. 901-908.
[4] G. Hinton, L. Deng, et al., "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29(6), pp. 82-97, 2012.
[5] B. Karlik and A. V. Olgac, "Performance analysis of various activation functions in generalized MLP architectures of neural networks," International Journal of Artificial Intelligence and Expert Systems, vol. 1, pp. 111-122, 2011.
[6] G. L. Sicuranza, G. Ramponi and S. Marsi, "Artificial neural network for image compression," Electronics Letters, vol. 26, pp. 477-479, 1990.
[7] S. Carrato and S. Marsi, "Parallel structure based on neural networks for image compression," Electronics Letters, vol. 28(12), pp. 1152-1153, 1992.
[8] G. Qiu, M. R. Varley and T. J. Terrell, "Image compression by edge pattern learning using multilayer perceptrons," Electronics Letters, vol. 29(7), pp. 601-603, 1993.
[9] A. Namphol, S. Chin and M. Arozullah, "Image compression with a hierarchical neural network," IEEE Trans. on Aerospace and Electronic Systems, vol. 32(1), pp. 326-337, Jan. 1996.
[10] Y. Benbenisti, D. Kornreich, H. B. Mitchell and P. A. Schaefer, "A high performance single-structure image compression neural network," IEEE Trans. on Aerospace and Electronic Systems, vol. 33(3), pp. 1060-1063, July 1997.
[11] K. S. Ng and L. M. Cheng, "Artificial neural network for discrete cosine transform and image compression," Proceedings of the Fourth International Conference on Document Analysis and Recognition, vol. 2, pp. 675-678, August 1997.
[12] C. Cramer, "Neural networks for image and video compression: A review," European Journal of Operational Research, vol. 108, pp. 266-282, 1998.
[13] J. Jiang, "Image compression with neural networks - A survey," Signal Processing: Image Communication, vol. 14, pp. 737-760, 1999.