Analysis of Image Compression Algorithms On Vivado HLS
Abstract—The resolution of cameras is rising day by day due to developments in optical technology. Such high-resolution images are large, putting a strain on the available storage. Thus, there is a need to compress the image without losing important data from the actual image. In this paper, we therefore present the implementation and a detailed analysis of different lossy image compression algorithms: DCT (Discrete Cosine Transform), FDCT (Fast Discrete Cosine Transform) and Haar Wavelet Transform. These algorithms are simulated and synthesized using Xilinx’s VIVADO HLS platform for a Xilinx Artix 7 family board. They are accelerated using the pragmas provided by VIVADO HLS to optimize the application. The paper also provides an analysis of the tradeoff between these algorithms and the usage of accelerators in the algorithms. We have calculated the Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR) and achieved PSNR values in the permissible range of 30dB to 50dB.

Index Terms—Vivado, High-Level Synthesis, DCT, Fast DCT, Haar wavelet, lossy compression, Matlab, FPGA, Acceleration, pipelining, accelerators, C language.

I. INTRODUCTION

Pictures are part and parcel of human life in this modern age. High-resolution images can be compressed to reduce the redundancy in the image data. This results in a reduction of storage space utilization. Image compression algorithms are of two types: lossless and lossy. Lossless compression is a technique to reduce the size of an image while preserving the quality of the image. Lossy compression is a process in which certain portions of the image are discarded in order to give the image an even smaller size. Image compression algorithms perform dozens of operations on each pixel. This results in an extremely heavy load when they run in software. In order to reduce the resource dependency on the Central Processing Unit, the usage of a Field Programmable Gate Array (FPGA) is beneficial for parallelization of operations [1].

An FPGA gives the developer the ability to have the hardware reconfigured based on the requirements of the application. Moreover, the faster time-to-market as compared to an ASIC makes it advantageous to develop on the FPGA. The development of the algorithm is made easier by making use of the High Level Synthesis (HLS) workflow. VIVADO HLS from Xilinx offers major advantages for the development and deployment of applications such as signal processing and image processing. These tools allow hardware-based algorithms to be built and tested using higher-level languages (such as C, C++) before the HDL-based implementation, verification and validation. HLS significantly reduces algorithm development time.

II. RELATED WORK

Various techniques have been proposed by researchers related to compression and decompression. A few of the recent developments in image compression are described here.

Yuecheng Li et al. [2] describe the implementation of image compression using a JPEG baseline encoder. An HLS tool is utilized for system design. It uses an 8x8 DCT algorithm. The pixels are quantized and Huffman coding is then applied. The AC components of pixels are encoded using zig-zag scanning. This paper focuses on very low hardware utilization.

Ahmad Shawahna et al. [3] describe JPEG compression using DCT. The implementation is done in VHDL (VHSIC Hardware Description Language). The paper proposes 5 steps for image compression: Color Space Conversion, Down Sampling, 2-D DCT, Quantization and encoding. A compression ratio of around 82% to 85% was achieved. The paper focuses on designing a parallel architecture.

M. B. Mutgekar et al. [4] propose DCT and Fast DCT using FFT on a Nexys 4 DDR board. The main focus of this paper is the implementation and testing of 2D DCT, but it also proposes that Fast DCT has improved performance.

978-1-6654-0430-3/21/$31.00 ©2021 IEEE
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL. Downloaded on October 11,2022 at 12:50:52 UTC from IEEE Xplore. Restrictions apply.
T. G. Anitha et al. [5] propose image compression techniques using 2D-FFT and IFFT in Matlab. The paper focuses on processing speed. The decompressed images were identical to the source image with quality better than 35dB.

R. Praisline Jasmi et al. [6] propose Huffman coding, Discrete Wavelet Transform (DWT) and fractal coding for image compression. The paper claims that DWT can be used to boost the quality of compressed images, while the fractal algorithm offers better compression ratios (CR) and peak signal to noise ratios (PSNR).

III. IMPLEMENTATION DETAILS AND METHODS

multiplied with the 8x8 block and then with the transpose of the DCT matrix [7].

$$C = D A D^{T} \tag{1}$$

where:
C = resultant matrix of the DCT, which contains values of different frequencies, with the top left being DC and the bottom right being the maximum frequency.
D = DCT matrix, which is calculated using,

$$D_{i,j} = \begin{cases} \dfrac{1}{\sqrt{N}} & \text{if } i = 0 \\ \sqrt{\dfrac{2}{N}} \cos\dfrac{(2j+1)i\pi}{2N} & \text{otherwise} \end{cases}$$
Fig. 4: Values of blocks of Signal Flow Graph of FDCT

Fig. 3 and each block of the signal flow graph is explained using Fig. 4.
• Each resultant matrix is divided by the quantization matrix for lossy compression and then rounded to 0.
• Now, the output matrix from the previous step is multiplied with the quantization matrix.
• After quantization, we again run the FDCT in reversed order to get the block of pixels in compressed form.
• Combine all the 8x8 blocks of compressed pixels to construct the compressed image.

3) Haar Wavelet Transform:
• Divide the image into 8x8 blocks of pixel values.
• From left to right and top to bottom, the Haar wavelet is applied to every such block of pixels. For this, the transpose of the Haar Transform matrix is multiplied with the 8x8 block and then with the Haar Transform matrix.

$$B = H^{T} A H \tag{5}$$

where:
B = Resultant matrix after the Haar transformation is applied.
H = Haar matrix, which is given by,

$$H = \begin{bmatrix}
0.125 & 0.125 & 0.25 & 0 & 0.5 & 0 & 0 & 0 \\
0.125 & 0.125 & 0.25 & 0 & -0.5 & 0 & 0 & 0 \\
0.125 & 0.125 & -0.25 & 0 & 0 & 0.5 & 0 & 0 \\
0.125 & 0.125 & -0.25 & 0 & 0 & -0.5 & 0 & 0 \\
0.125 & -0.125 & 0 & 0.25 & 0 & 0 & 0.5 & 0 \\
0.125 & -0.125 & 0 & 0.25 & 0 & 0 & -0.5 & 0 \\
0.125 & -0.125 & 0 & -0.25 & 0 & 0 & 0 & 0.5 \\
0.125 & -0.125 & 0 & -0.25 & 0 & 0 & 0 & -0.5
\end{bmatrix}$$

where:
X = Final block of compressed 8x8 pixels.
$(H^{T})^{-1}$ = Inverse of the transpose of the Haar Transform matrix.
B = Output matrix of the previous step.
$H^{-1}$ = Inverse of the Haar Transform matrix.
• Combine all the 8x8 blocks of compressed pixels to construct the compressed image.

IV. SIMULATION RESULTS

The simulation results have been obtained using the Vivado HLS C simulation feature, which builds and simulates the code. The text file containing compressed pixel values generated as a result of C simulation is transformed into a JPEG image using Matlab.

A. Image Output

Fig. 5 shows the original and compressed images using various compression algorithms. The original image, man, has dimensions of 1024*1024 and a size of 1024 KB.

Table I shows the image compression results for all three algorithms. Each algorithm is run on four images of different dimensions and sizes to validate the performance of the algorithms. A compression ratio of 80% to 88% was achieved.
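The Haar step of equation (5), B = H^T A H, can be sketched in C as follows. This is an illustration under stated assumptions rather than the synthesized design: the function name `haar_block` is hypothetical, double precision is chosen for readability, and the eighth row of H is assumed to mirror the seventh with its last entry negated, as the Haar pattern requires.

```c
#include <assert.h>

#define N 8 /* block size */

/* Haar matrix H; the eighth row is an assumption (see the lead-in). */
static const double H[N][N] = {
    {0.125,  0.125,  0.25,  0.0,   0.5,  0.0,  0.0,  0.0},
    {0.125,  0.125,  0.25,  0.0,  -0.5,  0.0,  0.0,  0.0},
    {0.125,  0.125, -0.25,  0.0,   0.0,  0.5,  0.0,  0.0},
    {0.125,  0.125, -0.25,  0.0,   0.0, -0.5,  0.0,  0.0},
    {0.125, -0.125,  0.0,   0.25,  0.0,  0.0,  0.5,  0.0},
    {0.125, -0.125,  0.0,   0.25,  0.0,  0.0, -0.5,  0.0},
    {0.125, -0.125,  0.0,  -0.25,  0.0,  0.0,  0.0,  0.5},
    {0.125, -0.125,  0.0,  -0.25,  0.0,  0.0,  0.0, -0.5}
};

/* Compute b = H^T * a * H for one 8x8 block (equation (5)). */
void haar_block(double a[N][N], double b[N][N])
{
    double tmp[N][N];
    assert(a && b);

    /* tmp = H^T * a, where (H^T)[i][k] == H[k][i] */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double acc = 0.0;
            for (int k = 0; k < N; k++)
                acc += H[k][i] * a[k][j];
            tmp[i][j] = acc;
        }
    }

    /* b = tmp * H */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double acc = 0.0;
            for (int k = 0; k < N; k++)
                acc += tmp[i][k] * H[k][j];
            b[i][j] = acc;
        }
    }
}
```

With this H, a block whose pixels all share one value transforms to a matrix whose only nonzero entry is the top-left one, equal to that pixel value; the remaining entries carry the detail that is dropped in lossy compression.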
1) Compression Ratio (CR): This ratio defines the compression achieved for a given image. The equation is as follows:

$$CR = \frac{\text{Size of Uncompressed Image}}{\text{Size of Compressed Image}} \tag{7}$$
2) Mean Square Error (MSE): The MSE of an image represents the cumulative squared error between the compressed image and the original image. The equation is as follows:

$$MSE = \frac{1}{PQ} \sum_{x=1}^{P} \sum_{y=1}^{Q} \left[ f_1(x, y) - f_2(x, y) \right]^2 \tag{8}$$
We have made a comparison of the three algorithms based on each of these reports. We have used the image man.bmp for our analysis.

1) Latency is defined as the number of clock cycles required to produce an output [12].

Fig. 7: BRAM 18K Utilization

The BRAM utilization can be seen increasing in Fig. 7 when accelerators are used in DCT and HAAR, because the pipelining feature of the accelerators increases parallel, same-clock-cycle memory access to different locations, thereby causing an increase in the BRAM utilization.

b) DSP 48E
DSP 48E is the Arithmetic and Logical Unit of the FPGA [12]. It is responsible for complex computations such as addition, subtraction, multiplication and summation.

Fig. 9 depicts the increase of look-up table (LUT) utilization in DCT and HAAR when accelerators are used, as multiple functions are implemented simultaneously. The FDCT, on the other hand, shows decreased LUT use.
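To illustrate how such accelerators are expressed, the sketch below applies two Vivado HLS directives, `PIPELINE` and `ARRAY_PARTITION`, to an 8x8 matrix multiplication of the kind used by the transforms. The function name `matmul8`, the integer data type and the directive settings are assumptions chosen for illustration, not the settings used in this work. A standard C compiler ignores the pragmas, so the code can also be run in plain C simulation.

```c
#include <assert.h>

#define N 8 /* block size */

/* out = a * b for 8x8 blocks, written in the style of an HLS kernel. */
void matmul8(int a[N][N], int b[N][N], int out[N][N])
{
    /* Assumed settings: split each array into four memories so that
       several elements can be read in the same clock cycle. */
#pragma HLS ARRAY_PARTITION variable=a cyclic factor=4 dim=2
#pragma HLS ARRAY_PARTITION variable=b cyclic factor=4 dim=1
    assert(a && b && out);

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            /* Request a new (i, j) iteration every clock cycle; the
               tool unrolls the inner k loop to make this possible. */
#pragma HLS PIPELINE II=1
            int acc = 0;
            for (int k = 0; k < N; k++) {
                acc += a[i][k] * b[k][j];
            }
            out[i][j] = acc;
        }
    }
}
```

Pipelining of this kind demands several reads of `a` and `b` per clock cycle, which is why the arrays are split across more memories. How the tool maps those memories to BRAM, LUTs or registers depends on the array sizes and tool settings.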
d) Flip Flops
The Flip Flop performs the function of basic data input, clock input and output, acting as a latch. It holds the data for a time period beyond a clock cycle [12].

Fig. 10: Flip Flops Utilization

better throughput, better speed but with overall increase in hardware utilization.

The current work is limited to the Vivado HLS platform. Further development of the application can be done on the Vivado Design Suite. The HLS C application can be exported as an Intellectual Property (IP) to the Vivado Design Suite and further development of the hardware design can be done around it. This design can be used to program FPGA hardware.

VI. ACKNOWLEDGMENT

We would like to express our gratitude towards the Electronics Engineering Department of Sardar Patel Institute of Technology for providing us constant guidance and facilities.

REFERENCES

[1] D. Tsiktsiris, D. Ziouzios, and M. Dasygenis, “A High-Level Synthesis Implementation and Evaluation of an Image Processing Accelerator,” Technologies, vol. 7, no. 1, p. 4, Dec. 2018.