net/publication/340375328
4 authors, including:
James Rocco
Lakehead University Thunder Bay Campus
All content following this page was uploaded by Marko Javorac on 02 April 2020.
Abstract—In this project we conceptualize the design of JPEG image compression software that provides three modes of compression for any given image: low, medium and high. The work is carried out in MATLAB, a numerical computing environment, where the designed algorithm is implemented and tested. The expected deliverables include implementations of a fast Fourier transform function and a discrete cosine transform function, as well as a graphical user interface. The project is intended to give the participants a better understanding of the fast Fourier transform and the discrete cosine transform, and to make them more proficient MATLAB programmers.

I. INTRODUCTION

The most common image compression algorithm in use today is JPEG. JPEG stands for 'Joint Photographic Experts Group' and is a type of lossy image compression that allows raw images from cameras to be compressed and stored as smaller files. Compression matters because it makes images easier to transfer over the internet and cheaper to store. This type of compression is called lossy because the algorithm gives up image quality that cannot be recovered in order to shrink the file. The more compression is selected, the greater the loss and the lower the overall quality of the image.

In this project, three forms of image compression will be designed, implemented and tested using Fourier techniques. A Fourier transform is a mathematical operation that takes a function in the time domain and transforms it into the frequency domain. These techniques are mostly used in electronic communications, but they can also be applied to image compression. The advantage is that, in the frequency domain, the image can be broken up into cosine and sine components, which allows components to be removed easily, in turn compressing the image. Along with image compression, a graphical user interface (GUI) will [...] Fourier transform.

II. FOURIER SERIES & TRANSFORMS

Fourier analysis consists of two forms of operation: the Fourier series and the Fourier transform. A Fourier series takes a periodic function and represents it as a trigonometric series of sines and cosines, whereas the Fourier transform handles non-periodic functions. It takes such a function in the time domain and transforms it into the frequency domain. The process can be inverted, which is known as taking the inverse Fourier transform: the frequency-domain representation is converted back to the time domain. In this report we analyze the effects of the Fourier transform rather than the series, and explore the benefits of manipulating the frequency domain for lossy image compression.

A. Discrete Cosine Transform

The Discrete Cosine Transform (DCT) is used in standard JPEG compression. The DCT is derived from the Discrete Fourier Transform, but has a special property that makes it very efficient in comparison with other methods: applied to a matrix, it can be computed by matrix multiplication. This makes it both easy to implement and efficient for a computer to run. One of its key advantages for compression is that it can be implemented with 8x8 matrix multiplications. This makes compression highly efficient in comparison with the Discrete Fourier Transform, and the DCT is much easier to implement than the Fast Fourier Transform algorithm.

The general expression for this matrix transform is D = T*M*T', where T is the transformation matrix, M is an 8x8 block of the image, and T' is the transpose of T. Element by element, the transform of a block is

$$D_{i,j} = \frac{1}{4}\, C(i)\, C(j) \sum_{x=0}^{7} \sum_{y=0}^{7} p(x,y) \cos\left(\frac{(2x+1)\, i\pi}{16}\right) \cos\left(\frac{(2y+1)\, j\pi}{16}\right) \tag{1}$$

where p(x, y) is the pixel value at position (x, y) of the block, and C(u) = 1/\sqrt{2} for u = 0 and C(u) = 1 otherwise. In matrix form:

$$D = T \cdot M \cdot T' \tag{4}$$
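The matrix form D = T·M·T' can be tried on a single block. The following is a minimal MATLAB sketch, not the project's code: it assumes that T is the standard orthonormal DCT-II matrix (its defining formula is not reproduced in this section), and the names N, T, M, D and M_back are illustrative.

```matlab
% Sketch: 8x8 DCT of one block via D = T*M*T'.
% Assumption: T is the standard orthonormal DCT-II matrix.
N = 8;
T = zeros(N);
for u = 0:N-1              % u: frequency index (row of T)
    for x = 0:N-1          % x: sample index (column of T)
        if u == 0
            T(u+1, x+1) = 1 / sqrt(N);
        else
            T(u+1, x+1) = sqrt(2 / N) * cos((2*x + 1) * u * pi / (2*N));
        end
    end
end

% Hypothetical test block: a horizontal ramp, already shifted by -128.
M = repmat(0:N-1, N, 1) * 16 - 128;

D = T * M * T';            % forward transform into the frequency domain
M_back = T' * D * T;       % inverse transform; T is orthogonal, so inv(T) = T'

disp(norm(T * T' - eye(N)));          % orthogonality check, close to zero
disp(max(abs(M_back(:) - M(:))));     % reconstruction error, close to zero
```

Because every row of this test block is identical, all of the energy ends up in the first row of D; the remaining rows are zero up to rounding error. Coefficients like these are what the quantization step later rounds away.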
B. Discrete Fourier Transform

As mentioned before, discrete functions are represented as vectors, and vector-to-vector computations are expressed with matrices. The Discrete Fourier Transform (DFT) is an operation that takes N numbers as input and returns N outputs, where N is the number of points. In image compression the number of points can be very large, which affects how long the computation takes. Because of this disadvantage of the DFT, other algorithms were derived to reduce the computation time of these transforms.

The importance of the discrete Fourier transform stems from its ability to represent a function in terms of frequency, which reveals periodicities in the function. In image compression, this allows repetitive content to be discarded in order to shrink the image. The same idea is used in the derived algorithms.

$$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i 2\pi k n / N} \tag{5}$$

C. Fast Fourier Transform

A fast Fourier transform (FFT) is any of a group of algorithms used to compute the DFT or inverse DFT (IDFT) of a sequence of numbers, usually a signal. As we have seen, Fourier analysis converts a signal from its original domain (time or space) to the frequency domain. The DFT is obtained by decomposing a sequence of values into components of different frequencies, but computing it directly from the definition can take a long time. If N is the number of elements in the sequence, the direct DFT performs on the order of N^2 operations, so the computation time grows quadratically with N. This is where the FFT comes in. An FFT rapidly computes such transformations by factorizing the DFT matrix into a product of sparse (mostly zero) factors. FFT algorithms exploit the symmetries of the exponential term in the DFT, which is periodic. The FFT reduces the N^2 operations to N log(N) operations, which makes a huge difference in computation speed as N increases. There are many different FFT algorithms based on a wide range of published theories, from simple complex-number arithmetic to group theory and number theory.

The simplest algorithm used to implement the FFT is the Cooley-Tukey algorithm. It re-expresses a DFT of composite size N = N1·N2 as N1 smaller DFTs of size N2. The Cooley-Tukey algorithm can be combined with other DFT algorithms, for example Rader's, Bluestein's or the prime-factor algorithm, to deal with prime sizes.

The most common and simplest form of the Cooley-Tukey algorithm is radix-2 Decimation in Time (DIT). The DIT implementation requires N to be a power of 2. The sequence is split into two halves, the elements at even positions and those at odd positions, and the splitting continues until only single-term DFTs remain. For example, for a sequence of N = 8 the splitting takes place 3 times: we start with a single DFT of 8 terms and end with 8 DFTs of 1 term each. The function can be optimized further using the fact that the exponential term is periodic, which halves the number of exponential terms to compute. DIT first computes the even-indexed elements. This algorithm is a good example of a divide-and-conquer algorithm. When drawn as a data flow diagram, the algorithm shows a butterfly pattern.

III. JPEG COMPRESSION ALGORITHM

A. Overview

The general method for JPEG compression using the DCT is as follows.

First, the DCT matrix (see formula 2 in the DCT equations section) must be made. For JPEG compression this matrix must be 8x8 because the JPEG quantization matrix, a standard that has been experimentally fine-tuned, is also 8x8. The transform can therefore be applied to any image as long as the image is segmented into 8x8 pieces that are reattached later.

Then, after the transform has been applied to the entire image, each block is divided by the quantization matrix. The result is rounded to the nearest integer values, which produces a matrix containing zeros. Rounding to obtain these zeros is a key point in the compression process.

Finally, the image is multiplied by the quantization matrix and the inverse DCT is applied to it. The result is a compressed image. It is compressed because the zeros in the matrix hold no data; in other words, the image carries less data even after the remaining elements have been reverted to their original values.

B. Algorithm Implementation

The algorithm that was fully implemented in this project is outlined here:
• Convert the image from RGB to YCbCr format.
• Trim the image so that its dimensions are multiples of 8, so that the 8x8 matrices fit evenly inside the image.
• Initialize the DCT matrix, so that the transform can be applied to the blocks of the image.
• Initialize the quantization matrix, which is needed to compress the image after the DCT has been applied to it.
• Because the image is a three-dimensional matrix, start a for-loop here with a variable ranging from one to three, one iteration for each plane in the matrix. All following steps are inside this loop until stated otherwise.
• Convert the image to double, so that dividing by the quantization matrix yields fractional values.
• Subtract 128 from each element in the matrix, so that the values range from -128 to 127. This is necessary because the DCT is centred about the origin.
• Apply the DCT to the 8x8 blocks of the plane that is currently being considered.
• Divide these transformed 8x8 blocks by the quantization matrix, element by element, with the compression level set for that plane (Y, Cb, or Cr).
• Round the resulting matrix to the nearest integer values. Values with magnitude below 0.5 are rounded to zero, which shrinks the amount of memory the image takes up.
• Now that the values have been rounded, multiply this matrix by the quantization matrix (element by element) to begin undoing the previous steps. The elements that were rounded to zero are not recovered.
• Apply the inverse DCT to the 8x8 blocks of the plane; this takes the image out of the frequency domain and back into the visual domain.
• Paste the 8x8 blocks of the current plane back together.
• Add 128 to the matrix to bring the values back to their original range.
• Initialize a three-dimensional matrix to hold the compressed image, and assign each plane to its respective location in this matrix.
• End the outermost for-loop; all the work on the separate planes has been done.
• Convert the image from YCbCr back to RGB, so that it has its original colours.

IV. COMPRESSION USING FAST FOURIER TRANSFORM

A. Overview

The FFT compression implemented here follows the basic idea of compression. The algorithm converts a three-dimensional (RGB) image matrix into the frequency domain using the FFT. To remove low signals, the algorithm uses a threshold provided by the user and keeps only the signals above it, which makes it act as a high-pass filter. The rationale is that human vision can only perceive frequencies above a certain threshold; on this basis the low frequencies can be removed from the image while keeping most of the details intact. After the low frequencies are filtered out, the image is transformed back into its original domain using the inverse FFT (IFFT).

Fig. 1. Flow Diagram explaining the flow for FFT compression

B. Algorithm Implementation

The program uses a self-implemented FFT function based on radix-2 Decimation in Time (DIT), derived from the Cooley-Tukey algorithm. As stated in the previous sections, DIT only works with signals or sequences whose length can be expressed as 2^n. The program uses a matrix implementation of the algorithm, in which the exponential matrix of the DFT is expressed as a product of three matrices, two of which are fix-up matrices. The general equation for the DFT of a sequence of length N is:

$$X = F_N \cdot x \tag{6}$$

where X is the transform, F_N is the exponential matrix for N terms and x is the sequence or signal. According to the algorithm, F_N can be expressed as:

$$F_N = \begin{bmatrix} I & D \\ I & -D \end{bmatrix} \cdot \begin{bmatrix} F_{N/2} & 0 \\ 0 & F_{N/2} \end{bmatrix} \cdot P \tag{7}$$

where I is the identity matrix of order N/2 and D is the diagonal matrix of exponential terms, with W = e^{-2\pi i/N}:

$$D = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & W & 0 & \cdots & 0 \\ 0 & 0 & W^{2} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & W^{N/2-1} \end{bmatrix} \tag{8}$$

P is the permutation matrix that separates the even and odd positions of the sequence, shown here for N = 4:

$$P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{9}$$

As the equations show, the N-term transform can be split n times, where N = 2^n, until N = 1. This recursion reduces the calculations from N^2 to N log2 N. The algorithm takes a complete image matrix, i.e. all three dimensions, and computes the exponential matrix once for both columns and rows, which saves computation time. The resulting exponential matrices are then used to transform the matrix to the frequency domain. Because the algorithm only works on sequences whose length can be represented as 2^n, we initially pad the image with zeros so that it is suitable for the DIT algorithm. After the image is compressed it is brought back to its original size. The overhead of this implementation is that padding increases the matrix size, which makes the operations more expensive; this is a trade-off with time.

V. GRAPHICAL USER INTERFACE

A. Overview

Our Graphical User Interface (GUI) was designed and developed in MATLAB using the GUIDE utility. The application allows the user to select an image from their local device using the "import image" button. The user then specifies the type of compression desired along with the amount of compression. The FFT method offers 3 levels of compression: High, Medium, and Low. The DCT method allows users to fine-tune their compression using the Luma (Y), Chroma Red (Cr) and Chroma Blue (Cb) scales. The application requires the user to name the compressed file, which is entered in the text box. The application also includes 3 panes that give the user a preview of the original image, the compressed image, and their difference.

B. Layout

The application is divided into 3 major sections: File I/O, Compression Settings, and Image Preview Panes.
C. Image Preview Panes

The image preview panes contain 3 distinct windows that are populated throughout the compression process. The first [...]

Fig. 6. Image View Panes for Original, Compressed & Difference Image

[...] "import image", a file explorer will appear and will allow them to navigate to the image. Once a user selects an image, the text will be updated with the path to the folder so the user can verify their selection. The user will then enter the name of the output file in the text field.

Fig. 7. Image View Pane displaying the original & compressed image along with their difference
Fig. 9. Resulting Images from JPEG Compression

[...] more compression. The two algorithms used in this report are not directly comparable, as they compress a file in two different ways. Both algorithms show that a certain amount of data can be removed from an image without affecting the quality we perceive, since human vision cannot capture it.

VII. FUTURE WORK

Through this project, we had the opportunity to explore the field of image compression and several mathematical formulas that hold great importance in the modern world of computing. During our research on these algorithms, the major issue we faced was optimization. The implemented algorithms were optimized to the best of our capabilities, but there is still a lot of room for further progress. In the future, more complex and faster FFT algorithms can be used to deal with matrices of varied sizes. Moreover, because we are dealing with matrices, parallel computing can be used to speed up matrix multiplication. In our implementation we pad the matrix with zeros in order to save time; in the future, interpolation can be used to increase the size of the matrix without affecting the quality. Since this project introduced us to image compression, compressing video or audio files can be the next challenge. One interesting project would be to analyse properties of objects in a video, for example calculating the velocity of a ball thrown by a robot in a video, or the number of chemicals in a paper just by analysing the colour. Compression has endless applications in today's world, so a lot can be achieved.

VIII. APPENDIX

A. Code snippet displaying JPEG compression

    % compression, one plane at a time
    for plane = 1:3
        CompMat = ModImg(:,:,plane);
        CompMat = double(CompMat) - 128;
        QQ = Q*Quality(plane);
        CompImgMat(:,:,plane) = CompressEachPlane(CompMat,T,QQ);
    end

    % decompression
    for plane = 1:3
        DCompMat = CompImgMat(:,:,plane);
        QQ = Q*Quality(plane);
        ...

        ...
        for j = 1:repx
            % extracting 8x8 matrix for operations
            B = Mat8_operation(A,x,y);
            D = T*B*T';
            DQ = round(D./Q);

            % putting DQ in a big matrix
            Final = PushMatOperation(Final,DQ,x,y);
            y = y+8;
        end
        x = x+8;
    end

    % DCompress each plane
    for i = 1:repy
        y = 1; % variable to hold the y coordinate
        for j = 1:repx
            % extracting 8x8 matrix for operations
            B = Mat8_operation(A,x,y);
            DQ = B.*Q;
            D = (T')*DQ*T;

            % putting D in a big matrix
            Final = PushMatOperation(Final,D,x,y);
            y = y+8;
        end
        x = x+8;
    end

B. Code snippet to calculate the exponential matrix using FFT

    if (n > 0)
        % generating a permutation matrix for splitting
        % the sequence based on even and odd positions
        % this is done due to the fact that the exponential term ...
        P = gen_perm_mat(n);

        % exponential term
        w = exp(-2i*pi/2^n);

        % this statement generates a fix-up matrix which is comprised of
        % 2 identity matrices and 2 diagonal exponential matrices of order (2^n)/2
        ID = gen_ID_mat(n,w);

        % recursion for further splitting the matrix
        F = Gen_FFT_Mat(n-1);

        % this section of code is used to generate the middle matrix
        N = (2^n)/2;
        BigN = 2^n;
        F_final = zeros(2^n,2^n);
        F_final(1:N,1:N) = F;
        F_final(N+1:BigN,N+1:BigN) = F;
        Final = ID*F_final*P;
    else % when n = 0, i.e. only one term left
        Final = 1;
    end
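The factorization behind the exponential-matrix snippet, F_N = [I D; I -D] · blkdiag(F_{N/2}, F_{N/2}) · P, can be checked numerically. The following is a minimal MATLAB sketch under stated assumptions, not the project's code. It builds the matrix bottom-up in a loop rather than recursively. It also constructs the permutation and fix-up matrices inline, because the helpers gen_perm_mat and gen_ID_mat are not listed in this appendix and their exact behaviour is assumed. All variable names are illustrative.

```matlab
% Sketch: build the 2^n-point DFT matrix from the radix-2 DIT factorization
% F_N = [I D; I -D] * blkdiag(F_{N/2}, F_{N/2}) * P and compare it with the
% direct definition and with the built-in fft.
n = 3;                                 % N = 2^n = 8 points
F = 1;                                 % 1-point DFT matrix
for k = 1:n
    N = 2^k;
    h = N / 2;
    W = exp(-2i * pi / N);             % exponential term for this size
    Dk = diag(W .^ (0:h-1));           % diagonal matrix 1, W, ..., W^(h-1)
    ID = [eye(h), Dk; eye(h), -Dk];    % fix-up matrix
    E = eye(N);
    P = E([1:2:N, 2:2:N], :);          % even positions (x_0, x_2, ...) first, then odd
    F = ID * blkdiag(F, F) * P;
end

% Direct DFT matrix from the definition X_k = sum_n x_n * exp(-i*2*pi*k*n/N)
N = 2^n;
[kk, nn] = meshgrid(0:N-1);
F_direct = exp(-2i * pi * kk .* nn / N);

x = (1:N).';                           % arbitrary test signal
disp(norm(F - F_direct));              % close to zero
disp(norm(F * x - fft(x)));            % close to zero
```

Building the full N x N matrix is convenient for checking the algebra, but multiplying by it still costs on the order of N^2 operations per transform. The N log2 N count only applies when the sparse factors are applied one after another.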