IP Exercises 2024 Ex5

5 Compression

5.1 In general
Image compression is an operation that aims to minimise the number of bits
needed to represent an image digitally. The advantages of a more compact
image representation are obvious: more efficient transmission and storage.
Images can be compressed by exploiting the correlation between neighbouring
pixels and by using some properties of the human visual system. Image
compression can be divided into two domains: compression with quality loss
(lossy) and without any loss of quality (lossless).

5.2 Lossy compression


Lossy compression means that some distortion in the reconstruction of the
compressed image is tolerated in order to attain higher compression ratios.
The most important group of lossy compression algorithms is based on an
image transformation followed by quantization and entropy coding. The next
figure shows this typical structure:

[Figure: coder structure: transformation or decomposition -> quantisation -> reordering of the coefficients -> bit allocation / entropy coding]

In a first step the image is transformed to another representation with the help
of an image transformation.

The main idea behind an image transformation is that we decompose the
image, with dimensions M x N, into a linear combination of M x N basis images,
and that the coefficients of that linear combination once again form an image
that we shall call the transformed image. Formulated mathematically this
becomes:
$$\mathrm{image}(x,y) \;=\; \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \mathrm{coeff}(i,j) \cdot b_{ij}(x,y) \;=\; \sum_{i=0}^{MN-1} \mathrm{coeff}(i) \cdot b_i(x,y)$$
image(x,y) is the pixel on position (x,y) in the original image and bij(x,y) the
pixel on position (x,y) in basis image bij. The coeff(i,j) form a new image with
the same dimensions as the original image. In this case there is no loss of
information and we can retrieve the original image by applying the inverse
transformation on the transformed image.
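
For instance (a minimal Matlab sketch, assuming the Image Processing Toolbox and that lenagray.tif is on the path; the DCT is used here only as an example of such a transformation), applying a forward transform and its inverse reproduces the image up to round-off:

    img = double(imread('lenagray.tif'));    % original image
    T   = dct2(img);                         % transformed image (the coefficients)
    rec = idct2(T);                          % inverse transformation
    disp(max(abs(rec(:) - img(:))));         % round-off error only: no information is lost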

An image transformation is used as a basis for a compression algorithm for two
reasons. First of all, the image may be represented with fewer than M x N
coefficients in the transformed domain without serious visual consequences. A
second reason is that the coefficients of the transformation can have a
probability distribution that is strongly non-uniform, which makes efficient
entropy coding possible.
The second step is the quantization step. This is the step that introduces loss.
The most important part of quantization is the representation of the
coefficients of the transformation (continuous variables) by a finite set of values
that come as close as possible to the coefficients. Formally, quantization is
defined as follows: X is a real, continuous stochastic variable. A quantizer is an
operator that transforms X to Y = Q(X) so that:

$$a_{i-1} < X \le a_i \;\Leftrightarrow\; Y = \gamma_i$$

The coefficients a_i form the quantization thresholds, the values γ_i are called the
reconstruction levels. The quantization is uniform if the quantization
thresholds are at equal distance from each other:

$$\forall i: \; a_i - a_{i-1} = \Delta$$
The quantization module in the coder does more than convert continuous
coefficients to discrete symbols. During the quantization step, coefficients that
aren’t significant for the visual quality of the image are eliminated.
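
A minimal Matlab sketch of a uniform quantizer (the step size Δ = 16 and the mid-point reconstruction levels γ_i are illustrative assumptions, not values prescribed by the text):

    delta = 16;                           % quantization step Delta (example value)
    X = 100 * randn(1, 10);               % some continuous coefficients
    idx = ceil(X / delta);                % interval index i such that a_{i-1} < X <= a_i, with a_i = i*delta
    Y = (idx - 0.5) * delta;              % reconstruction level gamma_i: the mid-point of the interval
    disp([X; Y]);                         % original coefficients and their quantized values
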
The next step in the scheme is the reordering of the quantized coefficients so
that they would be more efficient to encode (e.g.: a longer series of the same
symbol with run-length coding), or to support a certain functionality, like
spatial scalability.
The last step is bit allocation. Typical algorithms for bit allocation are
arithmetic coding, Huffman encoding and run-length encoding. This step is
normally lossless.
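
To make the reordering and coding steps concrete, here is a small Matlab sketch of run-length coding applied to a reordered, quantised coefficient sequence (the input vector is made up for the illustration; the subsequent Huffman or arithmetic coding of the (value, run length) pairs is not shown):

    q = [35 7 0 0 0 -2 0 0 0 0 0 1 0 0 0 0];            % reordered, quantised coefficients (example)
    runs = [];                                           % rows of [value, run length]
    k = 1;
    while k <= numel(q)
        runLen = find([q(k:end) ~= q(k), true], 1) - 1;  % length of the run starting at position k
        runs = [runs; q(k), runLen];                     % store the (value, run length) pair
        k = k + runLen;
    end
    disp(runs);                                          % here: 35x1, 7x1, 0x3, -2x1, 0x5, 1x1, 0x4
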
5.3 Performance of image compression techniques
To compare the performance of lossless compression codecs, they are tested
on a number of images, where the compression ratio or the bit-rate is used as a
measure. The compression ratio (CR) is the ratio between the number of bits in
the original image file and the number of bits in the compressed bitstream. The
bit-rate states how many bits are assigned on average in the compressed
bitstream per pixel in the image.
$$\mathrm{CR} = \frac{\text{number of bits in original image}}{\text{number of bits in compressed bitstream}}$$

$$\text{bit-rate} = \frac{\text{number of bits in compressed bitstream}}{\text{number of pixels}}$$
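
As a numerical illustration (the numbers are made up): a 512 x 512 greyscale image with 8 bits per pixel contains 512 · 512 · 8 = 2 097 152 bits. If the compressed bitstream contains 262 144 bits, then CR = 2 097 152 / 262 144 = 8 and the bit-rate is 262 144 / (512 · 512) = 1 bit per pixel.
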
Lossy compression codecs are compared by looking at the quality of the
reconstructed image.
To objectively quantify this quality, the PSNR (peak signal-to-noise ratio) is
used. This is defined as:
$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathit{MAX}^2}{\sigma_e^2}, \qquad \sigma_e^2 = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - x_i')^2$$
MAX is the maximum grey value that can be achieved in the image (255 for 8
bit, 32767 for 16 bit), σ_e² is the variance of the error between the original and
reconstructed image, x_i the ith grey value in the original image, x'_i the ith grey
value in the reconstructed image and N the number of pixels that are
considered. It boils down to a comparison between the original and the
reconstructed image.
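
A minimal Matlab sketch of this computation (the names img and rec for the original and reconstructed image, and MAX = 255 for an 8-bit image, are assumptions for the illustration):

    err     = img(:) - rec(:);                 % per-pixel error between original and reconstruction
    sigma2e = mean(err.^2);                    % error variance sigma_e^2
    psnrdB  = 10 * log10(255^2 / sigma2e);     % PSNR in dB, with MAX = 255 for an 8-bit image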

5.4 DCT-based compression


The DCT is a frequency decomposition. In the transformed block the
coefficients are ordered from the DC coefficient in the upper-left corner to
the high-frequency coefficients as we advance towards the lower-right part.
As the human eye is less sensitive to higher frequencies, the high-frequency
components can be coded with less precision or even be omitted without
much effect on the image quality.
[Figure: JPEG compression. The image is divided into blocks of 8x8 pixels; the DCT gives a frequency decomposition of each block; the coefficients are quantised per block (high-frequency coefficients are less important visually because of the eye's sensitivity, so during the quantisation step they are represented with less precision); the quantised coefficients are then encoded with run-length coding and Huffman encoding. The basis images of the transformation are also shown.]

The two-dimensional DCT of an MxN matrix is defined as:


Forward transformation:
$$\mathrm{DCT}(i,j) = c(i)\, c(j) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \mathrm{image}(x,y) \cos\!\left(\frac{\pi(2x+1)\, i}{2M}\right) \cos\!\left(\frac{\pi(2y+1)\, j}{2N}\right)$$

$$c(i) = \begin{cases} \dfrac{1}{\sqrt{M}}, & i = 0 \\[4pt] \sqrt{\dfrac{2}{M}}, & 1 \le i \le M-1 \end{cases} \qquad c(j) = \begin{cases} \dfrac{1}{\sqrt{N}}, & j = 0 \\[4pt] \sqrt{\dfrac{2}{N}}, & 1 \le j \le N-1 \end{cases}$$
Inverse transformation:
$$\mathrm{image}(x,y) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} c(i)\, c(j)\, \mathrm{DCT}(i,j) \cos\!\left(\frac{\pi(2x+1)\, i}{2M}\right) \cos\!\left(\frac{\pi(2y+1)\, j}{2N}\right)$$

with the same c(i) and c(j) as in the forward transformation.
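
As a sanity check (a sketch, assuming the Image Processing Toolbox), the forward formula can be implemented directly and compared with Matlab's dct2 on an 8 x 8 block; the two results agree up to round-off:

    M = 8; N = 8;
    block = rand(M, N);                            % arbitrary 8x8 test block
    D = zeros(M, N);
    for i = 0:M-1
        ci = (i == 0) / sqrt(M) + (i > 0) * sqrt(2/M);
        for j = 0:N-1
            cj = (j == 0) / sqrt(N) + (j > 0) * sqrt(2/N);
            s = 0;
            for x = 0:M-1
                for y = 0:N-1
                    s = s + block(x+1, y+1) * cos(pi*(2*x+1)*i/(2*M)) ...
                                            * cos(pi*(2*y+1)*j/(2*N));
                end
            end
            D(i+1, j+1) = ci * cj * s;             % DCT(i,j) according to the formula above
        end
    end
    ref = dct2(block);
    disp(max(abs(D(:) - ref(:))));                 % round-off only (order 1e-14)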

5.4.1 Exercise 20
Load the greyscale image lenagray.tif and convert it to double format. Look at
the image. Split it up in blocks of 8 by 8 pixels and execute the DCT on each
block (function: dct2). Block-based operations in Matlab are best executed
with blkproc. Check what the effect is when you set a part of the high-frequency
coefficients to zero. Do this in four steps where each time you set more
coefficients to zero. Start by setting only the highest frequency coefficients to
zero and end up with keeping only the DC components. Use masking for this
operation (multiplication with a matrix with only ones on the positions of the
coefficients you want to keep and zeros on the other positions). Each time
execute the IDCT (function idct2) and compare the image with the original.
Write a program to compute the PSNR and execute it for every case. What can
you conclude about the influence of setting the high-frequency components to
zero? What kind of distortions occur if you retain only a few coefficients?
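
As a starting point, a minimal sketch of one of the four steps (assuming the Image Processing Toolbox; the 4 x 4 mask is just one example of which coefficients to keep):

    img  = double(imread('lenagray.tif'));
    mask = zeros(8);  mask(1:4, 1:4) = 1;                % keep only the 4x4 low-frequency corner
    T    = blkproc(img, [8 8], @(b) mask .* dct2(b));    % DCT per 8x8 block, then masking
    rec  = blkproc(T, [8 8], @idct2);                    % IDCT per 8x8 block
    err  = img(:) - rec(:);
    psnrdB = 10 * log10(255^2 / mean(err.^2));           % PSNR of the reconstruction
    imshow(rec, [0 255]); title(sprintf('PSNR = %.2f dB', psnrdB));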

5.5 Wavelet transformation and compression


The picture below shows the execution of one level of the wavelet
transformation. The operation can be summarised as the execution of a
filtering on the rows and columns of the image followed by a dyadic subsampling.
Just as the DCT, the wavelet transformation is a frequency decomposition: the
detail subbands that come from the first decomposition step contain the
information of the image with the highest frequency content.
The detail subbands on the second level contain less high-frequency
information, etc. The LL subband (lowpass image) consists of the
low-frequency information. Further, we notice that each detail subband contains
high-frequency details associated with a certain direction: horizontal details,
vertical details and diagonal details. Research has shown that the diagonal
details have a smaller impact on the appearance of the image.
[Figure: two levels of the wavelet decomposition: filtering on the rows and columns with subsampling yields a low-resolution version of the image (low-frequency information) together with detail subbands; the first level contains the highest-frequency details, the second level lower-frequency details.]

We can also see that the detail subbands of natural images don't contain a lot of
information. By removing the coefficients of the high-frequency subbands or
quantising them more coarsely, efficient lossy compression can be achieved.
Lossless compression becomes possible by exploiting the inter- or intra-
subband correlation between the coefficients with a form of entropy
coding.

5.5.1 Exercise 21
Load the image lenagray.tif and convert it to double format. Execute a wavelet
decomposition with 3 levels. Use the function wavedec2 with filtertype bior4.4.
Derive all the subbands from C and S (see help wavedec2) with the use of the
functions appcoef2 and detcoef2. Display the subbands with myshow.m (take
64 as second parameter for the LL subband and 128 for all other subbands).
Can you detect the different orientations of the details? The orientations
depend on the sequence in which the filtering is performed. Explain. See what
happens when you set all the coefficients of the HH1 subband to zero (in
the C matrix) and execute the wavelet reconstruction (command waverec2).
Compare the result with the original. Is there a lot of difference? Repeat the
previous steps and successively set all coefficients from
1) HH1 LH1 HL1
2) HH1 LH1 HL1 HH2
3) HH1 LH1 HL1 HH2 HL2 LH2
4) HH1 LH1 HL1 HH2 HL2 LH2 HH3
5) HH1 LH1 HL1 HH2 HL2 LH2 HH3 LH3 HL3
to zero and check the results. What kind of errors do you see in the last
cases between the original and the reconstructed image? Calculate the
PSNR for each case and compare.
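
As a starting point, a minimal sketch for the first variant, zeroing only HH1 (assuming the Wavelet Toolbox; the indexing relies on the subband ordering documented for wavedec2, in which the level-1 diagonal detail is stored last in C):

    img = double(imread('lenagray.tif'));
    [C, S] = wavedec2(img, 3, 'bior4.4');           % 3-level wavelet decomposition
    LL3 = appcoef2(C, S, 'bior4.4', 3);             % approximation (LL) subband
    [H1, V1, D1] = detcoef2('all', C, S, 1);        % horizontal, vertical and diagonal (HH1) level-1 details
    % myshow(LL3, 64); myshow(D1, 128);             % display with the course helper myshow.m

    n1 = prod(S(end-1, :));                         % number of coefficients in each level-1 detail subband
    Cz = C;
    Cz(end-n1+1:end) = 0;                           % HH1 (diagonal, level 1) is stored last in C
    rec = waverec2(Cz, S, 'bior4.4');               % reconstruction without HH1
    err = img(:) - rec(:);
    psnrdB = 10 * log10(255^2 / mean(err.^2));      % compare with the original via the PSNR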
