Session 10
IDIG4321
Example
We normally apply an FFT shift so that the low-frequency part of the spectrum appears at the center of the image.
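A minimal sketch of the FFT shift using NumPy (the 8 × 8 constant test image is my own example): before the shift the DC (zero-frequency) term sits in the corner at (0, 0); `np.fft.fftshift` moves it to the center of the array.

```python
import numpy as np

# Hypothetical 8x8 test image: constant, so all spectral energy is at DC.
img = np.ones((8, 8))

spectrum = np.fft.fft2(img)
shifted = np.fft.fftshift(spectrum)

# Before the shift, the DC term is at flat index 0 (the corner);
# after fftshift it is at flat index 36, i.e. position (4, 4), the center.
print(np.argmax(np.abs(spectrum)))   # 0
print(np.argmax(np.abs(shifted)))    # 36
```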
Properties
Rotating the image by an angle rotates its Fourier spectrum by the same angle.
Original images are 256 × 256; the Fourier spectra are shown on a logarithmic scale, together with the maximum logarithmic value in each column, moving from the center column to the right.
Original image, the DCT of the image, and the maximum logarithmic DCT value in each column.
Walsh transform
Example
From top right to left:
Original image
H128: R = 0.0077972, P128 = 91%, MSE = 0.028679
H64: R = 0.015564, P64 = 94%, MSE = 0.017166
H32: R = 0.031006, P32 = 96%, MSE = 0.011089
H16: R = 0.061523, P16 = 97.6%, MSE = 0.0076097
H8: R = 0.12109, P8 = 98.8%, MSE = 0.0066097
H4: R = 0.43747, P4 = 99.5%, MSE = 0.0045335
H2: R = 0.74982, P2 = 99.9%, MSE = 0.0030490
1D Haar transform
2D Haar transform
First we compute the horizontal (row-wise) Haar transform of the image; the vertical (column-wise) transform is then applied to the result.
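A small sketch of one level of the 1D and 2D Haar transforms in NumPy (function names are my own); the 2D version follows the horizontal-then-vertical pass described above, and a constant image concentrates all its energy in the single low-low coefficient.

```python
import numpy as np

def haar_1d(x):
    """One level of the 1D Haar transform: normalized sums and
    differences of adjacent sample pairs (orthonormal scaling)."""
    x = np.asarray(x, dtype=float)
    s = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation (low-pass)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (high-pass)
    return np.concatenate([s, d])

def haar_2d(img):
    """One level of the 2D Haar transform: transform every row
    (horizontal pass), then every column of the result (vertical pass)."""
    rows = np.apply_along_axis(haar_1d, 1, img)   # horizontal
    return np.apply_along_axis(haar_1d, 0, rows)  # vertical

img = np.array([[1.0, 1.0], [1.0, 1.0]])
out = haar_2d(img)
print(out)  # [[2. 0.] [0. 0.]] -- all energy in the low-low coefficient
```

Because the transform is orthonormal, the total energy of the coefficients equals the energy of the image.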
2D Haar transform
Image compression
From a mathematical point of view, compression is a transform that maps a set of data with statistical relationships to another set of data without such relationships.
In image processing we have two types of image compression:
Lossless compression: a class of algorithms that allows perfect reconstruction of the original image from the compressed data.
Lossy compression: a class of algorithms that allows us to reconstruct only an approximation of the original image, not the original image itself.
Measures of performance
Compression measures:
Compression ratio = (bits in the original image) / (bits in the compressed image)
Fidelity measures
Mean square error (MSE): Avg[(original − reconstructed)²]
Signal to Noise Ratio (SNR): 10 log10 (Signal Power/Noise power)
Peak signal-to-noise ratio (PSNR): 10 log10 (peak signal power / noise power)
Human Visual System (HVS) based: subjective assessments of image quality.
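The fidelity measures above can be sketched in a few lines of NumPy (the 2 × 2 test images and the 255 peak value are my own assumptions):

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images."""
    diff = np.asarray(original, float) - np.asarray(reconstructed, float)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB, with peak signal power peak**2."""
    return 10.0 * np.log10(peak ** 2 / mse(original, reconstructed))

orig = np.array([[10.0, 20.0], [30.0, 40.0]])
recon = orig + 5.0          # every pixel off by 5 -> MSE = 25
print(mse(orig, recon))     # 25.0
print(psnr(orig, recon))    # 10*log10(255**2 / 25) ~ 34.15 dB
```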
Coding redundancy?
Relative redundancy: R = 1 − 1/C = 1 − 1/4.42 ≈ 0.774
Spatial redundancy?
All 256 intensities are equally probable.
The pixels along each line are identical.
The intensity of each line was selected randomly.
A run-length pair specifies the start of a new intensity and the number of consecutive pixels that have that intensity.
Each 256-pixel line of the original representation is replaced by a single 8-bit intensity value and a run length of 256 in the run-length representation.
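A minimal run-length encoder in plain Python (function name and test values are my own) illustrating the example above: a 256-pixel line of a single intensity collapses to one (intensity, run-length) pair.

```python
def run_length_encode(line):
    """Encode a 1D intensity line as (intensity, run_length) pairs."""
    pairs = []
    for value in line:
        if pairs and pairs[-1][0] == value:
            pairs[-1][1] += 1           # extend the current run
        else:
            pairs.append([value, 1])    # start a new run
    return [tuple(p) for p in pairs]

# A 256-pixel line of constant intensity becomes a single pair.
print(run_length_encode([97] * 256))          # [(97, 256)]
print(run_length_encode([5, 5, 5, 9, 9, 2]))  # [(5, 3), (9, 2), (2, 1)]
```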
Temporal redundancy?
A random event E with probability P(E) is said to contain the following units of information:
I(E) = log(1/P(E)) = −log P(E)
We can use the histogram of the observed image to estimate the symbol probabilities of the
source.
Assuming that p_r(r_k) is the normalized histogram, the intensity source's entropy is equal to:
H = − Σ_{k=0}^{L−1} p_r(r_k) log p_r(r_k)
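The entropy estimate above can be sketched directly from an image histogram (function name and test images are my own; zero-probability bins are dropped, taking 0·log 0 = 0):

```python
import numpy as np

def entropy_from_image(img, levels=256):
    """Estimate source entropy (bits/pixel) from the normalized histogram."""
    hist = np.bincount(np.asarray(img).ravel(), minlength=levels)
    p = hist / hist.sum()
    p = p[p > 0]                       # 0 * log(0) is taken as 0
    return -np.sum(p * np.log2(p))

# Two equally likely intensities -> exactly 1 bit/pixel.
img = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(entropy_from_image(img))  # 1.0

# All 256 intensities equally probable -> 8 bits/pixel.
print(entropy_from_image(np.arange(256, dtype=np.uint8)))  # 8.0
```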
Huffman codes
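As an illustrative sketch of how a Huffman code is built (repeatedly merging the two least-probable subtrees; the function name and test string are my own), using Python's standard heapq:

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table from observed symbol frequencies."""
    freq = Counter(symbols)
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least-probable subtrees
        f2, _, c2 = heapq.heappop(heap)
        # Prefix the lighter subtree's codes with 0, the heavier with 1.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)  # 'a' (most frequent) gets the shortest codeword
```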
Arithmetic coding
Blocking symbols prior to coding can improve coding efficiency
Not practical with Huffman coding
Arithmetic coding allows you to do precisely this
Basic idea: map data sequences to sub-intervals of (0, 1) with lengths equal to the probability of the corresponding sequence.
To encode a given sequence, transmit any number within its sub-interval.
In practice, for images, arithmetic coding gives a 10–20% improvement in compression ratio over a simple Huffman coder; its complexity, however, is 50–300% higher.
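The interval mapping described above can be sketched with exact rational arithmetic (the function name, the two-symbol source, and its probabilities are my own assumptions): each symbol narrows the current interval, and the final width equals the probability of the whole sequence.

```python
from fractions import Fraction

def arithmetic_interval(sequence, probs):
    """Map a symbol sequence to its sub-interval of [0, 1).
    The interval length equals the probability of the sequence."""
    low, width = Fraction(0), Fraction(1)
    symbols = sorted(probs)            # fixed order for cumulative probs
    for s in sequence:
        cum = Fraction(0)
        for t in symbols:              # cumulative probability below s
            if t == s:
                break
            cum += probs[t]
        low += width * cum             # narrow to s's slice of the interval
        width *= probs[s]
    return low, low + width

probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}   # hypothetical source
lo, hi = arithmetic_interval("aab", probs)
print(lo, hi)       # [27/64, 9/16)
print(hi - lo)      # 9/64 = (3/4) * (3/4) * (1/4)
```

Any number inside [27/64, 9/16) identifies the sequence "aab" to a decoder that knows the same probability model.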
Run-length codes
Reference group
Final notes
Acknowledgment
Some of the slides, figures, and images are inspired by and/or used from slides created by:
Professor Azeddine Beghdadi, University of Paris 13.
Dr. Onur G. Guleryuz, Google Daydream R&D team
Professor Gordon Wetzstein, Stanford University
Dr. Frank (Qingzhong) Liu, New Mexico Tech
Dr. Farah Torkamaniazar, Shahid Beheshti University