Discrete Cosine Transform
Transform coding constitutes an integral component of
contemporary image/video processing applications.
Transform coding relies on the premise that pixels in an
image exhibit a certain level of correlation with their
neighboring pixels.
Similarly, in a video transmission system, adjacent pixels in
consecutive frames show very high correlation.
Consequently, these correlations can be exploited to
predict the value of a pixel from its respective neighbors.
The transformation itself is a lossless operation; therefore, the
inverse transformation renders a perfect reconstruction of
the original image.
• Better image processing
– Take into account long-range correlations in space
– Conceptual insights into spatial-frequency information: what it means to be “smooth, moderate change, fast change, …”
• Fast computation
• Alternative representation and sensing
• Efficient storage and transmission
– Energy compaction
– Pick a few “representatives” (basis)
– Just store/send the “contribution” from each basis
Discrete Cosine Transform (DCT) has emerged as
the de facto image transformation in most visual systems.
DCT has been widely deployed by modern video
coding standards, for example, MPEG, JVT, etc.
It belongs to the same family as the Fourier Transform:
• Converts data to the frequency domain
• Represents data via a summation of variable-frequency cosine waves
• Captures only the real (even) components of the function
• Discrete Sine Transform (DST) captures the odd (imaginary) components → not as useful
• Discrete Fourier Transform (DFT) captures both odd and even components → computationally intense
Discrete Fourier Transform
• What is energy compaction?
– is the ability to pack the energy of the
spatial sequence into as few frequency
coefficients as possible
– this is very important for image
compression
– if compaction is high, we only have to
transmit a few coefficients instead of the
whole set of pixels
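As a rough illustration of energy compaction, the sketch below computes the share of total energy held by the first k DCT coefficients of a short signal. It assumes the orthonormal DCT-II definition; the pixel values and the helper names `dct_1d` and `energy_fraction` are hypothetical and chosen only for this example.

```python
import math

def dct_1d(f):
    """Orthonormal DCT-II, computed directly from the cosine sum."""
    n = len(f)
    coeffs = []
    for u in range(n):
        alpha = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        total = sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                    for x in range(n))
        coeffs.append(alpha * total)
    return coeffs

def energy_fraction(coeffs, k):
    """Fraction of the total energy held by the first k coefficients."""
    energy = [c * c for c in coeffs]
    return sum(energy[:k]) / sum(energy)

# Hypothetical row of 8 similar pixel intensities.
signal = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct_1d(signal)

# Print how quickly the cumulative energy approaches 1.
for k in (1, 2, 4, 8):
    print(k, round(energy_fraction(coeffs, k), 5))
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared pixel values, so the fractions can be read as shares of the original signal energy.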
• Compared to DFT, application of DCT results in fewer blocking artifacts due to the even-symmetric extension properties of DCT.
• DCT uses real computations, unlike the complex computations used in DFT.
• DFT is a complex transform and therefore stipulates that both image magnitude and phase information be encoded.
• DCT hardware is simpler than that of DFT.
• DCT provides better energy compaction than DFT for most natural images.
• The implicit periodicity of DFT gives rise to boundary discontinuities that result in significant high-frequency content. After quantization, the Gibbs phenomenon causes the boundary points to take on erroneous values.
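The effect of the DFT's implicit periodicity can be sketched on a simple ramp, whose two ends differ and therefore create a jump when the signal is repeated periodically. The example compares the energy captured by the two largest coefficients under each transform. Both transforms are scaled to be orthonormal so that the energies are comparable; the ramp values and function names are illustrative assumptions.

```python
import cmath
import math

def dct_1d(f):
    """Orthonormal DCT-II, computed directly."""
    n = len(f)
    out = []
    for u in range(n):
        alpha = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        out.append(alpha * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                               for x in range(n)))
    return out

def dft_1d(f):
    """DFT scaled by 1/sqrt(N) so that it preserves energy."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            / math.sqrt(n) for u in range(n)]

def top_k_energy_fraction(coeffs, k):
    """Share of total energy captured by the k largest-magnitude coefficients."""
    energy = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    return sum(energy[:k]) / sum(energy)

# Hypothetical ramp: smooth inside the block, but its ends do not match.
signal = [100 + 10 * n for n in range(8)]

dct_frac = top_k_energy_fraction(dct_1d(signal), 2)
dft_frac = top_k_energy_fraction(dft_1d(signal), 2)
print("DCT top-2 energy share:", dct_frac)
print("DFT top-2 energy share:", dft_frac)
```

For this ramp the DCT share comes out higher, because the DFT has to spend coefficients on the artificial discontinuity at the block boundary.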
Discrete Cosine Transform
– in this example we see the amplitude spectra of the image above under the DFT and DCT
– note the much more concentrated histogram obtained with the DCT
• why is energy compaction important?
– the main reason is image compression
– turns out to be beneficial in other applications
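1D DCT:
\[ C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[ \frac{\pi (2x+1) u}{2N} \right], \quad u = 0, 1, \ldots, N-1 \]
Where:
\[ \alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases} \]
Computed directly, the 1D DCT is O(n^2).
2D DCT:
\[ C(u, v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[ \frac{\pi (2x+1) u}{2N} \right] \cos\left[ \frac{\pi (2y+1) v}{2N} \right] \]
Computed as 1D DCTs over rows and then columns, the 2D DCT of an n × n block is O(n^3).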
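A direct implementation makes the O(n^2) cost visible: each of the N output coefficients is a sum over N input samples. The sketch below assumes the orthonormal DCT-II, C(u) = α(u) Σ f(x) cos[π(2x+1)u / 2N] with α(0) = sqrt(1/N) and α(u) = sqrt(2/N) otherwise. The function names are illustrative, and a fast algorithm would be used in practice.

```python
import math

def alpha(u, n):
    """Normalization factor of the orthonormal DCT-II (assumed definition)."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct_1d(f):
    """Forward DCT: N coefficients, each a sum over N samples -> O(N^2)."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct_1d(c):
    """Inverse DCT: rebuild each sample from all N coefficients."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for u in range(n))
            for x in range(n)]

samples = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]  # arbitrary test values
restored = idct_1d(dct_1d(samples))
print(max(abs(a - b) for a, b in zip(samples, restored)))  # round-trip error
```

The round-trip error is at floating-point noise level, which matches the earlier statement that the transformation by itself is lossless.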
Image compression
• an image compression system has three main
blocks
– you will see that the pixels are highly correlated (i.e. most of the
time they are very similar)
– this is just a consequence of the fact that surfaces are smooth
• the transform eliminates these correlations
– this is best seen by considering the 2-pt transform
– note that the first coefficient is always the DC-value (up to a scale factor)
\[ X[0] = x[0] + x[1] \]
– an orthogonal transform can be written in matrix form as
\[ X = Tx, \quad T^T T = I \]
– i.e. T has orthogonal columns
– this means that (again up to a scale factor)
\[ X[1] = x[0] - x[1] \]
– note that if x[0] is similar to x[1], then X[1] ≈ 0
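A small numeric sketch of the 2-pt transform, assuming the normalization 1/sqrt(2) that makes T orthogonal (the scale factor is not legible in the slide text). The pixel values are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)
# Orthonormal 2-point transform: first row sums, second row differences.
T = [[S, S],
     [S, -S]]

def apply(matrix, vec):
    """Matrix-vector product for small lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

x = [100.0, 102.0]   # two similar neighboring pixels (hypothetical values)
X = apply(T, x)

print("X[0] (scaled sum):       ", X[0])
print("X[1] (scaled difference):", X[1])
print("energy before:", x[0] ** 2 + x[1] ** 2)
print("energy after: ", X[0] ** 2 + X[1] ** 2)
```

Almost all of the energy moves into X[0], while X[1] is close to zero, which is the sense in which the transform removes the correlation between the two pixels.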
• a second advantage of working in
the frequency domain
– is that our visual system is less sensitive
to distortion around edges
– the transition associated with the edge
masks our ability to perceive the noise
– e.g. if you blow up a compressed picture,
it is likely to look like this
– in general, the compression errors are more annoying in the smooth image regions
• three JPEG examples
• the transform
– does not save any bits
– does not introduce any distortion
• both become possible only when we throw away information
• this is called “lossy compression” and is implemented by the quantizer
• what is a quantizer?
– think of the round() function, that rounds to the
nearest integer
– round(1) = 1; round(0.55543) = 1; round(0.0000005) = 0
– instead of an infinite range between 0 and 1
(infinite number of bits to transmit)
– the output is zero or one (1 bit)
– we threw away all the stuff in between, but saved
a lot of bits
– a quantizer does this less drastically
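The rounding idea generalizes to a uniform quantizer with step size Q: divide by Q, round, and multiply back on the decoder side. The sketch below is one minimal way to write it; the names `quantize` and `dequantize` are illustrative. It rounds half up with `math.floor`, because Python's built-in `round()` rounds exact halves to the nearest even integer.

```python
import math

def quantize(value, q):
    """Map a real value to an integer index using step size q."""
    return math.floor(value / q + 0.5)

def dequantize(index, q):
    """Reconstruct the representative value for an index."""
    return index * q

# With q = 1 this behaves like the round() examples in the text.
print(quantize(1.0, 1.0), quantize(0.55543, 1.0), quantize(0.0000005, 1.0))

# A larger step throws away more detail but needs fewer distinct indices.
for value in (37.0, 41.9, -12.3):
    index = quantize(value, 10.0)
    print(value, "->", index, "->", dequantize(index, 10.0))
```

The reconstruction error never exceeds Q/2, so the step size directly trades bits against distortion.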
Quantizer
• note that we can quantize some frequency coefficients
more heavily than others by simply increasing Q
• this leads to the idea of a quantization matrix
• we start with an image block (e.g. 8x8 pixels)
• next we apply a transform (e.g. 8x8 DCT)
[Figure: image block and its DCT]
• and quantize with a varying Q
[Figure: DCT block and quantization matrix (Q mtx)]
• note that higher frequencies are quantized more heavily
[Figure: Q mtx, entries growing with increasing frequency]
• this saves a lot of bits, but we no longer have an exact replica of the original image block
[Figure: DCT and quantized DCT]
• note, however, that visually the blocks are not very different
[Figure: original and decompressed blocks]
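The block pipeline described in these slides (8x8 DCT, quantization with a frequency-dependent Q, reconstruction) can be sketched as follows. The image block and the quantization matrix are made up for illustration; in particular, `q_mtx` is a hypothetical matrix whose step grows with frequency, not the table of any standard.

```python
import math

N = 8

def alpha(u):
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct_1d(f):
    return [alpha(u) * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
                           for x in range(N)) for u in range(N)]

def idct_1d(c):
    return [sum(alpha(u) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
                for u in range(N)) for x in range(N)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def apply_2d(block, transform_1d):
    """Apply a 1D transform to every row, then to every column."""
    rows = [transform_1d(row) for row in block]
    return transpose([transform_1d(col) for col in transpose(rows)])

# Hypothetical smooth 8x8 block (a gentle brightness gradient).
block = [[100 + 5 * i + 3 * j for j in range(N)] for i in range(N)]
# Hypothetical quantization matrix: larger steps at higher frequencies.
q_mtx = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

coeffs = apply_2d(block, dct_1d)
indices = [[math.floor(coeffs[u][v] / q_mtx[u][v] + 0.5) for v in range(N)]
           for u in range(N)]
dequantized = [[indices[u][v] * q_mtx[u][v] for v in range(N)] for u in range(N)]
decoded = apply_2d(dequantized, idct_1d)

kept = sum(1 for row in indices for q in row if q != 0)
max_err = max(abs(decoded[i][j] - block[i][j]) for i in range(N) for j in range(N))
print("nonzero quantized coefficients:", kept, "of", N * N)
print("largest pixel error:", round(max_err, 2))
```

Only a handful of quantized coefficients are nonzero for a smooth block like this, which is where the bit savings come from, and the decoded block differs from the original by a bounded error.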
Energy compaction
[Figure: sequence x[n] and its 1D-DCT y[n]]
2D DCT by successive 1D-DCTs:
• 1) create an intermediate sequence by computing the 1D-DCT of the rows: x[n1, n2] → f[k1, n2]
• 2) compute the 1D-DCT of the columns: f[k1, n2] → C_x[k1, k2]
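The two-step procedure can be checked against a direct evaluation of the 2D cosine double sum. The sketch below assumes the orthonormal DCT-II and uses a small 4x4 block of arbitrary values; all function names are illustrative.

```python
import math

def alpha(u, n):
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def basis(u, x, n):
    """Sampled cosine basis function of the DCT-II."""
    return math.cos(math.pi * (2 * x + 1) * u / (2 * n))

def dct_1d(f):
    n = len(f)
    return [alpha(u, n) * sum(f[x] * basis(u, x, n) for x in range(n))
            for u in range(n)]

def dct_2d_separable(block):
    """Step 1: 1D-DCT of every row. Step 2: 1D-DCT of every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dct_2d_direct(block):
    """Direct double sum over all pixels for every coefficient."""
    n = len(block)
    return [[alpha(u, n) * alpha(v, n) *
             sum(block[x][y] * basis(u, x, n) * basis(v, y, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

block = [[7.0, 2.0, 9.0, 4.0],
         [1.0, 8.0, 3.0, 6.0],
         [5.0, 5.0, 0.0, 2.0],
         [9.0, 1.0, 7.0, 3.0]]  # arbitrary test values

a = dct_2d_separable(block)
b = dct_2d_direct(block)
max_diff = max(abs(a[u][v] - b[u][v]) for u in range(4) for v in range(4))
print("largest difference between the two methods:", max_diff)
```

Both routes give the same coefficients up to rounding, but the separable version needs on the order of n^3 multiplications for an n × n block instead of n^4.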
Decorrelation
Energy Compaction
Separability
Symmetry
Orthogonality
The principal advantage of image transformation is the
removal of redundancy between neighboring pixels. This
leads to uncorrelated transform coefficients, which
can be encoded independently.
The amplitude of the autocorrelation after the DCT
operation is very small at all lags. Hence, it can be
inferred that DCT exhibits excellent decorrelation
properties.
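One way to see decorrelation numerically is to compare covariance matrices before and after the transform. The sketch below is an assumption-laden stand-in for the autocorrelation plots: it draws blocks from a first-order autoregressive model with neighbor correlation 0.95 (a common toy model for image rows, not data from these slides) and measures how much of the squared covariance lies off the diagonal.

```python
import math
import random

N = 8
RHO = 0.95  # assumed correlation between neighboring samples

def dct_1d(f):
    out = []
    for u in range(N):
        a = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
        out.append(a * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
                           for x in range(N)))
    return out

def covariance(samples):
    """Sample covariance matrix of a list of length-N vectors."""
    count = len(samples)
    mean = [sum(s[i] for s in samples) / count for i in range(N)]
    cov = [[0.0] * N for _ in range(N)]
    for s in samples:
        d = [s[i] - mean[i] for i in range(N)]
        for i in range(N):
            for j in range(N):
                cov[i][j] += d[i] * d[j] / count
    return cov

def off_diagonal_fraction(cov):
    """Share of squared covariance that is not on the diagonal."""
    total = sum(c * c for row in cov for c in row)
    diag = sum(cov[i][i] ** 2 for i in range(N))
    return (total - diag) / total

random.seed(0)
blocks = []
for _block in range(2000):
    x = [random.gauss(0.0, 1.0)]
    for _step in range(N - 1):
        x.append(RHO * x[-1] + math.sqrt(1 - RHO ** 2) * random.gauss(0.0, 1.0))
    blocks.append(x)

pixel_frac = off_diagonal_fraction(covariance(blocks))
dct_frac = off_diagonal_fraction(covariance([dct_1d(b) for b in blocks]))
print("off-diagonal share, pixel domain:", round(pixel_frac, 3))
print("off-diagonal share, DCT domain:  ", round(dct_frac, 3))
```

In the pixel domain most of the covariance sits off the diagonal, while in the DCT domain it is concentrated on the diagonal, meaning the coefficients are nearly uncorrelated.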
Efficiency of a transformation scheme can be directly gauged
by its ability to pack input data into as few coefficients as
possible. This allows the quantizer to discard coefficients
with relatively small amplitudes without introducing
visual distortion in the reconstructed image. DCT exhibits
excellent energy compaction for highly correlated
images.
The principal advantage of separability is that C(u, v) can be computed
in two steps by successive 1-D operations on the rows
and columns of an image.
A separable and symmetric transform can be expressed in
the form
\[ T = A f A, \]
where A is an N × N symmetric transformation matrix
with entries a(i, j). The inverse transformation is
\[ f = A^{-1} T A^{-1}. \]
For an orthogonal transform,
\[ A^{-1} = A^{T}. \]
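The matrix form can be tried out with an explicit DCT matrix. In this sketch the entries are assumed to be a(i, j) = α(i) cos[π(2j+1)i / 2N], i.e. the rows of A are the orthonormal DCT-II basis vectors. With that choice A is orthogonal but not itself a symmetric matrix, so the sketch writes the transposes explicitly: T = A f A^T and f = A^T T A.

```python
import math

def dct_matrix(n):
    """A[i][j] = alpha(i) * cos(pi * (2j + 1) * i / (2n)) (assumed entries)."""
    rows = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([a * math.cos(math.pi * (2 * j + 1) * i / (2 * n))
                     for j in range(n)])
    return rows

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [list(col) for col in zip(*m)]

N = 4
A = dct_matrix(N)
At = transpose(A)
f = [[float(4 * i + j) for j in range(N)] for i in range(N)]  # arbitrary image block

T = matmul(matmul(A, f), At)        # forward transform
f_back = matmul(matmul(At, T), A)   # inverse, using A^-1 = A^T

AAt = matmul(A, At)
identity_err = max(abs(AAt[i][j] - (1.0 if i == j else 0.0))
                   for i in range(N) for j in range(N))
roundtrip_err = max(abs(f_back[i][j] - f[i][j])
                    for i in range(N) for j in range(N))
print("deviation of A A^T from identity:", identity_err)
print("round-trip reconstruction error: ", roundtrip_err)
```

Because the inverse is just a transpose, no matrix inversion is needed to reconstruct the block.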
• Image Processing
– Compression - Ex.) JPEG
– Scientific Analysis - Ex.) Radio Telescope Data
• Audio Processing
– Compression - Ex.) MPEG Layer 3, aka MP3
• Scientific Computing / High Performance Computing (HPC)
– Partial Differential Equation Solvers
Truncation of higher spectral coefficients
results in blurring of the images, especially
wherever the details are high.
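The blurring effect can be reproduced in one dimension on a sharp edge. The sketch below assumes the orthonormal DCT-II, keeps only the lowest k coefficients of a hypothetical step signal, and reconstructs it; the helper `keep_low` is an illustrative name.

```python
import math

def alpha(u, n):
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct_1d(f):
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                              for x in range(n)) for u in range(n)]

def idct_1d(c):
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for u in range(n)) for x in range(n)]

def keep_low(coeffs, k):
    """Zero out every coefficient above the first k (truncation)."""
    return coeffs[:k] + [0.0] * (len(coeffs) - k)

# Hypothetical sharp edge: dark on the left, bright on the right.
signal = [0.0] * 4 + [100.0] * 4
coeffs = dct_1d(signal)

exact = idct_1d(coeffs)
blurred = idct_1d(keep_low(coeffs, 2))

edge_jump_original = signal[4] - signal[3]
edge_jump_blurred = blurred[4] - blurred[3]
print("reconstruction from 2 coefficients:", [round(v, 1) for v in blurred])
print("jump across the edge:", edge_jump_original, "->", round(edge_jump_blurred, 1))
```

With only two coefficients the edge turns into a gradual slope, while the flat regions on either side are also disturbed; keeping all coefficients restores the step exactly.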