
Discrete Cosine Transform

Transform coding is an integral component of contemporary image and video processing applications.
Transform coding relies on the premise that pixels in an image exhibit a certain level of correlation with their neighboring pixels.
Similarly, in a video transmission system, corresponding pixels in consecutive frames show very high correlation.
Consequently, these correlations can be exploited to predict the value of a pixel from its respective neighbors.
Transformation is a lossless operation; therefore, the inverse transformation renders a perfect reconstruction of the original image.
Better image processing
 Takes into account long-range correlations in space
 Gives conceptual insight into spatial-frequency information: what it means to be "smooth", "moderate change", "fast change", …
Fast computation
Alternative representation and sensing
Efficient storage and transmission
Energy compaction
 Pick a few "representatives" (basis functions)
 Just store/send the "contribution" from each basis function
 The Discrete Cosine Transform (DCT) has emerged as the image transform of choice in most visual systems.
 The DCT has been widely deployed by modern video coding standards, for example MPEG, JVT, etc.
 It is in the same family as the Fourier Transform.
 It converts data to the frequency domain.
 It represents data via a summation of cosine waves of varying frequencies.
 It captures only the real (even) components of the function.
 The Discrete Sine Transform (DST) captures the odd (imaginary) components → not as useful.
 The Discrete Fourier Transform (DFT) captures both odd and even components → computationally intense.
Discrete Fourier Transform
• due to its computational efficiency, the DFT is very popular
• however, it has strong disadvantages for some applications
 – it is complex
 – it has poor energy compaction
• What is energy compaction?
 – it is the ability to pack the energy of the spatial sequence into as few frequency coefficients as possible
 – this is very important for image compression
 – if compaction is high, we only have to transmit a few coefficients instead of the whole set of pixels
 As compared to the DFT, application of the DCT results in fewer blocking artifacts, due to the even-symmetric extension properties of the DCT.
 The DCT uses real computations, unlike the complex computations used in the DFT.
 The DFT is a complex transform and therefore stipulates that both image magnitude and phase information be encoded.
 DCT hardware is simpler, as compared to that of the DFT.
 The DCT provides better energy compaction than the DFT for most natural images.
 The implicit periodicity of the DFT gives rise to boundary discontinuities that result in significant high-frequency content. After quantization, the Gibbs phenomenon causes the boundary points to take on erroneous values.
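The compaction difference can be checked numerically. Below is a small sketch (plain Python, with direct O(N²) transforms for clarity, not production code): it measures what fraction of a smooth ramp's energy lands in the two largest DFT versus DCT coefficients. The ramp input and the 0.99/0.85 thresholds are illustrative assumptions, not values from the slides.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT (stands in for an FFT in this small demo)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dct(x):
    """Direct O(N^2) orthonormal DCT-II."""
    N = len(x)
    return [(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)) *
            sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N)) for n in range(N))
            for k in range(N)]

x = [float(n) for n in range(8)]      # a smooth ramp: highly correlated samples
F = [abs(c) ** 2 for c in dft(x)]     # per-coefficient energy under the DFT
C = [c ** 2 for c in dct(x)]          # per-coefficient energy under the DCT

dft_top2 = sum(sorted(F, reverse=True)[:2]) / sum(F)
dct_top2 = sum(sorted(C, reverse=True)[:2]) / sum(C)
```

For this ramp, the two largest DCT coefficients hold over 99% of the energy, while the two largest DFT coefficients hold noticeably less — the compaction gap described above.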
Discrete Cosine Transform
– in this example we see the amplitude spectra of the image above under the DFT and the DCT
– note the much more concentrated histogram obtained with the DCT
• why is energy compaction important?
 – the main reason is image compression
 – it also turns out to be beneficial in other applications
 1D DCT:

C(u) = α(u) Σ_{x=0}^{N−1} f(x) cos[ (2x+1)uπ / 2N ],  u = 0, 1, …, N−1

Where:

α(u) = √(1/N) for u = 0,  α(u) = √(2/N) for u ≠ 0

The direct 1D DCT is O(N²).

 2D DCT:

C(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[ (2x+1)uπ / 2N ] cos[ (2y+1)vπ / 2N ]

 Where α(u) and α(v) are defined as in the 1D case.

 The 2D DCT is O(N³) when computed separably (rows, then columns).
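The 1D formula can be transcribed directly into code; a minimal sketch of the O(N²) evaluation with the α(u) normalization above:

```python
import math

def dct_1d(f):
    """C(u) = alpha(u) * sum_x f(x) * cos((2x+1)*u*pi / (2N)), computed directly in O(N^2)."""
    N = len(f)
    C = []
    for u in range(N):
        alpha = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
        C.append(alpha * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                             for x in range(N)))
    return C
```

With this normalization the transform is orthonormal, so a constant input maps entirely to the u = 0 coefficient and signal energy is preserved.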
Image compression
• an image compression system has three main blocks
 – a transform (usually the DCT on 8×8 blocks)
 – a quantizer
 – a lossless (entropy) coder
• each tries to throw away information that is not essential to understanding the image but consumes bits
Image compression
• the transform throws away correlations
– if you make a plot of the value of a pixel as a function of one of
its neighbors

– you will see that the pixels are highly correlated (i.e. most of the
time they are very similar)
– this is just a consequence of the fact that surfaces are smooth

Image compression
• the transform eliminates these correlations
 – this is best seen by considering the 2-pt transform
 – note that the first coefficient is always the DC value
   X[0] = (x[0] + x[1]) / √2
 – an orthogonal transform can be written in matrix form as
   X = Tx,  with T Tᵀ = I
 – i.e. T has orthonormal columns
 – this means that the second coefficient is a difference
   X[1] = (x[0] − x[1]) / √2
 – note that if x[0] is similar to x[1], then X[1] ≈ 0
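The 2-pt transform can be sketched in a few lines; the pixel values 100 and 102 are illustrative neighbors:

```python
import math

# the 2-pt orthogonal transform T = (1/sqrt(2)) * [[1, 1], [1, -1]]
def two_pt_transform(x0, x1):
    s = 1.0 / math.sqrt(2.0)
    return (s * (x0 + x1), s * (x0 - x1))  # (DC-like sum, difference)

# neighboring pixels are usually similar, so X[1] comes out near zero
X0, X1 = two_pt_transform(100.0, 102.0)
```

X1 is tiny compared with X0, and because T is orthogonal the energy x0² + x1² is preserved exactly — decorrelation without distortion.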
Image compression
– in the transform domain we only have to transmit one number, without any significant cost in image quality
– by "decorrelating" the signal we reduced the bit rate to ½
– note that an orthogonal matrix, for which
  T Tᵀ = I,
  applies a rotation to the pixel space
– this aligns the data with the canonical axes
Image compression
• a second advantage of working in the frequency domain
 – is that our visual system is less sensitive to distortion around edges
 – the transition associated with the edge masks our ability to perceive the noise
 – e.g. if you blow up a compressed picture, it is likely to look like this
 – in general, the compression errors are more annoying in the smooth image regions
Image compression
• three JPEG examples

36KB 5.7KB 1.7KB

– note that the blockiness becomes more visible with more compression
Image compression
• the transform
 – does not save any bits
 – does not introduce any distortion
• savings become possible when we throw away information
• this is called "lossy compression" and is implemented by the quantizer
Image compression
• what is a quantizer?
 – think of the round() function, which rounds to the nearest integer
 – round(1) = 1; round(0.55543) = 1; round(0.0000005) = 0
 – instead of an infinite range between 0 and 1 (an infinite number of bits to transmit)
 – the output is zero or one (1 bit)
 – we threw away all the stuff in between, but saved a lot of bits
 – a quantizer does this less drastically
Quantizer
• it is a staircase function of this type
 – inputs in a given range are mapped to the same output
• to implement this, we
 – 1) define a quantizer step size Q
 – 2) apply a rounding function
   xq = round(x / Q)
 – the larger the Q, the fewer the reconstruction levels
 – more compression at the cost of larger distortion
 – e.g. for x in [0, 255], we need 8 bits and have 256 color values
 – with Q = 64, we only have 4 levels and only need 2 bits
 – with Q = 32, how many bits are required?
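The two steps above can be sketched as follows (the helper name `quantize` and the sample value 200 are illustrative):

```python
def quantize(x, Q):
    """Uniform quantizer: map x to an integer index, then reconstruct index * Q."""
    index = round(x / Q)        # the symbol we would transmit
    return index, index * Q     # (transmitted index, decoder's reconstruction)

# with Q = 64, x in [0, 255] collapses onto a handful of levels
idx, rec = quantize(200, 64)
```

The reconstruction error is bounded by Q/2, so a larger Q means fewer levels (fewer bits) but larger distortion.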
Quantizer
– e.g. for x in [0, 255], we need 8 bits and have 256 color values
– with Q = 64, we only have 4 levels and only need 2 bits
– with Q = 32, we only have 8 levels and only need 3 bits; hence the compression ratio is 8/3 ≈ 2.67 : 1
Quantizer
• note that we can quantize some frequency coefficients more heavily than others by simply increasing Q
• this leads to the idea of a quantization matrix
• we start with an image block (e.g. 8×8 pixels)
Quantizer
• next we apply a transform (e.g. the 8×8 DCT)
Quantizer
• and quantize with a varying Q, given by the quantization matrix ("Q mtx")
Quantizer
• note that higher frequencies are quantized more heavily: the Q-matrix entries grow with increasing frequency
 – as a result, many high-frequency coefficients of the DCT are simply wiped out
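The whole pipeline on one block can be sketched as below. Note that the quantization matrix here is a made-up one whose step size simply grows with frequency (u + v), not the standard JPEG table, and the smooth-ramp block is an illustrative input:

```python
import math

def dct_1d(x):
    N = len(x)
    return [(math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)) *
            sum(x[n] * math.cos(math.pi * u * (2 * n + 1) / (2 * N)) for n in range(N))
            for u in range(N)]

def dct_2d(block):
    """2D DCT computed separably: 1D DCT of the rows, then of the columns."""
    N = len(block)
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][v] for i in range(N)]) for v in range(N)]
    return [[cols[v][u] for v in range(N)] for u in range(N)]

# hypothetical quantization matrix: step size grows with frequency u + v
Qm = [[16 + 4 * (u + v) for v in range(8)] for u in range(8)]

block = [[float(i + j) for j in range(8)] for i in range(8)]   # a smooth 8x8 ramp
C = dct_2d(block)
Cq = [[round(C[u][v] / Qm[u][v]) for v in range(8)] for u in range(8)]
zeros = sum(row.count(0) for row in Cq)
```

For this smooth block nearly every high-frequency coefficient quantizes to zero; only the DC term and a couple of low-frequency terms survive, which is exactly what makes the entropy coder effective.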
Quantizer
• this saves a lot of bits, but we no longer have an exact replica of the original image block (compare the inverse DCT of the quantized coefficients with the original pixels)
Quantizer
• note, however, that visually the original and decompressed blocks are not very different
 – we have saved lots of bits without much "perceptual" loss
 – this is the reason why JPEG and MPEG work
Image compression
• three JPEG examples

36KB 5.7KB 1.7KB

– note that the two images on the left look identical
– JPEG requires 6× fewer bits
Discrete Cosine Transform
• note that
 – the better the energy compaction
 – the larger the number of coefficients that get wiped out
 – the greater the bit savings for the same loss
• this is why the DCT is important
• we will do mostly the 1D DCT
 – the formulae are simpler, the insights the same
 – as always, the extension to 2D is trivial
Discrete Cosine Transform (Formal)
• the first thing to note is that there are various versions of the DCT
 – these are usually known as DCT-I to DCT-IV
 – they vary in minor details
 – the most popular is the DCT-II, also known as the even symmetric DCT, or simply "the DCT"
Energy compaction
• to understand the energy compaction property
 – we start by considering the sequence y[n] = x[n] + x[2N−1−n]
 – this just consists of appending a mirrored version of x[n] to itself
 – next we remember that the DFT is identical to the DFS of the periodic extension of the sequence
 – let's look at the periodic extensions for the two cases
  • when the transform is the DFT: we work with the extension of x[n]
  • when the transform is the DCT: we work with the extension of y[n]
Energy compaction
• the two extensions (DFT vs. DCT) differ
 – note that in the DFT case the extension introduces discontinuities
 – this does not happen for the DCT, due to the symmetry of y[n]
 – the elimination of this artificial discontinuity, which contains a lot of high frequencies, is the reason why the DCT is much more efficient
Fast algorithms
• the interpretation of the DCT as the DFT of the symmetrically extended sequence also gives us a fast algorithm for its computation
• it consists of exactly three steps
 – 1) y[n] = x[n] + x[2N−1−n]
 – 2) Y[k] = DFT{y[n]}, which can be computed with a 2N-pt FFT
 – 3) C[k] = e^(−jπk/2N) Y[k],  k = 0, …, N−1
• hence the complexity of the N-pt DCT is that of the 2N-pt DFT
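The three steps translate directly into code. In this sketch a plain O(N²) DFT stands in for the 2N-pt FFT, and the result is checked against the direct (unnormalized) DCT sum C[k] = 2·Σ x[n] cos(πk(2n+1)/2N):

```python
import cmath
import math

def dct_via_dft(x):
    """DCT through the 2N-pt DFT of the mirrored sequence (steps 1-3 above)."""
    N = len(x)
    y = list(x) + list(reversed(x))        # 1) y[n] = x[n] + x[2N-1-n]
    Y = [sum(y[n] * cmath.exp(-2j * cmath.pi * n * k / (2 * N)) for n in range(2 * N))
         for k in range(N)]                # 2) 2N-pt DFT (an FFT in practice)
    return [(cmath.exp(-1j * cmath.pi * k / (2 * N)) * Y[k]).real
            for k in range(N)]             # 3) twiddle and keep the real part

def dct_direct(x):
    """Unnormalized DCT-II, evaluated straight from the cosine sum."""
    N = len(x)
    return [2 * sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N)) for n in range(N))
            for k in range(N)]
```

Both routes give the same coefficients, confirming that the mirrored-sequence DFT really computes the DCT.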


2D-DCT
• 1) create an intermediate sequence by computing the 1D-DCT of the rows:  x[n1, n2] → f[k1, n2]
• 2) compute the 1D-DCT of the columns of the intermediate sequence:  f[k1, n2] → Cx[k1, k2]
Decorrelation
Energy Compaction
Separability
Symmetry
Orthogonality
 The principal advantage of image transformation is the removal of redundancy between neighboring pixels. This leads to uncorrelated transform coefficients, which can be encoded independently.
 The amplitude of the autocorrelation after the DCT operation is very small at all lags. Hence, it can be inferred that the DCT exhibits excellent decorrelation properties.
The efficiency of a transformation scheme can be directly gauged by its ability to pack the input data into as few coefficients as possible. This allows the quantizer to discard coefficients with relatively small amplitudes without introducing visual distortion in the reconstructed image. The DCT exhibits excellent energy compaction for highly correlated images.
The principal advantage of separability is that C(u, v) can be computed in two steps, by successive 1-D operations on the rows and columns of an image.
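Separability can be verified numerically: the row-then-column computation must agree with the direct double sum. A small sketch on a 4×4 block (the sample matrix is illustrative):

```python
import math

def dct_1d(x):
    N = len(x)
    return [(math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)) *
            sum(x[n] * math.cos(math.pi * u * (2 * n + 1) / (2 * N)) for n in range(N))
            for u in range(N)]

def dct_2d_separable(img):
    """Two passes of 1D DCTs: rows first, then columns."""
    N = len(img)
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d([rows[i][v] for i in range(N)]) for v in range(N)]
    return [[cols[v][u] for v in range(N)] for u in range(N)]

def dct_2d_direct(img):
    """The full 2D double sum, O(N^4) for an N x N image."""
    N = len(img)
    a = lambda k: math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[a(u) * a(v) * sum(img[x][y] *
                               math.cos(math.pi * u * (2 * x + 1) / (2 * N)) *
                               math.cos(math.pi * v * (2 * y + 1) / (2 * N))
                               for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

img = [[1.0, 2.0, 3.0, 4.0],
       [2.0, 3.0, 4.0, 5.0],
       [5.0, 1.0, 0.0, 2.0],
       [3.0, 3.0, 3.0, 1.0]]
S = dct_2d_separable(img)
D = dct_2d_direct(img)
```

The separable route needs only 2N 1-D transforms instead of evaluating N² double sums, which is where the computational savings come from.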
A separable and symmetric transform can be expressed in the form
T = A f A,
where A is an N × N symmetric transformation matrix with entries a(i, j) given by
a(i, j) = √(1/N)  for i = 0,
a(i, j) = √(2/N) cos[ (2j+1)iπ / 2N ]  for i > 0,
and f is the N × N image matrix.

This is an extremely useful property, since it implies that the transformation matrix can be precomputed offline and then applied to the image, thereby providing orders-of-magnitude improvement in computational efficiency.
 The inverse transformation is
f = A⁻¹ T A⁻¹.

 The DCT basis functions are orthogonal. The inverse transformation matrix of A is therefore equal to its transpose, i.e.
A⁻¹ = Aᵀ.

 Therefore, in addition to its decorrelation characteristics, this property renders some reduction in the pre-computation complexity.
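The orthogonality claim A⁻¹ = Aᵀ is easy to check numerically for the DCT matrix with the entries a(i, j) given above; a minimal sketch:

```python
import math

N = 8
# DCT transformation matrix: a(i, j) = sqrt(1/N) for i = 0,
# and sqrt(2/N) * cos((2j+1)*i*pi / (2N)) for i > 0
A = [[(math.sqrt(1.0 / N) if i == 0 else
       math.sqrt(2.0 / N) * math.cos((2 * j + 1) * i * math.pi / (2 * N)))
      for j in range(N)] for i in range(N)]

# A * A^T: if the rows are orthonormal this is the identity, so A^-1 = A^T
AAt = [[sum(A[i][k] * A[j][k] for k in range(N)) for j in range(N)]
       for i in range(N)]
```

Since inverting A reduces to a transpose, the decoder never needs an explicit matrix inversion.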
 Image Processing
  Compression – e.g. JPEG
  Scientific Analysis – e.g. radio telescope data
 Audio Processing
  Compression – e.g. MPEG Layer 3, a.k.a. MP3
 Scientific Computing / High Performance Computing (HPC)
  Partial Differential Equation solvers
 Truncation of the higher spectral coefficients results in blurring of the images, especially wherever the detail is high.
 Coarse quantization of some of the lower spectral coefficients introduces graininess in the smooth portions of the images.
 Serious blocking artifacts are introduced at block boundaries, since each block is independently encoded, often with a different encoding strategy and extent of quantization.
