KMA SS05 Kap03 Compression

This document discusses data compression techniques: lossless methods such as Huffman coding and run-length encoding, and lossy methods used for images and video. It shows by example how compression works, e.g., how Huffman coding assigns shorter bit codes to more frequent characters, and how transformations such as the discrete cosine transform represent data in the frequency domain, where omitting less important frequencies gives better compression while maintaining quality.

Institut für Telematik | Universität zu Lübeck

Communication
Systems for Multimedia
Applications
Chapter 3: Compression Techniques
Summer Term 2005
Prof. Dr. Stefan Fischer
3-2
Overview
The principle of compression
Lossless Compression
Principle
Example: Huffman Code
Lossy Compression
Still image compression
Example: JPEG
Video compression
Example: MPEG
3-3
Principles of Data Compression
1. Lossless Compression
The original object can be reconstructed perfectly
Compression rates of 2:1 to 50:1 are typical
Example: Huffman coding
2. Lossy Compression
There is a difference between the original object and the
reconstructed object.
Physiological and psychological properties of the ear
and eye can be taken into account.
Higher compression rates are possible than with
lossless compression (typically up to 100:1).
3-4
Simple Lossless Algorithms: Pattern Substitution
Example 1: ABC -> 1; EE -> 2
Example 2:
Note that in this example both algorithms lead to the same compression rate.
3-5
Run Length Coding
Principle
Replace all repetitions of the same symbol in the text (runs) by a repetition counter
and the symbol.
Example
Text:
AAAABBBAABBBBBCCCCCCCCDABCBAABBBBCCD
Encoding:
4A3B2A5B8C1D1A1B1C1B2A4B2C1D
As we can see, we can only expect a good compression rate when long runs occur
frequently.
Examples are long runs of blanks in text documents, leading zeroes in numbers or
strings of white in gray-scale images.
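The principle above can be sketched in a few lines of Python (an illustrative sketch, not part of the original slides; the decimal `<count><symbol>` output format follows the example on this slide):

```python
import re
from itertools import groupby

def rle_encode(text):
    # Replace every run of identical symbols by "<count><symbol>".
    return "".join(f"{len(list(group))}{symbol}"
                   for symbol, group in groupby(text))

def rle_decode(encoded):
    # Invert the encoding; assumes single-character, non-digit symbols
    # with their decimal repetition counter written directly before them.
    return "".join(symbol * int(count)
                   for count, symbol in re.findall(r"(\d+)(\D)", encoded))
```

For the text from the slide, `rle_encode` reproduces the encoding shown above, and `rle_decode` restores the original text.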
3-6
Run Length Coding for Binary Files
When dealing with binary files we are sure that a run of 1s is always followed by a run of 0s and vice versa. It is thus
sufficient to store the repetition counters only!
Example
000000000000000000000000000011111111111111000000000 28 14 9
000000000000000000000000001111111111111111110000000 26 18 7
000000000000000000000001111111111111111111111110000 23 24 4
000000000000000000000011111111111111111111111111000 22 26 3
000000000000000000001111111111111111111111111111110 20 30 1
000000000000000000011111110000000000000000001111111 19 7 18 7
000000000000000000011111000000000000000000000011111 19 5 22 5
000000000000000000011100000000000000000000000000111 19 3 26 3
000000000000000000011100000000000000000000000000111 19 3 26 3
000000000000000000011100000000000000000000000000111 19 3 26 3
000000000000000000011100000000000000000000000000111 19 3 26 3
000000000000000000001111000000000000000000000001110 20 4 23 3 1
000000000000000000000011100000000000000000000111000 22 3 20 3 3
011111111111111111111111111111111111111111111111111 1 50
011111111111111111111111111111111111111111111111111 1 50
011111111111111111111111111111111111111111111111111 1 50
011111111111111111111111111111111111111111111111111 1 50
011111111111111111111111111111111111111111111111111 1 50
011000000000000000000000000000000000000000000000011 1 2 46 2
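The counters-only scheme can be sketched as follows. One detail is an assumption here: by convention the first counter always refers to a run of 0s, so a line starting with 1 gets a leading zero-length count (the example rows above all start with 0 and do not need it):

```python
from itertools import groupby

def binary_rle(bits):
    # Runs of 0s and 1s alternate, so storing the repetition
    # counters alone is sufficient.
    runs = [len(list(group)) for _, group in groupby(bits)]
    if bits.startswith("1"):
        runs.insert(0, 0)  # assumed convention: first counter counts 0s
    return runs
```

Applied to the first and one of the last example rows above, it yields exactly the counters shown next to them.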
3-7
Variable Length Coding
Classical character codes use the same number of bits for each character. When the
frequency of occurrence is different for different characters, we can use fewer bits for frequent
characters and more bits for rare characters.
Example
Code 1: A B C D E . . .
        1 2 3 4 5 (binary)
Encoding of ABRACADABRA with constant bit length (= 5 bits):
0000100010100100000100011000010010000001000101001000001
Code 2: A B R  C  D
        0 1 01 10 11
Encoding: 0 1 01 0 10 0 11 0 1 01 0
3-8
Delimiters
Code 2 can only be decoded unambiguously when delimiters are stored with the
codewords. This can increase the size of the encoded string considerably.
Idea
No code word should be the prefix of another codeword! We will then no longer
need delimiters.
Code 3: A = 11, B = 00, R = 011, C = 010, D = 10
Encoded string: 1100011110101110110001111

3-9
Representation as a TRIE
An obvious method to represent such a code is a TRIE. In fact, any TRIE
with M leaf nodes can be used to represent a code for a string containing
M different characters.
The figure on the next page shows two codes which can be used for
ABRACADABRA. The code for each character is represented by the
path from the root of the TRIE to that character, where 0 goes to the
left and 1 goes to the right, as is the convention for TRIEs.
The TRIE on the left corresponds to the encoding of ABRACADABRA on
the previous page, the TRIE on the right generates the following
encoding:
01101001111011100110100
which is two bits shorter.
3-10
Two Tries for our Example
The TRIE representation guarantees indeed that no
codeword is the prefix of another codeword. Thus
the encoded bit string can be uniquely decoded.
3-11
Huffman Code
Now the question arises how we can find the best variable-
length code for given character frequencies (or
probabilities). The algorithm that solves this problem was
found by David Huffman in 1952.
Algorithm Generate-Huffman-Code:
Determine the frequencies of the characters and mark the
leaf nodes of a binary tree (to be built) with them.
1. Out of the tree nodes not yet marked as DONE, take the two with
the smallest frequencies and compute their sum.
2. Create a parent node for them and mark it with the sum. Mark the
branch to the left son with 0, the one to the right son with 1.
3. Mark the two son nodes as DONE. When there is only one node not
yet marked as DONE, stop (the tree is complete). Otherwise,
continue with step 1.
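The steps above can be sketched with a priority queue. This is an illustrative Python sketch, not part of the slides; the `tiebreak` counter is an implementation detail (it keeps the heap from comparing tree nodes), not part of the algorithm:

```python
import heapq
from itertools import count

def huffman_code(freqs):
    # Steps 1-3 of the slide: repeatedly take the two nodes with the
    # smallest frequencies and merge them under a new parent node.
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    # Walk the finished tree: 0 on the branch to the left son,
    # 1 on the branch to the right son.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes
```

For the probabilities of the example on the next slide this yields three codewords of length 2 and two of length 3, i.e., an average codeword length of 2.25 bits.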
3-12
Huffman Code, Example
Probabilities of the characters:
p(A) = 0.3; p(B) = 0.3; p(C) = 0.1; p(D) = 0.15; p(E) = 0.15
[Figure: the Huffman tree for these probabilities. C (10%) and D (15%) are merged first (sum 25), then E (15%) with that node (sum 40), then A and B (sum 60), and finally the root (100).]
Resulting codewords:
A = 11, B = 10, C = 011, D = 010, E = 00
3-13
Huffman Code, why is it optimal?
Characters with higher probabilities are closer to the root of
the tree and thus have shorter codeword lengths; thus it is a
good code. It is even the best possible code!
Reason:
The weighted path length of the tree is the sum, over all leaf
nodes, of each node's frequency multiplied by its depth. This is
obviously the same as summing up the products of each character's
codeword length with its frequency of occurrence.
No other tree with the same frequencies attached to the leaf
nodes has a smaller weighted path length than the Huffman
tree.
3-14
Decoding Huffman Codes (1)
An obvious possibility is to use the TRIE:
1. Read the input stream sequentially and traverse the
TRIE until a leaf node is reached.
2. When a leaf node is reached, output the character
attached to it.
3. To decode the next bit, start again at the root of the
TRIE.
Observation
The input bit rate is constant, the output character rate is
variable.
3-15
Decoding Huffman Codes (2)
As an alternative we can use a decoding table.
Creation of the decoding table:
If the longest codeword has L bits, the table has 2^L entries.
Let c_i be the codeword for character s_i, and let c_i have l_i bits. We then
create 2^(L-l_i) entries in the table. In each of these entries the first l_i
bits are equal to c_i, and the remaining L-l_i bits take on all possible
binary combinations.
At all these addresses of the table we enter s_i as the character
recognized, and we remember l_i as the length of the codeword.
3-16
Decoding with the Table
Algorithm Table-Based Huffman Decoder
1. Read L bits from the input stream into a buffer.
2. Use the buffer as the address into the table and output the recognized
character s_i.
3. Remove the first l_i bits from the buffer and pull in the next l_i bits
from the input bit stream.
4. Continue with step 2.
Observation
Table-based Huffman decoding is fast.
The output character rate is constant, the input bit rate is variable.
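The table construction (previous slide) and the decoding loop can be sketched together in Python; `build_decode_table` and `table_decode` are hypothetical names chosen for this illustration:

```python
def build_decode_table(code):
    # code maps character -> codeword bit string (prefix-free).
    # The table has 2**L entries, L = length of the longest codeword.
    L = max(len(cw) for cw in code.values())
    table = [None] * (2 ** L)
    for ch, cw in code.items():
        # Every address whose first len(cw) bits equal cw maps to
        # (character, codeword length): 2**(L - len(cw)) entries.
        base = int(cw, 2) << (L - len(cw))
        for tail in range(2 ** (L - len(cw))):
            table[base + tail] = (ch, len(cw))
    return table, L

def table_decode(bits, code):
    table, L = build_decode_table(code)
    out, pos = [], 0
    while pos < len(bits):
        buffer = bits[pos:pos + L].ljust(L, "0")  # pad at stream end
        ch, l = table[int(buffer, 2)]
        out.append(ch)
        pos += l  # drop the first l bits, pull in the next l bits
    return "".join(out)
```

With Code 3 from slide 3-8 (A = 11, B = 00, R = 011, C = 010, D = 10), decoding the encoded string from that slide yields ABRACADABRA.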
3-17
Huffman Code, Comments
A very good code for many practical purposes.
Can only be used when the frequencies (or probabilities) of
the characters are known in advance.
Variation: Determine the character frequencies separately
for each new document and store/transmit the code
tree/table with the data.
Note that a loss in optimality comes from the fact that each
character must be encoded with a whole number of bits, and
thus the codeword lengths do not match the frequencies
exactly (consider a code for three characters A, B and C,
each occurring with a frequency of 33%).
Further lossless algorithm: Ziv-Lempel
3-18
Transformations
Motivation for Transformations
Improvement of the compression ratio while maintaining a
good image quality.
What is a transformation?
Mathematically: a change of the basis of the representation
Informally: representation of the same data in a different way.
Motivation for the use of transformations in compression
algorithms: In the frequency domain, leaving out detail is
often less disturbing to the human visual (or auditory) system
than leaving out detail in the original domain.
3-19
In the frequency domain the signal (one-dimensional
or two-dimensional) is represented as an overlay of
base frequencies. The coefficients of the frequencies
specify the amplitudes with which the frequencies
occur in the signal.
The Frequency Domain
3-20
The Fourier Transform
The Fourier transform of a function f is defined as follows:

F(t) = Integral f(x) * e^(-2*pi*i*t*x) dx

where e^(ix) can be written as

e^(ix) = cos(x) + i * sin(x)

Note:
The sin part makes the function complex. If we only use the
cos part, the transform remains real-valued.
3-21
Overlaying the Frequencies
A transform asks how the amplitude for each base frequency must be chosen such that the
overlay (sum) best approximates the original function.
The output signal (c) is represented as a sum of the two sine waves (a) and (b).
3-22
One-Dimensional Cosine Transform
The Discrete Cosine Transform (DCT) is defined as
follows:

S_u = (C_u / 2) * Sum_{x=0..7} s_x * cos((2x+1) * u * pi / 16)

with

C_u = 1/sqrt(2) for u = 0; C_u = 1 otherwise
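The 8-point DCT defined above and its inverse can be sketched directly from the formula (an unoptimized illustration; real codecs use fast factorizations):

```python
import math

def dct_1d(s):
    # S_u = (C_u / 2) * sum_{x=0..7} s_x * cos((2x+1) * u * pi / 16)
    S = []
    for u in range(8):
        C_u = 1 / math.sqrt(2) if u == 0 else 1.0
        S.append(C_u / 2 * sum(s[x] * math.cos((2 * x + 1) * u * math.pi / 16)
                               for x in range(8)))
    return S

def idct_1d(S):
    # Inverse transform: s_x = sum_{u=0..7} (C_u / 2) * S_u * cos(...)
    return [sum((1 / math.sqrt(2) if u == 0 else 1.0) / 2 * S[u]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                for u in range(8))
            for x in range(8)]
```

With this normalization the pair is exactly invertible: the DC coefficient S_0 is the sum of the samples divided by 2*sqrt(2), and applying `idct_1d` after `dct_1d` reproduces the input.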
3-23
Example for a 1D Approximation (1)
The following one-dimensional signal is to be approximated by the coefficients of
a 1D DCT with eight base frequencies.
3-24
Example for a 1D Approximation (2)
Some of the DCT kernels used in the approximation.
3-25
Example for a 1D Approximation (3)
DC coefficient
3-26
Example for a 1D Approximation (4)
DC coefficient + 1st AC coefficient
3-27
Example for a 1D Approximation (5)
DC coefficient + AC coefficients 1-3
3-28
Example for a 1D Approximation (6)
DC coefficient + AC coefficients 1-7
3-29
JPEG
The Joint Photographic Experts Group (JPEG, a
working group of ISO) has developed a very
efficient compression algorithm for still images
which is commonly referred to by the name of
the group.
JPEG compression is done in four steps:
1. Image preparation
2. Discrete Cosine Transform (DCT)
3. Quantization
4. Entropy Encoding
3-30
The DCT-based JPEG Encoder
8 x 8 blocks
Source Image Data
DCT-Based Encoder
FDCT
Quantizer
Entropy
Encoder
Compressed
Image Data
Table
Speci-
fication
Table
Speci-
fication
3-31
Color Subsampling
One advantage of the YUV color model is that the color components U
and V of a pixel can be represented with a lower resolution than the
luminance value Y. The human eye is more sensitive to brightness than
to variations in chrominance. Therefore JPEG uses color subsampling:
for each group of four luminance values, only one U and one V
chrominance value is sampled.
In JPEG, four Y blocks of size 8x8 together with one U block and one V
block of size 8x8 each are called a macroblock.
[Figure: a macroblock — four 8x8 luminance (Y) blocks and one 8x8 block each for the U and V chrominance components.]
3-32
JPEG "Baseline" Mode
JPEG Baseline Mode is a compression algorithm based on a DCT transform from
the time domain into the frequency domain.
Image transformation
FDCT (Forward Discrete Cosine Transform). Very similar to the Fourier
transform. It is applied separately to every 8x8 pixel block of the image:

S_vu = (C_v * C_u / 4) * Sum_{x=0..7} Sum_{y=0..7} s_yx * cos((2x+1) * u * pi / 16) * cos((2y+1) * v * pi / 16)

with

C_u, C_v = 1/sqrt(2) for u, v = 0; C_u, C_v = 1 otherwise

This transform is computed 64 times per block (once per coefficient). The
result is 64 coefficients in the frequency domain.
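The 2D FDCT above can be sketched directly from the formula (a deliberately naive O(n^4) illustration per block; real encoders use fast separable factorizations):

```python
import math

def fdct_8x8(block):
    # S_vu = (C_v * C_u / 4) * sum_{x,y=0..7} s_yx
    #        * cos((2x+1) * u * pi / 16) * cos((2y+1) * v * pi / 16)
    def C(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    S = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            acc = sum(block[y][x]
                      * math.cos((2 * x + 1) * u * math.pi / 16)
                      * math.cos((2 * y + 1) * v * math.pi / 16)
                      for y in range(8) for x in range(8))
            S[v][u] = C(v) * C(u) / 4 * acc
    return S
```

For a uniform block (e.g., all pixels 128) only the DC coefficient S_00 is non-zero, which illustrates why the DC coefficient carries the basic color of the block.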
3-33
Base Frequencies for the 2D-DCT
To cover an entire block of size of 8x8 we use 64 base frequencies, as
shown below.
3-34
Example of a Base Frequency
The figure below shows the DCT kernel corresponding to the base
frequency (0,2) shown in the highlighted frame (first row, third
column) on the previous page:

cos((2x+1) * 2 * pi / 16) * cos((2y+1) * 0 * pi / 16)
3-35
Example
original
1 coefficient
4 coefficients
16 coefficients
Encoding of an Image
with the 2D-DCT and
block
size 8x8
3-36
Quantization
The next step in JPEG is the quantization of the DCT coefficients. Quantization
means that the range of allowable values is subdivided into intervals of fixed
size. The larger the intervals are chosen, the larger the quantization error will be
when we decompress.
Maximum quantization error: a/2
[Figure: a quantization interval of size a between its lower and upper limit; every value in the interval is reconstructed as the midpoint, so the error is at most a/2 in either direction.]
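Uniform quantization with interval size a can be sketched as follows (a sketch of the principle only, not the table-driven per-coefficient quantization JPEG actually uses):

```python
def quantize(coeff, a):
    # Map the coefficient to the index of its interval of size a.
    return round(coeff / a)

def dequantize(index, a):
    # Reconstruct the interval's representative value; the
    # reconstruction error is at most a/2.
    return index * a
```

For example, with a = 10 the value 37 is quantized to index 4 and reconstructed as 40, an error of 3 <= a/2.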
3-37
Quantization: Quality vs. Compression Ratio
[Figure: the same range of values quantized coarsely into a few wide intervals (3-bit codes 000 ... 100) and finely into many narrow intervals (5-bit codes 00000, 00001, 00010, ...). Coarse quantization gives a higher compression ratio, fine quantization a better quality.]
3-38
Quantization
In JPEG the number of quantization intervals can be chosen
separately for each DCT coefficient (Q-factor). The Q-factors
are specified in a quantization table.
Entropy-Encoding
The quantization step is followed by an entropy encoding
(lossless encoding) of the quantized values:
The DC coefficient is the most important one (basic color of the block).
The DC coefficient is encoded as the difference between the current DC
coefficient value and the one from the previous block (differential
coding).
The AC coefficients are processed in zig-zag order. This places
coefficients with similar values in sequence.
3-39
Quantization and Entropy Encoding
Zig-zag reordering of the coefficients is better than a line-by-line read-out because
the input to the entropy encoder will have a few non-zero and many zero coefficients
(representing higher frequencies, i.e., sharp edges). The non-zero coefficients tend
to occur in the upper left-hand corner of the block, the zero coefficients in the lower
right-hand corner.
Run-length encoding is used to encode the values of the AC coefficients. The zig-
zag read out maximizes the run-lengths. The run-length values are then Huffman-
encoded (this is similar to the fax compression algorithm).
[Figure: the quantized DCT coefficients of a block are reordered zig-zag; the DC coefficient is encoded differentially (Delta DC_i = DC_i - DC_{i-1}), and the AC coefficients AC_1 ... AC_63 are fed to the entropy encoder.]
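One way to generate the zig-zag read-out order in Python (an illustrative sketch; the coordinate sort below reproduces the standard JPEG pattern of alternating anti-diagonals):

```python
def zigzag_order(block):
    # Sort the 8x8 coordinates by anti-diagonal (x + y); within a
    # diagonal, alternate the direction: even diagonals by ascending x,
    # odd diagonals by ascending y. Low frequencies come first, and the
    # high-frequency zeros end up in long runs at the tail.
    coords = sorted(((x, y) for x in range(8) for y in range(8)),
                    key=lambda p: (p[0] + p[1],
                                   p[1] if (p[0] + p[1]) % 2 else p[0]))
    return [block[y][x] for x, y in coords]
```

Reading out a block whose entry at (x, y) is its row-major index y*8 + x yields the familiar JPEG sequence 0, 1, 8, 16, 9, 2, 3, 10, 17, 24, ...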
3-40
JPEG Decoder
Reconstructed
Image Data
DCT-Based Decoder
IDCT
De-
quantizer
Entropy
Decoder
Compressed
Image Data
Table
Speci-
fication
Table
Speci-
fication
3-41
Quantization Factor and Image Quality
Example: Palace in Mannheim
Palace, original image Palace image with Q=6
3-42
Palace Example (continued)
Palace image with Q=12 Palace image with Q=20
3-43
Video Compression with MPEG
MPEG stands for Moving Picture Experts
Group (a committee of ISO).
Goal of MPEG-1: compress a video signal
(with audio) to a data stream of 1.5 Mbit/s,
the data rate of a T1 link in the U.S. and the
rate that can be streamed from a CD-ROM.
3-44
Goals of MPEG-1 Compression
Random access within 0.5 s while maintaining a
good image quality for the video
Fast forward / fast rewind
Possibility to play the video backwards
Allow easy and precise editing
3-45
MPEG Frame Types
MPEG distinguishes four types of frames:
I-Frame (Intra Frame)
Intra-coded full image, very similar to a JPEG image: encoded with
DCT, quantization, run-length coding and Huffman coding
P-Frame (Predicted Frame)
Uses delta encoding. The P frame refers to preceding I- and P-
frames. DPCM-encoded macroblocks, motion vectors possible.
B-Frame (Interpolated Frame)
"Bidirectionally predictive coded picture". The B frame refers to
preceding and succeeding frames, interpolates the data and encodes
the differences.
D-Frame
"DC coded picture": only the DC coefficient of each block is coded
(upper left-hand corner of the matrix), e.g., for previews.
3-46
Group of Pictures in MPEG
The sequence of I, P and B frames is not standardized but can be chosen according
to the requirements of the application. This allows the user to choose his/her own
compromise between video quality, compression rate, ease of editing, etc.
3-47
MPEG Encoder
[Figure: MPEG encoder block diagram — DCT, quantizer and entropy encoder in the forward path; inverse quantizer, IDCT and frame memories in the feedback path; motion estimation and motion compensation produce the motion vectors and the predictive frame.]
3-48
MPEG Decoder
[Figure: MPEG decoder block diagram — entropy decoder, inverse quantizer and IDCT; previous and future picture stores feed the motion compensation, whose output (weighted 0, 1 or 1/2 per store) is combined with the decoded difference data via a multiplexer.]
3-49
Temporal Redundancy and Motion Vectors
"Motion Compensated Interpolation"
On the encoder side the search range can be chosen as a parameter: the larger the
search range, the higher the potential for compression, but the longer the run-time
of the algorithm.
[Figure: block-matching technique — block B in the current frame is predicted from block A in the previous frame and/or block C in the future frame:
1. block B = block A
2. block B = block C
3. block B = (block A + block C) / 2]
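The three prediction options above can be sketched as follows. The sum-of-absolute-differences (SAD) matching criterion is an assumption for this illustration; the slide does not fix a particular metric:

```python
def sad(a, b):
    # Sum of absolute differences between two equally sized blocks
    # (here flattened to lists of pixel values).
    return sum(abs(x - y) for x, y in zip(a, b))

def best_match(block, candidates):
    # Block matching: pick the candidate block from the search range
    # of the reference frame that matches the current block best.
    return min(range(len(candidates)), key=lambda i: sad(block, candidates[i]))

def interpolate(block_a, block_c):
    # Option 3: average of the previous-frame and future-frame blocks.
    return [(a + c) / 2 for a, c in zip(block_a, block_c)]
```

A larger search range means more candidate blocks for `best_match`, hence better potential matches but a longer run-time, exactly the trade-off described above.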
3-50
MPEG-2
MPEG-2 extends MPEG-1 for higher bandwidths
and better image qualities, up to HDTV. It was
developed jointly by ISO and ITU-T (where the
standard is called H.262).
MPEG-2 defines scalable data streams which allow
receivers with different bandwidth and processing
power to receive and decode only parts of the data
stream.
3-51
Scalability in MPEG-2 (1)
SNR scalability: Each frame is encoded in several
layers. A receiver that only decodes the base layer will get
a low image quality. A receiver decoding additional
(higher) layers gets a better image quality. An example is
color subsampling: the base layer contains only one
quarter of the values for the U and V components,
compared to the Y component. The enhancement layer
contains the U and V components in full resolution, for
better color quality.
Spatial scalability: The frames are encoded with
different pixel resolutions (e.g., for a standard TV set and
for an HDTV set). Both encodings are transmitted in
the same data stream.
3-52
Scalability in MPEG-2 (2)
Temporal scalability: The base layer contains only
very few frames per second, the enhancement layers
additional frames per second. Receivers decoding the
higher layers will thus get a higher frame rate (i.e., a
higher temporal resolution).
Data partitioning: The data stream is decomposed
into several streams with different amounts of
redundancy for error correction. The most important
parts of the stream are encoded in the base layer, e.g.,
the low-frequency coefficients of the DCT and the
motion vectors. This layer can then be protected with
an error-correcting code for better error resilience than
the enhancement layers, where errors are not as
harmful.
3-53
MPEG-2 Video Profiles
Profiles:
Simple profile: no B frames, not scalable
Main profile: B frames, not scalable
SNR scalable profile: B frames, SNR scaling
Spatially scalable profile: B frames, spatial scaling
High profile: B frames, spatial or SNR scaling

Levels (resolution x frame rate) and maximum bit rates per profile:
High level (1920x1152x60): Main <= 80 Mbit/s; High <= 100 Mbit/s
High-1440 level (1440x1152x60): Main <= 60 Mbit/s; Spatially scalable <= 60 Mbit/s; High <= 80 Mbit/s
Main level (720x576x30): Simple <= 15 Mbit/s; Main <= 15 Mbit/s; SNR scalable <= 15 Mbit/s; High <= 20 Mbit/s
Low level (352x288x30): Main <= 4 Mbit/s; SNR scalable <= 4 Mbit/s
3-54
Summary
Compression of video and audio streams as well as
of still images is essential for efficient transmission
of multimedia data.
Besides JPEG and MPEG, many other compression
techniques are available today.
Literature:
W. Effelsberg, R. Steinmetz: Video Compression
Techniques, dpunkt, 2001.
