Chapter 5 Data Compression
Lossy Compression
Lossy compression algorithms normally do not reproduce an exact copy of the source
information after decompression, but rather a version of it that is perceived by the
recipient as a true copy.
In lossy compression some information is lost during processing: the image data is
sorted into important and unimportant data, and the system then discards the
unimportant data.
It provides much higher compression rates, but some information is lost compared to the
original source file. The main advantage is that the loss may not be visible to the eye,
i.e., the result is visually lossless. Visually lossless compression is based on knowledge
about color images and human perception.
Lossless Compression
In this type of compression no information is lost during the compression and
decompression process. The reconstructed image is mathematically and visually
identical to the original one, but only about a 2:1 compression ratio is achieved.
This type of compression technique looks for patterns in strings of bits and then expresses
them more concisely.
The aim of a lossless compression algorithm is to transmit the information in such a way
that, when the compressed information is decompressed, there is no loss of information.
Basic Technique
These are simple techniques that have been in use for a long time. The important basic
techniques are run-length encoding and move-to-front encoding.
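As an illustration, a minimal run-length encoder and decoder might look like the following Python sketch (the function names and the (symbol, count) representation are ours, not a normative format):

```python
def rle_encode(data):
    """Encode a string as a list of (symbol, run_length) pairs."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((ch, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(ch * n for ch, n in runs)

encoded = rle_encode("AAAABBBCCD")   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
assert rle_decode(encoded) == "AAAABBBCCD"
```

Run-length encoding only pays off when the input actually contains long runs; otherwise the (symbol, count) pairs can be larger than the original data.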
Huffman Coding
A commonly used method for data compression is Huffman coding. The method starts by
building a list of all the alphabet symbols in descending order of their probabilities. It
then constructs a tree, with a symbol at every leaf, from the bottom up.
The character string to be transmitted is first analyzed, and the character types and their
relative frequencies are determined.
A Huffman (code) tree is a binary tree with branches assigned the value 0 or 1. As each
branch divides, a binary value of 0 or 1 is assigned to each new branch: a binary 0 for the
left branch and a binary 1 for the right branch.
A Huffman code tree for the symbols A, B, C, D (RN = root node, BN = branch node,
LN = leaf node):

RN: 0 → BN, 1 → LN = A
BN: 0 → BN, 1 → LN = B
BN: 0 → LN = D, 1 → LN = C

The resulting codes are A = 1, B = 01, C = 001, D = 000.
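The bottom-up construction can be sketched in Python with a min-heap: repeatedly merging the two least probable subtrees pushes frequent symbols toward the root, where codes are short. This is an illustrative sketch, not the exact procedure in the text:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table {symbol: bit string} from frequencies."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}   # left branch gets 0
        merged.update({s: "1" + c for s, c in c2.items()})  # right gets 1
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("AAAABBBCCD")   # 'A' is most frequent: shortest code
```

Because every symbol sits at a leaf, no code is a prefix of another, so the bit stream can be decoded unambiguously.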
Arithmetic Coding
In this method the input stream is read symbol by symbol, and more bits are appended to
the code each time a symbol is input and processed. To understand this method it is
useful to imagine the resulting code as a number in the range [0, 1), that is, the range of
real numbers from 0 to 1, not including 1. The first step is to calculate, or at least to
estimate, the frequency of occurrence of each symbol.
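The interval narrowing can be illustrated as follows (a sketch with an assumed fixed probability table; a real arithmetic coder also emits bits incrementally and rescales to avoid floating-point underflow):

```python
def arithmetic_interval(message, probs):
    """Narrow the interval [low, high) once per input symbol.

    probs maps each symbol to its probability; cumulative sub-ranges are
    assigned in dict order. Any number inside the final interval
    identifies the whole message.
    """
    cum, start = {}, 0.0
    for s, p in probs.items():   # cumulative start of each symbol's range
        cum[s] = start
        start += p
    low, high = 0.0, 1.0
    for s in message:
        width = high - low
        high = low + width * (cum[s] + probs[s])
        low = low + width * cum[s]
    return low, high

low, high = arithmetic_interval("AB", {"A": 0.6, "B": 0.4})
# "A" narrows [0, 1) to [0, 0.6); "B" then narrows it to [0.36, 0.6)
```

The more probable a symbol is, the less the interval shrinks, so fewer bits are needed to pin down a number inside it.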
Picture Preparation:
Preparation includes analog to digital conversion and generating an appropriate digital
representation of the information. An image is divided into blocks of 8x8 pixels, and
represented by a fixed number of bits per pixel.
Picture Processing:
Processing is actually the first step of the compression process that makes use of
sophisticated algorithms. A transformation from the time to the frequency domain can be
performed using the DCT. In the case of motion video compression, interframe coding
uses a motion vector for each 8x8 block.
Quantization:
Quantization processes the results of the previous step. It specifies the granularity of the
mapping of real numbers into integers, and this results in a reduction of precision. For
example, the coefficients could be quantized using a different number of bits per
coefficient: 12 bits for real values, 8 bits for integer values.
Entropy Encoding:
Entropy encoding is usually the last step. It compresses a sequential digital data stream
without loss. For example, a sequence of zeroes in a data stream can be compressed by
specifying the number of occurrences followed by the zero itself.
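The zero-run idea mentioned above might be sketched like this (representing a run as a (count, 0) pair is our illustration, not a normative format):

```python
def encode_zero_runs(values):
    """Replace each run of zeroes with a (run_length, 0) pair."""
    out, run = [], 0
    for v in values:
        if v == 0:
            run += 1                 # extend the current zero run
        else:
            if run:
                out.append((run, 0))  # flush the pending zero run
                run = 0
            out.append(v)             # non-zero values pass through
    if run:
        out.append((run, 0))
    return out

encoded = encode_zero_runs([5, 0, 0, 0, 3, 0, 0])
# encoded == [5, (3, 0), 3, (2, 0)]
```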
JPEG
JPEG is one of the compression schemes in widespread use. Its ability to attain
considerable size reductions with minimal visual impact, its relatively light
computational requirements, and the ability to fine-tune the compression level to suit the
image at hand have made it the standard for continuous-tone still images.
Image Preparation
The image preparation model in JPEG is very general.
It is not based on
• 9-bit YUV encoding
• fixed number of lines, columns
• mapping of encoded chrominance
It is independent of image parameters such as image size and image/pixel aspect ratio.
Source image consists of 1 to 255 components (planes).
• For example, each component Ci (1≤ i ≤ 255) may be assigned to YUV, RGB or
YIQ signals.
All pixels of all components within the same image are coded with the same number of
bits.
• Lossy modes of JPEG use precision of 8-12 bits per pixel.
• Lossless mode uses precision of 2 up to 12 bits per pixel.
• If a JPEG application makes use of any other number of bits, the application must
perform a suitable image transformation to a well-defined number of bits per pixel
(JPEG standard).
Interleaved Data units of different components are combined to minimum coded units
(MCUs).
If all components of the image have the same resolution, then an MCU consists of exactly
one data unit for each component.
The decoder displays the image MCU by MCU.
If single components of the image have different resolutions, then the reconstruction of
MCUs is more complex:
• For each component, determine the regions of the data units; each component has the
same number of regions, and each MCU corresponds to one region.
• Data units in a region are ordered left-to-right and top-to-bottom.
• Build the MCU.
JPEG standard: only 4 components can be encoded in interleaved mode, and the length of
an MCU is bounded: an MCU consists of at most 10 data units.
After image preparation, the uncompressed image samples are grouped into data units of
8x8 pixels and passed to the JPEG encoder.
Image Processing
First step:
• Pixel values are shifted (zero-shift) into the range [-128, 127], with 0 in the center.
• The values in an 8x8 pixel block are denoted by Syx, with y, x in the range [0, 7];
there are 64 sampled values Syx in each block.
• The DCT maps the values from the time domain to the frequency domain.
Forward Discrete Cosine Transformation (FDCT)
S(v,u) coefficients:
• S(0,0) includes the lowest frequency in both directions and is called the DC
coefficient. S(0,0) determines the fundamental color of the block (64 pixels). For
this coefficient the frequency is 0 in both directions.
• S(0,1)…S(7,7) are called AC coefficients. Their frequency is non-zero in one or
both directions. There exist many AC coefficients with a value around 0.
Factoring
• By factoring the computation of the DCT coefficients, the problem can be
reduced to a series of 1D FDCTs.
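Under the usual orthonormal DCT-II scaling (an assumption here; the JPEG text defines the exact formula), the factoring can be sketched as a 1D FDCT applied first to every row and then to every column:

```python
import math

def dct_1d(x):
    """1D forward DCT-II with orthonormal scaling."""
    N = len(x)
    out = []
    for u in range(N):
        c = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        out.append(c * sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                           for n in range(N)))
    return out

def dct_2d(block):
    """2D DCT of a square block, factored into row and column 1D DCTs."""
    rows = [dct_1d(r) for r in block]              # transform every row
    cols = [dct_1d(list(c)) for c in zip(*rows)]   # then every column
    return [list(r) for r in zip(*cols)]           # transpose back

# A flat block has all its energy in the DC coefficient S(0,0):
flat = [[100.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

For a uniform block the AC coefficients come out (numerically) zero, which is exactly why quantization and entropy coding work so well on smooth image regions.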
Quantization
GOAL: To throw out bits.
Example:
• 101101 = 45 (6 bits).
• We can truncate this to 4 bits: 1011 = 11 (dropping the two low-order bits, so
the represented value is 44),
• or to 3 bits: 101 = 5 (value = 40) or 110 = 6 (value = 48).
Uniform quantization is achieved by dividing the DCT coefficient value S(v,u) by N and
rounding the result.
In S(v,u) how many bits do we throw away?
ANSWER: Use quantization tables
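Uniform quantization by a per-coefficient step size, and the corresponding (lossy) reconstruction, can be sketched as follows (the step sizes here are made up for illustration; JPEG uses standard per-frequency quantization tables):

```python
def quantize(coeffs, table):
    """Divide each coefficient S(v,u) by its step size and round."""
    return [[round(s / q) for s, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Reconstruct by multiplying back; the lost precision stays lost."""
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

levels = quantize([[45, 100]], [[4, 16]])   # [[11, 6]]
approx = dequantize(levels, [[4, 16]])      # [[44, 96]] -- not the originals
```

Larger step sizes discard more bits, so high-frequency coefficients (which the eye is less sensitive to) are usually given coarser steps.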
Entropy Encoding
• After image processing we have quantized DC and AC coefficients.
• The initial step of entropy encoding is to map the 8x8 plane into a 64-element vector.
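The 8x8-to-64 mapping is conventionally done with a zig-zag scan, which groups the low-frequency coefficients (and hence most non-zero values) at the front of the vector. A sketch:

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    # Walk the anti-diagonals (r + c); alternate direction on each diagonal.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  -rc[1] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag(block):
    """Map an 8x8 block into a 64-element vector, DC coefficient first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

order = zigzag_order()
# The scan starts (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ...
```

Putting the (mostly zero) high-frequency coefficients at the tail produces long zero runs that the subsequent entropy coder compresses well.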
H.261
• H.261 is a video coding standard published by the ITU-T in 1990.
• It is the most widely used international video compression standard.
• H.261 is usually used in conjunction with other control and framing standards.
• The H.261 standard describes the video coding/decoding methods for the video
portion of an audiovisual service
• Designed for data rates of p*64 kbps, where p is in the range 1-30
MPEG
The MPEG standard covers both video and audio compression. It also includes many
technical specifications such as image resolution, video and audio synchronization,
multiplexing of the data packets, network protocols, and so on. Here we consider only the
video compression at the algorithmic level. The MPEG algorithm relies on two basic
techniques: motion compensation, which exploits temporal redundancy, and DCT-based
transform coding, which exploits spatial redundancy.
MPEG itself does not specify the encoder at all, but only the structure of the decoder, and
what kind of bit stream the encoder should produce. Temporal prediction techniques with
motion compensation are used to exploit the strong temporal correlation of video signals.
The motion is estimated by predicting the current frame on the basis of a certain previous
and/or future frame. The information sent to the decoder consists of the compressed
DCT coefficients of the residual block together with the motion vector. There are three
types of pictures in MPEG:
• Intra-pictures (I)
• Predicted pictures (P)
• Bidirectionally predicted pictures (B)
Figure 1 demonstrates the position of the different types of pictures. Every Nth frame in
the video sequence is an I-picture, and every Mth frame a P-picture. Here N=12 and M=4.
The rest of the frames are B-pictures.
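With N = 12 and M = 4 the picture-type pattern of Figure 1 can be reproduced by a small sketch (indexing frames from 0 is our convention):

```python
def picture_type(frame, n=12, m=4):
    """Picture type for a frame index: every nth frame is I, every mth is P,
    the rest are B."""
    if frame % n == 0:
        return "I"
    if frame % m == 0:
        return "P"
    return "B"

pattern = "".join(picture_type(f) for f in range(13))
# pattern == "IBBBPBBBPBBBI", matching Figure 1
```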
Intra pictures are coded as still images by the DCT algorithm, similarly to JPEG. They
provide access points for random access, but only with moderate compression. Predicted
pictures are coded with reference to a past picture. The current frame is predicted on the
basis of the previous I- or P-picture, and the residual (the difference between the
prediction and the original picture) is then compressed by DCT. Bidirectional pictures are
coded similarly to the P-pictures, but the prediction can refer both to a past and to a
future frame, each of which can be an I- or P-picture. Bidirectional pictures are never
used as a reference.
The pictures are divided into 16×16 macroblocks, each consisting of four 8×8
elementary blocks. The B-pictures are not always coded by bidirectional prediction;
four different prediction techniques can be used:
• Bidirectional prediction
• Forward prediction
• Backward prediction
• Intra coding.
The prediction method is chosen for each macroblock separately. Bidirectional
prediction is used whenever possible. However, in the case of sudden camera
movements, or at a breaking point of the video sequence, the best predictor can
sometimes be the forward predictor (if the current frame is before the breaking point) or
the backward predictor (if the current frame is after the breaking point). The one that
gives the best match is chosen. If none of the predictors is good enough, the macroblock
is coded by intra-coding. Thus, B-pictures can consist of macroblocks coded like those in
the I- and P-pictures.
The intra-coded blocks are quantized differently from the predicted blocks. This is
because intra-coded blocks contain information in all frequencies and are very likely to
produce a 'blocking effect' if quantized too coarsely. The predicted blocks, on the other
hand, contain mostly high frequencies and can be quantized with coarser quantization
tables.
Figure 1. Picture pattern I B B B P B B B P B B B I: forward prediction links the I- and
P-pictures; the B-pictures use bidirectional prediction.
Motion estimation:
The prediction block in the reference frame is not necessarily at the same coordinates as
the block in the current frame. Because of motion in the image sequence, the most
suitable predictor for the current block may be found anywhere in the reference frame.
Motion estimation specifies where the best prediction (best match) is found, whereas
motion compensation merely consists of calculating the difference between the reference
block and the current block.
The motion information consists of one vector for forward predicted and backward
predicted macroblocks, and of two vectors for bidirectionally predicted macroblocks. The
MPEG standard does not specify how the motion vectors are to be computed; however,
block matching techniques are widely used. The idea is to find, in the reference frame
(within a predefined search range), a macroblock similar to the macroblock in the current
frame. The candidate blocks in the reference frame are compared to the current one, and
the one that minimizes a cost function measuring the mismatch between the blocks is
chosen as the reference block.
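A sketch of exhaustive block matching, using the sum of absolute differences (SAD) as the cost function (SAD is a common choice; the standard leaves the cost function to the encoder). The block size and search range here are deliberately tiny for illustration:

```python
def sad(block_a, block_b):
    """Sum of absolute differences: the mismatch cost between two blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))

def block(frame, top, left, size):
    """Extract a size x size block with its top-left corner at (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(ref, cur, top, left, size=4, search=2):
    """Exhaustive search within +/-search pixels; returns the (dy, dx)
    offset of the best-matching block in the reference frame."""
    target = block(cur, top, left, size)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(ref) - size and 0 <= x <= len(ref[0]) - size:
                cost = sad(block(ref, y, x, size), target)
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best

ref = [[r * 8 + c for c in range(8)] for r in range(8)]
# Current frame = reference shifted down-right by one pixel:
cur = [[ref[y - 1][x - 1] if y and x else 0 for x in range(8)] for y in range(8)]
mv = best_motion_vector(ref, cur, 3, 3)   # (-1, -1): content came from up-left
```

The cost of this exhaustive search grows quadratically with the search range, which motivates the telescopic and hierarchical alternatives described below.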
Exhaustive search, where all possible motion vectors are considered, is known to give
good results. Because full searches with a large search range have a high computational
cost, alternatives such as telescopic and hierarchical searches have been investigated. In
the former, the result of motion estimation at a previous time is used as a starting point
for refinement at the current time, thus allowing relatively narrow searches even for large
motion vectors. In hierarchical searches, a lower-resolution representation of the image
sequence is formed by filtering and subsampling. At the reduced resolution the
computational complexity is greatly reduced, and the result of the lower-resolution
search can be used as a starting point for a reduced-range search at full resolution.
H.261:
• Only 1, 2 or 3 skipped pictures allowed.
• Pixel-accurate motion vectors.
• Typical motion vector range is +/-7 pixels.
• Used mostly in interactive applications; end-to-end delay is very critical.

MPEG:
• No restrictions on skipped pictures.
• Sub-pixel-accurate motion vectors.
• Typical motion vector range is +/-15 pixels.
• The end-to-end coding delay is not critical.