
Chapter Four

Multimedia Data Compression

Outline
• Introduction
• Lossless and lossy compression
• Run-Length Encoding
• Shannon-Fano Coding
• Huffman Coding
• Dictionary-based coding (LZW)
• Adaptive Huffman Coding

Introduction: What is Multimedia Data Compression?
• The process of coding that effectively reduces the total number of bits needed to represent certain information.
• Refers to sending or storing a smaller number of bits.

Introduction: Why Multimedia Data Compression?
• Audio, images, and video require vast amounts of data:
– 320 x 240 x 8-bit grayscale image: 77 KB
– 1100 x 900 x 24-bit color image: 3 MB
– 640 x 480 x 24-bit x 30 frames/sec video: 27.6 MB/sec
• Low network bandwidth does not allow for real-time video transmission.
• Slow storage or processing devices do not allow for fast playback.
• Compression reduces storage requirements.

Multimedia Data Compression Techniques

Lossless Techniques
• Nowadays, libraries, museums, film studios, and governments are converting more and more data and archives into digital form. Some of the data (e.g., precious books and paintings) indeed need to be stored without any loss.
• As a start, suppose we want to encode the call numbers of the 120 million or so items in the Library of Congress (a mere 20 million, if we consider just books). Why don't we just transmit each item as a 27-bit number, giving each item a unique binary code (since 2^27 = 134,217,728 > 120,000,000)?
• The main problem is that this "great idea" requires too many bits. In fact, there exist many coding techniques that will effectively reduce the total number of bits needed to represent the above information. The process involved is generally referred to as compression.

Lossless Techniques
• In lossless data compression, the integrity of the data is preserved.
• The original data and the data after compression and decompression are exactly the same.
– In these methods, the compression and decompression algorithms are exact inverses of each other.
– No part of the data is lost in the process.
– Redundant data is removed during compression and added back during decompression.
– Lossless compression methods are normally used when we cannot afford to lose any data.

Lossless Techniques (1. Run-Length Encoding)
• Run-length encoding is probably the simplest method of compression.
• It can be used to compress data made of any combination of symbols.
• It does not need to know the frequency of occurrence of symbols and can be very efficient if the data is represented as 0s and 1s.
• The basic idea is that if the information source we wish to compress has the property that symbols tend to form continuous groups (runs), then instead of coding each symbol in the group individually, we can code one such symbol and the length of the group.
• The method can be even more efficient if the data uses only two symbols (for example, 0 and 1) in its bit pattern and one symbol is more frequent than the other.
• As an example, consider a bi-level image (one with only 1-bit black-and-white pixels) with monotone regions, like a fax. This information source can be efficiently coded using run-length coding.
Lossless Techniques (1. Run-Length Encoding: example 1)

Lossless Techniques (1. Run-Length Encoding: example 2)
• Compress the following data using run-length encoding:
111234444888889
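A minimal Python sketch (my own illustration, not from the slides; the function name rle_encode is an assumption) that encodes each run as a (symbol, length) pair:

def rle_encode(data):
    """Run-length encode a string: emit (symbol, run_length) pairs."""
    runs = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((data[i], j - i))  # one symbol plus the length of its run
        i = j
    return runs

print(rle_encode("111234444888889"))
# [('1', 3), ('2', 1), ('3', 1), ('4', 4), ('8', 5), ('9', 1)]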

Lossless Techniques (2. Shannon-Fano Algorithm)

 Shannon-Fano Algorithm: a top-down approach

1. Sort the symbols according to the frequency count (probability) of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all parts contain only one symbol.

An example: coding of "HELLO"

Frequency count of the symbols in "HELLO":
Symbol  L  H  E  O
Count   2  1  1  1
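The top-down split can be sketched in Python (my own illustration; the function name shannon_fano is an assumption). On "HELLO" this yields L=0, H=10, E=110, O=111 (10 bits total); different tie-breaking in step 2 gives a different, equally valid set of code words.

from collections import Counter

def shannon_fano(symbols):
    """symbols: list of (symbol, count) pairs, sorted by descending count."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(count for _, count in symbols)
    acc, split = 0, 1
    for i, (_, count) in enumerate(symbols):
        if i > 0 and acc + count > total / 2:
            break          # split here: both halves have roughly equal counts
        acc += count
        split = i + 1
    codes = {}
    for sym, code in shannon_fano(symbols[:split]).items():
        codes[sym] = "0" + code   # first part: prepend 0
    for sym, code in shannon_fano(symbols[split:]).items():
        codes[sym] = "1" + code   # second part: prepend 1
    return codes

freq = Counter("HELLO")  # {'H': 1, 'E': 1, 'L': 2, 'O': 1}
sorted_syms = sorted(freq.items(), key=lambda kv: -kv[1])
print(shannon_fano(sorted_syms))  # {'L': '0', 'H': '10', 'E': '110', 'O': '111'}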

Shannon-Fano Algorithm

Figure: coding tree for HELLO by the Shannon-Fano algorithm.
Figure: one result of performing the Shannon-Fano algorithm on HELLO.
Figure: another coding tree for HELLO by the Shannon-Fano algorithm.
Figure: another result of performing the Shannon-Fano algorithm on HELLO (a different split in step 2 yields a different, equally valid code).
Lossless Techniques (3. Huffman Coding)
• Assigns shorter codes to symbols that occur more frequently and longer codes to those that occur less frequently.
• A binary coding tree is used, in which the left branches are coded 0 and the right branches 1. A simple list data structure is also used.
• Algorithm (Huffman Coding):
1. Initialization: put all symbols on the list, sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
(a) From the list, pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes and create a parent node for them.
(b) Assign the sum of the children's frequency counts to the parent and insert it into the list, such that the order is maintained.
(c) Delete the children from the list.
3. Assign a code word for each leaf based on the path from the root.
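A minimal Python sketch of the algorithm above (my own illustration; the name huffman_codes is an assumption), demonstrated on "HELLO":

import heapq
from collections import Counter

def huffman_codes(freq):
    """freq: dict symbol -> count. Returns dict symbol -> code string."""
    # Heap entries: (count, tie_breaker, tree); tree is a symbol or (left, right).
    heap = [(c, i, s) for i, (s, c) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)  # the two lowest-count nodes
        c2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, tie, (t1, t2)))  # parent gets the sum
        tie += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):      # internal node
            walk(tree[0], code + "0")    # left branch: 0
            walk(tree[1], code + "1")    # right branch: 1
        else:
            codes[tree] = code or "0"    # single-symbol edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes(Counter("HELLO")))  # e.g. {'H': '00', 'E': '01', ...}; ties may vary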

Lossless Techniques (3. Huffman Coding)
• A bottom-up approach.
• For example, imagine we have a text file that uses only five characters (A, B, C, D, E).
• Before we can assign bit patterns to each character, we assign each character a weight based on its frequency of use.
• In this example, assume that the frequency of the characters is as shown in the table below.

Lossless Techniques (3. Huffman Coding: example 1)

Figure 15.4: Huffman coding


Lossless Techniques (3. Huffman Coding: example 1)
A character's code is found by starting at the root and following the branches that lead to that character. The code itself is the bit value of each branch on the path, taken in sequence.

Figure: final tree and code


Lossless Techniques (3. Huffman Coding: encoding)

Let us see how to encode text using the code for our five characters. Figure 15.6 shows the original and the encoded text.
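As a small continuation of the earlier Python sketch (the slides' five-character table is not reproduced here, so "HELLO" is reused instead): encoding is just concatenating each character's code word.

codes = huffman_codes(Counter("HELLO"))          # code table from the sketch above
encoded = "".join(codes[ch] for ch in "HELLO")   # concatenate the code words
print(encoded)  # a 10-bit string; the exact bits depend on tie-breaking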

Figure 15.6: Huffman encoding
Lossless Techniques (3. Huffman Coding: decoding)

The recipient has a very easy job in decoding the data it receives. The figure below shows how decoding takes place.
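Decoding is easy precisely because Huffman codes are prefix-free (see the properties below): the receiver reads bits until they match a code word, emits that symbol, and repeats. A minimal sketch continuing the example above:

def huffman_decode(bits, codes):
    """Walk the bit string, emitting a symbol whenever a code word matches."""
    inverse = {code: sym for sym, code in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:        # prefix-free: the first match is the symbol
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

print(huffman_decode(encoded, codes))  # 'HELLO'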

Figure: Huffman decoding
Properties of Huffman Coding

1. Unique prefix property: no Huffman code is a prefix of any other Huffman code.
– This precludes any ambiguity in decoding.
2. Optimality: minimum-redundancy code, proved optimal for a given data model (i.e., a given probability distribution):
• The two least frequent symbols will have the same length for their Huffman codes, differing only at the last bit.
• Symbols that occur more frequently (higher probability) will have shorter Huffman codes than symbols that occur less frequently.

Dictionary-Based Coding: Lempel-Ziv Encoding

Lempel-Ziv (LZ) encoding is an example of a category of algorithms called dictionary-based encoding.
The idea is to create a dictionary (a table) of strings used during the communication session.
If both the sender and the receiver have a copy of the dictionary, then previously encountered strings can be substituted by their index in the dictionary to reduce the amount of information transmitted.

Lempel-Ziv Encoding: Compression

• In this phase there are two concurrent events: building an indexed dictionary and compressing a string of symbols.
• The algorithm extracts the smallest substring that cannot be found in the dictionary from the remaining uncompressed string.
• It then stores a copy of this substring in the dictionary as a new entry and assigns it an index value.
• Compression occurs when the substring, except for the last character, is replaced with the index found in the dictionary.
• The process then inserts the index and the last character of the substring into the compressed string.
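A minimal Python sketch of this compression phase (my own illustration of the LZ78-style scheme described above; the function name and demo string are assumptions). Each output token is an (index, character) pair, with index 0 meaning "no prefix":

def lz_encode(text):
    """LZ78-style encoding: emit (prefix_index, next_char) tokens."""
    dictionary = {}          # substring -> index (1-based); starts empty
    tokens, cur = [], ""
    for ch in text:
        if cur + ch in dictionary:
            cur += ch        # grow until the substring is no longer in the dictionary
        else:
            tokens.append((dictionary.get(cur, 0), ch))
            dictionary[cur + ch] = len(dictionary) + 1  # store the new entry
            cur = ""
    if cur:                  # leftover substring: all its prefixes are in the dictionary
        tokens.append((dictionary[cur[:-1]] if len(cur) > 1 else 0, cur[-1]))
    return tokens

print(lz_encode("AAABBBAAB"))
# [(0, 'A'), (1, 'A'), (0, 'B'), (3, 'B'), (2, 'B')]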

Figure: an example of Lempel-Ziv encoding
Lempel-Ziv Encoding: Decompression

• Decompression is the inverse of the compression process.
• The process extracts the substrings from the compressed string and tries to replace the indexes with the corresponding entries in the dictionary, which is empty at first and built up gradually.
• The idea is that when an index is received, there is already an entry in the dictionary corresponding to that index.
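A matching decoder sketch (again my own illustration): it rebuilds the same dictionary from the tokens themselves, so no dictionary needs to be transmitted.

def lz_decode(tokens):
    """Rebuild the text from (prefix_index, next_char) tokens."""
    dictionary = {}          # index -> substring, grown exactly as the encoder did
    out = []
    for index, ch in tokens:
        entry = (dictionary[index] if index else "") + ch
        dictionary[len(dictionary) + 1] = entry  # same index assignment as the encoder
        out.append(entry)
    return "".join(out)

print(lz_decode([(0, 'A'), (1, 'A'), (0, 'B'), (3, 'B'), (2, 'B')]))  # 'AAABBBAAB'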

Figure: an example of Lempel-Ziv decoding
Adaptive Huffman Coding

 Statistics are gathered and updated dynamically as the data stream arrives.
 The frequency counts for the symbols are incremented as they are seen.
 The key idea is to build a Huffman tree that is optimal for the part of the message already seen, and to reorganize it when needed to maintain its optimality.
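As a rough, simplified illustration only: real adaptive Huffman coders (e.g., the FGK or Vitter algorithms) restructure the tree incrementally, whereas this sketch simply rebuilds it from the running counts after each symbol, reusing huffman_codes from the earlier sketch.

from collections import Counter

def adaptive_encode(text):
    """Toy adaptive scheme: code each symbol with a tree built from counts so far."""
    # Assumes the alphabet is known in advance; real coders instead reserve an
    # escape/"not yet transmitted" node for symbols seen for the first time.
    counts = Counter({ch: 1 for ch in set(text)})
    bits = []
    for ch in text:
        codes = huffman_codes(counts)  # tree optimal for the part already seen
        bits.append(codes[ch])
        counts[ch] += 1                # update the statistics on the fly
    return "".join(bits)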

Why Adaptive Huffman Coding?

• Static Huffman coding suffers from the fact that the decompressor needs some knowledge of the probabilities of the symbols in the compressed file.
• Transmitting this information can require extra bits to encode the file.
• If this information is unavailable, compressing the file requires two passes:
– First pass: find the frequency of each symbol and construct the Huffman tree.
– Second pass: compress the file.
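For contrast, a sketch of the two-pass static approach (reusing huffman_codes from the earlier sketch; the function name is mine):

from collections import Counter

def two_pass_encode(text):
    """Static Huffman: pass 1 counts symbols, pass 2 encodes with the fixed table."""
    freq = Counter(text)                          # first pass: frequencies
    codes = huffman_codes(freq)                   # build the tree/code table once
    encoded = "".join(codes[ch] for ch in text)   # second pass: compress
    return encoded, codes  # the table (or tree) must be sent to the decompressor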

