
Huffman coding

Neena Raj N. R.

Department of Computer Science and Engineering


Mar Baselios College of Engineering and Technology, Nalanchira

January 2024
Syllabus

Module 2
Run length encoding, RLE Text compression, Statistical methods - Prefix Codes, Binary Huffman coding, Illustration of Binary Huffman coding, Non-binary Huffman Algorithms, Arithmetic Coding algorithm, Illustration of Arithmetic Coding algorithm.




Course Outcomes
CO1 Describe the fundamental principles of data compression. (Understand)
CO2 Make use of statistical and dictionary based compression techniques for various applications. (Apply)
CO3 Illustrate various image compression standards. (Apply)
CO4 Summarize video compression mechanisms to reduce the redundancy in video. (Understand)
CO5 Use the fundamental properties of digital audio to compress audio data. (Understand)




Binary Huffman coding

Developed by David Huffman.


Huffman codes are prefix codes and are optimum for a given model
(set of probabilities).
The Huffman procedure is based on two observations regarding optimum prefix codes:
1. In an optimum code, symbols that occur more frequently (have a higher probability of occurrence) will have shorter codewords than symbols that occur less frequently.
2. In an optimum code, the two symbols that occur least frequently will have codewords of the same length.




Design of a Huffman code


Example: Design a Huffman code for a source that puts out letters from an alphabet A = {a1, a2, a3, a4, a5} with P(a1) = P(a3) = 0.2, P(a2) = 0.4, and P(a4) = P(a5) = 0.1.
To design the Huffman code, we first sort the letters in descending order of probability: a2 (0.4), a1 (0.2), a3 (0.2), a4 (0.1), a5 (0.1).




Here c(ai) denotes the codeword for ai.

The two symbols with the lowest probability are a4 and a5. They are combined into a single symbol a4′ with probability 0.2, and their codewords are assigned as

c(a4) = α1 ∗ 0
c(a5) = α1 ∗ 1

where α1 is a binary string (the as-yet-undetermined codeword of a4′), and ∗ denotes concatenation.




The reduced alphabet now consists of a2 (0.4), a1 (0.2), a3 (0.2), and a4′ (0.2). Combining the two entries at the bottom of the sorted list, a3 and a4′, into a new symbol a3′ with probability 0.4 gives

c(a3) = α2 ∗ 0
c(a4′) = α2 ∗ 1 = α1
c(a4) = α2 ∗ 10
c(a5) = α2 ∗ 11




The sorted list now contains a2 (0.4), a3′ (0.4), and a1 (0.2). Combining the two lowest entries, a3′ and a1, into a3′′ with probability 0.6 gives

c(a3′) = α3 ∗ 0 = α2
c(a1) = α3 ∗ 1
c(a3) = α3 ∗ 00
c(a4) = α3 ∗ 010
c(a5) = α3 ∗ 011




Only a3′′ (0.6) and a2 (0.4) remain, so c(a3′′) = 0 and c(a2) = 1, which fixes α3 = 0. Substituting back gives the final code:

c(a2) = 1
c(a1) = 01
c(a3) = 000
c(a4) = 0010
c(a5) = 0011
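As a quick sanity check (a minimal Python sketch, not part of the lecture material; the symbol names and probabilities are taken from the example above), the following verifies the prefix property and the average codeword length:

# Sketch: verify the prefix property and the average length of the example code.
code = {"a1": "01", "a2": "1", "a3": "000", "a4": "0010", "a5": "0011"}
prob = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}

# Prefix property: no codeword is a prefix of another codeword.
words = list(code.values())
is_prefix_code = not any(
    u != w and w.startswith(u) for u in words for w in words
)
avg_len = sum(prob[s] * len(code[s]) for s in code)

print(is_prefix_code)   # True
print(avg_len)          # 2.2 bits/symbol (up to floating-point rounding)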





The procedure can be summarized as a sequence of such source reductions: at each stage the two least probable symbols are combined into a single aggregate symbol, and this is repeated until only two symbols remain, which receive the codewords 0 and 1.



The entropy for this source is 2.122 bits/symbol.
The average length for this code is 2.2 bits/symbol.
A measure of the efficiency of this code is its redundancy—the
difference between the entropy and the average length. In this case,
the redundancy is 0.078 bits/symbol.
The redundancy is zero when the probabilities are negative powers of
two.
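These numbers can be reproduced directly from the probabilities; the short sketch below (not part of the lecture, with the probabilities and codeword lengths taken from the example) computes the entropy, the average length, and the redundancy:

import math

# Probabilities and codeword lengths from the example above.
prob = [0.4, 0.2, 0.2, 0.1, 0.1]     # P(a2), P(a1), P(a3), P(a4), P(a5)
lengths = [1, 2, 3, 4, 4]            # lengths of c(a2), c(a1), c(a3), c(a4), c(a5)

entropy = -sum(p * math.log2(p) for p in prob)        # ~ 2.122 bits/symbol
avg_len = sum(p * l for p, l in zip(prob, lengths))   # 2.2 bits/symbol
redundancy = avg_len - entropy                        # ~ 0.078 bits/symbol

print(round(entropy, 3), avg_len, round(redundancy, 3))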



An alternative way of building a Huffman code is to use the fact that
the Huffman code, by virtue of being a prefix code, can be
represented as a binary tree in which the external nodes or leaves
correspond to the symbols.
The Huffman code for any symbol can be obtained by traversing the
tree from the root node to the leaf corresponding to the symbol,
adding a 0 to the codeword every time the traversal takes us over an
upper branch and a 1 every time the traversal takes us over a lower
branch.
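One way to realize this tree-based view in code is sketched below (a minimal illustration, not the lecture's own implementation; the helper name huffman_code is hypothetical). It repeatedly merges the two least probable nodes with a min-heap and then reads each codeword off the path from the root to the corresponding leaf:

import heapq
from itertools import count

def huffman_code(prob):
    """Sketch: build a binary Huffman tree and return {symbol: codeword}.
    The 0/1 labeling of the branches is arbitrary, so several equally
    optimal codes are possible."""
    tie = count()                            # tie-breaker; avoids comparing trees
    heap = [(p, next(tie), sym) for sym, p in prob.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # the two least probable nodes
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tie), (left, right)))
    _, _, root = heap[0]

    code = {}
    def walk(node, prefix=""):
        if isinstance(node, tuple):          # interior node: descend both branches
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the accumulated bits
            code[node] = prefix or "0"
    walk(root)
    return code

print(huffman_code({"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}))
# one possible output: {'a1': '00', 'a3': '01', 'a4': '100', 'a5': '101', 'a2': '11'}
# (average length 2.2 bits/symbol; different tie-breaking yields the code derived earlier)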



Design a Huffman code for a source that puts out letters from an alphabet A = {a1, a2, a3, a4, a5} with P(a1) = P(a3) = 0.2, P(a2) = 0.4, and P(a4) = P(a5) = 0.1.

Fig. Building the binary Huffman tree





Which of the different Huffman codes for a given set of symbols is best? The one with the smallest variance.
The variance of a code measures how much the lengths of the individual codewords deviate from the average length.
For the codes in the previous example, both alternatives have the same average length of 2.2 bits/symbol but different variances, and the minimum-variance code is the preferred one.
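A short sketch (illustrative only; it assumes the two codes compared are the one derived earlier, with codeword lengths 1, 2, 3, 4, 4, and a minimum-variance alternative with lengths 2, 2, 2, 3, 3) makes the comparison concrete:

# Sketch: compare the codeword-length variance of two Huffman codes
# for the same source (both have average length 2.2 bits/symbol).
prob = [0.4, 0.2, 0.2, 0.1, 0.1]          # P(a2), P(a1), P(a3), P(a4), P(a5)
lengths_a = [1, 2, 3, 4, 4]               # code obtained by the reduction procedure
lengths_b = [2, 2, 2, 3, 3]               # assumed minimum-variance alternative

def avg_and_var(lengths):
    avg = sum(p * l for p, l in zip(prob, lengths))
    var = sum(p * (l - avg) ** 2 for p, l in zip(prob, lengths))
    return avg, var

print(avg_and_var(lengths_a))   # approximately (2.2, 1.36)
print(avg_and_var(lengths_b))   # approximately (2.2, 0.16)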





Number of Codes
A Huffman code-tree for n symbols has n − 1 interior nodes.
Each interior node has two edges coming out of it, labeled 0 and 1; swapping the two labels produces a different Huffman code-tree.
The total number of different Huffman code-trees is therefore 2^(n−1) (for example, for 6 symbols there are 2^5 = 32 different Huffman code-trees).
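As a small illustration (a sketch, not from the lecture), swapping the labels at every interior node at once is the same as complementing every bit of every codeword, and the result is still a prefix code with the same codeword lengths:

# Sketch: swapping 0/1 at every interior node complements all codewords.
code = {"a2": "1", "a1": "01", "a3": "000", "a4": "0010", "a5": "0011"}
swapped = {s: w.translate(str.maketrans("01", "10")) for s, w in code.items()}

print(swapped)
# {'a2': '0', 'a1': '10', 'a3': '111', 'a4': '1101', 'a5': '1100'}
# Same lengths, still a prefix code: one of the 2**(n-1) possible variants.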


Fig. Two Huffman Code-Trees.


Nonbinary Huffman Coding

The binary Huffman coding procedure can easily be extended to the nonbinary case, where the code elements come from an m-ary alphabet and m is not equal to two.
The nonbinary Huffman coding procedure is based on two observations:
1. In an optimum code, symbols that occur more frequently (have a higher probability of occurrence) will have shorter codewords than symbols that occur less frequently.
2. The m symbols that occur least frequently will have codewords of the same length.


In the general case of an m-ary code and an M-letter alphabet, how many letters should be combined in the first phase? Let m′ be the number of letters combined in the first phase. Then m′ is the number between two and m that is congruent to M modulo (m − 1). This choice ensures that every later stage combines exactly m symbols and the reduction terminates with a single aggregate symbol.
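A small helper (a sketch; the name first_merge_size is hypothetical and not from the lecture) computes m′ for given M and m:

def first_merge_size(M: int, m: int) -> int:
    """Sketch: number of letters m' (2 <= m' <= m) to combine in the first
    phase of an m-ary Huffman code for an M-letter alphabet (m >= 3);
    m' must be congruent to M modulo (m - 1)."""
    r = M % (m - 1)
    # shift r into the range [2, m]; note that m itself is congruent to 1 (mod m - 1)
    return r if 2 <= r <= m else r + (m - 1)

print(first_merge_size(5, 3))   # 3: with five letters, a ternary code merges three letters first
print(first_merge_size(6, 3))   # 2: with six letters, only two are combined in the first phase
print(first_merge_size(8, 4))   # 2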


Design of a Nonbinary Huffman Code


Example:


Figure (a) shows a Huffman code tree for five symbols with
probabilities 0.15, 0.15, 0.2, 0.25, and 0.25.
The average code size is 2 × 0.25 + 3 × 0.15 + 3 × 0.15 + 2 × 0.20 + 2 × 0.25 = 2.3 bits/symbol.

Figure (b) shows a ternary Huffman code tree for the same five
symbols. The tree is constructed by selecting, at each step, three
symbols with the smallest probabilities and merging them into one
parent symbol, with the combined probability. The average code size
of this tree is
2 × 0.15 + 2 × 0.15 + 2 × 0.20 + 1 × 0.25 + 1 × 0.25 = 1.5 trits/symbol.
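Both averages can be reproduced with a generic m-ary reduction, sketched below (illustrative only; huffman_lengths is a hypothetical helper, and tie-breaking may produce a different but equally optimal tree):

import heapq
from itertools import count

def huffman_lengths(probs, m=2):
    """Sketch: codeword lengths of an m-ary Huffman code. Zero-probability
    dummy leaves are added so that every reduction merges exactly m nodes."""
    n = len(probs)
    dummies = (1 - n) % (m - 1)                    # 0 for the binary case
    tie = count()                                  # tie-breaker for equal probabilities
    # each heap item: (probability, tie, [(leaf index, depth), ...])
    heap = [(p, next(tie), [(i, 0)]) for i, p in enumerate(probs)]
    heap += [(0.0, next(tie), []) for _ in range(dummies)]
    heapq.heapify(heap)
    while len(heap) > 1:
        merged_p, merged_leaves = 0.0, []
        for _ in range(m):                         # combine the m least probable nodes
            p, _, leaves = heapq.heappop(heap)
            merged_p += p
            merged_leaves += [(i, d + 1) for i, d in leaves]
        heapq.heappush(heap, (merged_p, next(tie), merged_leaves))
    depth = dict(heap[0][2])
    return [depth[i] for i in range(n)]

p = [0.15, 0.15, 0.20, 0.25, 0.25]
len2 = huffman_lengths(p, m=2)                     # e.g. [3, 3, 2, 2, 2]
len3 = huffman_lengths(p, m=3)                     # e.g. [2, 2, 2, 1, 1]
print(sum(l * q for l, q in zip(len2, p)))         # ~ 2.3 bits/symbol
print(sum(l * q for l, q in zip(len3, p)))         # ~ 1.5 trits/symbol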



References

[1] K. Sayood, Introduction to Data Compression. Morgan Kaufmann, 2003.
