
Huffman Coding

Motivation

To compress or not to compress, that is the question!

• reducing the space required to store files on disk or tape
• reducing the time needed to transmit large files

Image source: plus.maths.org/issue23/features/data/data.jpg


Example:
• A file with 100K characters

Character                 a    b    c    d    e    f
Frequency (in 1000s)      45   13   12   16   9    5
Fixed-length codeword     000  001  010  011  100  101

Space = (45*3 + 13*3 + 12*3 + 16*3 + 9*3 + 5*3) * 1000
      = 300K bits

Can we do better?? YES!!
• Use variable-length codes instead.
• Give frequent characters short codewords, and infrequent characters
  long codewords.

Character                 a    b    c    d    e     f
Frequency (in 1000s)      45   13   12   16   9     5
Fixed-length codeword     000  001  010  011  100   101
Variable-length codeword  0    101  100  111  1101  1100

Space = (45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4) * 1000
      = 224K bits  (Savings = 25%)


PREFIX-FREE CODE:

• No codeword is also a prefix of some other codeword.

No ambiguity!!

Variable-length codewords: 0, 101, 100, 111, 1101, 1100
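Prefix-freeness is easy to test mechanically. Here is a minimal Python
sketch (an illustration, not part of the original slides): after sorting,
a codeword can only be a prefix of its immediate successor, so checking
adjacent pairs suffices. The first example uses the codewords above.

def is_prefix_free(codewords):
    # A codeword that is a prefix of another sorts immediately before
    # the block of its extensions, so adjacent pairs are enough.
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

assert is_prefix_free(["0", "101", "100", "111", "1101", "1100"])
assert not is_prefix_free(["0", "01"])   # "0" is a prefix of "01"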
Representation:

A Huffman code is represented as a binary tree:

• each edge represents either a
  • 0, "go to the left child", or
  • 1, "go to the right child"
• each leaf corresponds to a particular character

The tree for the example code:

                (100)
               0/    \1
             a:45   (55)
                   0/   \1
                 (25)    (30)
                0/  \1  0/   \1
              c:12 b:13 (14) d:16
                       0/  \1
                      f:5  e:9

• Cost of the tree:
  B(T) = Σ f(c) dT(c), summed over all characters c in the alphabet C,
  where f(c) is the frequency of c and dT(c) is the depth of c's leaf in T.
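For the example tree above, the cost works out to
B(T) = 45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4 = 224 (in thousands of bits),
matching the 224K bits computed earlier.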
Optimal Code

[Figure: two code trees for the same frequencies. Left: the tree of the
fixed-length code, with all six leaves a:45, b:13, c:12, d:16, e:9, f:5
at depth 3; it is not a full binary tree. Right: the optimal
variable-length tree built above, with root weight 100.]

An optimal code is always represented by:
• a full binary tree (every internal node has two children)
• one leaf for each letter of the alphabet
Constructing a Huffman code
• Build the tree T in a bottom-up manner.
• Begin with a set of |C| leaves.
• Perform |C| - 1 "merging" operations to build the tree upward.

Greedy choice?
• The two nodes of smallest weight are chosen and merged at each step
  (see the sketch after this list).
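As a concrete illustration, here is a minimal Python sketch (not part of
the original slides) in which the priority queue is a binary min-heap and
a tree node is either a symbol or a (left, right) pair:

import heapq

def build_huffman_tree(freq):
    # Heap entries are (weight, tiebreak, node); the unique tiebreak
    # keeps tuple comparison away from the nodes themselves.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Greedy choice: extract the two nodes of smallest weight ...
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # ... and merge them under a fresh internal node.
        heapq.heappush(heap, (w1 + w2, count, (left, right)))
        count += 1
    return heap[0][2]

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
tree = build_huffman_tree(freq)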
The steps of Huffman's algorithm

[Figure: the forest after each merge, for the example frequencies.]
1. Start:                      f:5, e:9, c:12, b:13, d:16, a:45
2. Merge f:5, e:9 into 14:     c:12, b:13, 14, d:16, a:45
3. Merge c:12, b:13 into 25:   14, d:16, 25, a:45
4. Merge 14, d:16 into 30:     25, 30, a:45
5. Merge 25, 30 into 55:       a:45, 55
6. Merge a:45, 55 into 100:    the root; the tree is complete.
Running Time Analysis

Q is implemented as a binary min-heap.

The merge operation is executed exactly n - 1 times, and each heap
operation requires time O(log n), for a total of O(n log n).
Huffman Coding Example

E = 01
I = 00
C = 10
A = 111
H = 110

Input:  ACE
Output: (111)(10)(01) = 1111001
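In code (a sketch over the same (left, right)/symbol tree representation
as above; assign_codes is an illustrative helper, not the slides'
notation), the code table is produced by one tree walk and encoding is a
table lookup per letter:

def assign_codes(node, prefix=""):
    # Walk the tree: '0' on the edge to the left child, '1' to the right.
    if isinstance(node, str):            # a leaf holds one symbol
        return {node: prefix or "0"}     # "0" covers a 1-letter alphabet
    left, right = node
    table = assign_codes(left, prefix + "0")
    table.update(assign_codes(right, prefix + "1"))
    return table

codes = {'A': '111', 'C': '10', 'E': '01', 'H': '110', 'I': '00'}
assert "".join(codes[ch] for ch in "ACE") == "1111001"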


Decoding
1. Read the compressed file & the binary tree.
2. Use the binary tree to decode the file:
   follow the path from the root to a leaf for each codeword
   (a sketch follows below).

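A minimal decoder over the same nested-tuple representation (the
hard-coded tree below spells out the example's codewords and is an
illustration only; note it uses '0' = left, matching the construction
sketch rather than the slides' drawing):

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        # Follow the path: '0' goes left, '1' goes right.
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):        # reached a leaf: emit its symbol
            out.append(node)
            node = root                  # restart at the root
    return "".join(out)

# E = 01, I = 00, C = 10, A = 111, H = 110 (with '0' = left here):
root = (("I", "E"), ("C", ("H", "A")))
assert decode("1111001", root) == "ACE"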
Huffman Code Algorithm Overview

[Figure: the example code tree, drawn with edge label 1 toward one child
and 0 toward the other. Leaves A:3 and H:2 hang from an internal node of
weight 5; that node and C:5 hang from a node of weight 10; E:8 and I:7
hang from a node of weight 15; the root has weight 25. This yields
A = 111, H = 110, C = 10, E = 01, I = 00.]

Huffman Decoding, steps 1-7, for the input 1111001:
• bits 1, 1, 1: root → weight-10 node → weight-5 node → leaf A   (output: A)
• bits 1, 0:    root → weight-10 node → leaf C                   (output: AC)
• bits 0, 1:    root → weight-15 node → leaf E                   (output: ACE)
Induction on the number of codewords

• The Huffman algorithm finds an optimal code for n = 1.
• Suppose that the Huffman algorithm finds an optimal code for every
  alphabet of size n; now consider an alphabet of size n + 1 . . .

Correctness proof
Greedy Choice Proof

[Figure: three trees. In T, the two deepest sibling leaves are a and b,
while the minimum-frequency leaves x and y sit elsewhere. T' is T with
x and a exchanged; T'' is T' with y and b exchanged, so that x and y
become deepest siblings.]

Assume that f[a] ≤ f[b] and f[x] ≤ f[y].

Since f[x] and f[y] are the two lowest frequencies,
f[x] ≤ f[a] and f[y] ≤ f[b],
so the two exchanges cannot increase the cost of the tree.


• T – tree constructed by Huffman
• X – any code tree
• Show: C(T) ≤ C(X)

• T' and X' – trees from the greedy choice (x and y made deepest siblings)
  C(T') = C(T)
  C(X') ≤ C(X)

• T'' and X'' – trees with the minimum-cost leaves x and y removed;
  their common parent becomes a leaf of weight f[x] + f[y]

Finish the induction proof

C(X'') = C(X') - f[x] - f[y]
C(T'') = C(T') - f[x] - f[y]
C(T'') ≤ C(X'')   (induction hypothesis: T'' is optimal for n leaves)

C(T) = C(T')
     = C(T'') + f[x] + f[y]
     ≤ C(X'') + f[x] + f[y]
     = C(X')
     ≤ C(X)
What is our next challenge, and how do we tackle it?

Two passes over the data:
• one pass to collect frequency counts of the letters
• a second pass to encode and transmit the letters, based on the static
  tree structure

Problems:
• delay (network communication, file compression applications)
• extra disk accesses slow down the algorithm

We need one-pass methods, in which letters are encoded "on the fly".
Dynamic Huffman codes

Algorithm FGK
• The next letter of the message is encoded on the basis of a Huffman
  tree for the previous letters.
• Encoding length is at most 2S + t, where S is the encoding length
  under a static Huffman code and t is the number of letters in the
  original message.

Sender and receiver
• start with the same initial tree
• use the same algorithm to modify the tree after each letter is
  processed, and thus always have equivalent copies of it
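To make this synchronization concrete, here is a deliberately naive
one-pass coder (a sketch reusing build_huffman_tree and assign_codes from
the earlier sketches; it is not Algorithm FGK): it rebuilds the whole tree
after every letter, whereas FGK achieves the same effect by modifying the
tree in place.

def adaptive_encode(message, alphabet):
    freq = {s: 1 for s in alphabet}   # sender and receiver start identically
    bits = []
    for ch in message:
        codes = assign_codes(build_huffman_tree(freq))
        bits.append(codes[ch])
        freq[ch] += 1                 # the same update happens on both ends
    return "".join(bits)

print(adaptive_encode("abracadabra", "abcdr"))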
Sibling Property:

A binary tree with p leaves of nonnegative weight is a Huffman tree iff
• the p leaves have nonnegative weights w1, . . . , wp, and the weight of
  each internal node is the sum of the weights of its children; and
• the 2p - 1 nodes can be numbered in nondecreasing order by weight, so
  that nodes 2j - 1 and 2j are siblings, for 1 ≤ j ≤ p - 1, and their
  common parent node is higher in the numbering.
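The property can be checked directly. In the sketch below, the array
layout (index = node number, index 0 unused, root's parent = 0) is an
assumed representation for illustration, not the slides' notation:

def has_sibling_property(weights, parent):
    n = len(weights) - 1                       # number of nodes, 2p - 1
    # Weights must be nondecreasing in node number.
    if any(weights[j] > weights[j + 1] for j in range(1, n)):
        return False
    # Each internal node's weight must equal the sum of its children's.
    sums = [0] * (n + 1)
    for j in range(1, n):                      # every non-root node
        sums[parent[j]] += weights[j]
    internal = {parent[j] for j in range(1, n)}
    if any(sums[u] != weights[u] for u in internal):
        return False
    # Nodes 2j-1 and 2j must be siblings with a higher-numbered parent.
    for j in range(1, (n + 1) // 2):
        if parent[2*j - 1] != parent[2*j] or parent[2*j] <= 2*j:
            return False
    return True

# p = 2 leaves with weights 1 and 2; the root (node 3) has weight 3.
assert has_sibling_property([0, 1, 2, 3], [0, 3, 3, 0])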
[Figure: a Huffman tree for t = 32 processed letters, with leaves
a, b, c, d, e, f. Each node carries its weight and its number in the
sibling-property ordering: weights are nondecreasing in node number,
nodes 2j - 1 and 2j are siblings, and the root (weight 32) carries the
highest number, 11.]
Difficulty

Suppose that Mt = ai1 ai2 . . . ait has already been processed.

The next letter, ai(t+1), is encoded and decoded using the Huffman tree
for Mt.

How can this tree be modified quickly in order to get a Huffman tree
for Mt+1?

E.g., assume t = 32 and ai(t+1) = "b":

[Figure: the tree before and after naively adding 1 to the weights on
the path from b's leaf to the root (the root's weight grows from 32 to
33). An X marks a node where the sibling property now fails, so the
result is no longer a Huffman tree.]
First phase

• Begin with the leaf of ai(t+1) as the current node.
• Repeatedly interchange the contents of the current node, including the
  subtree rooted there, with those of the highest-numbered node of the
  same weight, and make the parent of the latter node the new current
  node (see the sketch after this list).
• Halt when the root is reached.
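The key primitive here is locating the highest-numbered node of a given
weight (the "leader" of that weight block). Below is a sketch using the
same 1-indexed weights array as in the previous snippet; the linear scan
is for clarity, and real implementations keep per-block pointers to find
the leader in O(1):

def block_leader(weights, v):
    # weights[j] is the weight of node j (index 0 unused). By the
    # sibling property, weights are nondecreasing in node number, so
    # nodes of equal weight form a contiguous block; scan to its end.
    j = v
    while j + 1 < len(weights) and weights[j + 1] == weights[v]:
        j += 1
    return j

# Nodes 1..5 have weights 1, 1, 2, 2, 4:
assert block_leader([0, 1, 1, 2, 2, 4], 1) == 2
assert block_leader([0, 1, 1, 2, 2, 4], 3) == 4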

E.g., assume t = 32 and ai(t+1) = "b":

[Figure: four snapshots of the tree during the first phase. Starting at
b's leaf, each current node is exchanged, subtree and all, with the
highest-numbered node of equal weight, and the walk then moves to that
node's parent, ending at the root of weight 32. Node numbers stay fixed
to positions; only node contents move.]
Second phase
• We turn this tree into the desired Huffman tree for Mt+1 by
  incrementing the weights of ai(t+1)'s leaf and of all its ancestors
  by 1.

[Figure: the tree before and after the increments; the root's weight
grows from 32 to 33, and this time the sibling property is preserved.]
Conclusions

• Huffman savings are between 20% and 90%.
• Dynamic Huffman coding is optimal and efficient.
• The optimal data compression achievable by a character code can
  always be achieved with a prefix code.
• Better compression is possible, depending on the data, using other
  approaches (e.g., a pattern dictionary).

Thank you for your attention!
