12 - Huffman Coding Algorithm

The document describes the Huffman coding algorithm for data compression. It explains that Huffman coding is a variable-length encoding technique that assigns binary codes to characters based on their frequency in a text. Codes are assigned such that more frequent characters have shorter codes. This results in more efficient storage and transmission of data compared to fixed-length encoding schemes. The algorithm builds a Huffman tree from the character frequencies where each leaf node represents a character. The encoded data can then be uniquely decoded using the Huffman tree.


Huffman Coding Algorithm

What is Encoding?

Encoding, in computers, can be defined as the process of transmitting or storing a sequence of characters efficiently.

All information in computer science is encoded as strings of 1s and 0s.

The objective of information theory is to transmit information using the fewest possible bits, in such a way that every encoding is unambiguous.
There are two types of encoding schemes:

Fixed-length encoding –
Every character is assigned a binary code using the same number of bits.
Assuming each character uses 8 bits, a string like “aabacdad” requires 64 bits (8 bytes) for storage or transmission.

Variable-length encoding –
This scheme uses a variable number of bits to encode each character, depending on its frequency in the given text.
Variable-length encoding
Ex:
For a given string like “aabacdad”, the frequencies of characters ‘a’, ‘b’, ‘c’ and ‘d’ are 4, 1, 1 and 2 respectively.

Since ‘a’ occurs more frequently than ‘b’, ‘c’ and ‘d’, it should use the fewest bits, followed by ‘d’, ‘b’ and ‘c’.

Let’s randomly assign binary codes to each character as follows:

a 0
b 011
c 111
d 11

Thus, the string “aabacdad” gets encoded to 00011011111011
(0 | 0 | 011 | 0 | 111 | 11 | 0 | 11),
using far fewer bits than the fixed-length encoding scheme (14 vs 64).
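
Below is a minimal Python sketch (the names are mine, not from the slides) that applies a code table to a string and counts the bits, reproducing the 14-vs-64 comparison:

```python
# Compare fixed-length (8 bits per character) with the ad-hoc variable-length table above.
codes = {'a': '0', 'b': '011', 'c': '111', 'd': '11'}

text = "aabacdad"
encoded = ''.join(codes[ch] for ch in text)

print(encoded)         # 00011011111011
print(len(encoded))    # 14 bits with variable-length codes
print(8 * len(text))   # 64 bits with fixed 8-bit codes
```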
Problem with this strategy
The real problem lies in the decoding phase.

If we try to decode the string 00011011111011 with the codes a 0, b 011, c 111, d 11, the result is ambiguous, since it can be decoded into multiple strings, a few of which are:

aaadacdad (0 | 0 | 0 | 11 | 0 | 111 | 11 | 0 | 11)
aaadbcad (0 | 0 | 0 | 11 | 011 | 111 | 0 | 11)
aabbcb (0 | 0 | 011 | 011 | 111 | 011) … and so on
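
A quick sketch (the helper name is mine) showing that each of these candidate strings re-encodes to the same 14 bits, which is exactly the ambiguity:

```python
codes = {'a': '0', 'b': '011', 'c': '111', 'd': '11'}

def encode(s):
    return ''.join(codes[ch] for ch in s)

# The original string and all three alternative decodings map to the same bit string.
for candidate in ("aabacdad", "aaadacdad", "aaadbcad", "aabbcb"):
    print(candidate, encode(candidate) == "00011011111011")   # all print True
```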

Prefix rule (to prevent such ambiguities during decoding)

The encoding should satisfy the “prefix rule”, which states that no binary code may be a prefix of another code.
(In the assignment above, the code for ‘a’, 0, is a prefix of the code for ‘b’, 011, so the assignment violates the rule.)

Following the prefix rule produces uniquely decodable codes.


Ex: Applying the prefix rule for encoding
Let’s reassign the binary codes to the characters ‘a’, ‘b’, ‘c’, ‘d’:
a 0
b 11
c 101
d 100

Using the above codes, the string “aabacdad” gets encoded to
001101011000100 (0 | 0 | 11 | 0 | 101 | 100 | 0 | 100).
Now we can unambiguously decode it back to the string “aabacdad”.
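
A small sketch (the function name is mine) that checks the prefix rule for a code table; the first assignment above fails because '0' is a prefix of '011', while the second passes:

```python
def is_prefix_free(codes):
    """Return True if no codeword is a prefix of another codeword."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

print(is_prefix_free({'a': '0', 'b': '011', 'c': '111', 'd': '11'}))   # False: '0' prefixes '011'
print(is_prefix_free({'a': '0', 'b': '11', 'c': '101', 'd': '100'}))   # True: uniquely decodable
```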
Huffman Encoding

Developed by David Huffman in 1951, the Huffman encoding method is used for finding an efficient (minimum-redundancy) prefix code.

 This technique is the basis for many widely used data compression and encoding schemes.

 It uses a variable-length encoding scheme, assigning binary codes to characters depending on how frequently they occur in the given text.
Huffman Encoding

Algorithm for creating the Huffman tree:

Step 1 - Create a leaf node for each character and build a min heap from all the nodes (the frequency value is used to compare two nodes in the min heap).

Step 2 - Repeat Steps 3 to 5 while the heap has more than one node.

Step 3 - Extract the two nodes, say x and y, with minimum frequency from the heap.

Step 4 - Create a new internal node z with x as its left child and y as its right child, where frequency(z) = frequency(x) + frequency(y).

Step 5 - Add z to the min heap.

Step 6 - The last node remaining in the heap is the root of the Huffman tree.
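
Below is a compact Python sketch of these steps using heapq as the min heap (the function name and the tie-breaking counter are my own additions); the exact 0/1 labels can vary with tie-breaking, but the resulting code lengths are the Huffman lengths:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman tree from character frequencies and return a {char: code} table."""
    freq = Counter(text)
    # Step 1: a leaf node per character, organized as a min heap.
    # Heap entries are (frequency, tie-breaker, node); a node is a char or a (left, right) pair.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:                                  # Step 2: repeat while heap has > 1 node
        fx, _, x = heapq.heappop(heap)                    # Step 3: extract the two nodes x and y
        fy, _, y = heapq.heappop(heap)                    #         with minimum frequency
        heapq.heappush(heap, (fx + fy, counter, (x, y)))  # Steps 4-5: internal node z with
        counter += 1                                      #            frequency(x) + frequency(y)
    _, _, root = heap[0]                                  # Step 6: the last node is the root

    codes = {}
    def walk(node, prefix):
        # Read the codes off the tree: left edges contribute '0', right edges '1'.
        if isinstance(node, tuple):
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
        else:
            codes[node] = prefix or '0'                   # lone-character edge case
    walk(root, '')
    return codes

print(huffman_codes("aabacdad"))   # {'a': '0', 'd': '10', 'b': '110', 'c': '111'} (up to tie-breaking)
```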


Ex:
For a given string like “aabacdad”, the frequencies of characters ‘a’, ‘b’, ‘c’ and ‘d’ are 4, 1, 1 and 2 respectively.

• Arrange the characters in ascending order of their frequency and apply the Huffman algorithm.

• The Huffman tree will generate the following codes:

• 0 gets decoded to ‘a’
• 110 gets decoded to ‘b’
• 111 gets decoded to ‘c’
• 10 gets decoded to ‘d’
Analysis for “aabacdad”
0 gets decoded to ‘a’ (frequency 4)
110 gets decoded to ‘b’ (frequency 1)
111 gets decoded to ‘c’ (frequency 1)
10 gets decoded to ‘d’ (frequency 2)
Default encoding
No decoding scheme needs to be stored, since the ASCII representation is universally known and unique.
==> Number of bits used for encoding = 8 * 8 = 64 bits

Uniform encoding (with fewer bits per character)

Total bits required = decoding scheme + encoded message
Decoding scheme => [ASCII bits for 4 characters + a 2-bit code for each of the 4 characters]
Encoded message => [2 bits for each character in the string]
==> (8*4 + 4*2) + (8*2) = (32 + 8) + 16 = 56 bits

Variable encoding

Total bits required = decoding scheme + encoded message
Decoding scheme => [ASCII bits for 4 characters + the code bits for the 4 characters]
Encoded message => [frequency × code length for each character in the string: 4*1 + 1*3 + 1*3 + 2*2]
==> (8*4 + 9) + (4 + 3 + 3 + 4) = (32 + 9) + 14 = 55 bits

The real benefit is realized with larger strings.
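
A short sketch (variable names are mine) that reproduces the three totals above, using the slides’ convention that the stored decoding scheme is the ASCII form of each distinct character plus the bits of its code:

```python
text = "aabacdad"
freq = {'a': 4, 'b': 1, 'c': 1, 'd': 2}

# Default encoding: plain 8-bit ASCII, no decoding scheme needed.
default_bits = 8 * len(text)                            # 64

# Uniform encoding: 2 bits per character, plus a table of 4 ASCII chars and 4 two-bit codes.
uniform_bits = (8 * 4 + 4 * 2) + 2 * len(text)          # 40 + 16 = 56

# Variable (Huffman) encoding with the codes a=0, b=110, c=111, d=10.
code_len = {'a': 1, 'b': 3, 'c': 3, 'd': 2}
scheme = 8 * 4 + sum(code_len.values())                 # 32 + 9 = 41
message = sum(freq[ch] * code_len[ch] for ch in freq)   # 4 + 3 + 3 + 4 = 14

print(default_bits, uniform_bits, scheme + message)     # 64 56 55
```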


Compare the encoding of
“aaaabaddd”
using the uniform encoding and variable encoding methods.

 Find the bits used in uniform encoding.

 Find the bits used in variable encoding (Huffman encoding).


Huffman codes for “aaaabaddd”:
0 gets decoded to ‘a’ (frequency 5)
10 gets decoded to ‘b’ (frequency 1)
11 gets decoded to ‘d’ (frequency 3)

 Uniform encoding = 3*8 + 3*2 + 9*2 = 24 + 6 + 18 = 48 bits

 Variable encoding = 3*8 + 5 + (5*1 + 1*2 + 3*2) = 24 + 5 + 13 = 42 bits
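
The same counting, as a quick sketch, for “aaaabaddd” with the code lengths a=1, b=2, d=2 shown above:

```python
text = "aaaabaddd"
freq = {'a': 5, 'b': 1, 'd': 3}
code_len = {'a': 1, 'b': 2, 'd': 2}                          # codes 0, 10, 11

uniform_bits = (8 * 3 + 3 * 2) + 2 * len(text)               # (24 + 6) + 18 = 48
scheme_bits = 8 * 3 + sum(code_len.values())                 # 24 + 5 = 29
message_bits = sum(freq[ch] * code_len[ch] for ch in freq)   # 5 + 2 + 6 = 13

print(uniform_bits, scheme_bits + message_bits)              # 48 42
```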


A file contains 6 unique characters; the frequency of each character is given:
c=34
d=9
g=35
u=2
m=2
a=100

How many bits are required to store this file using Huffman Encoding?
The Huffman codes for these characters (two equivalent assignments, depending on how 0 and 1 are assigned to the left and right branches):

c=34 - 110 / 011
d=9 - 1110 / 0101
g=35 - 10 / 00
u=2 - 11110 / 01000
m=2 - 11111 / 01001
a=100 - 0 / 1

Number of bits required to store this file using Huffman encoding:

Size of the file = 182 characters

Bits for representing the decoding scheme = ASCII for 6 characters + code bits = 8*6 + 20 = 68

Bits for the encoded message = 34*3 + 9*4 + 35*2 + 2*5 + 2*5 + 100*1
= 102 + 36 + 70 + 10 + 10 + 100 = 328

Total bits = 328 + 68 = 396
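
A sketch (names are mine) that reproduces the 396-bit total from the frequencies and code lengths above:

```python
freq     = {'a': 100, 'g': 35, 'c': 34, 'd': 9, 'u': 2, 'm': 2}
code_len = {'a': 1,   'g': 2,  'c': 3,  'd': 4, 'u': 5, 'm': 5}

scheme  = 8 * len(freq) + sum(code_len.values())        # 48 + 20 = 68
message = sum(freq[ch] * code_len[ch] for ch in freq)   # 100 + 70 + 102 + 36 + 10 + 10 = 328

print(scheme + message)                                 # 396
```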


Consider a file whose contents are encoded as "0101101101001"
using the following Huffman codes:

c=34 - 011
d=9 - 0101
g=35 - 00
u=2 - 01000
m=2 - 01001
a=100 - 1

Decode the text "0101101101001".


Decoding the text "0101101101001" using the above codes:

0101 | 1 | 011 | 01001
 d   | a |  c  |   m

The decoded text is "dacm".
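
A small greedy decoder sketch (the function name is mine) that walks the bit string and emits a character whenever the buffered bits match a codeword; with a prefix-free code this match is unambiguous:

```python
def decode(bits, codes):
    """Decode a bit string using a prefix-free {char: codeword} table."""
    by_code = {v: k for k, v in codes.items()}
    out, buf = [], ''
    for bit in bits:
        buf += bit
        if buf in by_code:          # prefix-free: at most one codeword can match here
            out.append(by_code[buf])
            buf = ''
    return ''.join(out)

codes = {'c': '011', 'd': '0101', 'g': '00', 'u': '01000', 'm': '01001', 'a': '1'}
print(decode("0101101101001", codes))   # dacm
```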
