Huffman Coding
Introduction
Huffman Coding is one approach to text compression. Text
compression means reducing the space required to store a particular text.
Huffman Coding is a lossless data compression algorithm, i.e., it is a way of
compressing data without the data losing any information in the process. It is
especially useful when certain characters occur far more frequently than others.
Working of Huffman Algorithm:
Suppose the given string is: BCAADDDCCACACAC
Here, each character of the string takes 8 bits of memory. Since there are a
total of 15 characters in the string, the total memory consumption will be 15*8 =
120 bits. Let’s try to compress its size using the Huffman Algorithm.
First, Huffman Coding builds a tree by calculating the frequency of each
character in the string, and then assigns each character a unique code so that we can
retrieve the data back using these codes.
Follow the steps below:
1. Begin with calculating the frequency of each character value in the given
string.
2. Sort the characters in ascending order of frequency and store
them in a priority queue, say Q.
3. Each character should be considered as a different leaf node.
4. Make an empty node, say z. Mark the node with the minimum frequency
as the left child of z and the node with the second minimum frequency as the
right child. The value of z is calculated by summing up these two frequencies.
Here, “.” denotes an internal node.
5. Now, remove the two characters with the lowest frequencies from the
priority queue Q and insert their sum back into it.
6. Simply insert the above node z to the tree.
7. Repeat steps 4 to 6 until every character of the string is included in the
tree, i.e., until the priority queue contains a single node.
8. Assign 0 to the left side and 1 to the right side except for the leaf nodes.
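The steps above can be sketched with Python's heapq module. This is a sketch, not the code used later in this article; the tie-breaking index and the function name are my own:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    # Steps 1-2: count frequencies and build a min-priority queue.
    # Each heap entry is (frequency, tie-breaker, subtree); a leaf is a bare character.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)
    # Steps 4-7: repeatedly merge the two lowest-frequency nodes.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # minimum frequency -> left child
        f2, _, right = heapq.heappop(heap)   # second minimum -> right child
        tie += 1
        heapq.heappush(heap, (f1 + f2, tie, (left, right)))
    # Step 8: assign 0 to left edges and 1 to right edges.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, str):
            codes[node] = prefix or '0'      # edge case: single-character input
        else:
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
    walk(heap[0][2], '')
    return codes
```

For the example string the total encoded length comes out to 28 bits; the exact codes can differ from the ones shown below depending on how ties are broken.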
The size table is given below:
Character        Frequency   Code   Size
A                5           11     5*2 = 10
B                1           100    1*3 = 3
C                6           0      6*1 = 6
D                3           101    3*3 = 9
4*8 = 32 bits    15 bits            28 bits
Size before encoding: 120 bits
Size after encoding: 32 + 15 + 28 = 75 bits
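The arithmetic above can be double-checked with a few lines of Python; the frequencies and codes are taken from the table:

```python
freqs = {'A': 5, 'B': 1, 'C': 6, 'D': 3}
codes = {'A': '11', 'B': '100', 'C': '0', 'D': '101'}

size_before = sum(freqs.values()) * 8   # 15 characters * 8 bits = 120
char_table = len(freqs) * 8             # 4 characters * 8 bits = 32
freq_bits = sum(freqs.values())         # 15 bits, as tallied in the table
encoded = sum(f * len(codes[c]) for c, f in freqs.items())  # 10 + 3 + 6 + 9 = 28

print(size_before)                       # 120
print(char_table + freq_bits + encoded)  # 75
```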
To decode a code, simply traverse the tree starting from the root, going left
on a 0 and right on a 1, until a leaf character is reached. Suppose we want to
decode 101: following 1, 0, 1 from the root leads to the leaf D.
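Since Huffman codes are prefix-free, decoding can equivalently be done by matching prefixes of the bit string against the code table. A minimal sketch (huffman_decode is a name of my own):

```python
def huffman_decode(bits, codes):
    # Invert the character -> code table into code -> character.
    inverse = {code: ch for ch, code in codes.items()}
    out, buffer = [], ''
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # a complete codeword matched: emit its character
            out.append(inverse[buffer])
            buffer = ''
    return ''.join(out)

codes = {'A': '11', 'B': '100', 'C': '0', 'D': '101'}  # table from above
print(huffman_decode('101', codes))  # D
```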
Time complexity:
In the case of encoding, inserting each character into the priority queue takes
O(log n) time. Therefore, for the complete string, the time complexity becomes
O(n log n).
Similarly, extracting an element from the priority queue takes O(log n) time.
Hence, for the complete string, the achieved time complexity is again O(n log n).
Python Code:
Go through the Python code below for a deeper understanding:
# Huffman Coding in Python
string = 'BCAADDDCCACACAC'  # string from the example above

# Creating tree nodes
class NodeTree(object):
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

    def children(self):  # return the children of a node
        return (self.left, self.right)

    def __str__(self):
        return '%s_%s' % (self.left, self.right)

# Main function implementing Huffman coding: walk the tree recursively,
# appending '0' for left branches and '1' for right branches
def huffman_code_tree(node, binString=''):
    if type(node) is str:  # leaf node: map the character to its code
        return {node: binString}
    (l, r) = node.children()
    d = dict()
    d.update(huffman_code_tree(l, binString + '0'))
    d.update(huffman_code_tree(r, binString + '1'))
    return d

# Calculating the frequency of each character
freq = {}
for c in string:
    if c in freq:
        freq[c] += 1
    else:
        freq[c] = 1

# Sort characters by frequency, highest first
freq = sorted(freq.items(), key=lambda x: x[1], reverse=True)

# Build the tree by repeatedly merging the two lowest-frequency nodes
nodes = freq
while len(nodes) > 1:
    (key1, c1) = nodes[-1]  # lowest frequency
    (key2, c2) = nodes[-2]  # second-lowest frequency
    nodes = nodes[:-2]
    node = NodeTree(key1, key2)
    nodes.append((node, c1 + c2))
    nodes = sorted(nodes, key=lambda x: x[1], reverse=True)

huffmanCode = huffman_code_tree(nodes[0][0])

print(' Char | Huffman code ')
print('----------------------')
for (char, frequency) in freq:
    print(' %-4r |%12s' % (char, huffmanCode[char]))
Applications of Huffman Coding:
● Huffman coding is used in transmitting fax and text.
● It is used by conventional compression formats like PKZIP, GZIP, etc.