Huffman
Algorithm
PRESENTER NAME
FAHIM AHAMMED
221902136
IMRAN SARKAR SETU
181002036
PRESENTED TO
MD. ABU RUMMAN REFAT
Contents
• Definition
• Simulation
• Time Complexity
• Application
• Advantage/Disadvantage
Definition
• Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters; the lengths of the assigned codes are based on the frequencies of the corresponding characters.
■ The most frequent character gets the shortest code and the least frequent character gets the longest code.
Message: BCCABBDDAECCBBAEDDCC
• In electronic systems, characters are sent as ASCII codes. The ASCII code for capital A is 65, and 8 bits are needed to store 65 in binary (01000001).
• For 1 letter we need 8 bits.
• For 20 letters we need 8 × 20 = 160 bits.
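The arithmetic above can be checked with a few lines of Python. This is only a sketch; the name `BITS_PER_CHAR` is an illustrative constant, not something defined in the slides.

```python
message = "BCCABBDDAECCBBAEDDCC"
BITS_PER_CHAR = 8  # assumption from the slide: one 8-bit byte per ASCII letter

code_a = ord("A")                  # ASCII code of capital A
binary_a = format(code_a, "08b")   # the same value written as 8 binary digits

fixed_bits = len(message) * BITS_PER_CHAR
print(code_a, binary_a, fixed_bits)  # 65 01000001 160
```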
Simulation
Message: BCCABBDDAECCBBAEDDCC
First, count each letter and place the counts in increasing order: E = 2, A = 3, D = 4, B = 5, C = 6. Then take the two minimum counts and add them: the root node of letters E and A is 2 + 3 = 5.
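As a rough illustration of this first step, the counts can be obtained with `collections.Counter` from the Python standard library. The variable names here are illustrative choices only.

```python
from collections import Counter

message = "BCCABBDDAECCBBAEDDCC"

# Count every letter, then sort the (letter, count) pairs by count
counts = Counter(message)
ordered = sorted(counts.items(), key=lambda pair: pair[1])
print(ordered)  # [('E', 2), ('A', 3), ('D', 4), ('B', 5), ('C', 6)]

# The two smallest counts are merged under a new parent node
(first, f1), (second, f2) = ordered[:2]
merged = f1 + f2
print(first, second, merged)  # E A 5
```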
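Then, between the root node 5 and the remaining counts, take the two minimum values and add them up. Here 5 and 4 (D) are the minimum, so we add them and make 9 the new root node. Continue the process: B (5) and C (6) give 11, and finally 9 and 11 give the root 20.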
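The repeated "take the two minimum values and add them" step can be sketched with a min-heap from Python's `heapq` module. To keep the sketch short it tracks only the counts, not the letters. Note that after the first merge there is a tie between the new node 5 and B (5); either choice produces the same sums.

```python
import heapq
from collections import Counter

message = "BCCABBDDAECCBBAEDDCC"
heap = list(Counter(message).values())  # [5, 6, 3, 4, 2] in some order
heapq.heapify(heap)

merges = []
while len(heap) > 1:
    smallest = heapq.heappop(heap)      # minimum count
    second = heapq.heappop(heap)        # next minimum count
    merges.append((smallest, second, smallest + second))
    heapq.heappush(heap, smallest + second)

for a, b, total in merges:
    print(f"{a} + {b} = {total}")
# 2 + 3 = 5
# 4 + 5 = 9
# 5 + 6 = 11
# 9 + 11 = 20
```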
Mark the left-hand edges as 0 and the right-hand edges as 1, then traverse from the root node to any letter and read off the edge labels. For example, the path from the root node to A gives the code 001, the path to B gives 10, and so on.
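The traversal can be sketched as a small recursive function. The tree below is written out by hand as nested pairs, which is an assumption reconstructed from the merges in the simulation and the codes quoted for A and B; `assign_codes` is an illustrative helper name.

```python
# Each internal node is a (left, right) pair; each leaf is a letter.
# Shape assumed from the simulation: (E, A) -> 5, (5, D) -> 9, (B, C) -> 11.
tree = ((("E", "A"), "D"), ("B", "C"))

def assign_codes(node, prefix=""):
    """Return {letter: code}, adding 0 for a left edge and 1 for a right edge."""
    if isinstance(node, str):           # reached a leaf
        return {node: prefix}
    left, right = node
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes

codes = assign_codes(tree)
print(codes)  # {'E': '000', 'A': '001', 'D': '01', 'B': '10', 'C': '11'}
```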
A needs 3 bits, B 2 bits, C 2 bits, D 2 bits, and E 3 bits.
For A, count = 3, so the total is 3 × 3 = 9 bits.
Likewise B: 5 × 2 = 10, C: 6 × 2 = 12, D: 4 × 2 = 8, E: 2 × 3 = 6 bits, for a total of 45 bits.
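The per-letter totals can be verified with a short calculation. The two dictionaries simply restate the counts of the message and the code lengths listed above.

```python
counts = {"A": 3, "B": 5, "C": 6, "D": 4, "E": 2}        # occurrences in the message
code_lengths = {"A": 3, "B": 2, "C": 2, "D": 2, "E": 3}  # bits per letter

# Bits used by each letter = count x code length
bits_per_letter = {ch: counts[ch] * code_lengths[ch] for ch in counts}
total_bits = sum(bits_per_letter.values())

print(bits_per_letter)  # {'A': 9, 'B': 10, 'C': 12, 'D': 8, 'E': 6}
print(total_bits)       # 45
```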
At first we needed 160 bits, and now we need only 45 bits, so we have compressed the message and reduced its size and cost.
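To see the 45 bits concretely, the message can be encoded and decoded again. This is a sketch that assumes the code table E = 000, A = 001, D = 01, B = 10, C = 11, which is consistent with the codes and code lengths given in the simulation; `decode` is an illustrative helper.

```python
message = "BCCABBDDAECCBBAEDDCC"
codes = {"A": "001", "B": "10", "C": "11", "D": "01", "E": "000"}

# Encode: replace every letter by its code
encoded = "".join(codes[ch] for ch in message)

def decode(bits, codes):
    """Read bits left to right; no code is a prefix of another, so the first match is correct."""
    reverse = {code: ch for ch, code in codes.items()}
    letters, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            letters.append(reverse[current])
            current = ""
    return "".join(letters)

decoded = decode(encoded, codes)
ratio = len(encoded) / (8 * len(message))
print(len(encoded), decoded == message, ratio)  # 45 True 0.28125
```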
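Time Complexity
Huffman(C)
    n = |C|
    Q = C
    for i = 1 to n-1
        allocate new node z
        z.left = x = Extract-Min(Q)
        z.right = y = Extract-Min(Q)
        z.freq = x.freq + y.freq
        Insert(Q, z)
    return Extract-Min(Q)   // returns the root
With Q kept as a binary min-heap, each Extract-Min and Insert costs O(lg n), and the loop runs n-1 times.
Complexity = O(n lg n)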
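A minimal runnable sketch of the pseudocode, assuming Python's `heapq` as the priority queue Q. The `Node` class, the `tiebreak` counter and the `code_lengths` helper are illustrative additions, not part of the original pseudocode. Because of the tie between two nodes of frequency 5, the tree built here may be shaped differently from the one in the simulation, but the code lengths come out the same.

```python
import heapq
from itertools import count

class Node:
    def __init__(self, freq, symbol=None, left=None, right=None):
        self.freq, self.symbol = freq, symbol
        self.left, self.right = left, right

def huffman(freqs):
    """Build a Huffman tree from {symbol: frequency} and return its root."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    queue = [(f, next(tiebreak), Node(f, s)) for s, f in freqs.items()]
    heapq.heapify(queue)                          # Q = C
    for step in range(len(freqs) - 1):            # n - 1 merges
        fx, _, x = heapq.heappop(queue)           # x = Extract-Min(Q)
        fy, _, y = heapq.heappop(queue)           # y = Extract-Min(Q)
        z = Node(fx + fy, left=x, right=y)        # z.freq = x.freq + y.freq
        heapq.heappush(queue, (z.freq, next(tiebreak), z))  # Insert(Q, z)
    return heapq.heappop(queue)[2]                # the root

def code_lengths(node, depth=0):
    """Depth of each leaf equals the length of its code."""
    if node.symbol is not None:
        return {node.symbol: depth}
    lengths = code_lengths(node.left, depth + 1)
    lengths.update(code_lengths(node.right, depth + 1))
    return lengths

root = huffman({"A": 3, "B": 5, "C": 6, "D": 4, "E": 2})
lengths = code_lengths(root)
print(root.freq, sorted(lengths.items()))
# 20 [('A', 3), ('B', 2), ('C', 2), ('D', 2), ('E', 3)]
```

Each `heappop` and `heappush` takes O(lg n) time, which is where the O(n lg n) bound for the loop comes from.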
Application
Generic File Compression:
→ Files: GZIP, BZIP, 7Z
→ Archives: 7z
→ File Systems: NTFS, HFS+, ZFS
Multimedia:
→ Image: GIF, JPEG
→ Sound: MP3
→ Video: MPEG, HDTV
Communication:
→ ITU-T T4 Group 3 Fax
→ V.42bis modem
→ Skype
Databases: Google, Facebook,...
ADVANTAGES / DISADVANTAGES
Advantages:
→ Huffman coding gives the minimum average code length among prefix codes that encode one character at a time.
→ Easy to implement and fast.
Disadvantages:
→ Requires two passes over the input (one to compute frequencies, one for coding), thus encoding is slower.
→ Requires storing the Huffman codes (or at least the character frequencies) in the encoded file, thus reducing the compression benefit obtained by encoding.
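The effect of storing the frequencies can be estimated for the example message. The header layout below (one 8-bit letter plus one 8-bit count per distinct letter) is purely a hypothetical format chosen for illustration; real file formats store this information differently.

```python
message = "BCCABBDDAECCBBAEDDCC"
counts = {"A": 3, "B": 5, "C": 6, "D": 4, "E": 2}

payload_bits = 45                    # Huffman-coded message, from the simulation
# Hypothetical header: 8 bits for the letter + 8 bits for its count, per letter
header_bits = len(counts) * (8 + 8)
total_bits = payload_bits + header_bits
original_bits = 8 * len(message)

print(header_bits, total_bits, original_bits)  # 80 125 160
```

Under this assumption the file is still smaller than the original 160 bits, but most of the saving on such a short message is used up by the header.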
Thank You
David Albert Huffman