
Huffman Coding

Huffman coding is a lossless data compression algorithm that assigns variable-length codes to input characters based on their frequency, with more common characters getting shorter codes. It involves building a Huffman tree from character frequencies and traversing it to assign codes. The codes are prefix-free to prevent ambiguities during decoding. Huffman coding is commonly used for data compression and can also be used in cryptography.

Uploaded by

Sudipta Mondol
Copyright © All Rights Reserved

Huffman coding

Huffman coding is a lossless data compression algorithm. It assigns a variable-length code to each input character, and the length of a code depends on how frequently its character occurs: the most frequent characters get the shortest codes and the least frequent characters get the longest codes.

There are two types of encoding schemes, fixed-length and variable-length, explained as follows:

Fixed-length encoding - Every character is assigned a binary code with the same number of bits. Thus, a string like “aabacdad” requires 64 bits (8 bytes) for storage or transmission, assuming that each character uses 8 bits.

Variable-length encoding - As opposed to fixed-length encoding, this scheme uses a variable number of bits for each character, depending on its frequency in the given text. In the string “aabacdad”, the characters ‘a’, ‘b’, ‘c’ and ‘d’ occur 4, 1, 1 and 2 times respectively. Since ‘a’ occurs more often than ‘b’, ‘c’ and ‘d’, it is given the fewest bits, followed by ‘d’, then ‘b’ and ‘c’. Suppose we assign binary codes to the characters as follows:

a    0
b    011
c    111
d    11

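Thus, the string “aabacdad” is encoded as 00011011111011 (0 | 0 | 011 | 0 | 111 | 11 | 0 | 11), which uses 14 bits instead of the 64 bits needed by the fixed-length scheme. Note that this illustrative assignment only demonstrates the saving in length: 0 is a prefix of 011 and 11 is a prefix of 111, so a decoder could not split the bit string unambiguously. The codes that Huffman coding produces avoid this problem by obeying the prefix rule.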
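
The bit counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the illustrative code table from the text and assuming 8 bits per character for the fixed-length scheme; the variable names are chosen here for illustration.

```python
# Compare fixed-length and variable-length encodings of the example string.
text = "aabacdad"

# Fixed-length scheme: every character costs 8 bits.
fixed_bits = 8 * len(text)

# Variable-length scheme: the illustrative code table from the text.
codes = {"a": "0", "b": "011", "c": "111", "d": "11"}
encoded = "".join(codes[ch] for ch in text)
variable_bits = len(encoded)

print(fixed_bits)     # 64
print(encoded)        # 00011011111011
print(variable_bits)  # 14
```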

There are two major parts in Huffman coding:

1) Build a Huffman tree from the input characters.
2) Traverse the Huffman tree and assign codes to the characters.

For example, consider the string “YYYZXXYYX”. The character Y occurs 5 times, X occurs 3 times and Z occurs only once. So Y gets the shortest code, and the code for X is no longer than the code for Z (with only three characters, X and Z both end up with 2-bit codes while Y gets a 1-bit code).
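
The two parts can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `huffman_codes`, the use of a binary heap from `heapq`, the tie-breaking counter, and the convention of 0 for left edges and 1 for right edges are all assumptions made for this example.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return a {character: bit string} table for the characters in text."""
    freq = Counter(text)
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    # Each heap entry is (weight, tiebreak, tree); a tree is either a single
    # character (leaf) or a (left, right) pair (internal node).
    heap = [(w, next(tiebreak), ch) for ch, w in freq.items()]
    heapq.heapify(heap)

    if not heap:
        return {}
    if len(heap) == 1:  # a single distinct character still needs one bit
        return {heap[0][2]: "0"}

    # Part 1: repeatedly merge the two lightest trees.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    # Part 2: traverse the tree, appending 0 for left and 1 for right.
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


codes = huffman_codes("YYYZXXYYX")
lengths = {ch: len(code) for ch, code in codes.items()}
print(sorted(lengths.items()))  # [('X', 2), ('Y', 1), ('Z', 2)]
```

The exact bit patterns depend on how ties and left/right edges are handled, but the code lengths for this string do not: Y gets 1 bit, and X and Z get 2 bits each.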

The time complexity of assigning a code to each character according to its frequency is O(n log n), where n is the number of distinct characters.

Characteristics of Huffman Coding

• Huffman coding is a famous greedy algorithm.
• It is used for the lossless compression of data.
• It uses variable-length encoding.
• It assigns a variable-length code to every character.
• The code length of a character depends on how frequently it occurs in the given text.
• The character that occurs most frequently gets the shortest code.
• The character that occurs least frequently gets the longest code.

Prefix Rule - no code is a prefix of another code

• Huffman coding implements a rule known as the prefix rule.
• The rule prevents ambiguities while decoding.
• It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.

Applications of Huffman coding:

1. Huffman coding is used in data compression (reducing the size of data or messages).
2. A rather different use of Huffman encoding is in conjunction with cryptography.


Example: ABBCDBCCDAABBEEEBEAB
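
The first step for a string like this is to count how often each character occurs. A small Python sketch, assuming the example is meant as input text for Huffman coding, could look like this:

```python
from collections import Counter

text = "ABBCDBCCDAABBEEEBEAB"
freq = Counter(text)  # character -> number of occurrences

for ch, n in sorted(freq.items()):
    print(ch, n)
# A 4
# B 7
# C 3
# D 2
# E 4
```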

Problem

A file contains the following characters with the frequencies shown. If Huffman coding is used for data compression, find the Huffman code for each character.

Characters    Frequencies
a             10
e             15
i             12
o             3
u             4
s             13
t             1

Step-01 to Step-07: [Figures: step-by-step construction of the Huffman tree]

1. Huffman Code for Characters

Reading off the bits along the path from the root of the tree to each character's leaf, the Huffman code for each character is:

• a = 111
• e = 10
• i = 00
• o = 11001
• u = 1101
• s = 01
• t = 11000
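
The result can be cross-checked with a short Python sketch that replays the greedy merges on the frequency table and tracks only the depth of each character. This is an illustration under stated assumptions, not the construction shown in the original figures: the heap-based merging and the tie-breaking by symbol string are choices made here, and because the exact bit patterns depend on how left and right edges are labelled, only the code lengths are compared with the codes listed above.

```python
import heapq

freq = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
listed = {"a": "111", "e": "10", "i": "00", "o": "11001",
          "u": "1101", "s": "01", "t": "11000"}

# Each heap entry is (weight, symbols in that subtree); the symbol string
# doubles as a tie-breaker when two weights are equal.
heap = [(w, ch) for ch, w in freq.items()]
heapq.heapify(heap)

depth = dict.fromkeys(freq, 0)  # code length of each character so far
merges = []                     # (weight, weight, combined weight) per merge
while len(heap) > 1:
    w1, s1 = heapq.heappop(heap)
    w2, s2 = heapq.heappop(heap)
    for ch in s1 + s2:
        depth[ch] += 1  # every symbol under the new node moves one level down
    merges.append((w1, w2, w1 + w2))
    heapq.heappush(heap, (w1 + w2, s1 + s2))

total_bits = sum(freq[ch] * depth[ch] for ch in freq)
fixed_bits = 8 * sum(freq.values())

print(merges)
# [(1, 3, 4), (4, 4, 8), (8, 10, 18), (12, 13, 25), (15, 18, 33), (25, 33, 58)]
print(depth == {ch: len(code) for ch, code in listed.items()})  # True
print(total_bits, fixed_bits)  # 146 464
```

With these codes the file of 58 characters needs 146 bits, about 2.52 bits per character on average, compared with 464 bits at 8 bits per character.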
