Huffman Trees and Codes-V1

The document discusses encoding techniques, specifically focusing on variable-length encoding and the use of prefix codes to avoid ambiguity in symbol representation. It details Huffman's algorithm for constructing Huffman trees, which assign shorter codes to more frequent symbols, and provides examples of encoding and decoding using Huffman codes. Additionally, it includes calculations for compression ratios comparing Huffman encoding to fixed-length encoding.

Uploaded by

LekshmiRajesh
Huffman Trees and Codes
Greedy Technique
What is Encoding?
• Encoding a text means assigning some sequence of bits called the
codeword to each of the text’s symbols.
• Fixed-length encoding assigns to each symbol a bit string of the same
length, e.g., ASCII code.
• Variable-length encoding assigns codewords of different lengths to
different symbols.
• This idea was used in the telegraph code invented in the mid-19th
century by Samuel Morse. In that code, frequent letters such as e (.)
and a (.−) are assigned short sequences of dots and dashes while
infrequent letters such as q (−−.−) and z (−−..) have longer ones.
Problem in variable-length encoding
• How many bits of an encoded text represent the first (or, more
generally, the ith) symbol?
• Prefix-free (or simply prefix) codes avoid this complication.
• In a prefix code, no codeword is a prefix of a codeword of another
symbol. Hence, with such an encoding, we can simply scan a bit string
until we get the first group of bits that is a codeword for some symbol,
replace these bits by this symbol, and repeat this operation until the
bit string’s end is reached.
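The scanning procedure described above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `decode` and the three-symbol `toy_code` are assumptions made up for the example.

```python
def decode(bits, code):
    """Decode a bit string using a prefix-free code given as {symbol: codeword}."""
    by_codeword = {cw: sym for sym, cw in code.items()}
    out, group = [], ""
    for bit in bits:
        group += bit                      # extend the current group of bits
        if group in by_codeword:          # first group that is a codeword
            out.append(by_codeword[group])
            group = ""                    # start scanning the next symbol
    if group:
        raise ValueError("leftover bits do not form a codeword: " + group)
    return "".join(out)


# Hypothetical prefix-free code, chosen only for illustration.
toy_code = {"a": "0", "b": "10", "c": "11"}
print(decode("010110", toy_code))  # prints "abca"
```

Because no codeword is a prefix of another, the first match is always the right one, so the scan never needs to backtrack.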
Huffman’s algorithm
• Huffman’s algorithm generates Huffman codes by constructing a Huffman tree.
Huffman codes assign shorter codewords to more frequent symbols and longer
codewords to less frequent symbols.
Steps for constructing Huffman tree:
• Step 1 Initialize n one-node trees and label them with the symbols of the
alphabet given. Record the frequency of each symbol in its tree’s root to indicate
the tree’s weight. (More generally, the weight of a tree will be equal to the sum
of the frequencies in the tree’s leaves.)
• Step 2 Repeat the following operation until a single tree is obtained. Find the
two trees with the smallest weights (ties can be broken arbitrarily). Make them the
left and right subtrees of a new tree and record the sum of their weights in the
root of the new tree as its weight.
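The two steps above can be sketched with a priority queue. This is one possible implementation under stated assumptions, not the slides' own code: the name `huffman_code` is hypothetical, symbols are assumed to be strings, left edges are labelled 0 and right edges 1, and weights are given as integer percentages to avoid floating-point ties.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Return {symbol: codeword} built with Huffman's greedy algorithm.

    A tree is either a symbol (leaf) or a (left, right) tuple, so symbols
    themselves must not be tuples.
    """
    if not freqs:
        return {}
    tie = count()  # unique tie-breaker so the heap never compares trees
    # Step 1: one single-node tree per symbol, weighted by its frequency.
    heap = [(w, next(tie), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Step 2: repeatedly merge the two lightest trees.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node
            walk(tree[0], prefix + "0")   # left edge
            walk(tree[1], prefix + "1")   # right edge
        else:                             # leaf
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


print(huffman_code({"A": 35, "B": 10, "C": 20, "D": 20, "_": 15}))
```

With this particular tie-breaking the call yields A = 11, B = 100, C = 00, D = 01, _ = 101. A different tie-breaking rule or edge labelling can give different codewords with the same average length.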
Example of constructing a Huffman coding tree
Huffman codes for the example
symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
codeword 11 100 00 01 101
Huffman encoding and decoding
• Huffman codes are prefix codes.

• DAD is encoded as 011101.

• 10011011011101 is decoded as BAD_AD.
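The two statements above can be checked mechanically. The sketch below assumes the code table A = 11, B = 100, C = 00, D = 01, _ = 101: the codewords for A, B, D and _ follow from the encodings quoted above, while C = 00 is inferred as the only remaining two-bit codeword that keeps the code prefix-free. The names `CODE`, `is_prefix_free` and `encode` are hypothetical.

```python
from itertools import permutations

# Assumed code table for the example (C's codeword is inferred, see lead-in).
CODE = {"A": "11", "B": "100", "C": "00", "D": "01", "_": "101"}


def is_prefix_free(code):
    """True if no codeword is a prefix of another symbol's codeword."""
    return not any(a.startswith(b) for a, b in permutations(code.values(), 2))


def encode(text, code):
    """Concatenate the codewords of the symbols in text."""
    return "".join(code[ch] for ch in text)


print(is_prefix_free(CODE))      # prints True
print(encode("DAD", CODE))       # prints 011101
print(encode("BAD_AD", CODE))    # prints 10011011011101
```

Encoding BAD_AD reproduces the bit string from the decoding example, which confirms the decoding in the reverse direction.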


Compression ratio
• The average number of bits per symbol in this code is
2 · 0.35 + 3 · 0.1 + 2 · 0.2 + 2 · 0.2 + 3 · 0.15 = 2.25.

• Fixed-length encoding of the same five symbols needs 3 bits per symbol.

• Compression ratio: (3 − 2.25)/3 · 100% = 25%
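The same arithmetic can be written as a small function. This is a sketch under assumptions: the function name `compression_ratio` is hypothetical, the symbol names A, B, C, D, _ are assumed from the order of the terms in the sum, and the fixed-length cost is taken as the smallest whole number of bits that can distinguish all symbols.

```python
from math import ceil, log2


def compression_ratio(freqs, lengths):
    """Return (average bits per symbol, fixed-length bits, ratio saved)."""
    avg = sum(freqs[s] * lengths[s] for s in freqs)
    fixed = max(1, ceil(log2(len(freqs))))   # bits needed per symbol
    return avg, fixed, (fixed - avg) / fixed


freqs = {"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}
lengths = {"A": 2, "B": 3, "C": 2, "D": 2, "_": 3}   # codeword lengths
avg, fixed, ratio = compression_ratio(freqs, lengths)
print(round(avg, 2), fixed, f"{ratio:.0%}")
```

For the example this gives an average of about 2.25 bits against 3 bits for fixed-length encoding, i.e. a saving of about 25%.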


Example
• Given the following frequencies of occurrence of characters in a string,
construct the Huffman tree, determine the Huffman codes, and find the
compression ratio relative to fixed-length encoding of the 6 characters.
character A B H J M P
frequency 23 11 15 2 19 30

• Determine Huffman codes for the string ABRACADABRA


Example (contd.)
• Construct a Huffman code for the following data:
symbol A B C D -
frequency 0.4 0.1 0.2 0.15 0.15

a) Encode ABACABAD using the code constructed above.


b) Decode 100010111001010 using the same code.
