4.6 Huffman Coding, Optimal Merge Pattern

Uploaded by

Suraj kumar

Department of Computer Science and Engineering (CSE)

UNIVERSITY INSTITUTE OF ENGINEERING
COMPUTER SCIENCE ENGINEERING
Bachelor of Engineering
Design and Analysis of Algorithms (CSH-311/ITH-311)

Topic: Huffman coding, Optimal Merge tree

DISCOVER . LEARN . EMPOWER

University Institute of Engineering (UIE)



Learning Objectives & Outcomes


Objective:
• To understand the concept of Huffman coding and
optimal merge pattern.

Outcome:
• Students will understand:
  - Huffman coding
  - Optimal merge pattern


Huffman Coding

• Huffman codes are prefix codes: the codes (bit sequences) are assigned in such a way that the code assigned to one character is never the prefix of the code assigned to any other character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated bitstream.
• There are two major parts in Huffman Coding:
1) Build a Huffman Tree from the input characters.
2) Traverse the Huffman Tree and assign codes to characters.


Steps to build Huffman Tree


The input is an array of unique characters along with their frequencies of occurrence; the output is a Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes. (The min heap is used as a priority queue, and the value of the frequency field is used to compare two nodes. Initially, the least frequent character is at the root.)
2. Extract the two nodes with the minimum frequency from the min heap.
3. Create a new internal node with a frequency equal to the sum of the two nodes' frequencies. Make the first extracted node its left child and the other extracted node its right child. Add this node to the min heap.
4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the root node and the tree is complete.
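
The four steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name build_huffman_tree, the nested-tuple node layout and the sample frequencies are assumptions made for illustration, and the standard heapq module plays the role of the min heap. The input is assumed to be non-empty, with single-character strings as symbols.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Build a Huffman tree from a non-empty {symbol: frequency} mapping.

    A leaf is the symbol itself; an internal node is a (left, right) tuple.
    Returns (root, total_frequency).
    """
    tie = count()  # tie-breaker so the heap never has to compare nodes
    # Step 1: one leaf per symbol, all placed in a min heap keyed by frequency.
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    # Step 4: repeat steps 2 and 3 until only the root is left.
    while len(heap) > 1:
        # Step 2: extract the two nodes with the minimum frequency.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Step 3: the first extracted node becomes the left child.
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    total, _, root = heap[0]
    return root, total


# Hypothetical frequencies, chosen only to keep the trace short.
root, total = build_huffman_tree({"x": 5, "y": 2, "z": 1})
print(root, total)
```

When two nodes have equal frequency, the order in which they are extracted depends on the tie-breaking rule (here, insertion order), so a different implementation may produce a different but equally good tree.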


Example
PROBLEM:
A file contains the following characters with the frequencies shown:

Character   a   e   i   o   u   s   t
Frequency  10  15  12   3   4  13   1

If Huffman Coding is used for data compression, determine:
1) Huffman Code for each character
2) Average code length
3) Length of Huffman encoded message (in bits)


Solution


Now,
• We assign a weight to every edge of the constructed Huffman Tree.
• Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.


Rule
• If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
• If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
• Either of the above two conventions may be followed.
• However, decoding must use the same convention that was adopted for encoding.
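
The effect of the two conventions can be sketched with a short traversal. This is an illustrative sketch only: the function name assign_codes and the nested-tuple tree are assumptions, and the tree shape is reconstructed from the code table given for this example rather than copied from the original figure.

```python
def assign_codes(node, zero_on_left=True, prefix=""):
    """Map each leaf symbol to its bit string by walking from root to leaf."""
    if isinstance(node, str):  # leaf: the path walked so far is its code
        return {node: prefix}
    left_bit, right_bit = ("0", "1") if zero_on_left else ("1", "0")
    left, right = node
    codes = assign_codes(left, zero_on_left, prefix + left_bit)
    codes.update(assign_codes(right, zero_on_left, prefix + right_bit))
    return codes


# Assumed tree shape: an internal node is a (left, right) tuple.
tree = (("i", "s"), ("e", ((("t", "o"), "u"), "a")))

codes_0_left = assign_codes(tree, zero_on_left=True)   # '0' on left edges
codes_1_left = assign_codes(tree, zero_on_left=False)  # '1' on left edges
print(codes_0_left)
print(codes_1_left)
```

Both conventions give every character the same code length; only the bit values are complemented, which is why the decoder has to know which convention the encoder used.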


Answers based on the previous Huffman tree

1. Huffman Code for Characters

To write the Huffman Code for any character, traverse the Huffman Tree from the root node to the leaf node of that character, recording the weight of each edge along the path.
Following this rule, the Huffman Code for each character is:
a = 111
e = 10
i = 00
o = 11001
u = 1101
s = 01
t = 11000

From here, we can observe:
• Characters occurring less frequently in the text are assigned longer codes.
• Characters occurring more frequently in the text are assigned shorter codes.
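
A short sketch can confirm that the code table above is prefix-free and decodes unambiguously. The helper names (is_prefix_free, encode, decode) and the sample word "seat" are assumptions introduced for illustration; only the code table comes from the example.

```python
# Code table from the example (weight '0' on left edges, '1' on right edges).
CODES = {"a": "111", "e": "10", "i": "00", "o": "11001",
         "u": "1101", "s": "01", "t": "11000"}


def is_prefix_free(codes):
    """True if no code word is a prefix of another code word."""
    words = sorted(codes.values())
    # After sorting, a prefix is always immediately followed by a word
    # that starts with it, so checking neighbours is enough.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


def encode(text, codes):
    """Concatenate the code word of every character in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Read bits left to right, emitting a symbol as soon as a code matches."""
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix property: the first match is the only one
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return "".join(out)


bits = encode("seat", CODES)
print(bits, decode(bits, CODES))
```

Because no code word is a prefix of another, the decoder never needs separators or look-ahead between characters.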


2. Average Code Length

Using formula-01, we have:

Average code length
= ∑ (frequency_i × code length_i) / ∑ (frequency_i)
= { (10 × 3) + (15 × 2) + (12 × 2) + (3 × 5) + (4 × 4) + (13 × 2) + (1 × 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 146 / 58
≈ 2.52 bits per character


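3. Length of Huffman Encoded Message

Using formula-02, we have:

Total number of bits in Huffman encoded message
= Total number of characters in the message × Average code length per character
= 58 × (146 / 58)
= 146 bits

Note: the exact average 146 / 58 must be used here. The rounded value gives 58 × 2.52 = 146.16, which is not a whole number of bits.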
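
The encoded length can also be obtained directly as the sum of frequency × code length, which avoids any rounding of the average. The following is a minimal sketch using the frequencies and code lengths of this example; the variable names are assumptions for illustration.

```python
from fractions import Fraction

# Frequencies and code lengths taken from the worked example.
FREQ = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
CODE_LEN = {"a": 3, "e": 2, "i": 2, "o": 5, "u": 4, "s": 2, "t": 5}

total_chars = sum(FREQ.values())                       # 58 characters
total_bits = sum(FREQ[c] * CODE_LEN[c] for c in FREQ)  # 146 bits
average = Fraction(total_bits, total_chars)            # exact average, 146/58

print(total_chars, total_bits, round(float(average), 2))
```

Keeping the average as an exact fraction means that multiplying it back by the number of characters returns the whole-number bit count.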


Optimal merge patterns


• Merge a set of sorted files of different lengths into a single sorted file. We need to find an optimal solution, where the resultant file is generated in minimum time.
• Given a number of sorted files, there are many ways to merge them into a single sorted file. The merge can be performed pairwise; this type of merging is called a 2-way merge pattern.
• As different pairings require different amounts of time, we want to determine an optimal way of merging many files together. At each step, the two shortest sequences are merged.
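
To see that the order of merging matters, the sketch below merges files strictly left to right and adds up the cost of each merge, where merging files of p and q records is charged p + q. The function name sequential_merge_cost and the file sizes are hypothetical, chosen only for illustration.

```python
def sequential_merge_cost(sizes):
    """Total cost of merging left to right: ((f1 + f2) + f3) + ..."""
    total, merged = 0, sizes[0]
    for size in sizes[1:]:
        merged += size   # the new file holds every record merged so far
        total += merged  # each merge is charged the size of its output
    return total


# The same five hypothetical files, taken in two different orders.
cost_as_given = sequential_merge_cost([20, 30, 10, 5, 30])
cost_sorted = sequential_merge_cost([5, 10, 20, 30, 30])
print(cost_as_given, cost_sorted)
```

The same files cost 270 record moves in the first order and 210 in the second, so the choice of merge pattern is worth optimising.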


• Merging a p-record file and a q-record file requires up to p + q record moves, so the obvious choice is to merge the two smallest files together at each step.
• Two-way merge patterns can be represented by binary merge trees. Let us consider a set of n sorted files {f1, f2, f3, …, fn}. Initially, each file in this set is considered a single-node binary tree. To find the optimal solution, the following algorithm is used.


Algorithm
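
A minimal Python sketch of the greedy procedure described above, assuming the file lengths are kept in a min heap (the standard heapq module). The function name optimal_merge is an assumption, and the file sizes are hypothetical, chosen to be consistent with the merge costs 15, 35, 60 and 95 quoted in the example.

```python
import heapq


def optimal_merge(sizes):
    """Greedy 2-way merge: always merge the two shortest files.

    Returns (total_cost, step_costs), where each step costs the
    length of the file it produces.
    """
    heap = list(sizes)
    heapq.heapify(heap)
    steps = []
    while len(heap) > 1:
        a = heapq.heappop(heap)      # shortest remaining file
        b = heapq.heappop(heap)      # second shortest
        steps.append(a + b)          # cost of this merge
        heapq.heappush(heap, a + b)  # the merged file rejoins the pool
    return sum(steps), steps


# Hypothetical file lengths (number of records in each sorted file).
total, steps = optimal_merge([20, 30, 10, 5, 30])
print(steps, total)
```

The loop has the same structure as Huffman tree construction, with file lengths in place of character frequencies.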


Example


Hence, the solution takes 15 + 35 + 60 + 95 = 205 comparisons.




Summary

• Huffman Coding
• Optimal Merge Pattern



THANK YOU
