4.6 Huffman Coding, Optimal Merge Patterns.
4.6 Huffman Coding, Optimal Merge Patterns.
Huffman Coding
Prefix Codes, means the codes (bit sequences) are assigned in such a way that the code
assigned to one character is not the prefix of code assigned to any other character. This is
how Huffman Coding makes sure that there is no ambiguity when decoding the generated
bitstream.
Let us understand prefix codes with a counter example. Let there be four characters a, b, c
and d, and their corresponding variable length codes be 00, 01, 0 and 1. This coding leads to
ambiguity because code assigned to c is the prefix of codes assigned to a and b. If the
compressed bit stream is 0001, the de-compressed output may be “cccd” or “ccb” or “acd” or
“ab”.
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree with
single node.
Step 2 Extract two minimum frequency nodes from min heap. Add a new internal node with
frequency 5 + 9 = 14.
Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each,
and one heap node is root of tree with 3 elements
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with
frequency 12 + 13 = 25
Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each,
and two heap nodes are root of tree with more than one nodes.
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 +
16 = 30
Character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 +
30 = 55
Now min heap contains 2 nodes.
character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 +
55 = 100
Since the heap contains only one node, the algorithm stops here.
f 0
c 100
d 101
a 1100
b 1101
e 111
Merge a set of sorted files of different length into a single sorted file. We need to find an
optimal solution, where the resultant file will be generated in minimum time.
If the number of sorted files are given, there are many ways to merge them into a single
sorted file. This merge can be performed pair wise. Hence, this type of merging is called
as 2-way merge patterns.
As, different pairings require different amounts of time, in this strategy we want to
determine an optimal way of merging many files together. At each step, two shortest
sequences are merged.
To merge a p-record file and a q-record file requires possibly p + q record moves, the
obvious choice being, merge the two smallest files together at each step.
Two-way merge patterns can be represented by binary merge trees. Let us consider a set
of n sorted files {f1, f2, f3, …, fn}. Initially, each element of this is considered as a single node
binary tree. To find this optimal solution, the following algorithm is used.
At the end of this algorithm, the weight of the root node represents the optimal cost.
Example
Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of elements
respectively.
50 + 60 + 65 + 95 = 270
Sorting the numbers according to their size in an ascending order, we get the following
sequence −
f4, f3, f1, f2, f5
15 + 35 + 65 + 95 = 210
In this context, we are now going to solve the problem using this algorithm.
Initial Set
Step-1
Step-2
Step-3
Step-4
Source Notes:
1. https://www.geeksforgeeks.org/huffman-coding-greedy-algo-3/
2. https://www.programiz.com/dsa/huffman-coding
3. https://www.tutorialspoint.com/design_and_analysis_of_algorithms/
design_and_analysis_of_algorithms_optimal_merge_pattern.htm#:~:text=Merge%20a
%20set%20of%20sorted,into%20a%20single%20sorted%20file.
Lecture Video:
1. https://youtu.be/0kNXhFIEd_w
2. https://youtu.be/HHIc5JZyenI
Online Notes:
1. http://vssut.ac.in/lecture_notes/lecture1428551222.pdf
Text Book Reading:
rd
1. Cormen, Leiserson, Rivest, Stein, “Introduction to Algorithms”, Prentice Hall of India, 3
edition 2012. problem, Graph coloring.