
Efficient Huffman Coding for Sorted Input | Greedy Algo-4

Last Updated : 06 Nov, 2023

We recommend reading the following post as a prerequisite:
Greedy Algorithms | Set 3 (Huffman Coding)

The time complexity of the algorithm discussed in the above post is O(n log n). If we know that the given array is sorted (in non-decreasing order of frequency), we can generate Huffman codes in O(n) time. The following is an O(n) algorithm for sorted input.
1. Create two empty queues.
2. Create a leaf node for each unique character and enqueue it to the first queue in non-decreasing order of frequency. The second queue is initially empty.
3. Dequeue the two nodes with the minimum frequency by examining the fronts of both queues. To pick each of the two nodes, do the following:
        1. If the second queue is empty, dequeue from the first queue.
        2. If the first queue is empty, dequeue from the second queue.
        3. Otherwise, compare the fronts of the two queues and dequeue the smaller one.
4. Create a new internal node with frequency equal to the sum of the two dequeued nodes' frequencies. Make the first dequeued node its left child and the second dequeued node its right child, then enqueue the new node to the second queue.
5. Repeat steps 3 and 4 while there is more than one node left in the two queues. The remaining node is the root and the Huffman tree is complete (a dry run follows).
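A dry run on the input used in the program below (frequencies 5, 9, 12, 13, 16, 45) shows why this is O(n): internal nodes are created in non-decreasing order of frequency, so the second queue also stays sorted and each minimum is always at the front of one of the two queues.

Q1: 5 9 12 13 16 45          Q2: (empty)
Merge 5 and 9   -> Q1: 12 13 16 45   Q2: 14
Merge 12 and 13 -> Q1: 16 45         Q2: 14 25
Merge 14 and 16 -> Q1: 45            Q2: 25 30
Merge 25 and 30 -> Q1: 45            Q2: 55
Merge 45 and 55 -> Q1: (empty)       Q2: 100 (root)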

C++14
// C++ STL code to generate Huffman codes when the input array
// is sorted in non-decreasing order of frequency
#include <bits/stdc++.h>
using namespace std;

// Node structure for creating a binary tree
struct Node {
    char ch; // character, or '$' for an internal node
    int freq;
    Node* left;
    Node* right;
    Node(char c, int f, Node* l = nullptr,
         Node* r = nullptr)
        : ch(c)
        , freq(f)
        , left(l)
        , right(r) {}
};

// Find the min freq node between q1 and q2
Node* minNode(queue<Node*>& q1, queue<Node*>& q2)
{
    Node* temp;

    if (q1.empty()) {
        temp = q2.front();
        q2.pop();
        return temp;
    }

    if (q2.empty()) {
        temp = q1.front();
        q1.pop();
        return temp;
    }

    if (q1.front()->freq < q2.front()->freq) {
        temp = q1.front();
        q1.pop();
        return temp;
    }
    else {
        temp = q2.front();
        q2.pop();
        return temp;
    }
}

// Function to print the generated huffman codes
void printHuffmanCodes(Node* root, string str = "")
{
    if (!root)
        return;
    if (root->ch != '$') {
        cout << root->ch << ": " << str << '\n';
        return;
    }

    printHuffmanCodes(root->left, str + "0");
    printHuffmanCodes(root->right, str + "1");

    return;
}

// Function to generate huffman codes
void generateHuffmanCode(vector<pair<char, int> > v)
{
    if (v.empty())
        return;

    // The loop below assumes at least two leaves; a single
    // distinct character simply gets the code "0".
    if (v.size() == 1) {
        cout << v[0].first << ": 0\n";
        return;
    }

    queue<Node*> q1;
    queue<Node*> q2;

    for (auto it = v.begin(); it != v.end(); ++it)
        q1.push(new Node(it->first, it->second));

    while (!q1.empty() or q2.size() > 1) {
        Node* l = minNode(q1, q2);
        Node* r = minNode(q1, q2);
        Node* node = new Node('$', l->freq + r->freq, l, r);
        q2.push(node);
    }

    printHuffmanCodes(q2.front());
    return;
}

int main()
{
    vector<pair<char, int> > v
        = { { 'a' , 5 },  { 'b' , 9 },  { 'c' , 12 },
            { 'd' , 13 }, { 'e' , 16 }, { 'f' , 45 } };
    generateHuffmanCode(v);
    return 0;
}

Output: 

f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
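The listing above allocates tree nodes with new and never frees them, which is acceptable for a one-shot demo. If cleanup is needed, a small recursive helper along these lines (a sketch; deleteTree is a hypothetical name, not part of the original code) could be called on q2.front() after printHuffmanCodes() returns:

// Hypothetical cleanup helper (not in the original listing): frees the
// Huffman tree built by generateHuffmanCode after the codes are printed.
void deleteTree(Node* root)
{
    if (!root)
        return;
    deleteTree(root->left);
    deleteTree(root->right);
    delete root;
}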

Time Complexity: O(n)
If the input is not sorted, it must be sorted before it can be processed by the above algorithm. Sorting can be done with heap sort or merge sort, both of which run in Θ(n log n), so the overall time complexity becomes O(n log n) for unsorted input (a sketch follows below).
Auxiliary Space: O(n)
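For example, a minimal wrapper along these lines (a sketch, not from the original article; huffmanFromUnsorted is a hypothetical name) sorts an arbitrary frequency table and then reuses generateHuffmanCode() from the listing above; it assumes it is compiled together with that listing, which already pulls in the needed headers:

// Hypothetical wrapper (not part of the original listing): sort an
// unsorted frequency table by non-decreasing frequency in O(n log n),
// then build the codes in O(n) with generateHuffmanCode() defined above.
void huffmanFromUnsorted(vector<pair<char, int> > v)
{
    sort(v.begin(), v.end(),
         [](const pair<char, int>& a, const pair<char, int>& b) {
             return a.second < b.second;
         });
    generateHuffmanCode(v);
}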

Reference: 
http://en.wikipedia.org/wiki/Huffman_coding
This article is compiled by Aashish Barnwal and reviewed by the GeeksforGeeks team.

