0% found this document useful (0 votes)
50 views26 pages

Huffman Paraphrased

The document discusses implementing Huffman coding. It begins with an abstract explaining Huffman coding and its use of a binary tree method to encode messages into shorter codes without data loss. It then provides theory on the basics of Huffman coding, including assigning variable-length binary sequences based on character frequencies and building a Huffman tree by merging nodes. The document includes a flowchart and implementation of Huffman coding in C language with functions like creating a min heap, extracting minimum values, and building the Huffman tree.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
50 views26 pages

Huffman Paraphrased

The document discusses implementing Huffman coding. It begins with an abstract explaining Huffman coding and its use of a binary tree method to encode messages into shorter codes without data loss. It then provides theory on the basics of Huffman coding, including assigning variable-length binary sequences based on character frequencies and building a Huffman tree by merging nodes. The document includes a flowchart and implementation of Huffman coding in C language with functions like creating a min heap, extracting minimum values, and building the Huffman tree.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 26

Vietnam National University

Ho Chi Minh City University of Technology

COMMUNICATION NETWORKS

PROJECT REPORT
HUFFMAN CODING IMPLEMENT

Instructor: Prof. Dr. Vo Que Son


Group 5:
Nguyen Ngoc Minh Khang ID: 1411714
Du Nghiem Vinh Bao ID:

November, 24th 2018


Project report 1

Contents
Abstract .......................................................................... 2

Theory............................................................................. 2

1) The Basics ................................................... 3

2) Huffman Tree.............................................. 4

3) The reading of the Huffman Tree ............... 5

FLOWCHART ................................................................... 7

Implementation .............................................................. 7

Conclusion .................................................................... 24

References .................................................................... 25

November, 24h 2018


Project report 2

Abstract
The Huffman Coding is a lossless
data compression algorithm,
developed by David Huffman in
the early of 50s while he was a
PhD student at MIT. The
algorithm is based on a binary-
tree frequency-sorting method
that allow encode any message sequence into shorter
encoded messages and a method to reassemble into original
message without losing any data.

Theory

November, 24h 2018


Project report 3

1) The Basics

The idea is to assign variable-length binary sequences to


input characters whose lengths are based on frequencies of
corresponding characters. The most frequent character gets
the shortest binary sequence and vice versa, the least
frequent character gets the longest binary sequence.
The variable-length codes assigned to input characters
are prefixed, meaning the codes (bit sequences) are assigned
in such a way that the code assigned to one character is not
prefix of code assigned to any other character. This is to
ensure that ambiguity doesn’t occur during decoding the
generated bit stream.

November, 24h 2018


Project report 4

2) Huffman Tree

a) Definition
The best way to visualize the Huffman coding is to think of it
as a tree, a Huffman tree. A Huffman tree is comprised of
binary whose leafs correspond to characters.
The length of a leaf is amount to the length of the bit
sequence corresponding to the character. The Huffman tree
is the binary tree with minimum external braches weight so
the goal is to build a tree with the minimum external
branches.

November, 24h 2018


Project report 5

b) Building the tree


Each node of the tree are represented with a byte symbol
and the frequency of that byte on the data. The creation of
the Huffman tree have the following steps:

1. Scan the data and calculate the frequency of occurrence of


each byte;
2. Insert those nodes into a reverse priority queue based on
the frequencies (a lowest frequency is given highest
priority);
3. Start a loop until the queue is empty;
4. Immerge the two lowest frequency nodes from the queue
into one new node through sum.
5. Redo from step 2 with the new node. Repeat the process
until there is only one node left.
6. The fully constructed Huffman tree will provide the
reading of bit sequences corresponding to the relating
characters.

3) The reading of the Huffman Tree

November, 24h 2018


Project report 6

Starting from the root


 Maintain an auxiliary array.
 While moving to the left side, write 0 to the array.
 While moving to the right side, write 1 to the array.
 Print the array when a leaf node is encountered.

November, 24h 2018


Project report 7

FLOWCHART

Implementation
Implement in C language

// C program for Huffman Coding


#include <stdio.h>
November, 24h 2018
Project report 8

#include <stdlib.h>

// This constant can be avoided by explicitly


// calculating height of Huffman Tree
#define MAX_TREE_HT 100

// A Huffman tree node


struct MinHeapNode {

// One of the input characters


char data;

// Frequency of the character


unsigned freq;

// Left and right child of this node


struct MinHeapNode *left, *right;
};

November, 24h 2018


Project report 9

// A Min Heap: Collection of


// min heap (or Hufmman tree) nodes
struct MinHeap {

// Current size of min heap


unsigned size;

// capacity of min heap


unsigned capacity;

// Attay of minheap node pointers


struct MinHeapNode** array;
};

// A utility function allocate a new


// min heap node with given character
// and frequency of the character
struct MinHeapNode* newNode(char data, unsigned freq)
{

November, 24h 2018


Project report 10

struct MinHeapNode* temp


= (struct MinHeapNode*)malloc
(sizeof(struct MinHeapNode));

temp->left = temp->right = NULL;


temp->data = data;
temp->freq = freq;

return temp;
}

// A utility function to create


// a min heap of given capacity
struct MinHeap* createMinHeap(unsigned capacity)

struct MinHeap* minHeap


= (struct MinHeap*)malloc(sizeof(struct MinHeap));

November, 24h 2018


Project report 11

// current size is 0
minHeap->size = 0;

minHeap->capacity = capacity;

minHeap->array
= (struct MinHeapNode**)malloc(minHeap->
capacity * sizeof(struct MinHeapNode*));
return minHeap;
}

// A utility function to
// swap two min heap nodes
void swapMinHeapNode(struct MinHeapNode** a,
struct MinHeapNode** b)

November, 24h 2018


Project report 12

struct MinHeapNode* t = *a;


*a = *b;
*b = t;
}

// The standard minHeapify function.


void minHeapify(struct MinHeap* minHeap, int idx)

int smallest = idx;


int left = 2 * idx + 1;
int right = 2 * idx + 2;

if (left < minHeap->size && minHeap->array[left]->


freq < minHeap->array[smallest]->freq)
smallest = left;

if (right < minHeap->size && minHeap->array[right]->

November, 24h 2018


Project report 13

freq < minHeap->array[smallest]->freq)


smallest = right;

if (smallest != idx) {
swapMinHeapNode(&minHeap->array[smallest],
&minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}

// A utility function to check


// if size of heap is 1 or not
int isSizeOne(struct MinHeap* minHeap)
{

return (minHeap->size == 1);


}

// A standard function to extract

November, 24h 2018


Project report 14

// minimum value node from heap


struct MinHeapNode* extractMin(struct MinHeap*
minHeap)

struct MinHeapNode* temp = minHeap->array[0];


minHeap->array[0]
= minHeap->array[minHeap->size - 1];

--minHeap->size;
minHeapify(minHeap, 0);

return temp;
}

// A utility function to insert


// a new node to Min Heap
void insertMinHeap(struct MinHeap* minHeap,

November, 24h 2018


Project report 15

struct MinHeapNode* minHeapNode)

++minHeap->size;
int i = minHeap->size - 1;

while (i && minHeapNode->freq < minHeap->array[(i - 1) /


2]->freq) {

minHeap->array[i] = minHeap->array[(i - 1) / 2];


i = (i - 1) / 2;
}

minHeap->array[i] = minHeapNode;
}

// A standard funvtion to build min heap


void buildMinHeap(struct MinHeap* minHeap)

November, 24h 2018


Project report 16

int n = minHeap->size - 1;
int i;

for (i = (n - 1) / 2; i >= 0; --i)


minHeapify(minHeap, i);
}

// A utility function to print an array of size n


void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);

printf("\n");
}

November, 24h 2018


Project report 17

// Utility function to check if this node is leaf


int isLeaf(struct MinHeapNode* root)

return !(root->left) && !(root->right);


}

// Creates a min heap of capacity


// equal to size and inserts all character of
// data[] in min heap. Initially size of
// min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(char data[], int
freq[], int size)

struct MinHeap* minHeap = createMinHeap(size);

November, 24h 2018


Project report 18

for (int i = 0; i < size; ++i)


minHeap->array[i] = newNode(data[i], freq[i]);

minHeap->size = size;
buildMinHeap(minHeap);

return minHeap;
}

// The main function that builds Huffman tree


struct MinHeapNode* buildHuffmanTree(char data[], int
freq[], int size)

{
struct MinHeapNode *left, *right, *top;

// Step 1: Create a min heap of capacity


// equal to size. Initially, there are

November, 24h 2018


Project report 19

// modes equal to size.


struct MinHeap* minHeap =
createAndBuildMinHeap(data, freq, size);

// Iterate while size of heap doesn't become 1


while (!isSizeOne(minHeap)) {

// Step 2: Extract the two minimum


// freq items from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);

// Step 3: Create a new internal


// node with frequency equal to the
// sum of the two nodes frequencies.
// Make the two extracted node as
// left and right children of this new node.
// Add this node to the min heap
// '$' is a special value for internal nodes, not used

November, 24h 2018


Project report 20

top = newNode('$', left->freq + right->freq);

top->left = left;
top->right = right;

insertMinHeap(minHeap, top);
}

// Step 4: The remaining node is the


// root node and the tree is complete.
return extractMin(minHeap);
}

// Prints huffman codes from the root of Huffman Tree.


// It uses arr[] to store codes
void printCodes(struct MinHeapNode* root, int arr[], int top)

November, 24h 2018


Project report 21

// Assign 0 to left edge and recur


if (root->left) {

arr[top] = 0;
printCodes(root->left, arr, top + 1);
}

// Assign 1 to right edge and recur


if (root->right) {

arr[top] = 1;
printCodes(root->right, arr, top + 1);
}

// If this is a leaf node, then


// it contains one of the input
// characters, print the character
// and its code from arr[]
if (isLeaf(root)) {

November, 24h 2018


Project report 22

printf("%c: ", root->data);


printArr(arr, top);
}
}

// The main function that builds a


// Huffman Tree and print codes by traversing
// the built Huffman Tree
void HuffmanCodes(char data[], int freq[], int size)

{
// Construct Huffman Tree
struct MinHeapNode* root
= buildHuffmanTree(data, freq, size);

// Print Huffman codes using


// the Huffman tree built above
int arr[MAX_TREE_HT], top = 0;

November, 24h 2018


Project report 23

printCodes(root, arr, top);


}

// Driver program to test above functions


int main()
{

char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };


int freq[] = { 5, 9, 12, 13, 16, 45 };

int size = sizeof(arr) / sizeof(arr[0]);

HuffmanCodes(arr, freq, size);

return 0;
}

November, 24h 2018


Project report 24

Conclusion

November, 24h 2018


Project report 25

References
[1] https://sites.fas.harvard.edu/~cscie119/lectures/trees.pdf

[2] http://homes.sice.indiana.edu/yye/lab/teaching/spring2014-C343/huffman.php

November, 24h 2018

You might also like