
Huffman Coding

Esraa Khaled Mostafa (1900102), Hebatallah Hesham Sayed (1900081), Laila Tamer Wagih (1900024),
Omar Ahmed Maghrabi (1900843), Youssef Mohamed Ibrahim (1901028)

Department of Electronics and Communications, Ain Shams University, Cairo, Egypt

Abstract— This paper presents a thorough examination of Adaptive Huffman Coding, a dynamic approach for optimizing data compression. The MATLAB code implemented in this project encompasses key functionalities such as entropy and information gain calculation, coding and decoding processes using Adaptive Huffman Coding, efficiency measurement, compression ratio determination, and a comprehensive comparison of input and output files. The investigation focuses on the algorithm's dynamic adjustment of variable-length codes based on changing symbol frequencies, showcasing superior compression efficiency and decoding speeds. Through a comparative analysis against traditional Huffman Coding, the study highlights the adaptability and efficacy of Adaptive Huffman Coding in diverse data compression scenarios.

Keywords— Huffman Coding, Data compression, Efficiency, Length, Compression ratio
I. INTRODUCTION

Huffman coding is a smart way to make data smaller without losing any of it. Imagine a bunch of letters where some, like 'E' or 'A', show up a lot, while others, like 'J' or 'Q', are rare. Instead of using the same number of bits for each letter, Huffman coding uses shorter codes for common letters and longer codes for uncommon ones. This makes the overall number of bits used far smaller, saving space and making things more efficient. It is like giving priority to the important letters when talking in a secret code. This method is very helpful for compressing data, from written words to pictures and videos on computers.
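
As a concrete illustration (a worked example added here, not taken from our measurements), consider four symbols whose probabilities are powers of 1/2, together with one valid Huffman code for them:

p     = [0.5 0.25 0.125 0.125];       % probabilities of symbols A, B, C, D
codes = {'0', '10', '110', '111'};    % one valid Huffman code for p

L = sum(cellfun(@length, codes) .* p) % average code length = 1.75 bits/symbol
H = -sum(p .* log2(p))                % entropy = 1.75 bits/symbol

A fixed-length code would need 2 bits per symbol, while the Huffman code averages only 1.75 bits per symbol. Because the probabilities here are exact powers of 1/2, the entropy equals the average length and the code is 100% efficient; real text is not this convenient, which is why measured efficiencies fall slightly below 100%.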
II. BENEFITS

In the context of information encoding and data processing, Huffman coding emerges as a highly beneficial technique, providing efficient and unambiguous codes through meticulous symbol frequency analysis. Its ability to assign shorter codes to frequently occurring symbols optimizes data representation, minimizing the overall encoded message length and contributing to space optimization. The benefits extend to reliable information transmission and storage, underlining Huffman coding's significance in addressing the challenges posed by the staggering volume of data processed by computers.

III. ALGORITHM

A) Function Definition:

function [Code_Words, Average_Code_Length] = Huffman_Dictionary(probability)

B) Input:

Probability: A vector containing the probabilities of the unique symbols extracted from the text file (a sketch of how such a vector can be obtained is given below).
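
The extraction step itself is not part of the listing; a minimal sketch of it, assuming the input file is named 'input.txt' (a placeholder name), could be:

text = fileread('input.txt');                      % read the whole file as one char array
symbols = unique(text);                            % distinct symbols appearing in the file
counts = arrayfun(@(s) sum(text == s), symbols);   % occurrences of each symbol
probability = counts / numel(text);                % relative frequencies (sum to 1)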
C) Sorting Probabilities:

Sort the probabilities in descending order and store the sorted values along with their indices.

[Prob_Sorted_Desc, Descending_idx] = sort(probability, 'descend');

D) Recursive Call:
- Create a new vector (Recursion_prob) with one element fewer than the sorted probability vector.
- Add the last two probabilities in the sorted vector and put the result as the last element in the new vector.
- Call the Huffman_Dictionary function recursively with the new vector.

% Create the merged vector: all probabilities except the smallest,
% with the two smallest summed into the last element.
Recursion_prob = Prob_Sorted_Desc(1:length(Prob_Sorted_Desc) - 1);
Recursion_prob(length(Prob_Sorted_Desc) - 1) ...
    = Prob_Sorted_Desc(length(Prob_Sorted_Desc) - 1) ...
    + Prob_Sorted_Desc(length(Prob_Sorted_Desc));

Old_Code_Words = Huffman_Dictionary(Recursion_prob);

E) Code Generation:
- Concatenate '0' and '1' to the last code word of the old code words to form the new code words of the two leaf nodes.
- Rearrange the code words according to the indices obtained from sorting the probabilities in descending order, so that each code word is mapped back to the position of its original symbol.

% Keep all code words except the last; replace the last (the merged
% node) with two new leaf code words obtained by appending '0' and '1'.
New_Code_Words = [Old_Code_Words(1:(length(Old_Code_Words) - 1)), ...
                  strcat(Old_Code_Words{length(Old_Code_Words)}, '0'), ...
                  strcat(Old_Code_Words{length(Old_Code_Words)}, '1')];
Code_Words(Descending_idx) = New_Code_Words;

F) Average Code Length Calculation:
- Get a vector of the lengths of the code words.
- Calculate the average code length by multiplying each code word's length by its corresponding probability and summing the results.

Code_Word_Length = cellfun(@length, Code_Words);
Average_Code_Length = sum(Code_Word_Length .* probability);
G) Output:
- Return the generated code words (Code_Words) and the average code length (Average_Code_Length).
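
Putting steps A) through G) together gives one recursive function. The following is a minimal assembled sketch; the two-symbol base case is an assumption (the stopping rule is not listed above), and the input is assumed to be a row vector with at least two probabilities.

function [Code_Words, Average_Code_Length] = Huffman_Dictionary(probability)
    % Sort the probabilities in descending order, remembering the indices.
    [Prob_Sorted_Desc, Descending_idx] = sort(probability, 'descend');
    if length(Prob_Sorted_Desc) == 2
        % Assumed base case: with two symbols the code words are '0' and '1'.
        Code_Words(Descending_idx) = {'0', '1'};
    else
        % Merge the two least probable symbols into a single node.
        Recursion_prob = Prob_Sorted_Desc(1:end - 1);
        Recursion_prob(end) = Prob_Sorted_Desc(end - 1) + Prob_Sorted_Desc(end);
        Old_Code_Words = Huffman_Dictionary(Recursion_prob);
        % Split the merged node into two leaves by appending '0' and '1'.
        New_Code_Words = [Old_Code_Words(1:end - 1), ...
                          strcat(Old_Code_Words{end}, '0'), ...
                          strcat(Old_Code_Words{end}, '1')];
        % Map the code words back to the original symbol order.
        Code_Words(Descending_idx) = New_Code_Words;
    end
    Code_Word_Length = cellfun(@length, Code_Words);
    Average_Code_Length = sum(Code_Word_Length .* probability);
end

For the worked example from the introduction, Huffman_Dictionary([0.5 0.25 0.125 0.125]) returns the code words {'0', '10', '110', '111'} and an average code length of 1.75 bits/symbol.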

Fig. 1. Example of a generated Huffman tree

IV. RESULTS

Parameter              Result
Efficiency (E)         99.14 %
Compression ratio      57.51 %
Entropy H(x)           4.5610 bits/symbol
Average word length    4.6006 bits/symbol

Table 1. Summary of our results
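
As a sanity check, the reported figures are mutually consistent: the efficiency equals the entropy divided by the average word length, and the compression ratio matches the average word length relative to the 8 bits of an uncompressed ASCII character (our reading of how the ratio is defined).

Efficiency = 4.5610 / 4.6006 * 100     % = 99.14 %
Compression_Ratio = 4.6006 / 8 * 100   % = 57.51 %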

