ADCexp 2


Experiment No.: 2

Name: Sarthak Sandip Bhosale

UID: 2023201002

Aim: Given a text file or text message, create a dictionary, Huffman encode and decode the symbols of the dictionary, and find the entropy and efficiency of the coding method.

Software: MATLAB Online

Theory:
Huffman encoding is a lossless data compression algorithm that assigns variable-length codes to
input characters, with shorter codes assigned to more frequent characters. Developed by David
A. Huffman in 1952, it ensures efficient compression by minimizing the average code length
based on the frequency of characters in the data set.

Huffman Encoding Algorithm:

1. Frequency Calculation: First, determine the frequency of each character in the input
data.
2. Build a Min-Heap Tree: Create a min-heap (priority queue) where each node represents
a character and its frequency. The heap arranges nodes in ascending order of frequency.
3. Merge Nodes: While the heap contains more than one node, extract the two nodes with
the lowest frequencies, merge them into a new node whose frequency is the sum of the
two. This process builds a binary tree where each leaf node represents a character.
4. Assign Codes: Traverse the tree from the root to each leaf, assigning 0 to left edges and 1
to right edges, generating the Huffman code for each character (a minimal sketch of steps
2-4 follows this list).
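The program below relies on MATLAB's built-in huffmandict, which carries out steps 2-4 internally. For illustration only, here is a minimal sketch of those steps written out by hand, assuming a hypothetical four-symbol alphabet with made-up probabilities; a cell array of symbol-index groups stands in for the min-heap:

% Toy alphabet and probabilities (hypothetical, for illustration only)
symbols = {'a', 'b', 'c', 'd'};
p = [0.4 0.3 0.2 0.1];
codes = repmat({''}, 1, numel(symbols)); % code string per symbol
nodes = num2cell(1:numel(symbols));      % each node = indices of its leaves
while numel(nodes) > 1
    % Extract the two nodes with the lowest total probability
    totals = cellfun(@(idx) sum(p(idx)), nodes);
    [~, order] = sort(totals);
    lo = nodes{order(1)};
    hi = nodes{order(2)};
    % Prepend 0 to every code under one node and 1 under the other
    for k = lo, codes{k} = ['0' codes{k}]; end
    for k = hi, codes{k} = ['1' codes{k}]; end
    % Merge the two nodes into one whose probability is the sum
    nodes(order(1:2)) = [];
    nodes{end+1} = [lo hi];
end
for k = 1:numel(symbols)
    fprintf('%s -> %s\n', symbols{k}, codes{k});
end

For these probabilities the sketch prints a = 0, b = 10, c = 111, d = 110: the most frequent symbol gets the shortest code, exactly as step 4 describes.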
The total size of the compressed data is calculated by summing the products of the frequency of
each character f(ci) and the length of its corresponding Huffman code L(ci). Mathematically, this
can be represented as

S = Σi f(ci) · L(ci)

where S is the total number of bits in the compressed data, f(ci) is the frequency of character ci,
and L(ci) is the length of the Huffman code for ci.
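For example, the 16-character test string used in the program below compresses to S = 16 × 3.125 = 50 bits (using the average code length of 3.125 bits reported in the output), versus 16 × 8 = 128 bits in plain 8-bit ASCII; the ratio 128/50 = 2.56 matches the compression ratio the program prints.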

Huffman encoding guarantees optimal prefix-free codes, ensuring that no code is a prefix of any
other, which allows for unambiguous decoding. It is widely used in data compression algorithms
such as ZIP, JPEG, and others.
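To see why the prefix-free property makes decoding unambiguous, consider this small sketch (with a hypothetical code table, not part of the experiment's program): the decoder scans the bitstream left to right and emits a symbol the moment the accumulated bits match a codeword, and since no codeword is a prefix of another, the match is never ambiguous.

% Hypothetical prefix-free code table: no entry is a prefix of another
codes = containers.Map({'0', '10', '110', '111'}, {'a', 'b', 'c', 'd'});
bits = '0101100'; % concatenated codewords for 'a', 'b', 'c', 'a'
buf = '';
out = '';
for bit = bits
    buf(end+1) = bit;        % accumulate bits
    if isKey(codes, buf)     % a codeword matched, emit its symbol
        out(end+1) = codes(buf);
        buf = '';
    end
end
disp(out) % prints 'abca'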

Program:

% Creates a function named huffmanCoding that takes one input and
% returns the six outputs listed below
function [huffman_code, decoded_str, entropy, compression_ratio, ...
    efficiency, avglen] = huffmanCoding(input_str)

% Step 1: Calculate the frequency of each character in the input string
symbols = unique(input_str);      % unique symbols, including spaces
freq = zeros(1, length(symbols)); % preallocate the frequency array

for i = 1:length(symbols)
    freq(i) = sum(input_str == symbols(i)); % frequency of each character
end

% Normalize frequencies to probabilities
freq = freq / sum(freq);

% Convert symbols to a cell array of characters
symbols = cellstr(symbols(:))'; % note: cellstr turns whitespace-only rows into empty cells

% Step 2: Create a Huffman dictionary
[dict, avglen] = huffmandict(symbols, freq);

% Convert the input string to a cell array for encoding
input_cell = cellstr(input_str(:))';

% Step 3: Encode the input string using the Huffman dictionary
huffman_code = huffmanenco(input_cell, dict);

% Display the dictionary
disp('Huffman Dictionary:');
disp('Character   Huffman Code');
for i = 1:length(dict)
    fprintf('    %s       %s\n', dict{i, 1}, num2str(dict{i, 2}));
end

% Step 4: Display average code word length
disp(['Average Code Word Length: ', num2str(avglen)]);

% Step 5: Calculate entropy
entropy = -sum(freq .* log2(freq));
disp(['Entropy: ', num2str(entropy)]);

% Step 6: Calculate compression ratio
original_size = length(input_str) * 8;  % each character is 8 bits in ASCII
compressed_size = length(huffman_code); % length of the Huffman bitstream
compression_ratio = original_size / compressed_size;
disp(['Compression Ratio: ', num2str(compression_ratio)]);

% Step 7: Calculate efficiency
efficiency = (entropy / avglen) * 100;
disp(['Efficiency: ', num2str(efficiency), '%']);

% Step 8: Decode the Huffman code back to the original string
decoded_cell = huffmandeco(huffman_code, dict);

% Join the decoded cells into a single string
decoded_str = strjoin(decoded_cell, '');

% Display the decoded string
disp('Decoded String:');
disp(decoded_str);
end
% Example usage:
input_str = ' sounds blissful';
[huffman_code, decoded_str, entropy, compression_ratio, efficiency, ...
    avglen] = huffmanCoding(input_str);

Huffman Dictionary:
Character Huffman Code
0 0 0
b 0 0 1 1
d 0 0 1 0
f 1 1 0 1
i 1 1 0 0
l 0 1 1
n 1 1 1 1
o 1 1 1 0
s 1 0
u 0 1 0
Average Code Word Length: 3.125
Entropy: 3.125
Compression Ratio: 2.56
Efficiency: 100%
Decoded String:
soundsblissful
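Note that the decoded string prints without its spaces. This is because cellstr trims whitespace-only rows, so every space in the input (and in the symbol list) becomes an empty cell before encoding; the first dictionary entry above, shown with a blank character column, is that empty symbol. A minimal sketch of a possible fix (untested here, using the same toolbox functions as the program above) replaces cellstr with num2cell so each character, including spaces, stays intact:

symbols = num2cell(unique(input_str)); % keep ' ' as a real one-character cell
input_cell = num2cell(input_str);
[dict, avglen] = huffmandict(symbols, freq); % freq computed as before
huffman_code = huffmanenco(input_cell, dict);
decoded_cell = huffmandeco(huffman_code, dict);
decoded_str = [decoded_cell{:}]; % concatenation preserves the spaces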

% Clear all variables, close figures, and clear the screen
clear all;
close all;
clc;

number_of_colors = 256; % 256-color palette
% Read the input image
a = imread('peacock.jpg');
fprintf("Input Image:")

Input Image:

figure(1), imshow(a) % display the input image

% Grayscale conversion (commented out): I = rgb2gray(a);
% Use an indexed image instead of grayscale
[I, myCmap] = rgb2ind(a, number_of_colors); % convert to an indexed format

% Size of the image
[m, n] = size(I);
Totalcount = m * n; % total number of pixels

% Compute the probability of each intensity value
cnt = 1;
pro = zeros(256, 1); % pro holds the probabilities
for i = 0:255
    k = (I == i);      % logical array marking pixels with value i
    count = sum(k(:)); % number of occurrences of intensity i
    pro(cnt) = count / Totalcount;
    cnt = cnt + 1;
end
% Probabilities can also be found using histcounts
pro1 = histcounts(I, 0:256, 'Normalization', 'probability');
cumpro = cumsum(pro); % cumulative sum, if needed
sigma = sum(pro);     % total probability; should always be 1.0

% Symbols for an image
symbols = 0:255;
% Huffman code dictionary
[dict, avglen] = huffmandict(symbols, pro);
% Average code word length
fprintf('Average code length: %.4f bits/symbol\n', avglen);
Average code length: 7.7658 bits/symbol

% Entropy (exclude zero-probability symbols so log2 stays finite)
nonzero_probs = pro(pro > 0);
entropy = -sum(nonzero_probs .* log2(nonzero_probs));
fprintf('Entropy: %.4f bits/symbol\n', entropy);

Entropy: 7.7389 bits/symbol

% Reshape the image array into a column vector
newvec = reshape(I, [numel(I), 1]);
% Huffman encoding
hcode = huffmanenco(newvec, dict);

% Compression ratio
size_compressed_bits = length(hcode); % number of bits in compressed data
size_original_bits = Totalcount * 8;  % 8 bits per pixel for a 256-color indexed image
compression_ratio = size_original_bits / size_compressed_bits;
fprintf('Compression Ratio: %.2f\n', compression_ratio);

Compression Ratio: 1.03

% Efficiency
compression_efficiency = entropy / avglen * 100;
fprintf('Efficiency: %.4f\n', compression_efficiency);

Efficiency: 99.6533
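The ratio is close to 1 even though the coding efficiency is high: the indexed image's entropy of 7.7389 bits/symbol is already near the 8 bits spent per pixel, so the best Huffman can do here is roughly 8 / 7.7658 ≈ 1.03, which is exactly the ratio reported above.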

% Huffman decoding
dhsig1 = huffmandeco(hcode, dict);
% Convert dhsig1 from double to uint8
dhsig = uint8(dhsig1);
% Vector to array conversion
back = reshape(dhsig, [m n]);

% Grayscale image, for display
I = rgb2gray(a);
fprintf("Grayscaled Image:")

Grayscaled Image:

imshow(I);
% Convert the decoded indexed image back to RGB using the colormap
RGB = ind2rgb(back, myCmap);
imwrite(RGB, 'decoded.JPG');
% Display the decoded image
fprintf("Decoded Image:")

Decoded Image:

figure(2),imshow(RGB)

Conclusion:
In this experiment, we applied Huffman encoding to both a text string and an image to
demonstrate its efficiency in data compression. For the text string, the algorithm yielded key
metrics such as entropy, average code word length, compression ratio, and efficiency, achieving
an optimal efficiency of 100% because every symbol probability was an exact power of 1/2. For
the image, Huffman encoding gave a compression ratio of 1.03 with an efficiency of 99.65%; the
modest ratio reflects the fact that the indexed image's entropy is already close to the 8 bits spent
per pixel. These results confirm that Huffman encoding compresses both textual and visual data
while preserving information integrity, as demonstrated by the faithful reconstruction of the
image and of the string's symbols (apart from the spaces lost in the cell conversion, noted above).
