Huffman coding
Expression Trees
▪ In an expression tree, the leaves are operands and the
internal nodes are operators.
▪ For each binary operator, the left subtree consists of its left
operand and the right subtree consists of its right operand.
▪ For unary operators, one of the subtrees is empty.
Expression Tree
(a+b*c)+((d*e+f)*g)
              +
          /       \
        +           *
       / \         / \
      a   *       +   g
         / \     / \
        b   c   *   f
               / \
              d   e
Enforcing Parentheses
/* inorder traversal routine that prints parentheses */
void inorder(TreeNode treeNode)
{
    if (treeNode != null)
    {
        // a node with at least one child is an operator, not a leaf
        boolean notLeaf = treeNode.left != null || treeNode.right != null;
        if (notLeaf)
            System.out.print("(");
        inorder(treeNode.left);
        System.out.print(treeNode.data + " ");
        inorder(treeNode.right);
        if (notLeaf)
            System.out.print(")");
    }
}
Expression Tree
▪ Inorder : (a+(b*c))+(((d*e)+f)*g)
Expression Tree
▪ Postorder traversal: a b c * + d e * f + g * +
which is the postfix form.
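The postorder traversal can be sketched in Java as follows. This is a minimal illustration, not code from the slides: the class names ExprNode and PostorderDemo and the helpers sampleTree and toPostfix are assumptions, and the tree for (a+b*c)+((d*e+f)*g) is built by hand.

```java
// Minimal sketch: a postorder traversal of an expression tree yields postfix form.
// ExprNode and PostorderDemo are hypothetical names, not taken from the slides.
class ExprNode {
    String data;
    ExprNode left, right;

    ExprNode(String data) {
        this(data, null, null);
    }

    ExprNode(String data, ExprNode left, ExprNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class PostorderDemo {
    // Visit the left subtree, then the right subtree, then the node itself.
    static void postorder(ExprNode node, StringBuilder out) {
        if (node == null) return;
        postorder(node.left, out);
        postorder(node.right, out);
        if (out.length() > 0) out.append(' ');
        out.append(node.data);
    }

    // Hand-built tree for (a+b*c)+((d*e+f)*g).
    static ExprNode sampleTree() {
        ExprNode left = new ExprNode("+", new ExprNode("a"),
                new ExprNode("*", new ExprNode("b"), new ExprNode("c")));
        ExprNode de = new ExprNode("*", new ExprNode("d"), new ExprNode("e"));
        ExprNode right = new ExprNode("*",
                new ExprNode("+", de, new ExprNode("f")), new ExprNode("g"));
        return new ExprNode("+", left, right);
    }

    static String toPostfix(ExprNode root) {
        StringBuilder out = new StringBuilder();
        postorder(root, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPostfix(sampleTree())); // a b c * + d e * f + g * +
    }
}
```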
Constructing Expression Tree
▪ Input (postfix form): ab+cde+**
▪ Read the expression one symbol at a time, using a stack of
trees that is initially empty.
▪ If the symbol is an operand, put it in a one-node tree and
push it on the stack.
▪ If the symbol is an operator, pop two trees from the stack
(T2 first, then T1), form a new tree with the operator as the
root and T1 and T2 as left and right subtrees, and push this
tree on the stack.
▪ Stack contents after each step (bottom of stack on the left):
read a, b      a   b
read +         (a+b)
read c, d, e   (a+b)   c   d   e
read +         (a+b)   c   (d+e)
read *         (a+b)   (c*(d+e))
read *         ((a+b)*(c*(d+e)))
▪ The single tree left on the stack is the expression tree.
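The stack procedure can be sketched in Java as follows. This is an illustration under stated assumptions: symbols are single characters, only + - * / count as operators, and the names PostfixNode, PostfixTreeBuilder, build and toInfix are hypothetical. The toInfix helper exists only to check the shape of the result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: build an expression tree from a postfix string of one-character symbols.
// PostfixNode and PostfixTreeBuilder are hypothetical names.
class PostfixNode {
    char data;
    PostfixNode left, right;

    PostfixNode(char data, PostfixNode left, PostfixNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class PostfixTreeBuilder {
    static boolean isOperator(char c) {
        return c == '+' || c == '-' || c == '*' || c == '/';
    }

    static PostfixNode build(String postfix) {
        Deque<PostfixNode> stack = new ArrayDeque<>();
        for (char symbol : postfix.toCharArray()) {
            if (isOperator(symbol)) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("malformed postfix expression");
                }
                // The first tree popped is the right operand, the second is the left.
                PostfixNode right = stack.pop();
                PostfixNode left = stack.pop();
                stack.push(new PostfixNode(symbol, left, right));
            } else {
                // Operand: push a one-node tree.
                stack.push(new PostfixNode(symbol, null, null));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return stack.pop();
    }

    // Fully parenthesized infix form, used here only to inspect the tree.
    static String toInfix(PostfixNode n) {
        if (n.left == null && n.right == null) return String.valueOf(n.data);
        return "(" + toInfix(n.left) + n.data + toInfix(n.right) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toInfix(build("ab+cde+**"))); // ((a+b)*(c*(d+e)))
    }
}
```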
Huffman coding
Huffman Encoding
▪ Data compression plays a significant role in
computer networks.
▪ To transmit data to its destination faster, it is
necessary to either increase the data rate of the
transmission media or to simply send less data.
▪ Improvements in transmission media have led to
increases in the data rate.
▪ The other option is to send less data by means
of data compression.
▪ Compression methods are used for text, images,
voice and other types of data.
▪ Huffman coding is a method for compressing
standard text documents.
▪ It makes use of a binary tree to develop codes of
varying lengths for the letters used in the
original message.
▪ Huffman code is also part of the JPEG image
compression scheme.
▪ The algorithm was introduced by David
Huffman in 1952 as part of a course assignment
at MIT.
▪ To understand Huffman encoding, it is best to
use a simple example.
▪ Consider encoding the 32-character phrase
"traversing threaded binary trees".
▪ If we send the phrase as a message in a network
using standard 8-bit ASCII codes, we would have
to send 8 × 32 = 256 bits.
▪ Using the Huffman algorithm, we can send the
message with only 116 bits.
▪ List all the letters used, including the "space"
character, along with the frequency with which
they occur in the message.
▪ Consider each of these (character,frequency)
pairs to be nodes; they are actually leaf nodes, as
we will see.
▪ Pick the two nodes with the lowest frequency,
and if there is a tie, pick randomly amongst
those with equal frequencies.
▪ Make a new node out of these two, and make the
two nodes its children.
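The steps above can be sketched in Java with a priority queue ordered by frequency. This is a minimal sketch, not the slides' own code: the names HuffNode, HuffmanBuilder, build and encodedBits are assumptions, ties are broken however the queue happens to order them, and the message is assumed to be non-empty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Sketch of the greedy construction. HuffNode and HuffmanBuilder are hypothetical names.
class HuffNode {
    final char symbol;          // meaningful only for leaves
    final int freq;
    final HuffNode left, right;

    // Leaf node: one (character, frequency) pair.
    HuffNode(char symbol, int freq) {
        this.symbol = symbol;
        this.freq = freq;
        this.left = null;
        this.right = null;
    }

    // Internal node: its frequency is the sum of its two children.
    HuffNode(HuffNode left, HuffNode right) {
        this.symbol = '\0';
        this.freq = left.freq + right.freq;
        this.left = left;
        this.right = right;
    }

    boolean isLeaf() {
        return left == null && right == null;
    }
}

class HuffmanBuilder {
    // Assumes a non-empty message.
    static HuffNode build(String message) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : message.toCharArray()) {
            freq.merge(c, 1, Integer::sum);
        }

        PriorityQueue<HuffNode> queue =
                new PriorityQueue<>((x, y) -> Integer.compare(x.freq, y.freq));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new HuffNode(e.getKey(), e.getValue()));
        }

        // Repeatedly merge the two lowest-frequency nodes until one root remains.
        while (queue.size() > 1) {
            HuffNode a = queue.poll();
            HuffNode b = queue.poll();
            queue.add(new HuffNode(a, b));
        }
        return queue.poll();
    }

    // Total encoded length: sum over leaves of frequency times depth.
    static int encodedBits(HuffNode node, int depth) {
        if (node.isLeaf()) return node.freq * depth;
        return encodedBits(node.left, depth + 1) + encodedBits(node.right, depth + 1);
    }

    public static void main(String[] args) {
        HuffNode root = build("traversing threaded binary trees");
        System.out.println(encodedBits(root, 0)); // 116
    }
}
```

The shape of the tree depends on how ties are broken, but the total number of bits does not: for the 32-character phrase every valid choice gives 116 bits.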
Greedy Algorithm: Huffman Encoding
ABBRA CADA BBRA
ASCII: total bits = 15 × 8 = 120 bits
Huffman code (HC): total bits = ?
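One way to work out the Huffman figure for this message is to add up the weight of every merged node, which equals the total encoded length. The sketch below does only that and builds no tree. The name HuffmanCost is an assumption, characters are assumed to fit in 8 bits, and the count covers the message bits only, not the cost of sending the code table.

```java
import java.util.PriorityQueue;

// Sketch: total Huffman-encoded length without building the tree.
// HuffmanCost is a hypothetical name.
class HuffmanCost {
    static int totalBits(String message) {
        int[] counts = new int[256]; // assumes every character code is below 256
        for (char c : message.toCharArray()) {
            counts[c]++;
        }

        PriorityQueue<Integer> weights = new PriorityQueue<>();
        for (int n : counts) {
            if (n > 0) weights.add(n);
        }

        // Each merge adds one bit to every character below the new node,
        // so the sum of merged weights is the total number of bits.
        int bits = 0;
        while (weights.size() > 1) {
            int merged = weights.poll() + weights.poll();
            bits += merged;
            weights.add(merged);
        }
        return bits;
    }

    public static void main(String[] args) {
        String message = "ABBRA CADA BBRA";
        System.out.println("ASCII:   " + message.length() * 8);  // 120
        System.out.println("Huffman: " + totalBits(message));     // 36
    }
}
```

With frequencies A=5, B=4, R=2, space=2, C=1, D=1 the merges are 2, 4, 6, 9 and 15, which sum to 36 bits against 120 bits for 8-bit ASCII.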
Huffman Code Problem
• Left tree represents a fixed length encoding scheme
• Right tree represents a Huffman encoding scheme
Example
Huffman Encoding
Original text:
traversing threaded binary trees
size: 33 characters (including the spaces and the newline)
Character   Frequency
NL          1
SP          3
a           3
b           1
d           2
e           5
g           1
h           1
i           2
n           2
r           5
s           2
t           3
v           1
y           1
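A frequency table like this one can be produced with a short counting routine. The sketch below is an illustration only; the name FrequencyCounter is an assumption, and a TreeMap is used simply so the output comes out in character order.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: count how often each character occurs in a message.
// FrequencyCounter is a hypothetical name.
class FrequencyCounter {
    static Map<Character, Integer> count(String text) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (char c : text.toCharArray()) {
            freq.merge(c, 1, Integer::sum); // add 1, or start at 1 if absent
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq = count("traversing threaded binary trees\n");
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            char c = e.getKey();
            // Print newline and space with the labels used on the slide.
            String label = c == '\n' ? "NL" : c == ' ' ? "SP" : String.valueOf(c);
            System.out.println(label + ": " + e.getValue());
        }
    }
}
```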
Huffman Encoding
▪ The frequency of each new node (2 for the first merge) is
equal to the sum of the frequencies of its two children.
▪ Merges, in order: NL+b=2, g+h=2, v+y=2, d+i=4, n+s=4,
2+2=4, 2+SP=5, a+t=6, 4+4=8, 4+e=9, r+5=10, 6+8=14,
9+10=19, 14+19=33.
▪ Repeating the merge step until a single node remains gives
the tree (leaf frequencies in parentheses):
33
├─ 14
│  ├─ 6
│  │  ├─ a (3)
│  │  └─ t (3)
│  └─ 8
│     ├─ 4
│     │  ├─ d (2)
│     │  └─ i (2)
│     └─ 4
│        ├─ n (2)
│        └─ s (2)
└─ 19
   ├─ 9
   │  ├─ 4
   │  │  ├─ 2
   │  │  │  ├─ NL (1)
   │  │  │  └─ b (1)
   │  │  └─ 2
   │  │     ├─ g (1)
   │  │     └─ h (1)
   │  └─ e (5)
   └─ 10
      ├─ r (5)
      └─ 5
         ├─ 2
         │  ├─ v (1)
         │  └─ y (1)
         └─ SP (3)
Huffman Encoding
▪ Label each left branch 0 and each right branch 1.
▪ The code for a character is the sequence of labels on the
path from the root to its leaf.
Character   Code
a           000
t           001
d           0100
i           0101
n           0110
s           0111
NL          10000
b           10001
g           10010
h           10011
e           101
r           110
v           11100
y           11101
SP          1111
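Reading the codes off a labelled tree can be sketched as a recursive walk that appends 0 when going left and 1 when going right. The tree below is transcribed by hand from the slides' figure as best it can be read, so treat its shape as an assumption; the names CodeNode, CodeTable, slideTree and codes are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: derive the code table from a Huffman tree (left = 0, right = 1).
// CodeNode and CodeTable are hypothetical names.
class CodeNode {
    final String label;          // null for internal nodes
    final CodeNode left, right;

    CodeNode(String label) {
        this.label = label;
        this.left = null;
        this.right = null;
    }

    CodeNode(CodeNode left, CodeNode right) {
        this.label = null;
        this.left = left;
        this.right = right;
    }
}

class CodeTable {
    static CodeNode leaf(String s) { return new CodeNode(s); }
    static CodeNode node(CodeNode l, CodeNode r) { return new CodeNode(l, r); }

    // Tree shape transcribed from the slides (an assumption of this sketch).
    static CodeNode slideTree() {
        CodeNode left = node(
                node(leaf("a"), leaf("t")),
                node(node(leaf("d"), leaf("i")), node(leaf("n"), leaf("s"))));
        CodeNode right = node(
                node(node(node(leaf("NL"), leaf("b")), node(leaf("g"), leaf("h"))),
                     leaf("e")),
                node(leaf("r"),
                     node(node(leaf("v"), leaf("y")), leaf("SP"))));
        return node(left, right);
    }

    // Append 0 for a left branch and 1 for a right branch; record the path at each leaf.
    static void assign(CodeNode n, String prefix, Map<String, String> codes) {
        if (n.label != null) {
            codes.put(n.label, prefix);
            return;
        }
        assign(n.left, prefix + "0", codes);
        assign(n.right, prefix + "1", codes);
    }

    static Map<String, String> codes(CodeNode root) {
        Map<String, String> codes = new LinkedHashMap<>();
        assign(root, "", codes);
        return codes;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : codes(slideTree()).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Frequent characters such as e and r sit closer to the root and get 3-bit codes, while the characters that occur once get 5-bit codes.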
Huffman Encoding
Encoded:
0011100001110010111001110101011010010111100
1100111101010000100101010011111000101010110
000110111011111001110101101011110000
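Encoding and decoding with a fixed code table can be sketched as follows. The table is an assumption of this sketch: it is the one obtained by reading the slides' tree with 0 on left branches and 1 on right branches. The names HuffmanCodec, encode and decode are hypothetical. Decoding works bit by bit because no code is a prefix of another.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: encode and decode with a fixed prefix-free code table.
// HuffmanCodec is a hypothetical name; the table is transcribed from the slides' tree.
class HuffmanCodec {
    static final Map<Character, String> CODES = new HashMap<>();
    static {
        String[][] table = {
            {"a", "000"}, {"t", "001"}, {"d", "0100"}, {"i", "0101"},
            {"n", "0110"}, {"s", "0111"}, {"\n", "10000"}, {"b", "10001"},
            {"g", "10010"}, {"h", "10011"}, {"e", "101"}, {"r", "110"},
            {"v", "11100"}, {"y", "11101"}, {" ", "1111"}
        };
        for (String[] row : table) {
            CODES.put(row[0].charAt(0), row[1]);
        }
    }

    static String encode(String text) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String code = CODES.get(c);
            if (code == null) {
                throw new IllegalArgumentException("no code for character: " + c);
            }
            bits.append(code);
        }
        return bits.toString();
    }

    // Accumulate bits until they match a code; prefix-freeness makes the match unique.
    static String decode(String bits) {
        Map<String, Character> reverse = new HashMap<>();
        for (Map.Entry<Character, String> e : CODES.entrySet()) {
            reverse.put(e.getValue(), e.getKey());
        }
        StringBuilder text = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (char bit : bits.toCharArray()) {
            current.append(bit);
            Character symbol = reverse.get(current.toString());
            if (symbol != null) {
                text.append(symbol);
                current.setLength(0);
            }
        }
        if (current.length() != 0) {
            throw new IllegalArgumentException("trailing bits do not form a code");
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String message = "traversing threaded binary trees\n";
        String bits = encode(message);
        System.out.println(bits.length() + " bits");
        System.out.println(decode(bits).equals(message));
    }
}
```

With this table the 33-character text, newline included, encodes to 122 bits, compared with 8 × 33 = 264 bits in 8-bit ASCII.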
Summary
❖ Summary
Q&A