0% found this document useful (0 votes)
2 views

Huffman coding

Uploaded by

tayyabxardar007
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

Huffman coding

Uploaded by

tayyabxardar007
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

CUI Abbottabad

Department of Computer Science

Data Structures and Algorithms


Lecture No.
Applications of Tree

COMSATS University Islamabad, Abbottabad Campus


Agenda
❖ Applications of Tree
▪ Expression Tree
▪ Huffman Coding

COMSATS University Islamabad, Abbottabad Campus


Expression Trees

▪ Binary tree has many forms and applications. One form of


binary tree is the Expression tree which is very important in
compiler construction.

▪ An expression tree is defined as a binary tree that is built up


from arithmetical or logical expression by placing the operands
as the leave nodes and the operators as the interior nodes.

▪ For each binary operator, the left subtree consists of its left
operand and the right subtree consists of its right operand.
▪ For unary operators, one of the subtrees is empty.

3
Expression Trees
Expression Tree

(a+b*c)+((d*e+f)*g)
+

+ *

a * + g

b c * f

d e

5
Expression Tree

▪ The inner nodes contain operators while leaf


nodes contain operands.

+ *

a * + g

b c * f

d e
Expression Tree

▪ The tree is binary because the operators are


binary.

+ *

a * + g

b c * f

d e
Expression Tree

▪ Inorder traversal yields: a+b*c+d*e+f*g

+ *

a * + g

b c * f

d e
Enforcing Parenthesis
/* inorder traversal routine using the parenthesis */
void inorder(TreeNode treeNode)
{
if( treeNode != NULL )
{
if(treeNode.left != NULL && treeNode.right !=
NULL) //if not leaf
system.out.print("(“);
inorder(treeNode.left);
system.out.print(treeNode.data+" “);
inorder(treeNode.right);
if(treeNode.left != NULL && treeNode.right!=
NULL) //if not leaf
system.out.print(“)“); }
}
Expression Tree

▪ Inorder : (a+(b*c))+(((d*e)+f)*g)

+ *

a * + g

b c * f

d e
Expression Tree

▪ Postorder traversal: a b c * + d e * f + g * +
which is the postfix form.
+

+ *

a * + g

b c * f

d e
Constructing Expression Tree

▪ Algorithm to convert postfix expression into an


expression tree.
▪ We already have an expression to convert an
infix expression to postfix.
▪ Read a symbol from the postfix expression.
▪ If symbol is an operand, put it in a one node tree
and push it on a stack.
▪ If symbol is an operator, pop two trees from the
stack, form a new tree with operator as the root
and T1 and T2 as left and right subtrees and push
this tree on the stack.
Constructing Expression Tree

▪ ab+cde+**

stack
Constructing Expression Tree

▪ ab+cde+**
top
If symbol is an
operand, put it in a
one node tree and
a b push it on a stack.

Stack is growing left to right


Constructing Expression Tree

▪ ab+cde+**

If symbol is an operator,
pop two trees from the
stack, form a new tree
with operator as the root
+
and T1 and T2 as left and
right subtrees and push
a b this tree on the stack.

Stack is growing left to right


Constructing Expression Tree

▪ ab+cde+**

+ c d e

a b
Constructing Expression Tree

▪ ab+cde+**

+ c +

a b d e
Constructing Expression Tree

▪ ab+cde+**

+ *

a b c
+

d e
Constructing Expression Tree

▪ ab+cde+**

+ *

a b c
+

d e
Huffman coding
Huffman Encoding
▪ Data compression plays a significant role in
computer networks.
▪ To transmit data to its destination faster, it is
necessary to either increase the data rate of the
transmission media or to simply send less data.
▪ Improvements with regard to the transmission
media has led to increase in the rate.
▪ The other options is to send less data by means
of data compression.
▪ Compression methods are used for text, images,
voice and other types of data.
Huffman Encoding
▪ Huffman code is method for the compression for
standard text documents.
▪ It makes use of a binary tree to develop codes of
varying lengths for the letters used in the
original message.
▪ Huffman code is also part of the JPEG image
compression scheme.
▪ The algorithm was introduced by David
Huffman in 1952 as part of a course assignment
at MIT.
Huffman Encoding
▪ To understand Huffman encoding, it is best to
use a simple example.
▪ Encoding the 32-character phrase: "traversing
threaded binary trees",
▪ If we send the phrase as a message in a network
using standard 8-bit ASCII codes, we would have
to send 8*32= 256 bits.
▪ Using the Huffman algorithm, we can send the
message with only 116 bits.
Huffman Encoding
▪ List all the letters used, including the "space"
character, along with the frequency with which
they occur in the message.
▪ Consider each of these (character,frequency)
pairs to be nodes; they are actually leaf nodes, as
we will see.
▪ Pick the two nodes with the lowest frequency,
and if there is a tie, pick randomly amongst
those with equal frequencies.
Huffman Encoding
▪ Make a new node out of these two, and make the
two nodes its children.

▪ This new node is assigned the sum of the


frequencies of its children.

▪ Continue the process of combining the two


nodes of lowest frequency until only one node,
the root, remains.
Greedy Algorithm: Huffman Encoding

26
Greedy Algorithm: Huffman Encoding

27
ABBRA CADA BBRA
Total bit=15 x 8=120 bits
ASCII
HC bit=?

28
29
30
31
Huffman Code Problem
• Left tree represents a fixed length encoding scheme
• Right tree represents a Huffman encoding scheme

32
Example

33
Huffman Encoding
Original text:
traversing threaded binary trees
size: 33 characters (space and newline)

NL : 1 i: 2
SP : 3
a: 3 n: 2
b: 1 r: 5
d: 2 s: 2
e: 5
g: 1
t: 3
h: 1 v: 1
y: 1
Huffman Encoding

2 is equal to sum
of the frequencies of
the two children nodes.

e r
a t
3 3 5 5

d i n s 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

There a number of ways to combine


nodes. We have chosen just one such
way.

e r
a t
3 3 5 5

d i n s 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

e r
a t
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

e r
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding
33

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

▪ Start at the root. Assign 0 to left branch and 1 to


the right branch.
▪ Repeat the process down the left and right
subtrees.
▪ To get the code for a character, traverse the tree
from the root to the character leaf node and read
off the 0 and 1 along the path.

43
Huffman Encoding
33
0 1

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
44
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
45
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8
0 1 0 1
0 1 0 1
4 e r 5
a t 4 4
3 3 5 5
0 1 0 1 0 1 0 1
d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
46
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8
0 1 0 1
0 1 0 1
4 e r 5
a t 4 4
3 3 5 5
0 1 0 1 0 1 0 1
d i n s 2 2 2 SP
2 2 2 2 0 1 0 1 0 1 3
NL b g h v y
1 1 1 1 1 1
47
Huffman Encoding

Huffman character codes

NL  10000 • Notice that the code is


SP  1111 variable length.
a  000
b  10001 • Letters with higher
d  0100 frequencies have shorter
e  101 codes.
g  10010
h  10011 • The tree could have been
i  0101 built in a number of ways;
n  0110 each would yielded
r  110
s  0111 different codes but the
t  001 code would still be
v  11100 minimal.
y  11101
48
Huffman Encoding
Original: traversing threaded binary trees

Encoded:

0011100001110010111001110101011010010111100
110011110101000010010101001111100001010110
00011011101111100111010110101110000

49
Huffman Encoding

Original: traversing threaded binary trees


With 8 bits per character, length is 264.

Encoded:
0011100001110010111001110101011010010111100
110011110101000010010101001111100001010110
00011011101111100111010110101110000

Compressed into 122 bits, 54% reduction.

50
Summary
❖ Summary

COMSATS University Islamabad, Abbottabad Campus


THANK YOU

Q&A

COMSATS University Islamabad, Abbottabad Campus

You might also like