Huffman Coding

Huffman coding is an algorithm that creates a variable-length prefix code to encode messages. It builds a binary tree based on symbol frequencies, with more frequent symbols nearer the root. Each symbol is assigned a code consisting of the path from root to its leaf node. This results in shorter codes for more frequent symbols, allowing the entire message to be encoded using the fewest possible bits compared to any other prefix code.


Applications of Trees

Encoding messages

 Encode a message composed of a string of characters


 Codes used by computer systems
 ASCII
• uses 8 bits per character
• can encode 256 characters
 Unicode (in its original 16-bit form)
• 16 bits per character
• can encode 65,536 characters
• includes all characters encoded by ASCII
 ASCII and Unicode are fixed-length codes
 all characters represented by same number of bits
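For instance, a fixed-length (8-bit ASCII) encoder in Python just maps every character to the 8-bit binary form of its ASCII value; the short sketch below is an illustration, not part of the original slides:

```python
# Fixed-length encoding: every character becomes exactly 8 bits (its ASCII value).
text = "BAD"
bits = "".join(format(ord(ch), "08b") for ch in text)
print(bits, len(bits))   # 010000100100000101000100 24  (3 characters * 8 bits)
```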
Problems

 Suppose that we want to encode a message constructed from the symbols A, B, C, D, and E using a fixed-length code
 How many bits are required to encode each
symbol?
 at least 3 bits are required
 2 bits are not enough (can only encode four
symbols)
 How many bits are required to encode the
message DEAACAAAAABA?
 there are twelve symbols, each requires 3 bits
 12*3 = 36 bits are required
Drawbacks of fixed-length codes

 Wasted space
 Unicode uses twice as much space as ASCII
• inefficient for plain-text messages containing only ASCII characters
 Same number of bits used to represent all characters
 ‘a’ and ‘e’ occur more frequently than ‘q’ and ‘z’

 Potential solution: use variable-length codes
  variable number of bits to represent characters when frequency of occurrence is known
  short codes for characters that occur frequently
Advantages of variable-length codes

 The advantage of variable-length codes over fixed-length codes is that short codes can be given to characters that occur frequently
  on average, the length of the encoded message is less than with fixed-length encoding
 Potential problem: how do we know where one character ends and another begins?
• not a problem if the number of bits is fixed!

Symbol Code
A 00
B 01
C 10
D 11

0010110111001111111111 decodes unambiguously as ACDBDADDDDD
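With a fixed-length code the decoder needs no markers between characters: it simply cuts the bit string into equal-width chunks. A minimal sketch using the 2-bit table above:

```python
# Decoding a fixed-length code: split the bit string every `width` bits.
CODE = {"A": "00", "B": "01", "C": "10", "D": "11"}
DECODE = {bits: sym for sym, bits in CODE.items()}
width = 2

encoded = "0010110111001111111111"
decoded = "".join(DECODE[encoded[i:i + width]] for i in range(0, len(encoded), width))
print(decoded)
```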
Prefix property

 A code has the prefix property if no character code is the prefix (start of
the code) for another character
 Example:

Symbol Code
P 000
Q 11
R 01
S 001
T 10

01001101100010 decodes as RSTQPT
 000 is not a prefix of 11, 01, 001, or 10
 11 is not a prefix of 000, 01, 001, or 10 …
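Because of the prefix property, a single left-to-right pass can decode the message: accumulate bits until they match a codeword, output that symbol, and start over. A minimal sketch with the table above:

```python
# Greedy decoding of a prefix code: the first codeword match is always correct.
CODE = {"P": "000", "Q": "11", "R": "01", "S": "001", "T": "10"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def decode(encoded: str) -> str:
    symbols, buffer = [], ""
    for bit in encoded:
        buffer += bit
        if buffer in DECODE:        # prefix property: no codeword is the start of another
            symbols.append(DECODE[buffer])
            buffer = ""
    return "".join(symbols)

print(decode("01001101100010"))     # RSTQPT
```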
Code without prefix property

 The following code does not have the prefix property

Symbol Code
P 0
Q 1
R 01
S 10
T 11

 The pattern 1110 can be decoded as QQQP, QTP, QQS, or TS
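The ambiguity can be shown by enumerating every way the bit string splits into codewords of this non-prefix code (a small backtracking sketch; names are illustrative):

```python
# Enumerate all decodings of a bit string under a code without the prefix property.
CODE = {"P": "0", "Q": "1", "R": "01", "S": "10", "T": "11"}

def decodings(bits: str) -> list[str]:
    if not bits:
        return [""]
    results = []
    for sym, code in CODE.items():
        if bits.startswith(code):
            results += [sym + rest for rest in decodings(bits[len(code):])]
    return results

print(sorted(decodings("1110")))    # ['QQQP', 'QQS', 'QTP', 'TQP', 'TS']
```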


Problem
 Design a variable-length prefix-free code such that the message
DEAACAAAAABA can be encoded using 22 bits
 Possible solution:
 A occurs eight times while B, C, D, and E each occur once
 represent A with a one-bit code, say 0
• remaining codes cannot start with 0
 represent B with the two-bit code 10
• remaining codes cannot start with 0 or 10
 represent C with 110
 represent D with 1110
 represent E with 11110
Encoded message

DEAACAAAAABA

Symbol Code
A 0
B 10
C 110
D 1110
E 11110

1110111100011000000100 22 bits
Another possible code

DEAACAAAAABA

Symbol Code
A 0
B 100
C 101
D 1101
E 1111

1101111100101000001000 22 bits
Better code

DEAACAAAAABA

Symbol Code
A 0
B 100
C 101
D 110
E 111

11011100101000001000 20 bits
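A quick sketch that re-encodes the message under each of the three tables above and confirms the bit counts (22, 22, and 20):

```python
# Compare the three prefix codes on the same message.
MESSAGE = "DEAACAAAAABA"
TABLES = {
    "first":  {"A": "0", "B": "10",  "C": "110", "D": "1110", "E": "11110"},
    "second": {"A": "0", "B": "100", "C": "101", "D": "1101", "E": "1111"},
    "better": {"A": "0", "B": "100", "C": "101", "D": "110",  "E": "111"},
}

for name, table in TABLES.items():
    bits = "".join(table[ch] for ch in MESSAGE)
    print(f"{name:6s} {bits} ({len(bits)} bits)")
```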
What code to use?

 Question: Is there a variable-length code that makes the most efficient use of space?

 Answer: Yes!
Huffman coding tree

 Binary tree
 each leaf contains symbol (character)
 label edge from node to left child with 0
 label edge from node to right child with 1
 Code for any symbol obtained by following path from root to the leaf
containing symbol
 Code has prefix property
 leaf node cannot appear on path to another leaf
 note: fixed-length codes are represented by a complete Huffman tree
and clearly have the prefix property
Building a Huffman tree

 Find the frequency of each symbol occurring in the message
 Begin with a forest of single-node trees
  each contains a symbol and its frequency
 Repeat
  select the two trees with the smallest frequencies at their roots
  produce a new binary tree with the selected trees as children and store the sum of their frequencies in the root
 Stop when only one tree remains
  this is the Huffman coding tree
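A minimal Python sketch of this procedure, using a min-heap as the forest (function and variable names are mine, not from the slides; code lengths will match the tree built in the next slides, though the exact 0/1 labels depend on how ties are broken):

```python
import heapq
from collections import Counter

def huffman_codes(message: str) -> dict[str, str]:
    """Build a Huffman tree for `message` and return a symbol -> bit-string table."""
    freq = Counter(message)
    # Forest of single-node trees: (frequency, tie-breaker, tree).
    # A leaf is just a symbol; an internal node is a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two trees with the smallest frequencies
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    _, _, root = heap[0]

    codes: dict[str, str] = {}
    def walk(node, path):                     # 0 = left edge, 1 = right edge
        if isinstance(node, str):
            codes[node] = path or "0"         # degenerate one-symbol message
            return
        walk(node[0], path + "0")
        walk(node[1], path + "1")
    walk(root, "")
    return codes

print(huffman_codes("This is his message"))
```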
Example

 Build the Huffman coding tree for the message "This is his message"
 Character frequencies

A G M T E H _ I S

1 1 1 1 2 2 3 3 5

 Begin with forest of single trees

1 1 1 1 2 2 3 3 5
A G M T E H _ I S
Steps 1-8 (tree-building figures)

 Repeatedly merge the two trees with the smallest frequencies:
• A (1) + G (1) → 2 and M (1) + T (1) → 2
• 2 + 2 → 4 and E (2) + H (2) → 4
• _ (3) + I (3) → 6
• 4 + 4 → 8 and S (5) + 6 → 11
• 11 + 8 → 19 (the complete Huffman coding tree)
Label edges

 Label each edge to a left child 0 and each edge to a right child 1
 The code for each symbol is the path from the root (19) to its leaf:

Symbol Code
S 01
E 110
H 111
_ 000
I 001
A 1000
G 1001
M 1010
T 1011
Huffman code & encoded message
This is his message

S 01
E 110
H 111
_ 000
I 001
A 1000
G 1001
M 1010
T 1011

10111110010100000101000111001010001010110010110001001110
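A short check of the table above: the assertion verifies that no codeword is a prefix of another, and encoding the 19-character message (with '_' standing for the space) comes out to 56 bits:

```python
# Verify the prefix property and the encoded length for the table above.
CODE = {"S": "01", "E": "110", "H": "111", "_": "000", "I": "001",
        "A": "1000", "G": "1001", "M": "1010", "T": "1011"}

words = list(CODE.values())
assert not any(a != b and b.startswith(a) for a in words for b in words)

message = "THIS_IS_HIS_MESSAGE"
encoded = "".join(CODE[ch] for ch in message)
print(len(encoded), encoded)   # 56 bits
```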
Huffman Coding

• an algorithm that takes as input the frequencies (which are the probabilities of occurrence) of symbols in a string and produces as output a prefix code that encodes the string using the fewest possible bits, among all possible binary prefix codes for these symbols.
Huffman Coding
• This algorithm, known as Huffman coding, was developed by David
Huffman in a term paper he wrote in 1951 while a graduate student
at MIT.
• (Note that this algorithm assumes that we already know how many
times each symbol occurs in the string, so we can compute the
frequency of each symbol by dividing the number of times this symbol
occurs by the length of the string.)
Huffman Coding
• Huffman coding is a fundamental algorithm in data compression, the
subject devoted to reducing the number of bits required to represent
information.
• Huffman coding is extensively used to compress bit strings
representing text and it also plays an important role in compressing
audio and image files.
Example2:

 Use Huffman coding to encode the following symbols with the frequencies listed: A: 0.08, B: 0.10, C: 0.12, D: 0.15, E: 0.20, F: 0.35. What is the average number of bits used to encode a character?
 Solution: STEP1

0.08 0.10 0.12 0.15 0.20 0.35

A B C D E F
Example2:
 Solution: STEPS 2-6 (tree-building figures)
 repeatedly merge the two trees with the smallest frequencies:
• B (0.10) + A (0.08) → 0.18
• D (0.15) + C (0.12) → 0.27
• 0.18 + E (0.20) → 0.38
• 0.27 + F (0.35) → 0.62
• 0.38 + 0.62 → 1.00 (the root of the Huffman tree)
Example2:
 Solution: STEPS 7-8 (edge-labelling figures)
 label each left edge 0 and each right edge 1; a symbol's code is the path from the root to its leaf
Example2:
 Solution: resulting Huffman code

Symbol Code
A 111
B 110
C 011
D 010
E 10
F 00
Example2:

 Use Huffman coding to encode the following symbols with the frequencies listed: A: 0.08, B: 0.10, C: 0.12, D: 0.15, E: 0.20, F: 0.35. What is the average number of bits used to encode a character?
 Solution:
Symbol Code Contribution to average
A 111 3 × 0.08 = 0.24
B 110 3 × 0.10 = 0.30
C 011 3 × 0.12 = 0.36
D 010 3 × 0.15 = 0.45
E 10 2 × 0.20 = 0.40
F 00 2 × 0.35 = 0.70
Average number of bits used to encode a character: 0.24 + 0.30 + 0.36 + 0.45 + 0.40 + 0.70 = 2.45
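The same weighted sum, computed in a couple of lines (values transcribed from the table above):

```python
# Average code length = sum over symbols of (codeword length * frequency).
lengths = {"A": 3, "B": 3, "C": 3, "D": 3, "E": 2, "F": 2}
freqs   = {"A": 0.08, "B": 0.10, "C": 0.12, "D": 0.15, "E": 0.20, "F": 0.35}

average = sum(lengths[s] * freqs[s] for s in freqs)
print(round(average, 2))   # 2.45 bits per character, versus 3 for a fixed-length code
```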
Try this!
 Construct a Huffman code for the letters of the English alphabet where
the frequencies of letters in typical English text are as shown in this
table.
Thank you
for learning discrete math
with me!
