Umair Week 7

The document explains the process of Huffman encoding, starting with the creation of frequency tables for characters in a message. It details the construction of a Huffman Tree based on these frequencies and provides the corresponding Huffman codes for each character. Additionally, it discusses the time complexity of the encoding process and provides references for further reading.



The first step is to create a frequency table of the characters used in the message. I'll use the abbreviation SP for the space character.

Character | Frequency
----------------------
P         | 9
E         | 8
SP        | 7
R         | 3
I         | 3
C         | 3
K         | 3
D         | 2
T         | 1
A         | 1
O         | 1
F         | 1
L         | 1

Total characters: 43.
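
This table can be reproduced with a short Python sketch (collections.Counter does the counting; the SP label is only for display):

from collections import Counter

message = "PETER PIPER PICKED A PECK OF PICKLED PEPPER"
freq = Counter(message)  # counts every character, including spaces

for ch, n in freq.most_common():
    print("SP" if ch == " " else ch, "|", n)

print("Total characters:", sum(freq.values()))  # 43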

The second step is to build a Huffman Tree using these frequencies. A Huffman Tree is a type of binary tree used for Huffman Encoding.

The process starts by treating each character and its frequency as a node. We then find the two nodes with the smallest frequencies and combine them into a new node, whose frequency is the sum of the frequencies of the two original nodes. Repeat this process, always combining the two smallest nodes, until only one node is left; this is the root of the Huffman Tree.
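
Here is a minimal Python sketch of that merging loop, using heapq as the priority queue. The node representation (a character for a leaf, a (left, right) pair for an internal node) is my own assumption for illustration:

import heapq
from itertools import count

def build_huffman_tree(freq):
    # Heap entries are (frequency, tiebreaker, node); the tiebreaker keeps
    # heapq from ever comparing two nodes of different types.
    tie = count()
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # smallest frequency
        f2, _, right = heapq.heappop(heap)  # next smallest
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    return heap[0][2]  # the root node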

Huffman Tree for this message:

43 (root)
├── 17
│   ├── 8: E
│   └── 9: P
└── 26
    ├── 12
    │   ├── 6
    │   │   ├── 3
    │   │   │   ├── 1: L
    │   │   │   └── 2: D
    │   │   └── 3: R
    │   └── 6
    │       ├── 3: I
    │       └── 3: C
    └── 14
        ├── 7
        │   ├── 3: K
        │   └── 4
        │       ├── 2
        │       │   ├── 1: T
        │       │   └── 1: A
        │       └── 2
        │           ├── 1: O
        │           └── 1: F
        └── 7: SP

The Huffman Encoding for each character is found by reading the path from the root to the
character in the tree: going to the left child gives a '0', and going to the right gives a '1'.

Character | Huffman Code
------------------------
P         | 01
E         | 00
SP        | 111
R         | 1001
I         | 1010
C         | 1011
K         | 1100
D         | 10001
L         | 10000
T         | 110100
A         | 110101
O         | 110110
F         | 110111
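
Assuming the tree representation from the sketch above, the whole code table falls out of one recursive walk (the left edge appends '0', the right edge appends '1'). Note that a programmatic build may break ties differently and produce different, but equally optimal, codes:

def huffman_codes(node, prefix="", table=None):
    # Walk the tree; the accumulated path string becomes the code.
    if table is None:
        table = {}
    if isinstance(node, tuple):  # internal node: (left, right)
        huffman_codes(node[0], prefix + "0", table)
        huffman_codes(node[1], prefix + "1", table)
    else:                        # leaf: a single character
        table[node] = prefix
    return table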

Now, replace each character in the message with its Huffman code:

PETER PIPER PICKED A PECK OF PICKLED PEPPER ->

0100110100001001 111 01101001001001 111 011010101111000010001 111 110101 111 010010111100 111 110110110111 111 01101010111100100000010001 111 01000101001001

(Spaces are shown between code groups for readability only; the encoded message is a single continuous bit string.)

The length of this message is 142 bits.
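
Continuing the same sketch, the encoding and the bit count can be checked in two lines (the exact bit pattern depends on tie-breaking, but the 142-bit total does not):

codes = huffman_codes(build_huffman_tree(freq))
encoded = "".join(codes[ch] for ch in message)
print(len(encoded))  # 142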

Part 2

First, we need to create a frequency table of the characters in the phrase:

CHARACTER | FREQUENCY
-------------------------
SPACE ( ) | 6
E         | 6
P         | 6
A         | 4
R         | 2
O         | 2
C         | 2
K         | 2
F         | 2
I         | 2
X         | 1
D         | 1
J         | 1
M         | 1
H         | 1
T         | 1
S         | 1
N         | 1
-------------------------
TOTAL     | 42

This is just a simple count of how many times each character appears in the text. Note that
"Space" is considered a character in this encoding scheme.

Then we use the Huffman encoding algorithm to build a binary tree (a code sketch of these steps follows the list):

1. Sort the frequency table in ascending order.

2. Select the two characters with the lowest frequencies. If there's a tie, choose arbitrarily.

3. Combine these two into a new node whose frequency equals the sum of their frequencies. Replace the two nodes in the frequency table with this new node.

4. Repeat steps 2-3 until only one node (the root of the tree) remains.
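
Here is the promised sketch of these four steps, done literally with a sorted list rather than a heap (slower than a heap, but it mirrors the description one-to-one):

def build_tree_by_sorting(freq):
    # Step 1: sort (frequency, node) pairs in ascending order. Sorting on
    # the frequency alone avoids ever comparing two nodes directly.
    nodes = sorted(((f, ch) for ch, f in freq.items()), key=lambda p: p[0])
    while len(nodes) > 1:
        (f1, a), (f2, b) = nodes[0], nodes[1]  # step 2: two lowest
        merged = (f1 + f2, (a, b))             # step 3: combined node
        nodes = sorted(nodes[2:] + [merged], key=lambda p: p[0])
        # Step 4: the loop repeats steps 2-3 until one node remains.
    return nodes[0][1]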

Here's one possible Huffman Tree for the above frequency table (ties broken arbitrarily):

42 (root)
├── 26
│   ├── 12
│   │   ├── 6: SPACE
│   │   └── 6: E
│   └── 14
│       ├── 6: P
│       └── 8
│           ├── 4: A
│           └── 4
│               ├── 2: R
│               └── 2: O
└── 16
    ├── 8
    │   ├── 4
    │   │   ├── 2: C
    │   │   └── 2: K
    │   └── 4
    │       ├── 2: F
    │       └── 2: I
    └── 8
        ├── 4
        │   ├── 2
        │   │   ├── 1: X
        │   │   └── 1: D
        │   └── 2
        │       ├── 1: J
        │       └── 1: M
        └── 4
            ├── 2
            │   ├── 1: H
            │   └── 1: T
            └── 2
                ├── 1: S
                └── 1: N

Each left branch gets a '0' and each right branch gets a '1'. So, you can assign each character a
binary string as follows:

CHARACTER | FREQUENCY | HUFFMAN CODE
--------------------------------------
SPACE ( ) | 6         | 000
E         | 6         | 001
P         | 6         | 010
A         | 4         | 0110
R         | 2         | 01110
O         | 2         | 01111
C         | 2         | 1000
K         | 2         | 1001
F         | 2         | 1010
I         | 2         | 1011
X         | 1         | 11000
D         | 1         | 11001
J         | 1         | 11010
M         | 1         | 11011
H         | 1         | 11100
T         | 1         | 11101
S         | 1         | 11110
N         | 1         | 11111
--------------------------------------
TOTAL     | 42        |

To encode the message, we simply replace each character with its corresponding Huffman code. The size of the encoded message is the sum of the lengths of all the encoded characters. Note that some characters are encoded with fewer bits than others, depending on their frequency in the source text.

The size of the message in bits is calculated by multiplying the frequency of each character by the length of its corresponding Huffman code and then adding these up:

6×3 + 6×3 + 6×3 + 4×4 + 2×5 + 2×5 + 2×4 + 2×4 + 2×4 + 2×4 + 1×5 + 1×5 + 1×5 + 1×5 + 1×5 + 1×5 + 1×5 + 1×5
= 18 + 18 + 18 + 16 + 10 + 10 + 8 + 8 + 8 + 8 + 5 + 5 + 5 + 5 + 5 + 5 + 5 + 5 = 162 bits

So, the size of this message is 162 bits.
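
The same total can be checked with a one-line weighted sum over the table above:

freq = {" ": 6, "E": 6, "P": 6, "A": 4, "R": 2, "O": 2, "C": 2, "K": 2,
        "F": 2, "I": 2, "X": 1, "D": 1, "J": 1, "M": 1, "H": 1, "T": 1,
        "S": 1, "N": 1}
code_len = {" ": 3, "E": 3, "P": 3, "A": 4, "R": 5, "O": 5, "C": 4, "K": 4,
            "F": 4, "I": 4, "X": 5, "D": 5, "J": 5, "M": 5, "H": 5, "T": 5,
            "S": 5, "N": 5}
print(sum(freq[ch] * code_len[ch] for ch in freq))  # 162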

Part 3

The time complexity of Huffman encoding mainly depends on two steps:

Building the frequency table, which involves traversing the entire input.

Building the Huffman Tree.

Assuming n is the number of unique characters:

Building the frequency table is O(n) if the input is a list of frequencies or O(m) if you are
counting characters in a string of length m.

Building the Huffman Tree: the algorithm is greedy and uses a priority queue (often implemented as a binary heap) in which the least frequent node has the highest priority. Inserting all n elements into the queue takes O(n log n) time. The merge loop then runs n - 1 times, each iteration removing two nodes and inserting one at O(log n) each, so it also takes O(n log n) time.

So, the overall time complexity of Huffman encoding is O(n log n), where n is the number of
unique characters in the input string.
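
A rough empirical check of that bound (a sketch; the timings are machine-dependent): the merge loop below runs n - 1 times on n random frequencies, and the runtime grows only slightly faster than linearly in n:

import heapq, random, time
from itertools import count

for n in (10_000, 100_000, 1_000_000):
    tie = count()
    heap = [(random.random(), next(tie)) for _ in range(n)]
    heapq.heapify(heap)              # O(n)
    start = time.perf_counter()
    while len(heap) > 1:             # n - 1 merges
        f1, _ = heapq.heappop(heap)  # O(log n)
        f2, _ = heapq.heappop(heap)  # O(log n)
        heapq.heappush(heap, (f1 + f2, next(tie)))
    print(n, time.perf_counter() - start)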

References

Huffman Decoding [explained with example]. (n.d.). OpenGenus IQ: Computing Expertise & Legacy. https://iq.opengenus.org/huffman-decoding/

Huffman Coding and Decoding Algorithm. (n.d.). Topcoder. https://www.topcoder.com/thrive/articles/huffman-coding-and-decoding-algorithm
