Adaptive Huffman Coding
CSEP 590 - Lecture 2 - Autumn 2007

One pass
During the pass calculate the frequencies
Update the Huffman tree accordingly
Coder: new Huffman tree computed after transmitting the symbol
Decoder: new Huffman tree computed after receiving the symbol
Symbol set and their initial codes must be known ahead of time.
Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree.
[Figure: Huffman tree on symbols a, b, c, d with each node labeled by its number]
Number the nodes as they are removed from the priority queue.
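A minimal sketch of this numbering, assuming a standard priority-queue Huffman construction. The function name `number_nodes` and the example frequencies are hypothetical, not taken from the slides. Nodes receive consecutive numbers in the order they leave the queue, so the two nodes merged in one step become siblings with adjacent numbers, and weights never decrease as the numbers grow.

```python
import heapq
from itertools import count

def number_nodes(freqs):
    """Build a Huffman tree and number nodes as they leave the priority queue.

    Returns a list of (number, weight, label) tuples; the root is numbered last.
    Internal nodes are labeled by the pair of labels of their children.
    """
    tie = count()  # tie-breaker so the heap never has to compare labels
    heap = [(w, next(tie), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    numbered, next_number = [], 1
    while len(heap) > 1:
        w1, _, x = heapq.heappop(heap)   # smallest weight
        w2, _, y = heapq.heappop(heap)   # second smallest weight
        numbered.append((next_number, w1, x))
        numbered.append((next_number + 1, w2, y))
        next_number += 2
        heapq.heappush(heap, (w1 + w2, next(tie), (x, y)))
    w, _, root = heapq.heappop(heap)
    numbered.append((next_number, w, root))
    return numbered

# Hypothetical frequencies, chosen only for illustration.
for number, weight, label in number_nodes({"a": 5, "b": 2, "c": 1, "d": 2}):
    print(number, weight, label)
```

With these frequencies the weights in node-number order are 1, 2, 2, 3, 5, 5, 10. Weights that never decrease with the node number, together with siblings that are numbered consecutively, are the properties the adaptive algorithm maintains after every symbol.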
Initialization
Symbols a1, a2, ..., am have a basic prefix code, used when symbols are first encountered.
Example: a, b, c, d, e, f, g, h, i, j
[Figure: binary tree defining the fixed prefix code for a through j]
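One way to obtain such a basic prefix code is a truncated binary code, sketched below. This is an assumption made for illustration: the function name `fixed_prefix_code` is hypothetical, and the code tree on the slide may assign different codewords. For the ten symbols a to j, this construction does give a = 000, which matches the code the worked example uses for the first a.

```python
def fixed_prefix_code(symbols):
    """Assign a truncated binary prefix code to the symbols, in order.

    With m symbols and k = floor(log2 m), the first 2^(k+1) - m symbols get
    k-bit codewords and the remaining symbols get (k+1)-bit codewords.
    """
    m = len(symbols)
    k = m.bit_length() - 1          # floor(log2 m)
    short = 2 ** (k + 1) - m        # how many symbols get the short length
    code = {}
    for i, s in enumerate(symbols):
        if i < short:
            code[s] = format(i, f"0{k}b")
        else:
            code[s] = format(i + short, f"0{k + 1}b")
    return code

code = fixed_prefix_code("abcdefghij")
for symbol, word in code.items():
    print(symbol, word)
```

For ten symbols this produces six 3-bit codewords (a to f) and four 4-bit codewords (g to j), and no codeword is a prefix of another.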
The tree will encode up to m + 1 symbols including NYT.
We reserve numbers 1 to 2m + 1 for node numbering.
The initial Huffman tree consists of a single node: NYT, with weight 0 and node number 2m + 1.
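The bound 2m + 1 follows from counting: a binary tree in which every internal node has two children and which has m + 1 leaves (m symbols plus NYT) has m internal nodes. A small sketch of the starting state, using hypothetical names (`Node`, `initial_tree`, `max_nodes`) that do not come from the slides:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    number: int                    # node number, between 1 and 2m + 1
    weight: int = 0
    symbol: Optional[str] = None   # None marks NYT or an internal node

def max_nodes(m):
    """Largest number of nodes the tree can have for an alphabet of m symbols."""
    leaves = m + 1          # m symbols plus NYT
    internal = leaves - 1   # every internal node has exactly two children
    return leaves + internal

def initial_tree(m):
    """The starting tree: a single NYT node with weight 0 and number 2m + 1."""
    return Node(number=2 * m + 1)

print(max_nodes(10), initial_tree(10))
```

For the alphabet {a, b, ..., j}, m = 10, so the single starting node is numbered 21.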
Coding Algorithm
1. If a new symbol is encountered then output the code for NYT followed by the fixed code for the symbol. Add the new symbol to the tree.
2. If an old symbol is encountered then output its code.
3. Update the tree to preserve the node number invariant.
Decoding Algorithm
1. Decode the symbol using the current tree.
2. If NYT is encountered then use the fixed code to decode the symbol. Add the new symbol to the tree.
3. Update the tree to preserve the node number invariant.
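The coding and decoding steps can be sketched together, since both sides maintain the same tree. This is a minimal sketch under stated assumptions, not the lecture's implementation: the names `AdaptiveHuffman`, `encode` and `decode` are hypothetical, the list `order` stands in for node numbers (position 0 is the highest number), and the update rule is written in the style of the FGK algorithm: before a node's weight is incremented, the node is exchanged with the highest-numbered node of equal weight, unless that node is its parent. The `FIXED` table holds only the codes for a to d, which can be read off the outputs of the worked example for aabcdad.

```python
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011"}  # assumed partial table

class Node:
    def __init__(self, parent=None, symbol=None):
        self.weight = 0
        self.parent = parent
        self.left = self.right = None
        self.symbol = symbol          # None for internal nodes and for NYT

class AdaptiveHuffman:
    def __init__(self):
        self.root = self.nyt = Node()
        self.order = [self.root]      # nodes by decreasing node number
        self.leaf = {}                # symbol -> leaf node

    def code(self, node):
        bits = []
        while node.parent is not None:
            bits.append("0" if node.parent.left is node else "1")
            node = node.parent
        return "".join(reversed(bits))

    def add_symbol(self, symbol):
        old = self.nyt                # old NYT becomes an internal node
        old.right = Node(old, symbol)
        old.left = self.nyt = Node(old)
        self.order += [old.right, old.left]
        self.leaf[symbol] = old.right
        return old.right

    def swap(self, a, b):
        ia, ib = self.order.index(a), self.order.index(b)
        pa, pb = a.parent, b.parent
        a_left, b_left = pa.left is a, pb.left is b
        setattr(pa, "left" if a_left else "right", b)
        setattr(pb, "left" if b_left else "right", a)
        a.parent, b.parent = pb, pa
        self.order[ia], self.order[ib] = b, a

    def update(self, node):
        while node is not None:
            # highest-numbered node of equal weight, skipping the parent
            leader = next(n for n in self.order
                          if n.weight == node.weight and n is not node.parent)
            if leader is not node:
                self.swap(node, leader)
            node.weight += 1
            node = node.parent

def encode(text, fixed=FIXED):
    tree, out = AdaptiveHuffman(), []
    for s in text:
        if s in tree.leaf:            # old symbol: output its current code
            node = tree.leaf[s]
            out.append(tree.code(node))
        else:                         # new symbol: NYT code, then fixed code
            out.append(tree.code(tree.nyt) + fixed[s])
            node = tree.add_symbol(s)
        tree.update(node)
    return "".join(out)

def decode(bits, fixed=FIXED):
    inverse = {v: k for k, v in fixed.items()}
    tree, out, i = AdaptiveHuffman(), [], 0
    while i < len(bits):
        node = tree.root
        while node.left is not None:  # walk down to a leaf
            node = node.left if bits[i] == "0" else node.right
            i += 1
        if node is tree.nyt:          # NYT: read the fixed code of a new symbol
            word = ""
            while word not in inverse:
                word += bits[i]
                i += 1
            node = tree.add_symbol(inverse[word])
        out.append(node.symbol)
        tree.update(node)
    return "".join(out)

print(encode("aabcdad"))
print(decode(encode("aabcdad")))
```

Under these assumptions `encode("aabcdad")` yields 000100010001000001101101, and decoding that string recovers aabcdad. The linear scans over `order` keep the sketch short; a practical implementation would avoid them.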
Example
aabcdad in alphabet {a, b, ..., j}
[Figure: initial tree, a single NYT node with weight 0 and node number 21]
[Figure: tree after the first a. Root (node 21, weight 1); edge 0 leads to NYT (node 19, weight 0), edge 1 leads to a (node 20, weight 1).]
output = 000
[Figure: the same tree while the second a is coded as 1.]
output = 0001
[Figure: update after the second a. The weight of a (node 20) becomes 2, then the root weight becomes 2.]
output = 0001
[Figure: b is new, so the code for NYT (0) is sent, followed by the fixed code for b (001).]
output = 00010001
[Figure: node 19 (the old NYT) becomes an internal node with children NYT (node 17, weight 0) and b (node 18); b then gets weight 1.]
output = 00010001
[Figure: the update continues up the tree. Node 19 gets weight 1 and the root gets weight 3.]
[Figure: c is new, so the code for NYT (00) is sent, followed by the fixed code for c (010).]
output = 0001000100010
[Figure: node 17 (the old NYT) becomes an internal node with children NYT (node 15, weight 0) and c (node 16); c then gets weight 1.]
output = 0001000100010
[Figure: the update continues up the tree. Node 17 gets weight 1 and node 19 gets weight 2; the root still has weight 3.]
output = 0001000100010
[Figure: the root gets weight 4, completing the update for c.]
[Figure: d is new, so the code for NYT (000) is sent, followed by the fixed code for d (011). Node 15 (the old NYT) gets children NYT (node 13) and d (node 14).]
output = 0001000100010000011
[Figure: d gets weight 1 and node 15 gets weight 1. Node 17 is next, but b (node 18) has the same weight and a higher number: exchange!]
output = 0001000100010000011
[Figure: after the exchange, b is node 17 and the internal node is node 18, which gets weight 2. Node 19 is next, but a (node 20) has the same weight and a higher number: exchange!]
output = 0001000100010000011
[Figure: after the exchange, a is node 19 and the internal node is node 20. The root still has weight 4.]
output = 0001000100010000011
[Figure: the root gets weight 5, completing the update for d.]
[Figure: the third a is coded as 0; a (node 19) then gets weight 3.]
Note: the first a is coded as 000, the second as 1, and the third as 0.
output = 00010001000100000110
[Figure: the root gets weight 6, completing the update for a.]
[Figure: d (node 14) is coded as 1101. Before its weight is incremented, d is exchanged with b (node 17), the highest-numbered node of the same weight: exchange!]
output = 000100010001000001101101
[Figure: after the exchange, d is node 17 and b is node 14; d gets weight 2.]
output = 000100010001000001101101
[Figure: final tree for aabcdad. Node 20 has weight 4 and the root (node 21) has weight 7.]
output = 000100010001000001101101

Data structures:
1. Fixed code table
2. Binary tree with parent pointers
3. Table of pointers to nodes in the tree
4. Doubly linked list to rank the nodes
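A rough sketch of how the four data structures might fit together. All names here (`Node`, `Model`, `link`, `block_leader`) are hypothetical and the layout is an assumption, not the lecture's design. Parent pointers let the coder read a symbol's code from leaf to root, the pointer table finds a symbol's leaf directly, and the doubly linked rank list makes it cheap to find the highest-ranked node of the same weight when an exchange is needed.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(eq=False)
class Node:
    weight: int = 0
    symbol: Optional[str] = None
    parent: Optional["Node"] = None   # 2. binary tree with parent pointers
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prev: Optional["Node"] = None     # 4. neighbor with the next lower rank
    next: Optional["Node"] = None     # 4. neighbor with the next higher rank

@dataclass
class Model:
    fixed_code: Dict[str, str]        # 1. fixed code table
    leaf_of: Dict[str, Node]          # 3. table of pointers to nodes in the tree
    root: Node
    nyt: Node

def new_model(fixed_code):
    """Start with a single node that is both the root and NYT."""
    root = Node()
    return Model(fixed_code=fixed_code, leaf_of={}, root=root, nyt=root)

def link(lower, higher):
    """Place `higher` directly above `lower` in the rank list."""
    lower.next, higher.prev = higher, lower

def block_leader(node):
    """Walk up the rank list to the highest-ranked node of equal weight."""
    leader = node
    while leader.next is not None and leader.next.weight == node.weight:
        leader = leader.next
    return leader
```

The walk in `block_leader` touches only nodes of the same weight; it does not include the check that the leader is not the node's parent, which an update routine would still need.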
In Class Exercise
Decode using adaptive Huffman coding assuming the following fixed code
[Figure: binary tree defining the fixed code; only the leaf labels b and c survive]
00110000

Huffman Summary
Statistical compression algorithm
Prefix code
Fixed-to-variable length code
Optimization to create a best code
Symbol merging
Context
Adaptive coding
Decoder and encoder behave almost the same
Need for data structures and algorithms