Ic23 Unit04 Script

Recently on Image Compression ...

What did we learn so far?

Integer Arithmetic Coding approximates Pure Arithmetic Coding.

It is (relatively) easy to implement and storage efficient.

Huffman Coding and Arithmetic Coding are often used in practice.

Problems:

These methods are only optimal if our simplified assumptions are fulfilled.

Both methods can create significant overhead.
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents

1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter
Adaptive Huffman Coding

Use symbol counters as in integer arithmetic coding (WNC).

The total number of symbols is C := Σ_{i∈S} c_i.

Probabilities are estimated as p_i := c_i / C.
Adaptive Huffman Coding

Adaptive Huffman Coding – Version I

1. Initialise counters c_i for each symbol to 0.
2. Set C to C := Σ_{i∈S} c_i.
3. Set p_i = c_i / C for all i. (If C = 0, set all p_i to the same constant.)
4. Generate a Huffman tree with these probabilities.
5. Encode the next symbol s_i, and increment c_i by one.
6. If there are any symbols left, continue with step (2).
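A minimal Python sketch of Version I (for illustration only, not part of the original script; the helper huffman_codes and its tie-breaking convention are our own assumptions). The prefix code is rebuilt from the current counters before every symbol; counters start at a small positive value, as suggested in the remarks below.

    import heapq
    from itertools import count

    def huffman_codes(freqs):
        # Build a prefix code (symbol -> bit string) from symbol frequencies.
        # The tie counter makes the construction deterministic, which is one way
        # to realise the "same Huffman codes" convention mentioned in the drawbacks.
        if len(freqs) == 1:
            return {next(iter(freqs)): "0"}
        tie = count()
        heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, next(tie), merged))
        return heap[0][2]

    def adaptive_huffman_v1_encode(word, alphabet):
        # Counters start at 1 (a "small positive value", see the remarks below),
        # so that every symbol always has a nonzero probability.
        counts = {s: 1 for s in alphabet}
        bits = ""
        for s in word:
            codes = huffman_codes(counts)   # step 4: rebuild the tree from the counters
            bits += codes[s]                # step 5: encode the next symbol ...
            counts[s] += 1                  # ... and increment its counter
        return bits

    print(adaptive_huffman_v1_encode("ABRACADABRA", "ABCDR"))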
Remarks

For decoding, replace the word “Encode” with “Decode” in step (5).

For practical reasons, the counters c_i are often initialised to a small positive value instead.

It is not necessary to transmit the encoding scheme any more!

Changing symbol probabilities can be reflected.

Drawbacks

Requires some conventions to always obtain the same Huffman codes.

Generating a new Huffman tree after each symbol requires a lot of time.
• There are update strategies for the Huffman tree (Faller, Gallagher, Knuth, Vitter) to ease this problem.

Requires some time to obtain good approximations.
Adaptive Huffman Coding – Readjustment

Problem: If the probabilities of the source symbols change, it takes very long until the counters reflect this.

Example: Assume we want to encode the word

  a···a b···b c···c d···d,

in which each symbol is repeated N/4 times for a very large N.

• Standard Huffman coding needs 2N bits in total (2 bits per symbol).
• Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately

    N/4 + 2 · N/4 + 3 · N/4 + 2 · N/4 = 2N bits.

• In this example, adaptive Huffman coding is not better than standard Huffman coding (even though the situation is perfect for adaptive methods).
Idea: From time to time, multiply all counters c_i by a factor p < 1.

• If the probabilities have not changed, no harm is done.
• If the probabilities changed, this is reflected much faster than before.

Remarks

After multiplying all c_i with p, they are rounded to integers (due to practical reasons).

If c_i = 0 is undesired, rounding to zero must be prevented.
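A possible one-line sketch of this readjustment (the function name and the example value p = 0.25 are only illustrative, not prescribed by the script):

    def readjust(counts, p=0.25):
        # Multiply all counters by p and round, but never let a counter drop to zero.
        return {s: max(1, round(c * p)) for s, c in counts.items()}

    # e.g. call counts = readjust(counts) every few symbols inside the adaptive coding loop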
Adaptive Huffman Coding – Readjustment

Example: Assume we want to encode the word

  a···a b···b c···c d···d,

in which each symbol is repeated N/4 times for a very large N.

Standard Huffman coding needs 2N bits in total (2 bits per symbol).

Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately 2N bits.

Adaptive Huffman Coding with readjustment uses 1 bit for most symbols, thus requiring approximately

  4 · N/4 = N bits.

(e.g. every 4 symbols, multiply with p = 0.25)
Idea: Allow introducing new symbols while encoding by adding the ‘symbol’ NYT (not yet transmitted).
Adaptive Huffman Coding

Adaptive Huffman Coding – Version II

1. Start with S = {NYT}.
2. Initialise the counter c_NYT = 1.
3. Set C to C := Σ_{i∈S} c_i.
4. Set p_i = c_i / C for all i.
5. Generate a Huffman tree with these probabilities.
6. Encode the next symbol s_i:
   (a) If s_i has been encoded before, encode it and increment c_i by one.
   (b) Otherwise, transmit the code word for NYT, followed by the binary code of the symbol. Add the symbol to S, introduce a new counter c_i, and set it to one.
7. If there are any unencoded symbols left, continue with step (3).
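A sketch of Version II (illustration only; it reuses the huffman_codes helper from the Version I sketch above, and the fixed 8-bit binary code for newly introduced symbols is our own assumption):

    def adaptive_huffman_v2_encode(word, fixed_bits=8):
        NYT = "<NYT>"                      # sentinel 'symbol', assumed not to occur in the source
        counts = {NYT: 1}                  # steps 1 and 2
        bits = ""
        for s in word:
            codes = huffman_codes(counts)  # steps 3-5: tree from the current counters
            if s in counts:                # step 6(a): symbol was encoded before
                bits += codes[s]
                counts[s] += 1
            else:                          # step 6(b): NYT code word + binary code of the symbol
                bits += codes[NYT] + format(ord(s), "0%db" % fixed_bits)
                counts[s] = 1
        return bits

    print(adaptive_huffman_v2_encode("ABRACADABRA"))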
Furthermore, the counters c_i must not get too large.
Adaptive Arithmetic Coding

Example

counters     next symbol     α                          l                      new bits
A, B, C, D   or operation
1,1,1,1                      0                          1
1,1,1,1      B               1/4 = 0.25                 1/4 = 0.25
1,2,1,1      x → 2x          0.5                        0.5                    0
1,2,1,1      x → 2x − 1      0.0                        1                      1
1,2,1,1      B               0.0 + 1 · 1/5 = 0.2        2/5 · 1 = 0.4
1,3,1,1      C               0.2 + 0.4 · 4/6 = 7/15     0.4 · 1/6 = 1/15
1,3,2,1      A               7/15                       1/15 · 1/7 = 1/105

Encoding the word BBCA with (pure) adaptive arithmetic coding (Variant I with initial count of 1).

Remarks

Underflow expansion steps were skipped.

The final code word is 0101111.
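The α and l columns of this table can be reproduced with a short Python sketch (illustration only: exact fractions, counters as in Variant I, and the x → 2x / x → 2x − 1 expansions; underflow handling and the termination bits are omitted, as on the slide, and the sketch also performs the expansions after the last symbol, which the table does not show):

    from fractions import Fraction as F

    def adaptive_arithmetic_trace(word, alphabet):
        counts = {s: 1 for s in alphabet}       # Variant I: initial count 1 per symbol
        alpha, length, bits = F(0), F(1), []    # current interval [alpha, alpha + length)
        for x in word:
            total = sum(counts.values())
            below = sum(counts[s] for s in alphabet[:alphabet.index(x)])
            alpha += length * F(below, total)   # shrink the interval to the symbol's range
            length *= F(counts[x], total)
            print([counts[s] for s in alphabet], x, float(alpha), float(length))
            counts[x] += 1
            while True:                         # interval expansions
                if alpha + length <= F(1, 2):   # lower half:  x -> 2x,     emit 0
                    alpha, length = 2 * alpha, 2 * length
                    bits.append(0)
                elif alpha >= F(1, 2):          # upper half:  x -> 2x - 1, emit 1
                    alpha, length = 2 * alpha - 1, 2 * length
                    bits.append(1)
                else:
                    break
        return bits

    # prints the table rows; the returned bits 0,1,0,1,1,1 plus a final 1 give 0101111
    print(adaptive_arithmetic_trace("BBCA", "ABCD"))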
Adaptive Arithmetic Coding

When is Readjusting Necessary?

A readjustment must be done before encoding the next symbol if (and as long as)

  C > M/4 + 2.

• Note that C ≥ 1/p_min, since

    p_min = (min_i c_i) / C ≥ 1/C.

• Thus, it follows (compare Learning Unit 03):

    M/4 + 2 ≥ C
    ⇒ M/4 + 2 ≥ 1/p_min
    ⇒ M ≥ 4/p_min − 8
Remarks

Final Remarks

Compared to adaptive Huffman coding, adaptive arithmetic coding

• is easier to implement,
• can be faster,
• requires less memory,
• and achieves stronger compression.

This is the exact opposite of what we have in the non-adaptive case!
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents

1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter
In practice, this assumption (that consecutive symbols are independent of each other) is often wrong:

• If a ‘Q’ appears in an English text, it is very likely that the next symbol is ‘u’.
• If all neighbours of a pixel x are black, it is very likely that x is also black.

By taking this into account, stronger compression ratios can be obtained.

Thus, we will not make this assumption any more, and consider higher order entropy coders.
Higher Order Entropy Coding

What is k-th Order Entropy Coding?

For k-th order entropy coding, one ...

• considers a k-th order context, i.e. the preceding k symbols s_{i1} · · · s_{ik} before the current symbol s_j,
• uses the conditional probability

    P(s_j | s_{i1} · · · s_{ik}) = P(s_{i1} · · · s_{ik} s_j) / P(s_{i1} · · · s_{ik})

  to encode the next symbol.

These conditional probabilities can be illustrated using probabilistic finite state automata.
The outgoing edges denote probabilities for the next symbol in a source word.

Such an automaton generates infinitely many source words of infinite length.
Higher Order Entropy Coding

k-th Order Huffman Coding

For each context s_{i1} · · · s_{ik}, one separate Huffman tree is created.

• There are |S|^k possible contexts.
• As in extended Huffman coding, noticeably more space is needed to store the encoding scheme.

Whenever a symbol is encoded/decoded, the Huffman tree corresponding to the current context is used.

Problem: The first k symbols of the source word do not have a complete context.

Solution:
Estimate the probabilities p_1, . . . , p_m and use them for the first k symbols (∅-context).

For instance, use relative source probabilities:

  p_ℓ := Σ_{1 ≤ i1,...,ik ≤ m} P(s_{i1} · · · s_{ik} s_ℓ)
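A small sketch of how such conditional probabilities can be estimated from a sample (illustration only; the function names are not from the script). Contexts shorter than k are used for the first symbols, in the spirit of the ∅-context above:

    from collections import defaultdict

    def kth_order_counts(text, k):
        # Count (context, symbol) pairs to estimate P(s_j | s_i1 ... s_ik).
        ctx_counts = defaultdict(lambda: defaultdict(int))
        for i in range(len(text)):
            context = text[max(0, i - k):i]        # shorter context for the first symbols
            ctx_counts[context][text[i]] += 1
        return ctx_counts

    def conditional_probability(ctx_counts, context, symbol):
        total = sum(ctx_counts[context].values())
        return ctx_counts[context][symbol] / total if total else 0.0

    counts = kth_order_counts("abracadabra", 1)
    print(conditional_probability(counts, "a", "b"))   # 0.5: two of the four symbols after 'a' are 'b'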
On average, the length of the next code word in this context will be

  l(i1, . . . , ik) = Σ_{j=1}^{m} P(s_j | s_{i1} · · · s_{ik}) · l(i1, . . . , ik, j)
                    = Σ_{j=1}^{m} [ P(s_{i1} · · · s_{ik} s_j) / P(s_{i1} · · · s_{ik}) ] · l(i1, . . . , ik, j)

Thus, the average code word length l^(k) of a k-th order encoding scheme is

  l^(k) = Σ_{1 ≤ i1,...,ik ≤ m} P(s_{i1} · · · s_{ik}) · l(i1, . . . , ik)
        = Σ_{1 ≤ i1,...,ik ≤ m} Σ_{j=1}^{m} P(s_{i1} · · · s_{ik} s_j) · l(i1, . . . , ik, j)
        = Σ_{1 ≤ i1,...,ik+1 ≤ m} P(s_{i1} · · · s_{ik+1}) · l(i1, . . . , ik+1)
The k-th order entropy is defined analogously as a weighted average of the context entropies H(i1, . . . , ik):

  H^(k) = Σ_{1 ≤ i1,...,ik ≤ m} P(s_{i1} · · · s_{ik}) · H(i1, . . . , ik)
Higher Order Coding Efficiency

Theorem (Shannon bound for higher-order encoding)

There exists a k-th order encoding scheme for which

  l^(k) < H^(k) + 1.

The average code word length of a uniquely decodable k-th order encoding scheme satisfies l^(k) ≥ H^(k).

Furthermore,

  H^(k) ≤ H^(k−1)

holds for all k ≥ 1. Thus, the entropy can only decrease when considering larger contexts.

Higher-order entropy coding is also possible with arithmetic coding.

• This extension is straightforward.

Using probabilistic finite state automata, it is even possible to compute

  H^(∞) = lim_{k→∞} H^(k).
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents

1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter
Recall the example in which each of the symbols a, b, c, d is repeated N/4 times:

Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately 2N bits.

Adaptive Huffman Coding with readjustment uses 1 bit for most symbols, thus requiring approximately

  4 · N/4 = N bits.

(e.g. every 4 symbols, multiply with p = 0.25)

Problem: This is still a lot of data for a word that simple.
Run Length Encoding

Idea: Count how often each symbol occurs in a row, and transmit this counter.

This idea is used in run length encoding (RLE) or run length coding (RLC).

If the same symbol appears k times in a row, this is called a run of length k, or a run with run length k, respectively.
Example

The source word

  00000 111 000 1 0 1111 000 1111111
    5    3   3  1 1  4    3     7

is encoded as 5 3 3 1 1 4 3 7.
Remarks

It must be known what the first symbol is.

• Either transmit/store the first bit,
• or use a run length of zero if the first symbol is “unexpected”.

By scanning images in a predefined way (e.g. row-wise), RLE can also be applied to images.
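A two-line sketch that reproduces the run lengths of the binary example above (illustration only; itertools.groupby does the run splitting):

    from itertools import groupby

    def run_lengths(word):
        # Split the word into maximal runs and return their lengths.
        return [len(list(group)) for _, group in groupby(word)]

    # The first symbol ('0') must be known separately, as noted above.
    print(run_lengths("000001110001011110001111111"))   # [5, 3, 3, 1, 1, 4, 3, 7]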
A run can also be encoded by three components:

• an escape sequence (ESC) indicating the start of a run,
• the length of the run,
• the symbol itself.

Example

The source word

  CBAAAAABBBACAA

can be encoded as C B (ESC 5 A) (ESC 3 B) A C (ESC 2 A).
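A sketch of this escape-sequence variant (illustration only; here every run of length at least 2 is escaped so that the output matches the example above, and the textual "ESC" marker is an arbitrary choice):

    def rle_escape_encode(word, esc="ESC", min_run=2):
        # Replace every run of at least min_run symbols by (ESC, run length, symbol).
        out, i = [], 0
        while i < len(word):
            j = i
            while j < len(word) and word[j] == word[i]:
                j += 1
            run = j - i
            if run >= min_run:
                out.append((esc, run, word[i]))
            else:
                out.extend(word[i:j])
            i = j
        return out

    print(rle_escape_encode("CBAAAAABBBACAA"))
    # ['C', 'B', ('ESC', 5, 'A'), ('ESC', 3, 'B'), 'A', 'C', ('ESC', 2, 'A')]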
Run Length Encoding

Remarks

Runs with a length smaller than four should be ignored (otherwise, the resulting code word can become longer than the source word).

Instead of a separate escape symbol, a symbol that is repeated l times (e.g. l = 3) can itself act as the escape sequence.

• Then, the symbol is already known and must not be transmitted again.
• Instead of saving the run length k itself, k − l can be stored.
• However, the word might get longer.
Example

The source word

  CBAAAAABBBACAA

can be encoded as C B (AAA 2) (BBB 0) A C AA when repeating one symbol three times indicates the start of a run length.

For the source word

  CBAAABBBACAA,

the resulting code word C B (AAA 0) (BBB 0) A C AA is longer than the initial word.
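The same examples with the “repeat the symbol three times” convention, as a sketch (the textual output format is only for illustration):

    def rle_triple_encode(word):
        # Three repeated symbols announce a run; the remaining run length (k - 3) follows.
        out, i = [], 0
        while i < len(word):
            j = i
            while j < len(word) and word[j] == word[i]:
                j += 1
            run = j - i
            if run >= 3:
                out.append(word[i] * 3 + str(run - 3))   # e.g. AAAAA -> "AAA2"
            else:
                out.append(word[i] * run)
            i = j
        return " ".join(out)

    print(rle_triple_encode("CBAAAAABBBACAA"))   # C B AAA2 BBB0 A C AA
    print(rle_triple_encode("CBAAABBBACAA"))     # C B AAA0 BBB0 A C AA  (longer than the input)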
Run Length Encoding

Illustration of Run Length Encoding – Variant II

[Figure: a 9 × 8 colour image, scanned row-wise and encoded with Variant II.]

Illustration of RLE: The image could be encoded as YY BBB4 Y BBB1 LL BBB3 LLL1 BBB1 LLL3 B MMM0 LLL2 MMM3 LLL0 MMM2 DDD0 L MMM1 DDD2, resulting in 63 characters instead of the 9 · 8 = 72 characters with the trivial encoding (Y = Yellow, B = Blue, L = Light grey, M = Medium grey, D = Dark).
The run lengths themselves can be stored in different ways:

Using a fixed number of bits.
• If a run length larger than the possible value occurs, a run of length zero is added in between.
• Example (Variant I, with two bits): 011111 is encoded as “01 11 00 10”, i.e. as “1 3 0 2”.

Using Huffman coding (with the same problems as described above).

Using Golomb-, Rice-, or Fibonacci coding.
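A small sketch of the fixed-number-of-bits option (illustration only): run lengths above the largest representable value are split by inserting a run of length zero.

    def split_run_length(k, bits=2):
        max_val = 2 ** bits - 1          # largest value representable with the given bits
        parts = []
        while k > max_val:
            parts.extend([max_val, 0])   # a zero-length run of the other symbol in between
            k -= max_val
        parts.append(k)
        return parts

    # '011111' = a run of one '0' and a run of five '1's -> 1, then 3 0 2 ("01 11 00 10")
    print(split_run_length(1), split_run_length(5))   # [1] [3, 0, 2]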
Bzip2 uses two symbols called RUNA and RUNB in one of its RLE steps:

• A run length is encoded as a sequence of RUNA and RUNB, where the values represented by RUNA and RUNB are summed up.
• The symbol RUNA has a value of 2^(j−1) if it is the j-th symbol of the sequence.
• The symbol RUNB has a value of 2^j if it is the j-th symbol of the sequence.

There are two additional practical benefits to this approach:

• No escape sequences are necessary.
• Since RUNA and RUNB do not appear in the source alphabet, a source symbol indicates the start of a new run.
Run Length Encoding

Example

The word

  AAAAABC A···A   (the final run consists of 43 A’s)

is transformed to

  A 4 B C A 42.

Since

  4 = 2 · 1 + 1 · 2

and

  42 = 2 · 1 + 2 · 2 + 1 · 4 + 2 · 8 + 1 · 16,

this is encoded as

  A RUNB RUNA B C A RUNB RUNB RUNA RUNB RUNA.
The run length x is written as

  x = c_1 · 2^0 + c_2 · 2^1 + · · · + c_n · 2^(n−1)   with coefficients c_i ∈ {1, 2}.

This resembles a binary notation, but with a different coefficient set.

Algorithm:

1. Set r_0 = x, i = 1.
2. Compute r_i = ⌈r_{i−1}/2⌉ − 1.
3. Compute c_i = r_{i−1} − 2 · r_i.
4. If r_i > 0, set i := i + 1 and go to step 2.

In the final representation, replace 1 by RUNA and 2 by RUNB.
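A sketch of this algorithm together with the value interpretation from the previous slide (function names are illustrative):

    def runab_encode(x):
        # Bijective base-2 representation of a positive run length x,
        # with coefficient 1 -> RUNA and coefficient 2 -> RUNB.
        symbols, r = [], x
        while r > 0:
            r_next = -(-r // 2) - 1                 # ceil(r / 2) - 1
            symbols.append("RUNA" if r - 2 * r_next == 1 else "RUNB")
            r = r_next
        return symbols

    def runab_value(symbols):
        # RUNA contributes 2^(j-1) and RUNB contributes 2^j as the j-th symbol.
        return sum((1 if s == "RUNA" else 2) * 2 ** j for j, s in enumerate(symbols))

    print(runab_encode(4))                  # ['RUNB', 'RUNA']
    print(runab_encode(42))                 # ['RUNB', 'RUNB', 'RUNA', 'RUNB', 'RUNA']
    print(runab_value(runab_encode(42)))    # 42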
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents

1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter
Bzip2

BWT, MTF, and Bzip2

Bzip2 is an open-source compression algorithm widely used in Unix/Linux.

It was written single-handedly by Julian Seward. Compared to gzip, it compresses better, but is noticeably slower.

It cleverly combines several encoding algorithms we have seen so far.

Bzip, the predecessor, was based on arithmetic coding, BWT, and MTF.
Burrows-Wheeler-Transform

Burrows-Wheeler-Transform

Problem: Words of the form ABABAB are not compressed using RLE.

Idea: Sort the word before applying RLE.

Example

Sorting the word ABRACADABRA results in AAAAABBCDRR. The latter can be compressed much better using RLE.
Burrows-Wheeler-Transform

New Problem: Storing/transmitting the information needed to undo the sorting is too expensive.

Idea: Approximate sorting that can be reversed with little or no additional information.

Make use of the fact that the same symbols often appear in the same contexts.

These ideas are used in the so-called Burrows-Wheeler-Transform (BWT), also known as block sorting.
Burrows-Wheeler-Transform

Example

Matrix after step 1          Matrix after sorting
ABRACADABRA                   AABRACADABR
BRACADABRAA                   ABRAABRACAD
RACADABRAAB                  *ABRACADABRA
ACADABRAABR                   ACADABRAABR
CADABRAABRA                   ADABRAABRAC
ADABRAABRAC        −→         BRAABRACADA
DABRAABRACA                   BRACADABRAA
ABRAABRACAD                   CADABRAABRA
BRAABRACADA                   DABRAABRACA
RAABRACADAB                   RAABRACADAB
AABRACADABR                   RACADABRAAB

BWT of the word ABRACADABRA. The word RDARCAAAABB (the last column of the sorted matrix) and an index to the third row will be transmitted/stored.
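This construction can be written down directly (a naive sketch for illustration; real implementations avoid building all rotations explicitly):

    def bwt(word):
        # Sort all cyclic rotations; keep the last column and the row of the original word.
        rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
        last_column = "".join(rot[-1] for rot in rotations)
        return last_column, rotations.index(word)

    # ('RDARCAAAABB', 2): index 2 is the third row, as marked with * above
    print(bwt("ABRACADABRA"))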
Burrows-Wheeler-Transform

Back-Transformation

1. Sort the symbols of the word received using a stable sorting algorithm.
   (Remark: stable means here that the original position of each symbol in the unsorted word is still known after sorting.)
2. Initialise an index i with the received index i_0.
3. Add the symbol in the i-th position of the sorted word to the decoded word.
4. Find the last decoded symbol in the received (unsorted) word. Set i to the index of that symbol.
5. If i = i_0, decoding is done.
6. Otherwise, continue with step (3).
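A direct sketch of this back-transformation (illustration only; it uses 0-based indices, so the received index 3 from the example becomes 2 here):

    def inverse_bwt(received, start):
        # Stable sort (step 1): order[p] is the position in the received word of
        # the symbol that ends up at position p of the sorted word.
        order = sorted(range(len(received)), key=lambda p: received[p])
        decoded, i = [], start                      # step 2
        while True:
            decoded.append(received[order[i]])      # step 3: symbol at position i of the sorted word
            i = order[i]                            # step 4: jump to that occurrence in the received word
            if i == start:                          # step 5
                break
        return "".join(decoded)

    print(inverse_bwt("RDARCAAAABB", 2))   # ABRACADABRA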
Burrows-Wheeler-Transform

Example

    Sorted symbols   Received word
    A                R
    A                D
  * A                A
    A                R
    A                C
    B                A
    B                A
    C                A
    D                A
    R                B
    R                B

Illustration of reversing the BWT. Received word: RDARCAAAABB, received index: 3. Right: received word. Left: sorted symbols. Decoded so far: ABRAC.
Burrows-Wheeler-Transform

Remarks

The BWT was invented by Michael Burrows and David Wheeler in 1994, but it does not compress anything by itself:

• It only reorders the word.
• Other compression algorithms can benefit from the new order.

It is most useful for very long words.

If the EOF-symbol is included in the word to be encoded, the index can be omitted. (Why?)
Move-to-Front Coding

Move-to-Front Coding

Idea: Assume that some symbols appear more frequently in certain parts of the source word.

If there is such a local correlation, this means that recently used symbols are likely to be used again soon. The output of the BWT typically has exactly this kind of local correlation.

This idea is used in Move-to-Front Coding.

Algorithm

1. Sort all symbols and put them into a table.
2. Encode the next symbol by its position in the table.
3. Move the symbol just encoded to the beginning of the table. (The symbols before this symbol move one step towards the end of the table.)
4. As long as any unencoded symbols are left, continue with step 2.
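A direct sketch of this algorithm (illustration only):

    def mtf_encode(word, alphabet):
        table = sorted(alphabet)                 # step 1
        code = []
        for s in word:
            pos = table.index(s)
            code.append(pos)                     # step 2: position in the table
            table.insert(0, table.pop(pos))      # step 3: move the symbol to the front
        return code

    print(mtf_encode("RDARCAAAABB", "ABCDR"))    # [4, 4, 2, 2, 4, 2, 0, 0, 0, 4, 0]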
Move-to-Front Coding

Example

table          next symbol   code stream
0 1 2 3 4
A B C D R      R             4
R A B C D      D             4
D R A B C      A             2
A D R B C      R             2
R A D B C      C             4
C R A D B      A             2
A C R D B      A             0
A C R D B      A             0
A C R D B      A             0
A C R D B      B             4
B A C R D      B             0

Encoding the word RDARCAAAABB with MTF coding results in the code word 44224200040.
Move-to-Front Coding

Decoding

Decoding is very similar to encoding:

Algorithm

1. Sort all symbols and put them into a table.
2. Take the symbol at the position indicated by the code word.
3. Move the symbol just decoded to the beginning of the table.
4. As long as any undecoded symbols are left, continue with step 2.
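The corresponding decoder, again as a short sketch:

    def mtf_decode(code, alphabet):
        table = sorted(alphabet)                 # step 1
        word = []
        for pos in code:
            word.append(table[pos])              # step 2: symbol at the indicated position
            table.insert(0, table.pop(pos))      # step 3: move it to the front
        return "".join(word)

    print(mtf_decode([4, 4, 2, 2, 4, 2, 0, 0, 0, 4, 0], "ABCDR"))   # RDARCAAAABB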
Move-to-Front Coding

Example

table          code stream   next symbol
0 1 2 3 4
A B C D R      4             R
R A B C D      4             D
D R A B C      2             A
A D R B C      2             R
R A D B C      4             C
C R A D B      2             A
A C R D B      0             A
A C R D B      0             A
A C R D B      0             A
A C R D B      4             B
B A C R D      0             B

Decoding the code word 44224200040 with MTF yields RDARCAAAABB (compare the encoding example above).
Move-to-Front Coding

Remarks

It is not necessary to know the whole code word to start encoding.

As always, any initial ordering can be used, as long as encoder and decoder use the same.

BWT is typically followed by MTF.
Bzip2

BZIP2 in Detail

1. RLE using four equal symbols as escape sequence, and runs of length 4 to 255.
2. BWT
3. MTF
4. RLE using the RUNA/RUNB approach described above
5. Huffman coding (with canonical codes)
   • Up to six different Huffman trees might be used.
   • The tree used can change every 50 symbols.
   • MTF with unary coding is used to encode which Huffman trees are used.
   • The code word lengths are stored using delta encoding: only the difference between consecutive code word lengths is stored.
   • A sparse bit array is used to indicate which symbols occur: the 256 possible symbols are assigned to 16 blocks. 16 bits are used to indicate which blocks are empty, as well as 16 bits for each non-empty block.
Bzip2

Remarks on Bzip2

You can download a C implementation on www.bzip.org.

It is useful for daily compression tasks:

• Around 20% worse than state-of-the-art techniques (PAQ).
• Usually quite a lot faster (around factor 1000).

Julian Seward called the first run length encoding step a mistake in hindsight.

• It does not contribute to compression performance and is mostly irrelevant.
• Instead, it was designed to protect the BWT from inputs with too many repeating symbols.
• However, this could be done more efficiently by modifying the BWT.
Summary

Summary

Adaptive coding: no overhead, adjustment to local distribution.

Higher order coding: uses preceding symbols to define conditional probabilities.

Run length encoding (RLE) counts runs of the same symbols.

The Burrows-Wheeler-Transform (BWT) reorders the source word such that symbols are ordered with respect to their context.

Move-to-Front coding (MTF) remaps symbols to low integers (mostly zeroes).

Bzip2 illustrates that a clever combination of different approaches can yield a practical, competitive compression method.

Can we find other ways to help the entropy coders?
References

References

D. Hankerson, G. A. Harris, P. D. Johnson Jr.: Introduction to Information Theory and Data Compression.
(BWT, MTF)

M. Burrows and D. J. Wheeler: A block-sorting lossless data compression algorithm. SRC Research Report, 1994.
(BWT, MTF, implementation details)