Ic23 Unit04 Script

The document discusses different methods of image compression including lossless and lossy techniques. Lossless techniques covered include entropy coding methods like Huffman coding and arithmetic coding. Lossy methods include transform coding techniques like JPEG and wavelet-based JPEG2000 as well as other approaches like fractal compression and neural networks.


Recently on Image Compression ...

What did we learn so far?

Integer Arithmetic Coding approximates Pure Arithmetic Coding.
It is (relatively) easy to implement and storage efficient.
Huffman Coding and Arithmetic Coding are often used in practice.

Problems:
These methods are only optimal if our simplified assumptions are fulfilled.
Both methods can create significant overhead.

Recently on Image Compression ...

This Learning Unit

Image Compression (course overview)

Part I: Lossless
  Entropy Coding: Information Theory, Huffman Coding, Integer Coding, Arithmetic Coding,
    RLE, BWT, MTF, bzip, Dictionary Coding, Prediction, LZ-Family, Deflate, PPM, PAQ
  Lossless Codecs: PNG, gif, JBIG, JBIG-LS

Part II: Lossy
  Transform Coding: JPEG, JPEG2000, HEVC intra
  Other Approaches: Fractal Compression, Neural Networks
  Inpainting-based Compression: PDE-based Inpainting, Data Selection, Tonal Optimisation,
    Patch-based Inpainting, Inpainting-based Codecs

Basics: quantisation, error measures, linear prediction
Teaser: Video Coding: PDE-based video coding, MPEG-family

How to estimate probabilities on the fly and how to relax our strict assumptions?
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents
1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter

Adaptive Entropy Coding

Adaptive Coding

Motivation

Problem: Sometimes, the source word is not known in advance.
Thus, the probability of each symbol is unknown.

Idea: Start with a rough estimate of the probabilities and improve it during encoding.

Adaptive Huffman Coding

Use symbol counters as in integer arithmetic coding (WNC).
Each symbol $s_i$ has a corresponding integer counter $c_i$.
The total number of symbols is $C := \sum_{i \in S} c_i$.
Probabilities are estimated as $p_i := \frac{c_i}{C}$.
Adaptive Huffman Coding

Adaptive Huffman Coding – Version I

1. Initialise the counters $c_i$ for each symbol to 0.
2. Set $C := \sum_{i \in S} c_i$.
3. Set $p_i = \frac{c_i}{C}$ for all $i$. (If $C = 0$, set all $p_i$ to the same constant.)
4. Generate a Huffman tree with these probabilities.
5. Encode the next symbol $s_i$, and increment $c_i$ by one.
6. If there are any symbols left, continue with step (2).

Remarks

For decoding, replace the word “Encode” with “Decode” in step (5).
For practical reasons, the counters $c_i$ are often initialised to a small positive number (instead of to zero).
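
As a complement to the pseudocode, here is a small Python sketch of Version I. It is only a didactic sketch under a few assumptions of mine: the counters start at 1 (the small positive initialisation from the remarks), the helper build_huffman_code is my own, and its tie-breaking is arbitrary, so a matching decoder would additionally need the conventions mentioned on the next slide.

```python
# Sketch of Adaptive Huffman Coding, Version I (not the slides' reference
# implementation): the Huffman code is rebuilt from the current counters
# before every single symbol, exactly as in steps (2)-(6) above.
import heapq
from itertools import count

def build_huffman_code(counters):
    """Return a dict symbol -> bit string for the given symbol counters."""
    tick = count()                       # tie-breaker, keeps heapq from comparing dicts
    heap = [(c, next(tick), {s: ""}) for s, c in counters.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                   # degenerate one-symbol alphabet
        return {s: "0" for s in heap[0][2]}
    while len(heap) > 1:
        c1, _, code1 = heapq.heappop(heap)
        c2, _, code2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in code1.items()}
        merged.update({s: "1" + b for s, b in code2.items()})
        heapq.heappush(heap, (c1 + c2, next(tick), merged))
    return heap[0][2]

def adaptive_huffman_encode(word, alphabet):
    counters = {s: 1 for s in alphabet}  # small positive initialisation (see remarks)
    bits = []
    for s in word:
        code = build_huffman_code(counters)  # steps (2)-(4): rebuild the tree
        bits.append(code[s])                 # step (5): encode the symbol ...
        counters[s] += 1                     # ... and increment its counter
    return "".join(bits)

print(adaptive_huffman_encode("ABRACADABRA", "ABCDR"))
```

The decoder mirrors this loop: it rebuilds the same tree, reads bits until they match a code word, outputs that symbol, and increments the same counter.
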

Adaptive Huffman Coding

Advantages

Allows Huffman coding if the source word is not known in advance.
It is not necessary to transmit the encoding scheme any more!
Changing symbol probabilities can be reflected.

Drawbacks

Requires some conventions to always obtain the same Huffman codes.
Generating a new Huffman tree after each symbol requires a lot of time.
• There are update strategies for the Huffman tree (Faller, Gallager, Knuth, Vitter) to ease this problem.
Requires some time to obtain good approximations.
Adaptive Huffman Coding – Readjustment

Adaptive Huffman Coding – Readjustment

Problem: If the probabilities of the source symbols change, it takes very long to reflect this change.

Example: Assume we want to encode the word

$a \cdots a \, b \cdots b \, c \cdots c \, d \cdots d,$

in which each symbol is repeated $\frac{N}{4}$ times for a very large $N$.

• Standard Huffman coding needs $2N$ bits in total (2 bits per symbol).
• Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately
  $\frac{N}{4} + 2 \, \frac{N}{4} + 3 \, \frac{N}{4} + 2 \, \frac{N}{4} = 2N$ bits.
• In this example, adaptive Huffman coding is not better than standard Huffman coding (even though the situation is perfect for adaptive methods).

Adaptive Huffman Coding – Readjustment

Idea

Periodically, multiply all $c_i$ by some fixed number $p < 1$.
• If the probabilities have not changed, no harm is done.
• If the probabilities changed, this is reflected much faster than before.

Remarks

After multiplying all $c_i$ with $p$, they are rounded to integers (for practical reasons).
If $c_i = 0$ is undesired, rounding to zero must be prevented.
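
A minimal sketch of the readjustment step, assuming a rescaling factor p = 0.5 and clamping at 1 so that no counter is rounded down to zero (both are my own choices within the remarks above):

```python
def readjust(counters, p=0.5):
    """Multiply all counters by p < 1, round, and keep them at least 1."""
    return {s: max(1, round(c * p)) for s, c in counters.items()}

# e.g. call counters = readjust(counters) every K encoded symbols inside
# the adaptive coding loop of the previous sketch.
```
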
Adaptive Huffman Coding – Readjustment

Example: Assume we want to encode the word

$a \cdots a \, b \cdots b \, c \cdots c \, d \cdots d,$

in which each symbol is repeated $\frac{N}{4}$ times for a very large $N$.

Standard Huffman coding needs $2N$ bits in total (2 bits per symbol).
Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately $2N$ bits.
Adaptive Huffman Coding with readjustment uses 1 bit for most symbols, thus requiring approximately
$4 \cdot \frac{N}{4} = N$ bits
(e.g. every 4 symbols, multiply with $p = 0.25$).

Adaptive Huffman Coding – Version II

Adaptive Huffman Coding V2: Motivation

Problem: Sometimes, the set of symbols is not known in advance.

Idea: Allow introducing new symbols while encoding by adding the ‘symbol’ NYT (not yet transmitted).
Adaptive Huffman Coding

Adaptive Huffman Coding – Version II

1. Start with $S = \{\mathrm{NYT}\}$.
2. Initialise the counter $c_{\mathrm{NYT}} = 1$.
3. Set $C := \sum_{i \in S} c_i$.
4. Set $p_i = \frac{c_i}{C}$ for all $i$.
5. Generate a Huffman tree with these probabilities.
6. Encode the next symbol $s_i$:
   (a) If $s_i$ has been encoded before, encode it and increment $c_i$ by one.
   (b) Otherwise, transmit the code word for NYT, followed by the binary code of the symbol. Add the symbol to $S$, introduce a new counter $c_i$, and set it to one.
7. If there are any unencoded symbols left, continue with step (3).
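
A rough Python sketch of Version II, reusing build_huffman_code from the Version I sketch above. The 8-bit literal used to announce a new symbol and the choice to keep the NYT counter fixed at 1 are my reading of the slide, not a normative implementation:

```python
NYT = "<NYT>"   # sentinel 'not yet transmitted' symbol, assumed not to occur in the data

def adaptive_huffman_encode_v2(word):
    counters = {NYT: 1}                       # steps (1)-(2)
    bits = []
    for s in word:
        code = build_huffman_code(counters)   # steps (3)-(5), helper from the V1 sketch
        if s in counters:                     # step (6a): known symbol
            bits.append(code[s])
            counters[s] += 1
        else:                                 # step (6b): announce a new symbol
            bits.append(code[NYT])
            bits.append(format(ord(s), "08b"))  # plain 8-bit binary code of the symbol
            counters[s] = 1
    return "".join(bits)

print(adaptive_huffman_encode_v2("ABRACADABRA"))
```
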

Adaptive Arithmetic Coding

Adaptive Arithmetic Coding

All concepts introduced for adaptive Huffman coding can also be used for adaptive arithmetic coding.
However, the counters $c_i$ must not be initialised to zero.
Furthermore, the counters $c_i$ must not get too large.
Adaptive Arithmetic Coding

Example

counters (A,B,C,D) | next symbol / operation | α                       | ℓ                  | new bits
1,1,1,1            |                         | 0                       | 1                  |
1,1,1,1            | B                       | 1/4 = 0.25              | 1/4 = 0.25         |
1,2,1,1            | x → 2x                  | 0.5                     | 0.5                | 0
1,2,1,1            | x → 2x − 1              | 0.0                     | 1                  | 1
1,2,1,1            | B                       | 0.0 + 1 · 1/5 = 0.2     | 2/5 · 1 = 0.4      |
1,3,1,1            | C                       | 0.2 + 0.4 · 4/6 = 7/15  | 0.4 · 1/6 = 1/15   |
1,3,2,1            | A                       | 7/15                    | 1/15 · 1/7 = 1/105 |

Encoding the word BBCA with (pure) adaptive arithmetic coding (Variant I with initial count of 1).

Remarks

Underflow expansion steps were skipped.
The final code word is 0101111.

Adaptive Arithmetic Coding

Adaptive WNC Algorithm

Problem: The $p_i$ may become arbitrarily small, resulting in empty intervals in the WNC algorithm (see Learning Unit 03).

Reminder: For a source alphabet $S = \{s_1, \dots, s_m\}$ whose symbols appear with probabilities $p_1, \dots, p_m$, no empty interval appears in the WNC algorithm if
• $M$ is divisible by 4, and
• $M \ge \frac{4}{p_{\min}} - 8$, where $p_{\min} = \min_i p_i$.

Solution: Readjust if necessary (compare slide 6).
Adaptive Arithmetic Coding

When is Readjusting Necessary?

A readjustment must be done before encoding the next symbol if (and as long as)

$C > \frac{M}{4} + 2$

• Note that $C \ge \frac{1}{p_{\min}}$, since
  $p_{\min} = \frac{\min_i c_i}{C} \ge \frac{1}{C}$
• Thus, it follows (compare Learning Unit 03)
  $\frac{M}{4} + 2 \ge C \;\Rightarrow\; \frac{M}{4} + 2 \ge \frac{1}{p_{\min}} \;\Rightarrow\; M \ge \frac{4}{p_{\min}} - 8$

Remarks

Final Remarks

Compared to adaptive Huffman coding, adaptive arithmetic coding
• is easier to implement,
• can be faster,
• requires less memory,
• and compresses more strongly.

This is the exact opposite of what we have in the non-adaptive case!
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents
1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter

Higher Order Entropy Coding

Higher Order Entropy Coding

So far, we assumed that the random variables representing the symbols in a source word are independent and identically distributed.
This resulted in zeroth order entropy coders.
In practice, this assumption is often wrong:
• If a ‘Q’ appears in an English text, it is very likely that the next symbol is ‘u’.
• If all neighbours of a pixel x are black, it is very likely that x is also black.
By taking this into account, stronger compression ratios can be obtained.
Thus, we will not make this assumption any more, and consider higher order entropy coders.
Higher Order Entropy Coding

What is k-th Order Entropy Coding?

For k-th order entropy coding, one ...
• considers a k-th order context, i.e. the preceding k symbols $s_{i_1} \cdots s_{i_k}$ before the current symbol $s_j$.
• uses the conditional probability
  $P(s_j \mid s_{i_1} \cdots s_{i_k}) = \frac{P(s_{i_1} \cdots s_{i_k} s_j)}{P(s_{i_1} \cdots s_{i_k})}$
  to encode the next symbol.

These conditional probabilities can be illustrated using probabilistic finite state automata.
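
To make the definition concrete, the following sketch estimates the conditional probabilities P(s_j | context) from a sample word simply by counting (k+1)-grams; the maximum-likelihood counting without any smoothing is my own simplification:

```python
from collections import Counter

def conditional_probabilities(word, k):
    """Estimate P(s_j | s_i1 ... s_ik) from the (k+1)-grams of a sample word."""
    joint = Counter(word[i:i + k + 1] for i in range(len(word) - k))   # counts of (context, s_j)
    context = Counter(word[i:i + k] for i in range(len(word) - k))     # counts of contexts
    return {(g[:k], g[k]): n / context[g[:k]] for g, n in joint.items()}

probs = conditional_probabilities("abracadabra", k=1)
print(probs[("a", "b")])   # estimated P('b' | 'a') = 0.5
```
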

Higher Order Entropy Coding

Illustration

[Figure: probabilistic finite state automaton with states S1, S2, S3; each edge carries a label of the form (probability, symbol), e.g. (1/4, a), (1/2, b), (1/3, c), (2/3, a), (1/4, c), (1, d).]

Each node represents a state of the automaton.
The outgoing edges denote probabilities for the next symbol in a source word.
Such an automaton generates infinitely many source words of infinite length.
Source words that appear in practice are finite substrings.
Every source symbol must appear at least at one edge.
The sum of outgoing probabilities must be 1.
Higher Order Entropy Coding

k-th Order Huffman Coding

For each context $s_{i_1} \cdots s_{i_k}$, one separate Huffman tree is created.
• There are $|S|^k$ possible contexts.
• As in extended Huffman coding, noticeably more space is needed to store the encoding scheme.

Whenever a symbol is encoded/decoded, the Huffman tree corresponding to the given context is used.

Problem: For the first k symbols, there is no k-th order context.

Solution:
Estimate the probabilities $p_1, \dots, p_m$ and use them for the first k symbols (∅-context).
For instance, use relative source probabilities:

$p_\ell := \sum_{1 \le i_1, \dots, i_k \le m} P(s_{i_1} \cdots s_{i_k} s_\ell)$

Higher Order Entropy Coding

Example: 1st Order Huffman Coding

Estimated probabilities $P(s_{i_1} s_{i_2})$ (rows: $s_{i_1}$, columns: $s_{i_2}$):

      A     B     C     D
A   0.16  0.1   0.1   0.04
B   0.08  0.17  0.04  0.01
C   0.14  0.01  0.01  0.04
D   0.02  0.02  0.05  0.01

Relative source probabilities for the initial Huffman tree:

      A     B     C     D
    0.4   0.3   0.2   0.1

Huffman codes:

      A     B     C     D
∅     0    10   110   111
A     0    10   110   111
B    00     1   010   011
C     0   100   101    11
D    00   010     1   011

BAABD → 10|00|0|10|011
Higher Order Coding Efficiency

Higher Order Coding Efficiency

Let $l(i_1, \dots, i_k, j)$ be the length of the code word for $s_j$ in the context $s_{i_1} \cdots s_{i_k}$.
On average, the length of the next code word in this context will be

$l(i_1, \dots, i_k) = \sum_{j=1}^{m} P(s_j \mid s_{i_1} \cdots s_{i_k}) \, l(i_1, \dots, i_k, j)
                    = \sum_{j=1}^{m} \frac{P(s_{i_1} \cdots s_{i_k} s_j)}{P(s_{i_1} \cdots s_{i_k})} \, l(i_1, \dots, i_k, j)$

Thus, the average code word length $l^{(k)}$ of a k-th order encoding scheme is

$l^{(k)} = \sum_{1 \le i_1, \dots, i_k \le m} P(s_{i_1} \cdots s_{i_k}) \, l(i_1, \dots, i_k)
         = \sum_{1 \le i_1, \dots, i_k \le m} \sum_{j=1}^{m} P(s_{i_1} \cdots s_{i_k} s_j) \, l(i_1, \dots, i_k, j)
         = \sum_{1 \le i_1, \dots, i_{k+1} \le m} P(s_{i_1} \cdots s_{i_{k+1}}) \, l(i_1, \dots, i_{k+1})$

Higher Order Coding Efficiency

Similarly, we can define $H(i_1, \dots, i_k)$, the entropy of the alphabet $S = \{s_1, \dots, s_m\}$ in the context $s_{i_1} \cdots s_{i_k}$, as

$H(i_1, \dots, i_k) = - \sum_{j=1}^{m} P(s_j \mid s_{i_1} \cdots s_{i_k}) \log_2 P(s_j \mid s_{i_1} \cdots s_{i_k})$

The k-th order entropy $H^{(k)}$ is then given by

$H^{(k)} = \sum_{1 \le i_1, \dots, i_k \le m} P(s_{i_1} \cdots s_{i_k}) \, H(i_1, \dots, i_k)$
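
The two formulas translate directly into a short sketch that estimates $H^{(k)}$ from the (k+1)-gram statistics of a sample word; using plain relative frequencies as probabilities is my own simplification:

```python
from collections import Counter
from math import log2

def kth_order_entropy(word, k):
    """Estimate H^(k) of a sample word from its (k+1)-gram frequencies."""
    n = len(word) - k
    joint = Counter(word[i:i + k + 1] for i in range(n))    # ~ P(s_i1 ... s_ik s_j)
    context = Counter(word[i:i + k] for i in range(n))      # ~ P(s_i1 ... s_ik)
    return -sum((c / n) * log2((c / n) / (context[g[:k]] / n))
                for g, c in joint.items())

word = "abracadabra"
print(kth_order_entropy(word, 0), kth_order_entropy(word, 1))   # H^(1) <= H^(0)
```
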
Higher Order Coding Efficiency

Theorem (Shannon bound for higher-order encoding)

There exists a k-th order encoding scheme for which

$l^{(k)} < H^{(k)} + 1$

The average code word length of a uniquely decodable k-th order encoding scheme is never smaller than the k-th order entropy:

$l^{(k)} \ge H^{(k)}$

The proof follows directly from the fact that

$H(i_1, \dots, i_k) \le l(i_1, \dots, i_k) < H(i_1, \dots, i_k) + 1$

holds in each context (compare Learning Unit 01).

Higher Order Coding Efficiency

Remarks

It can be shown that
$H^{(k)} \le H^{(k-1)}$
holds for all $k \ge 1$. Thus, the entropy can only decrease when considering larger contexts.

Higher-order entropy coding is also possible with arithmetic coding.
• This extension is straightforward.

Using probabilistic finite state automata, it is even possible to compute
$H^{(\infty)} = \lim_{k \to \infty} H^{(k)}$
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents
1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter

Run Length Encoding

Run Length Encoding

Consider once more the example

$a \cdots a \, b \cdots b \, c \cdots c \, d \cdots d,$

in which each symbol is repeated $\frac{N}{4} \gg 1$ times.

Standard Huffman coding needs $2N$ bits in total (2 bits per symbol).
Adaptive Huffman Coding uses 1 bit for most a’s, 2 bits for all b’s and d’s, and 3 bits for all c’s, thus requiring approximately $2N$ bits.
Adaptive Huffman Coding with readjustment uses 1 bit for most symbols, thus requiring approximately
$4 \cdot \frac{N}{4} = N$ bits
(e.g. every 4 symbols, multiply with $p = 0.25$).

Problem: This is still a lot of data for a word that simple.
Run Length Encoding

Run Length Encoding

Idea: Count how often each symbol occurs in a row, and transmit this counter.
This idea is used in run length encoding (RLE) or run length coding (RLC).
If the same symbol appears k times in a row, this is called a run of length k, or a run with run length k, respectively.

Run Length Encoding

Run Length Encoding – Variant I

If $S = \{0, 1\}$, transmitting only the counters is sufficient.

Example

The source word

00000 111 000 1 0 1111 000 1111111

is encoded as 5 3 3 1 1 4 3 7.

Remarks

It must be known what the first symbol is.
• Either transmit/store the first bit,
• or use a run length of zero if the first symbol is “unexpected”.

By scanning images in a predefined way (e.g. row-wise), RLE can be used to compress binary images.
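
A small sketch of Variant I. It follows the second convention from the remarks: the word is assumed to start with '0', and a leading run of length zero is emitted otherwise.

```python
from itertools import groupby

def rle_binary_encode(word, first="0"):
    """Encode a binary word as its sequence of run lengths."""
    runs = [len(list(g)) for _, g in groupby(word)]
    return runs if (not word or word[0] == first) else [0] + runs

def rle_binary_decode(runs, first="0"):
    """Invert rle_binary_encode."""
    out, symbol = [], first
    for r in runs:
        out.append(symbol * r)
        symbol = "1" if symbol == "0" else "0"   # runs alternate between 0 and 1
    return "".join(out)

word = "00000" + "111" + "000" + "1" + "0" + "1111" + "000" + "1111111"
print(rle_binary_encode(word))                             # [5, 3, 3, 1, 1, 4, 3, 7]
print(rle_binary_decode(rle_binary_encode(word)) == word)  # True
```
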
Run Length Encoding

Illustration of Run Length Encoding – Variant I

[Figure: a binary image scanned row-wise; the marked runs are 4, 1, 7, 3, 5, 5, 3, 7, 3, 5, 5, 3, 7, 1, 13.]

Using RLE, this binary image is encoded as “4 1 7 3 5 5 3 7 3 5 5 3 7 1 13”.

Run Length Encoding

Run Length Encoding – Variant II

If $|S| > 2$, one can replace runs that are long enough by tokens consisting of:
• an escape sequence (ESC) indicating the start of a run,
• the length of the run,
• the symbol itself.

Example

The source word

CBAAAAABBBACAA

can be encoded as C B (ESC 5 A) (ESC 3 B) A C (ESC 2 A).
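
A sketch of Variant II with an explicit escape symbol. The token layout (ESC, length, symbol) follows the slide; the escape character and the minimum run length are parameters of my own choosing. The call below uses min_run=2 to reproduce the example above, even though the remarks on the next slide argue for a minimum run length of four.

```python
ESC = "\x1b"   # assumed free symbol used as the escape sequence

def rle_escape_encode(word, min_run=4):
    """Replace runs of at least min_run symbols by (ESC, length, symbol) tokens."""
    out, i = [], 0
    while i < len(word):
        j = i
        while j < len(word) and word[j] == word[i]:
            j += 1                              # extend the current run
        run = j - i
        if run >= min_run:
            out.extend([ESC, run, word[i]])     # long run: emit a token
        else:
            out.extend(word[i:j])               # short run: keep the symbols literally
        i = j
    return out

print(rle_escape_encode("CBAAAAABBBACAA", min_run=2))
# ['C', 'B', '\x1b', 5, 'A', '\x1b', 3, 'B', 'A', 'C', '\x1b', 2, 'A']
```
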
Run Length Encoding

Remarks

Runs with a length smaller than four should be ignored (otherwise, the resulting word does not have fewer symbols than before).
• Instead of saving the run length k itself, k − 4 can be stored, as it is known that the symbol occurs at least four times.

If there is no free symbol, repeating one symbol l times can be used as the escape sequence.
• Then, the symbol is already known and need not be transmitted again.
• Instead of saving the run length k itself, k − l can be stored.
• However, the word might get longer.

Run Length Encoding

Examples

The source word

CBAAAAABBBACAA

can be encoded as C B (AAA 2) (BBB 0) A C AA when repeating one symbol three times indicates the start of a run length.

For the source word

CBAAABBBACAA,

the resulting code word C B (AAA 0) (BBB 0) A C AA is longer than the initial word.
Run Length Encoding

Illustration of Run Length Encoding – Variant II

[Figure: a 9 × 8 colour image scanned row-wise.]

Illustration of RLE: The image could be encoded as YY BBB4 Y BBB1 LL BBB3 LLL1 BBB1 LLL3 B MMM0 LLL2 MMM3 LLL0 MMM2 DDD0 L MMM1 DDD2, resulting in 63 characters instead of the 9 · 8 = 72 characters with the trivial encoding (Y = Yellow, B = Blue, L = Light grey, M = Medium grey, D = Dark).

Run Length Encoding

Run Length Encoding – Variant III

Run lengths can also be considered between very common symbols (usually zeros) and uncommon symbols.

Example

The source word

00002001000000000053

could be encoded as 4 2 2 1 10 5 0 3.
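
A sketch of Variant III for integer sequences, where each run of zeros is transmitted as the pair (run length, following non-zero symbol); the pair layout is my own choice consistent with the example:

```python
def rle_zero_runs(symbols):
    """Encode each non-zero symbol as (number of preceding zeros, symbol)."""
    out, zeros = [], 0
    for s in symbols:
        if s == 0:
            zeros += 1
        else:
            out.extend([zeros, s])
            zeros = 0
    return out   # a trailing run of zeros would need extra handling

print(rle_zero_runs([0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 3]))
# [4, 2, 2, 1, 10, 5, 0, 3]
```
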
Run Length Encoding

Encoding the Numbers

The occurring numbers can be encoded in various ways:

Using a fixed number of bits:
• If a run length larger than the largest representable value occurs, a run of length zero is added in between.
• Example (Variant I, with two bits): 011111 is encoded as “01 11 00 10”, i.e. as “1 3 0 2”.

Using Huffman coding (with the same problems as described above).
Using Golomb, Rice, or Fibonacci coding.
Other methods are also possible (see next slide).

Run Length Encoding

RLE in bzip2

In the next section we introduce Bzip2, which uses a special version of RLE.
Bzip2 uses two symbols called RUNA and RUNB in one of its RLE steps:
• A run length is encoded as a sequence of RUNA and RUNB symbols, where the values represented by RUNA and RUNB are summed up.
• The symbol RUNA has a value of $2^{j-1}$ if it is the j-th symbol of the sequence.
• The symbol RUNB has a value of $2^{j}$ if it is the j-th symbol of the sequence.

There are two additional practical benefits to this approach:
• No escape sequences are necessary.
• Since RUNA and RUNB do not appear in the source alphabet, a source symbol indicates the start of a new run.
Run Length Encoding

Example

The word

AAAAABC $\underbrace{A \cdots A}_{43 \text{ times}}$

is transformed to

A 4 B C A 42

Since
$4 = 2 \cdot 1 + 1 \cdot 2$
and
$42 = 2 \cdot 1 + 2 \cdot 2 + 1 \cdot 4 + 2 \cdot 8 + 1 \cdot 16,$
this is encoded as

A RUNB RUNA B C A RUNB RUNB RUNA RUNB RUNA

Run Length Encoding

How to compute this representation in general?

Desired notation for $x \ge 1$: $c_1 c_2 \cdots c_n$ with $c_i \in \{1, 2\}$, such that

$x = c_1 2^0 + c_2 2^1 + \cdots + c_n 2^{n-1}$

This resembles a binary notation, but with a different coefficient set.

Algorithm:
1. Set $r_0 = x$, $i = 1$.
2. Compute $r_i = \left\lceil \frac{r_{i-1}}{2} \right\rceil - 1$.
3. Compute $c_i = r_{i-1} - 2 r_i$.
4. If $r_i > 0$, set $i := i + 1$ and go to step 2.

In the final representation, replace 1 by RUNA and 2 by RUNB.
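
The algorithm translates directly into a few lines of Python; the function name and the token strings are mine:

```python
def runa_runb(x):
    """Return the RUNA/RUNB sequence for a run length x >= 1."""
    tokens, r = [], x
    while r > 0:
        nxt = (r + 1) // 2 - 1                     # step 2: ceil(r/2) - 1
        c = r - 2 * nxt                            # step 3: coefficient in {1, 2}
        tokens.append("RUNA" if c == 1 else "RUNB")
        r = nxt                                    # step 4
    return tokens

print(runa_runb(4))    # ['RUNB', 'RUNA']
print(runa_runb(42))   # ['RUNB', 'RUNB', 'RUNA', 'RUNB', 'RUNA']
```
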
Outline

Learning Unit 04:
Adaptive, Higher Order, and Run Length Coding

Contents
1. Adaptive Entropy Coding
2. Higher Order Entropy Coding
3. Run Length Encoding
4. BWT, MTF, and Bzip2

© 2023 Christian Schmaltz, Pascal Peter

Bzip2

BWT, MTF, and Bzip2

Bzip2 is an open-source compression algorithm widely used in Unix/Linux.
It was written single-handedly by Julian Seward, who is also the author of Valgrind.
It (usually) compresses better than zip/gzip, but is noticeably slower.
It cleverly combines several encoding algorithms we have seen so far.
Bzip, the predecessor, was based on arithmetic coding, but was discontinued due to patent issues.
In addition to RLE and Huffman Coding, we need BWT and MTF.

[Photo: Julian Seward; source: Wikimedia Commons]
Burrows-Wheeler-Transform

Burrows-Wheeler-Transform

Problem: Words of the form ABABAB are not compressed using RLE.

Idea: Sort the word before applying RLE.

Example

Sorting the word ABRACADABRA results in AAAAABBCDRR. The latter can be compressed much better using RLE.

Burrows-Wheeler-Transform

New Problem: Storing/transmitting the information needed to undo the sorting is too expensive.

Idea: Approximate sorting that can be reversed with little or no additional information.
Make use of the fact that the same symbols often appear in the same contexts.
These ideas are used in the so-called Burrows-Wheeler-Transform (BWT) or block sorting (see next slide).
Burrows-Wheeler-Transform

Algorithm: Burrows-Wheeler-Transform

1. Create a quadratic “matrix” containing the source word and all cyclically shifted versions of it.
2. Sort the rows such that they occur in lexicographical ordering.
3. Store/transmit the word in the last column, and the index of the row containing the original word.
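
A direct, deliberately naive sketch of these three steps (real implementations avoid building all rotations explicitly and use suffix sorting instead); the 0-based index is my convention:

```python
def bwt(word):
    """Return (last column, 0-based index of the original word) of the BWT."""
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))  # steps 1-2
    last_column = "".join(r[-1] for r in rotations)                    # step 3
    return last_column, rotations.index(word)

print(bwt("ABRACADABRA"))   # ('RDARCAAAABB', 2)  -- index 2 = third row
```
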

Burrows-Wheeler-Transform

Example

Matrix after step 1       Matrix after sorting
ABRACADABRA               AABRACADABR
BRACADABRAA               ABRAABRACAD
RACADABRAAB               ABRACADABRA  *
ACADABRAABR               ACADABRAABR
CADABRAABRA       −→      ADABRAABRAC
ADABRAABRAC               BRAABRACADA
DABRAABRACA               BRACADABRAA
ABRAABRACAD               CADABRAABRA
BRAABRACADA               DABRAABRACA
RAABRACADAB               RAABRACADAB
AABRACADABR               RACADABRAAB

BWT of the word ABRACADABRA. The word RDARCAAAABB (the last column of the sorted matrix) and an index to the third row (marked *) will be transmitted/stored.
Burrows-Wheeler-Transform

Back-Transformation

1. Sort the symbols of the received word using a stable sorting algorithm.
   (Remark: stable means here that the original position of each symbol in the unsorted word is still known after sorting.)
2. Initialise an index i with the received index i0.
3. Add the symbol in the i-th position of the sorted word to the decoded word.
4. Find the last decoded symbol in the received (unsorted) word. Set i to the index of that symbol.
5. If i = i0, decoding is done.
6. Otherwise, continue with step (3).
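
A sketch of this index-chasing back-transformation; Python's sorted() is stable, which provides exactly the stable sort required in step 1. The index is again 0-based (the slides count rows from 1):

```python
def inverse_bwt(last_column, row_index):
    """Invert the BWT given the last column and the (0-based) row index."""
    # step 1: stable sort; order[j] = position in last_column of the j-th sorted symbol
    order = sorted(range(len(last_column)), key=lambda i: last_column[i])
    decoded, i = [], row_index                    # step 2
    for _ in range(len(last_column)):
        decoded.append(last_column[order[i]])     # step 3: i-th symbol of the sorted word
        i = order[i]                              # step 4: jump to its position in the received word
    return "".join(decoded)

print(inverse_bwt("RDARCAAAABB", 2))   # ABRACADABRA
```
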

Burrows-Wheeler-Transform

Example

sorted | received
  A    |    R
  A    |    D
  A    |    A   *
  A    |    R
  A    |    C
  B    |    A
  B    |    A
  C    |    A
  D    |    A
  R    |    B
  R    |    B

Illustration of reversing the BWT. Received word: RDARCAAAABB, received index: 3. Right column: received word. Left column: sorted symbols. Decoded so far: ABRAC.
Burrows-Wheeler-Transform

Remarks

The BWT was invented by Michael Burrows and David Wheeler in 1994, but initial work was already done by Wheeler in 1983.
It does not reduce the amount of data to be stored/transmitted.
• It only reorders the word.
• Other compression algorithms can benefit from the new order.

It is most useful for very long words.
If the EOF symbol is included in the word to be encoded, the index can be omitted. (Why?)

Move-to-Front Coding

Move-to-Front Coding

Idea: Assume that some symbols appear more frequently in certain parts of the source word.
If there is such a local correlation, this means that recently used symbols are more likely to occur again.
Reduce the entropy by adapting the representation of symbols to this local correlation.
This idea is used in Move-to-Front Coding.

Algorithm

1. Sort all symbols and put them into a table.
2. Encode the next symbol by its position in the table.
3. Move the symbol just encoded to the beginning of the table. (The symbols before this symbol move one step towards the end of the table.)
4. As long as any unencoded symbols are left, continue with step 2.
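
The MTF encoder in a few lines of Python (the list-based table is a sketch; real implementations keep the table update cheaper):

```python
def mtf_encode(word, alphabet):
    """Move-to-Front encoding; the table starts as the sorted alphabet."""
    table = sorted(alphabet)
    codes = []
    for s in word:
        i = table.index(s)               # step 2: position of the symbol in the table
        codes.append(i)
        table.insert(0, table.pop(i))    # step 3: move the symbol to the front
    return codes

print(mtf_encode("RDARCAAAABB", "ABCDR"))   # [4, 4, 2, 2, 4, 2, 0, 0, 0, 4, 0]
```
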
Move-to-Front Coding

Example

table (positions 0 1 2 3 4)   next symbol   code stream
A B C D R                     R             4
R A B C D                     D             4
D R A B C                     A             2
A D R B C                     R             2
R A D B C                     C             4
C R A D B                     A             2
A C R D B                     A             0
A C R D B                     A             0
A C R D B                     A             0
A C R D B                     B             4
B A C R D                     B             0

Encoding the word RDARCAAAABB with MTF coding results in the code word 44224200040.

Move-to-Front Coding

Decoding

Decoding is very similar to encoding:

Algorithm

1. Sort all symbols and put them into a table.
2. Take the symbol at the position indicated by the code word.
3. Move the symbol just decoded to the beginning of the table.
4. As long as any undecoded symbols are left, continue with step 2.
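
The decoder mirrors the encoder almost line by line:

```python
def mtf_decode(codes, alphabet):
    """Invert mtf_encode; encoder and decoder must start from the same table."""
    table = sorted(alphabet)
    symbols = []
    for i in codes:
        s = table[i]                     # step 2: symbol at the indicated position
        symbols.append(s)
        table.insert(0, table.pop(i))    # step 3: move the symbol to the front
    return "".join(symbols)

print(mtf_decode([4, 4, 2, 2, 4, 2, 0, 0, 0, 4, 0], "ABCDR"))   # RDARCAAAABB
```
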
Move-to-Front Coding

Example

table (positions 0 1 2 3 4)   code stream   next symbol
A B C D R                     4             R
R A B C D                     4             D
D R A B C                     2             A
A D R B C                     2             R
R A D B C                     4             C
C R A D B                     2             A
A C R D B                     0             A
A C R D B                     0             A
A C R D B                     0             A
A C R D B                     4             B
B A C R D                     0             B

Decoding the code word 44224200040 with MTF yields RDARCAAAABB (compare slide 42).

Move-to-Front Coding

Remarks

It is not necessary to know the whole code word to start decoding.
As always, any initial ordering can be used, as long as encoder and decoder use the same.
BWT is typically followed by MTF.
Bzip2

BZIP2 in Detail

1. RLE using four equal symbols as escape sequence, and runs of length 4 to 255.
2. BWT
3. MTF
4. RLE using the approach from slide 33
5. Huffman coding (with canonical codes)
   • Up to six different Huffman trees might be used.
   • The tree used can change every 50 symbols.
   • MTF with unary coding is used to encode the Huffman trees used.
   • The code word lengths are stored using delta encoding: only the difference between consecutive code word lengths is stored.
   • A sparse bit array is used to indicate which symbols occur: the 256 possible symbols are assigned to 16 blocks. 16 bits are used to indicate which blocks are empty, as well as 16 bits for each non-empty block.

Bzip2

Remarks on Bzip2

You can download a C implementation at www.bzip.org.
It is useful for daily compression tasks:
• around 20% worse than state-of-the-art techniques (PAQ),
• usually quite a lot faster (around a factor of 1000).

Julian Seward called the first run length encoding step a mistake in hindsight.
• It does not contribute to compression performance and is mostly irrelevant.
• Instead, it was designed to protect the BWT from too many repeating symbols.
• However, this could be done more efficiently by modifying the BWT.
Summary

Summary

Adaptive coding: no overhead, adjustment to local distribution.
Higher order coding: uses preceding symbols to define conditional probabilities.
Run length encoding (RLE) counts runs of the same symbols.
The Burrows-Wheeler-Transform (BWT) reorders the source word such that symbols are ordered with respect to their context.
Move-to-Front coding (MTF) remaps symbols to low integers (mostly zeroes).
Bzip2 illustrates that a clever combination of different approaches can yield a very good compression algorithm.

Outlook

BWT and MTF change the codeword/alphabet in order to improve the coding.
Can we find other ways to help the entropy coders?

References

References

D. Hankerson, G. A. Harris, P. D. Johnson Jr. Introduction to Information Theory and Data Compression. Second edition. Chapman & Hall/CRC, 2003.
(Higher-order coding, probabilistic finite state automata)

T. Strutz. Bilddatenkompression. Fourth edition. Vieweg+Teubner, 2009.
(Run-length encoding, BWT, MTF; in German)

K. Sayood. Introduction to Data Compression. Morgan Kaufmann, 2006.
(BWT, MTF)

M. Burrows and D. J. Wheeler. A Block-Sorting Lossless Data Compression Algorithm. SRC Research Report, 1994.
(BWT, MTF, implementation details)
