Dictionary Methods: Introduction To Lempel-Ziv Encoding
Dictionary Methods
20-02-2023
Dictionary Coding
Categorization of Dictionary-Based
Coding Techniques
• The heart of dictionary coding is the formulation of the
dictionary.
• A successfully built dictionary results in data compression;
the opposite case may lead to data expansion.
• According to the ways in which dictionaries are
constructed, dictionary coding techniques can be classified
as static or adaptive.
a a c a a c a b c a b a b a c
[Figure: the window over the text is split into the dictionary
(previously coded text) and the lookahead buffer.]
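LZ77: Example
Message: a a c a a c a b c a b a a a c
Each codeword is a triple (p, l, c): p is the distance back to the
start of the match, l is the match length, and c is the next
character after the match.
Step   Codeword   Text coded by this step
1      (_,0,a)    a
2      (1,1,c)    a c
3      (3,4,b)    a a c a b
4      (3,3,a)    c a b a
5      (1,2,c)    a a c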
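The example above can be reproduced with a short Python sketch. The window parameters `dict_size=6` and `max_match=4` are assumptions chosen so that the output agrees with the codewords shown; the function name `lz77_encode` is hypothetical, and p is written as 0 where the example writes _.

```python
def lz77_encode(text, dict_size=6, max_match=4):
    """Greedy LZ77 sketch: emit (p, l, c) triples.

    dict_size and max_match are assumed values, not taken from the slides.
    """
    out = []
    i = 0
    while i < len(text):
        best_p, best_l = 0, 0
        start = max(0, i - dict_size)
        # Always leave one character to send as the explicit next character c.
        limit = min(max_match, len(text) - i - 1)
        for j in range(start, i):
            l = 0
            # The match may run past position i (overlap with the lookahead).
            while l < limit and text[j + l] == text[i + l]:
                l += 1
            if l > best_l:
                best_p, best_l = i - j, l
        out.append((best_p, best_l, text[i + best_l]))
        i += best_l + 1
    return out


codewords = lz77_encode("aacaacabcabaaac")
print(codewords)
```

With these assumed window sizes the sketch yields the same five codewords as the example. A real encoder would use much larger windows and a faster match search.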
LZ77 Decoding
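• The decoder keeps the same dictionary window as the encoder.
• For each codeword, it looks up the match in the dictionary and
appends a copy to the end of the string.
• What if l > p? (Only part of the match is in the
dictionary.)
• E.g. dict = aac, codeword = (3,4,b)
• Copy one character at a time: the first 3 characters (a a c) come
from the dictionary, and the 4th (a) is the first character this
copy has just written. Appending b gives a a c a a c a b.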
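A minimal Python sketch of the decoding rule, assuming codewords are (p, l, c) triples with p measured back from the end of the decoded text. The name `lz77_decode` is hypothetical. Copying one character at a time is what makes the l > p case work.

```python
def lz77_decode(codewords):
    """Rebuild the text from (p, l, c) triples (sketch)."""
    out = []
    for p, l, c in codewords:
        start = len(out) - p
        # Character-by-character copy: when l > p, later reads pick up
        # characters appended earlier in this same copy.
        for k in range(l):
            out.append(out[start + k])
        out.append(c)
    return "".join(out)


print(lz77_decode([(0, 0, "a"), (1, 1, "c"), (3, 4, "b"), (3, 3, "a"), (1, 2, "c")]))
```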
Variant with flagged codewords: (1, c) codes a literal character c;
(0, p, l) codes a match of length l starting p positions back.
a a c a a c a b c a b a a a c
(1,a)     a
(1,c)     c
(0,3,4)   a a c a
LZ77: Exercise
• Encode the following string using LZ77, then decode the
result:
• ‘c a b r a c a d a b r a r r a r r a d’
Comparison to Lempel-Ziv 78
• Both LZ77 and LZ78 and their variants keep a
“dictionary” of recent strings that have been
seen.
• The differences are:
▫ How the dictionary is stored (LZ78 is a table)
▫ How it is extended (LZ78 only extends an existing entry
by one character)
▫ How it is indexed (LZ78 indexes the entries in the table)
▫ How elements are removed
LZ78: Example
Input string: ABBCBCABABCAABCAAB
Index   Entry   Output
1       A       (0,A)
2       B       (0,B)
3       BC      (2,C)
4       BCA     (3,A)
5       BA      (2,A)
6       BCAA    (4,A)
7       BCAAB   (6,B)
DCE//SEM VI//EXTC//Dr. Vishakha Kelkar
Encoded output for ABBCBCABABCAABCAAB:
<(0,A)><(0,B)><(2,C)><(3,A)><(2,A)><(4,A)><(6,B)>
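The LZ78 rule (extend an existing entry by one character, output the entry's index plus the new character) can be sketched in Python as follows. The names `lz78_encode` and `lz78_decode` are hypothetical; index 0 stands for the empty string, and a leftover match at the end of the input is emitted with an empty character, which is one possible convention rather than a fixed part of the method.

```python
def lz78_encode(text):
    """Emit (index, char) pairs; index 0 means the empty prefix (sketch)."""
    dictionary = {}  # phrase -> index, numbered from 1
    out = []
    w = ""
    for ch in text:
        if w + ch in dictionary:
            w += ch  # keep extending the current match
        else:
            out.append((dictionary.get(w, 0), ch))
            dictionary[w + ch] = len(dictionary) + 1
            w = ""
    if w:
        # Assumed convention: flush a trailing match with no new character.
        out.append((dictionary[w], ""))
    return out


def lz78_decode(pairs):
    """Rebuild the text; the table is reconstructed while decoding."""
    phrases = [""]  # phrases[0] is the empty prefix
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


pairs = lz78_encode("ABBCBCABABCAABCAAB")
print(pairs)
print(lz78_decode(pairs))
```

The decoder never receives the table; it grows the same list of phrases in the same order as the encoder.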
[Figures: step-by-step trace, each starting from W = empty, C = empty.]
Exercise: LZ78
Use LZ78 to trace the encoding of the string
SATATASACITASA.
Then decode the result.
LZW: Encoding example
Message: a b a b a b a b a
Initial dictionary: 0 = a, 1 = b

Step   Longest match   Code output   New dictionary entry
1      a               0             2 = ab
2      b               1             3 = ba
3      ab              2             4 = aba
4      ab (entry 2 matches again; the match is still being extended)

Codes so far: 0 1 2
STRING = b
CHAR = a
STRING+CHAR = ba is in the dictionary (entry 3), so STRING becomes ba.
When ba cannot be extended further, output the code for the STRING: 3
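The STRING/CHAR procedure traced above can be sketched in Python. The function name `lzw_encode` and the `alphabet` parameter are hypothetical; the sketch assumes every message symbol appears in the initial alphabet and lets the dictionary grow without limit.

```python
def lzw_encode(message, alphabet):
    """LZW encoder sketch: returns the list of output codes."""
    # Initial dictionary: one entry per alphabet symbol, numbered from 0.
    dictionary = {sym: i for i, sym in enumerate(alphabet)}
    codes = []
    string = ""
    for char in message:
        if string + char in dictionary:
            string += char  # keep extending the match
        else:
            codes.append(dictionary[string])             # output code for STRING
            dictionary[string + char] = len(dictionary)  # add STRING+CHAR
            string = char
    if string:
        codes.append(dictionary[string])  # flush the final match
    return codes


print(lzw_encode("ababababa", "ab"))
```

For the message a b a b a b a b a this gives the codes 0 1 2 4 3, which agrees with the first three codes of the trace and ends with code 3 for ba.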
LZW: Decoding
Initialize Dictionary
Input code c
Decode code c (index) to w
Output decoded string w
Put w? in Dictionary (? is the next symbol, not yet known)
REPEAT
  a) Input code c
     Decode the 1st symbol s1 of the code c
     Complete the previous Dictionary entry with s1
  b) Finish decoding the remainder of the code c
     Output decoded string w
     Put w? in Dictionary
UNTIL no more codes
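The pseudocode above can be sketched in Python as follows. The name `lzw_decode` and the `alphabet` parameter are hypothetical. Instead of storing a pending entry "w?", this sketch adds the completed entry (previous string plus s1) once s1 is known, which is the same step expressed differently.

```python
def lzw_decode(codes, alphabet):
    """LZW decoder sketch: rebuilds the dictionary while decoding."""
    if not codes:
        return ""
    dictionary = list(alphabet)  # index -> string
    w = dictionary[codes[0]]
    out = [w]
    for c in codes[1:]:
        if c < len(dictionary):
            entry = dictionary[c]
        elif c == len(dictionary):
            # The code refers to the entry still pending (w?): its first
            # symbol must equal the first symbol of w, so ? = w[0].
            entry = w + w[0]
        else:
            raise ValueError(f"invalid code {c}")
        dictionary.append(w + entry[0])  # complete the previous entry with s1
        out.append(entry)
        w = entry
    return "".join(out)


print(lzw_decode([0, 1, 2, 4, 3, 6], "ab"))
```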
LZW: Decoding example
Codes: 0 1 2 4 3 6
Initial dictionary: 0 = a, 1 = b

Code   Decoded string   Entry completed   Pending entry
0      a                                  2 = a?
1      b                2 = ab            3 = b?
2      ab               3 = ba            4 = ab?
4      aba              4 = aba           5 = aba?
3      ba               5 = abab          6 = ba?
6      bab              6 = bab           7 = bab?

Decoded message: a b ab aba ba bab

Codes 4 and 6 arrive while their own entries are still pending (ab?
and ba?). The decoded string starts with the same symbol as the
pending entry, so ? is that symbol: ab? becomes aba and ba? becomes
bab.
Exercise: LZW
Example
Consider the following 4 x 4, 8-bit image:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

Initial Dictionary
Dictionary Location   Entry
0                     0
1                     1
.                     .
255                   255
256                   -
.                     .
511                   -
Example
- Is 39 in the dictionary? Yes.
- What about 39-39? No.
- Then output 39 and add 39-39 as entry 256.

Dictionary Location   Entry
0                     0
1                     1
.                     .
255                   255
256                   39-39
.                     .
511                   -
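The image example can be carried through to the end with a short Python sketch. It assumes the pixels are scanned row by row and that each code takes 9 bits, since the dictionary has 512 locations; the function name `lzw_encode_pixels` is hypothetical.

```python
def lzw_encode_pixels(pixels, bits=9):
    """LZW over 8-bit pixel values with a 2**bits-entry dictionary (sketch)."""
    # Locations 0..255 hold the single gray levels, as in the initial dictionary.
    dictionary = {(v,): v for v in range(256)}
    max_entries = 1 << bits  # 512 locations for 9-bit codes
    codes = []
    w = ()
    for p in pixels:
        candidate = w + (p,)
        if candidate in dictionary:
            w = candidate
        else:
            codes.append(dictionary[w])
            if len(dictionary) < max_entries:
                dictionary[candidate] = len(dictionary)  # first new entry is 256
            w = (p,)
    if w:
        codes.append(dictionary[w])
    return codes


image = [[39, 39, 126, 126] for _ in range(4)]
pixels = [v for row in image for v in row]  # assumed row-by-row scan
codes = lzw_encode_pixels(pixels)
print(codes)  # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
print(len(codes) * 9, "bits instead of", len(pixels) * 8)  # 90 bits instead of 128
```

Under these assumptions the 16 pixels (128 bits) are coded as 10 nine-bit codes (90 bits).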
Decoding LZW
• The dictionary used for encoding need not be sent with the
image; the decoder rebuilds the same dictionary from the codes
as it decodes.
LZW: Notes
• Extremely effective when the data contains repeated patterns
that are widely spread
• Drawback: creates dictionary entries that may never be used
• Applications: TIFF, V.42bis modem standard
DICTIONARY METHODS
• Adapt well to changes in the file (e.g. a tar file with many
file types within it).
• Early algorithms did not use probability coding and
compressed poorly. More modern versions (e.g. gzip) use
probability coding as a “second pass” and compress much
better.
• The algorithms are becoming outdated, but their ideas are used
in many newer algorithms.