Unit 2 - Part 7
Coding Information Sources
Arun Kumar Singh
18 January 2017
We discuss two source coding algorithms to compress messages (a message is a sequence of symbols).
The first, Huffman coding, is efficient when one knows the probabilities of the different symbols making
up a message, and each symbol is drawn independently from some known distribution. The second,
Lempel-Ziv-Welch (LZW), is an adaptive compression algorithm that does not assume any knowledge
of the symbol probabilities. Both Huffman codes and LZW are widely used in practice and are part
of many real-world standards such as GIF, JPEG, MPEG, and MP3.
The compression algorithm assumes that the input is a file or a buffer and the output is a file or a
communication channel. Conversely, the decompression algorithm assumes that the input is a file or a
communication channel and the output is a file or a buffer.
As the message to be encoded is processed, the LZW algorithm builds a dictionary that maps
symbol sequences to and from an $N$-bit index. The dictionary has $2^N$ entries, and the transmitted code can
be used at the decoder as an index into the dictionary to retrieve the corresponding original symbol
sequence. The sequences stored in the dictionary can be arbitrarily long. The algorithm is designed so
that the decoder can reconstruct the dictionary from the information in the encoded stream; the
dictionary, while central to the encoding and decoding process, is never transmitted. This property
is crucial to understanding the LZW method.
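For concreteness, a small Python sketch of this dictionary under our own choice of N = 12 (the names are ours, not from the text):

N = 12                                             # width in bits of a transmitted code word
CAPACITY = 2 ** N                                  # 2**12 = 4096 dictionary entries in total
dictionary = {bytes([i]): i for i in range(256)}   # one entry per byte, codes 0-255
# indices 256 .. CAPACITY - 1 stay free for sequences found in the message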
2.1 Encoding Procedure
2.1.1 Algorithm
Step 1. Initialize the dictionary to contain one entry for each byte. Initialize the encoded string with
the first byte of the input stream.
Step 2. Read the next byte from the input stream.
Step 3. If there are no more bytes, output the code word for the encoded string and exit.
Step 4. If concatenating the byte to the encoded string produces a string that is in the dictionary:
the concatenated string becomes the encoded string; go to Step 2.
Step 5. If concatenating the byte to the encoded string produces a string that is not in the
dictionary: output the code word for the encoded string; add the concatenated string to the
dictionary; set the encoded string equal to the byte; go to Step 2.
2.1.2 Pseudo-code
set w = NIL
loop while there are input characters
    read a character k
    if wk exists in the dictionary
        w = wk
    else
        output the code for w
        add wk to the dictionary
        w = k
endloop
output the code for w
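A minimal Python sketch of this encoder, under our own assumptions: byte-oriented input, and an unbounded dictionary rather than one capped at $2^N$ entries (the name lzw_encode is hypothetical):

def lzw_encode(data: bytes) -> list[int]:
    # Step 1: one dictionary entry per byte, codes 0-255.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    codes = []
    w = data[:1]                        # the encoded string starts as the first byte
    for k in data[1:]:
        wk = w + bytes([k])
        if wk in dictionary:            # greedy: keep extending the match
            w = wk
        else:
            codes.append(dictionary[w])     # output the code for w
            dictionary[wk] = next_code      # remember the new sequence
            next_code += 1
            w = bytes([k])
    if w:
        codes.append(dictionary[w])     # flush the final encoded string
    return codes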
2.2 Decoding Procedure
LZW data is decoded essentially in the reverse of how it is encoded. The dictionary is again initialized so that
it contains an entry for each byte. Instead of maintaining an encoded string, the decoding algorithm
maintains the last code word and the first character of the string it encodes. New code words are read
from the input stream one at a time, and the string encoded by each new code word is written to the output.
During the encoding process, the code word prior to the current one was written because concatenating
the first character of the string encoded by the current code word to the string encoded by the prior
code word produced a string that was not in the dictionary. When that happened, the string formed by
the concatenation was added to the dictionary. The same string needs to be added to the dictionary when decoding.
2.2.1 Algorithm
Step 1. Initialize the dictionary to contain one entry for each byte.
Step 2. Read the first code word from the input stream and write out the byte it encodes.
Step 3. Read the next code word from the input stream.
Step 4. If there are no more code words, exit.
Step 5. Write out the string encoded by the code word.
Step 6. Concatenate the first character of the string encoded by the new code word to the string
encoded by the previous code word and add the resulting string to the dictionary.
Step 7. Go to Step 3.
2.2.2 Pseudo-code
read a fixed-length token k (code or character)
output k
w = k
loop while there are input tokens
    read a fixed-length token k
    if k exists in the dictionary
        entry = dictionary entry for k
    else
        entry = w + first char of w    (exception: code not yet in the dictionary)
    output entry
    add w + first char of entry to the dictionary
    w = entry
endloop
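A matching Python sketch of the decoder, under the same assumptions (lzw_decode is again a hypothetical name, and a non-empty code stream is assumed); the else branch handles the exception case demonstrated in Example 1 below:

def lzw_decode(codes: list[int]) -> bytes:
    # Step 1: one dictionary entry per byte, codes 0-255.
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    w = dictionary[codes[0]]            # the first code word always encodes a byte
    out = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        else:                           # exception: code not yet in the dictionary
            entry = w + w[:1]           # old string + its own first character
        out.append(entry)
        dictionary[next_code] = w + entry[:1]   # the same entry the encoder added
        next_code += 1
        w = entry
    return b"".join(out)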
1. The encoder algorithm is greedy: it is designed to find the longest possible match in the dictionary
before it makes a transmission.
2. The dictionary is filled with sequences actually found in the message stream. No encodings are
wasted on sequences that never actually occur.
3. A common choice for the size of the dictionary is 4096 entries (N = 12). A larger table gives the encoder
a longer memory for sequences it has seen and increases the possibility of discovering repeated
sequences across longer spans of the message. However, dedicating dictionary entries to remembering
sequences that will never be seen again decreases the efficiency of the encoding. Note also that with
N = 12 the code words are not byte-aligned, which affects how they are written out (see the sketch below).
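As an aside on item 3 (our own illustration, not prescribed by the text): with N = 12 a code word occupies a byte and a half, so two consecutive code words are commonly packed into three bytes, for example:

def pack12(a: int, b: int) -> bytes:
    # Pack two 12-bit code words into three bytes (big-endian convention here;
    # real formats such as GIF make their own packing choices).
    return bytes([a >> 4, ((a & 0xF) << 4) | (b >> 8), b & 0xFF])

assert pack12(0xABC, 0xDEF) == bytes([0xAB, 0xCD, 0xEF])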
The efficiency of a source code is defined as
$$\eta = \frac{H(S)}{L},$$
where $H(S)$ denotes the entropy of the source and $L$ denotes the average length of the codeword for
the given source code.
The redundancy of a source code is defined as
$$\gamma = 1 - \eta.$$
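As a quick illustration (a hypothetical source, not one taken from the text): let the source emit symbols $a, b, c$ with probabilities $\frac{1}{2}, \frac{1}{4}, \frac{1}{4}$, coded with lengths $1, 2, 2$ (a Huffman code for this source). Then
$$H(S) = \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{4}\log_2 4 = 1.5 \text{ bits}, \qquad L = \tfrac{1}{2}\cdot 1 + \tfrac{1}{4}\cdot 2 + \tfrac{1}{4}\cdot 2 = 1.5,$$
so $\eta = H(S)/L = 1$ and $\gamma = 1 - \eta = 0$; the code has no redundancy.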
Example 1:
Consider the string abcabcabcabcabcabc to demonstrate the encoding and decoding process.
NOTE: The decoding process demonstrates the exception case and how it is handled. (New encoded
string = old encoded string + the first character of the old encoded string = abc + a = abca.)
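Running the Python sketches from earlier on this string (lzw_encode and lzw_decode are the hypothetical helpers defined above) makes the exception visible:

message = b"abcabcabcabcabcabc"
codes = lzw_encode(message)
print(codes)    # [97, 98, 99, 256, 258, 257, 259, 262, 257]
# Code 262 (abca) arrives one step before the decoder has added entry 262,
# which is exactly the exception case handled in lzw_decode.
assert lzw_decode(codes) == message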
Table 2: Decoder Dictionary

Code read   String output      New dictionary entry
a           a                  (none)
b           b                  256 = ab
c           c                  257 = bc
256         ab                 258 = ca
258         ca                 259 = abc
257         bc                 260 = cab
259         abc                261 = bca
262         abca (exception)   262 = abca
257         bc                 263 = abcab