
LZW Compression and Decompression: December 4, 2015

This document provides an overview of LZW compression and decompression. It describes how LZW works by building a dictionary of strings and assigning codes to strings, allowing strings to be encoded with single codes rather than individual characters. This leads to data compression. Decompression involves rebuilding the dictionary and looking up codes to reconstruct the original strings. The advantages of LZW include fast speeds and lossless compression, while disadvantages relate to implementation complexity and storage needs depending on data.

Uploaded by

tafzeman891

LZW COMPRESSION AND

DECOMPRESSION

December 4, 2015

Contents
1 INTRODUCTION

2 CONCEPT

3 COMPRESSION

4 DECOMPRESSION:

5 ADVANTAGES OF LZW:

6 DISADVANTAGES OF LZW:

1 INTRODUCTION
LZW stands for Lempel-Ziv-Welch, after Abraham Lempel, Jacob Ziv, and Terry
Welch. In 1977, Lempel and Ziv published a paper on “sliding-window”
compression, followed in 1978 by a paper on “dictionary”-based compression;
the two schemes are known as LZ77 and LZ78, respectively. In 1984, Welch
published a refinement of the LZ78 algorithm, which became known as the LZW
compression algorithm. The algorithm is very simple to implement.

2 CONCEPT
Many real-world files, especially text files, contain certain strings that
repeat very often, for example “ the ”, “of”, “on”, etc. With its two spaces,
the string “ the ” takes 5 bytes, or 40 bits, to encode. But suppose we add
the whole string to the list of characters, after the last one, at index 256.
Then every time we come across the string “ the ”, we can send the code 256
instead of 32, 116, 104, etc. This takes 9 bits instead of 40 bits.

This is the idea behind LZW compression. It starts with a “dictionary” of
all the single characters, with indexes from 0 to 255. It then expands the
dictionary as information gets sent through. Pretty soon, frequently repeated
strings are each encoded as a single code, and compression has occurred.

LZW compression replaces strings of characters with single codes. It does
not analyze the input text in advance. Instead, it adds every new string of
characters it sees to a table of strings. Compression occurs when a single
code is output instead of a string of characters. The codes that the LZW
algorithm outputs can be of any length, but they must have more bits than a
single character.

The first 256 codes (when using 8-bit characters) are by default allocated to
the standard character set. The remaining codes are assigned to strings as the
algorithm proceeds. The sample program runs with 12-bit codes. This means
codes from 0 to 255 refer to individual bytes, while codes from 256 to 4095
refer to substrings.
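
The bit counts quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and placing “ the ” directly at index 256 mirrors the thought experiment in the text rather than what a real LZW run would do (a real run builds up to longer strings one character at a time).

```python
# Illustrative arithmetic only; all names here are hypothetical.
phrase = " the ".encode("ascii")       # 5 characters, including both spaces
raw_bits = len(phrase) * 8             # 8 bits per character

# Initial dictionary: one entry per possible byte value, codes 0..255.
dictionary = {bytes([i]): i for i in range(256)}

# Thought experiment from the text: store the phrase as the next entry.
dictionary[phrase] = len(dictionary)   # next free index is 256
code = dictionary[phrase]

code_bits = code.bit_length()          # bits needed to write the code
max_code_12bit = (1 << 12) - 1         # largest code when codes are 12 bits

print(raw_bits, code, code_bits, max_code_12bit)   # 40 256 9 4095
```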

3 COMPRESSION
LZW starts with a dictionary of 256 characters (in the case of 8-bit
characters) and uses those as the “standard” character set. It then reads
data 8 bits at a time (e.g., ‘a’, ‘b’, etc.) and encodes the input data as
numbers that represent indexes in the dictionary. Whenever it comes across a
new substring (e.g., “ab”), it outputs the code of the part it already knows
and adds the new substring to the dictionary. Whenever it comes across a
substring it has already seen, it just reads in a new character and
concatenates it with the current string to get a new substring. The next time
LZW revisits a substring, it is encoded using a single number. Usually a
maximum number of entries (say, 4096) is defined for the dictionary, so that
the process doesn’t run away with memory. Thus, the codes which take the
place of the substrings in this example are 12 bits long (2^12 = 4096). It is
necessary for the codes to be longer in bits than the characters (12 vs. 8
bits), but since many frequently occurring substrings are replaced by a
single code, compression is still achieved overall.
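
The compression loop described above can be sketched in Python as follows. This is a minimal sketch, not the document's sample program: the function name `lzw_compress` and the `max_entries` parameter are assumptions, the default of 4096 entries assumes 12-bit codes (codes 0 to 4095), and the codes are returned as a list of integers rather than packed into bits.

```python
from typing import List


def lzw_compress(data: bytes, max_entries: int = 4096) -> List[int]:
    """Encode data as a list of LZW codes (hypothetical helper)."""
    # Start with the "standard" character set: codes 0..255.
    dictionary = {bytes([i]): i for i in range(256)}
    current = b""            # longest already-seen substring so far
    codes = []
    for value in data:       # read the input 8 bits at a time
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Already seen: keep extending the current string.
            current = candidate
        else:
            # New substring: emit the code for the known part ...
            codes.append(dictionary[current])
            # ... and remember the new substring, unless the table is full.
            if len(dictionary) < max_entries:
                dictionary[candidate] = len(dictionary)
            current = bytes([value])
    if current:
        codes.append(dictionary[current])   # flush the final string
    return codes


print(lzw_compress(b"abababab"))   # [97, 98, 256, 258, 98]
```

Here eight input bytes become five codes: 256 stands for “ab” and 258 for “aba”, both of which were added to the dictionary while the input was being read.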

4 DECOMPRESSION:
The decompression process for LZW is also very simple. In addition, it has an
edge over static compression methods because no dictionary or other
pre-existing information has to be supplied to the decoding algorithm: a
dictionary identical to the one created during compression is rebuilt during
the process. Both the encoding and decoding programs must start with the same
initial dictionary, in this scenario all 256 possible 8-bit characters.

Here’s how it works. The LZW decoder first reads in an index, looks up the
index in the dictionary, and returns the substring associated with the index.
The first character of this substring is appended to the current working
string. This new concatenation is added to the dictionary. The decoded string
then becomes the current working string (the current index, i.e. the
substring, is remembered), and the process repeats.
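
A minimal Python sketch of this decoding loop is given below. The name `lzw_decompress` and the `max_entries` parameter are assumptions made for illustration, and the input is taken to be a plain list of integer codes. One branch goes slightly beyond the steps listed above: a code may arrive one step before the decoder has created its entry, and the sketch then builds the entry from the current working string.

```python
from typing import List


def lzw_decompress(codes: List[int], max_entries: int = 4096) -> bytes:
    """Decode a list of LZW codes back into bytes (hypothetical helper)."""
    # Same initial dictionary as the encoder: codes 0..255.
    dictionary = {i: bytes([i]) for i in range(256)}
    if not codes:
        return b""
    current = dictionary[codes[0]]    # current working string
    output = [current]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):
            # Code not defined yet: it must be the current string
            # followed by its own first character.
            entry = current + current[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        # New entry: current working string + first character of the
        # substring just decoded (mirrors what the encoder added).
        if len(dictionary) < max_entries:
            dictionary[len(dictionary)] = current + entry[:1]
        current = entry
    return b"".join(output)


print(lzw_decompress([97, 98, 256, 258, 98]))   # b'abababab'
```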

So, the encoded output starts out 0,1,2,4,... . When we start trying to
decode, a problem might arise. (In the table, keep in mind that the Current
String is just the substring that was decoded in the last iteration of the
loop. Also, the New Dictionary Entry is created by appending the first
character of the new Dictionary Translation to the Current String.)
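
The kind of problem meant here can be illustrated with a different, hypothetical input (not the example whose output starts 0,1,2,4). Assume the bytes “abababab” were encoded as the codes 97, 98, 256, 258, 98, and that the decoder has already processed 97, 98 and 256. The sketch below sets up that decoder state by hand and shows that the next code, 258, is not in the dictionary yet, and how its translation can still be worked out.

```python
# Decoder state after reading the codes 97, 98, 256 (assumed input).
dictionary = {i: bytes([i]) for i in range(256)}
dictionary[256] = b"ab"    # added while decoding 98
dictionary[257] = b"ba"    # added while decoding 256
current = b"ab"            # Current String: decoded in the last iteration

code = 258                 # next code from the encoder
print(code in dictionary)  # False: the decoder has not built entry 258 yet

# The encoder created entry 258 as (current string + next character),
# and that next character was the first character of entry 258 itself.
# Since entry 258 starts with the current string, that character is the
# first character of the current string.
translation = current + current[:1]
dictionary[code] = translation

print(translation)         # b'aba'
```

In other words, when the decoder meets a code equal to the next free index, the translation is the Current String followed by its own first character.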

5 ADVANTAGES OF LZW:
• LZW compression is very fast.
• It is a lossless compression technique.
• The algorithm is very simple to implement.
• There is no need to analyze the incoming text.
• The whole algorithm can be expressed in only a dozen lines.
• LZW excels on data streams that contain repeated strings. Because of this,
it does extremely well at compressing English text, where a compression ratio
of 50 percent or more can be expected.
• For any fixed stationary source, the LZW algorithm asymptotically performs
as well as if it had been designed for that source.
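
The effect of repeated strings on the output size can be checked with a short experiment. The sketch below is an assumption-laden illustration: `count_lzw_codes` is a hypothetical helper that only counts the codes a basic LZW encoder would emit, every code is assumed to cost 12 bits, and the test input is artificially repetitive, so the result says nothing about ordinary English text.

```python
def count_lzw_codes(data: bytes, max_entries: int = 4096) -> int:
    """Count the codes a basic LZW encoder would emit (hypothetical helper)."""
    known = {bytes([i]) for i in range(256)}   # strings in the table so far
    current = b""
    count = 0
    for value in data:
        candidate = current + bytes([value])
        if candidate in known:
            current = candidate
        else:
            count += 1                         # one code for `current`
            if len(known) < max_entries:
                known.add(candidate)
            current = bytes([value])
    if current:
        count += 1                             # final code
    return count


# Artificially repetitive sample input (an assumption for the demo).
text = b"the quick brown fox jumps over the lazy dog " * 50
raw_bits = len(text) * 8
compressed_bits = count_lzw_codes(text) * 12   # assuming 12-bit codes
ratio = compressed_bits / raw_bits             # compressed size / original size
print(f"{raw_bits} bits -> {compressed_bits} bits (ratio {ratio:.2f})")
```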

6 DISADVANTAGES OF LZW:
• Although the algorithm is pretty simple, its implementation is complicated,
mainly because of the management of the string table.
• Files that do not contain any repetitive data at all cannot be compressed
much.
• The method is good at text files but not as good at other types of files.
• The amount of storage needed is indeterminate, as it depends on the total
length of all the strings.
• Searching the strings is also a problem. Each time a new character is read
in, the algorithm has to search for the new string formed by
string+character.
• Each and every time a new character is read in, the string table has to be
searched for a match. If a match is not found, then a new string has to be
added to the string table. This causes two problems. First, the string table
can get very large very fast. If string lengths average even as low as three
or four characters each, the overhead of storing a variable-length string and
its code could easily reach seven or eight bytes per code.
