
DATA REDUCTION TECHNIQUES FOR GRAPHICS

A project report submitted in partial fulfillment of the Bachelor of Computer Engineering

Year: 2006

Prepared by:                         Guided by:
Radadia Archana G.                   Mrs. Vaikhari Deodhar
Roll No. 46                          Mrs. Mayuri Mehta

DEPARTMENT OF COMPUTER ENGINEERING


SARVAJANIK COLLEGE OF ENGINEERING AND TECHNOLOGY
DR. R.K. DESAI ROAD, ATHWALINES
SURAT - 395001

Veer Narmad South Gujarat University


Surat
2006



SARVAJANIK COLLEGE OF ENGINEERING & TECHNOLOGY
Dr. R. K. Desai Road, Athwalines, Surat – 395 001, Gujarat, INDIA

Computer Department

CERTIFICATE

This is to certify that the project report entitled "Data Reduction
Techniques for Graphics" is prepared by Ms. Radadia Archana
G., Roll No. 46 of B.E. IV (7th Semester), Computer Engineering
Department, who has satisfactorily completed her work under the
guidance of Mrs. Mayuri Mehta and Ms. Vaikhari Deodhar in the
year 2006.

Place: Surat
Date :

Signature of Guide        Signature of Co-Guide        Head of Department
                                                       (Computer Department)

Signature of Jury Members



ACKNOWLEDGEMENT

I am very grateful to my guide Mrs. Vaikhari Deodhar and co-guide Mrs.
Mayuri Mehta for giving me the right support and valuable guidance during the
preparation of this report. They helped me resolve many doubts and suggested many
references so that I could emerge with a polished presentation. I am thankful to them
for giving me their valuable time.

I would also like to offer my gratitude to the DIC, Mr. Keyur Rana of the
Computer Engineering Department, who helped me with valuable suggestions and
encouragement, which not only helped me in preparing this report but also gave me a
better insight into this field.



ABSTRACT

One of the reasons that the World Wide Web has become so popular is that it
allows for the easy display of graphics. While most of the information content published on
the Web is still textual in nature, graphics make pages more visually appealing,
supplement textual information, and supply information that can only be conveyed
visually. There must be some way to make the transmission of images on the Web faster.
This report therefore covers some basic and universal techniques for reducing
graphics file size in order to reduce traffic on the net. Two universal graphics file formats
for the Web, GIF and JPEG, are discussed, and algorithms for these reduction
techniques are given. Some features of the GIF and JPEG file formats are also
discussed.



INDEX

NO.  CHAPTER
1    Introduction
2    Data Compression
     2.1  What is Data Compression?
     2.2  Data Compression Methods
3    Types of Compression
     3.1  Lossless Compression
     3.2  Lossy Compression
4    Lossless Algorithms
     4.1  Probability Coding
     4.2  Dictionary-Based Algorithms
5    Lossy Algorithms
     5.1  JPEG Compression
     5.2  MPEG Compression
Conclusion
Bibliography

1. INTRODUCTION

Why do we need data reduction?


"The Web seems slow today."

If this lament sounds all too familiar, you are not alone. The network's backbone isn't the
problem; it's what happens at each end that frustrates users. The increasing size of digital
media and lack of server bandwidth are the main culprits. More bandwidth won't
necessarily solve this problem. What will help is minimizing the amount of data that
travels through this bandwidth.

Optimizing Web Graphics

The secret of shrinking graphics file size is reducing bit depth, resolution, and
dimensions while preserving image quality. This classic size-versus-quality tradeoff
is the key to the art and science of graphics compression.

Bit depth (also called color depth) is the number of bits used per pixel.

Color Palettes

The two ways to store color raster images are indexed and RGB. Indexed formats
map each pixel to an entry in a 256-color (or smaller) color lookup table (CLUT), or
color palette. RGB formats, also known as true color, use 8 bits (0 to 255) for each Red,
Green, and Blue value to form a 24-bit pixel (8 + 8 + 8 = 24) that can take on any
one of 16.7 million colors (2^24 = 16,777,216 colors). Some formats support even
higher bit depths, useful for medical imaging or smooth transparencies.

A CLUT is a digital version of a paint store's color-mixing chart: it contains up to 256
entries, each with its own formula for a specific color. Indexed images refer to each
color by its position in the palette.
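
As a rough illustration of the storage difference, the following Python sketch expands a tiny indexed image through a palette into true-color pixels (the palette and image values are made up purely for illustration):

# A tiny CLUT: each entry is the (R, G, B) "formula" for one color.
palette = [
    (255, 255, 255),  # index 0: white
    (0, 0, 0),        # index 1: black
    (200, 30, 30),    # index 2: red
]

# Indexed image: one palette index (at most 1 byte) per pixel.
indexed_image = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
]

# Expanding to true color takes three bytes (24 bits) per pixel.
rgb_image = [[palette[i] for i in row] for row in indexed_image]

pixels = sum(len(row) for row in indexed_image)
print("indexed:", pixels, "bytes +", 3 * len(palette), "bytes of palette")
print("true color:", 3 * pixels, "bytes")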

2. DATA COMPRESSION

2.1 What is Data Compression?



Data compression shrinks a file so that it takes up less space. This is
desirable for data storage and data communication. Storage space on disks is expensive,
so a file that occupies less disk space is "cheaper" than an uncompressed file. Likewise,
the smaller a file, the faster it can be transferred, so a compressed file appears to increase
the speed of data transfer over an uncompressed file.

Data compression is also called 'source encoding' because in the process of data
compression the source data is compressed by applying some encoding technique. The
original data can be viewed as a message M over a source alphabet. This message is
encoded into the binary alphabet, and the resulting string of binary digits (0s and 1s) is
the encoded data. Encoding is thus just the mapping of a message M from the source
alphabet into binary digits, and decoding is the reverse: mapping the binary string
back into the original message in the source alphabet.

A code is a mapping of source messages (words from the source alphabet alpha)
into code words (words from the code alphabet beta). The source messages are the basic
units into which the string to be represented is partitioned. These basic units may be
single symbols from the source alphabet, or they may be strings of symbols. For
example, alpha = {a, b, c, d, e, f, g, space}; for purposes of explanation, beta will be
taken to be {0, 1}. Codes can be categorized as block-block, block-variable, variable-block
or variable-variable, where block-block indicates that the source messages and code words
are of fixed length, and variable-variable codes map variable-length source messages into
variable-length code words.

2.2 Data Compression Methods


-Classification of Data Compression Methods
In addition to the categorization of data compression schemes with respect to message and
codeword lengths, these methods can be classified as static or dynamic.



Static Method
A static code is one in which the mapping from the set of messages to the set of code
words is fixed before transmission begins, so that a given message is represented by the same
codeword every time it appears in the message ensemble. The classic static defined-word
scheme is Huffman coding. In Huffman coding, the assignment of code words to source
messages is based on the probabilities with which the source messages appear in the
message ensemble. Messages that appear more frequently are represented by short
code words; messages with smaller probabilities map to longer code words. These
probabilities are determined before transmission begins. A Huffman code example is
given in the figure below.

Dynamic Method
A code is dynamic if the mapping from the set of messages to the set of code
words changes over time. For example, dynamic Huffman coding involves computing an
approximation to the probabilities of occurrence "on the fly", as the ensemble is being
transmitted. The assignment of code words to messages is based on the values of the
relative frequencies of occurrence at each point in time. A message x may be
represented by a short codeword early in the transmission because it occurs frequently at
the beginning of the ensemble, even though its probability of occurrence over the total
ensemble is low. Later, when more probable messages begin to occur with higher
frequency, the short codeword will be mapped to one of the higher-probability messages
and x will be mapped to a longer codeword.
As an illustration, the figure below shows the Huffman code table for the complete string

"aa bbb cccc ddddd eeeeee fffffffgggggggg"

together with the dynamic Huffman code table after only the prefix "aa bbb" has been seen.
Although the frequency of space over the entire message is greater than that of b, at this
point in time b has the higher frequency and is therefore mapped to the shorter codeword.



Source message    Probability    Codeword
a                 2/40           1001
b                 3/40           1000
c                 4/40           011
d                 5/40           010
e                 6/40           111
f                 7/40           110
g                 8/40           00
space             5/40           101
Huffman code table for the complete string

Source message    Probability    Codeword
a                 2/6            10
b                 3/6            0
space             1/6            11
A dynamic Huffman code table after "aa bbb"

Dynamic codes are also referred to in the literature as adaptive, in that they adapt to changes in
ensemble characteristics over time.

The essential figure of merit for data compression is the "compression ratio": the ratio of the
size of the compressed file to the size of the original uncompressed file.

3. TYPES OF COMPRESSION

There are two types of Data Compression


1) Lossless compression
2) Lossy compression



3.1 Lossless Compression

Lossless compression techniques, as their name implies, involve no loss of
information. If data has been losslessly compressed, the original data can be recovered
exactly from the compressed data. Lossless compression is generally used for applications
that cannot tolerate any difference between the original and reconstructed data.

Lossless compression consists of those techniques guaranteed to generate an
exact duplicate of the input stream after a compress/expand cycle. This is the type
of compression used when storing database records, spreadsheets, or word processing
files. In these applications, the loss of even a single bit could be catastrophic. The
amount of compression that can be achieved by a given algorithm depends on both the
amount of redundancy in the source and the efficiency of its extraction. Lossless data
compression works by finding repeated patterns in a message and encoding those patterns
in an efficient manner. For this reason, lossless data compression is also referred to as
redundancy reduction. The useful algorithms recognize redundancy and inefficiencies in
the encoding and are most effective when designed for the statistical properties of the bit
stream.

It is used in software compression tools such as the highly popular ZIP
format (for example, WinZip). Lossless compression is used when every byte of the input
data is important, such as in text files, executable programs, and source code.
Lossless algorithms do not change the content of a file: if you compress a
file and then decompress it, it has not changed. The following algorithms are lossless:
· Shannon-Fano coding
· Huffman coding
· Arithmetic coding
· Run-length encoding (RLE)



3.2 Lossy Compression

Lossy compression techniques involve some loss of information, and data that
have been compressed using lossy techniques generally cannot be recovered or
reconstructed exactly. When the compressed message is decoded, it does not give back the
original message: data has been lost. In return for accepting this distortion in the
reconstruction, we can generally obtain much higher compression ratios than are possible
with lossless compression. Lossy compression produces a much smaller compressed
file than lossless compression. Because lossy compression cannot be decoded to yield
the exact original message, it is not a good method of compression for critical data, such as
textual data.

It is most useful for Digitally Sampled Analog Data (DSAD). DSAD consists
mostly of sound, video, graphics, or picture files. Algorithms for lossy compression of
DSAD vary, but many use threshold-level truncation. This means that a level is chosen
past which all data is truncated. In a sound file, for example, the very high and low
frequencies, which the human ear cannot hear, may be truncated from the file.
The following algorithms are examples of lossy compression:
· JPEG Compression
· MPEG Compression

4. LOSSLESS ALGORITHMS

4.1 Probability coding:



The main objective of this chapter is to introduce the important lossless
compression algorithms: Huffman coding, Shannon-Fano coding, arithmetic
coding, and run-length encoding. Data compression enters the field of Information
Theory because of its concern with redundancy. Redundant information in a message
takes extra bits to encode, and if we can get rid of that extra information, we will have
reduced the size of the message.

4.1.1 Huffman Coding

This is a variable-length coding technique that creates a prefix-condition code
given the non-uniform probability distribution of any finite alphabet. It provides a
systematic approach to designing a variable-length code which is best for a given finite-
alphabet source (that is, a given probability distribution for its symbols). Huffman
compression reduces the average code length used to represent the symbols of an
alphabet. Symbols of the source alphabet which occur frequently are assigned
short codes. The general strategy is to allow the code length to vary from
character to character and to ensure that the frequently occurring characters have shorter
codes. The mechanics of producing the Huffman code for a given probability distribution
are quite simple. To illustrate, we will consider an example.

4.1.2 About Huffman codes

Huffman compression belongs to a family of algorithms with a variable
codeword length. That means that individual symbols (characters in a text file, for
instance) are replaced by bit sequences that have a distinct length. So symbols that occur
often in a file are given a short sequence while others that are used seldom get a longer bit
sequence. A practical example will show you the principle. Suppose you want to
compress the following piece of data:
ACDABA
Since these are 6 characters, this text is 6 bytes or 48 bits long. With Huffman encoding,
the file is searched for the most frequently appearing symbols (in this case the character
'A' occurs 3 times) and then a tree is built that replaces the symbols by shorter bit
sequences. In this particular case, the algorithm would use the following substitution
table:
A=0, B=10, C=110, D=111. If these code words are used to compress the file, the
compressed data look like this:
01101110100
This means that 11 bits are used instead of 48, a compression ratio of roughly 4 to 1 for this
particular file.

4.1.3 Steps of Huffman Coding:

Step 1
List the source symbols in decreasing probability order. (The symbols A through G for
the example are already labeled in decreasing probability order.)
Step 2
Combine the two least-probability symbols into a single entity (call it "fg" in Stage 2
for the example), associating with it the sum of the probabilities of the individual
symbols in the combination. Assign to each of the symbols that were combined one of the
binary digits, "0" for the upper symbol, "1" for the lower one (F -> "0", G -> "1" in the
example).
Step 3
Re-order, if necessary, the resulting symbol list in decreasing probability order, treating
any combined symbols from the previous list as a single symbol. If the list now has only
2 symbols, assign to each of them one of the binary digits (upper symbol -> 0, lower one
-> 1) and go to Step 4. Otherwise go to Step 2.
Step 4
Read off the binary codewords for the source symbols, from right to left. (Note:
Assignment of code letters "0" and "1" to upper and lower symbols may be done in any
way as long as it is consistent throughout. For example, a text that uses "1" for the upper
and "0" for the lower symbol would produce the same codewords with "1" and "0"
interchanged.) The Huffman code is also always a prefix-condition code, by
construction.
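
To make these steps concrete, the following Python sketch builds a Huffman code with a priority queue (heapq). The tie-breaking is my own choice, so the exact bit patterns are not guaranteed in general, but for the "ACDABA" example above it reproduces the substitution table and the 11-bit output.

import heapq
from collections import Counter

def huffman_code(text):
    # Step 1: count frequencies (i.e. estimate the symbol probabilities).
    freq = Counter(text)
    # Each heap entry: (frequency, tie-breaker, {symbol: partial codeword}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Steps 2/3: combine the two least-probable entries,
        # prefixing "0" to one branch and "1" to the other.
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    # Step 4: the surviving entry holds the complete codeword table.
    return heap[0][2]

text = "ACDABA"
codes = huffman_code(text)
encoded = "".join(codes[ch] for ch in text)
print(codes)                     # {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
print(encoded, "-", len(encoded), "bits instead of", 8 * len(text))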

Huffman encoding can be further optimized in two different ways:

· Adaptive Huffman code dynamically changes the code words according to the change
of probabilities of the symbols.
· Extended Huffman compression can encode groups of symbols rather than single
symbols.
Huffman compression is mainly used in compression programs like PKZIP, lha,
gz, zoo and arj. It is also used within JPEG and MPEG compression.

4.1.4 Arithmetic Coding

Arithmetic coding is a technique that allows the information from the
messages in a message sequence to be combined so that they share the same bits. The technique
allows the total number of bits sent to asymptotically approach the sum of the self-
information of the individual messages. The drawback of Huffman coding is that it
assigns an integer number of bits as the code for each symbol; the code is therefore optimal
only when each symbol's probability of occurrence is an integral power of 1/2. A description
of the arithmetic coding algorithm in pseudocode follows.

Set Low to 0
Set High to 1
While there are input symbols do
    Take a symbol
    CodeRange = High - Low
    High = Low + CodeRange * HighRange(symbol)
    Low = Low + CodeRange * LowRange(symbol)
End of while
Output Low



Low and High define a subinterval of the interval [0, 1). The size of each symbol's
subinterval depends on the probability with which that symbol occurs in the input file.

EXAMPLE:-

The message to be encoded is "ARITHMETIC". There are ten symbols in the message.
The probability distribution is given below.

Symbol    Probability
A         1/10
C         1/10
E         1/10
H         1/10
I         2/10
M         1/10
R         1/10
T         2/10

Each character is assigned a portion of the starting interval [0, 1). The size of the
interval corresponds to the symbol's probability of appearance.

Symbol Subinterval
A 0.00-0.10
C 0.10-0.20
E 0.20-0.30
H 0.30-0.40
I 0.40-0.60
M 0.60-0.70
R 0.70-0.80
T 0.80-1.00



The coding is performed below. Note that the most significant portion of a coded message
belongs to the first symbol to be encoded. In this case, the first symbol is "A", which owns
the range [0, 0.1).

Symbol    Low Range       High Range

A         0.0             0.1
R         0.07            0.08
I         0.074           0.076
T         0.0756          0.076
H         0.07572         0.07576
M         0.075744        0.075748
E         0.0757448       0.0757452
T         0.07574512      0.0757452
I         0.075745152     0.075745168
C         0.0757451536    0.0757451552
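
The following minimal Python sketch of the encoding loop uses ordinary floating-point arithmetic purely for illustration, as the table above does; a real coder would use binary fixed-point math. The subinterval table is the one given above for "ARITHMETIC".

ranges = {
    'A': (0.0, 0.1), 'C': (0.1, 0.2), 'E': (0.2, 0.3), 'H': (0.3, 0.4),
    'I': (0.4, 0.6), 'M': (0.6, 0.7), 'R': (0.7, 0.8), 'T': (0.8, 1.0),
}

def arithmetic_encode(message, ranges):
    low, high = 0.0, 1.0
    for symbol in message:
        code_range = high - low
        # Narrow [low, high) to the symbol's subinterval of the current range.
        high = low + code_range * ranges[symbol][1]
        low = low + code_range * ranges[symbol][0]
    return low, high          # any value in [low, high) identifies the message

low, high = arithmetic_encode("ARITHMETIC", ranges)
print(low, high)              # approximately 0.0757451536 and 0.0757451552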

Unfortunately, this leaves behind a remainder that can be decoded into an


indefinitely long string of bogus characters. This appears to be an artifact of using
decimal floating-point math to perform the calculations in this example. In practice,
arithmetic coding is based on binary fixed-point math, which avoids this problem.

One other problem is the fact that the binary fraction that is output by the
arithmetic coder is of indefinite length, and the decoder has no idea of where the string
ends if it's not told. In practice, a length header can be sent to indicate how long the
fraction is, or an end-of-transmission symbol of some sort can be used to tell the decoder
where the end of the fraction is.

As with Huffman coding, arithmetic coding can also be performed using an


adaptive algorithm, with the coder and decoder starting with a predetermined character



probability interval table, tallying characters for their actual frequencies as they are
encoded or decoded, and then adjusting the probability interval table accordingly.

The neat thing about arithmetic coding is that by amassing a complete message
into a single probability interval value, individual characters can be encoded with the
equivalent of fractional values of bits. Huffman coding requires an integer number of bits
for each character, and so this is one of the reasons that arithmetic coding is in general
more efficient than Huffman coding.

The problem with arithmetic coding is that it is very computation-intensive and so


it is slow. Huffman and arithmetic coding are sometimes referred to as forms of
"statistical coding" or "entropy coding". The term "VLW (Variable Length Word)" also
seems to pop up on occasion when discussing such coding techniques, though its exact
definition seems a bit unclear.

4.1.5 Run-Length Encoding

RLE is probably the easiest compression algorithm. It replaces sequences of the
same data value within a file by a count and a single value.
Suppose the following string of data (17 bytes) has to be compressed:
ABBBBBBBBBCDEEEEF
Using RLE compression, the compressed file takes up 10 bytes and could look like this:
A *9B C D *4E F
As you can see, repetitive strings of data are replaced by a control character (*)
followed by the number of repeated characters and the repeated character itself. The
control character is not fixed; it can differ from implementation to implementation. If the
control character itself appears in the file then one extra character is coded. As you can see,
RLE encoding is only effective if there are sequences of 4 or more repeating characters,
because three characters are used to encode a run, so coding two repeating characters
would even lead to an increase in file size.
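
A minimal Python sketch of this scheme follows. The "*" control character and the threshold of four repeats come from the example above; the way a literal "*" is escaped is my own assumption, since the text only says that one extra character is coded.

def rle_encode(data, control="*", threshold=4):
    out = []
    i = 0
    while i < len(data):
        ch = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == ch:
            run += 1
        if run >= threshold or ch == control:
            # Control character, run length, then the repeated character.
            # A literal control character is always escaped this way, which
            # costs the "one extra character" mentioned above.
            out.append(control + str(run) + ch)
        else:
            out.append(ch * run)
        i += run
    return "".join(out)

print(rle_encode("ABBBBBBBBBCDEEEEF"))   # A*9BCD*4EF  (10 bytes)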



It is important to know that there are many different run-length encoding
schemes. The above example has just been used to demonstrate the basic principle of
RLE encoding. Sometimes the implementation of RLE is adapted to the type of data that
are being compressed. Run length encoding is actually not very useful for compressing
text files, since a typical text file doesn't have a lot of long, repetitive character strings. It
is very useful, however, for compressing bytes of a monochrome image file, which
normally consists of solid black picture bits, or "pixels", in a sea of white pixels, or the
reverse. Run-length encoding is also often used as a preprocessor for other
compression algorithms. As the next chapter explains, for example, it is used as one of
the many pieces of the JPEG image compression scheme.

Advantages and Disadvantages

This algorithm is very easy to implement and does not require much CPU
horsepower. RLE compression is only efficient with files that contain lots of repetitive
data. These can be text files if they contain lots of spaces for indenting but line-art
images that contain large white or black areas are far more suitable. Computer generated
color images (e.g. architectural drawings) can also give fair compression ratios. The
algorithm is easy to implement (in hardware, if necessary) and runs very quickly.
• If part of the data is lost or corrupted, all or nearly all of the rest of the data can be
reconstructed. Not true of many compression techniques!
• If the data is not suitable to run-length encoding (and it’s easy to construct data that
isn’t), then this encoding can be larger than the original.

4.2 Dictionary Based algorithms

4.2.1 Introduction



Dictionary-based algorithms scan a file for sequences of data that occur more
than once. These sequences are then stored in a dictionary, and within the compressed file
references are put wherever repetitive data occurred. These compression methods use the
property of many data types to contain repeating code sequences. Good examples of
such data are text files (a code word represents characters) and raster images (a code word
represents pixels). The algorithms create a dictionary of the phrases that occur in the
input data. When they encounter a phrase already present in the dictionary, they just
output the index number of the phrase in the dictionary. These methods are based on the
algorithms developed and published by Lempel and Ziv; the best-known variant, LZW
(Lempel-Ziv-Welch), is a lossless dictionary-based compression algorithm. Lempel and Ziv
published a series of papers describing various compression algorithms. Their first
algorithm was published in 1977, hence its name: LZ77. This compression algorithm
maintains its dictionary within the data themselves. Suppose you want to compress the
following string of text: "the quick brown fox jumps over the lazy dog." The word 'the'
occurs twice in the file, so the data can be compressed like this: "the quick brown fox
jumps over << lazy dog." in which << is a pointer to the first 4 characters in the string.
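
To illustrate the back-reference idea, here is a small Python sketch that finds the longest earlier occurrence of the text starting at a given position; it is only an illustration of the principle, not the actual LZ77 token format (which emits offset/length/next-character triples).

def find_backreference(text, pos, min_len=3):
    # Find the longest earlier occurrence of the text starting at pos.
    best = (0, 0)   # (offset, length)
    for length in range(min_len, len(text) - pos + 1):
        chunk = text[pos:pos + length]
        offset = text.find(chunk, 0, pos)
        if offset == -1:
            break
        best = (offset, length)
    return best

s = "the quick brown fox jumps over the lazy dog."
offset, length = find_backreference(s, 31)               # start of the second "the"
print(offset, length, repr(s[offset:offset + length]))   # 0 4 'the '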

In 1978, Lempel and Ziv published a second paper outlining a similar
algorithm that is now referred to as LZ78. This algorithm maintains a separate dictionary.
Suppose you once again want to compress the following string of text: "the quick brown
fox jumps over the lazy dog". The word 'the' occurs twice in the file, so this string is put
in an index that is added to the compressed file, and this entry is referred to as *. The data
then look like this: "* quick brown fox jumps over * lazy dog."

4.2.2 LZW



To compress and decompress, LZW uses a dictionary made of two rows. The
first row defines the code; the second row defines the string corresponding to that code.
Before compression begins, the dictionary is initialized with only the set of single
characters of the alphabet used in the text.

-The Compression Algorithm
(A runnable Python sketch of both procedures follows the decompression steps below.)

1. At the start, initialize the dictionary to contain all single-character strings of the alphabet;

2. Is there any symbol left in the text?
   a. if yes, read the next symbol and append it to the buffer;
   b. if not, go to step 4;
3. Is there a match between the buffer and the dictionary?
   a. if yes, go back to step 2;
   b. if not,
      - Send the code corresponding to the string in the buffer minus the last symbol;
      - Add the whole buffer contents to the dictionary;
      - Purge the buffer except for the last symbol;
      - Go back to step 2;
4. Send the code corresponding to the string in the buffer;
5. END.

-Decompression Algorithm

1. At the start, initialize the dictionary to contain all single-character strings of the alphabet;

2. Is there any more received code?
   a. If yes, decode the next code and send the corresponding symbols to the temporary buffer;
   b. If not, go to step 5;
3. Are there any more symbols in the temporary buffer?
   a. If yes, move the next symbol from the temporary buffer to the buffer;
   b. If not, go back to step 2;
4. Is there a match between the buffer and the dictionary?
   a. if yes, go back to step 3;
   b. if not,
      - Print the symbols in the buffer except the last one;
      - Add the whole buffer contents to the dictionary;
      - Purge the buffer except for the last symbol;
      - Go back to step 3;
5. Print the contents of the buffer;
6. END.
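
The following compact Python sketch implements LZW compression and decompression. The dictionary is initialized with all single bytes (an assumption; the steps above only say "all possible alphabets"), and the decoder includes the standard handling of a code that is not yet in its dictionary, a case the steps above do not spell out.

def lzw_compress(text):
    # Dictionary initially holds every single-character string.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    buffer = ""
    output = []
    for symbol in text:
        if buffer + symbol in dictionary:
            buffer += symbol                         # keep growing the match
        else:
            output.append(dictionary[buffer])        # send code for the match so far
            dictionary[buffer + symbol] = next_code  # add the new phrase
            next_code += 1
            buffer = symbol                          # purge buffer except last symbol
    if buffer:
        output.append(dictionary[buffer])            # flush the final match
    return output

def lzw_decompress(codes):
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                           # code not in dictionary yet: the special case
            entry = previous + previous[0]
        result.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)

codes = lzw_compress("the quick brown fox jumps over the lazy dog")
print(codes)
print(lzw_decompress(codes) == "the quick brown fox jumps over the lazy dog")   # True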

-Advantages of LZW

· LZW compression works best for files containing lots of repetitive data. This is often
the case with text and monochrome images. Files that are compressed but that do not
contain any repetitive information at all can even grow bigger!
· LZW compression is fast.

5. LOSSY ALGORITHMS

- Lossy Compression Algorithms



1) JPEG Compression Algorithm
2) MPEG Compression Algorithm

5.1 JPEG Compression


JPEG (pronounced "jay-peg") is a standardized image compression
mechanism. JPEG stands for Joint Photographic Experts Group, the original name of the
committee that wrote the standard.

JPEG is a lossy compression scheme for color and gray-scale images. It works
on full 24-bit color, and was designed to be used with photographic material and
naturalistic artwork. It is not the ideal format for line-drawings, textual images, or other
images with large areas of solid color or a very limited number of distinct colors.

JPEG is designed so that the loss factor can be tuned by the user to tradeoff image
size and image quality, and is designed so that the loss has the least effect on human
perception. It however does have some anomalies when the compression ratio gets high,
such as odd effects across the boundaries of 8x8 blocks. For high compression ratios,
other techniques such as wavelet compression appear to give more satisfactory results.
JPEG is designed for compressing full-color or gray-scale images of natural, real-world
scenes. It works well on photographs, naturalistic artwork, and similar material; not so
well on lettering, simple cartoons, or line drawings. JPEG handles only still images, but
there is a related standard called MPEG for motion pictures.

Why use JPEG?

JPEG is used to make your image files smaller, and to store 24-bit-per-pixel
color data instead of 8-bit-per-pixel data. Making image files smaller is a win for
transmitting files across networks and for archiving libraries of images. Being able to
compress a 2 MB full-color file down to, say, 100 KB makes a big difference in disk
space and transmission time, and JPEG can easily provide 20:1 compression of full-
color data. If you are comparing GIF and JPEG, the size ratio is usually more like 4:1.
When network transmission is involved, the time savings from transferring a shorter file
can be greater than the time needed to decompress the file.

JPEG’s Compression Algorithm

There are four steps in the JPEG compression algorithm. The first step is to extract
an 8x8 pixel block from the picture. The second step is to calculate the discrete cosine
transform (DCT) for each element in the block. Third, quantization rounds off the DCT
coefficients according to the specified image quality (this phase is where
most of the original image information is lost, thus it is dubbed the lossy phase of the
JPEG algorithm). Fourth, the coefficients are compressed using an encoding scheme
such as Huffman coding or arithmetic coding. The final compressed code is then
written to the output file.
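
A minimal numpy-based sketch of steps 2 and 3 for a single 8x8 block is shown below. The sample block and the quantization table are made up for illustration; a real JPEG encoder uses the standard luminance/chrominance tables scaled by the chosen quality setting.

import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (rows = frequencies, columns = samples).
    k = np.arange(n)
    mat = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    mat[0, :] /= np.sqrt(2)
    return mat * np.sqrt(2.0 / n)

def jpeg_block_encode(block, quant):
    # Step 2: 2-D DCT of the level-shifted 8x8 block.
    d = dct_matrix()
    coeffs = d @ (block - 128.0) @ d.T
    # Step 3: quantization - divide by the table and round (the lossy step).
    return np.round(coeffs / quant).astype(int)

# Illustrative 8x8 block (mid-gray with gentle texture) and an illustrative
# quantization table that grows with frequency, so high frequencies are
# rounded away more aggressively.
block = 128 + 20 * np.fromfunction(lambda i, j: np.sin(i) * np.cos(j), (8, 8))
quant = 8 + 4 * np.fromfunction(lambda i, j: i + j, (8, 8))
print(jpeg_block_encode(block, quant))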

Compression versus Quality

In JPEG compression, a black-and-white drawing with hard edges is often used for
demonstration because we can clearly see how the compression techniques affect the
image. But such a demonstration isn't really fair, because JPEG wasn't designed for these
types of images. With a true photorealistic image, the changes aren't nearly as visible, at
least at low compression levels.

The examples compare a 24-bit color uncompressed image with the same image
JPEG-compressed at a very low level. If you can see any differences, you
have got better eyes than I do, and yet the JPEG file is about one twenty-fifth the size of
the uncompressed image. The lightly compressed image is virtually
identical to the uncompressed image. You can definitely see a difference in
quality between the uncompressed image and a heavily compressed one;
the image is still quite recognizable, although you can see quality problems,
especially in the background.

JPEG compression ratio

JPEG gets better compression than GIF for photographic material. JPEG typically gets
about 10:1 at the lowest compression levels and up to 200:1 and more at the highest
levels of compression. At a medium compression level, where the quality loss is only
slightly apparent, 30:1 is a typical ratio.

5.2 MPEG Compression

MPEG is a compression standard for digital video sequences, such as used in


computer video and digital television networks. In addition, MPEG also provides for
the compression of the sound track associated with the video. The name comes from its
originating organization, the Moving Pictures Experts Group.

In addition to reducing the data rate, MPEG has several important features. The
movie can be played forward or in reverse, and at either normal or fast speed. The
encoded information is random access, that is, any individual frame in the sequence can
be easily displayed as a still picture. This goes along with making the movie editable,
meaning that short segments from the movie can be encoded only with reference to
themselves, not the entire sequence. The main distortion associated with MPEG occurs
when large sections of the image change quickly. In effect, a burst of information is needed
to keep up with the rapidly changing scenes. If the data rate is fixed, the viewer notices
"blocky" patterns when changing from one scene to the next. This can be minimized in
networks that transmit multiple video channels simultaneously, such as cable television.

Advantages and Disadvantages


-Advantages of Data Compression:-

- Less disk space (more data in the same space)
- Faster writing and reading
- Faster file transfer
- Variable dynamic range
- Byte-order independent

-Disadvantages of Data Compression:-

- Added complication
- Effect of errors in transmission
- Slower for sophisticated methods (though simple methods can be faster for
  writing to disk)
- "Unknown" byte/pixel relationship
- Need to decompress all previous data

CONCLUSION



From the above discussion we conclude that data compression is a very useful technique
in everyday applications: without it, watching TV, using internet services, sending faxes,
zipping files and many other things would become very troublesome or beyond our
imagination. Although compression efficiency is the most important feature of any
compression scheme, other factors such as complexity and delay should be taken into
account when choosing the best compression scheme. Data compression compresses a
file and in this way reduces its size. It is very useful for communication because, if a
large data file is transmitted by a sender, the space requirement is large and the speed is
low; by using these compression techniques all of these problems are reduced.

BIBLIOGRAPHY

Websites:

- http://www.hn.is.uec.ac.jp/~arimura/compression_links.html
- http://www.dogma.net/Datacompression
- http://www.data-compression.com
- http://www.rasip.fer.hr
- http://www.rasip.etf.hr/research/compress

Books:

- Netravali, A. W., and B. Prasad, Visual Communication.
- Sayood, Khalid, Introduction to Data Compression.
