Unit - 4
Compression: It is the process of reducing the size of given data or an image. It reduces the
storage space required to store an image or file.
Data Redundancy:
• Data or words that either provide no relevant information or simply restate what is already
known are said to be redundant data.
• Let N1 and N2 be the number of information-carrying units in two data sets that represent
the same information.
• The relative data redundancy is Rd = 1 - (1/Cr),
where Cr is called the compression ratio,
Cr = N1/N2.
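For example (values assumed for illustration): if the original image requires N1 = 256 x 256 x 8 =
524288 bits and the compressed representation requires N2 = 131072 bits, then Cr = 4 and
Rd = 0.75, i.e. 75% of the original data is redundant. A minimal Python sketch of the same
calculation:
# Compression ratio and relative data redundancy
# (N1 and N2 are assumed example values, not taken from the notes)
N1 = 256 * 256 * 8   # bits in the original 256 x 256, 8-bit image
N2 = 131072          # bits in the compressed representation (assumed)

Cr = N1 / N2         # compression ratio, Cr = N1/N2
Rd = 1 - (1 / Cr)    # relative data redundancy, Rd = 1 - (1/Cr)

print(Cr, Rd)        # 4.0 0.75 -> 4:1 compression, 75% of the data is redundant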
Types of Redundancy
There are three basic types of redundancy, and they are classified as
1) Coding Redundancy
2) Interpixel Redundancy
3) Psychovisual Redundancy.
1. Coding Redundancy:
Histogram processing techniques for image enhancement were developed on the assumption that
the grey levels of an image are random quantities. The grey-level histogram of an image can
likewise provide a great deal of insight into the construction of codes that reduce the amount of
data used to represent it.
• Coding redundancy refers to the inefficiency in representing data using a specific coding
scheme.
• It arises when the chosen coding method does not make optimal use of available code
space or when there is unnecessary repetition in the coding.
• Compression techniques aim to reduce coding redundancy by using more efficient coding
schemes or removing unnecessary information.
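A small sketch of this idea (the symbol probabilities and code lengths below are assumed for
illustration, not taken from the notes): coding redundancy shows up as the gap between the
average length of a fixed-length code and that of a variable-length code matched to the symbol
probabilities.
# Average code length of a fixed-length code vs. a variable-length code
# (probabilities and code lengths are assumed for illustration)
probs    = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
fixed    = {'a': 2, 'b': 2, 'c': 2, 'd': 2}    # 2 bits for every symbol
variable = {'a': 1, 'b': 2, 'c': 3, 'd': 3}    # shorter codes for likelier symbols

def avg_length(lengths, p):
    # L_avg = sum over symbols of p(s) * l(s)
    return sum(p[s] * lengths[s] for s in p)

print(avg_length(fixed, probs))     # 2.0  bits/symbol
print(avg_length(variable, probs))  # 1.75 bits/symbol -> coding redundancy removed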
2. Interpixel Redundancy:
In order to reduce the interpixel redundancy in an image, the 2-D pixel array normally used for
human viewing and interpretation must be transformed into a more efficient form.
• Interpixel redundancy, also known as spatial redundancy, refers to the correlation
between neighboring pixels in an image.
• In natural images, adjacent pixels often have similar values, and encoding each pixel
independently may result in redundancy.
• Compression methods, such as predictive coding, exploit interpixel redundancy by
predicting the value of a pixel based on its neighbors and encoding only the prediction
error.
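A minimal sketch of such previous-pixel prediction along one image row (the pixel values are
assumed example data):
# Predictive coding along one image row (assumed example values)
row = [100, 102, 103, 103, 104, 110, 111, 111]

# Previous-pixel predictor: keep the first pixel, then store only the
# prediction errors e[n] = x[n] - x[n-1].
errors = [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
print(errors)   # [100, 2, 1, 0, 1, 6, 1, 0] -- small values that are cheaper to code

# The decoder reconstructs the row exactly by accumulating the errors.
recon = [errors[0]]
for e in errors[1:]:
    recon.append(recon[-1] + e)
assert recon == row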
3. Psychovisual Redundancy:
Certain information simply has less relative importance than other information in normal visual
processing. This information is called psychovisually redundant.
• Psychovisual redundancy involves exploiting the limitations and characteristics of the
human visual system.
• Not all information in an image is equally important to human perception. Some details
may be less noticeable or even imperceptible to the human eye.
• Compression algorithms take advantage of psychovisual redundancy by allocating more
bits to essential information and fewer bits to less critical information, reducing the
overall amount of data without a significant impact on perceived quality.
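One simple (lossy) way to exploit this, sketched below with assumed pixel values and bit depth, is
uniform requantization: keeping only the more significant bits of each grey level discards small
intensity differences that are often barely perceptible.
# Requantizing 8-bit grey levels to 4 bits (assumed example values)
pixels = [12, 13, 14, 200, 201, 203, 90, 91]

bits = 4                       # keep 4 of the original 8 bits
step = 2 ** (8 - bits)         # quantization step size = 16

quantized   = [p // step for p in pixels]           # 4-bit values in 0..15
reconstruct = [q * step + step // 2 for q in quantized]

print(quantized)     # [0, 0, 0, 12, 12, 12, 5, 5]
print(reconstruct)   # [8, 8, 8, 200, 200, 200, 88, 88] -- small reconstruction error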
QB402 (b) Construct the Huffman code for the given 7 symbols and their probabilities. Also
calculate the average length, entropy, efficiency and redundancy.
Symbol       A     B     C     D      E     F      G
Probability  0.2   0.1   0.2   0.05   0.3   0.05   0.1
• Coding efficiency. Even in a lossy compression process, the desirable coding efficiency
might not be achievable. This is especially the case when there are specific constraints on
output signal quality.
• Interplay with other data modalities, such as audio and video. In a system where several
data modalities have to be supported, the compression methods for each modality should
have some common elements. For instance, in an interactive videophone system, the
audio compression method should have a frame structure that is consistent with the video
frame structure. Otherwise, there will be unnecessary requirements on buffers at the
decoder and a reduced tolerance to timing errors.
A General Compression System Model:
The general system model consists of the following components. They are broadly classified as
1. Source Encoder
2. Channel Encoder
3. Channel
4. Channel Decoder
5. Source Decoder
As Fig. 3.1 shows, a compression system consists of two distinct structural blocks: an
encoder and a decoder. An input image f(x, y) is fed into the encoder, which creates a set of
symbols from the input data. After transmission over the channel, the encoded representation is
fed to the decoder, where a reconstructed output image f^(x, y) is generated. In general, f^(x, y)
may or may not be an exact replica of f(x, y). If it is, the system is error free or information
preserving; if not, some level of distortion is present in the reconstructed image. Both the
encoder and decoder shown in Fig. 3.1 consist of two relatively independent functions or
subblocks. The encoder is made up of a source encoder, which removes input redundancies, and
a channel encoder, which increases the noise immunity of the source encoder's output. As would
be expected, the decoder includes a channel decoder followed by a source decoder. If the channel
between the encoder and decoder is noise free (not prone to error), the channel encoder and
decoder are omitted, and the general encoder and decoder become the source encoder and
decoder, respectively.
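The block structure described above can be summarized as a chain of stages. The Python sketch
below only mirrors Fig. 3.1 structurally; every function is a placeholder standing in for the real
processing stage, not an actual implementation.
# Structural sketch of the Fig. 3.1 compression system model
def source_encode(f):
    # removes coding, interpixel and psychovisual redundancies
    return f

def channel_encode(data):
    # adds controlled redundancy (e.g. parity bits) for noise immunity
    return data

def channel(data):
    # transmission or storage; a noisy channel may introduce errors
    return data

def channel_decode(data):
    # detects and corrects channel errors using the added redundancy
    return data

def source_decode(data):
    # reconstructs the output image f^(x, y)
    return data

def compression_system(f):
    # encoder -> channel -> decoder, as in Fig. 3.1
    return source_decode(channel_decode(channel(channel_encode(source_encode(f)))))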
The Source Encoder and Decoder:
The source encoder is responsible for reducing or eliminating any coding, interpixel, or
psychovisual redundancies in the input image. The specific application and associated fidelity
requirements dictate the best encoding approach to use in any given situation. Normally, the
approach can be modeled by a series of three independent operations. As Fig. 3.2 (a) shows, each
operation is designed to reduce one of the three redundancies. Figure 3.2 (b) depicts the
corresponding source decoder. In the first stage of the source encoding process, the mapper
transforms the input data into a (usually nonvisual) format designed to reduce interpixel
redundancies in the input image. This operation generally is reversible and may or may not
reduce directly the amount of data required to represent the image.
Huffman Coding:
This technique was developed by David Huffman. The codes generated using this technique or
procedure are called Huffman codes. These codes are prefix codes and are optimum for a given
model. The Huffman procedure is based on two observations regarding optimum prefix codes:
1. In an optimum code, symbols that occur more frequently will have shorter code words than
symbols that occur less frequently.
2. In an optimum code, the two symbols that occur least frequently will have code words of the
same length.
Huffman coding is a widely used algorithm for lossless data compression. Developed by David A.
Huffman in 1952, it is a variable-length prefix coding algorithm that assigns shorter codes to more
frequently occurring symbols and longer codes to less frequent symbols. The basic idea is to construct
a binary tree, called the Huffman tree, in which the leaves correspond to the symbols to be encoded.
Discuss Huffman coding with an example, and calculate the average length, entropy, efficiency
and redundancy.
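A worked sketch in Python (not part of the notes) that builds a Huffman code for the QB402
symbol set using a standard min-heap construction and then computes the requested quantities:
# Huffman coding for the QB402 symbols and the associated figures of merit
import heapq, math

probs = {'A': 0.2, 'B': 0.1, 'C': 0.2, 'D': 0.05, 'E': 0.3, 'F': 0.05, 'G': 0.1}

# Build the Huffman tree by repeatedly merging the two least probable nodes;
# each heap entry is (probability, tie-breaker, {symbol: code so far}).
heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
heapq.heapify(heap)
counter = len(heap)
while len(heap) > 1:
    p1, _, c1 = heapq.heappop(heap)
    p2, _, c2 = heapq.heappop(heap)
    merged = {s: '0' + code for s, code in c1.items()}
    merged.update({s: '1' + code for s, code in c2.items()})
    heapq.heappush(heap, (p1 + p2, counter, merged))
    counter += 1
codes = heap[0][2]

L = sum(probs[s] * len(codes[s]) for s in probs)    # average code length
H = -sum(p * math.log2(p) for p in probs.values())  # source entropy
efficiency = H / L
redundancy = 1 - efficiency

print(codes)
print("L =", round(L, 2), "bits/symbol, H =", round(H, 3), "bits/symbol")
print("efficiency =", round(100 * efficiency, 1), "%, redundancy =", round(100 * redundancy, 1), "%")
# Expected: L = 2.6, H = 2.546, efficiency = 97.9 %, redundancy = 2.1 %
With these probabilities the symbols E, A and C receive 2-bit code words and the rarer symbols
longer ones; the average length works out to 2.6 bits/symbol, the entropy to about 2.546
bits/symbol, the efficiency to about 97.9% and the redundancy to about 2.1%. Individual code
words can differ between runs because of ties in the probabilities, but the average length does not.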