Lecture 17

Image Compression

• Images usually require an enormous amount of data to represent:
Example: A standard 8.5” by 11” sheet of paper scanned at 100
samples per inch (dpi) and quantized to two gray levels (binary
image) would require more than 100k bytes to represent:
(8.5 × 100)(11 × 100) / 8 = 116,875 bytes.
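A quick check of this figure (a minimal sketch; the page size and resolution are taken directly from the example above):

```python
# Storage for a binary (1 bit/sample) scan of an 8.5" x 11" page at 100 dpi.
width_px = int(8.5 * 100)   # 850 samples across
height_px = 11 * 100        # 1100 samples down
bits_per_pixel = 1          # two gray levels -> 1 bit per sample

total_bits = width_px * height_px * bits_per_pixel
print(total_bits // 8)      # 116875 bytes, i.e. more than 100k bytes
```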

• Image compression involves reducing the amount of data (bits)
required to represent a digital image. This is done by removing
redundant information.
• Image compression is crucial for efficient storage, retrieval,
transmission, and manipulation of images.
• More generally, data compression involves the efficient
representation of data in digital format.
• Information theory --- a field pioneered by Claude E. Shannon in
the 1940s --- is the theoretical basis for most data compression
techniques.
• Compression can be either lossless (information preserving) or
lossy. In lossy compression, one can achieve higher levels of data
reduction with less than perfect reproduction of the original image.
• Data compression refers to the process of reducing the amount of
data required to represent a given quantity of information.
• Various amounts of data may be used to represent/describe the
same information. Some representations may be less efficient in the
sense that they have data redundancy.
• If n1 and n2 are the number of information carrying units (ex. bits)
in two datasets that represent the same information, the relative
redundancy RD of the first dataset is defined as

R_D = 1 - \frac{1}{C_R}, \quad \text{where} \quad C_R = \frac{n_1}{n_2}

is called the compression ratio.
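A minimal sketch of these two quantities in code (function names and the example figures are illustrative):

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2, where n1 and n2 are the numbers of information-carrying
    units (e.g., bits) in two representations of the same information."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """Relative redundancy R_D = 1 - 1/C_R of the first representation."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)

# Hypothetical example: 8 bits per pixel originally, 1 bit per pixel after compression.
print(compression_ratio(8, 1))     # 8.0
print(relative_redundancy(8, 1))   # 0.875 -> 87.5% of the original data is redundant
```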


• For images, data redundancy can be of three types:
Coding redundancy: This refers to the binary codewords
used to represent gray values.
Interpixel redundancy: This refers to the correlation
between adjacent pixels in an image.
Psychovisual redundancy: This refers to the unequal
sensitivity of the human eye to different visual information.
• A fidelity criterion or error criterion is required to quantify the
loss of information (if any) due to a compression scheme.
• Objective fidelity criteria are based on some quantitative function of
the original input image and the compressed and subsequently
decompressed image.
Example: Root-mean-square (RMS) error, which is defined as the
square-root of the MSE.

e(m, n) = \hat{f}(m, n) - f(m, n)

f(m, n): original image
\hat{f}(m, n): reconstructed image
e(m, n): error image

e_{rms} = \left[ \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} [\hat{f}(m, n) - f(m, n)]^2 \right]^{1/2}

• A related measure is the mean-square signal-to-noise ratio (SNR_{ms}):

SNR_{ms} = \frac{\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} [\hat{f}(m, n)]^2}{\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} [\hat{f}(m, n) - f(m, n)]^2}

• The rms value of the SNR, denoted SNR_{rms}, is the square root of SNR_{ms}.
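A minimal NumPy sketch of these objective fidelity measures (function and variable names are illustrative):

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between the original image f and the
    reconstructed image f_hat (both M x N arrays)."""
    err = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(err ** 2))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the reconstructed image."""
    f = f.astype(np.float64)
    f_hat = f_hat.astype(np.float64)
    return np.sum(f_hat ** 2) / np.sum((f_hat - f) ** 2)

# SNR_rms is simply the square root of SNR_ms:
# snr_rms = np.sqrt(snr_ms(f, f_hat))
```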
• When the decompressed image is ultimately to be viewed by a human, a
subjective fidelity criterion may be more appropriate.
• Here, image quality is measured by subjective evaluations by a
human observer.
• Ratings by a number of human observers, based on “typical”
decompressed images, are averaged to obtain this subjective
fidelity criterion.
• Example of an absolute comparison scale:

Value  Rating     Description
1      Excellent  An image of extremely high quality --- as good as desired.
2      Fine       An image of high quality, providing enjoyable viewing. Interference is not objectionable.
3      Passable   An image of acceptable quality. Interference is not objectionable.
4      Marginal   An image of poor quality; you wish you could improve it. Interference is somewhat objectionable.
5      Inferior   A very poor image, but you could watch it. Objectionable interference is definitely present.
6      Unusable   An image so bad that you could not watch it.

• Example of a relative comparison scale, based on a “side-by-side”
comparison of the original image and the decompressed image:

Value:   −3           −2      −1               0      1                 2        3
Rating:  Much worse   Worse   Slightly worse   Same   Slightly better   Better   Much better
Image Compression Model

f(m, n) → Source Encoder → Channel Encoder → Channel → Channel Decoder → Source Decoder → \hat{f}(m, n)
• The Source Encoder is used to remove redundancy in the input image.
• The Channel Encoder is used to introduce redundancy in a controlled
fashion to help combat noise. Example: parity bit (see the sketch after this list).
• This provides a certain level of immunity from the noise that is
inherent in any storage/transmission system. If the channel is not
prone to noise, this block may be eliminated.
• The Channel could be a communication link or a storage/retrieval
system.
• Channel Decoder and Source Decoder invert the operations of the
corresponding encoder blocks.
• We will mainly concentrate on the source encoder/decoder blocks
and not on the channel encoder/decoder steps.
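As a toy illustration of controlled redundancy in the channel encoder (a minimal sketch, not the scheme of any particular standard), a single even-parity bit per block lets the decoder detect any single-bit error:

```python
def add_parity(bits):
    """Channel encoder: append one even-parity bit to a block of bits."""
    return bits + [sum(bits) % 2]

def check_parity(block):
    """Channel decoder: return (data_bits, error_detected)."""
    data, parity = block[:-1], block[-1]
    return data, (sum(data) % 2) != parity

encoded = add_parity([1, 0, 1, 1])   # -> [1, 0, 1, 1, 1]
encoded[2] ^= 1                      # simulate a single-bit channel error
print(check_parity(encoded))         # ([1, 0, 0, 1], True): error detected
```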
Source Encoder and Decoder
• Source encoder is responsible for reducing or eliminating any
coding, interpixel, or psychovisual redundancy.

f(m, n) → Mapper → Quantizer → Symbol Encoder → To Channel (Compressed Image)

Source Encoder

• The first block, the “Mapper,” transforms the input data into a (usually
nonvisual) format designed to reduce interpixel redundancy. This
block is reversible and may or may not reduce the amount of data.
Example: run-length encoding, image transforms (see the run-length sketch after this list).
• The Quantizer reduces accuracy of the mapper output in
accordance with some fidelity criterion. This block reduces
psychovisual redundancy and is usually not invertible.
• The Symbol Encoder creates a fixed or variable length codeword
to represent the quantizer output and maps the output in
accordance with this code. This block is reversible and reduces
coding redundancy.
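A minimal run-length encoding sketch for a binary scan line, illustrating the kind of reversible mapping the Mapper might perform (function names are illustrative):

```python
def run_length_encode(row):
    """Map a sequence of pixel values to (value, run_length) pairs. Reversible."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(v, n) for v, n in runs]

def run_length_decode(runs):
    """Inverse mapping: expand (value, run_length) pairs back to pixels."""
    return [v for v, n in runs for _ in range(n)]

row = [0, 0, 0, 1, 1, 0, 0, 0, 0]
runs = run_length_encode(row)          # [(0, 3), (1, 2), (0, 4)]
assert run_length_decode(runs) == row  # lossless: the mapping is invertible
```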

From Channel (Compressed Image) → Symbol Decoder → Inverse Mapper → \hat{f}(m, n)

Source Decoder

• The decoder blocks are inverse operations of the corresponding
encoder blocks (except the quantizer block, which is not
invertible).
Elements of Information Theory
• What is information --- how to quantify it?
• What is the minimum amount of data that is sufficient to represent
an image without loss of information?
• What is theoretically the best compression possible?
• What is the theoretically best possible transmission rate for
reliable communication over a noisy channel?
• Information theory provides answers to these and other related
fundamental questions.
• The fundamental premise of information theory is that the
generation of information can be modeled as a probabilistic
process.
• A discrete source of information generates one of N possible
symbols from a source alphabet set A = \{a_0, a_1, \ldots, a_{N-1}\} in unit
time.
Example: A = \{a, b, c, \ldots, z\}, \{0, 1\}, \{0, 1, 2, \ldots, 255\}.

• The source output can be modeled as a discrete random variable E,
which can take values in the set A = \{a_0, a_1, \ldots, a_{N-1}\}, with
corresponding probabilities \{p_0, p_1, \ldots, p_{N-1}\}.
• We will denote the symbol probabilities by the vector

z = [P(a_0), P(a_1), \ldots, P(a_{N-1})]^T = [p_0, p_1, \ldots, p_{N-1}]^T

• Naturally, p_i \geq 0 and \sum_{i=0}^{N-1} p_i = 1.
• The information source is characterized by the pair (A, z).


• Observing an occurrence (or realization) of the random variable E
results in some gain of information denoted by I(E). Shannon defined
this gain of information as

I(E) = \log \frac{1}{P(E)} = -\log P(E)

• The base for the logarithm depends on the units for measuring
information. Usually, we use base 2, which gives the information
in units of “binary digits” or “bits.” Using a base-10 logarithm
would give the information in units of decimal digits.
• The amount of information attributed to an event E is inversely
related to the probability of that event.
• Examples:
Certain event: P(E) = 1.0. In this case I(E) = \log(1/1) = 0. This
agrees with intuition, since if the event E is certain to occur (has
probability 1), knowing that it has occurred has not led to any
gain of information.
Coin toss: P(E = Heads) = 0.5. In this case
I(E) = \log(1/0.5) = \log(2) = 1 bit. This again agrees with
intuition.
Rare event: P(E) = 0.001. In this case
I(E) = \log(1/0.001) = \log(1000) = 9.97 bits. This again agrees
with intuition, since knowing that a rare event has occurred
leads to a significant gain of information.
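The three examples above can be checked with a small helper (a minimal sketch):

```python
import math

def information_bits(p):
    """Self-information I(E) = -log2 P(E), in bits."""
    return -math.log2(p)

print(information_bits(1.0))    # 0.0   -- a certain event carries no information
print(information_bits(0.5))    # 1.0   -- a fair coin toss carries 1 bit
print(information_bits(0.001))  # ~9.97 -- a rare event carries much more information
```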
• The entropy H(z) of a source is defined as the average amount of
information gained by observing a single source symbol:

H(z) = -\sum_{i=0}^{N-1} p_i \log p_i

• By convention, in the above formula, we set 0 \log 0 = 0.
• The entropy of a source quantifies the “randomness” of the source.
• The higher the source entropy, the more the uncertainty associated with
a source output, and the higher the information associated with the source.
• For a fixed number of source symbols, the entropy is maximized when
all the symbols are equally likely (recall the uniform histogram).

Example:

Symbol a_i    Probability p_i    Information I(a_i) = -\log_2 p_i (bits)
0             1/2                1
1             1/4                2
2             1/8                3
3             1/16               4
4             1/32               5
5             1/64               6
6             1/64               6

Source entropy:

H(z) = -\sum_{i=0}^{6} p_i \log_2 p_i = -\left[\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{4}\log_2\tfrac{1}{4} + \cdots + \tfrac{1}{64}\log_2\tfrac{1}{64}\right]
     = \tfrac{1}{2} + \tfrac{1}{2} + \cdots + \tfrac{3}{32} = \tfrac{63}{32} = 1.96875 bits
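A minimal check of this entropy calculation:

```python
import math

probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]

def entropy(probabilities):
    """Source entropy H(z) = -sum p_i log2 p_i, in bits (with 0 log 0 := 0)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy(probs))   # 1.96875 bits
```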
• Given that a source produces the above symbols with indicated
probabilities, how do we represent them using binary strings?

Symbol a_i    Probability p_i    Binary string (codeword)    Codeword length l_i
0             1/2                000                         3
1             1/4                001                         3
2             1/8                010                         3
3             1/16               011                         3
4             1/32               100                         3
5             1/64               101                         3
6             1/64               110                         3

Average length of a codeword:

L_{avg} = \sum_{i=0}^{6} p_i l_i = \sum_{i=0}^{6} p_i (3) = 3 \sum_{i=0}^{6} p_i = 3 bits

• Is this the best we can do (in terms of L_{avg})? For a fixed-length
codeword scheme, yes. How about if we employ a variable-length
scheme?
• Idea: Since the symbols are not all equally likely, assign shorter
codewords to symbols with higher probability and longer
codewords to symbols with lower probability, such that the
average length is smaller.
• Consider the following scheme:

Symbol a_i    Probability p_i    Binary string (codeword)    Codeword length l_i
0             1/2                0                           1
1             1/4                10                          2
2             1/8                110                         3
3             1/16               1110                        4
4             1/32               11110                       5
5             1/64               111110                      6
6             1/64               111111                      6

L_{avg} = \sum_{i=0}^{6} p_i l_i = \tfrac{1}{2} + \tfrac{1}{2} + \cdots + \tfrac{3}{32} = \tfrac{63}{32} = 1.96875 bits

• Notice that this is the same as the source entropy!
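A minimal comparison of the two schemes against the source entropy (codewords taken from the tables above):

```python
import math

probs          = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]
fixed_codes    = ["000", "001", "010", "011", "100", "101", "110"]
variable_codes = ["0", "10", "110", "1110", "11110", "111110", "111111"]

def average_length(probabilities, codes):
    """Average codeword length L_avg = sum p_i * l_i."""
    return sum(p * len(c) for p, c in zip(probabilities, codes))

entropy = -sum(p * math.log2(p) for p in probs)
print(average_length(probs, fixed_codes))     # 3.0 bits
print(average_length(probs, variable_codes))  # 1.96875 bits
print(entropy)                                # 1.96875 bits -- same as the variable-length code
```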

Shannon’s noiseless coding theorem

Let (A, z) be a discrete source with probability vector z and entropy
H(z). The average codeword length of any distortionless (uniquely
decodable) coding is bounded by

L_{avg} \geq H(z)

In other words, no codes exist that can losslessly represent the source
if L_{avg} < H(z).
• Note that Shannon’s theorem is quite general in that it refers to any
code, not a particular coding scheme.
• Also, it does not specify a scheme to construct codes whose
average length satisfies L_{avg} \geq H(z), nor does it claim that a code
satisfying L_{avg} = H(z) exists.

• Indeed, the Huffman code (to be studied later) is a particular
algorithm for assigning codewords, with average codeword length
satisfying

H(z) \leq L_{avg} < H(z) + 1
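A minimal Huffman construction sketch using Python's heapq, run on the example source above (this anticipates the later lecture; tie-breaking details are illustrative and other equally valid code assignments exist):

```python
import heapq

def huffman_codes(probs):
    """Build a Huffman code for symbols 0..N-1 with the given probabilities.
    Returns a dict {symbol: codeword string}."""
    # Heap entries: (probability, unique tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]
codes = huffman_codes(probs)
avg_len = sum(p * len(codes[s]) for s, p in enumerate(probs))
print(avg_len)   # 1.96875 -- equals H(z) here, since the probabilities are powers of 1/2
```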

• Using a block coding scheme (code a block of n symbols at a time
instead of a single symbol), one can obtain codewords whose average
length per block, L_{avg}, satisfies

H(z) \leq \frac{L_{avg}}{n} < H(z) + \frac{1}{n}

so the average codeword length per source symbol approaches the entropy H(z) as the block size n grows.

• The efficiency of an encoding scheme is defined as

\eta = \frac{H(z)}{L_{avg}} \leq 1
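For the example source above, the fixed-length code has efficiency \eta = 1.96875 / 3 \approx 0.656, while the variable-length code achieves \eta = 1.96875 / 1.96875 = 1.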
