
Image Compression

Goal of Image Compression


• The goal of image compression is to reduce the amount of data
required to represent a digital image.
Approaches

• Lossless
• Information preserving
• Low compression ratios

• Lossy
• Not information preserving
• High compression ratios

• Trade-off: image quality vs compression ratio


Data ≠ Information
• Data and information are not synonymous terms!

• Data is the means by which information is conveyed.

• Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.
Definitions: Compression Ratio

• Compression ratio: C = n1 / n2

where n1 and n2 denote the number of information-carrying units (e.g., bits) in the original and compressed images, respectively.
Definitions: Data Redundancy

• Relative data redundancy: R = 1 - 1/C

Example: C = 10 gives R = 0.9, i.e., 90% of the data is redundant.
Types of Data Redundancy
(1) Coding Redundancy
(2) Interpixel Redundancy
(3) Psychovisual Redundancy

• Compression attempts to reduce one or more of these redundancy types.
Coding Redundancy
• Code: a list of symbols (letters, numbers, bits etc.)
• Code word: a sequence of symbols used to represent a piece of
information or an event (e.g., gray levels).
• Code word length: number of symbols in each code word
Coding Redundancy (cont’d)

N x M image
rk: k-th gray level
P(rk): probability of rk
l(rk): # of bits for rk

Expected value: E(X) = Σx x P(X = x)
Coding Redundancy (cont’d)
• Case 1: l(rk) = constant length

Example:
Coding Redundancy (cont’d)
• Case 2: l(rk) = variable length
• Consider the probability of the gray levels:

variable length
Interpixel redundancy
• Interpixel redundancy implies that any pixel value can be
reasonably predicted by its neighbors (i.e., correlated).


f(x) ∘ g(x) = ∫ f(a) g(x + a) da

autocorrelation: f(x) = g(x)
Interpixel redundancy
(cont’d)
• To reduce interpixel redundancy, the data must be transformed into another format (i.e., through a transformation)
• e.g., thresholding, differences between adjacent pixels, DFT

(profile – line 100)


• Example:
original threshold

thresholded

(1+10) bits/pair
Psychovisual redundancy

• The human eye does not respond with equal sensitivity to all visual
information.

• It is more sensitive to the lower frequencies than to the higher


frequencies in the visual spectrum.

• Idea: discard data that is perceptually insignificant!


Psychovisual redundancy
(cont’d)
Example: quantization
256 gray levels vs. 16 gray levels vs. 16 gray levels
C = 8/4 = 2:1
(rightmost image: add to each pixel a small pseudo-random number prior to quantization)
How do we measure
information?
• What is the information content of a message/image?

• What is the minimum amount of data that is sufficient to describe an image completely, without loss of information?
Modeling Information
• Information generation is assumed to be a probabilistic process.

• Idea: associate information with probability!

A random event E with probability P(E) contains:

I(E) = log(1/P(E)) = -log P(E)   units of information

Note: I(E) = 0 when P(E) = 1


How much information does a pixel
contain?

• Suppose that gray level values are generated by a random variable; then rk contains:

I(rk) = -log P(rk)

units of information!
How much information does an image contain?
• Average information content of an image:

E = Σk I(rk) Pr(rk),   k = 0, …, L-1

using I(rk) = -log2 Pr(rk):

H = -Σk Pr(rk) log2 Pr(rk)   units/pixel   (entropy)

(assumes statistically independent random events)


Redundancy (revisited)

• Redundancy: R = Lavg - H

where Lavg is the average code word length (bits/pixel) actually used.

Note: if Lavg = H, then R = 0 (no redundancy)


Entropy Estimation
• It is not easy to estimate H reliably!

image
Entropy Estimation (cont’d)
• First order estimate of H:
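The first-order estimate can be computed directly from the pixel histogram. A minimal Python sketch follows; the sample gray levels (21, 95, 169, 243) and their frequencies are illustrative values chosen so that the estimate comes out to the 1.81 bits/pixel figure quoted below for the original image.

```python
import math
from collections import Counter

def first_order_entropy(pixels):
    """First-order estimate of H: treat each pixel as an independent
    sample and use relative frequencies as probability estimates."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical image with gray levels 21, 95, 169, 243 at
# relative frequencies 3/8, 1/8, 1/8, 3/8:
pixels = [21] * 12 + [95] * 4 + [169] * 4 + [243] * 12
print(round(first_order_entropy(pixels), 2))  # 1.81 bits/pixel
```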
Estimating Entropy (cont’d)
• Second order estimate of H:
• Use relative frequencies of pixel blocks :

image
Estimating Entropy (cont’d)

• The first-order estimate provides only a lower bound on the compression that can be achieved.

• Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancy!

Need to apply transformations!


Estimating Entropy (cont’d)
• For example, consider differences:

16
Estimating Entropy (cont’d)

• Entropy of difference image:

• Better than before (i.e., H=1.81 for original image)

• However, a better transformation could be found since:


Image Compression Model
Image Compression Model
(cont’d)

• Mapper: transforms the input data in a way that facilitates reduction of interpixel redundancies.
Image Compression Model
(cont’d)

• Quantizer: reduces the accuracy of the mapper’s output in accordance with some pre-established fidelity criteria.
Image Compression Model
(cont’d)

• Symbol encoder: assigns the shortest code words to the most frequently occurring output values.
Image Compression Models
(cont’d)

• Inverse operations are performed.

• But … quantization is irreversible in general.


Fidelity Criteria

• How close is the reconstructed image f̂(x,y) to the original f(x,y)?

• Criteria
• Subjective: based on human observers
• Objective: mathematically defined criteria
Subjective Fidelity Criteria
Objective Fidelity Criteria
• Root mean square error (RMSE):

RMSE = sqrt( (1/MN) Σx Σy [f̂(x,y) - f(x,y)]² )

• Mean-square signal-to-noise ratio (SNR):

SNRms = Σx Σy f̂(x,y)² / Σx Σy [f̂(x,y) - f(x,y)]²
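Both criteria are straightforward to compute. A minimal Python sketch (the 2x2 sample images are made-up values):

```python
import math

def rmse(f, g):
    """Root mean square error between original f and reconstructed g
    (both given as 2-D lists of gray levels)."""
    n = sum(len(row) for row in f)
    se = sum((gv - fv) ** 2 for frow, grow in zip(f, g)
             for fv, gv in zip(frow, grow))
    return math.sqrt(se / n)

def snr_ms(f, g):
    """Mean-square signal-to-noise ratio of the reconstructed image g."""
    num = sum(gv ** 2 for row in g for gv in row)
    den = sum((gv - fv) ** 2 for frow, grow in zip(f, g)
              for fv, gv in zip(frow, grow))
    return num / den

f = [[10, 20], [30, 40]]   # hypothetical original
g = [[12, 18], [30, 44]]   # hypothetical reconstruction
print(rmse(f, g))  # sqrt(24/4) ≈ 2.449
```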


Objective Fidelity Criteria
(cont’d)

RMSE = 5.17 RMSE = 15.67 RMSE = 14.17


Lossless Compression
Lossless Taxonomy
Huffman Coding (coding redundancy)

• A variable-length coding technique.


• Optimal code (i.e., minimizes the number of code symbols per
source symbol).

• Assumption: symbols are encoded one at a time!


Huffman Coding (cont’d)
• Forward Pass
1. Sort the symbol probabilities
2. Combine the two lowest probabilities
3. Repeat Step 2 until only two probabilities remain
Huffman Coding (cont’d)

• Backward Pass
Assign code symbols going backwards
Huffman Coding (cont’d)
• Lavg using Huffman coding:

• Lavg assuming binary codes:


Huffman Coding/Decoding
• After the code has been created, coding/decoding can be
implemented using a look-up table.
• Note that decoding is done unambiguously.
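The forward and backward passes can be sketched with a priority queue: repeatedly merge the two lowest-probability groups, prepending one bit to every code word in each merged group (the backward pass). The six-symbol distribution below is an illustrative example.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code from {symbol: probability}.
    Returns {symbol: bit string}."""
    # Heap entry: (probability, tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two lowest probabilities
        p2, _, c2 = heapq.heappop(heap)
        # prepend one bit to every code word in each merged group
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)
lavg = sum(probs[s] * len(code[s]) for s in probs)
print(round(lavg, 2))  # 2.2 bits/symbol for this distribution
```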
Arithmetic (or Range) Coding
(coding redundancy)
• Source symbols are not assumed to be encoded one at a time.
• Sequences of source symbols are encoded together.
• There is no one-to-one correspondence between source symbols
and code words.

• Slower than Huffman coding, but typically achieves better compression.
Arithmetic Coding (cont’d)
• A sequence of source symbols is assigned a single arithmetic
code word which corresponds to a sub-interval in [0,1].

• As the number of symbols in the message increases, the interval used to represent it becomes smaller.

• Smaller intervals require more information units (i.e., bits) to be represented.
Arithmetic Coding (cont’d)

Encode message: a1 a2 a3 a3 a4

1) Assume message occupies [0, 1)

0 1
2) Subdivide [0, 1) based on the probability of αi

3) Update interval by processing source symbols


Example

Encode
a1 a2 a3 a3 a4

[0.06752, 0.0688)
or,
0.068
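The interval-narrowing procedure can be sketched in a few lines of Python; the four symbol probabilities (0.2, 0.2, 0.4, 0.2) partition [0, 1) so as to reproduce the example's final interval.

```python
def arithmetic_encode(message, intervals):
    """Narrow [0, 1) once per source symbol; the final sub-interval
    identifies the whole message."""
    low, high = 0.0, 1.0
    for s in message:
        s_low, s_high = intervals[s]
        width = high - low
        low, high = low + width * s_low, low + width * s_high
    return low, high

# Symbol probabilities 0.2, 0.2, 0.4, 0.2 partition [0, 1):
intervals = {"a1": (0.0, 0.2), "a2": (0.2, 0.4),
             "a3": (0.4, 0.8), "a4": (0.8, 1.0)}
low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], intervals)
print(low, high)  # ≈ 0.06752, 0.0688
```

Any number inside the final interval, e.g. 0.068, encodes the whole message.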
Example
• The message a1 a2 a3 a3 a4 is encoded using 3 decimal digits or
3/5 = 0.6 decimal digits per source symbol.

• The entropy of this message is:

-(3 × 0.2 log10(0.2) + 0.4 log10(0.4)) = 0.5786 digits/symbol

Note: finite-precision arithmetic might cause problems due to truncations!
Arithmetic Decoding

Decode 0.572  →  a3 a3 a1 a2 a4

(figure: the decoder repeatedly subdivides the current interval in proportion to the symbol probabilities and picks the sub-interval containing 0.572, e.g., [0, 1) → [0.4, 0.8) → [0.56, 0.72) → …)
LZW Coding (interpixel
redundancy)
• Requires no a priori knowledge of the pixel probability distribution.

• Assigns fixed length code words to variable length sequences.

• Patented Algorithm US 4,558,302

• Included in GIF and TIFF and PDF file formats


LZW Coding
• A codebook (or dictionary) needs to be constructed.

• Initially, the first 256 entries of the dictionary are assigned


to the gray levels 0,1,2,..,255 (i.e., assuming 8 bits/pixel)

Consider a 4x4, 8-bit image (each row: 39 39 126 126).

Initial Dictionary:
Location  Entry
0         0
1         1
...       ...
255       255
256       -
...       ...
511       -
LZW Coding (cont’d)

As the encoder examines the image pixels (39 39 126 126 on each row), gray-level sequences (i.e., blocks) that are not in the dictionary are assigned to a new entry:

- Is 39 in the dictionary? …… Yes
- What about 39-39? ………… No
- Then add 39-39 at entry 256
Example
Applied to the 4x4 image above, with concatenated sequence CS = CR + P (CR: currently recognized sequence, P: next pixel):

CR = empty
If CS is found in the dictionary D:
    (1) No output
    (2) CR = CS
else:
    (1) Output D(CR)
    (2) Add CS to D
    (3) CR = P
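The encoder logic above translates almost line for line into Python. Run on the 4x4 image from the slides it emits a mix of single-pixel codes and new dictionary entries:

```python
def lzw_encode(pixels):
    """LZW: grow a dictionary of pixel sequences on the fly and emit
    the dictionary index of each longest already-known sequence."""
    dictionary = {(k,): k for k in range(256)}  # gray levels 0..255
    next_code = 256
    out, cr = [], ()          # cr: currently recognized sequence
    for p in pixels:
        cs = cr + (p,)        # concatenated sequence CS = CR + P
        if cs in dictionary:
            cr = cs           # keep growing, no output
        else:
            out.append(dictionary[cr])
            dictionary[cs] = next_code
            next_code += 1
            cr = (p,)
    if cr:
        out.append(dictionary[cr])
    return out

# The 4x4 image from the slides, row by row:
image = [39, 39, 126, 126] * 4
print(lzw_encode(image))  # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```

16 pixels are encoded with 10 code words; 9-bit code words would give 16x8 / (10x9) ≈ 1.42:1 on this tiny example.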
Decoding LZW

• The dictionary which was used for encoding need not be sent with the image.

• It can be built on the fly by the decoder as it reads the received code words.
Differential Pulse Code Modulation (DPCM)
Coding (interpixel redundancy)
• A predictive coding approach.
• Each pixel value (except at the boundaries) is predicted based on
its neighbors (e.g., linear combination) to get a predicted image.
• The difference between the original and predicted images yields a differential or residual image, which has a much smaller dynamic range of pixel values.
• The differential image is encoded using Huffman coding.
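A 1-D sketch of lossless DPCM with a trivial previous-pixel predictor (the sample row is made up); note how the residuals cluster near zero:

```python
def dpcm_encode(row):
    """1-D lossless DPCM sketch: predict each pixel by its left
    neighbor; the residuals have a much smaller dynamic range."""
    residuals = [row[0]]  # first (boundary) pixel sent as-is
    for i in range(1, len(row)):
        residuals.append(row[i] - row[i - 1])
    return residuals

def dpcm_decode(residuals):
    """Invert the prediction to recover the row exactly."""
    row = [residuals[0]]
    for d in residuals[1:]:
        row.append(row[-1] + d)
    return row

row = [100, 102, 105, 105, 104, 110]
res = dpcm_encode(row)
print(res)                       # [100, 2, 3, 0, -1, 6]
assert dpcm_decode(res) == row   # lossless round trip
```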
Run-length coding (RLC)
(interpixel redundancy)
• Used to reduce the size of a repeating string of characters (i.e., runs):

1 1 1 1 1 0 0 0 0 0 0 1  →  (1, 5) (0, 6) (1, 1)

a a a b b b b b b c c  →  (a, 3) (b, 6) (c, 2)

• Encodes each run of symbols into two bytes: (symbol, count)


• Can compress any type of data but cannot achieve high
compression ratios compared to other compression methods.
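A minimal sketch of the (symbol, count) encoding, using the two examples above:

```python
def run_length_encode(seq):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    runs = []
    for s in seq:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((s, 1))               # start a new run
    return runs

print(run_length_encode([1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]))
# [(1, 5), (0, 6), (1, 1)]
print(run_length_encode("aaabbbbbbcc"))  # [('a', 3), ('b', 6), ('c', 2)]
```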
Bit-plane coding (interpixel redundancy)

• An effective technique to reduce interpixel redundancy is to process each bit plane individually.

(1) Decompose an image into a series of binary images.

(2) Compress each binary image (e.g., using run-length coding)
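Step (1), the decomposition into binary images, can be sketched as follows (the 2x2 sample values are made up):

```python
def bit_planes(image, bits=8):
    """Split an image into `bits` binary images; plane b holds bit b
    of every pixel (plane 7 = most significant for 8-bit data)."""
    return [[[(p >> b) & 1 for p in row] for row in image]
            for b in range(bits)]

image = [[130, 5], [255, 0]]
planes = bit_planes(image)
print(planes[7])  # MSB plane: [[1, 0], [1, 0]]
```

Summing the planes back with their weights (Σ plane_b · 2^b) recovers the image exactly, so the decomposition is lossless.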


Combining Huffman Coding
with Run-length Coding
• Assuming that a message has been encoded using Huffman
coding, additional compression can be achieved using run-
length coding.

e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)
Lossy Compression
• Transform the image into a domain where compression can be
performed more efficiently (i.e., reduce interpixel redundancies).

~ (N/n)2 subimages
Example: Fourier Transform

The magnitude of the FT decreases as u, v increase!

(keep only the first K x K coefficients, K << N)
Transform Selection

• T(u,v) can be computed using various transformations, for


example:
• DFT
• DCT (Discrete Cosine Transform)
• KLT (Karhunen-Loeve Transformation)
DCT

forward:
T(u,v) = a(u) a(v) Σx Σy f(x,y) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n],   x, y = 0, …, n-1

inverse:
f(x,y) = Σu Σv a(u) a(v) T(u,v) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n],   u, v = 0, …, n-1

a(u) = sqrt(1/n) if u = 0,  sqrt(2/n) if u > 0   (and similarly for a(v))
DCT (cont’d)
• Basis set of functions for a 4x4 image (i.e.,cosines of different
frequencies).
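A naive (O(n^4)) implementation of the forward transform, useful for checking small blocks; as a sanity check, a constant block puts all of its energy into the DC coefficient T(0,0):

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an n x n block, with the normalization
    a(0) = sqrt(1/n), a(u>0) = sqrt(2/n)."""
    n = len(block)
    a = lambda u: math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
    return [[a(u) * a(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

# A constant 4x4 block: all energy ends up in T(0,0).
block = [[8] * 4 for _ in range(4)]
T = dct2(block)
print(round(T[0][0]))  # 32; every AC coefficient is ~0
```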
DCT (cont’d)
DFT WHT DCT
8 x 8 subimages

64 coefficients
per subimage

50% of the
coefficients
truncated

RMS error: 2.32 1.78 1.13


DCT (cont’d)
• DCT minimizes "blocking artifacts" (i.e., boundaries between
subimages do not become very visible).

DFT
i.e., n-point periodicity
gives rise to
discontinuities!

DCT
i.e., 2n-point periodicity
prevents
discontinuities!
DCT (cont’d)

• Subimage size selection:

original 2 x 2 subimages 4 x 4 subimages 8 x 8 subimages


JPEG Compression
• JPEG is an image compression standard which was accepted as
an international standard in 1992.
• Developed by the Joint Photographic Expert Group of the
ISO/IEC for coding and compression of color/gray scale
images.
• Yields acceptable compression in the 10:1 range.
• A scheme for video compression based on JPEG, called Motion JPEG (MJPEG), exists.
JPEG Compression (cont’d)
• JPEG uses DCT for handling interpixel redundancy.

• Modes of operation:
(1) Sequential DCT-based encoding
(2) Progressive DCT-based encoding
(3) Lossless encoding
(4) Hierarchical encoding
JPEG Compression
(Sequential DCT-based encoding)

Entropy
encoder

Entropy
decoder
JPEG Steps

1. Divide the image into 8x8 subimages;

For each subimage do:

2. Shift the gray levels to the range [-128, 127]

- DCT requires the range to be centered around 0

3. Apply DCT (i.e., 64 coefficients)


1 DC coefficient: F(0,0)
63 AC coefficients: F(u,v)
Example

(non-centered
spectrum)
JPEG Steps

4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot):

F'(u,v) = round( F(u,v) / Q(u,v) )

Q(u,v): quantization table


Example
• Quantization Table Q[i][j]
Example (cont’d)

Quantization
JPEG Steps (cont’d)
5. Order the coefficients using zig-zag ordering
- Place non-zero coefficients first
- Create long runs of zeros (i.e., good for run-length encoding)
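The zig-zag traversal can be sketched by walking the anti-diagonals, alternating direction; the 3x3 example block is constructed so the output is simply 1..9:

```python
def zigzag_order(block):
    """Read an n x n block along anti-diagonals, alternating direction,
    so low-frequency coefficients come first and the trailing zeros
    form long runs."""
    n = len(block)
    coords = []
    for d in range(2 * n - 1):
        diag = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        coords += diag if d % 2 else diag[::-1]
    return [block[i][j] for i, j in coords]

block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
print(zigzag_order(block))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```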
Example
JPEG Steps (cont’d)

6. Form intermediate symbol sequence and encode coefficients:

6.1 DC coefficients: predictive encoding


6.2 AC coefficients: variable length coding
Intermediate Coding

DC:  symbol_1 (SIZE)  symbol_2 (AMPLITUDE)
     e.g., (6) (61)

AC:  symbol_1 (RUN-LENGTH, SIZE)  symbol_2 (AMPLITUDE)
     e.g., (0, 2) (-3), …, end of block

SIZE: # of bits for encoding the amplitude
RUN-LENGTH: run of zeros preceding the coefficient
DC/AC Symbol Encoding

• DC coefficients (predictive coding):
  symbol_1 (SIZE), symbol_2 (AMPLITUDE)
  AMPLITUDE in [-2048, 2047] = [-2^11, 2^11 - 1],  1 ≤ SIZE ≤ 11

• AC coefficients:
  e.g., 0 0 0 0 0 0 476  →  (6, 9)(476)
  AMPLITUDE in [-2^10, 2^10 - 1],  1 ≤ SIZE ≤ 10

If RUN-LENGTH > 15, use symbol (15, 0), i.e., RUN-LENGTH = 16
Entropy Encoding (e.g., Huffman)

Symbol_1: Variable Length Code (VLC)
Symbol_2: Variable Length Integer (VLI)

(1,4)(12)  →  (111110110 1100)
               VLC       VLI
Effect of “Quality”

quality 10 (8k bytes)   quality 50 (21k bytes)   quality 90 (58k bytes)

worst quality / highest compression  ↔  best quality / lowest compression
Effect of “Quality” (cont’d)
Example 1: homogeneous 8 x
8 block
Example 1 (cont’d)

Quantized De-quantized
Example 1 (cont’d)

Reconstructed Error
Example 2: less homogeneous 8
x 8 block
Example 2 (cont’d)

Quantized De-quantized
Example 2 (cont’d)

Reconstructed – spatial Error


JPEG for Color Images
• Transform RGB to YUV or YIQ and subsample color.
Color Specification
• Luminance: the perceived brightness of the light, proportional to the total energy in the visible band.
• Chrominance: describes the perceived color tone of a light, which depends on the wavelength composition of the light.
Chrominance is in turn characterized by two attributes:
• Hue: specifies the color tone, which depends on the peak wavelength of the light.
• Saturation: describes how pure the color is, which depends on the spread (bandwidth) of the light spectrum.

NTU, GICE, MD531, DISP Lab An Introduction to Image Compression Wei-Yi Wei 90
YUV Color Space
In many applications it is desirable to describe a color in terms of its luminance and chrominance content separately, to enable more efficient processing and transmission of color signals.
One such coordinate system is the YUV (YCbCr) color space:
• Y is the luminance component
• Cb and Cr are the chrominance components
The YUV values are related to the RGB values by:

Y  =  0.299 R + 0.587 G + 0.114 B
Cb = -0.169 R - 0.334 G + 0.500 B + 128
Cr =  0.500 R - 0.419 G - 0.081 B + 128

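A quick check of the conversion matrix above, applied to a single pixel; mid-gray maps to Y = 128 with near-neutral chrominance:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) using the matrix from the
    slide; Cb and Cr are offset by 128 so they fit in [0, 255]."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.334 * g + 0.500 * b + 128
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr

print(rgb_to_ycbcr(128, 128, 128))  # mid-gray: Y = 128, Cb ≈ Cr ≈ 128
```

(The Cb coefficients in the slide sum to -0.003 rather than exactly 0, so grays land at Cb ≈ 127.6 rather than exactly 128.)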
Luminance/chrominance representations are commonly used, with the chrominance subsampled because human vision is less sensitive to it.
Spatial Sampling of Color Components
The three chrominance downsampling formats:
(a) 4:4:4 (Y, Cb, and Cr all at full resolution, W x H)
(b) 4:2:2 (Cb and Cr at half horizontal resolution, W/2 x H)
(c) 4:2:0 (Cb and Cr at half resolution in both directions, W/2 x H/2)
JPEG

The JPEG Encoder (block diagram):
RGB image → YUV color coordinates → chrominance downsampling (4:2:2 or 4:2:0) → 8x8 FDCT → quantizer (quantization table) → zigzag → differential coding (DC) and Huffman encoding → bit-stream

The JPEG Decoder (block diagram):
bit-stream → Huffman decoding and DC decoding → de-zigzag → dequantizer (quantization table) → 8x8 IDCT → chrominance upsampling (4:2:2 or 4:2:0) → YUV → RGB decoded image
JPEG Modes
• JPEG supports several different modes
• Sequential Mode
• Progressive Mode
• Hierarchical Mode
• Lossless Mode

• Sequential is the default mode


• Each image component is encoded in a single left-to-right, top-to-
bottom scan.
• This is the mode we have been describing.
Progressive JPEG
• The image is encoded in multiple scans, in order to produce a
quick, rough decoded image when transmission time is long.

Sequential

Progressive
Progressive JPEG (cont’d)
• Each scan codes a subset of the DCT coefficients.
(1) Progressive spectral selection algorithm
(2) Progressive successive approximation
algorithm
(3) Combined progressive algorithm
Progressive JPEG (cont’d)
(1) Progressive spectral selection algorithm
• Group DCT coefficients into several spectral bands
• Send low-frequency DCT coefficients first
• Send higher-frequency DCT coefficients next
Progressive JPEG (cont’d)
(2) Progressive successive approximation algorithm
• Send all DCT coefficients but with lower precision.
• Refine DCT coefficients in later scans.
Progressive JPEG (cont’d)
(3) Combined progressive algorithm
• Combines spectral selection and successive approximation.
Results using spectral
selection
Results using successive
approximation
Example using successive
approximation
after 0.9s after 1.6s

after 3.6s after 7.0s


Hierarchical JPEG
• Hierarchical mode encodes the image at several different
resolutions.

• Image is transmitted in multiple passes with increased


resolution at each pass.
Hierarchical JPEG (cont’d)
Hierarchical JPEG (cont’d)
Hierarchical JPEG (cont’d)

N/4 x N/4

N/2 x N/2

NxN
Lossless JPEG
• Uses predictive coding (see next slides)
JPEG 2000

The JPEG 2000 Encoder (block diagram):
RGB image → JPEG 2000 component transform → forward 2D DWT → quantization → EBCOT (Tier-1: context modeling and arithmetic coding; Tier-2: rate-distortion control) → bit-stream

The JPEG 2000 Decoder (block diagram):
bit-stream → EBCOT decoder → dequantization → 2D IDWT → inverse component transform → RGB decoded image
(figures: original image, JPEG at 27:1, and JPEG 2000 at 27:1)
Lossy Methods: Taxonomy
Lossless Differential Pulse Code
Modulation (DPCM) Coding
• Each pixel value (except at the boundaries) is predicted based on
certain neighbors (e.g., linear combination) to get a predicted
image.
• The difference between the original and predicted images yields a
differential or residual image.
• Encode differential image using Huffman coding.

(block diagram: input xm minus prediction pm gives the residual dm = xm - pm, which goes to the entropy encoder; the predictor computes pm from previously seen pixels)
Lossy Differential Pulse Code
Modulation (DPCM) Coding
• Similar to lossless DPCM except that (i) it uses quantization and (ii)
the pixels are predicted from the “reconstructed” values of certain
neighbors.
Block Truncation Coding
• Divide image in non-overlapping blocks of pixels.
• Derive a bitmap (0/1) for each block using thresholding.
• e.g., use mean pixel value in each block as threshold.
• For each group of 1s and 0s, determine reconstruction value
• e.g., average of corresponding pixel values in original block.
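The three steps can be sketched as follows for a single block (the 2x2 block values are made up):

```python
def btc_block(block):
    """BTC sketch for one block: threshold at the block mean, then
    represent the 1s and 0s by the means of the pixels they cover."""
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    bitmap = [[1 if p >= mean else 0 for p in row] for row in block]
    ones  = [p for p in flat if p >= mean]
    zeros = [p for p in flat if p < mean]
    high = sum(ones) / len(ones) if ones else 0
    low  = sum(zeros) / len(zeros) if zeros else 0
    # reconstruction: every 1 becomes `high`, every 0 becomes `low`
    recon = [[high if b else low for b in row] for row in bitmap]
    return bitmap, low, high, recon

block = [[10, 10], [200, 200]]
bitmap, low, high, recon = btc_block(block)
print(bitmap, low, high)  # [[0, 0], [1, 1]] 10.0 200.0
```

Only the bitmap plus the two reconstruction values need to be stored per block.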
Subband Coding
• Analyze image to produce components containing frequencies in
well defined bands (i.e., subbands)
• e.g., use wavelet transform or steerable filters.
• Optimize quantization/coding in each subband.
Vector Quantization
• Develop a dictionary of fixed-size vectors (i.e., code vectors),
usually blocks of pixel values.
• Partition image in non-overlapping blocks (i.e., image vectors).
• Encode each image vector by the index of its closest code vector.
Fractal Coding
• Decompose image into segments (i.e., using standard
segmentations techniques based on edges, color, texture, etc.) and
look them up in a library of IFS codes.
• Best suited for textures and natural images, relying on the fact that
parts of an image often resemble other parts of the same image
(i.e., self-similarity).
Fingerprint Compression
• An image coding standard for digitized fingerprints, developed
and maintained by:
• FBI
• Los Alamos National Lab (LANL)
• National Institute of Standards and Technology (NIST).

• The standard employs a discrete wavelet transform-based


algorithm (Wavelet/Scalar Quantization or WSQ).
Memory Requirements
• FBI is digitizing fingerprints at 500 dots per inch with 8 bits of
grayscale resolution.
• A single fingerprint card turns into about 10 MB of data!

A sample fingerprint image


768 x 768 pixels =589,824 bytes
Preserving Fingerprint Details

• The “white” spots in the middle of the black ridges are sweat pores.

• They’re admissible points of identification in court, as are the little black flesh “islands” in the grooves between the ridges.

• These details are just a couple of pixels wide!

What compression scheme should be used?

• Better to use a lossless method, to preserve every pixel perfectly.

• Unfortunately, in practice lossless methods haven’t done better than 2:1 on fingerprints!

• Does JPEG work well for fingerprint compression?


Results using JPEG
compression
file size 45853 bytes
compression ratio: 12.9

• The fine details are pretty much history, and the whole image has an artificial “blocky” pattern superimposed on it.

• The blocking artifacts affect the performance of manual or automated systems!
Results using WSQ
compression
file size 45621 bytes
compression ratio: 12.9

• The fine details are preserved better than they are with JPEG.

• NO blocking artifacts!
WSQ Algorithm
Varying compression ratio
• FBI’s target bit rate is around 0.75 bits per pixel (bpp)
• i.e., corresponds to a target compression ratio of 10.7
(assuming 8-bit images)

• This target bit rate is set via a ‘‘knob’’ on the WSQ algorithm.
• i.e., similar to the "quality" parameter in many JPEG
implementations.
Varying compression ratio
(cont’d)
• In practice, the WSQ algorithm yields a higher compression ratio than the target because of unpredictable amounts of lossless entropy coding gain.
• i.e., mostly due to variable amounts of blank space in the images.

• Fingerprints coded with WSQ at a target of 0.75 bpp will actually come in at around 15:1.
Varying compression ratio
(cont’d)
Original image 768 x 768 pixels (589824 bytes)
Varying compression ratio (cont’d)
0.9 bpp compression

WSQ image: file size 47,619 bytes, compression ratio 12.4
JPEG image: file size 49,658 bytes, compression ratio 11.9
Varying compression ratio (cont’d)
0.75 bpp compression
WSQ image: file size 39,270 bytes, compression ratio 15.0
JPEG image: file size 40,780 bytes, compression ratio 14.5
Varying compression ratio (cont’d)
0.6 bpp compression

WSQ image: file size 30,987 bytes, compression ratio 19.0
JPEG image: file size 30,081 bytes, compression ratio 19.6
Video Compression
