
Image Data Compression

1) Image data compression exploits redundancy in images to allow for more efficient coding and storage. Common sources of redundancy include correlated pixel values and repetitive patterns.
2) The Karhunen-Loève transform determines uncorrelated variables from correlated variables through a linear transformation, allowing images to be represented with fewer dimensions while minimizing information loss.
3) This transform diagonalizes the covariance matrix to find principal components that account for the most variance. Omitting less informative components achieves compression. Reconstruction approximates the original data.

Uploaded by

Hussain

Image Data Compression

Image data compression is important for


- image archiving e.g. satellite data
- image transmission e.g. web data
- multimedia applications e.g. desk-top editing

Image data compression exploits redundancy for more efficient coding:

[Figure: digitized image → data redundancy reduction → coding → transmission, storage, archiving → decoding → reconstruction → digitized image]

Run Length Coding


Images with repeating greyvalues along rows (or columns) can be
compressed by storing "runs" of identical greyvalues in the format:

greyvalue1 repetition1 greyvalue2 repetition2 •••

For B/W images (e.g. fax data) another run length code is used:

row #, column # (run1 begin), column # (run1 end), column # (run2 begin), column # (run2 end), •••

[Figure: B/W image grid with columns 0 to 15 and rows 0 to 3]

Run length coding (each tuple starts with the row number, followed by begin/end column pairs):

(0 3 5 9 9)
(1 1 7 9 9)
(3 4 4 6 6 8 8 10 10 12 14)
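
Both run length formats can be sketched in a few lines of Python. This is a minimal illustration, not a fax standard: the function names `rle_grey`, `rle_binary` and `make_row` are hypothetical, and skipping rows without any set pixel is an assumption read off the example above.

```python
from itertools import groupby


def rle_grey(row):
    """Encode a row as [greyvalue1, repetition1, greyvalue2, repetition2, ...]."""
    code = []
    for value, group in groupby(row):
        code.extend([value, len(list(group))])
    return code


def rle_binary(image):
    """Encode a B/W image as tuples (row, run1 begin, run1 end, run2 begin, ...).

    Assumption of this sketch: rows without any set pixel yield no tuple.
    """
    codes = []
    for r, row in enumerate(image):
        runs, col = [], 0
        for value, group in groupby(row):
            length = len(list(group))
            if value:  # run of set pixels: store first and last column
                runs.extend([col, col + length - 1])
            col += length
        if runs:
            codes.append((r, *runs))
    return codes


def make_row(width, runs):
    """Demo helper: build a binary row from (begin, end) column pairs."""
    row = [0] * width
    for begin, end in runs:
        for c in range(begin, end + 1):
            row[c] = 1
    return row


# Image chosen so that its code reproduces the three tuples of the example.
image = [
    make_row(16, [(3, 5), (9, 9)]),
    make_row(16, [(1, 7), (9, 9)]),
    make_row(16, []),
    make_row(16, [(4, 4), (6, 6), (8, 8), (10, 10), (12, 14)]),
]
codes = rle_binary(image)
```

Here `codes` holds one tuple per non-empty row, and `rle_grey([7, 7, 7, 2, 2, 9])` gives the greyvalue/repetition pairs of a single row.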

Probabilistic Data Compression
A discrete image encodes information redundantly if
1. the greyvalues of individual pixels are not equally probable
2. the greyvalues of neighbouring pixels are correlated

Information Theory provides limits for minimal encoding of probabilistic information sources.

Redundancy of the encoding of individual pixels with G greylevels each:

$r = b - H$

$b = \lceil \log_2 G \rceil$ = number of bits used for each pixel

$H = \sum_{g=0}^{G-1} P(g) \log_2 \frac{1}{P(g)}$ = entropy of pixel source
= mean number of bits required to encode information of this source

The entropy of a pixel source with equally probable greyvalues is $H = \log_2 G$, which is equal to the number of bits required for coding when G is a power of two.
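
The redundancy r = b - H can be checked numerically. The following is a minimal sketch; the function names `entropy` and `redundancy` and the sample distributions are illustrative assumptions.

```python
from math import ceil, log2


def entropy(probabilities):
    """H = sum of P(g) * log2(1 / P(g)); greyvalues with P(g) = 0 contribute nothing."""
    return sum(p * log2(1.0 / p) for p in probabilities if p > 0)


def redundancy(probabilities):
    """r = b - H with b = ceil(log2 G) bits per pixel for G greylevels."""
    bits = ceil(log2(len(probabilities)))
    return bits - entropy(probabilities)


uniform = [1.0 / 256] * 256            # 256 equally probable greyvalues
skewed = [0.5, 0.25, 0.125, 0.125]     # 4 greyvalues, not equally probable

r_uniform = redundancy(uniform)        # no redundancy: H equals 8 bits
r_skewed = redundancy(skewed)          # 2 bits per pixel, but H is only 1.75 bits
```

For the uniform source the redundancy vanishes; for the skewed source a fixed 2-bit code wastes 0.25 bit per pixel on average.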

Huffman Coding
The Huffman coding scheme provides a variable-length code with
minimal average code-word length, i.e. least possible redundancy, for a
discrete message source. (Here messages are greyvalues)

1. Sort messages by increasing probability such that g(1) and g(2)
are the least probable messages
2. Assign 1 to code word of g(1) and 0 to code word of g(2)
3. Merge g(1) and g(2) by adding their probabilities
4. Repeat steps 1 - 3 until a single message is left.

Example:
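message   probability   code word
g1        0.30          00
g2        0.25          01
g3        0.25          10
g4        0.10          110
g5        0.10          111

Coding tree: g4 (0) and g5 (1) merge into a node with probability 0.20; g3 (0) and this node (1) merge into 0.45; g1 (0) and g2 (1) merge into 0.55; the nodes 0.55 (0) and 0.45 (1) merge into the root.

Entropy: H = 2.185
Average code word length of Huffman code: 2.2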
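
The merging procedure can be sketched with a priority queue. This is a minimal illustration under stated assumptions: the function name `huffman` is hypothetical, and ties between equal probabilities are broken by insertion order, so the individual code words may differ from the example while the average code word length stays the same.

```python
import heapq
from itertools import count


def huffman(probabilities):
    """Return {message: code word} for a dict {message: probability}."""
    tiebreak = count()  # unique counter so that equal probabilities never compare dicts
    heap = [(p, next(tiebreak), {m: ""}) for m, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, least = heapq.heappop(heap)      # g(1): least probable
        p2, _, second = heapq.heappop(heap)     # g(2): second least probable
        merged = {m: "1" + code for m, code in least.items()}
        merged.update({m: "0" + code for m, code in second.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]


probs = {"g1": 0.3, "g2": 0.25, "g3": 0.25, "g4": 0.10, "g5": 0.10}
codes = huffman(probs)
average_length = sum(probs[m] * len(codes[m]) for m in probs)  # 2.2 bits
```

The resulting code is prefix-free, with three code words of length 2 and two of length 3.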
Statistical Dependence
An image may be modelled as a set of statistically dependent random
variables with a multivariate distribution p(x1, x2, ..., xN) = p(x).
Often the exact distribution is unknown and only correlations can be
(approximately) determined.

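Correlation of two variables: $E[x_i x_j] = c_{ij}$

Covariance of two variables: $E[(x_i - m_i)(x_j - m_j)] = v_{ij}$ with $m_k$ = mean of $x_k$

Correlation matrix:

$E[\mathbf{x}\mathbf{x}^T] = \begin{pmatrix} c_{11} & c_{12} & c_{13} & \cdots \\ c_{21} & c_{22} & c_{23} & \\ c_{31} & c_{32} & c_{33} & \\ \vdots & & & \end{pmatrix}$

Covariance matrix:

$E[(\mathbf{x}-\mathbf{m})(\mathbf{x}-\mathbf{m})^T] = \begin{pmatrix} v_{11} & v_{12} & v_{13} & \cdots \\ v_{21} & v_{22} & v_{23} & \\ v_{31} & v_{32} & v_{33} & \\ \vdots & & & \end{pmatrix}$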

Uncorrelated variables need not be statistically independent (shown here for zero-mean variables):

$E[x_i x_j] = 0 \;\not\Rightarrow\; p(x_i, x_j) = p(x_i)\,p(x_j)$

For jointly Gaussian random variables, uncorrelatedness implies statistical independence.
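
A small counterexample makes the difference concrete. The sketch below assumes a hypothetical pair of variables: x uniform on {-1, 0, 1} and y = x², computed with exact fractions.

```python
from fractions import Fraction

# x is uniform on {-1, 0, 1} and y = x * x, so y is fully determined by x.
outcomes = [(x, x * x) for x in (-1, 0, 1)]
p = Fraction(1, 3)  # probability of each outcome

mean_x = sum(p * x for x, _ in outcomes)
mean_y = sum(p * y for _, y in outcomes)
covariance = sum(p * (x - mean_x) * (y - mean_y) for x, y in outcomes)

# Compare the joint probability of (x = 0, y = 0) with the product of the marginals.
p_x0 = sum(p for x, _ in outcomes if x == 0)
p_y0 = sum(p for _, y in outcomes if y == 0)
p_joint = sum(p for x, y in outcomes if x == 0 and y == 0)
```

The covariance is exactly zero, yet `p_joint` is 1/3 while `p_x0 * p_y0` is 1/9, so the two variables are uncorrelated but not independent.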

Karhunen-Loève Transform
(also known as Hotelling Transform or Principal Components Transform)

Determine uncorrelated variables y from correlated variables x by a linear transformation.

$\mathbf{y} = A(\mathbf{x} - \mathbf{m})$

$E[\mathbf{y}\mathbf{y}^T] = A\,E[(\mathbf{x}-\mathbf{m})(\mathbf{x}-\mathbf{m})^T]\,A^T = A V A^T = D$, where D is a diagonal matrix

• An orthonormal matrix A which diagonalizes the real symmetric covariance matrix V always exists.
• The rows of A are the eigenvectors of V, D is the diagonal matrix of the corresponding eigenvalues.

$\mathbf{x} = A^T \mathbf{y} + \mathbf{m}$   (reconstruction of x from y)

If x is viewed as a point in n-dimensional Euclidean space, then A defines a rotated coordinate system.

Illustration of Minimum-loss Dimension Reduction
Using the Karhunen-Loève transform, data compression is achieved by
• changing (rotating) the coordinate system
• omitting the least informative dimension(s) in the new coordinate system

Example:
[Figure: scatter of data points in the (x1, x2) coordinate system, the rotated coordinate axes (y1, y2), and the points projected onto the y1 axis]

Compression and Reconstruction with the Karhunen-Loève Transform
Assume that the eigenvalues λn and the corresponding eigenvectors in A are
sorted in decreasing order λ1 ≥ λ2 ≥ ... ≥ λN

$D = \begin{pmatrix} \lambda_1 & 0 & 0 & \cdots \\ 0 & \lambda_2 & 0 & \\ 0 & 0 & \lambda_3 & \\ \vdots & & & \ddots \end{pmatrix}$

Eigenvectors a and eigenvalues λ are defined by $V\mathbf{a} = \lambda\mathbf{a}$ and can be determined by solving $\det[V - \lambda I] = 0$.
There exist special procedures for determining eigenvalues of real symmetric matrices V.

Then x can be transformed into a K-dimensional vector $\mathbf{y}_K$, K < N, with a transformation matrix $A_K$ containing only the first K eigenvectors of A corresponding to the largest K eigenvalues.

$\mathbf{y}_K = A_K(\mathbf{x} - \mathbf{m})$

The approximate reconstruction x′ minimizing the MSE is

$\mathbf{x}' = A_K^T \mathbf{y}_K + \mathbf{m}$

Hence $\mathbf{y}_K$ can be used for data compression!


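Example for Karhunen-Loève Compression
N = 3, $\mathbf{x}^T = [x_1\ x_2\ x_3]$, m = 0

$V = \begin{pmatrix} 2 & -0.866 & -0.5 \\ -0.866 & 2 & 0 \\ -0.5 & 0 & 2 \end{pmatrix}$

$\det(V - \lambda I) = 0 \;\Rightarrow\; \lambda_1 = 3,\ \lambda_2 = 2,\ \lambda_3 = 1$

$A^T = \begin{pmatrix} 0.707 & 0 & 0.707 \\ -0.612 & 0.5 & 0.612 \\ -0.354 & -0.866 & 0.354 \end{pmatrix}$   $D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

Compression into K=2 dimensions:

$\mathbf{y}_2 = A_2 \mathbf{x} = \begin{pmatrix} 0.707 & -0.612 & -0.354 \\ 0 & 0.5 & -0.866 \end{pmatrix} \mathbf{x}$

Reconstruction from compressed values:

$\mathbf{x}' = A_2^T \mathbf{y}_2 = \begin{pmatrix} 0.707 & 0 \\ -0.612 & 0.5 \\ -0.354 & -0.866 \end{pmatrix} \mathbf{y}_2$

Note the discrepancies between the original and the approximated values:

x1´ = 0.5 x1 - 0.433 x2 - 0.25 x3
x2´ = -0.433 x1 + 0.625 x2 - 0.217 x3
x3´ = -0.25 x1 - 0.217 x2 + 0.875 x3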
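
The numbers of this example can be checked with plain Python lists. The sketch below takes V and the eigenvectors from the example (rounded to three digits, so results agree only to about two decimals); the helper names `matmul` and `transpose` and the test vector x are illustrative assumptions.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


def transpose(m):
    return [list(col) for col in zip(*m)]


V = [[2.0, -0.866, -0.5],
     [-0.866, 2.0, 0.0],
     [-0.5, 0.0, 2.0]]

# Rows of A are the eigenvectors belonging to the eigenvalues 3, 2, 1.
A = [[0.707, -0.612, -0.354],
     [0.0, 0.5, -0.866],
     [0.707, 0.612, 0.354]]

D = matmul(matmul(A, V), transpose(A))   # approximately diag(3, 2, 1)

A2 = A[:2]                                # keep the two largest eigenvalues
P = matmul(transpose(A2), A2)             # x' = A2^T A2 x because m = 0

x = [[1.0], [2.0], [3.0]]                 # hypothetical input vector (column)
y2 = matmul(A2, x)                        # compressed: 2 values instead of 3
x_rec = matmul(transpose(A2), y2)         # approximate reconstruction
```

The matrix `P` is symmetric, as any projection onto an eigenvector subspace must be, and its rows are the coefficients of x1´, x2´ and x3´.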

Eigenfaces (1)
Turk & Pentland: Face Recognition Using Eigenfaces (1991)

Eigenfaces = eigenvectors of covariance matrix of normalized face images

Example images of eigenface project at Rice University

Eigenfaces (2)
First 18 eigenfaces determined from covariance matrix of 86 face images


Eigenfaces (3)
Original images and reconstructions from 50 eigenfaces

Predictive Compression
Principle:
• estimate gmn´ from greyvalues in the neighbourhood of (mn)
• encode difference dmn = gmn - gmn´
• transmit difference data + predictor

For a 1D signal this is known as Differential Pulse Code Modulation (DPCM):

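[Figure: DPCM block diagram. Compression: the prediction f´(t) from the predictor is subtracted from f(t), and the difference d(t) passes through quantizer and coder. Reconstruction: the decoder output d(t) is added to the prediction f´(t) from the predictor to give f(t).]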

Linear predictor for a neighbourhood of K pixels:

$g'_{mn} = a_1 g_1 + a_2 g_2 + \dots + a_K g_K$

Computation of a1 ... aK by minimizing the expected reconstruction error
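
The principle can be sketched for a 1D signal. This is a deliberately simplified illustration, not full DPCM: the predictor is assumed to be the previous sample, f´(t) = f(t-1) with f´(0) = 0, and the quantizer is left out, so the scheme is lossless. The function names are hypothetical.

```python
def dpcm_encode(signal):
    """Return the differences d(t) = f(t) - f'(t) for the predictor f'(t) = f(t-1)."""
    differences, prediction = [], 0
    for value in signal:
        differences.append(value - prediction)
        prediction = value  # the next prediction is the current sample
    return differences


def dpcm_decode(differences):
    """Rebuild f(t) = f'(t) + d(t) using the same predictor as the encoder."""
    signal, prediction = [], 0
    for d in differences:
        prediction += d
        signal.append(prediction)
    return signal


signal = [100, 102, 103, 103, 101, 98]
differences = dpcm_encode(signal)      # small values after the first sample
restored = dpcm_decode(differences)
```

For slowly varying signals the differences cluster around zero, which is what makes them cheaper to encode, e.g. with a Huffman code.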

Example of Linear Predictor


For images, a linear predictor based on 3 pixels (3rd order) is often sufficient:

$g'_{mn} = a_1 g_{m,n-1} + a_2 g_{m-1,n-1} + a_3 g_{m-1,n}$

If $g_{mn}$ is a zero mean stationary random process with autocorrelation C, then minimizing the expected error gives

$a_1 c_{00} + a_2 c_{01} + a_3 c_{11} = c_{10}$
$a_1 c_{01} + a_2 c_{00} + a_3 c_{10} = c_{11}$
$a_1 c_{11} + a_2 c_{10} + a_3 c_{00} = c_{01}$

[Figure: pixel neighbourhood with the positions 00, 10, 01, 11 along the m and n axes]

This can be solved for a1, a2, a3 using Cramer's rule.
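
Solving the three equations with Cramer's rule can be sketched as follows. The autocorrelation values are a hypothetical assumption (a separable model with c00 = 1, c10 = c01 = ρ, c11 = ρ² and ρ = 0.9), and the function names are illustrative.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(matrix, rhs):
    """Solve matrix * a = rhs: replace column k by rhs and divide the determinants."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("system is singular")
    solution = []
    for k in range(3):
        replaced = [row[:k] + [rhs[r]] + row[k + 1:] for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


rho = 0.9                                    # hypothetical correlation of neighbours
c00, c10, c01, c11 = 1.0, rho, rho, rho * rho

matrix = [[c00, c01, c11],
          [c01, c00, c10],
          [c11, c10, c00]]
rhs = [c10, c11, c01]

a1, a2, a3 = cramer3(matrix, rhs)
```

For this separable model the system has the closed-form solution a1 = ρ, a2 = -ρ², a3 = ρ, i.e. approximately 0.9, -0.81 and 0.9.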

Example:
Predictive compression with 2nd order predictor and Huffman coding, ratio 6.2
Left: Reconstructed image
Right: Difference image with maximal difference of 140 greylevels

Discrete Cosine Transform (DCT)
Discrete Cosine Transform is commonly used for image compression,
e.g. in the JPEG (Joint Photographic Experts Group) Baseline System standard.

Definition of DCT:

$G_{00} = \frac{1}{N} \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} g_{mn}$

$G_{uv} = \frac{1}{2N^3} \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} g_{mn} \cos\left[\frac{(2m+1)u\pi}{2N}\right] \cos\left[\frac{(2n+1)v\pi}{2N}\right]$

Inverse DCT:

$g_{mn} = \frac{1}{N} G_{00} + \frac{1}{2N^3} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} G_{uv} \cos\left[\frac{(2m+1)u\pi}{2N}\right] \cos\left[\frac{(2n+1)v\pi}{2N}\right]$

In effect, the DCT computes a Fourier Transform of a function made symmetric at N by a mirror copy.
=> 1. Result does not contain sine terms 2. No wrap-around errors

Example:
DCT compression with ratio 1 : 5.6
Left: Reconstructed image
Right: Difference image with maximal difference of 125 greylevels
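
A 2D DCT and its inverse can be sketched in plain Python. Note the assumption: this sketch uses the common orthonormal DCT-II convention (scale factors sqrt(1/N) for u = 0 and sqrt(2/N) otherwise, cosine arguments (2m+1)uπ/(2N)), whose constants differ from the definition given above. The function names are illustrative.

```python
from math import cos, pi, sqrt


def dct_matrix(n):
    """Orthonormal DCT-II basis: row u holds c(u) * cos((2m + 1) u pi / (2n))."""
    return [[(sqrt(1.0 / n) if u == 0 else sqrt(2.0 / n))
             * cos((2 * m + 1) * u * pi / (2 * n))
             for m in range(n)]
            for u in range(n)]


def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Forward transform G = C g C^T of a square block."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coefficients):
    """Inverse transform g = C^T G C (C is orthogonal)."""
    c = dct_matrix(len(coefficients))
    return matmul(matmul(transpose(c), coefficients), c)


flat = [[10.0] * 4 for _ in range(4)]      # constant block: only G[0][0] is non-zero
G_flat = dct2(flat)

block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
restored = idct2(dct2(block))              # equals block up to rounding errors
```

A constant block concentrates all its energy in the single coefficient G[0][0], which illustrates why smooth image regions compress well after the transform.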

Principle of Baseline JPEG
(Source: Gibson et al., Digital Compression for Multimedia, Morgan Kaufmann 98)

[Figure: baseline JPEG encoder. Source image data (8 x 8 blocks) → FDCT → Quantizer (with table specifications) → Entropy Encoder (with table specifications) → compressed image data]

• transform RGB into YUV coding, subsample color information


• partition image into 8 x 8 blocks, left-to-right, top-to-bottom
• compute Discrete Cosine Transform (DCT) of each block
• quantize coefficients according to psychovisual quantization tables
• order DCT coefficients in zigzag order
• perform runlength coding of bitstream of all coefficients of a block
• perform Huffman coding for symbols formed by bit patterns of a block
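
The quantization and zigzag steps can be sketched as follows. This is an illustration under assumptions: a single uniform quantization step stands in for the psychovisual quantization tables, the coefficient values are made up, and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zigzag order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals run downwards (row increasing), even ones upwards.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def quantize(coefficients, step=16):
    """Uniform quantization; real JPEG uses a table with one step per coefficient."""
    return [[round(value / step) for value in row] for row in coefficients]


def zigzag_scan(block):
    """Read a square block in zigzag order, low frequencies first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical DCT coefficients: large at low frequencies, small elsewhere.
coefficients = [[400.0 / (1 + 3 * (u + v)) for v in range(8)] for u in range(8)]
scanned = zigzag_scan(quantize(coefficients))
```

After quantization the zigzag scan tends to end in a long run of zeros, which is what the subsequent runlength coding exploits.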
YUV Color Model for JPEG
Human eyes are more sensitive to luminance (brightness) than to
chrominance (color). YUV color coding allows chrominance to be coded
with fewer bits than luminance.

CCIR-601 scheme:

Y = 0.299 R + 0.587 G + 0.114 B "luminance"
Cb = -0.1687 R - 0.3313 G + 0.5 B "blueness" ≈ U
Cr = 0.5 R - 0.4187 G - 0.0813 B "redness" ≈ V

In JPEG:
1 Cb, 1 Cr and 4 Y values for each 2 x 2 image subfield
(6 instead of 12 values)
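
The conversion and the 2 x 2 subsampling can be sketched as follows, using the standard CCIR-601 weights (0.114 for B in Y and a negative R weight in Cb, so that a grey pixel has zero chrominance). Averaging the four chrominance values of a subfield is an assumption of this sketch, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to luminance Y and chrominance Cb, Cr (no offset added)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b
    return y, cb, cr


def encode_subfield(pixels):
    """Encode the four RGB pixels of a 2 x 2 subfield as 4 Y values, 1 Cb and 1 Cr.

    Assumption: the shared chrominance values are the means over the subfield.
    """
    converted = [rgb_to_ycbcr(*pixel) for pixel in pixels]
    ys = [y for y, _, _ in converted]
    cb = sum(c for _, c, _ in converted) / len(converted)
    cr = sum(c for _, _, c in converted) / len(converted)
    return ys + [cb, cr]


grey = rgb_to_ycbcr(128, 128, 128)           # chrominance vanishes for grey
values = encode_subfield([(255, 0, 0), (250, 5, 0), (255, 0, 10), (240, 0, 0)])
```

The list `values` has 6 entries, compared with the 12 RGB values of the four input pixels.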


Illustrations for Baseline JPEG

[Figure: partitioning the image into blocks]
[Figure: DCT coefficient ordering for efficient runlength coding; zigzag order a0, a1, a2, a3, ..., a63 over the 8 x 8 block]
[Figure: transmission sequence for blocks of image; DCT coefficients 0 ... 63 of blocks 1, 2, ..., each with bits 7 (MSB) ... 0 (LSB)]

JPEG-compressed Image

[Figure: original, JPEG-compressed image and difference image]
original: 5.8 MB
JPEG-compressed: 450 KB
difference image: standard deviation of luminance differences: 1.44


Problems with Block Structure of JPEG
JPEG encoding with compression ratio 1:70

block boundaries are visible

Progressive Encoding

Progressive encoding first transmits a coarse version of the image, which is then progressively refined (convenient for browsing applications).

Spectral selection (low frequency coefficients first)
1. transmission: DCT coefficients a0 ... ak1
2. transmission: DCT coefficients ak1 ... ak2

Successive approximation (most significant bits first)
1. transmission: bits 7 ... n1
2. transmission: bits n1+1 ... n2


MPEG Compression
Original goal:
Compress a 120 Mbps video stream to be handled by a CD with 1 Mbps.

Basic procedure:
• temporal prediction to exploit redundancy between image frames
• frequency domain decomposition using the DCT
• selective reduction of precision by quantization
• variable length coding to exploit statistical redundancy
• additional special techniques to maximize efficiency

Motion compensation:
16 x 16 luminance blocks with 8 x 8 chrominance blocks of the current
image frame are transmitted in terms of
- an offset to the best-fitting block in a reference frame (motion vector)
- the compressed differences between the current and the reference block
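
Finding the best-fitting block can be sketched as an exhaustive search. This is an illustration under assumptions, not the MPEG procedure itself: the matching criterion (sum of absolute differences), the small block size, the search range and all names are hypothetical choices.

```python
def sad(current, reference, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a shifted reference block."""
    return sum(abs(current[top + i][left + j] - reference[top + dy + i][left + dx + j])
               for i in range(size) for j in range(size))


def best_motion_vector(current, reference, top, left, size=4, search=2):
    """Try all offsets within +/- search pixels and return ((dy, dx), cost) of the best fit."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(current, reference, top, left, dy, dx, size)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


# Reference frame with distinct values; the current frame shows the same content
# moved by one row up and one column to the right (borders are clamped).
reference = [[8 * r + c for c in range(8)] for r in range(8)]
current = [[reference[min(max(r + 1, 0), 7)][min(max(c - 1, 0), 7)] for c in range(8)]
           for r in range(8)]

vector, cost = best_motion_vector(current, reference, top=2, left=2)
```

Here the block is found unchanged in the reference frame at offset (1, -1), so only the motion vector and an all-zero difference block would have to be transmitted.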

MPEG-7 Standard
MPEG-7: “Multimedia Content Description Interface”
• introduced as standard in 2002
• supports multimedia content description (audio and visual)
• not aimed at a particular application

Description of visual contents in terms of:
• descriptors (e.g. color, texture, shape, motion, localization, face features)
• segments
• structural information
• Description Definition Language (DDL)

segmentation methodology required!

Quadtree Image Representation


Properties of quadtree:
• every node represents a square image area, e.g. by its mean greyvalue
• every node has 4 children except leaf nodes
• children of a node represent the 4 subsquares of the parent node
• nodes can be refined if necessary

[Figure: image subdivided into squares labelled 0, 2, 3, 11, 12, 13, 100, 101, 102, 103, and the corresponding quadtree structure: root with children 0 1 2 3; node 1 with children 10 11 12 13; node 10 with children 100 101 102 103]

Quadtree Image Compression
A complete quadtree represents an image of $N = 2^K \times 2^K$ pixels with
$1 + 4 + 16 + \dots + 2^{2K} \approx 1.33\,N$ nodes.

An image may be compressed by


- storing at every child node the greyvalue difference between child
and parent node
- omitting subtrees with equal greyvalues
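
Both compression ideas can be sketched on a small image. This is a minimal illustration with hypothetical names; a node is represented as a pair (mean greyvalue, children), and the image side length is assumed to be a power of two.

```python
def build(image, top, left, size):
    """Return (mean greyvalue, children); children is None when the square is uniform."""
    values = [image[top + i][left + j] for i in range(size) for j in range(size)]
    mean = sum(values) / len(values)
    if size == 1 or min(values) == max(values):
        return (mean, None)                  # subtree with equal greyvalues is omitted
    half = size // 2
    children = [build(image, top + dy, left + dx, half)
                for dy in (0, half) for dx in (0, half)]
    return (mean, children)


def count_nodes(node):
    _, children = node
    return 1 if children is None else 1 + sum(count_nodes(c) for c in children)


def difference_code(node, parent_mean=0.0):
    """List the greyvalue difference between each node and its parent, root first."""
    mean, children = node
    code = [mean - parent_mean]
    for child in children or []:
        code.extend(difference_code(child, mean))
    return code


def complete_node_count(k):
    """Number of nodes 1 + 4 + ... + 4^k of a complete quadtree for 2^k x 2^k pixels."""
    return sum(4 ** level for level in range(k + 1))


image = [[0] * 4 for _ in range(4)]
image[0][0] = 4                              # a single deviating pixel
tree = build(image, 0, 0, 4)
nodes = count_nodes(tree)                    # 9 nodes instead of the complete 21
code = difference_code(tree)
```

Only the quadrant containing the deviating pixel is refined, so the tree has 9 nodes, while `complete_node_count(2)` gives 21 for a complete tree over the same 4 x 4 image.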

Quadtree image compression supports progressive image transmission:


• images are transmitted by increasing quadtree levels, i.e. images are
progressively refined
• intermediate image representations provide useful information, e.g. for
image retrieval
