
Image Compression Methods

Image compression: all methods that reduce the amount of image data while leaving the image content unchanged.

Two fundamental approaches:
- lossless compression methods: remove redundant image information
  - coding redundancy
  - interpixel redundancy
- lossy compression methods: remove redundant as well as irrelevant image information

Assumptions:
We use
- a 16 × 16 gray scale image coded with one byte pixel depth, resulting in a file size of 16 · 16 = 256 bytes of raw data
- a 16 × 16 true-color image (RGB) coded with one byte pixel depth for each component, resulting in a file size of 16 · 16 · 3 = 768 bytes of raw data
- a 16 × 16 binary image coded with one byte pixel depth, resulting in a file size of 16 · 16 = 256 bytes of raw data

Representation of an image:
- two-dimensional
- one-dimensional (zig-zag scan)
- hexadecimal

Data stream:
F0 F0 F0 F0 F0 F0 F0 F0 F0 F0 F0 82 82 82 82 82 AF
82 AF F0 FF F0 F0 82 AF AF AF F0 82 82 AF 82 AF
F0 F0 F0 F0 F0 82 82 AF AF 82 AF 82 F0 AF AF AF
82 82 AF 82 F0 F0 F0 F0 F0 AF AF AF AF AF AF 82
F0 F0 F0 82 82 82 AF 82 82 82 F0 F0 4B F0 F0 F0 F0
F0 F0 82 AF AF AF 82 F0 F0 F0 F0 F0 AF AF 82 82
82 F0 F0 4B F0 4B F0 4B F0 F0 F0 4B F0 4B F0 AF 82
82 82 F0 F0 F0 F0 F0 F0 F0 F0 F0 AF 82 F0 4B F0 4B
F0 4B 4B 4B 4B 4B F0 F0 F0 4B F0 F0 AF 82 F0 F0
F0 F0 F0 F0 F0 F0 E1 F0 82 82 F0 F0 4B 4B F0 4B 4B
4B 4B F0 4B F0 4B F0 F0 F0 E1 F0 F0 F0 F0 F0 FF
F0 FF F0 4B 4B 4B 4B 4B 4B 4B 4B F0 F0 F0 F0 F0
FF FF F0 F0 F0 FF FF F0 F0 F0 F0 F0 4B 4B 4B 4B
32 32 F0 F0 F0 FF F0 FF F0 F0 F0 F0 F0 4B 4B 4B 4B
32 32 32 F0 F0 F0 F0 F0 F0 4B 4B 4B 4B F0 F0 F0 32
32 4B 4B 4B 4B F0 4B 4B 4B

Terminology
- encoder, decoder (together: CODEC)
- compression ratio: CR = n / m
- relative data redundancy: RD = 1 − 1 / CR
- bit-per-pixel value: C = 8m / n  [bpp]

with:
n: size of the uncompressed image in bytes
m: size of the compressed image in bytes
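These three quantities are easy to compute; a minimal Python sketch (the function and argument names are ours, not from the lecture):

```python
def compression_stats(n, m):
    """Compression statistics for an image of n uncompressed
    and m compressed bytes, as defined on the slide above."""
    cr = n / m               # compression ratio CR = n / m
    rd = 1 - 1 / cr          # relative data redundancy RD = 1 - 1/CR
    bpp = 8 * m / n          # bit-per-pixel value C = 8m / n
    return cr, rd, bpp

# RLE result from the slides below: 256 bytes raw, 220 bytes coded
print(compression_stats(256, 220))   # (1.1636..., 0.1406..., 6.875)
```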

Lossless Data Compression
- simple compression algorithms
- dictionary-based approaches
- information-theoretical approaches

Simple Lossless Methods
- use heuristic approaches
- are used as an intermediate step by the CODECs of more powerful methods

Examples:
- run length encoding (RLE)
- interpixel differences
- quadtree decomposition

Run length encoding (RLE)

Zig-zag scanning turns the 2-D image into the 1-D data stream shown above.

Run length encoding (RLE)

Method: a run of adjacent pixels with the same gray value is replaced by the value followed by the run count.

Examples:
11 × F0 → F0 0B
5 × 82 → 82 05
1 × AF → AF 01
etc.

Data stream:
F0 0B 82 05 AF 01 82 01 AF 01 F0 01 FF 01 F0 02 ...
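As a sketch, the basic scheme in Python (our illustration; the run count is capped at 255 so it fits into one byte):

```python
def rle_encode(data):
    """Encode runs of equal gray values as (value, run count) pairs."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1                   # extend the current run
        out += [data[i], j - i]      # emit the value, then the run count
        i = j
    return out

# Start of the zig-zag data stream from above:
stream = [0xF0] * 11 + [0x82] * 5 + [0xAF, 0x82, 0xAF, 0xF0, 0xFF]
print(" ".join(f"{b:02X}" for b in rle_encode(stream)))
# -> F0 0B 82 05 AF 01 82 01 AF 01 F0 01 FF 01
```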

Run length encoding (RLE)

Compression ratio CR and relative data redundancy RD:

CR = 256 / 220 = 1.164
RD = 1 − 220 / 256 = 0.14

Run length encoding (RLE)

Variation: omit the run count when a gray value occurs only once; a run is marked by writing the value twice, followed by the number of repetitions after the first occurrence.

Examples:
11 × F0 → F0 F0 0A
5 × 82 → 82 82 04
1 × AF → AF
1 × 82 → 82
etc.

Data stream:
F0 F0 0A 82 82 04 AF 82 AF F0 FF F0 F0 01 82 ...
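The variant in Python, under the reading of the marker scheme given above (a sketch; the doubled-value convention is inferred from the slide's examples):

```python
def rle_encode_variant(data):
    """RLE variant: a single value is written as-is; a run is marked by
    writing the value twice, then the number of repetitions after the
    first occurrence (so 11 x F0 -> F0 F0 0A)."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 256:
            j += 1
        run = j - i
        out += [data[i]] if run == 1 else [data[i], data[i], run - 1]
        i = j
    return out

stream = ([0xF0] * 11 + [0x82] * 5 + [0xAF, 0x82, 0xAF, 0xF0, 0xFF]
          + [0xF0] * 2 + [0x82])
print(" ".join(f"{b:02X}" for b in rle_encode_variant(stream)))
# -> F0 F0 0A 82 82 04 AF 82 AF F0 FF F0 F0 01 82
```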

Run length encoding (RLE)

Compression ratio CR and relative data redundancy RD:

CR = 256 / 214 = 1.196
RD = 1 − 214 / 256 = 0.164

Run length encoding (RLE)

Variations:
- separate the 8 bit layers and binary-RLE-code each of them
- etc.

Run length encoding (RLE)

RLE:
- shows good results when the image contains large areas of the same gray value
- is used as one step in JPEG compression
- is used by the PCX, PostScript, BMP, GIF, and TIFF formats and by the RLE (CompuServe) data format

Lossless Data Compression
- simple compression algorithms
  - run length encoding (RLE)
  - interpixel differences
  - quadtree decomposition
- dictionary-based approaches
- information-theoretical approaches

Interpixel differences
Assumption:
adjacent pixels have similar gray values
(zig-zag scanned data stream as shown above)

Interpixel differences

Method:
- subtract neighbouring gray values in the data stream
- shift the differences by the smallest value, so that all values are non-negative
- reduce the number of bits per value, i.e. the number of bit layers of the image
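A minimal sketch of these steps (our illustration; how many bit layers can be dropped depends on the actual range of the differences):

```python
def interpixel_differences(stream):
    """Keep the first gray value, replace every other value by its
    difference to the predecessor, then shift by the smallest
    difference so all differences are non-negative. The first value
    and the shift must be stored for decoding."""
    diffs = [b - a for a, b in zip(stream, stream[1:])]
    shift = min(diffs)
    shifted = [d - shift for d in diffs]
    layers = max(shifted).bit_length()   # bit layers needed per difference
    return stream[0], shift, shifted, layers
```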

Interpixel differences

Results:

Number of bit layers   Compression ratio CR       Relative data redundancy RD
4                      256·8 / (256·4) = 2.0      1 − (256·4) / (256·8) = 0.5
5                      256·8 / (256·5) = 1.6      1 − (256·5) / (256·8) = 0.375
6                      256·8 / (256·6) ≈ 1.3      1 − (256·6) / (256·8) = 0.25
7                      256·8 / (256·7) ≈ 1.14     1 − (256·7) / (256·8) = 0.125

Lossless Data Compression
- simple compression algorithms
  - run length encoding (RLE)
  - interpixel differences
  - quadtree decomposition
- dictionary-based approaches
- information-theoretical approaches

Quadtree decomposition

Assumptions:
- the image contains areas with equal or very similar gray values
- the image is square and its size is a power of 2
- the image has no gray value 0

(figure: quadtree decomposition, level 0 (root) through level 4)

Quadtree decomposition

Coding the tree (example):
- the root (level 0) does not get coded
- proceeding to the next higher level is coded with 00000000
- the leaves and their positions in the tree get coded
- if a square gets subdivided, the subsquares are numbered counterclockwise with 00, 01, 10, 11, starting with the upper right quadrant

Quadtree decomposition

Coding the tree:
- no leaf on level 1
- move to level 2; code: 00000000
- one leaf on level 2
  leaf A: gray value 240 (11110000), position 00, position of parent 00
  code: 11110000 00 00
- move to level 3; code: 00000000
- leaves on level 3 (for example leaves B and C)
  leaf B: gray value 240, position 01, position of parent 01, position of grandparent 10
  code: 11110000 01 01 10
  leaf C: gray value 240, position 01, position of parent 01, position of grandparent 01
  code: 11110000 01 01 01
- ...
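A sketch of the decomposition itself in Python (our illustration; the bit coding of leaves and level markers described above is omitted):

```python
def quadtree(img, x=0, y=0, size=None):
    """Split a square image into a quadtree: a square with a single
    gray value becomes a leaf, otherwise it is divided into four
    quadrants and each is decomposed recursively. `img` is a list of
    rows; quadrants are enumerated in row-major order here, while the
    slides number them counterclockwise from the upper right."""
    if size is None:
        size = len(img)                  # image size must be a power of 2
    values = {img[y + r][x + c] for r in range(size) for c in range(size)}
    if len(values) == 1:
        return ("leaf", x, y, size, values.pop())
    half = size // 2
    return ("node", [quadtree(img, x + dx, y + dy, half)
                     for dy in (0, half) for dx in (0, half)])
```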

Quadtree decomposition

Result for the example image: 3076 bits or 384.5 bytes, i.e. more than the 256 bytes of raw data. The method is therefore used with a subsequent binary RLE.

Variation: separate the bit layers of a gray value image.

Lossless Data Compression
- simple compression algorithms
- dictionary-based approaches
- information-theoretical approaches

LZW Coding
(Abraham Lempel, Jacob Ziv, Terry Welch, 1977-1984)

Used in:
- the UNIX utility compress
- the GIF data format
- the PostScript (ps) data format
- the TIFF data format
- the ARJ data format

LZW Coding

Idea:
- combine repeatedly occurring sequences of gray values into substrings
- store them in a table called dictionary D
- compressed code: pointers to entries in the dictionary

LZW Coding

Let:
P: string buffer
C: character buffer (8 bit)
D: dictionary
n: number of gray values in the input string
#: end of file
||: concatenation

Algorithm:
START
- D contains the alphabet
- P is empty
- C is empty
BEGIN LOOP
1. C := next symbol of data stream
   IF (C = #): output index of P, STOP
2. P||C ∈ D?
   YES: P := P||C
   NO:  output index of P
        insert P||C into D
        P := C
END LOOP
Example: ANNA_AND_BARBARA_AND_A_BANANA

Initial dictionary D: 0: A, 1: B, 2: D, 3: N, 4: R, 5: _

Trace:

 n   input C   P      P||C   P||C ∈ D?   output code   new entry in D   new P
 1   A         -      A      yes         -             -                A
 2   N         A      AN     no          0             6: AN            N
 3   N         N      NN     no          3             7: NN            N
 4   A         N      NA     no          3             8: NA            A
 5   _         A      A_     no          0             9: A_            _
 6   A         _      _A     no          5             10: _A           A
 7   N         A      AN     yes         -             -                AN
 8   D         AN     AND    no          6             11: AND          D
 9   _         D      D_     no          2             12: D_           _
10   B         _      _B     no          5             13: _B           B
11   A         B      BA     no          1             14: BA           A
12   R         A      AR     no          0             15: AR           R
13   B         R      RB     no          4             16: RB           B
14   A         B      BA     yes         -             -                BA
15   R         BA     BAR    no          14            17: BAR          R
16   A         R      RA     no          4             18: RA           A
17   _         A      A_     yes         -             -                A_
18   A         A_     A_A    no          9             19: A_A          A
19   N         A      AN     yes         -             -                AN
20   D         AN     AND    yes         -             -                AND
21   _         AND    AND_   no          11            20: AND_         _
22   A         _      _A     yes         -             -                _A
23   _         _A     _A_    no          10            21: _A_          _
24   B         _      _B     yes         -             -                _B
25   A         _B     _BA    no          13            22: _BA          A
26   N         A      AN     yes         -             -                AN
27   A         AN     ANA    no          6             23: ANA          A
28   N         A      AN     yes         -             -                AN
29   A         AN     ANA    yes         -             -                ANA
30   #         ANA    -      -           23            -                (STOP)

Output code stream: 0 3 3 0 5 6 2 5 1 0 4 14 4 9 11 10 13 6 23

LZW Coding

Assumption: each number is coded with 1 byte.
- the input stream had 29 bytes
- the output stream has 19 bytes

Compression ratio and relative data redundancy:

CR = 29 / 19 ≈ 1.5
RD = 1 − 19 / 29 ≈ 0.34
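The pseudocode translates almost line by line into Python; a minimal sketch (the sorted alphabet "ABDNR_" reproduces the dictionary positions 0-5 of the trace table):

```python
def lzw_encode(text, alphabet):
    """LZW encoder following the pseudocode above: on a dictionary
    miss, output the index of P, insert P||C, and restart P at C."""
    d = {ch: i for i, ch in enumerate(alphabet)}   # D contains the alphabet
    p, out = "", []
    for c in text:
        if p + c in d:
            p += c                    # P||C in D: keep extending P
        else:
            out.append(d[p])          # output index of P
            d[p + c] = len(d)         # insert P||C into D
            p = c
    if p:
        out.append(d[p])              # end of file: flush the final P
    return out

codes = lzw_encode("ANNA_AND_BARBARA_AND_A_BANANA", "ABDNR_")
print(codes)       # [0, 3, 3, 0, 5, 6, 2, 5, 1, 0, 4, 14, 4, 9, 11, 10, 13, 6, 23]
print(len(codes))  # 19 output codes, as on the slide
```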

Lossless Data Compression
- simple compression algorithms
- dictionary-based approaches
- information-theoretical approaches

Elements of Information Theory

- What exactly must be kept in an image in order not to lose any information, and what can be removed?
- How is information defined?
- How much information is contained in an image?
- How is the amount of information calculated?

Elements of Information Theory

The probability p(Ej) of an event Ej:

p(Ej) = (number of occurrences of event Ej) / (total number of events N)

Example: throwing a die, one of the events Ej is to throw a six. For an ideal die, the probability of any event Ej is p(Ej) = 1/6.

Elements of Information Theory

Let K be the number of possible events; then

Σ_{j=1..K} p(Ej) = 1

Example: throwing a die, K = 6 events Ej can occur.

Elements of Information Theory

The information I(Ej) of an event Ej that occurs with probability p(Ej):

I(Ej) = log (1 / p(Ej)) = − log p(Ej)

Example: throwing a die, the information of any throw is

I(Ej) = − ln (1/6) = ln 6 ≈ 1.79

if we use the natural logarithm.

Elements of Information Theory

The base of the logarithm defines the unit of the information I(Ej).

Example: throwing a die, with log base 2 the information of any throw is

I(Ej) = − log2 p(Ej) = ln 6 / ln 2 = 2.585 bit

Elements of Information Theory

What is information?

I(Ej) = − log p(Ej)

- the less probable an event, the higher its information content
- an event Ej that occurs with probability p(Ej) = 1 has information content 0

Elements of Information Theory

Anything that generates events is called an information source Q.

The entropy HQ of an information source Q:

HQ = Σ_{j=1..K} p(Ej) · I(Ej) = − Σ_{j=1..K} p(Ej) · log p(Ej)

Elements of Information Theory

Example: throwing a die, the entropy of the die is

Hdie = − Σ_{j=1..6} p(Ej) · log2 p(Ej) = − Σ_{j=1..6} (1/6) · log2 (1/6) = 2.585 bit

If all probabilities p(Ej) are equal,
- the entropy of the source is equal to the information of an event: HQ = I(Ej)
- the entropy HQ is maximum
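The same calculation in a few lines of Python:

```python
import math

# Entropy of an ideal die: six equally probable events.
h_die = -sum((1 / 6) * math.log2(1 / 6) for _ in range(6))
print(h_die)   # 2.585 bit = log2(6); equal probabilities maximize the entropy
```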

Elements of Information Theory

Images:
- replace "event Ej occurs" by "gray value gj occurs in the data stream"
- replace the total number of events N by the number of pixels n in the image
- replace the probability p(Ej) by the normalized histogram value H(gj) / n

The information content of a gray value:

p(gj) = H(gj) / n
I(gj) = − log2 p(gj) = − log2 (H(gj) / n)

The entropy of an image:

HQ = Σ_{j=0..K−1} p(gj) · I(gj) = − Σ_{j=0..K−1} (H(gj) / n) · log2 (H(gj) / n)

Elements of Information Theory

Image (16 × 16 gray values, hexadecimal):

F0 F0 F0 F0 82 82 F0 82 82 F0 F0 F0 F0 F0 F0 F0
F0 F0 F0 82 AF AF 82 AF AF 82 F0 F0 F0 F0 F0 F0
F0 F0 82 82 AF AF 82 AF AF 82 82 F0 F0 F0 F0 F0
F0 82 AF AF 82 AF AF AF 82 AF AF 82 F0 F0 F0 F0
F0 F0 82 AF AF 82 AF 82 AF AF 82 F0 F0 F0 F0 F0
FF F0 F0 82 82 AF AF AF 82 82 F0 F0 E1 F0 F0 F0
F0 F0 82 AF AF 82 82 82 AF AF 82 F0 E1 FF F0 F0
F0 F0 82 AF 82 F0 82 F0 82 AF 82 F0 F0 FF FF F0
F0 F0 F0 82 F0 F0 4B F0 F0 82 F0 FF FF FF FF FF
F0 F0 F0 F0 F0 F0 4B F0 F0 F0 F0 F0 F0 F0 F0 F0
F0 F0 F0 4B 4B F0 4B F0 4B 4B F0 F0 F0 F0 F0 F0
4B F0 F0 F0 4B F0 4B F0 4B F0 F0 F0 F0 32 F0 F0
F0 4B F0 F0 F0 4B 4B 4B F0 F0 32 F0 32 F0 F0 32
F0 F0 4B F0 F0 F0 4B F0 F0 32 F0 32 F0 F0 32 F0
4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B
4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B 4B

Information-theoretical values:

gj       H(gj)   HK(gj)   p(gj)    pK(gj)   I(gj)    HQ(gj)
32hex        7        7   0.0273   0.0273   5.1926   0.1420
4Bhex       49       56   0.1914   0.2188   2.3853   0.4566
82hex       36       92   0.1406   0.3594   2.8301   0.3980
AFhex       29      121   0.1133   0.4727   3.1420   0.3559
E1hex        2      123   0.0078   0.4805   7.0000   0.0547
F0hex      124      247   0.4844   0.9648   1.0458   0.5066
FFhex        9      256   0.0352   1.0      4.8301   0.1698
sum        256     n.a.   1.0      n.a.     n.a.     2.0835
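The table can be recomputed from the histogram alone; a short Python sketch:

```python
import math

# Histogram of the 16 x 16 example image (256 pixels)
hist = {0x32: 7, 0x4B: 49, 0x82: 36, 0xAF: 29, 0xE1: 2, 0xF0: 124, 0xFF: 9}
n = sum(hist.values())

h_q = 0.0
for g, count in sorted(hist.items()):
    p = count / n                  # p(gj) = H(gj) / n
    info = -math.log2(p)           # I(gj) = -log2 p(gj)
    h_q += p * info                # entropy contribution p(gj) * I(gj)
    print(f"{g:02X}: p = {p:.4f}, I = {info:.4f}, p*I = {p * info:.4f}")

print(f"HQ = {h_q:.4f} bit")               # -> 2.0835
print(f"ideal size = {n * h_q:.2f} bits")  # -> 533.38 bits (67 bytes)
```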

Elements of Information Theory

Fundamental theorem of information theory:
A code is optimal in the sense of information theory if the average code length Lavg for a gray value is equal to the entropy HQ of the image.

Average code word length:

Lavg = Σ_{j=0..K−1} (H(gj) / n) · L(gj)

Requiring Lavg = HQ:

Σ_{j=0..K−1} (H(gj) / n) · L(gj) = − Σ_{j=0..K−1} p(gj) · log2 p(gj)

Σ_{j=0..K−1} p(gj) · L(gj) = − Σ_{j=0..K−1} p(gj) · log2 p(gj)

Σ_{j=0..K−1} p(gj) · [L(gj) + log2 p(gj)] = 0

L(gj) = − log2 p(gj)

L(gj) = − log2 (H(gj) / n) = I(gj)

In an optimal code, each gray value gj is therefore coded with a word whose length equals its information content I(gj).

Elements of Information Theory

(information-theoretical values as in the table above; HQ = 2.0835 bit per pixel)

According to the entropy, the file size of the image data should be
256 · 2.0835 = 533.38 bits, or 67 bytes.

Elements of Information Theory

The compression ratio CR and the relative data redundancy RD amount to

CR = 256 / 67 = 3.82
RD = 1 − 1 / CR = 0.738

i.e. 73.8% of the data of the example image is redundant. Coding the image with 8 bits per pixel, as we usually do, leaves 256 − 67 = 189 bytes of redundant information in the image!

Lossless Data Compression
- simple compression algorithms
- dictionary-based approaches
- information-theoretical approaches
  - Fano-Shannon code
  - Huffman code
  - arithmetic code

Huffman Code

Image: the 16 × 16 example image shown above.

Huffman Code - Algorithm

1. Generate the probability table of the image.
2. Sort the table according to increasing probabilities.
3. Take the two smallest probabilities and connect them in a node. This node carries the sum of the two probabilities.
4. Assign the value 0 to the node with the smaller sum and the value 1 to the node with the larger sum.
5. Apply steps 3 and 4 to the sub-table(s), until a single node with probability Σ p(gj) = 1 is left.
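A sketch of this procedure in Python (our implementation; note that the merged 0.0351 node and FFhex have exactly the same probability 9/256, and the tie is broken in favour of the already merged node, which reproduces the code table of the following slides):

```python
import heapq

def huffman_code(hist):
    """Huffman construction as described above: repeatedly merge the two
    least probable nodes; the smaller one gets bit 0, the larger bit 1.
    A node is either a symbol or a pair (zero_child, one_child)."""
    heap = [(count, idx, sym)
            for idx, (sym, count) in enumerate(sorted(hist.items()))]
    heapq.heapify(heap)
    merge_id = -1                         # merged nodes win ties against leaves
    while len(heap) > 1:
        c0, _, n0 = heapq.heappop(heap)   # smallest count -> bit 0
        c1, _, n1 = heapq.heappop(heap)   # next smallest  -> bit 1
        heapq.heappush(heap, (c0 + c1, merge_id, (n0, n1)))
        merge_id -= 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):       # inner node: descend into both branches
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

# Histogram of the 16 x 16 example image
hist = {0x32: 7, 0x4B: 49, 0x82: 36, 0xAF: 29, 0xE1: 2, 0xF0: 124, 0xFF: 9}
for sym, code in sorted(huffman_code(hist).items()):
    print(f"{sym:02X}: {code}")
# 32: 111001, 4B: 10, 82: 110, AF: 1111, E1: 111000, F0: 0, FF: 11101
```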

Huffman Code - Algorithm

Probabilities (from the table above) sorted in increasing order:
E1hex 0.0078, 32hex 0.0273, FFhex 0.0352, AFhex 0.1133, 82hex 0.1406, 4Bhex 0.1914, F0hex 0.4844

Merge steps (at each step the two smallest probabilities are combined; within a pair, 0 goes to the smaller probability, 1 to the larger):

step 1: E1 (0.0078) + 32 (0.0273)  → 0.0351
step 2: 0.0351 + FF (0.0352)       → 0.0703
step 3: 0.0703 + AF (0.1133)       → 0.1836
step 4: 0.1836 + 82 (0.1406)       → 0.3242
step 5: 0.3242 + 4B (0.1914)       → 0.5156
step 6: 0.5156 + F0 (0.4844)       → 1.0000
Huffman tree

Built from the merge steps above; at each node the branch with the smaller probability gets bit 0, the other bit 1:

1.0000
├─ 0: F0hex (0.4844)
└─ 1: 0.5156
      ├─ 0: 4Bhex (0.1914)
      └─ 1: 0.3242
            ├─ 0: 82hex (0.1406)
            └─ 1: 0.1836
                  ├─ 0: 0.0703
                  │     ├─ 0: 0.0351
                  │     │     ├─ 0: E1hex (0.0078)
                  │     │     └─ 1: 32hex (0.0273)
                  │     └─ 1: FFhex (0.0352)
                  └─ 1: AFhex (0.1133)

Huffman code

Reading the code words off the tree:

gj       H(gj)   code word   bits
32hex        7   111001      7 · 6 =  42
4Bhex       49   10          49 · 2 =  98
82hex       36   110         36 · 3 = 108
AFhex       29   1111        29 · 4 = 116
E1hex        2   111000      2 · 6 =  12
F0hex      124   0           124 · 1 = 124
FFhex        9   11101       9 · 5 =  45

Huffman code

Everything added up, we get 545 bits or 68.125 bytes of code.

                 Ideal                  Huffman
code size        533.38 bits            545 bits
                 HQ = 2.0835            Lavg = 2.1289
CR               256 / 67 = 3.82        256 · 8 / 545 = 3.758
RD               1 − 1/CR = 0.738       1 − 545 / (256 · 8) = 0.734

Huffman code

Remarks:
- code words have integer bit lengths, so the Huffman code cannot be ideal
- the code table has to be stored or sent together with the picture
- JPEG uses Huffman coding as an intermediate step; JPEG has a fixed code table established from thousands of pictures
- the Huffman code is a prefix code: no code word is the prefix of another, so the bit stream is uniquely decodable without separators
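The prefix property is what makes separator-free decoding possible; a tiny decoder sketch (our illustration, using the code table above):

```python
def huffman_decode(bits, codes):
    """Read the bit string left to right; because no code word is a
    prefix of another, every match is unambiguous."""
    inverse = {code: sym for sym, code in codes.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:          # a complete code word has been read
            out.append(inverse[word])
            word = ""
    return out

codes = {0xF0: "0", 0x4B: "10", 0x82: "110", 0xAF: "1111",
         0xFF: "11101", 0xE1: "111000", 0x32: "111001"}
print([f"{g:02X}" for g in huffman_decode("0101100", codes)])
# -> ['F0', '4B', '82', 'F0']
```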
