Image Compression: I. Fundamentals

This document provides an overview of image compression techniques, including run length encoding, Huffman coding, and JPEG compression. It discusses the following key points: - Run length encoding achieves compression by encoding repeated pixel values with a code indicating the number of repeats. It works best for images with large areas of uniform color but less so for natural images. - Huffman coding assigns variable length codes to characters based on their frequency, with more common characters having shorter codes. This results in more efficient representation of the data. - JPEG compression uses discrete cosine transform, quantization, and entropy encoding including Huffman coding. It partitions an image into 8x8 pixel blocks, applies DCT, then quantizes and

Uploaded by

LuPham
Copyright
© Attribution Non-Commercial (BY-NC)

Image Compression

Introduction

Size of Image file D.BMP is 65,322 bytes.

Size of Image file D.JPG is 4,781 bytes.

Size of image D is 148 x 147 pixels, 24-bit depth.

I. Fundamentals
1. A compression ratio is simply the size of the original data divided by the size of the
compressed data.
2. A technique that compresses a 1 megabyte image to 100 kilobytes has achieved a
compression ratio of
CR = Uncompressed size / Compressed size = 1024 KB / 100 KB = 10.24
Space savings: 1 - Compressed size / Uncompressed size = 1 - 100/1024 = 90.23%
3. There are two basic types of image compression: lossless compression and lossy
compression.
a. A lossless scheme encodes and decodes the data perfectly, and the resulting
image matches the original image exactly. There is no degradation in the
process; no data is lost.
b. Lossy compression schemes allow redundant and nonessential information to
be lost. Typically with lossy schemes there is a tradeoff between compression
and image quality. The goal of lossy compression is that the final
decompressed image be visually lossless.
4. Let n1 and n2 denote the numbers of information-carrying units in two data sets that
represent the same information. Then CR = n1/n2, and RD = 1 - 1/CR is the relative data redundancy.
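The definitions above can be checked with a few lines of Python. This is only a minimal sketch; the function names are illustrative and not part of the original notes. It uses the 1 MB example and the D.BMP / D.JPG file sizes from the Introduction.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """CR = uncompressed size / compressed size."""
    return uncompressed_size / compressed_size


def space_savings(uncompressed_size, compressed_size):
    """Space savings = 1 - compressed size / uncompressed size (same as 1 - 1/CR)."""
    return 1 - compressed_size / uncompressed_size


# The 1 MB -> 100 KB example, then the D.BMP -> D.JPG file sizes.
print(compression_ratio(1024, 100))
print(round(100 * space_savings(1024, 100), 2))
print(round(compression_ratio(65322, 4781), 2))
```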

II. Run Length Encoding


Some areas of an image have a constant color. Such a sequence of repeated values is called a run.
For example, a source string
AAAABBBBBCCCCCCCCDEEEE
could be represented with
4A 5B 8C 1D 4E
This example represents 22 bytes of data with 10 bytes, achieving a compression ratio of:
22 bytes / 10 bytes = 2.2.
However, the string
ABCD
would be encoded as
1A 1B 1C 1D
which gives a compression ratio of 4 bytes / 8 bytes = 0.5, so the data actually expands.
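The two examples can be reproduced with a short sketch. It assumes each run costs two bytes (one count byte and one value byte); the function names are illustrative only.

```python
from itertools import groupby


def rle_pairs(text):
    """Return one (count, symbol) pair per run of identical symbols."""
    return [(len(list(group)), symbol) for symbol, group in groupby(text)]


def rle_ratio(text):
    """Compression ratio, assuming one count byte plus one value byte per run."""
    return len(text) / (2 * len(rle_pairs(text)))


print(rle_pairs("AAAABBBBBCCCCCCCCDEEEE"))
print(rle_ratio("AAAABBBBBCCCCCCCCDEEEE"), rle_ratio("ABCD"))
```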
Clearly, a better method is needed:
keep unique values as they are and run length encode only repetitive data.
If we use a # as our special prefix, we can encode the following data
ABCDMMMMMSBBBBB
as
ABCD#5MS#5B
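A minimal sketch of this prefix scheme follows. Several details are assumptions made for the illustration, not part of the notes: the count is written as a single decimal digit (so runs are split at 9), runs shorter than 3 are copied as they are, and a literal # in the data is always written as a run so the decoder cannot confuse it with a prefix.

```python
from itertools import groupby

PREFIX = "#"    # special prefix character
MIN_RUN = 3     # assumed threshold: shorter runs are copied verbatim
MAX_COUNT = 9   # assumed limit: the count is a single decimal digit


def rle_encode(text):
    out = []
    for symbol, group in groupby(text):
        remaining = len(list(group))
        while remaining > 0:
            chunk = min(remaining, MAX_COUNT)
            if symbol == PREFIX or chunk >= MIN_RUN:
                out.append(f"{PREFIX}{chunk}{symbol}")  # prefix, count, value
            else:
                out.append(symbol * chunk)              # copy unique data as is
            remaining -= chunk
    return "".join(out)


def rle_decode(encoded):
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] == PREFIX:
            out.append(encoded[i + 2] * int(encoded[i + 1]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(rle_encode("ABCDMMMMMSBBBBB"))
```

A real byte-oriented format would store the count as a raw byte rather than a digit, which raises the maximum run length.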
Remarks:

- Since it takes three bytes to encode a run of data (prefix, count, value), it makes
sense to encode only runs of 3 or longer (otherwise, you are expanding your data).

- When the special prefix character itself is found in the source data, it must be
encoded as a run of length 1.

- The MacPaint image file format uses run length encoding, combining the
prefix character with the count byte. The most significant (highest) bit of the
prefix byte determines whether the following bytes are repeating data or unique data.
  - If the bit is set (=1), the byte stores the count (in two's complement)
    of how many times to repeat the next data byte.
  - If the bit is not set (=0), the byte plus one is the number of
    following bytes that are unique and can be copied verbatim to the output.
  - Only seven bits are used for the count.
  For example, if the count byte is 172 (binary 10101100: high bit set, low seven
  bits equal to 44), the next byte is repeated 44 times. If the count byte is 45
  (high bit not set), the next 45 + 1 = 46 bytes are unique.

- The PCX file format sets the two most significant (highest) bits of a byte if
there is a run. This leaves six bits to represent the length of the run, limiting the
count to 63. For example, the compressed string 165, 211, 145, 153, 193, 234
decodes to the original string 165, 145, 145, ..., 145 (19 times), 153, 234: the byte
211 = 192 + 19 announces 19 copies of 145, and 193 = 192 + 1 announces a single 234.

- Other image file formats that use run length encoding are RLE and GEM.

- The TIFF and TGA file format specifications allow for optional run length
encoding of the image data.

- Run length encoding works very well for images with solid backgrounds, such as
cartoons. For natural images, it doesn't work as well.
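The PCX rule described in the remarks can be sketched as a small decoder. This is an illustration of the rule as stated in these notes (two high bits set means "run", low six bits are the count), not a complete PCX reader; the function name is made up.

```python
def pcx_decode(data):
    """Decode a PCX-style run length encoded sequence of byte values."""
    out, i = [], 0
    while i < len(data):
        byte = data[i]
        if byte & 0xC0 == 0xC0:          # two highest bits set: run marker
            count = byte & 0x3F          # low six bits: run length (max 63)
            out.extend([data[i + 1]] * count)
            i += 2
        else:                            # ordinary byte: copy verbatim
            out.append(byte)
            i += 1
    return out


decoded = pcx_decode([165, 211, 145, 153, 193, 234])
print(decoded)
```

Note that 234 itself has its two highest bits set, which is why it has to be stored as a run of length 1 (the byte 193) instead of being copied directly.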

III. Huffman Coding

Here is a histogram of text from one chapter in some book.

More than 96% of this file consists of only 31 characters: the lower case letters, the
space, the comma, the carriage return, and the period.

If only 5 bits are used for each of these characters (for example, 00000 = a,
00001 = b, ...) and an arbitrary longer code for the other characters (for instance,
8 bits), the file shrinks to roughly 5/8 of its original size.

Huffman encoding takes this idea to the extreme: frequently used characters are
assigned fewer bits, and seldom-used characters more bits.

Example:

Letter   Probability   Huffman code
A        0.318         00
B        0.227         11
C        0.149         010
D        0.130         011
E        0.122         100
F        0.031         1010
G        0.023         1011

original data stream:  F    E   C   E   B  A  A  G    A  D   D ...
Huffman encoded:       1010 100 010 100 11 00 00 1011 00 011 011 ...
grouped into bytes:    10101000  10100110  00010110  0011011...
                       byte 1    byte 2    byte 3    byte 4
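Because no code word in the table is a prefix of another, the encoded bit stream can be decoded without any separators. The sketch below decodes the "Huffman encoded" bits of the example using the code table above; the function name is illustrative.

```python
CODES = {"A": "00", "B": "11", "C": "010", "D": "011",
         "E": "100", "F": "1010", "G": "1011"}


def huffman_decode(bits, codes):
    """Read bits left to right and emit a letter whenever a code word is complete."""
    lookup = {code: letter for letter, code in codes.items()}
    letters, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            letters.append(lookup[current])
            current = ""
    return "".join(letters)


stream = "1010 100 010 100 11 00 00 1011 00 011 011".replace(" ", "")
print(huffman_decode(stream, CODES))
# Regroup the same bits into bytes, as in the example.
print([stream[i:i + 8] for i in range(0, len(stream), 8)])
```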

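The first step in creating Huffman codes is to create an array of character frequencies.
The algorithm is as follows:
1. Input: all characters as free nodes, each weighted by its frequency.
2. Repeat
2.1. The two free nodes with the lowest frequencies are assigned to
a new parent node with a weight equal to the sum of the two free child
nodes.
2.2. The two child nodes are removed from the free nodes list, and
the newly created parent node is added to the list as a free node.
Until there is only one free node left.
3. Output: the remaining free node, which is the root of the Huffman tree. Labeling the
two branches below each parent node 0 and 1, the code of a character is the sequence
of labels on the path from the root to that character.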
An example:
Input:   A 0.318, B 0.227, C 0.149, D 0.130, E 0.122, F 0.031, G 0.023
Step 1:  F (0.031) + G (0.023)       -> node 0.054
Step 2:  E (0.122) + node 0.054      -> node 0.176
Step 3:  C (0.149) + D (0.130)       -> node 0.279
Step 4:  B (0.227) + node 0.176      -> node 0.403
Step 5:  A (0.318) + node 0.279      -> node 0.597
Result:  node 0.597 + node 0.403     -> root 1.000

The resulting tree:

1.000
  0.597
    A 0.318
    0.279
      C 0.149
      D 0.130
  0.403
    B 0.227
    0.176
      E 0.122
      0.054
        F 0.031
        G 0.023
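The merging procedure can be sketched with a priority queue. This is one possible implementation, not necessarily the one the notes have in mind. It only tracks how deep each letter ends up in the tree, that is, the length of its code; which branch is labeled 0 or 1 is arbitrary.

```python
import heapq
from itertools import count


def huffman_code_lengths(probabilities):
    """Return {symbol: code length} by repeatedly merging the two lightest free nodes."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(p, next(tiebreak), {symbol: 0}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        # Every symbol below the new parent moves one level further from the root.
        merged = {s: d + 1 for s, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


PROBS = {"A": 0.318, "B": 0.227, "C": 0.149, "D": 0.130,
         "E": 0.122, "F": 0.031, "G": 0.023}
lengths = huffman_code_lengths(PROBS)
average_bits = sum(PROBS[s] * lengths[s] for s in PROBS)
print(lengths, round(average_bits, 3))
```

For the probabilities of the example, the lengths agree with the code table (2 bits for A and B, 3 bits for C, D and E, 4 bits for F and G).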

A more sophisticated version of the Huffman approach is called arithmetic
encoding. In this scheme, sequences of characters are represented by individual
codes, according to their probability of occurrence. This has the advantage of better
data compression, say 5-10%.

IV. JPEG (Transform Compression)


JPEG is named after the Joint Photographic Experts Group.
Encoder: Source Image -> DCT -> Quantizer -> Entropy Encoder -> Compressed Image Data

Decoder: Compressed Image Data -> Entropy Decoder -> Dequantizer -> Inverse DCT -> Uncompressed Image

JPEG encoder and decoder

JPEG compression consists of the following basic steps:

1. Input the source gray-scale image I.
2. Partition the image into 8 x 8 pixel blocks and perform the DCT on each block.
3. Quantize the resulting DCT coefficients.
4. Entropy code the reduced coefficients.
In the second step, the image components are broken into arrays, or "tiles", of 8
x 8 pixels. The elements within the tiles are converted to signed integers (for pixels in the
range of 0 to 255, subtract 128). These tiles are then transformed into the spatial frequency
domain via the forward DCT. Element (0,0) of the 8 x 8 block is referred to as DC. The 63
other elements are referred to as AC(y,x), where x and y are the position of the element in the
array. DC is proportional to the average of the 8 x 8 shifted pixel values; with the
normalization used below, it equals their sum divided by 8.
1) As an example, take one 8 x 8 block of the 8-bit image I:

50 55 61  60  70  61 60 71
63 59 55  90  89  85 69 72
62 59 68 113 114  64 66 73
63 58 71 102  54 106 70 69
61 61 68 100  76  88 68 70
79 65 60  70  77  68 58 75
82 71 64  59  55  61 65 80
81 79 69  68  65  76 78 90

The values in the block are centered around zero by subtracting 2^(b-1), where b is
the number of bits per pixel. In this example the shift is 128, which gives matrix F:

-78 -73 -67 -68 -58 -67 -68 -57
-65 -69 -73 -38 -39 -43 -59 -56
-66 -69 -60 -15 -14 -64 -62 -55
-65 -70 -57 -26 -74 -22 -58 -59
-67 -67 -60 -28 -52 -40 -60 -58
-49 -63 -68 -58 -51 -60 -70 -53
-46 -57 -64 -69 -73 -67 -63 -48
-47 -49 -59 -60 -63 -52 -50 -38

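2) DCT of image block F as matrix G.

$$G(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} F(x,y)\, \cos\left(\frac{\pi (x + 1/2)\, u}{8}\right) \cos\left(\frac{\pi (y + 1/2)\, v}{8}\right)$$

where u = 0, 1, ..., 7; v = 0, 1, ..., 7 and

$$\alpha(0) = \sqrt{1/8}, \qquad \alpha(u) = \sqrt{2/8}, \quad u = 1, 2, \ldots, 7$$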
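The DCT formula can be evaluated directly with nested sums. The sketch below is a plain, unoptimized implementation written for this example (real codecs use fast factorizations); it treats the first index of F and G as the row and the second as the column, and the names are illustrative.

```python
import math

F = [
    [-78, -73, -67, -68, -58, -67, -68, -57],
    [-65, -69, -73, -38, -39, -43, -59, -56],
    [-66, -69, -60, -15, -14, -64, -62, -55],
    [-65, -70, -57, -26, -74, -22, -58, -59],
    [-67, -67, -60, -28, -52, -40, -60, -58],
    [-49, -63, -68, -58, -51, -60, -70, -53],
    [-46, -57, -64, -69, -73, -67, -63, -48],
    [-47, -49, -59, -60, -63, -52, -50, -38],
]


def alpha(u, n=8):
    """Normalization factor: sqrt(1/n) for u = 0, sqrt(2/n) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)


def dct2(block):
    """Two-dimensional DCT of a square block, straight from the definition."""
    n = len(block)
    return [[alpha(u, n) * alpha(v, n) * sum(
                block[x][y]
                * math.cos(math.pi * (x + 0.5) * u / n)
                * math.cos(math.pi * (y + 0.5) * v / n)
                for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


G = dct2(F)
print(round(G[0][0], 2), round(G[0][1], 2), round(G[1][0], 2))
```

G[0][0] is the DC coefficient, that is, the sum of all 64 entries of F divided by 8.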

[Figure: the original image, and the images reconstructed by the inverse DCT using only 1/4 of the
DCT coefficients (u and v run from 0 to N/2) and only 1/9 of the DCT coefficients.]

Matrix G (DCT coefficients of F):

-452.87  -21.19  -25.44    2.29   29.62    4.79    5.53  -23.80
  -7.23  -15.53  -44.20    0.54   10.69    0.55   -6.63    0.76
 -18.38    0.78   36.34   -2.75   -4.56  -17.35   -6.34   21.68
 -30.61    4.11   15.13    2.30   -5.01   -8.56    5.49    7.27
  -1.38   -8.87    2.77   -5.20  -24.37    9.18   12.25  -15.55
 -16.93    6.49   13.74   -8.25   -8.03    3.95    7.50    0.72
   2.34    4.00   -8.34   -5.50   16.10   -9.96   -9.09   17.00
   1.19    2.57  -11.31  -10.66   12.22   -3.73  -12.53   13.28

Quantization matrix Q:

16  11  10  16   24   40   51   61
12  12  14  19   26   58   60   55
14  13  16  24   40   57   69   56
14  17  22  29   51   87   80   62
18  22  37  56   68  109  103   77
24  35  55  64   81  104  113   92
49  64  78  87  103  121  120  101
72  92  95  98  112  100  103   99

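3) Quantization:
A typical quantization matrix Q, as specified in the original JPEG Standard, is the
matrix shown with G above. Each DCT coefficient is divided by the corresponding entry
of Q and rounded to the nearest integer:

B(x, y) = round( G(x, y) / Q(x, y) ), for x, y = 0, 1, ..., 7

which gives the quantized matrix B:

-28  -2  -3   0   1   0   0   0
 -1  -1  -3   0   0   0   0   0
 -1   0   2   0   0   0   0   0
 -2   0   1   0   0   0   0   0
  0   0   0   0   0   0   0   0
 -1   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0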
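The quantization step is a single element-wise division and rounding. The sketch below applies it to the DCT coefficients of the example (rounded to two decimals) and the quantization matrix; the variable names follow the notes, the function name is illustrative.

```python
G = [
    [-452.87, -21.19, -25.44, 2.29, 29.62, 4.79, 5.53, -23.80],
    [-7.23, -15.53, -44.20, 0.54, 10.69, 0.55, -6.63, 0.76],
    [-18.38, 0.78, 36.34, -2.75, -4.56, -17.35, -6.34, 21.68],
    [-30.61, 4.11, 15.13, 2.30, -5.01, -8.56, 5.49, 7.27],
    [-1.38, -8.87, 2.77, -5.20, -24.37, 9.18, 12.25, -15.55],
    [-16.93, 6.49, 13.74, -8.25, -8.03, 3.95, 7.50, 0.72],
    [2.34, 4.00, -8.34, -5.50, 16.10, -9.96, -9.09, 17.00],
    [1.19, 2.57, -11.31, -10.66, 12.22, -3.73, -12.53, 13.28],
]
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coefficients, table):
    """B(x, y) = round(G(x, y) / Q(x, y)) for every position of the block."""
    return [[round(g / q) for g, q in zip(g_row, q_row)]
            for g_row, q_row in zip(coefficients, table)]


B = quantize(G, Q)
for row in B:
    print(row)
```

Most high-frequency coefficients become 0 because the entries of Q grow toward the lower right corner; this is where the information is lost.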

4) The coefficients are reordered in accordance with the zigzag ordering
B(0,0), B(0,1), B(1,0), B(2,0), B(1,1), B(0,2), ...
as
{-28, -2, -1, -1, -1, -3, 0, -3, 0, -2, 0, 0, 2, 0, 1, 0,
0, 0, 1, 0, -1, EOB}
where the EOB symbol denotes the end-of-block condition.
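The zigzag order walks the anti-diagonals of the block, alternating direction. The sketch below is one way to generate it (the function name is illustrative) and applies it to the quantized matrix B of the example, with B(row, column) indexing.

```python
B = [
    [-28, -2, -3, 0, 1, 0, 0, 0],
    [-1, -1, -3, 0, 0, 0, 0, 0],
    [-1, 0, 2, 0, 0, 0, 0, 0],
    [-2, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [-1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag(block):
    """Return the entries of a square block in zigzag order."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):                     # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                             # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        out.extend(block[r][s - r] for r in rows)
    return out


sequence = zigzag(B)
while sequence and sequence[-1] == 0:              # trailing zeros are replaced by EOB
    sequence.pop()
print(sequence, "EOB")
```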

5) Encode
5.1. The Zero Run Length Coding (RLC)
Let's consider the 63-element vector (the 64-element vector without the first coefficient). Say that we
have -2, -1, -1, -1, -3, 0, -3, 0, -2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0, -1, 0, 0, 0, and only zeros after that. Here
is how the RLC JPEG compression is done for this example, where each pair is (number of preceding zeros, value):

(0,-2), (0,-1), (0,-1), (0,-1), (0,-3), (1,-3), (1,-2), (2, 2), (1,1), (3,1), (1,-1), EOB
ACTUALLY, EOB has (0,0) as an equivalent and it will (later) be Huffman coded like
(0,0). So we'll encode:
(0,-2), (0,-1), (0,-1), (0,-1), (0,-3), (1,-3), (1,-2), (2, 2), (1,1), (3,1), (1,-1), (0,0)
Note that if the quantized vector doesn't end with zeros (its last element is not 0),
we will not have the EOB marker. Suppose that somewhere in the quantized vector we have:
7, nineteen zeros, 3, 0, 0, 0, 0, 0, 2, thirty-four zeros, 5, EOB
The JPG Huffman coding imposes the restriction (you'll see later why) that the number of
preceding zeros is coded as a 4-bit value, so it cannot exceed 15 (0xF). A run of 16
zeros is therefore coded as the special pair (15,0), and the previous example would be coded as:

(0, 7), (15,0), (3,3), (5,2), (15,0), (15,0), (2,5), (0,0)
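The zero run length coding can be sketched as follows. The function name is illustrative; (15,0) stands for sixteen consecutive zeros and (0,0) for EOB, as described above.

```python
def run_length_code(ac):
    """Turn a list of AC coefficients into (preceding zeros, value) pairs."""
    # Index of the last nonzero coefficient (-1 if all coefficients are zero).
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    pairs, run = [], 0
    for value in ac[:last + 1]:
        if value == 0:
            run += 1
            continue
        while run > 15:                 # the run must fit in 4 bits
            pairs.append((15, 0))       # special pair: sixteen zeros
            run -= 16
        pairs.append((run, value))
        run = 0
    if last + 1 < len(ac):              # trailing zeros are replaced by EOB
        pairs.append((0, 0))
    return pairs


first = [-2, -1, -1, -1, -3, 0, -3, 0, -2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0, -1] + [0] * 43
second = [7] + [0] * 19 + [3] + [0] * 5 + [2] + [0] * 34 + [5, 0]
print(run_length_code(first))
print(run_length_code(second))
```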


5.2. Huffman Coding
The JPEG standard stores the minimum number of bits needed to hold a value (this is called the
category of that value), followed by a bit-coded representation of the value, like this:

Values                                   Category   Bits for the value
-1, 1                                    1          0, 1
-3, -2, 2, 3                             2          00, 01, 10, 11
-7, -6, -5, -4, 4, 5, 6, 7               3          000, 001, 010, 011, 100, 101, 110, 111
-15, .., -8, 8, .., 15                   4          0000, ...., 0111, 1000, ...., 1111
-31, .., -16, 16, .., 31                 5          00000, ...., 01111, 10000, ...., 11111
-63, .., -32, 32, .., 63                 6          ...
-127, .., -64, 64, .., 127               7          ...
-255, .., -128, 128, .., 255             8          ...
-511, .., -256, 256, .., 511             9          ...
-1023, .., -512, 512, .., 1023           10         ...
-2047, .., -1024, 1024, .., 2047         11         ...
-4095, .., -2048, 2048, .., 4095         12         ...
-8191, .., -4096, 4096, .., 8191         13         ...
-16383, .., -8192, 8192, .., 16383       14         ...
-32767, .., -16384, 16384, .., 32767     15         ...
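The table follows a simple rule: the category is the number of bits of the absolute value, a positive value is written in plain binary, and a negative value v of category c is written as v + 2^c - 1 in c bits. A minimal sketch (function names are illustrative):

```python
def category(value):
    """Number of bits needed for |value|; this is the JPEG category."""
    return abs(value).bit_length()


def value_bits(value):
    """Bit-coded representation of a value within its category."""
    size = category(value)
    if size == 0:
        return ""                       # the value 0 carries no extra bits
    if value < 0:
        value += (1 << size) - 1        # e.g. -3 -> 0, -2 -> 1 in category 2
    return format(value, "0{}b".format(size))


for v in (-3, -2, 2, 3, -7, 7, -511):
    print(v, category(v), value_bits(v))
```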

Consequently, for the previous example:

(0,-2), (0,-1), (0,-1), (0,-1), (0,-3), (1,-3), (1,-2), (2, 2), (1,1), (3,1), (1,-1), (0,0)
let's encode ONLY the right value of these pairs, except the pairs that are special markers
like (0,0) or (if we had one) (15,0):

Value   Category   Bit-coded
-2      2          01
-1      1          0
-3      2          00
 2      2          10
 1      1          1

(0,-2), (0,-1), (0,-1), (0,-1), (0,-3), (1,-3), (1,-2), (2, 2), (1,1), (3,1), (1,-1), (0,0)
=>
(0,2)01, (0,1)0, (0,1)0, (0,1)0, (0,2)00, (1,2)00, (1,2)01, (2,2)10, (1,1)1, (3,1)1,
(1,1)0, (0,0)
Each pair of values enclosed in parentheses can be represented in one byte. In this
byte, the high nibble is the number of preceding zeros, and the low nibble is the
category of the new nonzero value.

Byte (zeros, category)   Huffman code
0, 2                     01
0, 1                     00
1, 2                     111001
2, 2                     11111000
1, 1                     1100
3, 1                     111010
0, 0                     1010

The FINAL step of the encoding consists of Huffman encoding this byte, and then writing
to the JPG file, as a stream of bits, the Huffman code of this byte, followed by the
bit-coded representation of the value. The final stream of bits written to the JPG
file on disk for the previous example is:
(01)01 (00)0 (00)0 (00)0 (01)00 (111001)00 (111001)01 (11111000)10 (1100)1
(111010)1 (1100)0 (1010)
5.3. The encoding of the DC coefficient
DC is the coefficient in the quantized vector corresponding to the lowest frequency in the
image (the zero frequency) and, before quantization, is mathematically equal to (the sum of
the 8x8 image samples) / 8.
The authors of the JPEG standard noticed that there is a very close connection between the
DC coefficients of consecutive blocks, so they decided to encode in the JPG file the
difference between the DCs of consecutive 8x8 blocks:
Diff = DC(i) - DC(i-1)
In JPG decoding you start from 0: the DC value preceding the first block is taken to be 0.
Diff = (category, bit-coded representation). For example, if Diff is equal to -511, then
Diff corresponds to (9, 000000000). Say that 9 has the Huffman code 1111110. (In the
JPG file, there are 2 Huffman tables for an image component: one for DC and one for
AC.) In the JPG file, the bits corresponding to the DC coefficient will be:
1111110 000000000

Applied to this example of DC and to the previous example of ACs, for this vector
with 64 coefficients, THE FINAL STREAM OF BITS written in the JPG file will be:
1111110 000000000 (01)01 (00)0 (00)0 (00)0 (01)00 (111001)00 (111001)01
(11111000)10 (1100)1 (111010)1 (1100)0 (1010)
(In the JPG file, the DC is encoded first, then the ACs.)
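The final bit stream can be assembled mechanically from the pieces above. In this sketch the two dictionaries hold only the Huffman codes quoted in the text for this example (they are not a complete JPEG table), and the function names are illustrative.

```python
# Huffman codes quoted in the example; keys are (preceding zeros, category).
AC_CODES = {(0, 2): "01", (0, 1): "00", (1, 2): "111001", (2, 2): "11111000",
            (1, 1): "1100", (3, 1): "111010", (0, 0): "1010"}
DC_CODES = {9: "1111110"}               # key is the category of Diff


def category(value):
    return abs(value).bit_length()


def value_bits(value):
    size = category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


def encode_block(dc_diff, ac_pairs):
    """Return the bit groups for one block: the DC difference first, then the AC pairs."""
    groups = [DC_CODES[category(dc_diff)] + value_bits(dc_diff)]
    for zeros, value in ac_pairs:
        groups.append(AC_CODES[(zeros, category(value))] + value_bits(value))
    return groups


pairs = [(0, -2), (0, -1), (0, -1), (0, -1), (0, -3), (1, -3), (1, -2),
         (2, 2), (1, 1), (3, 1), (1, -1), (0, 0)]
groups = encode_block(-511, pairs)
print(" ".join(groups))
```

The EOB pair (0,0) has category 0, so it contributes only its Huffman code and no value bits.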

6) Decoder process:
The decoder starts from the zigzag sequence
{-28, -2, -1, -1, -1, -3, 0, -3, 0, -2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0, -1, EOB}
and multiplies each coefficient by the corresponding entry of Q, which gives the
dequantized DCT coefficients:

-448  -22  -30    0   24    0    0    0
 -12  -12  -42    0    0    0    0    0
 -14    0   32    0    0    0    0    0
 -28    0   22    0    0    0    0    0
   0    0    0    0    0    0    0    0
 -24    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0

The inverse DCT, rounded to integers, gives the centered block:

-74 -77 -72 -62 -59 -65 -66 -61
-68 -67 -56 -41 -38 -48 -56 -55
-73 -67 -51 -32 -30 -45 -58 -62
-73 -68 -54 -36 -35 -49 -61 -64
-60 -61 -54 -43 -42 -51 -56 -53
-54 -62 -63 -58 -57 -61 -58 -50
-52 -62 -67 -65 -64 -65 -59 -49
-40 -51 -56 -54 -54 -55 -49 -39

Adding 128 gives the uncompressed image block:

54 51 56 66 69 63 62 67
60 61 72 87 90 80 72 73
55 61 77 96 98 83 70 66
55 60 74 92 93 79 67 64
68 67 74 85 86 77 72 75
74 66 65 70 71 67 70 78
76 66 61 63 64 63 69 79
88 77 72 74 74 73 79 89

Uncompressed Image Block

50 55 61  60  70  61 60 71
63 59 55  90  89  85 69 72
62 59 68 113 114  64 66 73
63 58 71 102  54 106 70 69
61 61 68 100  76  88 68 70
79 65 60  70  77  68 58 75
82 71 64  59  55  61 65 80
81 79 69  68  65  76 78 90

Origin Image Block
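The last two decoder stages (inverse DCT and the shift by 128) can be sketched as below. The input is the dequantized coefficient block of this example, stored sparsely as {(u, v): value} since most entries are zero; the function names are illustrative and the inverse DCT is the plain definition, not a fast algorithm.

```python
import math

# Nonzero dequantized coefficients B(u, v) * Q(u, v) of the example block.
DEQUANTIZED = {(0, 0): -448, (0, 1): -22, (0, 2): -30, (0, 4): 24,
               (1, 0): -12, (1, 1): -12, (1, 2): -42,
               (2, 0): -14, (2, 2): 32,
               (3, 0): -28, (3, 2): 22,
               (5, 0): -24}


def alpha(u):
    return math.sqrt(1 / 8) if u == 0 else math.sqrt(2 / 8)


def idct_value(coefficients, x, y):
    """Inverse DCT at position (x, y): sum over all nonzero coefficients."""
    return sum(alpha(u) * alpha(v) * c
               * math.cos(math.pi * (x + 0.5) * u / 8)
               * math.cos(math.pi * (y + 0.5) * v / 8)
               for (u, v), c in coefficients.items())


def decode_block(coefficients, shift=128):
    """Inverse DCT, rounding, and shifting back to the 0..255 range."""
    return [[round(idct_value(coefficients, x, y)) + shift for y in range(8)]
            for x in range(8)]


block = decode_block(DEQUANTIZED)
for row in block:
    print(row)
```

A production decoder would also clamp the result to the range 0..255; that is omitted here because the example stays well inside the range.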

The matrix E of error values (original - uncompressed) is:

-4   4   5  -6   1   -2   -2   4
 3  -2 -17   3  -1    5   -3  -1
 7  -2  -9  17  16  -19   -4   7
 8  -2  -3  10 -39   27    3   5
-7  -6  -6  15 -10   11   -4  -5
 5  -1  -5   0   6    1  -12  -3
 6   5   3  -4  -9   -2   -4   1
-7   2  -3  -6  -9    3   -1   1

and the average absolute error is:

$$\frac{1}{64} \sum_{x=0}^{7} \sum_{y=0}^{7} |e(x,y)| \approx 6.3$$
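The error matrix and the average absolute error follow directly from the two blocks. A minimal sketch, using the original and uncompressed blocks of the example (the variable names are illustrative):

```python
ORIGINAL = [
    [50, 55, 61, 60, 70, 61, 60, 71],
    [63, 59, 55, 90, 89, 85, 69, 72],
    [62, 59, 68, 113, 114, 64, 66, 73],
    [63, 58, 71, 102, 54, 106, 70, 69],
    [61, 61, 68, 100, 76, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [82, 71, 64, 59, 55, 61, 65, 80],
    [81, 79, 69, 68, 65, 76, 78, 90],
]
UNCOMPRESSED = [
    [54, 51, 56, 66, 69, 63, 62, 67],
    [60, 61, 72, 87, 90, 80, 72, 73],
    [55, 61, 77, 96, 98, 83, 70, 66],
    [55, 60, 74, 92, 93, 79, 67, 64],
    [68, 67, 74, 85, 86, 77, 72, 75],
    [74, 66, 65, 70, 71, 67, 70, 78],
    [76, 66, 61, 63, 64, 63, 69, 79],
    [88, 77, 72, 74, 74, 73, 79, 89],
]

# e(x, y) = original - uncompressed, element by element.
E = [[o - u for o, u in zip(o_row, u_row)]
     for o_row, u_row in zip(ORIGINAL, UNCOMPRESSED)]
mean_abs_error = sum(abs(e) for row in E for e in row) / 64
print(mean_abs_error)
```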
