Lecture4 Data Compression

Download as pdf or txt
Download as pdf or txt
You are on page 1of 34

A Lecture Series on

DATA COMPRESSION
JPEG and MPEG Standards

Abdou Youssef (Speaker)


Anastase Nakassis (MDVTG Manager)

1
Motivation for Standards
 Why Standards?
{ Compatibility
{ Production cost reduction
{ Triggering growth and product development
 Do JPEG/MPEG standards kill research?
{ Not necessarily
{ Those standards are very exible regarding the encoder
design, leaving much room for improvement
{ The trends are toward even greater exibility in future
generations of the standards
{ Another growth area is the development of systems which
integrate components that use the standards

2
Image/Video Compression Standards
(Outline)
 JPEG
{ Baseline JPEG
{ Extended JPEG
{ Lossless JPEG (DPCM + Hu man/Arithmetic)
 MPEG
{ History (H.261 and H.263)
{ MPEG1
{ MPEG2
{ Why not MPEG3
{ Future: MPEG4 (slated for 1998)

3
The JPEG \Toolkit"
 JPEG provides a \toolkit" of techniques for compressing
continuous-tone, still, color and monochrome images
 Baseline JPEG provides a DCT-based algorithm, and uses
run-length encoding and Hu man coding
 Baseline JPEG operates only in sequential mode, and is re-
stricted to 8 bits/pixel input images
 Extended JPEG o ers several optional enhancements:
{ 12-bit/pixel input
{ Progressive transmission
{ Choice between Arithmetic and Hu man coding
{ Adaptive quantization
{ Tiling
{ Still picture interchange le format (SPIFF)
{ Selective re nement
 Applications of JPEG
{ Desktop publishing, color fax, photojournalism,
picture archiving and communications systems for med-
ical images, general image archiving systems, consumer
imaging, graphic arts, and others
4
The Baseline JPEG Algorithm
1. It operates on 8  8 blocks of the input image
2. Mean-normalization (subtract 128 from each pixel)
3. Transform: DCT-transform each block
4. Quantization
 An 8  8 quantization matrix Q is user-provided
 Each block is divided by Q (point by point)
 The terms are then rounded to their nearest integers
 Remark: Up to 4 quantization matrices per image are
allowed (for example, one for luminance, and for each of
the three color components)
5. Entropy-coding of the DC coecients (the top left coecient
of each quantized block) using DPCM+Hu man
 Hu man-encode the DC residuals derived from the dif-
ference between each DC and the DC of the preceding
block
6. Entropy-coding of the AC (i.e., non-DC) coecients
 Zigzag-order the quantized coecients of each block
 Record for each nonzero coecient both its distance (called
run) to the preceding nonzero coecient in the zigzag se-
quence, and its value (called level)
 Hu man code the [run,level] terms using one single Hu -
man table for all the AC's of the image
5
A Schematic Diagram of the JPEG Encoder

6
The Quantization Matrices
 They are user-provided
 They can be computed using the contrast sensitivity function
of the HVS
 Their values are 8-bit integers
 They provide control over the bitrate by scaling them by a
constant factor
 Example
0 1
11
B
B 16 10 16 24 40 51 61 CC
B
12
B
B
B
B
12 14 19 26 58 60 55 CCCCC
13
B
B
B
B
14 16 24 40 57 69 56 CCCC
Q= 17
B
B
B
B
14 22 29 51 87 80 62 CCCC
:
22
B
B
B
B
18 37 56 68 109 103 77 CCCC
B
35
B
B
B 24 55 64 81 104 113 92 CCCC
B
64
B
B
B 49 78 87 103 121 120101 CCCC
@ A
72 92 95 98 112 100 103 99

7
The DC Hu man Table
 The DC residuals are in the range [-2047,2047]
 Divide this range into 12 categories corresponding to \one-
dimensional ripples"

 Category k, 0  k  11, consists of 2k integers ranging from


,2k + 1 to ,2k, , and from 2k, to 2k , 1
1 1

 Develop a Hu man code for the 12 categories, where every


codeword is at most 16 bits long
 Encode each DC residual as a binary string hsm where
{ h is the codeword of the residual's category
{ s = sign of the residual; s = 0 if negative, 1 if positive
{ m = the (k , 1)-bit [(magnitude of the residual) ,2k, ] 1

8
Coding of the AC Terms
(The AC Hu man Table)
 Divide the range of the AC's into 10 categories
 Category k, 1  k  10, consists of 2k integers ranging from
,2k + 1 to ,2k, , and from 2k, to 2k , 1
1 1

 Describe each [run; level] by 8 bits `NNNNSSSS'


{ NNNN = the value of run (i.e., the runlength)
{ SSSS = the category of level
{ If the runlength run > 15, then
 run = 15p + r, 0  r  15, r = r r r r in binary
3 2 1 0

 describe [run; level] by


111100001 111100002 ::: 1111000p r3r2r1r0SSSS
 Add an end-of-block EOB symbol after the last nonzero
coecient in each block
 The total number of symbols needed is 16*10+1+1=162
 Build a Hu man table for those 162 symbols, where every
codeword is at most 16 bits long
 Encode each AC quantized term as hsm where
{ h is the codeword of the AC's [run; category]
{ s = sign of the term; s = 0 if negative, 1 if positive
{ m = the (k , 1)-bit [(magnitude of the term) ,2k, ] 1

9
Examples
 Take 0an 8  8 block of Lena 1
B
B 143
147 149 152 156 147 146 149 CC
B
B
B
B
B
151
146 143 154 148 144 153 132 CCCCC
B
B
B
B
147
143 145 149 144 145 128 133 CCCC
B=
B
B
B
B
152
145 145 144 146 134 130 137 CCCC
B
B
B
B
146
143 142 147 124 127 139 138 CCCC
B
B
B
B 139
145 139 127 126 135 139 141 CCCC
B
B
B
B 145
137 124 130 138 136 140 144 CCCC
@ A
144 124 136 134 137 139 142 145
 Normalize
0 B 1
BB 15
19 21 24 28 19 18 21 CC
BB
18BB 23
BB
15 26 20 16 25 4 CCCCC
15 BB 19 17 21 16 17 0 5 CCCC
BB
17 BB 24 17 16 18 6 2 9 CCCC
NB = 15
BB
BB 18 14 19 ,4 ,1 11 10 CCCC
BB
17 BB 11
BB 11 ,1 ,2 7 11 13 CCCC
9 BB 17
B@ ,4 2 10 8 12 16 CCCC
16 ,4 8 6 9 11 14 17 A

10
 Perform
0 DCT 1
:4
B
B 103 12:4 6:0 2:1 8:1 5:7 ,0:7 ,0:4 CC
B
:7
B
B
B 31 11:6 ,22:8 ,0:7 ,0:1 ,0:5 ,4:7 ,1:8 CCCCC
B
:0
B
B
B 11 ,21:2 ,3:7 8:8 2:3 0:1 ,0:2 5:6 CCCC
D=
B
:2
B
B
B
B
0 ,4:9 10:3 ,8:6 ,8:9 ,2:5 ,7:6 ,1:4 CCCC
:9
B
B
B
B
4 ,0:4 ,1:9 ,8:9 6:6 2:6 3:6 3:6 CCCC
:7
B
B
B
B 0 ,0:9 ,0:9 4:1 7:2 ,15:5 2:8 1:8 CCCC
, :1
B
B
B
B 3 ,1:0 ,2:5 ,5:5 ,5:2 ,6:7 10:7 ,1:1 CCCC
,2:9
@
,0:2 ,1:2 ,2:7 3:6 2:6 0:4 ,6:4 A

 Quantize
0
(round(D:=Q)) 1
BB 6
1 0 0 0 0 00 CC
BB
1BB 4 ,2 0 0 0 00 CCCCC
,2 BB
BB 1
BB
0 0 0 0 00 CCCC
0 BB 0 0 0 0 0 00 CCCC
Dq = 0
BB
BB 0 0 0 0 0 00 CCCC
BB
0 BB 0
BB 0 0 0 0 00 CCCC
0 BB 0
B@ 0 0 0 0 00 CCCC
A
0 0 0 0 0 0 00
 zigzag ordering: 6 1 4 1 1 0 -2 -2 allzeros

11
Example Hu man Table for Lena

12
Coding the Example Block
 zigzag ordering: 6 1 4 1 1 0 0 -2 -2 allzeros
 Hu man symbols: [run, category]
DC (0,1) (0,3) (0,1) (0,1) (2,2) (0,2) EOB
 Code:
(DC di . code)/001/100100/001/001/111110010/0100/1010
 Bitrate (assuming 8 bits dor the DC residual symbol):
40
64
= 0:625 bits/pixel

13
Decoding
1. Entropy-decode the bitstream back to the quantized blocks
2. Dequantize: multiply each block coecient by the corre-
sponding coecient of the quantization matrix
3. Apply the inverse DCT transform on each block
4. Denormalize: add 128 to each coecient

 Example: Reconstructed block ^


B
0 1
B
B 143 147 153 157 157 154 149 145 CC
B
B
B
B
B
145 147 151 154 153 149 144 141 CCCCC
B
B
B
B
146 147 149 149 146 142 137 134 CCCC
B
B
B
B
146 146 145 142 139 135 132 130 CCCC
B
B
B
B
145 143 140 136 133 131 130 130 CCCC
B
B
B
B 141 138 134 131 130 131 134 135 CCCC
B
B
B
B 136 134 130 128 129 133 139 142 CCCC
@ A
133 131 127 126 129 135 142 147

14
 Error
0 block 1
B
B 0 0 ,4 ,5 ,1 ,7 ,3 4 CC
B
B
B
B 6 , 1 , 8 0 ,5 ,5 9 ,9 CCCCC
B
B
B
B 1 ,4 ,4 0 ,2 3 ,9 ,1 CCCC
6 ,1 0 2 7 ,1 ,2
B
B
B
B
B
7 CCCC
B
B
B
B
1 0 2 11 ,9 ,4 9 8 CCCC
B
B
B
B ,2 7 5 ,4 ,4 4 5 6 CCCC
B
B
B
B 9 3 ,6 2 9 3 1 2 CCCC
@
11 ,7 9 8 8 4 0 ,2 A

 MSE=5.2

15
Extended JPEG
 Extended JPEG allows for several optional enhancements:
{ 12-bit/pixel input
{ Arithmetic coding is allowed as an alternative to Hu man
coding
{ Adaptive quantization: Allows 5-bit scale change to the
quantization matrix from one block to another
{ SPIFF: A le format that provides for the interchange
of compressed image les between di erent application
environments
{ Progressive transmission (PT)
 Sequential PT
 Spectral selection
 Successive approximations
 Hierarchical PT (pyramid encoding)
{ Tiling
{ Selective re nement


16
Sequential Progressive Transmission

17
Hierarchical Progressive Transmission

 Subsampling lters are left to the users to specify


 Upsampling: bilinear interpolation
A B ,! A aB
c x b
CD C dD
a = A B; b = B D; c = A C ; d = C D; x = A
+
2
+
2
+
2
+
2
+B +C +D
4

18
MPEG
(History)

Standard Description
Activity
Internation Telecommunication Union (ITU);
H.261 designed for ISDN applications at p  64 kbits/sec,
Dec. 1990 (p = 1; 2; :::; 30)
ITU Recommendation;
H.263 video coding for low bitrate communication;
Feb. 1995 improved visual telephony applications
ISO Moving Picture Expert Group;
MPEG1: storage/retrieval of video+audio at about
1.5 Mbits/s
MPEG MPEG2: for generic applications at higher birates
Nov. 1994 (initially 10Mbits/s);
MPEG3: initially intended for 50 Mbits/s bitrate,
but MPEG2 was found to achieve the MPEG3 speed
goals, leading to the abandonment of MPEG3

19
Basic Concepts
 Video  a sequence of images called frames
 Color
{ Three components: red (R), green (G), and blue (B)
{ For compatibility with non-colored media, the RGB model
was converted to an equivalent model | Y CbCr
 Y is the luminance component, which was experimen-
tally determined to be
Y = 0:299R + 0:587G + 0:114B
 Cb = B , Y
 Cr = R , Y
{ Y is referred to as luma, and Cb & Cr as chroma
 Every frame is really 3 images: one Y , one Cb and one Cr
 8 bits/pixel for each of the three color components
 Because human vision is less sensitive to color, the Cb and
Cr images are downsampled by 2 in each dimension (they
are quarter the size of Y images)
 Much of MPEG processing is on the basis of a macroblock:
A 16  16 luminance block with the two 8  8 associated
chroma blocks
20
Modes of MPEG Compression
 Intraframe compression
{ Exploits spatial redundancy only
{ Operates on single frames independetly of other frames
 Interframe compression
{ Exploits both spatial and temporal redunadancies
{ Employs motion estimation (MS) without standardizing
any MS algorithm
{ Derives motion-compensated predictions of frames
{ Finally, it performs JPEG-like compression on the resid-
ual frames

21
Types of Frames in MPEG

 MPEG has 3 types of frames: I, P, and B


 I frames are stricly intra compressed as in JPEG. Their pur-
pose is to provide random access points to the video
 P frames are motion-compensated forward-predictive-coded
frames; they are interframe compressed, and typically pro-
vide more compression than I frames
 B frames are motion-compensated bidirectionally-predictive-
coded frames; they are interframe compressed, and typically
provide the most compression
 The relative numbers of I, P and B frames are arbitrary
 An I frame must occur at least once every 132 frames to
provide user-acceptible speed of random access to various
parts of a video

22
Interframe Compression of P and B Blocks
 A motion-compensated prediction (i.e., approximation) f of
a P/B frame f is made
 The residual f , f is then compressed in a JPEG-like style
 Any macroblock of the original frame f may be strictly intra
compressed if its prediction is deemed to be poor
 Issues to be addressed
{ Method of strict intra compression
{ Method of compressing residual macroblocks
{ Motion-compensated prediction

23
Flowchart of MPEG Compression

24
Motion-Estimation and Prediction
 Motion estimation is performed on the basis of macroblocks,
using the 16  16 luminance blocks only
 Motion is assumed to be uniform across all the pixels of a
macroblock
 Remark: There is a tradeo in deciding the size of the basic
block for motion estimation
{ The block has to be suciently large to avoid \false hits"
{ The block has to be suciently small to avoid diverse
motions within one single block
{ The MPEG block size, 16  16, is a good compromise

25
Motion-Estimation and Prediction for P Frames

 Consider a P-frame P
 P will be predicted (i.e., approximated) from one single ref-
erence frame R
 R is the most recent (decoded) I or P frame
 For each macroblock MB of P , nd the closest matching
macroblock MB' in the reference frame R
 If the MB-to-MB' match is satisfactory, then
{ treat MB' as the prediction (i.e., approximation) of MB
{ record the motion (i.e., displacement) vector between the
two macroblocks (allowing half-pixel accuracy)
{ Compute and compress the macroblock residual MB,MB'
(luma and chroma)

26
 If the MB-to-MB' match is found to be unsatisfactory, then
the macroblock MB is strictly intra compressed as is done in
I frames
 The motion vectors of all the macroblocks of P exhibit re-
dundancy due to similar (or sometimes identical) motion ex-
perienced by many neighboring macroblocks
 This redundancy is exploited by coding the consecutive dif-
ferential values of motion vectors (i.e., DPCM)
 Remark 1: MPEG does not standardize the decision mech-
anism for judging whether or or not a match between two
macroblocks is satisfactory
 Remark 2: A typical decision mechanism involves computing
an error measure between the luminance of the two macro
blocks. The match is treated as satisfory if and only if the
error is below a certain threshold. Possible error measures
include mean-square error (MSE), mean absolute-di erence
(MB ,MB )
error (MAD), and variance
variance(MB ) .
0

27
Motion-Estimation and Prediction for B Frames

 Consider a B-frame B
 B will be predicted (i.e., approximated) from TWO reference
frames R1 and R2
 R is the most recent (decoded) past I/P frame, and R is
1 2
the nearest (decoded) future I/P frame
 For each macroblock MB of B , nd the closest matching
macroblock MB1 in the reference frame R1, and the closest
matching macroblock MB2 in R2
 The predicted macroblock is PM = NINT ( MB + MB )
1 1 2 2

{ = 0:5 and = 0:5 if both matches are satisfactory


1 2

{ = 1 and = 0 if only the 1st match is satisfactory


1 2

{ = 0 and = 1 if only the 2nd match is satisfactory


1 2

{ = 0 and = 0 if neither match is satisfactory


1 2

28
 Compute and compress the macroblock residual MB , PM
(luma and chroma)
 If neither match is found to be satisfactory, then the mac-
roblock MB is strictly intra compressed as is done in I frames
 Record the motion vector(s) between the MB and the other
one or two macroblocks (allowing half-pixel accuracy)
 Again, the motion-vector redundancy is exploited by cod-
ing the consecutive di erential values of motion vectors (i.e.,
DPCM)
 Remark: The prediction mode chosen is 2-bit coded and
passed on along with the macroblock header information

29
MPEG1 Constrained Parameters

horizontal picture size 768


vertical picture size  576
picture area  396 macroblocks at 25 frames/ses
picture area  330 macroblocks at 30 frames/sec
picture rate  30 frames/sec
bitrate  1.856 Mbits/sec (constant)

30
MPEG2
 Higher data rates than MPEG1
 MPEG2 allows for higher quality source images
{ 4:2:0 (chroma subsamples only horizontally)
{ 4:4:4 (no subsampling of chroma)
{ Note: 4:2:2 is what's supported by MPEG1
 MPEG2 allows for ner quantization and for specifying separate
quantization table for luma and chroma
 MPEG2 allows for ner adjustment of quantization scale fac-
tor, used in intra compression
 MPEG2 allows for interlaced video
 MPEG2 supports error concealment (of lost macroblocks)
 MPEG2 supports scalable compression
{ SNR-scalability: by sending bands of DCT coecients
{ spatial-scalability: pixel resolution by down-/up-sampling
{ temporal-scalability: di erent frame rates by skipping frames

31
MPEG2 (Cont.)
 MPEG2 has a Pro le and Level structure
{ Pro les are algorithmic elements included in MPEG2
{ Levels are upper bounds on parameter values
 Pro les (pro les are backward-compatible)
1. Simple: No use of B frames
2. Main: what was described earlier, but no scalability
3. SNR-scalable
4. Spatially scalable
5. High: temporally scalable, higher-quality source data (4:2:0,
4:2:2 and possibly 4:4:4 in the future)
 Levels
1. Low: 352  240 frame size
2. Main: 720  480 frame size
3. High 1440: 1440  1152 frame size
4. High: 1920  1080 frame size

32
Allowed Level-Pro le Combinations

simple main SNR- spatially high


scalable scalable
high NO YES No No YES
high 1440 NO YES No YES YES
main YES YES YES NO YES
low NO YES YES NO NO
 The Grand Alliance (for HDTV) follows the Main Pro le
and High Level combination, for a maximum bitrate of 45
Mbits/sec

33
References
1. B. pennebaker and J. L. Mitchell, JPEG Still Image Data
Compression Standard, Van Nostrand reinhold, New York
1993.
2. ISO-11172-2: Generic Coding of moving pictures and associ-
ated audio (MPEG-1)
3. ISO-13818-2: Generic Coding of moving pictures and associ-
ated audio (MPEG-2)

34

You might also like