WSU-DTC Campus
Dept. of Information Technology
Itec3121 - Multimedia Systems
Year: III, Semester: II
Chapter 10:
Basic Video Compression Techniques
Abraham Abayneh
Chapter content
• Introduction to Video Compression
• Video Compression Based on Motion Compensation
Introduction to Video Compression
• Video compression can be defined as reducing the file size of a video
by discarding some information or quality.
• Video compression has several obvious benefits. It makes better use
of storage space and reduces storage cost, whether the storage is
cloud-based or on-premises.
• The Moving Picture Experts Group (MPEG) method is used to
compress video.
• In principle, a motion picture is a rapid sequence of a set of frames in
which each frame is a picture.
• In other words, a frame is a spatial combination of pixels, and a video
is a temporal combination of frames that are sent one after another.
• Compressing video, then, means spatially compressing each frame
and temporally compressing a set of frames.
Spatial compression
The spatial compression of each frame is done with JPEG, or a
modification of it.
Each frame is a picture that can be compressed independently.
Intraframe coding (spatial redundancy) is DCT-based and very similar to JPEG.
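The DCT step behind intraframe coding can be illustrated with a minimal sketch. It assumes NumPy, an 8x8 block size, and a single uniform quantizer step of 16; JPEG and MPEG actually use a table of per-frequency steps, so the step value, the test block, and the helper name `dct_matrix` are illustrative assumptions only.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # the DC row uses a different scale factor
    return c

C = dct_matrix(8)

# A smooth 8x8 block: a horizontal brightness ramp (illustrative data).
block = np.tile(100.0 + 4.0 * np.arange(8), (8, 1))

coeffs = C @ block @ C.T                  # 2D DCT of the block
STEP = 16                                 # assumed uniform quantizer step
quantized = np.round(coeffs / STEP)       # most coefficients become zero
restored = C.T @ (quantized * STEP) @ C   # decoder side: dequantize, inverse DCT

print("non-zero coefficients:", np.count_nonzero(quantized), "of", quantized.size)
```

Because the block is smooth, the energy collects in a few low-frequency coefficients, and the many zeros that remain after quantization are what the later run-length and entropy coding stages exploit.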
Temporal compression
• In temporal compression, the information that is repeated from one
frame to the next is removed.
• When we watch television, for example, we receive 30 frames per second.
• However, most of the consecutive frames are almost the same. For
example, in a static scene in which someone is talking, most frames
are the same except for the segment around the speaker’s lips, which
changes from one frame to the next.
• Interframe coding (temporal redundancy) uses block-based motion
compensation to reduce temporal redundancy.
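The talking-head example can be made concrete with a small sketch. It assumes NumPy and two synthetic 320x240 grayscale frames that differ only in one small region; the frame contents and the region coordinates are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic grayscale frames of a static scene (illustrative data).
frame1 = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)
frame2 = frame1.copy()

# Only a small 16x32 region (the "lips") changes between the two frames.
frame2[150:166, 140:172] = rng.integers(0, 256, size=(16, 32), dtype=np.uint8)

# Signed difference; cast first so that uint8 subtraction does not wrap around.
diff = frame2.astype(np.int16) - frame1.astype(np.int16)

changed_fraction = np.count_nonzero(diff) / diff.size
print(f"pixels that changed: {changed_fraction:.2%}")
```

Less than one percent of the pixels differ, so coding the difference is far cheaper than coding the second frame from scratch. Motion compensation extends this idea to regions that move rather than stay in place.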
Various MPEG Standards
• MPEG-1
— 320x240 full-motion video
— 1.5 Mb/s
• MPEG-2
— higher resolution and transmission rate (3-15 Mb/s)
— defines different profiles and levels for scalability
• MPEG-4
— full-motion video at low bitrate (9-40 Kbps)
— intended for interactive multimedia, video telephony
Why does video compression work?
JPEG compression exploits the spatial redundancy in
image data through the DCT.
For video data, each frame can be compressed using
JPEG, and a compression ratio of about 20:1 can be achieved.
Can we do more?
Consecutive frames are very similar. If we encode the
first frame and encode where each region moves to in the
second frame, we obtain a prediction of the second frame.
Only the residual needs to be encoded.
This is similar to predictive coding that exploits the
temporal redundancy in video data
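A rough calculation shows why frame-by-frame JPEG is not enough. The sketch below uses the 320x240 frame size and the 20:1 ratio quoted in this chapter, and assumes 24 bits per pixel and 30 frames per second; those two values are assumptions, not figures from the slides.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * fps

raw = raw_bitrate_bps(320, 240, 24, 30)   # assumed 24 bits/pixel, 30 frames/s
jpeg_only = raw / 20                      # 20:1 intraframe compression only

print(f"raw video:        {raw / 1e6:.1f} Mbit/s")        # 55.3 Mbit/s
print(f"JPEG every frame: {jpeg_only / 1e6:.2f} Mbit/s")  # 2.76 Mbit/s
```

Under these assumptions, spatial compression alone still needs about 2.76 Mbit/s, which is above the 1.5 Mbit/s target quoted for MPEG-1. The remaining saving has to come from temporal redundancy.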
MPEG compression
The Moving Picture Experts Group (MPEG) was established in 1988 to
create standards for delivering audio and video data.
MPEG-1: for VHS-quality video (VCD, 320 x 240 pixels/frame) with
audio, at about 1.5 Mbit/s.
The components of MPEG-1 are:
– JPEG + motion prediction for video coding
– MUSICAM-based audio
– Stream control
MPEG-2: designed for a range of bit rates, from 352x240
consumer video (4 Mbit/s) and 720x480 studio TV (15
Mbit/s) to 1440x1152 HDTV (60 Mbit/s).
Motion Prediction
• Suppose the first frame has already been encoded using
JPEG. What is the best way to encode frame 2?
– JPEG
– Motion prediction
[Figure: motion prediction between Frame 1 and Frame 2]
Motion prediction
For motion prediction to work on its own, we would need to record the motion
of every pixel. This can be done more efficiently by recording one motion
vector per image block; these blocks are called “macroblocks”.
The predicted macroblock and the actual image block are compared, and the
difference is encoded.
Motion prediction
• Previous frame is called “reference” frame
• Current frame is called “target” frame
• The target frame is divided into 16x16 macroblocks
• For each macroblock, its best match in the reference
frame is computed, and the 2D motion vector and the image
prediction error are recorded
– Prediction error: DCT+Quantization+RLE+Huffman
– Motion vector: Quantization+entropy coding
Matching Macroblocks
• Different match measures and methods can be used for
finding the best match for each macroblock
• Different measures
– Mean absolute difference
$$\mathrm{MAD}(I, I') = \frac{1}{16 \times 16} \sum_{i=1}^{16} \sum_{j=1}^{16} \left| I(i,j) - I'(i,j) \right|$$
– Mean squared difference
$$\mathrm{MSE}(I, I') = \frac{1}{16 \times 16} \sum_{i=1}^{16} \sum_{j=1}^{16} \left( I(i,j) - I'(i,j) \right)^2$$
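The two measures translate directly into code. This is a minimal sketch assuming NumPy; `mad` and `mse` are illustrative names, and the blocks are converted to floating point so that 8-bit pixel values do not overflow.

```python
import numpy as np

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    a = np.asarray(block_a, dtype=np.float64)
    b = np.asarray(block_b, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))

def mse(block_a, block_b):
    """Mean squared difference between two equally sized blocks."""
    a = np.asarray(block_a, dtype=np.float64)
    b = np.asarray(block_b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

target_block = np.zeros((16, 16), dtype=np.uint8)
candidate = np.full((16, 16), 3, dtype=np.uint8)

print(mad(target_block, candidate))  # 3.0
print(mse(target_block, candidate))  # 9.0
```

MAD needs no multiplications, which is one reason it is a common choice when the measure has to be evaluated many times per macroblock.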
Matching Macroblocks
• Different matching methods
– Full search method: examine every candidate position in the R x R
search region of the reference frame and keep the position with
minimum MAD or MSE
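A full search can be sketched as two nested loops over the candidate displacements. The code assumes NumPy, MAD as the match measure, and a search range parameter `p` so that the search region is (2p+1) x (2p+1) positions; `p`, the function names, and the synthetic test frames are assumptions made for illustration.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def full_search(target, reference, top, left, n=16, p=7):
    """Return (dy, dx, cost) with minimum MAD for the target block at (top, left).

    Every displacement with |dy| <= p and |dx| <= p is tried; candidates
    that fall outside the reference frame are skipped.
    """
    block = target[top:top + n, left:left + n]
    h, w = reference.shape
    best = (0, 0, float("inf"))
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue
            cost = mad(block, reference[y:y + n, x:x + n])
            if cost < best[2]:
                best = (dy, dx, cost)
    return best

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Synthetic target: the reference shifted 2 rows down and 3 columns left.
target = np.roll(reference, shift=(2, -3), axis=(0, 1))

result = full_search(target, reference, top=24, left=24)
print(result)  # (-2, 3, 0.0)
```

With p = 7 this evaluates up to 225 positions per macroblock, which is why the faster methods on the following slides matter.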
Matching Macroblocks
• Two-dimensional logarithmic search
– Search at the largest scale at nine locations
– Find the best match
– Start from the best match, reduce the scale, and repeat the
previous steps
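The three steps above can be sketched as a loop that halves the step size on each pass. The code assumes NumPy, MAD as the match measure, and an initial step of ceil(p/2) for a search range `p`; these choices and the function names are assumptions, and other variants of the method exist.

```python
import math
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def log_search(target, reference, top, left, n=16, p=7):
    """2D logarithmic search; returns (dy, dx, cost) for the block at (top, left)."""
    block = target[top:top + n, left:left + n]
    h, w = reference.shape

    def cost(dy, dx):
        y, x = top + dy, left + dx
        if y < 0 or x < 0 or y + n > h or x + n > w:
            return float("inf")          # candidate lies outside the frame
        return mad(block, reference[y:y + n, x:x + n])

    cy, cx = 0, 0                        # current search centre
    step = math.ceil(p / 2)              # largest scale first
    while True:
        # Nine locations: the centre and its eight neighbours at this scale.
        candidates = [(cy + sy * step, cx + sx * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        cy, cx = min(candidates, key=lambda d: cost(*d))
        if step == 1:
            break
        step = math.ceil(step / 2)       # reduce the scale and repeat
    return cy, cx, cost(cy, cx)

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Synthetic target: the reference shifted 4 rows up and 4 columns right.
target = np.roll(reference, shift=(-4, 4), axis=(0, 1))

mv = log_search(target, reference, top=24, left=24)
print(mv)  # (4, -4, 0.0)
```

For p = 7 the loop visits 27 candidate positions (steps 4, 2, 1) instead of the 225 of a full search. The method relies on the error decreasing towards the true match, so on busy textures it can settle on a position that is not the global minimum.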
Matching Macroblocks
• Hierarchical motion estimation
– Build an image pyramid by down-sampling the image
– Estimate the motion at the coarse level
– Propagate the motion from the coarse level to the next finer level
– Refine the motion at the finer level
– Repeat these steps until the finest level is reached
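A coarse-to-fine search along these lines might look like the sketch below. It assumes NumPy, a three-level pyramid built by 2x2 averaging, a search radius of 2 at the coarsest level, and a refinement radius of 1 at each finer level. All of these parameters and function names are illustrative assumptions.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(a - b)))

def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(target, reference, top, left, n, center, radius):
    """Best (dy, dx) within +/- radius of center, measured by MAD."""
    block = target[top:top + n, left:left + n]
    h, w = reference.shape
    best, best_cost = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue
            c = mad(block, reference[y:y + n, x:x + n])
            if c < best_cost:
                best, best_cost = (dy, dx), c
    return best

def hierarchical_search(target, reference, top, left, n=16, levels=3, radius=2):
    """Coarse-to-fine motion estimation; top, left and n should be
    divisible by 2 ** (levels - 1)."""
    targets = [target.astype(np.float64)]
    refs = [reference.astype(np.float64)]
    for _ in range(levels - 1):                   # build the image pyramids
        targets.append(downsample(targets[-1]))
        refs.append(downsample(refs[-1]))
    mv = (0, 0)
    for level in range(levels - 1, -1, -1):       # coarsest level first
        scale = 2 ** level
        if level < levels - 1:
            mv = (mv[0] * 2, mv[1] * 2)           # propagate to the finer level
        r = radius if level == levels - 1 else 1  # small refinement afterwards
        mv = search(targets[level], refs[level],
                    top // scale, left // scale, n // scale, mv, r)
    return mv

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Synthetic target: the reference shifted 4 rows up and 8 columns right.
target = np.roll(reference, shift=(-4, 8), axis=(0, 1))

mv = hierarchical_search(target, reference, top=32, left=32)
print(mv)  # (4, -8)
```

A radius of 2 at the coarsest of three levels covers displacements of up to 8 pixels at full resolution, while each level only compares a handful of small blocks.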
14
MPEG compression
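MPEG encodes video frames using three frame types:
I-frame: intraframe, coded independently of other frames
P-frame: interframe, predicted from the previous I- or P-frame
B-frame: bi-directional frame, which searches for matching macroblocks
in both the preceding and the following I- or P-frame
So B-frames can be decoded only after the next I- or P-frame has been decoded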
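Because a B-frame needs the frame that follows it, the order in which frames are coded differs from the order in which they are displayed. The sketch below reorders a display-order pattern such as "IBBPBBP"; the function name and the pattern are illustrative, and B-frames at the very end are simply appended, which is a simplification.

```python
def coding_order(display_types):
    """Reorder frames from display order to an order in which every
    B-frame comes after both of its reference frames.

    display_types is a string such as "IBBPBBP"; the result is a list of
    (display_index, frame_type) tuples.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append((index, frame_type))  # wait for the next I or P
        else:
            order.append((index, frame_type))      # I or P goes first
            order.extend(pending_b)                # then the B-frames before it
            pending_b = []
    order.extend(pending_b)  # trailing B-frames (simplified handling)
    return order

print(coding_order("IBBPBBP"))
# [(0, 'I'), (3, 'P'), (1, 'B'), (2, 'B'), (6, 'P'), (4, 'B'), (5, 'B')]
```

The decoder reads frames in this order and puts them back into display order before showing them.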
Why B frame ?
Images in video are best predicted from both previous and
following images, especially for occluded areas.
[Figure: Frame 1, Frame 2, Frame 3]
In frame 2, the black region cannot be predicted from frame
1 because it is not visible in frame 1.
But it can be inferred from frame 3.
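The occlusion argument can be sketched by comparing the candidate predictions for one block. The code assumes NumPy and that the encoder picks among a forward prediction, a backward prediction, and their average by MAD; the selection rule, the function name, and the pixel values are assumptions for illustration.

```python
import numpy as np

def best_b_prediction(block, forward_pred, backward_pred):
    """Pick the prediction mode with the smallest MAD for one B-frame block."""
    block = block.astype(np.float64)
    fwd = forward_pred.astype(np.float64)
    bwd = backward_pred.astype(np.float64)
    candidates = {
        "forward": fwd,               # from the previous reference frame
        "backward": bwd,              # from the following reference frame
        "average": (fwd + bwd) / 2.0, # interpolation of the two
    }
    costs = {name: float(np.mean(np.abs(block - pred)))
             for name, pred in candidates.items()}
    mode = min(costs, key=costs.get)
    return mode, costs[mode]

# A region that is hidden in frame 1 but visible in frames 2 and 3.
block_in_frame2 = np.full((16, 16), 200, dtype=np.uint8)
from_frame1 = np.zeros((16, 16), dtype=np.uint8)       # region not visible yet
from_frame3 = np.full((16, 16), 200, dtype=np.uint8)   # region visible

print(best_b_prediction(block_in_frame2, from_frame1, from_frame3))
# ('backward', 0.0)
```

For blocks that are visible in both neighbouring frames, the averaged prediction often wins because it smooths out noise in the two references.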
B-frame encoding
Slices
• To make the process of decoding more resilient to
transmission errors, each frame is divided into slices.
• If one slice is corrupted, the decoder restarts from the
beginning of the next slice.
[Figure: an image divided into 7 slices]
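Restarting at the next slice is possible because each slice begins with a byte-aligned start code. The sketch below scans a byte buffer for the prefix 0x000001 followed by a value from 0x01 to 0xAF, which MPEG-1 and MPEG-2 video use for slice start codes; the function name and the sample bytes are made up for illustration.

```python
def find_slice_starts(data: bytes):
    """Return (offset, vertical_position) for every slice start code in data."""
    prefix = b"\x00\x00\x01"
    starts = []
    i = data.find(prefix)
    while i != -1 and i + 3 < len(data):
        code = data[i + 3]
        if 0x01 <= code <= 0xAF:          # slice start codes
            starts.append((i, code))      # the code is the slice's vertical position
        i = data.find(prefix, i + 3)
    return starts

# Illustrative buffer: a picture start code (0x00), then two slices.
stream = (b"\x00\x00\x01\x00" + b"\x12"
          + b"\x00\x00\x01\x01" + b"\xAA\xBB"
          + b"\x00\x00\x01\x02" + b"\xCC")

print(find_slice_starts(stream))  # [(5, 1), (11, 2)]
```

If the data of the first slice turns out to be corrupt, a decoder can jump to offset 11 and continue with the second slice, so the damage stays confined to one band of the picture.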
MPEG video bitstream
Sequence information
Video Params include width, height, aspect ratio of pixels,
picture rate.
Bitstream Params are bit rate, buffer size, and constrained
parameters flag (means bitstream can be decoded by most
hardware)
Two types of quantization tables (QTs): one for intra-coded blocks
(I-frames) and one for inter-coded blocks (P-frames).
Group of Pictures (GOP) information
Time code: bit field with SMPTE time code (hours, minutes,
seconds, frame).
GOP Params are bits describing structure of GOP.
MPEG video bitstream
Picture information
Type: I, P, or B-frame?
Buffer Params indicate how full decoder's buffer should be before starting
decode.
Encode Params indicate whether half pixel motion vectors are used.
Slice information
Vert Pos: what line does this slice start on?
QScale: How is the quantization table scaled in this slice?
Macroblock information
Addr Incr: number of MBs to skip.
Type: Does this MB use a motion vector? What type?
QScale: How is the quantization table scaled in this MB?
Coded Block Pattern (CBP): bitmap indicating which blocks are coded.
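The layers described on these slides nest inside one another, which can be pictured as a set of container types. The sketch below uses Python dataclasses; all class and field names are hypothetical labels for the items listed above, not the syntax element names of the standard, and most parameters are omitted.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Macroblock:
    addr_incr: int            # address increment relative to the previous MB
    mb_type: str              # e.g. intra, forward-predicted (illustrative labels)
    qscale: int               # quantization table scale for this MB
    coded_block_pattern: int  # bitmap of which blocks are coded

@dataclass
class Slice:
    vert_pos: int             # line on which the slice starts
    qscale: int
    macroblocks: List[Macroblock] = field(default_factory=list)

@dataclass
class Picture:
    picture_type: str         # "I", "P", or "B"
    slices: List[Slice] = field(default_factory=list)

@dataclass
class GroupOfPictures:
    time_code: str            # SMPTE time code, hours:minutes:seconds:frame
    pictures: List[Picture] = field(default_factory=list)

@dataclass
class Sequence:
    width: int
    height: int
    picture_rate: float
    bit_rate: int
    gops: List[GroupOfPictures] = field(default_factory=list)

def count_macroblocks(sequence: Sequence) -> int:
    """Total number of macroblocks stored in the sequence."""
    return sum(len(s.macroblocks)
               for g in sequence.gops for p in g.pictures for s in p.slices)

mb = Macroblock(addr_incr=1, mb_type="intra", qscale=8, coded_block_pattern=0b111111)
seq = Sequence(320, 240, 30.0, 1_500_000,
               gops=[GroupOfPictures("00:00:00:00",
                                     pictures=[Picture("I", slices=[Slice(1, 8, [mb, mb])])])])
print(count_macroblocks(seq))  # 2
```

Reading the bitstream follows the same nesting: sequence header first, then each GOP, each picture, each slice, and finally the macroblocks inside the slice.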
Thank You!!!