Unit-III: Audio & Video Coding
PART-II
VIDEO CODING
[Figure: previous frame and current frame]
◼ Blu-ray Disc
❑ Capacity: 25 GB (single layer)
❑ Read rate: 36 Mbits/s
◼ Video Streaming or TV Broadcast
❑ 1 Mbits/s to 20 Mbits/s
Requires 30x to 1200x compression of the uncompressed video
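The 30x to 1200x range can be checked with a short calculation. This is only a sketch: the slide does not state the uncompressed source format, so the 1920x1080, 24 bits/pixel, 25 frames/s source below is an assumption chosen because it roughly reproduces the quoted range. The function names are illustrative.

```python
# Assumed source format (NOT stated on the slide): 1920x1080, 24 bits/pixel, 25 frames/s.
WIDTH, HEIGHT, BITS_PER_PIXEL, FPS = 1920, 1080, 24, 25


def raw_bitrate_mbps(width, height, bits_per_pixel, fps):
    """Uncompressed bit rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1e6


def compression_ratio(raw_mbps, channel_mbps):
    """How many times the raw stream must shrink to fit the channel."""
    return raw_mbps / channel_mbps


raw = raw_bitrate_mbps(WIDTH, HEIGHT, BITS_PER_PIXEL, FPS)
print(f"Raw video: {raw:.2f} Mbit/s")

# Target rates taken from the slide: Blu-ray read rate and the streaming range.
for name, rate in [("Blu-ray read rate", 36),
                   ("Streaming/broadcast, upper end", 20),
                   ("Streaming/broadcast, lower end", 1)]:
    print(f"{name}: {rate} Mbit/s -> about {compression_ratio(raw, rate):.0f}x")
```

Under this assumption the raw stream is about 1244 Mbit/s, so the Blu-ray read rate needs roughly 35x compression and a 1 Mbit/s stream needs roughly 1244x.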
ELE-4410: Multimedia Systems & Networks 5
Principles of video coding
◼ Compression is achieved by removing redundant and irrelevant
information from the video sequence
◼ Redundancies in videos:
❑ Spatial redundancy
◼ Neighbouring pixels inside a picture are similar.
❑ Statistical redundancy
◼ Unequal distribution of colour intensities.
◼ Some colours are more dominant than others.
❑ Temporal redundancy
◼ Similarity among the frames.
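Statistical redundancy can be illustrated with first-order entropy: when some intensities dominate, fewer bits per sample are needed on average than with a fixed-length code. The sketch below uses made-up sample values, and `entropy_bits` is an illustrative helper, not something defined in these slides.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """First-order entropy in bits per sample."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical data: 16 samples drawn from 4 possible intensity values.
uniform = [0, 1, 2, 3] * 4        # every value equally likely
skewed = [0] * 13 + [1, 2, 3]     # one value dominates

print(f"uniform: {entropy_bits(uniform):.3f} bits/sample")
print(f"skewed:  {entropy_bits(skewed):.3f} bits/sample")
```

The uniform case needs the full 2 bits per sample, while the skewed case needs less than 1 bit per sample on average. A variable-length code (the VLC stage) exploits exactly this gap.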
Image vs. Video Coding
◼ Image coding: uses Spatial and Statistical
redundancy reductions
◼ Video coding: uses Spatial, Statistical AND
Temporal redundancy reductions
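[Figure: block diagrams of the two coders.
Image coder: image data -> DCT -> Q -> VLC -> out.
Video coder: in -> (+/-) -> DCT -> Q -> VLC -> out, with a feedback loop IDCT -> (+) -> Buffer -> ME.
Labels: perceptual redundancy reduction, statistical, temporal.]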
Inter-frame Video Coding
[Figure: Frame 0, Frame 1, and their scaled frame difference; video input]
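A frame difference can be sketched on a toy example. The 4x4 frames below are made up, and the display scaling (mapping the signed range -255..255 onto 0..255) is an assumption, since the slide does not say how the difference image was scaled.

```python
def frame_difference(current, previous):
    """Pixel-wise signed difference: current - previous."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


def scale_for_display(diff):
    """Map signed differences (-255..255) to 0..255 so zero shows as mid-grey.
    This particular mapping is an assumption."""
    return [[(d + 255) // 2 for d in row] for row in diff]


# Hypothetical frames: a bright 2x2 object moves one pixel to the right.
frame0 = [[10, 10, 10, 10],
          [10, 200, 200, 10],
          [10, 200, 200, 10],
          [10, 10, 10, 10]]
frame1 = [[10, 10, 10, 10],
          [10, 10, 200, 200],
          [10, 10, 200, 200],
          [10, 10, 10, 10]]

diff = frame_difference(frame1, frame0)
nonzero = sum(1 for row in diff for d in row if d != 0)
print(diff[1])                 # difference along a row crossing the object
print(f"{nonzero} of 16 difference pixels are non-zero")
```

Only the pixels at the object's leading and trailing edges differ between the frames, so most of the difference image is zero. That is the temporal redundancy inter-frame coding removes.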
[Figure: timeline of video coding standards, 1990 to 2016]
◼ ITU-T:
❑ H.261 (1990)
❑ H.263 (1995/96)
❑ H.263+ (1997/98)
❑ H.263++ (2000)
◼ MPEG:
❑ MPEG-1 (1993)
❑ MPEG-4 v2 (1999/00)
❑ MPEG-4 v3 (2001)
◼ Joint ITU-T/MPEG:
❑ MPEG-2 (H.262) (1994/95)
❑ H.264 (MPEG-4 Part 10) (2002)
❑ H.265/HEVC (Jan 2013)
❑ Scalable and multi-view extensions (2016): H.264/SVC, H.265/SHVC, MV-HEVC, 3D-HEVC
◼ Note:
❑ The two chrominance components (U,V) contain considerably less
information than the luminance component. For this reason,
chrominance is often sub-sampled
[Figure: A macroblock in 4:2:0 format]
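The saving from 4:2:0 sub-sampling can be sketched as follows. Two assumptions are made that this extract does not state: the macroblock covers 16x16 luminance samples, and each chrominance plane is reduced by averaging 2x2 neighbourhoods (real systems may use other filters and sample positions).

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 block
    (rounded). Assumes even width and height."""
    height, width = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]


MB = 16  # assumed macroblock size in luminance samples

# Hypothetical full-resolution U plane for one macroblock.
u_full = [[128 + (x % 2) for x in range(MB)] for _ in range(MB)]
u_sub = subsample_420(u_full)

samples_444 = 3 * MB * MB                                   # Y, U, V at full size
samples_420 = MB * MB + 2 * len(u_sub) * len(u_sub[0])      # Y full, U and V quartered

print(f"U plane after sub-sampling: {len(u_sub)}x{len(u_sub[0])}")
print(f"Samples per macroblock: 4:4:4 = {samples_444}, 4:2:0 = {samples_420}")
```

Under these assumptions a macroblock carries 384 samples instead of 768, so sub-sampling alone halves the data before any transform coding.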
Digital Video Formats
◼ Common Intermediate Format (CIF):
❑ This format was defined by CCITT for the H.261 coding standard
(teleconferencing and videophone).
❑ Several size formats:
◼ SQCIF: 128x96 pixels. QCIF: 176x144 pixels.
◼ CIF: 352x288 pixels. 4CIF: 704x576 pixels.
❑ Non-interlaced (progressive) scanning, with 4:2:0 chrominance sub-sampling.
◼ Source Input Format (SIF):
❑ Utilized in MPEG as a compromise with Rec. 601.
❑ Two size formats (similar to CIF):
◼ QSIF: 180x120 or 176x144 pixels at 30 or 25 fps
◼ SIF: 360x240 or 352x288 pixels at 30 or 25 fps
❑ SIF is assumed to be derived from a Rec. 601 source.
◼ High Definition Television (HDTV):
❑ 1280x720 pixels.
❑ 1920x1080 pixels.
Group of Frames (GOF)
Motion Estimation & Compensation
◼ Translation: simple movement of typically rigid objects
❑ Camera pans vs. movement of objects
◼ Rotation: spinning about an axis
❑ Camera versus object rotation
◼ Zooms in/out
❑ Camera zoom vs. object zoom (movement in/out)
[Figure: Frame n and Frame n+1; Frame n+1 (rotation) and Frame n+2 (zoom)]
[Figure: block matching. Labels: reference frame, search window, current frame, portion of interest]
◼ MAD: mean absolute difference between the current block and a candidate block in the search window
[Figure: motion compensated frame M(n) (send motion vectors) and motion compensated difference image MCD(n) (send MCD(n))]
MCD(n) = F(n) - M(n)
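Block matching with MAD and the resulting motion compensated difference can be sketched as below. This is a minimal full-search example, not necessarily the search strategy used in the course. MAD is taken in its common form, MAD(dx, dy) = (1/N^2) * sum over the block of |C(x, y) - R(x + dx, y + dy)|, where the notation C, R, N is an assumption. The frames, block size, and search range are made up.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the search window; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # candidate block would fall outside the reference frame
            cost = mad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def make_frame(x0, y0, size=8):
    """Hypothetical test frame: dark background with a bright 2x2 object."""
    frame = [[10] * size for _ in range(size)]
    for y in range(y0, y0 + 2):
        for x in range(x0, x0 + 2):
            frame[y][x] = 200
    return frame


ref = make_frame(2, 2)   # reference frame
cur = make_frame(4, 3)   # object moved 2 right, 1 down

BX, BY, N = 4, 3, 2      # block of interest in the current frame
cost, dx, dy = full_search(cur, ref, BX, BY, N, search_range=2)

# Motion compensated prediction M(n) for this block, then MCD(n) = F(n) - M(n).
pred = [[ref[BY + dy + y][BX + dx + x] for x in range(N)] for y in range(N)]
mcd = [[cur[BY + y][BX + x] - pred[y][x] for x in range(N)] for y in range(N)]

print(f"motion vector = ({dx}, {dy}), MAD = {cost}")
print(f"MCD block = {mcd}")
```

Here the motion vector points back to where the object was in the reference frame, and the difference block is all zeros, so only the vector needs to be sent for this block.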
Coding, Decoding and Display Order
[Figure: frame sequence numbered 1 4 2 3 8 5 6 7]
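The reordering between display order and coding order can be sketched as follows. The sketch assumes the usual MPEG picture types (I, P, B), which this extract does not spell out, and a simple rule: a B picture is coded after the next I or P picture in display order, because it needs that picture as a reference. The group pattern below is an assumption, chosen because it reproduces the numbering 1 4 2 3 8 5 6 7 that accompanies this slide.

```python
def coding_order(display_types):
    """Return 1-based display indices in coding (transmission) order.
    B pictures are held back until the next I or P picture has been coded."""
    order, pending_b = [], []
    for index, picture_type in enumerate(display_types, start=1):
        if picture_type == "B":
            pending_b.append(index)
        else:
            order.append(index)       # code the anchor (I or P) first
            order.extend(pending_b)   # then the B pictures that precede it
            pending_b = []
    order.extend(pending_b)           # trailing B pictures with no later anchor
    return order


# Assumed display-order pattern (hypothetical).
display = ["I", "B", "B", "P", "B", "B", "B", "P"]
coded = coding_order(display)
print("coding order: ", coded)
print("display order:", sorted(coded))
```

The decoder receives pictures in coding order and restores display order before output, which here amounts to sorting by display index.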
◼ Low
❑ 352 samples/line
❑ 288 lines per frame
❑ 30 frames/sec
❑ 4 Mbits/s
◼ High 1440
❑ 1440 samples/line
❑ 1152 lines per frame
❑ 60 frames/sec
❑ 60 Mbits/s
Applications:
➢Video on LANs
➢Internet video
➢Wireless video
➢Video database
➢Interactive home shopping
➢Video e-mail, home movies
➢Virtual reality games, flight simulation, multi-viewpoint training
MPEG-4: Scene with Audio Visual Objects
representation of a VOL)
◼ VOP: video object plane