Unit-Iii: Audio & Video Coding

The document discusses video coding, highlighting its characteristics, the need for compression, and the principles behind video coding techniques. It explains various types of redundancies in video data, the importance of motion estimation, and the different types of frames used in video coding. Additionally, it covers digital video formats, chrominance sub-sampling, and the chronological development of video coding standards.

Uploaded by

ritnainverma123

UNIT-III:

AUDIO & VIDEO CODING

PART-II
VIDEO CODING

ELE-4410: Multimedia Systems & Networks 1

Video= Motion Picture
◼ Frame by frame => image sequence
◼ An image sequence (or video) is a series of 2-D images that are sequentially
ordered in time. (3-D digital signal)
◼ Video is a sequence of images captured or played back at 25, 30, 60 or 100 frames/sec.


Characteristics of Video
◼ Adjacent frames are similar and changes are
due to object or camera motion


Characteristics of Video: Example
◼ Only the sun has changed position between these 2
frames

[Figure: previous frame and current frame]

Why Is Video Compression Needed?
◼ Uncompressed 1080p high definition (HD) video at 24
frames/ second
❑ Pixels per frame: 1920x1080
❑ Bits per pixel: 8-bits x 3 (RGB)
❑ 1.5 hours: 806 GB
❑ Bit-rate: 1.2 Gbits/s

◼ Blu-ray Disc
❑ Capacity: 25 GB (single layer)
❑ Read rate: 36 Mbits/s
◼ Video Streaming or TV Broadcast
❑ 1 Mbits/s to 20 Mbits/s
Requires 30x to 1200x compression
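
The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers given above; the helper name `compression_ratio` is illustrative, and gigabytes are taken as 10^9 bytes.

```python
# Raw bit rate and storage for uncompressed 1080p video (numbers from the slide).
width, height = 1920, 1080
bits_per_pixel = 8 * 3          # 8 bits for each of R, G and B
fps = 24
duration_s = 1.5 * 3600         # 1.5 hours in seconds

bit_rate = width * height * bits_per_pixel * fps      # bits per second
storage_gb = bit_rate * duration_s / 8 / 1e9          # decimal gigabytes


def compression_ratio(raw_bps, target_bps):
    """How many times the raw stream must shrink to fit the target rate."""
    return raw_bps / target_bps


print(f"bit rate: {bit_rate / 1e9:.2f} Gbit/s")
print(f"storage:  {storage_gb:.0f} GB")
print(f"Blu-ray, 36 Mbit/s: {compression_ratio(bit_rate, 36e6):.0f}x")
print(f"streaming, 1 Mbit/s: {compression_ratio(bit_rate, 1e6):.0f}x")
```

The two ratios bracket the "30x to 1200x" range quoted on the slide: the low end corresponds to the Blu-ray read rate and the high end to 1 Mbit/s streaming.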
Principles of video coding
◼ Compression is achieved by removing redundant and irrelevant
information from the video sequence
◼ Redundancies in videos:
❑ Spatial redundancy
◼ Neighbouring pixels inside a picture are similar.
❑ Statistical redundancy
◼ Unequal distribution of colour intensities.
◼ Some colours are more dominant than others.
❑ Temporal redundancy
◼ Similarity among the frames.

◼ Irrelevant information in videos:
❑ Minute colour and intensity differences that are imperceptible to the human visual system (HVS), i.e. psychovisual redundancy.

Image Vs Video Coding
◼ Image coding: uses Spatial and Statistical
redundancy reductions
◼ Video coding: uses Spatial, Statistical AND
Temporal redundancy reductions

[Figure: image coding pipeline, Image data → DCT → Q → VLC, labelled "Perceptual redundancy reduction" and "Spatial & Statistical redundancy reduction"]

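Inter-frame Video Coding

[Figure: inter-frame encoder block diagram. The prediction is subtracted from the input and the difference passes through DCT → Q → VLC to the output; a feedback loop with IDCT, adder, buffer and motion estimation (ME) forms the prediction. The blocks are labelled with spatial, perceptual, statistical and temporal redundancy reduction.]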

Key ideas in Video Compression
◼ Predict a new frame from a previous frame and only code
the prediction error
◼ Prediction error will be coded using the Transform
methods (DCT or wavelet).
◼ Prediction errors have smaller energy than the original
pixel values and can be coded with fewer bits
◼ Those regions that cannot be predicted well will be coded
directly using Transform coding (DCT).
◼ Divide each frame (predicted/unpredicted) into smaller blocks, called macroblocks (MBs).
◼ Work on each MB (16x16 pixels) independently for reduced complexity
❑ Motion compensation is done at the MB level.
❑ DCT coding of the error is done at the block level (8x8 pixels).

Temporal Redundancies

[Figure: Frame 0, Frame 1 and their scaled frame difference]

Key Ideas in Video Coding
[Figure: block diagrams of transform coding and predictive coding (DPCM)]

Motion Compensated Hybrid Video Coding

[Figure: motion compensated hybrid video encoder, starting from the video input]

Hybrid Video Decoder


Chronological Table of Video Coding Standards

◼ ITU-T
❑ H.261 (1990)
❑ H.263 (1995/96)
❑ H.263+ (1997/98)
❑ H.263++ (2000)
◼ Joint ITU-T and ISO/IEC MPEG standards (middle row of the chart)
❑ MPEG-2 (H.262) (1994/95)
❑ H.264 (MPEG-4 Part 10) (2002)
❑ H.264/SVC, MVC Extension (2006/2009)
❑ H.265/HEVC (Jan 2013)
❑ H.265/SHVC, MV-HEVC, 3D-HEVC (2016)
◼ ISO/IEC MPEG
❑ MPEG-1 (1993)
❑ MPEG-4 v1 (1998/99)
❑ MPEG-4 v2 (1999/00)
❑ MPEG-4 v3 (2001)

[Figure: timeline axis running from 1990 to 2016]

Application Scenario

Application                            Bit Rate                          Video Standard
UHDTV                                  50-100 Mbps                       HEVC
Digital TV Broadcasting                2…6 Mbps (10…20 Mbps for HD)      MPEG-2, H.264/AVC
DVD Video                              6…8 Mbps                          MPEG-2, H.264/AVC
Internet video streaming               20…200 kbps                       H.263, MPEG-4 or H.264
Video conferencing, video telephony    20-320 kbps                       H.261, H.263, H.264/AVC
Video over 3G wireless                 20-200 kbps                       H.263, MPEG-4

Digital Video Formats
◼ Composite video
❑ Convert RGB to YIQ (YUV)
❑ Multiplexing YIQ into a single signal
❑ Used in most consumer analog video devices
◼ Each frame is divided into 16x16 non-overlapping blocks,
consisting of luminance and chrominance (YUV or YCbCr)
components. Each such block is called a macroblock (MB).
◼ The luminance and chrominance components share the
same image boundary in the image.
◼ Two chrominance components (CbCr) will always occur in
pairs.


Pixel Representation
◼ Y,U,V Colour Space
❑ The Human Visual System (HVS) is sensitive to three colour
components.
❑ Colour can be represented by Red, Green and Blue components (RGB).
❑ Transform to YUV or YCbCr to obtain a less correlated representation.

◼ Note:
❑ The two chrominance components (U,V) contain considerably less
information than the luminance component. For this reason,
chrominance is often sub-sampled
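
As an illustration of the RGB to YCbCr transform, the sketch below uses the full-range BT.601 coefficients (the form used by JFIF). That choice is an assumption: the slide does not say which variant of the matrix it uses, and the function name `rgb_to_ycbcr` is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to Y, Cb, Cr (full-range BT.601, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red-difference chroma
    return y, cb, cr


# A grey pixel carries no colour information: both chroma values sit at 128.
print(rgb_to_ycbcr(100, 100, 100))
# A saturated red pixel pushes Cr well above 128.
print(rgb_to_ycbcr(255, 0, 0))
```

For natural images most of the signal energy ends up in Y, which is why the two chrominance planes can be sub-sampled with little visible loss.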


Chrominance Sub-sampling
◼ Human vision is relatively insensitive to
chrominance.
❑ For this reason, chrominance is often sub-sampled.
◼ Chrominance sub-sampling is specified as a
three-element ratio.
◼ Depending upon the number of chrominance
samples for every four luminance samples,
different colour formats such as 4:4:4, 4:2:2, 4:1:1
and 4:2:0 are defined.
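◼ 4:1:1 and 4:2:0 have the same number of samples as each other: in both, each chrominance component has one quarter as many samples as the luminance component. They differ only in where the chrominance samples are taken.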
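
The sample counts behind these formats can be tabulated directly. In this sketch the table `CHROMA_DIVISORS` and the function `samples_per_frame` are illustrative names; each entry gives the horizontal and vertical sub-sampling factor applied to each of the two chrominance planes.

```python
# (horizontal divisor, vertical divisor) applied to each chrominance plane
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),   # no sub-sampling
    "4:2:2": (2, 1),   # half horizontal resolution
    "4:1:1": (4, 1),   # quarter horizontal resolution
    "4:2:0": (2, 2),   # half horizontal and half vertical resolution
}


def samples_per_frame(width, height, fmt):
    """Total luminance + chrominance samples in one frame."""
    dx, dy = CHROMA_DIVISORS[fmt]
    luma = width * height
    chroma = 2 * (width // dx) * (height // dy)   # Cb plane + Cr plane
    return luma + chroma


for fmt in CHROMA_DIVISORS:
    print(fmt, samples_per_frame(352, 288, fmt))   # CIF frame
```

For a CIF frame, 4:1:1 and 4:2:0 both come to 1.5 samples per pixel, half of what 4:4:4 needs.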


◼ Each frame is divided into 16x16 non-overlapping blocks (macroblocks), consisting of luminance and chrominance (YUV or YCbCr) components.
◼ The luminance and chrominance components share the same image boundary.
◼ The two chrominance samples always occur in pairs.

[Figure: a macroblock in 4:2:0 format]

Digital Video Formats
◼ Common Intermediate Format (CIF):
❑ This format was defined by CCITT for H.261 coding standard
(teleconferencing and videophone).
❑ Several size formats:
◼ SQCIF: 128x96 pixels. QCIF: 176x144 pixels.
◼ CIF: 352x288 pixels. 4CIF: 704x576 pixels.
◼ Non-interlaced (progressive), and chrominance sub-sampling using
4:2:0.
◼ Source Input Format (SIF):
❑ Utilized in MPEG as a compromise with Rec. 601.
❑ Two size formats (similar to CIF):
◼ QSIF: 180x120 or 176x144 pixels at 30 or 25 fps
◼ SIF: 360x240 or 352x288 pixels at 30 or 25 fps
❑ SIF is assumed to be derived from a Rec. 601 source.
◼ High Definition Television (HDTV):
❑ 1280x720 pixels.
❑ 1920x1080 pixels.

Different type frames in video coding
◼ I- frames (Intra-coded frames)
❑ No prediction, coded in the same way as an image.
◼ P-frames (Inter-coded or predictive frames)
❑ Uses forward prediction (prediction from a past frame).
❑ Current frame is predicted from a previously coded I- or P-frame.
❑ Does not work well for regions uncovered by object motion.
◼ B-frames (Bi-directional Predicted frames)
❑ Uses forward and backward prediction.
❑ Not used for predicting any other frame.
❑ Handles covered/uncovered regions better.
❑ Predicted from the previous as well as the next I- or P-frame.

[Figure: Group of Frames (GOF)]

Motion Estimation & Compensation

Types of Motion
◼ Translation: simple movement of typically rigid objects
❑ Camera pans vs. movement of objects
◼ Rotation: spinning about an axis
❑ Camera versus object rotation
◼ Zooms in/out
❑ Camera zoom vs. object zoom (movement in/out)

[Figure: example frames n, n+1 (rotation) and n+2 (zoom), and a frame pair n, n+1]

Describing Motion
◼ Translational
❑ Move (object) from (x,y) to (x+dx,y+dy)
◼ Rotational
❑ Rotate (object) by (r rads) (counter/clockwise)
◼ Zoom
❑ Move (in/out) from (object) to increase its size by
(t times)

Which is easiest? Which are we most likely to encounter?

Motion Estimation
◼ Determining parameters for the motion
descriptions
◼ For some portion of the frame, estimate its
movement between 2 frames- the current
frame and the reference frame
◼ What is some portion?
❑ Individual pixels (all of them)?
❑ Lines/edges (have to find them first)
❑ Objects (must define them)
❑ Uniform regions (just chop up the frame)


General Idea
◼ For a region PC in the current frame, find a
region PR in the search window in reference
frame so that Error(PR,PC) is minimized
◼ Issues: Error measures, search techniques,
choice of search window, choice of reference
frame, choice of region PC

[Figure: current frame with the portion of interest PC, and reference frame with the search window]

Block-based Motion Estimation
◼ PC is a block of pixels (in the current frame)
◼ The search window is a rectangular segment
(in the reference frame)

[Figure: reference frame (T=1) and current frame (T=2)]

Motion Vectors
◼ A motion vector (MV) describes the offset between
the location of the block being coded (in the current
frame) and the location of the best-match block in
the reference frame

[Figure: reference frame (T=1) and current frame (T=2), with the motion vector of the block being coded]

Motion Compensation

◼ The blocks being predicted are on a grid
◼ The blocks used for prediction are NOT

[Figure: blocks 1 to 16 laid out on a regular 4x4 grid in the frame being predicted, and the same 16 blocks at displaced, off-grid positions in the reference frame]

Motion Vector Search

◼ 1. Mean squared error
❑ Select a block in the reference frame to minimize Σ(b(Bref)-b(Bcurr))²
◼ 2. Mean abs. error
❑ Select block to minimize Σ|b(Bref)-b(Bcurr)|
◼ Given error measure, how to efficiently determine best-match block in search window?
❑ Full search: best results, most computation
❑ Logarithmic search – heuristic, faster
❑ Hierarchical motion estimation
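
A minimal sketch of full-search block matching with the absolute-error criterion follows. The function names, the toy frames and the convention that frames are lists of rows are all assumptions made for illustration; a real encoder would also handle frame borders and sub-pixel positions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n, search_range):
    """Evaluate every displacement within +/- search_range that keeps the
    candidate block inside ref; return (best motion vector, best error)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_err = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            if 0 <= x <= w - n and 0 <= y <= h - n:
                err = sad(cur, ref, bx, by, dx, dy, n)
                if err < best_err:
                    best_mv, best_err = (dx, dy), err
    return best_mv, best_err


def make_frame(x0, y0, size=12):
    """Toy frame: a bright 4x4 square at (x0, y0) on a dark background."""
    return [[200 if x0 <= x < x0 + 4 and y0 <= y < y0 + 4 else 10
             for x in range(size)] for y in range(size)]


ref = make_frame(3, 3)      # reference frame
cur = make_frame(5, 4)      # square has moved 2 right and 1 down
mv, err = full_search(cur, ref, bx=5, by=4, n=4, search_range=3)
print(mv, err)              # offset from the current block to its match in ref
```

The cost grows with the square of the search range, which is the motivation for the faster heuristic searches listed above.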


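Block-Matching Algorithm: Matching Criterion
◼ For an N x N block B in the current frame F_c, a reference frame F_r and a candidate displacement (dx, dy), MSE:

\[ \mathrm{MSE}(dx, dy) = \frac{1}{N^2} \sum_{(x,y) \in B} \bigl[ F_c(x, y) - F_r(x + dx, y + dy) \bigr]^2 \]

◼ MAD:

\[ \mathrm{MAD}(dx, dy) = \frac{1}{N^2} \sum_{(x,y) \in B} \bigl| F_c(x, y) - F_r(x + dx, y + dy) \bigr| \]

◼ The very idea of using a block of pixels and assuming a common displacement for them in the matching process corresponds to a local smoothness (coherence) constraint on the displacement vector field.
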
Motion Vector Search
◼ Full search: Evaluate every position in the search window
◼ Logarithmic search:
❑ First examine positions marked 1.
❑ Choose best of these (lowest error measure) and examine positions marked 2 surrounding it
❑ Choose the best of these, and examine the positions marked 3
❑ Final result = best of these
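
The three rounds of a logarithmic search can be sketched as below. This is one common variant (a 3x3 ring of candidates whose spacing halves each round, starting from an assumed step of 4); the slide's figure may use a different candidate pattern, and the function names are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences for the n x n block at (bx, by) in cur
    against the block displaced by (dx, dy) in ref."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def log_search(cur, ref, bx, by, n, step=4):
    """Coarse-to-fine search: test 9 positions around the current best,
    move to the best of them, halve the spacing, repeat until step < 1."""
    h, w = len(ref), len(ref[0])

    def cost(mv):
        x, y = bx + mv[0], by + mv[1]
        inside = 0 <= x <= w - n and 0 <= y <= h - n
        return sad(cur, ref, bx, by, mv[0], mv[1], n) if inside else float("inf")

    best = (0, 0)
    while step >= 1:
        cx, cy = best
        ring = [(cx + sx * step, cy + sy * step)
                for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        best = min(ring, key=cost)
        step //= 2
    return best, cost(best)


# Smooth toy frames: cur is ref shifted 4 pixels to the right.
ref = [[(x - 10) ** 2 + (y - 10) ** 2 for x in range(24)] for y in range(24)]
cur = [[ref[y][max(x - 4, 0)] for x in range(24)] for y in range(24)]
mv, err = log_search(cur, ref, bx=8, by=8, n=4)
print(mv, err)
```

With a starting step of 4 this evaluates at most 27 positions, against 225 for a full search over the same +/- 7 range. The displacement in this toy case lies on the coarse grid, so the search finds it exactly; on less well-behaved error surfaces the heuristic can settle on a local minimum.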


Hierarchical Motion Estimation
◼ Use an averaging filter on the image, then
downsample by a factor of 2
◼ Conduct a search on the downsampled
image (only ¼ of the size)
◼ Given the results of the search on the
downsampled image, return to the full
resolution image and refine the search there


Motion Compensation
◼ The standards do not specify HOW the
encoder will find the motion vectors (MVs)
◼ The encoder can use exhaustive or fast search, with MSE, MAE or another error metric, etc.
◼ The standard DOES specify
❑ The allowable syntax for specifying the MVs
❑ What the decoder will do with them
◼ What the decoder does is to grab the
indicated block from reference frame, and
glue it in place
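
The decoder's "grab and glue" step can be sketched as follows. The function name `motion_compensate` and the data layout (frames as lists of rows, one motion vector per block in a 2-D list) are assumptions; the sketch also assumes frame dimensions that are multiples of the block size and vectors that stay inside the reference frame.

```python
def motion_compensate(ref, mvs, n):
    """Build the motion compensated frame: for every n x n block on the grid,
    copy the block that its motion vector (dx, dy) points to in ref."""
    h, w = len(ref), len(ref[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, n):
        for bx in range(0, w, n):
            dx, dy = mvs[by // n][bx // n]
            for j in range(n):
                for i in range(n):
                    out[by + j][bx + i] = ref[by + dy + j][bx + dx + i]
    return out


# 4x4 reference frame whose pixel value encodes its position (10*row + col).
ref = [[10 * y + x for x in range(4)] for y in range(4)]
# One vector per 2x2 block: top-right block fetches from 1 pixel to the left,
# bottom-left block fetches from 1 pixel above.
mvs = [[(0, 0), (-1, 0)],
       [(0, -1), (0, 0)]]
for row in motion_compensate(ref, mvs, n=2):
    print(row)
```

The output blocks sit on the regular grid, while the source blocks they were copied from need not.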


Motion Compensation Example

[Figure: Frame n-1, Frame n and the motion compensated Frame n. Motion vectors of the blocks:]
(0,0) (-16,0) (5,0) (0,0)
(0,0) (16,7) (5,2) (0,0)
(20,-24) (0,0) (-20,-18) (0,0)

Objects versus Macroblocks
◼ Real moving objects will not coincide with
boundaries of macroblocks

[Figure: a macroblock that straddles a moving object and the background. With a motion vector, the moving object is well encoded and the background shows prediction error; with no motion vector, the background is well encoded and the moving object shows prediction error.]

◼ If encoder sends MV=(MVX,MVY), object well coded, but background poorly coded
◼ If encoder sends MV=(0,0), background well coded, but moving object poorly coded
◼ Either approach is valid

Motion Compensation
◼ This glued together frame is called
the motion compensated frame
◼ The encoder can also form the difference between
the motion compensated frame and the actual
frame.
◼ This is called the motion compensated difference
frame
◼ This difference frame formed using MC should have
less correlation between pixels than the difference
frame formed without using MC


Motion Compensated Difference
Frames
◼ Suppose we are doing lossless coding
◼ Encoder has sequence of frames: …, F(n-2), F(n-1)
◼ Next: encode F(n)
◼ Past frames have been losslessly encoded, so the
decoder knows F(n-1) perfectly already
◼ Encoder sends the motion vectors for frame F(n)
relative to frame F(n-1), to form motion
compensated frame M(n)
❑ Encoder knows M(n), Decoder knows M(n)


Motion Compensated Prediction: An Example

Encoding Difference Frames

With motion compensation:
◼ Encoder forms motion compensated diff frame: MCD(n) = F(n) – M(n)
◼ Encoder losslessly encodes MCD(n)
◼ Decoder can then do F(n) = MCD(n) + M(n) → knows F(n) exactly

With no motion compensation:
◼ Encoder could do frame diff: FD(n) = F(n) – F(n-1)
◼ Encoder losslessly encodes FD(n)
◼ Decoder can then do F(n) = FD(n) + F(n-1) → knows F(n) exactly

If successive frames are very similar:
◼ fewer bits to send Motion Vectors + MCD(n) instead of FD(n)
◼ fewer bits to send FD(n) instead of F(n)
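
The two difference frames and the decoder-side reconstruction can be tried on a tiny example. Everything here is illustrative: the helper names, the 2x4 toy frames, and in particular the motion compensated frame M(n), which is simply assumed rather than produced by a motion search.

```python
def subtract(a, b):
    """Pixel-wise a - b for two frames stored as lists of rows."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Pixel-wise a + b."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def energy(frame):
    """Sum of squared sample values, a rough proxy for coding cost."""
    return sum(p * p for row in frame for p in row)


prev = [[10, 10, 50, 50], [10, 10, 50, 50]]   # F(n-1)
cur = [[10, 10, 10, 50], [10, 10, 10, 50]]    # F(n): the edge moved 1 pixel right
mc = [[10, 10, 10, 50], [10, 10, 12, 50]]     # M(n): assumed, slightly imperfect

fd = subtract(cur, prev)     # FD(n)  = F(n) - F(n-1)
mcd = subtract(cur, mc)      # MCD(n) = F(n) - M(n)

print(energy(fd), energy(mcd))
print(add(fd, prev) == cur, add(mcd, mc) == cur)   # both reconstruct F(n) exactly
```

Both routes give back F(n) exactly, but the motion compensated difference has far less energy, which is what makes it cheaper to code.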


Motion compensated difference frames

◼ Decoder knows F(n-1) and, once you send the motion vectors, it knows M(n)

[Figure: reference frame F(n-1), original frame F(n) and difference image FD(n) = F(n) - F(n-1) (send FD(n)); motion compensated frame M(n) (send motion vectors) and motion compensated difference image MCD(n) = F(n) – M(n) (send MCD(n))]

Motion Compensated Difference
Frames
◼ But we are NOT doing lossless coding
◼ Encoder has sequence of frames: …, F(n-2), F(n-1)
◼ Next: encode F(n)
◼ Past frames have been lossily encoded, so the
decoder has versions …, G(n-2), G(n-1)
◼ Encoder knows …, G(n-2), G(n-1) also
◼ Encoder sends the motion vectors for frame F(n)
relative to frame G(n-1), to form motion
compensated frame M(n)
Video Coding Standards (ITU-T and MPEG standards)

Major Applications of Video Compression


Video Coding
Standardization Organizations
◼ Two organizations dominate video compression
standardization:
❑ ITU-T Video Coding Experts Group (VCEG)
International Telecommunication Union – Telecommunication Standardization Sector (ITU-T, a United Nations organization, formerly CCITT), Study Group 16, Question 6
❑ ISO/IEC Moving Picture Experts Group (MPEG)
International Organization for Standardization and International Electrotechnical Commission, Joint Technical Committee Number 1, Subcommittee 29, Working Group 11

Dynamics of the Video
Standardization Process
◼ VCEG is older and more focused on conventional (esp. low-
delay) video coding goals (e.g. good compression and
packet-loss/error resilience)
◼ MPEG is larger and takes on more ambitious goals (e.g.
“object oriented video”, “synthetic-natural hybrid coding”, and
digital cinema)
◼ Sometimes the major organizations team up (e.g. ISO, IEC
and ITU teamed up for both MPEG-2 and JPEG)
◼ Relatively little industry consortium activity (DV and
organizations that tweak the video coding standards in minor
ways, such as DVD, 3GPP, 3GPP2, SMPTE, IETF, etc.)
◼ Growing activity for internet streaming media outside of
formal standardization (e.g., Microsoft, RealNetworks,
QuickTime)

The Scope of Video Coding Standardization

◼ Only restrictions on the Bitstream, Syntax, and Decoder are standardized:
❑ Permits the optimization of encoding
❑ Permits complexity reduction for implementability
❑ Provides no guarantees on quality

Standard specifies bit stream
◼ The video compression standards define syntax and semantics for the bit stream between encoder and decoder
◼ Encoder is not specified by MPEG except that it produces a compliant bit stream
◼ Compliant decoder must interpret all legal MPEG bit streams
◼ This allows future encoders of better performance to remain compatible with existing decoders.
◼ Also allows for commercially secret encoders to be compatible with standard decoders

[Figure: ENCODER → bit stream → DECODER. The standard defines the bit stream ("this"), not the encoder or the decoder ("not this"). Today's ho-hum encoder, tomorrow's nifty encoder and a very secret encoder all feed today's decoder, which still works.]

Target Applications
◼ Standards
❑ MPEG-1: Video CD
❑ MPEG-2: Digital TV
❑ MPEG-4: Multimedia
❑ H.261: ISDN videophone
❑ H.263: PSTN videophone
❑ H.264 / MPEG-4 part 10: Universal video


Motion Compensated Hybrid Coding
H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/JVT


Requirements of a Successful Video
Coding Standard
◼ Interoperability: should assure that encoders and decoders
from different manufacturers work together seamlessly.
◼ Innovation: should perform significantly better than previous
standard.
◼ Competition: should be flexible enough to allow competition
between manufacturers based on technical merit. Only
standardize bit-stream syntax and reference decoder.
◼ Independence from transmission and storage media:
should be flexible enough to be used for a range of
applications.
◼ Forward compatibility: should decode bit-streams from prior
standard
◼ Backward compatibility: prior generation decoders should
be able to partially decode new bit-streams
ITU-T H.261 Video Coding Standards
◼ International standard for ISDN picture phones and for video
conferencing/video phone systems with low delay (for real-time,
interactive applications) and with slow motion (1990).
◼ Image format: CIF (352 x 288 Y samples, above 128 kbps) or QCIF
(176 x 144 Y samples, 64-128 kbps), frame rate: 7.5, 10, 15 and 30 fps
◼ Bit-rate: multiple of 64 kbps (= ISDN-channel), px64 kbps,
p=1,…,30, typically 128 kbps including audio.
◼ Picture quality: for 128 kbps acceptable with limited motion in the
scene.
◼ Stand-alone videoconferencing system or desk-top video
conferencing system, integrated with PC.
◼ Macroblock Structure:
❑ Macroblock (MB) of 16x16 pixels
❑ Sampling format: 4:2:0 color format
❑ Progressive scanning
❑ MB consists of 4 luminance and
2 chrominance blocks


H.261 Video Coding Standards
◼ Motion Compensated prediction
❑ Each MB can be coded in intra- and inter-mode.
❑ Integer-pel accuracy for inter mode.
❑ One displacement vector per macroblock
❑ Maximum displacement vector range +/- 15 pixels horizontally and vertically.
❑ Methods for generating the MVs are not specified in the standard
◼ (Standards only define the bitstream syntax and the decoder operation.)
❑ Differential encoding of motion vectors (DMV).
❑ Encoder and decoder use the decoded MVs to perform motion
compensation
❑ Adaptive loop filter, separable in 1-D horizontal and vertical is used to
suppress propagation of coding noise temporally.
◼ impulse response of separable filter : [¼, ½, ¼]
◼ Loop filter can be turned on or off.
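
A separable filter with taps [¼, ½, ¼] can be applied as a horizontal pass followed by a vertical pass. The sketch below works in integers with taps [1, 2, 1] and divides by 16 at the end. Two details are assumptions not stated on the slide: samples on the block edge are passed through unfiltered, and the final division rounds halves upward.

```python
def filter_1d(v):
    """Apply taps [1, 2, 1] (that is, [1/4, 1/2, 1/4] scaled by 4) along a list.
    The two end samples use taps [0, 4, 0], so they pass through unchanged."""
    out = [4 * v[0]]
    out += [v[k - 1] + 2 * v[k] + v[k + 1] for k in range(1, len(v) - 1)]
    out.append(4 * v[-1])
    return out


def loop_filter(block):
    """Separable 2-D smoothing of one block (list of rows of integers)."""
    rows = [filter_1d(r) for r in block]                  # horizontal pass
    cols = [filter_1d(list(c)) for c in zip(*rows)]       # vertical pass
    # Undo the two factors of 4 with round-half-up, then transpose back to rows.
    return [[(p + 8) >> 4 for p in r] for r in zip(*cols)]


impulse = [[0] * 8 for _ in range(8)]
impulse[3][3] = 16
for row in loop_filter(impulse)[2:5]:
    print(row[2:5])     # the 3x3 neighbourhood shows the 2-D kernel
```

Filtering a single impulse exposes the 2-D kernel, which is the outer product of the 1-D taps; a flat block passes through unchanged.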


H.261 Encoder


H.261 Standards
◼ Residual Coding:
❑ 8x8 DCT
❑ Quantization
◼ Uniform quantizer (∆=8) for intra-mode DC coefficients
◼ Uniform threshold quantizer with dead-zone (∆=2,4,…,62 (MQUANT)) for
AC coefficients in intra-mode and all coefficients in inter-mode.
❑ Zig-zag scan of DCT coefficients.
❑ Run-level coding for entropy coding : (zero-run, value) symbols.
◼ zero-run: the number of coefficients quantized to zero since the last nonzero
coefficient.
◼ value: the amplitude of the current nonzero coefficient.
◼ Variable Length Coding (VLC)
❑ DCT coefficients are converted into runlength representations and then
coded using VLC (Huffman coding for each pair of symbols).
❑ Other information is also coded using VLC (Huffman coding).
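
The zig-zag scan and the (zero-run, value) symbols described above can be sketched as follows. The function names are illustrative, the input is assumed to be an already quantized block, and the sketch treats every coefficient the same way, whereas a real H.261 coder handles the intra DC coefficient separately and then maps each symbol to a VLC codeword.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def run_level(block):
    """Convert a quantized block into (zero-run, value) symbols plus 'EOB'.
    zero-run counts the zeros since the last nonzero coefficient."""
    symbols, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    symbols.append("EOB")        # trailing zeros are never sent
    return symbols


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 12, -3, 5
print(zigzag_order()[:6])
print(run_level(block))
```

Because quantization zeroes most high-frequency coefficients, a 64-coefficient block typically collapses to a handful of symbols.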


Parameter Selection and Rate control
in H.261
◼ MTYPE (intra vs. inter, zero vs. non-zero MV in inter)
◼ CBP (which blocks in a MB have non-zero DCT
coefficients)
◼ MQUANT (allow the changes of the quantizer step size at
the MB level)
❑ should be varied to satisfy the rate constraint.
◼ MV (ideally should be determined not only by prediction
error but also the total bits used for coding MV and DCT
coefficients of prediction error)
◼ Loop Filter on/off


H.261 Data Stream
◼ Picture Layer
❑ Picture Start Code (PSC) - 20 bit pattern
❑ Temporal Reference (TR) - 5 bit input frame number
❑ Type Information (PTYPE) - CIF or QCIF selection
❑ Spare bits to be defined in later versions
◼ GOB Layer
❑ Group of Blocks Start Code (GBSC) - 16 bit pattern
❑ Group Number (GN) - 4 bit GOB address
❑ Quantizer information (GQUANT) – Initial quantizer step size normalized to the
range 1 to 31.
◼ At the start QUANT=GQUANT
❑ Spare bits to be defined in later versions
◼ Macroblock (MB) layer
❑ Macroblock address (MBA)
◼ Location of this MB relative to the previously encoded MB inside the GOB.
❑ Type information (MTYPE) - 10 types in total
❑ Quantizer (MQUANT): normalized quantizer step size to be used until the next
MQUANT or GQUANT. (Range 1 to 31)
H.261 Data Stream
◼ Macroblock (MB) layer
❑ Motion Vector Data (MVD)
◼ differential displacement vector
❑ Coded Block Pattern (CBP)
◼ Indicates which blocks in the MB are coded.
◼ Blocks not coded contain zero coefficients.
◼ Block Layer
❑ Lowest layer is the block layer, consisting of
◼ quantized transform coefficients (TCOEFF),
◼ End of block (EOB) symbol
❑ All coded blocks have the EOB symbol.
◼ Types of Coded MB:
❑ Intra - Original Pels are transform Coded
❑ Inter - Frame difference pels (zero-motion vectors) are coded.
◼ Skipped MBs are considered inter by default.
❑ Inter_MC - displaced (nonzero-motion vectors)
❑ Inter_MC with filter - displaced blocks are filtered by loop filter.
◼ Used for very low bit rates.

H.261 Macroblock Type (VLC table)


H.263 Video Coding Standard
◼ International standard for picture phones over analog
subscriber lines (1995)
◼ H.263 is the video coding standard in H.323/H.324,
targeted for visual telephone over PSTN or Internet.
◼ Developed after H.261, can accommodate
computationally more intensive options
❑ Initial version (H.263 baseline): 1995
❑ H.263+: 1997
❑ H.263++: 2000
◼ Image format: usually CIF, QCIF or Sub-QCIF.
◼ frame rate: usually below 10 fps.

H.263 Video Coding Standard
◼ Goal: Improved quality at lower rates
❑ Bit rate: arbitrary, typically 20 kbps for PSTN
❑ Picture Quality: with new options as good as H.261 (at only half
rate).
◼ Result: Significantly better quality at lower rates
❑ Better video at 18-24 Kbps than H.261 at 64 Kbps
❑ Enable video phone over regular phone lines (28.8 Kbps) or
wireless modem
◼ Software-only PC video phone
or TV set-top box.
◼ Widely used as compression engine
for Internet video streaming.
◼ H.263 is also the compression core
of the MPEG-4 standard.


H.261 Vs H.263
◼ Improved motion estimation and compensation
❑ H.261 (1990): integer-pel accuracy, loop filter, 1 motion vector
per MB
❑ H.263 (1995): half-pel accuracy, no loop filter, 1 motion vector per
MB.
❑ half-pel accuracy motion estimation with bilinear interpolation
filter.
❑ Larger motion search range [-31.5,31], and unrestricted MV at
boundary blocks.
❑ More efficient predictive coding for MVs (median prediction using
three neighbors).
❑ overlapping block motion compensation (option).
❑ variable block size: 16x16 -> 8x8, 4 MVs per MB (option).
❑ use bidirectional temporal prediction (PB picture) (option).
◼ Improved 3-D VLC for DCT coefficients (last, run, level).
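
Median prediction of motion vectors takes the component-wise median of three neighbouring vectors and codes only the difference from it. In this sketch the three candidates are taken to be the left, above and above-right neighbours, which is an assumption (the slide only says "three neighbors"), and the function names are illustrative.

```python
def median_mv(left, above, above_right):
    """Component-wise median of three candidate motion vectors (x, y)."""
    candidates = (left, above, above_right)
    xs = sorted(mv[0] for mv in candidates)
    ys = sorted(mv[1] for mv in candidates)
    return xs[1], ys[1]


def mv_difference(mv, left, above, above_right):
    """The value that would be entropy coded: actual MV minus its prediction."""
    px, py = median_mv(left, above, above_right)
    return mv[0] - px, mv[1] - py


neighbours = ((2, 0), (3, -1), (10, 4))    # (10, 4) is an outlier
print(median_mv(*neighbours))
print(mv_difference((3, 1), *neighbours))
```

The median ignores the outlying neighbour, so the coded difference stays small where a mean or a single-neighbour predictor would be pulled off target.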


H.261 Vs H.263
◼ Reduced overhead.
◼ Support more picture formats.
◼ Optional features defined in annexes
❑ Unrestricted motion vectors (Annex D)
❑ Syntax-based arithmetic coding (SAC) (Annex E)
◼ 4% savings in bit rate for P-mode, 10% saving for I-mode, at 50%
more computations.
❑ Advanced prediction mode(APM) (Annex F)
◼ Overlapped block motion compensation (OBMC),
◼ Switch between 1 or 4 motion vectors per MB
❑ PB pictures (Annex G).
◼ Additional optional features in H.263++. (H.263 as of 2001).
◼ The options, when chosen properly, can improve the PSNR
0.5-1.5 dB over default at 20-70 kbps range.


Performance of H.263 and H.261


Overlapped Block Motion
Compensation (OBMC) in H.263
◼ Conventional block motion compensation
❑ One best matching block is found from a reference frame
❑ The current block is replaced by the best matching block
◼ OBMC
❑ Each pixel in the current block is predicted by a weighted average of
several corresponding pixels in the reference frame
❑ The corresponding pixels are determined by the MVs of the current as
well as adjacent MBs
❑ The weight for each corresponding pixel depends on the expected
accuracy of the associated MV

Overlapped Block Motion Compensation (OBMC) in H.263
◼ Idea: superimpose several prediction signals, also using the motion vectors from neighboring blocks.

OBMC weights in H.263


Performance of H.263 with OBMC


H.263 : PB Pictures (Annex-G)

◼ PB-picture mode codes two pictures as a group. The second picture (P) is coded first, then the first picture (B) is coded using both the P-picture and the previously coded picture. This avoids the reordering of pictures required in the normal B-mode, but it still incurs more coding delay than using P-frames only.
◼ In a B-block, forward prediction (predicted from the
previous frame) can be used for all pixels; backward
prediction (from the future frame) is only used for those
pels that the backward motion vector aligns with pels of the
current MB. Pixels in the “white area” use only forward
prediction.
◼ An improved PB-frame mode was defined in H.263+, that
removes the previous restriction.
H.263 : PB Pictures


Performance of H.263 with PB Mode


Video Coding Standards
(MPEG-1/2 and MPEG-4)


Contents
◼ ISO Standards
❑ MPEG-1
❑ MPEG-2
❑ MPEG-4
❑ MPEG-7 (overview only)


ISO MPEG
◼ MPEG-1 Standard (1991) (ISO/IEC 11172)
❑ Target bit-rate about 1.5 Mbps.
❑ Typical image format CIF, no interlace.
❑ Frame rate 24 ... 30 fps.
❑ Main application: video storage for multimedia (e.g., on CD-ROM).
◼ MPEG-2 Standard (1994) (ISO/IEC 13818)
❑ Extension for interlace, optimized for TV resolution (NTSC: 704 x
480 Pixel).
❑ Image quality similar to NTSC, PAL, SECAM at 4 -8 Mbps
❑ HDTV at 20 Mbps.
◼ MPEG-4 Standard (1999) (ISO/IEC 14496)
❑ Object based coding.
❑ Wide-range of applications, with choices of interactivity, scalability,
error resilience, etc.


MPEG-1 Overview
◼ Audio/video on CD-ROM (1.5 Mbps, SIF: 352x240).
❑ Maximum: 1.856 Mbps, 768x576 pels.
◼ Start late 1988, test in 10/89, Committee Draft 9/90
◼ ISO/IEC 11172-1~5 (Systems, video, audio, compliance,
software).
◼ Prompted explosion of digital video applications: MPEG1
video CD and downloadable video over Internet.
◼ Software only decoding, made possible by the introduction
of Pentium chips, key to the success in the commercial
market.
◼ MPEG-1 Audio
❑ Offers 3 coding options (3 layers), higher layer have higher coding
efficiency with more computations
❑ MP3 = MPEG1 layer 3 audio.
MPEG-1 Vs H.261
◼ Developed at about the same time.
◼ Must enable random access (Fast forward/rewind).
❑ Using GOP structure with periodic I-picture and P-picture.
◼ Not for interactive applications.
❑ Do not have as stringent delay requirement.
◼ Fixed rate (1.5 Mbps), good quality (VHS equivalent).
❑ SIF video format (similar to CIF).
◼ CIF: 352x288, SIF: 352x240.
❑ Using more advanced motion compensation.
◼ Half-pel accuracy motion estimation, range up to +/- 64.
❑ Using bi-directional temporal prediction.
◼ Important for handling uncovered regions.
❑ Using perceptual-based quantization matrix for I-blocks (same as
JPEG).
◼ DC coefficients coded predictively.
MPEG-1/2 GOP Structure
◼ "Group of Pictures" = “GOP“, GOP structure is very flexible

[Figure: example GOP with pictures numbered 1 4 2 3 8 5 6 7]

Coding, Decoding and Display Order
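
Because a B-picture needs the following I- or P-picture as a reference, pictures are coded in a different order from the one in which they are displayed. The sketch below derives the coding order from the display order; the function name and the display pattern I B B P B B B P are assumptions, chosen because that pattern reproduces the sequence 1 4 2 3 8 5 6 7.

```python
def coding_order(display_types):
    """display_types: picture types in display order, e.g. "IBBPBBBP".
    Returns the 1-based display numbers in the order the pictures are coded."""
    order, pending_b = [], []
    for number, kind in enumerate(display_types, start=1):
        if kind == "B":
            pending_b.append(number)     # must wait for the next I or P picture
        else:
            order.append(number)         # the I or P picture is coded first
            order.extend(pending_b)      # then the B-pictures that depend on it
            pending_b = []
    # Trailing B-pictures would really wait for the next GOP; they are simply
    # appended here to keep the sketch short.
    return order + pending_b


print(coding_order("IBBPBBBP"))
```

The decoder reverses the process: it holds each decoded I- or P-picture until the B-pictures that precede it in display order have been output.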


MPEG-1 Encoder


MPEG-1: Coding of I-frames
◼ I-pictures: intra-frame coded
◼ 8x8 DCT
◼ Arbitrary weighting matrix for coefficients
◼ Differential coding of DC-coefficients
◼ Uniform quantization
◼ Zig-zag-scan, run-level-coding
◼ Entropy coding
◼ Unfortunately, not quite JPEG


MPEG-1: Coding of P-pictures
◼ Motion-compensated prediction from an
encoded I-picture or P-picture (DPCM)
◼ Half-pixel accuracy of motion compensation,
bilinear interpolation
◼ One displacement vector per macroblock
◼ Differential coding of displacement vectors
◼ Coding of prediction error with 8x8-DCT, uniform
threshold quantization, zig-zag-scan as in I-pictures
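
Half-pixel motion compensation needs reference samples that lie between the stored pixels, obtained by bilinear interpolation. The sketch below addresses the reference frame in half-pixel units; the function name is illustrative, and rounding halves upward is an assumption about the rounding rule. Coordinates are assumed non-negative and inside the frame.

```python
def half_pel_sample(ref, x2, y2):
    """Sample ref at (x2 / 2, y2 / 2), with x2 and y2 given in half-pixel units.
    Integer positions return the stored pixel; half positions average the 2 or 4
    surrounding pixels, rounding halves upward."""
    x, y = x2 // 2, y2 // 2          # integer pixel to the top-left
    fx, fy = x2 % 2, y2 % 2          # 1 if the position is at a half pixel
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    # One formula covers all four cases because repeated pixels are counted twice.
    return (a + b + c + d + 2) // 4


ref = [[10, 20],
       [30, 40]]
print(half_pel_sample(ref, 0, 0))    # on a pixel
print(half_pel_sample(ref, 1, 0))    # halfway between 10 and 20
print(half_pel_sample(ref, 0, 1))    # halfway between 10 and 30
print(half_pel_sample(ref, 1, 1))    # centre of all four pixels
```

A motion vector of (1.5, -0.5) pixels would be stored as (3, -1) in these units, which is how half-pel accuracy is obtained without any change to the reference frame.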


MPEG-1: Coding of B-pictures
◼ Motion-compensated prediction from two consecutive P-
or I-pictures.
◼ either
❑ only forward prediction (1 vector/macroblock).
◼ or
❑ only backward prediction (1 vector/macroblock).
◼ or
❑ Average of forward and backward prediction = interpolation (2
vectors/macroblock).
◼ Half-pel accuracy of motion compensation, bilinear
interpolation.
◼ Coding of prediction error with 8x8-DCT, uniform
quantization, zig-zag scan as in I-pictures.
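The three B-picture prediction modes can be compared with a short sketch. The standard defines the modes, not how an encoder picks one; choosing the mode with the smallest sum of absolute differences (SAD) is an assumption made here for illustration, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def choose_b_mode(current, forward, backward):
    """Pick forward, backward or interpolated prediction for one macroblock.

    `forward` and `backward` are the motion-compensated blocks taken from
    the past and the future reference picture respectively.
    """
    interpolated = [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
                    for row_f, row_b in zip(forward, backward)]
    candidates = {
        "forward": forward,            # 1 vector per macroblock
        "backward": backward,          # 1 vector per macroblock
        "interpolated": interpolated,  # 2 vectors per macroblock
    }
    mode = min(candidates, key=lambda name: sad(current, candidates[name]))
    return mode, candidates[mode]


mode, prediction = choose_b_mode([[15, 15]], [[10, 10]], [[20, 20]])
print(mode, prediction)  # interpolated [[15, 15]]
```

Backward prediction is the mode that helps with newly uncovered regions, since those pels exist only in the future reference.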


MPEG-2 Overview
◼ Audio/Video broadcast (TV, HDTV, Terrestrial, Cable,
Satellite, High Speed Inter/Intranet) as well as DVD video.
◼ 4-8 Mbps for TV quality, 10-15 Mbps for better quality at SDTV
resolutions (BT.601).
◼ 18-45 Mbps for HDTV applications.
❑ MPEG-2 video high profile at high level is the video coding
standard used in HDTV.
◼ Test in 11/91, Committee Draft 11/93.
◼ ISO/IEC 13818-1 to 13818-6 (Systems, video, audio, compliance,
software, DSM-CC).
◼ Consists of various profiles and levels.
◼ Backward compatible with MPEG-1.
◼ MPEG-2 Audio
❑ Supports 5.1 channels
❑ MPEG-2 AAC: requires 30% fewer bits than MPEG-1 Layer 3.


MPEG-2 Vs. MPEG-1 Video
◼ MPEG1 only handles progressive sequences (SIF).
◼ MPEG2 is targeted primarily at interlaced sequences
and at higher resolution (BT.601 = 4CIF).
◼ More sophisticated motion estimation methods
(frame/field prediction mode) are developed to improve
estimation accuracy for interlaced sequences (Motion
compensation with blocks of size 16x8 pels).
◼ Different DCT modes and scanning methods are
developed for interlaced sequences.
◼ MPEG2 has various scalability modes.
◼ MPEG2 has various profiles and levels, each
combination targeting a different application.
◼ Coding efficiency is improved by different quantization and VLC
tables.
MPEG-1 Bandwidth Requirements
Example: A digitized video is to be compressed using the MPEG-1 standard,
assuming a frame sequence of I B B P B B P B B P B B I ... and average
compression ratios of 10:1 (I), 20:1 (P) and 50:1 (B). Derive the average bit
rate generated by the encoder for both the NTSC and PAL digitization
formats. Note: the frame size is 352 x 240 (NTSC) or 352 x 288 (PAL),
and each sample is represented by 8 bits.

Solution: Frame sequence = I B B P B B P B B P B B I ...

Hence 1/12 of the frames are I-frames, 3/12 are P-frames, and 8/12 are
B-frames.
Average compressed fraction = (1 x 0.1 + 3 x 0.05 + 8 x 0.02)/12 = 0.0342,
i.e. an average compression ratio of 29.24:1

NTSC frame size:
Without compression = (352 x 240 x 8) + 2 (176 x 120 x 8) = 1.01376 Mbits/frame
(one luminance plane plus two chrominance planes sub-sampled by 2 in both directions)
With compression = 1.01376 Mbits/frame x 1/29.24 = 34.67 kbits/frame
Hence bit rate generated at 30 fps = 1.040 Mbps

PAL frame size:
Without compression = (352 x 288 x 8) + 2 (176 x 144 x 8) = 1.216512 Mbits/frame
With compression = 1.216512 Mbits/frame x 1/29.24 = 41.604 kbits/frame
Hence bit rate generated at 25 fps = 1.040 Mbps
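The arithmetic of this example can be checked with a short Python sketch. The function name `mpeg1_average_bit_rate` is hypothetical; the GOP pattern, compression ratios and 4:2:0 chroma layout are the ones assumed in the example. Working without intermediate rounding gives about 1.039 Mbps, so the small difference from 1.040 Mbps comes from rounding the ratio to 29.24.

```python
def mpeg1_average_bit_rate(width, height, fps, bits_per_sample=8):
    """Average encoder output in bits/s for the I B B P B B P B B P B B pattern."""
    gop = "IBBPBBPBBPBB"                           # 1 I, 3 P and 8 B per 12 frames
    fraction = {"I": 1 / 10, "P": 1 / 20, "B": 1 / 50}
    avg_fraction = sum(fraction[t] for t in gop) / len(gop)

    luma_bits = width * height * bits_per_sample
    # Two chrominance planes, each sub-sampled by 2 horizontally and vertically.
    chroma_bits = 2 * (width // 2) * (height // 2) * bits_per_sample
    raw_bits_per_frame = luma_bits + chroma_bits

    return raw_bits_per_frame * avg_fraction * fps


print(round(mpeg1_average_bit_rate(352, 240, 30) / 1e6, 3), "Mbps (NTSC)")
print(round(mpeg1_average_bit_rate(352, 288, 25) / 1e6, 3), "Mbps (PAL)")
```

Both formats give the same rate because 240 lines x 30 fps and 288 lines x 25 fps carry the same number of lines per second.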


MPEG-2 Vs. MPEG-1 Video
◼ MPEG-2 is intended for higher data rates than MPEG-1.
◼ MPEG-2 allows for higher quality source material by supporting 4:2:2
(chroma channels sub-sampled in the horizontal dimension only), and
4:4:4 (no sub-sampling of chroma) formats, in addition to 4:2:0 (Chroma
channels sub-sampled by 2 in both directions.).
◼ MPEG-2 refers to intended display rate, MPEG-1 refers to coded frame
rate.
◼ The Group of Pictures layer is optional in MPEG-2
❑ It is an optional header useful for establishing a SMPTE time code base or
indicating that certain B pictures at the beginning of an edited sequence
comprise a broken link.
❑ In MPEG-1 the Group of Pictures layer is mandatory.
◼ Picture Layer
❑ In MPEG-2, a frame may be coded progressively or interlaced
❑ Interlaced frames may then be coded as either a frame picture or as two
separately coded field pictures.
◼ » Progressive frames are logical for video that originates from film
◼ » Interlace is logical for video cameras.
MPEG-2 Vs. MPEG-1 Video
◼ Repeat_first_field is new to MPEG-2 to signal a field or
frame that is repeated for purposes of frame rate
conversion
❑ This method is used to convert 24 frames/sec movies to 30
frames/sec video.
◼ Changes in motion estimation:
❑ Motion vectors are now always represented along half-sample grid
❑ Increased flexibility in coding motion vectors
◼ » can change from +/- 16 pixels to +/- 64 pixels without large increase in
overhead.
❑ Restricted vertical motion vector range
◼ » Motion is more prominent across the screen than up or down.
◼ Prediction modes now include field, frame, Dual Prime and
16x8 MC
◼ Combinations for Main Profile and Simple Profile:
Frame Vs. Field Picture


Motion Compensation for Interlaced
Video
◼ Field prediction for field pictures
◼ Field prediction for frame pictures
◼ Dual prime for P-pictures
◼ 16x8 MC for field pictures

◼ Field Prediction for Field Pictures:

❑ Each field is predicted individually from the reference
fields
◼ A P-field is predicted from one previous field.
◼ A B-field is predicted from two fields chosen from two reference
pictures.
Field Prediction for Field Pictures


Field Prediction for Frame Pictures

◼ The MB to be predicted is split into top field pels and bottom field pels.
◼ Each 16x8 field block is predicted separately with its own
motion vector (P-frame) or two motion vectors (B-frame).
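Splitting a macroblock into its two field blocks can be sketched as below. This is a minimal Python illustration, assuming the macroblock is a list of 16 rows with the top field on the even rows; the function names are hypothetical.

```python
def split_fields(macroblock):
    """Split a 16x16 frame macroblock into top and bottom 16x8 field blocks."""
    top_field = macroblock[0::2]     # rows 0, 2, 4, ... (8 rows of 16 pels)
    bottom_field = macroblock[1::2]  # rows 1, 3, 5, ... (8 rows of 16 pels)
    return top_field, bottom_field


def merge_fields(top_field, bottom_field):
    """Interleave two field blocks back into a frame macroblock."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


mb = [[row] * 16 for row in range(16)]  # each pel stores its row index
top, bottom = split_fields(mb)
print([r[0] for r in top])     # [0, 2, 4, 6, 8, 10, 12, 14]
print([r[0] for r in bottom])  # [1, 3, 5, 7, 9, 11, 13, 15]
```

Each field block would then be motion-compensated on its own and the two predictions interleaved again with `merge_fields`.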


◼ Dual Prime for P-Pictures


DCT Modes in MPEG-2
◼ Two types of DCT and two types of scan pattern:
❑ Frame DCT: divides an MB into 4 blocks for luminance, as usual.
❑ Field DCT: reorders the pixels in an MB into top and bottom fields.
◼ The zig-zag scan known from H.261/H.263 and MPEG-1 is augmented
by an alternate scan in MPEG-2, in order to code interlaced blocks that
have more correlation in the horizontal than in the vertical direction.


MPEG-2 Levels
◼ High
❑ 1920 samples/line
❑ 1152 lines per frame
❑ 60 frames/sec
❑ 80 Mbits/s
◼ High 1440
❑ 1440 samples/line
❑ 1152 lines per frame
❑ 60 frames/sec
❑ 60 Mbits/s
◼ Main
❑ 720 samples/line
❑ 576 lines per frame
❑ 30 frames/sec
❑ 15 Mbits/s
◼ Low
❑ 352 samples/line
❑ 288 lines per frame
❑ 30 frames/sec
❑ 4 Mbits/s


MPEG-2 Algorithms and Profile
◼ MAIN – Non-scalable coding algorithm supporting
functionality for:
❑ Coding interlaced video
❑ Random access
❑ B-picture prediction modes
❑ 4:2:0 YUV representation
◼ Non scalable MPEG-2
❑ Introduces Field and Frame Pictures
◼ Interlace fields and frames are coded separately.
◼ Separate Prediction.
❑ New Motion Compensation modes to explore temporal
redundancy between fields
◼ Dual Prime prediction.
◼ 16x8 block motion compensation.


MPEG-2 Algorithms and Profile
◼ SIMPLE - Includes all functionality provided by the MAIN profile but:
❑ Does not support B-picture prediction modes.
❑ 4:2:0 YUV representation.
◼ SNR Scalable - Supports all functionality provided by the MAIN
Profile plus an algorithm for:
❑ SNR Scalable Coding (2 Layers Allowed)
◼ » Support receivers with different display capability.
◼ » Based on classical Pyramidal approach for progressive image coding.
❑ 4:2:0 YUV representation.
◼ SPATIAL Scalable - Supports all functionality provided by the SNR
Scalable Profile plus an algorithm for :
❑ Spatial Scalable Coding (2 Layers Allowed)
◼ » Provide interoperability between different services.
◼ » Support receivers with different display capability.
❑ 4:2:0 YUV representation.


MPEG-2 Algorithms and Profile

◼ HIGH - Supports all functionality provided by the SPATIAL
Scalable Profile plus an algorithm for:
❑ 3 layers with the SNR and Spatial Scalable coding modes.
❑ 4:2:2 YUV representation for improved quality requirements.


MPEG-2 Scalability
◼ Data partition
❑ All headers, MVs, first few DCT coefficients in the base layer
❑ Can be implemented at the bit stream level
❑ Simple
◼ SNR scalability
❑ Base layer includes coarsely quantized DCT coefficients
❑ Enhancement layer further quantizes the base layer quantization error
❑ Relatively simple
◼ Spatial scalability
❑ Complex
◼ Temporal scalability
❑ Simple
◼ Drift problem:
❑ Occurs if the encoder's base layer information for the current frame depends on the
enhancement layer information of a previous frame
❑ Exists in the data partition and SNR scalability modes
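The SNR scalability idea (a coarse base layer plus an enhancement layer that re-quantizes the base layer's quantization error) can be sketched for a handful of DCT coefficients. This is an illustrative Python sketch only: the step sizes, the simple uniform quantizer and the function names are assumptions, not the MPEG-2 bit stream syntax.

```python
import math


def quantize(value, step):
    """Uniform quantizer: index of the multiple of `step` nearest to `value`."""
    return math.floor(value / step + 0.5)


def snr_scalable_encode(coeffs, base_step, enh_step):
    """Return base-layer indices and enhancement-layer indices."""
    base = [quantize(c, base_step) for c in coeffs]
    # The enhancement layer codes what the coarse base quantizer lost.
    residual = [c - q * base_step for c, q in zip(coeffs, base)]
    enhancement = [quantize(r, enh_step) for r in residual]
    return base, enhancement


def snr_scalable_decode(base, enhancement, base_step, enh_step):
    """Return the base-only reconstruction and the refined reconstruction."""
    base_only = [q * base_step for q in base]
    refined = [b + e * enh_step for b, e in zip(base_only, enhancement)]
    return base_only, refined


coeffs = [100, -37, 12, 3]
base, enh = snr_scalable_encode(coeffs, base_step=16, enh_step=4)
print(snr_scalable_decode(base, enh, 16, 4))
# ([96, -32, 16, 0], [100, -36, 12, 4])
```

A receiver that gets only the base layer reconstructs the coarser values; one that also gets the enhancement layer ends up closer to the original coefficients.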


SNR Scalable Encoder


Spatial Scalable Codec


Temporal Scalability
(Option 1)
◼ In this option of temporal scalability, only the base layer is
used to predict images in the enhancement layer.
◼ Consequently, errors in the enhancement layer do not
propagate over time.


Temporal Scalability
(Option 2)
◼ In this option of temporal scalability, the enhancement
layer may use both the base layer and the enhancement layer for
prediction.
◼ It is used for coding of stereoscopic video.


Profiles and levels in MPEG-2


MPEG-4 Overview
◼ Supports highly interactive multimedia applications as well as
traditional applications
◼ Advanced functionalities: interactivity, scalability, error resilience…
◼ Coding of natural and synthetic audio and video, as well as
graphics.
◼ Enables the multiplexing of audiovisual objects and their composition in a
scene.

Applications:
➢Video on LANs
➢Internet video
➢Wireless video
➢Video database
➢Interactive home shopping
➢Video e-mail, home movies
➢Virtual reality games, flight simulation, multi-viewpoint training
MPEG-4: Scene with Audio Visual Objects


MPEG-4: Video Coding
◼ Basic video coding
❑ Definition of Video Object (VO), Video Object Layer (VOL), Video
Object Plane (VOP)
❑ Improved coding efficiency vs. MPEG-1/2
◼ Based on H.263 baseline
❑ 3D VLC.
❑ Four MVs and Unrestricted MVs.
❑ OBMC not required.
◼ Global motion compensation
❑ Using 8-parameter projective mapping.
❑ Effective for sequences with large global motion.
◼ Sprites
❑ Code a large background in the beginning of the sequence, plus affine
mappings, which map parts of the background to the displayed scene at
different time instances.
❑ Decoder can vary the mapping to zoom in/out, pan left/right

◼ Quarter pixel motion compensation.

◼ DC and AC prediction: can predict DC and part of AC from either the
previous block or the block above.
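The 8-parameter projective mapping used for global motion compensation can be written out as a small sketch. The parameter ordering `a0..a7` and the function name `projective_map` are assumptions chosen for illustration, and this is not how MPEG-4 signals the parameters in the bit stream. With `a6 = a7 = 0` the mapping reduces to a 6-parameter affine transform.

```python
def projective_map(x, y, params):
    """Map pel position (x, y) to its position in the reference picture.

    x' = (a0*x + a1*y + a2) / (a6*x + a7*y + 1)
    y' = (a3*x + a4*y + a5) / (a6*x + a7*y + 1)
    """
    a0, a1, a2, a3, a4, a5, a6, a7 = params
    w = a6 * x + a7 * y + 1.0
    return (a0 * x + a1 * y + a2) / w, (a3 * x + a4 * y + a5) / w


identity = (1, 0, 0, 0, 1, 0, 0, 0)
pan = (1, 0, 5, 0, 1, -2, 0, 0)  # pure translation, e.g. a camera pan
print(projective_map(3, 4, identity))  # (3.0, 4.0)
print(projective_map(3, 4, pan))       # (8.0, 2.0)
```

One parameter set describes the motion of the whole picture, which is why the tool pays off for sequences dominated by camera motion.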
MPEG-4: Video Coding
◼ Object-based video coding
❑ Binary shape coding
◼ Run-length coding
◼ Pel-wise coding using context-based arithmetic coding
◼ Quadtree coding
❑ Alpha-map shape coding
◼ Binary alpha map: specifies whether a pel belongs to an object
◼ Gray scale alpha map: a pel belonging to the object can have a
transparency value in the range (0-255).
❑ Padding for block-based DCT of texture
❑ Shape-adaptive DCT
◼ DWT for still texture coding
◼ Mesh animation, face and body animation.
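How a gray scale alpha map is used when an object is placed over a background can be sketched as a per-pel blend. This is a generic compositing illustration, assuming 8-bit samples, an alpha of 255 meaning fully opaque, and a hypothetical function name `composite`; it is not the normative MPEG-4 composition process.

```python
def composite(obj, background, alpha):
    """Blend an object over a background using a gray scale alpha map (0-255)."""
    return [
        [(a * o + (255 - a) * b + 127) // 255   # +127 rounds to nearest
         for o, b, a in zip(obj_row, bg_row, alpha_row)]
        for obj_row, bg_row, alpha_row in zip(obj, background, alpha)
    ]


obj = [[200, 200, 200]]
background = [[100, 100, 100]]
alpha = [[255, 128, 0]]   # opaque, half transparent, fully transparent
print(composite(obj, background, alpha))  # [[200, 150, 100]]
```

A binary alpha map is the special case in which every alpha value is either 0 or 255.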


MPEG-4: Sprite Coding
◼ Analyze the video stream to find the static background
❑ Create a still image of the background
❑ Code the moving objects against the background
◼ 8 global motion parameters describing camera motion are coded for
each sequence
❑ Represent an affine transform of the sprite from the first frame


Object based Video Coding
◼ Entire scene is decomposed into multiple objects
❑ Object segmentation is the most difficult task!
❑ But this does not need to be standardized
◼ Each object is specified by its shape, motion, and texture
(color)
❑ Shape and texture both change over time (specified by motion)
◼ MPEG-4 assumes the encoder has a segmentation map
available, and specifies how to code (actually decode!)
shape, motion and texture


Object Description Hierarchy in
MPEG-4
◼ VO: video object
◼ VOL: video object layer
❑ (can be different parts of a VO or different rate/resolution
representations of a VO)
◼ VOP: video object plane


Example of Scene Composition in
MPEG-4
◼ The decoder can compose a scene by including different
VOPs in a VOL.


MPEG-4 Shape Coding
◼ Uses block-based approach (block=MB)
❑ Boundary blocks (blocks containing both the object and background)
❑ Non-boundary blocks: either belong to the object or background
◼ Boundary block’s binary alpha map (binary alpha block) is
coded using context-based arithmetic coding
❑ Intra-mode: context pels within the same frame.
❑ Inter-mode: context pels include previous frame, displaced by MV.
◼ Shape MV separate from texture MV.
◼ Shape MV predictively coded using texture MV.
◼ Grayscale alpha maps are coded using DCT
◼ Texture in boundary blocks coded using
❑ padding followed by conventional DCT
❑ Or shape-adaptive DCT
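The first step of the block-based approach, sorting macroblocks into object, background and boundary blocks, can be sketched from a binary alpha map. This is an illustrative Python sketch; the label strings and the function name `classify_macroblocks` are hypothetical, and the arithmetic coding of the boundary blocks is not shown.

```python
def classify_macroblocks(alpha, mb_size=16):
    """Label each macroblock of a binary alpha map.

    'object'     : every pel belongs to the object
    'background' : no pel belongs to the object
    'boundary'   : the block contains both, so its alpha block must be coded
    """
    rows, cols = len(alpha), len(alpha[0])
    labels = []
    for top in range(0, rows, mb_size):
        label_row = []
        for left in range(0, cols, mb_size):
            pels = [alpha[r][c]
                    for r in range(top, min(top + mb_size, rows))
                    for c in range(left, min(left + mb_size, cols))]
            if all(pels):
                label_row.append("object")
            elif not any(pels):
                label_row.append("background")
            else:
                label_row.append("boundary")
        labels.append(label_row)
    return labels


mask = [[1, 1, 0, 0, 0, 1],
        [1, 1, 0, 0, 1, 1]]
print(classify_macroblocks(mask, mb_size=2))
# [['object', 'background', 'boundary']]
```

Only the boundary blocks need shape information and padding or a shape-adaptive DCT for their texture; the other blocks are handled like ordinary macroblocks or skipped.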


MPEG-4 Generic video coding


Video Compression Progress


MPEG-7 Overview
◼ To enable search and browsing of multimedia documents
◼ Defines the syntax for describing the structural and
conceptual content
◼ MPEG-1/2/4 make content available, whereas MPEG-7
allows you to find the content you need!
❑ Enable multimedia document indexing, browsing, and retrieval
❑ Define the syntax for the metadata (e.g. index and summary) attached to
the document
❑ Generation of index and summary is not part of the standard!
◼ Content description in MPEG-7
❑ Descriptor (D): describing low-level features
❑ Description scheme (DS): combining Ds to describe high-level
features/structures
❑ Description definition language (DDL): define how Ds and DSs can be
defined or modified
❑ System tools
MPEG-7 Visual Descriptors
◼ Color
❑ Histogram, dominant color, etc.
◼ Texture
❑ Homogeneity: energy in different orientation and frequency bands
(Gabor transform)
❑ Coarseness, directionality, regularity
❑ Edge orientation histogram
◼ Motion
❑ Camera motion
❑ Motion trajectory of feature points in non-rigid object
❑ Motion parameters of a rigid object
❑ Motion activity
◼ Shape
❑ Boundary-based vs. region-based
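To make the idea of a low-level color descriptor concrete, the sketch below builds a coarse RGB histogram and compares two images with histogram intersection. It is a generic illustration and not the normative MPEG-7 color descriptor: the bin layout, the similarity measure and both function names are assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of a list of (r, g, b) tuples with 8-bit values."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    step = 256 // bins_per_channel
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
    return [count / len(pixels) for count in hist]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]; 1 means the two histograms are identical."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


mostly_red = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
all_blue = color_histogram([(0, 0, 255)] * 4)
print(histogram_intersection(mostly_red, mostly_red))  # 1.0
print(histogram_intersection(mostly_red, all_blue))    # 0.25
```

A search engine would store such descriptors as metadata and rank stored content by similarity to the query, without decoding the video itself.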
