04 VideoCompressionStandards
[Figure: processing chain Source, Pre-Processing, Encoding, Decoding, Post-Processing, Destination, with the scope of the standard marked on the chain.]
H.261: The Basis of Modern
Video Compression
• ITU-T (ex-CCITT) Rec. H.261: The first widespread
practical success
– First design (late '90) embodying the typical structure that dominates
today: 16x16 macroblock motion compensation, 8x8 DCT, scalar
quantization, and variable-length coding
– Key aspects later dropped by other standards: loop filter, integer
motion comp., 2-D VLC, header overhead
– v2 (early ‘93) added a backward-compatible high-resolution
graphics trick mode
– Operated at 64-2048 kbps
– Still in use, although mostly as a backward-compatibility feature –
overtaken by H.263
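The transform and quantization steps named above can be sketched in a few lines. This is a minimal illustration, assuming an orthonormal 8x8 DCT-II and a plain uniform quantizer with a single step size; the real H.261 quantizer differs in detail, and the function names are made up for this example.

```python
import math

N = 8  # block size used by H.261-style coders

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out

def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer level."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def dequantize(levels, step):
    """Decoder-side reconstruction of coefficient values from levels."""
    return [[level * step for level in row] for row in levels]

# A flat block has all its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(flat), step=16)
```

Only the integer levels would be passed on to the variable-length coder; a flat block leaves a single nonzero level, which is why smooth image areas are cheap to code.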
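Typical MC+DCT Video Coder
[Figure: block diagram. The motion-compensated prediction is subtracted from the input frame; the residual passes through DCT, quantization, and entropy encoding to give the encoded residual (to channel). The decoder (dotted box) applies entropy decoding, quantizer reconstruction, and inverse DCT, adds the motion-compensated prediction to obtain the approximated input frame (to display), and stores it in a frame buffer (delay) as the prior coded frame approximation used by the motion compensation predictor.]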
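The motion-compensated prediction in such a coder needs a displacement for each block. A minimal sketch of integer-pel full-search block matching follows, assuming a sum-of-absolute-differences (SAD) cost and a small search window; the standards leave the search method to the encoder, so the function names and parameters here are illustrative only.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(cur, ref, bx, by, size=16, search=2):
    """Try every integer displacement within +/-search; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block leaves the frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Synthetic test frames: the current frame is the reference displaced
# by dx = -2, dy = +1 (indices wrap only outside the tested block).
H = W = 24
ref = [[(3 * x * x + 5 * y * y + x * y) % 251 for x in range(W)] for y in range(H)]
cur = [[ref[(y + 1) % H][(x - 2) % W] for x in range(W)] for y in range(H)]
best_match = full_search(cur, ref, bx=4, by=4)
```

The encoder would then code the difference between the current block and the matched reference block, plus the displacement itself.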
[Figure: rate-distortion plot, bit rate 0 to 500 kbps on the horizontal axis and vertical axis ticks from 26 to 34. Curves: intraframe DCT coding (DCT 1974, JPEG 1992) and integer-pel motion compensation (H.261 1991). Annotation: "? 67 %".]
MPEG-1:
Practicality at Higher Bit Rates
• Formally ISO/IEC 11172-2 (‘93), developed by
ISO/IEC JTC1 SC29 WG11 (MPEG) – use is
fairly widespread, but mostly overtaken by
MPEG-2
– Superior quality to H.261 when operated at higher bit
rates (1 Mbps for CIF 352x288 resolution)
– Can provide approximately VHS quality between 1-2
Mbps using SIF 352x240/288 resolution
– Technical features: Adds bi-directional motion prediction
and half-pixel motion to the H.261 design
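Both additions reduce to simple averaging. The sketch below assumes bilinear interpolation with round-half-up for half-pixel positions and a rounded average of a forward and a backward prediction for bi-directional blocks; the helper names are hypothetical and frame-edge handling is left to the caller.

```python
def half_pel_sample(frame, x2, y2):
    """Sample the frame at (x2 / 2, y2 / 2), where x2 and y2 are given in
    half-pixel units. Integer positions return the pixel itself; half
    positions average the 2 or 4 neighbouring pixels with rounding."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 & 1)  # right neighbour only when x2 is odd
    y1 = y0 + (y2 & 1)  # lower neighbour only when y2 is odd
    a, b = frame[y0][x0], frame[y0][x1]
    c, d = frame[y1][x0], frame[y1][x1]
    return (a + b + c + d + 2) // 4

def bidirectional(forward_pred, backward_pred):
    """Average a forward and a backward prediction sample with rounding."""
    return (forward_pred + backward_pred + 1) // 2

frame = [[10, 20],
         [30, 40]]
samples = [half_pel_sample(frame, 0, 0),  # integer position
           half_pel_sample(frame, 1, 0),  # horizontal half position
           half_pel_sample(frame, 0, 1),  # vertical half position
           half_pel_sample(frame, 1, 1)]  # centre of the four pixels
```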
MPEG-2/H.262: Even Higher Bit
Rates and Interlace
• Formally ISO/IEC 13818-2 & ITU-T H.262, developed
('94) jointly by ITU-T and ISO/IEC SC29 WG11 (MPEG).
Now in wide use for DVD and standard and
high-definition DTV (the most commonly used video
coding standard)
– Primary new technical features: support for
interlaced-scan pictures and scalability
– Essentially the same as MPEG-1 for progressive-scan
pictures, and MPEG-1 forward compatibility required
– Not especially useful below 4 Mbps (range of use
normally 5-30 Mbps)
H.263: The Next Generation
• ITU-T Rec. H.263 (v1: 1995): The next generation
of video coding performance, developed by ITU-T;
the current best standard for practical video
telecommunication (has overtaken H.261 as the
dominant videoconferencing codec)
– Superior to H.261 at all bit rates
– Wins by a factor of two at very low rates
– Versions 2 (late 1997/early 1998) and 3 (2000) later
developed
MPEG-4: Baseline H.263
and Many Creative Extras
• MPEG-4 (v1: early 1999), formally ISO/IEC 14496-2:
Contains the H.263 design and adds all prior features
and various creative new extras
– Includes segmented coding of shapes, zero-tree
wavelet coding of still textures, coding of synthetic
and semi-synthetic content, etc.
– v2 (early 2000) & v3 (early 2001) later added
MPEG-4 and H.263 Standardization Dynamics
• MPEG-4 project launched soon after H.263 completed
• MPEG-4 project was very ambitious and was planned to
be significantly different from H.263
• Compatibility with H.263 was not initially planned in
MPEG-4 (although it eventually turned out to be
significantly compatible!)
• ITU-T decided to extend its H.263 quickly and compatibly
rather than join up with the longer, more ambitious,
potentially incompatible MPEG-4 effort for the features
the ITU wanted
• Much cross-fertilization of ideas and people in projects
Detailed Recent History In Video
Coding Standardization
• ITU-T Events
– H.263v1 completed late '95
– H.263+ project (H.263 v2) technically final Sept '97
– H.263++ project (H.263 v3) technically final July '00
– H.26L project underway (test version available)
• ISO/IEC Events
– MPEG-4 v1 completed early '99
– MPEG-4 v2 completed early '00
– MPEG-4 v3 completed early '01
– Potential for new work under evaluation
H.263++ New Version 3 Features
Part 1 of 2
• Annex U: Fidelity enhancement by macroblock
and block-level reference picture selection, a
significant improvement in compression quality
• Annex V: Packet Loss & Error Resilience using
data partitioning with reversible VLCs (roughly
similar to MPEG-4 data partitioning, but improved
by using reversible coding of motion vectors
rather than coefficients)
H.263++ New Version 3 Features
Part 2 of 2
• Annex W: Additional Supplemental Enhancement
Information
– IDCT Mismatch Elimination (specific fixed-point fast IDCT)
– Arbitrary binary user data
– Text messages (arbitrary, copyright, caption, video description,
and URI)
– Error Resilience:
• Picture header repetition (current, previous, next+TR, next-TR)
• Spare reference pictures for error concealment
– Interlaced field indications (top & bottom)
H.263++ Annex U
Rate Distortion Performance
[Figure: PSNR [dB] (26 to 36) versus bit rate [kbps] (0 to 160) for Foreman, QCIF, 10 Hz, 100 frames encoded. Curves: TMN-10 and H.263 combined with long-term memory prediction using 3, 10, and 50 reference pictures. Annotation: "17 %".]
Average Bit Rate Savings
[Figure: average bit-rate savings in % at 34 dB (axis 0 to 40) versus number of reference frames (1, 2, 3, 5, 10, 17, 33, 65, 100). Curves: Tempete, Container, Silent, Foreman, Mobile & Calendar, Mother & Daughter, Stefan, and their average. Annotation: "13.5 %".]
MPEG-4 Version 3
Just Completed (part 1 of 2)
• “Studio Profile”
– Various additions oriented toward professional use of video within
specialized studio environments
– Adds 4:2:2 and 4:4:4 sampling structures
– Adds more MPEG-2 elements to MPEG-4
• “Fine Granularity Scalability Streaming Video Profile”, a
new form of scalable video coding
– Uses a scalable enhancement layer
– Temporal prediction in enhancement layer is stopped to prevent
temporal error propagation
– Enhancement layer coded by bit-planes to form a
“progressive-transmission” bitstream
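Bit-plane coding is what makes the enhancement layer truncatable at any point. A minimal sketch follows, assuming non-negative residual magnitudes (sign handling and entropy coding of each plane are omitted) and hypothetical helper names.

```python
def to_bit_planes(values, num_planes):
    """Split non-negative integers into bit-planes, most significant first."""
    return [[(v >> p) & 1 for v in values]
            for p in range(num_planes - 1, -1, -1)]

def from_bit_planes(planes, num_planes):
    """Rebuild values from the planes received so far; missing low-order
    planes are treated as zero."""
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i
        for j, bit in enumerate(plane):
            values[j] |= bit << shift
    return values

residuals = [5, 3, 0, 6]
planes = to_bit_planes(residuals, num_planes=3)
full = from_bit_planes(planes, 3)         # all planes received
coarse = from_bit_planes(planes[:2], 3)   # stream cut after two planes
```

Cutting the stream after any plane still yields a usable, coarser approximation, which is the progressive-transmission property the profile relies on.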
MPEG-4 Version 3
Just Completed (part 2 of 2)
• “Advanced Simple Profile”, a combination of v1
features, containing:
– “Simple Profile” features
– B pictures
– MPEG-2-style quantization
– Interlace features (at higher levels only)
– ¼-pel motion
– Global motion comp
– Single stream support in new “level 0”
ITU-T VCEG H.26L Project
Goals (Completion 2002)
• Compression beyond capability of H.263vN
• Real-time low-cost complexity
• Delay reduction
• Enhanced error and packet loss resilience
• Bit-rate adaptivity (e.g. scalability & BR reduction)
• Spatio-temporal resolution adaptivity
• Robustness to source material behavior
H.26L Status
• Test Model Long-Term Number 6 (TML-6): Designed
January '01 (Eibsee), description and software
soon available on the 'net
• TML-5 software and spec available
(Geneva, November '00)
• Gain goal over 1999 standards:
50% savings in bits for same fidelity!
(at all bit rates)
The H.26L TML-6 Design
Part 1 of 4
• Still using a hybrid of DPCM and transform coding
as in prior standards. Common elements include:
– 16x16 macroblocks
– Conventional sampling of chrominance and association
of luminance and chrominance data
– Block motion displacement
– Block transforms (not wavelets or fractals)
– Scalar quantization
– Variable-length coding
The H.26L TML-6 Design
Part 2 of 4
• Motion Compensation:
– Multiple reference pictures (per H.263++ Annex U)
– B picture support (per several prior standards)
– Multihypothesis concept being evaluated
– 1/4 sample accuracy motion (sort of per MPEG-4,
could possibly go to 1/8 pel)
– 6x6 tap filtering to 1/2 sample accuracy, bilinear
filtering to 1/4 sample accuracy
– Various block sizes and shapes for motion
compensation (7 segmentations of the macroblock)
– “Funny position” with heavier filtering
– Affine motion under consideration
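The two-stage sub-sample interpolation listed above can be illustrated in one dimension. The tap values (1, -5, 20, 20, -5, 1)/32 used here are those of the later H.264/AVC standard and are an assumption for this sketch; the slides do not give the TML-6 coefficients, and the function names are illustrative.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed taps (sum = 32), taken from H.264/AVC

def clip255(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))

def half_sample(row, i):
    """Interpolate the sample halfway between row[i] and row[i + 1]
    using the 6-tap filter over row[i - 2] .. row[i + 3]."""
    acc = sum(tap * row[i - 2 + k] for k, tap in enumerate(TAPS))
    return clip255((acc + 16) >> 5)  # divide by 32 with rounding

def quarter_sample(row, i):
    """Interpolate a quarter of the way from row[i] to row[i + 1] as the
    rounded average of the integer sample and the half sample."""
    return (row[i] + half_sample(row, i) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50, 60, 70]
half_value = half_sample(ramp, 3)        # between 30 and 40
quarter_value = quarter_sample(ramp, 3)  # between 30 and the half sample
```

A two-dimensional version would apply the same filter along rows and columns; the longer filter preserves more detail than plain bilinear averaging at half-sample positions.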
The H.26L TML-6 Design
Part 3 of 4
• Intra Coding Structure:
– Directional spatial prediction (6 types for luma, one for
chroma)
– Alterations under consideration
• Transform
– Variable block size for intra (16x16, 8x8, 4x4)
– Technically not exactly a DCT, but an integer transform
closely approximating a DCT
– Based primarily on 4x4 transform size (all prior
standards used 8x8)
– Expanded to 8x8 for chroma by 2x2 DC transform
– Adaptive block size under consideration
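To show what an integer transform approximating a DCT looks like, the sketch below uses the 4x4 core matrix standardized later in H.264/AVC as a stand-in. That matrix is an assumption for illustration; the TML-6 coefficients are not given in these slides and may differ. Scaling, normally folded into quantization, is omitted.

```python
# Assumed stand-in matrix (H.264/AVC core transform). The rows have the
# same sign pattern as the 4-point DCT basis vectors.
C = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]

def matmul(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform(block):
    """Y = C * X * C^T, computed exactly in integer arithmetic."""
    return matmul(matmul(C, block), transpose(C))

flat = [[10] * 4 for _ in range(4)]
coeffs = forward_transform(flat)
```

Because every operation is an integer add, subtract, or small multiply, encoder and decoder compute bit-identical results, which avoids the IDCT mismatch that floating-point DCT implementations can cause.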
The H.26L TML-6 Design
Part 4 of 4
• Two inverse scan patterns
• Logarithmic step size control
• Smaller step size for chroma (per H.263 Annex T)
• Universal variable-length coding (configurability under
consideration)
• Adaptive arithmetic coding under strong consideration
• In-loop deblocking filter
• Distinct Network Adaptation Layer (NAL) design for
network transport
• Inter-sequence transitional pictures under consideration
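Logarithmic step size control means the quantizer step grows by a fixed ratio per index increment rather than by a fixed amount. The sketch below assumes the step doubles every 6 index steps with a base step of 0.625, which is how the later H.264/AVC standard behaves; both numbers are assumptions here, since the slides do not state the TML-6 values.

```python
BASE_STEP = 0.625      # assumed step size at index 0
STEPS_PER_OCTAVE = 6   # assumed: the step size doubles every 6 indices

def step_size(index):
    """Quantizer step size for a given quantizer index (logarithmic scale)."""
    return BASE_STEP * 2.0 ** (index / STEPS_PER_OCTAVE)

# Each index increment scales the step by the same ratio (about 12 %).
ratio = step_size(1) / step_size(0)
table = [round(step_size(i), 3) for i in (0, 6, 12, 18)]
```

With this kind of scale a fixed index change produces roughly the same relative change in bit rate anywhere in the range, which simplifies rate control compared with a linear step size.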
Future Work in MPEG
[Figure: live feed and on-demand content delivered by unicast and multicast.]