Presented By: Priya Raina 13-516
The Moving Picture Experts Group (MPEG) is
a working group of experts that was formed
by ISO and IEC to set standards for audio and video
compression and transmission.
Founded in 1988 by Hiroshi Yasuda and Leonardo
Chiariglione
Introduction
The ISO standards produced by MPEG are identified by a five-digit number;
MPEG-1, for example, is ISO/IEC 11172.
Proposal of new work within a committee (NP)
NP approved at Subcommittee (SC29) and then at
Technical Committee (JTC1) level
Scope definition and Call for Proposals (CfP)
Birth of a standard
For audio and video coding standards the first
document produced is called a Test Model; it describes,
in a programming language, the operation of the
encoder and the decoder.
Used to carry out simulations to optimise the
performance of the coding scheme.
A Working Draft (WD) is produced, which is already in
the form of a standard; it is kept internal to MPEG for
revision.
A sufficiently solid WD becomes a Committee Draft (CD).
It is then sent to National Bodies (NB) for ballot. If
the number of positive votes is above the quorum,
the CD becomes Final Committee Draft (FCD) and is
again submitted to NBs for the second ballot after a
thorough review that may take into account the
comments issued by NBs. If the number of positive
votes is above the quorum the FCD becomes Final
Draft International Standard (FDIS). ISO will then hold
a yes/no ballot with National Bodies where no
technical changes are allowed. The document then
becomes International Standard (IS).
Process flow: NP → Approval by SC29 and JTC1 → Scope Definition and CfP → Test Model → WD → CD → FCD → FDIS → IS
High-Definition Television (HDTV)
1920x1080
30 frames per second (full motion)
8 bits for each of the three primary colors
Total 1.5 Gb/sec!
Each cable channel is 6 MHz
Max data rate of 19.2 Mb/sec
Reduced to 18 Mb/sec w/audio + control
Compression ratio must be 83:1! (see the quick calculation below)
The Need for Video
Compression
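As a quick check on the HDTV numbers above, a back-of-the-envelope calculation (a sketch in Python; the 18 Mbit/s channel figure is simply the one quoted on the slide):

```python
# Back-of-the-envelope check of the HDTV compression requirement above.
width, height = 1920, 1080          # pixels
fps = 30                            # frames per second (full motion)
bits_per_pixel = 3 * 8              # 8 bits for each of three primary colors

raw_rate = width * height * fps * bits_per_pixel      # bits/second
channel_rate = 18e6                                    # ~18 Mbit/s left after audio + control

print(f"Raw rate:       {raw_rate / 1e9:.2f} Gbit/s")  # about 1.49 Gbit/s
print(f"Required ratio: {raw_rate / channel_rate:.0f}:1")  # about 83:1
```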
CD-ROM and DAT are the key storage devices
1-2 Mbits/sec for 1x CD-ROM
Two types of application videos:
Asymmetric (encoded once, decoded many)
Video games, Video on Demand
Symmetric (encoded once, decoded once)
Video phone, video mail
(How do you think the two types might influence design?)
Video at about 1.5 Mbits/sec
Audio at about 64-192 kbits/channel
Compatibility Goals
Random Access, Reverse, Fast Forward, Search
At any point in the stream
Can reduce quality somewhat during task, if needed
Audio/Video Synchronization
Even when under two different clocks
Robustness to errors
Not catastrophic if bits lost
Coding/Decoding delay under 150ms
For interactive applications
Editability
Modify/Replace frames
Requirements
Standards at a glance
MPEG 1
MPEG 2
MPEG 3
MPEG 4
MPEG 7
MPEG 21
MPEG A
MPEG B
MPEG C
MPEG D
MPEG E
MPEG V
MPEG M
MPEG U
MPEG H
MPEG DASH
MPEG 1
ISO/IEC 11172
Coding of moving pictures and associated audio at up to about 1.5
Mbit/s
Horizontal picture size: 768 pixels
Vertical picture size: 576 lines
Number of macroblocks per picture: 396
Number of macroblocks × picture rate: 396 × 25 = 9,900
Picture rate: 30 pictures/s
VBV buffer size: 2,621,440 bits
Bit rate: 1,856,000 bits/s
The Systems layer provides the information about the audio and video
layers, with stream identification and synchronization information
essential to the decoding and subsequent rendering of each of them.
It is required to carry not only the multiplexed audio and video
information but all of the other non-audio/video (and in many cases
private) data needed for a successful and pleasing user experience.
It is designed for the error-free environment of CDs and optical discs.
Systems Layer
A unique multiplex (SYSMUX) is designed to deliver a clock
reference and the elementary streams precisely, in such a way as to
enable audio-video synchronization thanks to constant delay.
This is achieved by specifying the System Target Decoder
(STD) model. The STD is an idealized model of a
demultiplexing and decoding complex that precisely
specifies the delivery time of each byte in an MPEG
Systems multiplex and its distribution to the
appropriate decoder or resource in the complex.
Requirements: in the context of storage and replay of stored data,
these mainly relate to random access, i.e. forward/backward replay,
fast-forward mode and editing.
Basic principle: hybrid coding, the combination of block-wise
motion-compensated prediction and scalar-quantized DCT-based
coding of the residual. The same transform is applied
when intra-frame mode is selected for a whole picture or a
macroblock.
MPEG-1 exploits perceptual compression methods to significantly
reduce the data rate, i.e. it reduces or discards information in
certain frequencies and areas of the picture that the human eye
has limited ability to fully perceive.
It also exploits temporal (over time) and spatial (across a picture)
redundancy.
Video Layer
Structure of the Coded Bit-Stream
[Diagram: a sequence consists of GOPs (GOP-1, GOP-2, ..., GOP-n); each GOP is a series of I, B and P pictures (e.g. I B B B P B B ...); each picture is divided into slices (Slice-1 ... Slice-N); each slice into macroblocks (mb-1 ... mb-n); and each macroblock into six 8x8 blocks, numbered 0-3 (luma) and 4-5 (chroma). Layers, top to bottom: Sequence layer, GOP layer, Picture layer, Slice layer, Macroblock layer, 8x8 block.]
Sequence information
Video Params include width, height, aspect ratio of pixels,
picture rate.
Bitstream Params are bit rate, buffer size, and constrained
parameters flag (means bitstream can be decoded by most
hardware)
Two types of QTs: one for intra-coded blocks (I-frames) and
one for inter-coded blocks (P- and B-frames).
Group of Pictures (GOP) Information
Time code: bit field with SMPTE time code (hours, minutes,
seconds, frame).
GOP Params are bits describing structure of GOP.
Picture information
Type: I, P, or B-frame?
Buffer Params indicate how full decoder's buffer should be before
starting decode.
Encode Params indicate whether half pixel motion vectors are used.
Slice information
Vert Pos: what line does this slice start on?
QScale: How is the quantization table scaled in this slice?
Macroblock information
Addr Incr: number of MBs to skip.
Type: Does this MB use a motion vector? What type?
QScale: How is the quantization table scaled in this MB?
Coded Block Pattern (CBP): bitmap indicating which blocks are
coded.
The color-space is transformed to Y'CbCr (Y'=Luma, Cb=Chroma Blue,
Cr=Chroma Red). Luma(brightness, resolution) is stored separately
from chroma (color, hue, phase) and even further separated into red and
blue components. The chroma is also subsampled to 4:2:0, meaning it is
reduced by one half vertically and one half horizontally, to just one quarter
the resolution of the video.
Because the human eye is much more sensitive to small changes in
brightness (the Y component) than in color (the Cr and Cb
components), chroma subsampling is a very effective way to reduce the
amount of video data that needs to be compressed. Can manifest as
chroma aliasing artifacts
Because of subsampling, Y'CbCr video must always be stored using even
dimensions, otherwise chroma mismatch ("ghosts") will occur
MPEG-1 operates on 8x8 blocks for quantization. However, because chroma (color) is
subsampled by a factor of 4, each pair of (red and blue) chroma blocks
corresponds to 4 different luma blocks. This set of 6 blocks, covering a
16x16 pixel area, is called a macroblock.
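A minimal sketch of the colour-space conversion and 4:2:0 subsampling described above, assuming BT.601-style conversion coefficients and simple 2x2 averaging for the chroma reduction (MPEG-1 specifies the decoded 4:2:0 layout, not how an encoder derives it):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) float RGB image in [0, 255] to Y', Cb, Cr planes (BT.601-style)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0
    return y, cb, cr

def subsample_420(plane):
    """Halve a chroma plane horizontally and vertically by averaging 2x2 blocks."""
    h, w = plane.shape
    assert h % 2 == 0 and w % 2 == 0, "4:2:0 needs even dimensions (else chroma 'ghosts')"
    return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rgb = np.random.randint(0, 256, (16, 16, 3)).astype(float)   # one macroblock worth of pixels
y, cb, cr = rgb_to_ycbcr(rgb)
cb420, cr420 = subsample_420(cb), subsample_420(cr)
# A 16x16 macroblock: four 8x8 luma blocks plus one 8x8 Cb and one 8x8 Cr block (6 blocks).
print(y.shape, cb420.shape, cr420.shape)   # (16, 16) (8, 8) (8, 8)
```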
Spatial Redundancy Reduction
Intra-frame encoding: quantization (the major reduction step, and the one that controls quality), followed by a zig-zag scan and run-length coding.
Frames: I, P, B, D. The D-frame is exclusive to MPEG-1: an I-frame encoded using DC
transform coefficients only; very low quality; never referenced by I-, P- or B-
frames; used for fast previews of video, for instance when seeking through a
video at high speed. Now obsolete.
Only blocks that change are updated (up to the maximum GOP size). This is
known as conditional replenishment.
Movement of the objects, and/or the camera may result in large portions of the
frame needing to be updated, even though only the position of the previously
encoded objects has changed. Through motion estimation the encoder can
compensate for this movement and remove a large amount of redundant
information.
The encoder compares the current frame with adjacent parts of the video from
the anchor frame (previous I- or P- frame) in a diamond pattern, up to an
(encoder-specific) predefined radius limit from the area of the current
macroblock. If a match is found, only the direction and distance (i.e.
the vector of the motion) from the previous video area to the current macroblock
need to be encoded into the inter-frame (P- or B- frame). The reverse of this
process, performed by the decoder to reconstruct the picture, is called motion
compensation.
A predicted macroblock rarely matches the current picture perfectly, however.
The difference between the estimated matching area and the real
frame/macroblock is called the prediction error.
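A toy full-search block-matching routine to illustrate motion estimation and the prediction error; MPEG-1 does not mandate a search strategy, so the exhaustive SAD search and the parameters below are illustrative only (real encoders use faster patterns such as the diamond search mentioned above, plus half-pixel refinement):

```python
import numpy as np

def motion_estimate(ref, cur, bx, by, block=16, radius=8):
    """Find the best-matching block in `ref` for the block at (bx, by) in `cur`.

    Exhaustive search over +/- radius using the sum of absolute differences (SAD).
    Returns the motion vector (dx, dy) and the prediction error block.
    """
    target = cur[by:by + block, bx:bx + block].astype(int)
    best, best_vec = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + block > ref.shape[1] or y + block > ref.shape[0]:
                continue                               # candidate falls outside the reference
            cand = ref[y:y + block, x:x + block].astype(int)
            sad = np.abs(target - cand).sum()
            if best is None or sad < best:
                best, best_vec = sad, (dx, dy)
    dx, dy = best_vec
    prediction = ref[by + dy:by + dy + block, bx + dx:bx + dx + block].astype(int)
    return best_vec, target - prediction               # motion vector + prediction error

ref = np.random.randint(0, 256, (64, 64))
cur = np.roll(ref, shift=(3, -2), axis=(0, 1))         # simulate a simple translation
vec, err = motion_estimate(ref, cur, bx=16, by=16)
print("motion vector:", vec, "residual energy:", np.abs(err).sum())
```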
Temporal Redundancy Reduction
I frames are independently encoded
P frames are based on previous I, P frames
B frames are based on previous and following I and P
frames
In case something is uncovered
Quantization is performed by taking each of the 64 frequency values of the DCT block,
dividing them by the frame-level quantizer, then dividing them by their
corresponding values in the quantization matrix. Finally, the result is rounded down.
This significantly reduces, or completely eliminates, the information in some
frequency components of the picture. Typically, high frequency information is less
visually important, and so high frequencies are much more strongly
quantized (drastically reduced). MPEG-1 actually uses two separate quantization
matrices, one for intra-blocks (I-blocks) and one for inter-blocks (P- and B-blocks),
so quantization of different block types can be done independently and thus more
effectively.
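A rough sketch of that quantization step; the quantization matrix below and the exact scaling formula are illustrative assumptions, not the normative MPEG-1 definition (which also special-cases the intra DC term):

```python
import numpy as np

def quantize(dct_block, quant_matrix, quantizer_scale):
    """Quantize an 8x8 block of DCT coefficients.

    Each coefficient is divided by the frame/macroblock-level quantizer scale and by
    the corresponding entry of the quantization matrix, then truncated toward zero.
    (Illustrative only; the normative formula has additional rounding details.)
    """
    return np.fix(dct_block / (quantizer_scale * quant_matrix / 8.0)).astype(int)

def dequantize(levels, quant_matrix, quantizer_scale):
    """Approximate inverse used by the decoder."""
    return levels * quant_matrix * quantizer_scale / 8.0

# Example matrix that quantizes high frequencies more coarsely than low ones
# (assumption: NOT the standard's default intra matrix).
u, v = np.meshgrid(np.arange(8), np.arange(8))
quant_matrix = 8 + 2 * (u + v)

dct_block = np.random.randn(8, 8) * 100
levels = quantize(dct_block, quant_matrix, quantizer_scale=4)
recon = dequantize(levels, quant_matrix, quantizer_scale=4)
print("non-zero coefficients after quantization:", np.count_nonzero(levels))
print("max reconstruction error:", np.abs(dct_block - recon).max())
```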
This is also the primary source of most MPEG-1 video compression artifacts,
like blockiness, color banding, noise, ringing and discoloration. These appear when
video is encoded with an insufficient bitrate, and the encoder is therefore forced to
use high frame-level quantizers (strong quantization) through much of the video.
The final coding steps are forms of entropy coding in the sense of information theory.
The coefficients of quantized DCT blocks tend to zero towards the bottom-right.
Maximum compression can be achieved by a zig-zag scanning of the DCT block
starting from the top left and using Run-length encoding techniques.
The DC coefficients and motion vectors are DPCM-encoded.
Run-length encoding (RLE) is a very simple method of compressing repetition. A
sequential string of characters, no matter how long, can be replaced with a few bytes,
noting the value that repeats, and how many times. For example, if someone were to
say "five nines", you would know they mean the number: 99999.
RLE is particularly effective after quantization, as a significant number of the AC
coefficients are now zero (called sparse data), and can be represented with just a
couple of bytes. This is stored in a special 2-dimensional Huffman table that codes
the run-length and the run-ending character.
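A small sketch of the zig-zag scan followed by (run, level) coding; the scan order is computed here rather than taken from the standard's table, and the plain (run, level) pairs stand in for the actual VLC tables:

```python
import numpy as np

def zigzag_order(n=8):
    """Return (row, col) index pairs in zig-zag order for an n x n block."""
    idx = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal; alternate traversal direction on odd/even diagonals.
    return sorted(idx, key=lambda rc: (rc[0] + rc[1],
                                       rc[1] if (rc[0] + rc[1]) % 2 == 0 else -rc[1]))

def run_length_encode(block):
    """Zig-zag scan a quantized 8x8 block and emit (run-of-zeros, level) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, int(level)))
            run = 0
    pairs.append("EOB")            # end-of-block marker once only zeros remain
    return pairs

block = np.zeros((8, 8), dtype=int)
block[0, 0], block[0, 1], block[2, 0] = 52, -3, 7     # sparse data typical after quantization
print(run_length_encode(block))   # [(0, 52), (0, -3), (1, 7), 'EOB']
```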
Huffman Coding is a very popular method of entropy coding, and used in MPEG-1
video to reduce the data size. The data is analyzed to find strings that repeat often.
Those strings are then put into a special table, with the most frequently repeating
data assigned the shortest code. This keeps the data as small as possible with this
form of compression.
Once the table is constructed, those strings in the data are
replaced with their (much smaller) codes, which reference the appropriate entry in
the table. The decoder simply reverses this process to produce the original data.
This is the final step in the video encoding process, so the result of Huffman
coding is known as the MPEG-1 video "bitstream".
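A compact illustration of Huffman coding over symbol frequencies; note that MPEG-1 uses fixed, pre-computed code tables rather than building one per stream, so this only demonstrates the principle:

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a prefix code where frequent symbols get the shortest bit strings."""
    counts = Counter(data)
    if len(counts) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: [weight, tie_breaker, {symbol: partial_code}]
    heap = [[w, i, {sym: ""}] for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, lo = heapq.heappop(heap)        # the two least frequent subtrees
        w2, _, hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo.items()}
        merged.update({s: "1" + c for s, c in hi.items()})
        heapq.heappush(heap, [w1 + w2, tie, merged])
        tie += 1
    return heap[0][2]

data = "AAAAABBBCCD"
table = huffman_code(data)
encoded = "".join(table[s] for s in data)
print(table)                                   # e.g. {'A': '0', 'B': '10', 'D': '110', 'C': '111'}
print(len(encoded), "bits vs", 8 * len(data), "bits uncompressed")
```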
I-frame-only sequences give the least compression, but are
useful for random access, FF/FR and editability. I- and P-
frame sequences give moderate compression but add a
certain degree of random access and FF/FR functionality. I-, P-
and B-frame sequences give very high compression but also
increase the coding/decoding delay significantly. Such
configurations are therefore not suited for video-
telephony or video-conferencing applications.
The typical data rate of an I-frame is 1 bit per pixel, while
that of a P-frame is 0.1 bit per pixel and that of a B-frame is
0.015 bit per pixel.
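Using the per-frame figures quoted above, a quick estimate of the average video bitrate for a common 12-picture GOP at SIF resolution (the bits-per-pixel values are the rough averages from this slide, not guarantees of the standard):

```python
# Rough bitrate estimate from the per-frame figures above.
bits_per_pixel = {"I": 1.0, "P": 0.1, "B": 0.015}
gop = "IBBPBBPBBPBB"                  # a common 12-picture GOP structure
width, height, fps = 352, 288, 25     # SIF (625-line) resolution

bits_per_gop = sum(bits_per_pixel[f] * width * height for f in gop)
avg_bitrate = bits_per_gop * fps / len(gop)
print(f"average ~{avg_bitrate / 1e6:.2f} Mbit/s")   # roughly 0.3 Mbit/s for picture data alone
```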
MPEG: Video Encoding
[Encoder block diagram: Input → Pre-processing → Frame Memory → subtractor (+/−) → DCT → Quantizer (Q) → VLC Encoder → Buffer → Output, with a Regulator adjusting Q from buffer fullness; a reconstruction path (inverse quantizer Q⁻¹ → IDCT → adder → Frame Memory) feeds Motion Estimation and Motion Compensation, which supply the predictive frame and the motion vectors.]
Interframe predictive coding (P-pictures)
For each macroblock the motion estimator produces the
best matching macroblock
The two macroblocks are subtracted and the difference is
DCT coded
Interframe interpolative coding (B-pictures)
The motion vector estimation is performed twice, once against the
previous and once against the following reference picture
The encoder forms a prediction error macroblock from either
prediction or from their average
The prediction error is encoded using a block-based DCT
The encoder needs to reorder pictures, because a B-picture can only
be coded and decoded after the later reference picture it depends on
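A sketch of that reordering, under the simplifying assumptions of a closed GOP and every I/P picture acting as a reference:

```python
def display_to_coded_order(display_order):
    """Reorder pictures so every B-picture follows both of its reference pictures.

    Simplified model: each I/P picture is emitted before the run of B-pictures
    that immediately precedes it in display order.
    """
    coded, pending_b = [], []
    for pic in display_order:
        if pic.startswith("B"):
            pending_b.append(pic)          # hold B-pictures until the next reference arrives
        else:
            coded.append(pic)              # emit the reference first...
            coded.extend(pending_b)        # ...then the B-pictures that depend on it
            pending_b = []
    coded.extend(pending_b)                # trailing B-pictures would need the next GOP
    return coded

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(display_to_coded_order(display))     # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```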
MPEG-1 videos are most commonly seen
using Source Input Format (SIF) resolution: 352x240,
352x288, or 320x240. These low resolutions,
combined with a bitrate less than 1.5 Mbit/s, make
up what is known as a constrained parameters
bitstream (CPB), later renamed the "Low Level" (LL)
profile in MPEG-2. These are the minimum video
specifications any decoder should be able to handle
to be considered MPEG-1 compliant.
Audio Layer
Layer I: for applications that require both low-complexity decoding and
encoding. Layer II: higher compression efficiency for a slightly higher
complexity.
Layer II/MP2 is a time-domain encoder. It uses a low-delay 32-sub-
band polyphase filter bank for time-frequency mapping, with
overlapping ranges (i.e. polyphase) to prevent aliasing.
The 32 sub-band filter bank returns 32 amplitude coefficients, one for each
equal-sized frequency band/segment of the audio, which is about 700 Hz
wide (depending on the audio's sampling frequency). The encoder then
utilizes the psychoacoustic model to determine which sub-bands contain
audio information that is less important, and so, where quantization will
be inaudible.
Layer II can also optionally use intensity stereo coding, a form of joint stereo. This
means that the frequencies above 6 kHz of both channels are
combined/down-mixed into one single (mono) channel, but the "side
channel" information on the relative intensity (volume, amplitude) of each
channel is preserved and encoded into the bitstream separately. On
playback, the single channel is played through left and right speakers, with
the intensity information applied to each channel to give the illusion of
stereo sound.
This perceptual trick is known as stereo irrelevancy.
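A toy illustration of the intensity-stereo idea, working on per-band amplitudes; real Layer II operates on sub-band samples and scalefactors, and the band width and 6 kHz cut-off below are simplifying assumptions:

```python
import numpy as np

def intensity_stereo_encode(left, right, cutoff_band):
    """Above the cutoff, keep one mono signal per band plus per-channel intensity factors."""
    mono = (left + right) / 2.0
    eps = 1e-12
    scale_l = np.abs(left[cutoff_band:]) / (np.abs(mono[cutoff_band:]) + eps)
    scale_r = np.abs(right[cutoff_band:]) / (np.abs(mono[cutoff_band:]) + eps)
    return {"low_l": left[:cutoff_band], "low_r": right[:cutoff_band],
            "high_mono": mono[cutoff_band:], "scale_l": scale_l, "scale_r": scale_r}

def intensity_stereo_decode(s):
    """Play the single high-band signal through both channels, scaled by intensity."""
    left = np.concatenate([s["low_l"], s["high_mono"] * s["scale_l"]])
    right = np.concatenate([s["low_r"], s["high_mono"] * s["scale_r"]])
    return left, right

# 32 per-band amplitudes per channel; bands from index 8 (~6 kHz at 48 kHz, 750 Hz/band)
# are combined.  High bands keep only their intensity, not the original waveform.
left = np.random.rand(32)
right = np.random.rand(32) * 0.5
l2, r2 = intensity_stereo_decode(intensity_stereo_encode(left, right, cutoff_band=8))
print(np.allclose(left[:8], l2[:8]), np.allclose(right[:8], r2[:8]))  # low bands exact
```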
MP3 is a frequency-domain audio transform encoder.
It has worse temporal resolution than Layer II, which causes quantization
artifacts when transient sounds such as percussive events and other
high-frequency events spread over a larger window, resulting in
audible smearing and pre-echo.
Being forced to use a hybrid time-domain (filter bank) / frequency-domain
(MDCT) model to fit in with Layer II wastes processing
time and compromises quality by introducing aliasing artifacts.
MP3 can use middle/side (mid/side, m/s, MS, matrixed) joint
stereo. With mid/side stereo, certain frequency ranges of both
channels are merged into a single (middle, mid, L+R) mono
channel, while the sound difference between the left and right
channels is stored as a separate (side, L-R) channel. Unlike
intensity stereo, this process does not discard any audio
information.
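Mid/side stereo itself is a simple, lossless transform; a minimal sketch:

```python
import numpy as np

def ms_encode(left, right):
    """Mid/side: mid = (L+R)/2 carries the common content, side = (L-R)/2 the difference."""
    return (left + right) / 2.0, (left - right) / 2.0

def ms_decode(mid, side):
    return mid + side, mid - side            # exactly recovers L and R

left = np.random.randn(1024)
right = 0.8 * left + 0.1 * np.random.randn(1024)   # typical correlated stereo content
mid, side = ms_encode(left, right)
l2, r2 = ms_decode(mid, side)
print(np.allclose(left, l2) and np.allclose(right, r2))   # True: no audio information lost
print(f"side-channel energy is {np.sum(side**2) / np.sum(mid**2):.1%} of mid energy")
```

For typical stereo material the side channel carries far less energy than the mid channel, which is why coding mid and side separately is cheaper than coding left and right.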
MPEG 2
ISO/IEC 13818
Generic coding of moving pictures and associated audio
MPEG-2 Systems defines two container formats. One is the transport stream, a data packet
format designed to transmit one data packet in four ATM data
packets for streaming digital video and audio over fixed or
mobile transmission media, where the beginning and the
end of the stream may not be identified.
The other is the program stream, an extended version of
the MPEG-1 container format designed for random access
storage mediums such as hard disk drives, optical
discs and flash memory.
Applications include Standard Definition and High Definition television
broadcasting over terrestrial, satellite and cable networks, and
optical disc, specifically DVD for movie distribution.
Systems Layer
The Video section, part 2 of MPEG-2, is similar to the
previous MPEG-1 standard, but also provides support
for interlaced video, the format used by analog broadcast
TV systems. MPEG-2 video is not optimized for low bit-
rates, especially less than 1 Mbit/s at standard
definition resolutions. All standards-compliant MPEG-2
Video decoders are fully capable of playing back MPEG-1
Video streams conforming to the Constrained Parameters
Bitstream syntax. MPEG-2/Video is formally known as
ISO/IEC 13818-2 and as ITU-T Rec. H.262.
With some enhancements, MPEG-2 Video and Systems
are also used in some HDTV transmission systems.
Video Layer
Interlaced and non-interlaced frame
Different color subsampling modes e.g., 4:2:2, 4:2:0, 4:4:4
Flexible quantization schemes that can be changed at the
picture level
Scalable bit-streams
Profiles and levels
A number of levels and profiles have been defined for
MPEG-2 video compression. Each of these describes
a useful subset of the total functionality offered by
the MPEG-2 standards. An MPEG-2 system is
usually developed for a certain set of profiles at a
certain level. Basically:
Profile = quality of the video
Level = resolution of the video
Levels vs. Profiles (columns: SNR 4:2:0 | Spatial 4:2:0 | High 4:2:0, 4:2:2 | Multiview 4:2:0)

High level
  Enhancement: 1920 x 1152/60 | 1920 x 1152/60
  Lower: 960 x 576/30 | 1920 x 1152/60
  Bitrate (Mbit/s): 100, 80, 25 | 130, 50, 80
High-1440 level
  Enhancement: 1440 x 1152/60 | 1440 x 1152/60 | 1920 x 1152/60
  Lower: 720 x 576/30 | 720 x 576/30 | 1920 x 1152/60
  Bitrate (Mbit/s): 60, 40, 15 | 80, 60, 20 | 100, 40, 60
Main level
  Enhancement: 720 x 576/30 | 720 x 576/30 | 720 x 576/30
  Lower: 352 x 288/30 | 720 x 576/30
  Bitrate (Mbit/s): 15, 10 | 20, 15, 4 | 25, 10, 15
Low level
  Enhancement: 352 x 288/30 | 352 x 288/30
  Lower: 352 x 288/30
  Bitrate (Mbit/s): 4, 3 | 8, 4, 4
Multiview Profile
Stereoscopic view disparity prediction
Virtual walk-throughs composed from multiple
viewpoints
Supporting Interlaced
Video
MPEG-2 must support interlaced video as well since this
is one of the options for digital broadcast TV and HDTV
In interlaced video each frame consists of two fields,
referred to as the top-field and the bottom-field
In a Frame-picture, all scanlines from both fields are
interleaved to form a single frame, then divided into 16 x 16
macroblocks and coded using motion compensation (MC)
If each field is treated as a separate picture, then it is called
Field-picture
MPEG-2 defines Frame Prediction and Field Prediction, as
well as five prediction modes
Fig. 11.6: Field pictures and field prediction for field pictures in MPEG-2.
(a) Frame-picture vs. Field-pictures, (b) Field Prediction for Field-pictures
[Figure: Zigzag and alternate scans of DCT coefficients for progressive and
interlaced video in MPEG-2.]
MPEG-2 layered coding
The MPEG-2 scalable coding: A base layer and one or
more enhancement layers can be defined
The base layer can be independently encoded, transmitted
and decoded to obtain basic video quality
The encoding and decoding of the enhancement layer is
dependent on the base layer or the previous enhancement
layer
Scalable coding is especially useful for MPEG-2 video
transmitted over networks with the following characteristics:
Networks with very different bit-rates
Networks with variable bit rate (VBR) channels
Networks with noisy connections
MPEG-2 Scalabilities
MPEG-2 supports the following scalabilities:
1. SNR scalability: the enhancement layer provides higher
SNR
2. Spatial scalability: the enhancement layer provides
higher spatial resolution
3. Temporal scalability: the enhancement layer facilitates a
higher frame rate
4. Hybrid scalability: a combination of any two of the
above three scalabilities
5. Data partitioning: quantized DCT coefficients are
split into partitions
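As a rough picture of how layered coding works, a toy spatial-scalability split: a downsampled base layer plus an enhancement-layer residual against the up-sampled base. This is purely illustrative; MPEG-2 defines specific up-sampling filters and codes both layers with the normal DCT pipeline:

```python
import numpy as np

def split_spatial_layers(frame):
    """Base layer = 2x downsampled frame; enhancement = residual vs. the upsampled base."""
    base = frame[::2, ::2]                                         # crude 2x decimation
    upsampled = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)   # crude 2x interpolation
    enhancement = frame - upsampled                                # what the enhancement layer must code
    return base, enhancement

def reconstruct(base, enhancement=None):
    upsampled = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)
    return upsampled if enhancement is None else upsampled + enhancement

frame = np.random.randint(0, 256, (288, 352)).astype(int)
base, enh = split_spatial_layers(frame)
print(reconstruct(base).shape)                        # basic quality from the base layer alone
print(np.array_equal(reconstruct(base, enh), frame))  # True once the enhancement layer is added
```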
AAC is a multi-channel perceptual audio coder,
appropriate for applications involving storage or transmission of mono,
stereo or multi-channel music or other audio signals where quality of the
reconstructed audio is paramount.
AAC achieves coding gain primarily through three strategies. First, it uses a
high-resolution transform (1024 frequency bins) to achieve redundancy
removal: the invertible removal of information based on purely
statistical properties of a signal. Second, it uses a continuously signal-
adaptive model of the human auditory system to determine a threshold for
the perception of quantization noise and thereby achieve irrelevancy
reduction: the irretrievable removal of information based on the fact
that it is not perceivable. Third, entropy coding is used to match the actual
entropy of the quantized values with the entropy of their representation in
the bitstream. Additionally, AAC provides tools for the joint coding of
stereo signals and other coding tools for special classes of signals.
AAC is the default or standard audio format
for YouTube, iPhone, iPod, iPad
Audio Layer: Advanced
Audio Coding
MPEG 4
ISO/IEC 14496
Coding of audio-visual objects
Huge standard with 31 parts.
Targets multimedia for the fixed and mobile web.
Features:
Efficient across a variety of bit-rates ranging from a few kilobits per second to tens
of megabits per second. MPEG-4 provides the following functions:
Improved coding efficiency over MPEG-2
Ability to encode mixed media data (video, audio, speech)
Error resilience to enable robust transmission
Ability to interact with the audio-visual scene generated at the receiver
Subsets of the MPEG-4 tool sets have been provided for use in specific
applications. These subsets, called 'Profiles', limit the size of the tool set a
decoder is required to implement. In order to restrict computational
complexity, one or more 'Levels' are set for each Profile. A Profile and
Level combination allows:
A codec builder to implement only the subset of the standard needed, while
maintaining interworking with other MPEG-4 devices that implement the
same combination
Systems Layer:
The synchronized delivery of streaming information from source to destination,
exploiting different QoS as available from the network, is specified in terms of the
synchronization layer and a delivery layer containing a two-layer multiplexer, as
depicted in Figure 2.
The first multiplexing layer is managed according to the DMIF specification, part 6 of
the MPEG-4 standard. (DMIF stands for Delivery Multimedia Integration Framework)
This multiplex may be embodied by the MPEG-defined FlexMux tool, which allows
grouping of Elementary Streams (ESs) with a low multiplexing overhead.
Multiplexing at this layer may be used, for example, to group ESs with similar QoS
requirements, reduce the number of network connections or the end-to-end delay.
The TransMux (Transport Multiplexing) layer in Figure 2 models the layer that
offers transport services matching the requested QoS. Only the interface to this layer
is specified by MPEG-4 while the concrete mapping of the data packets and control
signaling must be done in collaboration with the bodies that have jurisdiction over the
respective transport protocol. Any suitable existing transport protocol stack such as
(RTP)/UDP/IP, (AAL5)/ATM, or MPEG-2's Transport Stream over a suitable link
layer may become a specific TransMux instance. The choice is left to the end
user/service provider, and allows MPEG-4 to be used in a wide variety of operating
environments.
[Figure 2: The MPEG-4 delivery architecture. Elementary Streams cross the Elementary Stream Interface into the Sync Layer (SL), which produces SL-packetized streams; the DMIF Layer (FlexMux, below the DMIF Application Interface) groups them into FlexMux streams and channels; the TransMux Layer (reached through the DMIF Network Interface, not specified in MPEG-4) carries TransMux streams and channels over transports such as (RTP)/UDP/IP, (PES)/MPEG-2 TS, AAL2/ATM, H.223/PSTN or DAB mux, for file, broadcast and interactive delivery.]
The systems part of the MPEG-4 addresses the description of the
relationship between the audio-visual components that constitute a
scene. The relationship is described at two main levels:
The Binary Format for Scenes (BIFS) describes the spatio-temporal
arrangements of the objects in the scene. Viewers may have the
possibility of interacting with the objects, e.g. by rearranging them on
the scene or by changing their own point of view in a 3D virtual
environment. The scene description provides a rich set of nodes for 2-
D and 3-D composition operators and graphics primitives.
At a lower level, Object Descriptors (ODs) define the relationship
between the Elementary Streams pertinent to each object (e.g. the audio
and the video stream of a participant in a videoconference). ODs also
provide additional information such as the URL needed to access the
Elementary Streams, the characteristics of the decoders needed to parse
them, intellectual property and others.
Other issues addressed by MPEG-4 Systems:
A standard file format supports the exchange and authoring of MPEG-4 content
Interactivity, including: client and server-based interaction; a general event model for triggering
events or routing user actions; general event handling and routing between objects in the scene,
upon user or scene triggered events.
Java (MPEG-J) is used to query the terminal and its environment support, and there is also a
Java application engine to code 'MPEGlets'.
A tool for interleaving of multiple streams into a single stream, including timing information
(FlexMux tool).
A tool for storing MPEG-4 data in a file (the MPEG-4 File Format, MP4)
Interfaces to various aspects of the terminal and networks, in the form of Java APIs (MPEG-J)
Transport layer independence. Mappings to relevant transport protocol stacks, like (RTP)/UDP/IP
or MPEG-2 transport stream can be or are being defined jointly with the responsible standardization
bodies.
Text representation with international language support, font and font style selection, timing and
synchronization.
The initialization and continuous management of the receiving terminal's buffers.
Timing identification, synchronization and recovery mechanisms.
Datasets covering identification of Intellectual Property Rights relating to media objects.
DAI (DMIF Application Interface): interface between the
demultiplexer and the decoding buffer
ESI (Elementary Stream Interface): interface between the
decoding buffer and the decoder.
The DAI provides a series of packets, called SL packets, to the
decoding buffer of each elementary stream. An SL packet contains
one full access unit, or a fragment of one, as its payload, and also
carries the timing information of the payload for decoding and
composition in a header. Each access unit remains in the
decoding buffer until its decoding time arrives and then produces
a composition unit as a result of decoding, which remains in
the composition memory until the composition time arrives. By
using this conceptual model, the sender can guarantee that the stream
does not break the receiving terminal by causing overflow or
underflow of the decoding buffer or composition memory.
Terminal architecture
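A highly simplified sketch of that conceptual buffer model: access units wait in the decoding buffer until their decoding time, and the resulting composition units wait in composition memory until their composition time. The class and function names are illustrative, not the normative System Decoder Model:

```python
from dataclasses import dataclass

@dataclass
class AccessUnit:
    data: bytes
    decoding_time: float       # decoding timestamp carried in the SL packet header
    composition_time: float    # composition timestamp carried in the SL packet header

def run_decoder_model(access_units, clock_ticks):
    """Drain a decoding buffer and a composition memory against a shared clock."""
    decoding_buffer = sorted(access_units, key=lambda au: au.decoding_time)
    composition_memory, presented = [], []
    for now in clock_ticks:
        while decoding_buffer and decoding_buffer[0].decoding_time <= now:
            au = decoding_buffer.pop(0)            # "decode": AU becomes a composition unit
            composition_memory.append(au)
        while composition_memory and composition_memory[0].composition_time <= now:
            presented.append((now, composition_memory.pop(0)))
    return presented

aus = [AccessUnit(b"frame0", 0.0, 0.04), AccessUnit(b"frame1", 0.04, 0.08)]
for t, au in run_decoder_model(aus, clock_ticks=[0.0, 0.04, 0.08]):
    print(f"t={t:.2f}s present {au.data!r}")
```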
Audio
General Audio (transform coding techniques): from 6 kbit/s with
bandwidth below 4 kHz up to broadcast-quality audio, from mono up to
multichannel; low delays; Fine Granularity Scalability (FGS) with
scalability resolution down to 1 kbit/s per channel.
Speech signals: 2 kbit/s up to 24 kbit/s using the speech coding tools;
lower bitrates are possible with variable-rate coding. HVXC tools allow
speed and pitch to be modified under user control during playback;
CELP tools allow change of the playback speed.
Synthetic Audio: MPEG-4 Structured Audio is a language to describe
'instruments' (little programs that generate sound) and 'scores' (input
that drives those objects). These objects are not necessarily musical
instruments; they are in essence mathematical formulae that could
generate the sound of a piano, of falling water, or something 'unheard'
in nature.
Synthesized Speech: scalable Text-to-Speech (TTS) coders at 200 bit/s
to 1.2 kbit/s allow a text, or a text with prosodic parameters (pitch
contour, phoneme duration, and so on), as input to generate
intelligible synthetic speech.
Formats Supported
The following formats and bitrates are supported by MPEG-4
Visual:
Bitrates: typically between 5 kbit/s and more than 1 Gbit/s
Formats: progressive as well as interlaced video
Resolutions: typically from sub-QCIF to 'Studio' resolutions (4k x 4k
pixels)
Compression Efficiency
For all bit rates addressed, the algorithms are very efficient. This
includes the compact coding of textures with a quality adjustable
between "acceptable" for very high compression ratios up to "near
lossless".
Efficient compression of textures for texture mapping on 2-D and 3-D
meshes.
Random access of video to allow functionalities such as pause, fast
forward and fast reverse of stored video
Video
Content-Based Functionalities
Content-based coding of images and video allows separate
decoding and reconstruction of arbitrarily shaped video
objects.
Random access of content in video sequences allows
functionalities such as pause, fast forward and fast reverse
of stored video objects.
Extended manipulation of content in video sequences allows
functionalities such as warping of synthetic or natural text,
textures, image and video overlays on reconstructed video
content. An example is the mapping of text in front of a
moving video object where the text moves coherently with
the object.
Scalability of Textures, Images and Video
Complexity scalability in the encoder allows encoders of different complexity to
generate valid and meaningful bitstreams for a given texture, image or video.
Complexity scalability in the decoder allows a given texture, image or video
bitstream to be decoded by decoders of different levels of complexity. The
reconstructed quality, in general, is related to the complexity of the decoder
used. This may entail that less powerful decoders decode only a part of the
bitstream.
Spatial scalability allows decoders to decode a subset of the total bitstream
generated by the encoder to reconstruct and display textures, images and video
objects at reduced spatial resolution. A maximum of 11 levels of spatial
scalability are supported in so-called 'fine-granularity scalability', for video as
well as textures and still images.
Temporal scalability allows decoders to decode a subset of the total bitstream
generated by the encoder to reconstruct and display video at reduced temporal
resolution. A maximum of three levels are supported.
Quality scalability allows a bitstream to be parsed into a number of bitstream
layers of different bitrate such that the combination of a subset of the layers can
still be decoded into a meaningful signal. The bitstream parsing can occur either
during transmission or in the decoder. The reconstructed quality, in general, is
related to the number of layers used for decoding and reconstruction.
Fine Grain Scalability: a combination of the above in fine-grain steps, up to 11
steps
Shape and Alpha Channel Coding
Shape coding assists the description and composition of conventional
images and video as well as arbitrarily shaped video objects.
Applications that benefit from binary shape maps with images are
content-based image representations for image databases, interactive
games, surveillance, and animation. There is an efficient technique to
code binary shapes. A binary alpha map defines whether or not a pixel
belongs to an object. It can be on or off.
Gray Scale or alpha Shape Coding
An alpha plane defines the transparency of an object, which is not
necessarily uniform; it can vary over the object, so that, e.g., edges are
more transparent (a technique called feathering). Multilevel alpha
maps are frequently used to blend different layers of image sequences.
Other applications that benefit from associated binary alpha maps
with images are content-based image representations for image
databases, interactive games, surveillance, and animation.
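The blending an alpha plane enables can be sketched generically (this is plain alpha compositing, not MPEG-4's shape-coding syntax):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Blend an arbitrarily shaped object over a background using its alpha plane.

    alpha is in [0, 1]: 0 = fully transparent, 1 = fully opaque; intermediate
    values (e.g. feathered edges) mix the two layers.
    """
    return alpha * foreground + (1.0 - alpha) * background

fg = np.full((4, 4), 200.0)          # object pixels
bg = np.full((4, 4), 50.0)           # scene behind it
alpha = np.zeros((4, 4))
alpha[1:3, 1:3] = 1.0                # binary core of the object
alpha[0, :] = 0.5                    # a feathered (semi-transparent) edge row
print(composite(fg, bg, alpha))
```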
Coding of 2-D Meshes with Implicit Structure
2D mesh coding includes:
Mesh-based prediction and animated texture transfiguration
2-D Delaunay or regular mesh formalism with motion tracking of animated
objects
Motion prediction and suspended texture transmission with dynamic meshes.
Geometry compression for motion vectors:
2-D mesh compression with implicit structure & decoder reconstruction
Coding of 3-D Polygonal Meshes
MPEG-4 provides a suite of tools for coding 3-D polygonal meshes. Polygonal
meshes are widely used as a generic representation of 3-D objects. The
underlying technologies compress the connectivity, geometry, and properties
such as shading normals, colors and texture coordinates of 3-D polygonal
meshes.
The Animation Framework eXtension (AFX, see further down) will provide
more elaborate tools for 2D and 3D synthetic objects.
MPEG-7
A suite of standards for description and search of audio,
visual and multimedia content.
MPEG-21
A suite of standards that defines a normative open framework
for end-to-end multimedia creation, delivery and
consumption, providing content creators, producers,
distributors and service providers with equal opportunities
in the MPEG-21-enabled open market, and also benefiting
content consumers by giving them access to a
large variety of content in an interoperable manner.
MPEG-A
A suite of standards specifying application formats that
involve multiple MPEG and, where required, non-MPEG
standards
MPEG-B
A suite of standards for systems technologies that do
not fall in other well-established MPEG standards
MPEG-C
A suite of video standards that do not fall in other well-
established MPEG standards
MPEG-D
A suite of standards for Audio technologies that do not fall in
other MPEG standards
MPEG-E
A standard for an Application Programming Interface (API) of
Multimedia Middleware (M3W) that can be used to provide a
uniform view of an interoperable multimedia middleware
platform.
MPEG-V
MPEG-V outlines an architecture and specifies associated
information representations to enable interoperability between
virtual worlds (e.g., digital content providers of virtual worlds,
gaming, simulation), and between real and virtual worlds (e.g.,
sensors, actuators, vision and rendering, robotics).
MPEG-M
MPEG-M is a suite of standards to enable the easy design and
implementation of media-handling value chains whose devices
interoperate because they are all based on the same set of
technologies, especially MPEG technologies accessible from the
middleware and multimedia services
MPEG-U
MPEG-U provides a general-purpose technology with innovative
functionality that enables its use in heterogeneous scenarios such as
broadcast, mobile, home network and web domains.
MPEG-H
Suite of standards for heterogeneous environment delivery of
audio-visual information compressed with high efficiency
MPEG-DASH
DASH is a suite of standards providing a solution for the efficient
and easy streaming of multimedia using existing available HTTP
infrastructure (particularly servers and CDNs, but also proxies,
caches, etc.).