Concepts of Multimedia Processing and Transmission
IT 481, Lecture 6
Dennis McCaughey, Ph.D.
26 February, 2007
Conventional Audio Signal Format
Parameter Value
Channels 2 (stereo)
Parameter | DVD-Audio | DVD-Video
Encoding methods (optional) | none | MPEG Audio, DTS, SDDS
Audio specifications for Linear PCM and Packed PCM encoding schemes
Parameter | DVD-Audio | DVD-Video
Sampling frequency | 48/96/192 kHz, 44.1/88.2/176.4 kHz | 48/96 kHz
Quantization depth | 16/20/24 bits | 16/20/24 bits
Maximum number of channels | 6ch (fs: 48/96/44.1/88.2 kHz) or 2ch (fs: 192/176.4 kHz) | 8ch (2ch for Stereo + 6ch for Multi channel)
Maximum bit rate | 9.6 Mbps (Linear PCM / Packed PCM) | 6.144 Mbps (Linear PCM)
Frame rate | 1200 Hz (fs: 48/96/192 kHz), 1102.5 Hz (fs: 44.1/88.2/176.4 kHz) | 600 Hz (fs: 48/96 kHz)
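The maximum bit rates in the table can be checked against the raw Linear PCM rate, which is channels x sampling frequency x bits per sample. The following is a minimal sketch of that arithmetic; the function name `pcm_bit_rate_mbps` is illustrative and does not come from the slides.

```python
def pcm_bit_rate_mbps(channels, fs_hz, bits):
    """Raw (uncompressed) Linear PCM bit rate in Mbps."""
    return channels * fs_hz * bits / 1e6

# 2 channels at 192 kHz / 24 bits stays under the 9.6 Mbps limit
print(pcm_bit_rate_mbps(2, 192_000, 24))   # 9.216
# 6 channels at 96 kHz / 24 bits exceeds 9.6 Mbps as raw Linear PCM
print(pcm_bit_rate_mbps(6, 96_000, 24))    # 13.824
# 8 channels at 48 kHz / 16 bits lands exactly on the 6.144 Mbps limit
print(pcm_bit_rate_mbps(8, 48_000, 16))    # 6.144
```

A configuration whose raw rate exceeds the limit, such as the second case, would need lossless packing (Packed PCM), fewer channels, or fewer bits per sample to fit.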
Dennis Mccaughey, IT 481, Spring 2007 02/26/2007
Dynamic Range of CD and DVD
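A common rule of thumb for the dynamic range of uniformly quantized PCM is about 6.02 dB per bit, plus 1.76 dB for a full-scale sine wave. The sketch below applies it to the 16-bit depth of CD audio and the 20- and 24-bit depths available on DVD. The formula is a standard approximation and is not taken from the slide itself.

```python
def dynamic_range_db(bits):
    """Approximate dynamic range (SQNR for a full-scale sine) of N-bit uniform PCM."""
    return 6.02 * bits + 1.76

for bits in (16, 20, 24):
    print(bits, "bits ->", round(dynamic_range_db(bits), 2), "dB")
```

Each extra bit halves the quantization step, which is where the roughly 6 dB per bit comes from.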
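[Figure: encoder block diagram with a 1-bit quantizer. s(t) -> Sampler -> s(n) -> summing junction (+, -) -> e(n) -> 1-Bit Quantizer; feedback path: summing junction (+, +) with 1 Sample Delay.]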
Delta-Mod Decoder
[Figure: decoder block diagram. e(n) -> summing junction (+, +) with 1 Sample Delay feedback -> s(n) -> Reconstruction Filter -> s(t).]
Delta Modulation - example
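As a rough illustration of delta modulation, the sketch below encodes a sine wave with one bit per sample and rebuilds the staircase approximation at the decoder. The step size, sampling rate, and test tone are assumptions chosen so that the step exceeds the largest sample-to-sample change (no slope overload). The decoder's reconstruction (low-pass) filter is omitted.

```python
import math

def dm_encode(samples, step):
    """1-bit delta modulation: send +1 if the input is at or above the
    running staircase approximation, otherwise -1."""
    bits, approx = [], 0.0
    for s in samples:
        bit = 1 if s >= approx else -1   # 1-bit quantizer applied to e(n)
        approx += bit * step             # accumulator fed back via the 1-sample delay
        bits.append(bit)
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase by accumulating the received steps."""
    out, approx = [], 0.0
    for bit in bits:
        approx += bit * step
        out.append(approx)
    return out

fs, f0, step = 8000, 100, 0.1            # assumed values for illustration
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(160)]
bits = dm_encode(x, step)
y = dm_decode(bits, step)
max_err = max(abs(a - b) for a, b in zip(x, y))
print(f"max reconstruction error: {max_err:.3f} (step = {step})")
```

With these values the reconstruction error stays within one step. A smaller step reduces granular noise but risks slope overload when the signal changes faster than one step per sample.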
Delta Modulation Variants
[Figure: s(t) -> Sampler -> summing junction (+, -) -> e(n) -> Quantizer -> q(n); feedback path: summing junction (+, +) with 1 Sample Delay.]
G.721 Adaptive Differential Pulse Code Modulation (ADPCM)
[Figure: s(t) -> Sampler -> s(n) -> summing junction (+, -) -> e(n) -> Quantizer -> Coder; feedback path: summing junction (+, +), 1 Sample Delay, predictor gain a.]
DPCM Decoder (Simplified)
[Figure: Decoder -> e(n) -> summing junction (+, +) -> s(n) -> Reconstruction Filter -> s(t); feedback path: 1 Sample Delay with predictor gain a.]
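The first-order DPCM loop can be sketched as follows: the encoder quantizes the difference between each sample and a scaled copy of the previous reconstructed sample, and the decoder runs the same predictor to rebuild the signal. The predictor gain `a = 0.9`, the uniform step size, and the test tone are assumptions for illustration, and the reconstruction filter is omitted.

```python
import math

def dpcm_encode(samples, a=0.9, step=0.05):
    """Quantize the prediction error e(n) = s(n) - a * s_rec(n-1)."""
    codes, prev = [], 0.0
    for s in samples:
        pred = a * prev
        code = round((s - pred) / step)   # uniform quantizer index
        prev = pred + code * step         # local decoder: track what the receiver will see
        codes.append(code)
    return codes

def dpcm_decode(codes, a=0.9, step=0.05):
    """Add each dequantized error to the prediction from the previous output."""
    out, prev = [], 0.0
    for code in codes:
        prev = a * prev + code * step
        out.append(prev)
    return out

x = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
codes = dpcm_encode(x)
y = dpcm_decode(codes)
print("largest code magnitude:", max(abs(c) for c in codes))
print("largest PCM index at the same step:", max(abs(round(s / 0.05)) for s in x))
print("max reconstruction error:", max(abs(p - q) for p, q in zip(x, y)))
```

Because the encoder predicts from its own reconstructed samples, quantization error does not accumulate and stays within half a step, while the transmitted codes span a much smaller range than the samples themselves.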
DPCM Encoder Schematic
DPCM Decoder Schematic
Increased Predictor Order
DPCM: Third Order Predictor Encoder
DPCM: Third Order Decoder Schematic
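To see why a higher predictor order helps, the sketch below compares the prediction-error energy of a first-order and a third-order predictor on a slowly varying tone. The coefficient sets are assumptions chosen for illustration (the third-order set extrapolates a quadratic trend); a real coder would choose or adapt its coefficients, and would predict from reconstructed rather than original samples.

```python
import math

def prediction_error_energy(samples, coeffs, start):
    """Energy of e(n) = s(n) - sum_k coeffs[k] * s(n-1-k) for n >= start."""
    energy = 0.0
    for n in range(start, len(samples)):
        pred = sum(c * samples[n - 1 - k] for k, c in enumerate(coeffs))
        energy += (samples[n] - pred) ** 2
    return energy

x = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
e1 = prediction_error_energy(x, [1.0], start=3)              # previous sample only
e3 = prediction_error_energy(x, [3.0, -3.0, 1.0], start=3)   # three previous samples
print(f"first-order error energy: {e1:.6f}")
print(f"third-order error energy: {e3:.6f}")
```

A smaller error signal needs fewer quantizer levels for the same quality, which is the motivation for increasing the predictor order.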
Sub-band Coding (SBC)
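The core idea of sub-band coding is to split the signal into frequency bands, each sampled at a reduced rate, so that every band can be coded separately. A minimal two-band sketch follows. It uses the two-tap Haar sum/difference pair as a stand-in for the longer quadrature mirror filters a real coder would use, so the band separation is crude, but the split is still perfectly invertible.

```python
def analysis(x):
    """Split x (even length) into half-rate low and high bands."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high

def synthesis(low, high):
    """Recombine the two bands into the full-rate signal."""
    x = []
    for l, h in zip(low, high):
        x.extend([l + h, l - h])
    return x

signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 4.0, 0.0]
low, high = analysis(signal)
print("low band: ", low)
print("high band:", high)
print("reconstructed:", synthesis(low, high))
```

In a sub-band coder each band would then be quantized with its own bit allocation, typically with more bits for the band that carries more perceptually important energy.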
G.722 Adaptive DPCM (ADPCM) Subband Decoder
Linear Predictive Coding
LPC Features
Perceptual
– Pitch:
Closely related to the frequency of the signal
Important since the ear is most sensitive in the frequency
range 2-5 kHz
– Period:
The duration of the signal
– Loudness:
The average energy in the signal
Vocal Tract Excitation Parameters
– Voiced Sounds: generated through the vocal cords, such
as those related to the letters m, v and l
– Unvoiced Sounds: the vocal cords are open, such as
those related to the letters f and s
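Two of the features listed above, loudness and pitch, can be estimated from a short frame of samples. The sketch below computes loudness as the average energy and estimates pitch from the strongest autocorrelation peak. Autocorrelation is one common pitch estimator and is an assumption here, not necessarily the method used in the lecture; the search range of 50 to 400 Hz is also an assumed value.

```python
import math

def loudness(frame):
    """Average energy of the frame."""
    return sum(s * s for s in frame) / len(frame)

def pitch_hz(frame, fs, fmin=50, fmax=400):
    """Pitch estimate from the strongest autocorrelation peak between fmin and fmax."""
    best_lag, best_val = None, float("-inf")
    for lag in range(fs // fmax, fs // fmin + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]  # 50 ms test tone
print("pitch (Hz):", pitch_hz(frame, fs))
print("loudness:", round(loudness(frame), 3))
```

A voiced frame typically shows a clear autocorrelation peak at the pitch period, while an unvoiced frame looks noise-like and has no dominant peak.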
Linear Predictive Coding (LPC) Signal Encoder
Linear Predictive Coding (LPC) Signal Decoder
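An LPC encoder sends vocal tract filter coefficients instead of waveform samples, and the decoder drives an all-pole filter with an excitation signal. The sketch below shows one standard way to obtain the coefficients (the autocorrelation method with the Levinson-Durbin recursion) and a matching synthesis filter. The slides do not state which analysis method is used, so treat the function names and the method as assumptions.

```python
def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] from autocorrelation r[0..order].
    The prediction is s_hat(n) = sum_k a[k] * s(n-k). Returns (coeffs, error energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                       # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc_synthesize(coeffs, excitation):
    """All-pole synthesis: s(n) = u(n) + sum_k coeffs[k] * s(n-1-k)."""
    out = []
    for n, u in enumerate(excitation):
        pred = sum(c * out[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(u + pred)
    return out

# Autocorrelation of a first-order process with coefficient 0.5
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, err)                           # [0.5, 0.0] 0.75
print(lpc_synthesize([0.5], [1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]
```

For voiced sounds the excitation would be a pulse train at the pitch period, and for unvoiced sounds a noise source.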
Perceptual Properties of the Ear: Sensitivity as a Function of Frequency
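The frequency dependence of the ear's sensitivity is often summarized by the threshold in quiet, the sound level below which a tone is inaudible. The sketch below evaluates a widely used closed-form approximation of that curve (attributed to Terhardt). The formula is not taken from the slide and is only an approximation for a typical listener.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL at frequency f_hz."""
    f = f_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

for f_hz in (100, 1000, 3300, 10000, 15000):
    print(f_hz, "Hz ->", round(threshold_in_quiet_db(f_hz), 1), "dB SPL")
```

The curve is lowest in the region of a few kilohertz, where the ear is most sensitive, and rises steeply at very low and very high frequencies.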
Variation with Frequency of the Effect of Frequency Masking
The masking effect is a function of frequency band. The width of each masking curve
at a particular sound level is known as the critical bandwidth. Experiments
show that the critical bandwidth is about 100 Hz for frequencies below 500 Hz and,
above that, increases roughly linearly in multiples of 100 Hz; e.g., for a
signal of 1 kHz (2 x 500 Hz) the critical bandwidth is about 200 Hz.
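The rule of thumb for critical bandwidth can be written as a small function. This sketch assumes the commonly quoted approximation that the bandwidth is roughly constant at 100 Hz below 500 Hz and grows by 100 Hz for every 500 Hz of frequency above that, which reproduces the 1 kHz example; the function name is illustrative.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth: 100 Hz below 500 Hz, then 100 Hz per 500 Hz."""
    return 100.0 if f_hz < 500 else 100.0 * f_hz / 500.0

for f_hz in (250, 500, 1000, 5000):
    print(f_hz, "Hz ->", critical_bandwidth_hz(f_hz), "Hz")
```

Because the bands widen with frequency, a masker at a high frequency hides quieter sounds over a much wider range than a masker at a low frequency.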
Temporal Masking Caused by a Loud Signal
After the ear hears a loud sound, there is a delay before it can hear a quieter sound.
MPEG Perceptual Audio Coding
MPEG-1&2 Encoder
[Figure: encoder block diagram including a Psychoacoustic Model block.]
New Features for Layer 3 (MP3)
MPEG 1 Layer 3 (MP3) Encoder
[Figure: MP3 encoder block diagram.
Signal path: Digital Audio Signal (PCM, 768 kbit/s) -> Analysis Filter Bank (32 subbands) -> MDCT (576 lines) -> Non-Uniform Quantization (inside the Distortion Control Loop and the Rate Control Loop) -> Huffman Encoding -> Multiplexing -> Coded Audio Signal at 32-192 kbit/s.
Perceptual model: FFT (1024 points) -> Psychoacoustic Model.]
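The numbers in the encoder diagram can be tied together with a little arithmetic. The sketch below assumes the 768 kbit/s input corresponds to one channel of 16-bit PCM at 48 kHz (an assumption; the slide gives only the rate) and computes the compression ratio at several output rates, along with the number of MDCT lines per subband.

```python
def compression_ratio(pcm_kbps, coded_kbps):
    """Ratio of input PCM rate to coded output rate."""
    return pcm_kbps / coded_kbps

pcm_kbps = 48_000 * 16 / 1000            # assumed: 1 channel, 48 kHz, 16 bits
for coded_kbps in (32, 64, 128, 192):
    print(f"{coded_kbps} kbit/s -> {compression_ratio(pcm_kbps, coded_kbps):.0f}:1")

lines_per_subband = 576 // 32            # MDCT lines produced per subband
print("MDCT lines per subband:", lines_per_subband)
```

The MDCT stage refines each of the 32 subbands into finer frequency lines, which gives the quantizer a frequency resolution closer to the critical bands at low frequencies.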
MP3 Components
Perceptual Model
Psychoacoustic Model
MPEG Audio Filter Bank Boundaries
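The MPEG audio analysis filter bank divides the range from 0 Hz to half the sampling frequency into 32 subbands of equal width. A minimal sketch of the resulting band edges follows; the function name is illustrative, and the 48 kHz sampling rate is an assumed example value.

```python
def subband_edges_hz(fs_hz, n_bands=32):
    """Lower and upper edge of each equal-width subband between 0 and fs/2."""
    width = fs_hz / (2 * n_bands)
    return [(k * width, (k + 1) * width) for k in range(n_bands)]

edges = subband_edges_hz(48_000)
print("subband width:", edges[0][1] - edges[0][0], "Hz")
print("first three bands:", edges[:3])
print("last band:", edges[-1])
```

At 48 kHz each subband is 750 Hz wide, which is much wider than a critical band at low frequencies and narrower than one at high frequencies, so the uniform bands only roughly match the ear's frequency resolution.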
Psychoacoustic Model Functions
Inner Non-uniform Quantization Rate Control Loop
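The inner loop of the Layer 3 encoder raises the quantizer step size until the coded spectrum fits the bits available for the frame. The sketch below imitates that behaviour with an MP3-style power-law (x^0.75) quantizer. The bit-cost function `estimate_bits` is a hypothetical stand-in for the real Huffman tables, and the test spectrum and bit budget are made-up values.

```python
import math

def quantize(xr, step):
    """Power-law quantizer: a larger `step` gives coarser (smaller) values."""
    scale = 2.0 ** (step / 4.0)
    out = []
    for x in xr:
        mag = math.floor((abs(x) / scale) ** 0.75 + 0.4054)
        out.append(mag if x >= 0 else -mag)
    return out

def estimate_bits(q):
    """Hypothetical bit-cost model (NOT the MP3 Huffman tables):
    1 bit for a zero, more bits for larger magnitudes."""
    return sum(2 * abs(v).bit_length() + 1 for v in q)

def rate_loop(xr, bit_budget):
    """Increase the step size until the estimated bit count fits the budget."""
    if bit_budget < len(xr):
        raise ValueError("budget too small even for an all-zero spectrum")
    step = 0
    while True:
        q = quantize(xr, step)
        bits = estimate_bits(q)
        if bits <= bit_budget:
            return step, q, bits
        step += 1

spectrum = [1000.0 / (k + 1) for k in range(32)]   # made-up spectral lines
step, q, bits = rate_loop(spectrum, bit_budget=100)
print("step:", step, "bits used:", bits)
print("quantized values:", q)
```

The power law makes the quantizer non-uniform: large spectral values are quantized more coarsely than small ones, so quantization noise grows with signal level.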
Distortion Control Loop
Rate Distortion Criteria
Layer | Application | Compressed Bit Rate | Quality | I/O Delay
3 | CD Quality Audio Over Low Bit Rate Channels | 64 kbps | CD at 64 kbps | 60 ms
MPEG Perceptual Coder Schematic: (a) Encoder/Decoder; (b) Example Frame Format
Perceptual Coder Schematics: (a) Forward Adaptive Bit Allocation (MPEG); (b) Fixed Bit Allocation (Dolby AC-1)
Perceptual Coder Schematics: (a) Backward Adaptive Bit Allocation (Dolby AC-2); (b) Hybrid Backward/Forward Bit Allocation (Dolby AC-3)
Mid Term Topics
Huffman Code
Advantages of digital over analog audio
Shannon’s Sampling Theorem
IIR and FIR digital filters
Quality of Service
JPEG compression process
What is multimedia
Why are psychoacoustics important
DPCM and how it works (fundamental principle)
User and network requirements