Concepts of Multimedia Processing and Transmission

This document discusses concepts related to multimedia processing and transmission. It covers conventional audio signal formats, including how pulse code modulation (PCM) converts analog audio signals to digital signals by sampling and representing the samples with numbers. It also discusses sampling rates and rates required for accurate audio reproduction. The document outlines parameters for digital audio storage on compact discs, including sample rates, channels, bits per sample and error correction coding.

Concepts of Multimedia Processing and Transmission

IT 481, Lecture 6
Dennis McCaughey, Ph.D.
26 February, 2007
Conventional Audio Signal Format

• On vinyl and audio cassettes, the audio waveform is recorded as an analogue signal. Therefore any imperfections will be heard as noise (hiss) or other defects.
• To reduce these defects, CDs use Pulse Code Modulation (PCM), the simplest of digital coding technologies.

Slide: Courtesy, Hung Nguyen


Dennis Mccaughey, IT 481, Spring 2007 02/26/2007
Pulse Code Modulation (PCM)

• With PCM, samples of the analogue waveform are taken at regular intervals and stored as numbers. The example on this slide shows the conversion of an analogue waveform (which could be part of an audio signal) to digital by representing each sample by a number (from 0 to 100 in this simple example).
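The sampling-and-numbering idea can be sketched in a few lines of Python. This is only an illustration: the function names, the test sine wave, and the assumption that the input lies in the range -1 to 1 are not from the slides.

```python
import math

def pcm_encode(signal, num_samples, levels=101):
    """Sample signal(t) over one second and map each sample to 0..levels-1.

    Assumes the signal amplitude lies in [-1, 1]; values outside are clipped.
    """
    codes = []
    for k in range(num_samples):
        t = k / num_samples                      # sampling instant
        x = max(-1.0, min(1.0, signal(t)))       # clip to the assumed range
        codes.append(round((x + 1.0) / 2.0 * (levels - 1)))
    return codes

def sine(t):
    """One cycle of a sine wave per second (illustrative input)."""
    return math.sin(2 * math.pi * t)

codes = pcm_encode(sine, 8)
print(codes)
```

With only 8 samples and 101 levels the staircase is coarse; raising both the sample count and the number of levels brings the stored numbers closer to the original waveform.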

Slide: Courtesy, Hung Nguyen


Sampling for Audio Signal

• In practice the range of values and the sampling rate must be high enough to ensure accurate reproduction of the original analogue waveform.
• The upper limit of human hearing is about 20 kHz, so the audio must be sampled at 40,000 samples per second or more (the sampling theorem requires at least two samples per cycle of the highest frequency present).
• To reduce distortion and quantization noise, each sample must be represented by at least a 16-bit number, giving 65,536 values or levels (0 to 65,535) per sample.
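The two figures on this slide follow from simple arithmetic, sketched below. The function names are illustrative, and the signal-to-noise estimate uses the common textbook approximation for a full-scale sine wave with uniform quantization, which is not stated in the slides.

```python
def min_sampling_rate(f_max_hz):
    """Minimum sampling rate: two samples per cycle of the highest frequency."""
    return 2 * f_max_hz

def quantization_levels(bits):
    """Number of distinct values an n-bit sample can take."""
    return 2 ** bits

def ideal_snr_db(bits):
    """Textbook approximation (an assumption here): SNR ~ 6.02 * bits + 1.76 dB."""
    return 6.02 * bits + 1.76

print(min_sampling_rate(20_000))    # samples per second for a 20 kHz limit
print(quantization_levels(16))      # levels for 16-bit samples
print(ideal_snr_db(16))             # approximate SNR in dB
```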
Slide: Courtesy, Hung Nguyen
CD Digital Audio Parameters

• Audio is stored on Compact Discs with the following parameters:

Parameter | Value
Sample rate | 44.1 kHz
Channels | 2 (stereo)
Bits per sample, per channel | 16
Levels per sample | 65,536
Total data rate (Mb/s) | 1.4112
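The total data rate in the table is the product of the other parameters. A minimal check, with an illustrative function name:

```python
def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * channels * bits_per_sample

cd_rate = pcm_bit_rate(44_100, 2, 16)
print(cd_rate / 1e6, "Mb/s")
```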


Slide: Courtesy, Hung Nguyen
Data Integrity in Audio-CD

• Digital encoding allows the use of error correction codes, which are necessary to correct errors resulting from the manufacturing process and from minor damage or marks which may occur from handling and use.
• The result is that the amount of data stored on a CD is nearly four times the data needed to represent the audio only. But this is a small price to pay for a robust format that allows recordings to be played back free of clicks, hiss and other defects associated with analog media.

Slide: Courtesy, Hung Nguyen


CD Error Correction and Modulation

• Error correction is provided by CIRC (Cross-Interleaved Reed-Solomon Code), which adds two-dimensional parity information and also interleaves the data on the disc to protect against burst errors.
  – CIRC corrects error bursts of up to 3,500 bits (2.4 mm in length) and compensates for error bursts of up to 12,000 bits (8.5 mm), such as those caused by minor scratches.
• EFM (Eight-to-Fourteen Modulation): CD-ROM discs use a 14-bit byte, a modification made necessary by the way data is stored and read with lasers, using the pits (indentations) and lands (spaces between indentations) on the disc.
  – In transferring from magnetic to optical media, the 8-bit byte is modulated and stored on the optical medium as a 14-bit byte. This reduces the effect of jitter and other distortions on the error rate.
  – When the computer reads the CD-ROM, an interface card demodulates the 14-bit optical code back to the 8-bit code.
Slide: Courtesy, Hung Nguyen
CD Data Format

Slide: Courtesy, Hung Nguyen


DVD Coding Format

Item | Audio Object | Video Object
Encoding methods (mandatory) | Linear PCM (scalable), Packed PCM (lossless encoding) | Linear PCM, Dolby AC3
Encoding methods (optional) | none | MPEG Audio, DTS, SDDS

Audio specifications for Linear PCM and Packed PCM encoding schemes

Item | Audio Object | Video Object
Sampling frequency | 48/96/192 kHz, 44.1/88.2/176.4 kHz | 48/96 kHz
Quantization depth | 16/20/24 bits | 16/20/24 bits
Maximum number of channels | 6ch (fs: 48/96/44.1/88.2 kHz) or 2ch (fs: 192/176.4 kHz) | 8ch (2ch for Stereo + 6ch for Multi channel)
Maximum bit rate | 9.6 Mbps (Linear PCM / Packed PCM) | 6.144 Mbps (Linear PCM)
Frame rate | 1200 Hz (fs: 48/96/192 kHz), 1102.5 Hz (fs: 44.1/88.2/176.4 kHz) | 600 Hz (fs: 48/96 kHz)
Dynamic Range of CD and DVD

Slide: Courtesy, Hung Nguyen


Delta Modulation

• In delta modulation, the differences between successive speech samples are encoded, and the original signal is recovered by the decoder at the receiving end.
• The analog signal is approximated by a series of segments.
• Each segment of the approximated signal is compared to the original analog wave to determine the increase or decrease in relative amplitude.
• The state of successive bits is established by this comparison.
• Only the change in information is sent, i.e., only an increase or decrease of the signal amplitude from the previous sample is sent, whereas a no-change condition causes the modulated signal to remain at the same 0 or 1 state as the previous sample.
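A minimal sketch of the idea, assuming the plain 1-bit linear form of delta modulation with a fixed step size (the function names, the step value and the sample data are illustrative, not from the slides). Each bit says whether the running approximation should step up or down; the decoder rebuilds the staircase from the bits alone.

```python
def delta_encode(samples, step):
    """Emit 1 when the input is at or above the running approximation, else 0."""
    bits, approx = [], 0.0
    for s in samples:
        bit = 1 if s >= approx else 0
        approx += step if bit else -step   # track the input one step at a time
        bits.append(bit)
    return bits

def delta_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    out, approx = [], 0.0
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out

bits = delta_encode([0.3, 0.6, 0.9, 0.4, 0.1], step=0.25)
staircase = delta_decode(bits, step=0.25)
print(bits, staircase)
```

A fixed step is the simplest choice; it cannot follow steep slopes, which is the motivation for the variable-step variants discussed later in the lecture.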
Slide: Courtesy, Hung Nguyen
Delta-Mod Encoder

[Block diagram: s(t), Sampler, s(n), summing junctions, 1-Bit Quantizer with output e(n), and a 1-Sample Delay in the feedback loop]

e(n) is a sequence of ±"1s"

Delta-Mod Decoder

[Block diagram: input e(n), summing junction with a 1-Sample Delay feedback loop producing s(n), followed by a Reconstruction Filter with output s(t)]

Delta Modulation - example

Delta Modulation Variants

• Examples of delta modulation are continuously variable slope delta modulation and delta-sigma modulation.
  – Continuously variable slope delta (CVSD) modulation: A type of delta modulation in which the size of the steps of the approximated signal is progressively increased or decreased as required to make the approximated signal closely match the input analog wave.
  – Sigma-Delta Modulation: Delta modulation in which the integral of the input signal is encoded rather than the signal itself. Note: Sigma-Delta modulation may be achieved by including a digital integrator preceding the Quantizer in a delta-modulation encoder.
• Important concept in “State-of-the-Art” A/D converters
Slide: Courtesy, Hung Nguyen
Sigma-Delta-Mod Encoder

[Block diagram: s(t), Sampler, summing junctions, 1-Sample Delay, Quantizer, with signals e(n) and output q(n)]

G.721 Adaptive Differential Pulse Code Modulation (ADPCM)

• PCM does not attempt to remove the redundancy in the speech signal; this is done by the ADPCM encoder.
• The CCITT standard G.721 ADPCM algorithm for 32 kbps speech coding is used in CT2 and DECT cordless phone systems.
• In practice, ADPCM encoders are implemented using a linear predictor for the current sample, and the difference between the predicted and actual sample (the prediction error) is encoded for transmission.
• Prediction is based on knowledge of the autocorrelation properties of speech.
Slide: Courtesy, Hung Nguyen
Adaptive PCM Example

• In an adaptive PCM system for speech coding, the input signal is sampled at 8 kHz and each sample is represented by 8 bits. The quantizer step size is recomputed every 10 msec and is encoded for transmission using 5 bits. What is the transmission bit rate of such a speech coder?
  – Sampling frequency = fs = 8 kHz
  – Number of bits per sample = n = 8 bits
  – Number of information bits per second = 8,000 x 8 = 64,000 bits/sec
  – The quantization step size is recomputed every 10 msec, so 100 step-size values must be transmitted every second
  – Therefore, the number of overhead bits = 100 x 5 = 500 bits/sec, and the effective transmission bit rate is 64,000 + 500 = 64,500 bits/sec
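The same arithmetic as a short script, for checking the total. The function and parameter names are illustrative; the sum of 64,000 information bits and 500 overhead bits is 64,500 bits/sec.

```python
def adaptive_pcm_bit_rate(fs_hz, bits_per_sample, step_updates_per_second, bits_per_step_size):
    """Information bits plus step-size side information, in bits per second."""
    information = fs_hz * bits_per_sample
    overhead = step_updates_per_second * bits_per_step_size
    return information + overhead

# A step size recomputed every 10 msec means 100 updates per second.
total = adaptive_pcm_bit_rate(8_000, 8, 100, 5)
print(total)
```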
Slide: Courtesy, Hung Nguyen
ADPCM Encoder used in CT2

Slide: Courtesy, Hung Nguyen


DPCM Encoder (Simplified)

[Block diagram: s(t), Sampler, s(n), summing junction producing e(n), Quantizer, Coder; feedback path with a 1-Sample Delay and gain a]

Neglecting the Quantizer, it is easy to show:

e(n) = s(n) – a·s(n-1)

The Coder may be a Huffman/Entropy encoder
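The relation e(n) = s(n) – a·s(n-1) and its inverse can be sketched as follows, neglecting the quantizer and the entropy coder as the slide does. The function names, the zero initial state and the sample values are assumptions made for illustration.

```python
def dpcm_encode(samples, a):
    """Prediction error e(n) = s(n) - a * s(n-1), with s(-1) taken as 0."""
    errors, prev = [], 0.0
    for s in samples:
        errors.append(s - a * prev)
        prev = s
    return errors

def dpcm_decode(errors, a):
    """Invert the encoder: s(n) = e(n) + a * s(n-1)."""
    samples, prev = [], 0.0
    for e in errors:
        s = e + a * prev
        samples.append(s)
        prev = s
    return samples

errors = dpcm_encode([4.0, 6.0, 5.0], a=0.5)
recovered = dpcm_decode(errors, a=0.5)
print(errors, recovered)
```

When neighbouring samples are correlated, the errors span a smaller range than the samples themselves, which is what lets the coder spend fewer bits on them.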

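DPCM Decoder (Simplified)

[Block diagram: Decoder, e(n), summing junction with a feedback path through a 1-Sample Delay and gain a, producing s(n), followed by a Reconstruction Filter with output s(t)]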

DPCM Encoder Schematic

DPCM Decoder Schematic

Increased Predictor Order

• Compression performance can be improved by basing the prediction on more samples than just the previous one.
• In the example, a 3rd-order predictor is used:
  – The previous three samples, held in R1, R2 and R3, are weighted by C1, C2 and C3 and added to form the overall prediction.
  – C1, C2 and C3 are functions of the correlation between the first sample and the following two.
  – e.g. for a Markov process, C2 = (C1)^2 and C3 = (C1)^3.
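The weighted sum that forms the prediction can be written out directly. This sketch follows the slide's naming (registers R1..R3, coefficients C1..C3) and its Markov-process example; the function names and numeric values are illustrative assumptions.

```python
def markov_coeffs(c1, order=3):
    """Coefficients C1, C1^2, C1^3, ... as in the Markov-process example."""
    return [c1 ** k for k in range(1, order + 1)]

def predict(registers, coeffs):
    """Prediction = C1*R1 + C2*R2 + C3*R3 (R1 is the most recent sample)."""
    return sum(c * r for c, r in zip(coeffs, registers))

coeffs = markov_coeffs(0.5)
prediction = predict([8.0, 4.0, 8.0], coeffs)
error = 7.0 - prediction      # prediction error for a hypothetical current sample of 7.0
print(coeffs, prediction, error)
```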

DPCM: Third Order Predictor Encoder

DPCM: Third Order Decoder Schematic

Sub-band Coding (SBC)

• Quantization typically produces distortion that is broad in spectrum, but the human ear does not detect distortion equally well at all frequencies.
• Thus it is possible to achieve a substantial improvement in quality by coding speech in narrower bands.
• Speech is typically divided into four or eight sub-bands by a bank of filters, and each sub-band is sampled at a band-pass Nyquist rate and encoded according to a perceptual criterion.
• SBC can be thought of as a method of controlling and distributing quantization noise across the signal spectrum.
Slide: Courtesy, Hung Nguyen
An SBC Encoder

Slide: Courtesy, Hung Nguyen


An SBC Decoder

Slide: Courtesy, Hung Nguyen


Example of SBC

• This table gives the frequency range of each band with the number of bits used to encode each band:

SB Number | Frequency (Hz) | # of encoded bits
1 | 225-450 | 4
2 | 450-900 | 3
3 | 1000-1500 | 2
4 | 1800-2700 | 1

• Assuming that no side information needs to be transmitted, compute the minimum encoding rate of this SBC encoder.
Slide: Courtesy, Hung Nguyen
Example of SBC (cont’d)

• For perfect reconstruction of band-pass signals, we need to sample at the Nyquist rate, which is twice the signal bandwidth:
  – Band 1: 2 x (450 - 225) = 450 samples/sec
  – Band 2: 2 x (900 - 450) = 900 samples/sec
  – Band 3: 2 x (1,500 - 1,000) = 1,000 samples/sec
  – Band 4: 2 x (2,700 - 1,800) = 1,800 samples/sec
• Total encoding rate is
  – 450 x 4 + 900 x 3 + 1,000 x 2 + 1,800 x 1 = 8,300 bits/s
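The calculation above can be reproduced with a short script. The band edges and bit allocations are taken from the example table; the variable and function names are illustrative.

```python
# (low edge in Hz, high edge in Hz, bits per sample) for each sub-band
bands = [(225, 450, 4), (450, 900, 3), (1000, 1500, 2), (1800, 2700, 1)]

def sbc_bit_rate(bands):
    """Sum over bands of (band-pass Nyquist rate = 2 * bandwidth) * bits per sample."""
    return sum(2 * (high - low) * bits for low, high, bits in bands)

total_rate = sbc_bit_rate(bands)
print(total_rate, "bits/s")
```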
Slide: Courtesy, Hung Nguyen
G.722 Adaptive DPCM

• Better sound quality than G.721
• Employs sub-band coding
• Input speech bandwidth is extended to cover 50 Hz to 7 kHz
• Divides the frequency band into two sub-bands
  – 50 Hz to 3.5 kHz
  – 3.5 kHz to 7 kHz
  – Each sub-band is sampled and encoded independently using ADPCM
• Operating bit rate can be 64, 56 or 48 kbps
• e.g. at 64 kbps: lower band at 48 kbps, upper band at 16 kbps
G.722 Adaptive DPCM (ADPCM)
Subband Encoder

G.722 Adaptive DPCM (ADPCM)
Subband Decoder

Linear Predictive Coding

• LPC analyzes the audio waveform to determine a selection of the perceptual features it contains.
• These are then quantized and sent to the destination, where a sound synthesizer uses them to regenerate a sound that is perceptually comparable with the original.
• Although the result sounds synthetic, very high compression ratios can be obtained.

LPC Features

• Perceptual
  – Pitch:
    • Closely related to the frequency of the signal
    • Important since the ear is more sensitive in the frequency range 2-5 kHz
  – Period:
    • The duration of the signal
  – Loudness:
    • The average energy in the signal
• Vocal tract excitation parameters
  – Voiced sounds: generated through the vocal cords, such as those related to the letters m, v and l
  – Unvoiced sounds: the vocal cords are open, such as those related to the letters f and s

Linear Predictive Coding (LPC) Signal
Encoder

Linear Predictive Coding (LPC) Signal
Decoder

Perceptual Properties of the Ear:
Sensitivity as a Function of Frequency

The ear is most sensitive in the range of 2-5kHz


Tone A is audible while tone B is not
Perceptual Properties of the Ear:
Frequency Masking

Loud tone suppresses a quieter one. Tone B masks Tone A.


Tone B is audible while Tone A is not even if Tone A is audible by itself

Variation with Frequency of the Effect of Frequency Masking

The masking effect is a function of the frequency band. The width of each curve at a particular sound level is known as the critical bandwidth. Experiments show that the critical bandwidth increases approximately linearly with frequency, in multiples of 100 Hz; e.g. for a signal of 1 kHz (2 x 500 Hz) the critical bandwidth is about 200 Hz.

Temporal Masking Caused by a Loud
Signal

After the ear hears a loud sound, there is a delay before it can hear a quieter
sound

MPEG Perceptual Audio Coding

• Perceptual encoding is a lossy compression technique,
  – i.e. the decoded data is not an exact replica of the original digital audio data.
  – Instead, digital audio data is compressed in a way that, despite the high compression rate, the decoded audio sounds exactly (or as closely as possible) like the original audio.
• This is achieved by adapting the encoding process to the characteristics of the human perception of sound:
  – The parts of the audio signal that humans perceive distinctly are coded with high accuracy,
  – The less distinctive parts are coded less accurately, and parts of the sound we do not hear at all are mostly discarded or replaced by quantization noise.

MPEG-1&2 Encoder

Psychoacoustic
Model

New Features for Layer 3 (MP3)

• Modified DCT (MDCT)
  – DCT with overlap
  – Long/short window switching
    • Short for better temporal resolution (to prevent pre-echoes)
    • Long for better frequency resolution
• Non-uniform quantization
• Entropy coding
  – Run-length and Huffman coding
• Bit reservoir (buffer)

MPEG 1 Layer 3 (MP3) Encoder

[Block diagram: Digital Audio Signal (PCM, 768 kBits/s), Analysis FilterBank (32 Subbands), MDCT (576 Lines), Non Uniform Quantization and Huffman Encoding inside the Distortion Control Loop and Rate Control Loop, Multiplexing, Coded Audio Signal at 32 - 192 KBits/s. Perceptual Model: FFT (1024 Points) and Psycho Acoustic Model.]

MP3 Components

• Perceptual model: An estimate of the actual (time and frequency dependent) masking threshold is computed by using rules known from psychoacoustics.
• Filter bank: A hybrid polyphase / MDCT filter bank is used to decompose the input signal into sub-sampled spectral components. Together with the corresponding inverse filter bank in the decoder it forms an analysis/synthesis system.
• Quantization and coding: The spectral components are quantized and coded with the aim of keeping the noise introduced by the quantization below the masking threshold.
  – Distortion Control Loop
  – Non-uniform Quantization Control Loop
  – Huffman Coding
• Multiplexing: A bit stream formatter is used to assemble the bit stream, which consists of the quantized and coded spectral coefficients and some side information, e.g. bit allocation information.

Perceptual Model

• The perceptual model outputs values for the masking threshold, or allowed noise, for each coder partition.
• In Layer-3, these coder partitions are roughly equivalent to the critical bands of human hearing.
  – If the quantization noise can be kept below the masking threshold for each coder partition, the compression result should be indistinguishable from the original signal.

Psychoacoustic Model

• Time align audio data
  – The psychoacoustic model must account for both the delay of the audio data through the filter bank and a data offset so that the relevant data is centered within its analysis window.
• Convert audio to spectral domain
  – The psychoacoustic model uses a time-to-frequency mapping such as a 512- or 1,024-point Fourier transform.
  – A standard Hanning window, applied to the audio data before Fourier transformation, conditions the data to reduce the edge effects of the transform window.
• Partition spectral values into critical bands
  – To simplify the psychoacoustic calculations, the model groups the frequency values into perceptual quanta.
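The windowing step can be sketched as follows. This uses the common symmetric form of the Hann (Hanning) window, which is an assumption here since the slides do not give the exact definition used by the model; the function names are illustrative.

```python
import math

def hann_window(n):
    """Symmetric Hann window of length n (n >= 2): 0 at both ends, 1 in the middle."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def apply_window(frame):
    """Taper a frame of audio samples before it is passed to the Fourier transform."""
    window = hann_window(len(frame))
    return [x * w for x, w in zip(frame, window)]

w = hann_window(5)
tapered = apply_window([1.0] * 1024)   # a 1,024-sample frame, as in the slide
print(w)
```

Tapering the frame towards zero at both ends is what reduces the edge effects mentioned above, because the transform no longer sees an abrupt start and stop.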

MPEG Audio Filter Bank Boundaries

Finer resolution at lower frequencies

Psychoacoustic Model Functions

• Incorporate threshold in quiet
  – This threshold is the lower bound for noise masking and is determined in the absence of masking signals.
• Separate into tonal and non-tonal components
  – The model must identify and separate the tonal and noiselike components of the audio signal.
• Apply spreading function
  – The model determines the noise-masking thresholds by applying an empirically determined masking or spreading function to the signal components.
Psychoacoustic Model Functions (cont’d)

• Find the minimum masking threshold for each sub-band
  – The psychoacoustic model calculates the masking thresholds with a higher frequency resolution than that provided by the filter banks.
  – Where the filter band is wide relative to the critical band (at the lower end of the spectrum), the model selects the minimum of the masking thresholds covered by the filter band.
  – Where the filter band is narrow relative to the critical band, the model uses the average of the masking thresholds covered by the filter band.
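The selection rule in the last two points can be written as a small function. This is a sketch only: the function name, the boolean flag and the threshold values in dB are hypothetical, and a real model would derive the wide/narrow decision from the band and critical-band widths.

```python
def subband_threshold(thresholds_db, band_is_wide):
    """Pick one masking threshold for a filter band from the finer-resolution values.

    band_is_wide: True when the filter band is wide relative to the critical band.
    """
    if band_is_wide:
        return min(thresholds_db)                    # most demanding threshold wins
    return sum(thresholds_db) / len(thresholds_db)   # otherwise use the average

wide = subband_threshold([10.0, 4.0, 7.0], band_is_wide=True)
narrow = subband_threshold([10.0, 4.0, 7.0], band_is_wide=False)
print(wide, narrow)
```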
MPEG-1 Layer-3 Filter Bank

• The filter bank belongs to the class of hybrid filter banks.
• It is built by cascading two different kinds of filter bank:
  – First: the polyphase filter bank (as used in Layer-1 and Layer-2)
  – Second: an additional Modified Discrete Cosine Transform (MDCT).
• The polyphase filter bank has the purpose of making Layer-3 more similar to Layer-1 and Layer-2.
• The subdivision of each polyphase frequency band into 18 finer sub-bands increases the potential for redundancy removal, leading to better coding efficiency for tonal signals.
  – 576 lines = 32 sub-bands x 18
• Better frequency resolution allows finer tracking and control of the error signal.

Inner Non-uniform Quantization Rate Control Loop

• The Huffman code tables assign shorter code words to (more frequent) smaller quantized values.
• If the number of bits produced exceeds the number of bits available to code a given block of data, the global gain is adjusted to give a larger quantization step size and thus smaller quantized values.
• This operation is repeated with different quantization step sizes until the resulting bit demand for Huffman coding is small enough. The loop is called a rate loop because it modifies the overall coder rate until it is small enough.
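The structure of such a rate loop can be sketched as below. This is a simplified illustration, not the Layer-3 algorithm: it uses uniform rounding instead of the non-uniform quantizer, a made-up bit-cost function (`toy_count_bits`) in place of the Huffman tables, and an arbitrary step growth factor.

```python
def toy_count_bits(values):
    """Hypothetical stand-in for Huffman coding: larger magnitudes cost more bits."""
    return sum(1 + 2 * abs(v).bit_length() for v in values)

def rate_loop(spectrum, bits_available, count_bits, step=1.0, growth=1.25, max_iters=200):
    """Enlarge the quantization step until the coded block fits the bit budget."""
    for _ in range(max_iters):
        quantized = [round(x / step) for x in spectrum]
        if count_bits(quantized) <= bits_available:
            return step, quantized
        step *= growth        # larger step -> smaller quantized values -> fewer bits
    raise ValueError("bit budget cannot be met")

spectrum = [100.0, -50.0, 10.0, 3.0]
step, quantized = rate_loop(spectrum, 12, toy_count_bits)
print(step, quantized, toy_count_bits(quantized))
```

The iteration cap is only a guard for the sketch: once every value quantizes to zero the cost cannot fall further, so a budget below that floor can never be satisfied.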

Distortion Control Loop

• The quantization noise is shaped according to the masking threshold: scale factors are applied to each scale factor band.
• If the quantization noise in a given band is found to exceed the masking threshold (allowed noise) as supplied by the perceptual model, the scale factor for this band is adjusted to reduce the quantization noise.
• A smaller quantization noise requires a larger number of quantization steps and thus a higher bit rate.
  – Thus the Non-uniform Quantization Rate Control Loop is repeated every time new scale factors are used.
• The outer Distortion Control Loop is executed until the actual noise (computed from the difference between the original spectral values and the quantized spectral values) is below the masking threshold for every scale factor band (i.e. critical band).

Rate Distortion Criteria

• Shannon’s Rate Distortion Theorem states that there is a mapping from a source waveform to output code words such that, for a given distortion D, R(D) bits/sample are sufficient to reconstruct the waveform with an average distortion that is arbitrarily close to D.
• The function R(D) is called the rate distortion function and represents the fundamental limit on the achievable rate for a given distortion.
• Shannon predicted that this theoretical limit cannot be achieved by coding one sample at a time, as in a scalar quantizer, but rather by coding many samples at a time, as in vector quantization.
Slide: Courtesy, Hung Nguyen
Vector Quantization (VQ)

• VQ [Gray] is a delayed-decision coding technique which maps a vector of input samples, typically a speech frame, to a code book index.
• The code book has a finite set of vectors covering the entire range of input values.
• In each quantizing interval, the code book is searched for the best match of the input frame.
• VQ can yield better performance even when the samples are independent of one another, and performs best when there is strong correlation between samples in the group.
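The code book search can be sketched as an exhaustive nearest-neighbour lookup using squared Euclidean distance. The tiny code book and the function names are illustrative assumptions; practical coders use far larger code books and faster search methods.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def vq_encode(frame, codebook):
    """Return the index of the code book vector closest to the input frame."""
    return min(range(len(codebook)), key=lambda i: squared_distance(frame, codebook[i]))

def vq_decode(index, codebook):
    """The decoder simply looks the vector up by its index."""
    return codebook[index]

codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index = vq_encode([0.9, 1.2], codebook)
print(index, vq_decode(index, codebook))
```

Only the index is transmitted, so the cost per frame depends on the code book size rather than on the number of samples in the frame.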

Slide: Courtesy, Hung Nguyen


Achievable Rate by VQ

• The rate R of a vector quantizer is defined as

$R = \frac{\log_2 n}{L}$ bits/sample

• where L is the number of samples in the vector, and n is the size of the code book.
• The distortion is measured as the squared Euclidean distance between the quantization and input vectors.
• VQ is most efficient at very low bit rates (R = 0.5 bits/sample) but is a computationally intensive operation; more efficient VQ-based algorithms are available.
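The rate definition is easy to evaluate numerically. In this sketch the code book size of 256 and vector length of 16 are illustrative values chosen to land on the 0.5 bits/sample figure mentioned above; they are not from the slides.

```python
import math

def vq_rate(codebook_size, vector_length):
    """R = log2(n) / L, in bits per sample."""
    return math.log2(codebook_size) / vector_length

rate = vq_rate(256, 16)   # 8-bit index shared across 16 samples
print(rate)
```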
Slide: Courtesy, Hung Nguyen
MPEG Layers 1, 2 & 3 Performance

Layer | Application | Compressed Bit Rate | Quality | IO Delay
1 | Digital Audio Cassette | 32-448kbps | High at 192kbps | 2-ms
2 | Digital Audio & Video Broadcasting | 32-192kbps | Near CD at 128kbps | 40ms
3 | CD Quality Audio Over Low Bit Rate Channels | 64kbps | CD at 64kbps | 60ms

MPEG Perceptual Coder Schematic: (a)
Encoder/Decoder (b) Example Frame Format

Perceptual Coder Schematics: (a) Forward Adaptive Bit
Allocation (MPEG); (b) Fixed Bit Allocation (Dolby AC-1)

Perceptual Coder Schematics: (a) Backward Adaptive Bit Allocation (Dolby AC-2); (b) Hybrid Backward/Forward Bit Allocation (Dolby AC-3)

Mid Term Topics

• Huffman Code
• Advantages of digital over analog audio
• Shannon’s Sampling Theorem
• IIR and FIR digital filters
• Quality of Service
• JPEG compression process
• What is multimedia
• Why are psychoacoustics important
• DPCM and how it works (fundamental principle)
• User and network requirements
