Concepts of Multimedia Processing and Transmission

IT 481, Lecture 6
Dennis McCaughey, Ph.D.
26 February, 2007
Conventional Audio Signal Format

• On vinyl and audio cassettes, the audio waveform is recorded as an analogue signal. Therefore any imperfections will be heard as noise (hiss) or other defects.
• To reduce these defects, CDs use Pulse Code Modulation (PCM), the simplest of digital coding technologies.

Slide: Courtesy, Hung Nguyen

2
Dennis Mccaughey, IT 481, Spring 2007 02/26/2007
Pulse Code Modulation (PCM)

• With PCM, samples of the analogue waveform are taken at regular intervals and stored as numbers. The example shows the conversion of an analogue waveform (which could be part of an audio signal) to digital form by representing each sample by a number (from 0 to 100 in this simple example).
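As a rough illustration of the sampling-and-numbering step, the sketch below samples a waveform at regular intervals and maps each sample to an integer from 0 to 100. The 1 kHz test tone, the 8 kHz sampling rate and the function name pcm_encode are assumptions chosen for the example, not values from the slide.

```python
import math

def pcm_encode(signal, sample_rate_hz, n_samples, levels=101):
    """Sample signal(t), assumed to lie in -1..1, and map each sample to 0..levels-1."""
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # sampling instant in seconds
        x = max(-1.0, min(1.0, signal(t)))          # clip to the assumed input range
        codes.append(round((x + 1.0) / 2.0 * (levels - 1)))
    return codes

def tone_1khz(t):
    """Hypothetical test input: a 1 kHz sine wave of unit amplitude."""
    return math.sin(2 * math.pi * 1000 * t)

# One cycle of the tone sampled at 8 kHz gives eight numbers.
codes = pcm_encode(tone_1khz, 8000, 8)
print(codes)
```

Raising the sampling rate or the number of levels makes the stored numbers follow the waveform more closely, at the cost of more data.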

Slide: Courtesy, Hung Nguyen

Sampling for Audio Signal

• In practice, the range of values and the sampling rate must be high enough to ensure accurate reproduction of the original analogue waveform.
• The upper limit of human hearing is about 20 kHz, so the audio must be sampled at least 40,000 times per second (at least two samples are needed per cycle of the highest-frequency sine wave, one for each half of the cycle).
• To reduce distortion and quantization noise, each sample must be represented by at least a 16-bit number, giving 65,536 values or levels (0 to 65,535) per sample.
Slide: Courtesy, Hung Nguyen
CD Digital Audio Parameters

• Audio is stored on Compact Discs with the following parameters:

Parameter                       Value
Sample rate                     44.1 kHz
Channels                        2 (stereo)
Bits per sample, per channel    16
Levels per sample               65,536
Total data rate                 1.4112 Mb/s

Slide: Courtesy, Hung Nguyen
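The total data rate in the table follows directly from the other parameters. A small sketch of the arithmetic (the function name pcm_bit_rate is an assumption for illustration):

```python
def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample):
    """Raw PCM bit rate in bits per second, before any error-correction overhead."""
    return sample_rate_hz * channels * bits_per_sample

cd_rate = pcm_bit_rate(44_100, 2, 16)   # CD parameters from the table
levels = 2 ** 16                        # distinct values of a 16-bit sample
print(cd_rate / 1e6, "Mb/s,", levels, "levels")
```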
Data Integrity in Audio-CD

• Digital encoding allows the use of error correction codes, which are necessary to correct errors resulting from the manufacturing process and from minor damage or marks caused by handling and use.
• As a result, the number of bits written to a CD is roughly three times the number needed to represent the audio alone (each 192 bits of audio become 588 channel bits once parity, subcode, modulation and synchronization bits are added). This is a small price to pay for a robust format that allows recordings to be played back free of the clicks, hiss and other defects associated with analog media.

Slide: Courtesy, Hung Nguyen

CD Error Correction and Modulation

• Error correction is provided by CIRC (Cross-Interleaved Reed-Solomon Code), which adds two-dimensional parity information and also interleaves the data on the disc to protect against burst errors.
  – CIRC corrects error bursts of up to 3,500 bits (2.4 mm in length) and compensates for error bursts of up to 12,000 bits (8.5 mm), such as those caused by minor scratches.
• EFM (Eight-to-Fourteen Modulation): CD-ROM discs use a 14-bit byte, a modification made necessary by the way data is stored and read with lasers, using the pits (indentations) and lands (spaces between indentations) on the disc.
  – In transferring from magnetic to optical media, each 8-bit byte is modulated and stored on the optical medium as a 14-bit byte. This reduces the effect of jitter and other distortions on the error rate.
  – When the computer reads the CD-ROM, an interface card demodulates the 14-bit optical code back to 8-bit code.
Slide: Courtesy, Hung Nguyen
CD Data Format

Slide: Courtesy, Hung Nguyen

DVD Coding Format

                                Audio Object                            Video Object
Encoding methods (mandatory)    Linear PCM (scalable),                  Linear PCM, Dolby AC3
                                Packed PCM (lossless encoding)
Encoding methods (optional)     none                                    MPEG Audio, DTS, SDDS

Audio specifications for Linear PCM and Packed PCM encoding schemes

                                Audio Object                            Video Object
Sampling frequency              48/96/192 kHz, 44.1/88.2/176.4 kHz      48/96 kHz
Quantization depth              16/20/24 bits                           16/20/24 bits
Maximum number of channels      6ch (fs: 48/96/44.1/88.2 kHz) or        8ch (2ch for Stereo
                                2ch (fs: 192/176.4 kHz)                 + 6ch for Multi channel)
Maximum bit rate                9.6 Mbps (Linear PCM / Packed PCM)      6.144 Mbps (Linear PCM)
Frame rate                      1200 Hz (fs: 48/96/192 kHz),            600 Hz (fs: 48/96 kHz)
                                1102.5 Hz (fs: 44.1/88.2/176.4 kHz)
Dynamic Range of CD and DVD

Slide: Courtesy, Hung Nguyen

Delta Modulation

• In delta modulation, the differences between successive speech samples are encoded, and the original signal is recovered by the decoder at the receiving end.
• The analog signal is approximated by a series of segments.
• Each segment of the approximated signal is compared with the original analog wave to determine the increase or decrease in relative amplitude.
• This comparison determines the state of each successive bit.
• Only the change information is sent: an increase or a decrease of the signal amplitude relative to the previous sample. A no-change condition causes the modulated signal to remain at the same 0 or 1 state as the previous sample.
Slide: Courtesy, Hung Nguyen
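Delta-Mod Encoder

[Block diagram: s(t) → Sampler → s(n) → summing junction (s(n) minus the fed-back approximation) → 1-Bit Quantizer → e(n); e(n) is fed back through an accumulator (adder with 1-sample delay) to form the approximation.]

e(n) is a sequence of ±1s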

Delta-Mod Decoder

[Block diagram: e(n) → accumulator (adder with 1-sample delay in the feedback path) → s(n) → Reconstruction Filter → s(t)]
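The encoder and decoder can be sketched in a few lines. This is a minimal illustration with a fixed step size and no reconstruction filter; the step value, the sine-wave test input and the function names are assumptions, not part of the slides.

```python
import math

def dm_encode(samples, step):
    """1-bit quantizer with an accumulator in the feedback path; emits +1 or -1."""
    approx = 0.0
    bits = []
    for s in samples:
        e = 1 if s >= approx else -1   # compare the input with the running approximation
        bits.append(e)
        approx += e * step             # accumulator: previous value plus or minus one step
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase approximation by accumulating the +1/-1 sequence."""
    approx = 0.0
    out = []
    for e in bits:
        approx += e * step
        out.append(approx)
    return out

step = 0.1
samples = [math.sin(2 * math.pi * n / 100) for n in range(200)]  # slow sine, 100 samples per cycle
bits = dm_encode(samples, step)
recon = dm_decode(bits, step)
max_error = max(abs(s - r) for s, r in zip(samples, recon))
print(bits[:10], round(max_error, 3))
```

Because the input here never changes by more than one step between samples, the staircase stays within one step of the signal. A faster-changing input would outrun the fixed step (slope overload).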

Delta Modulation - example

Delta Modulation Variants

• Examples of delta modulation are continuously variable slope delta modulation and sigma-delta modulation.
  – Continuously variable slope delta (CVSD) modulation: a type of delta modulation in which the size of the steps of the approximated signal is progressively increased or decreased as required to make the approximated signal closely match the input analog wave.
  – Sigma-delta modulation: delta modulation in which the integral of the input signal is encoded rather than the signal itself. Note: sigma-delta modulation may be achieved by including a digital integrator preceding the quantizer in a delta-modulation encoder.
• An important concept in “state-of-the-art” A/D converters
Slide: Courtesy, Hung Nguyen
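Sigma-Delta-Mod Encoder

[Block diagram of the sigma-delta encoder: s(t) passes through a Sampler, summing junctions with a 1-sample delay (the integrator) and a Quantizer that produces q(n); e(n) labels the internal error signal.]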

G.721 Adaptive Differential Pulse Code Modulation (ADPCM)

• PCM does not attempt to remove the redundancy in the speech signal; this is done by the ADPCM encoder.
• The CCITT standard G.721 ADPCM algorithm for 32 kbps speech coding is used in CT2 and DECT cordless phone systems.
• In practice, ADPCM encoders are implemented using a linear predictor for the current sample, and the difference between the predicted and actual sample (the prediction error) is encoded for transmission.
• Prediction is based on knowledge of the autocorrelation property of speech.
Slide: Courtesy, Hung Nguyen
Adaptive PCM Example

• In an adaptive PCM system for speech coding, the input signal is sampled at 8 kHz and each sample is represented by 8 bits. The quantizer step size is recomputed every 10 ms and is encoded for transmission using 5 bits. What is the transmission bit rate of such a speech coder?
  – Sampling frequency: fs = 8 kHz
  – Number of bits per sample: n = 8 bits
  – Number of information bits per second = 8,000 x 8 = 64,000 bits/s
  – The quantization step size is recomputed every 10 ms, so 100 step-size values must be transmitted every second
  – Therefore the number of overhead bits = 100 x 5 = 500 bits/s, and the effective transmission bit rate is 64,000 + 500 = 64,500 bits/s
Slide: Courtesy, Hung Nguyen
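The same calculation can be written as a small helper so that other sampling rates or update intervals can be tried. The function name and parameter names are assumptions for illustration.

```python
def adaptive_pcm_bit_rate(fs_hz, bits_per_sample, update_interval_ms, step_size_bits):
    """Return (information bits/s, overhead bits/s, total bits/s)."""
    info_bits = fs_hz * bits_per_sample
    # Assumes the update interval divides one second evenly (10 ms -> 100 updates/s).
    updates_per_second = 1000 // update_interval_ms
    overhead_bits = updates_per_second * step_size_bits
    return info_bits, overhead_bits, info_bits + overhead_bits

info, overhead, total = adaptive_pcm_bit_rate(8000, 8, 10, 5)
print(info, overhead, total)
```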
ADPCM Encoder used in CT2

Slide: Courtesy, Hung Nguyen

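DPCM Encoder (Simplified)

[Block diagram: s(t) → Sampler → s(n) → summing junction → e(n) → Quantizer → Coder; the prediction subtracted at the summing junction is the previous sample (1-sample delay) scaled by the coefficient a.]

Neglecting the Quantizer, it is easy to show:

e(n) = s(n) – a s(n-1)

The Coder may be a Huffman/entropy encoder.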

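DPCM Decoder (Simplified)

[Block diagram: Decoder → e(n) → summing junction → s(n) → Reconstruction Filter → s(t); the summing junction adds the previous output sample (1-sample delay) scaled by a.]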
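A minimal sketch of the first-order DPCM pair, neglecting the quantizer, the entropy coder and the reconstruction filter. The coefficient value and sample values are made up for illustration.

```python
def dpcm_encode(samples, a):
    """Prediction error e(n) = s(n) - a * s(n-1), with s(-1) taken as 0."""
    prev = 0.0
    errors = []
    for s in samples:
        errors.append(s - a * prev)
        prev = s
    return errors

def dpcm_decode(errors, a):
    """Inverse operation: s(n) = e(n) + a * s(n-1)."""
    prev = 0.0
    out = []
    for e in errors:
        s = e + a * prev
        out.append(s)
        prev = s
    return out

samples = [10.0, 12.0, 13.0, 13.0, 11.0]
errors = dpcm_encode(samples, a=1.0)
decoded = dpcm_decode(errors, a=1.0)
print(errors, decoded)
```

After the first sample the errors are much smaller than the samples themselves, which is what lets the coder spend fewer bits on them.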

DPCM Encoder Schematic

DPCM Decoder Schematic

Increased Predictor Order

• Compression performance can be improved by basing the prediction on more samples than just the previous one.
• In the example, a 3rd-order predictor is used:
  – The previous three samples, held in R1, R2 and R3, are weighted by C1, C2 and C3 and added to form the overall prediction.
  – C1, C2 and C3 are functions of the correlation between the first sample and the following two.
  – e.g. for a Markov process, C2 = (C1)^2 and C3 = (C1)^3.
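A sketch of the third-order predictor under the Markov-process assumption quoted above. The value C1 = 0.5 and the constant test input are illustrative assumptions; the list named history plays the role of the registers R1, R2 and R3.

```python
def predict(history, coeffs):
    """Weighted sum of the three previous samples; history[0] is the most recent (R1)."""
    return sum(c * r for c, r in zip(coeffs, history))

def dpcm3_encode(samples, coeffs):
    history = [0.0, 0.0, 0.0]                    # R1, R2, R3 start empty
    errors = []
    for s in samples:
        errors.append(s - predict(history, coeffs))
        history = [s] + history[:2]              # shift the new sample into R1
    return errors

def dpcm3_decode(errors, coeffs):
    history = [0.0, 0.0, 0.0]
    out = []
    for e in errors:
        s = e + predict(history, coeffs)
        out.append(s)
        history = [s] + history[:2]
    return out

c1 = 0.5
coeffs = (c1, c1 ** 2, c1 ** 3)                  # C2 = (C1)^2, C3 = (C1)^3
samples = [8.0, 8.0, 8.0, 8.0]
errors = dpcm3_encode(samples, coeffs)
decoded = dpcm3_decode(errors, coeffs)
print(errors, decoded)
```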

DPCM: Third Order Predictor Encoder

DPCM: Third Order Decoder Schematic

Sub-band Coding (SBC)

• Quantization typically produces distortion that is broad in spectrum, but the human ear does not detect distortion equally well at all frequencies.
• It is therefore possible to achieve a substantial improvement in quality by coding speech in narrower bands.
• Speech is typically divided into four or eight sub-bands by a bank of filters; each sub-band is sampled at its band-pass Nyquist rate and encoded according to a perceptual criterion.
• SBC can be thought of as a method of controlling and distributing quantization noise across the signal spectrum.
Slide: Courtesy, Hung Nguyen
An SBC Encoder

Slide: Courtesy, Hung Nguyen

An SBC Decoder

Slide: Courtesy, Hung Nguyen

Example of SBC

• This table gives the frequency range of each band with the number of bits used to encode each band:

SB Number    Frequency (Hz)    # of encoded bits
1            225-450           4
2            450-900           3
3            1000-1500         2
4            1800-2700         1

• Assuming that no side information needs to be transmitted, compute the minimum encoding rate of this SBC encoder.
Slide: Courtesy, Hung Nguyen
Example of SBC (cont’d)

• For perfect reconstruction of band-pass signals, we need to sample at the Nyquist rate, which is twice the signal bandwidth:
  – Band 1: 2 x (450 - 225) = 450 samples/s
  – Band 2: 2 x (900 - 450) = 900 samples/s
  – Band 3: 2 x (1,500 - 1,000) = 1,000 samples/s
  – Band 4: 2 x (2,700 - 1,800) = 1,800 samples/s
• Total encoding rate is
  – 450 x 4 + 900 x 3 + 1,000 x 2 + 1,800 x 1 = 8,300 bits/s
Slide: Courtesy, Hung Nguyen
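The worked example reduces to a sum over the bands. A short sketch (the tuple layout and function name are assumptions for illustration):

```python
# (lower edge in Hz, upper edge in Hz, bits per sample) for each sub-band in the table
bands = [(225, 450, 4), (450, 900, 3), (1000, 1500, 2), (1800, 2700, 1)]

def sbc_encoding_rate(bands):
    """Sum over bands of (band-pass Nyquist rate = 2 x bandwidth) x bits per sample."""
    return sum(2 * (high - low) * bits for low, high, bits in bands)

rate = sbc_encoding_rate(bands)
print(rate, "bits/s")
```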
G.722 Adaptive DPCM

• Better sound quality than G.721
• Employs sub-band coding
• Input speech bandwidth is extended to 50 Hz to 7 kHz
• Divides the frequency band into two sub-bands
  – 50 Hz to 3.5 kHz
  – 3.5 kHz to 7 kHz
  – Each sub-band is sampled and encoded independently using ADPCM
• Operating bit rate can be 64, 56 or 48 kbps
• e.g. at 64 kbps: lower band at 48 kbps, upper band at 16 kbps
G.722 Adaptive DPCM (ADPCM) Subband Encoder

G.722 Adaptive DPCM (ADPCM) Subband Decoder

Linear Predictive Coding

• LPC analyzes the audio waveform to determine a selection of the perceptual features it contains.
• These are then quantized and sent to the destination, where they are used together with a sound synthesizer to regenerate a sound that is perceptually comparable with the original.
• Although the result sounds synthetic, very high compression ratios can be obtained.

LPC Features

• Perceptual
  – Pitch:
    • Closely related to the frequency of the signal
    • Important since the ear is more sensitive in the frequency range 2-5 kHz
  – Period:
    • The duration of the signal
  – Loudness:
    • The average energy in the signal
• Vocal Tract Excitation Parameters
  – Voiced sounds: generated through the vocal cords, such as those related to the letters m, v and l
  – Unvoiced sounds: the vocal cords are open, such as those related to the letters f and s

Linear Predictive Coding (LPC) Signal Encoder

Linear Predictive Coding (LPC) Signal Decoder

Perceptual Properties of the Ear: Sensitivity as a Function of Frequency

The ear is most sensitive in the range of 2-5 kHz.

Tone A is audible while Tone B is not.
Perceptual Properties of the Ear: Frequency Masking

A loud tone suppresses a quieter one: Tone B masks Tone A.

Tone B is audible while Tone A is not, even though Tone A would be audible by itself.

Variation with Frequency of the Effect of Frequency Masking

The masking effect is a function of frequency band. The width of each curve at a particular sound level is known as the critical bandwidth. Experiments show that the critical bandwidth increases approximately linearly with frequency, in multiples of 100 Hz; e.g. for a signal of 1 kHz (2 x 500 Hz) the critical bandwidth is about 200 Hz (2 x 100 Hz).

Temporal Masking Caused by a Loud Signal

After the ear hears a loud sound, there is a delay before it can hear a quieter sound.

MPEG Perceptual Audio Coding

• Perceptual encoding is a lossy compression technique,
  – i.e. the decoded data is not an exact replica of the original digital audio data.
  – Instead, digital audio data is compressed in such a way that, despite the high compression rate, the decoded audio sounds exactly, or as closely as possible, like the original audio.
• This is achieved by adapting the encoding process to the characteristics of human perception of sound:
  – The parts of the audio signal that humans perceive distinctly are coded with high accuracy,
  – The less distinctive parts are coded less accurately, and parts of the sound we do not hear at all are mostly discarded or replaced by quantization noise.

MPEG-1&2 Encoder

[Block diagram of the encoder, including a Psychoacoustic Model block]

New Features for Layer 3 (MP3)

• Modified DCT (MDCT)
  – DCT with overlap
  – Long/short window switching
    • Short for better temporal resolution (to prevent pre-echoes)
    • Long for better frequency resolution
• Non-uniform quantization
• Entropy coding
  – Run-length and Huffman coding
• Bit reservoir (buffer)

MPEG 1 Layer 3 (MP3) Encoder

[Block diagram: Digital Audio Signal (PCM, 768 kbit/s) → Analysis Filter Bank (32 sub-bands) → MDCT (576 lines) → Non-uniform Quantization → Huffman Encoding → Multiplexing → Coded Audio Signal at 32-192 kbit/s. In parallel, a 1024-point FFT feeds the Psychoacoustic Model (together forming the Perceptual Model), which steers the Distortion Control Loop and the Rate Control Loop around the quantizer.]

MP3 Components

• Perceptual model: An estimate of the actual (time and frequency dependent) masking threshold is computed by using rules known from psychoacoustics.
• Filter bank: A hybrid polyphase / MDCT filter bank is used to decompose the input signal into sub-sampled spectral components. Together with the corresponding inverse filter bank in the decoder it forms an analysis/synthesis system.
• Quantization and coding: The spectral components are quantized and coded with the aim of keeping the noise introduced by the quantization below the masking threshold.
  – Distortion Control Loop
  – Non-uniform Quantization Control Loop
  – Huffman Coding
• Multiplexing: A bit stream formatter is used to assemble the bit stream, which consists of the quantized and coded spectral coefficients and some side information, e.g. bit allocation information.

Perceptual Model

• The perceptual model outputs values for the masking threshold, or allowed noise, for each coder partition.
• In Layer-3, these coder partitions are roughly equivalent to the critical bands of human hearing.
  – If the quantization noise can be kept below the masking threshold for each coder partition, the compression result should be indistinguishable from the original signal.

Psychoacoustic Model

• Time-align audio data
  – The psychoacoustic model must account for both the delay of the audio data through the filter bank and a data offset, so that the relevant data is centered within its analysis window.
• Convert audio to spectral domain
  – The psychoacoustic model uses a time-to-frequency mapping such as a 512- or 1,024-point Fourier transform.
  – A standard Hanning window, applied to the audio data before Fourier transformation, conditions the data to reduce the edge effects of the transform window.
• Partition spectral values into critical bands
  – To simplify the psychoacoustic calculations, the model groups the frequency values into perceptual quanta.
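A minimal sketch of the window-then-transform step. It assumes a 64-point frame rather than the 512 or 1,024 points mentioned above so that a directly evaluated DFT stays fast; the test tone and function names are illustrative only.

```python
import cmath
import math

def hann_window(n_points):
    """Periodic Hann (Hanning) window: starts at zero and tapers back down at the frame end."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins, evaluated directly (O(N^2))."""
    n_points = len(frame)
    mags = []
    for k in range(n_points // 2):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                  for n in range(n_points))
        mags.append(abs(acc))
    return mags

N = 64
tone_bin = 8                                   # hypothetical tone centred on bin 8
audio = [math.sin(2 * math.pi * tone_bin * n / N) for n in range(N)]
window = hann_window(N)
spectrum = dft_magnitudes([w * x for w, x in zip(window, audio)])
peak_bin = spectrum.index(max(spectrum))
print(peak_bin)
```

A real encoder would use an FFT for the transform; the direct sum is used here only to keep the example dependency-free.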

MPEG Audio Filter Bank Boundaries

Finer resolution at lower frequencies

Psychoacoustic Model Functions

• Incorporate threshold in quiet
  – This threshold is the lower bound for noise masking and is determined in the absence of masking signals.
• Separate into tonal and non-tonal components
  – The model must identify and separate the tonal and noise-like components of the audio signal.
• Apply spreading function
  – The model determines the noise-masking thresholds by applying an empirically determined masking or spreading function to the signal components.
Psychoacoustic Model Functions

• Find the minimum masking threshold for each sub-band
  – The psychoacoustic model calculates the masking thresholds with a higher frequency resolution than that provided by the filter banks.
  – Where the filter band is wide relative to the critical band (at the lower end of the spectrum), the model selects the minimum of the masking thresholds covered by the filter band.
  – Where the filter band is narrow relative to the critical band, the model uses the average of the masking thresholds covered by the filter band.
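The selection rule above can be sketched as follows. The threshold values in dB and the flag marking whether the filter band is wide relative to the critical band are hypothetical inputs; a real model would derive both from its own analysis.

```python
def subband_masking_threshold(thresholds_db, band_is_wide):
    """Reduce the fine-resolution thresholds that fall inside one filter band to one value.

    band_is_wide: True when the filter band is wide relative to the critical band,
    in which case the minimum is taken; otherwise the average is used.
    """
    if band_is_wide:
        return min(thresholds_db)
    return sum(thresholds_db) / len(thresholds_db)

fine_thresholds = [12.0, 9.0, 15.0]   # made-up thresholds covered by one filter band
wide_result = subband_masking_threshold(fine_thresholds, band_is_wide=True)
narrow_result = subband_masking_threshold(fine_thresholds, band_is_wide=False)
print(wide_result, narrow_result)
```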
MPEG-1 Layer-3 Filter Bank

• The filter bank belongs to the class of hybrid filter banks.
• It is built by cascading two different kinds of filter bank:
  – First: the polyphase filter bank (as used in Layer-1 and Layer-2)
  – Second: an additional Modified Discrete Cosine Transform (MDCT)
• The polyphase filter bank has the purpose of making Layer-3 more similar to Layer-1 and Layer-2.
• The subdivision of each polyphase frequency band into 18 finer sub-bands increases the potential for redundancy removal, leading to better coding efficiency for tonal signals.
  – 576 lines = 32 sub-bands x 18
• Better frequency resolution allows finer tracking and control of the error signal.

Inner Non-uniform Quantization Rate Control Loop

• The Huffman code tables assign shorter code words to the (more frequent) smaller quantized values.
• If the number of bits needed exceeds the number of bits available to code a given block of data, the global gain is adjusted to give a larger quantization step size, and thus smaller quantized values.
• This operation is repeated with different quantization step sizes until the resulting bit demand for Huffman coding is small enough. The loop is called the rate loop because it modifies the overall coder rate until it is small enough.
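The structure of the rate loop can be sketched as below. This is a simplified illustration, not the Layer-3 procedure itself: it uses a uniform quantizer, and the function code_length_bits is a made-up stand-in for a Huffman table lookup that merely gives shorter codes to smaller values. The spectrum values, bit budget and growth factor are also assumptions.

```python
def code_length_bits(q):
    """Hypothetical code length: 1 bit for zero, growing with the magnitude of q."""
    return 1 + 2 * abs(q).bit_length()

def rate_loop(spectrum, bit_budget, step=1.0, growth=1.25):
    """Increase the quantization step until the coded block fits the bit budget."""
    if bit_budget < len(spectrum):
        raise ValueError("budget too small: even all-zero values need 1 bit each")
    while True:
        quantized = [round(x / step) for x in spectrum]
        bits = sum(code_length_bits(q) for q in quantized)
        if bits <= bit_budget:
            return step, quantized, bits
        step *= growth            # larger step -> smaller quantized values -> fewer bits

spectrum = [40.0, -12.5, 6.0, 0.8, -0.3]
step, quantized, bits = rate_loop(spectrum, bit_budget=20)
print(step, quantized, bits)
```

With a generous budget the loop returns immediately with the initial step; with a tight one it keeps coarsening the quantizer until the block fits.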

Distortion Control Loop

• To shape the quantization noise according to the masking threshold, scale factors are applied to each scale factor band.
• If the quantization noise in a given band is found to exceed the masking threshold (allowed noise) supplied by the perceptual model, the scale factor for this band is adjusted to reduce the quantization noise.
• Smaller quantization noise requires a larger number of quantization steps and thus a higher bit rate.
  – Thus the Non-uniform Quantization Rate Control Loop is repeated every time new scale factors are used.
• The outer Distortion Control Loop is executed until the actual noise (computed from the difference between the original spectral values and the quantized spectral values) is below the masking threshold for every scale factor band (i.e. critical band).

Rate Distortion Criteria

• Shannon’s rate distortion theorem states that there is a mapping from a source waveform to output code words such that, for a given distortion D, R(D) bits/sample are sufficient to reconstruct the waveform with an average distortion that is arbitrarily close to D.
• The function R(D) is called the rate distortion function and represents the fundamental limit on the achievable rate for a given distortion.
• Shannon predicted that this theoretical limit cannot be achieved by coding one sample at a time, as in a scalar quantizer, but rather by coding many samples at a time, as in vector quantization.
Slide: Courtesy, Hung Nguyen
Vector Quantization (VQ)

• VQ [Gray] is a delayed-decision coding technique which maps a vector of input samples, typically a speech frame, to a code book index.
• The code book has a finite set of vectors covering the entire range of input values.
• In each quantizing interval, the code book is searched for the best match of the input frame.
• VQ can yield better performance even when the samples are independent of one another, and performs best when there is strong correlation between samples in the group.

Slide: Courtesy, Hung Nguyen

Achievable Rate by VQ

• The rate R of a vector quantizer is defined as

  $R = \frac{\log_2 n}{L}$ bits/sample

• where L is the number of samples in the vector and n is the size of the code book.
• The distortion is measured as the squared Euclidean distance between the quantization and input vectors.
• VQ is most efficient at very low bit rates (R = 0.5 bits/sample) but is a computationally intensive operation; more efficient VQ-based algorithms are available.
Slide: Courtesy, Hung Nguyen
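A small sketch of the rate formula and of a full code book search using squared Euclidean distance. The code book entries and input vector are made-up values; a practical code book would be trained on speech data.

```python
import math

def vq_rate(codebook_size, vector_length):
    """R = log2(n) / L bits per sample."""
    return math.log2(codebook_size) / vector_length

def vq_encode(vector, codebook):
    """Index of the code book entry with the smallest squared Euclidean distance."""
    def distortion(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: distortion(codebook[i]))

# 16 code vectors of 8 samples each give 4 bits per vector, i.e. 0.5 bits/sample.
rate = vq_rate(16, 8)

toy_codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (-2.0, 3.0)]   # hypothetical, L = 2
index = vq_encode((0.9, 1.2), toy_codebook)
print(rate, index)
```

The exhaustive search shown here is what makes plain VQ computationally intensive: every input vector is compared with every code book entry.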
MPEG Layers 1, 2 & 3 Performance

Layer    Application                                    Compressed Bit Rate    Quality                I/O Delay
1        Digital Audio Cassette                         32-448 kbps            High at 192 kbps       20 ms
2        Digital Audio & Video Broadcasting             32-192 kbps            Near CD at 128 kbps    40 ms
3        CD Quality Audio Over Low Bit Rate Channels    64 kbps                CD at 64 kbps          60 ms

MPEG Perceptual Coder Schematic: (a) Encoder/Decoder (b) Example Frame Format

Perceptual Coder Schematics: (a) Forward Adaptive Bit Allocation (MPEG); (b) Fixed Bit Allocation (Dolby AC-1)

Perceptual Coder Schematics: (a) Backward Adaptive Bit Allocation (Dolby AC-2); (b) Hybrid Backward/Forward Bit Allocation (Dolby AC-3)

Mid Term Topics

• Huffman Code
• Advantages of digital over analog audio
• Shannon’s Sampling Theorem
• IIR and FIR digital filters
• Quality of Service
• JPEG compression process
• What is multimedia
• Why are psychoacoustics important
• DPCM and how it works (fundamental principle)
• User and network requirements
