
CTN521 - 3 PCM Coding

Pulse Code Modulation and Coding

Uploaded by

hrsyed

PCM Coding

Overview of Today
• PCM
– Linear Sampling Techniques
– μ-law
• DPCM
• ADPCM
• MPEG-1 Psychoacoustic Coding
• Vocoding
(The slide labels the earlier entries "Generic Coding Techniques" and vocoding "Speech Specific Techniques".)
Audio Signals
• Analog audio is basically voltage as a continuous function
of time.
• Unlike video which is 3D, audio is a 1D signal.
– Can capture without having to discretize the higher dimensions.
• Audio sampling basically boils down to quantizing signal
level to a set of values.
• Digital audio parameters:
– bits per sample
– sampling rate
– number of channels.
Sampling

• Pulse Amplitude Modulation (PAM)
– Each sample’s amplitude is represented by one analog value
• Sampling theory (Nyquist)
– If input signal has maximum frequency (bandwidth) f,
sampling frequency must be at least 2f
– With a low-pass filter to interpolate between samples, the
input signal can be fully reconstructed
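A minimal sketch of why the 2f rule matters, assuming a hypothetical 8 kHz sampling rate: a 5 kHz tone (above half the sampling rate) produces exactly the same samples as a 3 kHz tone, so the two cannot be told apart after sampling. The function name and the chosen frequencies are illustrative only.

```python
import math

def sample_tone(freq_hz, rate_hz, count):
    """Return `count` samples of a unit cosine at freq_hz, taken at rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

RATE = 8000                           # hypothetical sampling rate (Hz)
in_band = sample_tone(3000, RATE, 16)  # below RATE / 2, representable
aliased = sample_tone(5000, RATE, 16)  # above RATE / 2, folds back onto 3000 Hz

# True when every sample of the two tones matches (within rounding error).
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(in_band, aliased)
)
print(indistinguishable)
```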
PCM
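[Figure: sampled waveform drawn against 4-bit code-word levels (0100, 0011, 0010, 0001, 0000, 1001, 1010, 1011, 1100); the gap between the waveform and the nearest level is the quantization error (“noise”)]

• Pulse Code Modulation (PCM)
– Each sample’s amplitude is represented by an integer code-word
– Each bit of resolution adds 6 dB of dynamic range
– Number of bits required depends on the amount of noise that is tolerated:
$$n = \frac{\mathrm{SNR} - 4.77}{6.02} \qquad (\mathrm{SNR\ in\ dB})$$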
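The bit-count formula can be tried out directly. The sketch below only restates the slide's relation (n = (SNR − 4.77) / 6.02) and its inverse; rounding up to a whole number of bits is an assumption, and the function names are illustrative.

```python
import math

def bits_for_snr(snr_db):
    """Smallest whole number of bits that meets the target SNR (dB)."""
    return math.ceil((snr_db - 4.77) / 6.02)

def snr_for_bits(n_bits):
    """SNR in dB implied by the same relation for an n-bit quantizer."""
    return 6.02 * n_bits + 4.77

print(bits_for_snr(48))   # bits for a 48 dB target
print(bits_for_snr(96))   # bits for a 96 dB target
print(snr_for_bits(16))   # SNR the relation gives for 16 bits
```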
Linear PCM
• Uses evenly spaced quantization levels.
• Typically 16 bits per sample.
• Provides a large dynamic range.
• Quantization noise is difficult for humans to perceive at this resolution.
• Compact Discs
– 16-bit linear sampling
– 44.1 kHz sampling rate
– 2 channels
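As a quick worked example, the three digital audio parameters multiply to give the uncompressed data rate. The sketch below applies that to the Compact Disc figures above; the function name is illustrative.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Uncompressed linear PCM rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels

cd_rate = pcm_bit_rate(16, 44_100, 2)
bytes_per_minute = cd_rate // 8 * 60

print(cd_rate)            # bits per second for CD audio
print(bytes_per_minute)   # bytes needed for one minute of CD audio
```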
Non-linear Sampling
• If we try to use 8 bits per sample, dynamic
range is reduced significantly and
quantization noise can be heard.
• In particular, we end up with not enough
levels for the lower amplitudes.
• Solution is to sample more densely in the
lower amplitudes and less densely for the
higher amplitudes.
• Sort of like a log scale.
Non-linear Sampling Illustrated
[Figure: companding curve plotting output level against input level]
μ-law and A-law
• Non-linear sampling is called “companding”
• 8 bits companded provides dynamic range
equivalent to 12 bits.
• μ-law and A-law are companding standards
defined in ITU-T G.711
• They differ in the exact shape of the piece-wise
linear companding function.
μ-Law companding

• Provides 14-bit quality (dynamic range) with an 8-bit encoding
• Used in North American & Japanese ISDN voice service
• Simple to compute encoding (x normalized to [-1, 1]):
$$f(x) = 127 \cdot \operatorname{sign}(x) \cdot \frac{\ln(1 + \mu |x|)}{\ln(1 + \mu)}$$
μ-Law Encoding
[Block diagram: the sender passes high-resolution PCM (12, 14, or 16 bits) through a table lookup to produce the 8-bit μ-law encoding; the receiver uses an inverse table lookup to produce the 14-bit decoding]

Input       Step   Segment   Quantization   Code
Amplitude   Size                            Value
0-1         1      000       0000           0
1-3         2      000       0001           1
...         ...    ...       ...            ...
29-31       2      000       1111           15
31-35       4      001       0000           16
...         ...    ...       ...            ...
91-95       4      001       1111           31
95-103      8      010       0000           32
...         ...    ...       ...            ...
215-223     8      010       1111           47
223-239     16     011       0000           48
...         ...    ...       ...            ...
463-479     16     011       1111           63
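The encoding table can also be computed instead of looked up. The sketch below is one reading of the table's boundaries, not the slide's method: adding a bias of 33 turns every segment boundary (31, 95, 223, 479, ...) into a power of two, so the segment is the position of the top bit. It handles the magnitude only; the sign bit and any bit inversion a real G.711 codec applies are left out.

```python
BIAS = 33  # assumption: offset that aligns segment boundaries to powers of two

def mu_law_code(magnitude):
    """Map a non-negative integer amplitude to segment (3 bits) + step (4 bits)."""
    biased = magnitude + BIAS
    segment = biased.bit_length() - 6          # 33..63 -> 0, 64..127 -> 1, ...
    if segment > 7:
        raise ValueError("amplitude too large for 8 segments")
    step_bits = (biased >> (segment + 1)) & 0x0F
    return (segment << 4) | step_bits

for amplitude in (0, 1, 30, 31, 95, 223, 478):
    print(amplitude, format(mu_law_code(amplitude), "07b"))
```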
μ-Law Decoding
[Block diagram: same sender and receiver chain as for encoding; this slide covers the receiver's inverse table lookup]

μ-Law      Multiplier   Decode
Encoding                Amplitude
0000000    1            0
0000001    2            2
...        ...          ...
0001111    2            30
0010000    4            33
...        ...          ...
0011111    4            93
0100000    8            99
...        ...          ...
0101111    8            219
0110000    16           231
...        ...          ...
0111111    16           471
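The decode amplitudes in the table are the midpoints of the encoder's intervals, and they follow a closed form. The sketch below assumes the same bias of 33 that lines up the segment boundaries; it is an illustration of the table, and it ignores the sign bit.

```python
BIAS = 33  # assumption: same offset that aligns the encoder's segment boundaries

def mu_law_decode(code):
    """Return the midpoint amplitude for a 7-bit segment + step code."""
    segment = (code >> 4) & 0x07
    step_bits = code & 0x0F
    return (((step_bits << 1) + BIAS) << segment) - BIAS

for code in (0b0000000, 0b0000001, 0b0001111, 0b0010000, 0b0111111):
    print(format(code, "07b"), mu_law_decode(code))
```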
Difference Encoding
[Figure: sampled waveform drawn against 4-bit code-word levels (0100 down to 0000, then 1001 to 1100)]

• Differential-PCM (DPCM)
– Exploit temporal redundancy in samples
– Difference between 2 x-bit samples can be represented
with significantly fewer than x-bits
– Transmit the difference (rather than the sample)
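A minimal DPCM sketch, assuming the first sample is sent in full and every later value is the difference from its predecessor. The sample values are made up; the point is that the differences need far fewer bits than the samples.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous one."""
    previous, diffs = 0, []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    total, out = 0, []
    for d in diffs:
        total += d
        out.append(total)
    return out

def bits_needed(value):
    """Bits required to hold `value` as a signed two's-complement integer."""
    return max(value, -value - 1).bit_length() + 1

samples = [1000, 1003, 1007, 1006, 1002]   # hypothetical slowly varying signal
diffs = dpcm_encode(samples)
print(diffs)
print(max(bits_needed(s) for s in samples), "bits per raw sample")
print(max(bits_needed(d) for d in diffs[1:]), "bits per difference")
```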
Slope Overload Problem
[Figure: sampled waveform drawn against 4-bit code-word levels; a steep section the coded differences cannot follow is labelled “Slope Overload”]

• Differences in high frequency signals near the
Nyquist frequency cannot be represented with a
smaller number of bits!
– Error introduced leads to severe distortion in the higher
frequencies
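Slope overload is easy to reproduce. The sketch below assumes a DPCM coder whose differences are clamped to a signed 4-bit range; the two test signals are made up. The slow ramp is reconstructed exactly, while the fast alternation is badly flattened.

```python
def dpcm_clamped(samples, bits):
    """Decode what a DPCM link would deliver if differences are limited to `bits`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    previous, reconstructed = 0, []
    for s in samples:
        diff = max(lo, min(hi, s - previous))   # clamp to the representable range
        previous += diff                        # encoder tracks the decoder's value
        reconstructed.append(previous)
    return reconstructed

slow = [0, 2, 4, 6, 8, 10]        # small sample-to-sample changes
fast = [0, 40, 0, 40, 0, 40]      # swings at the Nyquist frequency

print(dpcm_clamped(slow, 4))
print(dpcm_clamped(fast, 4))
```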
Adaptive DPCM (ADPCM)

• Use a larger step-size to encode differences
between high-frequency samples & a smaller
step-size for differences between low-frequency
samples
• Use previous sample values to estimate changes in
the signal in the near future
ADPCM
• To ensure differences are always small...
– Adaptively change the step-size (quanta)
– (Adaptively) attempt to predict next sample
value
[Block diagram: the predicted sample is subtracted from the y-bit PCM sample; the difference quantizer emits the x-bit ADPCM “difference”, which also feeds the step-size adjuster and, through a dequantizer, the predictor for sample n+1]
IMA’s proposed ADPCM
[Block diagram: the previous sample n–1 (held in a register) is subtracted from the 16-bit PCM sample; the difference quantizer emits a 4-bit ADPCM difference; a dequantizer and step-size adjuster close the loop]

• Predictor is not adaptive and simply uses the last
sample value
• Quantization step-sizes are spaced logarithmically,
and the step-size grows with signal frequency
IMA Difference Quantization
[Block diagram: IMA ADPCM encoder; the quantizer outputs the difference between the 16-bit PCM sample and sample n–1 in step-size units]

Quantizer difference                            Output   Step-Size
                                                         Multiple
difference < 1/4 step_size                      000      0.0
1/4 step_size ≤ difference < 1/2 step_size      001      0.25
1/2 step_size ≤ difference < 3/4 step_size      010      0.50
3/4 step_size ≤ difference < step_size          011      0.75
step_size ≤ difference < 5/4 step_size          100      1.0
5/4 step_size ≤ difference < 3/2 step_size      101      1.25
3/2 step_size ≤ difference < 7/4 step_size      110      1.5
7/4 step_size ≤ difference                      111      1.75
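The quantizer table amounts to counting whole quarter-steps and capping at seven. The sketch below follows that reading; treating the fourth bit of the 4-bit code as a sign bit is an assumption, and the function names are illustrative.

```python
MULTIPLES = [0.0, 0.25, 0.50, 0.75, 1.0, 1.25, 1.5, 1.75]

def quantize(difference, step_size):
    """Return (sign_bit, 3-bit magnitude code) for an integer difference."""
    sign_bit = 1 if difference < 0 else 0          # assumption: 4th bit is the sign
    code = min((abs(difference) * 4) // step_size, 7)
    return sign_bit, code

def dequantize(sign_bit, code, step_size):
    """Rebuild the difference as a multiple of the step size."""
    value = MULTIPLES[code] * step_size
    return -value if sign_bit else value

print(quantize(5, 7), dequantize(0, 2, 7))
print(quantize(-49, 55), dequantize(1, 3, 55))
```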
IMA Step-size Table
Index  Step    Index  Step    Index  Step    Index  Step    Index  Step
       Size           Size           Size           Size           Size
0      7       18     41      36     230     54     1282    72     7132
1      8       19     45      37     253     55     1411    73     7845
2      9       20     50      38     279     56     1552    74     8630
3      10      21     55      39     307     57     1707    75     9493
4      11      22     60      40     337     58     1878    76     10442
5      12      23     66      41     371     59     2066    77     11487
6      13      24     73      42     408     60     2272    78     12635
7      14      25     80      43     449     61     2499    79     13899
8      16      26     88      44     494     62     2749    80     15289
9      17      27     97      45     544     63     3024    81     16818
10     19      28     107     46     598     64     3327    82     18500
11     21      29     118     47     658     65     3660    83     20350
12     23      30     130     48     724     66     4026    84     22385
13     25      31     143     49     796     67     4428    85     24623
14     28      32     157     50     876     68     4871    86     27086
15     31      33     173     51     963     69     5358    87     29794
16     34      34     190     52     1060    70     5894    88     32767
17     37      35     209     53     1166    71     6484
Adaptive Step-size Selection
[Block diagram: the quantizer output selects an index adjustment by table lookup; the adjustment is added to the previous index (held in a register), range-limited to 0 to 88, and used to look up the new step-size in the step-size table]
Adaptive Step-size Selection
[Block diagram: step-size adjuster detail (adjustment lookup, previous index register, range limit 0 to 88, step-size table lookup giving the new step-size)]

Quantizer difference                            Output   Table Index   Step-Size
                                                         Adjustment    Adjustment
difference < 1/4 step_size                      000      -1            × 0.91
1/4 step_size ≤ difference < 1/2 step_size      001      -1            × 0.91
1/2 step_size ≤ difference < 3/4 step_size      010      -1            × 0.91
3/4 step_size ≤ difference < step_size          011      -1            × 0.91
step_size ≤ difference < 5/4 step_size          100      +2            × 1.21
5/4 step_size ≤ difference < 3/2 step_size      101      +4            × 1.46
3/2 step_size ≤ difference < 7/4 step_size      110      +6            × 1.77
7/4 step_size ≤ difference                      111      +8            × 2.14
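A small sketch of the index update, assuming the adjustment is simply added to the previous index and clamped to 0..88. It also checks an observation, not stated on the slide, that the step-size adjustment factors match 1.1 raised to the index adjustment, since neighbouring step-size table entries differ by about 10%.

```python
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]   # indexed by the 3-bit quantizer output
MAX_INDEX = 88

def next_index(index, code):
    """Apply the table index adjustment and keep the result in 0..MAX_INDEX."""
    return max(0, min(MAX_INDEX, index + INDEX_ADJUST[code]))

# Approximate step-size factor implied by each adjustment (ratio of about 1.1 per entry).
factors = [round(1.1 ** adj, 2) for adj in INDEX_ADJUST]

print(next_index(0, 0b010))    # cannot go below 0
print(next_index(0, 0b111))    # large difference: jump up 8 entries
print(next_index(85, 0b111))   # cannot go above 88
print(factors)
```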
IMA ADPCM Example
[Block diagram: IMA ADPCM encoder loop (difference quantizer, step-size adjuster, dequantizer, register holding Xn–1)]

Columns: X = input, Δ = difference, Step = step size, Q = quantizer output, Adj = index adjustment, I = step-size index, M = step-size multiplier, Recon = reconstructed difference, Decode = predicted value.

X     Δ     Step   Q     Adj   I    M      Recon   Decode
150         7                  0                   150
155   5     7      010   -1    0    0.5    3.5     154
167   13    7      111   8     8    1.75   12      166
170   4     16     001   -1    7    0.25   4       170
250   80    14     111   8     15   1.75   24.5    195
250   55    31     111   8     23   1.75   54      249
250   1     66     000   -1    22   0.0    0       249
250   1     60     000   -1    21   0.0    0       249
200   -49   55     011   -1    20   0.75   -41     208
200   (six further inputs of 200; remaining columns are left blank on the slide)
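The worked example can be traced in code. This is a sketch of the slide's simplified arithmetic (quarter-step multiples, predicted value rounded half up), not the bit-exact IMA reference algorithm; the step-size table is the standard 89-entry IMA table and the function name is illustrative.

```python
import math

STEP_SIZES = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37,
    41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173,
    190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
    724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
    7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289, 16818,
    18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]

def ima_trace(samples):
    """Return one (input, difference, step, code, new_index, predicted) row per sample."""
    predicted, index = samples[0], 0      # first sample is taken as-is
    rows = []
    for x in samples[1:]:
        step = STEP_SIZES[index]
        diff = x - predicted
        code = min(abs(diff) * 4 // step, 7)            # 3-bit magnitude
        recon = math.copysign(code * step / 4, diff)    # dequantized difference
        predicted = math.floor(predicted + recon + 0.5)  # round half up
        index = max(0, min(88, index + INDEX_ADJUST[code]))
        rows.append((x, diff, step, code, index, predicted))
    return rows

for row in ima_trace([150, 155, 167, 170, 250, 250, 250, 250, 200]):
    print(row)
```

Extending the input with further samples of 200 shows how the step size shrinks until the decoder catches up with the signal.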
Networking Considerations
• The IMA codec is reasonably robust to errors
• An interval with a low-level signal will correct any step-size error
– Outputs 000 to 011 each lower the index by 1, so encoder and decoder both reach the minimum index and agree again

[Block diagram: decoder loop (dequantizer, step-size adjuster, register holding PCM sample n–1)]

Quantizer difference                            Output   Table Index
                                                         Adjustment
difference < 1/4 step_size                      000      -1
1/4 step_size ≤ difference < 1/2 step_size      001      -1
1/2 step_size ≤ difference < 3/4 step_size      010      -1
3/4 step_size ≤ difference < step_size          011      -1
step_size ≤ difference < 5/4 step_size          100      +2
5/4 step_size ≤ difference < 3/2 step_size      101      +4
3/2 step_size ≤ difference < 7/4 step_size      110      +6
7/4 step_size ≤ difference                      111      +8
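The self-correcting behaviour can be illustrated with the index alone. In this sketch the decoder's index is assumed to have been corrupted (40 instead of 10, both values made up); a run of low-level codes drives both indices down to the floor of 0, after which they agree.

```python
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]

def run_indices(start, codes):
    """Track the step-size index over a sequence of 3-bit quantizer outputs."""
    index, history = start, []
    for code in codes:
        index = max(0, min(88, index + INDEX_ADJUST[code]))
        history.append(index)
    return history

quiet = [0] * 40                      # low-level interval: every output is 000
encoder = run_indices(10, quiet)      # correct index
decoder = run_indices(40, quiet)      # hypothetical corrupted index

first_match = next(i for i, (e, d) in enumerate(zip(encoder, decoder)) if e == d)
print(first_match, encoder[-1], decoder[-1])
```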
Psychoacoustic Properties
[Figure: sound level (dB, 0 to 100) against frequency (0.02 to 20 kHz, logarithmic axis); the region above the curve is marked audible and the region below it inaudible]

• Human perception of sound is a function of frequency
and signal strength
– (MPEG exploits this relationship.)
Auditory Masking
[Figure: sound level (dB, 0 to 100) against frequency (0.02 to 20 kHz, logarithmic axis); a masking tone and a weaker nearby masked tone are marked, together with the audible and inaudible regions]

• The presence of tones at certain frequencies makes
us unable to perceive tones at other “nearby”
frequencies
– Humans cannot distinguish between tones within 100
Hz at low frequencies and 4 kHz at high frequencies
MPEG Encoder Block Diagram
[Block diagram with these blocks: PCM audio samples (32, 44.1, 48 kHz), mapping, quantizer and coding, psychoacoustic model, frame packing, ancillary data, encoded bitstream]
Subband Filter
• Transforms signal from time domain to
frequency domain.
– 32 PCM samples yield 32 subband samples.
• Each subband corresponds to a frequency band; the bands are evenly
spaced from 0 to the Nyquist frequency.
– Filter actually works on a window of 512
samples that is shifted over 32 samples at a time.
• Subband coefficients are analyzed with
psychoacoustic model, quantized, and coded.
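As a worked example of the numbers above, the sketch below computes the width of each of the 32 evenly spaced subbands and how many 32-sample shifts fit in the 512-sample window. The 48 kHz rate is just one of the MPEG-1 sampling rates, and the function name is illustrative.

```python
def subband_layout(sample_rate_hz, subbands=32, window=512, hop=32):
    """Band width, band edges, and window-to-shift ratio for the filter bank."""
    nyquist = sample_rate_hz / 2
    width = nyquist / subbands
    return {
        "band_width_hz": width,
        "edges_hz": [i * width for i in range(subbands + 1)],
        "shifts_per_window": window // hop,
    }

layout = subband_layout(48_000)
print(layout["band_width_hz"])       # width of every subband in Hz
print(layout["edges_hz"][:3])        # first few band edges
print(layout["shifts_per_window"])   # each sample is seen by this many windows
```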
Layer 1
• 384 samples per frame.
• Iterative bit allocation process:
– For each subband, determine MNR.
– Increase number of quantization bits for
subband with smallest MNR.
– Iterate until all bits used.
• Fixed allocation of bits among subbands for
a particular frame.
• Up to 448 kb/s
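The iterative allocation loop can be sketched as a greedy procedure. Several things here are assumptions rather than the standard's exact rules: MNR is taken as SNR minus the signal-to-mask ratio (SMR) from the psychoacoustic model, each added bit is assumed to improve SNR by 6.02 dB, the bit budget counts one bit per subband per step, and the SMR values are made up.

```python
def allocate_bits(smr_db, total_bits, max_bits_per_band=15):
    """Repeatedly give one more bit to the subband with the smallest MNR."""
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        open_bands = [b for b in range(len(bits)) if bits[b] < max_bits_per_band]
        if not open_bands:
            break
        # MNR = SNR - SMR, with SNR approximated as 6.02 dB per allocated bit.
        worst = min(open_bands, key=lambda b: 6.02 * bits[b] - smr_db[b])
        bits[worst] += 1
    return bits

smr = [20.0, 5.0, -3.0, 12.0]    # hypothetical signal-to-mask ratios (dB)
print(allocate_bits(smr, 8))
```

Subbands whose signal is already below the masking threshold (negative SMR) receive bits last, which is the point of the psychoacoustic model.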
Layer 2
• 1152 samples per frame.
• Iterative bit allocation.
• Subband allocation is dynamic.
• Up to 384 kb/s
Layer 3
• 1152 samples
– Up to 320 kb/s
• Each subband further analyzed using MDCT
to create 576 frequency lines.
– 4 different windowing schemes depending on
whether samples contain “attack” of new
frequencies.
• Lots of bit allocation options for quantizing
frequency coefficients.
• Quantized coefficients Huffman coded.
Vo-coding
• Concept: Develop a mathematical
model of the vocal cords & throat
– Derive/compute model parameters for
a short interval and transmit to the
decoder
– Use the parameters to synthesize
speech at the decoder

• So what is a good model?
– A “buzzer” in a “tube”!
– The buzzer is characterized by its
intensity & pitch
– The tube is characterized by its
formants
Vocoding - Basic Concepts
[Figure: speech spectrum; amplitude (0 to 75) against frequency (kHz)]

• Formant: frequency maxima & minima in
the spectrum of the speech signal
• Vocoders group and code portions of the
signal by amplitude
“Buzzer” and “Tube” Model

[Figure: buzzer-and-tube model producing speech (“yadda yadda yadda”)]

• Vocoding principles:
– voice = formants + buzz pitch & intensity
– voice – estimated formants = “residue”
• Linear Predictive Coding (LPC)
– A sample is represented as a linear combination of p
previous samples
$$y(n) = \sum_{k=1}^{p} a_k \, y(n-k) + G \cdot x(n)$$
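The LPC relation can be run directly as a synthesis filter: each output sample is a weighted sum of the p previous outputs plus the scaled excitation. The sketch below is a plain restatement of that equation with made-up coefficients; samples before the start are assumed to be zero.

```python
def lpc_synthesize(coeffs, gain, excitation):
    """y(n) = sum_{k=1..p} a_k * y(n - k) + G * x(n), with y(n) = 0 for n < 0."""
    p = len(coeffs)
    y = []
    for n, x in enumerate(excitation):
        acc = gain * x
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += coeffs[k - 1] * y[n - k]
        y.append(acc)
    return y

# One-coefficient model (p = 1) driven by a single pulse from the "buzzer".
print(lpc_synthesize([0.5], 1.0, [1, 0, 0, 0]))
```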
LPC
• Decoder artificially generates speech via formant synthesis
– A mathematical simulation of the vocal tract as a series of bandpass
filters
– Encoder codes & transmits filter coefficients, pitch period, gain
factor, & nature of excitation
• Standards:
– Regular Pulse Excited Linear Predictive Coder (RPE-LPC)
• Digital cellular standard GSM 06.10 (13 kbps)
– Code Excited Linear Predictive Coder (CELP)
• US Federal Standard 1016 (4.8 kbps)
– Linear Predictive Coder (LPC)
• US Federal Standard 1015 (2.4 kbps)
Networking Concerns
• Audio bandwidth is actually quite small.
• But human sensitivity to loss and noise is
quite high.
• Networking concerns:
– Loss concealment
– Jitter control
• Especially for telephony applications.
