
Ajay Kumar Garg Engineering College

Wireless & Mobile Communication


(KEC-076)
UNIT 2
Vocoders
Lecture-1
By

Mr. Naveen Kumar Saini


Assistant Professor
Department of Electronics & Communication Engineering
VOCODERS
• A vocoder (short for "voice coder") is an audio processor used to transmit speech or voice signals in the form of digital data. Vocoders are used for digital coding of speech and for voice simulation. Available narrowband vocoders operate at bit rates from 1.2 to 64 kbps.
• A vocoder operates on the principle of formants. Formants are the meaningful components of speech generated by the human voice.
• When a speech signal is transmitted, it is not necessary to transmit the precise waveform; we can simply transmit the information from which that waveform can be reconstructed. The waveform reconstructed at the receiver must be perceptually similar to the waveform actually transmitted, though not necessarily identical to it.

• A vocoder works by first capturing the characteristic elements of the signal; these characteristic parameters are then used to shape other audio signals or to resynthesize the speech.
• Vocoders are a class of speech coding systems that analyze the voice signal at the
transmitter, transmit parameters derived from the analysis, and then synthesize the voice at
the receiver using those parameters.
• The pitch frequency for most speakers is below 300 Hz, and extracting this information
from the signal is very difficult.
• The pole frequencies correspond to the resonant frequencies of the vocal tract and are
often called the formants of the speech signal.
• For adult speakers, the formants are centered around 500 Hz, 1500 Hz, 2500 Hz, and
3500 Hz.

HUMAN SPEECH PRODUCTION SYSTEM

[Figure] Speech production: (a) human speech production modelling; (b) equivalent synthetic speech production blocks.
• A voice model is used to simulate voice. As speech contains a sequence of voiced
and unvoiced sounds, this is the basis for the operation of a voice model.
• Before proceeding further, it is better to first understand what voiced and
unvoiced sounds are.
• Voiced sounds are the sounds generated by vibrations of the vocal cords.
• By contrast, the sounds produced when pronouncing letters such as 's', 'p' or 'f' are known as unvoiced sounds. Unvoiced sounds are generated by expelling air through the lips and teeth.
• Voiced sounds are simulated by an impulse generator whose frequency equals the fundamental frequency of the vocal cords. The noise source in the circuit is used to simulate the unvoiced sounds.
• The position of the switch helps in determining whether the sound is voiced or
unvoiced.
• The selected signal is then passed through a filter that simulates the effect of the mouth, throat and nasal passage of the speaker. The filter shapes the input so that the required sound is produced. Thus we obtain a synthesized approximation of the speech waveform.
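The voiced/unvoiced switch, impulse generator, noise source and vocal-tract filter described above can be sketched in a few lines of Python. The pitch, frame length and one-pole filter coefficient below are illustrative choices, not values from any standard:

```python
import numpy as np

def synthesize(voiced, pitch_hz=120, fs=8000, n=800):
    """Toy voice model: impulse train (voiced) or noise (unvoiced),
    shaped by a simple one-pole 'vocal tract' filter."""
    if voiced:
        excitation = np.zeros(n)
        period = int(fs / pitch_hz)        # samples per pitch period
        excitation[::period] = 1.0         # impulse generator (vocal cords)
    else:
        excitation = np.random.randn(n)    # noise source (unvoiced sounds)
    # Stand-in for the mouth/throat/nasal filter: y[m] = e[m] + 0.9*y[m-1]
    a = 0.9
    out = np.zeros(n)
    out[0] = excitation[0]
    for m in range(1, n):
        out[m] = excitation[m] + a * out[m - 1]
    return out

s = synthesize(voiced=True)    # synthesized approximation of a voiced sound
```

Switching `voiced` between True and False plays the role of the voiced/unvoiced switch in the block diagram.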
VOICE ENCODER

VOICE DECODER

TYPES OF VOCODERS:
• Channel Vocoders
• Formant Vocoders
• Cepstrum Vocoders
• Voice-Excited Vocoders
• LPC-10 which uses linear predictive coding
• Code-excited linear prediction (CELP)
• Mixed-excitation linear prediction (MELP)
• Adaptive Differential Pulse Code Modulation (ADPCM)

Channel Vocoders
• The sound-generating mechanism forms the source and is linearly separable from the intelligence-modulating vocal-tract filter, which forms the system.

• The speech signal is assumed to be of two types: voiced and unvoiced

• Voiced sounds result from quasiperiodic vibrations of the vocal cords; unvoiced sounds are fricatives produced by turbulent air flow through a constriction.

• The pitch frequency for most speakers is below 300 Hz.

• The pole frequencies correspond to the resonant frequencies of the vocal tract and are often called the formants of the speech signal.
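A channel vocoder's analysis stage measures the energy in a bank of contiguous frequency bands. The sketch below approximates such a filter bank with FFT bins; the four 1 kHz-wide bands and the two-tone test signal are invented for illustration (real channel vocoders use on the order of 16 or more narrower bands):

```python
import numpy as np

fs, n = 8000, 256
t = np.arange(n) / fs
# Test frame: a strong 500 Hz tone (near a formant) plus a weak 2500 Hz tone.
x = np.sin(2 * np.pi * 500 * t) + 0.3 * np.sin(2 * np.pi * 2500 * t)

power = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Four illustrative analysis bands; the vocoder would transmit one
# energy value per band instead of the waveform itself.
bands = [(0, 1000), (1000, 2000), (2000, 3000), (3000, 4000)]
energies = [power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
```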
Formant Vocoders
• The formant vocoder is similar in concept to the channel vocoder.
• The formant vocoder can operate at lower bit rates than the channel vocoder
because it uses fewer control signals
• The formant vocoder attempts to transmit the positions of the peaks (formants) of
the spectral envelope, instead of sending samples
• A formant vocoder must be able to identify at least three formants for representing
the speech sounds
• It must control the intensities of the formants.
• Formant vocoder can reproduce speech at bit rates lower than 1200 bits/s.
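Since the formant vocoder transmits peak positions rather than band samples, its core analysis step is peak picking on the spectral envelope. Below, a synthetic envelope with bumps at the typical adult formant centres (500, 1500, 2500 Hz) stands in for a real LPC- or filter-bank-derived envelope:

```python
import numpy as np

# Made-up smooth spectral envelope with peaks at 500, 1500 and 2500 Hz.
freqs = np.arange(0.0, 4000.0, 10.0)
envelope = sum(np.exp(-(((freqs - f0) / 150.0) ** 2)) for f0 in (500, 1500, 2500))

# Minimal peak picking: a sample higher than both neighbours is a formant.
formants = [int(freqs[i]) for i in range(1, len(freqs) - 1)
            if envelope[i] > envelope[i - 1] and envelope[i] > envelope[i + 1]]
# The vocoder would transmit these peak positions (plus intensities) only.
```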

Cepstrum Vocoders:
• The cepstrum vocoder separates the excitation and the vocal tract spectrum by inverse Fourier transforming the log magnitude spectrum to produce the cepstrum of the signal.
• The low-quefrency coefficients in the cepstrum correspond to the vocal tract spectral envelope.
• The high-quefrency excitation coefficients form a periodic pulse train at multiples of the pitch period.
• Linear filtering (liftering) is performed to separate the vocal tract cepstral coefficients from the excitation coefficients.
• In the receiver, the vocal tract cepstral coefficients are Fourier transformed to produce the vocal tract impulse response.
• By convolving this impulse response with a synthetic excitation signal (random noise or a periodic pulse train), the original speech is reconstructed.
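The analysis chain above (FFT → log magnitude → inverse FFT → liftering) can be demonstrated on a synthetic frame. The pulse period, frame length and the cut at 30 low-quefrency bins are arbitrary illustrative choices:

```python
import numpy as np

fs, n = 8000, 1024
# Synthetic voiced frame: pulse train (pitch period = 100 samples)
# convolved with a decaying 'vocal tract' impulse response.
x = np.zeros(n)
x[::100] = 1.0
x = np.convolve(x, 0.95 ** np.arange(50), mode="same")

spectrum = np.fft.fft(x)
log_mag = np.log(np.abs(spectrum) + 1e-12)   # log magnitude spectrum
cepstrum = np.fft.ifft(log_mag).real         # inverse FFT -> real cepstrum

# Liftering: low-quefrency bins describe the vocal-tract envelope;
# the excitation shows up as peaks near multiples of the pitch period.
envelope_part = cepstrum[:30]
excitation_part = cepstrum[30:n // 2]
```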
Voice-Excited Vocoder:
• Voice-excited vocoders eliminate the need for pitch extraction and voicing
detection operations.

• This system uses a hybrid combination of PCM transmission for the low
frequency band of speech, combined with channel vocoding of higher frequency
bands.

• A pitch signal is generated at the synthesizer by rectifying, band pass filtering, and
clipping the baseband signal.

• Voice-excited vocoders have been designed for operation at 7200 bits/s to 9600 bits/s.
LPC-10 (which uses linear predictive coding)
• Linear predictive coding (LPC) is a method used mostly in audio signal
processing and speech processing for representing the spectral envelope of
a digital signal of speech in compressed form, using the information of
a linear predictive model.
• Most signals, such as speech, music and video signals, are partially predictable
and partially random.
• These signals can be modelled as the output of a filter excited by an uncorrelated
input.
• The random input models the unpredictable part of the signal, whereas the filter
models the predictable structure of the signal.
• The aim of linear prediction is to model the mechanism that introduces the
correlation in a signal.

• Speech is generated by inhaling air and then exhaling it through the glottis and the vocal tract. The noise-like air flow from the lungs is modulated and shaped by the vibrations of the glottal cords and the resonances of the vocal tract.
• Figure illustrates a source-filter model of speech. The source models the lung, and
emits a random input excitation signal which is filtered by a pitch filter.

The pitch filter (long-term predictor) models the correlation of each sample with the samples a pitch period away.
The vocal tract (short-term predictor) models the correlation of each sample with the few preceding samples.
• A linear predictor model forecasts the amplitude of a signal at time m, x(m), using a linearly weighted combination of P past samples [x(m−1), x(m−2), ..., x(m−P)]:

x̂(m) = Σ aₖ x(m−k),  summed over k = 1, 2, ..., P

where the integer variable m is the discrete time index, x̂(m) is the prediction of x(m), and the aₖ are the predictor coefficients.
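The prediction formula can be checked numerically. The order P = 2 and the coefficients below are invented so that the test signal follows the model exactly, making the prediction error (numerically) zero:

```python
import numpy as np

a = np.array([1.5, -0.81])   # example predictor coefficients a_1, a_2 (P = 2)

# Build a signal that obeys x(m) = a_1*x(m-1) + a_2*x(m-2) exactly.
x = np.zeros(50)
x[0], x[1] = 1.0, 1.5
for m in range(2, 50):
    x[m] = a[0] * x[m - 1] + a[1] * x[m - 2]

def predict(x, m, a):
    """x_hat(m) = sum over k = 1..P of a_k * x(m - k)"""
    return sum(a[k] * x[m - 1 - k] for k in range(len(a)))

errors = [x[m] - predict(x, m, a) for m in range(2, 50)]
```

In a real LPC-10 coder the aₖ are re-estimated every frame (for example with the Levinson–Durbin recursion) rather than fixed in advance.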

Code-excited linear prediction (CELP)
One of the main principles behind CELP is called analysis-by-synthesis (AbS), meaning that the encoding (analysis) is performed by perceptually optimising the decoded (synthesis) signal in a closed loop.

The CELP technique is based on three ideas:
1. The use of a linear prediction (LP) model to model the vocal tract
2. The use of (adaptive and fixed) codebook entries as input (excitation) of the LP model
3. A search performed in closed loop in a "perceptually weighted domain"
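The analysis-by-synthesis loop can be shown in miniature: synthesize every codebook entry through the LP filter and keep the index whose output best matches the target frame. The 16-entry random codebook, the one-pole "LP filter" and the unweighted squared error below are drastic simplifications of a real CELP coder:

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 40))   # 16 candidate excitation vectors

def lp_synth(excitation, a=0.8):
    """Toy one-pole LP synthesis filter: y[m] = e[m] + a*y[m-1]."""
    y = np.zeros_like(excitation)
    y[0] = excitation[0]
    for m in range(1, len(excitation)):
        y[m] = excitation[m] + a * y[m - 1]
    return y

target = lp_synth(codebook[7])             # pretend this is the input frame

# Closed-loop (analysis-by-synthesis) search over the codebook:
errors = [np.sum((target - lp_synth(c)) ** 2) for c in codebook]
best = int(np.argmin(errors))              # only this index is transmitted
```

A real coder would weight the error with a perceptual filter and search an adaptive (pitch) codebook as well as the fixed one.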
Mixed-excitation linear prediction (MELP)
• The MELP vocoder evolved from improvements and modifications to the traditional linear predictive coder known as LPC-10.
• The MELP coder uses a mixed-excitation model that can produce more natural sounding
speech because it can represent a richer ensemble of possible speech characteristics.
• MELP encoding is robust in difficult acoustic environments with significant background
noise and reverberation such as those frequently encountered in commercial and military
communication systems.

• The Mixed Excitation Linear Prediction coder is based on the traditional Linear
Prediction Coder (LPC) parametric model, but also includes five additional features. They
are:
1. Mixed excitation
2. Aperiodic pulses
3. Adaptive spectral enhancement
4. Pulse dispersion
5. Fourier magnitude modeling.
• When the input speech is voiced, the MELP coder can synthesize using either periodic or
aperiodic pulses.
• Aperiodic pulses are used most often during transition regions between voiced and
unvoiced segments of the speech signal. This feature enables the decoder to reproduce
erratic glottal pulses without introducing tonal sounds.

Adaptive differential pulse code modulation
(ADPCM)
• Adaptive differential pulse code modulation (ADPCM) is a very efficient technique for the digital coding of waveforms.

• The principle of ADPCM is to use knowledge of the signal's past to predict its future values; the signal actually encoded is the error of this prediction.

• PCM is performed before ADPCM: each sample passes through a PCM stage before being transformed into an ADPCM sample, which decreases the number of bits needed for coding.

• ADPCM Encoder:
• Subsequent to the conversion of the A-law or µ-law PCM input signal to uniform PCM, a difference signal is obtained by subtracting an estimate of the input signal from the input signal itself.
• An adaptive 31-, 15-, 7-, or 4-level quantizer is used to assign five, four, three, or two
binary digits, respectively, to the value of the difference signal for transmission to the
decoder.

• ADPCM Encoder:
• An inverse quantizer produces a quantized difference signal from these same five, four,
three or two binary digits, respectively.
• The signal estimate is added to this quantized difference signal to produce the
reconstructed version of the input signal.
• Both the reconstructed signal and the quantized difference signal are operated upon by an
adaptive predictor, which produces the estimate of the input signal, thereby completing
the feedback loop.
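The encoder feedback loop described above (difference signal, quantizer, inverse quantizer, signal estimate) can be sketched with a fixed 4-level (2-bit) quantizer and a trivial "previous reconstructed value" predictor. Real G.726 ADPCM adapts both the quantizer step size and the predictor; this sketch does neither:

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])   # fixed 4-level (2-bit) quantizer

def encode(samples):
    codes, estimate = [], 0.0
    for s in samples:
        d = s - estimate                     # difference signal
        code = int(np.argmin(np.abs(LEVELS - d)))
        codes.append(code)                   # 2 bits per sample transmitted
        estimate += LEVELS[code]             # feedback: mimic the decoder
    return codes

def decode(codes):
    out, estimate = [], 0.0
    for code in codes:
        estimate += LEVELS[code]             # inverse quantizer + predictor
        out.append(float(estimate))
    return out

reconstructed = decode(encode([0.0, 1.2, 2.5, 3.9, 3.0]))
```

Because the encoder's estimate is built from the quantized differences (not the true input), encoder and decoder stay in lockstep, which is the point of the feedback loop.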

• ADPCM Decoder:
• The decoder includes a structure identical to the feedback portion of the encoder, together with a uniform PCM to A-law or µ-law conversion and a synchronous coding adjustment.
• The synchronous coding adjustment prevents cumulative distortion occurring on
synchronous tandem coding (ADPCM, PCM, ADPCM, etc., digital connections) under
certain conditions.
• The synchronous coding adjustment is achieved by adjusting the PCM output codes in a
manner which attempts to eliminate quantizing distortion in the next ADPCM encoding
stage.

THANK YOU!!

