ch5.3 (Vocoders)
The bandwidth of the lowpass filter is selected to match the time variations in the characteristics of the vocal tract.
In addition to measuring the spectral magnitudes, the speech analysis includes a voicing detector and a pitch estimator.
[Figure: channel vocoder analyzer. Blocks: S(n), bandpass filter, rectifier, lowpass filter, A/D converter, encoder, to channel; voicing detector; pitch detector.]
At the receiver, the signal samples are passed through D/A converters. The outputs of the D/A converters are multiplied by the voiced or unvoiced signal sources. The resulting signals are passed through bandpass filters, and the outputs of the bandpass filters are summed to form the synthesized speech signal.
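The analysis and synthesis chain of the channel vocoder can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-pole resonator, the one-pole lowpass filter, the band centers, and all function names are hypothetical choices, and the A/D, D/A, and coding stages are omitted.

```python
import math

def bandpass(x, center_hz, fs, r=0.95):
    """Two-pole resonator centred at center_hz (assumed stand-in for one bank filter)."""
    w = 2 * math.pi * center_hz / fs
    a1, a2 = -2 * r * math.cos(w), r * r
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = (1 - r) * s - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def envelope(x, alpha=0.01):
    """Full-wave rectifier followed by a one-pole lowpass filter."""
    e = 0.0
    out = []
    for s in x:
        e += alpha * (abs(s) - e)
        out.append(e)
    return out

def analyze(speech, centers, fs):
    """One spectral-magnitude track per channel: bandpass, rectify, lowpass."""
    return [envelope(bandpass(speech, fc, fs)) for fc in centers]

def synthesize(envelopes, excitation, centers, fs):
    """Scale the excitation by each channel magnitude, bandpass, and sum."""
    out = [0.0] * len(excitation)
    for env, fc in zip(envelopes, centers):
        band = bandpass([e * x for e, x in zip(env, excitation)], fc, fs)
        out = [o + b for o, b in zip(out, band)]
    return out

fs = 8000
centers = [500, 1000, 2000, 3000]            # hypothetical band centers in Hz
speech = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(2000)]
envs = analyze(speech, centers, fs)
# Voiced excitation: a pulse train with an assumed pitch period of 80 samples.
pulses = [1.0 if n % 80 == 0 else 0.0 for n in range(len(speech))]
synth = synthesize(envs, pulses, centers, fs)
print([round(env[-1], 3) for env in envs])
```

For the 1000 Hz test tone, the channel centered at 1000 Hz ends up with the largest magnitude, which is the kind of spectral-envelope information the encoder would transmit.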
[Figure: channel vocoder synthesizer. Blocks: from channel, decoder, voicing information, pitch period, pulse generator.]
The phase vocoder is similar to the channel vocoder. However, instead of estimating the pitch, the phase vocoder estimates the phase derivative at the output of each filter.
By coding and transmitting the phase derivative, this vocoder destroys the phase information.
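[Figure: phase vocoder analysis and synthesis for the kth channel. Analysis blocks: S(n), multiplication by cos ω_k n and sin ω_k n, a_k(n), b_k(n), differentiators, short-term magnitude, short-term phase derivative, decimate, encoder, to channel. Synthesis blocks: from channel, decoder, integrator, cos, sin, interpolators, multiplication by cos ω_k n and sin ω_k n.]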
The formant vocoder can be viewed as a type of channel vocoder that estimates the first three or four formants in a segment of speech. This information, together with the pitch period, is encoded and transmitted to the receiver.
Example of formants:
[Figure] (a): The spectrogram of the utterance "day one", showing the pitch and the harmonic structure of speech. (b): A zoomed spectrogram of the fundamental and the second harmonic.
[Figure: formant vocoder analyzer. Labels: input speech, F1, F2, pitch and V/U, decoder.]
Fk: the frequency of the kth formant
Bk: the bandwidth of the kth formant
[Figure: formant vocoder synthesizer. Labels: F1, F2, B1, V/U, F0, excitation signal.]
The objective of LP analysis is to estimate the parameters of an all-pole model of the vocal tract. Several methods have been devised for generating the excitation sequence for speech synthesis.
LPC-type speech analysis and synthesis methods differ primarily in the type of excitation signal that is generated for speech synthesis.
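One common way to estimate the all-pole model parameters is the autocorrelation method solved with the Levinson-Durbin recursion. The slides do not say which method is intended, so the sketch below is an assumption, and the function names and the AR(2) test signal are hypothetical.

```python
import random

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] for x[n] ~ sum_k a[k] * x[n-k].

    Returns the coefficient list and the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:], err

# Test signal: x[n] = 1.3 x[n-1] - 0.6 x[n-2] + e[n], with Gaussian e[n].
rng = random.Random(1)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(1.3 * x[-1] - 0.6 * x[-2] + rng.gauss(0.0, 1.0))
coeffs, err = levinson_durbin(autocorr(x, 2), 2)
print(coeffs)   # should be close to [1.3, -0.6]
```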
LPC-10:
This method is called LPC-10 because 10 coefficients are typically employed. LPC-10 partitions the speech into 180-sample frames. Pitch and voicing decisions are made using the AMDF (average magnitude difference function) and zero-crossing measures.
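The two measures mentioned above can be sketched as follows. This is a simplified illustration, not the LPC-10 procedure itself: the lag search range, the test frame, and the function names are assumptions.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between the frame and its shifted copy."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def estimate_pitch_lag(frame, min_lag=20, max_lag=90):
    """Pick the lag with the deepest AMDF valley (assumed search range)."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))

def zero_crossings(frame):
    """Count sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))

# A 180-sample voiced-like frame with a pitch period of 50 samples.
frame = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
         for n in range(180)]
print(estimate_pitch_lag(frame), zero_crossings(frame))
```

A deep AMDF valley together with a low zero-crossing count suggests a voiced frame; a flat AMDF and a high zero-crossing count suggests an unvoiced one. Any decision threshold would have to be tuned and is not specified here.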
Speech quality can be improved at the expense of a higher bit rate by computing and transmitting a residual error, as is done in DPCM. In one method, the LPC model and excitation parameters are first estimated from a frame of speech.
The speech is then synthesized at the transmitter and subtracted from the original speech signal to form the residual error. The residual error is quantized, coded, and transmitted to the receiver. At the receiver, the signal is synthesized by adding the residual error to the signal generated from the model.
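The residual path can be illustrated with a minimal sketch. It assumes a uniform quantizer and treats the model output as given; the function names, the step size, and the sample values are hypothetical.

```python
def quantize(values, step):
    """Uniform quantizer (an assumed stand-in for the residual coder)."""
    return [step * round(v / step) for v in values]

def encode_residual(original, model_output, step):
    """Transmitter: residual = original - model synthesis, then quantize."""
    residual = [o - m for o, m in zip(original, model_output)]
    return quantize(residual, step)

def decode(model_output, quantized_residual):
    """Receiver: add the received residual to the locally synthesized signal."""
    return [m + r for m, r in zip(model_output, quantized_residual)]

original = [0.10, 0.42, 0.77, 0.31, -0.25]
model_output = [0.00, 0.50, 0.70, 0.40, -0.10]   # hypothetical LP synthesis
step = 0.05
received = encode_residual(original, model_output, step)
reconstructed = decode(model_output, received)
print(max(abs(o - r) for o, r in zip(original, reconstructed)))
```

With this quantizer the reconstruction error per sample is bounded by half the step size, so a finer step (more bits) gives better quality, which is the bit-rate trade-off described above.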
[Figure: LP coder with residual error. Blocks: LP analysis, excitation parameters, LP synthesis model, encoder, to channel.]
Code Excited LP (CELP):
CELP is an analysis-by-synthesis method in which the excitation sequence is selected from a codebook of zero-mean Gaussian sequences. The bit rate of CELP is 4800 bps.
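The analysis-by-synthesis idea can be sketched as an exhaustive codebook search: each candidate sequence is passed through the LP synthesis filter, and the one that best matches the target is kept. This is a bare-bones illustration that assumes zero filter memory and leaves out the perceptual weighting and pitch prediction a practical coder would use; the codebook size, filter coefficients, and names are hypothetical.

```python
import random

def synthesis_filter(excitation, a):
    """All-pole filter: y[n] = e[n] + sum_k a[k] * y[n-1-k], zero initial state."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y

def make_codebook(size, length, seed=0):
    """Codebook of zero-mean, unit-variance Gaussian sequences."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]

def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry minimising the squared error."""
    best = None
    for index, code in enumerate(codebook):
        y = synthesis_filter(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        gain = sum(t * v for t, v in zip(target, y)) / energy   # least-squares gain
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

a = [1.3, -0.6]                      # hypothetical LP coefficients
codebook = make_codebook(32, 40)
# Build a target from entry 7 so the search has a known answer.
target = [2.5 * v for v in synthesis_filter(codebook[7], a)]
index, gain, error = search_codebook(target, codebook, a)
print(index, round(gain, 3))
```

Only the codebook index and the gain (plus the LP parameters) need to be transmitted, since the receiver holds the same codebook.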
[Figure: CELP analyzer. Labels: side information, gain, parameters, +/- (error computation).]
CELP (synthesizer):
[Figure: CELP synthesizer. Blocks: from channel, decoder, LP synthesis filter, updates.]
The VSELP coder and decoder differ from their CELP counterparts mainly in the method by which the excitation sequence is formed. In the block diagram of the VSELP decoder, there are three excitation sources. One excitation is obtained from the pitch period state. The other two excitation sources are obtained from two codebooks.
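The way the three sources combine can be sketched as a gain-weighted sum. The sketch also shows the vector-sum construction that gives VSELP its name, where each codeword is a sum of basis vectors with signs set by the codeword bits. The basis vectors, gains, and function names below are hypothetical illustration values, not taken from the slides.

```python
def vector_sum_codeword(bits, basis):
    """Sum the basis vectors with sign +1 for bit 1 and -1 for bit 0."""
    length = len(basis[0])
    return [sum((1 if b else -1) * v[i] for b, v in zip(bits, basis))
            for i in range(length)]

def combine_excitation(pitch_state, code1, code2, gains):
    """Excitation = g0 * long-term state + g1 * codeword 1 + g2 * codeword 2."""
    g0, g1, g2 = gains
    return [g0 * p + g1 * c1 + g2 * c2
            for p, c1, c2 in zip(pitch_state, code1, code2)]

basis1 = [[1.0, 0.0, -1.0, 0.5], [0.5, 0.5, 0.5, 0.5]]    # hypothetical
basis2 = [[0.0, 1.0, 0.0, -1.0], [1.0, -1.0, 1.0, -1.0]]  # hypothetical
code1 = vector_sum_codeword([1, 0], basis1)
code2 = vector_sum_codeword([1, 1], basis2)
pitch_state = [0.2, 0.1, 0.0, -0.1]                       # long-term filter state
excitation = combine_excitation(pitch_state, code1, code2, (0.8, 0.5, 0.25))
print(excitation)
```

The resulting excitation would then drive the pitch synthesis and spectral envelope (LP) synthesis filters.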
Parameters                                      Bits/5 ms   Bits/20 ms
10 LPC coefficients                             -           38
Average speech energy                           -           5
Excitation codewords from two VSELP codebooks   14          56
Gain parameters                                 8           32
Lag of pitch filter                             7           28
Total                                           29          159
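As a quick check on the bit allocation, the per-frame totals can be added up and converted to a bit rate. The short script below uses only the 20 ms figures from the table; the dictionary keys are shorthand labels.

```python
frame_ms = 20
bits_per_frame = {
    "LPC coefficients": 38,
    "average speech energy": 5,
    "excitation codewords": 56,
    "gain parameters": 32,
    "pitch filter lag": 28,
}
total_bits = sum(bits_per_frame.values())
bit_rate_bps = total_bits * 1000 / frame_ms
print(total_bits, bit_rate_bps)
```

The sum is 159 bits per 20 ms frame, which corresponds to 7950 bps, that is, just under 8 kbps.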
VSELP decoder:
[Figure: VSELP decoder. Blocks: long-term filter state, codebook 1, codebook 2 (branches labeled 0, 1, 2), pitch synthesis filter, spectral envelope (LP) synthesis filter.]