Speech Processing Project

This document describes a speech processing project that implements a linear predictive coding (LPC) vocoder using voice excitation. The project was completed in MATLAB for an ECE 5525 class in fall 2005. The document provides background on LPC theory and models, including how speech is analyzed on a frame-by-frame basis to extract parameters like pitch period, voicing decision, LPC coefficients, and gain. It also describes how the parameters are encoded and used to synthesize speech by driving an LPC synthesis filter with an excitation signal.

Linear Predictive Coding Using a Voice-Excited Vocoder

ECE 5525, Fall 2005
Osama Saraireh
Dr. Veton Kepuska
Models implemented in MATLAB
LPC Background
• The speech signal is band-limited to no more than one half the system sampling frequency, and then A/D conversion is performed.
• The speech is processed on a frame-by-frame basis, where the analysis frame length can be variable.
• For each frame, a pitch period estimate is made along with a voicing decision.
• A linear predictive coefficient analysis is performed to obtain an inverse model of the speech spectrum, A(z).
• In addition, a gain parameter G, representing some function of the speech energy, is computed.
• An encoding procedure is then applied to transform the analyzed parameters into an efficient set of transmission parameters, with the goal of minimizing the degradation in the synthesized speech for a specified number of bits. Knowing the transmission frame rate and the number of bits used for each transmission parameter, one can compute a noise-free channel transmission bit rate.
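The bit-rate computation in the last bullet is simple arithmetic. The sketch below uses the 20 ms frame rate and the bit counts quoted later in this deck; the predictor order, the one-bit voicing flag, and the reading of "8-10 bits" as a per-coefficient figure are assumptions made only for illustration.

```matlab
% Hypothetical per-frame bit allocation (illustrative, not the project's code)
frameShiftMs = 20;     % one frame every 20 ms
p            = 10;     % ASSUMED predictor order
bitsPitch    = 6;      % pitch period
bitsVoicing  = 1;      % ASSUMED: one bit for the voiced/unvoiced flag
bitsGain     = 5;      % gain after dynamic-range compression
bitsPerCoeff = 10;     % ASSUMED: upper end of the 8-10 bit range, per coefficient

framesPerSecond = 1000 / frameShiftMs;
bitsPerFrame    = bitsPitch + bitsVoicing + bitsGain + p * bitsPerCoeff;
bitRate         = bitsPerFrame * framesPerSecond;   % noise-free channel bit rate

fprintf('%d bits/frame, %d bit/s\n', bitsPerFrame, bitRate);
```

With these assumed values each frame carries 112 bits, which gives 5600 bit/s.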
LPC Theory
• At the receiver, the transmitted parameters are decoded into quantized versions of the analysis coefficients and the pitch estimation parameters.
• An excitation signal for synthesis is then constructed from the transmitted pitch and voicing parameters.
• The excitation signal then drives a synthesis filter 1/A(z) corresponding to the analysis model A(z).
• Either before or after synthesis, the gain is used to match the synthetic speech energy to the actual speech energy.
• The digital samples ŝ(n) are then converted to an analog signal by a D/A converter and passed through a low-pass filter, similar to the one at the input of the system, to generate the synthetic speech s(t).
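The receiver steps above can be sketched for a single frame. The coefficient values, the placeholder excitation, and the target energy are made-up inputs, and the sketch assumes the convention A(z) = 1 - sum_k a(k) z^-k; it is not the project's decoder.

```matlab
% ASSUMED decoded parameters for one frame (illustrative values only)
a            = [1.3 -0.6];   % predictor coefficients a(1..p); poles inside the unit circle
A            = [1, -a];      % A(z) = 1 - sum_k a(k) z^-k (assumed sign convention)
N            = 160;          % samples per frame (20 ms at 8 kHz)
targetEnergy = 2.5;          % assumed decoded frame energy

% Placeholder voiced excitation: impulse train with a 40-sample pitch period
excitation = zeros(1, N);
excitation(1:40:N) = 1;

% Drive the synthesis filter 1/A(z) with the excitation
synth = filter(1, A, excitation);

% Apply the gain after synthesis so the frame energy matches the target
synth = synth * sqrt(targetEnergy / sum(synth.^2));
```

Scaling after synthesis is one of the two options the slide mentions; scaling the excitation before filtering gives the same result because the filter is linear.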
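Speech production
• The vocal tract is modeled as an all-pole filter:

$$H(z) = \frac{G}{1 - \sum_{k=1}^{p} a[k]\, z^{-k}}$$

• where p is the number of poles,
• G is the filter gain,
• and a[k] are the parameters that determine the poles.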
Voiced and Unvoiced sounds
• There are two mutually exclusive excitation functions to model voiced and unvoiced speech sounds.
• On a short-time basis:
   • Voiced speech is considered periodic with a fundamental frequency F0 and a pitch period 1/F0, which depends on the speaker. Hence, voiced speech is generated by exciting the all-pole filter model with a periodic impulse train.
   • Unvoiced sounds, on the other hand, are generated by exciting the all-pole filter with the output of a random noise generator.
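The two excitation functions can be generated as follows. This is a minimal sketch: the pitch period is an assumed value, and normalizing both signals to unit energy is a choice made here so that the gain alone sets the output level.

```matlab
fs          = 8000;   % sampling rate in Hz
N           = 160;    % samples per 20 ms frame
pitchPeriod = 50;     % ASSUMED pitch period in samples (F0 = fs/50 = 160 Hz)

% Voiced excitation: periodic impulse train
voiced = zeros(1, N);
voiced(1:pitchPeriod:N) = 1;
voiced = voiced / sqrt(sum(voiced.^2));        % normalize to unit energy

% Unvoiced excitation: output of a random noise generator
unvoiced = randn(1, N);
unvoiced = unvoiced / sqrt(sum(unvoiced.^2));  % normalize to unit energy
```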
Voiced/Unvoiced
• The fundamental difference between these two types of speech sounds comes from the way they are produced:
   • The vibrations of the vocal cords produce voiced sounds.
   • The rate at which the vocal cords vibrate dictates the pitch of the sound.
   • Unvoiced sounds, on the other hand, do not rely on the vibration of the vocal cords.
   • Unvoiced sounds are created by constrictions of the vocal tract.
   • The vocal cords remain open, and the constrictions of the vocal tract force air out to produce the unvoiced sounds.
• Given a short segment of a speech signal, say about 20 ms or 160 samples at a sampling rate of 8 kHz, the speech encoder at the transmitter must determine the proper excitation function, the pitch period for voiced speech, the gain, and the coefficients of the all-pole filter.
Mathematical Representation of the Model
• The parameters of the all-pole filter model are determined from the speech samples by means of linear prediction. Specifically, the output of the linear prediction filter is

$$\hat{s}(n) = \sum_{k=1}^{p} a_p(k)\, s(n-k)$$

• The corresponding error between the observed sample s(n) and the predicted value ŝ(n) is

$$e(n) = s(n) - \hat{s}(n)$$
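A small numerical example may make the prediction and error equations concrete. The coefficients and samples below are arbitrary, and the predictor is assumed to be written as ŝ(n) = sum_k a(k) s(n-k), with samples before the start of the frame taken as zero.

```matlab
a = [0.9 -0.2];        % ASSUMED predictor coefficients a_p(1), a_p(2)
s = [1 2 3 4 5];       % arbitrary input samples

% Prediction: sHat(n) = a(1)*s(n-1) + a(2)*s(n-2)
sHat = filter([0, a], 1, s);

% Prediction error: e(n) = s(n) - sHat(n)
e = s - sHat;

% The same error from the prediction-error filter A(z) = 1 - sum_k a(k) z^-k
eDirect = filter([1, -a], 1, s);
```

Here `sHat` is `[0 0.9 1.6 2.3 3.0]` and `e` is `[1 1.1 1.4 1.7 2.0]`; `eDirect` matches `e`.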
Cont’d
• By minimizing the sum of the squared errors, we can determine the pole parameters of the model. Differentiating this sum with respect to each of the parameters and equating the result to zero gives a set of p linear equations:

$$\sum_{k=1}^{p} a_p(k)\, r_{ss}(m-k) = r_{ss}(m), \qquad m = 1, 2, \ldots, p$$

where r_ss(m) represents the autocorrelation of the sequence s(n), defined as

$$r_{ss}(m) = \sum_{n=0}^{N} s(n)\, s(n+m)$$

In matrix form:

$$\mathbf{R}_{ss}\, \mathbf{a} = \mathbf{r}_{ss}$$
Auto-correlation
• where R_ss is a p×p autocorrelation matrix, r_ss is a p×1 autocorrelation vector, and a is a p×1 vector of model parameters.

   % Assumed to be defined beforehand: data = speech samples, sr = sampling
   % rate in Hz, L = LPC order, fr = frame step in ms, fs = frame size in ms,
   % preemp = pre-emphasis coefficient.
   [row, col] = size(data);
   if col == 1, data = data'; end               % force a row vector
   nframe = 0;
   msfr = round(sr/1000*fr);                    % frame step: convert ms to samples
   msfs = round(sr/1000*fs);                    % frame size: convert ms to samples
   duration = length(data);
   speech = filter([1 -preemp], 1, data)';      % pre-emphasize speech
   msoverlap = msfs - msfr;
   ramp = (0:1/(msoverlap-1):1)';               % compute part of window
   for frameIndex = 1:msfr:duration-msfs+1      % frame step = 20 ms
       frameData = speech(frameIndex:(frameIndex+msfs-1));   % frame size = 30 ms
       nframe = nframe + 1;
       autoCor = xcorr(frameData);              % autocorrelation of the frame
       autoCorVec = autoCor(msfs+(0:L));        % keep lags 0..L
       % ... LPC analysis of autoCorVec continues here ...
   end
Gain representation in the system
• The gain parameter of the filter can be obtained from the input-output relationship

$$s(n) = \sum_{k=1}^{p} a_p(k)\, s(n-k) + G\,x(n)$$

• where x(n) represents the input sequence.
• Rearranging this equation in terms of the error sequence gives

$$G\,x(n) = s(n) - \sum_{k=1}^{p} a_p(k)\, s(n-k) = e(n)$$

$$G^2 \sum_{n=0}^{N-1} x^2(n) = \sum_{n=0}^{N-1} e^2(n)$$

• If the input excitation is normalized to unit energy by design, then

$$G^2 = \sum_{n=0}^{N-1} e^2(n) = r_{ss}(0) - \sum_{k=1}^{p} a_p(k)\, r_{ss}(k)$$

• where G² is set equal to the residual energy resulting from the least-squares optimization.
• Once the LPC coefficients are computed, we can determine whether the input speech frame is voiced and, if it is, what the pitch is. We can determine the pitch by computing the following sequence in MATLAB:

$$r_e(n) = \sum_{k=1}^{p} r_a(k)\, r_{ss}(n-k), \qquad r_a(k) = \sum_{i=1}^{p} a_p(i)\, a_p(i+k)$$

• where r_a(k) is defined as the autocorrelation sequence of the prediction coefficients.
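Returning to the normal equations and the gain relation G² = r_ss(0) - sum_k a_p(k) r_ss(k), both can be checked numerically. The sketch below is not the project's code: the test signal is an assumed AR(2) process, the predictor is assumed to be written as ŝ(n) = sum_k a(k) s(n-k), and the system is solved with a plain backslash rather than a Levinson recursion.

```matlab
% Synthetic test frame: ASSUMED AR(2) process driven by white noise
N = 4000;
s = filter(1, [1 -1.3 0.6], randn(N, 1));
p = 2;

% Autocorrelation r(m+1) = sum_n s(n) s(n+m), for lags m = 0..p
r = zeros(p+1, 1);
for m = 0:p
    r(m+1) = sum(s(1:N-m) .* s(1+m:N));
end

R = toeplitz(r(1:p));          % p-by-p autocorrelation matrix R_ss
a = R \ r(2:p+1);              % solve R_ss * a = r_ss

G2 = r(1) - a' * r(2:p+1);     % G^2 = r_ss(0) - sum_k a(k) r_ss(k)

% Cross-check: energy of the prediction error over the zero-padded frame
e = filter([1; -a], 1, [s; zeros(p, 1)]);
residualEnergy = sum(e.^2);
```

`G2` and `residualEnergy` agree up to rounding, and `a` comes out close to the generating coefficients `[1.3; -0.6]`. The frame is zero-padded by p samples so that the error energy covers the same range as the autocorrelation sums.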
Pitch Detection
• The pitch is detected by finding the peak of the normalized sequence r_e(n)/r_e(0) in the time interval that corresponds to 3 to 15 ms in the 20 ms sampling frame. If the value of this peak is at least 0.25, the frame of speech is considered voiced, with a pitch period equal to the value of n = N_p at which r_e(N_p)/r_e(0) is a maximum.
• If the peak value is less than 0.25, the frame of speech is considered unvoiced and the pitch is set to zero.
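The decision rule above can be sketched on a synthetic voiced frame. Two simplifications are assumptions of this sketch: the predictor coefficients are taken as known rather than estimated, and the autocorrelation of the prediction residual is computed directly in the time domain instead of through r_a(k) and r_ss(n), on the assumption that this is the quantity r_e(n) stands for.

```matlab
fs = 8000;  N = 160;                     % 20 ms frame at 8 kHz
trueA = [1.3 -0.6];                      % ASSUMED vocal-tract predictor coefficients
x = zeros(N, 1);  x(1:40:N) = 1;         % voiced excitation, period 40 samples (5 ms)
s = filter(1, [1, -trueA], x);           % synthetic voiced speech frame

e = filter([1, -trueA], 1, s);           % prediction residual

% Autocorrelation of the residual for lags 0..N-1
re = zeros(N, 1);
for n = 0:N-1
    re(n+1) = sum(e(1:N-n) .* e(1+n:N));
end
rn = re / re(1);                         % normalized sequence re(n)/re(0)

lagMin = round(0.003 * fs);              % 3 ms  -> 24 samples
lagMax = round(0.015 * fs);              % 15 ms -> 120 samples
[peak, idx] = max(rn(lagMin+1 : lagMax+1));
if peak >= 0.25
    pitchPeriod = lagMin + idx - 1;      % voiced: pitch period in samples
else
    pitchPeriod = 0;                     % unvoiced
end
```

For this frame the peak is 0.75 at a lag of 40 samples, so the frame is classified as voiced with a 5 ms pitch period.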
• The values of the LPC coefficients, the pitch period, and the type of excitation are then transmitted to the receiver.
• The decoder synthesizes the speech signal by passing the proper excitation through the all-pole filter model of the vocal tract.
• Typically, the pitch period requires 6 bits, the gain parameter is represented by 5 bits after its dynamic range is compressed, and the prediction coefficients normally require 8 to 10 bits for accuracy reasons.
• This accuracy is very important in LPC because any small change in the prediction coefficients results in a large change in the pole positions of the filter model, which can cause instability in the model.
• This is overcome by using the PARCOR (partial correlation) method.
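PARCOR (reflection) coefficients fall out of the Levinson-Durbin recursion, and the synthesis filter is stable whenever every coefficient has magnitude below one, which makes them safer to quantize than the predictor coefficients themselves. The sketch below is an illustration under assumptions: the autocorrelation values are made up, the predictor is written as ŝ(n) = sum_k a(k) s(n-k), and the sign convention for k(m) varies between textbooks.

```matlab
r = [1.0; 0.8; 0.5; 0.2];    % ASSUMED autocorrelation lags 0..3 (illustrative)
p = numel(r) - 1;

a = zeros(p, 1);             % predictor coefficients
k = zeros(p, 1);             % reflection (PARCOR) coefficients
E = r(1);                    % prediction error energy at order 0

for m = 1:p
    % Correlation not yet explained by the order m-1 predictor
    acc = r(m+1);
    for j = 1:m-1
        acc = acc - a(j) * r(m-j+1);
    end
    k(m) = acc / E;

    % Order update of the predictor coefficients
    aNew = a;
    aNew(m) = k(m);
    for j = 1:m-1
        aNew(j) = a(j) - k(m) * a(m-j);
    end
    a = aNew;

    E = E * (1 - k(m)^2);    % error energy shrinks at each order
end

polesInside = all(abs(roots([1; -a])) < 1);   % stability of 1/A(z)
```

Because the recursion solves the same normal equations, `a` matches the solution of `toeplitz(r(1:p)) \ r(2:p+1)`, while `k` gives a parameter set whose magnitude bound can be enforced after quantization.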
Model LPC Vocoder (pitch detector)
[Figure: two panels, "Original speech signal" and "Output speech spectrum using LPC vocoder"; horizontal axes span 0 to 5 × 10^4.]
Voice excited LPC Vocoder
[Figure: two panels, "Original speech signal" and "Reconstructed signal using voice excited LPC vocoder"; horizontal axes span 0 to 5 × 10^4.]
