0% found this document useful (0 votes)

52 views22 pages

Linear Prediction: The Problem, Its Solution and Application To Speech

This document discusses linear prediction, beginning with an overview of linear time-invariant systems and their representation as difference equations. It then derives the mathematics of linear prediction models, which allow prediction of a system's future outputs based on its inputs and previous outputs. Two common methods for solving the normal equations to obtain the predictor coefficients are described. The document focuses on applying linear prediction to analyze speech signals.

Uploaded by

Alex Krockas Botamas Chonna

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

52 views22 pages

Linear Prediction: The Problem, Its Solution and Application To Speech

Uploaded by

Alex Krockas Botamas Chonna

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 22

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/241802223

Linear Prediction: The Problem, its Solution and Application to Speech

Article · January 2008

CITATIONS READS
12 428

3 authors, including:

David Dorran
Technological University Dublin - City Campus
33 PUBLICATIONS 147 CITATIONS

SEE PROFILE

All content following this page was uploaded by David Dorran on 24 January 2014.

The user has requested enhancement of the downloaded file.

Dublin Institute of Technology
ARROW@DIT
Conference papers Audio Research Group

2008-01-01

Linear Prediction: The Problem, its Solution and

Application to Speech
Alan O'Cinneide
Dublin Institute of Technology, [email protected]

David Dorran
Dublin Institute of Technology, [email protected]

Mikel Gainza
Dublin Institute of Technology, [email protected]

Follow this and additional works at: http://arrow.dit.ie/argcon

Part of the Signal Processing Commons

Recommended Citation
O'Cinneide, A., Dorran, D., Gainza, M.: Linear Prediction: The Problem, its Solution and Application to Speech. DIT Internal
Technical Report.

This Working Paper is brought to you for free and open access by the Audio
Research Group at ARROW@DIT. It has been accepted for inclusion in
Conference papers by an authorized administrator of ARROW@DIT. For
more information, please contact [email protected],
[email protected].

This work is licensed under a Creative Commons Attribution-

Noncommercial-Share Alike 3.0 License
Linear Prediction
The Technique, Its Solution and Application to Speech

Alan Ó Cinnéide
August 2008

Supervisors: Drs. David Dorran, Mikel Gainza and Eugene Coyle

Contents
1 Overview and Introduction 2

2 Linear Systems: Models and Prediction 2

2.1 Linear System Theory . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 All-Pole Linear Prediction Model . . . . . . . . . . . . . . . . . . 5
2.3 Solutions to the Normal Equations . . . . . . . . . . . . . . . . . 7
2.3.1 The Autocorrelation Method . . . . . . . . . . . . . . . . 7
2.3.2 The Covariance Method . . . . . . . . . . . . . . . . . . . 9
2.3.3 Comparison of the Two Methods . . . . . . . . . . . . . . 9

3 Linear Prediction of Speech 10

3.1 Human Speech Production: Anatomy and Function . . . . . . . . 10
3.2 Speech Production as a Linear System . . . . . . . . . . . . . . . 12
3.2.1 Model Limitations . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Practical Considerations . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.1 Prediction Order . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.2 Closed Glottal Analysis . . . . . . . . . . . . . . . . . . . 15

4 Examples 16
4.1 Human Speech: Voiced Vowel . . . . . . . . . . . . . . . . . . . . 16
4.2 Human Speech: Unvoiced Fricative . . . . . . . . . . . . . . . . . 16
4.3 Trumpet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1
1 Overview and Introduction
Linear prediction is a signal processing technique that is used extensively in the
analysis of speech signals and, as it is so heavily referred to in speech processing
literature, a certain level of familiarity with the topic is typically required by
all speech processing engineers. This paper aims to provide a well-rounded
introduction to linear prediction, and so doing, facilitate the understanding
of the technique. Linear prediction and its mathematical derivation will be
described, with a specific focus on applying the technique to speech signals. It
is noted, however, that although progress in linear prediction has been driven
primarily by speech research, it involves concepts that prove useful to digital
signal processing in general.
First to be discussed within the paper are general linear time-invariant sys-
tems, along with its theory and mathematics, before moving into a general
description of linear prediction models. The equations that yield one variant of
linear prediction coefficients are derived and the methods involved to solve these
equations are then briefly discussed. Different interpretations of the equations
yield slightly different results, and these differences will be explained.
A section focussing specifically on the linear prediction of speech then be-
gins. The anatomical process of speech production is described, followed by
an introduction to a theoretical linear model of the process. The limitations
of applying the linear prediction model to speech are described, and comments
are also given concerning certain practicalities that are specific to the linear
prediction of speech.
The paper concludes with an implementation of linear prediction using two
different types of signal. It is hoped that the balance between theory and prac-
tical will allow the reader for easy assimilation of this technique.

2 Linear Systems: Models and Prediction

Linear prediction [11, 7, 9] is a technique of time series analysis, that emerges
from the examination of linear systems. Using linear prediction, the parameters
of a such a system can be determined by analysing the systems inputs and
outputs. Makhoul [7] says that the method first appeared in a 1927 paper

2
on sun-spot analysis, but has since been applied to problems in neuro-physics,
seismology as well as speech communication.
This section will review linear systems and, elaborating upon them, derives
the mathematics of linear prediction.

2.1 Linear System Theory

A linear system is such that produces its output as a linear combination of its
current and previous inputs and its previous outputs [13]. It can be described
as time-invariant if the system parameters do not change with time. Mathemat-
ically, linear time-invariant (LTI) systems can be represented by the following
equation:
q
X p
X
y(n) = bj x(n − j) − ak y(n − k) (1)
j=0 k=1

This is the general difference equation for any linear system, with output signal
y and input signal x, and scalars bj and ak , for j = 1 . . . q and k = 1 . . . p where
the maximum of p and q is the order of the system. The system is represented
graphically in figure 1.

Figure 1: A graphical representation of the general difference equation for an

LTI system.

By re-arranging equation (1) and transforming into the Z-domain, we can

3
reveal the transfer function H(z) of such a system:
p
X q
X
y(n) + ak y(n − k) = bj x(n − j)
k=1 j=0
p
X q
X
ak y(n − k) = bj x(n − j) where a0 = 1
k=0 j=0
p
X q
X
ak z −k Y (z) = bj z −j X(z)
k=0 j=0

q
X
bj z −j
Y (z) j=0
⇒ H(z) = = p (2)
X(z) X
−k
ak z
k=0

The coefficients of the input and output signal samples in equation (1) reveal
the poles and zeros of the transfer function.
Linear prediction follows naturally from the general mathematics of linear
systems. As the system output is defined as a linear combination of past samples,
the system’s future output can be predicted if the scaling coefficients bj and ak
are known. These scalars are thus also known as the predictor coefficients of
the system [9].
The general linear system transfer function gives rise to three different types
of linear model, dependent on the form of the transfer function H(z) given in
equation (2) [9, 7].

• When the numerator of the transfer function is constant, an all-pole or

autoregressive (AR) model is defined.

• The all-zero or moving average model assumes that the denominator of

the transfer function is a constant.

• The third and most general case is the mixed pole/zero model, also called
the autoregressive moving-average (ARMA) model, where nothing is as-
sumed about the transfer function.

The all-pole model for linear prediction is the most widely studied and im-
plemented of the three approaches, for a number of reasons. Firstly, the in-
put signal, which is required for ARMA and all-zero modelling, is oftentimes

4
an unknown sequence. As such, they are unavailable for use in our deriva-
tions. Secondly, the equations derived from the all-pole model approach are
relatively straight-forward to solve, contrasting sharply with the nonlinear equa-
tions dervied from ARMA or all-zero modelling. Finally, and perhaps the most
important reason why all-pole modelling is the preferred choice of engineers,
many real world applications, including most types of speech production, can
be faithfully modeled using the approach.

2.2 All-Pole Linear Prediction Model

Following from the linear system equation (1), one can formulate the equations
necessary to determine the parameters of an all-pole linear system, the so-called
linear prediction normal equations. First, following on from the all-pole model
(see Figure 2), a linear prediction estimate ŷ at sample number n for the output
signal y by a pth order prediction filter can be given by:
p
X
ŷ(n) = − ak y(n − k) (3)
k=1

The error or residue between the output signal and its estimate at sample n

Figure 2: A graphical representation of an all pole linear system, where the

output is a linear function of scaled previous outputs and the input.

can then be expressed as the difference between the two signals.

e(n) = y(n) − ŷ(n) (4)

5
The total squared error for an as-of-yet unspecified range of signal samples is
given by the following equation:
X
E= [e(n)]2
n
X
= [y(n) − ŷ(n)]2 (5)
n
X
= [y(n)]2 − 2 · y(n) · ŷ(n) + [ŷ(n)]2
n

Equation (5) gives a value indicative of the energy in the error signal. Ob-
viously, it is desirous to choose the predictor coefficients so that the value of E
is minimised over the unspecified interval. The optimal minimising values can
be determined through differential calculus, i.e. by obtaining the derivative of
equation 5 with respect to each predictor coefficient and setting that value equal
to zero.
∂E
= 0 for 1 ≤ k ≤ p
∂ak
∂ X
⇒ ( ([y(n)]2 − 2 · y(n) · ŷ(n) + [ŷ(n)]2 )) = 0
∂ak n
X ∂ X ∂
−2 y(n) · ŷ(n) + 2 ŷ(n) · ŷ(n) = 0
n
∂ak n
∂ak
X ∂ X ∂
y(n) · ŷ(n) = ŷ(n) · ŷ(n)
n
∂a k n
∂a k

∂
ŷ(n) = −y(n − k) . . . from equation (3)
∂ak
X X
⇒ y(n) · −y(n − k) = ŷ(n) · −y(n − k)
n n
X X p
X
− y(n) · y(n − k) = (− ai y(n − i)) · −y(n − k)
n n i=1

X p
X X
− y(n) · y(n − k) = ai y(n − i) · y(n − k) (6)
n i=1 n

For the sake of brevity and future utility, a correlation function φ is defined.
The expansion of this summation describes what will be called the correlation
matrix.
X
φ(i, k) = y(n − i) · y(n − k) (7)
n

Substituting the correlation function into equation (6) allows it to be written

6
more compactly:
p
X
−φ(0, k) = ai φ(i, k) (8)
i=1

The derived set of equations are called the normal equations of linear prediction.

2.3 Solutions to the Normal Equations

The limits on the summation of the total squared energy were omitted from
equation (5) so as to give their selection special attention. The section will
show that two different but logical summation intervals lead to a two different
sets of normal equations and result in different predictor coefficients.
Given sufficient data points and appropriate limits, the normal equations
define p equations with p unknowns which can be solved by any general simul-
taneous linear equation solving algorithms, e.g. Gaussian elimination, Crout
decomposition, etc. However, certain limits lead to matrix redundancies and al-
low for efficient solutions that can significantly reduce the computational load.

2.3.1 The Autocorrelation Method

The autocorrelation method of linear prediction minimises the error signal over
all time, from −∞ to +∞. When dealing with finite digital signals, the signal is
windowed such that all samples outside the interval of interest are taken to be 0
(see Figure 3). If the signal is non-zero from 0 to N − 1, then the resulting error

Figure 3: Windowing a signal by multiplication with an appropriate function,

in this case a Hanning window.

signal will be non-zero from 0 to N − 1 + p. Thus, summing the total energy

7
over this interval is mathematically equivalent to summing over all time.
∞
X
E= [e(n)]2
n=−∞
−1+p
NX
(9)
2
= [e(n)]
n=0

When these limits are applied to equation (7), a useful property emerges.
Because the error signal is zero outside the analysis interval, the correlation
function of the normal equations can be identically expressed in a more conve-
nient form.
−1+p
NX
φauto (i, k) = y(n − i) · y(n − k) 1 ≤ i ≤ p 1 ≤ k ≤ p
n=0
N −1+(i−k)
X
= y(n) · y(n + (i − k)) 1 ≤ i ≤ p 1 ≤ k ≤ p
n=0

This form of the correlation function is simply the short-time autocorrelation

function of the signal, evaluated with a lag of (i − k) samples. This fact gives
this method of solving the normal equations its name.
The implications of this convenience is such that the correlation matrix de-
fined by the normal equations exhibits a double-symmetry that can exploited by
a computer algorithm. Given that ai,j is the member of the correlation matrix
on the ith row and j th column, the correlation matrix demonstrates:

• standard symmetry, where ai,j = aj,i ,

 
a1,1 a2,1 a3,1 ··· am,1
 
 a2,1 a2,2 a3,2 ··· am,2 
 
 
 a3,1 a3,2 a3,3 ··· am,3 
 
 .. .. .. .. 
 
..
 . . . . . 
 
am,1 am,2 am,3 ··· am,n

• Toeplitz symmetry, where ai,j = ai−1,j−1 .

 
a1,1 a1,2 a1,3 · · · a1,m
 
 a2,1 a1,1 a1,2 · · · am,2 
 
 
 a3,1 a2,1 a1,1 · · · am,3 
 
 .. .. .. .. 
 
..
 . . . . . 
 
am,1 am,2 am,3 · · · a1,1

8
These redundancies mean that the normal equations can be solved using
the Levinson-Durbin method, an recursive procedure that greatly reduces the
computational load.

2.3.2 The Covariance Method

In contrast with the autocorrelation method, the covariance method of linear

prediction minimises the total squared energy only over the interval of interest.
N
X −1
E= [e(n)]2 (10)
n=0

Using these limits, an examination of the equation (7) reveals that the signal val-
ues required for the calculation extend beyond the analysis interval (see Figure
4).

Figure 4: The covariance method require −p samples (shown here in red) beyond
the analysis interval from 0 to N − 1 (shown in blue).

N
X −1
φcovar (i, k) = y(n − i) · y(n − k) 1 ≤ i ≤ p 1 ≤ k ≤ p (11)
n=0

Samples are required from −p to N − 1. The resulting correlation matrix ex-

hibits standard symmetry, but unlike the matrix defined by the autocorrelation
mehthod, the matrix does not demonstrate Toeplitz symmetry. This means
that a different method must be used to solve the normal equations, such as
Cholesky decomposition or the square-root method.

2.3.3 Comparison of the Two Methods

Each of these solutions to the linear prediction normal equations has its own
strengths and weaknesses; determining which is more advantageous to use is
greatly determined by the signal being analysed. When analysis signals are long,

9
the two different solutions are virtually identical. Because of the greater redun-
dancies in the matrix defined by the autocorrelation method, it is slightly easier
to compute [9]. Experimental evidence indicates that the covariance method is
more accurate for periodic speech sounds [1], while the autocorrelation method
performs better for fricative sounds [9].

3 Linear Prediction of Speech

In order for linear prediction to apply to speech signals, the speech production
process must closely adhere to the theoretical framework established in the
previous sections. This section reviews the actual physical process of speech
production and discusses the linear model utilised by speech engineers.

3.1 Human Speech Production: Anatomy and Function

Following Figure 5, the vast majority of human speech sounds are produced in
the following manner [3]. The lungs initiate the speech process by acting as
the bellows that expels air up into the other regions of the system. The air
pressure is maintained by the intercostal and abdominal muscles, allowing for
the smooth function of the speech mechanisms. The air that leaves the lungs
then enters into the remaining regions of the speech production system via the
trachea. This organ system, consisting of the lungs, trachea and interconnecting
channels, is known as the pulmonary tract. The turbulent air stream is driven
up the trachea into the larynx. The larynx is a box-like apparatus that consists
of muscles and cartilage. Two membranes, known as the vocal folds, span the
structure, supported at the front by the thyroid cartilage and at the back by the
arytenoid cartilages. The arytenoids are attached to muscles which enable them
to approximate and separate the vocal folds. Indeed, the principal function of
the larynx, unrelated to the speech process, is to seal the trachea by maintaining
the vocal folds closed. This has the dual benefit of being able to protect the
pulmonary tract and permit the build up of pressure within the chest cavity
necessary for certain exertions and coughing [9].
The space between the vocal folds is called the glottis. A speech sound is
classified as voiced or voiceless depending on the glottal behaviour as air passes

10
Figure 5: The human speech production system. Image taken from
http://cobweb.ecn.purdue.edu/˜ee649/notes/physiology.html.

through it. According to the myeloelastic-aerodynamic theory of phonation [2],

the vibration of the vocal folds results from the interaction between two opposing
forces. Approximated folds are forced apart by rising subglottal air pressure. As
air rushes through the glottis, the suction phenomenon known as the Bernoulli
effect is observed. This effect due to decreased pressure across the constriction
aperture adducts the folds back together. The interplay between these forces
results in vocal fold vibration, producing a voiced sound. This phonation has
a fundamental frequency directly related to the frequency of vibration of the
folds. During a voiceless speech sound, the glottis is kept open and the stream
of air continues through the larynx without hindrance. The resulting glottal
excitation waveform exhibits a flat frequency spectrum.
The phonation from the larynx then enters the various chambers of vocal
tract: the pharynx, the nasal cavity and the oral cavity. The pharynx is the
chamber stemming the length of the throat from the larynx to the oral cavity.
Access to the nasal cavity is dependent on the position of the velum, a piece of
soft tissue that makes up the back of the roof of the mouth. For the production
of certain phonemes, the velum descends, coupling the nasal cavity with the

11
other chambers of the vocal tract.
The fully realised phone is radiated out of the body via either the lips or
the nose or both. This is a continuous process, during which the state and
configuration of the system’s constituents alter and change dynamically with
the thoughts of the speaker.

3.2 Speech Production as a Linear System

The acoustic theory of speech production assumes the speech production process
to be a linear system, consisting of a source and filter [11]. This model captures
the fundamentals of the speech production process described in the previous
section: a source phonation modulated in the frequency domain by a dynami-
cally changing vocal tract filter, Figure 6. According to the source-filter theory,

Figure 6: The simplified speech model proposed by the acoustic theory of speech
production.

short-time frames of speech can be characterised by identifying the parameters

of the source and filter.

Glottal source. The source signal is one of two states: a pulse train of a
certain fundamental frequency for voiced sounds and white noise for un-
voiced sounds. This two-state source fits reasonably well with true glottal
behaviour, though moments of mixed excitation cannot be represented
well.

12
Vocal tract filter. The vocal tract is parameterised by its resonances, which
are called formants 1 . All acoustic tubes have natural resonances, the
parameters of which are a function of its shape.

Though the vocal tract changes its shape, and thus its resonances, continuously
with running speech, it is not unreasonable to assume it static over short-time
intervals of the order of 20 milliseconds. Thus, speech production can be viewed
as a LTI system and linear prediction can be applied to it.

3.2.1 Model Limitations

In truth, the speech production system is known to have some nonlinearity ef-
fects and the glottal source and vocal tract filter are not completely de-coupled.
In other words, the acoustic effects of the vocal tract has been noticed to modu-
late depending on the behaviour of the source in ways that linear systems cannot
fully describe. Additionally the vocal tract deviates from the behaviour of an
all-pole filter during the production of certain vocal sounds.

System linearity. Linear systems by definition assume that inputs to the sys-
tem have no effect on the system’s parameters [13]. In the case of the
speech production process, this means that the vibratory behaviour of the
glottis has no bearing on the formant frequencies and bandwidths - an
assumption which is sometimes violated [2]. Especially in the situation
where the pitch of the voice is high and the centre frequency first formant
low, an excitation pulse can influence the decay of the previous pulse.

All pole model. The described method of linear prediction works on the as-
sumption that the frequency response of the vocal tract consists of poles
only. This supposition is acceptable for most voiced speech sounds, but is
not appropriate for nasal and fricative sounds. During the production of
these types of utterances, zeros are produced in the spectrum due to the
trapping of certain frequencies within the tract. The use of a model lack-
ing representation of zeros in addition to poles should not cause too much
concern, as if p is of high enough order, the all pole model is sufficient for
almost all speech sounds [11].
1 The word formant comes from the Latin verb formāre meaning “to shape”.

13
Despite these limitations, all-pole linear prediction remains a highly useful
technique for speech analysis.

3.3 Practical Considerations

Analysing speech signals using linear prediction requires a couple of provisos to

achieve the optimal results. These considerations are discussed in this section.

3.3.1 Prediction Order

The choice of prediction order is an important one as it determines the charac-

teristics of the vocal tract filter. Should the prediction be too low, key areas of
resonance will be missed as there are insufficient poles to model them - if the
prediction order is too high, source specific characteristics, e.g. harmonics, are
determined (see Figure 7). Formants require two complex conjugate poles to

Figure 7: The spectral envelopes of a trumpet sound, as determined by linear

prediction analyses, each successively increased prediction orders.

characterise correctly. Thus, the prediction order should be twice the number
of formants present in the signal bandwidth. For a vocal tract of 17 centimetres

14
long, there is an average of one formant per kilohertz of bandwidth.
Where p represents the prediction order and fs the signal’s sampling fre-
quency, the following formula is used as a general rule of thumb [9]:

fs
p= +γ (12)
1000

The value of γ, described in the literature as a “fudge factor”, necessary to

compensate for glottal roll-off and predictor flexibility, is normally given as 2 or
3.

3.3.2 Closed Glottal Analysis

The general consensus of the speech processing community is that linear predic-
tive analysis of voiced speech should be confined to the closed glottal condition
[5]. Indeed, it has been shown that closed phase covariance method linear predic-
tion yields better formant tracking and inverse filtering results that than other
pitch synchronous and asynchronous methods [1]. During the glottal closed
phase the the signal represents a freely decaying oscillation, theoretically en-
suring the absence of source-filter interaction and thus better adhering to the
assumptions of linear prediction [15].
Some voice types are unsuited to this type of analysis [10]. In order to obtain
a unique solution to the normal equations, a critical minimum of signal samples
must exist related to the signal’s bandwidth. High-pitched voices are known
to have closed phases that are too short for analysis purposes. Other voices,
particularly breathy voices, are known to exhibit continuous glottal leakage.
There are numerous methods used to determine the closed glottal interval.
The first attempts to do so typically used special laboratory techniques, such as
electroglottography [4], recorded simultaneously with the digital audio. More
recently, efforts have focused on ascertaining the closed glottal interval through
the analysis of the recorded speech signal [14, 6]. Some of these techniques has
met significant success, particularly the DYPSA algorithm of Naylor et al. [8]
that successfully identifies the closed glottal instant in more than 90% of cases
(Figure 8).

15
Figure 8: A speech waveform with the closed glottal interval highlighted in red
and delimited by circles. These regions represent the instants of glottal closing
and opening respectively, as identified by the DYPSA algorithm.

4 Examples
Within this section, some implementations of linear prediction are given, along
with all the practical considerations taken for the analysis.

4.1 Human Speech: Voiced Vowel

A voice sample of a male voice, recorded at a sampling rate of 44.1 kHz, was
analysed. The signal is the voiced vowel sound /a/. As it is periodic, covariance
method linear prediction during the closed glottal phase yield the most accurate
formant values.
The signal was first processed by the DYPSA algorithm to determine the
closed glottal interval, which underwent covariance analysis. The order of the
prediction filter, calculated according to formula (12), was determined to be 46,
see figure 9.

4.2 Human Speech: Unvoiced Fricative

An unvoiced vocal sample was also analysed. This segment, sampled at a rate
of 9 kHz, is taken from the TIMIT speech database and is of the fricative sound
R
/ /, the sh found in both “shack” and “cash”, as pronounced by an American
female.
As autocorrelation linear prediction analysis performs better with unvoiced
sounds, that method was implemented with a filter of prediction order of 11,

16
Figure 9: Top: time-domain representation of /a/ sound. Below: The sound’s
spectrum and spectral envelope as determined by covariance method linear pre-
diction analysis.

see figure 10.

R
Figure 10: Top: time-domain representation of / / sound. Below: The
sound’s spectrum and spectral envelope as determined by autocorrelation
method linear prediction analysis.

4.3 Trumpet

Though this report has primarily concerned itself with the linear prediction of
speech, linear prediction also has applications for musical signal processing [12].
Certain instrumental sounds, such as brass instruments, exhibit strong formant
structure that lend themselves well to modelling through linear prediction.
In this example, given in figure 11, a B[ trumpet sample was analysed, play-
ing the E[5. Trumpets are known to exhibit 3 formants, indicating a prediction

17
order of 6 is required to determine all the resonances. As the signal is periodic,
covariance method linear prediction analysis is performed.

Figure 11: Top: time-domain representation of a trumpet playing the note

E[5. Below: The sound’s spectrum and spectral envelope as determined by
covariance method linear prediction analysis.

References
[1] S. Chandra and Lin Wen. Experimental comparison between stationary
and nonstationary formulations of linear prediction applied to voiced speech
analysis. Acoustics, Speech, and Signal Processing [see also IEEE Transac-
tions on Signal Processing], IEEE Transactions on, 22(6):403–415, 1974.

[2] D. G. Childers. Speech Processing and Synthesis Toolboxes. John Wiley &
Sons, Inc., New York, 2000.

[3] Michael Dobrovolsky and Francis Katamba. Phonetics: The sounds of lan-
guage. In William O’Grady, Michael Dobrovolsky, and Francis Katamba,
editors, Contemporary Linguistics: An Introduction. Addison Wesley Long-
man Limited, Essex, 1997.

[4] A. Krishnamurthy and D. Childers. Two-channel speech analysis. Acous-

tics, Speech, and Signal Processing [see also IEEE Transactions on Signal
Processing], IEEE Transactions on, 34(4):730–743, 1986.

[5] J. Larar, Y. Alsaka, and D. Childers. Variability in closed phase analysis

of speech. volume 10, pages 1089–1092, 1985.

18
[6] Changxue Ma, Y. Kamp, and L. F. Willems. A frobenius norm approach
to glottal closure detection from the speech signal. Speech and Audio Pro-
cessing, IEEE Transactions on, 2(2):258–265, 1994.

[7] J. Makhoul. Linear prediction: A tutorial review. Proceedings of the IEEE,

63(4):561–580, 1975.

[8] Patrick A. Naylor, Anastasis Kounoudes, Jon Gudnason, and Mike

Brookes. Estimation of glottal closure instants in voiced speech using the
dypsa algorithm. Audio, Speech and Language Processing, IEEE Transac-
tions on [see also Speech and Audio Processing, IEEE Transactions on],
15(1):34–43, 2007.

[9] T. W. Parsons. Voice and speech processing. McGraw-Hill New York, 1987.

[10] M. D. Plumpe, T. F. Quatieri, and D. A. Reynolds. Modeling of the glottal

flow derivative waveform with application to speaker identification. Speech
and Audio Processing, IEEE Transactions on, 7(5):569–586, 1999.

[11] L. R. Rabiner and R. W. Schafer. Digital Processing of Speech Signals.

Prentice-Hall Signal Processing Series. Prentice-Hall Inc., Engelwood Cliffs,
New Jersey, 1978.

[12] Curtis Roads. The Computer Music Tutorial. MIT Press, Boston, Mas-
sachusetts, 1996.

[13] A. V. Oppenheim R. W. Schafer. Digital Signal Processing. Prentice Hall,

Englewood Cliffs, New Jersey, 1975.

[14] R. Smits and B. Yegnanarayana. Determination of instants of significant

excitation in speech using group delay function. Speech and Audio Process-
ing, IEEE Transactions on, 3(5):325–333, 1995.

[15] D. Wong, J. Markel, and Jr. Gray, A. Least squares glottal inverse filtering
from the acoustic speech waveform. Acoustics, Speech, and Signal Process-
ing [see also IEEE Transactions on Signal Processing], IEEE Transactions
on, 27(4):350–355, 1979.

View publication stats

(Thomas F. Quatieri) Discrete Time Speech Signal P (BookFi - Org) 2 PDF
100% (3)
(Thomas F. Quatieri) Discrete Time Speech Signal P (BookFi - Org) 2 PDF
800 pages
Lawrence R. Rabiner, Digital Processing of Speech Signals
100% (1)
Lawrence R. Rabiner, Digital Processing of Speech Signals
527 pages
P9 Sampling, Convolution, and FIR Filtering
No ratings yet
P9 Sampling, Convolution, and FIR Filtering
12 pages
Linear Prediction of Speech: D. Markel A. H. Gray, JR
No ratings yet
Linear Prediction of Speech: D. Markel A. H. Gray, JR
299 pages
Digital Processing of Speech Signals (Rabiner & Schafer 1978) PDF
100% (2)
Digital Processing of Speech Signals (Rabiner & Schafer 1978) PDF
265 pages
Discrete-Time Processing of Speech Signals (IEEE Press Classic Reissue) PDF
No ratings yet
Discrete-Time Processing of Speech Signals (IEEE Press Classic Reissue) PDF
919 pages
Rabiner & Juang - Fundamentals of Speech Recognition
100% (2)
Rabiner & Juang - Fundamentals of Speech Recognition
277 pages
PDF
No ratings yet
PDF
27 pages
Linear Prediction: The Technique, Its Solution and Application To Speech
No ratings yet
Linear Prediction: The Technique, Its Solution and Application To Speech
20 pages
Lab 9a. Linear Predictive Coding For Speech Processing: Vocal Tract Parameters Pitch Period Voiced/Unvoiced Speech Switch
No ratings yet
Lab 9a. Linear Predictive Coding For Speech Processing: Vocal Tract Parameters Pitch Period Voiced/Unvoiced Speech Switch
5 pages
Test2 SP
No ratings yet
Test2 SP
43 pages
LPC
No ratings yet
LPC
5 pages
Discrete Time Processing of Speech Signa
No ratings yet
Discrete Time Processing of Speech Signa
12 pages
Linear Predict
No ratings yet
Linear Predict
14 pages
Report
No ratings yet
Report
9 pages
Atal 2006 LPC PDF
No ratings yet
Atal 2006 LPC PDF
5 pages
A Tutorial On Speech Synthesis Models
No ratings yet
A Tutorial On Speech Synthesis Models
8 pages
Use of Spectral Autocorrelation in Spectral Envelope Linear Prediction For Speech Recognition
No ratings yet
Use of Spectral Autocorrelation in Spectral Envelope Linear Prediction For Speech Recognition
31 pages
Biomodelling - Linier Prediction
No ratings yet
Biomodelling - Linier Prediction
23 pages
Gram - Linear Prediction and Optimum Linear Fil-33-52
No ratings yet
Gram - Linear Prediction and Optimum Linear Fil-33-52
20 pages
Linear Predictor: Nature of Linear Prediction
No ratings yet
Linear Predictor: Nature of Linear Prediction
9 pages
Speech Processing
No ratings yet
Speech Processing
12 pages
Prentice Hall - Digital Processing of Speech Signals - 1978 PDF
No ratings yet
Prentice Hall - Digital Processing of Speech Signals - 1978 PDF
265 pages
Pub - Digital Processing of Speech Signals PDF
No ratings yet
Pub - Digital Processing of Speech Signals PDF
265 pages
Linear Prediction
No ratings yet
Linear Prediction
18 pages
Prepared By: Mamatha.K.S M.Tech (S.P) 1 Sem Guided By: Mr. Satish.M.N
No ratings yet
Prepared By: Mamatha.K.S M.Tech (S.P) 1 Sem Guided By: Mr. Satish.M.N
21 pages
Dokumen - Pub Discrete Time Speech Signal Processing Principles and Practice Low Price Ed Lpe 013242942x 9780132429429 9788177587463 8177587463
No ratings yet
Dokumen - Pub Discrete Time Speech Signal Processing Principles and Practice Low Price Ed Lpe 013242942x 9780132429429 9788177587463 8177587463
802 pages
Linear Prediction
No ratings yet
Linear Prediction
198 pages
Theory of LPC Vaidhyanathan
100% (1)
Theory of LPC Vaidhyanathan
198 pages
SP Module 5 PPT L4C
No ratings yet
SP Module 5 PPT L4C
145 pages
Anais Aesbr2007
No ratings yet
Anais Aesbr2007
160 pages
Estimation of Formant Frequency of Speech Signal by Linear Prediction Method and Wavelet Transform IJERTV2IS3371
No ratings yet
Estimation of Formant Frequency of Speech Signal by Linear Prediction Method and Wavelet Transform IJERTV2IS3371
6 pages
Test 1
No ratings yet
Test 1
77 pages
Linear Prediction Analysis: Vignans Institute of Engineering For Women
No ratings yet
Linear Prediction Analysis: Vignans Institute of Engineering For Women
20 pages
Geometric functions in computer aided geometric design
From Everand
Geometric functions in computer aided geometric design
Oscar Ruiz
No ratings yet
Self Learning Speaker Identification A System For PDF
No ratings yet
Self Learning Speaker Identification A System For PDF
185 pages
Linear Predictive Coding
No ratings yet
Linear Predictive Coding
22 pages
David S Undermann, Harald H Oge, Antonio Bonafonte, Helenca Duxans
No ratings yet
David S Undermann, Harald H Oge, Antonio Bonafonte, Helenca Duxans
5 pages
Linear Prediction
No ratings yet
Linear Prediction
5 pages
Speech Processing Project
No ratings yet
Speech Processing Project
16 pages
Speaker Recognition System
No ratings yet
Speaker Recognition System
7 pages
IOSRJEN (WWW - Iosrjen.org) IOSR Journal of Engineering
No ratings yet
IOSRJEN (WWW - Iosrjen.org) IOSR Journal of Engineering
5 pages
Estimation Theory in Signal Processing
No ratings yet
Estimation Theory in Signal Processing
2 pages
19EC1165
No ratings yet
19EC1165
2 pages
10212EC122 - Signal Processing Techniques For Speech Recognition Syllabus
No ratings yet
10212EC122 - Signal Processing Techniques For Speech Recognition Syllabus
3 pages
ps7 Fall09
No ratings yet
ps7 Fall09
2 pages
A Feature Extraction Method Based On Gauss Wavelet Filter and Linear Prediction Filter Coefficients in Speech Recognition
No ratings yet
A Feature Extraction Method Based On Gauss Wavelet Filter and Linear Prediction Filter Coefficients in Speech Recognition
13 pages
Module2 SSP
No ratings yet
Module2 SSP
70 pages
Engineering Physics
From Everand
Engineering Physics
Dr. S.G Ibrahim
No ratings yet
Robot Manipulators: Modeling, Performance Analysis and Control
From Everand
Robot Manipulators: Modeling, Performance Analysis and Control
Etienne Dombre
No ratings yet
Digital Signal and Image Processing using MATLAB, Volume 3: Advances and Applications, The Stochastic Case
From Everand
Digital Signal and Image Processing using MATLAB, Volume 3: Advances and Applications, The Stochastic Case
Gérard Blanchet
3/5 (1)
Foundations of Image Science
From Everand
Foundations of Image Science
Harrison H. Barrett
No ratings yet
Shortcuts to College Calculus Refreshment Kit
From Everand
Shortcuts to College Calculus Refreshment Kit
Juan Acevedo
No ratings yet
Constructed Layered Systems: Measurements and Analysis
From Everand
Constructed Layered Systems: Measurements and Analysis
W. H. Cogill
No ratings yet
Worked Examples in Mathematics for Scientists and Engineers
From Everand
Worked Examples in Mathematics for Scientists and Engineers
G. Stephenson
No ratings yet
Digital Signal Processing (DSP) with Python Programming
From Everand
Digital Signal Processing (DSP) with Python Programming
Maurice Charbit
No ratings yet
Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements
From Everand
Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements
S. Twomey
No ratings yet
Optical Flow: Exploring Dynamic Visual Patterns in Computer Vision
From Everand
Optical Flow: Exploring Dynamic Visual Patterns in Computer Vision
Fouad Sabry
No ratings yet
Elements of Tensor Calculus
From Everand
Elements of Tensor Calculus
A. Lichnerowicz
3.5/5 (2)
Ordinary Differential Equations and Stability Theory: An Introduction
From Everand
Ordinary Differential Equations and Stability Theory: An Introduction
David A. Sanchez
No ratings yet
Some Case Studies on Signal, Audio and Image Processing Using Matlab
From Everand
Some Case Studies on Signal, Audio and Image Processing Using Matlab
Dr. Hedaya Mahmood Alasooly
No ratings yet
New Directions in Dynamical Systems, Automatic Control and Singular Perturbations
From Everand
New Directions in Dynamical Systems, Automatic Control and Singular Perturbations
John O’Reilly
No ratings yet
Matched Filter Paper
No ratings yet
Matched Filter Paper
7 pages
ECE-C574 ASIC Design I Syllabus Winter Lecture: 1h50m Prof. Baris Taskin Lab: 1h50m
No ratings yet
ECE-C574 ASIC Design I Syllabus Winter Lecture: 1h50m Prof. Baris Taskin Lab: 1h50m
4 pages
Isolation Frequency Response: Headphone Measurements: Massdrop HD 6XX
No ratings yet
Isolation Frequency Response: Headphone Measurements: Massdrop HD 6XX
1 page
Owner'S Manual: Digital To Analog Converter
No ratings yet
Owner'S Manual: Digital To Analog Converter
4 pages
ECE CS 430 630 Syllabus - SCAA
No ratings yet
ECE CS 430 630 Syllabus - SCAA
5 pages
Modern Photonics Syllabus
No ratings yet
Modern Photonics Syllabus
5 pages
TS EAPCET Top To Bottom Colleges List
No ratings yet
TS EAPCET Top To Bottom Colleges List
4 pages
Week 3
No ratings yet
Week 3
22 pages
Project Management Professional (Training)
100% (1)
Project Management Professional (Training)
182 pages
School Prefect
No ratings yet
School Prefect
12 pages
Examination Hall Ticket Study Center: Shri R.S. Dubey Jr. College of Arts and Com. (35232)
No ratings yet
Examination Hall Ticket Study Center: Shri R.S. Dubey Jr. College of Arts and Com. (35232)
1 page
English9 1ST Exam
100% (1)
English9 1ST Exam
4 pages
Investigatory Project On Aids PDF 4
No ratings yet
Investigatory Project On Aids PDF 4
1 page
Logitech Wireless Keyboard K350 Manual
No ratings yet
Logitech Wireless Keyboard K350 Manual
40 pages
12 Maths Important Questions 2025
No ratings yet
12 Maths Important Questions 2025
5 pages
Job Characteristics Model
0% (1)
Job Characteristics Model
3 pages
Godavari River Upsc Notes 12
No ratings yet
Godavari River Upsc Notes 12
2 pages
Hypnagogia
No ratings yet
Hypnagogia
12 pages
NEUST AAF F001 Syllabus of Instruction Rev.02
No ratings yet
NEUST AAF F001 Syllabus of Instruction Rev.02
4 pages
Heat Transfer Characteristics of Shell and Tube Heat Exchanger
No ratings yet
Heat Transfer Characteristics of Shell and Tube Heat Exchanger
23 pages
Admin Commands
No ratings yet
Admin Commands
4 pages
Writing Successful Undergraduate Dissertations in Social Sciences A Student S Handbook 2nd Edition Francis Jegede Download
No ratings yet
Writing Successful Undergraduate Dissertations in Social Sciences A Student S Handbook 2nd Edition Francis Jegede Download
46 pages
A Stroll Around RISC OS
No ratings yet
A Stroll Around RISC OS
9 pages
IGEH Cuadernillo 1ro
No ratings yet
IGEH Cuadernillo 1ro
61 pages
Optical Fiber
No ratings yet
Optical Fiber
12 pages
05.contraction of Skeletal Muscle
No ratings yet
05.contraction of Skeletal Muscle
87 pages
Leadership in A Nutshell
No ratings yet
Leadership in A Nutshell
23 pages
English10 Q1 M8
No ratings yet
English10 Q1 M8
12 pages
Tierney 1988
No ratings yet
Tierney 1988
7 pages
Defarch 3-2
No ratings yet
Defarch 3-2
14 pages
PPST Domain Learnin G Area Strength/S Weaknesse S Opportunitie S Threats
No ratings yet
PPST Domain Learnin G Area Strength/S Weaknesse S Opportunitie S Threats
13 pages
2010.1.15.facial Faradic
No ratings yet
2010.1.15.facial Faradic
2 pages
CV - Marielle
No ratings yet
CV - Marielle
3 pages
Construction and Trial Experiment of A Small Size Thermo-Acoustic Refrigeration System
No ratings yet
Construction and Trial Experiment of A Small Size Thermo-Acoustic Refrigeration System
6 pages
17 Earthquake Engineering
No ratings yet
17 Earthquake Engineering
70 pages