Fundamentals of Speech Processing
COURSE OBJECTIVES:
1. To understand the characteristics of speech signals.
2. To understand models of speech signal processing.
3. To apply signal processing concepts to speech signals.
4. To understand the applications of speech signal processing.
5. To get an insight into a few applications of speech processing.
UNIT-I
PRODUCTION AND CLASSIFICATION OF SPEECH SOUNDS: Anatomy and
physiology of speech production, spectrographic analysis of speech, and categorization of
speech sounds.
DIGITAL MODELS FOR THE SPEECH SIGNAL: The acoustic theory of speech
production. 9 hrs
UNIT-II
TIME DOMAIN MODELS FOR SPEECH PROCESSING: Short-time energy, average
magnitude, average zero-crossing rate, speech vs. silence discrimination using energy and
zero-crossings, pitch period estimation using a parallel processing approach, short-time
autocorrelation function, average magnitude difference function, pitch period estimation
using autocorrelation.
SHORT-TIME FOURIER ANALYSIS: Fourier transform interpretation, Linear filtering
interpretation, sampling rates of STFT in time and frequency, Filter bank summation method
of short-time synthesis, Overlap addition method of short-time synthesis. 9 hrs
UNIT-III
HOMOMORPHIC SPEECH PROCESSING: Homomorphic systems for convolution,
complex cepstrum of speech, pitch detection, formant estimation. 10 hrs
UNIT-IV
LINEAR PREDICTION ANALYSIS OF SPEECH: Principles of linear prediction,
Computation of the gain for the model, Solution of the LPC equations, Comparison between
autocorrelation and covariance methods, Frequency domain interpretation of mean squared
prediction error, synthesis of speech from LP parameters, pitch detection and formant
analysis using LPC parameters. 10 hrs
UNIT-V
APPLICATIONS: Speaker recognition systems, speech recognition systems, isolated word
recognition, connected word recognition and large vocabulary word recognition, hidden
Markov models, three basic problems of HMM, Types of HMM. 10 hrs
TEXT BOOKS:
1. Lawrence R. Rabiner and Ronald W. Schafer, “Digital Processing
of Speech Signals”, 2nd Indian Reprint, Pearson Education, 2005.
REFERENCE BOOKS:
1. Thomas F. Quatieri, “Discrete-Time Speech Signal Processing:
Principles and Practice”, 1st Indian Reprint, Pearson Education,
2004.
2. Lawrence R. Rabiner, Biing-Hwang Juang, and B. Yegnanarayana,
“Fundamentals of Speech Recognition”, Pearson Education, 2009.
COURSE OUTCOMES:
Students will be able to:
CO1: analyze the properties of speech signals.
CO2: analyze speech signals in the frequency domain in terms of
analysis and synthesis.
CO3: analyze the separation of source and filter properties using
homomorphic analysis.
CO4: analyze the linear prediction coefficients and their significance
in speech processing.
CO5: differentiate between the different speech sounds.