Speech Signal Processing Lab Work Book


SPEECH SIGNAL PROCESSING

21EC3081

STUDENT ID: ACADEMIC YEAR: 2023-24


STUDENT NAME:
Table of Contents

1. Session 01: Introduction to MATLAB
2. Session 02: Speech acquisition and recording
3. Session 03: Non-Stationary Nature of Speech Signal
4. Session 04: Identification of Voice/Unvoiced/Silence regions of Speech
5. Session 05: Different Sounds (Phonemes) In Language
6. Session 06: Short Term Time Domain Processing of Speech
7. Session 07: Fundamental frequency estimation in Speech signal
8. Session 08: Format synthesis MFCC extraction from Speech signal
9. Session 09: Linear Prediction Analysis
10. Session 10: Cepstral Analysis of Speech Signal
11. Session 11: LPCC extraction from Speech signal
12. Session 12: Speech Enhancement
13. Session 13: Speaker recognition
A.Y. 2023-24 LAB/SKILL CONTINUOUS EVALUATION

Columns: S.No | Date | Experiment Name | Pre-Lab (10M) | In-Lab (25M): Program/Procedure (5M), Data and Results (10M), Analysis & Inference (10M) | Post-Lab (10M) | Viva Voce (5M) | Total (50M) | Faculty Signature

1. Introduction to MATLAB
2. Speech acquisition and recording
3. Non-Stationary Nature of Speech Signal
4. Identification of Voice/Unvoiced/Silence regions of Speech
5. Different Sounds (Phonemes) In Language
6. Short Term Time Domain Processing of Speech
7. Fundamental frequency estimation in Speech signal
8. Format synthesis MFCC extraction from Speech signal
9. Linear Prediction Analysis
10. Cepstral Analysis of Speech Signal
11. LPCC extraction from Speech signal
12. Speech Enhancement
13. Speaker Recognition
Experiment # <TO BE FILLED BY STUDENT> Student ID <TO BE FILLED BY STUDENT>
Date <TO BE FILLED BY STUDENT> Student Name <TO BE FILLED BY STUDENT>

Experiment 1: Introduction to MATLAB


Aim/Objective: Introduction to Relational Operators, Loops & Functions, Matrix Operations

Description:

• Easy and efficient programming in a high-level language, with an interactive
interface for rapid development.

• Vectorised computations for efficient programming, and automatic memory allocation.

• Built-in support for state-of-the-art numerical computing methods.

• Has a variety of modern data structures and data types, including complex numbers.

• High-quality graphics and visualization


Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What are the basic elements of the MATLAB environment, such as the command window,
workspace, and editor?

2. How can you perform basic arithmetic operations and mathematical calculations using
MATLAB?

3. What are variables and how are they defined and used in MATLAB?

4. How can you create and manipulate arrays (vectors and matrices) in MATLAB?

5. What are some common built-in functions and operators in MATLAB, and how are they
used?

Course Title <TO BE FILLED BY CC> ACADEMIC YEAR: 2023-24


Course Code(s) <TO BE FILLED BY CC AND MUST INCLUDE ALL R,A,P CODES> Page 1 of 57

In-Lab:

Procedure:

• To open MATLAB, go to Run in the Start Menu, type MATLAB and press Enter.
• To write a MATLAB program, open a new script.
• Write the code in the script file and save it with the extension .m. To execute the
program, click the Run button or press F5.
• If there are no errors in the program, the output waveform is displayed. Any errors
are shown in the Command Window.
• Observe the output waveforms after running the program.

Program:

% Display a simple "Hello, MATLAB!" message
disp('Hello, MATLAB!');

% Perform basic arithmetic operations
a = 5;
b = 3;
total = a + b; % named total so the built-in sum() is not shadowed
difference = a - b;
product = a * b;
quotient = a / b;
disp(['Sum: ' num2str(total)]);
disp(['Difference: ' num2str(difference)]);
disp(['Product: ' num2str(product)]);
disp(['Quotient: ' num2str(quotient)]);

% Generate data and plot a simple line graph
clc
clear all
close all
fo = 5; % frequency of the sine wave (Hz)
Fs = 500; % sampling rate (Hz)
Ts = 1/Fs; % sampling time interval (s)
t = 0:Ts:1-Ts; % time vector covering one second
n = length(t); % number of samples
y = 2*sin(2*pi*fo*t); % the sine curve
% plot the sine curve in the time domain
figure(1)
plot(t,y)
xlabel('time (seconds)')
ylabel('y(t)')
title('Sample Sine Wave')
grid
% plot the normalised two-sided magnitude spectrum
figure(2)
Y = fft(y);
Y = Y/length(Y);
f = -Fs/2:Fs/length(y):Fs/2-Fs/length(y);
stem(f, fftshift(abs(Y)))
xlabel('frequency (Hz)')
ylabel('|Y(f)|');
title('Magnitude Spectrum of the Sine Wave');

Data and Results:

This Section is meant for the students to collect, record the results generated during the
Program/Experiment execution as shown below. Include instructions on how to present the results,
such as creating tables, graphs, or visualizations.

Analysis and Inferences:


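Here we can observe two frequency components of the input sinusoid, at +5 Hz and -5 Hz.
Because the spectrum is normalised by the number of samples, each component has magnitude 1,
which is half of the sine amplitude of 2.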
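
The two peaks can also be located numerically instead of being read off the stem plot. The following is a minimal sketch, assuming the same 5 Hz, amplitude-2 sine and Fs = 500 Hz as the program above. The threshold of 0.5 and the names peakFreqs and peakMags are illustrative choices, not part of the original program.

```matlab
% Sketch: find the peak frequencies of the two-sided spectrum numerically.
fo = 5;                          % sine frequency in Hz (same as the program above)
Fs = 500;                        % sampling rate in Hz
t  = 0:1/Fs:1-1/Fs;              % one second of samples
y  = 2*sin(2*pi*fo*t);           % amplitude-2 sine wave

N = length(y);
Y = fftshift(abs(fft(y))/N);     % normalised, zero-centred magnitude spectrum
f = (-N/2:N/2-1)*(Fs/N);         % frequency axis in Hz (N is even here)

peakFreqs = f(Y > 0.5);          % bins above the illustrative threshold
peakMags  = Y(Y > 0.5);          % magnitudes at those bins
disp(peakFreqs)
disp(peakMags)
```

Since the sine completes an integer number of cycles in the one-second record, there is no spectral leakage and only the two bins at the signal frequency exceed the threshold.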

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. What is the purpose of the "disp" function in MATLAB, and how is it used in the first
program?

2. Can you explain the order of operations in the basic arithmetic operations program? How
does MATLAB handle mathematical expressions?

3. In the second program, what does the "^" operator do, and how does it affect the vector?

4. How are the sum, mean, maximum, and minimum values calculated in the third program?
Are there any built-in functions used for these calculations?

5. Describe the steps involved in creating a simple line graph using the plot function in
MATLAB, as demonstrated in the fourth program.

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 2: Speech acquisition and recording


Aim/Objective: To gain basic understanding of speech signals and their acquisition

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is speech acquisition, and why is it important in the field of speech and audio
processing?

2. What are the primary components of a speech acquisition system, and how do they work
together to capture and record speech?

3. What are the different types of microphones commonly used for speech acquisition, and
what factors should be considered when selecting a microphone for a specific application?

4. What is the sampling rate, and why is it essential in speech recording? How does the
sampling rate affect the quality and fidelity of recorded speech?

5. How does the bit depth or resolution of a recording device impact the quality of recorded
speech? What is the relationship between bit depth and dynamic range?

In-Lab:

Procedure:

1. Connect the microphone to the computer.


2. Launch the audio recording software.
3. Set the sampling frequency to 16 kHz.
4. Speak into the microphone and record your voice.
5. Save the recording file
Program:

The following code shows how to acquire and record a speech signal in MATLAB:

Code snippet:-
A: Record the Audio
close all;
% Record voice from the microphone into a MATLAB variable
n=5; % record time in seconds (5 s matches TotalSamples: 80000 in the results)
fs=16000; % sample rate in Hz
channel = 1; % mono recording
rec = audiorecorder(fs,16,channel); % 16-bit recorder object at sample rate fs
recordblocking(rec,n); % Record for n seconds
y = getaudiodata(rec); % recorded samples as a column vector
player = audioplayer(y,fs);
playblocking(player); % plays the voice of the record.
%figure;
%plot(y);
audiowrite('voice.wav',y,fs); % Saves the file with name voice.wav
%[y,fs]= audioread('voice.wav');

B: File .wav audio play

% get a section of vowel


[x,fs]=audioread('clip_aa.wav');
ms1=fs/1000; % samples per period of the maximum speech Fx (1000 Hz)
ms20=fs/50; % samples per period of the minimum speech Fx (50 Hz)
% plot waveform
t=(0:length(x)-1)/fs; % times of sampling instants
subplot(3,1,1);
plot(t,x);
legend('Waveform');
xlabel('Time (s)');
ylabel('Amplitude');
% do fourier transform of windowed signal
Y=fft(x.*hamming(length(x)));
% plot spectrum of bottom 5000Hz
hz5000=5000*length(Y)/fs;
f=(0:hz5000)*fs/length(Y);
subplot(3,1,2);
plot(f,20*log10(abs(Y(1:length(f)))+eps));
legend('Spectrum');
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
Part A creates an audio recorder object with a sampling frequency of 16 kHz, 16 bits per sample and
a single channel. The user's voice is recorded, played back and saved to the WAV file voice.wav.
Part B reads the vowel recording clip_aa.wav and plots its waveform and its magnitude spectrum up
to 5000 Hz.
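
The save and reload steps can be checked without a microphone. The following is a minimal sketch that writes a synthetic tone in place of a recording, reads it back, and compares the file properties with the values expected for a 5 second, 16 kHz, 16-bit recording. The 440 Hz test tone, the temporary file name and all variable names are assumptions made for illustration only.

```matlab
% Sketch: verify WAV writing and reading with a synthetic tone.
fs = 16000;                          % sampling rate in Hz
n  = 5;                              % duration in seconds
t  = (0:n*fs-1)'/fs;                 % time instants as a column vector
y  = 0.5*sin(2*pi*440*t);            % hypothetical 440 Hz test tone

fname = [tempname '.wav'];           % temporary file, removed at the end
audiowrite(fname, y, fs, 'BitsPerSample', 16);

info = audioinfo(fname);             % file properties
[x, fsRead] = audioread(fname);      % samples and sampling rate from the file

durationSec = info.TotalSamples / info.SampleRate;   % duration in seconds
maxErr      = max(abs(x - y));                       % 16-bit quantisation error
dynRangeDb  = 20*log10(2^info.BitsPerSample);        % approx. 6.02 dB per bit
delete(fname);
```

The number of samples is the duration multiplied by the sampling rate, and each additional bit of resolution adds about 6 dB of dynamic range, so a 16-bit recording has a theoretical dynamic range of about 96 dB.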

Data and Results:

A:

Ans = audioplayer with properties:

SampleRate: 16000

BitsPerSample: 16

NumberOfChannels: 1

DeviceID: -1

CurrentSample: 513

TotalSamples: 80000

Running: 'on'

StartFcn: []

StopFcn: []

TimerFcn: []

TimerPeriod: 0.0500


B:

Analysis and Inferences:

The audio data was recorded at a sampling rate of 16000 Hz (see SampleRate in the results for
part A). In addition, the waveform amplitude and the spectral magnitude of another .wav vowel
recording can be observed in the graphs for part B.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. What is the role of a microphone in speech acquisition, and how does it convert sound
waves into electrical signals?

2. Can you explain the concept of sampling rate in the context of speech recording? How does
the choice of sampling rate impact the quality and fidelity of the recorded speech?

3. What is the significance of the bit depth or resolution in speech recording? How does it
affect the dynamic range and accuracy of the recorded speech?

4. Describe some common challenges or factors that can affect the quality of speech
recordings, such as background noise, microphone placement, and room acoustics.

5. What are some techniques or approaches that can be employed to minimize or mitigate
background noise during speech recording?

Evaluator Remark (if Any):


Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment


Experiment 3: Non-Stationary Nature of Speech Signal

Aim/Objective: To understand the non-stationary nature of speech signals.

Pre-Requisites:

Computer with MATLAB installed and a recorded speech file (the program reads e_sound.wav).

Pre-Lab:

1. What is meant by the non-stationary nature of signals, and why is it important to study?

2. What are some key characteristics of non-stationary signals, and how do they differ from
stationary signals?

3. How is speech produced by the human vocal tract, and what are the main components of
the speech production mechanism?

4. What are some factors that contribute to the non-stationarity of speech signals, such as
phonetic content, prosody, and speaker variability?

5. What are some techniques or methods used to analyze and model the non-stationary nature
of speech signals?

In-Lab:

Procedure:

1. Record your voice for a few seconds.


2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Observe how the power spectrum changes over time.
Program: The following code shows how to analyze the non-stationary nature of a speech
signal in MATLAB:
Code snippet
[x,fs]=audioread('e_sound.wav');
ms1=fs/1000; % maximum speech Fx at 1000Hz
ms20=fs/50; % minimum speech Fx at 50Hz
% plot waveform
t=(0:length(x)-1)/fs; % times of sampling instants
subplot(3,1,1);
plot(t,x);
legend('Waveform');
xlabel('Time (s)');
ylabel('Amplitude');
% do fourier transform of windowed signal
Y=fft(x.*hamming(length(x)));
% plot spectrum of bottom 5000Hz
hz5000=5000*length(Y)/fs;
f=(0:hz5000)*fs/length(Y);
subplot(3,1,2);
plot(f,20*log10(abs(Y(1:length(f)))+eps));
legend('Spectrum');
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
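This code loads the speech signal e_sound.wav and plots its waveform. The magnitude spectrum
of the whole signal is then computed with fft() after applying a Hamming window, and the part
below 5000 Hz is plotted in dB. To observe how the spectrum changes over time (step 4), the same
analysis can be repeated on short frames, for example with the spectrogram() function.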
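
Step 4 of the procedure asks how the spectrum changes over time. The following is a minimal sketch of a frame-by-frame analysis using only fft(), applied to a synthetic signal whose frequency switches from 300 Hz to 1200 Hz halfway through. The test signal, the frame length of 256 samples, the hop of 128 samples and the variable names are assumptions for illustration; a recorded speech signal would take the place of x.

```matlab
% Sketch: dominant frequency per short frame of a non-stationary signal.
fs = 8000;                                   % sampling rate in Hz
t  = (0:fs-1)'/fs;                           % one second of time instants
x  = [sin(2*pi*300*t(1:fs/2)); ...           % first half: 300 Hz tone
      sin(2*pi*1200*t(fs/2+1:end))];         % second half: 1200 Hz tone

frameLen = 256;                              % frame length in samples
hop      = 128;                              % frame advance in samples
w = 0.54 - 0.46*cos(2*pi*(0:frameLen-1)'/(frameLen-1));  % Hamming window written out

numFrames = floor((length(x) - frameLen)/hop) + 1;
peakHz = zeros(numFrames, 1);                % dominant frequency of each frame
for k = 1:numFrames
    idx = (k-1)*hop + (1:frameLen)';         % sample indices of frame k
    X = abs(fft(x(idx).*w));                 % magnitude spectrum of the frame
    [~, bin] = max(X(1:frameLen/2));         % strongest bin below fs/2
    peakHz(k) = (bin-1)*fs/frameLen;         % convert bin index to Hz
end
plot((0:numFrames-1)*hop/fs, peakHz);
xlabel('Time (s)'); ylabel('Dominant frequency (Hz)');
```

A single FFT over the whole signal would show both tones but not when each occurs. The frame-by-frame result makes the change over time visible, which is the property that makes a signal non-stationary.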
Data and Results:

Analysis and Inferences:

In this experiment, we analyzed audio data with a sample rate of 8000 Hz. The waveform
amplitude and the spectral magnitude of the .wav recording of the vowel "e" can be observed in
the graph.
Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the difference between stationary and non-stationary signals, and how does
this distinction apply to speech signals?

2. What are some factors that contribute to the non-stationarity of speech signals, and how do
they impact the characteristics of the signal?

3. Describe the concept of short-term and long-term variability in speech signals. How do these
variations affect the analysis and processing of speech signals?

4. What are some common techniques or methods used to analyze and model the non-
stationary nature of speech signals?

5. How can spectrograms or time-frequency representations be used to visualize and
characterize the non-stationarity in speech signals?


Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 4: Identification of Voice/Unvoiced/Silence regions of Speech


Aim/Objective: To understand the different regions of speech signals and how they can be
identified.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What are the voice, unvoiced, and silence regions in a speech signal, and why is it important
to identify and distinguish them?

2. Can you explain the characteristics of a voiced speech signal and how it differs from an
unvoiced or silence region?

3. What are some techniques or methods commonly used for the identification of
voice/unvoiced/silence regions in speech signals?

4. How does the fundamental frequency (pitch) play a role in determining the voiced regions of
a speech signal? What are some algorithms used for pitch estimation?

5. Describe the concept of energy or power in a speech signal. How can energy-based
measures be utilized to identify unvoiced or silence regions?

In-Lab:

Procedure:

1. Record your voice for a few seconds.


2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Identify the voice, unvoiced, and silence regions of the speech signal
Program:

The following code shows how to identify the different regions of a speech signal in
MATLAB:

Code snippet:-
clear all;
close all;
[x,fs]= audioread('e_sound.wav');
figure(1)
ms1=fs/1000; % maximum speech Fx at 1000Hz
ms20=fs/50; % minimum speech Fx at 50Hz%
% plot waveform
t=(0:length(x)-1)/fs; % times of sampling instants
plot(t,x);
legend('Waveform');
xlabel('Time (s)');

ylabel('Amplitude');


% Three speech segments, of different length.
speechSignal=audioread('e_sound.wav');
ss1 = speechSignal(36095:36700);
ss2 = speechSignal(36095:36550);
ss3 = speechSignal(36095:36400);
ss4 = speechSignal(99200:99800);
% Calculation of the short time autocorrelation
[ac1, lags1] = xcorr(ss1);
[ac2, lags2] = xcorr(ss2);
[ac3, lags3] = xcorr(ss3);
[ac4, lags4] = xcorr(ss4);
figure(2)
subplot(3,1,1);
plot(lags1, ac1);
legend('Window Length: 606 Samples')
title('Short-Time Autocorrelation Function')
grid on;
subplot(3,1,2);
plot(lags2, ac2);
xlim([lags1(1) lags1(end)]);
legend('Window Length: 456 Samples')
grid on;
subplot(3,1,3);
plot(lags3, ac3);
xlim([lags1(1) lags1(end)]);
legend('Window Length: 306 Samples')
xlabel('Lag in samples')
grid on;
%% detection voiced and unvoiced signal
figure(3)
%ss4 = speechSignal(12200:12800);
subplot(2,2,1);
plot(ss1);
legend('Voiced Speech')
subplot(2,2,2);
plot(lags1, ac1);
xlim([lags1(1) lags1(end)]);
legend('Autocorrelation of Voiced Speech')
grid on;
subplot(2,2,3);
plot(ss4);
legend('Unvoiced Speech')
subplot(2,2,4);
plot(lags4, ac4);
xlim([lags1(1) lags1(end)]);
legend('Autocorrelation of Unvoiced Speech')
grid on;

This code loads the speech signal e_sound.wav and plots its waveform. The short-time
autocorrelation of three segments of different length (606, 456 and 306 samples) is computed with
the xcorr() function and plotted in figure 2. Figure 3 compares a voiced segment (ss1) with an
unvoiced segment (ss4), each shown with its autocorrelation function. Strong periodic peaks in the
autocorrelation indicate voiced speech, whereas the autocorrelation of unvoiced speech decays
quickly and shows no clear periodic peaks.
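
Silence is usually separated from speech by its low energy, and unvoiced speech from voiced speech by its high zero-crossing rate. The following is a minimal sketch of such a frame classifier on a synthetic signal made of silence, a 150 Hz tone (standing in for voiced speech) and white noise (standing in for unvoiced speech). The test signal, the 20 ms frame length and both thresholds are illustrative assumptions and would need tuning for real recordings.

```matlab
% Sketch: label frames as silence, voiced or unvoiced from energy and ZCR.
rng(0);                                      % repeatable noise
fs = 8000;                                   % sampling rate in Hz
segLen = fs/2;                               % each region lasts 0.5 s
t = (0:segLen-1)'/fs;
x = [zeros(segLen,1); ...                    % silence
     0.8*sin(2*pi*150*t); ...                % voiced-like: periodic, high energy
     0.1*randn(segLen,1)];                   % unvoiced-like: noisy, low energy

frameLen  = 160;                             % 20 ms frames
numFrames = floor(length(x)/frameLen);
labels = strings(numFrames, 1);
for k = 1:numFrames
    frame  = x((k-1)*frameLen + (1:frameLen)');
    energy = mean(frame.^2);                                 % short-time energy
    zcr    = sum(abs(diff(sign(frame))) > 0)/(frameLen-1);   % zero-crossing rate
    if energy < 1e-4                         % illustrative energy threshold
        labels(k) = "silence";
    elseif zcr > 0.25                        % illustrative ZCR threshold
        labels(k) = "unvoiced";
    else
        labels(k) = "voiced";
    end
end
disp(labels([1 26 51])')                     % first frame of each region
```

The autocorrelation used in the program above gives a complementary test: a frame with a strong autocorrelation peak at a non-zero lag is periodic and therefore voiced.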
Data and Results:


Analysis and Inferences:

Here, a recording of the vowel sound “e” is divided into segments. The autocorrelation
function is computed for segments of different length and shown in figure 2. We then compare the
autocorrelation functions of a voiced and an unvoiced segment in figure 3.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the significance of identifying voice, unvoiced, and silence regions in speech
signals? What kind of information or insights can be derived from this identification?

2. Describe the characteristics of a voiced speech signal and an unvoiced speech signal. What
are the key differences between them in terms of spectral content and waveform shape?

3. What are some common techniques or algorithms used for the identification of
voice/unvoiced/silence regions in speech signals? How do these methods work?

4. How does the concept of fundamental frequency (pitch) play a role in determining the
voiced regions of a speech signal? Can you explain the process of pitch estimation and its
importance in this context?


5. Discuss the role of energy or power in identifying unvoiced or silence regions of a speech
signal. How are energy-based measures employed in this process?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 5: Different Sounds (Phonemes) In Language


Aim/Objective: To understand the different sounds (phonemes) in language and how they can be
represented in digital form

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What are phonemes, and why are they important in the study of language and linguistics?

2. How do phonemes differ from letters or graphemes in a written language system?

3. Can you explain the concept of minimal pairs and how they relate to identifying and
distinguishing different phonemes?

4. What are some techniques or methods used to analyze and classify phonemes in different
languages?

5. How does the International Phonetic Alphabet (IPA) facilitate the representation and
transcription of phonemes?

In-Lab:

Procedure:

1. Record your voice for a few seconds, pronouncing different vowels and
consonants.
2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Identify the different phonemes in the speech signal.
Program:

The following code shows how to represent phonemes in digital form in MATLAB:

Code snippet
clear all;
close all;

[SS, Fs] = audioread('male2.wav'); % speech samples and sampling rate
% Hamming window
winLen = 301;
winOverlap = 300;
wHamm = hamming(winLen);
% Framing and windowing.
sigFramed = buffer(SS, winLen, winOverlap, 'nodelay');
sigWindowed = diag(sparse(wHamm)) * sigFramed;
% Short-Time Energy calculation
energyST = sum(sigWindowed.^2,1);
% Time in seconds, for the graphs
t = [0:length(SS)-1]/Fs;
subplot(1,1,1);
plot(t, SS);
title('speech: Male Voice');
xlims = get(gca,'Xlim');
hold on;
% Delayed Short-Time energy due to lowpass filtering
delay = (winLen - 1)/2;
plot(t(delay+1:end - delay), energyST, 'r');
xlim(xlims);
xlabel('Time (sec)');
legend({'Speech','Short-Time Energy'});
hold off;

This code loads the speech signal male2.wav, splits it into overlapping frames of 301 samples with
buffer(), applies a Hamming window to each frame and computes the short-time energy as the sum of
the squared windowed samples. The waveform and the short-time energy are plotted on the same time
axis.
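
Steps 3 and 4 of the procedure rely on the spectrum to tell phonemes apart. The following is a minimal sketch of one simple spectral measure, the spectral centroid, which tends to be low for vowels and high for fricatives such as "s". Both test sounds are synthetic stand-ins (a harmonic series on 120 Hz and differenced white noise), and every name and parameter here is an assumption for illustration rather than part of the original program.

```matlab
% Sketch: spectral centroid of a vowel-like and a fricative-like sound.
rng(0);                                      % repeatable noise
fs = 16000;                                  % sampling rate in Hz
t  = (0:fs/4-1)'/fs;                         % 0.25 s of time instants

vowel = zeros(size(t));                      % vowel-like: harmonics of 120 Hz
for h = 1:10
    vowel = vowel + (1/h)*sin(2*pi*120*h*t); % amplitude falls with harmonic number
end
fric = diff([0; randn(size(t))]);            % fricative-like: high-frequency noise

signals = {vowel, fric};
centroidHz = zeros(1, 2);
for i = 1:2
    X    = abs(fft(signals{i}));             % magnitude spectrum
    half = X(1:floor(length(X)/2)+1);        % keep 0 Hz up to fs/2
    f    = (0:length(half)-1)'*fs/length(X); % frequency of each bin in Hz
    centroidHz(i) = sum(f.*half)/sum(half);  % magnitude-weighted mean frequency
end
disp(centroidHz)                             % [vowel-like, fricative-like]
```

Practical phoneme recognition uses richer features such as formant frequencies or MFCCs, but the centroid already separates these two broad sound classes.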

Data and Results:


Analysis and Inferences:

In this experiment, we calculated the short-time energy of the recorded voice phonemes. It
is shown in the figure above.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of phonemes and their role in language? How do phonemes
differ from individual sounds or letters?

2. What is the importance of studying phonemes in linguistics and language analysis? How do
they contribute to our understanding of language structure and variation?

3. How are minimal pairs used to identify and distinguish different phonemes? Can you provide
an example of a minimal pair?

4. Describe the process of analyzing and classifying phonemes in different languages. What are
some common methods or techniques used in phonetics and phonology?

5. What is the International Phonetic Alphabet (IPA), and how does it assist in representing and
transcribing phonemes across languages? Can you provide an example of using IPA symbols?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 6: Short Term Time Domain Processing of Speech


Aim/Objective: To understand the different short-term time domain processing techniques that can
be used to analyze speech signals.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is short-term time domain processing, and why is it commonly used in speech signal
analysis?

2. Can you explain the concept of windowing and its role in short-term time domain processing
of speech?

3. What are some common window functions used in speech signal analysis, such as
rectangular, Hamming, or Hanning windows? How do they affect the analysis results?

4. How is the concept of frame or window size related to short-term time domain processing?
What factors should be considered when choosing an appropriate frame size for speech
analysis?

5. Describe the process of computing the short-term energy of a speech signal using
windowing. What information does the energy provide about the signal?


In-Lab:

Procedure:

1. Record your voice for a few seconds.


2. Plot the waveform of the recorded speech signal.
3. Calculate the short-term energy of the speech signal.
4. Calculate the short-term autocorrelation of the speech signal.
5. Calculate the short-term power spectrum of the speech signal.
Program:

The following code shows how to perform short-term time domain processing on a speech signal in
MATLAB:

Code snippet
[SS, Fs] = audioread('e_sound.wav'); % speech samples and sampling rate
% Hamming window
winLen = 301;
winOverlap = 300;
wHamm = hamming(winLen);
% Framing and windowing.
sigFramed = buffer(SS, winLen, winOverlap, 'nodelay');
sigWindowed = diag(sparse(wHamm)) * sigFramed;
% Short-Time Energy calculation
energyST = sum(sigWindowed.^2,1);
% Time in seconds, for the graphs
t = [0:length(SS)-1]/Fs;
subplot(1,1,1);
plot(t, SS);
title('speech: He took me by surprise');
xlims = get(gca,'Xlim');
hold on;
% Delayed Short-Time energy due to lowpass filtering
delay = (winLen - 1)/2;
plot(t(delay+1:end - delay), energyST, 'r');
xlim(xlims);
xlabel('Time (sec)');
legend({'Speech','Short-Time Energy'});
hold off;

This code loads the speech signal e_sound.wav and plots its waveform. The short-term energy is
calculated by framing the signal with buffer(), applying a Hamming window and summing the squared
samples of each frame. The short-term autocorrelation and the short-term power spectrum (steps 4
and 5) are not computed by this program; they can be obtained, for example, with the xcorr() and
spectrogram() functions.
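
The comment about lowpass filtering in the program refers to an equivalent view of short-time energy: sliding a window over the signal one sample at a time gives the same result as filtering the squared signal with the squared window. The following is a minimal sketch that checks this equivalence on a synthetic tone burst. The test signal and variable names are assumptions, and the Hamming window is written out so that only base MATLAB functions are needed.

```matlab
% Sketch: short-time energy by framing and by filtering give the same result.
fs = 8000;                                   % sampling rate in Hz
t  = (0:fs/2-1)'/fs;                         % 0.5 s of time instants
x  = sin(2*pi*200*t) .* (t > 0.1 & t < 0.3); % 200 Hz burst from 0.1 s to 0.3 s

winLen = 301;                                % same window length as the program
w = 0.54 - 0.46*cos(2*pi*(0:winLen-1)'/(winLen-1));   % Hamming window

% (a) frame by frame with a hop of one sample
numFrames  = length(x) - winLen + 1;
energyLoop = zeros(numFrames, 1);
for k = 1:numFrames
    frame = x(k:k+winLen-1) .* w;            % windowed frame starting at sample k
    energyLoop(k) = sum(frame.^2);           % energy of the frame
end

% (b) squared signal filtered by the squared (symmetric) window
energyConv = conv(x.^2, w.^2, 'valid');

maxDiff = max(abs(energyLoop - energyConv)); % difference between the two methods
```

The energy curve is smooth because the squared window acts as a lowpass filter, and it is delayed by half the window length, which is why the program plots it against t(delay+1:end-delay).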


Data and Results:

Analysis and Inferences:

In this experiment, we calculated the short-time energy of a recorded voice containing one phoneme.
It is shown in the figure above.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. What is the purpose of short-term time domain processing in speech analysis? How does it
enable us to extract useful information from speech signals?

2. Can you explain the concept of windowing and why it is employed in short-term time
domain processing of speech? What are some common window functions used in this
process?

3. Discuss the significance of frame size and frame overlap in short-term time domain
processing. How do these parameters impact the analysis results?

4. Describe the process of computing short-term energy using windowing. How does the
energy information help in understanding the characteristics of a speech signal?

5. What is the autocorrelation function, and how is it used in short-term time domain
processing of speech? How does the autocorrelation function provide insights into the
periodicity or pitch of the speech signal?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 7: Fundamental frequency estimation in Speech signal


Aim/Objective: To understand the different techniques that can be used to estimate the
fundamental frequency of a speech signal.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is the fundamental frequency in speech signals, and what role does it play in speech
perception and analysis?

2. Can you explain the concept of pitch and how it relates to the fundamental frequency of a
speech signal?

3. What are some common methods or algorithms used for fundamental frequency estimation
in speech signals?

4. Describe the process of autocorrelation-based fundamental frequency estimation. How does
autocorrelation help in identifying the periodicity of the speech signal?

5. How does the concept of harmonics relate to the fundamental frequency? Can you explain
the relationship between harmonics and the periodic nature of speech signals?


In-Lab:

Procedure:

1. Record your voice for a few seconds.
2. Plot the waveform of the recorded speech signal.
3. Calculate the short-term autocorrelation of the speech signal.
4. Estimate the fundamental frequency of the speech signal using the
autocorrelation method.
5. Compare the estimated fundamental frequency with the actual fundamental
frequency of the speech signal.

Program:

The following code shows how to estimate the fundamental frequency of a speech segment in
MATLAB using the autocorrelation method:

Code snippet
clc
clear
close all
% Program to find the autocorrelation of a speech segment and estimate its pitch
[y,Fs]=audioread('aa sound.wav'); % input: short voiced speech segment
y=y(:,1); % keep the first channel if the file is stereo
y=y/max(abs(y)); % normalise to unit peak amplitude
N=length(y);
t=(1:N)/Fs*1000; % time axis in milliseconds
subplot(2,1,1);
plot(t,y);
title('A 30 millisecond segment of speech');
% Short-term autocorrelation: autocor(l+1) holds the value at lag l
autocor=zeros(1,N);
for l=0:(N-1)
    autocor(l+1)=sum(y(1:N-l).*y(1+l:N));
end
kk=(0:N-1)/Fs*1000; % lag axis in milliseconds, starting at lag 0
subplot(2,1,2);
plot(kk,autocor);
title('Autocorrelation of the 30 millisecond segment of speech');
% Search lags 20 to 159 samples for the largest peak. At Fs = 8 kHz this
% covers 400 Hz down to about 50 Hz; scale the limits for other sampling rates.
min_lag=20;
max_lag=159;
[~,sample_no]=max(autocor(min_lag+1:max_lag+1));
pitch_lag=min_lag+sample_no-1; % lag, in samples, of the strongest peak
pitch_period_To=pitch_lag/Fs
pitch_freq_Fo=1/pitch_period_To


This code loads the speech signal “aa sound.wav” and plots its waveform. The short-term
autocorrelation is computed with an explicit loop over the lags and plotted. The fundamental
frequency is then estimated from the lag of the largest autocorrelation peak within the search
range, and the estimated pitch period and pitch frequency are displayed.
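
Step 5 of the procedure asks for a comparison with the actual fundamental frequency, which is not known for a recorded voice. One way to check the method is to run it on a synthetic signal whose fundamental is known. The sketch below is only an illustration: the sampling rate, the 200 Hz fundamental, the harmonic amplitudes, and all variable names are assumptions, and the lag search range is derived from the sampling rate (50 Hz to 400 Hz) rather than fixed.

```matlab
% Synthetic "voiced" frame with a known fundamental (all values are assumed)
Fs = 8000;                    % sampling rate in Hz
f0_true = 200;                % known fundamental frequency in Hz
n = (0:239)';                 % 30 ms frame at 8 kHz
x = sin(2*pi*f0_true*n/Fs) + 0.5*sin(2*pi*2*f0_true*n/Fs) ...
    + 0.25*sin(2*pi*3*f0_true*n/Fs);

% Short-term autocorrelation: r(lag+1) holds the value at that lag
N = length(x);
r = zeros(N, 1);
for lag = 0:N-1
    r(lag+1) = sum(x(1:N-lag) .* x(1+lag:N));
end

% Search only the lags that correspond to 400 Hz down to 50 Hz
minLag = ceil(Fs/400);
maxLag = floor(Fs/50);
[~, idx] = max(r(minLag+1:maxLag+1));
lagEst = minLag + idx - 1;    % pitch period in samples
f0_est = Fs / lagEst;         % estimated fundamental in Hz

fprintf('True F0 = %.1f Hz, estimated F0 = %.1f Hz\n', f0_true, f0_est);
```

If the estimate on the synthetic frame is close to the known value, the same lag search can be applied to the recorded segment with more confidence.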

Data and Results:


Analysis and Inferences:

The estimated pitch frequency is 412.1495 Hz.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of fundamental frequency in speech signals and its significance
in speech perception and analysis?

2. What is the relationship between fundamental frequency and the perceived pitch of a
speech signal?

3. Discuss some common methods or algorithms used for fundamental frequency estimation in
speech signals. How do they work, and what are their limitations?

4. Explain the concept of autocorrelation-based fundamental frequency estimation. How does
autocorrelation help in identifying the periodicity of a speech signal and estimating the
fundamental frequency?

5. What are the challenges associated with accurate fundamental frequency estimation in
speech signals, such as variations in pitch, background noise, or voice quality?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 8: Speech Format Synthesis


Aim/Objective: To understand speech synthesis and how it can be used to generate speech signals
from text.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is speech format synthesis, and why is it important in speech signal processing?

2. Can you explain the concept of speech formats, such as formants, pitch, and duration, and
how they contribute to the perception of speech?

3. Describe the main steps involved in speech format synthesis, such as text-to-speech
conversion or speech reconstruction from acoustic features.

4. What are some common techniques or methods used in speech format synthesis, such as
rule-based synthesis, concatenative synthesis, or statistical parametric synthesis?

5. How does the choice of speech synthesis method impact the quality and naturalness of the
synthesized speech?

In-Lab:

Procedure:

1. Write a text file containing the text you want to synthesize.
2. Load the text file into MATLAB.
3. Generate speech from the text file using a speech synthesis model.
4. Play the synthesized speech.

Program:

The following code shows how to perform simple text-to-speech synthesis in MATLAB:

Code snippet*
% Program to do text to speech.
% Uses the .NET System.Speech assembly, so it runs only on Windows.
% Get user's sentence
userPrompt = 'What do you want the computer to say?';
titleBar = 'Text to Speech';
defaultString = 'Hello KL! Good morning!';
caUserInput = inputdlg(userPrompt, titleBar, 1, {defaultString});
if isempty(caUserInput)
    return; % Bail out if they clicked Cancel.
end
caUserInput = char(caUserInput); % Convert from cell array to character array.
NET.addAssembly('System.Speech'); % Load the .NET speech assembly.
obj = System.Speech.Synthesis.SpeechSynthesizer; % Create the synthesizer object.
obj.Volume = 100; % Volume ranges from 0 to 100.
Speak(obj, caUserInput); % Speak the text through the default audio device.


*https://in.mathworks.com/matlabcentral/answers/159113-text-to-speech-synthesis-matlab-code

This code prompts the user for a sentence through a dialog box, creates a .NET
System.Speech.Synthesis.SpeechSynthesizer object, sets its volume to 100, and speaks the sentence
through the computer's audio output.
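
The Pre-Lab questions mention formants, pitch, and duration as the parameters that shape synthesized speech. As a rough illustration of how those three parameters act together, the sketch below builds a vowel-like sound with a source-filter model: an impulse train sets the pitch, three two-pole resonators set the formants, and the signal length sets the duration. It is not the method used by the text-to-speech program above. The formant frequencies (commonly quoted averages for the vowel /a/), the bandwidths, and every variable name are assumptions chosen for illustration.

```matlab
% Source-filter sketch of a vowel-like sound (all parameter values are assumed)
Fs  = 8000;                      % sampling rate in Hz
f0  = 120;                       % target pitch in Hz
dur = 0.5;                       % duration in seconds
N   = round(dur * Fs);

% Source: impulse train; the period is rounded to a whole number of samples,
% so the actual pitch is Fs/period, slightly different from f0
period = round(Fs / f0);
src = zeros(N, 1);
src(1:period:N) = 1;

% Filter: cascade of two-pole resonators, one per formant
formants   = [730 1090 2440];    % formant centre frequencies in Hz
bandwidths = [80 90 120];        % formant bandwidths in Hz
y = src;
for k = 1:numel(formants)
    r     = exp(-pi * bandwidths(k) / Fs);   % pole radius from bandwidth
    theta = 2 * pi * formants(k) / Fs;       % pole angle from frequency
    y = filter(1, [1, -2*r*cos(theta), r^2], y);
end
y = y / max(abs(y));             % normalise to unit peak amplitude

% sound(y, Fs);                  % uncomment to listen to the result
```

Changing `f0` alters the perceived pitch, changing `formants` alters the vowel quality, and changing `dur` alters the length of the sound.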

Data and Results:

obj =

SpeechSynthesizer with properties:

State: Ready
Rate: 0
Volume: 100
Voice: [1x1 System.Speech.Synthesis.VoiceInfo]

Analysis and Inferences:

We observed that a line of text can be converted into speech using a text-to-speech (TTS) system
in MATLAB.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of speech format synthesis and its significance in speech signal
processing?

2. Describe the main components or parameters involved in speech format synthesis, such as
formants, pitch, and duration.

3. Discuss the difference between rule-based synthesis, concatenative synthesis, and statistical
parametric synthesis in speech format synthesis. How do these methods generate
synthesized speech?

4. How does the choice of speech synthesis method affect the quality, naturalness, and
intelligibility of the synthesized speech?

5. Explain the concept of prosody in speech synthesis and its role in conveying intonation,
rhythm, and emphasis. How is prosody modeled and incorporated into the synthesized
speech?


Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 9: Linear Prediction Analysis


Aim/Objective: To understand linear prediction analysis of speech signals

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is linear prediction analysis, and why is it widely used in speech signal processing?

2. Can you explain the underlying principle of linear prediction analysis and how it models
speech signals?

3. What are the main steps involved in performing linear prediction analysis on a speech
signal?

4. How does the prediction error or residual signal provide information about the
characteristics of the speech signal?

5. What are the parameters of linear prediction analysis, such as the order of the prediction
filter, and how do they impact the accuracy and quality of the analysis?

In-Lab:

Procedure:

1. Record your voice for a few seconds.
2. Plot the waveform of the recorded speech signal.
3. Calculate the autocorrelation function of the speech signal.
4. Use the autocorrelation function to estimate the linear prediction coefficients of the
speech signal.

Program:

The following code shows how to estimate the linear prediction (LP) coefficients of a speech signal
in MATLAB:

Code snippet
clear;
close all;

% Voiced sound
phons = audioread('e_sound.wav');
x = phons(36095:36700, 1); % short voiced segment, first channel, as a column
len_x = length(x);
% The signal is windowed
w = hamming(len_x);
wx = w.*x;
% LPC autocorrelation method
order = 30;
% LPC function of MATLAB is used
[lpcoefs, errorPow] = lpc(wx, order);
% The estimated signal is calculated as the output of linearly filtering
% the speech signal with the coefficients estimated above
estx = filter([0 -lpcoefs(2:end)], 1, [wx; zeros(order,1)]);
% The prediction error is estimated in the interval 0<=m<=N-1+p
er = [wx; zeros(order,1)] - estx;
% Prediction error energy in the same interval
erEn = sum(er.^2);
% Autocorrelation of the prediction error
[acs,lags] = xcorr(er);
% Calculate the frequency response of the linear prediction model at the
% same 513 frequencies (0 to pi) as the first half of the 1024-point FFT
[H, W] = freqz(sqrt(erEn), lpcoefs, linspace(0, pi, 513));
% Calculate the spectrum of the windowed signal
S = abs(fft(wx,1024));
% Calculate the spectrum of the error signal
eS = abs(fft(er,1024));
% Display results
subplot(5,1,1);
plot([wx; zeros(order,1)],'g');
title('Voiced phoneme - Linear Predictive Analysis, Autocorrelation Method');
hold on;
plot(estx);
hold off;
xlim([0 length(er)])
legend('Speech Signal','Estimated Signal');
subplot(5,1,2);
plot(er);
xlim([0 length(er)])
legend('Error Signal');
subplot(5,1,3);
plot(linspace(0,0.5,513), 20*log10(abs(H)));
hold on;
plot(linspace(0,0.5,513), 20*log10(S(1:513)), 'g');
legend('Model Frequency Response','Speech Spectrum')
hold off;
subplot(5,1,4);
plot(lags, acs);
legend('Prediction Error Autocorrelation')
subplot(5,1,5);
plot(linspace(0,0.5,513), 20*log10(eS(1:513)));
legend('Prediction Error Spectrum')

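This code loads the speech signal e_sound.wav, applies a Hamming window to a short voiced segment,
and estimates the LP coefficients using a predictor of order 30. It then plots the windowed signal
together with its predicted estimate, the prediction error, the model frequency response against
the speech spectrum, and the autocorrelation and spectrum of the prediction error.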
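
Steps 3 and 4 of the procedure (autocorrelation, then LP coefficients) can also be carried out by hand, without the lpc() function, by solving the autocorrelation normal equations directly. The sketch below is an illustration under stated assumptions: the test signal is the impulse response of a known second-order all-pole filter (the coefficients in `a_true` are made up), so the recovered coefficients can be checked against the true ones. All variable names are hypothetical.

```matlab
% Test signal: impulse response of a known all-pole filter 1/A(z) (assumed values)
a_true = [1 -1.2 0.8];
N = 400;
x = filter(1, a_true, [1; zeros(N-1, 1)]);

% Step 3: autocorrelation for lags 0..p
p = 2;                               % predictor order
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(x(1:N-k) .* x(1+k:N));
end

% Step 4: solve the normal equations R*alpha = [r(1); ...; r(p)],
% where R is the p-by-p Toeplitz matrix of autocorrelation values
R = toeplitz(r(1:p));
alpha = R \ r(2:p+1);                % predictor: x_hat(n) = sum_k alpha(k)*x(n-k)
a_est = [1; -alpha]';                % same convention as the output of lpc()

% Prediction error (residual) obtained by inverse filtering
err = filter(a_est, 1, x);

disp(a_est);
```

For this idealised input the estimated coefficients should match `a_true` closely and the residual should be close to a single impulse. Real speech frames give only an approximate fit, which is why the prediction error in the program above is not zero.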

Data and Results:


Analysis and Inferences:

In this experiment, linear predictive analysis is performed using the autocorrelation method. It is
observed that the model frequency response traces the speech spectrum more closely as the order of
the LPC predictor increases.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of linear prediction analysis and its significance in speech signal
processing?

2. What is the underlying principle of linear prediction analysis? How does it model speech
signals?

3. Describe the main steps involved in performing linear prediction analysis on a speech signal.


4. How does the prediction error or residual signal obtained in linear prediction analysis
provide information about the characteristics of the speech signal?

5. What are the parameters involved in linear prediction analysis, such as the order of the
prediction filter? How do these parameters affect the accuracy and quality of the analysis?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment 10: Cepstral Analysis of Speech Signal


Aim/Objective: To understand cepstral analysis of speech signals.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is cepstral analysis, and how does it differ from traditional spectral analysis of speech
signals?

2. Can you explain the concept of quefrency and how it is related to the time domain
representation of speech signals in cepstral analysis?

3. What are the main steps involved in performing cepstral analysis on a speech signal?

4. How is the cepstrum calculated from the speech signal, and what information does it
provide about the underlying source and filter characteristics of the speech?

5. Describe the process of liftering in cepstral analysis and its impact on the resulting cepstral
coefficients.

In-Lab:

Procedure:

1. Record your voice for a few seconds.
2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Calculate the cepstrum of the speech signal.

Program:

The following code shows how to calculate the cepstrum of a speech signal in MATLAB:

Code snippet
clear;
close all;

% get a section of vowel
[x,fs]=audioread('clip_aa.wav');
x=x(:,1); % keep the first channel if the file is stereo
ms1=round(fs/1000); % samples in 1 ms: maximum speech Fx at 1000Hz
ms20=round(fs/50); % samples in 20 ms: minimum speech Fx at 50Hz
t=(0:length(x)-1)/fs; % times of sampling instants
subplot(3,1,1);
plot(t,x);
legend('Waveform');
xlabel('Time (s)');
ylabel('Amplitude');
% do fourier transform of windowed signal
Y=fft(x.*hamming(length(x)));
% plot spectrum of bottom 5000Hz (or up to fs/2 if that is lower)
hz5000=floor(min(5000,fs/2)*length(Y)/fs);
f=(0:hz5000)*fs/length(Y);
subplot(3,1,2);
plot(f,20*log10(abs(Y(1:length(f)))+eps));
legend('Spectrum');
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
% cepstrum is DFT of log spectrum (the inverse DFT gives the same magnitude
% up to a scale factor, because the log magnitude spectrum is real and even)
C=fft(log(abs(Y)+eps));
% plot between 1ms (=1000Hz) and 20ms (=50Hz); C(k+1) holds quefrency k/fs
q=(ms1:ms20)/fs;
subplot(3,1,3);
plot(q,abs(C(ms1+1:ms20+1)));
legend('Cepstrum');
xlabel('Quefrency (s)');
ylabel('Amplitude');

This code loads the speech signal clip_aa.wav and plots its waveform, its magnitude spectrum up to
5000 Hz, and its cepstrum for quefrencies between 1 ms and 20 ms.
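
For a voiced sound, the pitch period appears as a peak in the cepstrum. The sketch below illustrates this on a synthetic signal whose pitch period is known, so that the position of the peak can be checked. It is only a sketch under assumptions: the 80-sample period, the single resonance at 700 Hz, the 60 Hz to 400 Hz search range, and all variable names are made up for the example, the Hamming window is written out explicitly so that no toolbox is needed, and the real cepstrum is computed with the inverse FFT.

```matlab
% Synthetic voiced frame: impulse train through one resonator (assumed values)
Fs = 8000;                        % sampling rate in Hz
P  = 80;                          % pitch period in samples (100 Hz)
N  = 640;                         % frame length: 8 pitch periods
src = zeros(N, 1);
src(1:P:N) = 1;
r = 0.9;  theta = 2*pi*700/Fs;    % resonance near 700 Hz
x = filter(1, [1, -2*r*cos(theta), r^2], src);

% Hamming window written out explicitly
w = 0.54 - 0.46*cos(2*pi*(0:N-1)'/(N-1));

% Real cepstrum: inverse FFT of the log magnitude spectrum
X = fft(x .* w, 1024);
c = real(ifft(log(abs(X) + eps)));   % c(n+1) holds quefrency n samples

% Look for the largest peak between 400 Hz and 60 Hz
minLag = ceil(Fs/400);
maxLag = floor(Fs/60);
[~, idx] = max(c(minLag+1:maxLag+1));
lagEst = minLag + idx - 1;        % estimated pitch period in samples
f0_est = Fs / lagEst;             % estimated fundamental in Hz

fprintf('Cepstral peak at %d samples, F0 estimate %.1f Hz\n', lagEst, f0_est);
```

The low-quefrency part of `c` (below `minLag`) mainly describes the resonator, that is, the vocal-tract filter, while the peak near `P` describes the excitation. This separation is what liftering exploits.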

Data and Results:


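Analysis and Inferences:

We obtain the cepstrum over the quefrency region (1 ms to 20 ms) that corresponds to typical speech
fundamental frequencies (1000 Hz down to 50 Hz), as shown in the figure. For a voiced sound, a peak
in this region marks the pitch period.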

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of cepstral analysis and how it differs from traditional spectral
analysis in speech signal processing?

2. Describe the main steps involved in performing cepstral analysis on a speech signal.

3. What is the quefrency domain in cepstral analysis, and how does it relate to the time
domain representation of speech signals?

4. How is the cepstrum calculated from the speech signal, and what information does it
provide about the underlying source and filter characteristics of the speech?

5. Explain the concept of liftering in cepstral analysis and its purpose in enhancing or modifying
the cepstral coefficients.

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment Title: MFCC Extraction from Speech Signal


Aim/Objective: To understand Mel-frequency cepstral coefficients (MFCCs) and how they can be
extracted from speech signals.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What are Mel Frequency Cepstral Coefficients (MFCC), and why are they widely used in
speech signal processing?

2. Can you explain the process of extracting MFCC from a speech signal? What are the main
steps involved?

3. What is the Mel scale, and how does it relate to the human perception of pitch and
frequency?

4. Describe the process of converting the linear frequency scale to the Mel scale in MFCC
extraction.

5. How are filterbanks used in MFCC extraction? What is their purpose, and how are they
designed?


In-Lab:

Procedure:

1. Record your voice for a few seconds.
2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Calculate the MFCCs of the speech signal.

Program:

The following code shows how to extract the MFCCs of a speech signal in MATLAB:

Code snippet
clc
clear;
close all;
[s,fs]=audioread('aa sound.wav');
s=s(:,1); % keep the first channel if the file is stereo
t=(1:length(s))/fs*1000; % time axis in milliseconds
s1=s/max(abs(s)); % normalised copy for plotting
figure(1)
plot(t,s1);
xlabel('Time, ms')
ylabel('Amplitude')
Ws=1024; % frame (window) size in samples
Ol=512; % frame step in samples (50% overlap)
L=floor((length(s)-Ws)/Ol)+1; % number of complete frames
N=12; % number of MFCCs per frame
ccs=zeros(N,L);
for n=1:L
    seg=s(1+(n-1)*Ol:Ws+(n-1)*Ol);
    % hamming(Ws) returns a column vector, matching seg; 40 mel filters are used
    ccs(:,n)=mfcc_model(seg.*hamming(Ws),40,N,fs);
end
figure(2)
waterfall((1:L)*length(s)/(L*fs),1:N,ccs)
xlabel('Time, s')
ylabel('MFCC index')
zlabel('Amplitude')

Required Functions (save each in its own file, spread_mel.m and mfcc_model.m, on the MATLAB path):
function band=spread_mel(hz_points,hz_c,hz_size,hz_max)
% Triangular mel filter centred on hz_points(hz_c)
% hz_points is an array of filter centre frequencies in Hz
% hz_c is the index of the current filter
% hz_size is the number of spectrum bins, spanning up to hz_max Hz
band=zeros(1, hz_size);
hz1=hz_points(max(1,hz_c-1)); %start
hz2=hz_points(hz_c); %middle
hz3=hz_points(min(length(hz_points),hz_c+1)); %end
for hi=1:hz_size
    hz=hi*hz_max/hz_size;
    if hz > hz3
        band(hi)=0;
    elseif hz>=hz2 && hz3>hz2 % falling slope (guard avoids 0/0 for the last filter)
        band(hi)=(hz3-hz)/(hz3-hz2);
    elseif hz>=hz1 && hz2>hz1 % rising slope (guard avoids 0/0 for the first filter)
        band(hi)=(hz-hz1)/(hz2-hz1);
    else
        band(hi)=0;
    end
end

function cc=mfcc_model(seg, N, M, fs)
% Do FFT of audio frame seg, map to M MFCCs
% from 0 Hz to Fs/2 Hz, using N filterbanks
% typical values N=26,M=12,Fs=8000,seg~20ms
m_low=0; %mel span lower limit
m_top=2595*log10(1+((fs/2)/700)); %mel span upper limit (Fs/2 in mel)
%Define an array of N centre frequencies, equally spaced in mel
xm=linspace(m_low,m_top,N);
%Convert this to Hz frequencies
xf=700*(10.^(xm/2595)-1);
%Take the FFT of the speech...
S=fft(seg);
S=abs(2*(S.*S)/length(S)); %power spectrum
S=S(1:floor(length(S)/2)); %keep 0 Hz to Fs/2
%Compute the mel filterbank outputs
x1=zeros(1,N);
for xi=1:N
    band=spread_mel(xf,xi,length(S),fs/2);
    x1(xi)=sum(band.*S(:)');
end
x=log(x1+eps); %eps avoids log(0) for an empty band
%Convert to MFCC using loop (could use matrix)
cc=zeros(1,M);
for xc=1:M
    cc(xc)=sqrt(2/N)*sum(x.*cos(pi*xc*([1:N]-0.5)/N));
end

This code loads the speech signal “aa sound.wav”, splits it into 1024-sample frames with 50% overlap,
and calculates the MFCCs of each frame using 40 mel filters and 12 cepstral coefficients. The MFCCs
are plotted against time as a waterfall plot.
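
The mel-scale conversion used inside mfcc_model can be examined on its own. The short sketch below uses the same formula as the program, mel = 2595 log10(1 + f/700), to place filter centre frequencies equally in mel between 0 Hz and Fs/2 and convert them back to Hz. The sampling rate, the number of filters, and the function-handle names are assumptions chosen for illustration.

```matlab
% Hz <-> mel conversion (same formula as in mfcc_model)
hz2mel = @(f) 2595 * log10(1 + f/700);
mel2hz = @(m) 700 * (10.^(m/2595) - 1);

fs = 8000;           % assumed sampling rate in Hz
nFilters = 10;       % assumed number of mel filters

% Centre frequencies: equally spaced in mel, then mapped back to Hz
melPoints = linspace(hz2mel(0), hz2mel(fs/2), nFilters);
centresHz = mel2hz(melPoints);

disp(round(centresHz));                 % centre frequencies in Hz
disp(round(diff(centresHz)));           % spacing between neighbouring centres
```

The spacing between neighbouring centre frequencies grows with frequency, so the filterbank has fine resolution at low frequencies and coarse resolution at high frequencies, in line with the way pitch is perceived.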


Data and Results:


Analysis and Inferences:

We analysed a speech recording with a bank of 40 mel-spaced filters and extracted 12 MFCCs per
frame, following the MFCC method, as shown in the figure.

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of Mel Frequency Cepstral Coefficients (MFCC) and their
significance in speech signal processing?

2. Describe the main steps involved in extracting MFCC from a speech signal.

3. How is the Mel scale used to convert the linear frequency scale to the Mel scale in MFCC
extraction? What is the motivation behind using the Mel scale?

4. Explain the concept of filterbanks in MFCC extraction. How are the filterbanks designed, and
what role do they play in capturing the spectral information of the speech signal?

5. Discuss the role of the Discrete Cosine Transform (DCT) in MFCC extraction. How does the
DCT compress the cepstral coefficients obtained

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment Title: Speech Enhancement


Aim/Objective: To understand speech enhancement and how it can be used to improve the quality
of speech signals in noisy environments.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is speech enhancement, and why is it important in speech signal processing?

2. Can you explain the difference between additive noise and non-additive noise in the context
of speech signals?

3. What are the main challenges in speech enhancement, such as noise reduction while
preserving speech intelligibility and quality?

4. Describe some common techniques or methods used in speech enhancement, such as
spectral subtraction, Wiener filtering, or subspace methods.

5. How does the choice of a noise model influence the speech enhancement algorithm?


In-Lab:

Procedure:

1. Record your voice for a few seconds in a noisy environment.
2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Apply a speech enhancement algorithm to the speech signal.
5. Plot the waveform of the enhanced speech signal.

Program*:

The following code shows how to apply a simple speech enhancement (noise reduction) algorithm in
MATLAB:

Code snippet
clear
close all
clc
% noiseReduction_YW and plotWave_YW are helper functions from the source
% cited in the footnote; they must be on the MATLAB path.
[clean, fs] = audioread('jarvus.wav'); % clean reference speech
[noise] = audioread('jarvus_pub.wav'); % noisy speech
output = noiseReduction_YW(noise, fs); % enhanced (de-noised) speech
% Left column: waveforms; right column: spectra
subplot(3,2,1)
plotWave_YW(0,clean,fs,'time',1);
title('Clean speech')
subplot(3,2,2)
plotWave_YW(0,clean,fs,'freq');
subplot(3,2,3)
plotWave_YW(0,noise,fs,'time',1);
title('Noisy speech')
subplot(3,2,4)
plotWave_YW(0,noise,fs,'freq');
subplot(3,2,5)
plotWave_YW(0,output,fs,'time',1);
title('Enhanced speech')
subplot(3,2,6)
plotWave_YW(0,output,fs,'freq');

Data and Results:


* https://medium.com/audio-processing-by-matlab/noise-reduction-by-wiener-filter-by-matlab-44438af83f96

Analysis and Inferences:

We performed the experiment to de-noise the speech signal; the clean, noisy, and enhanced signals
are shown in the figure.
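
The idea behind spectral-domain noise reduction can be sketched without the external helper functions. The example below is not the algorithm of noiseReduction_YW; it is a minimal power spectral subtraction gain (a simple approximation to the Wiener filter) applied to a synthetic signal, under several assumptions: the "speech" is a 440 Hz tone, the noise is white, the noise power is estimated from a noise-only lead-in, and the whole signal is processed as one frame. All names and values are hypothetical.

```matlab
% Synthetic noisy signal: noise-only lead-in followed by a tone (assumed values)
Fs   = 8000;
lead = 2000;                              % noise-only samples at the start
n    = (0:Fs-1)';
clean = [zeros(lead, 1); sin(2*pi*440*n/Fs)];
noisy = clean + 0.5*randn(size(clean));   % additive white noise

% Estimate the noise power from the lead-in, where no signal is present
noisePow = mean(noisy(1:lead).^2);

% Spectral gain: G = max(|Y|^2 - Pn, 0) / |Y|^2, applied bin by bin
Y  = fft(noisy);
Pn = length(noisy) * noisePow;            % expected |FFT|^2 of white noise per bin
G  = max(abs(Y).^2 - Pn, 0) ./ (abs(Y).^2 + eps);
enhanced = real(ifft(G .* Y));

% Signal-to-noise ratio before and after enhancement
snr_in  = 10*log10(sum(clean.^2) / sum((noisy - clean).^2));
snr_out = 10*log10(sum(clean.^2) / sum((enhanced - clean).^2));
fprintf('SNR before: %.1f dB, after: %.1f dB\n', snr_in, snr_out);
```

Bins dominated by noise receive a gain near zero, while bins dominated by the signal receive a gain near one, so the output SNR should be higher than the input SNR. Practical systems apply the same gain frame by frame with overlapping windows, because speech and noise statistics change over time.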

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of speech enhancement and its importance in speech signal
processing?

2. What are some common types of noise that affect speech signals, and how do they degrade
the quality and intelligibility of speech?


3. Describe some common techniques or methods used in speech enhancement, such as
spectral subtraction, Wiener filtering, or subspace methods. How do these methods work to
reduce noise in speech signals?

4. Discuss the challenges associated with speech enhancement, such as preserving speech
intelligibility while reducing noise artifacts or distortion.

5. How do the characteristics of the noise impact the choice of speech enhancement
algorithms? What factors should be considered when selecting an appropriate algorithm for
a given noise scenario?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.


Experiment Title: Speech Recognition


Aim/Objective: To understand speech recognition and how it can be used to convert speech signals
into text.

Pre-Requisites:

Computer with MATLAB installed, microphone, audio recording software.

Pre-Lab:

1. What is speech enhancement, and why is it important in speech signal processing?

2. Can you explain the types of noise commonly encountered in speech signals, such as
additive noise, background noise, or reverberation?

3. What are the main challenges in speech enhancement, such as reducing noise while
preserving speech intelligibility and quality?

4. Describe some common techniques or methods used in speech enhancement, such as
spectral subtraction, Wiener filtering, or adaptive filtering.

5. How does the choice of a noise model influence the speech enhancement algorithm? What
factors should be considered when selecting an appropriate noise model?


In-Lab:

Procedure:

1. Record your voice for a few seconds.
2. Plot the waveform of the recorded speech signal.
3. Calculate the power spectrum of the speech signal.
4. Extract MFCCs from the speech signal.
5. Train a speech recognition model on the MFCCs.
6. Recognize the speech signal using the trained model.

Program:

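The following code plots a speech segment and the frequency response of its linear prediction (LP) filter in MATLAB. The LP spectral envelope is one feature that can be used to compare voices: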

Code snippet
clear all;
close all;
[x,fs]=audioread('male2.wav',[24120 25930]);
% resample to 10,000Hz (optional)
x=resample(x,10000,fs);
fs=10000;
%
% plot waveform
t=(0:length(x)-1)/fs; % times of sampling instants
subplot(2,1,1);
plot(t,x);
legend('Waveform');
xlabel('Time (s)');
ylabel('Amplitude');
%
% get Linear prediction filter
ncoeff=2+fs/1000; % rule of thumb for formant estimation
a=lpc(x,ncoeff);
%
% plot frequency response
[h,f]=freqz(1,a,512,fs);
subplot(2,1,2);
plot(f,20*log10(abs(h)+eps));
legend('LP Filter');
xlabel('Frequency (Hz)');
ylabel('Gain (dB)');

Data and Results:

The results can be observed by running the program in MATLAB.

Analysis and Inferences:

We compared different voices using MATLAB.
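
Steps 5 and 6 of the procedure (train a model, then recognize) are not covered by the program above. As a rough illustration of how a comparison of voices can be turned into a decision, the sketch below matches the LP spectral envelope of a test signal against two stored templates and picks the closer one. It is only a toy example under assumptions: each "speaker" is a made-up single-resonance all-pole filter, the LP order is 2, the test signal is synthetic, and all names are hypothetical. A practical recognizer would use richer features such as MFCCs and a trained statistical model.

```matlab
Fs = 8000;  N = 400;

% Templates: one made-up all-pole model per hypothetical speaker
resonator = @(f, r) [1, -2*r*cos(2*pi*f/Fs), r^2];
aA = resonator(500, 0.95);       % "speaker A": resonance near 500 Hz
aB = resonator(1500, 0.95);      % "speaker B": resonance near 1500 Hz

% Log spectral envelope (dB) of an all-pole filter 1/A(z) on a 512-point grid
envelope = @(a) 20*log10(abs(1 ./ fft(a(:), 512)));
half = 1:257;                    % bins from 0 Hz to Fs/2

% Test signal: speaker A's filter excited by a 100 Hz impulse train
src = zeros(N, 1);
src(1:80:N) = 1;
x = filter(1, aA, src);

% Order-2 LP analysis of the test signal (autocorrelation method)
r = zeros(3, 1);
for k = 0:2
    r(k+1) = sum(x(1:N-k) .* x(1+k:N));
end
alpha = toeplitz(r(1:2)) \ r(2:3);
aTest = [1; -alpha];

% Distance between the test envelope and each template (RMS difference in dB)
eT = envelope(aTest);  eA = envelope(aA);  eB = envelope(aB);
distA = sqrt(mean((eT(half) - eA(half)).^2));
distB = sqrt(mean((eT(half) - eB(half)).^2));

if distA < distB
    recognised = 'A';
else
    recognised = 'B';
end
fprintf('Distance to A: %.2f dB, to B: %.2f dB, recognised as %s\n', ...
        distA, distB, recognised);
```

The template with the smaller distance is taken as the recognised speaker. The same nearest-template rule carries over when the envelopes are replaced by MFCC vectors averaged over many frames.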

Post-Lab:

Sample VIVA-VOCE Questions (In-Lab):

1. Can you explain the concept of speech enhancement and its significance in speech signal
processing?


2. What are some common types of noise that affect speech signals, and how do they impact
the quality and intelligibility of speech?

3. Describe some popular techniques or algorithms used in speech enhancement, such as
spectral subtraction, Wiener filtering, or adaptive filtering. How do these methods work to
reduce noise in speech signals?

4. Discuss the challenges involved in speech enhancement, such as the trade-off between noise
reduction and preserving speech quality.

5. Explain the concept of signal-to-noise ratio (SNR) and its role in evaluating the effectiveness
of speech enhancement algorithms. Are there any other objective or subjective metrics used
for assessing speech enhancement quality?

Evaluator Remark (if Any):

Marks Secured: _____out of 50

Signature of the Evaluator with Date

Evaluator MUST ask Viva-voce prior to signing and posting marks for each experiment.
