Speech Processing Lab Manual
Program 2: Write a MATLAB program that reads a speech file and plots the waveform,
spectrum and autocorrelation sequence of any three unvoiced segments present in the
given speech signal.
Aim:
To write a MATLAB program that reads a speech file and plots the waveform, spectrum
and autocorrelation sequence of any three unvoiced segments present in the given speech
signal.
Software Required:
MATLAB
Hardware Required:
Personal PC, Microphones and Speakers
Program:
clear all;
fp=fopen ('c:\a.wav', 'r');
fseek (fp, 21044,-1);
a=fread (fp, 8000);
a=a-128; % remove the offset of the 8-bit unsigned samples
plot (a);
xlabel('sample number');
ylabel('amplitude');
title('plot of the utterance h');
Fs = 22100;
t = 0:1/Fs:.296;
h=spectrum.welch; % Welch spectral estimator object
d = psd (h, a, 'Fs', Fs); % Calculate the PSD
figure;
plot (d);
% to find pitch period of the signal using spectrum of unvoiced speech
clear all;
fp=fopen('c:/h.wav');
fseek (fp, 38044,-1);
a=fread (fp, 2048);
subplot (2,1,1); plot(a);
title('plot of unvoiced part of a signal');
xlabel('sample number');
ylabel('amplitude');
a=a-128; % remove the offset of the 8-bit unsigned samples
b=abs (fft (a));
c=log (b.^2); % element-wise square: b is a vector
f = 22100/2048:22100/2048:22100;
subplot (2,1,2);
plot (f, c);
title('plot of FFT of a signal');
xlabel('frequency');
ylabel('amplitude');
disp (b);
for k=1:400, sum (k) =0; end
for k=1:400,
for i=1:45,
sum (k) =sum (k) + (b (i) *b (i+k));
sum (k) =sum (k)/52000;
end
end
figure;
plot (sum);
title('plot of spectral correlation of unvoiced signal');
xlabel('sample number');
ylabel('correlation');
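The listing above correlates FFT magnitudes, not time samples. For the time-domain autocorrelation sequence that the exercise asks for, a minimal sketch is given below. The noise segment stands in for an unvoiced segment, and the names seg, maxLag and r, as well as the normalisation by the lag-0 value, are illustrative assumptions that are not part of the original program.

```matlab
% Sketch: short-time autocorrelation of one segment (assumed names and data).
rng(0);                          % fixed seed so the example is repeatable
seg = randn(2048, 1);            % stand-in for an unvoiced (noise-like) segment
seg = seg - mean(seg);           % remove any DC offset
N = length(seg);
maxLag = 400;                    % number of lags to evaluate
r = zeros(maxLag + 1, 1);        % r(k+1) holds the value at lag k
for k = 0:maxLag
    % product of the segment with a copy of itself shifted by k samples
    r(k + 1) = sum(seg(1:N - k) .* seg(1 + k:N));
end
r = r / r(1);                    % normalise so that lag 0 equals 1
figure;
plot(0:maxLag, r);
title('autocorrelation of the segment');
xlabel('lag (samples)');
ylabel('normalised correlation');
```

Run the sketch in a fresh workspace: the listing above assigns to a variable called sum, which hides the built-in function of the same name until it is cleared. For an unvoiced segment the sequence is expected to fall away quickly from lag 0 with no strong periodic peak.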
Program 3: Write a MATLAB program that reads a speech file and plots the waveform,
spectrum and autocorrelation sequence of any three silence regions present in the given
speech signal.
f0 = pitch(audioIn,fs,WindowLength=windowLength,OverlapLength=overlapLength, ...
    Range=[50,250]);
figure
tiledlayout(2,1)
nexttile()
plot(timeVector,audioIn)
axis([(110e3/fs) (135e3/fs) -1 1])
ylabel("Amplitude")
xlabel("Time (s)")
title("Utterance - Two")
nexttile()
timeVectorPitch = linspace(twoStart/fs,twoStop/fs,numel(f0));
plot(timeVectorPitch,f0,"*")
axis([(110e3/fs) (135e3/fs) min(f0) max(f0)])
ylabel("Pitch (Hz)")
xlabel("Time (s)")
title("Pitch Contour")
OUTPUT:
% Compute MFCC
coeffs = mfcc(audioIn, fs, 'NumCoeffs', numCoeffs, ...
'WindowLength', winLengthSamples, ...
'OverlapLength', overlapLengthSamples, ...
'NumFilters', numFilters);
% Parameters
fs_new = 16000; % New sampling rate for G.722
N = 2; % Decimation factor
% ADPCM encoder for the low band (simple example, not actual G.722 ADPCM)
encoded_low = adpcm_encode(low_band_ds);
% ADPCM encoder for the high band (simple example, not actual G.722 ADPCM)
encoded_high = adpcm_encode(high_band_ds);
% Combine encoded sub-bands (this is simplified and not the actual G.722 bitstream)
encoded_signal = [encoded_low; encoded_high];
% Function for ADPCM encoding (simple example, not actual G.722 ADPCM)
function encoded = adpcm_encode(x)
% Initialize variables
x = x(:);
n = length(x);
encoded = zeros(n, 1);
predsample = 0;
step_size = 1;
for i = 1:n
diff = x(i) - predsample;
if diff >= 0
encoded(i) = 1;
else
encoded(i) = 0;
end
predsample = predsample + step_size * (2 * encoded(i) - 1);
end
end
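The encoder above sends one bit per sample (the sign of the prediction error) and moves the predictor by a fixed step, which is delta modulation and not an adaptive G.722 quantiser. A matching decoder only has to repeat the predictor update. The sketch below assumes the same step size of 1 and uses a short hand-made bit pattern; the variable names are illustrative.

```matlab
% Sketch: decoder for the 1-bit encoder above (assumed step size and input).
encoded = [1; 1; 0; 1; 0; 0];            % example bit stream from the encoder
step_size = 1;                           % must match the encoder
steps = step_size * (2 * encoded - 1);   % bit 1 -> +step, bit 0 -> -step
decoded = cumsum(steps);                 % predictor value after each sample
disp(decoded.');                         % staircase values 1 2 1 2 1 0
```

Because the step never adapts, the reconstruction can follow the input only at one step per sample, so fast changes are tracked with a lag.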
PROGRAM:
subplot(3, 1, 2);
plot(t, noisySignal);
title('Noisy Signal');
xlabel('Time (s)');
ylabel('Amplitude');
subplot(3, 1, 3);
plot(t, FilteredSignal);
title('Filtered Signal using Wiener Filter');
xlabel('Time (s)');
ylabel('Amplitude');
PROGRAM 8: Write a MATLAB program to find PLP coefficients for 256 samples of
voiced speech
Aim: To write a MATLAB program to find PLP coefficients for 256 samples of voiced
speech
SOFTWARE REQUIRED:
MATLAB
HARDWARE REQUIRED:
Personal computer, Microphones and speakers.
Program:
% program to find PLP coefficients for 256 samples of voiced speech
clear all;
fp=fopen('watermark.wav');
fseek (fp, 224000,-1);
a=fread (fp, 256);
% plot 256 points of voiced speech
subplot (2,1,1);
plot (a);
title('plot of voiced part of a signal'); xlabel('sample number');
ylabel('amplitude');
% find 256 point FFT
b=fft (a);
b1= (abs (b));
% calculation of squared power
for i=1:256,
b1 (i) =b1 (i) *b1 (i);
end
% calculate frequency in Hz for every FFT bin
for i=1:128,
f(i)=22100/256*i;
end
for i=1:128,
c1 (i)=b1 (i);
end
% calculate Bark scale frequency for each frequency in Hz corresponding to FFT bin
for i=1:128,
bark (i)=6*log (2*pi*f (i)/(1200*pi)+sqrt ( (2*pi*f (i)/(1200*pi))^2+1));
end
% plot spectrum in Bark scale for each FFT bin. And find cube root of power spectrum bin
% and plot it on the Bark scale.
for i=1:128,
c11 (i) =nthroot (c1 (i),3);
end
subplot (2,1,2);
stem (bark, c11);
title('plot of cube root of power spectrum in Bark scale for voiced speech');
xlabel('Frequency in Bark scale'); ylabel('Amplitude in dB');
for j=1:28,
sum (j)=0;
for i=1:58,
if ((bark (i)>10+ (j-1)*2.5) && (bark (i) <15+ (j-1)*2.5))
if (bark (i) <12.5+ (j-1)*2.5)
g(i)=((bark (i)-(2.5+2.5* (j-1)))*1/5); else
g(i)=((15+2.5* (j-1) -bark (i))*1/5);
end
sum (j)=sum (j)+c11 (i) *g (i);
end
end
end
% Find IFFT of the resulting integrated values considering it as a signal.
d=ifft (sum); d1=real (d);
for i=1:14,
x (i) =d1 (i);
end
% plot the first 14 PLP coefficients obtained by cepstral truncation.
stem (x);
title('plot of PLP coefficients for voiced speech');
xlabel('Frequency in mel scale');
ylabel('Amplitude in dB');
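The Bark warping commonly used in PLP analysis is 6*ln(w/(1200*pi) + sqrt((w/(1200*pi))^2 + 1)) with w = 2*pi*f, which is the same as 6*asinh(f/600). A vectorised sketch of that step is given below. It assumes the 22100 Hz sampling rate and 256-point FFT used in this program; the variable names are illustrative.

```matlab
% Sketch: Hz-to-Bark warping for the first 128 FFT bins (assumed names).
Fs = 22100;                          % sampling rate used in this manual
Nfft = 256;
f = (1:Nfft / 2) * Fs / Nfft;        % frequency of each FFT bin in Hz
w = 2 * pi * f;                      % angular frequency in rad/s
x = w / (1200 * pi);                 % note the parentheses around 1200*pi
bark = 6 * log(x + sqrt(x.^2 + 1));  % equivalent to 6*asinh(x)
fprintf('highest bin: %.1f Hz -> %.2f Bark\n', f(end), bark(end));
```

Writing the divisor as 1200*pi without parentheses multiplies by pi instead of dividing by it, which changes every Bark value and therefore the bands that the later integration loop selects.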
Output:
PROGRAM 10: Design a Speech emotion recognition system using DCT in MATLAB
AIM:
To design a speech emotion recognition system using DCT in MATLAB.
SOFTWARE REQUIRED:
MATLAB
HARDWARE REQUIRED:
Personal computer, Microphones and speakers.
PROCEDURE:
Designing a Speech Emotion Recognition (SER) system using Discrete Cosine Transform
(DCT) in MATLAB involves several steps, including preprocessing, feature extraction,
and classification. Here’s a simplified outline of how this can be implemented:
PROGRAM:
% Clear the command window, workspace and figures
clc; clear; close all;
% Parameters
fs = 16000; % Sampling frequency
frame_size = 0.025; % Frame size in seconds
frame_stride = 0.01; % Frame stride in seconds
% Load the speech signal
[audio, fs] = audioread('emotion_speech.wav');
% Normalize the signal
audio = audio / max(abs(audio));
% Frame the signal
frame_length = round(frame_size * fs);
frame_step = round(frame_stride * fs);
frames = buffer(audio, frame_length, frame_length - frame_step, 'nodelay');
% Apply Hamming window
hamming_window = hamming(frame_length);
frames = frames .* hamming_window;
% Step 4: Classification
% Split data into training and testing sets (using dummy data for example)
labels = [ones(1, size(frames, 2)/2), zeros(1, size(frames, 2)/2)]; % Dummy labels
train_features = features(:, 1:end/2);
test_features = features(:, end/2+1:end);
train_labels = labels(1:end/2);
test_labels = labels(end/2+1:end);
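The classification step above uses a features matrix that the listing never computes. One plausible way to obtain it, shown here only as a sketch, is to take the DCT of every windowed frame and keep the first few coefficients. The DCT-II basis is written out explicitly so the example needs no toolbox (the dct function of the Signal Processing Toolbox could be used instead). The random frames, the value of num_coeffs and the use of floor for the split point are assumptions made for illustration.

```matlab
% Sketch: per-frame DCT features (all names and sizes are illustrative).
rng(0);                                   % repeatable stand-in data
frame_length = 400;                       % 25 ms at 16 kHz
num_frames = 10;
frames = randn(frame_length, num_frames); % stand-in for windowed speech frames
num_coeffs = 13;                          % coefficients kept per frame (assumed)
n = 0:frame_length - 1;                   % sample index within a frame
k = (0:num_coeffs - 1).';                 % DCT coefficient index
basis = cos(pi * k * (2 * n + 1) / (2 * frame_length)); % unnormalised DCT-II rows
features = basis * frames;                % num_coeffs x num_frames
half = floor(num_frames / 2);             % integer split point for odd counts
train_features = features(:, 1:half);
test_features = features(:, half + 1:end);
disp(size(features));
```

Using floor keeps the split valid when the number of frames is odd; an index such as end/2 is not an integer in that case and MATLAB rejects it.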
Designing a Speech Emotion Recognition (SER) system using Wavelet Packet Transform
(WPT) in MATLAB involves several steps, including preprocessing, feature extraction,
and classification. Here’s a simplified outline of how this can be implemented:
PROGRAM:
% Clear the command window, workspace and figures
clc; clear; close all;
% Parameters
fs = 16000; % Sampling frequency
frame_size = 0.025; % Frame size in seconds
frame_stride = 0.01; % Frame stride in seconds
% Load the speech signal
[audio, fs] = audioread('emotion_speech.wav');
% Normalize the signal
audio = audio / max(abs(audio));
% Frame the signal
frame_length = round(frame_size * fs);
frame_step = round(frame_stride * fs);
frames = buffer(audio, frame_length, frame_length - frame_step, 'nodelay');
% Apply Hamming window
hamming_window = hamming(frame_length);
frames = frames .* hamming_window;
% WPT Features
wpt = wpdec(audio, 3, 'db1'); % 3-level wavelet packet decomposition with 'db1' wavelet
wpt_features = wprcoef(wpt, [3 0]); % Reconstruct the coefficients of node (3,0) at level 3
% Step 4: Classification
% Split data into training and testing sets (using dummy data for example)
labels = [ones(1, size(frames, 2)/2), zeros(1, size(frames, 2)/2)]; % Dummy labels
train_features = features(:, 1:end/2);
test_features = features(:, end/2+1:end);
train_labels = labels(1:end/2);
test_labels = labels(end/2+1:end);
PROGRAM 12: Write a MATLAB program to calculate positive and negative ZCR for
voiced and unvoiced segments
Aim:
To write a MATLAB program to calculate positive and negative ZCR for voiced and
unvoiced segments
Software Required:
MATLAB
Hardware Required:
Personal PC, Microphones and Speakers
PROGRAM:
clear all;
fp=fopen('watermark.wav'); fseek (fp, 60044,-1); a=fread (fp, 10000);
plot (a);
xlabel('sample number'); ylabel('amplitude');
title('plot of speech segment for 10000 samples');
% The file is read and plotted. We read 10000 samples after an fseek to byte 60044.
x=0; figure;
for i=5000:6999,
if (a(i)>128) && (a (i+1)<128)
x=x+1;
else x=x;
end
end
disp('number of negative zero crossings for voiced segment=');
disp (x);
% Samples 5000 to 7000 are examined and the number of zero crossings is found.
fseek (fp, 65044,-1);
b=fread (fp, 2000);
plot (b);
xlabel('sample number');
ylabel('amplitude');
title('plot of voiced speech segment');
% Samples 5000 to 7000 are read and plotted. This is the voiced segment.
x=0; figure; % reset the counter before the second segment
for i=7000:8999,
if (a(i)>128) && (a (i+1) <128)
x=x+1;
else x=x;
end
end
disp ('number of negative zero crossings for unvoiced segment='); disp (x);
% Samples 7000 to 9000 are examined and the number of zero crossings is found.
fseek (fp, 67044,-1);
c=fread (fp, 2000);
plot (c);
xlabel('sample number');
ylabel('amplitude');
title('plot of unvoiced speech segment');
% Samples 7000 to 9000 are read and plotted. This is the unvoiced segment.
Output
number of negative zero crossings for voiced segment= 135
number of negative zero crossings for unvoiced segment= 193
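The program above counts only negative-going crossings (a sample above 128 followed by one below 128). Both directions can be counted without a loop, as in the sketch below. The sine wave stands in for 8-bit speech samples, and the names seg, s, neg and pos are illustrative assumptions.

```matlab
% Sketch: positive- and negative-going zero crossings (assumed names and data).
Fs = 22100;                                     % sampling rate used in this manual
t = (0:1999).' / Fs;                            % 2000 samples
seg = 128 + 100 * sin(2 * pi * 442 * t + 0.1);  % stand-in for 8-bit samples
s = seg - 128;                                  % remove the 8-bit offset
neg = sum(s(1:end-1) > 0 & s(2:end) < 0);       % falling through zero
pos = sum(s(1:end-1) < 0 & s(2:end) > 0);       % rising through zero
fprintf('negative-going: %d, positive-going: %d\n', neg, pos);
```

For a real recording, replace seg with the 2000 samples of the voiced or unvoiced segment. The two counts differ by at most one, so either can serve as the zero-crossing rate of the segment.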
PROGRAM 13: Write a MATLAB Program to find Pitch Period of the Voiced Signal
Using Cepstrum Of Speech signal
Aim:
To write a MATLAB Program to find Pitch Period of the Voiced Signal Using Cepstrum of
Speech signal
Software Required:
MATLAB
Hardware Required:
Personal PC, Microphones and Speakers
PROGRAM:
% to find pitch period of the voiced signal using cepstrum of speech signal
clear all;
fp=fopen('watermark.wav');
fseek (fp, 224000,-1);
a=fread (fp, 2048);
a=a-128;
subplot (2,1,1);
plot (a);
title('plot of voiced part of a signal');
xlabel('sample number');
ylabel('amplitude');
%2048 samples of the voiced signal are displayed
a1=window (@hamming, 2048);
subplot (2,1,2);
plot (a1);
title('plot of Hamming window signal');
xlabel('sample number');
ylabel('amplitude');
for i=1:2048,
a11 (i)=a (i) *a1 (i);
end
figure;
subplot (2,1,1);
plot (a11);
title('plot of windowed voiced part of a signal');
xlabel('sample number');
ylabel ('amplitude');
% the signal is passed via Hamming window and displayed
b=abs (fft (a11));
f = 22100/2048:22100/2048:22100;
subplot (2,1,2);
plot (f,b);
title('plot of FFT of windowed signal');
xlabel('frequency');
ylabel ('amplitude');
c=log (b);
figure;
for i=1:1024,
d(i)=c(i);
end
f = 22100/2048:22100/2048: 11050;
subplot (2,1,1);
plot (f, d);
title('log spectrum of windowed voiced speech signal'); xlabel('Frequency in Hz');
ylabel ('amplitude in dB');
e=abs (ifft (c));
subplot (2,1,2);
plot (e);
title('cepstrum of voiced speech signal');
xlabel('Quefrency');
ylabel('amplitude in dB');
for i=1:80,
h(i)=0;
end
for i=81:1967,
h(i) =e(i);
end
for i=1968:2048,
h(i)=0;
end
figure;
subplot (2,1,1);
plot (h);
title('cepstrum of voiced speech signal after passing through window' );
xlabel('Quefrency');
ylabel('amplitude in dB');
g=abs (fft (h));
for i=1:1024,
k(i)=g(i);
end
f = 22100/2048:22100/2048:11050;
subplot (2,1,2);
plot (f, k);
title('spectrum of voiced speech cepstral windowed');
xlabel('frequency');
ylabel('amplitude in dB');
disp (k);
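The program above displays the cepstrum but does not read the pitch period from it. The period is the quefrency of the largest cepstral peak inside the range of plausible pitch values. The sketch below illustrates this on a synthetic pulse train with a known period of 170 samples. The signal, the 50 to 400 Hz search range, the use of the real cepstrum and all variable names are assumptions for illustration.

```matlab
% Sketch: pitch period from the cepstral peak (synthetic input, assumed names).
Fs = 22100;                              % sampling rate used in this manual
N = 2048;
P = 170;                                 % true pitch period in samples (about 130 Hz)
x = zeros(N, 1);
x(1:P:N) = 1;                            % pulse train standing in for glottal pulses
y = filter(1, [1 -1.6 0.81], x);         % one resonance stands in for the vocal tract
n = (0:N - 1).';
w = 0.54 - 0.46 * cos(2 * pi * n / (N - 1));   % Hamming window
spec = abs(fft(y .* w));
ceps = real(ifft(log(spec + eps)));      % real cepstrum
lo = floor(Fs / 400);                    % shortest period searched (400 Hz)
hi = ceil(Fs / 50);                      % longest period searched (50 Hz)
[~, idx] = max(ceps(lo + 1:hi + 1));     % element 1 of ceps is quefrency 0
period = lo + idx - 1;                   % pitch period in samples
fprintf('pitch period = %d samples, pitch = %.1f Hz\n', period, Fs / period);
```

With a recorded frame, y is replaced by the 2048 voiced samples after the offset of 128 has been removed. Restricting the search range keeps the low-quefrency vocal tract part of the cepstrum from being mistaken for the pitch peak.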
PROGRAM 14: Write a MATLAB Code to find Formants from the Voiced Signal Using
Cepstrum of Speech signal
Aim:
To write a MATLAB program to find formants from the voiced signal using cepstrum of
speech signal
Software Required:
MATLAB
Hardware Required:
Personal PC, Microphones and Speakers
PROGRAM:
% to find formants from the voiced signal using cepstrum of speech signal
clear all;
fp=fopen('watermark.wav');
fseek (fp, 224000,-1);
a=fread (fp, 2048);
a=a-128;
subplot (2,1,1);
plot (a);
title('plot of voiced part of a signal');
xlabel('sample number');
ylabel('amplitude');
%2048 samples of the voiced signal are displayed
a1=window (@hamming, 2048);
subplot (2,1,2);
plot (a1);
title('plot of Hamming window signal'); xlabel('sample number');
ylabel('amplitude');
for i=1:2048,
a11 (i)=a (i) *a1 (i);
end
figure;
subplot (2,1,1);
plot (a11);
title('plot of windowed voiced part of a signal'); xlabel('sample number');
ylabel('amplitude');
% the signal is passed via Hamming window and displayed
b=abs (fft (a11));
f = 22100/2048:22100/2048:22100;
subplot (2,1,2);
plot (f,b);
title('plot of FFT of windowed signal');
xlabel('frequency');
ylabel('amplitude');
c=log (b);
figure;
for i=1:1024,
d(i)=c(i);
end
f = 22100/2048:22100/2048:11050;
subplot (2,1,1);
plot (f, d);
title('log spectrum of windowed voiced speech signal'); xlabel('Frequency in Hz');
ylabel('amplitude in dB'); e=abs (ifft (c));
subplot (2,1,2); plot (e);
title('cepstrum of voiced speech signal');
xlabel('Quefrency'); ylabel('amplitude in dB');
for i=1:40,
h(i)=e(i);
end
for i=41:2007,
h(i)=0;
end
for i=2008:2048,
h(i)=e (i);
end
figure;
subplot (2,1,1);
plot (h);
title('cepstrum of voiced speech signal after passing through window' );
xlabel('Quefrency');
ylabel('amplitude in dB');
PROGRAM 15: Write a MATLAB Code to find Formants from the Unvoiced Signal
Using Cepstrum of Speech signal
Aim:
To write a MATLAB program to find formants from the unvoiced signal using cepstrum
of speech signal
Software Required:
MATLAB
Hardware Required:
Personal PC, Microphones and Speakers
PROGRAM:
%to find formants from the unvoiced signal using cepstrum of a speech signal
clear all;
fp=fopen('watermark.wav');
fseek (fp, 23044,-1);
a=fread (fp, 2048); a=a-128;
subplot (2,1,1);
plot (a);
title('plot of unvoiced part of a signal');
xlabel('sample number');
ylabel('amplitude');
%2048 samples of the unvoiced signal are displayed
a1=window (@hamming, 2048);
subplot (2,1,2);
plot (a1);
title('plot of Hamming window signal');
xlabel('sample number');
ylabel('amplitude');
for i=1:2048,
a11 (i)= a (i) *a1 (i);
end
figure;
subplot (2,1,1);
plot (a11);
title('plot of windowed unvoiced part of a signal'); xlabel('sample number');
ylabel('amplitude');
%the signal is passed via Hamming window and displayed
b=abs (fft (a11));
f = 22100/2048:22100/2048:22100;
subplot (2,1,2);
plot (f,b);
title('plot of FFT of windowed signal');
xlabel('frequency');
ylabel('amplitude');
c=log (b);
figure;
for i=1:1024,
d(i)=c(i);
end
f = 22100/2048:22100/2048:11050;
subplot (2,1,1);
plot (f, d);
title('log spectrum of windowed unvoiced speech signal');
xlabel('Frequency in Hz');
ylabel('amplitude in dB'); e=abs (ifft (c));
subplot (2,1,2);
plot (e);
title('cepstrum of unvoiced speech signal');
xlabel('Quefrency');
ylabel('amplitude in dB');
for i=1:40,
h(i) =e(i);
end
for i=41:2007,
h(i)=0;
end
for i=2008:2048,
h(i) =e (i);
end
figure;
subplot (2,1,1);
plot (h);
title('cepstrum of unvoiced speech signal after passing through window' );
xlabel('Quefrency');
ylabel('amplitude in dB');
g=abs (fft (h));
for i=1:1024,
k(i)=g(i);
end
f = 22100/2048:22100/2048:11050;
subplot (2,1,2);
plot (f, k);
title('spectrum of unvoiced speech cepstral windowed');
xlabel('frequency');
ylabel('amplitude in dB');
disp (k);
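The program above prints the cepstrally smoothed spectrum, and the formants are then read off as its peaks. The sketch below shows that last step on a synthetic signal with two known resonances at 700 Hz and 2500 Hz. The test signal, the symmetric 40-sample lifter, the use of the real cepstrum, the simple local-maximum test and all variable names are assumptions for illustration, not the method of the original listing.

```matlab
% Sketch: formants as peaks of the cepstrally smoothed log spectrum.
Fs = 22100;  N = 2048;  r = 0.98;            % assumed test values
F = [700 2500];                              % resonances of the test signal in Hz
a = 1;
for m = 1:numel(F)
    % cascade one two-pole resonator per formant
    a = conv(a, [1, -2 * r * cos(2 * pi * F(m) / Fs), r^2]);
end
x = filter(1, a, [1; zeros(N - 1, 1)]);      % synthetic vocal tract response
c = real(ifft(log(abs(fft(x)) + eps)));      % real cepstrum
L = 40;                                      % lifter length, as in the listing
h = zeros(N, 1);
h(1:L) = c(1:L);                             % keep quefrencies 0 to L-1
h(N - L + 2:N) = c(N - L + 2:N);             % and their mirror image
s = real(fft(h));                            % smoothed log spectrum
s = s(1:N / 2);                              % keep 0 to Fs/2
f = (0:N / 2 - 1).' * Fs / N;                % frequency of each bin in Hz
isPeak = [false; s(2:end-1) > s(1:end-2) & s(2:end-1) >= s(3:end); false];
pk = find(isPeak);                           % indices of local maxima
[~, order] = sort(s(pk), 'descend');         % strongest peaks first
formants = sort(f(pk(order(1:2))));          % two strongest peaks, in Hz
disp(formants.');
```

Truncating the cepstrum leaves small ripples in the smoothed spectrum, so ranking the local maxima by height before reading off the formants is safer than taking the first few peaks in frequency order.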