For End For End: "Sp01.wav"

The script reads in an audio signal, segments it into 10 ms frames, applies a Hamming window to each frame, and calculates the short-term energy and zero-crossing count of every frame. It then separates voiced frames from unvoiced frames by thresholding these two features. The voiced frames are extracted and transformed to the frequency domain using the FFT, and the magnitude spectrum is converted to a log scale. Finally, a discrete cosine transform (DCT) is applied to the log data, and the resulting DCT coefficients are plotted.
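As a rough illustration of the two per-frame features, the sketch below computes the short-term energy and the zero-crossing count of a single hypothetical 8-sample frame. The frame values and the variable names `ste_x` and `zcc_x` are invented for illustration, and windowing is skipped to keep the arithmetic easy to follow. The counting rule is a strict sign change between adjacent samples, which is one common convention.

```matlab
% Hypothetical 8-sample frame (values invented for illustration; no window applied)
x = [0.5; -0.25; -0.5; 0.25; 0.5; -0.5; 0.25; 0.25];

% Short-term energy: mean of the squared samples
ste_x = sum(x.^2) / length(x);

% Zero-crossing count: adjacent samples whose product is negative,
% i.e. a strict sign change from one sample to the next
zcc_x = sum(x(1:end-1) .* x(2:end) < 0);

fprintf('STE = %.5f, zero crossings = %d\n', ste_x, zcc_x);
```

A frame with high energy and few sign changes is the kind that a threshold test on these two features would label as voiced.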

%% audio input

[data, fs] = audioread("sp01.wav"); % speech signal sampled at 8 kHz; fs is the sampling frequency
plot(data)
frame_duration = 0.010; % frame duration in seconds (10 ms)
frame_length = round(frame_duration * fs); % number of samples in each frame (80 at 8 kHz)
N = length(data); % total number of samples = 22529
no_frames = floor(N/frame_length); % number of frames = 281

%% framing
frame = [];
for i = 1:no_frames
    frame = [frame data(frame_length * (i-1) + 1 : frame_length * i)];
end

%% windowing
f_win = [];
for j = 1:no_frames
    f_win = [f_win frame(:,j) .* hamming(length(frame(:,j)))];
end

%% short term energy

ste = [];
for k = 1:no_frames
    ste(k) = sum(f_win(:,k).^2) / frame_length; % mean energy per sample in frame k
end

%% zero crossing count

ZCR1 = [];
for m = 1:no_frames
    x = f_win(:,m);
    ZCC = 0;
    for m1 = 1:length(x)-1
        if ((x(m1) < 0) && (x(m1 + 1) > 0))
            ZCC = ZCC + 1;
        elseif ((x(m1) > 0) && (x(m1 + 1) < 0))
            ZCC = ZCC + 1;
        end
    end
    ZCR1 = [ZCR1 ZCC];
end

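%% separating voiced frames using STE and ZCC

voiced = zeros(frame_length, no_frames); % unvoiced frames stay as all-zero columns
for h = 1:no_frames
    if (ZCR1(h) < 30 && ste(h) > 0.000132) % few zero crossings and high energy: voiced
        voiced(:,h) = frame(:,h);
    end
end
voicedpart = voiced(:); % all frames concatenated into one column vector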
% SEPARATING FRAMES OF VOICED PART
voice = [];
for i = 1:no_frames
    voice = [voice voicedpart(frame_length * (i-1) + 1 : frame_length * i)];
end
% REMOVING ZERO FRAMES
voiced_sep = voice(:, any(voice,1)); % keep every frame with at least one nonzero sample
% fft
FFT = fft(voiced_sep); % fft of a matrix transforms each column (frame) separately
mag = abs(FFT); % magnitude
phase = angle(FFT);
% logarithm
LOG = log(mag); % natural log of each magnitude value; a zero magnitude gives -Inf

% DCT of log data

voice_dct = dct(LOG); % dct of a matrix transforms each column (frame) separately

figure
plot(voice_dct);
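
The FFT, logarithm, and DCT steps can also be tried without the audio file. The sketch below runs them on one synthetic 80-sample frame. The 200 Hz tone, the `_demo` variable names, and the small `eps` offset that guards against `log(0)` are assumptions made for illustration and are not part of the script above. Like the script, it relies on `hamming` and `dct` from the Signal Processing Toolbox.

```matlab
fs_demo = 8000;                      % assumed sampling frequency in Hz
n = (0:79)';                         % 80 samples, one 10 ms frame at 8 kHz
x_demo = sin(2*pi*200*n/fs_demo);    % synthetic 200 Hz tone standing in for a voiced frame

xw_demo = x_demo .* hamming(length(x_demo)); % Hamming window
mag_demo = abs(fft(xw_demo));                % magnitude spectrum
log_demo = log(mag_demo + eps);              % log magnitude; eps avoids log(0)
coef_demo = dct(log_demo);                   % DCT of the log spectrum

figure
plot(coef_demo);
```

Because the input is a single column, the result is one column of 80 DCT coefficients; passing a matrix of frames instead would give one column of coefficients per frame.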
