Report On Project 1 Speech Emotion Recognition

This document describes a project on speech emotion recognition. It includes an abstract, introduction, flowchart, code to run adaptive noise cancellation algorithms, and results. The code reads an audio file, applies Fourier transforms, filters the signal using butterworth filtering, and achieves 85% accuracy in removing hissing noise. Instructions are provided on running the code and interpreting the wave file parameters.

Uploaded by

archana kumari

A REPORT ON PROJECT 1

SPEECH EMOTION RECOGNITION

DEPARTMENT OF ELECTRICAL ENGINEERING

POWER ELECTRONICS AND DRIVES

SUBJECT – DIGITAL SIGNAL PROCESSING

SUBMITTED TO: DR HEMANT KUMAR MEENA

SUBMITTED BY: MONALIKA (2022PPD5231)
ARCHANA (2022PPD5243)

M.TECH (1ST YEAR)

CONTENTS

1. Abstract
2. Introduction
3. Flow chart
4. Running code
5. Training and testing input
6. Accuracy achieved
7. Instructions for running the code
ABSTRACT
Virtually all audio recordings contain some amount of noise. This noise may enter the audio
signal during the recording process or build up during long storage of the media. To produce
the best-quality audio recordings, this unwanted noise must be removed to the greatest extent
possible. Noise cancellation is therefore a key challenge in audio signal processing. Since
noise is a random process that varies at every instant of time, it has to be estimated at
every instant in order to cancel it from the original signal. There are many schemes for noise
cancellation, but the most effective one is to use an adaptive filter. Active Noise
Cancellation (ANC) is achieved by introducing an "anti-noise" wave through an appropriate
array of secondary sources. These secondary sources are interconnected through an electronic
system that uses a specific signal processing algorithm for the particular cancellation
scheme. In this paper, three conventional adaptive algorithms for ANC, namely RLS (Recursive
Least Squares), LMS (Least Mean Squares) and NLMS (Normalized Least Mean Squares), are
analyzed for a single-channel broadband feedforward configuration. To obtain faster
convergence, the NLMS algorithm is modified and an associated extended algorithm is derived
under a Gaussian noise assumption. Simulation results indicate a higher quality of noise
cancellation and a lower mean square error (MSE).
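The code listed later in this report does not include the adaptive algorithms named in the abstract. As a rough illustration only, an NLMS noise canceller might look like the sketch below. The function name nlms_cancel, the tap count, the step size mu and the synthetic test signals are all assumptions made for this example, not part of the project code.

```python
import numpy as np

def nlms_cancel(primary, reference, n_taps=16, mu=0.1, eps=1e-8):
    """Adaptive noise canceller with the NLMS weight update (illustrative sketch).

    primary   : wanted signal plus noise
    reference : noise measurement correlated with the noise in `primary`
    Returns the error signal (the cleaned estimate) and the final weights.
    """
    w = np.zeros(n_taps)
    e = np.zeros(len(primary))
    for n in range(n_taps - 1, len(primary)):
        x = reference[n - n_taps + 1:n + 1][::-1]   # newest reference sample first
        noise_estimate = np.dot(w, x)
        e[n] = primary[n] - noise_estimate
        # LMS step normalized by the power of the current input vector
        w += mu * e[n] * x / (eps + np.dot(x, x))
    return e, w

# Synthetic test data (hypothetical): a 440 Hz tone buried in filtered white noise
rng = np.random.default_rng(0)
n_samples = 4000
t = np.arange(n_samples) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noise_ref = rng.standard_normal(n_samples)
noise_in_primary = np.convolve(noise_ref, [0.6, 0.3, -0.2])[:n_samples]
primary = clean + noise_in_primary

cleaned, weights = nlms_cancel(primary, noise_ref)

# Compare the mean square error against the clean tone after convergence
mse_before = np.mean((primary[2000:] - clean[2000:]) ** 2)
mse_after = np.mean((cleaned[2000:] - clean[2000:]) ** 2)
print("MSE before:", mse_before, "MSE after:", mse_after)
```

A smaller mu gives a lower residual error at the cost of slower convergence; the value 0.1 here is only a starting point for experiments.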
INTRODUCTION
An audio noise reduction system removes unwanted noise from speech signals. Audio noise
reduction can be classified into two kinds: complementary and non-complementary. Speech
communication cannot be used effectively in an environment where the background noise level
is too high, because the noise interferes with the original message and corrupts the
parameters of the message signal.
Noise can be defined as an unwanted signal that interferes with the communication or
measurement of another signal. Noise is itself an information-bearing signal: it conveys
information about the sources of the noise and the environment in which it propagates.
The types and sources of noise and distortion are many and include: (i) electronic noise such
as thermal noise and shot noise, (ii) acoustic noise emanating from moving, vibrating or
colliding sources such as revolving machines, moving vehicles, keyboard clicks, wind and
rain, (iii) electromagnetic noise that can interfere with the transmission and reception of
voice, image and data over the radio-frequency spectrum, (iv) electrostatic noise generated
by the presence of a voltage, (v) communication channel distortion and fading and (vi)
quantization noise and lost data packets due to network congestion.
FLOW CHART
4. Running code

from google.colab import drive

drive.mount('/content/drive')
import numpy as np
import IPython.display  # explicit submodule import so that IPython.display.Audio is available
import wave
import struct
import matplotlib.pyplot as plt

#Importing audio file from drive

audiofile = '/content/drive/MyDrive/whistle_withoutac.wav'
IPython.display.Audio(audiofile)

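#Converting m4a file to wav format

!pip install pydub

import os
from pydub import AudioSegment

# Take the extension from the file name instead of hard-coding 'm4a',
# so that a file that is already in wav format is not converted onto itself.
base, file_extension = os.path.splitext(audiofile)
file_extension = file_extension.lstrip('.').lower()
if file_extension == 'wav':
    audiofile_wave = audiofile  # already a wav file, nothing to convert
else:
    track = AudioSegment.from_file(audiofile, file_extension)
    audiofile_wave = base + '.wav'
    file_handle = track.export(audiofile_wave, format='wav')
print(audiofile_wave)
IPython.display.Audio(audiofile_wave)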

#Reading wav file

# from scipy.io import wavfile

# fs, data = wavfile.read(audiofile_wave)
# print(fs, len(data))
from scipy import fftpack
wav_file = wave.open(audiofile_wave, 'r')
nchannels, sampwidth, framerate, nframes, comptype, compname = wav_file.getparams()
print("Params:", "\n\tChannel:", nchannels, "\n\tSample Width:", sampwidth,
      "\n\tFramerate:", framerate, "\n\tNumber of Frames:", nframes,
      "\n\tcomptype:", comptype, "\n\tCompname:", compname)

# Reading wave format data from wav file.

frames_wave = wav_file.readframes(nframes)
wav_file.close()
print("Length:", nframes)

# Deserializing: every frame holds one signed 16-bit sample per channel
# (this assumes sampwidth == 2, i.e. 16-bit PCM)
frames_wave = struct.unpack('<{n}h'.format(n=nframes * nchannels), frames_wave)
frames_wave = np.array(frames_wave)[::nchannels]  # keep the first channel only
print("Min value:", np.min(frames_wave), "Max value:", np.max(frames_wave))
#Applying Fourier

#Fast Fourier Transform

# frames_freq_domian = np.fft.fft(frames_wave)
frames_freq_domian = fftpack.fft(frames_wave)

# The FFT values are complex numbers; the absolute value gives the magnitude
# of each frequency component.
magnitude = np.abs(frames_freq_domian) # amplitude spectrum
phase = np.angle(frames_freq_domian) # normally we are not interested in phase information; it is only used in reconstruction

print(magnitude.shape, phase.shape)

# FFT bin k corresponds to k * framerate / N Hz. Only the first half of the
# spectrum is searched because the second half mirrors it for a real signal.
n_bins = len(magnitude)
freqs = np.arange(n_bins) * framerate / n_bins
t_axis = np.arange(len(frames_wave)) / framerate
peak_hz = freqs[np.argmax(magnitude[:n_bins // 2])]
print("The max frequency (highest magnitude) is {} Hz".format(peak_hz))

fig = plt.figure(figsize = (25, 6))

fig.suptitle('Original wav data')

ax1 = fig.add_subplot(1,3,1)
ax1.set_title("Original audio wave / Time Domain")
ax1.set_xlabel("Time (s)")
ax1.set_ylabel("Amplitude (16 bit depth - calculated above)")
ax1.plot(t_axis, frames_wave)
ax2 = fig.add_subplot(1,3,2)
ax2.set_title("Frequency by magnitude (Max at {:.1f} Hz) / Frequency Domain".format(peak_hz))
ax2.set_xlabel("Frequency (Hertz)")
ax2.set_ylabel("Magnitude (normalized)")
ax2.set_xlim(0, framerate / 2) # up to the Nyquist frequency; the rest is a mirror image
ax2.plot(freqs, magnitude / n_bins) # Normalizing magnitude
ax3 = fig.add_subplot(1,3,3)
ax3.set_title("[Unclipped] Frequency by magnitude (Max at {:.1f} Hz) / Frequency Domain".format(peak_hz))
ax3.set_xlabel("Frequency (Hertz)")
ax3.set_ylabel("Magnitude (normalized)")
ax3.plot(freqs, magnitude / n_bins) # Normalizing magnitude
plt.show()

#Filtering the signal

from scipy import signal

from matplotlib import pyplot as plt
def butter_pass_filter(data, cutoff, fs, order=5):
    nyq = 0.5 * fs # Nyquist frequency
    normal_cutoff = cutoff / nyq # cutoff as a fraction (0 to 1) of the Nyquist frequency
    # The second printed value is the index of the FFT bin closest to the cutoff
    print("normal_cutoff:", normal_cutoff, (data.shape[0] / 2) * normal_cutoff)
    # btype='high' makes this a high-pass filter: components below the cutoff are attenuated
    b, a = signal.butter(order, normal_cutoff, btype='high', analog=False)
    y = signal.filtfilt(b, a, data) # forward-backward filtering, so no phase shift

    def _plot_graph():
        # Plot the frequency response of the filter from its coefficients.
        w, h = signal.freqz(b, a, worN=8000)
        plt.subplot(2, 1, 1)
        plt.plot(0.5 * fs * w / np.pi, np.abs(h), 'b')
        plt.plot(cutoff, 0.5 * np.sqrt(2), 'ko') # -3 dB point at the cutoff
        plt.axvline(cutoff, color='k')
        plt.xlim(0, 0.5 * fs)
        plt.title("Filter Frequency Response")
        plt.xlabel('Frequency [Hz]')
        plt.grid()
        plt.show()
    _plot_graph()
    return y

# Filter requirements.
order = 10
fs = framerate # sample rate, Hz
cutoff = 900 # desired cutoff frequency of the filter, Hz

# Apply the filter (this also plots its frequency response).

y = butter_pass_filter(frames_wave, cutoff, fs, order)
print(frames_wave.shape, y.shape, np.array_equal(frames_wave, y))

# Axes in physical units: seconds for the waveforms, Hz for the spectra
t_axis = np.arange(len(y)) / fs
freqs = np.arange(len(y)) * fs / len(y)

fig = plt.figure(figsize = (25, 6))
# fig.suptitle('Horizontally stacked subplots')

ax1 = fig.add_subplot(1,4,1)
ax1.set_title("[After Filter] Audio wave / Time Domain")
ax1.set_xlabel("Time (s)")
ax1.set_ylabel("Amplitude (16 bit depth - calculated above)")
ax1.plot(t_axis, y)
ax2 = fig.add_subplot(1,4,2)
ax2.set_title("[Before Filter] Original audio wave / Time Domain")
ax2.set_xlabel("Time (s)")
ax2.set_ylabel("Amplitude (16 bit depth - calculated above)")
ax2.plot(t_axis, frames_wave, 'r')
m = np.abs(fftpack.fft(y))
ax3 = fig.add_subplot(1,4,3)
ax3.set_title("[After Filter] Frequency by magnitude")
ax3.set_xlabel("Frequency (Hertz)")
ax3.set_ylabel("Magnitude (normalized)")
ax3.set_xlim(0, fs / 2) # up to the Nyquist frequency; the rest is a mirror image
ax3.plot(freqs, m / len(y))
# ax2.plot(range(0, 676864), m, 'g-', label='dataa')
ax4 = fig.add_subplot(1,4,4)
ax4.set_title("[Before Filter] Frequency by magnitude")
ax4.set_xlabel("Frequency (Hertz)")
ax4.set_ylabel("Magnitude (normalized)")
ax4.set_xlim(0, fs / 2) # up to the Nyquist frequency; the rest is a mirror image
# ax2.plot(magnitude * 2 / (16 * len(magnitude)))
ax4.plot(freqs, magnitude / len(magnitude), 'r')
plt.show()
IPython.display.Audio(data=y, rate=fs) # play back at the file's own sample rate

5. Training and testing input

We took an audio file that contains some noise as input. To remove that noise, we perform the
following operations on it:
I. Convert it to a wav file.
II. Read the wav file.
III. Apply the Fourier transform.
IV. Obtain the waveform.
V. Filter the signal.
VI. Write the filtered signal back to a file.
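Step VI is listed above, but the code in Section 4 only plays the filtered signal. A minimal sketch of writing it back to disk with the standard wave module could look as follows. The helper name write_wav_int16, the output file name and the stand-in signal are assumptions for illustration; in the notebook, the filtered array y and framerate would be passed instead.

```python
import os
import tempfile
import wave

import numpy as np

def write_wav_int16(path, samples, framerate, nchannels=1):
    """Write samples as 16-bit PCM and return the number of frames written."""
    # The filter output is floating point, so round and clip to the int16 range
    pcm = np.clip(np.rint(samples), -32768, 32767).astype('<i2')
    with wave.open(path, 'wb') as wav_out:
        wav_out.setnchannels(nchannels)
        wav_out.setsampwidth(2)          # 2 bytes per sample = 16 bit
        wav_out.setframerate(framerate)
        wav_out.writeframes(pcm.tobytes())
    return len(pcm) // nchannels

# Stand-in for the filtered signal (hypothetical data): one second of a 1 kHz tone
framerate = 44100
y = 10000 * np.sin(2 * np.pi * 1000 * np.arange(framerate) / framerate)

out_path = os.path.join(tempfile.gettempdir(), 'filtered_example.wav')
frames_written = write_wav_int16(out_path, y, framerate)

# Read the header back to confirm what was written
with wave.open(out_path, 'rb') as wav_in:
    print(wav_in.getnframes(), wav_in.getframerate(), wav_in.getsampwidth())
```

Clipping before the cast matters because a filter can overshoot the original peak values, and an unclipped cast to int16 would wrap around.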

6. Accuracy achieved
The code should also work for any other audio file that contains noise. With this code we
achieved 85% accuracy; it removes almost all of the hissing noise present in the recording.
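The report does not say how the accuracy figure was measured. As one possible and purely illustrative measure (an assumption, not the authors' method), the fraction of spectral energy removed from the stop band can be compared before and after filtering. The helper band_energy, the synthetic signals and the use of second-order sections are choices made for this sketch only.

```python
import numpy as np
from scipy import signal

def band_energy(x, fs, f_lo, f_hi):
    """Sum of squared FFT magnitudes for frequencies in [f_lo, f_hi)."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return spectrum[(freqs >= f_lo) & (freqs < f_hi)].sum()

fs = 44100
cutoff = 900
t = np.arange(fs) / fs
wanted = np.sin(2 * np.pi * 3000 * t)          # component above the cutoff
unwanted = 0.5 * np.sin(2 * np.pi * 100 * t)   # component below the cutoff
noisy = wanted + unwanted

# Same kind of filter as in the report (order 10 high-pass at 900 Hz),
# built as second-order sections for numerical robustness
sos = signal.butter(10, cutoff, btype='high', fs=fs, output='sos')
filtered = signal.sosfiltfilt(sos, noisy)

removed = 1 - band_energy(filtered, fs, 0, cutoff) / band_energy(noisy, fs, 0, cutoff)
kept = band_energy(filtered, fs, 2000, 4000) / band_energy(noisy, fs, 2000, 4000)
print("stop-band energy removed:", removed, "pass-band energy kept:", kept)
```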

7. Instructions for running the code

Keep the following points in mind when using this code:
Channels: A wave file can hold one channel (mono) or two channels (stereo). When writing a
file, wav_file.setnchannels(1) makes it mono; for stereo use wav_file.setnchannels(2).
Each channel sample is stored as a 16-bit signed number and can therefore take values in the
range -32768 to +32767.
Sample rate: Audio signals are analog, but we want to represent them digitally, which means
discretizing them both in value and in time. The sample rate gives how many values we take
per second.
The unit is Hz.
The sample rate must be more than double the highest frequency in the original sound
(the Nyquist theorem states that to retain frequency f we should sample at a rate above 2f);
otherwise you get aliasing.
Human hearing ranges from ~20 Hz to ~20 kHz, so anything above 20 kHz can be cut off. A sample
rate far above 40 kHz therefore adds little that is audible; the common rate of 44.1 kHz
leaves a small margin above this limit.
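To make the aliasing remark concrete, the short sketch below samples a tone that lies above half the sample rate and looks at where it appears in the spectrum. The tone frequency and the one-second duration are arbitrary values chosen for this example.

```python
import numpy as np

fs = 44100                 # sample rate in Hz
n = np.arange(fs)          # sample indices for one second of audio
tone_hz = 30000            # above the Nyquist frequency fs / 2 = 22050 Hz
samples = np.sin(2 * np.pi * tone_hz * n / fs)

# Locate the strongest component in the sampled signal
spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=1.0 / fs)
apparent_hz = freqs[np.argmax(spectrum)]
print("apparent frequency:", apparent_hz)  # the tone folds down to fs - tone_hz = 14100 Hz
```

The 30 kHz tone is indistinguishable from a 14.1 kHz tone once sampled, which is why frequencies above fs / 2 have to be removed before sampling.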
Bit-depth: The higher the bit-depth, the more dynamic range can be captured. Dynamic
range is the difference between the quietest and loudest volume of an instrument, part or
piece of music. Typical values are 16 bit and 24 bit. A bit-depth of 16 bit has a
theoretical dynamic range of about 96 dB, whereas 24 bit has about 144 dB.
Subtype: PCM_16 means 16 bit depth, where PCM stands for Pulse-Code Modulation.
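The dynamic range figures follow from the number of quantization levels: with b bits there are 2^b levels, which corresponds to 20*log10(2^b) dB, or roughly 6.02 dB per bit. The small sketch below reproduces the numbers; the function name dynamic_range_db is just an illustrative helper.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    lowest = -(2 ** (bits - 1))        # most negative signed sample value
    highest = 2 ** (bits - 1) - 1      # most positive signed sample value
    print(bits, "bit:", lowest, "to", highest, "->",
          round(dynamic_range_db(bits), 1), "dB")
```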
Filtering the signal: The response of a filter at a single frequency can be expressed as a
complex number that carries two properties of the wave, amplitude and phase: the absolute
value is the magnitude response of the filter and the angle is its phase response.
amplitude*e^(i*phase) = amplitude*cos(phase) + i*amplitude*sin(phase)
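As a small illustration of this relation, the sketch below evaluates the complex response of a Butterworth high-pass filter at its cutoff frequency and rebuilds the complex value from amplitude and phase. The order 5 and the 900 Hz cutoff at 44100 Hz are example values, not necessarily those of the project run.

```python
import numpy as np
from scipy import signal

fs = 44100
cutoff = 900
b, a = signal.butter(5, cutoff, btype='high', fs=fs)

# Evaluate the frequency response at one frequency only (given in Hz because fs is passed)
w, h = signal.freqz(b, a, worN=np.array([cutoff], dtype=float), fs=fs)
response = h[0]

amplitude = np.abs(response)    # magnitude response
phase = np.angle(response)      # phase response in radians

# Euler's formula: amplitude * e^(i*phase)
rebuilt = amplitude * np.cos(phase) + 1j * amplitude * np.sin(phase)
print("amplitude:", amplitude, "phase:", phase)
```

At the cutoff of a Butterworth filter the amplitude is 1/sqrt(2), the -3 dB point that the project code marks on its frequency-response plot.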
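If we had not limited the x-axis of the frequency-domain plot, you would have noticed that
the second half of the spectrum mirrors the first half: for a real signal the two halves are
complex conjugates of each other, because each real sinusoid appears as a pair of positive-
and negative-frequency components. Only frequencies up to half the sample rate (22050 Hz for
a 44100 Hz recording) carry independent information, and since humans can only hear up to
about 20 kHz we are not interested in anything beyond that range.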
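The mirror property can be checked numerically. The sketch below uses a random real signal (an arbitrary stand-in for audio samples) and verifies that bin N - k is the complex conjugate of bin k.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)      # any real-valued signal
X = np.fft.fft(x)
N = len(x)

# For real input, X[N - k] equals the complex conjugate of X[k]
k = np.arange(1, N)
mirror_ok = np.allclose(X[N - k], np.conj(X[k]))
print("second half mirrors first half:", mirror_ok)

# rfft returns only the non-redundant half: bins 0 .. N/2
half = np.fft.rfft(x)
print(len(X), "bins in the full FFT,", len(half), "bins from rfft")
```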
Sound can be expressed in terms of waves, which in turn have frequency components. If we
mask some frequencies, we can remove noise. Which frequencies to mask is a matter of trial
and error. Since we do not have past data to train on, this cannot be treated as an ML-based task.
Put simply, we want to let a certain range of frequencies through and mask the others, like
applying a rectangular window to the spectrum of the whole signal. A rectangle in the
frequency domain is the representation of a sinc function in the time domain, so it acts as
an ideal filter. We just need to place this rectangle over our signal in the frequency
domain and multiply element by element,
then take the IFFT to get the filtered signal back.
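The multiply-then-IFFT idea can be sketched in a few lines. This is the idealized rectangular mask described above, not the Butterworth filter used in the project code; the function name fft_mask_highpass and the two test tones are assumptions for the example.

```python
import numpy as np

def fft_mask_highpass(x, fs, cutoff_hz):
    """Zero every FFT bin below cutoff_hz and transform back to the time domain."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs < cutoff_hz] = 0          # rectangular mask in the frequency domain
    return np.fft.irfft(spectrum, n=len(x))

fs = 8000
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 100 * t)            # to be removed
high = 0.5 * np.sin(2 * np.pi * 2000 * t)    # to be kept
filtered = fft_mask_highpass(low + high, fs, 900)

print("largest deviation from the kept tone:", np.max(np.abs(filtered - high)))
```

The result is this clean only because both test tones fall exactly on FFT bins; on real recordings a hard mask tends to cause ringing, which is one reason smoother filters are used in practice.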
In practice the ideal filter cannot be realized exactly: the sinc function extends infinitely
in time and never dies out completely, so using it would mean computing values out to
infinity. We therefore accept an approximation and settle for something in between.
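One common form of such an approximation is to truncate the sinc to a finite number of taps and taper it with a window. The sketch below builds a low-pass filter this way for simplicity (the project itself uses a Butterworth high-pass); the tap count, the Hamming window and the test tones are assumptions chosen for illustration.

```python
import numpy as np

def windowed_sinc_lowpass(cutoff_hz, fs, num_taps=101):
    """Finite approximation of the ideal low-pass filter: truncated, windowed sinc."""
    n = np.arange(num_taps) - (num_taps - 1) / 2    # sample positions centred on zero
    h = np.sinc(2 * cutoff_hz / fs * n)             # ideal response cut to num_taps samples
    h *= np.hamming(num_taps)                       # taper the abrupt truncation
    return h / h.sum()                              # unity gain at 0 Hz

fs = 8000
h = windowed_sinc_lowpass(900, fs)
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 100 * t)      # inside the pass band
high = np.sin(2 * np.pi * 2000 * t)    # inside the stop band

# Filtering in the time domain is convolution with the filter taps
out_low = np.convolve(low, h, mode='same')
out_high = np.convolve(high, h, mode='same')

# Compare RMS levels away from the edges, where the convolution is incomplete
rms_low = np.sqrt(np.mean(out_low[200:-200] ** 2))
rms_high = np.sqrt(np.mean(out_high[200:-200] ** 2))
print("RMS of passed tone:", rms_low, "RMS of blocked tone:", rms_high)
```

More taps give a sharper transition between pass band and stop band at the cost of more computation and a longer delay.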
The above is only an explanation; there are many filters and filter types. scipy comes with
many of them already built in, so there is no need to dive into the maths: we just need to
use the functions it exposes.
Linear filters never add any new frequency components to the sound. They can only scale the
amplitudes (and shift the phases) of frequencies that already exist.
