MOOD BASED MUSIC RECOMMENDATION SYSTEM

A PROJECT REPORT
Submitted by

Amardeep Tomar (19BCS1137)
Ashish Yadav (19BCS1124)
Marjana Bharali (19BCS1884)
Tanmay Tripathi (19BCS1120)
Vidushi Somani (19BCS1119)

in partial fulfillment for the award of the degree of

BACHELOR OF ENGINEERING
IN

COMPUTER SCIENCE & ENGINEERING

Chandigarh University
DECEMBER 2022
BONAFIDE CERTIFICATE

Certified that this project report “Mood Based Music Recommendation System” is the
bonafide work of “Marjana Bharali (19BCS1884), Vidushi Somani (19BCS1119),
Ashish Yadav (19BCS1124), Amardeep Tomar (19BCS1137), Tanmay Tripathi (19BCS1120)”, who
carried out the project work under my supervision.

SIGNATURE SIGNATURE
Dr. Navpreet Kaur Er. Payal Thakur
HEAD OF THE DEPARTMENT Supervisor
Assistant Professor
Computer Science & Engineering Computer Science & Engineering

Submitted for the project viva-voce examination held on

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

We would like to express our profound gratitude to Dr. Navpreet Kaur, HOD of the
Computer Science & Engineering department, for her contributions to the completion of
our project titled “Mood Based Music Recommendation System”.

We would like to express our special thanks to our mentor Ms. Payal Thakur for the
time and effort she provided throughout the year. Your useful advice and
suggestions were really helpful to us during the project’s completion. In this aspect, we
are eternally grateful to you.
Finally, we would like to offer many thanks to all our colleagues for their
valuable suggestions and constructive feedback.

Amardeep Tomar-19BCS1137
Ashish Yadav-19BCS1124
Marjana Bharali-19BCS1884
Tanmay Tripathi-19BCS1120
Vidushi Somani-19BCS1119
TABLE OF CONTENTS
List of Figures.................................................................................................................i
List of Tables.................................................................................................................ii
List of Abbreviations.......................................................................................................iii
Abstract..........................................................................................................................iv

Chapter-1 Introduction
1.1 Identification.............................................................................................1
1.2 Identification of Problem.........................................................................2
1.3 Identification of Tasks............................................................................3
1.4 Team Work Distribution........................................................................4
1.5 Timeline…...............................................................................................5

Chapter-2 Literature Review/ Background Study


2.1 Timeline of the reported problem….......................................................6
2.2 Proposed Solutions...............................................................................11
2.3 Bibliometric Analysis............................................................................12
2.4 Review Summary..................................................................................13
2.5 Problem Definition................................................................................13
2.6 Goals/Objectives....................................................................................14

Chapter-3 Design Flow/Process


3.1 Evaluation of features................................................................................15
3.1.1 S/H Requirements................................................................................16
3.2 Design Constraints.....................................................................................17
3.3 Constraint Analysis....................................................................................20
3.4 Design Flow..............................................................................................22
3.5 Design Selection........................................................................................23
3.6 Gesture Classification...............................................................................24
3.7 Implementation Plan/Methodology...........................................................25
3.7.1 Implementation.................................................................................25
3.7.2 Feature Extraction............................................................................28

Chapter-4 Result Analysis and Validation


4.1 Implementation of Solution......................................................................32
4.1.1 Analysis...............................................................................................32
4.1.2 Experimental results............................................................................33
4.1.3 Testing................................................................................................34
4.1.4 Accuracy.............................................................................................36

Chapter-5 Conclusion and Future Work

5.1 Conclusion...............................................................................................38
5.2 Future Work.............................................................................................39

References......................................................................................................................40
Appendix........................................................................................................................45
User Manual...................................................................................................................58
LIST OF FIGURES

Fig 1.1 Timeline of Project..................................................................4

Fig 3.1 Design Constraints.................................................................19

Fig 3.2 Design Flow1..........................................................................22

Fig 3.3 Design Flow 2........................................................................22

Fig 3.4 Final Design Flow…..............................................................23

Fig 3.5 Flowchart...............................................................................27

Fig 3.6 ANN.......................................................................................28

Fig 3.7 CNN......................................................................................29

Fig 3.8 Pooling...................................................................................30

Fig 4.1 Mood Happy...........................................................................33

Fig 4.2 Mood Surprised......................................................................33

Fig 4.3 Mood Neutral........................................................................34

LIST OF TABLES

1.1 Accuracy of Model

LIST OF ABBREVIATIONS

OpenCV Open Source Computer Vision Library

CNN Convolutional Neural Network

ANN Artificial Neural Network

HMM Hidden Markov Models

JSON JavaScript Object Notation

VE Virtual Environment

सार (Abstract)

The modern world is changing every moment. New technology is entering every sphere of our daily life.
Image processing is one of the leading pioneers of this changing world. A great deal can be done with a single click.
With the help of a single image, many things become possible. A text image can be translated from one language to
another without the help of a human interpreter. One can also save time by sending someone a picture, since a single
image explains a great deal. Images are also used to identify a person on social media and many other websites.
For this reason, face detection is becoming more popular every day. With the help of face detection, a person can be
identified very easily. What if one could tell what kind of emotional state a person is in? It would help someone reach
out to that person. For example, if a person is sad, one could do something to cheer them up, and so on. In this project
it has been explored whether it is possible to identify a person, and whether it is possible to recognize a person's
emotional state. Research has then also been carried out on suggesting music to that person on the basis of his or
her emotions.

ABSTRACT

The modern world is constantly evolving. Every area of daily life is being
transformed by new technologies. One of the primary innovators in this
evolving environment is image processing. Many things can be done with a
single click. An image can be used to accomplish a lot of things. Without the
assistance of a human interpreter, a text image can be translated from one
language to another. Texting someone an image can also save time,
because a single image can convey a lot of information. On social media and in
many other online spaces, images are also used to identify a person. Face
detection is consequently growing in popularity daily. A person can be easily
recognised with the aid of face detection. What if it were possible to determine
a person's emotional state? It would be beneficial when approaching that person. For
instance, if someone is depressed, you could do something to cheer them up, and so on. In this
experiment, it has been investigated whether it is possible to identify a person
and their emotional state. Then, building on prior studies, music recommendations
can be made based on a person's feelings.

CHAPTER 1
INTRODUCTION
1.1 Client Identification/Need identification /Identification of Issue:
A user's emotions and moods can be recognized from their facial expressions. These
representations can be obtained from a live feed through the system's camera. Much research has been
done in the fields of computer vision and machine learning (ML), where machines are trained
to recognize different human emotions and moods.
Machine learning offers a variety of techniques that can be used to detect human emotions.
One such technique is using the MobileNet model in Keras, which generates a small
trained model that is easy to integrate with Android ML.
Music is a great connector. It connects us across markets, ages, backgrounds, languages,
preferences, political leanings and income levels. Music players and other streaming apps are
in high demand because they can be used anytime, anywhere and combined with daily
activities, travel, sports, etc. With the rapid development of mobile networks and digital
multimedia technology, digital music has become mainstream consumer content sought after
by many young people.
People often use music as a means of mood regulation, especially to lift bad moods, boost
energy levels, and relieve tension. Listening to the right kind of music at the right time also
improves mental health. Human emotions are therefore strongly associated with music.
The proposed system creates a mood-based music player that performs real-time mood
detection and suggests songs depending on the detected mood. This will be an additional
feature to the traditional music player app pre-installed on mobile phones. The main benefit of
including sentiment detection is customer satisfaction.
The goal of this system is to analyze users' images, predict users' facial expressions, and
suggest songs suitable for the detected mood.
Neural networks and machine learning have been used for these tasks with good results. Machine
learning algorithms have proven very useful for pattern recognition and classification,
so they can also be used for emotion recognition.

With the development of digital music technology, it is essential to develop a personalized
music recommendation system to recommend music to users. Making recommendations from
the large amount of data available on the internet is a big challenge. E-commerce giants such as
Amazon and eBay provide users with personalized recommendations based on their
preferences and history. Meanwhile, companies like Spotify and Pandora use machine learning
and deep learning techniques to provide relevant recommendations.
The goal of this work is to create a music recommendation system or player that recognizes
the user's face, identifies the current mood, and recommends playlists based on the detected
mood.

1.2 Identification of Problem


With the rapid development of cellular networks and digital multimedia technology, digital
music has become mainstream consumer content sought after by many young people. People
often use music as a mood-regulating tool, especially to change bad moods, increase energy
levels, and reduce tension, which in turn helps improve mental health. Human emotions are therefore
strongly associated with music. The proposed system creates a mood-based music player that
performs real-time mood detection and suggests songs depending on the detected mood. This
will be an addition to the traditional music player apps that come pre-installed on your phone.
The main benefit of incorporating sentiment detection is customer satisfaction. The purpose of
this system is to analyze the user's image, predict the user's facial expressions, and suggest
songs that match the detected mood.
Human emotions are complex and nuanced, but it is possible to train machine learning models
to accurately recognize different emotions that can be distinguished from each other by
specific facial expressions. A person's facial expressions can be used to identify a mood, and
once a particular mood is identified, appropriate music for that person's identified mood can be
suggested.
All of this is backed up by research showing that music affects our emotions in different
ways. Calming and soothing music relaxes the mind and body. In some cases,
changing one's mood can also help in overcoming situations such as depression or sadness, and
measures can be taken to lift the mood to a better level.

1.3 Identification of Tasks
• Information gathering/learning: The first step in designing a successful website
is to gather information. Many things need to be taken into consideration to
provide the look and feel of the site. So we decided first to gather information
about the structure, design, overview, features and complexities of the project.
• Planning: Next we put together a plan for the project. The plan includes
what content will be on the site. The user of the website is kept in mind when
designing the site. We will try to make the front end as friendly as possible,
because a good user interface creates an easy-to-navigate website. To make
sure of this, we will also design wireframes on different platforms to have an overview of
the end product.
• Design: After developing wireframes and planning a roadmap we move on to
designing. We will create one or more prototype designs for the website. This
might typically be a .jpg image of what the final design will look like. This would
include the contents of the website as well.
• Execution:

o Initial Execution: Run/train the model with the given initial parameters on
the test dataset and explain our process and results, including the
results in the report, formatted in the manner most appropriate for clarity.
o Initial Evaluation: Evaluate these initial conditions against the
validation dataset from the scenario and our expectations of the results.
How do the results compare with the validation dataset?
o Parameter Changes: Change parameters to investigate the effects on
model accuracy. How did changing the parameters impact our results? Did
any trends or patterns emerge?
• Model Deployment: Unlike software or application deployment, model
deployment is different. A simple ML model lifecycle has stages like
Scoping, Data Collection, Data Engineering, Model Training, Model Validation,
Deployment, and Monitoring.

• Debugging/Testing: We will debug the system to detect and remove existing and
potential bugs in the software that could lead to failure. We will further test the model
manually and with automated tools to ensure it functions properly.

1.4 Team Work Distribution


The work is divided systematically and equally among all the team members.

1. Ashish(19BCS1124)
Language and Framework
Machine Learning
2. Amardeep Tomar(19BCS1137)
Frontend Developer
Content Writer/ editor
3. Marjana Bharali(19BCS1884)
Developer
Content Writer
4. Tanmay Tripathi(19BCS1120)
Frontend and Backend
integration Backend developer
5. Vidushi Somani(19BCS1119)
Developer
Content Writer

1.5 Timeline

Figure 1.1- Timeline

1.6 Organization of the Report

CHAPTER 1

Here we describe the issues that exist with current models, evaluate the problems based on
various parameters, provide and justify our solution, and identify the tasks that need
to be performed to develop an efficient solution.

CHAPTER 2

Here we trace the timeline of the problem by investigating data from
across the world and documenting evidence of the reported incidents. We then propose solutions
for the identified issue along with the goals and objectives that we want to achieve through this
project.

CHAPTER 3

This part includes the design flow and implementation methodologies, implementing various
machine learning algorithms on our model.

Then we work on adding the different features identified in the literature survey and
test the models against the design constraints.

CHAPTER 4

It covers the entire implementation timeline of the project, documented using DFDs
and flow diagrams. It further showcases the development work, adding snapshots of our
implemented work and the models at the required stages and validating the data.

CHAPTER 5

This includes the conclusion, discussing our expectations from the system, the results
after evaluation and how far we met our intended outcomes. We will also specify the future
scope of the system, which might include required modifications to the solution, a change in
approach, and suggestions for extending the solution.

CHAPTER 2

LITERATURE REVIEW/BACKGROUND STUDY


2.1. Timeline of the reported problem
Traditionally, there have been two basic perspectives on how emotions are related to music:
emotions that may be noticed in music (cognitivist approach) and emotions that are felt from
music (emotivist perspective). In their study, Vempala and Russo compared the
relationships between music and these two distinct types of emotions. They used music
parameters as inputs to train neural networks for both perspectives, and the models' outputs
were arousal and valence, based on psychological input provided by research participants
and emotions noted by music analyses. Results revealed that while networks in both
situations produced similar outcomes for arousal, the cognitivist-perspective networks
outperformed the emotivist ones in terms of valence. Even taking into account the fact that it is impossible
to distinguish characteristics that have a greater impact on the emotions that are sensed
through music from others that correlate more with emotions in music, the research clearly
illustrates the possibility of using this method and the possibilities for further development. All
processes in the human body are closely interrelated, therefore emotions, psychical
and psychophysical conditions might have an impact on each other. The cardiovascular
system is significantly influenced by stress. Some kinds of music affect heart rate, blood
pressure and other psychophysical conditions as well. Ellis and Thayer draw our attention to
the fact that different music attributes such as tempo or beat level can trigger emotional,
psychophysiological and behavioral effects.
Abdat, Garikapati et al. (2020) in their article explain how to build an automatic music player
based on user click trends from movie music. They suggest a music recommendation pattern based
on consumer behavior. They used association rules to work out the correlation
between emotion and song. The experimental findings indicate an accuracy of 80% on
the outcome, but since human emotions change over time, complex estimation and
identification of human emotion remains a key challenge in music recommendation systems.

Ali Mollahosseini et al. (2017) proposed "AffectNet: A Database for Facial Expression,
Valence, and Arousal Computing in the Wild", where more than 1,000,000 facial images were
obtained from the Internet by querying three major search engines using 1250 emotion-related
keywords in six different languages. About half of the retrieved images were manually
annotated for the presence of seven discrete facial expressions and the intensity of valence and
arousal. Two baselines are proposed to classify images in the categorical model and predict the
value of valence and arousal in the continuous domain of the dimensional model. There were
certain limitations, such as VGG16 only improving over AlexNet by replacing large
kernel-sized filters with multiple 3x3 kernel-sized filters stacked one after another. For a given
receptive field, multiple stacked smaller kernels perform better than a single larger
kernel. The AffectNet database also does not contain very strong samples.

Anagha, Bhasa et al. (2020) in their paper explain an automatic face recognition system
organized into three stages: 1. face detection, 2. feature extraction and 3. expression
recognition. The paper describes detecting the face and performing
morphological operations to obtain features such as the eyes and mouth from the face.
They proposed the AAM (Active Appearance Model) technique for facial feature extraction, such as
extracting the eyes, eyebrows, mouth, lips, etc.

Arto Lehtiniemi et al. (2012) published a research paper in which the theoretical idea of a music
recommendation method based on mood images is proposed. The song is
selected for the user depending on the genre it belongs to. This is performed manually and does not
take into account individual emotions. The suggested algorithm recommends music based on the
music's genre and sound, and when it matches the user's listening habits, the music list is
modified and approved.

Chinnamahammad Bhasha et al. (2020) in their paper analysed and proposed Bezier curve
fitting. They used it for extracting the facial features from the original facial input images
and also proposed extracting a region of interest from the input facial images. First the input
image colour is adjusted to make it compatible with the feature extraction process. Then the feature
extraction of the eye and mouth is performed using the region-of-interest technique to extract the
facial feature points for matching. Finally, by applying a Bezier curve to the eye
and mouth points, the human feelings are understood.

Deepthi et al. (2019) published a paper on mood-based music. The research in this paper was
based on a pre-existing user profile. As a result, they used pre-existing photographs as
feedback, and they made recommendations based on emotional dimensions.

H. Immanuel James et al. (2019) proposed "Emotion Based Music Recommendation", which
aims at scanning and interpreting facial emotions and creating a playlist accordingly. The
tedious task of manually segregating or grouping songs into different lists is reduced by
generating an appropriate playlist based on an individual's emotional features. The proposed
system focuses on detecting human emotions for developing emotion-based music players.
A linear classifier is used for face detection. A facial landmark map of a given face image is
created based on the pixel intensity values indexed at each point, using regression trees
trained with a gradient boosting algorithm. A multiclass SVM classifier is used to classify
emotions; emotions are classified as Happy, Angry, Sad or Surprise. The limitations are that
the proposed system is still not able to record all the emotions correctly due to the limited
availability of images in the image dataset being used, and diverse emotions are not covered.
Handcrafted features often lack enough generalizability in in-the-wild settings.

Jae Sik et al., in their research paper, stated the concept of context-awareness in a
recommendation system. Recommendation can be defined as "the process of utilizing the
opinions of a community of customers to help individuals in that community more effectively
identify content of interest from a potentially overwhelming set of choices". Generally, the
products are recommended based on the demographic features of the customer, or based on an
analysis of the past buying behavior of the customer.

James, H. Immanuel et al., in their proposed system "Emotion based music recommendation
system" (2019), describe a system based on Bezier curve fitting. The system uses two steps for
facial expression and emotion recognition: the first is the detection and analysis of the facial area from the
original input image, and the next phase is verification of the facial emotion from characteristic features in the
region of interest. The first phase performs filtering based on the result of lighting compensation, and then, to
estimate the face position and the facial locations of the eyes and mouth, it uses a feature map. After extracting the
region of interest, the system extracts points from the feature map to apply a Bezier curve to the eye
and mouth. Then, to understand the emotion, the system uses training and measures the
Hausdorff distance between the Bezier curves of the entered face image and the images
from the database.

John O'Donovan et al., in their proposed system "Moodplay: Interactive mood-based
music discovery and recommendation" (2016), stated that facial expressions can be categorized
into seven main categories: angry, disgust, happy, fearful, surprise, sad and neutral.
In other words, these facial expressions were globally the same for all races, social strata and age
brackets and were recognized the same among distinct cultures. However, in cases where human beings
prefer to camouflage their expressions, using only facial expression signals may not be enough
to detect emotions reliably.
Kiruthika, Balamurugan et al. (2020) in their paper present an algorithm that automates the
process of generating an audio playlist based on the facial expressions of a user. It replaces the manual
segregation of a playlist and annotation of songs in accordance with the current emotion of the user.
Existing algorithms are slow, and the integration of extra hardware such as
electroencephalogram systems and sensors would be complex and provide less promising
results. The work additionally aims at increasing the accuracy of the designed system. The facial expression
recognition module of the proposed algorithm is validated, tested and experimented on images
from a pre-defined dataset.

Mahadik et al., in their report, describe a system in which Anaconda and Python 3.5
were used to test the functionality, and the Viola-Jones and Haar cascade algorithms were
used for face detection. Similarly, the KDEF (Karolinska Directed Emotional Faces) dataset and
VGG (Visual Geometry Group) 16 were used with a CNN (Convolutional Neural Network) model,
which achieved an accuracy of 88% for face recognition and classification and
validated the performance measures. The results showed that the network architecture
designed improved on existing algorithms. Another system used Python 2.7, the
Open Source Computer Vision Library (OpenCV) and the CK (Cohn-Kanade) and CK+ (Extended
Cohn-Kanade) databases, which gave approximately 83% accuracy. Certain researchers have
described the Extended Cohn-Kanade (CK+) database for those wanting to prototype and
benchmark systems for automatic facial expression detection.

Renuka et al. (2012) in their research paper suggested MoodPlay, a music recommendation
framework that takes into account both the user's mood and the music they are listening to.
MoodPlay explores and suggests music based on the users' subjective attributes.

S. Metilda Florence et al. (2020) proposed a paper, "Emotional Detection and Music
Recommendation System based on User Facial Expression", where the proposed system can
detect the facial expressions of the user and, based on his/her facial expressions, extract the
facial landmarks, which are then classified to obtain a particular emotion of the user.
Once the emotion has been classified, the songs matching the user's emotion are
shown to the user. It can assist a user in deciding which music to
listen to, helping the user to reduce his/her stress levels. The user does not have to
waste any time searching or looking up songs. The proposed architecture contained
three modules, namely, the emotion extraction module, the audio extraction module and the emotion-
audio extraction module. It had some limitations: the proposed system was not
able to record all the emotions correctly due to the limited availability of images in the
image dataset being used; the image fed into the classifier should be taken in a well-
lit environment for the classifier to give accurate results; and the quality of the image should be
higher than 320p for the classifier to predict the emotion of the user accurately.
Handcrafted features often lack enough generalizability in in-the-wild settings.

Yading Song et al., in their research paper, surveyed a general framework and state-of-the-art
approaches to recommending music. Two popular algorithms, collaborative filtering (CF) and
the content-based model (CBM), have been found to perform well. Due to the relatively poor
experience in finding songs in the long tail and the powerful emotional meanings in music, two
user-centric approaches, the context-based model and the emotion-based model, have received
increasing attention. In the paper, three key components of a music recommender, user
modelling, item profiling, and matching algorithms, are discussed. Six recommendation models
and four potential issues affecting user experience are explained. However, subjective music
recommendation systems have not been fully investigated. To this end, they propose a motivation-
based model using empirical studies of human behaviour, sports education, and music
psychology.

Yusuf Yaslan et al. proposed "Music recommendation system based on mood" (2018), which
stated that facial expressions can be categorized into seven main categories: angry,
disgust, happy, fearful, surprise, sad and neutral. In other words, these facial expressions were
globally the same for all races, social strata and age brackets and were recognized the same among
distinct cultures. However, in cases where human beings prefer to camouflage their expressions,
using only facial expression signals may not be enough to detect emotions reliably.

2.2. Proposed solutions


The main objective of the system is to find the music tracks closest to an abstract reference (etalon) track,
which is defined by a specific set of music-related criteria, regardless of the actual purpose, whether there is
a need to change a user's emotional state or to maintain and keep it the same.
We could point out several action modes of our system.
Collaborative filtering is prone to popularity bias, as expected from its inherent social component.
"It tends to reinforce popular artists, at the expense of discarding less-known music" (Celma
and Cano 2008). The popularity of music can be measured in terms of total play counts (Celma
and Cano 2008) or the fraction of total consumption fulfilled (Goel et al. 2010). In a user
preference dataset, popular items seem to be similar to (or related with) lots of items, such that
they are more likely to be recommended. As a consequence, the recommenders are sometimes
biased towards a small number of popular items and do not explore the long tail of unknown
items that could be more interesting and novel for the users (Celma 2010). Navigation through
the network of popular artists reveals a poor discovery ratio, and this can decrease user
satisfaction and novelty detection in the recommendation workflow (Herlocker et al. 2004;
McNee et al. 2006). On the other hand, content-based and human expert-based
recommendation systems are not vulnerable to the popularity bias.
One possible way to recommend long-tail items using conventional collaborative filtering is to
identify a candidate pool of long-tail items from which to draw recommendations. The positive
effects of music on people's wellbeing can be applied in many contexts. It is necessary to have
a generalist system that takes into account the unique characteristics of each individual user to
address these potential scenarios. This section aims to build a solution that combines well-
known generalised methodologies with personal characteristics of the physical and emotional
influence of music-related features in varied circumstances.
As a result, on-the-fly model training will be supported by continuous data gathering and
analysis of user feedback, which will further enhance recommendation personalisation.

2.3. Bibliometric analysis
Research from the previous section is taken into consideration when identifying sets of
musical characteristics and their values in relation to the emotional and physiological states
that they may elicit or lead to. In this section, we examine the scenarios in which the impact of music on
human wellbeing can be used.

1) Research and intellectual activity: The effectiveness of intellectual activity and study
requires maintaining energy, vitality, spirits, freshness of the brain, and acute attention.
According to studies, nearly half of those surveyed think that listening to music while studying
helps them focus better. Others said that listening to music keeps their minds quiet and keeps
them from nodding off when studying. The work goes more quickly when the person is alert
and upbeat. In this situation, listeningto music should be done with the intention of raising
alertness and aiding in a quicker recovery from fatigue. Fatigue, arousal, satisfaction with the
process, and productivity outcomes are important factors to consider while evaluating one's
personal status during and after a working session. Our method entails recording indications of
the aforementioned criteria before, during, and after the listening session and comparing them
to characteristics of the music being listened to. Of course, there may be many other factors
that affect people while they are engaged in intellectual activity. In order to
support this process with music effectively, the system must be aware of these other factors
rather than allow music curation to distract from tasks being completed to a sufficient level.

2) Physical work and sport: Activities may have different demands on speed and endurance.
Human performance and well-being are determined by psychological and physical factors.
There are well-established, consistent health support practices in sports that include in-depth
medical evaluations, measures, and wellbeing monitoring. Our strategy of providing emotional
support through music consumption can considerably improve sports regulation approaches by
building on these activities. While highly rigorous repeated sprint activities may not benefit much
from listening to favourite music, it does increase motivation and reduce overexertion.
In 1.5-mile running activities, listening to music had a significant impact on performance
but did not reduce perceived exertion. According to the findings of these studies, music has a
variety of effects on many types of sport activities. Faster music is used for anaerobic
activities, while slower music is preferred for exercises aimed at building strength and stamina. The
choice of music is also crucial on an individual basis.
3) Personal safety: People encounter circumstances every day where they must maintain a
keen focus on critically important matters in order to prevent hazards to their lives and health.
For instance, drivers must pay attention to traffic, and being unwell or drowsy could have
negative effects. Jeon outlines the use of music to lessen the emotive impacts of driving in his
research. Pedestrians are also at risk, especially as they cross the street. Nothing, not even music,
should divert attention from the road in either scenario. The recommendation system must
support being revitalised and feeling energised while avoiding distraction.

2.4 Review Summary


The ultimate goal of our team was to provide the user with something that will really help
them by providing music according to their current mood, and our project was based on exactly
that. After going through 100 research papers and closely analyzing them, we learnt that they
use some form of feature selection and feature extraction, most often prior to applying the
machine-learning algorithm. In most cases, researchers were using OpenCV to capture the
image of the user and convert it into a greyscale image according to the needs of the dataset. The
user's image was then sent to the trained model. In most of the reports we found that they were
using datasets from Kaggle for better accuracy, so we used a Kaggle dataset as well. For the music
recommendation part, after analyzing some papers, we used Spotipy, which helps us build a dataset
from the Spotify API and, in the end, provides the music recommendations (a minimal sketch of how
Spotipy can be queried is given below). Ultimately, we were able to build a system that takes
the user's image, recognizes their emotions and suggests music as per the detected mood.
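The snippet below is a hedged, minimal sketch of how Spotipy can query the Spotify Web API for mood-appropriate tracks. The client credentials, the genre seed and the mood-to-audio-feature mapping are illustrative assumptions, not values from our implementation.

```python
# Hedged sketch: querying the Spotify Web API with the spotipy library.
# The credentials and mood-to-audio-feature mapping below are assumptions.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(
        client_id="YOUR_CLIENT_ID",         # assumed placeholder
        client_secret="YOUR_CLIENT_SECRET"  # assumed placeholder
    )
)

# Rough mapping of detected moods to Spotify audio-feature targets (assumption).
MOOD_TARGETS = {
    "happy":   {"target_valence": 0.9, "target_energy": 0.8},
    "sad":     {"target_valence": 0.2, "target_energy": 0.3},
    "neutral": {"target_valence": 0.5, "target_energy": 0.5},
}

def recommend_tracks(mood, limit=10):
    """Return (track name, artist) pairs suited to the detected mood."""
    params = MOOD_TARGETS.get(mood, MOOD_TARGETS["neutral"])
    results = sp.recommendations(seed_genres=["pop"], limit=limit, **params)
    return [(t["name"], t["artists"][0]["name"]) for t in results["tracks"]]

print(recommend_tracks("happy"))
```

In practice, such query results could be saved once into a CSV/JSON dataset so the app does not depend on live API calls.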

2.5 Problem Definition


Only a few years back, we were choosing music by genre or artist, and that was the only option
available. But nowadays simply listening to music is not enough; things have changed
drastically with the availability of custom or pre-curated playlists and personal
recommendations on music platforms. People feel more connected to music if it
matches their mood and emotion. In our proposed system, a mood-based music player is
created which performs real-time mood detection and suggests songs as per the detected mood.
This becomes an additional feature to the traditional music player apps that come pre-
installed on our mobile phones.
An important benefit of incorporating mood detection is customer satisfaction. The objective
of this system is to analyse the user's image, predict the expression of the user and suggest
songs suitable to the detected mood. The accuracy of our model in correctly recognizing
emotions will increase over time to provide the most accurate results.

2.6 Goals/Objectives
Our aim is to develop a method of face mood detection that is fast, robust,
reasonably simple and accurate, using relatively simple and easy-to-understand
algorithms and techniques. The examples provided in this report are real-time
and taken from our own surroundings.
In the past, analyzing facial expressions was an essential task for psychologists.
Nowadays, image processing has significantly motivated research work on
automatic face mood detection. There are many depressed people in our
society, as well as many busy people who do not know their present mental
condition. So we are trying to develop an application through which they
will be able to see their present mental state.
1. It can detect facial expressions using OpenCV.
2. It will suggest music.
3. It is a user-friendly and reliable application.
4. Users can get relief from mental depression and also release their tension.
5. We will also describe the problems with existing apps and why our project improves on them.
6. We create a database where we store the songs and user information for our application.

CHAPTER 3

DESIGN FLOW/PROCESS

3.1 Evaluation & Selection of Specifications/Features


• Image processing is a technique that can convert an image into digital form and perform
different kinds of operations on it to obtain a better image and extract useful information. Image
processing uses two types of methods: analog and digital image processing. The analog
technique can be used for hard copies, while the digital technique is used for manipulating
digital images. The purposes of image processing can be divided into five groups:
visualization, image sharpening, image retrieval, measurement of pattern, and image recognition.
Visualization makes objects that are hard to see observable; image sharpening produces a better image; image
retrieval finds images of interest; measurement of pattern measures the different objects in an
image; and image recognition distinguishes the objects in an image.
• The human face is an important part of an individual's body, and it plays an especially
important role in revealing an individual's behavior and emotional state. Manually
segregating a list of songs and generating an appropriate playlist based on an individual's
emotional features is a very tedious, time-consuming, labor-intensive and uphill task. We are
using three types of algorithms in our development: PCA, MPCA and machine learning
models. We are working with the human eyes and mouth for emotion detection, and we are
testing many images to detect human emotions.
• Systems in this situation gradually close the gap between feature vectors that reflect the current
and desired points. This mode is designed to be less annoying for the user. The calculation of a
multidimensional (multi-featured) "delta" distance (step), which takes into consideration the
personality of each unique user, calls for a more complex algorithm. Therefore, during the data
collection and model construction stages, we must record a Personalized Emotion
Transformation Model (PETM) for each user.
• In both situations, we must continue to update the model to ensure that it continues to reflect the
user's current personality.

• As a result, on-the-fly model training will be supported by continuous data gathering and
analysis of user feedback, which will further enhance recommendation personalisation.

• The feature extraction task, and the subsequent characterization, can and has been performed
with a multitude of methods. The general approach of using Gabor transforms coupled with
neural networks, similar to Zhang's approach, is popular. Other extraction methods,
such as local binary patterns by Shan, histograms of oriented gradients by Carcagni, and facial
landmarks with Active Appearance Modeling by Lucey, have been used. Classification is often
performed using learning models such as support vector machines.

3.1.1 Software and Hardware Requirements:


The software and hardware required are as follows:

Software Requirements:
Browser
• Chrome
• Brave
• Mozilla Firefox
• Edge

Operating System
• Windows
• Linux
• Android
• MacOS

Development Tools and Libraries
• Visual Studio
• Pycharm
• Jupyter
• Python 3.9.0
• OpenCV 4.2.0

Hardware Requirements:
• PC with 4 GB RAM
• Microsoft Windows 7/8/10 (32 or 64 bit)
• 500 MB disk space
• 1 GB for the Android SDK
• 1280x800 screen resolution
• A processor fast enough to run a browser

3.2 Design Constraints

The following constraints were kept in check while developing this application:

• Economic Constraints :

Any outside, uncontrollable economic influence on a corporation is referred to as an economic
constraint. Economic limitations are primarily defined by the fact that they are external. This
means that they are independent of the internal corporate environment. Another trait is that the firm
cannot control them and must adapt to them.

• Environmental :

Any restrictions on a strategy's alternatives caused by external or internal politics, rivalry,
social expectations, cultural or economic reasons, technology limitations, or legal requirements
are considered environmental constraints. The environment in which a commercial activity
functions may limit or confine it. These environmental limits and how they develop over time
must be constantly kept in mind by businesses.

• Health :

Health care inputs market restrictions are typically a sign of market dysfunction, which may be
brought on by the fact that these markets are highly regulated in order to address issues with
information asymmetry.

• Manufacturability :

Engineering design should adhere to the constraints of the selected manufacturing methods
when determining and using manufacturability constraints. The design challenge can be
formulated in a variety of ways and, as was previously said, should take the mechanics of the
fabrication processes into account. The design could be formulated as a nested problem, a
sequential problem, or a single problem (for example, structural topology optimization or
kinematic mechanism optimization). Which formulation is ideal depends on the manufacturing
process chosen, the preferences of the designer, and the problem's goals. A single problem-solving
step may be appropriate if the issue is really straightforward.

• Safety :

Any unforeseen events that may have an impact on the project are considered project risks. While
most project risks are bad, some can be good. For instance, a new technology might be unveiled while
the project is still under development. This technology might make the project go more
quickly, or it might increase market rivalry and lower the worth of the finished product.

Project risks can be identified through risk analysis and managed effectively utilising risk
management techniques. You might run the following risks:

1. Stretched resources

2. Operational mishaps

3. Low performance

4. Lack of clarity

5. Scope creep

6. High costs

Figure 3.1 - Design Constraints

Resources tie closely to cost constraints on the project because these project requirements cost
money. Without proper resource allocation, the project can experience lower quality, an increased
budget, and timeline delays.
Some resources to consider include:

● People

● Equipment or materials

● Facilities

● Software
Use a resource management plan to ensure you have the resources you need for every element of
your project so that this constraint doesn’t negatively affect other project areas.

3.3 Analysis and Feature finalization subject to constraints
Deep Learning based Facial Expression Recognition using Keras:
1. Using this algorithm, up to five distinct facial emotions can be detected in real time. It
runs on top of a Convolutional Neural Network (CNN) that is built with the help of Keras,
whose backend is TensorFlow in Python.
2. The facial emotions that can be detected and classified by this system are Happy, Sad,
Angry, Surprise and Neutral. OpenCV is used for image processing tasks: a face is
identified from a live webcam feed, which is then processed and fed into the trained
neural network for emotion detection (a minimal sketch of this pipeline is given after this list).
3. Deep learning based facial expression recognition techniques greatly reduce
the dependency on face-physics-based models and other pre-processing
techniques by enabling end-to-end learning to occur in the pipeline directly from the
input images.
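The following is a minimal sketch of the detection pipeline referenced in point 2, under stated assumptions: a Keras model saved as "fer_model.h5" trained on 48x48 grayscale faces, and a label order matching how the model was trained. File name and label order are assumptions for illustration, not the exact artifacts of this project.

```python
# Hedged sketch: webcam -> Haar cascade face detection -> CNN emotion prediction.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

LABELS = ["Angry", "Happy", "Neutral", "Sad", "Surprise"]  # assumed order
model = load_model("fer_model.h5")                          # assumed file name
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                                   # live webcam feed
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        # Crop the face, resize to the model's input size and normalise.
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
        pred = model.predict(face.reshape(1, 48, 48, 1), verbose=0)
        cv2.putText(frame, LABELS[int(np.argmax(pred))], (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Mood detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```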

Machine Learning Algorithms:


1. One of the most important applications of artificial intelligence is machine learning. It
provides applications that can automatically learn and improve from experience
without being explicitly programmed. The learning process starts with observations or
data, from which a good decision can be made based on direct experience or instruction.
2. The basic aim is to allow the system to learn without human intervention. Machine
learning algorithms are mostly classified into two types: supervised and unsupervised
learning.
3. Supervised Machine Learning Algorithms: Supervised learning algorithms are able to
analyse new data based on what they learned from the past and can also predict future
events. Supervised learning algorithms infer a function for prediction from known
training data and output values.
4. The system is able to compare its output with the correct output and find errors so
that the model can be modified.
5. Unsupervised Machine Learning Algorithms: Unsupervised machine learning algorithms
are used for training on unclassified data that is not labelled. Unsupervised
learning is able to discover hidden structure in unlabelled data. Such a system cannot provide
a definitive output, but it is able to draw important conclusions from the data set by
describing the hidden structure in the unlabelled data.
6. Semi-supervised Machine Learning Algorithms: Semi-supervised machine learning algorithms lie
between supervised and unsupervised learning. For training they use both labelled and
unlabelled data, but the training data typically contains a small amount of labelled data and a
huge amount of unlabelled data.
7. By using this method the systems are able to improve their learning accuracy. Normally semi-
supervised learning algorithms are used when labelling data requires skilled and
relevant resources for training.
8. Keras: Keras is a high-level neural networks library written in Python that works as a wrapper
around TensorFlow. It is used in cases where we want to quickly build and test a neural
network with minimal lines of code. It contains implementations of commonly used
neural elements like layers, objectives, activation functions, optimizers, and tools
to make working with image and text data easier (a small illustrative snippet is given below).
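The snippet below is only an illustrative sketch of how a small CNN can be assembled and compiled in a few lines of Keras; the specific architecture and class count are assumptions, not the exact model used in this project.

```python
# Illustrative sketch: a small CNN for 48x48 grayscale expression images,
# showing how Keras wraps layers, activations, optimizers and objectives.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(48, 48, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),   # e.g. five mood classes (assumption)
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",   # the training objective
              metrics=["accuracy"])
model.summary()
```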

PCA:

• In high-dimensional data, this method is designed to model linear variation. Its goal is
to find a set of mutually orthogonal basis functions that capture the directions of
maximum variance in the data and for which the coefficients are pairwise decorrelated.
For linearly embedded manifolds, PCA is guaranteed to discover the dimensionality of
the manifold and produces a compact representation.

• PCA was used to describe face images in terms of a set of basis functions, or "eigenfaces".
Eigenfaces were introduced early on as a powerful use of principal component analysis
(PCA) to solve problems in face recognition and detection. PCA is an unsupervised
technique, so the method does not rely on class information. In our implementation of
eigenfaces, we use the nearest-neighbour (NN) approach to classify test vectors using
the Euclidean distance.
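A hedged sketch of this eigenface idea with scikit-learn follows: faces are projected onto principal components and a test face is classified by its nearest neighbour under the Euclidean distance. The random arrays stand in for a real face dataset and are purely an assumption for illustration.

```python
# Hedged sketch: PCA (eigenfaces) + nearest-neighbour classification.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data: flattened 48x48 grayscale faces with mood labels (assumption).
rng = np.random.default_rng(0)
X_train = rng.random((200, 48 * 48))
y_train = rng.integers(0, 5, size=200)
X_test = rng.random((10, 48 * 48))

pca = PCA(n_components=50, whiten=True)   # principal components = "eigenfaces"
train_proj = pca.fit_transform(X_train)
test_proj = pca.transform(X_test)

nn = KNeighborsClassifier(n_neighbors=1, metric="euclidean")
nn.fit(train_proj, y_train)
print(nn.predict(test_proj))              # predicted mood class per test face
```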

MPCA:
• One extension of PCA is that of applying PCA to tensors or multilinear arrays, which
results in a method known as multilinear principal component analysis (MPCA). Since a
face image is most naturally a two-dimensional array, with two dimensions describing the
location of each pixel, the idea is to determine a multilinear projection for
the image, instead of forming a one-dimensional (1D) vector from the face image and
finding a linear projection for the vector.
• It is thought that the multilinear projection will better capture the correlation between
neighbouring pixels that is otherwise lost in forming a 1D vector from the image.

3.4 Design Flow

Figure 3.2 - Design Flow 1

Figure 3.3- Design Flow 2

3.5 Design Selection:

Figure 3.4- Final Design Flow

The above design flow is what we used for the project. It is a combination of the previous
two flowcharts and was best suited to our requirements.
First, the user provides an image as input. To restore lost contrast, histogram equalization is applied
by remapping the brightness values of the image (a minimal sketch of this step is shown below). Then the face
boundary is detected and the eye and lip regions are cropped using PCA and MPCA. The image is then sent to the
machine learning kit (ML Kit). ML Kit, recently developed by Google, ships with trained models. It provides
powerful features and surfaces new information, which is part of why machine learning has become so popular.
The ML Kit SDK can recognize text, detect faces, recognize landmarks, scan barcodes and
label images. In this project we use ML Kit for detecting face mood; it can report a happiness
percentage.
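A minimal sketch of the contrast-restoration step, assuming an input image file named "input.jpg":

```python
# Hedged sketch: histogram equalization with OpenCV to restore lost contrast
# before face/eye/lip detection. The file names are assumptions.
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
equalized = cv2.equalizeHist(img)          # remaps the brightness values
cv2.imwrite("equalized.jpg", equalized)
```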

By applying some conditions on top of ML Kit, we derive four facial expressions (Happy, Sad, Calm
and Angry), and based on the detected facial expression the system suggests music from a database built in
Firebase. We compared our work with other research works and applications, studying them to learn
how they operate and noting details such as the algorithms used and the accuracy of those apps; all of this
related research concerns emotion analysis.
"An Emotion Recognition Challenge" used principal component analysis (PCA) as its baseline
method. For this project, we are using Firebase as the database where data is stored.
It is a cloud-hosted real-time database that stores data in a JSON tree format. Using machine learning
models, PCA and MPCA, facial mood expressions can easily be analyzed.
The system thus aims at providing Android users with a cheap, free, user-friendly and accurate
emotion detection system, which is really helpful to the users. Our app is helpful for changing
a user's mood; its main advantage is that it detects human emotions accurately and also
suggests music and jokes for changing their mood.

3.6 Gesture Classification:

• Hidden Markov Models (HMMs) are used for the classification of gestures. This model deals
with the dynamic aspects of gestures. Gestures are extracted from a sequence of video images by
tracking the skin-colour blobs corresponding to the hand in a body-face space
centred on the face of the user.
• The goal is to recognize two classes of gestures: deictic and symbolic. The image is filtered using a
fast look-up indexing table. After filtering, skin-colour pixels are gathered into blobs. Blobs are
statistical objects based on the location (x, y) and the colorimetry (Y, U, V) of the skin-colour pixels,
used to determine homogeneous areas.
• A Naïve Bayes classifier is also used, which is an effective and fast method for static hand gesture
recognition. It is based on classifying the different gestures according to geometric invariants
obtained from the image data after segmentation.
• Thus, unlike many other recognition methods, this method does not depend on skin colour. The
gestures are extracted from each frame of the video, with a static background.
• The first step is to segment and label the objects of interest and to extract geometric invariants
from them. The next step is the classification of gestures using a K-nearest-neighbour algorithm aided
with a distance-weighting algorithm (KNNDW) to provide suitable data for a locally weighted Naïve
Bayes classifier.
• According to the paper "Human Hand Gesture Recognition Using a Convolution Neural
Network" by Hsien-I Lin, Ming-Hsiang Hsu, and Wei-Kai Chen (graduates of the Institute of
Automation Technology, National Taipei University of Technology, Taipei, Taiwan), they constructed a
skin model to extract the hands from an image and then applied a binary threshold to the whole image.
After obtaining the thresholded image, they calibrate it about the principal axis in order to centre the
image about the axis. They input this image to a convolutional neural network model in order to train
and predict the outputs. They trained their model over 7 hand gestures, and using this model
they produced an accuracy of around 95% for those 7 gestures.

3.7 Implementation Plan/Methodology


The mood-based music recommendation system is an application that focuses on implementing real-
time mood detection. It is a prototype of a new product that comprises two main modules: facial
expression recognition/mood detection and music recommendation.
Android is a mobile operating system developed by Google; it is the most widely used operating system
and everyone can easily understand it. HTML & CSS are used for the design. Firebase is used for
storing and synchronizing data and provides a real-time database. Google APIs are used for the trained
data. Before starting work, we need to create a Python virtual environment, as sketched below.
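A minimal sketch of creating that virtual environment from Python itself; the environment name is an assumption.

```python
# Hedged sketch: create the project's virtual environment programmatically.
import venv

venv.create("mood-music-env", with_pip=True)
# Activate it from a shell before installing dependencies, e.g. on Windows:
#   mood-music-env\Scripts\activate
# and on Linux/macOS:
#   source mood-music-env/bin/activate
```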

3.7.1 Implementation

1. Mood Detection Module: This Module is divided into two parts:

a. Face Detection: the ability to detect the location of a face in any input image or frame. The output
is the bounding box coordinates of the detected faces. For this task, initially the Python
library OpenCV was considered, but integrating it with an Android app was a complex task,
so the FaceDetector class available in Java was used instead. This library identifies the faces
of people in a Bitmap graphic object and returns the number of faces present in a given
image.

b. Mood Detection: classification of the emotion on the face as happy, angry, sad, neutral,
surprise, fear or disgust. For this task, the traditional Keras module of Python was used but,
in the survey, it was found that this approach takes a lot of time to train and validate and also
works slowly when integrated with Android apps. So MobileNet, which is a CNN architecture
model for image classification and mobile vision, was used.

c. There are other models as well, but what makes MobileNet special is that it requires very little computation power to run or to apply transfer learning to. This makes it a perfect fit for mobile devices, embedded systems and computers without a GPU or with low computational capacity, without compromising the accuracy of the results.
It uses depth-wise separable convolutions to build light-weight deep neural networks. The dataset used for training was obtained by combining the FER 2013 dataset and the MMA Facial Expression Recognition dataset from Kaggle. The FER 2013 dataset contains grayscale images of size 48×48 pixels. All the images were therefore converted to match the FER 2013 format and combined to obtain an even larger dataset with 40,045 training images and 11,924 testing images. MobileNet was used with Keras to train and test the model for seven classes – happy, angry, neutral, sad, surprise, fear and disgust. The model was trained for 25 epochs and achieved an accuracy of approximately 75%.
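The exact training script for the MobileNet model is not reproduced in this report. The sketch below shows one way such transfer learning could be set up with Keras; the folder names (data/train, data/test) and the hyperparameters other than the seven classes and 25 epochs are assumptions, not the project's actual settings.

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Base network: MobileNet pre-trained on ImageNet, without its classifier head.
base = tf.keras.applications.MobileNet(weights='imagenet', include_top=False,
                                       input_shape=(224, 224, 3))
base.trainable = False  # transfer learning: keep the convolutional features frozen

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(7, activation='softmax')  # happy, angry, neutral, sad, surprise, fear, disgust
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Assumed layout: one sub-folder per emotion class inside data/train and data/test.
datagen = ImageDataGenerator(rescale=1./255)
train_gen = datagen.flow_from_directory('data/train', target_size=(224, 224),
                                        batch_size=64, class_mode='categorical')
val_gen = datagen.flow_from_directory('data/test', target_size=(224, 224),
                                      batch_size=64, class_mode='categorical')

model.fit(train_gen, epochs=25, validation_data=val_gen)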

2. Music Recommendation Module: A dataset of songs classified by mood was found on Kaggle for two different languages – Hindi and English. Research was conducted to find a good cloud storage platform to store, retrieve and query this song data as per the user's request. Options like AWS, Google Cloud, etc. were considered but rejected, as they were costly and provided very limited storage for free. Open-source streaming services like Restream.io, Ampache, etc. were also investigated, but these services are web based, meant for live streaming on YouTube, or available only for personal use.

After a lot of research (and given the time constraints), Firebase was chosen as the backend server. It can be integrated with an Android app in one click, and its free plan provides 5 GB of storage. However, functions like user queries, server updates, etc. are part of a paid plan, so it was decided to limit the scope of the project. The mp3 versions of the songs were manually uploaded to Firebase Storage and were linked in the Realtime Database as per mood and language.
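The report does not show the exact database layout. The sketch below illustrates one possible way the Realtime Database could organise the uploaded songs by mood and language; the node names and download URLs are hypothetical placeholders.

{
  "songs": {
    "happy": {
      "english": { "song1": "https://firebasestorage.googleapis.com/<bucket>/happy_en_1.mp3" },
      "hindi":   { "song1": "https://firebasestorage.googleapis.com/<bucket>/happy_hi_1.mp3" }
    },
    "sad": {
      "english": { "song1": "https://firebasestorage.googleapis.com/<bucket>/sad_en_1.mp3" },
      "hindi":   { "song1": "https://firebasestorage.googleapis.com/<bucket>/sad_hi_1.mp3" }
    }
  }
}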

Fig 3.5- Flowchart of Facial Mood Expression

The TensorFlow Lite converter takes a TensorFlow model as input and generates a TensorFlow Lite model as output with a .tflite extension. Since the MobileNet model is used, the size of the .tflite file is expected to be around 20–25 megabytes (MB), which was the desired size. In Android Studio, an assets folder was created to store the .tflite file and the labels.txt file. The labels.txt file contains the class labels of the model. All the appropriate methods were created for loading the model, running the interpreter and obtaining the results. A project on Firebase was created and the mp3 songs were uploaded in the Storage section. These songs were then linked as per mood and language in the Realtime Database section. After this, the Firebase database was linked to Android Studio. An appropriate UI for the Android application was created and the tflite model methods were linked with the songs on Firebase. Finally, the application was tested to fix any bugs.
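A minimal sketch of the TensorFlow Lite conversion step described above, assuming the trained Keras model was saved as mood_mobilenet.h5 (a hypothetical file name), is:

import tensorflow as tf

# Load the trained Keras model and convert it to TensorFlow Lite.
model = tf.keras.models.load_model('mood_mobilenet.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional optimisation to shrink the file
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)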

The system architecture diagram depicts the overall outline of the software system and the relationships, constraints, and boundaries between components. When the user opens the Android app, the main screen is displayed, which contains three buttons: take snap, use emoji, and play songs. If the user clicks on the take snap button, the camera opens and the user clicks a picture. This picture is given as input to the face detection program. If no face is detected or multiple faces are detected, an appropriate error message is displayed to the user. Upon successful single face detection, the picture is given as input to the mood detection module. The detected mood is displayed to the user, after which the play songs button gets enabled. If the user presses the use emoji button, a screen of five emojis is displayed as shown in Fig. 10. The user can click on any emoji to obtain the respective playlist. To exit the app, the user just has to press the back button.

3.7.2 Feature Extraction and Representation


An image is represented as a 3D matrix whose dimensions are the height and width of the image, with the value of each pixel as the depth (1 in the case of grayscale and 3 in the case of RGB). These pixel values are then used for extracting useful features using a CNN.
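For example, a minimal illustration (the image file name is a placeholder):

import cv2

img = cv2.imread('face.jpg')                  # placeholder file name
print(img.shape)                              # e.g. (480, 640, 3): height x width x 3 colour channels (BGR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(gray.shape)                             # (480, 640): a single-channel (depth 1) grayscale matrix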

Artificial Neural Network (ANN):


An Artificial Neural Network is a connection of neurons, replicating the structure of the human brain. Each connection of neurons transfers information to another neuron. Inputs are fed into the first layer of neurons, which processes them and transfers the result to further layers of neurons called hidden layers. After the information has passed through multiple hidden layers, it is passed to the final output layer.

Fig 3.6 ANN

These are capable of learning and have to be trained. There are different learning strategies:
1. Unsupervised Learning
2. Supervised Learning
3. Reinforcement Learning

Convolutional Neural Network (CNN):


Unlike regular neural networks, the neurons in the layers of a CNN are arranged in 3 dimensions: width, height and depth. The neurons in a layer are connected only to a small region (the window size) of the layer before it, instead of to all of the neurons in a fully-connected manner. Moreover, the final output layer has dimensions 1×1×(number of classes), because by the end of the CNN architecture the full image is reduced to a single vector of class scores.

Fig 3.7 CNN

1. Convolution Layer:
In the convolution layer we take a small window (typically of size 5×5) that extends through the depth of the input matrix. The layer consists of learnable filters of that window size. During every iteration we slide the window by the stride size (typically 1) and compute the dot product of the filter entries and the input values at the given position.
As we continue this process, we create a 2-dimensional activation map that gives the response of that filter at every spatial position. That is, the network learns filters that activate when they see some type of visual feature such as an edge of some orientation or a blotch of some colour.
2. Pooling Layer:
We use a pooling layer to decrease the size of the activation matrix and ultimately reduce the number of learnable parameters. There are two types of pooling:
A. Max Pooling: In max pooling we take a window (for example of size 2×2) and keep only the maximum of its 4 values. We slide this window across the activation matrix and continue this process, so we finally get an activation matrix half of its original size.

B. Average Pooling: In average pooling, we take the average of all the values in the window.

Fig 3.8 Pooling

3. Fully Connected Layer:
In a convolution layer, neurons are connected only to a local region, while in a fully connected layer we connect all the inputs to every neuron.
4. Final Output Layer:
After getting values from the fully connected layer, we connect them to the final layer of neurons (having a count equal to the total number of classes), which predicts the probability of each image belonging to each of the classes. A minimal Keras sketch combining these four layer types is shown below.
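A minimal Keras sketch that stacks these four layer types for a 48×48 grayscale input (the layer sizes here are illustrative only, not the exact architecture used in the project):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (5, 5), activation='relu', input_shape=(48, 48, 1)),  # convolution layer
    layers.MaxPooling2D((2, 2)),                                            # pooling layer
    layers.Flatten(),
    layers.Dense(128, activation='relu'),                                   # fully connected layer
    layers.Dense(7, activation='softmax')                                   # final output layer: one score per class
])
model.summary()  # prints the shape of the activation matrix after every layer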

CHAPTER 4

RESULTS ANALYSIS AND VALIDATION


4.1. Implementation of solution
4.1.1 Analysis

We are utilising Firebase as the database for this project, where the data is kept. It is a real-time database housed in the cloud and stores data in a JSON tree format. With the help of the Keras module and a CNN architecture, facial expressions of mood can be assessed quickly. After monitoring the facial expression, the system can identify a person's mood with an accuracy of around 75.7% and can recommend music depending on that mood. The user must first provide an image as input. Histogram equalisation is used to improve lost contrast by remapping the image's brightness values. Then, using the CNN architecture, MobileNet, and mobile vision, the face boundary, eye region and lip region are detected and cropped. The image is then passed to the machine learning kit (ML Kit), a toolkit recently released by Google that is built on trained models and offers strong features. The ML Kit SDK supports text recognition, face detection, landmark recognition, barcode scanning and image labelling. In this project, the ML Kit is used to identify the mood of the face; it is able to gauge happiness levels. Under certain conditions, four facial expressions (happy, sad, calm and angry) are derived using the ML Kit, and based on these facial expressions the system suggests music from the database created in Firebase. Once an image has been chosen as input, the CNN architecture, MobileNet and mobile vision examine it before sending it to the ML Kit for additional processing. After adding some conditions and utilising the ML Kit, human facial expressions were predicted with an accuracy of around 75.7%, and music is suggested based on the detected emotion.
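A minimal OpenCV sketch of the pre-processing described above (histogram equalisation followed by face detection), assuming the same Haar cascade file used in the appendix code and a placeholder input file name:

import cv2

img = cv2.imread('input.jpg')                      # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
equalized = cv2.equalizeHist(gray)                 # histogram equalisation to restore lost contrast

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(equalized, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    face_roi = equalized[y:y + h, x:x + w]         # crop the detected face region
    face_roi = cv2.resize(face_roi, (48, 48))      # resize it for the emotion classifier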

4.1.2 Experimental Results
Working Project:
Once the web page opens, it accesses the front camera to detect facial emotions and recommend music:

Fig 4.1 Mood Happy

Fig 4.2 Mood Surprised

Fig 4.3 Mood Neutral

4.1.3 Testing
First of all, what are we trying to achieve when performing ML testing, or indeed any software testing? Quality assurance is required to make sure that the software system works according to the requirements. Were all the features implemented as agreed? Does the program behave as expected? All the parameters that the program is tested against should be stated in the technical specification document. Moreover, software testing has the power to point out defects and flaws during development. You don't want your clients to encounter bugs after the software is released and come to you waving their fists. Different kinds of testing allow us to catch bugs that are visible only during runtime.

Model evaluation in machine learning testing:

1. Unit testing:

The program is broken down into blocks, and each element (unit) is tested separately. Unit testing is the first level of software testing: it allows us to test the smallest unit of code that makes logical sense to isolate within a system. A unit test can cover almost anything, although it is usually a function, class, or line of code in most programming languages. Ideally, we want our tests to be small; the smaller the better. Smaller tests are not only more efficient from a practical standpoint, since testing smaller units allows the tests to run faster, but also conceptually clearer, as they provide a more detailed view of how the granular code is behaving. (A concrete test sketch for this project is shown after this list.)

2. Regression testing:

These tests cover already tested software to see that it does not suddenly break. Regression testing is a software testing practice that ensures an application still functions as expected after any code changes, updates, or improvements. Regression testing is responsible for the overall stability and functionality of the existing features. Whenever a new modification is added to the code, regression testing is applied to guarantee that after each update the system stays sustainable under continuous improvement. Changes in the code may involve dependencies, defects, or malfunctions. Regression testing aims to mitigate these risks, so that the previously developed and tested code remains operational after new changes.

3. Integration testing:

This type of testing observes how multiple components of the program work together. Integration testing is one of the agile methodologies of software testing in which individual components or units of code are tested to validate interactions among different software system modules. In this process, these system components are either tested as a single group or organized iteratively. Typically, system integration testing is taken up to validate the performance of the entire software system as a whole. The main purpose of this testing method is to expand the process and validate the integration of the modules with other groups. It is performed to verify that all the units operate in accordance with their defined specifications. A combined unit/integration test sketch for this project follows below.
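As an illustration of how unit and integration tests could be applied to this project, the sketch below shows one possible pytest file. The file name and the assumption that camera.py and app.py are importable from the project root (with the trained weights and playlist CSV files present) are ours; this is not part of the delivered test suite.

# test_app.py - a minimal pytest sketch (assumed file name)
from camera import music_rec   # helper defined in Camera.py (see Appendix)
from app import app            # Flask application object defined in App.py (see Appendix)


def test_music_rec_returns_recommendation_table():
    # unit test: the recommender should return at most 15 rows with the expected columns
    df = music_rec()
    assert list(df.columns) == ['Name', 'Album', 'Artist']
    assert len(df) <= 15


def test_flask_endpoints_respond():
    # integration test: the web layer, the recommender and the templates work together
    client = app.test_client()
    assert client.get('/').status_code == 200   # main page renders the recommendation table
    assert client.get('/t').status_code == 200  # /t returns the current playlist as JSON text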

4.1.4 Accuracy:

CHAPTER 5.
CONCLUSION AND FUTURE WORK

5.1. Conclusion
The outcomes of the completed work are summarised below. Compared with the systems currently available, the proposed system aims to offer customers a less expensive, free, and intuitive approach for accurately detecting emotions. The application is well suited to users whose moods change frequently; its key benefit is its ability to accurately identify human emotions and provide music recommendations that match the user's shifting mood.
In the image processing and classification pipeline, face photos are used to train a model that predicts the four fundamental human emotions for a given test image. When used with test data from the same dataset that trained the classifiers, the predictor performs reasonably well. The predictor, however, consistently performs poorly at identifying the expression connected to contempt. This is probably because there are not enough training and test photos that explicitly show contempt, the data were not labelled well prior to training, and it is hard to tell when someone is expressing contempt.
Additionally, because the classifier has not been trained to predict emotions for test data whose expressions do not clearly belong to one of the four fundamental expressions, it is ineffective at doing so.
In this project, we presented a model to recommend music based on the emotion detected from the facial expression. The project designed and developed an emotion-based music recommendation system using a face recognition system. Music has the power to relieve stress and soothe many kinds of emotions. Recent developments promise a wide scope for developing emotion-based music recommendation systems. Thus, the proposed system presents a face-based emotion recognition system to detect the emotion and play music matching the detected emotion.

5.2 Future Work
1. In the future, the system could be designed to automatically play videos based on the user's facial mood. The robustness of the classifiers can be increased by adding more training images from different datasets, investigating more accurate detection methods that still maintain computational efficiency, and considering the classification of more nuanced and sophisticated expressions.
2. The system would also be helpful in music therapy treatment, providing the music therapist with the help needed to treat patients suffering from disorders like mental stress, anxiety, acute depression and trauma.
3. Due to the subjective nature of music and the issues existing in previous methods, two human-centred approaches are proposed. By considering affective and social information, the emotion-based model and the context-based model largely improve the quality of recommendation. However, this research is still at an early stage.
4. In addition, finding suitable music to play when the fear or disgust mood is detected is also a challenge. As a result, it can be considered a future scope for our project.
5. Our trained model is overfitted, which can sometimes lead to fluctuations in detection accuracy.
6. For example, the "disgust" mood is mostly classified as the "angry" mood since the facial features (eyebrows, cheeks) are similar for both. Thus, for more accurate results the model needs to be trained on more images and for a greater number of epochs. Recommendation of movies and TV series on the basis of mood detection can also be considered a future scope for our project.

REFERENCES
1) Grand View Research. Music Streaming Market Size, Share & Trends Analysis Report By
Service (On-demand Streaming, Live Streaming), By Platform (Apps, Browsers), By Content
Type, By End-use, By Region, And Segment Forecasts, 2022–2030, 2022.

2) Hanjalic, A.; Xu, L.Q. Affective video content representation and modeling. IEEE Trans.
Multimed. 2005, 7, 143–154.

3) Lu, L.; Liu, D.; Zhang, H.J. Automatic mood detection and tracking of music audio signals.
IEEE Trans. Audio Speech Lang. Process. 2005, 14, 5–18.

4) Yang, Y.H.; Chen, H.H. Ranking-based emotion recognition for music organization and
retrieval. IEEE Trans. Audio Speech Lang. Process. 2010, 19, 762–774.

5) Yang, Y.H.; Chen, H.H. Machine recognition of music emotion: A review. ACM Trans.
Intell. Syst. Technol. (TIST) 2012, 3, 1–30.

6) Lara, C.A.; Mitre-Hernandez, H.; Flores, J.; Perez, H. Induction of emotional states in
educational video games through a fuzzy control system. IEEE Trans. Affect. Comput. 2018,
12, 66–77.

7) Muszynski, M.; Tian, L.; Lai, C.; Moore, J.; Kostoulas, T.; Lombardo, P.; Pun, T.; Chanel,
G. Recognizing induced emotions of movie audiences from multimodal information. IEEE
Trans. Affect. Comput. 2019, 12, 36–52.

8) Juslin, P.N.; Sloboda, J.A. Music and Emotion: Theory and Research; Oxford University
Press: Oxford, UK, 2001.

9) Zentner, M.; Grandjean, D.; Scherer, K.R. Emotions evoked by the sound of music:
Characterization, classification, and measurement. Emotion 2008, 8, 494.

10) Gabrielsson, A. Emotion perceived and emotion felt: Same or different? Music. Sci. 2001, 5,
123–147.

11) Adomavicius, G.; Tuzhilin, A. Toward the next generation of recommender systems: A
survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 2005,
17, 734–749.

12) Paul, D.; Kundu, S. A survey of music recommendation systems with a proposed music
recommendation system. In Emerging Technology in Modelling and Graphics; Springer:
Berlin/Heidelberg, Germany, 2020; pp. 279–285.

13) Agrafioti, F.; Hatzinakos, D.; Anderson, A.K. ECG pattern analysis for emotion detection.
IEEE Trans. Affect. Comput. 2011, 3, 102–115.

14) Lin, Y.P.; Wang, C.H.; Jung, T.P.; Wu, T.L.; Jeng, S.K.; Duann, J.R.; Chen, J.H. EEG-based
emotion recognition in music listening. IEEE Trans. Biomed. Eng. 2010, 57, 1798–1806.

15) Wijnalda, G.; Pauws, S.; Vignoli, F.; Stuckenschmidt, H. A personalized music system for
motivation in sport performance. IEEE Pervasive Comput. 2005, 4, 26–32.

16) Yang, Y.H.; Lin, Y.C.; Su, Y.F.; Chen, H.H. A regression approach to music emotion recognition. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 448–457.

17) Deng, J.J.; Leung, C.H. Music retrieval in joint emotion space using audio features and
emotional tags. In Proceedings of the International Conference on Multimedia Modeling,
Huangshan, China, 7–9 January 2013; Springer: Heidelberg, Germany, 2013; pp. 524–534.

18) Deng, J.J.; Leung, C.H.; Milani, A.; Chen, L. Emotional states associated with music:
Classification, prediction of changes, and consideration in recommendation. ACM Trans.
Interact. Intell. Syst. (TiiS) 2015, 5, 1–36.

19) Ecoffet, A.; Huizinga, J.; Lehman, J.; Stanley, K.O.; Clune, J. Go-Explore: A New
Approach for Hard-Exploration Problems. arXiv 2019, arXiv:1901.10995. Available online:
https://arxiv.org/abs/1901.10995 (accessed on 18 October 2022).

20) De Prisco, R.; Guarino, A.; Lettieri, N.; Malandrino, D.; Zaccagnino, R. Providing music
service in ambient intelligence: experiments with gym users. Expert Syst. Appl. 2021, 177,
114951.

21) Wen, X. Using deep learning approach and IoT architecture to build the intelligent music
recommendation system. Soft Comput. 2021, 25, 3087–3096.

22) De Prisco, R.; Zaccagnino, G.; Zaccagnino, R. A multi-objective differential evolution


algorithm for 4-voice compositions. In Proceedings of the 2011 IEEE Symposium on
Differential Evolution (SDE), Paris, France, 11–15 April 2011; IEEE: Piscataway, NJ, USA,
2011; pp. 1–8.

23) Prisco, R.D.; Zaccagnino, G.; Zaccagnino, R. A genetic algorithm for dodecaphonic
compositions. In Proceedings of the European Conference on the Applications of
Evolutionary Computation, Torino, Italy, 27–29 April; Springer: Berlin/Heidelberg,
Germany, 2011; pp. 244–253.

24) Song, Y.; Dixon, S.; Pearce, M. A survey of music recommendation systems and future
perspectives. In Proceedings of the 9th International Symposium on Computer Music
Modeling and Retrieval, Citeseer, London, UK, 19–22 June 2012; Volume 4; pp. 395–410.

25) González, E.J.S.; McMullen, K. The design of an algorithmic modal music platform for
eliciting and detecting emotion. In Proceedings of the 2020 8th International Winter
Conference on Brain-Computer Interface (bci), Gangwon, Korea, 26–28 February 2020;
IEEE: Piscataway, NJ, USA, 2020; pp. 1–3.

26) M. S. Hossain and G. Muhammad, "An Emotion Recognition System for Mobile
Applications," in IEEE Access, vol. 5, pp. 2281-2287, 2017.

27) C. Baccigalupo and E. Plaza. Case-based sequential ordering of songs for playlist
recommendation. In Advances in Case-Based Reasoning, pages 286--300. Springer, 2006.

28) L. Baltrunas and X. Amatriain. Towards time-dependant recommendation based on implicit


feedback. In Workshop on context-aware recommender systems (CARS'09), 2009.

29) L. Chen and P. Pu. Interaction design guidelines on critiquing-based recommender systems.

User Modeling and User-Adapted Interaction, 19(3):167--206, 2009.

30) L. Gou, F. You, J. Guo, L. Wu, and X. L. Zhang. Sfviz: interest-based friends exploration
and recommendation in social networks. In Proceedings of the 2011 Visual Information
Communication-International Symposium, page 15. ACM, 2011.

31) G. Gonzalez, J. L. De La Rosa, M. Montaner, and S. Delfin. Embedding emotional context in


recommender systems. In Data Engineering Workshop, 2007 IEEE 23rd International
Conference on, pages 845--852. IEEE, 2007.

32) B. Faltings, P. Pu, M. Torrens, and P. Viappiani. Designing example-critiquing interaction.


In Proceedings of the 9th international conference on Intelligent user interfaces, pages 22--
29. ACM, 2004.

33) O. Celma and P. Herrera. A new approach to evaluating novel recommendations. In


Proceedings of the 2008 ACM Conference on Recommender Systems, RecSys '08, pages 179--186, New York, NY, USA, 2008. ACM.

34) M. J. Albers. Cognitive strain as a factor in effective document design. In Proceedings of the
15th Annual International Conference on Computer Documentation, SIGDOC '97, pages 1--
6, New York, NY, USA, 1997. ACM.

35) C. Baccigalupo and E. Plaza. Case-based sequential ordering of songs for playlist
recommendation. In Advances in Case-Based Reasoning, pages 286--300. Springer, 2006.

36) S. Bostandjiev, J. O'Donovan, and T. Höllerer. Tasteweights: a visual interactive hybrid


recommender system. In Proceedings of the sixth ACM conference on Recommender
systems, pages 35--42. ACM, 2012.

37) T. Bertin-Mahieux, D. P. Ellis, B. Whitman, and P. Lamere. The million song dataset. In
Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR
2011), 2011.

38) M. Zentner, D. Grandjean, and K. R. Scherer. Emotions evoked by the sound of music:
Characterization, classification, and measurement. Emotion, 8(4):494--521, 2008.

39) S. Zhao, M. X. Zhou, X. Zhang, Q. Yuan, W. Zheng, and R. Fu. Who is doing what and when:

Social map-based recommendation for content-centric social web sites. ACM Transactions
on Intelligent Systems and Technology (TIST), 3(1):5, 2011.

40) D. Parra and X. Amatriain. Walk the talk: Analyzing the relation between implicit and
explicit feedback for preference elicitation. In Proceedings of the 19th International
Conference on User Modeling, Adaption, and Personalization, UMAP'11, pages 255--268,
Berlin, Heidelberg, 2011. Springer-Verlag.

41) H.-S. Park, J.-O. Yoo, and S.-B. Cho. A context-aware music recommendation system using
fuzzy bayesian networks with utility theory. In Fuzzy systems and knowledge discovery,
pages 970--979. Springer, 2006.

42) J. O'Donovan, B. Smyth, B. Gretarsson, S. Bostandjiev, and T. Höllerer. Peerchooser: visual


interactive recommendation. In Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems, pages 1085--1088. ACM, 2008.

43) S. Nagulendra and J. Vassileva. Understanding and controlling the filter bubble through
interactive visualization: A user study. In Proceedings of the 25th ACM Conference on
Hypertext and Social Media, HT '14, pages 107--115, New York, NY, USA, 2014. ACM.

Appendix
Appendix A: Project Reflection

This appendix's primary focus is project reflection. We have been working to finish the project we started two months ago, and this endeavour represents a particular kind of dream for us. There are several apps in the Google Play Store that claim to be able to recognize facial emotions, but they do not truly function correctly. We worked hard to produce a face mood detector application that is genuinely enjoyable for everyone who leads a very busy daily life, and we made every effort to provide for Android app users. Our application has excellent usability; one of its most intriguing features is its ability to identify a user's mood and to recommend music, videos, and jokes to alleviate despair.
Humans are incredibly busy in their daily lives. As a result, people experience problems like mental stress, anxiety, severe depression, and trauma. We are developing this web application to help prevent these kinds of disorders.

App.py
from flask import Flask, render_template, Response, jsonify
import gunicorn
from camera import *

app = Flask(__name__)

headings = ("Name","Album","Artist")
df1 = music_rec()
df1 = df1.head(15)
@app.route('/')
def index():
print(df1.to_json(orient='records'))
return render_template('index.html', headings=headings, data=df1)

def gen(camera):
while True:
global df1
frame, df1 = camera.get_frame()
yield (b'--frame\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

@app.route('/video_feed')
def video_feed():
return Response(gen(VideoCamera()),
mimetype='multipart/x-mixed-replace; boundary=frame')

@app.route('/t')
def gen_table():
return df1.to_json(orient='records')

if __name__ == '__main__':
    app.debug = True
    app.run()

Camera.py

import numpy as np
import cv2
from PIL import Image
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from pandastable import Table, TableModel
from tensorflow.keras.preprocessing import image
import datetime
from threading import Thread
from Spotipy import *
import time
import pandas as pd
face_cascade=cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
ds_factor=0.6

emotion_model = Sequential()
emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))
emotion_model.load_weights('model.h5')

cv2.ocl.setUseOpenCL(False)

emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}
music_dist = {0: "songs/angry.csv", 1: "songs/disgusted.csv", 2: "songs/fearful.csv", 3: "songs/happy.csv",
              4: "songs/neutral.csv", 5: "songs/sad.csv", 6: "songs/surprised.csv"}
global last_frame1
last_frame1 = np.zeros((480, 640, 3), dtype=np.uint8)
global cap1
show_text = [0]

''' Class for calculating FPS while streaming. Used this to check performance of
using another thread for video streaming '''
class FPS:
    def __init__(self):
        # store the start time, end time, and total number of frames
        # that were examined between the start and end intervals
        self._start = None
        self._end = None
        self._numFrames = 0

    def start(self):
        # start the timer
        self._start = datetime.datetime.now()
        return self

    def stop(self):
        # stop the timer
        self._end = datetime.datetime.now()

    def update(self):
        # increment the total number of frames examined during the
        # start and end intervals
        self._numFrames += 1

    def elapsed(self):
        # return the total number of seconds between the start and
        # end interval
        return (self._end - self._start).total_seconds()

    def fps(self):
        # compute the (approximate) frames per second
        return self._numFrames / self.elapsed()

''' Class for using another thread for video streaming to boost performance '''
class WebcamVideoStream:
    def __init__(self, src=0):
        self.stream = cv2.VideoCapture(src, cv2.CAP_DSHOW)
        (self.grabbed, self.frame) = self.stream.read()
        self.stopped = False

    def start(self):
        # start the thread to read frames from the video stream
        Thread(target=self.update, args=()).start()
        return self

    def update(self):
        # keep looping infinitely until the thread is stopped
        while True:
            # if the thread indicator variable is set, stop the thread
            if self.stopped:
                return
            # otherwise, read the next frame from the stream
            (self.grabbed, self.frame) = self.stream.read()

    def read(self):
        # return the frame most recently read
        return self.frame

    def stop(self):
        # indicate that the thread should be stopped
        self.stopped = True

''' Class for reading video stream, generating prediction and recommendations '''
class VideoCamera(object):

    def get_frame(self):
        global cap1
        global df1
        cap1 = WebcamVideoStream(src=0).start()
        image = cap1.read()
        image = cv2.resize(image, (600, 500))
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        face_rects = face_cascade.detectMultiScale(gray, 1.3, 5)
        df1 = pd.read_csv(music_dist[show_text[0]])
        df1 = df1[['Name', 'Album', 'Artist']]
        df1 = df1.head(15)
        for (x, y, w, h) in face_rects:
            cv2.rectangle(image, (x, y - 50), (x + w, y + h + 10), (0, 255, 0), 2)
            roi_gray_frame = gray[y:y + h, x:x + w]
            cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame, (48, 48)), -1), 0)
            # predict the emotion for the detected face and remember it for the playlist lookup
            prediction = emotion_model.predict(cropped_img)
            maxindex = int(np.argmax(prediction))
            show_text[0] = maxindex
            # print("===========================================", music_dist[show_text[0]], "===========================================")
            # print(df1)
            cv2.putText(image, emotion_dict[maxindex], (x + 20, y - 60),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
            df1 = music_rec()

        global last_frame1
        last_frame1 = image.copy()
        pic = cv2.cvtColor(last_frame1, cv2.COLOR_BGR2RGB)  # RGB copy (kept from the original, unused below)
        img = Image.fromarray(last_frame1)
        img = np.array(img)
        ret, jpeg = cv2.imencode('.jpg', img)
        return jpeg.tobytes(), df1


def music_rec():
    # print('---------------- Value------------', music_dist[show_text[0]])
    # load the playlist CSV that matches the most recently detected emotion
    df = pd.read_csv(music_dist[show_text[0]])
    df = df[['Name', 'Album', 'Artist']]
    df = df.head(15)
    return df

Spotipy.py
import spotipy
import spotipy.oauth2 as oauth2
from spotipy.oauth2 import SpotifyOAuth
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd
import time

auth_manager = SpotifyClientCredentials('d483c8e67915409e948b91a03c365c44',
                                        '37c149ac2bac43ebb962d0e854066973')
sp = spotipy.Spotify(auth_manager=auth_manager)

def getTrackIDs(user, playlist_id):
    track_ids = []
    playlist = sp.user_playlist(user, playlist_id)
    for item in playlist['tracks']['items']:
        track = item['track']
        track_ids.append(track['id'])
    return track_ids

def getTrackFeatures(id):
    track_info = sp.track(id)

    name = track_info['name']
    album = track_info['album']['name']
    artist = track_info['album']['artists'][0]['name']
    # release_date = track_info['album']['release_date']
    # length = track_info['duration_ms']
    # popularity = track_info['popularity']

    track_data = [name, album, artist]  # , release_date, length, popularity
    return track_data

# Code for creating a dataframe of the fetched playlist

emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}
music_dist = {0: "0l9dAmBrUJLylii66JOsHB?si=e1d97b8404e34343",
              1: "1n6cpWo9ant4WguEo91KZh?si=617ea1c66ab6446b",
              2: "4cllEPvFdoX6NIVWPKai9I?si=dfa422af2e8448ef",
              3: "0deORnapZgrxFY4nsKr9JA?si=7a5aba992ea14c93",
              4: "4kvSlabrnfRCQWfN0MgtgA?si=b36add73b4a74b3a",
              5: "1n6cpWo9ant4WguEo91KZh?si=617ea1c66ab6446b",
              6: "37i9dQZEVXbMDoHDwVN2tF?si=c09391805b6c4651"}
Train.py
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator

train_dir = 'data/train'
val_dir = 'data/test'
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
train_dir,
target_size = (48,48),
batch_size = 64,
color_mode = "grayscale",
class_mode = 'categorical'
)

val_generator = val_datagen.flow_from_directory(
val_dir,
target_size = (48,48),
batch_size = 64,
color_mode = "grayscale",
class_mode = 'categorical'
)

5
emotion_model = Sequential()

emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))

emotion_model.compile(loss='categorical_crossentropy',
                      optimizer=Adam(lr=0.0001, decay=1e-6),
                      metrics=['accuracy'])

emotion_model_info = emotion_model.fit_generator(
train_generator,
steps_per_epoch = 28709 // 64,
epochs=75,
validation_data = val_generator,
validation_steps = 7178 // 64
)

5
emotion_model.save_weights('model.h5')

Utis.py
''' Class for using separate thread for video streaming through web camera'''
import cv2
from threading import Thread
class WebcamVideoStream:
    def __init__(self, src=0):
        self.stream = cv2.VideoCapture(src, cv2.CAP_DSHOW)
        (self.grabbed, self.frame) = self.stream.read()
        self.stopped = False

    def start(self):
        # start the thread to read frames from the video stream
        Thread(target=self.update, args=()).start()
        return self

    def update(self):
        # keep looping infinitely until the thread is stopped
        while True:
            # if the thread indicator variable is set, stop the thread
            if self.stopped:
                return
            # otherwise, read the next frame from the stream
            (self.grabbed, self.frame) = self.stream.read()

    def read(self):
        # return the frame most recently read
        return self.frame

    def stop(self):
        # indicate that the thread should be stopped
        self.stopped = True

5
USER MANUAL

To run the project, follow these steps:

1. Open cmd and run: pip install -r requirements.txt
2. Check the system for the installed requirements.
3. Once the requirements are installed, open the app.py file and run it.
4. After the file runs successfully, open the link that appears on the terminal: http://127.0.0.1:5000
5. A web page will open that recognizes your facial emotions on the basis of the trained dataset and recommends music accordingly.
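The contents of requirements.txt are not reproduced in this report. Based on the imports used in the appendix code, it would be expected to list at least the following packages (exact versions unknown):

flask
gunicorn
opencv-python
tensorflow
keras
numpy
pandas
pandastable
Pillow
spotipy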

Music recommendation
