Emotion Based Music Recommendation System Using Machine Learning and AI
Volume 8 Issue 5, Sep-Oct 2024 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
I. INTRODUCTION
Music is a universal language with the profound ability to evoke and influence human emotions, serving as a companion during various emotional states such as joy, sadness, excitement, or relaxation. With the proliferation of music streaming platforms, personalized music recommendation systems have become an integral part of the user experience. These systems use various algorithms to suggest songs based on user preferences, listening history, or genre popularity. While they have achieved significant success in delivering tailored recommendations, they often fail to account for the listener's ever-changing emotional states, resulting in a less immersive and engaging user experience.

Traditional music recommendation systems primarily rely on static user profiles, collaborative filtering, and content-based filtering, focusing on factors like genre, artist, tempo, and user ratings. However, music consumption is a dynamic process, heavily influenced by the listener's current mood and emotional context. For example, a user may prefer upbeat, fast-tempo music while feeling energetic but may seek softer, slower tunes when feeling melancholic. Ignoring these emotional variations can limit the accuracy and effectiveness of music recommendations. Therefore, there is a growing need for systems that can adapt to real-time emotional inputs to offer a more personalized and contextually relevant music experience.
To address this gap, emotion-based music recommendation systems have emerged as a promising solution. These systems utilize advanced techniques in artificial intelligence (AI) and machine learning (ML) to recognize user emotions through various modalities, such as facial expressions, voice tone, and text input. By integrating emotion detection into the recommendation process, these systems can match users' current emotional states with appropriate music selections, enhancing the listening experience and user satisfaction. Recent advances in emotion recognition using deep learning models, natural language processing (NLP), and computer vision have made it feasible to develop real-time, emotion-aware applications.

This research proposes an emotion-based music recommendation system that leverages facial expression analysis and NLP to detect user emotions accurately. The system employs machine learning algorithms to process and interpret emotional data, which is then used to curate personalized music playlists tailored to the listener's mood. We evaluate the system's performance through user feedback and quantitative metrics, comparing it with traditional recommendation methods to assess its effectiveness in enhancing user engagement. By integrating emotional context into the music recommendation process, this research aims to pave the way for more dynamic and user-centric music streaming services.

The paper is structured as follows: Section 2 provides an overview of related work in music recommendation and emotion recognition. Section 3 outlines the methodology, including data collection, emotion detection techniques, and the recommendation algorithm. Section 4 presents the experimental results and evaluation. Section 5 discusses the implications of the findings, limitations, and potential future research directions. Finally, Section 6 concludes the study by summarizing the key contributions and highlighting the importance of emotion-based recommendations in music streaming platforms.

II. RELATED WORK
Emotion-based music recommendation has garnered increasing attention in recent years as researchers strive to enhance the personalization and effectiveness of music streaming services. Traditional music recommendation systems, which primarily use collaborative filtering and content-based filtering, have achieved considerable success in suggesting music based on user preferences, listening history, and metadata such as genre, artist, or song popularity. However, these systems often lack the ability to adapt to the listener's emotional state, limiting their capacity to provide contextually relevant suggestions. This gap has led to the exploration of emotion-aware music recommendation systems that incorporate various emotion detection and recommendation techniques.

1. Emotion Recognition Techniques
A fundamental aspect of emotion-based music recommendation systems is the accurate recognition of user emotions. Several studies have investigated various approaches to emotion recognition, including facial expression analysis, voice tone analysis, physiological signals (such as heart rate or EEG), and text-based sentiment analysis. Facial expression analysis using computer vision techniques, such as Convolutional Neural Networks (CNNs) and deep learning models, has been widely explored for emotion detection. For instance, the work by Mollahosseini et al. (2016) demonstrated the effectiveness of CNNs in recognizing complex facial expressions in real time. Similarly, voice-based emotion recognition, as discussed by Schuller et al. (2013), leverages audio features such as pitch, tone, and rhythm to identify emotional cues. Additionally, natural language processing (NLP) techniques have been used to extract sentiments from textual data, providing a multimodal approach to understanding user emotions. However, while each of these methods has its strengths, combining multiple modalities often yields a more robust and accurate emotion detection framework.

2. Music and Emotion Relationship
Numerous studies have explored the relationship between music and emotions, investigating how musical elements such as tempo, key, rhythm, and melody can evoke specific emotional responses. Juslin and Västfjäll (2008) proposed the "BRECVEMA" framework, which identifies mechanisms through which music induces emotions, including brain stem reflex, emotional contagion, and musical expectancy. Building on this, music databases such as the Million Song Dataset (Bertin-Mahieux et al., 2011) and Spotify API metadata provide a vast resource for linking musical attributes to emotional states. Efforts to classify music based on mood or affective states have resulted in annotated music datasets, which serve as a valuable foundation for training emotion-based recommendation algorithms.

3. Emotion-Based Music Recommendation Systems
Recent advancements in machine learning have facilitated the development of music recommendation systems that account for users' emotional states. Several studies have proposed models that map detected emotions to appropriate musical features to generate recommendations.
For instance, Zhang et al. (2018) introduced a hybrid system combining facial emotion recognition with content-based filtering to suggest mood-congruent songs. Their system demonstrated an improvement in user satisfaction, emphasizing the value of emotion-aware recommendations. Additionally, systems such as EmoMusic (Delbouys et al., 2018) used deep learning models to classify music tracks based on emotional tags, thereby enabling emotion-driven recommendations. While these approaches mark significant progress, challenges remain, particularly in achieving real-time emotion detection and dynamically adjusting to users' changing emotional states.

4. Challenges and Limitations
Despite the progress in emotion-based music recommendation, several challenges persist. The accuracy of emotion detection, especially in naturalistic settings, can be affected by factors such as lighting conditions, background noise, and user expressiveness. Additionally, the subjective nature of emotions poses a challenge in aligning musical preferences with the detected emotional states. Privacy concerns related to collecting sensitive user data, such as facial expressions or voice recordings, also need careful consideration. Current research aims to address these issues by exploring multimodal emotion recognition, improving algorithm robustness, and developing privacy-preserving data processing methods.

In summary, existing research underscores the potential of emotion-based music recommendation systems to enhance user experiences by adapting to their emotional states. However, further advancements are needed to improve the accuracy, flexibility, and privacy of these systems. This paper builds on these findings by proposing an emotion-based recommendation system that employs real-time facial expression analysis and natural language processing to create a more personalized and dynamic music experience for users.

III. PROPOSED WORK
The proposed work aims to develop an emotion-based music recommendation system that provides personalized music suggestions by dynamically recognizing the user's emotional state. This system integrates facial expression analysis and natural language processing (NLP) techniques to detect emotions and utilizes a machine learning-based recommendation algorithm to curate a music playlist tailored to the user's mood. The key components of the proposed system include emotion recognition, music recommendation, and evaluation of user satisfaction.

1. Emotion Recognition Module
The core of the system lies in accurately detecting the user's current emotional state. To achieve this, the emotion recognition module employs a multimodal approach:

Facial Expression Analysis: We use computer vision techniques with deep learning models, specifically Convolutional Neural Networks (CNNs) trained on large, labeled facial expression datasets (e.g., FER2013, AffectNet). A pre-trained model such as VGGFace or OpenFace is fine-tuned to classify user emotions (e.g., happy, sad, neutral, angry) in real time. The module captures video input through the device's camera, processes facial features, and outputs a predicted emotional state with a confidence score.
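As a concrete illustration of this module, the following is a minimal sketch of real-time facial emotion prediction from a webcam feed. It assumes a CNN fine-tuned on FER2013 saved as emotion_cnn.h5, 48x48 grayscale inputs, and the label order shown; the file name, input size, and labels are illustrative assumptions rather than details reported in this paper.

```python
# Minimal sketch: real-time facial emotion prediction with a FER2013-style CNN.
# Assumptions (not from the paper): model file name, 48x48 grayscale input,
# and the emotion label order below.
import cv2
import numpy as np
import tensorflow as tf

EMOTIONS = ["angry", "happy", "neutral", "sad"]          # assumed label order
model = tf.keras.models.load_model("emotion_cnn.h5")     # hypothetical file
face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                                # device camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_det.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
        probs = model.predict(roi.reshape(1, 48, 48, 1), verbose=0)[0]
        emotion = EMOTIONS[int(np.argmax(probs))]
        conf = float(np.max(probs))                      # confidence score
        print(f"{emotion} ({conf:.2f})")
    if cv2.waitKey(1) & 0xFF == ord("q"):                # press q to stop
        break
cap.release()
```

A deployed system would likely replace the Haar cascade with a stronger face detector and smooth predictions over several consecutive frames before acting on them.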
Convolutional Neural Networks (CNNs): A pre-trained deep learning model, such as VGGFace or a custom
CNN architecture, is used to extract high-level features from the preprocessed facial images. The CNN
automatically learns spatial hierarchies of features, such as edges, shapes, and facial muscle movements,
crucial for differentiating emotions like happiness, sadness, or surprise.
Feature Vector Creation: The output of the CNN's last pooling layer is flattened into a feature vector that
represents the facial expression in a high-dimensional space. This vector serves as input to the subsequent
classification module.
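A minimal sketch of this feature-extraction step is shown below, assuming a Keras CNN whose final pooling layer is named last_pool; the layer name, model file, and input shape are assumptions made for illustration.

```python
# Minimal sketch: flattening the CNN's last pooling output into a feature vector.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("emotion_cnn.h5")       # hypothetical file
extractor = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("last_pool").output)            # cut at the pooling layer

def to_feature_vectors(face_batch: np.ndarray) -> np.ndarray:
    """face_batch: preprocessed faces, shape (N, 48, 48, 1)."""
    pooled = extractor.predict(face_batch, verbose=0)        # (N, h, w, c)
    return pooled.reshape(pooled.shape[0], -1)               # one flat vector per face
```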
Classification
The classification step involves predicting the user’s emotional state based on the extracted features, facilitating
personalized music recommendations:
Model Training: The feature vectors, along with their corresponding emotion labels from the training
dataset, are used to train a classifier. A softmax layer is added to the CNN to output probabilities for each
emotion class (e.g., happy, sad, neutral, angry). The model is trained using categorical cross-entropy loss and
optimized with algorithms like Adam.
Emotion Prediction: During real-time usage, the system captures an image of the user’s face, processes it
through the preprocessing, smoothing, and feature extraction steps, and then passes the resulting feature
vector to the trained classifier. The classifier outputs the most likely emotion along with a confidence score.
Confidence Thresholding: To enhance accuracy, a confidence threshold is applied. If the model’s
confidence in its prediction is below a certain level, the system can prompt the user for additional input (e.g.,
text-based mood input) to supplement the emotion detection process.
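The sketch below illustrates these classification steps under stated assumptions: a small dense head with a softmax output trained using categorical cross-entropy and Adam, and a confidence threshold of 0.60 below which the system would fall back to text-based mood input. The feature dimensionality, head architecture, and threshold value are illustrative, not values reported in the paper.

```python
# Minimal sketch: softmax classifier head plus confidence thresholding.
import numpy as np
import tensorflow as tf

NUM_CLASSES = 4          # e.g., happy, sad, neutral, angry
CONF_THRESHOLD = 0.60    # assumed cut-off for accepting a prediction

clf = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2048,)),                    # assumed feature-vector size
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
clf.compile(optimizer="adam",
            loss="categorical_crossentropy",
            metrics=["accuracy"])
# clf.fit(train_features, train_labels_onehot, validation_split=0.1, epochs=20)

def predict_emotion(feature_vec: np.ndarray):
    probs = clf.predict(feature_vec.reshape(1, -1), verbose=0)[0]
    idx, conf = int(np.argmax(probs)), float(np.max(probs))
    if conf < CONF_THRESHOLD:
        return None, conf          # fall back to text-based mood input
    return idx, conf
```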
IV. PROPOSED RESEARCH MODEL
The proposed research model for the emotion-based music recommendation system is structured to provide
personalized music suggestions by integrating real-time facial expression analysis with a music recommendation
algorithm. The process begins with data acquisition, where facial images are captured through a camera,
and a music database is compiled with detailed metadata and emotional tags for each track.
The emotion detection phase involves preprocessing the facial images through resizing, normalization, and
smoothing to improve quality. Facial landmarks, such as eyes and mouth corners, are detected using algorithms
like Dlib or OpenCV. These landmarks guide the analysis of facial expressions, which is further processed using a
Convolutional Neural Network (CNN). The CNN extracts high-level features from the images, which are then
used for emotion classification. This classification, performed by a softmax layer or similar classifier, predicts the
user’s emotional state (e.g., happy, sad, angry) and provides a confidence score to ensure accurate detection.
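The following is a minimal sketch of the preprocessing and landmark-detection step described above, assuming dlib's standard 68-point shape predictor file is available locally; the working image size, normalization choice, and smoothing kernel are illustrative assumptions.

```python
# Minimal sketch: preprocessing (resize, normalize, smooth) and facial landmark
# detection with dlib; parameter values are illustrative.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def preprocess_and_landmarks(bgr_image: np.ndarray):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (256, 256))              # resizing
    gray = cv2.equalizeHist(gray)                    # simple intensity normalization
    gray = cv2.GaussianBlur(gray, (3, 3), 0)         # smoothing
    faces = detector(gray, 1)                        # face detection
    if len(faces) == 0:
        return None, None
    shape = predictor(gray, faces[0])                # 68 landmarks (eyes, mouth corners, ...)
    points = np.array([(p.x, p.y) for p in shape.parts()])
    return gray, points
```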
Once the emotion is identified, the system maps this emotion to specific music attributes through an emotion-
music mapping process. The recommendation algorithm, which may employ content-based or collaborative
filtering techniques, selects appropriate music tracks that align with the detected emotional state.
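A minimal sketch of the emotion-to-music mapping and a simple content-based ranking over a track catalogue is given below; the target valence and energy values per emotion and the track attribute fields are assumptions for illustration, not the mapping used by the system.

```python
# Minimal sketch: map a detected emotion to target audio attributes and rank
# tracks by closeness to those targets (content-based filtering).
from typing import Dict, List

# Assumed emotional tags expressed as target audio attributes on a 0-1 scale.
EMOTION_TARGETS: Dict[str, Dict[str, float]] = {
    "happy":   {"valence": 0.9, "energy": 0.8},
    "sad":     {"valence": 0.2, "energy": 0.3},
    "angry":   {"valence": 0.3, "energy": 0.9},
    "neutral": {"valence": 0.5, "energy": 0.5},
}

def recommend(emotion: str, catalogue: List[dict], k: int = 10) -> List[dict]:
    """Return the k tracks whose attributes best match the emotion's targets."""
    target = EMOTION_TARGETS[emotion]
    def distance(track: dict) -> float:
        return sum((track[attr] - value) ** 2 for attr, value in target.items())
    return sorted(catalogue, key=distance)[:k]

# Example usage with a toy catalogue:
tracks = [{"title": "Track A", "valence": 0.85, "energy": 0.75},
          {"title": "Track B", "valence": 0.15, "energy": 0.25}]
print(recommend("happy", tracks, k=1))   # -> Track A
```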
The system integrates these components in real-time, continuously updating music recommendations as the user's
emotional state changes. A user-friendly interface displays the recommended music and allows users to interact
with the system. To evaluate the effectiveness of the proposed model, the accuracy of the emotion detection
system is measured using metrics like precision and recall, and user satisfaction with the music recommendations
is assessed through user studies. The performance of the emotion-based recommendation system is also compared
with traditional methods to highlight improvements in personalization and user engagement.
To assess the accuracy of emotion detection, we use a confusion matrix to analyze true positives, false positives, true negatives, and false negatives for each emotion category. Metrics such as precision, recall, and F1-score are calculated to gauge the model's effectiveness, with high values indicating accurate emotion classification.
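For concreteness, the snippet below shows how the confusion matrix and per-class precision, recall, and F1-score can be computed with scikit-learn; the label set and the toy predictions are placeholders, not results from this study.

```python
# Minimal sketch: confusion matrix and per-class precision/recall/F1.
from sklearn.metrics import classification_report, confusion_matrix

labels = ["angry", "happy", "neutral", "sad"]
y_true = ["happy", "sad", "happy", "neutral", "angry", "sad"]   # placeholder data
y_pred = ["happy", "sad", "neutral", "neutral", "angry", "happy"]

print(confusion_matrix(y_true, y_pred, labels=labels))
print(classification_report(y_true, y_pred, labels=labels, digits=3))
```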
For evaluating music recommendation effectiveness, we measure the relevance of the music suggested to users
based on their emotional states. Users rate the relevance of these recommendations, and average relevance scores
are computed to determine how well the music aligns with the detected emotions. Additionally, qualitative
feedback from user surveys provides insights into overall satisfaction and helps identify areas for improvement.
Real-time performance is assessed by measuring latency and processing times, ensuring that the system performs
emotion detection and music recommendation quickly to maintain a smooth user experience. The system's
adaptability is also tested by introducing various emotional inputs and observing how promptly and accurately it
updates the recommendations.
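A simple way to measure this end-to-end latency is sketched below; detect_emotion and recommend stand in for the system's own detection and recommendation functions and are assumptions here.

```python
# Minimal sketch: per-frame timing of the detection + recommendation pipeline.
import time

def measure_latency(frames, detect_emotion, recommend, runs=100):
    latencies = []
    for frame in frames[:runs]:
        start = time.perf_counter()
        emotion, conf = detect_emotion(frame)     # emotion detection step
        _ = recommend(emotion)                    # recommendation step
        latencies.append(time.perf_counter() - start)
    avg_ms = 1000 * sum(latencies) / len(latencies)
    print(f"average end-to-end latency: {avg_ms:.1f} ms")
    return avg_ms
```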
To benchmark the system, we compare its performance with traditional music recommendation methods, such as
collaborative filtering and content-based filtering. This comparison highlights the benefits of integrating
emotional context into recommendations. The privacy evaluation checks how well the system handles sensitive
user data in compliance with privacy regulations, while usability testing assesses the ease of use and user
interface design. Positive feedback in these areas indicates that the system is not only effective but also user-
friendly and respectful of privacy concerns. This comprehensive analysis ensures that the emotion-based music
recommendation system delivers accurate, relevant, and personalized music experiences.