Seminar Report (SA)
OF
SUBMITTED BY
SAVITRIBAI PHULE PUNE UNIVERSITY
A.Y. 2023-2024
CERTIFICATE
Submitted by
is a bona fide student of this institute, and the work has been carried out by her under the supervision
of Prof. Prajakta Puranik. It is approved in partial fulfillment of the requirements of
Savitribai Phule Pune University for the award of the Third-Year degree in Artificial Intelligence and
Machine Learning.
Place: Pune
Date:
ABSTRACT
This paper presents a sentiment analysis approach aimed at extracting sentiments associated with
positive or negative polarities for specific subjects within a document, rather than simply classifying
the entire document as positive or negative. The key challenges in sentiment analysis revolve around
identifying how sentiments are conveyed in text and determining whether these expressions reflect
positive or negative opinions towards the subject. To improve accuracy, it is essential to correctly
identify the semantic relationships between sentiment expressions and the subject. By employing
semantic analysis coupled with syntactic parsing and sentiment lexicons, our prototype system
achieved high precision (ranging from 75% to 95%, depending on the dataset) in identifying
sentiments within web pages and news articles.
Keywords:
Sentiment Analysis
Natural Language Processing
Positive and Negative Polarities
Semantic Analysis
Web Pages
News Articles
ACKNOWLEDGEMENT
It gives me great pleasure to present this seminar report on Sentiment Analysis using Natural
Language Processing. I would like to take this opportunity to thank my internal guide, Prof.
Prajakta Puranik, for giving me all the help and guidance I needed. I am grateful for her
kind support and valuable suggestions. I am also grateful to Prof. Kirti
Randhe, HOD (AI & ML), for his indispensable support and suggestions, and to our
principal, Dr. P. K. Srivastava, for his continuous support.
Table of Contents
Sr. No.   Content
1         Introduction
          Introduction & Motivation
          Problem Statement
          Objectives
2         Literature Review
3         Details of design/technology/Analytical Work with Diagrams and Experimental Study
4         Tables with Data
5         List of Figures
6         Discussions and Conclusions
7         References
CHAPTER 1: INTRODUCTION
❖ INTRODUCTION
➢ Sentiment Analysis
Sentiment analysis, also known as opinion mining, is a branch of natural language processing (NLP) that
focuses on extracting and analyzing subjective information from text data. Its primary objective is to
determine the sentiment or emotional tone conveyed by a piece of text, whether it is positive, negative,
or neutral.
In essence, sentiment analysis aims to understand the attitude, opinion, or emotion expressed in textual
content, such as product reviews, social media posts, customer feedback, news articles, and more. By
analyzing the sentiment of text data, sentiment analysis provides valuable insights into public opinion,
customer sentiment, market trends, brand perception, and other aspects of human communication.
Sentiment analysis techniques can range from simple rule-based approaches to sophisticated machine
learning and deep learning algorithms. Some common methods include lexicon-based approaches, where
sentiment lexicons or dictionaries containing words and their associated sentiment scores are used to
determine the sentiment of text. Machine learning algorithms, on the other hand, learn from labelled data
to classify text into different sentiment categories. Deep learning models, such as neural networks, can
automatically learn intricate patterns and representations of text features for sentiment analysis tasks.
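The lexicon-based approach described above can be sketched in a few lines. This is only a minimal illustration: the `LEXICON` dictionary, its scores, and the function names are hypothetical and are not taken from any published sentiment lexicon.

```python
# Hypothetical toy lexicon: word -> sentiment score (illustrative values only).
LEXICON = {
    "good": 1.0, "great": 2.0, "excellent": 2.0,
    "bad": -1.0, "poor": -1.0, "terrible": -2.0,
}


def lexicon_score(text: str) -> float:
    """Sum the lexicon scores of all words; words not in the lexicon count as 0."""
    words = (w.strip(".,!?;:") for w in text.lower().split())
    return sum(LEXICON.get(w, 0.0) for w in words)


def lexicon_label(text: str) -> str:
    """Map the total score to a positive, negative, or neutral label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A real lexicon would be far larger and would usually also handle negation ("not good") and intensifiers ("very good"), which this sketch ignores.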
The applications of sentiment analysis are vast and diverse, spanning across various industries and
domains. It is widely used in social media monitoring to track and analyze public sentiment on platforms
like Twitter, Facebook, and Instagram. In customer feedback analysis, sentiment analysis helps
businesses understand customer opinions and improve products and services accordingly. It is also
employed in brand monitoring, market research, political analysis, healthcare, and more.
Comprehensive and effective analysis of textual and audio data requires the use of natural language
processing, or NLP. It can overcome the variations in slang, dialects, and grammatical errors common in
everyday speech.
Businesses utilize it for a variety of automated tasks, including:
• Processing, analyzing, and archiving large documents;
• Examining call center records or consumer feedback;
• Using chatbots for automated customer care;
• Responding to who-what-when-where inquiries;
• Organizing and extracting text.
o Processing
Begin by gathering the textual data, such as customer reviews, social media posts, news items,
or any other type of textual content that needs to be analyzed for sentiment. The gathered text is
then pre-processed with a variety of procedures to clean and standardize the data:
• Removing unnecessary information (such as HTML tags and special characters).
• Tokenization: dividing the text into discrete words or units.
• Eliminating stop words (common words that add little to the sentiment, such as "and", "the",
etc.).
• Stemming or lemmatization: reducing words to their base form.
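The pre-processing steps listed above might be chained together roughly as follows. This is a sketch under stated assumptions: the stop-word set is a tiny illustrative subset, and `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer or lemmatizer.

```python
import re
import string

# Assumption: a tiny illustrative stop-word set, not a complete list.
STOP_WORDS = {"and", "the", "a", "an", "is", "it", "to", "of"}
# Assumption: crude suffix stripping stands in for real stemming/lemmatization.
SUFFIXES = ("ing", "ed", "s")


def strip_html(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Lowercase, drop punctuation, and split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def remove_stop_words(tokens):
    """Keep only tokens that are not stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the steps in order: clean, tokenize, remove stop words, stem."""
    tokens = remove_stop_words(tokenize(strip_html(text)))
    return [crude_stem(t) for t in tokens]
```

For example, `preprocess("<p>The battery is amazing and charging works!</p>")` returns `["battery", "amaz", "charg", "work"]`. The truncated stems show why a proper stemmer or lemmatizer is normally preferred over plain suffix stripping.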
o Analysis
Techniques such as bag-of-words or word embeddings (e.g., Word2Vec, GloVe) are used to convert text
into numerical features for analysis. Labeled datasets are then used to train models that link text to
different sentiments (positive, negative, or neutral).
Following training and validation, the model applies the learned patterns to predict sentiment labels
for new data.
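To make the bag-of-words idea concrete, the sketch below turns documents into count vectors over a shared vocabulary. The function names are illustrative, and the whitespace tokenization is a simplifying assumption.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector
```

With the documents `["good good movie", "bad movie"]` the vocabulary is `{"bad": 0, "good": 1, "movie": 2}`, so the first document becomes `[0, 2, 1]`. Word order is discarded, which is the main limitation that word embeddings and sequence models try to address.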
This approach relies on machine learning. First, models are trained on labelled datasets for predictive
analysis. The next step is extracting words (features) from the text. Several methods, including
Naive Bayes, support vector machines, hidden Markov models, and conditional random fields, are
available for this task in machine learning applications.
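As an illustration of one of the methods named above, here is a minimal multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. The class name and the toy training data in the usage note are assumptions made for this sketch, not part of the report's system.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of documents
        self.vocabulary = set()

    def fit(self, texts, labels):
        """Count tokens per label from the labelled training texts."""
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        """Return the label with the highest log-posterior score."""
        tokens = text.lower().split()
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained on four toy reviews such as "great phone love it" (positive) and "terrible battery hate it" (negative), the model labels "great camera" as positive and "bad battery" as negative. A practical system would need far more data and a held-out validation set.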
3. Neural Network
Neural networks have advanced rapidly in recent years. Artificial neural networks, which are modelled
after the structure of the human brain, classify text as expressing positive, negative, or neutral
sentiment. Architectures suited to sequential data such as text include recurrent neural networks
(RNNs), long short-term memory (LSTM) networks, and gated recurrent units (GRUs).
4. Hybrid Approach
The hybrid approach combines two or more methodologies, such as machine learning and rule-based methods.
Its advantage is that it can achieve higher accuracy than either method used alone.
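One possible way to combine the two, shown here purely as a sketch, is to trust the rules when the lexicon evidence is strong and to defer to a trained model otherwise. The word sets, the `threshold` parameter, and the `model_predict` callable are all hypothetical.

```python
# Assumption: tiny illustrative word lists, not a real sentiment lexicon.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


def rule_based_score(text):
    """Count positive words minus negative words."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def hybrid_predict(text, model_predict, threshold=2):
    """Use the rules when they are decisive; otherwise fall back to the model.

    `model_predict` is any callable that maps a text to a sentiment label,
    for example the predict method of a trained classifier.
    """
    score = rule_based_score(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return model_predict(text)
```

For instance, `hybrid_predict("great great phone", model)` is decided by the rules, while `hybrid_predict("good phone", model)` falls through to the model because the rule score of 1 is below the threshold.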
➢ MOTIVATION
This project aims to develop a system that addresses a longstanding challenge faced by quantitative
analysts and researchers: understanding market dynamics influenced by sentiment and news. The
system leverages trends and emerging features in text data to enhance market analysis and
decision-making processes.
➢ Purpose
Sentiment analysis stands as a prominent application of machine learning and natural language processing
(NLP). It has emerged as one of the fastest-growing research areas in computer science, garnering significant
attention due to its diverse applications. This analysis involves studying user reviews, news reports, feedback,
and social media updates, among other sources, to discern sentiments. Researchers collect and analyze
responses, categorizing sentiments into Positive, Negative, or Neutral. The paper provides an in-depth
examination of sentiment analysis, covering its fundamentals, types, and various approaches. Additionally, it
discusses recent tools, APIs, and real-world applications across different domains.
❖ PROBLEM STATEMENT
Sentiments extracted from diverse texts are categorized as positive, negative, or neutral, reflecting a
growing interest in sentiment analysis within the research community. This analysis aims to assess
sentiments based on reviews of products on E-commerce platforms like Amazon. Utilizing a
combination of text analytics, natural language processing, and machine learning techniques,
sentiment analysis assigns sentiment scores to topics, categories, or entities within phrases, facilitating
comprehensive understanding and interpretation of textual data.
❖ OBJECTIVES
Objectives for sentiment analysis using Natural Language Processing (NLP) can vary depending on
the specific context and application. Here are some common objectives:
1. Sentiment Classification: Develop models to accurately classify text into positive, negative,
or neutral sentiment categories. The primary objective is to automatically determine the
sentiment expressed in textual data.
2. Opinion Mining: Extract opinions, sentiments, and attitudes expressed in text data to
understand public opinion, customer feedback, market trends, and brand perception. The goal
is to gain insights into subjective information conveyed through text.
3. Social Media Monitoring: Monitor social media platforms to track and analyze public sentiment,
discussions, and trends. The objective is to identify emerging topics, sentiment shifts, and influential
voices in online conversations.
4. Customer Feedback Analysis: Analyze customer reviews, surveys, and support tickets to
understand customer sentiment towards products, services, and brands. The aim is to identify areas for
improvement, address customer concerns, and enhance customer satisfaction.
5. Brand Monitoring and Reputation Management: Monitor brand mentions and sentiment on
social media, review platforms, and news articles to manage brand reputation and perception. The
objective is to track brand sentiment, respond to negative feedback, and promote positive brand
sentiment.
6. Market Research: Analyze textual data from market surveys, focus groups, and online forums
to gauge consumer sentiment, preferences, and behavior. The goal is to identify market trends,
customer preferences, and opportunities for product innovation or marketing strategies.
7. Political Analysis: Analyze public sentiment towards political candidates, parties, policies,
and issues using textual data from news articles, social media, and public forums. The
objective is to understand voter sentiment, predict election outcomes, and inform political
campaigns.
8. Healthcare Analytics: Analyze patient feedback, reviews of healthcare providers, and social
media discussions about health-related topics to understand patient sentiment and experiences.
The goal is to improve healthcare services, patient satisfaction, and public health outcomes.
9. Financial Analysis: Analyze textual data from news articles, social media, and financial
reports to gauge investor sentiment, market trends, and economic indicators. The objective is
to inform investment decisions, predict market movements, and mitigate financial risks.
10. Customized Applications: Develop customized sentiment analysis solutions tailored to
specific industries, domains, or use cases. The objective is to address unique requirements,
challenges, and objectives of individual applications, such as sentiment analysis for legal
documents, customer service interactions, or product reviews in e-commerce platforms.
These objectives demonstrate the diverse applications and potential benefits of sentiment analysis
using NLP techniques across various domains and industries.
CHAPTER 2: LITERATURE REVIEW
❖ Literature review
2. "Deep Learning for Sentiment Analysis: A Review", ACM Computing Surveys, 2018
   Summary: Reviews deep learning techniques applied to sentiment analysis, highlighting recent
   advancements.
   Methods: Deep learning models (RNNs, CNNs, Transformers)
CHAPTER 3: Details of design/technology/Analytical Work with
Diagrams and Experimental Study
Step 1: Tokenization
Tokenization is the process by which a large quantity of text is divided into smaller parts called tokens.
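A minimal sketch of tokenization at the word and sentence level is given below, using regular expressions from the standard library. The patterns are simplifying assumptions and will not handle every case (abbreviations such as "Dr." would be split as sentence ends, for example).

```python
import re


def word_tokenize(text):
    """Split text into word tokens, keeping simple contractions like "don't" whole."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)


def sentence_tokenize(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
```

For the input "I don't like it. Great!", `word_tokenize` yields `["I", "don't", "like", "it", "Great"]` and `sentence_tokenize` yields `["I don't like it.", "Great!"]`.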
Step 4: Classification
Sentiment classification systems fall into three broad types:
• Rule-based systems that perform sentiment analysis based on a set of manually crafted rules.
• Automatic systems that rely on machine learning techniques to learn from data.
• Hybrid systems that combine both rule-based and automatic approaches.
❖ Algorithm:
import string
from collections import Counter

# Read the text to analyse. Assumption: 'read.txt' is a placeholder file name;
# the original listing does not show where `text` comes from.
with open('read.txt', encoding='utf-8') as file:
    text = file.read()

# Convert to lowercase
lower_case = text.lower()

# Remove punctuation
cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

# Tokenize by splitting on whitespace
tokenized_words = cleaned_text.split()

stop_words = ["i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your", "yours",
              "yourself", "yourselves", "he", "him", "his", "himself", "she", "her", "hers", "herself", "it", "its",
              "itself", "they", "them", "their", "theirs", "themselves", "what", "which", "who", "whom", "this",
              "that", "these", "those", "am", "is", "are", "was", "were", "be", "been", "being", "have", "has", "had",
              "having", "do", "does", "did", "doing", "a", "an", "the", "and", "but", "if", "or", "because", "as", "until",
              "while", "of", "at", "by", "for", "with", "about", "against", "between", "into", "through", "during",
              "before", "after", "above", "below", "to", "from", "up", "down", "in", "out", "on", "off", "over",
              "under", "again", "further", "then", "once", "here", "there", "when", "where", "why", "how", "all",
              "any", "both", "each", "few", "more", "most", "other", "some", "such", "no", "nor", "not", "only",
              "own", "same", "so", "than", "too", "very", "s", "t", "can", "will", "just", "don", "should", "now"]

# Drop stop words; `final_words` holds the remaining tokens
final_words = [word for word in tokenized_words if word not in stop_words]

# Map the remaining words to emotions. Judging by the parsing below, each line of
# emotions.txt is expected to look like:  'word': 'emotion',
emotion_list = []
with open('emotions.txt', 'r') as file:
    for line in file:
        clear_line = line.replace("\n", '').replace(",", '').replace("'", '').strip()
        if ':' not in clear_line:
            continue  # skip blank or malformed lines
        word, emotion = (part.strip() for part in clear_line.split(':', 1))
        if word in final_words:
            emotion_list.append(emotion)

# Print the matched emotions and how often each one occurs
print(emotion_list)
w = Counter(emotion_list)
print(w)
CHAPTER 4: TABLES
❖ LIST OF FIGURES
• Model Loss Graph
CHAPTER 5: Discussions and Conclusions
❖ Conclusions:
Sentiment analysis is a growing field within natural language processing and computational
linguistics that studies human emotions, attitudes, and opinions towards various entities. This
project addresses one of its core challenges: categorizing sentiment polarity, that is,
distinguishing between positive, neutral, and negative sentiments expressed in textual data. By
surveying sentiment analysis methodologies, it aims to give deeper insight into the subjective
nature of language.