
A SEMINAR REPORT ON

“Sentiment Analysis Using Natural Language Processing”

SUBMITTED TO THE SAVITRIBAI PHULE PUNE UNIVERSITY, PUNE


IN THE PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE ACADEMIC YEAR 2023-24

OF

THIRD YEAR OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

SUBMITTED BY

Student Name -Manoj Dhanawade Roll No.:

DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING


ISBM COLLEGE OF ENGINEERING
Pune – 412115

SAVITRIBAI PHULE PUNE UNIVERSITY
A.Y. 2023 -2024

CERTIFICATE

This is to certify that the Seminar report entitled

“Sentiment Analysis Using Natural Language Processing.”

Submitted by

Student Name – Manoj Dhanawade Roll No.:

is a bona fide student of this institute and the work has been carried out by the student under the
supervision of Prof. Prajakta Puranik and it is approved for the partial fulfillment of the requirement of
Savitribai Phule Pune University, for the award of the Third-Year degree of Artificial Intelligence and
Machine Learning.

Prof. Prajakta Puranik (Guide)
Prof. Kirti Randhe (HOD)
Dr. P. K. Srivastava (Principal)

Place: Pune
Date:

ABSTRACT

This paper presents a sentiment analysis approach aimed at extracting sentiments associated with
positive or negative polarities for specific subjects within a document, rather than simply classifying
the entire document as positive or negative. The key challenges in sentiment analysis revolve around
identifying how sentiments are conveyed in text and determining whether these expressions reflect
positive or negative opinions towards the subject. To improve accuracy, it is crucial to correctly
identify the semantic relationships between sentiment expressions and the subject. By employing
semantic analysis coupled with syntactic parsing and sentiment lexicons, our prototype system
achieved high precision (ranging from 75% to 95%, depending on the dataset) in identifying
sentiments within web pages and news articles.

Keywords:
Sentiment Analysis
Natural Language Processing
Positive and Negative Polarities
Semantic Analysis
Web Pages
News Articles

ACKNOWLEDGEMENT

It gives me great pleasure to present the seminar report on Sentiment Analysis using Natural
Language Processing. I would like to take this opportunity to thank my internal guide, Prof.
Prajakta Puranik, for giving me all the help and guidance I needed. I am grateful to them for their
kind support. Their valuable suggestions were very helpful. I am also grateful to Prof. Kirti
Randhe, HOD (AI & ML), for his indispensable support and suggestions. I am thankful to our
principal, Dr. P. K. Srivastava, for his continuous support.

(Manoj Rumesh Dhanawade)


(T.E. AIML.)

Table of Contents

Sr. No.   Content
1         Introduction
            Introduction & Motivation
            Problem Statement
            Objectives
2         Literature Review
3         Details of design/technology/Analytical Work with Diagrams and Experimental Study
4         Tables with Data
5         List of Figures
6         Discussions and Conclusions
7         References

CHAPTER 1: INTRODUCTION

❖ INTRODUCTION

❖ Sentiment Analysis Using Natural Language Processing.


Sentiment analysis, often referred to as opinion mining, represents a vital facet within the realm of natural
language processing (NLP). This field is dedicated to the extraction and examination of subjective
information embedded within text data. In an era dominated by digital communication channels and the
unprecedented surge in textual data online, sentiment analysis has assumed a pivotal role. Its significance
lies in its ability to decode public sentiment, gauge customer feedback, track market trends, and discern
brand perception.
At its core, sentiment analysis seeks to discern the sentiment or emotional disposition conveyed through
textual content, categorizing it as positive, negative, or neutral. This analytical process serves as a
cornerstone for decision-making across diverse sectors, empowering businesses, organizations,
governments, and individuals to tailor their strategies and communication in response to prevailing
sentiments.

➢ Sentiment Analysis
Sentiment analysis, also known as opinion mining, is a branch of natural language processing (NLP) that
focuses on extracting and analyzing subjective information from text data. Its primary objective is to
determine the sentiment or emotional tone conveyed by a piece of text, whether it is positive, negative,
or neutral.
In essence, sentiment analysis aims to understand the attitude, opinion, or emotion expressed in textual
content, such as product reviews, social media posts, customer feedback, news articles, and more. By
analyzing the sentiment of text data, sentiment analysis provides valuable insights into public opinion,
customer sentiment, market trends, brand perception, and other aspects of human communication.
Sentiment analysis techniques can range from simple rule-based approaches to sophisticated machine
learning and deep learning algorithms. Some common methods include lexicon-based approaches, where
sentiment lexicons or dictionaries containing words and their associated sentiment scores are used to
determine the sentiment of text. Machine learning algorithms, on the other hand, learn from labelled data
to classify text into different sentiment categories. Deep learning models, such as neural networks, can
automatically learn intricate patterns and representations of text features for sentiment analysis tasks.
The applications of sentiment analysis are vast and diverse, spanning across various industries and
domains. It is widely used in social media monitoring to track and analyze public sentiment on platforms
like Twitter, Facebook, and Instagram. In customer feedback analysis, sentiment analysis helps
businesses understand customer opinions and improve products and services accordingly. It is also
employed in brand monitoring, market research, political analysis, healthcare, and more.

➢ Natural Language Processing


Natural language processing (NLP) is a branch of artificial intelligence that enables computers to
analyze, manipulate, and comprehend human language. Today's organizations handle massive amounts of
text and voice data from a variety of communication channels, including social media newsfeeds, emails,
text messages, audio, video, and more. They use NLP software to process this data automatically,
interpret the intent or sentiment of a message, and respond to human communication in real time.

Comprehensive and effective analysis of text and speech data requires natural language
processing. It can cope with the variations in slang, dialect, and grammatical errors that are common in
everyday speech.
Businesses use it for a variety of automated tasks, including:
• Processing, analyzing, and archiving large documents.
• Examining call center records or consumer feedback.
• Running chatbots for automated customer care.
• Answering who-what-when-where questions.
• Classifying and extracting text.

➢ How Natural Language Processing Works


NLP employs a wide range of methods that aim to give computers a human-like ability to understand
natural language. It uses artificial intelligence (AI) to analyze and interpret spoken or written
language and turn it into a form a computer can work with. Just as people have sensors such as ears to
hear and eyes to see, computers have programs to read text and microphones to capture audio. And just
as humans have brains to process information, computers have programs to process the inputs they
receive. During processing, the input is eventually converted into code that the computer can understand.
Sentiment analysis is an NLP technique that identifies the sentiment expressed in a text, such as a
social media post, review, or comment. The aim is to determine whether the expressed sentiment is
positive, negative, or neutral. The process can be summarized in two general steps:

o Processing
Begin by gathering the textual data to be analyzed for sentiment, such as customer reviews, social media
posts, news items, or any other type of textual content. The gathered text is then pre-processed in
several steps to clean and standardize the data:
• Removing unnecessary information (such as HTML tags and special characters).
• Tokenization: dividing the text into discrete words or units.
• Removing stop words (common words that add little to the sentiment, such as "and", "the", etc.).
• Stemming or lemmatization: reducing words to their base form.
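
The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the report's implementation: the stop-word set is a tiny hypothetical sample, and `naive_stem` is a deliberately crude suffix stripper standing in for a real stemmer or lemmatizer.

```python
import html
import re

# Tiny illustrative stop-word set (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "this", "it", "i", "to", "of"}


def strip_markup(text):
    """Remove HTML tags, decode entities, and drop special characters."""
    text = re.sub(r"<[^>]+>", " ", text)          # delete HTML tags
    text = html.unescape(text)                    # e.g. &amp; becomes &
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)   # delete special characters


def tokenize(text):
    """Split lowercased text into word tokens on whitespace."""
    return text.lower().split()


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "ly", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Run cleaning, tokenization, stop-word removal, and stemming in order."""
    tokens = tokenize(strip_markup(text))
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("<p>I loved this phone &amp; the camera works perfectly!</p>"))
```

Because the stemmer only strips suffixes, it can produce non-words (for example "lov" from "loved"); a lemmatizer would return dictionary forms instead.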

o Analysis
Techniques such as bag-of-words or word embeddings (e.g., Word2Vec, GloVe) are used to convert the text
into numerical features for analysis. Models are then trained on labeled datasets, learning to associate
text with different sentiments (positive, negative, or neutral).
Following training and validation, the model applies the learned patterns to predict sentiment labels
for new data.
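
To make the bag-of-words idea concrete, the sketch below builds a vocabulary from a few made-up sentences and turns each one into a vector of word counts. The sentences and the function names `build_vocabulary` and `vectorize` are illustrative assumptions, not part of the report's system. Word embeddings such as Word2Vec or GloVe would replace these sparse counts with dense learned vectors.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(document, vocabulary):
    """Return the count of each vocabulary word in the document."""
    counts = Counter(document.split())
    vector = [0] * len(vocabulary)
    for token, index in vocabulary.items():
        vector[index] = counts[token]
    return vector


# Hypothetical, already-cleaned example sentences.
docs = ["good phone good camera", "bad battery", "good battery"]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]

print(vocab)
for doc, vec in zip(docs, vectors):
    print(vec, "<-", doc)
```

Each vector has one position per vocabulary word, so word order is discarded and only frequencies remain, which is the defining trade-off of the bag-of-words representation.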

➢ Approaches to Sentiment Analysis


1. Rule-based
This approach relies on tokenization, parsing, and a sentiment lexicon. The method counts the
positive and negative terms in the given text. The sentiment is positive if there are more positive
words than negative ones, and negative if the opposite is true.
2. Machine Learning
This approach trains a model on labelled data. Features are first extracted from the text, and a
classifier is then trained on them to predict sentiment. Commonly used algorithms include Naive Bayes,
support vector machines, hidden Markov models, and conditional random fields.
3. Neural Network
Neural networks have developed at a very rapid pace in recent years. Artificial neural networks, which
are loosely modelled on the structure of the human brain, are used to classify text as positive,
negative, or neutral. Architectures suited to sequential data such as text include recurrent neural
networks, long short-term memory networks, and gated recurrent units.
4. Hybrid Approach
It is the fusion of two or more methodologies, such as machine learning and rule-based methodologies.
The advantage is that a hybrid system can often achieve higher accuracy than any single method alone.
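
The rule-based, machine learning, and hybrid approaches above can be sketched together in plain Python. Everything specific here is an assumption made for illustration: the tiny `POSITIVE` and `NEGATIVE` lexicons, the four made-up training examples, and the hybrid strategy of trusting the lexicon first and falling back to the classifier, which is only one of many ways to combine the two.

```python
import math
from collections import Counter, defaultdict

# Tiny hypothetical lexicon; real lexicons hold thousands of scored words.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def rule_based(tokens):
    """Count positive and negative words and compare the totals."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for c in self.word_counts.values() for t in c}
        return self

    def log_scores(self, tokens):
        """Return log P(label) + sum of log P(token | label) per label."""
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[t] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


def hybrid(tokens, model):
    """Trust the lexicon when it is decisive; otherwise ask the classifier."""
    verdict = rule_based(tokens)
    return verdict if verdict != "neutral" else model.predict(tokens)


# Made-up training data for illustration only.
train_docs = [["battery", "lasts", "long"], ["screen", "cracked", "fast"],
              ["fast", "delivery"], ["battery", "died"]]
train_labels = ["positive", "negative", "positive", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)

print(rule_based(["great", "camera", "bad", "battery", "love", "it"]))
print(model.predict(["battery", "died"]))
print(hybrid(["screen", "cracked"], model))
```

The lexicon handles text containing known opinion words without any training, while the classifier covers text such as "screen cracked" whose sentiment is carried by words the lexicon does not list.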

➢ MOTIVATION

This project aims to develop a system that addresses a longstanding challenge faced by quantitative
analysts and researchers: understanding market dynamics influenced by sentiment and news. The
system is intended to leverage various trends and emerging features to enhance market analysis and
decision-making processes.

➢ Purpose
Sentiment analysis stands as a prominent application of machine learning and natural language processing
(NLP). It has emerged as one of the fastest-growing research areas in computer science, garnering significant
attention due to its diverse applications. This analysis involves studying user reviews, news reports, feedback,
and social media updates, among other sources, to discern sentiments. Researchers collect and analyze
responses, categorizing sentiments into Positive, Negative, or Neutral. The paper provides an in-depth
examination of sentiment analysis, covering its fundamentals, types, and various approaches. Additionally, it
discusses recent tools, APIs, and real-world applications across different domains.

❖ PROBLEM STATEMENT

Sentiments extracted from diverse texts are categorized as positive, negative, or neutral, reflecting a
growing interest in sentiment analysis within the research community. This analysis aims to assess
sentiments based on reviews of products on E-commerce platforms like Amazon. Utilizing a
combination of text analytics, natural language processing, and machine learning techniques,
sentiment analysis assigns sentiment scores to topics, categories, or entities within phrases, facilitating
comprehensive understanding and interpretation of textual data.

❖ OBJECTIVES
Objectives for sentiment analysis using Natural Language Processing (NLP) can vary depending on
the specific context and application. Here are some common objectives:
1. Sentiment Classification: Develop models to accurately classify text into positive, negative,
or neutral sentiment categories. The primary objective is to automatically determine the
sentiment expressed in textual data.
2. Opinion Mining: Extract opinions, sentiments, and attitudes expressed in text data to
understand public opinion, customer feedback, market trends, and brand perception. The goal
is to gain insights into subjective information conveyed through text.
3. Social Media Monitoring: Monitor social media platforms to track and analyze public sentiment,
discussions, and trends. The objective is to identify emerging topics, sentiment shifts, and influential
voices in online conversations.
4. Customer Feedback Analysis: Analyze customer reviews, surveys, and support tickets to
understand customer sentiment towards products, services, and brands. The aim is to identify areas for
improvement, address customer concerns, and enhance customer satisfaction.
5. Brand Monitoring and Reputation Management: Monitor brand mentions and sentiment on
social media, review platforms, and news articles to manage brand reputation and perception. The
objective is to track brand sentiment, respond to negative feedback, and promote positive brand
sentiment.
6. Market Research: Analyze textual data from market surveys, focus groups, and online forums
to gauge consumer sentiment, preferences, and behavior. The goal is to identify market trends,
customer preferences, and opportunities for product innovation or marketing strategies.
7. Political Analysis: Analyze public sentiment towards political candidates, parties, policies,
and issues using textual data from news articles, social media, and public forums. The
objective is to understand voter sentiment, predict election outcomes, and inform political
campaigns.
8. Healthcare Analytics: Analyze patient feedback, reviews of healthcare providers, and social
media discussions about health-related topics to understand patient sentiment and experiences.
The goal is to improve healthcare services, patient satisfaction, and public health outcomes.
9. Financial Analysis: Analyze textual data from news articles, social media, and financial
reports to gauge investor sentiment, market trends, and economic indicators. The objective is
to inform investment decisions, predict market movements, and mitigate financial risks.
10. Customized Applications: Develop customized sentiment analysis solutions tailored to
specific industries, domains, or use cases. The objective is to address unique requirements,
challenges, and objectives of individual applications, such as sentiment analysis for legal
documents, customer service interactions, or product reviews in e-commerce platforms.
These objectives demonstrate the diverse applications and potential benefits of sentiment analysis
using NLP techniques across various domains and industries.

CHAPTER 2: LITERATURE REVIEW

❖ Literature review

1. Paper Title: "A Survey of Sentiment Analysis Techniques and Applications"
   Journal Name, Year: IEEE Transactions on Knowledge and Data Engineering, 2019
   Insights: Provides a comprehensive overview of sentiment analysis techniques, applications, and
   challenges.
   Methodology/Technology/Algorithm used: Literature review, survey methodology

2. Paper Title: "Deep Learning for Sentiment Analysis: A Review"
   Journal Name, Year: ACM Computing Surveys, 2018
   Insights: Reviews deep learning techniques applied to sentiment analysis, highlighting recent
   advancements.
   Methodology/Technology/Algorithm used: Deep learning models: RNNs, CNNs, Transformers

3. Paper Title: "Sentiment Analysis in Social Media: A Review and a New Approach"
   Journal Name, Year: ACM Transactions on Internet Technology, 2015
   Insights: Analyzes sentiment analysis methods in social media and proposes a novel approach for
   improved accuracy.
   Methodology/Technology/Algorithm used: Lexicon-based methods, machine learning algorithms (SVM,
   Naive Bayes), rule-based approaches

4. Paper Title: "Challenges in Sentiment Analysis"
   Journal Name, Year: Information Processing & Management, 2017
   Insights: Identifies key challenges in sentiment analysis, including sarcasm detection,
   context-dependency, and domain adaptation.
   Methodology/Technology/Algorithm used: Literature review, analysis of challenges

5. Paper Title: "Advances in Aspect-Based Sentiment Analysis"
   Journal Name, Year: IEEE Intelligent Systems, 2020
   Insights: Explores advancements in aspect-based sentiment analysis, addressing challenges and
   proposing novel techniques.
   Methodology/Technology/Algorithm used: Aspect-based sentiment analysis, deep learning models (LSTM,
   attention mechanisms), domain-specific embeddings

CHAPTER 3: Details of design/technology/Analytical Work with
Diagrams and Experimental Study

❖ Details of design/technology/Analytical Work with Diagrams

Step 1: Tokenization
Tokenization is the process of dividing a large body of text into smaller parts called tokens.

Step 2: Cleaning the Data


❖ Remove numbers
❖ Stemming/lemmatization
❖ Part of speech tagging
❖ Remove punctuation
❖ Lowercase

Step 3: Removing the stop words


One of the major forms of preprocessing is filtering out uninformative data. In natural language
processing, such uninformative words are referred to as stop words.

Step 4: Classification
• Rule-based systems that perform sentiment analysis based on a set of manually crafted rules.
• Automatic systems that rely on machine learning techniques to learn from data.
• Hybrid systems that combine both rule-based and automatic approaches.

❖ Algorithm:

import string
from collections import Counter

import matplotlib.pyplot as plt

# reading text file (the context manager closes the file after reading)
with open("read.txt", encoding="utf-8") as f:
    text = f.read()

# converting to lowercase
lower_case = text.lower()

# Removing punctuations
cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

# splitting text into words


tokenized_words = cleaned_text.split()

stop_words = ["i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your", "yours",
"yourself","yourselves", "he", "him", "his", "himself", "she", "her", "hers", "herself", "it", "its",
"itself","they", "them", "their", "theirs", "themselves", "what", "which", "who", "whom", "this",
"that", "these","those", "am", "is", "are", "was", "were", "be", "been", "being", "have", "has", "had",
"having", "do", "does", "did", "doing", "a", "an", "the", "and", "but", "if", "or", "because", "as", "until",
"while","of", "at", "by", "for", "with", "about", "against", "between", "into", "through", "during",
"before","after", "above", "below", "to", "from", "up", "down", "in", "out", "on", "off", "over",
"under", "again","further", "then", "once", "here", "there", "when", "where", "why", "how", "all",
"any", "both", "each","few", "more", "most", "other", "some", "such", "no", "nor", "not", "only",
"own", "same", "so", "than","too", "very", "s", "t", "can", "will", "just", "don", "should", "now"]

# Removing stop words from the tokenized words list
final_words = []
for word in tokenized_words:
    if word not in stop_words:
        final_words.append(word)

# NLP Emotion Algorithm
# 1) Check if the word in the final word list is also present in emotions.txt
#    - open the emotions file
#    - loop through each line and clean it
#    - extract the word and emotion using split
# 2) If the word is present -> add the emotion to emotion_list
# 3) Finally count each emotion in the emotion list

emotion_list = []
with open('emotions.txt', 'r', encoding='utf-8') as file:
    for line in file:
        clear_line = line.replace("\n", '').replace(",", '').replace("'", '').strip()
        if ':' not in clear_line:
            continue  # skip blank or malformed lines
        word, emotion = clear_line.split(':', 1)
        # strip the space left around the colon so words and emotions match exactly
        word, emotion = word.strip(), emotion.strip()

        if word in final_words:
            emotion_list.append(emotion)

print(emotion_list)
w = Counter(emotion_list)
print(w)

# Plotting the emotions on the graph

fig, ax1 = plt.subplots()


ax1.bar(w.keys(), w.values())
fig.autofmt_xdate()
plt.savefig('graph.png')
plt.show()
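
The script above appears to assume that `emotions.txt` holds one `'word': 'emotion',` pair per line. The report does not show the file, so this format is inferred from how each line is cleaned and split. A hedged sketch of that parsing step, run on made-up in-memory lines rather than a real file (the helper name `parse_emotion_line` is hypothetical):

```python
from collections import Counter


def parse_emotion_line(line):
    """Return (word, emotion) from a line like "'happy': 'joy'," or None."""
    cleaned = line.replace(",", "").replace("'", "").strip()
    if ":" not in cleaned:
        return None  # blank or malformed line
    word, emotion = cleaned.split(":", 1)
    return word.strip(), emotion.strip()


# Hypothetical file contents; the real emotions.txt is not shown in the report.
sample_lines = ["'happy': 'joy',\n", "'furious': 'anger',\n", "\n", "'glad': 'joy',\n"]
final_words = ["happy", "glad", "phone"]

pairs = [p for p in map(parse_emotion_line, sample_lines) if p is not None]
emotion_counts = Counter(emotion for word, emotion in pairs if word in final_words)
print(emotion_counts)
```

Stripping whitespace after the split matters: without it, the emotion keeps the space that follows the colon and the same emotion could be counted under two different labels.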

CHAPTER 4: TABLES

❖ TABLES WITH DATA

❖ LIST OF FIGURES

• Model Loss Graph

CHAPTER 5: Discussions and Conclusions

❖ Experimental Results and Discussion

❖ Conclusions:

Sentiment analysis, a growing field within natural language processing and computational
linguistics, studies the emotions, attitudes, and opinions that people express towards various
entities. This report addresses one of its core challenges: classifying sentiment polarity, that is,
distinguishing between positive, neutral, and negative sentiment expressed in textual data. By
surveying sentiment analysis methodologies, it aims to give a clearer understanding of how the
subjective content of language can be analyzed.

