
“Sentiment Analysis using NLP and Deep Learning”

A Project Report Submitted to


Rajiv Gandhi Proudyogiki Vishwavidyalaya

Towards Partial Fulfillment for the Award of


Bachelor of Engineering in Computer Science &
Engineering

Submitted by:
Aeshna Jain (0827CS201017)
Bhavik Mundra (0827CS201057)

Guided by:
Prof. Ritika Bhatt
Associate Professor
Computer Science & Engineering

Acropolis Institute of Technology & Research, Indore


July-Dec 2023
Sentiment Analysis using NLP and Deep Learning

EXAMINER APPROVAL

The Project entitled “Sentiment Analysis using NLP and Deep Learning”
submitted by Aeshna Jain (0827CS201017) and Bhavik Mundra (0827CS201057)
has been examined and is hereby approved towards partial fulfillment for the
award of Bachelor of Engineering degree in Computer Science & Engineering
discipline, for which it has been submitted. It is understood that by this approval
the undersigned do not necessarily endorse or approve any statement made,
opinion expressed or conclusion drawn therein, but approve the project only for
the purpose for which it has been submitted.

(Internal Examiner) (External Examiner)

Date: Date:


GUIDE RECOMMENDATION

This is to certify that the work embodied in this project “Sentiment Analysis
using NLP and Deep Learning” submitted by Aeshna Jain (0827CS201017) and
Bhavik Mundra (0827CS201057) is a satisfactory account of the bonafide work
done under the supervision of Prof. Ritika Bhatt and is recommended towards
partial fulfillment for the award of the Bachelor of Engineering (Computer Science
& Engineering) degree by Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal.

(Project Guide) (Project Coordinator)


STUDENTS UNDERTAKING

This is to certify that the project entitled “Sentiment Analysis using NLP and Deep
Learning” has been developed by us under the supervision of Prof. Ritika Bhatt. The
whole responsibility for the work done in this project is ours. The sole intention of
this work is practical learning and research.

We further declare that, to the best of our knowledge, this report does not contain
any part of any work which has been submitted for the award of any degree either
in this University or in any other University / Deemed University without proper
citation, and if any such work is found then we are liable to explain it.

Aeshna Jain(0827CS201017)

Bhavik Mundra(0827CS201057)


Acknowledgement

We thank the Almighty for giving us the strength and courage to sail through
the tough times and reach the shore safely. There are a number of people without
whom this project work would not have been feasible. Their high academic
standards and personal integrity provided us with continuous guidance and
support. We owe a debt of sincere gratitude, a deep sense of reverence and respect
to our guide and mentor Prof. Ritika Bhatt, Associate Professor, AITR, for their
motivation, sagacious guidance, constant encouragement, vigilant supervision and
valuable critical appreciation throughout this project work, which helped us to
successfully complete the project on time.

We express profound gratitude and heartfelt thanks to Dr. Kamal Kumar Sethi,
HOD CSE, AITR Indore for his support, suggestions and inspiration in carrying out
this project. We are very thankful to the other faculty and staff members of the CSE
Dept, AITR Indore for providing us with support, help and advice during the project.
We would be failing in our duty if we did not acknowledge the support and guidance
received from Dr. S.C. Sharma, Director, AITR, Indore whenever needed. We take this
opportunity to convey our regards to the management of Acropolis Institute,
Indore for extending academic and administrative support and providing us with all
the facilities necessary for the project to achieve its objectives.

We are grateful to our parents and family members, who have always loved and
supported us unconditionally. To all of them, we want to say “Thank you” for being
the best family that one could ever have and without whom none of this would
have been possible.

Aeshna Jain(0827CS201017)

Bhavik Mundra(0827CS201057)


Executive Summary

“Sentiment Analysis using NLP and Deep Learning”

This project is submitted to Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal
(MP), India, in partial fulfillment of the Bachelor of Engineering in the Computer
Science & Engineering branch, under the sagacious guidance and vigilant
supervision of Prof. Ritika Bhatt.

The project is based on Natural Language Processing and Deep Learning. Deep
Learning is a subfield of machine learning concerned with algorithms inspired by
the structure and function of the brain, called artificial neural networks. In this
project, a BERT model is used for sentiment analysis and a deep convolutional
network is used for emotion detection.

Key words: Natural Language Processing, Deep Learning, BERT.


“Where the vision is one year,
cultivate flowers;
Where the vision is ten years,
cultivate trees;
Where the vision is eternity,
cultivate people.”
- Oriental Saying


List of Figures

Figure 3-1: Use Case Diagram

Figure 3-2: Activity Diagram

Figure 3-3: ERD Diagram

Figure 3-4: Text Analyzer

Figure 3-5: Review Analyzer

Figure 4-1: Text Analyzer Module

Figure 4-2: Visualization of Result 1

Figure 4-3: Review Analyzer Module

Figure 4-4: Visualization of Result 2

Figure 4-5: Input for Text Analyzer

Figure 4-6: Output for Text Analyzer


List of Abbreviations

NLP - Natural Language Processing

API - Application Programming Interface

CNN - Convolutional Neural Network

CSV - Comma Separated Values

GPU - Graphics Processing Unit

ML - Machine Learning

OS - Operating System

RAM - Random Access Memory

URL - Uniform Resource Locator

BERT - Bidirectional Encoder Representations from Transformers


Table of Contents

Chapter 1: Introduction
1.1. Introduction
1.2. Overview
1.3. Background and Motivation
1.4. Problem Statement and Objectives
1.5. Scope of the Project
1.6. Team Organization
1.7. Report Structure
Chapter 2: Review of Literature
2.1. Review of Literature
2.2. Preliminary Investigation
2.2.1. Current System
2.3. Requirement Identification and Analysis for Project
2.4. Conclusion
Chapter 3: Proposed System
3.1. The Proposal
3.2. Benefits of the Proposed System
3.3. Feasibility Study
3.3.1. Technical
3.3.2. Economical
3.3.3. Operational
3.4. Design Representation
3.4.1. Use Case Diagram
3.4.2. Activity Diagram
3.4.3. Entity Relationship Diagram
3.4.4. Text Analyzer
3.4.5. Review Analyzer
3.4.6. Dataset Structure
3.5. Deployment Requirements
3.5.1. Hardware
3.5.2. Software
Chapter 4: Implementation
4.1. Implementation
4.2. Technique Used
4.2.1. Web Scraping
4.2.2. Natural Language Processing
4.2.3. Deep Neural Network
4.3. Tools Used
4.3.1. Beautiful Soup
4.3.2. Hugging Face Transformer BERT model
4.3.3. Scikit Learn
4.3.4. Pandas
4.4. Language Used
4.5. Glimpse of Project
4.6. Testing
4.6.1. Strategy Used
4.6.2. Results
Chapter 5: Conclusion
5.1. Conclusion
5.2. Limitations of the Work
5.3. Suggestion and Recommendations for Future Work
Bibliography
Source Code


Chapter 1: Introduction

1.1 Introduction

In an era marked by the relentless growth of digital content, sentiment analysis
tools have emerged as indispensable instruments to navigate the ever-expanding
landscape of text data. The need for these tools is paramount, driven by the
explosive proliferation of user-generated content on social media, the unceasing
surge in online reviews, and the continuous stream of textual information across
various digital platforms. As we delve into the heart of the information age,
understanding the sentiments and emotions expressed in this vast sea of text is
not only a necessity but also holds immense potential for numerous sectors and
applications.

This project report aims to shed light on the need, benefits, and solutions
offered by sentiment analysis tools. We will explore how these tools are
instrumental in addressing the challenges posed by the deluge of textual data and
how they empower businesses, researchers, and individuals to gain valuable
insights, enhance decision-making processes, and uncover hidden trends and
patterns within their data. This report not only serves as an exploration of
sentiment analysis but also as a guide to harness its potential, demonstrating its
versatility in deciphering the intricate world of human emotions and opinions
conveyed through written text.

1.2 Overview

The project revolves around the utilization of Natural Language Processing (NLP)
and Deep Learning techniques to develop a model capable of evaluating comments
and reviews from a designated website. Users will input the source URL where
these reviews are hosted, and through the integration of NLP and Deep Learning,
the system will assess the emotions and sentiments expressed in these comments.
The primary goal of this system is to deliver trustworthy sentiment analysis of
product reviews, equipping stakeholders with valuable insights to make well-
informed business decisions.
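
As an illustration of the first step, the sketch below pulls review text out of a page's HTML. It is a minimal sketch under stated assumptions, not the project's scraper: it uses Python's standard-library html.parser in place of Beautiful Soup so that it has no third-party dependencies, and the CSS class name review-text, the class ReviewExtractor, and the functions extract_reviews and fetch_reviews are hypothetical names chosen for this example. Real sites use their own markup, and their terms of service may restrict scraping.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ReviewExtractor(HTMLParser):
    """Collects the text of every element whose class list contains css_class."""

    def __init__(self, css_class="review-text"):  # hypothetical class name
        super().__init__()
        self.css_class = css_class
        self.reviews = []
        self._depth = 0    # greater than 0 while inside a matching element
        self._chunks = []  # text fragments of the current review

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1  # nested tag inside a review
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.reviews.append(text)

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_reviews(html, css_class="review-text"):
    """Return the review strings found in an HTML document."""
    parser = ReviewExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.reviews


def fetch_reviews(url, css_class="review-text"):
    """Download the page at the user-supplied URL and extract its reviews."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_reviews(html, css_class)
```

Keeping extract_reviews separate from fetch_reviews means the parsing logic can be checked against saved HTML without making network requests.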

1.3 Background and Motivation

In recent years, the exponential growth of digital content on the internet has
transformed the way information is shared, opinions are expressed, and products
are reviewed. This digital revolution has brought about a staggering amount of
user-generated content, particularly in the form of comments and reviews on
various online platforms. Understanding the sentiments and emotions embedded
within this vast corpus of text data is crucial for individuals, businesses, and
organizations looking to navigate the intricate landscape of public opinion
effectively.

Natural Language Processing (NLP) and Deep Learning have emerged as
pivotal technologies in the field of sentiment analysis. NLP enables computers to
understand, interpret, and generate human language, making it the ideal tool for
processing textual data. Deep Learning, with its neural network architectures,
excels at extracting intricate patterns and nuances within the text, allowing for a
more accurate assessment of sentiments. These advancements have laid the
foundation for the development of sophisticated sentiment analysis systems that
can dissect and analyze the emotional undercurrents present in user comments
and reviews.

The motivation behind this project stems from the pressing need for a
robust and reliable sentiment analysis system that harnesses the power of NLP
and Deep Learning. In today's digital age, the opinions and sentiments expressed
in comments and reviews on platforms ranging from e-commerce sites to social
media can significantly impact decision-making processes. Businesses rely on
customer feedback to enhance their products and services, while individuals seek
informed opinions to make purchasing decisions.

1.4 Problem Statement and Objectives

In recent years, the exponential growth of digital content on the internet has
transformed the way information is shared, opinions are expressed, and products
are reviewed. This digital revolution has brought about a staggering amount of
user-generated content, particularly in the form of comments and reviews on
various online platforms. Understanding the sentiments and emotions embedded
within this vast corpus of text data is crucial for individuals, businesses, and
organizations looking to navigate the intricate landscape of public opinion
effectively.

Objective: The aim of this system is to provide a broader analysis of the comments
and reviews posted for a product, so that the owner and developer of the product can
take beneficial business decisions and overcome any further risk-posing situations.

1.5 Scope of the Project

This system provides a broader view of the comments and posts about a
product on different online platforms, such as online shopping sites and online
food ordering sites.

Business and Market Intelligence:

• Product and Service Improvement: Analyze customer reviews, feedback,
and social media comments to identify areas for product or service
enhancement.
• Competitive Analysis: Compare sentiment data with competitors to gain a
competitive edge and identify market trends.

Customer Support and Engagement:

• Real-time Feedback: Provide real-time sentiment analysis to help customer
support teams address issues promptly.
• Sentiment-based Routing: Route customer inquiries based on their
sentiments for improved service.

Market Research:

• Consumer Insights: Extract insights from consumer reviews and comments
for market research and consumer behavior analysis.
• Trend Prediction: Use sentiment data to predict emerging trends and adapt
marketing strategies accordingly.

1.6 Team Organization

• Aeshna Jain:
I conducted thorough research to select the appropriate technology and
delved deep into its intricacies. I also undertook model development to
create a system capable of analyzing textual data, specifically reviews on
the platform. Furthermore, I designed a Deep Learning model to effectively
classify the emotions expressed in the provided data, ensuring a precise
and comprehensive analysis of user experiences.

• Bhavik Mundra:
I conducted research, gathered essential resources, and performed web
scraping to collect the data required for building the model. Additionally, I
dedicated effort to developing the application's frontend and seamlessly
integrating it with the model. This integration allows for the presentation
of results, enabling decision-making for the system's users or stakeholders.

1.7 Report Structure

The project Sentiment Analysis using NLP and Deep Learning is primarily
concerned with Natural Language Processing and Deep Learning techniques,
and the project report is organized into five chapters.

Chapter 1: Introduction: introduces the background of the problem,
followed by the rationale for the project undertaken. The chapter describes the
objectives, scope and applications of the project. It then gives the details of the
team members and their contributions to the development of the project, and
ends with the report outline.

Chapter 2: Review of Literature: explores the work done in the area of the
project undertaken, discusses the limitations of existing systems, and highlights
the issues and challenges of the project area. The chapter ends with the
requirement identification for the present project work, based on findings drawn
from the reviewed literature and end-user interactions.

Chapter 3: Proposed System: starts with the project proposal based on the
requirements identified, followed by the benefits of the project. The chapter also
illustrates the software engineering paradigm used, along with different design
representations, and includes details of the major modules of the project. It also
gives insights into the different types of feasibility study carried out for the
project undertaken. Finally, it gives details of the deployment requirements for
the developed project.

Chapter 4: Implementation: includes the details of the different technologies,
techniques, tools and programming languages used in developing the project. The
chapter also presents the user interfaces designed in the project along with
their functionality. Further, it discusses the experimental results along with the
testing of the project. The chapter ends with an evaluation of the project on
parameters such as accuracy and efficiency.

Chapter 5: Conclusion: concludes with an objective-wise analysis of results
and the limitations of the present work, followed by suggestions and
recommendations for further improvement.

Chapter 2: Review of Literature

2.1 Review of Literature

Natural Language Processing (NLP) is a subfield of artificial intelligence that
focuses on the interaction between computers and human language. NLP
encompasses a range of techniques and algorithms that enable machines to
understand, interpret, and generate human language in a way that is both
meaningful and useful. NLP plays a crucial role in sentiment analysis, which is the
process of determining and understanding the emotional tone or sentiment
expressed in textual data. Researchers have explored various NLP techniques, such
as tokenization, part-of-speech tagging, and named entity recognition, to
preprocess textual data effectively. Additionally, word embeddings, particularly
Word2Vec and GloVe, have been integrated into sentiment analysis models for
better feature representation.

The advent of Deep Learning has revolutionized sentiment analysis.
Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks,
and Gated Recurrent Unit (GRU) networks have demonstrated their proficiency in
modelling sequential data, enabling improved sentiment analysis for long text
sequences. Moreover, Convolutional Neural Networks (CNNs) have been employed
to capture local features and patterns in text data, proving valuable in sentiment
analysis.

Transfer learning has gained traction in sentiment analysis with the
introduction of pre-trained models such as BERT (Bidirectional Encoder
Representations from Transformers) and GPT (Generative Pre-trained
Transformer). These models, trained on extensive text corpora, offer exceptional
performance by capturing contextual information and nuances in sentiments.

A significant advancement in sentiment analysis is the emergence of
aspect-based sentiment analysis, which focuses on extracting sentiments related
to specific aspects or features within a text. This approach has practical
applications in product reviews and customer feedback analysis.

2.2 Preliminary Investigation

2.2.1 Current System

Several tools used for sentiment analysis have been launched by Google,
Facebook and others. The most prominent are:

• Google Insights
• Google Alerts
• Facebook Insights
• Other tools such as Brandwatch.

However, none of these tools is customized to a specific intent, and therefore
there is a need to develop a sentiment analyzer for a particular purpose.

2.3 Requirement Identification and Analysis for Project

One of the major challenges in developing Sentiment Analysis systems using NLP
and Deep Learning is the lack of standardized data. Reviews change on a daily
basis: new reviews are posted, and existing ones are sometimes deleted.
Additionally, the accuracy of the predictions may be affected by the quality of
the data and the availability of an adequate amount of data.

• Creating a dataset from scratch for an NLP project can be a challenging task.
The main challenge is to scrape a particular site on a daily basis for fresh
reviews; if the reviews collected are not adequate, the training of the
model will be affected.
• Data collection can be a time-consuming process, and it may be difficult to
ensure the data is of high quality and representative of the problem being
solved.
• Furthermore, the dataset may need to be cleaned and pre-processed to
remove outliers, missing values, or irrelevant features. This can be a tedious
and error-prone process that requires careful attention to detail.
• Web scraping is the process of automatically extracting data from websites.
For a sentiment analysis system, web scraping techniques can be used to
collect data from various online shopping sites.
• The web scraping process can be automated using programming tools such
as Python, Beautiful Soup, and Selenium. These tools can be used to extract
information from the HTML code of web pages and convert it into a
structured format such as a CSV or JSON file.
• It is important to note that web scraping can be a complex process, as
websites may use different structures and formats to present information.
Additionally, web scraping can raise legal and ethical concerns, as it may
violate website terms of service or infringe on user privacy. Therefore, it is
important to use web scraping techniques responsibly and with caution.
• Cleaning a dataset created from web scraping is an essential step in
preparing the data for machine learning. The web scraping process can
introduce errors and inconsistencies into the data, such as missing values,
duplicate entries, or incorrect formatting. These errors can negatively
impact the performance of the machine learning model.
• To clean the dataset, several steps may be necessary. First, missing values
and duplicates must be identified and handled. For numeric features, missing
values can be imputed with a statistic such as the mean or median of that
feature (a review whose text is missing cannot be imputed this way and is
dropped instead). Duplicates can be removed by identifying records with
identical feature values.
• Next, irrelevant features can be removed to reduce the dimensionality of
the dataset. This can be done using feature selection methods, which
identify features that have little predictive power and remove them from
the dataset, or with dimensionality reduction techniques such as Principal
Component Analysis (PCA), which projects the data onto a smaller number of
components that retain most of its variance.
• The dataset may need to be normalized or standardized to ensure that all
features are on a similar scale. This can improve the performance of
machine learning algorithms that are sensitive to differences in feature
scales.
• Cleaning a dataset created from web scraping can be a time-consuming
process that requires careful attention to detail. However, it is an essential
step in preparing the data for machine learning and can greatly improve the
performance of the final model.
• Deep Learning is used to perform emotion analysis: a neural network is
trained on a well-established dataset available on Kaggle, and that model is
used to extract emotion from the text for a more accurate measure of
sentiment.
• The system will take the sentiment score detected using the BERT model
and the emotion scores into account and categorize the reviews as bad or
good.
• Finally, the system will analyze the reviews and draw useful results from
the data, which helps stakeholders take effective business decisions.
• The whole system is divided into two components which serve two
different kinds of user. The first component analyzes only text entered by
the user and reports the tone and emotion that the text conveys.
• In the other component, the reviews are analyzed per product, and the
results serve as the basis for visualizations that support business
decisions. The application will provide a graphical way to analyze the
reviews based on the sentiment and emotions of the posts.
• The first step in creating the sentiment detection model is to use the
pretrained BERT model, which returns a score ranging from 1 to 5.
A score less than 3 denotes a negative review, a score of 3 is
considered neutral, and a score larger than 3 is considered positive.
• Next, the emotional analysis is done by developing a convolutional neural
network and training it on an emotion detection dataset available on
Kaggle.
• Overall, both analyses give a broader view of the perspective
shown by customers of a specific product, and the application also
presents these analyses graphically to make it easy for
stakeholders to take important business decisions.
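
For review text, the cleaning steps in the list above reduce mostly to dropping missing entries and duplicates before the data is stored. The following is a minimal sketch using only the Python standard library; the function names, the single review column, and the choice to treat reviews that differ only in case or spacing as duplicates are assumptions made for illustration, not details taken from the project.

```python
import csv
import io


def clean_reviews(raw_reviews):
    """Drop missing, blank, and duplicate reviews, keeping first-seen order."""
    seen = set()
    cleaned = []
    for review in raw_reviews:
        if review is None:  # missing value
            continue
        text = " ".join(review.split())  # collapse runs of whitespace
        key = text.casefold()  # assumption: duplicates are case-insensitive
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


def write_reviews_csv(reviews, stream):
    """Write one review per row under a single 'review' header."""
    writer = csv.writer(stream)
    writer.writerow(["review"])
    writer.writerows([text] for text in reviews)


# Usage with made-up reviews, writing to an in-memory buffer.
raw = ["Good  product", None, "good product", "   ", "Late delivery"]
buffer = io.StringIO()
write_reviews_csv(clean_reviews(raw), buffer)
print(buffer.getvalue())
```

Passing an open file object instead of the in-memory buffer would store the same rows on disk; when doing so, open the file with newline="" as the csv module documentation recommends.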

2.4 Conclusion

This chapter reviews the literature surveyed during the research work. In
conclusion, Sentiment Analysis using NLP and Deep Learning has the potential
to be a valuable tool for both users and organizations.
While there are still challenges to overcome, the research in this field is promising
and suggests that accurate sentiment predictions can be achieved. Future research
should focus on developing more accurate models and standardizing the data used
to train them.

Chapter 3: Proposed System

3.1 The Proposal

The objective of this project is to create a system capable of analyzing emotions
and sentiments in user-provided reviews and text. The system will achieve this by
web scraping reviews of a particular product from a specific organization and then
conducting an analysis. To accomplish this, the project uses the BERT NLP
model and Deep Learning techniques to determine the tone and emotion conveyed
by the user in the text.
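
The scoring step could look roughly like the sketch below, which uses the Hugging Face transformers pipeline API. This part of the report does not name the checkpoint that was used; nlptown/bert-base-multilingual-uncased-sentiment is assumed here only because it is a publicly available BERT model that labels text from "1 star" to "5 stars". The function names stars_from_label and score_reviews are hypothetical.

```python
def stars_from_label(label):
    """Turn a label such as '4 stars' into the integer 4."""
    return int(label.split()[0])


def score_reviews(reviews,
                  model_name="nlptown/bert-base-multilingual-uncased-sentiment"):
    """Return one 1-5 score per review using a Hugging Face pipeline.

    The default model name is an assumption, not the project's stated choice.
    """
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=model_name)
    # truncation=True keeps long reviews within the model's input limit.
    results = classifier(list(reviews), truncation=True)
    return [stars_from_label(result["label"]) for result in results]
```

Calling score_reviews downloads the model on first use, so it needs network access and the transformers package with a backend such as PyTorch installed.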

The proposed system is a web-based application designed to cater to
various user needs. It operates by accepting user-provided text input, which is
then subjected to thorough sentiment and emotional analysis. The primary goal of
this system is to precisely determine the user's intention, the sentiment conveyed,
the underlying emotions, and the tone of the text. By doing so, it offers valuable
insights into the user's perspective and feelings, allowing for a deeper
understanding of the communicated message.

Furthermore, this system encompasses an additional module, enhancing its
utility and versatility. This module enables users to select their specific
organization and product of interest. Once the selection is made, the system
provides a set of powerful graphical tools that facilitate the analysis of reviews and
feedback related to the chosen product and organization. These tools are
instrumental in distilling critical business insights and decision-making. They
offer a comprehensive view of customer sentiments and opinions, aiding
organizations in making informed and data-driven choices to improve their
products and services, enhance customer satisfaction, and drive success.

3.2 Benefits of the Proposed System

The proposed sentiment analyzer uses the BERT NLP model to obtain a sentiment
score ranging from 1 to 5, where a score less than 3 indicates a negative comment,
a score greater than 3 indicates a positive comment, and a score of exactly 3
indicates a neutral statement. Some key points of explanation are:

• Accurate predictions: Using emotion detection and sentiment analysis
together makes the predictions more accurate, and the neural network is
trained on a well-established dataset from Kaggle, which enhances the
quality of prediction. Also, the sentiment score is calculated using a BERT
model trained on millions of text samples, and hence will output accurate
predictions with an accuracy of 95%.
• Efficient and automated: The envisioned system offers the capability to
streamline the tasks of sentiment analysis and its visual representation,
resulting in significant time and resource savings for business analysts and
organizations when it comes to making informed business decisions. With
the integration of this system, organizations can redirect their efforts
towards enhancing product development, rather than dedicating valuable
time to the manual review of customer feedback and sentiments.
• Customizable: The system can be customized to suit specific needs and
preferences. For example, users can input their reviews, comments, or any
message they want to post and instantly see how it is likely to be
interpreted by a reader. Also, the review analysis is done for a specific
organization, and hence the visualization can be customized according to
its business needs.
• Insights for decision-making: The sentiment analysis data generated by
the system provides a data-driven basis for decision-making. It helps
organizations identify patterns and trends in customer feedback, allowing
them to make decisions that are backed by solid evidence.
• Scalable: The system should be designed to handle a large volume of data.
This means it should have the capacity to process and analyze a substantial
number of text inputs and reviews without a significant decrease in
performance. This scalability can be achieved by using distributed
computing frameworks or cloud-based solutions that can dynamically
allocate resources as needed.

Overall, the proposed sentiment analysis system offers several benefits, including
improved accuracy, efficiency, customizability, insights for decision-making, and
scalability.
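
The score thresholds stated at the start of this section, and the kind of summary a chart could be drawn from, can be sketched in a few lines. The function names are hypothetical, and the input scores are assumed to be integers from 1 to 5 already produced by the sentiment model.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def sentiment_label(score):
    """Map a 1-5 sentiment score to a label using the thresholds above."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    if score < 3:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


def sentiment_shares(scores):
    """Return the fraction of reviews that fall under each label."""
    counts = Counter(sentiment_label(score) for score in scores)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Usage with made-up scores for four reviews.
print(sentiment_shares([5, 4, 3, 1]))
```

The resulting dictionary can be handed to any plotting library to draw a bar or pie chart of the sentiment distribution.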

3.3 Feasibility Study

A feasibility study is an important step in determining whether the proposed
sentiment analysis system is viable and practical. Here is an overview of the
feasibility study for the proposed system:

• Technical feasibility: The proposed system is technically feasible as it uses
established technologies such as NLP, web scraping, and deep learning
algorithms. The use of these technologies has been proven to be effective in
similar applications.
• Economic feasibility: The economic feasibility of the system depends on
factors such as the cost of hardware, software, and labor. The system
requires a computer and software for model training, web scraping, and
performing deep learning tasks. The cost of these components can be
significant, but the potential cost savings from an automated analyzer could
outweigh the initial investment.
• Legal feasibility: The system must comply with relevant legal and ethical
standards, including data privacy and security regulations. The use of
personal information in the dataset and the handling of sensitive data must
be addressed.
• Operational feasibility: The system's operational feasibility depends on
the availability of necessary resources, such as trained personnel, sufficient
data storage, and reliable internet connectivity. The use of automated tools
such as web scraping and of a pretrained model can streamline
operations and make the system more efficient.
• Schedule feasibility: The development and implementation of the system
will require time and resources. The project must be appropriately planned
and scheduled to ensure timely delivery.

Based on the feasibility study, the proposed sentiment analysis system is
technically feasible, economically viable, legally compliant, operationally feasible,
and can be delivered within a reasonable schedule.

3.3.1 Technical

The proposed sentiment analysis system is technically feasible, as all the
required technologies and resources for its development and
implementation are well-established. Technical feasibility is assessed by
evaluating the system's components, infrastructure, and technological
aspects. This includes the availability and accessibility of the data sources,
such as product reviews and user-generated content, and whether the web
scraping and data retrieval methods are workable and compliant with data
usage policies. The availability of suitable NLP libraries and pre-trained
models such as BERT or GPT makes sentiment analysis more feasible and
accurate. The sentiment analysis algorithms and processes should be
optimized continuously to reduce latency and resource consumption, and
the user interface should be responsive and adaptable to different devices
and screen sizes.

3.3.2 Economical

The economic feasibility study of the proposed sentiment analysis system
aims to determine the financial viability of the project. The study examines
the costs associated with the development and implementation of the
system and compares them to the potential benefits that the system can
provide.

The main costs of the system include the development and implementation
of the software, data collection and processing, computational resources,
and maintenance costs. The costs associated with software development
and implementation include the cost of hiring developers, purchasing
necessary software and hardware, and any other development-related
expenses. Data collection and processing costs include the cost of web
scraping tools and the time and effort required to collect and process the
data.

On the other hand, the potential benefits of the system include increased
accuracy in sentiment prediction, reduced human error, improved
efficiency in the analysis process, and reduced operational costs. The
system can automate sentiment prediction and its
visualization, which supports effective business decisions and can lead to
increased profitability and productivity.

The economic feasibility study indicates that the proposed system is
economically feasible, as the potential benefits outweigh the costs. The
system can help businesses streamline their operations, save time and
resources, and increase profitability. Therefore, the proposed sentiment
analysis system is a practical and cost-effective solution for automated
review analysis based on review data and deep learning algorithms.

3.3.3 Operational

The operational feasibility study of the proposed system aims to determine
the system's ability to meet its objectives and whether it can be integrated
into the current workflow of businesses.

The operational feasibility study examines the current process of the
businesses and identifies any potential bottlenecks, gaps, or limitations
that the proposed system can address. It also assesses the availability of
resources required for the system, including personnel, technology, and
data.

The study also considers the training needs for the employees who will use
the system and assesses their ability to adapt to the new technology.
Additionally, it evaluates the system's usability, reliability, and
performance.

The operational feasibility study indicates that the proposed system is
operationally feasible. The system can easily integrate into the current
workflow of online shopping businesses, and it can address potential
bottlenecks in the reviewing process by automating the sentiment
prediction process. The system's usability, reliability, and performance can
be enhanced by designing a user-friendly interface, implementing robust
testing procedures, and ensuring the availability of required computational
resources.

Moreover, the system's integration with existing technologies, such as web
scraping, NLP algorithms, and deep learning algorithms, ensures that the
system can meet its objectives and provide accurate predictions based on
review data. The training needs for the employees can be addressed by
providing proper training and support materials.

Therefore, the operational feasibility study concludes that the proposed
sentiment analysis system is operationally feasible and can provide
businesses with an efficient and accurate automated sentiment and
emotion detection solution.

3.4 Design Representation


3.4.1 Use Case Diagram

Fig. 3-1: Use Case Diagram


3.4.2 Activity Diagram

Fig. 3-2: Activity Diagram


3.4.3 Entity Relationship Diagram

Fig. 3-4: Entity Relationship Diagram

3.4.4 Text Analyzer

Fig. 3-5: Text Analyzer


3.4.5 Review Analyzer

Fig. 3-6: Review Analyzer

3.4.6 Dataset Structure


The dataset is the training and testing data on which the model learns and
against which its accuracy is tested. We split the data in an 80:20 ratio,
i.e. 80% of the data is used for training and 20% for testing. This
dataset is available on Kaggle as the Emotion Detection dataset. For
sentiment scores we used a BERT model that has been pre-trained on millions
of text samples.

The Kaggle emotion detection dataset consists of three files, “train”,
“test”, and “val”. Each is a CSV file in which every line holds a review and
its corresponding emotion, separated by a semicolon.
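
The file layout and the 80:20 split described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the sample rows, the function names `load_rows` and `split_rows`, and the random seed are assumptions made for the example and are not taken from the project.

```python
import csv
import io
import random

# Hypothetical sample in the "text;label" layout described above.
SAMPLE = """i feel great today;joy
i am so angry at the delay;anger
this made me cry;sadness
what a lovely surprise;surprise
i am scared of the dark;fear
i adore this phone;love
the screen broke again;anger
best purchase ever;joy
i miss my old phone;sadness
i love the camera;love
"""

def load_rows(handle):
    """Read (text, label) pairs from a semicolon-separated file object."""
    reader = csv.reader(handle, delimiter=";")
    return [(row[0], row[1]) for row in reader if len(row) == 2]

def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = load_rows(io.StringIO(SAMPLE))
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))
```

With a real file, `load_rows(open("train.txt", encoding="utf-8"))` would read the rows in the same way, assuming the file name matches the downloaded dataset.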

The dataset created for visualization consists of the reviews, their
sentiment scores as evaluated by the BERT model, the emotion labels
generated by the deep neural network, and a category score that classifies
each review as good or bad.


3.5 Deployment Requirements


The deployment hardware and software requirements for the proposed system
include the following:

3.5.1 Hardware
• Processor: Intel Core i5 or higher
• RAM: 8 GB or higher
• Hard Disk Space: 1 TB or higher
• Graphics Card: NVIDIA or AMD with a minimum of 2 GB
memory
• Internet Connection: Broadband or higher

3.5.2 Software
• Operating System: Windows 10 or Ubuntu 18.04 or higher
• Python 3.7 or higher
• Transformers Library from Hugging Face
• Scikit-learn Library
• BeautifulSoup Library
• Streamlit Library
• TensorFlow Library
Overall, the hardware and software requirements for the proposed system require
careful consideration to ensure that the system functions correctly and provides
users with a satisfactory experience.
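
A deployment machine can be checked against the software list above with a short script. The sketch below is one possible approach, not part of the project: the helper names are hypothetical, and the module names in `REQUIRED_MODULES` are the usual import names of the listed libraries.

```python
import importlib.util
import sys

# Usual import names of the libraries listed in section 3.5.2.
REQUIRED_MODULES = ["transformers", "sklearn", "bs4", "streamlit", "tensorflow"]

def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

def python_is_supported(minimum=(3, 7)):
    """True when the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum

print("Python OK:", python_is_supported())
print("Missing:", missing_modules(REQUIRED_MODULES))
```

An empty "Missing" list suggests that the listed libraries are installed; it does not verify their versions.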


Chapter 4: Implementation

4.1 Implementation

The implementation process would involve the following steps:

• Develop the code for data preprocessing, emotion detection, sentiment
extraction, and model training using the chosen programming language
and frameworks.
• Train the deep learning model on the pre-processed dataset and evaluate
its performance using appropriate metrics.
• Develop the web-based application using a suitable web framework and
deploy it to a cloud service provider.
• Test the application and refine it as necessary based on user feedback.
• Monitor the performance of the application and update the deep learning
model periodically as new data becomes available.
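
The evaluation step in the list above mentions "appropriate metrics" without naming them. As an illustration only, the sketch below computes accuracy and per-class precision and recall in plain Python; the labels and predictions are made up, and the project itself may have used different metrics.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen."""
    true_pos = Counter()
    predicted = Counter(y_pred)
    actual = Counter(y_true)
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_pos[t] += 1
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / actual[label] if actual[label] else 0.0
        scores[label] = (precision, recall)
    return scores

# Made-up labels for illustration.
y_true = ["joy", "joy", "anger", "sadness", "anger", "joy"]
y_pred = ["joy", "anger", "anger", "sadness", "joy", "joy"]
print(accuracy(y_true, y_pred))
print(per_class_precision_recall(y_true, y_pred))
```

Per-class scores are useful for emotion detection because a single accuracy figure can hide poor performance on rare emotions.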

Overall, implementing this model requires a thorough understanding of Deep
Learning, Natural Language Processing, and web development, as well as
expertise in the specific technologies used.

4.2 Technique Used

4.2.1 Web Scraping

Web scraping is the process of extracting data from websites using
automated software. It involves downloading and parsing the HTML or
XML content of web pages to extract the desired information.

Web scraping can be performed using various tools and libraries such as
Beautiful Soup, Scrapy, Selenium, and many more. These tools allow
developers to extract data from websites in a structured manner, making it
easy to collect the required information.

In this sentiment analysis system, web scraping is used to extract
reviews from the product site specified by the stakeholders.

Web scraping can be a complex process, and it is important to ensure that
it is performed ethically and legally. It is important to respect the website's
terms of service and not to overload the server with too many requests.
Proper data cleaning and processing should also be performed on the
extracted data to ensure that it is accurate and usable.
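
The parsing step and the request throttling described above can be sketched with the standard library alone. This is a stand-in for illustration: the project uses Beautiful Soup, whereas the sketch uses `html.parser`, and the class name `ReviewParser`, the helper `polite_pause`, and the sample HTML are assumptions. The rule "a paragraph whose class contains 'comment'" mirrors the filter used in the project's source listing.

```python
import time
from html.parser import HTMLParser

class ReviewParser(HTMLParser):
    """Collect the text of every <p> whose class contains 'comment'."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        if tag == "p" and "comment" in css_class:
            self._inside = True
            self._chunks = []

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._inside:
            self.reviews.append("".join(self._chunks).strip())
            self._inside = False

def extract_reviews(html):
    """Return the review texts found in one HTML page."""
    parser = ReviewParser()
    parser.feed(html)
    return parser.reviews

def polite_pause(seconds=1.0):
    """Sleep between requests so the target server is not flooded."""
    time.sleep(seconds)

# Made-up page fragment for illustration.
SAMPLE_HTML = """
<div>
  <p class="comment__abc">Great coffee, <span>friendly</span> staff.</p>
  <p class="footer">Not a review</p>
  <p class="comment__abc">Slow service.</p>
</div>
"""
print(extract_reviews(SAMPLE_HTML))
```

When several pages are fetched in a loop, calling `polite_pause()` between requests keeps the request rate low; a suitable delay depends on the site's terms of service.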

4.2.2 Natural Language Processing

Natural Language Processing (NLP) is a field of artificial intelligence that
focuses on the interaction between computers and human language. It
seeks to bridge the gap between human communication and machine
understanding, enabling computers to comprehend, generate, and respond
to natural language in a way that is both meaningful and contextually
relevant. NLP is a multifaceted discipline that involves various tasks,
including text analysis, language translation, sentiment analysis, and
speech recognition, among others. Its applications are vast and diverse,
ranging from chatbots and virtual assistants to language translation
services and data mining for valuable insights from textual data. NLP
continues to advance rapidly, offering promising possibilities for
revolutionizing the way we interact with technology and process vast
amounts of textual information.

In this project, the Hugging Face Transformers library is used to predict the
sentiment score. The BERT model is pre-trained on millions of text samples
and fine-tuned to predict a sentiment score for each review.
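
The project's source listing turns the model output into a score by taking the index of the largest logit and adding one, which gives a value from 1 to 5. The sketch below reproduces that arithmetic in plain Python; the logits are made-up numbers rather than real model output, and the softmax step is added only to show a confidence value.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def star_rating(logits):
    """Pick the most likely class and shift it from 0-4 to 1-5."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best + 1, probs[best]

# Made-up logits for a clearly negative review (not real model output).
example_logits = [3.2, 1.1, -0.4, -1.8, -2.0]
stars, confidence = star_rating(example_logits)
print(stars, round(confidence, 3))
```

Softmax does not change which class wins, so the score equals the plain argmax plus one.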

4.2.3 Deep Neural Network

Deep Convolutional Networks have found their application in sentiment
analysis, a field that involves classifying text data according to the
emotional tone or sentiment expressed within it. In sentiment analysis,
deep convolutional networks are used to process textual information in a
manner similar to their image-processing applications. They can
automatically learn and extract meaningful features from text, capturing
local patterns and dependencies in the data, such as identifying relevant
words or phrases that express sentiment. By stacking multiple
convolutional layers and combining them with pooling layers and fully
connected layers, these networks can build a hierarchical representation of
text data, enabling them to understand the nuances of sentiment within a
document or sentence. Deep Convolutional Networks have proven effective
in sentiment analysis tasks, including sentiment classification for social
media content, product reviews, and customer feedback, helping
businesses gain insights into public opinion and customer satisfaction.

In this project, a deep convolutional network is trained on the Kaggle
emotion detection dataset; the trained model is then used to predict the
emotion of the reviews of a particular product.
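
The way a convolutional filter picks up a local pattern in text, followed by pooling, can be illustrated with a toy example. Everything in the sketch is an assumption made for illustration: each token is reduced to a single hand-picked score, and the filter weights are written by hand, whereas a trained network learns both embeddings and filters from data.

```python
def conv1d(sequence, kernel):
    """Slide the kernel over the sequence (valid padding, stride 1)."""
    width = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]

def relu(values):
    """Replace negative responses with zero."""
    return [max(0.0, v) for v in values]

def global_max_pool(values):
    """Keep only the strongest response of the filter."""
    return max(values)

# Hypothetical one-dimensional "embedding" per token: positive words score
# above zero, negative words below zero.
tokens = ["the", "battery", "is", "not", "good"]
scores = [0.0, 0.1, 0.0, -1.0, 0.8]

# Hand-written two-token filter that responds to a negation followed by a
# positive word.
negation_filter = [-1.0, 1.0]

feature_map = relu(conv1d(scores, negation_filter))
print(feature_map)
print(global_max_pool(feature_map))
```

The largest response appears at the window covering "not good", which is the kind of local phrase pattern the section describes.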


4.3 Tools Used

4.3.1 Beautiful Soup

Beautiful Soup is a Python library that is used for web scraping purposes.
It is used to extract the data from HTML and XML files by parsing the
content of the files. Beautiful Soup provides a simple and intuitive interface
for working with HTML and XML files, making it easy to extract the
required data.

In this system, Beautiful Soup is used to extract data from the specified
product site. The library parses the HTML files of the web
pages and extracts the reviews. The data extracted by Beautiful Soup is
then used to analyze sentiment. Beautiful Soup can be installed using the
pip or conda package manager.

4.3.2 Hugging Face Transformers BERT Model

BERT, which stands for Bidirectional Encoder Representations from
Transformers, is a state-of-the-art natural language processing model
developed by Google. BERT is designed to understand the contextual
nuances and relationships between words in a sentence. What sets BERT
apart is its ability to analyze language bidirectionally, meaning it considers
the entire context of a word by looking at both the words before and after
it. This contextual understanding is crucial in sentiment analysis because
the sentiment of a word or phrase often depends on the surrounding text.
BERT can pre-train on vast amounts of text data to learn language
representations and then fine-tune on specific tasks like sentiment
analysis. This fine-tuning process allows BERT to capture subtle nuances in
sentiment by considering the context, making it exceptionally effective in
sentiment classification tasks. BERT's ability to handle complex sentence
structures and ambiguous language contributes to more accurate
sentiment analysis, making it a valuable tool in understanding and
interpreting emotions and opinions expressed in text data.

4.3.3 Scikit-learn

Scikit-learn is a popular and versatile open-source machine learning
library in Python that provides a wide range of tools and algorithms for
various tasks related to data analysis, machine learning, and statistical
modelling. It is widely used for classification, regression, clustering,
dimensionality reduction, and more. In the context of emotion detection,
scikit-learn can be employed to build and evaluate machine learning
models for classifying text or other data sources based on the emotions or
sentiments expressed within them. By utilizing scikit-learn, developers and
data scientists can preprocess and vectorize text data, select appropriate
features, and train classification models, such as Support Vector Machines
(SVMs), Naive Bayes, or decision trees, to recognize emotions like
happiness, sadness, anger, or surprise within textual content. Scikit-learn's
user-friendly API and extensive documentation make it a valuable tool for
implementing emotion detection algorithms and conducting sentiment
analysis in a wide range of applications, from social media sentiment
tracking to customer feedback analysis.
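
The vectorization step mentioned above can be illustrated without the library. The project's source listing uses scikit-learn's `CountVectorizer` with word n-grams of length one to three; the sketch below is a simplified plain-Python stand-in that shows the idea only. It does not replicate scikit-learn's tokenization or tie-breaking, and the function names are hypothetical.

```python
from collections import Counter

def ngrams(words, low=1, high=3):
    """Yield every n-gram of the word list for low <= n <= high."""
    for n in range(low, high + 1):
        for start in range(len(words) - n + 1):
            yield " ".join(words[start:start + n])

def build_vocabulary(documents, max_features=5000):
    """Keep the most frequent n-grams and give each one a column index."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc.split()))
    most_common = [term for term, _ in counts.most_common(max_features)]
    return {term: index for index, term in enumerate(sorted(most_common))}

def vectorize(document, vocabulary):
    """Count how often each vocabulary n-gram occurs in one document."""
    vector = [0] * len(vocabulary)
    for term in ngrams(document.split()):
        if term in vocabulary:
            vector[vocabulary[term]] += 1
    return vector

docs = ["the course was long", "the course was good"]
vocab = build_vocabulary(docs)
print(list(ngrams("the course was long".split())))
print(vectorize("the course was long", vocab))
```

Each document becomes a fixed-length count vector, which is the form a classifier such as an SVM or a dense neural network expects as input.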

4.3.4 Pandas

Pandas is an open-source data manipulation and analysis library in Python.
It provides data structures for efficiently storing and manipulating tabular
data, such as data frames and series.


In the sentiment analysis system, Pandas can be used for various data-
related tasks, such as:

• Data cleaning: Pandas provides various functions to clean and
preprocess data, such as handling missing values, removing
duplicates, and converting data types.
• Data manipulation: Pandas provides functions for filtering,
sorting, grouping, and aggregating data.
• Data analysis: Pandas provides statistical and mathematical
functions to analyze data, such as mean, median, standard deviation,
and correlation.

In the context of the system, Pandas can be used for cleaning and
preprocessing the data scraped from websites, as well as manipulating and
analyzing the dataset to prepare it for training the deep learning model.
Additionally, Pandas can hold new review data so that the trained model can
be applied to it to predict sentiment and emotion.

4.4 Language Used

Python is used in the system because of the following characteristics:

• Simple: Python is a simple and minimalistic language. Reading a good
Python program feels almost like reading English (but very strict English!).
This pseudo-code nature of Python is one of its greatest strengths. It allows
you to concentrate on the solution to the problem rather than the syntax
i.e. the language itself.
• Free and Open Source: Python is an example of FLOSS (Free/Libre and
Open-Source Software). In simple terms, you can freely distribute copies of
this software, read its source code, make changes to it, and use
pieces of it in new free programs. FLOSS is based on the concept of a
community which shares knowledge. This is one of the reasons why Python is
so good: it has been created and improved by a community who just want to
see a better Python.
• Object Oriented: Python supports procedure-oriented programming as
well as object-oriented programming. In procedure-oriented languages,
the program is built around procedures or functions which are nothing but
reusable pieces of programs. In object-oriented languages, the program is
built around objects which combine data and functionality. Python has a
very powerful but simple way of doing object-oriented programming,
especially, when compared to languages like C++ or Java.
• Extensive Libraries: The Python Standard Library is huge indeed. It can
help you do various things involving regular expressions, documentation
generation, unit testing, threading, databases, web browsers, CGI, ftp,
email, XML, etc.

4.5 Glimpse of Project


The following are screenshots of the project results:

Fig. 4-1: Text Analyzer Module


Fig. 4-2: Visualization of Result 1

Fig. 4-3: Review Analyzer Module


Fig. 4-4: Visualization Result 2

4.6 Testing
Testing is the process of evaluating a system to detect differences between the
actual and the expected output for a given input, and to assess the features of
the system. Testing assesses the quality of the product and is carried out
during the development process.

4.6.1 Strategy Used


Tests can be conducted based on two approaches:
• Functionality testing
• Implementation testing
The testing method used here is Black Box Testing. It is carried out to test
the functionality of the program and is also called ‘Behavioral’ testing. The
tester in this case has a set of input values and the respective desired
results. On providing an input, if the output matches the desired result, the
program is tested ‘ok’, and problematic otherwise.
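
A black-box check of this kind can be written as a table of inputs and desired results. The sketch below is illustrative only: `categorise` is a hypothetical function, and its thresholds (4 and 5 as Good, 3 as Average, 1 and 2 as Bad) are assumptions rather than the project's actual rule.

```python
def categorise(star_score):
    """Map a 1-5 score to a coarse review category (assumed thresholds)."""
    if star_score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    if star_score >= 4:
        return "Good"
    if star_score == 3:
        return "Average"
    return "Bad"

# Each case pairs an input with the desired result; the tester never looks
# inside categorise().
CASES = [(5, "Good"), (4, "Good"), (3, "Average"), (2, "Bad"), (1, "Bad")]

def run_black_box(function, cases):
    """Return the cases whose actual output differs from the desired one."""
    failures = []
    for given, expected in cases:
        actual = function(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures

failures = run_black_box(categorise, CASES)
print("ok" if not failures else failures)
```

The same `run_black_box` helper can be pointed at any function, because it relies only on inputs and outputs and not on the implementation.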

The sentiment analysis system has undergone various testing methods to
ensure its effectiveness and accuracy. One of the testing methods used is
unit testing, which involves testing individual components of the system to
ensure they are working correctly. Integration testing was also performed
to verify that the different components of the system work seamlessly
together.

In addition, performance testing was carried out to assess the system's
ability to handle a large volume of data and processing demands. This was
done to ensure that the system does not become slow or unresponsive
when handling multiple requests from users.

Finally, user acceptance testing was conducted to ensure that the system
meets the expectations and requirements of end-users. This involved a
group of users interacting with the system to verify that it is easy to use,
user-friendly, and provides accurate and reliable results.


4.6.2 Results
Sample Input and output:

Fig. 4-5: Input for Text Analyzer

Fig. 4-6: Output for Text Analyzer


Chapter 5: Conclusion

5.1 Conclusion
Sentiment analysis systems have become increasingly indispensable in today's
world due to their ability to extract valuable insights from the vast amount of
textual data generated daily. These systems help in solving various real-world
problems and offer significant advantages in our daily lives. In the business realm,
sentiment analysis is employed to gauge customer opinions and sentiment about
products and services, enabling companies to adapt and improve their offerings
based on feedback. It also aids in identifying emerging trends and issues early,
allowing for timely responses. In the realm of social media, sentiment analysis is
used to track public sentiment on important topics, from political events to public
health crises, helping governments and organizations make informed decisions.
Sentiment analysis is also applied in customer service to detect and address
negative feedback swiftly, enhancing user experiences. In news and media, it
assists in classifying news articles and identifying potentially biased reporting. By
providing a better understanding of public sentiment and opinions, sentiment
analysis systems play a pivotal role in enhancing decision-making, customer
satisfaction, and overall communication in the digital age.

5.2 Limitations of the Work


• No task can be 100% perfect, and the same applies to this project: the
predicted sentiment is not 100% accurate.
• For accurate results we have used a pretrained BERT model because it is
trained on millions of text samples, which would not be possible if we
developed our own model from scratch. However, for the emotion detection
task we have only a small dataset, on which we developed a deep CNN model.
• One significant drawback of this system is the continuous influx of new
posts and reviews, making it challenging to consistently predict the total
score since it tends to fluctuate.
• Furthermore, the accuracy of the model's predictions depends on the
quality of the deep convolution network developed. If the algorithm is not
accurate, the model's predictions may be affected.

5.3 Suggestion and Recommendations for Future Work


The future prospects for this system are expansive. We aim to enhance its
capabilities by scaling it to process images and analyze real-time sentiments.
Additionally, we plan to adapt the system for diverse applications, including the
crucial task of mitigating toxicity on social media platforms. This involves
identifying and removing highly negative and irrelevant comments that can
adversely impact individuals' mental well-being. Furthermore, we intend to
incorporate more nuanced sentiment analysis techniques, such as tone analysis, to
ensure the accuracy of our results. These developments will not only broaden the
system's utility but also contribute to a safer and more insightful online
environment.




Source Code

# Streamlit Frontend Code

import pickle
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt

def plot_sentiment_distribution(excel_file):
    """
    Read data from an Excel file, filter sentiment scores, and create a
    bar chart.

    Parameters:
    • excel_file: The path to the Excel file.

    Returns:
    • None
    """
    try:
        # Step 1: Data Extraction
        df = pd.read_excel(excel_file)

        # Step 2: Data Preparation (keep only valid scores 1 to 5)
        sentiment_scores = [1, 2, 3, 4, 5]
        filtered_data = df[df['sentiment'].isin(sentiment_scores)]
        review_counts = filtered_data['sentiment'].value_counts().sort_index()

        # Step 3: Data Visualization
        # Streamlit has no xlabel/ylabel/xticks functions, so the title is
        # shown as a subheader and the axes are described in a caption.
        st.subheader('Bar Chart of Reviews by Sentiment Score')
        st.bar_chart(review_counts)
        st.caption('x-axis: Sentiment Score, y-axis: Number of Reviews')
    except Exception as e:
        st.write(f"An error occurred: {e}")

def plot_sentiment_pie_chart(excel_file):
    try:
        # Data Extraction
        df = pd.read_excel(excel_file)

        # Data Preparation
        sentiment_counts = df['sentiment'].value_counts()

        # Data Visualization (Pie Chart)
        # Note: only scores 5, 3 and 1 are shown; scores 2 and 4 are not
        # included in this chart.
        labels = ['Good', 'Average', 'Bad']
        sizes = [
            sentiment_counts.get(5, 0),  # Good
            sentiment_counts.get(3, 0),  # Average
            sentiment_counts.get(1, 0)   # Bad
        ]
        colors = ['#77DD77', '#FFDD00', '#FF6961']

        fig, ax = plt.subplots(figsize=(6, 6))
        ax.pie(sizes, labels=labels, colors=colors,
               autopct='%1.1f%%', startangle=140)
        ax.set_title('Composition of Reviews')

        # Display the pie chart
        st.pyplot(fig)
    except Exception as e:
        st.write(f"An error occurred: {e}")

def plot_emotion_category_bar_chart(excel_file):
    try:
        # Data Extraction
        df = pd.read_excel(excel_file)

        # Data Preparation
        emotions = df['emotion_score']
        emotion_counts = emotions.value_counts()

        # Data Visualization (Bar Chart)
        fig, ax = plt.subplots(figsize=(10, 6))
        emotion_counts.plot(kind='bar', color='yellow', ax=ax)
        ax.set_xlabel('Emotion Category')
        ax.set_ylabel('Number of Reviews')
        ax.set_title('Bar Chart of Reviews by Emotion Category')

        # Display the bar chart
        st.pyplot(fig)
    except Exception as e:
        st.write(f"An error occurred: {e}")

def main():
    st.title("Emotion Detector")
    menu = ["Text Analyzer", "Review Analyzer"]
    choice = st.sidebar.selectbox("Menu", menu)

    if choice == "Text Analyzer":
        st.write("Text Analyzer Module")
        with st.form(key="form2"):
            inputText = st.text_input("Type here...")
            submit_button = st.form_submit_button(label="Submit")

        if submit_button:
            # NOTE: `model` is not defined anywhere in this listing. The
            # call shape resembles a text-completion client; which client
            # was used is not stated in the report.
            response = model.Completion.create(
                engine="text-davinci-003",
                prompt=(f"Please analyze the emotion in the following text: '{inputText}'\n\n"),
                temperature=0,
                max_tokens=10
            )
            print(response)
            sentiment = response.choices[0].text
            st.text(sentiment)
    else:
        st.write("Review Analyzer")
        products = ["Yelp", "Oneplus 11R 5G", "Oneplus 10", "iPhone 11",
                    "Motorola G30", "Oppo A30", "Samsung Galaxy S23 Ultra"]
        productChoice = st.selectbox("Products", products)

        if productChoice == "":
            st.write("Selected product is : {}".format(productChoice))
        elif productChoice == "Samsung Galaxy S23 Ultra":
            st.write("Selected product is : {}".format(productChoice))
        elif productChoice == "Yelp":
            st.write("Review Analyzer Module")
            plot_sentiment_distribution('categorised_data2.xlsx')
            plot_sentiment_pie_chart('categorised_data2.xlsx')
            plot_emotion_category_bar_chart('categorised_data2.xlsx')

if __name__ == '__main__':
    main()

# Training Model

# -*- coding: utf-8 -*-
"""semantic analysis.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1Cv1pGVt5TDnJatb-T_DuNcgvQkupvTnq
"""

# Notebook shell commands: install the required packages
!pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
!pip install transformers requests beautifulsoup4 pandas numpy

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import requests
from bs4 import BeautifulSoup
import re

# Load the pretrained multilingual BERT sentiment model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
model1 = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

tokens = tokenizer.encode('ver bad quality', return_tensors='pt')
tokens  # the text encoded as a tensor of token ids

# Run the model on the encoded text
result = model1(tokens)
result
result.logits
result.logits[0]

# The index of the largest logit (0-4) plus one gives the 1-5 sentiment score
int(torch.argmax(result.logits))+1

import pickle
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Define the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

# Create a dictionary to store both the tokenizer and the model
model_and_tokenizer = {
    'tokenizer': tokenizer,
    'model': model
}

# Save the dictionary to a pickle file
with open('model_and_tokenizer.pkl', 'wb') as file:
    pickle.dump(model_and_tokenizer, file)

# Download the page and collect the paragraphs whose class contains "comment"
r = requests.get('https://www.yelp.com/biz/social-brew-cafe-pyrmont')
soup = BeautifulSoup(r.text, 'html.parser')
regex = re.compile('.*comment.*')
results = soup.find_all('p', {'class': regex})
reviews = [result.text for result in results]

# https://www.yelp.com/biz/social-brew-cafe-pyrmont
reviews

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array(reviews), columns=['review'])
df

def sentiment_score(review):
    # Encode the review, run the model, and map the best class to 1-5
    tokens = tokenizer.encode(review, return_tensors='pt')
    result = model1(tokens)
    return int(torch.argmax(result.logits))+1

sentiment_score(df['review'].iloc[1])

print(len(df.index))

# for i in range(0,512):
#     df['score'].iloc[i]=sentiment_score(df['review'].iloc[i])

# Score every review, passing only its first 512 characters to the model
df['sentiment'] = df['review'].apply(lambda x: sentiment_score(x[:512]))

df
df.tail()

"""**Emotion** **Detection**"""

import tensorflow as tf
import keras
import pandas as pd
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
import re

# Each file holds one "text;label" pair per line
train = pd.read_table('train.txt', delimiter=';', header=None)
val = pd.read_table('val.txt', delimiter=';', header=None)
test = pd.read_table('test.txt', delimiter=';', header=None)

data = pd.concat([train, val, test])
data.columns = ["text", "label"]
data.shape

# Count the rows that contain missing values
data.isna().any(axis=1).sum()

# text preprocessing
ps = PorterStemmer()

import nltk
nltk.download('stopwords')

# Build the stop-word set once instead of on every word
stop_words = set(stopwords.words('english'))

def preprocess(line):
    review = re.sub('[^a-zA-Z]', ' ', line)  # keep only the letters a to z
    review = review.lower()                  # lower the text
    review = review.split()                  # turn the string into a list of words
    # apply stemming and delete stop words like I, and, or
    review = [ps.stem(word) for word in review if word not in stop_words]
    # turn the list back into a sentence (joined once, with single spaces)
    return " ".join(review)

data['text'] = data['text'].apply(lambda x: preprocess(x))

from sklearn import preprocessing

# Encode the emotion labels as integers
label_encoder = preprocessing.LabelEncoder()
data['N_label'] = label_encoder.fit_transform(data['label'])
data['text']

from sklearn.feature_extraction.text import CountVectorizer

# example: the course was long -> [the, the course, the course was, course,
# course was, course was long, ...]
cv = CountVectorizer(max_features=5000, ngram_range=(1, 3))
data_cv = cv.fit_transform(data['text']).toarray()
data_cv

from sklearn.model_selection import train_test_split

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    data_cv, data['N_label'], test_size=0.25, random_state=42)

# a small feed-forward neural network with keras

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense

# define the keras model: two hidden ReLU layers and a 6-unit softmax output

model = Sequential()

model.add(Dense(12, input_shape=(X_train.shape[1],), activation='relu'))

model.add(Dense(8, activation='relu'))

model.add(Dense(6, activation='softmax'))

# compile the keras model

model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam', metrics=['accuracy'])
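For a single example, the softmax output and the sparse categorical cross-entropy loss named in the compile call reduce to a few lines of arithmetic. The sketch below uses only the standard library and made-up logits. It ignores details that Keras handles internally, such as clipping probabilities away from zero and averaging the loss over a batch.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probabilities, true_index):
    """Loss for one example whose label is an integer class index."""
    return -math.log(probabilities[true_index])


# Made-up scores for six classes, matching the six output units of the model.
probs = softmax([2.0, 1.0, 0.1, 0.0, -1.0, -2.0])
loss_if_class_0 = sparse_categorical_crossentropy(probs, 0)
loss_if_class_5 = sparse_categorical_crossentropy(probs, 5)
print(probs, loss_if_class_0, loss_if_class_5)
```

The loss is small when the model puts high probability on the true class and large otherwise, which is why the integer labels from LabelEncoder can be passed to fit directly without one-hot encoding.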

# fit the keras model on the dataset

model.fit(X_train, y_train, epochs=10, batch_size=10)



# evaluate the keras model on the training set and on the held-out test set

_, accuracy = model.evaluate(X_train, y_train)

print('Train accuracy: %.2f' % (accuracy*100))

_, accuracy = model.evaluate(X_test, y_test)

print('Test accuracy: %.2f' % (accuracy*100))

import pandas as pd

import numpy as np

text='Some of the best Milkshakes me and my daughter ever tasted. MMMMMM HMMMMMMMM.'

text=preprocess(text)

array = cv.transform([text]).toarray()

pred = model.predict(array)

a=np.argmax(pred, axis=1)

label_encoder.inverse_transform(a)[0]

tf.keras.models.save_model(model,'my_model.keras')

import pickle

vectorizer_filename = 'count_vectorizer.pkl'

with open(vectorizer_filename, 'wb') as file:
    pickle.dump(cv, file)

label_encoder_filename = 'label_encoder.pkl'

with open(label_encoder_filename, 'wb') as file:
    pickle.dump(label_encoder, file)


import tensorflow as tf

# Load the model back from the saved .keras file

loaded_model = tf.keras.models.load_model('my_model.keras')

model_and_weights = {
    'model': loaded_model
}

import pickle

# Save the dictionary with the model to a pickle file

with open('model_pickle.pkl', 'wb') as file:
    pickle.dump(model_and_weights, file)

df['review'].iloc[1]

def predict_emotion(text, model, cv, label_encoder):
    # Preprocess the input text
    text = preprocess(text)

    # Transform the preprocessed text into a feature array
    array = cv.transform([text]).toarray()

    # Make predictions using the model
    pred = model.predict(array)

    # Get the predicted class
    predicted_class = np.argmax(pred, axis=1)[0]

    # Inverse transform the predicted class to get the emotion label
    predicted_emotion = label_encoder.inverse_transform([predicted_class])[0]

    return predicted_emotion
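The last two steps of predict_emotion, taking the arg-max of the probability vector and mapping the index back to a label, can be illustrated without NumPy or scikit-learn. This is a sketch under stated assumptions: the label list contains only the five emotions named in this listing (the model has six outputs, and the sixth label is not named here), the probabilities are invented, and decode_prediction is a hypothetical helper. LabelEncoder stores its classes in sorted order, which the sketch mimics with sorted().

```python
# Hypothetical label set: only the five emotions that appear in the listing.
classes = sorted(["joy", "love", "surprise", "sadness", "anger"])


def decode_prediction(probabilities, classes):
    """Return the label whose probability is highest (arg-max, then lookup)."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return classes[best_index]


# Invented probabilities, one per class, in the same sorted order as `classes`.
example_probabilities = [0.05, 0.60, 0.20, 0.10, 0.05]
predicted = decode_prediction(example_probabilities, classes)
print(predicted)
```

Because the encoder's class order is fixed at fit time, the same fitted label_encoder has to be used at prediction time, which is why the listing saves it alongside the vectorizer.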

text='you are a beautiful girl.'

# Preprocess and vectorize the new text so that `array` corresponds to it

text=preprocess(text)

array = cv.transform([text]).toarray()

import pickle

# Load the model from the pickle file

with open('model_pickle.pkl', 'rb') as file:
    loaded_model_dict = pickle.load(file)

loaded_model = loaded_model_dict['model']

# Use the loaded model for inference

predictions = loaded_model.predict(array)

a=np.argmax(predictions, axis=1)

label_encoder.inverse_transform(a)[0]

# for i in range(0,len(df)):

# df['emotion'].iloc[i]=emotion_f(df['review'].iloc[i])

# for i in range(0,512):

# df['score'].iloc[i]=sentiment_score(df['review'].iloc[i])

print(predict_emotion(df['review'].iloc[0],model,cv,label_encoder))

df['emotion_score'] = df['review'].apply(lambda x: predict_emotion(x[:512],model,cv,label_encoder))

df

df3=df.copy()

df3

# df = pd.DataFrame(data)

# Specify the filename for the Excel file

excel_filename = 'data2.xlsx'

# Save the DataFrame to an Excel file

df.to_excel(excel_filename, index=False)

df

# df = df.rename(columns={'label': 'emotion', 'N_label': 'sentiment_score','text':'reviews'})

df

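# Earlier draft of the categorization rule. It is never called; categorize_reviews is the version that is used.
# The row-wise comparisons and the "sadness"/"anger" labels follow categorize_reviews.
def catogery(review):
    for idx in df.index:
        sentiment = df.at[idx, 'sentiment']
        emotion = df.at[idx, 'emotion_score']
        if sentiment > 3 and emotion in ("joy", "love", "surprise"):
            df.at[idx, 'category'] = "good"
        elif sentiment <= 3 and emotion in ("sadness", "anger"):
            df.at[idx, 'category'] = "bad"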

df

df2 = df.copy()

df2


import pandas as pd

def categorize_reviews(df):
    for index, row in df.iterrows():
        sentiment = row['sentiment']
        emotion_score = row['emotion_score']

        if sentiment > 3 and (emotion_score == "joy" or emotion_score == "love" or emotion_score == "surprise"):
            df.at[index, 'category'] = 'good'
        elif sentiment <= 3 and (emotion_score == "sadness" or emotion_score == "anger"):
            df.at[index, 'category'] = 'bad'
        elif sentiment <= 3 and (emotion_score == "joy" or emotion_score == "love" or emotion_score == "surprise"):
            df.at[index, 'category'] = 'good'
        elif sentiment > 3 and (emotion_score == "sadness" or emotion_score == "anger"):
            df.at[index, 'category'] = 'bad'

    return df
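Reading the four branches of categorize_reviews side by side suggests that, for any numeric sentiment score, the category is decided by the emotion label alone. The sketch below checks this on plain Python values. four_branch_category transcribes the rule from the listing for a single row, emotion_only_category is a hypothetical simplification, and "other" is a placeholder for any label the rule does not mention.

```python
POSITIVE_EMOTIONS = {"joy", "love", "surprise"}
NEGATIVE_EMOTIONS = {"sadness", "anger"}


def four_branch_category(sentiment, emotion):
    """The rule from the listing, applied to one row; None means 'no category set'."""
    if sentiment > 3 and emotion in POSITIVE_EMOTIONS:
        return "good"
    elif sentiment <= 3 and emotion in NEGATIVE_EMOTIONS:
        return "bad"
    elif sentiment <= 3 and emotion in POSITIVE_EMOTIONS:
        return "good"
    elif sentiment > 3 and emotion in NEGATIVE_EMOTIONS:
        return "bad"
    return None


def emotion_only_category(emotion):
    """Hypothetical simplification: ignore the sentiment score entirely."""
    if emotion in POSITIVE_EMOTIONS:
        return "good"
    if emotion in NEGATIVE_EMOTIONS:
        return "bad"
    return None


# Compare both rules for every sentiment score from 1 to 5 and every emotion label.
emotions = sorted(POSITIVE_EMOTIONS | NEGATIVE_EMOTIONS | {"other"})
mismatches = [
    (sentiment, emotion)
    for sentiment in range(1, 6)
    for emotion in emotions
    if four_branch_category(sentiment, emotion) != emotion_only_category(emotion)
]
print(mismatches)
```

If the sentiment score is meant to influence the category, the rule would need branches in which the threshold changes the outcome. The two versions also differ when the sentiment value is missing, since neither comparison in the four-branch rule is then true.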

# # Example usage:

# data = {
#     'sentiment': [4, 2, 5, 3, 1],
#     'emotion_score': ['joy', 'anger', 'love', 'sadness', 'surprise']
# }

# df = pd.DataFrame(data)

df2 = categorize_reviews(df2)

# Specify the filename for the Excel file

excel_filename = 'categorised_data2.xlsx'

# Save the DataFrame to an Excel file

df2.to_excel(excel_filename, index=False)

excel_filename = 'categorised_data2.xlsx'

df2 = pd.read_excel(excel_filename)

df2

# text=input("enter your comment")

# print(sentiment_score(text))
# print(predict_emotion(text,model,cv,label_encoder))
