Review Analysis and Sentiment Learning Using NLP
A PROJECT REPORT
SUBMITTED IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE AWARD OF DEGREE
OF
BACHELOR OF TECHNOLOGY
IN
SOFTWARE ENGINEERING
Submitted by:
ROHAN SAHU
(2k22/SE/139)
SAHIL
(2k22/SE/149)
Under the supervision of
Dr. SONIKA DAHIYA
ASSISTANT PROFESSOR
CANDIDATE’S DECLARATION
Place: Delhi
Date: 12th December 2024
CERTIFICATE
I hereby certify that the Project Dissertation titled “Review Analysis and Sentiment Learning
using NLP” which is submitted by Rohan Sahu (2k22/SE/139) and Sahil (2k22/SE/149),
Department of Software Engineering, Delhi Technological University, Delhi in partial
fulfilment of the requirement for the award of the degree of Bachelor of Technology, is a
record of the project work carried out by the students under my supervision. To the best of
my knowledge, this work has not been submitted in part or full for any Degree or Diploma to
this University or elsewhere.
SUPERVISOR
ABSTRACT
ACKNOWLEDGEMENT
I would like to express my gratitude and appreciation to all those who gave me the
opportunity to complete this dissertation. Special thanks to my supervisor, Dr.
Sonika Dahiya (Assistant Professor), Department of Software Engineering, whose
stimulating suggestions and encouragement helped and guided me through the completion
of this project. It is with their supervision that this work came into existence.
I would like to thank the Department of Software Engineering for providing the
infrastructure, facilities and opportunity to work on this instructive project, which
helped me to learn new things about this rapidly evolving technology.
I also want to give special thanks to all my classmates who supported me in every course
and shared their valuable ideas and thoughts with me.
SUPERVISOR’S CERTIFICATE
ABSTRACT
ACKNOWLEDGEMENT
LIST OF TABLES
LIST OF FIGURES
1. INTRODUCTION
1.1 BACKGROUND
2. RELATED WORKS
3. METHODOLOGY
6. EXPERIMENTAL WORK
7. RESULTS
8. LIMITATIONS
9. CONCLUSION AND FUTURE WORK
REFERENCES
CHAPTER-1
INTRODUCTION
1.1 BACKGROUND AND MOTIVATION
In the age of e-commerce and online streaming platforms, user-posted content such
as reviews, feedback, and social media posts has become a valuable resource for
understanding consumer behaviour and sentiment, helping businesses see
what is trending in the market. Businesses rely heavily on this textual data to assess
customer satisfaction, monitor brand perception, and make cost-effective, meaningful
decisions. However, manually analysing such vast and diverse data is not only
time-consuming but also prone to bias, inconsistency and inaccuracy. This has driven
the need for automated systems capable of processing and extracting meaning from
textual data.
In this regard, the field of natural language processing (NLP) has become revolutionary,
providing sophisticated models and algorithms for efficient text analysis. One important
use of natural language processing (NLP) is sentiment analysis, which helps businesses
identify underlying themes, categorize reviews as neutral, negative, or positive, and
assess the emotional tone of customer feedback. Despite tremendous advancements in
this area, problems like managing sarcasm, domain-specific language, and linguistic
subtleties still exist.
The growing need for scalable, precise, and affordable tools to interpret unstructured text
data is what inspired this project. This project intends to close current gaps and develop a
system that can provide useful insights by utilizing cutting-edge NLP techniques. In
addition to helping companies improve their goods and services, being able to handle
customer reviews and sentiments effectively builds stronger customer relationships,
which in turn boosts growth and competitiveness.
Because manual review analysis takes a lot of time, is prone to human error, and cannot
keep up with the sheer volume of data, it is not scalable. Accurate text interpretation is
further complicated by elements like sarcasm, domain-specific terminology, ambiguous
language, and cultural quirks. Current automated solutions frequently find it difficult to
manage these complexities, which results in less-than-ideal sentiment detection and
thematic analysis outcomes.
As a result, a reliable, scalable, and accurate system that can classify sentiments, analyze
and interpret user-generated text data, and offer useful insights is desperately needed.
By utilizing cutting-edge Natural Language Processing (NLP) techniques, this project
seeks to close this gap and create a system that can overcome these obstacles,
empowering researchers and businesses to make defensible decisions based on
trustworthy sentiment analysis and review interpretation.
CHAPTER-2
RELATED WORK
2.1. BACKGROUND RESEARCH
Sentiment analysis aims to automatically identify and classify sentimental tendencies
in texts through computer technology and linguistic knowledge. The field has seen many
developments and improvements to date, with authors and universities applying
different machine learning techniques to evaluate textual sentiment.
TextRank is a graph-based text summarization algorithm that represents words or
phrases as nodes in a graph, with edge weights capturing semantic similarity.
Another method, Word2Vec, learns distributed word representations that capture
semantic relationships within a continuous vector space.
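The graph-based ranking idea behind TextRank can be illustrated with a minimal sketch. It is not the implementation used in the cited work: the function name `textrank_keywords` and its parameters are hypothetical, and co-occurrence counts within a small window stand in for the semantic-similarity edge weights, which is a simplifying assumption.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words with a PageRank-style score over a co-occurrence graph."""
    # Build an undirected, weighted graph: words are nodes, and an edge's
    # weight counts how often two words co-occur within `window` tokens.
    # (Assumption: co-occurrence is used as a proxy for semantic similarity.)
    weights = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                weights[word][other] += 1.0
                weights[other][word] += 1.0

    nodes = list(weights)
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new_scores = {}
        for node in nodes:
            rank = 0.0
            for neighbour, weight in weights[node].items():
                # Each neighbour shares its score in proportion to edge weight.
                out_weight = sum(weights[neighbour].values())
                rank += weight / out_weight * scores[neighbour]
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores

    # Highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy input: "battery" co-occurs with every other word, so it ranks first.
tokens = ["battery", "life", "battery", "charge", "battery", "screen"]
print(textrank_keywords(tokens, window=1))
```

Words that are connected to many other well-connected words receive higher scores, which is why the hub word in the toy input comes out on top.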
Deep learning-based approaches have also been used extensively in sentiment
analysis, including deep neural networks, CNNs, and attention mechanism-based
networks. Meena used a CNN to classify the sentiment polarity of social media
data, categorizing user comments into positive, negative, and neutral polarities
and reporting an accuracy of 95.4%. Similarly, Kruspe et al. used a neural
network with pre-trained word and sentence embeddings to perform sentiment analysis
on European COVID-19-related Twitter messages. Of the 4.6 million tweets that this
model examined, 79,000 contained COVID-19 keywords along with semantic information.
Other work applied various RNN variants to Amazon reviews to classify customer
sentiment as negative, neutral, or positive. These RNNs were combined with different
word embeddings for feature extraction, with the best configuration reaching an
accuracy of 93.75%.
In sentiment learning, both machine learning and deep learning are used to understand
the meaning of an opinion, but they work in different ways.
Machine learning-based methods are quick to build and work well on small datasets,
but they tend to fall short of high accuracy on large and complex datasets. Deep
learning-based methods are used for such datasets and can deliver higher accuracy,
but they are more difficult to develop, take longer to train, and need more
labelled data.
CHAPTER-3
METHODOLOGY
3.1 NATURAL LANGUAGE PROCESSING
Natural language processing (NLP) is a branch of computer science, and more
specifically of artificial intelligence. It is closely related to information retrieval,
knowledge representation, and computational linguistics, a branch of linguistics,
since its main goal is to enable computers to process data that has been encoded in
natural language. Usually, rule-based, statistical, or neural methods from
machine learning and deep learning are used to extract information from text corpora.
NLP research has helped enable the era of AI. From the communication skills of large
language models to image generation, and from voice assistants to the recommendation
systems of social media, NLP is already part of the day-to-day life of many people.
BENEFITS OF NLP
NLP makes it easier for human beings to communicate with machines by allowing
them to interact with machines in natural human language. Its benefits include:
1. Automation of repetitive tasks
2. Sentiment analysis
3. Enhanced research work
4. Content generation
NLP TASKS
Tasks that are made easier by NLP and can be performed with high accuracy:
1. Sentiment analysis
This task analyses the sentiment expressed in a given piece of text; in other words,
it extracts the opinion that a sentence conveys. NLP identifies the polarity of a
word or phrase as positive, negative or neutral. Large companies and film makers use
it to analyse consumer feedback on a particular movie or product so that they can
make business-related decisions.
2. Text summarization
NLP is used to summarize a text by analysing it and extracting its meaning. The
model then generates a new, shorter text that preserves the meaning of the original
text given to it.
3. Speech recognition
After training on audio datasets, NLP models can recognise spoken language and
identify a person's voice. This capability is used in fields such as
voice-recognition locks, Google Assistant, Apple Siri, etc.
4. Text classification
Assigning predefined categories to a given text, such as spam detection, sentiment
analysis, or topic labelling.
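As an illustration of sentiment analysis framed as text classification, the sketch below trains a small multinomial Naive Bayes classifier in plain Python. It is only a stand-in for the idea of assigning a predefined category to a text, not the model used in this project; the function names and the six toy reviews are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Fit word and label counts from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-posterior for `text`."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set (not data from this project).
reviews = [
    ("great product works perfectly", "positive"),
    ("love it excellent quality", "positive"),
    ("terrible quality broke quickly", "negative"),
    ("awful waste of money", "negative"),
    ("delivered on time", "neutral"),
    ("it is a phone", "neutral"),
]
model = train_naive_bayes(reviews)
print(classify(model, "excellent product"))
print(classify(model, "awful waste"))
```

With so little training data the classifier only reacts to words it has already seen, which is one reason practical systems rely on much larger labelled datasets.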
CHALLENGES IN NLP
Biased Training
NLP models can inherit biases from their training data, leading to skewed results,
especially in sensitive areas like healthcare or HR. If the data is biased, the model’s
predictions will be too, affecting accuracy and fairness.
New Vocabulary
Language is constantly evolving, and NLP systems may struggle with new words or
shifting grammar. This can lead to incorrect guesses or confusion, especially in fast-
changing fields like technology or pop culture.
Tone of Voice
NLP struggles to capture the emotional tone or intent behind words, such as sarcasm
or emphasis. This makes sentiment analysis and understanding user intent more
difficult and less reliable.
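The vocabulary and tone problems above can be made concrete with a minimal word-level polarity scorer. The lexicon and the function `lexicon_polarity` are hypothetical and deliberately tiny; the sketch only shows why counting sentiment words is not enough.

```python
# Hypothetical toy lexicon mapping words to polarity scores.
LEXICON = {"great": 1, "love": 1, "good": 1, "terrible": -1, "broke": -1, "bad": -1}


def lexicon_polarity(text):
    """Sum word-level scores and map the total to a polarity label."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    total = sum(LEXICON.get(word, 0) for word in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# A sarcastic review: a human reader hears a complaint, but the two
# positive words outweigh the single negative one.
print(lexicon_polarity("Great, it broke after one day. Love that."))

# New slang that is missing from the lexicon contributes nothing.
print(lexicon_polarity("this phone is mid"))
```

The first review is scored as positive and the second as neutral, although a reader would take both as criticism. Handling such cases requires context beyond individual words.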