Mini Project Document

Uploaded by

saishivayadav288
Copyright
© © All Rights Reserved

VIGNANA BHARATHI INSTITUTE OF TECHNOLOGY

(A UGC Autonomous Institution, Approved by AICTE, Affiliated to JNTUH, Accredited by NBA & NAAC)
Aushapur (V), Ghatkesar (M), Medchal (Dist)

A MINOR PROJECT REPORT

ON

Exploring the Combination of Sentiment Analysis and Voice Recognition
through Deep Learning Techniques

Submitted in partial fulfillment of the requirement
for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

BY

Arati Sai Raghava Lalit Kumar 21P61A0403
Bollu Jayanth 21P61A0407
Bodduna Nikhil 21P61A0423

Under the esteemed guidance of

Mrs.Kranthi Rekha
Assistant Professor
Dept. of ECE
Aushapur (V), Ghatkesar (M), Hyderabad, Medchal – Dist, Telangana – 501 301.

DEPARTMENT OF
ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE

This is to certify that the major project titled “Exploring the Combination of
Sentiment Analysis and Voice Recognition through Deep Learning
Techniques” submitted by Arati Sai Raghava Lalit Kumar (21P61A0403),
Bollu Jayanth (21P61A0407), and Bodduna Nikhil (21P61A0423) in B. Tech IV-I
semester Electronics and Communication Engineering is a record of the
bona fide work carried out by them.

The results embodied in this report have not been submitted to any other
University for the award of any degree.

INTERNAL GUIDE PROJECT CO-ORDINATOR


Mrs.Kranthi Rekha Dr.G.Narsimhulu
(Assistant Professor) (Assistant Professor)

HEAD OF THE DEPARTMENT EXTERNAL EXAMINER


Dr.U.Poorna Lakshmi
(Associate Professor)

Department of Electronics And Communication Engineering

DECLARATION

We, Arati Sai Raghava Lalit Kumar, Bollu Jayanth, and Bodduna Nikhil,
bearing hall ticket numbers 21P61A0403, 21P61A0407, and 21P61A0423, hereby
declare that the major project report entitled “Exploring the Combination of
Sentiment Analysis and Voice Recognition through Deep Learning
Techniques” has been carried out under the guidance of Mrs. Ch. Kranthi Rekha,
Department of Electronics and Communication Engineering, Vignana Bharathi
Institute of Technology, Hyderabad, and is submitted to Jawaharlal Nehru
Technological University Hyderabad, Kukatpally, in partial fulfillment of the
requirements for the award of the degree of Bachelor of Technology in
Electronics and Communication Engineering.

This is a record of bonafide work carried out by us and the results embodied in
this project have not been reproduced or copied from any source. The results
embodied in this project report have not been submitted to any other university
or institute for the award of any other degree or diploma.

By:
Arati Sai Raghava Lalit Kumar (21P61A0403)
Bollu Jayanth (21P61A0407)
Bodduna Nikhil(21P61A0423)

ACKNOWLEDGEMENT

We are extremely thankful to our beloved Chairman, Dr. N. Goutham Rao,
and Secretary, Dr. G. Manohar Reddy, who took keen interest in providing us with
the infrastructural facilities for carrying out the project work. Self-confidence,
hard work, commitment and planning are essential to carry out any task.
Possessing these qualities is sheer waste if an opportunity does not exist. So,
we wholeheartedly thank Dr. P. V. S. Srinivas, Principal, and Dr. U. Poorna
Lakshmi, Head of the Department, Electronics and Communication
Engineering, for their encouragement, support and guidance in carrying out the
project.

We would like to express our indebtedness to the project coordinator,
Dr.G.Narsimulu, Assistant Professor, Department of ECE, for her valuable
guidance during the course of the project work.

We thank our Project Guide, Mrs. Ch. Kranthi Rekha, Assistant Professor, for
providing us with an excellent project and guiding us in completing our major
project successfully.

We would like to express our sincere thanks to all the staff of Electronics And
Communication Engineering, VBIT, for their kind cooperation and timely help
during the course of our project. Finally, we would like to thank our parents and
friends who have always stood by us whenever we were in need of them.

By:
Arati Sai Raghava Lalit Kumar (21P61A0403)
Bollu Jayanth (21P61A0407)
Bodduna Nikhil (21P61A0423)
ABSTRACT

Sentiment analysis, also known as opinion mining, is crucial for understanding
public opinion and emotions in textual data. Utilizing natural language
processing (NLP) and machine learning, it categorizes sentiments as positive,
negative, or neutral. The growing volume of user-generated content on social
media and review sites has increased the demand for automated sentiment
analysis, which is vital for businesses and researchers in extracting insights and
shaping strategies. Current sentiment analysis systems use supervised learning
algorithms that require large, annotated datasets. These systems combine lexical
resources with models like support vector machines (SVM) and neural
networks. Despite advancements, challenges remain in handling sarcasm,
ambiguous phrases, and evolving language on social media, which complicate
accurate sentiment classification. Our system addresses these limitations by
integrating rule-based methods with advanced deep learning techniques, such as
transformer-based models like BERT. By using contextual embeddings and
transfer learning, our system better understands nuanced language patterns.
Additionally, a dynamic update mechanism for lexicons and training data will
allow our model to adapt to evolving language trends, enhancing the accuracy
and reliability of sentiment analysis.

Keywords: Opinion Mining, BERT, Transformer Models, Supervised
Learning, Lexical Resources, Deep Learning, Contextual Embeddings, Social
Media Analysis, Transfer Learning, Dynamic Update Mechanism, Sentiment
Classification, Textual Data Analysis.

DEPARTMENT
OF
ELECTRONICS AND COMMUNICATION ENGINEERING

VISION

• To produce creative engineers who can address global challenges and excel at an
international level, in advancement of Electronics and Communication Technologies
through research and teaching of international standard.

MISSION

• To impart quality education in Electronics and Communication Engineering through an
effective teaching-learning process and make students globally competitive.
• To carry out research through constant interaction with research and development
organizations.
• To involve the students in creative and group activities useful for career choices and
lifelong learning.
• To enable students to develop skills to solve complex technological problems of current
times and also provide a framework for promoting collaborative and multidisciplinary
activities.

PROGRAM EDUCATIONAL OBJECTIVES (PEOs)

PEO 1: Domain Knowledge: Graduates of ECE-VBIT will be able to synthesize
mathematics, science, engineering fundamentals, laboratory and attain practical experiences
to formulate and solve engineering problems in electronics engineering domains and shall
have proficiency in electronics based engineering and the use of electronic tools.

PEO 2: Professional Employment: Graduates of ECE-VBIT will succeed in entry-level
engineering positions within the core electronic engineering or manufacturing firms in
regional, national, or international industries and with government agencies.

PEO 3: Higher Degrees: Graduates of ECE-VBIT will succeed in the pursuit of advanced
degrees in engineering or other fields where a solid foundation in mathematics, science, and
engineering fundamentals is required.

PEO 4: Engineering Citizenship: Graduates of ECE-VBIT will be prepared to
communicate and work effectively on team-based engineering projects and will practice the
ethics of their profession consistent with a sense of social responsibility.

PEO 5: Lifelong Learning: Graduates of ECE-VBIT will recognize the importance of, and
have the skills for, continued independent learning to become experts in their chosen fields
and to broaden their professional knowledge.

PROGRAM OUTCOMES (POs)
Engineering graduates will be able to:

1. Engineering Knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization for the solution of complex engineering
problems.
2. Problem Analysis: Identify, formulate, research literature, and analyse complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for public health and safety, and cultural, societal, and environmental
considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modelling to complex
engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal, and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
the need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and teamwork: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with the society at large, such as being able to comprehend
and write effective reports and design documentation, make effective presentations.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for and have the preparation and ability to engage
in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO1: Analyze, design and implement specific engineering problems
in the areas of VLSI and Embedded systems.

PSO2: Apply the knowledge of domain specific skill set for analysis of
Signal Processing and Communications.

PSO3: Analyze and solve the complex engineering problems using
state of the art hardware and software tools.

PSO4: Develop proficiency in innovative technologies to sustain with
the dynamic industry challenges.

Course Outcomes (COs)

CO1 - Identify challenging practical problems, solutions of Electronics and

Communication Engineering field.

CO2 - Analyse the various methodologies and technologies and discuss with team for

solving the problem.

CO3 - Choose efficient tools for designing project.

CO4 - Build the project through effective team work by using recent technologies.

CO5 - Elaborate and test the completed task and compile the project report.

Correlation Levels

Substantial/ High 3
Moderate/ Medium 2
CO – PSO Correlation Matrix

PSOs
COs PSO1 PSO2 PSO3 PSO4
CO1 3 2
CO2 3 2
CO3 3 2
CO4 3 2
CO5 3 2

CO – PO Correlation Matrix

POs
COs PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 2 2 2 2 2 2 3 3 3 3
CO2 2 2 2 2 2 2 3 3 3 3
CO3 2 2 2 2 2 2 3 3 3 3
CO4 2 2 3 2 2 2 3 3 3 3
CO5 2 2 2 2 2 2 3 3 3 3

Project Outcomes (PROs)

CO No. | Course Outcome Statement | Taxonomy Level
1 | Identify challenging practical problems, solutions of Electronics and Communication Engineering field. | APPLY
2 | Analyse the various methodologies and technologies and discuss with team for solving the problem. | ANALYZE
3 | Apply technical knowledge and project management skills for solving the problem. | APPLY
4 | Design and Development of technical projects as an individual or in a team | CREATE
5 | Prepare the project reports and give proper explanation during the presentation and demonstration. | CREATE

TABLE OF CONTENTS

CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT

CHAPTER 1 INTRODUCTION
1.1 INTRODUCTION TO PROJECT
1.2 PROBLEM STATEMENT
1.3 AIM AND OBJECTIVES
CHAPTER 2 ARTIFICIAL INTELLIGENCE
2.1 MACHINE LEARNING
2.2 DEEP LEARNING
2.3 DEEP NEURAL NETWORK
2.4 CONVOLUTIONAL NEURAL NETWORK
2.5 RECURRENT NEURAL NETWORK
CHAPTER 3 SENTIMENT ANALYSIS
3.1 SENTIMENT ANALYSIS
3.1.1 STRUCTURED DATA
3.1.2 UNSTRUCTURED DATA
3.1.3 HUMAN GENERATED
3.1.4 MACHINE GENERATED
3.2 WHY SENTIMENT ANALYSIS
3.3 APPLICATIONS OF SENTIMENT ANALYSIS IN REAL WORLD
3.4 TECHNIQUES FOR SENTIMENT EXTRACTION
3.5 MACHINE LEARNING TECHNIQUES
3.6 THE NETWORK TWITTER
3.7 BERT
CHAPTER 4 LITERATURE SURVEY
CHAPTER 5 METHODOLOGY FOR IMPLEMENTATION
5.1 EXISTING METHOD
5.1.1 LIMITATIONS
5.2 THE PROPOSED METHOD
5.2.1 DATA COLLECTION
5.2.2 DATA PREPROCESSING
5.2.3 SENTIMENT ANALYSIS
CHAPTER 6 SYSTEM ARCHITECTURE
CHAPTER 7 TOOLS REQUIRED
7.1 HARDWARE REQUIREMENTS
7.2 SOFTWARE REQUIREMENTS
7.2.1 VISUAL STUDIO CODE SOFTWARE
7.2.2 LIBRARIES
7.3 APPLICATIONS
CHAPTER 8 IMPLEMENTATION DETAILS
8.1 IMPORTS
8.2 INITIALIZING THE RECOGNIZER
8.3 CAPTURING AUDIO INPUT
8.4 RECOGNIZING SPEECH
8.5 PERFORMING SENTIMENT ANALYSIS
8.6 HANDLING EXCEPTIONS
CHAPTER 9 PROGRAM SOURCE CODE WITH ADEQUATE COMMENTS
CHAPTER 10 RESULTS
CHAPTER 11 CONCLUSION & FUTURE SCOPE
CHAPTER 12 REFERENCES

CHAPTER 1
Introduction

1.1 Introduction to project:


The rise of social media and online reviews has generated vast amounts of textual data,
necessitating effective methods for sentiment analysis. Traditional methods relied on rule-
based systems and machine learning techniques that required extensive feature engineering.
However, deep learning has revolutionized this field by automating feature extraction and
improving accuracy.

Sentiment is an attitude, thought, or judgment prompted by feeling. Sentiment analysis,
which is also known as opinion mining, studies people’s sentiments towards certain entities.
From a user’s perspective, people are able to post their own content through various social
media, such as forums, micro-blogs, or online social networking sites. From a researcher’s
perspective, many social media sites release their application programming interfaces (APIs),
prompting data collection and analysis by researchers and developers. However, those types
of online data have several flaws that potentially hinder the process of sentiment analysis.
The first flaw is that since people can freely post their own content, the quality of their
opinions cannot be guaranteed. The second flaw is that ground truth of such online data is not
always available. A ground truth is more like a tag of a certain opinion, indicating whether
the opinion is positive, negative, or neutral.

“It is a quite boring movie but the scenes were good enough. ”

The line above is a movie review stating that “it” (the movie) is quite boring but that the
scenes were good. Sentiment analysis is a kind of text classification based on the sentiment
orientation (SO) of the opinions a text contains. Sentiment analysis of product reviews has
recently become very popular in text mining and computational linguistics research.
Understanding the sentiment of such a review requires multiple tasks:

• Firstly, evaluative terms expressing opinions must be extracted from the review.

• Secondly, the SO, or the polarity, of the opinions must be determined.

• Thirdly, the opinion strength, or the intensity, of an opinion should also be determined.

• Finally, the review is classified with respect to sentiment classes, such as Positive and
Negative, based on the SO of the opinions it contains.
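
The four tasks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the intensifier weights, and the function names `extract_opinions` and `classify` are hypothetical and are not part of the system described in this report.

```python
# Hypothetical toy lexicon: term -> (polarity, strength). Values are illustrative only.
LEXICON = {
    "boring": (-1, 2),
    "good": (+1, 1),
    "excellent": (+1, 3),
    "terrible": (-1, 3),
}
# Hypothetical intensifiers that scale the strength of the next opinion term.
INTENSIFIERS = {"quite": 1.5, "very": 2.0}


def extract_opinions(review):
    """Tasks 1-3: find evaluative terms, their polarity (SO), and their strength."""
    tokens = [t.strip(".,!?\"").lower() for t in review.split()]
    opinions = []
    boost = 1.0
    for token in tokens:
        if token in INTENSIFIERS:
            boost = INTENSIFIERS[token]
            continue
        if token in LEXICON:
            polarity, strength = LEXICON[token]
            opinions.append((token, polarity, strength * boost))
        boost = 1.0  # an intensifier only affects the word right after it
    return opinions


def classify(review):
    """Task 4: sum the signed strengths and map the total to a sentiment class."""
    score = sum(polarity * strength for _, polarity, strength in extract_opinions(review))
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


review = "It is a quite boring movie but the scenes were good enough."
print(extract_opinions(review))
print(classify(review))
```

With these illustrative weights, the intensified term "boring" outweighs "good", so the review is labelled Negative. A different lexicon could reasonably reach a different verdict, which is why mixed reviews like this one are hard to classify.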
1.2 Problem statement:
In the past few years, several studies have proposed deep-learning-based approaches to
sentiment analysis. These approaches have different features and levels of performance. This
work looks at the most recent studies that used deep learning models to solve different
problems related to sentiment analysis. We applied deep learning models with TF-IDF and
word embeddings to Twitter datasets and implemented state-of-the-art sentiment analysis
approaches based on deep learning.

1.3 Aim and Objectives:


Aim: This thesis aims to explore different combinations of deep learning techniques, to
publish a comparative study of them performed on the data available on social media
networks for a furniture store, and to derive insights from it. Most papers that
report comparison studies focus on reliability metrics such as overall accuracy or F-score and
ignore processing time. This thesis addresses that gap. In addition, such studies typically
evaluate the models on only a small number of datasets.

Objectives:

• To apply different word embedding methods with deep learning techniques.
• To find the most popular deep learning techniques.
• To discuss the processing time of the deep learning models.

CHAPTER 2
Artificial Intelligence:
Artificial Intelligence (AI) refers to the simulation of human intelligence processes by
machines, particularly computer systems. It encompasses a range of technologies, including
machine learning, natural language processing, computer vision, and robotics, enabling
machines to perform tasks that typically require human intelligence, such as reasoning,
problem-solving, understanding language, and recognizing patterns.

The development of AI has evolved through various phases, from early rule-based systems
to the current era of deep learning and neural networks. This evolution has been fueled by
advancements in computational power, the availability of vast datasets, and improved
algorithms. AI applications are diverse, spanning industries such as healthcare, finance,
transportation, and entertainment, where they enhance decision-making, automate processes,
and personalize user experiences.

Fig. 2. Artificial Intelligence

Despite its transformative potential, AI presents challenges, including ethical considerations,
bias in algorithms, and concerns regarding privacy and job displacement. The need for
responsible AI development and deployment is paramount to ensure that its benefits are
equitably distributed while mitigating risks.

2.1 Machine Learning:


Machine learning, as the name implies, is the process by which computers learn without
being explicitly programmed. First, the system is given good-quality data; it is then trained
by building machine learning models from that data using different techniques. Machine
learning is primarily divided into three types: [3, 29]

• Supervised Learning: Supervised learning algorithms are trained on labelled data. In
supervised learning, the model is fed both inputs and the corresponding outputs, and the
trained model is then used to predict the outcome for new inputs.

• Unsupervised Learning: Unsupervised learning algorithms are trained on unlabelled
data. In unsupervised learning, the model is fed only input data and uncovers hidden
patterns in it. Ex: Clustering

• Reinforcement Learning: Reinforcement Learning (RL) is a machine learning technique
that enables an agent to learn in an interactive environment by trial and error using feedback
from its actions and experiences. The goal is to find an action model (a policy) that
maximizes the agent’s total cumulative reward.
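
The difference between the first two types can be shown with a minimal sketch, assuming a toy one-dimensional dataset. The data values and the function names (`train_supervised`, `predict`, `cluster_unsupervised`) are hypothetical and chosen only for illustration.

```python
# Toy 1-D data: each number is a hypothetical feature value.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["neg", "neg", "neg", "pos", "pos", "pos"]  # used only by the supervised part


def mean(values):
    return sum(values) / len(values)


def train_supervised(xs, ys):
    """Supervised: learn one centroid per label from (input, output) pairs."""
    return {label: mean([x for x, y in zip(xs, ys) if y == label]) for label in set(ys)}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cluster_unsupervised(xs, iterations=10):
    """Unsupervised: 2-means clustering, which never sees the labels."""
    centers = [min(xs), max(xs)]  # simple deterministic initialisation
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


centroids = train_supervised(points, labels)
print(predict(centroids, 4.5))       # label predicted for an unseen input
print(cluster_unsupervised(points))  # two groups found without any labels
```

The supervised part returns a named class because it was given the labels. The clustering part recovers the same two groups but cannot name them.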

Fig 2.1 Machine Learning

2.2 Deep Learning:


By stacking multiple hidden layers in a neural network, deep learning can model more
complex relationships than shallow models. In conventional machine learning methods,
features are specified and extracted by hand or through feature selection techniques.
Deep learning models, on the other hand, learn and extract features automatically, which
often leads to improved accuracy and performance. The hyperparameters of classifier
models are often tuned automatically as well. A comparison of standard machine learning
(Support Vector Machine (SVM), Bayesian networks, and decision trees) with deep learning
for sentiment polarity categorization is shown in Fig 2.2.1.

The core of deep learning involves training models through vast datasets using optimization
techniques like backpropagation and gradient descent. Key advancements, such as
convolutional neural networks (CNNs) for image analysis and recurrent neural networks
(RNNs) for sequential data, have revolutionized how machines interpret and interact with
information.
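
As a minimal sketch of gradient descent, the snippet below fits a line y = w·x + b to four toy points by repeatedly stepping against the gradient of the mean squared error. For this single-layer model the gradients are written out by hand; backpropagation computes the same kind of derivatives layer by layer using the chain rule. The data and the learning rate are assumptions chosen for this toy problem.

```python
# Fit y = w * x + b to toy data with batch gradient descent on mean squared error.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

w, b = 0.0, 0.0
learning_rate = 0.05  # hypothetical value chosen for this toy problem

for _ in range(5000):
    n = len(xs)
    # Prediction error for every training point under the current parameters.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # Partial derivatives of the mean squared error with respect to w and b.
    grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)
    # Move each parameter a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

On this data the parameters should approach w = 2 and b = 1, the values used to generate the points.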
Fig 2.2.1 Sentiment Analysis Using Machine Learning and Deep Learning

2.3 Deep Neural Network:


A deep neural network is a newer generation of machine learning model, loosely inspired by
the structure and function of the human brain. Its distinctive characteristic is that it learns
the needed features automatically. A deep learning model is a mathematical function
f: X → Y. Deep learning is the development of an artificial neural network (ANN) that employs
more than one hidden layer to model a dataset. As shown in Fig 2.3, it has three primary
kinds of layers:

• Input Layer: neurons receive input from the variable X.

• Hidden Layer: neurons receive signals from the preceding layer. Each hidden layer
learns its own set of features. The more hidden layers there are, the more intricate and
abstract the learned features become.

• Output Layer: This layer is made up of neurons that receive input from the hidden layer and
create the output value.
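
The three kinds of layers can be illustrated with a small forward pass. This is a sketch only: the weights and biases are hand-picked, hypothetical values (a trained network would learn them), and a single hidden layer is used to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: each neuron takes a weighted sum, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical, hand-picked parameters: 2 inputs -> 3 hidden neurons -> 1 output.
hidden_weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.5, 0.2]]
output_biases = [0.05]


def forward(x):
    """Compute f(x): input layer -> hidden layer -> output layer."""
    hidden = dense(x, hidden_weights, hidden_biases)
    return dense(hidden, output_weights, output_biases)


print(forward([1.0, 2.0]))
```

Adding further hidden layers amounts to calling `dense` again on the previous layer's output before the final output layer.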

Fig 2.3 Deep Neural Network

2.4 Convolutional Neural Networks (CNN):

CNNs consist of multiple layers of convolutions with nonlinear activation functions, such as
ReLU or tanh, applied to the results. In a conventional feedforward neural network, each input
neuron is connected to each output neuron in the next layer. This is also known as a fully
connected or affine layer.

In CNNs, the output is computed using convolutions over the input layer. This produces local
connections in which each region of the input is linked to a neuron in the output. Each layer
applies several filters, generally hundreds or thousands, and combines their results.
A CNN automatically learns the values of its filters based on the desired task during the
training phase.

For instance, a CNN for image classification may learn to detect edges from raw pixels in the
first layer, then use the edges to detect simple shapes in the second layer, and finally use these
simple shapes to detect higher-level features, such as facial shapes, in higher layers. The last
layer is a classifier that employs these high-level features. Instead of image pixels, the
input to the majority of NLP tasks is a sentence or document represented as a matrix. Each
row of the matrix represents one token, which is often a word but might also be a character.
In other words, each row is a vector representing a word. These vectors are often word
embeddings (low-dimensional representations) such as word2vec or GloVe, but they may also
be one-hot vectors that index the word into a dictionary.
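
A minimal sketch of a convolution over such a sentence matrix is given below. The two-dimensional embeddings and the filter values are hypothetical and hand-set for illustration; in a real CNN the embeddings would come from word2vec or GloVe and the filter would be learned during training.

```python
# Hypothetical 2-dimensional word embeddings; real systems use word2vec or GloVe vectors.
EMBEDDINGS = {
    "the": [0.0, 0.1],
    "movie": [0.2, 0.0],
    "was": [0.0, 0.0],
    "not": [-0.9, 0.3],
    "good": [0.8, 0.5],
}


def sentence_matrix(tokens):
    """Build the input matrix: one row (embedding vector) per token."""
    return [EMBEDDINGS[t] for t in tokens]


def conv1d(matrix, kernel):
    """Slide a filter spanning len(kernel) consecutive rows over the sentence."""
    width = len(kernel)
    outputs = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        total = sum(
            w * x
            for kernel_row, row in zip(kernel, window)
            for w, x in zip(kernel_row, row)
        )
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


matrix = sentence_matrix(["the", "movie", "was", "not", "good"])
bigram_filter = [[-1.0, 0.0], [1.0, 0.0]]  # hand-set here; a CNN would learn these values
feature_map = conv1d(matrix, bigram_filter)
print(feature_map)       # one activation per two-word window
print(max(feature_map))  # max-over-time pooling keeps the strongest response
```

With these values the filter responds most strongly to the window "not good", which shows how a filter spanning two rows can pick up a two-word pattern.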

Fig 2.4 Convolutional Neural Network

2.5 Recurrent Neural Networks (RNN):


A recurrent neural network (RNN) is an extension of the feedforward neural network with an
internal memory. An RNN is recurrent in nature since it performs the same function for each
data input, while the outcome for the current input depends on the previous calculation.
After the output has been generated, it is copied and fed back into the recurrent network.
For decision-making, the network considers both the current input and what it has learnt
from the previous inputs. The Long Short-Term Memory (LSTM) network is an advanced RNN,
a sequential network that allows information to persist. It is capable of mitigating the
vanishing gradient problem encountered by plain RNNs.
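
The recurrence described above can be sketched with a single-unit RNN. The scalar weights below are hypothetical values chosen for illustration; a real RNN uses weight matrices learned during training, and an LSTM adds gates on top of this basic step.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, bias):
    """One recurrent step: h_t = tanh(w_x * x_t + w_h * h_{t-1} + bias)."""
    return math.tanh(w_x * x + w_h * h_prev + bias)


def run_rnn(sequence, w_x=0.5, w_h=0.8, bias=0.0):
    """Apply the same step to every input, feeding each state into the next step."""
    h = 0.0  # initial hidden state (the network's memory)
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, bias)
        states.append(h)
    return states


# The same input value 1.0 yields different states because the memory differs.
print(run_rnn([1.0, 0.0, 1.0]))
```

The first and third inputs are identical, yet the hidden states differ, because the third step also sees what the network remembered from the earlier inputs.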

Fig 2.5. Recurrent Neural Network


CHAPTER 3
3.1 Sentiment Analysis:
Sentiment analysis is the technique of obtaining information about an entity and
automatically determining its subjectivity. The purpose is to identify whether user-generated
material expresses favorable, negative, or neutral sentiments. Classification of sentiment may
be accomplished on three levels of extraction: aspect or feature level, phrase level, and
document level. There are currently three solutions to the issue of sentiment analysis: (1)
lexicon-based strategies, (2) machine-learning-based techniques, and (3) hybrid approaches.

Initially, approaches based on a lexicon were utilized for sentiment analysis. They are
separated into dictionary-based and corpus-based techniques. In the first kind, sentiment
categorization is accomplished by the use of a dictionary of terms, such as
SentiWordNet and WordNet. Corpus-based sentiment analysis, however, does not rely on a
predefined dictionary, but rather on a statistical analysis of the contents of a collection of
documents, using techniques such as k-nearest neighbors (k-NN), conditional random fields
(CRF), and hidden Markov models (HMM), among others.

Fig 3.1 Categorisation of Sentiment Analysis Techniques

Machine learning techniques offered for sentiment analysis fall into two categories:
(1) traditional models and (2) deep learning models. Traditional models refer to classical
machine learning algorithms, such as the naïve Bayes classifier, the maximum entropy
classifier, and support vector machines (SVM).
Deep learning models can deliver results superior to those of traditional methods. CNN, DNN,
and RNN are among the deep learning models that may be utilized for sentiment analysis.
These methods handle categorization issues at the document, phrase, and aspect levels.

The next section will cover these approaches to deep learning. The hybrid techniques
combine methodologies based on lexicons and machine learning. Sentiment lexicons
commonly play a crucial part in most of these approaches.

3.1.1 Structured Data:

The term comes from Structured Query Language (SQL), a common language used to access
databases. SQL provides ways to manage data in a database. In general, structured means
arranged in an orderly form.
Eg: Excel.

Fig 3.1.1 Structured Data

Structured data is data, commonly text files, displayed in titled columns and
rows that can easily be ordered and processed by data mining tools. It can be
visualized as a perfectly organized file in which everything is identified, labeled, and
easy to access. Most organizations are likely to be familiar with this form of data and
to be using it effectively already.

3.1.2 Unstructured Data:


The World Wide Web has been dominated by unstructured content, and searching the web
has primarily been based on techniques from Information Retrieval. Unstructured data is
commonly estimated to represent about 80% of all data, and it includes text and multimedia
content. Eg: E-mail messages, videos, photos, audio files. Such items may have internal
structure, but they do not fit neatly into a database.

Fig 3.1.2 Unstructured Data

Several real-world applications now need to create bridges for smooth integration of
semi-structured sources with existing structured databases for seamless querying.

3.1.3 Human generated:

• Mobile data- It gives text and the location.

• Social media data- YouTube, FB, Flickr, Twitter, LinkedIn.

• Text internal to the company- Documents, logs, survey results, e-mails.

3.1.4 Machine generated:

• Satellite images- Information about weather data. Eg: Google Earth.

• Radar or sonar- Vehicular and oceanic information.

• Scientific data- Atmospheric data and seismic images.

3.2 Why Sentiment Analysis?
Sentiment analysis (also known as opinion mining) refers to the use of natural language
processing, text analysis and computational linguistics to identify and extract subjective
information in source materials. Sentiment analysis is widely applied to reviews and social
media for a variety of applications, ranging from marketing to customer service.

Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a
writer with respect to some topic or the overall contextual polarity of a document. The
attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that
is to say, the emotional state of the author when writing), or the intended emotional
communication (that is to say, the emotional effect the author wishes to have on the
reader). Every day an enormous amount of data is created from social networks, blogs and
other media and diffused into the World Wide Web.

Fig 3.2 Sentiment Analysis –for reviews

This huge volume of data contains crucial opinion-related information that can be used to
benefit businesses and other aspects of commercial and scientific industries. Manual tracking
and extraction of this useful information is not feasible; thus, sentiment analysis is required.
Sentiment analysis is the process of extracting sentiments or opinions from reviews
expressed by users online about a particular subject, area or product. It is an application of
natural language processing, computational linguistics, and text analytics to identify
subjective information from source data. It groups the sentiments into categories such as
positive or negative. Thus, it determines the general attitude of the speaker or writer with
respect to the topic in context.

3.3 Applications of Sentiment Analysis in Real World:
a) Product and service reviews:
Reviews of consumer products and services. Many websites provide such feedback
automatically.
Eg: Google product search.

b) Reputation Monitoring:
Monitoring the reputation of a specific brand.
Eg: Twitter, Facebook.

c) Result Prediction:
By analyzing sentiments from different sources, one can predict the outcome of an event.
Eg: Election. It enables campaign managers to track how voters feel about different issues and
how they relate to the speeches and actions of the candidate.

d) Decision making:
Sentiment analysis uses various sources to find articles that discuss a topic and
aggregates their sentiment into a score.
Eg: The Stock Sonar graphically shows positive and negative sentiment of each stock.

Fig 3.3 Application of Sentiment Analysis

3.4 Techniques for Sentiment Extraction
There are two main techniques for sentiment classification: symbolic techniques and machine
learning techniques. The symbolic approach uses manually crafted rules and lexicons, whereas
the machine learning approach uses unsupervised, weakly supervised or fully supervised
learning to construct a model from a large training corpus. We propose a system which uses
machine learning techniques instead of symbolic techniques to determine the polarity of
sentences found on the World Wide Web.

Fig 3.4 Classification of Sentiment Analysis

3.5 Machine Learning Techniques:
a) Supervised Methods:

Supervised learning is the machine learning task of inferring a function from labeled training
data. The training data consist of a set of training examples. In supervised learning, each
example is a pair consisting of an input object (typically a vector) and a desired output value
(also called the supervisory signal). A supervised learning algorithm analyzes the training
data and produces a function, which can be used for mapping new examples. An ideal
scenario allows the algorithm to correctly determine the class labels
for unseen instances. This requires the learning algorithm to generalize from the
training data to unseen situations in a "reasonable" way.

To train a classifier for sentiment recognition in text, classic supervised learning
techniques (e.g. Support Vector Machines, multinomial naïve Bayes, Hidden Markov
Models) can be used. A supervised approach entails the use of a labeled training corpus to
learn a classification function. The method that most often yields the highest
accuracy in the literature is the Support Vector Machine classifier, and it is the one we used in our
experiments described below.
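
As a minimal sketch of the supervised approach described above, the following pure-Python multinomial naïve Bayes classifier learns a classification function from a tiny labeled corpus. The training sentences, the whitespace tokenizer and the class names are illustrative assumptions; this is not the data or the classifier used in this project.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log P(class) + sum over words of log P(word | class)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled training corpus (not taken from the project datasets)
train_texts = [
    "great phone and great battery",
    "i love this product",
    "terrible service and bad food",
    "i hate the slow delivery",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("great product")
```

The same fit/predict pattern applies to a Support Vector Machine; only the function being learned changes.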

b) Unsupervised Methods:
Unsupervised learning is the machine learning task of inferring a function that describes hidden
structure from unlabeled data. Since the examples given to the learner are unlabeled, there
is no error or reward signal with which to evaluate a potential solution. Unsupervised learning is
closely related to the problem of density estimation in statistics. However,
unsupervised learning also encompasses many other techniques that seek to
summarize and explain key features of the data.

c) Reinforcement Learning:
Reinforcement learning differs from standard supervised learning in that correct
input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.
Further, there is a focus on online performance, which involves finding a balance between
exploration (of uncharted territory) and exploitation (of current knowledge).
3.6 The Network: Twitter
Twitter is an online social networking and microblogging service that enables its
users to send and read text-based messages called "tweets". Tweets are publicly visible by
default, but senders can restrict message delivery to a limited audience. Twitter is one of the
largest microblogging services, with over 500 million registered users as of 2012. Statistics
published by Infographics Labs suggest that in 2012 about 175 million tweets were sent
daily.

A large number of people use Twitter to express sentiments, which makes it an
interesting and challenging choice for sentiment analysis. With so much attention being
paid to Twitter, it is worthwhile to monitor these sentiments and develop methods to analyze them. Twitter
was selected for the following reasons.

Fig 3.6 Twitter

• Twitter is an open-access social network.
• Twitter is an ocean of sentiments (each tweet is limited to 140 characters, i.e. high sentiment density).
• Twitter provides a user-friendly API, making it easier to mine sentiments in real time.

3.7 BERT:
BERT is an open-source machine learning framework for natural language processing (NLP).
Its contextual word embeddings are intended to assist computers in understanding the meaning of
ambiguous words in text by using the surrounding text to build context. [13] BERT,
which stands for Bidirectional Encoder Representations from Transformers, is based on
the Transformer, a deep learning model in which every output element is linked to every input
element, and the weightings between them are computed dynamically depending on their
relationship. (In NLP this mechanism is referred to as attention.) A basic Transformer consists
of an encoder that reads the text input and a decoder that generates a prediction for the task.
Since the objective of BERT is to construct a language representation model, it requires only
the encoder. The encoder input for BERT is a sequence of tokens, which are converted to
vectors and then processed by the neural network. [14]

Some alternative options are available through Hugging Face: DistilBERT, GPT 23 and XLNet. They are efficient, but BERT
outperforms them all, achieving state-of-the-art results in 7 of 11 NLP tasks.
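
The attention idea described above, in which every output is linked to every input through dynamically computed weights, can be illustrated with a small sketch of scaled dot-product self-attention. This is a simplified illustration under stated assumptions: the two-dimensional token vectors are made up, and the learned projection matrices and multiple heads of a real Transformer are omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted sum of all value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for a three-token input
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)  # self-attention
```

Each row of `weights` shows how strongly one token attends to every token in the input, which is what lets the surrounding words shape the representation of an ambiguous word.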

Fig 3.7 BERT Architecture


CHAPTER 4
Literature Survey:

Authors               | Title                                                | Proposed Model                                  | Performance Metrics
Xu, X. Wang et al.    | Target embedding and position attention with LSTM    | LSTM (Long Short-Term Memory network)           | Accuracy: 80.45%
L. Xu, L. Bing et al. | Aspect sentiment classification with aspect-specific | EMNLP (Empirical Methods in NLP)                | Accuracy: 86.78%
Putta Durga et al.    | An Effective Deep Sentiment Analysis Using a D-RNN   | D-RNN (Decision-Based Recurrent Neural Network) | Accuracy: 90.2%
H. Zhang et al.       | A survey on quadruple extraction                     | CNN (Convolutional Neural Network)              | Accuracy: 98.5%

Table 4.1 Previous proposed approaches in Sentiment Analysis


CHAPTER 5
Methodology for Implementation:
5.1 Existing Method:
The existing method is the one described in "An Effective Deep Sentiment Analysis Using a
Decision-Based Recurrent Neural Network (D-RNN)". This section explains the methodology,
which focuses on extracting accurate sentiments from three datasets: Twitter, Restaurant
and Laptop.
The existing method follows a step-by-step process.
The first step uses the pre-trained model bert-large-cased (BLC), consisting of 24
layers, a hidden size of 1024, 16 attention heads and 340M parameters.
The second step is data preprocessing, such as removing noise, special
characters and URLs.

Fig 5.1 Separation of Dataset

5.1.1 Limitations:
 Removing Special Characters From The Given Reviews
 Tokenization
 Aspect Extraction
 Sentiment Extraction

Fig 5.1.1 Flowchart of CNN


5.2 The Proposed Method:


Sentiment analysis using deep learning methodologies involves leveraging neural networks
and advanced models to classify text based on sentiment.
In the age of digital communication, social media platforms have become a crucial source of
public opinion and sentiment. Instagram, in particular, stands out as a prominent platform
where users express their thoughts, opinions and emotions on various topics in real time.

Fig 5.2 Dataset

5.2.1 Data Collection: Gather text data from sources like social media, reviews, or custom
datasets. Label the data with sentiment categories (positive, negative or neutral).

Data collection is a pivotal phase in sentiment analysis, which aims to extract and interpret
subjective information from textual data to determine sentiment polarity—positive,
negative, or neutral. The quality and relevance of the collected data directly influence the
effectiveness of sentiment analysis models, making this step critical for achieving accurate
and meaningful results.

Fig 5.2.1 Data Collection

5.2.2 Data Preprocessing: Clean the text (remove URLs, punctuation, emojis, etc.)

Data preprocessing is a fundamental step in sentiment analysis, a field focused on extracting
subjective information from text to determine sentiment polarity, whether it is positive,
negative, or neutral. This process is critical for enhancing the quality and relevance of input
data, which directly influences the performance of machine learning and deep learning
models.
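
The cleaning step named in the heading (removing URLs, punctuation and emojis) might look like the following sketch. The regular expressions and the sample post are assumptions for illustration; in particular, stripping every non-ASCII character is a crude way to drop emojis and would also remove accented letters.

```python
import re
import string

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Assumption: anything outside the ASCII range is treated as an emoji or symbol
NON_ASCII_PATTERN = re.compile(r"[^\x00-\x7F]+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text):
    """Remove URLs, emojis and punctuation, then lowercase and collapse spaces."""
    text = URL_PATTERN.sub(" ", text)
    text = NON_ASCII_PATTERN.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())


# Hypothetical social media post
cleaned = clean_text(
    "Loving the new phone!!! 😍 Details: https://example.com/review #happy"
)
print(cleaned)  # loving the new phone details happy
```

Note that VADER itself uses punctuation and capitalization as intensity cues, so this kind of aggressive cleaning is better suited to the machine learning models than to lexicon-based scoring.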

Fig 5.2.2 Data Preprocessing


5.2.3 Sentiment Analysis: Use sentiment analysis tools such as TextBlob, VADER,
or deep learning and other advanced machine learning models to determine the
sentiment of each Instagram post. These tools classify posts into positive, negative or neutral
categories based on the polarity of the text.

The process typically involves several stages, including data collection, preprocessing,
feature extraction, and model training. Diverse data sources such as product reviews, social
media posts, and customer feedback serve as the foundation for analysis. Preprocessing
techniques, such as tokenization, stop word removal, and normalization, are employed to
prepare the data for effective analysis. Feature extraction methods, including Bag of Words,
TF-IDF, and word embeddings, convert textual data into numerical representations suitable
for machine learning algorithms.
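
To make the feature extraction step concrete, here is a small sketch of Bag of Words and TF-IDF in pure Python. The three-document corpus is hypothetical, and the unsmoothed textbook formula tf × log(N / df) is used; library implementations typically apply smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight raw counts by inverse document frequency: tf * log(N / df)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Number of documents in which each term appears at least once
    doc_freq = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / doc_freq[j]) for j in range(n_terms)]
        for vec in vectors
    ]


# Hypothetical mini-corpus of already preprocessed reviews
docs = ["good phone good battery", "bad battery", "good service"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
```

In this corpus "phone" appears in only one document, so it receives a higher weight per occurrence than "good" or "battery", which appear in two.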

Fig 5.2.3 Sentiment Analysis

Various models, ranging from traditional machine learning techniques like Support Vector
Machines (SVM) to advanced deep learning architectures such as Recurrent Neural
Networks (RNNs) and Transformers, are utilized to classify sentiment. The choice of model
depends on the complexity of the data and the specific requirements of the task.

Despite its advancements, sentiment analysis faces challenges, including ambiguity in
language, context dependence, and the presence of sarcasm. Addressing these challenges is
crucial for improving the accuracy and reliability of sentiment analysis systems.

CHAPTER 6
System architecture:

Fig 6.1 System Architecture

CHAPTER 7
Tools Required:
7.1 Hardware Requirements:
A computer with a capable CPU; a GPU is optional but recommended for faster training.
For real-time detection, a microphone for voice input.

7.2 Software Requirements:

Python Programming Language.


7.2.1 Visual Studio Code Software: An integrated development environment (IDE)
for writing and debugging code.

7.2.2 Libraries:
vaderSentiment: Specifically, the SentimentIntensityAnalyzer class from the vaderSentiment
library, which is used for sentiment analysis.
Installation: pip install vaderSentiment
SpeechRecognition: The SpeechRecognition library is used to recognize speech from audio
input, enabling the program to convert spoken words into text. Microphone input through
sr.Microphone additionally requires the PyAudio package.
Installation: pip install SpeechRecognition

7.3 Applications:

Social Media Monitoring: Brands use sentiment analysis to gauge public opinion
on social media platforms like Twitter, Facebook, and Instagram.
Customer Feedback Analysis: Businesses analyze reviews and feedback from
customers on platforms like Amazon, Yelp, and TripAdvisor.
Market Research: Companies leverage sentiment analysis to track consumer trends and
preferences in various industries.
Political Sentiment Analysis: Analyzing public opinion on political candidates, policies, or
events using data from news articles and social media.

CHAPTER 8
Implementation Details:

8.1 Imports:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import speech_recognition as sr

• VADER (Valence Aware Dictionary and sEntiment Reasoner): A sentiment analysis tool designed
specifically for social media text. It assigns each text input positive, negative, neutral and
compound sentiment scores.
• SpeechRecognition: This library recognizes speech from audio sources and converts it into
text.

8.2 Initializing the Recognizer:


recognizer = sr.Recognizer()

 This line creates an instance of the Recognizer class, which is essential for recognizing speech
from an audio input.

8.3 Capturing Audio Input:


with sr.Microphone() as source:
    print('Clearing background noise...')
    recognizer.adjust_for_ambient_noise(source, duration=1)
    print('Waiting for your message...')
    recorded_audio = recognizer.listen(source)
    print('Done recording..')

• Microphone as Source: This context manager opens the microphone as the audio source.
• Clearing Background Noise: adjust_for_ambient_noise(source, duration=1) helps to filter out
ambient noise by adjusting the recognizer’s sensitivity based on the environment. The duration
parameter sets how long (in seconds) it listens to sample the noise.
• Listening for Input: recognizer.listen(source) captures audio from the microphone until
silence is detected, saving the audio to recorded_audio.
8.4 Recognizing Speech:
try:
    print('Printing the message..')
    text = recognizer.recognize_google(recorded_audio, language='en-US')
    print('Your message: {}'.format(text))

• Speech Recognition: recognize_google(recorded_audio, language='en-US') uses Google’s Web
Speech API to convert the captured audio into text. It returns the recognized text as a string.
• Error Handling: The recognition step is placed in a try block so that potential errors, such as
unintelligible audio or network problems, can be handled (see 8.6).

8.5 Performing Sentiment Analysis:


    # Perform sentiment analysis (still inside the try block from 8.4)
    analyser = SentimentIntensityAnalyzer()
    sentiment_scores = analyser.polarity_scores(text)
    print('Sentiment scores:', sentiment_scores)

• Initializing Sentiment Analyzer: An instance of SentimentIntensityAnalyzer is created to
analyze the sentiment of the recognized text.
• Analyzing Sentiment: analyser.polarity_scores(text) returns a dictionary of sentiment scores,
which include:
o Positive (pos): Proportion of text that expresses positive sentiment.
o Negative (neg): Proportion that expresses negative sentiment.
o Neutral (neu): Proportion that expresses neutral sentiment.
o Compound (compound): A normalized score that summarizes the overall sentiment, ranging from
-1 (very negative) to +1 (very positive).
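
The compound score can be turned into a single positive, negative or neutral label with a simple threshold. The sketch below uses the ±0.05 cutoff suggested in the VADER documentation; the function name `label_from_compound` and the sample scores are hypothetical and are not part of the project code.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical result in the shape returned by polarity_scores()
sentiment_scores = {"neg": 0.0, "neu": 0.42, "pos": 0.58, "compound": 0.69}
label = label_from_compound(sentiment_scores["compound"])
print("Overall sentiment:", label)
```

The threshold is exposed as a parameter because a wider neutral band may suit short spoken phrases better; that choice would need to be validated on real recordings.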

8.6 Handling Exceptions:


except Exception as ex:
    print('Error:', ex)

• This block catches any exception raised during speech recognition or sentiment
analysis and prints an error message, which helps in debugging and understanding
what went wrong.

CHAPTER 9
Program Source code with adequate comments:

CHAPTER 10
Results:

CHAPTER 11
Conclusion:
Sentiment analysis deals with the classification of texts based on the sentiments they
contain. This report focuses on a typical sentiment analysis model consisting of three core
steps, namely data preparation, review analysis and sentiment classification, and describes
representative techniques involved in those steps.

Sentiment analysis is an emerging research area in text mining and computational
linguistics, and has attracted considerable research attention in the past few years. Future
research should explore sophisticated methods for opinion and product feature extraction, as
well as new classification models that can address the ordered-labels property in rating
inference. Applications that utilize results from sentiment analysis are also expected to
emerge in the near future.

Future Scope:

Speech emotion recognition is a very interesting subject, and there is more to discover in
the field. Future work on our model will include improving its accuracy to obtain better
results. We can also train the model to give results for speech of longer duration, since the
current model can recognize emotion only for a short length of time. In future we will be able to
load a longer sample dataset, and the model will classify the different emotions in
different time frames. Future work can also include recording live
data through a microphone so that there is no need to load a dataset;
we would just train the model, and input could then be recorded to give the emotions of
that individual's voice.

CHAPTER 12
References
[1] Putta Durga and Deepthi Godavarthi, “An Effective Deep Sentiment Analysis Using a
Decision-Based Recurrent Neural Network,” Deep learning, Oct. 2023, Andhra Pradesh-52223,
India.
[2] H. Zhang, Y. N. Cheah, O. M. Alyasiri, and J. An, “A survey on aspect-based sentiment
quadruple extraction with implicit aspects and opinions,” Tech. Rep., Jun. 2023, doi:
10.21203/rs.3.rs-3098487/v1.
[3] Xu, X. Wang, B. Yang, and Z. Kang, “Target embedding and position attention with
LSTM for aspect based sentiment analysis,” in Proc. 5th Int. Conf. Math. Artif. Intell.,
Apr. 2020, pp. 93–97.
[4] L. Xu, L. Bing, W. Lu, and F. Huang, “Aspect sentiment classification with aspect-specific
opinion spans,” in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP),
2020, pp. 3561–3567.
[5] R. Liu, Y. Shi, C. Ji, and M. Jia, “A survey of sentiment analysis based on transfer
learning,” IEEE Access, vol. 7, pp. 85401–85412, 2019.
[6] A. Krishna, V. Akhilesh, A. Aich, and C. Hegde, “Sentiment analysis of restaurant reviews
using machine learning techniques,” in Emerging Research in Electronics, Computer
Science and Technology (Lecture Notes in Electrical Engineering), vol. 545, 2019, pp.
687–696.
[7] K. Ayyub, S. Iqbal, E. U. Munir, M. W. Nisar, and M. Abbasi, “Exploring diverse features
for sentiment quantification using machine learning algorithms,” IEEE Access, vol. 8, pp.
142819–142831, 2020.
[8] A. Viloria, N. Varela, J. Vargas, and O. B. P. Lezama, “Comparative analysis between
different automatic learning environments for sentiment analysis,” in Proc. Int. Symp.
Distrib. Comput. Artif. Intell. Cham, Switzerland: Springer, 2021, pp. 134–141.
[9] F. Iqbal, J. M. Hashmi, B. C. M. Fung, R. Batool, A. M. Khattak, S. Aleem, and P. C. K.
Hung, “A hybrid framework for sentiment analysis using genetic algorithm based feature
reduction,” IEEE Access, vol. 7, pp. 14637–14652, 2019.
[10] J. Odili, M. N. M. Kahar, A. Noraziah, and S. F. Kamarulzaman, “A comparative
evaluation of swarm intelligence techniques for solving combinatorial optimization
problems,” Int. J. Adv. Robot. Syst., vol. 14, no. 3, May 2017, Art. no. 172988141770596,
doi: 10.1177/1729881417705969.
[11] P. Stodola, K. Michenka, J. Nohel, and M. Rybanský, “Hybrid algorithm based on ant
colony optimization and simulated annealing applied to the dynamic traveling salesman
problem,” Entropy, vol. 22, no. 8, p. 884, Aug. 2020, doi: 10.3390/e22080884.
[12] S. A. Ajagbe and M. O. Adigun, “Deep learning techniques for detection
and prediction of pandemic diseases: A systematic literature review,”
Multimedia Tools Appl., vol. 2023, pp. 1–35, May 2023, doi: 10.1007/s11042-023-15805-z.
[13] S. A. Ajagbe, K. A. Amuda, M. A. Oladipupo, O. F. Afe, and K. I. Okesola, “Multi-
classification of Alzheimer disease on magnetic resonance images (MRI) using deep
convolutional neural network (DCNN) approaches,” Int. J. Adv. Comput. Res., vol. 11,
no. 53, pp. 51–60, Mar. 2021, doi: 10.19101/IJACR.2021.1152001.

[14] T. Gu, G. Xu, and J. Luo, “Sentiment analysis via deep multichannel neural networks
with variational information bottleneck,” IEEE Access, vol. 8, pp. 121014–121021, 2020.
[15] M. K. Hayat, A. Daud, A. A. Alshdadi, A. Banjar, R. A. Abbasi, Y. Bao, and H. Dawood,
“Towards deep learning prospects: Insights for social media analytics,” IEEE Access,
vol. 7, pp. 36958–36979, 2019, doi: 10.1109/ACCESS.2019.2905101.
[16] H. Liu, I. Chatterjee, M. Zhou, X. S. Lu, and A. Abusorrah, “Aspect based sentiment
analysis: A survey of deep learning methods,” IEEE Trans. Computat. Social Syst., vol. 7,
no. 6, pp. 1358–1375, Dec. 2020, doi: 10.1109/TCSS.2020.3033302.
[17] P. Gupta, S. Kumar, R. R. Suman, and V. Kumar, “Sentiment analysis of lockdown in
India during COVID-19: A case study on Twitter,” IEEE Trans. Computat. Social Syst.,
vol. 8, no. 4, pp. 992–1002, Aug. 2021, doi: 10.1109/TCSS.2020.3042446.
[18] A. Elouardighi, M. Maghfour, H. Hammia, and F.-Z. Aazi, “A machine learning approach
for sentiment analysis in the standard or dialectal Arabic Facebook comments,” in Proc.
3rd Int. Conf. Cloud Comput. Technol. Appl. (CloudTech), Oct. 2017, pp. 1–8,
doi: 10.1109/CloudTech.2017.8284706.
[19] P. Vyas, M. Reisslein, B. P. Rimal, G. Vyas, G. P. Basyal, and P. Muzumdar, “Automated
classification of societal sentiments on Twitter with machine learning,” IEEE Trans.
Technol. Soc., vol. 3, no. 2, pp. 100–110, Jun. 2022, doi: 10.1109/TTS.2021.3108963.
[20] C. Du and L. Huang, “Text classification research with attention-based recurrent neural
networks,” Int. J. Comput. Commun. Control, vol. 13, no. 1, pp. 50–61, 2018.