
Ppt- Sentiment Analysis Using Machine Learning Algorithms

The document discusses a research project focused on sentiment analysis using machine learning algorithms to classify tweets as positive or negative. It outlines the methodology, including data extraction, preprocessing, and classification using the NLTK dataset, and presents experimental results demonstrating the model's accuracy. Future work aims to enhance the model's capabilities, including predicting sarcasm and applying the technique to Arabic tweets.

Uploaded by

Nivashini G
Copyright
© All Rights Reserved

SENTIMENT ANALYSIS USING MACHINE LEARNING ALGORITHMS

Ms.G.NIVASHINI - RESEARCH SCHOLAR
Ms. M. SHIMA - RESEARCH SCHOLAR
Dr.R.HEMALATHA - RESEARCH SUPERVISOR
PG & RESEARCH DEPARTMENT OF COMPUTER SCIENCE,
TIRUPPUR KUMARAN COLLEGE FOR WOMEN
TIRUPPUR, TAMILNADU
CONTENTS
• ABSTRACT
• INTRODUCTION
• RELATED WORK
• PROPOSED SYSTEM
• EXPERIMENTATION AND RESULTS
• CONCLUSION AND FUTURE WORK
• REFERENCES
ABSTRACT
The goal of this work is to use Machine
Learning (ML) methods to create a classifier
that can predict a comment's polarity.
Our work is divided into three tasks: data
extraction, processing, and modelling. Our
model is built on the sample Twitter dataset
that ships with NLTK.
Using a supervised probabilistic machine
learning algorithm, we built a classifier that
labels tweets as positive or negative. We then
ran two experiments to evaluate the
performance of our model.
INTRODUCTION

• The transition from web 1.0 to web 2.0 has made it
simpler for individuals to create and exchange ideas,
opinions, and methods online.
• As a result, the amount of subjective information, or
opinion data, on the Internet is growing rapidly.
• Sentiment analysis (SA), which focuses on opinion
mining (identification and classification) from textual
data, is one of the concepts that have emerged from
the growing need to gather, analyze, and use this
data.
• More people are using the internet and different
social media platforms to voice their thoughts and
ideas.
• As a result, there are now more user-generated
sentences, carrying more sentiment information
than humans can read and comprehend unaided.
• Automatic analysis of opinions expressed on
various web platforms is becoming more and more
important for making effective decisions.
RELATED WORK

Sentiment analysis has garnered significant
scholarly interest in recent years as a result of
the widespread availability of online reviews.
As a result, a great deal of research has been
done in this field. Some authors have covered
data preprocessing to eliminate noise in the
data.
Their findings showed that high-frequency
sentiment words have some bearing on the
prediction's outcome.
Prediction accuracy declines after the
high-frequency words are removed, particularly
the high-frequency terms that are distinct to
each class.
Other research classified Malayalam tweets as
either positive or negative using various
machine learning techniques, including Naive
Bayes (NB), Support Vector Machine (SVM), and
Random Forest (RF).
Most research in this field focuses on comparing
the classification performance and accuracy of
ML algorithms.
PROPOSED SYSTEM

• The proposed system relies on machine learning
algorithms. To reach the evaluation phase, we must
first complete the tweet collection phase, followed
by the preprocessing, data preparation, and
classification stages.
• Python provides high-level tools and an easy-to-use
syntax, and Anaconda is a convenient way to install
machine learning packages, so we used both as our
development environment.
• A. Data Collection Phase
The data used in this work is a dataset of
sample English tweets from the NLTK package.
NLTK's Twitter corpus currently contains a
sample of 20,000 tweets without sentiment
labels, in addition to its labeled positive and
negative tweets.
• B. Preprocessing Phase
Language in its raw form cannot be processed
accurately by a machine, so we clean up our tweets to
make them easier for a supervised machine learning
algorithm to understand and use.
• Tokenize the data
• Delete stop words
• Remove URLs
• Remove @ mentions
• Change to lowercase
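The cleaning steps listed above can be sketched in a few lines of Python. This is only an illustration using the standard library: the stop-word set, the regular expressions, and the function name `preprocess` are assumptions made for the example, not the code used in the project, and URLs and mentions are stripped before tokenizing so that they are not split into fragments.

```python
import re

# Hypothetical, deliberately tiny stop-word set for illustration only;
# the slides do not show the list that was actually used.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "it", "this"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
MENTION_PATTERN = re.compile(r"@\w+")
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def preprocess(tweet):
    """Return the cleaned, lowercased tokens of one tweet."""
    text = tweet.lower()                    # change to lowercase
    text = URL_PATTERN.sub(" ", text)       # remove URLs
    text = MENTION_PATTERN.sub(" ", text)   # remove @ mentions
    tokens = TOKEN_PATTERN.findall(text)    # simplified tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # delete stop words


cleaned = preprocess("@Anna This is a GREAT phone, see https://example.com/review")
print(cleaned)
```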
• C. Data Preparation Phase
In the data preparation step, we convert
the tokens to Python dictionary format, with
words as keys and True as values, and shuffle
the examples at random to get the data ready
for sentiment analysis.
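As a minimal sketch of this step, each token list can be turned into a dictionary and paired with its label. The helper name `tokens_to_features`, the label strings, and the sample token lists are hypothetical and stand in for the real cleaned tweets.

```python
def tokens_to_features(tokens):
    """Map each token to True, the dictionary format described above."""
    return {token: True for token in tokens}


# Hypothetical, already-cleaned token lists standing in for real tweets.
positive_tokens = [["great", "phone"], ["love", "it"]]
negative_tokens = [["bad", "battery"]]

# Each example is a (features, label) pair, ready for a classifier.
dataset = (
    [(tokens_to_features(t), "Positive") for t in positive_tokens]
    + [(tokens_to_features(t), "Negative") for t in negative_tokens]
)
print(dataset[0])
```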
• D. Classification Phase
– Once the data has been separated into training
and test sets, the machine learning algorithm
learns from the training data.
– The following algorithm was applied: Naïve
Bayes (NB) classification, a supervised,
probabilistic classifier.
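To make the idea of a supervised, probabilistic classifier concrete, here is a small from-scratch Naive Bayes sketch. It is not the implementation used in the project (the slides do not show it); the function names, the add-one smoothing, and the four training examples are assumptions chosen for illustration. The classifier picks the label with the highest value of log P(label) plus the sum of log P(word | label) over the words in the tweet.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_features):
    """Count labels and per-label word occurrences from (features, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for features, label in labeled_features:
        label_counts[label] += 1
        for word in features:
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, features):
    """Return the label with the highest log-posterior score."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in features:
            if word in vocabulary:  # words never seen in training are skipped
                # add-one (Laplace) smoothing avoids log(0)
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training examples in the dictionary format described earlier.
train_set = [
    ({"great": True, "phone": True}, "Positive"),
    ({"love": True, "it": True}, "Positive"),
    ({"bad": True, "battery": True}, "Negative"),
    ({"hate": True, "it": True}, "Negative"),
]
model = train_naive_bayes(train_set)
prediction = classify(model, {"great": True, "battery": True, "love": True})
print(prediction)
```

With real data, the same (features, label) pairs could instead be passed to a library classifier; the hand-written version is shown only because it exposes the counting and scoring steps.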
EXPERIMENTATION AND RESULTS

• Every experiment was conducted on an Intel(R)
Core(TM) i3-6006U processor running at 2.00 GHz
with 4.00 GB of RAM.
• Using supervised learning, we were able to classify
tweets with a sufficient degree of accuracy.
• The primary goal of this project is to teach the
computer to read and comprehend human-typed
English sentences and to categorize them as
expressing either positive or negative sentiment.
We use the NLTK package in Python for all of
the NLP tasks in this work.
Once the samples have been downloaded, we
are ready to begin processing the data. The
first step in understanding the data is
tokenization: splitting strings into smaller
parts called tokens.
The basic way to divide language into tokens
is to split text on spaces and punctuation.
First, we need to download the punkt module,
which helps us tokenize words and sentences.
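The basic idea of splitting on spaces and punctuation can be illustrated with a single regular expression. This is a simplified stand-in written for this example, not the punkt-based tokenizer the project downloads; the function name `simple_tokenize` is hypothetical.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one punctuation character. Whitespace is dropped.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("Great phone, love it!")
print(tokens)
```

A tokenizer trained for real text handles cases this pattern does not, such as abbreviations, contractions, and emoticons, which is why a dedicated module is used in practice.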
We construct a sentiment analysis model that
associates each tweet with either a positive
or a negative sentiment.
By default, the data contains all positive
tweets first, followed by all negative tweets.
The model should be trained on an unbiased
sample of the data. To prevent this ordering
from biasing the training set, we arrange the
data randomly using the .shuffle() method of
Python's random module.
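A minimal sketch of the shuffle-then-split step follows. The placeholder examples, the fixed seed, and the 70/30 split ratio are assumptions made for illustration; the slides do not state the values that were actually used.

```python
import random

# Hypothetical labeled examples, ordered as described: all positives first.
dataset = [({"good": True}, "Positive")] * 5 + [({"bad": True}, "Negative")] * 5

rng = random.Random(42)   # fixed seed (an assumption) so runs are repeatable
rng.shuffle(dataset)      # in-place shuffle, same idea as random.shuffle()

# Assumed 70/30 split; integer arithmetic avoids floating-point rounding.
split = len(dataset) * 7 // 10
train_data, test_data = dataset[:split], dataset[split:]
print(len(train_data), len(test_data))
```

Without the shuffle, slicing the first 70% would give a training set containing every positive example but only some of the negative ones, and a test set containing negatives only.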
CONCLUSION AND FUTURE WORK

Sentiment detection is a developing field with a
number of difficulties. The goal of this work is
to study methods and techniques that enable the
automatic classification of sentiment into
positive or negative polarity.
This work employs several techniques. The model
is built using data from NLTK's Twitter corpus,
which currently comprises 30,000 tweet samples.
We preprocess the data using tokenization,
lemmatization, and the removal of stop words,
URLs, @ mentions, punctuation, and special
characters, and then convert all tweets to
lowercase.
Future work will involve enhancing the model to
predict sarcasm, which requires supplying enough
training data to train it appropriately.
We also plan to apply the classification
technique to Arabic tweets to evaluate its
efficacy, given the high volume of tweets
generated per minute, many of which are in
Arabic.
REFERENCES

• [1] J. Li, S. Fong, Y. Zhuang, and R. Khoury, “Hierarchical
Classification in Text Mining for Sentiment Analysis,” in 2014
International Conference on Soft Computing and Machine
Intelligence, Sep. 2014, pp. 46-51, doi: 10.1109/ISCMI.2014.37.
• [2] H. Parveen and S. Pandey, “Sentiment analysis on Twitter
Dataset using Naive Bayes algorithm,” in 2016 2nd International
Conference on Applied and Theoretical Computing and
Communication Technology (iCATccT), Jul. 2016, pp. 416-419, doi:
10.1109/ICATCCT.2016.7912034.
• [3] S. S. and P. K.v., “Sentiment analysis of Malayalam tweets using
machine learning techniques,” ICT Express, Apr. 2020, doi:
10.1016/j.icte.2020.04.003.
THANK YOU
