Sentiment Analysis of Tweets Using Machine Learning

The document discusses sentiment analysis of tweets using machine learning techniques. It describes classifying tweets as positive or negative using classifiers like Naive Bayes, support vector machines, and recurrent neural networks. The workflow involves preprocessing tweets, extracting features, training classifiers on labeled data, and evaluating performance on test data. Applications include analyzing consumer sentiment for organizations and improving marketing based on public opinions.

SENTIMENT ANALYSIS OF TWEETS USING MACHINE LEARNING

• Problem Statement:
• To study and apply machine learning techniques for sentiment analysis
in order to classify tweets as positive or negative.
• Scope:
• Sentiment analysis can be used in diverse applications across various
fields, for example to maximize the interests or profit of companies based
on the reviews they receive.
• Motivation:
• The opinions of individuals towards an entity are very valuable. In the age
of the internet, these opinions are produced in huge amounts on social platforms
like Twitter. Humans cannot process such large amounts of data manually,
which creates the need to automate this process using sentiment
analysis.
• With the help of machine learning algorithms, this has become a fairly
easy and efficient task.
SENTIMENT ANALYSIS

• Sentiment analysis deals with identifying and classifying opinions or sentiments
expressed in source text towards entities such as products, services, organizations,
individuals, issues, events, topics, and their attributes.
• Social media generates a vast amount of sentiment-rich data in the form of
tweets, status updates, blog posts, etc. This data can be used for performing
sentiment analysis.
• The amount of user-generated content is too large for a single person to analyze,
so various sentiment analysis techniques are used to automate the task.
TECHNIQUES TO PERFORM SENTIMENT ANALYSIS

• Knowledge-based approach: This technique requires a large database of
predefined emotions and an efficient knowledge representation for
identifying sentiments. It is difficult to apply in practice because it
requires a huge lexical database, which makes it tedious and
error-prone.
• Machine learning approach: This approach makes use of a
training set to develop a classifier that assigns sentiments. Unlike
the knowledge-based approach, it does not require a large database of predefined emotions.
MACHINE LEARNING TECHNIQUES

• This approach makes use of a training set and a test set.
• The training set consists of input feature vectors and their corresponding class
labels. Using this set, a classification model is developed which tries to map
each input feature vector to its corresponding class label.
• The test set is used to validate the model by predicting the class labels of unseen
feature vectors.
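The training/test workflow above can be sketched as follows. This is a minimal illustration rather than the code used in the project: the toy data, the 25% test fraction and the stand-in `threshold_rule` "model" are all assumptions made for the example.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle (feature_vector, label) pairs and split them into train and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, test_set):
    """Fraction of unseen feature vectors whose predicted label matches the true label."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)

# Toy data (hypothetical): one feature = positive minus negative keyword count.
data = [([2], "positive"), ([1], "positive"), ([3], "positive"), ([1], "positive"),
        ([-1], "negative"), ([-2], "negative"), ([-3], "negative"), ([-1], "negative")]
train_set, test_set = train_test_split(data)

# Stand-in "model": a fixed threshold rule, used only to exercise the evaluation step.
def threshold_rule(features):
    return "positive" if features[0] > 0 else "negative"

test_accuracy = accuracy(threshold_rule, test_set)
```

In a real experiment the rule would be replaced by a classifier fitted on `train_set` only, so that `test_set` stays unseen until evaluation.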
CLASSIFIERS
• Naïve Bayes Classifier:
• This classifier is based on Bayes' theorem. The assumption made here is that all input features
are independent of each other given the class, and that they contribute equally to the outcome.
• The conditional probability for Naive Bayes can be defined as:
$$ P(X \mid y_j) = \prod_{i=1}^{m} P(x_i \mid y_j) $$
• Naive Bayes does not consider the relationships between features, so it cannot utilize the
relationships between part-of-speech tag, emotional keyword and negation.
• 'X' is the feature vector, defined as X = {x1, x2, ..., xm}, and yj is the class label.
• In sentiment analysis of tweets there are different independent features, such as emoticons,
emotional keywords, the count of positive and negative keywords, and the count of positive and
negative hashtags, which are effectively utilized by the Naive Bayes classifier for classification.
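The product formula above can be turned into a small classifier. The sketch below is an illustration under stated assumptions, not the project's implementation: it treats every feature as categorical, multiplies the likelihood by the class prior P(yj) before picking the best class, works in log space to avoid underflow, and uses add-one (Laplace) smoothing. The two toy features (emoticon type, presence of negation) and the training rows are made up.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """samples: list of (feature_tuple, label). Returns the counts needed for prediction."""
    class_counts = Counter(label for _, label in samples)
    # value_counts[(label, i)][v] = number of class-`label` samples whose feature i equals v
    value_counts = defaultdict(Counter)
    values_seen = defaultdict(set)  # distinct values observed for feature i
    for features, label in samples:
        for i, v in enumerate(features):
            value_counts[(label, i)][v] += 1
            values_seen[i].add(v)
    return class_counts, value_counts, values_seen

def predict_naive_bayes(model, features):
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        # log P(y_j) + sum_i log P(x_i | y_j), with add-one (Laplace) smoothing
        score = math.log(n_label / total)
        for i, v in enumerate(features):
            count = value_counts[(label, i)][v]
            score += math.log((count + 1) / (n_label + len(values_seen[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training rows: (emoticon, has_negation) -> sentiment
training = [
    (("happy", False), "positive"),
    (("happy", False), "positive"),
    (("none", False), "positive"),
    (("sad", True), "negative"),
    (("sad", False), "negative"),
    (("none", True), "negative"),
]
model = train_naive_bayes(training)
prediction = predict_naive_bayes(model, ("happy", False))
```

Because each feature contributes its own factor, the classifier never looks at combinations such as "negation next to a positive keyword", which is the limitation noted above.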
• Support Vector Machine:
• This is a binary classifier, i.e. it classifies the input feature vectors into only two distinct
classes.
• It separates the tweets using a hyperplane.
• For classification of tweets we have used a linear
kernel, as it maintains a wide gap (margin) between the two classes.
• The support vector machine is given a set of labelled training
data from the two categories and is trained on this data.
• The mathematical function used:
$$ g(X) = w^{T} \phi(X) + b $$
'X' is the feature vector, 'w' is the weight vector and
'b' is the bias term. φ() is the non-linear mapping
from the input space to a high-dimensional feature
space; with a linear kernel, φ is simply the identity.
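To make the decision function g(X) concrete, here is a minimal sketch of a linear SVM trained with plain sub-gradient descent on the regularised hinge loss. This training procedure is an assumption chosen for brevity (the slides do not say which solver or library was used), and the two-feature toy data (positive keyword count, negative keyword count) and the hyperparameters are made up.

```python
def train_linear_svm(samples, epochs=200, learning_rate=0.1, reg=0.01):
    """samples: list of (feature_list, label) with label in {+1, -1}.
    Minimises the regularised hinge loss with plain sub-gradient descent."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink w (regularisation); add the hinge term only for margin violations.
            w = [wi * (1 - learning_rate * reg) for wi in w]
            if margin < 1:
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b

def decision(w, b, x):
    """g(X) = w^T X + b for the linear kernel (phi is the identity)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical feature vectors: [number of positive keywords, number of negative keywords]
samples = [([3, 0], +1), ([2, 1], +1), ([4, 1], +1),
           ([0, 3], -1), ([1, 2], -1), ([1, 4], -1)]
w, b = train_linear_svm(samples)

# The sign of g(X) gives the predicted class.
label_for_positive_tweet = 1 if decision(w, b, [5, 0]) > 0 else -1
label_for_negative_tweet = 1 if decision(w, b, [0, 5]) > 0 else -1
```

The sign of g(X) selects the side of the hyperplane, and its magnitude indicates how far the tweet lies from the boundary.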
• Recurrent Neural Network:
• Recurrent neural networks (RNNs) are popular and efficient models which have proven
useful in natural language processing (NLP). RNNs make use of sequential information.
• What distinguishes an RNN from the classifiers above and from other neural networks is its ability
to connect previous information to the current task; in other words, it makes use of memory.
• RNNs have a memory which stores information about previous computations.
• RNNs have three layers: an input layer, a hidden layer and an output layer.
• They compute results based on the correlation between the current data step and the previous data
step, much as humans make decisions.
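The "memory" described above is the hidden state that is carried from one step to the next. The sketch below shows only that recurrence for a simple (Elman-style) RNN, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h); the output layer and training are omitted, and the weights are untrained, hand-picked values used purely for illustration.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    h_t = []
    for j in range(len(b_h)):
        total = b_h[j]
        total += sum(W_xh[j][i] * x_t[i] for i in range(len(x_t)))
        total += sum(W_hh[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t

def run_rnn(sequence, W_xh, W_hh, b_h):
    """Feed a sequence of input vectors through the network; return the final hidden state."""
    h = [0.0] * len(b_h)
    for x_t in sequence:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    return h

# Hypothetical, untrained weights: 2 inputs, 2 hidden units.
W_xh = [[1.0, 0.0], [0.0, 1.0]]
W_hh = [[0.5, 0.0], [0.0, 0.5]]
b_h = [0.0, 0.0]

# The same two inputs in a different order give different final states,
# because each step also depends on the previous hidden state.
h_a = run_rnn([[1.0, 0.0], [0.0, 1.0]], W_xh, W_hh, b_h)
h_b = run_rnn([[0.0, 1.0], [1.0, 0.0]], W_xh, W_hh, b_h)
```

This order sensitivity is what lets an RNN treat "not good" differently from a bag containing the same words.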
SENTIMENT ANALYSIS OF TWEETS

• Twitter is a social media platform used widely by everyone from individuals to large organizations.
• Users on Twitter use this platform to express their views or opinions about any
entity, such as a person, product, service or organization.
• Sentiment analysis of tweets is a challenging task because tweets are short (a maximum of 140
characters is allowed at a time) and often contain misspellings,
slang and emoticons.
• Tweets are short and noisy and cover a variety of topics. Tweeters also often use different
vocabularies. All of this poses a challenge to sentiment analysis of tweets.
WORKFLOW
DATASET
• The dataset used here is the Niek Sanders corpus file.
• This file consists of sentiments of famous organizations like Apple, Microsoft,
Google and Twitter.
• It has been formatted in the following way:
Company    | Sentiment (positive, negative, neutral, irrelevant) | Twitter ID
Apple      | Positive                                            | 1.26E+17
Microsoft  | Irrelevant                                          | 1.26E+17
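A corpus file laid out like the table above could be loaded as sketched below. The comma-separated layout, the field names and the tweet IDs in the excerpt are assumptions made for illustration. The IDs are kept as strings on purpose: a spreadsheet-style value such as 1.26E+17 has lost the low-order digits that are needed to look the tweet up again.

```python
import csv
import io

# Hypothetical excerpt; the IDs below are made up for illustration.
corpus_text = """apple,positive,126000000000000001
microsoft,irrelevant,126000000000000002
google,negative,126000000000000003
"""

def load_corpus(handle):
    """Return a list of dicts with keys company, sentiment, tweet_id (ID kept as text)."""
    rows = []
    for company, sentiment, tweet_id in csv.reader(handle):
        rows.append({"company": company, "sentiment": sentiment, "tweet_id": tweet_id})
    return rows

corpus = load_corpus(io.StringIO(corpus_text))

# Keep only the two classes needed for positive/negative classification.
labelled = [row for row in corpus if row["sentiment"] in ("positive", "negative")]
```

With a real file, `io.StringIO(corpus_text)` would be replaced by an opened file handle.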

PRE-PROCESSING OF TWEETS
• Twitter's policies do not allow tweets to be stored for more than 24 hours, so we retrieve the
tweets from Twitter using a Twitter API library for Python.
• To retrieve the tweets we need the Twitter ID, which is obtained from the Niek Sanders corpus
file.
• The tweets obtained may contain misspellings, slang words and emoticons, so they require
pre-processing before they are given to the classifier.
• Pre-processing steps include removing URLs and handling misspellings and slang words.
Misspellings are handled by reducing any run of a repeated character to 2 occurrences.
• Slang words contribute much to the emotion of a tweet, so they cannot simply be removed.
• A slang word dictionary is maintained to replace slang words occurring in tweets with their
associated meanings.
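The pre-processing steps listed above can be sketched with the standard `re` module. The slang dictionary here is a tiny hypothetical stand-in, and the URL pattern is a simplification; note also that slang followed directly by punctuation (for example "gr8!") would not be matched by this simple word lookup.

```python
import re

# Hypothetical slang dictionary; a real one would be much larger.
SLANG = {"gr8": "great", "luv": "love", "u": "you", "omg": "oh my god"}

def preprocess(tweet):
    """Remove URLs, squeeze repeated characters to two, and expand slang."""
    text = re.sub(r"https?://\S+|www\.\S+", "", tweet)
    # "coooool" -> "cool": any character repeated 3+ times is cut to 2 occurrences.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    words = [SLANG.get(word.lower(), word) for word in text.split()]
    return " ".join(words)

cleaned = preprocess("OMG this phone is sooooo gr8 http://example.com/x")
```

Squeezing to two characters rather than one keeps legitimate double letters ("cool", "good") intact while still normalising elongated words.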
FEATURE VECTORS

• The feature vector is composed of 8 relevant features:
• Part of speech (pos) tag
• Special keyword
• Presence of negation
• Emoticon
• Number of positive keywords
• Number of negative keywords
• Number of positive hash tags
• Number of negative hash tags.
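The eight features listed above could be assembled into a vector roughly as follows. This is a sketch under several assumptions: the word lists and emoticon scores are tiny hypothetical lexicons, and the part-of-speech tag and special-keyword features are passed in as placeholders because the slides do not specify how they are computed (a POS tagger would normally supply the former).

```python
import re

# Hypothetical lexicons; real ones would be far larger.
POSITIVE_WORDS = {"good", "great", "love", "awesome"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "awful"}
NEGATIONS = {"not", "no", "never"}
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}

def extract_features(tweet, pos_tag="UNK", special_keyword=0):
    """Return the 8-element feature vector in the order listed in the slides."""
    tokens = tweet.split()
    # Strip punctuation but keep '#' (hashtags) and apostrophes (e.g. "don't").
    words = [re.sub(r"[^\w#']", "", t).lower() for t in tokens]
    hashtags = [w[1:] for w in words if w.startswith("#")]
    emoticon = sum(EMOTICONS.get(t, 0) for t in tokens)
    return [
        pos_tag,                                                        # part-of-speech tag (placeholder)
        special_keyword,                                                # special keyword (placeholder)
        int(any(w in NEGATIONS or w.endswith("n't") for w in words)),   # presence of negation
        emoticon,                                                       # emoticon score
        sum(w in POSITIVE_WORDS for w in words),                        # positive keywords
        sum(w in NEGATIVE_WORDS for w in words),                        # negative keywords
        sum(h in POSITIVE_WORDS for h in hashtags),                     # positive hashtags
        sum(h in NEGATIVE_WORDS for h in hashtags),                     # negative hashtags
    ]

features = extract_features("I do not love the new update :( #awful")
```

Keeping the vector at a fixed length and order is what allows the same classifier to be applied to every tweet.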
EVALUATION

• After the pre-processing and feature-extraction steps, a support vector machine is
defined and trained on the resulting data.
• It was tested with the keyword “GOOGLE” and returned 82% positive sentiment and
18% negative sentiment.
• A CSV file is also generated, which stores the sentiment for
each tweet along with the search word.
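The reporting step described above (percentage of positive and negative tweets for a search word, plus a CSV of per-tweet results) can be sketched as below. The tweets, predictions and column names are made up for the example and do not reproduce the 82%/18% result; an in-memory buffer stands in for the output file.

```python
import csv
import io
from collections import Counter

def summarise(predictions):
    """Percentage of tweets per predicted sentiment label."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: 100.0 * n / total for label, n in counts.items()}

def write_results(handle, search_word, tweets, predictions):
    """Write one CSV row per tweet: search word, tweet text, predicted sentiment."""
    writer = csv.writer(handle)
    writer.writerow(["search_word", "tweet", "sentiment"])
    for tweet, label in zip(tweets, predictions):
        writer.writerow([search_word, tweet, label])

# Made-up tweets and predictions, only to exercise the two functions.
tweets = ["love the new search", "maps is great", "ads everywhere, awful", "nice doodle today"]
predictions = ["positive", "positive", "negative", "positive"]

summary = summarise(predictions)
buffer = io.StringIO()
write_results(buffer, "GOOGLE", tweets, predictions)
```

Using `csv.writer` rather than manual string joining ensures that tweets containing commas or quotes are escaped correctly.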
[Screenshots: creating the test set; pre-processing of the test set; initializing the tweets]
FUTURE SCOPE
• Sarcastic comments are very difficult to identify. Tweets containing
sarcastic comments give exactly the opposite result to the one the author intended.
• The interpretation of a word changes with the context in which it is used. For example, the word
‘unpredictable’ in ‘unpredictable plot’ is negative in the context of a plot of land, whereas
‘unpredictable plot’ in the context of a movie’s plot is positive. So it is important to relate
the interpretation to the context of the tweet.
• Tweets that mix a native language with English are difficult to interpret.
• One way to improve accuracy is to train the system so that it derives the
sentiment of a word from the entire tweet, i.e. if a word in the tweet has more than
one meaning, the system compares all the meanings of the word and takes the one which
best suits the sentiment of the entire tweet.
APPLICATIONS
• Sentiment analysis of tweets can be extended to any review-related website, for
example product reviews (to understand a product's popularity), movie reviews, etc.
• It is highly useful as a sub-component technology, for example for detecting antagonistic or heated
language in emails, context-sensitive information detection, spam detection, etc.
• One of the major applications of sentiment analysis is for organizations to
determine consumer attitudes and trends.
• Consumers can use sentiment analysis to research products or services before making
a purchase.
• Marketers can use it to research public opinion of their company and products, or
to analyse customer satisfaction.
CONCLUSION

• Various machine learning algorithms were used to perform sentiment analysis of
tweets.
• A comparison of the different models and
their performance (accuracy) was also carried out.
• Reducing the dataset through feature extraction enhanced the performance
of the classifiers used and produced better results.
