Twitter Sentiment Analysis
The objective of this project is to perform sentiment analysis on Twitter data using machine
learning techniques, specifically logistic regression and Naive Bayes classification algorithms.
The focus is on distinguishing between positive and negative sentiments expressed in tweets. To
achieve this, the project incorporates preprocessing steps including stemming for text cleaning.
By analyzing this data, the aim is to develop models that accurately classify tweet sentiment,
enabling valuable insights into public opinion, customer sentiment, and trends across topics
discussed on Twitter. Ultimately, the project seeks to contribute to the understanding of
sentiment dynamics in social media and provide a tool for businesses, researchers, and
individuals to gauge public sentiment effectively.
DATA COLLECTION AND PREPROCESSING
Twitter API: Used the Twitter API to access and collect a large volume of tweets related to the desired topic.
Keyword Filtering: Filtered the collected tweets based on specific keywords or hashtags relevant to the analysis.
Language Detection: Identified the language of each tweet and removed tweets not in the desired language.
Text Cleaning: Removed unnecessary characters, such as punctuation and special symbols, from the tweet text.
Tokenization: Split the tweet text into individual words or tokens for further analysis.
Stopword Removal: Removed common words, such as 'and', 'the', and 'is', that carry little meaning for sentiment analysis.
Normalisation: Converted all words to lowercase and applied stemming or lemmatization to reduce words to their base form.
Data Sampling: Selected a representative sample of the collected dataset for analysis to reduce computational requirements.
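The cleaning, tokenization, stopword-removal, and normalisation steps above can be sketched as a single pipeline. This is an illustrative sketch only: the stopword list is a small stand-in for a full list (e.g. NLTK's), and `naive_stem` is a crude suffix stripper standing in for a proper Porter stemmer.

```python
import re

# Small illustrative stopword list; the real pipeline would use a fuller
# set such as NLTK's English stopwords.
STOPWORDS = {"and", "the", "is", "a", "an", "to", "of", "in", "it", "this"}

def naive_stem(word):
    # Crude suffix stripping for illustration; Porter stemming is more robust.
    for suffix in ("ing", "ed", "ly", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def preprocess(tweet):
    tweet = re.sub(r"http\S+|@\w+|#", "", tweet)   # strip URLs, mentions, '#'
    tweet = re.sub(r"[^a-zA-Z\s]", "", tweet)      # drop punctuation/symbols
    tokens = tweet.lower().split()                 # normalise case + tokenise
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]
```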
FEATURE ENGINEERING
1. Stemming: Reduces words to their base form (e.g., "running" -> "run"). This conflates related word forms and reduces the number of features.
2. Regular Expressions: Define patterns to identify specific elements (e.g., hashtags, emoticons). Useful for removing irrelevant information or creating sentiment features (e.g., positive emoticons).
3. TF-IDF: Assigns weights to words based on their importance in a document and their rarity across the corpus, focusing the model on informative words for sentiment analysis.
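The TF-IDF weighting described above can be sketched in a few lines. This is the basic term-frequency times inverse-document-frequency scheme; library implementations such as scikit-learn's TfidfVectorizer add smoothing and normalisation on top of it.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenised documents.

    Basic scheme: tf = count / doc length, idf = log(N / doc frequency).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # document frequency: one count per doc
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights
```

A term appearing in every document (e.g. "movie" below) gets weight zero, while rarer, more informative terms get positive weight.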
Train-Test Split
1. Train-Test Split: Divides the data into training and testing sets for machine learning.
2. Model Training: The training set teaches the model patterns and relationships in the data.
3. Model Evaluation: The unseen test set assesses the model's ability to generalize to new data.
4. Overfitting Prevention: Helps avoid models that memorize the training data but fail on unseen data.
5. Validation Technique: A crucial step in ensuring robust model performance.
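The split described above can be sketched as follows; this is a minimal stand-in for a library routine such as scikit-learn's train_test_split, with a fixed seed so the split is reproducible.

```python
import random

def train_test_split(X, y, test_size=0.2, seed=42):
    """Shuffle indices, then carve off the last test_size fraction as the
    held-out test set."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    cut = int(len(X) * (1 - test_size))
    train_idx, test_idx = indices[:cut], indices[cut:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test
```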
Model Evaluation
1. Naive Bayes
2. Logistic Regression
Advantages of Naive Bayes
Naive Bayes is simple and efficient, making it computationally inexpensive.
It performs well with high-dimensional data and is less prone to overfitting.
It can handle both binary and multi-class classification problems.
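The Naive Bayes idea can be sketched as a minimal multinomial classifier over token lists with Laplace smoothing. This is illustrative only; the project's actual model would typically come from a library such as scikit-learn.

```python
import math
from collections import Counter

class NaiveBayes:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        # Log prior: log P(class) estimated from label frequencies.
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
            self.vocab.update(doc)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}

    def predict(self, doc):
        v = len(self.vocab)
        scores = {}
        for c in self.classes:
            # Log posterior up to a constant: prior + sum of log likelihoods.
            score = self.priors[c]
            for token in doc:
                score += math.log(
                    (self.counts[c][token] + 1) / (self.totals[c] + v))
            scores[c] = score
        return max(scores, key=scores.get)
```

The smoothing term keeps unseen words from zeroing out a class's probability, which is part of why Naive Bayes holds up well on high-dimensional, sparse text data.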
Naive Bayes
The model achieved an accuracy of 0.81 on the training data and 0.74 on the test data.
Logistic Regression
The model achieved an accuracy of 0.83 on the training data and 0.77 on the test data.
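The logistic regression step can be sketched as plain gradient descent on the logistic loss. This is an illustrative toy implementation; in practice one would use a library implementation such as scikit-learn's LogisticRegression.

```python
import math

def train_logreg(X, y, lr=0.5, epochs=200):
    """Binary logistic regression trained with per-example gradient descent.

    X is a list of feature vectors, y a list of 0/1 labels.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid: predicted P(y=1)
            err = p - yi                     # gradient of the logistic loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z >= 0 else 0
```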
Result Analysis
[Bar chart comparing model accuracies.] Logistic regression slightly outperformed Naive Bayes, with a test accuracy of 0.77 versus 0.74.