
PUDUCHERRY TECHNOLOGICAL UNIVERSITY

(An Autonomous Institution of Govt. of Puducherry)

Department of Information Technology

Improvement in Sentiment Analysis of Twitter Texts Using Machine Learning Algorithms

RAC-2
9-10-2024 (A.N)

Research Supervisor : Dr. P Maragathavalli


Presented by : Parasa Somaraju
(Reg. No: 2301712006)
Contents
•Problem Definition
•Design Diagram
•Motivation
•Introduction
•Comparison of existing work
•Limitations
•Proposed Methodologies
•Algorithms To Be Implemented
•Technologies
•Timeline Chart
•Conclusion
•References
1. Problem Definition

Challenge:

Accurate sentiment detection on short, informal, and often ambiguous text (tweets) with movie-specific slang
and jargon.
Objectives:

• Improve the precision, recall, and accuracy of sentiment classification in movie-related tweets using advanced machine learning techniques.
• Combine text and images in tweets for more accurate sentiment analysis, especially when people use memes or GIFs.
• Improve the system's ability to recognize sarcasm and irony, which are common on Twitter.
• Go beyond just "positive" or "negative" and detect specific emotions like happiness, anger, or surprise.
2. Domain Specification Diagram

MACHINE LEARNING
Machine learning is a subfield of artificial intelligence, which is broadly defined as the capability of a machine to
imitate intelligent human behaviour. It focuses on analyzing and interpreting patterns and structures in data to
enable learning, reasoning, and decision making without human intervention.

Figure 1. The diagram of ML approach


3. Motivation

• Movies have a massive social impact, and real-time opinion analysis can inform marketing strategies and predict box-office performance.
• Analyzing movie sentiment on Twitter provides valuable insights for producers, marketers, and viewers.
• Traditional sentiment analysis techniques struggle with short text and domain-specific nuances.
• Recent (2024) advances in machine learning allow more accurate analysis of short texts like tweets.
4. Introduction

In the digital age, opinions about movies are everywhere — from social media to dedicated review
platforms. But what if you could harness the power of natural language processing (NLP) to instantly
gauge the sentiments expressed in these reviews? This article takes you on a journey through the creation
of a Movie Sentiment Analysis application, from its inception to deployment.

Why Twitter Sentiment Analysis Is Important


The movie industry thrives on audience feedback. Analysing sentiments in movie reviews can provide
crucial insights into audience reception. Positive sentiments often indicate a hit, while negative
sentiments can signal areas for improvement.

Other Important Areas

• Understanding Customer Feedback
• Reputation Management
• Political Analysis & Marketing


5. Comparison of existing work

| S.No | Title of the Paper | Year | Modality | Techniques | Advantages of the system | Limitations of the system |
|---|---|---|---|---|---|---|
| 1 | Sentimental Analysis of Movie Tweet Reviews | 2024 | Text Analysis | SVM, Deep Learning | High precision through preprocessing. | Data sparsity in certain categories. |
| 2 | Decoding Twitter: Sentiment Analysis | 2024 | Social Media Sentiment | Ensemble Learning, Lexicon-based | Achieved 88% accuracy with SVM. | Limited by noise in Twitter data. |
| 3 | Machine Learning-Based Sentiment Analysis | 2024 | Opinion Mining | Naive Bayes, Feature Selection | Improved classification accuracy via new methods. | Complexity in feature extraction. |
| 4 | Twitter Sentiment Analysis Using ML Techniques | 2024 | Real-time Sentiment | Hybrid ML-DL Models | Combines strengths of ML algorithms for robustness. | High computational cost. |
| 5 | Aspect-Based Sentiment Analysis on Twitter Using Graph Neural Networks | 2022 | Text | Graph Neural Networks, BERT | Fine-grained sentiment analysis, captures tweet relationships | Requires complex graph construction, may struggle with short tweets |
| 6 | Robust Twitter Sentiment Analysis with Adversarial Training | 2023 | Text | Adversarial Training, CNN-LSTM | Improved generalization, robust to noise and attacks | Increased training time, potential decrease in performance on clean data |
| 7 | Emotion-Enhanced Twitter Sentiment Analysis using Ensemble Learning | 2024 | Text | Ensemble Learning, EmoLex | Captures nuanced emotions, improved accuracy | Reliance on emotion lexicons, potential bias in emotion categorization |
| 8 | Zero-Shot Twitter Sentiment Analysis with Large Language Models | 2024 | Text | GPT-4, Few-shot learning | No need for task-specific training data, adaptable to new domains | High computational requirements, potential biases in pre-trained models |
6. Limitations of Existing Systems:

1. Data Sparsity: Certain categories have limited data, leaving insufficient training data for accurate model performance.
2. Noise in Twitter Data: The high volume of irrelevant or misleading information makes it difficult to distinguish signal from noise.
3. Complexity in Feature Extraction: Extracting relevant features from text data is challenging, and contextual relationships are difficult to capture.
4. High Computational Cost: Processing is resource-intensive, which causes scalability issues with large datasets.
7. Proposed Methodologies

1. Data Collection:
 Collect tweets related to movie reviews from Twitter.

2. Pre-processing:
 Tokenization: Splitting the text into individual tokens
(words).
 Stop word Removal: Removing common words that don't
add significant meaning (e.g., "the," "is").
 Slang Normalization: Converting informal language or
slang into standardized text.
 Emoji and Special Character Handling: Replacing or
interpreting emojis/special characters into text.
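
The four pre-processing steps above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the lookup tables `STOP_WORDS`, `SLANG_MAP`, and `EMOJI_MAP` are tiny hypothetical stand-ins for curated resources, and the order of the steps (emojis first, stop words last) is an assumption made so that emojis survive tokenization and normalized slang is filtered consistently.

```python
import re

# Hypothetical lookup tables for illustration only; a real system would use
# much larger, curated resources.
STOP_WORDS = {"the", "is", "a", "an", "was", "and", "of", "to"}
SLANG_MAP = {"gr8": "great", "luv": "love", "u": "you", "gud": "good"}
EMOJI_MAP = {"😍": "love", "😡": "angry", "👍": "good", "👎": "bad"}

TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def preprocess(tweet: str) -> list[str]:
    """Apply the four pre-processing steps to a single tweet."""
    text = tweet.lower()
    # Emoji handling: replace each known emoji with a descriptive word.
    for emoji, word in EMOJI_MAP.items():
        text = text.replace(emoji, f" {word} ")
    # Special character handling: drop URLs and @mentions.
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    # Tokenization: keep runs of letters, digits, and apostrophes,
    # which also strips the '#' from hashtags.
    tokens = TOKEN_PATTERN.findall(text)
    # Slang normalization: map informal tokens to standard words.
    tokens = [SLANG_MAP.get(tok, tok) for tok in tokens]
    # Stop word removal.
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("The movie was gr8 😍 @bob https://t.co/x #MustWatch"))
```

In practice the tokenization and stop-word steps would more likely come from a library such as NLTK or spaCy; the hand-rolled version here only makes each step visible.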

3. Feature Extraction:
 Text Embeddings: Use models like BERT or Word2Vec to
convert text into numerical vectors.
 TF-IDF (Term Frequency-Inverse Document
Frequency): A technique to weigh important words in the text.
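
To make the TF-IDF weighting concrete, the sketch below computes it by hand for tokenized tweets. It assumes the basic textbook definition (term frequency as count divided by document length, inverse document frequency as log(N / df)); libraries such as scikit-learn use smoothed variants, so their numbers will differ. The function name `tf_idf` is illustrative.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # tf = count / document length, idf = log(N / df)
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [["great", "movie"], ["boring", "movie"]]
for weights in tf_idf(docs):
    print(weights)
```

Here "movie" appears in every document, so its weight is zero, while "great" and "boring" receive positive weights because each is specific to one tweet.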

4. Sentiment Classification:

 BERT (Bidirectional Encoder Representations from Transformers): A transformer-based model.
 CNN-LSTM Hybrid: A combination of Convolutional Neural Networks (CNN) for feature
extraction and Long Short-Term Memory (LSTM) for sequence prediction.

5. Evaluation:
 Use metrics such as Accuracy, Precision, Recall, and F1-Score to evaluate model performance.
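
As a worked example of these metrics, the sketch below computes them for a binary (positive vs. negative) classifier from lists of true and predicted labels. It assumes label 1 is the positive class and returns 0.0 where a ratio is undefined; the function name `binary_metrics` is illustrative.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


print(binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0]))
```

For fine-grained emotion labels (more than two classes), the same quantities are usually computed per class and then averaged.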

6. Model Deployment:
 Deploy the trained model for real-time movie sentiment analysis on Twitter data.

Together, these six steps form a linear pipeline in which the output of each step feeds into the next.
8. Algorithms to be Implemented:

1. BERT/RoBERTa Transformers
 State-of-the-art for contextual text understanding.
2. Hybrid Models (CNN + LSTM)
 CNN-LSTM: Combines convolutional layers with LSTMs for better feature extraction and
sequential learning.
3. Ensemble Learning
 Combining traditional machine learning with deep learning for optimal results.
4. Traditional ML
 Support Vector Machines (SVM) and Naive Bayes for baselines.
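
As an illustration of the kind of baseline listed last, here is a minimal multinomial Naive Bayes classifier with add-one (Laplace) smoothing written in plain Python. It is a sketch under stated assumptions: the class name `NaiveBayes`, the toy training tweets, and the choice to skip words unseen in training are all illustrative, and a real baseline would more likely use an established library implementation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {tok for tokens in docs for tok in tokens}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            # Start from the log prior of the class.
            score = math.log(n_label / n_docs)
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                # Smoothed log likelihood of the word given the class.
                score += math.log((self.word_counts[label][tok] + 1)
                                  / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration.
train_docs = [["great", "movie"], ["loved", "it"],
              ["boring", "plot"], ["terrible", "movie"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["loved", "great"]))
```

A baseline like this gives a reference score against which the transformer and CNN-LSTM models can be compared.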
9. Technologies

• Programming Language: Python, using libraries such as TensorFlow, Keras, and PyTorch.
• Data Collection: Twitter API for gathering movie-related tweets.
• Preprocessing: NLTK and spaCy for tokenization, lemmatization, and sentiment-specific preprocessing.
• Models: Hugging Face Transformers library for BERT and GPT-based models.
• Cloud Platforms: AWS SageMaker and Google Cloud AI for training large models.
• Visualization Tools: TensorBoard, Seaborn, and Matplotlib.


10. Conclusion:

• The proposed work aims to improve accuracy through advanced deep learning models and enhanced preprocessing.
• Further research is needed to handle sarcasm, multimodal data (text + images), and multilingual tweets.
• Potential impact: Improved tools for analyzing public sentiment on movies, enabling better decision-making for film studios and marketers.
11. References:

1. Li, Y., Chen, L., & Yu, Z. (2020). Sentiment analysis of Twitter data: A comprehensive survey. Information Fusion, 57, 115-135. https://doi.org/10.1016/j.inffus.2019.10.018

2. Liu, B., Wu, H., Wang, Y., & Guo, Y. (2022). A survey of deep learning techniques for sentiment analysis on Twitter. Neurocomputing, 484, 50-67. https://doi.org/10.1016/j.neucom.2021.07.045

3. Gupta, A., & Jha, S. (2023). Sentiment analysis of Twitter data: A systematic review. Journal of Information Science, 49(1), 71-97. https://doi.org/10.1177/01655515211030691

4. Alhajji, S., & Al-Qurishi, M. (2024). Sentiment analysis of Twitter data: A comprehensive review. Journal of King Saud University - Computer and Information Sciences, 36(1), 101637. https://doi.org/10.1016/j.jksuci.2022.06.025

5. Chen, J., Luo, L., & Zhang, X. (2020). Twitter sentiment analysis: A deep learning approach using LSTM networks. Information Processing & Management, 57(1), 102143. https://doi.org/10.1016/j.ipm.2019.102143

6. Hasan, M., & Basak, D. (2022). Twitter sentiment analysis using machine learning techniques: A comprehensive review. WIREs Data Mining and Knowledge Discovery, e1396. https://doi.org/10.1002/widm.1396
12. Timeline Chart

[Figure: Gantt chart of the research plan, with monthly columns across 2023 and 2024]

Activities:
• Domain Selection
• Study of Existing Work
• Problem Definition
• Data Collection and Analysis
• Propose Methodology
• Algorithm Design
• Implementation of Modules
• Evaluation of Results
• Journal Publications
• Documentation of the Reports
• Conference Paper Upload
