Softcom-Assignment1

Uploaded by

Yousuf ali Safin

COURSE NO: CSE 4114

Course Title: Pattern Recognition and Machine Learning

A Comprehensive Study to Sentiment Analysis of Bangla Cricket-Related Social Media Comments Using ML and LSTM Models
Research Participants

Adibul Haque (ID: 20200204029)
Yousuf Ali (ID: 20200204037)
Miftahul Sheikh (ID: 20200204038)

Slide 02
Research Paper Presentation
Outline
01 Abstract
02 Introduction
03 LITERATURE REVIEW
04 DATASETS
05 PRE-PROCESSING TECHNIQUES
06 MODELS
07 EVALUATION METRICS
08 RESEARCH GAP
09 CONCLUSION
10 CONTRIBUTION OF GROUP MEMBERS
11 REFERENCES

Slide 03
ABSTRACT
• Sentiment analysis of Bangla cricket-related social media comments.
• Logistic Regression, KNN, and LSTM models applied.
• Text normalization, tokenization, and word embeddings on Facebook and YouTube comments.
• KNN: 72.1%, Logistic Regression: 70.1%, LSTM: 77.6%.
• ML + DL boost Bangla sentiment analysis; future: expand dataset, explore hybrids.
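
The normalization and tokenization steps named above can be sketched roughly as follows. This is only an illustration: the slides do not describe the actual pipeline, so the helper names (`normalize`, `tokenize`), the choice to drop URLs, emoji, and punctuation, and the whitespace tokenizer are all assumptions.

```python
import re
import unicodedata
from typing import List

# Assumed cleaning rules (not taken from the slides).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Keep the Bengali Unicode block (U+0980 to U+09FF), the zero-width
# joiner/non-joiner used inside Bangla words, and whitespace.
NON_BANGLA_RE = re.compile(r"[^\u0980-\u09FF\u200c\u200d\s]")


def normalize(text: str) -> str:
    """Return a cleaned comment: NFC form, no URLs, Bangla characters only."""
    text = unicodedata.normalize("NFC", text)
    text = URL_RE.sub(" ", text)
    text = NON_BANGLA_RE.sub(" ", text)
    # Collapse runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split a normalized comment on whitespace (hypothetical tokenizer)."""
    return normalize(text).split()


if __name__ == "__main__":
    # Made-up example comment, not from the dataset.
    print(tokenize("বাংলাদেশ জিতেছে!!! https://example.com 😀"))
```

A whitespace tokenizer is the simplest choice; a real system would likely map the tokens to word embeddings afterwards, as the abstract mentions.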

Slide 04
INTRODUCTION

• Increased social media use boosts cricket discussions in Bangladesh.
• Essential to understand public sentiment on cricket in Bangladeshi culture.
• Lack of Bangla sentiment analysis in cricket context.
• Aims to bridge the gap in Bangla sentiment analysis for cricket-related social media comments.
• Combines traditional and deep learning techniques to enhance sentiment analysis accuracy.

Slide 05
Motivation
• Aims to accurately interpret fan sentiments.
• Employs NLP and ML to handle Bangla language specifics.
• Addresses unique Bangla linguistic challenges.
• Applies advanced methods for deeper analysis.
• Enhances understanding of Bangladeshi cricket fans' opinions.

Slide 04
LITERATURE REVIEW

EVALUATION OF NAÏVE BAYES AND SUPPORT VECTOR MACHINES ON BANGLA TEXTUAL MOVIE REVIEWS.
• The paper analyzes Bangla movie reviews for sentiment.
• It uses Naive Bayes (NB) and Support Vector Machines (SVM) for polarity detection.
• SVM, with stemmed unigram features, achieved a precision of 0.86.

A DEEP LEARNING APPROACH TO DETECT ABUSIVE BENGALI TEXT.
• 82.20% for abusive Bengali text detection.
• Outperformed ANN (81.10%), LinearSVC (75.70%), Logit (75.20%), MNB (73.90%), and RF (70.50%).
• LSTM > other models.

Slide 06
LITERATURE REVIEW
A STUDY TOWARDS BANGLA FAKE NEWS DETECTION USING MACHINE LEARNING AND DEEP LEARNING.
• The study used 57,000 Bangla news items to identify fake news.
• Bi-LSTM models with GloVe and FastText achieved up to 96% accuracy.
• GRU model accuracy was 77%.

CRICKET SENTIMENT ANALYSIS FROM BANGLA TEXT USING RECURRENT NEURAL NETWORK WITH LONG SHORT TERM MEMORY MODEL.
• RNN with LSTM for Bangla cricket sentiment analysis.
• The LSTM model achieves an accuracy of 95%.
• LSTM outperforms the Support Vector Machine (SVM), which has an accuracy of 71.03%.

Slide 07
DATASETS
• Paper [1]: Utilized phishing dataset with 11,000 URLs and 30 features.
• Paper [2]: Real-world phishing data used, unspecified source.
• Paper [3]: CIC-Bell-DNS 2021 with 400,000 benign and 13,011 malicious samples; UCI Phishing Domains and 3,000 URLs.
• Paper [4]: Real-world website details, no specific dataset provided.
• Paper [5]: 10,000 URLs from Kaggle, balanced phishing and non-phishing.

Slide 09
PRE-PROCESSING TECHNIQUES

01 DATA CLEANING
02 FEATURE EXTRACTION
03 FEATURE SELECTION
04 NORMALIZATION AND SCALING
05 DATA ENCODING
06 BALANCING DATA
07 DATA SPLITTING
08 ADVANCED TECHNIQUES
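
Two of the steps listed above, data encoding and data splitting, can be sketched in plain Python. The function names, the 80/20 ratio, the fixed seed, and the decision to stratify the split by class are assumptions made for illustration; the slides do not say how these steps were carried out.

```python
import random
from collections import defaultdict


def encode_labels(labels):
    """Map each distinct label to an integer id (sorted for determinism)."""
    mapping = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def stratified_split(samples, labels, test_ratio=0.2, seed=42):
    """Split so each class keeps roughly the same share in train and test.

    Note: every class sends at least one sample to the test set, so a class
    with a single sample ends up with nothing in the training set.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_ratio))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


if __name__ == "__main__":
    # Made-up labels, only to show the call pattern.
    ids, mapping = encode_labels(["positive", "negative", "positive"])
    print(ids, mapping)
```

Stratifying is one way to keep an imbalanced class represented in both partitions; it does not replace an explicit balancing step such as over- or under-sampling.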

Slide 10
MODELS USED
CLASSIFIER          ACCURACY   PRECISION   RECALL
LINEAR REGRESSION   GFG        STANDARD    PROFESSIONAL
SVM                 0.7214     0.6852      0.7215
RANDOM FOREST       0.7065     0.6754      0.7012
KNN                 0.7114     0.6814      0.7114
XGBOOST             0.7449     0.7350      0.7450

Slide 11
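
As a rough illustration of one classifier from the table, a k-nearest-neighbours prediction can be written in a few lines. This is a generic sketch, not the configuration used in the study: the Euclidean distance, `k=3`, and the function name `knn_predict` are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and take the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up 2-D feature vectors standing in for real text features.
    points = [(0, 0), (0, 1), (5, 5), (6, 5)]
    labels = ["neg", "neg", "pos", "pos"]
    print(knn_predict(points, labels, (0.2, 0.1)))
```

In practice the feature vectors would come from the feature extraction step rather than hand-written coordinates.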


Evaluation Metrics
• Accuracy
• Precision
• Recall/Sensitivity
• F-measure
• Error Rate (ERR)
• False Positive Rate (FPR)
• Specificity
• Detection Speed (DS)
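
Most of the metrics listed above follow directly from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the example counts are made up, and Detection Speed is left out because it is a timing measurement, not a function of these counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")

    accuracy = (tp + tn) / total
    # Ratios fall back to 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "error_rate": (fp + fn) / total,
        "fpr": fpr,
        "specificity": specificity,
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from the study.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Error rate is the complement of accuracy, and the false positive rate is the complement of specificity, so reporting both members of each pair adds no new information.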

Slide 12
RESEARCH GAP

1 Paper [1]: The HEFS method is slow for real-time detection in resource-limited environments and lacks thorough testing against different phishing types and false alarms.

2 Paper [2]: The study does not test the model against various phishing types or discuss real-world deployment challenges, and its embedding techniques may not capture all phishing variations.

3 Paper [3]: The GNN models need better accuracy and adaptation to new phishing tactics, focusing mainly on URL structures and requiring significant computing power.

4 Paper [4]: The system struggles with new phishing techniques and targeted attacks, has privacy concerns, and requires more research and teamwork to improve accuracy with feedback and context.

5 Paper [5]: The model's effectiveness depends heavily on data, may miss some phishing threats, and does not show significant benefits of using ANN and AdaBoost together over ANN alone.

6 Paper [6]: The study relies on outdated data and lacks comprehensive testing against various phishing techniques, potentially limiting its practical applicability and effectiveness.

Slide 13
CONCLUSION

1 Extensive research of ML techniques.
2 Random Forest and Neural Networks are highly accurate.
3 Feature engineering and preprocessing are crucial.
4 Larger datasets and real-world tests are needed.
5 The study suggests future cybersecurity improvements.

Slide 14
Related Papers

01 Nayan Banik and Md Hasan Hafizur Rahman. Evaluation of naïve Bayes and support vector machines on Bangla textual movie reviews. In 2018 International Conference on Bangla Speech and Language Processing (ICBSLP), pages 1–6. IEEE, 2018.

02 Estiak Ahmed Emon, Shihab Rahman, Joti Banarjee, Amit Kumar Das, and Tanni Mittra. A deep learning approach to detect abusive Bengali text. In 2019 7th International Conference on Smart Computing & Communications (ICSCC), pages 1–5. IEEE, 2019.

03 Elias Hossain, Md Nadim Kaysar, Abu Zahid Md Jalal Uddin Joy, Md Mizanur Rahman, and Wahidur Rahman. A study towards Bangla fake news detection using machine learning and deep learning. In Sentimental Analysis and Deep Learning: Proceedings of ICSADL 2021, pages 79–95. Springer, 2022.

04 Md Ferdous Wahid, Md Jahid Hasan, and Md Shahin Alom. Cricket sentiment analysis from Bangla text using recurrent neural network with long short term memory model. In 2019 International Conference on Bangla Speech and Language Processing (ICBSLP), pages 1–4. IEEE, 2019.

Slide 15
CONTRIBUTION OF GROUP MEMBERS
Member            Writing Report Paper                                    Preparing Presentation
Yousuf Ali        Abstract, Introduction, Conclusion                      Adib
Adibul Haque      Literature Review, Datasets, References                 Mifta
Miftahul Sheikh   Pre-processing Techniques, Models, Evaluation Metrics   Nafisa

Slide 16
THANK YOU
