Natural Language Processing Assignment

The document describes a Twitter sentiment analysis project that aims to identify tweets containing hate speech or violence-provoking language using machine learning techniques. It outlines three preprocessing steps: tokenization, stemming, and punctuation removal. It introduces the naive Bayes classifier, although the listed program trains a logistic regression model on the preprocessed tweet data; evaluated on a held-out validation set, the model achieves an F1 score of 0.53. The model is then used to classify the tweets in a test set, with a reported score of 0.567. In conclusion, the document discusses how sentiment analysis is applied to social media trend analysis and marketing, and how ready-made Python libraries simplify the task.

Uploaded by

kuymancho

NATURAL LANGUAGE PROCESSING ASSIGNMENT

TWITTER SENTIMENT ANALYSIS

PROBLEM STATEMENT:

The objective of this task is to detect tweets containing hateful or
violence-provoking words. We say a tweet contains hate speech if it
contains provoking comments against a religion, caste or region. Our
work here is to separate these types of tweets from other tweets.

SOURCE CODE:

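import pandas as pd

# load the dataset, skipping malformed rows
# (on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0)
data = pd.read_csv('Sentiment Analysis Dataset.csv', on_bad_lines='skip')
data.columns = ['id', 'label', 'source', 'text']
data.head(2)
data = data.drop(['id', 'source'], axis=1)  # keep only the label and the tweet text
data.head(10)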
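
To show what the loading step does with a malformed row, here is a minimal sketch on an inline CSV. The sample rows and the header names a, b, c, d are made up for illustration; only the column names id, label, source and text come from the program above.

```python
import io
import pandas as pd

# Hypothetical three-row file; the second data row has too many fields.
raw = io.StringIO(
    "a,b,c,d\n"
    "1,0,web,good morning\n"
    "2,1,app,this row,has,too,many,fields\n"
    "3,0,web,see you soon\n"
)

# Rows with more fields than the header are dropped instead of raising an error.
df = pd.read_csv(raw, on_bad_lines="skip")
df.columns = ["id", "label", "source", "text"]
df = df.drop(["id", "source"], axis=1)
print(df)
```

The malformed row is silently discarded, so it is worth comparing the row count with the raw file when data quality matters.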

PREPROCESSING:

1. TOKENIZATION:

# split each cleaned tweet on whitespace into a list of tokens
tokenized_tweet = combi['tidy_tweet'].apply(lambda x: x.split())
tokenized_tweet.head()
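
As a small illustration of what the whitespace split above produces for a single tweet, here is a plain-Python sketch. The sample tweet is hypothetical and not taken from the dataset.

```python
# Hypothetical cleaned tweet with irregular spacing.
tweet = "  loving   this #sunny day "

# str.split() with no argument splits on any run of whitespace
# and drops leading and trailing whitespace.
tokens = tweet.split()
print(tokens)
```

Because the split only looks at whitespace, hashtags such as #sunny stay intact as single tokens.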

2. STEMMING:

from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()
# reduce every token in every tweet to its stem
tokenized_tweet = tokenized_tweet.apply(lambda x: [stemmer.stem(i) for i in x])
tokenized_tweet.head()
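
The effect of stemming on individual tokens can be sketched as follows. This assumes NLTK is installed; the token list is hypothetical.

```python
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()

# Hypothetical tokens from one tweet.
tokens = ["running", "tweets", "hated"]

# Each token is reduced to its stem, so inflected forms map to one feature.
stems = [stemmer.stem(token) for token in tokens]
print(stems)
```

Stems are not always dictionary words; the goal is only that related word forms collapse to the same string.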

3. PUNCTUATION REMOVAL:

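# replace every character that is not a letter or '#' with a space;
# this has to run before tokenization, which splits the tidy_tweet column
# (regex=True is needed because pandas 2.0 treats the pattern as a literal string by default)
combi['tidy_tweet'] = combi['tidy_tweet'].str.replace("[^a-zA-Z#]", " ", regex=True)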
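
The same character filter can be tried on a single string with the standard re module. This is a plain-Python sketch of the pattern only, not the pandas call used above, and the sample tweet is hypothetical.

```python
import re

# Hypothetical raw tweet containing a mention, digits and punctuation.
tweet = "@user why so #angry?? 100% done!!"

# Every character that is not a letter or '#' becomes a space.
cleaned = re.sub(r"[^a-zA-Z#]", " ", tweet)

# Splitting afterwards discards the runs of spaces left behind.
tokens = cleaned.split()
print(tokens)
```

Note that the pattern also removes digits and the @ sign, while keeping the # so that hashtags remain recognizable.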

NAIVE BAYES CLASSIFICATION:

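The naive Bayes classifier is a classification algorithm that relies on
Bayes' theorem. This theorem provides a way of calculating a type of
probability called the posterior probability: the probability of an
event A occurring given that some related evidence B has already been
observed. The classifier is called naive because it assumes that the
features (here, the words of a tweet) are independent of one another
given the class. Note that the program below fits a logistic regression
model (LogisticRegression) rather than a naive Bayes model.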
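
As a worked illustration of the posterior probability, the sketch below applies Bayes' theorem to a single word. All three input probabilities are made-up numbers chosen for the example, not values measured on the dataset.

```python
# Hypothetical inputs (assumptions for illustration only).
p_hate = 0.07             # prior: share of tweets labelled as hate speech
p_word_given_hate = 0.30  # likelihood of the word in a hate tweet
p_word_given_other = 0.01 # likelihood of the word in any other tweet

# Total probability of seeing the word in any tweet.
p_word = p_word_given_hate * p_hate + p_word_given_other * (1 - p_hate)

# Bayes' theorem: P(hate | word) = P(word | hate) * P(hate) / P(word)
p_hate_given_word = p_word_given_hate * p_hate / p_word
print(round(p_hate_given_word, 3))
```

Even with a low prior, a word that is far more common in one class pushes the posterior strongly toward that class.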

PROGRAM:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# bow and train are not defined in this report; bow is presumably the
# bag-of-words feature matrix of the combined training and test tweets,
# whose first 31962 rows belong to the training set
train_bow = bow[:31962, :]
test_bow = bow[31962:, :]

# splitting data into training and validation set
xtrain_bow, xvalid_bow, ytrain, yvalid = train_test_split(
    train_bow, train['label'], random_state=42, test_size=0.3)

lreg = LogisticRegression()
lreg.fit(xtrain_bow, ytrain)  # training the model

prediction = lreg.predict_proba(xvalid_bow)  # predicting on the validation set
# 1 if the predicted probability is greater than or equal to 0.3, else 0
prediction_int = prediction[:, 1] >= 0.3
prediction_int = prediction_int.astype(int)  # np.int was removed in NumPy 1.24

f1_score(yvalid, prediction_int)  # calculating f1 score

Output: 0.53
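
To make the thresholding and the F1 score concrete, here is a plain-Python sketch that reproduces both steps by hand. The probabilities and labels are hypothetical; only the 0.3 threshold comes from the program above.

```python
# Hypothetical positive-class probabilities and true labels.
probabilities = [0.10, 0.35, 0.80, 0.25, 0.60, 0.05]
y_true = [0, 1, 1, 1, 0, 0]

# Same rule as the program: label 1 when the probability is at least 0.3.
threshold = 0.3
y_pred = [1 if p >= threshold else 0 for p in probabilities]

# Count true positives, false positives and false negatives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(y_pred, round(f1, 3))
```

Lowering the threshold below the default 0.5 trades precision for recall, which can help when the positive class is rare.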

SCORE CALCULATION:

test_pred = lreg.predict_proba(test_bow)
test_pred_int = test_pred[:, 1] >= 0.3  # same 0.3 threshold as on the validation set
test_pred_int = test_pred_int.astype(int)  # np.int was removed in NumPy 1.24
test['label'] = test_pred_int
submission = test[['id', 'label']]
submission.to_csv('sub_lreg_bow.csv', index=False)  # write the predictions to a CSV file

The score is 0.567.

RESULT:

Sentiment analysis is an interesting application of Natural Language
Processing for drawing automated conclusions about text. It is used in
social media trend analysis and, sometimes, for marketing purposes.
Writing a sentiment analysis program in Python is not a difficult task:
nowadays there are many ready-to-use Python libraries, which makes the
task much easier. This program illustrates how such an application
works.
