Sentiment Analysis

The document outlines a project to build a sentiment analysis model for classifying tweets as positive, neutral, or negative using Python and various libraries. It details the process of data collection and preprocessing using the Sentiment140 dataset, including steps like tokenization and feature engineering. Additionally, it provides Python code snippets for loading and analyzing the dataset, as well as visualizing the distribution of sentiments.

Uploaded by

gangasaikiranreddy


1. Sentiment Analysis of Tweets (Text Classification)

Objective: Build a sentiment analysis model to classify tweets as positive, neutral, or negative
based on the text content.

Tools and Technologies:

Programming Language: Python
Libraries: pandas, nltk, scikit-learn, matplotlib, seaborn
Dataset: Use the Sentiment140 dataset (available on Kaggle) or the Twitter API to gather tweet data.

Day-by-Day Breakdown:
Data Collection and Preprocessing

Load the dataset (e.g., Sentiment140) using pandas.

Preprocess the text by removing stopwords and special characters and converting it to lowercase.
Tokenize the text and apply stemming/lemmatization.
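The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library: the `STOPWORDS` set, the regular expressions, and the `clean_tweet` helper are assumptions made for this example, not part of the project code. In practice the nltk stopword list and a stemmer or `WordNetLemmatizer` would likely replace the tiny stopword set, and stemming/lemmatization is left out of this sketch.

```python
import re

# Assumption: a tiny illustrative stopword list; nltk's English list is much larger.
STOPWORDS = {"a", "an", "the", "is", "am", "are", "to", "of", "and", "i", "it", "this", "so"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
MENTION_PATTERN = re.compile(r"@\w+")
NON_ALPHA_PATTERN = re.compile(r"[^a-z\s]")


def clean_tweet(tweet):
    """Lowercase a tweet, strip URLs, mentions and special characters,
    split it into tokens, and drop stopwords."""
    lowered = tweet.lower()
    lowered = URL_PATTERN.sub(" ", lowered)
    lowered = MENTION_PATTERN.sub(" ", lowered)
    lowered = NON_ALPHA_PATTERN.sub(" ", lowered)
    tokens = lowered.split()  # whitespace tokenization
    return [token for token in tokens if token not in STOPWORDS]


print(clean_tweet("@user I am SO happy with this phone!!! http://example.com"))
# prints ['happy', 'with', 'phone']
```

URLs and mentions are removed before the generic special-character filter so that fragments such as "http" or a username do not survive as tokens.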
Feature Engineering and Model Selection
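One possible shape for this step, using scikit-learn from the library list above, is to turn the tweet text into TF-IDF features and fit a linear classifier on them. The sketch below is an assumption about how the step could look, not the project's confirmed method: the toy `texts` and `labels` are invented for illustration, with 0 standing for negative and 4 for positive as in Sentiment140.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; 0 = negative, 4 = positive.
texts = [
    "i love this phone",
    "what a great day",
    "best movie ever",
    "so happy right now",
    "i hate this phone",
    "what an awful day",
    "worst movie ever",
    "so sad right now",
]
labels = [4, 4, 4, 4, 0, 0, 0, 0]

# TF-IDF over unigrams and bigrams, followed by logistic regression.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

predictions = model.predict(["what a great day", "this is awful"])
print(predictions)
```

On the real dataset the data would first be split with `train_test_split`, and candidate models (for example `LogisticRegression`, `LinearSVC`, `BernoulliNB`) compared with `classification_report` and `confusion_matrix` on the held-out part.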

PYTHON CODE
import re
import numpy as np
import pandas as pd
import seaborn as sns
from wordcloud import WordCloud
import matplotlib.pyplot as plt
from nltk.stem import WordNetLemmatizer
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix, classification_report
# Import the dataset; column names are supplied explicitly through names=
DATASET_COLUMNS = ['target', 'ids', 'date', 'flag', 'user', 'text']
DATASET_ENCODING = "ISO-8859-1"
df = pd.read_csv('Project_Data.csv', encoding=DATASET_ENCODING, names=DATASET_COLUMNS)
df.sample(5)

df.head()

df.columns

output: Index(['target', 'ids', 'date', 'flag', 'user', 'text'], dtype='object')

print('length of data is', len(df))

output: length of data is 241985

df.shape

output: (241985, 6)

df.info()

output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 241985 entries, 0 to 241984
Data columns (total 6 columns):
 #   Column  Non-Null Count   Dtype
---  ------  --------------   -----
 0   target  241985 non-null  int64
 1   ids     241985 non-null  int64
 2   date    241985 non-null  object
 3   flag    241984 non-null  object
 4   user    241984 non-null  object
 5   text    241932 non-null  object
dtypes: int64(2), object(4)
memory usage: 11.1+ MB
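The non-null counts above imply that 53 of the 241985 rows have no tweet text (241985 - 241932), and that flag and user are each missing one value. A hedged sketch of how such rows might be counted and dropped before modeling is shown below; the miniature `sample` frame is hypothetical and only mirrors the column names of the real data.

```python
import pandas as pd

# Hypothetical miniature frame with the same columns as the dataset.
sample = pd.DataFrame({
    "target": [0, 4, 4],
    "ids": [1, 2, 3],
    "date": ["d1", "d2", "d3"],
    "flag": ["NO_QUERY", None, "NO_QUERY"],
    "user": ["a", "b", "c"],
    "text": ["bad day", None, "great day"],
})

# Count missing values per column.
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# Drop rows without tweet text, since they cannot be classified.
cleaned = sample.dropna(subset=["text"]).reset_index(drop=True)
print(len(sample) - len(cleaned), "rows dropped")
```

Restricting `dropna` to the `text` column keeps rows that only lack `flag` or `user`, which are not needed as model input.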

df.dtypes

output:

print('Count of columns in the data is: ', len(df.columns))

print('Count of rows in the data is: ', len(df))

output:

Count of columns in the data is: 6

Count of rows in the data is: 241985

# Plot the number of rows per sentiment class
ax = df.groupby('target').count().plot(kind='bar', title='Distribution of data', legend=False)

# Replace the raw target values on the x-axis with readable names
# ('0' is negative; replace '0' and '4' with your actual target values)
labels = [item.get_text() for item in ax.get_xticklabels()]
labels = ['Negative' if label == '0' else 'Positive' for label in labels]
ax.set_xticklabels(labels, rotation=0)

# Separate the tweet text and the sentiment labels
text, sentiment = list(df['text']), list(df['target'])

output:

import seaborn as sns

sns.countplot(x='target', data=df)

output:
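The bar chart code earlier labels every target value other than 0 as "Positive". Sentiment140 encodes polarity as 0 (negative), 2 (neutral) and 4 (positive), so if neutral tweets are present an explicit mapping avoids mislabeling them. The sketch below is one way to do that; the `targets` series is a hypothetical stand-in for `df['target']`.

```python
import pandas as pd

# Sentiment140 polarity codes: 0 = negative, 2 = neutral, 4 = positive.
LABEL_NAMES = {0: "Negative", 2: "Neutral", 4: "Positive"}

# Hypothetical target column standing in for df['target'].
targets = pd.Series([0, 4, 4, 0, 2, 4], name="target")

# Count tweets per class, then swap the numeric codes for readable names.
counts = targets.value_counts().sort_index()
counts.index = counts.index.map(LABEL_NAMES)
print(counts.to_dict())
# prints {'Negative': 2, 'Neutral': 1, 'Positive': 3}
```

The relabeled `counts` series can then be drawn with `counts.plot(kind='bar')`, which uses the index values as tick labels.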
