ChatGPT Twitter Sentiment Analyzer

August 4, 2024

[1]: import pandas as pd


import seaborn as sns
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np
from wordcloud import WordCloud,STOPWORDS
from collections import Counter
%matplotlib inline

[2]: df=pd.read_csv("/kaggle/input/tweets-onchatgpt-chatgpt/ChatGPT.csv")

/tmp/ipykernel_34/299525578.py:1: DtypeWarning: Columns (0,6,9,10) have mixed
types. Specify dtype option on import or set low_memory=False.
  df=pd.read_csv("/kaggle/input/tweets-onchatgpt-chatgpt/ChatGPT.csv")
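The warning is harmless here, but it can be avoided by letting pandas infer dtypes from the whole file rather than chunk by chunk. A minimal sketch, using the same Kaggle path as above (the commented dtype mapping is only an illustrative assumption):

# Sketch: silence the DtypeWarning by disabling chunked type inference.
df = pd.read_csv("/kaggle/input/tweets-onchatgpt-chatgpt/ChatGPT.csv", low_memory=False)
# df = pd.read_csv(..., dtype={"Likes": str})  # alternative: pin specific mixed-type columns (illustrative)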

[3]: mask=np.array(Image.open("/kaggle/input/wordcloud-mask-collection/stormtrooper_mask.png"))

[4]: df=df[: 50000]

[5]: df.head()

[5]: Unnamed: 0 Date \


0 0 2023-04-08 03:33:13+00:00
1 1 2023-04-08 03:32:16+00:00
2 2 2023-04-08 03:32:05+00:00
3 3 2023-04-08 03:31:08+00:00
4 4 2023-04-08 03:30:51+00:00

Tweet \
0 #ChatGPT If only there were a movie or somethi…
1 Workshop Description: In this workshop, you wi…
2 New Post at AiNewsDrop!\n\nChatGPT is a doctor…
3 OpenAI’s GPT-4 Just Got Supercharged! #ai #Cha…
4 "Classical art" is struggling - not changed th…

Url User \
0 https://twitter.com/kirbydjohnson/status/16445… kirbydjohnson

1 https://twitter.com/UniversityWeb/status/16445… UniversityWeb
2 https://twitter.com/AiNewsDrop/status/16445436… AiNewsDrop
3 https://twitter.com/tubeblogger/status/1644543… tubeblogger
4 https://twitter.com/majorradic/status/16445432… majorradic

UserCreated UserVerified UserFollowers UserFriends \


0 2012-07-29 22:57:46+00:00 False 50.0 76.0
1 2013-01-13 20:18:16+00:00 False 9056.0 387.0
2 2022-01-19 06:25:45+00:00 False 276.0 313.0
3 2013-06-25 01:42:59+00:00 False 312.0 293.0
4 2011-12-12 01:29:15+00:00 False 522.0 684.0

Retweets Likes Location \


0 0.0 0 NaN
1 0.0 0 NaN
2 0.0 0 NaN
3 0.0 0 The Internet
4 0.0 2 Beograd Navodni

Description
0 NaN
1 University webinars, speech videos, & news for…
2 I'm up almost 24/7 hunting for AI News\n\nWith…
3 Exploring the Best Of The Web. Tech, Games, Ar…
4 Mediji i milosrdje\n\n …

[6]: df.shape

[6]: (50000, 13)

[7]: df.isnull().sum()

[7]: Unnamed: 0 5
Date 0
Tweet 5
Url 5
User 5
UserCreated 5
UserVerified 5
UserFollowers 5
UserFriends 5
Retweets 5
Likes 5
Location 11851
Description 2190
dtype: int64

[8]: df.dropna(inplace=True)

[9]: df.isnull().sum()

[9]: Unnamed: 0 0
Date 0
Tweet 0
Url 0
User 0
UserCreated 0
UserVerified 0
UserFollowers 0
UserFriends 0
Retweets 0
Likes 0
Location 0
Description 0
dtype: int64

[10]: df.head()

[10]: Unnamed: 0 Date \


3 3 2023-04-08 03:31:08+00:00
4 4 2023-04-08 03:30:51+00:00
5 5 2023-04-08 03:30:00+00:00
6 6 2023-04-08 03:28:59+00:00
7 7 2023-04-08 03:28:31+00:00

Tweet \
3 OpenAI’s GPT-4 Just Got Supercharged! #ai #Cha…
4 "Classical art" is struggling - not changed th…
5 Alibaba invites businesses to trial 'ChatGPT r…
6 Trying to stop students from using #AI and #ch…
7 I Asked ChatGPT's AI Chatbot How Can I Earn Cr…

Url User \
3 https://twitter.com/tubeblogger/status/1644543… tubeblogger
4 https://twitter.com/majorradic/status/16445432… majorradic
5 https://twitter.com/gadgetsnow/status/16445430… gadgetsnow
6 https://twitter.com/Sherab_Taye/status/1644542… Sherab_Taye
7 https://twitter.com/cryptoccentral/status/1644… cryptoccentral

UserCreated UserVerified UserFollowers UserFriends \


3 2013-06-25 01:42:59+00:00 False 312.0 293.0
4 2011-12-12 01:29:15+00:00 False 522.0 684.0
5 2010-07-29 10:32:28+00:00 True 90392.0 56.0
6 2010-05-10 12:40:40+00:00 False 338.0 53.0

7 2023-01-27 00:12:12+00:00 False 1.0 2.0

Retweets Likes Location \


3 0.0 0 The Internet
4 0.0 2 Beograd Navodni
5 0.0 0 New Delhi
6 0.0 0 World
7 0.0 2 United States

Description
3 Exploring the Best Of The Web. Tech, Games, Ar…
4 Mediji i milosrdje\n\n …
5 The official technology news website of The Ti…
6 Musician // Thinker // Metaphysical Dandy // f…
7 Get the latest updates and insights on the cry…

[11]: df.shape

[11]: (37460, 13)

[12]: import re
df['Date'] = pd.to_datetime(df['Date'])
df['New_Date'] = pd.to_datetime(df['Date']).dt.date
df['Time'] = df['Date'].dt.time
df['Hour'] = df['Date'].dt.hour
df['Day'] = df['Date'].dt.strftime('%A')

df['TweetStr'] = df['Tweet'].astype(str)
df['hashtags'] = df['TweetStr'].apply(lambda x: re.findall(r"#(\w+)", x))

[13]: df.drop(columns="Unnamed: 0",axis=1,inplace=True)

[14]: df["New_Date"].value_counts()

[14]: New_Date
2023-03-30 3999
2023-03-31 3948
2023-03-29 3828
2023-03-28 3788
2023-04-04 3357
2023-04-05 3310
2023-04-03 3134
2023-04-06 3087
2023-04-01 2919
2023-04-07 2804
2023-04-02 2461
2023-03-27 616

2023-04-08 209
Name: count, dtype: int64

[15]: len(df["New_Date"].unique())

[15]: 13

[16]: tweet_count=df["New_Date"][: 100000].value_counts().sort_index()


plt.figure(figsize=(15,12))
sns.barplot(x=tweet_count.index, y=tweet_count.values,palette="Set1")
plt.xlabel('Date')
plt.ylabel('Number of Tweets')
plt.title('Number of Tweets per Date')
plt.xticks(rotation=90)
plt.show()

[17]: df["User"].value_counts()

[17]: User
crypaudio 440
torksmith 352
yuhanito 352
JimyTuits 245
MostBay 206

ANPrecision 1
TeddyLumidi 1
stacha_cl 1
AlanAtWTS 1
pritenmodi 1
Name: count, Length: 20809, dtype: int64

[18]: len(df["User"])

[18]: 37460

[19]: plt.figure(figsize=(25,18))
sns.countplot(y="User",data=df[: 100],palette="Set2")
plt.yticks(rotation=45)
plt.show()

[20]: df["UserVerified"].value_counts()

[20]: UserVerified
False 35660
True 1800
Name: count, dtype: int64

[21]: sns.countplot(x="UserVerified",data=df,palette="gnuplot")
plt.show()

[22]: plt.figure(figsize=(20,18))
sns.barplot(df[: 100], y="User", x="UserFollowers",palette="brg")
plt.show()

[23]: plt.figure(figsize=(20,18))
sns.barplot(df[: 100], y="User", x="UserFriends",palette="turbo")
plt.show()

[24]: plt.figure(figsize=(20,18))
sns.barplot(df[: 100], y="User", x="Retweets",palette="rainbow")
plt.show()

[25]: print(df["Likes"].dtype)
df["Likes"] = df["Likes"].astype(int)  # Convert to int
print("=============================================")

object
=============================================

[26]: plt.figure(figsize=(20,18))
sns.barplot(df[: 100], y="User", x="Likes",palette="cubehelix")
plt.show()

[27]: df["Location"].value_counts()

[27]: Location
India 760
Manchester, UK 710
United States 602
Global 523
ChatGPT & AI News 440

LA / NYC / Sometimes SF 1
Crypto-twitter 1
Geneva, Switzerland 1
Northern San Diego County 1
Sea Cliff, NY 1
Name: count, Length: 8634, dtype: int64

[28]: len(df["Location"].unique())

[28]: 8634

[29]: plt.figure(figsize=(20,18))
sns.countplot(y="Location",data=df[: 100],palette="gist_earth")
plt.show()

/opt/conda/lib/python3.10/site-packages/IPython/core/pylabtools.py:152:
UserWarning: Glyph 127759 (\N{EARTH GLOBE ASIA-AUSTRALIA}) missing from current
font.
fig.canvas.print_figure(bytes_io, **kw)

[30]: plt.figure(figsize=(20,18))
sns.barplot(df[: 100], y="User", x="Hour",palette="terrain")
plt.show()

[31]: df["Day"].value_counts()

[31]: Day
Tuesday 7145
Wednesday 7138
Thursday 7086
Friday 6752
Monday 3750
Saturday 3128
Sunday 2461
Name: count, dtype: int64

[32]: plt.figure(figsize=(15,10))
sns.countplot(y="Day",data=df,palette="CMRmap")
plt.show()

[33]: import emoji

def preprocess_word(word):
    word = re.sub(r'(.)\1+', r'\1\1', word)
    word = re.sub(r'(-|\')', '', word)
    return word

def is_valid_word(word):
    return (re.search(r'^[a-zA-Z][a-z0-9A-Z\._]*$', word) is not None)

def handle_emojis(tweet):
    emoji_patterns = {
        r'(:\s?\)|:-\)|\(\s?:|\(-:|:\'\))': ' EMO_POS ',
        r'(:\s?D|:-D|x-?D|X-?D)': ' EMO_POS ',
        r'(<3|:\*)': ' EMO_POS ',
        r'(;-?\)|;-?D|\(-?;)': ' EMO_POS ',
        r'(:\s?\(|:-\(|\)\s?:|\)-:)': ' EMO_NEG ',
        r'(:,\(|:\'\(|:"\()': ' EMO_NEG '
    }
    for pattern, replacement in emoji_patterns.items():
        tweet = re.sub(pattern, replacement, tweet)
    return tweet

def remove_emoji(tweet):
    return emoji.replace_emoji(tweet, replace=" ")

def preprocess_tweet(tweet):
    tweet = tweet.lower()
    tweet = re.sub(r"^\d+\s|\s\d+\s|\s\d+$", " ", tweet)
    tweet = re.sub(r'((www\.[\S]+)|(https?://[\S]+))', '', tweet)
    tweet = re.sub(r'@[\S]+', '', tweet)
    tweet = re.sub(r'#(\S+)', '', tweet)
    tweet = re.sub(r'\brt\b', '', tweet)
    tweet = re.sub(r'\.{2,}', ' ', tweet)
    tweet = tweet.strip(' "\'')
    tweet = remove_emoji(tweet)
    tweet = re.sub(r'\s+', ' ', tweet)
    processed_tweet = [preprocess_word(word) for word in tweet.split()]
    return ' '.join(processed_tweet)

[34]: df["clean_tweet"] = df["Tweet"].apply(preprocess_tweet)

1 Model From Huggingface


https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest
[35]: from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig, pipeline

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

sa = pipeline("sentiment-analysis", tokenizer=tokenizer, model=model)

2024-08-04 08:13:44.295283: E
external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register
cuDNN factory: Attempting to register factory for plugin cuDNN when one has
already been registered
2024-08-04 08:13:44.295396: E
external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register
cuFFT factory: Attempting to register factory for plugin cuFFT when one has
already been registered
2024-08-04 08:13:44.449955: E
external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to
register cuBLAS factory: Attempting to register factory for plugin cuBLAS when
one has already been registered
config.json: 0%| | 0.00/929 [00:00<?, ?B/s]
vocab.json: 0%| | 0.00/899k [00:00<?, ?B/s]

merges.txt: 0%| | 0.00/456k [00:00<?, ?B/s]
special_tokens_map.json: 0%| | 0.00/239 [00:00<?, ?B/s]
pytorch_model.bin: 0%| | 0.00/501M [00:00<?, ?B/s]
/opt/conda/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning:
TypedStorage is deprecated. It will be removed in the future and UntypedStorage
will be the only storage class. This should only matter to you if you are using
storages directly. To access UntypedStorage directly, use
tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-
sentiment-latest were not used when initializing
RobertaForSequenceClassification: ['roberta.pooler.dense.bias',
'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from
the checkpoint of a model trained on another task or with another architecture
(e.g. initializing a BertForSequenceClassification model from a
BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification
from the checkpoint of a model that you expect to be exactly identical
(initializing a BertForSequenceClassification model from a
BertForSequenceClassification model).
Hardware accelerator e.g. GPU is available in the environment, but no `device`
argument is passed to the `Pipeline` object. Model will be on CPU.
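The last warning notes that the pipeline stays on CPU even though an accelerator is available. A minimal sketch of placing it on the GPU instead (device index 0 is an assumption about the runtime):

# Sketch: run the pipeline on the first CUDA device; device=-1 keeps it on CPU.
sa = pipeline("sentiment-analysis", tokenizer=tokenizer, model=model, device=0)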

2 Test Tweet Label & Score


[36]: print(df.loc[12229].Tweet)
print("\n")
print("============================================================")
print("\n")
test=sa(df.loc[12229].Tweet)
print(test)

O Chato GPT #desenho #drawing #draw #characters #humor #natureza #cartum #charge
#vidaurbana #cidade #ilustração #desenho #robo #cerveja #beercast #robot #ia #ai
#chatgpt #inteligenciaartificial https://t.co/EZWMR52u9g

============================================================

[{'label': 'neutral', 'score': 0.8574587106704712}]

[37]: print(df.loc[12229].clean_tweet)
print("\n")
print("============================================================")
print("\n")
test=sa(df.loc[12229].clean_tweet)
print(test)

o chato gpt

============================================================

[{'label': 'neutral', 'score': 0.7420278787612915}]

[38]: print(df.loc[18000].Tweet)
print("\n")
print("============================================================")
print("\n")
test=sa(df.loc[18000].Tweet)
print(test)

We can assure you that #AI won’t replace lawyers because regulators won’t let AI
do all legal tasks without a qualified #lawyer to take responsibility.

#chatgpt #elawcareers https://t.co/cQ01hCWXci

============================================================

[{'label': 'neutral', 'score': 0.6222677230834961}]

[39]: print(df.loc[18000].clean_tweet)
print("\n")
print("============================================================")
print("\n")
test=sa(df.loc[18000].clean_tweet)
print(test)

we can assure you that won’t replace lawyers because regulators won’t let ai do
all legal tasks without a qualified to take responsibility.

============================================================

[{'label': 'neutral', 'score': 0.525272011756897}]

[40]: # df["sentiment_label"] = "-"
# df["sentiment_score"] = -1

# df_part_1 = df[:10000]
# df_part_2 = df[10000:20000]
# df_part_3 = df[20000:30000]
# df_part_4 = df[30000:]

[41]: # from tqdm import tqdm

# tqdm.pandas()

# # Apply sentiment analysis to clean_tweet column for all parts of the DataFrame
# df_part_1["sentiment"] = df_part_1["clean_tweet"].progress_apply(lambda x: sa(x)[0]['label'])
# df_part_1["sentiment_score"] = df_part_1["clean_tweet"].progress_apply(lambda x: sa(x)[0]['score'])

# df_part_2["sentiment"] = df_part_2["clean_tweet"].progress_apply(lambda x: sa(x)[0]['label'])
# df_part_2["sentiment_score"] = df_part_2["clean_tweet"].progress_apply(lambda x: sa(x)[0]['score'])

# df_part_3["sentiment"] = df_part_3["clean_tweet"].progress_apply(lambda x: sa(x)[0]['label'])
# df_part_3["sentiment_score"] = df_part_3["clean_tweet"].progress_apply(lambda x: sa(x)[0]['score'])

# df_part_4["sentiment"] = df_part_4["clean_tweet"].progress_apply(lambda x: sa(x)[0]['label'])
# df_part_4["sentiment_score"] = df_part_4["clean_tweet"].progress_apply(lambda x: sa(x)[0]['score'])

[42]: # df_new = pd.concat([df_part_1, df_part_2, df_part_3, df_part_4], ignore_index=True)

# df_new.to_csv("tweet_gpt.csv", index=False)
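The commented-out cells above score tweets one at a time with progress_apply, which is the slowest way to drive the pipeline. A sketch of a batched alternative with the same sa pipeline, kept commented out like the cells above since the labelled CSV is simply reloaded in the next cell (batch_size and truncation values are illustrative assumptions):

# Sketch: batched inference instead of row-by-row apply.
# texts = df["clean_tweet"].tolist()
# results = sa(texts, batch_size=64, truncation=True)   # list of {'label': ..., 'score': ...}
# df["sentiment"] = [r["label"] for r in results]
# df["sentiment_score"] = [r["score"] for r in results]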

[43]: df1=pd.read_csv("/kaggle/input/tweet-gpt/tweet_gpt.csv")

[44]: df1.columns

[44]: Index(['Date', 'Tweet', 'Url', 'User', 'UserCreated', 'UserVerified',
       'UserFollowers', 'UserFriends', 'Retweets', 'Likes', 'Location',
       'Description', 'New_Date', 'Time', 'Hour', 'Day', 'TweetStr',
       'hashtags', 'clean_tweet', 'sentiment_label', 'sentiment_score',
       'sentiment'],
      dtype='object')

[45]: df1.drop(columns=["Url","Tweet","User","UserCreated","sentiment_label"],axis=1,inplace=True)

[46]: df1["sentiment"].value_counts()

[46]: sentiment
neutral 19218
positive 11705
negative 6537
Name: count, dtype: int64

[47]: plt.figure(figsize=(10,8))
sns.countplot(y="sentiment",data=df1,palette="gist_rainbow")
plt.show()

[48]: sentiment_counts = df1.groupby(['New_Date', 'sentiment']).size().reset_index(name='Count')

plt.figure(figsize=(12, 6))
sns.barplot(data=sentiment_counts, x='New_Date', y='Count', hue='sentiment')
plt.title('Sentiment Label Distribution Over Time')
plt.xlabel('Date')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

[49]: df1.drop(columns=["Date","UserVerified","UserFollowers","UserFriends","Retweets","Likes","Location"],axis=1,inplace=True)

[50]: df1.drop(columns=["Time","Hour","Day","TweetStr","hashtags","sentiment_score"],axis=1,inplace=True)

[51]: from bs4 import BeautifulSoup
import re
import string
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')

def clean_text(text):
    # Remove HTML tags
    text = BeautifulSoup(text, 'html.parser').get_text()

    # Remove special characters and digits
    text = re.sub(r"[^a-zA-Z]", " ", text)

    # Remove punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))

    # Remove emojis
    emoji_pattern = re.compile("["
                               u"\U0001F600-\U0001F64F"  # emoticons
                               u"\U0001F300-\U0001F5FF"  # symbols & pictographs
                               u"\U0001F680-\U0001F6FF"  # transport & map symbols
                               u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                               u"\U00002702-\U000027B0"
                               u"\U000024C2-\U0001F251"
                               "]+", flags=re.UNICODE)
    text = emoji_pattern.sub(r'', text)

    # Convert to lowercase
    text = text.lower()

    # Remove stop words
    stop_words = set(stopwords.words('english'))
    tokens = nltk.word_tokenize(text)
    tokens = [word for word in tokens if word not in stop_words]

    return ' '.join(tokens)

[nltk_data] Downloading package stopwords to /usr/share/nltk_data…


[nltk_data] Package stopwords is already up-to-date!

[52]: df1["clean_description"] = df["Description"].apply(clean_text)

/tmp/ipykernel_34/1709735225.py:11: MarkupResemblesLocatorWarning: The input


looks more like a filename than markup. You may want to open this file and pass
the filehandle into Beautiful Soup.
text = BeautifulSoup(text, 'html.parser').get_text()
/tmp/ipykernel_34/1709735225.py:11: MarkupResemblesLocatorWarning: The input
looks more like a URL than markup. You may want to use an HTTP client like
requests to get the document behind the URL, and feed that document to Beautiful
Soup.
text = BeautifulSoup(text, 'html.parser').get_text()

[53]: df1.head()

[53]: Description \
0 Exploring the Best Of The Web. Tech, Games, Ar…
1 Mediji i milosrdje\n\n …

2 The official technology news website of The Ti…
3 Musician // Thinker // Metaphysical Dandy // f…
4 Get the latest updates and insights on the cry…

clean_tweet sentiment \
0 openai’s gpt4 just got supercharged! positive
1 classical art" is struggling not changed the … neutral
2 alibaba invites businesses to trial chatgpt ri… neutral
3 trying to stop students from using and is like… negative
4 i asked chatgpts ai chatbot how can i earn cry… neutral

clean_description
0 NaN
1 NaN
2 NaN
3 exploring best web tech games art weirdtubes c…
4 mediji milosrdje moralno preispitivanje sport

[54]: df1=df1[["clean_tweet","clean_description","sentiment"]]

[55]: df1.isnull().sum()

[55]: clean_tweet 1
clean_description 9474
sentiment 0
dtype: int64

[56]: df1['clean_description'].fillna(df1['clean_tweet'], inplace=True)

/tmp/ipykernel_34/652614557.py:1: FutureWarning: A value is trying to be set on


a copy of a DataFrame or Series through chained assignment using an inplace
method.
The behavior will change in pandas 3.0. This inplace method will never work
because the intermediate object on which we are setting values always behaves as
a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using


'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value)
instead, to perform the operation inplace on the original object.

df1['clean_description'].fillna(df1['clean_tweet'], inplace=True)

[57]: df1['clean_tweet'].fillna(df1['clean_description'], inplace=True)

/tmp/ipykernel_34/2763964737.py:1: FutureWarning: A value is trying to be set on


a copy of a DataFrame or Series through chained assignment using an inplace
method.

The behavior will change in pandas 3.0. This inplace method will never work
because the intermediate object on which we are setting values always behaves as
a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using


'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value)
instead, to perform the operation inplace on the original object.

df1['clean_tweet'].fillna(df1['clean_description'], inplace=True)

[58]: df1.head()

[58]: clean_tweet \
0 openai’s gpt4 just got supercharged!
1 classical art" is struggling not changed the …
2 alibaba invites businesses to trial chatgpt ri…
3 trying to stop students from using and is like…
4 i asked chatgpts ai chatbot how can i earn cry…

clean_description sentiment
0 openai’s gpt4 just got supercharged! positive
1 classical art" is struggling not changed the … neutral
2 alibaba invites businesses to trial chatgpt ri… neutral
3 exploring best web tech games art weirdtubes c… negative
4 mediji milosrdje moralno preispitivanje sport neutral

[59]: df1.isnull().sum()

[59]: clean_tweet 0
clean_description 0
sentiment 0
dtype: int64

[60]: df1['text'] = df1['clean_tweet'] + ' ' + df1['clean_description']

[61]: df1.head()

[61]: clean_tweet \
0 openai’s gpt4 just got supercharged!
1 classical art" is struggling not changed the …
2 alibaba invites businesses to trial chatgpt ri…
3 trying to stop students from using and is like…
4 i asked chatgpts ai chatbot how can i earn cry…

clean_description sentiment \
0 openai’s gpt4 just got supercharged! positive

1 classical art" is struggling not changed the … neutral
2 alibaba invites businesses to trial chatgpt ri… neutral
3 exploring best web tech games art weirdtubes c… negative
4 mediji milosrdje moralno preispitivanje sport neutral

text
0 openai’s gpt4 just got supercharged! openai’s …
1 classical art" is struggling not changed the …
2 alibaba invites businesses to trial chatgpt ri…
3 trying to stop students from using and is like…
4 i asked chatgpts ai chatbot how can i earn cry…

[62]: df1.drop(columns=["clean_tweet","clean_description"],axis=1,inplace=True)

[63]: df1.rename(columns={'sentiment': 'label'}, inplace=True)

[64]: df1=df1[["text","label"]]

[65]: df1.head()

[65]: text label


0 openai’s gpt4 just got supercharged! openai’s … positive
1 classical art" is struggling not changed the … neutral
2 alibaba invites businesses to trial chatgpt ri… neutral
3 trying to stop students from using and is like… negative
4 i asked chatgpts ai chatbot how can i earn cry… neutral

[66]: df1.shape

[66]: (37460, 2)

[67]: df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 37460 entries, 0 to 37459
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 text 37460 non-null object
1 label 37460 non-null object
dtypes: object(2)
memory usage: 585.4+ KB

[68]: negative_len=df1[df1["label"]=="negative"]["text"].str.len()
positive_len=df1[df1["label"]=="positive"]["text"].str.len()
neutral_len=df1[df1["label"]=="neutral"]["text"].str.len()
plt.figure(figsize=(10, 8))

plt.hist(negative_len, bins=30, label='Negative', color="black",alpha=0.9)
plt.hist(positive_len, bins=30, label='Positive', color="green",alpha=0.9)
plt.hist(neutral_len, bins=30, label='Neutral', color="darkorange",alpha=0.9)
plt.title("Length Distribution of Text Data by Sentiment", fontsize=16)
plt.xlabel("Length of Text", fontsize=14)
plt.ylabel("Frequency", fontsize=14)
plt.legend()
plt.grid(True)
plt.show()

3 Positive data wordcloud


[69]: plt.figure(figsize=(15,15))
positive_wordcloud=df1[df1["label"]=="positive"]
positive_text=" ".join(positive_wordcloud['text'].values.tolist())
wordcloud = WordCloud(width=800, height=800, stopwords=STOPWORDS, background_color='black',
                      max_words=800, colormap="brg", mask=mask).generate(positive_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

4 Negative data wordcloud


[70]: plt.figure(figsize=(15,15))
negative_wordcloud=df1[df1["label"]=="negative"]
negative_text=" ".join(negative_wordcloud['text'].values.tolist())
wordcloud = WordCloud(width=800, height=800, stopwords=STOPWORDS, background_color='black',
                      max_words=800, colormap="gnuplot2", mask=mask).generate(negative_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

5 Neutral Data wordcloud
[71]: plt.figure(figsize=(15,15))
neutral_wordcloud=df1[df1["label"]=="neutral"]
neutral_text=" ".join(neutral_wordcloud['text'].values.tolist())
wordcloud = WordCloud(width=800, height=800, stopwords=STOPWORDS, background_color='black',
                      max_words=800, colormap="nipy_spectral", mask=mask).generate(neutral_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

6 40 Most common Words From Positive text
[72]: positive_text = df1[df1["label"] == "positive"]
data_set = positive_text["text"].str.split()
all_words = [word for sublist in data_set for word in sublist]
counter = Counter(all_words)
common_words = counter.most_common(40)
df_common_words = pd.DataFrame(common_words, columns=['Word', 'Count'])

colors = ["cyan", "lime", "magenta", "gold", "purple", "tomato", "teal", "sandybrown",
          "mediumseagreen", "royalblue", "darkorchid", "darkturquoise", "darkgoldenrod",
          "mediumvioletred", "mediumaquamarine", "lightcoral", "darkslategray", "olivedrab",
          "dodgerblue", "indianred", "limegreen", "steelblue", "darkviolet", "chocolate",
          "mediumslateblue", "darkgreen", "orangered", "mediumblue", "peru", "mediumspringgreen"]

plt.figure(figsize=(15,10))
sns.barplot(x='Count', y='Word', data=df_common_words, palette=colors)
plt.title('40 Most Common Words in Positive Texts')
plt.xlabel('Count')
plt.ylabel('Word')
plt.show()

7 40 Most common Words From Negative text


[73]: negative_text = df1[df1["label"] == "negative"]
data_set = negative_text["text"].str.split()
all_words = [word for sublist in data_set for word in sublist]
counter = Counter(all_words)
common_words = counter.most_common(40)
df_common_words = pd.DataFrame(common_words, columns=['Word', 'Count'])

colors = ["cyan", "lime", "magenta", "gold", "purple", "tomato", "teal", "sandybrown",
          "mediumseagreen", "royalblue", "darkorchid", "darkturquoise", "darkgoldenrod",
          "mediumvioletred", "mediumaquamarine", "lightcoral", "darkslategray", "olivedrab",
          "dodgerblue", "indianred", "limegreen", "steelblue", "darkviolet", "chocolate",
          "mediumslateblue", "darkgreen", "orangered", "mediumblue", "peru", "mediumspringgreen",
          "coral", "darkcyan", "deeppink", "yellow", "blue", "orange", "lightgreen", "navy",
          "darkred", "skyblue", "lime", "orchid", "silver", "gold"]

plt.figure(figsize=(15,10))
sns.barplot(x='Count', y='Word', data=df_common_words, palette=colors)
plt.title('40 Most Common Words in Negative Texts')
plt.xlabel('Count')
plt.ylabel('Word')
plt.show()

8 40 Most common Words From Neutral text
[74]: neutral_text = df1[df1["label"] == "neutral"]
data_set = neutral_text["text"].str.split()
all_words = [word for sublist in data_set for word in sublist]
counter = Counter(all_words)
common_words = counter.most_common(40)
df_common_words = pd.DataFrame(common_words, columns=['Word', 'Count'])

colors = ["cyan", "lime", "magenta", "gold", "purple", "tomato", "teal", "sandybrown",
          "mediumseagreen", "royalblue", "darkorchid", "darkturquoise", "darkgoldenrod",
          "mediumvioletred", "mediumaquamarine", "lightcoral", "darkslategray", "olivedrab",
          "dodgerblue", "indianred", "limegreen", "steelblue", "darkviolet", "chocolate",
          "mediumslateblue", "darkgreen", "orangered", "mediumblue", "peru", "mediumspringgreen",
          "coral", "darkcyan", "deeppink", "yellow", "blue", "orange", "lightgreen", "navy",
          "darkred", "skyblue", "lime", "orchid", "silver", "gold"]

plt.figure(figsize=(15,10))
sns.barplot(x='Count', y='Word', data=df_common_words, palette=colors)
plt.title('40 Most Common Words in Neutral Texts')
plt.xlabel('Count')
plt.ylabel('Word')
plt.show()

9 Topic Modeling
[75]: from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

vectorizer = CountVectorizer(max_df=0.95, min_df=2, stop_words='english')
X = vectorizer.fit_transform(df1['text'])

lda = LatentDirichletAllocation(n_components=5, random_state=42)
lda.fit(X)

topic_names = ["Technology", "Business", "Language Processing", "Privacy and Security", "Technical Details"]

print("Top words for each topic:")
feature_names = vectorizer.get_feature_names_out()
for topic_idx, topic in enumerate(lda.components_):
    top_words = [feature_names[i] for i in topic.argsort()[:-11:-1]]
    print(f"Topic {topic_idx + 1} ({topic_names[topic_idx]}): {', '.join(top_words)}")

Top words for each topic:
Topic 1 (Technology): ai, amp, chatgpt, https, help, tech, using, new, people, work
Topic 2 (Business): chatgpt, ai, marketing, business, tools, digital, data, technology, world, web
Topic 3 (Language Processing): ai, chatgpt, like, world, prompt, language, just, asked, intelligence, time
Topic 4 (Privacy and Security): https, ai, data, italy, privacy, chatgpt, news, business, concerns, openai
Topic 5 (Technical Details): gt, link, chatgpt, tech, https, use, details, ai, visit, available
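The five topic names above are assigned by hand from the top words. If a per-tweet topic is wanted, a short sketch using the fitted lda and the document-term matrix X from the cell above (computed locally, without modifying df1):

# Sketch: dominant LDA topic for each tweet.
doc_topic = lda.transform(X)                # shape (n_tweets, 5); each row sums to 1
dominant_topic = doc_topic.argmax(axis=1)   # index of the strongest topic per tweet
pd.Series(dominant_topic).map(lambda i: topic_names[i]).value_counts()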

[76]: df1['number_character']=df1['text'].apply(len)
df1['number_of_words'] = df1['text'].apply(lambda x: len(x.split()))
df1['number_sentence'] = df1['text'].apply(lambda x: len(nltk.sent_tokenize(x)))

[77]: plt.figure(figsize=(15,12))

columns_to_plot = ['number_character', 'number_of_words', 'number_sentence']

for i, column in enumerate(columns_to_plot, start=1):
    plt.subplot(3, 1, i)
    ax = sns.barplot(data=df1, x='label', y=column, palette="gnuplot2")
    plt.ylabel(f'{column.replace("_", " ").capitalize()}')
    for p in ax.patches:
        ax.annotate(format(p.get_height(), '.2f'),
                    (p.get_x() + p.get_width() / 2., p.get_height()),
                    ha='center', va='center',
                    xytext=(0, 10),
                    textcoords='offset points')

plt.tight_layout()
plt.show()

10 Average Text Length
[78]: avg_len=df1["text"].apply(len)
avg_len=avg_len.mean()
print(f"Average Text Length is : {avg_len:.2f}")

Average Text Length is : 221.73

[79]: df1.drop(columns=["number_character","number_of_words","number_sentence"],axis=1,inplace=True)

[80]: df1["label"]=df1["label"].replace({"positive":0,"neutral":1,"negative":2})

/tmp/ipykernel_34/2065424062.py:1: FutureWarning: Downcasting behavior in


`replace` is deprecated and will be removed in a future version. To retain the
old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to
the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
df1["label"]=df1["label"].replace({"positive":0,"neutral":1,"negative":2})

[81]: label_name=["Positive","Neutral","Negative"]

[82]: df1.head()

[82]: text label


0 openai’s gpt4 just got supercharged! openai’s … 0
1 classical art" is struggling not changed the … 1
2 alibaba invites businesses to trial chatgpt ri… 1
3 trying to stop students from using and is like… 2
4 i asked chatgpts ai chatbot how can i earn cry… 1

[83]: df1.shape

[83]: (37460, 2)

[84]: vocab_size=30000
max_len=200

[85]: from tensorflow.keras.preprocessing.text import one_hot


texts = df1['text']
one_hot_texts = [one_hot(text,vocab_size) for text in texts]
df1['text'] = one_hot_texts

[86]: df1.head()

[86]: text label


0 [17938, 28515, 23139, 29673, 9473, 17938, 2851… 0
1 [27980, 21069, 23588, 16033, 2742, 12471, 2746… 1
2 [12702, 25466, 6541, 15189, 7067, 16205, 24079… 1
3 [11407, 15189, 27714, 5797, 25951, 9998, 19950… 2
4 [567, 29957, 8956, 7986, 23894, 6317, 7804, 56… 1

[87]: from tensorflow.keras.preprocessing.sequence import pad_sequences


my_onehot_data=df1["text"]
embedded_doc=pad_sequences(my_onehot_data,padding='pre',maxlen=max_len)
embedded_doc

[87]: array([[ 0, 0, 0, …, 23139, 29673, 9473],


[ 0, 0, 0, …, 769, 2476, 24611],
[ 0, 0, 0, …, 7986, 23894, 7923],
…,
[ 0, 0, 0, …, 28519, 7285, 27211],
[ 0, 0, 0, …, 20039, 15852, 10511],
[ 0, 0, 0, …, 11548, 28784, 13348]], dtype=int32)
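Worth noting: one_hot is a hashing trick, not a fitted vocabulary; each word is hashed to an integer below vocab_size, so different words can collide and the indices are not stable across runs. pad_sequences then left-pads ('pre') each encoded tweet with zeros up to max_len. A toy sketch (the printed integers are hash-dependent and purely illustrative):

# Sketch: hashing-based encoding plus pre-padding on a toy sentence.
sample = ["chatgpt is great great"]
encoded = [one_hot(s, vocab_size) for s in sample]   # the repeated word gets the same index
padded = pad_sequences(encoded, padding='pre', maxlen=10)
print(encoded, padded, sep="\n")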

[88]: text_data=np.array(embedded_doc)
y=df1["label"]
label_data=np.array(y)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(text_data, label_data, test_size=0.2, random_state=42)
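Because the classes are imbalanced (about 19.2k neutral, 11.7k positive and 6.5k negative), a stratified split would keep those proportions identical in train and test; a minimal variant of the split above (not the split used for the results that follow):

# Sketch: stratified 80/20 split so all three labels keep their class ratios.
X_train, X_test, y_train, y_test = train_test_split(
    text_data, label_data, test_size=0.2, random_state=42, stratify=label_data)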

[89]: from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense, BatchNormalization, GRU
from tensorflow.keras.optimizers import Adam

embedding_dim = 100
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_shape=(max_len,)))
model.add(GRU(units=512, activation="tanh", return_sequences=True))
model.add(Dropout(0.5))
model.add(GRU(units=256, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(units=256))
model.add(Dense(units=3, activation='softmax'))

learning_rate = 0.0001
optimizer = Adam(learning_rate=learning_rate)

model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.summary()

/opt/conda/lib/python3.10/site-packages/keras/src/layers/core/embedding.py:93:
UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When
using Sequential models, prefer using an `Input(shape)` object as the first
layer in the model instead.
super().__init__(**kwargs)
Model: "sequential"

 Layer (type)               Output Shape              Param #
 embedding (Embedding)      (None, 200, 100)          3,000,000
 gru (GRU)                  (None, 200, 512)          943,104
 dropout (Dropout)          (None, 200, 512)          0
 gru_1 (GRU)                (None, 256)               591,360
 dropout_1 (Dropout)        (None, 256)               0
 dense (Dense)              (None, 256)               65,792
 dense_1 (Dense)            (None, 3)                 771

Total params: 4,601,027 (17.55 MB)
Trainable params: 4,601,027 (17.55 MB)
Non-trainable params: 0 (0.00 B)

[90]: from keras.utils import plot_model

plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

[90]: (model architecture diagram, saved to model_plot.png)
[91]: from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
early_stopping = EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=2, min_lr=0.0001)
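One caveat: min_lr=0.0001 equals the Adam learning rate chosen in the model cell above, so this ReduceLROnPlateau can never actually lower the rate. A sketch with room to decay (the floor value is an illustrative assumption):

# Sketch: give the scheduler headroom below the starting learning rate of 1e-4.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=2, min_lr=1e-6)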

[92]: history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=40, batch_size=16, callbacks=[early_stopping, reduce_lr])

Epoch 1/40
1873/1873 ━━━━━━━━━━━━━━━━━━━━ 439s 232ms/step - accuracy: 0.5246 - loss: 0.9971 - val_accuracy: 0.6137 - val_loss: 0.8954 - learning_rate: 1.0000e-04
Epoch 2/40
1873/1873 ━━━━━━━━━━━━━━━━━━━━ 436s 233ms/step - accuracy: 0.6666 - loss: 0.7830 - val_accuracy: 0.6849 - val_loss: 0.7147 - learning_rate: 1.0000e-04
Epoch 3/40
1873/1873 ━━━━━━━━━━━━━━━━━━━━ 435s 232ms/step - accuracy: 0.7856 - loss: 0.5333 - val_accuracy: 0.6913 - val_loss: 0.7522 - learning_rate: 1.0000e-04
Epoch 4/40
1873/1873 ━━━━━━━━━━━━━━━━━━━━ 435s 232ms/step - accuracy: 0.8400 - loss: 0.4096 - val_accuracy: 0.6995 - val_loss: 0.7919 - learning_rate: 1.0000e-04

[93]: fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10,5))

# First subplot
ax[0].plot(history.history['accuracy'], label="Accuracy", color="red")
ax[0].plot(history.history['val_accuracy'], label="Validation Accuracy", color="green")
ax[0].set_title('Model Accuracy')
ax[0].set_ylabel('Accuracy')
ax[0].set_xlabel('Epoch')
ax[0].legend(loc='best')

# Second subplot
ax[1].plot(history.history['loss'], label="Loss", color="blue")
ax[1].plot(history.history['val_loss'], label="Validation Loss", color="black")
ax[1].set_title('Model Loss')
ax[1].set_ylabel('Loss')
ax[1].set_xlabel('Epoch')
ax[1].legend(loc='best')

plt.show()

[94]: pred=model.predict(X_test)
final_pred=np.argmax(pred,axis=1)

235/235 ━━━━━━━━━━━━━━━━━━━━ 20s 86ms/step

[95]: from sklearn.metrics import roc_curve, classification_report, confusion_matrix

[96]: print(classification_report(y_test,final_pred,target_names=label_name))

              precision    recall  f1-score   support

    Positive       0.70      0.73      0.71      2401
     Neutral       0.71      0.75      0.73      3798
    Negative       0.54      0.40      0.46      1293

    accuracy                           0.68      7492
   macro avg       0.65      0.63      0.64      7492
weighted avg       0.68      0.68      0.68      7492

[97]: cf=confusion_matrix(y_test,final_pred)
plt.figure(figsize=(10,8))
sns.heatmap(cf, annot=True, cmap="gnuplot", fmt="d", xticklabels=label_name, yticklabels=label_name)
plt.title('Confusion Matrix')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.show()

[98]: import scikitplot as skplt

y_probas = model.predict(X_test)
skplt.metrics.plot_roc(y_test, y_probas, figsize=(10,8))
plt.show()

235/235 ━━━━━━━━━━━━━━━━━━━━ 20s 85ms/step

[99]: skplt.metrics.plot_precision_recall(y_test, y_probas,figsize=(10,8))
plt.show()

[100]: from sklearn.metrics import roc_auc_score
roc_auc = roc_auc_score(y_test, pred, multi_class='ovr')
plt.plot([])
plt.text(0, 0, f'ROC AUC Score: {roc_auc:.4f}', fontsize=16, ha='center', va='center', color="indigo")
plt.axis('off')
plt.xlim(-1, 1)
plt.ylim(-1, 1)

plt.show()

[101]: from sklearn.metrics import log_loss, cohen_kappa_score, matthews_corrcoef
logarithm_loss = log_loss(y_test, pred)
plt.plot([])
plt.text(0, 0, f'Log Loss: {logarithm_loss:.4f}', fontsize=16, ha='center', va='center', color="black")
plt.axis('off')

# Set the axis limits
plt.xlim(-1, 1)
plt.ylim(-1, 1)

plt.show()

[102]: kappa = cohen_kappa_score(y_test, final_pred)
plt.plot([])
plt.text(0, 0, f'Cohen Kappa Score: {kappa:.4f}', fontsize=16, ha='center', va='center', color="orangered")
plt.axis('off')

# Set the axis limits
plt.xlim(-1, 1)
plt.ylim(-1, 1)

plt.show()

[103]: mcc = matthews_corrcoef(y_test, final_pred)
plt.plot([])
plt.text(0, 0, f'Matthews Correlation Coefficient: {mcc:.4f}', fontsize=16, ha='center', va='center', color="saddlebrown")
plt.axis('off')
plt.xlim(-1, 1)
plt.ylim(-1, 1)

plt.show()

[104]: from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_test, final_pred)

plt.plot([])
plt.text(0, 0, f'Accuracy Score: {accuracy:.4f}', fontsize=16, ha='center', va='center', color="saddlebrown")
plt.axis('off')
plt.xlim(-1, 1)
plt.ylim(-1, 1)

plt.show()

11 Custom Data prediction
[105]: custom_text_data = []
custom_data = ["Great job! Your model achieved an accuracy score of"]
custom_text_data.extend(custom_data)

combined_onehot_data = [one_hot(text, vocab_size) for text in custom_text_data]

combined_padded_data = pad_sequences(combined_onehot_data, padding='pre', maxlen=max_len)

custom_padded_data = combined_padded_data[-len(custom_data):]

custom_pred = model.predict(custom_padded_data)

custom_final_pred = np.argmax(custom_pred, axis=1)

for p in custom_final_pred:
    if p == 0:
        print("Positive")
    elif p == 1:
        print("Neutral")
    elif p == 2:
        print("Negative")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 106ms/step
Positive
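The next two cells repeat the same encode, pad and predict steps on different inputs; a small helper built from the objects already defined above (a sketch, not part of the original notebook) would remove the duplication:

# Sketch: reusable wrapper around the custom-text prediction steps.
def predict_sentiment(texts):
    encoded = [one_hot(t, vocab_size) for t in texts]
    padded = pad_sequences(encoded, padding='pre', maxlen=max_len)
    preds = np.argmax(model.predict(padded), axis=1)
    return [label_name[p] for p in preds]   # label_name = ["Positive", "Neutral", "Negative"]

predict_sentiment(["Great job! Your model achieved an accuracy score of"])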

[106]: custom_text_data = []
custom_data = ["The movie I watched last night was a complete disaster. The plot was confusing, the acting was terrible, and the special effects were laughable. I wasted two hours of my life and I wouldn't recommend it to anyone."]
custom_text_data.extend(custom_data)

combined_onehot_data = [one_hot(text, vocab_size) for text in custom_text_data]

combined_padded_data = pad_sequences(combined_onehot_data, padding='pre', maxlen=max_len)

custom_padded_data = combined_padded_data[-len(custom_data):]

custom_pred = model.predict(custom_padded_data)

custom_final_pred = np.argmax(custom_pred, axis=1)

for p in custom_final_pred:
    if p == 0:
        print("Positive")
    elif p == 1:
        print("Neutral")
    elif p == 2:
        print("Negative")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step
Positive

[107]: custom_text_data = []
custom_data = ["I went to the grocery store and bought some fruits and vegetables. Then, I stopped by the bank to deposit a check. After that, I headed home to prepare dinner."]
custom_text_data.extend(custom_data)

combined_onehot_data = [one_hot(text, vocab_size) for text in custom_text_data]

combined_padded_data = pad_sequences(combined_onehot_data, padding='pre', maxlen=max_len)

custom_padded_data = combined_padded_data[-len(custom_data):]

custom_pred = model.predict(custom_padded_data)

custom_final_pred = np.argmax(custom_pred, axis=1)

for p in custom_final_pred:
    if p == 0:
        print("Positive")
    elif p == 1:
        print("Neutral")
    elif p == 2:
        print("Negative")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step
Neutral

[ ]:
