Mini Project Report

This document presents a Twitter sentiment analysis mini-project. The project analyzes tweets to determine sentiment. It includes 5 modules: 1) Streaming tweets from Twitter, 2) Using cursors and pagination to retrieve tweets, 3) Analyzing tweet data, 4) Visualizing tweet data, and 5) Performing sentiment analysis on the tweets to classify them as positive, negative, or neutral. The project aims to understand public opinion by analyzing large amounts of sentiment-rich Twitter data.

Uploaded by

Sree pujitha Doppalapudi

(AUTONOMOUS)

MINI-PROJECT REPORT

DEPARTMENT OF INFORMATION TECHNOLOGY

SECOND YEAR

IV SEMESTER

TWITTER SENTIMENT ANALYSIS

PRESENTED BY:

B.V.L.Pravallika - 19NG1A1207
D.Sree pujitha – 19NG1A1216
P.Sri harika – 19NG1A1246
V.Mounika – 19NG1A1257
ACKNOWLEDGEMENT

I feel privileged to thank our Chairman sir, Director sir, Principal sir and our HOD
sir for their support. My sincere thanks to the faculty members, and I also thank my team members for
their cooperation in making this mini project a success.

I also thank our parents and our dear friends for their help and support.
ABSTRACT

Sentiment analysis deals with identifying and classifying opinions or sentiments
expressed in source text. Social media generates a vast amount of sentiment-rich data in the form
of tweets, status updates, blog posts, etc. Sentiment analysis of this user-generated data is very
useful for gauging the opinion of the crowd. Twitter sentiment analysis is more difficult than general
sentiment analysis because of slang words and misspellings. Tweets are also short: the limit was
originally 140 characters and has been 280 since 2017.

The knowledge-base approach and the machine-learning approach are the two strategies used
for analyzing sentiment in text. By doing sentiment analysis in a specific domain, it is
possible to identify the effect of domain information on sentiment classification. We present a new
feature vector for classifying tweets as positive or negative and for extracting people's opinions
about products.

Social media have received more attention in recent years. Public and private opinions
about a wide variety of subjects are expressed and spread continually via numerous social media.
Twitter is one of the social media platforms that is gaining popularity. Twitter offers organizations
a fast and effective way to analyze customers' perspectives, which are critical to success in the
marketplace. This report describes the design of a sentiment analysis that extracts a vast number of tweets.
TABLE OF CONTENTS

SNO CONTENTS

1 Introduction
2 Software and Hardware Requirements
3 Algorithm Specification
4 Advantages and Disadvantages
5 Applications
6 Streaming Tweets
7 Cursor and Pagination
8 Analysing Tweet Data
9 Visualizing Tweet Data
10 Sentimental analysis of Tweet Data
11 Outputs
12 Conclusion
13 References
INTRODUCTION
As the internet grows, its horizons widen. Social media and microblogging
platforms like Facebook and Twitter dominate in spreading encapsulated news and trending topics
across the globe at a rapid pace. A topic becomes trending when more and more users contribute
their opinions and judgements, thereby making it a valuable source of online perception. These topics
are generally intended to spread awareness or to promote public figures, political campaigns during
elections, product endorsements, and entertainment such as movies and award shows.

Large organizations and firms take advantage of people's feedback to improve their products and
services, which in turn helps them enhance their marketing strategies. One example is leaking
pictures of an upcoming iPhone to create hype, gauge people's emotions and market the product
before its release. Thus, there is huge potential for discovering and analysing interesting patterns
in the vast amount of social media data for business-driven applications.

Software Requirement Analysis

A) Problem
Sentiment analysis is the prediction of emotions in a word, sentence or corpus of
documents. It is intended to serve as an application to understand the attitudes, opinions and
emotions expressed within an online mention. The intention is to gain an overview of the wider
public opinion behind certain topics. More precisely, it is a paradigm for categorizing conversations
into positive, negative or neutral labels. Many people use social media sites to network with other
people and to stay up to date with news and current events. These sites (Twitter, Facebook,
Instagram, Google+) offer people a platform to voice their opinions. For example, people quickly
post their reviews online as soon as they watch a movie and then start a series of comments to
discuss the acting in the movie. This kind of information forms a basis for
people to evaluate and rate the performance not only of a movie but of other products, and to
judge whether it will be a success or not. The vast amount of information on these sites can be used
for marketing and social studies. Therefore, sentiment analysis has wide applications, which include
emotion mining, polarity classification and influence analysis.

Twitter is an online networking site driven by tweets, which are short, character-limited
messages (originally 140 characters, 280 since 2017). This limit encourages the use of hashtags for
text classification. Currently around 6500 tweets are published per second, which results in
approximately 561.6 million tweets per day. These streams of tweets are generally noisy, reflecting
multi-topic, changing attitudes in an unfiltered and unstructured format. Twitter sentiment analysis
involves the use of natural language processing to extract, identify and characterize the sentiment
content. Sentiment analysis is often carried out at two levels: 1) coarse level and 2) fine level. At
the coarse level, entire documents are analysed, while at the fine level, individual attributes are
analysed. The sentiments present in the text are of two types: direct and comparative. In comparative
sentiments, objects in the same sentence are compared with one another, while in direct sentiments,
objects are independent of one another in the same sentence.
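
The daily volume quoted above follows directly from the per-second rate. A quick check of the arithmetic, taking the report's figure of 6500 tweets per second as given (the variable names are illustrative):

```python
# Rate quoted in the report; not measured here.
TWEETS_PER_SECOND = 6500
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

tweets_per_day = TWEETS_PER_SECOND * SECONDS_PER_DAY
print(f"{tweets_per_day / 1e6:.1f} million tweets per day")
```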

However, analysing the sentiment expressed in tweets is not an easy job. Many
challenges are involved in terms of the tonality, polarity, lexicon and grammar of the tweets. They
tend to be highly unstructured and non-grammatical, which makes their meaning difficult to interpret.
Moreover, slang words, acronyms and out-of-vocabulary words are quite common in tweets. Categorizing
such words by polarity is hard for the natural language processors involved.
The rest of this project report is structured as follows. Section II details the software and
hardware requirements. Section III covers the methodology and implementation of the project.
Finally, Section VI concludes the report.

B) Modules and Functionalities

This project consists of five modules. They are:

MODULE 1 : Streaming Tweets

In this step, we call the API by performing authentication and stream the (unfiltered) data from
the requested Twitter account.

Step 1: Connect to the API.

Step 2: Authenticate and request the stream.

Step 3: Get the response.

MODULE 2 : Cursor and Pagination

Pagination is a technique for breaking a large amount of data into smaller portions called pages.
The Twitter standard APIs use a technique called cursoring to paginate large result sets. In short,
the cursor handles pagination for us, so that we only have to specify the number of tweets we want.

Step 1: Connect to the API and import Cursor from tweepy.

Step 2: Access the user timeline tweets using the Twitter client (e.g. pycon).

(Returns a collection of the most recent tweets posted by the user indicated by the screen name or
user id parameter.)
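
Cursoring can be pictured without touching the Twitter API at all. The sketch below is a minimal, self-contained illustration: `fetch_page` is a hypothetical stand-in for one API call that returns a page of results plus the cursor of the next page, and `iter_items` plays roughly the role that `Cursor(...).items(n)` plays in this project. It is not tweepy's actual implementation.

```python
from typing import Iterator, List, Optional, Tuple

# Hypothetical data source standing in for a user's timeline.
FAKE_TIMELINE = [f"tweet {i}" for i in range(1, 11)]
PAGE_SIZE = 4


def fetch_page(cursor: int) -> Tuple[List[str], Optional[int]]:
    """Return one page of results and the cursor of the next page (None at the end)."""
    page = FAKE_TIMELINE[cursor:cursor + PAGE_SIZE]
    next_cursor: Optional[int] = cursor + PAGE_SIZE
    if next_cursor >= len(FAKE_TIMELINE):
        next_cursor = None
    return page, next_cursor


def iter_items(limit: int) -> Iterator[str]:
    """Yield up to `limit` items, requesting further pages only when needed."""
    cursor: Optional[int] = 0
    yielded = 0
    while cursor is not None and yielded < limit:
        page, cursor = fetch_page(cursor)
        for item in page:
            if yielded == limit:
                return
            yield item
            yielded += 1


first_six = list(iter_items(6))
print(first_six)
```

The caller only states how many items it wants; the page boundaries stay hidden inside the generator, which is the convenience the cursor provides.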

MODULE 3 : Analysing Tweet Data

Analysing tweet data compiles all the behaviours and actions the audience takes when they come
across your posts and profile: the clicks, likes and re-tweets. The Tweet Analyzer is integrated
with Twitter and fetches the 5 most recent tweets from a given Twitter handle.

Step 1: Connect to the API.

Step 2: Using the tweet analyser functionality, we analyse and categorize content from tweets.

MODULE 4 : Visualizing Tweet Data

Data visualisation is a part of statistical analysis. After collecting and analysing the data, a
good visual representation is designed for it. A picture can speak thousands of words, and different
models give different perspectives on the data.

Step 1: Connect to the API.

Step 2: Analyse the data.

Step 3: Plot the data.

MODULE 5 : Sentiment Analysis Tweet Data

A key aspect of sentiment analysis is to analyse a body of text based on its polarity.
The sentiment polarity of an element defines the orientation of the expressed sentiment.

Step 1: Connect to the API.

Step 2: Analyse the data.

Step 3: If sentiment polarity > 0 ------> returns 1

Step 4: If sentiment polarity = 0 ------> returns 0

Step 5: If sentiment polarity < 0 ------> returns -1
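
Steps 3 to 5 amount to taking the sign of the polarity score. A minimal sketch of that mapping, with a hypothetical function name and hand-picked scores rather than real tweet data:

```python
def polarity_to_label(polarity: float) -> int:
    """Map a polarity score to 1 (positive), 0 (neutral) or -1 (negative)."""
    if polarity > 0:
        return 1
    if polarity == 0:
        return 0
    return -1


# Hypothetical polarity scores, for illustration only.
for score in (0.8, 0.0, -0.3):
    print(score, "->", polarity_to_label(score))
```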

SOFTWARE AND HARDWARE REQUIREMENTS

SOFTWARE REQUIREMENTS

Operating System : Windows 7/8/8.1/10

Twitter Developer Account

R Studio

Python IDLE 3.9.1

HARDWARE SPECIFICATIONS

Processor : Intel(R) Core(TM) i3 or more

Installed RAM : 4.00 GB or more

System type : 64-bit operating system, x64-based processor

Monitor : 1024 x 720 Display

ALGORITHM SPECIFICATION
Naive Bayes Classification
Written reviews are great datasets for doing sentiment analysis because they often come with a score
that can be used to train an algorithm. Naive Bayes is a popular algorithm for classifying text.

Consider, for example, the following phrases extracted from positive and negative reviews of movies
and restaurants. Words like great, richly and awesome, and pathetic, awful and ridiculously, are very
informative cues:

+ ...zany characters and richly applied satire, and some great plot twists
− It was pathetic. The worst part about it was the boxing scenes...
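
To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written in plain Python. The four training sentences are hypothetical and far too few for real use, and the function names are illustrative; this is not the classifier used by the project's scripts, which rely on TextBlob.

```python
import math
from collections import Counter, defaultdict

# Tiny hand-made training set (hypothetical), labelled "pos" or "neg".
TRAINING = [
    ("great plot twists and richly applied satire", "pos"),
    ("awesome food and great service", "pos"),
    ("it was pathetic and awful", "neg"),
    ("ridiculously bad boxing scenes", "neg"),
]


def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log posterior, using add-one smoothing."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word] + 1  # add-one (Laplace) smoothing
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(predict("a great and awesome film", *model))
print(predict("the ending was awful", *model))
```

Working in log space avoids multiplying many small probabilities, and the smoothing keeps a word that never appeared in one class from driving that class's probability to zero.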

TEXTBLOB PACKAGE

The TextBlob package for Python is a convenient way to do a lot of Natural Language Processing
(NLP) tasks. For example:

from textblob import TextBlob

TextBlob("not a very great calculation").sentiment

This tells us that the English phrase "not a very great calculation" has a polarity of about -0.3,
meaning it is slightly negative, and a subjectivity of about 0.6, meaning it is fairly subjective.

When calculating sentiment for a single word, TextBlob uses a simple technique: averaging.

TextBlob("great").sentiment

## Sentiment(polarity=0.8, subjectivity=0.75)

TextBlob("very great").sentiment

## Sentiment(polarity=1.0, subjectivity=0.9750000000000001)

The polarity gets maxed out at 1.0, but you can see that subjectivity is also modified by "very" to
become 0.75 × 1.3 = 0.975.

TextBlob("not very great").sentiment

## Sentiment(polarity=-0.3076923076923077, subjectivity=0.5769230769230769)

TextBlob ignores one-letter words in its sentiment phrases.
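
The numbers above can be reproduced by hand. The sketch below does not call TextBlob; it only redoes the arithmetic. The scores for "great" are taken from the output above, while the intensity of 1.3 for "very" and the factor of -0.5 for negation are assumptions inferred from those outputs, not values read from TextBlob's source.

```python
import math

# Values for "great" as reported in the output above.
GREAT_POLARITY = 0.8
GREAT_SUBJECTIVITY = 0.75

# Assumed constants, inferred from the outputs above.
VERY_INTENSITY = 1.3
NEGATION_FACTOR = -0.5


def clamp(value, low=-1.0, high=1.0):
    """Keep a score inside the valid range."""
    return max(low, min(high, value))


# "very great": the intensifier multiplies both scores; polarity is capped at 1.0.
very_great = (
    clamp(GREAT_POLARITY * VERY_INTENSITY),
    clamp(GREAT_SUBJECTIVITY * VERY_INTENSITY, 0.0, 1.0),
)

# "not very great": negation flips and halves the polarity, and the
# intensifier now divides instead of multiplying.
not_very_great = (
    clamp(NEGATION_FACTOR * GREAT_POLARITY / VERY_INTENSITY),
    clamp(GREAT_SUBJECTIVITY / VERY_INTENSITY, 0.0, 1.0),
)

print(very_great)
print(not_very_great)
```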

ADVANTAGES

1.UPSELLING OPPORTUNITIES
Identifying happy and satisfied customers and increasing sales of the product to them.
2.AGENT MONITORING
Superiors can monitor the quality of service provided by each team member.
3.IDENTIFYING KEY EMOTIONAL TRIGGERS
Identifying emotional triggers, such as sad or happy emojis, to understand customer
satisfaction.
4.HANDLING MULTIPLE CUSTOMERS
By handling multiple customers at once, we can save time and manage other work in that
time.
5.ADAPTIVE CUSTOMER SERVICE
If the service provided to the customer is good, the customer can easily adapt to
it.
6.QUICK ESCALATIONS
Spotting negative emojis quickly and responding to the needs of the customer.
7.REDUCE THE CUSTOMER CHURN
Identifying unsatisfied customers and providing a smooth service to satisfy their needs.
8.TRACKING OVERALL CUSTOMER SATISFACTION
Tracking customer satisfaction from time to time.
9.DETECT CHANGES IN OPINION
Detecting customer opinions and satisfying their needs, because a customer's
opinion often changes between ordering and receiving a product.

DISADVANTAGES
1.Inability to perform well in different domains.
2.Inadequate accuracy and performance in sentiment analysis when the data is insufficient.
3.Inability to deal with complex sentences that require more than sentiment words and simple
analysis.
4.Difficulty in practice with slang and abbreviated forms of words.

APPLICATIONS
Twitter sentiment analysis is designed to analyze the sentiment of tweets. It’s ideal for social
listening and detecting brand sentiment in real time.
Based on a scoring mechanism, sentiment analysis monitors conversations and evaluates language
and voice inflections to quantify attitudes, opinions, and emotions related to a business, product or
service, or topic. Sentiment analysis is sometimes also referred to as opinion mining.
The applications of sentiment analysis are wide-ranging and extend to many industries, from finance
and retail to hospitality and technology. The most popular applications of sentiment analysis in real
life are:
1.Social media monitoring

2.Customer support

3.Customer feedback

4.Brand monitoring and reputation management

5.Voice of customer (VoC)

6.Voice of employee

7.Product analysis

8.Market research and competitive research

Sentiment analysis is one of those technologies whose usefulness depends wholly on how well its
capabilities are understood.

It can be extremely useful if you know how to use it and it can be completely useless if you apply it
on something it is not supposed to do.

STREAMING TWEETS
# Written against the tweepy 3.x API; StreamListener was removed in tweepy 4.0.
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

import consumer  # local module holding the Twitter API credentials


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        pass

    def stream_tweets(self, fetched_tweets_filename, hash_tag_list):
        # This handles Twitter authentication and the connection to the Twitter Streaming API.
        listener = StdOutListener(fetched_tweets_filename)
        auth = OAuthHandler(consumer.CONSUMER_KEY, consumer.CONSUMER_SECRET)
        auth.set_access_token(consumer.ACCESS_TOKEN, consumer.ACCESS_TOKEN_SECRET)
        stream = Stream(auth, listener)

        # Filter the stream to capture tweets that match the keywords.
        stream.filter(track=hash_tag_list)


# # # # TWITTER STREAM LISTENER # # # #
class StdOutListener(StreamListener):
    """
    This is a basic listener that prints received tweets to stdout and appends them to a file.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        print(status)


if __name__ == '__main__':
    hash_tag_list = ["donald trump", "hillary clinton", "barack obama", "bernie sanders"]
    fetched_tweets_filename = "tweets.txt"

    twitter_streamer = TwitterStreamer()
    twitter_streamer.stream_tweets(fetched_tweets_filename, hash_tag_list)
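
The listener appends the raw payload of every received tweet to a text file. A minimal sketch of reading such a file back with the standard library follows. It assumes one JSON object per line; the two sample payloads and their `id` and `text` fields are hypothetical stand-ins, since real payloads carry many more fields.

```python
import json
import os
import tempfile

# Two hypothetical payloads standing in for what the listener writes to tweets.txt.
SAMPLE_LINES = [
    '{"id": 1, "text": "Loving the debate tonight"}',
    '{"id": 2, "text": "Not impressed by that speech"}',
]


def read_tweets(path):
    """Parse a file containing one JSON object per line, skipping blank lines."""
    tweets = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                tweets.append(json.loads(line))
    return tweets


# Write the samples to a temporary file, then read them back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "tweets.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(SAMPLE_LINES) + "\n")
    loaded = read_tweets(sample_path)

print([tweet["text"] for tweet in loaded])
```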

CURSOR AND PAGINATION

# Written against the tweepy 3.x API; StreamListener was removed in tweepy 4.0.
from tweepy import API
from tweepy import Cursor
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

import consumer  # local module holding the Twitter API credentials


# # # # TWITTER CLIENT # # # #
class TwitterClient():
    def __init__(self, twitter_user=None):
        self.auth = TwitterAuthenticator().authenticate_twitter_app()
        self.twitter_client = API(self.auth)

        self.twitter_user = twitter_user

    def get_user_timeline_tweets(self, num_tweets):
        tweets = []
        for tweet in Cursor(self.twitter_client.user_timeline, id=self.twitter_user).items(num_tweets):
            tweets.append(tweet)
        return tweets

    def get_friend_list(self, num_friends):
        friend_list = []
        for friend in Cursor(self.twitter_client.friends, id=self.twitter_user).items(num_friends):
            friend_list.append(friend)
        return friend_list

    def get_home_timeline_tweets(self, num_tweets):
        home_timeline_tweets = []
        for tweet in Cursor(self.twitter_client.home_timeline, id=self.twitter_user).items(num_tweets):
            home_timeline_tweets.append(tweet)
        return home_timeline_tweets


# # # # TWITTER AUTHENTICATOR # # # #
class TwitterAuthenticator():

    def authenticate_twitter_app(self):
        auth = OAuthHandler(consumer.CONSUMER_KEY, consumer.CONSUMER_SECRET)
        auth.set_access_token(consumer.ACCESS_TOKEN, consumer.ACCESS_TOKEN_SECRET)
        return auth


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        self.twitter_authenticator = TwitterAuthenticator()

    def stream_tweets(self, fetched_tweets_filename, hash_tag_list):
        # This handles Twitter authentication and the connection to the Twitter Streaming API.
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.twitter_authenticator.authenticate_twitter_app()
        stream = Stream(auth, listener)

        # Filter the stream to capture tweets that match the keywords.
        stream.filter(track=hash_tag_list)


# # # # TWITTER STREAM LISTENER # # # #
class TwitterListener(StreamListener):
    """
    This is a basic listener that prints received tweets to stdout and appends them to a file.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        if status == 420:
            # Returning False from on_error disconnects the stream when the rate limit is hit.
            return False
        print(status)


if __name__ == '__main__':
    # Authenticate using consumer.py and connect to the Twitter API.
    hash_tag_list = ["donald trump", "hillary clinton", "barack obama"]
    fetched_tweets_filename = "tweets.txt"

    twitter_client = TwitterClient("pycon")
    print(twitter_client.get_user_timeline_tweets(1))

    # twitter_streamer = TwitterStreamer()
    # twitter_streamer.stream_tweets(fetched_tweets_filename, hash_tag_list)

ANALYSING TWEET DATA
# Written against the tweepy 3.x API; StreamListener was removed in tweepy 4.0.
from tweepy import API
from tweepy import Cursor
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

import consumer  # local module holding the Twitter API credentials
import numpy as np
import pandas as pd


# # # # TWITTER CLIENT # # # #
class TwitterClient():
    def __init__(self, twitter_user=None):
        self.auth = TwitterAuthenticator().authenticate_twitter_app()
        self.twitter_client = API(self.auth)

        self.twitter_user = twitter_user

    def get_twitter_client_api(self):
        return self.twitter_client

    def get_user_timeline_tweets(self, num_tweets):
        tweets = []
        for tweet in Cursor(self.twitter_client.user_timeline, id=self.twitter_user).items(num_tweets):
            tweets.append(tweet)
        return tweets

    def get_friend_list(self, num_friends):
        friend_list = []
        for friend in Cursor(self.twitter_client.friends, id=self.twitter_user).items(num_friends):
            friend_list.append(friend)
        return friend_list

    def get_home_timeline_tweets(self, num_tweets):
        home_timeline_tweets = []
        for tweet in Cursor(self.twitter_client.home_timeline, id=self.twitter_user).items(num_tweets):
            home_timeline_tweets.append(tweet)
        return home_timeline_tweets


# # # # TWITTER AUTHENTICATOR # # # #
class TwitterAuthenticator():

    def authenticate_twitter_app(self):
        auth = OAuthHandler(consumer.CONSUMER_KEY, consumer.CONSUMER_SECRET)
        auth.set_access_token(consumer.ACCESS_TOKEN, consumer.ACCESS_TOKEN_SECRET)
        return auth


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        self.twitter_authenticator = TwitterAuthenticator()

    def stream_tweets(self, fetched_tweets_filename, hash_tag_list):
        # This handles Twitter authentication and the connection to the Twitter Streaming API.
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.twitter_authenticator.authenticate_twitter_app()
        stream = Stream(auth, listener)

        # Filter the stream to capture tweets that match the keywords.
        stream.filter(track=hash_tag_list)


# # # # TWITTER STREAM LISTENER # # # #
class TwitterListener(StreamListener):
    """
    This is a basic listener that prints received tweets to stdout and appends them to a file.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        if status == 420:
            # Returning False from on_error disconnects the stream when the rate limit is hit.
            return False
        print(status)


class TweetAnalyzer():
    """
    Functionality for analyzing and categorizing content from tweets.
    """
    def tweets_to_data_frame(self, tweets):
        # One row per tweet, one column per attribute of interest.
        df = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

        df['id'] = np.array([tweet.id for tweet in tweets])
        df['len'] = np.array([len(tweet.text) for tweet in tweets])
        df['date'] = np.array([tweet.created_at for tweet in tweets])
        df['source'] = np.array([tweet.source for tweet in tweets])
        df['likes'] = np.array([tweet.favorite_count for tweet in tweets])
        df['retweets'] = np.array([tweet.retweet_count for tweet in tweets])

        return df


if __name__ == '__main__':
    twitter_client = TwitterClient()
    tweet_analyzer = TweetAnalyzer()

    api = twitter_client.get_twitter_client_api()

    # screen_name is the Twitter handle, which cannot contain spaces.
    tweets = api.user_timeline(screen_name="narendramodi", count=20)

    #print(dir(tweets[0]))
    #print(tweets[0].retweet_count)

    df = tweet_analyzer.tweets_to_data_frame(tweets)

    print(df.head(10))

VISUALIZING TWEET DATA
# Written against the tweepy 3.x API; StreamListener was removed in tweepy 4.0.
from tweepy import API
from tweepy import Cursor
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

import consumer  # local module holding the Twitter API credentials
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


# # # # TWITTER CLIENT # # # #
class TwitterClient():
    def __init__(self, twitter_user=None):
        self.auth = TwitterAuthenticator().authenticate_twitter_app()
        self.twitter_client = API(self.auth)

        self.twitter_user = twitter_user

    def get_twitter_client_api(self):
        return self.twitter_client

    def get_user_timeline_tweets(self, num_tweets):
        tweets = []
        for tweet in Cursor(self.twitter_client.user_timeline, id=self.twitter_user).items(num_tweets):
            tweets.append(tweet)
        return tweets

    def get_friend_list(self, num_friends):
        friend_list = []
        for friend in Cursor(self.twitter_client.friends, id=self.twitter_user).items(num_friends):
            friend_list.append(friend)
        return friend_list

    def get_home_timeline_tweets(self, num_tweets):
        home_timeline_tweets = []
        for tweet in Cursor(self.twitter_client.home_timeline, id=self.twitter_user).items(num_tweets):
            home_timeline_tweets.append(tweet)
        return home_timeline_tweets


# # # # TWITTER AUTHENTICATOR # # # #
class TwitterAuthenticator():

    def authenticate_twitter_app(self):
        auth = OAuthHandler(consumer.CONSUMER_KEY, consumer.CONSUMER_SECRET)
        auth.set_access_token(consumer.ACCESS_TOKEN, consumer.ACCESS_TOKEN_SECRET)
        return auth


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        self.twitter_authenticator = TwitterAuthenticator()

    def stream_tweets(self, fetched_tweets_filename, hash_tag_list):
        # This handles Twitter authentication and the connection to the Twitter Streaming API.
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.twitter_authenticator.authenticate_twitter_app()
        stream = Stream(auth, listener)

        # Filter the stream to capture tweets that match the keywords.
        stream.filter(track=hash_tag_list)


# # # # TWITTER STREAM LISTENER # # # #
class TwitterListener(StreamListener):
    """
    This is a basic listener that prints received tweets to stdout and appends them to a file.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        if status == 420:
            # Returning False from on_error disconnects the stream when the rate limit is hit.
            return False
        print(status)


class TweetAnalyzer():
    """
    Functionality for analyzing and categorizing content from tweets.
    """
    def tweets_to_data_frame(self, tweets):
        df = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['tweets'])

        df['id'] = np.array([tweet.id for tweet in tweets])
        # The columns below are required by the statistics and plots in the main block.
        df['len'] = np.array([len(tweet.text) for tweet in tweets])
        df['date'] = np.array([tweet.created_at for tweet in tweets])
        df['likes'] = np.array([tweet.favorite_count for tweet in tweets])
        df['retweets'] = np.array([tweet.retweet_count for tweet in tweets])

        return df


if __name__ == '__main__':
    twitter_client = TwitterClient()
    tweet_analyzer = TweetAnalyzer()

    api = twitter_client.get_twitter_client_api()

    # screen_name is the Twitter handle, which cannot contain spaces.
    tweets = api.user_timeline(screen_name="narendramodi", count=20)

    #print(dir(tweets[0]))
    #print(tweets[0].retweet_count)

    df = tweet_analyzer.tweets_to_data_frame(tweets)

    # Get average length over all tweets:
    print(np.mean(df['len']))

    # Get the number of likes for the most liked tweet:
    #print(np.max(df['likes']))

    # Get the number of retweets for the most retweeted tweet:
    print(np.max(df['retweets']))

    #print(df.head(10))

    # Time series of likes:
    time_favs = pd.Series(data=df['likes'].values, index=df['date'])
    time_favs.plot(figsize=(16, 4), color='r')
    plt.show()

    #time_retweets = pd.Series(data=df['retweets'].values, index=df['date'])
    #time_retweets.plot(figsize=(16, 4), color='r')
    #plt.show()

    # Layered Time Series:
    #time_likes = pd.Series(data=df['likes'].values, index=df['date'])
    #time_likes.plot(figsize=(16, 4), label="likes", legend=True)
    time_retweets = pd.Series(data=df['retweets'].values, index=df['date'])
    time_retweets.plot(figsize=(16, 4), label="retweets", legend=True)
    plt.show()
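
The summary statistics printed by the script (average tweet length, most likes, most retweets) can be checked on a small hand-made sample without pandas or live data. The records below are hypothetical; only the field names mirror the DataFrame columns used in this report.

```python
from statistics import mean

# Hypothetical tweet records mirroring the DataFrame columns used in this report.
TWEETS = [
    {"text": "Good morning everyone", "likes": 120, "retweets": 30},
    {"text": "Launch day!", "likes": 450, "retweets": 210},
    {"text": "Thank you for the support", "likes": 300, "retweets": 95},
]

average_length = mean(len(tweet["text"]) for tweet in TWEETS)
most_liked = max(TWEETS, key=lambda tweet: tweet["likes"])
most_retweeted = max(TWEETS, key=lambda tweet: tweet["retweets"])

print("average length:", average_length)
print("most likes:", most_liked["likes"])
print("most retweets:", most_retweeted["retweets"])
```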

SENTIMENTAL ANALYSIS OF TWEET DATA
# Written against the tweepy 3.x API; StreamListener was removed in tweepy 4.0.
from tweepy import API
from tweepy import Cursor
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

from textblob import TextBlob

import consumer  # local module holding the Twitter API credentials

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import re


# # # # TWITTER CLIENT # # # #
class TwitterClient():
    def __init__(self, twitter_user=None):
        self.auth = TwitterAuthenticator().authenticate_twitter_app()
        self.twitter_client = API(self.auth)

        self.twitter_user = twitter_user

    def get_twitter_client_api(self):
        return self.twitter_client

    def get_user_timeline_tweets(self, num_tweets):
        tweets = []
        for tweet in Cursor(self.twitter_client.user_timeline, id=self.twitter_user).items(num_tweets):
            tweets.append(tweet)
        return tweets

    def get_friend_list(self, num_friends):
        friend_list = []
        for friend in Cursor(self.twitter_client.friends, id=self.twitter_user).items(num_friends):
            friend_list.append(friend)
        return friend_list

    def get_home_timeline_tweets(self, num_tweets):
        home_timeline_tweets = []
        for tweet in Cursor(self.twitter_client.home_timeline, id=self.twitter_user).items(num_tweets):
            home_timeline_tweets.append(tweet)
        return home_timeline_tweets


# # # # TWITTER AUTHENTICATOR # # # #
class TwitterAuthenticator():

    def authenticate_twitter_app(self):
        auth = OAuthHandler(consumer.CONSUMER_KEY, consumer.CONSUMER_SECRET)
        auth.set_access_token(consumer.ACCESS_TOKEN, consumer.ACCESS_TOKEN_SECRET)
        return auth


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        self.twitter_authenticator = TwitterAuthenticator()

    def stream_tweets(self, fetched_tweets_filename, hash_tag_list):
        # This handles Twitter authentication and the connection to the Twitter Streaming API.
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.twitter_authenticator.authenticate_twitter_app()
        stream = Stream(auth, listener)

        # Filter the stream to capture tweets that match the keywords.
        stream.filter(track=hash_tag_list)


# # # # TWITTER STREAM LISTENER # # # #
class TwitterListener(StreamListener):
    """
    This is a basic listener that prints received tweets to stdout and appends them to a file.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        if status == 420:
            # Returning False from on_error disconnects the stream when the rate limit is hit.
            return False
        print(status)


class TweetAnalyzer():
    """
    Functionality for analyzing and categorizing content from tweets.
    """

    def clean_tweet(self, tweet):
        # Replace @mentions, special characters and links with spaces, then collapse whitespace.
        return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())

    def analyze_sentiment(self, tweet):
        analysis = TextBlob(self.clean_tweet(tweet))

        # Map the polarity score to 1 (positive), 0 (neutral) or -1 (negative).
        if analysis.sentiment.polarity > 0:
            return 1
        elif analysis.sentiment.polarity == 0:
            return 0
        else:
            return -1

    def tweets_to_data_frame(self, tweets):
        df = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['tweets'])

        df['id'] = np.array([tweet.id for tweet in tweets])
        df['retweets'] = np.array([tweet.retweet_count for tweet in tweets])

        return df


if __name__ == '__main__':
    twitter_client = TwitterClient()
    tweet_analyzer = TweetAnalyzer()

    api = twitter_client.get_twitter_client_api()

    # screen_name is the Twitter handle, which cannot contain spaces.
    tweets = api.user_timeline(screen_name="narendramodi", count=200)

    df = tweet_analyzer.tweets_to_data_frame(tweets)
    df['sentiment'] = np.array([tweet_analyzer.analyze_sentiment(tweet) for tweet in df['tweets']])

    print(df.head(10))
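
The cleaning step is easiest to understand on a single example. The sketch below lifts the regular expression out of `clean_tweet` into a standalone function and applies it to a hypothetical tweet; no Twitter or TextBlob call is involved.

```python
import re

# Same pattern as TweetAnalyzer.clean_tweet, written as a raw string.
PATTERN = r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)"


def clean_tweet(tweet):
    """Replace mentions, links and special characters with spaces, then collapse whitespace."""
    return " ".join(re.sub(PATTERN, " ", tweet).split())


# Hypothetical tweet text, for illustration only.
raw = "@user Loving the new phone! https://t.co/abc123 #happy"
print(clean_tweet(raw))
```

The mention, the exclamation mark, the link and the `#` sign are all replaced by spaces, leaving only plain words for the polarity calculation.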

OUTPUTS
1.STREAMING TWEETS

2.CURSOR AND PAGINATION

3.ANALYSING TWEET DATA

4.VISUALIZING TWEET DATA

5.SENTIMENTAL ANALYSIS OF TWEET DATA
CONCLUSION
Twitter sentiment analysis comes under the category of text and opinion mining. It focuses
on analyzing the sentiments of tweets and feeding the data to a machine learning model
to train it and then check its accuracy, so that the model can be reused in future
according to the results.

It comprises steps such as data collection, text preprocessing, sentiment detection, sentiment
classification, and training and testing the model. This research topic has evolved during the last
decade, with models reaching an efficiency of almost 85%-90%. But it still lacks the
dimension of diversity in the data. Along with this, it has many application issues with
slang and short forms of words. Many analyzers don't perform well when the
number of classes is increased. Also, it has not yet been tested how accurate the model will
be for topics other than the one under consideration.

Hence sentiment analysis has a very bright scope of development in future.

REFERENCES
1.Sahar A. El_Rahman, "Sentiment Analysis of Twitter Data", Computer and Information
Sciences College, Princess Nourah Bint Abdulrahman University.

2.Anurag P. Jain, "Sentiments Analysis Of Twitter Data Using Data Mining", ICIP, 2015.

3.Rasika Wagh and Payal Punde, "Survey on Sentiment Analysis using Twitter Dataset", ICECA,
2018.

4.Adyan Marendra Ramadhani and Hong Soon Goo, "Twitter Sentiment Analysis using Deep
Learning Methods", Department of Management Information Systems, Dong-A University, Busan,
South Korea, 2017.

5.Bing Liu, Sentiment Analysis and Opinion Mining, Morgan and Claypool Publishers, May 2012.

6.V. Kharde and S. Sonawane, "Sentiment Analysis of Twitter Data: A Survey of Techniques",
International Journal of Computer Applications, vol. 139, pp. 11, 2016.

7.Huma Parveen and Shikha Pandey, "Sentiment Analysis on Twitter Data-set using Naive Bayes
Algorithm", Dept. of Computer Science and Engineering, Rungta College of
Engineering and Technology, Bhilai, India, 2016.

Final CPP Project
19 pages
Data Leakage Detection System
No ratings yet
Data Leakage Detection System
17 pages
Face Mask Detection
No ratings yet
Face Mask Detection
34 pages
Case Study DS-BDA
No ratings yet
Case Study DS-BDA
29 pages
PWP Notes (Chapter Wise)
100% (1)
PWP Notes (Chapter Wise)
123 pages
11th Bio Zoology Guide - Book Back Answers and Additional Questions K.K.D TM PDF
100% (2)
11th Bio Zoology Guide - Book Back Answers and Additional Questions K.K.D TM PDF
81 pages
Chatbot Using PHP: Department of Computer Engineering
No ratings yet
Chatbot Using PHP: Department of Computer Engineering
16 pages
JARVIS
No ratings yet
JARVIS
6 pages
CSC Mini Project
No ratings yet
CSC Mini Project
12 pages
Synopsis P
100% (1)
Synopsis P
6 pages
Mini Project Report: Submitted in Partial Fulfilment of The Requirement For The University of Mumbai For The Degree of by
No ratings yet
Mini Project Report: Submitted in Partial Fulfilment of The Requirement For The University of Mumbai For The Degree of by
24 pages
ETI Micro Project 39 To 42
0% (1)
ETI Micro Project 39 To 42
20 pages
Final Twitter - Sentiment - Analysis - Report
100% (1)
Final Twitter - Sentiment - Analysis - Report
14 pages
A Project Report ON: Department of Computer Engineering
No ratings yet
A Project Report ON: Department of Computer Engineering
13 pages
Data Duplication Removal Using File Checksum
No ratings yet
Data Duplication Removal Using File Checksum
2 pages
File Sharing and Data Duplication Removal in Cloud Using File Checksum
No ratings yet
File Sharing and Data Duplication Removal in Cloud Using File Checksum
3 pages
BDA Mini Project Report
No ratings yet
BDA Mini Project Report
27 pages
Title: Personality Prediction System Problem Statement:: Literature Review
No ratings yet
Title: Personality Prediction System Problem Statement:: Literature Review
5 pages
PY001
No ratings yet
PY001
6 pages
Mini Project 1 - Session 2 Sad
No ratings yet
Mini Project 1 - Session 2 Sad
8 pages
Python Mini Project
No ratings yet
Python Mini Project
11 pages
Secure File Storage On Cloud Using Hybrid Cryptography
No ratings yet
Secure File Storage On Cloud Using Hybrid Cryptography
5 pages
Cybersecurity Essentials Syllabus
No ratings yet
Cybersecurity Essentials Syllabus
2 pages
Paper Presentation
No ratings yet
Paper Presentation
2 pages
3-1 Bigdata (Spark)
No ratings yet
3-1 Bigdata (Spark)
3 pages
Python Micro-Project AH
No ratings yet
Python Micro-Project AH
13 pages
KUKA-youBot UserManual v0.86.1
No ratings yet
KUKA-youBot UserManual v0.86.1
46 pages
AL-502 DBMS Unit 2
No ratings yet
AL-502 DBMS Unit 2
103 pages
AQL Sampling
No ratings yet
AQL Sampling
4 pages
How To Make Jarvis Iron Man Computer
No ratings yet
How To Make Jarvis Iron Man Computer
6 pages
Web-Based Chat Application With Webcam Using PHP
No ratings yet
Web-Based Chat Application With Webcam Using PHP
5 pages
21CB603 - Ai - Cia 1 - MSM
No ratings yet
21CB603 - Ai - Cia 1 - MSM
11 pages
AMIBIOS Modding (Looking For AMIBCP For Windows 2.x) - VOGONS
No ratings yet
AMIBIOS Modding (Looking For AMIBCP For Windows 2.x) - VOGONS
3 pages
Introduction To PSCAD © 2012 Nayak Corporation Inc
No ratings yet
Introduction To PSCAD © 2012 Nayak Corporation Inc
31 pages
Steganography Project Report For Major Project in B Tech
No ratings yet
Steganography Project Report For Major Project in B Tech
74 pages
Campus Recruitment and Placement System: Rajnish Tripathi, Raghvendra Singh Ms. Jaweria Usmani
No ratings yet
Campus Recruitment and Placement System: Rajnish Tripathi, Raghvendra Singh Ms. Jaweria Usmani
6 pages
Wipro Aptitude Exam-Aptitude Paper1
No ratings yet
Wipro Aptitude Exam-Aptitude Paper1
4 pages
Quality Notification: Completing Notification in SAP System
No ratings yet
Quality Notification: Completing Notification in SAP System
5 pages
Ttnet Airties Air 5650 UM Translated
No ratings yet
Ttnet Airties Air 5650 UM Translated
31 pages
Advanced Query Operators: Analytics Data Sources
No ratings yet
Advanced Query Operators: Analytics Data Sources
10 pages
WTL Report (Abhi)
No ratings yet
WTL Report (Abhi)
26 pages
Elapsed Time (Min) Hydromer Reading K L/T Hydrometer Reading With Correction Effective Length From
No ratings yet
Elapsed Time (Min) Hydromer Reading K L/T Hydrometer Reading With Correction Effective Length From
2 pages
Cse - 101 (0613101) - 54-1
No ratings yet
Cse - 101 (0613101) - 54-1
7 pages
MANUAL USER Radar Overlay DEF ENG
No ratings yet
MANUAL USER Radar Overlay DEF ENG
25 pages
MongoDB Restaurants
No ratings yet
MongoDB Restaurants
5 pages
LabVIEW Core 1 Training Course
No ratings yet
LabVIEW Core 1 Training Course
3 pages
Course Work Syllabus PHD in Bharathiar University
100% (1)
Course Work Syllabus PHD in Bharathiar University
10 pages
Gis in Forestry
No ratings yet
Gis in Forestry
4 pages
Synopsis Vaishnavi.. 1
No ratings yet
Synopsis Vaishnavi.. 1
12 pages
Resume Format For Engineering Freshers
No ratings yet
Resume Format For Engineering Freshers
1 page
Resume Fomrat
No ratings yet
Resume Fomrat
1 page
Referencing Quick Guide
No ratings yet
Referencing Quick Guide
2 pages
Afsdf 234 R 34 Tewfbdsfbsdfgsdfg
No ratings yet
Afsdf 234 R 34 Tewfbdsfbsdfgsdfg
2 pages
Computer Application Lab File - 231126 - 220302
No ratings yet
Computer Application Lab File - 231126 - 220302
5 pages
CEH v8 Pro
No ratings yet
CEH v8 Pro
10 pages
Tay Ho Bus Route For Dance Show - 18.05.2024 - For Audiences
No ratings yet
Tay Ho Bus Route For Dance Show - 18.05.2024 - For Audiences
2 pages
Defect Classifications
No ratings yet
Defect Classifications
6 pages