TM 2
import twitter

# Fill in the credentials issued for your Twitter application
CONSUMER_KEY = ''
CONSUMER_SECRET = ''
OAUTH_TOKEN = ''
OAUTH_TOKEN_SECRET = ''

auth = twitter.oauth.OAuth(OAUTH_TOKEN, OAUTH_TOKEN_SECRET,
                           CONSUMER_KEY, CONSUMER_SECRET)
twitter_api = twitter.Twitter(auth=auth)
Exploring Trending Topics
A word, phrase or topic that is mentioned at a greater rate than others is said to
be a "trending topic".
Trending topics become popular either through a concerted effort by users, or
because of an event that prompts people to talk about a specific topic.
With an authorized API connection in place, you can now issue a request to get a
list of trending topics.
The example below demonstrates how to ask Twitter for the topics that are
currently trending worldwide.
The API can easily be parameterized to constrain the topics to more specific
locales.
Locales are identified by Yahoo! GeoPlanet Where On Earth (WOE) IDs, a system
that aims to map a unique identifier to any named place on Earth (or
theoretically, even in a virtual world). For example, the WOE ID of the USA is
23424977.
Twitter imposes rate limits on how many requests an application can make to
any given API resource within a given time window.
Twitter’s rate limits are well documented, and each individual API resource also
states its particular limits for your convenience.
For example, the API request for trends used in this section limits
applications to 15 requests per 15-minute window.
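One way to stay inside such a limit is to track request times on the client side. The following is a minimal sketch, not part of the twitter package: the WindowRateLimiter class and its method names are hypothetical, and it assumes a sliding window, which is slightly stricter than the fixed windows Twitter documents.

```python
import time
from collections import deque


class WindowRateLimiter:
    """Hypothetical client-side guard: at most max_requests per window_seconds."""

    def __init__(self, max_requests=15, window_seconds=15 * 60, clock=time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.timestamps = deque()   # times of requests still inside the window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self.clock()
        # Forget requests that have aged out of the window
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self.timestamps[0])

    def record_request(self):
        """Call this each time a request is actually sent."""
        self.timestamps.append(self.clock())
```

Before each API call, check seconds_until_allowed(), sleep for that long if it is non-zero, and then call record_request().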
Example: Retrieving trends
WORLD_WOE_ID = 1
US_WOE_ID = 23424977

# Prefix "id" with an underscore so the twitter package
# treats it as a query string parameter
world_trends = twitter_api.trends.place(_id=WORLD_WOE_ID)
us_trends = twitter_api.trends.place(_id=US_WOE_ID)

print(world_trends)
print('')
print(us_trends)
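Once both responses are available, a natural next step is to see which topics appear in both. The sketch below assumes each response is a list whose first element holds a 'trends' list of objects with a 'name' field. The sample data and the trend_names helper are made up for illustration and are not real API output.

```python
# Made-up responses, shaped like the list assumed to come back from trends.place
sample_world_trends = [{'trends': [{'name': '#WorldCup'}, {'name': '#MusicMonday'}]}]
sample_us_trends = [{'trends': [{'name': '#MusicMonday'}, {'name': '#SuperBowl'}]}]


def trend_names(response):
    """Collect the trend names from one response into a set."""
    return set(trend['name'] for trend in response[0]['trends'])


# Topics that are trending both worldwide and in the US
common_trends = trend_names(sample_world_trends) & trend_names(sample_us_trends)
print(common_trends)
```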
Tweets Analysis
Extracting Tweets
Text Cleaning
Frequent Words and Word Cloud
Word Associations
Topic Modelling
Sentiment Analysis
* The human-readable text of a tweet is available
through t['text'], for example: RT @hassanmusician:
#MentionSomeoneImportantForYou God.
* The entities in the text of a tweet are conveniently
processed for you and available through t['entities'].
* Clues as to the “interestingness” of a tweet are
available through t['favorite_count'] and
t['retweet_count'], which return the number of times
it has been bookmarked or retweeted, respectively.
* If a tweet has been retweeted, the
t['retweeted_status'] field provides significant detail
about the original tweet itself and its author.
* The t['retweeted'] field denotes whether or not the
authenticated user (via an authorized application) has
retweeted this particular tweet.
* t['retweet_count'] reflects the total number of times
that the original tweet has been retweeted and
should reflect the same value in both the original
tweet and all subsequent retweets.
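As a rough illustration of the fields described above, the sketch below reads them from a hand-built dictionary. Only the tweet text comes from the example above. The counts, the nested values and the summarize helper are assumptions made for illustration.

```python
# Hand-built tweet for illustration; the counts are made up
t = {
    'text': 'RT @hassanmusician: #MentionSomeoneImportantForYou God.',
    'favorite_count': 0,
    'retweet_count': 12,
    'retweeted': False,
    'entities': {
        'user_mentions': [{'screen_name': 'hassanmusician'}],
        'hashtags': [{'text': 'MentionSomeoneImportantForYou'}],
    },
    'retweeted_status': {
        'text': '#MentionSomeoneImportantForYou God.',
        'retweet_count': 12,  # same value as on the retweet itself
        'user': {'screen_name': 'hassanmusician'},
    },
}


def summarize(tweet):
    """Pull out the fields that hint at how interesting a tweet is."""
    original = tweet.get('retweeted_status')  # present only on retweets
    return {
        'text': tweet['text'],
        'favorites': tweet['favorite_count'],
        'retweets': tweet['retweet_count'],
        'is_retweet': original is not None,
        'original_author': original['user']['screen_name'] if original else None,
    }


print(summarize(t))
```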
1.4.1 Extracting Tweet Entities
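A minimal sketch of entity extraction, assuming statuses is a list of tweet dictionaries whose 'entities' field contains 'user_mentions' and 'hashtags' lists. The two statuses below are made-up stand-ins, not real search results.

```python
# Made-up statuses shaped like tweet objects, for illustration only
statuses = [
    {
        'text': 'RT @alice: Loving the #python meetup',
        'entities': {
            'user_mentions': [{'screen_name': 'alice'}],
            'hashtags': [{'text': 'python'}],
        },
    },
    {
        'text': 'Great talk on #python and #data by @bob',
        'entities': {
            'user_mentions': [{'screen_name': 'bob'}],
            'hashtags': [{'text': 'python'}, {'text': 'data'}],
        },
    },
]

# Text of every tweet
status_texts = [status['text'] for status in statuses]

# Screen names of every mentioned user
screen_names = [mention['screen_name']
                for status in statuses
                for mention in status['entities']['user_mentions']]

# Hashtag text (without the leading '#')
hashtags = [hashtag['text']
            for status in statuses
            for hashtag in status['entities']['hashtags']]

# Whitespace-delimited words across all tweets
words = [w for text in status_texts for w in text.split()]

print(screen_names)
print(hashtags)
```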
1.4.2 Analyzing Tweets and Tweet
Entities with Frequency Analysis
Now take a closer look at what’s in the data by
computing a frequency distribution and looking at the
top 10 items in each list.
As of Python 2.7, a collections module is available that
provides a Counter class, which makes computing a
frequency distribution straightforward.
The next example demonstrates how to use a Counter to
compute frequency distributions as ranked lists of
terms.
Example: Creating a basic frequency
distribution from the words in tweets
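A minimal sketch of such a frequency distribution, assuming the words have already been split out of the tweet text. The word list and the top_terms helper are made up for illustration.

```python
from collections import Counter

# Made-up stand-in for words split out of tweet text
words = 'rt the game tonight was great rt what a game the crowd was loud'.split()


def top_terms(items, n=10):
    """Return the n most frequent items as (item, count) pairs, highest first."""
    return Counter(items).most_common(n)


for term, count in top_terms(words, 3):
    print(term, count)
```

The same helper works unchanged for lists of screen names or hashtags.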
Example: Using prettytable to display
tuples in a nice tabular format
from collections import Counter
from prettytable import PrettyTable

# words, screen_names and hashtags are the lists extracted from the tweets
for label, data in (('Word', words),
                    ('Screen Name', screen_names),
                    ('Hashtag', hashtags)):
    pt = PrettyTable(field_names=[label, 'Count'])
    c = Counter(data)
    for kv in c.most_common()[:10]:  # Top 10 (item, count) pairs
        pt.add_row(kv)
    pt.align[label], pt.align['Count'] = 'l', 'r'  # Set column alignment
    print(pt)
1.4.3 Computing lexical diversity of
tweets
Lexical Diversity
What is it?
A slightly more advanced measurement that builds on simple
frequencies and can be applied to unstructured text is a
metric called lexical diversity.
Mathematics?
Number of unique tokens in the text divided by
the total number of tokens in the text.
Lexical diversity can be worth considering as a
primitive statistic for answering a number of
questions. How?
How broad or narrow the subject matter is that an
individual or group discusses
Breaking down the analysis to specific time periods
could yield additional insight.
Comparing different groups or individuals
Lexical Diversity of Coca Cola and Pepsi
Example: Calculating lexical diversity
for tweets
O/P:
Understanding the Example:
Obs 1: 0.67: About two out of every three words are unique.
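The calculation defined above (unique tokens divided by total tokens) can be sketched as follows. The lexical_diversity function name and the word list are assumptions for illustration; the list is chosen so that four of its six tokens are unique.

```python
def lexical_diversity(tokens):
    """Number of unique tokens divided by the total number of tokens."""
    tokens = list(tokens)
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(tokens)) / float(len(tokens))


# Made-up tokens: 4 unique words out of 6 in total
words = ['coke', 'is', 'it', 'coke', 'is', 'life']
print(round(lexical_diversity(words), 2))
```

Applying the same function to the words, screen names and hashtags of two sets of tweets gives comparable figures for different groups or individuals.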