
Analyzing Social Media Data in Python Chapter4

This document discusses analyzing social media data from Twitter using Python. It explains why maps are useful for visualizing geographical data from tweets. It describes the different types of location data available in Twitter posts, including text, user-defined locations, bounding boxes, and coordinates. It discusses selecting biases when only a small percentage of tweets contain geographical metadata. Finally, it demonstrates how to plot Twitter data on maps using the Basemap library in Python.


DataCamp Analyzing Social Media Data in Python

ANALYZING SOCIAL MEDIA DATA IN PYTHON

Maps and Twitter data

Alex Hanna
Computational Social Scientist

Why maps?
Geographical scope
Participants or observers?
Differentiating tweets
For or against?

How Twitter gets location data


Location is device-dependent
In practice, aggregate geographical data to the county or state level
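
As a rough sketch of state-level aggregation, the snippet below counts tweets per state from the `place` field. The sample tweets are hypothetical, and the parsing assumes a `'City, ST'` style `full_name` such as `'Blacksburg, VA'`; other place types would need different handling.

```python
from collections import Counter

# Hypothetical sample: only the fields used here are included.
tweets = [
    {"place": {"full_name": "Blacksburg, VA", "place_type": "city"}},
    {"place": {"full_name": "Richmond, VA", "place_type": "city"}},
    {"place": {"full_name": "Austin, TX", "place_type": "city"}},
    {"place": None},  # a tweet with no place attached
]

def state_of(tweet):
    """Return the trailing state code of a city-level place, or None."""
    place = tweet.get("place")
    if not place or place.get("place_type") != "city":
        return None
    # Assumes a 'City, ST' full_name; anything else yields None.
    parts = place["full_name"].rsplit(", ", 1)
    return parts[1] if len(parts) == 2 else None

# Count tweets per state, skipping tweets without a usable place.
state_counts = Counter(s for s in map(state_of, tweets) if s is not None)
print(state_counts)
```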

Beware selection biases!


Warning: only 1-3% of tweets include geographical data
Limits the generalizability of inference
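
Before mapping anything, it can help to check how much of a collection is geotagged at all. The sketch below is one way to do that; `geo_share` is a hypothetical helper, and it treats a tweet as geotagged when either its `coordinates` or its `place` field is non-empty.

```python
def geo_share(tweets):
    """Fraction of tweets with a non-empty 'coordinates' or 'place' field."""
    if not tweets:
        return 0.0
    tagged = sum(1 for t in tweets if t.get("coordinates") or t.get("place"))
    return tagged / len(tweets)

# Hypothetical sample: 1 geotagged tweet out of 50.
sample = [{"coordinates": None, "place": None} for _ in range(49)]
sample.append({"coordinates": {"type": "Point",
                               "coordinates": [-72.2833, 21.7833]},
               "place": None})

share = geo_share(sample)
print(f"{share:.1%} of tweets are geotagged")
```

A share this small is a reminder that the mapped tweets may not represent the full collection.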

Types of Geographical Data available in Twitter


Tweet text (least precise)
User location
Bounding boxes
Coordinates and points (most precise)
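
The ordering above suggests a simple fallback rule: use the most precise field a tweet has. The following is a minimal sketch of that rule, assuming tweets are plain dictionaries with the standard `coordinates`, `place`, `user` and `text` keys; `best_location` is a hypothetical helper name.

```python
def best_location(tweet):
    """Return (kind, value) for the most precise location available."""
    if tweet.get("coordinates"):
        # Exact point: [longitude, latitude]
        return "coordinates", tweet["coordinates"]["coordinates"]
    if tweet.get("place"):
        # Corner points of the place's bounding box
        return "bounding_box", tweet["place"]["bounding_box"]["coordinates"][0]
    user_loc = (tweet.get("user") or {}).get("location")
    if user_loc:
        # Free text typed by the user, e.g. 'Bay Area'
        return "user_location", user_loc
    # Last resort: the tweet text itself, which may or may not name a place
    return "text", tweet.get("text", "")

# Hypothetical tweets, one per case.
with_point = {"coordinates": {"type": "Point",
                              "coordinates": [-72.2833, 21.7833]}}
with_user = {"coordinates": None, "place": None,
             "user": {"location": "Bay Area"}}
text_only = {"coordinates": None, "place": None,
             "user": {"location": ""}, "text": "Hello from Blacksburg"}

print(best_location(with_point))
print(best_location(with_user))
print(best_location(text_only))
```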


Geographical Data in Twitter JSON


Locations in Twitter text



User-defined location
> print(tweet['user']['location'])

Bay Area

place JSON
> print(tweet['place'])

{'attributes': {},
 'bounding_box': {'coordinates':
                  [[[-80.47611, 37.185195],
                    [-80.47611, 37.273387],
                    [-80.381618, 37.273387],
                    [-80.381618, 37.185195]]],
                  'type': 'Polygon'},
 'country': 'United States',
 'country_code': 'US',
 'full_name': 'Blacksburg, VA',
 'name': 'Blacksburg',
 'place_type': 'city',
 ...}

Calculating the centroid


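import numpy as np

# Corner points of the bounding box, each as [longitude, latitude]
coordinates = [
    [-80.47611, 37.185195],
    [-80.47611, 37.273387],
    [-80.381618, 37.273387],
    [-80.381618, 37.185195]]

# A rectangular box has two distinct longitudes and two distinct latitudes
longs = np.unique([x[0] for x in coordinates])
lats = np.unique([x[1] for x in coordinates])

# The centroid is the midpoint of each pair
central_long = np.sum(longs) / 2
central_lat = np.sum(lats) / 2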
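
The same midpoint calculation can be wrapped in a function that takes a `place` dictionary directly. This is a sketch rather than the course's own code: `bbox_centroid` is a hypothetical name, and it uses `min`/`max` instead of `np.unique` so the result stays correct even if all four corners of a box coincide.

```python
def bbox_centroid(place):
    """Return (longitude, latitude) at the centre of a place's bounding box."""
    # The first (and only) ring holds the corner points as [lon, lat] pairs.
    ring = place["bounding_box"]["coordinates"][0]
    longs = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return (min(longs) + max(longs)) / 2, (min(lats) + max(lats)) / 2

# The Blacksburg bounding box shown earlier.
blacksburg = {
    "bounding_box": {
        "coordinates": [[[-80.47611, 37.185195],
                         [-80.47611, 37.273387],
                         [-80.381618, 37.273387],
                         [-80.381618, 37.185195]]],
        "type": "Polygon",
    }
}

central_long, central_lat = bbox_centroid(blacksburg)
print(central_long, central_lat)
```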

coordinates JSON
> print(tweet['coordinates'])

{'type': 'Point',
'coordinates': [-72.2833, 21.7833]}
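
Note that the pair is ordered longitude first, then latitude. A small helper can make that order explicit and cope with tweets that have no point; `lon_lat` is a hypothetical name used for this sketch.

```python
def lon_lat(tweet):
    """Return (longitude, latitude) for a tweet, or None if it has no point."""
    coords = tweet.get("coordinates")
    if not coords or coords.get("type") != "Point":
        return None
    lon, lat = coords["coordinates"]  # longitude comes first
    return lon, lat

tweet = {"coordinates": {"type": "Point",
                         "coordinates": [-72.2833, 21.7833]}}

print(lon_lat(tweet))
print(lon_lat({"coordinates": None}))
```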


Creating Twitter maps


Introducing Basemap
Library for plotting two-dimensional maps
Built on top of matplotlib
Converts coordinates into map projections
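
To give a feel for what "converting coordinates into a map projection" means, here is a sketch of the textbook spherical Mercator formula. It is an illustration only, not Basemap's implementation, which may differ in details such as the Earth model and the map origin; the radius constant and the `mercator_xy` name are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (assumed for this sketch)

def mercator_xy(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to spherical Mercator x/y in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

# x grows linearly with longitude; y stretches as latitude increases.
print(mercator_xy(45, 0))
print(mercator_xy(0, 30))
print(mercator_xy(0, 60))
```

The stretching of `y` at high latitudes is why areas far from the equator look enlarged on Mercator maps.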

Beginning with Basemap


from mpl_toolkits.basemap import Basemap

# Mercator map bounded by its lower-left and upper-right corners
m = Basemap(projection='merc',
            llcrnrlat=-35.62,
            llcrnrlon=-17.29,
            urcrnrlat=37.73,
            urcrnrlon=51.39)

# Draw the land, coastlines and country borders
m.fillcontinents(color='white')
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')
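
The four corner values can be typed in by hand, as above, or derived from the points to be plotted. The helper below is a hypothetical sketch of the second approach: it pads the data's extent by a few degrees and clamps latitude to ±85, an assumed safe limit because Mercator is undefined at the poles.

```python
def map_corners(longs, lats, pad=2.0):
    """Return Basemap-style corner keywords enclosing the given points."""
    return {
        "llcrnrlon": max(min(longs) - pad, -180.0),
        "llcrnrlat": max(min(lats) - pad, -85.0),
        "urcrnrlon": min(max(longs) + pad, 180.0),
        "urcrnrlat": min(max(lats) + pad, 85.0),
    }

# Hypothetical extent of a set of points.
corners = map_corners([-17.0, 51.0], [-35.0, 37.0], pad=1.0)
print(corners)
```

The resulting dictionary could then be passed along as `Basemap(projection='merc', **corners)`.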

Plotting points
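import pandas as pd

africa = pd.read_csv('africa.csv')
# Column names must match the CSV header exactly
longs = africa['CapitalLongtiude']
lats = africa['CapitalLatitude']

m = Basemap(...)

# zorder=0 draws the continents beneath the scatter points
m.fillcontinents(color='white',
                 zorder=0)
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')

# latlon=True: the inputs are longitude/latitude in degrees
m.scatter(longs.values,
          lats.values,
          latlon=True,
          alpha=0.7)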
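
The same call works for tweets once their points are split into parallel longitude and latitude sequences. A minimal sketch of that preparation step follows, assuming tweets are dictionaries with an optional `coordinates` field; `tweet_points` is a hypothetical helper.

```python
def tweet_points(tweets):
    """Collect parallel lists of longitudes and latitudes from geotagged tweets."""
    longs, lats = [], []
    for tweet in tweets:
        coords = tweet.get("coordinates")
        if coords:  # skip tweets without an exact point
            lon, lat = coords["coordinates"]
            longs.append(lon)
            lats.append(lat)
    return longs, lats

# Hypothetical sample: two geotagged tweets and one without coordinates.
tweets = [
    {"coordinates": {"type": "Point", "coordinates": [-72.2833, 21.7833]}},
    {"coordinates": None},
    {"coordinates": {"type": "Point", "coordinates": [31.2357, 30.0444]}},
]

longs, lats = tweet_points(tweets)
print(longs, lats)
```

The two lists can be passed to `m.scatter` in place of `longs.values` and `lats.values`.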

Using color
import pandas as pd

africa = pd.read_csv('africa.csv')
# Column names must match the CSV header exactly
longs = africa['CapitalLongtiude']
lats = africa['CapitalLatitude']
arabic = africa['Arabic']

m = Basemap(...)
m.fillcontinents(color='white',
                 zorder=0)
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')

# c supplies one value per point; cmap maps those values to colors
m.scatter(longs.values,
          lats.values,
          latlon=True,
          c=arabic.values,
          cmap='Paired',
          alpha=1)
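
Coloring by `c` and `cmap` expects numeric values. If a grouping variable is stored as text labels instead, one option is to encode each label as a small integer first. The helper below is a hypothetical sketch of that step, and the labels are made up for illustration.

```python
def to_codes(labels):
    """Map each distinct label to an integer, in order of first appearance."""
    lookup = {}
    codes = []
    for label in labels:
        # New labels receive the next unused integer.
        codes.append(lookup.setdefault(label, len(lookup)))
    return codes, lookup

# Hypothetical labels, e.g. a stance assigned to each tweet.
codes, lookup = to_codes(["for", "against", "for", "against", "against"])
print(codes)
print(lookup)
```

The `codes` list can then be passed as `c`, with `lookup` kept to interpret the colors.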


Congratulations!

