Visualization in Python

The document provides an overview of data visualization in Python, highlighting key libraries such as ggplot and Folium. It discusses various visualization techniques including standard charts, thematic maps, and the use of aesthetics and geometric objects in ggplot. The conclusion emphasizes Python's diverse visualization options and its integration with data processing tools.


Data visualization in Python

Martijn Tennekes, Ali Hürriyetoglu

THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION

Eurostat
Outline

• Overview of data visualization in Python
• ggplot
• Folium
• Conclusion

Which packages/functions

• Standard charts (e.g. line chart, bar chart, scatter plot):
  – Matplotlib, Pandas, Seaborn, ggplot, Altair, …
• Thematic maps:
  – Folium, Basemap, Cartopy, Iris, …
• Other visualisations:
  – Bokeh (interactive plots), plotly, …

ggplot

• Based on one of the most popular R packages (ggplot2)

• Based on the Grammar of Graphics (Wilkinson, 2005)

• Charts are built up according to this grammar:
  – data
  – mapping / aesthetics
  – geoms
  – stats
  – scales
  – coord
  – facets
• Pandas DataFrames are used natively in ggplot.
ggplot and qplot

ggplot(mpg, aes(x='displ', y='cty')) + \
    geom_point()

Data: a DataFrame (here mpg)
Aesthetics: x, y, color, fill, shape
Geometry: points
Layers and transformations are stacked with +

Shortcut function: qplot (quick plot):

qplot(diamonds.carat, diamonds.price)
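
The stacking of layers with + can be illustrated with a small toy model. This is only a sketch of the idea, not ggplot's implementation: the classes Plot and Layer and the two sample rows are hypothetical and exist purely for illustration.

```python
class Layer:
    """One drawable layer, e.g. points or lines (hypothetical stand-in for a geom)."""

    def __init__(self, name):
        self.name = name


class Plot:
    """Holds the data, the aesthetic mapping, and the layers added so far."""

    def __init__(self, data, mapping, layers=()):
        self.data = data
        self.mapping = mapping
        self.layers = tuple(layers)

    def __add__(self, layer):
        # Return a new Plot so the left-hand operand stays unchanged.
        return Plot(self.data, self.mapping, self.layers + (layer,))


# Made-up rows standing in for a DataFrame.
rows = [{"displ": 1.8, "cty": 18}, {"displ": 2.0, "cty": 21}]

base = Plot(rows, {"x": "displ", "y": "cty"})
plot = base + Layer("point") + Layer("line")

layer_names = [layer.name for layer in plot.layers]
print(layer_names)
```

Because each + returns a new object, a base plot can be reused and extended with different layers without side effects.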

Aesthetics

Mapping of data to visual attributes of geometric objects:

– Position: x, y
– Color: color
– Shape: shape

ggplot(aes(x='carat', y='price', color='clarity'), diamonds) + \
    geom_point()
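
What an aesthetic mapping does for a categorical variable can be sketched in plain Python: each distinct value of the column is assigned one visual value. The function map_aesthetic, the colour names, and the three rows are assumptions made for illustration; ggplot's own assignment of colours and shapes may differ.

```python
def map_aesthetic(rows, column, visual_values):
    """Assign one visual value (e.g. a colour) to every distinct level of a column."""
    levels = sorted({row[column] for row in rows})
    # Cycle through the visual values if there are more levels than values.
    lookup = {
        level: visual_values[i % len(visual_values)]
        for i, level in enumerate(levels)
    }
    return [lookup[row[column]] for row in rows], lookup


# Made-up rows; only the column name mirrors the diamonds data.
rows = [{"clarity": "SI2"}, {"clarity": "VS1"}, {"clarity": "SI2"}]

point_colors, legend = map_aesthetic(rows, "clarity", ["red", "blue", "green"])
print(point_colors)
print(legend)
```

The lookup table returned alongside the per-row values is what a legend would display.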
Aesthetics: shape

ggplot(aes(x='carat', y='price', shape='cut'), diamonds) + \
    geom_point()
Geom

• Geometric objects:
  – Points, lines, polygons, …
  – Functions start with “geom_”
• Also margins (error ranges):
  – geom_errorbar(), geom_pointrange(), geom_linerange()
  – Note: they require the aesthetics ymin and ymax.

ggplot(mpg, aes(x='displ', y='cty')) + \
    geom_point() + geom_line()
Stat

• stat_smooth() and stat_density() enable statistical transformation
• Most geoms have a default stat (and the other way round)
• A geom and a stat form a layer
• One or more layers form a plot
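
A statistical transformation turns the raw data into the values a geom actually draws. The sketch below shows the binning that sits behind a histogram. The function name bin_counts and the price values are made up for illustration; this is not the code ggplot runs.

```python
def bin_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = {}
    for value in values:
        # Left edge of the bin that contains this value.
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return sorted(counts.items())


# Made-up prices.
prices = [326, 334, 423, 1250, 1810, 1999, 2100]

# Each (bin_start, count) pair is what a bar geom would draw.
bars = bin_counts(prices, bin_width=1000)
print(bars)
```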

stat_smooth

ggplot(aes(x='date', y='beef'), data=meat) + geom_point() + \
    stat_smooth(method='loess')
stat_density

ggplot(aes(x='price', color='clarity'), data=diamonds) + stat_density()


Scales (and axes)

• A scale indicates how the value of a variable maps to an aesthetic
• Therefore:
  – A scale belongs to one aesthetic (x, y, color, fill, etc.)
  – The axis is an essential part of a scale
• With scale_XXX, the scales and axes can be adjusted (XXX stands for a combination of aesthetic and type of scale, e.g. scale_fill_gradient)
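
Conceptually, a scale is a function from data values to aesthetic values. The following sketch builds a linear and a logarithmic position scale. The names linear_scale and log_scale, the data domain, and the pixel range are assumptions for illustration, not part of ggplot.

```python
import math


def linear_scale(domain, output_range):
    """Return a function that maps values in `domain` linearly onto `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def log_scale(domain, output_range, base=10):
    """Like linear_scale, but equal ratios in the data become equal distances."""
    log_domain = tuple(math.log(v) / math.log(base) for v in domain)
    inner = linear_scale(log_domain, output_range)
    return lambda value: inner(math.log(value) / math.log(base))


# Hypothetical axis: data from 0 to 10 drawn on 100 pixels.
x_linear = linear_scale((0, 10), (0, 100))
# Hypothetical axis: data from 1 to 10000 drawn on 400 pixels.
x_log = log_scale((1, 10000), (0, 400))

print(x_linear(5))
print(x_log(100))
```

On the logarithmic scale the value 100 lands halfway along the axis, since it is halfway between 1 and 10000 in terms of powers of ten.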

scale_x_log

ggplot(diamonds, aes(x='price')) + geom_histogram() + scale_x_log(base=100)


Coord

• A chart is drawn in a coordinate system. This can be transformed.
• A pie chart has a polar coordinate system.

import numpy as np
import pandas as pd
from ggplot import *
df = pd.DataFrame({"x": np.arange(100)})
df['y'] = df.x * 10
# polar coords
p = ggplot(df, aes(x='x', y='y')) + geom_point() + coord_polar()
print(p)
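
The transformation behind a polar coordinate system can be sketched with the standard library alone. The sketch assumes that x is mapped to the angle and y to the radius. The helper to_polar_xy and its x_max parameter are hypothetical names chosen for this illustration.

```python
import math


def to_polar_xy(x, y, x_max):
    """Map x to an angle (x_max equals a full turn) and y to a radius.

    Returns the Cartesian position at which the point would be drawn.
    """
    theta = 2 * math.pi * x / x_max
    return (y * math.cos(theta), y * math.sin(theta))


# Same shape of data as in the example above: y grows linearly with x,
# so the points form a spiral once drawn in polar coordinates.
spiral = [to_polar_xy(x, x * 10, x_max=100) for x in range(100)]

start = to_polar_xy(0, 10, x_max=100)
quarter_turn = to_polar_xy(25, 10, x_max=100)
print(start)
print(quarter_turn)
```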
Facets

• With facets, small multiples are created.
• Each facet shows a subset of the data.

ggplot(diamonds, aes(x='price')) + \
geom_histogram() + \
facet_grid("cut")
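
The idea that each facet shows a subset of the data amounts to grouping the rows by the facet variable. A minimal sketch follows. The function split_into_facets and the three rows are made up for illustration, and ggplot's own faceting does more, such as sharing axes between panels.

```python
from collections import defaultdict


def split_into_facets(rows, facet_key):
    """Group rows by the value of facet_key; each group becomes one panel."""
    facets = defaultdict(list)
    for row in rows:
        facets[row[facet_key]].append(row)
    return dict(facets)


# Made-up rows; only the column names mirror the diamonds data.
rows = [
    {"cut": "Ideal", "price": 326},
    {"cut": "Premium", "price": 334},
    {"cut": "Ideal", "price": 340},
]

panels = split_into_facets(rows, "cut")
for cut, subset in sorted(panels.items()):
    print(cut, [row["price"] for row in subset])
```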

Facets example

ggplot(chopsticks, aes(x='chopstick_length',
                       y='food_pinching_effeciency')) + \
    geom_point() + \
    geom_line() + \
    scale_x_continuous(breaks=[150, 250, 350]) + \
    facet_wrap("individual")
Facets example 2

ggplot(diamonds, aes(x="carat", y="price", color="color",
                     shape="cut")) + geom_point() + facet_wrap("clarity")
ggplot tips
• You can annotate plots

ggplot(mtcars, aes(x='mpg')) + geom_histogram() + \
    xlab("Miles per Gallon") + ylab("# of Cars")

• Assign a plot to a variable, for instance g:

g = ggplot(mpg, aes(x='displ', y='cty')) + \
    geom_point()

• The save method writes the plot to a file in the desired format:

g.save("myimage.png")

Folium: Thematic maps
• A thematic map is a visualization in which statistical information with a spatial component is shown.
• Other libraries are: Basemap, Cartopy, Iris
• Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library.
• Manipulate your data in Python, then visualize it on a Leaflet map via Folium.

Folium features
• Built-in tilesets from OpenStreetMap, MapQuest Open, MapQuest Open Aerial, Mapbox, and Stamen
• Supports custom tilesets with Mapbox or Cloudmade API keys.
• Supports GeoJSON and TopoJSON overlays, as well as the binding of data to those overlays to create choropleth maps with color-brewer color schemes.

Basic Maps

folium.Map(location=[50.89, 5.99], zoom_start=14)


Basic maps

folium.Map(location=[50.89, 5.99], zoom_start=14, tiles='Stamen Toner')


GeoJSON/TopoJSON Overlays

# geo_path and topo_path point to a GeoJSON file and a TopoJSON file
ice_map = folium.Map(location=[-59, -11], tiles='Mapbox Bright', zoom_start=2)
ice_map.geo_json(geo_path=geo_path)
ice_map.geo_json(geo_path=topo_path, topojson='objects.antarctic_ice_shelf')
ice_map.create_map(path='ice_map.html')
Choropleth maps

map = folium.Map(location=[48, -102], zoom_start=3)
map.choropleth(geo_path=state_geo, data=state_data,
               columns=['State', 'Unemployment'], key_on='feature.id',
               fill_color='YlGn', fill_opacity=0.7, line_opacity=0.2,
               legend_name='Unemployment Rate (%)')
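
How data is bound to a GeoJSON overlay can be sketched without Folium: a dotted path such as 'feature.id' picks the join key out of each feature, and the matching data value is binned into a colour class. Everything in the sketch is an assumption for illustration: the helper names, the equal-width binning, the hand-written light-to-dark palette, and the unemployment figures, which are made up. It does not reproduce Folium's internals.

```python
def resolve(feature, key_on):
    """Follow a dotted path such as 'feature.id' into a GeoJSON feature dict."""
    value = feature
    for part in key_on.split(".")[1:]:  # skip the leading 'feature'
        value = value[part]
    return value


def choropleth_colors(geojson, values, key_on, palette):
    """Pick a palette colour per feature, using equal-width classes."""
    low, high = min(values.values()), max(values.values())
    step = (high - low) / len(palette) or 1  # avoid a zero step
    colors = {}
    for feature in geojson["features"]:
        key = resolve(feature, key_on)
        if key not in values:
            continue  # no data for this region: leave it unfilled
        index = min(int((values[key] - low) / step), len(palette) - 1)
        colors[key] = palette[index]
    return colors


# Minimal made-up GeoJSON: geometries are omitted because only ids matter here.
state_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "AL", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "AK", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "TX", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "HI", "properties": {}, "geometry": None},
    ],
}
unemployment = {"AL": 4.0, "AK": 7.0, "TX": 10.0}  # made-up values
palette = ["#f7fcb9", "#addd8e", "#31a354"]  # light to dark

fill = choropleth_colors(state_geo, unemployment, "feature.id", palette)
print(fill)
```

Regions without a matching data value are skipped here; a mapping library may instead draw them in a neutral colour.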
Summary

• Python has many options for data visualization
• Each visualisation library has a particular audience
• A JavaScript backend is mostly used to extend the power of the visualisation
• Python’s extensive data processing tools integrate well with visualisation requirements

References
• http://yhat.github.io/ggplot/
• https://folium.readthedocs.io/en/latest/
