
Working With Categorical Data Chapter3

This document discusses using the catplot function in Seaborn to create categorical plots from a dataset of Las Vegas hotel reviews. It demonstrates how to create box plots, bar plots, point plots, and count plots using variables like review score, hotel amenities, and traveler type. Parameters like hue, order, and facetgrid are shown to split the data into multiple groups or arrange subplots. The goal is to introduce basic categorical plotting in Python using Seaborn for exploratory data analysis.


Introduction to categorical plots using Seaborn
WORKING WITH CATEGORICAL DATA IN PYTHON

Kasey Jones
Research Data Scientist
Our third dataset
Name: Las Vegas TripAdvisor Reviews - reviews
Rows: 504

Columns: 20



Las Vegas reviews
reviews.info()

RangeIndex: 504 entries, 0 to 503
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   User country   504 non-null    object
...
 6   Traveler type  504 non-null    object
 7   Pool           504 non-null    object
 8   Gym            504 non-null    object
 9   Tennis court   504 non-null    object
...
dtypes: int64(7), object(13)
memory usage: 78.9+ KB

1 https://www.kaggle.com/crawford/las-vegas-tripadvisor-reviews



Seaborn
Introduction to Data Visualization with Seaborn
Intermediate Data Visualization with Seaborn

Categorical plots:

import seaborn as sns
import matplotlib.pyplot as plt

sns.catplot(...)

plt.show()



The catplot function
Parameters:

x : name of variable in data

y : name of variable in data

data : a DataFrame

kind : type of plot to create, one of "strip", "swarm", "box", "violin", "boxen", "point", "bar", or "count"
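To make the parameter list concrete, here is a minimal sketch that calls catplot() several times with the same x, y, and data and changes only kind. The DataFrame toy and its values are hypothetical stand-ins for the reviews data, and the Agg backend line is an assumption made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Pool": ["YES", "YES", "NO", "NO", "YES", "NO"],
    "Score": [5, 4, 3, 4, 5, 2],
})

# Same x, y, and data each time; only kind changes
grids = {}
for kind in ["strip", "box", "point", "bar"]:
    grids[kind] = sns.catplot(x="Pool", y="Score", data=toy, kind=kind)

plt.close("all")  # release the figures once they have been inspected
```

Each call returns a seaborn FacetGrid, so the same follow-up methods apply whichever kind is chosen.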



Box plot

Box plot wiki page



Review score
reviews["Score"].value_counts()

5 227
4 164
3 72
2 30
1 11



Box plot example
sns.catplot(
x="Pool",
y="Score",
data=reviews,
kind="box"
)
plt.show()
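The box in a box plot spans the first to third quartile of y within each x category, with a line at the median. As a rough sketch, the same numbers can be computed directly with pandas. The DataFrame toy and its scores are hypothetical, not the actual reviews data.

```python
import pandas as pd

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Pool": ["YES"] * 5 + ["NO"] * 5,
    "Score": [5, 4, 5, 3, 4, 3, 4, 2, 5, 1],
})

# Quartiles of Score within each Pool category:
# the 0.25 and 0.75 columns are the box edges, 0.5 is the median line
quartiles = (
    toy.groupby("Pool")["Score"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)
```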



Two quick options
# Setting font size and plot background
sns.set(font_scale=1.4)
sns.set_style("whitegrid")

sns.catplot(
x="Pool",
y="Score",
data=reviews,
kind="box"
)
plt.show()



Boxplot practice
Seaborn bar plots
Traditional bar chart
# Code provided for clarity
reviews["Traveler type"].value_counts().plot.bar()



The syntax
sns.set(font_scale=1.3)
sns.set_style("darkgrid")
sns.catplot(x="Traveler type", y="Score", data=reviews, kind="bar")
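Unlike the traditional bar chart of counts, kind="bar" by default draws the mean of y for each x category. A small sketch of the numbers behind the bars, using a hypothetical toy DataFrame rather than the real reviews:

```python
import pandas as pd

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Traveler type": ["Business", "Business", "Couples", "Couples", "Solo", "Solo"],
    "Score": [3, 4, 5, 4, 2, 4],
})

# Mean Score per traveler type: the bar heights that
# sns.catplot(x="Traveler type", y="Score", data=toy, kind="bar") would show
mean_scores = toy.groupby("Traveler type")["Score"].mean()
print(mean_scores)
```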



Ordering your categories
reviews["Traveler type"] = reviews["Traveler type"].astype("category")
reviews["Traveler type"].cat.categories

Index(['Business', 'Couples', 'Families', 'Friends', 'Solo'], dtype='object')



Updated visualization
sns.catplot(x="Traveler type", y="Score", data=reviews, kind="bar")

Note: catplot() has an order parameter
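Two ways to control category order can be sketched as follows: passing order to catplot(), or storing the order on the categorical column with pandas. The toy DataFrame and the particular order in new_order are hypothetical and chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Traveler type": ["Business", "Couples", "Families", "Friends", "Solo"] * 2,
    "Score": [3, 5, 4, 4, 2, 4, 4, 5, 3, 3],
})

# Hypothetical display order, chosen only for illustration
new_order = ["Solo", "Couples", "Families", "Friends", "Business"]

# Option 1: pass the order directly to catplot
g = sns.catplot(x="Traveler type", y="Score", data=toy, kind="bar",
                order=new_order)

# Option 2: store the order on the column itself
toy["Traveler type"] = (
    toy["Traveler type"].astype("category").cat.reorder_categories(new_order)
)
print(list(toy["Traveler type"].cat.categories))
```

Option 1 affects a single plot, while option 2 changes the column, so later plots of it should pick up the same order.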



The hue parameter
hue :
name of a variable in data

used to split the data by a second category

also used to color the graphic

sns.set(font_scale=1.2)
sns.set_style("darkgrid")
sns.catplot(x="Traveler type", y="Score", data=reviews, kind="bar",
hue="Tennis court") # <--- new parameter
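With hue set, each x category is split into one bar per hue level. As a sketch, the values behind those bars are the mean of y for every combination of the two columns. The toy DataFrame below is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Traveler type": ["Business"] * 4 + ["Solo"] * 4,
    "Tennis court": ["NO", "NO", "YES", "YES"] * 2,
    "Score": [3, 4, 5, 4, 2, 3, 4, 5],
})

# One bar per (Traveler type, Tennis court) pair
g = sns.catplot(x="Traveler type", y="Score", data=toy, kind="bar",
                hue="Tennis court")

# The same split computed by hand: mean Score per pair
group_means = toy.groupby(["Traveler type", "Tennis court"])["Score"].mean()
print(group_means)
```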



Bar plot across two variables



Bar plot practice
Point and count plots
Point plot example
sns.catplot(x="Pool", y="Score", data=reviews, kind="point") # <--- updated



Bar plot vs. point plot
[Figure: a bar plot (left) next to a point plot (right)]



Point plot with hue
sns.catplot(x="Spa", y="Score", data=reviews, kind="point",
hue="Tennis court", dodge=True # < --- New Parameter!
)



Using the join parameter
sns.catplot(x="Score",
y="Review weekday",
data=reviews,
kind="point",
join=False # < --- New! (deprecated since seaborn 0.13; use linestyle="none" there)
)



One last catplot type
sns.catplot(x="Tennis court", data=reviews, kind="count", hue="Spa")
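A count plot needs only x because the bar heights are row counts. With hue added, the heights match a cross-tabulation of the two columns, which the following sketch computes on a hypothetical toy DataFrame:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Tennis court": ["NO", "NO", "NO", "YES", "YES", "NO"],
    "Spa": ["YES", "NO", "YES", "YES", "YES", "NO"],
})

# Number of rows for every (Tennis court, Spa) combination
counts = pd.crosstab(toy["Tennis court"], toy["Spa"])
print(counts)

# The count plot draws one bar per cell of that table
g = sns.catplot(x="Tennis court", data=toy, kind="count", hue="Spa")
```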



Time to practice!
Additional catplot() options
Difficulties with categorical plots



Using the catplot() facetgrid



Using different arguments
sns.catplot(x="Traveler type", kind="count",
col="User continent",
col_wrap=3,
palette=sns.color_palette("Set1"), data=reviews)

x : "Traveler type"

kind : "count"

col : "User continent"

col_wrap : 3

palette : sns.color_palette("Set1")

Common colors: "Set1", "Set2", "tab10", "Paired"
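The sketch below shows what these arguments produce on a small scale: col creates one subplot per category, col_wrap limits how many subplots sit in a row, and sns.color_palette() returns a list of RGB tuples. The toy DataFrame is hypothetical. The sketch also assigns hue to the x column with dodge=False and legend=False, because recent seaborn releases warn when palette is passed without hue; that addition is a choice made here, not part of the slide's call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Traveler type": ["Business", "Couples", "Solo", "Couples",
                      "Families", "Solo", "Business", "Couples"],
    "User continent": ["Europe", "Europe", "Europe", "Asia",
                       "Asia", "Asia", "North America", "North America"],
})

# One color per traveler type, taken from the "Set1" palette
n_types = toy["Traveler type"].nunique()
palette = sns.color_palette("Set1", n_colors=n_types)

# One subplot per continent, at most two subplots per row
g = sns.catplot(x="Traveler type", kind="count",
                col="User continent", col_wrap=2,
                hue="Traveler type", dodge=False, legend=False,
                palette=palette, data=toy)

print(len(g.axes.flat))  # number of subplots created
```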


1 http://seaborn.pydata.org/tutorial/color_palettes.html



One more look



Updating plots
Setup: save your graphic as an object: ax
Plot title: ax.fig.suptitle("My title")

Axis labels: ax.set_axis_labels("x-axis-label", "y-axis-label")

Title height: plt.subplots_adjust(top=.9)

ax = sns.catplot(x="Traveler type", col="User continent", col_wrap=3,
                 kind="count", palette=sns.color_palette("Set1"), data=reviews)
ax.fig.suptitle("Hotel Score by Traveler Type & User Continent")
ax.set_axis_labels("Traveler Type", "Number of Reviews")
plt.subplots_adjust(top=.9)
plt.show()
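The same steps can be tried on a small scale with the sketch below. It uses a hypothetical toy DataFrame and names the returned object g instead of ax, as a reminder that catplot() returns a FacetGrid holding several axes rather than a single matplotlib Axes. The title text is also made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the reviews DataFrame
toy = pd.DataFrame({
    "Traveler type": ["Business", "Couples", "Solo", "Couples",
                      "Families", "Solo", "Business", "Couples"],
    "User continent": ["Europe", "Europe", "Europe", "Asia",
                       "Asia", "Asia", "North America", "North America"],
})

g = sns.catplot(x="Traveler type", col="User continent", col_wrap=2,
                kind="count", data=toy)

# Figure-level title, shared axis labels, and room above the subplots
title = g.fig.suptitle("Reviews by Traveler Type and User Continent")
g.set_axis_labels("Traveler Type", "Number of Reviews")
g.fig.subplots_adjust(top=0.9)
```

Calling subplots_adjust on g.fig targets this figure explicitly, whereas plt.subplots_adjust acts on whichever figure is currently active.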



Finished product



catplot() practice