Introduction To Seaborn: Chris Mo

This document introduces the Seaborn library for statistical data visualization in Python. It discusses Seaborn's relationship to Matplotlib and Pandas, and how Seaborn builds on them to provide more complex visualization types. Some key plot types covered include distribution plots, regression plots, and options for faceting data. Examples are provided of customizing distribution plots using arguments like kde, hist, and rug, as well as higher-level plotting functions in Seaborn like lmplot for regression analysis. The goal is to help intermediate Python users learn to leverage Seaborn for statistical data visualization.

Introduction to Seaborn
INTERMEDIATE DATA VISUALIZATION WITH SEABORN

Chris Mo
Instructor
Python Visualization Landscape
The Python visualization landscape is complex and can be overwhelming

Matplotlib
matplotlib provides the raw building blocks for Seaborn's visualizations

It can also be used on its own to plot data

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("wines.csv")
fig, ax = plt.subplots()
ax.hist(df['alcohol'])

Pandas
pandas is a foundational library for analyzing data

It also supports basic plotting capability

import pandas as pd
df = pd.read_csv("wines.csv")
df['alcohol'].plot.hist()

Seaborn
Seaborn supports complex visualizations of data

It is built on matplotlib and works best with pandas DataFrames

Seaborn
The distplot is similar to the histogram shown in the previous examples

By default, it also draws a Gaussian kernel density estimate (KDE)

import seaborn as sns

sns.distplot(df['alcohol'])

Histogram vs. Distplot
Pandas histogram: df['alcohol'].plot.hist()
Actual frequency of observations
No automatic labels
Wide bins

Seaborn distplot: sns.distplot(df['alcohol'])
Automatic label on x axis
Muted color palette
KDE plot
Narrow bins

Let's practice!

Using the distribution plot

Chris Mo
Instructor
Creating a histogram
The distplot function has multiple optional arguments

To plot a simple histogram, disable the KDE and specify the number of bins to use

sns.distplot(df['alcohol'], kde=False, bins=10)

Alternative data distributions
A rug plot is an alternative way to view the distribution of data

A KDE curve and a rug plot can be combined

sns.distplot(df['alcohol'], hist=False, rug=True)

Further Customizations
The distplot function uses several functions, including kdeplot and rugplot

It is possible to further customize a plot by passing arguments to the underlying function

sns.distplot(df['alcohol'], hist=False,
             rug=True, kde_kws={'shade': True})

Let's practice!

Regression Plots in Seaborn

Chris Mo
Instructor
Introduction to regplot
The regplot function generates a scatter plot with a regression line

Usage is similar to the distplot

The data and x and y variables must be defined

sns.regplot(x="alcohol", y="pH", data=df)
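
A self-contained sketch of the regplot call above. Since wines.csv is not available here, the DataFrame is synthetic: the alcohol and pH values, the slope, and the noise level are all assumptions chosen only to give the regression line something to fit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for wines.csv with a weak linear relationship
rng = np.random.default_rng(1)
alcohol = rng.uniform(8, 14, size=150)
df = pd.DataFrame({
    "alcohol": alcohol,
    "pH": 3.3 - 0.01 * alcohol + rng.normal(0, 0.1, size=150),
})

fig, ax = plt.subplots()
# regplot draws the scatter points, a fitted line, and a confidence band
# on the given Axes, and labels both axes from the column names
sns.regplot(x="alcohol", y="pH", data=df, ax=ax)
```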

lmplot() builds on top of the base regplot()
regplot - low level

sns.regplot(x="alcohol",
            y="quality",
            data=df)

lmplot - high level

sns.lmplot(x="alcohol",
           y="quality",
           data=df)

lmplot faceting
Organize data by colors (hue)

sns.lmplot(x="quality",
           y="alcohol",
           data=df,
           hue="type")

Organize data by columns (col)

sns.lmplot(x="quality",
           y="alcohol",
           data=df,
           col="type")
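
The difference between hue and col shows up in the grid that lmplot returns. The sketch below runs both calls on a synthetic, hypothetical DataFrame (the quality, alcohol, and type values are assumptions standing in for wines.csv) and inspects the shape of the resulting axes array.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for wines.csv: 100 "red" and 100 "white" rows
rng = np.random.default_rng(2)
quality = rng.integers(3, 9, size=200)  # integer scores from 3 to 8
df = pd.DataFrame({
    "quality": quality,
    "alcohol": 9.0 + 0.3 * quality + rng.normal(0, 0.5, size=200),
    "type": np.repeat(["red", "white"], 100),
})

# hue: both wine types share one plot, distinguished by color
g_hue = sns.lmplot(x="quality", y="alcohol", data=df, hue="type")

# col: each wine type gets its own column in the grid
g_col = sns.lmplot(x="quality", y="alcohol", data=df, col="type")

hue_shape = g_hue.axes.shape  # a single subplot
col_shape = g_col.axes.shape  # one row, one column per type
```

Unlike regplot, lmplot creates its own figure and returns a FacetGrid, which is why no ax argument is passed.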

Let's practice!