Chapter 3

The document provides an overview of categorical data visualization techniques using Seaborn, including various plot types such as stripplots, swarmplots, boxplots, and barplots. It also covers regression plots, including how to evaluate models with residual plots and perform polynomial regression. Additionally, it discusses matrix plots and heatmaps for visualizing relationships in data, emphasizing the importance of data formatting for effective visualization.

Categorical Plot Types
DATA VISUALIZATION WITH SEABORN

Chris Moffitt
Instructor
Categorical Data
Data that takes on a limited and fixed number of values

Normally combined with numeric data

Examples include:
Geography (country, state, region)

Gender

Ethnicity

Blood type

Eye color

Plot types - show each observation
stripplot() and swarmplot()

Plot types - abstract representations
boxplot(), violinplot(), and boxenplot() (formerly lvplot())

Plot types - statistical estimates
barplot(), pointplot(), and countplot()

Plots of each observation - stripplot
sns.stripplot(data=df, y="DRG Definition",
              x="Average Covered Charges",
              jitter=True)
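
A minimal, self-contained sketch of the stripplot call above. The slides' df (hospital charge data) is not included in this document, so the small DataFrame below is made up; only the column names are borrowed from the slide. The point count at the end is an illustrative check that one marker is drawn per row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the slides' df: 3 categories x 30 observations each
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "DRG Definition": np.repeat(["A", "B", "C"], 30),
    "Average Covered Charges": rng.normal(loc=30000, scale=5000, size=90),
})

fig, ax = plt.subplots()
# One marker per row; jitter spreads the markers within each category band
sns.stripplot(data=df, y="DRG Definition", x="Average Covered Charges",
              jitter=True, ax=ax)

# Count the markers actually drawn on the axes
n_points = sum(len(c.get_offsets()) for c in ax.collections)
print(n_points)
plt.close(fig)
```

Swapping sns.stripplot for sns.swarmplot (and dropping jitter) in this sketch gives the non-overlapping variant shown on the next slide.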

Plots of each observation - swarmplot
sns.swarmplot(data=df, y="DRG Definition",
              x="Average Covered Charges")

Abstract representations - boxplot
sns.boxplot(data=df, y="DRG Definition",
            x="Average Covered Charges")
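
To make the "abstract representation" concrete, the sketch below computes the per-category quartiles that a box plot summarizes. It uses pandas only, and the data is synthetic (the slides' df is not available here); the column names are reused from the slide purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 2 categories x 50 observations each
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "DRG Definition": np.repeat(["A", "B"], 50),
    "Average Covered Charges": rng.normal(30000, 5000, 100),
})

# Quartiles per category: the box spans q1..q3 and the inner line marks the median
summary = (df.groupby("DRG Definition")["Average Covered Charges"]
             .quantile([0.25, 0.5, 0.75])
             .unstack())
summary.columns = ["q1", "median", "q3"]

# Interquartile range; by default the whiskers reach the most extreme
# observations within 1.5 * IQR of the box edges
summary["iqr"] = summary["q3"] - summary["q1"]
print(summary)
```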

Abstract representation - violinplot
sns.violinplot(data=df, y="DRG Definition",
               x="Average Covered Charges")

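Abstract representation - boxenplot (formerly lvplot)
lvplot() was renamed to boxenplot() in seaborn 0.9 and the old name was later removed

sns.boxenplot(data=df, y="DRG Definition",
              x="Average Covered Charges")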

Statistical estimates - barplot
sns.barplot(data=df, y="DRG Definition",
            x="Average Covered Charges",
            hue="Region")
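
A hedged sketch of what the bars represent: by default barplot draws the mean of each category/hue group (plus an uncertainty interval). The data below is synthetic and the "Region" values are invented; the comparison against a pandas groupby is only an illustrative check, not part of the original slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 2 categories x 2 regions x 20 observations each
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "DRG Definition": np.tile(np.repeat(["A", "B"], 20), 2),
    "Region": np.repeat(["East", "West"], 40),
    "Average Covered Charges": rng.normal(30000, 5000, 80),
})

# The quantity barplot estimates by default: the mean of each group
group_means = df.groupby(["DRG Definition", "Region"])["Average Covered Charges"].mean()

fig, ax = plt.subplots()
sns.barplot(data=df, y="DRG Definition", x="Average Covered Charges",
            hue="Region", ax=ax)

# Bar lengths along the x-axis; zero-width patches (if any) are not data bars
bar_lengths = sorted(p.get_width() for p in ax.patches if p.get_width() > 0)
print(bar_lengths)
print(sorted(group_means))
plt.close(fig)
```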

Statistical estimates - pointplot
sns.pointplot(data=df, y="DRG Definition",
              x="Average Covered Charges",
              hue="Region")

Statistical estimates - countplot
sns.countplot(data=df, y="DRG_Code", hue="Region")
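
A countplot needs no numeric column: each bar is simply the number of rows in a group. The pandas-only sketch below reproduces those counts for a tiny made-up table; the DRG codes and regions are placeholders, not values from the slides' dataset.

```python
import pandas as pd

# Hypothetical rows: one record per (DRG_Code, Region) observation
df = pd.DataFrame({
    "DRG_Code": ["039", "039", "057", "057", "057", "064"],
    "Region": ["East", "West", "East", "East", "West", "West"],
})

# Rows per (DRG_Code, Region) pair: the bar lengths a countplot would draw
counts = df.groupby(["DRG_Code", "Region"]).size().unstack(fill_value=0)
print(counts)
```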

Regression Plots

Bicycle Dataset
Aggregated bicycle sharing data in Washington DC

Data includes:
Rental amounts

Weather information

Calendar information

Can we predict rental amounts?

Plotting with regplot()
sns.regplot(data=df, x='temp',
            y='total_rentals', marker='+')

Evaluating regression with residplot()
A residual plot is useful for evaluating the fit of a model

Seaborn supports this through the residplot() function

sns.residplot(data=df, x='temp', y='total_rentals')
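
The sketch below shows the quantity a residual plot displays, computed by hand with NumPy: fit a straight line, then subtract its predictions from the observed values. The temperature and rental numbers are synthetic, with a deliberately curved trend, and np.polyfit stands in for whatever fitting routine a plotting library uses internally (an assumption made for illustration).

```python
import numpy as np

# Hypothetical data: rentals rise with temperature, then level off (curved trend)
rng = np.random.default_rng(3)
temp = np.linspace(0.0, 1.0, 200)
total_rentals = 1000 + 8000 * temp - 5000 * temp**2 + rng.normal(0, 300, temp.size)

# Ordinary least-squares line, then residual = observed - predicted
slope, intercept = np.polyfit(temp, total_rentals, deg=1)
residuals = total_rentals - (slope * temp + intercept)

# With an intercept in the model, least-squares residuals average to about zero
print(float(residuals.mean()))

# Mean residual at the low end, the middle, and the high end of the x range.
# A systematic pattern across these (rather than random scatter around zero)
# suggests a straight line is a poor fit.
print(residuals[:20].mean(), residuals[90:110].mean(), residuals[-20:].mean())
```

If the residuals scatter randomly around zero across the whole x range, the linear model is a reasonable choice; a curve like the one produced here points toward a higher-order fit.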

Polynomial regression
Seaborn supports polynomial regression using the order parameter

sns.regplot(data=df, x='temp',
            y='total_rentals', order=2)

residplot with polynomial regression
sns.residplot(data=df, x='temp',
              y='total_rentals', order=2)
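
One way to quantify what the order=2 residual plot shows visually is to compare the residual sum of squares of a first-order and a second-order fit. This is a NumPy sketch on synthetic data; the helper residual_sum_of_squares is a hypothetical name introduced here, and np.polyfit is used as an illustrative stand-in for the plotting library's own fit.

```python
import numpy as np

# Hypothetical data with a curved (quadratic) relationship plus noise
rng = np.random.default_rng(4)
temp = np.linspace(0.0, 1.0, 200)
total_rentals = 1000 + 8000 * temp - 5000 * temp**2 + rng.normal(0, 300, temp.size)


def residual_sum_of_squares(x, y, order):
    """Fit a polynomial of the given order and return the sum of squared residuals."""
    coefficients = np.polyfit(x, y, deg=order)
    return float(np.sum((y - np.polyval(coefficients, x)) ** 2))


rss_linear = residual_sum_of_squares(temp, total_rentals, order=1)
rss_quadratic = residual_sum_of_squares(temp, total_rentals, order=2)
print(rss_linear, rss_quadratic)
```

A lower value for the second-order fit is expected on data like this, but a higher order always fits the training data at least as well, so the residual plot's shape remains the more informative check.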

Categorical values
sns.regplot(data=df, x='mnth', y='total_rentals',
            x_jitter=.1, order=2)

Estimators
In some cases, an x_estimator can be useful for highlighting trends

sns.regplot(data=df, x='mnth', y='total_rentals',
            x_estimator=np.mean, order=2)
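
With x_estimator=np.mean, the individual observations at each distinct x value are collapsed into one point estimate. The pandas sketch below computes those per-month means directly. The data is synthetic (a made-up seasonal curve with noise), and np here is NumPy imported as `import numpy as np`, which the slide's call also assumes.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data: two passes through a 12-month cycle, 30 days per month
rng = np.random.default_rng(5)
mnth = np.tile(np.repeat(np.arange(1, 13), 30), 2)
seasonal = 5000 - 100 * (mnth - 7) ** 2  # invented pattern peaking in month 7
df = pd.DataFrame({
    "mnth": mnth,
    "total_rentals": seasonal + rng.normal(0, 300, mnth.size),
})

# One point estimate per distinct x value: the mean of total_rentals per month
monthly_mean = df.groupby("mnth")["total_rentals"].mean()
print(monthly_mean.round(0))
```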

Binning the data
x_bins can be used to divide the data into discrete bins

The regression line is still fit against all the data

sns.regplot(data=df, x='temp', y='total_rentals',
            x_bins=4)
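
The distinction between binned points and the underlying fit can be sketched with pandas and NumPy. The binning scheme here (four equal-count bins via pd.qcut) is an assumption chosen for illustration and is not necessarily how seaborn picks its bins; the data is synthetic.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a linear trend with noise
rng = np.random.default_rng(6)
temp = rng.uniform(0.0, 1.0, 400)
df = pd.DataFrame({
    "temp": temp,
    "total_rentals": 1000 + 5000 * temp + rng.normal(0, 500, temp.size),
})

# Four equal-count bins, used only to summarize the scatter: one point per bin
df["temp_bin"] = pd.qcut(df["temp"], q=4)
binned = df.groupby("temp_bin", observed=True)[["temp", "total_rentals"]].mean()

# The regression itself still uses every row, not the four binned points
slope, intercept = np.polyfit(df["temp"], df["total_rentals"], deg=1)
print(binned)
print(slope, intercept)
```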

Matrix Plots

Getting data in the right format
Seaborn's heatmap() function requires data to be in a grid format

The pandas crosstab() function is frequently used to reshape the data into that grid

pd.crosstab(df["mnth"], df["weekday"],
            values=df["total_rentals"], aggfunc='mean').round(0)
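
A small worked example of the reshaping step, using six made-up rows so the result can be checked by hand. The column names follow the slide; the values are invented.

```python
import pandas as pd

# Hypothetical long-format data: one row per observation
df = pd.DataFrame({
    "mnth":          [1, 1, 1, 2, 2, 2],
    "weekday":       [0, 0, 1, 0, 1, 1],
    "total_rentals": [100, 200, 300, 400, 500, 700],
})

# Rows become months, columns become weekdays, cells hold the mean rentals
grid = pd.crosstab(df["mnth"], df["weekday"],
                   values=df["total_rentals"], aggfunc="mean").round(0)
print(grid)
```

For instance, the cell for month 1, weekday 0 averages the two matching rows (100 and 200), and the cell for month 2, weekday 1 averages 500 and 700.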

Build a heatmap
sns.heatmap(pd.crosstab(df["mnth"], df["weekday"],
                        values=df["total_rentals"], aggfunc='mean'))

Customize a heatmap
sns.heatmap(df_crosstab, annot=True, fmt="d",
            cmap="YlGnBu", cbar=False, linewidths=.5)

Centering a heatmap
Seaborn supports centering the heatmap colors on a specific value

sns.heatmap(df_crosstab, annot=True, fmt="d",
            cmap="YlGnBu", cbar=True,
            center=df_crosstab.loc[9, 6])
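
A self-contained sketch of the centered heatmap call. The slides do not show how df_crosstab was built, so the 3x3 grid below is a hypothetical stand-in with integer values. That matters because fmt="d" is an integer format code; a grid of floats (for example, the output of a mean aggregation) would need a float format such as ".0f" instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical month x weekday grid of mean rentals (integers, so fmt="d" is valid)
df_crosstab = pd.DataFrame(
    [[1200, 1500, 1400], [3100, 3600, 3300], [4800, 5200, 5000]],
    index=pd.Index([1, 5, 9], name="mnth"),
    columns=pd.Index([0, 3, 6], name="weekday"),
)

fig, ax = plt.subplots()
# center pins the midpoint of the colormap to the value in row 9, column 6
sns.heatmap(df_crosstab, annot=True, fmt="d", cmap="YlGnBu", cbar=True,
            center=df_crosstab.loc[9, 6], ax=ax)

# Collect the cell annotations that were drawn
annotations = [t.get_text() for t in ax.texts]
print(annotations)
plt.close(fig)
```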

Plotting a correlation matrix
The pandas corr() function calculates correlations between columns in a DataFrame

The output can be converted to a heatmap with seaborn

sns.heatmap(df.corr())
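
A pandas-only sketch of the correlation step on synthetic data. Depending on the pandas version, calling corr() on a frame that still contains text columns can raise an error, so this sketch selects the numeric columns first; that selection is an addition for illustration, not something shown on the slide. The column names other than temp and total_rentals are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one non-numeric column
rng = np.random.default_rng(7)
temp = rng.uniform(0.0, 1.0, 300)
df = pd.DataFrame({
    "season": rng.choice(["winter", "summer"], size=temp.size),  # non-numeric
    "temp": temp,
    "hum": rng.uniform(0.2, 0.9, temp.size),
    "total_rentals": 1000 + 5000 * temp + rng.normal(0, 500, temp.size),
})

# Keep numeric columns only, then compute pairwise Pearson correlations
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```

The resulting square, symmetric table is already in the grid format that heatmap() expects, so it can be passed to sns.heatmap() directly.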

