Day2Part2. DataVisualization

Uploaded by

askaraskergazy
Copyright
© © All Rights Reserved

Overview

Introduction to Data Visualization


Fundamentals of Matplotlib library
Seaborn library
3D Visualization
Practice in Python
Learning outcomes
At the end of the training session you will be able to:
work with the Matplotlib library to plot basic graphs
visualize data correctly
create advanced plots
Why do we need data visualization?

To explore the data
For clear data transmission
To share data with others


Introduction to Data Visualization

Visualization fundamentally serves 4 purposes in data science, in understanding:

the distribution of features
relationships between two or more variables
comparisons between variables
the composition of data
Introduction to Data Visualization

Matplotlib Architecture

Scripting Layer (pyplot)
Artist Layer
Backend Layer
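The three layers can be seen by building a plot without pyplot at all. The sketch below is only an illustration: it uses the Agg (raster) backend as the backend layer, and the data values and title are made up.

```python
# Backend layer: a canvas that knows how to render a figure (Agg = raster images).
from matplotlib.backends.backend_agg import FigureCanvasAgg
# Artist layer: Figure, Axes, Line2D and the other objects that make up a plot.
from matplotlib.figure import Figure
from matplotlib.lines import Line2D

fig = Figure(figsize=(4, 3), dpi=100)   # top-level artist
canvas = FigureCanvasAgg(fig)           # attach the figure to a backend canvas
ax = fig.add_subplot(1, 1, 1)           # Axes artist (the plotting area)
(line,) = ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # Line2D artist
ax.set_title("Built without pyplot")

canvas.draw()                           # the backend renders all artists
width, height = canvas.get_width_height()
```

The scripting layer (pyplot) wraps these steps so that a single call such as plt.plot() creates the figure, canvas, and axes for you.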
Matplotlib library

Introduction to Matplotlib

Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and its
numerical extension NumPy.

matplotlib.pyplot is a collection of functions for creating visualizations.

Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area
in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.
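As a minimal sketch of this idea, each pyplot call below changes the current figure a little. The data and label texts are invented for illustration, and the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt

plt.figure()                            # creates a figure
plt.subplot(1, 1, 1)                    # creates a plotting area in the figure
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])   # plots a line in the plotting area
plt.xlabel("x")                         # decorates the plot with labels
plt.ylabel("x squared")
plt.title("Each pyplot call modifies the current figure")

ax = plt.gca()                          # the plotting area pyplot is tracking
```

In an interactive session, plt.show() would display the result.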
Matplotlib library

1. Installing Matplotlib

Matplotlib and its dependencies can be downloaded as a binary (pre-compiled) package from
the Python Package Index (PyPI), and installed with the following command:

python -m pip install matplotlib

2. Import matplotlib.pyplot

import matplotlib.pyplot as plt

from matplotlib import pyplot as plt


Matplotlib library

Matplotlib and NumPy

NumPy is a package for scientific computing. NumPy is a required dependency of Matplotlib, which uses NumPy functions for numerical data and multi-dimensional arrays.

Matplotlib and Pandas

Pandas is a library for data manipulation and analysis that is often used together with Matplotlib. Pandas provides an in-memory 2D data table object called a DataFrame. Unlike NumPy, pandas is not a required dependency of Matplotlib.
Pandas and NumPy are often used together.
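A small sketch of how the three libraries fit together: Matplotlib accepts NumPy arrays and pandas columns alike. The column names and the sine/cosine data are chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# NumPy builds the numerical data ...
x = np.linspace(0, 2 * np.pi, 100)

# ... and pandas holds it in a 2D table (DataFrame).
df = pd.DataFrame({"x": x, "sin": np.sin(x), "cos": np.cos(x)})

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="NumPy array")        # plot straight from arrays
ax.plot(df["x"], df["cos"], label="pandas column")  # or from DataFrame columns
ax.legend()
```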
Let's Go!

One function creates a figure:
matplotlib.pyplot.figure()
Another function draws lines or markers, creating a plotting area (axes) in the current figure if none exists yet:
matplotlib.pyplot.plot()
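The short sketch below shows the two functions in sequence and checks when the plotting area appears. The figure size and data points are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))     # an empty figure, no plotting area yet
axes_before = len(fig.axes)

plt.plot([0, 1, 2], [0, 1, 4])       # draws a line; axes are created on demand
axes_after = len(fig.axes)
```

Here axes_before is 0 and axes_after is 1, because plot() adds the axes to the current figure when none exist.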
pyplot styles
There are nearly 30 built-in styles in Matplotlib that can be activated with the plt.style.use function. The style names are available in the plt.style.available list.
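A hedged example of working with styles: "ggplot" is one of the built-in style names, and plt.style.context is used here instead of plt.style.use so that the style applies only inside the with-block.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt

style_names = plt.style.available    # list of built-in style names

# plt.style.use("ggplot") would switch the style for the whole session;
# the context manager limits the change to this block.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 1, 3])
    ax.set_title("Plot drawn with the ggplot style")
```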
Figure functions in matplotlib

Functions Description

1 figure() It creates a new figure

2 show() It displays all open figures

3 figtext() It adds text to the figure

4 savefig() It saves the current figure

5 close() It closes a figure window
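The figure functions from the table can be combined as in the sketch below. The output file name and the caption text are hypothetical, and the file is written to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(4, 3))                  # figure(): new figure
plt.plot([0, 1, 2], [0, 1, 0])
plt.figtext(0.5, 0.01, "Caption added with figtext()", ha="center")  # figtext()

out_path = os.path.join(tempfile.mkdtemp(), "example_figure.png")
plt.savefig(out_path)                             # savefig(): write to disk
plt.close(fig)                                    # close(): free the figure

saved = os.path.exists(out_path)
```

In an interactive session, plt.show() would be called before close() to display the figure.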


Axis functions in matplotlib
Functions Description

6 xlim() It gets or sets the x-axis limits of the current axes

7 ylim() It gets or sets the y-axis limits of the current axes

8 xscale() It sets the scaling of the x-axis of the current axes

9 yscale() It sets the scaling of the y-axis of the current axes

10 xticks() It gets or sets the current tick locations and labels of the x-axis

11 yticks() It gets or sets the current tick locations and labels of the y-axis
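A sketch that exercises the axis functions on one plot. The data, limits, and tick labels are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.logspace(0, 3, 50)            # 1 ... 1000
plt.figure()
plt.plot(x, x ** 2)

plt.xscale("log")                    # xscale(): logarithmic x-axis
plt.yscale("log")                    # yscale(): logarithmic y-axis
plt.xlim(1, 1000)                    # xlim(): set the x-axis limits
plt.ylim(1, 1e6)                     # ylim(): set the y-axis limits
plt.xticks([1, 10, 100, 1000], ["1", "10", "100", "1k"])  # tick locations + labels

x_limits = plt.xlim()                # called without arguments, xlim() returns the limits
ax = plt.gca()
```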
Image functions in matplotlib

Functions Description

1 imread() It reads an image from a file into an array

2 imsave() It saves an array as an image file

3 imshow() It displays an image on the axes
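The three image functions can be tried as a round trip, sketched below. The gradient image and the file name are invented, and the file goes to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# A 32x32 array with values from 0 to 1 serves as a synthetic image.
gradient = np.linspace(0, 1, 32 * 32).reshape(32, 32)

path = os.path.join(tempfile.mkdtemp(), "gradient.png")
plt.imsave(path, gradient, cmap="viridis")   # imsave(): array -> image file
image = plt.imread(path)                     # imread(): image file -> array

plt.figure()
plt.imshow(image)                            # imshow(): draw the array on the axes
```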


Types of plots in matplotlib
Functions Description

1 plot() It plots lines and/or markers on the axes

2 scatter() It creates a scatter plot of x vs y

3 bar() It creates a bar plot

4 barh() It creates a horizontal bar plot

5 hist() It plots a histogram
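The five plot types above can be compared side by side, as in this sketch. It uses the Axes methods that share their names with the pyplot functions; the data is synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)       # synthetic data for illustration
x = np.arange(10)
y = x ** 2
labels = ["a", "b", "c"]
heights = [3, 5, 2]

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
axes[0].plot(x, y, marker="o")                   # lines and markers
axes[1].scatter(x, y)                            # scatter plot of x vs y
axes[2].bar(labels, heights)                     # vertical bars
axes[3].barh(labels, heights)                    # horizontal bars
axes[4].hist(rng.normal(size=500), bins=20)      # histogram

for ax, name in zip(axes, ["plot", "scatter", "bar", "barh", "hist"]):
    ax.set_title(name)
fig.tight_layout()
```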


Types of plots in matplotlib
Functions Description

6 hist2d() It creates a 2D histogram plot

7 boxplot() It creates a box-and-whisker plot

8 pie() It plots a pie chart

9 stackplot() It creates a stacked area plot

10 polar() It creates a polar plot
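A sketch of these five plot types on one figure, again with synthetic data. The polar panel is created with projection="polar" on the Axes, which is an alternative to calling plt.polar() directly.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)       # synthetic data for illustration
a = rng.normal(size=500)
b = rng.normal(loc=1.0, size=500)
t = np.arange(5)
theta = np.linspace(0, 2 * np.pi, 100)

fig = plt.figure(figsize=(15, 3))
ax1 = fig.add_subplot(1, 5, 1)
ax1.hist2d(a, b, bins=20)                        # 2D histogram
ax2 = fig.add_subplot(1, 5, 2)
ax2.boxplot([a, b])                              # one box per data set
ax3 = fig.add_subplot(1, 5, 3)
ax3.pie([40, 35, 25], labels=["x", "y", "z"])    # pie chart
ax4 = fig.add_subplot(1, 5, 4)
ax4.stackplot(t, [1, 2, 3, 2, 1], [2, 2, 1, 3, 2])  # stacked areas
ax5 = fig.add_subplot(1, 5, 5, projection="polar")
ax5.plot(theta, 1 + np.cos(theta))               # curve in polar coordinates
```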


Types of plots in matplotlib

Functions Description

11 stem() It creates a stem plot

12 step() It creates a step plot

13 quiver() It plots a 2D field of arrows
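The last three plot types can be sketched as follows. The sine samples and the small rotating vector field are chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 15)
fig, axes = plt.subplots(1, 3, figsize=(12, 3))

axes[0].stem(x, np.sin(x))                 # vertical stems with markers on top
axes[1].step(x, np.sin(x), where="mid")    # piecewise-constant steps

# A 5x5 grid of arrows: at each point (gx, gy) the arrow is (-gy, gx).
gx, gy = np.meshgrid(np.arange(5), np.arange(5))
axes[2].quiver(gx, gy, -gy, gx)

for ax, name in zip(axes, ["stem", "step", "quiver"]):
    ax.set_title(name)
```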


Qualitative (Categorical) Data
Bar graphs and pie charts help you compare data between different groups; bar graphs can also track changes over time.
Pie charts have two common pitfalls:

It can be difficult for viewers to compare sector sizes within the chart.
If a pie chart contains too many sectors, it is difficult for a viewer to decipher any useful information.
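To see the first pitfall in practice, the sketch below draws the same categorical data as a bar graph and as a pie chart. The category names and values are hypothetical and deliberately close to each other.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]    # hypothetical groups
values = [23, 21, 19, 17]            # hypothetical, nearly equal values

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(8, 3))

ax_bar.bar(categories, values)       # bar heights are easy to rank by eye
ax_bar.set_title("Bar graph")

ax_pie.pie(values, labels=categories, autopct="%1.0f%%")  # similar sectors look alike
ax_pie.set_title("Pie chart")
```

The percentage labels (autopct) are one way to soften the problem when a pie chart is still preferred.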
Quantitative Data

A line graph reveals trends or progress over time, and you can use it to show many different
categories of data. You should use it when you chart a continuous data set.
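A minimal line-graph sketch with two categories over twelve months. All numbers and the product names are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
product_a = 100 + 5 * months                     # hypothetical steady growth
product_b = 120 + 10 * np.sin(months / 2)        # hypothetical seasonal pattern

fig, ax = plt.subplots()
ax.plot(months, product_a, marker="o", label="Product A")
ax.plot(months, product_b, marker="s", label="Product B")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.legend()
```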
Quantitative Data
A scatter plot or scattergram chart will show the relationship between two different variables or
reveal distribution trends.
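The sketch below draws a scatter plot of two related variables and computes their correlation coefficient with NumPy. The data is synthetic: y is built as 2x plus random noise.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)   # linear relationship plus noise

r = np.corrcoef(x, y)[0, 1]                   # Pearson correlation coefficient

fig, ax = plt.subplots()
ax.scatter(x, y, alpha=0.6)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Correlation r = {r:.2f}")
```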
Quantitative Data
A histogram represents the frequency distribution of a variable in a data set.
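As an illustration, hist() returns the counts per bin together with the bin edges, so the frequency distribution can be inspected as numbers too. The normally distributed sample is synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=50, scale=10, size=1000)   # synthetic sample

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(data, bins=20)    # 20 bins -> 21 edges
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
```

The counts add up to the sample size, since every value falls into exactly one bin.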
Seaborn
Seaborn library

Seaborn
Seaborn is a data visualization library built on top of Matplotlib and closely integrated with pandas data structures in Python.

It provides a dataset-oriented API to determine the relationship between variables.
It supports high-level abstractions for multi-plot grids.
It visualizes univariate and bivariate distributions.
Seaborn library

Matplotlib is a Python library used to plot various graphs, often together with libraries like NumPy and pandas. It is an effective Python tool for data visualization and is mainly used to plot 2D graphs of arrays. Moreover, its pyplot module offers a free and open-source MATLAB-like interface. It works across different operating systems and their graphical front ends.

Seaborn is also a Python library; it builds on Matplotlib, pandas, and NumPy to plot graphs. It does not replace Matplotlib but provides a higher-level interface constructed on top of it. It helps in the visualization of univariate and bivariate data. Moreover, you can use it to create static time-series graphs.
Data Visualization with ChatGPT

ChatGPT could:

1. suggest which data to visualize


2. choose the proper diagram, graph, or plot
3. give a comparison of plotting types
4. provide code to draw charts and help with debugging
5. change the format and styling
6. help to visualize your own dataset (in the ChatGPT 4.0 version)
Data Visualization with ChatGPT

Example:
Let's do some practice!
Review
1. Matplotlib is one of the most successful and commonly used libraries that provide various tools for data visualization in Python.
2. Matplotlib and Seaborn can both visualize pandas DataFrames.
3. Add creativity and you can get an amazing visualization.
