
Lec7 Data Visualization 1

The document discusses different data visualization techniques in Python including scatter plots, histograms, and bar plots. It covers the basics of each plot, how and when to use them, and provides code examples to generate each type of plot using Matplotlib.

Uploaded by

Muthu Kumaran

Data visualization

Part I
In this lecture
We will learn how to create basic plots using the matplotlib library

• Scatter plot

• Histogram

• Bar plot

Python for Data Science 2


Data Visualization
• Data visualization allows us to quickly interpret the data
and adjust different variables to see their effect
• Technology is increasingly making it easier for us to do so
Why visualize data?
o Observe the patterns

o Identify extreme values that could be anomalies

o Easy interpretation



Popular plotting libraries in Python
Python offers multiple graphing libraries that offer diverse
features

• matplotlib: to create 2D graphs and plots
• pandas visualization: easy-to-use interface, built on Matplotlib
• seaborn: provides a high-level interface for drawing attractive and
informative statistical graphics
• ggplot: based on R’s ggplot2, uses the Grammar of Graphics
• plotly: can create interactive plots
Matplotlib
• Matplotlib is a 2D plotting library which
produces good quality figures

• Although it has its origins in emulating the
MATLAB graphics commands, it is independent
of MATLAB

• It makes heavy use of NumPy and other
extension code to provide good performance
even for large arrays



Scatter plot



Scatter Plot
What is a scatter plot?
• A scatter plot is a set of points that represents
the values obtained for two different variables,
one plotted along the horizontal axis and the
other along the vertical axis

When to use scatter plots?
• Scatter plots are used to convey the relationship
between two numerical variables
• Scatter plots are sometimes called correlation
plots because they show how two variables are
correlated
Importing data into Spyder
• Importing necessary libraries
o ‘pandas’ library to work with dataframes
o ‘numpy’ library to do numerical operations
o ‘matplotlib’ library to do visualization
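
The imports listed above might look like the following. This is a minimal sketch; the aliases pd, np and plt are the common conventions and are assumed here, since the slide's own code is not reproduced in the text.

```python
import pandas as pd               # to work with dataframes
import numpy as np                # to do numerical operations
import matplotlib.pyplot as plt   # to do visualization
```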



Importing data into Spyder
• Importing data

• Removing missing values from the dataframe
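
A rough sketch of these two steps is given below. The column names, the sample rows and the in-memory buffer are all hypothetical stand-ins, used only so the example runs on its own; with a real file, the path of the CSV would be passed to pd.read_csv instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file on disk (not the lecture's data).
csv_text = io.StringIO(
    "Price,Age,KM,FuelType\n"
    "13500,23,46986,Diesel\n"
    "13750,,72937,Petrol\n"      # Age is missing in this row
    "16900,27,,Petrol\n"         # KM is missing in this row
    "9950,60,94612,Petrol\n"
)

# Importing data: empty cells are read as NaN
cars_data = pd.read_csv(csv_text)

# Removing missing values: drop every row that contains at least one NaN
cars_data = cars_data.dropna(axis=0)

print(cars_data.shape)
```

Only the rows without any missing value remain in cars_data after dropna.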



Scatter plot
Arguments annotated on the slide: x, y
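
A scatter plot with x and y arguments can be sketched as follows. The dataframe values and the column names Age and Price are made up for illustration, and the red marker colour is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data (not the lecture's dataset)
cars_data = pd.DataFrame({
    "Age":   [23, 30, 44, 60, 72],
    "Price": [13500, 12950, 10500, 9950, 7500],
})

# First argument: values for the x axis; second argument: values for the y axis
points = plt.scatter(cars_data["Age"], cars_data["Price"], c="red")
plt.title("Scatter plot of Price vs Age of the cars")
plt.xlabel("Age")
plt.ylabel("Price")
```

In an interactive session, calling plt.show() afterwards displays the figure.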



Scatter plot
• The price of the car decreases as the age of the car increases



Histogram



Histogram
What is a histogram?
• It is a graphical representation of data using
bars of different heights
• A histogram groups numbers into ranges (bins), and
the height of each bar depicts the frequency
of that range or bin

When to use histograms?
• To represent the frequency distribution of
numerical variables



Histogram
Argument annotated on the slide: x
Histogram with default arguments
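
A histogram with default arguments could be sketched like this. The kilometre values are hypothetical; only the data is passed to plt.hist, so matplotlib chooses the binning itself (10 equal-width bins unless the defaults have been changed).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder
import matplotlib.pyplot as plt

# Hypothetical odometer readings in km (not the lecture's data)
km = [12000, 43000, 52000, 58000, 61000, 67000, 73000,
      79000, 88000, 95000, 110000, 140000, 210000]

# Only x is given; every other argument keeps its default value
counts, bin_edges, patches = plt.hist(km)
```

plt.hist returns the frequency of each bin, the bin edges and the drawn bars, which is handy for checking what the plot actually shows.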



Histogram



Histogram
• The frequency distribution of the kilometres travelled shows that
most of the cars have travelled between 50000 and 100000 km,
and only a few cars have travelled a greater distance
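
One way to read such a conclusion off the data is to fix the bin edges and inspect the counts that plt.hist returns. The sketch below uses made-up kilometre values and arbitrarily chosen 50000 km wide bins; the colours are also just an illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical odometer readings in km (not the lecture's data)
km = [12000, 43000, 52000, 58000, 61000, 67000, 73000,
      79000, 88000, 95000, 110000, 140000, 210000]

# Explicit bin edges, each bin 50000 km wide
edges = [0, 50000, 100000, 150000, 200000, 250000]
counts, bin_edges, _ = plt.hist(km, bins=edges, color="green", edgecolor="white")
plt.title("Histogram of kilometres")
plt.xlabel("Kilometres")
plt.ylabel("Frequency")

# Index of the tallest bar, i.e. the range with the most cars
busiest = int(np.argmax(counts))
print(f"Most cars fall between {bin_edges[busiest]:.0f} and {bin_edges[busiest + 1]:.0f} km")
```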



Bar plot



Bar plot
What is a bar plot?
• A bar plot is a plot that presents categorical
data with rectangular bars with lengths
proportional to the counts that they
represent
When to use bar plots?
• To represent the frequency distribution of
categorical variables
• A bar diagram makes it easy to compare sets
of data between different groups



Bar plot

Arguments annotated on the slide: x, height of the bars
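
A bar plot built from x positions and bar heights might be sketched as below. The fuel type values are hypothetical, and computing the heights with value_counts is one possible approach rather than necessarily the one used on the slide.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical categorical column (not the lecture's data)
fuel = pd.Series(["Petrol", "Diesel", "Petrol", "CNG", "Petrol", "Diesel", "Petrol"])

# Frequency of each category, sorted from most to least frequent
counts = fuel.value_counts()

# x: one position per category; height: the count of that category
index = np.arange(len(counts))
bars = plt.bar(index, counts.values, color=["red", "blue", "cyan"])
plt.title("Bar plot of fuel types")
plt.xlabel("Fuel type")
plt.ylabel("Frequency")
```

With numeric x positions alone, the ticks on the horizontal axis show numbers rather than category names.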



Bar plot
• Frequency distribution of fuel type



Bar plot

Arguments annotated on the slide: x, height of the bars,
the labels of the xticks, the location of the xticks
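
Setting the location and the labels of the xticks can be sketched with plt.xticks, as below. The counts and category names are hypothetical, and the 90 degree rotation is an optional styling choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical frequencies per fuel type (not the lecture's data)
fuel_types = ["Petrol", "Diesel", "CNG"]
counts = [4, 2, 1]
index = np.arange(len(fuel_types))   # x positions 0, 1, 2

plt.bar(index, counts, color=["red", "blue", "cyan"])
plt.title("Bar plot of fuel types")
plt.xlabel("Fuel type")
plt.ylabel("Frequency")

# First argument: location of the xticks; second argument: their labels
_, tick_labels = plt.xticks(index, fuel_types, rotation=90)
```

Each bar now carries the name of its category instead of a number.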
Bar plot
• The bar plot of fuel type shows that most of the cars have petrol as
their fuel type



Summary
We have learnt how to create basic plots using the matplotlib library

• Scatter plot

• Histogram

• Bar plot



THANK YOU
