
Matplotlib Notes

Uploaded by

asayushsingh638
Copyright
© © All Rights Reserved

Chapter – 4

Plotting Data using Matplotlib

Matplotlib : - The Matplotlib library is used for creating static, animated, and
interactive 2D plots or figures in Python. Matplotlib makes easy things easy and
hard things possible. It can be installed using the following pip command from the
command prompt:
pip install matplotlib

For plotting using Matplotlib, we need to import its pyplot module using the
following command:

import matplotlib.pyplot as plt

What is visualization?
Data visualization means the graphical or pictorial representation of data using
graphs, charts, etc. The purpose of plotting data is to visualize variation or show
relationships between variables.

plot() function : - The pyplot module of Matplotlib contains a collection of
functions that can be used to work on a plot. The plot() function of the pyplot
module draws data on the current figure, creating the figure if one does not
already exist.

A figure contains a plotting area, legend, axis labels, ticks, title, etc. The plot()
function by default plots a line chart.

We can click on the save button in the output window and save the plot as an
image. A figure can also be saved by using the savefig() function.

For example: plt.savefig('x.png').
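A minimal sketch of plotting and saving in one script is given here. The data
values and the temporary output folder are assumptions made only for this
illustration, and the Agg backend is selected so that the script runs without
opening a window.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data (not from the notes)
days = [1, 2, 3, 4, 5]
temperature = [31, 33, 30, 29, 32]

# plot() draws a line chart by default
plt.plot(days, temperature)

# Save the figure as a PNG image in a temporary folder
out_path = os.path.join(tempfile.gettempdir(), "x.png")
plt.savefig(out_path)
plt.close()
```

To see the output window instead, remove the matplotlib.use("Agg") line and call
plt.show() before plt.close().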

Customisation of Plots : - The pyplot module gives us numerous functions, which can
be used to customise charts, such as adding titles or legends.
1. grid() : - The grid() function adds grid lines to the plot.
2. legend() : - A legend describes the elements plotted in a particular area of
the graph.
3. savefig() : - The savefig() method is used to save the figure created after
plotting data.
4. show() : - The show() function of the pyplot module is used to display all
open figures.
5. title() : - Set the title of the plot.
6. xlabel() : - Set the label of the x-axis.
7. ylabel() : - Set the label of the y-axis.
8. xticks() : - Get or set the current tick locations and labels of the x-axis.
9. yticks() : - Get or set the current tick locations and labels of the y-axis.
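The functions listed above can be combined as in the following sketch. The
sales figures, labels and tick positions are hypothetical values chosen only
to show each call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data (not from the notes)
months = [1, 2, 3, 4]
sales = [250, 300, 280, 350]

plt.plot(months, sales, label="Sales")            # label is used by legend()
plt.title("Monthly Sales")                        # title of the plot
plt.xlabel("Month")                               # label of the x-axis
plt.ylabel("Sales (Rs)")                          # label of the y-axis
plt.xticks(months, ["Jan", "Feb", "Mar", "Apr"])  # tick locations and labels
plt.yticks([250, 300, 350])                       # tick locations only
plt.grid(True)                                    # add grid lines
plt.legend()                                      # show the legend

ax = plt.gca()  # the current axes, kept for inspection
```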

Marker : - A marker is any symbol that represents a data value in a line chart or a
scatter plot.

Colour : - It is also possible to format the plot further by changing the colour of
the plotted data. We can use either character codes or colour names as values of
the parameter color in plot().

Linewidth and Line Style : - The linewidth and linestyle properties can be used to
change the width and the style of the line chart. Linewidth is specified in points.
The default line width is 1.5 points in Matplotlib 2.0 and later.

We can also set the line style of a line chart using the linestyle parameter. It can
take a string such as "solid", "dotted", "dashed" or "dashdot".
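As a rough illustration, the four line styles named above can be compared on
one chart. The data, the marker "o" and the colour are assumptions made for
this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]  # hypothetical x values
styles = ["solid", "dotted", "dashed", "dashdot"]

lines = []
for offset, style in enumerate(styles):
    # Shift each line upwards so that the four styles do not overlap
    y = [value + offset for value in x]
    (line,) = plt.plot(x, y, linestyle=style, linewidth=2,
                       marker="o", color="green", label=style)
    lines.append(line)

plt.legend()  # the legend shows which style is which
```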

The Pandas Plot function (Pandas Visualisation)

The plot() method
The plot() method of Pandas accepts a considerable number of arguments that can
be used to plot a variety of graphs. The type of plot is selected by supplying the
kind keyword argument, which accepts a string such as 'line', 'bar', 'box',
'hist', 'pie' or 'scatter'. The syntax is: df.plot(kind='...')

Example: df.plot(kind='line')
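A small runnable sketch of this call is shown here. The DataFrame, its column
names and the title are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical DataFrame (not from the notes)
df = pd.DataFrame({"Week": [1, 2, 3, 4],
                   "Sales": [120, 150, 90, 200]})

# kind='line' is also the default when kind is omitted
ax = df.plot(kind="line", x="Week", y="Sales", title="Weekly Sales")
```

df.plot() returns the Matplotlib axes, so functions such as plt.xlabel() or
plt.savefig() can still be used afterwards.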

Basics of Simple Plotting

Plotting a Line chart : - A line chart displays data points connected by straight
line segments. It is used to show a continuous dataset, for example to visualise
growth or decline in data over a time interval.
The marker, color, linewidth and linestyle parameters can be passed to plot() to
change the appearance of the line chart.

plt.plot(df.weight, df.height, marker='*', markersize=10,
         color='green', linewidth=2, linestyle='dashdot')
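The call above needs a DataFrame with weight and height columns. A complete
sketch, assuming made-up values for both columns, could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical weight (kg) and height (cm) values
df = pd.DataFrame({"weight": [40, 45, 52, 60, 68],
                   "height": [140, 150, 158, 165, 172]})

plt.plot(df.weight, df.height, marker='*', markersize=10,
         color='green', linewidth=2, linestyle='dashdot')
plt.xlabel("Weight (kg)")
plt.ylabel("Height (cm)")

line = plt.gca().lines[0]  # the line that was just drawn
```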

Bar graph : - A bar graph presents categorical data with rectangular bars whose
heights or lengths are proportional to the values that they represent. The bar()
function takes arguments that describe the layout of the bars.

We can also customise the bar chart by adding certain parameters to the plot
function. The color parameter sets the fill colour of the bars, while edgecolor,
linestyle and linewidth control the outline drawn around each bar.

df.plot(kind='bar', x='Day', title='Mela Sales Report',
        color=['red', 'yellow', 'purple'],
        edgecolor='Green', linewidth=2, linestyle='--')
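A runnable sketch of this bar chart follows. Only the Day column and the title
come from the call above; the three numeric columns and all values are
hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sales data: one row per day, one column per item
df = pd.DataFrame({"Day": ["Mon", "Tue", "Wed"],
                   "Apple": [60, 40, 80],
                   "Banana": [30, 50, 45],
                   "Orange": [20, 35, 25]})

# One colour per numeric column; the outline is a dashed green line
ax = df.plot(kind='bar', x='Day', title='Mela Sales Report',
             color=['red', 'yellow', 'purple'],
             edgecolor='green', linewidth=2, linestyle='--')
```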

Boxplot : - A box plot is the visual representation of the statistical five-number
summary of a given data set. The summary includes the minimum value, Quartile 1,
Quartile 2 (the median), Quartile 3 and the maximum value.
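The five values of the summary can be computed directly with Pandas. This is a
minimal sketch with hypothetical marks; quantile() uses linear interpolation
by default.

```python
import pandas as pd

# Hypothetical marks, chosen only for illustration
marks = pd.Series([35, 40, 45, 50, 55, 60, 65, 70, 75])

summary = {
    "minimum": marks.min(),
    "Q1": marks.quantile(0.25),
    "median (Q2)": marks.quantile(0.50),
    "Q3": marks.quantile(0.75),
    "maximum": marks.max(),
}

for name, value in summary.items():
    print(name, value)
```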

The whiskers are the two lines outside the box. By default they extend to the
lowest and highest values that lie within 1.5 times the interquartile range from
the box, and values beyond the whiskers are drawn as separate points. This helps
in identifying the outliers. An outlier is an observation that is numerically
distant from the rest of the data.

The distance between the box and the end of the lower or upper whisker is larger in
some boxplots and smaller in others. A shorter distance indicates small variation
in the data, and a longer distance indicates a larger spread.

We can display the box plot in the horizontal direction by adding the parameter
vert=False. We can change the color of the plot as well.

df.plot(kind='box', title='Compare Resorts', color='red',
        vert=False)
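A complete sketch of this horizontal box plot is given here, assuming a
DataFrame with one column of hypothetical ratings per resort. Recent
Matplotlib releases may prefer a different way to set the orientation, so
vert=False is kept only because the notes use it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical ratings for two resorts (not from the notes)
df = pd.DataFrame({"Resort A": [3, 4, 4, 5, 2, 4, 5],
                   "Resort B": [2, 3, 3, 4, 1, 3, 5]})

# One box per column, drawn horizontally in red
ax = df.plot(kind='box', title='Compare Resorts', color='red',
             vert=False)
```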
Hist / histogram plot : - A histogram is a type of graph that provides a visual
interpretation of numerical data by indicating the number of data points that lie
within a range of values.

Histograms are column charts, where each column represents a range of values,
and the height of a column corresponds to how many values are in that range. To
make a histogram, the data is sorted into "bins" and the number of data points in
each bin is counted. The height of each column in the histogram
is then proportional to the number of data points its bin contains.

The histogram can be customised with properties such as edgecolor, linewidth,
linestyle and fill. Another property called hatch can be used to fill each bar
with a pattern ('-', '+', 'x', '\\', '*', 'o', 'O', '.').

df.plot(kind='hist',bins=20)
df.plot(kind='hist',bins=[18,19,20,21,22])
df.plot(kind='hist',bins=range(18,25))
df.plot(kind='hist',edgecolor='Green', linewidth=2,
linestyle=':', fill=False,hatch='o')
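The binning step described above can be checked with NumPy before drawing the
histogram. In this sketch the ages are hypothetical; the bin edges are the
same as in the second call above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical ages (not from the notes)
ages = [18, 18, 19, 20, 20, 20, 21]
bin_edges = [18, 19, 20, 21, 22]

# Count how many values fall into each bin; the last bin includes 22
counts, edges = np.histogram(ages, bins=bin_edges)
print(counts)

# Draw the same bins as unfilled, hatched bars with a dotted green outline
df = pd.DataFrame({"Age": ages})
ax = df.plot(kind='hist', bins=bin_edges, edgecolor='green', linewidth=2,
             linestyle=':', fill=False, hatch='o')
```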

Pie / Pie chart: - Pie is a type of graph in which a circle is divided into different
sectors and each sector represents a part of the whole. A pie plot is used to
represent numerical data proportionally.

import pandas as pd
import matplotlib.pyplot as plt
df=pd.DataFrame({'GeoArea':[83743,78438,22327,22429,21081,16579,10486],
                 'ForestCover':[67353,27692,17280,17321,19240,13464,8073]},
                index=['Arunachal Pradesh','Assam','Manipur','Meghalaya',
                       'Mizoram','Nagaland','Tripura'])
df.plot(kind='pie',y='ForestCover',
        title='Forest cover of North Eastern states',legend=False)
plt.show()

Customisation of pie chart:
1. explode : - it specifies the fraction of the radius by which each slice is
offset from the centre.
2. autopct : - it displays the percentage of each slice as a label, using the
given format string.

# exp and c must be defined beforehand, with one entry per slice
df.plot(kind='pie', y='ForestCover',
        title='Forest cover of North Eastern states', legend=False,
        explode=exp, autopct="%.2f", colors=c)
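A runnable sketch of the customised pie chart follows. It uses the first three
states from the DataFrame above; the contents of exp and c are assumptions,
since the notes do not define them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt
import pandas as pd

# First three states from the notes' data
df = pd.DataFrame({'ForestCover': [67353, 27692, 17280]},
                  index=['Arunachal Pradesh', 'Assam', 'Manipur'])

# Assumed values: one entry per slice
exp = [0.1, 0, 0]                   # pull the first slice out by 10% of the radius
c = ['green', 'orange', 'skyblue']  # one colour per slice

ax = df.plot(kind='pie', y='ForestCover',
             title='Forest cover of North Eastern states', legend=False,
             explode=exp, autopct="%.2f", colors=c)
```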

Scatter plot : - It is similar to a line chart. The major difference is that while a
line graph connects the data points with a line, a scatter chart simply plots the
data points to show the trend in the data.
Scatter plots are used when you want to show the relationship between two
variables. Scatter plots are sometimes called correlation plots because they show
how two variables are correlated. The size of each marker (bubble) can also be used
to reflect a value.

plt.scatter(x=discount, y=saleInRs, s=size, color='red', linewidth=3,
            marker='*', edgecolor='blue')
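The names discount, saleInRs and size are not defined in the notes. A sketch
with hypothetical values for all three could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no output window is needed
import matplotlib.pyplot as plt

# Hypothetical values (not from the notes)
discount = [10, 20, 30, 40, 50]                  # discount in percent
saleInRs = [40000, 45000, 48000, 50000, 100000]  # sales in rupees
size = [d * 10 for d in discount]                # marker size grows with discount

points = plt.scatter(x=discount, y=saleInRs, s=size, color='red', linewidth=3,
                     marker='*', edgecolor='blue')
plt.xlabel("Discount (%)")
plt.ylabel("Sale (Rs)")
```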

Using Open Data:- There are many websites that provide data freely for anyone to
download and do analysis, primarily for educational purposes. These are called
Open Data as the data source is open to the public. Availability of data for access
and use promotes further analysis and innovation.

A lot of emphasis is being given to open data to ensure transparency, accessibility
and innovation. "Open Government Data (OGD) Platform India" (data.gov.in) is a
platform for supporting the Open Data initiative of the Government of India. Large
datasets on different projects and parameters are available on the platform.
