Chapter 3 & 4 Notes - Plotting With PyPlot-I & II


Uploaded by

praveen.ojha014

CHAPTER 3 – PLOTTING WITH PYPLOT – I (BAR GRAPHS & SCATTER PLOT)

What is Data Visualization?


- It refers to the graphical or visual representation of information and data using visual elements
like charts, graphs, and maps.
- It is helpful in decision making.
- It unveils patterns, trends, outliers, correlations etc. in the data, and thereby helps decision
makers understand the meaning of data to drive business decisions.

Using PyPlot of Matplotlib Library


- matplotlib is a high-quality Python plotting library that provides many interfaces and functions
for 2D graphics.
- PyPlot is a collection of methods within matplotlib which allows the user to construct 2D plots
easily and interactively.

Importing PyPlot
- In order to use pyplot methods, you need to import it by issuing one of the
following commands:

import matplotlib.pyplot
import matplotlib.pyplot as pl

- With the first command above, you will need to issue every pyplot command as per the following
syntax:

matplotlib.pyplot.<command>

- With the second command above, you have provided pl as the shorthand for
matplotlib.pyplot and thus you can now invoke PyPlot's methods like this:
pl.plot(X , Y)

Commonly used chart types

Line chart using plot( ) function
- A Line chart or line graph is a type of chart which displays information as a series of data points
called ‘markers’ connected by straight line segments.
- The PyPlot interface offers plot( ) function for creating a line graph.
- E.g.

The import statement is to be given just once

List b containing values as double of values in list a

List c containing values as squares of values in list a

show( ) method is used to display plot as per given specification
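
The script these notes describe is not reproduced in the text, so the following is only a minimal sketch of what it may have looked like. The list names a, b and c and the label strings are assumptions based on the later examples in these notes.

```python
import matplotlib.pyplot as plt   # the import is given just once, at the top

a = [1, 2, 3, 4]
b = [x * 2 for x in a]     # list b: doubles of the values in list a
c = [x ** 2 for x in a]    # list c: squares of the values in list a

line_b, = plt.plot(a, b)   # first line: a against its doubles
line_c, = plt.plot(a, c)   # second line: a against its squares
plt.xlabel("Values")
plt.ylabel("Doubles and squares")
```

Calling plt.show() after these statements displays the chart with both lines on the same axes.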

Output:

- You can set x-axis’ and y-axis’ labels using functions xlabel( ) and ylabel( ) respectively, i.e.:
<matplotlib.pyplot or its alias> . xlabel(<string>)
and
<matplotlib.pyplot or its alias> . ylabel(<string>)

Applying Various Settings in plot( ) Function


The plot( ) function allows you to specify multiple settings for your chart/graph such as:
- color (line color/marker color)
- marker type
- marker size, etc.
Changing Line Color
<matplotlib.pyplot>.plot(<data1>, <data2>, <color code>)

Different color code
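
The color table itself is not reproduced here. As a hedged illustration, the sketch below draws one line for each of Matplotlib's single-letter base color codes. The dictionary name codes and the sample data are made up for this example.

```python
import matplotlib.pyplot as plt

# Single-letter color codes accepted by plot() and the colors they stand for.
codes = {"b": "blue", "g": "green", "r": "red", "c": "cyan",
         "m": "magenta", "y": "yellow", "k": "black", "w": "white"}

x = [1, 2, 3, 4]
for offset, code in enumerate(codes):
    # Shift each line upwards by its position so the lines do not overlap.
    plt.plot(x, [v + offset for v in x], code, label=codes[code])
plt.legend()
```

The line drawn with 'w' is white, so it is not visible on the default white background. Full color names such as 'red' can be passed in place of the single-letter codes.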

Changing Line Style


<matplotlib.pyplot>.plot(<data1>, <data2>, linestyle = <linestyle>)
linestyle or ls = ['solid' | 'dashed' | 'dashdot' | 'dotted']
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.plot(a,b,'r',linestyle='dashed')
plt.show()

Output:

Changing Marker Type, Size and Color


- The data points being plotted are called markers. To change the marker type, its size and color,
the following arguments can be used in the plot( ) function:

marker = <valid marker type> , markersize = <in points> , markeredgecolor = <valid color>

Marker Type for Plotting

For example:

import matplotlib.pyplot as plt


a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.plot(a,b,'r',marker='d',markersize=6,markeredgecolor='green') #plot1
plt.show()
plt.plot(a,b,'k',linestyle='solid',marker='s',markersize=6,markeredgecolor='red') #plot2
plt.show()
plt.plot(a,b,'r+',linestyle='solid',markersize=6,markeredgecolor='green') #plot3
plt.show()

Output:
#plot1

#plot2

#plot3

** When you do not specify markeredgecolor separately in plot( ), the marker takes the same
color as the line.
E.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.plot(a,b,'r',marker='d',markersize=6)
plt.show()

Output:

** Also, if you give a combined line-color and marker-style string (e.g. 'r+' above) without
specifying the linestyle separately, pyplot will plot only the markers and not the line.
E.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.plot(a,b,'ro')
plt.show()

Output:

Creating Scatter Chart


- It is a graph of plotted points on two axes that show the relationship between two sets of data.
- The scatter charts can be created through two functions of pyplot library:
1. plot( ) function
2. scatter( ) function

Scatter charts using plot( ) function
- If you specify the linecolor and markerstyle (e.g. “r+” or “bo” etc.) without the linestyle
argument, then the plot created resembles a scatter chart as only the datapoints are plotted now.
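
A minimal sketch of this idea, reusing the lists a, b and c from the earlier examples (the variable names points_b and points_c are introduced only for illustration):

```python
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [2, 4, 6, 8]
c = [1, 4, 9, 16]

# Color and marker codes only, no linestyle: markers are drawn without lines.
points_b, = plt.plot(a, b, "r+")   # red plus signs
points_c, = plt.plot(a, c, "bo")   # blue circles
plt.xlabel("values")
```

Calling plt.show() afterwards displays two sets of unconnected markers, which reads as a scatter chart.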

e.g.

Example 3.1:

Scatter Charts using scatter( ) Function
- This function can be used as:
matplotlib.pyplot.scatter(<array1>, <array2>)
or
<pyplot aliasname>.scatter(<array1>, <array2>)
e.g.
import matplotlib.pyplot as pl
#a1 and a4 are two ndarrays
pl.scatter(a1 , a4)
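
The snippet above does not show how a1 and a4 are created. The sketch below fills them with assumed sample values (the numbers 1 to 10 and their squares) so that it can be run as is.

```python
import numpy as np
import matplotlib.pyplot as pl

a1 = np.arange(1, 11)        # assumed sample data: 1, 2, ..., 10
a4 = a1 ** 2                 # assumed second array: squares of a1

points = pl.scatter(a1, a4)  # one marker per (a1[i], a4[i]) pair
```

Both arrays must have the same length, because each marker takes its x value from the first array and its y value from the second.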

Output:

Specifying marker type and size


The marker type can be specified using the marker argument, and the marker size using the
argument s. E.g.
pl.scatter(a1 , a4, marker = "x", s = 5)

Specifying color of the markers
Using argument c , you can specify the color of the markers.

Detailed syntax of scatter( ) function


matplotlib.pyplot.scatter(<array1>, <array2>,s=None , c=None , marker=None)

where
s – The marker size in points squared (points**2)
c – marker color
marker – marker style

Specifying varying colors and sizes for data points


The scatter( ) function allows you to specify a different color and size for each data point. For this
purpose, pass an array of colors having the same shape as the arrays being plotted as the value of
the c argument, and an array of sizes having the same shape as the arrays being plotted as the
value of the s argument.
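
As a hedged illustration of per-point colors and sizes, the sketch below uses made-up data. The names x, y, colors and sizes are assumptions for this example only.

```python
import numpy as np
import matplotlib.pyplot as pl

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
colors = ["r", "g", "b", "k", "m"]          # one color per data point
sizes = np.array([20, 40, 80, 160, 320])    # one size per data point

points = pl.scatter(x, y, c=colors, s=sizes, marker="o")
```

Each marker is drawn with the color and size found at the same position in the colors and sizes sequences.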
E.g.

Creating Bar Charts
- A Bar Graph/ Chart is a graphical display of data using bars of different heights.
- Pyplot offers bar( ) function to create a bar chart where you can specify the sequences for x-axis
and corresponding sequence to be plotted on y-axis .

e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b)
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show()
plt.bar(a,c)
plt.xlabel("values")
plt.ylabel("Squares")
plt.show()

- If you want to specify x-axis label and y-axis label, then you need to give commands:
matplotlib.pyplot.xlabel(<label string>)
matplotlib.pyplot.ylabel(<label string>)

Changing Widths of the Bars in a Bar Chart
- By default, bar chart draws bars with equal widths. (default width is 0.8 units)
- Bar width can be changed in following 2 manners:
1. To specify common width(other than the default width) for all bars, you can specify width
argument having a scalar float value in the bar( ) function, i.e.

<matplotlib.pyplot>.bar(x sequence , y sequence , width = <float value>)

e.g.

2. To specify different widths for different bars of a bar chart, you can specify width
argument having a sequence (list or tuple) containing widths for each of the bars, in the
bar( ) function, i.e.

<matplotlib.pyplot>.bar(x sequence , y sequence , width = <width values sequence>)

e.g.

import matplotlib.pyplot as plt


a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b, width = [0.5 , 0.6 , 0.7 , 0.8])
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show( )

Output:

Note: the width sequence must have widths for all bars (i.e., its length must match the length of
the data sequences being plotted), otherwise Python will report an error [ValueError : shape
mismatch]
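
The sketch below shows a common width for all bars and then triggers the mismatch error described in the note. The name mismatch_error is introduced only for illustration, and the exact wording of the error message may differ between Matplotlib versions.

```python
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [2, 4, 6, 8]

bars = plt.bar(a, b, width=0.5)   # one scalar width shared by all four bars

# A width sequence shorter than the data cannot be matched to the bars.
try:
    plt.bar(a, b, width=[0.5, 0.6])
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```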

Changing Colors of the Bars in a Bar Chart


Color of the Bars can be changed in 2 ways:
(i) To specify the common color(other than default color) for all bars, you can specify color
argument having a valid color code/name in the bar( ) function:
<matplotlib.pyplot>.bar(x sequence , y sequence ,color = <color code/name>)
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b, width = [0.5 , 0.6 , 0.7 , 0.8], color='red')
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show( )

Output:

(ii) To specify different colors for different bars of a bar chart, you can specify color argument
having a sequence(list or tuple) containing colors for each of the bars, in the bar( ) function:
<matplotlib.pyplot>.bar(x sequence , y sequence ,color = <color code sequence >)
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b, width = [0.5 , 0.6 , 0.7 , 0.8], color=['red', 'g' , 'b' , 'black'])
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show( )

Output:

Note: the color sequence should have a color for every bar (i.e., its length should match the length
of the data sequences being plotted). Unlike a short width sequence, a short color sequence does
not raise an error in recent Matplotlib versions; the given colors are reused in a cycle.

Creating Multiple Bars chart


Say we want to plot ranges , A = [2 , 4 , 6 , 8] and B =[2.8 , 3.5 , 6.5 , 7.7]
Steps:
1. Deciding X points and thickness. Say, we want the thickness of each bar as 0.35, then for the
first range, X point will be X and for the second range, the X will shift by first bar’s thickness,
i.e. X+0.35.
2. Deciding colors. Say we want red color for the first range and blue color for the second range.
3. The width argument will take value as 0.35 in this case.

4. Plot using multiple bar( ) functions.

E.g.
import matplotlib.pyplot as plt
import numpy as np
A=[2,4,6,8]
B=[2.8,3.5,6.5,7.7]
X=np.arange(len(A))
plt.bar(X,A,color="red",width=0.35)
plt.bar(X+0.35,B,color="blue",width=0.35)
plt.show()

Output:

Creating a Horizontal Bar Chart


- To create a horizontal bar chart, you need to use barh( ) function, in place of bar.
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.barh(a,b)
plt.ylabel("values")
plt.xlabel("Doubles")
plt.show()

Output:

Anatomy of a Chart

- Figure : Pyplot by default plots every chart into an area called Figure. A figure contains other
elements of the plot in it.
- Axes : It defines the area on which the actual plot (line, bar, graph etc.) will appear. Axes have
properties like label, limits and tick marks on them.
There are two axes in a plot : (i) x-axis, the horizontal axis , (ii) y-axis, the vertical axis.

Axis label – It defines the name for an axis. It is defined individually for the X-axis and the Y-axis.
Limits – These define the range of values and number of values marked on X-axis and Y-axis.
Tick_Marks – The tick marks are individual points marked on the X-axis or Y-axis.

- Title : This is the text that appears on the top of the plot. It defines what the chart is about.
- Legends : These are the different colors that identify different sets of data plotted on the plot.
The legends are shown in a corner of the plot.
Adding a Title
- The syntax to add a title in a plot is :
<matplotlib.pyplot>.title(<title string>)
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.bar(a,b)
plt.title("Values and Doubles")
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show()

Output:

Setting Limits and Ticks

(i) Setting Xlimits and Ylimits

- Both xlim( ) and ylim( ) are used as per following format:


<matplotlib.pyplot>.xlim(<xmin> , <xmax>)
<matplotlib.pyplot>.ylim(<ymin> , <ymax>)
- E.g.
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(4)
y=[5.,25.,45.,20.]
plt.xlim(-2.0,4.0)
plt.bar(x,y)
plt.title("A Simple Bar Chart")
plt.show()

Output:

Note: While setting up the limits for axes, you must keep in mind that only the data that
falls into the limits of X and Y-axes will be plotted; rest of the data will not show in the plot.
E.g.
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(4)
y=[5.,25.,45.,20.]
plt.xlim(-4.0,1.0)
plt.bar(x,y)
plt.title("A Simple Bar Chart")
plt.show()

Output:

Note: If you do not specify X or Y limits, PyPlot will automatically decide the limits for X and
Y-axes as per the values being plotted.

(ii) Setting Ticks for Axes


- To set your own tick marks:
  - For the X-axis, you can use the xticks( ) function as per format:
xticks(<sequence containing tick data points>, [optional sequence containing tick labels])
  - For the Y-axis, you can use the yticks( ) function as per format:
yticks(<sequence containing tick data points>, [optional sequence containing tick labels])
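
A minimal sketch of both functions, using made-up data and the hypothetical tick labels Q1 to Q4:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [5.0, 25.0, 45.0, 20.0]
plt.bar(x, y)

# Tick positions on the X-axis, each with its own label.
plt.xticks(x, ["Q1", "Q2", "Q3", "Q4"])
# Tick positions only on the Y-axis; the labels default to the numbers.
plt.yticks([0, 15, 30, 45])
```

When the optional label sequence is given, it should contain one label for every tick position.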

e.g.

Adding Legend

- When we plot multiple ranges on a single plot, it becomes necessary that legends are specified.
- Two step process:
(i) In the plotting functions like plot( ) , bar( ) etc., give a specific label to data range using
argument label.
(ii) Add legend to the plot using legend( ) as per format:
<matplotlib.pyplot>.legend(loc = <position number or string>)

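The loc argument can take either a position string or its number: 1, 2, 3 and 4 stand for 'upper
right', 'upper left', 'lower left' and 'lower right' respectively. If loc is omitted, current Matplotlib
versions use 'best' (code 0), which picks the position that overlaps the plotted data the least.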

e.g.
import matplotlib.pyplot as plt
import numpy as np
val=[[5.,25.,45.,20.],[4.,23.,49.,17.],[6.,22.,47.,19.]]
x=np.arange(4)

#step1: specify label for each range being plotted using label
plt.bar(x+0.00,val[0],color='b',width=0.25,label='range1')
plt.bar(x+0.25,val[1],color='g',width=0.25,label='range2')

plt.bar(x+0.50,val[2],color='r',width=0.25,label='range3')

#step2:add legend,i.e.
plt.legend(loc='upper left')

plt.title("MultiRange Bar chart")


plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Output:

Saving a Figure
- If you want to save a plot created using pyplot functions for later use or for keeping records, you
can use savefig( ) to save the plot.
- You can use the pyplot’s savefig( ) as per format:
<matplotlib.pyplot>.savefig(<string with filename and path>)
*you can save figures in popular formats like .pdf , .png , etc.

e.g. plt.savefig("multibar.pdf") # save the plot in the current directory
plt.savefig("c:\\data\\multibar.pdf") # save the plot at the given path
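
The sketch below saves a small bar chart as a PNG file. The temporary folder and the file name doubles.png are assumptions chosen so that the example runs on any system without needing a fixed path such as c:\data.

```python
import os
import tempfile
import matplotlib.pyplot as plt

plt.bar([1, 2, 3, 4], [2, 4, 6, 8])
plt.title("Values and Doubles")

# Build the path with os.path.join so the same code works on any OS.
out_dir = tempfile.mkdtemp()                      # scratch folder for this demo
out_file = os.path.join(out_dir, "doubles.png")
plt.savefig(out_file)                             # format is taken from ".png"
print(os.path.exists(out_file))
```

If plt.show() is also used, it is safer to call savefig( ) first, since the figure may be cleared once the plot window is closed.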

*************

CHAPTER 4 – PLOTTING WITH PYPLOT II – HISTOGRAM, FREQUENCY DISTRIBUTION, BOXPLOTS

Creating Histogram with PyPlot


- A histogram is a summarisation tool for discrete or continuous data.
- It provides visual interpretation of numerical data by showing the number of data points that
fall within a specified range of values (called bins).
- It is similar to a vertical bar graph. However, a histogram, unlike a vertical bar graph, shows no
gaps between the bars.

Histogram using hist( ) Function


- The syntax for using hist( ) function of pyplot is:
matplotlib.pyplot.hist(x , bins=None , cumulative=False , histtype='bar' , align='mid' ,
orientation='vertical')
Parameters:
1. x - array or sequence to be plotted on histogram.
2. bins – integer ,optional ; If an integer is given, bins+1 bin edges are calculated and
returned.
3. cumulative – bool , optional ; If True, then a histogram is computed where each bin gives
the counts in that bin plus all bins for smaller values. Default value is False.
4. histtype – {‘bar’ , ‘barstacked’ , ‘step’ , ‘stepfilled’ } , optional ; the type of histogram to
draw. Default is ‘bar’.
5. orientation – {‘horizontal’ , ‘vertical’}, optional ; If ‘horizontal’ , barh will be used for bar-
type histograms.
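
To make the bins and cumulative parameters concrete, the sketch below captures the values that hist( ) returns. The choice of bins=4 and the names counts, edges, bars and cum_counts are assumptions for this illustration.

```python
import matplotlib.pyplot as plt

x = [-10, -8, 3, 6, 9, 10, -10, 2, 1, -8, 3, 6, 10]

# hist() returns the bin counts, the bin edges and the drawn bars.
counts, edges, bars = plt.hist(x, bins=4)
print(counts)   # how many values fall in each bin
print(edges)    # bins + 1 edges, i.e. 5 edges for 4 bins

# Same data, but each bin also includes the counts of all smaller bins.
cum_counts, _, _ = plt.hist(x, bins=4, cumulative=True)
print(cum_counts)
```

With 4 bins over the range -10 to 10, the edges fall at -10, -5, 0, 5 and 10, and the last cumulative count equals the total number of values (13).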

e.g.1.

import matplotlib.pyplot as plt


import numpy as np
x=[-10,-8,3,6,9,10,-10,2,1,-8,3,6,10]
plt.xlim(-10,10)
plt.hist(x)
plt.show() #hist1
plt.hist(x,bins=50)
plt.show() #hist2
plt.hist(x,bins=100)
plt.show() #hist3

#hist1

#hist2

#hist3

e.g.2
import matplotlib.pyplot as plt
import numpy as np
x=[-10,-8,3,6,9,10,-10,2,1,-8,3,6,10]
plt.xlim(-10,10)
plt.hist(x,cumulative=True)
plt.show() #hist1
plt.hist(x,bins=50,cumulative=True)
plt.show() #hist2
plt.hist(x,bins=100,cumulative=True)
plt.show() #hist3

#hist1

#hist2

#hist3

e.g.3
import matplotlib.pyplot as plt
import numpy as np
x=[-10,-8,3,6,9,10,-10,2,1,-8,3,6,10]
y=[8,10,9,2,1,5,-10,-2,3,-8,7,9,10]
plt.xlim(-10,10)
plt.hist([x,y]) #plotting two histogram - plot1
plt.show()
plt.hist(x,histtype='step') #plot2
plt.show()
plt.hist(x,orientation='horizontal') #plot3
plt.show()

#plotting two histogram - plot1

#plot2

#plot3

Creating Frequency Polygons
- A frequency polygon is a type of frequency distribution graph.
- In a frequency polygon, the number of observations is marked with a single point at the midpoint
of an interval. A straight line then connects each set of points.
- Frequency polygons make it easy to compare two or more distributions on the same set of axes.

- Python’s pyplot module of matplotlib provides no separate function for creating frequency
polygon. Therefore, to create a frequency polygon, what you can do is:

(i) Plot a histogram from the data.


(ii) Mark a single point at the midpoint of an interval/bin.
(iii) Draw straight lines to connect the adjacent points.
(iv) Connect first data point to the midpoint of previous interval on x-axis.
(v) Connect last data point to the midpoint of following interval on x-axis.
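
The steps above can be sketched as follows. This is one possible way to do it, not a dedicated pyplot feature; the data, the choice of 5 bins and the names mids, poly_x and poly_y are assumptions for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = [-10, -8, 3, 6, 9, 10, -10, 2, 1, -8, 3, 6, 10]

# (i) Histogram of the data; keep the counts and bin edges it returns.
counts, edges, _ = plt.hist(x, bins=5, color="lightgray")

# (ii) Midpoint of every bin.
width = edges[1] - edges[0]
mids = (edges[:-1] + edges[1:]) / 2

# (iv), (v) One extra midpoint before the first bin and one after the last,
# both with a count of zero, so the polygon starts and ends on the x-axis.
poly_x = np.concatenate(([mids[0] - width], mids, [mids[-1] + width]))
poly_y = np.concatenate(([0], counts, [0]))

# (iii) Straight lines joining the adjacent points.
plt.plot(poly_x, poly_y, "b-o")
```

Calling plt.show() afterwards displays the histogram in gray with the frequency polygon drawn over it.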

e.g.

Creating Box Plots

- A box plot uses five important numbers of a data range: the extremes (the highest and the
lowest numbers), the median, and the upper and lower quartiles, making up the five number
summary.
- A box plot is used to show the range and middle half of ranked data. Ranked data is numerical data
that can be arranged in order, such as scores. The middle half of the data is represented by the box.
The highest and lowest scores are joined to the box by straight lines (the whiskers). The regions
above the upper quartile and below the lower quartile each contain 25% of the data.
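
As a rough illustration of the five-number summary, the sketch below computes it with NumPy for a sample list. It assumes NumPy's default (linear interpolation) percentile method; other quartile conventions can give slightly different values.

```python
import numpy as np

ary = [5, 20, 30, 45, 60, 80, 100, 140, 150, 200, 240]

lowest, highest = min(ary), max(ary)                 # the two extremes
q1, median, q3 = np.percentile(ary, [25, 50, 75])    # quartiles and median

print("lowest:", lowest, "Q1:", q1, "median:", median,
      "Q3:", q3, "highest:", highest)
```

The box of a box plot spans from Q1 to Q3, with a line inside it at the median.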

- boxplot( ) method is used to create boxplots. The syntax is:

matplotlib.pyplot.boxplot(x , notch=None , vert=None , meanline=None , showmeans=None ,
showbox=None)

Parameters:
x – Array or a sequence of vectors.
notch – bool, optional (False) ; If True, will produce a notched box plot. Otherwise, a rectangular
boxplot is produced.
vert – bool, optional (True) ; If True (default), makes the boxes vertical. If False, everything is
drawn horizontally.
showbox – bool, optional (True) ; Show the central box.
showmeans – bool, optional (False) ; Show the arithmetic means.

E.g
import matplotlib.pyplot as plt
import numpy as np
ary=[5,20,30,45,60,80,100,140,150,200,240]
plt.boxplot(ary) #simple boxplot
plt.show()
plt.boxplot(ary,showmeans=True) #boxplot with mean
plt.show()
plt.boxplot(ary,showmeans=True,notch=True) #notched boxplot
plt.show()
plt.boxplot(ary,showbox=False) #boxplot without central box
plt.show()

#simple boxplot

#boxplot with mean

#notched boxplot

#boxplot without central box

***************
