Chapter 3 & 4 Notes - Plotting With PyPlot-I & II
Importing PyPlot
- To use pyplot methods, you first need to import pyplot by issuing one of the following commands:
import matplotlib.pyplot
import matplotlib.pyplot as pl
- With the first command above, you will need to issue every pyplot command as per following
syntax:
matplotlib.pyplot.<command>
- But with the second command above, you have provided pl as the shorthand for
matplotlib.pyplot and thus now you can invoke PyPlot’s methods as this:
pl.plot(X , Y)
Line chart using plot( ) function
- A Line chart or line graph is a type of chart which displays information as a series of data points
called ‘markers’ connected by straight line segments.
- The PyPlot interface offers plot( ) function for creating a line graph.
- E.g.
Output:
- You can set the labels of the x-axis and the y-axis using the functions xlabel( ) and ylabel( ) respectively, i.e.:
<matplotlib.pyplot or its alias>.xlabel(<string>)
and
<matplotlib.pyplot or its alias>.ylabel(<string>)
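A minimal sketch of a line chart with axis labels, assuming illustrative data values (the lists x and y and the label strings are made up for this example):

```python
import matplotlib.pyplot as plt

# Illustrative data (assumed values): numbers and their squares
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

lines = plt.plot(x, y)      # draws a line chart; returns a list of Line2D objects
plt.xlabel("Values")        # label for the x-axis
plt.ylabel("Squares")       # label for the y-axis
```

Calling plt.show( ) afterwards displays the chart.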
Different color code
Output:
marker = <valid marker type> , markersize = <in points> , markeredgecolor = <valid color>
Marker Type for Plotting
For example:
Output:
[Figures: #plot1, #plot2, #plot3]
** When you do not specify markeredgecolor separately in plot( ), the marker takes the same
color as the line.
E.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.plot(a,b,'r',marker='d',markersize=6)
plt.show()
Output:
** Also, if you give a combined line-color and marker-style string (e.g. ‘r+’ or ‘ro’) without
specifying a linestyle, pyplot will plot only the markers and not the line.
E.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.plot(a,b,'ro')
plt.show()
Output:
Scatter charts using plot( ) function
- If you specify the linecolor and markerstyle (e.g. “r+” or “bo” etc.) without the linestyle
argument, then the plot created resembles a scatter chart as only the datapoints are plotted now.
e.g.
Example 3.1:
Scatter Charts using scatter( ) Function
- This function can be used as:
matplotlib.pyplot.scatter(<array1>, <array2>)
or
<pyplot aliasname>.scatter(<array1>, <array2>)
e.g.
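import matplotlib.pyplot as pl
import numpy as np
a1 = np.array([1, 2, 3, 4])   # a1 and a4 are two ndarrays (illustrative values)
a4 = np.array([2, 4, 6, 8])
pl.scatter(a1, a4)
pl.show()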
Output:
Specifying color of the markers
Using argument c , you can specify the color of the markers.
where
s – the marker size in points squared (points**2)
c – marker color
marker – marker style
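A minimal sketch of how these three arguments can be combined in one scatter( ) call. The array values, the size 40, the color "r" and the marker "d" are assumptions chosen only for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumed values)
a1 = np.array([1, 2, 3, 4])
a4 = np.array([2, 4, 6, 8])

# s = marker size, c = marker color, marker = marker style
points = plt.scatter(a1, a4, s=40, c="r", marker="d")
```

Calling plt.show( ) afterwards displays red diamond-shaped markers at the four data points.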
Creating Bar Charts
- A Bar Graph/ Chart is a graphical display of data using bars of different heights.
- Pyplot offers the bar( ) function to create a bar chart, where you specify the sequence for the
x-axis and the corresponding sequence to be plotted on the y-axis.
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b)
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show()
plt.bar(a,c)
plt.xlabel("values")
plt.ylabel("Squares")
plt.show()
- If you want to specify x-axis label and y-axis label, then you need to give commands:
matplotlib.pyplot.xlabel(<label string>)
matplotlib.pyplot.ylabel(<label string>)
Changing Widths of the Bars in a Bar Chart
- By default, bar chart draws bars with equal widths. (default width is 0.8 units)
- Bar width can be changed in following 2 manners:
1. To specify common width(other than the default width) for all bars, you can specify width
argument having a scalar float value in the bar( ) function, i.e.
e.g.
2. To specify different widths for different bars of a bar chart, you can specify width
argument having a sequence (list or tuple) containing widths for each of the bars, in the
bar( ) function, i.e.
e.g.
Output:
Note: the width sequence must have widths for all bars (i.e., its length must match the length of
the data sequences being plotted), otherwise Python will report an error [ValueError: shape
mismatch].
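The two ways of setting bar widths described above, and the error mentioned in the note, can be sketched as follows. The data lists reuse values from the earlier bar chart example; the width values themselves are assumptions for illustration:

```python
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [2, 4, 6, 8]

# 1. One scalar width applied to every bar
same = plt.bar(a, b, width=0.5)

# 2. A sequence giving one width per bar (drawn on a new figure)
plt.figure()
varied = plt.bar(a, b, width=[0.5, 0.6, 0.7, 0.8])

# A width sequence of the wrong length cannot be matched to the four bars
plt.figure()
try:
    plt.bar(a, b, width=[0.5, 0.6])
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```

Calling plt.show( ) afterwards displays the charts that were drawn successfully.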
Output:
(ii) To specify different colors for different bars of a bar chart, you can specify color argument
having a sequence(list or tuple) containing colors for each of the bars, in the bar( ) function:
<matplotlib.pyplot>.bar(x sequence , y sequence ,color = <color code sequence >)
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
c=[1,4,9,16]
plt.bar(a,b, width = [0.5 , 0.6 , 0.7 , 0.8], color=['red', 'g' , 'b' , 'black'])
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show( )
Output:
Note: the color sequence should have a color for every bar (i.e., its length should match the
length of the data sequences being plotted). Unlike the width sequence, a shorter color sequence
does not raise an error in Matplotlib; the given colors are reused in a cycle.
E.g.
import matplotlib.pyplot as plt
import numpy as np
A=[2,4,6,8]
B=[2.8,3.5,6.5,7.7]
X=np.arange(len(A))
plt.bar(X,A,color="red",width=0.35)
plt.bar(X+0.35,B,color="blue",width=0.35)
plt.show()
Output:
Anatomy of a Chart
Figure – Pyplot by default plots every chart into an area called the Figure. A figure contains the
other elements of the plot.
Axes – This defines the area on which the actual plot (line, bar, etc.) will appear. Axes have
properties such as labels, limits and tick marks.
There are two axes in a plot: (i) the x-axis, the horizontal axis, and (ii) the y-axis, the vertical axis.
Axis label – This defines the name of an axis. It is defined individually for the X-axis and the Y-axis.
Limits – These define the range of values and the number of values marked on the X-axis and Y-axis.
Tick marks – These are the individual points marked on the X-axis or Y-axis.
Title – This is the text that appears at the top of the plot. It defines what the chart is about.
Legends – These identify, by color, the different sets of data plotted on the plot.
The legends are shown in a corner of the plot.
Adding a Title
- The syntax to add a title in a plot is :
<matplotlib.pyplot>.title(<title string>)
e.g.
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,4,6,8]
plt.bar(a,b)
plt.title("Values and Doubles")
plt.xlabel("values")
plt.ylabel("Doubles")
plt.show()
Output:
Note: While setting up the limits for axes, you must keep in mind that only the data that
falls into the limits of X and Y-axes will be plotted; rest of the data will not show in the plot.
E.g.
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(4)
y=[5.,25.,45.,20.]
plt.xlim(-4.0,1.0)
plt.bar(x,y)
plt.title("A Simple Bar Chart")
plt.show()
Output:
Note: If you do not specify X or Y limits, PyPlot will automatically decide the limits for X and
Y-axes as per the values being plotted.
e.g.
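A minimal sketch comparing automatic and explicit limits, assuming the same data as the bar chart example above. It relies on the fact that calling xlim( ) with no arguments returns the current limits:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(4)
y = [5., 25., 45., 20.]

plt.bar(x, y)
auto_limits = plt.xlim()      # no arguments: returns the limits pyplot chose automatically
print("automatic x limits:", auto_limits)

plt.xlim(-4.0, 1.0)           # explicit limits replace the automatic ones
manual_limits = plt.xlim()
print("manual x limits:", manual_limits)
```

The automatic limits are wide enough to show all four bars, whereas the explicit limits cut off the bars that lie beyond x = 1.0.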
Adding Legend
- When we plot multiple ranges on a single plot, it becomes necessary that legends are specified.
- Two step process:
(i) In the plotting functions like plot( ) , bar( ) etc., give a specific label to data range using
argument label.
(ii) Add legend to the plot using legend( ) as per format:
<matplotlib.pyplot>.legend(loc = <position number or string>)
The loc argument can take either a number 1, 2, 3 or 4 or the corresponding position string
‘upper right’, ‘upper left’, ‘lower left’ or ‘lower right’ respectively. Older Matplotlib versions
default to ‘upper right’ (1); since Matplotlib 2.0 the default for legend( ) is ‘best’ (0).
e.g.
import matplotlib.pyplot as plt
import numpy as np
val=[[5.,25.,45.,20.],[4.,23.,49.,17.],[6.,22.,47.,19.]]
x=np.arange(4)
#step1: specify label for each range being plotted using label
plt.bar(x+0.00,val[0],color='b',width=0.25,label='range1')
plt.bar(x+0.25,val[1],color='g',width=0.25,label='range2')
plt.bar(x+0.50,val[2],color='r',width=0.25,label='range3')
#step2: add legend, i.e.
plt.legend(loc='upper left')
plt.show()
Saving a Figure
- If you want to save a plot created using pyplot functions for later use or for keeping records, you
can use savefig( ) to save the plot.
- You can use the pyplot’s savefig( ) as per format:
<matplotlib.pyplot>.savefig(<string with filename and path>)
* You can save figures in popular formats such as .pdf, .png, etc.
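A minimal sketch of saving one plot in two formats. The file names and the temporary output folder are hypothetical, chosen so that no existing file is overwritten:

```python
import os
import tempfile
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [2, 4, 6, 8]
plt.plot(a, b)

# Hypothetical output location: a fresh temporary folder
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "doubles.png")
pdf_path = os.path.join(out_dir, "doubles.pdf")

plt.savefig(png_path)   # the file extension selects the output format
plt.savefig(pdf_path)
print(os.path.exists(png_path), os.path.exists(pdf_path))
```

In a script it is usually safer to call savefig( ) before show( ), because the figure may no longer be available once the plot window has been closed.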
*************
CHAPTER 4 – PLOTTING WITH PYPLOT II – HISTOGRAM, FREQUENCY DISTRIBUTION, BOXPLOTS
e.g.1.
[Figures: #hist1, #hist2, #hist3]
e.g.2
import matplotlib.pyplot as plt
import numpy as np
x=[-10,-8,3,6,9,10,-10,2,1,-8,3,6,10]
plt.xlim(-10,10)
plt.hist(x,cumulative=True)
plt.show() #hist1
plt.hist(x,bins=50,cumulative=True)
plt.show() #hist2
plt.hist(x,bins=100,cumulative=True)
plt.show() #hist3
[Figures: #hist1, #hist2, #hist3]
e.g.3
import matplotlib.pyplot as plt
import numpy as np
x=[-10,-8,3,6,9,10,-10,2,1,-8,3,6,10]
y=[8,10,9,2,1,5,-10,-2,3,-8,7,9,10]
plt.xlim(-10,10)
plt.hist([x,y]) #plotting two histogram - plot1
plt.show()
plt.hist(x,histtype='step') #plot2
plt.show()
plt.hist(x,orientation='horizontal') #plot3
plt.show()
[Figures: #plot2, #plot3]
Creating Frequency Polygons
- A frequency polygon is a type of frequency distribution graph.
- In a frequency polygon, the number of observations is marked with a single point at the midpoint
of an interval. A straight line then connects each set of points.
- Frequency polygons make it easy to compare two or more distributions on the same set of axes.
- The pyplot module of Matplotlib provides no separate function for creating a frequency
polygon. Therefore, to create a frequency polygon, what you can do is:
e.g.
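One possible approach, sketched under assumptions: count the observations per interval with NumPy's histogram( ) function, take the midpoint of each interval, and join the points with plot( ). The data list is reused from the histogram examples, and the choice of 5 bins is arbitrary:

```python
import matplotlib.pyplot as plt
import numpy as np

x = [-10, -8, 3, 6, 9, 10, -10, 2, 1, -8, 3, 6, 10]

# Number of observations in each interval, and the interval boundaries
counts, edges = np.histogram(x, bins=5)

# Midpoint of each interval = average of its left and right boundary
midpoints = (edges[:-1] + edges[1:]) / 2

plt.hist(x, bins=5, alpha=0.3)                      # faint histogram for reference
polygon = plt.plot(midpoints, counts, marker="o")   # frequency polygon
```

Calling plt.show( ) afterwards displays the polygon drawn over the histogram. To compare two distributions, the plot( ) step can be repeated for a second data set on the same axes.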
Creating Box Plots
- A box plot uses five important numbers of a data range: the extremes (the highest and the
lowest numbers), the median, and the upper and lower quartiles, making up the five number
summary.
- A box plot is used to show the range and the middle half of ranked data. Ranked data is numerical
data that can be arranged in order. The middle half of the data is represented by the box. The highest
and lowest values are joined to the box by straight lines (whiskers). The regions above the upper
quartile and below the lower quartile each contain 25% of the data.
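The five-number summary can be computed directly with NumPy, as in the minimal sketch below. The data list is the one used in the box plot example of this section; the variable names are chosen only for illustration:

```python
import numpy as np

ary = [5, 20, 30, 45, 60, 80, 100, 140, 150, 200, 240]

lowest = np.min(ary)             # lower extreme
q1 = np.percentile(ary, 25)      # lower quartile
median = np.median(ary)          # middle value
q3 = np.percentile(ary, 75)      # upper quartile
highest = np.max(ary)            # upper extreme

print(lowest, q1, median, q3, highest)
```

Note that pyplot's boxplot( ) by default extends the whiskers only to the farthest data points within 1.5 times the interquartile range (q3 - q1) from the box and draws anything beyond that as separate points. For this data set, all values fall within that range, so the whiskers reach the extremes.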
Parameters:
x – array or a sequence of vectors.
notch – bool, optional (False); if True, will produce a notched box plot. Otherwise, a rectangular
boxplot is produced.
vert – bool, optional (True); if True (default), makes the boxes vertical. If False, everything is
drawn horizontally.
showbox – bool, optional (True); show the central box.
showmeans – bool, optional (False); show the arithmetic means.
E.g
import matplotlib.pyplot as plt
import numpy as np
ary=[5,20,30,45,60,80,100,140,150,200,240]
plt.boxplot(ary) #simple boxplot
plt.show()
plt.boxplot(ary,showmeans=True) #boxplot with mean
plt.show()
plt.boxplot(ary,showmeans=True,notch=True) #notched boxplot
plt.show()
plt.boxplot(ary,showbox=False) #boxplot without central box
plt.show()
[Figures: #simple boxplot, #boxplot with mean, #notched boxplot]
***************