FDS Unit 5 JPR
Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and contour
plots – Histograms – legends – colors – subplots – text and annotation – customization –
three dimensional plotting - Geographic Data with Basemap - Visualization with Seaborn.
Output –
So, with three lines of code, you can generate a basic graph using python matplotlib. Simple,
isn’t it?
Let us see how we can add a title and axis labels to a graph created with the python
matplotlib library to make it more meaningful. Consider the example below:
from matplotlib import pyplot as plt

x = [5, 2, 7]
y = [2, 16, 4]
plt.plot(x, y)
plt.title('Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
Output –
You can also apply styling to create a better-looking graph. What if you want to change
the width or color of a particular line, or add grid lines? That is where styling comes in.
So, let me show you how to add style to a graph using python matplotlib. First, import the
style package from the matplotlib library, then use the styling functions as shown in the
code below:
from matplotlib import pyplot as plt
from matplotlib import style
style.use('ggplot')
x =[5,8,10]
y =[12,16,6]
x2 =[6,9,11]
y2 =[6,15,7]
plt.plot(x,y,'g',label='line one', linewidth=5)
plt.plot(x2,y2,'c',label='line two',linewidth=5)
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.legend()
plt.grid(True,color='k')
plt.show()
Output –
Next in this python matplotlib tutorial, we will look at different kinds of plots. Let’s start
with the bar graph!
Python Matplotlib: Bar Graph
First, let us understand why we need a bar graph. A bar graph uses bars to compare data
among different categories. It is well suited to showing how a value changes over a period
of time. The bars can be drawn horizontally or vertically. The important thing to keep in
mind is that the longer the bar, the greater the value. Now, let us implement it using
python matplotlib.
from matplotlib import pyplot as plt
plt.bar([0.25,1.25,2.25,3.25,4.25],[50,40,70,80,20],
label="BMW",width=.5)
plt.bar([.75,1.75,2.75,3.75,4.75],[80,20,20,50,60],
label="Audi", color='r',width=.5)
plt.legend()
plt.xlabel('Days')
plt.ylabel('Distance (kms)')
plt.title('Information')
plt.show()
Output –
In the above plot, I have displayed the comparison between the distance covered by two
cars BMW and Audi over a period of 5 days. Next, let us move on to another kind of plot
using python matplotlib – Histogram.
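Since a bar graph can also be drawn horizontally, here is a minimal sketch of the BMW data from the example above using plt.barh(). The names days and bmw_km are illustrative and not part of the original example.

```python
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]            # illustrative day numbers
bmw_km = [50, 40, 70, 80, 20]     # BMW distances from the example above

# barh() takes the bar positions first and the bar lengths second
bars = plt.barh(days, bmw_km, height=0.5, label="BMW")
plt.xlabel('Distance (kms)')
plt.ylabel('Days')
plt.title('Information')
plt.legend()
# plt.show()  # uncomment to display the figure
```

Note that the axis labels swap places compared with the vertical version, because the distance is now measured along the x-axis.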
Python Matplotlib – Histogram
Let me first tell you the difference between a bar graph and a histogram. Histograms are
used to show a distribution whereas a bar chart is used to compare different entities.
Histograms are useful when you have arrays or a very long list of values. Let’s consider an
example where I plot the ages of a population grouped into bins. A bin is one of a series of
consecutive intervals into which the range of values is divided. Bins are usually all the
same size. In the code below, I have created bins with a width of 10, which means the first
bin contains ages from 0 to 9, the next from 10 to 19, and so on.
import matplotlib.pyplot as plt
population_age =[22,55,62,45,21,22,34,42,42,4,2,102,95,85,55,110,120,
70,65,55,111,115,80,75,65,54,44,43,42,48]
bins =[0,10,20,30,40,50,60,70,80,90,100]
plt.hist(population_age, bins, histtype='bar', rwidth=0.8)
plt.xlabel('age groups')
plt.ylabel('Number of people')
plt.title('Histogram')
plt.show()
Output –
As you can see in the above plot, we got age groups with respect to the bins. Our biggest age
group is between 40 and 50.
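To check the bin counts behind the histogram without reading them off the plot, a small sketch with np.histogram can help. It uses the same data and bin edges as above; the use of NumPy here is an addition for illustration only.

```python
import numpy as np

population_age = [22, 55, 62, 45, 21, 22, 34, 42, 42, 4, 2, 102, 95, 85, 55, 110, 120,
                  70, 65, 55, 111, 115, 80, 75, 65, 54, 44, 43, 42, 48]
bins = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# counts[i] is the number of ages falling between edges[i] and edges[i + 1]
counts, edges = np.histogram(population_age, bins=bins)
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {n}")

largest = int(np.argmax(counts))
print("largest bin:", edges[largest], "to", edges[largest + 1])
```

The five ages above 100 lie outside the last bin edge, so they are not counted in any bin.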
Python Matplotlib : Scatter Plot
We usually use scatter plots to compare two variables, for example to see how much one
variable is affected by another and whether there is a relationship between them. The data
is displayed as a collection of points: the value of one variable determines the position
on the horizontal axis, and the value of the other variable determines the position on the
vertical axis.
Consider the below example:
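import matplotlib.pyplot as plt
x = [1, 1.5, 2, 2.5, 3, 3.5, 3.6]
y = [7.5, 8, 8.5, 9, 9.5, 10, 10.5]
x1 = [8, 8.5, 9, 9.5, 10, 10.5, 11]
y1 = [3, 3.5, 3.7, 4, 4.5, 5, 5.2]
# Assumed completion: the plotting calls are missing from these notes,
# so each data set is drawn with a plain plt.scatter() call.
plt.scatter(x, y, color='r')
plt.scatter(x1, y1, color='b')
plt.show()

# Hours spent on each activity over five days (not used by the pie chart below)
days = [1, 2, 3, 4, 5]
sleeping = [7, 8, 6, 11, 7]
eating = [2, 3, 4, 3, 2]
working = [7, 8, 7, 2, 2]
playing = [8, 5, 7, 8, 13]
# Pie chart: hours spent on each activity during one day
slices = [7, 2, 2, 13]
activities = ['sleeping', 'eating', 'working', 'playing']
cols = ['c', 'm', 'r', 'b']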
plt.pie(slices,
labels=activities,
colors=cols,
startangle=90,
shadow=True,
explode=(0,0.1,0,0),
autopct='%1.1f%%')
plt.title('Pie Plot')
plt.show()
Output –
In the above pie chart, I have divided the circle into 4 sectors or slices, each representing
one category (sleeping, eating, working and playing) along with the percentage it holds.
Notice that these slices add up to 24 hours, but the size of each slice is calculated
automatically for you. This makes pie charts really convenient, as you don’t have to
calculate the percentages yourself.
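For reference, the percentages that the autopct option prints can be reproduced by hand. This is only a sketch of the arithmetic, not of how matplotlib computes it internally:

```python
slices = [7, 2, 2, 13]
activities = ['sleeping', 'eating', 'working', 'playing']

total = sum(slices)  # 24 hours in total
percentages = [100 * s / total for s in slices]

for name, pct in zip(activities, percentages):
    print(f"{name}: {pct:.1f}%")
```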
Next in python matplotlib, let’s understand how to work with multiple plots.
Python Matplotlib : Working With Multiple Plots
I have discussed several types of plots in python matplotlib, such as the bar plot, scatter
plot, pie plot and area plot. Now, let me show you how to handle multiple plots. For this, I
have to import the numpy module, which I discussed in my previous blog on Python Numpy.
Let me implement it in practice; consider the example below.
import numpy as np
import matplotlib.pyplot as plt
def f(t):
    return np.exp(-t) * np.cos(2*np.pi*t)
t1 =np.arange(0.0, 5.0, 0.1)
t2 =np.arange(0.0, 5.0, 0.02)
plt.subplot(221)
plt.plot(t1, f(t1), 'bo', t2, f(t2))
plt.subplot(222)
plt.plot(t2, np.cos(2*np.pi*t2))
plt.show()
Output -
The code is much like the previous examples, but there is one new concept here: subplot.
The subplot() command takes numrows, numcols and fignum, where fignum ranges from 1 to
numrows*numcols. The commas in this command are optional if numrows*numcols<10, so
subplot(221) is identical to subplot(2,2,1).
Subplots therefore let you draw multiple graphs in one figure, arranged vertically or
horizontally. In the above example, I have placed them side by side.
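A quick way to convince yourself that the two spellings are equivalent is to create one subplot with each form and compare where the axes end up. This is a minimal sketch; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

fig_a = plt.figure()
ax_a = plt.subplot(221)        # three-digit shorthand

fig_b = plt.figure()
ax_b = plt.subplot(2, 2, 1)    # explicit numrows, numcols, fignum

# Both calls select the top-left cell of a 2x2 grid,
# so the axes occupy the same rectangle in their figures.
same_place = ax_a.get_position().bounds == ax_b.get_position().bounds
print(same_place)
```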
Python matplotlib also has some disadvantages. Some of them are listed below:
It is heavily reliant on other packages, such as NumPy.
It only works with python, so it is hard or impossible to use from languages other
than python. (It can, however, be used from Julia via the PyPlot package.)
Pyplot
matplotlib.pyplot is a collection of command style functions that make matplotlib work like
MATLAB. Each pyplot function makes some change to a figure: e.g., creates a figure, creates
a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels,
etc. In matplotlib.pyplot various states are preserved across function calls, so that it keeps
track of things like the current figure and plotting area, and the plotting functions are
directed to the current axes (please note that “axes” here and in most places in the
documentation refers to the axes part of a figure and not the strict mathematical term for
more than one axis).
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()
You may be wondering why the x-axis ranges from 0-3 and the y-axis from 1-4. If you
provide a single list or array to the plot() command, matplotlib assumes it is a sequence of
y values, and automatically generates the x values for you. Since python ranges start with 0,
the default x vector has the same length as y but starts with 0. Hence the x data
are [0,1,2,3].
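You can inspect the automatically generated x values yourself. The sketch below reads them back from the Line2D object that plot() returns:

```python
import matplotlib.pyplot as plt

# Only y values are given; matplotlib fills in x = 0, 1, 2, ...
line, = plt.plot([1, 2, 3, 4])

print(line.get_xdata())
print(line.get_ydata())
```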
plot() is a versatile command, and will take an arbitrary number of arguments. For
example, to plot x versus y, you can issue the command:
plt.plot([1,2,3,4],[1,4,9,16])
For every x, y pair of arguments, there is an optional third argument which is the format
string that indicates the color and line type of the plot. The letters and symbols of the
format string are from MATLAB, and you concatenate a color string with a line style string.
The default format string is 'b-', which is a solid blue line. For example, to plot the above
with red circles, you would issue
import matplotlib.pyplot as plt
plt.plot([1,2,3,4],[1,4,9,16],'ro')
plt.axis([0,6,0,20])
plt.show()
See the plot() documentation for a complete list of line styles and format strings.
The axis() command in the example above takes a list of [xmin, xmax, ymin, ymax] and
specifies the viewport of the axes.
If matplotlib were limited to working with lists, it would be fairly useless for numeric
processing. Generally, you will use numpy arrays. In fact, all sequences are converted to
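numpy arrays internally. The example below illustrates plotting several lines with
different format styles in one command using arrays.
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0., 5., 0.2)  # evenly sampled time at 200 ms intervals
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()
Lines have many attributes that you can set, such as linewidth, dash style and
antialiasing. There are several ways to set line properties.
Use keyword args:
plt.plot(x, y, linewidth=2.0)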
Use the setter methods of a Line2D instance. plot returns a list of Line2D objects;
e.g., line1, line2 = plot(x1, y1, x2, y2). In the code below we will suppose that we
have only one line so that the list returned is of length 1. We use tuple unpacking
with line, to get the first element of that list:
line,=plt.plot(x,y,'-')
line.set_antialiased(False)  # turn off antialiasing
Use the setp() command. The example below uses a MATLAB-style command to set
multiple properties on a list of lines. setp works transparently with a list of objects
or a single object. You can either use python keyword arguments or MATLAB-style
string/value pairs:
lines=plt.plot(x1,y1,x2,y2)
# use keyword args
plt.setp(lines,color='r',linewidth=2.0)
# or MATLAB style string value pairs
plt.setp(lines,'color','r','linewidth',2.0)
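To get a list of settable line properties, call the setp() function with a line or lines
as argument:
In [69]: lines=plt.plot([1,2,3])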
In [70]: plt.setp(lines)
alpha: float
animated: [True | False]
antialiased or aa: [True | False]
...snip
import numpy as np
import matplotlib.pyplot as plt
def f(t):
    return np.exp(-t) * np.cos(2*np.pi*t)
t1=np.arange(0.0,5.0,0.1)
t2=np.arange(0.0,5.0,0.02)
plt.figure(1)
plt.subplot(211)
plt.plot(t1,f(t1),'bo',t2,f(t2),'k')
plt.subplot(212)
plt.plot(t2,np.cos(2*np.pi*t2),'r--')
plt.show()
The figure() command here is optional because figure(1) will be created by default, just as
a subplot(111) will be created by default if you don’t manually specify any axes.
The subplot() command specifies numrows, numcols, fignum where fignum ranges from 1
to numrows*numcols. The commas in the subplot command are optional
if numrows*numcols<10. So subplot(211) is identical to subplot(2, 1, 1). You can create an
arbitrary number of subplots and axes. If you want to place an axes manually, i.e., not on a
rectangular grid, use the axes() command, which allows you to specify the location
as axes([left, bottom, width, height]) where all values are in fractional (0 to 1) coordinates.
See pylab_examples example code: axes_demo.py for an example of placing axes manually
and pylab_examples example code: subplots_demo.py for an example with lots of subplots.
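As a minimal sketch of manual placement, the code below adds a small inset axes on top of a regular subplot with axes(). The rectangle values and variable names are chosen only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0.0, 5.0, 0.02)

fig = plt.figure()
main_ax = plt.subplot(111)
main_ax.plot(t, np.exp(-t) * np.cos(2 * np.pi * t))

# [left, bottom, width, height], all as fractions of the figure size
inset_ax = plt.axes([0.55, 0.55, 0.3, 0.3])
inset_ax.plot(t, np.cos(2 * np.pi * t), 'r--')
# plt.show()  # uncomment to display the figure
```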
You can create multiple figures by using multiple figure() calls with an increasing figure
number. Of course, each figure can contain as many axes and subplots as your heart
desires:
import matplotlib.pyplot as plt
plt.figure(1)# the first figure
plt.subplot(211)# the first subplot in the first figure
plt.plot([1,2,3])
plt.subplot(212)# the second subplot in the first figure
plt.plot([4,5,6])
plt.figure(2)# a second figure
plt.plot([4,5,6])# creates a subplot(111) by default
plt.figure(1)# figure 1 current; subplot(212) still current
plt.subplot(211)# make subplot(211) in figure1 current
plt.title('Easy as 1, 2, 3')# subplot 211 title
You can clear the current figure with clf() and the current axes with cla(). If you find it
annoying that states (specifically the current image, figure and axes) are being maintained
for you behind the scenes, don’t despair: this is just a thin stateful wrapper around an
object oriented API, which you can use instead (see Artist tutorial)
If you are making lots of figures, you need to be aware of one more thing: the memory
required for a figure is not completely released until the figure is explicitly closed
with close(). Deleting all references to the figure, and/or using the window manager to kill
the window in which the figure appears on the screen, is not enough, because pyplot
maintains internal references until close() is called.
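The internal bookkeeping can be observed with plt.get_fignums(), which lists the figures pyplot still tracks. A small sketch:

```python
import matplotlib.pyplot as plt

plt.close('all')                       # start from a clean state

figs = [plt.figure() for _ in range(3)]
before = plt.get_fignums()             # pyplot still references all three figures

for fig in figs:
    plt.close(fig)                     # explicitly release each figure
after = plt.get_fignums()

print(before, after)
```

When producing many figures in a loop, closing each one after saving it keeps memory use from growing.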
Visualizing Errors
For any scientific measurement, accurate accounting for errors is nearly as important as, if
not more important than, accurate reporting of the number itself. For example, imagine that I
am using some astrophysical observations to estimate the Hubble Constant, the local
measurement of the expansion rate of the Universe. I know that the current literature
suggests a value of around 71 (km/s)/Mpc, and I measure a value of 74 (km/s)/Mpc with
my method. Are the values consistent? The only correct answer, given this information, is
this: there is no way to know.
Suppose I augment this information with reported uncertainties: the current literature
suggests a value of around 71 ± 2.5 (km/s)/Mpc, and my method has measured a value of
74 ± 5 (km/s)/Mpc. Now are the values consistent? That is a question that can be
quantitatively answered.
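One common way to quantify this, assuming the two uncertainties are independent and roughly Gaussian (an assumption not stated in the text), is to express the difference in units of the combined uncertainty:

```python
import math

literature, literature_err = 71.0, 2.5   # (km/s)/Mpc
measured, measured_err = 74.0, 5.0       # (km/s)/Mpc

# Independent errors add in quadrature
combined_err = math.sqrt(literature_err**2 + measured_err**2)
n_sigma = abs(measured - literature) / combined_err

print(f"difference = {n_sigma:.2f} sigma")
```

A difference of well under one sigma, as here, means the two values are consistent under these assumptions.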
In visualization of data and results, showing these errors effectively can make a plot convey
much more complete information.
Basic Errorbars
%matplotlib inline
import matplotlib.pyplot as plt
# On Matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
import numpy as np
In [2]:
x=np.linspace(0,10,50)
dy=0.8
y=np.sin(x)+dy*np.random.randn(50)
plt.errorbar(x,y,yerr=dy,fmt='.k');
Here the fmt is a format code controlling the appearance of lines and points, and has the
same syntax as the shorthand used in plt.plot, outlined in Simple Line Plots and Simple
Scatter Plots.
In addition to these basic options, the errorbar function has many options to fine-tune the
outputs. Using these additional options you can easily customize the aesthetics of your
errorbar plot. I often find it helpful, especially in crowded plots, to make the errorbars
lighter than the points themselves:
In [3]:
plt.errorbar(x,y,yerr=dy,fmt='o',color='black',
ecolor='lightgray',elinewidth=3,capsize=0);
In addition to these options, you can also specify horizontal errorbars (xerr), one-sided
errorbars, and many other variants. For more information on the options available, refer to
the docstring of plt.errorbar.
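As a sketch of those variants, the example below combines horizontal errorbars with one-sided vertical errorbars. The one-sided bars are produced here by passing a 2-row yerr array with zeros for the lower errors, which is one of several possible approaches; the data is made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 10)
y = np.sin(x)

# Row 0 holds the lower errors, row 1 the upper errors.
# Zero lower errors give bars that extend upward only.
yerr = np.vstack([np.zeros_like(x), np.full_like(x, 0.4)])

fig, ax = plt.subplots()
container = ax.errorbar(x, y, xerr=0.3, yerr=yerr, fmt='o', color='black',
                        ecolor='lightgray', elinewidth=3, capsize=0)
# plt.show()  # uncomment to display the figure
```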
Continuous Errors
Here we'll perform a simple Gaussian process regression, using the Scikit-Learn API
(see Introducing Scikit-Learn for details). This is a method of fitting a very flexible non-
parametric function to data with a continuous measure of the uncertainty. We won't delve
into the details of Gaussian process regression at this point, but will focus instead on how
you might visualize such a continuous error measurement:
In [4]:
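# GaussianProcess is the legacy scikit-learn class; it was removed in
# scikit-learn 0.20 in favor of GaussianProcessRegressor.
from sklearn.gaussian_process import GaussianProcess
# gp is assumed to be a GaussianProcess model already fitted to the data;
# its definition is not shown in these notes.
xfit = np.linspace(0, 10, 1000)
yfit, MSE = gp.predict(xfit[:, np.newaxis], eval_MSE=True)
dyfit = 2 * np.sqrt(MSE)  # 2*sigma ~ 95% confidence region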
We now have xfit, yfit, and dyfit, which sample the continuous fit to our data. We could pass
these to the plt.errorbar function as above, but we don't really want to plot 1,000 points
with 1,000 errorbars. Instead, we can use the plt.fill_between function with a light color to
visualize this continuous error:
In [5]:
plt.fill_between(xfit,yfit-dyfit,yfit+dyfit,
color='gray',alpha=0.2)
plt.xlim(0,10);
Note what we've done here with the fill_between function: we pass an x value, then the
lower y-bound, then the upper y-bound, and the result is that the area between these
regions is filled.
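The same call can be tried without a Gaussian process model. In the sketch below, yfit and dyfit are stand-in arrays invented for illustration (a known curve and an error that grows with x), not the output of a real fit:

```python
import numpy as np
import matplotlib.pyplot as plt

xfit = np.linspace(0, 10, 1000)
yfit = xfit * np.sin(xfit)       # stand-in for a fitted curve
dyfit = 0.2 + 0.1 * xfit         # stand-in for a 2-sigma error band

fig, ax = plt.subplots()
ax.plot(xfit, yfit, '-', color='gray')
# x values, then the lower y-bound, then the upper y-bound
band = ax.fill_between(xfit, yfit - dyfit, yfit + dyfit,
                       color='gray', alpha=0.2)
ax.set_xlim(0, 10)
# plt.show()  # uncomment to display the figure
```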
The resulting figure gives a very intuitive view into what the Gaussian process regression
algorithm is doing: in regions near a measured data point, the model is strongly
constrained and this is reflected in the small model errors. In regions far from a measured
data point, the model is not strongly constrained, and the model errors increase.
For more information on the options available in plt.fill_between() (and the closely
related plt.fill() function), see the function docstring or the Matplotlib documentation.
Finally, if this seems a bit too low level for your taste, refer to Visualization With Seaborn,
where we discuss the Seaborn package, which has a more streamlined API for visualizing
this type of continuous errorbar.
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
# On Matplotlib 3.6 and later this style is named 'seaborn-v0_8-white'
plt.style.use('seaborn-white')
import numpy as np
Visualizing a Three-Dimensional Function
We'll start by demonstrating a contour plot using a function z=f(x,y), using the
following particular choice for f (we've seen this before in Computation on Arrays:
Broadcasting, when we used it as a motivating example for array broadcasting):
In [2]:
def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
A contour plot can be created with the plt.contour function. It takes three arguments: a grid
of x values, a grid of y values, and a grid of z values. The x and y values represent positions
on the plot, and the z values will be represented by the contour levels. Perhaps the most
straightforward way to prepare such data is to use the np.meshgrid function, which builds
two-dimensional grids from one-dimensional arrays:
In [3]:
x=np.linspace(0,5,50)
y=np.linspace(0,5,40)
X,Y=np.meshgrid(x,y)
Z=f(X,Y)
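To see what np.meshgrid actually returns, it can help to look at the shapes of the grids. A short sketch using the same x and y as above:

```python
import numpy as np

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)

# Both grids have one row per y value and one column per x value
print(X.shape, Y.shape)

# Every row of X is a copy of x; every column of Y is a copy of y
print(np.array_equal(X[0], x), np.array_equal(Y[:, 0], y))
```

Because X and Y have the same shape, evaluating f(X, Y) gives one z value for every (x, y) position on the grid.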
Now let's look at this with a standard line-only contour plot:
In [4]:
plt.contour(X,Y,Z,colors='black');
Notice that by default when a single color is used, negative values are represented by dashed
lines, and positive values by solid lines. Alternatively, the lines can be color-coded by
specifying a colormap with the cmap argument. Here, we'll also specify that we want more
lines to be drawn—20 equally spaced intervals within the data range:
In [5]:
plt.contour(X, Y, Z, 20, cmap='RdGy');
Here we chose the RdGy (short for Red-Gray) colormap, which is a good choice for centered
data. Matplotlib has a wide range of colormaps available, which you can easily browse in
IPython by doing a tab completion on the plt.cm module:
plt.cm.<TAB>
Our plot is looking nicer, but the spaces between the lines may be a bit distracting. We can
change this by switching to a filled contour plot using the plt.contourf() function (notice
the f at the end), which uses largely the same syntax as plt.contour().
In [6]:
plt.contourf(X, Y, Z, 20, cmap='RdGy')
plt.colorbar();
The colorbar makes it clear that the black regions are "peaks," while the red regions are
"valleys."
One potential issue with this plot is that it is a bit "splotchy." That is, the color steps are
discrete rather than continuous, which is not always what is desired. This could be
remedied by setting the number of contours to a very high number, but this results in a
rather inefficient plot: Matplotlib must render a new polygon for each step in the level. A
better way to handle this is to use the plt.imshow() function, which interprets a two-
dimensional grid of data as an image.
In [7]:
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower',
cmap='RdGy')
plt.colorbar()
plt.axis('image');
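There are a few potential gotchas with imshow(), however. It doesn't accept an x and y
grid, so you must manually specify the extent [xmin, xmax, ymin, ymax] of the image on the
plot. It also follows, by default, the standard image array convention in which the origin
is in the upper left rather than the lower left as in most contour plots, which is why
origin='lower' is set when showing gridded data.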
Finally, it can sometimes be useful to combine contour plots and image plots. For example,
here we'll use a partially transparent background image (with transparency set via
the alpha parameter) and overplot contours with labels on the contours themselves (using
the plt.clabel() function):
In [8]:
contours = plt.contour(X, Y, Z, 3, colors='black')
plt.clabel(contours, inline=True, fontsize=8)
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy', alpha=0.5)
plt.colorbar();
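import numpy as np
import matplotlib.pyplot as plt
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)
# The histogram call is missing from these notes; this is an assumed completion.
# density=True normalizes the bars so the y-axis matches the 'Probability' label.
plt.hist(x, 50, density=True, facecolor='g', alpha=0.75)
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)
plt.show()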
All of the text() commands return a matplotlib.text.Text instance. Just as with lines
above, you can customize the properties by passing keyword arguments into the text
functions or using setp():
t=plt.xlabel('my data',fontsize=14,color='red')
These properties are covered in more detail in Text properties and layout.
Using mathematical expressions in text
matplotlib accepts TeX equation expressions in any text expression. For example, to write
the expression σ_i=15 in the title, you can write a TeX expression surrounded by dollar
signs:
plt.title(r'$\sigma_i=15$')
The r preceding the title string is important – it signifies that the string is a raw string and
not to treat backslashes as python escapes. matplotlib has a built-in TeX expression parser
and layout engine, and ships its own math fonts – for details see Writing mathematical
expressions. Thus you can use mathematical text across platforms without requiring a TeX
installation. For those who have LaTeX and dvipng installed, you can also use LaTeX to
format your text and incorporate the output directly into your display figures or saved
postscript – see Text rendering With LaTeX.
Annotating text
The uses of the basic text() command above place text at an arbitrary position on the Axes.
A common use for text is to annotate some feature of the plot, and the annotate() method
provides helper functionality to make annotations easy. In an annotation, there are two
points to consider: the location being annotated represented by the argument xy and the
location of the text xytext. Both of these arguments are (x,y) tuples.
import numpy as np
import matplotlib.pyplot as plt
ax=plt.subplot(111)
t=np.arange(0.0,5.0,0.01)
s=np.cos(2*np.pi*t)
line,=plt.plot(t,s,lw=2)
plt.annotate('local max',xy=(2,1),xytext=(3,1.5),
arrowprops=dict(facecolor='black',shrink=0.05),
)
plt.ylim(-2,2)
plt.show()
In this basic example, both the xy (arrow tip) and xytext locations (text location) are in
data coordinates. There are a variety of other coordinate systems one can choose –
see Basic annotation and Advanced Annotation for details. More examples can be found
in pylab_examples example code: annotation_demo.py.
Logarithmic and other nonlinear axes
matplotlib.pyplot supports not only linear axis scales, but also logarithmic and logit scales.
This is commonly used if data spans many orders of magnitude. Changing the scale of an
axis is easy:
plt.xscale('log')
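As a minimal sketch, the data below spans five orders of magnitude, so a logarithmic y-axis turns the curve into a straight line. It uses the Axes method set_yscale(), the object-oriented counterpart of plt.yscale(); the data is invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 6)
y = 10.0 ** x                 # 10, 100, ..., 100000

fig, ax = plt.subplots()
ax.plot(x, y, 'o-')
ax.set_yscale('log')          # same effect as plt.yscale('log') on the current axes
print(ax.get_yscale())
# plt.show()  # uncomment to display the figure
```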
An example of four plots with the same data and different scales for the y axis is shown
below.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter  # useful for `logit` scale
# Fixing random state for reproducibility
np.random.seed(19680801)
# make up some data in the interval ]0, 1[
y=np.random.normal(loc=0.5,scale=0.4,size=1000)
y=y[(y>0)&(y<1)]
y.sort()
x=np.arange(len(y))
# plot with various axes scales
plt.figure(1)
# linear
plt.subplot(221)
plt.plot(x,y)
plt.yscale('linear')
plt.title('linear')
plt.grid(True)
# log
plt.subplot(222)
plt.plot(x,y)
plt.yscale('log')
plt.title('log')
plt.grid(True)
# symmetric log
plt.subplot(223)
plt.plot(x,y-y.mean())
plt.yscale('symlog', linthresh=0.01)  # named linthreshy before Matplotlib 3.3
plt.title('symlog')
plt.grid(True)
# logit
plt.subplot(224)
plt.plot(x,y)
plt.yscale('logit')
plt.title('logit')
plt.grid(True)
# Format the minor tick labels of the y-axis into empty strings with
# `NullFormatter`, to avoid cumbering the axis with too many labels.
plt.gca().yaxis.set_minor_formatter(NullFormatter())
# Adjust the subplot layout, because the logit one may take more space
# than usual, due to y-tick labels like "1 - 10^{-3}"
plt.subplots_adjust(top=0.92,bottom=0.08,left=0.10,right=0.95,hspace=0.25,
wspace=0.35)
plt.show()
Legends
Generating legends flexibly in Matplotlib
This legend guide is an extension of the documentation available at legend() - please
ensure you are familiar with contents of that documentation before proceeding with this
guide.
This guide makes use of some common terms, which are documented here for clarity:
legend entry
A legend is made up of one or more legend entries. An entry is made up of exactly one key
and one label.
legend key
The colored/patterned marker to the left of each legend label.
legend label
The text which describes the handle represented by the key.
legend handle
The original object which is used to generate an appropriate entry in the legend.
Controlling the legend entries
Calling legend() with no arguments automatically fetches the legend handles and their
associated labels. This functionality is equivalent to:
handles,labels=ax.get_legend_handles_labels()
ax.legend(handles,labels)
The get_legend_handles_labels() function returns a list of handles/artists which exist on
the Axes which can be used to generate entries for the resulting legend - it is worth noting
however that not all artists can be added to a legend, at which point a "proxy" will have to
be created (see Creating artists specifically for adding to the legend (aka. Proxy artists) for
further details).
Note
Artists with an empty string as label or with a label starting with an underscore, "_", will be
ignored.
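The effect of this rule can be checked directly. In the sketch below, only the line with a regular label is returned by get_legend_handles_labels(); the labels are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], label='shown')
ax.plot([3, 2, 1], label='_hidden')   # leading underscore: skipped by legend()
ax.plot([2, 2, 2])                    # no label given: also skipped

handles, labels = ax.get_legend_handles_labels()
print(labels)

ax.legend()
# plt.show()  # uncomment to display the figure
```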
For full control of what is being added to the legend, it is common to pass the appropriate
handles directly to legend():
fig,ax=plt.subplots()
line_up,=ax.plot([1,2,3],label='Line 2')
line_down,=ax.plot([3,2,1],label='Line 1')
ax.legend(handles=[line_up,line_down])
In some cases, it is not possible to set the label of the handle, so it is possible to pass
through the list of labels to legend():
fig,ax=plt.subplots()
line_up,=ax.plot([1,2,3],label='Line 2')
line_down,=ax.plot([3,2,1],label='Line 1')
ax.legend([line_up,line_down],['Line Up','Line Down'])
Creating artists specifically for adding to the legend (aka. Proxy artists)
Not all handles can be turned into legend entries automatically, so it is often necessary to
create an artist which can. Legend handles don't have to exist on the Figure or Axes in
order to be used.
Suppose we wanted to create a legend which has an entry for some data which is
represented by a red color:
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
fig,ax=plt.subplots()
red_patch=mpatches.Patch(color='red',label='The red data')
ax.legend(handles=[red_patch])
plt.show()
There are many supported legend handles. Instead of creating a patch of color we could
have created a line with a marker:
import matplotlib.lines as mlines
fig,ax=plt.subplots()
blue_line=mlines.Line2D([],[],color='blue',marker='*',
markersize=15,label='Blue stars')
ax.legend(handles=[blue_line])
plt.show()
Legend location
The location of the legend can be specified by the keyword argument loc. Please see the
documentation at legend() for more details.
The bbox_to_anchor keyword gives a great degree of control for manual legend placement.
For example, if you want your axes legend located at the figure's top right-hand corner
instead of the axes' corner, simply specify the corner's location and the coordinate system
of that location:
ax.legend(bbox_to_anchor=(1,1),
bbox_transform=fig.transFigure)
More examples of custom legend placement:
fig,ax_dict=plt.subplot_mosaic([['top','top'],['bottom','BLANK']],
empty_sentinel="BLANK")
ax_dict['top'].plot([1,2,3],label="test1")
ax_dict['top'].plot([3,2,1],label="test2")
# Place a legend above this subplot, expanding itself to
# fully use the given bounding box.
ax_dict['top'].legend(bbox_to_anchor=(0.,1.02,1.,.102),loc='lower left',
ncol=2,mode="expand",borderaxespad=0.)
ax_dict['bottom'].plot([1,2,3],label="test1")
ax_dict['bottom'].plot([3,2,1],label="test2")
# Place a legend to the right of this smaller subplot.
ax_dict['bottom'].legend(bbox_to_anchor=(1.05,1),
loc='upper left',borderaxespad=0.)
plt.show()
Legend Handlers
In order to create legend entries, handles are given as an argument to an
appropriate HandlerBase subclass. The choice of handler subclass is determined by the
following rules:
1. Update get_legend_handler_map() with the value in the handler_map keyword.
2. Check if the handle is in the newly created handler_map.
3. Check if the type of handle is in the newly created handler_map.
4. Check if any of the types in the handle's mro is in the newly created handler_map.
For completeness, this logic is mostly implemented in get_legend_handler().
All of this flexibility means that we have the necessary hooks to implement custom
handlers for our own type of legend key.
The simplest example of using custom handlers is to instantiate one of the
existing legend_handler.HandlerBase subclasses. For the sake of simplicity, let's
choose legend_handler.HandlerLine2D which accepts a numpoints argument (numpoints
is also a keyword on the legend() function for convenience). We can then pass the mapping
of instance to Handler as a keyword to legend.
from matplotlib.legend_handler import HandlerLine2D
fig,ax=plt.subplots()
line1,=ax.plot([3,2,1],marker='o',label='Line 1')
line2,=ax.plot([1,2,3],marker='o',label='Line 2')
ax.legend(handler_map={line1:HandlerLine2D(numpoints=4)})
As you can see, "Line 1" now has 4 marker points, where "Line 2" has 2 (the default). Try
the above code, only change the map's key from line1 to type(line1). Notice how now
both Line2D instances get 4 markers.
Along with handlers for complex plot types such as errorbars, stem plots and histograms,
the default handler_map has a special tuple handler (legend_handler.HandlerTuple)
which simply plots the handles on top of one another for each item in the given tuple. The
following example demonstrates combining two legend keys on top of one another:
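from numpy.random import randn
z = randn(10)
fig, ax = plt.subplots()
red_dot, = ax.plot(z, "ro", markersize=15)
# Put a white cross over some of the data.
white_cross, = ax.plot(z[:5], "w+", markeredgewidth=3, markersize=15)
# A tuple of handles is drawn as one combined legend key.
ax.legend([red_dot, (red_dot, white_cross)], ["Attr A", "Attr A+B"])
The HandlerTuple class can also be used to assign several legend keys to the same entry:
from matplotlib.legend_handler import HandlerTuple
fig, ax = plt.subplots()
p1, = ax.plot([1, 2.5, 3], 'r-d')
p2, = ax.plot([3, 2, 1], 'k-o')
l = ax.legend([(p1, p2)], ['Two keys'], numpoints=1,
              handler_map={tuple: HandlerTuple(ndivide=None)})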
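import matplotlib.patches as mpatches
class AnyObject:
    pass
class AnyObjectHandler:
    def legend_artist(self, legend, orig_handle, fontsize, handlebox):
        x0, y0 = handlebox.xdescent, handlebox.ydescent
        width, height = handlebox.width, handlebox.height
        patch = mpatches.Rectangle([x0, y0], width, height, facecolor='red',
                                   edgecolor='black', hatch='xx', lw=3,
                                   transform=handlebox.get_transform())
        handlebox.add_artist(patch)
        return patch
fig, ax = plt.subplots()
# Map AnyObject instances to the custom handler when building the legend.
ax.legend([AnyObject()], ['My first handler'],
          handler_map={AnyObject: AnyObjectHandler()})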
from matplotlib.legend_handler import HandlerPatch
class HandlerEllipse(HandlerPatch):
    def create_artists(self, legend, orig_handle,
                       xdescent, ydescent, width, height, fontsize, trans):
        center = 0.5 * width - 0.5 * xdescent, 0.5 * height - 0.5 * ydescent
        p = mpatches.Ellipse(xy=center, width=width + xdescent,
                             height=height + ydescent)
        self.update_prop(p, orig_handle, legend)
        p.set_transform(trans)
        return [p]
c=mpatches.Circle((0.5,0.5),0.25,facecolor="green",
edgecolor="red",linewidth=3)
fig,ax=plt.subplots()
ax.add_patch(c)
ax.legend([c],["An ellipse, not a rectangle"],
handler_map={mpatches.Circle:HandlerEllipse()})
Colors
Matplotlib recognizes the following formats to specify a color:
1. an RGB or RGBA tuple of float values
in [0, 1] (e.g. (0.1, 0.2, 0.5) or (0.1, 0.2, 0.5, 0.3)). RGBA is short for Red, Green, Blue,
Alpha;
2. a hex RGB or RGBA string (e.g., '#0F0F0F' or '#0F0F0F0F');
3. a shorthand hex RGB or RGBA string, equivalent to the hex RGB or RGBA string
obtained by duplicating each character, (e.g., '#abc', equivalent to '#aabbcc',
or '#abcd', equivalent to '#aabbccdd');
4. a string representation of a float value in [0, 1] inclusive for gray level (e.g., '0.5');
5. a single letter string, i.e. one of {'b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'}, which are short-hand
notations for shades of blue, green, red, cyan, magenta, yellow, black, and white;
6. a X11/CSS4 ("html") color name, e.g. "blue";
7. a name from the xkcd color survey, prefixed with 'xkcd:' (e.g., 'xkcd:sky blue');
8. a "Cn" color spec, i.e. 'C' followed by a number, which is an index into the default
property cycle (rcParams["axes.prop_cycle"], default: cycler('color', ['#1f77b4',
'#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22',
'#17becf'])); the indexing is intended to occur at rendering time, and defaults to black
if the cycle does not include color;
9. one of {'tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple', 'tab:brown',
'tab:pink', 'tab:gray', 'tab:olive', 'tab:cyan'}, which are the Tableau Colors from the
'tab10' categorical palette (which is the default color cycle).
For more information on colors in matplotlib see
the matplotlib.colors API;
the List of named colors example.
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0.0, 2.0, 201)
s = np.sin(2 * np.pi * t)

# 1) RGB tuple:
fig, ax = plt.subplots(facecolor=(.18, .31, .31))
# 2) hex string:
ax.set_facecolor('#eafff5')
# 3) gray level string:
ax.set_title('Voltage vs. time chart', color='0.7')
# 4) single letter color string
ax.set_xlabel('time (s)', color='c')
# 5) a named color:
ax.set_ylabel('voltage (mV)', color='peachpuff')
# 6) a named xkcd color:
ax.plot(t, s, 'xkcd:crimson')
# 7) Cn notation:
ax.plot(t, .7 * s, color='C4', linestyle='--')
# 8) tab notation:
ax.tick_params(labelcolor='tab:orange')

plt.show()
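As a quick check of how the formats listed above relate to each other, the following sketch resolves a few specifications with matplotlib.colors.to_rgba and to_hex. The colors compared here are arbitrary choices for illustration, not part of the original example.

```python
from matplotlib import colors as mcolors

# Shorthand hex expands by duplicating each character.
short_hex = mcolors.to_rgba('#abc')
long_hex = mcolors.to_rgba('#aabbcc')

# A float string is a gray level; alpha defaults to 1.
gray = mcolors.to_rgba('0.5')

# The single letter 'b' and the CSS4 name 'blue' resolve to the same pure blue,
# while 'tab:blue' is the softer Tableau blue used by the default color cycle.
letter_blue = mcolors.to_rgba('b')
named_blue = mcolors.to_rgba('blue')
tableau_blue = mcolors.to_hex('tab:blue')

print(short_hex == long_hex)
print(gray)
print(letter_blue == named_blue)
print(tableau_blue)
```

Resolving a specification this way is a handy test when it is unclear whether two spellings name the same color.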
Matplotlib Subplot
Display Multiple Plots
With the subplot() function you can draw multiple plots in one figure:
Example
Draw 2 plots:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.show()
Result:
The subplot() Function
The subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by
the first and second argument.
The third argument represents the index of the current plot.
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.
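The index counts from 1, running left to right along the first row and then continuing on the next row. The small helper below is only an illustration of that numbering (subplot_position is a hypothetical name, not a Matplotlib function):

```python
def subplot_position(nrows, ncols, index):
    """Return the 0-based (row, column) of a 1-based subplot index."""
    if not 1 <= index <= nrows * ncols:
        raise ValueError("index must be between 1 and nrows * ncols")
    # Row-major order: fill a row completely before moving down
    return divmod(index - 1, ncols)

print(subplot_position(1, 2, 2))  # second plot in a 1 x 2 layout
print(subplot_position(2, 3, 4))  # fourth plot in a 2 x 3 layout
```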
So, if we want a figure with 2 rows and 1 column (meaning that the two plots will be
displayed on top of each other instead of side-by-side), we can write the syntax like this:
Example
Draw 2 plots on top of each other:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 1, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 1, 2)
plt.plot(x,y)
plt.show()
Result:
You can draw as many plots as you like on one figure; just describe the number of rows,
columns, and the index of the plot.
Example
Draw 6 plots:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 1)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 2)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 3)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 4)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 5)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 6)
plt.plot(x,y)
plt.show()
Result:
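When the grid gets this large, repeating the subplot() call for every panel becomes tedious. One possible alternative, sketched here, is plt.subplots(), which creates the figure and the whole grid of axes at once so that the panels can be filled in a loop. The panel titles and the Agg backend line are additions for this sketch only.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
series = [np.array([3, 8, 1, 10]), np.array([10, 20, 30, 40])]

# axes is a 2 x 3 NumPy array of Axes objects
fig, axes = plt.subplots(2, 3)
for i, ax in enumerate(axes.flat):
    ax.plot(x, series[i % 2])        # alternate between the two data sets
    ax.set_title(f"plot {i + 1}")    # same 1-based numbering as subplot()
fig.tight_layout()
```

Call plt.show() afterwards (with an interactive backend) to display the figure.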
point. So is the text centered on the point, or is the first letter in the text positioned on that
point? Let’s see.
fig, ax = plt.subplots()
ax.set_title("Different horizontal alignment options when x = .5")
ax.text(.5, .8, 'ha left', fontsize = 12, color = 'red', ha = 'left')
ax.text(.5, .6, 'ha right', fontsize = 12, color = 'green', ha = 'right')
ax.text(.5, .4, 'ha center', fontsize = 12, color = 'blue', ha = 'center')
ax.text(.5, .2, 'ha default', fontsize = 12)
Text(0.5, 0.2, 'ha default')
Creating a text box
The fontdict dictionary object allows you to customize the font. Similarly, passing
the bbox dictionary object allows you to set the properties for a box around the text.
A color given as a string holding a number between 0 and 1 selects a shade of gray, with '0'
being totally black and '1' being totally white. We can also use boxstyle to determine the
shape of the box. If the facecolor is too dark, it can be lightened by trying a value of alpha
closer to 0.
fig, ax = plt.subplots()
x, y, text = .5, .7, "Text in grey box with\nrectangular box corners."
ax.text(x, y, text,bbox={'facecolor': '.9', 'edgecolor':'blue', 'boxstyle':'square'})
x, y, text = .5, .5, "Text in blue box with\nrounded corners and alpha of .1."
ax.text(x, y, text, bbox={'facecolor': 'blue', 'edgecolor': 'none', 'boxstyle': 'round', 'alpha': 0.1})
x, y, text = .1, .3, "Text in a circle.\nalpha of .5 darker\nthan alpha of .1"
ax.text(x, y, text,bbox={'facecolor': 'blue', 'edgecolor':'black', 'boxstyle':'circle', 'alpha' : 0.5})
Text(0.1, 0.3, 'Text in a circle.\nalpha of .5 darker\nthan alpha of .1')
Basic annotate method example
Like we said earlier, often you’ll want the text to be below or above the point it’s labeling.
We could do this with the text method, but annotate makes it easier to place text relative to
a point. The annotate method allows us to specify two pairs of coordinates. One xy
coordinate specifies the point we wish to label. Another xy coordinate specifies the position
of the label itself. For example, here we plot a point at (.5,.5) but put the annotation a little
higher, at (.5,.503).
fig, ax = plt.subplots()
x, y, annotation = .5, .5, "annotation"
ax.set_title("Annotating point (.5,.5) with label located at (.5,.503)")
ax.scatter(x,y)
ax.annotate(annotation,xy=(x,y),xytext=(x,y+.003))
Text(0.5, 0.503, 'annotation')
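Because xytext above is given in data coordinates, the visible gap between point and label changes whenever the axis limits change. One alternative, shown here as a sketch, is to pass textcoords='offset points' so that xytext is read as an offset from the point, measured in points. The 10-point offset is an arbitrary value chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x, y = .5, .5
ax.scatter(x, y)
# xytext is now an offset from xy, measured in points (1/72 inch)
label = ax.annotate("annotation", xy=(x, y), xytext=(0, 10),
                    textcoords='offset points', ha='center')
```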
Annotate with an arrow
Okay, so we have a point at xy and an annotation at xytext . How can we connect the two?
Can we draw an arrow from the annotation to the point? Absolutely! What we’ve done with
annotate so far looks the same as if we’d just used the text method to put the label at (.5,
.503). But annotate can also draw an arrow connecting the label to the point. The arrow is
styled by passing a dictionary to arrowprops .
fig, ax = plt.subplots()
x, y, annotation = .5, .5, "annotation"
ax.scatter(x,y)
ax.annotate(annotation,xy=(x,y),xytext=(x,y+.003),arrowprops={'arrowstyle' : 'simple'})
Text(0.5, 0.503, 'annotation')
Adjusting the arrow length
It looks a little weird to have the arrow touch the point. How can we have the arrow go
close to the point, but not quite touch it? Again, styling options are passed in a dictionary
object. Larger values of shrinkA will move the tail farther from the label and larger
values of shrinkB will move the head farther from the point. The default
for shrinkA and shrinkB is 2, so by setting shrinkB to 5 we move the head of the
# Show graph
plt.show()
Custom Line Width
Finally, you can customize the line width as well, using the linewidth argument.
# Libraries and data
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df=pd.DataFrame({'x_values':range(1,11),'y_values': np.random.randn(10)})
# Show graph
plt.show()
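A minimal sketch of a plot call that uses linewidth is given below. Plain NumPy arrays stand in for the x_values and y_values columns, and the widths 1 and 6 are arbitrary values chosen so the difference is easy to see; they are assumptions, not values from the original listing.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x_values = np.arange(1, 11)
y_values = rng.standard_normal(10)

fig, ax = plt.subplots()
# linewidth is given in points; larger numbers draw thicker lines
(thin,) = ax.plot(x_values, y_values, linewidth=1, label='linewidth=1')
(thick,) = ax.plot(x_values, y_values + 3, linewidth=6, label='linewidth=6')
ax.legend()
```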
Matplotlib - Three-dimensional Plotting
Even though Matplotlib was initially designed with only two-dimensional plotting in mind,
some three-dimensional plotting utilities were built on top of Matplotlib's two-dimensional
display in later versions, to provide a set of tools for three-dimensional data visualization.
Three-dimensional plots are enabled by importing the mplot3d toolkit, included with the
Matplotlib package.
A three-dimensional axes can be created by passing the keyword projection='3d' to any of
the normal axes creation routines.
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection='3d')
z = np.linspace(0, 1, 100)
x = z * np.sin(20 * z)
y = z * np.cos(20 * z)
ax.plot3D(x, y, z, 'gray')
ax.set_title('3D line plot')
plt.show()
We can now plot a variety of three-dimensional plot types. The most basic three-
dimensional plot is a 3D line plot created from sets of (x, y, z) triples. This can be created
using the ax.plot3D function.
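As one example of another plot type, the sketch below adds a 3D scatter plot to the same spiral using ax.scatter3D. Taking every fifth point and coloring by height with the 'viridis' colormap are choices made for this illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d  # registers the 3D projection on older Matplotlib versions

z = np.linspace(0, 1, 100)
x = z * np.sin(20 * z)
y = z * np.cos(20 * z)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot3D(x, y, z, 'gray')
# Color each marker by its height so depth is easier to read
points = ax.scatter3D(x[::5], y[::5], z[::5], c=z[::5], cmap='viridis')
ax.set_title('3D line and scatter plot')
```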
def draw_map(m, scale=0.2):
    # draw a shaded-relief image
    m.shadedrelief(scale=scale)
Cylindrical projections
The simplest of map projections are cylindrical projections, in which lines of constant
latitude and longitude are mapped to horizontal and vertical lines, respectively. This type
of mapping represents equatorial regions quite well, but results in extreme distortions near
the poles. The spacing of latitude lines varies between different cylindrical projections,
leading to different conservation properties, and different distortion near the poles. In the
following figure we show an example of the equidistant cylindrical projection, which
chooses a latitude scaling that preserves distances along meridians. Other cylindrical
projections are the Mercator (projection='merc') and the cylindrical equal area
(projection='cea') projections.
fig = plt.figure(figsize=(8, 6), edgecolor='w')
m = Basemap(projection='cyl', resolution=None,
            llcrnrlat=-90, urcrnrlat=90,
            llcrnrlon=-180, urcrnrlon=180)
draw_map(m)
The additional arguments to Basemap for this view specify the latitude (lat) and longitude
(lon) of the lower-left corner (llcrnr) and upper-right corner (urcrnr) for the desired map,
in units of degrees.
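The difference in latitude spacing between cylindrical projections can be seen without Basemap at all. The sketch below uses the standard spherical formulas on a unit sphere: the equidistant cylindrical projection keeps y proportional to latitude, while Mercator stretches y more and more toward the poles. The function names are hypothetical helpers for this illustration, not Basemap API.

```python
import math

def equidistant_y(lat_deg):
    """Equidistant cylindrical: y is simply the latitude in radians."""
    return math.radians(lat_deg)

def mercator_y(lat_deg):
    """Mercator: y = ln(tan(pi/4 + lat/2)), unbounded as lat approaches 90."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))

for lat in (0, 30, 60, 80):
    print(f"lat {lat:2d}: equidistant y = {equidistant_y(lat):.3f}, "
          f"Mercator y = {mercator_y(lat):.3f}")
```

In both cases x is proportional to longitude, which is why meridians stay vertical.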
Pseudo-cylindrical projections
Pseudo-cylindrical projections relax the requirement that meridians (lines of constant
longitude) remain vertical; this can give better properties near the poles of the projection.
The Mollweide projection (projection='moll') is one common example of this, in which all
meridians are elliptical arcs. It is constructed so as to preserve area across the map: though
there are distortions near the poles, the area of small patches reflects the true area. Other
pseudo-cylindrical projections are the sinusoidal (projection='sinu') and Robinson
(projection='robin') projections.
fig = plt.figure(figsize=(8, 6), edgecolor='w')
m = Basemap(projection='moll', resolution=None,
            lat_0=0, lon_0=0)
draw_map(m)
The extra arguments to Basemap here refer to the central latitude (lat_0) and longitude
(lon_0) for the desired map.
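To make "meridians need not be vertical" concrete, here is a sketch of the sinusoidal projection mentioned above, written with the standard unit-sphere formula rather than Basemap. The helper name sinusoidal_xy is an assumption of this example.

```python
import math

def sinusoidal_xy(lon_deg, lat_deg, lon_0=0.0):
    """Sinusoidal projection on a unit sphere: x shrinks with cos(latitude)."""
    lat = math.radians(lat_deg)
    x = math.radians(lon_deg - lon_0) * math.cos(lat)
    y = lat
    return x, y

# Follow the 90-degree meridian northward: x moves toward the center line
for lat in (0, 30, 60, 85):
    x, y = sinusoidal_xy(90, lat)
    print(f"lat {lat:2d}: x = {x:.3f}, y = {y:.3f}")
```

Because x is scaled by the cosine of the latitude, each parallel keeps its true relative length, which is what makes this projection equal-area.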
Perspective projections
Perspective projections are constructed using a particular choice of perspective point,
similar to if you photographed the Earth from a particular point in space (a point which, for
some projections, technically lies within the Earth!). One common example is the
orthographic projection (projection='ortho'), which shows one side of the globe as seen
from a viewer at a very long distance. As such, it can show only half the globe at a time.
Other perspective-based projections include the gnomonic projection (projection='gnom')
and stereographic projection (projection='stere'). These are often the most useful for
showing small portions of the map.
Here is an example of the orthographic projection:
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None,
            lat_0=50, lon_0=0)
draw_map(m);
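Why the orthographic view can show only half the globe follows directly from its formula. The sketch below, which uses the standard unit-sphere equations and not Basemap itself, returns None for points on the far hemisphere; orthographic_xy is a hypothetical helper, with defaults matching the lat_0=50, lon_0=0 view above.

```python
import math

def orthographic_xy(lon_deg, lat_deg, lon_0=0.0, lat_0=50.0):
    """Orthographic projection of a unit sphere centered on (lat_0, lon_0).

    Returns (x, y), or None if the point lies on the hidden hemisphere.
    """
    lat, lat0 = math.radians(lat_deg), math.radians(lat_0)
    dlon = math.radians(lon_deg - lon_0)
    # Cosine of the angular distance from the center of the view
    cos_c = (math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(dlon))
    if cos_c < 0:
        return None
    x = math.cos(lat) * math.sin(dlon)
    y = (math.cos(lat0) * math.sin(lat)
         - math.sin(lat0) * math.cos(lat) * math.cos(dlon))
    return x, y

print(orthographic_xy(0, 50))     # center of the view
print(orthographic_xy(0, 90))     # North Pole, visible
print(orthographic_xy(180, -50))  # antipode of the center, hidden
```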
Conic projections
A Conic projection projects the map onto a single cone, which is then unrolled. This can
lead to very good local properties, but regions far from the focus point of the cone may
become very distorted. One example of this is the Lambert Conformal Conic projection
(projection='lcc'), which we saw earlier in the map of North America. It projects the map
onto a cone arranged in such a way that two standard parallels (specified in Basemap
by lat_1 and lat_2) have well-represented distances, with scale decreasing between them
and increasing outside of them. Other useful conic projections are the equidistant conic
projection (projection='eqdc') and the Albers equal-area projection (projection='aea').
Conic projections, like perspective projections, tend to be good choices for representing
small to medium patches of the globe.
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            lon_0=0, lat_0=50, lat_1=45, lat_2=55,
            width=1.6E7, height=1.2E7)
draw_map(m)
Other projections
If you're going to do much with map-based visualizations, I encourage you to read up on
other available projections, along with their properties, advantages, and disadvantages.
Most likely, they are available in the Basemap package. If you dig deep enough into this
topic, you'll find an incredible subculture of geo-viz geeks who will be ready to argue
fervently in support of their favorite projection for any given application!
Drawing a Map Background
Earlier we saw the bluemarble() and shadedrelief() methods for projecting global images
on the map, as well as the drawparallels() and drawmeridians() methods for drawing lines
of constant latitude and longitude. The Basemap package contains a range of useful
functions for drawing borders of physical features like continents, oceans, lakes, and rivers,
as well as political boundaries such as countries and US states and counties. The following
are some of the available drawing functions that you may wish to explore using IPython's
help features:
Physical boundaries and bodies of water
o drawcoastlines(): Draw continental coast lines
o drawlsmask(): Draw a mask between the land and sea, for use with projecting images on one or the other
o drawmapboundary(): Draw the map boundary, including the fill color for oceans
o drawrivers(): Draw rivers on the map
o fillcontinents(): Fill the continents with a given color; optionally fill lakes with another color
Political boundaries
o drawcountries(): Draw country boundaries
o drawstates(): Draw US state boundaries
o drawcounties(): Draw US county boundaries
Map features
o drawgreatcircle(): Draw a great circle between two points
o drawparallels(): Draw lines of constant latitude
o drawmeridians(): Draw lines of constant longitude
o drawmapscale(): Draw a linear scale on the map
Whole-globe images
o bluemarble(): Project NASA's blue marble image onto the map
o shadedrelief(): Project a shaded relief image onto the map
o etopo(): Draw an etopo relief image onto the map
o warpimage(): Project a user-provided image onto the map
For the boundary-based features, you must set the desired resolution when creating a
Basemap image. The resolution argument of the Basemap class sets the level of detail in
boundaries, either 'c' (crude), 'l' (low), 'i' (intermediate), 'h' (high), 'f' (full), or None if no
boundaries will be used. This choice is important: setting high-resolution boundaries on a
global map, for example, can be very slow.
Here's an example of drawing land/sea boundaries, and the effect of the resolution
parameter. We'll create both a low- and high-resolution map of Scotland's beautiful Isle of
Skye. It's located at 57.3°N, 6.2°W, and a map of 90 × 120 kilometers shows it well (the
width and height arguments are given in meters, hence 90000 and 120000):
fig, ax = plt.subplots(1, 2, figsize=(12, 8))
for i, res in enumerate(['l', 'h']):
    m = Basemap(projection='gnom', lat_0=57.3, lon_0=-6.2,
                width=90000, height=120000, resolution=res, ax=ax[i])
    m.fillcontinents(color="#FFDDCC", lake_color='#DDEEFF')
    m.drawmapboundary(fill_color="#DDEEFF")
    m.drawcoastlines()
    ax[i].set_title("resolution='{0}'".format(res));
Notice that the low-resolution coastlines are not suitable for this level of zoom, while high-
resolution works just fine. The low level would work just fine for a global view, however,
and would be much faster than loading the high-resolution border data for the entire globe!
It might require some experimentation to find the correct resolution parameter for a given
view: the best route is to start with a fast, low-resolution plot and increase the resolution as
needed.
Plotting Data on Maps
Perhaps the most useful piece of the Basemap toolkit is the ability to over-plot a variety of
data onto a map background. For simple plotting and text, any plt function works on the
map; you can use the Basemap instance to project latitude and longitude coordinates to (x,
y) coordinates for plotting with plt, as we saw earlier in the Seattle example.
In addition to this, there are many map-specific functions available as methods of
the Basemap instance. These work very similarly to their standard Matplotlib counterparts,
but have an additional Boolean argument latlon, which if set to True allows you to pass raw
latitudes and longitudes to the method, rather than projected (x, y) coordinates.
Some of these map-specific methods are:
contour()/contourf() : Draw contour lines or filled contours
imshow(): Draw an image
pcolor()/pcolormesh() : Draw a pseudocolor plot for irregular/regular meshes
plot(): Draw lines and/or markers.
scatter(): Draw points with markers.
quiver(): Draw vectors.
barbs(): Draw wind barbs.
drawgreatcircle(): Draw a great circle.
We'll see some examples of a few of these as we continue. For more information on these
functions, including several example plots, see the online Basemap documentation.
Example: California Cities
Recall that in Customizing Plot Legends, we demonstrated the use of size and color in a
scatter plot to convey information about the location, size, and population of California
cities. Here, we'll create this plot again, but using Basemap to put the data in context.
We start with loading the data, as we did before:
import pandas as pd
cities = pd.read_csv('data/california_cities.csv')
This shows us roughly where larger populations of people have settled in California: they
are clustered near the coast in the Los Angeles and San Francisco areas, stretched along the
highways in the flat central valley, and avoiding almost completely the mountainous
regions along the borders of the state.
Example: Surface Temperature Data
As an example of visualizing some more continuous geographic data, let's consider the
"polar vortex" that hit the eastern half of the United States in January of 2014. A great
source for any sort of climatic data is NASA's Goddard Institute for Space Studies. Here
we'll use the GIS 250 temperature data, which we can download using shell commands
(these commands may have to be modified on Windows machines). The data used here was
downloaded on 6/12/2016, and the file size is approximately 9MB:
# !curl -O http://data.giss.nasa.gov/pub/gistemp/gistemp250.nc.gz
# !gunzip gistemp250.nc.gz
The data comes in NetCDF format, which can be read in Python by the netCDF4 library. You
can install this library as shown here
The data paints a picture of the localized, extreme temperature anomalies that happened
during that month. The eastern half of the United States was much colder than normal,
while the western half and Alaska were much warmer. Regions with no recorded
temperature show the map background.
Visualization with Seaborn
Seaborn provides an API on top of Matplotlib that offers sane choices for plot style and
color defaults, defines simple high-level functions for common statistical plot types, and
integrates with the functionality provided by Pandas DataFrames.
To be fair, the Matplotlib team is addressing this: it has added the plt.style tools
discussed in Customizing Matplotlib: Configurations and Style Sheets, and is starting to
handle Pandas data more seamlessly. The 2.0 release of the library introduced a new
default stylesheet that improves on the classic look. But for all the reasons just
discussed, Seaborn remains an extremely useful add-on.
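As a small illustration of the plt.style tools mentioned here, the sketch below applies a stylesheet temporarily with plt.style.context and inspects one setting it changes. The choice of the 'ggplot' style and of the axes.grid setting is arbitrary; any name from plt.style.available could be used instead.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt

grid_before = plt.rcParams['axes.grid']
with plt.style.context('ggplot'):
    # Inside the block the stylesheet's settings are active
    grid_inside = plt.rcParams['axes.grid']
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
# Leaving the block restores the previous settings
grid_after = plt.rcParams['axes.grid']

print(grid_before, grid_inside, grid_after)
```

Using the context manager rather than plt.style.use keeps the change local to one figure, which is convenient when comparing styles side by side.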
Seaborn Versus Matplotlib
Here is an example of a simple random-walk plot in Matplotlib, using its classic plot
formatting and colors. We start with the typical imports:
import matplotlib.pyplot as plt
plt.style.use('classic')
%matplotlib inline
import numpy as np
import pandas as pd
Now we create some random walk data:
# Create some data
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 500)
y = np.cumsum(rng.randn(500, 6), 0)
And do a simple plot:
# Plot the data with Matplotlib defaults
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');
Although the result contains all the information we'd like it to convey, it does so in a way
that is not all that aesthetically pleasing, and even looks a bit old-fashioned in the context of
21st-century data visualization.
Now let's take a look at how it works with Seaborn. As we will see, Seaborn has many of its
own high-level plotting routines, but it can also overwrite Matplotlib's default parameters
and in turn get even simple Matplotlib scripts to produce vastly superior output. We can set
the style by calling Seaborn's set() method. By convention, Seaborn is imported as sns:
import seaborn as sns
sns.set()
Now let's rerun the same two lines as before:
# same plotting code as above!
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');
sns.distplot(data['x'])
sns.distplot(data['y']);
There are other parameters that can be passed to jointplot; for example, we can use a
hexagonally based histogram instead:
with sns.axes_style('white'):
    sns.jointplot(x="x", y="y", data=data, kind='hex')
Part – A