Visualization With Matplotlib
INTRODUCTION
Matplotlib is a widely used Python visualization library for 2D plots of arrays.
It is a multi-platform data visualization library built on NumPy arrays and
designed to work with the broader SciPy stack.
It was introduced by John Hunter in 2002.
Matplotlib is mostly written in Python; a few segments are written in C, Objective-C,
and JavaScript for platform compatibility.
Installation of Matplotlib
If Python and pip are already installed on your system, install Matplotlib with this
command:
pip install matplotlib
You can also use a Python distribution that already has Matplotlib installed, such as
Anaconda or Spyder.
Import Matplotlib
Once Matplotlib is installed, import it in your applications by adding an import
statement:
import matplotlib
Most of the Matplotlib utilities lie under the pyplot submodule, which is usually imported
under the plt alias:
import matplotlib.pyplot as plt
Types of Matplotlib Plots
Matplotlib comes with a wide variety of plots. Some of the sample plots are:
Matplotlib Line Plot
Matplotlib Bar Plot
Matplotlib Histograms Plot
Matplotlib Scatter Plot
Matplotlib Pie Charts
Matplotlib Area Plot
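Ex:
import matplotlib.pyplot as plt
xpoints, ypoints = [0, 6], [0, 250] # sample data: a line from (0, 0) to (6, 250)
plt.plot(xpoints, ypoints)
plt.show()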
---
Importing matplotlib
import matplotlib as mpl
import matplotlib.pyplot as plt
Setting Styles
To set a Matplotlib style, call plt.style.use() with the name of the desired
theme.
You can list the available styles as follows:
print(plt.style.available)
Ex (the %matplotlib magic applies to IPython sessions only):
%matplotlib
import matplotlib.pyplot as plt
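The style workflow described above can be sketched as follows. This is a minimal illustration, not the only way to do it: the 'ggplot' theme, the sample sine data, and the headless "Agg" backend are assumptions chosen so that the script runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Names of the style sheets shipped with the installed Matplotlib version
available_styles = plt.style.available
print(available_styles)

# Select a theme by name; 'ggplot' is one of the bundled styles
plt.style.use("ggplot")

x = np.linspace(0, 10, 100)  # sample data for illustration
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Restore Matplotlib's default settings afterwards
plt.style.use("default")
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() to display the figure.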
MATLAB-style interface
Matplotlib was originally written as a Python alternative for MATLAB users, and much
of its syntax reflects that.
Ex: plt.subplot(2, 1, 1) # (rows, columns, panel number)
plt.plot(x, np.sin(x))
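A fuller version of the MATLAB-style example might look like the sketch below. The definition of x and the second (cosine) panel are assumptions added for illustration; the point is that each plt call acts on whichever figure and panel is currently active.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration

plt.figure()  # create a new figure and make it current

# First panel: make it current, then plot into it
plt.subplot(2, 1, 1)  # (rows, columns, panel number)
plt.plot(x, np.sin(x))

# Second panel: plt.plot() now targets this panel instead
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))

fig = plt.gcf()  # the stateful interface keeps track of the current figure
```

Because the interface is stateful, going back to modify the first panel after the second has been created requires making it current again.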
Object-oriented interface
The object-oriented interface is useful for more complicated situations, and
for when you want more control over your figure.
Ex:
# First create a grid of plots
# ax will be an array of two Axes objects
fig, ax = plt.subplots(2)
# Call plot() method on the appropriate object
ax[0].plot(x, np.sin(x))
ax[1].plot(x, np.cos(x));
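For comparison, here is a self-contained sketch of the object-oriented version. The sample data, the shared x axis, and the axis labels are assumptions added to show how each Axes object is controlled independently.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration

# Create a figure with two stacked Axes that share the x axis
fig, ax = plt.subplots(2, sharex=True)

# Every call is a method on an explicit Axes object, so there is no
# dependence on which panel happens to be "current"
ax[0].plot(x, np.sin(x))
ax[0].set_ylabel("sin(x)")
ax[1].plot(x, np.cos(x))
ax[1].set_ylabel("cos(x)")
ax[1].set_xlabel("x")
```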
---
If we want to create a single figure with multiple lines, we can simply call the plot function
multiple times.
Ex:
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))
The linestyle and color codes can be combined into a single non-keyword argument to
the plt.plot() function.
Ex:
plt.plot(x, x + 0, '-g') # solid green
plt.plot(x, x + 1, '--c') # dashed cyan
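The shorthand can be checked against the equivalent keyword form. The sketch below assumes sample data for x and adds two further codes ('-.k' and ':r') for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration

plt.figure()
# plt.plot() returns a list of Line2D objects; unpack the single line
(solid_green,) = plt.plot(x, x + 0, "-g")     # solid green
(dashed_cyan,) = plt.plot(x, x + 1, "--c")    # dashed cyan
(dashdot_black,) = plt.plot(x, x + 2, "-.k")  # dash-dot black
(dotted_red,) = plt.plot(x, x + 3, ":r")      # dotted red

# The same style as '--c', spelled out with keyword arguments
(explicit,) = plt.plot(x, x + 4, linestyle="--", color="c")
```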
Matplotlib does a decent job of choosing default axis limits, but sometimes more
control is needed. To set the limits explicitly, use the two functions plt.xlim()
and plt.ylim().
Ex:
plt.plot(x, np.sin(x))
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5)
This can also be written as plt.axis([-1, 11, -1.5, 1.5]).
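The equivalence of the two forms can be verified with a short sketch. The sample data is an assumption; calling plt.axis() with no arguments returns the current limits as (xmin, xmax, ymin, ymax).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration

# Form 1: set the x and y limits separately
plt.figure()
plt.plot(x, np.sin(x))
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5)
limits_separate = plt.axis()  # read back (xmin, xmax, ymin, ymax)

# Form 2: set all four limits in one call
plt.figure()
plt.plot(x, np.sin(x))
plt.axis([-1, 11, -1.5, 1.5])
limits_combined = plt.axis()
```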
Labeling Plots
Titles and axis labels are the simplest labels. The following functions set
them:
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)")
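Put together with a plot, the labeling calls might be used as in the sketch below. The sample data is an assumption; the labels are read back from the current Axes to confirm they were applied.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration

plt.figure()
plt.plot(x, np.sin(x))
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)")

# The labels are stored on the current Axes and can be read back
ax = plt.gca()
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```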
The function used for line plots (plt.plot()) can also be used to produce scatter plots.
Ex:
plt.plot(x, y, 'o') # 'o' draws a circle marker at each point
The third argument in the function call is a character that represents the type of symbol
used for plotting. Just as you can specify options such as '-' and '--' to control the line style, the
marker style has its own set of short string codes.
marker symbol    description
"."              point
","              pixel
"o"              circle
"v"              triangle_down
"^"              triangle_up
"<"              triangle_left
">"              triangle_right
"s"              square
"p"              pentagon
"P"              plus (filled)
"*"              star
"h"              hexagon1
"H"              hexagon2
"+"              plus
"x"              x
"X"              x (filled)
"D"              diamond
"d"              thin_diamond
Ex:
plt.ylim(-1.2, 1.2)
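A short sketch of marker codes in use follows. The sample data and the particular codes ('o' and '-ok') are assumptions chosen for illustration; a marker code on its own suppresses the connecting line, while combining it with a line code draws both.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 30)  # sample data for illustration
y = np.sin(x)

plt.figure()
# Marker only: circles at each point, no connecting line
(points_only,) = plt.plot(x, y, "o", color="black")
# Line, marker, and color combined: solid line ('-'), circles ('o'), black ('k')
(line_and_points,) = plt.plot(x, 0.5 * y, "-ok")
plt.ylim(-1.2, 1.2)
```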
Scatter Plots with plt.scatter
Another method of creating scatter plots is the plt.scatter function. The primary
difference of plt.scatter from plt.plot is that it can be used to create scatter plots where the
properties of each individual point (size, face color, edge color, etc.) can be individually
controlled or mapped to data.
Ex:
rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
plt.colorbar(); # show color scale
The color argument is automatically mapped to a color scale (displayed by colorbar()), and the
size argument s is given in points squared, that is, as the marker area. In this way, the color
and size of points can be used to convey information in the visualization.
---
VISUALIZING ERRORS
Basic Errorbars
Error bars indicate the estimated error or uncertainty of a measurement and give a general
sense of how precise it is. They are drawn as markers over the original graph and its data
points, as lines that extend from the center or edge of each plotted data point. The length
of an error bar reveals the uncertainty of that data point. The function plt.errorbar() is
used to draw error bars.
Ex:
# importing matplotlib
import matplotlib.pyplot as plt
# sample data (illustrative values)
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 1, 2, 1, 2, 1]
# creating error
x_error = 0.5
y_error = 0.3
# plotting graph
plt.plot(x, y)
plt.errorbar(x, y, yerr=y_error, xerr=x_error, fmt='o')
The fmt argument is a format code controlling the appearance of lines and points; it uses the
same shorthand as plt.plot().
Ex:
plt.errorbar(x, y, yerr=dy, fmt='.k')
In addition to these basic options, the errorbar function has many options to fine-tune
the output, such as ecolor, elinewidth, and capsize.
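As a sketch of such fine-tuning, the example below lightens the error bars so they do not dominate the data points. The noisy sine data and the constant error dy are assumptions made for illustration; ecolor, elinewidth, and capsize are keyword arguments of plt.errorbar().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical measurements: a sine curve with Gaussian noise
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 50)
dy = 0.8  # assumed constant uncertainty on every point
y = np.sin(x) + dy * rng.randn(50)

plt.figure()
container = plt.errorbar(
    x, y, yerr=dy,
    fmt="o", color="black",   # black circles for the data points
    ecolor="lightgray",       # lighter color for the error bars
    elinewidth=3,             # thicker error bar lines
    capsize=0,                # no caps at the bar ends
)
```

Making the bars lighter than the points tends to help in crowded plots, where dark error bars can hide the data.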
The plt.fill_between() function fills the area between two horizontal curves, defined by the
points (x, y1) and (x, y2). It is useful for showing error on a continuous quantity.
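A minimal sketch of a continuous error band drawn with plt.fill_between() follows. The sine curve and the constant half-width dy are assumptions for illustration; in practice the band would come from a model or from measured uncertainties.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # sample data for illustration
y = np.sin(x)
dy = 0.2  # hypothetical uncertainty (half-width of the band)

plt.figure()
plt.plot(x, y, "-", color="gray")  # central curve
# Shade the region between the lower curve y - dy and the upper curve y + dy
band = plt.fill_between(x, y - dy, y + dy, color="gray", alpha=0.2)
ax = plt.gca()
```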
---
A contour plot can be created with the plt.contour function. It takes three arguments:
a grid of x values,
a grid of y values, and
a grid of z values.
The x and y values represent positions on the plot, and the z values are represented by the
contour levels. To build two-dimensional grids from one-dimensional arrays, the np.meshgrid
function is used.
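Ex:
def f(x, y): # example surface; any function of two arrays works
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)
plt.contour(X, Y, Z, colors='black')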
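To see what np.meshgrid produces, it helps to run it on very small arrays. The values below are assumptions chosen so the result is easy to read: X repeats the x values along each row, Y repeats the y values down each column, and both grids have shape (len(y), len(x)).

```python
import numpy as np

x = np.array([0, 1, 2])   # three x positions
y = np.array([10, 20])    # two y positions

X, Y = np.meshgrid(x, y)

print(X)
# [[0 1 2]
#  [0 1 2]]
print(Y)
# [[10 10 10]
#  [20 20 20]]

# Any elementwise function of X and Y now yields a z value per grid point
Z = X + Y
print(Z)
# [[10 11 12]
#  [20 21 22]]
```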
When a single color is used, negative values are represented by dashed lines and
positive values by solid lines by default. You can color-code the lines by specifying the cmap
argument.
Ex: plt.contour(X, Y, Z, 20, cmap='RdGy')
The contourf function fills the spaces between the levels in the plot. The first
color fills the band at the lowest level, and the last color corresponds to the highest
level in the plot.
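A filled contour plot might be produced as in the sketch below. The surface Z = sin(X) * cos(Y) is a hypothetical stand-in for real gridded data, and the request for 20 levels and the 'RdGy' colormap simply mirror the line-contour example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)  # hypothetical surface for illustration

plt.figure()
# Ask for about 20 levels and fill the bands between them
filled = plt.contourf(X, Y, Z, 20, cmap="RdGy")
cbar = plt.colorbar()  # the colorbar labels the level boundaries

print(filled.levels)  # level boundaries chosen by Matplotlib
```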
The color steps in a filled contour plot are discrete rather than continuous. For a smoother
representation, the plt.imshow() function can be used. It interprets a two-dimensional grid of
data as an image.
Ex:
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')
plt.colorbar()
plt.axis('image')
plt.imshow() doesn’t accept an x and y grid, so you must manually specify the extent
[xmin, xmax, ymin, ymax] of the image on the plot.
plt.imshow() by default follows the standard image array definition where the origin is
in the upper left, not in the lower left as in most contour plots. This must be changed
when showing gridded data.
plt.imshow() automatically adjusts the axis aspect ratio to match the input data; you
can change this by calling, for example, plt.axis('image') to make x and y units
match.
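The three points above can be seen together in one sketch. The gridded data Z is a hypothetical surface used only for illustration; extent, origin, and plt.axis('image') are the settings discussed in the text.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)  # hypothetical gridded data

plt.figure()
im = plt.imshow(
    Z,
    extent=[0, 5, 0, 5],  # [xmin, xmax, ymin, ymax], since no x/y grid is passed
    origin="lower",       # put row 0 at the bottom, as in the contour plots
    cmap="RdGy",
)
plt.colorbar()
plt.axis("image")  # equal scaling for x and y units, limits tight to the image
```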
Together, these three functions (plt.contour, plt.contourf, and plt.imshow) offer many
ways to display this sort of three-dimensional data within a two-dimensional plot.
---
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-white') # named 'seaborn-white' before Matplotlib 3.6
data = np.random.randn(1000)
plt.hist(data)
The hist() function has many options to tune both the calculation and the display. You can
customize the histogram by passing different arguments.
Ex:
x1 = np.random.normal(0, 0.8, 1000)
x2 = np.random.normal(-2, 1, 1000)
x3 = np.random.normal(3, 2, 1000)
kwargs = dict(histtype='stepfilled', alpha=0.3, density=True, bins=40)
plt.hist(x1, **kwargs)
plt.hist(x2, **kwargs)
plt.hist(x3, **kwargs)
Ex:
plt.hist2d(x, y, bins=30, cmap='Blues')
cb = plt.colorbar()
cb.set_label('counts in bin')
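The hist2d example needs two-dimensional sample data to run. The sketch below assumes points drawn from a correlated two-dimensional Gaussian (the mean and covariance values are illustrative) and shows that plt.hist2d() returns the bin counts and edges along with the image.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 10,000 points from a correlated 2D Gaussian
rng = np.random.RandomState(0)
mean = [0, 0]
cov = [[1, 1], [1, 2]]
x, y = rng.multivariate_normal(mean, cov, 10000).T

plt.figure()
# Divide the plane into a 30 x 30 grid of rectangular bins and count points
counts, xedges, yedges, image = plt.hist2d(x, y, bins=30, cmap="Blues")
cb = plt.colorbar()
cb.set_label("counts in bin")
```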
plt.hexbin: Hexagonal binnings
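plt.hexbin() bins the same kind of data into a grid of hexagons instead of rectangles. The sketch below is a minimal illustration under assumed sample data (a correlated two-dimensional Gaussian) and an assumed gridsize of 30.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 10,000 points from a correlated 2D Gaussian
rng = np.random.RandomState(0)
x, y = rng.multivariate_normal([0, 0], [[1, 1], [1, 2]], 10000).T

plt.figure()
# gridsize sets the number of hexagons along the x direction
hexes = plt.hexbin(x, y, gridsize=30, cmap="Blues")
cb = plt.colorbar(label="count in bin")

# The per-hexagon counts are stored as the collection's array
total_points = int(hexes.get_array().sum())
```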
---