Visualization With Matplotlib

Unit 3 data science

Uploaded by

bonamkotaiah

VISUALIZATION WITH MATPLOTLIB

INTRODUCTION
• Matplotlib is a widely used Python visualization library for producing 2D plots from arrays.
• It is a multi-platform data visualization library built on NumPy arrays and
designed to work with the broader SciPy stack.
• John Hunter began developing it in 2002; the first public release followed in 2003.
• Matplotlib is written mostly in Python; a few segments are written in C, Objective-C
and JavaScript for platform compatibility.

Installation of Matplotlib
If Python and pip are already installed on your system, install Matplotlib with this
command:

pip install matplotlib

You can also use a Python distribution that already includes Matplotlib, such as Anaconda
(which also bundles the Spyder IDE).

Import Matplotlib
Once Matplotlib is installed, import it in your applications with an import
statement:

import matplotlib

Most of the Matplotlib utilities lie under the pyplot submodule, which is usually imported
under the plt alias:

import matplotlib.pyplot as plt

Types of Matplotlib Plots
Matplotlib comes with a wide variety of plots. Some sample plot types are:
• Line plot
• Bar plot
• Histogram
• Scatter plot
• Pie chart
• Area plot

Ex:

import matplotlib.pyplot as plt
import numpy as np

# Draw a straight line from (0, 0) to (6, 250)
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()
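
The other plot types in the list above follow the same pattern. The sketch below is only an illustration: the categories and values are made up, and each panel uses one standard plotting call (bar, pie, scatter, fill_between).

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, used only for illustration
categories = ["A", "B", "C", "D"]
values = np.array([3, 7, 5, 2])
x = np.arange(10)
y = x ** 2

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].bar(categories, values)         # bar plot
axes[0, 0].set_title("Bar")

axes[0, 1].pie(values, labels=categories)  # pie chart
axes[0, 1].set_title("Pie")

axes[1, 0].scatter(x, y)                   # scatter plot
axes[1, 0].set_title("Scatter")

axes[1, 1].fill_between(x, y, alpha=0.4)   # area plot
axes[1, 1].set_title("Area")

fig.tight_layout()
```

In a script, add plt.show() at the end to open the window.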

---

GENERAL MATPLOTLIB TIPS

Importing matplotlib
import matplotlib as mpl
import matplotlib.pyplot as plt

The plt interface is what we will use most often.

Setting Styles
• To set a Matplotlib style, call plt.style.use with the name of the desired
theme.
• You can list the available themes or styles as follows:

import matplotlib.pyplot as plt

print(plt.style.available)

Output (from an older Matplotlib release; since Matplotlib 3.6 the seaborn styles are
listed with a 'seaborn-v0_8' prefix, e.g. 'seaborn-v0_8-white'):

['Solarize_Light2', '_classic_test_patch', '_mpl-gallery', '_mpl-gallery-nogrid', 'bmh',
'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot',
'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind',
'seaborn-dark', 'seaborn-dark-palette', 'seaborn-darkgrid', 'seaborn-deep',
'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel',
'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid',
'tableau-colorblind10']
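
Because the list of styles depends on the installed version, it can help to check for a style before using it. The sketch below is a minimal illustration; the choice of 'ggplot' with a fallback to 'default' is an assumption made for this example. It uses plt.style.context, which applies a style only inside the with-block instead of changing it for the whole session.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Use 'ggplot' only if this Matplotlib version ships it
style = "ggplot" if "ggplot" in plt.style.available else "default"

# The style is active only inside the with-block
with plt.style.context(style):
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    ax.set_title(f"style: {style}")
```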

show() or No show()? How to Display Your Plots


The best use of Matplotlib differs depending on how you are using it. There are roughly
three applicable contexts: using Matplotlib i) in a script, ii) in an IPython terminal, iii) in an
IPython notebook.
i) Plotting from a script
If you are using Matplotlib from within a script, plt.show() is the function
you need. It looks for all currently active figure objects and opens one or more
interactive windows that display your figure or figures. It is typically called once, at the end of the script.

ii) Plotting from an IPython shell


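It is very convenient to use Matplotlib interactively within an IPython shell.
To enable this mode, run the %matplotlib magic command after starting
IPython.

Ex:
%matplotlib
import matplotlib.pyplot as plt

From then on, any plt plotting command opens a figure window, and further commands update it.
Some changes, such as modifying the properties of lines that are already drawn, do not redraw
automatically; to force an update, use plt.draw().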

iii) Plotting from an IPython notebook


In the IPython notebook we have two options:
-- %matplotlib notebook embeds interactive plots in the notebook
-- %matplotlib inline embeds static images of the plots in the notebook

Saving Figures to File


Matplotlib can save figures in a wide variety of formats. You can save a
figure using the savefig() function. Its commonly used parameters include:
• fname : path or name of the output file, including the extension
• dpi : resolution of the figure in dots per inch
• facecolor : face color of the figure
• edgecolor : edge color of the figure
• orientation : 'landscape' or 'portrait'
• format : the file format, e.g. 'png', 'pdf', 'svg'; if omitted, it is inferred from the extension of fname
• transparent : if True, the patches of the axes are all made transparent
• bbox_inches, pad_inches : bbox_inches='tight' trims the whitespace around the figure, and pad_inches sets the padding (in inches) left around it
Ex:
plt.savefig("output1.png", facecolor='y', bbox_inches="tight", pad_inches=0.3,
            transparent=True)
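
The sketch below is one way to check which formats are available and to confirm that a file was written. The file name sine.png and the use of a temporary directory are assumptions made for this example; get_supported_filetypes() is a method of the figure's canvas.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# File types the active backend can write, as {extension: description}
supported = fig.canvas.get_supported_filetypes()

# Write into a temporary directory that is removed when the with-block ends
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "sine.png")  # hypothetical file name
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    saved_bytes = os.path.getsize(out_path)
```

Calling savefig on the Figure object saves that specific figure, whereas plt.savefig saves whichever figure is current.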
---
Note:
Two Interfaces
A potentially confusing feature of Matplotlib is its dual interfaces:
• a convenient MATLAB-style, state-based interface
• a more powerful object-oriented interface

MATLAB-style interface
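Matplotlib was originally written as a Python alternative for MATLAB users, and much
of its syntax reflects that. The MATLAB-style tools are contained in the pyplot (plt)
interface, which keeps track of the "current" figure and axes.
Ex:
plt.subplot(2, 1, 1) # (rows, columns, panel number)
plt.plot(x, np.sin(x))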

Object-oriented interface
The object-oriented interface is useful for more complicated situations, and
for when you want more control over your figure.
Ex:
# First create a grid of plots
# ax will be an array of two Axes objects
fig, ax = plt.subplots(2)
# Call plot() method on the appropriate object
ax[0].plot(x, np.sin(x))
ax[1].plot(x, np.cos(x));
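
The two fragments above assume that x already exists. The following self-contained sketch draws the same two-panel figure with each interface so they can be compared side by side; the names fig_state and fig_oo are introduced here only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# MATLAB-style: pyplot keeps track of the "current" figure and axes
plt.figure()
plt.subplot(2, 1, 1)          # (rows, columns, panel number)
plt.plot(x, np.sin(x))
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))
fig_state = plt.gcf()         # handle to the current figure

# Object-oriented: plotting calls are methods of explicit Axes objects
fig_oo, ax = plt.subplots(2)
ax[0].plot(x, np.sin(x))
ax[1].plot(x, np.cos(x))
```

With the object-oriented form there is no hidden state, so returning to the first panel later is just another call on ax[0].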
---

SIMPLE LINE PLOTS


Line plots are used to represent the relationship between two variables, X and Y, plotted on
different axes. The simplest plot is the visualisation of a single function y = f(x).
To draw a line plot in Matplotlib, use the generic plot() function from the pyplot
module. It generates a 2D plot from the data points given by the
variables x and y, connecting consecutive points with lines. Its parameters also let us customize the plot's
appearance, for example through line styles and markers.
The figure (an instance of the class plt.Figure, created below with plt.figure()) can be thought of as a single
container that holds all the objects representing axes, graphics, text, and labels.
Ex:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x))
plt.show()

If we want to create a single figure with multiple lines, we can simply call the plot function
multiple times.
Ex:

plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))

Adjusting the Plot: Line Colors and Styles


The plt.plot() function takes additional arguments that can be used to customize the line
colors and styles. To adjust the color, you can use the color keyword, which accepts a string
argument representing the color of the line. The color can be specified in a variety of ways:
plt.plot(x, np.sin(x - 0), color='blue') # specify color by name
plt.plot(x, np.sin(x - 1), color='g') # short color code (rgbcmyk)
plt.plot(x, np.sin(x - 2), color='0.75') # Grayscale between 0 and 1
plt.plot(x, np.sin(x - 3), color='#FFDD44') # Hex code (RRGGBB from 00 to FF)
plt.plot(x, np.sin(x - 4), color=(1.0,0.2,0.3)) # RGB tuple, values between 0 and 1
plt.plot(x, np.sin(x - 5), color='chartreuse'); # all HTML color names supported

If no color is specified, Matplotlib will automatically cycle through a set of default
colors for multiple lines.
Similarly, you can adjust the line style using the linestyle keyword.
plt.plot(x, x + 0, linestyle='solid')
plt.plot(x, x + 1, linestyle='dashed')
plt.plot(x, x + 2, linestyle='dashdot')
plt.plot(x, x + 3, linestyle='dotted');

The same line styles can be written with short codes:
plt.plot(x, x + 4, linestyle='-') # solid
plt.plot(x, x + 5, linestyle='--') # dashed
plt.plot(x, x + 6, linestyle='-.') # dashdot
plt.plot(x, x + 7, linestyle=':'); # dotted

The linestyle and color codes can be combined into a single non-keyword argument to
the plt.plot() function.
Ex:
plt.plot(x, x + 0, '-g') # solid green
plt.plot(x, x + 1, '--c') # dashed cyan
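
To see that the combined code is only shorthand for the keyword form, the sketch below draws one line each way and compares the resulting properties. The variable names are illustrative; to_rgba from matplotlib.colors converts any color specification to an RGBA tuple.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()

# Shorthand: dashed red line
(line_short,) = ax.plot(x, np.sin(x), '--r')

# Keyword form of the same style
(line_long,) = ax.plot(x, np.cos(x), color='red', linestyle='dashed')

same_style = line_short.get_linestyle() == line_long.get_linestyle()
same_color = to_rgba(line_short.get_color()) == to_rgba(line_long.get_color())
```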

Adjusting the Plot: Axes Limits

Matplotlib does a decent job of choosing default axes limits, but sometimes more
control is needed. The most basic way to adjust the limits is with two functions, plt.xlim()
and plt.ylim().
Ex:
plt.plot(x, np.sin(x))
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5)
This can also be written as plt.axis([-1, 11, -1.5, 1.5]).
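
In the object-oriented interface the corresponding methods are ax.set_xlim() and ax.set_ylim(). The sketch below applies the same limits and then reverses the x axis by passing the limits in the opposite order; the variable names are chosen for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

ax.set_xlim(-1, 11)
ax.set_ylim(-1.5, 1.5)
limits = ax.axis()             # returns (xmin, xmax, ymin, ymax)

# Passing the limits in reverse order flips the axis direction
ax.set_xlim(10, 0)
reversed_xlim = ax.get_xlim()
```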

Labeling Plots
Titles and axis labels are the simplest labels. The following functions set
them:
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)")

When several lines share one axes, plt.legend() adds a legend that labels each line,
using the label keyword passed to plt.plot().

plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')
plt.axis('equal')
plt.legend();
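
With the object-oriented interface, the title and axis labels can be set in one call to ax.set(). The sketch below repeats the legend example in that style; the title text is made up for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), '-g', label='sin(x)')
ax.plot(x, np.cos(x), ':b', label='cos(x)')

# Set title and both axis labels at once
ax.set(title="Sine and Cosine", xlabel="x", ylabel="value")

# The legend picks up the label given to each line
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
```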
---

SIMPLE SCATTER PLOTS


Scatter plots are a close cousin of the line plot. Instead of points being joined by line
segments, here the points are represented individually with a dot, circle, or other shape.

Scatter Plots with plt.plot

The function used for line plots (plt.plot()) can be used to produce scatter plots.
Ex:

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y, 'o', color='black')
plt.show()

The third argument in the function call is a character that represents the type of symbol
used for plotting. Just as you can specify options such as '-' and '--' to control the line style, the
marker style has its own set of short string codes:
Marker   Description
"."      point
","      pixel
"o"      circle
"v"      triangle_down
"^"      triangle_up
"<"      triangle_left
">"      triangle_right
"s"      square
"p"      pentagon
"P"      plus (filled)
"*"      star
"h"      hexagon1
"H"      hexagon2
"+"      plus
"x"      x
"X"      x (filled)
"D"      diamond
"d"      thin_diamond
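
A quick way to see several of these markers is to plot a few random points with each code. The sketch below is only an illustration: the selection of markers, the random data, and the widened x limit (to leave room for the legend) are choices made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
markers = ['o', '.', ',', 'x', '+', 'v', '^', '<', '>', 's', 'd']

fig, ax = plt.subplots()
for marker in markers:
    # A format string holding only a marker code draws points without a line
    ax.plot(rng.rand(5), rng.rand(5), marker, label=f"marker='{marker}'")

ax.legend(numpoints=1, fontsize=8)
ax.set_xlim(0, 1.8)   # leave space on the right for the legend
```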

Additional keyword arguments to plt.plot specify a wide range of properties of the
lines and markers.

Ex:

plt.plot(x, y, '-p', color='gray', markersize=15, linewidth=4, markerfacecolor='white',
         markeredgecolor='gray', markeredgewidth=2)
plt.ylim(-1.2, 1.2)

Scatter Plots with plt.scatter

Another method of creating scatter plots is the plt.scatter function. The primary
difference of plt.scatter from plt.plot is that it can be used to create scatter plots where the
properties of each individual point (size, face color, edge color, etc.) can be individually
controlled or mapped to data.

Ex:
rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
plt.colorbar(); # show color scale

The color argument c is automatically mapped to a color scale (displayed here by plt.colorbar()), and the
size argument s is given in points squared, i.e. as a marker area. In this way, the color and size of points can be used to convey
information in the visualization.
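
The object returned by scatter keeps the per-point arrays, which makes the mapping easy to inspect. The sketch below repeats the example in the object-oriented style and reads the values back; the colorbar label is made up for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)            # one value per point, mapped through the colormap
sizes = 1000 * rng.rand(100)      # one marker area per point, in points squared

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
fig.colorbar(sc, ax=ax, label='color value')

stored_colors = sc.get_array()    # the data values behind the colors
stored_sizes = sc.get_sizes()     # the marker areas
```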
---

VISUALIZING ERRORS
Basic Errorbars

Error bars indicate the estimated error or uncertainty of a measurement and so give a general sense of
how precise it is. They are drawn as lines that extend from the center (or edge) of each plotted
data point; the longer the bar, the larger the uncertainty of that data point.
The function plt.errorbar() draws error bars.
Ex:

# importing matplotlib
import matplotlib.pyplot as plt

# making a simple plot
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 1, 2, 1, 2, 1]

# creating error (the same value is applied to every point)
x_error = 0.5
y_error = 0.3

# plotting graph
plt.plot(x, y)
plt.errorbar(x, y, yerr=y_error, xerr=x_error, fmt='o')

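The fmt argument is a format code controlling the appearance of lines and points; it uses the
same shorthand syntax as plt.plot.
Ex (here dy is the size of the error in y, either a scalar or an array with one value per point):
plt.errorbar(x, y, yerr=dy, fmt='.k')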

In addition to these basic options, the errorbar function has many options to fine-tune
the output.

Ex:

plt.errorbar(x, y, yerr=dy, fmt='o', color='black', ecolor='lightgray', elinewidth=3, capsize=0)
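
The two calls above use x, y and dy without defining them. The sketch below supplies made-up noisy data so the call can be run, and adds a second series with asymmetric errors, which errorbar accepts as a pair of arrays (lower errors first, then upper errors). All data values here are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(1)
x = np.linspace(0, 10, 50)
dy = 0.8                                   # assumed constant error in y
y = np.sin(x) + dy * rng.randn(50)         # made-up noisy measurements

fig, ax = plt.subplots()

# Symmetric errors: one value applied above and below every point
container = ax.errorbar(x, y, yerr=dy, fmt='o', color='black',
                        ecolor='lightgray', elinewidth=3, capsize=0)

# Asymmetric errors: row 0 holds the lower errors, row 1 the upper errors
lower = np.full_like(x, 0.2)
upper = np.full_like(x, 0.6)
ax.errorbar(x, y + 4, yerr=[lower, upper], fmt='.k')
```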


Continuous Errors

In some situations, it is desirable to show error bars on continuous quantities. To
represent this type of continuous error, we combine primitives such as plt.plot and
plt.fill_between.

The fill_between() function fills the area between two horizontal curves. It is passed the x
values, then the lower y bound, then the upper y bound.

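Ex:

# Visualize the result
# xdata, ydata: observed points; xfit, yfit: fitted curve; dyfit: its uncertainty
# (all assumed to have been computed beforehand)
plt.plot(xdata, ydata, 'or')
plt.plot(xfit, yfit, '-', color='gray')
plt.fill_between(xfit, yfit - dyfit, yfit + dyfit,
                 color='gray', alpha=0.2)
plt.xlim(0, 10)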
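
The example above does not say how xfit, yfit and dyfit were obtained. The sketch below fills that gap with stand-in choices that are assumptions, not the method behind the original example: a quadratic least-squares fit with np.polyfit, and a constant band of twice the residual standard deviation as a rough uncertainty.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up observations
xdata = np.array([1.0, 3.0, 5.0, 6.0, 8.0])
ydata = xdata * np.sin(xdata)

# Stand-in model: quadratic least-squares fit (an assumption for this sketch)
coeffs = np.polyfit(xdata, ydata, deg=2)
xfit = np.linspace(0, 10, 200)
yfit = np.polyval(coeffs, xfit)

# Stand-in uncertainty: constant band of 2 x residual standard deviation
residual_std = np.std(ydata - np.polyval(coeffs, xdata))
dyfit = 2 * residual_std * np.ones_like(xfit)

fig, ax = plt.subplots()
ax.plot(xdata, ydata, 'or')                   # observed points
ax.plot(xfit, yfit, '-', color='gray')        # fitted curve
ax.fill_between(xfit, yfit - dyfit, yfit + dyfit,
                color='gray', alpha=0.2)      # shaded error band
ax.set_xlim(0, 10)
```

A model that reports its own pointwise uncertainty would give a band whose width varies along x; only the definition of dyfit would change.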

---

DENSITY AND CONTOUR PLOTS


Sometimes it is useful to display three-dimensional data in two dimensions using
contours or color-coded regions. Three Matplotlib functions are helpful for
this task: plt.contour for contour plots, plt.contourf for filled contour plots, and plt.imshow for
showing images.

Visualizing a Three-Dimensional Function

Using plt.contour() function

A contour plot can be created with the plt.contour function. It takes three arguments:

• a grid of x values,
• a grid of y values, and
• a grid of z values.

The x and y values represent positions on the plot, and the z values are represented by the contour
levels. The np.meshgrid function builds two-dimensional grids from one-dimensional
arrays.

Ex:

# f is assumed to be a function z = f(x, y) defined beforehand
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)
plt.contour(X, Y, Z, colors='black')

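When a single color is used, negative values are represented by dashed lines and
positive values by solid lines by default. Alternatively, you can color-code the lines by specifying a colormap with the cmap
argument. The fourth argument in the call below asks for more lines: 20 equally spaced intervals within the data range.
Ex:
plt.contour(X, Y, Z, 20, cmap='RdGy')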
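
The contour examples rely on a function f that this section does not define. The sketch below uses an illustrative function of x and y (an assumption made only so the code runs) and shows the shape of the grids that np.meshgrid produces.

```python
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    # Illustrative test function; any z = f(x, y) would do
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


x = np.linspace(0, 5, 50)   # 50 positions along x
y = np.linspace(0, 5, 40)   # 40 positions along y

# meshgrid returns 2D grids of shape (len(y), len(x))
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

fig, ax = plt.subplots()
contours = ax.contour(X, Y, Z, 20, cmap='RdGy')
fig.colorbar(contours, ax=ax)
```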

Using plt.contourf() function

The contourf function fills the spaces between the contour levels. The first
color fills the band at the lowest levels, and the last color corresponds to the highest level in
the plot.
Ex:

plt.contourf(X, Y, Z, 20, cmap='RdGy')
plt.colorbar();

Using plt.imshow() function

The color steps in the above plot are discrete rather than continuous. A better way to
handle this is the plt.imshow() function, which interprets a two-dimensional grid of data as an
image.

Ex:
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')
plt.colorbar()
plt.axis('image');

There are a few things to watch for when plotting with imshow():

• plt.imshow() doesn't accept an x and y grid, so you must manually specify the extent
[xmin, xmax, ymin, ymax] of the image on the plot.

• plt.imshow() by default follows the standard image array definition where the origin is
in the upper left, not in the lower left as in most contour plots. This must be changed
when showing gridded data.

• plt.imshow() will automatically adjust the axis aspect ratio to match the input data; you
can change this by calling, for example, plt.axis('image') to make x and y units
match.
The combination of these three functions (plt.contour, plt.contourf, and plt.imshow)
gives many possibilities for displaying this sort of three-dimensional data within a
two-dimensional plot.
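
As one example of combining these functions, the sketch below draws a partially transparent image with labeled contour lines on top. The function f is the same illustrative stand-in as before (an assumption, since the section does not define it); clabel adds the level value to each contour line.

```python
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    # Illustrative test function, not defined in the section itself
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

fig, ax = plt.subplots()

# A few black contour lines, each labeled with its level
contours = ax.contour(X, Y, Z, 3, colors='black')
ax.clabel(contours, inline=True, fontsize=8)

# Semi-transparent image underneath; extent and origin match the grid
im = ax.imshow(Z, extent=[0, 5, 0, 5], origin='lower',
               cmap='RdGy', alpha=0.5)
fig.colorbar(im, ax=ax)
```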

---

HISTOGRAMS, BINNINGS, AND DENSITY

A histogram represents the distribution of numerical data by grouping the values into bins.
It is a type of bar plot in which the X-axis shows the bin ranges and the Y-axis shows the
frequency, i.e. how many values fall into each bin.
Ex:

import numpy as np
import matplotlib.pyplot as plt
# 'seaborn-white' was renamed 'seaborn-v0_8-white' in Matplotlib 3.6;
# use whichever name your version lists in plt.style.available
plt.style.use('seaborn-v0_8-white')
data = np.random.randn(1000)
plt.hist(data)

The hist() function has many options to tune both the calculation and the display. You can also
customize the histogram by using different arguments.
Ex:
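x1 = np.random.normal(0, 0.8, 1000)
x2 = np.random.normal(-2, 1, 1000)
x3 = np.random.normal(3, 2, 1000)
# density=True normalizes each histogram to unit area
# (older Matplotlib releases used normed=True, which has since been removed)
kwargs = dict(histtype='stepfilled', alpha=0.3, density=True, bins=40)
plt.hist(x1, **kwargs)
plt.hist(x2, **kwargs)
plt.hist(x3, **kwargs)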
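
If only the numbers are needed, np.histogram computes the bin counts without drawing anything. The sketch below uses made-up random data, and also checks what density=True means for a plotted histogram: the bar areas sum to one.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(42)
data = rng.randn(1000)

# Counts per bin and the bin edges, without plotting
counts, bin_edges = np.histogram(data, bins=5)

# With density=True the bar heights are scaled so the total area is 1
fig, ax = plt.subplots()
heights, edges, patches = ax.hist(data, bins=30, density=True, alpha=0.5)
total_area = np.sum(heights * np.diff(edges))
```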

Two-Dimensional Histograms and Binnings


We can also create histograms in two dimensions by dividing points among
two-dimensional bins. One straightforward way to plot a two-dimensional histogram is to use
Matplotlib's plt.hist2d function.

Ex:
plt.hist2d(x, y, bins=30, cmap='Blues')
cb = plt.colorbar()
cb.set_label('counts in bin')
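
The call above assumes two arrays x and y. The sketch below generates them from a two-dimensional normal distribution (the mean and covariance are made-up values) and compares the plotted counts with np.histogram2d, which performs the same binning without drawing.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
mean = [0, 0]
cov = [[1, 1], [1, 2]]                       # made-up covariance matrix
x, y = rng.multivariate_normal(mean, cov, 10000).T

fig, ax = plt.subplots()
# hist2d returns the counts, both sets of bin edges, and the drawn image
counts, xedges, yedges, image = ax.hist2d(x, y, bins=30, cmap='Blues')
cb = fig.colorbar(image, ax=ax)
cb.set_label('counts in bin')

# Same binning without plotting
counts_np, xedges_np, yedges_np = np.histogram2d(x, y, bins=30)
```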

plt.hexbin: Hexagonal binnings

The two-dimensional histogram creates a tessellation of squares across the axes.
Another natural shape for such a tessellation is the regular hexagon. The plt.hexbin() function
represents a two-dimensional dataset binned within a grid of hexagons.
Ex:

plt.hexbin(x, y, gridsize=30, cmap='Blues')
cb = plt.colorbar(label='count in bin')
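
A self-contained version of the hexbin call is sketched below, again with made-up normally distributed data. The object returned by hexbin stores one count per hexagon, which can be read back with get_array().

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
mean = [0, 0]
cov = [[1, 1], [1, 2]]                       # made-up covariance matrix
x, y = rng.multivariate_normal(mean, cov, 10000).T

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=30, cmap='Blues')
cb = fig.colorbar(hb, ax=ax, label='count in bin')

hex_counts = hb.get_array()                  # one value per hexagon
```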
