SciPy Tutorial
Travis E. Oliphant
8th October 2004
1 Introduction
SciPy is a collection of mathematical algorithms and convenience functions built on the Numeric extension
for Python. It adds significant power to the interactive Python session by exposing the user to high-level
commands and classes for the manipulation and visualization of data. With SciPy, an interactive Python
session becomes a data-processing and system-prototyping environment rivaling systems such as Matlab, IDL,
Octave, R-Lab, and SciLab.
The additional power of using SciPy within Python, however, is that a powerful programming language
is also available for use in developing sophisticated programs and specialized applications. Scientific
applications written in SciPy benefit from the development of additional modules in numerous niches of the
software landscape by developers across the world. Everything from parallel programming to web and
database subroutines and classes has been made available to the Python programmer. All of this power is
available in addition to the mathematical libraries in SciPy.
This document provides a tutorial for the first-time user of SciPy to help get started with some of the
features available in this powerful package. It is assumed that the user has already installed the package.
Some general Python facility is also assumed such as could be acquired by working through the Tutorial in
the Python distribution. Throughout this tutorial it is assumed that the user has imported all of the names
defined in the SciPy namespace using the command
>>> from scipy import *
>>> info(optimize.fmin)
fmin(func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None,
full_output=0, printmessg=1)
Description:
Uses a Nelder-Mead simplex algorithm to find the minimum of function
of one or more variables.
Inputs:
Additional Inputs:
Another useful command is source. When given a function written in Python as an argument, it prints
out a listing of the source code for that function. This can be helpful in learning about an algorithm or
understanding exactly what a function is doing with its arguments. Also don’t forget about the Python
command dir which can be used to look at the namespace of a module or package.
Subpackage    Description
cluster       Clustering algorithms
cow           Cluster of Workstations code for parallel programming
fftpack       FFT based on fftpack (default)
fftw*         FFT based on fftw; requires FFTW libraries (is this still needed?)
ga            Genetic algorithms
gplt*         Plotting; requires gnuplot
integrate     Integration
interpolate   Interpolation
io            Input and Output
linalg        Linear algebra
optimize      Optimization and root-finding routines
plt*          Plotting; requires wxPython
signal        Signal processing
special       Special functions
stats         Statistical distributions and functions
xplt          Plotting with gist
Because of their ubiquity, some of the functions in these subpackages are also made available in
the scipy namespace to ease their use in interactive sessions and programs. In addition, many convenience
functions are located in the scipy_base package and in the top level of the scipy package. Before looking
at the sub-packages individually, we will first look at some of these common functions.
still available in umath (part of Numeric) if you need it (note: importing umath or fastumath resets the behavior of the infix
operators to use the umath or fastumath ufuncs respectively).
2 Be careful when treating logical expressions as integers as the 8-bit integers may silently overflow at 256.
2.2.1 Type handling
Note the difference between iscomplex (isreal) and iscomplexobj (isrealobj). The former command is
array based and returns byte arrays of ones and zeros providing the result of the element-wise test. The
latter command is object based and returns a scalar describing the result of the test on the entire object.
Often it is required to get just the real and/or imaginary part of a complex number. While complex
numbers and arrays have attributes that return those values, if one is not sure whether or not the object
will be complex-valued, it is better to use the functional forms real and imag. These functions succeed for
anything that can be turned into a Numeric array. Consider also the function real_if_close which transforms
a complex-valued number with tiny imaginary part into a real number.
Occasionally code needs to check whether or not a number is a scalar (Python (long)int, Python float,
Python complex, or rank-0 array). This functionality is provided in the convenient function
isscalar which returns a 1 or a 0.
Finally, ensuring that objects are a certain Numeric type occurs often enough that it has been given a
convenient interface in SciPy through the use of the cast dictionary. The dictionary is keyed by the type it is
desired to cast to and the dictionary stores functions to perform the casting. Thus,
>>> a = cast['f'](d)
returns an array of float32 from d. This function is also useful as an easy way to get a scalar of a certain
type:
>>> fpi = cast['f'](pi)
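The element-wise versus object-level distinction described in this section can be sketched with present-day NumPy, the successor to Numeric. This is only an illustration under that assumption: the cast dictionary is not used here, and `astype` and `np.float32` stand in for it.

```python
import numpy as np

z = np.array([1 + 0j, 2 + 3j, 4 + 1e-15j])

# Array-based test: one flag per element (is the imaginary part nonzero?).
elementwise = np.iscomplex(z)

# Object-based test: a single flag describing the type of the whole array.
whole_object = np.iscomplexobj(z)

# Tiny imaginary parts (relative to machine precision) are dropped.
cleaned = np.real_if_close(np.array([1 + 1e-15j, 2 + 0j]))

# Stand-ins for the cast dictionary: convert an array or a scalar to float32.
single = np.asarray([1.5, 2.5]).astype(np.float32)
fpi = np.float32(np.pi)
```

Note that `real` and `imag` as functions (`np.real`, `np.imag`) accept anything that can be converted to an array, which is the reason the text prefers them over attribute access.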
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
>>> mgrid[0:5:4j,0:5:4j]
array([[[ 0. , 0. , 0. , 0. ],
[ 1.6667, 1.6667, 1.6667, 1.6667],
[ 3.3333, 3.3333, 3.3333, 3.3333],
[ 5. , 5. , 5. , 5. ]],
[[ 0. , 1.6667, 3.3333, 5. ],
[ 0. , 1.6667, 3.3333, 5. ],
[ 0. , 1.6667, 3.3333, 5. ],
[ 0. , 1.6667, 3.3333, 5. ]]])
Having meshed arrays like this is sometimes very useful. However, because of the array-broadcasting rules of
Numeric and SciPy, a full mesh is not always needed just to evaluate some N-dimensional function over a grid.
If this is the only purpose for generating a meshgrid, you should instead use the function ogrid, which generates an
“open” grid using NewAxis judiciously to create N N-d arrays, each with only one dimension of length greater
than 1. This saves memory and produces the same result when the meshgrid is only used to generate
sample points for evaluation of an N-d function.
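A minimal sketch of the difference between the two grids, written against present-day NumPy (an assumption; the tutorial itself uses the Numeric-era names). The function `f` is a hypothetical example.

```python
import numpy as np

# Full mesh: two 5x5 arrays stacked into one array of shape (2, 5, 5).
xm, ym = np.mgrid[0:5, 0:5]

# Open grid: a (5, 1) column and a (1, 5) row that broadcast against each other.
xo, yo = np.ogrid[0:5, 0:5]

def f(x, y):
    """Hypothetical 2-d function to evaluate over the grid."""
    return x**2 + 3 * y

full_result = f(xm, ym)
open_result = f(xo, yo)   # broadcasting expands this to shape (5, 5)
```

The open grid stores 10 numbers instead of 50, yet both calls produce the same 5 × 5 array of function values.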
2.2.5 Polynomials
There are two (interchangeable) ways to deal with 1-d polynomials in SciPy. The first is to use the poly1d
class in scipy_base. This class accepts coefficients or polynomial roots to initialize a polynomial. The
polynomial object can then be manipulated in algebraic expressions, integrated, differentiated, and evaluated.
It even prints like a polynomial:
>>> p = poly1d([3,4,5])
>>> print p
   2
3 x + 4 x + 5
>>> print p*p
   4      3      2
9 x + 24 x + 46 x + 40 x + 25
>>> print p.integ(k=6)
 3     2
x + 2 x + 5 x + 6
>>> print p.deriv()
6 x + 4
>>> p([4,5])
array([ 69, 100])
The other way to handle polynomials is as an array of coefficients with the first element of the array
giving the coefficient of the highest power. There are explicit functions to add, subtract, multiply, divide,
integrate, differentiate, and evaluate polynomials represented as sequences of coefficients.
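The coefficient-array approach can be sketched as follows, assuming the present-day NumPy functions `polyadd`, `polymul`, `polydiv`, `polyint`, `polyder`, and `polyval` (the names available in the 2004 scipy namespace may differ). The polynomial is the same one used in the poly1d session above.

```python
import numpy as np

p = [3, 4, 5]                          # 3 x**2 + 4 x + 5, highest power first

product = np.polymul(p, p)             # coefficients of p*p
total = np.polyadd(p, [1, 0])          # coefficients of p + x
integral = np.polyint(p, k=6)          # antiderivative with integration constant 6
derivative = np.polyder(p)             # coefficients of 6 x + 4
values = np.polyval(p, [4, 5])         # evaluate at x = 4 and x = 5
quotient, remainder = np.polydiv(product, p)
```

The results agree with the poly1d outputs shown earlier, for example `values` holds 69 and 100.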
2.2.6 Vectorizing functions (vectorize)
One of the features that SciPy provides is a class vectorize to convert an ordinary Python function which
accepts scalars and returns scalars into a “vectorized-function” with the same broadcasting rules as other
Numeric functions (i.e. the Universal functions, or ufuncs). For example, suppose you have a Python
function named addsubtract defined as:
which defines a function of two scalar variables and returns a scalar result. The class vectorize can be used
to “vectorize” this function so that
>>> vec_addsubtract = vectorize(addsubtract)
returns a function which takes array arguments and returns an array result:
>>> vec_addsubtract([0,3,6,9],[1,3,5,7])
array([1, 6, 1, 2])
This particular function could have been written in vector form without the use of vectorize.
But what if the function you have written is the result of some optimization or integration routine? Such
functions can likely only be vectorized using vectorize.
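The definition of addsubtract is not reproduced in the text, so the sketch below uses a hypothetical definition chosen only because it is consistent with the output shown above. It also assumes NumPy's `vectorize`, which plays the same role as the class described here.

```python
import numpy as np

def addsubtract(a, b):
    """Hypothetical scalar function: subtract when a > b, otherwise add."""
    if a > b:
        return a - b
    return a + b

# Wrap the scalar function so it broadcasts over array arguments.
vec_addsubtract = np.vectorize(addsubtract)
result = vec_addsubtract([0, 3, 6, 9], [1, 3, 5, 7])
```

Because the wrapped function is still called once per element in Python, this is a convenience rather than a speed-up.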
a place at the top level. There are convenience functions for the interactive use: disp (similar to print),
and who (returns a list of defined variables and memory consumption–upper bounded). Another function
returns a common image used in signal processing: lena.
Finally, two functions are provided that are useful for approximating derivatives of functions using
discrete differences. The function central_diff_weights returns weighting coefficients for an equally-spaced
N-point approximation to the derivative of order o. These weights must be multiplied by the function values
corresponding to these points and the results added to obtain the derivative approximation. This function is
intended for use when only samples of the function are available. When the function is an object that can
be handed to a routine and evaluated, the function derivative can be used to automatically evaluate the
object at the correct points to obtain an N-point approximation to the o-th derivative at a given point.
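To illustrate how such weights arise and how they are applied to samples, here is a minimal sketch. The helper `central_weights` is hypothetical and is not the library's central_diff_weights; it obtains the weights by requiring the formula to be exact for low-order monomials.

```python
import numpy as np
from math import factorial

def central_weights(num_points, order=1):
    """Weights of an equally spaced central-difference formula (unit spacing)."""
    half = num_points // 2
    offsets = np.arange(-half, half + 1)
    # Row k enforces exactness for the monomial x**k at the center point.
    A = np.vander(offsets, increasing=True).T.astype(float)
    rhs = np.zeros(num_points)
    rhs[order] = factorial(order)
    return np.linalg.solve(A, rhs)

w3 = central_weights(3, 1)     # 3-point first-derivative weights
w5 = central_weights(5, 1)     # 5-point first-derivative weights

# Apply the weights to samples of sin(x) around x = 1 with spacing h.
h = 1e-3
samples = np.sin(1.0 + h * np.arange(-1, 2))
estimate = np.dot(w3, samples) / h     # divide by h**order for spacing h
```

The weighted sum is divided by the spacing raised to the derivative order, since the weights are computed for unit spacing.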
4 Integration (integrate)
The integrate sub-package provides several integration techniques including an ordinary differential equation
integrator. An overview of the module is provided by the help command:
>>> help(integrate)
Methods for Integrating Functions
>>> result = integrate.quad(lambda x: special.jv(2.5,x), 0, 4.5)
>>> print result
(1.1178179380783249, 7.8663172481899801e-09)
>>> I = sqrt(2/pi)*(18.0/27*sqrt(2)*cos(4.5)-4.0/27*sqrt(2)*sin(4.5)+
sqrt(2*pi)*special.fresnl(3/sqrt(pi))[0])
>>> print I
1.117817938088701
The first argument to quad is a “callable” Python object (i.e., a function, method, or class instance).
Notice the use of a lambda-function in this case as the argument. The next two arguments are the limits of
integration. The return value is a tuple, with the first element holding the estimated value of the integral
and the second element holding an upper bound on the error. Note that in this case, the true value of this
integral is
$$I = \sqrt{\frac{2}{\pi}} \left( \frac{18}{27}\sqrt{2}\cos(4.5) - \frac{4}{27}\sqrt{2}\sin(4.5) + \sqrt{2\pi}\,\mathrm{Si}\left(\frac{3}{\sqrt{\pi}}\right) \right),$$
where
$$\mathrm{Si}(x) = \int_0^x \sin\left(\frac{\pi}{2} t^2\right) dt$$
is the Fresnel sine integral. Note that the numerically-computed integral is within $1.04 \times 10^{-11}$ of the exact
result, well below the reported error bound.
Infinite inputs are also allowed in quad by using ±integrate.inf (or inf) as one of the arguments. For
example, suppose that a numerical value for the exponential integral
$$E_n(x) = \int_1^\infty \frac{e^{-xt}}{t^n}\, dt$$
is desired (and the fact that this integral can be computed as special.expn(n,x) is forgotten). The
functionality of the function special.expn can be replicated by defining a new function vec_expint based
on the routine quad:
>>> vec_expint(3,arange(1.0,4.0,0.5))
array([ 0.1097, 0.0567, 0.0301, 0.0163, 0.0089, 0.0049])
>>> special.expn(3,arange(1.0,4.0,0.5))
array([ 0.1097, 0.0567, 0.0301, 0.0163, 0.0089, 0.0049])
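The definition of vec_expint is not included in the text. One way it could be written with present-day NumPy and SciPy is sketched below; the helper names `integrand` and `expint` are assumptions.

```python
import numpy as np
from scipy import integrate, special

def integrand(t, n, x):
    """Integrand of the exponential integral E_n(x)."""
    return np.exp(-x * t) / t**n

def expint(n, x):
    """E_n(x) for scalar x, computed by numerical quadrature on [1, inf)."""
    return integrate.quad(integrand, 1, np.inf, args=(n, x))[0]

# quad works on scalars, so wrap the scalar routine to accept arrays.
vec_expint = np.vectorize(expint)

x = np.arange(1.0, 4.0, 0.5)
approx = vec_expint(3, x)
exact = special.expn(3, x)
```

The extra parameters n and x reach the integrand through the `args` tuple of quad, so no lambda is needed.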
The function which is integrated can even use the quad argument (though the error bound may underestimate
the error due to possible numerical error in the integrand from the use of quad). The integral in
this case is
$$I_n = \int_0^\infty \int_1^\infty \frac{e^{-xt}}{t^n}\, dt\, dx = \frac{1}{n}.$$
>>> result = quad(lambda x: expint(3, x), 0, Inf)
>>> print result
(0.33333333324560266, 2.8548934485373678e-09)
>>> I3 = 1.0/3.0
>>> print I3
0.333333333333
This last example shows that multiple integration can be handled using repeated calls to quad. The
mechanics of this for double and triple integration have been wrapped up into the functions dblquad and
tplquad. The function dblquad performs double integration. Use the help function to be sure that the
arguments are defined in the correct order. In addition, the limits on all inner integrals are actually functions,
which can be constant functions. An example of using double integration to compute several values of $I_n$ is
shown below:
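The example itself is missing from the text. A sketch of how the double integral could be evaluated with present-day `scipy.integrate.dblquad` follows; the wrapper name `I` is an assumption.

```python
import numpy as np
from scipy.integrate import dblquad

def I(n):
    """Double integral of exp(-x*t) / t**n over x in [0, inf), t in [1, inf)."""
    # The integrand takes the inner variable (t) first, then the outer one (x).
    # The inner limits are given as (constant) functions of the outer variable.
    return dblquad(lambda t, x: np.exp(-x * t) / t**n,
                   0, np.inf,
                   lambda x: 1, lambda x: np.inf)

value4, error4 = I(4)
value3, error3 = I(3)
value2, error2 = I(2)
```

Each call returns the estimate together with an error bound, and the estimates can be compared with the exact values 1/n.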
4.4 Ordinary differential equations (integrate.odeint)
Integrating a set of ordinary differential equations (ODEs) given initial conditions is another useful example.
The function odeint is available in SciPy for integrating a first-order vector differential equation:
$$\frac{dy}{dt} = f(y, t),$$
given initial conditions $y(0) = y_0$, where $y$ is a length-$N$ vector and $f$ is a mapping from $\mathbb{R}^N$ to $\mathbb{R}^N$. A
higher-order ordinary differential equation can always be reduced to a differential equation of this type by
introducing intermediate derivatives into the $y$ vector.
For example, suppose it is desired to find the solution to the following second-order differential equation:
$$\frac{d^2 w}{dz^2} - z\,w(z) = 0$$
with initial conditions $w(0) = \frac{1}{\sqrt[3]{3^2}\,\Gamma\left(\frac{2}{3}\right)}$ and $\left.\frac{dw}{dz}\right|_{z=0} = -\frac{1}{\sqrt[3]{3}\,\Gamma\left(\frac{1}{3}\right)}$. It is known that the solution to this
differential equation with these initial conditions is the Airy function
$$w = \mathrm{Ai}(z),$$
>>> t = x
>>> ychk = airy(x)[0]
>>> y = odeint(func, y0, t)
>>> y2 = odeint(func, y0, t, Dfun=gradient)
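Only part of the session survives above. The following self-contained sketch shows one way to set up the whole calculation with present-day SciPy. The state ordering y = [w, dw/dz] and the name `jacobian` are assumptions and may differ from the ordering used in the original session.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.special import airy, gamma

# Initial conditions of the Airy function at z = 0.
w0 = 1.0 / (3 ** (2.0 / 3.0) * gamma(2.0 / 3.0))
dw0 = -1.0 / (3 ** (1.0 / 3.0) * gamma(1.0 / 3.0))

def func(y, z):
    """First-order system for w'' = z*w with y = [w, dw/dz]."""
    w, dw = y
    return [dw, z * w]

def jacobian(y, z):
    """Derivatives of func with respect to y (row i, column j = d f_i / d y_j)."""
    return [[0.0, 1.0], [z, 0.0]]

z = np.arange(0, 4.0, 0.01)
y = odeint(func, [w0, dw0], z)
y2 = odeint(func, [w0, dw0], z, Dfun=jacobian)   # same problem, analytic Jacobian
w_exact = airy(z)[0]
```

The first column of each result holds w(z) and can be compared against the Airy function directly.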
5 Optimization (optimize)
There are several classical optimization algorithms provided by SciPy in the optimize package. An overview
of the module is available using help (or pydoc.help):
>>> info(optimize)
Optimization Tools
newton -- Secant method or Newton’s method
The first four algorithms are unconstrained minimization algorithms (fmin: Nelder-Mead simplex, fmin_bfgs:
BFGS, fmin_ncg: Newton Conjugate Gradient, and leastsq: Levenberg-Marquardt). The next algorithm
only works for functions of a single variable but allows minimization over a specified interval. The last
algorithm actually finds the roots of a general function of possibly many variables. It is included in the
optimization package because at the (non-boundary) extreme points of a function, the gradient is equal to
zero.
The minimum value of this function is 0, which is achieved when $x_i = 1$. This minimum can be found using
the fmin routine as shown in the example below:
Another optimization algorithm that needs only function calls to find the minimum is Powell’s method,
available as optimize.fmin_powell.
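The function definition and the fmin example are missing from the text. The sketch below assumes the function in question is the N-variable Rosenbrock function named later in this section, and uses present-day `scipy.optimize.fmin`. The starting point is an arbitrary choice.

```python
import numpy as np
from scipy import optimize

def rosen(x):
    """Rosenbrock function: sum of 100*(x[i] - x[i-1]**2)**2 + (1 - x[i-1])**2."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])   # hypothetical starting point

# Nelder-Mead simplex search; only function values are needed.
xopt = optimize.fmin(rosen, x0, xtol=1e-8, disp=False)
```

Because the simplex method uses no gradient information, it typically needs many more function evaluations than the gradient-based routines described next.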
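This expression is valid for the interior derivatives. Special cases are
$$\frac{\partial f}{\partial x_0} = -400 x_0 \left(x_1 - x_0^2\right) - 2\left(1 - x_0\right),$$
$$\frac{\partial f}{\partial x_{N-1}} = 200\left(x_{N-1} - x_{N-2}^2\right).$$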
A Python function which computes this gradient is constructed by the code-segment:
The calling signature for the BFGS minimization algorithm is similar to fmin with the addition of the
fprime argument. The following example uses fmin_bfgs to minimize the
Rosenbrock function.
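Neither the gradient code nor the BFGS example survives in the text. A sketch under stated assumptions follows: the interior-derivative expression is taken to be the derivative of the Rosenbrock sum consistent with the two special cases above, and `rosen_grad` is a hypothetical name.

```python
import numpy as np
from scipy import optimize

def rosen(x):
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def rosen_grad(x):
    """Gradient of the Rosenbrock function (interior terms plus the two end cases)."""
    x = np.asarray(x, dtype=float)
    xm, xm_m1, xm_p1 = x[1:-1], x[:-2], x[2:]
    grad = np.zeros_like(x)
    grad[1:-1] = 200 * (xm - xm_m1 ** 2) - 400 * (xm_p1 - xm ** 2) * xm - 2 * (1 - xm)
    grad[0] = -400 * x[0] * (x[1] - x[0] ** 2) - 2 * (1 - x[0])
    grad[-1] = 200 * (x[-1] - x[-2] ** 2)
    return grad

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])   # hypothetical starting point
xopt = optimize.fmin_bfgs(rosen, x0, fprime=rosen_grad, disp=False)
```

If fprime is omitted, fmin_bfgs estimates the gradient by finite differences at the cost of extra function evaluations.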
The inverse of the Hessian is evaluated using the conjugate-gradient method. An example of employing this
method to minimize the Rosenbrock function is given below. To take full advantage of the NewtonCG
method, a function which computes the Hessian must be provided. The Hessian matrix itself does not need
to be constructed; only a vector which is the product of the Hessian with an arbitrary vector needs to be
available to the minimization routine. As a result, the user can provide either a function to compute the
Hessian matrix, or a function to compute the product of the Hessian with an arbitrary vector.
5.3.1 Full Hessian example:
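The Hessian of the Rosenbrock function is
$$\begin{aligned}
H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j} &= 200\left(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}\right) - 400x_i\left(\delta_{i+1,j} - 2x_i\delta_{i,j}\right) - 400\delta_{i,j}\left(x_{i+1} - x_i^2\right) + 2\delta_{i,j} \\
&= \left(202 + 1200x_i^2 - 400x_{i+1}\right)\delta_{i,j} - 400x_i\delta_{i+1,j} - 400x_{i-1}\delta_{i-1,j},
\end{aligned}$$
if $i, j \in [1, N-2]$ with $i, j \in [0, N-1]$ defining the $N \times N$ matrix. Other non-zero entries of the matrix are
$$\begin{aligned}
\frac{\partial^2 f}{\partial x_0^2} &= 1200x_0^2 - 400x_1 + 2, \\
\frac{\partial^2 f}{\partial x_0 \partial x_1} = \frac{\partial^2 f}{\partial x_1 \partial x_0} &= -400x_0, \\
\frac{\partial^2 f}{\partial x_{N-1} \partial x_{N-2}} = \frac{\partial^2 f}{\partial x_{N-2} \partial x_{N-1}} &= -400x_{N-2}, \\
\frac{\partial^2 f}{\partial x_{N-1}^2} &= 200.
\end{aligned}$$
The code which computes this Hessian along with the code to minimize the function using fmin_ncg is
shown in the following example: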
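The example is missing from the text. A sketch that builds the tridiagonal Hessian from the entries listed above follows. It assumes present-day SciPy, whose built-in `optimize.rosen` and `optimize.rosen_der` are used for the function and gradient to keep the block short; `rosen_hessian` is a hypothetical name.

```python
import numpy as np
from scipy import optimize

def rosen_hessian(x):
    """Full N x N Hessian of the Rosenbrock function."""
    x = np.asarray(x, dtype=float)
    # Off-diagonal entries: H[i, i+1] = H[i+1, i] = -400 * x[i].
    H = np.diag(-400 * x[:-1], 1) + np.diag(-400 * x[:-1], -1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200 * x[0] ** 2 - 400 * x[1] + 2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200 * x[1:-1] ** 2 - 400 * x[2:]
    return H + np.diag(diagonal)

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])   # hypothetical starting point
xopt = optimize.fmin_ncg(optimize.rosen, x0, optimize.rosen_der,
                         fhess=rosen_hessian, avextol=1e-8, disp=False)
```

The matrix is symmetric and tridiagonal, which is what makes the Hessian-vector product discussed next so cheap.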
result, the user can supply code to compute this product rather than the full Hessian by setting the fhess_p
keyword to the desired function. The fhess_p function should take the minimization vector as the first
argument and the arbitrary vector as the second argument. Any extra arguments passed to the function to
be minimized will also be passed to this function. If possible, using Newton-CG with the Hessian product
option is probably the fastest way to minimize the function.
In this case, the product of the Rosenbrock Hessian with an arbitrary vector is not difficult to compute.
If p is the arbitrary vector, then H (x) p has elements:
Code which makes use of the fhess_p keyword to minimize the Rosenbrock function using fmin_ncg follows:
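The elements of the product and the accompanying code are missing from the text. The sketch below derives the product row by row from the Hessian entries given in the full-Hessian example, so it should be read as a reconstruction under that assumption. `rosen_hessian_product` is a hypothetical name, and the built-in `optimize.rosen` and `optimize.rosen_der` of present-day SciPy supply the function and gradient.

```python
import numpy as np
from scipy import optimize

def rosen_hessian_product(x, p):
    """Product of the Rosenbrock Hessian at x with the vector p, without forming H."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    Hp = np.zeros_like(x)
    Hp[0] = (1200 * x[0] ** 2 - 400 * x[1] + 2) * p[0] - 400 * x[0] * p[1]
    Hp[1:-1] = (-400 * x[:-2] * p[:-2]
                + (202 + 1200 * x[1:-1] ** 2 - 400 * x[2:]) * p[1:-1]
                - 400 * x[1:-1] * p[2:])
    Hp[-1] = -400 * x[-2] * p[-2] + 200 * p[-1]
    return Hp

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])   # hypothetical starting point
xopt = optimize.fmin_ncg(optimize.rosen, x0, optimize.rosen_der,
                         fhess_p=rosen_hessian_product, avextol=1e-8, disp=False)
```

Each product costs a number of operations proportional to N, whereas building the full matrix costs memory and time proportional to N squared.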
An objective function to pass to any of the previous minimization algorithms to obtain a least-squares fit is
$$J(p) = \sum_{i=0}^{N-1} e_i^2(p).$$
The leastsq algorithm performs this squaring and summing of the residuals automatically. It takes
as an input argument the vector function $e(p)$ and returns the value of $p$ which minimizes $J(p) = e^T e$
directly. The user is also encouraged to provide the Jacobian matrix of the function (with derivatives down
the columns or across the rows). If the Jacobian is not provided, it is estimated.
An example should clarify the usage. Suppose it is believed some measured data follow a sinusoidal
pattern
$$y_i = A \sin\left(2\pi k x_i + \theta\right)$$
where the parameters $A$, $k$, and $\theta$ are unknown. The residual vector is
By defining a function to compute the residuals and selecting an appropriate starting position, the
least-squares fit routine can be used to find the best-fit parameters Â, k̂, θ̂. This is shown in the following example
and a plot of the results is shown in Figure 1.
>>> x = arange(0,6e-2,6e-2/30)
>>> A,k,theta = 10, 1.0/3e-2, pi/6
>>> y_true = A*sin(2*pi*k*x+theta)
>>> y_meas = y_true + 2*randn(len(x))
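The remainder of the session is not reproduced here. A self-contained sketch of the fit with present-day `scipy.optimize.leastsq` follows. The residual definition, the seeded random generator, and the starting position `p0` are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

rng = np.random.default_rng(0)            # fixed seed, assumed for repeatability
x = np.arange(0, 6e-2, 6e-2 / 30)
A, k, theta = 10, 1.0 / 3e-2, np.pi / 6
y_true = A * np.sin(2 * np.pi * k * x + theta)
y_meas = y_true + 2 * rng.standard_normal(len(x))

def residuals(p, y, x):
    """Difference between the measurements and the sinusoidal model."""
    A, k, theta = p
    return y - A * np.sin(2 * np.pi * k * x + theta)

p0 = [8, 35, np.pi / 3]                   # hypothetical starting position
p_fit, ier = leastsq(residuals, p0, args=(y_meas, x))

sse_start = np.sum(residuals(p0, y_meas, x) ** 2)
sse_fit = np.sum(residuals(p_fit, y_meas, x) ** 2)
```

leastsq only needs the residual vector; it forms the sum of squares internally. The fit depends on the starting position, since a sinusoid has many local minima in k and θ.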
[Figure 1: Least-squares fit to noisy data, showing the True, Noisy, and Fit curves.]
algorithm for locating a minimum. Ideally a bracket should be given which contains the minimum
desired. A bracket is a triple (a, b, c) such that f(a) > f(b) < f(c) and a < b < c. If this is not given, then
alternatively two starting points can be chosen and a bracket will be found from these points using a simple
marching algorithm. If these two starting points are not provided, 0 and 1 will be used (this may not be the
right choice for your function and may result in an unexpected minimum being returned).
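A minimal sketch of supplying a bracket, assuming the routine being described corresponds to present-day `scipy.optimize.brent` (the name of the routine is not visible in this part of the text). The function `f` is a hypothetical example.

```python
from scipy import optimize

def f(x):
    """Hypothetical single-variable function with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0

# A valid bracket: a < b < c with f(a) > f(b) < f(c).
bracket = (0.0, 1.0, 4.0)                 # f = 5, 2, 5 at these points
xmin = optimize.brent(f, brack=bracket)

# Without a bracket the search starts from the default points 0 and 1.
xmin_default = optimize.brent(f)
```

For a function with several minima, the default starting points may lead to a different minimum than the one intended, which is why an explicit bracket is preferable.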
5.6.2 Scalar function root finding
If one has a single-variable equation, there are four different root finder algorithms that can be tried. Each
of these root finding algorithms requires the endpoints of an interval where a root is suspected (because
the function changes signs). In general brentq is the best choice, but the other methods may be useful in
certain circumstances or for academic purposes.
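As an illustration of the interval requirement, here is a short sketch using present-day `scipy.optimize.brentq`. The equation cos(x) = x is a hypothetical example.

```python
import math
from scipy import optimize

def g(x):
    """Hypothetical equation g(x) = 0 with a single root between 0 and 1."""
    return math.cos(x) - x

# The function must change sign over the interval: g(0) > 0 and g(1) < 0.
root = optimize.brentq(g, 0.0, 1.0)
```

If the function has the same sign at both endpoints, the routine raises an error rather than searching outside the interval.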
6 Interpolation (interpolate)
There are two general interpolation facilities available in SciPy. The first facility is an interpolation class
which performs linear 1-dimensional interpolation. The second facility is based on the FORTRAN library
FITPACK and provides functions for 1- and 2-dimensional (smoothed) cubic-spline interpolation.
>>> x = arange(0,10)
>>> y = exp(-x/3.0)
>>> f = interpolate.linear_1d(x,y)
>>> help(f)
Instance of class: linear_1d
<name>(x_new)
Inputs:
x_new -- New independent variables.
Outputs:
y_new -- Linearly interpolated values corresponding to x_new.
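The class interpolate.linear_1d belongs to the SciPy version described here. As a stand-in, the same linear interpolation can be sketched with NumPy's `interp` function, which is an assumption of this example rather than the interface shown above.

```python
import numpy as np

x = np.arange(0, 10)
y = np.exp(-x / 3.0)

# Evaluate the piecewise-linear interpolant at new sample points.
x_new = np.arange(0, 9, 0.1)
y_new = np.interp(x_new, x, y)

# Halfway between two samples the result is the average of their values.
midpoint = np.interp(0.5, x, y)
```

Between the samples the interpolant lies slightly above the true curve, because exp(-x/3) is convex.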
[Figure: plot comparing the Interpolated and Actual curves.]
different ways to represent a curve and obtain (smoothing) spline coefficients: directly and parametrically.
The direct method finds the spline representation of a curve in a two-dimensional plane using the function
interpolate.splrep. The first two arguments are the only ones required, and these provide the x and
y components of the curve. The normal output is a 3-tuple, (t, c, k), containing the knot-points, t, the
coefficients c and the order k of the spline. The default spline order is cubic, but this can be changed with
the input keyword, k.
For curves in N-dimensional space the function interpolate.splprep allows defining the curve parametrically.
For this function only 1 input argument is required. This input is a list of N arrays representing
the curve in N-dimensional space. The length of each array is the number of curve points, and each array
provides one component of the N-dimensional data point. The parameter variable is given with the keyword
argument, u, which defaults to an equally-spaced monotonic sequence between 0 and 1. The default output
consists of two objects: a 3-tuple, (t, c, k), containing the spline representation and the parameter variable
u.
The keyword argument, s, is used to specify the amount of smoothing to perform during the spline fit.
The default value of s is $s = m - \sqrt{2m}$ where m is the number of data-points being fit. Therefore, if no
smoothing is desired a value of s = 0 should be passed to the routines.
Once the spline representation of the data has been determined, functions are available for evaluating the
spline (interpolate.splev) and its derivatives (interpolate.splev, interpolate.spalde) at any point and
the integral of the spline between any two points (interpolate.splint). In addition, for cubic splines (k = 3)
with 8 or more knots, the roots of the spline can be estimated (interpolate.sproot). These functions are
demonstrated in the example that follows (see also Figure 3).
>>> # Cubic-spline
>>> x = arange(0,2*pi+pi/4,2*pi/8)
>>> y = sin(x)
>>> tck = interpolate.splrep(x,y,s=0)
>>> xnew = arange(0,2*pi,pi/50)
>>> ynew = interpolate.splev(xnew,tck,der=0)
>>> xplt.plot(x,y,'x',xnew,ynew,xnew,sin(xnew),x,y,'b')
>>> xplt.legend(['Linear','Cubic Spline', 'True'],['b-x','m','r'])
>>> xplt.limits(-0.05,6.33,-1.05,1.05)
>>> xplt.title('Cubic-spline interpolation')
>>> xplt.eps('interp_cubic')
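The same spline functions can be exercised without the plotting commands. The sketch below assumes present-day `scipy.interpolate` and uses `linspace` for the sample points instead of the arange call above.

```python
import numpy as np
from scipy import interpolate

x = np.linspace(0, 2 * np.pi, 9)              # nine samples of one sine period
y = np.sin(x)
tck = interpolate.splrep(x, y, s=0)           # s=0: interpolate, no smoothing

xnew = np.arange(0, 2 * np.pi, np.pi / 50)
ynew = interpolate.splev(xnew, tck, der=0)    # spline values
dydx = interpolate.splev(xnew, tck, der=1)    # first-derivative estimate
area = interpolate.splint(0, np.pi, tck)      # integral of the spline over [0, pi]
roots = interpolate.sproot(tck)               # estimated roots of the spline
```

Here tck is the (t, c, k) tuple described above, and every evaluation routine takes it as an argument.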
[Figure 3: spline examples. Recoverable panel titles: 'Cubic-spline interpolation' and 'Derivative estimation from spline'; legend entries: True, Cubic Spline, Linear.]
>>> xplt.limits(-0.05,6.33,-1.05,1.05)
>>> xplt.title('Integral estimation from spline')
>>> xplt.eps('interp_cubic_int')
>>> znew = interpolate.bisplev(xnew[:,0],ynew[0,:],tck)
>>> xplt.surf(znew,xnew,ynew,shade=1,palette='rainbow')
>>> xplt.title3("Interpolated function.")
>>> xplt.eps("2d_interp")
[Figure: surface plots titled 'Sparsely sampled function.' and 'Interpolated function.']
7.1 B-splines
A B-spline is an approximation of a continuous function over a finite domain in terms of B-spline coefficients
and knot points. If the knot-points are equally spaced with spacing $\Delta x$, then the B-spline approximation to
a 1-dimensional function is the finite-basis expansion
$$y(x) \approx \sum_j c_j \beta^o\left(\frac{x}{\Delta x} - j\right).$$
In these expressions, $\beta^o(\cdot)$ is the space-limited B-spline basis function of order $o$. The requirement of
equally-spaced knot-points and equally-spaced data points allows the development of fast (inverse-filtering)
algorithms for determining the coefficients, $c_j$, from sample-values, $y_n$. Unlike the general spline interpolation
algorithms, these algorithms can quickly find the spline coefficients for large images.
The advantage of representing a set of samples via B-spline basis functions is that continuous-domain
operators (derivatives, re-sampling, integral, etc.) which assume that the data samples are drawn from an
underlying continuous function can be computed with relative ease from the spline coefficients. For example,
the second-derivative of a spline is
$$y''(x) = \frac{1}{\Delta x^2} \sum_j c_j \beta^{o\prime\prime}\left(\frac{x}{\Delta x} - j\right).$$
Using the B-spline property
$$\frac{d^2 \beta^o(w)}{dw^2} = \beta^{o-2}(w + 1) - 2\beta^{o-2}(w) + \beta^{o-2}(w - 1)$$
it can be seen that
$$y''(x) = \frac{1}{\Delta x^2} \sum_j c_j \left[ \beta^{o-2}\left(\frac{x}{\Delta x} - j + 1\right) - 2\beta^{o-2}\left(\frac{x}{\Delta x} - j\right) + \beta^{o-2}\left(\frac{x}{\Delta x} - j - 1\right) \right].$$
Thus, the second-derivative signal can be easily calculated from the spline fit. If desired, smoothing splines
can be found to make the second-derivative less sensitive to random errors.
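The second-derivative property can be checked numerically for the cubic case (o = 3). The sketch below uses the standard closed forms of the linear and cubic B-splines, written out by hand. The function names are hypothetical and no library B-spline routine is involved.

```python
def bspline1(w):
    """First-order (linear, "hat") B-spline."""
    return max(0.0, 1.0 - abs(w))

def bspline3(w):
    """Third-order (cubic) B-spline."""
    a = abs(w)
    if a < 1.0:
        return 2.0 / 3.0 - a ** 2 + a ** 3 / 2.0
    if a < 2.0:
        return (2.0 - a) ** 3 / 6.0
    return 0.0

def second_derivative_identity(w):
    """Right-hand side of the property with o = 3: three shifted hat functions."""
    return bspline1(w + 1) - 2 * bspline1(w) + bspline1(w - 1)

def second_derivative_numeric(w, h=1e-3):
    """Central-difference estimate of the second derivative of the cubic B-spline."""
    return (bspline3(w + h) - 2 * bspline3(w) + bspline3(w - h)) / h ** 2

points = [-1.5, -0.5, 0.25, 0.5, 1.5, 2.5]    # chosen away from the knots
pairs = [(second_derivative_identity(w), second_derivative_numeric(w)) for w in points]
```

Away from the knots the cubic B-spline is an exact cubic, so the central difference agrees with the identity up to rounding error.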
The savvy reader will have already noticed that the data samples are related to the knot coefficients via
a convolution operator, so that simple convolution with the sampled B-spline function recovers the original
data from the spline coefficients. The output of convolutions can change depending on how boundaries are
handled (this becomes increasingly more important as the number of dimensions in the data-set increases).
The algorithms relating to B-splines in the signal-processing sub package assume mirror-symmetric boundary
conditions. Thus, spline coefficients are computed based on that assumption, and data-samples can be
recovered exactly from the spline coefficients by assuming them to be mirror-symmetric also.
Currently the package provides functions for determining second-order (quadratic) and third-order (cubic) spline
coefficients from equally spaced samples in one and two dimensions (signal.qspline1d, signal.qspline2d,
signal.cspline1d, signal.cspline2d). The package also supplies a function (signal.bspline) for evaluating
the B-spline basis function, $\beta^o(x)$, for arbitrary order and $x$. For large $o$, the B-spline basis function can be
approximated well by a zero-mean Gaussian function with standard deviation $\sigma_o$ given by $\sigma_o^2 = (o + 1)/12$:
$$\beta^o(x) \approx \frac{1}{\sqrt{2\pi\sigma_o^2}} \exp\left(-\frac{x^2}{2\sigma_o^2}\right).$$
A function to compute this Gaussian for arbitrary $x$ and $o$ is also available (signal.gauss_spline). The
following code and figure use spline-filtering to compute an edge-image (the second-derivative of a smoothed
spline) of Lena’s face, which is an array returned by the command lena(). The command signal.sepfir2d
was used to apply a separable two-dimensional FIR filter with mirror-symmetric boundary conditions to the
spline coefficients. This function is ideally suited for reconstructing samples from spline coefficients and is
faster than signal.convolve2d, which convolves arbitrary two-dimensional filters and allows for choosing
mirror-symmetric boundary conditions.
>>> signal.sepfir2d(ck, [1], derfilt)
>>>
>>> ## Alternatively we could have done:
>>> ## laplacian = array([[0,1,0],[1,-4,1],[0,1,0]],Float32)
>>> ## deriv2 = signal.convolve2d(ck,laplacian,mode='same',boundary='symm')
>>>
>>> xplt.imagesc(image[::-1]) # flip image so it looks right-side up.
>>> xplt.title('Original image')
>>> xplt.eps('lena_image')
>>> xplt.imagesc(deriv[::-1])
>>> xplt.title('Output of spline edge filter')
>>> xplt.eps('lena_edge')
[Figure: two image panels, 'Original image' and 'Output of spline edge filter'.]
7.2 Filtering
Filtering is a generic name for any system that modifies an input signal in some way. In SciPy a signal can
be thought of as a Numeric array. There are different kinds of filters for different kinds of operations. There
are two broad kinds of filtering operations: linear and non-linear. Linear filters can always be reduced to
multiplication of the flattened Numeric array by an appropriate matrix resulting in another flattened Numeric
array. Of course, this is not usually the best way to compute the filter as the matrices and vectors involved
may be huge. For example, filtering a 512 × 512 image with this method would require multiplication of a
$512^2 \times 512^2$ matrix with a $512^2$ vector. Just trying to store the $512^2 \times 512^2$ matrix using a standard Numeric
array would require 68,719,476,736 elements. At 4 bytes per element this would require 256GB of memory.
In most applications most of the elements of this matrix are zero and a different method for computing the
output of the filter is employed.
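The claim that a linear filter is a matrix multiplication can be made concrete on a tiny one-dimensional signal. In this sketch (present-day NumPy assumed, with hypothetical filter taps) the filtering matrix is built explicitly and compared with a direct convolution. The storage arithmetic from the paragraph above is repeated at the end.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])     # input signal
h = np.array([1.0, -1.0, 0.5])         # hypothetical filter taps

n_out = len(x) + len(h) - 1
H = np.zeros((n_out, len(x)))
for n in range(n_out):
    for k in range(len(x)):
        if 0 <= n - k < len(h):
            H[n, k] = h[n - k]          # row n holds h reversed and shifted by n

y_matrix = H @ x                        # filtering as a matrix product
y_direct = np.convolve(x, h)            # the same filter computed directly

# Storage needed for the dense filtering matrix of a 512 x 512 image.
elements = 512 ** 4
gigabytes = elements * 4 / 2 ** 30      # 4 bytes per element
```

Most entries of H are zero, which is the reason the direct computation is preferred.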
7.2.1 Convolution/Correlation
Many linear filters also have the property of shift-invariance. This means that the filtering operation is
the same at different locations in the signal and it implies that the filtering matrix can be constructed
from knowledge of one row (or column) of the matrix alone. In this case, the matrix multiplication can be
accomplished using Fourier transforms.
Let $x[n]$ define a one-dimensional signal indexed by the integer $n$. Full convolution of two one-dimensional
signals can be expressed as
$$y[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n - k].$$
This equation can only be implemented directly if we limit the sequences to finite-support sequences that
can be stored in a computer. Choose $n = 0$ to be the starting point of both sequences, let $K + 1$ be that
value for which $x[n] = 0$ for all $n \geq K + 1$, and let $M + 1$ be that value for which $h[n] = 0$ for all $n \geq M + 1$.
Then the discrete convolution expression is
$$y[n] = \sum_{k=\max(n-M,0)}^{\min(n,K)} x[k]\, h[n - k].$$
For convenience assume K ≥ M. Then, more explicitly the output of this operation is
Thus, the full discrete convolution of two finite sequences of lengths K + 1 and M + 1 respectively results in
a finite sequence of length K + M + 1 = (K + 1) + (M + 1) − 1.
One-dimensional convolution is implemented in SciPy with the function signal.convolve. This function
takes as inputs the signals x, h, and an optional flag and returns the signal y. The optional flag allows for
specification of which part of the output signal to return. The default value of 'full' returns the entire
signal. If the flag has a value of 'same' then only the middle $K + 1$ values are returned starting at $y\left[\left\lfloor \frac{M-1}{2} \right\rfloor\right]$ so that
the output has the same length as the largest input. If the flag has a value of 'valid' then only the middle
$K - M + 1 = (K + 1) - (M + 1) + 1$ output values are returned, where each output value depends on all of the values of the
smallest input from $h[0]$ to $h[M]$. In other words, only the values $y[M]$ to $y[K]$ inclusive are returned.
This same function signal.convolve can actually take N -dimensional arrays as inputs and will return
the N -dimensional convolution of the two arrays. The same input flags are available for that case as well.
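The finite convolution sum can be written out directly and compared with a library routine. The sketch below assumes NumPy's `convolve`, which accepts the same three flag values, as a stand-in for signal.convolve. `convolve_full` is a hypothetical helper.

```python
import numpy as np

def convolve_full(x, h):
    """Full discrete convolution using the finite summation limits."""
    K, M = len(x) - 1, len(h) - 1
    y = np.zeros(K + M + 1)
    for n in range(K + M + 1):
        for k in range(max(n - M, 0), min(n, K) + 1):
            y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # K = 4
h = np.array([1.0, 0.5, 0.25])             # M = 2

full = convolve_full(x, h)                 # length K + M + 1
valid = np.convolve(x, h, mode="valid")    # the values y[M] to y[K]
same = np.convolve(x, h, mode="same")      # same length as the larger input
```

With these inputs the 'full' result has 7 values, 'same' has 5, and 'valid' has 3.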
Correlation is very similar to convolution except that the minus sign becomes a plus sign. Thus
$$w[n] = \sum_{k=-\infty}^{\infty} y[k]\, x[n + k]$$
is the (cross) correlation of the signals $y$ and $x$. For finite-length signals with $y[n] = 0$ outside of the range
$[0, K]$ and $x[n] = 0$ outside of the range $[0, M]$, the summation can simplify to
$$w[n] = \sum_{k=\max(0,-n)}^{\min(K,M-n)} y[k]\, x[n + k].$$
The SciPy function signal.correlate implements this operation. Equivalent flags are available for this
operation to return the full $K + M + 1$ length sequence ('full') or a sequence with the same size as the largest
sequence starting at $w\left[-K + \left\lfloor \frac{M-1}{2} \right\rfloor\right]$ ('same') or a sequence where the values depend on all the values of
the smallest sequence ('valid'). This final option returns the $K - M + 1$ values $w[M - K]$ to $w[0]$ inclusive.
The function signal.correlate can also take arbitrary N-dimensional arrays as input and return the
N-dimensional correlation of the two arrays on output.
When N = 2, signal.correlate and/or signal.convolve can be used to construct arbitrary image filters
to perform actions such as blurring, enhancing, and edge-detection for an image.
Convolution is mainly used for filtering when one of the signals is much smaller than the other ($K \gg M$),
otherwise linear filtering is more easily accomplished in the frequency domain (see Fourier Transforms).
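The finite correlation sum can be evaluated directly as well. In the sketch below the lags run from −K to M, and the result is compared with NumPy's `correlate`. The argument order passed to `np.correlate` is chosen to match the definition of w[n] used here and is an assumption about how the two conventions line up. `correlate_full` is a hypothetical helper.

```python
import numpy as np

def correlate_full(y, x):
    """Cross-correlation w[n] = sum_k y[k] x[n + k] for lags n = -K, ..., M."""
    K, M = len(y) - 1, len(x) - 1
    lags = np.arange(-K, M + 1)
    w = np.zeros(len(lags))
    for i, n in enumerate(lags):
        for k in range(max(0, -n), min(K, M - n) + 1):
            w[i] += y[k] * x[n + k]
    return lags, w

y = np.array([1.0, 2.0, 3.0])     # K = 2
x = np.array([1.0, 1.0])          # M = 1

lags, w = correlate_full(y, x)
w_numpy = np.correlate(x, y, mode="full")
```

Unlike convolution, correlation is not symmetric in its arguments: swapping y and x reverses the order of the lags.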
where $x[n]$ is the input sequence and $y[n]$ is the output sequence. If we assume initial rest so that $y[n] = 0$
for $n < 0$, then this kind of filter can be implemented using convolution. However, the convolution filter
sequence $h[n]$ could be infinite if $a_k \neq 0$ for $k \geq 1$. In addition, this general class of linear filter allows initial
conditions to be placed on $y[n]$ for $n < 0$ resulting in a filter that cannot be expressed using convolution.
The difference equation filter can be thought of as finding $y[n]$ recursively in terms of its previous values
Often $a_0 = 1$ is chosen for normalization. The implementation in SciPy of this general difference equation
filter is a little more complicated than would be implied by the previous equation. It is implemented so that
only one signal needs to be delayed. The actual implementation equations are (assuming $a_0 = 1$):
$$\begin{aligned}
y[n] &= b_0 x[n] + z_0[n-1] \\
z_0[n] &= b_1 x[n] + z_1[n-1] - a_1 y[n] \\
z_1[n] &= b_2 x[n] + z_2[n-1] - a_2 y[n] \\
&\;\;\vdots \\
z_{K-2}[n] &= b_{K-1} x[n] + z_{K-1}[n-1] - a_{K-1} y[n] \\
z_{K-1}[n] &= b_K x[n] - a_K y[n],
\end{aligned}$$
where $K = \max(N, M)$. Note that $b_K = 0$ if $K > M$ and $a_K = 0$ if $K > N$. In this way, the output at
time $n$ depends only on the input at time $n$ and the value of $z_0$ at the previous time. This can always be
calculated as long as the $K$ values $z_0[n-1] \ldots z_{K-1}[n-1]$ are computed and stored at each time step.
The difference-equation filter is called using the command signal.lfilter in SciPy. This command takes
as inputs the vector b, the vector a, and a signal x, and returns the vector y (the same length as x) computed
using the equation given above. If x is N-dimensional, then the filter is computed along the axis provided.
If desired, initial conditions providing the values of z0 [−1] to zK−1 [−1] can be provided; otherwise it will be
assumed that they are all zero. If initial conditions are provided, then the final conditions on the intermediate
variables are also returned. These could be used, for example, to restart the calculation in the same state.
Sometimes it is more convenient to express the initial conditions in terms of the signals x [n] and y [n] . In
other words, perhaps you have the values of x [−M ] to x [−1] and the values of y [−N ] to y [−1] and would
like to determine what values of zm [−1] should be delivered as initial conditions to the difference-equation
filter. It is not difficult to show that for 0 ≤ m < K,
\[
z_m[n] = \sum_{p=0}^{K-m-1} \left( b_{m+p+1}\, x[n-p] - a_{m+p+1}\, y[n-p] \right).
\]
Using this formula we can find the initial condition vector z0 [−1] to zK−1 [−1] given initial conditions on y
(and x). The command signal.lfiltic performs this function.
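As a hedged illustration of restarting a filter, the sketch below filters a signal in two pieces, passing the final state of the first piece as the initial state of the second, and then rebuilds the same state from past inputs and outputs with `signal.lfiltic`. It assumes the modern `scipy.signal` interface (keyword `zi`, past samples given most-recent-first); the coefficients and data are arbitrary.

```python
import numpy as np
from scipy import signal

b = [0.2, 0.3, 0.1]
a = [1.0, -0.5, 0.25]
x = np.random.default_rng(0).standard_normal(40)

# Reference: filter the whole signal at once (zero initial conditions).
y_all = signal.lfilter(b, a, x)

# Filter in two chunks, carrying the intermediate state z across the split.
zi = np.zeros(max(len(a), len(b)) - 1)
y1, zf = signal.lfilter(b, a, x[:20], zi=zi)
y2, _ = signal.lfilter(b, a, x[20:], zi=zf)

# Rebuild the same state from the most recent outputs and inputs
# (ordered y[-1], y[-2], ... and x[-1], x[-2], ...).
z_from_signals = signal.lfiltic(b, a, y1[::-1][:2], x[:20][::-1][:2])

print(np.allclose(np.concatenate([y1, y2]), y_all))
print(np.allclose(z_from_signals, zf))
```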
Median Filter A median filter is commonly applied when noise is markedly non-Gaussian or when it is
desired to preserve edges. The median filter works by sorting all of the array pixel values in a rectangular
region surrounding the point of interest. The sample median of this list of neighborhood pixel values is
used as the value for the output array. The sample median is the middle array value in a sorted list of
neighborhood values. If there are an even number of elements in the neighborhood, then the average of
the middle two values is used as the median. A general purpose median filter that works on N-dimensional
arrays is signal.medfilt. A specialized version that works only for two-dimensional arrays is available as
signal.medfilt2d.
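A small sketch of both commands, assuming the modern `scipy.signal` interface in which the neighborhood size is passed as `kernel_size` and the input is zero-padded at the edges. The test data are invented: a single outlier on a step edge.

```python
import numpy as np
from scipy import signal

# One-dimensional signal: an isolated spike followed by a step edge.
x = np.array([1.0, 1.0, 100.0, 1.0, 1.0, 2.0, 2.0, 2.0])
y = signal.medfilt(x, kernel_size=3)
print(y)   # the spike is removed while the step from 1 to 2 is kept

# Two-dimensional image with a single "hot" pixel.
img = np.zeros((5, 5))
img[2, 2] = 9.0
img_filtered = signal.medfilt2d(img, kernel_size=3)
print(img_filtered.max())
```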
Order Filter A median filter is a specific example of a more general class of filters called order filters. To
compute the output at a particular pixel, all order filters use the array values in a region surrounding that
pixel. These array values are sorted and then one of them is selected as the output value. For the median
filter, the sample median of the list of array values is used as the output. A general order filter allows the
user to select which of the sorted values will be used as the output. So, for example one could choose to pick
the maximum in the list or the minimum. The order filter takes an additional argument besides the input
array and the region mask that specifies which of the elements in the sorted list of neighbor array values
should be used as the output. The command to perform an order filter is signal.order_filter.
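The following sketch selects the minimum, median, and maximum of a three-sample neighborhood. It assumes the modern call form `signal.order_filter(a, domain, rank)`, where `domain` is the region mask and `rank` indexes the sorted neighborhood starting at 0; the data are arbitrary.

```python
import numpy as np
from scipy import signal

x = np.array([2.0, 9.0, 4.0, 7.0, 1.0])
domain = np.ones(3)                         # three-sample neighborhood mask

lo = signal.order_filter(x, domain, 0)      # rank 0: smallest value in each window
med = signal.order_filter(x, domain, 1)     # rank 1: middle value, i.e. the median
hi = signal.order_filter(x, domain, 2)      # rank 2: largest value in each window

print(lo, med, hi)
```

With the middle rank, the result coincides with `signal.medfilt(x, 3)`.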
Wiener filter The Wiener filter is a simple deblurring filter for denoising images. This is not the Wiener
filter commonly described in image reconstruction problems but instead it is a simple, local-mean filter. Let
x be the input signal, then the output is
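\[
y = \begin{cases}
\dfrac{\sigma^2}{\sigma_x^2}\, m_x + \left( 1 - \dfrac{\sigma^2}{\sigma_x^2} \right) x & \sigma_x^2 \ge \sigma^2, \\[1ex]
m_x & \sigma_x^2 < \sigma^2,
\end{cases}
\]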
where m_x is the local estimate of the mean and σ_x² is the local estimate of the variance. The window for
these estimates is an optional input parameter (default is 3 × 3). The parameter σ² is a threshold noise
parameter. If σ² is not given then it is estimated as the average of the local variances.
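A rough sketch of the filter on a one-dimensional signal, assuming the modern `signal.wiener(im, mysize, noise)` interface. The sine-plus-noise test signal and the window size of 5 are arbitrary choices for illustration.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
n = np.arange(400)
clean = np.sin(2 * np.pi * n / 200.0)                  # slowly varying signal
noisy = clean + 0.5 * rng.standard_normal(n.size)      # additive white noise

# Noise power is left unspecified, so it is estimated from the local variances.
smoothed = signal.wiener(noisy, mysize=5)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_smoothed = np.mean((smoothed - clean) ** 2)
print(mse_noisy, mse_smoothed)
```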
Hilbert filter The Hilbert transform constructs the complex-valued analytic signal from a real signal. For
example if x = cos ωn then y = hilbert (x) would return (except near the edges) y = exp (jωn) . In the
frequency domain, the hilbert transform performs
Y =X ·H
where H is 2 for positive frequencies, 0 for negative frequencies and 1 for zero-frequencies.
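The cosine example can be checked numerically. The sketch assumes the modern `scipy.signal.hilbert` function and picks a frequency with a whole number of periods in the window, so the edge effects mentioned above vanish.

```python
import numpy as np
from scipy import signal

N = 64
n = np.arange(N)
w = 2 * np.pi * 4 / N          # exactly 4 periods fit in the 64-sample window
x = np.cos(w * n)

y = signal.hilbert(x)          # complex-valued analytic signal
print(np.allclose(y, np.exp(1j * w * n)))
```

The real part of `y` is the original signal, and `np.abs(y)` gives its envelope.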
Detrend
7.3 Filter design
7.3.1 Finite-impulse response design
7.3.2 Infinite-impulse response design
7.3.3 Analog filter frequency response
7.3.4 Digital filter frequency response
8 Input/Output
8.1 Binary
8.1.1 Arbitrary binary input and output (fopen)
8.1.2 Read and write Matlab .mat files
8.1.3 Saving workspace
8.2 Text-file
8.2.1 Read text-files (read_array)
8.2.2 Write a text-file (write_array)
9 Fourier Transforms
9.1 One-dimensional
9.2 Two-dimensional
9.3 N-dimensional
9.4 Shifting
9.5 Sample frequencies
9.6 Hilbert transform
9.7 Tilbert transform
10 Linear Algebra
When SciPy is built using the optimized ATLAS LAPACK and BLAS libraries, it has very fast linear algebra
capabilities. If you dig deep enough, all of the raw lapack and blas libraries are available for your use for
even more speed. In this section, some easier-to-use interfaces to these routines are described.
All of these linear algebra routines expect an object that can be converted into a 2-dimensional array.
The output of these routines is also a two-dimensional array. There is a matrix class defined in Numeric
that scipy inherits and extends. You can initialize this class with an appropriate Numeric array in order
to get objects for which multiplication is matrix-multiplication instead of the default, element-by-element
multiplication.
10.1 Matrix Class
The matrix class is initialized with the SciPy command mat, which is just convenient shorthand for Matrix.Matrix.
If you are going to be doing a lot of matrix math, it is convenient to convert arrays into matrices using this
command. One convenience of using the mat command is that you can enter two-dimensional matrices
using MATLAB-like syntax, with commas or spaces separating columns and semicolons separating rows, as
long as the matrix is placed in a string passed to mat.
then
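\[
A^{-1} = \frac{1}{25}
\begin{bmatrix} -37 & 9 & 22 \\ 14 & 2 & -9 \\ 4 & -3 & 1 \end{bmatrix}
= \begin{bmatrix} -1.48 & 0.36 & 0.88 \\ 0.56 & 0.08 & -0.36 \\ 0.16 & -0.12 & 0.04 \end{bmatrix}.
\]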
The following example demonstrates this computation in SciPy
>>> A = mat('[1 3 5; 2 5 1; 2 3 8]')
>>> A
Matrix([[1, 3, 5],
[2, 5, 1],
[2, 3, 8]])
>>> A.I
Matrix([[-1.48, 0.36, 0.88],
[ 0.56, 0.08, -0.36],
[ 0.16, -0.12, 0.04]])
>>> linalg.inv(A)
array([[-1.48, 0.36, 0.88],
[ 0.56, 0.08, -0.36],
[ 0.16, -0.12, 0.04]])
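For readers on a current installation, the same inverse can be reproduced with plain arrays. This sketch assumes modern NumPy/SciPy, uses explicit imports rather than `from scipy import *`, and replaces the `mat` string syntax with a nested list.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 5.0],
              [2.0, 5.0, 1.0],
              [2.0, 3.0, 8.0]])

A_inv = linalg.inv(A)
print(A_inv)

# The product of a matrix and its inverse should be the identity
# (up to floating-point rounding).
print(np.allclose(A @ A_inv, np.eye(3)))
```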
x + 3y + 5z = 10
2x + 5y + z = 8
2x + 3y + 8z = 3
However, it is better to use the linalg.solve command which can be faster and more numerically stable. In
this case it gives the same answer as shown in the following example:
>>> A = mat('[1 3 5; 2 5 1; 2 3 8]')
>>> b = mat('[10;8;3]')
>>> A.I*b
Matrix([[-9.28],
[ 5.16],
[ 0.76]])
>>> linalg.solve(A,b)
array([[-9.28],
[ 5.16],
[ 0.76]])
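A comparable check with plain arrays, assuming a current NumPy/SciPy installation. It solves the same system both ways and verifies the residual; the variable names are illustrative.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 5.0],
              [2.0, 5.0, 1.0],
              [2.0, 3.0, 8.0]])
b = np.array([10.0, 8.0, 3.0])

x_inv = linalg.inv(A) @ b      # explicit inverse, then multiply
x_solve = linalg.solve(A, b)   # factor-and-solve, usually preferred

print(x_solve)
print(np.allclose(A @ x_solve, b))
```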
This is a recursive way to define the determinant where the base case is defined by accepting that the
determinant of a 1 × 1 matrix is the only matrix element. In SciPy the determinant can be calculated with
linalg.det. For example, the determinant of
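\[
A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 5 & 1 \\ 2 & 3 & 8 \end{bmatrix}
\]
is
\begin{align*}
|A| &= 1 \begin{vmatrix} 5 & 1 \\ 3 & 8 \end{vmatrix}
     - 3 \begin{vmatrix} 2 & 1 \\ 2 & 8 \end{vmatrix}
     + 5 \begin{vmatrix} 2 & 5 \\ 2 & 3 \end{vmatrix} \\
    &= 1 (5 \cdot 8 - 3 \cdot 1) - 3 (2 \cdot 8 - 2 \cdot 1) + 5 (2 \cdot 3 - 2 \cdot 5) = -25.
\end{align*}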
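The recursive definition can be sketched directly and compared with `linalg.det`. The function name `cofactor_det` is hypothetical and the code assumes modern NumPy/SciPy; the recursion is exponentially slow and is shown only to mirror the definition.

```python
import numpy as np
from scipy import linalg

def cofactor_det(M):
    """Hypothetical helper: determinant by cofactor expansion along the first row."""
    M = np.asarray(M, dtype=float)
    if M.shape == (1, 1):
        return M[0, 0]                      # base case: the only matrix element
    total = 0.0
    for j in range(M.shape[1]):
        # Minor: delete row 0 and column j.
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * cofactor_det(minor)
    return total

A = np.array([[1.0, 3.0, 5.0],
              [2.0, 5.0, 1.0],
              [2.0, 3.0, 8.0]])
print(cofactor_det(A), linalg.det(A))
```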
For a matrix A, the only valid values of ord are ±2, ±1, ±inf, and 'fro' (or 'f'). Thus,
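\[
\|A\| = \begin{cases}
\max_i \sum_j |a_{ij}| & \mathrm{ord} = \mathrm{inf} \\
\min_i \sum_j |a_{ij}| & \mathrm{ord} = -\mathrm{inf} \\
\max_j \sum_i |a_{ij}| & \mathrm{ord} = 1 \\
\min_j \sum_i |a_{ij}| & \mathrm{ord} = -1 \\
\max \sigma_i & \mathrm{ord} = 2 \\
\min \sigma_i & \mathrm{ord} = -2 \\
\sqrt{\mathrm{trace}\left( A^H A \right)} & \mathrm{ord} = \text{'fro'}
\end{cases}
\]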
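These cases can be checked on a small matrix. The sketch assumes the modern `scipy.linalg.norm(a, ord)` signature and uses `np.inf` for the infinity orders.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 5.0],
              [2.0, 5.0, 1.0],
              [2.0, 3.0, 8.0]])

print(linalg.norm(A, np.inf))    # largest absolute row sum
print(linalg.norm(A, -np.inf))   # smallest absolute row sum
print(linalg.norm(A, 1))         # largest absolute column sum
print(linalg.norm(A, -1))        # smallest absolute column sum
print(linalg.norm(A, 2))         # largest singular value
print(linalg.norm(A, 'fro'))     # square root of the sum of squared entries
```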
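where ε_i represents uncertainty in the data. The strategy of least squares is to pick the coefficients c_j to
minimize
\[
J(\mathbf{c}) = \sum_i \left| y_i - \sum_j c_j f_j(x_i) \right|^2 .
\]
Theoretically, a global minimum will occur when
\[
\frac{\partial J}{\partial c_n^*} = 0 = \sum_i \left( y_i - \sum_j c_j f_j(x_i) \right) \left( -f_n^*(x_i) \right)
\]
or
\begin{align*}
\sum_j c_j \sum_i f_j(x_i) f_n^*(x_i) &= \sum_i y_i f_n^*(x_i) \\
\mathbf{A}^H \mathbf{A} \mathbf{c} &= \mathbf{A}^H \mathbf{y}
\end{align*}
where
\[
\{\mathbf{A}\}_{ij} = f_j(x_i) .
\]
When A^H A is invertible, then
\[
\mathbf{c} = \left( \mathbf{A}^H \mathbf{A} \right)^{-1} \mathbf{A}^H \mathbf{y} = \mathbf{A}^\dagger \mathbf{y}
\]
where A† is called the pseudo-inverse of A. Notice that using this definition of A the model can be written
\[
\mathbf{y} = \mathbf{A} \mathbf{c} + \boldsymbol{\epsilon} .
\]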
The command linalg.lstsq will solve the linear least squares problem for c given A and y. In addition
linalg.pinv or linalg.pinv2 (uses a different method based on singular value decomposition) will find A†
given A.
The following example and figure demonstrate the use of linalg.lstsq and linalg.pinv for solving a
data-fitting problem. The data shown below were generated using the model:
\[
y_i = c_1 e^{-x_i} + c_2 x_i
\]
where x_i = 0.1i for i = 1 . . . 10, c_1 = 5, and c_2 = 2 (the values used in the code below). Noise is added to
y_i and the coefficients c_1 and c_2 are estimated using linear least squares.
c1,c2= 5.0,2.0
i = r_[1:11]
xi = 0.1*i
yi = c1*exp(-xi)+c2*xi
zi = yi + 0.05*max(yi)*randn(len(yi))
A = c_[exp(-xi)[:,NewAxis],xi[:,NewAxis]]
c,resid,rank,sigma = linalg.lstsq(A,zi)
# evaluate the fitted model on a finer grid for plotting
xi2 = r_[0.1:1.0:100j]
yi2 = c[0]*exp(-xi2) + c[1]*xi2
xplt.plot(xi,zi,'x',xi2,yi2)
xplt.limits(0,1.1,3.0,5.5)
xplt.xlabel('x_i')
xplt.title('Data fitting with linalg.lstsq')
xplt.eps('lstsq_fit')
[Figure: "Data fitting with linalg.lstsq", showing the noisy data points and the fitted curve plotted against x_i.]
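The same fit can be sketched without the plotting commands, assuming a current NumPy/SciPy installation. The names `rng`, `c_pinv`, and `c_exact` are illustrative, and `np.column_stack` replaces the `c_[...]` construction.

```python
import numpy as np
from scipy import linalg

c1, c2 = 5.0, 2.0
xi = 0.1 * np.arange(1, 11)
yi = c1 * np.exp(-xi) + c2 * xi                       # noise-free model values

rng = np.random.default_rng(0)
zi = yi + 0.05 * yi.max() * rng.standard_normal(yi.size)   # noisy observations

# Design matrix: column j holds the basis function f_j evaluated at every x_i.
A = np.column_stack([np.exp(-xi), xi])

c, resid, rank, sigma = linalg.lstsq(A, zi)           # least-squares estimate
c_pinv = linalg.pinv(A) @ zi                          # same estimate via the pseudo-inverse
c_exact, *_ = linalg.lstsq(A, yi)                     # noise-free data recovers c1, c2

print(c, c_pinv, c_exact)
```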
10.3 Decompositions
In many applications it is useful to decompose a matrix using other representations. There are several
decompositions supported by SciPy.
With its default optional arguments, the command linalg.eig returns λ and v. However, it can also return
v_L and just λ by itself (linalg.eigvals returns just λ as well).
In addition, linalg.eig can also solve the more general eigenvalue problem
\begin{align*}
A v &= \lambda B v \\
A^H v_L &= \lambda^* B^H v_L
\end{align*}
for square matrices A and B. The standard eigenvalue problem is an example of the general eigenvalue
problem for B = I. When a generalized eigenvalue problem can be solved, then it provides a decomposition
of A as
\[
A = B V \Lambda V^{-1}
\]
where V is the collection of eigenvectors into columns and Λ is a diagonal matrix of eigenvalues.
By definition, eigenvectors are only defined up to a constant scale factor. In SciPy, the scaling factor for
the eigenvectors is chosen so that ‖v‖² = Σ_i v_i² = 1.
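As an example, consider finding the eigenvalues and eigenvectors of the matrix
\[
A = \begin{bmatrix} 1 & 5 & 2 \\ 2 & 4 & 1 \\ 3 & 6 & 2 \end{bmatrix} .
\]
The characteristic polynomial is
\begin{align*}
|A - \lambda I| &= (1 - \lambda) \left[ (4 - \lambda)(2 - \lambda) - 6 \right]
  - 5 \left[ 2 (2 - \lambda) - 3 \right] + 2 \left[ 12 - 3 (4 - \lambda) \right] \\
  &= -\lambda^3 + 7 \lambda^2 + 8 \lambda - 3 .
\end{align*}
The roots of this polynomial are the eigenvalues of A:
\begin{align*}
\lambda_1 &= 7.9579 \\
\lambda_2 &= -1.2577 \\
\lambda_3 &= 0.2997 .
\end{align*}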
The eigenvectors corresponding to each eigenvalue can then be found using the original equation.
>>> v1 = mat(v[:,0]).T
>>> print max(ravel(abs(A*v1-l1*v1)))
4.4408920985e-16
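A version of this eigenvalue check for a current installation might look as follows. It is a sketch assuming modern NumPy/SciPy with plain arrays instead of the matrix class; `residual` is an illustrative name.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 5.0, 2.0],
              [2.0, 4.0, 1.0],
              [3.0, 6.0, 2.0]])

lam, V = linalg.eig(A)        # eigenvalues (complex dtype) and eigenvectors as columns
print(np.sort(lam.real))

# Each column v of V should satisfy A v = lambda v.
residual = A @ V - V * lam    # broadcasting scales column k by lam[k]
print(np.max(np.abs(residual)))

# Each eigenvector is scaled to unit Euclidean norm.
print(np.linalg.norm(V, axis=0))
```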
10.3.2 Singular value decomposition
Singular Value Decomposition (SVD) can be thought of as an extension of the eigenvalue problem to matrices
that are not square. Let A be an M × N matrix with M and N arbitrary. The matrices A^H A and AA^H
are square hermitian matrices³ of size N × N and M × M respectively. The eigenvalues of hermitian
matrices are real, and because A^H A and AA^H are also positive semidefinite, their eigenvalues are
non-negative. In addition, there are at most min (M, N) identical non-zero eigenvalues of A^H A and AA^H.
Define these positive eigenvalues as σ_i². The square roots of these are called the singular values of A.
The eigenvectors of A^H A are collected by columns into an N × N unitary⁴ matrix V, while the eigenvectors
of AA^H are collected by columns in the M × M unitary matrix U. The singular values are collected in an
M × N zero matrix Σ with main diagonal entries set to the singular values. Then
\[
A = U \Sigma V^H
\]
is the singular-value decomposition of A. Every matrix has a singular value decomposition. Sometimes, the
singular values are called the spectrum of A. The command linalg.svd will return U, V^H, and σ_i as an
array of the singular values. To obtain the matrix Σ use linalg.diagsvd. The following example illustrates
the use of linalg.svd.
>>> print A
Matrix([[1, 3, 2],
[1, 2, 3]])
>>> print U*Sig*Vh
Matrix([[ 1., 3., 2.],
[ 1., 2., 3.]])
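The same reconstruction can be sketched with plain arrays, assuming the modern `scipy.linalg.svd` and `scipy.linalg.diagsvd` functions. The final line compares the squared singular values with the eigenvalues of AA^H, as described above.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 2.0],
              [1.0, 2.0, 3.0]])
M, N = A.shape

U, s, Vh = linalg.svd(A)          # s holds the singular values in descending order
Sig = linalg.diagsvd(s, M, N)     # M x N matrix with s on the main diagonal

print(U @ Sig @ Vh)               # reproduces A up to rounding

# Squared singular values equal the eigenvalues of A A^H.
print(s ** 2, np.sort(linalg.eigvalsh(A @ A.T))[::-1])
```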
10.3.3 LU decomposition
The LU decomposition finds a representation for the M × N matrix A as
A = PLU
Axi = bi
³ A hermitian matrix D satisfies D^H = D.
⁴ A unitary matrix D satisfies D^H D = I = DD^H so that D^−1 = D^H.
for many different b_i. The LU decomposition allows this to be written as
\[
P L U x_i = b_i .
\]
Because L is lower-triangular and U is upper-triangular, the equation can be solved for Ux_i and finally x_i
very rapidly using forward- and back-substitution. An initial time spent factoring A allows for very rapid
solution of similar systems of equations in the future. If the intent for performing LU decomposition is for
solving linear systems, then the command linalg.lu_factor should be used, followed by repeated applications
of the command linalg.lu_solve to solve the system for each new right-hand side.
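The factor-once, solve-many pattern can be sketched as follows, assuming the modern `scipy.linalg.lu`, `lu_factor`, and `lu_solve` functions. The right-hand sides are arbitrary illustration data.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 5.0],
              [2.0, 5.0, 1.0],
              [2.0, 3.0, 8.0]])

# Explicit factors, mainly useful for inspection: A = P L U.
P, L, U = linalg.lu(A)
print(np.allclose(P @ L @ U, A))

# Compact factorization intended for solving: factor once ...
lu_piv = linalg.lu_factor(A)

# ... then reuse it for every new right-hand side.
rhs = [np.array([10.0, 8.0, 3.0]), np.array([1.0, 0.0, 0.0])]
solutions = [linalg.lu_solve(lu_piv, b) for b in rhs]
for b, x in zip(rhs, solutions):
    print(x, np.allclose(A @ x, b))
```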
\begin{align*}
A &= U^H U \\
A &= L L^H
\end{align*}
where L is lower-triangular and U is upper-triangular. Notice that L = U^H. The command linalg.cholesky
computes the Cholesky factorization. For using Cholesky factorization to solve systems of equations there
are also linalg.cho_factor and linalg.cho_solve routines that work similarly to their LU decomposition
counterparts.
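A short sketch of both forms of the factorization and of the solver pair, assuming the modern `scipy.linalg` functions (where `cholesky` returns the upper factor unless `lower=True` is given). The symmetric positive-definite test matrix is made up.

```python
import numpy as np
from scipy import linalg

# Symmetric, strictly diagonally dominant with positive diagonal,
# hence positive definite.
A = np.array([[4.0, 2.0, 0.6],
              [2.0, 5.0, 1.0],
              [0.6, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])

U = linalg.cholesky(A)                 # upper-triangular factor, A = U^H U
L = linalg.cholesky(A, lower=True)     # lower-triangular factor, A = L L^H
print(np.allclose(L, U.conj().T))

# Factor once, then solve (analogous to lu_factor / lu_solve).
c_and_lower = linalg.cho_factor(A)
x = linalg.cho_solve(c_and_lower, b)
print(np.allclose(A @ x, b))
```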
10.3.5 QR decomposition
The QR decomposition works for any M × N array and finds an
M × M unitary matrix Q and an M × N upper-trapezoidal matrix R such that
\[
A = QR .
\]
The SVD, A = UΣV^H, also factors A into a unitary matrix U times the matrix ΣV^H, but ΣV^H is not
upper-trapezoidal in general, so the SVD does not directly provide a QR decomposition. In SciPy
independent algorithms are used to find QR and SVD decompositions. The command for QR decomposition
is linalg.qr.
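The stated shapes and properties of Q and R can be checked with a small sketch, assuming the modern `scipy.linalg.qr` function in its default (full) mode. The 4 × 3 test matrix is arbitrary.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 2.0],
              [1.0, 2.0, 3.0],
              [2.0, 0.0, 1.0],
              [4.0, 1.0, 1.0]])

Q, R = linalg.qr(A)      # Q is M x M, R is M x N
print(Q.shape, R.shape)

print(np.allclose(Q @ R, A))               # the factors reproduce A
print(np.allclose(Q.T @ Q, np.eye(4)))     # Q is unitary (orthogonal for real input)
print(np.allclose(R, np.triu(R)))          # R is upper-trapezoidal
```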
Matrix([[ 9.9001, 1.7895, -0.655 ],
[ 0. , 0.5499, -1.5775],
[ 0. , 0.5126, 0.5499]])
>>> print T2
Matrix([[ 9.9001+0.j , -0.3244+1.5546j, -0.8862+0.569j ],
[ 0. +0.j , 0.5499+0.8993j, 1.0649-0.j ],
[ 0. +0.j , 0. +0.j , 0.5499-0.8993j]])
>>> print abs(T1-T2) # different
[[ 0. 2.1184 0.1949]
[ 0. 0. 1.2676]
[ 0. 0. 0. ]]
>>> print abs(Z1-Z2) # different
[[ 0.0683 1.1175 0.1973]
[ 0.1186 0.5644 0.247 ]
[ 0.1262 0.7645 0.1916]]
>>> T,Z,T1,Z1,T2,Z2 = map(mat,(T,Z,T1,Z1,T2,Z2))
>>> print abs(A-Z*T*Z.H)
Matrix([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> print abs(A-Z1*T1*Z1.H)
Matrix([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> print abs(A-Z2*T2*Z2.H)
Matrix([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
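A hedged sketch of the same kind of comparison on a current installation, assuming the modern `scipy.linalg.schur` (with its `output` keyword) and `scipy.linalg.rsf2csf` functions. The 3 × 3 matrix is an arbitrary test case, not necessarily the one used in the session above.

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 3.0, 2.0],
              [1.0, 4.0, 5.0],
              [2.0, 3.0, 6.0]])

T, Z = linalg.schur(A)                        # real Schur form (quasi-upper-triangular T)
Tc, Zc = linalg.schur(A, output="complex")    # complex Schur form (upper-triangular Tc)
T2, Z2 = linalg.rsf2csf(T, Z)                 # convert the real form to a complex form

# All three pairs reconstruct A as Z T Z^H.
for t, z in ((T, Z), (Tc, Zc), (T2, Z2)):
    print(np.allclose(z @ t @ z.conj().T, A))
```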
A matrix function can be defined using this Taylor series for the square matrix A as
\[
f(A) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} A^k .
\]
While this serves as a useful representation of a matrix function, it is rarely the best way to calculate a
matrix function.
Another method to compute the matrix exponential is to find an eigenvalue decomposition of A:
\[
A = V \Lambda V^{-1}
\]
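Given such a decomposition, the exponential follows by exponentiating only the eigenvalues, e^A = V e^Λ V^−1. The sketch below assumes modern NumPy/SciPy and compares this route with `scipy.linalg.expm`; the 2 × 2 matrix is an arbitrary example with distinct eigenvalues.

```python
import numpy as np
from scipy import linalg

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])      # eigenvalues -1 and -2, so A is diagonalizable

lam, V = linalg.eig(A)

# V * exp(lam) scales column k of V by exp(lam[k]), i.e. V @ diag(exp(lam)).
expA_eig = (V * np.exp(lam)) @ linalg.inv(V)

expA = linalg.expm(A)             # SciPy's general-purpose matrix exponential
print(np.allclose(expA_eig, expA))
```

This route is only reliable when V is well conditioned, which is one reason a dedicated routine is normally preferred.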
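\begin{align*}
\sin(A) &= \frac{e^{jA} - e^{-jA}}{2j} \\
\cos(A) &= \frac{e^{jA} + e^{-jA}}{2} .
\end{align*}
The tangent is
\[
\tan(x) = \frac{\sin(x)}{\cos(x)} = \left[ \cos(x) \right]^{-1} \sin(x)
\]
and so the matrix tangent is defined as
\[
\left[ \cos(A) \right]^{-1} \sin(A) .
\]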
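These definitions can be verified numerically. The sketch assumes the modern `scipy.linalg` functions `sinm`, `cosm`, `tanm`, and `expm`; the small test matrix is arbitrary.

```python
import numpy as np
from scipy import linalg

A = np.array([[0.1, 0.4],
              [0.3, 0.2]])

sinA = linalg.sinm(A)
cosA = linalg.cosm(A)
tanA = linalg.tanm(A)

# Trigonometric functions expressed through the matrix exponential.
sin_from_exp = (linalg.expm(1j * A) - linalg.expm(-1j * A)) / 2j
cos_from_exp = (linalg.expm(1j * A) + linalg.expm(-1j * A)) / 2

# Matrix tangent as [cos(A)]^-1 sin(A), computed with a linear solve.
tan_from_def = linalg.solve(cosA, sinA)

print(np.allclose(sinA, sin_from_exp),
      np.allclose(cosA, cos_from_exp),
      np.allclose(tanA, tan_from_def))
```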
These matrix functions can be found using linalg.sinhm, linalg.coshm, and linalg.tanhm.
>>> A = rand(3,3)
>>> B = linalg.funm(A,lambda x: special.jv(0,real(x)))
>>> print A
[[ 0.0593 0.5612 0.4403]
[ 0.8797 0.2556 0.1452]
[ 0.964 0.9666 0.1243]]
>>> print B
[[ 0.8206 -0.1212 -0.0612]
[-0.1323 0.8256 -0.0627]
[-0.2073 -0.1946 0.8516]]
11 Statistics
SciPy has a tremendous number of basic statistics routines with more easily added by the end user (if you
create one please contribute it). All of the statistics functions are located in the sub-package stats and a
fairly complete listing of these functions can be had using info(stats).
13 Plotting with xplt
13.1 Gist
The underlying graphics library for xplt is the pygist library. All of the commands of pygist are available
under xplt as well. For more information on the pygist commands you can read the documentation
of that package in html here http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python/pygist_html/pygist.html
or in pdf at this location http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python/pygist.pdf.