Python.corso.johansson
2.6.4 Dictionaries
2.7 Control Flow
2.7.1 Conditional statements: if, elif, else
2.8 Loops
2.8.1 for loops:
2.8.2 Using Lists: Creating lists using for loops:
2.8.3 while loops:
2.9 Functions
2.9.1 Default argument and keyword arguments
2.9.2 Unnamed functions (lambda function)
2.10 Classes
2.11 Modules
2.12 Exceptions
2.13 Further reading
4 SciPy - Library of scientific algorithms for Python
4.1 Introduction
4.2 Special functions
4.3 Integration
4.3.1 Numerical integration: quadrature
4.4 Ordinary differential equations (ODEs)
4.5 Fourier transform
4.6 Linear algebra
4.6.1 Linear equation systems
4.6.2 Eigenvalues and eigenvectors
4.6.3 Matrix operations
4.6.4 Sparse matrices
4.7 Optimization
4.7.1 Finding a minima
4.7.2 Finding a solution to a function
4.8 Interpolation
4.9 Statistics
4.9.1 Statistical tests
4.10 Further reading
6.4.2 Simplify
6.4.3 apart and together
6.5 Calculus
6.5.1 Differentiation
6.6 Integration
6.6.1 Sums and products
6.7 Limits
6.8 Series
6.9 Linear algebra
6.9.1 Matrices
6.10 Solving equations
6.11 Quantum mechanics: noncommuting variables
6.12 States
6.12.1 Operators
6.13 Further reading
9.4 Creating and cloning a repository
9.5 Status
9.6 Adding files and committing changes
9.7 Committing changes
9.8 Removing files
9.9 Commit logs
9.10 Diffs
9.11 Discard changes in the working directory
9.12 Checking out old revisions
9.13 Tagging and branching
9.13.1 Tags
9.14 Branches
9.15 pulling and pushing changesets between repositories
9.15.1 pull
9.15.2 push
9.16 Hosted repositories
9.17 Graphical user interfaces
9.18 Further reading
Chapter 1
Introduction to scientific computing
with Python
This curriculum builds on material by J. Robert Johansson from his “Introduction to scientific computing
with Python,” generously made available under a Creative Commons Attribution 3.0 Unported License at
https://github.com/jrjohansson/scientific-python-lectures. The Continuum Analytics enhancements use the
Creative Commons Attribution-NonCommercial 4.0 International License.
In experimental and theoretical sciences there are well established codes of conduct for how results
and methods are published and made available to other scientists. For example, in theoretical sciences,
derivations, proofs and other results are published in full detail, or made available upon request. Likewise,
in experimental sciences, the methods used and the results are published, and all experimental data should
be available upon request. It is considered unscientific to withhold crucial details of a theoretical proof or
experimental method if doing so would hinder other scientists from replicating and reproducing the results.
In computational sciences there are not yet any well established guidelines for how source code and
generated data should be handled. For example, it is relatively rare that source code used in simulations for
published papers is provided to readers, in contrast to the open nature of experimental and theoretical work.
And it is not uncommon that source code for simulation software is withheld and considered a competitive
advantage (or unnecessary to publish).
However, this issue has recently started to attract increasing attention, and a number of editorials in
high-profile journals have called for increased openness in computational sciences. Some prestigious journals,
including Science, have even started to require authors to provide the source code for simulation software
used in publications to readers upon request.
Discussions are also ongoing on how to facilitate distribution of scientific software, for example as
supplementary materials to scientific papers.
1.1.1 References
• Reproducible Research in Computational Science, Roger D. Peng, Science 334, 1226 (2011).
• Shining Light into Black Boxes, A. Morin et al., Science 336, 159-160 (2012).
• The case for open computer programs, D.C. Ince, Nature 482, 485 (2012).
• Replication: An author of a scientific paper that involves numerical calculations should be able to
rerun the simulations and replicate the results upon request. Other scientists should also be able to
perform the same calculations and obtain the same results, given the information about the methods
used in a publication.
• Reproducibility: The results obtained from numerical simulations should be reproducible with an
independent implementation of the method, or using a different method altogether.
In summary: A sound scientific result should be reproducible, and a sound scientific study should be
replicable.
To achieve these goals, we need to:
• Keep and take note of exactly which source code and version were used to produce data and figures in
published papers.
• Record which versions of external software were used. Keep access to the environment
that was used.
• Make sure that old codes and notes are backed up and kept for future reference.
• Be ready to give additional information about the methods used, and perhaps also the simulation
codes, to an interested reader who requests it (even years after the paper was published!).
• Ideally codes should be published online, to make it easier for other scientists interested in the codes
to access them.
∗ git - http://git-scm.com
∗ mercurial - http://mercurial.selenic.com. Also known as hg.
∗ subversion - http://subversion.apache.org. Also known as svn.
• Online repositories for source code. Available as both private and public repositories.
– Some good alternatives are
∗ Github - http://www.github.com
∗ Bitbucket - http://www.bitbucket.com
∗ Privately hosted repositories on the university’s or department’s servers.
Note: Repositories are also excellent for version controlling manuscripts, figures, thesis files, data files, lab
logs, etc. (basically any digital content that must be preserved and is frequently updated). Again, both
public and private repositories are readily available. They are also excellent collaboration tools!
• clean and simple language: Easy-to-read and intuitive code, easy-to-learn minimalistic syntax,
maintainability scales well with size of projects.
• expressive language: Fewer lines of code, fewer bugs, easier to maintain.
Technical details:
• dynamically typed: No need to define the type of variables, function arguments or return types.
• automatic memory management: No need to explicitly allocate and deallocate memory for vari-
ables and data arrays. No memory leak bugs.
• interpreted: No need to compile the code. The Python interpreter reads and executes the Python
code directly.
Advantages:
• The main advantage is ease of programming, minimizing the time required to develop, debug and
maintain the code.
• Well designed language that encourages many good programming practices:
• Modular and object-oriented programming, good system for packaging and re-use of code. This often
results in more transparent, maintainable and bug-free code.
• Documentation tightly integrated with the code.
• A large standard library, and a large collection of add-on packages.
Disadvantages:
• Since Python is an interpreted and dynamically typed programming language, the execution of Python
code can be slow compared to compiled statically typed programming languages, such as C and Fortran.
• Somewhat decentralized, with different environments, packages and documentation spread out at
different places. Can make it harder to get started.
Figure 1.2: Optimizing what
• Great performance due to close integration with time-tested and highly optimized codes written in C
and Fortran:
Figure 1.3: Scientific Python Stack
$ python my-program.py
We can also start the interpreter by simply typing python at the command line, and interactively type
Python code into the interpreter.
This is often how we want to work when developing scientific applications, or when doing small calcula-
tions. But the standard Python interpreter is not very convenient for this kind of work, due to a number of
limitations.
1.4.4 IPython
IPython is an interactive shell that addresses the limitations of the standard Python interpreter, and it is a
workhorse for scientific use of Python. It provides an interactive prompt to the Python interpreter with
greatly improved user-friendliness.
• Command history, which can be browsed with the up and down arrows on the keyboard.
• Tab auto-completion.
• In-line editing of code.
• Object introspection, and automatic extraction of documentation strings from Python objects like classes
and functions.
• Good interaction with operating system shell.
• Support for multiple parallel back-end processes, that can run on computing clusters or cloud services
like Amazon EC2.
$ ipython notebook
from a directory where you want the notebooks to be stored. This will open a new browser window (or
a new tab in an existing window) with an index page where existing notebooks are shown and from which
new notebooks can be created.
Figure 1.6: IPython notebook
1.4.6 Spyder
Spyder is a MATLAB-like IDE for scientific computing with Python. It has the many advantages of a
traditional IDE environment: for example, code editing, execution and debugging are all
carried out in a single environment, and work on different calculations can be organized as projects in the
IDE environment.
Some advantages of Spyder:
• Powerful code editor, with syntax highlighting, dynamic code introspection and integration with the
Python debugger.
• Variable explorer, IPython command prompt.
• Integrated documentation and help.
% python --version
Python 3.4.3 :: Anaconda 2.3.0 (x86_64)
% python2 --version
Python 2.7.10
Figure 1.7: Spyder screenshot
1.6 Installation
1.6.1 Linux
In Ubuntu Linux, to install Python and all the requirements for these lectures, run:
To use the Anaconda Python distribution that includes a very large range of scientific libraries pre-
compiled (including all of those required for these lectures), you can run:
% wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
% bash Miniconda3-latest-Linux-x86_64.sh
% conda install anaconda
1.6.2 MacOS X
Python is included by default in Mac OS X, but OS releases typically lag the latest Python versions. To use
the Anaconda Python distribution that includes a very large range of scientific libraries pre-compiled, you
can run these commands in a terminal:
% wget https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
% bash Miniconda3-latest-MacOSX-x86_64.sh
% conda install anaconda
1.6.3 Windows
Windows lacks a good packaging system, so the easiest way to set up a Python environment is to install a
pre-packaged distribution. Some good alternatives are:
• Anaconda. The Anaconda Python distribution comes with many scientific computing and data science
packages and is free, including for commercial use and redistribution. It also has add-on products such
as Accelerate, IOPro, and MKL Optimizations, which have free trials and are free for academic use.
• Python.org. Official distribution from the creators of Python. The tools pip (included with recent
versions) or conda may be used to install additional packages.
• Enthought Python Distribution. EPD is a commercial product but is available free for academic use.
Chapter 2
Introduction to Python programming
myprogram.py
• Every line in a Python program file is assumed to be a Python statement, or part thereof.
– The only exception is comment lines, which start with the character # (optionally preceded by
an arbitrary number of white-space characters, i.e., tabs or spaces). Comment lines are usually
ignored by the Python interpreter.
$ python myprogram.py
• On UNIX systems, it is common to define the path to the interpreter on the first line of the program.
Note that this is a comment line as far as the Python interpreter is concerned:
#!/usr/bin/env python
If we do, and if we additionally set the script file to be executable, we can run the program like this
(assuming its directory is on the PATH; otherwise prefix the name with ./):
$ myprogram.py
2.1.1 Example:
In [1]: ls scripts/hello-world*.py
scripts/hello-world-in-swedish.py scripts/hello-world.py
#!/usr/bin/env python
print("Hello world!")
Hello world!
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
print("Hej världen!")
Hej världen!
Other than these two optional lines in the beginning of a Python code file, no additional code is required
for initializing a program.
2.3 Modules
Most of the functionality in Python is provided by modules. The Python Standard Library is a large collection
of modules that provides cross-platform implementations of common facilities such as access to the operating
system, file I/O, string management, network communication, and much more.
2.3.1 References
• The Python Language Reference: http://docs.python.org/2/reference/index.html
• The Python Standard Library: http://docs.python.org/2/library/
To use a module in a Python program, it first has to be imported. A module can be imported using the
import statement. For example, to import the module math, which contains many standard mathematical
functions, we can do:
This includes the whole module and makes it available for use later in the program. For example, we can
do:
1.0
Alternatively, we can choose to import all symbols (functions and variables) in a module to the current
namespace (so that we don’t need to use the prefix “math.” every time we use something from the math
module):
1.0
This pattern can be very convenient, but in large programs that include many modules, it is often a good
idea to keep the symbols from each module in their own namespaces, by using the import math pattern.
This would eliminate potentially confusing problems with namespace collisions.
As a third alternative, we can choose to import only a few selected symbols from a module by explicitly
listing which ones we want to import instead of using the wildcard character *:
1.0
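The import patterns described above can be sketched as follows. This is a minimal illustration, not necessarily the exact code of the original lectures; the variable names x and y are chosen here for the example, and the wildcard form is shown only as a comment.

```python
import math                    # pattern 1: keep symbols in the math namespace

x = math.cos(2 * math.pi)
print(x)                       # 1.0

# pattern 2 (wildcard) would be written:  from math import *
# It is omitted here because it pulls every public name into our namespace.

from math import cos, pi       # pattern 3: import only the listed symbols

y = cos(2 * pi)
print(y)                       # 1.0
```

In larger programs the first and third forms make it easier to see where a name comes from.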
Symbols provided by the math module:
__doc__, __file__, __loader__, __name__, __package__, __spec__, acos, acosh, asin, asinh, atan,
atan2, atanh, ceil, copysign, cos, cosh, degrees, e, erf, erfc, exp, expm1, fabs, factorial, floor,
fmod, frexp, fsum, gamma, hypot, isfinite, isinf, isnan, ldexp, lgamma, log, log10, log1p, log2,
modf, pi, pow, radians, sin, sinh, sqrt, tan, tanh, trunc
And using the function help, we can get a description of each function (almost; not all functions have
docstrings, as they are technically called, but the vast majority of functions are documented this way).
In [11]: help(math.log)
log(...)
log(x[, base])
In [12]: log(10)
Out[12]: 2.302585092994046
In [13]: log(10, 2)
Out[13]: 3.3219280948873626
help(math)
Some very useful modules from the Python standard library are os, sys, math, shutil, re, subprocess,
multiprocessing, threading.
A complete list of standard modules for Python 2 and Python 3 is available at
http://docs.python.org/2/library/ and http://docs.python.org/3/library/, respectively.
and, as, assert, break, class, continue, def, del, elif, else, except,
exec, finally, for, from, global, if, import, in, is, lambda, not, or,
pass, print, raise, return, try, while, with, yield
Note: Be aware of the keyword lambda, which could easily be a natural variable name in a scientific
program. But being a keyword, it cannot be used as a variable name.
2.4.2 Assignment
The assignment operator in Python is =. Python is a dynamically typed language, so we do not need to
specify the type of a variable when we create one.
Assigning a value to a new variable creates the variable:
Although not explicitly specified, a variable does have a type associated with it. The type is derived from
the value that was assigned to it.
In [15]: type(x)
Out[15]: float
In [16]: x = 1
In [17]: type(x)
Out[17]: int
If we try to use a variable that has not yet been defined, we get a NameError:
In [18]: try:
    print(y)
except NameError as e:
    print(repr(e))
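A short sketch of the assignment behaviour described in this section: assigning creates a variable, its type follows the assigned value, and using an unassigned name raises NameError. The name undefined_name is hypothetical and deliberately never assigned.

```python
x = 1.0                      # assignment creates x; the type comes from the value
print(type(x))               # <class 'float'>

x = 1                        # rebinding x to an int changes its type
print(type(x))               # <class 'int'>

try:
    print(undefined_name)    # hypothetical name that was never assigned
except NameError as e:
    message = repr(e)
    print(message)
```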
2.4.3 Fundamental types
In [19]: # integers
x = 1
type(x)
Out[19]: int
In [20]: # float
x = 1.0
type(x)
Out[20]: float
In [21]: # boolean
b1 = True
b2 = False
type(b1)
Out[21]: bool
In [22]: # complex numbers: note the use of 'j' to specify the imaginary part
x = 1.0 - 1.0j
type(x)
Out[22]: complex
In [23]: print(x)
(1-1j)
1.0 -1.0
Names defined in the types module:
BuiltinFunctionType, BuiltinMethodType, CodeType, DynamicClassAttribute, FrameType, FunctionType,
GeneratorType, GetSetDescriptorType, LambdaType, MappingProxyType, MemberDescriptorType,
MethodType, ModuleType, SimpleNamespace, TracebackType, __builtins__, __cached__, __doc__,
__file__, __loader__, __name__, __package__, __spec__, _calculate_meta, new_class, prepare_class
In [26]: x = 1.0
# check if the variable x is a float
type(x) is float
Out[26]: True
Out[27]: False
We can also use the isinstance function for testing types of variables:
Out[28]: True
In [30]: x = int(x)
print(x, type(x))
1 <class 'int'>
In [31]: z = complex(x)
print(z, type(z))
In [32]: try:
    x = float(z)
except TypeError as e:
    print(repr(e))
Complex variables cannot be cast to floats or integers. We need to use z.real or z.imag to extract the
part of the complex number we want:
In [33]: y = bool(z.real)
print(z.real, " -> ", y, type(y))
y = bool(z.imag)
print(z.imag, " -> ", y, type(y))
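The type checks and casts discussed above can be put together in one small sketch. The variable names n, cast_error and real_part are introduced here for illustration only.

```python
x = 1.0
print(type(x) is float)          # True
print(type(x) is int)            # False
print(isinstance(x, float))      # True

n = int(x)                       # float -> int
z = complex(n)                   # int -> complex
print(n, z)                      # 1 (1+0j)

try:
    float(z)                     # a complex number cannot be cast to float
except TypeError as e:
    cast_error = repr(e)
    print(cast_error)

real_part = z.real               # extract the part we want instead
print(real_part, bool(z.real), bool(z.imag))   # 1.0 True False
```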
In [34]: 1 + 2, 1 - 2, 1 * 2, 1 / 2
Out[36]: 1.0
Out[37]: 4
Note: The / operator always performs a floating point division in Python 3.x. This is not true in Python
2.x, where the result of / is always an integer if the operands are integers. To be more specific, 1/2 = 0.5
(float) in Python 3.x, and 1/2 = 0 (int) in Python 2.x (but 1.0/2 = 0.5 in Python 2.x).
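As a quick check of the note above, here is a sketch assuming Python 3. The floor-division operator // is standard Python but is not shown in the surviving text of this section; it gives the integer result in both Python 2 and Python 3.

```python
true_div = 1 / 2        # true division: always a float in Python 3
floor_div = 1 // 2      # floor division: integer result for integer operands
print(true_div)         # 0.5
print(floor_div)        # 0
print(-7 // 2)          # -4 (rounds toward negative infinity)
print(7 % 2)            # 1  (remainder)
print(2 ** 3)           # 8  (power operator)
```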
• The boolean operators are spelled out as the words and, not, or.
Out[38]: False
Out[39]: True
Out[40]: True
• Comparison operators >, <, >= (greater or equal), <= (less or equal), == (equality), is (identity).
In [43]: 2 >= 2, 2 <= 2
In [44]: # equality
[1,2] == [1,2]
Out[44]: True
Out[45]: True
Out[46]: str
Out[47]: 11
Hello test
In [49]: s[0]
Out[49]: 'H'
In [50]: s[0:5]
Out[50]: 'Hello'
In [51]: s[4:5]
Out[51]: 'o'
If we omit either (or both) of start or stop from [start:stop], the default is the beginning and the
end of the string, respectively:
In [52]: s[:5]
Out[52]: 'Hello'
In [53]: s[6:]
Out[53]: 'world'
In [54]: s[:]
We can also define the step size using the syntax [start:end:step] (the default value for step is 1, as
we saw above):
In [55]: s[::1]
In [56]: s[::2]
Out[56]: 'Hlowrd'
This technique is called slicing. Read more about the syntax here:
http://docs.python.org/release/2.7.3/library/functions.html?highlight=slice#slice
Python has a very rich set of functions for text processing. See for example
http://docs.python.org/2/library/string.html for more information.
In [58]: print("str1", 1.0, False, -1j) # print converts all arguments to strings
In [59]: print("str1" + "str2" + "str3") # strings added with + are concatenated without space
str1str2str3
value = 1.000000
print(s2)
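The string-formatting cells are incomplete in this copy. A minimal sketch of both the C-style % operator and the str.format method follows; the format strings and values are illustrative assumptions, not the original cell contents.

```python
value = 1.0
s1 = "value = %f" % value                         # C-style formatting
print(s1)                                         # value = 1.000000

s2 = "value1 = %.2f, value2 = %d" % (3.1415, 2)   # several values via a tuple
print(s2)                                         # value1 = 3.14, value2 = 2

s3 = "value1 = {0:.2f}, value2 = {1}".format(3.1415, 2)   # str.format alternative
print(s3)                                         # value1 = 3.14, value2 = 2
```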
2.6.2 List
Lists are very similar to strings, except that each element can be of any type.
The syntax for creating lists in Python is [...]:
In [63]: l = [1,2,3,4]
print(type(l))
print(l)
<class 'list'>
[1, 2, 3, 4]
We can use the same slicing techniques to manipulate lists as we could use on strings:
In [64]: print(l)
print(l[1:3])
print(l[::2])
[1, 2, 3, 4]
[2, 3]
[1, 3]
In [65]: l[0]
Out[65]: 1
print(l)
Lists play a very important role in Python. For example, they are used in loops and other flow control
structures (discussed below). There are a number of convenient functions for generating lists of various
types; for example, the range function:
In [68]: start = 10
stop = 30
step = 2
range(start, stop, step)
In [69]: # in Python 3, range returns a lazy range object, which can be converted to a list using 'list(...)'.
# In Python 2, range already returns a list, so list(...) has no effect
list(range(start, stop, step))
Out[69]: [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
In [70]: list(range(-10, 10))
Out[70]: [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [71]: s
Out[72]: ['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
[' ', 'H', 'd', 'e', 'l', 'l', 'l', 'o', 'o', 'r', 'w']
We can modify lists by assigning new values to elements in the list. In technical jargon, lists are mutable.
In [78]: l.remove("A")
print(l)
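Because lists are mutable, they can be changed in place. The sketch below uses only built-in list operations; the list name letters and its contents are chosen for illustration.

```python
letters = []
letters.append("A")        # add elements at the end
letters.append("d")
letters.append("d")
print(letters)             # ['A', 'd', 'd']

letters[1] = "p"           # assign new values to existing elements
letters[2] = "p"
print(letters)             # ['A', 'p', 'p']

letters.insert(0, "i")     # insert at a given index
print(letters)             # ['i', 'A', 'p', 'p']

letters.remove("A")        # remove the first element with this value
print(letters)             # ['i', 'p', 'p']

del letters[0]             # delete by index
print(letters)             # ['p', 'p']
```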
2.6.3 Tuples
Tuples are like lists, except that they cannot be modified once created; that is, they are immutable.
In Python, tuples are created using the syntax (..., ..., ...), or even ..., ...:
In [82]: x, y = point
print("x =", x)
print("y =", y)
x = 10
y = 20
In [83]: try:
    point[0] = 20
except TypeError as e:
    print(repr(e))
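A compact sketch of the tuple behaviour described in this section: both creation syntaxes, unpacking, and the TypeError raised on assignment. The values match the output shown above; also_point and tuple_error are names introduced here.

```python
point = (10, 20)           # tuple with parentheses
also_point = 10, 20        # the parentheses are optional
print(point == also_point) # True

x, y = point               # unpack into separate variables
print("x =", x)            # x = 10
print("y =", y)            # y = 20

try:
    point[0] = 20          # tuples are immutable
except TypeError as e:
    tuple_error = repr(e)
    print(tuple_error)
```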
2.6.4 Dictionaries
Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is
{key1 : value1, ...}:
print(type(params))
print(params)
<class 'dict'>
{'parameter3': 3.0, 'parameter2': 2.0, 'parameter1': 1.0}
parameter1 = 1.0
parameter2 = 2.0
parameter3 = 3.0
parameter1 = A
parameter2 = B
parameter3 = 3.0
parameter4 = D
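The cells that produced the dictionary output above are missing from this copy. The following sketch, an assumption about what they did, reproduces the same printed values by creating a dictionary, reading entries, updating them, and adding a new key.

```python
params = {"parameter1": 1.0,
          "parameter2": 2.0,
          "parameter3": 3.0}
print("parameter1 = " + str(params["parameter1"]))   # parameter1 = 1.0

params["parameter1"] = "A"     # overwrite existing entries
params["parameter2"] = "B"
params["parameter4"] = "D"     # assigning to a new key adds an entry

for key in sorted(params):     # sorted() gives a predictable key order
    print(key + " = " + str(params[key]))
```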
if statement1:
    print("statement1 is True")
elif statement2:
    print("statement2 is True")
else:
    print("statement1 and statement2 are False")
For the first time, here we encounter a peculiar and unusual aspect of the Python programming language:
Program blocks are defined by their indentation level.
Compare to the equivalent C code:
if (statement1)
{
    printf("statement1 is True\n");
}
else if (statement2)
{
    printf("statement2 is True\n");
}
else
{
    printf("statement1 and statement2 are False\n");
}
In C, blocks are defined by enclosing them in curly braces { and }. The level of indentation (spaces or a
tab before the code statements) does not have an effect; it’s just optional formatting.
But in Python, the extent of a code block is defined by its indentation level, denoted with a tab or, more
commonly, four spaces. This means that we have to be careful to indent our code correctly, or else we will
get syntax errors.
Examples:
In [88]: statement1 = statement2 = True
if statement1:
    if statement2:
        print("both statement1 and statement2 are True")
# Bad indentation!
if statement1:
    if statement2: # next line is not properly indented
    print("both statement1 and statement2 are True")
if statement1:
    print("printed if statement1 is True")
In [90]: if statement1:
    print("printed if statement1 is True")
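To make the role of indentation concrete, here is a small runnable sketch. The variable branch and the message strings are illustrative; the point is that only the indented lines belong to each block.

```python
statement1 = False
statement2 = True

if statement1:
    branch = "statement1 is True"
elif statement2:
    branch = "statement2 is True"
else:
    branch = "statement1 and statement2 are False"
print(branch)                                  # statement2 is True

if statement1:
    print("printed if statement1 is True")     # inside the if block: skipped
    print("still inside the if block")         # same indentation, same block
print("now outside the if block")              # dedented, so it always runs
```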
2.8 Loops
In Python, loops can be programmed in a number of different ways. The most common is the for loop,
which is used together with iterable objects, such as lists. The basic syntax is:
1
2
3
The for loop iterates over the elements of the supplied list, and executes the containing block once for
each element. Any kind of list can be used in the for loop. For example:
0
1
2
3
-3
-2
-1
0
1
2
scientific
computing
with
python
parameter3 = 3.0
parameter4 = D
parameter2 = B
parameter1 = A
Sometimes it is useful to have access to the indices of the values when iterating over a list. We can use
the enumerate function for this:
0 -3
1 -2
2 -1
3 0
4 1
5 2
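The loop cells themselves did not survive in this copy, only their output. The sketch below shows for loops of the kinds described (over a list, a range, a list of words, dictionary items, and with enumerate); the exact original cells are an assumption, and the pairs list is added here only to collect results.

```python
for x in [1, 2, 3]:                  # iterate over a list
    print(x)

for x in range(-3, 3):               # range(-3, 3) stops before 3
    print(x)

for word in ["scientific", "computing", "with", "python"]:
    print(word)

params = {"parameter1": "A", "parameter2": "B"}
for key, value in params.items():    # iterate over key-value pairs
    print(key + " = " + str(value))

pairs = []
for idx, x in enumerate(range(-3, 3)):   # enumerate yields (index, value)
    print(idx, x)
    pairs.append((idx, x))
```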
2.8.2 Using Lists: Creating lists using for loops:
A convenient and compact way to initialize lists:
[0, 1, 4, 9, 16]
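A list comprehension that yields the output above can be sketched as follows, together with the equivalent explicit loop for comparison. The names squares and squares_loop are chosen for this example.

```python
squares = [x**2 for x in range(0, 5)]   # list comprehension
print(squares)                          # [0, 1, 4, 9, 16]

squares_loop = []                       # the same result with an explicit loop
for x in range(0, 5):
    squares_loop.append(x**2)
print(squares_loop == squares)          # True
```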
0
1
2
3
4
done
Note that the print("done") statement is not part of the while loop body because of the difference in
indentation.
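A while loop consistent with the output shown above might look like this (a sketch; the loop variable i is an assumption):

```python
i = 0
while i < 5:         # the indented lines form the loop body
    print(i)
    i = i + 1
print("done")        # not indented, so it runs once after the loop ends
```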
2.9 Functions
A function in Python is defined using the keyword def, followed by a function name, a signature within
parentheses (), and a colon :. The following code, with one additional level of indentation, is the function
body.
In [100]: func0()
test
Optional, but highly recommended: Define a so-called “docstring” — a description of the function’s
purpose and behavior. The docstring should follow directly after the function definition, before the code in
the function body.
In [102]: help(func1)
func1(s)
Print a string 's' and tell how many characters it has
In [103]: func1("test")
test has 4 characters
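The definitions of func0 and func1 are missing from this copy. Based on the calls and the help output shown above, they could be sketched like this; the function bodies are reconstructions, not necessarily the original code.

```python
def func0():
    print("test")


def func1(s):
    """
    Print a string 's' and tell how many characters it has
    """
    print(s + " has " + str(len(s)) + " characters")


func0()                  # test
func1("test")            # test has 4 characters
print(func1.__doc__)     # the docstring is stored on the function object
```

The help function reads the same __doc__ attribute, which is why a docstring placed directly after the def line shows up in help(func1).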
Functions that return a value use the return keyword:
In [104]: def square(x):
    """
    Return the square of x.
    """
    return x ** 2
In [105]: square(4)
Out[105]: 16
We can return multiple values from a function using tuples (see above):
In [106]: def powers(x):
    """
    Return a few powers of x.
    """
    return x ** 2, x ** 3, x ** 4
In [107]: powers(3)
Out[107]: (9, 27, 81)
In [108]: x2, x3, x4 = powers(3)
print(x3)
27
2.9.2 Unnamed functions (lambda function)
In Python we can also create unnamed functions using the lambda keyword:
# is equivalent to
def f2(x):
    return x**2
Out[114]: (4, 4)
This technique is useful, for example, when we want to pass a simple function as an argument to another
function, like this:
In [116]: # in Python 3 we can use 'list(...)' to convert the iterator to an explicit list
list(map(lambda x: x**2, range(-3,4)))
Out[116]: [9, 4, 1, 0, 1, 4, 9]
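A self-contained sketch of the lambda examples follows. The definition of f1 is inferred from the "is equivalent to" comment above, and the sorted call is an extra illustration that is not in the original text.

```python
f1 = lambda x: x**2          # unnamed function bound to the name f1

def f2(x):                   # equivalent named function
    return x**2

print(f1(2), f2(2))          # 4 4

squares = list(map(lambda x: x**2, range(-3, 4)))
print(squares)               # [9, 4, 1, 0, 1, 4, 9]

by_abs = sorted([-3, 1, -2], key=lambda v: abs(v))   # lambda as a sort key
print(by_abs)                # [1, -2, -3]
```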
2.10 Classes
Classes are the key feature of object-oriented programming. A class is a structure for representing an object
and the operations that can be performed on the object.
In Python, a class can contain attributes (variables) and methods (functions).
A class is defined almost like a function, but using the class keyword, and the class definition usually
contains a number of class method definitions (a function in a class).
• Each class method should have an argument self as its first argument. This object is a self-reference.
– __init__: The name of the method that is invoked when the object is first created.
– __str__: A method that is invoked when a simple string representation of the class is needed, as
for example when printed.
– There are many more; see http://docs.python.org/2/reference/datamodel.html#special-method-names
        """
        Translate the point by dx and dy in the x and y direction.
        """
        self.x += dx
        self.y += dy

    def __str__(self):
        return("Point at [%f, %f]" % (self.x, self.y))
In [118]: p1 = Point(0, 0) # this will invoke the __init__ method in the Point class
In [119]: p2 = Point(1, 1)
p1.translate(0.25, 1.5)
print(p1)
print(p2)
Note that calling class methods can modify the state of that particular class instance, but does not affect
other class instances or any global variables.
That is one of the nice things about object-oriented design: code such as functions and related variables
are grouped in separate and independent entities.
2.11 Modules
One of the most important concepts in good programming is to reuse code and avoid repetitions.
The idea is to write functions and classes with a well-defined purpose and scope, and reuse these instead
of repeating similar code in different parts of a program (modular programming). This improves readability
and maintainability of your programs. In practice, your programs have fewer bugs, and are easier to extend
and debug/troubleshoot.
Python supports modular programming at different levels. Functions and classes are examples of tools
for low-level modular programming. Python modules are a higher-level modular programming construct,
where we can collect related variables, functions and classes in a module. A Python module is defined in a
Python file (with file-ending .py), and can be made accessible to other Python modules and programs using
the import statement.
The following example, mymodule.py, contains simple example implementations of a variable, function
and a class:
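In [120]: %%file mymodule.py
          """
          Example of a Python module. Contains a variable called my_variable,
          a function called my_function, and a class called MyClass.
          """

          my_variable = 0

          def my_function():
              """
              Example function
              """
              return my_variable

          class MyClass:
              """
              Example class.
              """
              def __init__(self):
                  self.variable = my_variable

              def set_variable(self, new_value):
                  """
                  Set self.variable to a new value
                  """
                  self.variable = new_value

              def get_variable(self):
                  return self.variable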
Writing mymodule.py
We can import the module mymodule into our Python program using import:
In [121]: import mymodule
Use help(module) to get a summary of what the module provides:
In [122]: help(mymodule)
NAME
mymodule
DESCRIPTION
Example of a Python module. Contains a variable called my_variable,
a function called my_function, and a class called MyClass.
CLASSES
builtins.object
MyClass
class MyClass(builtins.object)
| Example class.
|
| Methods defined here:
|
| __init__(self)
|
| get_variable(self)
|
| set_variable(self, new_value)
| Set self.variable to a new value
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
FUNCTIONS
my_function()
Example function
DATA
my_variable = 0
FILE
/Users/dmertz/Drive/Modules/scientific-python-lectures/mymodule.py
In [123]: mymodule.my_variable
Out[123]: 0
In [124]: mymodule.my_function()
Out[124]: 0
In [125]: my_class = mymodule.MyClass()
          my_class.set_variable(10)
          my_class.get_variable()
Out[125]: 10
2.12 Exceptions
In Python, errors are managed with a special language construct called “Exceptions”. When errors occur,
exceptions can be raised, which interrupts the normal program flow and falls back to the point in the code
where the closest try-except statement is defined.
To generate an exception, we can use the raise statement, which takes an argument that must be an
instance of the class BaseException or a class derived from it.
In [127]: try:
              raise Exception("description of the error")
          except Exception as e:
              print(repr(e))
A typical use of exceptions is to abort functions when some error condition occurs. For example:
def my_function(arguments):
    if not verify(arguments):
        raise Exception("Invalid arguments")
To gracefully catch errors that are generated by functions and class methods, or by the Python interpreter
itself, use the try and except statements:
try:
    # normal code goes here
except:
    # code for error handling goes here
    # this code is not executed unless the code
    # above generated an error
For example:
In [128]: try:
              print("test")
              # generate an error: the variable test is not defined
              print(test)
          except:
              print("Caught an exception")
test
Caught an exception
To get information about the error, we can access the Exception class instance that describes the
exception by using, for example:
except Exception as e:
In [129]: try:
              print("test")
              # generate an error: the variable test is not defined
              print(test)
          except Exception as e:
              print("Caught an exception:" + str(e))
test
Caught an exception:name 'test' is not defined
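In practice it is often clearer to raise and catch a specific exception class rather than the generic Exception. The following is a minimal sketch; InvalidArgumentsError, safe_sqrt and describe are hypothetical names chosen for illustration:

```python
class InvalidArgumentsError(ValueError):
    """Hypothetical exception type raised when arguments are rejected."""


def safe_sqrt(x):
    # Abort the function with a descriptive exception on invalid input.
    if x < 0:
        raise InvalidArgumentsError("x must be non-negative, got %r" % (x,))
    return x ** 0.5


def describe(x):
    # Catch only the exception type we know how to handle.
    try:
        return "ok: %f" % safe_sqrt(x)
    except InvalidArgumentsError as e:
        return "caught: " + str(e)


log = []
try:
    safe_sqrt(-1)
except InvalidArgumentsError as e:
    log.append("caught: " + str(e))
finally:
    # The finally clause runs whether or not an exception occurred.
    log.append("cleanup")

print(describe(4.0))
print(describe(-1))
```

Catching a narrow exception type avoids silently hiding unrelated errors, which a bare except: would also swallow.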
Chapter 3
Numpy - multidimensional data arrays
This curriculum builds on material by J. Robert Johansson from his “Introduction to scientific computing
with Python,” generously made available under a Creative Commons Attribution 3.0 Unported License at
https://github.com/jrjohansson/scientific-python-lectures. The Continuum Analytics enhancements use the
Creative Commons Attribution-NonCommercial 4.0 International License.
3.1 Introduction
The numpy package (module) is used in almost all numerical computation using Python. Numpy provides
high-performance vector, matrix and higher-dimensional data structures for Python. It is implemented in
C and Fortran, so when calculations are vectorized (formulated with vectors and matrices), performance is
very good.
To use numpy, you need to import the module. For example:
In the numpy package, the terminology used for vectors, matrices and higher-dimensional data sets is
array.
In [3]: # a vector: the argument to the array function is a Python list
        v = array([1, 2, 3, 4])
        v
Out[3]: array([1, 2, 3, 4])
In [4]: # a matrix: the argument to the array function is a nested Python list
        M = array([[1, 2], [3, 4]])
        M
The v and M objects are both of the type ndarray that the numpy module provides.
The difference between the v and M arrays is only their shapes. We can get information about the shape
of an array by using the ndarray.shape property.
In [6]: v.shape
Out[6]: (4,)
In [7]: M.shape
Out[7]: (2, 2)
The number of elements in the array is available through the ndarray.size property:
In [8]: M.size
Out[8]: 4
In [9]: shape(M)
Out[9]: (2, 2)
In [10]: size(M)
Out[10]: 4
So far the numpy.ndarray looks very much like a Python list (or nested list). Why not simply use Python
lists for computations, instead of creating a new array type?
There are several reasons:
• Python lists are very general. They can contain any kind of object. They are dynamically typed. They
do not support mathematical functions such as matrix and dot multiplications. Implementing such
functions for Python lists would not be very efficient because of the dynamic typing.
• Numpy arrays are statically typed and homogeneous. The type of the elements is determined when
the array is created.
• Numpy arrays are memory efficient.
• Because of the static typing, fast implementation of mathematical functions such as multiplication and
addition of numpy arrays can be implemented in a compiled language (C and Fortran are used).
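The points above can be illustrated with a minimal sketch. It uses the explicit import numpy as np convention, which is an assumption here; the examples in this chapter call array and friends without a prefix:

```python
import numpy as np

# For Python lists, "+" concatenates ...
py_list = [1, 2] + [3, 4]

# ... whereas for NumPy arrays, "+" adds element by element.
np_sum = np.array([1, 2]) + np.array([3, 4])

# Arrays are homogeneous: mixing ints and floats upcasts everything to float.
mixed = np.array([1, 2.5])

# Every element has the same fixed size, so storage is compact.
ints = np.array([1, 2, 3], dtype=np.int64)

print(py_list, np_sum, mixed.dtype, ints.itemsize, ints.nbytes)
```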
Using the dtype (data type) property of an ndarray, we can see what type the data of an array has:
In [11]: M.dtype
Out[11]: dtype('int64')
We get an error if we try to assign a value of the wrong type to an element in a numpy array:
In [12]: try:
             M[0,0] = "hello"
         except ValueError as e:
             print(repr(e))
ValueError("invalid literal for int() with base 10: 'hello'",)
If we want, we can explicitly define the type of the array data when we create it, using the dtype keyword
argument:
In [13]: M = array([[1, 2], [3, 4]], dtype=complex)
         M
Out[13]: array([[ 1.+0.j, 2.+0.j],
                [ 3.+0.j, 4.+0.j]])
Common data types that can be used with dtype are: int, float, complex, bool, object, etc.
We can also explicitly define the bit size of the data types, for example: int64, int16, float128,
complex128.
arange
In [14]: # create a range
         x = arange(0, 10, 1) # arguments: start, stop, step
         x
Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [15]: x = arange(-1, 1, 0.1)
         x
Out[15]: array([ -1.00000000e+00, -9.00000000e-01, -8.00000000e-01,
-7.00000000e-01, -6.00000000e-01, -5.00000000e-01,
-4.00000000e-01, -3.00000000e-01, -2.00000000e-01,
-1.00000000e-01, -2.22044605e-16, 1.00000000e-01,
2.00000000e-01, 3.00000000e-01, 4.00000000e-01,
5.00000000e-01, 6.00000000e-01, 7.00000000e-01,
8.00000000e-01, 9.00000000e-01])
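Note the entry -2.22044605e-16 where an exact 0 might be expected: with a non-integer step, arange accumulates floating-point rounding error, and the stop value is excluded. When the number of points is known, linspace is often a more predictable choice. A minimal sketch, assuming the import numpy as np convention:

```python
import numpy as np

# 21 evenly spaced points from -1 to 1, both end points included.
x = np.linspace(-1, 1, 21)

print(len(x), x[0], x[-1])
```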
In [17]: logspace(0, 10, 10, base=math.e)
mgrid
In [18]: x, y = mgrid[0:5, 0:5] # similar to meshgrid in MATLAB
In [19]: x
In [20]: y
random data
In [21]: from numpy import random
diag
In [24]: # a diagonal matrix
         diag([1,2,3])
In [25]: # diagonal with offset from the main diagonal
         diag([1,2,3], k=1)
In [27]: ones((3,3))
In [30]: data.shape
Out[30]: (77431, 7)
Using numpy.savetxt, we can store a Numpy array to a file in CSV format:
In [32]: M = rand(3,3)
In [33]: savetxt("random-matrix.csv", M)
         !cat random-matrix.csv
In [36]: save("random-matrix.npy", M)
         !file random-matrix.npy
random-matrix.npy: data
In [37]: load("random-matrix.npy")
3.4 More properties of the numpy arrays
In [38]: M.itemsize # bytes per element
Out[38]: 8
Out[39]: 72
Out[40]: 2
In [41]: # v is a vector, and has only one dimension, taking one index
         v[0]
Out[41]: 1
Out[42]: 0.9811806531134164
If we omit an index of a multidimensional array, it returns the whole row (or, in general, an
(N-1)-dimensional array):
In [43]: M
In [44]: M[1]
In [47]: M[0,0] = 1
In [48]: M
Out[48]: array([[ 1. , 0.95956617, 0.29935288],
[ 0.84995869, 0.98118065, 0.60505178],
[ 0.55596909, 0.89157525, 0.42748148]])
In [50]: M
In [51]: A = array([1,2,3,4,5])
         A
In [52]: A[1:3]
Array slices are mutable: if they are assigned a new value, the original array from which the slice was
extracted is modified:
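A minimal sketch of this behavior, assuming the import numpy as np convention (the variable name view is chosen for illustration):

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

# Assigning through a slice changes A itself.
A[1:3] = [-2, -3]

# A slice is a view on the same data, not a copy ...
view = A[::2]
# ... so writing to the view also writes to A.
view[0] = 100

print(A)
```

This differs from Python lists, where slicing creates a new list.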
In [54]: A[::] # lower, upper, step all take the default values
In [55]: A[::2] # step is 2, lower and upper defaults to the beginning and end of the array
Negative indices count from the end of the array (positive index from the beginning):
In [58]: A = array([1,2,3,4,5])
Out[59]: 5
Index slicing works exactly the same way for multidimensional arrays:
In [63]: # strides
         A[::2, ::2]
In [65]: col_indices = [1, 2, -1] # remember, index -1 means the last element
         A[row_indices, col_indices]
We can also use index masks: if the index mask is a NumPy array of data type bool, then an element
is selected (True) or not (False) depending on the value of the index mask at the position of each element:
Out[67]: array([0, 2])
This feature is very useful to conditionally select elements from an array, for example, by using comparison
operators:
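A minimal sketch of selecting with a comparison-based mask, assuming the import numpy as np convention. The array x and the bounds 5 and 7.5 are chosen here so that the mask resembles the output shown below:

```python
import numpy as np

x = np.arange(0, 10, 0.5)

# Each comparison yields a boolean array; "&" combines them element-wise.
mask = (5 < x) & (x < 7.5)

# Indexing with the mask keeps only the elements where it is True.
selected = x[mask]

print(mask.sum(), selected)
```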
Out[70]: array([False, False, False, False, False, False, False, False, False,
False, False, True, True, True, True, False, False, False,
False, False], dtype=bool)
In [71]: x[mask]
3.6 Functions for extracting data from arrays and creating arrays
3.6.1 where
The index mask can be converted to position index using the where function:
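A minimal sketch, assuming the import numpy as np convention and an illustrative array x:

```python
import numpy as np

x = np.arange(0, 10, 0.5)
mask = (5 < x) & (x < 7.5)

# where returns a tuple with one index array per dimension of the mask.
indices = np.where(mask)

# Indexing with the position indices is equivalent to indexing with the mask.
print(indices[0], x[indices])
```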
3.6.2 diag
With the diag function we can also extract the diagonal and subdiagonals of an array:
In [74]: diag(A)
3.6.3 take
The take function is similar to the fancy indexing described above:
In [76]: v2 = arange(-3,3)
         v2
In [78]: v2.take(row_indices)
3.6.4 choose
Constructs an array by picking elements from several arrays:
choose(which, choices)
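A minimal sketch of choose, assuming the import numpy as np convention; the values in which and choices are made up for illustration. For each position i, the result takes element i from the array choices[which[i]]:

```python
import numpy as np

which = [1, 0, 1, 0]
choices = [[-2, -2, -2, -2],
           [5, 5, 5, 5]]

# Position 0 picks from choices[1], position 1 from choices[0], and so on.
picked = np.choose(which, choices)

print(picked)
```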
In [81]: v1 = arange(0, 5)
In [82]: v1 * 2
In [83]: v1 + 2
In [84]: A * 2, A + 2
Out[84]: (array([[ 0, 2, 4, 6, 8],
[20, 22, 24, 26, 28],
[40, 42, 44, 46, 48],
[60, 62, 64, 66, 68],
[80, 82, 84, 86, 88]]), array([[ 2, 3, 4, 5, 6],
[12, 13, 14, 15, 16],
[22, 23, 24, 25, 26],
[32, 33, 34, 35, 36],
[42, 43, 44, 45, 46]]))
In [86]: v1 * v1
If we multiply arrays with compatible shapes, we get an element-wise multiplication of each row:
In [88]: A * v1
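A minimal sketch of this row-wise behavior, assuming the import numpy as np convention and an array A built to match the values shown earlier (the names rowwise and colwise are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

# v1 has shape (5,), so it is multiplied element-wise into every row of A.
rowwise = A * v1

# Giving v1 the shape (5, 1) scales each row of A by one element of v1 instead.
colwise = A * v1[:, np.newaxis]

print(rowwise[1], colwise[1])
```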
In [89]: dot(A, A)
In [91]: dot(v1, v1)
Out[91]: 30
Or we can cast the array objects to the type matrix. This changes the behavior of the standard arithmetic
operators +, -, * to use matrix algebra.
In [92]: M = matrix(A)
         v = matrix(v1).T # make it a column vector
In [93]: v
Out[93]: matrix([[0],
[1],
[2],
[3],
[4]])
In [94]: M * M
In [95]: try:
             M * v
         except ValueError as e:
             print(repr(e))
Out[96]: matrix([[30]])
If we try to add, subtract or multiply objects with incompatible shapes, we get an error:
In [98]: v = matrix([1,2,3,4,5,6]).T
In [100]: try:
              M * v
          except ValueError as e:
              print(repr(e))
Explore these related functions: inner, outer, cross, kron, tensordot using the help function. For
example: help(kron).
3.7.4 Array/Matrix transformations
Above we used the .T to transpose the matrix object v. We could have used the transpose function to
accomplish the same thing.
Other mathematical functions that transform matrix objects are:
In [102]: conjugate(C)
In [103]: C.H
We can extract the real and imaginary parts of complex-valued arrays using real and imag:
In [107]: abs(C)
In [109]: C.I * C
Determinant
In [110]: det(C)
Out[110]: (2.0000000000000004+0j)
In [111]: det(C.I)
Out[111]: (0.50000000000000011+0j)
Out[112]: (77431, 7)
mean
In [113]: # the temperature data is in column 3
          mean(data[:,3])
Out[113]: 6.1971096847515854
The daily mean temperature in Stockholm over the last 200 years has been about 6.2 C.
Out[115]: -25.800000000000001
Out[116]: 28.300000000000001
Out[118]: 45
In [119]: # product of all elements
          prod(d+1)
Out[119]: 3628800
Out[122]: 110
The data format is: year, month, day, daily average temperature, low, high, location.
If we are interested in the average temperature only in a particular month, say February, then we can
create an index mask and use it to select only the data for that month using:
Out[124]: array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.,
12.])
Out[126]: -3.2121095707365961
With these tools we have very powerful data processing capabilities at our disposal. For example,
extracting the average temperature for each month of the year takes only a few lines of code:
fig, ax = subplots()
ax.bar(months, monthly_mean)
ax.set_xlabel("Month")
ax.set_ylabel("Monthly avg. temp.");
3.7.8 Calculations with higher-dimensional data
When functions such as min, max, etc. are applied to multidimensional arrays, it is sometimes useful to
apply the calculation to the entire array, and sometimes only on a row or column basis. Using the axis
argument we can specify how these functions should behave:
In [128]: m = rand(3,3)
          m
Out[129]: 0.90335299212908093
Many other functions and methods in the array and matrix classes accept the same (optional) axis
keyword argument.
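A minimal sketch of the axis argument, assuming the import numpy as np convention and a small fixed array instead of random data so that the results are easy to check:

```python
import numpy as np

m = np.array([[1, 8, 3],
              [4, 5, 6],
              [7, 2, 9]])

overall = m.max()            # maximum over the entire array
per_column = m.max(axis=0)   # collapse axis 0: one maximum per column
per_row = m.max(axis=1)      # collapse axis 1: one maximum per row

print(overall, per_column, per_row)
```

The axis value names the dimension that is reduced away, so axis=0 leaves one value per column.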
3.8 Reshaping, resizing and stacking arrays
The shape of a NumPy array can be modified without copying the underlying data, which makes it a fast
operation even for large arrays.
In [132]: A
In [133]: n, m = A.shape
In [134]: B = A.reshape((1,n*m))
          B
Out[134]: array([[ 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,
32, 33, 34, 40, 41, 42, 43, 44]])
Out[135]: array([[ 5, 5, 5, 5, 5, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,
32, 33, 34, 40, 41, 42, 43, 44]])
In [136]: A # and the original variable is also changed. B is only a different view of the same data
We can also use the function flatten to make a higher-dimensional array into a vector. But this function
creates a copy of the data.
In [137]: B = A.flatten()
          B
Out[137]: array([ 5, 5, 5, 5, 5, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,
32, 33, 34, 40, 41, 42, 43, 44])
In [138]: B[0:5] = 10
          B
Out[138]: array([10, 10, 10, 10, 10, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,
32, 33, 34, 40, 41, 42, 43, 44])
In [139]: A # now A has not changed, because B's data is a copy of A's, not referring to the same data
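The difference between a reshaped view and a flattened copy can be checked directly. A minimal sketch, assuming the import numpy as np convention and a small illustrative array:

```python
import numpy as np

A = np.arange(6).reshape((2, 3))

B = A.reshape((1, 6))   # a view: shares data with A (A is contiguous here)
C = A.flatten()         # a copy: owns its own data

B[0, 0] = 99            # also changes A
C[1] = -1               # leaves A untouched

print(A)
print(np.shares_memory(A, B), np.shares_memory(A, C))
```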
3.9 Adding a new dimension: newaxis
With newaxis, we can insert new dimensions in an array; for example, converting a vector to a column or
row matrix:
In [140]: v = array([1,2,3])
In [141]: shape(v)
Out[141]: (3,)
In [142]: # make a column matrix of the vector v
          v[:, newaxis]
Out[142]: array([[1],
                 [2],
                 [3]])
In [143]: # column matrix
          v[:,newaxis].shape
Out[143]: (3, 1)
In [144]: # row matrix
          v[newaxis,:].shape
Out[144]: (1, 3)
3.10.2 concatenate
In [148]: b = array([[5, 6]])
In [149]: concatenate((a, b), axis=0)
Out[149]: array([[1, 2],
[3, 4],
[5, 6]])
In [150]: concatenate((a, b.T), axis=1)
Out[150]: array([[1, 2, 5],
[3, 4, 6]])
3.10.3 hstack and vstack
In [151]: vstack((a,b))
In [152]: hstack((a,b.T))
In [156]: A
If we want to avoid this behavior and get a new, completely independent object B copied
from A, then we need to do a so-called “deep copy” using the function copy:
In [157]: B = copy(A)
In [159]: A
3.12 Iterating over array elements
Generally, it’s best to avoid iterating over the elements of arrays whenever we can. Why? In an interpreted
language like Python (or MATLAB), iterations are really slow compared to vectorized operations.
However, sometimes iterations are unavoidable. For such cases, the Python for loop is the most
convenient way to iterate over an array:
In [160]: v = array([1,2,3,4])
          for element in v:
              print(element)
1
2
3
4
In [161]: M = array([[1,2], [3,4]])
          for row in M:
              print("row", row)
              for element in row:
                  print(element)
row [1 2]
1
2
row [3 4]
3
4
When we need to iterate over each element of an array and modify its elements, it is convenient to use
the enumerate function to obtain both the element and its index in the for loop:
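A minimal sketch of this pattern, assuming the import numpy as np convention; squaring each element in place is chosen as the illustrative update:

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

for row_idx, row in enumerate(M):
    for col_idx, element in enumerate(row):
        # The indices let us write the new value back into M.
        M[row_idx, col_idx] = element ** 2

print(M)
```

For a simple update like this one, the vectorized expression M ** 2 gives the same result without an explicit loop.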
3.13 Vectorizing functions
As mentioned several times, to get good performance we should try to avoid looping over elements in our
vectors and matrices, and instead use vectorized algorithms. The first step in converting a scalar algorithm
to a vectorized algorithm is to make sure that the functions we write work with vector inputs.
In [165]: try:
              Theta(array([-3,-2,-1,0,1,2,3]))
          except ValueError as e:
              print(repr(e))
ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'
OK, that didn’t work because we didn’t write the Theta function so that it can handle a vector input.
To get a vectorized version of Theta, we can use the Numpy function vectorize. In many cases it can
automatically vectorize a function:
In [167]: Theta_vec(array([-3,-2,-1,0,1,2,3]))
We can also implement the function to accept a vector input from the beginning (requires more effort
but might give better performance):
In [169]: Theta(array([-3,-2,-1,0,1,2,3]))
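Both approaches can be sketched side by side. The definition of Theta is not shown in this section, so the sketch assumes it is a Heaviside-style step function (1 for x >= 0, otherwise 0); the names theta_scalar, theta_vec and theta_array are illustrative:

```python
import numpy as np


def theta_scalar(x):
    """Scalar step function: works for a single number only."""
    if x >= 0:
        return 1
    return 0


# np.vectorize wraps the scalar function so that it accepts arrays.
# It is a convenience wrapper and still loops in Python internally.
theta_vec = np.vectorize(theta_scalar)


def theta_array(x):
    """Array-aware version: the comparison yields a boolean array."""
    return 1 * (np.asarray(x) >= 0)


xs = np.array([-3, -2, -1, 0, 1, 2, 3])
print(theta_vec(xs))
print(theta_array(xs))
```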
Out[170]: (0, 1)
In [171]: M
Out[171]: array([[ 1, 4],
[ 9, 16]])
In [174]: M.dtype
Out[174]: dtype(’int64’)
In [175]: M2 = M.astype(float)
          M2
In [176]: M2.dtype
Out[176]: dtype(’float64’)
In [177]: M3 = M.astype(bool)
          M3
Chapter 4
SciPy - Library of scientific algorithms
for Python
4.1 Introduction
The SciPy framework builds on top of the low-level NumPy framework for multidimensional arrays, and
provides a large number of higher-level scientific algorithms. Some of the topics that SciPy covers are:
Each of these submodules provides a number of functions and classes that can be used to solve problems
in their respective topics.
In this lecture, we will look at how to use some of these subpackages.
To access the SciPy package in a Python program, we start by importing everything from the scipy
module.
If we only need to use part of the SciPy framework, we can selectively include only those modules we are
interested in. For example, to include the linear algebra package under the name la, we can do:
In [4]: #
        # The scipy.special module includes a large number of Bessel functions.
        # Here we will use the functions jn and yn, which are the Bessel functions
        # of the first and second kind and real-valued order. We also include the
        # functions jn_zeros and yn_zeros, which give the zeros of the functions jn
        # and yn.
        #
        from scipy.special import jn, yn, jn_zeros, yn_zeros
In [5]: n = 0 # order
        x = 0.0
        # Bessel function of first kind
        print("J_%d(%f) = %f" % (n, x, jn(n, x)))
        x = 1.0
        # Bessel function of second kind
        print("Y_%d(%f) = %f" % (n, x, yn(n, x)))
J_0(0.000000) = 1.000000
Y_0(1.000000) = 0.088257
fig, ax = subplots()
for n in range(4):
    ax.plot(x, jn(n, x), label=r"$J_%d(x)$" % n)
ax.legend();
In [7]: # zeros of Bessel functions
        n = 0 # order
        m = 4 # number of roots to compute
        jn_zeros(n, m)
4.3 Integration
4.3.1 Numerical integration: quadrature
Numerical evaluation of an integral of the type
$$\int_a^b f(x)\,dx$$
is called numerical quadrature, or simply quadrature. SciPy provides a series of functions for different
kinds of quadrature, for example quad, dblquad and tplquad for single, double and triple integrals,
respectively.
The quad function takes a large number of optional arguments which can be used to fine-tune the behavior
of the function (try help(quad) for details).
The basic usage is as follows:
val, abserr = quad(f, x_lower, x_upper)
If we need to pass extra arguments to the integrand function, we can use the args keyword argument:
print(val, abserr)
0.7366751370811073 9.389126882496403e-13
For simple functions, we can use a lambda function (nameless function) instead of explicitly defining a
function for the integrand:
analytical = sqrt(pi)
print("analytical =", analytical)
As shown in the example above, we can also use ‘Inf’ or ‘-Inf’ as integral limits.
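The two usage patterns described above, extra integrand arguments via args and infinite limits with a lambda integrand, can be sketched as follows. The power-law integrand and the variable names are illustrative choices, not the integrands used for the outputs shown in this section:

```python
import numpy as np
from scipy.integrate import quad


def integrand(x, n):
    """Illustrative integrand x**n; n is passed in through args."""
    return x ** n


# Integrate x**2 from 0 to 1; the exact value is 1/3.
val, abserr = quad(integrand, 0, 1, args=(2,))

# Gaussian integral over the whole real line; the exact value is sqrt(pi).
gauss_val, gauss_err = quad(lambda x: np.exp(-x ** 2), -np.inf, np.inf)

print(val, abserr)
print(gauss_val, np.sqrt(np.pi))
```

quad returns both the estimate and an estimate of the absolute error, so it is worth checking the second value before trusting the first.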
Higher-dimensional integration works in the same way:
x_lower = 0
x_upper = 10
y_lower = 0
y_upper = 10
print(val, abserr)
0.7853981633974476 1.638229942140971e-13
Note how we had to pass lambda functions for the limits for the y integration, since these in general can
be functions of x.
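A minimal sketch of a double integral with dblquad. The Gaussian integrand is an assumption chosen because its integral over the quarter plane is close to pi/4, which is consistent with the value printed above:

```python
import numpy as np
from scipy.integrate import dblquad


def integrand(y, x):
    # Note the argument order expected by dblquad: y first, then x.
    return np.exp(-x ** 2 - y ** 2)


x_lower, x_upper = 0, 10
y_lower, y_upper = 0, 10

# The y limits are passed as functions of x, here constant lambdas.
val, abserr = dblquad(integrand, x_lower, x_upper,
                      lambda x: y_lower, lambda x: y_upper)

print(val, abserr)
```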
4.4 Ordinary differential equations (ODEs)
SciPy provides two different ways to solve ODEs: an API based on the function odeint, and an
object-oriented API based on the class ode. Usually odeint is easier to get started with, but the ode class offers a
finer level of control.
Here we will use the odeint function. For more information about the class ode, try help(ode). It
does pretty much the same thing as odeint, but in an object-oriented fashion.
To use odeint, first import it from the scipy.integrate module.
A system of ODEs is usually formulated in standard form before it is attacked numerically. The
standard form is:
$$y' = f(y, t)$$
where
$$y = [y_1(t), y_2(t), \ldots, y_n(t)]$$
and f is some function that gives the derivatives of the functions y_i(t). To solve an ODE, we need to
know the function f and an initial condition, y(0).
Note that higher-order ODEs can always be written in this form by introducing new variables for the
intermediate derivatives.
Once we have defined the Python function f and array y_0 (that is f and y(0) in the mathematical
formulation), we can use the odeint function as:
y_t = odeint(f, y_0, t)
where t is an array with time-coordinates for which to solve the ODE problem. y_t is an array with
one row for each point in time in t, where each column corresponds to a solution y_i(t) at that point in
time.
We will see how we can implement f and y_0 in Python code in the examples below.
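Before the physical examples, here is a minimal sketch of the calling convention on a system with a known solution. The undamped oscillator y1' = y2, y2' = -y1 with y(0) = [1, 0] is chosen for illustration; its exact solution is y1 = cos(t), y2 = -sin(t):

```python
import numpy as np
from scipy.integrate import odeint


def f(y, t):
    """Right-hand side of the system y1' = y2, y2' = -y1."""
    y1, y2 = y
    return [y2, -y1]


y_0 = [1.0, 0.0]             # initial condition y(0)
t = np.linspace(0, 5, 101)   # times at which the solution is wanted

y_t = odeint(f, y_0, t)

# One row per time point, one column per component of y.
print(y_t.shape)
```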
Example: double pendulum Let’s consider a physical example: The double compound pendulum,
described in some detail here: http://en.wikipedia.org/wiki/Double_pendulum
In [15]: Image(url='http://upload.wikimedia.org/wikipedia/commons/c/c9/Double-compound-pendulum-dimensio
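The equations of motion of the pendulum are given on the wiki page:
$$\dot{\theta}_1 = \frac{6}{m\ell^2} \frac{2 p_{\theta_1} - 3 \cos(\theta_1 - \theta_2)\, p_{\theta_2}}{16 - 9 \cos^2(\theta_1 - \theta_2)}$$
$$\dot{\theta}_2 = \frac{6}{m\ell^2} \frac{8 p_{\theta_2} - 3 \cos(\theta_1 - \theta_2)\, p_{\theta_1}}{16 - 9 \cos^2(\theta_1 - \theta_2)}$$
$$\dot{p}_{\theta_1} = -\frac{1}{2} m \ell^2 \left[ \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) + 3 \frac{g}{\ell} \sin\theta_1 \right]$$
$$\dot{p}_{\theta_2} = -\frac{1}{2} m \ell^2 \left[ -\dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) + \frac{g}{\ell} \sin\theta_2 \right]$$
To make the Python code simpler to follow, let’s introduce new variable names and the vector notation:
$$x = [\theta_1, \theta_2, p_{\theta_1}, p_{\theta_2}]$$
$$\dot{x}_1 = \frac{6}{m\ell^2} \frac{2 x_3 - 3 \cos(x_1 - x_2)\, x_4}{16 - 9 \cos^2(x_1 - x_2)}$$
$$\dot{x}_2 = \frac{6}{m\ell^2} \frac{8 x_4 - 3 \cos(x_1 - x_2)\, x_3}{16 - 9 \cos^2(x_1 - x_2)}$$
$$\dot{x}_3 = -\frac{1}{2} m \ell^2 \left[ \dot{x}_1 \dot{x}_2 \sin(x_1 - x_2) + 3 \frac{g}{\ell} \sin x_1 \right]$$
$$\dot{x}_4 = -\frac{1}{2} m \ell^2 \left[ -\dot{x}_1 \dot{x}_2 \sin(x_1 - x_2) + \frac{g}{\ell} \sin x_2 \right]$$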
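In [16]: g = 9.82
         L = 0.5
         m = 0.1

         def dx(x, t):
             """
             The right-hand side of the pendulum ODE
             """
             x1, x2, x3, x4 = x[0], x[1], x[2], x[3]
             # derivatives, term by term from the equations of motion
             dx1 = 6.0/(m*L**2) * (2 * x3 - 3 * cos(x1-x2) * x4)/(16 - 9 * cos(x1-x2)**2)
             dx2 = 6.0/(m*L**2) * (8 * x4 - 3 * cos(x1-x2) * x3)/(16 - 9 * cos(x1-x2)**2)
             dx3 = -0.5 * m * L**2 * ( dx1 * dx2 * sin(x1-x2) + 3 * (g/L) * sin(x1))
             dx4 = -0.5 * m * L**2 * (-dx1 * dx2 * sin(x1-x2) + (g/L) * sin(x2))
             return [dx1, dx2, dx3, dx4]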
x1 = + L * sin(x[:, 0])
y1 = - L * cos(x[:, 0])
x2 = x1 + L * sin(x[:, 1])
y2 = y1 - L * cos(x[:, 1])
Simple animation of the pendulum motion. We will see how to make a better animation in Lecture 4.
In [21]: from IPython.display import clear_output
         import time
In [22]: fig, ax = subplots(figsize=(4,4))
         # x has one row per time point in t
         for t_idx in range(len(t)):
             x1 = + L * sin(x[t_idx, 0])
             y1 = - L * cos(x[t_idx, 0])
             x2 = x1 + L * sin(x[t_idx, 1])
             y2 = y1 - L * cos(x[t_idx, 1])
             ax.cla()
             ax.plot([0, x1], [0, y1], 'r.-')
             ax.plot([x1, x2], [y1, y2], 'b.-')
             ax.set_ylim([-1.5, 0.5])
             ax.set_xlim([1, -1])
             display(fig)
             clear_output()
             time.sleep(0.1)
Example: Damped harmonic oscillator ODE problems are important in computational physics, so
we will look at one more example: the damped harmonic oscillator. This problem is well described on the
wiki page: http://en.wikipedia.org/wiki/Damping
The equation of motion for the damped oscillator is:
$$\frac{d^2 x}{dt^2} + 2\zeta\omega_0 \frac{dx}{dt} + \omega_0^2 x = 0$$
where x is the position of the oscillator, $\omega_0$ is the frequency, and $\zeta$ is the damping ratio. To write this
second-order ODE in standard form, we introduce $p = dx/dt$:
$$\frac{dp}{dt} = -2\zeta\omega_0 p - \omega_0^2 x$$
$$\frac{dx}{dt} = p$$
In the implementation of this example, we will add extra arguments to the RHS function for the ODE,
rather than using global variables as we did in the previous example. As a consequence of the extra arguments
to the RHS, we need to pass a keyword argument args to the odeint function:
dx = p
dp = -2 * zeta * w0 * p - w0**2 * x
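A minimal self-contained sketch of this approach. The function name dy, the initial state, the time grid and the damping ratios 0.2 and 1.0 are illustrative assumptions:

```python
import numpy as np
from scipy.integrate import odeint


def dy(y, t, zeta, w0):
    """RHS of the damped oscillator; zeta and w0 arrive through args."""
    x, p = y[0], y[1]
    dx = p
    dp = -2 * zeta * w0 * p - w0 ** 2 * x
    return [dx, dp]


y0 = [1.0, 0.0]                # initial state: x = 1, p = 0
t = np.linspace(0, 10, 1000)   # time grid
w0 = 2 * np.pi * 1.0           # angular frequency for 1 oscillation per time unit

# The tuple in args is appended to the (y, t) arguments of dy.
y_under = odeint(dy, y0, t, args=(0.2, w0))   # under-damped
y_crit = odeint(dy, y0, t, args=(1.0, w0))    # critically damped

print(y_under[-1, 0], y_crit[-1, 0])
```

For critical damping (zeta = 1) the exact solution with these initial conditions is x(t) = (1 + w0 t) exp(-w0 t), which gives a simple check of the numerical result.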
In [26]: # solve the ODE problem for three different values of the damping ratio
4.5 Fourier transform
Fourier transforms are one of the universal tools in computational physics; they appear over and over again
in different contexts. SciPy provides functions for accessing the classic FFTPACK library from NetLib, an
efficient and well tested FFT library written in FORTRAN. The SciPy API has a few additional convenience
functions, but overall the API is closely related to the original FORTRAN library.
To use the fftpack module in a Python program, include it using:
To demonstrate how to do a fast Fourier transform with SciPy, let’s look at the FFT of the solution to
the damped oscillator from the previous section:
In [29]: N = len(t)
         dt = t[1]-t[0]
Since the signal is real, the spectrum is symmetric. We therefore only need to plot the part that
corresponds to the positive frequencies. To extract that part of w and F, we can use some of the indexing
tricks for NumPy arrays we saw in Lecture 2:
In [31]: indices = where(w > 0) # select only indices for elements that corresponds to positive frequenc
          w_pos = w[indices]
          F_pos = F[indices]
As expected, we now see a peak in the spectrum that is centered around 1, which is the frequency we
used in the damped oscillator example.
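The same workflow can be sketched on a synthetic signal whose frequency is known in advance. Note that this sketch uses the numpy.fft functions rather than scipy.fftpack, and the 5 Hz sine wave and sampling parameters are assumptions chosen for illustration:

```python
import numpy as np

N = 1000                 # number of samples
dt = 0.01                # sampling interval
t = np.arange(N) * dt
signal = np.sin(2 * np.pi * 5.0 * t)   # sine wave with frequency 5

F = np.fft.fft(signal)       # complex spectrum
w = np.fft.fftfreq(N, dt)    # frequency belonging to each component of F

# Keep the positive-frequency half and locate the strongest component.
pos = w > 0
peak_freq = w[pos][np.argmax(np.abs(F[pos]))]

print(peak_freq)
```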
4.6.1 Linear equation systems
Linear equation systems in the matrix form
Ax = b
where A is a matrix and x, b are vectors can be solved like:
In [34]: x = solve(A, b)
In [35]: # check
         dot(A, x) - b
In [36]: A = rand(3,3)
         B = rand(3,3)
In [37]: X = solve(A, B)
In [38]: X
In [39]: # check
         norm(dot(A, X) - B)
Out[39]: 3.5975337699988621e-16
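A minimal self-contained sketch with a small system whose solution is easy to verify by hand (the matrix and right-hand side are made up for illustration):

```python
import numpy as np
from scipy.linalg import solve

# 3x + y = 9 and x + 2y = 8 have the solution x = 2, y = 3.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = solve(A, b)

# The residual A x - b should vanish up to rounding error.
residual = np.dot(A, x) - b

print(x, residual)
```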
In [41]: evals
In [43]: evals
Out[43]: array([ 1.18476874+0.j , -0.00767866+0.12773283j,
-0.00767866-0.12773283j])
In [44]: evecs
The eigenvector corresponding to the nth eigenvalue (stored in evals[n]) is the nth column in evecs,
i.e., evecs[:,n]. To verify this, let’s try multiplying an eigenvector with the matrix and compare to the product
of the eigenvector and the eigenvalue:
In [45]: n = 1
Out[45]: 1.629370826225489e-16
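This check can be sketched in a self-contained form. The symmetric 2x2 matrix is an illustrative choice with eigenvalues 1 and 3:

```python
import numpy as np
from scipy.linalg import eig

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

evals, evecs = eig(A)

n = 1
# A v should equal lambda v for the nth eigenpair, so this norm should be ~0.
residual = np.linalg.norm(np.dot(A, evecs[:, n]) - evals[n] * evecs[:, n])

print(evals, residual)
```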
There are also more specialized eigensolvers, such as eigh for Hermitian matrices.
In [47]: # determinant
         det(A)
Out[47]: 0.019400158815669057
In [49]: from scipy.sparse import *
More efficient way to create sparse matrices: create an empty matrix and populate it using matrix
indexing (avoids creating a potentially large dense matrix):
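A minimal sketch of this approach using the list-of-lists format lil_matrix, which supports efficient element assignment. The entries are made up for illustration:

```python
import numpy as np
from scipy.sparse import lil_matrix

A = lil_matrix((4, 4))   # empty 4x4 sparse matrix
A[0, 0] = 1
A[1, 1] = 3
A[2, 2] = 1
A[2, 1] = 1
A[3, 3] = 1
A[3, 0] = 1

# Convert to compressed sparse row format for fast arithmetic.
A_csr = A.tocsr()
A_squared = A_csr.dot(A_csr)

print(A_csr.nnz)
print(A_squared.toarray())
```

The lil format is convenient for building a matrix, while the csr and csc formats are better suited to matrix-vector and matrix-matrix products.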
In [54]: A.todense()
In [55]: A
In [56]: A = csr_matrix(A); A
In [57]: A = csc_matrix(A); A
Out[57]: <4x4 sparse matrix of type '<class 'numpy.float64'>'
with 6 stored elements in Compressed Sparse Column format>
In [58]: A.todense()
In [59]: (A * A).todense()
In [60]: try:
             dot(A, A).todense()
         except ValueError as e:
             print(repr(e))
In [61]: v = array([1,2,3,4])[:,newaxis]; v
Out[61]: array([[1],
[2],
[3],
[4]])
4.7 Optimization
Optimization (finding minima or maxima of a function) is a large field in mathematics, and optimization
of complicated functions or in many variables can be rather involved. Here we will only look at a
few very simple cases. For a more detailed introduction to optimization with SciPy, see:
http://scipy-lectures.github.com/advanced/mathematical_optimization/index.html
To use the optimization module in SciPy, first include the optimize module:
from scipy import optimize
4.7.1 Finding a minimum
First, let’s find the minima of a simple function of a single variable:
We can use the fmin_bfgs function to find the minima of a function:
Out[67]: array([-2.67298167])
Out[68]: array([ 0.46961745])
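A minimal self-contained sketch of this kind of search. The function f below is an assumption: a quartic chosen because its local minima lie near the two values printed above. Which minimum fmin_bfgs finds depends on the starting point:

```python
import numpy as np
from scipy import optimize


def f(x):
    """Illustrative objective with two local minima."""
    return 4 * x ** 3 + (x - 2) ** 2 + x ** 4


# fmin_bfgs works on a parameter vector, so wrap the scalar function.
# disp=False suppresses the convergence report.
x_right = optimize.fmin_bfgs(lambda p: f(p[0]), np.array([0.5]), disp=False)
x_left = optimize.fmin_bfgs(lambda p: f(p[0]), np.array([-2.0]), disp=False)

# brent takes the scalar function directly.
x_brent = optimize.brent(f)

print(x_left, x_right, x_brent)
```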
We can also use the brent or fminbound functions. They have slightly different syntax and use different
algorithms.
In [69]: optimize.brent(f)
Out[69]: 0.46961743402759754
Out[70]: -2.6729822917513886
In [74]: optimize.fsolve(f, 0.6)
4.8 Interpolation
Interpolation is simple and convenient in SciPy: The interp1d function, when given arrays describing X
and Y data, returns an object that behaves like a function that can be called for an arbitrary value of x (in
the range covered by X). It returns the corresponding interpolated y value:
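A minimal sketch of interp1d. The sample data y = x**2 is an illustrative choice that makes the interpolated values easy to check:

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.arange(0, 10)   # sample points 0, 1, ..., 9
y = x ** 2             # sampled values

linear = interp1d(x, y)                 # piecewise-linear (the default)
cubic = interp1d(x, y, kind="cubic")    # cubic spline

# The returned objects are called like functions.
print(linear(2.5), cubic(2.5))
```

With the default settings, calling the interpolant outside the range covered by x raises a ValueError.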
4.9 Statistics
The scipy.stats module contains a large number of statistical distributions, statistical functions and tests.
For a complete documentation of its features, see http://docs.scipy.org/doc/scipy/reference/stats.html.
There is also a very powerful Python package for statistical modeling called statsmodels. See
http://statsmodels.sourceforge.net for more details.
In [82]: n = arange(0,15)
In [84]: x = linspace(-5,5,100)
Statistics:
t-statistic = 1.13574794292
p-value = 0.256198284814
Since the p-value is large, we cannot reject the null hypothesis that the two sets of random data have
the same mean.
To test whether a single sample of data has mean 0.1 (the true mean is 0.0):
A low p-value means that we can reject the hypothesis that the mean of Y is 0.1.
In [89]: Y.mean()
Out[89]: 0.0
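A minimal sketch of a one-sample t-test with scipy.stats.ttest_1samp. The seeded sample of 10000 standard-normal values is an assumption made so that the result is reproducible:

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
sample = rng.normal(loc=0.0, scale=1.0, size=10000)   # true mean is 0.0

# Null hypothesis: the population mean equals 0.1.
t_stat, p_value = stats.ttest_1samp(sample, 0.1)

print(t_stat, p_value)
```

With this many samples the standard error of the mean is about 0.01, so a hypothesized mean of 0.1 lies far from the data and the p-value is very small.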
Chapter 5
matplotlib - 2D and 3D plotting in
Python
In [1]: # This line configures matplotlib to show figures embedded in the notebook,
        # instead of opening a new window for each figure. More about that later.
        # If you are using an old version of IPython, try using '%pylab inline' instead.
        %matplotlib inline
5.1 Introduction
Matplotlib is an excellent 2D and 3D graphics library for generating scientific figures. Some of the many
advantages of this library include:
One of the key features of matplotlib that I would like to emphasize, and that I think makes
matplotlib highly suitable for generating figures for scientific publications, is that all aspects of the figure
can be controlled programmatically. This is important for reproducibility and convenient when one needs to
regenerate the figure with updated data or change its appearance.
More information at the Matplotlib web page: http://matplotlib.org/
To get started using Matplotlib in a Python program, either include the symbols from the pylab module
(the easy way):
from pylab import *
or import the matplotlib.pyplot module under the name plt (the tidy way):
import matplotlib.pyplot as plt
5.2 MATLAB-like API
The easiest way to get started with plotting using matplotlib is often to use the MATLAB-like API provided
by matplotlib.
It is designed to be compatible with MATLAB’s plotting functions, so it is easy to get started with if
you are familiar with MATLAB.
To use this API from matplotlib, we need to include the symbols in the pylab module:
5.2.1 Example
A simple figure with MATLAB-like plotting API:
In [6]: figure()
        plot(x, y, 'r')
        xlabel('x')
        ylabel('y')
        title('title')
        show()
Most of the plotting related functions in MATLAB are covered by the pylab module. For example,
subplot and color/symbol selection:
In [7]: subplot(1,2,1)
plot(x, y, 'r--')
subplot(1,2,2)
plot(y, x, 'g*-');
The good thing about the pylab MATLAB-style API is that it is easy to get started with if you are
familiar with MATLAB, and it has a minimum of coding overhead for simple plots.
However, I’d encourage not using the MATLAB compatible API for anything but the simplest figures.
Instead, I recommend learning and using matplotlib’s object-oriented plotting API. It is remarkably
powerful. For advanced figures with subplots, insets and other components it is very nice to work with.
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
Although a little bit more code is involved, the advantage is that we now have full control of where the
plot axes are placed, and we can easily add more than one axis to the figure:
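fig = plt.figure()
axes1 = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # main axes
axes2 = fig.add_axes([0.2, 0.5, 0.4, 0.3]) # inset axes
# main figure
axes1.plot(x, y, 'r')
axes1.set_xlabel('x')
axes1.set_ylabel('y')
axes1.set_title('title')
# inset
axes2.plot(y, x, 'g')
axes2.set_xlabel('y')
axes2.set_ylabel('x')
axes2.set_title('inset title');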
If we don’t care about being explicit about where our plot axes are placed in the figure canvas, then we
can use one of the many axis layout managers in matplotlib. My favorite is subplots, which can be used
like this:
fig, axes = plt.subplots()
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
In [11]: fig, axes = plt.subplots(nrows=1, ncols=2)
for ax in axes:
    ax.plot(x, y, 'r')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('title')
That was easy, but it isn’t so pretty with overlapping figure axes and labels, right?
We can deal with that by using the fig.tight_layout method, which automatically adjusts the positions
of the axes on the figure canvas so that there is no overlapping content:
fig, axes = plt.subplots(nrows=1, ncols=2)
for ax in axes:
    ax.plot(x, y, 'r')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('title')
fig.tight_layout()
5.3.1 Figure size, aspect ratio and DPI
Matplotlib allows the aspect ratio, DPI and figure size to be specified when the Figure object is created,
using the figsize and dpi keyword arguments. figsize is a tuple of the width and height of the figure in
inches, and dpi is the dots-per-inch (pixels per inch). To create an 800x400 pixel, 100 dots-per-inch figure,
we can do:
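A minimal sketch of such a call: the pixel size is the size in inches multiplied by the DPI, so 8 x 4 inches at 100 dots per inch gives 800 x 400 pixels. The variable names width_px and height_px are illustrative only:

```python
import matplotlib.pyplot as plt

# 8 in x 4 in at 100 dots per inch -> 800 x 400 pixels
fig = plt.figure(figsize=(8, 4), dpi=100)

# Size in pixels = size in inches * dots per inch
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)
```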
<matplotlib.figure.Figure at 0x109225d68>
The same arguments can also be passed to layout managers, such as the subplots function:
fig, axes = plt.subplots(figsize=(12,3))
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
5.3.2 Saving figures
To save a figure to a file we can use the savefig method in the Figure class:
In [15]: fig.savefig("filename.png")
Here we can also optionally specify the DPI and choose between different output formats:
What formats are available and which ones should be used for best quality? Matplotlib can
generate high-quality output in a number of formats, including PNG, JPG, EPS, SVG, PGF and PDF. For
scientific papers, I recommend using PDF whenever possible. (LaTeX documents compiled with pdflatex
can include PDFs using the \includegraphics command.) In some cases, PGF can also be a good alternative.
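As a sketch of the options discussed here, the following saves the same figure once as a PNG with an explicit DPI and once as a PDF. The sample data and the use of a temporary directory are assumptions made so the example leaves no files in the working directory:

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 10)  # sample data (an assumption)
fig, ax = plt.subplots()
ax.plot(x, x ** 2, 'r')

out_dir = tempfile.mkdtemp()  # scratch directory for the output files
png_path = os.path.join(out_dir, "filename.png")
pdf_path = os.path.join(out_dir, "filename.pdf")

fig.savefig(png_path, dpi=200)  # raster output at 200 dots per inch
fig.savefig(pdf_path)           # vector output; the format is inferred from the extension
```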
In [17]: ax.set_title("title");
Axis labels
Similarly, with the methods set_xlabel and set_ylabel, we can set the labels of the X and Y axes:
In [18]: ax.set_xlabel("x")
ax.set_ylabel("y");
Legends
Legends for curves in a figure can be added in two ways. One method is to use the legend method of
the axis object and pass a list/tuple of legend texts for the previously defined curves:
The method described above follows the MATLAB API. It is somewhat error-prone and inflexible if
curves are added to or removed from the figure (resulting in a wrongly labelled curve).
A better method is to use the label="label text" keyword argument when plots or other objects are
added to the figure, and then use the legend method without arguments to add the legend to the figure:
In [20]: ax.plot(x, x**2, label="curve1")
ax.plot(x, x**3, label="curve2")
ax.legend();
The advantage with this method is that if curves are added or removed from the figure, the legend is
automatically updated accordingly.
The legend function takes an optional keyword argument loc that can be used to specify where in the
figure the legend is to be drawn. The allowed values of loc are numerical codes for the various places the
legend can be drawn. See http://matplotlib.org/users/legend_guide.html#legend-location for details. Some
of the most common loc values are:
In [21]: ax.legend(loc=0) # let matplotlib decide the optimal location
ax.legend(loc=1) # upper right corner
ax.legend(loc=2) # upper left corner
ax.legend(loc=3) # lower left corner
ax.legend(loc=4) # lower right corner
# .. many more options are available
Out[21]: <matplotlib.legend.Legend at 0x107a4ccc0>
The following figure shows how to use the figure title, axis labels and legends described above:
In [22]: fig, ax = plt.subplots()
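A minimal sketch of such a figure, combining a title, axis labels and a legend. The sample data and the label texts are assumptions made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 10)  # sample data (an assumption)

fig, ax = plt.subplots()
ax.plot(x, x ** 2, label="y = x**2")
ax.plot(x, x ** 3, label="y = x**3")
ax.legend(loc=2)  # upper left corner
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("title")
```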
5.3.4 Formatting text: LaTeX, fontsize, font family
The figure above is functional, but it does not (yet) satisfy the criteria for a figure used in a publication.
First and foremost, we need to have LaTeX formatted text, and second, we need to be able to adjust the
font size to appear right in a publication.
Matplotlib has great support for LaTeX. All we need to do is to use dollar signs to encapsulate LaTeX in
any text (legend, title, label, etc.). For example, "$y=x^3$".
But here we can run into a slightly subtle problem with LaTeX code and Python text strings. In LaTeX,
we frequently use the backslash in commands, for example \alpha to produce the symbol α. But the
backslash already has a meaning in Python strings (the escape code character). To avoid Python messing
up our LaTeX code, we need to use “raw” text strings. Raw text strings are prepended with an 'r', like
r"\alpha" or r'\alpha' instead of "\alpha" or '\alpha':
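The difference between the two kinds of string can be checked directly. In this small sketch the variable names plain and raw are illustrative; "\a" happens to be Python's escape code for the ASCII bell character, so the plain string silently loses the backslash:

```python
import matplotlib.pyplot as plt

plain = "\alpha"  # "\a" is the ASCII bell escape, so this is not a LaTeX command
raw = r"\alpha"   # raw string: the backslash is kept as-is

print(len(plain), len(raw))
print(repr(plain), repr(raw))

# Raw strings are therefore the safe choice for LaTeX labels
fig, ax = plt.subplots()
ax.set_xlabel(r"$\alpha$")
```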
We can also change the global font size and font family, which applies to all text elements in a figure
(tick labels, axis labels and titles, legends, etc.):
In [25]: fig, ax = plt.subplots()
Or, alternatively, we can request that matplotlib uses LaTeX to render the text elements in the figure:
In [30]: # restore
matplotlib.rcParams.update({'font.size': 12, 'font.family': 'sans', 'text.usetex': False})
We can also define colors by their names or RGB hex codes and optionally provide an alpha value using
the color and alpha keyword arguments:
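A short sketch of the color and alpha keyword arguments. The sample data and the particular colors are assumptions chosen for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 10)  # sample data (an assumption)

fig, ax = plt.subplots()
line_red, = ax.plot(x, x + 1, color="red", alpha=0.5)  # named color, half-transparent
line_blue, = ax.plot(x, x + 2, color="#1155dd")        # RGB hex code for a bluish color
line_green, = ax.plot(x, x + 3, color="#15cc55")       # RGB hex code for a greenish color
```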
Line and marker styles To change the line width, we can use the linewidth or lw keyword argument.
The line style can be selected using the linestyle or ls keyword arguments:
fig, ax = plt.subplots(figsize=(12,6))
# custom dash
line, = ax.plot(x, x+8, color="black", lw=1.50)
line.set_dashes([5, 10, 15, 10]) # format: line length, space length, ...
# possible marker symbols: marker = '+', 'o', '*', 's', ',', '.', '1', '2', '3', '4', ...
# ('*' is a marker, not a line style, so a dashed line style is used here)
ax.plot(x, x+ 9, color="green", lw=2, ls='--', marker='+')
ax.plot(x, x+10, color="green", lw=2, ls='--', marker='o')
ax.plot(x, x+11, color="green", lw=2, ls='--', marker='s')
ax.plot(x, x+12, color="green", lw=2, ls='--', marker='1')
ax.plot(x, x+15, color="purple", lw=1, ls='-', marker='o', markersize=8, markerfacecolor="red")
ax.plot(x, x+16, color="purple", lw=1, ls='-', marker='s', markersize=8,
        markerfacecolor="yellow", markeredgewidth=2, markeredgecolor="blue");
Plot range The first thing we might want to configure is the ranges of the axes. We can do this using the
set_ylim and set_xlim methods in the axis object, or axis('tight') for automatically getting “tightly
fitted” axes ranges:
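A minimal sketch comparing the default ranges, axis('tight'), and explicitly chosen limits. The sample data and the particular limits are assumptions made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 10)  # sample data (an assumption)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].plot(x, x ** 2, x, x ** 3)
axes[0].set_title("default axes ranges")

axes[1].plot(x, x ** 2, x, x ** 3)
axes[1].axis('tight')
axes[1].set_title("tight axes")

axes[2].plot(x, x ** 2, x, x ** 3)
axes[2].set_ylim([0, 60])
axes[2].set_xlim([2, 5])
axes[2].set_title("custom axes range")
```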
Logarithmic scale It is also possible to set a logarithmic scale for one or both axes. This functionality is
in fact only one application of a more general transformation system in Matplotlib. Each axis scale
is set separately using the set_xscale and set_yscale methods, which accept one parameter (with the value
"log" in this case):
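A minimal sketch that plots the same curves on a linear and on a logarithmic y axis. The sample data are an assumption made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 10)  # sample data (an assumption)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

axes[0].plot(x, x ** 2, x, np.exp(x))
axes[0].set_title("Normal scale")

axes[1].plot(x, x ** 2, x, np.exp(x))
axes[1].set_yscale("log")
axes[1].set_title("Logarithmic scale (y)")
```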
In [35]: fig, axes = plt.subplots(1, 2, figsize=(10,4))
In [36]: fig, ax = plt.subplots(figsize=(10, 4))
ax.set_xticks([1, 2, 3, 4, 5])
ax.set_xticklabels([r'$\alpha$', r'$\beta$', r'$\gamma$', r'$\delta$', r'$\epsilon$'], fontsize
There are a number of more advanced methods for controlling major and minor tick placement
in matplotlib figures, such as automatic placement according to different policies. See
http://matplotlib.org/api/ticker_api.html for details.
Scientific notation With large numbers on axes, it is often better to use scientific notation:
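One way to do this, sketched here under the assumption that a ScalarFormatter from matplotlib.ticker is acceptable for the figure, is to attach a formatter with narrow power limits to the y axis. The sample data are illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import ticker

x = np.linspace(0, 5, 10)  # sample data (an assumption)

fig, ax = plt.subplots(1, 1)
ax.plot(x, x ** 2, x, np.exp(x))
ax.set_title("scientific notation")

formatter = ticker.ScalarFormatter(useMathText=True)
formatter.set_scientific(True)
formatter.set_powerlimits((-1, 1))  # use powers of ten outside the range 10**-1 .. 10**1
ax.yaxis.set_major_formatter(formatter)
```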
5.3.8 Axis number and axis label spacing
In [38]: # distance between x and y axis and the numbers on the axes
matplotlib.rcParams['xtick.major.pad'] = 5
matplotlib.rcParams['ytick.major.pad'] = 5
fig, ax = plt.subplots(1, 1)
ax.set_xlabel("x")
ax.set_ylabel("y");
In [39]: # restore defaults
matplotlib.rcParams['xtick.major.pad'] = 3
matplotlib.rcParams['ytick.major.pad'] = 3
Axis position adjustments Unfortunately, when saving figures the labels are sometimes clipped, and it
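can be necessary to adjust the positions of axes a little bit. This can be done using subplots_adjust:
fig, ax = plt.subplots(1, 1)
ax.set_title("title")
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.subplots_adjust(left=0.15, right=0.9, bottom=0.1, top=0.9); # illustrative margins (fractions of the figure size)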
5.3.9 Axis grid
With the grid method in the axis object, we can turn grid lines on and off. We can also customize the
appearance of the grid lines using the same keyword arguments as the plot function:
In [41]: fig, axes = plt.subplots(1, 2, figsize=(10,3))
5.3.10 Axis spines
We can also change the properties of axis spines:
fig, ax = plt.subplots(figsize=(6,2))
ax.spines['bottom'].set_color('blue')
ax.spines['top'].set_color('blue')
ax.spines['left'].set_color('red')
ax.spines['left'].set_linewidth(2)
fig, ax1 = plt.subplots()
ax1.plot(x, x**2, lw=2, color="blue")
ax2 = ax1.twinx()
ax2.plot(x, x**3, lw=2, color="red")
ax2.set_ylabel(r"volume $(m^3)$", fontsize=18, color="red")
for label in ax2.get_yticklabels():
    label.set_color("red")
5.3.12 Axes where x and y are zero
In [44]: fig, ax = plt.subplots()
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.spines['bottom'].set_position(('data',0)) # set position of x spine to x=0
ax.yaxis.set_ticks_position('left')
ax.spines['left'].set_position(('data',0)) # set position of y spine to y=0
5.3.13 Other 2D plot styles
In addition to the regular plot method, there are a number of other functions for generating different
kinds of plots. See the matplotlib plot gallery for a complete list of available plot types:
http://matplotlib.org/gallery.html. Some of the more useful ones are shown below:
In [45]: n = array([0,1,2,3,4,5])
In [46]: fig, axes = plt.subplots(1, 4, figsize=(12,3))
axes[0].scatter(xx, xx + 0.25*randn(len(xx)))
axes[0].set_title("scatter")
In [47]: # polar plot using add_axes and polar projection
fig = plt.figure()
ax = fig.add_axes([0.0, 0.0, .6, .6], polar=True)
t = linspace(0, 2 * pi, 100)
ax.plot(t, t, color='blue', lw=3);
In [48]: # A histogram
n = np.random.randn(100000)
fig, axes = plt.subplots(1, 2, figsize=(12,4))
axes[0].hist(n)
axes[0].set_title("Default histogram")
axes[0].set_xlim((min(n), max(n)))
5.3.14 Text annotation
Annotating text in matplotlib figures can be done using the text function. It supports LaTeX formatting
just like axis label texts and titles:
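A minimal sketch of the text method on an axes object. The data array xx and the text positions (given in data coordinates) are assumptions made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

xx = np.linspace(-0.75, 1.0, 100)  # sample data (an assumption)

fig, ax = plt.subplots()
ax.plot(xx, xx ** 2, xx, xx ** 3)

# Positions are in data coordinates; raw strings keep the LaTeX intact
t1 = ax.text(0.15, 0.2, r"$y=x^2$", fontsize=20, color="blue")
t2 = ax.text(0.65, 0.1, r"$y=x^3$", fontsize=20, color="green")
```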
subplots
subplot2grid
In [51]: fig = plt.figure()
ax1 = plt.subplot2grid((3,3), (0,0), colspan=3)
ax2 = plt.subplot2grid((3,3), (1,0), colspan=2)
ax3 = plt.subplot2grid((3,3), (1,2), rowspan=2)
ax4 = plt.subplot2grid((3,3), (2,0))
ax5 = plt.subplot2grid((3,3), (2,1))
fig.tight_layout()
gridspec
In [52]: import matplotlib.gridspec as gridspec
fig.tight_layout()
add_axes Manually adding axes with add_axes is useful for adding insets to figures:
# inset
inset_ax = fig.add_axes([0.2, 0.55, 0.35, 0.35]) # X, Y, width, height
5.3.16 Colormap and contour figures
Colormaps and contour figures are useful for plotting functions of two variables. In most of these functions
we will use a colormap to encode one dimension of the data. There are a number of predefined
colormaps. It is relatively straightforward to define custom colormaps. For a list of pre-defined colormaps,
see: http://www.scipy.org/Cookbook/Matplotlib/Show_colormaps
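As a sketch of how a colormap encodes the third dimension, the following draws the same function of two variables with imshow (plus a colorbar) and with contour. The function Z = sin(X) cos(Y) is a hypothetical stand-in chosen for illustration, not the function used in this chapter's figures:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical function of two variables (an assumption for illustration)
phi = np.linspace(0, 2 * np.pi, 50)
X, Y = np.meshgrid(phi, phi)
Z = np.sin(X) * np.cos(Y)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# imshow: color encodes the value of Z; the colorbar shows the mapping
im = axes[0].imshow(Z, cmap=plt.cm.RdBu, vmin=Z.min(), vmax=Z.max(),
                    extent=[0, 2 * np.pi, 0, 2 * np.pi], origin="lower")
cb = fig.colorbar(im, ax=axes[0])

# contour: lines of constant Z, colored with the same colormap
cnt = axes[1].contour(X, Y, Z, cmap=plt.cm.RdBu)
```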
pcolor
imshow
cb = fig.colorbar(im, ax=ax)
contour
5.4 3D figures
To use 3D graphics in matplotlib, we first need to create an instance of the Axes3D class. 3D axes can be
added to a matplotlib figure canvas in exactly the same way as 2D axes; or, more conveniently, by passing
a projection='3d' keyword argument to the add_axes or add_subplot methods.
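A minimal sketch of creating 3D axes and drawing a surface on them. The surface Z = sin(X) cos(Y) is a hypothetical example, and the explicit Axes3D import is only needed on older matplotlib versions to register the '3d' projection:

```python
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib

# Hypothetical surface (an assumption for illustration)
phi = np.linspace(0, 2 * np.pi, 50)
X, Y = np.meshgrid(phi, phi)
Z = np.sin(X) * np.cos(Y)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(1, 1, 1, projection='3d')
surf = ax.plot_surface(X, Y, Z, rstride=4, cstride=4, linewidth=0)
```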
Surface plots
In [61]: fig = plt.figure(figsize=(14,6))
# `ax` is a 3D-aware axis instance because of the projection='3d' keyword argument to add_subpl
ax = fig.add_subplot(1, 2, 1, projection='3d')
Wire-frame plot
In [62]: fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1, 1, 1, projection='3d')
Contour plots with projections
ax = fig.add_subplot(1,1,1, projection='3d')
ax.set_xlim3d(-pi, 2*pi);
ax.set_ylim3d(0, 3*pi);
ax.set_zlim3d(-pi, 2*pi);
Change the view angle We can change the perspective of a 3D plot using the view_init method, which
takes two arguments: elevation and azimuth angle (in degrees):
fig = plt.figure(figsize=(12,6))
ax = fig.add_subplot(1,2,1, projection='3d')
ax.plot_surface(X, Y, Z, rstride=4, cstride=4, alpha=0.25)
ax.view_init(30, 45)
ax = fig.add_subplot(1,2,2, projection='3d')
ax.plot_surface(X, Y, Z, rstride=4, cstride=4, alpha=0.25)
ax.view_init(70, 30)
fig.tight_layout()
5.4.1 Animations
Matplotlib also includes a simple API for generating animations from sequences of figures. With the
FuncAnimation function we can generate a movie file from sequences of figures. The function takes the
following arguments: fig, a figure canvas; func, a function that we provide which updates the figure;
init_func, a function we provide to set up the figure; frames, the number of frames to generate; and blit,
which tells the animation function to only update parts of the frame which have changed (for smoother
animations):
def init():
    # setup figure
    pass
def update(frame_counter):
    # update figure for new frame
    pass
To use the animation features in matplotlib we first need to import the module matplotlib.animation:
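A minimal sketch of the import and of a complete FuncAnimation call. The travelling sine wave is a hypothetical example chosen for brevity; with blit=True both callbacks return the artists they modified:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import animation

fig, ax = plt.subplots()
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1, 1)

t = np.linspace(0, 2 * np.pi, 100)
line, = ax.plot([], [], lw=2)

def init():
    # start from an empty line
    line.set_data([], [])
    return (line,)

def update(n):
    # n = frame counter; shift the phase a little for every frame
    line.set_data(t, np.sin(t + 0.1 * n))
    return (line,)

anim = animation.FuncAnimation(fig, update, init_func=init, frames=50, blit=True)
```

Saving the result with anim.save requires an external encoder such as ffmpeg, so it is left out of this sketch.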
In [66]: # solve the ode problem of the double compound pendulum again
dx3 = -0.5 * m * L**2 * ( dx1 * dx2 * sin(x1-x2) + 3 * (g/L) * sin(x1))
dx4 = -0.5 * m * L**2 * (-dx1 * dx2 * sin(x1-x2) + (g/L) * sin(x2))
return [dx1, dx2, dx3, dx4]
Generate an animation that shows the positions of the pendulums as a function of time:
ax.set_ylim([-1.5, 0.5])
ax.set_xlim([1, -1])
def init():
    pendulum1.set_data([], [])
    pendulum2.set_data([], [])
def update(n):
    # n = frame counter
    # calculate the positions of the pendulums
    x1 = + L * sin(x[n, 0])
    y1 = - L * cos(x[n, 0])
    x2 = x1 + L * sin(x[n, 1])
    y2 = y1 - L * cos(x[n, 1])
# anim.save can be called in a few different ways, some which might or might not work
# on different platforms and with different versions of matplotlib and video encoders
#anim.save('animation.mp4', fps=20, extra_args=['-vcodec', 'libx264'],
# writer=animation.FFMpegWriter())
anim.save('animation.mp4', fps=20, extra_args=['-vcodec', 'libx264'])
#anim.save('animation.mp4', fps=20, writer="ffmpeg", codec="libx264")
#anim.save('animation.mp4', fps=20, writer="avconv", codec="libx264")
plt.close(fig)
Note: To generate the movie file we need to have either ffmpeg or avconv installed. Install it on Ubuntu
using:
or (newer versions)
On MacOSX, try:
$ sudo port install ffmpeg
5.4.2 Backends
Matplotlib has a number of “backends” which are responsible for rendering graphs. The different backends
are able to generate graphics with different formats and display/event loops. There is a distinction between
noninteractive backends (such as 'agg', 'svg', 'pdf', etc.) that are only used to generate image files (e.g. with
the savefig function), and interactive backends (such as Qt4Agg, GTK, MacOSX) that can display a GUI
window for interactively exploring figures.
The available backends can be listed with:
In [70]: print(matplotlib.rcsetup.all_backends)
[’GTK’, ’GTKAgg’, ’GTKCairo’, ’MacOSX’, ’Qt4Agg’, ’Qt5Agg’, ’TkAgg’, ’WX’, ’WXAgg’, ’CocoaAgg’, ’GTK3Cai
The default backend, called agg, is based on a library for raster graphics which is great for generating
raster formats like PNG.
Normally we don’t need to bother with changing the default backend; but sometimes it can be useful to
switch to, for example, PDF or GTKCairo (if you are using Linux) to produce high-quality vector graphics
instead of raster based graphics.
In [72]: #
# Now we are using the svg backend to produce SVG vector graphics
#
fig, ax = plt.subplots()
t = numpy.linspace(0, 10, 100)
ax.plot(t, numpy.cos(t)*numpy.sin(t))
plt.savefig("test.svg")
In [73]: #
# Show the produced SVG file.
#
SVG(filename="test.svg")
Out[73]:
The IPython notebook inline backend When we use IPython notebook it is convenient to use a
matplotlib backend that outputs the graphics embedded in the notebook file. To activate this backend,
somewhere near the beginning of the notebook, we add:
%matplotlib inline
or, alternatively:
%pylab inline
The difference is that %pylab inline imports a number of packages into the global namespace (scipy,
numpy), while %matplotlib inline only sets up inline plotting. In new notebooks created for IPython
1.0+, I would recommend using %matplotlib inline, since it is tidier and you have more control over
which packages are imported and how. Commonly, scipy and numpy are imported separately with:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
The inline backend has a number of configuration options that can be set by using the IPython magic
command %config to update settings in InlineBackend. For example, we can switch to SVG figures or
higher resolution figures with either:
%config InlineBackend.figure_format='svg'
or:
%config InlineBackend.figure_format='retina'
%config InlineBackend
In [75]: #
# Now we are using the SVG vector graphics displayed inline in the notebook
#
fig, ax = plt.subplots()
t = numpy.linspace(0, 10, 100)
ax.plot(t, numpy.cos(t)*numpy.sin(t))
plt.savefig("test.svg")
Interactive backend (this makes more sense in a python script file)
In [76]: #
# RESTART THE NOTEBOOK: the matplotlib backend can only be selected before pylab is imported!
# (e.g. Kernel > Restart)
#
import matplotlib
matplotlib.use('Qt4Agg') # or for example MacOSX
import matplotlib.pyplot as plt
import numpy
In [77]: # Now, open an interactive plot window with the Qt4Agg backend
fig, ax = plt.subplots()
t = numpy.linspace(0, 10, 100)
ax.plot(t, numpy.cos(t)*numpy.sin(t))
plt.show()
Note that when we use an interactive backend, we must call plt.show() to make the figure appear on
the screen.
Chapter 6
Sympy - Symbolic algebra in Python
This curriculum builds on material by J. Robert Johansson from his “Introduction to scientific computing
with Python,” generously made available under a Creative Commons Attribution 3.0 Unported License at
https://github.com/jrjohansson/scientific-python-lectures. The Continuum Analytics enhancements use the
Creative Commons Attribution-NonCommercial 4.0 International License.
6.1 Introduction
There are two notable Computer Algebra Systems (CAS) for Python:
• SymPy - A python module that can be used in any Python program, or in an IPython session, that
provides powerful CAS features.
• Sage - Sage is a full-featured and very powerful CAS environment that aims to provide an open source
system that competes with Mathematica and Maple. Sage is not a regular Python module, but rather
a CAS environment that uses Python as its programming language.
Sage is in some aspects more powerful than SymPy, but both offer very comprehensive CAS functionality.
The advantage of SymPy is that it is a regular Python module and integrates well with the IPython notebook.
In this lecture we will therefore look at how to use SymPy with IPython notebooks. If you are interested
in an open source CAS environment I also recommend reading more about Sage.
To get started using SymPy in a Python program or notebook, import the module sympy:
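A minimal sketch of the import. The cells in this chapter use bare names such as Symbol and Rational, which suggests a star import; the explicit import list below is an assumption chosen to keep the namespace tidy:

```python
from sympy import Symbol, init_printing, pi

# Ask SymPy to use the nicest available output format (LaTeX in a notebook)
init_printing()

x = Symbol('x')
expr = (pi + x) ** 2
```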
In [3]: init_printing()
6.2 Symbolic variables
In SymPy we need to create symbols for the variables we want to work with. We can create a new symbol
using the Symbol class:
In [4]: x = Symbol('x')
Out[5]:
$(x + \pi)^2$
In [7]: type(a)
Out[7]: sympy.core.symbol.Symbol
In [9]: x.is_imaginary
Out[9]: False
In [11]: x > 0
Out[11]:
True
In [12]: 1+1*I
Out[12]:
1+i
In [13]: I**2
Out[13]:
$-1$
In [14]: (x * I + 1)**2
Out[14]:
$(ix + 1)^2$
6.2.2 Rational numbers
There are three different numerical types in SymPy: Real, Rational, Integer:
In [15]: r1 = Rational(4,5)
r2 = Rational(5,4)
In [16]: r1
Out[16]:
$\frac{4}{5}$
In [17]: r1+r2
Out[17]:
$\frac{41}{20}$
In [18]: r1/r2
Out[18]:
$\frac{16}{25}$
In [19]: pi.evalf(n=50)
Out[19]:
3.1415926535897932384626433832795028841971693993751
In [20]: y = (x + pi)**2
Out[21]:
$(x + 3.1416)^2$
When we numerically evaluate algebraic expressions we often want to substitute a symbol with a numerical
value. In SymPy we do that using the subs function:
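A minimal sketch of substituting a number and then evaluating numerically. The names y_at and value are illustrative:

```python
from sympy import N, Symbol, pi

x = Symbol('x')
y = (x + pi) ** 2

y_at = y.subs(x, 1.5)  # still symbolic: pi has not been evaluated
value = N(y_at)        # numerical value as a SymPy float
print(y_at, value)
```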
Out[22]:
$(1.5 + \pi)^2$
Out[23]:
21.5443823618587
The subs function can of course also be used to substitute Symbols and expressions:
Out[24]:
$(a + 2\pi)^2$
We can also combine numerical evaluation of expressions with NumPy arrays:
However, this kind of numerical evaluation can be very slow, and there is a much more efficient way to do
it: Use the function lambdify to “compile” a SymPy expression into a function that is much more efficient
to evaluate numerically:
In [29]: f = lambdify([x], (x + pi)**2, 'numpy') # the first argument is a list of variables that
# f will be a function of: in this case only x -> f(x)
In [30]: y_vec = f(x_vec) # now we can directly pass a numpy array and f(x) is efficiently evaluated
The speedup when using “lambdified” functions instead of direct numerical evaluation can be significant,
often several orders of magnitude. Even in this simple example we get a significant speed up:
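The two approaches can be compared side by side. In this sketch the array x_vec and the names y_slow and y_fast are assumptions made for illustration; both routes should give the same numbers, only the speed differs:

```python
import numpy as np
from sympy import N, Symbol, lambdify, pi

x = Symbol('x')
expr = (x + pi) ** 2
x_vec = np.arange(0, 10, 0.1)  # sample points (an assumption)

# Slow: substitute and evaluate one value at a time
y_slow = np.array([float(N(expr.subs(x, xx))) for xx in x_vec])

# Fast: build a NumPy-aware function once, then evaluate the whole array
f = lambdify([x], expr, 'numpy')
y_fast = f(x_vec)
```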
In [31]: %%timeit
y_vec = f(x_vec)
The slowest run took 14.09 times longer than the fastest. This could mean that an intermediate result is
1000000 loops, best of 3: 1.49 µs per loop
(x + 1) (x + 2) (x + 3)
In [34]: expand((x+1)*(x+2)*(x+3))
Out[34]:
$x^3 + 6x^2 + 11x + 6$
The expand function takes a number of keyword arguments with which we can tell the function what kind of
expansions we want to have performed. For example, to expand trigonometric expressions, use the trig=True
keyword argument:
In [35]: sin(a+b)
Out[35]:
$\sin(a + b)$
In [36]: expand(sin(a+b), trig=True)
Out[36]:
$\sin(a)\cos(b) + \sin(b)\cos(a)$
(x + 1) (x + 2) (x + 3)
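The expansions above, and the inverse operation of factoring a polynomial back into a product with SymPy's factor function, can be checked with a short sketch. The variable names are illustrative:

```python
from sympy import cos, expand, factor, sin, symbols

x, a, b = symbols("x, a, b")

product = (x + 1) * (x + 2) * (x + 3)
expanded = expand(product)    # polynomial form
refactored = factor(expanded) # back to the product form

trig = expand(sin(a + b), trig=True)  # angle-addition formula
print(expanded, refactored, trig)
```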
6.4.2 Simplify
The simplify function tries to simplify an expression into a nice looking expression, using various techniques. More
specific alternatives to the simplify function also exist: trigsimp, powsimp, logcombine, etc.
The basic usages of these functions are as follows:
In [38]: # simplify expands a product
simplify((x+1)*(x+2)*(x+3))
Out[38]:
(x + 1) (x + 2) (x + 3)
In [39]: # simplify uses trigonometric identities
simplify(sin(a)**2 + cos(a)**2)
Out[39]:
1
In [40]: simplify(cos(x)/sin(x))
Out[40]:
$\frac{1}{\tan(x)}$
$\frac{1}{(a+1)(a+2)}$
In [43]: apart(f1)
Out[43]:
$-\frac{1}{a+2} + \frac{1}{a+1}$
$\frac{1}{a+3} + \frac{1}{a+2}$
In [46]: together(f2)
Out[46]:
$\frac{2a+5}{(a+2)(a+3)}$
Simplify usually combines fractions but does not factor:
In [47]: simplify(f2)
Out[47]:
$\frac{2a+5}{(a+2)(a+3)}$
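A minimal sketch of apart and together, assuming f1 and f2 are defined as the two fractions shown in the outputs above:

```python
from sympy import Symbol, apart, together

a = Symbol('a')

f1 = 1 / ((a + 1) * (a + 2))
f1_apart = apart(f1)        # partial fraction decomposition

f2 = 1 / (a + 2) + 1 / (a + 3)
f2_together = together(f2)  # single fraction over a common denominator
print(f1_apart, f2_together)
```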
6.5 Calculus
In addition to algebraic manipulations, the other main use of CAS is to do calculus, like derivatives and
integrals of algebraic expressions.
6.5.1 Differentiation
Differentiation is usually simple. Use the diff function. The first argument is the expression to take the
derivative of, and the second argument is the symbol by which to take the derivative:
In [48]: y
Out[48]:
$(x + \pi)^2$
In [49]: diff(y**2, x)
Out[49]:
$4 (x + \pi)^3$
For higher order derivatives we can do:
In [50]: diff(y**2, x, x)
Out[50]:
$12 (x + \pi)^2$
Out[51]:
$12 (x + \pi)^2$
To calculate the derivative of a multivariate expression, we can do:
In [52]: x, y, z = symbols("x,y,z")
In [54]: diff(f, x, 1, y, 2)
Out[54]:
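A small sketch of first, higher-order and mixed derivatives. The expression f below is an illustrative assumption and not necessarily the f used in this section:

```python
from sympy import cos, diff, pi, sin, symbols

x, y, z = symbols("x,y,z")

g = (x + pi) ** 2
d1 = diff(g ** 2, x)     # first derivative
d2 = diff(g ** 2, x, 2)  # second derivative, same as diff(g**2, x, x)

f = sin(x * y) + cos(y * z)   # hypothetical multivariate expression
mixed = diff(f, x, 1, y, 2)   # once in x, twice in y
print(d1, d2, mixed)
```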
6.6 Integration
Integration is done in a similar fashion:
In [55]: f
Out[55]:
2 cos (yz)
and also improper integrals
In [58]: integrate(exp(-x**2), (x, -oo, oo))
Out[58]:
$\sqrt{\pi}$
Remember, oo is the SymPy notation for infinity.
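The three kinds of integral (indefinite, definite and improper) can be sketched as follows. The integrand f is a hypothetical expression chosen for illustration:

```python
from sympy import cos, exp, integrate, oo, sin, symbols

x, y, z = symbols("x,y,z")
f = y * sin(x) + cos(y * z)  # hypothetical integrand (an assumption)

indefinite = integrate(f, x)                    # antiderivative with respect to x
definite = integrate(f, (x, -1, 1))             # integration limits given as a tuple
gaussian = integrate(exp(-x**2), (x, -oo, oo))  # improper integral
print(indefinite, definite, gaussian)
```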
1.54976773116654
In [62]: Sum(1/n**2, (n, 1, oo)).evalf()
Out[62]:
1.64493406684823
Products work much the same way:
In [63]: Product(n, (n, 1, 10)) # 10!
Out[63]:
$\prod_{n=1}^{10} n$
6.7 Limits
Limits can be evaluated using the limit function. For example,
In [64]: limit(sin(x)/x, x, 0)
Out[64]:
1
We can use limit to check the result of differentiation using the diff function:
In [65]: f
Out[65]:
In [66]: diff(f, x)
Out[66]:
y cos (xy)
$\frac{\mathrm{d}f(x,y)}{\mathrm{d}x} = \lim_{h \to 0} \frac{f(x+h,y) - f(x,y)}{h}$
In [67]: h = Symbol("h")
Out[68]:
y cos (xy)
OK!
We can change the direction from which we approach the limiting point using the dir keyword argument:
Out[69]:
Out[70]:
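A sketch of both uses of limit: checking a derivative through its difference quotient, and approaching a point from either side with dir. The expression f is an assumption made for illustration, chosen so that its x-derivative is y cos(xy):

```python
from sympy import Symbol, cos, diff, limit, sin, symbols

x, y, z = symbols("x,y,z")
h = Symbol("h")
f = sin(x * y) + cos(y * z)  # hypothetical expression (an assumption)

# Difference quotient in the limit h -> 0
deriv = limit((f.subs(x, x + h) - f) / h, h, 0)

# One-sided limits of 1/x at x = 0
from_right = limit(1 / x, x, 0, dir="+")
from_left = limit(1 / x, x, 0, dir="-")
print(deriv, from_right, from_left)
```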
6.8 Series
Series expansion is also one of the most useful features of a CAS. In SymPy we can perform a series expansion
of an expression using the series function:
In [71]: series(exp(x), x)
Out[71]:
$1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + O(x^6)$
By default it expands the expression around x = 0, but we can expand around any value of x by explicitly
including a value in the function call:
In [72]: series(exp(x), x, 1)
Out[72]:
$e + e(x - 1) + \frac{e}{2}(x - 1)^2 + \frac{e}{6}(x - 1)^3 + \frac{e}{24}(x - 1)^4 + \frac{e}{120}(x - 1)^5 + O\left((x - 1)^6; x \to 1\right)$
And we can explicitly define to which order the series expansion should be carried out:
In [73]: series(exp(x), x, 1, 10)
Out[73]:
$e + e(x - 1) + \frac{e}{2}(x - 1)^2 + \frac{e}{6}(x - 1)^3 + \frac{e}{24}(x - 1)^4 + \frac{e}{120}(x - 1)^5 + \frac{e}{720}(x - 1)^6 + \frac{e}{5040}(x - 1)^7 + \frac{e}{40320}(x - 1)^8 + \frac{e}{362880}(x - 1)^9 + O\left((x - 1)^{10}; x \to 1\right)$
The series expansion includes the order of the approximation, which is very useful for keeping track of
the order of validity when we do calculations with series expansions of different order:
In [74]: s1 = cos(x).series(x, 0, 5)
s1
Out[74]:
$1 - \frac{x^2}{2} + \frac{x^4}{24} + O(x^5)$
In [75]: s2 = sin(x).series(x, 0, 2)
s2
Out[75]:
$x + O(x^2)$
In [76]: expand(s1 * s2)
Out[76]:
$x + O(x^2)$
If we want to get rid of the order information we can use the removeO method:
In [77]: expand(s1.removeO() * s2.removeO())
Out[77]:
$\frac{x^5}{24} - \frac{x^3}{2} + x$
But note that this is not the correct expansion of cos(x) sin(x) to 5th order:
In [78]: (cos(x)*sin(x)).series(x, 0, 6)
Out[78]:
$x - \frac{2x^3}{3} + \frac{2x^5}{15} + O(x^6)$
6.9 Linear algebra
6.9.1 Matrices
Matrices are defined using the Matrix class:
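A minimal sketch of defining symbolic matrices. The symbol names are assumptions chosen to match the outputs shown in this section:

```python
from sympy import Matrix, symbols

m11, m12, m21, m22 = symbols("m11, m12, m21, m22")
b1, b2 = symbols("b1, b2")

A = Matrix([[m11, m12], [m21, m22]])  # 2x2 matrix of symbols
b = Matrix([[b1], [b2]])              # 2x1 column vector

product = A * b
determinant = A.det()
inverse = A.inv()
```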
Out[80]:
$\begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix}$
Out[81]:
$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$
With Matrix class instances we can do the usual matrix algebra operations:
In [82]: A**2
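Out[82]:
$\begin{bmatrix} m_{11}^2 + m_{12} m_{21} & m_{11} m_{12} + m_{12} m_{22} \\ m_{11} m_{21} + m_{21} m_{22} & m_{12} m_{21} + m_{22}^2 \end{bmatrix}$
In [83]: A * b
Out[83]:
$\begin{bmatrix} b_1 m_{11} + b_2 m_{12} \\ b_1 m_{21} + b_2 m_{22} \end{bmatrix}$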
And calculate determinants and inverses, and the like:
In [84]: A.det()
Out[84]:
$m_{11} m_{22} - m_{12} m_{21}$
In [85]: A.inv()
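Out[85]:
$\begin{bmatrix} \frac{1}{m_{11}} + \frac{m_{12} m_{21}}{m_{11}^2 \left(m_{22} - \frac{m_{12} m_{21}}{m_{11}}\right)} & -\frac{m_{12}}{m_{11} \left(m_{22} - \frac{m_{12} m_{21}}{m_{11}}\right)} \\ -\frac{m_{21}}{m_{11} \left(m_{22} - \frac{m_{12} m_{21}}{m_{11}}\right)} & \frac{1}{m_{22} - \frac{m_{12} m_{21}}{m_{11}}} \end{bmatrix}$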
6.10 Solving equations
For solving equations and systems of equations we can use the solve function:
In [86]: solve(x**2 - 1, x)
Out[86]:
$[-1, 1]$
Out[87]:
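$\left[-i\sqrt{-\frac{1}{2} + \frac{\sqrt{5}}{2}},\; i\sqrt{-\frac{1}{2} + \frac{\sqrt{5}}{2}},\; -\sqrt{\frac{1}{2} + \frac{\sqrt{5}}{2}},\; \sqrt{\frac{1}{2} + \frac{\sqrt{5}}{2}}\right]$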
System of equations:
Out[88]:
{x : 1, y : 0}
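A sketch of solve for a single equation, a linear system, and a system whose right-hand sides are symbols. The particular systems are assumptions chosen to reproduce the kind of output shown in this section:

```python
from sympy import solve, symbols

x, y, a, c = symbols("x, y, a, c")

roots = solve(x**2 - 1, x)                             # single equation
numeric = solve([x + y - 1, x - y - 1], [x, y])        # linear system with numbers
symbolic = solve([x + y - a, x - y - c], [x, y])       # solution in terms of a and c
print(roots, numeric, symbolic)
```

Each expression passed to solve is taken to equal zero, so x + y - 1 stands for the equation x + y = 1.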
In terms of other symbolic expressions:
Out[89]:
$\left\{x: \frac{a}{2} + \frac{c}{2},\; y: \frac{a}{2} - \frac{c}{2}\right\}$
6.12 States
We can define symbol states, kets and bras:
In [91]: Ket('psi')
Out[91]:
$|\psi\rangle$
In [92]: Bra('psi')
Out[92]:
$\langle\psi|$
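In [93]: u = Ket('0')
d = Ket('1')
Out[94]:
$\alpha|0\rangle + \sqrt{-|\alpha|^2 + 1}\,|1\rangle$
In [95]: Dagger(phi)
Out[95]:
$\alpha\langle 0| + \sqrt{-|\alpha|^2 + 1}\,\langle 1|$
In [96]: Dagger(phi) * d
Out[96]:
$\left(\alpha\langle 0| + \sqrt{-|\alpha|^2 + 1}\,\langle 1|\right)|1\rangle$
In [97]: qapply(Dagger(phi) * d)
Out[97]:
$\alpha\langle 0|1\rangle + \sqrt{-|\alpha|^2 + 1}\,\langle 1|1\rangle$
In [98]: qapply(Dagger(phi) * u)
Out[98]:
$\alpha\langle 0|0\rangle + \sqrt{-|\alpha|^2 + 1}\,\langle 1|0\rangle$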
6.12.1 Operators
In [99]: A = Operator('A')
B = Operator('B')
In [100]: A * B == B * A
Out[100]: False
In [101]: expand((A+B)**3)
Out[101]:
$ABA + A(B)^2 + (A)^2 B + (A)^3 + BAB + B(A)^2 + (B)^2 A + (B)^3$
In [102]: c = Commutator(A,B)
c
Out[102]:
[A, B]
We can use the doit method to evaluate the commutator:
In [103]: c.doit()
Out[103]:
$AB - BA$
We can mix quantum operators with C-numbers:
In [104]: c = Commutator(a * A, b * B)
c
Out[104]:
↵ [A, B]
To expand the commutator, use the expand method with the commutator=True keyword argument:
Out[105]:
[A, B] B + A [A, B]
Out[106]:
$[A^\dagger, B^\dagger]$
In [107]: ac = AntiCommutator(A,B)
In [108]: ac.doit()
Out[108]:
AB + BA
Example: Quadrature commutator Let’s look at the commutator of the electromagnetic field quadratures
x and p. We can write the quadrature operators in terms of the creation and annihilation operators
as:
$x = (a + a^\dagger)/\sqrt{2}$
$p = -i(a - a^\dagger)/\sqrt{2}$
In [109]: X = (A + Dagger(A))/sqrt(2)
X
Out[109]:
$\frac{\sqrt{2}}{2}\left(A^\dagger + A\right)$
In [110]: P = -I * (A - Dagger(A))/sqrt(2)
P
Out[110]:
$-\frac{\sqrt{2}\,i}{2}\left(-A^\dagger + A\right)$
Let’s expand the commutator [x, p]
Out[111]:
$-i\,[A^\dagger, A]$
Here we see directly that the well known commutation relation for the quadratures
$[x, p] = i$
is directly related to
$[A, A^\dagger] = 1$
(which SymPy does not know about, and does not simplify).
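The relation can be checked by evaluating the commutator explicitly. This is a sketch under the assumption that the names used in this section come from sympy.physics.quantum; the variables xp_commutator and expected are illustrative:

```python
from sympy import I, sqrt
from sympy.physics.quantum import Commutator, Dagger, Operator

A = Operator('A')
X = (A + Dagger(A)) / sqrt(2)
P = -I * (A - Dagger(A)) / sqrt(2)

# Evaluate [X, P] = XP - PX explicitly and expand the operator products
xp_commutator = Commutator(X, P).doit().expand()

# The same quantity written out by hand: i (A A^dagger - A^dagger A) = i [A, A^dagger]
expected = (I * (A * Dagger(A) - Dagger(A) * A)).expand()
```

Since the two expressions agree, [x, p] = i follows as soon as [A, A†] = 1 is imposed by hand.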
For more details on the quantum module in SymPy, see:
• http://docs.sympy.org/0.7.2/modules/physics/quantum/index.html
• http://nbviewer.ipython.org/urls/raw.github.com/ipython/ipython/master/docs/examples/notebooks/sympy_quantum
Chapter 7
Using Fortran and C code with Python
This curriculum builds on material by J. Robert Johansson from his “Introduction to scientific computing
with Python,” generously made available under a Creative Commons Attribution 3.0 Unported License at
https://github.com/jrjohansson/scientific-python-lectures. The Continuum Analytics enhancements use the
Creative Commons Attribution-NonCommercial 4.0 International License.
7.1 Fortran
7.1.1 F2PY
F2PY is a program that (almost) automatically wraps Fortran code for use in Python. By using the f2py
program we can compile Fortran code into a module that we can import in a Python program.
F2PY is a part of NumPy, but you will also need to have a Fortran compiler to run the examples below.
      subroutine hellofortran (n)
      integer n
      do 100 i=0, n
          print *, "Fortran says hello"
100   continue
      end
Overwriting hellofortran.f
running build
running config cc
unifing config cc, config, build clib, build ext, build commands --compiler options
running config fc
unifing config fc, config, build clib, build ext, build commands --fcompiler options
running build src
build src
building extension "hellofortran" sources
f2py options: []
f2py:> /tmp/tmp 6mh2wh9/src.linux-x86 64-3.4/hellofortranmodule.c
creating /tmp/tmp 6mh2wh9/src.linux-x86 64-3.4
Reading fortran codes...
Reading file ’hellofortran.f’ (format:fix,strict)
Post-processing...
Block: hellofortran
Block: hellofortran
Post-processing (stage 2)...
Building modules...
Building module "hellofortran"...
Constructing wrapper function "hellofortran"...
hellofortran(n)
Wrote C/API module "hellofortran" to file "/tmp/tmp 6mh2wh9/src.linux-x86 64-3.4/hellofortranmod
adding ’/tmp/tmp 6mh2wh9/src.linux-x86 64-3.4/fortranobject.c’ to sources.
adding ’/tmp/tmp 6mh2wh9/src.linux-x86 64-3.4’ to include dirs.
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmp 6
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmp 6
build src: building npy-pkg config files
running build ext
customize UnixCCompiler
customize UnixCCompiler using build ext
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using build ext
building ’hellofortran’ extension
compiling C sources
C compiler: gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
hellofortran.hellofortran(5)
Overwriting hello.py
Fortran says hello
Fortran says hello
Fortran says hello
      subroutine dprod(x, y, n)
      double precision x(n), y
      y = 1.0
      do 100 i=1, n
          y = y * x(i)
100   continue
      end
Overwriting dprod.f
In [154]: !rm -f dprod.pyf
!f2py3 -m dprod -h dprod.pyf dprod.f
Reading fortran codes...
Reading file ’dprod.f’ (format:fix,strict)
Post-processing...
Block: dprod
{}
In: :dprod:dprod.f:dprod
vars2fortran: No typespec for argument "n".
Block: dprod
Post-processing (stage 2)...
Saving signatures to file "./dprod.pyf"
The f2py program generated a module declaration file called dprod.pyf. Let’s look at what’s in it:
In [155]: !cat dprod.pyf
! -*- f90 -*-
! Note: the context of this file is case sensitive.
In [156]: %%file dprod.pyf
python module dprod ! in
interface ! in :dprod
subroutine dprod(x,y,n) ! in :dprod:dprod.f
double precision dimension(n), intent(in) :: x
double precision, intent(out) :: y
integer, optional,check(len(x)>=n),depend(x),intent(in) :: n=len(x)
end subroutine dprod
end interface
end python module dprod
Overwriting dprod.pyf
Compile the Fortran code into a module that can be imported in Python:
running build
running config cc
unifing config cc, config, build clib, build ext, build commands --compiler options
running config fc
unifing config fc, config, build clib, build ext, build commands --fcompiler options
running build src
build src
building extension "dprod" sources
creating /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4
f2py options: []
f2py: dprod.pyf
Reading fortran codes...
Reading file ’dprod.pyf’ (format:free)
Post-processing...
Block: dprod
Block: dprod
Post-processing (stage 2)...
Building modules...
Building module "dprod"...
Constructing wrapper function "dprod"...
y = dprod(x,[n])
Wrote C/API module "dprod" to file "/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/dprodmodule.c"
adding ’/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/fortranobject.c’ to sources.
adding ’/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4’ to include dirs.
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpo0
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpo0
build src: building npy-pkg config files
running build ext
customize UnixCCompiler
customize UnixCCompiler using build ext
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using build ext
building ’dprod’ extension
compiling C sources
C compiler: gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
creating /tmp/tmpo0ulkq7s/tmp
creating /tmp/tmpo0ulkq7s/tmp/tmpo0ulkq7s
creating /tmp/tmpo0ulkq7s/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4
compile options: ’-I/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4 -I/home/dhavide/anaconda3/lib/python3.4/site-p
gcc: /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/fortranobject.c
In file included from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/array
from /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/fortranobject.h:13,
from /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/fortranobject.c:2:
/home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/npy 1 7 deprecated api.h:15:
#warning "Using deprecated NumPy API, disable it by " \
^
gcc: /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/dprodmodule.c
In file included from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/array
from /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/fortranobject.h:13,
from /tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/dprodmodule.c:18:
/home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/npy 1 7 deprecated api.h:15:
#warning "Using deprecated NumPy API, disable it by " \
^
/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/dprodmodule.c:111:12: warning: ‘f2py size’ defined but not used [-
static int f2py size(PyArrayObject* var, ...)
^
compiling Fortran sources
Fortran f77 compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-
Fortran f90 compiler: /usr/bin/gfortran -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran fix compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -Wall -g -fno-secon
compile options: ’-I/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4 -I/home/dhavide/anaconda3/lib/python3.4/site-p
gfortran:f77: dprod.f
/usr/bin/gfortran -Wall -g -Wall -g -shared /tmp/tmpo0ulkq7s/tmp/tmpo0ulkq7s/src.linux-x86 64-3.4/dprodm
Removing build directory /tmp/tmpo0ulkq7s
In [159]: help(dprod)
NAME
dprod
DESCRIPTION
This module ’dprod’ is auto-generated with f2py (version:2).
Functions:
y = dprod(x,n=len(x))
.
DATA
dprod = <fortran object>
VERSION
b’$Revision: $’
FILE
/home/dhavide/repositories/scientific-python-lectures/dprod.cpython-34m.so
In [160]: dprod.dprod(arange(1,50))
Out[160]: 6.082818640342675e+62
Out[161]: 6.0828186403426752e+62
Out[162]: 120.0
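As a quick sanity check on these values (a small sketch using only the standard library, not part of the original notebook): the product of the integers 1 through 49 is 49!, and 120.0 is 5!. The loop below accumulates the product in double precision, in the same order as the Fortran loop.

```python
import math

# Product of 1..49, computed exactly with Python integers.
exact = math.prod(range(1, 50))

# The same product accumulated in double precision, one factor at a time.
y = 1.0
for k in range(1, 50):
    y *= k

print(exact == math.factorial(49))
print(math.isclose(y, 6.082818640342675e+62, rel_tol=1e-12))
print(math.prod(range(1, 6)))
```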
Compare performance:
The slowest run took 5.61 times longer than the fastest. This could mean that an intermediate result is
1000000 loops, best of 3: 1.63 µs per loop
The slowest run took 6.64 times longer than the fastest. This could mean that an intermediate result is
100000 loops, best of 3: 8.46 µs per loop
Fortran subroutine for the same thing: here we have added the intent(in) and intent(out) declarations as
comment lines in the original Fortran code, so we do not need to manually edit the Fortran module
declaration file generated by f2py.
cf2py intent(hide) :: n
b(1) = a(1)
do 100 i=2, n
b(i) = b(i-1) + a(i)
100 continue
end
Overwriting dcumsum.f
running build
running config cc
unifing config cc, config, build clib, build ext, build commands --compiler options
running config fc
unifing config fc, config, build clib, build ext, build commands --fcompiler options
running build src
build src
building extension "dcumsum" sources
f2py options: []
f2py:> /tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsummodule.c
creating /tmp/tmpe46xtmge/src.linux-x86 64-3.4
Reading fortran codes...
Reading file ’dcumsum.f’ (format:fix,strict)
Post-processing...
Block: dcumsum
Block: dcumsum
Post-processing (stage 2)...
Building modules...
Building module "dcumsum"...
Constructing wrapper function "dcumsum"...
b = dcumsum(a)
Wrote C/API module "dcumsum" to file "/tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsummodule.c"
adding ’/tmp/tmpe46xtmge/src.linux-x86 64-3.4/fortranobject.c’ to sources.
adding ’/tmp/tmpe46xtmge/src.linux-x86 64-3.4’ to include dirs.
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpe4
copying /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpe4
build src: building npy-pkg config files
running build ext
customize UnixCCompiler
customize UnixCCompiler using build ext
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using build ext
building ’dcumsum’ extension
compiling C sources
C compiler: gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
creating /tmp/tmpe46xtmge/tmp
creating /tmp/tmpe46xtmge/tmp/tmpe46xtmge
creating /tmp/tmpe46xtmge/tmp/tmpe46xtmge/src.linux-x86 64-3.4
compile options: ’-I/tmp/tmpe46xtmge/src.linux-x86 64-3.4 -I/home/dhavide/anaconda3/lib/python3.4/site-p
gcc: /tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsummodule.c
In file included from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/array
from /tmp/tmpe46xtmge/src.linux-x86 64-3.4/fortranobject.h:13,
from /tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsummodule.c:18:
/home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/npy 1 7 deprecated api.h:15:
#warning "Using deprecated NumPy API, disable it by " \
^
/tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsummodule.c:111:12: warning: ‘f2py size’ defined but not used
static int f2py size(PyArrayObject* var, ...)
^
gcc: /tmp/tmpe46xtmge/src.linux-x86 64-3.4/fortranobject.c
In file included from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/ndarr
from /home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/array
from /tmp/tmpe46xtmge/src.linux-x86 64-3.4/fortranobject.h:13,
from /tmp/tmpe46xtmge/src.linux-x86 64-3.4/fortranobject.c:2:
/home/dhavide/anaconda3/lib/python3.4/site-packages/numpy/core/include/numpy/npy 1 7 deprecated api.h:15:
#warning "Using deprecated NumPy API, disable it by " \
^
compiling Fortran sources
Fortran f77 compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-
Fortran f90 compiler: /usr/bin/gfortran -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran fix compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -Wall -g -fno-secon
compile options: ’-I/tmp/tmpe46xtmge/src.linux-x86 64-3.4 -I/home/dhavide/anaconda3/lib/python3.4/site-p
gfortran:f77: dcumsum.f
/usr/bin/gfortran -Wall -g -Wall -g -shared /tmp/tmpe46xtmge/tmp/tmpe46xtmge/src.linux-x86 64-3.4/dcumsu
Removing build directory /tmp/tmpe46xtmge
In [169]: import dcumsum
In [170]: a = array([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0])
In [171]: py_dcumsum(a)
Out[171]: array([ 1., 3., 6., 10., 15., 21., 28., 36.])
In [172]: dcumsum.dcumsum(a)
Out[172]: array([ 1., 3., 6., 10., 15., 21., 28., 36.])
In [173]: cumsum(a)
Out[173]: array([ 1., 3., 6., 10., 15., 21., 28., 36.])
Benchmark the different implementations:
In [174]: a = rand(10000)
In [175]: timeit py_dcumsum(a)
100 loops, best of 3: 5.41 ms per loop
In [176]: timeit dcumsum.dcumsum(a)
10000 loops, best of 3: 18.5 µs per loop
In [177]: timeit a.cumsum()
The slowest run took 17.25 times longer than the fastest. This could mean that an intermediate result is
10000 loops, best of 3: 47 µs per loop
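The pure-Python reference py_dcumsum used in these comparisons is not shown above. A minimal sketch of what such a function might look like is given here; it is an assumption, written with plain lists rather than NumPy arrays so that it runs with the standard library alone.

```python
def py_dcumsum(a):
    """Cumulative sum of a sequence using an explicit Python loop."""
    b = [0.0] * len(a)
    if len(a) == 0:
        return b
    b[0] = a[0]
    for i in range(1, len(a)):
        # Same recurrence as the Fortran loop: b(i) = b(i-1) + a(i)
        b[i] = b[i - 1] + a[i]
    return b

print(py_dcumsum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

Every iteration of this loop runs in the interpreter, which is why the compiled versions come out so much faster in the timings.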
7.1.5 Further reading
1. http://www.scipy.org/F2py
2. http://dsnra.jpl.nasa.gov/software/Python/F2PY_tutorial.pdf
3. http://www.shocksolution.com/2009/09/f2py-binding-fortran-python/
7.2 C
7.3 ctypes
ctypes is a Python library for calling out to C code. It is not as automatic as f2py: we need to load
the library manually and set properties such as each function’s return and argument types. On the other
hand, we do not need to touch the C code at all.
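#include <stdio.h>

void
hello(int n)
{
    int i;
    for (i = 0; i < n; i++)
    {
        printf("C says hello\n");
    }
}

double
dprod(double *x, int n)
{
    int i;
    double y = 1.0;
    /* multiply all n elements of x together */
    for (i = 0; i < n; i++)
    {
        y *= x[i];
    }
    return y;
}

void
dcumsum(double *a, double *b, int n)
{
    int i;
    b[0] = a[0];
    for (i = 1; i < n; i++)
    {
        b[i] = a[i] + b[i-1];
    }
}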
Overwriting functions.c
In [179]: !gcc -c -Wall -O2 -Wall -ansi -pedantic -fPIC -o functions.o functions.c
!gcc -o libfunctions.so -shared functions.o
libfunctions.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sh
Now we need to write wrapper functions to access the C library. To load the library we use the ctypes
package, which is included in the Python standard library (with extensions from NumPy for passing arrays
to C). Then we manually set the types of the arguments and return values (no automatic code inspection here!).
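import numpy
import ctypes
# load libfunctions.so (built above) from the current directory
_libfunctions = numpy.ctypeslib.load_library('libfunctions', '.')
_libfunctions.hello.argtypes = [ctypes.c_int]
_libfunctions.hello.restype = None  # hello() returns void
def hello(n):
    return _libfunctions.hello(int(n))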
Overwriting functions.py
import functions
functions.hello(3)
C says hello
C says hello
C says hello
Out[185]: 120.0
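The same pattern (load a shared library, declare argument and return types, wrap the call in a Python function) can be tried without compiling anything. The sketch below wraps abs from the C standard library instead of libfunctions.so. It assumes a POSIX system; the wrapper name c_abs is hypothetical.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library("c") locates the C standard
# library; if it returns None, CDLL(None) falls back to the symbols that
# are already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature of `int abs(int)` by hand, as ctypes cannot
# inspect the C code for us.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

def c_abs(n):
    """Thin Python wrapper around the C function abs()."""
    return libc.abs(int(n))

print(c_abs(-5))
```

With argtypes set, ctypes rejects arguments it cannot convert to a C int instead of passing garbage to the C function.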
7.4 Cython
Cython is a hybrid between Python and C that can be compiled: basically Python code with type declarations.
cimport numpy
Overwriting cy_dcumsum.pyx
A build file for generating C code and compiling it into a Python module.
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("cy_dcumsum", ["cy_dcumsum.pyx"], include_dirs=[numpy.get_include
],
)
Overwriting setup.py
^
gcc -pthread -shared -L/home/dhavide/anaconda3/lib -Wl,-rpath=/home/dhavide/anaconda3/lib,--no-as-needed
In [196]: import cy_dcumsum
In [197]: a = array([1,2,3,4], dtype=float)
b = empty_like(a)
cy_dcumsum.dcumsum(a,b)
b
Out[197]: array([ 1., 3., 6., 10.])
In [198]: a = array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
In [199]: b = empty_like(a)
cy_dcumsum.dcumsum(a, b)
b
Out[199]: array([ 1., 3., 6., 10., 15., 21., 28., 36.])
In [200]: py_dcumsum(a)
Out[200]: array([ 1., 3., 6., 10., 15., 21., 28., 36.])
In [201]: a = rand(100000)
b = empty_like(a)
In [202]: timeit py_dcumsum(a)
10 loops, best of 3: 72.7 ms per loop
In [203]: timeit cy_dcumsum.dcumsum(a,b)
1000 loops, best of 3: 469 µs per loop
cimport numpy
7.4.2 Further reading
• http://cython.org
• http://docs.cython.org/src/userguide/tutorial.html
• http://wiki.cython.org/tutorials/numpy
Figure 7.1: Continuum Logo
Chapter 8
Tools for high-performance computing applications
This curriculum builds on material by J. Robert Johansson from his “Introduction to scientific computing
with Python,” generously made available under a Creative Commons Attribution 3.0 Unported License at
https://github.com/jrjohansson/scientific-python-lectures. The Continuum Analytics enhancements use the
Creative Commons Attribution-NonCommercial 4.0 International License.
8.1 multiprocessing
Python has a built-in process-based library for concurrent computing, called multiprocessing.
In [2]: import multiprocessing
import os
import time
import numpy
In [3]: def task(args):
print("PID =", os.getpid(), ", args =", args)
In [7]: result
The multiprocessing package is very useful for highly parallel tasks that do not need to communicate
with each other, other than when sending the initial data to the pool of processes and when collecting
the results.
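A minimal sketch of this pattern is shown below: the input values are sent to a pool of worker processes and the results are collected with map. The function names are hypothetical, and the explicit "fork" start method is an assumption that holds on POSIX systems; on other platforms the default context would be used instead.

```python
import multiprocessing
import os

def square_with_pid(x):
    """Return the worker's process id together with x squared."""
    return os.getpid(), x * x

def parallel_squares(values, processes=2):
    # Assumption: a POSIX system, where the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        # map sends the inputs to the workers and gathers results in order.
        return pool.map(square_with_pid, values)

if __name__ == "__main__":
    for pid, sq in parallel_squares(range(6)):
        print("PID =", pid, ", result =", sq)
```

The tasks are independent, so the only communication is the initial distribution of the inputs and the final collection of the results.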
$ ipcluster start -n 4
Or, alternatively, from the “Clusters” tab on the IPython notebook dashboard page. This will start
4 IPython engines on the current host, which is useful for multicore systems. It is also possible to set up
IPython clusters that span many nodes in a computing cluster. For more information about possible
use cases, see the official documentation Using IPython for parallel computing.
To use the IPython cluster in our Python programs or notebooks, we start by creating an instance of
IPython.parallel.Client:
Using the ‘ids’ attribute we can retrieve a list of ids for the IPython engines in the cluster:
In [10]: cli.ids
Out[10]: [0, 1, 2, 3]
Each of these engines is ready to execute tasks. We can selectively run code on individual engines:
Out[12]: 11411
Out[13]: 11464
In [14]: # run it on ALL of the engines at the same time
cli[:].apply_sync(getpid)
We can use this cluster of IPython engines to execute tasks in parallel. The easiest way to dispatch a
function to different engines is to define the function with the decorator:
@view.parallel(block=True)
Here, view is supposed to be the engine pool to which we want to dispatch the function (task). Once our
function is defined this way we can dispatch it to the engines using the map method of the resulting class (in
Python, a decorator is a language construct which automatically wraps the function into another function
or a class).
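As a rough illustration of what such a decorator does, here is a plain-Python sketch. It is not the IPython.parallel implementation: the names with_map and Mapped are hypothetical, and the calls run sequentially in the current process rather than on remote engines.

```python
import functools

def with_map(func):
    """Hypothetical decorator: wrap func in an object that also has a map method."""

    class Mapped:
        def __init__(self, f):
            self._f = f
            # Copy the name and docstring of the wrapped function.
            functools.update_wrapper(self, f)

        def __call__(self, *args, **kwargs):
            return self._f(*args, **kwargs)

        def map(self, sequence):
            # A real parallel view would send each call to an engine;
            # here the calls simply run one after another.
            return [self._f(item) for item in sequence]

    return Mapped(func)

@with_map
def double(x):
    return 2 * x

print(double(3))
print(double.map([1, 2, 3]))
```

After decoration the name double refers to a Mapped instance rather than a plain function, which is why it has a map method.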
To see how all this works, let’s look at an example:
In [16]: @dview.parallel(block=True)
         def dummy_task(delay):
             """ a dummy task that takes 'delay' seconds to finish """
             import os, time
             t0 = time.time()
             pid = os.getpid()
             time.sleep(delay)
             t1 = time.time()
Now, to map the function dummy_task to the random delay time data, we use the map method in
dummy_task:
In [18]: dummy_task.map(delay_times)
Let’s do the same thing again with many more tasks and visualize how these tasks are executed on
different IPython engines:
yticks = []
yticklabels = []
tmin = min(res[:,1])
for n, pid in enumerate(numpy.unique(res[:,0])):
    yticks.append(n)
    yticklabels.append("%d" % pid)
    for m in numpy.where(res[:,0] == pid)[0]:
        ax.add_patch(plt.Rectangle((res[m,1] - tmin, n-0.25),
                     res[m,2] - res[m,1], 0.5, color="green", alpha=0.5))
ax.set_ylim(-.5, n+.5)
ax.set_xlim(0, max(res[:,2]) - tmin + 0.)
ax.set_yticks(yticks)
ax.set_yticklabels(yticklabels)
ax.set_ylabel("PID")
ax.set_xlabel("seconds")
That’s a nice and easy parallelization! We can see that we utilize all four engines quite well.
But one shortcoming so far is that the tasks are not load balanced, so one engine might be idle while
others still have more tasks to work on.
However, the IPython parallel environment provides a number of alternative “views” of the engine cluster,
and there is a view that provides load balancing as well (above we have used the “direct view”, which is why
we called it “dview”).
To obtain a load balanced view we simply use the load_balanced_view method in the engine cluster
client instance cli:
In [23]: @lbview.parallel(block=True)
         def dummy_task_load_balanced(delay):
             """ a dummy task that takes 'delay' seconds to finish """
             import os, time
             t0 = time.time()
             pid = os.getpid()
             time.sleep(delay)
             t1 = time.time()
In the example above we can see that the engine cluster is a bit more efficiently used, and the time to
completion is shorter than in the previous example.
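Why load balancing shortens the time to completion can be seen in a small simulation. The sketch below is a simplified model, not the IPython scheduler: it assumes that without load balancing the tasks are split into contiguous chunks up front, and that with load balancing each task goes to whichever engine becomes idle first. The function names and the delay values are made up for illustration.

```python
import heapq

def static_makespan(delays, n_engines):
    """Split the tasks into contiguous chunks up front, one chunk per engine."""
    if not delays:
        return 0.0
    chunk = -(-len(delays) // n_engines)  # ceiling division
    loads = [sum(delays[i:i + chunk]) for i in range(0, len(delays), chunk)]
    return max(loads)

def dynamic_makespan(delays, n_engines):
    """Hand each task, in order, to whichever engine becomes idle first."""
    finish_times = [0.0] * n_engines
    heapq.heapify(finish_times)
    for d in delays:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + d)
    return max(finish_times)

# Two long tasks followed by six short ones, on four engines.
delays = [4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(static_makespan(delays, 4))   # 8.0: one engine gets both long tasks
print(dynamic_makespan(delays, 4))  # 4.0: the long tasks land on different engines
```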
8.3 MPI
When more communication between processes is required, more sophisticated solutions such as MPI and OpenMP
are often needed. MPI is a process-based parallel processing library/protocol, and can be used in Python
programs through the mpi4py package:
http://mpi4py.scipy.org/
To use the mpi4py package we import MPI from mpi4py:
from mpi4py import MPI
An MPI Python program must be started using the mpirun -n N command, where N is the number of
processes that should be included in the process group.
Note that the IPython parallel environment also has support for MPI, but to begin with we will use mpi4py
and mpirun in the following examples.
8.3.1 Example 1
In [30]: %%file mpitest.py
         from mpi4py import MPI
         comm = MPI.COMM_WORLD
         rank = comm.Get_rank()
         if rank == 0:
             data = [1.0, 2.0, 3.0, 4.0]
             comm.send(data, dest=1, tag=11)
         elif rank == 1:
             data = comm.recv(source=0, tag=11)
Overwriting mpitest.py
8.3.2 Example 2
Send a numpy array from one process to another:
from mpi4py import MPI
import numpy
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = numpy.random.rand(10)
    comm.Send(data, dest=1, tag=13)
elif rank == 1:
    # the receive buffer must be allocated with matching size and dtype
    data = numpy.empty(10, dtype=numpy.float64)
    comm.Recv(data, source=0, tag=13)
Overwriting mpi-numpy-array.py
from mpi4py import MPI
import numpy
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
p = comm.Get_size()
def matvec(comm, A, x):
    m = A.shape[0] // p  # rows per process; slice bounds must be integers
    y_part = numpy.dot(A[rank * m:(rank+1)*m], x)
    y = numpy.zeros_like(x)
    comm.Allgather([y_part, MPI.DOUBLE], [y, MPI.DOUBLE])
    return y
A = numpy.load("random-matrix.npy")
x = numpy.load("random-vector.npy")
y_mpi = matvec(comm, A, x)
if rank == 0:
    y = numpy.dot(A, x)
    print(y_mpi)
    print("sum(y - y_mpi) = %f" % (y - y_mpi).sum())
Writing mpi-matrix-vector.py
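The row-block decomposition used for the matrix-vector product can be checked without MPI. The following sketch uses plain Python lists and a sequential loop over the ranks; the function name matvec_rows and the small test matrix are hypothetical. Each rank computes its own slice of rows, and the slices are concatenated in rank order, which is what Allgather does when every rank contributes a block of equal size. Note the integer division: slice bounds must be integers.

```python
def matvec_rows(A, x, rank, p):
    """Compute the block of y = A x owned by process `rank` out of `p`."""
    m = len(A) // p  # rows per process; assumes len(A) is divisible by p
    rows = A[rank * m:(rank + 1) * m]
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in rows]

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x = [1.0, -1.0]
p = 2

# Emulate Allgather: concatenate the blocks in rank order.
y = []
for rank in range(p):
    y.extend(matvec_rows(A, x, rank, p))
print(y)
```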
from mpi4py import MPI
import numpy as np
def psum(a):
    r = MPI.COMM_WORLD.Get_rank()
    size = MPI.COMM_WORLD.Get_size()
    m = len(a) // size  # elements per process; slice bounds must be integers
    locsum = np.sum(a[r*m:(r+1)*m])
    rcvBuf = np.array(0.0, 'd')
    MPI.COMM_WORLD.Allreduce([locsum, MPI.DOUBLE], [rcvBuf, MPI.DOUBLE], op=MPI.SUM)
    return rcvBuf
a = np.load("random-vector.npy")
s = psum(a)
if MPI.COMM_WORLD.Get_rank() == 0:
    print("sum = %f, numpy sum = %f" % (s,a.sum()))
Overwriting mpi-psum.py
8.3.5 Further reading
• http://mpi4py.scipy.org
• http://mpi4py.scipy.org/docs/usrman/tutorial.html
• https://computing.llnl.gov/tutorials/mpi/
8.4 OpenMP
What about OpenMP? OpenMP is a standard and widely used thread-based parallel API that unfortunately
is not useful directly in Python. The reason is that the CPython implementation uses a global interpreter
lock (GIL), which prevents several threads from executing Python code simultaneously. Threads are therefore
not useful for parallel computing in Python, unless they are only used to wrap compiled code that does the
OpenMP parallelization (NumPy can do something like that).
This is clearly a limitation in the Python interpreter, and as a consequence all parallelization of pure
Python code must use processes (not threads).
However, there is a way around this that is not that painful. When calling out to compiled code the GIL
can be released, and it is possible to write Python-like code in Cython where we can selectively release the
GIL and do OpenMP computations.
Here is a simple example that shows how OpenMP can be used via Cython:
cimport cython
cimport numpy
from cython.parallel import prange, parallel
cimport openmp
def cy_openmp_test():
    cdef int n, N
    # release GIL so that we can use OpenMP
    with nogil, parallel():
        N = openmp.omp_get_num_threads()
        n = openmp.omp_get_thread_num()
        with gil:
            print("Number of threads %d: thread number %d" % (N, n))
In [46]: cy_openmp_test()
8.4.1 Example: matrix vector multiplication
In [47]: # prepare some random data
N = 4 * N_core
M = numpy.random.rand(N, N)
x = numpy.random.rand(N)
y = numpy.zeros_like(x)
In [48]: %%cython
cimport cython
cimport numpy
import numpy
@cython.boundscheck(False)
@cython.wraparound(False)
def cy_matvec(numpy.ndarray[numpy.float64_t, ndim=2] M,
numpy.ndarray[numpy.float64_t, ndim=1] x,
numpy.ndarray[numpy.float64_t, ndim=1] y):
return y
Out[49]: array([ 0., 0., 0., 0., 0., 0., 0., 0.])
The slowest run took 465.61 times longer than the fastest. This could mean that an intermediate result i
1000000 loops, best of 3: 2.15 µs per loop
The slowest run took 4.91 times longer than the fastest. This could mean that an intermediate result is
100000 loops, best of 3: 3.29 µs per loop
The Cython implementation here is a bit slower than numpy.dot, but not by much, so if we can use
multiple cores with OpenMP it should be possible to beat the performance of numpy.dot.
cimport cython
cimport numpy
from cython.parallel import parallel
cimport openmp
@cython.boundscheck(False)
@cython.wraparound(False)
def cy_matvec_omp(numpy.ndarray[numpy.float64_t, ndim=2] M,
numpy.ndarray[numpy.float64_t, ndim=1] x,
numpy.ndarray[numpy.float64_t, ndim=1] y):
return y
Out[53]: array([ 0., 0., 0., 0., 0., 0., 0., 0.])
The slowest run took 297.37 times longer than the fastest. This could mean that an intermediate result i
1000000 loops, best of 3: 1.93 µs per loop
The slowest run took 69.79 times longer than the fastest. This could mean that an intermediate result is
100000 loops, best of 3: 10.3 µs per loop
Now, this implementation is much slower than numpy.dot for this problem size, because of overhead
associated with OpenMP and threading, etc. But let’s look at how the different implementations compare
with larger matrix sizes:
M = numpy.random.rand(N, N)
x = numpy.random.rand(N)
y = numpy.zeros_like(x)
t0 = time.time()
numpy.dot(M, x)
duration_ref[idx] = time.time() - t0
t0 = time.time()
cy_matvec(M, x, y)
duration_cy[idx] = time.time() - t0
t0 = time.time()
cy_matvec_omp(M, x, y)
duration_cy_omp[idx] = time.time() - t0
ax.legend(loc=2)
ax.set_yscale("log")
ax.set_ylabel("matrix-vector multiplication duration")
ax.set_xlabel("matrix size");
For large problem sizes the Cython+OpenMP implementation is faster than numpy.dot.
With this simple implementation, the speedup for large problem sizes is about:
Out[60]: 1.2483748994665467
Obviously one could do a better job with more effort, since the theoretical limit of the speed-up is:
In [61]: N_core
Out[61]: 2
8.4.2 Further reading
• http://openmp.org
• http://docs.cython.org/src/userguide/parallelism.html
8.5 OpenCL
OpenCL is an API for heterogeneous computing, for example using GPUs for numerical computations. There
is a Python package called pyopencl that allows OpenCL code to be compiled, loaded and executed on
the compute units completely from within Python. This is a nice way to work with OpenCL, because the
time-consuming computations should be done on the compute units in compiled code, and in this setup
Python only serves as a control language.
In [ ]: %%file opencl-dense-mv.py
import pyopencl as cl
import numpy
import time
# problem size
n = 10000
# platform
platform_list = cl.get_platforms()
platform = platform_list[0]
# device
device_list = platform.get_devices()
device = device_list[0]
if False:
    print("Platform name:" + platform.name)
    print("Platform version:" + platform.version)
    print("Device name:" + device.name)
    print("Device type:" + cl.device_type.to_string(device.type))
    print("Device memory: " + str(device.global_mem_size//1024//1024) + ' MB')
    print("Device max clock speed:" + str(device.max_clock_frequency) + ' MHz')
    print("Device compute units:" + str(device.max_compute_units))
# context
ctx = cl.Context([device]) # or we can use cl.create_some_context()
# command queue
queue = cl.CommandQueue(ctx)
# kernel
KERNEL_CODE = """
//
// Matrix-vector multiplication: r = m * v
//
#define N %(mat_size)d
__kernel
void dmv_cl(__global float *m, __global float *v, __global float *r)
{
    int i, gid = get_global_id(0);
    r[gid] = 0;
    for (i = 0; i < N; i++)
    {
        r[gid] += m[gid * N + i] * v[i];
    }
}
"""
kernel_params = {"mat_size": n}
program = cl.Program(ctx, KERNEL_CODE % kernel_params).build()
# data
A = numpy.random.rand(n, n)
x = numpy.random.rand(n, 1)
# host buffers
h_y = numpy.empty(numpy.shape(x)).astype(numpy.float32)
h_A = numpy.real(A).astype(numpy.float32)
h_x = numpy.real(x).astype(numpy.float32)
# device buffers
mf = cl.mem_flags
d_A_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=h_A)
d_x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=h_x)
d_y_buf = cl.Buffer(ctx, mf.WRITE_ONLY, size=h_y.nbytes)
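To make the indexing in the dmv_cl kernel concrete, here is a plain-Python emulation of a single work item. It is only a sketch: the function name dmv_work_item and the small test data are hypothetical, and on a real device the work items run concurrently rather than in a loop. Each work item, identified by gid, computes one element of the result from one row of the flat, row-major matrix.

```python
def dmv_work_item(m, v, r, gid, n):
    """Emulate one OpenCL work item: row `gid` of the product r = m * v."""
    acc = 0.0
    for i in range(n):
        acc += m[gid * n + i] * v[i]  # m is stored flat, row by row
    r[gid] = acc

n = 3
m = [1.0, 0.0, 0.0,
     0.0, 2.0, 0.0,
     1.0, 1.0, 1.0]
v = [1.0, 2.0, 3.0]
r = [0.0] * n

# On a device the work items run concurrently; here they run in a loop.
for gid in range(n):
    dmv_work_item(m, v, r, gid, n)
print(r)
```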
Chapter 9
Revision control software
In any software development, one of the most important tools is revision control software (RCS).
It is used in virtually all software development and in all environments, by everyone and everywhere
(no kidding!)
RCS can be used on almost any digital content, so it is not restricted to software development, and is
also very useful for manuscript files, figures, data and notebooks!
• The repository contains not only the latest version of all files, but also the complete history of all
changes to the files since they were added to the repository.
• A user can check out the repository, and obtain a local working copy of the files. All changes are made
to the files in the local working directory, where files can be added, removed and updated.
• When a task has been completed, the changes to the local files are committed (saved to the repository).
• If someone else has been making changes to the same files, a conflict can occur. In many cases conflicts
can be resolved automatically by the system, but in some cases we might have to merge different
changes together manually.
• It is often useful to create a new branch in a repository, or a fork or clone of an entire repository,
when we are doing larger experimental development. The main branch in a repository is often called
master or trunk. When work on a branch or fork is completed, it can be merged into the master
branch/repository.
• With distributed RCSs such as Git or Mercurial, we can pull and push changesets between different
repositories. For example, between a local copy of the repository and a central online repository (for
example on a community repository host site like github.com).
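The ideas of a repository with full history, a working copy, and a commit can be pictured with a toy in-memory model. This is purely an illustration: the class ToyRepository is hypothetical, and real systems such as Git store history very differently (as content-addressed objects on disk) and also handle branches, merges and conflicts.

```python
class ToyRepository:
    """Hypothetical in-memory model of a repository: a list of snapshots."""

    def __init__(self):
        self.history = []  # one (message, files) pair per commit

    def commit(self, working_copy, message):
        # Store a copy so later edits to the working copy do not alter history.
        self.history.append((message, dict(working_copy)))
        return len(self.history) - 1  # revision number

    def checkout(self, revision=-1):
        # Return a fresh working copy of the requested revision.
        return dict(self.history[revision][1])

repo = ToyRepository()
work = {"README": "A file with information about the project."}
first = repo.commit(work, "Added a README file")

# Edit the working copy, then commit the change.
work["README"] += "\nA new line."
repo.commit(work, "Added one more line in README")

print(repo.checkout(first)["README"])
print(len(repo.history))
```

Checking out the first revision returns the file as it was before the second commit, because the repository keeps every snapshot and not just the latest one.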
In the rest of this lecture we will look at git, although hg is just as good and works in almost exactly the
same way.
The first time you start to use git, you’ll need to configure your author information:
If we want to fork or clone an existing repository, we can use the command git clone repository:
Git clone can take a URL to a public repository, like above, or a path to a local directory:
In [4]: !git clone gitdemo gitdemo2
Cloning into ’gitdemo2’...
warning: You appear to have cloned an empty repository.
done.
We can also clone private repositories over secure protocols such as SSH:
$ git clone ssh://myserver.com/myrepository
9.5 Status
Using the command git status we get a summary of the current status of the working directory. It shows
if we have modified, added or removed files.
In [5]: !git status
On branch master
Your branch is up-to-date with ’origin/master’.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
In this case, only the current ipython notebook has been added. It is listed as an untracked file, and is
therefore not in the repository yet.
9.6 Adding files and committing changes
To add a new file to the repository, we first create the file and then use the git add filename command:
Overwriting README
On branch master
Your branch is up-to-date with ’origin/master’.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: README
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
After having added the file README, the command git status lists it as an untracked file.
On branch master
Your branch is up-to-date with ’origin/master’.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: README
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
Now that it has been added, it is listed as a new file that has not yet been committed to the repository.
On branch master
Your branch is ahead of ’origin/master’ by 1 commit.
(use "git push" to publish your local commits)
Changes not staged for commit:
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
.ipynb checkpoints/
Offline/
pycache /
gitdemo/
mymodule.py
qutip/
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
After committing the change to the repository from the local working directory, git status again reports
that the working directory is clean.
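The add-and-commit cycle described above can be reproduced in a throwaway repository. The following is a minimal sketch, assuming git is installed and on the PATH; the helper `git()`, the temporary directory, and the placeholder identity are illustrative assumptions and are not part of the lecture repository. It uses `git status --porcelain`, which prints a compact two-character state code per file instead of the long listing shown above.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

repo = tempfile.mkdtemp()          # throwaway repository
git("init", cwd=repo)

with open(os.path.join(repo, "README"), "w") as f:
    f.write("A file with information about the demo repository.\n")

untracked = git("status", "--porcelain", cwd=repo)   # "??" marks an untracked file
git("add", "README", cwd=repo)
staged = git("status", "--porcelain", cwd=repo)      # "A " marks a staged new file
git("commit", "-m", "Added a README file", cwd=repo)
clean = git("status", "--porcelain", cwd=repo)       # empty output: nothing to commit

print(repr(untracked), repr(staged), repr(clean))
```

The three captured strings correspond to the three states discussed in the text: untracked, added but not yet committed, and clean.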
9.7 Committing changes
When files that are tracked by GIT are changed, they are listed as modified by git status:
A new line.
Overwriting README
On branch master
Your branch is ahead of ’origin/master’ by 1 commit.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: README
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
Again, we can commit such changes to the repository using the git commit -m "message" command.
[master cb5c324] added one more line in README
1 file changed, 3 insertions(+), 1 deletion(-)
On branch master
Your branch is ahead of ’origin/master’ by 2 commits.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
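Committing a change to an already tracked file follows the same pattern. The sketch below is an illustration under assumptions (git on the PATH, a temporary repository, a hypothetical `git()` helper and placeholder identity); it shows the "modified" state in porcelain form and counts the commits afterwards.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

repo = tempfile.mkdtemp()
readme = os.path.join(repo, "README")
git("init", cwd=repo)

with open(readme, "w") as f:
    f.write("A file with information about the demo repository.\n")
git("add", "README", cwd=repo)
git("commit", "-m", "Added a README file", cwd=repo)

# Change the tracked file: git now reports it as modified but not staged.
with open(readme, "a") as f:
    f.write("A new line.\n")
modified = git("status", "--porcelain", cwd=repo)

# Stage and commit the modification.
git("add", "README", cwd=repo)
git("commit", "-m", "added one more line in README", cwd=repo)
n_commits = git("rev-list", "--count", "HEAD", cwd=repo).strip()

print(repr(modified), n_commits)
```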
Writing tmpfile
Add it:
In [19]: !git add tmpfile
Remove it again:
rm ’tmpfile’
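Removing a tracked file with `git rm` both deletes it from the working directory and stages the deletion, which then has to be committed like any other change. A minimal sketch, assuming git is on the PATH; the helper `git()`, the temporary repository, and the commit messages are illustrative assumptions:

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

repo = tempfile.mkdtemp()
tmpfile = os.path.join(repo, "tmpfile")
git("init", cwd=repo)

# Create, add and commit a short file.
with open(tmpfile, "w") as f:
    f.write("A short file.\n")
git("add", "tmpfile", cwd=repo)
git("commit", "-m", "adding file tmpfile", cwd=repo)

# Remove it again: the file disappears from disk and the deletion is staged.
git("rm", "tmpfile", cwd=repo)
staged = git("status", "--porcelain", cwd=repo)   # "D " marks a staged deletion
still_on_disk = os.path.exists(tmpfile)

git("commit", "-m", "remove file tmpfile", cwd=repo)
tracked = git("ls-files", cwd=repo)               # files tracked in HEAD's index

print(repr(staged), still_on_disk, repr(tracked))
```

The file remains available in the history: any revision committed before the removal still contains it.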
commit c380dc8356be86fde9b020d9d5f1a1fc00dd96ed
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:40 2015 -0700
commit 4d74889feb1df45411fe6bc915b559e6651b439a
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:38 2015 -0700
commit cb5c32499483f58ee4a7f56787f40a26d35e4109
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:37 2015 -0700
commit 50c355acc25c78d83d66a6a1e39748b68e62df09
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:33 2015 -0700
commit 68103a5960f95c50833ac7db7622666226292537
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:58 2015 -0700
commit 25d440cbd963e56c912013a52307d2384fbc9042
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 493c55661ea291b8573dd1746dba4c6a91a2e502
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 0b35eb6602f1d9fd113caf770d507dbfd173fd84
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:53 2015 -0700
commit e90a8c109da701138e8455c01d3d9abc7289c079
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:51 2015 -0700
commit 7771777a56f5bcf7ff5a423c04d0de52076479cf
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:50 2015 -0700
commit c8eb6ef0d1e06545b276e5637aa8a783fcdedb21
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:20 2015 -0700
commit e61b95df741a4dc734b84b9453290c7e5dc3cc34
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit af8e43e5a5406c9f5e01696efc3bd1d7eb2e69c4
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit e3896b914fb9f9368c9ade44142448b5cddd0360
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit cd9d289b9a9182fc3e08986dfe64cca02e0488d0
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:17 2015 -0700
commit 4c349ecf589e95d1633072de1fe715a5750fa084
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:54 2015 -0700
commit d051436004f5fd47af96ac797cc3bcc7a9b08581
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:37 2015 -0700
commit ddc57f12eca11f89c157f59595b9edda83dc3815
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:51 2015 -0700
commit 27e5d0fa081298c8451de903006fbba79ac0130b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:49 2015 -0700
commit 9783b909269b2f2c2234d00a20fd00d3b3f0bc1b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:44 2015 -0700
In the commit log, each revision is shown with a timestamp, a unique hash tag, author information, and
the commit message.
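The fields of the commit log can also be extracted programmatically. The sketch below assumes git is on the PATH and uses a temporary repository with a hypothetical `git()` helper and placeholder identity; it relies on the `--pretty=format:` option of `git log`, where `%H` is the full hash, `%an` the author name, and `%s` the first line of the commit message.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

def commit_file(repo, name, text, message):
    """Write text to a file in the repository, then add and commit it."""
    with open(os.path.join(repo, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)

repo = tempfile.mkdtemp()
git("init", cwd=repo)
commit_file(repo, "README", "First version.\n", "Added a README file")
commit_file(repo, "README", "Second version.\n", "added one more line in README")

# One line per commit, newest first: hash, author name, message subject.
log = git("log", "--pretty=format:%H|%an|%s", cwd=repo)
entries = [line.split("|") for line in log.splitlines()]
for commit_hash, author, subject in entries:
    print(commit_hash[:7], author, subject)
```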
9.10 Diffs
Every commit results in a changeset, which has a “diff” describing the changes to the files associated with it.
We can use git diff to see what has changed in a file:
README files usually contains installation instructions, and information about how to get start
Overwriting README
In [25]: !git diff README
diff --git a/README b/README
index 4f51868..d3951c6 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-A new line.
\ No newline at end of file
+README files usually contains installation instructions, and information about how to get started using
\ No newline at end of file
That looks quite cryptic, but it is a standard format for describing changes in files. We can use other tools,
such as graphical user interfaces or web-based systems, to get a more easily understandable diff.
In GitHub (a web-based GIT repository hosting service) it can look like this:
In [26]: Image(filename='images/github-diff.png')
Out[26]:
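The format printed by git diff is the so-called unified diff format, which Python's standard library can also produce. The sketch below uses `difflib.unified_diff` on two one-line versions of a file to show how the `---`/`+++` header, the `@@` hunk marker, and the `-`/`+` lines arise. The file contents are shortened stand-ins chosen for illustration, and the output is not byte-for-byte identical to what git prints (git adds its own `diff --git` and `index` lines).

```python
import difflib

# Two versions of a one-line file (illustrative stand-ins for README).
old = ["A new line."]
new = ["README files usually contain installation instructions."]

diff = list(difflib.unified_diff(old, new,
                                 fromfile="a/README", tofile="b/README",
                                 lineterm=""))
print("\n".join(diff))
```

Lines starting with `-` exist only in the old version, lines starting with `+` only in the new one, and the `@@` marker gives the line ranges that the hunk covers in each version.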
9.11 Discard changes in the working directory
To discard a change (revert to the latest version in the repository) we can use the checkout command like
this:
On branch master
Your branch is ahead of ’origin/master’ by 4 commits.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
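Discarding an uncommitted change with the checkout command can be sketched as follows. This is an illustration under assumptions (git on the PATH, a temporary repository, a hypothetical `git()` helper and placeholder identity); the form `git checkout -- <file>` restores the file to the version in the index, which here equals the latest committed version.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

repo = tempfile.mkdtemp()
readme = os.path.join(repo, "README")
git("init", cwd=repo)

original = "README files usually contain installation instructions.\n"
with open(readme, "w") as f:
    f.write(original)
git("add", "README", cwd=repo)
git("commit", "-m", "Added a README file", cwd=repo)

# Make an unwanted change in the working directory.
with open(readme, "a") as f:
    f.write("An unwanted line.\n")
before = git("status", "--porcelain", cwd=repo)   # file is listed as modified

# Discard it: the file is restored from the repository.
git("checkout", "--", "README", cwd=repo)
after = git("status", "--porcelain", cwd=repo)    # clean again
with open(readme) as f:
    restored = f.read()

print(repr(before), repr(after), repr(restored))
```

Note that the discarded change cannot be recovered afterwards, since it was never committed. Git 2.23 and later also offer `git restore <file>` for the same purpose.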
commit c380dc8356be86fde9b020d9d5f1a1fc00dd96ed
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:40 2015 -0700
commit 4d74889feb1df45411fe6bc915b559e6651b439a
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:38 2015 -0700
commit cb5c32499483f58ee4a7f56787f40a26d35e4109
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:37 2015 -0700
commit 50c355acc25c78d83d66a6a1e39748b68e62df09
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:33 2015 -0700
commit 68103a5960f95c50833ac7db7622666226292537
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:58 2015 -0700
commit 25d440cbd963e56c912013a52307d2384fbc9042
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 493c55661ea291b8573dd1746dba4c6a91a2e502
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 0b35eb6602f1d9fd113caf770d507dbfd173fd84
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:53 2015 -0700
commit e90a8c109da701138e8455c01d3d9abc7289c079
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:51 2015 -0700
commit 7771777a56f5bcf7ff5a423c04d0de52076479cf
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:50 2015 -0700
commit c8eb6ef0d1e06545b276e5637aa8a783fcdedb21
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:20 2015 -0700
commit e61b95df741a4dc734b84b9453290c7e5dc3cc34
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit af8e43e5a5406c9f5e01696efc3bd1d7eb2e69c4
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit e3896b914fb9f9368c9ade44142448b5cddd0360
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit cd9d289b9a9182fc3e08986dfe64cca02e0488d0
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:17 2015 -0700
commit 4c349ecf589e95d1633072de1fe715a5750fa084
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:54 2015 -0700
commit d051436004f5fd47af96ac797cc3bcc7a9b08581
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:37 2015 -0700
commit ddc57f12eca11f89c157f59595b9edda83dc3815
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:51 2015 -0700
commit 27e5d0fa081298c8451de903006fbba79ac0130b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:49 2015 -0700
commit 9783b909269b2f2c2234d00a20fd00d3b3f0bc1b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:44 2015 -0700
Now the content of all the files is as in the revision with the hash code listed above (the first revision):
A new line.
M Lecture-6A-Fortran-and-C.ipynb
M Lecture-6B-HPC.ipynb
M Makefile
A "Offline/Icon\r"
A Offline/Lecture-6A-Fortran-and-C.ipynb
A Offline/Lecture-6A-Fortran-and-C.tex
A "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
A Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
A Offline/Lecture-6B-HPC.ipynb
A Offline/Lecture-6B-HPC.tex
A "Offline/Lecture-6B-HPC files/Icon\r"
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
M Preamble.tex
M Scientific-Computing-with-Python.pdf
Already on ’master’
Your branch is ahead of ’origin/master’ by 4 commits.
(use "git push" to publish your local commits)
A new line.
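Switching to an older revision and back again can be sketched in a small repository. The example assumes git is on the PATH; the helpers `git()` and `commit_file()`, the placeholder identity, and the file contents are illustrative assumptions. The branch is explicitly named master so that the example does not depend on the locally configured default branch name.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

def commit_file(repo, name, text, message):
    """Write text to a file in the repository, then add and commit it."""
    with open(os.path.join(repo, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)

repo = tempfile.mkdtemp()
readme = os.path.join(repo, "README")
git("init", cwd=repo)
git("symbolic-ref", "HEAD", "refs/heads/master", cwd=repo)  # name the branch master

commit_file(repo, "README", "A new line.\n", "first revision")
first = git("rev-parse", "HEAD", cwd=repo).strip()           # hash of first revision
commit_file(repo, "README", "A newer line.\n", "second revision")

git("checkout", first, cwd=repo)      # working directory now shows the old revision
with open(readme) as f:
    old_content = f.read()

git("checkout", "master", cwd=repo)   # back to the latest revision
with open(readme) as f:
    new_content = f.read()

print(repr(old_content), repr(new_content))
```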
On branch master
Your branch is ahead of ’origin/master’ by 4 commits.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
commit c380dc8356be86fde9b020d9d5f1a1fc00dd96ed
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:40 2015 -0700
commit 4d74889feb1df45411fe6bc915b559e6651b439a
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:38 2015 -0700
commit cb5c32499483f58ee4a7f56787f40a26d35e4109
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:37 2015 -0700
commit 50c355acc25c78d83d66a6a1e39748b68e62df09
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:32:33 2015 -0700
commit 68103a5960f95c50833ac7db7622666226292537
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:58 2015 -0700
commit 25d440cbd963e56c912013a52307d2384fbc9042
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 493c55661ea291b8573dd1746dba4c6a91a2e502
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:55 2015 -0700
commit 0b35eb6602f1d9fd113caf770d507dbfd173fd84
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:53 2015 -0700
commit e90a8c109da701138e8455c01d3d9abc7289c079
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:51 2015 -0700
commit 7771777a56f5bcf7ff5a423c04d0de52076479cf
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:25:50 2015 -0700
commit c8eb6ef0d1e06545b276e5637aa8a783fcdedb21
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:20 2015 -0700
commit e61b95df741a4dc734b84b9453290c7e5dc3cc34
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit af8e43e5a5406c9f5e01696efc3bd1d7eb2e69c4
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit e3896b914fb9f9368c9ade44142448b5cddd0360
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:18 2015 -0700
commit cd9d289b9a9182fc3e08986dfe64cca02e0488d0
Author: David Mertz <[email protected]>
Date: Mon Aug 17 15:10:17 2015 -0700
commit 4c349ecf589e95d1633072de1fe715a5750fa084
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:54 2015 -0700
commit d051436004f5fd47af96ac797cc3bcc7a9b08581
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:52:37 2015 -0700
commit ddc57f12eca11f89c157f59595b9edda83dc3815
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:51 2015 -0700
commit 27e5d0fa081298c8451de903006fbba79ac0130b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:49 2015 -0700
commit 9783b909269b2f2c2234d00a20fd00d3b3f0bc1b
Author: David Mertz <[email protected]>
Date: Mon Aug 17 14:51:44 2015 -0700
In [36]: !git tag -a demotag1 -m "Code used for this and that purpose"
demotag1
tag demotag1
Tagger: David Mertz <[email protected]>
Date: Mon Aug 17 12:16:48 2015 -0700
commit d925777dd7b7211b963d9ef141c1c2a13596e370
Author: David Mertz <[email protected]>
Date: Mon Aug 17 12:16:40 2015 -0700
To retrieve the code in the state corresponding to a particular tag, we can use the git checkout tagname
command:
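Creating an annotated tag and later returning to the tagged state can be sketched as follows. The example assumes git is on the PATH; the helpers, the placeholder identity, and the file contents are illustrative assumptions, while the tag name demotag1 is taken from the text above.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits and tags work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

def commit_file(repo, name, text, message):
    """Write text to a file in the repository, then add and commit it."""
    with open(os.path.join(repo, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)

repo = tempfile.mkdtemp()
git("init", cwd=repo)

commit_file(repo, "README", "Version used for the results.\n", "version for results")
git("tag", "-a", "demotag1", "-m", "Code used for this and that purpose", cwd=repo)

# Development continues after the tag was created.
commit_file(repo, "README", "Later development.\n", "more work")

tags = git("tag", "-l", cwd=repo)        # list all tags
git("checkout", "demotag1", cwd=repo)    # return to the tagged state
with open(os.path.join(repo, "README")) as f:
    tagged_content = f.read()

print(repr(tags), repr(tagged_content))
```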
9.14 Branches
With branches we can create diverging code bases in the same repository. They are useful, for example, for
experimental development that requires many code changes that could break functionality in the master
branch. Once the development on a branch has reached a stable state, it can be merged back into the
trunk (here, the master branch). The branch-develop-merge cycle is a good development strategy when several
people work on the same code base. Even in single-author repositories it is often useful to keep
the master branch in a working state, to create a branch before implementing a new feature, and to
merge it back into the trunk later.
In GIT, we can create a new branch like this:
We can list the existing branches like this:
expr1
* master
M Lecture-6A-Fortran-and-C.ipynb
M Lecture-6B-HPC.ipynb
M Makefile
A "Offline/Icon\r"
A Offline/Lecture-6A-Fortran-and-C.ipynb
A Offline/Lecture-6A-Fortran-and-C.tex
A "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
A Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
A Offline/Lecture-6B-HPC.ipynb
A Offline/Lecture-6B-HPC.tex
A "Offline/Lecture-6B-HPC files/Icon\r"
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
M Preamble.tex
M Scientific-Computing-with-Python.pdf
Switched to branch ’expr1’
README files usually contains installation instructions, and information about how to get start
Experimental addition.
Overwriting README
* expr1
master
M Lecture-6A-Fortran-and-C.ipynb
M Lecture-6B-HPC.ipynb
M Makefile
A "Offline/Icon\r"
A Offline/Lecture-6A-Fortran-and-C.ipynb
A Offline/Lecture-6A-Fortran-and-C.tex
A "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
A Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
A Offline/Lecture-6B-HPC.ipynb
A Offline/Lecture-6B-HPC.tex
A "Offline/Lecture-6B-HPC files/Icon\r"
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
M Preamble.tex
M Scientific-Computing-with-Python.pdf
Switched to branch ’master’
Your branch is ahead of ’origin/master’ by 4 commits.
(use "git push" to publish your local commits)
expr1
* master
We can merge an existing branch and all its changesets into another branch (for example the master
branch) like this:
First change to the target branch:
M Lecture-6A-Fortran-and-C.ipynb
M Lecture-6B-HPC.ipynb
M Makefile
A "Offline/Icon\r"
A Offline/Lecture-6A-Fortran-and-C.ipynb
A Offline/Lecture-6A-Fortran-and-C.tex
A "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
A Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
A Offline/Lecture-6B-HPC.ipynb
A Offline/Lecture-6B-HPC.tex
A "Offline/Lecture-6B-HPC files/Icon\r"
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
A Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
M Preamble.tex
M Scientific-Computing-with-Python.pdf
Already on ’master’
Your branch is ahead of ’origin/master’ by 4 commits.
(use "git push" to publish your local commits)
Updating c380dc8..adfa044
Fast-forward
README | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
expr1
* master
We can delete the branch expr1 now that it has been merged into the master:
* master
README files usually contains installation instructions, and information about how to get started using
Experimental addition.
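The whole branch workflow of this section (create, switch, commit, merge, delete) can be sketched in one script. It assumes git is on the PATH; the helpers, the placeholder identity, and the file contents are illustrative assumptions, while the branch name expr1 follows the text. Because master has no new commits of its own, the merge is a fast-forward, as in the output above.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

def commit_file(repo, name, text, message):
    """Write text to a file in the repository, then add and commit it."""
    with open(os.path.join(repo, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)

def branches(repo):
    """Return the local branch names, without the '*' current-branch marker."""
    return [line.strip("* ") for line in git("branch", cwd=repo).splitlines()]

repo = tempfile.mkdtemp()
git("init", cwd=repo)
git("symbolic-ref", "HEAD", "refs/heads/master", cwd=repo)  # name the branch master
commit_file(repo, "README", "Stable text.\n", "stable version")

git("branch", "expr1", cwd=repo)                 # create the branch
git("checkout", "expr1", cwd=repo)               # switch to it
commit_file(repo, "README", "Stable text.\nExperimental addition.\n",
            "added a line in expr1 branch")
during = branches(repo)

git("checkout", "master", cwd=repo)              # first change to the target branch
git("merge", "expr1", cwd=repo)                  # then merge expr1 into it
git("branch", "-d", "expr1", cwd=repo)           # delete the merged branch
after = branches(repo)

with open(os.path.join(repo, "README")) as f:
    merged_content = f.read()
print(during, after, repr(merged_content))
```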
origin
* remote origin
Fetch URL: git@github.com:ContinuumIO/scientific-python-lectures.git
Push URL: git@github.com:ContinuumIO/scientific-python-lectures.git
HEAD branch: master
Remote branches:
New-branding tracked
master tracked
Local branch configured for ’git pull’:
master merges with remote master
Local ref configured for ’git push’:
master pushes to master (fast-forwardable)
9.15.1 pull
We can retrieve updates from the origin repository by “pulling” changesets from “origin” to our repository:
Already up-to-date.
We can register addresses to many different repositories, and pull in different changesets from different
sources, but the default source is the origin from where the repository was first cloned (and the word origin
could have been omitted from the line above).
9.15.2 push
After making changes to our local repository, we can push changes to a remote repository using git push.
Again, the default target repository is origin, so we can do:
On branch master
Your branch is ahead of ’origin/master’ by 5 commits.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
new file: "Offline/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C.ipynb
new file: Offline/Lecture-6A-Fortran-and-C.tex
new file: "Offline/Lecture-6A-Fortran-and-C files/Icon\r"
new file: Offline/Lecture-6A-Fortran-and-C files/Lecture-6A-Fortran-and-C 5 0.png
new file: Offline/Lecture-6B-HPC.ipynb
new file: Offline/Lecture-6B-HPC.tex
new file: "Offline/Lecture-6B-HPC files/Icon\r"
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 33 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 37 0.png
new file: Offline/Lecture-6B-HPC files/Lecture-6B-HPC 82 1.png
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
(use "git add <file>..." to include in what will be committed)
.ipynb checkpoints/
Offline/.ipynb checkpoints/
pycache /
gitdemo/
mymodule.py
qutip/
On branch master
Your branch is ahead of ’origin/master’ by 5 commits.
(use "git push" to publish your local commits)
Changes not staged for commit:
modified: Lecture-6A-Fortran-and-C.ipynb
modified: Lecture-6B-HPC.ipynb
modified: Makefile
modified: Preamble.tex
modified: Scientific-Computing-with-Python.pdf
Untracked files:
.ipynb checkpoints/
Offline/
pycache /
gitdemo/
mymodule.py
qutip/
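The interplay of push and pull can be tried out without a network connection by using a local bare repository as the "origin". The following is a sketch under assumptions: git is on the PATH, and the directory names origin.git, work1 and work2, the helpers, and the placeholder identity are all hypothetical. One working copy pushes its commits to origin, and a second working copy retrieves them with pull.

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run git in directory cwd and return its standard output as text."""
    # Placeholder identity (an assumption) so commits work without global config.
    cmd = ["git", "-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]
    result = subprocess.run(cmd + list(args), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout

def commit_file(repo, name, text, message):
    """Write text to a file in the repository, then add and commit it."""
    with open(os.path.join(repo, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)

base = tempfile.mkdtemp()
origin = os.path.join(base, "origin.git")
work1 = os.path.join(base, "work1")
work2 = os.path.join(base, "work2")

# A bare repository plays the role of the shared remote.
git("init", "--bare", origin, cwd=base)
git("symbolic-ref", "HEAD", "refs/heads/master", cwd=origin)

# First working copy: clone, commit, and push to origin.
git("clone", origin, work1, cwd=base)
git("symbolic-ref", "HEAD", "refs/heads/master", cwd=work1)
commit_file(work1, "README", "line 1\n", "first commit")
git("push", "origin", "master", cwd=work1)

# Second working copy: a fresh clone already contains the pushed commit.
git("clone", origin, work2, cwd=base)

# More work is pushed from the first copy and pulled into the second.
commit_file(work1, "README", "line 1\nline 2\n", "second commit")
git("push", "origin", "master", cwd=work1)
git("pull", "--ff-only", "origin", "master", cwd=work2)

with open(os.path.join(work2, "README")) as f:
    pulled = f.read()
print(repr(pulled))
```

The `--ff-only` option makes pull refuse to create a merge commit; it is used here only to keep the example deterministic.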
• GitHub: http://www.github.com
• Bitbucket: http://www.bitbucket.org
In [60]: Image(filename='images/github-project-page.png')
Out[60]:
9.17 Graphical user interfaces
There are also a number of graphical user interfaces for GIT. The available options vary a little bit from
platform to platform:
http://git-scm.com/downloads/guis
In [61]: Image(filename='images/gitk.png')
Out[61]:
9.18 Further reading
• http://git-scm.com/book
• http://www.vogella.com/articles/Git/article.html
• http://cheat.errtheblog.com/s/git